Enterprise AI Chat Assistant with SSO & Compliance
Deploy a production-grade, retrieval-grounded AI assistant with enterprise SSO, auditability, threat modeling, per-tenant isolation, and GDPR controls.
Identity & Access
- SAML & OIDC SSO (multi‑tenant mapping)
- Role-based access (Admin / Editor / Viewer)
- Just‑in‑time user provisioning; invite flows
- Session + refresh token lifetime policies
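As an illustration of the Admin / Editor / Viewer model above, here is a minimal sketch of a role check. It assumes the three roles form a strict hierarchy (Admin ≥ Editor ≥ Viewer), which the list does not state outright. The action names and the `REQUIRED_ROLE` table are hypothetical.

```python
from enum import IntEnum


class Role(IntEnum):
    """Roles from the list above, ordered by assumed privilege level."""
    VIEWER = 1
    EDITOR = 2
    ADMIN = 3


# Hypothetical mapping of actions to the minimum role allowed to perform them.
REQUIRED_ROLE = {
    "read_chat_logs": Role.VIEWER,
    "edit_prompt": Role.EDITOR,
    "manage_sso": Role.ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if `role` meets the minimum role for `action`.

    Unknown actions are denied by default, which keeps the check fail-closed.
    """
    required = REQUIRED_ROLE.get(action)
    if required is None:
        return False
    return role >= required
```

With this table, `is_allowed(Role.EDITOR, "edit_prompt")` passes while `is_allowed(Role.VIEWER, "edit_prompt")` does not. A real deployment would likely derive the role from SSO group or claim mappings rather than a hard-coded table.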
Security & Compliance
- Formal threat model (prompt injection, SSRF, data exfiltration)
- PII redaction & field‑level encryption (sensitive fields)
- Configurable data retention (90 to 730 days)
- Audit logging & export (GCS / BigQuery)
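The redaction and retention controls above could look roughly like the following sketch. It covers only email addresses, one of many PII types a real redactor would handle, and it assumes the retention figures describe an inclusive range of 90 to 730 days. The function names and the placeholder token are hypothetical.

```python
import re

# Deliberately simple email pattern; production redaction would cover more
# PII types (phone numbers, national IDs, and so on) and more edge cases.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Assumed bounds, taken from the retention range listed above.
MIN_RETENTION_DAYS = 90
MAX_RETENTION_DAYS = 730


def redact_emails(text: str) -> str:
    """Replace every email address in `text` with a fixed placeholder."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


def clamp_retention_days(requested: int) -> int:
    """Force a requested retention window into the supported range."""
    return max(MIN_RETENTION_DAYS, min(MAX_RETENTION_DAYS, requested))
```

Running redaction before a message is persisted, rather than at read time, keeps the raw value out of storage and out of any later exports.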
Reliability & Control
- Adaptive relevance threshold (target: ≤5% false positives)
- Fallback reason telemetry (low score, timeout, provider error)
- Per‑tenant prompt version history & rollback
- Kill switch & forced re‑index controls
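To make the threshold and fallback telemetry items above concrete, here is a minimal sketch. The three fallback reasons come from the list. The precedence between them, the fixed step size, and the rule of lowering the threshold when the false positive rate is under target are all assumptions made for illustration.

```python
from enum import Enum
from typing import Optional


class FallbackReason(Enum):
    """Fallback reasons named in the telemetry item above."""
    LOW_SCORE = "low_score"
    TIMEOUT = "timeout"
    PROVIDER_ERROR = "provider_error"


def classify_fallback(
    top_score: Optional[float],
    threshold: float,
    timed_out: bool = False,
    provider_failed: bool = False,
) -> Optional[FallbackReason]:
    """Return the reason to fall back, or None if the answer can proceed."""
    if timed_out:
        return FallbackReason.TIMEOUT
    if provider_failed:
        return FallbackReason.PROVIDER_ERROR
    if top_score is None or top_score < threshold:
        return FallbackReason.LOW_SCORE
    return None


def adjust_threshold(
    threshold: float,
    false_positive_rate: float,
    target: float = 0.05,  # the 5% goal from the list above
    step: float = 0.01,    # hypothetical step size
) -> float:
    """Nudge the threshold up when false positives exceed the target,
    and down otherwise, keeping the result within [0.0, 1.0]."""
    if false_positive_rate > target:
        candidate = threshold + step
    else:
        candidate = threshold - step
    return min(1.0, max(0.0, candidate))
```

Emitting the returned `FallbackReason` value as a structured log field would let low-score refusals be counted separately from timeouts and provider errors.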
Architecture Highlights
CrawlBot AI runs as a GCP‑native microservice platform: Cloud Run for stateless services, MongoDB Atlas for operational data and vector search, and a provider‑agnostic LLM gateway (Gemini primary, OpenAI fallback). Tenant isolation is enforced at service boundaries with scoped service accounts and per‑tenant metadata filters. Observability comes from OpenTelemetry traces and structured logs covering every retrieval and answer synthesis path.
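The primary/fallback behavior of the LLM gateway described above can be sketched as an ordered list of providers tried in turn. This is not CrawlBot AI's actual gateway code. The provider callables are stand-in stubs, not real Gemini or OpenAI SDK calls, and all names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# A provider is a (name, callable) pair; the callable takes a prompt and
# returns the completion text. Real SDK clients would be wrapped to fit this.
Provider = Tuple[str, Callable[[str], str]]


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider raised an exception."""


def complete_with_fallback(
    prompt: str, providers: Sequence[Provider]
) -> Tuple[str, str]:
    """Try providers in order; return (provider_name, completion) from the
    first one that succeeds."""
    errors: List[str] = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers for demonstration only.
def gemini_stub(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def openai_stub(prompt: str) -> str:
    return f"echo: {prompt}"
```

Calling `complete_with_fallback("hi", [("gemini", gemini_stub), ("openai", openai_stub)])` returns the result from the second stub because the first one raises. Returning the provider name alongside the text makes it easy to record which provider served each answer.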
Security posture includes strict CSP/SRI for embed scripts, robots.txt compliance and domain allowlists for crawling, secret management via GCP Secret Manager, and a quarterly threat model review. All infrastructure changes are codified in Pulumi and shipped through preview + apply pipelines (no console drift).
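For readers unfamiliar with SRI, the integrity value attached to an embed script is a base64-encoded digest of the script's bytes, prefixed with the hash algorithm. The sketch below shows how such a value can be computed with SHA-384. The function name is hypothetical, and how CrawlBot AI generates its own values is not specified here.

```python
import base64
import hashlib


def sri_sha384(script_bytes: bytes) -> str:
    """Compute a Subresource Integrity value for the given script content.

    The result goes into the `integrity` attribute of a <script> tag, so the
    browser refuses to execute the file if its content has changed.
    """
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")
```

Because the hash covers the exact bytes served, the integrity value has to be regenerated whenever the embed script is rebuilt.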
Why Enterprises Choose CrawlBot AI
- Fast time to value: crawl → configure → embed in under an hour.
- Grounded answers with strict refusal when context is insufficient.
- Per‑embed analytics & audit trails build trust and ROI transparency.
- Programmatic control (gRPC + upcoming admin APIs) for integration.
FAQ
SAML 2.0 and OpenID Connect (OIDC) at launch; SCIM user provisioning is on the roadmap.
Logical isolation via tenant IDs at every storage + retrieval boundary, row-level filters, and scoped service accounts; no cross-tenant vector queries.
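One way to picture the "no cross-tenant vector queries" guarantee above is a query builder that cannot produce a search without a tenant filter. The sketch below builds a MongoDB Atlas `$vectorSearch` aggregation stage as plain Python data, with no database connection. The index name, field names, and candidate multiplier are hypothetical. It also assumes `tenant_id` is declared as a filter field in the vector index definition.

```python
from typing import Any, Dict, List, Sequence


def build_tenant_vector_query(
    tenant_id: str, query_vector: Sequence[float], limit: int = 5
) -> List[Dict[str, Any]]:
    """Build an aggregation pipeline whose vector search is always scoped
    to a single tenant. Raises ValueError if no tenant is supplied."""
    if not tenant_id:
        raise ValueError("tenant_id is required for every vector query")
    return [
        {
            "$vectorSearch": {
                "index": "chunks_vector_index",  # hypothetical index name
                "path": "embedding",             # hypothetical vector field
                "queryVector": list(query_vector),
                "numCandidates": limit * 20,     # hypothetical oversampling factor
                "limit": limit,
                # Restricts candidates to this tenant's documents.
                "filter": {"tenant_id": {"$eq": tenant_id}},
            }
        }
    ]
```

Routing every retrieval through a single builder like this means a missing tenant ID fails loudly instead of silently searching all tenants.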
Yes, with retention windows configurable per tenant (default 90 days for chat logs) and PII redaction/anonymization rules applied before persistence.
Foundational controls align with SOC 2 readiness: a formal threat model is maintained, alongside audit logging, least-privilege IAM, and a secret rotation schedule.