SaaS Infrastructure Architecture
This page describes the Arbitex SaaS infrastructure as deployed in Azure Kubernetes Service (AKS). It covers the deployment model, network topology, request data flow, storage architecture, observability design, and security boundaries. This is the live production architecture for the Arbitex SaaS offering.
For the full technical deployment reference including Helm chart structure, Dockerfile details, and CI/CD pipeline, see Epic M deployment architecture overview.
Deployment model
Arbitex SaaS runs on Azure Kubernetes Service in a single-region active deployment.
```
┌──────────────────────────────────────────────────────────────────┐
│ Azure Region                                                     │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐    │
│  │ AKS Cluster                                              │    │
│  │                                                          │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌─────────────┐     │    │
│  │  │ platform-    │  │ platform-    │  │ ner-gpu     │     │    │
│  │  │ api          │  │ frontend     │  │ (GLiNER)    │     │    │
│  │  │ :8000        │  │ :8080        │  │ :8200       │     │    │
│  │  └──────────────┘  └──────────────┘  └─────────────┘     │    │
│  │                                                          │    │
│  │  ┌──────────────┐  ┌──────────────────────────────┐      │    │
│  │  │ deberta-     │  │ GPU Node Pool                │      │    │
│  │  │ validator    │  │ (NER + DeBERTa workloads)    │      │    │
│  │  │ :8201        │  └──────────────────────────────┘      │    │
│  │  └──────────────┘                                        │    │
│  └──────────────────────────────────────────────────────────┘    │
│                                                                  │
│  ┌─────────────┐  ┌──────────────────┐  ┌──────────────────┐     │
│  │ PostgreSQL  │  │ Redis Cache      │  │ Azure Key Vault  │     │
│  │ (Flexible)  │  │ (sessions/cache) │  │ (secrets/certs)  │     │
│  └─────────────┘  └──────────────────┘  └──────────────────┘     │
│                                                                  │
│  ┌──────────────────────────────────────────────────────────┐    │
│  │ Azure Files / Blob Storage                               │    │
│  │ (GeoIP MMDB, CredInt bloom filter, object storage)       │    │
│  └──────────────────────────────────────────────────────────┘    │
└──────────────────────────────────────────────────────────────────┘
```

Services
| Service | Replicas | Scaling |
|---|---|---|
| platform-api (FastAPI) | HPA-managed | Scales on CPU/memory |
| platform-frontend (React/Nginx) | HPA-managed | Scales on requests |
| ner-gpu (GLiNER NER) | GPU node pool | Scale-to-zero capable |
| deberta-validator (DeBERTa NLI) | GPU node pool | Scale-to-zero capable |
GPU inference workloads (GLiNER, DeBERTa) run on a dedicated GPU node pool with accelerator=nvidia node selectors. The CPU node pool handles all other workloads.
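As a sketch, the scheduling constraint can be expressed in a pod spec along these lines. This is a hypothetical fragment: only the accelerator=nvidia node selector is stated above; the toleration and image name are assumptions, not the production Helm values.

```yaml
# Hypothetical pod-spec fragment for a GPU inference workload.
spec:
  nodeSelector:
    accelerator: nvidia        # pins the pod to the GPU node pool
  tolerations:
    - key: accelerator         # assumed taint keeping CPU workloads off GPU nodes
      operator: Equal
      value: nvidia
      effect: NoSchedule
  containers:
    - name: ner-gpu
      image: arbitex/ner-gpu:latest   # illustrative image name
      resources:
        limits:
          nvidia.com/gpu: 1          # one GPU per inference pod
```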
Multi-region and Phase B
The current production deployment is single-region (Epic M Phase A). Multi-region active-active deployment and automated Azure provisioning are planned under Epic M Phase B.
Network architecture
Perimeter — Cloudflare
All customer-facing traffic enters through Cloudflare before reaching origin:
```
Internet → Cloudflare Edge → AKS NGINX Ingress → Services
```

| Cloudflare capability | Configuration |
|---|---|
| WAF | OWASP ruleset + custom AI proxy rules |
| DDoS protection | Unmetered at network layer |
| Bot management | Automated traffic filtering |
| Rate limiting | Per-path limits enforced before origin |
| Authenticated Origin Pulls | mTLS — cryptographic proof traffic originates from Cloudflare |
| CDN | Static asset caching; API traffic passes through uncached |
NGINX Ingress
The AKS NGINX Ingress controller handles:
- TLS termination for HTTPS traffic
- Request body size limits (configurable per route)
- SSE (Server-Sent Events) support for long-lived streaming completion connections
- Proxy headers for client IP propagation
Internal network — Calico policies
Inside the AKS cluster, Calico network policies enforce default-deny between all pods. Each service pair that needs to communicate has an explicit allowlist policy.
```
Default: deny all pod-to-pod traffic

Explicit allowances:
  platform-api      → postgresql          (port 5432)
  platform-api      → redis               (port 6379)
  platform-api      → ner-gpu             (port 8200)
  platform-api      → deberta-validator   (port 8201)
  platform-api      → azure-key-vault     (via private endpoint)
  platform-frontend → platform-api        (internal proxy)
```

A compromised pod cannot reach unrelated services.
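A default-deny plus explicit-allow pair can be sketched with standard Kubernetes NetworkPolicy objects, which Calico enforces. The namespace and pod labels below are assumptions for illustration, not the production manifests.

```yaml
# Default-deny for every pod in the namespace (illustrative sketch).
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: arbitex          # assumed namespace name
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
# One explicit allowance: platform-api → ner-gpu on port 8200.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-ner
  namespace: arbitex
spec:
  podSelector:
    matchLabels:
      app: ner-gpu            # assumed pod label
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: platform-api
      ports:
        - port: 8200
          protocol: TCP
```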
Private endpoints
All data services are accessible only via Azure Private Endpoints:
- PostgreSQL — no public endpoint
- Redis — no public endpoint
- Azure Key Vault — no public endpoint
Traffic between the AKS cluster and data services never traverses the public internet.
Staff admin plane
Internal Arbitex staff tooling (int.arbitex.ai) is NS-delegated to a private RFC 1918 nameserver. The subtree is unreachable from the public internet (external DNS queries return SERVFAIL). Access requires a VPN or a Cloudflare Zero Trust connector.
Request data flow
A request from client to AI provider passes through the following stages. The audit layer runs at each enforcement point.
```
Client sends request
  │
  ▼
1. Cloudflare Edge
   - WAF inspection
   - Rate limiting
   - DDoS filtering
  │
  ▼
2. NGINX Ingress
   - TLS termination
   - IP allowlist check (org-configured)
   - Request size enforcement
  │
  ▼
3. Authentication middleware
   - API key validation (SHA-256 hash compare)
   - RS256 JWT validation (if Bearer token present)
   - SAML session validation (for portal requests)
   ✎ Audit: request_received, auth_result
  │
  ▼
4. Policy Engine
   - Policy chain evaluation (first_applicable algorithm)
   - Compliance Bundle rules evaluated
   - Custom org policy rules evaluated
   ✎ Audit: policy_evaluated, rule_matched (if applicable)
  │
  ▼
5. DLP Pipeline (3 tiers)
   - Tier 1: Regex + Luhn (structured PII)
   - Tier 2: GLiNER NER via ner-gpu service
   - Tier 3: DeBERTa NLI via deberta-validator service
   ✎ Audit: dlp_findings
  │
  ▼
6. Enforcement action
   - BLOCK → 400 response returned, no provider call
   - REDACT → prompt modified, continue
   - CANCEL → completion cancelled after initial tokens
   - ALLOW_WITH_OVERRIDE → logged, continue
   - PROMPT → governance challenge interposed
   ✎ Audit: enforcement_action
  │
  ▼
7. Provider Gateway
   - Route selection (latency, cost, fallback chain)
   - Provider API call over TLS
   - Circuit breaker on provider errors
  │
  ▼
8. Response processing
   - DLP scan on completion (response side)
   - Policy evaluation on response
   ✎ Audit: response_received, response_enforcement (if applicable)
  │
  ▼
9. Response to client
   ✎ Audit: request_complete (final entry, closes HMAC chain link)
```

Each ✎ Audit step produces one or more entries in the HMAC-chained audit log. The chain links audit entries for the same request using the request_id correlation field.
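The HMAC-chaining idea can be sketched as follows: each entry's MAC covers the previous entry's MAC, so modifying or deleting any entry invalidates every later link. This is a minimal illustration of the technique, not the production audit implementation; the signing key would live in Key Vault.

```python
import hashlib
import hmac
import json

SECRET = b"demo-signing-key"  # illustrative only; production keys live in Key Vault

def append_entry(chain, event, request_id, **fields):
    """Append an audit entry whose HMAC covers the previous entry's HMAC,
    forming a tamper-evident chain for one request."""
    prev_mac = chain[-1]["hmac"] if chain else "0" * 64
    body = {"event": event, "request_id": request_id, **fields}
    payload = prev_mac + json.dumps(body, sort_keys=True)
    mac = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    entry = {**body, "hmac": mac}
    chain.append(entry)
    return entry

def verify_chain(chain):
    """Recompute every link; any modified entry breaks all later HMACs."""
    prev_mac = "0" * 64
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hmac"}
        payload = prev_mac + json.dumps(body, sort_keys=True)
        expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(entry["hmac"], expected):
            return False
        prev_mac = entry["hmac"]
    return True
```

For example, a request_received → policy_evaluated → request_complete sequence verifies cleanly, and editing the middle entry afterwards makes verify_chain return False.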
Storage
PostgreSQL (primary store)
PostgreSQL via Azure Database for PostgreSQL Flexible Server.
| Use | Tables |
|---|---|
| User accounts, orgs, groups | users, orgs, groups, group_members |
| API keys (SHA-256 hashed) | api_keys |
| OAuth clients and tokens | oauth_clients, oauth_tokens |
| Policy packs, rules, chains | policy_packs, policy_rules, policy_chains |
| SAML IdP configurations (Fernet-encrypted) | saml_idp_configs |
| SIEM connector configurations (Fernet-encrypted) | siem_configs |
| IP allowlist rules | org_ip_allowlists |
| Audit log entries | audit_logs |
| Usage metering records | usage_records, usage_summaries |
Encryption: AES-256 at the storage level (Azure platform-managed). Sensitive configuration fields are additionally encrypted at the application layer with Fernet (AES-128-CBC with HMAC-SHA256).
Backup: Azure Flexible Server automated backups with point-in-time recovery (PITR). See DR runbook.
Redis (cache and sessions)
Redis via Azure Cache for Redis.
| Use | Key pattern |
|---|---|
| Distributed session store | session:<session_id> |
| OAuth token cache | oauth_token:<token_hash> |
| IP allowlist cache | ip_allowlist:<org_id> — 60s TTL |
| Policy chain cache | policy_chain:<org_id> — 60s TTL |
| SIEM config cache | siem_config:<org_id> — 60s TTL |
| Rate limiting counters | rate_limit:<org_id>:<path> |
No sticky sessions required: the Redis session store lets any pod handle any request without session affinity, so the API tier scales horizontally without session-partition constraints.
Encryption: TLS enforced for all Redis connections. No data persisted to unencrypted disk.
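The 60 s TTL caches above follow a cache-aside pattern: read from Redis first, fall back to the database on a miss, and repopulate, so configuration changes propagate within one TTL. A minimal sketch, using an in-memory dict as a stand-in for Redis and a caller-supplied loader as a stand-in for the database query:

```python
import time

class TTLCache:
    """Cache-aside lookup with a short TTL, mirroring the 60 s config caches.
    Illustrative stand-in: a dict plays the role of Redis."""

    def __init__(self, ttl_seconds, loader):
        self.ttl = ttl_seconds
        self.loader = loader      # e.g. a database query for the org's config
        self.store = {}           # key -> (expires_at, value)

    def get(self, key):
        hit = self.store.get(key)
        if hit and hit[0] > time.monotonic():
            return hit[1]                       # cache hit: no DB round-trip
        value = self.loader(key)                # cache miss: load and repopulate
        self.store[key] = (time.monotonic() + self.ttl, value)
        return value

calls = []
cache = TTLCache(60.0, loader=lambda org_id: calls.append(org_id) or {"org": org_id})
cache.get("org-a")   # loads from the "database"
cache.get("org-a")   # served from cache; loader not called again
```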
Azure Files / Blob Storage
| Dataset | Format | Update cadence |
|---|---|---|
| GeoIP database (MaxMind) | MMDB binary | Scheduled refresh |
| Credential Intelligence bloom filter | Binary | CDN-delivered refresh |
| Audit export archives | JSON (OCSF) | On-demand / scheduled |
Encryption: Azure Storage Service Encryption (AES-256, platform-managed keys).
Azure Key Vault
Stores all secrets in production:
- Fernet encryption key (application-layer encryption)
- SAML signing certificates
- Internal CA intermediate key (for outpost certificate issuance)
- Service connection strings and credentials
Key Vault tier: FIPS 140-2 Level 3 HSM-backed. Applications fail to start if Key Vault is unreachable or an insecure fallback is detected.
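The fail-fast behaviour can be sketched as a startup guard: rather than silently booting without application-layer encryption, the service refuses to start. Both fetch_from_key_vault and the environment variable name are assumptions made for this sketch, not the actual client API.

```python
import os

def fetch_from_key_vault(secret_name):
    """Stub standing in for the real Key Vault client (assumption for this
    sketch); returns None to simulate an unreachable vault."""
    return None

def load_fernet_key():
    """Startup guard: boot only with a vault-held key; treat any insecure
    fallback as a misconfiguration rather than a degraded mode."""
    key = fetch_from_key_vault("fernet-encryption-key")
    if key:
        return key
    if os.environ.get("FERNET_KEY_FALLBACK"):
        # An env-var key would leave secrets outside the HSM boundary.
        raise RuntimeError("insecure Fernet key fallback detected; refusing to start")
    raise RuntimeError("Key Vault unreachable; refusing to start")
```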
Observability
Structured logging
All services emit structured JSON logs with consistent fields:
| Field | Description |
|---|---|
| timestamp | ISO 8601 |
| level | debug, info, warning, error, critical |
| service | Service name (platform-api, ner-gpu, etc.) |
| request_id | UUID — spans all log entries for a single request |
| org_id | Tenant identifier |
| event | Event name |
| duration_ms | Processing time for completed operations |
Logs ship to Azure Log Analytics and are available for SIEM correlation. Each request produces log entries across multiple services that correlate via request_id.
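A minimal sketch of this field set with the standard library, assuming the production services use plain Python logging (they may well use a structured-logging library instead); the correlation fields ride along via the `extra` kwarg:

```python
import json
import logging

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per log line with the shared field set."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S%z"),
            "level": record.levelname.lower(),
            "service": "platform-api",        # assumed service name
            "event": record.getMessage(),
        }
        # Optional correlation fields attached via the `extra` kwarg.
        for field in ("request_id", "org_id", "duration_ms"):
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)

logger = logging.getLogger("platform-api")
handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("request_complete",
            extra={"request_id": "req-7f9c", "org_id": "org-42", "duration_ms": 183})
```

Every service logging the same request with the same request_id value is what makes cross-service correlation in Log Analytics a single filter.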
Distributed tracing
HTTP requests carry the traceparent header per the W3C Trace Context specification. The header is propagated through the full request chain (API → NER GPU → DeBERTa → provider), enabling end-to-end request tracing in compatible observability platforms.
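Propagation at each hop amounts to keeping the trace-id from the incoming traceparent and minting a fresh parent-id for the outgoing call. A sketch, not a full tracing SDK:

```python
import re
import secrets

# W3C Trace Context: version "00" - 16-byte trace-id - 8-byte parent-id - flags.
TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")

def child_traceparent(incoming=None):
    """Continue an existing trace (same trace-id, new parent-id), or start a
    fresh trace if no valid header arrived."""
    match = TRACEPARENT.match(incoming or "")
    trace_id = match.group(1) if match else secrets.token_hex(16)
    span_id = secrets.token_hex(8)
    return f"00-{trace_id}-{span_id}-01"
```

A service would call child_traceparent on the inbound header and attach the result to its outbound request (API → ner-gpu, ner-gpu → provider, and so on), so every span in the chain shares one trace-id.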
Health checks and circuit breakers
| Service | Health endpoint | Circuit breaker |
|---|---|---|
| platform-api | GET /healthz | — |
| ner-gpu | GET /health | 3 failures / 60s reset |
| deberta-validator | GET /health | 3 failures / 60s reset |
| Provider integrations | Per-provider health | Provider health score; automatic fallback |
Circuit breakers on the NER and DeBERTa microservices prevent GPU inference failures from blocking the full request pipeline — the DLP pipeline degrades gracefully to lower tiers when GPU services are unavailable.
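The 3-failure / 60 s breaker pattern can be sketched as a small state machine: closed while healthy, open after three consecutive failures (GPU calls are skipped and the DLP pipeline degrades to lower tiers), half-open after the reset window to let one probe through. Illustrative only; the injectable clock exists to keep the sketch testable.

```python
import time

class CircuitBreaker:
    """Trip open after `threshold` consecutive failures; retry after `reset_s`."""

    def __init__(self, threshold=3, reset_s=60.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_s = reset_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None     # None = closed, timestamp = open

    def allow(self):
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.reset_s:
            self.opened_at = None               # half-open: permit one probe
            self.failures = self.threshold - 1  # one more failure re-opens
            return True
        return False              # open: skip the GPU call, degrade tiers

    def record(self, ok):
        if ok:
            self.failures = 0
            self.opened_at = None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = self.clock()
```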
Security boundaries
Per-tenant isolation
Arbitex enforces tenant isolation at the data layer. No data is shared across tenants: all tables carry a tenant_id (UUID) column, and every query is filtered by tenant at the service layer. Tenant identity is derived from the authenticated API key or JWT claim and cannot be overridden by the caller.
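One way to make the "cannot be overridden by the caller" guarantee concrete is a query builder that always injects the tenant filter from the authenticated context and rejects caller-supplied tenant_id values. A hypothetical sketch, not the service's actual data-access layer:

```python
def tenant_scoped_query(table, auth_context, filters=None):
    """Build a query that is always filtered by the tenant taken from the
    *authenticated* context. Values are bound as parameters (%s placeholders);
    only the column names come from code."""
    if filters and "tenant_id" in filters:
        raise ValueError("tenant_id is derived from auth, not caller input")
    clauses = {"tenant_id": auth_context["tenant_id"], **(filters or {})}
    where = " AND ".join(f"{col} = %s" for col in clauses)
    return f"SELECT * FROM {table} WHERE {where}", list(clauses.values())
```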
```
   Tenant A                    Tenant B
      │                           │
      ▼                           ▼
  org_id=uuid-A               org_id=uuid-B
      │                           │
      ▼                           ▼
  All DB queries:             All DB queries:
    WHERE tenant_id=A           WHERE tenant_id=B
```

Org-scoped tokens
| Credential type | Scope |
|---|---|
| API keys | Single org; carry role (admin/user) |
| RS256 JWT (M2M) | Single org; carry scopes (api:read, api:write, etc.) |
| SAML session | Single org; user identity from IdP |
| Outpost mTLS certificate | Single outpost; issued at registration, revokeable |
No cross-tenant token escalation path exists.
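The API-key handling described above (SHA-256 hashed at rest, hash compare at auth time) can be sketched as follows; the key format is an assumption for illustration. Only the digest is stored, and comparison is constant-time to avoid timing side channels:

```python
import hashlib
import hmac
import secrets

def hash_api_key(raw_key):
    """Only the SHA-256 digest lands in the api_keys table; the raw key is
    shown once at creation and never persisted."""
    return hashlib.sha256(raw_key.encode()).hexdigest()

def validate_api_key(presented, stored_hash):
    """Constant-time comparison of digests."""
    return hmac.compare_digest(hash_api_key(presented), stored_hash)

raw = "ak_" + secrets.token_urlsafe(32)   # illustrative key format
stored = hash_api_key(raw)                # what the database stores
```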
Rate limiting tiers
Rate limiting is enforced at two layers:
| Layer | Mechanism | Granularity |
|---|---|---|
| Cloudflare edge | Per-path rate limits | Global (pre-auth) |
| Platform API | Sliding window (Redis) | Per org, per endpoint |
Per-org limits depend on the plan tier:

| Plan tier | RPM limit | Burst |
|---|---|---|
| Free | 60 RPM | Limited |
| Standard | 600 RPM | Standard burst |
| Premium | 3,000 RPM | Premium burst |
| Enterprise | Configurable | Custom |
M2M OAuth clients have an additional rate_limit_tier property (standard / premium / unlimited) independent of the org plan tier.
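A sliding-window limiter keyed per (org, path) can be sketched as below. This is an in-memory stand-in for the Redis-backed implementation; in production the timestamps would live in Redis so that all pods share one window.

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Per-(org, path) sliding window over the last 60 seconds."""

    def __init__(self, rpm, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock
        self.hits = defaultdict(deque)   # (org_id, path) -> request timestamps

    def allow(self, org_id, path):
        now = self.clock()
        window = self.hits[(org_id, path)]
        while window and now - window[0] >= 60.0:
            window.popleft()             # drop requests older than the window
        if len(window) >= self.rpm:
            return False                 # over the org's plan-tier RPM limit
        window.append(now)
        return True
```

For a Free-tier org, rpm would be 60; each org and path gets its own independent window.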
Non-root containers
All service containers run as non-root users:
- platform-api — appuser (UID 1000)
- platform-frontend — nginx user (UID 101)
- GPU microservices — non-root GPU user
The Pod Security Standards restricted profile is enforced on CPU namespaces; the baseline profile is applied to the GPU inference namespace (required for CUDA driver access).
Hybrid deployment — Arbitex Outpost
The Arbitex Outpost extends the SaaS enforcement pipeline into your private network.
```
Your Network                               Arbitex SaaS
     │                                          │
     ▼                                          │
┌──────────────────────────────────┐            │
│ Arbitex Outpost                  │            │
│  ┌────────────┐  ┌────────────┐  │            │
│  │ DLP        │  │ Policy     │  │◄───────────┤ policy sync (mTLS)
│  │ Pipeline   │  │ Cache      │  │            │
│  └────────────┘  └────────────┘  │            │
│  ┌────────────┐  ┌────────────┐  │            │
│  │ DeBERTa    │  │ CredInt    │  │            │
│  │ (local)    │  │ Bloom      │  │            │
│  └────────────┘  └────────────┘  │            │
│                                  │            │
│ Audit buffer → SaaS sync (mTLS)  │───────────►│ audit events
└──────────────────────────────────┘
     │ (to provider, or local-only)
     ▼
 AI Provider
```

The Outpost runs the same 3-tier DLP pipeline locally. Policy packs sync from the SaaS control plane over mTLS. In air-gap mode, the Outpost operates without SaaS connectivity, using the last synced policy bundle.
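The air-gap fallback amounts to: prefer a fresh sync, fall back to the last synced bundle when the control plane is unreachable, and refuse to enforce with no bundle at all. A hypothetical sketch; the function names are illustrative, not the Outpost's actual API:

```python
def load_policy_bundle(sync_from_saas, cached_bundle):
    """Prefer a fresh sync from the SaaS control plane; fall back to the last
    synced bundle when unreachable (air-gap mode)."""
    try:
        bundle = sync_from_saas()
    except ConnectionError:
        if cached_bundle is None:
            raise RuntimeError("no policy bundle available; cannot enforce")
        return cached_bundle, "air-gap"
    return bundle, "synced"
```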
See Outpost deployment guide and Air-gap deployment guide.
See also
- Epic M deployment architecture overview — Helm chart structure, Dockerfile details, CI/CD pipeline
- Security Trust Center — Security architecture, authentication, data protection
- Multi-tenancy architecture — Tenant isolation model
- Audit Data Model Reference — Audit log schema, OCSF mapping
- Outpost deployment guide — Hybrid deployment architecture
- DR runbook — PostgreSQL PITR, AKS cluster restore