
SaaS Infrastructure Architecture

Arbitex runs on Azure Kubernetes Service in the East US 2 region, behind Cloudflare’s global edge network. The architecture is designed for encrypted-by-default operation, fine-grained network isolation, and cost-controlled scale-to-zero for pre-production and demo workloads.

This page describes the customer-facing infrastructure model. It covers what services run where, how traffic flows, and what security boundaries exist between layers.


The production stack is organized into five layers. Each layer has a distinct security and operational profile.

The data layer runs continuously and is the most expensive to provision. All data services use Azure Private Endpoints — there are no publicly reachable database or cache endpoints.

| Component | Service | Notes |
|---|---|---|
| Database | Azure Database for PostgreSQL Flexible Server | Two databases: platform and cloud. Zone-redundant standby for high availability. 128 GB auto-grow, 30-day geo-redundant backup. Private endpoint only. |
| Cache | Azure Cache for Redis | TLS enforced. No persistence to unencrypted disk. Three logical databases: platform (db=0), cloud (db=1), reserved for BYOK DEK cache (db=2). Private endpoint. |
| Secrets | Azure Key Vault | HSM-backed key storage. Managed identity authentication; no stored credentials. Private endpoint. Applications refuse to start if Key Vault is unreachable. |
| Static assets | Azure Files (ZRS) | Read-only share mounted into pods. Hosts the GeoIP database (MaxMind GeoIP2 City MMDB, 118 MB) and the Credential Intelligence database. |
| Object storage | Azure Files (ZRS) | Write access restricted to the Platform API only. Stores document uploads. Daily snapshots. |
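The fail-closed startup behavior noted above (applications refuse to start if Key Vault is unreachable) can be sketched as a simple reachability gate. This is an illustrative stdlib sketch, not the production code, which uses the Azure SDK with managed identity; the host name and function names are hypothetical.

```python
import socket
import sys

def vault_reachable(host: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the Key Vault endpoint succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def startup_gate(vault_host: str) -> None:
    """Fail closed: exit instead of serving requests without access to secrets."""
    if not vault_reachable(vault_host):
        print(f"Key Vault {vault_host} unreachable; refusing to start", file=sys.stderr)
        sys.exit(1)
```

A gate like this runs before the application binds its listening port, so a pod with no path to its secrets never accepts traffic.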

The compute layer uses AKS Standard tier with three node pool types. Web and GPU pools scale to zero when not in use, which is the primary cost-control mechanism in pre-production.

| Pool | Node type | Scaling | Notes |
|---|---|---|---|
| System pool | B2s_v2 (burstable) | Always on | Runs Kubernetes system pods. Burstable credits accumulate while web/GPU pools are parked. |
| Web pool | D4ds_v5 | Scale to zero | Runs Platform API, Cloud Portal, Outpost Proxy relay, and NGINX Ingress (2 replicas). |
| GPU pool | NC4as_T4_v3 (NVIDIA T4) | Scale to zero (Deallocate mode) | Runs DeBERTa inference, intent classification, and embedding models. Deallocate mode preserves the node allocation, enabling 2–4 minute restarts vs. 8–25 minute cold provisioning. |

Pod-to-pool mapping:

| Pod | Pool |
|---|---|
| Platform API | Web |
| Cloud Portal | Web |
| Outpost Proxy (SaaS relay) | Web |
| NGINX Ingress (×2 replicas) | Web |
| DeBERTa / intent / embeddings | GPU |
| Alembic migration jobs | Web (Job) |

Network policies (Calico) enforce a default-deny posture between pods; each pair of services that must communicate has an explicit allow rule. Pod Security Standards (restricted profile) are enforced on all namespaces; the GPU namespace uses the baseline profile, which is required for CUDA workloads.
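The default-deny-plus-allowlist model maps onto standard Kubernetes NetworkPolicy resources, which Calico enforces. The sketch below is illustrative only: the namespace, labels (`app: platform-api`, `app.kubernetes.io/name: ingress-nginx`), and port are assumptions, not the actual manifests.

```yaml
# Default deny: no ingress or egress for any pod in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
  namespace: platform
spec:
  podSelector: {}
  policyTypes: ["Ingress", "Egress"]
---
# Explicit allow rule for one service pair:
# NGINX Ingress may reach the Platform API on its service port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-to-platform-api
  namespace: platform
spec:
  podSelector:
    matchLabels:
      app: platform-api
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx
      ports:
        - protocol: TCP
          port: 8100
```

Because the deny policy selects every pod, any pair without a matching allow rule simply cannot communicate.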

Inference failure behavior: If the GPU pool is unavailable (parked or restarting), inference requests fail closed — requests are blocked rather than passing through without DLP scanning.
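The fail-closed behavior can be sketched as follows; `scan_prompt`, the `infer` callable, and the exception type are illustrative names, not the actual Gateway code.

```python
class InferenceUnavailable(Exception):
    """Raised when the DLP inference backend cannot be reached."""

def scan_prompt(prompt: str, infer) -> bool:
    """Return True only if the scanner explicitly cleared the prompt.

    `infer` stands in for the call to the GPU-pool model endpoint. If that
    call fails for any reason, the request is blocked (fail closed) rather
    than forwarded without DLP scanning.
    """
    try:
        verdict = infer(prompt)
    except Exception as exc:
        raise InferenceUnavailable("blocking request: DLP scan unavailable") from exc
    return verdict == "clean"
```

The key property is that there is no code path that forwards a request on error: the only way a prompt passes is an explicit "clean" verdict from the scanner.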

All external traffic enters through Cloudflare before reaching the AKS origin.

Internet → Cloudflare edge → NGINX Ingress → Pod

Cloudflare edge:

  • WAF with OWASP ruleset
  • Unmetered DDoS protection
  • CDN and bot management
  • Full (Strict) SSL mode — Cloudflare validates the origin certificate
  • DNS management for all public domains

Origin TLS (Cloudflare Authenticated Origin Pulls):

With Authenticated Origin Pulls, Cloudflare presents a client certificate on every connection to the AKS origin, and the origin verifies it before accepting traffic. This cryptographically ensures that requests arriving at the origin came through Cloudflare, not directly from arbitrary internet sources. Origin IP allowlisting provides defense in depth.

NGINX Ingress:

  • Two replicas with PodDisruptionBudget (minAvailable=1) and anti-affinity across nodes — eliminates single-pod traffic SPOF
  • Per-path body limits: 1 MB for chat completions, 25 MB for document upload paths
  • WebSocket support for streaming paths (proxy_buffering off)
  • mTLS enforcement for Outpost traffic (/outpost/* paths)

Outpost mTLS:

Hybrid Outpost deployments authenticate to the Gateway using client certificates issued at Outpost registration time. NGINX enforces ssl_verify_client on for /outpost/* paths. The Platform API middleware verifies the full certificate chain against the Cloud CA — not just issuer presence.
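With ingress-nginx, client-certificate enforcement for the /outpost/* paths can be expressed through the standard `auth-tls` annotations. The resource below is a sketch: the secret name, namespace, and service name are assumptions, and the real configuration may differ.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: outpost
  annotations:
    # Require a client certificate chaining to the Cloud CA.
    nginx.ingress.kubernetes.io/auth-tls-secret: "platform/cloud-ca"
    nginx.ingress.kubernetes.io/auth-tls-verify-client: "on"
    nginx.ingress.kubernetes.io/auth-tls-verify-depth: "2"
    # Forward the certificate so the Platform API middleware can
    # verify the full chain itself, not just rely on NGINX.
    nginx.ingress.kubernetes.io/auth-tls-pass-certificate-to-upstream: "true"
spec:
  rules:
    - host: api.arbitex.ai
      http:
        paths:
          - path: /outpost
            pathType: Prefix
            backend:
              service:
                name: outpost-proxy
                port:
                  number: 8300
```

Passing the certificate upstream is what enables the second, application-layer verification described above.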

| Component | Detail |
|---|---|
| Platform | GitHub Actions |
| Registry | Azure Container Registry (ACR Basic) |
| Authentication | OIDC federation; no stored Azure credentials in GitHub |
| Trigger | Manual (workflow_dispatch); both build and deploy are button presses |

Build pipeline: lint → test → Trivy image scan → docker build → push to ACR. Critical and high Trivy findings block deployment.
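The Trivy gate in the build pipeline can be sketched as a GitHub Actions step ordering where a non-zero scanner exit blocks everything after it. This is an illustrative fragment, not the actual workflow: the image name, action version, and job layout are assumptions.

```yaml
# Sketch: CRITICAL/HIGH Trivy findings fail this step, which blocks
# the later push-to-ACR step. Names and versions are illustrative.
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # OIDC federation to Azure; no stored credentials
      contents: read
    steps:
      - uses: actions/checkout@v4
      - run: docker build -t platform-api:${{ github.sha }} .
      - uses: aquasecurity/trivy-action@0.28.0
        with:
          image-ref: platform-api:${{ github.sha }}
          severity: CRITICAL,HIGH
          exit-code: "1"   # fail the job on findings
      # push to ACR only runs if the scan step succeeded
```

Because GitHub Actions stops a job on the first failing step, the scan result is structurally, not procedurally, enforced.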

Deploy pipeline: select image tag → helm upgrade → smoke test → git tag.

All base images are pinned by SHA256 digest. No curl in production images (Python health checks instead).
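A curl-free health check can be written in a few lines of stdlib Python; the endpoint URL and function name here are illustrative, not the actual probe shipped in the images.

```python
import urllib.request

def healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200.

    Stands in for `curl -f <url>` in images that ship no curl; connection
    failures and non-200 responses both report unhealthy.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```

A Kubernetes exec probe can then invoke this function and exit non-zero on failure, giving the same semantics as a curl-based probe without the extra binary in the image.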

| Component | Detail |
|---|---|
| Metrics | Azure Monitor + Container Insights |
| Logs | JSON structured logs to stdout → Log Analytics (90-day active, 2-year archive) |
| AKS diagnostics | API server audit logs → Log Analytics (supports SOC 2 CC7) |
| Alerting | 13 alert rules across all layers |

Key alerts:

| Alert | Condition |
|---|---|
| Pod crash loop | >3 restarts in 10 min |
| Key Vault CSI mount failure | Pod in CreateContainerConfigError state |
| Cert expiry warning | Any certificate <14 days to expiry |
| Inference GPU OOM | GPU memory >90% |
| API 5xx spike | >10 5xx errors in 5 min |
| Demo still running | Web+GPU pools occupied >4 hours |
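With logs flowing to Log Analytics, an alert condition like the 5xx spike can be expressed as a KQL query. The fragment below is only a sketch of the shape such a rule takes; the table and column names are assumptions, not the actual workspace schema.

```kusto
// Illustrative KQL: fire when more than 10 5xx responses
// land in a 5-minute window.
AppRequests
| where TimeGenerated > ago(5m)
| where ResultCode startswith "5"
| summarize errors = count()
| where errors > 10
```

In an Azure Monitor scheduled query rule, any row returned by a query like this triggers the alert.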

All public domains are managed in Cloudflare DNS. All records are proxied through Cloudflare (orange cloud on — no direct IP exposure).

| Domain | Routes to | Service |
|---|---|---|
| api.arbitex.ai/v1/* | Platform API :8100 | AI proxy (customer requests) |
| api.arbitex.ai/outpost/* | Outpost Proxy :8300 | Hybrid Outpost communication |
| cloud.arbitex.ai | Cloud Portal :8000 | Customer admin portal |
| arbitex.ai | Vercel | Marketing site (static) |
| docs.arbitex.ai | Static host | This documentation site |

Internal staff tooling runs at int.arbitex.ai, which is NS-delegated to a private RFC 1918 nameserver and is unreachable from the public internet. See Security Overview — Staff admin plane isolation.


In the SaaS deployment model, the Arbitex control plane and data plane both run in the Arbitex-managed Azure environment. Customer prompts and completions pass through the Gateway in the Arbitex AKS cluster. DLP scanning runs on the GPU pool within the same cluster.

Customers access the Gateway at api.arbitex.ai/v1 using their API key. The Cloud Portal at cloud.arbitex.ai provides admin configuration.

In the Hybrid Outpost model, the Arbitex control plane remains Arbitex-managed (SaaS), but the data plane — prompt inspection and policy enforcement — runs in a container in the customer’s own environment.

The Outpost communicates with the Cloud control plane to receive configuration (policy packs, compliance bundles, API keys) and to write audit events. The Outpost is resilient to control plane outages: it caches configuration locally and continues enforcing policy when the Cloud is unreachable.
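The cache-and-continue behavior can be sketched as follows; `load_config` and the `fetch` callable are illustrative stand-ins for the Outpost's HTTPS call to the control plane, not the actual Outpost code.

```python
import json
import time
from pathlib import Path

def load_config(fetch, cache_path: Path) -> dict:
    """Fetch policy configuration from the Cloud control plane,
    falling back to the last cached copy when the Cloud is unreachable.

    `fetch` is a callable returning the config dict. On success the
    result is persisted so a later outage can still enforce policy.
    """
    try:
        config = fetch()
        cache_path.write_text(json.dumps({"at": time.time(), "config": config}))
        return config
    except OSError:
        # Control plane unreachable: keep enforcing with the cached policy.
        cached = json.loads(cache_path.read_text())
        return cached["config"]
```

The important property is that a control-plane outage degrades configuration freshness, never enforcement: the Outpost always has a policy to apply.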

Outpost registration and certificate issuance are managed by the Cloud control plane. Each Outpost receives a unique client certificate from the Cloud CA at registration time. This certificate is used for mTLS authentication on all /outpost/* requests.

See Deployment guide for Outpost setup and network requirements.


The infrastructure is designed for scale-to-zero pre-production operation.

| State | Monthly cost |
|---|---|
| Parked (no active workloads) | ~$966 |
| Demo session (4-hour active window) | ~$966 + ~$6/session |

Parked cost by layer (the five layers sum to the ~$966/month parked total):

| Layer | Parked cost |
|---|---|
| L1 Data | $720 |
| L2 Compute | $133 |
| L3 Ingress | $75 |
| L4 CI/CD | $8 |
| L5 Monitoring | $30 |

The $720/month data layer cost reflects zone-redundant PostgreSQL HA (which bills at 2× compute on v5-series instances) and four Azure Private Endpoints (~$35/month). This layer cannot scale to zero — the database and cache must remain available for all services.