
Outpost deployment architecture

The Outpost is a self-contained proxy that runs entirely within the customer’s VPC. It intercepts AI model requests, enforces policies and DLP rules, writes a tamper-evident local audit log, and forwards requests to cloud model providers — without customer prompts leaving the VPC before inspection.

This document covers containerization, deployment models (Docker Compose, Kubernetes, air-gap), every runtime component, network topology, and resource requirements.


The Outpost is distributed as a Docker image built from a two-stage Dockerfile.

Stage 1 — UI builder: node:20-slim. Builds the admin UI React bundle from ui/ using npm ci + npm run build.

Stage 2 — Python runtime: python:3.12-slim. The production image:

  • Installs Python dependencies from pyproject.toml (pip install -e ".[runtime]") in a cached layer before copying application code.
  • Copies the outpost/ application package.
  • Copies the UI build output from Stage 1 (/ui/dist → /app/ui/dist).
  • Creates a non-root user appuser (UID/GID 1000) with ownership of /app.
  • Runs as appuser — no root privileges at runtime.

CPU vs GPU images: A build argument INFERENCE_MODE (default: cpu) sets the DLP_NER_DEVICE environment variable at build time. The GPU image is built with --build-arg INFERENCE_MODE=gpu.

| Image tag | DLP_NER_DEVICE default | Use case |
| --- | --- | --- |
| arbitex/outpost:&lt;VERSION&gt; | cpu | Standard deployment |
| arbitex/outpost:&lt;VERSION&gt;-gpu | gpu | GPU-accelerated DLP |

| Port | Binding | Purpose |
| --- | --- | --- |
| 8300 | 0.0.0.0 | AI proxy — chat completions, model list, health probes |
| 8301 | 127.0.0.1 | Admin API — emergency overrides, PROMPT hold management |

Port 8301 binds to localhost only. In Kubernetes deployments it is not exposed via a Service; access requires kubectl port-forward.

The Dockerfile health check polls http://localhost:8300/health every 30 seconds (10-second timeout, 30-second start period, 3 retries). Container orchestrators should use the /healthz (liveness) and /readyz (readiness) endpoints:

| Endpoint | Returns 200 when |
| --- | --- |
| GET /healthz | Process is running |
| GET /readyz | Policy bundle is loaded |

The readiness probe fails until a policy bundle has been loaded from cache or synced from the management plane. Gate traffic on the readiness probe.
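The probe semantics can be summarised in a small sketch (hypothetical helpers, not the actual handler code — liveness always answers 200 while the process is up; readiness answers 200 only once a bundle has been loaded from cache or synced):

```python
def healthz() -> int:
    """Liveness: the process is running, so always 200."""
    return 200

def readyz(policy_bundle_loaded: bool) -> int:
    """Readiness: 200 only after a policy bundle has been loaded."""
    return 200 if policy_bundle_loaded else 503
```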


Three deployment models are supported. All produce the same runtime behaviour — the differences are in how the container is managed and how the image is delivered.

For single-host deployments. The docker-compose.outpost.yml file in the repository defines the outpost service with the required port mappings, volume mounts, and environment variable pass-through.

Storage volumes:

  • ./policy_cache:/app/policy_cache — persistent policy bundle cache
  • ./audit_buffer:/app/audit_buffer — persistent local audit log
  • ./certs:/app/certs:ro — read-only mTLS certificate mount

Managed with:

```sh
docker compose -f docker-compose.outpost.yml up -d
docker compose -f docker-compose.outpost.yml down
```

For Kubernetes deployments, the Helm chart is at charts/arbitex-outpost/. The chart configures:

  • Replicas: default 2 (for high availability)
  • PodDisruptionBudget: minAvailable: 1 — ensures at least one replica is available during rolling updates
  • PersistentVolumeClaims: 1 Gi for policy_cache, 5 Gi for audit_buffer
  • mTLS certificates: supplied via an existing Kubernetes Secret (3 files: outpost.pem, outpost.key, ca.pem)
  • Security context: runAsNonRoot: true, runAsUser: 1000, readOnlyRootFilesystem: true, all capabilities dropped
  • HPA: disabled by default; enable with autoscaling.enabled: true (min 2 replicas to satisfy PDB)

Resource profiles:

| Mode | Memory request | Memory limit | CPU request | CPU limit |
| --- | --- | --- | --- | --- |
| Standard (CPU DLP) | 512 Mi | 2 Gi | 250m | 2 |
| GPU DLP | 2 Gi | 8 Gi | 500m | 4 |

With CredInt enabled, the container memory limit must be at least 2 Gi to accommodate the ~470 MB bloom-filter RSS; the default limit of 2 Gi is sufficient when using the bundled 10% FPR filter.

For hosts with no internet access. The air-gap model packages all runtime assets — Docker images (CPU and GPU), configuration files, policy bootstrap bundle, and optionally the GeoIP MMDB — into a single tarball built on an internet-connected machine and transferred to the target host.

Internet-connected build machine                 Air-gapped target host
─────────────────────────────────                ────────────────────────
arbitex-outpost repo                             /tmp/airgap-&lt;VERSION&gt;/
│                                                │
├── docker build (CPU + GPU) ──────────────────► ├── outpost-image-&lt;VERSION&gt;.tar.gz
├── make-airgap.sh stages:                       │     docker load → local daemon
│   ├── docker-compose.outpost.yml               ├── docker-compose.outpost.yml
│   ├── .env.example                             ├── .env (from install.sh prompts)
│   ├── install.sh                               ├── policy_cache/policy_bundle.json
│   ├── default-policy-bundle.json               │     (bootstrap → live on sync)
│   └── GeoLite2-City.mmdb (optional)            └── geoip/GeoLite2-City.mmdb
└── arbitex-outpost-airgap-*.tar.gz ───────────► install.sh runs on target host

Key properties of the air-gap model:

  • No registry pull: images loaded from tarball via docker load.
  • Bootstrap policy bundle: an empty bundle (version: "bootstrap-offline") is placed in policy_cache/ so the outpost can start before reaching the management plane.
  • GeoIP: bundled at package build time if geoip/GeoLite2-City.mmdb is present; disabled otherwise.
  • CredInt: bloom filter is embedded in the Docker image — fully operational in air-gap mode with no network dependency.
  • Management plane: the air-gapped outpost connects to the management plane via the same mTLS path as standard deployments, provided it can reach the platform URL. If the management plane is entirely unreachable, the outpost operates using the locally cached policy bundle.

See Air-gap deployment guide for the complete installation procedure.


The Outpost consists of nine independently managed components, all started during the lifespan context in outpost/main.py.

| Component | Class | Module | Purpose |
| --- | --- | --- | --- |
| Proxy router | ProxyRouter | outpost/proxy.py | Handles chat completion requests; enforces DLP, budget, and policy rules |
| DLP pipeline | DLPPipeline | outpost/dlp/pipeline.py | 4-tier local content inspection (regex → NER → DeBERTa → CredInt) |
| Audit logger | AuditLogger | outpost/audit/logger.py | HMAC-chained local audit buffer on disk |
| Audit sync worker | AuditSyncWorker | outpost/audit/sync.py | Background task: pushes unsynced events to Platform via mTLS |
| SIEM direct sink | SIEMDirectSink | outpost/audit/siem.py | Optional: forwards events directly to Splunk HEC or syslog (parallel to audit sync) |
| PROMPT hold store | PromptHoldStore | outpost/prompt_hold.py | Suspends requests matching action=PROMPT until an admin approves or denies |
| Policy sync client | PolicySyncClient | outpost/policy_sync.py | Polls Platform for policy bundle updates (ETag-based, every 60 s) |
| Heartbeat sender | HeartbeatSender | outpost/heartbeat.py | POSTs health payload to Platform every 120 s |
| Cert rotation client | CertRotationClient | outpost/cert_rotation.py | Monitors mTLS cert expiry and performs zero-downtime renewal |

Two additional components initialised by the proxy:

| Component | Class | Module | Purpose |
| --- | --- | --- | --- |
| Budget enforcer | BudgetEnforcer | outpost/budget.py | Stateless check against the policy bundle budget field; blocks with HTTP 429 on hard cap |
| Local override store | LocalOverrideStore | outpost/overrides.py | Emergency admin overrides (routing, provider disable) via the admin API |

The Outpost exposes two TCP ports and makes outbound connections to three destinations.

         ┌──────────────────────────────────────┐
         │             Customer VPC             │
         │                                      │
User/App ──────────► Port 8300 (proxy)          │
Admin UI ──────────► Port 8301 (admin)          │
         │                                      │
         │  Outpost process                     │
         │   ├── DLP pipeline (local, in-proc)  │
         │   ├── Audit buffer (local disk)      │
         │   └── Policy cache (local disk)      │
         │                  │                   │
         │             mTLS egress              │
         │                  │                   │
         └──────────────────┼───────────────────┘
              ┌─────────────┼─────────────┐
              │             │             │
    Platform management  Provider APIs  SIEM endpoint
    (policy sync,        (OpenAI, etc.) (optional)
     audit sync,         ─── HTTPS ───  ─── HEC/syslog
     heartbeat,
     cert renewal)
    ─── mTLS ───
| Port | App | CORS | Purpose |
| --- | --- | --- | --- |
| 8300 | Proxy app | Permissive (customer apps) | Chat completions, model list, health probes |
| 8301 | Admin app | Restrictive (localhost only) | Override endpoints, PROMPT hold management, status API |

The admin port is configured via ADMIN_PORT (default 8301); the proxy port (8300) is fixed.

All outbound connections from the Outpost use standard HTTPS. Connections to the Platform management plane additionally use mTLS client certificates.

| Destination | Protocol | Auth | Used by |
| --- | --- | --- | --- |
| PLATFORM_MANAGEMENT_URL | HTTPS + mTLS | Client cert | Policy sync, audit sync, heartbeat, cert renewal |
| Provider base URLs (OpenAI, Anthropic, etc.) | HTTPS | Provider API key | ProxyRouter forwarding |
| SIEM_DIRECT_URL (optional) | HTTPS or UDP/TCP | HEC token or none | SIEMDirectSink |

For each incoming chat completion request:

1. Request arrives at ProxyRouter (port 8300)
2. Budget check: BudgetEnforcer reads the policy bundle budget field
   └─ hard cap exceeded → HTTP 429, audit event logged
3. DLP scan: DLPPipeline inspects all message content
   ├─ BLOCK → HTTP 403, audit event logged
   ├─ REDACT → redacted content forwarded, audit event logged
   ├─ PROMPT → request suspended; PromptHoldStore creates hold
   │            admin must approve/deny within PROMPT_HOLD_TIMEOUT_SECONDS
   │            timeout or deny → HTTP 403
   │            approve → continue to step 4
   └─ ALLOW → continue to step 4
4. Provider resolution: routing rules from the policy bundle determine the provider
5. Request forwarded to the provider (streaming or non-streaming)
6. Response returned to the caller
7. AuditLogger writes an HMAC-chained entry to the local disk buffer
   └─ SIEMDirectSink (if enabled): event placed on ring buffer → forwarded

The DLP pipeline (outpost/dlp/pipeline.py) runs entirely within the Outpost process — no content leaves the VPC during inspection.

| Tier | Scanner | Model | Speed | When active |
| --- | --- | --- | --- | --- |
| 1 | Regex | Pattern rules from policy bundle | &lt; 1 ms | Always |
| 2 | NER | spaCy (en_core_web_sm by default) | 5–50 ms | When DLP_NER_ENABLED=true (default) |
| 3 | DeBERTa | DeBERTa-v3 ONNX | 50–500 ms | When DEBERTA_MODEL_PATH is set and file exists |
| 4 | CredInt | Bloom filter (861M+ compromised credentials) | &lt; 1 ms | When CREDINT_ENABLED=true (default) |

Tier 1 runs first. If a regex match triggers BLOCK, the cascade terminates early and the request is blocked without invoking NER or DeBERTa. Otherwise, Tier 2 runs, its results are merged, and Tier 3 runs if available. Tier 4 (CredInt) runs after the other tiers — it does not short-circuit the cascade on a hit; it contributes its entity detections and the policy resolver determines the final action.

Tier 4 checks credential-shaped tokens extracted from the prompt against the Arbitex breach corpus using a bundled bloom filter. A CredInt hit means the specific credential has been seen in breach data — not just that a credential-shaped string is present (which earlier tiers handle).
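The cascade logic can be sketched as follows. This is illustrative only (the real pipeline is outpost/dlp/pipeline.py): each tier is modelled as a callable returning a tier action plus detections, and the policy resolver is modelled as "most severe suggested action wins", which is an assumption — the actual resolver consults the policy bundle:

```python
def run_cascade(tiers):
    """tiers: list of callables returning (action, detections).
    Tier 1 may short-circuit with BLOCK; later tiers (including
    CredInt) only contribute detections to the final resolution."""
    severity = {"ALLOW": 0, "REDACT": 1, "PROMPT": 2, "BLOCK": 3}
    first_action, detections = tiers[0]()
    if first_action == "BLOCK":            # regex BLOCK terminates early
        return "BLOCK", detections
    merged = list(detections)
    for tier in tiers[1:]:                 # NER, DeBERTa, CredInt
        _, extra = tier()
        merged.extend(extra)
    # resolver sketch: most severe detection wins
    final = max((d["action"] for d in merged),
                key=severity.get, default="ALLOW")
    return final, merged
```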

Bloom filter specifications:

| Property | Value |
| --- | --- |
| Corpus size | 861M+ compromised credentials |
| False positive rate (FPR) | 10% |
| Compressed filter size | ~440–470 MB |
| Runtime RAM footprint | ~470 MB |
| Lookup latency | &lt; 1 μs per token |

The 10% FPR means 1-in-10 credential-shaped tokens that are not in the breach corpus trigger a false positive. In practice, the user-visible false positive rate is very low because: (a) only tokens that passed Tier 1 credential-shape heuristics reach the bloom filter, and (b) credential-shaped tokens are rare in normal business prompts.
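A minimal bloom filter illustrates the lookup semantics (illustrative only — the production filter format and hash scheme are internal to the Outpost image). Membership tests can false-positive but never false-negative, which is why a CredInt hit is a strong but not absolute signal:

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits: int, num_hashes: int):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, token: str):
        # derive k positions from salted SHA-256 digests
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{token}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, token: str):
        for p in self._positions(token):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, token: str) -> bool:
        # True only if every position is set — may false-positive
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(token))
```

An added token is always found; a token that was never added is rejected unless all of its positions happen to collide with set bits, which is the source of the FPR.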

Deployment modes:

  • Bundled (default / air-gap): The filter is embedded in the Docker image at build time. No network access is required at startup or during request processing. The filter is current as of the image build date.
  • CDN refresh (internet-connected): If CREDINT_DOWNLOAD_URL is set, the Outpost attempts to download a fresher filter from Arbitex CDN at startup (45-second timeout). If the download succeeds and the snapshot date is newer than the bundled filter, the downloaded filter is used. If the download fails, the bundled filter is used — startup succeeds either way.

Memory: With CredInt enabled, the container memory limit must be at least 2 Gi. The Helm chart default (resources.limits.memory: "2Gi") accounts for this.

By default the pipeline runs on CPU. GPU acceleration is available for Tier 2 (NER) and Tier 3 (DeBERTa).

| Mode | Config | Requirement |
| --- | --- | --- |
| CPU (default) | DLP_NER_DEVICE=cpu (or auto on CPU-only hosts) | No GPU required |
| GPU (NER) | DLP_NER_DEVICE=cuda | CUDA-capable GPU, matching PyTorch install |
| GPU (DeBERTa) | DLP_NER_DEVICE=cuda + DEBERTA_MODEL_PATH=&lt;path&gt; | CUDA-capable GPU + ONNX Runtime with GPU provider |

For DLP_NER_DEVICE=auto, the pipeline selects CUDA if a GPU is available, otherwise falls back to CPU.
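The auto selection reads as a two-line rule (hypothetical helper; a typical availability probe would be something like torch.cuda.is_available(), which is an assumption about the implementation):

```python
def resolve_device(setting: str, cuda_available: bool) -> str:
    if setting == "auto":
        return "cuda" if cuda_available else "cpu"
    return setting  # explicit "cpu" or "cuda" is honoured as-is
```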

| Mode | CPU | RAM | VRAM |
| --- | --- | --- | --- |
| Tier 1 only (regex) | ≈ 0.1 vCPU idle | 256 MB | — |
| Tier 1 + 2 (NER, en_core_web_sm) | 0.5–1 vCPU at peak | 512 MB | — |
| Tier 1 + 2 + 3 (DeBERTa, CPU) | 2–4 vCPU at peak | 2–4 GB | — |
| Tier 1 + 2 + 3 (DeBERTa, GPU) | 0.5 vCPU | 512 MB | 2 GB |

These are approximate figures. Actual usage depends on request concurrency and text length.


The PolicySyncClient polls the Platform management plane for policy bundle updates.

  • Endpoint: GET {PLATFORM_MANAGEMENT_URL}/api/internal/outpost/{OUTPOST_ID}/policies
  • Interval: POLICY_SYNC_INTERVAL seconds (default 60)
  • Conditional requests: ETag header sent with each request; 304 = no change, no disk write
  • HMAC verification: If POLICY_HMAC_KEY is set, the bundle_hmac field in the response is verified before the bundle is accepted. Bundles without a valid HMAC are rejected.
  • Disk cache: Written atomically to POLICY_CACHE_PATH/policy_bundle.json (write to .tmp, then os.replace)
  • Offline resilience: On network failure, the cached bundle on disk remains active. At startup, if no network is available, the Outpost will attempt first sync for up to 30 seconds (retrying every 3 seconds), then fall back to the cached bundle.
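The HMAC-verification and atomic-write steps above can be sketched together. This is an illustrative helper: the exact signed payload and bundle field layout are assumptions — only the bundle_hmac field name and the write-to-.tmp-then-os.replace pattern come from the description above:

```python
import hashlib, hmac, json, os

def accept_bundle(bundle: dict, hmac_key: bytes, cache_path: str) -> bool:
    # verify the bundle signature before trusting it
    body = json.dumps(bundle["policies"], sort_keys=True).encode()
    expected = hmac.new(hmac_key, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, bundle.get("bundle_hmac", "")):
        return False               # reject unsigned or tampered bundle
    tmp = cache_path + ".tmp"
    with open(tmp, "w") as f:      # write to .tmp first...
        json.dump(bundle, f)
    os.replace(tmp, cache_path)    # ...then atomic rename into place
    return True
```

Because os.replace is atomic on POSIX filesystems, a crash mid-write leaves either the old cache file or the new one, never a half-written bundle.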

Every request produces an audit entry. The audit pipeline has two stages.

The AuditLogger appends entries to a JSONL file at AUDIT_BUFFER_PATH/audit.jsonl. Each entry includes an HMAC-SHA256 chain signature computed over the entry content and the previous entry’s hash. This creates a tamper-evident log that can be verified offline.

  • HMAC key: AUDIT_HMAC_KEY — required; startup fails if empty
  • Ring buffer: MAX_AUDIT_BUFFER_ENTRIES (default 100,000) — oldest entries are rotated out when the buffer is full
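The chain construction and its offline verification can be sketched like this (illustrative entry schema; the real one lives in outpost/audit/logger.py). Each signature covers the entry plus the previous signature, so editing or deleting any entry breaks every later link:

```python
import hashlib, hmac, json

def sign_entry(key: bytes, entry: dict, prev_sig: str) -> str:
    # signature covers the entry content AND the previous link
    payload = json.dumps(entry, sort_keys=True).encode() + prev_sig.encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify_chain(key: bytes, entries: list[dict]) -> bool:
    prev = ""
    for e in entries:
        body = {k: v for k, v in e.items() if k != "sig"}
        if sign_entry(key, body, prev) != e["sig"]:
            return False           # tamper detected at this link
        prev = e["sig"]
    return True
```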

The AuditSyncWorker runs as a background task and pushes unsynced entries to Platform in batches.

  • Interval: AUDIT_SYNC_INTERVAL_SECONDS (default 30)
  • Batch size: 50 events per POST to {PLATFORM_MANAGEMENT_URL}/api/internal/outpost-audit-sync
  • Resilience: Events remain in the local buffer if Platform is unreachable. The Platform alerts if no events arrive for more than 5 minutes.
  • Auth: mTLS client certificate

When SIEM_DIRECT_ENABLED=true, the SIEMDirectSink runs in parallel with the sync worker. Events are placed on an asyncio ring buffer (default capacity: 10,000) and forwarded to Splunk HEC or syslog. This is additive — events go to both the Platform sync path and the SIEM direct path.
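The ring-buffer behaviour can be sketched with a bounded deque (illustrative; the real sink uses an asyncio buffer in outpost/audit/siem.py). The key property is that put never blocks: when the buffer is full the oldest event is dropped, so a slow SIEM endpoint cannot stall the request path:

```python
from collections import deque

class RingBuffer:
    def __init__(self, capacity: int = 10_000):
        self._q = deque(maxlen=capacity)

    def put(self, event):
        # never blocks; deque drops the oldest item when full
        self._q.append(event)

    def drain(self):
        # hand the forwarder everything currently buffered
        items = list(self._q)
        self._q.clear()
        return items
```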

See Outpost SIEM direct sink for configuration details.


The HeartbeatSender POSTs a health payload to the Platform every 120 seconds.

  • Endpoint: POST {PLATFORM_MANAGEMENT_URL}/v1/orgs/{ORG_ID}/outposts/{OUTPOST_ID}/heartbeat
  • Auth: mTLS client certificate
  • Payload fields:

| Field | Description |
| --- | --- |
| version | Outpost software version (currently 0.1.0) |
| uptime | Seconds since process start |
| policy_version | Version tag from the active policy bundle |
| last_sync_at | ISO 8601 timestamp of last successful policy sync |
| dlp_model_version | DeBERTa model filename, or "none" if Tier 3 is inactive |
| pending_audit_events | Count of unsynced audit entries in the local buffer |
| tier3_active | true if DeBERTa scanner is loaded and available |
  • Update detection: The heartbeat response includes a latest_version field. If the running version differs, the Outpost logs a warning and the Platform marks the outpost as requiring an update.

All Outpost-to-Platform connections use mTLS. Three files are required:

| Setting | Default path | Description |
| --- | --- | --- |
| OUTPOST_CERT_PATH | certs/outpost.pem | Outpost client certificate (issued by Platform CA) |
| OUTPOST_KEY_PATH | certs/outpost.key | Private key for the client certificate |
| OUTPOST_CA_PATH | certs/ca.pem | Platform CA certificate for server verification |

Certificates are issued by the Cloud portal when an outpost is registered. Download the certificate bundle from the portal and place the files at the configured paths before starting the Outpost.

The CertRotationClient checks certificate expiry every hour. When the certificate is within 30 days of expiry, it requests a renewed certificate from the Platform.

Rotation flow:

  1. Check expiry of OUTPOST_CERT_PATH — if > 30 days remaining, no action
  2. POST renewal request to {PLATFORM_MANAGEMENT_URL}/v1/outposts/{OUTPOST_ID}/cert-renewal
  3. Write new cert, key, and CA to staging paths (.new suffix)
  4. Verify TLS handshake with staged files (structural check — no network connection needed)
  5. Atomic rename: staging → live paths (os.replace)
  6. Close existing HTTP clients (forces reload on next request)
  7. Invoke on_cert_rotated callback to signal other components

If the staged cert fails verification, the swap is aborted and the original cert remains in use. Rotation failures are logged and retried on the next hourly cycle.
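Steps 3–5 can be sketched as a stage-verify-swap helper (hypothetical function and verify hook; the .new suffix and os.replace come from the flow above). Because verification happens on the staged files before any rename, a failure leaves the live certificate untouched:

```python
import os

def swap_staged(live_paths: list[str], verify) -> bool:
    staged = [p + ".new" for p in live_paths]
    if not verify(staged):          # structural TLS check on staged files
        for s in staged:            # abort: discard staging, keep live cert
            if os.path.exists(s):
                os.remove(s)
        return False
    for s, live in zip(staged, live_paths):
        os.replace(s, live)         # atomic rename, staging → live
    return True
```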


The Outpost writes to two local directories. Both directories must be writable by the Outpost process.

| Setting | Default | Contents |
| --- | --- | --- |
| POLICY_CACHE_PATH | policy_cache/ | policy_bundle.json — active policy bundle |
| AUDIT_BUFFER_PATH | audit_buffer/ | audit.jsonl — HMAC-chained audit buffer |

For production deployments, mount these on persistent storage (not the container ephemeral layer) so that audit events and policy cache survive restarts.

When SIEM_DIRECT_DEAD_LETTER_PATH is set, failed SIEM deliveries are appended to a JSONL file at that path. This file grows without bound — add log rotation.


All configuration is via environment variables (12-factor). Required variables must be set before startup.

Required:

| Variable | Description |
| --- | --- |
| OUTPOST_ID | UUID issued by Cloud portal during registration |
| ORG_ID | Organisation UUID |
| PLATFORM_MANAGEMENT_URL | Platform management plane URL (e.g. https://api.arbitex.ai) |
| AUDIT_HMAC_KEY | HMAC-SHA256 key for audit chain integrity — startup fails if empty |

Security and mTLS:

| Variable | Default | Description |
| --- | --- | --- |
| POLICY_HMAC_KEY | "" (disabled) | HMAC key for verifying signed policy bundles |
| PROVIDER_KEY_ENCRYPTION_KEY | "" (plaintext) | Fernet key for decrypting encrypted provider API keys |
| OUTPOST_CERT_PATH | certs/outpost.pem | mTLS client certificate path |
| OUTPOST_KEY_PATH | certs/outpost.key | mTLS private key path |
| OUTPOST_CA_PATH | certs/ca.pem | Platform CA certificate path |

DLP:

| Variable | Default | Description |
| --- | --- | --- |
| DLP_ENABLED | true | Enable DLP scanning |
| DLP_NER_ENABLED | true | Enable Tier 2 NER (spaCy) |
| DLP_NER_MODEL | en_core_web_sm | spaCy model name |
| DLP_NER_DEVICE | auto | Device selection: auto, cpu, cuda |
| DLP_DEBERTA_ENABLED | false | Enable Tier 3 DeBERTa scanner |
| DEBERTA_MODEL_PATH | "" | Path to DeBERTa ONNX file; auto-activates Tier 3 when set |

CredInt:

| Variable | Default | Description |
| --- | --- | --- |
| CREDINT_ENABLED | true | Enable Tier 4 CredInt bloom filter scan |
| CREDINT_BLOOM_PATH | /app/credint.bf | Path to bundled bloom filter inside the container (set at build time) |
| CREDINT_DOWNLOAD_URL | "" | CDN URL for startup filter refresh; empty = air-gap mode |
| CREDINT_DOWNLOAD_TIMEOUT_SECONDS | 45 | Maximum seconds to wait for CDN download at startup |
| CREDINT_FPR_THRESHOLD | 0.10 | Downloaded filters with FPR higher than this value are rejected |

Audit and policy sync:

| Variable | Default | Description |
| --- | --- | --- |
| AUDIT_BUFFER_PATH | audit_buffer/ | Local audit buffer directory |
| AUDIT_SYNC_INTERVAL_SECONDS | 30 | Seconds between audit sync pushes to Platform |
| MAX_AUDIT_BUFFER_ENTRIES | 100000 | Ring buffer capacity for local audit log |
| POLICY_CACHE_PATH | policy_cache/ | Directory for cached policy bundle |
| POLICY_SYNC_INTERVAL | 60 | Seconds between policy sync polls |

SIEM direct sink:

| Variable | Default | Description |
| --- | --- | --- |
| SIEM_DIRECT_ENABLED | false | Enable direct SIEM forwarding |
| SIEM_DIRECT_TYPE | splunk_hec | Output type: splunk_hec or syslog |
| SIEM_DIRECT_URL | "" | Endpoint URL |
| SIEM_DIRECT_TOKEN | "" | HEC auth token (unused for syslog) |
| SIEM_DIRECT_BUFFER_CAPACITY | 10000 | Ring buffer capacity (events) |
| SIEM_DIRECT_DEAD_LETTER_PATH | "" | Dead-letter JSONL path; empty = disabled |

PROMPT hold:

| Variable | Default | Description |
| --- | --- | --- |
| PROMPT_HOLD_TIMEOUT_SECONDS | 300 | Seconds to wait for admin decision before auto-deny |

Admin API:

| Variable | Default | Description |
| --- | --- | --- |
| ADMIN_PORT | 8301 | Admin API port |
| OUTPOST_EMERGENCY_ADMIN_KEY | "" | Emergency admin API key |

Other:

| Variable | Default | Description |
| --- | --- | --- |
| LOG_LEVEL | info | Log level: debug, info, warning, error, critical |
| BUDGET_ENFORCEMENT_ENABLED | true | Enable budget cap enforcement |

The Outpost exposes two probe endpoints on port 8300:

| Endpoint | Purpose |
| --- | --- |
| GET /healthz | Liveness — returns HTTP 200 if the process is running |
| GET /readyz | Readiness — returns HTTP 200 if policy bundle is loaded; 503 if not |

The readiness probe fails until the first policy bundle is loaded (either from cache or from the first successful sync). Configure your orchestrator to gate traffic on the readiness probe.