ADR-002: Fail-Closed DLP Inference

Status: Accepted
Date: 2026-03
Deciders: Platform team (platform-0043, T558)

Context

The Arbitex Platform hosts GPU-accelerated DLP inference microservices (NER entity detection and DeBERTa contextual classification). These services can be temporarily unavailable due to:

  • Container restarts or OOM kills
  • GPU resource exhaustion
  • Rolling deployments
  • Startup latency (model loading takes time)

When a downstream inference microservice is unavailable, the platform must decide what to do with the pending DLP scan request: block the AI request (fail-closed) or allow it through unscanned (fail-open).

The original implementation was fail-open — the legacy path assumed that availability was more important than enforcement correctness. As Arbitex customers deployed into regulated environments (healthcare, finance, legal), this became unacceptable: an unavailable DLP scanner silently passing data to AI providers is a data protection failure.

Decision

Fail-closed by default. When a DLP inference microservice (NER or DeBERTa) is unavailable or returns an error, the affected request is blocked rather than passed unscanned.

The DLP_INFERENCE_FAIL_MODE environment variable controls this behaviour:

  • closed (default): Block the request when inference is unavailable.
  • open: Allow the request through unscanned (legacy behaviour).
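The decision logic above can be sketched as follows. This is a minimal illustration, not the platform's actual code; the function name `allow_unscanned` is a hypothetical helper invented for this example:

```python
import os

def allow_unscanned(request_id: str) -> bool:
    """Decide whether a request may proceed when DLP inference fails.

    Reads DLP_INFERENCE_FAIL_MODE; anything other than an explicit
    "open" is treated as fail-closed, so the secure default wins even
    if the variable's value is mistyped.
    """
    mode = os.environ.get("DLP_INFERENCE_FAIL_MODE", "closed")
    if mode == "open":
        # Legacy behaviour: availability over enforcement correctness.
        return True
    # Default: block the request rather than pass it unscanned.
    return False
```

Treating every value other than `open` as fail-closed keeps the control safe against misconfiguration as well as against missing configuration.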

Consequences

Positive:

  • Enforces the security invariant: when DLP is configured, no data reaches an AI provider without being scanned.
  • Aligns with the principle of “secure by default” — operators who want fail-open must explicitly opt in.
  • Satisfies HIPAA, SOC 2, and financial DLP policy requirements where bypass of scanning controls is prohibited.
  • Prevents silent data exfiltration during infrastructure degradation events.

Negative / trade-offs:

  • Increased availability impact: GPU microservice downtime now blocks AI requests outright instead of silently skipping scanning, so operators must keep DLP microservice SLAs high.
  • Outages that previously passed requests through unscanned now surface as hard errors; a flapping inference service causes user-visible request failures.
  • Existing deployments receive the fail-closed default automatically on upgrade. No configuration change is required, but customers who depend on the legacy behaviour must explicitly set DLP_INFERENCE_FAIL_MODE=open, so awareness is needed during upgrades.

Mitigations:

  • Platform health checks alert on GPU microservice unavailability before it impacts DLP scanning.
  • Retry logic with a short timeout window reduces the impact of transient failures.
  • DLP_INFERENCE_FAIL_MODE=open provides a documented escape hatch for availability-prioritised deployments.
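The retry mitigation can be sketched as below. `scan_fn`, the attempt count, and the backoff values are illustrative assumptions, not platform defaults:

```python
import time

def scan_with_retries(scan_fn, attempts: int = 3, backoff_s: float = 0.1):
    """Call an inference service, retrying transient failures.

    scan_fn is assumed to raise an exception on failure and return a
    scan verdict on success. If every attempt fails, the last error
    propagates so the caller can apply the fail-mode decision
    (block by default, pass only when explicitly configured open).
    """
    last_exc = None
    for attempt in range(attempts):
        try:
            return scan_fn()
        except Exception as exc:  # e.g. connection reset during a rolling deploy
            last_exc = exc
            time.sleep(backoff_s * (2 ** attempt))  # short exponential backoff
    raise last_exc
```

Keeping the total retry budget short matters here: with fail-closed semantics, every retry adds user-visible latency before a block decision.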