ADR-002: Fail-Closed DLP Inference
Status: Accepted
Date: 2026-03
Deciders: Platform team (platform-0043, T558)
Context
The Arbitex Platform hosts GPU-accelerated DLP inference microservices (NER entity detection and DeBERTa contextual classification). These services can be temporarily unavailable due to:
- Container restarts or OOM kills
- GPU resource exhaustion
- Rolling deployments
- Startup latency (model loading takes time)
When a downstream inference microservice is unavailable, the platform must decide what to do with the pending DLP scan request: block the AI request (fail-closed) or allow it through unscanned (fail-open).
The original implementation was fail-open — the legacy path assumed that availability was more important than enforcement correctness. As Arbitex customers deployed into regulated environments (healthcare, finance, legal), this became unacceptable: an unavailable DLP scanner silently passing data to AI providers is a data protection failure.
Decision
Fail-closed by default. When a DLP inference microservice (NER or DeBERTa) is unavailable or returns an error, the affected request is blocked rather than passed unscanned.
The DLP_INFERENCE_FAIL_MODE environment variable controls this behaviour:
- closed (default): Block the request when inference is unavailable.
- open: Allow the request through unscanned (legacy behaviour).
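A minimal sketch of the dispatch logic this implies. The names here (scan_with_inference callable, InferenceUnavailableError) are illustrative assumptions, not the actual Arbitex API:

```python
import os

# Read the fail mode once at startup; "closed" is the documented default.
FAIL_MODE = os.environ.get("DLP_INFERENCE_FAIL_MODE", "closed")


class InferenceUnavailableError(Exception):
    """Hypothetical error raised when the NER/DeBERTa service is unreachable."""


def handle_request(payload, scan):
    """Apply the fail mode when the DLP inference call cannot complete."""
    try:
        verdict = scan(payload)  # call out to the DLP inference microservice
    except InferenceUnavailableError:
        if FAIL_MODE == "open":
            return ("allowed-unscanned", payload)  # legacy fail-open behaviour
        return ("blocked", None)  # fail-closed default
    return ("allowed", payload) if verdict == "clean" else ("blocked", None)
```

With the variable unset, an unreachable scanner yields a blocked request; only an explicit open setting restores the legacy pass-through.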
Consequences
Positive:
- Enforces the security invariant: data cannot reach AI providers without passing DLP if DLP is configured.
- Aligns with the principle of “secure by default” — operators who want fail-open must explicitly opt in.
- Satisfies HIPAA, SOC 2, and financial DLP policy requirements where bypass of scanning controls is prohibited.
- Prevents silent data exfiltration during infrastructure degradation events.
Negative / trade-offs:
- Increased availability impact: GPU microservice downtime now directly blocks AI requests rather than silently skipping scanning. Operators must ensure high SLAs for the DLP microservices.
- False negatives become hard errors. A flapping inference service can cause user-visible request failures.
- Existing deployments inherit the new fail-closed default on upgrade. Setting DLP_INFERENCE_FAIL_MODE=closed explicitly is equivalent to leaving it unset, so no configuration change is required, but operators should be aware of the behaviour change when upgrading.
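For deployments that deliberately want to keep the legacy behaviour after upgrading, the mode can be pinned in the service environment. A hypothetical sketch; the actual deployment mechanism (container env, Helm values, systemd unit) varies:

```shell
# Pin the legacy fail-open behaviour explicitly (not recommended for
# regulated environments). Omitting the variable yields the new
# fail-closed default.
export DLP_INFERENCE_FAIL_MODE=open
echo "$DLP_INFERENCE_FAIL_MODE"
```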
Mitigations:
- Platform health checks alert on GPU microservice unavailability before it impacts DLP scanning.
- Retry logic with a short timeout window reduces the impact of transient failures.
- DLP_INFERENCE_FAIL_MODE=open provides a documented escape hatch for availability-prioritised deployments.
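The retry mitigation might look like the following sketch. The attempt count and delay are illustrative, and ConnectionError stands in for whatever error the real inference client raises:

```python
import time


def scan_with_retry(scan, payload, attempts=2, delay=0.1):
    """Retry a DLP scan briefly on transient connection failure.

    If the service is still unavailable after the final attempt, the
    error propagates and the caller applies DLP_INFERENCE_FAIL_MODE.
    """
    for attempt in range(attempts):
        try:
            return scan(payload)
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # exhausted retries; let the fail mode decide
            time.sleep(delay)  # brief pause before retrying
```

Keeping the window short matters: a long retry loop would add user-visible latency to every request during an outage without changing the eventual fail-closed outcome.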