# DeBERTa Tier 3 — admin guide
Tier 3 adds contextual classification to the Outpost DLP pipeline. It runs a fine-tuned DeBERTa ONNX model (`deberta-dlp-v2-r2`) directly inside the Outpost process and classifies each text chunk as `pii` or `clean`. Chunks classified as `pii` with confidence ≥ 0.70 are appended to the merged findings from Tier 1 (regex) and Tier 2 (NER) and passed to the Policy Engine for enforcement.

Tier 3 is optional. When `DEBERTA_MODEL_PATH` is not set or the ONNX file is not present, the pipeline runs Tier 1 + Tier 2 only and logs an INFO message. No configuration change is required to run without Tier 3.
## What Tier 3 does

Tier 1 and Tier 2 identify entity spans using pattern matching and named-entity recognition. Tier 3 provides a second-pass contextual validation layer:
- The full request or response text is split into overlapping chunks of up to 450 characters at sentence boundaries.
- Each chunk is tokenized (max 512 tokens) and passed through the DeBERTa model.
- The model outputs logits for two classes: `pii` (class 0) and `clean` (class 1).
- The softmax probability of `pii` is compared against the 0.70 confidence threshold.
- Chunks that exceed the threshold produce a finding with `tier: "deberta"` and `action_tier: "redact"`.
The model runs entirely within the Outpost — it makes no calls to Platform or Cloud.
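The pass described above can be sketched as follows. This is an illustrative sketch, not the shipped `outpost/dlp/deberta.py` code: `split_into_chunks` here packs whole sentences greedily and omits the overlap handling, and `model_fn` stands in for the real tokenizer + ONNX forward pass.

```python
import math
import re

CHUNK_MAX_CHARS = 450        # chunk size from this guide
CONFIDENCE_THRESHOLD = 0.70  # fixed reporting threshold
PII_CLASS = 0                # class 0 = pii, class 1 = clean


def split_into_chunks(text, max_chars=CHUNK_MAX_CHARS):
    """Greedily pack whole sentences into chunks of up to max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + 1 + len(s) > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip()
    if current:
        chunks.append(current)
    return chunks


def softmax(logits):
    exps = [math.exp(x - max(logits)) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_chunks(chunks, model_fn):
    """model_fn(chunk) -> [logit_pii, logit_clean]; returns Tier 3 findings."""
    findings = []
    for chunk in chunks:
        p_pii = softmax(model_fn(chunk))[PII_CLASS]
        if p_pii >= CONFIDENCE_THRESHOLD:
            findings.append({
                "entity_type": "pii",
                "tier": "deberta",
                "action_tier": "redact",
                "confidence": round(p_pii, 3),
            })
    return findings
```

With a stub `model_fn` that returns logits `[2.0, 0.0]`, a chunk scores P(pii) ≈ 0.88 and produces a finding; logits `[0.0, 2.0]` fall below the threshold and the chunk is discarded.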
## Validated entity categories

Model: `deberta-dlp-v2-r2`. Validation set: 376 examples across 11 entity types.
| Category | Entity types | Macro F1 |
|---|---|---|
| Contact info | email, telephone | 0.79 |
| Credentials | api_key, username_password_combo | 0.76 |
| Infrastructure | ip_address | 0.79 |
| Government ID | ssn, passport | 0.70 |
| Financial identifiers | iban, credit_card | 0.73 |
| MNPI | material_contract, earnings_announcement | 0.63 |
| Multi-entity | name | 0.89 |
Overall macro F1: 0.741. Overall accuracy: 0.606. The model is calibrated for high recall (0.89 overall) with moderate precision — false positives at the chunk level are expected and are filtered downstream by the Policy Engine confidence threshold.
## Performance characteristics

| Metric | Value |
|---|---|
| Average inference latency (CPU) | 88.9 ms per request |
| P95 inference latency (CPU) | 164 ms per request |
| Chunk size | 450 chars / 512 tokens max |
| Minimum confidence for reporting | 0.70 |
CPU is functional but GPU is recommended for production. On CPU, each inference call adds ~89 ms to the DLP pipeline. On a GPU node (NVIDIA T4 or equivalent), inference drops to 5–15 ms per chunk.
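When loading the model through raw `onnxruntime`, CPU vs GPU placement is controlled by the execution provider list passed to the session. A minimal sketch — the helper names are illustrative, while `InferenceSession(path, providers=...)` is the standard ORT call:

```python
def select_providers(gpu_enabled: bool) -> list:
    """Prefer CUDA when GPU_ENABLED=true; ORT falls back to the next
    provider in the list if CUDA is unavailable in the installed build."""
    if gpu_enabled:
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def create_session(model_path: str, gpu_enabled: bool):
    # Deferred import so the helper can be imported on hosts without ORT.
    import onnxruntime as ort
    return ort.InferenceSession(model_path, providers=select_providers(gpu_enabled))
```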
## Deployment

### Model artifact

The promoted ONNX artifact layout is:

```
/home/brian/models/Arbitex/deploy/deberta-dlp-v2/
├── model.onnx              ← required
├── tokenizer.json          ← required
├── tokenizer_config.json
└── vocab files             ← required by AutoTokenizer
```

`DEBERTA_MODEL_PATH` must point to `model.onnx`. The tokenizer is loaded from the same directory (`os.path.dirname(DEBERTA_MODEL_PATH)`).
### Bare-metal or Docker deployment

Set the environment variables to the model file path:

```
DEBERTA_MODEL_PATH=/opt/arbitex/models/deberta/model.onnx
DLP_DEBERTA_ENABLED=true
```

Mount the model directory into the container:

```yaml
volumes:
  - /host/path/to/deberta-dlp-v2:/app/models/deberta:ro
environment:
  DEBERTA_MODEL_PATH: /app/models/deberta/model.onnx
  DLP_DEBERTA_ENABLED: "true"
```

### Kubernetes / Helm deployment

Add a model volume to the Outpost pod and set the Helm values:
```yaml
outpost:
  dlpDebertaEnabled: true
  debertaModelPath: /app/models/deberta/model.onnx
  gpuEnabled: true  # set false for CPU-only deployment

# Add a volume for the model
extraVolumes:
  - name: deberta-model
    persistentVolumeClaim:
      claimName: deberta-model-pvc  # or hostPath, NFS, etc.

extraVolumeMounts:
  - name: deberta-model
    mountPath: /app/models/deberta
    readOnly: true
```

Deploy:

```sh
helm upgrade arbitex-outpost ./charts/arbitex-outpost \
  -f values-override.yaml \
  --set outpost.dlpDebertaEnabled=true \
  --set outpost.debertaModelPath=/app/models/deberta/model.onnx
```

For GPU nodes, the chart sets resource requests for `nvidia.com/gpu: 1` when `outpost.gpuEnabled: true`.
## Python dependencies

The Outpost image must include at least one of these dependency sets:
| Runtime | Required packages | Notes |
|---|---|---|
| Preferred (optimum ORT) | transformers, optimum[onnxruntime], torch | Richer HuggingFace API |
| Fallback (raw onnxruntime) | transformers, onnxruntime | Minimal deps, CPU only |
Install:

```sh
# Preferred
pip install transformers "optimum[onnxruntime]" torch

# Minimal fallback
pip install transformers onnxruntime
```

(The quotes around `optimum[onnxruntime]` prevent shells such as zsh from treating the brackets as a glob pattern.)

## Configuration reference

All settings are environment variables. The Outpost image uses Pydantic settings — prefix-free, case-insensitive.
| Environment variable | Helm value | Type | Default | Description |
|---|---|---|---|---|
| `DLP_DEBERTA_ENABLED` | `outpost.dlpDebertaEnabled` | bool | `false` | Enable Tier 3. Must also set `DEBERTA_MODEL_PATH`. |
| `DEBERTA_MODEL_PATH` | `outpost.debertaModelPath` | string | `""` | Absolute path to `model.onnx`. When empty, Tier 3 is inactive. |
| `GPU_ENABLED` | `outpost.gpuEnabled` | bool | `false` | Request GPU resources (`nvidia.com/gpu: 1`). Enables the CUDA execution provider. |
| `DLP_ENABLED` | `outpost.dlpEnabled` | bool | `true` | Master DLP toggle. Tier 3 requires this to be `true`. |
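The activation logic implied by the table can be sketched as below. This is illustrative plain-`os.environ` parsing, not the actual Pydantic settings class; the defaults mirror the table.

```python
import os


def _env_bool(name: str, default: bool) -> bool:
    """Case-insensitive boolean parsing, matching common settings behavior."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def tier3_active() -> bool:
    """Tier 3 runs only when the master DLP toggle is on, the Tier 3
    toggle is on, and DEBERTA_MODEL_PATH is non-empty."""
    return (
        _env_bool("DLP_ENABLED", True)
        and _env_bool("DLP_DEBERTA_ENABLED", False)
        and bool(os.environ.get("DEBERTA_MODEL_PATH", ""))
    )
```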
## Confidence threshold

The confidence threshold is fixed at 0.70 in the current model version. Chunks where P(pii) < 0.70 are silently discarded. Chunks at or above 0.70 generate a finding with:

```json
{
  "entity_type": "pii",
  "tier": "deberta",
  "action_tier": "redact",
  "confidence": 0.84
}
```

The Policy Engine then applies compliance bundle rules. If a bundle’s DLP action for `pii` is `log_only` or `block`, that takes precedence over the Tier 3 default `redact`.
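The precedence rule can be sketched as follows; the `bundle_actions` mapping shape is an assumption for illustration, not the Policy Engine’s internal representation.

```python
def resolve_action(finding: dict, bundle_actions: dict) -> str:
    """A compliance bundle's DLP action for the entity type, when present,
    overrides the finding's default action_tier."""
    return bundle_actions.get(finding["entity_type"], finding["action_tier"])
```

For the example finding above, `resolve_action(finding, {"pii": "log_only"})` yields `log_only`, while an empty bundle mapping leaves the Tier 3 default `redact`.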
## Escalation band

Tier 3 outputs only `pii` or `clean` — it does not identify specific entity types (SSN, email, etc.). Entity-type specificity comes from Tier 1 and Tier 2; Tier 3 confirms or discards their findings contextually.
## Monitoring

### Startup log

On successful load, the Outpost logs at INFO level:

```
INFO outpost.dlp.deberta DeBERTa Tier 3 loaded via optimum ORT — contextual classification active
```

or:

```
INFO outpost.dlp.deberta DeBERTa Tier 3 loaded via onnxruntime — contextual classification active
```

If the model is not configured:

```
INFO outpost.dlp.deberta DeBERTa Tier 3 not configured — using regex+NER only. Set DEBERTA_MODEL_PATH to enable.
```

### Audit log entries

Each Tier 3 finding writes an audit log entry with the following fields:
| Field | Value |
|---|---|
| `tier` | `deberta` |
| `action_tier` | `redact` (default; may be overridden by the Policy Engine) |
| `confidence` | Float, e.g. `0.847` |
| `entity_type` | `pii` |
To extract Tier 3 audit events:

```sh
# On the Outpost host
jq 'select(.tier == "deberta")' audit_buffer/audit.jsonl
```

```
# Via Platform audit API
GET /api/admin/audit?tier=deberta&limit=100
```

### Confirm vs discard

- **Confirm:** P(pii) ≥ 0.70 — the chunk is appended to findings and forwarded to the Policy Engine.
- **Discard:** P(pii) < 0.70 — the chunk is ignored; no audit entry is written for the discarded chunk.
The ratio of confirmed to discarded chunks appears in debug logs when `LOG_LEVEL=debug`.
## Troubleshooting

### Tier 3 not activating

Symptom: Startup log shows “DeBERTa Tier 3 not configured” even after setting `DEBERTA_MODEL_PATH`.

Check:

- Confirm the file exists at the configured path inside the container:

  ```sh
  kubectl exec <pod> -- ls -la /app/models/deberta/model.onnx
  ```

- Confirm `DLP_DEBERTA_ENABLED=true` is set.
- Confirm the volume mount is correct — the path must match `DEBERTA_MODEL_PATH` exactly, pointing to the `.onnx` file, not the directory.
### Model load failure

Symptom: Warning log: `Failed to load DeBERTa ONNX model from '...': <error>. Falling back to regex+NER only.`
Common causes and fixes:
| Error message | Cause | Fix |
|---|---|---|
| `transformers not installed` | Package missing | `pip install transformers onnxruntime` |
| `No such file or directory` | Wrong path | Verify the mount and the `DEBERTA_MODEL_PATH` value |
| `ORT ONNX model load failed` | Corrupt or incompatible ONNX file | Re-export the model with a matching ONNX opset |
| `Cannot allocate memory` | Insufficient container memory | Increase the memory limit (minimum 2 Gi for CPU, 8 Gi for GPU) |
### Incorrect label mapping

Symptom: Tier 3 consistently classifies clean text as pii or vice versa.

Root cause: The `deberta-dlp-v2-r2` training checkpoint uses an inverted label order relative to the original spec — class 0 = `pii`, class 1 = `clean`. The runtime in `outpost/dlp/deberta.py` accounts for this correctly. If you load the model with a custom inference script, ensure `DEBERTA_ENTITY_LABELS[0] = "pii"` and `DEBERTA_ENTITY_LABELS[1] = "clean"`.

Do not use `argmin(logits)` — use `argmax(softmax(logits))` with the `["pii", "clean"]` label order.
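For custom inference scripts, the correct mapping can be sketched as below (illustrative; only the label order is taken from the note above):

```python
import math

# deberta-dlp-v2-r2 label order: class 0 = pii, class 1 = clean
DEBERTA_ENTITY_LABELS = ["pii", "clean"]


def predict_label(logits):
    """Return (label, confidence) via argmax over softmax probabilities.
    argmax over raw logits selects the same index; argmin inverts every
    prediction and must not be used."""
    exps = [math.exp(x - max(logits)) for x in logits]
    probs = [e / sum(exps) for e in exps]
    idx = max(range(len(probs)), key=probs.__getitem__)
    return DEBERTA_ENTITY_LABELS[idx], probs[idx]
```

For example, `predict_label([3.0, 0.5])` returns `("pii", …)` with confidence ≈ 0.92.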
### High latency on CPU

Symptom: Each DLP scan adds 100–200 ms to request latency.

Fix: Move the Outpost to a GPU node and set `GPU_ENABLED=true`. GPU inference reduces latency to 5–15 ms per chunk. Alternatively, reduce the proportion of long-text requests that require chunking, or disable Tier 3 for latency-sensitive Policy Pack paths using `dlp_deberta_enabled=false` in the relevant compliance bundle.
### ONNX Runtime version mismatch

Symptom: `InvalidGraph: Load model from ... failed: ... opset X not supported`.

Fix: The model was exported with a specific ONNX opset. Install the matching `onnxruntime` version:

```sh
pip install onnxruntime==1.17.3  # or the version used during export
```

Check the model’s opset:

```python
import onnx

m = onnx.load("/app/models/deberta/model.onnx")
print(m.opset_import)
```