DLP API
The DLP API manages the detection rules that feed Arbitex’s three-tier data loss prevention pipeline. Rules define how sensitive content is identified in prompts and responses, and what enforcement action to take when a match is found.
All endpoints require an API key with the admin role.
Base URL: https://api.arbitex.ai/api/admin/dlp-rules
Overview
Detector types
Each rule is associated with one of three detector types, corresponding to the pipeline tiers in which it runs:

| detector_type | Pipeline tier | Description |
|---|---|---|
| regex | Tier 1 | Fast pattern matching. Runs on every request using compiled regular expressions. Low latency, no ML inference. |
| ner | Tier 2 | Named entity recognition using spaCy. Slower than regex but understands context well enough to reduce false positives on common entity classes. |
| llm | Tier 3 | Contextual classification using the DeBERTa ONNX model. Available on Outpost deployments only. Produces a confidence score used for threshold-based routing. |

Action tiers
When a rule matches, the action_tier field controls enforcement:

| action_tier | Behavior |
|---|---|
| log_only | Match is recorded in the audit log. No modification to the request or response. |
| redact | Matched content is replaced with a redaction token (e.g., [REDACTED]) before being passed downstream or returned to the caller. |
| cancel | The current streaming or non-streaming response is cancelled. The user receives an error response indicating the request could not be completed. |
| block | The request is blocked before it is forwarded to the upstream provider. |

Action tiers are evaluated independently per rule. When multiple rules match the same content, the highest-precedence action wins: block > cancel > redact > log_only.
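The precedence rule can be illustrated with a short Python sketch. The helper name `resolve_action` and the idea of passing in a plain list of matched tiers are assumptions made for illustration; only the ordering itself comes from the text above.

```python
# Lowest to highest precedence, per the ordering stated above:
# block > cancel > redact > log_only.
ACTION_PRECEDENCE = ["log_only", "redact", "cancel", "block"]


def resolve_action(matched_tiers):
    """Return the highest-precedence action tier among matched rules.

    Hypothetical helper: `matched_tiers` is a list of action_tier strings,
    one per rule that matched the same content. Returns None if empty.
    """
    if not matched_tiers:
        return None
    return max(matched_tiers, key=ACTION_PRECEDENCE.index)


print(resolve_action(["redact", "log_only", "cancel"]))  # cancel
```

An unknown tier string raises `ValueError` here, which is a choice of this sketch rather than documented API behavior.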
DLP rule schema
The following fields are present on all DLP rule objects returned by the API.

| Field | Type | Description |
|---|---|---|
| id | string (UUID) | Unique rule identifier. Assigned by the server on creation. |
| detector_name | string | Human-readable name for this rule (e.g., "PCI Credit Card Regex"). |
| detector_type | string | Detector type: "regex", "ner", or "llm". |
| entity_type | string | Entity classification label (e.g., "CREDIT_CARD", "SSN", "API_KEY"). Used for grouping, audit filtering, and reporting. |
| action_tier | string | Enforcement action: "log_only", "redact", "cancel", or "block". |
| enabled | boolean | Whether the rule is active. Disabled rules are stored but not evaluated during request processing. |
| confidence_threshold | float | Minimum confidence score required to trigger the rule (range: 0.0–1.0). For regex rules this is effectively ignored (matches are binary). For ner and llm rules, the detector score must meet or exceed this value. |
| config_json | object | Detector-specific configuration. For regex rules, this contains a pattern key. For ner rules, optional spaCy entity label overrides. For llm rules, optional model or threshold overrides. |

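How `enabled`, `detector_type`, and `confidence_threshold` interact can be sketched in Python as follows. The function `rule_triggers` is hypothetical and only restates the field descriptions above; it is not the gateway's actual evaluation code.

```python
def rule_triggers(rule, score):
    """Decide whether one detector result should trigger a rule.

    Hypothetical helper. `rule` is a dict shaped like the rule schema and
    `score` is the detector's confidence for a single match.
    """
    if not rule["enabled"]:
        return False  # disabled rules are stored but not evaluated
    if rule["detector_type"] == "regex":
        return True  # regex matches are binary, so the threshold is ignored
    # ner and llm: the score must meet or exceed the threshold
    return score >= rule["confidence_threshold"]
```

For example, an enabled ner rule with a threshold of 0.75 triggers on a score of 0.75 but not on 0.74.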
Create a DLP rule
```
POST /api/admin/dlp-rules/
```

Creates a new DLP rule. The rule is activated immediately if enabled is true.
Request body
| Field | Type | Required | Description |
|---|---|---|---|
| detector_name | string | Yes | Human-readable rule name |
| detector_type | string | Yes | "regex", "ner", or "llm" |
| entity_type | string | Yes | Entity classification label |
| action_tier | string | Yes | "log_only", "redact", "cancel", or "block" |
| enabled | boolean | No | Defaults to true |
| confidence_threshold | float | No | Defaults to 0.8. Must be between 0.0 and 1.0. |
| config_json | object | No | Detector-specific configuration. Required for regex rules (must include pattern). |

```bash
curl -X POST https://api.arbitex.ai/api/admin/dlp-rules/ \
  -H "Authorization: Bearer arb_live_your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "detector_name": "Visa/MC Credit Card Pattern",
    "detector_type": "regex",
    "entity_type": "CREDIT_CARD",
    "action_tier": "redact",
    "enabled": true,
    "confidence_threshold": 1.0,
    "config_json": {
      "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
    }
  }'
```

Response 201 Created

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "detector_name": "Visa/MC Credit Card Pattern",
  "detector_type": "regex",
  "entity_type": "CREDIT_CARD",
  "action_tier": "redact",
  "enabled": true,
  "confidence_threshold": 1.0,
  "config_json": {
    "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
  }
}
```

A DLPRuleVersion record with change_type: "create" is written atomically with the new rule.
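The constraints in the request body table can be checked client-side before a create request is sent. The following Python sketch assumes a hypothetical helper, `build_create_body`; it applies the documented defaults and validation rules but does not call the API. It compiles the pattern with Python's `re` module, which may not accept exactly the same syntax as the server's regex engine.

```python
import re

DETECTOR_TYPES = {"regex", "ner", "llm"}
ACTION_TIERS = {"log_only", "redact", "cancel", "block"}


def build_create_body(detector_name, detector_type, entity_type, action_tier,
                      enabled=True, confidence_threshold=0.8, config_json=None):
    """Validate documented constraints and return a create request body.

    Hypothetical helper; defaults mirror the request body table.
    """
    if detector_type not in DETECTOR_TYPES:
        raise ValueError(f"unknown detector_type: {detector_type!r}")
    if action_tier not in ACTION_TIERS:
        raise ValueError(f"unknown action_tier: {action_tier!r}")
    if not 0.0 <= confidence_threshold <= 1.0:
        raise ValueError("confidence_threshold must be between 0.0 and 1.0")
    config = dict(config_json or {})
    if detector_type == "regex":
        if "pattern" not in config:
            raise ValueError("regex rules require config_json.pattern")
        try:
            # Assumption: Python's engine stands in for the server's.
            re.compile(config["pattern"])
        except re.error as exc:
            raise ValueError(f"invalid regex pattern: {exc}") from exc
    return {
        "detector_name": detector_name,
        "detector_type": detector_type,
        "entity_type": entity_type,
        "action_tier": action_tier,
        "enabled": enabled,
        "confidence_threshold": confidence_threshold,
        "config_json": config,
    }
```

The returned dict can be serialized with `json.dumps` and sent as the POST body.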
List DLP rules
```
GET /api/admin/dlp-rules/
```

Returns all DLP rules for the tenant, ordered by creation time ascending.

```bash
curl https://api.arbitex.ai/api/admin/dlp-rules/ \
  -H "Authorization: Bearer arb_live_your-api-key-here"
```

Response 200 OK

```json
[
  {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "detector_name": "Visa/MC Credit Card Pattern",
    "detector_type": "regex",
    "entity_type": "CREDIT_CARD",
    "action_tier": "redact",
    "enabled": true,
    "confidence_threshold": 1.0,
    "config_json": {
      "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
    }
  },
  {
    "id": "6ba7b810-9dad-11d1-80b4-00c04fd430c8",
    "detector_name": "SSN NER Rule",
    "detector_type": "ner",
    "entity_type": "SSN",
    "action_tier": "block",
    "enabled": true,
    "confidence_threshold": 0.75,
    "config_json": {}
  }
]
```

Update a DLP rule
```
PUT /api/admin/dlp-rules/{rule_id}
```

Replaces all mutable fields of an existing rule. All fields listed in the create request body are accepted. Fields omitted from the request body are reset to their defaults, not preserved, so send the full rule object.

```bash
curl -X PUT \
  "https://api.arbitex.ai/api/admin/dlp-rules/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer arb_live_your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "detector_name": "Visa/MC Credit Card Pattern",
    "detector_type": "regex",
    "entity_type": "CREDIT_CARD",
    "action_tier": "block",
    "enabled": true,
    "confidence_threshold": 1.0,
    "config_json": {
      "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
    }
  }'
```

Response 200 OK: returns the updated rule object.
A DLPRuleVersion record with change_type: "update" is written, capturing both old_values and new_values.
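Because PUT resets omitted fields to their defaults, a safe pattern is to fetch the rule, merge the intended changes, and send the complete object. The Python sketch below shows only the merge step. The helper `full_update_body` is hypothetical, and leaving `id` out of the body (since it is already in the URL) is an assumption.

```python
# Fields accepted by the create request body, which PUT also accepts.
MUTABLE_FIELDS = (
    "detector_name", "detector_type", "entity_type", "action_tier",
    "enabled", "confidence_threshold", "config_json",
)


def full_update_body(existing, changes):
    """Merge `changes` into `existing` and return a complete PUT body.

    Hypothetical helper. `existing` is a rule object as returned by the API;
    `changes` holds only the fields to modify.
    """
    unknown = set(changes) - set(MUTABLE_FIELDS)
    if unknown:
        raise ValueError(f"not a mutable field: {sorted(unknown)}")
    merged = {**existing, **changes}
    # Keep every mutable field so nothing is silently reset to a default.
    return {field: merged[field] for field in MUTABLE_FIELDS}
```

Calling `full_update_body(rule, {"action_tier": "block"})` changes one field while carrying the other six over unchanged.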
Delete a DLP rule
```
DELETE /api/admin/dlp-rules/{rule_id}
```

Permanently removes a DLP rule. This action cannot be undone; however, the full rule state is preserved in the version history.

```bash
curl -X DELETE \
  "https://api.arbitex.ai/api/admin/dlp-rules/550e8400-e29b-41d4-a716-446655440000" \
  -H "Authorization: Bearer arb_live_your-api-key-here"
```

Response 204 No Content on success. Returns 404 if the rule does not exist.
A DLPRuleVersion record with change_type: "delete" is written, capturing the final old_values of the deleted rule.
Test a DLP rule
```
POST /api/admin/dlp-rules/test
```

Runs a detector against sample text without persisting anything. Use this to validate config_json (especially regex patterns) and tune confidence_threshold values before creating or updating a rule.
Request body (DLPTestRequest)
| Field | Type | Required | Description |
|---|---|---|---|
| detector_type | string | Yes | "regex", "ner", or "llm" |
| config_json | object | Yes | Detector configuration to test (e.g., {"pattern": "..."} for regex) |
| text | string | Yes | Sample text to run the detector against |

Response (DLPTestResponse)
The response contains a matches array of DLPTestMatch objects:
| Field | Type | Description |
|---|---|---|
| start | integer | Start character offset of the match in text (zero-based) |
| end | integer | End character offset of the match in text (exclusive, so end minus start is the match length) |
| matched_text | string | The exact substring that matched |
| confidence | float | Confidence score returned by the detector (1.0 for regex matches) |

Example — testing a regex pattern
```bash
curl -X POST https://api.arbitex.ai/api/admin/dlp-rules/test \
  -H "Authorization: Bearer arb_live_your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "detector_type": "regex",
    "config_json": {
      "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
    },
    "text": "Please charge card 4111111111111111 for the order total."
  }'
```

Response 200 OK

```json
{
  "matches": [
    {
      "start": 19,
      "end": 35,
      "matched_text": "4111111111111111",
      "confidence": 1.0
    }
  ]
}
```

If the detector produces no matches, matches is an empty array. The test endpoint does not write any audit log entries or version records.
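A regex pattern can also be given a quick local check before calling the test endpoint. The Python sketch below uses the standard `re` module to produce a result in the same shape as DLPTestResponse. The function name `dry_run_regex` is hypothetical, and Python's regex dialect is an assumed stand-in for whatever engine the gateway uses, so the endpoint remains the authoritative check.

```python
import re


def dry_run_regex(pattern, text):
    """Return matches shaped like DLPTestResponse, computed locally.

    Hypothetical helper; offsets follow the start/end convention of the
    documented example (end is exclusive).
    """
    matches = [
        {
            "start": m.start(),
            "end": m.end(),
            "matched_text": m.group(0),
            "confidence": 1.0,  # regex matches are binary
        }
        for m in re.finditer(pattern, text)
    ]
    return {"matches": matches}


result = dry_run_regex(
    r"\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\b",
    "Please charge card 4111111111111111 for the order total.",
)
print(result["matches"][0]["start"], result["matches"][0]["end"])  # 19 35
```

The raw-string pattern here is the same expression as in the curl example, without the extra backslash escaping that JSON requires.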
Version history
Every mutation to a DLP rule (create, update, or delete) writes a DLPRuleVersion record atomically. Version records are immutable and cannot be deleted through the API.
Version record schema
| Field | Type | Description |
|---|---|---|
| id | UUID | Unique version record identifier |
| rule_id | UUID | The DLP rule this version belongs to |
| changed_by | UUID | Admin user who made the change |
| change_type | string | "create", "update", or "delete" |
| old_values | object or null | Full rule field snapshot before the change. null for create events. |
| new_values | object or null | Full rule field snapshot after the change. null for delete events. |
| changed_at | datetime (ISO 8601) | Timestamp of the change in UTC |

Version history is available in the audit log and can be queried via the SIEM export. See DLP Deep Dive for guidance on interpreting version diffs in incident investigations.
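Since old_values and new_values are full snapshots, the fields that actually changed have to be derived by comparing them. A minimal Python sketch, assuming a version record has already been retrieved as a dict (for example from a SIEM export); the helper name `changed_fields` is hypothetical.

```python
def changed_fields(version):
    """Return {field: (old, new)} for fields that differ in a version record.

    Hypothetical helper. Treats a null snapshot as empty, so create events
    report every field as (None, value) and delete events as (value, None).
    """
    old = version.get("old_values") or {}
    new = version.get("new_values") or {}
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(set(old) | set(new))
        if old.get(key) != new.get(key)
    }
```

For an update that only changed the action tier from redact to block, the result is `{"action_tier": ("redact", "block")}`.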
Bulk export and import
Export all rules

```
GET /api/admin/dlp-rules/export
```

Returns all DLP rules for the tenant as a JSON envelope. The envelope format is the canonical interchange format for rule portability between environments.

```bash
curl https://api.arbitex.ai/api/admin/dlp-rules/export \
  -H "Authorization: Bearer arb_live_your-api-key-here" \
  -o dlp-rules-backup.json
```

Response 200 OK

```json
{
  "version": "1",
  "exported_at": "2026-03-09T14:00:00Z",
  "rules": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "detector_name": "Visa/MC Credit Card Pattern",
      "detector_type": "regex",
      "entity_type": "CREDIT_CARD",
      "action_tier": "redact",
      "enabled": true,
      "confidence_threshold": 1.0,
      "config_json": {
        "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
      }
    }
  ]
}
```

Import rules
```
POST /api/admin/dlp-rules/import
```

Bulk-imports rules from an export envelope. On import, each rule’s detector_name is used as the conflict key to determine whether an incoming rule clashes with an existing one.
Request body (DLPRuleImport)
| Field | Type | Required | Description |
|---|---|---|---|
| rules | array | Yes | Array of rule objects in the export envelope format |
| conflict_resolution | string | Yes | How to handle name conflicts: "skip", "overwrite", or "rename" |

Conflict resolution modes
| Mode | Behavior |
|---|---|
| skip | Conflicting rules are left unchanged. The incoming rule is not applied. |
| overwrite | The existing rule is replaced with the incoming rule, including its config_json. A version record is written with change_type: "update". |
| rename | The incoming rule is created with a suffix appended to detector_name (e.g., "Visa/MC Credit Card Pattern (imported)"). The original rule is preserved. |

```bash
curl -X POST https://api.arbitex.ai/api/admin/dlp-rules/import \
  -H "Authorization: Bearer arb_live_your-api-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "conflict_resolution": "skip",
    "rules": [
      {
        "detector_name": "Visa/MC Credit Card Pattern",
        "detector_type": "regex",
        "entity_type": "CREDIT_CARD",
        "action_tier": "redact",
        "enabled": true,
        "confidence_threshold": 1.0,
        "config_json": {
          "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14})\\b"
        }
      }
    ]
  }'
```

Response 200 OK (DLPImportSummary)

```json
{
  "created": 1,
  "skipped": 0,
  "failed": 0,
  "errors": []
}
```

If any rules fail validation (e.g., invalid detector_type, malformed config_json), they are recorded in errors as DLPImportError objects containing the rule index and an error message. Valid rules in the same import batch are still processed.
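To preview what an import would do under each conflict resolution mode, the name-matching logic can be approximated locally. This Python sketch is an illustration under assumptions: `plan_import` is a hypothetical helper, the " (imported)" suffix is taken from the example in the modes table, and how the server handles a renamed name that also collides is not documented, so the sketch does not attempt to model it.

```python
def plan_import(existing_names, incoming_rules, conflict_resolution):
    """Return (action, detector_name) pairs describing a would-be import.

    Hypothetical helper. Actions are "create", "skip", or "update";
    detector_name is the conflict key, as described above.
    """
    if conflict_resolution not in {"skip", "overwrite", "rename"}:
        raise ValueError(f"unknown mode: {conflict_resolution!r}")
    taken = set(existing_names)
    plan = []
    for rule in incoming_rules:
        name = rule["detector_name"]
        if name not in taken:
            plan.append(("create", name))
            taken.add(name)
        elif conflict_resolution == "skip":
            plan.append(("skip", name))  # existing rule left unchanged
        elif conflict_resolution == "overwrite":
            plan.append(("update", name))  # existing rule replaced
        else:
            renamed = f"{name} (imported)"  # assumed suffix, from the example
            plan.append(("create", renamed))
            taken.add(renamed)
    return plan
```

Running the plan against the names returned by the list endpoint gives a rough preview of what each mode would do before committing to an import.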
Output scanning
By default, Arbitex also scans model response content for sensitive data. Output scanning is controlled by the DLP_OUTPUT_SCANNING_ENABLED environment variable (default: true).
When enabled, the OutputScanner service accumulates streamed response chunks and triggers DLP pipeline evaluations at configurable intervals during streaming. Rules with action_tier: "block" or "redact" are evaluated against accumulated output, allowing the gateway to intervene mid-stream before the full response reaches the caller.
Output scanning uses the same rule set as input scanning. Rules are not scoped to input or output separately — if a rule is enabled, it applies to both. Disabling output scanning entirely (via DLP_OUTPUT_SCANNING_ENABLED=false) reduces latency but removes enforcement coverage on model responses.
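The accumulate-and-scan behavior can be pictured with a small Python generator. Everything here is an assumption made for illustration: the name `scan_stream`, the character-count interval, the `scan` callback, and the choice to withhold the chunk that triggered a match. The actual OutputScanner interval unit and intervention mechanics are not specified in this reference.

```python
def scan_stream(chunks, scan, interval_chars=200):
    """Forward chunks until `scan` flags the accumulated output.

    Hypothetical sketch. `scan` takes the accumulated text and returns True
    when a rule matches; `interval_chars` is an assumed interval unit.
    """
    buffer = ""
    scanned_upto = 0
    for chunk in chunks:
        buffer += chunk
        if len(buffer) - scanned_upto >= interval_chars:
            scanned_upto = len(buffer)
            if scan(buffer):
                return  # intervene mid-stream; this chunk is not forwarded
        yield chunk


safe = list(scan_stream(["aaa", "bbb SECRET", "ccc"],
                        scan=lambda text: "SECRET" in text,
                        interval_chars=5))
print(safe)  # ['aaa']
```

A smaller interval catches sensitive content sooner at the cost of more evaluations, which reflects the latency trade-off noted above.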
L3 confidence thresholds
Tier 3 (detector_type: "llm") rules produce a continuous confidence score. The route_l3_result function maps that score to an enforcement decision using three zones:

| Zone | Confidence range | Behavior |
|---|---|---|
| Hard block | > 0.70 | Blocked regardless of the org’s dlp_sensitivity setting |
| Ambiguous | 0.35 – 0.70 | Decision depends on dlp_sensitivity (see below) |
| Pass | < 0.35 | No flag; request proceeds normally |
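The zone boundaries can be expressed as a short Python sketch. This is a re-creation for illustration, not the source of route_l3_result: the function name `sketch_route_l3` and the returned labels are hypothetical, and the ambiguous-zone outcomes are the ones this section lists for each dlp_sensitivity value. Scores of exactly 0.35 and 0.70 fall in the ambiguous zone, following the strict inequalities in the table.

```python
def sketch_route_l3(confidence, dlp_sensitivity):
    """Map a Tier 3 confidence score to an illustrative decision label.

    Hypothetical re-creation of the documented zones; labels are invented.
    """
    if confidence > 0.70:
        return "hard_block"  # regardless of dlp_sensitivity
    if confidence < 0.35:
        return "pass"
    # Ambiguous zone (0.35 to 0.70 inclusive): depends on org sensitivity.
    if dlp_sensitivity == "high":
        return "soft_block"  # soft block with audit flag
    if dlp_sensitivity == "standard":
        return "pass_elevated_audit"  # pass with elevated audit flag
    raise ValueError(f"unknown dlp_sensitivity: {dlp_sensitivity!r}")
```

For example, a score of 0.5 yields a soft block for a "high" sensitivity org and a flagged pass for a "standard" one.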
Ambiguous zone routing by dlp_sensitivity
| dlp_sensitivity setting | Outcome |
|---|---|
| "high" | Soft block with audit flag |
| "standard" | Pass with elevated audit flag |

Credential intelligence routing
The route_credint_result function applies separate routing for credential detection results, which are classified into severity buckets (critical, high, medium, low):

| Bucket | dlp_sensitivity | Outcome |
|---|---|---|
| critical or high | "high" | Soft block |
| critical or high | "standard" | Pass with elevated confidence flag |
| medium or low | Either | Pass with audit flag |

The dlp_sensitivity setting is configured at the org level. See DLP Deep Dive for guidance on choosing the appropriate sensitivity tier.
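The credential routing table can be sketched the same way in Python. As before, this is an illustration rather than the real route_credint_result: the function name `sketch_route_credint` and the returned labels are hypothetical, and only the bucket and sensitivity combinations come from the table.

```python
SEVERITY_BUCKETS = {"critical", "high", "medium", "low"}
SENSITIVITIES = {"high", "standard"}


def sketch_route_credint(bucket, dlp_sensitivity):
    """Map a credential severity bucket to an illustrative outcome label.

    Hypothetical re-creation of the documented table; labels are invented.
    """
    if bucket not in SEVERITY_BUCKETS:
        raise ValueError(f"unknown bucket: {bucket!r}")
    if dlp_sensitivity not in SENSITIVITIES:
        raise ValueError(f"unknown dlp_sensitivity: {dlp_sensitivity!r}")
    if bucket in {"medium", "low"}:
        return "pass_audit_flag"  # same outcome for either sensitivity
    if dlp_sensitivity == "high":
        return "soft_block"
    return "pass_elevated_confidence_flag"
```

Only critical and high findings under the "high" sensitivity setting interrupt the request in this model; every other combination passes with a flag.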
Error responses
| Status | Code | Description |
|---|---|---|
| 400 | bad_request | Missing required field, invalid enum value, or malformed config_json (e.g., unparseable regex pattern) |
| 403 | forbidden | API key does not have admin permissions |
| 404 | not_found | Rule not found for the given rule_id |
| 422 | unprocessable_entity | Request body fails schema validation |

See also
- DLP Deep Dive: architecture of the 3-tier pipeline, spaCy and DeBERTa configuration, tuning confidence thresholds
- Policy Rule Reference: DLP-related conditions in the policy engine (dlp_entity_type, dlp_action_tier)