DLP Pipeline Wiring Administration
The Arbitex DLP pipeline runs in three tiers — Regex (Tier 1), NER (Tier 2), and DeBERTa (Tier 3). Platform-0039 adds two layers of org-specific customization wired into the intake pipeline:
- OrgDLPLayer — org-level custom regex patterns and suppression of platform defaults
- GroupDLPConfig — per-group detector overrides using a most-restrictive-wins algorithm
This guide covers administration of these layers, including the data model, action tiers, and how the pipeline evaluates conflicting group overrides.
Intake pipeline position

    Request
      │
      ▼
    Stage 2: Payload Analysis
      │
      ├─ 1. Platform DLP scan (Regex + NER + DeBERTa)
      │
      ├─ 2. GroupDLPConfig overrides (T528)
      │      Load group IDs for user → fetch enabled GroupDLPConfig rows
      │      → compute most-restrictive action per detector
      │      → detectors with SKIP action are excluded before scan
      │
      └─ 3. OrgDLPLayer application (T527)
             → append org custom pattern matches
             → filter_platform_matches (suppress platform-matched entities)

GroupDLPConfig overrides are applied before the DLP scan (to exclude detectors). OrgDLPLayer is applied after the platform scan (to add matches and suppress specific platform results).
Both layers are fail-safe: if either fails to load or execute, the pipeline continues without that layer and emits a warning log.
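The ordering and fail-safe behavior described above can be sketched roughly as follows. This is an illustration only, not the Arbitex implementation: `run_stage2` and the three callables passed to it are hypothetical names, matches are modeled as plain dicts, and the sketch follows the order given in the prose (group overrides resolved before the scan, org layer applied after it).

```python
import logging
from typing import Callable

logger = logging.getLogger("dlp.sketch")

Match = dict  # assumed match shape: {"entity_type": str, "text": str}


def run_stage2(
    prompt: str,
    load_group_overrides: Callable[[], dict[str, str]],
    platform_scan: Callable[[str, set[str]], list[Match]],
    apply_org_layer: Callable[[str, list[Match]], list[Match]],
) -> list[Match]:
    """Hypothetical wiring: group overrides, platform scan, then org layer."""
    # Group overrides are resolved first so SKIP detectors never run.
    try:
        overrides = load_group_overrides()
    except Exception:
        logger.warning("group overrides unavailable; continuing without them")
        overrides = {}
    skipped = {name for name, action in overrides.items() if action == "SKIP"}

    matches = platform_scan(prompt, skipped)

    # The org layer runs after the scan; on failure the platform matches stand.
    try:
        matches = apply_org_layer(prompt, matches)
    except Exception:
        logger.warning("org DLP layer failed; continuing without it")
    return matches
```

Catching the failure at each layer boundary, rather than around the whole stage, is what lets one layer fail without disabling the other.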
OrgDLPLayer
What it does
The OrgDLPLayer provides two operations:
- scan_custom_patterns(prompt_text): runs org-defined regex patterns against the prompt and returns additional DLP matches. These are appended to the existing platform matches.
- filter_platform_matches(matches): removes matches whose entity type is targeted by a suppress_default rule for the org.
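A minimal sketch of these two operations, assuming matches carry only an entity type and the matched text. `OrgDLPLayerSketch`, its constructor arguments, and the `Match` shape are hypothetical; only the two method names come from this guide, and the real signatures may differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    entity_type: str
    text: str


class OrgDLPLayerSketch:
    """Illustrative stand-in for OrgDLPLayer; not the real class."""

    def __init__(self, custom_patterns: dict[str, str], suppressed_types: set[str]):
        # custom_patterns maps an entity type label to a regex string.
        self._patterns = {label: re.compile(rx) for label, rx in custom_patterns.items()}
        self._suppressed = suppressed_types

    def scan_custom_patterns(self, prompt_text: str) -> list[Match]:
        # One Match per regex hit, labeled with the rule's entity type.
        return [
            Match(label, m.group(0))
            for label, rx in self._patterns.items()
            for m in rx.finditer(prompt_text)
        ]

    def filter_platform_matches(self, matches: list[Match]) -> list[Match]:
        # Drop platform matches whose entity type the org has suppressed.
        return [m for m in matches if m.entity_type not in self._suppressed]


layer = OrgDLPLayerSketch({"internal_project_code": r"PRJ-[0-9]{4}"}, {"credit_card"})
platform = [Match("credit_card", "4111 1111 1111 1111"), Match("email", "a@example.com")]
combined = layer.filter_platform_matches(platform) + layer.scan_custom_patterns("see PRJ-0042")
```

Here the suppression filter is applied to the platform matches only and the custom matches are appended afterwards; that ordering is an assumption made for the example.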
Org DLP rules
Org rules are stored in the org_dlp_rules table (OrgDLPRule) and managed via the org DLP rule service. Each rule has a rule_type:
| rule_type | Description |
|---|---|
| custom_pattern | Adds a new regex pattern to scan alongside platform defaults |
| suppress_default | Suppresses a specific platform-level default detector for this org |
Rules are soft-deleted — setting deleted_at excludes the rule from active queries while retaining audit history. The cache is invalidated after every mutation.
OrgDLPRule fields
| Field | Type | Description |
|---|---|---|
| id | UUID | Rule identifier |
| org_id | UUID | Owning organization UUID |
| rule_type | string | custom_pattern or suppress_default |
| name | string | Human-readable rule name |
| pattern | string | Regex pattern (required for custom_pattern, null for suppress_default) |
| target_rule_id | UUID | Platform rule ID to suppress (for suppress_default). Also matched by name. |
| enabled | bool | Whether the rule is currently active |
| action_tier | string | DLP action tier for custom patterns (see below) |
| custom_entity_type | string | Entity type label for matches. Defaults to org_custom_pattern if null. |
| created_by | UUID | UUID of the admin who created the rule |
| deleted_at | timestamp | Soft-delete timestamp (null if active) |
Action tiers
The action_tier field on a custom_pattern rule determines the DLP action applied when the pattern matches:
| action_tier | Behavior |
|---|---|
| log_only | Match is recorded in the audit log. Request proceeds unmodified. |
| redact | Matched text is redacted before the prompt reaches the provider. |
| block | Request is blocked with HTTP 400. |
| prompt | User is prompted for justification (ALLOW_WITH_OVERRIDE governance flow). |
Default: log_only.
Custom entity type
The custom_entity_type field sets the entity type label for matches from this rule. This label appears in audit entries, DLP event records, and the OCSF events streamed to SIEM.
If custom_entity_type is null, matches are labeled org_custom_pattern.
Use descriptive labels that map to your data classification taxonomy, for example: internal_project_code, patient_id, contract_number.
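The field rules above (pattern required for custom_pattern and null for suppress_default, action_tier defaulting to log_only, label falling back to org_custom_pattern) could be checked client-side before submitting a rule. The sketch below is hypothetical: `OrgRuleDraft` is not part of the platform, and the server may enforce different or additional constraints.

```python
import re
from dataclasses import dataclass
from typing import Optional

ACTION_TIERS = {"log_only", "redact", "block", "prompt"}


@dataclass
class OrgRuleDraft:
    """Hypothetical in-memory shape mirroring a subset of OrgDLPRule fields."""

    rule_type: str
    name: str
    pattern: Optional[str] = None
    action_tier: str = "log_only"  # documented default
    custom_entity_type: Optional[str] = None

    def validate(self) -> None:
        if self.rule_type == "custom_pattern":
            if not self.pattern:
                raise ValueError("custom_pattern rules require a pattern")
            re.compile(self.pattern)  # raises re.error on an invalid regex
            if self.action_tier not in ACTION_TIERS:
                raise ValueError(f"unknown action_tier: {self.action_tier}")
        elif self.rule_type == "suppress_default":
            if self.pattern is not None:
                raise ValueError("suppress_default rules carry no pattern")
        else:
            raise ValueError(f"unknown rule_type: {self.rule_type}")

    @property
    def entity_label(self) -> str:
        # Falls back to the documented default label when none is set.
        return self.custom_entity_type or "org_custom_pattern"
```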
Rule cache
Org DLP rules are cached per-org with a 60-second TTL (_org_rules_cache in org_dlp_rules.py). invalidate_org_rules_cache(org_id) is called after every create, update, or delete to ensure changes take effect within one TTL window.
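The caching pattern described here can be illustrated with a small per-key TTL cache. This is a sketch of the general technique, not the contents of org_dlp_rules.py: `get_org_rules`, `invalidate`, the `loader` callback, and the injectable clock are all assumptions made for the example.

```python
import time
from typing import Callable

TTL_SECONDS = 60.0

# org_id -> (time the entry was loaded, cached rules)
_cache: dict[str, tuple[float, list[dict]]] = {}


def get_org_rules(
    org_id: str,
    loader: Callable[[str], list[dict]],
    now: Callable[[], float] = time.monotonic,
) -> list[dict]:
    """Return cached rules for org_id, reloading once the entry is older than the TTL."""
    entry = _cache.get(org_id)
    if entry is not None and now() - entry[0] < TTL_SECONDS:
        return entry[1]
    rules = loader(org_id)
    _cache[org_id] = (now(), rules)
    return rules


def invalidate(org_id: str) -> None:
    """Drop the cached entry so the next read reloads from the store."""
    _cache.pop(org_id, None)
```

In a sketch like this, explicit invalidation only clears the cache of the process that handled the mutation; other processes would pick up the change when their own entry expires, which is one plausible reading of "within one TTL window".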
Effective rules view
get_effective_rules(db, org_id) returns the merged view of platform defaults and org-specific rules:

    {
      "org_id": "...",
      "platform_rules_count": 42,
      "org_rules_count": 3,
      "suppressed_count": 1,
      "rules": [
        {
          "name": "credit_card",
          "source": "platform",
          "rule_type": "platform_default",
          "pattern": "...",
          "enabled": true,
          "suppressed": true,
          "suppressed_by": "suppress-rule-uuid"
        },
        {
          "name": "Internal Project Code",
          "source": "org",
          "rule_type": "custom_pattern",
          "pattern": "PRJ-[0-9]{4}",
          "enabled": true,
          "suppressed": false,
          "suppressed_by": null
        }
      ]
    }

GroupDLPConfig
What it does
GroupDLPConfig allows group administrators to override per-detector DLP behavior for members of a group. When a user belongs to groups with conflicting detector configurations, the most-restrictive-wins algorithm determines the effective action.
Data model
GroupDLPConfig rows are stored in the groups table family (backend/app/models/group.py). Each row ties a group to a DLP detector with an action override:
| Field | Type | Description |
|---|---|---|
| group_id | UUID | Group identifier |
| detector_name | string | DLP detector name (e.g., regex, ner, deberta) |
| action | GroupDLPAction | Override action for this detector when triggered |
| enabled | bool | Whether this override is active |
GroupDLPAction enum
| Value | Priority | Behavior |
|---|---|---|
| BLOCK | 4 (highest) | Block the request |
| REDACT | 3 | Redact matched text |
| CANCEL | 2 | Cancel the request without error |
| SKIP | 1 (lowest) | Skip (exempt) this detector for group members |
Most-restrictive-wins algorithm
When a user belongs to multiple groups with different actions for the same detector:
- Load all enabled GroupDLPConfig rows for the user’s groups.
- For each detector, track the highest-priority action across all groups.
- The effective action is the one with the highest priority number (BLOCK > REDACT > CANCEL > SKIP).
Example: A user is a member of Group A (detector ner, action SKIP) and Group B (detector ner, action REDACT). The effective action is REDACT (priority 3 > priority 1).
SKIP behavior: Detectors whose effective action is SKIP are excluded from the scan pipeline for that request. The detector does not run — it does not produce matches, not even log_only ones.
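The algorithm above can be sketched in a few lines. The priorities are taken from the GroupDLPAction table; the `Action`, `Row`, `effective_actions`, and `detectors_to_run` names are hypothetical and do not mirror the actual model or service code.

```python
from enum import IntEnum
from typing import Iterable, NamedTuple


class Action(IntEnum):
    # Values are the documented priorities; a higher value is more restrictive.
    SKIP = 1
    CANCEL = 2
    REDACT = 3
    BLOCK = 4


class Row(NamedTuple):
    group_id: str
    detector_name: str
    action: Action
    enabled: bool = True


def effective_actions(rows: Iterable[Row]) -> dict[str, Action]:
    """Keep the highest-priority action per detector across enabled rows."""
    result: dict[str, Action] = {}
    for row in rows:
        if not row.enabled:
            continue
        current = result.get(row.detector_name)
        if current is None or row.action > current:
            result[row.detector_name] = row.action
    return result


def detectors_to_run(all_detectors: list[str], effective: dict[str, Action]) -> list[str]:
    """Exclude detectors whose effective action is SKIP; detectors without an override still run."""
    return [d for d in all_detectors if effective.get(d) != Action.SKIP]
```

Using an integer enum keeps the comparison a plain `>`, so adding a new action only requires assigning it a priority.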

    User in groups: [Group A, Group B]
    GroupDLPConfig rows:
      - Group A, detector=ner, action=SKIP
      - Group B, detector=ner, action=REDACT

    Effective: ner → REDACT (priority 3 wins over SKIP priority 1)
    Result: ner detector runs, REDACT applied on match

    User in groups: [Group A, Group C]
    GroupDLPConfig rows:
      - Group A, detector=ner, action=SKIP
      - Group C, detector=regex, action=BLOCK

    Effective: ner → SKIP, regex → BLOCK
    Result: ner excluded from scan; regex runs and blocks on match

Fail-safe behavior
If GroupDLPConfig loading fails (DB query error), the pipeline continues without group overrides, so all detectors run with default platform behavior. A warning log is emitted:

    Failed to load GroupDLPConfig for user <user_id> — continuing without overrides

Audit trail
All mutations to OrgDLPRule rows are recorded in the org_dlp_rule_audit table (OrgDLPRuleAudit):
| Field | Description |
|---|---|
| action | created, updated, deleted, enabled, or disabled |
| actor_id | UUID of the admin who performed the action |
| old_value | JSONB snapshot of the previous state (null on create) |
| new_value | JSONB snapshot of the new state (null on delete) |
| created_at | Timestamp of the audit entry |
The audit table provides an immutable history of all org DLP rule changes. Rows are never updated or deleted.
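As an illustration of the snapshot semantics in the table above, the following sketch builds an audit entry as a plain dict. `build_audit_entry` is a hypothetical helper, not the platform's audit writer; it copies the snapshots so that later changes to the rule object cannot alter recorded history.

```python
import copy
from datetime import datetime, timezone
from typing import Optional

AUDIT_ACTIONS = {"created", "updated", "deleted", "enabled", "disabled"}


def build_audit_entry(
    action: str,
    actor_id: str,
    old_value: Optional[dict],
    new_value: Optional[dict],
) -> dict:
    """Assemble one audit record; snapshots are deep-copied to keep them immutable."""
    if action not in AUDIT_ACTIONS:
        raise ValueError(f"unknown audit action: {action}")
    if action == "created" and old_value is not None:
        raise ValueError("old_value must be null on create")
    if action == "deleted" and new_value is not None:
        raise ValueError("new_value must be null on delete")
    return {
        "action": action,
        "actor_id": actor_id,
        "old_value": copy.deepcopy(old_value),
        "new_value": copy.deepcopy(new_value),
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```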
Common administration tasks
Add a custom pattern that redacts internal project codes

    POST /api/admin/org-dlp-rules/
    {
      "rule_type": "custom_pattern",
      "name": "Internal Project Code",
      "pattern": "PRJ-[0-9]{4}",
      "enabled": true,
      "action_tier": "redact",
      "custom_entity_type": "internal_project_code"
    }

Suppress the platform credit card detector for an org

    POST /api/admin/org-dlp-rules/
    {
      "rule_type": "suppress_default",
      "name": "credit_card",
      "target_rule_id": "<platform-credit-card-rule-uuid>",
      "enabled": true
    }

The platform pattern is also matched by name (the name field must equal the platform pattern name). Both target_rule_id and name are checked during suppression resolution.
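One way to picture suppression resolution is the sketch below, which treats a platform rule as suppressed when an active suppress_default rule matches it by target_rule_id or by name. Whether the real resolver requires either check or both is not stated here, so the either-or logic is an assumption, as are the `find_suppressor` name and the dict shapes.

```python
from typing import Iterable, Optional


def find_suppressor(platform_rule: dict, org_rules: Iterable[dict]) -> Optional[str]:
    """Return the id of the first active suppress_default rule targeting platform_rule."""
    for rule in org_rules:
        if rule.get("rule_type") != "suppress_default":
            continue
        # Disabled or soft-deleted rules do not suppress anything.
        if not rule.get("enabled", True) or rule.get("deleted_at") is not None:
            continue
        matches_id = rule.get("target_rule_id") == platform_rule["id"]
        matches_name = rule.get("name") == platform_rule["name"]
        if matches_id or matches_name:  # assumption: either check is sufficient
            return rule["id"]
    return None
```

The returned id corresponds to what the effective rules view reports in suppressed_by.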
Grant a group SKIP exemption on the DeBERTa detector
Useful for technical teams whose prompts regularly contain source code patterns that DeBERTa flags:

    POST /api/admin/groups/{group_id}/dlp-config
    {
      "detector_name": "deberta",
      "action": "SKIP",
      "enabled": true
    }

See also
- Policy engine administration: policy packs, rules, chains, and simulation
- DLP event monitoring: event types, lifecycle, and escalation
- DLP rule testing: test DLP rules in the admin portal
- User and group management: create and manage groups