Human-in-the-loop governance

Arbitex provides two policy actions that keep a human in the decision path for AI requests that match policy rules: PROMPT and ALLOW_WITH_OVERRIDE. This guide explains what each action does, when to use one versus the other, and how to configure them together for a layered governance posture.

What is human-in-the-loop governance?

Standard policy actions (BLOCK, ALLOW, REDACT) are automatic: the gateway makes the decision without any human involvement. For many policy rules this is the right design. Credentials in a prompt should always be blocked, and routine traffic should always be allowed.

Human-in-the-loop (HITL) actions introduce a human decision before the AI request can proceed. They are appropriate when:

The detection is probabilistic rather than certain — blocking everything would impede legitimate work, but silently allowing carries compliance risk.
The appropriate response depends on context that only a human can evaluate — for example, whether a flagged document is real PII or synthetic test data.
Your compliance framework requires documented justification for policy exceptions.

PROMPT and ALLOW_WITH_OVERRIDE both create a HITL checkpoint, but they differ in who makes the decision and when.

PROMPT — admin-reviewed hold

When a rule fires with action=PROMPT, the request is suspended and held for an admin to review. The requesting client blocks while the hold is pending. Nothing is sent to the AI provider unless an admin explicitly approves the hold.

Decision maker: An admin.

User experience: The client blocks. The user sees a waiting state. No response arrives until an admin acts.

Timeout behavior: If no admin acts within PROMPT_HOLD_TIMEOUT_SECONDS (default 300 s), the hold is automatically denied and the client receives HTTP 403.

Best used when: The policy match requires human judgment from someone other than the requesting user — for example, a manager or compliance officer who can verify whether a request is authorized.

How a hold flows

Client request
  └─► DLP pipeline → Policy engine → rule fires (action=PROMPT)
        └─► Hold created → SSE event sent to all connected admins
              └─► Admin reviews context
                    ├─► [approve] → gateway forwards to AI provider → response to client
                    └─► [deny]    → client receives HTTP 403
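
The lifecycle above can be modelled as a small state machine. The following is a minimal sketch, not the gateway's implementation: the `Hold` class, the `HoldState` names, and the `client_outcome` helper are hypothetical, and only the 300 s default and the approve/deny/timeout outcomes come from this guide.

```python
from dataclasses import dataclass, field
from enum import Enum
from uuid import uuid4

PROMPT_HOLD_TIMEOUT_SECONDS = 300  # default stated in this guide


class HoldState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    DENIED = "denied"
    TIMED_OUT = "timed_out"


@dataclass
class Hold:
    """Hypothetical in-memory model of one PROMPT hold."""
    created_at: float
    hold_id: str = field(default_factory=lambda: str(uuid4()))
    state: HoldState = HoldState.PENDING

    def expire_if_due(self, now: float) -> HoldState:
        """Auto-deny a pending hold once the timeout has elapsed."""
        timed_out = now - self.created_at >= PROMPT_HOLD_TIMEOUT_SECONDS
        if self.state is HoldState.PENDING and timed_out:
            self.state = HoldState.TIMED_OUT
        return self.state

    def resolve(self, approve: bool, now: float) -> HoldState:
        """Apply an admin decision; ignored if the hold is no longer pending."""
        self.expire_if_due(now)
        if self.state is HoldState.PENDING:
            self.state = HoldState.APPROVED if approve else HoldState.DENIED
        return self.state


def client_outcome(state: HoldState) -> str:
    """What the blocked client sees for each hold state."""
    if state is HoldState.PENDING:
        return "waiting"
    if state is HoldState.APPROVED:
        return "forwarded"  # gateway forwards to the AI provider
    return "http_403"       # denied by an admin or timed out
```

Passing `now` explicitly keeps the sketch deterministic; a real service would read a clock and run the expiry check on a timer.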

Configuring a PROMPT rule

{
  "name": "trading-desk-credit-card-review",
  "conditions": {
    "user_groups": ["trading-desk"],
    "entity_types": ["credit_card"],
    "entity_confidence_min": 0.85,
    "channel": ["interactive"]
  },
  "action": {
    "type": "PROMPT"
  }
}

The channel: ["interactive"] condition restricts the hold to browser/SSE callers. Automated API callers that match the same conditions fall through to the next rule. Add a separate BLOCK rule with channel: ["api"] if automated callers should be rejected.
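
To make the fall-through behavior concrete, here is a minimal sketch of how the conditions in the rule above might be evaluated. It assumes that all conditions must hold together, that list-valued conditions match on any overlap, and that a request carries a single channel value; the `Request` and `Detection` types and the `rule_matches` function are hypothetical and are not part of the Arbitex API.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    entity_type: str
    confidence: float


@dataclass
class Request:
    user_groups: list
    channel: str
    detections: list


# Conditions copied from the example rule above.
PROMPT_RULE_CONDITIONS = {
    "user_groups": ["trading-desk"],
    "entity_types": ["credit_card"],
    "entity_confidence_min": 0.85,
    "channel": ["interactive"],
}


def rule_matches(conditions: dict, request: Request) -> bool:
    """Return True only if every condition present in the rule is satisfied."""
    groups = conditions.get("user_groups")
    if groups is not None and not set(groups) & set(request.user_groups):
        return False
    channels = conditions.get("channel")
    if channels is not None and request.channel not in channels:
        return False
    entity_types = conditions.get("entity_types")
    if entity_types is not None:
        min_confidence = conditions.get("entity_confidence_min", 0.0)
        return any(
            d.entity_type in entity_types and d.confidence >= min_confidence
            for d in request.detections
        )
    return True
```

Under these assumptions, an API caller with the same group and detection does not match, so evaluation continues with the next rule in the chain.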

Admin review workflow

Admins review and act on holds via:

Admin panel: Admin > Prompt Holds — real-time list of pending holds with request context (user, model, matched rule, detected entity types). Approve or deny with a single click.
API: Connect to the SSE stream at GET /admin/api/prompt-holds/events on port 8301 to receive holds in real time. Approve or deny with POST /admin/api/prompt-holds/{hold_id}/approve or …/deny.

See PROMPT governance for the complete API reference and SSE event format.
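
For teams that script the API path, the sketch below parses a standard `text/event-stream` and builds the approve and deny URLs. It is illustrative only. The host name is a placeholder, the assumption that each event's data is a JSON object with a `hold_id` field is not confirmed by this guide (check the SSE event format in the reference), and serving the approve/deny endpoints on the same port as the stream is also an assumption.

```python
import json
from typing import Iterable, Iterator

# Hypothetical host; the port is the one this guide gives for the SSE stream.
ADMIN_BASE_URL = "https://gateway.example.com:8301"


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE stream."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.


def pending_hold_ids(lines: Iterable[str]) -> list:
    """Extract hold IDs, assuming a JSON payload with a 'hold_id' field."""
    return [json.loads(data)["hold_id"] for data in iter_sse_data(lines)]


def decision_url(hold_id: str, approve: bool) -> str:
    """Build the POST URL for approving or denying a hold."""
    verb = "approve" if approve else "deny"
    return f"{ADMIN_BASE_URL}/admin/api/prompt-holds/{hold_id}/{verb}"
```

The parser accepts any iterable of lines, so it can be fed from an HTTP client's streaming response or from a recorded sample during testing.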

ALLOW_WITH_OVERRIDE — user-acknowledged override

When a rule fires with action=ALLOW_WITH_OVERRIDE, the request is not forwarded to the AI provider automatically. Instead, the gateway returns an override prompt to the requesting user. The user must acknowledge the detection and provide a written reason. If they do, the request proceeds. The acknowledgement and reason are written to the audit log.

Decision maker: The requesting user.

User experience: The client receives an override_required: true response with a short-lived override_token. The user provides a reason. The original request is re-sent with the token and reason. On success, the response arrives normally.

Token expiry: 5 minutes. Single-use. Expired or reused tokens are rejected.

Best used when: The detection is a policy warning rather than a hard block. The user may legitimately proceed, but the organization wants an audit trail of who overrode the policy and why.

How an override flows

Client request
  └─► DLP pipeline → Policy engine → rule fires (action=ALLOW_WITH_OVERRIDE)
        └─► Gateway returns override_required: true + override_token (5-minute JWT)
              └─► Client prompts user for reason
                    └─► User provides reason → re-sends request with X-Override-Token + override_reason
                          ├─► Token valid   → audit log written → gateway forwards → response to client
                          └─► Token expired → HTTP 403 (user must restart the original request)
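
The token rules in this flow (5-minute lifetime, single use, reason required) can be sketched as follows. This is a simplified in-memory model for illustration. The gateway issues a JWT, which this sketch does not attempt to reproduce, and the `OverrideTokens` class and its return values are hypothetical.

```python
import secrets

OVERRIDE_TOKEN_TTL_SECONDS = 300  # 5 minutes, per this guide


class OverrideTokens:
    """Hypothetical registry modelling expiry and single use (not a JWT)."""

    def __init__(self) -> None:
        self._expiry_by_token = {}

    def issue(self, now: float) -> str:
        """Create a token that is valid for the next five minutes."""
        token = secrets.token_urlsafe(16)
        self._expiry_by_token[token] = now + OVERRIDE_TOKEN_TTL_SECONDS
        return token

    def redeem(self, token: str, reason: str, now: float) -> str:
        """Return 'accepted', 'expired', 'rejected', or 'reason_required'."""
        if not reason.strip():
            return "reason_required"  # token is not consumed
        # pop() removes the token, so a second use is always rejected.
        expires_at = self._expiry_by_token.pop(token, None)
        if expires_at is None:
            return "rejected"  # unknown or already used
        if now >= expires_at:
            return "expired"
        return "accepted"
```

Checking the reason before consuming the token is a design choice in this sketch: a user who submits an empty reason can correct it without restarting the original request.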

Configuring an ALLOW_WITH_OVERRIDE rule

{
  "name": "finance-pii-override-required",
  "conditions": {
    "user_groups": ["finance"],
    "entity_types": ["credit_card", "bank_account"],
    "entity_confidence_min": 0.80
  },
  "action": {
    "type": "ALLOW_WITH_OVERRIDE"
  }
}

See ALLOW_WITH_OVERRIDE governance for the complete override token flow, re-submission format, and audit trail reference.

Choosing between PROMPT and ALLOW_WITH_OVERRIDE

Question	PROMPT	ALLOW_WITH_OVERRIDE
Who decides?	An admin	The requesting user
Is the client blocked waiting?	Yes — until admin acts or timeout	No — user sees the override prompt immediately
Can the user bypass without admin involvement?	No	Yes, by providing a reason
Is a reason required?	No — admin approves/denies without supplying one	Yes — user must supply written justification
Suitable for automated/batch callers?	No — use `channel: ["interactive"]` to restrict	Yes — works for any channel
Compliance record created?	Yes, as a `prompt_hold_approve`, `prompt_hold_deny`, or `prompt_hold_timeout` entry in the audit log	Yes, as an `allow_with_override` entry with the reason in the audit log

Decision guide

Scenario	Recommended action
Match should always be rejected, no exceptions	`BLOCK`
Match requires a compliance officer or manager to approve	`PROMPT`
Match is a warning; the user may proceed with justification	`ALLOW_WITH_OVERRIDE`
Match should be logged but not blocked	`LOG_ONLY`
Detection is informational only	`ALLOW`

Use PROMPT when the request may be unauthorized and only a separate person can verify it — for example, an employee sending what looks like unreleased financial data to an external model.

Use ALLOW_WITH_OVERRIDE when a match is plausible but the user probably has a legitimate reason to proceed, for example a developer sending synthetic test data that matches a PII pattern.

Using both actions together in a chain

PROMPT and ALLOW_WITH_OVERRIDE can coexist in a policy chain that targets the same group. Place stricter actions earlier in the chain:

Priority	Rule	Action
1	Credentials or secrets in prompt	`BLOCK`
2	High-confidence PII (≥ 0.95) — trading desk	`PROMPT` (admin review required)
3	Medium-confidence PII (≥ 0.75) — trading desk	`ALLOW_WITH_OVERRIDE` (user justification required)
4	All other traffic	`ALLOW`

With this chain:

Credentials are always blocked.
High-confidence detections are held for admin review.
Medium-confidence detections give the user an opportunity to justify and proceed.
Everything else passes through.

Both PROMPT and ALLOW_WITH_OVERRIDE are terminal actions — they stop chain evaluation when they fire. Lower-priority rules in the same chain do not evaluate for the same request.
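
First-match evaluation of the example chain can be sketched as below. The `Rule` type, the request dictionary keys (`has_credentials`, `group`, `pii_confidence`), and the `evaluate_chain` function are hypothetical. The sketch also treats BLOCK and ALLOW as terminal, which this guide states only for PROMPT and ALLOW_WITH_OVERRIDE.

```python
from typing import Callable, NamedTuple, Optional


class Rule(NamedTuple):
    priority: int
    action: str
    matches: Callable[[dict], bool]


def trading_desk_pii(min_confidence: float) -> Callable[[dict], bool]:
    """Build a matcher for trading-desk PII at or above a confidence level."""
    def matches(request: dict) -> bool:
        return (
            request.get("group") == "trading-desk"
            and request.get("pii_confidence", 0.0) >= min_confidence
        )
    return matches


# The four-rule chain from the table above.
CHAIN = [
    Rule(1, "BLOCK", lambda r: bool(r.get("has_credentials"))),
    Rule(2, "PROMPT", trading_desk_pii(0.95)),
    Rule(3, "ALLOW_WITH_OVERRIDE", trading_desk_pii(0.75)),
    Rule(4, "ALLOW", lambda r: True),
]


def evaluate_chain(rules: list, request: dict) -> Optional[str]:
    """Return the action of the first matching rule, in priority order."""
    for rule in sorted(rules, key=lambda rule: rule.priority):
        if rule.matches(request):
            return rule.action  # terminal: later rules are not evaluated
    return None
```

A detection at 0.97 confidence satisfies both rule 2 and rule 3, but only rule 2 fires because evaluation stops at the first match. This is why the stricter rule must come first.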

Audit trail comparison

Both actions produce audit log entries that can be queried and exported for compliance review.

PROMPT hold audit entries

Field	Value
`action`	`"prompt_hold_approve"` or `"prompt_hold_deny"` or `"prompt_hold_timeout"`
`hold_id`	UUID of the resolved hold
`admin_user`	Identity of the admin who approved or denied

Query:

GET /api/admin/audit-logs?action=prompt_hold_approve&limit=100

ALLOW_WITH_OVERRIDE audit entries

Field	Value
`action`	`"allow_with_override"`
`user_id`	ID of the user who acknowledged the override
`rule_id`	ID of the rule that triggered the override
`detected_entity_type`	Entity type at time of detection
`override_reason`	Verbatim reason text supplied by the user
`request_id`	Correlation ID linking override to original request

Query:

GET /api/admin/audit-logs?action=allow_with_override&limit=100
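
When collecting evidence for both actions, the query strings can be generated rather than typed by hand. The sketch below builds one query per action value and assumes only the `action` and `limit` parameters shown in this guide. Whether the endpoint accepts several actions in one request is not stated here, so the sketch does not rely on it. The function names are hypothetical.

```python
from urllib.parse import urlencode

AUDIT_LOG_PATH = "/api/admin/audit-logs"

# Action values listed in the audit tables of this guide.
PROMPT_HOLD_ACTIONS = (
    "prompt_hold_approve",
    "prompt_hold_deny",
    "prompt_hold_timeout",
)
OVERRIDE_ACTION = "allow_with_override"


def audit_log_query(action: str, limit: int = 100) -> str:
    """Build the path and query string for one audit-log action."""
    query = urlencode({"action": action, "limit": limit})
    return f"{AUDIT_LOG_PATH}?{query}"


def hitl_audit_queries(limit: int = 100) -> list:
    """One query per human-in-the-loop action, PROMPT entries first."""
    actions = PROMPT_HOLD_ACTIONS + (OVERRIDE_ACTION,)
    return [audit_log_query(action, limit) for action in actions]
```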

Both entry types are included in compliance exports. See Audit log for the full field reference and export instructions.

Compliance implications

SOX

Both actions create an evidence trail for internal controls. PROMPT records admin authorization decisions. ALLOW_WITH_OVERRIDE records user acknowledgement with written justification. Either can be included in SOX control testing documentation.

GLBA

GLBA requires logging authorized access to customer financial information. ALLOW_WITH_OVERRIDE captures who accessed data matching financial patterns and why. PROMPT captures the authorization chain (admin review) for higher-sensitivity data.

General posture

BLOCK prevents access. ALLOW permits access silently. ALLOW_WITH_OVERRIDE and PROMPT occupy the middle tier: access is permitted or denied, but with a documented record. Use them where blanket blocking would impede legitimate business operations but silent allowance is not acceptable to your compliance or security team.

Operational considerations

Admin availability for PROMPT

PROMPT holds wait for admin action. If no admin is online when a hold is created, the hold stays pending until PROMPT_HOLD_TIMEOUT_SECONDS elapses (default 300 s, that is, 5 minutes), then auto-denies and the client receives HTTP 403. For organizations where PROMPT rules are active outside business hours, either:

Increase PROMPT_HOLD_TIMEOUT_SECONDS to a value that covers your admin on-call response time, or
Configure a BLOCK fallback rule that fires outside business hours for the same group/entity conditions (using a time-window condition if supported by your policy engine version).

Override token expiry

ALLOW_WITH_OVERRIDE tokens expire after 5 minutes. A user who takes longer than that, for example while reading a long policy explanation, will resubmit with an expired token. The client should handle the override_token_expired error gracefully by re-initiating the original request, which generates a fresh override prompt.
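
A client-side retry loop for this case might look like the sketch below. The transport is injected as a function so the example stays self-contained. The response shape (a dictionary with `override_required`, `override_token`, and an `error` field carrying `override_token_expired`) and the placement of `override_reason` in the request body are assumptions to confirm against the ALLOW_WITH_OVERRIDE reference.

```python
from typing import Callable

# (headers, body) -> parsed response; supplied by the caller.
Send = Callable[[dict, dict], dict]


def send_with_override(
    send: Send,
    body: dict,
    ask_reason: Callable[[], str],
    max_attempts: int = 2,
) -> dict:
    """Send a request, handling the override prompt and an expired token."""
    response = {}
    for _ in range(max_attempts):
        response = send({}, body)
        if not response.get("override_required"):
            return response  # no override needed, or a hard failure
        reason = ask_reason()
        response = send(
            {"X-Override-Token": response["override_token"]},
            {**body, "override_reason": reason},
        )
        if response.get("error") != "override_token_expired":
            return response
        # Token expired: loop to re-initiate the original request.
    return response
```

Capping the attempts avoids an endless loop if the user repeatedly lets the token lapse; the last response is returned so the caller can surface the error.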

Monitoring override volume

High override rates on a specific rule may indicate the rule threshold is too aggressive or that a group needs training. Configure override frequency alerts under Admin > Admin Ops > Alert Rules to fire when override_count for a given rule exceeds a threshold in a rolling window.
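
The alert condition described above, an override count per rule exceeding a threshold in a rolling window, can be illustrated with a small sketch. This is not how the built-in alert rules are implemented. The `OverrideRateMonitor` class is hypothetical and could be used to analyze exported audit entries offline.

```python
from collections import defaultdict, deque


class OverrideRateMonitor:
    """Count overrides per rule within a rolling time window."""

    def __init__(self, threshold: int, window_seconds: float) -> None:
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)  # rule_id -> recent timestamps

    def record(self, rule_id: str, timestamp: float) -> bool:
        """Record one override; return True if the rule exceeds the threshold."""
        events = self._timestamps[rule_id]
        events.append(timestamp)
        # Drop events that have fallen out of the window.
        cutoff = timestamp - self.window_seconds
        while events and events[0] <= cutoff:
            events.popleft()
        return len(events) > self.threshold
```

Timestamps are expected in non-decreasing order per rule, which holds when replaying audit entries sorted by time.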