# Policy rule reference
This page is the complete reference for policy rule fields. For an overview of how the Policy Engine evaluates rules, see Policy Engine overview.
## Conditions reference

All conditions within a rule are evaluated with AND logic: every non-empty condition must match for the rule to fire. Conditions set to `null` (or omitted) are not evaluated — they do not contribute to the match. A rule with no conditions set is a catch-all that matches every request reaching it.
| Field | Type | Description | Example |
|---|---|---|---|
| `user_groups` | string[] | Rule fires if the requesting user is a member of ANY of the listed groups. Group membership is resolved from your SCIM/directory sync. OR logic within this field — a single matching group is sufficient. | `["finance", "trading-desk"]` |
| `entity_types` | string[] | Rule fires if the DLP pipeline detected at least one entity whose type is in this list. Requires the `entity_confidence_min` threshold to also be met if set. | `["credit_card", "ssn"]` |
| `entity_confidence_min` | float | Minimum confidence score (0.0–1.0) for a detected entity to count as a match. Applies to any entity matched by `entity_types`. Set this to reduce false positives from the DeBERTa contextual validator. Defaults to 0.0 if `entity_types` is set and this field is omitted. | `0.85` |
| `content_regex` | string | Regular expression matched against the full prompt text. The pattern is validated for safety (ReDoS protection) before it is stored. The match is a search, not a full-string match — the pattern fires if it appears anywhere in the text. | `"\\bMNPI\\b"` |
| `providers` | string[] | Rule fires if the request is destined for any of the listed provider identifiers. Use this to write provider-specific rules (for example, blocking a specific provider for a group). | `["openai", "anthropic"]` |
| `models` | string[] | Rule fires if the request targets any of the listed model identifiers. More specific than `providers` — use when a rule should apply to one model but not others from the same provider. | `["gpt-4o"]` |
| `user_risk_score_min` | float | Rule fires if the user’s risk score (from CredInt/GeoIP enrichment, 0.0–1.0) is at or above this threshold. Use this for risk-based access controls: high-risk users (unusual location, leaked credentials) receive a more restrictive policy. | `0.7` |
| `channel` | string[] | Rule fires if the request originated from a matching caller type. `"interactive"` = browser/SSE callers (Arbitex web app, embedded chat); `"api"` = programmatic API callers. Use this to restrict PROMPT governance rules to interactive callers only, preventing human-facing dialogs from firing on automated pipelines. | `["interactive"]` |
| `intent_complexity` | string | Rule fires if the request’s inferred complexity matches. Values: `"simple"`, `"medium"`, `"complex"`. Arbitex classifies intent complexity during the intake pipeline after DLP entity detection. Use this condition for cost-based routing — send complex requests to a higher-capability model, simple requests to an economical tier. | `"complex"` |
### Condition logic notes

- All conditions AND together. If a rule has both `user_groups` and `entity_types`, both must match.
- `user_groups` is OR within itself. The user needs to be in at least one listed group, not all of them.
- `entity_types` is OR within itself. At least one detected entity must match at least one listed type.
- `providers` and `models` are OR within themselves. The request must match at least one listed value.
- `channel` is OR within itself. The request must originate from at least one of the listed channel types.
- Missing conditions are not evaluated. A rule with only `user_groups` set ignores entity type, regex, and all other fields entirely.
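These AND/OR semantics can be sketched as a small matcher. This is an illustrative sketch under stated assumptions, not the Policy Engine's implementation: the `conditions_match` function and the dict shapes for rules and requests are invented for the example, and only three of the condition fields are modeled.

```python
# Illustrative sketch of condition matching (not the Policy Engine source).
# AND across fields, OR within each list-valued field; omitted conditions
# are skipped entirely. Only three condition fields are modeled here.

def conditions_match(conditions: dict, request: dict) -> bool:
    groups = conditions.get("user_groups")
    if groups is not None and not set(groups) & set(request["user_groups"]):
        return False  # user is in none of the listed groups

    types = conditions.get("entity_types")
    if types is not None:
        threshold = conditions.get("entity_confidence_min", 0.0)
        if not any(e["type"] in types and e["confidence"] >= threshold
                   for e in request["entities"]):
            return False  # no detected entity clears both type and confidence

    providers = conditions.get("providers")
    if providers is not None and request["provider"] not in providers:
        return False

    return True  # every condition that was set has matched
```

Note that an empty `conditions` object makes every check fall through to `True`, which mirrors the catch-all behavior described above.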
## Actions reference

### ALLOW

Terminal. Explicitly permits the request and stops all further policy evaluation. Use this as a whitelist rule to carve out exceptions — for example, allowing a specific group to bypass a BLOCK rule that appears later in the chain.
Because ALLOW is terminal and first-match semantics apply, you must position ALLOW rules before the BLOCK rules they are intended to supersede.
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `"ALLOW"` |
### BLOCK

Terminal. Denies the request. No call is made to the AI provider. The caller receives an error response.
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `"BLOCK"` |
| `message` | No | Error text returned to the caller. If omitted, a generic denial message is returned. Keep this message non-specific if you do not want to reveal policy details. |
### CANCEL

Terminal. Silently drops the request. No error is returned to the caller — from the caller’s perspective, the request timed out or received no response. Use this when you want to deny a request without signaling to the caller that a policy was enforced.
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `"CANCEL"` |
### REDACT

Non-terminal. Replaces matched content in the prompt with the `redact_replacement` string, then continues evaluating subsequent rules. Multiple REDACT rules can fire on a single request, each replacing different content. The final prompt sent to the AI provider reflects all accumulated replacements.

When `applies_to` is `output` or `both`, the REDACT action applies to the model’s response: the response is buffered, matched content is replaced, and the redacted response is delivered to the caller.
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `"REDACT"` |
| `redact_replacement` | No | The string that replaces matched content. Defaults to `[REDACTED]`. Use a descriptive placeholder (e.g., `[CC-REMOVED]`, `[SSN-REDACTED]`) to make audit logs more readable. |
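The accumulation behavior of non-terminal REDACT rules can be sketched as a fold over the working prompt. This is an illustrative sketch, not the gateway's code; the rule shape (a regex `pattern` plus an optional `redact_replacement`) is an assumption made for the example.

```python
import re

# Sketch of non-terminal REDACT accumulation: each firing rule rewrites the
# working prompt, and evaluation continues with the rewritten text. The
# final prompt reflects all replacements. Rule shape is illustrative.

def apply_redactions(prompt: str, redact_rules: list[dict]) -> str:
    for rule in redact_rules:
        replacement = rule.get("redact_replacement", "[REDACTED]")  # default per the table above
        prompt = re.sub(rule["pattern"], replacement, prompt)
    return prompt
```

For example, a card-number rule and an SSN rule can both fire on the same request, each replacing its own matches before the prompt is forwarded.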
### ROUTE_TO

Terminal. Overrides the destination model for this request, then stops evaluation. Use this for cost control (routing simple requests to a cheaper model) or capability routing (routing complex requests to a more capable model).
You can specify either an exact model ID or a provider-agnostic tier. If you specify a tier, Arbitex maps it to the appropriate model for the provider the request was originally directed to.
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `"ROUTE_TO"` |
| `route_to_model` | One of these | Exact model identifier (e.g., `"claude-haiku-4-5-20251001"`). Provider-specific. |
| `route_to_tier` | One of these | Provider-agnostic tier: `"haiku"`, `"sonnet"`, or `"opus"`. Arbitex maps this to the appropriate model for the request’s provider. |
If both `route_to_model` and `route_to_tier` are set, `route_to_model` takes precedence.
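The resolution order can be sketched as follows. This is an illustrative sketch: the `resolve_route` function and the tier-to-model table are assumptions for the example (only the Haiku model ID comes from this page; the Opus entry is a placeholder), not the mapping Arbitex actually maintains per provider.

```python
# Hypothetical tier-to-model table; real mappings are maintained by Arbitex
# per provider. Only the Haiku ID below appears in this reference.
TIER_MAP = {
    ("anthropic", "haiku"): "claude-haiku-4-5-20251001",
    ("anthropic", "opus"): "claude-opus-placeholder",  # placeholder ID
}

def resolve_route(action: dict, provider: str) -> str:
    # route_to_model takes precedence when both fields are set
    if action.get("route_to_model"):
        return action["route_to_model"]
    # otherwise map the provider-agnostic tier for the request's provider
    return TIER_MAP[(provider, action["route_to_tier"])]
```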
### PROMPT

Terminal. Governance challenge. Pauses the request and surfaces a justification dialog to the user in the Arbitex web app. If the user submits an acceptable justification, the original request re-submits automatically with audit metadata attached. If the user cancels, the request is dropped.

PROMPT is designed for interactive callers only. Pair it with `channel: ["interactive"]` to prevent it from firing on API callers. If a PROMPT rule fires on an API caller, the caller receives HTTP 449 Retry With with a machine-readable error body — they do not see a dialog.
| Field | Required | Description |
|---|---|---|
| `type` | Yes | `"PROMPT"` |
| `prompt_message` | No | The challenge text shown to the user in the governance dialog. If omitted, a default message is shown. Keep this message specific enough for the user to understand what justification is needed, without revealing sensitive policy details. |
After the user submits a justification, the re-submitted request includes:
- `X-Governance-Justification`: the user’s justification text (truncated to 1000 characters)
- `X-Governance-Challenge-Id`: links the re-submission to the original challenge event in the audit log
The re-submitted request passes through the full policy chain again. If the same PROMPT rule fires on re-submission (for example, if the justification condition was not met), the dialog is not shown a second time — the request is blocked instead, preventing infinite challenge loops.
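The dispatch logic for a firing PROMPT rule can be sketched as a small decision function. This is an illustrative sketch of the behavior described above, not the gateway's code; the function name, the outcome strings, and the header-dict shape are assumptions for the example.

```python
# Sketch of what happens when a PROMPT rule fires (illustrative shapes):
# - API callers get HTTP 449 with a machine-readable body, never a dialog.
# - A re-submission (carrying the challenge ID header) that trips the same
#   rule again is blocked, preventing infinite challenge loops.
# - Otherwise the justification dialog is shown.

def handle_prompt_rule(headers: dict, channel: str) -> str:
    if channel != "interactive":
        return "HTTP_449"       # machine-readable error for API callers
    if "X-Governance-Challenge-Id" in headers:
        return "BLOCK"          # same rule fired on re-submission: no second dialog
    return "SHOW_DIALOG"        # first challenge for this request
```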
## applies_to values

Every rule has an `applies_to` field that determines which direction of traffic the rule scans.
| Value | When the rule evaluates |
|---|---|
| `input` | On the user’s prompt, before it is sent to the AI provider. This is the default. |
| `output` | On the model’s response, after it is received from the provider and before it is delivered to the caller. |
| `both` | On both the prompt and the response, in separate evaluation passes. |
When to use `output` or `both`:
Use output or both when you need to inspect what the model returns — for example, to prevent a model from including sensitive data in its response even if the prompt did not contain that data. Output scanning is useful for preventing PII leakage through model completions, detecting model hallucinations that include real credential-like strings, or redacting content that the model generated rather than the user submitted.
Output scanning adds latency because the response must be received and buffered before it can be inspected. For streaming responses, Arbitex buffers the output up to a configurable window (`POLICY_OUTPUT_BUFFER_MS`, default 5000 ms) before applying output rules.
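The need for buffering can be seen in a short sketch: a sensitive pattern may span two stream chunks, so the chunks must be joined before scanning. This is an illustrative sketch under assumptions (the function name and chunk shape are invented; the `POLICY_OUTPUT_BUFFER_MS` timeout handling is elided).

```python
import re

# Sketch of an output REDACT pass over a streamed response: chunks are
# buffered, the replacement is applied to the joined text, and only the
# redacted result is released to the caller. Timeout handling is elided.

def redact_streamed_output(chunks: list[str], pattern: str, replacement: str) -> str:
    buffered = "".join(chunks)  # response must be buffered before inspection
    return re.sub(pattern, replacement, buffered)
```

In the example below, the card number straddles a chunk boundary, so scanning each chunk in isolation would miss it:

```python
chunks = ["the card is 4111-1111-", "1111-1111 ok"]
redact_streamed_output(chunks, r"\d{4}-\d{4}-\d{4}-\d{4}", "[CC-REMOVED]")
```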
## Example rules

### 1. Block OpenAI access for the “openai_block” group

Use this to prevent a user group from accessing a specific provider entirely.

```json
{
  "name": "Block OpenAI for openai_block group",
  "sequence": 10,
  "applies_to": "input",
  "conditions": {
    "user_groups": ["openai_block"],
    "providers": ["openai"]
  },
  "action": {
    "type": "BLOCK",
    "message": "Your account group does not have access to OpenAI. Contact your admin."
  }
}
```

The `providers` condition ensures the rule fires only when the request targets OpenAI. If the same user sends a request to Anthropic, this rule is skipped.
### 2. Redact credit card numbers globally (input and output)

Use this to strip payment card data from all traffic, regardless of user group or provider.

```json
{
  "name": "Redact credit card numbers",
  "sequence": 5,
  "applies_to": "both",
  "conditions": {
    "entity_types": ["credit_card"],
    "entity_confidence_min": 0.80
  },
  "action": {
    "type": "REDACT",
    "redact_replacement": "[CC-REMOVED]"
  }
}
```

Because REDACT is non-terminal, this rule fires and replaces the card number, then evaluation continues. A subsequent BLOCK rule could still fire on the same request if another condition is met. Setting `applies_to: "both"` means the rule also scans model responses for card numbers in the output.
### 3. Route “junior-analyst” group requests to Haiku (cost control)

Use this to automatically send requests from a specific group to a lower-cost model tier.

```json
{
  "name": "Route junior analysts to Haiku",
  "sequence": 20,
  "applies_to": "input",
  "conditions": {
    "user_groups": ["junior-analyst"]
  },
  "action": {
    "type": "ROUTE_TO",
    "route_to_tier": "haiku"
  }
}
```

`route_to_tier: "haiku"` is provider-agnostic — if the request was directed at Anthropic, it routes to the Haiku-tier model; if directed at another supported provider, it routes to that provider’s equivalent. This rule fires for every request from users in the junior-analyst group, regardless of entity type or content.
### 4. Whitelist “power-users” group for a specific model

Use this to allow a group to access a model that would otherwise be blocked by a subsequent rule.

```json
{
  "name": "Allow power-users on gpt-4o",
  "sequence": 1,
  "applies_to": "input",
  "conditions": {
    "user_groups": ["power-users"],
    "models": ["gpt-4o"]
  },
  "action": {
    "type": "ALLOW"
  }
}
```

Position this rule at a low sequence number (here, `"sequence": 1`) so it evaluates before any BLOCK rule that targets gpt-4o. When a power-users member sends a request to gpt-4o, this ALLOW fires first and stops evaluation — the BLOCK rule is never reached.
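The first-match interplay between this ALLOW rule and a later BLOCK rule can be sketched as a walk over the chain sorted by sequence. This is an illustrative sketch, not the engine's source: the `evaluate_chain` function, its rule shape, and the `"NO_MATCH"` fall-through value are assumptions for the example.

```python
# Sketch of first-match chain evaluation (illustrative, not the engine).
# Rules are walked in ascending sequence order; the first rule whose
# conditions match decides the outcome for terminal actions.

def evaluate_chain(rules: list[dict], user_groups: list[str], model: str) -> str:
    for rule in sorted(rules, key=lambda r: r["sequence"]):
        c = rule["conditions"]
        if "user_groups" in c and not set(c["user_groups"]) & set(user_groups):
            continue  # user is in none of the listed groups
        if "models" in c and model not in c["models"]:
            continue  # request targets a different model
        return rule["action"]["type"]  # first match wins; ALLOW/BLOCK are terminal
    return "NO_MATCH"  # fall-through value assumed for this sketch
```

With the ALLOW rule at sequence 1 and a BLOCK-gpt-4o rule at sequence 10, a power-users member reaches ALLOW first, while other users fall through to BLOCK.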
### 5. Block any request matching a custom regex pattern

Use this to block requests containing a specific term or pattern — for example, material non-public information (MNPI) indicators.

```json
{
  "name": "Block MNPI keyword mentions",
  "sequence": 15,
  "applies_to": "input",
  "conditions": {
    "content_regex": "\\bMNPI\\b"
  },
  "action": {
    "type": "BLOCK",
    "message": "Requests referencing MNPI cannot be processed through this gateway."
  }
}
```

The `content_regex` condition uses a word-boundary pattern (`\b`) to avoid false positives on substrings. Regex patterns are validated for safety before storage — patterns that could cause catastrophic backtracking (ReDoS) are rejected at write time.
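The word-boundary and search (rather than full-string) semantics of `content_regex` can be verified with standard regex behavior; the sample strings below are illustrative.

```python
import re

# \bMNPI\b fires on the standalone token but not on substrings, and
# content_regex uses search semantics: a match anywhere in the text fires.
pattern = re.compile(r"\bMNPI\b")

assert pattern.search("Summarize this MNPI memo")        # standalone token: fires
assert pattern.search("prefix MNPI suffix")              # search, not full-string match
assert not pattern.search("See the OMNPIX dataset")      # substring only: does not fire
```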
### 6. Governance challenge for high-risk user actions on sensitive data

Use PROMPT to require a user to justify why they are submitting a request containing sensitive entity types. This fires only on interactive callers — API callers are unaffected by this rule.

```json
{
  "name": "Require justification for PII access — interactive",
  "sequence": 30,
  "applies_to": "input",
  "conditions": {
    "entity_types": ["ssn", "passport"],
    "entity_confidence_min": 0.80,
    "channel": ["interactive"]
  },
  "action": {
    "type": "PROMPT",
    "prompt_message": "This request contains government ID data. Please provide a business justification before proceeding."
  }
}
```

If the user submits a justification, the re-submitted request carries `X-Governance-Justification` and `X-Governance-Challenge-Id` headers and is recorded in the audit log as a justified exception. If the user cancels the dialog, the request is dropped with no error shown.
### 7. Route complex requests to a higher-capability model

Use `intent_complexity` to automatically route requests classified as complex to a more capable model tier, while simple requests stay on the economical tier.

```json
{
  "name": "Route complex requests to Opus tier",
  "sequence": 5,
  "applies_to": "input",
  "conditions": {
    "intent_complexity": "complex"
  },
  "action": {
    "type": "ROUTE_TO",
    "route_to_tier": "opus"
  }
}
```

Intent complexity is computed by the intake pipeline after DLP entity detection and before policy evaluation. `"simple"` requests are short, direct, low-context queries. `"complex"` requests involve multi-step reasoning, long context, or technical depth. The classification is a best-effort proxy; for precise routing requirements, use `content_regex` or `models` conditions instead.
## Related pages

- Policy Engine overview — evaluation flow, chain semantics, combining algorithms
- Policy Engine — Admin Guide — step-by-step UI guide with PolicySimulator
- Compliance Bundles — pre-configured packs for regulatory frameworks
- DLP Overview — how entity detection works before policy rules evaluate