
Policy Engine rule testing — advanced guide

The Policy Engine ships with two testing tools: the PolicySimulator (part of the admin UI) and the DLP evaluate endpoint (POST /api/admin/dlp-rules/evaluate). Together they let you validate the full enforcement pipeline before rules go live.

  • PolicySimulator — tests policy rule chain evaluation with a synthetic request (user, group, provider, model, prompt). Use it to verify that rule conditions fire correctly and that chain ordering produces the expected outcome.
  • Evaluate endpoint — simulates DLP rule chain evaluation for an org/group context and returns a decision trace. Use it to verify DLP findings before those findings are passed to the Policy Engine.

This guide covers the advanced testing patterns that go beyond single-rule validation: multi-pack chain evaluation, testing ROUTE_TO rules, interpreting combined DLP + policy outcomes, and reproducing production decisions in a test context.


Prerequisites:

  • Admin role
  • At least one policy pack in the chain
  • API access for evaluate endpoint tests (arb_live_* admin token or equivalent)

The PolicySimulator runs a synthetic request through the complete policy evaluation pipeline:

  1. The DLP pipeline runs on the supplied prompt (Tier 1 regex → Tier 2 NER → Tier 3 DeBERTa, depending on your org config)
  2. The full org policy chain is evaluated against the request context and the DLP findings
  3. The simulator returns the exact outcome that would have been produced on a live request

The result is deterministic — the same inputs always produce the same result. Changes you make to rules take effect immediately in the simulator (no deployment step required).
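The three stages can be modeled as a pure function. The sketch below is a minimal illustration with made-up rule structures, assuming a first_applicable chain; the real pipeline is internal to the product:

```python
def simulate(request, dlp_pipeline, chain):
    # Stage 1: the DLP tiers run on the prompt and emit findings.
    findings = dlp_pipeline(request["prompt"])
    # Stage 2: the policy chain is evaluated against context + findings.
    for rule in chain:
        if rule["matches"](request, findings):
            # Stage 3: the outcome is what a live request would have gotten.
            return {"outcome": rule["action"], "dlp_findings": findings}
    return {"outcome": "ALLOW", "dlp_findings": findings}

# Hypothetical rule: block any request from the openai_block group.
chain = [{"action": "BLOCK",
          "matches": lambda r, f: "openai_block" in r["groups"]}]
req = {"prompt": "hello", "groups": ["openai_block"]}
no_dlp = lambda prompt: []  # stub DLP pipeline with no findings

# Determinism: a pure function of its inputs always returns the same result.
assert simulate(req, no_dlp, chain) == simulate(req, no_dlp, chain)
```

Because there is no hidden state, rule edits change the next simulation immediately, which is what makes matrix-style testing practical.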

For each policy pack you want to test, build a matrix of cases that covers:

  • Positive cases — requests that should match the rule and produce the expected action
  • Negative cases — requests that should not match and should pass through (or be caught by a later rule)
  • Boundary cases — requests at the edge of a condition (e.g., confidence threshold exactly at 0.85, a user in one group but not another)

Example matrix for a rule that blocks OpenAI access for the openai_block group:

Test             User groups                Provider     Expected outcome   Pass?
Block applies    openai_block               openai       BLOCK
Wrong provider   openai_block               anthropic    ALLOW
Wrong group      engineering                openai       ALLOW
Both groups      openai_block, engineering  openai       BLOCK

Run each scenario in the simulator and verify the result against the expected column.
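A matrix like this can also be driven as data. The sketch below uses a local stand-in for the simulator (the real tool is UI-based), hard-coding the single hypothetical rule under test:

```python
# Stand-in for the simulator: the rule under test blocks provider
# "openai" for members of the "openai_block" group.
def simulate_rule(user_groups, provider):
    if "openai_block" in user_groups and provider == "openai":
        return "BLOCK"
    return "ALLOW"

# One tuple per matrix row: (name, groups, provider, expected outcome).
matrix = [
    ("Block applies",  ["openai_block"],                "openai",    "BLOCK"),
    ("Wrong provider", ["openai_block"],                "anthropic", "ALLOW"),
    ("Wrong group",    ["engineering"],                 "openai",    "ALLOW"),
    ("Both groups",    ["openai_block", "engineering"], "openai",    "BLOCK"),
]

for name, groups, provider, expected in matrix:
    actual = simulate_rule(groups, provider)
    status = "PASS" if actual == expected else "FAIL"
    print(f"{name}: expected {expected}, got {actual} -> {status}")
```

Keeping the matrix as data makes it easy to add boundary cases later without touching the runner loop.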

A catch-all BLOCK rule (no conditions, sequence=999) should be tested with requests that you expect no earlier rule to match:

User: (any user not in any targeted group)
Groups: (empty override)
Provider: openai
Model: gpt-4o
Channel: API
Prompt: "Summarize the quarterly report."

Expected outcome: BLOCK if your org has a deny-all posture, ALLOW if permissive.

If the outcome is wrong, check whether an earlier rule’s conditions are unintentionally matching. The match_reason field in the simulator result will tell you which rule fired and why.


A simulator run returns a result like this:

{
  "outcome": "BLOCK",
  "matched_pack_id": "pack_01HZ_TRADING_DESK",
  "matched_rule_id": "rule_01HZ_BLOCK_OPENAI",
  "matched_rule_name": "Block OpenAI for no-openai group",
  "matched_sequence": 10,
  "match_reason": "user_groups matched ['no-openai']; provider=openai",
  "action_taken": "BLOCK",
  "message": "OpenAI access is not permitted for your group.",
  "dlp_findings": []
}

Field                                What to verify
outcome                              The terminal action. Should match your expected outcome.
matched_pack_id / matched_rule_id    The specific rule that fired. Confirms the right rule matched, not an unintended one.
matched_sequence                     The sequence position of the matching rule. Lower is earlier in the chain.
match_reason                         Human-readable explanation of which conditions matched. Use this to debug unexpected outcomes.
dlp_findings                         DLP entities detected in the prompt. Non-empty when the prompt contains PII/sensitive content. Verify entity types and confidence match your expectations.

When outcome is ALLOW but you expected a match

Check:

  1. Conditions not met — read match_reason on the rule that should have fired. The simulator only shows the matched rule; if no rule matched, matched_rule_id is null. Navigate to the rule in PolicyRuleEditor and verify each condition against the test inputs.
  2. Earlier rule blocked evaluation — under first_applicable, an earlier rule may have fired ALLOW (whitelist), stopping evaluation before your target rule was reached. Check the rules at lower sequence numbers.
  3. deny_overrides mode — under deny_overrides, any BLOCK beats any ALLOW. Verify the combining algorithm in PolicyChainEditor.
  4. Group membership — if the rule uses user_groups, confirm the user is actually in those groups. Use the Groups (override) field to force group membership for testing.

When outcome is BLOCK but you expected ALLOW

Read match_reason carefully. Common causes:

  • A catch-all BLOCK rule at high sequence number fired because no earlier whitelist matched
  • A condition on a whitelist rule didn’t match (check entity type, confidence threshold, channel)
  • Under deny_overrides, a BLOCK rule later in the chain overrode an ALLOW rule

Multi-pack chain evaluation

Multi-pack chains require testing pack interactions — specifically: which pack fires first and whether earlier packs interfere with later ones.

Setup:

  • Sequence=1: PCI-DSS compliance bundle (built-in, read-only)
  • Sequence=2: Custom pack with an ALLOW whitelist for your QA testing group

Test: verify the QA group whitelist overrides the PCI bundle.

User: qa_tester
Groups override: qa_testing
Provider: openai
Prompt: "Card 4532-0151-1234-5678 (synthetic test data)"

What actually happens:

  • Under first_applicable, the custom pack is at sequence=2, so the PCI bundle (sequence=1) evaluates first. If PCI fires BLOCK first, the whitelist never runs.
  • Fix: Move the custom pack to sequence=1 to evaluate the whitelist before the compliance bundle.

This is a common ordering mistake. The simulator reveals it immediately.
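The ordering sensitivity is easy to see in a toy first_applicable evaluator. This is a sketch with made-up pack and rule structures, not the product's internals:

```python
def first_applicable(packs, request):
    # Packs are evaluated in sequence order; the first matching rule wins.
    for pack in sorted(packs, key=lambda p: p["sequence"]):
        for rule in pack["rules"]:
            if rule["matches"](request):
                return rule["action"]
    return "ALLOW"  # default when nothing in the chain matches

pci = {"sequence": 1, "rules": [
    {"action": "BLOCK", "matches": lambda r: "credit_card" in r["entities"]}]}
qa_whitelist = {"sequence": 2, "rules": [
    {"action": "ALLOW", "matches": lambda r: "qa_testing" in r["groups"]}]}

req = {"groups": ["qa_testing"], "entities": ["credit_card"]}

# PCI at sequence=1 fires first, so the whitelist never runs.
assert first_applicable([pci, qa_whitelist], req) == "BLOCK"

# Swap the sequence numbers and the whitelist wins.
pci["sequence"], qa_whitelist["sequence"] = 2, 1
assert first_applicable([pci, qa_whitelist], req) == "ALLOW"
```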

With deny_overrides as the combining algorithm, any BLOCK in the chain beats any ALLOW regardless of sequence:

Combined algorithm: deny_overrides
Pack A (sequence=1): ALLOW rule for group "finance"
Pack B (sequence=2): BLOCK rule for entity_type "credit_card"

Test case:

User groups: finance
Prompt: "Card 4532-0151-1234-5678"

Under deny_overrides, the BLOCK from Pack B wins even though Pack A’s ALLOW fired first. The simulator outcome will be BLOCK. If you expected ALLOW, switch to first_applicable or restructure the rules.
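The same toy model shows why sequence is irrelevant here. This sketch assumes deny_overrides collects every matching rule's decision and lets any BLOCK win, with hypothetical pack structures:

```python
def deny_overrides(packs, request):
    # Collect decisions from every matching rule across all packs.
    decisions = [rule["action"]
                 for pack in sorted(packs, key=lambda p: p["sequence"])
                 for rule in pack["rules"]
                 if rule["matches"](request)]
    # Any BLOCK beats any ALLOW, regardless of where it sits in the chain.
    return "BLOCK" if "BLOCK" in decisions else "ALLOW"

pack_a = {"sequence": 1, "rules": [
    {"action": "ALLOW", "matches": lambda r: "finance" in r["groups"]}]}
pack_b = {"sequence": 2, "rules": [
    {"action": "BLOCK", "matches": lambda r: "credit_card" in r["entities"]}]}

req = {"groups": ["finance"], "entities": ["credit_card"]}
assert deny_overrides([pack_a, pack_b], req) == "BLOCK"

# Reordering the packs changes nothing under deny_overrides.
pack_a["sequence"], pack_b["sequence"] = 2, 1
assert deny_overrides([pack_a, pack_b], req) == "BLOCK"
```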


Testing ROUTE_TO rules

ROUTE_TO overrides the destination model at routing time. Test that it fires under the right conditions and routes to the correct target.

Rule: intent_complexity = "simple" → ROUTE_TO tier="haiku"

Test with a short, simple prompt:

Prompt: "What is 2 + 2?"
Expected outcome: ROUTE_TO
redacted_prompt: (not shown for ROUTE_TO)

Verify action_taken: "ROUTE_TO" in the simulator result. The simulator does not reveal the resolved model (routing is determined at request execution time), but the action confirms the rule fires.

Rule: user_groups: ["cost-sensitive"] → ROUTE_TO model="gpt-4o-mini"

Test both sides:

  1. User in cost-sensitive group → expected ROUTE_TO
  2. User not in cost-sensitive group → expected ALLOW (no routing override)
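Both sides of a group-conditioned ROUTE_TO rule reduce to a simple membership check. A minimal sketch of the rule above, with hypothetical structures:

```python
def route_rule(user_groups):
    # Hypothetical rule: members of "cost-sensitive" are rerouted.
    if "cost-sensitive" in user_groups:
        return {"action": "ROUTE_TO", "model": "gpt-4o-mini"}
    return {"action": "ALLOW"}  # no routing override

assert route_rule(["cost-sensitive"])["action"] == "ROUTE_TO"
assert route_rule(["engineering"])["action"] == "ALLOW"
```

Testing the negative side matters as much as the positive one: an overly broad group condition silently reroutes users you never intended to affect.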

Combined DLP + policy outcomes

For rules that depend on DLP findings (entity types, confidence thresholds), you need both the DLP pipeline and the policy chain to produce the expected outcome. The PolicySimulator runs both — but when debugging, it helps to isolate them.

Step 1 — verify DLP findings with the evaluate endpoint

Before testing the full chain, confirm the DLP pipeline detects the entities your policy rule expects:

curl -s -X POST https://api.arbitex.ai/api/admin/dlp-rules/evaluate \
  -H "Authorization: Bearer $ADMIN_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "My IBAN is GB29NWBK60161331926819",
    "direction": "input",
    "org_id": "'$ORG_ID'"
  }' | jq '{final_action, findings: [.findings[] | {entity_type, confidence, action}]}'

Expected output:

{
  "final_action": "block",
  "findings": [
    { "entity_type": "iban", "confidence": 1.0, "action": "block" }
  ]
}

If the entity type you expect is not in findings, then either the DLP rule that should detect it is inactive, its pattern doesn’t match the text, or the confidence threshold isn’t met. Debug in the DLP rule test panel first.

Step 2 — verify the policy rule fires on that entity type

Once you’ve confirmed the DLP pipeline produces the expected findings, test in the PolicySimulator with the same prompt:

Prompt: "My IBAN is GB29NWBK60161331926819"

Verify that dlp_findings in the simulator result shows entity_type: "iban" and the policy rule with entity_types: ["iban"] fires as expected.

If DLP found the entity but the policy rule didn’t fire, check:

  • The rule’s entity_types condition lists the exact entity type string
  • The minimum_confidence condition is at or below the detected confidence (e.g., confidence=0.97 satisfies minimum_confidence=0.85)
  • The applies_to field matches the direction (input vs output)
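Those three checks can be expressed directly. The sketch below uses hypothetical field names mirroring the conditions above:

```python
def rule_fires(rule, finding, direction):
    # All three conditions must hold for a DLP-conditioned rule to fire.
    return (finding["entity_type"] in rule["entity_types"]           # exact string match
            and finding["confidence"] >= rule["minimum_confidence"]  # at or above threshold
            and direction in rule["applies_to"])                     # input vs output

rule = {"entity_types": ["iban"], "minimum_confidence": 0.85, "applies_to": ["input"]}

assert rule_fires(rule, {"entity_type": "iban", "confidence": 0.97}, "input")       # 0.97 >= 0.85
assert not rule_fires(rule, {"entity_type": "iban", "confidence": 0.80}, "input")   # below threshold
assert not rule_fires(rule, {"entity_type": "iban", "confidence": 0.97}, "output")  # wrong direction
```

Note that the threshold comparison is inclusive: a finding at exactly the minimum confidence still satisfies the condition.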

Reproducing production decisions

When an admin reports an unexpected block or allow in production, use the audit log entry to reproduce the decision in the simulator.

Every audit entry includes:

  • user_id — who made the request
  • matched_pack_id, matched_rule_id — what fired
  • match_reason — why it fired
  • entity_types_detected — what DLP found

To reproduce:

  1. Copy the user_id into the simulator’s User field
  2. Use the entity_types_detected from the audit entry to craft a prompt that would generate those entities
  3. Set the same provider and model from the audit entry
  4. Run the simulator and compare the outcome and matched_rule_id to the audit entry

If the simulator outcome differs from the production outcome, the most likely cause is a rule change between the time of the original request and the reproduction. Check the rule’s modification history by comparing the current rule configuration to what was active when the request was made.