Routing and Model Controls

Arbitex routes AI requests to providers and models based on configurable rules. The routing layer sits between the policy engine and the provider API: requests that clear policy evaluation are dispatched according to routing rules, then delivered to the selected provider. Admins can also set latency alert thresholds and monitor cost distribution across providers.

All routing configuration is available under Admin → Routing.

Prerequisites

Org Admin role

Routing Rules

Navigate to Admin → Routing → Rules to manage routing rules. Rules are evaluated in priority order (lowest number = highest priority). The first rule whose conditions match the request determines the routing action.

Conditions

Each rule has one or more conditions that must all match (logical AND):

Field	Operators	Example value
group	`eq`, `in`, `not_in`	`engineering`
user_role	`eq`, `in`, `not_in`	`admin`
intent	`eq`, `in`, `not_in`	`code_generation`
time_of_day	`eq`, `in`, `not_in`	`09:00-17:00`

Operators:

- eq: exact match (single value)
- in: matches any of a comma-separated list of values
- not_in: matches none of a comma-separated list of values

A rule with no conditions matches all requests.
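
The matching semantics described above (all conditions must match, and the first matching rule in ascending priority order wins) can be sketched in Python. This is an illustrative model only, not Arbitex's implementation: the `Condition` and `Rule` classes, the `first_match` function, and the dict-shaped request are hypothetical. Matching for `time_of_day` ranges is left out because this page does not say how ranges are compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Condition:
    field: str      # e.g. "group", "user_role", "intent"
    operator: str   # "eq", "in", or "not_in"
    value: str      # one value, or a comma-separated list for in / not_in


@dataclass(frozen=True)
class Rule:
    priority: int       # lower number = evaluated earlier
    conditions: tuple   # tuple of Condition; empty means "match everything"
    actions: dict


def condition_matches(cond, request):
    """Evaluate one condition against a request (a plain dict in this sketch)."""
    actual = request.get(cond.field)
    if cond.operator == "eq":
        return actual == cond.value
    values = [v.strip() for v in cond.value.split(",")]
    if cond.operator == "in":
        return actual in values
    if cond.operator == "not_in":
        return actual not in values
    raise ValueError(f"unknown operator: {cond.operator}")


def first_match(rules, request):
    """Return the first rule, in ascending priority order, whose conditions all match."""
    for rule in sorted(rules, key=lambda r: r.priority):
        # all() over an empty tuple is True, so a rule with no conditions matches.
        if all(condition_matches(c, request) for c in rule.conditions):
            return rule
    return None
```

Under these assumptions, a catch-all rule with no conditions at a high priority number acts as a default for any request that no earlier rule claims.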

Actions

Each rule specifies one or more routing actions applied when conditions match:

Action	Description
route_to_provider	Send the request to a specific AI provider (e.g., `anthropic`, `openai`)
route_to_model	Send the request to a specific model ID (e.g., `claude-sonnet-4-6`)
route_to_tier	Send the request to a capability tier: `fast` (low-latency small models), `balanced` (general-purpose), or `powerful` (highest-capability)
optimize	Optimize selection for `cost` (prefer cheaper models) or `latency` (prefer fastest response)
cost_weight	Number from `0.0` (optimize purely for quality) to `1.0` (optimize purely for cost). Blends cost and quality scoring when selecting a model.

Create a routing rule

1. Navigate to Admin → Routing → Rules.
2. Click New Rule.
3. Add one or more conditions using the condition builder:
   - Select a field (group / user_role / intent / time_of_day)
   - Select an operator (eq / in / not_in)
   - Enter a value (for in / not_in, separate multiple values with commas)
4. Configure the action fields.
5. Click Save Rule. The rule is added at the lowest priority (highest number) by default.

At least one non-empty condition is required before saving.

Reorder rules

Priority is shown as a number in the rule table. Use the ↑ / ↓ buttons on a rule row to raise or lower its priority; raising a rule's priority gives it a lower number. Each reorder is persisted automatically and takes effect immediately.

Lower priority numbers are evaluated first. A rule at priority 1 is checked before a rule at priority 10.

Edit or delete a rule

Click the Edit icon on a rule row to open the rule editor. Click Delete and confirm to remove the rule.

API reference

# List all routing rules (ordered by priority)
GET /api/admin/routing-rules/

# Create a routing rule
POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "group", "operator": "in", "value": "legal,compliance" },
    { "field": "intent", "operator": "eq", "value": "document_review" }
  ],
  "actions": {
    "route_to_tier": "powerful",
    "cost_weight": 0.2
  }
}

# Update a routing rule
PUT /api/admin/routing-rules/{rule_id}
{ ...same structure... }

# Delete a routing rule
DELETE /api/admin/routing-rules/{rule_id}
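
A client might assemble the POST body for a new rule along these lines. This is a sketch, not an official client: `build_rule_payload` is a hypothetical helper, the allowed fields and operators are copied from the tables on this page, and the checks are assumptions about what a caller may want to validate locally rather than a description of server-side validation. No network call is made.

```python
import json

# Values taken from the Conditions table on this page.
ALLOWED_FIELDS = {"group", "user_role", "intent", "time_of_day"}
ALLOWED_OPERATORS = {"eq", "in", "not_in"}


def build_rule_payload(conditions, actions):
    """Build a routing-rule request body.

    conditions: list of (field, operator, values) tuples, values being a list of strings.
    actions: dict such as {"route_to_tier": "powerful", "cost_weight": 0.2}.
    """
    body = {"conditions": [], "actions": dict(actions)}
    for field_name, operator, values in conditions:
        if field_name not in ALLOWED_FIELDS:
            raise ValueError(f"unknown field: {field_name}")
        if operator not in ALLOWED_OPERATORS:
            raise ValueError(f"unknown operator: {operator}")
        if operator == "eq" and len(values) != 1:
            raise ValueError("eq takes exactly one value")
        # in / not_in values are sent as one comma-separated string.
        body["conditions"].append(
            {"field": field_name, "operator": operator, "value": ",".join(values)}
        )
    weight = body["actions"].get("cost_weight")
    if weight is not None and not 0.0 <= weight <= 1.0:
        raise ValueError("cost_weight must be between 0.0 and 1.0")
    return body


payload = build_rule_payload(
    [("group", "in", ["legal", "compliance"]), ("intent", "eq", ["document_review"])],
    {"route_to_tier": "powerful", "cost_weight": 0.2},
)
print(json.dumps(payload, indent=2))
```

The printed JSON has the same shape as the create request shown above and can be sent with any HTTP client.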

Latency Thresholds

Navigate to Admin → Routing → Latency to view real-time latency metrics and configure alert thresholds per provider.

Latency metrics

The latency monitor shows the following metrics for each provider over a selectable time window (1h / 24h / 7d / 30d):

Metric	Description
p50	Median response latency (50th percentile)
p95	95th-percentile latency — the primary threshold comparison column
p99	99th-percentile latency
avg	Mean response latency
Requests	Total request count for the period
Trend	↑ (degrading, red) / ↓ (improving, green) / — (stable)

Status levels

Latency status is color-coded based on the p50 value:

Status	p50 value
Healthy (green)	< 200 ms
Warning (amber)	200–499 ms
Critical (red)	≥ 500 ms

When a provider’s p95 exceeds its configured threshold, the p95 cell shows a red Exceeds threshold badge.
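
The two checks above are independent: the status color comes from p50, while the badge comes from p95 compared with the configured threshold. A minimal sketch of both, using hypothetical function names and assuming values between 499 and 500 ms count as Warning:

```python
def latency_status(p50_ms):
    """Map a p50 latency in milliseconds to the status levels in the table above."""
    if p50_ms < 200:
        return "healthy"
    if p50_ms < 500:
        return "warning"
    return "critical"


def exceeds_threshold(p95_ms, threshold_ms=500):
    """True when p95 is strictly above the provider's threshold (500 ms is the documented default)."""
    return p95_ms > threshold_ms
```

A provider can therefore show a Healthy status and an Exceeds threshold badge at the same time, for example with a p50 of 150 ms and a p95 of 650 ms against the default threshold.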

Configure thresholds

Click Thresholds to expand the threshold configuration panel. Per-provider threshold settings:

Setting	Default	Description
Latency threshold (ms)	500	p95 latency above this value triggers the “Exceeds threshold” badge
Error rate (%)	5.0	Error rate above this percentage triggers a provider-level alert

Adjust the values using the number inputs (latency step: 50 ms; error rate step: 0.5%) and click Save Thresholds.

API reference

# Get latency metrics
GET /api/metrics/latency?window=24h
# window options: 1h | 24h | 7d | 30d

# Get current thresholds
GET /api/admin/providers/thresholds
# Response: { "thresholds": { "anthropic": { "latency_ms": 500, "error_rate_pct": 5.0 }, ... } }

# Update thresholds
PUT /api/admin/providers/thresholds
{
  "thresholds": {
    "anthropic": { "latency_ms": 800, "error_rate_pct": 3.0 },
    "openai":    { "latency_ms": 600, "error_rate_pct": 5.0 }
  }
}

Cost Routing

Navigate to Admin → Routing → Cost to see a breakdown of spend by provider and identify cost optimization opportunities.

Period selector

Use the 24h / 7d / 30d tab selector to change the analysis period. Data is sourced from the usage analytics store.

Provider cost summary

The provider cost table aggregates all model-level usage into per-provider totals:

Column	Description
Provider	AI provider name
Total Cost	Sum of all request costs for this provider in the selected period (USD, 4 decimal places)
Total Tokens	Total input + output token count
Cost / 1K tokens	Effective cost rate per 1,000 tokens (total_cost ÷ total_tokens × 1,000)

The table is sorted by total cost descending — your highest-spend provider appears first.
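
The aggregation behind this table can be approximated as follows. This is a sketch under assumptions: the row keys (`provider`, `cost_usd`, `input_tokens`, `output_tokens`) and the function name are hypothetical and do not describe the usage analytics store's actual schema.

```python
from collections import defaultdict


def provider_summary(usage_rows):
    """Aggregate model-level usage rows into per-provider totals, highest spend first."""
    totals = defaultdict(lambda: {"total_cost": 0.0, "total_tokens": 0})
    for row in usage_rows:
        entry = totals[row["provider"]]
        entry["total_cost"] += row["cost_usd"]
        entry["total_tokens"] += row["input_tokens"] + row["output_tokens"]

    summary = []
    for provider, entry in totals.items():
        tokens = entry["total_tokens"]
        # Cost / 1K tokens = total_cost / total_tokens * 1,000 (guarding against zero tokens).
        rate = entry["total_cost"] / tokens * 1000 if tokens else 0.0
        summary.append(
            {
                "provider": provider,
                "total_cost": round(entry["total_cost"], 4),  # USD, 4 decimal places
                "total_tokens": tokens,
                "cost_per_1k": rate,
            }
        )
    summary.sort(key=lambda r: r["total_cost"], reverse=True)
    return summary
```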

Cheapest provider per model

Below the summary table, the Cheapest Provider per Model section identifies the most cost-efficient provider for each model that received traffic in the period. Each row shows:

- Model identifier
- Cheapest: {provider} badge
- Cost per token for that provider/model combination

Use this view to inform routing rule configuration. For example, if claude-sonnet-4-6 is cheapest via anthropic, create a routing rule to set route_to_provider: anthropic for workloads where cost is the priority.
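
One way to reproduce this comparison from exported usage data is sketched below. The row keys (`model`, `provider`, `cost_usd`, `total_tokens`) and the function name are assumptions for illustration, not the product's schema.

```python
def cheapest_provider_per_model(usage_rows):
    """Return {model: (provider, cost_per_token)} for the cheapest provider of each model."""
    # Sum cost and tokens for every (model, provider) pair that received traffic.
    aggregated = {}
    for row in usage_rows:
        key = (row["model"], row["provider"])
        cost, tokens = aggregated.get(key, (0.0, 0))
        aggregated[key] = (cost + row["cost_usd"], tokens + row["total_tokens"])

    cheapest = {}
    for (model, provider), (cost, tokens) in aggregated.items():
        if tokens == 0:
            continue  # no tokens recorded, so no meaningful per-token rate
        per_token = cost / tokens
        if model not in cheapest or per_token < cheapest[model][1]:
            cheapest[model] = (provider, per_token)
    return cheapest
```

The provider returned for a model is the value a `route_to_provider` action would use when cost is the priority for that workload.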

Cost weight optimization

When a routing rule uses cost_weight (0.0–1.0), Arbitex blends cost and quality scoring when selecting a model within the matched tier:

- 0.0: select the highest-quality model regardless of cost
- 0.5: balance cost and quality equally
- 1.0: select the lowest-cost model regardless of quality

Pair cost_weight with the Cost Routing view to track whether the blended selection is reducing your total spend over time.
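
This page does not give the scoring formula, so the following is only one plausible reading of "blends cost and quality": a linear blend of a quality score and a min-max normalized cheapness score. The `pick_model` function, the `quality` and `cost_per_1k` keys, and the model names in the usage note are all hypothetical.

```python
def pick_model(candidates, cost_weight):
    """Pick a model from candidates by blending quality and cheapness.

    candidates: list of dicts with 'model', 'quality' (0.0-1.0), and 'cost_per_1k' (USD).
    cost_weight: 0.0 favors quality only, 1.0 favors cost only.
    """
    if not 0.0 <= cost_weight <= 1.0:
        raise ValueError("cost_weight must be between 0.0 and 1.0")
    costs = [c["cost_per_1k"] for c in candidates]
    highest = max(costs)
    span = highest - min(costs)

    def score(candidate):
        # Cheapness is 1.0 for the cheapest candidate and 0.0 for the most expensive.
        cheapness = 1.0 if span == 0 else (highest - candidate["cost_per_1k"]) / span
        return (1.0 - cost_weight) * candidate["quality"] + cost_weight * cheapness

    return max(candidates, key=score)["model"]
```

With three hypothetical candidates, `model-a` (quality 0.9, cost 10), `model-b` (quality 0.6, cost 2), and `model-c` (quality 0.7, cost 5), this sketch selects `model-a` at a weight of 0.0 and `model-b` at a weight of 1.0, matching the endpoints described above.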

Routing Examples

Route legal team requests to the powerful tier

POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "group", "operator": "in", "value": "legal,compliance" }
  ],
  "actions": {
    "route_to_tier": "powerful"
  }
}

Cost-optimize non-critical API workloads

POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "user_role", "operator": "eq", "value": "api_service" }
  ],
  "actions": {
    "optimize": "cost",
    "cost_weight": 0.8
  }
}

Route after-hours requests to a specific provider

POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "time_of_day", "operator": "not_in", "value": "09:00-17:00" }
  ],
  "actions": {
    "route_to_provider": "openai"
  }
}
