Routing and Model Controls
Arbitex routes AI requests to providers and models based on configurable rules. The routing layer sits between the policy engine and the provider API: requests that clear policy evaluation are dispatched according to routing rules, then delivered to the selected provider. Admins can also set latency alert thresholds and monitor cost distribution across providers.
All routing configuration is available under Admin → Routing.
Prerequisites
- Org Admin role
Routing Rules
Navigate to Admin → Routing → Rules to manage routing rules. Rules are evaluated in priority order (lowest number = highest priority). The first rule whose conditions match the request determines the routing action.
Conditions
Each rule has one or more conditions that must all match (logical AND):
| Field | Operators | Example value |
|---|---|---|
| group | eq, in, not_in | engineering |
| user_role | eq, in, not_in | admin |
| intent | eq, in, not_in | code_generation |
| time_of_day | eq, in, not_in | 09:00-17:00 |
Operators:
- eq: exact match (single value)
- in: matches any of a comma-separated list of values
- not_in: matches none of a comma-separated list of values
A rule with no conditions matches all requests.
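The evaluation order described above (priority order, first match wins, all conditions ANDed) can be sketched as follows. This is a minimal illustration, not Arbitex's implementation: the `priority` key, the shape of the `request` dict, and both function names are assumptions, and `time_of_day` range matching is left out because its exact semantics are not specified here.

```python
from typing import Optional


def condition_matches(condition: dict, request: dict) -> bool:
    """Evaluate one condition (eq / in / not_in) against request attributes."""
    actual = request.get(condition["field"])
    operator = condition["operator"]
    if operator == "eq":
        return actual == condition["value"]
    # in / not_in take a comma-separated list of values.
    values = [v.strip() for v in condition["value"].split(",")]
    if operator == "in":
        return actual in values
    if operator == "not_in":
        return actual not in values
    raise ValueError(f"unknown operator: {operator}")


def select_rule(rules: list, request: dict) -> Optional[dict]:
    """Return the first rule, in ascending priority number, whose conditions all match."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        # all() over an empty list is True, so a rule with no conditions matches everything.
        if all(condition_matches(c, request) for c in rule["conditions"]):
            return rule
    return None


rules = [
    {"priority": 10, "conditions": [], "actions": {"route_to_tier": "balanced"}},
    {
        "priority": 1,
        "conditions": [{"field": "group", "operator": "in", "value": "legal,compliance"}],
        "actions": {"route_to_tier": "powerful"},
    },
]

legal_match = select_rule(rules, {"group": "legal", "intent": "document_review"})
fallback_match = select_rule(rules, {"group": "engineering"})
print(legal_match["actions"], fallback_match["actions"])
```

The priority 1 rule is checked first and handles the legal request; the engineering request falls through to the condition-less rule at priority 10.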
Actions
Each rule specifies one or more routing actions applied when conditions match:
| Action | Description |
|---|---|
| route_to_provider | Send the request to a specific AI provider (e.g., anthropic, openai) |
| route_to_model | Send the request to a specific model ID (e.g., claude-sonnet-4-6) |
| route_to_tier | Send the request to a capability tier: fast (low-latency small models), balanced (general-purpose), or powerful (highest-capability) |
| optimize | Optimize selection for cost (prefer cheaper models) or latency (prefer fastest response) |
| cost_weight | Number from 0.0 (optimize purely for quality) to 1.0 (optimize purely for cost). Blends cost and quality scoring when selecting a model. |
Create a routing rule
- Navigate to Admin → Routing → Rules.
- Click New Rule.
- Add one or more conditions using the condition builder:
  - Select a field (group / user_role / intent / time_of_day)
  - Select an operator (eq / in / not_in)
  - Enter a value (for in / not_in, separate multiple values with commas)
- Configure the action fields.
- Click Save Rule. The rule is added at the lowest priority (highest number) by default.
At least one non-empty condition is required before saving.
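For scripted rule creation, the same checks can be applied before sending a request. The sketch below builds a request body in the shape shown in the API reference for this section; the helper name `build_rule_payload` and the client-side validation are hypothetical, and how the server itself validates input is not documented here.

```python
import json

# Fields and operators listed in the Conditions table.
ALLOWED_FIELDS = {"group", "user_role", "intent", "time_of_day"}
ALLOWED_OPERATORS = {"eq", "in", "not_in"}


def build_rule_payload(conditions: list, actions: dict) -> str:
    """Validate conditions locally and return a JSON body for the create-rule call."""
    if not conditions:
        raise ValueError("at least one condition is required")
    for cond in conditions:
        if cond["field"] not in ALLOWED_FIELDS:
            raise ValueError(f"unsupported field: {cond['field']}")
        if cond["operator"] not in ALLOWED_OPERATORS:
            raise ValueError(f"unsupported operator: {cond['operator']}")
        if not str(cond["value"]).strip():
            raise ValueError("condition value must be non-empty")
    return json.dumps({"conditions": conditions, "actions": actions}, indent=2)


body = build_rule_payload(
    [{"field": "group", "operator": "in", "value": "legal,compliance"}],
    {"route_to_tier": "powerful", "cost_weight": 0.2},
)
print(body)  # body to send with POST /api/admin/routing-rules/
```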
Reorder rules
Priority is shown as a number in the rule table. Use the ↑ / ↓ buttons on a rule row to increase or decrease its priority. Changes take effect immediately (each reorder is persisted automatically).
Lower priority numbers are evaluated first. A rule at priority 1 is checked before a rule at priority 10.
Edit or delete a rule
Click the Edit icon on a rule row to open the rule editor. Click Delete and confirm to remove the rule.
API reference
```
# List all routing rules (ordered by priority)
GET /api/admin/routing-rules/

# Create a routing rule
POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "group", "operator": "in", "value": "legal,compliance" },
    { "field": "intent", "operator": "eq", "value": "document_review" }
  ],
  "actions": {
    "route_to_tier": "powerful",
    "cost_weight": 0.2
  }
}

# Update a routing rule
PUT /api/admin/routing-rules/{rule_id}
{ ...same structure... }

# Delete a routing rule
DELETE /api/admin/routing-rules/{rule_id}
```
Latency Thresholds
Navigate to Admin → Routing → Latency to view real-time latency metrics and configure alert thresholds per provider.
Latency metrics
The latency monitor shows the following metrics for each provider over a selectable time window (1h / 24h / 7d / 30d):
| Metric | Description |
|---|---|
| p50 | Median response latency (50th percentile) |
| p95 | 95th-percentile latency — the primary threshold comparison column |
| p99 | 99th-percentile latency |
| avg | Mean response latency |
| Requests | Total request count for the period |
| Trend | ↑ (degrading, red) / ↓ (improving, green) / — (stable) |
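To make the percentile columns concrete, the sketch below derives p50, p95, p99, and avg from a list of raw latencies using Python's standard `statistics` module. The interpolation method Arbitex uses is not documented here, so the `inclusive` method and the function name `latency_summary` are assumptions made for illustration.

```python
import statistics


def latency_summary(latencies_ms: list) -> dict:
    """Summarize raw response latencies (ms) into the monitor's metric columns."""
    # quantiles(n=100) returns 99 cut points; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {
        "p50": cuts[49],
        "p95": cuts[94],
        "p99": cuts[98],
        "avg": statistics.fmean(latencies_ms),
        "requests": len(latencies_ms),
    }


# Synthetic sample: 101 requests with latencies 0, 1, ..., 100 ms.
sample = list(range(0, 101))
summary = latency_summary(sample)
print(summary)
```

In real traffic p95 and p99 usually sit well above avg, because a small number of slow requests dominates the tail.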
Status levels
Latency status is color-coded based on the p50 value:
| Status | p50 value |
|---|---|
| Healthy (green) | < 200 ms |
| Warning (amber) | 200–499 ms |
| Critical (red) | ≥ 500 ms |
When a provider’s p95 exceeds its configured threshold, the p95 cell shows a red Exceeds threshold badge.
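The two checks use different percentiles: the status color follows p50, while the threshold badge follows p95. A minimal sketch of both, with hypothetical function names; the table does not say how fractional values between 499 and 500 ms are treated, so this version assumes anything below 500 ms is a warning.

```python
def latency_status(p50_ms: float) -> str:
    """Map a provider's p50 latency to the status levels in the table above."""
    if p50_ms < 200:
        return "healthy"   # green
    if p50_ms < 500:
        return "warning"   # amber (assumption: covers 499 to 500 ms fractions)
    return "critical"      # red


def exceeds_threshold(p95_ms: float, threshold_ms: float = 500) -> bool:
    """True when p95 is above the configured threshold (badge shown)."""
    return p95_ms > threshold_ms


# A provider can be "healthy" on p50 and still show the badge on p95.
print(latency_status(180), exceeds_threshold(650))
```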
Configure thresholds
Click Thresholds to expand the threshold configuration panel. Per-provider threshold settings:
| Setting | Default | Description |
|---|---|---|
| Latency threshold (ms) | 500 | p95 latency above this value triggers the “Exceeds threshold” badge |
| Error rate (%) | 5.0 | Error rate above this percentage triggers a provider-level alert |
Adjust the values using the number inputs (latency step: 50 ms; error rate step: 0.5%) and click Save Thresholds.
API reference
```
# Get latency metrics
GET /api/metrics/latency?window=24h
# window options: 1h | 24h | 7d | 30d

# Get current thresholds
GET /api/admin/providers/thresholds
# Response: { "thresholds": { "anthropic": { "latency_ms": 500, "error_rate_pct": 5.0 }, ... } }

# Update thresholds
PUT /api/admin/providers/thresholds
{
  "thresholds": {
    "anthropic": { "latency_ms": 800, "error_rate_pct": 3.0 },
    "openai": { "latency_ms": 600, "error_rate_pct": 5.0 }
  }
}
```
Cost Routing
Navigate to Admin → Routing → Cost to see a breakdown of spend by provider and identify cost optimization opportunities.
Period selector
Use the 24h / 7d / 30d tab selector to change the analysis period. Data is sourced from the usage analytics store.
Provider cost summary
The provider cost table aggregates all model-level usage into per-provider totals:
| Column | Description |
|---|---|
| Provider | AI provider name |
| Total Cost | Sum of all request costs for this provider in the selected period (USD, 4 decimal places) |
| Total Tokens | Total input + output token count |
| Cost / 1K tokens | Effective cost rate per 1,000 tokens (total_cost ÷ total_tokens × 1,000) |
The table is sorted by total cost descending — your highest-spend provider appears first.
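The aggregation behind this table can be sketched as below. The input row shape (`provider`, `model`, `cost_usd`, `input_tokens`, `output_tokens`) and the model name `example-model` are hypothetical; only the formula total_cost ÷ total_tokens × 1,000 and the descending sort come from the description above.

```python
from collections import defaultdict


def provider_cost_summary(usage_rows: list) -> list:
    """Roll model-level usage rows up into per-provider totals, highest spend first."""
    totals = defaultdict(lambda: {"cost": 0.0, "tokens": 0})
    for row in usage_rows:
        entry = totals[row["provider"]]
        entry["cost"] += row["cost_usd"]
        entry["tokens"] += row["input_tokens"] + row["output_tokens"]

    table = []
    for provider, entry in totals.items():
        tokens = entry["tokens"]
        # Effective rate per 1,000 tokens; guard against providers with no tokens.
        per_1k = entry["cost"] / tokens * 1000 if tokens else 0.0
        table.append({
            "provider": provider,
            "total_cost": round(entry["cost"], 4),
            "total_tokens": tokens,
            "cost_per_1k_tokens": per_1k,
        })
    table.sort(key=lambda r: r["total_cost"], reverse=True)
    return table


usage = [
    {"provider": "openai", "model": "example-model", "cost_usd": 0.125,
     "input_tokens": 20000, "output_tokens": 5000},
    {"provider": "anthropic", "model": "claude-sonnet-4-6", "cost_usd": 0.5,
     "input_tokens": 30000, "output_tokens": 10000},
    {"provider": "anthropic", "model": "claude-sonnet-4-6", "cost_usd": 0.25,
     "input_tokens": 8000, "output_tokens": 2000},
]
table = provider_cost_summary(usage)
for row in table:
    print(f'{row["provider"]}: ${row["total_cost"]:.4f} total, '
          f'{row["total_tokens"]} tokens, ${row["cost_per_1k_tokens"]:.4f} per 1K')
```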
Cheapest provider per model
Below the summary table, the Cheapest Provider per Model section identifies the most cost-efficient provider for each model that received traffic in the period. Each row shows:
- Model identifier
- Cheapest: {provider} badge
- Cost per token for that provider/model combination
Use this view to inform routing rule configuration. For example, if claude-sonnet-4-6 is cheapest via anthropic, create a routing rule to set route_to_provider: anthropic for workloads where cost is the priority.
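One plausible way to derive this view from usage data is to compute cost per token for each model and provider pair and keep the minimum per model. The row shape, the function name, and the second provider `example-cloud` are hypothetical; the actual computation in Arbitex is not documented here.

```python
def cheapest_provider_per_model(usage_rows: list) -> dict:
    """Return {model: (provider, cost_per_token)} for the cheapest provider of each model."""
    totals = {}  # (model, provider) -> [total cost, total tokens]
    for row in usage_rows:
        pair = totals.setdefault((row["model"], row["provider"]), [0.0, 0])
        pair[0] += row["cost_usd"]
        pair[1] += row["tokens"]

    cheapest = {}
    for (model, provider), (cost, tokens) in totals.items():
        if tokens == 0:
            continue  # no traffic, no meaningful rate
        per_token = cost / tokens
        if model not in cheapest or per_token < cheapest[model][1]:
            cheapest[model] = (provider, per_token)
    return cheapest


rows = [
    {"model": "claude-sonnet-4-6", "provider": "anthropic", "cost_usd": 0.30, "tokens": 100000},
    {"model": "claude-sonnet-4-6", "provider": "example-cloud", "cost_usd": 0.36, "tokens": 100000},
]
cheapest = cheapest_provider_per_model(rows)
print(cheapest["claude-sonnet-4-6"][0])
```

A result like this is what would justify a rule with `route_to_provider: anthropic` for cost-sensitive workloads.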
Cost weight optimization
When a routing rule uses cost_weight (0.0–1.0), Arbitex blends cost and quality scoring when selecting a model within the matched tier:
- 0.0: select the highest-quality model regardless of cost
- 0.5: balance cost and quality equally
- 1.0: select the lowest-cost model regardless of quality
Pair cost_weight with the Cost Routing view to track whether the blended selection is reducing your total spend over time.
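The exact scoring formula is not documented here. As an illustration only, a linear blend that reproduces the three behaviors listed above might look like this; the quality scores, the per-1K costs, the model names, and the function `select_model` are all hypothetical.

```python
def select_model(candidates: list, cost_weight: float) -> dict:
    """Pick the candidate with the best blended score (assumed linear blend)."""
    if not 0.0 <= cost_weight <= 1.0:
        raise ValueError("cost_weight must be between 0.0 and 1.0")
    costs = [c["cost_per_1k"] for c in candidates]
    lowest, highest = min(costs), max(costs)

    def score(candidate: dict) -> float:
        # Normalize cost to a 0..1 "cheapness" value; 1.0 is the cheapest candidate.
        if highest == lowest:
            cheapness = 1.0
        else:
            cheapness = (highest - candidate["cost_per_1k"]) / (highest - lowest)
        return (1.0 - cost_weight) * candidate["quality"] + cost_weight * cheapness

    return max(candidates, key=score)


tier = [
    {"model": "model-a", "quality": 0.9, "cost_per_1k": 15.0},
    {"model": "model-b", "quality": 0.7, "cost_per_1k": 3.0},
    {"model": "model-c", "quality": 0.5, "cost_per_1k": 1.0},
]
for weight in (0.0, 0.5, 1.0):
    print(weight, select_model(tier, weight)["model"])
```

With these made-up numbers, a weight of 0.0 picks the highest-quality model, 1.0 picks the cheapest, and 0.5 lands on the middle option.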
Routing Examples
Route legal team requests to the powerful tier
```
POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "group", "operator": "in", "value": "legal,compliance" }
  ],
  "actions": {
    "route_to_tier": "powerful"
  }
}
```
Cost-optimize non-critical API workloads
```
POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "user_role", "operator": "eq", "value": "api_service" }
  ],
  "actions": {
    "optimize": "cost",
    "cost_weight": 0.8
  }
}
```
Route after-hours requests to a specific provider
```
POST /api/admin/routing-rules/
{
  "conditions": [
    { "field": "time_of_day", "operator": "not_in", "value": "09:00-17:00" }
  ],
  "actions": {
    "route_to_provider": "openai"
  }
}
```
See also
- Policy Engine Administration: ROUTE_TO action in policy rules
- Provider Management: adding and configuring AI providers
- Billing and Metering: usage cost reporting
- Cost Routing Card reference: prior routing configuration guide