Routing Rules API

The Routing Rules API manages how Arbitex Gateway selects providers and models for each request. Routing rules apply at the org level and override the gateway’s default routing behavior.

All endpoints require admin authentication.

Base URL: https://api.arbitex.ai

Authentication: Authorization: Bearer $ARBITEX_API_KEY


interface RoutingRule {
  id: string; // UUID
  name: string;
  priority: number; // Higher = evaluated first (1–1000)
  enabled: boolean;
  strategy: RoutingStrategy;
  conditions: RoutingCondition[];
  fallback_chain: FallbackChain[];
  cost_cap?: CostCap;
  model_filter?: ModelFilter;
  created_at: string;
  updated_at: string;
}

type RoutingStrategy =
  | "round_robin" // Distribute requests across providers in rotation
  | "least_cost" // Always route to cheapest enabled provider for the model
  | "lowest_latency" // Route to provider with lowest measured p50 latency
  | "primary_with_fallback" // Use primary; fall through to fallbacks on error
  | "weighted" // Route by weight (sum of weights = 100)
  | "sticky_session"; // Route same user to same provider within session

interface FallbackChain {
  provider_id: string;
  model_id: string;
  weight?: number; // Required for "weighted" strategy (0–100)
  priority?: number; // Required for "primary_with_fallback" (1 = primary)
}

interface RoutingCondition {
  field: "user.groups" | "model.id" | "request.source" | "time.hour_of_day" | "user.id";
  operator: "in" | "not_in" | "equals" | "contains" | "between";
  value: string | string[] | number[];
}

interface CostCap {
  max_input_cost_per_1k: number; // Maximum $/1k input tokens
  max_output_cost_per_1k: number; // Maximum $/1k output tokens
  action: "skip" | "block"; // Skip provider or block request if over cap
}

interface ModelFilter {
  allowed_models: string[]; // If set, only these model IDs are routed
  min_context_window?: number; // Minimum context window size
  require_capabilities?: string[]; // e.g., ["vision", "function_calling"]
}
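
The constraints noted in the interface comments (priority from 1 to 1000, weights required and summing to 100 for the weighted strategy, a priority on every entry for primary_with_fallback) can be checked on the client before a rule is submitted. The following is a minimal sketch under those assumptions; validateRuleDraft, RuleDraft, and the other type names are hypothetical and not part of the API, and the gateway's own validation remains authoritative.

```typescript
// Hypothetical client-side pre-flight check; type names are local to this sketch.
type DraftStrategy =
  | "round_robin" | "least_cost" | "lowest_latency"
  | "primary_with_fallback" | "weighted" | "sticky_session";

interface DraftChainEntry {
  provider_id: string;
  model_id: string;
  weight?: number;
  priority?: number;
}

interface RuleDraft {
  name: string;
  priority: number;
  strategy: DraftStrategy;
  fallback_chain: DraftChainEntry[];
}

// Returns human-readable problems; an empty array means none were found.
function validateRuleDraft(draft: RuleDraft): string[] {
  const problems: string[] = [];
  if (draft.priority < 1 || draft.priority > 1000) {
    problems.push("priority must be between 1 and 1000");
  }
  if (draft.strategy === "weighted") {
    let sum = 0;
    for (const entry of draft.fallback_chain) {
      if (entry.weight === undefined || entry.weight < 0 || entry.weight > 100) {
        problems.push(`${entry.provider_id}: weight must be set, from 0 to 100`);
      } else {
        sum += entry.weight;
      }
    }
    if (sum !== 100) problems.push(`weights sum to ${sum}, expected 100`);
  }
  if (draft.strategy === "primary_with_fallback") {
    for (const entry of draft.fallback_chain) {
      if (entry.priority === undefined) {
        problems.push(`${entry.provider_id}: priority is required`);
      }
    }
  }
  return problems;
}
```

Returning a list of problems rather than throwing on the first one lets a caller report everything wrong with a draft in a single pass.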

GET /api/admin/routing/rules
Authorization: Bearer $ARBITEX_API_KEY

Query parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| enabled | boolean | Filter by enabled/disabled state |
| strategy | string | Filter by routing strategy |
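
As an illustration of how these filters combine with the base URL, here is a small sketch of a URL builder. buildListRulesUrl and ListRulesQuery are hypothetical names, and serializing the boolean as the literal strings true and false is an assumption about the expected query format.

```typescript
// Hypothetical helper; the boolean encoding ("true"/"false") is an assumption.
interface ListRulesQuery {
  enabled?: boolean;
  strategy?: string;
}

const ARBITEX_BASE_URL = "https://api.arbitex.ai";

function buildListRulesUrl(query: ListRulesQuery = {}): string {
  const parts: string[] = [];
  if (query.enabled !== undefined) {
    parts.push(`enabled=${query.enabled}`);
  }
  if (query.strategy !== undefined) {
    parts.push(`strategy=${encodeURIComponent(query.strategy)}`);
  }
  const suffix = parts.length > 0 ? `?${parts.join("&")}` : "";
  return `${ARBITEX_BASE_URL}/api/admin/routing/rules${suffix}`;
}
```

The resulting string can be passed to any HTTP client together with the Authorization header shown above.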

Response 200 OK:

{
"rules": [
{
"id": "rule_abc123",
"name": "Cost-optimized for contractors",
"priority": 500,
"enabled": true,
"strategy": "least_cost",
"conditions": [
{
"field": "user.groups",
"operator": "contains",
"value": "contractors"
}
],
"fallback_chain": [
{ "provider_id": "groq", "model_id": "llama-3-70b" },
{ "provider_id": "anthropic", "model_id": "claude-3-haiku-20240307" }
],
"cost_cap": {
"max_input_cost_per_1k": 0.01,
"max_output_cost_per_1k": 0.03,
"action": "skip"
},
"created_at": "2026-03-01T00:00:00Z",
"updated_at": "2026-03-12T00:00:00Z"
}
],
"total": 3
}

GET /api/admin/routing/rules/{rule_id}
Authorization: Bearer $ARBITEX_API_KEY

Response 200 OK: Single RoutingRule object. Returns 404 if not found.


POST /api/admin/routing/rules
Authorization: Bearer $ARBITEX_API_KEY
Content-Type: application/json

primary_with_fallback: routes to the primary provider, then falls through to the fallbacks in priority order on a 5xx response or a timeout.

{
"name": "GPT-4o with Anthropic fallback",
"priority": 700,
"strategy": "primary_with_fallback",
"conditions": [],
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o", "priority": 1 },
{ "provider_id": "anthropic", "model_id": "claude-3-5-sonnet-20241022", "priority": 2 },
{ "provider_id": "google_gemini", "model_id": "gemini-1.5-pro", "priority": 3 }
]
}
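
The fall-through behaviour can be pictured with a short sketch. This is not the gateway's implementation; routeWithFallback, AttemptOutcome, and the attempt callback are hypothetical names, and treating any non-5xx status as final is an assumption drawn from the description above.

```typescript
interface PrioritizedEntry {
  provider_id: string;
  model_id: string;
  priority: number; // 1 = primary
}

// Outcome of one attempt: an HTTP status code, or "timeout".
type AttemptOutcome = number | "timeout";

function shouldFallThrough(outcome: AttemptOutcome): boolean {
  return outcome === "timeout" || (outcome >= 500 && outcome <= 599);
}

// Tries entries in ascending priority order and returns the first entry
// whose attempt does not end in a 5xx or timeout; null if all of them fail.
function routeWithFallback(
  chain: PrioritizedEntry[],
  attempt: (entry: PrioritizedEntry) => AttemptOutcome
): PrioritizedEntry | null {
  const ordered = chain.slice().sort((a, b) => a.priority - b.priority);
  for (const entry of ordered) {
    if (!shouldFallThrough(attempt(entry))) {
      return entry;
    }
  }
  return null;
}
```

Sorting a copy of the chain keeps the caller's array untouched and makes the result independent of the order in which entries were listed.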

least_cost: evaluates all providers in the fallback chain and routes to the one with the lowest cost per token for the requested model class.

{
"name": "Always use cheapest provider",
"priority": 300,
"strategy": "least_cost",
"conditions": [
{
"field": "user.groups",
"operator": "contains",
"value": "cost-sensitive"
}
],
"fallback_chain": [
{ "provider_id": "groq", "model_id": "llama-3-70b" },
{ "provider_id": "mistral", "model_id": "mistral-large" },
{ "provider_id": "anthropic", "model_id": "claude-3-haiku-20240307" }
],
"cost_cap": {
"max_input_cost_per_1k": 0.005,
"max_output_cost_per_1k": 0.015,
"action": "block"
}
}
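
How a cost cap might interact with least-cost selection is sketched below. Several points are assumptions rather than documented behaviour: providers are ranked by the sum of input and output price, "skip" removes over-cap providers before ranking, and "block" rejects the request when the cheapest provider exceeds the cap. pickLeastCost, PricedProvider, and the prices used with it are hypothetical.

```typescript
interface PricedProvider {
  provider_id: string;
  input_cost_per_1k: number;
  output_cost_per_1k: number;
}

interface CostCapSketch {
  max_input_cost_per_1k: number;
  max_output_cost_per_1k: number;
  action: "skip" | "block";
}

type CostDecision =
  | { kind: "route"; provider_id: string }
  | { kind: "blocked" }
  | { kind: "no_candidate" };

function isOverCap(p: PricedProvider, cap: CostCapSketch): boolean {
  return (
    p.input_cost_per_1k > cap.max_input_cost_per_1k ||
    p.output_cost_per_1k > cap.max_output_cost_per_1k
  );
}

// Assumption: "cheapest" means the lowest input + output price per 1k tokens.
function combinedCost(p: PricedProvider): number {
  return p.input_cost_per_1k + p.output_cost_per_1k;
}

function pickLeastCost(candidates: PricedProvider[], cap?: CostCapSketch): CostDecision {
  const pool: PricedProvider[] = [];
  for (const p of candidates) {
    const skipped = cap !== undefined && cap.action === "skip" && isOverCap(p, cap);
    if (!skipped) pool.push(p);
  }
  if (pool.length === 0) return { kind: "no_candidate" };
  let cheapest = pool[0];
  for (const p of pool) {
    if (combinedCost(p) < combinedCost(cheapest)) cheapest = p;
  }
  if (cap !== undefined && cap.action === "block" && isOverCap(cheapest, cap)) {
    return { kind: "blocked" };
  }
  return { kind: "route", provider_id: cheapest.provider_id };
}
```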

lowest_latency: routes to the provider with the lowest measured p50 latency over the past 5 minutes. Latency is measured from the gateway’s perspective, so it includes provider network overhead.

{
"name": "Lowest latency for realtime features",
"priority": 600,
"strategy": "lowest_latency",
"conditions": [
{
"field": "user.groups",
"operator": "contains",
"value": "realtime-feature-users"
}
],
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o-mini" },
{ "provider_id": "groq", "model_id": "llama-3-8b" },
{ "provider_id": "anthropic", "model_id": "claude-3-haiku-20240307" }
]
}
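
A p50 over a five-minute window can be computed as the median of the recent samples for each provider. The sketch below shows one way to do that; LatencySample, pickLowestLatency, and the sample layout are hypothetical, and how the gateway actually stores or aggregates latency is not documented here.

```typescript
interface LatencySample {
  provider_id: string;
  latency_ms: number;
  observed_at_ms: number; // epoch milliseconds
}

const LATENCY_WINDOW_MS = 5 * 60 * 1000;

// Median of a non-empty list of numbers (the p50).
function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Returns the provider with the lowest p50 among samples from the last
// five minutes, or null when no recent samples exist.
function pickLowestLatency(samples: LatencySample[], nowMs: number): string | null {
  const recent: Record<string, number[]> = {};
  for (const s of samples) {
    if (nowMs - s.observed_at_ms > LATENCY_WINDOW_MS) continue;
    if (recent[s.provider_id] === undefined) recent[s.provider_id] = [];
    recent[s.provider_id].push(s.latency_ms);
  }
  let best: string | null = null;
  let bestP50 = Infinity;
  for (const providerId of Object.keys(recent)) {
    const p50 = median(recent[providerId]);
    if (p50 < bestP50) {
      bestP50 = p50;
      best = providerId;
    }
  }
  return best;
}
```

Using the median rather than the mean keeps a single slow outlier from pushing an otherwise fast provider out of first place.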

weighted: distributes requests across providers according to configured weights. Weights must sum to exactly 100.

{
"name": "A/B test: 80% OpenAI, 20% Anthropic",
"priority": 400,
"strategy": "weighted",
"conditions": [],
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o", "weight": 80 },
{ "provider_id": "anthropic", "model_id": "claude-3-5-sonnet-20241022", "weight": 20 }
]
}
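
Weighted selection is commonly implemented by walking the chain with a cumulative sum and a random roll. The sketch below follows that approach; pickWeighted and the roll parameter are hypothetical, and the gateway's actual selection method is not specified in this document.

```typescript
interface WeightedEntry {
  provider_id: string;
  model_id: string;
  weight: number; // 0-100
}

// Picks an entry for a roll in the range [0, 100).
// Throws when the weights do not sum to exactly 100.
function pickWeighted(chain: WeightedEntry[], roll: number): WeightedEntry {
  const total = chain.reduce((sum, entry) => sum + entry.weight, 0);
  if (total !== 100) {
    throw new Error(`weights must sum to 100, got ${total}`);
  }
  let cumulative = 0;
  for (const entry of chain) {
    cumulative += entry.weight;
    if (roll < cumulative) return entry;
  }
  // Only reached if roll is 100 or more; fall back to the last entry.
  return chain[chain.length - 1];
}
```

In practice the roll would come from something like Math.random() * 100; taking it as a parameter keeps the function deterministic and easy to test.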

sticky_session: routes the same user to the same provider for the duration of their session, as long as the provider is healthy. Useful for conversation continuity.

{
"name": "Sticky session for chat interface",
"priority": 200,
"strategy": "sticky_session",
"conditions": [
{
"field": "request.source",
"operator": "equals",
"value": "chat-ui"
}
],
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o" },
{ "provider_id": "anthropic", "model_id": "claude-3-5-sonnet-20241022" }
]
}
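
One way to picture stickiness is a map from a session key to the provider it was last assigned. The sketch below is illustrative only: createStickyRouter, the sessionKey, and the isHealthy callback are hypothetical, and reassigning to the first healthy provider in chain order when the current one is unhealthy is an assumption.

```typescript
// Hypothetical sketch; how the gateway identifies sessions is not documented here.
function createStickyRouter(providerIds: string[]) {
  const assignments = new Map<string, string>();

  return function route(
    sessionKey: string,
    isHealthy: (providerId: string) => boolean
  ): string | null {
    // Keep the existing assignment while its provider stays healthy.
    const current = assignments.get(sessionKey);
    if (current !== undefined && isHealthy(current)) {
      return current;
    }
    // Otherwise assign the first healthy provider in chain order.
    for (const providerId of providerIds) {
      if (isHealthy(providerId)) {
        assignments.set(sessionKey, providerId);
        return providerId;
      }
    }
    return null;
  };
}
```

Note that in this sketch a session that was moved off an unhealthy provider stays on its new provider even after the original recovers, which preserves continuity for the rest of the conversation.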

round_robin: distributes requests evenly across all providers in the fallback chain, in rotation.

{
"name": "Round robin for batch processing",
"priority": 100,
"strategy": "round_robin",
"conditions": [
{
"field": "user.groups",
"operator": "contains",
"value": "batch-jobs"
}
],
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o" },
{ "provider_id": "anthropic", "model_id": "claude-3-5-sonnet-20241022" },
{ "provider_id": "google_gemini", "model_id": "gemini-1.5-pro" }
]
}
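
Rotation over the chain amounts to a counter that wraps around. A minimal sketch, assuming a single in-process counter (createRoundRobin is a hypothetical name, and a real gateway would need to coordinate the counter across instances):

```typescript
// Returns a function that yields chain entries in rotation.
function createRoundRobin<T>(chain: T[]): () => T {
  if (chain.length === 0) {
    throw new Error("fallback_chain must not be empty");
  }
  let next = 0;
  return () => {
    const entry = chain[next];
    next = (next + 1) % chain.length;
    return entry;
  };
}
```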

Response 201 Created: RoutingRule object with assigned id.

Errors:

| Code | Detail |
| --- | --- |
| 400 | Missing required fields, invalid strategy, or weights don’t sum to 100 |
| 409 | A rule with the same name already exists |
| 422 | Referenced provider or model not configured for this org |
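
A client can map these status codes to the corrective action each one implies. The sketch below is one possible mapping; remedyForCreateError and the remedy labels are hypothetical names chosen for illustration.

```typescript
type CreateRuleRemedy =
  | "fix_payload" // 400: correct missing fields, strategy, or weights
  | "rename_rule" // 409: choose a different rule name
  | "configure_provider" // 422: configure the provider or model for the org
  | "unexpected";

function remedyForCreateError(statusCode: number): CreateRuleRemedy {
  switch (statusCode) {
    case 400:
      return "fix_payload";
    case 409:
      return "rename_rule";
    case 422:
      return "configure_provider";
    default:
      return "unexpected";
  }
}
```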

Full replacement of a routing rule.

PUT /api/admin/routing/rules/{rule_id}
Authorization: Bearer $ARBITEX_API_KEY
Content-Type: application/json

Request body: same structure as Create. Returns updated RoutingRule object.


Update specific fields without replacing the full rule.

PATCH /api/admin/routing/rules/{rule_id}
Authorization: Bearer $ARBITEX_API_KEY
Content-Type: application/json

{
"enabled": false
}

Supported patch fields: name, priority, enabled, strategy, conditions, fallback_chain, cost_cap, model_filter.
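
When a patch is derived from a full rule object, read-only fields such as id, created_at, and updated_at should not be sent. The sketch below keeps only the supported fields; buildPatchBody is a hypothetical helper, and whether the API rejects or ignores unsupported fields is not stated in this document.

```typescript
const PATCHABLE_FIELDS = [
  "name", "priority", "enabled", "strategy",
  "conditions", "fallback_chain", "cost_cap", "model_filter",
] as const;

// Copies only the supported patch fields from the given changes.
function buildPatchBody(changes: Record<string, unknown>): Record<string, unknown> {
  const body: Record<string, unknown> = {};
  for (const field of PATCHABLE_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(changes, field)) {
      body[field] = changes[field];
    }
  }
  return body;
}
```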

Response 200 OK: Updated RoutingRule object.


DELETE /api/admin/routing/rules/{rule_id}
Authorization: Bearer $ARBITEX_API_KEY

Response 204 No Content.

Deleting a rule does not affect in-flight requests that were already routed by the rule.


Evaluates which provider would be selected for a hypothetical request, given the current rule set and provider health states. Useful for testing routing configuration without sending real traffic.

POST /api/admin/routing/simulate
Authorization: Bearer $ARBITEX_API_KEY
Content-Type: application/json

{
"user_groups": ["contractors", "data-science"],
"model_id": "gpt-4o",
"request_source": "api"
}

Request body:

| Field | Type | Description |
| --- | --- | --- |
| user_groups | string[] | Groups to evaluate conditions against |
| model_id | string | Requested model ID |
| request_source | string | Source tag (used in request.source conditions) |
| user_id | string | Optional; used for sticky_session simulation |

Response 200 OK:

{
"selected_provider": "groq",
"selected_model": "llama-3-70b",
"matching_rule": {
"id": "rule_abc123",
"name": "Cost-optimized for contractors",
"strategy": "least_cost"
},
"evaluated_rules": [
{
"id": "rule_xyz789",
"name": "GPT-4o with Anthropic fallback",
"matched": false,
"reason": "Condition 'user.groups contains engineering' not satisfied"
},
{
"id": "rule_abc123",
"name": "Cost-optimized for contractors",
"matched": true,
"strategy_result": {
"evaluated_providers": [
{ "provider_id": "groq", "cost_per_1k": 0.0009, "selected": true },
{ "provider_id": "anthropic", "cost_per_1k": 0.003, "selected": false }
]
}
}
]
}
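
The evaluated_rules array is the most useful part of the response when debugging why a rule did or did not apply. The sketch below turns a simulation result into a readable trace. The types are written from the sample response above and model only the fields it shows; explainSimulation is a hypothetical helper.

```typescript
interface EvaluatedRule {
  id: string;
  name: string;
  matched: boolean;
  reason?: string;
}

interface SimulationResult {
  selected_provider: string;
  selected_model: string;
  matching_rule: { id: string; name: string; strategy: string };
  evaluated_rules: EvaluatedRule[];
}

// Produces one line per evaluated rule, followed by the final selection.
function explainSimulation(result: SimulationResult): string[] {
  const lines = result.evaluated_rules.map((rule) =>
    rule.matched
      ? `match ${rule.name}`
      : `skip ${rule.name}: ${rule.reason !== undefined ? rule.reason : "no reason given"}`
  );
  lines.push(
    `selected ${result.selected_provider}/${result.selected_model} via ${result.matching_rule.strategy}`
  );
  return lines;
}
```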

Returns routing distribution metrics for the most recent period (hour, day, or week), broken down by provider and by rule.

GET /api/admin/routing/analytics
Authorization: Bearer $ARBITEX_API_KEY

Query parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| period | day | hour / day / week |
| rule_id | (none) | Filter to a specific rule |

Response 200 OK:

{
"period": "day",
"from": "2026-03-11T00:00:00Z",
"to": "2026-03-12T00:00:00Z",
"by_provider": [
{ "provider_id": "openai", "request_count": 12450, "pct": 62.3 },
{ "provider_id": "anthropic", "request_count": 5200, "pct": 26.0 },
{ "provider_id": "groq", "request_count": 2300, "pct": 11.5 },
{ "provider_id": "aws_bedrock", "request_count": 50, "pct": 0.2 }
],
"by_rule": [
{ "rule_id": "rule_abc123", "rule_name": "Cost-optimized for contractors", "request_count": 4200 },
{ "rule_id": "rule_xyz789", "rule_name": "GPT-4o with Anthropic fallback", "request_count": 15800 }
],
"fallback_activations": 127,
"fallback_rate_pct": 0.6
}
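
The sample numbers are consistent with fallback_rate_pct being fallback_activations divided by the total request count and rounded to one decimal place: the by_provider counts sum to 20000, and 127 / 20000 is about 0.6%. That relationship is inferred from the sample, not documented. A sketch of the calculation, with hypothetical helper names:

```typescript
interface ProviderTraffic {
  provider_id: string;
  request_count: number;
}

function totalRequests(byProvider: ProviderTraffic[]): number {
  return byProvider.reduce((sum, p) => sum + p.request_count, 0);
}

// Percentage rounded to one decimal place; 0 when there was no traffic.
function fallbackRatePct(fallbackActivations: number, total: number): number {
  if (total === 0) return 0;
  return Math.round((fallbackActivations * 1000) / total) / 10;
}
```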

Returns the org’s default routing configuration, which is used when no routing rule matches.

GET /api/admin/routing/default
Authorization: Bearer $ARBITEX_API_KEY

Response 200 OK:

{
"strategy": "primary_with_fallback",
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o", "priority": 1 }
],
"updated_at": "2026-03-01T00:00:00Z"
}

PUT /api/admin/routing/default
Authorization: Bearer $ARBITEX_API_KEY
Content-Type: application/json

{
"strategy": "primary_with_fallback",
"fallback_chain": [
{ "provider_id": "openai", "model_id": "gpt-4o", "priority": 1 },
{ "provider_id": "anthropic", "model_id": "claude-3-5-sonnet-20241022", "priority": 2 }
]
}

The default config is applied to all requests that do not match any routing rule. At least one fallback chain entry is required.


| Field | Operators | Example value |
| --- | --- | --- |
| user.groups | in, not_in, contains | "contractors" / ["eng", "ml"] |
| user.id | in, equals | "user_abc123" |
| model.id | in, not_in, equals | "gpt-4o" / ["gpt-4o", "gpt-4o-mini"] |
| request.source | equals, in | "chat-ui" / ["api", "sdk"] |
| time.hour_of_day | between | [9, 17] (9 AM–5 PM UTC) |

Multiple conditions are evaluated as logical AND. To express OR, create multiple rules at the same priority.
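
The AND semantics can be sketched as a predicate over a request snapshot. The exact operator semantics are not spelled out in this document, so the sketch makes assumptions that are marked in the comments: equals, contains, and in are all treated as "the field shares at least one value with the condition", and between is inclusive at both ends. RequestContext, matchesCondition, and matchesAll are hypothetical names.

```typescript
type ConditionField =
  | "user.groups" | "model.id" | "request.source" | "time.hour_of_day" | "user.id";
type ConditionOperator = "in" | "not_in" | "equals" | "contains" | "between";

interface Condition {
  field: ConditionField;
  operator: ConditionOperator;
  value: string | string[] | number[];
}

// Hypothetical request snapshot; field names are illustrative.
interface RequestContext {
  user_groups: string[];
  user_id: string;
  model_id: string;
  request_source: string;
  hour_of_day: number; // 0-23, UTC
}

function matchesCondition(ctx: RequestContext, cond: Condition): boolean {
  const fieldValues: Record<ConditionField, string | string[] | number> = {
    "user.groups": ctx.user_groups,
    "user.id": ctx.user_id,
    "model.id": ctx.model_id,
    "request.source": ctx.request_source,
    "time.hour_of_day": ctx.hour_of_day,
  };
  const actual = fieldValues[cond.field];
  const actualItems: Array<string | number> = Array.isArray(actual) ? actual : [actual];
  const expected: Array<string | number> = Array.isArray(cond.value) ? cond.value : [cond.value];
  const overlaps = actualItems.some((item) => expected.indexOf(item) !== -1);
  switch (cond.operator) {
    case "not_in":
      return !overlaps;
    case "between":
      // Assumption: both bounds are inclusive.
      return (
        typeof actual === "number" &&
        actual >= Number(expected[0]) &&
        actual <= Number(expected[1])
      );
    default:
      // Assumption: "equals", "contains", and "in" reduce to an overlap test.
      return overlaps;
  }
}

// All conditions must hold (logical AND); an empty list matches every request.
function matchesAll(ctx: RequestContext, conditions: Condition[]): boolean {
  return conditions.every((cond) => matchesCondition(ctx, cond));
}
```

Under this reading, an OR between two groups is expressed as two rules at the same priority, each carrying one of the conditions, so that either rule can match.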