Portal operations

The portal provides read-only views of model availability, routing configuration, and provider health. This guide explains what each view contains and how to interpret it. Actions that modify configuration, such as adding models, changing fallback chains, or adjusting thresholds, require admin access.

Model catalog

The model catalog lists all models available in your organization. Navigate to Models to view it.

What the catalog shows

Each row in the catalog displays:

Field	Description
Model name	Display name (e.g., Claude 3.5 Sonnet)
Model ID	Identifier used in API requests (e.g., `claude-3-5-sonnet-20241022`)
Provider	The provider hosting this model (Anthropic, OpenAI, Google, Ollama, Mistral, Cohere)
Status	Active or inactive. Inactive models are not available for new requests.
Capabilities	Icons indicating streaming support, vision (image input), and function calling
Cost	Price per 1M input and output tokens, where available

Provider badges

Each model is labeled with its provider. The badge color identifies the provider at a glance. Models from providers that are not configured with valid credentials may be visible in the catalog but will fail when routed.

Filtering the catalog

Use the provider filter tabs above the table to show only models from a specific provider. The search box filters by model name or model ID. These filters are local to your session and do not affect routing configuration.
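
The filtering behavior described above can be sketched as a pure function over catalog rows. This is only an illustration: the `CatalogRow` type, the `filter_catalog` function, the second sample row, and the case-insensitive matching are assumptions, not the portal's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogRow:
    name: str       # display name
    model_id: str   # identifier used in API requests
    provider: str
    active: bool


def filter_catalog(rows, provider=None, query=""):
    """Return rows matching the selected provider tab and the search text.

    Matching is case-insensitive here, which is an assumption.
    """
    needle = query.strip().lower()
    return [
        row for row in rows
        if (provider is None or row.provider == provider)
        and (needle in row.name.lower() or needle in row.model_id.lower())
    ]


rows = [
    CatalogRow("Claude 3.5 Sonnet", "claude-3-5-sonnet-20241022", "Anthropic", True),
    # Hypothetical entry, for illustration only.
    CatalogRow("Example Model", "example-model-v1", "Ollama", False),
]

anthropic_only = filter_catalog(rows, provider="Anthropic")
by_search = filter_catalog(rows, query="sonnet")
```

The function returns a new list and never modifies `rows`, which mirrors the point that filters are local to the session and leave routing configuration untouched.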

What requires admin access

Enabling or disabling models (changing the active/inactive status)
Adding new models to the catalog via Discover New Models
Bulk enabling or disabling models
Syncing the catalog against provider APIs

If a model you need is inactive, contact your org admin to enable it.

Routing and fallback chains

The routing view shows how the gateway orders alternative models when a primary fails. Navigate to Routing → Fallback Chains.

Reading the fallback chain view

Models are grouped by provider. Expand a model row to see its fallback chain — the ordered list of alternative models the gateway will try if the primary is unavailable.

Each fallback entry shows:

Field	Description
Priority	The order in which fallbacks are attempted. Priority 1 is tried first.
Model	The fallback model name and ID
Provider	The provider hosting the fallback model

An empty fallback chain means requests to that model will fail immediately if the primary provider is unavailable. There is no automatic alternative.
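
As a rough sketch of how a priority-ordered chain resolves, the function below tries the primary first, then each fallback in ascending priority. The function name, the `(priority, model_id)` pair layout, the `is_available` callback, and the model IDs are hypothetical and chosen for illustration.

```python
def pick_model(primary, fallback_chain, is_available):
    """Return the first usable model ID, or None if nothing is available.

    fallback_chain: list of (priority, model_id) pairs; priority 1 is tried first.
    is_available: callable that takes a model ID and returns True or False.
    """
    if is_available(primary):
        return primary
    for _priority, model_id in sorted(fallback_chain):
        if is_available(model_id):
            return model_id
    return None


# Hypothetical model IDs; only "model-c" is reachable in this example.
reachable = {"model-c"}
chain = [(2, "model-c"), (1, "model-b")]

chosen = pick_model("model-a", chain, reachable.__contains__)
no_chain = pick_model("model-a", [], reachable.__contains__)
```

With an empty chain the function returns `None` as soon as the primary is unavailable, which corresponds to the immediate failure described above.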

When fallback routing activates

The gateway routes to a fallback when:

The primary provider’s circuit breaker is open (3 or more consecutive failures).
The primary provider returns a 5xx response.
The primary provider’s p95 latency exceeds the configured threshold.

Fallback routing is automatic; you do not need to do anything to trigger it. If requests consistently return errors, check the provider health view to see whether a circuit breaker is open.
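
The three trigger conditions can be expressed as a single check. The sketch below is an assumption-laden illustration: the `PrimaryOutcome` type, the order in which conditions are evaluated, and the 2000 ms threshold are all hypothetical, since the configured threshold is set by your admin.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrimaryOutcome:
    breaker_open: bool           # circuit breaker state for the primary provider
    status_code: Optional[int]   # HTTP status of the last response, if any
    p95_latency_ms: float        # observed p95 latency


def fallback_reason(outcome, p95_threshold_ms):
    """Return why the gateway would fall back, or None if it would not."""
    if outcome.breaker_open:
        return "circuit breaker open"
    if outcome.status_code is not None and 500 <= outcome.status_code <= 599:
        return "5xx response"
    if outcome.p95_latency_ms > p95_threshold_ms:
        return "p95 latency above threshold"
    return None


THRESHOLD_MS = 2000.0  # hypothetical value, not a documented default

slow = fallback_reason(PrimaryOutcome(False, 200, 2500.0), THRESHOLD_MS)
healthy = fallback_reason(PrimaryOutcome(False, 200, 100.0), THRESHOLD_MS)
```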

What requires admin access

Adding or removing fallback models
Reordering fallback priority
Configuring fallback chains via the API

Provider health

The provider health view shows the current operational status of each configured provider. Navigate to Routing → Provider Health.

Health indicators

Each provider row displays:

Field	Description
Health score	A value from 0.0 to 1.0, where 1.0 means no failures in the current window.
Circuit breaker state	Closed (healthy), Open (failing), or Half-open (recovering)
Failure count	Consecutive failures since the last successful request
Failure rate	Proportion of failed requests in the current sliding window
Available	Whether the provider is currently accepting requests

Circuit breaker states

State	What it means
Closed	Provider is healthy. All requests route normally.
Open	Provider has failed 3 or more consecutive health checks. New requests are directed to fallback models automatically.
Half-open	The gateway is testing recovery with a single probe request. If it succeeds, the circuit closes and normal routing resumes.

Recovery is automatic, so a manual circuit breaker reset is not normally needed (manual resets are admin-only). After a provider recovers and passes 5 consecutive health checks, it re-enters normal routing rotation.
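
The three states can be modeled as a small state machine. This is a minimal sketch, not the gateway's implementation: the class and method names are hypothetical, the trigger for moving from open to half-open is left to the caller because this guide does not describe it, and a failed probe is assumed to reopen the circuit. The 5 consecutive health checks mentioned above are not modeled.

```python
class CircuitBreaker:
    FAILURE_THRESHOLD = 3  # from this guide: 3 or more consecutive failures

    def __init__(self):
        self.state = "closed"
        self.failure_count = 0  # consecutive failures since the last success

    def record_failure(self):
        self.failure_count += 1
        # Assumption: a failed probe in the half-open state reopens the circuit.
        if self.state == "half-open" or self.failure_count >= self.FAILURE_THRESHOLD:
            self.state = "open"

    def record_success(self):
        self.failure_count = 0
        if self.state == "half-open":
            self.state = "closed"  # the probe succeeded, normal routing resumes

    def begin_probe(self):
        # Assumption: the caller decides when to test recovery.
        if self.state == "open":
            self.state = "half-open"


breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
state_after_failures = breaker.state

breaker.begin_probe()
state_during_probe = breaker.state

breaker.record_success()
state_after_probe = breaker.state
```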

Interpreting health scores

1.0: no recent failures.
0.7–0.9: some failures in the window; the provider is likely still routing normally.
Below 0.5: elevated failure rate; the circuit breaker may be about to open or may already be open.
0.0: the provider is completely unavailable or has been offline for the entire window.

If the health score for a provider you depend on is consistently below 0.7, escalate to your admin to investigate.
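
For scripted checks, the bands above can be encoded directly. The sketch below is illustrative only: the function names are hypothetical, scores that fall between the listed bands are reported as not covered rather than guessed, and "consistently" is interpreted as "every score in a recent sample", which is an assumption.

```python
def describe_health(score):
    """Map a health score to the band described in this guide."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("health score must be between 0.0 and 1.0")
    if score == 1.0:
        return "no recent failures"
    if score == 0.0:
        return "unavailable"
    if score < 0.5:
        return "elevated failure rate"
    if 0.7 <= score <= 0.9:
        return "some failures"
    # The guide does not describe scores in the remaining ranges.
    return "not covered by the guide"


def should_escalate(recent_scores, threshold=0.7):
    """True when every recent score is below the threshold."""
    return bool(recent_scores) and all(s < threshold for s in recent_scores)


band = describe_health(0.8)
escalate = should_escalate([0.6, 0.65, 0.55])
```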

Latency monitor

The latency view shows response time percentiles per model, aggregated over a selected time window. Navigate to Monitoring → Latency.

Reading the latency table

Column	Description
p50	Median response time. Half of requests complete faster than this.
p95	95th percentile. 95% of requests complete within this time.
p99	99th percentile. Useful for identifying tail latency.
Avg	Mean response time.
Requests	Total requests in the selected window.
Trend	Direction vs. the previous equivalent window (↑ degrading, ↓ improving, — stable).
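
To make the percentile columns concrete, the sketch below computes them from a list of response times using the nearest-rank method. The portal's actual aggregation method is not documented here, so treat the method, the function names, and the sample data as assumptions.

```python
import math


def percentile(samples_ms, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(samples_ms):
    """Build one row of the latency table from raw response times in milliseconds."""
    return {
        "p50": percentile(samples_ms, 50),
        "p95": percentile(samples_ms, 95),
        "p99": percentile(samples_ms, 99),
        "avg": sum(samples_ms) / len(samples_ms),
        "requests": len(samples_ms),
    }


# Synthetic data: 100 requests taking 1 ms, 2 ms, ..., 100 ms.
summary = summarize(list(range(1, 101)))
```

A large gap between p50 and p99 in such a summary indicates that a small share of requests is much slower than the typical one, which is what the p99 column is meant to surface.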

Status colors

The p50 and p95 columns include a status indicator:

Color	Threshold
Green	< 200 ms
Yellow	200–499 ms
Red	≥ 500 ms
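
The thresholds in the table translate into a simple lookup. The function name is hypothetical, and treating fractional values between 499 ms and 500 ms as yellow is an assumption, since the table lists whole milliseconds only.

```python
def latency_status(latency_ms):
    """Return the status color for a p50 or p95 value in milliseconds."""
    if latency_ms < 200:
        return "green"
    if latency_ms < 500:
        return "yellow"
    return "red"


statuses = [latency_status(ms) for ms in (199, 200, 499, 500)]
```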

A red p95 on a model you use frequently may indicate that requests are slow even when they succeed. This can affect application response times regardless of whether the circuit breaker is open.

Time windows

Use the window buttons to change the aggregation period: 1 Hour, 24 Hours, 7 Days, or 30 Days. Click Refresh to reload data for the current window.

Read-only vs. admin access summary

View / Action	Portal user	Admin
View model catalog	Yes	Yes
Enable or disable models	No	Yes
View fallback chain configuration	Yes	Yes
Add, remove, reorder fallback models	No	Yes
View provider health status	Yes	Yes
Reset circuit breakers manually	No	Yes
View latency metrics	Yes	Yes
Configure latency thresholds	No	Yes
