Portal operations
The portal provides read-only views of model availability, routing configuration, and provider health. This guide explains what each view contains and how to interpret it. Actions that modify configuration — adding models, changing fallback chains, adjusting thresholds — require admin access.
Model catalog
The model catalog lists every LLM available in your organization. To view it, navigate to Models.
What the catalog shows
Each row in the catalog displays:
| Field | Description |
|---|---|
| Model name | Display name (e.g., Claude 3.5 Sonnet) |
| Model ID | Identifier used in API requests (e.g., claude-3-5-sonnet-20241022) |
| Provider | The provider hosting this model (Anthropic, OpenAI, Google, Ollama, Mistral, Cohere) |
| Status | Active or inactive. Inactive models are not available for new requests. |
| Capabilities | Icons indicating streaming support, vision (image input), and function calling |
| Cost | Price per 1M input and output tokens, where available |
Provider badges
Each model is labeled with its provider. The badge color identifies the provider at a glance. Models from providers that are not configured with valid credentials may be visible in the catalog but will fail when routed.
Filtering the catalog
Use the provider filter tabs above the table to show only models from a specific provider. The search box filters by model name or model ID. These filters are local to your session and do not affect routing configuration.
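The filtering behavior can be pictured with a minimal sketch. It is not the portal's implementation: the `CatalogRow` type, the `filter_catalog` function, and the case-insensitive substring match are assumptions made for illustration, and the second sample row is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CatalogRow:
    name: str       # display name, e.g. "Claude 3.5 Sonnet"
    model_id: str   # identifier used in API requests
    provider: str   # e.g. "Anthropic"


def filter_catalog(rows: List[CatalogRow],
                   provider: Optional[str] = None,
                   query: str = "") -> List[CatalogRow]:
    """Return the rows that match the provider tab and the search text.

    Assumption: the search is a case-insensitive substring match on the
    model name or model ID. The portal's real matching rules may differ.
    """
    needle = query.strip().lower()
    result = []
    for row in rows:
        if provider is not None and row.provider != provider:
            continue
        if needle and needle not in row.name.lower() \
                and needle not in row.model_id.lower():
            continue
        result.append(row)
    # A new list is returned; the input is untouched, mirroring the fact
    # that filters are local and do not change configuration.
    return result


rows = [
    CatalogRow("Claude 3.5 Sonnet", "claude-3-5-sonnet-20241022", "Anthropic"),
    CatalogRow("Example Model", "example-model-1", "OpenAI"),  # hypothetical row
]
sonnet_rows = filter_catalog(rows, query="sonnet")
```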
What requires admin access
- Enabling or disabling models (changing the active/inactive status)
- Adding new models to the catalog via Discover New Models
- Bulk enabling or disabling models
- Syncing the catalog against provider APIs
If a model you need is inactive, contact your org admin to enable it.
Routing and fallback chains
The routing view shows how the gateway orders alternative models when a primary fails. Navigate to Routing → Fallback Chains.
Reading the fallback chain view
Models are grouped by provider. Expand a model row to see its fallback chain: the ordered list of alternative models the gateway tries if the primary is unavailable.
Each fallback entry shows:
| Field | Description |
|---|---|
| Priority | The order in which fallbacks are attempted. Priority 1 is tried first. |
| Model | The fallback model name and ID |
| Provider | The provider hosting the fallback model |
An empty fallback chain means requests to that model will fail immediately if the primary provider is unavailable. There is no automatic alternative.
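As a rough sketch of how priority ordering and an empty chain play out, the snippet below walks a chain in priority order. The `FallbackEntry` type, the `pick_fallback` function, and the `is_available` callback are hypothetical names, and skipping unavailable providers is an assumption about gateway behavior rather than something this guide confirms.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class FallbackEntry:
    priority: int   # 1 is tried first
    model_id: str
    provider: str


def pick_fallback(chain: List[FallbackEntry],
                  is_available: Callable[[str], bool]) -> Optional[FallbackEntry]:
    """Return the first usable entry in priority order, or None.

    None corresponds to "no automatic alternative": either the chain is
    empty or (by assumption) every fallback provider is unavailable.
    """
    for entry in sorted(chain, key=lambda e: e.priority):
        if is_available(entry.provider):
            return entry
    return None


# Hypothetical chain, deliberately listed out of order.
chain = [
    FallbackEntry(priority=2, model_id="model-b", provider="ProviderB"),
    FallbackEntry(priority=1, model_id="model-a", provider="ProviderA"),
]
first_choice = pick_fallback(chain, lambda provider: True)
```

With an empty list, `pick_fallback` returns `None`, which is the case where the request fails immediately.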
When fallback routing activates
The gateway routes to a fallback when:
- The primary provider’s circuit breaker is open (3 or more consecutive failures).
- The primary provider returns a 5xx response.
- The primary provider’s p95 latency exceeds the configured threshold.
Fallback routing happens automatically; you do not need to take any action to trigger it. If requests consistently return errors, check the provider health view to see whether a circuit breaker is open.
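The three activation conditions can be expressed as one check. This is an illustrative sketch only: `PrimaryStatus`, `fallback_reason`, and the order in which conditions are evaluated are assumptions, and the 2000 ms threshold in the example is a made-up value standing in for whatever your admin has configured.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PrimaryStatus:
    circuit_open: bool               # circuit breaker state for the provider
    last_status_code: Optional[int]  # HTTP status of the latest response, if any
    p95_latency_ms: float            # observed p95 latency


def fallback_reason(status: PrimaryStatus,
                    p95_threshold_ms: float) -> Optional[str]:
    """Return why a fallback would be used, or None to stay on the primary."""
    if status.circuit_open:
        return "circuit_open"
    code = status.last_status_code
    if code is not None and 500 <= code <= 599:
        return "server_error"
    # "Exceeds" is read as strictly greater than the threshold.
    if status.p95_latency_ms > p95_threshold_ms:
        return "latency"
    return None


# Hypothetical threshold of 2000 ms.
reason = fallback_reason(PrimaryStatus(False, 503, 120.0), p95_threshold_ms=2000.0)
```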
What requires admin access
- Adding or removing fallback models
- Reordering fallback priority
- Configuring fallback chains via the API
Provider health
The provider health view shows the current operational status of each configured provider. Navigate to Routing → Provider Health.
Health indicators
Each provider row displays:
| Field | Description |
|---|---|
| Health score | A value from 0.0 to 1.0. 1.0 means no failures in the current window. |
| Circuit breaker state | Closed (healthy), Open (failing), or Half-open (recovering) |
| Failure count | Consecutive failures since the last successful request |
| Failure rate | Proportion of failed requests in the current sliding window |
| Available | Whether the provider is currently accepting requests |
Circuit breaker states
| State | What it means |
|---|---|
| Closed | Provider is healthy. All requests route normally. |
| Open | Provider has failed 3 or more consecutive health checks. New requests are directed to fallback models automatically. |
| Half-open | The gateway is testing recovery with a single probe request. If it succeeds, the circuit closes and normal routing resumes. |
Recovery is automatic. You do not need to manually reset a circuit breaker. After a provider recovers and passes 5 consecutive health checks, it re-enters normal routing rotation.
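The state transitions can be modeled as a small state machine. This is a simplified sketch under stated assumptions, not the gateway's code: the `CircuitBreaker` class and its method names are hypothetical, what triggers a probe (for example a cooldown timer) is not described in this guide, and the 5 consecutive health checks required before rejoining rotation are left out.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3) -> None:
        self.failure_threshold = failure_threshold  # 3 per this guide
        self.state = State.CLOSED
        self.failure_count = 0  # consecutive failures

    def record_failure(self) -> None:
        self.failure_count += 1
        # A failed probe reopens the circuit; otherwise the circuit opens
        # once the consecutive-failure threshold is reached.
        if (self.state is State.HALF_OPEN
                or self.failure_count >= self.failure_threshold):
            self.state = State.OPEN

    def record_success(self) -> None:
        # Any success resets the consecutive-failure count.
        self.failure_count = 0
        # A successful probe closes the circuit.
        if self.state is State.HALF_OPEN:
            self.state = State.CLOSED

    def begin_probe(self) -> None:
        # Assumption: an external trigger decides when to test recovery.
        if self.state is State.OPEN:
            self.state = State.HALF_OPEN


breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
state_after_three_failures = breaker.state
```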
Interpreting health scores
- 1.0: no recent failures.
- 0.7–0.9: some failures in the window; the provider is likely still routing normally.
- Below 0.5: elevated failure rate; the circuit breaker may be close to opening or already open.
- 0.0: the provider is completely unavailable or has been offline for the entire window.
If the health score for a provider you depend on is consistently below 0.7, escalate to your admin to investigate.
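The bands above can be written as a lookup, shown here as a hedged sketch. The function names are hypothetical, scores that fall between the listed bands are reported as undescribed rather than guessed, and reading "consistently" as "every recent sample" is an assumption.

```python
from typing import Sequence


def describe_health(score: float) -> str:
    """Map a health score to the interpretation given in this guide."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("health score must be between 0.0 and 1.0")
    if score == 1.0:
        return "no recent failures"
    if score == 0.0:
        return "unavailable or offline for the entire window"
    if score < 0.5:
        return "elevated failure rate"
    if 0.7 <= score <= 0.9:
        return "some failures, likely routing normally"
    # The guide does not describe the remaining ranges.
    return "not described in this guide"


def should_escalate(recent_scores: Sequence[float],
                    threshold: float = 0.7) -> bool:
    """True when every recent sample is below the threshold.

    Assumption: "consistently below 0.7" means all sampled scores.
    """
    return len(recent_scores) > 0 and all(s < threshold for s in recent_scores)


escalate = should_escalate([0.62, 0.55, 0.48])
```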
Latency monitor
The latency view shows response time percentiles per model, aggregated over a selected time window. Navigate to Monitoring → Latency.
Reading the latency table
| Column | Description |
|---|---|
| p50 | Median response time. Half of requests complete faster than this. |
| p95 | 95th percentile. 95% of requests complete within this time. |
| p99 | 99th percentile. Useful for identifying tail latency. |
| Avg | Mean response time. |
| Requests | Total requests in the selected window. |
| Trend | Direction vs. the previous equivalent window (↑ degrading, ↓ improving, — stable). |
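To see how these columns relate, the sketch below computes them from a handful of made-up latency samples. It uses the nearest-rank percentile method as an assumption; the guide does not say which method the portal uses, so real values may differ slightly.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], p: int) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up response times in milliseconds, including one slow outlier.
latencies_ms = [120, 95, 180, 210, 150, 640, 130, 110, 170, 140]

p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)
avg = sum(latencies_ms) / len(latencies_ms)
```

In this sample the single 640 ms request pulls the mean well above the median and dominates p95 and p99, which is why the percentile columns are more informative about tail latency than Avg.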
Status colors
The p50 and p95 columns include a status indicator:
| Color | Threshold |
|---|---|
| Green | < 200 ms |
| Yellow | 200–499 ms |
| Red | ≥ 500 ms |
A red p95 on a model you use frequently may indicate that requests are slow even when they succeed. This can affect application response times regardless of whether the circuit breaker is open.
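The thresholds in the table translate directly into a classification function. The name `latency_status` is hypothetical, and treating fractional values between 499 and 500 ms as yellow is an assumption, since the table lists whole milliseconds only.

```python
def latency_status(latency_ms: float) -> str:
    """Return the status color for a p50 or p95 value in milliseconds."""
    if latency_ms < 0:
        raise ValueError("latency cannot be negative")
    if latency_ms < 200:
        return "green"
    if latency_ms < 500:
        return "yellow"
    return "red"


statuses = [latency_status(ms) for ms in (199, 200, 499, 500)]
```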
Time windows
Use the window buttons to change the aggregation period: 1 Hour, 24 Hours, 7 Days, or 30 Days. Click Refresh to reload data for the current window.
Read-only vs. admin access summary
| View / Action | Portal user | Admin |
|---|---|---|
| View model catalog | Yes | Yes |
| Enable or disable models | No | Yes |
| View fallback chain configuration | Yes | Yes |
| Add, remove, reorder fallback models | No | Yes |
| View provider health status | Yes | Yes |
| Reset circuit breakers manually | No | Yes |
| View latency metrics | Yes | Yes |
| Configure latency thresholds | No | Yes |
See also
- Model routing configuration: admin guide for configuring fallback chains and thresholds
- Provider management: admin guide for adding and configuring providers
- Policy Engine user guide: understanding ROUTE_TO policy rules