Portal operations

The portal provides read-only views of model availability, routing configuration, and provider health. This guide explains what each view contains and how to interpret it. Actions that modify configuration — adding models, changing fallback chains, adjusting thresholds — require admin access.


The model catalog lists all LLM models available in your organization. Navigate to Models to view it.

Each row in the catalog displays:

| Field | Description |
| --- | --- |
| Model name | Display name (e.g., Claude 3.5 Sonnet) |
| Model ID | Identifier used in API requests (e.g., claude-3-5-sonnet-20241022) |
| Provider | The provider hosting this model (Anthropic, OpenAI, Google, Ollama, Mistral, Cohere) |
| Status | Active or inactive. Inactive models are not available for new requests. |
| Capabilities | Icons indicating streaming support, vision (image input), and function calling |
| Cost | Price per 1M input and output tokens, where available |

Each model is labeled with its provider. The badge color identifies the provider at a glance. Models from providers that are not configured with valid credentials may be visible in the catalog but will fail when routed.

Use the provider filter tabs above the table to show only models from a specific provider. The search box filters by model name or model ID. These filters are local to your session and do not affect routing configuration.

The following catalog actions require admin access:

  • Enabling or disabling models (changing the active/inactive status)
  • Adding new models to the catalog via Discover New Models
  • Bulk enabling or disabling models
  • Syncing the catalog against provider APIs

If a model you need is inactive, contact your org admin to enable it.


The routing view shows how the gateway orders alternative models when a primary fails. Navigate to Routing → Fallback Chains.

Models are grouped by provider. Expand a model row to see its fallback chain — the ordered list of alternative models the gateway will try if the primary is unavailable.

Each fallback entry shows:

| Field | Description |
| --- | --- |
| Priority | The order in which fallbacks are attempted. Priority 1 is tried first. |
| Model | The fallback model name and ID |
| Provider | The provider hosting the fallback model |

An empty fallback chain means requests to that model will fail immediately if the primary provider is unavailable. There is no automatic alternative.
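One way to picture a fallback chain is as a list sorted by priority. The sketch below is illustrative only: `FallbackEntry`, `candidates_in_order`, and the model and provider names are hypothetical, and skipping providers marked unavailable is an assumption rather than documented gateway behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FallbackEntry:
    priority: int   # priority 1 is tried first
    model_id: str   # hypothetical model ID
    provider: str   # hypothetical provider name


def candidates_in_order(chain, unavailable_providers):
    """Return fallback model IDs in the order they would be attempted.

    Assumption: entries whose provider is currently unavailable are skipped.
    """
    ordered = sorted(chain, key=lambda entry: entry.priority)
    return [e.model_id for e in ordered if e.provider not in unavailable_providers]


chain = [
    FallbackEntry(priority=2, model_id="model-c", provider="provider-z"),
    FallbackEntry(priority=1, model_id="model-b", provider="provider-y"),
]

print(candidates_in_order(chain, set()))
print(candidates_in_order([], set()))  # empty chain: nothing to try
```

An empty result corresponds to the case described above: with no fallback candidates, the request has no automatic alternative.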

The gateway routes to a fallback when:

  • The primary provider’s circuit breaker is open (3 or more consecutive failures).
  • The primary provider returns a 5xx response.
  • The primary provider’s p95 latency exceeds the configured threshold.

Fallback routing happens automatically; you do not need to take any action to trigger it. If requests consistently return errors, check the provider health view to see whether a circuit breaker is open.
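The three fallback conditions listed above can be read as a single predicate. The following is a minimal sketch, not the gateway's implementation: `ProviderSnapshot`, `should_fallback`, and the 2000 ms threshold in the usage lines are hypothetical, since the actual latency threshold is whatever an admin has configured.

```python
from dataclasses import dataclass
from typing import Optional

CIRCUIT_OPEN_AFTER = 3  # from the guide: 3 or more consecutive failures


@dataclass
class ProviderSnapshot:
    consecutive_failures: int
    last_status_code: Optional[int]  # None if no response was received
    p95_latency_ms: float


def should_fallback(snapshot, p95_threshold_ms):
    """Return True if any of the three documented fallback conditions holds."""
    circuit_open = snapshot.consecutive_failures >= CIRCUIT_OPEN_AFTER
    server_error = (
        snapshot.last_status_code is not None
        and 500 <= snapshot.last_status_code <= 599
    )
    too_slow = snapshot.p95_latency_ms > p95_threshold_ms
    return circuit_open or server_error or too_slow


healthy = ProviderSnapshot(consecutive_failures=0, last_status_code=200, p95_latency_ms=350.0)
failing = ProviderSnapshot(consecutive_failures=0, last_status_code=503, p95_latency_ms=350.0)

print(should_fallback(healthy, p95_threshold_ms=2000.0))
print(should_fallback(failing, p95_threshold_ms=2000.0))
```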

The following routing actions require admin access:

  • Adding or removing fallback models
  • Reordering fallback priority
  • Configuring fallback chains via the API

The provider health view shows the current operational status of each configured provider. Navigate to Routing → Provider Health.

Each provider row displays:

| Field | Description |
| --- | --- |
| Health score | A value from 0.0 to 1.0. 1.0 means no failures in the current window. |
| Circuit breaker state | Closed (healthy), Open (failing), or Half-open (recovering) |
| Failure count | Consecutive failures since the last successful request |
| Failure rate | Proportion of failed requests in the current sliding window |
| Available | Whether the provider is currently accepting requests |

| State | What it means |
| --- | --- |
| Closed | Provider is healthy. All requests route normally. |
| Open | Provider has failed 3 or more consecutive health checks. New requests are directed to fallback models automatically. |
| Half-open | The gateway is testing recovery with a single probe request. If it succeeds, the circuit closes and normal routing resumes. |

Recovery is automatic. You do not need to manually reset a circuit breaker. After a provider recovers and passes 5 consecutive health checks, it re-enters normal routing rotation.
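The state transitions described above can be sketched as a small state machine. This is a simplified illustration under stated assumptions, not the gateway's code: the class and method names are hypothetical, the guide does not say when an open circuit moves to half-open (here the caller triggers it with `start_probe`), a failed probe is assumed to reopen the circuit, and the 5-health-check re-entry rule is not modeled.

```python
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitBreaker:
    FAILURE_THRESHOLD = 3  # from the guide: 3 or more consecutive failures

    def __init__(self):
        self.state = State.CLOSED
        self.failure_count = 0  # consecutive failures since the last success

    def record_failure(self):
        self.failure_count += 1
        # Assumption: a failed probe in half-open reopens the circuit at once.
        if self.state is State.HALF_OPEN or self.failure_count >= self.FAILURE_THRESHOLD:
            self.state = State.OPEN

    def record_success(self):
        # A success (including a successful probe) closes the circuit.
        self.failure_count = 0
        self.state = State.CLOSED

    def start_probe(self):
        # Assumption: something outside this class decides when to probe.
        if self.state is State.OPEN:
            self.state = State.HALF_OPEN


breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
print(breaker.state)   # open after three consecutive failures
breaker.start_probe()
breaker.record_success()
print(breaker.state)   # closed again after a successful probe
```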

Use these ranges as a rough guide when reading the health score:

  • 1.0: no recent failures.
  • 0.7–0.9: some failures in the window; the provider is likely still routing normally.
  • Below 0.5: elevated failure rate; the circuit breaker may be approaching open or is already open.
  • 0.0: the provider is completely unavailable or has been offline for the entire window.

If the health score for a provider you depend on is consistently below 0.7, escalate to your admin to investigate.
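If you record health scores over time, the escalation rule above could be checked with something like the sketch below. The guide does not define "consistently", so treating it as "each of the last `min_samples` readings is below 0.7" is an assumption, and `should_escalate` is a hypothetical helper rather than a portal feature.

```python
ESCALATION_THRESHOLD = 0.7  # from the guide


def should_escalate(scores, min_samples=3):
    """Return True if each of the most recent `min_samples` scores is below 0.7.

    Assumption: "consistently below" means the last few readings in a row.
    """
    if len(scores) < min_samples:
        return False  # not enough history to call it consistent
    recent = scores[-min_samples:]
    return all(score < ESCALATION_THRESHOLD for score in recent)


print(should_escalate([0.95, 0.65, 0.6, 0.55]))
print(should_escalate([0.6, 0.9, 0.6]))
```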


The latency view shows response time percentiles per model, aggregated over a selected time window. Navigate to Monitoring → Latency.

| Column | Description |
| --- | --- |
| p50 | Median response time. Half of requests complete faster than this. |
| p95 | 95th percentile. 95% of requests complete within this time. |
| p99 | 99th percentile. Useful for identifying tail latency. |
| Avg | Mean response time. |
| Requests | Total requests in the selected window. |
| Trend | Direction vs. the previous equivalent window (↑ degrading, ↓ improving, — stable). |
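To make the percentile columns concrete, the sketch below computes p50, p95, p99, and the mean from a list of response times using the nearest-rank method. The portal's actual aggregation method is not documented here, so this is only one common way to compute percentiles; the `percentile` helper and the sample values are made up for illustration.

```python
import math


def percentile(samples_ms, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds for one model in one window.
latencies = [120, 95, 310, 150, 101, 99, 870, 130, 140, 110]

print("p50:", percentile(latencies, 50))
print("p95:", percentile(latencies, 95))
print("p99:", percentile(latencies, 99))
print("avg:", sum(latencies) / len(latencies))
```

In this sample a single slow request (870 ms) sets both p95 and p99 while the median stays at 120 ms, which is why the tail percentiles are worth watching separately from p50 and the average.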

The p50 and p95 columns include a status indicator:

| Color | Threshold |
| --- | --- |
| Green | < 200 ms |
| Yellow | 200–499 ms |
| Red | ≥ 500 ms |

A red p95 on a model you use frequently may indicate that requests are slow even when they succeed. This can affect application response times regardless of whether the circuit breaker is open.
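The color thresholds map directly onto a small function. This is a sketch of the table's rule, with a hypothetical function name; treating fractional values between 499 and 500 ms as yellow is an assumption, since the table lists whole milliseconds only.

```python
def latency_status(latency_ms):
    """Map a p50 or p95 value in milliseconds to the indicator color in the table."""
    if latency_ms < 200:
        return "green"
    if latency_ms < 500:  # assumption: 499.x ms still counts as yellow
        return "yellow"
    return "red"


for value in (150, 200, 499, 500):
    print(value, latency_status(value))
```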

Use the window buttons to change the aggregation period: 1 Hour, 24 Hours, 7 Days, or 30 Days. Click Refresh to reload data for the current window.


| View / Action | Portal user | Admin |
| --- | --- | --- |
| View model catalog | Yes | Yes |
| Enable or disable models | No | Yes |
| View fallback chain configuration | Yes | Yes |
| Add, remove, reorder fallback models | No | Yes |
| View provider health status | Yes | Yes |
| Reset circuit breakers manually | No | Yes |
| View latency metrics | Yes | Yes |
| Configure latency thresholds | No | Yes |
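If you need to mirror this access matrix in a script or internal tool, it can be expressed as a simple lookup. The action keys and role names below are hypothetical labels invented for this sketch; they are not identifiers from the portal or its API.

```python
# Which roles may perform each action, transcribed from the table above.
# Keys and role names are illustrative, not real portal identifiers.
PERMISSIONS = {
    "view_model_catalog": {"portal_user", "admin"},
    "enable_disable_models": {"admin"},
    "view_fallback_chains": {"portal_user", "admin"},
    "edit_fallback_chains": {"admin"},
    "view_provider_health": {"portal_user", "admin"},
    "reset_circuit_breakers": {"admin"},
    "view_latency_metrics": {"portal_user", "admin"},
    "configure_latency_thresholds": {"admin"},
}


def is_allowed(role, action):
    """Return True if the role may perform the action; unknown actions are denied."""
    return role in PERMISSIONS.get(action, set())


print(is_allowed("portal_user", "view_latency_metrics"))
print(is_allowed("portal_user", "reset_circuit_breakers"))
```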