
Kill switch

The kill switch provides immediate, manual control to disable a specific provider, model, or (provider, model) pair. When activated, the gateway blocks all requests to the disabled entry — no traffic reaches the provider, and the fallback chain skips the entry automatically. The kill switch is designed for emergency response: a provider security incident, a model producing harmful output, or an operational issue that requires instant traffic cutoff.


What happens when the kill switch is activated

  1. All requests targeting the disabled (provider, model) pair are blocked immediately with a 503 provider_unavailable error
  2. Fallback chains skip the disabled entry — if gpt-4o is kill-switched and a fallback chain lists it as a secondary entry, the gateway proceeds to the next entry in the chain
  3. The kill switch state is persisted to the database and survives gateway restarts
  4. The audit log records the activation with the identity of the admin who triggered it, the reason string, and a timestamp

The kill switch does not affect requests already in flight. Requests that have been forwarded to the provider before the switch is activated complete normally.


POST https://api.arbitex.ai/api/admin/kill-switch
Authorization: Bearer arb_live_your-api-key-here
Content-Type: application/json

{
  "provider": "openai",
  "model_id": "gpt-4o",
  "enabled": false,
  "reason": "Provider security incident — disabling pending investigation"
}

| Field | Required | Description |
| --- | --- | --- |
| provider | Yes | Provider identifier (e.g., openai, anthropic) |
| model_id | No | Model identifier. If omitted, the kill switch applies to all models for the provider. |
| enabled | Yes | Set to false to disable the entry. Set to true to re-enable. |
| reason | Yes | A human-readable reason recorded in the audit log. Every kill switch activation must document why. |
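
The request above can also be assembled from a script. The following is a minimal sketch using only the Python standard library. The helper name `build_kill_switch_request` and its client-side checks are assumptions made for illustration; only the URL, headers, and field names come from the example and table above. The response format is not documented here, so the sketch builds the request without sending it.

```python
import json
import urllib.request

# Endpoint taken from the example request above.
KILL_SWITCH_URL = "https://api.arbitex.ai/api/admin/kill-switch"


def build_kill_switch_request(api_key, provider, enabled, reason, model_id=None):
    """Build (but do not send) a kill switch request.

    Hypothetical helper: the local validation mirrors the "Required"
    column of the field table and is not part of the gateway itself.
    """
    if not provider:
        raise ValueError("provider is required")
    if not isinstance(enabled, bool):
        raise ValueError("enabled must be True or False")
    if not reason or not reason.strip():
        raise ValueError("reason is required")

    payload = {"provider": provider, "enabled": enabled, "reason": reason}
    if model_id is not None:
        # Leaving model_id out applies the switch to every model of the provider.
        payload["model_id"] = model_id

    return urllib.request.Request(
        KILL_SWITCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_kill_switch_request(
    api_key="arb_live_your-api-key-here",
    provider="openai",
    model_id="gpt-4o",
    enabled=False,
    reason="Provider security incident, disabling pending investigation",
)
# To actually send it: urllib.request.urlopen(request)
```

Rejecting an empty reason before the call keeps a scripted emergency procedure from failing on a missing required field.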

The admin portal provides a visual toggle on the provider management page (Settings > Providers > [Provider Name]). The toggle includes a confirmation dialog that requires a reason string before activation.


POST https://api.arbitex.ai/api/admin/kill-switch
Authorization: Bearer arb_live_your-api-key-here
Content-Type: application/json

{
  "provider": "openai",
  "model_id": "gpt-4o",
  "enabled": true,
  "reason": "Provider incident resolved — re-enabling after verification"
}

Re-enabling restores the entry to normal operation, subject to the health monitor. The health monitor state is evaluated independently: if the health monitor had previously disengaged the entry due to failures, re-enabling the entry through the kill switch does not override that state. The health monitor must independently verify the entry is healthy before the gateway routes live traffic to it.


The kill switch overrides fallback chain configuration. When an entry is kill-switched:

  • As a primary target: the gateway immediately falls through to the next entry in the fallback chain, as if the primary returned a 5xx error
  • As a fallback entry: the gateway skips the kill-switched entry and continues to the next fallback in the chain
  • As the only entry: the request returns 503 provider_unavailable with no further fallback attempts

Plan your fallback chains with the assumption that any single entry may be kill-switched at any time. A kill switch on a fallback chain entry does not generate an error — it is silently skipped.
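
The three cases above can be sketched as a small chain-resolution function. This is an illustration of the described behavior, not the gateway's implementation: the names `is_kill_switched` and `first_routable`, the tuple representation of entries, and the model name `model-b` are all hypothetical. It models only kill switch skipping and ignores provider failures and the health monitor.

```python
# Kill switch entries are modeled as (provider, model_id) tuples.
# A model_id of None stands for a provider-wide switch (assumed representation).


def is_kill_switched(entry, switches):
    """Return True if the entry is disabled directly or via its provider."""
    provider, _model_id = entry
    return entry in switches or (provider, None) in switches


def first_routable(chain, switches):
    """Walk the chain in order and return the first entry that is not disabled.

    Disabled entries are skipped without raising an error. None means the
    chain is exhausted, where the gateway would answer
    503 provider_unavailable.
    """
    for entry in chain:
        if is_kill_switched(entry, switches):
            continue
        return entry
    return None


chain = [("openai", "gpt-4o"), ("anthropic", "model-b")]

# Primary is kill-switched: the request falls through to the next entry.
print(first_routable(chain, {("openai", "gpt-4o")}))

# Only entry is kill-switched: nothing left to try.
print(first_routable([("openai", "gpt-4o")], {("openai", "gpt-4o")}))
```

Running a check like this against your configured chains is one way to spot chains that would return 503 if a single provider were disabled.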


The kill switch and the automatic health monitor serve different purposes:

| | Kill switch | Health monitor |
| --- | --- | --- |
| Trigger | Manual: admin action | Automatic: based on failure thresholds |
| Scope | Provider, model, or (provider, model) pair | Individual (provider, model) pair |
| Recovery | Manual: admin must re-enable | Automatic: test request after lockout period |
| Persistence | Survives restarts | Resets on restart |
| Use case | Emergency response, planned maintenance | Transient provider failures |

Both systems can be active simultaneously. If a (provider, model) pair is both kill-switched and health-monitor-disengaged, re-enabling it through the kill switch alone does not restore traffic; the health monitor must also return the pair to the active state.
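
One way to picture how the two systems combine is as two independent gates that must both be open. The sketch below is a simplified model under that assumption; the `EntryState` class and its field names are hypothetical and do not reflect the gateway's internal data structures.

```python
from dataclasses import dataclass


@dataclass
class EntryState:
    # Mirrors the API's "enabled" field: False means the entry is kill-switched.
    kill_switch_enabled: bool
    # True while the health monitor considers the entry active.
    health_monitor_active: bool


def receives_traffic(state):
    """An entry is routable only when both gates are open."""
    return state.kill_switch_enabled and state.health_monitor_active


# Kill-switched and also disengaged by the health monitor.
state = EntryState(kill_switch_enabled=False, health_monitor_active=False)

# The admin re-enables the entry; the health monitor has not recovered yet.
state.kill_switch_enabled = True
still_blocked = not receives_traffic(state)

# The health monitor's test request succeeds and the entry returns to active.
state.health_monitor_active = True
now_routable = receives_traffic(state)
```

In this model neither gate can open the other, which matches the rule that re-enabling through the kill switch never overrides the health monitor.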


When you need to disable a provider immediately:

  1. Activate the kill switch via the admin API or portal. Include a descriptive reason.
  2. Verify fallback chains are routing traffic to alternative providers. Check the audit log for 503 provider_unavailable errors — these indicate requests with no viable fallback.
  3. Notify affected teams that the provider is disabled and which fallback providers are handling traffic.
  4. Monitor the audit log for the disabled provider — the kill switch audit entry records the activation. Any subsequent requests to the disabled entry produce 503 audit entries with the kill switch reason.
  5. When the incident is resolved, check the health check endpoint to confirm the provider is healthy, then re-enable the provider via the kill switch API.

Kill switch activations and deactivations produce audit log entries with:

| Field | Description |
| --- | --- |
| action | kill_switch_activated or kill_switch_deactivated |
| provider | The affected provider |
| model_id | The affected model (or null for provider-wide switches) |
| user_id | The admin who triggered the switch |
| reason | The reason string provided at activation |
| timestamp | When the switch was toggled |

These entries are forwarded to your SIEM alongside request audit entries.
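
On the SIEM side, these fields are enough to reconstruct which entries are currently disabled. The sketch below replays kill switch entries in timestamp order. It assumes entries arrive as dictionaries keyed by the field names above and that timestamps are ISO 8601 strings in a single format, so they sort lexicographically; neither detail is specified in this page. The function name `currently_disabled` and the sample data are hypothetical.

```python
def currently_disabled(entries):
    """Replay kill switch audit entries and return what is still disabled.

    Returns a dict mapping (provider, model_id) to the activation reason.
    A model_id of None represents a provider-wide switch.
    """
    disabled = {}
    # Assumption: timestamps share one ISO 8601 format, so string order
    # equals chronological order.
    for entry in sorted(entries, key=lambda e: e["timestamp"]):
        action = entry.get("action")
        if action == "kill_switch_activated":
            disabled[(entry["provider"], entry["model_id"])] = entry["reason"]
        elif action == "kill_switch_deactivated":
            disabled.pop((entry["provider"], entry["model_id"]), None)
        # Any other action (for example request audit entries) is ignored.
    return disabled


# Hypothetical sample entries, deliberately out of order.
sample = [
    {"action": "kill_switch_deactivated", "provider": "openai",
     "model_id": "gpt-4o", "user_id": "admin-1",
     "reason": "Provider incident resolved",
     "timestamp": "2024-01-01T12:00:00Z"},
    {"action": "kill_switch_activated", "provider": "openai",
     "model_id": "gpt-4o", "user_id": "admin-1",
     "reason": "Provider security incident",
     "timestamp": "2024-01-01T10:00:00Z"},
    {"action": "kill_switch_activated", "provider": "anthropic",
     "model_id": None, "user_id": "admin-2",
     "reason": "Planned maintenance",
     "timestamp": "2024-01-01T11:00:00Z"},
]

still_disabled = currently_disabled(sample)
```

Keying on (provider, model_id) keeps a provider-wide switch distinct from a switch on a single model of the same provider.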


  • Routing — Fallback chains, health monitoring, and provider management
  • Audit Log — How kill switch events are recorded