Kill switch

The kill switch provides immediate, manual control to disable a specific provider, model, or (provider, model) pair. When activated, the gateway blocks all requests to the disabled entry — no traffic reaches the provider, and the fallback chain skips the entry automatically. The kill switch is designed for emergency response: a provider security incident, a model producing harmful output, or an operational issue that requires instant traffic cutoff.

What happens when the kill switch is activated

All requests targeting the disabled (provider, model) pair are blocked immediately. If no fallback entry is available, the request fails with a 503 provider_unavailable error
Fallback chains skip the disabled entry — if gpt-4o is kill-switched and a fallback chain lists it as a secondary entry, the gateway proceeds to the next entry in the chain
The kill switch state is persisted to the database and survives gateway restarts
The audit log records the activation with the identity of the admin who triggered it, the reason string, and a timestamp

The kill switch does not affect requests already in flight. Requests that have been forwarded to the provider before the switch is activated complete normally.

Activating the kill switch

Via the admin API

POST https://api.arbitex.ai/api/admin/kill-switch
Authorization: Bearer arb_live_your-api-key-here
Content-Type: application/json

{
  "provider": "openai",
  "model_id": "gpt-4o",
  "enabled": false,
  "reason": "Provider security incident — disabling pending investigation"
}

Field	Required	Description
`provider`	Yes	Provider identifier (e.g., `openai`, `anthropic`)
`model_id`	No	Model identifier. If omitted, the kill switch applies to all models for the provider.
`enabled`	Yes	Set to `false` to disable the entry. Set to `true` to re-enable.
`reason`	Yes	A human-readable reason recorded in the audit log. Every kill switch change must document why it was made.
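
The request above can be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name `build_kill_switch_request` and its validation rules are illustrative assumptions, not part of the gateway or any official client. It builds the request without sending it, so it can be inspected or logged first.

```python
import json
import urllib.request

# Endpoint copied from the request example above.
KILL_SWITCH_URL = "https://api.arbitex.ai/api/admin/kill-switch"


def build_kill_switch_request(api_key, provider, enabled, reason, model_id=None):
    """Build (but do not send) a kill switch request.

    Hypothetical helper: it mirrors the field table above and rejects
    requests that are missing a required field.
    """
    if not provider:
        raise ValueError("provider is required")
    if not isinstance(enabled, bool):
        raise ValueError("enabled must be True or False")
    if not reason or not reason.strip():
        raise ValueError("reason is required")

    payload = {"provider": provider, "enabled": enabled, "reason": reason}
    if model_id is not None:
        # Leaving model_id out applies the switch to all models for the provider.
        payload["model_id"] = model_id

    return urllib.request.Request(
        KILL_SWITCH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to `urllib.request.urlopen`. The same helper covers re-enabling by passing `enabled=True`.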

Via the admin portal

The admin portal provides a visual toggle on the provider management page (Settings > Providers > [Provider Name]). The toggle includes a confirmation dialog that requires a reason string before activation.

Re-enabling a provider or model

POST https://api.arbitex.ai/api/admin/kill-switch
Authorization: Bearer arb_live_your-api-key-here
Content-Type: application/json

{
  "provider": "openai",
  "model_id": "gpt-4o",
  "enabled": true,
  "reason": "Provider incident resolved — re-enabling after verification"
}

Re-enabling restores the entry to normal operation, subject to the health monitor. The health monitor state is evaluated independently: if the health monitor had previously disengaged the entry due to failures, setting `enabled` to `true` does not override that state. The health monitor must independently verify that the entry is healthy before the gateway routes live traffic to it.

Fallback chain interaction

The kill switch overrides fallback chain configuration. When an entry is kill-switched:

As a primary target: the gateway immediately falls through to the next entry in the fallback chain, as if the primary returned a 5xx error
As a fallback entry: the gateway skips the kill-switched entry and continues to the next fallback in the chain
As the only entry: the request returns 503 provider_unavailable with no further fallback attempts

Plan your fallback chains with the assumption that any single entry may be kill-switched at any time. A kill-switched entry in a fallback chain does not generate an error of its own; the gateway skips it silently.
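
The three cases above can be sketched as a single selection loop. This is an illustrative model only, not the gateway's implementation: the function names, the representation of a chain as a list of (provider, model) tuples, and the convention of storing a provider-wide switch as `(provider, None)` are all assumptions.

```python
def is_kill_switched(entry, disabled):
    """Return True if the entry is disabled directly or by a provider-wide switch."""
    provider, model_id = entry
    # Assumption: a provider-wide switch is stored as (provider, None).
    return (provider, model_id) in disabled or (provider, None) in disabled


def resolve_target(chain, disabled):
    """Return the first entry in the chain that is not kill-switched.

    Returns None when every entry is disabled, which corresponds to the
    gateway answering 503 provider_unavailable.
    """
    for entry in chain:
        if is_kill_switched(entry, disabled):
            continue  # skipped silently, no error is raised for this entry
        return entry
    return None


# Hypothetical chain; the second and third model names are placeholders.
chain = [("openai", "gpt-4o"), ("anthropic", "example-model-a"), ("openai", "example-model-b")]
```

With `disabled = {("openai", "gpt-4o")}` the sketch selects the second entry. With `disabled = {("openai", None), ("anthropic", "example-model-a")}` every entry is skipped and the result is `None`.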

Kill switch vs health monitoring

The kill switch and the automatic health monitor serve different purposes:

	Kill switch	Health monitor
Trigger	Manual — admin action	Automatic — based on failure thresholds
Scope	Provider, model, or (provider, model) pair	Individual (provider, model) pair
Recovery	Manual — admin must re-enable	Automatic — test request after lockout period
Persistence	Survives restarts	Resets on restart
Use case	Emergency response, planned maintenance	Transient provider failures

Both systems can be active simultaneously. If a (provider, model) pair is both kill-switched and health-monitor-disengaged, re-enabling the entry through the kill switch does not restore traffic by itself; the health monitor must also return to the active state.
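
One way to picture the interaction is as two independent flags that must both allow traffic. The sketch below is a simplified model under that assumption; the class and field names are hypothetical and do not reflect how the gateway stores state.

```python
from dataclasses import dataclass


@dataclass
class EntryState:
    """Simplified state for one (provider, model) pair (hypothetical model)."""

    enabled: bool        # kill switch flag; False means kill-switched
    health_active: bool  # False while the health monitor has disengaged the entry

    def routable(self):
        # Traffic flows only when neither system is blocking the entry.
        return self.enabled and self.health_active
```

In this model, flipping `enabled` back to `True` leaves `routable()` false until `health_active` is also true, which matches the behavior described above.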

Emergency runbook

When you need to disable a provider immediately:

Activate the kill switch via the admin API or portal. Include a descriptive reason.
Verify fallback chains are routing traffic to alternative providers. Check the audit log for 503 provider_unavailable errors — these indicate requests with no viable fallback.
Notify affected teams that the provider is disabled and which fallback providers are handling traffic.
Monitor the audit log for the disabled provider — the kill switch audit entry records the activation. Any subsequent requests to the disabled entry produce 503 audit entries with the kill switch reason.
When the incident is resolved, check the health check endpoint to confirm the provider is healthy, then re-enable the provider via the kill switch API.

Audit log

Kill switch activations and deactivations produce audit log entries with:

Field	Description
`action`	`kill_switch_activated` or `kill_switch_deactivated`
`provider`	The affected provider
`model_id`	The affected model (or `null` for provider-wide switches)
`user_id`	The admin who triggered the switch
`reason`	The reason string provided with the activation or deactivation request
`timestamp`	When the switch was toggled

These entries are forwarded to your SIEM alongside request audit entries.
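
Because each toggle produces an entry, the current kill switch state can in principle be reconstructed by replaying the log. The sketch below assumes the entries are available as dictionaries with the fields listed in the table, and that `timestamp` is an ISO 8601 UTC string (so string order matches time order). Both are assumptions about the export format, and the function name is hypothetical.

```python
def current_kill_switches(entries):
    """Replay kill switch audit entries and return the switches still active.

    The result maps (provider, model_id) to the entry that activated the
    switch, so the reason and admin are available for each active switch.
    """
    active = {}
    # Assumption: ISO 8601 UTC timestamps sort chronologically as strings.
    for entry in sorted(entries, key=lambda e: e["timestamp"]):
        key = (entry["provider"], entry["model_id"])
        if entry["action"] == "kill_switch_activated":
            active[key] = entry
        elif entry["action"] == "kill_switch_deactivated":
            active.pop(key, None)
    return active
```

A provider-wide switch appears under the key `(provider, None)`, following the `null` convention for `model_id` in the table.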