Quota Management

Arbitex quotas cap usage at the user or group level across three dimensions: token consumption, request count, and dollar cost. Null limits mean unlimited. Quotas are evaluated per-request and enforced with HTTP 429 responses when exceeded.

| Dimension | Fields | Reset cadence |
| --- | --- | --- |
| Token usage | daily_token_limit, monthly_token_limit | Daily: midnight UTC · Monthly: 1st of month UTC |
| Request count | daily_request_limit, monthly_request_limit | Daily: midnight UTC · Monthly: 1st of month UTC |
| Cost (USD) | daily_cost_limit_usd, monthly_cost_limit_usd | Daily: midnight UTC · Monthly: 1st of month UTC |

A null value for any field means that dimension is uncapped. To remove all limits, delete the quota record.

User quotas are personal caps that apply to a single user regardless of their group membership.

Group quotas are aggregate caps that apply to the combined usage of all users in the group. If the group has exhausted its monthly token allowance, all members receive 429 until the quota resets — even individual users with no personal quota.

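When a user has a personal quota and also belongs to a group with a quota, both are enforced. A request is rejected as soon as either limit is reached, so the more restrictive limit takes effect first.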


GET /api/admin/users/{user_id}/quota
Authorization: Bearer <admin-token>

Response 200 OK

{
"scope": "user",
"entity_id": "user-uuid-...",
"daily_token_limit": 100000,
"monthly_token_limit": 2000000,
"daily_request_limit": 500,
"monthly_request_limit": 10000,
"daily_cost_limit_usd": 5.00,
"monthly_cost_limit_usd": 50.00
}

Returns 404 if the user exists but has no quota configured (effectively unlimited).
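
Because a 404 here means "no quota configured" rather than a failure, a client may want to map it to an explicit "unlimited" value instead of raising. The following is a minimal sketch using only the Python standard library. The base URL, the function names, and the injectable `opener` parameter are assumptions made for illustration and are not part of the Arbitex API.

```python
import json
import urllib.error
import urllib.request


def build_quota_request(base_url, user_id, admin_token):
    """Build (but do not send) the GET request for a user's quota."""
    url = f"{base_url.rstrip('/')}/api/admin/users/{user_id}/quota"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {admin_token}"},
        method="GET",
    )


def fetch_user_quota(base_url, user_id, admin_token,
                     opener=urllib.request.urlopen):
    """Return the quota as a dict, or None when the API answers 404.

    None is this sketch's way of saying "no quota configured".
    Any other HTTP error is re-raised unchanged.
    """
    request = build_quota_request(base_url, user_id, admin_token)
    try:
        with opener(request) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise
```

A caller would then branch on the result, for example `quota = fetch_user_quota("https://arbitex.example.com", user_id, token)` followed by `if quota is None: ...` for the unlimited case.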


PUT /api/admin/users/{user_id}/quota
Authorization: Bearer <admin-token>
Content-Type: application/json

Upsert — creates the quota if it does not exist; replaces all fields if it does. All fields must be provided; use null for uncapped dimensions.

Example — token and cost limits only

{
"daily_token_limit": 100000,
"monthly_token_limit": 2000000,
"daily_request_limit": null,
"monthly_request_limit": null,
"daily_cost_limit_usd": 5.00,
"monthly_cost_limit_usd": 50.00
}

Response 200 OK — returns the updated quota.
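
Since the PUT replaces every field and requires all six to be present, it can help to build the body from a helper that fills unspecified dimensions with null. This is a sketch only; `QUOTA_FIELDS` and `build_quota_payload` are hypothetical names, not part of any Arbitex client library.

```python
import json

# The six fields listed in the quota dimensions table.
QUOTA_FIELDS = (
    "daily_token_limit",
    "monthly_token_limit",
    "daily_request_limit",
    "monthly_request_limit",
    "daily_cost_limit_usd",
    "monthly_cost_limit_usd",
)


def build_quota_payload(**limits):
    """Return a dict with all six fields; unspecified ones become None (null)."""
    unknown = set(limits) - set(QUOTA_FIELDS)
    if unknown:
        # Catch typos early instead of sending a field the API may reject.
        raise ValueError(f"unknown quota fields: {sorted(unknown)}")
    return {field: limits.get(field) for field in QUOTA_FIELDS}


# Same limits as the example above: token and cost caps, requests uncapped.
payload = build_quota_payload(
    daily_token_limit=100000,
    monthly_token_limit=2000000,
    daily_cost_limit_usd=5.00,
    monthly_cost_limit_usd=50.00,
)
body = json.dumps(payload)  # None is serialized as JSON null
```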


DELETE /api/admin/users/{user_id}/quota
Authorization: Bearer <admin-token>

Removes all quota limits for the user (reverts to unlimited). Returns 204 No Content.


GET /api/admin/groups/{group_id}/quota
Authorization: Bearer <admin-token>

Returns the aggregate quota for the group. Returns 404 if no quota is configured.


PUT /api/admin/groups/{group_id}/quota
Authorization: Bearer <admin-token>
Content-Type: application/json

Example — cap a contractor group

{
"daily_token_limit": 50000,
"monthly_token_limit": 500000,
"daily_request_limit": 200,
"monthly_request_limit": 4000,
"daily_cost_limit_usd": null,
"monthly_cost_limit_usd": 20.00
}

Response 200 OK


DELETE /api/admin/groups/{group_id}/quota
Authorization: Bearer <admin-token>

Removes all quota limits for the group. Returns 204 No Content.


For every incoming AI request:

  1. User quota check — current daily and monthly totals are compared against user limits.
  2. Group quota check — if the user belongs to a group with a quota, the group’s aggregate totals are checked.
  3. Forward — if both checks pass, the request is forwarded to the AI provider.
  4. Record usage — after the response, token usage and cost are recorded and totals updated.

The quota check happens synchronously before the request leaves the outpost, so quota-exceeded requests are never forwarded.
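
The four steps above can be sketched as a small in-memory model. This is not the Arbitex implementation: the dictionary layout, the function names, and the `used >= limit` comparison are assumptions, the last one chosen to match the sample 429 response in which usage equals the limit.

```python
from typing import Callable, Dict, Optional, Tuple

# Keyed by limit type, e.g. {"daily_token": 100000, "monthly_cost": None}.
Limits = Dict[str, Optional[float]]
Usage = Dict[str, float]  # running totals keyed the same way


def first_exhausted(limits: Limits, usage: Usage) -> Optional[str]:
    """Return the first limit type whose usage has reached its cap, if any."""
    for limit_type, limit in limits.items():
        # None means the dimension is uncapped and is skipped.
        if limit is not None and usage.get(limit_type, 0) >= limit:
            return limit_type
    return None


def handle_request(user_limits: Optional[Limits], user_usage: Usage,
                   group_limits: Optional[Limits], group_usage: Usage,
                   forward: Callable[[], Tuple[int, float]]):
    """Run the four steps and return (status, scope, limit_type)."""
    # Steps 1 and 2: user check first, then group check.
    for scope, limits, usage in (("user", user_limits, user_usage),
                                 ("group", group_limits, group_usage)):
        if limits is not None:
            hit = first_exhausted(limits, usage)
            if hit is not None:
                return 429, scope, hit  # rejected before forwarding
    # Step 3: forward to the provider (stubbed by the caller here).
    tokens, cost_usd = forward()
    # Step 4: record usage against both the user and the group totals.
    for usage in (user_usage, group_usage):
        for period in ("daily", "monthly"):
            usage[f"{period}_token"] = usage.get(f"{period}_token", 0) + tokens
            usage[f"{period}_request"] = usage.get(f"{period}_request", 0) + 1
            usage[f"{period}_cost"] = usage.get(f"{period}_cost", 0.0) + cost_usd
    return 200, None, None
```

In this sketch usage is recorded only after the response, so a request admitted just under a cap can push the total past it; the next request is then the one that receives the 429.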

When a quota is exceeded:

HTTP/1.1 429 Too Many Requests
X-RateLimit-Scope: user
X-RateLimit-Limit-Type: daily_token
X-RateLimit-Limit: 100000
X-RateLimit-Used: 100000
X-RateLimit-Reset: 2026-03-13T00:00:00Z
Retry-After: 36000

{
"detail": "daily token quota exceeded",
"limit": 100000,
"used": 100000,
"reset_at": "2026-03-13T00:00:00Z"
}

Response headers:

| Header | Description |
| --- | --- |
| X-RateLimit-Scope | "user" or "group" |
| X-RateLimit-Limit-Type | Which limit was exceeded (e.g. "daily_token", "monthly_cost") |
| X-RateLimit-Limit | The configured limit value |
| X-RateLimit-Used | Current usage in the period |
| X-RateLimit-Reset | ISO 8601 timestamp when the period resets |
| Retry-After | Seconds until the quota resets |
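
A client can turn these headers into a structured value before deciding how long to wait. The sketch below assumes the headers arrive as a plain dict with exactly the casing shown above; `QuotaExceeded` and `parse_quota_429` are hypothetical names. With a real HTTP library, a case-insensitive header mapping would be the safer input.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class QuotaExceeded:
    scope: str            # "user" or "group"
    limit_type: str       # e.g. "daily_token"
    limit: float
    used: float
    reset_at: datetime    # timezone-aware, UTC
    retry_after_seconds: int


def parse_quota_429(headers):
    """Parse the quota headers of a 429 response into a QuotaExceeded."""
    # Replace a trailing "Z" so fromisoformat also works on older Pythons.
    reset = headers["X-RateLimit-Reset"].replace("Z", "+00:00")
    return QuotaExceeded(
        scope=headers["X-RateLimit-Scope"],
        limit_type=headers["X-RateLimit-Limit-Type"],
        limit=float(headers["X-RateLimit-Limit"]),
        used=float(headers["X-RateLimit-Used"]),
        reset_at=datetime.fromisoformat(reset),
        retry_after_seconds=int(headers["Retry-After"]),
    )
```

Using `retry_after_seconds` for the wait avoids depending on the client's own clock being in sync with the server.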

Requests that succeed also receive quota usage headers so clients can proactively throttle:

X-RateLimit-Daily-Tokens-Remaining: 62400
X-RateLimit-Monthly-Tokens-Remaining: 1850000
X-RateLimit-Daily-Cost-Remaining-USD: 2.34
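
One way to use these headers is to compare them with an estimate of the next request before sending it. The sketch below is illustrative: the function names and the idea of a per-request token estimate are assumptions, and a missing header is treated as "no information" because the source does not say which headers are sent for uncapped dimensions.

```python
# Header names as shown in the example above.
REMAINING_HEADERS = {
    "daily_tokens": "X-RateLimit-Daily-Tokens-Remaining",
    "monthly_tokens": "X-RateLimit-Monthly-Tokens-Remaining",
    "daily_cost_usd": "X-RateLimit-Daily-Cost-Remaining-USD",
}


def remaining_budget(headers):
    """Return remaining amounts as floats; absent headers map to None."""
    budget = {}
    for key, name in REMAINING_HEADERS.items():
        raw = headers.get(name)
        budget[key] = float(raw) if raw is not None else None
    return budget


def should_defer(headers, estimated_tokens, estimated_cost_usd=0.0):
    """True if the next request is estimated to exceed a reported remainder."""
    budget = remaining_budget(headers)
    checks = (
        (budget["daily_tokens"], estimated_tokens),
        (budget["monthly_tokens"], estimated_tokens),
        (budget["daily_cost_usd"], estimated_cost_usd),
    )
    # A None remainder is skipped rather than treated as zero.
    return any(rem is not None and need > rem for rem, need in checks)
```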

| Period | Resets at |
| --- | --- |
| Daily | Midnight UTC (00:00:00 UTC) |
| Monthly | First day of the next calendar month, midnight UTC |

Resets are computed server-side based on UTC wall-clock time. There is no manual reset endpoint — to clear a user’s usage mid-period, contact platform operations or temporarily raise the limit.
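
The reset rules can be reproduced on the client, for example to display "resets in N hours". This is a sketch of the stated rules, not server code; the function names are hypothetical, and both functions expect a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now):
    """Next midnight UTC strictly after `now` (timezone-aware)."""
    now = now.astimezone(timezone.utc)
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def next_monthly_reset(now):
    """Midnight UTC on the first day of the next calendar month."""
    now = now.astimezone(timezone.utc)
    if now.month == 12:
        year, month = now.year + 1, 1  # December rolls over to January
    else:
        year, month = now.year, now.month + 1
    return datetime(year, month, 1, tzinfo=timezone.utc)


# A request at 14:00 UTC on 12 March 2026 is 36000 seconds before the daily reset.
now = datetime(2026, 3, 12, 14, 0, tzinfo=timezone.utc)
seconds_to_daily_reset = int((next_daily_reset(now) - now).total_seconds())
```

The 36000-second result lines up with the `Retry-After` value and `X-RateLimit-Reset` timestamp in the sample 429 response, assuming that response was generated at 14:00 UTC.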


Assign quotas to groups at role boundaries rather than to individual users to reduce management overhead:

| Group | monthly_token_limit | monthly_cost_limit_usd |
| --- | --- | --- |
| Free tier | 500,000 | $5 |
| Professional | 5,000,000 | $50 |
| Enterprise | null (unlimited) | null (unlimited) |

For contractors or temporary project accounts:

  1. Create a dedicated group (e.g. contractors-q1-2026)
  2. Assign a tight monthly cost cap reflecting the project budget
  3. Add members
  4. At project end, remove members and delete the group quota

Use group monthly_cost_limit_usd as a budget guardrail when a team has a fixed AI budget. Individual users within the group can have personal token limits for fine-grained control, but the group cap ensures the aggregate cost never exceeds budget.