Audit Data Model Reference

Every request processed by the Arbitex Gateway produces one or more audit log entries in the audit_logs table. This reference describes the complete schema, the cryptographic integrity mechanism, event types, OCSF class mapping, and the retention and archival model.

For the admin API used to search and export audit entries, see Audit log export.


| Field | Type | Nullable | In HMAC chain | Description |
|---|---|---|---|---|
| id | UUID | No | Yes | Primary key. Unique identifier for this audit entry. |
| user_id | UUID | Yes | Yes | Requesting user. null for system or unauthenticated events. SET NULL on user deletion; the audit record is preserved. |
| tenant_id | UUID | Yes | Yes | Organization (tenant) UUID. Scopes the entry to an org. null for platform-level events. |
| conversation_id | UUID | Yes | Yes | Related conversation UUID. null for non-conversation events. SET NULL on conversation deletion. |
| action | string(255) | No | Yes | Action type identifier. See Event types. |
| model_id | string(255) | Yes | Yes | LLM model identifier (e.g., gpt-4o, claude-3-5-sonnet-20241022). null for non-inference events. |
| provider | string(100) | Yes | Yes | Provider name (e.g., openai, anthropic). null for non-inference events. |
| prompt_text | text | Yes | Yes | Prompt text sent to the model. null when not applicable. Only present when prompt retention is enabled. |
| response_text | text | Yes | Yes | Response text received from the model. null when not applicable. |
| token_count_input | integer | Yes | Yes | Input token count for this inference. |
| token_count_output | integer | Yes | Yes | Output token count for this inference. |
| cost_estimate | decimal(10,6) | Yes | Yes | Estimated cost in USD. Precision: 6 decimal places. |
| latency_ms | integer | Yes | Yes | End-to-end request latency in milliseconds. |
| metadata | JSONB | Yes | Yes | Extension data. Stores framework reference (e.g., PCI-DSS Requirement 3.4), matched entity category, policy rule ID, and pack ID for compliance bundle matches. |
| hmac | string(64) | Yes |  | HMAC-SHA256 hex digest for this entry. null when HMAC chaining is disabled (no AUDIT_HMAC_KEY configured). |
| previous_hmac | string(64) | Yes |  | HMAC digest of the preceding entry in the chain. GENESIS_HMAC ("000...0") for the first entry. |
| hmac_key_id | string(64) | Yes |  | Key version identifier. Supports future key rotation. Defaults to "default". |
| src_ip | inet | Yes | Yes | Client source IP address captured at request time (PostgreSQL inet type). |
| dst_ip | inet | Yes | Yes | Gateway/server destination IP address. |
| src_country_code | string(2) | Yes | No | ISO 3166-1 alpha-2 country code for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_country_name | string(100) | Yes | No | Full country name for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_region | string(100) | Yes | No | State or province for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_city | string(100) | Yes | No | City name for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_isp | string(200) | Yes | No | Internet Service Provider name for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_asn | integer | Yes | No | Autonomous System Number for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_asn_org | string(200) | Yes | No | ASN organization name for src_ip. GeoIP-derived. Not in HMAC chain. |
| src_arin_org | string(200) | Yes | No | ARIN bulk whois organization name for src_ip. Not in HMAC chain. |
| dst_country_code | string(2) | Yes | No | ISO 3166-1 alpha-2 country code for dst_ip. Not in HMAC chain. |
| dst_asn | integer | Yes | No | ASN for dst_ip. Not in HMAC chain. |
| dst_asn_org | string(200) | Yes | No | ASN organization name for dst_ip. Not in HMAC chain. |
| credint_enabled | boolean | Yes | Yes | Whether Credential Intelligence was active for the requesting org at request time. null means CredInt was not in scope. |
| credint_hit | boolean | Yes | Yes | Whether a CredInt corpus match was detected. null means no scan. |
| frequency_bucket | string(20) | Yes | Yes | Credential hit severity tier: critical, high, medium, or low. |
| context_type | string(50) | Yes | Yes | L1 extractor context type for the matched token (e.g., explicit_assignment). |
| sha1_prefix | string(8) | Yes | Yes | First 8 hex characters of the SHA-1 digest of the matched token. For audit traceability only: never the full hash, never the credential cleartext. |
| credint_confidence | float | Yes | Yes | L3 NLI confidence score for the CredInt hit, in [0.0, 1.0]. |
| source | string(50) | Yes | Yes | Event origin: null (cloud gateway) or "outpost" (Hybrid Outpost sync). |
| outpost_id | UUID | Yes | Yes | Outpost instance UUID when source="outpost". |
| created_at | timestamptz | No | Yes | Row creation timestamp. Indexed for range queries. |

Observed facts (src_ip, dst_ip) are included in the HMAC chain because they are captured at request time and are immutable facts about the connection.

GeoIP enrichment fields (country, region, city, ISP, ASN, ARIN org) are excluded from the HMAC chain. GeoIP databases are periodically updated; re-enriching an entry when MaxMind data changes would break the chain. Only the raw IP is chained.
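The split between chained and enrichment-only columns can be illustrated with a small filter. This is a sketch only: GEOIP_FIELDS and chained_view are hypothetical names, and the field list is copied from the schema table above rather than from the Gateway source.

```python
# Columns marked "No" in the "In HMAC chain" column of the schema table.
GEOIP_FIELDS = frozenset({
    "src_country_code", "src_country_name", "src_region", "src_city",
    "src_isp", "src_asn", "src_asn_org", "src_arin_org",
    "dst_country_code", "dst_asn", "dst_asn_org",
})


def chained_view(row: dict) -> dict:
    """Return the row without GeoIP enrichment columns (hypothetical helper)."""
    return {k: v for k, v in row.items() if k not in GEOIP_FIELDS}


row = {"action": "login", "src_ip": "203.0.113.7", "src_city": "Oslo"}
re_enriched = dict(row, src_city="Bergen")  # GeoIP database refreshed

# The view that feeds the digest is unchanged by re-enrichment.
print(chained_view(row) == chained_view(re_enriched))
```

Because only the raw IP reaches the digest, refreshing the enrichment columns leaves every stored HMAC valid.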


Arbitex uses an HMAC-SHA256 hash chain to provide tamper evidence for the audit trail. Any modification, deletion, or reordering of audit entries causes the chain to break and can be detected by the verification API.

The chain is built by HMACChain in audit_integrity.py:

GENESIS_HMAC = "0000000000000000000000000000000000000000000000000000000000000000"
(64 zero characters — 256-bit zero value in hex)

For each audit event, the HMAC is computed as:

message = key_id + ":" + json.dumps(event_fields, sort_keys=True) + previous_hmac
hmac = HMAC-SHA256(AUDIT_HMAC_KEY, message.encode("utf-8"))

Where:

  • key_id is the hmac_key_id value (defaults to "default")
  • event_fields is the event dict with hmac, previous_hmac, and hmac_key_id fields excluded (so the digest covers content only)
  • previous_hmac is the hmac field of the immediately preceding entry (or GENESIS_HMAC for the first entry)
  • The JSON is serialized with sort_keys=True for a canonical, deterministic representation

Three fields are injected into each entry:

  • hmac_key_id — which key version was used
  • previous_hmac — the previous entry’s HMAC (or GENESIS_HMAC for the first)
  • hmac — the computed HMAC for this entry
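The construction above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the HMACChain implementation itself: append_event and CHAIN_FIELDS are hypothetical names, and default=str is an assumption added so that UUID and timestamp values serialize.

```python
import hashlib
import hmac
import json

GENESIS_HMAC = "0" * 64  # 64 zero characters

# Fields excluded from the digest so that it covers content only.
CHAIN_FIELDS = ("hmac", "previous_hmac", "hmac_key_id")


def compute_hmac(event: dict, key: bytes, key_id: str) -> str:
    """Return the hex HMAC-SHA256 digest for one event."""
    event_fields = {k: v for k, v in event.items() if k not in CHAIN_FIELDS}
    # sort_keys=True gives a deterministic serialization.
    # default=str is an assumption, not confirmed by this reference.
    canonical = json.dumps(event_fields, sort_keys=True, default=str)
    message = key_id + ":" + canonical + event["previous_hmac"]
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def append_event(chain: list, event: dict, key: bytes, key_id: str = "default") -> dict:
    """Inject hmac_key_id, previous_hmac, and hmac, then append (hypothetical helper)."""
    signed = dict(event)
    signed["hmac_key_id"] = key_id
    signed["previous_hmac"] = chain[-1]["hmac"] if chain else GENESIS_HMAC
    signed["hmac"] = compute_hmac(signed, key, key_id)
    chain.append(signed)
    return signed


chain = []
first = append_event(chain, {"action": "login"}, b"example-key")
second = append_event(chain, {"action": "logout"}, b"example-key")
print(second["previous_hmac"] == first["hmac"])
```

Each entry's digest depends on the digest before it, so changing any earlier entry invalidates every later link.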

The verify_chain function walks an ordered list of events and checks three invariants:

  1. Genesis check — the first event’s previous_hmac equals GENESIS_HMAC
  2. HMAC correctness — each event’s stored hmac matches the recomputed digest
  3. Chain linkage — each event’s previous_hmac matches the hmac of the preceding event

# Pseudo-code for verify_chain
def verify_chain(events, key):
    expected_prev = GENESIS_HMAC
    errors = []
    for idx, event in enumerate(events):
        # Invariants 1 and 3: the first entry links to GENESIS_HMAC,
        # every later entry links to its predecessor's hmac
        if event["previous_hmac"] != expected_prev:
            errors.append(f"Event {idx}: previous_hmac mismatch")
        # Invariant 2: the stored digest matches the recomputed one
        recomputed = compute_hmac(event, key, event["hmac_key_id"])
        if event["hmac"] != recomputed:
            errors.append(f"Event {idx}: HMAC mismatch")
        expected_prev = event["hmac"]  # advance chain pointer
    return len(errors) == 0, errors

The verifier uses the hmac_key_id recorded in each entry to support key rotation. Entries signed under an older key continue to verify correctly using that key — no chain migration is required when keys are rotated.
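A runnable sketch of per-entry key selection is shown below. It assumes a key store shaped as a dict from hmac_key_id to key bytes, which is a hypothetical structure, and it reports an unknown key ID as an error. The digest helper repeats the formula given earlier, with default=str as an added assumption.

```python
import hashlib
import hmac
import json

GENESIS_HMAC = "0" * 64
CHAIN_FIELDS = ("hmac", "previous_hmac", "hmac_key_id")


def compute_hmac(event: dict, key: bytes, key_id: str) -> str:
    """Recompute the digest from the documented message layout."""
    fields = {k: v for k, v in event.items() if k not in CHAIN_FIELDS}
    canonical = json.dumps(fields, sort_keys=True, default=str)
    message = key_id + ":" + canonical + event["previous_hmac"]
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_chain(events: list, keys: dict) -> tuple:
    """Check the three invariants, choosing the key by each entry's hmac_key_id.

    keys maps hmac_key_id -> key bytes (hypothetical key store).
    """
    expected_prev = GENESIS_HMAC
    errors = []
    for idx, event in enumerate(events):
        if event["previous_hmac"] != expected_prev:
            errors.append(f"Event {idx}: previous_hmac mismatch")
        key = keys.get(event["hmac_key_id"])
        if key is None:
            errors.append(f"Event {idx}: unknown hmac_key_id")
        else:
            recomputed = compute_hmac(event, key, event["hmac_key_id"])
            # Constant-time comparison of the two hex digests.
            if not hmac.compare_digest(event["hmac"], recomputed):
                errors.append(f"Event {idx}: HMAC mismatch")
        expected_prev = event["hmac"]  # advance chain pointer
    return len(errors) == 0, errors
```

Entries with a null hmac (chaining disabled) are not handled here; a real verifier would need to skip or reject them explicitly.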

The AUDIT_HMAC_KEY environment variable provides the signing key. When no key is set, HMAC chaining is silently disabled (graceful degradation for development deployments). Production deployments must configure this key.

When the HMAC key is rotated, new entries record the new key identifier in hmac_key_id. Entries already in the chain retain their original hmac_key_id and continue to verify correctly using the key that signed them. The chain is only linear within a contiguous key era; gaps in key IDs across era boundaries are expected.

POST /api/admin/audit/verify
Authorization: Bearer arb_live_your-api-key-here
Content-Type: application/json

{
  "start": "2026-01-01T00:00:00Z",
  "end": "2026-01-31T23:59:59Z"
}

Returns:

{
  "valid": true,
  "events_checked": 4821,
  "errors": []
}

On integrity failure, valid is false and errors lists the violated invariants with event indices and details.
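A client call to this endpoint might be assembled as in the sketch below, using only the Python standard library. The base URL is a placeholder, and build_verify_request and summarize are hypothetical helper names; only the path, headers, and body shape come from the example above.

```python
import json
import urllib.request


def build_verify_request(base_url: str, api_key: str, start: str, end: str):
    """Build the POST request for a verification window (hypothetical helper)."""
    body = json.dumps({"start": start, "end": end}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/admin/audit/verify",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def summarize(result: dict) -> str:
    """Turn the response body into a one-line status."""
    if result["valid"]:
        return f"OK: {result['events_checked']} events checked"
    return f"FAILED: {len(result['errors'])} error(s)"


req = build_verify_request(
    "https://gateway.example.com",  # placeholder host, not a real endpoint
    "arb_live_your-api-key-here",
    "2026-01-01T00:00:00Z",
    "2026-01-31T23:59:59Z",
)
# To send it:
#   with urllib.request.urlopen(req) as resp:
#       print(summarize(json.load(resp)))
```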


The action field identifies what occurred. Events fall into four functional categories:

Inference and API request events. These are the most frequent entries in the audit trail.

| Action | Description |
|---|---|
| prompt_sent | User submitted a prompt to the Gateway. |
| response_received | Model response returned to the user. |
| chat_completion | Chat completion API call processed. |
| streaming_response | Streaming response chunk events. |
| api_key_used | API key authentication for a request. |

Events where the policy engine or DLP pipeline took an enforcement action.

| Action | Description |
|---|---|
| dlp_block | DLP rule matched and request was blocked. |
| dlp_redact | DLP rule matched and content was redacted. |
| dlp_cancel | DLP rule matched and the completion was cancelled. |
| policy_block | Policy Engine BLOCK action applied. |
| policy_route | Policy Engine ROUTE_TO action applied. |
| credint_hit | Credential Intelligence corpus match detected. |
| ip_allowlist_blocked | Request blocked by IP allowlist enforcement. |

Identity and session events.

| Action | Description |
|---|---|
| login | User authenticated successfully. |
| logout | User session terminated. |
| saml_login | SAML SSO authentication completed. |
| oidc_login | OIDC/OAuth2 authentication completed. |
| mfa_verified | MFA challenge passed. |
| api_key_created | New API key issued. |
| api_key_revoked | API key revoked. |
| token_refresh | JWT access token refreshed. |

Administrative configuration events.

| Action | Description |
|---|---|
| user_invited | User invite created. |
| user_activated | User account activated. |
| user_deactivated | User account deactivated or SCIM-deleted. |
| group_created | User group created. |
| group_deleted | User group deleted. |
| policy_chain_updated | Policy chain modified. |
| policy_rule_created | New policy rule created. |
| policy_rule_updated | Policy rule updated. |
| compliance_bundle_toggled | Compliance bundle enabled or disabled. |
| ip_allowlist_entry_created | IP allowlist entry added. |
| ip_allowlist_entry_deleted | IP allowlist entry removed. |
| scim_token_rotated | Per-org SCIM bearer token rotated. |

Arbitex audit events are exported in Open Cybersecurity Schema Framework (OCSF) v1.1 format for SIEM ingestion. Each Arbitex event type maps to an OCSF class and category:

| Arbitex category | OCSF class ID | OCSF class name | OCSF category |
|---|---|---|---|
| api_activity | 6003 | API Activity | Application Activity (6) |
| security_finding | 2001 | Security Finding | Findings (2) |
| authentication | 3002 | Authentication Activity | Identity & Access Management (3) |
| account_change | 3001 | Account Change | Identity & Access Management (3) |

When exporting to SIEM, Arbitex maps the audit log envelope fields to OCSF attributes:

| Arbitex field | OCSF attribute | Notes |
|---|---|---|
| created_at | time | Unix epoch milliseconds |
| user_id | actor.user.uid | UUID string |
| tenant_id | cloud.account.uid | Org UUID string |
| action | activity_name | Mapped to OCSF activity verb |
| src_ip | src_endpoint.ip | Source IP address |
| src_country_code | src_endpoint.location.country | ISO 3166-1 alpha-2 |
| src_asn | src_endpoint.autonomous_system.number | ASN integer |
| model_id | http_request.url.query_string | Model identifier |
| provider | cloud.provider | Provider name |
| metadata | unmapped | Framework references, rule IDs |
| hmac | metadata.event_uid | Used as event integrity token |
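The two mapping tables can be read together as a single transformation from an audit row to a nested OCSF-style record. The sketch below is illustrative only: to_ocsf and OCSF_CLASSES are hypothetical names, the Arbitex category is passed in by the caller because this reference does not spell out how an action is assigned to a category, and the action value is copied through without the verb mapping the exporter applies.

```python
from datetime import datetime, timezone

# Arbitex category -> (OCSF class ID, OCSF class name), from the table above.
OCSF_CLASSES = {
    "api_activity": (6003, "API Activity"),
    "security_finding": (2001, "Security Finding"),
    "authentication": (3002, "Authentication Activity"),
    "account_change": (3001, "Account Change"),
}


def _uid(value):
    """Render a UUID (or None) as a string without turning None into 'None'."""
    return None if value is None else str(value)


def to_ocsf(row: dict, category: str) -> dict:
    """Map one audit row to a nested dict following the attribute table."""
    class_uid, class_name = OCSF_CLASSES[category]
    return {
        "class_uid": class_uid,
        "class_name": class_name,
        # created_at is assumed to be a timezone-aware datetime.
        "time": int(row["created_at"].timestamp() * 1000),
        "activity_name": row["action"],
        "actor": {"user": {"uid": _uid(row.get("user_id"))}},
        "cloud": {
            "account": {"uid": _uid(row.get("tenant_id"))},
            "provider": row.get("provider"),
        },
        "src_endpoint": {
            "ip": row.get("src_ip"),
            "location": {"country": row.get("src_country_code")},
            "autonomous_system": {"number": row.get("src_asn")},
        },
        "http_request": {"url": {"query_string": row.get("model_id")}},
        "unmapped": row.get("metadata"),
        "metadata": {"event_uid": row.get("hmac")},
    }


example = to_ocsf(
    {"created_at": datetime(2026, 1, 1, tzinfo=timezone.utc), "action": "login"},
    "authentication",
)
print(example["class_uid"], example["time"])
```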

For the full OCSF export configuration, see SIEM integration overview.


| Tier | Duration | Store | Notes |
|---|---|---|---|
| Active (hot) | 90 days | Azure Log Analytics | Full query access; HMAC chain verification available |
| Archive | 2 years | Azure Log Analytics archive tier | Azure-managed encryption; immutable; query via Log Analytics archive query |

The 90-day active window is the primary compliance investigation window. For long-term retention requirements (e.g., BSA/AML 5 years, SEC 17a-4 6 years), configure your SIEM to retain OCSF-formatted exports for the required period. See Compliance Frameworks for framework-specific retention requirements.
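For tooling that needs to decide where to look for an entry, the tiers can be expressed as a simple age check. This sketch rests on two assumptions that this reference does not confirm: the 2-year archive duration is measured from the entry's created_at, and a year is counted as 365 days. retention_tier is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone

ACTIVE_WINDOW = timedelta(days=90)
# Assumption: archive duration counted from creation, 365-day years.
ARCHIVE_WINDOW = timedelta(days=2 * 365)


def retention_tier(created_at: datetime, now: datetime) -> str:
    """Classify an entry by age (hypothetical helper)."""
    age = now - created_at
    if age <= ACTIVE_WINDOW:
        return "active"
    if age <= ARCHIVE_WINDOW:
        return "archive"
    # Beyond both tiers; rely on SIEM-retained OCSF exports.
    return "beyond_retention"


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
print(retention_tier(now - timedelta(days=200), now))
```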

For Hybrid Outpost deployments, audit events are generated locally on the Outpost and periodically synced to the Cloud control plane. Synced entries have source="outpost" and a non-null outpost_id. The HMAC chain covers synced entries identically to cloud-generated entries — the chain is per-org across all sources.


The audit_logs table carries indexes on the following columns for query performance:

| Index | Columns | Use case |
|---|---|---|
| ix_audit_logs_user_id | user_id | Filter by user |
| ix_audit_logs_tenant_id | tenant_id | Tenant-scoped queries (primary query filter) |
| ix_audit_logs_conversation_id | conversation_id | Conversation-level audit drilldown |
| ix_audit_logs_created_at | created_at | Time range queries |
| ix_audit_logs_source | source | Filter outpost vs cloud events |
| ix_audit_logs_outpost_id | outpost_id | Per-outpost audit queries |
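A query shaped around these indexes would typically filter on tenant_id and a created_at range first. The sketch below builds such a statement with positional placeholders. The %s placeholder style assumes a psycopg-style PostgreSQL driver, tenant_range_query is a hypothetical helper, and direct SQL access to audit_logs is itself an assumption since the documented access path is the admin API.

```python
def tenant_range_query(tenant_id, start, end, user_id=None):
    """Return (sql, params) for a tenant-scoped time range query."""
    clauses = ["tenant_id = %s", "created_at >= %s", "created_at < %s"]
    params = [tenant_id, start, end]
    if user_id is not None:
        # Optional narrowing that can use ix_audit_logs_user_id.
        clauses.append("user_id = %s")
        params.append(user_id)
    sql = (
        "SELECT id, action, created_at FROM audit_logs WHERE "
        + " AND ".join(clauses)
        + " ORDER BY created_at"
    )
    return sql, tuple(params)


sql, params = tenant_range_query("tenant-uuid", "2026-01-01", "2026-02-01")
print(sql)
```

Values are always passed as parameters rather than interpolated into the SQL string.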