Audit Data Model Reference
Every request processed by the Arbitex Gateway produces one or more audit log entries in the audit_logs table. This reference describes the complete schema, the cryptographic integrity mechanism, event types, OCSF class mapping, and the retention and archival model.
For the admin API used to search and export audit entries, see Audit log export.
AuditLog schema
| Field | Type | Nullable | In HMAC chain | Description |
|---|---|---|---|---|
id | UUID | No | Yes | Primary key. Unique identifier for this audit entry. |
user_id | UUID | Yes | Yes | Requesting user. null for system or unauthenticated events. SET NULL on user deletion — the audit record is preserved. |
tenant_id | UUID | Yes | Yes | Organization (tenant) UUID. Scopes the entry to an org. null for platform-level events. |
conversation_id | UUID | Yes | Yes | Related conversation UUID. null for non-conversation events. SET NULL on conversation deletion. |
action | string(255) | No | Yes | Action type identifier. See Event types. |
model_id | string(255) | Yes | Yes | LLM model identifier (e.g., gpt-4o, claude-3-5-sonnet-20241022). null for non-inference events. |
provider | string(100) | Yes | Yes | Provider name (e.g., openai, anthropic). null for non-inference events. |
prompt_text | text | Yes | Yes | Prompt text sent to the model. null when not applicable. Only present when prompt retention is enabled. |
response_text | text | Yes | Yes | Response text received from the model. null when not applicable. |
token_count_input | integer | Yes | Yes | Input token count for this inference. |
token_count_output | integer | Yes | Yes | Output token count for this inference. |
cost_estimate | decimal(10,6) | Yes | Yes | Estimated cost in USD. Precision: 6 decimal places. |
latency_ms | integer | Yes | Yes | End-to-end request latency in milliseconds. |
metadata | JSONB | Yes | Yes | Extension data. Stores framework reference (e.g., PCI-DSS Requirement 3.4), matched entity category, policy rule ID, and pack ID for compliance bundle matches. |
hmac | string(64) | Yes | — | HMAC-SHA256 hex digest for this entry. null when HMAC chaining is disabled (no AUDIT_HMAC_KEY configured). |
previous_hmac | string(64) | Yes | — | HMAC digest of the preceding entry in the chain. GENESIS_HMAC ("000...0") for the first entry. |
hmac_key_id | string(64) | Yes | — | Key version identifier. Supports future key rotation. Defaults to "default". |
src_ip | inet | Yes | Yes | Client source IP address captured at request time (PostgreSQL inet type). |
dst_ip | inet | Yes | Yes | Gateway/server destination IP address. |
src_country_code | string(2) | Yes | No | ISO 3166-1 alpha-2 country code for src_ip. GeoIP-derived. Not in HMAC chain. |
src_country_name | string(100) | Yes | No | Full country name for src_ip. GeoIP-derived. Not in HMAC chain. |
src_region | string(100) | Yes | No | State or province for src_ip. GeoIP-derived. Not in HMAC chain. |
src_city | string(100) | Yes | No | City name for src_ip. GeoIP-derived. Not in HMAC chain. |
src_isp | string(200) | Yes | No | Internet Service Provider name for src_ip. GeoIP-derived. Not in HMAC chain. |
src_asn | integer | Yes | No | Autonomous System Number for src_ip. GeoIP-derived. Not in HMAC chain. |
src_asn_org | string(200) | Yes | No | ASN organization name for src_ip. GeoIP-derived. Not in HMAC chain. |
src_arin_org | string(200) | Yes | No | ARIN bulk whois organization name for src_ip. Not in HMAC chain. |
dst_country_code | string(2) | Yes | No | ISO 3166-1 alpha-2 country code for dst_ip. Not in HMAC chain. |
dst_asn | integer | Yes | No | ASN for dst_ip. Not in HMAC chain. |
dst_asn_org | string(200) | Yes | No | ASN organization name for dst_ip. Not in HMAC chain. |
credint_enabled | boolean | Yes | Yes | Whether Credential Intelligence was active for the requesting org at request time. null means CredInt was not in scope. |
credint_hit | boolean | Yes | Yes | Whether a CredInt corpus match was detected. null means no scan. |
frequency_bucket | string(20) | Yes | Yes | Credential hit severity tier: critical, high, medium, or low. |
context_type | string(50) | Yes | Yes | L1 extractor context type for the matched token (e.g., explicit_assignment). |
sha1_prefix | string(8) | Yes | Yes | First 8 hex characters of the SHA-1 digest of the matched token. Audit traceability only — never the full hash, never the credential cleartext. |
credint_confidence | float | Yes | Yes | L3 NLI confidence score for the CredInt hit, in [0.0, 1.0]. |
source | string(50) | Yes | Yes | Event origin: null (cloud gateway) or "outpost" (Hybrid Outpost sync). |
outpost_id | UUID | Yes | Yes | Outpost instance UUID when source="outpost". |
created_at | timestamptz | No | Yes | Row creation timestamp. Indexed for range queries. |
GeoIP enrichment and HMAC chain
Observed facts (src_ip, dst_ip) are included in the HMAC chain because they are captured at request time and are immutable facts about the connection.
GeoIP enrichment fields (country, region, city, ISP, ASN, ARIN org) are excluded from the HMAC chain. GeoIP databases are periodically updated; re-enriching an entry when MaxMind data changes would break the chain. Only the raw IP is chained.
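The split between chained and enrichment fields can be illustrated with a short sketch. The field names come from the schema table above; the helper name `chained_view` and the sample values are hypothetical, and this is not the Gateway's actual implementation.

```python
# Enrichment columns marked "No" in the "In HMAC chain" column of the schema table.
ENRICHMENT_FIELDS = frozenset({
    "src_country_code", "src_country_name", "src_region", "src_city",
    "src_isp", "src_asn", "src_asn_org", "src_arin_org",
    "dst_country_code", "dst_asn", "dst_asn_org",
})


def chained_view(entry: dict) -> dict:
    """Return only the fields that would be covered by the HMAC (hypothetical helper)."""
    return {k: v for k, v in entry.items() if k not in ENRICHMENT_FIELDS}


# Sample entry with made-up values (203.0.113.7 is a documentation-range IP).
before = {"action": "login", "src_ip": "203.0.113.7", "src_city": "Old City", "src_asn": 64500}

# Simulate a GeoIP database refresh that changes the derived fields only.
after = dict(before, src_city="New City", src_asn=64501)

# The chained view is unchanged, so a digest computed over it would still verify.
assert chained_view(before) == chained_view(after)
```

Because only the raw IP is part of the chained view, the enrichment columns can be refreshed in place without touching any stored digest.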
HMAC chain integrity
Architecture
Arbitex uses an HMAC-SHA256 hash chain to provide tamper evidence for the audit trail. Any modification, deletion, or reordering of audit entries causes the chain to break and can be detected by the verification API.
The chain is built by HMACChain in audit_integrity.py:
```python
# 64 zero characters: a 256-bit zero value in hex
GENESIS_HMAC = "0000000000000000000000000000000000000000000000000000000000000000"
```

Chain computation
For each audit event, the HMAC is computed as:

```
message = key_id + ":" + json.dumps(event_fields, sort_keys=True) + previous_hmac
hmac = HMAC-SHA256(AUDIT_HMAC_KEY, message.encode("utf-8"))
```

Where:
- `key_id` is the `hmac_key_id` value (defaults to `"default"`)
- `event_fields` is the event dict with the `hmac`, `previous_hmac`, and `hmac_key_id` fields excluded (so the digest covers content only)
- `previous_hmac` is the `hmac` field of the immediately preceding entry (or `GENESIS_HMAC` for the first entry)
- The JSON is serialized with `sort_keys=True` for a canonical, deterministic representation
Three fields are injected into each entry:
- `hmac_key_id`: which key version was used
- `previous_hmac`: the previous entry's HMAC (or `GENESIS_HMAC` for the first)
- `hmac`: the computed HMAC for this entry
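The computation and field injection described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code in audit_integrity.py: the function names `compute_hmac` and `sign_event` are hypothetical, and it assumes the event dict holds only chained fields whose values are already JSON-serializable (UUIDs and timestamps converted to strings).

```python
import hashlib
import hmac
import json

GENESIS_HMAC = "0" * 64  # 64 zero characters
EXCLUDED_FIELDS = ("hmac", "previous_hmac", "hmac_key_id")


def compute_hmac(event: dict, key: bytes, key_id: str, previous_hmac: str) -> str:
    """Digest over key_id, canonical JSON of the content fields, and the previous HMAC."""
    event_fields = {k: v for k, v in event.items() if k not in EXCLUDED_FIELDS}
    message = key_id + ":" + json.dumps(event_fields, sort_keys=True) + previous_hmac
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_event(event: dict, key: bytes,
               previous_hmac: str = GENESIS_HMAC, key_id: str = "default") -> dict:
    """Return a copy of the event with the three chain fields injected."""
    signed = dict(event)
    signed["hmac_key_id"] = key_id
    signed["previous_hmac"] = previous_hmac
    signed["hmac"] = compute_hmac(event, key, key_id, previous_hmac)
    return signed


key = b"example-signing-key"  # stands in for AUDIT_HMAC_KEY
first = sign_event({"action": "login", "user_id": "u1"}, key)
second = sign_event({"action": "logout", "user_id": "u1"}, key, previous_hmac=first["hmac"])
```

Here `first` links back to `GENESIS_HMAC` and `second` links back to `first["hmac"]`, which is what makes reordering or deleting an entry detectable.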
Chain verification algorithm
The verify_chain function walks an ordered list of events and checks three invariants:
- Genesis check: the first event's `previous_hmac` equals `GENESIS_HMAC`
- HMAC correctness: each event's stored `hmac` matches the recomputed digest
- Chain linkage: each event's `previous_hmac` matches the `hmac` of the preceding event
```python
# Pseudo-code for verify_chain; compute_hmac applies the chain computation above.
def verify_chain(events, key):
    expected_prev = GENESIS_HMAC
    errors = []

    for idx, event in enumerate(events):
        # Genesis check (idx 0) and chain linkage (idx > 0)
        if event["previous_hmac"] != expected_prev:
            errors.append(f"Event {idx}: previous_hmac mismatch")

        # HMAC correctness
        recomputed = compute_hmac(event, key, event["hmac_key_id"])
        if event["hmac"] != recomputed:
            errors.append(f"Event {idx}: HMAC mismatch")

        expected_prev = event["hmac"]  # advance chain pointer

    return len(errors) == 0, errors
```

The verifier uses the hmac_key_id recorded in each entry to support key rotation. Entries signed under an older key continue to verify correctly using that key, so no chain migration is required when keys are rotated.
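A runnable version of the walk above, including tamper detection and per-entry key lookup, might look like the following. It is a sketch under stated assumptions: the `keys` mapping from `hmac_key_id` to key bytes, the `append_event` helper, and the key ID `"2026-02"` are all hypothetical. The sketch also links every entry to its predecessor across the rotation boundary, which may differ from how the Gateway handles key eras.

```python
import hashlib
import hmac
import json

GENESIS_HMAC = "0" * 64
EXCLUDED_FIELDS = ("hmac", "previous_hmac", "hmac_key_id")


def compute_hmac(event, key, key_id, previous_hmac):
    fields = {k: v for k, v in event.items() if k not in EXCLUDED_FIELDS}
    message = key_id + ":" + json.dumps(fields, sort_keys=True) + previous_hmac
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def append_event(chain, event, key, key_id="default"):
    """Sign an event against the current chain tail and append it (hypothetical helper)."""
    previous = chain[-1]["hmac"] if chain else GENESIS_HMAC
    signed = dict(event, hmac_key_id=key_id, previous_hmac=previous)
    signed["hmac"] = compute_hmac(event, key, key_id, previous)
    chain.append(signed)


def verify_chain(events, keys):
    """Check genesis, linkage, and digest correctness; keys maps key ID to key bytes."""
    expected_prev = GENESIS_HMAC
    errors = []
    for idx, event in enumerate(events):
        if event["previous_hmac"] != expected_prev:
            errors.append(f"Event {idx}: previous_hmac mismatch")
        key = keys.get(event["hmac_key_id"])
        if key is None:
            errors.append(f"Event {idx}: unknown key id")
        else:
            recomputed = compute_hmac(event, key, event["hmac_key_id"], event["previous_hmac"])
            # Constant-time comparison of the two hex digests
            if not hmac.compare_digest(event["hmac"], recomputed):
                errors.append(f"Event {idx}: HMAC mismatch")
        expected_prev = event["hmac"]  # advance chain pointer
    return len(errors) == 0, errors


keys = {"default": b"old-key", "2026-02": b"new-key"}
chain = []
append_event(chain, {"action": "login", "user_id": "u1"}, keys["default"])
append_event(chain, {"action": "prompt_sent", "user_id": "u1"}, keys["default"])
append_event(chain, {"action": "logout", "user_id": "u1"}, keys["2026-02"], key_id="2026-02")

ok, errors = verify_chain(chain, keys)

# Modify the content of the middle entry without re-signing it.
tampered = [dict(e) for e in chain]
tampered[1]["action"] = "policy_block"
tampered_ok, tampered_errors = verify_chain(tampered, keys)
```

The untouched chain verifies, while the edited copy reports an HMAC mismatch at index 1. Deleting an entry instead would surface as a `previous_hmac` mismatch on the entry that followed it.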
Key rotation
The AUDIT_HMAC_KEY environment variable provides the signing key. When no key is set, HMAC chaining is silently disabled (graceful degradation for development deployments). Production deployments must configure this key.
When the HMAC key is rotated, the hmac_key_id field is updated to the new key identifier. Entries already in the chain retain their original hmac_key_id and continue to verify correctly using the key that signed them. The chain is only linear within a contiguous key era; gaps in key IDs across era boundaries are expected.
HMAC chain verification API
Section titled “HMAC chain verification API”POST /api/admin/audit/verifyAuthorization: Bearer arb_live_your-api-key-hereContent-Type: application/json
{ "start": "2026-01-01T00:00:00Z", "end": "2026-01-31T23:59:59Z"}Returns:
{ "valid": true, "events_checked": 4821, "errors": []}On integrity failure, valid is false and errors lists the violated invariants with event indices and details.
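As a rough illustration, the request could be assembled with Python's standard library as shown below. The base URL `https://gateway.example.com` is a placeholder, and the helper names `build_verify_request` and `summarize` are hypothetical. The sketch builds the request without sending it, and `summarize` relies only on the three response keys shown above.

```python
import json
import urllib.request


def build_verify_request(base_url: str, api_key: str, start: str, end: str) -> urllib.request.Request:
    """Build (but do not send) a POST to the chain verification endpoint."""
    body = json.dumps({"start": start, "end": end}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/admin/audit/verify",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def summarize(result: dict) -> str:
    """Turn a verification response into a one-line status message."""
    if result["valid"]:
        return f"chain intact ({result['events_checked']} events checked)"
    return f"chain broken: {len(result['errors'])} error(s)"


request = build_verify_request(
    "https://gateway.example.com",  # placeholder host, not a real deployment
    "arb_live_your-api-key-here",
    "2026-01-01T00:00:00Z",
    "2026-01-31T23:59:59Z",
)

# To send it against a real deployment:
# with urllib.request.urlopen(request) as resp:
#     print(summarize(json.load(resp)))
```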
Event types
The action field identifies what occurred. Events fall into four functional categories:
api_activity
Inference and API request events. These are the most frequent entries in the audit trail.
| Action | Description |
|---|---|
prompt_sent | User submitted a prompt to the Gateway. |
response_received | Model response returned to the user. |
chat_completion | Chat completion API call processed. |
streaming_response | Streaming response chunk events. |
api_key_used | API key authentication for a request. |
security_finding
Events where the policy engine or DLP pipeline took an enforcement action.
| Action | Description |
|---|---|
dlp_block | DLP rule matched and request was blocked. |
dlp_redact | DLP rule matched and content was redacted. |
dlp_cancel | DLP rule matched and the completion was cancelled. |
policy_block | Policy Engine BLOCK action applied. |
policy_route | Policy Engine ROUTE_TO action applied. |
credint_hit | Credential Intelligence corpus match detected. |
ip_allowlist_blocked | Request blocked by IP allowlist enforcement. |
authentication
Identity and session events.
| Action | Description |
|---|---|
login | User authenticated successfully. |
logout | User session terminated. |
saml_login | SAML SSO authentication completed. |
oidc_login | OIDC/OAuth2 authentication completed. |
mfa_verified | MFA challenge passed. |
api_key_created | New API key issued. |
api_key_revoked | API key revoked. |
token_refresh | JWT access token refreshed. |
account_change
Administrative configuration events.
| Action | Description |
|---|---|
user_invited | User invite created. |
user_activated | User account activated. |
user_deactivated | User account deactivated or SCIM-deleted. |
group_created | User group created. |
group_deleted | User group deleted. |
policy_chain_updated | Policy chain modified. |
policy_rule_created | New policy rule created. |
policy_rule_updated | Policy rule updated. |
compliance_bundle_toggled | Compliance bundle enabled or disabled. |
ip_allowlist_entry_created | IP allowlist entry added. |
ip_allowlist_entry_deleted | IP allowlist entry removed. |
scim_token_rotated | Per-org SCIM bearer token rotated. |
OCSF v1.1 class mapping
Arbitex audit events are exported in Open Cybersecurity Schema Framework (OCSF) v1.1 format for SIEM ingestion. Each Arbitex event type maps to an OCSF class and category:
| Arbitex category | OCSF class ID | OCSF class name | OCSF category |
|---|---|---|---|
api_activity | 6003 | API Activity | Application Activity (6) |
security_finding | 2001 | Security Finding | Findings (2) |
authentication | 3002 | Authentication Activity | Identity & Access Management (3) |
account_change | 3001 | Account Change | Identity & Access Management (3) |
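The two-step lookup implied by the tables (action to category, then category to OCSF class) can be sketched as plain dictionaries. The action and class values are taken from the tables in this reference; the names `ACTIONS_BY_CATEGORY`, `CATEGORY_BY_ACTION`, and `ocsf_class_for` are hypothetical and only illustrate the mapping.

```python
OCSF_CLASS_BY_CATEGORY = {
    "api_activity": (6003, "API Activity"),
    "security_finding": (2001, "Security Finding"),
    "authentication": (3002, "Authentication Activity"),
    "account_change": (3001, "Account Change"),
}

ACTIONS_BY_CATEGORY = {
    "api_activity": (
        "prompt_sent", "response_received", "chat_completion",
        "streaming_response", "api_key_used",
    ),
    "security_finding": (
        "dlp_block", "dlp_redact", "dlp_cancel", "policy_block",
        "policy_route", "credint_hit", "ip_allowlist_blocked",
    ),
    "authentication": (
        "login", "logout", "saml_login", "oidc_login", "mfa_verified",
        "api_key_created", "api_key_revoked", "token_refresh",
    ),
    "account_change": (
        "user_invited", "user_activated", "user_deactivated",
        "group_created", "group_deleted", "policy_chain_updated",
        "policy_rule_created", "policy_rule_updated",
        "compliance_bundle_toggled", "ip_allowlist_entry_created",
        "ip_allowlist_entry_deleted", "scim_token_rotated",
    ),
}

# Invert the grouping so each action resolves to its category in one lookup.
CATEGORY_BY_ACTION = {
    action: category
    for category, actions in ACTIONS_BY_CATEGORY.items()
    for action in actions
}


def ocsf_class_for(action: str) -> tuple:
    """Return (class ID, class name) for an audit action; raises KeyError if unknown."""
    category = CATEGORY_BY_ACTION.get(action)
    if category is None:
        raise KeyError(f"unknown audit action: {action}")
    return OCSF_CLASS_BY_CATEGORY[category]
```

For example, `ocsf_class_for("dlp_block")` resolves through `security_finding` to class 2001.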
Envelope fields in OCSF format
When exporting to SIEM, Arbitex maps the audit log envelope fields to OCSF attributes:
| Arbitex field | OCSF attribute | Notes |
|---|---|---|
created_at | time | Unix epoch milliseconds |
user_id | actor.user.uid | UUID string |
tenant_id | cloud.account.uid | Org UUID string |
action | activity_name | Mapped to OCSF activity verb |
src_ip | src_endpoint.ip | Source IP address |
src_country_code | src_endpoint.location.country | ISO 3166-1 alpha-2 |
src_asn | src_endpoint.autonomous_system.number | ASN integer |
model_id | http_request.url.query_string | Model identifier |
provider | cloud.provider | Provider name |
metadata | unmapped | Framework references, rule IDs |
hmac | metadata.event_uid | Used as event integrity token |
For the full OCSF export configuration, see SIEM integration overview.
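The envelope mapping in the table can be pictured as a translation from a flat audit row to a nested OCSF dictionary. The sketch below is an assumption-laden illustration: the helper names are hypothetical, `created_at` is assumed to be a timezone-aware `datetime`, and `action` is passed through unchanged even though the real export maps it to an OCSF activity verb.

```python
from datetime import datetime, timezone

# Dotted OCSF attribute paths from the envelope table (created_at and metadata handled separately).
FIELD_TO_OCSF = {
    "user_id": "actor.user.uid",
    "tenant_id": "cloud.account.uid",
    "action": "activity_name",
    "src_ip": "src_endpoint.ip",
    "src_country_code": "src_endpoint.location.country",
    "src_asn": "src_endpoint.autonomous_system.number",
    "model_id": "http_request.url.query_string",
    "provider": "cloud.provider",
    "hmac": "metadata.event_uid",
}


def set_path(target: dict, dotted: str, value) -> None:
    """Assign value at a dotted path, creating intermediate dicts as needed."""
    parts = dotted.split(".")
    for part in parts[:-1]:
        target = target.setdefault(part, {})
    target[parts[-1]] = value


def to_ocsf_envelope(entry: dict) -> dict:
    """Map a flat audit row to a nested OCSF-style envelope (illustrative only)."""
    event = {"time": int(entry["created_at"].timestamp() * 1000)}  # Unix epoch milliseconds
    for field, path in FIELD_TO_OCSF.items():
        if entry.get(field) is not None:
            set_path(event, path, entry[field])
    if entry.get("metadata") is not None:
        event["unmapped"] = entry["metadata"]  # framework references, rule IDs
    return event


# Sample row with made-up values.
row = {
    "created_at": datetime(2026, 1, 1, tzinfo=timezone.utc),
    "action": "dlp_block",
    "src_ip": "203.0.113.7",
    "src_country_code": "US",
    "provider": None,
    "metadata": {"policy_rule_id": "rule-1"},
    "hmac": "ab" * 32,
}
envelope = to_ocsf_envelope(row)
```

Null fields such as `provider` in the sample row are simply omitted from the envelope rather than emitted as empty attributes; that choice is part of the sketch, not something the source specifies.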
Retention and archival model
| Tier | Duration | Store | Notes |
|---|---|---|---|
| Active (hot) | 90 days | Azure Log Analytics | Full query access; HMAC chain verification available |
| Archive | 2 years | Azure Log Analytics archive tier | Azure-managed encryption; immutable; query via Log Analytics archive query |
The 90-day active window is the primary compliance investigation window. For long-term retention requirements (e.g., BSA/AML 5 years, SEC 17a-4 6 years), configure your SIEM to retain OCSF-formatted exports for the required period. See Compliance Frameworks for framework-specific retention requirements.
Outpost audit sync
For Hybrid Outpost deployments, audit events are generated locally on the Outpost and periodically synced to the Cloud control plane. Synced entries have source="outpost" and a non-null outpost_id. The HMAC chain covers synced entries identically to cloud-generated entries: the chain is per-org across all sources.
Database indexes
The audit_logs table carries indexes on the following columns for query performance:
| Index | Columns | Use case |
|---|---|---|
ix_audit_logs_user_id | user_id | Filter by user |
ix_audit_logs_tenant_id | tenant_id | Tenant-scoped queries (primary query filter) |
ix_audit_logs_conversation_id | conversation_id | Conversation-level audit drilldown |
ix_audit_logs_created_at | created_at | Time range queries |
ix_audit_logs_source | source | Filter outpost vs cloud events |
ix_audit_logs_outpost_id | outpost_id | Per-outpost audit queries |
See also
- Audit log export: Admin API for search, filter, and paginated export
- SIEM integration overview: OCSF export to Splunk, Sentinel, Elasticsearch, and others
- Audit chain integrity: Deep dive on the HMAC chain verification
- Compliance Frameworks: How audit fields map to regulatory requirements
- Security Overview: Full security architecture overview