# SIEM integration
Arbitex forwards audit events to your SIEM in real time using configurable connectors. All events use the Open Cybersecurity Schema Framework (OCSF) v1.1 format, which maps to standard SIEM parsers and dashboards without custom transforms.
Five of six connectors are fully functional: Splunk HEC, Microsoft Sentinel, Elasticsearch, Datadog, and Sumo Logic. IBM QRadar has a stub implementation — it is recognized by the system and health-checked, but does not yet deliver events. QRadar promotion is deferred to a future release.
## Connector comparison

| Connector | ID | Status | Protocol | Auth | Guide |
|---|---|---|---|---|---|
| Splunk HEC | splunk_hec | P0 — Fully functional | HTTP Event Collector (JSON batches) | HEC token | Splunk guide |
| Microsoft Sentinel | sentinel | P0 — Fully functional | Azure Monitor Log Ingestion API (DCR) | Azure AD client credentials | Sentinel guide |
| Elasticsearch | elastic | P0 — Fully functional | Bulk API (NDJSON) | API key or basic auth; Cloud ID support | Elastic guide |
| Datadog | datadog | P0 — Fully functional | Logs Intake API v2 (JSON array) | API key; multi-site | Datadog guide |
| Sumo Logic | sumo_logic | P0 — Fully functional | HTTP Source (NDJSON) | URL-embedded auth | Sumo Logic guide |
| IBM QRadar | qradar | Stub — not delivering | — | — | — |
All P0 connectors share the same reliability model: batched delivery (up to 100 events or 5 seconds), exponential backoff retry on 429/503, and dead letter JSONL fallback on persistent failure.
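As a sketch, the shared reliability model behaves like the code below. The class name, `send_batch` callable, and internal details are illustrative stand-ins under the documented defaults (100 events / 5 seconds / 3 retries), not Arbitex source code:

```python
import json
import time


class BatchForwarder:
    """Sketch of the shared delivery model: batch, retry with backoff, dead-letter.

    `send_batch` is a hypothetical stand-in for a connector's HTTP delivery
    call; it takes a list of events and returns an HTTP status code.
    """

    def __init__(self, send_batch, dead_letter_path,
                 batch_size=100, flush_interval=5.0, max_retries=3):
        self.send_batch = send_batch
        self.dead_letter_path = dead_letter_path
        self.batch_size = batch_size
        self.flush_interval = flush_interval
        self.max_retries = max_retries
        self.buffer = []
        self.last_flush = time.monotonic()

    def enqueue(self, event):
        self.buffer.append(event)
        # Flush when the batch is full or the flush interval has elapsed.
        # (A real connector would also flush on a background timer.)
        if (len(self.buffer) >= self.batch_size
                or time.monotonic() - self.last_flush >= self.flush_interval):
            self.flush()

    def flush(self):
        batch, self.buffer = self.buffer, []
        self.last_flush = time.monotonic()
        if not batch:
            return
        for attempt in range(self.max_retries):
            status = self.send_batch(batch)
            if status not in (429, 503):   # retry only on transient failures
                return
            time.sleep(2 ** attempt)       # exponential backoff: 1s, 2s, 4s
        # All retries exhausted: append each event to the dead letter JSONL file.
        with open(self.dead_letter_path, "a") as f:
            for event in batch:
                f.write(json.dumps({"event": event,
                                    "error": f"HTTP {status}",
                                    "timestamp": time.time()}) + "\n")
```

The dead letter lines written here follow the same shape shown in the Dead letter recovery section, so a replay tool can treat all connectors uniformly.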
## OCSF event format

All events emitted by Arbitex use OCSF v1.1. Events are classified into four OCSF classes based on the audit action type:
| OCSF Class | Class UID | Used for |
|---|---|---|
| API Activity | 6003 | Prompt/response events, model calls, stream activity |
| Security Finding | 2001 | DLP blocks, DLP redactions, credential intelligence hits |
| Authentication | 3002 | Login, logout, MFA, SSO, token events |
| Account Change | 3004 | Admin config changes, user provisioning, API key operations, policy changes |
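Conceptually, this routing is a lookup from audit action to class UID. The mapping below is an illustrative sketch built from examples on this page; action names such as `login` or `api_key_created` are hypothetical, and the real action set is larger:

```python
# Illustrative routing of Arbitex audit actions to OCSF class UIDs.
# Only prompt_sent, response_received, and dlp_block appear verbatim in
# this page; the other action names are hypothetical examples.
OCSF_CLASS_BY_ACTION = {
    "prompt_sent":       6003,  # API Activity
    "response_received": 6003,
    "dlp_block":         2001,  # Security Finding
    "dlp_redaction":     2001,
    "login":             3002,  # Authentication
    "logout":            3002,
    "api_key_created":   3004,  # Account Change
    "policy_updated":    3004,
}


def classify(action: str) -> int:
    """Return the OCSF class UID for a known audit action."""
    return OCSF_CLASS_BY_ACTION[action]
```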
### Common fields (all classes)

| Field | Type | Description |
|---|---|---|
| `class_uid` | integer | OCSF class UID (2001, 3002, 3004, or 6003) |
| `class_name` | string | Human-readable class name |
| `time` | integer | Event timestamp in epoch milliseconds |
| `message` | string | Human-readable event summary |
| `severity_id` | integer | 0=Unknown, 1=Informational, 2=Low, 3=Medium, 4=High, 5=Critical |
| `severity` | string | Severity label |
| `activity_id` | integer | Activity type (1=Create, 2=Read, 3=Update, 4=Delete, 99=Other) |
| `metadata.version` | string | `"1.1.0"` |
| `metadata.product.name` | string | `"Arbitex"` |
| `actor.user.uid` | string | User ID of the actor |
| `actor.user.org_uid` | string | Tenant ID |
| `src_endpoint.ip` | string | Source IP address (if available) |
| `src_endpoint.location` | object | Country, city, region from GeoIP enrichment |
| `observables` | array | Extracted indicators: user_id, src_ip, model_id, provider |
### API Activity (class 6003) additional fields

| Field | Description |
|---|---|
| `api.operation` | The Arbitex audit action (e.g., `prompt_sent`, `response_received`) |
| `api.service.name` | AI provider name |
| `api.service.uid` | Model identifier |
| `unmapped.token_count_input` | Input token count |
| `unmapped.token_count_output` | Output token count |
| `unmapped.cost_estimate` | Estimated cost in USD |
| `unmapped.latency_ms` | End-to-end latency in milliseconds |
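Because the AI-specific metrics live under `unmapped`, queries and scripts must address them there. As a hypothetical example, summing estimated cost per user over a list of exported events:

```python
from collections import defaultdict


def cost_by_user(events):
    """Sum unmapped.cost_estimate per actor.user.uid across API Activity events.

    Illustrative helper, not part of Arbitex; field paths follow the
    tables documented above.
    """
    totals = defaultdict(float)
    for ev in events:
        if ev.get("class_uid") != 6003:   # only API Activity carries cost
            continue
        uid = ev.get("actor", {}).get("user", {}).get("uid", "unknown")
        totals[uid] += ev.get("unmapped", {}).get("cost_estimate", 0.0)
    return dict(totals)
```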
### Security Finding (class 2001) additional fields

| Field | Description |
|---|---|
| `finding.title` | Finding description (e.g., `DLP: dlp_block`) |
| `finding.uid` | Unique finding ID |
| `finding.types` | `["DLP"]` |
| `finding.related_events` | Linked rule IDs and CredInt hit data |
| `unmapped.credint_confidence` | CredInt hit confidence score |
### Example OCSF event (API Activity)

```json
{
  "class_uid": 6003,
  "class_name": "Api Activity",
  "time": 1741564800000,
  "message": "Arbitex audit: prompt_sent",
  "severity_id": 1,
  "severity": "Informational",
  "activity_id": 1,
  "activity_name": "Send",
  "metadata": {
    "version": "1.1.0",
    "product": {
      "name": "Arbitex",
      "vendor_name": "Arbitex",
      "uid": "arbitex-platform"
    },
    "log_name": "arbitex_audit_log",
    "log_provider": "Arbitex Platform"
  },
  "actor": {
    "user": {
      "uid": "usr_01HZ_ALICE",
      "type_id": 1,
      "type": "User",
      "org_uid": "org_acme"
    }
  },
  "device": {
    "type_id": 6,
    "type": "Server",
    "hostname": "api.arbitex.ai"
  },
  "src_endpoint": {
    "ip": "198.51.100.42",
    "location": {
      "country": "US",
      "city": "New York",
      "region": "NY"
    }
  },
  "api": {
    "operation": "prompt_sent",
    "service": {
      "name": "anthropic",
      "uid": "claude-sonnet-4-6"
    }
  },
  "observables": [
    { "name": "user_id", "type_id": 4, "type": "User", "value": "usr_01HZ_ALICE" },
    { "name": "src_ip", "type_id": 2, "type": "IP Address", "value": "198.51.100.42" },
    { "name": "provider", "type_id": 99, "type": "Other", "value": "anthropic" }
  ],
  "unmapped": {
    "provider": "anthropic",
    "model_id": "claude-sonnet-4-6",
    "token_count_input": 312,
    "token_count_output": 847,
    "cost_estimate": 0.0024,
    "latency_ms": 1840
  }
}
```

## Splunk HEC connector

The Splunk HEC connector is a P0 production connector. It delivers events to your Splunk instance via the HTTP Event Collector endpoint with batching, retry, and a dead letter fallback.
### How it works

- Events are accumulated in an internal buffer (default: up to 100 events or 5 seconds, whichever comes first)
- Batches are sent to the HEC endpoint with an `Authorization: Splunk <token>` header
- On HTTP 429 or 503, the connector retries with exponential backoff (up to 3 attempts by default)
- If all retries fail, events are written to a dead letter JSONL file at `/var/log/arbitex/splunk_dead_letter.jsonl`
- Each event is wrapped in a HEC envelope with `sourcetype: "arbitex:ocsf"` for accurate indexing
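The envelope step can be sketched as follows, assuming the documented defaults (sourcetype `arbitex:ocsf`, index `arbitex`, source `arbitex:audit`); the actual connector may add fields. Note that Splunk HEC expects `time` in epoch seconds, while OCSF `time` is epoch milliseconds:

```python
def to_hec_envelope(ocsf_event: dict, index: str = "arbitex",
                    source: str = "arbitex:audit") -> dict:
    """Wrap an OCSF event in a Splunk HEC envelope (illustrative sketch).

    Converts the OCSF millisecond timestamp to the epoch seconds that
    HEC expects in its `time` field.
    """
    return {
        "time": ocsf_event["time"] / 1000.0,   # ms -> s
        "sourcetype": "arbitex:ocsf",
        "source": source,
        "index": index,
        "event": ocsf_event,                    # the OCSF payload itself
    }
```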
### Configuration

Set the following environment variables on your Arbitex deployment:
| Variable | Required | Default | Description |
|---|---|---|---|
| `SPLUNK_HEC_URL` | Yes | — | HEC endpoint, e.g. `https://splunk.example.com:8088/services/collector` |
| `SPLUNK_HEC_TOKEN` | Yes | — | HEC authentication token |
| `SPLUNK_HEC_INDEX` | No | `arbitex` | Target Splunk index |
| `SPLUNK_HEC_SOURCE` | No | `arbitex:audit` | Event source name |
| `SPLUNK_HEC_BATCH_SIZE` | No | 100 | Maximum events per batch |
| `SPLUNK_HEC_FLUSH_INTERVAL` | No | 5 | Maximum seconds between flushes |
| `SPLUNK_HEC_MAX_RETRIES` | No | 3 | Maximum retry attempts on transient failures |
| `SPLUNK_HEC_DEAD_LETTER_PATH` | No | `/var/log/arbitex/splunk_dead_letter.jsonl` | Dead letter fallback path |
### Splunk prerequisites

- Enable the HTTP Event Collector in Splunk Web (Settings → Data Inputs → HTTP Event Collector)
- Create a new HEC token with Source type set to `arbitex:ocsf`
- Ensure the target index exists and the token has write access to it
- If your Splunk instance uses a self-signed certificate, either add the cert to your Arbitex deployment's trust store or configure Splunk HEC with a valid certificate
### Verifying connectivity

In the admin UI, navigate to Admin → SIEM. The connector list shows the Splunk HEC connector with a status badge:

- Healthy — endpoint is reachable and the token is valid
- Degraded — endpoint is reachable but returning unexpected responses
- Error — connection failed or the token is invalid
- Not configured — `SPLUNK_HEC_URL` or `SPLUNK_HEC_TOKEN` is not set
Click Send test event next to the Splunk HEC connector to send a synthetic OCSF event immediately (bypassing the buffer). Check your Splunk index for an event with `action: "siem_test_event"`.
## Microsoft Sentinel connector

The Sentinel connector is a P0 production connector. It delivers events to Microsoft Sentinel via the Azure Monitor Log Ingestion API (DCR-based, not the legacy Log Analytics Data Collector API). Authentication uses the Azure AD client credentials flow with automatic token caching.
### How it works

- Token acquisition uses the `https://monitor.azure.com/.default` scope via the Azure AD v2.0 endpoint
- Tokens are cached until expiry (minus a 60-second safety margin)
- Events are batched and sent as JSON arrays to the DCR ingestion endpoint
- Retry and dead letter behavior is identical to the Splunk HEC connector
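The caching behavior above can be sketched as a small wrapper. `fetch_token` stands in for the client-credentials POST to the Azure AD v2.0 token endpoint with the `https://monitor.azure.com/.default` scope; the class name and structure are hypothetical:

```python
import time


class TokenCache:
    """Sketch of client-credentials token caching with a 60-second margin.

    `fetch_token` is a callable returning (access_token, expires_in_seconds);
    in practice it would POST to
    https://login.microsoftonline.com/{tenant}/oauth2/v2.0/token.
    """

    SAFETY_MARGIN = 60  # refresh this many seconds before actual expiry

    def __init__(self, fetch_token):
        self.fetch_token = fetch_token
        self._token = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = time.monotonic()
        if self._token is None or now >= self._expires_at:
            token, expires_in = self.fetch_token()
            self._token = token
            self._expires_at = now + expires_in - self.SAFETY_MARGIN
        return self._token
```

The safety margin avoids sending a request with a token that expires while the batch is in flight.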
### Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| `SENTINEL_TENANT_ID` | Yes | — | Azure AD tenant ID |
| `SENTINEL_CLIENT_ID` | Yes | — | Azure AD application (client) ID |
| `SENTINEL_CLIENT_SECRET` | Yes | — | Azure AD application client secret |
| `SENTINEL_DCE_ENDPOINT` | Yes | — | Data Collection Endpoint URL |
| `SENTINEL_DCR_IMMUTABLE_ID` | Yes | — | Data Collection Rule immutable ID |
| `SENTINEL_STREAM_NAME` | No | `Custom-ArbitexOCSF_CL` | DCR stream name |
| `SENTINEL_BATCH_SIZE` | No | 100 | Maximum events per batch |
| `SENTINEL_FLUSH_INTERVAL` | No | 5 | Maximum seconds between flushes |
| `SENTINEL_MAX_RETRIES` | No | 3 | Maximum retry attempts |
### Azure prerequisites

- Create an Azure AD application registration; note the tenant ID and client ID, and create a client secret
- Create a Data Collection Endpoint (DCE) in your Azure Monitor workspace
- Create a Data Collection Rule (DCR) with a custom stream (`Custom-ArbitexOCSF_CL`) targeting a Log Analytics workspace
- Assign the Monitoring Metrics Publisher role to your app registration on the DCR resource
- Note the DCE endpoint URL and the DCR immutable ID
The DCR must define a schema that matches the OCSF fields you want to query in Sentinel. At minimum, include `time_t`, `user_id_s`, `action_s`, `severity_s`, and `class_uid_d`.
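For orientation, the stream declaration fragment of such a DCR resource might look like the sketch below for the minimum columns listed above; this is an assumption about a reasonable schema, and your workspace may need additional columns:

```json
{
  "streamDeclarations": {
    "Custom-ArbitexOCSF_CL": {
      "columns": [
        { "name": "time_t", "type": "datetime" },
        { "name": "user_id_s", "type": "string" },
        { "name": "action_s", "type": "string" },
        { "name": "severity_s", "type": "string" },
        { "name": "class_uid_d", "type": "real" }
      ]
    }
  }
}
```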
### Verifying connectivity

The Sentinel connector health check attempts Azure AD token acquisition. If the token is acquired successfully, the status is Healthy. If the token request fails (invalid client secret, wrong tenant, missing permissions), the status is Error. Click Send test event to send a synthetic event and verify end-to-end delivery.
## Stub connectors

IBM QRadar is registered and health-checked but does not deliver events. Its configuration is read from the `QRADAR_HOST` environment variable. If the variable is not set, the connector reports Not configured.
| Connector | ID | Status |
|---|---|---|
| IBM QRadar | qradar | Stub — not delivering |
QRadar will be promoted to P0 in a future release. Contact support if you need this integration on an accelerated timeline.
## Admin UI — SIEM settings

Navigate to Admin → SIEM to manage connectors.
### Connector list

The connector list shows all registered connectors with:
- Current health status (Healthy / Degraded / Error / Not configured)
- Connector type and non-sensitive config summary (URL, index, batch size — never secrets)
- Send test event button
### Overall health summary

The SIEM health card at the top of the page shows aggregate counts:

```json
{
  "healthy": 5,
  "degraded": 0,
  "error": 0,
  "not_configured": 1,
  "total": 6
}
```

Use this to confirm that your configured connectors are reachable before going to production.
## API endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | `/api/admin/siem/connectors` | List all connectors with status |
| GET | `/api/admin/siem/health` | Aggregate health summary |
| POST | `/api/admin/siem/test/{connector_id}` | Send a test event to a connector |
All SIEM endpoints require an admin API key.
```shell
# List connectors
curl https://api.arbitex.ai/api/admin/siem/connectors \
  -H "Authorization: Bearer arb_live_your-api-key-here"

# Send a test event to Splunk HEC
curl -X POST https://api.arbitex.ai/api/admin/siem/test/splunk_hec \
  -H "Authorization: Bearer arb_live_your-api-key-here"
```

## Dead letter recovery
When a connector fails to deliver events after all retries, events are written to a dead letter JSONL file on disk. Each line is a JSON object:
```json
{
  "event": { ... },
  "error": "HTTP 503: Service Unavailable",
  "connector": "splunk_hec",
  "timestamp": 1741564800.0
}
```

To replay dead letter events after the SIEM is restored, parse the JSONL file and re-submit events to your SIEM directly, or contact Arbitex support for assisted recovery.
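A replay sketch under the dead letter format above; `submit` is any callable that delivers one event (for example, a function that POSTs to your SIEM's ingestion endpoint), and error handling is intentionally minimal:

```python
import json


def replay_dead_letters(path: str, submit) -> int:
    """Re-submit dead-lettered events from a JSONL file.

    Illustrative helper: reads each line, extracts the original OCSF
    event, and hands it to `submit`. Returns the number of events replayed.
    """
    replayed = 0
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue                      # skip blank lines
            record = json.loads(line)
            submit(record["event"])           # deliver the original event
            replayed += 1
    return replayed
```

In practice you would also rename or truncate the file after a successful replay so the same events are not re-submitted twice.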
## See also

- Splunk HEC connector — detailed Splunk setup guide
- Microsoft Sentinel connector — detailed Sentinel setup guide
- Elasticsearch connector — Bulk API setup guide
- Datadog Logs connector — Logs Intake API v2 setup guide
- Sumo Logic connector — HTTP Source setup guide
- Audit Log — querying and exporting the Arbitex audit log
- Policy Engine overview — the `siem_forwarded` field in audit entries