
SIEM integration

Arbitex forwards audit events to your SIEM in real time using configurable connectors. All events use the Open Cybersecurity Schema Framework (OCSF) v1.1 format, which maps to standard SIEM parsers and dashboards without custom transforms.

Five of six connectors are fully functional: Splunk HEC, Microsoft Sentinel, Elasticsearch, Datadog, and Sumo Logic. IBM QRadar has a stub implementation — it is recognized by the system and health-checked, but does not yet deliver events. QRadar promotion is deferred to a future release.


| Connector | ID | Status | Protocol | Auth | Guide |
| --- | --- | --- | --- | --- | --- |
| Splunk HEC | `splunk_hec` | P0 — Fully functional | HTTP Event Collector (JSON batches) | HEC token | Splunk guide |
| Microsoft Sentinel | `sentinel` | P0 — Fully functional | Azure Monitor Log Ingestion API (DCR) | Azure AD client credentials | Sentinel guide |
| Elasticsearch | `elastic` | P0 — Fully functional | Bulk API (NDJSON) | API key or basic auth; Cloud ID support | Elastic guide |
| Datadog | `datadog` | P0 — Fully functional | Logs Intake API v2 (JSON array) | API key; multi-site | Datadog guide |
| Sumo Logic | `sumo_logic` | P0 — Fully functional | HTTP Source (NDJSON) | URL-embedded auth | Sumo Logic guide |
| IBM QRadar | `qradar` | Stub — not delivering | | | |

All P0 connectors share the same reliability model: batched delivery (up to 100 events or 5 seconds), exponential backoff retry on 429/503, and dead letter JSONL fallback on persistent failure.
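As an illustration, the retry-and-spill behavior could be sketched like this. This is a sketch, not Arbitex source; `send_batch`, its return-an-HTTP-status convention, and the function name are hypothetical, while the retry count, retryable statuses, and dead letter path come from this page:

```python
import json
import time

MAX_RETRIES = 3         # documented default
RETRYABLE = {429, 503}  # transient statuses that trigger exponential backoff

def deliver_with_retry(send_batch, events, connector="splunk_hec",
                       dead_letter_path="/var/log/arbitex/splunk_dead_letter.jsonl"):
    """Deliver one batch; spill to dead letter JSONL if every attempt fails."""
    for attempt in range(MAX_RETRIES):
        status = send_batch(events)   # hypothetical sender: returns HTTP status code
        if status not in RETRYABLE:
            return status
        if attempt < MAX_RETRIES - 1:
            time.sleep(2 ** attempt)  # backoff: 1s, then 2s
    # Retries exhausted: append one JSON object per line (JSONL)
    with open(dead_letter_path, "a") as fh:
        for event in events:
            fh.write(json.dumps({
                "event": event,
                "error": f"HTTP {status}",
                "connector": connector,
                "timestamp": time.time(),
            }) + "\n")
    return status
```

The dead letter entries written here match the format shown in the dead letter queue section at the end of this page.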


All events emitted by Arbitex use OCSF v1.1. Events are classified into four OCSF classes based on the audit action type:

| OCSF Class | Class UID | Used for |
| --- | --- | --- |
| API Activity | 6003 | Prompt/response events, model calls, stream activity |
| Security Finding | 2001 | DLP blocks, DLP redactions, credential intelligence hits |
| Authentication | 3002 | Login, logout, MFA, SSO, token events |
| Account Change | 3004 | Admin config changes, user provisioning, API key operations, policy changes |

All classes share the following base fields:

| Field | Type | Description |
| --- | --- | --- |
| `class_uid` | integer | OCSF class UID (2001, 3002, 3004, or 6003) |
| `class_name` | string | Human-readable class name |
| `time` | integer | Event timestamp in epoch milliseconds |
| `message` | string | Human-readable event summary |
| `severity_id` | integer | 0=Unknown, 1=Informational, 2=Low, 3=Medium, 4=High, 5=Critical |
| `severity` | string | Severity label |
| `activity_id` | integer | Activity type (1=Create, 2=Read, 3=Update, 4=Delete, 99=Other) |
| `metadata.version` | string | `"1.1.0"` |
| `metadata.product.name` | string | `"Arbitex"` |
| `actor.user.uid` | string | User ID of the actor |
| `actor.user.org_uid` | string | Tenant ID |
| `src_endpoint.ip` | string | Source IP address (if available) |
| `src_endpoint.location` | object | Country, city, region from GeoIP enrichment |
| `observables` | array | Extracted indicators: user_id, src_ip, model_id, provider |

API Activity (class 6003) additional fields

| Field | Description |
| --- | --- |
| `api.operation` | The Arbitex audit action (e.g., `prompt_sent`, `response_received`) |
| `api.service.name` | AI provider name |
| `api.service.uid` | Model identifier |
| `unmapped.token_count_input` | Input token count |
| `unmapped.token_count_output` | Output token count |
| `unmapped.cost_estimate` | Estimated cost in USD |
| `unmapped.latency_ms` | End-to-end latency in milliseconds |

Security Finding (class 2001) additional fields

| Field | Description |
| --- | --- |
| `finding.title` | Finding description (e.g., `DLP: dlp_block`) |
| `finding.uid` | Unique finding ID |
| `finding.types` | `["DLP"]` |
| `finding.related_events` | Linked rule IDs and CredInt hit data |
| `unmapped.credint_confidence` | CredInt hit confidence score |
A complete API Activity (class 6003) event looks like this:

```json
{
  "class_uid": 6003,
  "class_name": "Api Activity",
  "time": 1741564800000,
  "message": "Arbitex audit: prompt_sent",
  "severity_id": 1,
  "severity": "Informational",
  "activity_id": 1,
  "activity_name": "Send",
  "metadata": {
    "version": "1.1.0",
    "product": {
      "name": "Arbitex",
      "vendor_name": "Arbitex",
      "uid": "arbitex-platform"
    },
    "log_name": "arbitex_audit_log",
    "log_provider": "Arbitex Platform"
  },
  "actor": {
    "user": {
      "uid": "usr_01HZ_ALICE",
      "type_id": 1,
      "type": "User",
      "org_uid": "org_acme"
    }
  },
  "device": {
    "type_id": 6,
    "type": "Server",
    "hostname": "api.arbitex.ai"
  },
  "src_endpoint": {
    "ip": "198.51.100.42",
    "location": {
      "country": "US",
      "city": "New York",
      "region": "NY"
    }
  },
  "api": {
    "operation": "prompt_sent",
    "service": {
      "name": "anthropic",
      "uid": "claude-sonnet-4-6"
    }
  },
  "observables": [
    { "name": "user_id", "type_id": 4, "type": "User", "value": "usr_01HZ_ALICE" },
    { "name": "src_ip", "type_id": 2, "type": "IP Address", "value": "198.51.100.42" },
    { "name": "provider", "type_id": 99, "type": "Other", "value": "anthropic" }
  ],
  "unmapped": {
    "provider": "anthropic",
    "model_id": "claude-sonnet-4-6",
    "token_count_input": 312,
    "token_count_output": 847,
    "cost_estimate": 0.0024,
    "latency_ms": 1840
  }
}
```
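For comparison, a Security Finding (class 2001) event carries the finding fields from the table above. The following is an illustrative sketch rather than captured output: the finding ID and confidence value are invented, and the shared base fields (`metadata`, `actor`, `src_endpoint`, `observables`) are elided for brevity.

```json
{
  "class_uid": 2001,
  "class_name": "Security Finding",
  "time": 1741564800000,
  "message": "Arbitex audit: dlp_block",
  "severity_id": 4,
  "severity": "High",
  "finding": {
    "title": "DLP: dlp_block",
    "uid": "fnd_01HZ_EXAMPLE",
    "types": ["DLP"]
  },
  "unmapped": {
    "credint_confidence": 0.92
  }
}
```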

The Splunk HEC connector is a P0 production connector. It delivers events to your Splunk instance via the HTTP Event Collector endpoint with batching, retry, and a dead letter fallback.

  • Events are accumulated in an internal buffer (default: up to 100 events or 5 seconds, whichever comes first)
  • Batches are sent to the HEC endpoint with Authorization: Splunk <token> header
  • On HTTP 429 or 503, the connector retries with exponential backoff (up to 3 attempts by default)
  • If all retries fail, events are written to a dead letter JSONL file at /var/log/arbitex/splunk_dead_letter.jsonl
  • Each event is wrapped in a HEC envelope with sourcetype: "arbitex:ocsf" for accurate indexing
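The envelope step above can be sketched as follows. The keys `time`, `index`, `source`, `sourcetype`, and `event` are standard HEC metadata fields, but the helper itself is hypothetical, not Arbitex source:

```python
import json

def to_hec_envelope(ocsf_event, index="arbitex", source="arbitex:audit"):
    """Wrap one OCSF event in the Splunk HEC envelope described above."""
    return {
        "time": ocsf_event["time"] / 1000,  # OCSF uses epoch ms; HEC expects seconds
        "index": index,
        "source": source,
        "sourcetype": "arbitex:ocsf",
        "event": ocsf_event,
    }

def batch_payload(events):
    # HEC accepts multiple envelopes in one request as newline-joined JSON,
    # POSTed with the header "Authorization: Splunk <token>".
    return "\n".join(json.dumps(to_hec_envelope(e)) for e in events)
```

The `index` and `source` defaults here mirror the `SPLUNK_HEC_INDEX` and `SPLUNK_HEC_SOURCE` defaults in the configuration table below.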

Set the following environment variables on your Arbitex deployment:

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `SPLUNK_HEC_URL` | Yes | | HEC endpoint, e.g. `https://splunk.example.com:8088/services/collector` |
| `SPLUNK_HEC_TOKEN` | Yes | | HEC authentication token |
| `SPLUNK_HEC_INDEX` | No | `arbitex` | Target Splunk index |
| `SPLUNK_HEC_SOURCE` | No | `arbitex:audit` | Event source name |
| `SPLUNK_HEC_BATCH_SIZE` | No | `100` | Maximum events per batch |
| `SPLUNK_HEC_FLUSH_INTERVAL` | No | `5` | Maximum seconds between flushes |
| `SPLUNK_HEC_MAX_RETRIES` | No | `3` | Maximum retry attempts on transient failures |
| `SPLUNK_HEC_DEAD_LETTER_PATH` | No | `/var/log/arbitex/splunk_dead_letter.jsonl` | Dead letter fallback path |
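For example, a minimal configuration might look like this (placeholder values; only the first two variables are required):

```sh
# Required
export SPLUNK_HEC_URL="https://splunk.example.com:8088/services/collector"
export SPLUNK_HEC_TOKEN="your-hec-token"

# Optional overrides (defaults shown)
export SPLUNK_HEC_INDEX="arbitex"
export SPLUNK_HEC_BATCH_SIZE=100
export SPLUNK_HEC_FLUSH_INTERVAL=5
```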
  1. Enable the HTTP Event Collector in Splunk Web (Settings → Data Inputs → HTTP Event Collector)
  2. Create a new HEC token with Source type set to arbitex:ocsf
  3. Ensure the target index exists and the token has write access to it
  4. If your Splunk instance uses a self-signed certificate, either add the cert to your Arbitex deployment’s trust store or configure Splunk HEC with a valid certificate

In the admin UI, navigate to Admin → SIEM. The connector list shows the Splunk HEC connector with a status badge:

  • Healthy — endpoint is reachable and token is valid
  • Degraded — endpoint is reachable but returning unexpected responses
  • Error — connection failed or token is invalid
  • Not configured — SPLUNK_HEC_URL or SPLUNK_HEC_TOKEN is not set

Click Send test event next to the Splunk HEC connector to send a synthetic OCSF event immediately (bypassing the buffer). Check your Splunk index for an event with action: "siem_test_event".


The Sentinel connector is a P0 production connector. It delivers events to Azure Sentinel via the Azure Monitor Log Ingestion API (DCR-based, not the legacy Log Analytics Data Collector API). Authentication uses Azure AD client credentials flow with automatic token caching.

  • Token acquisition uses the https://monitor.azure.com/.default scope via the Azure AD v2.0 endpoint
  • Tokens are cached until expiry (minus a 60-second safety margin)
  • Events are batched and sent as JSON arrays to the DCR ingestion endpoint
  • Retry and dead letter behavior is identical to the Splunk HEC connector
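The token-caching behavior above can be sketched as follows. This is illustrative only: `fetch_token` stands in for the actual client-credentials POST to the Azure AD v2.0 token endpoint with scope `https://monitor.azure.com/.default`, and the class itself is hypothetical:

```python
import time

class TokenCache:
    """Cache a token until 60 seconds before its expiry, per the margin above."""
    SAFETY_MARGIN = 60

    def __init__(self, fetch_token):
        self._fetch = fetch_token   # returns (access_token, expires_in_seconds)
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Refetch only when no token is held or the safety margin is reached
        if self._token is None or time.time() >= self._expires_at - self.SAFETY_MARGIN:
            self._token, expires_in = self._fetch()
            self._expires_at = time.time() + expires_in
        return self._token
```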
Set the following environment variables on your Arbitex deployment:

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `SENTINEL_TENANT_ID` | Yes | | Azure AD tenant ID |
| `SENTINEL_CLIENT_ID` | Yes | | Azure AD application (client) ID |
| `SENTINEL_CLIENT_SECRET` | Yes | | Azure AD application client secret |
| `SENTINEL_DCE_ENDPOINT` | Yes | | Data Collection Endpoint URL |
| `SENTINEL_DCR_IMMUTABLE_ID` | Yes | | Data Collection Rule immutable ID |
| `SENTINEL_STREAM_NAME` | No | `Custom-ArbitexOCSF_CL` | DCR stream name |
| `SENTINEL_BATCH_SIZE` | No | `100` | Maximum events per batch |
| `SENTINEL_FLUSH_INTERVAL` | No | `5` | Maximum seconds between flushes |
| `SENTINEL_MAX_RETRIES` | No | `3` | Maximum retry attempts |
  1. Create an Azure AD application registration; note the tenant ID and client ID, and create a client secret
  2. Create a Data Collection Endpoint (DCE) in your Azure Monitor workspace
  3. Create a Data Collection Rule (DCR) with a custom stream (Custom-ArbitexOCSF_CL) targeting a Log Analytics workspace
  4. Assign the Monitoring Metrics Publisher role to your app registration on the DCR resource
  5. Note the DCE endpoint URL and the DCR immutable ID

The DCR must define a schema that matches the OCSF fields you want to query in Sentinel. At minimum, include time_t, user_id_s, action_s, severity_s, and class_uid_d.
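To make the mapping concrete, here is a sketch of projecting an OCSF event onto that minimal schema. The helper is hypothetical; the `_t`/`_s`/`_d` suffixes follow the Log Analytics column-type convention (datetime, string, double), and sourcing `action_s` from `api.operation` is an assumption:

```python
from datetime import datetime, timezone

def to_dcr_row(event):
    """Project an OCSF event onto the minimal custom-stream columns above."""
    actor = event.get("actor", {}).get("user", {})
    return {
        "time_t": datetime.fromtimestamp(event["time"] / 1000,
                                         tz=timezone.utc).isoformat(),
        "user_id_s": actor.get("uid"),
        "action_s": event.get("api", {}).get("operation"),  # assumption: audit action
        "severity_s": event.get("severity"),
        "class_uid_d": float(event["class_uid"]),
    }
```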

The Sentinel connector health check attempts Azure AD token acquisition. If the token is acquired successfully, status is Healthy. If the token request fails (invalid client secret, wrong tenant, missing permissions), status is Error. Click Send test event to send a synthetic event and verify end-to-end delivery.


IBM QRadar is registered and health-checked but does not deliver events. Its configuration is read from the QRADAR_HOST environment variable. If the variable is not set, the connector reports Not configured.

| Connector | ID | Status |
| --- | --- | --- |
| IBM QRadar | `qradar` | Stub — not delivering |

QRadar will be promoted to P0 in a future release. Contact support if you need this integration on an accelerated timeline.


Navigate to Admin → SIEM to manage connectors.

The connector list shows all registered connectors with:

  • Current health status (Healthy / Degraded / Error / Not configured)
  • Connector type and non-sensitive config summary (URL, index, batch size — never secrets)
  • Send test event button

The SIEM health card at the top of the page shows aggregate counts:

```json
{
  "healthy": 5,
  "degraded": 0,
  "error": 0,
  "not_configured": 1,
  "total": 6
}
```

Use this to confirm that your configured connectors are reachable before going to production.
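A pre-production gate could parse this payload and fail on any degraded or erroring connector. A sketch, assuming the aggregate shape shown above (the helper itself is hypothetical):

```python
def siem_problems(health):
    """Return human-readable problems from the aggregate health payload."""
    problems = []
    if health.get("degraded", 0):
        problems.append(f'{health["degraded"]} connector(s) degraded')
    if health.get("error", 0):
        problems.append(f'{health["error"]} connector(s) in error')
    return problems
```

In practice you would fetch the payload from the health endpoint listed below with an admin API key, then block the rollout if `siem_problems()` returns anything.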

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/api/admin/siem/connectors` | List all connectors with status |
| GET | `/api/admin/siem/health` | Aggregate health summary |
| POST | `/api/admin/siem/test/{connector_id}` | Send a test event to a connector |

All SIEM endpoints require an admin API key.

```sh
# List connectors
curl https://api.arbitex.ai/api/admin/siem/connectors \
  -H "Authorization: Bearer arb_live_your-api-key-here"

# Send a test event to Splunk HEC
curl -X POST https://api.arbitex.ai/api/admin/siem/test/splunk_hec \
  -H "Authorization: Bearer arb_live_your-api-key-here"
```

When a connector fails to deliver events after all retries, events are written to a dead letter JSONL file on disk. Each line is a JSON object:

```json
{
  "event": { ... },
  "error": "HTTP 503: Service Unavailable",
  "connector": "splunk_hec",
  "timestamp": 1741564800.0
}
```

To replay dead letter events after the SIEM is restored, parse the JSONL file and re-submit events to your SIEM directly or contact Arbitex support for assisted recovery.
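A replay script only needs to parse the JSONL and re-submit each `event` object. A sketch (the helper is hypothetical; re-submission itself depends on your SIEM's ingest API):

```python
import json

def load_dead_letters(path, connector=None):
    """Return the original OCSF events from a dead letter JSONL file,
    optionally filtered to a single connector id."""
    events = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            entry = json.loads(line)
            if connector is None or entry["connector"] == connector:
                events.append(entry["event"])
    return events
```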