SIEM Integration — All Connectors

Arbitex supports 7 SIEM connectors for forwarding audit events from the platform to your security information and event management system. This guide covers configuration, authentication, event format, and delivery mechanics for each connector. For a comparison of Platform connectors vs Outpost direct sink (air-gap), see SIEM integration guide.

Source: backend/app/services/siem/


All platform SIEM connectors:

  • Operate on the server side — events are forwarded from the Arbitex Platform, not the Outpost
  • Emit events in OCSF v1.1 format by default (connector-specific envelopes vary)
  • Batch up to 100 events per delivery with a maximum batch interval of 5 seconds
  • Retry failed deliveries with exponential backoff (initial 1s, max 300s, 5 attempts)
  • Write undeliverable events to a JSONL dead letter file at SIEM_DEAD_LETTER_PATH
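The batching and retry schedule above can be sketched as follows. This is an illustrative sketch, not the platform's actual code; the function names are assumptions (the real implementation lives in backend/app/services/siem/).

```python
from typing import Iterator

# Illustrative sketch of the shared connector behavior described above.
# Names are assumptions; the real code lives in backend/app/services/siem/.

def batch(events: list, max_events: int = 100) -> Iterator[list]:
    """Yield delivery batches of at most max_events events each."""
    for i in range(0, len(events), max_events):
        yield events[i:i + max_events]

def backoff_delays(initial: float = 1.0, cap: float = 300.0,
                   attempts: int = 5) -> list[float]:
    """Sleep interval before each retry: doubles from `initial`, capped at `cap`."""
    return [min(initial * 2 ** i, cap) for i in range(attempts)]
```

With the documented defaults (1 s initial, 300 s cap, 5 attempts) the delays are 1, 2, 4, 8, and 16 seconds; the 300 s cap only binds if the attempt count is raised.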

Source: backend/app/services/siem/splunk.py

Splunk HTTP Event Collector (HEC) is the primary integration path for Splunk Cloud and Splunk Enterprise.

| Environment variable | Required | Description |
| --- | --- | --- |
| SPLUNK_HEC_URL | Yes | HEC endpoint, e.g. https://splunk.corp.example.com:8088/services/collector/event |
| SPLUNK_HEC_TOKEN | Yes | HEC authentication token |
| SPLUNK_HEC_INDEX | No | Target index. Default: arbitex |
| SPLUNK_HEC_SOURCE | No | source field value. Default: arbitex-platform |
| SPLUNK_HEC_SOURCETYPE | No | sourcetype field value. Default: arbitex:audit |
| SPLUNK_HEC_TLS_VERIFY | No | Verify TLS certificate. Default: true. Set false for self-signed certs (not recommended in production). |

Bearer token via Authorization: Splunk <token> header. The token is created in Splunk under Settings > Data Inputs > HTTP Event Collector.

Each delivery is a standard HEC JSON batch:

{
  "time": 1741824000.123,
  "host": "platform.arbitex.ai",
  "source": "arbitex-platform",
  "sourcetype": "arbitex:audit",
  "index": "arbitex",
  "event": {
    "class_uid": 6003,
    "category_uid": 6,
    "activity_id": 1,
    "time": 1741824000123,
    "severity_id": 1,
    "actor": {
      "user": { "uid": "user_abc", "email_addr": "alice@example.com" }
    },
    "api": {
      "request": { "uid": "req_xyz" },
      "operation": "chat_completion"
    },
    "metadata": {
      "version": "1.1.0",
      "product": { "name": "Arbitex", "vendor_name": "Arbitex" }
    }
  }
}

Multiple events are sent as newline-delimited JSON objects in a single HTTP POST to the HEC endpoint.

HTTP POST with Content-Type: application/json. Batches of up to 100 events. TLS 1.2+ required.
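A batch builder for the envelope above can be sketched like this; the helper name is illustrative, not the connector's actual API. Note the unit conversion: the OCSF time field is Unix milliseconds, while the HEC envelope's time is seconds.

```python
import json

def build_hec_batch(events: list[dict], host: str = "platform.arbitex.ai",
                    index: str = "arbitex", source: str = "arbitex-platform",
                    sourcetype: str = "arbitex:audit") -> str:
    """Wrap OCSF events in HEC envelopes, newline-delimited for one POST.

    Sketch only. OCSF `time` is Unix milliseconds; HEC expects seconds,
    so it is divided by 1000.
    """
    return "\n".join(
        json.dumps({
            "time": ev["time"] / 1000.0,
            "host": host,
            "source": source,
            "sourcetype": sourcetype,
            "index": index,
            "event": ev,
        })
        for ev in events
    )
```

The resulting body is POSTed to the HEC endpoint with an Authorization: Splunk <token> header.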


Source: backend/app/services/siem/sentinel.py

Microsoft Sentinel integration uses the Data Collection Rules (DCR) API (Logs Ingestion API), which is the current Sentinel ingestion method for custom tables.

| Environment variable | Required | Description |
| --- | --- | --- |
| SENTINEL_TENANT_ID | Yes | Azure AD tenant ID |
| SENTINEL_CLIENT_ID | Yes | App registration client ID |
| SENTINEL_CLIENT_SECRET | Yes | App registration client secret |
| SENTINEL_DCR_ENDPOINT | Yes | Data collection endpoint URL (from DCR configuration) |
| SENTINEL_DCR_RULE_ID | Yes | Data collection rule immutable ID (from DCR configuration) |
| SENTINEL_DCR_STREAM_NAME | Yes | Custom stream name in DCR (e.g., Custom-ArbitexAudit_CL) |

OAuth 2.0 client credentials flow against https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token. Scope: https://monitor.azure.com/.default. Access tokens are cached and refreshed 5 minutes before expiry.

The app registration requires the Monitoring Metrics Publisher role on the Data Collection Endpoint resource.

Events are uploaded as a JSON array matching the custom table schema. The connector maps OCSF fields to the DCR stream schema:

[
  {
    "TimeGenerated": "2025-03-13T00:00:00.123Z",
    "EventId": "req_xyz",
    "UserId": "user_abc",
    "UserEmail": "alice@example.com",
    "Operation": "chat_completion",
    "ModelId": "gpt-4o",
    "PolicyAction": "allow",
    "DlpLabels": "[]",
    "TokensInput": 512,
    "TokensOutput": 256,
    "SeverityId": 1,
    "RawOCSF": "{...}"
  }
]

RawOCSF contains the full OCSF v1.1 event as a JSON string for use in Sentinel analytics rules that need the complete event.

HTTP POST to the DCR endpoint at {endpoint}/dataCollectionRules/{rule_id}/streams/{stream_name}?api-version=2023-01-01. Maximum payload: 1 MB per batch. Batches exceeding 1 MB are split automatically.
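The 1 MB split described above can be sketched with a greedy size-based splitter. This is an assumption about the mechanism, not the connector's actual code; a single row larger than the limit still forms its own batch here.

```python
import json

def split_by_size(rows: list[dict], max_bytes: int = 1_000_000) -> list[list[dict]]:
    """Greedily split rows into JSON-array payloads of at most max_bytes each.

    Sketch of the batch-splitting behavior; the real splitter may differ.
    """
    batches: list[list[dict]] = []
    current: list[dict] = []
    size = 2  # the enclosing "[]"
    for row in rows:
        encoded = len(json.dumps(row))
        extra = encoded + (1 if current else 0)  # "," separator between rows
        if current and size + extra > max_bytes:
            batches.append(current)
            current, size = [], 2
            extra = encoded
        current.append(row)
        size += extra
    if current:
        batches.append(current)
    return batches
```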


Source: backend/app/services/siem/elastic.py

Elastic integration uses the Elasticsearch Bulk API for high-throughput ingestion to Elastic Cloud or self-hosted Elasticsearch.

| Environment variable | Required | Description |
| --- | --- | --- |
| ELASTIC_URL | Yes | Elasticsearch base URL, e.g. https://my-cluster.es.io:9243 |
| ELASTIC_API_KEY | Yes | Base64-encoded Elasticsearch API key (id:api_key format) |
| ELASTIC_INDEX | No | Target index or data stream name. Default: arbitex-audit |
| ELASTIC_PIPELINE | No | Ingest pipeline to apply. Optional. |
| ELASTIC_TLS_VERIFY | No | Verify TLS. Default: true. |

Authorization: ApiKey <base64(id:api_key)> header. Create the API key in Kibana under Stack Management > API Keys with index privilege on the target index.

Bulk API format with index action and OCSF event body:

{ "index": { "_index": "arbitex-audit" } }
{ "class_uid": 6003, "time": 1741824000123, "actor": {...}, ... }
{ "index": { "_index": "arbitex-audit" } }
{ ... }

Each batch is a newline-delimited bulk request body. The @timestamp field is added automatically from the OCSF time field (Unix milliseconds converted to ISO 8601).

HTTP POST to {ELASTIC_URL}/_bulk. Content-Type: application/x-ndjson. Batch size: up to 100 events or 5 MB, whichever is smaller. The connector checks the errors field in the Bulk API response and routes failed documents to the dead letter file.
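A bulk body builder matching the format above can be sketched like this; the helper name is illustrative. It interleaves action and document lines and derives @timestamp from the OCSF time field as described.

```python
import json
from datetime import datetime, timezone

def build_bulk_body(events: list[dict], index: str = "arbitex-audit") -> str:
    """Pair each OCSF event with an index action line (Bulk API ndjson).

    Sketch: adds @timestamp from the OCSF `time` field
    (Unix milliseconds converted to ISO 8601), as described above.
    """
    lines = []
    for ev in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        doc = dict(ev)
        ts = datetime.fromtimestamp(ev["time"] / 1000, tz=timezone.utc)
        doc["@timestamp"] = ts.isoformat().replace("+00:00", "Z")
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the Bulk API requires a trailing newline
```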


Source: backend/app/services/siem/datadog.py

Datadog integration uses the Logs Intake v2 API (/api/v2/logs).

| Environment variable | Required | Description |
| --- | --- | --- |
| DATADOG_API_KEY | Yes | Datadog API key |
| DATADOG_SITE | No | Datadog site. Default: datadoghq.com. Options: datadoghq.com, us3.datadoghq.com, us5.datadoghq.com, datadoghq.eu, ap1.datadoghq.com |
| DATADOG_SERVICE | No | service tag. Default: arbitex |
| DATADOG_SOURCE | No | ddsource tag. Default: arbitex-audit |
| DATADOG_TAGS | No | Comma-separated key:value tags, e.g. env:production,team:security |

DD-API-KEY: <api_key> header. The API key is created in Datadog under Organization Settings > API Keys.

Array of Datadog log objects:

[
  {
    "ddsource": "arbitex-audit",
    "ddtags": "env:production,service:arbitex",
    "hostname": "platform.arbitex.ai",
    "service": "arbitex",
    "message": "{\"class_uid\":6003,\"time\":1741824000123,\"actor\":{...}}"
  }
]

The message field contains the full OCSF event as a JSON string. Datadog Log Management will parse it as JSON when a JSON parsing processor is configured in a Datadog pipeline.

HTTP POST to https://http-intake.logs.{DATADOG_SITE}/api/v2/logs. Content-Type: application/json. Maximum payload: 5 MB. Maximum 1000 log entries per batch (Datadog limit). The connector respects the 429 Too Many Requests response and applies backoff.
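Building the log array above can be sketched as follows; the helper name and defaults are illustrative. Each OCSF event is serialized into the message field as a compact JSON string.

```python
import json

def build_datadog_logs(events: list[dict], ddsource: str = "arbitex-audit",
                       service: str = "arbitex",
                       hostname: str = "platform.arbitex.ai",
                       tags: str = "") -> list[dict]:
    """Wrap each OCSF event in a Datadog log object (sketch).

    The full event goes into `message` as a JSON string, as described above.
    """
    return [
        {
            "ddsource": ddsource,
            "ddtags": tags,
            "hostname": hostname,
            "service": service,
            "message": json.dumps(ev, separators=(",", ":")),
        }
        for ev in events
    ]
```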


Source: backend/app/services/siem/sumo.py

Sumo Logic integration uses an HTTP Source (HTTP Logs & Metrics Collector).

| Environment variable | Required | Description |
| --- | --- | --- |
| SUMO_HTTP_SOURCE_URL | Yes | HTTP Source URL from Sumo Logic, e.g. https://endpoint.collection.sumologic.com/receiver/v1/http/<token> |
| SUMO_CATEGORY | No | Source category override. Default: arbitex/audit |
| SUMO_HOST | No | Source host override. Default: platform.arbitex.ai |

The HTTP Source URL contains the authentication token embedded in the path. No additional headers required. The URL is generated when you create the HTTP Source in Sumo Logic and acts as the credential.

Newline-delimited OCSF JSON events. Sumo Logic parses each newline as a separate log message:

{"class_uid":6003,"time":1741824000123,"actor":{"user":{"uid":"user_abc"}},...}
{"class_uid":6003,"time":1741824000456,"actor":{"user":{"uid":"user_def"}},...}

HTTP POST to the HTTP Source URL with Content-Type: application/json and a newline-delimited JSON body. The X-Sumo-Category and X-Sumo-Host headers are set from the configuration. Maximum payload: 1 MB per request.
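Assembling one such request can be sketched like this; the helper name is illustrative.

```python
import json

def build_sumo_request(events: list[dict], category: str = "arbitex/audit",
                       host: str = "platform.arbitex.ai") -> tuple[dict, str]:
    """Return (headers, body) for one HTTP Source POST (sketch).

    The body is newline-delimited OCSF JSON; category and host ride on
    the X-Sumo-* headers, as described above.
    """
    headers = {
        "Content-Type": "application/json",
        "X-Sumo-Category": category,
        "X-Sumo-Host": host,
    }
    body = "\n".join(json.dumps(ev, separators=(",", ":")) for ev in events)
    return headers, body
```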


Source: backend/app/services/siem/qradar.py

IBM QRadar integration uses CEF (Common Event Format) over syslog (RFC 5424). This is the native QRadar log source protocol for custom event sources.

| Environment variable | Required | Description |
| --- | --- | --- |
| QRADAR_SYSLOG_HOST | Yes | QRadar Console or Event Collector IP/hostname |
| QRADAR_SYSLOG_PORT | No | Syslog port. Default: 514 for UDP, 6514 for TLS |
| QRADAR_SYSLOG_PROTOCOL | No | Transport: udp, tcp, or tls. Default: tcp |
| QRADAR_CEF_DEVICE_VENDOR | No | CEF DeviceVendor. Default: Arbitex |
| QRADAR_CEF_DEVICE_PRODUCT | No | CEF DeviceProduct. Default: ArbitexPlatform |
| QRADAR_CEF_DEVICE_VERSION | No | CEF DeviceVersion. Default: 1.0 |
| QRADAR_TLS_CA_BUNDLE | No | Path to CA bundle for TLS syslog. Required when QRADAR_SYSLOG_PROTOCOL=tls. |

No application-level authentication for UDP/TCP syslog. For TLS, mutual TLS is used when QRADAR_TLS_CA_BUNDLE is configured; otherwise, server-only TLS certificate validation is performed.

CEF over RFC 5424 syslog. Each event is a syslog message with a CEF-formatted message body:

<134>1 2025-03-13T00:00:00.123Z platform.arbitex.ai ArbitexPlatform - req_xyz - CEF:0|Arbitex|ArbitexPlatform|1.0|6003|chat_completion|3|rt=1741824000123 suser=alice@example.com duid=user_abc act=allow model=gpt-4o tokenIn=512 tokenOut=256 cs1Label=PolicyAction cs1=allow cs2Label=DlpLabels cs2= cs3Label=GroupIds cs3=finance-team

CEF extension field mapping:

| CEF extension key | OCSF / Arbitex field |
| --- | --- |
| rt | time (Unix ms) |
| suser | actor.user.email_addr |
| duid | actor.user.uid |
| act | Policy action (allow/block/redact/warn) |
| model | Model identifier |
| tokenIn | Input token count |
| tokenOut | Output token count |
| cs1 / cs1Label | Policy action (labeled) |
| cs2 / cs2Label | DLP labels detected (comma-separated) |
| cs3 / cs3Label | User group IDs (comma-separated) |
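The CEF message body can be sketched as below, with a reduced field set and a fixed severity of 3. The flattened input keys (operation, email, uid, action) are illustrative, not the OCSF paths the real connector reads, and the syslog framing is omitted. Per the CEF spec, backslash and pipe are escaped in header fields, and backslash and equals in extension values.

```python
def cef_escape_header(value: str) -> str:
    """Escape backslash and pipe in CEF header fields."""
    return value.replace("\\", "\\\\").replace("|", "\\|")

def cef_escape_ext(value: str) -> str:
    """Escape backslash and equals in CEF extension values."""
    return value.replace("\\", "\\\\").replace("=", "\\=")

def format_cef(event: dict, vendor: str = "Arbitex",
               product: str = "ArbitexPlatform", version: str = "1.0") -> str:
    """Render one audit event as a CEF:0 message body (sketch)."""
    header = "|".join([
        "CEF:0", vendor, product, version,
        str(event["class_uid"]), cef_escape_header(event["operation"]), "3",
    ])
    extensions = " ".join(
        f"{key}={cef_escape_ext(str(value))}"
        for key, value in [
            ("rt", event["time"]),
            ("suser", event["email"]),
            ("duid", event["uid"]),
            ("act", event["action"]),
        ]
    )
    return f"{header}|{extensions}"
```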

Syslog over TCP (default), UDP, or TLS. Events are sent individually — CEF/syslog does not batch natively. The connector maintains a persistent TCP/TLS connection and reconnects on failure with exponential backoff. UDP has no delivery confirmation.

In QRadar, create a Universal DSM log source pointing to the Arbitex Platform’s egress IP:

  1. Navigate to Admin > Data Sources > Log Sources > Add.
  2. Set Log Source Type to Universal DSM.
  3. Set Protocol Configuration to Syslog.
  4. Set the IP to the Arbitex Platform egress IP.
  5. Apply the Arbitex DSM extension (available from the QRadar app exchange) for normalized parsing.

Source: backend/app/services/siem/xsiam.py

Palo Alto Cortex XSIAM integration uses the XSIAM HTTP Event Collector endpoint, which wraps standard HEC with an XSIAM-specific envelope for automated playbook triggering and XDR correlation.

| Environment variable | Required | Description |
| --- | --- | --- |
| XSIAM_URL | Yes | XSIAM instance URL, e.g. https://api-tenant.xsiam.paloaltonetworks.com |
| XSIAM_API_KEY | Yes | XSIAM API key |
| XSIAM_API_KEY_ID | Yes | XSIAM API key ID (numeric) |
| XSIAM_LOG_TYPE | No | Log type identifier. Default: arbitex_audit |
| XSIAM_DATASET | No | XSIAM dataset name for routing. Default: arbitex_audit_raw |

Authentication uses the x-xdr-auth-id: <api_key_id>, x-xdr-nonce: <nonce>, and x-xdr-hmac-sha256: <hmac> headers. The connector computes the HMAC-SHA256 signature over {api_key_id}{nonce}{api_key} with the API key as the HMAC key, matching the Cortex XSIAM authentication scheme.
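The header computation can be sketched as below. This mirrors the scheme as described here (signature over {api_key_id}{nonce}{api_key}, keyed by the API key); verify against the Cortex XSIAM documentation for your tenant before relying on it.

```python
import hashlib
import hmac
import secrets

def xsiam_auth_headers(api_key: str, api_key_id: str) -> dict:
    """Build the three XSIAM auth headers (sketch of the scheme above)."""
    nonce = secrets.token_hex(32)  # random per-request nonce
    digest = hmac.new(
        api_key.encode(),
        f"{api_key_id}{nonce}{api_key}".encode(),
        hashlib.sha256,
    ).hexdigest()
    return {
        "x-xdr-auth-id": api_key_id,
        "x-xdr-nonce": nonce,
        "x-xdr-hmac-sha256": digest,
    }
```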

XSIAM HEC format with envelope fields for XSIAM routing:

{
  "events": [
    {
      "_time": "2025-03-13T00:00:00.123Z",
      "_vendor": "Arbitex",
      "_product": "ArbitexPlatform",
      "_dataset": "arbitex_audit_raw",
      "log_type": "arbitex_audit",
      "event_id": "req_xyz",
      "user_id": "user_abc",
      "user_email": "alice@example.com",
      "operation": "chat_completion",
      "model_id": "gpt-4o",
      "policy_action": "allow",
      "dlp_labels": [],
      "tokens_input": 512,
      "tokens_output": 256,
      "severity": "informational",
      "raw_ocsf": "{...}"
    }
  ]
}

The _vendor, _product, _dataset, and log_type fields are used by XSIAM for automatic data routing and normalized schema mapping. raw_ocsf contains the full OCSF v1.1 event for use in XSIAM XQL queries.

HTTP POST to {XSIAM_URL}/logs/v1/event. Content-Type: application/json. Batches of up to 100 events. The XSIAM endpoint returns a 202 Accepted for asynchronous ingestion. The connector treats non-2xx responses as failures and applies exponential backoff.


When an event cannot be delivered after all retries, it is written to the dead letter file:

SIEM_DEAD_LETTER_PATH=/var/log/arbitex/siem-dead-letter.jsonl

Each line is a JSONL record:

{
  "timestamp": "2025-03-13T00:00:00Z",
  "connector": "splunk",
  "error": "Connection refused",
  "event": { ... }
}

Replay dead letter events by posting them to the Platform’s dead letter replay endpoint:

POST /api/admin/siem/replay-dead-letter
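Reading the dead letter file for replay can be sketched as below; the function name is illustrative. Pass an open file handle (e.g. parse_dead_letters(open(path))), then POST the parsed records to the replay endpoint above.

```python
import json
from typing import Iterable

def parse_dead_letters(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL dead letter records, skipping blank lines (sketch)."""
    return [json.loads(line) for line in lines if line.strip()]
```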

| Connector | Auth method | Event format | Delivery | Batching |
| --- | --- | --- | --- | --- |
| Splunk HEC | Bearer token | HEC JSON / OCSF | HTTP POST | 100 events / 5s |
| Microsoft Sentinel | OAuth 2.0 client creds | DCR table schema + raw OCSF | HTTP POST | 100 events / 1 MB |
| Elastic | API key | Bulk API ndjson / OCSF | HTTP POST | 100 events / 5 MB |
| Datadog | API key header | Log objects / OCSF in message | HTTP POST | 100 events / 5 MB |
| Sumo Logic | Token in URL | ndjson / OCSF | HTTP POST | 100 events / 1 MB |
| IBM QRadar | mTLS / none | CEF over syslog | TCP/UDP/TLS | 1 event per message |
| Cortex XSIAM | HMAC-SHA256 | XSIAM HEC + OCSF | HTTP POST | 100 events |