SIEM Integration — All Connectors
Arbitex supports seven SIEM connectors for forwarding audit events from the platform to your security information and event management system. This guide covers configuration, authentication, event format, and delivery mechanics for each connector. For a comparison of Platform connectors with the Outpost direct sink (for air-gapped deployments), see the SIEM integration guide.
Source: backend/app/services/siem/
Overview
All platform SIEM connectors:
- Operate on the server side — events are forwarded from the Arbitex Platform, not the Outpost
- Emit events in OCSF v1.1 format by default (connector-specific envelopes vary)
- Batch up to 100 events per delivery with a maximum batch interval of 5 seconds
- Retry failed deliveries with exponential backoff (initial 1s, max 300s, 5 attempts)
- Write undeliverable events to a JSONL dead letter file at SIEM_DEAD_LETTER_PATH
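The retry policy can be sketched as a delay schedule. A doubling factor is assumed here; the source specifies only the initial delay, the cap, and the attempt count:

```python
def backoff_schedule(initial: float = 1.0, maximum: float = 300.0, attempts: int = 5):
    """Delay (in seconds) before each retry attempt.

    Exponential growth from the initial delay, capped at the maximum.
    Doubling per attempt is an assumption; the documented policy fixes
    only initial=1s, max=300s, and 5 attempts.
    """
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(min(delay, maximum))
        delay *= 2
    return delays
```

With the documented parameters this yields delays of 1, 2, 4, 8, and 16 seconds; the 300-second cap only comes into play with a larger initial delay or more attempts.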
1. Splunk HEC
Source: backend/app/services/siem/splunk.py
Splunk HTTP Event Collector (HEC) is the primary integration path for Splunk Cloud and Splunk Enterprise.
Configuration
| Environment variable | Required | Description |
|---|---|---|
| SPLUNK_HEC_URL | Yes | HEC endpoint, e.g. https://splunk.corp.example.com:8088/services/collector/event |
| SPLUNK_HEC_TOKEN | Yes | HEC authentication token |
| SPLUNK_HEC_INDEX | No | Target index. Default: arbitex |
| SPLUNK_HEC_SOURCE | No | source field value. Default: arbitex-platform |
| SPLUNK_HEC_SOURCETYPE | No | sourcetype field value. Default: arbitex:audit |
| SPLUNK_HEC_TLS_VERIFY | No | Verify the TLS certificate. Default: true. Set to false for self-signed certificates (not recommended in production). |
Authentication
Bearer token via the Authorization: Splunk <token> header. The token is created in Splunk under Settings > Data Inputs > HTTP Event Collector.
Event format
Each delivery is a standard HEC JSON batch:

```json
{
  "time": 1741824000.123,
  "host": "platform.arbitex.ai",
  "source": "arbitex-platform",
  "sourcetype": "arbitex:audit",
  "index": "arbitex",
  "event": {
    "class_uid": 6003,
    "category_uid": 6,
    "activity_id": 1,
    "time": 1741824000123,
    "severity_id": 1,
    "actor": { "user": { "uid": "user_abc", "email_addr": "alice@example.com" } },
    "api": { "request": { "uid": "req_xyz" }, "operation": "chat_completion" },
    "metadata": { "version": "1.1.0", "product": { "name": "Arbitex", "vendor_name": "Arbitex" } }
  }
}
```

Multiple events are sent as newline-delimited JSON objects in a single HTTP POST to the HEC endpoint.
Delivery mechanism
HTTP POST with Content-Type: application/json. Batches of up to 100 events. TLS 1.2+ required.
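Assembling such a batch body can be sketched as follows. This is a minimal illustration, not the connector's actual code; build_hec_batch is a hypothetical helper:

```python
import json

def build_hec_batch(events, index="arbitex", source="arbitex-platform",
                    sourcetype="arbitex:audit", host="platform.arbitex.ai"):
    """Wrap OCSF events in HEC envelopes and join them as newline-delimited
    JSON. Splunk HEC accepts multiple event objects concatenated in one POST
    body; no outer array is needed."""
    lines = []
    for ev in events:
        envelope = {
            "time": ev["time"] / 1000.0,  # OCSF time is Unix ms; HEC expects seconds
            "host": host,
            "source": source,
            "sourcetype": sourcetype,
            "index": index,
            "event": ev,
        }
        lines.append(json.dumps(envelope, separators=(",", ":")))
    return "\n".join(lines)
```

The resulting string is POSTed to the HEC endpoint with an Authorization: Splunk <token> header.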
2. Microsoft Sentinel (DCR API)
Source: backend/app/services/siem/sentinel.py
Microsoft Sentinel integration uses the Data Collection Rules (DCR) API (Logs Ingestion API), which is the current Sentinel ingestion method for custom tables.
Configuration
| Environment variable | Required | Description |
|---|---|---|
| SENTINEL_TENANT_ID | Yes | Azure AD tenant ID |
| SENTINEL_CLIENT_ID | Yes | App registration client ID |
| SENTINEL_CLIENT_SECRET | Yes | App registration client secret |
| SENTINEL_DCR_ENDPOINT | Yes | Data collection endpoint URL (from the DCR configuration) |
| SENTINEL_DCR_RULE_ID | Yes | Data collection rule immutable ID (from the DCR configuration) |
| SENTINEL_DCR_STREAM_NAME | Yes | Custom stream name in the DCR (e.g. Custom-ArbitexAudit_CL) |
Authentication
OAuth 2.0 client credentials flow against https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token. Scope: https://monitor.azure.com/.default. Access tokens are cached and refreshed 5 minutes before expiry.
The app registration requires the Monitoring Metrics Publisher role on the Data Collection Endpoint resource.
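The refresh-before-expiry caching behavior can be sketched as below. TokenCache is a hypothetical illustration: the token fetcher is injected as a callable so the logic is testable without a network call; in the real connector it would be the client-credentials POST to login.microsoftonline.com:

```python
import time

class TokenCache:
    """Cache an OAuth access token and refresh it 5 minutes before expiry.

    `fetch` is any callable returning (access_token, expires_in_seconds).
    """

    REFRESH_MARGIN = 300  # seconds before expiry at which we refresh

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Refresh when no token is cached or we are within the margin of expiry.
        if self._token is None or self._clock() >= self._expires_at - self.REFRESH_MARGIN:
            self._token, expires_in = self._fetch()
            self._expires_at = self._clock() + expires_in
        return self._token
```

Injecting the clock as well makes the 5-minute margin easy to verify with a fake timer.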
Event format
Events are uploaded as a JSON array matching the custom table schema. The connector maps OCSF fields to the DCR stream schema:
```json
[
  {
    "TimeGenerated": "2026-03-13T00:00:00.123Z",
    "EventId": "req_xyz",
    "UserId": "user_abc",
    "UserEmail": "alice@example.com",
    "Operation": "chat_completion",
    "ModelId": "gpt-4o",
    "PolicyAction": "allow",
    "DlpLabels": "[]",
    "TokensInput": 512,
    "TokensOutput": 256,
    "SeverityId": 1,
    "RawOCSF": "{...}"
  }
]
```

RawOCSF contains the full OCSF v1.1 event as a JSON string for use in Sentinel analytics rules that need the complete event.
Delivery mechanism
HTTP POST to the DCR endpoint at {endpoint}/dataCollectionRules/{rule_id}/streams/{stream_name}?api-version=2023-01-01. Maximum payload: 1 MB per batch. Batches exceeding 1 MB are split automatically.
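The automatic splitting can be sketched as a greedy packer that starts a new batch whenever the next event would push the serialized payload over the limit. This is an illustration of the idea, not the connector's actual implementation:

```python
import json

MAX_BATCH_BYTES = 1_000_000  # the Logs Ingestion API rejects payloads over 1 MB

def split_batches(events, max_bytes=MAX_BATCH_BYTES):
    """Greedily pack events into JSON-array batches under max_bytes.

    An event that would push the current batch over the limit starts a new
    batch. A single oversized event still gets its own batch (the API will
    reject it, after which it would be dead-lettered).
    """
    batches, current, current_size = [], [], 2  # 2 bytes for the enclosing "[]"
    for ev in events:
        size = len(json.dumps(ev).encode("utf-8")) + 1  # +1 for the separating comma
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 2
        current.append(ev)
        current_size += size
    if current:
        batches.append(current)
    return batches
```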
3. Elastic (Bulk API)
Source: backend/app/services/siem/elastic.py
Elastic integration uses the Elasticsearch Bulk API for high-throughput ingestion to Elastic Cloud or self-hosted Elasticsearch.
Configuration
| Environment variable | Required | Description |
|---|---|---|
| ELASTIC_URL | Yes | Elasticsearch base URL, e.g. https://my-cluster.es.io:9243 |
| ELASTIC_API_KEY | Yes | Base64-encoded Elasticsearch API key (id:api_key format) |
| ELASTIC_INDEX | No | Target index or data stream name. Default: arbitex-audit |
| ELASTIC_PIPELINE | No | Ingest pipeline to apply. Optional. |
| ELASTIC_TLS_VERIFY | No | Verify TLS. Default: true. |
Authentication
Authorization: ApiKey <base64(id:api_key)> header. Create the API key in Kibana under Stack Management > API Keys with index privilege on the target index.
Event format
Bulk API format with an index action line followed by the OCSF event body:

```json
{ "index": { "_index": "arbitex-audit" } }
{ "class_uid": 6003, "time": 1741824000123, "actor": {...}, ... }
{ "index": { "_index": "arbitex-audit" } }
{ ... }
```

Each batch is a newline-delimited bulk request body. The @timestamp field is added automatically from the OCSF time field (Unix milliseconds converted to ISO 8601).
Delivery mechanism
HTTP POST to {ELASTIC_URL}/_bulk. Content-Type: application/x-ndjson. Batch size: up to 100 events or 5 MB, whichever is smaller. The connector checks the errors field in the Bulk API response and routes failed documents to the dead letter file.
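Building the bulk body, including the @timestamp conversion from Unix milliseconds, can be sketched as follows; build_bulk_body is a hypothetical helper name:

```python
import json
from datetime import datetime, timezone

def build_bulk_body(events, index="arbitex-audit"):
    """Build an Elasticsearch _bulk request body (NDJSON) from OCSF events.

    Adds @timestamp by converting the OCSF `time` field (Unix milliseconds)
    to ISO 8601, as described above.
    """
    lines = []
    for ev in events:
        doc = dict(ev)
        ts = datetime.fromtimestamp(ev["time"] / 1000.0, tz=timezone.utc)
        doc["@timestamp"] = ts.isoformat(timespec="milliseconds").replace("+00:00", "Z")
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the Bulk API requires a trailing newline
```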
4. Datadog (Logs Intake v2)
Source: backend/app/services/siem/datadog.py
Datadog integration uses the Logs Intake v2 API (/api/v2/logs).
Configuration
| Environment variable | Required | Description |
|---|---|---|
| DATADOG_API_KEY | Yes | Datadog API key |
| DATADOG_SITE | No | Datadog site. Default: datadoghq.com. Options: datadoghq.com, us3.datadoghq.com, us5.datadoghq.com, datadoghq.eu, ap1.datadoghq.com |
| DATADOG_SERVICE | No | service tag. Default: arbitex |
| DATADOG_SOURCE | No | ddsource tag. Default: arbitex-audit |
| DATADOG_TAGS | No | Comma-separated key:value tags, e.g. env:production,team:security |
Authentication
DD-API-KEY: <api_key> header. The API key is created in Datadog under Organization Settings > API Keys.
Event format
Array of Datadog log objects:

```json
[
  {
    "ddsource": "arbitex-audit",
    "ddtags": "env:production,service:arbitex",
    "hostname": "platform.arbitex.ai",
    "service": "arbitex",
    "message": "{\"class_uid\":6003,\"time\":1741824000123,\"actor\":{...}}"
  }
]
```

The message field contains the full OCSF event as a JSON string. Datadog Log Management parses it as JSON when a JSON parsing processor is configured in a Datadog pipeline.
Delivery mechanism
HTTP POST to https://http-intake.logs.{DATADOG_SITE}/api/v2/logs. Content-Type: application/json. Maximum payload: 5 MB. Maximum 1000 log entries per batch (Datadog limit). The connector respects the 429 Too Many Requests response and applies backoff.
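Wrapping OCSF events into the log-object shape above can be sketched as below; build_datadog_logs is a hypothetical helper, shown only to make the message-as-string encoding concrete:

```python
import json

def build_datadog_logs(events, service="arbitex", source="arbitex-audit",
                       hostname="platform.arbitex.ai", tags=""):
    """Wrap OCSF events as Datadog log objects for /api/v2/logs.

    Each event is serialized into the `message` field as a compact JSON
    string, matching the payload shape shown above.
    """
    return [
        {
            "ddsource": source,
            "ddtags": tags,
            "hostname": hostname,
            "service": service,
            "message": json.dumps(ev, separators=(",", ":")),
        }
        for ev in events
    ]
```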
5. Sumo Logic (HTTP Source)
Source: backend/app/services/siem/sumo.py
Sumo Logic integration uses an HTTP Source (HTTP Logs & Metrics Collector).
Configuration
| Environment variable | Required | Description |
|---|---|---|
| SUMO_HTTP_SOURCE_URL | Yes | HTTP Source URL from Sumo Logic, e.g. https://endpoint.collection.sumologic.com/receiver/v1/http/<token> |
| SUMO_CATEGORY | No | Source category override. Default: arbitex/audit |
| SUMO_HOST | No | Source host override. Default: platform.arbitex.ai |
Authentication
The HTTP Source URL contains the authentication token embedded in the path, so no additional headers are required. The URL is generated when you create the HTTP Source in Sumo Logic and acts as the credential.
Event format
Newline-delimited OCSF JSON events. Sumo Logic parses each line as a separate log message:

```json
{"class_uid":6003,"time":1741824000123,"actor":{"user":{"uid":"user_abc"}},...}
{"class_uid":6003,"time":1741824000456,"actor":{"user":{"uid":"user_def"}},...}
```

Delivery mechanism
HTTP POST to the HTTP Source URL with Content-Type: application/json and a newline-delimited JSON body. The X-Sumo-Category and X-Sumo-Host headers are set from the configuration. Maximum payload: 1 MB per request.
6. IBM QRadar (CEF/syslog)
Source: backend/app/services/siem/qradar.py
IBM QRadar integration uses CEF (Common Event Format) over syslog (RFC 5424). This is the native QRadar log source protocol for custom event sources.
Configuration
| Environment variable | Required | Description |
|---|---|---|
| QRADAR_SYSLOG_HOST | Yes | QRadar Console or Event Collector IP/hostname |
| QRADAR_SYSLOG_PORT | No | Syslog port. Default: 514 for UDP, 6514 for TLS |
| QRADAR_SYSLOG_PROTOCOL | No | Transport: udp, tcp, or tls. Default: tcp |
| QRADAR_CEF_DEVICE_VENDOR | No | CEF DeviceVendor. Default: Arbitex |
| QRADAR_CEF_DEVICE_PRODUCT | No | CEF DeviceProduct. Default: ArbitexPlatform |
| QRADAR_CEF_DEVICE_VERSION | No | CEF DeviceVersion. Default: 1.0 |
| QRADAR_TLS_CA_BUNDLE | No | Path to CA bundle for TLS syslog. Required when QRADAR_SYSLOG_PROTOCOL=tls. |
Authentication
No application-level authentication for UDP/TCP syslog. For TLS, mutual TLS is used when QRADAR_TLS_CA_BUNDLE is configured; otherwise, server-only TLS certificate validation is performed.
Event format
CEF over RFC 5424 syslog. Each event is a syslog message with a CEF-formatted message body:

```
<134>1 2026-03-13T00:00:00.123Z platform.arbitex.ai ArbitexPlatform - req_xyz - CEF:0|Arbitex|ArbitexPlatform|1.0|6003|chat_completion|3|rt=1741824000123 suser=alice@example.com duid=user_abc act=allow model=gpt-4o tokenIn=512 tokenOut=256 cs1Label=PolicyAction cs1=allow cs2Label=DlpLabels cs2= cs3Label=GroupIds cs3=finance-team
```

CEF extension field mapping:
| CEF Extension Key | OCSF / Arbitex field |
|---|---|
| rt | time (Unix ms) |
| suser | actor.user.email_addr |
| duid | actor.user.uid |
| act | Policy action (allow/block/redact/warn) |
| model | Model identifier |
| tokenIn | Input token count |
| tokenOut | Output token count |
| cs1 / cs1Label | Policy action (labeled) |
| cs2 / cs2Label | DLP labels detected (comma-separated) |
| cs3 / cs3Label | User group IDs (comma-separated) |
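Producing the CEF message body from these mapped fields can be sketched as below. format_cef is a hypothetical helper; the escaping rules (backslash and pipe in header fields, backslash and equals sign in extension values) come from the CEF specification, not from this connector's source:

```python
def format_cef(event_class_uid, name, severity, extensions,
               vendor="Arbitex", product="ArbitexPlatform", version="1.0"):
    """Format a CEF:0 message body from an event class, name, severity,
    and an ordered dict of extension key/value pairs."""

    def esc_header(v):
        # Header fields must escape backslashes and pipes.
        return str(v).replace("\\", "\\\\").replace("|", "\\|")

    def esc_ext(v):
        # Extension values must escape backslashes and equals signs.
        return str(v).replace("\\", "\\\\").replace("=", "\\=")

    header = (
        f"CEF:0|{esc_header(vendor)}|{esc_header(product)}|{esc_header(version)}"
        f"|{esc_header(event_class_uid)}|{esc_header(name)}|{esc_header(severity)}"
    )
    ext = " ".join(f"{k}={esc_ext(v)}" for k, v in extensions.items())
    return f"{header}|{ext}"
```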
Delivery mechanism
Syslog over TCP (default), UDP, or TLS. Events are sent individually; CEF/syslog does not batch natively. The connector maintains a persistent TCP/TLS connection and reconnects on failure with exponential backoff. UDP has no delivery confirmation.
QRadar log source setup
In QRadar, create a Universal DSM log source pointing to the Arbitex Platform's egress IP:
- Navigate to Admin > Data Sources > Log Sources > Add.
- Set Log Source Type to Universal DSM.
- Set Protocol Configuration to Syslog.
- Set the IP to the Arbitex Platform egress IP.
- Apply the Arbitex DSM extension (available from the QRadar App Exchange) for normalized parsing.
7. Cortex XSIAM (HEC + XSIAM envelope)
Source: backend/app/services/siem/xsiam.py
Palo Alto Cortex XSIAM integration uses the XSIAM HTTP Event Collector endpoint, which wraps standard HEC with an XSIAM-specific envelope for automated playbook triggering and XDR correlation.
Configuration
| Environment variable | Required | Description |
|---|---|---|
| XSIAM_URL | Yes | XSIAM instance URL, e.g. https://api-tenant.xsiam.paloaltonetworks.com |
| XSIAM_API_KEY | Yes | XSIAM API key |
| XSIAM_API_KEY_ID | Yes | XSIAM API key ID (numeric) |
| XSIAM_LOG_TYPE | No | Log type identifier. Default: arbitex_audit |
| XSIAM_DATASET | No | XSIAM dataset name for routing. Default: arbitex_audit_raw |
Authentication
The connector sends the x-xdr-auth-id: <api_key_id>, x-xdr-nonce: <nonce>, and x-xdr-hmac-sha256: <hmac> headers. The HMAC-SHA256 signature is computed over {api_key_id}{nonce}{api_key}, with the API key as the HMAC key, matching the Cortex XSIAM authentication scheme.
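The header construction described above can be sketched as follows. xsiam_auth_headers is a hypothetical helper, and the message layout ({api_key_id}{nonce}{api_key}, keyed with the API key) follows this connector's description:

```python
import hashlib
import hmac
import secrets

def xsiam_auth_headers(api_key_id: str, api_key: str) -> dict:
    """Build the three XSIAM auth headers for one request.

    A fresh random nonce is generated per request; the HMAC-SHA256
    signature covers {api_key_id}{nonce}{api_key} and uses the API key
    as the HMAC key, per the scheme described above.
    """
    nonce = secrets.token_hex(32)
    message = f"{api_key_id}{nonce}{api_key}".encode("utf-8")
    signature = hmac.new(api_key.encode("utf-8"), message, hashlib.sha256).hexdigest()
    return {
        "x-xdr-auth-id": api_key_id,
        "x-xdr-nonce": nonce,
        "x-xdr-hmac-sha256": signature,
    }
```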
Event format
XSIAM HEC format with envelope fields for XSIAM routing:

```json
{
  "events": [
    {
      "_time": "2026-03-13T00:00:00.123Z",
      "_vendor": "Arbitex",
      "_product": "ArbitexPlatform",
      "_dataset": "arbitex_audit_raw",
      "log_type": "arbitex_audit",
      "event_id": "req_xyz",
      "user_id": "user_abc",
      "user_email": "alice@example.com",
      "operation": "chat_completion",
      "model_id": "gpt-4o",
      "policy_action": "allow",
      "dlp_labels": [],
      "tokens_input": 512,
      "tokens_output": 256,
      "severity": "informational",
      "raw_ocsf": "{...}"
    }
  ]
}
```

The _vendor, _product, _dataset, and log_type fields are used by XSIAM for automatic data routing and normalized schema mapping. raw_ocsf contains the full OCSF v1.1 event for use in XSIAM XQL queries.
Delivery mechanism
HTTP POST to {XSIAM_URL}/logs/v1/event. Content-Type: application/json. Batches of up to 100 events. The XSIAM endpoint returns 202 Accepted for asynchronous ingestion. The connector treats non-2xx responses as failures and applies exponential backoff.
Dead letter handling
When an event cannot be delivered after all retries, it is written to the dead letter file:

```
SIEM_DEAD_LETTER_PATH=/var/log/arbitex/siem-dead-letter.jsonl
```

Each line is a JSONL record:

```json
{
  "timestamp": "2026-03-13T00:00:00Z",
  "connector": "splunk",
  "error": "Connection refused",
  "event": { ... }
}
```

Replay dead letter events by posting them to the Platform's dead letter replay endpoint:

```
POST /api/admin/siem/replay-dead-letter
```

Connector comparison
| Connector | Auth method | Event format | Delivery | Batching |
|---|---|---|---|---|
| Splunk HEC | Bearer token | HEC JSON / OCSF | HTTP POST | 100 events / 5s |
| Microsoft Sentinel | OAuth 2.0 client creds | DCR table schema + raw OCSF | HTTP POST | 100 events / 1 MB |
| Elastic | API Key | Bulk API ndjson / OCSF | HTTP POST | 100 events / 5 MB |
| Datadog | API Key header | Log objects / OCSF in message | HTTP POST | 100 events / 5 MB |
| Sumo Logic | Token in URL | ndjson / OCSF | HTTP POST | 100 events / 1 MB |
| IBM QRadar | mTLS / none | CEF over syslog | TCP/UDP/TLS | 1 event per message |
| Cortex XSIAM | HMAC-SHA256 | XSIAM HEC + OCSF | HTTP POST | 100 events |
Related pages
- SIEM integration guide — platform vs Outpost delivery path comparison
- SIEM admin API — connector configuration via admin API
- Audit Log Management — audit event structure and OCSF schema
- Outpost credential intelligence — Outpost direct sink configuration