Splunk HEC Integration Guide

Arbitex streams audit events to Splunk using the HTTP Event Collector (HEC) REST API. Events are formatted as OCSF v1.1 JSON and sent in batches of up to 100 events, with automatic retry on transient failures and a dead letter queue for events that cannot be delivered.


Before configuring the connector, you need:

  • A running Splunk instance (on-prem or Splunk Cloud) accessible from the Arbitex platform hosts
  • An HEC token with permission to write to your target index
  • An index created and associated with the HEC token (the connector defaults to arbitex)
  • Network path open from Arbitex platform pods to {splunk-host}:8088 (or your custom HEC port), TCP

To create the HEC token in Splunk:
  1. In Splunk, go to Settings → Data Inputs → HTTP Event Collector.
  2. Click New Token.
  3. Set the Source type to arbitex:ocsf (create it if it does not exist).
  4. Set the Default Index to your target index (e.g. arbitex).
  5. Complete the wizard and copy the token value — it is shown only once.
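Before wiring up the connector, you can verify the token works by sending a test event straight to HEC. The host, index, and token below are placeholders; substitute your own values (drop `-k` if your Splunk certificate is trusted):

```shell
# Send a minimal test event to the HEC endpoint.
# A working token and index typically return {"text":"Success","code":0}.
curl -k https://splunk.corp.example.com:8088/services/collector \
  -H "Authorization: Splunk <your-hec-token>" \
  -d '{"event": "arbitex connector smoke test", "sourcetype": "arbitex:ocsf", "index": "arbitex"}'
```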

Configure the connector using environment variables on the Arbitex platform deployment.

| Variable | Required | Default | Description |
|---|---|---|---|
| SPLUNK_HEC_URL | Yes | (none) | Full HEC endpoint URL, e.g. https://splunk.corp.example.com:8088/services/collector |
| SPLUNK_HEC_TOKEN | Yes | (none) | HEC authentication token |
| SPLUNK_HEC_INDEX | No | arbitex | Target Splunk index |
| SPLUNK_HEC_SOURCE | No | arbitex:audit | Event source identifier |
| SPLUNK_HEC_BATCH_SIZE | No | 100 | Maximum events per batch send |
| SPLUNK_HEC_FLUSH_INTERVAL | No | 5 | Maximum seconds between batch flushes |
| SPLUNK_HEC_MAX_RETRIES | No | 3 | Maximum retry attempts on transient failures |
| SPLUNK_HEC_DEAD_LETTER_PATH | No | /var/log/arbitex/splunk_dead_letter.jsonl | Path for the dead letter queue file |
Example configuration:
SPLUNK_HEC_URL=https://splunk.corp.example.com:8088/services/collector
SPLUNK_HEC_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
SPLUNK_HEC_INDEX=arbitex
SPLUNK_HEC_SOURCE=arbitex:audit

For Splunk Cloud, the HEC endpoint URL follows this pattern:

https://http-inputs-<stack>.splunkcloud.com:443/services/collector

Each event is delivered inside a Splunk HEC envelope with index, source, and sourcetype metadata. The event field contains the raw OCSF v1.1 JSON object.

{
  "time": 1741737600.000,
  "index": "arbitex",
  "source": "arbitex:audit",
  "sourcetype": "arbitex:ocsf",
  "event": {
    "class_uid": 6003,
    "class_name": "API Activity",
    "category_uid": 6,
    "category_name": "Application Activity",
    "activity_id": 1,
    "activity_name": "Create",
    "severity_id": 1,
    "severity": "Informational",
    "status_id": 1,
    "status": "Success",
    "time": 1741737600000,
    "message": "API request processed",
    "actor": {
      "user": {
        "uid": "user_01jq...",
        "email_addr": "alice@example.com",
        "name": "Alice Smith"
      },
      "org": {
        "uid": "org_01jq...",
        "name": "Acme Corp"
      }
    },
    "src_endpoint": {
      "ip": "203.0.113.45",
      "location": {
        "country": "US",
        "city": "New York",
        "lat": 40.7128,
        "long": -74.0060
      }
    },
    "http_request": {
      "http_method": "POST",
      "url": {
        "path": "/v1/chat/completions"
      }
    },
    "metadata": {
      "version": "1.1.0",
      "product": {
        "name": "Arbitex",
        "vendor_name": "Arbitex"
      }
    }
  }
}

The time field in the envelope is derived from the OCSF time field (epoch milliseconds) converted to epoch seconds as required by the HEC API.
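The envelope construction and time conversion can be sketched as follows. This is an illustration, not the connector's actual internals; `build_envelope` and its defaults are hypothetical:

```python
def build_envelope(ocsf_event: dict, index: str = "arbitex",
                   source: str = "arbitex:audit",
                   sourcetype: str = "arbitex:ocsf") -> dict:
    """Wrap an OCSF event in a Splunk HEC envelope.

    OCSF `time` is epoch milliseconds; the HEC `time` field expects
    epoch seconds, so divide by 1000.
    """
    return {
        "time": ocsf_event["time"] / 1000.0,
        "index": index,
        "source": source,
        "sourcetype": sourcetype,
        "event": ocsf_event,
    }

envelope = build_envelope({"class_uid": 6003, "time": 1741737600000})
# envelope["time"] is 1741737600.0
```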

| OCSF class | class_uid | Description |
|---|---|---|
| API Activity | 6003 | Every request processed through the gateway |
| Security Finding | 2001 | DLP trigger events |
| Authentication | 3002 | Login, token exchange, SSO events |
| Account Change | 3001 | User and group lifecycle events |

Events are buffered in memory and flushed to Splunk when either of two conditions is met:

  • The buffer reaches SPLUNK_HEC_BATCH_SIZE events (default: 100)
  • SPLUNK_HEC_FLUSH_INTERVAL seconds have elapsed since the last flush (default: 5 seconds)

On transient failures (HTTP 429 or 503), the connector retries with exponential backoff (1s, 2s, 4s) up to SPLUNK_HEC_MAX_RETRIES attempts. Events that cannot be delivered after all retries are written to the dead letter queue at SPLUNK_HEC_DEAD_LETTER_PATH in JSONL format.
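The batch, retry, and dead-letter flow described above can be sketched as follows. This is a simplified model, not the connector's actual implementation; `SplunkBatcher` and `TransientError` are illustrative names, and the flush-interval timer is omitted for brevity:

```python
import json
import time


class TransientError(Exception):
    """Retryable delivery failure, e.g. HTTP 429 or 503 from HEC."""


class SplunkBatcher:
    """Illustrative model of the batch / retry / dead-letter flow."""

    def __init__(self, send, batch_size=100, max_retries=3,
                 dead_letter_path="/tmp/splunk_dead_letter.jsonl"):
        self.send = send              # callable that posts a list of envelopes to HEC
        self.batch_size = batch_size
        self.max_retries = max_retries
        self.dead_letter_path = dead_letter_path
        self.buffer = []

    def add(self, envelope):
        self.buffer.append(envelope)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        batch, self.buffer = self.buffer, []
        for attempt in range(self.max_retries):
            try:
                self.send(batch)
                return
            except TransientError as exc:
                if attempt < self.max_retries - 1:
                    time.sleep(2 ** attempt)  # exponential backoff: 1s, 2s, ...
                else:
                    self._dead_letter(batch, str(exc))

    def _dead_letter(self, batch, error):
        # Exhausted retries: append each event to the JSONL dead letter file.
        with open(self.dead_letter_path, "a") as fh:
            for envelope in batch:
                fh.write(json.dumps({
                    "event": envelope,
                    "error": error,
                    "connector": "splunk_hec",
                    "timestamp": time.time(),
                }) + "\n")
```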


After setting the environment variables and restarting the platform, confirm that events are reaching Splunk.

Use the Arbitex admin API to view connector status:

curl -s -H "Authorization: Bearer $ADMIN_TOKEN" \
https://api.arbitex.ai/api/admin/siem/connectors | jq '.[] | select(.connector_id == "splunk_hec")'

A healthy connector returns:

{
  "connector_id": "splunk_hec",
  "display_name": "Splunk HEC",
  "status": "healthy",
  "config_summary": {
    "url": "https://splunk.corp.example.com:8088/services/collector",
    "index": "arbitex",
    "source": "arbitex:audit",
    "token_configured": true
  }
}

Run a Splunk search to confirm events are arriving:

index=arbitex sourcetype="arbitex:ocsf" earliest=-15m
| head 10

To check for a specific event class:

index=arbitex sourcetype="arbitex:ocsf" event.class_name="API Activity" earliest=-1h
| stats count by event.actor.org.name

The connector health check sends an empty POST to the HEC endpoint. A not_configured status means SPLUNK_HEC_URL or SPLUNK_HEC_TOKEN is not set. An error status means the endpoint is unreachable or the token is invalid.

Check the platform logs:

kubectl logs -n arbitex -l app=arbitex-platform --tail=200 | grep -i splunk

If Splunk returns an authorization error, the HEC token is being rejected. Verify that the token value matches what Splunk shows under Settings → Data Inputs → HTTP Event Collector. Tokens are case-sensitive.

If Splunk reports an index error, the index specified in SPLUNK_HEC_INDEX does not exist, or the HEC token is not permitted to write to it. Verify that the index exists and is associated with the token in Splunk.

If requests to Splunk time out, confirm that outbound TCP to {splunk-host}:8088 is permitted from the Arbitex platform pods. For Splunk Cloud, the port is typically 443.
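One way to probe the network path from inside the cluster is to query HEC's health endpoint from a platform pod. The deployment name and Splunk host below are placeholders, and this assumes curl is available in the container image:

```shell
# A reachable HEC answers on /services/collector/health
# (typically {"text":"HEC is healthy","code":17}).
kubectl exec -n arbitex deploy/arbitex-platform -- \
  curl -sk https://splunk.corp.example.com:8088/services/collector/health
```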

If the Splunk instance uses a private CA or self-signed certificate, the platform’s container trust store must include the CA certificate. Contact your Arbitex administrator to add custom CA certificates to the deployment.

If events are accumulating in the file at SPLUNK_HEC_DEAD_LETTER_PATH, the connector encountered persistent delivery failures. Each entry is a JSON object:

{
  "event": { ... },
  "error": "HTTP 503: Service Unavailable",
  "connector": "splunk_hec",
  "timestamp": 1741737600.0
}

Review the error field to identify the root cause. Dead letter events are not replayed automatically; once the underlying issue is resolved, they must be reingested manually if needed.
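Reingestion can be scripted. The helper below is a hypothetical sketch, not something shipped with Arbitex: it parses the JSONL dead letter file and extracts the original HEC envelopes for resubmission:

```python
import json


def load_dead_letter_events(path):
    """Parse a JSONL dead letter file and return the original HEC envelopes.

    Each line is a wrapper object; the original envelope sits under "event".
    Malformed lines are skipped rather than aborting the replay.
    """
    events = []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            try:
                events.append(json.loads(line)["event"])
            except (json.JSONDecodeError, KeyError):
                continue  # skip malformed entries
    return events
```

The returned envelopes can then be re-posted to the HEC endpoint in batches once the underlying issue is fixed.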