GeoIP enrichment

GeoIP enrichment adds geographic and network metadata to every audit log entry. The gateway resolves the requesting client’s IP address and the destination provider’s IP address against offline datasets loaded into memory, producing country, region, city, ISP, and ASN fields. This data supports geographic access monitoring, jurisdiction compliance, and network-level anomaly detection.

GeoIP enrichment runs in the payload analysis stage, alongside DLP inspection and Credential Intelligence. It adds no external network dependency — all lookups are against in-memory datasets with sub-millisecond latency.


| Audit log field | Type | Description |
| --- | --- | --- |
| src_ip | string | Client IP address, from X-Forwarded-For or direct connection |
| src_country_code | string | ISO 3166-1 alpha-2 country code (e.g., US) |
| src_country_name | string | Full country name |
| src_region | string | State or province |
| src_city | string | City name |
| src_isp | string | Internet service provider name |
| src_asn | integer | Autonomous System Number |
| src_asn_org | string | Organization name for the ASN |
| src_arin_org | string | ARIN organization name (US/CA IPs only; null for other regions) |

| Audit log field | Type | Description |
| --- | --- | --- |
| dst_ip | string | IP address of the model provider endpoint |
| dst_country_code | string | Country code for the provider endpoint |
| dst_asn | integer | ASN for the provider endpoint |
| dst_asn_org | string | Organization name for the provider endpoint ASN |

If a dataset is unavailable or does not contain an entry for the IP address, the corresponding fields are null. The audit entry is still written — GeoIP fields are optional enrichment, not required for the audit record.


Source and destination IP addresses (src_ip, dst_ip) are included in the HMAC chain computation because they are observed network facts. GeoIP-derived fields (country, region, city, ISP, ASN) are excluded from the HMAC chain because they are derived from datasets that may be updated between verifications. Updating your GeoIP datasets does not invalidate existing HMAC chains. See Audit log — HMAC chain verification.
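The split between chained and unchained fields can be illustrated with a minimal sketch. It is not the gateway's implementation: the JSON canonicalization, the SHA-256 digest, the way the previous MAC is mixed in, and the `chain_mac` helper are all assumptions made for illustration. Only the rule itself (IP addresses in, GeoIP-derived fields out) comes from the text above.

```python
import hashlib
import hmac
import json

# GeoIP-derived fields named in this document. They are dropped before the
# MAC is computed, so a dataset update cannot change the result.
GEOIP_DERIVED_FIELDS = {
    "src_country_code", "src_country_name", "src_region", "src_city",
    "src_isp", "src_asn", "src_asn_org",
    "dst_country_code", "dst_asn", "dst_asn_org",
}


def chain_mac(key: bytes, prev_mac: bytes, entry: dict) -> bytes:
    """Hypothetical chain step: MAC over the previous MAC plus covered fields."""
    covered = {k: v for k, v in entry.items() if k not in GEOIP_DERIVED_FIELDS}
    # Assumed canonical form: sorted keys, compact separators, UTF-8.
    payload = json.dumps(covered, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, prev_mac + payload, hashlib.sha256).digest()
```

Under these assumptions, two entries that differ only in `src_city` yield the same MAC, while a change to `src_ip` yields a different one.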


The gateway uses five offline datasets, all loaded into memory at startup. All datasets are optional: the gateway operates without GeoIP enrichment if no datasets are configured.

| Dataset | Purpose | Fields provided |
| --- | --- | --- |
| MaxMind GeoIP2 City | Primary geographic data | country_code, country_name, region, city |
| MaxMind Anonymous IP | VPN/proxy/Tor/hosting detection | is_vpn, is_proxy, is_tor, is_hosting |
| IP2Location DB25 | Fallback geographic data + ISP | country_code, country_name, region, city, isp |
| iptoasn | ASN resolution | asn, asn_org |
| ARIN Bulk Whois | US/CA organization name | arin_org |

Priority logic: MaxMind is the primary source for geographic fields. If MaxMind is not configured, IP2Location is used as the fallback. For ASN data, iptoasn is the primary source; MaxMind ASN is the fallback if iptoasn has no entry for the IP.
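The priority rules can be sketched as two small resolver functions. This is an illustrative reading of the paragraph above, not the gateway's code: the function names and the `Lookup` callable type are hypothetical, and a dataset that is not configured is represented by `None`. The sketch takes the text literally, so IP2Location is consulted only when MaxMind is not configured, while the ASN fallback also applies when iptoasn has no entry. Falling through when iptoasn is not configured at all is an assumption.

```python
from typing import Callable, Optional

# Hypothetical lookup type: takes an IP string, returns a record or None.
Lookup = Callable[[str], Optional[dict]]


def resolve_geo(ip: str, maxmind: Optional[Lookup],
                ip2location: Optional[Lookup]) -> Optional[dict]:
    """Geographic fields: MaxMind if configured, otherwise IP2Location."""
    if maxmind is not None:
        return maxmind(ip)
    if ip2location is not None:
        return ip2location(ip)
    return None  # no geographic dataset configured: fields stay null


def resolve_asn(ip: str, iptoasn: Optional[Lookup],
                maxmind_asn: Optional[Lookup]) -> Optional[dict]:
    """ASN fields: iptoasn first, MaxMind ASN when iptoasn has no entry."""
    if iptoasn is not None:
        hit = iptoasn(ip)
        if hit is not None:
            return hit
    if maxmind_asn is not None:
        return maxmind_asn(ip)
    return None
```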

Each dataset path is configured via environment variable:

| Environment variable | Dataset |
| --- | --- |
| GEOIP_MAXMIND_PATH | Path to MaxMind GeoIP2 City .mmdb file |
| GEOIP_MAXMIND_ANON_PATH | Path to MaxMind Anonymous IP .mmdb file |
| GEOIP_IP2LOCATION_PATH | Path to IP2Location DB25 .BIN file |
| GEOIP_IPTOASN_PATH | Path to iptoasn .tsv file |
| GEOIP_ARIN_PATH | Path to ARIN Bulk Whois text dump |

Set one or more of these variables before starting the gateway. Omitting a variable disables that dataset — the corresponding fields will be null in audit entries.
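Before starting the gateway, it can be useful to check which datasets a given environment would enable. The sketch below is a hypothetical pre-flight helper, not part of the gateway. The environment variable names come from the table above; the short dataset labels are borrowed from the /readyz keys shown elsewhere on this page, and treating an empty value as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Short dataset label -> environment variable documented above.
DATASET_ENV_VARS = {
    "maxmind": "GEOIP_MAXMIND_PATH",
    "maxmind_anon": "GEOIP_MAXMIND_ANON_PATH",
    "ip2location": "GEOIP_IP2LOCATION_PATH",
    "iptoasn": "GEOIP_IPTOASN_PATH",
    "arin": "GEOIP_ARIN_PATH",
}


def configured_datasets(environ: Mapping[str, str] = os.environ) -> dict[str, Optional[str]]:
    """Return the configured path per dataset, or None when the variable is unset."""
    # Assumption: an empty string is treated the same as an unset variable.
    return {name: environ.get(var) or None for name, var in DATASET_ENV_VARS.items()}
```

Calling `configured_datasets()` with no arguments inspects the current process environment; passing a plain dict makes the helper easy to test.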


MaxMind offers a separate Anonymous IP database that classifies IP addresses as VPN providers, public proxies, Tor exit nodes, or hosting/datacenter ranges. This is an optional add-on to the standard GeoIP2 City database and is loaded independently via its own environment variable.

When the Anonymous IP database is loaded, four additional boolean fields are added to every audit log entry for the source IP:

| Audit log field | Type | Description |
| --- | --- | --- |
| is_vpn | boolean or null | true if the IP is associated with a commercial VPN provider |
| is_proxy | boolean or null | true if the IP is a known public proxy |
| is_tor | boolean or null | true if the IP is a Tor exit node |
| is_hosting | boolean or null | true if the IP belongs to a hosting provider or datacenter |

Set GEOIP_MAXMIND_ANON_PATH to the path of the MaxMind Anonymous IP .mmdb file:

GEOIP_MAXMIND_ANON_PATH=/etc/arbitex/GeoIP2-Anonymous-IP.mmdb

The Anonymous IP database is loaded at startup alongside the other GeoIP datasets and participates in the SIGHUP hot-reload cycle — updating the file and sending SIGHUP is sufficient to apply a new database without restarting the gateway.

If GEOIP_MAXMIND_ANON_PATH is not set, or if the file cannot be read at startup, all four fields (is_vpn, is_proxy, is_tor, is_hosting) are null in audit entries. This is not treated as an error — the gateway operates normally without the Anonymous IP database, and the standard GeoIP enrichment fields are unaffected.

The four anonymous IP fields are available as conditions in the Policy Engine. Example use cases:

  • Block Tor exit nodes — flag or deny requests where is_tor = true to prevent anonymous Tor-routed traffic from reaching model providers.
  • Flag VPN traffic — set is_vpn = true as a condition to route VPN-originated requests to a stricter policy group or add a warning label to audit entries.
  • Restrict hosting IPs — deny requests originating from datacenter IP ranges (is_hosting = true) to block automated or non-human traffic.
  • Combine conditions — combine is_vpn = true OR is_tor = true with a geographic condition to tighten access for high-risk origins.

Refer to the Policy Rule Reference for the exact condition field names and supported operators.
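The combined condition from the last use case can be expressed as a plain predicate over an audit entry. This sketch is not Policy Engine rule syntax; it only illustrates the boolean logic, for example when replaying audit entries offline. The `is_high_risk_origin` function and the `restricted_countries` parameter are hypothetical, and treating a null field as "no match" is an assumption.

```python
def is_high_risk_origin(entry: dict, restricted_countries: set[str]) -> bool:
    """(is_vpn OR is_tor) AND source country is in the restricted set."""
    # A null (None) field means the Anonymous IP database was not loaded or
    # had no answer; this sketch treats that as "condition not met".
    anonymized = entry.get("is_vpn") is True or entry.get("is_tor") is True
    return anonymized and entry.get("src_country_code") in restricted_countries
```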

The MaxMind Anonymous IP database is available in two tiers:

  • GeoIP2 Anonymous IP (commercial) — updated twice weekly; highest accuracy. Requires a MaxMind license key. Download via the MaxMind GeoIP Update tool or the download API.
  • GeoLite2 Anonymous IP (free, registration required) — updated weekly; lower coverage than the commercial edition. Available at maxmind.com after creating a free account.

Both editions produce a .mmdb file that is compatible with GEOIP_MAXMIND_ANON_PATH.

The /readyz response includes a dedicated status field for the Anonymous IP database:

{
  "geoip_enrichment": {
    "maxmind": "ok",
    "maxmind_anon": "ok",
    "ip2location": "ok",
    "iptoasn": "ok",
    "arin": "absent",
    "cache": "1024/131072"
  }
}

When GEOIP_MAXMIND_ANON_PATH is not configured, maxmind_anon appears as "absent". A load failure at startup logs a geoip_anon_missing event. A successful load logs geoip_anon_loaded. As with all GeoIP datasets, an absent or failed Anonymous IP database does not fail the readiness probe.


To update a GeoIP dataset without restarting the gateway:

  1. Replace the dataset file on disk at the configured path
  2. Send SIGHUP to the gateway worker process

The gateway reloads all dataset files atomically: new data is loaded outside the request path, then swapped in under a lock. In-flight requests are not affected. The in-memory lookup cache (128K entries, 4-hour TTL) is cleared on reload so stale entries drain immediately.
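The load-then-swap pattern described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the gateway's code: `DatasetStore` and its `loader` callable are hypothetical, the dataset is modeled as a plain dict, and the cache omits the size bound and TTL. Signal wiring is left out; a SIGHUP handler would typically only set a flag that a worker thread acts on by calling `reload()`, since taking a lock inside a signal handler risks deadlock.

```python
import threading
from typing import Callable, Optional


class DatasetStore:
    """Holds one in-memory dataset plus a lookup cache, reloadable at runtime."""

    def __init__(self, loader: Callable[[], dict]):
        self._loader = loader          # hypothetical: reads the file, returns ip -> record
        self._lock = threading.Lock()
        self._data = loader()
        self._cache: dict = {}

    def lookup(self, ip: str) -> Optional[dict]:
        with self._lock:
            if ip in self._cache:
                return self._cache[ip]
            record = self._data.get(ip)
            self._cache[ip] = record
            return record

    def reload(self) -> None:
        # The expensive load runs outside the lock, so lookups keep being
        # served from the old data while the new file is parsed.
        new_data = self._loader()
        with self._lock:
            # Swap the reference and drop cached results from the old data.
            self._data = new_data
            self._cache.clear()
```

Because the lock is held only for the reference swap and the cache clear, a lookup sees either the old dataset or the new one in full, never a mix.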

There is no scheduled automatic refresh. Dataset updates are operator-driven via SIGHUP or process restart.


GeoIP enrichment populates audit log fields for geographic and network analysis. The user_risk_score field used in policy rule conditions (via user_risk_score_min) is currently computed from Credential Intelligence breach corpus matches, not from GeoIP data. A CredInt hit maps to a risk score based on the frequency bucket: Critical = 0.95, High = 0.80, Medium = 0.50, Low = 0.25.
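The bucket-to-score mapping above can be written as a small lookup. The score values come from this page; the function names, the lowercase bucket keys, and the choice to return `None` when there is no CredInt hit are assumptions made for this sketch.

```python
from typing import Optional

# Frequency bucket -> risk score, as listed above.
CREDINT_RISK_SCORES = {
    "critical": 0.95,
    "high": 0.80,
    "medium": 0.50,
    "low": 0.25,
}


def user_risk_score(bucket: Optional[str]) -> Optional[float]:
    """Map a CredInt frequency bucket to its score; None means no hit."""
    if bucket is None:
        return None  # assumption: the score for "no hit" is not specified here
    return CREDINT_RISK_SCORES[bucket.lower()]


def meets_risk_min(score: Optional[float], user_risk_score_min: float) -> bool:
    """Hypothetical check mirroring a user_risk_score_min rule condition."""
    return score is not None and score >= user_risk_score_min
```

For example, a High hit (0.80) satisfies a `user_risk_score_min` of 0.50, while a Low hit (0.25) does not.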

GeoIP fields are available for querying in your SIEM for geographic anomaly detection and jurisdiction monitoring, but they do not directly feed into the Policy Engine’s risk score condition at this time.


The /readyz endpoint reports GeoIP enrichment status:

{
  "geoip_enrichment": {
    "ip2location": "ok",
    "iptoasn": "ok",
    "arin": "absent",
    "cache": "1024/131072"
  }
}

GeoIP degradation is advisory — it does not fail the readiness probe. The gateway runs at full capacity without GeoIP datasets; all enrichment fields are null until datasets are loaded.
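Because GeoIP degradation never fails the probe, a monitoring script has to read the status block itself to notice a missing dataset. The sketch below parses a /readyz body of the shape shown above. The `summarize_geoip_status` helper is hypothetical, and reading the `cache` value as "entries used / capacity" is an assumption based on the sample value.

```python
import json


def summarize_geoip_status(readyz_body: str) -> dict:
    """List datasets that are not "ok" and split the cache counter."""
    status = json.loads(readyz_body)["geoip_enrichment"]
    # Assumed format: "<entries used>/<capacity>", e.g. "1024/131072".
    used, capacity = (int(part) for part in status["cache"].split("/"))
    datasets = {name: state for name, state in status.items() if name != "cache"}
    return {
        "not_ok": sorted(name for name, state in datasets.items() if state != "ok"),
        "cache_used": used,
        "cache_capacity": capacity,
    }
```

An alert rule could then fire when `not_ok` contains a dataset that the deployment expects to be loaded.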