GeoIP enrichment
GeoIP enrichment adds geographic and network metadata to every audit log entry. The gateway resolves the requesting client’s IP address and the destination provider’s IP address against offline datasets loaded into memory, producing country, region, city, ISP, and ASN fields. This data supports geographic access monitoring, jurisdiction compliance, and network-level anomaly detection.
GeoIP enrichment runs in the payload analysis stage, alongside DLP inspection and Credential Intelligence. It adds no external network dependency — all lookups are against in-memory datasets with sub-millisecond latency.
Enriched fields
Source IP (client)
| Audit log field | Type | Description |
|---|---|---|
| src_ip | string | Client IP address, from X-Forwarded-For or direct connection |
| src_country_code | string | ISO 3166-1 alpha-2 country code (e.g., US) |
| src_country_name | string | Full country name |
| src_region | string | State or province |
| src_city | string | City name |
| src_isp | string | Internet service provider name |
| src_asn | integer | Autonomous System Number |
| src_asn_org | string | Organization name for the ASN |
| src_arin_org | string | ARIN organization name (US/CA IPs only; null for other regions) |
Destination IP (provider endpoint)
| Audit log field | Type | Description |
|---|---|---|
| dst_ip | string | IP address of the model provider endpoint |
| dst_country_code | string | Country code for the provider endpoint |
| dst_asn | integer | ASN for the provider endpoint |
| dst_asn_org | string | Organization name for the provider endpoint ASN |
If a dataset is unavailable or does not contain an entry for the IP address, the corresponding fields are null. The audit entry is still written — GeoIP fields are optional enrichment, not required for the audit record.
HMAC chain inclusion
Source and destination IP addresses (src_ip, dst_ip) are included in the HMAC chain computation because they are observed network facts. GeoIP-derived fields (country, region, city, ISP, ASN) are excluded from the HMAC chain because they are derived from datasets that may be updated between verifications. Updating your GeoIP datasets does not invalidate existing HMAC chains. See Audit log — HMAC chain verification.
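The inclusion rule can be illustrated with a minimal Python sketch. The gateway's actual chain format, key handling, and serialization are not described here, so the chain_mac function, the canonical-JSON encoding, and the choice of HMAC-SHA256 are all assumptions made for illustration. Only the split between observed and GeoIP-derived fields comes from the text above.

```python
import hashlib
import hmac
import json

# GeoIP-derived audit fields, taken from the enriched-field tables.
# These are left out of the MAC input so a dataset update cannot change it.
GEOIP_DERIVED_FIELDS = {
    "src_country_code", "src_country_name", "src_region", "src_city",
    "src_isp", "src_asn", "src_asn_org", "src_arin_org",
    "dst_country_code", "dst_asn", "dst_asn_org",
}


def chain_mac(key: bytes, prev_mac: bytes, entry: dict) -> bytes:
    """Hypothetical chain step: MAC over the previous MAC plus the covered fields."""
    covered = {k: v for k, v in entry.items() if k not in GEOIP_DERIVED_FIELDS}
    # Assumed canonical form: sorted keys, compact separators.
    payload = json.dumps(covered, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, prev_mac + payload, hashlib.sha256).digest()
```

Under these assumptions, re-enriching an old entry with a newer dataset (for example, a changed src_country_code) leaves the MAC unchanged, while any change to src_ip or dst_ip produces a different MAC.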
Datasets
The gateway uses five offline datasets, all loaded into memory at startup. All datasets are optional: the gateway operates without GeoIP enrichment if no datasets are configured.
| Dataset | Purpose | Fields provided |
|---|---|---|
| MaxMind GeoIP2 City | Primary geographic data | country_code, country_name, region, city |
| MaxMind Anonymous IP | VPN/proxy/Tor/hosting detection | is_vpn, is_proxy, is_tor, is_hosting |
| IP2Location DB25 | Fallback geographic data + ISP | country_code, country_name, region, city, isp |
| iptoasn | ASN resolution | asn, asn_org |
| ARIN Bulk Whois | US/CA organization name | arin_org |
Priority logic: MaxMind is the primary source for geographic fields. If MaxMind is not configured, IP2Location is used as the fallback. For ASN data, iptoasn is the primary source; MaxMind ASN is the fallback if iptoasn has no entry for the IP.
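The priority logic can be sketched as two small resolver functions. This is an illustrative Python sketch, not the gateway's code: the resolve_geo and resolve_asn names and the "lookup is a callable that returns a dict or None" convention are assumptions. It follows the wording above literally, so the geographic fallback triggers when MaxMind is not configured, while the ASN fallback triggers when iptoasn has no entry for the IP.

```python
from typing import Callable, Optional

# Hypothetical lookup type: takes an IP string, returns a field dict or None.
Lookup = Callable[[str], Optional[dict]]


def resolve_geo(ip: str, maxmind: Optional[Lookup], ip2location: Optional[Lookup]) -> Optional[dict]:
    """Geographic fields: MaxMind if configured, otherwise IP2Location."""
    source = maxmind if maxmind is not None else ip2location
    return source(ip) if source is not None else None


def resolve_asn(ip: str, iptoasn: Optional[Lookup], maxmind_asn: Optional[Lookup]) -> Optional[dict]:
    """ASN fields: iptoasn first, MaxMind ASN when iptoasn has no entry."""
    if iptoasn is not None:
        hit = iptoasn(ip)
        if hit is not None:
            return hit
    return maxmind_asn(ip) if maxmind_asn is not None else None
```

A None result from either function corresponds to null fields in the audit entry.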
Configuration
Each dataset path is configured via environment variable:
| Environment variable | Dataset |
|---|---|
| GEOIP_MAXMIND_PATH | Path to MaxMind GeoIP2 City .mmdb file |
| GEOIP_MAXMIND_ANON_PATH | Path to MaxMind Anonymous IP .mmdb file |
| GEOIP_IP2LOCATION_PATH | Path to IP2Location DB25 .BIN file |
| GEOIP_IPTOASN_PATH | Path to iptoasn .tsv file |
| GEOIP_ARIN_PATH | Path to ARIN Bulk Whois text dump |
Set one or more of these variables before starting the gateway. Omitting a variable disables that dataset — the corresponding fields will be null in audit entries.
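A pre-flight check along these lines can show which datasets a given environment would enable. This is a hedged Python sketch: the configured_datasets helper is hypothetical, the short dataset names are borrowed from the /readyz keys, and treating an empty value the same as an unset variable is an assumption rather than documented gateway behavior.

```python
import os
from typing import Mapping, Optional

# Environment variable per dataset, as listed in the table above.
DATASET_ENV_VARS = {
    "maxmind": "GEOIP_MAXMIND_PATH",
    "maxmind_anon": "GEOIP_MAXMIND_ANON_PATH",
    "ip2location": "GEOIP_IP2LOCATION_PATH",
    "iptoasn": "GEOIP_IPTOASN_PATH",
    "arin": "GEOIP_ARIN_PATH",
}


def configured_datasets(environ: Mapping[str, str] = os.environ) -> dict:
    """Map each dataset name to its configured path, or None if disabled."""
    result: dict = {}
    for name, var in DATASET_ENV_VARS.items():
        value: Optional[str] = environ.get(var)
        # Assumption: an empty string counts as "not set".
        result[name] = value if value else None
    return result
```

Datasets that map to None here are the ones whose audit fields would be null.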
Anonymous IP detection
MaxMind offers a separate Anonymous IP database that classifies IP addresses as VPN providers, public proxies, Tor exit nodes, or hosting/datacenter ranges. This is an optional add-on to the standard GeoIP2 City database and is loaded independently via its own environment variable.
When the Anonymous IP database is loaded, four additional boolean fields are added to every audit log entry for the source IP:
| Audit log field | Type | Description |
|---|---|---|
| is_vpn | boolean or null | true if the IP is associated with a commercial VPN provider |
| is_proxy | boolean or null | true if the IP is a known public proxy |
| is_tor | boolean or null | true if the IP is a Tor exit node |
| is_hosting | boolean or null | true if the IP belongs to a hosting provider or datacenter |
Configuration
Set GEOIP_MAXMIND_ANON_PATH to the path of the MaxMind Anonymous IP .mmdb file:
GEOIP_MAXMIND_ANON_PATH=/etc/arbitex/GeoIP2-Anonymous-IP.mmdb
The Anonymous IP database is loaded at startup alongside the other GeoIP datasets and participates in the SIGHUP hot-reload cycle: updating the file and sending SIGHUP is sufficient to apply a new database without restarting the gateway.
Graceful degradation
If GEOIP_MAXMIND_ANON_PATH is not set, or if the file cannot be read at startup, all four fields (is_vpn, is_proxy, is_tor, is_hosting) are null in audit entries. This is not treated as an error: the gateway operates normally without the Anonymous IP database, and the standard GeoIP enrichment fields are unaffected.
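The degradation behavior can be pictured as a mapping from a raw Anonymous IP record to the four audit fields. The Python sketch below is an assumption about how such a mapping might look, not the gateway's implementation. The record keys (is_anonymous_vpn, is_public_proxy, is_tor_exit_node, is_hosting_provider) are the flag names MaxMind uses in Anonymous IP records, and the anonymous_fields helper is hypothetical.

```python
from typing import Optional

AUDIT_FIELDS = ("is_vpn", "is_proxy", "is_tor", "is_hosting")

# Assumed mapping from audit field to MaxMind Anonymous IP record key.
RECORD_KEYS = {
    "is_vpn": "is_anonymous_vpn",
    "is_proxy": "is_public_proxy",
    "is_tor": "is_tor_exit_node",
    "is_hosting": "is_hosting_provider",
}


def anonymous_fields(record: Optional[dict], db_loaded: bool) -> dict:
    """Return the four audit fields; all None when no data is available."""
    if not db_loaded or record is None:
        # Database absent, unreadable, or no entry for this IP: null fields.
        return {field: None for field in AUDIT_FIELDS}
    # A flag missing from the record is read as False.
    return {field: bool(record.get(RECORD_KEYS[field], False)) for field in AUDIT_FIELDS}
```

The point of the three-valued result is that null means "unknown", which is distinct from false ("looked up and not flagged").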
Policy rule use cases
The four anonymous IP fields are available as conditions in the Policy Engine. Example use cases:
- Block Tor exit nodes: flag or deny requests where is_tor = true to prevent anonymous Tor-routed traffic from reaching model providers.
- Flag VPN traffic: set is_vpn = true as a condition to route VPN-originated requests to a stricter policy group or add a warning label to audit entries.
- Restrict hosting IPs: deny requests originating from datacenter IP ranges (is_hosting = true) to block automated or non-human traffic.
- Combine conditions: combine is_vpn = true OR is_tor = true with a geographic condition to tighten access for high-risk origins.
Refer to the Policy Rule Reference for the exact condition field names and supported operators.
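The combined condition from the last use case can be expressed in plain Python to make the null handling explicit. This is not Policy Engine syntax. The is_high_risk_origin function and the allowed_countries parameter are hypothetical, and both the "null is not flagged" and "unknown country counts as outside" choices are assumptions a real policy would need to decide deliberately.

```python
def is_high_risk_origin(entry: dict, allowed_countries: set) -> bool:
    """(is_vpn OR is_tor) AND the source country is outside the allowed set."""
    # Only an explicit True counts; None (no Anonymous IP data) is not flagged.
    anonymized = entry.get("is_vpn") is True or entry.get("is_tor") is True
    # A missing or null country code is treated as outside the allowed set.
    outside = entry.get("src_country_code") not in allowed_countries
    return anonymized and outside
```

For example, a VPN-originated request from a country in the allowed set would not match, while a Tor-routed request with no resolvable country would.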
Obtaining the database
The MaxMind Anonymous IP database is available in two tiers:
- GeoIP2 Anonymous IP (commercial) — updated twice weekly; highest accuracy. Requires a MaxMind license key. Download via the MaxMind GeoIP Update tool or the download API.
- GeoLite2 Anonymous IP (free, registration required) — updated weekly; lower coverage than the commercial edition. Available at maxmind.com after creating a free account.
Both editions produce a .mmdb file that is compatible with GEOIP_MAXMIND_ANON_PATH.
Health check
The /readyz response includes a dedicated status field for the Anonymous IP database:
{ "geoip_enrichment": { "maxmind": "ok", "maxmind_anon": "ok", "ip2location": "ok", "iptoasn": "ok", "arin": "absent", "cache": "1024/131072" }}
When GEOIP_MAXMIND_ANON_PATH is not configured, maxmind_anon appears as "absent". A load failure at startup logs a geoip_anon_missing event. A successful load logs geoip_anon_loaded. As with all GeoIP datasets, an absent or failed Anonymous IP database does not fail the readiness probe.
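Because GeoIP degradation does not fail the probe, a monitoring script has to inspect the response body itself. The Python sketch below parses a /readyz body shaped like the example above. The geoip_status helper is hypothetical, and reading the cache value as "entries used/capacity" is an assumption based on its format.

```python
import json


def geoip_status(readyz_body: str) -> dict:
    """Summarize the geoip_enrichment section of a /readyz response body."""
    section = json.loads(readyz_body).get("geoip_enrichment", {})
    datasets = {k: v for k, v in section.items() if k != "cache"}
    cache = None
    if "cache" in section:
        # Assumed format: "<entries used>/<capacity>".
        used, capacity = section["cache"].split("/")
        cache = (int(used), int(capacity))
    not_ok = sorted(name for name, state in datasets.items() if state != "ok")
    return {"datasets": datasets, "not_ok": not_ok, "cache": cache}
```

Applied to the example response, the not_ok list would contain only arin, which can be alerted on or ignored depending on whether that dataset is expected in the deployment.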
Updating datasets
To update a GeoIP dataset without restarting the gateway:
- Replace the dataset file on disk at the configured path
- Send SIGHUP to the gateway worker process
The gateway reloads all dataset files atomically: new data is loaded outside the request path, then swapped in under a lock. In-flight requests are not affected. The in-memory lookup cache (128K entries, 4-hour TTL) is cleared on reload so stale entries drain immediately.
There is no scheduled automatic refresh. Dataset updates are operator-driven via SIGHUP or process restart.
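The reload behavior described above (load outside the request path, swap under a lock, clear the cache) follows a common pattern that can be sketched in Python. The DatasetStore class and its loader callable are hypothetical and stand in for the gateway's internals. The sketch omits the 4-hour TTL and the 128K entry limit to stay short.

```python
import threading
from typing import Callable, Optional


class DatasetStore:
    """Illustrative in-memory dataset with a lookup cache and atomic reload."""

    def __init__(self, loader: Callable[[], dict]) -> None:
        self._loader = loader
        self._lock = threading.Lock()
        self._data = loader()
        self._cache: dict = {}

    def lookup(self, ip: str) -> Optional[str]:
        with self._lock:
            if ip in self._cache:
                return self._cache[ip]
            data = self._data  # snapshot the current dataset reference
        result = data.get(ip)
        with self._lock:
            # Cache only if no reload swapped the dataset in the meantime.
            if self._data is data:
                self._cache[ip] = result
        return result

    def reload(self) -> None:
        fresh = self._loader()  # slow load happens outside the lock
        with self._lock:
            self._data = fresh  # quick swap under the lock
            self._cache.clear()  # drop entries derived from the old data
```

In a real process, reload would be triggered by SIGHUP, typically by handing the work to a background thread rather than reloading inside the signal handler. A lookup that started before the swap finishes against the old snapshot, which matches the statement that in-flight requests are not affected.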
Relationship to risk scoring
GeoIP enrichment populates audit log fields for geographic and network analysis. The user_risk_score field used in policy rule conditions (via user_risk_score_min) is currently computed from Credential Intelligence breach corpus matches, not from GeoIP data. A CredInt hit maps to a risk score based on the frequency bucket: Critical = 0.95, High = 0.80, Medium = 0.50, Low = 0.25.
GeoIP fields are available for querying in your SIEM for geographic anomaly detection and jurisdiction monitoring, but they do not directly feed into the Policy Engine’s risk score condition at this time.
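The bucket-to-score mapping in this section can be written out as a small lookup. This Python sketch is illustrative only: the function names are hypothetical, returning None when there is no CredInt hit is an assumption (the text does not say what score applies in that case), and the comparison assumes user_risk_score_min is an inclusive threshold.

```python
from typing import Optional

# Scores per CredInt frequency bucket, as listed in this section.
CREDINT_RISK_SCORES = {"Critical": 0.95, "High": 0.80, "Medium": 0.50, "Low": 0.25}


def user_risk_score(bucket: Optional[str]) -> Optional[float]:
    """Score for a CredInt hit; None (an assumption) when there is no hit."""
    if bucket is None:
        return None
    return CREDINT_RISK_SCORES[bucket]


def meets_risk_minimum(score: Optional[float], user_risk_score_min: float) -> bool:
    """Assumed semantics: the condition matches when score >= minimum."""
    return score is not None and score >= user_risk_score_min
```

Note that no GeoIP field appears anywhere in this computation, which is the point of the section: geographic data informs SIEM queries, not the risk score condition.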
Health check
The /readyz endpoint reports GeoIP enrichment status:
{ "geoip_enrichment": { "ip2location": "ok", "iptoasn": "ok", "arin": "absent", "cache": "1024/131072" }}
GeoIP degradation is advisory: it does not fail the readiness probe. The gateway runs at full capacity without GeoIP datasets; all enrichment fields are null until datasets are loaded.
See also
- Audit log: Full audit entry schema including GeoIP fields
- Credential Intelligence: How breach corpus matches produce the risk score used in policy rules
- Policy Rule Reference: user_risk_score_min condition field