Credential Intelligence
Credential Intelligence (CredInt) is an optional DLP subsystem that detects known-compromised credentials in AI prompts and responses. When a user pastes a password, API key, or bearer token into an AI request, CredInt checks whether that credential has appeared in a known breach corpus and routes the request based on its breach frequency.
CredInt operates as a non-blocking side-channel: it never delays AI responses. Results arrive asynchronously and are written to the audit log.
Architecture
L1 Extraction: Credential Candidate Detection
Before any network call is made, a synchronous L1 extractor (CredentialExtractor) scans the request text for credential candidates using four pattern families:
| Pattern family | Example matches |
|---|---|
| Explicit assignment | password=s3cr3t, api_key=abc123, secret=VALUE |
| Environment variable | DB_PASSWORD=mypassword, AWS_SECRET_ACCESS_KEY=... |
| Authorization header | Authorization: Bearer TOKEN, X-API-Key: VALUE |
| High-entropy token | Quoted or bare tokens with Shannon entropy > 3.5, 2+ character classes |
The extractor deduplicates candidates by value and sorts them by position in the text. Candidate cleartext is never logged at any point in the pipeline — only the SHA-1 prefix (first 8 hex chars) appears in audit records.
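The dedupe-and-sort step and the audit-safe prefix described above can be sketched as follows. This is a minimal illustration, not the CredentialExtractor implementation: the two regexes, the `Candidate` type, and both function names are hypothetical, and only two of the four pattern families are shown.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical, simplified patterns for two of the four families.
PATTERNS = {
    "EXPLICIT_ASSIGNMENT": re.compile(
        r"\b(?:password|api_key|secret)\s*=\s*(\S+)", re.IGNORECASE
    ),
    "AUTHORIZATION_HEADER": re.compile(
        r"\bAuthorization:\s*Bearer\s+(\S+)", re.IGNORECASE
    ),
}


@dataclass(frozen=True)
class Candidate:
    value: str         # cleartext; held in memory only, never logged
    position: int      # offset of the value in the scanned text
    context_type: str


def extract_candidates(text: str) -> list[Candidate]:
    found: dict[str, Candidate] = {}
    for context_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            candidate = Candidate(match.group(1), match.start(1), context_type)
            previous = found.get(candidate.value)
            # Deduplicate by value, keeping the earliest occurrence.
            if previous is None or candidate.position < previous.position:
                found[candidate.value] = candidate
    # Sort by position in the text.
    return sorted(found.values(), key=lambda c: c.position)


def audit_prefix(value: str) -> str:
    """First 8 hex characters of the SHA-1 digest, safe for audit records."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()[:8]
```

Keeping the cleartext only on the in-memory `Candidate` and exposing a separate `audit_prefix` helper makes it harder to log the raw value by accident.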
High-entropy token detection uses:
- Shannon entropy threshold: > 3.5 bits/character
- Minimum character class requirement: 2+ classes (letters, digits, symbols)
- False-positive filters: UUIDs, URLs, and common English words are excluded
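The entropy and character-class checks listed above might look like the sketch below. The function names and the UUID and URL regexes are assumptions made for illustration, and the common-English-word filter is omitted because the source does not describe its word list.

```python
import math
import re
from collections import Counter

# Hypothetical false-positive filters for UUIDs and URLs.
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)
URL_RE = re.compile(r"^[a-z][a-z0-9+.-]*://", re.IGNORECASE)


def shannon_entropy(token: str) -> float:
    """Shannon entropy in bits per character."""
    if not token:
        return 0.0
    counts = Counter(token)
    total = len(token)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def character_classes(token: str) -> int:
    """Count how many of the three classes (letters, digits, symbols) appear."""
    has_letter = any(c.isalpha() for c in token)
    has_digit = any(c.isdigit() for c in token)
    has_symbol = any(not c.isalnum() for c in token)
    return has_letter + has_digit + has_symbol


def looks_like_secret(token: str) -> bool:
    if UUID_RE.match(token) or URL_RE.match(token):
        return False  # excluded as a known false-positive shape
    return shannon_entropy(token) > 3.5 and character_classes(token) >= 2
```

One consequence of the threshold: per-character entropy cannot exceed log2 of the token length, so a token shorter than 12 characters can never pass the 3.5-bit test under this definition.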
Breach Corpus — 861M+ Known Compromised Credentials
Arbitex integrates with a separate CredInt microservice (credint:8202) that maintains the breach corpus. The corpus contains SHA-1 hashes of over 861 million credentials drawn from known public breach datasets. The corpus is frequency-weighted: each hash carries a bucket label indicating how many times the credential has appeared across breaches.
The microservice is not part of the platform container image — it is a separately deployed service. Configuration:

    CREDINT_SERVICE_URL=http://credint:8202   # default
    CREDINT_SERVICE_TIMEOUT=3.0               # seconds

k-Anonymity Lookup Protocol
CredInt uses a k-anonymity protocol to protect credential privacy during lookup:
- The platform computes the SHA-1 hash of each credential candidate locally.
- Only the 5-character hex prefix (20 bits) is sent to the CredInt microservice in a POST /v1/check request.
- The microservice returns all corpus entries matching that prefix, along with their frequency buckets.
- The platform checks the full hash locally — the microservice never sees the complete hash or the original cleartext.
This is identical to the Have I Been Pwned k-anonymity model. Even a compromised CredInt microservice cannot reconstruct the credential being checked.
The SHA-1 prefix stored in the audit log (sha1_prefix, the first 8 hex characters) is likewise a truncated value, never the full hash.
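The lookup steps above can be sketched in a few lines. This is an illustration under stated assumptions: `check_credential` and the `PrefixLookup` callable are hypothetical names, and the response shape (a mapping from full corpus hash to frequency bucket) is assumed rather than taken from the CredInt API.

```python
import hashlib
from typing import Callable, Optional

# Hypothetical response shape: full corpus hash -> frequency bucket.
PrefixLookup = Callable[[str], dict[str, str]]


def check_credential(candidate: str, lookup: PrefixLookup) -> Optional[str]:
    """Return the frequency bucket for a candidate, or None if not in the corpus."""
    digest = hashlib.sha1(candidate.encode("utf-8")).hexdigest()
    prefix = digest[:5]        # the only value that leaves the platform
    entries = lookup(prefix)   # stands in for the POST /v1/check round-trip
    return entries.get(digest) # full-hash comparison happens locally
```

Passing the transport in as a callable keeps the privacy-relevant logic (hash locally, send five characters, compare locally) testable without a running microservice.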
Frequency-Weighted Risk Buckets
Each breach corpus entry carries a frequency bucket indicating how many times it has appeared across known breaches:
| Bucket | Frequency threshold | Risk level |
|---|---|---|
| critical | Very high frequency (widely circulated breach) | Highest risk |
| high | High frequency (appeared in multiple major breaches) | High risk |
| medium | Moderate frequency | Medium risk |
| low | Infrequent (appeared in limited breach data) | Low risk |
When a lookup returns hits across multiple frequency buckets (multiple candidates in the same request), CredInt selects the worst bucket (highest severity) to represent the overall request risk.
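Selecting the worst bucket across candidates amounts to taking a maximum over a severity ordering. A minimal sketch, assuming the four bucket names from the table and a hypothetical `worst_bucket` helper:

```python
from typing import Iterable, Optional

# Buckets ordered from lowest to highest severity.
SEVERITY = {"low": 0, "medium": 1, "high": 2, "critical": 3}


def worst_bucket(buckets: Iterable[Optional[str]]) -> Optional[str]:
    """Return the highest-severity bucket, ignoring candidates with no hit."""
    hits = [b for b in buckets if b is not None]
    return max(hits, key=SEVERITY.__getitem__) if hits else None
```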
Routing Logic
CredInt’s routing decision depends on two inputs: the frequency bucket and the org’s DLP sensitivity setting.
| Frequency bucket | DLP sensitivity | Action | Audit flag |
|---|---|---|---|
| critical or high | high | soft_block (request blocked) | true |
| critical or high | standard | pass (elevated flag only) | true |
| medium or low | any | pass (flag only) | true |
| No hit | any | pass (no flag) | false |
soft_block means the AI request is blocked before reaching the model. The user sees a standard block message. The audit record captures credint_action: "soft_block".
Routing path is recorded in the audit log as credint_routing_path, which takes one of four values:
| Path | Meaning |
|---|---|
| credint_no_hit | Credential not found in breach corpus |
| credint_soft_block_high_sensitivity | Critical/high breach hit + high sensitivity = block |
| credint_elevated_flag_standard | Critical/high breach hit + standard sensitivity = flag |
| credint_medium_low_flag | Medium/low breach hit = flag only |
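The two tables above reduce to a small decision function. The sketch below is illustrative only: `RoutingDecision` and `route` are hypothetical names, and any sensitivity value other than "high" is treated as standard, which is an assumption.

```python
from typing import NamedTuple, Optional


class RoutingDecision(NamedTuple):
    action: str        # "soft_block" or "pass"
    audit_flag: bool
    routing_path: str  # value written to credint_routing_path


def route(bucket: Optional[str], sensitivity: str) -> RoutingDecision:
    if bucket is None:
        return RoutingDecision("pass", False, "credint_no_hit")
    if bucket in ("critical", "high"):
        if sensitivity == "high":
            return RoutingDecision(
                "soft_block", True, "credint_soft_block_high_sensitivity"
            )
        # Assumption: anything other than "high" behaves as standard.
        return RoutingDecision("pass", True, "credint_elevated_flag_standard")
    return RoutingDecision("pass", True, "credint_medium_low_flag")
```

For example, `route("critical", "standard")` passes the request but sets the audit flag, while the same bucket under high sensitivity yields a soft block.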
Confidence Scoring
Each CredInt audit record includes a credint_confidence score:
| Score | Meaning |
|---|---|
| 1.0 | Strong hit: direct corpus match |
| 0.5 | Partial or heuristic match |
| 0.0 | No match |
Circuit Breaker Behavior
The CredInt client implements a circuit breaker to protect against CredInt microservice unavailability:
| State | Condition | Behavior |
|---|---|---|
| Closed | Normal operation | Requests proceed to CredInt |
| Open | 3 consecutive failures | All CredInt calls bypassed immediately |
| Auto-reset | 60 seconds after opening | Circuit attempts to close on next request |
The circuit breaker uses monotonic time (not wall clock) for the 60-second reset window, making it immune to system clock changes.
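A circuit breaker with these parameters can be sketched as below. The class and method names are hypothetical and this is not the CredInt client's code. As a simplification, once the reset window elapses every caller is allowed through until a success or failure is recorded, rather than a single trial request.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 3,
        reset_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,  # unaffected by clock changes
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        if self._opened_at is None:
            return True  # closed: normal operation
        # Open: bypass calls until the reset window has elapsed.
        return self._clock() - self._opened_at >= self.reset_seconds

    def record_success(self) -> None:
        # A successful call closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None

    def record_failure(self) -> None:
        self._failures += 1
        if self._failures >= self.failure_threshold:
            # Open the circuit, or re-open it after a failed trial call.
            self._opened_at = self._clock()
```

Injecting the clock as a parameter keeps the 60-second window testable without sleeping.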
Fail-Open Design
CredInt is designed fail-open: if the microservice is unreachable, times out, or returns an unexpected HTTP status, the platform:
- Records credint_available: false in the audit log
- Returns CredIntResult.unavailable(), the sentinel result that signals no check was performed
- Does not block the AI request
This ensures CredInt microservice downtime never interrupts AI service delivery. The fail-open design is intentional — CredInt is a risk-flagging signal, not a hard gate.
credint_available: false in the audit log indicates that the check could not be completed. Monitor this field for CredInt microservice health.
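The fail-open behavior can be illustrated with a small wrapper. Only the name `CredIntResult.unavailable()` comes from the text above; the fields on the result type, the `check_fail_open` helper, and the choice of exception types are assumptions for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class CredIntResult:
    # Hypothetical fields; the real result type is not shown in this document.
    available: bool
    hit: bool = False
    frequency_bucket: Optional[str] = None

    @classmethod
    def unavailable(cls) -> "CredIntResult":
        """Sentinel: no check was performed."""
        return cls(available=False)


def check_fail_open(lookup: Callable[[], CredIntResult]) -> CredIntResult:
    """Run a lookup; transport or protocol failures degrade to the sentinel."""
    try:
        return lookup()
    except (OSError, TimeoutError, ValueError):
        # Unreachable service, timeout, or unexpected response:
        # report unavailability instead of blocking the AI request.
        return CredIntResult.unavailable()
```

The caller can then write `credint_available: false` whenever `result.available` is false and let the AI request continue.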
Org Sensitivity Configuration
CredInt behavior scales with the org’s DLP sensitivity setting. The two sensitivity levels are:

    # Standard (default): flags critical/high hits but does not block
    DLP_SENSITIVITY=standard

    # High: blocks on critical/high frequency breach hits
    DLP_SENSITIVITY=high

Configure via Admin → Organization Settings → DLP, or via the policy engine’s dlp_sensitivity parameter.
CredInt is disabled by default at the org level. Enable it explicitly:

    # Via OrgDLPConfig (platform API)
    curl -X PATCH https://your-platform/api/admin/org/dlp-config \
      -H "Authorization: Bearer $ADMIN_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"credint_enabled": true}'

Audit Fields
Every request processed by CredInt (whether hit or miss) writes the following fields to the audit log:
| Field | Type | Description |
|---|---|---|
| credint_enabled | bool | Whether CredInt was active for this request |
| credint_hit | bool | Whether a breach corpus match was found |
| frequency_bucket | string | Worst-case bucket: critical, high, medium, low, or null |
| context_type | string | L1 extraction context: EXPLICIT_ASSIGNMENT, ENVIRONMENT_VARIABLE, AUTHORIZATION_HEADER, HIGH_ENTROPY_CODE |
| sha1_prefix | string | First 8 hex characters of the matched SHA-1 hash (never full hash) |
| credint_confidence | float | Match confidence: 1.0, 0.5, or 0.0 |
| credint_available | bool | Whether the CredInt microservice was reachable |
| candidate_count | int | Number of credential candidates extracted from the text |
These fields are added as nullable columns in the audit_logs table (migration 041_credint_audit_fields). All of these fields are null when CredInt is disabled.
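For code that reads or writes these columns, the table above maps naturally onto a record type with nullable fields. The dataclass below is a hypothetical mirror of the table, not the platform's ORM model.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CredIntAuditFields:
    # Every field is nullable, matching the nullable audit_logs columns.
    credint_enabled: Optional[bool] = None
    credint_hit: Optional[bool] = None
    frequency_bucket: Optional[str] = None
    context_type: Optional[str] = None
    sha1_prefix: Optional[str] = None
    credint_confidence: Optional[float] = None
    credint_available: Optional[bool] = None
    candidate_count: Optional[int] = None


# With CredInt disabled, every column stays null.
disabled_row = asdict(CredIntAuditFields())
```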
Performance
CredInt is designed for sub-millisecond decision latency at the platform layer:
- L1 extraction is synchronous CPU-bound work; typical runtime < 1 ms for prompt-sized inputs
- CredInt microservice lookup is a single HTTP round-trip bounded by a 3-second timeout; once the circuit breaker opens, calls are bypassed immediately, so a failing service cannot stall lookups
- Fan-out: multiple credential candidates in a single request are checked concurrently via asyncio.gather
- Non-blocking: CredInt runs in a task created with asyncio.create_task; the AI request proceeds immediately, and CredInt completes asynchronously and writes to the audit log via callback
The on_complete callback in fire_credint_check() is used internally to write audit fields after the CredInt result arrives without blocking the streaming AI response.
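The fan-out and fire-and-forget pattern described in this section can be sketched with plain asyncio. The names `check_one`, `run_checks`, and `fire_check` are hypothetical stand-ins (the real `fire_credint_check()` signature is not shown here), and the lookup is simulated rather than sent over HTTP.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def check_one(candidate: str) -> Optional[str]:
    """Placeholder for one prefix lookup; returns a frequency bucket or None."""
    await asyncio.sleep(0)  # stands in for the HTTP round-trip
    return "high" if candidate == "hunter2" else None


async def run_checks(
    candidates: list[str],
    on_complete: Callable[[list[Optional[str]]], Awaitable[None]],
) -> None:
    # Fan-out: all candidates are looked up concurrently.
    buckets = await asyncio.gather(*(check_one(c) for c in candidates))
    # Hand the results to the audit writer once every lookup has finished.
    await on_complete(list(buckets))


def fire_check(candidates, on_complete) -> "asyncio.Task[None]":
    # Schedule the check and return at once. The caller should keep the
    # returned Task so it is not garbage-collected before it finishes.
    return asyncio.create_task(run_checks(candidates, on_complete))
```

Because `asyncio.create_task` only schedules the coroutine, the code that called `fire_check` keeps running (for example, streaming the AI response) until it next yields to the event loop.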
Deployment
CredInt Microservice
The CredInt microservice is a separate deployment. It exposes:
- POST /v1/check: accepts {prefix: "5-char hex", candidates: [{sha1: "40-char hex"}]} and returns frequency buckets
- GET /health: liveness probe
In Kubernetes, deploy it as a sidecar or internal service and set CREDINT_SERVICE_URL on the platform deployment.
Air-Gap Deployments
In air-gap mode (OUTPOST_AIRGAP=true), the Outpost does not connect to the CredInt microservice. CredInt checks are skipped and credint_available: false is recorded. See Air-Gap Deployment for full details.
Monitoring
Monitor CredInt health via the audit log:

    # Count credint_available=false events since the given timestamp
    # (set from= to one hour ago for a last-hour count)
    curl "https://your-platform/api/admin/audit-logs?credint_available=false&from=2026-03-13T00:00:00Z" \
      -H "Authorization: Bearer $ADMIN_TOKEN" | jq '.total'

Configure an alert on credint_available=false spikes to detect CredInt microservice outages; see Alert Configuration.
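For a rolling window, the from timestamp in the query above has to be computed at call time rather than hard-coded. A minimal sketch, assuming the same endpoint and query parameters as the curl example; the helper name `unavailable_events_url` is hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def unavailable_events_url(base_url: str, now: datetime, window: timedelta) -> str:
    """Build the audit-log query for credint_available=false events in a window."""
    start = (now - window).astimezone(timezone.utc)
    query = urlencode(
        {
            "credint_available": "false",
            "from": start.strftime("%Y-%m-%dT%H:%M:%SZ"),
        }
    )
    return f"{base_url}/api/admin/audit-logs?{query}"
```

A caller would pass `datetime.now(timezone.utc)` and `timedelta(hours=1)` to cover the last hour.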
Related
- DLP Pipeline Configuration: full DLP rule pipeline including CredInt integration
- Compliance Frameworks: regulatory framework bundles
- Alert Configuration: alerting on DLP and CredInt events
- Air-Gap Deployment: offline operation without CredInt connectivity