Guide
How skip tracing APIs work
A technical look under the hood at what happens between your API call and the match payload — written for developers and skeptical buyers.
If you're a developer evaluating a skip tracing API, you've probably noticed that vendor marketing pages don't tell you much about the actual mechanics. They tell you about match rates and price per lookup. They don't tell you what's happening inside the black box. This guide does.
The identity graph
Every skip tracing API is fundamentally a thin layer over a database called an identity graph. The graph is a large collection of records that link identifying attributes to each other: names, dates of birth, addresses, phone numbers, emails, relatives, and so on. Each attribute carries a source, a confidence weight, and a "last seen" timestamp.
The more independent sources that contribute to the graph, the better the resolution: a phone number that appears in three independent licensed sources tied to the same identity is very likely correct. A phone number that appears in only one source, last seen in 2019, is much weaker evidence.
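To make the "more sources, fresher data" idea concrete, here is a minimal Python sketch of how an attribute's evidence could be combined. The Observation schema, the two-year half-life, and the noisy-OR combination (which assumes the sources really are independent) are all illustrative assumptions, not any vendor's actual scoring method.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One sighting of an attribute value in one source (hypothetical schema)."""
    value: str        # e.g. a phone number
    source: str       # which licensed source reported it
    weight: float     # per-source confidence weight in [0, 1]
    last_seen: date   # when that source last reported the value


def corroboration(
    observations: list[Observation],
    today: date,
    half_life_days: float = 730.0,  # assumed: evidence loses half its weight every two years
) -> dict[str, float]:
    """Score each value as 1 - product(1 - decayed_weight) over its distinct sources."""
    # Keep only the strongest decayed sighting per (value, source) pair,
    # so a source repeating itself does not count as corroboration.
    best: dict[tuple[str, str], float] = {}
    for obs in observations:
        age_days = max((today - obs.last_seen).days, 0)
        decayed = obs.weight * 0.5 ** (age_days / half_life_days)
        key = (obs.value, obs.source)
        best[key] = max(best.get(key, 0.0), decayed)

    # Noisy-OR: the value is wrong only if every source is wrong.
    miss: dict[str, float] = {}
    for (value, _source), weight in best.items():
        miss[value] = miss.get(value, 1.0) * (1.0 - weight)
    return {value: 1.0 - m for value, m in miss.items()}
```

With these assumed weights, a number reported recently by three sources scores well above 0.9, while the same per-source weight on a single 2019 sighting decays to a score below 0.2.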
What happens when you make a request
You send the API some identifying input, usually some combination of name, address, phone, and email. The API runs an identity-resolution algorithm against the graph to find the cluster of records most likely to belong to your subject. This is the hard part: a common name can appear thousands of times in the graph, so the algorithm has to disambiguate using whatever supporting attributes you provided.
Once the algorithm picks a cluster, the API returns the contact attributes attached to that cluster — phones, emails, addresses — along with a confidence score for the overall match.
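The request flow above can be sketched as a toy resolver. Production systems use fuzzy matching and far richer features; the Cluster shape, the attribute weights, and the min_margin ambiguity rule here are hypothetical and exist only to show why supporting attributes matter.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cluster:
    """A group of graph records believed to be one person (hypothetical shape)."""
    cluster_id: str
    name: str
    addresses: set[str] = field(default_factory=set)  # stored normalized
    phones: set[str] = field(default_factory=set)
    emails: set[str] = field(default_factory=set)


# Assumed weights: how much each agreeing attribute contributes to the score.
WEIGHTS = {"name": 0.2, "address": 0.3, "phone": 0.3, "email": 0.2}


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(text.lower().split())


def score(query: dict[str, str], cluster: Cluster) -> float:
    """Return 0.0 if the name disagrees, else the sum of weights for agreeing attributes."""
    if normalize(query.get("name", "")) != normalize(cluster.name):
        return 0.0
    total = WEIGHTS["name"]
    supporting = (
        ("address", cluster.addresses),
        ("phone", cluster.phones),
        ("email", cluster.emails),
    )
    for field_name, known_values in supporting:
        value = normalize(query.get(field_name, ""))
        if value and value in known_values:
            total += WEIGHTS[field_name]
    return total


def resolve(
    query: dict[str, str], clusters: list[Cluster], min_margin: float = 0.1
) -> Optional[tuple[Cluster, float]]:
    """Pick the best cluster, or None when nothing matches or the top two are too close."""
    ranked = sorted(
        ((score(query, c), c) for c in clusters), key=lambda pair: pair[0], reverse=True
    )
    if not ranked or ranked[0][0] == 0.0:
        return None  # no candidate shares the name
    if len(ranked) > 1 and ranked[0][0] - ranked[1][0] < min_margin:
        return None  # ambiguous: cannot separate the leading candidates
    best_score, best_cluster = ranked[0]
    return best_cluster, best_score
```

With two "John Smith" clusters in the graph, a name-only query returns None because both candidates tie. Adding an address that only one cluster contains breaks the tie. The score here is on a 0 to 1 scale; a real API would map its internal score onto whatever confidence scale it publishes.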
Why match rates vary so much by input
Match rate is mostly a function of how much identifying input you provide. Examples:
- Just a common name: low match rate (too many candidates, can't disambiguate)
- Name + city/state: better but still ambiguous for common names
- Name + full address: high match rate, the address pins the identity
- Name + phone: very high — phones are nearly unique identifiers
- Name + DOB + address: highest possible
If a vendor's marketing claims a 90% match rate, ask what input they tested with. A match rate measured on complete, high-quality input tells you little about the rate you'll see on sparse input.
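One way to check a vendor's claim is to run a sample of your own records and break the match rate down by how much input each record carried. The sketch below assumes you have (record, matched) pairs from a trial run; the field names and tier labels are illustrative and mirror the list above.

```python
from __future__ import annotations

from collections import defaultdict


def input_tier(record: dict[str, str]) -> str:
    """Label a record by the richest input combination it satisfies."""
    present = {key for key, value in record.items() if value}
    if "name" not in present:
        return "other"
    if {"dob", "address"} <= present:
        return "name+dob+address"
    if "phone" in present:
        return "name+phone"
    if "address" in present:
        return "name+address"
    if {"city", "state"} <= present:
        return "name+city/state"
    return "name only"


def match_rate_by_tier(results: list[tuple[dict[str, str], bool]]) -> dict[str, float]:
    """Return the fraction of matched records within each input tier."""
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    for record, matched in results:
        tier = input_tier(record)
        totals[tier] += 1
        hits[tier] += int(matched)
    return {tier: hits[tier] / totals[tier] for tier in totals}
```

A single blended number hides the spread: if most of your list is name-only, the per-tier figure for that tier is the one that predicts your results.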
Single vs bulk endpoints
Most skip tracing APIs offer two patterns:
Single lookup — synchronous request, response in under a second. Use for real-time enrichment in user-facing flows (a process server in the field, a CRM that enriches a record on save, etc.).
Bulk lookup — asynchronous job, accepts hundreds to thousands of records, returns a job ID. Poll for results or get a webhook callback when the job finishes. Use for marketing list enrichment, CRM imports, and any workflow that doesn't need sub-second response.
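The polling side of the bulk pattern might look like the following sketch. It takes the status call as a function so it stays independent of any particular HTTP client; the "state" field and the "completed" and "failed" values are assumed names, so check your vendor's job schema.

```python
from __future__ import annotations

import time
from typing import Callable

TERMINAL_STATES = ("completed", "failed")  # assumed state names


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    *,
    interval: float = 2.0,       # first wait between polls, in seconds
    max_interval: float = 30.0,  # cap for the exponential backoff
    timeout: float = 600.0,      # give up after this many seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll get_status(job_id) with exponential backoff until the job reaches a terminal state."""
    deadline = clock() + timeout
    delay = interval
    while True:
        status = get_status(job_id)
        if status["state"] in TERMINAL_STATES:
            return status
        if clock() + delay > deadline:
            raise TimeoutError(f"job {job_id} still {status['state']!r} after {timeout}s")
        sleep(delay)
        delay = min(delay * 2, max_interval)
```

In a real integration, get_status would wrap the vendor's job-status endpoint. If the vendor offers webhooks, a loop like this is still a useful fallback for callbacks that never arrive.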
Confidence scores: what they actually mean
A confidence score is the API's estimate of how reliable the returned data is. It's not a calibrated probability, and there's no industry standard for how it's calculated, so treat the bands below as a rough guide on a 0 to 100 scale. In practice:
- 90+ : very high — multiple-source agreement, recently verified
- 75–89 : high — good signal but some attribute is older or single-source
- 60–74 : medium — usable, treat with skepticism on cold outreach
- under 60 : low — call last, or skip
Use confidence scores to prioritize your dial queue, not to throw out lower-confidence records entirely.
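As a sketch of that advice, the function below orders a dial queue by confidence and labels each record with its band instead of filtering anything out. The record shape (a dict with a "confidence" key on a 0 to 100 scale) is an assumption; the thresholds are the ones listed above.

```python
from __future__ import annotations


def band(confidence: float) -> str:
    """Map a 0 to 100 confidence score onto the bands described above."""
    if confidence >= 90:
        return "very high"
    if confidence >= 75:
        return "high"
    if confidence >= 60:
        return "medium"
    return "low"


def dial_queue(records: list[dict]) -> list[dict]:
    """Return every record, highest confidence first, each annotated with its band."""
    ordered = sorted(records, key=lambda r: r["confidence"], reverse=True)
    return [{**record, "band": band(record["confidence"])} for record in ordered]
```

Low-confidence numbers end up at the back of the queue, where they get dialed only if the stronger leads don't connect.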
What to look for in API design
A few quality signals when you're evaluating a skip tracing API:
- JSON-only request and response (not XML or weird custom formats)
- Idempotent bulk submission with job IDs you can poll
- Webhook delivery option for large batches
- Per-attribute source attribution in the response
- Clear error codes that distinguish "no match" from "bad input" from "rate limited"
- Sandbox environment with synthetic data for integration testing
If a vendor's API is missing more than one of these, the rest of the integration will probably hurt too.
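Clear error codes matter because each outcome calls for a different reaction in your client. The sketch below assumes a vendor that signals outcomes through standard HTTP status codes and a "matches" array in the body; both the mapping and the field name are hypothetical, so adapt them to the API you're integrating.

```python
from __future__ import annotations

from enum import Enum


class Outcome(Enum):
    MATCH = "match"
    NO_MATCH = "no_match"          # valid request, nothing found: don't retry
    BAD_INPUT = "bad_input"        # fix the record before resubmitting
    RATE_LIMITED = "rate_limited"  # back off, then retry
    SERVER_ERROR = "server_error"  # transient: retry with backoff


def classify(status_code: int, body: dict) -> Outcome:
    """Translate an HTTP status and parsed JSON body into an Outcome (assumed mapping)."""
    if status_code == 200:
        return Outcome.MATCH if body.get("matches") else Outcome.NO_MATCH
    if status_code in (400, 422):
        return Outcome.BAD_INPUT
    if status_code == 429:
        return Outcome.RATE_LIMITED
    if 500 <= status_code < 600:
        return Outcome.SERVER_ERROR
    raise ValueError(f"unexpected status code: {status_code}")


def should_retry(outcome: Outcome) -> bool:
    """Only rate limits and server errors are worth retrying unchanged."""
    return outcome in (Outcome.RATE_LIMITED, Outcome.SERVER_ERROR)
```

If an API returns the same generic error for all of these cases, your client can't tell a record that will never match from a request that would succeed on retry.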