Enrichment Pipeline
Every company passes through 12 sequential stages. Each stage adds intelligence, and the completeness scorer can send a company back through the pipeline if quality thresholds aren't met.
Input Resolution
Auto-detect CSV/Excel columns, normalize company names, resolve ambiguous inputs. Handles company names, domains, URLs, phone numbers, addresses, ZIP codes, and social profile URLs.
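As a rough illustration of how mixed inputs might be told apart, here is a minimal sketch. The function name `classify_input`, the regular expressions, and the `SOCIAL_HOSTS` set are assumptions made for this example, not the platform's actual rules.

```python
import re
from urllib.parse import urlparse

# Hypothetical host list, for illustration only.
SOCIAL_HOSTS = {"linkedin.com", "twitter.com", "facebook.com", "instagram.com"}

def classify_input(value: str) -> str:
    """Guess what kind of input a raw cell value is (illustrative heuristics)."""
    v = value.strip()
    # US ZIP or ZIP+4
    if re.fullmatch(r"\d{5}(-\d{4})?", v):
        return "zip"
    # Phone: only digits and common separators, with at least 7 digits
    if re.fullmatch(r"\+?[\d\s().-]{7,}", v) and sum(c.isdigit() for c in v) >= 7:
        return "phone"
    # Full URL, split into social profile vs. ordinary website
    if re.match(r"https?://", v, re.I):
        host = (urlparse(v).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        return "social_url" if host in SOCIAL_HOSTS else "url"
    # Bare domain such as acme.example
    if re.fullmatch(r"([a-z0-9-]+\.)+[a-z]{2,}", v, re.I):
        return "domain"
    # Street address: starts with a house number followed by text
    if re.match(r"\d+\s+\S+", v):
        return "address"
    # Anything else is treated as a company name
    return "company_name"
```

The checks run from most to least specific, so a value only falls through to `company_name` when nothing stricter matches.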
Company Discovery
Multi-query search across Google, Bing, and DuckDuckGo (DDG). Domain extraction with DNS validation; an MX record check confirms the domain can receive email. Social and hosting domains are filtered out.
Google Boost
Secondary search enrichment for companies with low-confidence domains. Adds search result context: meta descriptions, featured snippets, knowledge panel data.
Domain Validation
DNS/MX/SPF/DKIM/DMARC analysis. Bad domain filter (18 blocked domains). Social domain detection. Hosting provider classification. Domain age and SSL checks.
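For the DMARC part of this analysis, a small sketch: a DMARC policy is published as a TXT record made of semicolon-separated `tag=value` pairs, so reading the policy is mostly string parsing. The helper names `parse_dmarc` and `dmarc_policy` are hypothetical, and the DNS lookup itself is left out.

```python
from typing import Dict, Optional

def parse_dmarc(txt: str) -> Dict[str, str]:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags: Dict[str, str] = {}
    for part in txt.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    return tags

def dmarc_policy(txt: str) -> Optional[str]:
    """Return the published policy (none/quarantine/reject), or None
    when the record is not a DMARC record."""
    tags = parse_dmarc(txt)
    if tags.get("v") != "DMARC1":
        return None
    return tags.get("p")
```

For example, `dmarc_policy("v=DMARC1; p=reject; rua=mailto:dmarc@acme.example")` yields `"reject"`, while an SPF record passed by mistake yields `None`.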
Website Crawling
5-layer scraping hierarchy: Static HTTP + regex + schema.org -> JS data (__NEXT_DATA__, React props, API endpoints) -> Structural (/team, /about, /contact) -> Semantic (4-method extraction) -> Browser (Playwright, 3 renders max).
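The layered hierarchy can be read as "try the cheapest method first and escalate only when data is still missing". The sketch below shows that control flow with stand-in extractors. `crawl_with_escalation`, the layer names, and the `required` field set are all assumptions for illustration.

```python
from typing import Callable, Dict, List, Set, Tuple

# A layer is a (name, extractor) pair; extractors return whatever fields they found.
Layer = Tuple[str, Callable[[str], Dict[str, str]]]

def crawl_with_escalation(
    url: str, layers: List[Layer], required: Set[str]
) -> Tuple[Dict[str, str], List[str]]:
    """Run layers in order and stop as soon as every required field is present."""
    found: Dict[str, str] = {}
    used: List[str] = []
    for name, extract in layers:
        used.append(name)
        for key, value in extract(url).items():
            # Keep the first value seen, so cheaper layers take precedence.
            found.setdefault(key, value)
        if required.issubset(found):
            break
    return found, used
```

With this shape, the browser layer only runs for sites where the static, JS-data, structural, and semantic layers all came up short.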
Contact Extraction
4-method cascade: JSON-LD structured data (highest quality) -> Team card CSS pattern detection -> Heuristic proximity analysis (name near email/phone) -> LinkedIn profile URL extraction.
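The first method in this cascade, JSON-LD extraction, might look roughly like the following. It uses only the Python standard library and assumes contacts are published as schema.org `Person` objects at the top level of a JSON-LD block. Nested `@graph` structures are not handled, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser

class JsonLdParser(HTMLParser):
    """Collects the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buf = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buf = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buf)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks rather than failing the page.

def extract_people(html: str) -> list:
    """Return name/title/email for each top-level schema.org Person found."""
    parser = JsonLdParser()
    parser.feed(html)
    people = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        for item in items:
            if isinstance(item, dict) and item.get("@type") == "Person":
                people.append({
                    "name": item.get("name"),
                    "title": item.get("jobTitle"),
                    "email": item.get("email"),
                })
    return people
```

When this returns an empty list, the cascade would move on to the next method, such as team card detection.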
Email Discovery
4-layer Node.js pipeline: Passive OSINT (DNS/DMARC/Wayback) -> Direct website crawl -> Multi-engine search (SearXNG/Bing/Brave/DDG) -> Google search via Playwright. Pattern learning per domain.
Email Prediction
8-pattern prediction engine: first@, first.last@, flast@, firstlast@, first_last@, last@, last.first@, f.last@. Learns dominant patterns from discovered emails and applies them to remaining contacts.
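A minimal sketch of the idea, using the eight patterns listed above: generate every candidate address for the contacts whose email is already known, count which pattern reproduces the known address, and apply the winner to the rest. The function names are hypothetical, and the sketch assumes simple, non-empty first and last names.

```python
from collections import Counter

# The eight local-part patterns, keyed by the labels used in the text.
PATTERNS = {
    "first": lambda f, l: f,
    "first.last": lambda f, l: f"{f}.{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "firstlast": lambda f, l: f"{f}{l}",
    "first_last": lambda f, l: f"{f}_{l}",
    "last": lambda f, l: l,
    "last.first": lambda f, l: f"{l}.{f}",
    "f.last": lambda f, l: f"{f[0]}.{l}",
}

def candidates(first: str, last: str, domain: str) -> dict:
    """Build one candidate address per pattern."""
    f, l = first.lower(), last.lower()
    return {name: f"{build(f, l)}@{domain.lower()}" for name, build in PATTERNS.items()}

def learn_pattern(known: list):
    """known holds (first, last, email) tuples; return the most common pattern, if any."""
    votes = Counter()
    for first, last, email in known:
        email = email.lower()
        domain = email.split("@", 1)[1]
        for name, addr in candidates(first, last, domain).items():
            if addr == email:
                votes[name] += 1
    return votes.most_common(1)[0][0] if votes else None

def predict(first: str, last: str, domain: str, pattern: str) -> str:
    """Apply a learned pattern to a contact without a known email."""
    return candidates(first, last, domain)[pattern]
```

Predicted addresses are guesses until verified, which is why they would still need to pass the verification stage.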
Social Enrichment
8-platform discovery: LinkedIn, Twitter, Facebook, Instagram, YouTube, GitHub, Crunchbase, Glassdoor. Cross-platform profile linking via URL pattern matching + DDG fallback.
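URL pattern matching for the eight platforms could be sketched as below. The regular expressions are illustrative assumptions about common profile URL shapes (the Glassdoor and Crunchbase paths in particular), not the platform's actual rules, and `find_profiles` is a hypothetical name.

```python
import re

# Illustrative profile URL shapes per platform; real URLs vary more than this.
PLATFORM_PATTERNS = {
    "linkedin": r"linkedin\.com/(?:company|in)/[\w%-]+",
    "twitter": r"twitter\.com/\w+",
    "facebook": r"facebook\.com/[\w.-]+",
    "instagram": r"instagram\.com/[\w.]+",
    "youtube": r"youtube\.com/(?:@|c/|channel/|user/)[\w-]+",
    "github": r"github\.com/[\w-]+",
    "crunchbase": r"crunchbase\.com/organization/[\w-]+",
    "glassdoor": r"glassdoor\.com/Overview/[\w.-]+",
}

def find_profiles(text: str) -> dict:
    """Return the first profile URL found for each platform in page text or HTML."""
    profiles = {}
    for platform, pattern in PLATFORM_PATTERNS.items():
        # The lookbehind rejects look-alike hosts such as notgithub.com.
        match = re.search(r"(?<![\w-])(?:www\.)?" + pattern, text, re.I)
        if match:
            profiles[platform] = "https://" + match.group(0)
    return profiles
```

Platforms missing from the result are the ones a search fallback would then try to fill.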
Intelligence Modules
7 pluggable modules: DNS intel, technology detection, email pattern analysis, social signal detection, domain age analysis, SSL certificate intel, WHOIS intelligence. Entity graph with multi-source triangulation.
AI Qualification
14-rule junk removal. 5-dimension scoring: title authority, email quality, completeness, LinkedIn presence, target match. Persona classification (Economic Buyer, Champion, Technical Evaluator, Influencer).
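One plausible way to combine the five dimensions is a weighted sum. The weights below are invented for illustration, since the text does not state them, and `qualification_score` is a hypothetical name.

```python
# Hypothetical weights; they sum to 1.0 so the result lands on a 0-100 scale.
WEIGHTS = {
    "title_authority": 0.30,
    "email_quality": 0.25,
    "completeness": 0.20,
    "linkedin_presence": 0.10,
    "target_match": 0.15,
}

def qualification_score(dims: dict) -> float:
    """Combine per-dimension scores (each 0.0 to 1.0) into a 0-100 score."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Missing dimensions count as 0; out-of-range values are clamped.
        value = min(max(dims.get(name, 0.0), 0.0), 1.0)
        total += weight * value
    return round(100 * total, 1)
```

A contact with a strong title but no verified email would score lower than one with both, which matches the intent of weighting title authority and email quality most heavily in this sketch.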
Verification & Export
8-check email verification (syntax/DNS/SMTP/catch-all/disposable/role/auth/Reacher). Entity dedup. 4-tier scoring (TIER_1_SEND to SKIP). Multi-format export: Excel (5 sheets), CSV, JSON, NDJSON, webhook.
Recursive Enrichment
After all 12 stages, the completeness scorer evaluates each company (0-100). Companies below the configured threshold automatically re-enter the pipeline with adjusted parameters: deeper crawl limits, alternative search queries, and expanded social discovery. This recursive loop continues until quality targets are met or the maximum pass count is reached.
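The re-entry logic can be sketched as a bounded loop. Everything named here (`PassParams`, `escalate`, `enrich_until_complete`, the default threshold of 70, and the limit of 3 passes) is an assumption for illustration. The pipeline and scorer are passed in as plain callables.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class PassParams:
    """Hypothetical per-pass settings."""
    crawl_depth: int = 2
    use_alt_queries: bool = False
    expand_social: bool = False

def escalate(params: PassParams) -> PassParams:
    """Adjust parameters for the next pass: crawl deeper and widen the search."""
    return replace(
        params,
        crawl_depth=params.crawl_depth + 1,
        use_alt_queries=True,
        expand_social=True,
    )

def enrich_until_complete(company, run_pipeline, score, threshold=70, max_passes=3):
    """Re-run the pipeline until the score meets the threshold or passes run out."""
    params = PassParams()
    record = {}
    passes = 0
    for passes in range(1, max_passes + 1):
        record = run_pipeline(company, params, record)
        if score(record) >= threshold:
            break
        params = escalate(params)
    return record, passes
```

Written as a loop rather than literal recursion, the pass limit is explicit, so a company that never reaches the threshold still terminates after `max_passes` runs.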