B2B Discovery Engine
AI-powered company discovery, domain validation, and sales intelligence enrichment
The B2B Discovery Engine is the entry point of the LeadsLogix AI lead generation pipeline. It accepts any seed input -- company name, domain, phone number, address, ZIP code, or social profile URL -- and runs a 7-stage discovery process to find and validate company websites. Multi-query search across Google, Bing, and DuckDuckGo with DNS/MX/SPF/DKIM/DMARC validation ensures only legitimate B2B domains proceed to the web scraping pipeline.
Pipeline Stages
The stages run automatically in sequence. The secondary enrichment pass (Google Boost) runs only for low-confidence results.
Input Normalization
Parse and clean company names, extract domains from URLs, detect input type (name/domain/phone/address/social). Auto-detect CSV/Excel column mapping for the lead enrichment platform.
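As an illustration of the input-type detection described above, here is a minimal sketch. The function names, the regular expressions, and the short list of social hosts are assumptions made for this example, not the engine's actual rules.

```python
import re
from typing import Optional
from urllib.parse import urlparse

# Assumed, illustrative list of social hosts.
SOCIAL_HOSTS = {"linkedin.com", "twitter.com", "facebook.com"}


def extract_domain(value: str) -> Optional[str]:
    """Return the bare host of a URL or domain-like string, else None."""
    candidate = value.strip().lower()
    if "://" not in candidate:
        candidate = "//" + candidate  # lets urlparse treat the text as a host
    try:
        host = urlparse(candidate).hostname or ""
    except ValueError:
        return None
    if host.startswith("www."):
        host = host[4:]
    return host if re.fullmatch(r"([a-z0-9-]+\.)+[a-z]{2,}", host) else None


def detect_input_type(value: str) -> str:
    """Classify a seed as zip, phone, social, domain, address, or name."""
    text = value.strip()
    digits = re.sub(r"\D", "", text)
    if re.fullmatch(r"\d{5}(-\d{4})?", text):  # US-style ZIP, an assumption
        return "zip"
    if re.fullmatch(r"\+?[\d\s().-]+", text) and 7 <= len(digits) <= 15:
        return "phone"
    domain = extract_domain(text)
    if domain:
        is_social = any(domain == h or domain.endswith("." + h) for h in SOCIAL_HOSTS)
        return "social" if is_social else "domain"
    if re.match(r"\d+\s+\S+", text):  # starts with a street number
        return "address"
    return "name"
```

Anything that matches no pattern falls through to "name", so free-text company names are the default case.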
Multi-Query Search
3-5 search queries per company across DuckDuckGo, Google, Bing. Query templates: company name + industry, company name + location, company name + products. Supports B2B sales intelligence workflows.
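The three query templates named above can be sketched as a small builder. The function name and keyword fields are hypothetical. How the engine reaches 3-5 queries per company is not stated here, so this sketch emits only one query per available field.

```python
from typing import Optional

# One template per pairing named in the text: name + industry,
# name + location, name + products.
TEMPLATES = (
    ("industry", "{name} {industry}"),
    ("location", "{name} {location}"),
    ("products", "{name} {products}"),
)


def build_queries(name: str, **fields: Optional[str]) -> list[str]:
    """Build one query per template whose field is present and non-empty."""
    clean_name = name.strip()
    queries = []
    for field, template in TEMPLATES:
        value = (fields.get(field) or "").strip()
        if value:
            queries.append(template.format(name=clean_name, **{field: value}))
    # With no extra fields, fall back to the bare company name.
    return queries or [clean_name]
```

For example, `build_queries("Acme Corp", industry="industrial valves")` yields a single name + industry query.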
Domain Extraction
Extract candidate domains from search results. Filter out a list of 18 known aggregator domains (dnb.com, alibaba.com, etc.), along with social domains and hosting providers. Remaining candidates are priority-scored by source reliability.
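A rough sketch of the filtering step, assuming search results arrive as plain URLs. Only the three bad domains named on this page are listed, since the full 18-entry list is not reproduced here. The social and hosting sets are illustrative assumptions.

```python
from urllib.parse import urlparse

# Only the aggregator domains named in the text; the real list has 18 entries.
BAD_DOMAINS = {"dnb.com", "alibaba.com", "volza.com"}
# Assumed examples of social and hosting-provider domains.
SOCIAL_DOMAINS = {"linkedin.com", "twitter.com", "facebook.com"}
HOSTING_DOMAINS = {"wixsite.com", "blogspot.com"}
BLOCKED = BAD_DOMAINS | SOCIAL_DOMAINS | HOSTING_DOMAINS


def host_of(url: str) -> str:
    """Lower-cased host of a URL with any leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_blocked(host: str) -> bool:
    """True if the host is a blocked domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in BLOCKED)


def candidate_domains(result_urls: list[str]) -> list[str]:
    """Return unique, unblocked hosts in first-seen order."""
    seen, kept = set(), []
    for url in result_urls:
        host = host_of(url)
        if host and not is_blocked(host) and host not in seen:
            seen.add(host)
            kept.append(host)
    return kept
```

Matching on the suffix as well as the exact host means regional subdomains such as uk.linkedin.com are filtered too.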
DNS/MX Validation
Full DNS resolution: A/AAAA records, MX records (email capability), SPF policy, DKIM selectors, DMARC alignment. Powers the email verification API pipeline downstream.
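To show how these records might be interpreted once fetched, here is a sketch that works on already-resolved record strings. It assumes the lookups themselves (A/AAAA, MX, TXT at the domain, TXT at `_dmarc.<domain>`) are done by a resolver elsewhere. The function names and result keys are hypothetical, and DKIM selector checks are left out.

```python
from typing import Optional


def find_spf(txt_records: list[str]) -> Optional[str]:
    """Return the first TXT record whose first term is v=spf1, if any."""
    for record in txt_records:
        if record.strip().lower().split()[:1] == ["v=spf1"]:
            return record
    return None


def parse_dmarc(txt_records: list[str]) -> dict[str, str]:
    """Parse the tag=value pairs of the first v=DMARC1 record, if any."""
    for record in txt_records:
        if record.strip().lower().startswith("v=dmarc1"):
            tags = {}
            for part in record.split(";"):
                if "=" in part:
                    key, value = part.split("=", 1)
                    tags[key.strip().lower()] = value.strip()
            return tags
    return {}


def dns_health(a_records, mx_records, domain_txt, dmarc_txt) -> dict:
    """Summarise resolved records into a few yes/no signals."""
    return {
        "resolves": bool(a_records),
        "can_receive_mail": bool(mx_records),
        "spf": find_spf(domain_txt) is not None,
        "dmarc_policy": parse_dmarc(dmarc_txt).get("p"),
    }
```

A domain with A and MX records, an SPF record, and `p=reject` in DMARC would come back with every signal set.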
Google Boost
Secondary enrichment pass for low-confidence results. Extracts emails, phones, LinkedIn profiles from Google Knowledge Panel and featured snippets for enhanced LinkedIn enrichment.
Social Mapping
Discover LinkedIn company page, Twitter, Facebook, and other social profiles from search results. Cross-reference for identity confirmation in the B2B sales intelligence graph.
Confidence Scoring
0-100 confidence score built from additive signals: source reliability (+30), domain match (+20), DNS health (+15), social confirmation (+10). Only domains that clear the configured confidence threshold proceed to the web scraping pipeline.
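The additive scoring above can be sketched as follows. Treating each signal as all-or-nothing is an assumption made for this example, and the signal keys are hypothetical names. The four listed weights sum to 75, so any further contributions toward 100 are not covered by this sketch.

```python
# Weights as listed in the text.
WEIGHTS = {
    "source_reliability": 30,
    "domain_match": 20,
    "dns_health": 15,
    "social_confirmation": 10,
}


def confidence_score(signals: dict[str, bool]) -> int:
    """Sum the weights of the signals that are present, clamped to 0-100."""
    score = sum(weight for key, weight in WEIGHTS.items() if signals.get(key))
    return max(0, min(100, score))
```

A domain with a reliable source and healthy DNS but no name match or social confirmation would score 45 under these assumptions.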
Key Capabilities
Any-Seed Input for AI Lead Generation
Start with company names, domains, phone numbers, physical addresses, ZIP codes, or social profile URLs. The engine normalizes and resolves any input format for the lead enrichment platform.
Multi-Engine Search for B2B Sales Intelligence
Parallel queries across DuckDuckGo, Google, and Bing with query template rotation. Each engine returns different results, improving discovery coverage for sales intelligence workflows.
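One way to picture the parallel fan-out is a thread pool that sends every query to every enabled engine and merges the results. This is only a sketch: the engine callables are stand-ins supplied by the caller, no real search API is called, and the template rotation mentioned above is not modelled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# An engine is any callable that takes a query and returns result URLs.
Engine = Callable[[str], list[str]]


def search_all(engines: dict[str, Engine], queries: list[str], workers: int = 5) -> list[str]:
    """Run every query on every engine in parallel and merge unique URLs."""
    jobs = [(name, query) for query in queries for name in engines]
    merged, seen = [], set()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps job order, so the merged list is deterministic.
        batches = pool.map(lambda job: engines[job[0]](job[1]), jobs)
        for batch in batches:
            for url in batch:
                if url not in seen:
                    seen.add(url)
                    merged.append(url)
    return merged
```

Because each engine returns a different result set, the merged list is where the coverage gain shows up: URLs found by only one engine still make it through.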
18-Domain Bad List Filter
Automatically filters known aggregator domains (dnb.com, volza.com, alibaba.com, etc.) that appear in search results but aren't actual company websites. Critical for AI lead generation accuracy.
DNS Intelligence Suite
Full DNS analysis: A records, MX records, SPF policy, DKIM selectors, DMARC alignment, CNAME chains, NS records. Feeds into the email verification API and domain scoring.
Source Priority Scoring
Results ranked by source: event_exhibitor=100, directory=90, registry=80, search=70, web_graph=60, sitemap=60. Higher-priority sources override lower ones in the B2B sales intelligence pipeline.
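The priority table above maps directly to a lookup. In this sketch the function name and the (domain, source) tuple shape are assumptions. Unknown sources get priority 0, and ties keep the first candidate seen.

```python
from typing import Optional

# Priorities as listed in the text.
SOURCE_PRIORITY = {
    "event_exhibitor": 100,
    "directory": 90,
    "registry": 80,
    "search": 70,
    "web_graph": 60,
    "sitemap": 60,
}


def pick_domain(candidates: list[tuple[str, str]]) -> Optional[str]:
    """Return the domain whose source has the highest priority."""
    if not candidates:
        return None
    # max() returns the first maximal item, so ties favour earlier candidates.
    best = max(candidates, key=lambda c: SOURCE_PRIORITY.get(c[1], 0))
    return best[0]
```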
Confidence-Gated Progression
Only domains meeting the confidence threshold proceed to crawling. Low-confidence domains are flagged for manual review or recursive re-entry in the AI lead generation pipeline.
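The gate itself is a simple split on the score. The threshold value of 60 below is a placeholder, not a documented default, and "meeting the threshold" is read here as greater than or equal to it.

```python
def route(scored: dict[str, int], threshold: int = 60) -> tuple[list[str], list[str]]:
    """Split scored domains into (proceed to crawl, flag for review)."""
    proceed = sorted(d for d, score in scored.items() if score >= threshold)
    review = sorted(d for d, score in scored.items() if score < threshold)
    return proceed, review
```

Domains in the second list are the ones that would go to manual review or be fed back in for another discovery pass.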
Accepted Inputs
- Company names (single or CSV/Excel batch)
- Domain names or URLs
- Phone numbers
- Physical addresses or ZIP codes
- Social profile URLs (LinkedIn, Twitter)
- Apify Dataset ID or KV Store key
Configuration
- Workers: 1-20 concurrent (default 5)
- Max Results: limit output size
- Search Engines: enable/disable specific engines
- DNS Validation: toggle full DNS checks
- Google Boost: enable/disable secondary enrichment
- Bad Domain List: customizable filter
- Confidence Threshold: minimum score to pass
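The options above could be modelled as a small validated settings object. The field names and every default other than the worker count (1-20, default 5) are assumptions for illustration, and this is not the engine's actual input schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DiscoveryConfig:
    workers: int = 5                      # documented range 1-20, default 5
    max_results: Optional[int] = None     # None means no limit (assumed)
    search_engines: tuple = ("duckduckgo", "google", "bing")
    dns_validation: bool = True           # assumed default
    google_boost: bool = True             # assumed default
    # Only the bad domains named in the text; the real list has 18 entries.
    bad_domains: frozenset = frozenset({"dnb.com", "volza.com", "alibaba.com"})
    confidence_threshold: int = 60        # placeholder value

    def __post_init__(self) -> None:
        if not 1 <= self.workers <= 20:
            raise ValueError("workers must be between 1 and 20")
        if not 0 <= self.confidence_threshold <= 100:
            raise ValueError("confidence_threshold must be between 0 and 100")
        if self.max_results is not None and self.max_results < 1:
            raise ValueError("max_results must be positive")
```

Validating in `__post_init__` means an out-of-range worker count fails at construction time rather than partway through a run.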