Find the people who make decisions
4-method extraction: JSON-LD -> Team Cards -> Heuristic -> LinkedIn
Most scrapers extract every name on a page -- including navigation text, footer links, and testimonials. LeadsLogix uses four progressively deeper extraction methods with name validation and title classification to find actual decision makers, not page noise.
Why Contact Extraction Fails
Four Methods, One Unified Profile
Each method handles different page structures. Cross-page context accumulation merges contacts found across multiple pages.
Method 1: JSON-LD
Extract structured data from schema.org Person/Organization markup. Highest confidence -- the website explicitly declares who works there.
Method 2: Team Cards
Detect team card UI patterns (image + name + title + email). CSS class analysis and spatial proximity grouping.
Method 3: Heuristic Proximity
Name-title-email proximity analysis on unstructured pages. Uses position, font size, and visual grouping to associate data.
Method 4: LinkedIn X-Ray
Multi-engine search for company employees when website extraction yields insufficient results.
Cross-Page Context
Accumulate contacts across /team, /about, /contact, /leadership, /people -- 35+ page path patterns. Merge fragments into complete profiles.
Title Classification
Parse titles into seniority levels: C-suite, VP, Director, Manager, Individual Contributor. Filter by decision-maker criteria.
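As an illustration of title classification, the sketch below maps a raw job title to one of the five seniority levels using keyword patterns. The keyword lists, the function names, and the choice of which levels count as decision makers are assumptions made for this example, not the rules LeadsLogix actually ships.

```python
import re

# Illustrative keyword patterns (assumed, not the product's real rules).
# VP is checked before C-suite so "Vice President" is not caught by "president".
SENIORITY_PATTERNS = [
    ("VP", re.compile(r"\b(vp|vice president)\b", re.I)),
    ("C-suite", re.compile(r"\b(chief|ceo|cto|cfo|coo|cmo|founder|president)\b", re.I)),
    ("Director", re.compile(r"\b(director|head of)\b", re.I)),
    ("Manager", re.compile(r"\bmanager\b", re.I)),
]

# Assumed decision-maker criteria for this sketch.
DECISION_MAKER_LEVELS = {"C-suite", "VP", "Director"}


def classify_seniority(title: str) -> str:
    """Return the first seniority level whose pattern matches the title."""
    for level, pattern in SENIORITY_PATTERNS:
        if pattern.search(title):
            return level
    return "Individual Contributor"


def is_decision_maker(title: str) -> bool:
    """Filter step: keep only titles at an assumed decision-maker level."""
    return classify_seniority(title) in DECISION_MAKER_LEVELS
```

Keyword matching of this kind is easy to audit, but it will misread unusual titles, so a real classifier would likely need a longer pattern list and manual review of the misses.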
Extraction Pipeline
Stages run in sequence, and each one supports checkpoint/resume, so an interrupted run can continue from the last completed stage.
Page Discovery
Find team-related pages across 35+ URL patterns: /team, /about, /people, /leadership, /staff, /our-team, etc.
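A minimal sketch of page discovery, assuming it starts by expanding a company's base URL into candidate team-page URLs. Only the six paths named above are listed; the remaining patterns are not given here, and the function name is hypothetical.

```python
from urllib.parse import urljoin

# The six paths named in the text; the full list of 35+ patterns is not shown.
TEAM_PATHS = ["/team", "/about", "/people", "/leadership", "/staff", "/our-team"]


def candidate_team_urls(base_url, paths=TEAM_PATHS):
    """Build one candidate URL per known team-page path."""
    # Normalise to exactly one trailing slash so urljoin appends rather than replaces.
    root = base_url.rstrip("/") + "/"
    return [urljoin(root, path.lstrip("/")) for path in paths]
```

Each candidate would still need to be fetched and checked, since most sites expose only a few of these paths.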
JSON-LD Extraction
Parse schema.org Person and Organization markup. Extract name, title, email, phone from structured data.
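The following sketch shows one way to pull schema.org Person entries out of a page's JSON-LD blocks using only the Python standard library. The class and function names are hypothetical; `name`, `jobTitle`, `email`, and `telephone` are standard schema.org Person properties.

```python
import json
from html.parser import HTMLParser


class JsonLdCollector(HTMLParser):
    """Collect the raw text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_data(self, data):
        if self._in_jsonld:
            self.blocks[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False


def iter_nodes(node):
    """Yield every dict nested anywhere inside parsed JSON-LD data."""
    if isinstance(node, dict):
        yield node
        for value in node.values():
            yield from iter_nodes(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_nodes(item)


def extract_people(html):
    """Return name, title, email, and phone for each schema.org Person found."""
    collector = JsonLdCollector()
    collector.feed(html)
    people = []
    for raw in collector.blocks:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed blocks instead of failing the whole page
        for node in iter_nodes(data):
            types = node.get("@type")
            types = types if isinstance(types, list) else [types]
            if "Person" in types:
                people.append({
                    "name": node.get("name"),
                    "title": node.get("jobTitle"),
                    "email": node.get("email"),
                    "phone": node.get("telephone"),
                })
    return people
```

Walking the whole tree, rather than reading only top-level objects, catches Person entries nested under an Organization's `employee` or `founder` property.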
Team Card Detection
Identify team card UI patterns using CSS analysis. Group image + name + title + email by spatial proximity.
Heuristic Analysis
For unstructured pages: analyze text proximity, font sizing, and visual grouping to associate names with titles and emails.
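A simplified sketch of the proximity idea, assuming the page has already been reduced to text lines in reading order. It uses line distance as a stand-in for spatial proximity and ignores font size and visual grouping, which require a rendered page. The regular expressions, the `window` parameter, and the function names are illustrative assumptions.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Two or three capitalised words, e.g. "Maria Lopez" (assumed name shape).
NAME_RE = re.compile(r"^[A-Z][a-z]+(?: [A-Z][a-z'-]+){1,2}$")
TITLE_RE = re.compile(
    r"\b(chief|ceo|cto|cfo|vp|vice president|director|head of|manager)\b", re.I
)


def looks_like_name(text):
    """Name-shaped text that does not also read as a job title."""
    return bool(NAME_RE.match(text)) and not TITLE_RE.search(text)


def associate_by_proximity(lines, window=2):
    """Attach the nearest following title and email to each name-like line."""
    contacts = []
    for i, line in enumerate(lines):
        name = line.strip()
        if not looks_like_name(name):
            continue
        contact = {"name": name, "title": None, "email": None}
        for j in range(i + 1, min(i + 1 + window, len(lines))):
            nearby = lines[j].strip()
            if looks_like_name(nearby):
                break  # the next person starts here
            email = EMAIL_RE.search(nearby)
            if email and contact["email"] is None:
                contact["email"] = email.group(0)
            elif TITLE_RE.search(nearby) and contact["title"] is None:
                contact["title"] = nearby
        # Keep only names that picked up supporting evidence nearby.
        if contact["title"] or contact["email"]:
            contacts.append(contact)
    return contacts
```

Requiring a nearby title or email is what keeps heading text such as "Our Leadership" out of the results, even though it has the shape of a name.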
LinkedIn X-Ray
Query search engines with site:linkedin.com/in/ plus the company name. Match results to extracted contacts.
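For illustration, an X-ray query can be assembled as a plain string and sent to whichever search engines are in use. The helper name and the optional title filter are assumptions added for this sketch; the text above specifies only the site: operator and the company name.

```python
def build_xray_query(company, titles=()):
    """Build a site-restricted search query for a company's LinkedIn profiles."""
    # Quote the company name so multi-word names are matched as a phrase.
    query = f'site:linkedin.com/in/ "{company}"'
    if titles:
        # Optional (assumed) narrowing to specific job titles.
        query += " (" + " OR ".join(f'"{t}"' for t in titles) + ")"
    return query
```

For example, `build_xray_query("Acme Corp", ["CEO", "CTO"])` produces `site:linkedin.com/in/ "Acme Corp" ("CEO" OR "CTO")`.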
Cross-Page Merge
Combine contacts found across multiple pages. Match by name similarity and email domain.
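One way to sketch the merge step is with the standard library's `difflib` for name similarity plus an email-domain check. The 0.85 threshold, the dictionary shape, and the function names are assumptions for this example.

```python
from difflib import SequenceMatcher


def email_domain(email):
    """Return the lower-cased domain of an email address, or None."""
    if email and "@" in email:
        return email.rsplit("@", 1)[-1].lower()
    return None


def same_person(a, b, threshold=0.85):
    """Names must be similar, and email domains must not conflict."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    if ratio < threshold:
        return False
    domain_a = email_domain(a.get("email"))
    domain_b = email_domain(b.get("email"))
    return domain_a is None or domain_b is None or domain_a == domain_b


def merge_contacts(contacts):
    """Fold fragments of the same person into one profile, filling empty fields."""
    merged = []
    for contact in contacts:
        for existing in merged:
            if same_person(existing, contact):
                for key, value in contact.items():
                    if not existing.get(key):
                        existing[key] = value
                break
        else:
            merged.append(dict(contact))  # copy so the input list is not mutated
    return merged
```

With this approach, a name and title found on /team and an email found on /contact end up in a single record, while two people with similar names at different email domains stay separate.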
Name Validation
Filter out navigation text, UI elements, placeholder names, and non-person entries.
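A rough sketch of name validation, assuming a candidate is accepted only if it is shaped like a person's name and contains no common UI words or placeholder names. The word lists are short illustrative samples, and the function names are hypothetical.

```python
# Illustrative samples only; a production list would be much longer.
UI_TERMS = {"home", "about", "contact", "menu", "login", "read", "more",
            "learn", "privacy", "policy", "team", "our", "us"}
PLACEHOLDERS = {"john doe", "lorem ipsum", "first last", "your name"}


def _looks_like_word(word):
    """Capitalised, and made only of letters plus apostrophe, period, or hyphen."""
    return word[0].isupper() and all(ch.isalpha() or ch in "'.-" for ch in word)


def is_valid_person_name(text):
    """Reject navigation text, UI labels, placeholders, and non-name strings."""
    words = text.split()
    if not 2 <= len(words) <= 4:
        return False
    if not all(_looks_like_word(word) for word in words):
        return False
    if " ".join(words).lower() in PLACEHOLDERS:
        return False
    # Any overlap with UI vocabulary disqualifies the candidate.
    return not ({word.lower() for word in words} & UI_TERMS)
```

Using `str.isalpha` and `str.isupper` instead of an ASCII-only pattern keeps names with accented letters valid.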
14-Rule Cleanup
Apply junk removal rules: remove social handles, generic emails, company-name-as-person, low-confidence entries.
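The sketch below models the four rules named above as independent predicates applied in sequence. The other ten rules are not described here and are not guessed at. The contact shape, the 0.5 confidence threshold, the list of generic mailbox names, and the decision to drop the whole entry when its email is generic are all assumptions for this example.

```python
# Assumed list of generic mailbox names.
GENERIC_LOCAL_PARTS = {"info", "sales", "support", "contact", "hello", "admin", "office"}


def is_social_handle(contact, company):
    return contact["name"].strip().startswith("@")


def has_generic_email(contact, company):
    email = contact.get("email") or ""
    return email.split("@", 1)[0].lower() in GENERIC_LOCAL_PARTS


def is_company_name(contact, company):
    return contact["name"].strip().lower() == company.strip().lower()


def is_low_confidence(contact, company, threshold=0.5):
    # Entries without a score are treated as low confidence.
    return contact.get("confidence", 0.0) < threshold


RULES = [is_social_handle, has_generic_email, is_company_name, is_low_confidence]


def cleanup_contacts(contacts, company):
    """Keep only the contacts that trigger none of the junk rules."""
    return [c for c in contacts if not any(rule(c, company) for rule in RULES)]
```

Keeping each rule as its own small function makes it straightforward to add, remove, or test rules individually.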
Technical Workflow
# Deep extraction with Playwright (BEST quality, <50 companies)
# Uses the /lead-scraper skill for maximum contact quality

# Enrichment pipeline extraction (scalable)
python -m tools.enrichment.pipeline --input companies.csv

# Standalone contact extraction
python -m tools.contact_extractor --url https://acme.com/team

# Mandatory cleanup after extraction
python tools/cleanup_contacts.py

# Extraction method priority:
# 1. JSON-LD (highest confidence)
# 2. Team card detection
# 3. Heuristic proximity
# 4. LinkedIn X-ray (fallback)
API Access
/api/v1/contacts/extract
Extract decision makers from a company URL. Returns contacts with extraction method attribution.
/api/v1/contacts/batch
Batch extraction for up to 200 company URLs.
/api/v1/contacts/{companyId}
Retrieve all extracted contacts for a company with seniority classification.
/api/v1/contacts/cleanup
Apply 14-rule junk removal to a contact list.
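Because the batch endpoint accepts at most 200 company URLs per call, a longer list has to be split before submission. The helper below is a client-side sketch with a hypothetical name. It prepares the batches only and makes no assumptions about the request format.

```python
BATCH_LIMIT = 200  # maximum company URLs per batch call, as stated above


def chunk_urls(urls, size=BATCH_LIMIT):
    """Split a list of company URLs into consecutive batches of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [urls[i:i + size] for i in range(0, len(urls), size)]
```

A list of 450 URLs, for instance, becomes three batches of 200, 200, and 50.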
Use Cases
Account-Based Sales
Find decision makers at target accounts before outreach: CEO, CTO, VP Sales, Head of Procurement.
Event Preparation
Extract leadership teams from exhibitor websites before attending trade shows.
Competitive Intelligence
Monitor competitor team pages for leadership changes, new hires, and departures.
M&A Research
Extract complete management teams from acquisition target company websites.
Recruitment
Find hiring managers and department heads at target companies for recruiting outreach.
Investor Research
Extract founder and executive teams from portfolio company and deal flow websites.
Industry Applications
Technology
JavaScript-rendered team pages are handled by the browser layer together with JSON-LD extraction.
Professional Services
Partner/principal directories with structured team card layouts.
Manufacturing
Leadership pages with heuristic extraction for unstructured HTML.
E-Commerce
Vendor contact pages and wholesale buyer team directories.
Performance Metrics
Platform Preview
See how LeadsLogix processes, verifies, and delivers your leads in real time.
Contact Cards
Extracted decision makers with name, title, email, phone, LinkedIn, and confidence score.
Extraction Method Attribution
See which method found each contact: JSON-LD, team card, heuristic, or LinkedIn.
Seniority Distribution
Breakdown of extracted contacts by seniority: C-suite, VP, Director, Manager, IC.
Integrations