Firecrawl

File: src/enrichment/sources/firecrawl.ts. LLM-based alternative to Crawlee Scraper. Activated with USE_FIRECRAWL=true.

Source: docs/SYSTEM_OVERVIEW.md § Firecrawl and docs/FIRECRAWL_MIGRATION.md.

How it works

  1. firecrawl.map() finds contact pages on the domain.
  2. firecrawl.scrape() extracts structured JSON via a Zod schema.
  3. Extracted contacts pass through isValidPersonName() (see Name Validation), which guards against LLM hallucinations.
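
The three steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in firecrawl.ts: CrawlClient is a hypothetical minimal interface rather than the Firecrawl SDK's actual typings, the Zod schema is omitted (scrape is assumed to return already-parsed contacts), and looksLikePersonName is a simplistic stand-in for isValidPersonName().

```typescript
// Hypothetical contact shape; the real schema is defined with Zod in the source file.
interface Contact {
  name: string;
  title?: string;
  email?: string;
}

// Minimal client surface assumed for this sketch (NOT the real SDK signatures).
interface CrawlClient {
  map(domain: string): Promise<string[]>;   // step 1: candidate contact-page URLs
  scrape(url: string): Promise<Contact[]>;  // step 2: structured contacts from one page
}

// Simplistic stand-in for isValidPersonName(): 2 to 4 alphabetic tokens.
// The real rules live in Name Validation and are likely stricter.
function looksLikePersonName(name: string): boolean {
  const parts = name.trim().split(/\s+/);
  if (parts.length < 2 || parts.length > 4) return false;
  return parts.every((p) => /^[A-Za-z][A-Za-z'.-]*$/.test(p));
}

// map -> scrape -> validate, keeping only contacts whose name passes the check.
async function enrichContacts(client: CrawlClient, domain: string): Promise<Contact[]> {
  const pages = await client.map(domain);
  const contacts: Contact[] = [];
  for (const url of pages) {
    const extracted = await client.scrape(url);
    for (const c of extracted) {
      if (looksLikePersonName(c.name)) contacts.push(c); // step 3: drop implausible names
    }
  }
  return contacts;
}
```

Passing the client in as a parameter keeps the flow testable with a stub, so the validation step can be exercised without spending credits.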

Status

  • Phase 1 done: scaffold + feature flag + 18 unit tests passing.
  • Phase 2 not done: A/B comparison against Crawlee on real companies.

Warning

Do not switch the default to Firecrawl until Phase 2 has run and Firecrawl matches or beats Crawlee on the same test set.

Cost

~7 credits per company:

  • 1 credit for map
  • 5 credits per contact page (averaged)

On the Standard plan ($83/month, 100K credits) this works out to ≈ 14,000 companies/month (100,000 / 7 ≈ 14,285).
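
The arithmetic behind these figures can be checked with a small sketch. The function names are hypothetical and exist only for this illustration. Note that one map call plus one contact page comes to 6 credits at the listed rates, so the ~7 average presumably reflects some companies having more than one contact page; that reading is an inference, not something the source documents state.

```typescript
// Per-call rates from the cost list; helper names are illustrative only.
const MAP_CREDITS = 1;
const SCRAPE_CREDITS_PER_PAGE = 5;

// Credits for one company, given how many contact pages get scraped.
function creditsPerCompany(contactPages: number): number {
  return MAP_CREDITS + SCRAPE_CREDITS_PER_PAGE * contactPages;
}

// Whole companies that fit in a monthly credit budget.
function companiesPerMonth(monthlyCredits: number, avgCreditsPerCompany: number): number {
  return Math.floor(monthlyCredits / avgCreditsPerCompany);
}

// Dollar cost per company at a given plan price.
function dollarsPerCompany(
  planPrice: number,
  monthlyCredits: number,
  avgCreditsPerCompany: number
): number {
  return (planPrice / monthlyCredits) * avgCreditsPerCompany;
}

console.log(creditsPerCompany(1));          // 6
console.log(companiesPerMonth(100000, 7));  // 14285
```

At the plan price quoted here, dollarsPerCompany(83, 100000, 7) comes to a little under $0.006 per company.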

Known gap

tech_stack always returns []. Firecrawl JSON extraction does not expose HTTP response headers, so server/framework fingerprinting is not possible on this path.

Env

Set FIRECRAWL_API_KEY=fc-... and USE_FIRECRAWL=true.
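
A minimal sketch of how these two variables might gate the scraper choice. chooseScraper and ScraperChoice are hypothetical names, and failing fast on a missing key is an assumption about desirable behavior rather than a description of the actual code. Crawlee is treated as the default, in line with the warning above.

```typescript
// Hypothetical result type: which scraper to use, plus the key when Firecrawl is selected.
type ScraperChoice =
  | { source: "crawlee" }
  | { source: "firecrawl"; apiKey: string };

// Takes the environment as a plain object so it can be exercised without real env vars.
function chooseScraper(env: Record<string, string | undefined>): ScraperChoice {
  // Anything other than the exact string "true" keeps the Crawlee default.
  if (env.USE_FIRECRAWL !== "true") {
    return { source: "crawlee" };
  }
  const apiKey = env.FIRECRAWL_API_KEY;
  if (!apiKey) {
    // Assumption: surface the misconfiguration at startup instead of at the first API call.
    throw new Error("USE_FIRECRAWL=true but FIRECRAWL_API_KEY is not set");
  }
  return { source: "firecrawl", apiKey };
}
```

In a Node process this would be called as chooseScraper(process.env).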

See also

Crawlee Scraper, Name Validation, EnrichV7, Local Development.