Domain Blocklist

More than 140 domains are listed in src/enrichment/config.ts (under INVALID_DOMAINS). Domain Discovery never accepts them, regardless of score.

Source: docs/SYSTEM_OVERVIEW.md § Domain discovery → Domain blocklist.

Categories

  • Swedish company directories: allabolag.se, ratsit.se, proff.se, eniro.se, hitta.se
  • B2B data vendors: rocketreach.co, apollo.io, zoominfo.com, lusha.com
  • Social platforms: facebook.com, linkedin.com, instagram.com, etc.
  • Generic hosts: parking pages, CDN landings, registrar placeholders
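
The sketch below shows how a blocklist like this could override scoring. It is an illustration under stated assumptions, not the code in src/enrichment/config.ts: only the name INVALID_DOMAINS comes from this page, while the Set representation, the Candidate type, the helper functions normalizeDomain, isBlockedDomain, and pickBestCandidate, and the choice to also block subdomains are all hypothetical.

```typescript
// Hypothetical excerpt: the real INVALID_DOMAINS holds 140+ entries
// and may be structured differently.
const INVALID_DOMAINS: ReadonlySet<string> = new Set([
  "allabolag.se", "ratsit.se", "proff.se", "eniro.se", "hitta.se", // directories
  "rocketreach.co", "apollo.io", "zoominfo.com", "lusha.com",      // B2B data vendors
  "facebook.com", "linkedin.com", "instagram.com",                 // social platforms
]);

// Hypothetical shape of a discovery candidate.
interface Candidate {
  domain: string;
  score: number;
}

// Lowercase, drop a trailing dot, and strip a leading "www.".
function normalizeDomain(domain: string): string {
  return domain.trim().toLowerCase().replace(/\.$/, "").replace(/^www\./, "");
}

// True if the domain itself or any parent domain is on the blocklist.
// Matching subdomains is an assumption of this sketch.
function isBlockedDomain(domain: string): boolean {
  const labels = normalizeDomain(domain).split(".");
  // Stop before the bare TLD so "se" or "com" alone is never looked up.
  for (let i = 0; i < labels.length - 1; i++) {
    if (INVALID_DOMAINS.has(labels.slice(i).join("."))) {
      return true;
    }
  }
  return false;
}

// Blocked candidates are removed before ranking, so a high score cannot rescue them.
function pickBestCandidate(candidates: Candidate[]): Candidate | undefined {
  return candidates
    .filter((c) => !isBlockedDomain(c.domain))
    .sort((a, b) => b.score - a.score)[0];
}
```

With this sketch, a candidate for www.allabolag.se scored 0.99 is dropped, and a lower-scoring candidate for a placeholder domain such as example.se is returned instead. Filtering before sorting keeps the "regardless of score" rule independent of how scores are computed.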

Why directories are blocked

The terms of service of allabolag.se, ratsit.se, and proff.se explicitly prohibit commercial data extraction. Treating these sites as the canonical company website would also degrade extraction quality, because their pages list many companies, not one.

Maintenance

The list is hand-curated. Additions come from autoresearch false-positive analysis: when a wrong-domain match shows up in an experiment, the domain is added here. Full file: src/enrichment/config.ts (~790 lines, which also include the name blocklists).

See also

Domain Discovery, Name Validation, Known Issues.
