From autoresearch/results/ and docs/SYSTEM_OVERVIEW.md § Results. 29 rounds, 145 companies. Composite formula in Autoresearch Loop.

Key runs

Tag                  | Composite | Extraction | FPR   | Avg time | Notes
---------------------|-----------|------------|-------|----------|------
baseline             | 46.7      | 60%        | 37.5% |          | Pre-JSON-LD, no Places, generous name validation
jsonld-extraction    | 65.4      | 80%        | 6.3%  |          | Round 1; +18.7 over baseline. Largest single gain.
google-places-v2     | 76.1      | 80%        | 23.4% | 24.7s    | Direct Places API; +29.4 vs baseline, +10.7 vs prior best
jsonld-v2            | 81.6      | 90%        | 21.2% | 19.4s    | All-time best. Committed pre-FPR-gate (round < 15).
extraction-v7 (prod) | 63.7      | 50%        | 21.4% | 43.8s    | Current production config
quick-test (1 co.)   | 93.8      | 100%       | 0%    | 26.6s    | Single well-structured site (More PR AB), not representative
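
The deltas quoted in the Notes column can be rechecked from the composite scores alone. The snippet below is a minimal sketch of that arithmetic. The dictionary and the `delta` helper are illustrative names, not part of the autoresearch code, and the composite formula itself (see Autoresearch Loop) is not reproduced here.

```python
# Composite scores copied from the Key runs table.
composites = {
    "baseline": 46.7,
    "jsonld-extraction": 65.4,
    "google-places-v2": 76.1,
    "jsonld-v2": 81.6,
    "extraction-v7 (prod)": 63.7,
}


def delta(tag: str, reference: str) -> float:
    """Composite difference between two runs, rounded to one decimal."""
    return round(composites[tag] - composites[reference], 1)


print(delta("jsonld-extraction", "baseline"))          # 18.7
print(delta("google-places-v2", "baseline"))           # 29.4
print(delta("google-places-v2", "jsonld-extraction"))  # 10.7
print(delta("extraction-v7 (prod)", "jsonld-v2"))      # -17.9
```

The last line is the gap discussed in the next section: production sits 17.9 composite points below jsonld-v2.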

Why production config (63.7) lags jsonld-v2 (81.6)

Source: SYSTEM_OVERVIEW.md § Results.

  1. The 10-company extraction-v7 test set includes two timeouts (konsultopia.se, frimedia.se) and one parked domain, which pulls the extraction rate down to 50%.
  2. FPR is still 21.4% because UI phrases slip through Name Validation. The fix is batchable (add the phrases to Blocklists) but has not shipped yet.

Full results-file index

Every *.json file under autoresearch/results/, with one-line provenance, in chronological order. Columns: n (companies in the run), Extraction (%), FPR (%), Total contacts (total_contacts). A composite appears in the Note only where it was recomputed per the Autoresearch Loop formula; otherwise it is left out.

File                   | Date             | n  | Extraction | FPR   | Total contacts | Note
-----------------------|------------------|----|------------|-------|----------------|-----
google-places-v2.json  | 2026-04-02 15:51 | 10 | 80%        | 23.4% | 47             | Places API direct; +29.4 over baseline
jsonld-v2.json         | 2026-04-02 15:59 | 10 | 90%        | 21.2% | 52             | All-time best composite (81.6)
extraction-v7.json     | 2026-04-02 18:58 | 10 | 50%        | 21.4% | 42             | Production config snapshot
active-companies.json  | 2026-04-02 20:05 | 10 | 80%        | 0%    | 67             | Live DB sample; FPR clean
db-companies.json      | 2026-04-02 20:26 | 10 | 50%        | 4.2%  | 24             | Smaller DB-sourced sample
stockholm-ab-v3.json   | 2026-04-02 20:46 | 10 | 60%        | 0%    | 51             | Stockholm AB cohort
email-association.json | 2026-04-02 20:56 | 10 | 60%        | 0%    | 45             | Email→name pairing experiment
final-clean.json       | 2026-04-02 21:03 | 10 | 70%        | 0%    | 52             | Post-cleanup snapshot
uppsala-ab-v2.json     | 2026-04-02 21:26 | 10 | 60%        | 2.2%  | 92             | Uppsala AB cohort; highest raw contact count
current-test.json      | 2026-04-06 02:43 | 5  | 0%         | 0%    | 0              | Smoke run, all sources empty; likely env/key misconfig
current-test-3.json    | 2026-04-06 02:46 | 3  | 67%        | 0%    | 3              | Re-run, partial recovery
fixed-test.json        | 2026-04-06 02:49 | 3  | 100%       | 3.3%  | 61             | Fix verified; matches quick-test quality on a 3-co set
quick-test.json        | 2026-04-06 03:18 | 1  | 100%       | 0%    | 21             | Single company (More PR AB), composite 93.8
latest.json            | 2026-04-06 03:18 | 1  |            |       |                | Symlink-style copy of quick-test.json
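
One way to confirm that latest.json really mirrors quick-test.json is to compare file hashes. This is a sketch under the assumption that the copy is byte-identical; if the file is re-serialized on copy, the parsed JSON would have to be compared instead. The helper names `sha256_of` and `is_copy` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def is_copy(results_dir: Path,
            alias: str = "latest.json",
            target: str = "quick-test.json") -> bool:
    """True if the alias file is byte-identical to the target file."""
    return sha256_of(results_dir / alias) == sha256_of(results_dir / target)
```

Called as `is_copy(Path("autoresearch/results"))`, it returns False as soon as a newer run overwrites either file without updating the other.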

New untracked artefacts (from 2026-04-06)

  • current-test.json, current-test-3.json, fixed-test.json, quick-test.json — debugging session for the source-execution bug surfaced in early April. The arc reads: current-test (0% extraction, broken) → current-test-3 (67%, partial) → fixed-test (100% extraction, 3.3% FPR) → quick-test (100%, 0% FPR on 1 co.).
  • continuous-history.jsonl — append-only log written by the new continuous loop (see Autoresearch Loop). Each line is one company test result. Schema: { org_nr, name, city, domain, domain_time_ms, crawlee:{contacts,emails,phones,time_ms}, firecrawl:{...}, maps:{...}, best_source, total_contacts, timestamp }.
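
Because continuous-history.jsonl holds one JSON object per line, it can be summarized with the standard library alone. The sketch below assumes that total_contacts is an integer and best_source is a string, which the schema above suggests but does not state. The function names are illustrative, and the file path is passed in because its location is not given here.

```python
import json
import sys
from collections import Counter
from typing import Iterable


def read_history(lines: Iterable[str]) -> list[dict]:
    """Parse JSONL lines into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def summarize(records: list[dict]) -> dict:
    """Count tested and productive companies, and which source won."""
    productive = [r for r in records if r.get("total_contacts", 0) >= 1]
    return {
        "companies": len(records),
        "productive": len(productive),
        # Assumed: best_source names the winning source for the company.
        "best_source": dict(Counter(r.get("best_source") for r in productive)),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_history.py <path to continuous-history.jsonl>
    with open(sys.argv[1], encoding="utf-8") as fh:
        print(summarize(read_history(fh)))
```

Only top-level fields are read, so the elided firecrawl and maps sub-objects do not need to be known for this summary.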

29-round aggregate

  • 145 unique companies tested
  • 78 (54%) produced ≥ 1 contact
  • 450+ contacts total
  • Best single company: 32 contacts (inviatech AB)
  • 100% domain accuracy among the 78 productive runs
  • 82/82 unit tests passing
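
The productive and unproductive shares follow directly from the two counts in this list. A short check of that arithmetic, with `share` as an illustrative helper:

```python
tested = 145      # unique companies tested
productive = 78   # companies that produced at least one contact


def share(part: int, whole: int) -> str:
    """Format part/whole as a whole-number percentage."""
    return f"{part / whole:.0%}"


unproductive = tested - productive
print(share(productive, tested))                  # 54%
print(unproductive, share(unproductive, tested))  # 67 46%
```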

Why 46% produce nothing

Causes: parked domains, broken SSL, timeouts, and sites with no person-level contact data. This is a real ceiling: no scraper improvement breaks past it, and the fallback data sources are blocked by ToS or IP restrictions (see Known Issues).

See also

Autoresearch Loop, Autoresearch Result Types, JSON-LD Extraction, Google Places, Name Validation, Known Issues.