Autoresearch Loop

An autonomous experiment system that lives in autoresearch/, inspired by karpathy/autoresearch. Instead of minimising training loss, it maximises contact-extraction quality.

Sources: docs/SYSTEM_OVERVIEW.md (§ The autoresearch loop) and autoresearch/program.md.

The loop

read latest results
→ pick one change (from a backlog or freeform)
→ edit allowed files
→ run: USE_CRAWLEE=true bun autoresearch/experiment.ts --tag <name>
→ compare composite score to previous
→ if score increased ≥ 2 points AND false_positive_rate = 0%: git commit, continue
→ otherwise: revert, try something else
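The accept/revert step at the end of the loop can be sketched as a small pure function. This is only an illustration of the rule as stated above: the RunSummary shape and the decide name are hypothetical and are not taken from the repository, and the false-positive rate is assumed to be stored as a fraction between 0 and 1.

```typescript
// Hypothetical summary shape; the real result type in the repo may differ.
interface RunSummary {
  compositeScore: number;    // composite score, 0 to 100
  falsePositiveRate: number; // assumed to be a fraction, 0 to 1
}

type Decision = "commit" | "revert";

// Keep a change only if it gains at least `minGain` composite points
// AND the new run has no false positives at all.
function decide(previous: RunSummary, current: RunSummary, minGain = 2): Decision {
  const gain = current.compositeScore - previous.compositeScore;
  const clean = current.falsePositiveRate === 0;
  return gain >= minGain && clean ? "commit" : "revert";
}
```

Note that both conditions must hold: a large score gain with any false positives still ends in a revert.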

Composite score formula

score = extraction_rate            × 40
      + min(avg_contacts / 5, 1)   × 20
      + (1 − false_positive_rate)  × 20
      + email_coverage             × 10
      + phone_coverage             × 10
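As a worked illustration, the formula translates directly into code. The sketch below assumes every rate and coverage value is a fraction between 0 and 1 (which the 1 − false_positive_rate term suggests); the Metrics interface and its field names are hypothetical and may not match metrics.ts.

```typescript
// Hypothetical input shape; field names are illustrative only.
interface Metrics {
  extractionRate: number;    // fraction of companies with at least one contact
  avgContacts: number;       // average contacts per company
  falsePositiveRate: number; // fraction, 0 to 1
  emailCoverage: number;     // fraction, 0 to 1
  phoneCoverage: number;     // fraction, 0 to 1
}

// Weights are 40 + 20 + 20 + 10 + 10, so a perfect run scores 100.
function compositeScore(m: Metrics): number {
  return (
    m.extractionRate * 40 +
    Math.min(m.avgContacts / 5, 1) * 20 + // saturates at 5 contacts per company
    (1 - m.falsePositiveRate) * 20 +
    m.emailCoverage * 10 +
    m.phoneCoverage * 10
  );
}

// Made-up inputs: 20 + 10 + 20 + 5 + 5 = 60
const example = compositeScore({
  extractionRate: 0.5,
  avgContacts: 2.5,
  falsePositiveRate: 0,
  emailCoverage: 0.5,
  phoneCoverage: 0.5,
});
```

Because the avg_contacts term is capped, finding more than five contacts per company does not raise the score any further, while a run with zero false positives already earns 20 points even if it extracts nothing.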

Allowed file scope (agent)

Can modify:

crawlee.ts, maps.ts, nameUtils.ts, emailUtils.ts, phoneUtils.ts, config.ts, domain.ts, companies.json

Cannot touch:

pipeline.ts, experiment.ts, metrics.ts, types.ts
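One way to enforce this scope mechanically is to check the list of changed files before committing. The following is a minimal sketch, not something the repository is known to contain: the outOfScope helper is hypothetical, and matching on the file name alone is an assumption, since the lists above give names without directories.

```typescript
// File names copied from the "Can modify" list above.
const ALLOWED = new Set([
  "crawlee.ts", "maps.ts", "nameUtils.ts", "emailUtils.ts",
  "phoneUtils.ts", "config.ts", "domain.ts", "companies.json",
]);

// Last path segment, e.g. "autoresearch/metrics.ts" -> "metrics.ts".
function baseName(path: string): string {
  return path.split("/").pop() ?? path;
}

// Returns every changed path the agent is NOT allowed to modify.
// Anything outside the allow-list counts, including the four protected files.
function outOfScope(changedPaths: string[]): string[] {
  return changedPaths.filter((p) => !ALLOWED.has(baseName(p)));
}
```

An empty result means the change set stays within scope; the paths passed in would typically come from something like the output of git diff --name-only.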

FPR rule

Warning

Decision rule: FPR > 0% triggers an immediate revert. Several early experiments, including jsonld-v2 (the current best, FPR 21.2%), ran before the FPR gate was tightened and were committed despite a non-zero FPR. The rule has been strictly enforced since round 15.

Progress

29 rounds completed
145 unique companies tested
78 (54%) produced ≥ 1 contact
450+ total contacts extracted
82/82 unit tests passing

Biggest single gain: JSON-LD extraction in round 1 (+18.7 composite points vs. baseline).

Three runner variants

autoresearch/ now ships three loop entry points. They share the metrics module (metrics.ts) but differ in how they pick companies and report progress.

`experiment.ts` — original tagged runner

The canonical runner, and the one used in the loop description above.

Reads companies from autoresearch/companies.json (a fixed test set).
Runs once, writes results/<tag>.json and appends one line to history.jsonl.
Single-source: Crawlee only (USE_CRAWLEE=true required).
This is the runner the FPR rule and the 29-round progression are scored against.
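Since each run appends one line to history.jsonl, the "compare composite score to previous" step amounts to reading the last line of that file. The sketch below shows the idea on a string; the HistoryEntry fields (tag, score) are assumptions made for illustration, and the actual line schema may differ.

```typescript
// Hypothetical line shape; the real history.jsonl schema is not shown here.
interface HistoryEntry {
  tag: string;
  score: number;
}

// Parse JSON Lines text: one JSON object per non-empty line.
function parseJsonl<T>(text: string): T[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as T);
}

// The most recent run is the last line; undefined if the file is empty.
function latestEntry(text: string): HistoryEntry | undefined {
  const entries = parseJsonl<HistoryEntry>(text);
  return entries[entries.length - 1];
}

// Made-up sample content, not real results.
const sample = '{"tag":"example-a","score":40}\n{"tag":"example-b","score":43.5}\n';
const last = latestEntry(sample);
```

In a Bun script the text could be obtained with await Bun.file(path).text() before being passed to latestEntry.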

`loop-v2.ts` — production-style multi-source

Untracked file added 2026-04-06 (~498 lines). Tests the current production pipeline rather than an isolated experiment.

Picks active AB companies straight from the database, skipping any that were cached recently.
Performs live domain discovery (no pre-known domains in companies.json).
Runs all three sources head-to-head: Crawlee, Firecrawl, Google Places.
CLI: bun autoresearch/loop-v2.ts [--companies N] [--source crawlee|firecrawl|maps] [--compare].
Output: an in-process metrics summary; it does not write a tagged JSON file.
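To make the three flags concrete, here is one way they could be parsed into a typed options object. This is a hand-written sketch and not the parser in loop-v2.ts: LoopOptions and parseLoopArgs are hypothetical names, and since the defaults used when a flag is omitted are not documented here, absent values are simply left undefined.

```typescript
type Source = "crawlee" | "firecrawl" | "maps";

// Hypothetical options shape mirroring the documented flags.
interface LoopOptions {
  companies?: number; // --companies N
  source?: Source;    // --source crawlee|firecrawl|maps
  compare: boolean;   // --compare
}

const SOURCES: readonly Source[] = ["crawlee", "firecrawl", "maps"];

function parseLoopArgs(argv: string[]): LoopOptions {
  const opts: LoopOptions = { compare: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--compare") {
      opts.compare = true;
    } else if (arg === "--companies") {
      const n = Number(argv[++i]); // a missing value becomes NaN and is rejected
      if (!Number.isInteger(n) || n <= 0) {
        throw new Error("--companies expects a positive integer");
      }
      opts.companies = n;
    } else if (arg === "--source") {
      const value = argv[++i];
      const match = SOURCES.find((s) => s === value);
      if (!match) throw new Error(`--source expects one of ${SOURCES.join(", ")}`);
      opts.source = match;
    } else {
      throw new Error(`unknown argument: ${arg}`);
    }
  }
  return opts;
}
```

In a script this would typically be called as parseLoopArgs(process.argv.slice(2)).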

Use this when you want to know how the live pipeline performs on real DB rows, not when you want a clean before/after diff for the loop.

`loop-continuous.ts` — never-stops monitor

Untracked file added 2026-04-06 (~469 lines). A long-running variant of loop-v2.ts.

Runs indefinitely until SIGINT; rotates through DB companies.
Live dashboard prints running stats and tracks the best-performing configuration over the session.
Per-company results appended to results/continuous-history.jsonl (one line per company; see Autoresearch Result Types for the schema).
Convenience wrapper: autoresearch/run-loop.sh (untracked, 510 B).

Operational, not experimental: leave it running to gather drift data, not to score a single change.
