JSON-LD Extraction

Strategy 6 of the Crawlee Scraper. Parses <script type="application/ld+json"> blocks for schema.org/Person and schema.org/Organization > contactPoint.

Source: docs/SYSTEM_OVERVIEW.md § The scraper (Crawlee) → JSON-LD structured data.

Code

if (json['@type'] === 'Person') {
  contacts.push({
    full_name: json.name,
    role:      json.jobTitle ?? null,
    email:     json.email ?? null,
    phone:     json.telephone ?? null,
  });
}

File: src/enrichment/sources/crawlee.ts.

Impact

Single largest improvement found via Autoresearch Loop: +18.7 composite score points vs baseline (round 1, jsonld-extraction tag, 65.4 vs baseline 46.7). FPR also dropped to 6.3% from 37.5%.

A later iteration jsonld-v2 reached 81.6 composite — the all-time best (Experiment Results).

Why it works

Structured data is author-curated for search engines, so contacts are already labeled (@type: Person, jobTitle, email). No name-vs-UI-noise problem.

Caveat

Names from JSON-LD still flow through Name Validation — JSON-LD authors occasionally set fake/placeholder Person blocks.

See also

Crawlee Scraper, Name Validation, Experiment Results, Autoresearch Loop.