JSON-LD Extraction
Strategy 6 of the Crawlee Scraper. It parses <script type="application/ld+json"> blocks for schema.org/Person entities and for the contactPoint entries of schema.org/Organization entities.
Source: docs/SYSTEM_OVERVIEW.md § The scraper (Crawlee) → JSON-LD structured data.
Code
if (json['@type'] === 'Person') {
  contacts.push({
    full_name: json.name,
    role: json.jobTitle ?? null,
    email: json.email ?? null,
    phone: json.telephone ?? null,
  });
}
File: src/enrichment/sources/crawlee.ts.
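The snippet above covers only the Person branch. As a rough illustration of the whole strategy, the following is a minimal, dependency-free sketch and not the code in crawlee.ts. The names Contact, collect, and extractJsonLdContacts are hypothetical, as are the choices to locate blocks with a regular expression (a real crawler would more likely use its HTML parser), to map contactType to role, and to allow full_name to be null for contactPoint entries. It also handles two things that are common in JSON-LD found in the wild: "@type" given as an array, and entities nested under "@graph".

```typescript
// Hypothetical contact shape, mirroring the fields pushed in the snippet above.
// full_name is nullable here because a contactPoint may carry no name (assumption).
interface Contact {
  full_name: string | null;
  role: string | null;
  email: string | null;
  phone: string | null;
}

type JsonLdNode = Record<string, unknown>;

// Return a trimmed, non-empty string or null; JSON-LD values are not always strings.
function str(value: unknown): string | null {
  return typeof value === "string" && value.trim() !== "" ? value.trim() : null;
}

// "@type" may be a single string or an array of strings.
function hasType(node: JsonLdNode, type: string): boolean {
  const t = node["@type"];
  return Array.isArray(t) ? t.includes(type) : t === type;
}

// Normalize a value that may be absent, a single object, or an array of objects.
function asNodes(value: unknown): JsonLdNode[] {
  const items = Array.isArray(value) ? value : [value];
  return items.filter(
    (v): v is JsonLdNode => typeof v === "object" && v !== null && !Array.isArray(v),
  );
}

function collect(node: JsonLdNode, contacts: Contact[]): void {
  // Entities are often wrapped in a top-level "@graph" array.
  for (const child of asNodes(node["@graph"])) collect(child, contacts);

  if (hasType(node, "Person")) {
    const name = str(node["name"]);
    if (name !== null) {
      contacts.push({
        full_name: name,
        role: str(node["jobTitle"]),
        email: str(node["email"]),
        phone: str(node["telephone"]),
      });
    }
  }

  if (hasType(node, "Organization")) {
    for (const cp of asNodes(node["contactPoint"])) {
      const email = str(cp["email"]);
      const phone = str(cp["telephone"]);
      // A contact point with no way to reach it is not useful; skip it.
      if (email === null && phone === null) continue;
      contacts.push({
        full_name: str(cp["name"]),
        role: str(cp["contactType"]), // assumption: contactType stands in for role
        email,
        phone,
      });
    }
  }
}

function extractJsonLdContacts(html: string): Contact[] {
  const contacts: Contact[] = [];
  const re =
    /<script\b[^>]*type\s*=\s*["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = re.exec(html)) !== null) {
    let parsed: unknown;
    try {
      parsed = JSON.parse(match[1] ?? "");
    } catch {
      continue; // malformed JSON-LD blocks are skipped rather than failing the page
    }
    for (const node of asNodes(parsed)) collect(node, contacts);
  }
  return contacts;
}
```

Calling extractJsonLdContacts(html) on a page's raw HTML returns a flat list of contacts. In this sketch, Person entries without a name are dropped before they reach any later validation step.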
Impact
The single largest improvement found via the Autoresearch Loop: +18.7 composite score points over baseline (round 1, tag jsonld-extraction: 65.4 vs. baseline 46.7). FPR also dropped from 37.5% to 6.3%.
A later iteration, jsonld-v2, reached a composite score of 81.6, the all-time best (see Experiment Results).
Why it works
Structured data is author-curated for search engines, so contacts arrive already labeled (@type: Person, jobTitle, email). The scraper does not have to separate names from UI noise.
Caveat
Names from JSON-LD still flow through Name Validation, because JSON-LD authors occasionally publish fake or placeholder Person blocks.
See also
Crawlee Scraper, Name Validation, Experiment Results, Autoresearch Loop.