Canonical mermaid source for the Scrape → Enrich → Update pipeline. Inlined into Dashboard §Pipeline at a glance. Edit here, not there.

Source files cross-referenced:

  • src/queues/workers.ts (legacy in-process workers)
  • src/workers/enrichDispatcher.ts (rolling refresh dispatcher)
  • src/workers/enrichWorker.ts (Phase 3 isolated process)
  • src/workers/updateWorker.ts (Phase 3 isolated process)
  • src/enrichment/pipeline.ts (enrichV7)
  • src/compliance/reklamsparre.ts
  • src/workers/art14Worker.ts

The diagram reflects the four-source fan-in into Enrich_Job, both compliance gates (reklamspärr, opt-out), and the Article 14 schedule fired from Update_Job.

Diagram

flowchart LR
    %% Queues
    SQ[(Scrape_Job)]
    EQ[(Enrich_Job)]
    UQ[(Update_Job)]
    A14[(Art14_Job)]
    DLQ[(Dead_Letter)]

    %% Sinks
    PG[(Postgres<br/>companies + RoPA_Log)]
    REDIS[(Redis<br/>enrich:org_nr<br/>6mo TTL)]

    %% Scrape gates
    REK{"reklamspärr"}
    OPT{"opt-out"}
    SQ --> REK
    REK -- blocked --> DLQ
    REK -- clear --> OPT
    OPT -- match --> DLQ
    OPT -- clear --> EQ

    %% Update side-effects
    UQ --> PG
    UQ --> REDIS
    UQ --> A14

    %% Failure path
    EQ -. retry x3 .-> DLQ
    UQ -. retry x3 .-> DLQ
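
As a reading aid, the scrape-gate edges of the diagram can be sketched as a small routing function. This is only an illustration, not the code in src/queues/workers.ts: GateChecks, routeScrapeJob, and the two boolean fields are hypothetical names, and only the queue names and edge labels are taken from the diagram.

```typescript
// Queue names as they appear in the diagram.
type Queue = "Enrich_Job" | "Dead_Letter";

// Hypothetical input: the outcome of each compliance check for one Scrape_Job.
interface GateChecks {
  reklamsparrBlocked: boolean; // REK -- blocked
  optOutMatch: boolean;        // OPT -- match
}

// Mirrors the edges SQ --> REK --> OPT --> EQ, with both rejections going to DLQ.
function routeScrapeJob(checks: GateChecks): Queue {
  if (checks.reklamsparrBlocked) return "Dead_Letter"; // REK -- blocked --> DLQ
  if (checks.optOutMatch) return "Dead_Letter";        // OPT -- match --> DLQ
  return "Enrich_Job";                                 // OPT -- clear --> EQ
}
```

Because REK precedes OPT in the diagram, a blocked company is dead-lettered regardless of its opt-out status.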

Notes on accuracy

  • src/queues/workers.ts already calls isScbAdvertisingBlocked() inside the Scrape_Job worker (lines 62-66). The wiki note Pipeline still marks the gate as a P0 gap; the diagram shows the intended gate position. See Known Issues for the open dispute.
  • src/workers/enrichWorker.ts uses enrichV7 (src/enrichment/index.ts) and routes to Update_Job; it deliberately drops results with enrichment_status === 'error' so that a failed run does not overwrite good data with empty scores.
  • src/workers/updateWorker.ts schedules Art14_Job per contact when email_confidence !== 'low' and the enrichment status is not error or blocked.
  • Dead_Letter is defined at src/queues/workers.ts:197; both the in-process and isolated workers move failed jobs into it once retryOptions.attempts is exhausted.
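
The worker rules in the last three notes can be written as small predicates. The sketch below is an assumption-laden illustration, not the workers' actual code: the function names, the Contact and EnrichResult shapes, and the reading of "exhausted" as attemptsMade >= attempts are all hypothetical. Only the field names enrichment_status and email_confidence and the values 'error', 'blocked', and 'low' come from the notes.

```typescript
// Shapes are assumed; only the field names and the literal values
// 'error', 'blocked', and 'low' are taken from the notes.
interface EnrichResult {
  enrichment_status: string;
}

interface Contact {
  email_confidence: string;
}

// Enrich worker: error results are dropped rather than forwarded to Update_Job.
function shouldForwardToUpdate(result: EnrichResult): boolean {
  return result.enrichment_status !== "error";
}

// Update worker: one Art14_Job per contact, skipping low-confidence emails
// and skipping everything when the enrichment status is error or blocked.
function contactsNeedingArt14(status: string, contacts: Contact[]): Contact[] {
  if (status === "error" || status === "blocked") return [];
  return contacts.filter((c) => c.email_confidence !== "low");
}

// Failure path: a job goes to Dead_Letter once its attempts are used up.
// Assumption: `attempts` counts total tries, as in retryOptions.attempts.
function shouldDeadLetter(attemptsMade: number, attempts: number): boolean {
  return attemptsMade >= attempts;
}
```

Keeping these as pure functions would let the in-process and isolated workers share one definition of each rule, though whether the codebase does so is not stated here.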