Pipeline

Three BullMQ workers, defined in src/queues/workers.ts, run in sequence for each org_nr.

Flow

Scrape_Job
  → check reklamspärr (SCB advertising_block); not currently enforced in the worker, see P0 under Stages
  → check opt-out (OptOut_Hashes)
  → if clear: enqueue Enrich_Job

Enrich_Job
  → run enrichV7(input)
  → enqueue Update_Job

Update_Job
  → upsert into companies (enriched_data JSONB)
  → append to RoPA_Log
  → set Redis cache (6-month TTL)

Source: docs/SYSTEM_OVERVIEW.md § How enrichment works → The queue pipeline.
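
The flow above can be read as a chain in which each stage either hands off to the next or stops. The sketch below models that chain as documented, in plain TypeScript with no BullMQ dependency. Apart from the job names and enrichV7, every name here (Deps, step, runChain, the job shapes, the { org_nr } input shape) is a hypothetical chosen for illustration, and the dependencies are synchronous only to keep the model short. It is not the contents of src/queues/workers.ts.

```typescript
// Hypothetical model of the three-stage chain. Only the job names and
// enrichV7 come from the document; everything else is an assumption.
type Job =
  | { name: "Scrape_Job"; orgNr: string }
  | { name: "Enrich_Job"; orgNr: string }
  | { name: "Update_Job"; orgNr: string; enriched: unknown };

interface Deps {
  isAdvertisingBlocked(orgNr: string): boolean; // reklamspärr lookup
  isOptedOut(orgNr: string): boolean; // OptOut_Hashes lookup
  enrichV7(input: { org_nr: string }): unknown; // assumed input shape
  upsertCompany(orgNr: string, enriched: unknown): void;
  appendRopaLog(orgNr: string): void;
  setCache(orgNr: string, enriched: unknown): void;
}

// Process one job and return the follow-up job, or null when the chain ends.
function step(job: Job, deps: Deps): Job | null {
  switch (job.name) {
    case "Scrape_Job":
      if (deps.isAdvertisingBlocked(job.orgNr) || deps.isOptedOut(job.orgNr)) {
        return null; // blocked: Enrich_Job is never enqueued
      }
      return { name: "Enrich_Job", orgNr: job.orgNr };
    case "Enrich_Job":
      return {
        name: "Update_Job",
        orgNr: job.orgNr,
        enriched: deps.enrichV7({ org_nr: job.orgNr }),
      };
    case "Update_Job":
      deps.upsertCompany(job.orgNr, job.enriched);
      deps.appendRopaLog(job.orgNr);
      deps.setCache(job.orgNr, job.enriched);
      return null;
  }
}

// Drive the chain for one org_nr and return the names of the stages that ran.
function runChain(orgNr: string, deps: Deps): string[] {
  const ran: string[] = [];
  let job: Job | null = { name: "Scrape_Job", orgNr };
  while (job !== null) {
    ran.push(job.name);
    job = step(job, deps);
  }
  return ran;
}
```

In this model a clear org_nr passes through all three stages, while a blocked or opted-out one stops after Scrape_Job.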

Stages

P0 — Reklamspärr gap

Warning

src/queues/workers.ts does NOT call isScbAdvertisingBlocked(). The check lives only inside enrichV7(), which runs after Scrape_Job has already enqueued Enrich_Job. Companies with advertising_block = true get enriched anyway. This is a GDPR compliance gap. See Reklamspärr and Known Issues.
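
One way to close this gap is to run the block check inside the Scrape_Job handler, before anything is enqueued. The sketch below assumes isScbAdvertisingBlocked() takes an org_nr and resolves to a boolean; that signature, the isOptedOut helper, the handleScrapeJob name, and the outcome strings are all assumptions. The checks and the enqueue function are injected so the example runs without a database or BullMQ.

```typescript
// Assumed signatures; the real isScbAdvertisingBlocked() may differ.
interface ScrapeChecks {
  isScbAdvertisingBlocked(orgNr: string): Promise<boolean>;
  isOptedOut(orgNr: string): Promise<boolean>;
}

type ScrapeOutcome = "enqueued" | "skipped:reklamsparr" | "skipped:opt-out";

// Hypothetical Scrape_Job handler: both checks run before Enrich_Job is
// enqueued, so a blocked company never reaches enrichV7().
async function handleScrapeJob(
  orgNr: string,
  checks: ScrapeChecks,
  enqueueEnrich: (orgNr: string) => Promise<void>,
): Promise<ScrapeOutcome> {
  if (await checks.isScbAdvertisingBlocked(orgNr)) {
    return "skipped:reklamsparr";
  }
  if (await checks.isOptedOut(orgNr)) {
    return "skipped:opt-out";
  }
  await enqueueEnrich(orgNr);
  return "enqueued";
}
```

In a BullMQ worker, enqueueEnrich would typically wrap a call to the Enrich_Job queue's add() method. Returning an explicit outcome instead of silently dropping the job makes skipped companies easy to count and log.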

Concurrency limit

Warning

Playwright runs synchronously inside the Enrich_Job worker thread. Above 40 concurrent jobs the worker pool will run out of memory (OOM) or deadlock. Decoupling Playwright into its own pool is required before raising concurrency. Tracked as P1 in Known Issues.
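
One building block for a separate Playwright pool is a cap on how many browser tasks run at once, independent of BullMQ job concurrency. The sketch below is a generic promise-based counting semaphore, not the project's code. The Semaphore class and its limit are illustrative assumptions, and the Playwright calls themselves are left out so the example stays self-contained.

```typescript
// Minimal counting semaphore: at most `limit` tasks run at the same time.
class Semaphore {
  private readonly limit: number;
  private active = 0;
  private readonly waiters: Array<() => void> = [];

  constructor(limit: number) {
    this.limit = limit;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.limit) {
      // Park until a finishing task hands over its slot.
      await new Promise<void>((resolve) => this.waiters.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiters.shift();
      if (next) {
        next(); // the slot is transferred, so `active` stays unchanged
      } else {
        this.active--;
      }
    }
  }
}
```

A worker would wrap each browser task as pool.run(() => scrapeWithBrowser(orgNr)), where scrapeWithBrowser is a hypothetical function. The limit would be chosen from measured memory per browser context rather than from the job concurrency setting.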

Cache

  • Key: enrich:${org_nr}
  • TTL: 6 months
  • Bypass with input.bypass_cache = true
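
The cache rules above can be sketched as a few small helpers. The key format and the bypass flag come from the list; the 180-day approximation of six months, the EnrichInput shape, and the in-memory TtlCache standing in for Redis are assumptions made for illustration.

```typescript
// Assumption: "6 months" is approximated as 180 days; the real constant may differ.
const CACHE_TTL_SECONDS = 180 * 24 * 60 * 60;

// Assumed input shape; only bypass_cache is named in the document.
interface EnrichInput {
  org_nr: string;
  bypass_cache?: boolean;
}

// Key format from the list above: enrich:${org_nr}
function cacheKey(orgNr: string): string {
  return `enrich:${orgNr}`;
}

// In-memory stand-in for Redis that stores each value with its expiry time.
class TtlCache {
  private readonly entries = new Map<string, { value: string; expiresAt: number }>();

  set(key: string, value: string, ttlSeconds: number, nowMs: number): void {
    this.entries.set(key, { value, expiresAt: nowMs + ttlSeconds * 1000 });
  }

  get(key: string, nowMs: number): string | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= nowMs) {
      this.entries.delete(key); // expired: behave as a cache miss
      return undefined;
    }
    return entry.value;
  }
}

// Return the cached payload unless the caller asked to bypass the cache.
function readCached(input: EnrichInput, cache: TtlCache, nowMs: number): string | undefined {
  if (input.bypass_cache === true) return undefined;
  return cache.get(cacheKey(input.org_nr), nowMs);
}
```

Against a real Redis instance the same TTL would be passed in seconds, for example through the EX option of SET.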

See also

EnrichV7, Reklamspärr, RoPA Log, Known Issues.