Start at Dashboard
Catalog of every note in this wiki. One line per note. Add new entries under the matching category. Categories are fixed: see Wiki Conventions and Vault Style Guide.
Maps of Content
- Build Inventory MOC — search before building anything. 37 indexed code units (14 backend files, 14 frontend pages, 4 query modules, 5 MSW handlers + the registry index) with status, gap_ref, mock_ref. Schema locked by repo-side ADR-0010. Walker script + CI hook scheduled next.
- Frontend MOC — kundkort frontend (17 notes; legacy — see banner there)
- KB MOC — KB legal subproject (13 notes)
- Tests MOC — test suite + coverage gaps (10 notes)
- History MOC — git eras and notable commits (12 notes)
- Compliance MOC — GDPR, opt-out, RoPA, Article 14 (8 notes)
Architecture
- System Overview — what DBPOC is, who it serves, the four pipeline stages at a glance.
- Stack — Bun, TypeScript, PostgreSQL, Redis, BullMQ, Keycloak — versions and roles.
- Repository Layout — top-level folders in `./` and what lives where.
- Repository Layout Complete — 89 TypeScript files mapped with exports and purposes.
- Pipeline — Scrape → Enrich → Update queues, worker boundaries, retry policy.
- Decision Register — 12 ADRs with context + consequences (yesterday’s vault format; cross-references the canonical `docs/adr/` in repo).
- Symbol Map — code file → wiki note navigation for 40+ source files.
Enrichment
- Crawlee Scraper — direct site crawler, rate limits, contact extraction quirks.
- Domain Discovery — how a company gets matched to a website (Serper + heuristics).
- Google Places — Maps fallback for phone, address, and existence checks.
- Firecrawl — third-party scraper used as a fallback path.
- Name Validation — NER and blocklist filtering for contact names.
- Lead Scoring — scoring rubric and the four validation layers.
- EnrichV7 — top-level orchestrator function called by `Enrich_Job`.
Data
- Bolagsverket Import — bulk ingest of ~1.8M Swedish companies.
- SCB Import — bulk ingest of ~1.6M foundations from SCB TSV.
- Database Schema — tables in `enrichnodedb`, ownership and relationships.
- Database Schema Complete — full catalog (30+ tables, 50+ indexes, full migration history).
- Schema Migrations — `migrations/` folder, how they are run, current state.
Compliance
- GDPR Legitimate Interest — Article 6(1)(f) basis and the balancing test.
- GDPR Audit Findings — full compliance audit; 7 critical gaps, 8 high-priority gaps.
- Bisnode case — IMY 2019 decision; the lesson behind Article 14 timing.
- Reklamspärr — Swedish marketing opt-out registry handling.
- Article 14 — notification obligations when data is collected from third parties.
- RoPA Log — record of processing activities, where it lives in the schema.
- Opt-Out Hashes — hashed opt-out blocklist and matching procedure.
- Blocklists — domain blocklist + opt-out blocklist composition.
- Domain Blocklist — domain-level rejection rules.
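
As a rough illustration of what the Opt-Out Hashes entry above describes, a hashed blocklist can be matched without storing the raw identifier. This is a minimal sketch, not the project's implementation: the function names, the trim-and-lowercase normalisation, the choice of SHA-256, and the in-memory `Set` are all assumptions made for the example.

```typescript
import { createHash } from "node:crypto";

// Hypothetical normalisation: trim whitespace and lowercase before hashing,
// so that cosmetic differences do not defeat the match.
function hashIdentifier(value: string): string {
  return createHash("sha256")
    .update(value.trim().toLowerCase(), "utf8")
    .digest("hex");
}

// Hypothetical in-memory blocklist holding only hashes, never raw values.
// A real system would presumably load these from the database.
const optOutHashes = new Set<string>([hashIdentifier("anna@example.se")]);

// An identifier is opted out when its hash is present in the blocklist.
function isOptedOut(value: string): boolean {
  return optOutHashes.has(hashIdentifier(value));
}
```

Hashing the incoming value with the same normalisation as the stored entries is what makes the lookup work; any difference between the two sides would silently let opted-out contacts through.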
Operations
- Known Issues — open bugs, broken assumptions, things that bite.
- Autoresearch Loop — how `autoresearch/` runs experiments and persists results, plus the three runner variants.
- Experiment Results — index into `autoresearch/results/` with one-line summaries.
- Autoresearch Result Types — the three structurally distinct artefact families under `autoresearch/results/`.
- Environment Variables — full inventory of env vars (58 docs).
- Scripts Reference — every script under `scripts/` with purpose + invocation.
- Lint Report 2026-04-27 — last lint pass output.
- Local Development — Docker Compose, env vars, common workflows.
Scripts
- Import Scripts — Bolagsverket / SCB / merge npm aliases, PRV trademark importer, IIS .se zone loader.
- Archive Scripts — `archive-inactive`, `archive-non-ab`, `verify-archive`, `restore-from-archive` runbook.
- Backup Scripts — `pg_dump` wrapper, when to run, restore path.
- Debug Scripts — every `test-*` / `check-*` / `clear-*` and the manual enrichment probes.
- One-off Scripts — `generate-reports`, the standalone SQL files, frontend build helper.
Frontend
- Frontend Overview — what `frontend/kundkort/` is, stack, top-level entry, data flow.
- Kundkort Overview — yesterday’s framing of the React SPA; tech stack, auth flow, data flow.
- Components Reference — full catalog of 25 components (pages, sections, UI, charts).
- Hooks Reference — useAuth, useKundkort, useEcoApi, useSearch.
- Services Reference — ECOAPI client (insights, gaps, enrichment).
- Frontend Build — how the SPA is bundled and served by `src/api/index.ts`.
- Auth Flow — `useAuth`, `LoginModal`, the DEV_MODE auto-login flag.
- Kundkort API Client — every backend endpoint the SPA calls.
- EcoAPI Integration — secondary insights backend at port 3100, hard-coded auth.
- Search Page — type-ahead search, recent searches, advanced filters, CSV export.
- Kundkort Page — detail view shell: tabs, sticky bar, Berika action.
- Identity Card — Översikt tab, basic facts row.
- Contacts Section — `ContactInfoCard` and per-role `ContactsSection` with `ContactCard`.
- Summary Section — `sammanfattning` text plus source pill.
- Financials Section — three-row table (anställda, omsättning, resultat) per year.
- Koncern Section — parent and subsidiaries from `data.koncern`.
- Varumarken Section — trademarks from `data.agda_varumarken`.
- Tremor Charts — `FinancialChart`, `ContactDistributionChart`, `DataCompletenessChart`.
- Gaps Panel — client-derived data-gap analysis with “Fyll gap” button.
- Insights Panel — metrics tiles, growth and risk signals.
- UI Primitives — `SectionHeading`, `KvRow`, `GapBadge`, `ContactCard`, `SkeletonLoader`, `LoadingCard`, `ErrorState`, `ErrorPanel`, `Footer`, `CompanyHeader`, `formatOrgNr`.
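
For orientation, a helper like `formatOrgNr` plausibly renders a Swedish organisation number in its conventional hyphenated form (ten digits, with a hyphen after the sixth). The sketch below is an assumption about that behaviour, not the component's actual code: the name `formatOrgNrSketch` and the fall-through for unexpected input are hypothetical.

```typescript
// Hypothetical stand-in for a formatOrgNr-style helper.
// Assumes the input is a 10-digit organisation number, possibly already
// containing a hyphen or spaces; anything else is returned unchanged.
function formatOrgNrSketch(raw: string): string {
  const digits = raw.replace(/\D/g, ""); // keep digits only
  if (digits.length !== 10) {
    return raw; // not a 10-digit number: leave the input as it was
  }
  return `${digits.slice(0, 6)}-${digits.slice(6)}`;
}
```

Returning the input untouched for unexpected lengths keeps the helper safe to call on partially loaded or malformed data; whether the real helper does the same is not stated in this index.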
Testing
- Test Strategy — what runs under `bun test`, env vars, what’s mocked, what hits the DB or network.
- API Tests — `tests/api/*.test.ts`: REST integration tests for auth, CRUD, search, kundkort.
- Enrichment Tests — `tests/enrichment/*` plus `src/enrichmentEngine.v7.test.ts`: extractors, processors, board-member integration.
- Fetcher Tests — `tests/fetchers/*` plus the colocated mapper test: SCB PxWebApi, Bolagsverket bulk download, iXBRL parser.
- Domain Discovery Tests — `tests/domainDiscovery.test.ts`: hosting and municipality rejection.
- Speed Tests — `tests/speed.test.ts`: 50 ms registry-lookup budget.
- Compliance Tests — `src/compliance.test.ts`, `src/integration.test.ts`, Article 14 section: hashing, RoPA, opt-out, queues.
- Validation Tests — `src/lib/validation.test.ts` and `src/validationEngine.test.ts`: input validators and four-layer engine.
- Regression Guard Tests — `autoresearch/regression.test.ts`: must-pass guard for the autoresearch loop.
- Test Coverage Gaps — source modules that have no automated test.
Process
- Memory Rules — what belongs in `../Memory/` versus the wiki.
- Wiki Conventions — filename, heading, frontmatter, callout rules.
- Vault Style Guide — the controlled tag vocabulary, callout palette, MOC pattern.
- Lint Checklist — mechanical checks every edit must pass.
- LLM Wiki Best Practices — 15 patterns for LLM-optimized knowledge bases.
- QA Report 2026-04-28 — first QA pass of vault claims.
- QA Report Final 2026-04-28 — final QA validation snapshot.
- QA Gate Full Report 2026-04-28 — the audit that drove today’s work.
- Test Coverage Report — vault-level test coverage snapshot.
History
- History Overview — top-level timeline of the 50 commits in `master`, grouped into eras.
- Notable Commits — top 20 commits with hash, date, and why they matter.
- Git History — yesterday’s vault take on the 8-week timeline (parallel framing to History Overview).
- Experiment History — all 29 quality-loop rounds of the Crawlee experiment.
- History Foundation Era — security hardening and pre-rewrite restore point (2026-03-11 → 2026-03-24).
- History Phase Refactor Era — Phase 1 / 2 / 3 of the Enterprise Enrichment Plan (2026-03-24 → 2026-03-25).
- History Domain Discovery Era — single-day binge that produced the current discovery pipeline (2026-03-25).
- History Firecrawl Era — feature-flagged LLM extractor experiment (2026-03-26 → 2026-03-27).
- History Kundkort Era — React frontend, BV VärdefullaDatamängder API, role normalisation (2026-03-31 → 2026-04-01).
- History Crawlee Era — Crawlee scraper and the 29-round autoresearch quality loop (2026-04-01 → 2026-04-02).
- History Frontend Era — Tremor dashboard and ECOAPI (2026-04-06).
- History Archive Era — backup script, archive tables, archive runs (2026-04-06).
- History Docs Era — SYSTEM_OVERVIEW, root-files archive, name-validation regression (2026-04-22 → 2026-04-27).
- History Migrations Era — schema_migrations tracking added late (2026-04-27).
KB Subproject
- Knowledge Base Overview — yesterday’s framing of the KB subproject (architecture, tech stack, build process).
- Article Index — yesterday’s complete catalog of all GDPR/compliance articles.
- KB Overview — what `KB/` is, why it exists, dev workflow, port 3001.
- KB Architecture — server → frontend → kb modules → Anthropic SDK call graph.
- KB Content Index — the 10 live articles, plus the unwired drafts and the 10-vs-29 mismatch.
- KB GDPR Articles — Art. 6 / 14 / 17 live + Art. 30 / overview drafts.
- KB Swedish Law — Dataskyddslagen, IMY, Bolagsverket modules.
- KB B2B Enrichment — LIA, data sources, the five enrichment tiers, Art. 14 ops, retention.
- KB IMY Decisions — IMY structure, 2019–25 timeline, 2025 written-LIA requirement.
- KB Chat Flow — `useChat` + Anthropic agentic loop with `search_web` tool.
- KB Search — Fuse.js client-side fuzzy search (no body indexing).
- KB Credibility Scoring — `scoreUrl()` tiers and the EU-DPA-centric domain lists.
- KB UI Components — Sidebar, ArticleViewer, ChatPanel, TOC, SourceBadge, etc.
- KB Settings — `useSettings`, modal, model options, theme handling.
- KB Legal Disclaimer — text, persistence, why it exists.
Lessons
- Failed Approaches — 11 lessons from experiments that didn’t work (latest: Path 3B PDF+LLM, 2026-05-05).
- Technical Debt — P0/P1/P2 register with 12 items and remediation plans.