Bulk-data ingest tooling. The actual heavy lifters live in src/import/, exposed via npm aliases. The scripts/ folder only holds the trademark importer and the .se zone loader.

See also: Bolagsverket Import, SCB Import, Schema Migrations.

Bolagsverket and SCB pipelines (npm aliases)

Defined in package.json (project root):

| Command | Entry point | Purpose |
| --- | --- | --- |
| bun run import:bolagsverket | src/import/bolagsverket-import.ts | Bulk download + ingest of Bolagsverket organisations (~1.8M rows). |
| bun run import:scb | src/import/scb-import.ts | Streaming TSV import of SCB foundations (~1.6M rows). |
| bun run import:merge | src/import/merge-sources.ts | Merges the BV and SCB tables into the unified view. |
| bun run import:validate | src/import/validate-merge.ts | Post-merge sanity counts. |

Supporting modules (no direct CLI):

  • src/import/parser.ts — org-nr / address normalisation utilities used by both importers.
  • src/import/copy-import.ts — COPY FROM-style bulk insert helper.
  • src/import/delta-import.ts — incremental update path (used after the initial bulk).
  • src/import/bv-import-wrapper.sh — shell wrapper that runs import:bolagsverket with retry logic.
  • src/import/monitor-imports.sh — tails the importer log and prints throughput.
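
To illustrate the kind of org-nr normalisation that parser.ts provides, here is a minimal sketch. The function names normalizeOrgNr and luhnValid are hypothetical and the real module may behave differently. The sketch assumes a 10-digit organisation number with a Luhn check digit, and treats a 12-digit value starting with "16" as the prefixed form of the same number.

```typescript
// Hypothetical helper, not the actual code in src/import/parser.ts.
// Returns the bare 10-digit organisation number, or null if it is malformed.
function normalizeOrgNr(raw: string): string | null {
  // Keep digits only, so "556703-7485" and "5567037485" compare equal.
  let digits = raw.replace(/\D/g, "");

  // Assumption: a 12-digit value starting with "16" carries a prefix to strip.
  if (digits.length === 12 && digits.startsWith("16")) {
    digits = digits.slice(2);
  }
  if (digits.length !== 10) return null;

  return luhnValid(digits) ? digits : null;
}

// Luhn checksum: double every second digit counting leftwards from the digit
// next to the check digit, sum the digit sums, and require a multiple of 10.
function luhnValid(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = digits.charCodeAt(i) - 48;
    const positionFromRight = digits.length - 1 - i;
    if (positionFromRight % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}
```

Returning null rather than throwing lets a bulk importer count and skip bad rows without aborting a multi-million-row run; whether the real importers do this is not stated here.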

Env vars: standard PGHOST / PGPORT / PGUSER / PGPASSWORD / PGDATABASE. SCB also reads SCB_API_KEY for the live PxWebApi v2 fetcher.
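
A pre-flight check for these variables could look like the sketch below. The function missingEnvVars is hypothetical and not part of the repository. It also assumes that all five PG* variables must be set explicitly, which may be stricter than the importers really are.

```typescript
// Hypothetical pre-flight check; the importers may validate differently.
type Importer = "bolagsverket" | "scb";

const PG_VARS = ["PGHOST", "PGPORT", "PGUSER", "PGPASSWORD", "PGDATABASE"];

// Returns the names of required variables that are unset or empty.
function missingEnvVars(
  env: Record<string, string | undefined>,
  importer: Importer,
): string[] {
  const required = PG_VARS.slice();
  // Only the SCB importer reads SCB_API_KEY.
  if (importer === "scb") required.push("SCB_API_KEY");
  return required.filter((name) => !env[name]);
}
```

In a Bun script, passing process.env as the first argument and exiting when the returned list is non-empty surfaces a missing setting before any download starts.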

scripts/import-prv-trademarks.ts (PRV trademarks)

Imports the PRV (Patent och registreringsverket) trademark XML dump.

ftp://opendata.prv.se   user: OpenDataSource   pass: opendata
License: CC0 1.0 (commercial use permitted)

Flow: downloadPrvZip() → unzip → stream-parse XML → upsert into trademark tables, joining on org_nr.
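
The upsert step at the end of this flow can be sketched as a batched, parameterised statement. Everything named here is an assumption made for illustration: the table trademarks, its columns, the conflict key registration_nr, and the helper buildTrademarkUpsert do not come from the importer itself.

```typescript
// Hypothetical row shape; the real trademark tables may have other columns.
interface TrademarkRow {
  registrationNr: string;
  orgNr: string;
  name: string;
}

// Builds one INSERT ... ON CONFLICT statement for a batch of parsed rows.
// Values are passed as parameters ($1, $2, ...) instead of being inlined.
function buildTrademarkUpsert(rows: TrademarkRow[]): { text: string; values: string[] } {
  if (rows.length === 0) throw new Error("cannot build an upsert for an empty batch");

  const values: string[] = [];
  const tuples = rows.map((row, i) => {
    values.push(row.registrationNr, row.orgNr, row.name);
    const base = i * 3;
    return `($${base + 1}, $${base + 2}, $${base + 3})`;
  });

  const text =
    "INSERT INTO trademarks (registration_nr, org_nr, name) VALUES " +
    tuples.join(", ") +
    " ON CONFLICT (registration_nr) DO UPDATE SET" +
    " org_nr = EXCLUDED.org_nr, name = EXCLUDED.name";

  return { text, values };
}
```

Batching rows from the streaming parser into statements like this keeps memory bounded while avoiding one round trip per trademark.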

Run: bun run scripts/import-prv-trademarks.ts. No npm alias.

scripts/indexer/ — IIS .se zone loader

Two-step pipeline that populates domain_registry (see migration 006_domain_registry.sql).

  1. scripts/indexer/download-zone.sh (or the Python variant download-zone.py) — pulls the full .se zone via DNS AXFR from zonedata.iis.se. Output: se_zone_domains.txt, one domain per line, ~1.47M lines. License: CC BY 4.0, commercial use permitted.
  2. scripts/indexer/load-registry.ts — bulk-loads that file into domain_registry. Uses Bun.sql and normalizeDomain() to drop the trailing dot. Idempotent via ON CONFLICT.
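
The normalisation and batching in step 2 can be sketched as follows. This is a guess at the shape of normalizeDomain(), not the code in load-registry.ts: the source only says it drops the trailing dot, so the trimming and lowercasing are assumptions. The chunk helper is hypothetical.

```typescript
// Sketch of normalizeDomain(); only the trailing-dot removal is documented,
// the trim and lowercase steps are assumptions.
function normalizeDomain(raw: string): string {
  const cleaned = raw.trim().toLowerCase();
  return cleaned.endsWith(".") ? cleaned.slice(0, -1) : cleaned;
}

// Hypothetical helper: splits the zone file's lines into fixed-size batches
// so each bulk insert stays at a manageable size.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("size must be positive");
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```

Because the load is idempotent via ON CONFLICT, re-running a batch after a failure does not create duplicate rows.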

Invoke:

bash scripts/indexer/download-zone.sh   # or: python scripts/indexer/download-zone.py
bun run scripts/indexer/load-registry.ts

Warning

The downloader pulls ~1.47M records in a single AXFR transfer. Run it during off-peak hours and make sure your network allows traffic to zonedata.iis.se (AXFR runs over TCP port 53).

scripts/migrate.ts (schema migrations)

Not strictly an importer, but it lives in scripts/ and is the entry point for the canonical migrate npm alias.

bun run migrate                          # apply all unapplied migrations
bun run migrate -- --mark-applied 007    # record without executing

Streams SQL files through psql (not Bun.sql.unsafe) so that CREATE INDEX CONCURRENTLY, which cannot run inside a transaction block, works. Applied versions are tracked in schema_migrations. See Schema Migrations for the full picture.
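
Selecting which migrations to run can be sketched as a pure function. The helper names pendingMigrations and versionOf are hypothetical. The NNN_name.sql naming is inferred from 006_domain_registry.sql and the 007 argument above, and scripts/migrate.ts may select files differently.

```typescript
// Assumed file naming: a numeric version prefix, an underscore, then a name.
const MIGRATION_FILE = /^(\d+)_.+\.sql$/;

// Extracts the version prefix ("006") or returns null for other files.
function versionOf(file: string): string | null {
  const match = MIGRATION_FILE.exec(file);
  return match ? match[1] : null;
}

// Returns the migration files not yet recorded as applied, in ascending
// version order. `applied` stands in for the rows of schema_migrations.
function pendingMigrations(files: string[], applied: Set<string>): string[] {
  return files
    .filter((file) => {
      const version = versionOf(file);
      return version !== null && !applied.has(version);
    })
    .sort((a, b) => Number(versionOf(a)) - Number(versionOf(b)));
}
```

Under this model, --mark-applied 007 corresponds to adding "007" to the applied set without running the file, so later runs skip it.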