Bulk-data ingest tooling. The heavy lifters live in src/import/ and are exposed via npm aliases. The scripts/ folder holds the trademark importer, the .se zone loader, and the migration runner.
See also: Bolagsverket Import, SCB Import, Schema Migrations.
Bolagsverket and SCB pipelines (npm aliases)
Defined in package.json (project root):
| Command | Entry point | Purpose |
|---|---|---|
| bun run import:bolagsverket | src/import/bolagsverket-import.ts | Bulk download + ingest of Bolagsverket organisations (~1.8M rows). |
| bun run import:scb | src/import/scb-import.ts | Streaming TSV import of SCB foundations (~1.6M rows). |
| bun run import:merge | src/import/merge-sources.ts | Merges the BV and SCB tables into the unified view. |
| bun run import:validate | src/import/validate-merge.ts | Post-merge sanity counts. |
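
As an illustration of what "streaming TSV import" involves, the sketch below splits incoming text chunks into complete rows and carries a partial trailing line over to the next chunk. It is not taken from src/import/scb-import.ts; the name `makeTsvRowSplitter` and the choice to skip empty lines are assumptions made for the example.

```typescript
// Hypothetical helper, not the code in src/import/scb-import.ts.
// Splits text chunks into complete TSV rows and keeps any partial
// trailing line until the next chunk arrives.
function makeTsvRowSplitter() {
  let carry = "";
  return {
    push(chunk: string): string[][] {
      const lines = (carry + chunk).split("\n");
      // The last element is "" (chunk ended on a newline) or a partial row.
      carry = lines.pop() ?? "";
      return lines
        .map((line) => line.replace(/\r$/, "")) // tolerate CRLF input
        .filter((line) => line.length > 0)      // assumption: blank lines are skipped
        .map((line) => line.split("\t"));
    },
    flush(): string[][] {
      // Emit whatever is left when the input ends without a final newline.
      const rest = carry.replace(/\r$/, "");
      carry = "";
      return rest.length > 0 ? [rest.split("\t")] : [];
    },
  };
}
```

Each call to `push()` returns only rows that are known to be complete, so memory use stays proportional to the chunk size rather than to the ~1.6M-row file.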
Supporting modules (no direct CLI):
- src/import/parser.ts: org-nr / address normalisation utilities used by both importers.
- src/import/copy-import.ts: COPY FROM-style bulk insert helper.
- src/import/delta-import.ts: incremental update path (used after the initial bulk).
- src/import/bv-import-wrapper.sh: shell wrapper that runs import:bolagsverket with retry logic.
- src/import/monitor-imports.sh: tails the importer log and prints throughput.
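
To make "org-nr normalisation" concrete, here is a minimal sketch of what such a utility could look like. It is not the code in src/import/parser.ts: the function names are hypothetical, and both the handling of a 12-digit form with a leading "16" and the Luhn check-digit validation are assumptions about what the real parser might do.

```typescript
// Hypothetical sketch; the real utilities live in src/import/parser.ts.
// Returns the bare 10-digit organisation number, or null if the input is invalid.
function normalizeOrgNr(raw: string): string | null {
  // Drop separators and whitespace: "556012-5790" -> "5560125790".
  let digits = raw.replace(/[\s-]/g, "");
  // Assumption: a 12-digit form carries a leading "16" that can be dropped.
  if (digits.length === 12 && digits.startsWith("16")) digits = digits.slice(2);
  if (!/^\d{10}$/.test(digits)) return null;
  return luhnValid10(digits) ? digits : null;
}

// Luhn check for a 10-digit string: the digits at even indexes (0, 2, 4, ...)
// are doubled, and the total must be divisible by 10.
function luhnValid10(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = digits.charCodeAt(i) - 48;
    if (i % 2 === 0) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}
```

Returning `null` instead of throwing lets a bulk importer count and skip bad rows without aborting the run.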
Env vars: standard PGHOST / PGPORT / PGUSER / PGPASSWORD / PGDATABASE. SCB also reads SCB_API_KEY for the live PxWebApi v2 fetcher.
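
A preflight check along these lines can catch a missing variable before a long import starts. This is a sketch only: `missingEnvVars` is a hypothetical helper, and treating all five PG* variables as mandatory is an assumption (PostgreSQL clients commonly fall back to defaults for some of them).

```typescript
// Hypothetical preflight check; the variable names come from the list above.
const REQUIRED_PG_VARS = ["PGHOST", "PGPORT", "PGUSER", "PGPASSWORD", "PGDATABASE"];

function missingEnvVars(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_PG_VARS,
): string[] {
  // Unset and empty-string values are both reported as missing.
  return required.filter((name) => !env[name]);
}
```

A caller would pass `process.env`, and for the SCB importer could pass `[...REQUIRED_PG_VARS, "SCB_API_KEY"]` as the second argument.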
scripts/import-prv-trademarks.ts (PRV trademarks)
Imports the PRV (Patent- och registreringsverket) trademark XML dump.
Source: ftp://opendata.prv.se (user: OpenDataSource, pass: opendata)

License: CC0 1.0 (commercial use permitted)

Flow: downloadPrvZip() → unzip → stream-parse XML → upsert into trademark tables, joining on org_nr.
Run: bun run scripts/import-prv-trademarks.ts. No npm alias.
scripts/indexer/ (IIS .se zone loader)
Two-step pipeline that populates domain_registry (see migration 006_domain_registry.sql).
1. scripts/indexer/download-zone.sh (or the Python variant download-zone.py): pulls the full .se zone via DNS AXFR from zonedata.iis.se. Output: se_zone_domains.txt, one domain per line, ~1.47M lines. License: CC BY 4.0, commercial use permitted.
2. scripts/indexer/load-registry.ts: bulk-loads that file into domain_registry. Uses Bun.sql and normalizeDomain() to drop the trailing dot. Idempotent via ON CONFLICT.
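
The normalisation and idempotency described in the second step can be sketched as follows. This is an illustrative re-implementation, not the code in load-registry.ts: the lowercasing, the `parseZoneFile` helper, the `domain` column name, and the choice of `DO NOTHING` are all assumptions; check 006_domain_registry.sql for the real schema.

```typescript
// Hypothetical re-implementation; the real normalizeDomain() is only
// described above as dropping the trailing dot.
function normalizeDomain(name: string): string {
  // Lowercasing is an added assumption (DNS names are case-insensitive).
  return name.trim().toLowerCase().replace(/\.$/, "");
}

// Turns the contents of se_zone_domains.txt into a de-duplicated list.
function parseZoneFile(text: string): string[] {
  const seen = new Set<string>();
  for (const line of text.split("\n")) {
    const domain = normalizeDomain(line);
    if (domain.length > 0) seen.add(domain);
  }
  return Array.from(seen);
}

// Assumed column name "domain" backed by a unique constraint. With
// ON CONFLICT ... DO NOTHING, re-running the load skips rows that already exist.
const INSERT_SQL =
  "INSERT INTO domain_registry (domain) VALUES ($1) ON CONFLICT (domain) DO NOTHING";
```

Because normalisation happens before the insert, "example.se." and "example.se" map to the same key, which is what makes a conflict clause sufficient for idempotent reloads.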
Invoke:
bash scripts/indexer/download-zone.sh # or: python scripts/indexer/download-zone.py
bun run scripts/indexer/load-registry.ts

Warning: The downloader pulls ~1.47M records over a single AXFR. Run during off-peak hours and ensure zonedata.iis.se is whitelisted on your network.
scripts/migrate.ts (schema migrations)
Not strictly an importer, but it lives in scripts/ and is the entry point behind the migrate npm alias.
bun run migrate # apply all unapplied migrations
bun run migrate -- --mark-applied 007 # record without executing

The runner streams SQL files through psql (not Bun.sql.unsafe) so that CREATE INDEX CONCURRENTLY, which cannot run inside a transaction block, works. It tracks applied versions in schema_migrations. See Schema Migrations for the full picture.
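
The "apply all unapplied migrations" step boils down to comparing the migration files on disk with the versions recorded in schema_migrations. A minimal sketch of that comparison, assuming files are named `<version>_<name>.sql` (as in 006_domain_registry.sql) and that versions are zero-padded; the function names are hypothetical and this is not the code in scripts/migrate.ts:

```typescript
// Hypothetical sketch, not the code in scripts/migrate.ts.
// Extracts "006" from "006_domain_registry.sql"; returns null for other files.
function versionOf(fileName: string): string | null {
  const match = /^(\d+)_.+\.sql$/.exec(fileName);
  return match?.[1] ?? null;
}

// Returns the migration files whose version is not yet recorded as applied,
// in the order they should run.
function pendingMigrations(files: string[], applied: Set<string>): string[] {
  return files
    .filter((file) => {
      const version = versionOf(file);
      return version !== null && !applied.has(version);
    })
    .sort(); // zero-padded versions sort correctly as plain strings
}
```

Under this model, `--mark-applied 007` amounts to adding "007" to the applied set without running the file, so the next plain `bun run migrate` skips it.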