npm - @danielarndt0/cnpj-db-loader - Versions diffs - 2.2.0 → 2.3.0 - Mend

@danielarndt0/cnpj-db-loader 2.2.0 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (10)

package/docs/architecture.md CHANGED

@@ -50,7 +50,7 @@ The importer is now split into focused modules so future performance work can re
 - `quarantine-writer`: stores bad rows without stopping long imports
 - `runner`: orchestrates the current import flow while keeping the service entry point small
-The project now also generates dedicated staging tables for large datasets. The CLI exposes both a one-shot command (`import`) and split commands (`import load`, `import materialize`). Staging cleanup is handled explicitly through `database cleanup staging`. The write path sends the heavy datasets to staging tables first with only light normalization, then consolidates them into a simplified final schema in dependency order while keeping the smaller catalog datasets on the final schema directly. The final schema now stays closer to the Receita layout so the API can derive richer views later without forcing every first load to pay that cost inside PostgreSQL.
+The project now also generates dedicated staging tables for large datasets. The CLI exposes both a one-shot command (`import`) and split commands (`import load`, `import materialize`). Staging cleanup is handled explicitly through `database cleanup staging`. The write path sends the heavy datasets to staging tables first with only light normalization, then consolidates them into a simplified final schema in dependency order while keeping the smaller catalog datasets on the final schema directly. The final schema stays close to the Receita layout while also exposing `establishment_secondary_cnaes`, a normalized helper table that lets CNPJ API query one row per establishment secondary CNAE without requiring an external backfill after each load.
 ## Staging schema

package/docs/commands.md CHANGED

@@ -19,7 +19,7 @@
 | `database config test`          | Test the connection using the saved or overridden URL.                                                                                                            |
 | `database config reset`         | Remove the saved PostgreSQL URL after confirmation.                                                                                                               |
 | `database cleanup staging`      | Truncate staging tables and optionally clear linked materialization checkpoints for a validated path.                                                             |
-| `database cleanup materialized` | Truncate simplified final relational tables populated by materialization in safe order for the current schema.                                                    |
+| `database cleanup materialized` | Truncate simplified final relational tables populated by materialization, including establishment secondary CNAEs when available, in safe order.                  |
 | `database cleanup checkpoints`  | Clear load checkpoints, materialization checkpoints, or both without truncating staging or final tables.                                                          |
 | `database cleanup plans`        | Delete saved import plans. Related plan files and materialization checkpoints are removed by database cascade.                                                    |
 | `import <input>`                | Run the full pipeline: plan, load validated files into staging/direct final targets, materialize staged datasets into final tables, and finalize the import plan. |

package/docs/usage.md CHANGED

@@ -67,6 +67,7 @@ cnpj-db-loader schema generate --profile staging
 - it persists the import plan in the database and reuses it on resume when the validated source files and batch size match
 - it reads files in streaming mode
 - it loads the large datasets into lightweight staging tables through PostgreSQL COPY with only light normalization in the hot path and defers heavier work to the materialization stage in dependency order
+- during establishment materialization, it also populates `establishment_secondary_cnaes` from `secondary_cnaes_raw`, replacing the previous need for a separate API-side backfill script
 - before each staged dataset is materialized into the final schema, the importer only reconciles missing lookup/domain codes when the current final schema still requires those lookup foreign keys
 - once the file import phase ends, the terminal switches to a dedicated MATERIALIZING stage and the JSONL progress log emits heartbeat entries during long staged-to-final upserts
 - it still upserts the smaller domain datasets directly into the final schema
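
Reviewer note: the added bullet says `establishment_secondary_cnaes` is populated from `secondary_cnaes_raw` during establishment materialization. The diff does not show how that is done, so the following is only a minimal sketch of the normalization idea, assuming the raw field is a comma-separated list of CNAE codes. The function name `explodeSecondaryCnaes`, the `SecondaryCnaeRow` shape, and its field names are hypothetical and are not taken from the package.

```typescript
// Hypothetical row shape; field names are assumptions, not the package's schema.
interface SecondaryCnaeRow {
  cnpj: string; // identifier of the establishment
  cnaeCode: string; // one secondary CNAE code
}

// Split a raw secondary CNAE field (assumed comma-separated) into one row per code.
// Blank entries are skipped and duplicates are dropped, keeping first-seen order.
function explodeSecondaryCnaes(
  cnpj: string,
  secondaryCnaesRaw: string | null,
): SecondaryCnaeRow[] {
  if (!secondaryCnaesRaw) return [];
  const seen = new Set<string>();
  const rows: SecondaryCnaeRow[] = [];
  for (const part of secondaryCnaesRaw.split(",")) {
    const cnaeCode = part.trim();
    if (cnaeCode === "" || seen.has(cnaeCode)) continue;
    seen.add(cnaeCode);
    rows.push({ cnpj, cnaeCode });
  }
  return rows;
}
```

For example, `explodeSecondaryCnaes("00000000000191", "6201501, 6202300,,6201501")` yields two rows, one per distinct code. The package itself may well perform this step in SQL inside PostgreSQL rather than in application code; the sketch only illustrates the one-row-per-secondary-CNAE shape that the new table is described as exposing.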

package/package.json CHANGED

@@ -1,10 +1,20 @@
 {
   "name": "@danielarndt0/cnpj-db-loader",
-  "version": "2.2.0",
+  "version": "2.3.0",
   "publishConfig": {
     "access": "public"
   },
   "description": "Practical CLI for preparing Brazilian Federal Revenue CNPJ open data for PostgreSQL.",
+  "keywords": [
+    "cnpj",
+    "federal-revenue",
+    "cli",
+    "command-line",
+    "postgresql",
+    "database",
+    "typescript",
+    "nodejs"
+  ],
   "author": "Daniel Arndt",
   "license": "MIT",
   "type": "module",