altimate-code 0.4.9 → 0.5.1

Files changed (39)
  1. package/CHANGELOG.md +36 -0
  2. package/README.md +22 -60
  3. package/package.json +54 -14
  4. package/postinstall.mjs +35 -0
  5. package/skills/cost-report/SKILL.md +134 -0
  6. package/skills/data-viz/SKILL.md +135 -0
  7. package/skills/data-viz/references/component-guide.md +394 -0
  8. package/skills/dbt-analyze/SKILL.md +130 -0
  9. package/skills/dbt-analyze/references/altimate-dbt-commands.md +66 -0
  10. package/skills/dbt-analyze/references/lineage-interpretation.md +58 -0
  11. package/skills/dbt-develop/SKILL.md +151 -0
  12. package/skills/dbt-develop/references/altimate-dbt-commands.md +66 -0
  13. package/skills/dbt-develop/references/common-mistakes.md +49 -0
  14. package/skills/dbt-develop/references/incremental-strategies.md +118 -0
  15. package/skills/dbt-develop/references/layer-patterns.md +158 -0
  16. package/skills/dbt-develop/references/medallion-architecture.md +125 -0
  17. package/skills/dbt-develop/references/yaml-generation.md +90 -0
  18. package/skills/dbt-docs/SKILL.md +99 -0
  19. package/skills/dbt-docs/references/altimate-dbt-commands.md +66 -0
  20. package/skills/dbt-docs/references/documentation-standards.md +94 -0
  21. package/skills/dbt-test/SKILL.md +121 -0
  22. package/skills/dbt-test/references/altimate-dbt-commands.md +66 -0
  23. package/skills/dbt-test/references/custom-tests.md +59 -0
  24. package/skills/dbt-test/references/schema-test-patterns.md +103 -0
  25. package/skills/dbt-test/references/unit-test-guide.md +121 -0
  26. package/skills/dbt-troubleshoot/SKILL.md +187 -0
  27. package/skills/dbt-troubleshoot/references/altimate-dbt-commands.md +66 -0
  28. package/skills/dbt-troubleshoot/references/compilation-errors.md +57 -0
  29. package/skills/dbt-troubleshoot/references/runtime-errors.md +71 -0
  30. package/skills/dbt-troubleshoot/references/test-failures.md +95 -0
  31. package/skills/lineage-diff/SKILL.md +64 -0
  32. package/skills/pii-audit/SKILL.md +117 -0
  33. package/skills/query-optimize/SKILL.md +86 -0
  34. package/skills/schema-migration/SKILL.md +119 -0
  35. package/skills/sql-review/SKILL.md +118 -0
  36. package/skills/sql-translate/SKILL.md +68 -0
  37. package/skills/teach/SKILL.md +54 -0
  38. package/skills/train/SKILL.md +51 -0
  39. package/skills/training-status/SKILL.md +45 -0
@@ -0,0 +1,121 @@
1
+ # dbt Unit Tests
2
+
3
+ Unit tests validate model logic by mocking inputs and asserting expected outputs. Available in dbt-core 1.8+.
4
+
5
+ ## When to Unit Test
6
+
7
+ **DO unit test:**
8
+ - Complex calculations (revenue attribution, MRR changes, scoring)
9
+ - Business logic with edge cases (null handling, date boundaries, status transitions)
10
+ - Models with conditional logic (`CASE WHEN`, `IFF`, `COALESCE`)
11
+ - Aggregations where correctness is critical
12
+
13
+ **Do NOT unit test:**
14
+ - Simple staging models (just rename/cast)
15
+ - Pass-through models with no logic
16
+ - Built-in dbt functions
17
+
18
+ ## Basic Structure
19
+
20
+ Unit tests live in schema.yml (or a dedicated `_unit_tests.yml` file):
21
+
22
+ ```yaml
23
+ unit_tests:
24
+   - name: test_net_revenue_calculation
25
+     description: Verify net_revenue = gross - refunds - discounts
26
+     model: fct_daily_revenue
27
+     given:
28
+       - input: ref('stg_orders')
29
+         rows:
30
+           - { order_id: 1, gross_amount: 100.00, refund_amount: 10.00, discount_amount: 5.00 }
31
+           - { order_id: 2, gross_amount: 50.00, refund_amount: 0.00, discount_amount: 0.00 }
32
+     expect:
33
+       rows:
34
+         - { order_id: 1, net_revenue: 85.00 }
35
+         - { order_id: 2, net_revenue: 50.00 }
36
+ ```
37
+
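To make the mechanics concrete, here is a minimal Python sketch of what a unit test like the one above checks: mocked rows go in, the model SQL runs, and the result is compared with the expected rows. The SQL in `MODEL_SQL` is a hypothetical stand-in for `fct_daily_revenue` (the real model is not part of this diff), and an in-memory SQLite database is used only as an assumption-free way to run it; dbt itself performs this comparison against your warehouse.

```python
import sqlite3

# Hypothetical stand-in for the fct_daily_revenue model SQL (not from the package).
MODEL_SQL = """
SELECT order_id,
       gross_amount - refund_amount - discount_amount AS net_revenue
FROM stg_orders
ORDER BY order_id
"""


def run_unit_test(given_rows, expected_rows):
    """Load mocked input rows, run the model SQL, and compare with the expected rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stg_orders "
        "(order_id INTEGER, gross_amount REAL, refund_amount REAL, discount_amount REAL)"
    )
    conn.executemany(
        "INSERT INTO stg_orders "
        "VALUES (:order_id, :gross_amount, :refund_amount, :discount_amount)",
        given_rows,
    )
    actual = conn.execute(MODEL_SQL).fetchall()
    conn.close()
    return actual == expected_rows, actual


# Same mocked inputs and expectations as the YAML example.
given = [
    {"order_id": 1, "gross_amount": 100.00, "refund_amount": 10.00, "discount_amount": 5.00},
    {"order_id": 2, "gross_amount": 50.00, "refund_amount": 0.00, "discount_amount": 0.00},
]
expected = [(1, 85.0), (2, 50.0)]

passed, actual = run_unit_test(given, expected)
print("PASS" if passed else f"FAIL: {actual}")
```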
38
+ ## Mocking Multiple Inputs
39
+
40
+ ```yaml
41
+ unit_tests:
42
+   - name: test_customer_lifetime_value
43
+     model: dim_customers
44
+     given:
45
+       - input: ref('stg_customers')
46
+         rows:
47
+           - { customer_id: 1, name: "Alice" }
48
+       - input: ref('stg_orders')
49
+         rows:
50
+           - { order_id: 1, customer_id: 1, amount: 50.00 }
51
+           - { order_id: 2, customer_id: 1, amount: 75.00 }
52
+     expect:
53
+       rows:
54
+         - { customer_id: 1, lifetime_value: 125.00 }
55
+ ```
56
+
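The expected `lifetime_value` of 125.00 follows from summing the two mocked orders for customer 1. The sketch below reproduces that arithmetic with both mocked inputs loaded into an in-memory SQLite database. The join-and-sum query is an assumed shape for `dim_customers`, shown for illustration only; the actual model SQL is not in this diff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stg_customers (customer_id INTEGER, name TEXT);
    CREATE TABLE stg_orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO stg_customers VALUES (1, 'Alice');
    INSERT INTO stg_orders VALUES (1, 1, 50.00), (2, 1, 75.00);
    """
)

# Hypothetical dim_customers logic: one row per customer with summed order amounts.
actual = conn.execute(
    """
    SELECT c.customer_id, SUM(o.amount) AS lifetime_value
    FROM stg_customers c
    LEFT JOIN stg_orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id
    """
).fetchall()
conn.close()

print(actual)
```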
57
+ ## Testing Edge Cases
58
+
59
+ ```yaml
60
+ unit_tests:
61
+   - name: test_handles_null_discounts
62
+     model: fct_orders
63
+     given:
64
+       - input: ref('stg_orders')
65
+         rows:
66
+           - { order_id: 1, amount: 100.00, discount: null }
67
+     expect:
68
+       rows:
69
+         - { order_id: 1, net_amount: 100.00 }
70
+
71
+   - name: test_handles_zero_quantity
72
+     model: fct_orders
73
+     given:
74
+       - input: ref('stg_orders')
75
+         rows:
76
+           - { order_id: 1, quantity: 0, unit_price: 10.00 }
77
+     expect:
78
+       rows:
79
+         - { order_id: 1, order_total: 0.00 }
80
+ ```
81
+
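The null-discount case is worth testing because SQL arithmetic propagates NULL: subtracting a NULL discount yields NULL, not the original amount. A small sketch of the difference, using an in-memory SQLite table as a stand-in for `stg_orders` (the `COALESCE` guard is an assumed implementation of the model, not taken from the package):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, amount REAL, discount REAL)")
conn.execute("INSERT INTO stg_orders VALUES (1, 100.00, NULL)")

# Without a guard, the NULL discount makes the whole expression NULL.
naive = conn.execute("SELECT amount - discount FROM stg_orders").fetchone()[0]

# With COALESCE, a missing discount is treated as zero.
guarded = conn.execute(
    "SELECT amount - COALESCE(discount, 0) FROM stg_orders"
).fetchone()[0]
conn.close()

print(naive, guarded)
```

A model written the first way would fail `test_handles_null_discounts`; the second form would satisfy it.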
82
+ ## Overriding Macros and Vars
83
+
84
+ ```yaml
85
+ unit_tests:
86
+   - name: test_with_specific_date
87
+     model: fct_daily_metrics
88
+     overrides:
89
+       vars:
90
+         run_date: "2024-01-15"
91
+       macros:
92
+         # macros is a mapping of macro name to the value it should return
93
+         current_timestamp: "2024-01-15 00:00:00"
94
+     given:
95
+       - input: ref('stg_events')
96
+         rows:
97
+           - { event_id: 1, event_date: "2024-01-15" }
98
+     expect:
99
+       rows:
100
+         - { event_date: "2024-01-15", event_count: 1 }
101
+ ```
102
+
103
+ ## Running Unit Tests
104
+
105
+ ```bash
106
+ altimate-dbt test --model <name> # runs all tests including unit tests
107
+ altimate-dbt build --model <name> # build + test
108
+ ```
109
+
110
+ ## Test-Driven Development Pattern
111
+
112
+ 1. Write the unit test YAML first (expected inputs → outputs)
113
+ 2. Run it — it should fail (model doesn't exist yet or logic is missing)
114
+ 3. Write/fix the model SQL
115
+ 4. Run again — it should pass
116
+ 5. Add schema tests for data quality
117
+
118
+ ## Official Documentation
119
+
120
+ - **dbt unit tests**: https://docs.getdbt.com/docs/build/unit-tests
121
+ - **Unit test YAML spec**: https://docs.getdbt.com/reference/resource-properties/unit-tests
@@ -0,0 +1,187 @@
1
+ ---
2
+ name: dbt-troubleshoot
3
+ description: Debug dbt errors — compilation failures, runtime database errors, test failures, wrong data, and performance issues. Use when something is broken, producing wrong results, or failing to build. Powered by altimate-dbt.
4
+ ---
5
+
6
+ # dbt Troubleshooting
7
+
8
+ ## Requirements
9
+ **Agent:** any (read-only diagnosis), builder (if applying fixes)
10
+ **Tools used:** bash (runs `altimate-dbt` commands), read, glob, edit, altimate_core_semantics, altimate_core_column_lineage, altimate_core_correct, altimate_core_fix, sql_fix
11
+
12
+ ## When to Use This Skill
13
+
14
+ **Use when:**
15
+ - A dbt model fails to compile or build
16
+ - Tests are failing
17
+ - Model produces wrong or unexpected data
18
+ - Builds are slow or timing out
19
+ - User shares an error message from dbt
20
+
21
+ **Do NOT use for:**
22
+ - Creating new models → use `dbt-develop`
23
+ - Adding tests → use `dbt-test`
24
+ - Analyzing change impact → use `dbt-analyze`
25
+
26
+ ## Iron Rules
27
+
28
+ 1. **Never modify a test to make it pass without understanding why it's failing.**
29
+ 2. **Fix ALL errors, not just the reported one.** After fixing the specific issue, run a full project build (`altimate-dbt build-project`). If other models fail, even ones not mentioned in the error report, fix them too. Your job is to leave the project in a fully working state. Never dismiss errors as "pre-existing" or "out of scope".
30
+
31
+ ## Diagnostic Workflow
32
+
33
+ ### Step 1: Health Check
34
+
35
+ ```bash
36
+ altimate-dbt doctor
37
+ altimate-dbt info
38
+ ```
39
+
40
+ If `doctor` fails, fix the environment first. Common issues:
41
+ - Python not found → reinstall or set `--python-path`
42
+ - dbt-core not installed → `pip install dbt-core`
43
+ - No `dbt_project.yml` → wrong directory
44
+ - Missing packages → if `packages.yml` exists but `dbt_packages/` doesn't, run `altimate-dbt deps`
45
+
46
+ ### Step 2: Classify the Error
47
+
48
+ | Error Type | Symptom | Jump To |
49
+ |-----------|---------|---------|
50
+ | Compilation Error | Jinja/YAML parse failure | [references/compilation-errors.md](references/compilation-errors.md) |
51
+ | Runtime/Database Error | SQL execution failure | [references/runtime-errors.md](references/runtime-errors.md) |
52
+ | Test Failure | Tests return failing rows | [references/test-failures.md](references/test-failures.md) |
53
+ | Wrong Data | Model builds but data is incorrect | Step 3 below |
54
+
55
+ ### Step 3: Isolate the Problem
56
+
57
+ ```bash
58
+ # Compile only — catches Jinja errors without hitting the database
59
+ altimate-dbt compile --model <name>
60
+
61
+ # If compile succeeds, try building
62
+ altimate-dbt build --model <name>
63
+
64
+ # Probe the data directly
65
+ altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<name>') }}" --limit 1
66
+ altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 5
67
+ ```
68
+
69
+ ### Step 3b: Offline SQL Analysis
70
+
71
+ Analyze the compiled SQL offline, without querying the database:
72
+
73
+ ```bash
74
+ # Check for semantic issues (wrong joins, cartesian products, NULL comparisons)
75
+ altimate_core_semantics --sql <compiled_sql>
76
+
77
+ # Trace column lineage to find where wrong data originates
78
+ altimate_core_column_lineage --sql <compiled_sql>
79
+
80
+ # Auto-suggest fixes for SQL errors
81
+ altimate_core_correct --sql <compiled_sql>
82
+ ```
83
+
84
+ **Quick-fix tools** — use these when the error type is clear:
85
+
86
+ ```
87
+ # Schema-based fix: fuzzy-matches table/column names against schema to fix typos and wrong references
88
+ altimate_core_fix(sql: <compiled_sql>, schema_context: <schema_object>)
89
+
90
+ # Error-message fix: given a failing query + database error, analyzes root cause and proposes corrections
91
+ sql_fix(sql: <compiled_sql>, error_message: <error_message>, dialect: <dialect>)
92
+ ```
93
+
94
+ `altimate_core_fix` is best for compilation errors (wrong names, missing objects). `sql_fix` is best for runtime errors (the database told you what's wrong). Use `altimate_core_correct` for iterative multi-round correction when the first fix doesn't resolve the issue.
95
+
96
+
97
+ Common findings:
98
+ - **Wrong join type**: `INNER JOIN` dropping rows that should appear → switch to `LEFT JOIN`
99
+ - **Fan-out**: One-to-many join inflating row counts → add deduplication or aggregate
100
+ - **Column mismatch**: Output columns don't match schema.yml definition → reorder SELECT
101
+ - **NULL comparison**: Using `= NULL` instead of `IS NULL` → silent data loss
102
+
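The NULL-comparison finding is easy to miss because it raises no error. The sketch below, run against an in-memory SQLite table with hypothetical column names, shows that `= NULL` matches nothing while `IS NULL` finds the rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, shipped_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-15"), (2, None), (3, None)],
)

# `= NULL` evaluates to NULL (never true), so the filter silently drops every row.
eq_null = conn.execute(
    "SELECT count(*) FROM orders WHERE shipped_at = NULL"
).fetchone()[0]

# `IS NULL` is the correct predicate for missing values.
is_null = conn.execute(
    "SELECT count(*) FROM orders WHERE shipped_at IS NULL"
).fetchone()[0]
conn.close()

print(eq_null, is_null)
```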
103
+ ### Step 3c: Wrong Data Diagnosis — Deep Data Exploration
104
+
105
+ When a model builds but produces wrong results, the bug is almost always in the data assumptions, not the SQL syntax. **You must explore the actual data to find it.**
106
+
107
+ ```bash
108
+ # 1. Check the output for unexpected NULLs
109
+ altimate-dbt execute --query "SELECT count(*) as total, count(<col>) as non_null, count(*) - count(<col>) as nulls FROM {{ ref('<name>') }}" --limit 1
110
+
111
+ # 2. Check value ranges — are metrics within expected bounds?
112
+ altimate-dbt execute --query "SELECT min(<metric>), max(<metric>), avg(<metric>) FROM {{ ref('<name>') }}" --limit 1
113
+
114
+ # 3. Check distinct values for key columns — do they look right?
115
+ altimate-dbt execute --query "SELECT <col>, count(*) FROM {{ ref('<name>') }} GROUP BY 1 ORDER BY 2 DESC" --limit 20
116
+
117
+ # 4. Compare row counts between model output and parent tables
118
+ altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<parent>') }}" --limit 1
119
+ ```
120
+
121
+ **Common wrong-data root causes:**
122
+ - **Fan-out from joins**: If row count is higher than expected, a join key isn't unique — check with `SELECT key, count(*) ... GROUP BY 1 HAVING count(*) > 1`
123
+ - **Missing rows from INNER JOIN**: If row count is lower than expected, switch to LEFT JOIN and check for NULL join keys
124
+ - **Date spine issues**: If using `current_date` or `dbt_utils.date_spine`, output changes daily — check min/max dates
125
+
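A fan-out can be confirmed by comparing row counts before and after the join and then looking for duplicate join keys. The following sketch uses hypothetical `orders` and `customers` tables in an in-memory SQLite database; against a real project the same queries would go through `altimate-dbt execute`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, segment TEXT);
    INSERT INTO orders VALUES (1, 10, 50.0), (2, 11, 75.0);
    -- customer 10 appears twice, so the join key is not unique
    INSERT INTO customers VALUES (10, 'retail'), (10, 'wholesale'), (11, 'retail');
    """
)

# Row count of the parent table before joining.
parent_rows = conn.execute("SELECT count(*) FROM orders").fetchone()[0]

# Row count after the join: higher than the parent means fan-out.
joined_rows = conn.execute(
    "SELECT count(*) FROM orders o "
    "LEFT JOIN customers c ON o.customer_id = c.customer_id"
).fetchone()[0]

# Find the join keys responsible for the extra rows.
duplicate_keys = conn.execute(
    "SELECT customer_id, count(*) FROM customers "
    "GROUP BY customer_id HAVING count(*) > 1"
).fetchall()
conn.close()

print(parent_rows, joined_rows, duplicate_keys)
```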
126
+ ### Step 4: Check Upstream
127
+
128
+ Most errors cascade from upstream models:
129
+
130
+ ```bash
131
+ altimate-dbt parents --model <name>
132
+ ```
133
+
134
+ Read the parent models. Build them individually. **Query the parent data** — don't assume it's correct:
135
+ ```bash
136
+ altimate-dbt execute --query "SELECT count(*), count(DISTINCT <pk>) FROM {{ ref('<parent>') }}" --limit 1
137
+ altimate-dbt execute --query "SELECT * FROM {{ ref('<parent>') }}" --limit 5
138
+ ```
139
+
140
+ ### Step 5: Fix and Verify
141
+
142
+ After applying a fix:
143
+
144
+ ```bash
145
+ altimate-dbt build --model <name> --downstream
146
+ ```
147
+
148
+ Always build with `--downstream` to catch cascading impacts.
149
+
150
+ **Then verify the fix with data queries** — don't just trust the build:
151
+ ```bash
152
+ altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<name>') }}" --limit 1
153
+ altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 10
154
+ # Check the specific metric/column that was wrong:
155
+ altimate-dbt execute --query "SELECT min(<col>), max(<col>), count(*) - count(<col>) as nulls FROM {{ ref('<name>') }}" --limit 1
156
+ ```
157
+
158
+ ## Rationalizations to Resist
159
+
160
+ | You're Thinking... | Reality |
161
+ |--------------------|---------|
162
+ | "Just make the test pass" | The test is telling you something. Investigate first. |
163
+ | "Let me delete this test" | Ask WHY it exists before removing it. |
164
+ | "It works on my machine" | Check the adapter, Python version, and profile config. |
165
+ | "I'll fix it later" | Later never comes. Fix it now. |
166
+
167
+ ## Common Mistakes
168
+
169
+ | Mistake | Fix |
170
+ |---------|-----|
171
+ | Changing tests before understanding failures | Read the error. Query the data. Understand the root cause. |
172
+ | Fixing symptoms instead of root cause | Trace the problem upstream. The bug is often 2 models back. |
173
+ | Not checking upstream models | Run `altimate-dbt parents` and build parents individually |
174
+ | Ignoring warnings | Warnings often become errors. Fix them proactively. |
175
+ | Not running offline SQL analysis | Use `altimate_core_semantics` before building to catch join issues |
176
+ | Column names/order don't match schema | Use `altimate_core_column_lineage` to verify output columns match schema.yml |
177
+ | Not querying the actual data when debugging wrong results | Always run data exploration queries — check NULLs, value ranges, distinct values |
178
+ | Trusting build success as proof of correctness | Build only checks syntax and constraints — wrong values pass silently |
179
+
180
+ ## Reference Guides
181
+
182
+ | Guide | Use When |
183
+ |-------|----------|
184
+ | [references/altimate-dbt-commands.md](references/altimate-dbt-commands.md) | Need the full CLI reference |
185
+ | [references/compilation-errors.md](references/compilation-errors.md) | Jinja, YAML, or parse errors |
186
+ | [references/runtime-errors.md](references/runtime-errors.md) | Database execution errors |
187
+ | [references/test-failures.md](references/test-failures.md) | Understanding and fixing test failures |
@@ -0,0 +1,66 @@
1
+ # altimate-dbt Command Reference
2
+
3
+ All dbt operations use the `altimate-dbt` CLI. Output is JSON to stdout; logs go to stderr.
4
+
5
+ ```bash
6
+ altimate-dbt <command> [args...]
7
+ altimate-dbt <command> [args...] --format text # Human-readable output
8
+ ```
9
+
10
+ ## First-Time Setup
11
+
12
+ ```bash
13
+ altimate-dbt init # Auto-detect project root
14
+ altimate-dbt init --project-root /path # Explicit root
15
+ altimate-dbt init --python-path /path # Override Python
16
+ altimate-dbt doctor # Verify setup
17
+ altimate-dbt info # Project name, adapter, root
18
+ ```
19
+
20
+ ## Build & Run
21
+
22
+ ```bash
23
+ altimate-dbt build --model <name> [--downstream] # compile + run + test
24
+ altimate-dbt run --model <name> [--downstream] # materialize only
25
+ altimate-dbt test --model <name> # run tests only
26
+ altimate-dbt build-project # full project build
27
+ ```
28
+
29
+ ## Compile
30
+
31
+ ```bash
32
+ altimate-dbt compile --model <name>
33
+ altimate-dbt compile-query --query "SELECT * FROM {{ ref('stg_orders') }}" [--model <context>]
34
+ ```
35
+
36
+ ## Execute SQL
37
+
38
+ ```bash
39
+ altimate-dbt execute --query "SELECT count(*) FROM {{ ref('orders') }}" --limit 100
40
+ ```
41
+
42
+ ## Schema & DAG
43
+
44
+ ```bash
45
+ altimate-dbt columns --model <name> # column names and types
46
+ altimate-dbt columns-source --source <src> --table <tbl> # source table columns
47
+ altimate-dbt column-values --model <name> --column <col> # sample values
48
+ altimate-dbt children --model <name> # downstream models
49
+ altimate-dbt parents --model <name> # upstream models
50
+ ```
51
+
52
+ ## Packages
53
+
54
+ ```bash
55
+ altimate-dbt deps # install packages.yml
56
+ altimate-dbt add-packages --packages dbt-utils,dbt-expectations
57
+ ```
58
+
59
+ ## Error Handling
60
+
61
+ All errors return JSON with `error` and `fix` fields:
62
+ ```json
63
+ { "error": "dbt-core is not installed", "fix": "Install it: python3 -m pip install dbt-core" }
64
+ ```
65
+
66
+ Run `altimate-dbt doctor` as the first diagnostic step for any failure.
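Because errors arrive as JSON with `error` and `fix` fields, a calling script can surface the suggested fix directly. The helper below is a minimal sketch under that assumption; `parse_cli_error` is a hypothetical name, and the shape of successful output is not specified here, so anything without an `error` key is simply treated as "not an error".

```python
import json


def parse_cli_error(stdout_text):
    """Return (error, fix) if the output is a JSON error object, else None."""
    try:
        payload = json.loads(stdout_text)
    except json.JSONDecodeError:
        return None  # not JSON at all (for example, text-format output)
    if isinstance(payload, dict) and "error" in payload:
        return payload["error"], payload.get("fix")
    return None


# The sample error object shown in the reference above.
sample = (
    '{ "error": "dbt-core is not installed", '
    '"fix": "Install it: python3 -m pip install dbt-core" }'
)
result = parse_cli_error(sample)
if result is not None:
    error, fix = result
    print(f"error: {error}\nsuggested fix: {fix}")
```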
@@ -0,0 +1,57 @@
1
+ # Compilation Errors
2
+
3
+ Compilation errors happen before SQL hits the database. They're Jinja, YAML, or reference problems.
4
+
5
+ ## Diagnosis
6
+
7
+ ```bash
8
+ altimate-dbt compile --model <name>
9
+ ```
10
+
11
+ ## Common Compilation Errors
12
+
13
+ ### `Compilation Error: Model 'model.project.name' depends on a node named 'missing_model'`
14
+
15
+ **Cause**: `{{ ref('missing_model') }}` references a model that doesn't exist.
16
+
17
+ **Fix**:
18
+ 1. Check the spelling: `glob models/**/*missing_model*`
19
+ 2. Check if it's in a package: `glob dbt_packages/**/*missing_model*`
20
+ 3. If it should be a source: use `{{ source('src', 'table') }}` instead
21
+
22
+ ### `Compilation Error: 'source_name' is undefined`
23
+
24
+ **Cause**: Source not defined in any `sources.yml`.
25
+
26
+ **Fix**: Create or update `sources.yml` with the source definition.
27
+
28
+ ### `Parsing Error in YAML`
29
+
30
+ **Cause**: Invalid YAML syntax (bad indentation, missing colons, unquoted special characters).
31
+
32
+ **Fix**: Check indentation (must be spaces, not tabs). Ensure strings with special characters are quoted.
33
+
34
+ ### `Compilation Error: Jinja template not found`
35
+
36
+ **Cause**: Missing macro or wrong macro path.
37
+
38
+ **Fix**:
39
+ 1. Check `macros/` directory
40
+ 2. Check `dbt_packages/` for package macros
41
+ 3. Make sure the packages listed in `packages.yml` are installed: `altimate-dbt deps`
42
+
43
+ ### `dbt_utils is undefined`
44
+
45
+ **Cause**: Package not installed.
46
+
47
+ **Fix**:
48
+ ```bash
49
+ altimate-dbt deps
50
+ ```
51
+
52
+ ## General Approach
53
+
54
+ 1. Read the full error message — it usually tells you exactly which file and line
55
+ 2. Open that file and read the surrounding context
56
+ 3. Check for typos in `ref()` and `source()` calls
57
+ 4. Verify all packages are installed with `altimate-dbt deps`
@@ -0,0 +1,71 @@
1
+ # Runtime / Database Errors
2
+
3
+ Runtime errors happen when compiled SQL fails to execute against the database.
4
+
5
+ ## Diagnosis
6
+
7
+ ```bash
8
+ # First compile to rule out Jinja issues
9
+ altimate-dbt compile --model <name>
10
+
11
+ # Then try to build
12
+ altimate-dbt build --model <name>
13
+
14
+ # Probe the data directly
15
+ altimate-dbt execute --query "<diagnostic_sql>" --limit 10
16
+ ```
17
+
18
+ ## Common Runtime Errors
19
+
20
+ ### `Database Error: column "x" does not exist`
21
+
22
+ **Cause**: Model references a column that doesn't exist in the source/upstream model.
23
+
24
+ **Fix**:
25
+ ```bash
26
+ altimate-dbt columns --model <upstream_model> # check what columns actually exist
27
+ ```
28
+ Update the column name in the SQL.
29
+
30
+ ### `Database Error: relation "schema.table" does not exist`
31
+
32
+ **Cause**: The upstream model hasn't been built yet, or the schema doesn't exist.
33
+
34
+ **Fix**:
35
+ ```bash
36
+ altimate-dbt build --model <upstream_model> # build the dependency first
37
+ ```
38
+
39
+ ### `Database Error: division by zero`
40
+
41
+ **Cause**: Dividing by a column that contains zeros.
42
+
43
+ **Fix**: Add a `NULLIF(denominator, 0)` or `CASE WHEN denominator = 0 THEN NULL ELSE ...` guard.
44
+
45
+ ### `Database Error: ambiguous column reference`
46
+
47
+ **Cause**: Column name exists in multiple tables in a JOIN.
48
+
49
+ **Fix**: Qualify with table alias: `orders.customer_id` instead of `customer_id`.
50
+
51
+ ### `Database Error: type mismatch`
52
+
53
+ **Cause**: Comparing or operating on incompatible types (string vs integer, date vs timestamp).
54
+
55
+ **Fix**: Add explicit `CAST()` to align types.
56
+
57
+ ### `Timeout` or `Memory Exceeded`
58
+
59
+ **Cause**: Query is too expensive — full table scan, massive JOIN, or no partition pruning.
60
+
61
+ **Fix**:
62
+ 1. Check if model should be incremental
63
+ 2. Add `WHERE` filters to limit data
64
+ 3. Check JOIN keys — are they indexed/clustered?
65
+
66
+ ## General Approach
67
+
68
+ 1. Read the compiled SQL: `altimate-dbt compile --model <name>`
69
+ 2. Try running a simplified version of the query directly
70
+ 3. Check upstream columns: `altimate-dbt columns --model <upstream>`
71
+ 4. Add diagnostic queries to understand the data shape
@@ -0,0 +1,95 @@
1
+ # Test Failures
2
+
3
+ Test failures mean the data violates an expected constraint. The test is usually right — investigate before changing it.
4
+
5
+ ## Diagnosis
6
+
7
+ ```bash
8
+ altimate-dbt test --model <name>
9
+ ```
10
+
11
+ ## Common Test Failures
12
+
13
+ ### `unique` test fails
14
+
15
+ **Meaning**: Duplicate values exist in the column.
16
+
17
+ **Investigate**:
18
+ ```bash
19
+ altimate-dbt execute --query "
20
+ SELECT <column>, count(*) as cnt
21
+ FROM {{ ref('<model>') }}
22
+ GROUP BY 1
23
+ HAVING count(*) > 1
24
+ ORDER BY cnt DESC
25
+ " --limit 10
26
+ ```
27
+
28
+ **Common causes**:
29
+ - Missing deduplication in staging model
30
+ - Incorrect JOIN producing row multiplication (any join across a 1:many relationship)
31
+ - Incorrect `GROUP BY` (missing a dimension)
32
+
33
+ ### `not_null` test fails
34
+
35
+ **Meaning**: NULL values exist where they shouldn't.
36
+
37
+ **Investigate**:
38
+ ```bash
39
+ altimate-dbt execute --query "
40
+ SELECT * FROM {{ ref('<model>') }}
41
+ WHERE <column> IS NULL
42
+ " --limit 5
43
+ ```
44
+
45
+ **Common causes**:
46
+ - LEFT JOIN where INNER JOIN was intended (unmatched rows become NULL)
47
+ - Source data has genuine NULLs — may need `COALESCE()` or filter
48
+ - Wrong column referenced in the model SQL
49
+
50
+ ### `accepted_values` test fails
51
+
52
+ **Meaning**: Values exist that weren't in the expected list.
53
+
54
+ **Investigate**:
55
+ ```bash
56
+ altimate-dbt column-values --model <name> --column <column>
57
+ ```
58
+
59
+ **Common causes**:
60
+ - New value appeared in source data (update the accepted list)
61
+ - Data quality issue upstream (fix the source or add a filter)
62
+ - Test list is incomplete (add the missing values)
63
+
64
+ ### `relationships` test fails
65
+
66
+ **Meaning**: Foreign key references a value that doesn't exist in the parent table.
67
+
68
+ **Investigate**:
69
+ ```bash
70
+ altimate-dbt execute --query "
71
+ SELECT child.<fk_col>, count(*)
72
+ FROM {{ ref('<child>') }} child
73
+ LEFT JOIN {{ ref('<parent>') }} parent ON child.<fk_col> = parent.<pk_col>
74
+ WHERE parent.<pk_col> IS NULL
75
+ GROUP BY 1
76
+ " --limit 10
77
+ ```
78
+
79
+ **Common causes**:
80
+ - Parent table hasn't been rebuilt with latest data
81
+ - Orphan records in source data
82
+ - Type mismatch between FK and PK (e.g., string vs integer)
83
+
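The orphan-detection query for a failing `relationships` test is an anti-join: keep child rows whose foreign key finds no match in the parent. A runnable sketch with hypothetical `orders` (child) and `customers` (parent) tables in in-memory SQLite:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    -- customer 3 does not exist in the parent table
    INSERT INTO orders VALUES (100, 1), (101, 3), (102, 3);
    """
)

# Child rows with no matching parent come back with a NULL parent key.
orphans = conn.execute(
    """
    SELECT child.customer_id, count(*)
    FROM orders child
    LEFT JOIN customers parent ON child.customer_id = parent.customer_id
    WHERE parent.customer_id IS NULL
    GROUP BY child.customer_id
    """
).fetchall()
conn.close()

print(orphans)
```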
84
+ ## The Decision Framework
85
+
86
+ When a test fails:
87
+
88
+ 1. **Understand**: Query the failing rows. Why do they exist?
89
+ 2. **Classify**: Is it a data issue, a model logic bug, or a test definition problem?
90
+ 3. **Fix the right thing**:
91
+ - Data issue → fix upstream or add a filter/coalesce
92
+ - Logic bug → fix the model SQL
93
+ - Test is wrong → update the test (with explicit justification to the user)
94
+
95
+ **Never silently weaken a test.** If you need to change a test, explain why to the user.
@@ -0,0 +1,64 @@
1
+ ---
2
+ name: lineage-diff
3
+ description: Compare column-level lineage between two versions of a SQL query to show added, removed, and changed data flow edges.
4
+ ---
5
+
6
+ # Lineage Diff
7
+
8
+ ## Requirements
9
+ **Agent:** any (read-only analysis)
10
+ **Tools used:** lineage_check, read, bash (for git operations), glob
11
+
12
+ Compare column-level lineage between two versions of a SQL model to identify changes in data flow.
13
+
14
+ ## Workflow
15
+
16
+ 1. **Get the original SQL** — Either:
17
+ - Read the file from disk (only when it has no uncommitted changes, so it matches the committed version)
18
+ - Use `git show HEAD:path/to/file.sql` via `bash` to get the last committed version
19
+ - Accept the "before" SQL directly from the user
20
+
21
+ 2. **Get the modified SQL** — Either:
22
+ - Read the current (modified) file from disk
23
+ - Accept the "after" SQL directly from the user
24
+
25
+ 3. **Run lineage on both versions**:
26
+ - Call `lineage_check` with the original SQL
27
+ - Call `lineage_check` with the modified SQL
28
+
29
+ 4. **Compute the diff**:
30
+ - **Added edges**: Edges in the new lineage that don't exist in the old
31
+ - **Removed edges**: Edges in the old lineage that don't exist in the new
32
+ - **Unchanged edges**: Edges present in both
33
+
34
+ 5. **Report the diff** in a clear format:
35
+
36
+ ```
37
+ Lineage Diff: model_name
38
+ ═══════════════════════════
39
+
40
+ + ADDED (new data flow):
41
+ + source_table.new_column → target_table.output_column
42
+
43
+ - REMOVED (broken data flow):
44
+ - source_table.old_column → target_table.output_column
45
+
46
+ UNCHANGED: 5 edges
47
+
48
+ Impact: 1 new edge, 1 removed edge
49
+ ```
50
+
51
+ ## Usage
52
+
53
+ The user invokes this skill with an optional file path:
54
+ - `/lineage-diff models/marts/dim_customers.sql` — Compare current file against last git commit
55
+ - `/lineage-diff` — Compare staged changes in the current file
56
+
57
+ ## Edge Matching
58
+
59
+ Two edges are considered the same if all four fields match:
60
+ - `source_table` + `source_column` + `target_table` + `target_column`
61
+
62
+ The `transform` field is informational and not used for matching.
63
+
64
+ Use the tools: `lineage_check`, `read`, `bash` (for git operations), `glob`.
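The edge-matching rule amounts to a set difference over four-field keys. The sketch below assumes each lineage edge is available as a dict with `source_table`, `source_column`, `target_table`, `target_column`, and an optional `transform`; that dict shape and the names `EdgeKey`, `edge_key`, and `diff_lineage` are illustrative assumptions, not the documented output format of `lineage_check`.

```python
from typing import NamedTuple


class EdgeKey(NamedTuple):
    source_table: str
    source_column: str
    target_table: str
    target_column: str


def edge_key(edge):
    """Build the matching key; the `transform` field is deliberately ignored."""
    return EdgeKey(
        edge["source_table"],
        edge["source_column"],
        edge["target_table"],
        edge["target_column"],
    )


def diff_lineage(old_edges, new_edges):
    """Classify edges as added, removed, or unchanged between two versions."""
    old = {edge_key(e) for e in old_edges}
    new = {edge_key(e) for e in new_edges}
    return {
        "added": sorted(new - old),
        "removed": sorted(old - new),
        "unchanged": sorted(old & new),
    }


old_edges = [
    {"source_table": "src", "source_column": "old_column",
     "target_table": "tgt", "target_column": "output_column", "transform": "identity"},
    {"source_table": "src", "source_column": "id",
     "target_table": "tgt", "target_column": "id", "transform": "identity"},
]
new_edges = [
    {"source_table": "src", "source_column": "new_column",
     "target_table": "tgt", "target_column": "output_column", "transform": "identity"},
    # Same four fields as before; only the transform differs, so it counts as unchanged.
    {"source_table": "src", "source_column": "id",
     "target_table": "tgt", "target_column": "id", "transform": "cast"},
]

diff = diff_lineage(old_edges, new_edges)
print(len(diff["added"]), len(diff["removed"]), len(diff["unchanged"]))
```

Keying on a tuple keeps the comparison independent of edge order and makes the "transform is informational" rule explicit in one place.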