@altimateai/altimate-code 0.4.9 → 0.5.1
- package/CHANGELOG.md +36 -0
- package/package.json +54 -14
- package/postinstall.mjs +35 -0
- package/skills/cost-report/SKILL.md +134 -0
- package/skills/data-viz/SKILL.md +135 -0
- package/skills/data-viz/references/component-guide.md +394 -0
- package/skills/dbt-analyze/SKILL.md +130 -0
- package/skills/dbt-analyze/references/altimate-dbt-commands.md +66 -0
- package/skills/dbt-analyze/references/lineage-interpretation.md +58 -0
- package/skills/dbt-develop/SKILL.md +151 -0
- package/skills/dbt-develop/references/altimate-dbt-commands.md +66 -0
- package/skills/dbt-develop/references/common-mistakes.md +49 -0
- package/skills/dbt-develop/references/incremental-strategies.md +118 -0
- package/skills/dbt-develop/references/layer-patterns.md +158 -0
- package/skills/dbt-develop/references/medallion-architecture.md +125 -0
- package/skills/dbt-develop/references/yaml-generation.md +90 -0
- package/skills/dbt-docs/SKILL.md +99 -0
- package/skills/dbt-docs/references/altimate-dbt-commands.md +66 -0
- package/skills/dbt-docs/references/documentation-standards.md +94 -0
- package/skills/dbt-test/SKILL.md +121 -0
- package/skills/dbt-test/references/altimate-dbt-commands.md +66 -0
- package/skills/dbt-test/references/custom-tests.md +59 -0
- package/skills/dbt-test/references/schema-test-patterns.md +103 -0
- package/skills/dbt-test/references/unit-test-guide.md +121 -0
- package/skills/dbt-troubleshoot/SKILL.md +187 -0
- package/skills/dbt-troubleshoot/references/altimate-dbt-commands.md +66 -0
- package/skills/dbt-troubleshoot/references/compilation-errors.md +57 -0
- package/skills/dbt-troubleshoot/references/runtime-errors.md +71 -0
- package/skills/dbt-troubleshoot/references/test-failures.md +95 -0
- package/skills/lineage-diff/SKILL.md +64 -0
- package/skills/pii-audit/SKILL.md +117 -0
- package/skills/query-optimize/SKILL.md +86 -0
- package/skills/schema-migration/SKILL.md +119 -0
- package/skills/sql-review/SKILL.md +118 -0
- package/skills/sql-translate/SKILL.md +68 -0
- package/skills/teach/SKILL.md +54 -0
- package/skills/train/SKILL.md +51 -0
- package/skills/training-status/SKILL.md +45 -0
@@ -0,0 +1,121 @@
# dbt Unit Tests

Unit tests validate model logic by mocking inputs and asserting expected outputs. Available in dbt-core 1.8+.

## When to Unit Test

**DO unit test:**
- Complex calculations (revenue attribution, MRR changes, scoring)
- Business logic with edge cases (null handling, date boundaries, status transitions)
- Models with conditional logic (`CASE WHEN`, `IFF`, `COALESCE`)
- Aggregations where correctness is critical

**Do NOT unit test:**
- Simple staging models (just rename/cast)
- Pass-through models with no logic
- Built-in dbt functions

## Basic Structure

Unit tests live in schema.yml (or a dedicated `_unit_tests.yml` file):

```yaml
unit_tests:
  - name: test_net_revenue_calculation
    description: Verify net_revenue = gross - refunds - discounts
    model: fct_daily_revenue
    given:
      - input: ref('stg_orders')
        rows:
          - { order_id: 1, gross_amount: 100.00, refund_amount: 10.00, discount_amount: 5.00 }
          - { order_id: 2, gross_amount: 50.00, refund_amount: 0.00, discount_amount: 0.00 }
    expect:
      rows:
        - { order_id: 1, net_revenue: 85.00 }
        - { order_id: 2, net_revenue: 50.00 }
```
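
The expected rows in the test above can be checked by hand before running anything in dbt. The sketch below works through the formula from the test description (gross minus refunds minus discounts) in plain Python. It is illustrative only: `net_revenue` is a hypothetical helper, not part of dbt or of the `fct_daily_revenue` model, and `Decimal` is used so the arithmetic is exact.

```python
from decimal import Decimal

# Mocked stg_orders rows, copied from the `given` block of the unit test.
rows = [
    {"order_id": 1, "gross_amount": "100.00", "refund_amount": "10.00", "discount_amount": "5.00"},
    {"order_id": 2, "gross_amount": "50.00", "refund_amount": "0.00", "discount_amount": "0.00"},
]


def net_revenue(row):
    """Hypothetical helper: gross - refunds - discounts, as in the test description."""
    return (
        Decimal(row["gross_amount"])
        - Decimal(row["refund_amount"])
        - Decimal(row["discount_amount"])
    )


# Map each order_id to the value the `expect` block should contain.
expected = {row["order_id"]: net_revenue(row) for row in rows}
print(expected)
```

If the hand-computed values disagree with the `expect` rows, the test definition itself is wrong, which is worth ruling out before debugging the model.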

## Mocking Multiple Inputs

```yaml
unit_tests:
  - name: test_customer_lifetime_value
    model: dim_customers
    given:
      - input: ref('stg_customers')
        rows:
          - { customer_id: 1, name: "Alice" }
      - input: ref('stg_orders')
        rows:
          - { order_id: 1, customer_id: 1, amount: 50.00 }
          - { order_id: 2, customer_id: 1, amount: 75.00 }
    expect:
      rows:
        - { customer_id: 1, lifetime_value: 125.00 }
```

## Testing Edge Cases

```yaml
unit_tests:
  - name: test_handles_null_discounts
    model: fct_orders
    given:
      - input: ref('stg_orders')
        rows:
          - { order_id: 1, amount: 100.00, discount: null }
    expect:
      rows:
        - { order_id: 1, net_amount: 100.00 }

  - name: test_handles_zero_quantity
    model: fct_orders
    given:
      - input: ref('stg_orders')
        rows:
          - { order_id: 1, quantity: 0, unit_price: 10.00 }
    expect:
      rows:
        - { order_id: 1, order_total: 0.00 }
```
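
To see why the null-discount case deserves its own test, the sketch below reproduces it against an in-memory SQLite database. This is an assumption-laden illustration: the SQL of `fct_orders` is not shown in this guide, so the example simply assumes `net_amount` is computed as `amount - discount`, and SQLite stands in for the real warehouse.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_orders (order_id INTEGER, amount REAL, discount REAL)")
# Same mocked row as test_handles_null_discounts: the discount is NULL.
conn.execute("INSERT INTO stg_orders VALUES (1, 100.00, NULL)")

# Without a guard, NULL propagates through the subtraction.
naive = conn.execute(
    "SELECT amount - discount FROM stg_orders WHERE order_id = 1"
).fetchone()[0]

# With COALESCE, a missing discount is treated as zero.
guarded = conn.execute(
    "SELECT amount - COALESCE(discount, 0) FROM stg_orders WHERE order_id = 1"
).fetchone()[0]

print(naive, guarded)
conn.close()
```

Under these assumptions the unguarded expression yields NULL (`None` in Python) instead of the expected `100.00`, which is the kind of regression the unit test is meant to catch.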

## Overriding Macros and Vars

```yaml
unit_tests:
  - name: test_with_specific_date
    model: fct_daily_metrics
    overrides:
      vars:
        run_date: "2024-01-15"
      macros:
        current_timestamp: "2024-01-15 00:00:00"
    given:
      - input: ref('stg_events')
        rows:
          - { event_id: 1, event_date: "2024-01-15" }
    expect:
      rows:
        - { event_date: "2024-01-15", event_count: 1 }
```

## Running Unit Tests

```bash
altimate-dbt test --model <name>   # runs all tests including unit tests
altimate-dbt build --model <name>  # build + test
```

## Test-Driven Development Pattern

1. Write the unit test YAML first (expected inputs → outputs)
2. Run it — it should fail (model doesn't exist yet or logic is missing)
3. Write/fix the model SQL
4. Run again — it should pass
5. Add schema tests for data quality

## Official Documentation

- **dbt unit tests**: https://docs.getdbt.com/docs/build/unit-tests
- **Unit test YAML spec**: https://docs.getdbt.com/reference/resource-properties/unit-tests
@@ -0,0 +1,187 @@
---
name: dbt-troubleshoot
description: Debug dbt errors — compilation failures, runtime database errors, test failures, wrong data, and performance issues. Use when something is broken, producing wrong results, or failing to build. Powered by altimate-dbt.
---

# dbt Troubleshooting

## Requirements
**Agent:** any (read-only diagnosis), builder (if applying fixes)
**Tools used:** bash (runs `altimate-dbt` commands), read, glob, edit, altimate_core_semantics, altimate_core_column_lineage, altimate_core_correct, altimate_core_fix, sql_fix

## When to Use This Skill

**Use when:**
- A dbt model fails to compile or build
- Tests are failing
- Model produces wrong or unexpected data
- Builds are slow or timing out
- User shares an error message from dbt

**Do NOT use for:**
- Creating new models → use `dbt-develop`
- Adding tests → use `dbt-test`
- Analyzing change impact → use `dbt-analyze`

## Iron Rules

1. **Never modify a test to make it pass without understanding why it's failing.**
2. **Fix ALL errors, not just the reported one.** After fixing the specific issue, run a full `dbt build`. If other models fail — even ones not mentioned in the error report — fix them too. Your job is to leave the project in a fully working state. Never dismiss errors as "pre-existing" or "out of scope".

## Diagnostic Workflow

### Step 1: Health Check

```bash
altimate-dbt doctor
altimate-dbt info
```

If `doctor` fails, fix the environment first. Common issues:
- Python not found → reinstall or set `--python-path`
- dbt-core not installed → `pip install dbt-core`
- No `dbt_project.yml` → wrong directory
- Missing packages → if `packages.yml` exists but `dbt_packages/` doesn't, run `dbt deps`

### Step 2: Classify the Error

| Error Type | Symptom | Jump To |
|-----------|---------|---------|
| Compilation Error | Jinja/YAML parse failure | [references/compilation-errors.md](references/compilation-errors.md) |
| Runtime/Database Error | SQL execution failure | [references/runtime-errors.md](references/runtime-errors.md) |
| Test Failure | Tests return failing rows | [references/test-failures.md](references/test-failures.md) |
| Wrong Data | Model builds but data is incorrect | Step 3 below |

### Step 3: Isolate the Problem

```bash
# Compile only — catches Jinja errors without hitting the database
altimate-dbt compile --model <name>

# If compile succeeds, try building
altimate-dbt build --model <name>

# Probe the data directly
altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<name>') }}" --limit 1
altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 5
```

### Step 3b: Offline SQL Analysis

Before hitting the database, analyze the compiled SQL offline:

```bash
# Check for semantic issues (wrong joins, cartesian products, NULL comparisons)
altimate_core_semantics --sql <compiled_sql>

# Trace column lineage to find where wrong data originates
altimate_core_column_lineage --sql <compiled_sql>

# Auto-suggest fixes for SQL errors
altimate_core_correct --sql <compiled_sql>
```

**Quick-fix tools** — use these when the error type is clear:

```
# Schema-based fix: fuzzy-matches table/column names against schema to fix typos and wrong references
altimate_core_fix(sql: <compiled_sql>, schema_context: <schema_object>)

# Error-message fix: given a failing query + database error, analyzes root cause and proposes corrections
sql_fix(sql: <compiled_sql>, error_message: <error_message>, dialect: <dialect>)
```

`altimate_core_fix` is best for compilation errors (wrong names, missing objects). `sql_fix` is best for runtime errors (the database told you what's wrong). Use `altimate_core_correct` for iterative multi-round correction when the first fix doesn't resolve the issue.

Common findings:
- **Wrong join type**: `INNER JOIN` dropping rows that should appear → switch to `LEFT JOIN`
- **Fan-out**: One-to-many join inflating row counts → add deduplication or aggregate
- **Column mismatch**: Output columns don't match schema.yml definition → reorder SELECT
- **NULL comparison**: Using `= NULL` instead of `IS NULL` → silent data loss
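
The NULL-comparison finding is easy to reproduce in isolation. The sketch below uses an in-memory SQLite database as a stand-in for the warehouse; the `orders` table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, discount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 5.0), (2, None), (3, None)],
)

# `= NULL` evaluates to NULL (never true) for every row, so nothing matches.
equals_null = conn.execute(
    "SELECT count(*) FROM orders WHERE discount = NULL"
).fetchone()[0]

# `IS NULL` is the predicate that actually finds missing values.
is_null = conn.execute(
    "SELECT count(*) FROM orders WHERE discount IS NULL"
).fetchone()[0]

print(equals_null, is_null)
conn.close()
```

The first query runs without any error and simply returns no rows, which is why this mistake tends to surface as wrong data rather than a failed build.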

### Step 3c: Wrong Data Diagnosis — Deep Data Exploration

When a model builds but produces wrong results, the bug is almost always in the data assumptions, not the SQL syntax. **You must explore the actual data to find it.**

```bash
# 1. Check the output for unexpected NULLs
altimate-dbt execute --query "SELECT count(*) as total, count(<col>) as non_null, count(*) - count(<col>) as nulls FROM {{ ref('<name>') }}" --limit 1

# 2. Check value ranges — are metrics within expected bounds?
altimate-dbt execute --query "SELECT min(<metric>), max(<metric>), avg(<metric>) FROM {{ ref('<name>') }}" --limit 1

# 3. Check distinct values for key columns — do they look right?
altimate-dbt execute --query "SELECT <col>, count(*) FROM {{ ref('<name>') }} GROUP BY 1 ORDER BY 2 DESC" --limit 20

# 4. Compare row counts between model output and parent tables
altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<parent>') }}" --limit 1
```

**Common wrong-data root causes:**
- **Fan-out from joins**: If row count is higher than expected, a join key isn't unique — check with `SELECT key, count(*) ... GROUP BY 1 HAVING count(*) > 1`
- **Missing rows from INNER JOIN**: If row count is lower than expected, switch to LEFT JOIN and check for NULL join keys
- **Date spine issues**: If using `current_date` or `dbt_utils.date_spine`, output changes daily — check min/max dates
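
The fan-out case from the list above can be demonstrated end to end on toy data. The sketch below is only an illustration: SQLite stands in for the warehouse, and the `orders` and `customers` tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
    CREATE TABLE customers (customer_id INTEGER, segment TEXT);
    INSERT INTO orders VALUES (1, 10, 50.0), (2, 11, 75.0);
    -- customer 10 appears twice, so the join key is not unique
    INSERT INTO customers VALUES (10, 'smb'), (10, 'enterprise'), (11, 'smb');
""")

# Row count of the parent table versus the joined output.
parent_rows = conn.execute("SELECT count(*) FROM orders").fetchone()[0]
joined_rows = conn.execute(
    "SELECT count(*) FROM orders o JOIN customers c ON o.customer_id = c.customer_id"
).fetchone()[0]

# The uniqueness check from the fan-out bullet, applied to the join key.
duplicate_keys = conn.execute(
    "SELECT customer_id, count(*) FROM customers "
    "GROUP BY customer_id HAVING count(*) > 1"
).fetchall()

print(parent_rows, joined_rows, duplicate_keys)
conn.close()
```

A joined row count above the parent row count, together with a non-empty duplicate-key result, points at the table that needs deduplication before the join.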

### Step 4: Check Upstream

Most errors cascade from upstream models:

```bash
altimate-dbt parents --model <name>
```

Read the parent models. Build them individually. **Query the parent data** — don't assume it's correct:
```bash
altimate-dbt execute --query "SELECT count(*), count(DISTINCT <pk>) FROM {{ ref('<parent>') }}" --limit 1
altimate-dbt execute --query "SELECT * FROM {{ ref('<parent>') }}" --limit 5
```

### Step 5: Fix and Verify

After applying a fix:

```bash
altimate-dbt build --model <name> --downstream
```

Always build with `--downstream` to catch cascading impacts.

**Then verify the fix with data queries** — don't just trust the build:
```bash
altimate-dbt execute --query "SELECT count(*) FROM {{ ref('<name>') }}" --limit 1
altimate-dbt execute --query "SELECT * FROM {{ ref('<name>') }}" --limit 10
# Check the specific metric/column that was wrong:
altimate-dbt execute --query "SELECT min(<col>), max(<col>), count(*) - count(<col>) as nulls FROM {{ ref('<name>') }}" --limit 1
```

## Rationalizations to Resist

| You're Thinking... | Reality |
|--------------------|---------|
| "Just make the test pass" | The test is telling you something. Investigate first. |
| "Let me delete this test" | Ask WHY it exists before removing it. |
| "It works on my machine" | Check the adapter, Python version, and profile config. |
| "I'll fix it later" | Later never comes. Fix it now. |

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| Changing tests before understanding failures | Read the error. Query the data. Understand the root cause. |
| Fixing symptoms instead of root cause | Trace the problem upstream. The bug is often 2 models back. |
| Not checking upstream models | Run `altimate-dbt parents` and build parents individually |
| Ignoring warnings | Warnings often become errors. Fix them proactively. |
| Not running offline SQL analysis | Use `altimate_core_semantics` before building to catch join issues |
| Column names/order don't match schema | Use `altimate_core_column_lineage` to verify output columns match schema.yml |
| Not querying the actual data when debugging wrong results | Always run data exploration queries — check NULLs, value ranges, distinct values |
| Trusting build success as proof of correctness | Build only checks syntax and constraints — wrong values pass silently |

## Reference Guides

| Guide | Use When |
|-------|----------|
| [references/altimate-dbt-commands.md](references/altimate-dbt-commands.md) | Need the full CLI reference |
| [references/compilation-errors.md](references/compilation-errors.md) | Jinja, YAML, or parse errors |
| [references/runtime-errors.md](references/runtime-errors.md) | Database execution errors |
| [references/test-failures.md](references/test-failures.md) | Understanding and fixing test failures |
@@ -0,0 +1,66 @@
# altimate-dbt Command Reference

All dbt operations use the `altimate-dbt` CLI. Output is JSON to stdout; logs go to stderr.

```bash
altimate-dbt <command> [args...]
altimate-dbt <command> [args...] --format text  # Human-readable output
```

## First-Time Setup

```bash
altimate-dbt init                       # Auto-detect project root
altimate-dbt init --project-root /path  # Explicit root
altimate-dbt init --python-path /path   # Override Python
altimate-dbt doctor                     # Verify setup
altimate-dbt info                       # Project name, adapter, root
```

## Build & Run

```bash
altimate-dbt build --model <name> [--downstream]  # compile + run + test
altimate-dbt run --model <name> [--downstream]    # materialize only
altimate-dbt test --model <name>                  # run tests only
altimate-dbt build-project                        # full project build
```

## Compile

```bash
altimate-dbt compile --model <name>
altimate-dbt compile-query --query "SELECT * FROM {{ ref('stg_orders') }}" [--model <context>]
```

## Execute SQL

```bash
altimate-dbt execute --query "SELECT count(*) FROM {{ ref('orders') }}" --limit 100
```

## Schema & DAG

```bash
altimate-dbt columns --model <name>                       # column names and types
altimate-dbt columns-source --source <src> --table <tbl>  # source table columns
altimate-dbt column-values --model <name> --column <col>  # sample values
altimate-dbt children --model <name>                      # downstream models
altimate-dbt parents --model <name>                       # upstream models
```

## Packages

```bash
altimate-dbt deps                                               # install packages.yml
altimate-dbt add-packages --packages dbt-utils,dbt-expectations
```

## Error Handling

All errors return JSON with `error` and `fix` fields:
```json
{ "error": "dbt-core is not installed", "fix": "Install it: python3 -m pip install dbt-core" }
```
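
Because errors arrive as JSON on stdout, a calling script can branch on them directly. The sketch below shows one way that might look. It assumes only the payload shape shown above (an object with `error` and `fix` keys); `parse_cli_error` is a hypothetical helper, not something shipped with `altimate-dbt`.

```python
import json


def parse_cli_error(stdout: str):
    """Return (error, fix) if stdout is an error payload, else None.

    Assumes the documented shape: a JSON object with `error` and `fix` keys.
    """
    try:
        payload = json.loads(stdout)
    except json.JSONDecodeError:
        return None  # not JSON at all (for example, text-format output)
    if isinstance(payload, dict) and "error" in payload:
        return payload["error"], payload.get("fix")
    return None  # valid JSON, but not an error payload


# The sample payload from the reference above.
sample = '{ "error": "dbt-core is not installed", "fix": "Install it: python3 -m pip install dbt-core" }'
result = parse_cli_error(sample)
print(result)
```

Returning `None` for anything that is not an error object keeps the success path untouched, since the shape of successful output is not specified here.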

Run `altimate-dbt doctor` as the first diagnostic step for any failure.
@@ -0,0 +1,57 @@
# Compilation Errors

Compilation errors happen before SQL hits the database. They're Jinja, YAML, or reference problems.

## Diagnosis

```bash
altimate-dbt compile --model <name>
```

## Common Compilation Errors

### `Compilation Error: Model 'model.project.name' depends on a node named 'missing_model'`

**Cause**: `{{ ref('missing_model') }}` references a model that doesn't exist.

**Fix**:
1. Check the spelling: `glob models/**/*missing_model*`
2. Check if it's in a package: `glob dbt_packages/**/*missing_model*`
3. If it should be a source: use `{{ source('src', 'table') }}` instead

### `Compilation Error: 'source_name' is undefined`

**Cause**: Source not defined in any `sources.yml`.

**Fix**: Create or update `sources.yml` with the source definition.

### `Parsing Error in YAML`

**Cause**: Invalid YAML syntax (bad indentation, missing colons, unquoted special characters).

**Fix**: Check indentation (must be spaces, not tabs). Ensure strings with special characters are quoted.
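
Tab indentation is hard to spot by eye, so a small scan can help narrow down a YAML parsing error. The following is a minimal sketch, not a YAML validator: `find_tab_indents` is a hypothetical helper that only reports lines whose leading whitespace contains a tab character.

```python
def find_tab_indents(yaml_text: str):
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(yaml_text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


# Illustrative input: the third line is indented with a tab.
sample = "models:\n  - name: stg_orders\n\t  description: tab-indented line\n"
print(find_tab_indents(sample))
```

In practice the text would come from reading the schema file named in the error message; other causes listed above, such as missing colons or unquoted special characters, still need a manual look.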

### `Compilation Error: Jinja template not found`

**Cause**: Missing macro or wrong macro path.

**Fix**:
1. Check `macros/` directory
2. Check `dbt_packages/` for package macros
3. Verify `packages.yml` is installed: `altimate-dbt deps`

### `dbt_utils is undefined`

**Cause**: Package not installed.

**Fix**:
```bash
altimate-dbt deps
```

## General Approach

1. Read the full error message — it usually tells you exactly which file and line
2. Open that file and read the surrounding context
3. Check for typos in `ref()` and `source()` calls
4. Verify all packages are installed with `altimate-dbt deps`
@@ -0,0 +1,71 @@
# Runtime / Database Errors

Runtime errors happen when compiled SQL fails to execute against the database.

## Diagnosis

```bash
# First compile to rule out Jinja issues
altimate-dbt compile --model <name>

# Then try to build
altimate-dbt build --model <name>

# Probe the data directly
altimate-dbt execute --query "<diagnostic_sql>" --limit 10
```

## Common Runtime Errors

### `Database Error: column "x" does not exist`

**Cause**: Model references a column that doesn't exist in the source/upstream model.

**Fix**:
```bash
altimate-dbt columns --model <upstream_model>  # check what columns actually exist
```
Update the column name in the SQL.

### `Database Error: relation "schema.table" does not exist`

**Cause**: The upstream model hasn't been built yet, or the schema doesn't exist.

**Fix**:
```bash
altimate-dbt build --model <upstream_model>  # build the dependency first
```

### `Database Error: division by zero`

**Cause**: Dividing by a column that contains zeros.

**Fix**: Add a `NULLIF(denominator, 0)` or `CASE WHEN denominator = 0 THEN NULL ELSE ...` guard.

### `Database Error: ambiguous column reference`

**Cause**: Column name exists in multiple tables in a JOIN.

**Fix**: Qualify with table alias: `orders.customer_id` instead of `customer_id`.
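
The ambiguous-column error and its fix can be reproduced on a toy schema. The sketch below uses in-memory SQLite as a stand-in; the exact error wording differs between databases, and the `orders` and `customers` tables here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    CREATE TABLE customers (customer_id INTEGER, name TEXT);
    INSERT INTO orders VALUES (1, 10);
    INSERT INTO customers VALUES (10, 'Alice');
""")

join_sql = "FROM orders JOIN customers ON orders.customer_id = customers.customer_id"

# Unqualified: both tables expose customer_id, so the database rejects the query.
try:
    conn.execute(f"SELECT customer_id {join_sql}")
    error_text = None
except sqlite3.OperationalError as exc:
    error_text = str(exc)

# Qualified with the table name: the reference is unambiguous.
qualified = conn.execute(f"SELECT orders.customer_id {join_sql}").fetchall()

print(error_text, qualified)
conn.close()
```

Qualifying every column in a multi-table query, not just the one named in the error, avoids hitting the same error again when another shared column is added upstream.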

### `Database Error: type mismatch`

**Cause**: Comparing or operating on incompatible types (string vs integer, date vs timestamp).

**Fix**: Add explicit `CAST()` to align types.

### `Timeout` or `Memory Exceeded`

**Cause**: Query is too expensive — full table scan, massive JOIN, or no partition pruning.

**Fix**:
1. Check if model should be incremental
2. Add `WHERE` filters to limit data
3. Check JOIN keys — are they indexed/clustered?

## General Approach

1. Read the compiled SQL: `altimate-dbt compile --model <name>`
2. Try running a simplified version of the query directly
3. Check upstream columns: `altimate-dbt columns --model <upstream>`
4. Add diagnostic queries to understand the data shape
@@ -0,0 +1,95 @@
# Test Failures

Test failures mean the data violates an expected constraint. The test is usually right — investigate before changing it.

## Diagnosis

```bash
altimate-dbt test --model <name>
```

## Common Test Failures

### `unique` test fails

**Meaning**: Duplicate values exist in the column.

**Investigate**:
```bash
altimate-dbt execute --query "
  SELECT <column>, count(*) as cnt
  FROM {{ ref('<model>') }}
  GROUP BY 1
  HAVING count(*) > 1
  ORDER BY cnt DESC
" --limit 10
```

**Common causes**:
- Missing deduplication in staging model
- Incorrect JOIN producing row multiplication (LEFT JOIN with 1:many relationship)
- Incorrect `GROUP BY` (missing a dimension)

### `not_null` test fails

**Meaning**: NULL values exist where they shouldn't.

**Investigate**:
```bash
altimate-dbt execute --query "
  SELECT * FROM {{ ref('<model>') }}
  WHERE <column> IS NULL
" --limit 5
```

**Common causes**:
- LEFT JOIN where INNER JOIN was intended (unmatched rows become NULL)
- Source data has genuine NULLs — may need `COALESCE()` or filter
- Wrong column referenced in the model SQL

### `accepted_values` test fails

**Meaning**: Values exist that weren't in the expected list.

**Investigate**:
```bash
altimate-dbt column-values --model <name> --column <column>
```

**Common causes**:
- New value appeared in source data (update the accepted list)
- Data quality issue upstream (fix the source or add a filter)
- Test list is incomplete (add the missing values)

### `relationships` test fails

**Meaning**: Foreign key references a value that doesn't exist in the parent table.

**Investigate**:
```bash
altimate-dbt execute --query "
  SELECT child.<fk_col>, count(*)
  FROM {{ ref('<child>') }} child
  LEFT JOIN {{ ref('<parent>') }} parent ON child.<fk_col> = parent.<pk_col>
  WHERE parent.<pk_col> IS NULL
  GROUP BY 1
" --limit 10
```

**Common causes**:
- Parent table hasn't been rebuilt with latest data
- Orphan records in source data
- Type mismatch between FK and PK (e.g., string vs integer)
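
The orphan-detection query for `relationships` failures is an anti-join, and it can be tried out on toy data first. The sketch below runs the same pattern against in-memory SQLite; the `customers` and `orders` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2);
    -- customer_id 99 has no matching parent row
    INSERT INTO orders VALUES (100, 1), (101, 99), (102, 99);
""")

# LEFT JOIN keeps every child row; a NULL parent key marks an orphan.
orphans = conn.execute("""
    SELECT child.customer_id, count(*)
    FROM orders child
    LEFT JOIN customers parent ON child.customer_id = parent.customer_id
    WHERE parent.customer_id IS NULL
    GROUP BY child.customer_id
""").fetchall()

print(orphans)
conn.close()
```

Each returned pair is a foreign-key value with no parent and the number of child rows that carry it, which helps decide whether the cause is a stale parent table or genuinely orphaned source data.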

## The Decision Framework

When a test fails:

1. **Understand**: Query the failing rows. Why do they exist?
2. **Classify**: Is it a data issue, a model logic bug, or a test definition problem?
3. **Fix the right thing**:
   - Data issue → fix upstream or add a filter/coalesce
   - Logic bug → fix the model SQL
   - Test is wrong → update the test (with explicit justification to the user)

**Never silently weaken a test.** If you need to change a test, explain why to the user.
@@ -0,0 +1,64 @@
---
name: lineage-diff
description: Compare column-level lineage between two versions of a SQL query to show added, removed, and changed data flow edges.
---

# Lineage Diff

## Requirements
**Agent:** any (read-only analysis)
**Tools used:** lineage_check, read, bash (for git operations), glob

Compare column-level lineage between two versions of a SQL model to identify changes in data flow.

## Workflow

1. **Get the original SQL** — Either:
   - Read the file from disk (current committed version)
   - Use `git show HEAD:path/to/file.sql` via `bash` to get the last committed version
   - Accept the "before" SQL directly from the user

2. **Get the modified SQL** — Either:
   - Read the current (modified) file from disk
   - Accept the "after" SQL directly from the user

3. **Run lineage on both versions**:
   - Call `lineage_check` with the original SQL
   - Call `lineage_check` with the modified SQL

4. **Compute the diff**:
   - **Added edges**: Edges in the new lineage that don't exist in the old
   - **Removed edges**: Edges in the old lineage that don't exist in the new
   - **Unchanged edges**: Edges present in both

5. **Report the diff** in a clear format:

```
Lineage Diff: model_name
═══════════════════════════

+ ADDED (new data flow):
+ source_table.new_column → target_table.output_column

- REMOVED (broken data flow):
- source_table.old_column → target_table.output_column

UNCHANGED: 5 edges

Impact: 1 new edge, 1 removed edge
```

## Usage

The user invokes this skill with a file path:
- `/lineage-diff models/marts/dim_customers.sql` — Compare current file against last git commit
- `/lineage-diff` — Compare staged changes in the current file

## Edge Matching

Two edges are considered the same if all four fields match:
- `source_table` + `source_column` + `target_table` + `target_column`

The `transform` field is informational and not used for matching.
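
The matching rule above amounts to a set difference over four-field keys. The sketch below illustrates that idea under stated assumptions: the real output shape of `lineage_check` is not shown in this document, so the `Edge` type, `edge_key`, `diff_edges`, and the sample edges are all hypothetical.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    """Assumed edge shape; field names follow the Edge Matching section."""
    source_table: str
    source_column: str
    target_table: str
    target_column: str
    transform: str = ""  # informational only, ignored when matching


def edge_key(edge: Edge):
    """Identity of an edge: the four matching fields, without `transform`."""
    return edge[:4]


def diff_edges(old, new):
    """Classify edges as added, removed, or unchanged by their four-field key."""
    old_keys = {edge_key(e) for e in old}
    new_keys = {edge_key(e) for e in new}
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "unchanged": sorted(old_keys & new_keys),
    }


old = [
    Edge("stg_orders", "amount", "dim_customers", "lifetime_value", "SUM"),
    Edge("stg_customers", "name", "dim_customers", "customer_name"),
]
new = [
    Edge("stg_orders", "net_amount", "dim_customers", "lifetime_value", "SUM"),
    Edge("stg_customers", "name", "dim_customers", "customer_name", "UPPER"),
]
result = diff_edges(old, new)
print(len(result["added"]), len(result["removed"]), len(result["unchanged"]))
```

In this example the `name` edge counts as unchanged even though its `transform` differs, because only the four key fields take part in matching.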

Use the tools: `lineage_check`, `read`, `bash` (for git operations), `glob`.