@revos/cli 0.1.2 → 0.1.4
- package/README.md +17 -13
- package/dist/adapters/oclif/commands/auth/login.mjs +2 -2
- package/dist/adapters/oclif/commands/auth/logout.mjs +2 -2
- package/dist/adapters/oclif/commands/auth/status.mjs +2 -2
- package/dist/adapters/oclif/commands/init.mjs +2 -2
- package/dist/adapters/oclif/commands/org/current.mjs +2 -2
- package/dist/adapters/oclif/commands/org/list.mjs +2 -2
- package/dist/adapters/oclif/commands/org/switch.mjs +2 -2
- package/dist/adapters/oclif/commands/overlays/diff.d.mts +1 -1
- package/dist/adapters/oclif/commands/overlays/diff.mjs +3 -3
- package/dist/adapters/oclif/commands/overlays/pull.d.mts +1 -1
- package/dist/adapters/oclif/commands/overlays/pull.mjs +3 -3
- package/dist/adapters/oclif/commands/overlays/push.d.mts +1 -1
- package/dist/adapters/oclif/commands/overlays/push.mjs +3 -3
- package/dist/adapters/oclif/commands/overlays/status.d.mts +1 -1
- package/dist/adapters/oclif/commands/overlays/status.mjs +3 -3
- package/dist/{base.command-DDSLyx5v.mjs → base.command-DlVQ9Cqa.mjs} +1 -1
- package/dist/{core-EJgxP-x5.mjs → core-gKJ_V-K5.mjs} +43 -18
- package/dist/{index-DH6vy050.d.mts → index-B8n2GxTc.d.mts} +1 -1
- package/dist/index.d.mts +3 -3
- package/dist/index.mjs +1 -1
- package/dist/templates/AGENTS.md +1 -1
- package/dist/templates/dbt/profiles.yml +12 -0
- package/dist/templates/gitignore +19 -0
- package/dist/templates/skills/create-dbt-transformations/SKILL.md +214 -0
- package/dist/templates/skills/create-dbt-transformations/references/edge-cases.md +46 -0
- package/dist/templates/skills/create-dbt-transformations/references/schema-conventions.md +128 -0
- package/dist/templates/skills/create-dbt-transformations/references/sql-templates.md +73 -0
- package/dist/templates/skills/create-semantic-model/SKILL.md +126 -1432
- package/dist/templates/skills/create-semantic-model/references/cube-examples.md +267 -0
- package/dist/templates/skills/create-semantic-model/references/key-patterns.md +150 -0
- package/dist/templates/skills/create-semantic-model/references/validation-queries.md +209 -0
- package/dist/templates/skills/explore-lakehouse/SKILL.md +8 -1
- package/dist/{types-DZssnweO.d.mts → types-DmuJzN0Z.d.mts} +5 -1
- package/package.json +2 -1
@@ -0,0 +1,214 @@
---
name: create-dbt-transformations
description: Create new dbt transformations (bronze/silver/gold models) in the RevOS dbt project. Use when asked to create a dbt model, build a transformation, add a new layer model, declare a raw source, or register a new Airbyte-ingested table. Covers dbt project conventions, sources, materialization, schema.yml, and validation commands.
---

# Create dbt Transformations

Use this skill to generate SQL models, declare sources, update `schema.yml`, and validate models with `revos dbt run` / `revos dbt test`.

For BigQuery exploration (listing datasets, inspecting raw tables, previewing rows, null rates), load the `explore-lakehouse` skill. If that skill is not installed, fall back to:

```bash
bq show --format=prettyjson $REVOS_BQ_DATASET.<table>
```

Warn the user: "The `explore-lakehouse` skill is not installed — using `bq show` as a fallback. Install it for richer schema exploration."

---

# Part 1: Knowledge Base

## Layer Conventions

- **gold** — business-ready models exposed for reporting or downstream consumption.
- **silver** — cleaned, deduplicated, type-conformed intermediates.
- **bronze** — thin views over raw source data. References sources via `{{ source() }}`.

When layer is not obvious from context, ask (see Checkpoint 1).

## Sources (bronze layer)

Raw tables ingested by Airbyte are not dbt models. Declare them as dbt sources so bronze models can reference them with `{{ source() }}`.

Sources are declared in `dbt/models/bronze/schema.yml` under a `sources:` block using `schema` (the BigQuery dataset):

```yaml
sources:
  - name: raw
    schema: "{{ env_var('REVOS_BQ_DATASET') }}"
    tables:
      - name: hubspot_contacts
```

Reference in bronze SQL:

```sql
SELECT * FROM {{ source('raw', 'hubspot_contacts') }}
```

See [schema-conventions.md](references/schema-conventions.md) for the full declaration pattern alongside `models:`.

## Materialization

Inherited globally from `dbt_project.yml` — do not add `{{ config(materialized=...) }}` unless the user explicitly asks to override.

## `schema.yml` Convention

One shared file per layer at `dbt/models/<layer>/schema.yml`. Append new models; never create per-model YAML files. See [schema-conventions.md](references/schema-conventions.md) for full examples and composite-PK / dbt-utils patterns.

## Resolving Physical BigQuery Tables

Materialized table lives at: `$REVOS_BQ_DATASET.<model_name>`

**When to use `{{ ref() }}` vs. `{{ source() }}`:**

| Context                             | Use                              |
| ----------------------------------- | -------------------------------- |
| dbt SQL → other dbt model           | `{{ ref('<model>') }}`           |
| dbt SQL → raw source table (bronze) | `{{ source('raw', '<table>') }}` |

Always declare raw tables as sources before referencing them. Do not use bare fully qualified names — that bypasses dbt's dependency graph and source freshness tracking.

## Standard dbt Commands

Always use the `revos` wrapper:

```bash
revos dbt parse                                # validate syntax (no warehouse)
revos dbt compile --select <model>             # resolve refs, produce compiled SQL
revos dbt run --select <model>                 # execute against warehouse
revos dbt test --select <model>                # run tests
revos dbt build --select <model>               # run + test
revos dbt build --select path:models/<layer>   # entire layer
```

---

# Part 2: Workflow — Create a New dbt Transformation

## Execution Order

For each transformation (one at a time — do not batch):

1. Determine the target layer (Checkpoint 1 if unclear).
2. Determine the model name.
3. Check if that model already exists (Checkpoint 2 if yes).
4. Gather source data and transformation logic. For bridge models, apply the bridge template ([sql-templates.md](references/sql-templates.md)).
5. For bronze models: check if required sources are declared in `dbt/models/bronze/schema.yml`; add them if missing.
6. Generate `dbt/models/<layer>/<model_name>.sql`.
7. Detect the primary key (Checkpoint 3 if ambiguous).
8. Add model entry to `dbt/models/<layer>/schema.yml` with PK and FK tests. See [schema-conventions.md](references/schema-conventions.md).
9. Run `revos dbt run --select <model_name>` and report result.
10. Run `revos dbt test --select <model_name>` and report result.
11. Summarize (see Final Response Format).

For multiple transformations in one request: repeat steps 1–11 per model in order.

---

## Mandatory User Checkpoints

### Checkpoint 1: Layer Selection

Ask if the layer is not obvious:

```text
Which layer should this transformation live in?

- gold: business-ready, exposed for reporting or downstream consumption
- silver: cleaned/intermediate, shared across downstream uses
- bronze: close-to-source view over raw data, references sources
```

Layer is obvious when the user explicitly names it.

### Checkpoint 2: Existing Model Conflict

If `dbt/models/<layer>/<model_name>.sql` exists:

```text
A model named <model_name> already exists at dbt/models/<layer>/<model_name>.sql.

Options:
- overwrite: replace with new transformation
- edit: modify existing (describe the change)
- rename: use a different name
```

If found in a different layer, mention it too.
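
The existence check behind Checkpoint 2 can be sketched in Python. This is a minimal illustration, not part of the skill: the function name `find_existing_models` and the `LAYERS` tuple are hypothetical, and the sketch assumes the `dbt/models/<layer>/<model_name>.sql` layout described above. It scans every layer so that a match in a different layer can also be mentioned.

```python
from pathlib import Path
from typing import List

# Layer directory names taken from the skill text (bronze/silver/gold).
LAYERS = ("bronze", "silver", "gold")


def find_existing_models(project_root: str, model_name: str) -> List[Path]:
    """Return every dbt/models/<layer>/<model_name>.sql that already exists."""
    root = Path(project_root)
    hits = []
    for layer in LAYERS:
        candidate = root / "dbt" / "models" / layer / f"{model_name}.sql"
        if candidate.is_file():
            hits.append(candidate)
    return hits
```

An empty result means no conflict. A result whose parent directory differs from the target layer corresponds to the "found in a different layer" case.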

### Checkpoint 3: Ambiguous Primary Key

If PK detection produces no clear result:

```text
I could not unambiguously detect the primary key. Candidates:
- <candidate_1>
- <candidate_2>

Which column(s) should be the primary key?
```

---

## Primary Key Detection

Apply in order; stop at first clear result:

1. `ROW_NUMBER() OVER (PARTITION BY <cols>) = 1` → partition columns are PK.
2. `SELECT DISTINCT` over a small column set → all selected columns form composite PK.
3. `GROUP BY <cols>` at outermost level → grouping columns are PK.
4. Single column named `id` → PK.
5. Single column named `<entity>_id` matching the model name stem → PK.
6. Bridge naming `<entity_a>_<entity_b>` → `(<entity_a>_id, <entity_b>_id)` composite PK.

If none produce a clear answer → Checkpoint 3.
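
The name-based rules (4 to 6) can be sketched in Python as below. Rules 1 to 3 need the model's SQL and are out of scope here. Everything in this sketch is an assumption made for illustration: the helper names, the idea that the "model name stem" is the name minus a `bronze_`/`silver_`/`gold_` prefix, and the naive singularization that maps `deals` to `deal` and `companies` to `company`.

```python
from typing import Optional, Sequence, Tuple

# Assumed layer prefixes, based on model names such as gold_deals_companies.
LAYER_PREFIXES = ("bronze_", "silver_", "gold_")


def _singular(word: str) -> str:
    """Naive singularization (assumption): companies -> company, deals -> deal."""
    if word.endswith("ies"):
        return word[:-3] + "y"
    if word.endswith("s"):
        return word[:-1]
    return word


def detect_pk_by_name(model_name: str, columns: Sequence[str]) -> Optional[Tuple[str, ...]]:
    """Apply rules 4-6 in order; return None when no rule gives a clear answer."""
    cols = set(columns)

    # Rule 4: a column literally named `id`.
    if "id" in cols:
        return ("id",)

    # Strip an assumed layer prefix to obtain the model name stem.
    stem = model_name
    for prefix in LAYER_PREFIXES:
        if stem.startswith(prefix):
            stem = stem[len(prefix):]
            break

    # Rule 5: `<entity>_id` where <entity> matches the stem (plural or singular).
    for entity in (stem, _singular(stem)):
        if f"{entity}_id" in cols:
            return (f"{entity}_id",)

    # Rule 6: bridge naming `<entity_a>_<entity_b>` (single-word entities only).
    parts = stem.split("_")
    if len(parts) == 2:
        pair = tuple(f"{_singular(p)}_id" for p in parts)
        if all(c in cols for c in pair):
            return pair

    # No clear answer: fall through to Checkpoint 3.
    return None
```

A `None` result is the signal to ask the user rather than guess.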

## Foreign Key Detection

A column is a FK candidate if it matches `<entity>_id` where `<entity>` ≠ model's own entity, is not part of the PK, and is not nullable by design. Add `not_null` test only (no `relationships` tests by default).
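
A minimal Python sketch of this filter follows. The function name `fk_candidates` and its parameters are hypothetical, and the set of columns that are "nullable by design" is assumed to be supplied by the caller, since the skill does not say how that is determined.

```python
from typing import Iterable, List


def fk_candidates(
    columns: Iterable[str],
    pk_columns: Iterable[str],
    own_entity: str,
    nullable_by_design: Iterable[str] = (),
) -> List[str]:
    """Return columns that look like foreign keys under the rule above."""
    pk = set(pk_columns)
    nullable = set(nullable_by_design)
    result = []
    for col in columns:
        # Must have the shape <entity>_id with a non-empty entity part.
        if not col.endswith("_id") or len(col) == len("_id"):
            continue
        entity = col[: -len("_id")]
        if entity == own_entity:
            continue  # the model's own entity key is not a foreign key
        if col in pk or col in nullable:
            continue  # PK parts and intentionally nullable columns are excluded
        result.append(col)
    return result
```

Each returned column would receive a `not_null` test only.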

## SQL File Generation

See [sql-templates.md](references/sql-templates.md) for:

- Bronze model template using `{{ source() }}`
- Standard silver/gold model template
- Bridge model (JSON array) template with concrete example
- Bridge model naming convention and SQL content rules

## schema.yml Update

See [schema-conventions.md](references/schema-conventions.md) for full examples including sources declaration, composite PK, and dbt-utils patterns.

## Edge Cases

See [edge-cases.md](references/edge-cases.md) for: missing SQL details, missing upstream model, undeclared source, run/test failure handling.

---

## Final Response Format

```text
Created dbt transformation: <model_name>

Layer: <bronze | silver | gold>
File: dbt/models/<layer>/<model_name>.sql
Materialization: <inherited: table | overridden: <type>>
Primary key: <pk_column> (or composite: <col_1>, <col_2>)
Foreign keys: <fk_1>, <fk_2> (or "none detected")
schema.yml: dbt/models/<layer>/schema.yml (entry added)

Tests:
- not_null on <pk>: added
- unique on <pk>: added | skipped: dbt-utils unavailable
- not_null on <fk>: added

Validation:
- revos dbt run: passed | failed
- revos dbt test: passed | failed

Physical table after run:
`<resolved_dataset>.<model_name>`
```
@@ -0,0 +1,46 @@
# Edge Cases

## User asks for a model but does not provide the SQL

```text
I can scaffold the model file and schema entry, but I need to know what the
transformation should produce.

Could you tell me:
1. What source model(s) or tables does it read from?
2. What columns should it expose, and what is the primary key?
3. Any filtering, aggregation, or join logic?

Alternatively, if this is a bridge model from a JSON array, I can apply the
standard bridge template — just tell me the source model and the JSON column.
```

## User asks for a "quick" or "simple" model without details

Same response as above. Do not invent business logic.

## Model depends on another model that does not exist yet

```text
The transformation you described references `<missing_model>`, which does not
exist in dbt/models/. Should I create that model first?
```

## Source is a raw Airbyte table not yet declared as a dbt source

Declare it as a source in `dbt/models/bronze/schema.yml` first (see [schema-conventions.md](schema-conventions.md)), then reference it with `{{ source('raw', '<table>') }}` in the bronze model SQL. Do not use fully qualified BigQuery names directly — that bypasses dbt's dependency graph and source freshness tracking.

## run fails

1. Show the error verbatim — do not paraphrase warehouse errors.
2. Offer to fix the SQL based on the error message.
3. Do not proceed to `revos dbt test` until run succeeds.
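
The gating in step 3 can be sketched in Python as follows. This is an illustration under assumptions: `validate_model` and the injectable `runner` callable are hypothetical names, and only the two `revos dbt` commands quoted in the skill are used. Injecting the runner keeps the sketch testable without a warehouse.

```python
import subprocess
from typing import Callable, Dict, List

# A runner takes an argv list and returns the process exit code.
Runner = Callable[[List[str]], int]


def _subprocess_runner(cmd: List[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.run(cmd).returncode


def validate_model(model: str, runner: Runner = _subprocess_runner) -> Dict[str, str]:
    """Run the model, then test it only if the run succeeded."""
    results = {"run": "skipped", "test": "skipped"}

    run_rc = runner(["revos", "dbt", "run", "--select", model])
    results["run"] = "passed" if run_rc == 0 else "failed"
    if run_rc != 0:
        # Stop here: tests against a model that failed to build are not meaningful.
        return results

    test_rc = runner(["revos", "dbt", "test", "--select", model])
    results["test"] = "passed" if test_rc == 0 else "failed"
    return results
```

The returned dictionary maps onto the `Validation:` lines of the final response, with `skipped` covering the case where the test step never ran.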

## test fails

Show which test failed and explain the likely cause:

- `unique` on PK fails → real duplicates exist; PK detection was wrong or source has unexpected duplicates needing `DISTINCT` or `ROW_NUMBER` dedup.
- `not_null` on a column fails → source has nulls; either the column is genuinely nullable (remove the test) or filter them out in SQL.

Ask the user how to proceed.
@@ -0,0 +1,128 @@
# schema.yml Conventions

## Contents

- [Declaring Sources (bronze layer)](#declaring-sources-bronze-layer)
- [Standard Model Entry](#standard-model-entry)
- [Composite Primary Keys (Bridge Models)](#composite-primary-keys-bridge-models)
- [Description Guidelines](#description-guidelines)
- [Foreign Key Tests](#foreign-key-tests)

---

Each layer has one shared `schema.yml` at `dbt/models/<layer>/schema.yml`. Append new models; do not create per-model files.

If the file does not exist, create it with:

```yaml
version: 2

models:
```

## Declaring Sources (bronze layer)

Raw tables must be declared as dbt sources before they can be referenced with `{{ source() }}`. Sources live in `dbt/models/bronze/schema.yml` under a `sources:` block alongside the `models:` block.

`schema` maps to the BigQuery dataset (`REVOS_BQ_DATASET`):

```yaml
version: 2

sources:
  - name: raw
    schema: "{{ env_var('REVOS_BQ_DATASET') }}"
    tables:
      - name: hubspot_contacts
      - name: hubspot_deals
      - name: stripe_charges

models:
  - name: bronze_hubspot_contacts
    ...
```

Rules:

- Use `raw` as the source name for all Airbyte-ingested tables.
- Each raw table referenced in bronze SQL needs a corresponding entry under `tables:`.
- If the source block already exists, append to the `tables:` list only.
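
The three rules above amount to an idempotent "ensure this table is declared" operation. A minimal Python sketch, working on an already parsed `schema.yml` represented as plain dictionaries: the function name `ensure_source_table` is hypothetical, and loading or writing the YAML file is deliberately left out.

```python
from typing import Any, Dict


def ensure_source_table(schema_doc: Dict[str, Any], table: str, source_name: str = "raw") -> bool:
    """Declare `table` under the `raw` source; return True if the document changed."""
    sources = schema_doc.setdefault("sources", [])

    # Reuse the existing source block if present; otherwise create it.
    for source in sources:
        if source.get("name") == source_name:
            break
    else:
        source = {
            "name": source_name,
            "schema": "{{ env_var('REVOS_BQ_DATASET') }}",
            "tables": [],
        }
        sources.append(source)

    tables = source.setdefault("tables", [])
    if any(entry.get("name") == table for entry in tables):
        return False  # already declared, nothing to append

    # Append only to the tables list; the rest of the block is left untouched.
    tables.append({"name": table})
    return True
```

Calling it twice with the same table leaves the document unchanged on the second call, which keeps repeated runs from duplicating entries.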

## Standard Model Entry

```yaml
  - name: <model_name>
    description: |
      <1–2 sentences: what entity/relationship, what source, any non-obvious logic>
    columns:
      - name: <pk_column>
        description: "Primary key."
        tests:
          - not_null
          - unique

      - name: <fk_column>
        description: "Foreign key to <target_entity>."
        tests:
          - not_null
```

## Composite Primary Keys (Bridge Models)

If `dbt-utils` is available:

```yaml
  - name: <model_name>
    description: |
      <description>
    tests:
      - dbt_utils.unique_combination_of_columns:
          combination_of_columns:
            - <pk_col_1>
            - <pk_col_2>
    columns:
      - name: <pk_col_1>
        description: "Composite key part."
        tests:
          - not_null
      - name: <pk_col_2>
        description: "Composite key part."
        tests:
          - not_null
```

If `dbt-utils` is not available, omit `unique_combination_of_columns` and note it in the description:

```yaml
description: |
  <description>
  Note: composite uniqueness on (<pk_col_1>, <pk_col_2>) is not enforced —
  dbt-utils is not installed in this project.
```

Check availability:

```bash
# Hub installs are listed as dbt-labs/dbt_utils (underscore); git installs use dbt-utils in the URL.
grep -E "dbt[-_]utils" dbt/packages.yml 2>/dev/null || echo "dbt-utils not found"
```
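
The same availability check can be sketched in Python. The function names are hypothetical. The sketch assumes that a dbt Hub install appears in `packages.yml` as `dbt-labs/dbt_utils` (underscore) while a git install carries `dbt-utils` (hyphen) in its URL, so both spellings are accepted, and that commented-out lines should not count.

```python
import re
from pathlib import Path

# Accept both spellings: dbt_utils (Hub package name) and dbt-utils (git URL).
_DBT_UTILS = re.compile(r"dbt[-_]utils")


def mentions_dbt_utils(packages_text: str) -> bool:
    """Return True if any non-comment part of packages.yml names dbt-utils."""
    for line in packages_text.splitlines():
        code = line.split("#", 1)[0]  # drop YAML comments
        if _DBT_UTILS.search(code):
            return True
    return False


def has_dbt_utils(packages_file: str = "dbt/packages.yml") -> bool:
    """File-level wrapper; a missing packages.yml counts as not installed."""
    path = Path(packages_file)
    if not path.is_file():
        return False
    return mentions_dbt_utils(path.read_text(encoding="utf-8"))
```

A `False` result corresponds to the "skipped: dbt-utils unavailable" branch, where the composite uniqueness test is omitted and noted in the description.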

## Description Guidelines

Answer in 1–2 sentences:

1. What entity or relationship this model represents.
2. What source(s) it reads from.
3. Any non-obvious filtering or transformation.

Example:

```yaml
description: |
  Bridge table linking HubSpot deals to companies, unpacked from the
  `companies` JSON array on `hubspot_deals`. Excludes deals with no
  associated companies.
```

## Foreign Key Tests

Add `not_null` only. Do not add `relationships` tests by default — they require knowing the target model and column, which needs explicit user input.
@@ -0,0 +1,73 @@
# SQL Templates

## Bronze Model (source reference)

Bronze models read raw Airbyte-ingested tables via `{{ source() }}`:

```sql
SELECT
    <pk_column>,
    <business_columns>,
    _airbyte_extracted_at
FROM {{ source('raw', '<raw_table_name>') }}
```

Ensure the source is declared in `dbt/models/bronze/schema.yml` before using it (see `references/schema-conventions.md`).

## Standard Silver / Gold Model

```sql
SELECT
    <pk_column>,
    <business_columns>,
    _airbyte_extracted_at
FROM {{ ref('<source_model>') }}
WHERE <filtering_conditions>
```

## Bridge Model (JSON Array)

When unpacking a JSON array into a many-to-many bridge table:

```sql
SELECT DISTINCT
    d.id AS <entity_a>_id,
    <entity_b>_id,
    d._airbyte_extracted_at
FROM {{ ref('<source_model>') }} d,
    UNNEST(JSON_VALUE_ARRAY(d.<json_array_column>)) AS <entity_b>_id
WHERE d.<json_array_column> IS NOT NULL
```

Concrete example (`gold_deals_companies.sql`, unpacking `companies` array on `hubspot_deals`):

```sql
SELECT DISTINCT
    d.id AS deal_id,
    company_id,
    d._airbyte_extracted_at
FROM {{ ref('hubspot_deals') }} d,
    UNNEST(JSON_VALUE_ARRAY(d.companies)) AS company_id
WHERE d.companies IS NOT NULL
```

Notes:

1. `SELECT DISTINCT` — a single source row can produce duplicate combinations under some Airbyte sync patterns.
2. `WHERE d.<json_array_column> IS NOT NULL` is required — `UNNEST(JSON_VALUE_ARRAY(NULL))` is unsafe.
3. `_airbyte_extracted_at` is preserved for downstream freshness checks.
4. Composite PK: `(<entity_a>_id, <entity_b>_id)`.

## Bridge Model Naming

Convention: `<entity_a>_<entity_b>` (no `to_`, no `bridge_` prefix). Alphabetical order unless one entity clearly owns the relationship.

Examples: `gold_deals_companies`, `gold_deals_contacts`, `gold_companies_contacts`.
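
The naming rule can be sketched in Python as below. The function name `bridge_model_name` and its `owner` and `layer_prefix` parameters are hypothetical. Which entity "clearly owns" a relationship is a judgment call the sketch leaves to the caller, and the `gold_` prefix is inferred from the examples rather than stated as a rule.

```python
from typing import Optional


def bridge_model_name(
    entity_a: str,
    entity_b: str,
    owner: Optional[str] = None,
    layer_prefix: str = "",
) -> str:
    """Build `<entity_a>_<entity_b>`: owner first if given, else alphabetical."""
    if owner is not None and owner not in (entity_a, entity_b):
        raise ValueError("owner must be one of the two entities")

    if owner is None:
        # No clear owner: fall back to alphabetical order.
        first, second = sorted((entity_a, entity_b))
    else:
        first = owner
        second = entity_b if owner == entity_a else entity_a

    return f"{layer_prefix}{first}_{second}"
```

Under these assumptions, `gold_companies_contacts` is the alphabetical case, while `gold_deals_companies` is consistent with deals being treated as the owning entity.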

## SQL Content Rules

1. No `{{ config(materialized=...) }}` unless user asks to override the layer default.
2. `{{ source('raw', '<table>') }}` for raw source tables in bronze models.
3. `{{ ref('<model>') }}` for references to other dbt models.
4. Named CTEs for non-trivial logic, explicit column lists where practical.
5. Preserve `_airbyte_extracted_at` from Airbyte-ingested sources.