@revos/cli 0.2.1 → 0.2.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (157)
  1. package/README.md +271 -71
  2. package/dist/adapters/oclif/commands/action-runs/get.mjs +1 -1
  3. package/dist/adapters/oclif/commands/action-runs/list.mjs +8 -2
  4. package/dist/adapters/oclif/commands/actions/get-input-schema.mjs +2 -2
  5. package/dist/adapters/oclif/commands/actions/get-params-schema.mjs +2 -2
  6. package/dist/adapters/oclif/commands/actions/get.mjs +1 -1
  7. package/dist/adapters/oclif/commands/actions/list.mjs +8 -4
  8. package/dist/adapters/oclif/commands/ai-instructions/create.mjs +1 -1
  9. package/dist/adapters/oclif/commands/ai-instructions/delete.mjs +1 -1
  10. package/dist/adapters/oclif/commands/ai-instructions/get.mjs +1 -1
  11. package/dist/adapters/oclif/commands/ai-instructions/list.mjs +8 -2
  12. package/dist/adapters/oclif/commands/ai-instructions/update.mjs +1 -1
  13. package/dist/adapters/oclif/commands/api.d.mts +11 -0
  14. package/dist/adapters/oclif/commands/api.mjs +112 -0
  15. package/dist/adapters/oclif/commands/apply.d.mts +28 -0
  16. package/dist/adapters/oclif/commands/apply.mjs +77 -0
  17. package/dist/adapters/oclif/commands/auth/login.d.mts +5 -4
  18. package/dist/adapters/oclif/commands/auth/login.mjs +22 -11
  19. package/dist/adapters/oclif/commands/auth/logout.d.mts +1 -1
  20. package/dist/adapters/oclif/commands/auth/logout.mjs +2 -2
  21. package/dist/adapters/oclif/commands/auth/status.d.mts +2 -2
  22. package/dist/adapters/oclif/commands/auth/status.mjs +2 -2
  23. package/dist/adapters/oclif/commands/connections/create.d.mts +6 -0
  24. package/dist/adapters/oclif/commands/connections/create.mjs +8 -0
  25. package/dist/adapters/oclif/commands/connections/delete.d.mts +6 -0
  26. package/dist/adapters/oclif/commands/connections/delete.mjs +8 -0
  27. package/dist/adapters/oclif/commands/connections/get.d.mts +6 -0
  28. package/dist/adapters/oclif/commands/connections/get.mjs +8 -0
  29. package/dist/adapters/oclif/commands/connections/list.d.mts +6 -0
  30. package/dist/adapters/oclif/commands/connections/list.mjs +14 -0
  31. package/dist/adapters/oclif/commands/connections/update.d.mts +6 -0
  32. package/dist/adapters/oclif/commands/connections/update.mjs +8 -0
  33. package/dist/adapters/oclif/commands/cubes/create.d.mts +6 -0
  34. package/dist/adapters/oclif/commands/cubes/create.mjs +8 -0
  35. package/dist/adapters/oclif/commands/cubes/delete.d.mts +6 -0
  36. package/dist/adapters/oclif/commands/cubes/delete.mjs +8 -0
  37. package/dist/adapters/oclif/commands/cubes/get.d.mts +6 -0
  38. package/dist/adapters/oclif/commands/cubes/get.mjs +8 -0
  39. package/dist/adapters/oclif/commands/cubes/list.d.mts +6 -0
  40. package/dist/adapters/oclif/commands/cubes/list.mjs +13 -0
  41. package/dist/adapters/oclif/commands/cubes/update.d.mts +6 -0
  42. package/dist/adapters/oclif/commands/cubes/update.mjs +8 -0
  43. package/dist/adapters/oclif/commands/diff.d.mts +27 -0
  44. package/dist/adapters/oclif/commands/diff.mjs +66 -0
  45. package/dist/adapters/oclif/commands/gservice-account-keys/get.mjs +1 -1
  46. package/dist/adapters/oclif/commands/gservice-account-keys/reveal.mjs +2 -2
  47. package/dist/adapters/oclif/commands/gservice-accounts/create.mjs +1 -1
  48. package/dist/adapters/oclif/commands/gservice-accounts/delete.mjs +1 -1
  49. package/dist/adapters/oclif/commands/gservice-accounts/get.mjs +1 -1
  50. package/dist/adapters/oclif/commands/gservice-accounts/list.mjs +7 -2
  51. package/dist/adapters/oclif/commands/init.d.mts +2 -1
  52. package/dist/adapters/oclif/commands/init.mjs +26 -23
  53. package/dist/adapters/oclif/commands/org/create.mjs +1 -1
  54. package/dist/adapters/oclif/commands/org/current.d.mts +2 -2
  55. package/dist/adapters/oclif/commands/org/current.mjs +2 -2
  56. package/dist/adapters/oclif/commands/org/get.mjs +1 -1
  57. package/dist/adapters/oclif/commands/org/list.d.mts +3 -11
  58. package/dist/adapters/oclif/commands/org/list.mjs +26 -26
  59. package/dist/adapters/oclif/commands/org/switch.d.mts +3 -2
  60. package/dist/adapters/oclif/commands/org/switch.mjs +10 -3
  61. package/dist/adapters/oclif/commands/pull.d.mts +28 -0
  62. package/dist/adapters/oclif/commands/pull.mjs +88 -0
  63. package/dist/adapters/oclif/commands/score-groups/create.mjs +3 -2
  64. package/dist/adapters/oclif/commands/score-groups/delete.mjs +1 -1
  65. package/dist/adapters/oclif/commands/score-groups/get.mjs +1 -1
  66. package/dist/adapters/oclif/commands/score-groups/list.mjs +3 -2
  67. package/dist/adapters/oclif/commands/score-groups/update.mjs +1 -1
  68. package/dist/adapters/oclif/commands/scores/create.mjs +3 -2
  69. package/dist/adapters/oclif/commands/scores/delete.mjs +1 -1
  70. package/dist/adapters/oclif/commands/scores/list.mjs +3 -2
  71. package/dist/adapters/oclif/commands/scores/update.mjs +1 -1
  72. package/dist/adapters/oclif/commands/segments/create.mjs +1 -1
  73. package/dist/adapters/oclif/commands/segments/delete.mjs +1 -1
  74. package/dist/adapters/oclif/commands/segments/evaluate.mjs +2 -2
  75. package/dist/adapters/oclif/commands/segments/get-evaluation-history.mjs +2 -2
  76. package/dist/adapters/oclif/commands/segments/get-version.mjs +2 -2
  77. package/dist/adapters/oclif/commands/segments/get.mjs +1 -1
  78. package/dist/adapters/oclif/commands/segments/list-versions.mjs +16 -5
  79. package/dist/adapters/oclif/commands/segments/list.mjs +9 -2
  80. package/dist/adapters/oclif/commands/segments/restore-version.mjs +2 -2
  81. package/dist/adapters/oclif/commands/segments/update.mjs +1 -1
  82. package/dist/adapters/oclif/commands/sources/create.d.mts +11 -0
  83. package/dist/adapters/oclif/commands/sources/create.mjs +16 -0
  84. package/dist/adapters/oclif/commands/sources/delete.d.mts +6 -0
  85. package/dist/adapters/oclif/commands/sources/delete.mjs +8 -0
  86. package/dist/adapters/oclif/commands/sources/get.d.mts +6 -0
  87. package/dist/adapters/oclif/commands/sources/get.mjs +8 -0
  88. package/dist/adapters/oclif/commands/sources/list-streams.d.mts +6 -0
  89. package/dist/adapters/oclif/commands/sources/list-streams.mjs +31 -0
  90. package/dist/adapters/oclif/commands/sources/list.d.mts +6 -0
  91. package/dist/adapters/oclif/commands/sources/list.mjs +13 -0
  92. package/dist/adapters/oclif/commands/{integrations/get.d.mts → sources/update.d.mts} +4 -4
  93. package/dist/adapters/oclif/commands/sources/update.mjs +21 -0
  94. package/dist/adapters/oclif/commands/status.d.mts +26 -0
  95. package/dist/adapters/oclif/commands/status.mjs +77 -0
  96. package/dist/adapters/oclif/commands/table-views/create.mjs +3 -2
  97. package/dist/adapters/oclif/commands/table-views/delete.mjs +1 -1
  98. package/dist/adapters/oclif/commands/table-views/list.mjs +3 -2
  99. package/dist/adapters/oclif/commands/table-views/update.mjs +1 -1
  100. package/dist/adapters/oclif/commands/tables/create.mjs +1 -1
  101. package/dist/adapters/oclif/commands/tables/delete.mjs +1 -1
  102. package/dist/adapters/oclif/commands/tables/get.mjs +1 -1
  103. package/dist/adapters/oclif/commands/tables/list.mjs +3 -2
  104. package/dist/adapters/oclif/commands/tables/update.mjs +1 -1
  105. package/dist/{base.command-d7VW6WTp.d.mts → base.command-D7X3ZNtY.d.mts} +0 -1
  106. package/dist/{base.command-YiwlGlKs.mjs → base.command-cV5d65r8.mjs} +15 -12
  107. package/dist/chunk-CfYAbeIz.mjs +13 -0
  108. package/dist/core-CMrP5BQS.mjs +2378 -0
  109. package/dist/{factory-BrFKT8t-.mjs → factory-C6XLqhT9.mjs} +44 -10
  110. package/dist/iac-render-BSZZEP0n.mjs +17 -0
  111. package/dist/index-BqKwXXAo.d.mts +598 -0
  112. package/dist/index.d.mts +3 -4
  113. package/dist/index.mjs +2 -2
  114. package/dist/{presets-D9b6IWKy.mjs → presets-CJbFbHlw.mjs} +35 -8
  115. package/dist/templates/.claude/settings.json +39 -0
  116. package/dist/templates/.devcontainer/setup.sh +3 -0
  117. package/dist/templates/AGENTS.md +33 -20
  118. package/dist/templates/dbt/dbt_project.yml +2 -2
  119. package/dist/templates/skills/create-connections/SKILL.md +210 -0
  120. package/dist/templates/skills/create-connections/references/mappers.md +152 -0
  121. package/dist/templates/skills/{create-semantic-model → create-cubes}/SKILL.md +20 -18
  122. package/dist/templates/skills/create-cubes/references/bq-pk-fk-conventions.md +183 -0
  123. package/dist/templates/skills/{create-semantic-model → create-cubes}/references/cube-examples.md +2 -2
  124. package/dist/templates/skills/create-cubes/references/hubspot-entities.md +289 -0
  125. package/dist/templates/skills/create-cubes/references/jira-entities.md +201 -0
  126. package/dist/templates/skills/create-cubes/references/netsuite-entities.md +121 -0
  127. package/dist/templates/skills/create-cubes/references/stripe-entities.md +114 -0
  128. package/dist/templates/skills/create-dbt-transformations/SKILL.md +43 -22
  129. package/dist/templates/skills/create-dbt-transformations/references/edge-cases.md +20 -2
  130. package/dist/templates/skills/create-dbt-transformations/references/schema-conventions.md +21 -7
  131. package/dist/templates/skills/create-dbt-transformations/references/sql-templates.md +34 -20
  132. package/dist/templates/skills/explore-lakehouse/SKILL.md +3 -3
  133. package/dist/templates/skills/load-sample-data/SKILL.md +1 -1
  134. package/dist/templates/skills/visualize-semantic-model/SKILL.md +159 -0
  135. package/dist/templates/skills/visualize-semantic-model/scripts/render_graph.py +186 -0
  136. package/dist/{types-Y_ht_ja5.d.mts → types-CGjxcj4L.d.mts} +3 -0
  137. package/package.json +44 -7
  138. package/dist/adapters/oclif/commands/integrations/create.d.mts +0 -11
  139. package/dist/adapters/oclif/commands/integrations/create.mjs +0 -16
  140. package/dist/adapters/oclif/commands/integrations/get.mjs +0 -21
  141. package/dist/adapters/oclif/commands/integrations/list.d.mts +0 -11
  142. package/dist/adapters/oclif/commands/integrations/list.mjs +0 -16
  143. package/dist/adapters/oclif/commands/integrations/update.d.mts +0 -15
  144. package/dist/adapters/oclif/commands/integrations/update.mjs +0 -21
  145. package/dist/adapters/oclif/commands/overlays/diff.d.mts +0 -19
  146. package/dist/adapters/oclif/commands/overlays/diff.mjs +0 -80
  147. package/dist/adapters/oclif/commands/overlays/pull.d.mts +0 -15
  148. package/dist/adapters/oclif/commands/overlays/pull.mjs +0 -45
  149. package/dist/adapters/oclif/commands/overlays/push.d.mts +0 -18
  150. package/dist/adapters/oclif/commands/overlays/push.mjs +0 -59
  151. package/dist/adapters/oclif/commands/overlays/status.d.mts +0 -18
  152. package/dist/adapters/oclif/commands/overlays/status.mjs +0 -53
  153. package/dist/core-jpFPylBb.mjs +0 -997
  154. package/dist/index-DD2Vr-pu.d.mts +0 -193
  155. package/dist/types-C_p_6rkj.d.mts +0 -69
  156. package/dist/templates/skills/{create-semantic-model → create-cubes}/references/key-patterns.md +0 -0
  157. package/dist/templates/skills/{create-semantic-model → create-cubes}/references/validation-queries.md +0 -0
@@ -1,12 +1,12 @@
1
1
  ---
2
- name: create-semantic-model
2
+ name: create-cubes
3
3
  description: >
4
- Create semantic models (Cube.dev cubes) from existing RevOS dbt gold models.
4
+ Create first-class Cube.dev cube definitions from existing RevOS dbt gold models.
5
5
  Use when asked to: build a semantic layer, create cubes, generate Cube definitions from dbt,
6
- create a semantic overlay, or create a semantic model from gold models.
6
+ define cube files, or create a semantic model from gold models.
7
7
  ---
8
8
 
9
- # Create Semantic Model
9
+ # Create Cube
10
10
 
11
11
  ## Skill Dependencies
12
12
 
@@ -27,6 +27,8 @@ load the `explore-lakehouse` skill on demand.
27
27
 
28
28
  Expose existing dbt gold models as queryable Cube.dev semantic models without manually writing YAML boilerplate. Gold models may be tables or views.
29
29
 
30
+ Each cube is a **complete, standalone definition** stored in `cubes/`. There is no patching or merging — what is in the file is what gets deployed.
31
+
30
32
  This skill does not build gold models. If a needed gold model is missing, hand off to `create-dbt-transformations`.
31
33
 
32
34
  ---
@@ -38,13 +40,13 @@ Strip the `gold_` prefix for cube names and file names. Keep `gold_` in `sql_tab
38
40
  ```text
39
41
  gold SQL file: dbt/models/gold/gold_hubspot_companies.sql
40
42
  BigQuery table: gold_hubspot_companies
41
- overlay file: semantic/hubspot_companies.yml
43
+ cube file: cubes/hubspot_companies.yml
42
44
  cube name: hubspot_companies
43
45
  join reference: ${hubspot_companies}
44
46
  sql_table: "`<dataset>.gold_hubspot_companies`"
45
47
  ```
46
48
 
47
- Same rule for bridge cubes: `gold_deals_companies` -> cube name `deals_companies`, file `deals_companies.yml`.
49
+ Same rule for bridge cubes: `gold_deals_companies` -> cube name `deals_companies`, file `cubes/deals_companies.yml`.
48
50
 
49
51
  ## Cube `sql_table` Reference
50
52
 
@@ -76,7 +78,7 @@ When a many-to-many relationship is detected and no suitable bridge model exists
76
78
 
77
79
  ### Checkpoint 3: Relationship Confirmation
78
80
 
79
- Present validated relationships with join directions, cardinality, and match rates. Ask the user to confirm before generating overlays. Do not present unvalidated joins as confirmed — mark them as `validation pending`.
81
+ Present validated relationships with join directions, cardinality, and match rates. Ask the user to confirm before generating cube files. Do not present unvalidated joins as confirmed — mark them as `validation pending`.
80
82
 
81
83
  ### Checkpoint 4: Measures Confirmation
82
84
 
@@ -94,7 +96,7 @@ Follow these phases in order. Do not skip ahead.
94
96
 
95
97
  1. Discover gold models via `find dbt/models/gold -name "*.sql"`.
96
98
  2. If none exist, stop and tell the user to create gold models first via `create-dbt-transformations`.
97
- 3. Inspect 1-2 existing overlays in `semantic/` to detect conventions (`extends:`, `public:`, `refresh_key` style). Apply detected conventions to new overlays. Always use flat single-cube YAML (never `cubes:` or `views:` root).
99
+ 3. Inspect 1-2 existing cube files in `cubes/` to detect conventions (`extends:`, `public:`, `refresh_key` style). Apply detected conventions to new cubes. Always use flat single-cube YAML (never `cubes:` or `views:` root).
98
100
  4. If the user named a specific model, find it. If not found, stop.
99
101
  5. Otherwise list all discovered gold models and ask which should participate (Checkpoint 1).
100
102
  6. Keep the full discovered list available for connector search in Phase 3.
@@ -231,18 +233,18 @@ See [references/cube-examples.md](references/cube-examples.md) for type mapping
231
233
 
232
234
  ---
233
235
 
234
- ## Phase 8: Generate Cube Semantic Overlays
236
+ ## Phase 8: Generate Cube Files
235
237
 
236
- Create Cube.dev YAML files in `semantic/`. Follow the existing style detected in Phase 1.
238
+ Create Cube.dev YAML files in `cubes/`. Follow the existing style detected in Phase 1.
237
239
 
238
240
  Key rules:
239
241
 
240
- 1. **One cube per file, flat YAML.** Each overlay file contains a single cube starting with `name:` at the root level. Never wrap with `cubes:` or `views:` at the root.
242
+ 1. **One cube per file, flat YAML.** Each cube file contains a single cube starting with `name:` at the root level. Never wrap with `cubes:` or `views:` at the root.
241
243
  2. File name = cube `name` (no `gold_` prefix) + `.yml`.
242
244
  3. `sql_table` uses fully qualified BigQuery reference with `gold_` prefix.
243
245
  4. Every confirmed relationship gets joins in both directions.
244
246
  5. Bridge/junction cubes use `public: false`.
245
- 6. Every overlay **must** include a SQL-based `refresh_key`. Use `SELECT MAX(<timestamp_col>)` with columns in this priority: `_airbyte_extracted_at` (present on all Airbyte sources), `updated_at`/`modified_at` (CDC streams), `created_at` (insert-only facts). Only use `every: <interval>` as absolute last resort when **no timestamp column exists in the table** — add a YAML comment explaining why (e.g. `# no timestamp column available`).
247
+ 6. Every cube **must** include a SQL-based `refresh_key`. Use `SELECT MAX(<timestamp_col>)` with columns in this priority: `_airbyte_extracted_at` (present on all Airbyte sources), `updated_at`/`modified_at` (CDC streams), `created_at` (insert-only facts). Only use `every: <interval>` as absolute last resort when **no timestamp column exists in the table** — add a YAML comment explaining why (e.g. `# no timestamp column available`).
246
248
  7. `refresh_key.sql` references the same table as `sql_table`.
247
249
  8. Tag unvalidated joins with `# UNVALIDATED: <reason>`.
248
250
 
@@ -254,14 +256,14 @@ See [references/cube-examples.md](references/cube-examples.md) for canonical sta
254
256
 
255
257
  1. If `create-dbt-transformations` was invoked (bridge model), it already validated dbt models. Otherwise run `dbt parse`.
256
258
  2. Verify physical tables exist in BigQuery: `bq show <dataset>.<table_name>`. If missing, document as pending.
257
- 3. Verify generated overlays match conventions: flat YAML, correct naming, correct `sql_table`, all dimensions present, `refresh_key` included, joins in both directions.
259
+ 3. Verify generated cube files match conventions: flat YAML, correct naming, correct `sql_table`, all dimensions present, `refresh_key` included, joins in both directions.
258
260
 
259
261
  ---
260
262
 
261
263
  ## Final Response Format
262
264
 
263
265
  ```text
264
- Created semantic model draft.
266
+ Created cube definitions.
265
267
 
266
268
  Selected gold models:
267
269
  - dbt/models/gold/<gold_model_1>.sql
@@ -272,9 +274,9 @@ Approved connector models:
272
274
  Bridge/support models created (via create-dbt-transformations):
273
275
  - dbt/models/gold/<bridge_model>.sql
274
276
 
275
- Semantic overlays:
276
- - semantic/<entity_1>.yml (cube name: <entity_1>)
277
- - semantic/<bridge_entity>.yml (cube name: <bridge_entity>, public: false)
277
+ Cube files:
278
+ - cubes/<entity_1>.yml (cube name: <entity_1>)
279
+ - cubes/<bridge_entity>.yml (cube name: <bridge_entity>, public: false)
278
280
 
279
281
  Validated relationships:
280
282
  - <entity_a>.<key> -> <entity_b>.<key> (<relationship_type>)
@@ -299,7 +301,7 @@ Pending items:
299
301
  - <pending_item>
300
302
 
301
303
  Next step:
302
- revos overlays push -d ./semantic
304
+ revos apply
303
305
  ```
304
306
 
305
307
  If validation is incomplete, say exactly what remains pending.
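The `gold_` naming rule from this skill can be sketched in Python. This is only an illustration: `cube_naming`, its return keys, and the `analytics` dataset are hypothetical names, not part of the `@revos/cli` package.

```python
from pathlib import PurePosixPath

GOLD_PREFIX = "gold_"


def cube_naming(gold_sql_path: str, dataset: str) -> dict:
    """Derive cube naming from a gold model path (hypothetical helper)."""
    table = PurePosixPath(gold_sql_path).stem  # e.g. "gold_hubspot_companies"
    if not table.startswith(GOLD_PREFIX):
        raise ValueError(f"expected a {GOLD_PREFIX!r}-prefixed model, got {table!r}")
    cube_name = table[len(GOLD_PREFIX):]  # strip the prefix for the cube name
    return {
        "cube_name": cube_name,
        "cube_file": f"cubes/{cube_name}.yml",
        "join_reference": "${" + cube_name + "}",
        # sql_table keeps the gold_ prefix and is wrapped in backticks
        "sql_table": f"`{dataset}.{table}`",
    }


naming = cube_naming("dbt/models/gold/gold_hubspot_companies.sql", "analytics")
```

Here `naming["cube_file"]` is `cubes/hubspot_companies.yml`, while `naming["sql_table"]` still points at `gold_hubspot_companies`.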
@@ -0,0 +1,183 @@
1
+ # BigQuery PK / FK Conventions
2
+
3
+ Conventions and patterns for primary keys, foreign keys, and type handling in
4
+ BigQuery-backed Cube.dev cube definitions.
5
+
6
+ ---
7
+
8
+ ## Primary key rules
9
+
10
+ ### Single-column PK
11
+
12
+ Expose with `primary_key: true`:
13
+
14
+ ```yaml
15
+ dimensions:
16
+ id:
17
+ sql: "${CUBE}.id"
18
+ type: string
19
+ primary_key: true
20
+ ```
21
+
22
+ ### Composite PK (no natural single key)
23
+
24
+ Use a synthetic `CONCAT` or string concatenation dimension:
25
+
26
+ ```yaml
27
+ dimensions:
28
+ id:
29
+ sql: "${CUBE}.deal_id || '_' || ${CUBE}.contact_id"
30
+ type: string
31
+ primary_key: true
32
+ ```
33
+
34
+ Use `||` (SQL string concat) not `CONCAT()` — both work in BigQuery but `||` is
35
+ more portable. Always cast non-string parts:
36
+
37
+ ```yaml
38
+ sql: "${CUBE}.issue_id || '_' || CAST(${CUBE}.sprint_id AS STRING)"
39
+ ```
40
+
41
+ ### No natural PK
42
+
43
+ When no unique column exists, use `ROW_NUMBER()` in the cube `sql:` view or
44
+ document the absence clearly. Warn the user — Cube.js fan-out protection
45
+ depends on a correct PK.
46
+
47
+ ---
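A small Python illustration of why the composite key examples above put a separator between the parts. `composite_key` is a hypothetical helper that mimics the SQL `||` concatenation; it is not part of Cube or BigQuery.

```python
def composite_key(*parts: object, sep: str = "_") -> str:
    """Mimic `a || sep || CAST(b AS STRING)`: stringify each part, then join."""
    return sep.join(str(part) for part in parts)


# Without a separator, distinct (deal_id, contact_id) pairs can collide.
unsafe_a = composite_key(1, 23, sep="")
unsafe_b = composite_key(12, 3, sep="")

# With the "_" separator the same pairs stay distinct.
safe_a = composite_key(1, 23)
safe_b = composite_key(12, 3)
```

Both unsafe keys come out as `123`, whereas the separated keys are `1_23` and `12_3`. A colliding primary key is the kind of error the fan-out warning above is about.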
48
+
49
+ ## FK type casting
50
+
51
+ BigQuery enforces strict type matching in JOINs. Common mismatches:
52
+
53
+ | Situation | Fix |
54
+ | --------------------------- | ----------------------------------------------------- |
55
+ | `id` is STRING, FK is INT64 | `CAST(fk_col AS STRING) = id` |
56
+ | `id` is INT64, FK is STRING | `SAFE_CAST(id AS INT64) = fk_col` |
57
+ | Both sides uncertain | `SAFE_CAST(... AS STRING) = SAFE_CAST(... AS STRING)` |
58
+ | JSON object storing ID | `JSON_VALUE(col, '$.id')` |
59
+
60
+ Use `SAFE_CAST` (not `CAST`) when the FK can contain non-numeric values —
61
+ `SAFE_CAST` returns NULL on failure instead of throwing.
62
+
63
+ ---
64
+
65
+ ## JSON column patterns
66
+
67
+ ### Extracting a scalar value
68
+
69
+ ```sql
70
+ -- From a top-level field
71
+ JSON_VALUE(col, '$.fieldName')
72
+
73
+ -- From a nested object
74
+ JSON_VALUE(col, '$.parent.child.id')
75
+ ```
76
+
77
+ ### Extracting an array of scalars (for UNNEST)
78
+
79
+ ```sql
80
+ -- Array of plain strings/numbers (association IDs):
81
+ UNNEST(JSON_VALUE_ARRAY(col)) AS element
82
+
83
+ -- Array of JSON objects (pipeline stages):
84
+ UNNEST(JSON_QUERY_ARRAY(col)) AS obj
85
+ -- then: JSON_VALUE(obj, '$.fieldName')
86
+ ```
87
+
88
+ Rule of thumb: `JSON_VALUE_ARRAY` for scalar arrays, `JSON_QUERY_ARRAY` for object arrays.
89
+
90
+ ---
91
+
92
+ ## sql_table vs sql
93
+
94
+ | Approach | When to use |
95
+ | ---------------------------------------------- | -------------------------------------- |
96
+ | `sql_table: "\`<dataset>.<table>\`"` | Raw table, no transformation needed |
97
+ | `sql: "SELECT ... FROM \`<dataset>.<table>\`"` | Derived view (UNNEST, JOIN, aggregate) |
98
+
99
+ Always wrap BigQuery table names in backticks inside YAML. Backticks are ordinary
100
+ characters in YAML and need no escaping, whether in double-quoted strings or in block scalars (`>` or `|`); a backslash before a backtick is not a valid YAML escape sequence:
101
+
102
+ ```yaml
103
+ # Double-quoted: backticks are written as-is:
104
+ sql_table: "`my_project.my_dataset.my_table`"
105
+
106
+ # Inside sql block scalar — no escaping:
107
+ sql: >
108
+ SELECT id FROM `my_project.my_dataset.my_table`
109
+ ```
110
+
111
+ ---
112
+
113
+ ## refresh_key patterns
114
+
115
+ Priority order for the timestamp column:
116
+
117
+ 1. `_airbyte_extracted_at` — present on all Airbyte-synced tables
118
+ 2. `updated_at` / `modified_at` / `lastModifiedDate` — CDC streams
119
+ 3. `created_at` — insert-only facts
120
+ 4. `every: 1 hour` — only when **no timestamp column exists**, with a YAML comment
121
+
122
+ ```yaml
123
+ # Pattern 1 (preferred):
124
+ refresh_key:
125
+ sql: "SELECT MAX(_airbyte_extracted_at) FROM `<dataset>.<table>`"
126
+
127
+ # Pattern 4 (last resort):
128
+ refresh_key:
129
+ every: 1 hour # no timestamp column available in this table
130
+ ```
131
+
132
+ For derived cubes (`sql:` based, not `sql_table:`), the refresh key should
133
+ reference the **underlying source table**, not the derived view:
134
+
135
+ ```yaml
136
+ # Bridge cube derived from deals:
137
+ refresh_key:
138
+ sql: "SELECT MAX(_airbyte_extracted_at) FROM `<dataset>.<prefix>deals`"
139
+ ```
140
+
141
+ ---
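The priority order for `refresh_key` could be expressed as a small selection function. This is a sketch under assumptions: `pick_refresh_key` and the dict shape it returns are hypothetical, chosen to mirror the YAML patterns above rather than any actual `revos` or Cube API.

```python
# Timestamp columns in the priority order listed above.
TIMESTAMP_PRIORITY = (
    "_airbyte_extracted_at",
    "updated_at",
    "modified_at",
    "lastModifiedDate",
    "created_at",
)


def pick_refresh_key(columns, table):
    """Return a refresh_key mapping for `table` given its column names."""
    available = set(columns)
    for column in TIMESTAMP_PRIORITY:
        if column in available:
            return {"sql": f"SELECT MAX({column}) FROM `{table}`"}
    # Last resort: no timestamp column exists in the table.
    return {"every": "1 hour"}


preferred = pick_refresh_key(["id", "created_at", "_airbyte_extracted_at"], "ds.deals")
fallback = pick_refresh_key(["id", "name"], "ds.lookup")
```

For a derived cube, the caller would pass the underlying source table rather than the derived view, in line with the rule above.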
142
+
143
+ ## Common dimension types in Cube.dev
144
+
145
+ | BigQuery type | Cube type | Notes |
146
+ | ------------------------- | --------- | ------------------------------------------ |
147
+ | STRING | `string` | default for most IDs, names |
148
+ | INT64, FLOAT64, NUMERIC | `number` | use for metrics |
149
+ | BOOL | `boolean` | |
150
+ | TIMESTAMP, DATETIME, DATE | `time` | enables time drill-downs |
151
+ | JSON | `string` | expose extracted subfields individually |
152
+ | ARRAY | — | use UNNEST in a bridge cube or `sql:` view |
153
+
154
+ ---
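The type table above amounts to a lookup. A minimal Python sketch follows, assuming a hypothetical `cube_dimension_type` helper; returning `None` for `ARRAY` reflects the note that arrays need an `UNNEST` in a bridge cube or `sql:` view instead of a direct dimension.

```python
import re
from typing import Optional

CUBE_TYPE_BY_BIGQUERY_TYPE = {
    "STRING": "string",
    "INT64": "number",
    "FLOAT64": "number",
    "NUMERIC": "number",
    "BOOL": "boolean",
    "TIMESTAMP": "time",
    "DATETIME": "time",
    "DATE": "time",
    "JSON": "string",  # expose extracted subfields individually
}


def cube_dimension_type(bigquery_type: str) -> Optional[str]:
    """Map a BigQuery column type to a Cube dimension type, or None if unmapped."""
    # Drop type parameters such as ARRAY<STRING> or NUMERIC(10, 2).
    base = re.split(r"[<(]", bigquery_type.strip().upper(), maxsplit=1)[0].strip()
    return CUBE_TYPE_BY_BIGQUERY_TYPE.get(base)
```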
155
+
156
+ ## Naming conventions
157
+
158
+ | Item | Convention |
159
+ | ----------------------- | ----------------------------------------- |
160
+ | Cube name | `gold_` prefix stripped; snake_case |
161
+ | File name | same as cube name + `.yml` |
162
+ | Dimension/measure names | snake_case |
163
+ | Computed dimensions | descriptive name, not `col_json_value` |
164
+ | Bridge cubes | `<entity_a>_to_<entity_b>` |
165
+ | Table aliases | `<entity>_<role>` (e.g. `users_assignee`) |
166
+
167
+ ---
168
+
169
+ ## BigQuery-specific SQL tips
170
+
171
+ ```sql
172
+ -- Safe division (avoid divide-by-zero)
173
+ SAFE_DIVIDE(numerator, denominator)
174
+
175
+ -- Null-safe equality
176
+ ${CUBE}.col IS NOT DISTINCT FROM other_col
177
+
178
+ -- Date truncation for time series
179
+ DATE_TRUNC(${CUBE}.created_at, MONTH)
180
+
181
+ -- String aggregation
182
+ STRING_AGG(${CUBE}.name, ', ')
183
+ ```
@@ -1,4 +1,4 @@
1
- # Cube Overlay Examples
1
+ # Cube Examples
2
2
 
3
3
  ## Table of Contents
4
4
 
@@ -50,7 +50,7 @@ Notes:
50
50
 
51
51
  1. Cube `name` is `hubspot_companies` (no `gold_` prefix).
52
52
  2. `sql_table` references `gold_hubspot_companies` (with `gold_` prefix), in backticks.
53
- 3. The join references `${companies_deals}` — the cube name of a bridge cube defined in `semantic/companies_deals.yml`.
53
+ 3. The join references `${companies_deals}` — the cube name of a bridge cube defined in `cubes/companies_deals.yml`.
54
54
  4. Only `_airbyte_extracted_at` is exposed from Airbyte metadata, as `airbyte_extracted_at`.
55
55
  5. `refresh_key.sql` uses the same fully qualified table name as `sql_table`.
56
56
 
@@ -0,0 +1,289 @@
1
+ # HubSpot Entities Reference
2
+
3
+ ## Table naming
4
+
5
+ Airbyte syncs HubSpot tables with a configurable prefix (default: `hubspot_`).
6
+ Inspect the BigQuery dataset to identify the actual prefix:
7
+
8
+ ```sql
9
+ SELECT table_name FROM `<dataset>.INFORMATION_SCHEMA.TABLES`
10
+ WHERE table_name LIKE '%companies%' OR table_name LIKE '%deals%'
11
+ ORDER BY table_name LIMIT 20;
12
+ ```
13
+
14
+ Throughout this document `<prefix>` is a placeholder for that prefix (e.g. `hubspot_`).
15
+
16
+ ---
17
+
18
+ ## Primary entities
19
+
20
+ | Cube name | BigQuery table | PK | Notes |
21
+ | ------------------------ | ------------------------ | ------------ | -------------------------------------------------- |
22
+ | `<prefix>companies` | `<prefix>companies` | `id` | `properties_name` is the display name |
23
+ | `<prefix>contacts` | `<prefix>contacts` | `id` | `properties_hs_full_name_or_email` is display name |
24
+ | `<prefix>deals` | `<prefix>deals` | `id` | `properties_dealname` is display name |
25
+ | `<prefix>tickets` | `<prefix>tickets` | `id` | — |
26
+ | `<prefix>owners` | `<prefix>owners` | `id` | Display name: `firstName \|\| ' ' \|\| lastName` |
27
+ | `<prefix>engagements` | `<prefix>engagements` | `id` | See engagement sub-types below |
28
+ | `<prefix>deal_pipelines` | `<prefix>deal_pipelines` | `pipelineId` | Stages stored as JSON array |
29
+ | `<prefix>line_items` | `<prefix>line_items` | `id` | `properties_name` |
30
+ | `<prefix>products` | `<prefix>products` | `id` | `properties_name` |
31
+
32
+ **Owner join pattern (shared by companies, contacts, deals, tickets):**
33
+
34
+ ```yaml
35
+ joins:
36
+ <prefix>owners:
37
+ relationship: many_to_one
38
+ sql: "${CUBE}.properties_hubspot_owner_id = ${<prefix>owners.id}"
39
+ ```
40
+
41
+ ---
42
+
43
+ ## Bridge / junction cubes (public: false)
44
+
45
+ HubSpot stores many-to-many associations as JSON arrays on the primary object.
46
+ Bridge cubes are required to join across these associations. They must be
47
+ `public: false` and use a composite PK.
48
+
49
+ ### Association columns
50
+
51
+ | Source table | Column | Contains |
52
+ | --------------------- | ------------------------- | --------------------------------------- |
53
+ | `<prefix>deals` | `companies` | JSON array of company IDs |
54
+ | `<prefix>deals` | `contacts` | JSON array of contact IDs |
55
+ | `<prefix>deals` | `line_items` | JSON array of line item IDs |
56
+ | `<prefix>deals` | `deals` | JSON array (for tickets→deals) |
57
+ | `<prefix>tickets` | `companies` | JSON array of company IDs |
58
+ | `<prefix>tickets` | `contacts` | JSON array of contact IDs |
59
+ | `<prefix>tickets` | `deals` | JSON array of deal IDs (CAST to STRING) |
60
+ | `<prefix>companies` | `contacts` | JSON array of contact IDs |
61
+ | `<prefix>engagements` | `associations.contactIds` | JSON array of contact IDs |
62
+ | `<prefix>engagements` | `associations.companyIds` | JSON array of company IDs |
63
+ | `<prefix>engagements` | `associations.dealIds` | JSON array of deal IDs |
64
+
65
+ ### Bridge cube: companies_to_deals
66
+
67
+ ```yaml
68
+ name: <prefix>companies_to_deals
69
+ sql: >
70
+ SELECT DISTINCT d.id as deal_id, company_id
71
+ FROM `<dataset>.<prefix>deals` d,
72
+ UNNEST(JSON_VALUE_ARRAY(d.companies)) company_id
73
+ public: false
74
+ dimensions:
75
+ id:
76
+ sql: "${CUBE.company_id} || '_' || ${CUBE.deal_id}"
77
+ type: string
78
+ primary_key: true
79
+ company_id:
80
+ sql: "${CUBE}.company_id"
81
+ type: string
82
+ deal_id:
83
+ sql: "${CUBE}.deal_id"
84
+ type: string
85
+ joins:
86
+ <prefix>companies:
87
+ relationship: many_to_one
88
+ sql: "${CUBE}.company_id = ${<prefix>companies.id}"
89
+ <prefix>deals:
90
+ relationship: many_to_one
91
+ sql: "${CUBE}.deal_id = ${<prefix>deals.id}"
92
+ refresh_key:
93
+ sql: "SELECT MAX(_airbyte_extracted_at) FROM `<dataset>.<prefix>deals`"
94
+ ```
95
+
96
+ ### Bridge cube: companies_to_tickets
97
+
98
+ Same pattern — UNNEST `tickets.companies`:
99
+
100
+ ```yaml
101
+ name: <prefix>companies_to_tickets
102
+ sql: >
103
+ SELECT DISTINCT t.id as ticket_id, company_id
104
+ FROM `<dataset>.<prefix>tickets` t,
105
+ UNNEST(JSON_VALUE_ARRAY(t.companies)) company_id
106
+ ```
107
+
108
+ ### Bridge cube: deals_to_tickets
109
+
110
+ Note: ticket `deals` column values are numbers — cast to STRING:
111
+
112
+ ```yaml
113
+ name: <prefix>deals_to_tickets
114
+ sql: >
115
+ SELECT DISTINCT t.id AS ticket_id, CAST(deal_id AS STRING) AS deal_id
116
+ FROM `<dataset>.<prefix>tickets` t,
117
+ UNNEST(JSON_VALUE_ARRAY(t.deals)) AS deal_id
118
+ ```
119
+
120
+ ### Bridge cube: deals_to_line_items
121
+
122
+ ```yaml
123
+ name: <prefix>deals_to_line_items
124
+ sql: >
125
+ SELECT DISTINCT d.id AS deal_id, line_item_id
126
+ FROM `<dataset>.<prefix>deals` d,
127
+ UNNEST(JSON_VALUE_ARRAY(d.line_items)) AS line_item_id
128
+ ```
129
+
130
+ ### Bridge cube: contacts_to_deals
131
+
132
+ ```yaml
133
+ name: <prefix>contacts_to_deals
134
+ sql: >
135
+ SELECT DISTINCT d.id AS deal_id, contact_id
136
+ FROM `<dataset>.<prefix>deals` d,
137
+ UNNEST(JSON_VALUE_ARRAY(d.contacts)) contact_id
138
+ ```
139
+
140
+ ### Bridge cube: contacts_to_tickets
141
+
142
+ ```yaml
143
+ name: <prefix>contacts_to_tickets
144
+ sql: >
145
+ SELECT DISTINCT t.id AS ticket_id, contact_id
146
+ FROM `<dataset>.<prefix>tickets` t,
147
+ UNNEST(JSON_VALUE_ARRAY(t.contacts)) contact_id
148
+ ```
149
+
150
+ ### Bridge cube: contacts_to_companies
151
+
152
+ Note: this uses `SAFE_CAST` on both sides — IDs can have type mismatches:
153
+
154
+ ```yaml
155
+ name: <prefix>contacts_to_companies
156
+ sql: >
157
+ SELECT DISTINCT c.id AS company_id, contact_id
158
+ FROM `<dataset>.<prefix>companies` c,
159
+ UNNEST(JSON_VALUE_ARRAY(c.contacts)) AS contact_id
160
+ joins:
161
+ <prefix>contacts:
162
+ relationship: many_to_one
163
+ sql: "SAFE_CAST(${CUBE}.contact_id AS STRING) = SAFE_CAST(${<prefix>contacts.id} AS STRING)"
164
+ <prefix>companies:
165
+ relationship: many_to_one
166
+ sql: "SAFE_CAST(${CUBE}.company_id AS STRING) = SAFE_CAST(${<prefix>companies.id} AS STRING)"
167
+ ```
168
+
169
+ ### Bridge cubes: engagements_to_contacts / companies / deals
170
+
171
+ Engagement IDs are integers — always CAST to STRING:
172
+
173
+ ```yaml
174
+ name: <prefix>engagements_to_contacts
175
+ sql: >
176
+ SELECT DISTINCT
177
+ CAST(e.id AS STRING) AS engagement_id,
178
+ CAST(contact_id AS STRING) AS contact_id
179
+ FROM `<dataset>.<prefix>engagements` e,
180
+ UNNEST(JSON_VALUE_ARRAY(e.associations.contactIds)) AS contact_id
181
+ ```
182
+
183
+ Same pattern for `companyIds` → `engagements_to_companies` and `dealIds` → `engagements_to_deals`.
184
+
185
+ Engagement join:
186
+
187
+ ```yaml
188
+ joins:
189
+ <prefix>engagements:
190
+ relationship: many_to_one
191
+ sql: "CAST(${CUBE}.engagement_id AS STRING) = CAST(${<prefix>engagements.id} AS STRING)"
192
+ ```
193
+
194
+ ---
195
+
196
+ ## Special cubes
197
+
198
+ ### deal_pipeline_stages
199
+
200
+ Derived from `deal_pipelines.stages` JSON array. Not a raw table — uses `sql:` not `sql_table:`.
201
+
202
+ ```yaml
203
+ name: <prefix>deal_pipeline_stages
204
+ sql: >
205
+ SELECT
206
+ JSON_VALUE(elem, '$.stageId') AS stage_id,
207
+ JSON_VALUE(elem, '$.label') AS label
208
+ FROM `<dataset>.<prefix>deal_pipelines`,
209
+ UNNEST(JSON_QUERY_ARRAY(stages)) AS elem
210
+ dimensions:
211
+ stage_id:
212
+ sql: "${CUBE}.stage_id"
213
+ type: string
214
+ primary_key: true
215
+ label:
216
+ sql: "${CUBE}.label"
217
+ type: string
218
+ joins:
219
+ <prefix>deals:
220
+ relationship: one_to_many
221
+ sql: "${CUBE}.stage_id = ${<prefix>deals.properties_dealstage}"
222
+ refresh_key:
223
+ sql: "SELECT MAX(_airbyte_extracted_at) FROM `<dataset>.<prefix>deal_pipelines`"
224
+ ```
225
+
226
+ Deals join to stages and pipelines:
227
+
228
+ ```yaml
229
+ joins:
230
+ <prefix>deal_pipeline_stages:
231
+ relationship: many_to_one
232
+ sql: "${CUBE}.properties_dealstage = ${<prefix>deal_pipeline_stages.stage_id}"
233
+ <prefix>deal_pipelines:
234
+ relationship: many_to_one
235
+ sql: "${CUBE}.properties_pipeline = ${<prefix>deal_pipelines.pipelineId}"
236
+ ```
237
+
238
+ ### engagements sub-types
239
+
240
+ The `engagements` table has sub-type tables: `engagements_calls`, `engagements_emails`,
241
+ `engagements_meetings`, `engagements_tasks`, `engagements_notes`.
242
+
243
+ Join pattern (one-to-one by ID with CAST):
244
+
245
+ ```yaml
246
+ # On the engagements cube:
247
+ joins:
248
+ <prefix>engagements_calls:
249
+ relationship: one_to_one
250
+ sql: "CAST(${CUBE}.id AS STRING) = ${<prefix>engagements_calls.id}"
251
+
252
+ # On each sub-type cube:
253
+ joins:
254
+ <prefix>engagements:
255
+ relationship: many_to_one
256
+ sql: "${CUBE}.id = CAST(${<prefix>engagements.id} AS STRING)"
257
+ ```
258
+
259
+ ---
260
+
261
+ ## Deals measures
262
+
263
+ ```yaml
264
+ measures:
265
+ count_closed:
266
+ type: count
267
+ filters:
268
+ - sql: "${CUBE}.properties_hs_is_closed = TRUE"
269
+ count_closed_won:
270
+ type: count
271
+ filters:
272
+ - sql: "${CUBE}.properties_hs_is_closed_won = TRUE"
273
+ count_closed_lost:
274
+ type: count
275
+ filters:
276
+ - sql: >
277
+ ${CUBE}.properties_hs_is_closed = TRUE
278
+ AND ${CUBE}.properties_hs_is_closed_won = FALSE
279
+ ```
280
+
281
+ ---
282
+
283
+ ## Common pitfalls
284
+
285
+ 1. **ID type mismatches** — HubSpot IDs are sometimes integers, sometimes strings. Use `SAFE_CAST` when unsure (especially contacts_to_companies). Engagement IDs are always integers → always CAST to STRING.
286
+ 2. **JSON_VALUE_ARRAY vs JSON_QUERY_ARRAY** — use `JSON_VALUE_ARRAY` when the array contains scalar strings/ints (association IDs); use `JSON_QUERY_ARRAY` when the array contains JSON objects (deal_pipelines stages).
287
+ 3. **deal_pipeline_stages is derived** — uses `sql:` not `sql_table:`. Cannot be used in `revos cubes preview` diff against Airbyte-generated cubes.
288
+ 4. **engagements bridge refresh_key** — use the parent engagement table timestamp, not the contact/company/deal table.
289
+ 5. **Prefix varies** — always confirm the actual prefix from BigQuery before writing cube files.
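The scalar-array bridge cubes in this reference share one SQL shape, which can be sketched as a generator. `bridge_cube_sql` and its parameters are hypothetical and only cover the plain `JSON_VALUE_ARRAY` case; the variants that need `CAST` or `SAFE_CAST` would still have to be handled separately.

```python
def bridge_cube_sql(dataset, prefix, source, array_column, source_key, element_key):
    """Build the SELECT for a bridge cube over a JSON array of scalar IDs."""
    table = f"`{dataset}.{prefix}{source}`"
    return (
        f"SELECT DISTINCT s.id AS {source_key}, {element_key}\n"
        f"FROM {table} s,\n"
        f"UNNEST(JSON_VALUE_ARRAY(s.{array_column})) AS {element_key}"
    )


# Example: the companies_to_deals bridge, assuming dataset "analytics"
# and the default "hubspot_" prefix.
sql = bridge_cube_sql("analytics", "hubspot_", "deals", "companies", "deal_id", "company_id")
```

Taking the prefix as a parameter follows pitfall 5: the actual prefix should be confirmed from BigQuery before it is passed in.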