strata-cli 0.1.8 → 0.1.10
- checksums.yaml +4 -4
- data/CHANGELOG.md +29 -0
- data/README.md +26 -2
- data/lib/strata/cli/agent_mode.rb +26 -0
- data/lib/strata/cli/agent_output.rb +21 -0
- data/lib/strata/cli/ai/services/table_generator.rb +35 -20
- data/lib/strata/cli/api/client.rb +23 -63
- data/lib/strata/cli/api/response_error_handler.rb +115 -0
- data/lib/strata/cli/error_reporter.rb +4 -1
- data/lib/strata/cli/generators/datasource.rb +4 -3
- data/lib/strata/cli/generators/group.rb +37 -0
- data/lib/strata/cli/generators/migration.rb +2 -1
- data/lib/strata/cli/generators/project.rb +18 -11
- data/lib/strata/cli/generators/relation.rb +2 -1
- data/lib/strata/cli/generators/table.rb +5 -8
- data/lib/strata/cli/generators/templates/AGENTS.md +457 -88
- data/lib/strata/cli/generators/templates/table.table_name.yml +8 -3
- data/lib/strata/cli/generators/test.rb +2 -1
- data/lib/strata/cli/guard.rb +4 -1
- data/lib/strata/cli/helpers/command_context.rb +8 -9
- data/lib/strata/cli/helpers/datasource_helper.rb +27 -1
- data/lib/strata/cli/helpers/description_helper.rb +2 -1
- data/lib/strata/cli/main.rb +12 -3
- data/lib/strata/cli/output.rb +103 -0
- data/lib/strata/cli/sub_commands/audit.rb +136 -16
- data/lib/strata/cli/sub_commands/branch.rb +165 -0
- data/lib/strata/cli/sub_commands/create.rb +13 -2
- data/lib/strata/cli/sub_commands/datasource.rb +21 -3
- data/lib/strata/cli/sub_commands/deploy.rb +16 -13
- data/lib/strata/cli/sub_commands/project.rb +6 -3
- data/lib/strata/cli/sub_commands/table.rb +11 -8
- data/lib/strata/cli/terminal.rb +7 -4
- data/lib/strata/cli/ui/field_editor.rb +21 -27
- data/lib/strata/cli/utils/deployment_monitor.rb +15 -34
- data/lib/strata/cli/utils/git.rb +78 -0
- data/lib/strata/cli/utils/import_manager.rb +4 -1
- data/lib/strata/cli/utils/test_reporter.rb +4 -32
- data/lib/strata/cli/utils/version_checker.rb +4 -8
- data/lib/strata/cli/utils.rb +3 -1
- data/lib/strata/cli/version.rb +1 -1
- data/lib/strata/cli.rb +4 -3
- metadata +6 -2
- data/lib/strata/cli/helpers/color_helper.rb +0 -103
|
@@ -1,136 +1,505 @@
|
|
|
1
1
|
# Strata Semantic Model Project
|
|
2
2
|
|
|
3
|
-
|
|
3
|
+
You are authoring a **semantic model** — version-controlled YAML in this repo (`models/tbl.*.yml`, `models/rel.*.yml`, datasources). That is not one-off SQL, warehouse ETL, or a standalone script.
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
| Term | Meaning in Strata |
|
|
6
|
+
|------|-------------------|
|
|
7
|
+
| **Semantic model** | What you edit in git — tables, fields, relationships, migrations |
|
|
8
|
+
| **Semantic layer** | What exists on the server **after deploy** — the governed dimensions, measures, and joins users query |
|
|
6
9
|
|
|
10
|
+
After **`strata deploy`**, this project’s model becomes (or updates) the **semantic layer** on the Strata server for that branch. Charts and reports in the **Strata web app** run against that layer, not against raw warehouse tables.
|
|
11
|
+
|
|
12
|
+
## What this model powers (end-to-end)
|
|
13
|
+
|
|
14
|
+
| Stage | What happens |
|
|
15
|
+
|-------|----------------|
|
|
16
|
+
| **You (agent + human)** | Edit the **semantic model** in `models/`; run `strata audit` |
|
|
17
|
+
| **Human** | `strata deploy` publishes the model to the **Strata server** (per git branch) |
|
|
18
|
+
| **Strata server** | Builds join universes, plans queries, runs SQL against datasources |
|
|
19
|
+
| **Strata web app** | Analysts query the deployed **semantic layer** — charting, reporting, filters, exploration by field **name** |
|
|
20
|
+
|
|
21
|
+
Field and table **names**, joins, and measure definitions affect **live dashboards and saved reports**, not only the CLI or git. Broken naming, duplicate names, or bad joins can break production content after deploy. The YAML is consumed by deploy on the server; it is **not** ad-hoc warehouse SQL run in isolation.
|
|
22
|
+
|
|
23
|
+
Read this file end-to-end before creating or editing `models/**/*.yml`. Do not mirror warehouse DDL or ad-hoc SQL style; follow the contracts below.
|
|
24
|
+
|
|
25
|
+
**Do not infer Strata behavior from warehouse SQL or generic BI tools.** Rules in this file are authoritative. Use the [official documentation index](#official-documentation) only when you need depth beyond what is inlined here.
|
|
26
|
+
|
|
27
|
+
## Critical rules
|
|
28
|
+
|
|
29
|
+
These are non-negotiable. Violations break deploy, query planning, or production reports.
|
|
30
|
+
|
|
31
|
+
1. **Every field `name` is unique project-wide** — one name = one dimension or measure entity across all `tbl.*.yml` files. Strata has no `table.field` namespaces; bracket references like `[Total Revenue]` resolve by name alone.
|
|
32
|
+
2. **`many_to_many` join cardinality is not supported.** Use a junction/bridge table with two relationships (`one_to_many` + `many_to_one`) instead.
|
|
33
|
+
3. **Measures on unrelated detail facts must have distinct names** — e.g. `Store Gross Sales`, `Catalog Gross Sales`, not one `Gross Sales` on three channel facts. See [Naming conventions](#naming-conventions).
|
|
34
|
+
4. **Dimensions may share names** when the business role is the same; Strata picks among tables at query time. **Measures do not combine that way** — duplicate measure names attach more SQL to the same measure entity (double-count risk).
|
|
35
|
+
5. **Cross-fact totals use compound measures or blending** — not the same measure name on multiple facts. See [Combined metrics across facts](#combined-metrics-across-facts).
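
As a sketch of rule 2, a many-to-many relationship can be modeled through a junction table with two rel entries. The table names (`Students`, `Enrollments`, `Courses`), the datasource key, the file name, and the column names are hypothetical; only the `left` / `right` / `sql` / `cardinality` keys come from the relationship format described later in this file.

```yaml
# models/rel.enrollments.yml (hypothetical file, table, and column names)
datasource: "warehouse"

students_to_enrollments:
  left: "Students"        # one side
  right: "Enrollments"    # junction table, many side
  sql: "left.student_id = right.student_id"
  cardinality: one_to_many

enrollments_to_courses:
  left: "Enrollments"     # junction table, many side
  right: "Courses"        # one side
  sql: "left.course_id = right.course_id"
  cardinality: many_to_one
```

Each `(left, right)` pair appears once, so the pattern also stays within the one-join-per-table-pair contract.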
|
|
36
|
+
|
|
37
|
+
## Unsupported or impossible requirements
|
|
38
|
+
|
|
39
|
+
**If the user asks for something Strata does not support, say so clearly and do not edit `models/**/*.yml` to approximate it.** A broken or misleading semantic model is worse than no change.
|
|
40
|
+
|
|
41
|
+
### Before writing YAML
|
|
42
|
+
|
|
43
|
+
1. **Classify the request:** supported in Strata · supported with a different pattern · unsupported · unknown.
|
|
44
|
+
2. **If unsupported or unknown:** explain the limitation, why a workaround would fail (deploy, double-count, invalid cardinality, invented keys), and offer **compliant alternatives** if any exist.
|
|
45
|
+
3. **Wait for the user to choose** a supported approach (or to change the requirement) before proposing file changes.
|
|
46
|
+
4. **Never ship “best effort” YAML** that you expect `audit` or deploy to reject, or that violates [Critical rules](#critical-rules) — fix the design in conversation first.
|
|
47
|
+
|
|
48
|
+
### What to tell the user
|
|
49
|
+
|
|
50
|
+
Use plain language, for example:
|
|
51
|
+
|
|
52
|
+
- “Strata does not support X. Doing Y in YAML would not work because …”
|
|
53
|
+
- “Supported alternative: … (trade-off: …)”
|
|
54
|
+
- “I’m not sure Strata supports X; I won’t change the model until we confirm — see [official documentation](#official-documentation) or your Strata admin.”
|
|
55
|
+
|
|
56
|
+
### Do not do these when blocked
|
|
57
|
+
|
|
58
|
+
- Invent YAML keys or properties not in Strata’s schema (`custom_join`, `blend_measures`, etc.).
|
|
59
|
+
- Use duplicate measure names, fake dimensions, or `many_to_many` joins as workarounds.
|
|
60
|
+
- Paste full ad-hoc report SQL into `expression.sql` and call it a semantic model.
|
|
61
|
+
- Rename or migrate entities to “make” an unsupported shape fit.
|
|
62
|
+
- Silently implement something that only works in raw warehouse SQL, not in Strata’s planner.
|
|
63
|
+
|
|
64
|
+
### Common unsupported asks → response pattern
|
|
65
|
+
|
|
66
|
+
| User ask | Why it fails | Compliant direction (if any) |
|
|
67
|
+
|----------|--------------|------------------------------|
|
|
68
|
+
| `many_to_many` join between two tables | Not supported | Junction table + two `rel` entries |
|
|
69
|
+
| One measure name on store + catalog + web facts | One measure entity, double-count risk | Distinct measures + [compound measure](#combined-metrics-across-facts) |
|
|
70
|
+
| “Combine metrics by reusing the same name” | Not how Strata merges facts | Compound measure or blending on shared dimensions |
|
|
71
|
+
| Cross-datasource compound measure | Same datasource required | Model per datasource or separate metrics |
|
|
72
|
+
| Datasource swap migration | Not supported | Rename datasource; human migration plan |
|
|
73
|
+
| Snapshot-style metric on a flow fact without `snapshot_date` | Wrong measure type | Normal additive measure, or proper snapshot table setup |
|
|
74
|
+
| Requirement needs an engine feature you cannot verify | Risk of invalid model | Stop; cite docs or ask a human. Do not guess |
|
|
75
|
+
|
|
76
|
+
When `strata audit all --agent` reports errors you cannot resolve within Strata’s rules, **stop and report the errors to the user**. Do not keep patching YAML; repeated patches only move the model further from a valid design.
|
|
77
|
+
|
|
78
|
+
## Order of operations for agents
|
|
79
|
+
|
|
80
|
+
Follow this sequence. Do not skip steps or reorder deploy before audit.
|
|
81
|
+
|
|
82
|
+
| Step | Who | Action |
|
|
83
|
+
|------|-----|--------|
|
|
84
|
+
| 1 | Agent | `strata datasource list --agent` → `tables` → `meta` (discovery) |
|
|
85
|
+
| 2 | Agent | `strata table list --agent` (existing models, if any) |
|
|
86
|
+
| 3 | Agent | Classify tables; check request is [supported](#unsupported-or-impossible-requirements); propose field names and joins (or explain why not) |
|
|
87
|
+
| 4 | Human | Approve proposal (required before any file writes) |
|
|
88
|
+
| 5 | Agent | Write `models/**/*.yml` (and `migrations/*.yml` if renaming — see [Renames and swaps](#renames-swaps-and-production-branch)) |
|
|
89
|
+
| 6 | Agent | `strata audit all --agent` — **required after every model change** |
|
|
90
|
+
| 7 | Human | `strata deploy` on the intended git branch (deploy runs audits again) |
|
|
91
|
+
|
|
92
|
+
**Agents must not run** `strata deploy`, `strata datasource add`, or the interactive `strata create table` / `strata create relation` / `strata create migration` generators, and **must not write** secrets into `.strata`.
|
|
93
|
+
|
|
94
|
+
**Branches:** Model on feature branches; deploy to the matching Strata server branch for staging. Production deploys typically use `production_branch` in `project.yml` (default `main`). Renames and swaps that affect production query names should happen on the production branch — see [Renames and swaps](#renames-swaps-and-production-branch).
|
|
95
|
+
|
|
96
|
+
## How the semantic layer maps to the warehouse
|
|
97
|
+
|
|
98
|
+
| Concept | In your YAML | Not the same as |
|
|
99
|
+
|--------|----------------|-----------------|
|
|
100
|
+
| Table users query | `name` in `tbl.*.yml` | `physical_name` (warehouse table) |
|
|
101
|
+
| Column in a table | `expression.sql` on each field | Bare column string or `physical_name.column` |
|
|
102
|
+
| Join endpoints | `left` / `right` (logical table names) | Table names inside join `sql` |
|
|
103
|
+
| Join condition | `sql` using `left.` and `right.` | `store_sales.col = date_dim.col` |
|
|
104
|
+
| Second FK to same dimension | New logical `tbl` (role-playing) + join to that name | Two `rel` entries with same `left` + `right` |
|
|
105
|
+
|
|
106
|
+
## Deploy contracts (required on every change)
|
|
107
|
+
|
|
108
|
+
Every `tbl` field and every `rel` join must satisfy **all** of these before deploy will succeed:
|
|
109
|
+
|
|
110
|
+
1. **Field `expression`** is a non-empty SQL string shortcut (`expression: order_id`, `expression: sum(amount)`) or a mapping with non-empty `sql`. Use the mapping form when you need `lookup`, `primary_key`, or `array`.
|
|
111
|
+
2. **Join `sql` uses only `left.` and `right.`** — column names from field definitions on those tables. Never warehouse prefixes (`orders.`, `date_dim.`, `store_sales.`).
|
|
112
|
+
3. **At most one join per logical table pair** in a branch. If a fact has two FKs to the same physical dimension (e.g. sold date and ship date both → `date_dim`), create **separate logical tables** (role-playing dimensions) and join to each distinct `name`.
|
|
113
|
+
4. **`left` / `right` in rel files match `name` in `tbl.*.yml` exactly** (same spelling and casing as the table model).
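
To illustrate contract 1, here is a minimal sketch of the two accepted `expression` forms side by side. The `Order Status` field and its `order_status` column are hypothetical; `Order ID` mirrors the table-model example later in this file.

```yaml
fields:
  # Shortcut form: the expression is a plain, non-empty SQL string
  - type: dimension
    name: "Order Status"      # hypothetical field
    data_type: string
    expression: order_status

  # Mapping form: needed when setting lookup, primary_key, or array
  - type: dimension
    name: "Order ID"
    data_type: bigint
    expression:
      sql: order_id
      primary_key: true
```

The shortcut keeps simple fields compact; switch to the mapping form as soon as a field needs any of the optional flags.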
|
|
114
|
+
|
|
115
|
+
Run `strata audit all --agent` after edits; a human runs `strata deploy`.
|
|
116
|
+
|
|
117
|
+
## CLI for agents
|
|
118
|
+
|
|
119
|
+
Append `--agent` to every `strata` command (structured output, no prompts):
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
strata datasource list --agent
|
|
123
|
+
strata datasource tables MY_DS --agent
|
|
124
|
+
strata datasource meta MY_DS TABLE_NAME --agent
|
|
125
|
+
strata table list --agent
|
|
126
|
+
strata audit all --agent
|
|
7
127
|
```
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
128
|
+
|
|
129
|
+
**Do not run:** `strata create table`, `strata create relation` (interactive generators). Write `models/tbl.*.yml` and `models/rel.*.yml` directly instead.
|
|
130
|
+
|
|
131
|
+
If multiple datasources exist, pass the datasource key (`DS_KEY`, shown as `MY_DS` above) to `strata datasource tables` and `strata datasource meta`.
|
|
132
|
+
|
|
133
|
+
## Datasources — human-in-the-loop (secrets)
|
|
134
|
+
|
|
135
|
+
If the CLI returns `no_datasources`, ask the user to run `strata datasource add [ADAPTER]` in a terminal. Do not create `datasources.yml` entries or fill `.strata` credentials yourself.
|
|
136
|
+
|
|
137
|
+
## Human-in-the-loop before file writes
|
|
138
|
+
|
|
139
|
+
Do **not** silently bulk-write model files. Before any `models/**/*.yml` change:
|
|
140
|
+
|
|
141
|
+
0. **If the request is unsupported** — follow [Unsupported or impossible requirements](#unsupported-or-impossible-requirements); do not proceed to file writes until the user picks a supported approach.
|
|
142
|
+
1. **Describe intent** — tables, fields, joins, and role-playing tables if needed (e.g. separate **Catalog Ship Date** when catalog sales has both sold-date and ship-date FKs).
|
|
143
|
+
2. **Explain semantic impact** — which **dimension** names are intentionally shared vs which **measure** names stay **distinct per fact**; new join paths; any proposed **compound measures** (including **which host table** defines dimensionality); any **migrations** (renames/swaps) and whether work is on **production branch**; which logical tables are created vs updated.
|
|
144
|
+
3. **Wait for explicit approval**, then write files and run `strata audit all --agent`.
|
|
145
|
+
|
|
146
|
+
## Workflow
|
|
147
|
+
|
|
148
|
+
Same steps as [Order of operations for agents](#order-of-operations-for-agents). In the proposal (steps 3–4), include:
|
|
149
|
+
|
|
150
|
+
- Table classification: **fact**, **dimension**, or **aggregate/summary**
|
|
151
|
+
- **Unique measure name per unrelated detail fact**; shared dimension names only where the role matches
|
|
152
|
+
- `tbl.*.yml` and `rel.*.yml` drafts that satisfy [Deploy contracts](#deploy-contracts-required-on-every-change) and naming rules below
|
|
153
|
+
|
|
154
|
+
---
|
|
155
|
+
|
|
156
|
+
## Table models — `models/tbl.<name>.yml`
|
|
157
|
+
|
|
158
|
+
```yaml
|
|
159
|
+
datasource: "<datasource_key_or_name>"
|
|
160
|
+
name: "<logical_name>" # Required — used in rel left/right and queries
|
|
161
|
+
physical_name: "<warehouse_table>" # Required — actual table in the database
|
|
162
|
+
cost: 10
|
|
163
|
+
|
|
164
|
+
fields:
|
|
165
|
+
- type: dimension
|
|
166
|
+
name: "Order ID"
|
|
167
|
+
data_type: bigint
|
|
168
|
+
expression:
|
|
169
|
+
sql: order_id # Column or expression fragment for this table
|
|
170
|
+
lookup: true # Typical for filterable dimensions
|
|
171
|
+
primary_key: true # Optional — business/surrogate key
|
|
172
|
+
|
|
173
|
+
- type: measure
|
|
174
|
+
name: "Store Revenue" # Unique per fact — not reused on other fact tables
|
|
175
|
+
data_type: decimal
|
|
176
|
+
expression:
|
|
177
|
+
sql: sum(ss_sales_price) # Aggregation required for measures
|
|
15
178
|
```
|
|
16
179
|
|
|
17
|
-
|
|
180
|
+
**Field rules**
|
|
18
181
|
|
|
19
|
-
|
|
182
|
+
- `data_type`: `string`, `integer`, `bigint`, `decimal`, `date`, `date_time`, `boolean` (map from `normalized_data_type` in `datasource meta`).
|
|
183
|
+
- `expression` keys: `sql` (required), `lookup`, `primary_key`, `array` (optional booleans).
|
|
184
|
+
- Optional: `description`, `grains` (date/time), `format`, `synonyms`, `imports` of other tbl files.
|
|
20
185
|
|
|
21
|
-
|
|
186
|
+
**Date/time grains** (optional):
|
|
22
187
|
|
|
23
|
-
```
|
|
24
|
-
|
|
25
|
-
strata datasource list # List configured datasources
|
|
26
|
-
strata datasource tables [DS_KEY] # Browse physical tables in a datasource
|
|
27
|
-
strata datasource meta DS_KEY TABLE_NAME # Show columns for a physical table
|
|
188
|
+
```yaml
|
|
189
|
+
grains: [day, week, month, quarter, year]
|
|
28
190
|
```
|
|
29
191
|
|
|
30
|
-
|
|
192
|
+
---
|
|
31
193
|
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
# sales/orders → models/sales/tbl.orders.yml
|
|
37
|
-
# dw.fact_orders → models/tbl.dw.fact_orders.yml (schema-prefixed)
|
|
194
|
+
## Relationship models — `models/rel.<name>.yml`
|
|
195
|
+
|
|
196
|
+
```yaml
|
|
197
|
+
datasource: "<datasource_key_or_name>"
|
|
38
198
|
|
|
39
|
-
|
|
199
|
+
orders_to_customers:
|
|
200
|
+
left: "Orders" # Must match tbl name — many side for many_to_one
|
|
201
|
+
right: "Customers" # Must match tbl name — one side
|
|
202
|
+
sql: "left.customer_id = right.id"
|
|
203
|
+
cardinality: many_to_one
|
|
40
204
|
```
|
|
41
205
|
|
|
42
|
-
|
|
206
|
+
**Join rules**
|
|
43
207
|
|
|
44
|
-
|
|
208
|
+
- YAML `left` / `right`: logical table `name` values from `tbl.*.yml`.
|
|
209
|
+
- `sql`: equality using **`left.<column>`** and **`right.<column>`** only — use the same column names as in each table’s field `expression.sql`.
|
|
210
|
+
- `cardinality`: `many_to_one`, `one_to_many`, or `one_to_one` (as appropriate).
|
|
211
|
+
- Optional: `join: inner` (default) or `left` / `right`.
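
A possible rel entry using the optional `join` key, reusing the `orders_to_customers` example above. The placement of `join` next to `cardinality` and the outer-join behavior noted in the comment are assumptions; confirm both with `strata audit all --agent` before relying on them.

```yaml
datasource: "<datasource_key_or_name>"

orders_to_customers:
  left: "Orders"
  right: "Customers"
  sql: "left.customer_id = right.id"
  cardinality: many_to_one
  join: left    # assumption: keeps Orders rows that have no matching Customers row
```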
|
|
45
212
|
|
|
46
|
-
|
|
47
|
-
|
|
48
|
-
|
|
49
|
-
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
53
|
-
|
|
54
|
-
|
|
213
|
+
### Role-playing dimensions (multiple FKs to one physical table)
|
|
214
|
+
|
|
215
|
+
When one fact table has **more than one foreign key** to the same physical dimension (common with `date_dim`: sold date, ship date, etc.), you need **one logical table per role**, each with its own join:
|
|
216
|
+
|
|
217
|
+
| Wrong | Right |
|
|
218
|
+
|-------|--------|
|
|
219
|
+
| Two rel entries: `Catalog Sales` → `Date` (sold) and `Catalog Sales` → `Date` (ship) | `Catalog Sales` → `Date` (sold) and `Catalog Sales` → `Catalog Ship Date` (ship) |
|
|
220
|
+
| Single `tbl.date.yml` reused for both roles without planning | Add `tbl.catalog_ship_date.yml` with `name: Catalog Ship Date`, same `physical_name: date_dim`, fields aligned to date role |
|
|
221
|
+
|
|
222
|
+
Example pattern:
|
|
223
|
+
|
|
224
|
+
```yaml
|
|
225
|
+
# tbl — two logical tables, one physical date_dim
|
|
226
|
+
# tbl.date.yml → name: Date
|
|
227
|
+
# tbl.catalog_ship_date.yml → name: Catalog Ship Date, physical_name: date_dim
|
|
228
|
+
|
|
229
|
+
# rel — distinct right table per FK
|
|
230
|
+
catalog_sales_sold_date:
|
|
231
|
+
left: "Catalog Sales"
|
|
232
|
+
right: "Date"
|
|
233
|
+
sql: "left.cs_sold_date_sk = right.d_date_sk"
|
|
234
|
+
cardinality: many_to_one
|
|
235
|
+
|
|
236
|
+
catalog_sales_ship_date:
|
|
237
|
+
left: "Catalog Sales"
|
|
238
|
+
right: "Catalog Ship Date"
|
|
239
|
+
sql: "left.cs_ship_date_sk = right.d_date_sk"
|
|
240
|
+
cardinality: many_to_one
|
|
241
|
+
```
|
|
242
|
+
|
|
243
|
+
Before adding rel entries, list each fact’s FKs and confirm **no duplicate `(left, right)` pairs**. If ship and sold both target `Date`, add a role-playing tbl first.
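
For reference, a role-playing table model for the ship-date role might look like the sketch below. The field names and the `d_date` column are assumptions for illustration; `d_date_sk` is the column the `catalog_sales_ship_date` join above refers to as `right.d_date_sk`, so it needs a field definition on this table.

```yaml
# models/tbl.catalog_ship_date.yml
datasource: "<datasource_key_or_name>"
name: "Catalog Ship Date"          # logical name used as `right` in the rel
physical_name: "date_dim"          # same physical table as tbl.date.yml

fields:
  - type: dimension
    name: "Catalog Ship Date Key"  # hypothetical field name
    data_type: bigint
    expression:
      sql: d_date_sk
      primary_key: true

  - type: dimension
    name: "Catalog Ship Date"      # role-specific name, distinct from "Date"
    data_type: date
    expression:
      sql: d_date                  # assumed calendar-date column on date_dim
      lookup: true
```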
|
|
244
|
+
|
|
245
|
+
---
|
|
246
|
+
|
|
247
|
+
## Naming conventions
|
|
248
|
+
|
|
249
|
+
### Dimensions (global — reuse when the role is the same)
|
|
250
|
+
|
|
251
|
+
- **Dimension names are global** across the project — same name on multiple tables = same concept; Strata picks the best table at query time.
|
|
252
|
+
- **Reuse consistent names** for shared dimensions (e.g. "Customer ID", "Date") where the role is the same.
|
|
253
|
+
- **Prefix role-specific dimensions** when the same physical column means different things (e.g. "Catalog Ship Date" vs "Date").
|
|
254
|
+
- **Prefix ambiguous attributes** on dimension tables (e.g. "Item Color" not "Color" on an item dimension).
|
|
255
|
+
|
|
256
|
+
### Measures (unique — do not treat like dimensions)
|
|
257
|
+
|
|
258
|
+
- **Project-wide uniqueness:** every measure `name` must be unique across the entire semantic layer (same rule as dimensions). There is only one `Semantic::Field` per name per branch; deploy reuses it when the same name appears in another `tbl.*.yml`.
|
|
259
|
+
- **Default:** one measure name per **unrelated detail fact**. Example (wrong): `Gross Sales` on `store_sales`, `catalog_sales`, and `web_sales`. Example (right): `Store Gross Sales Amount`, `Catalog Gross Sales Amount`, `Web Gross Sales Amount`.
|
|
260
|
+
- **One measure name = one measure entity.** Putting the same name on another fact table adds that table’s SQL to the **same** measure — it does **not** create a second metric and can cause **double counting** or ambiguous routing.
|
|
261
|
+
- **Exception — aggregate/summary tables only:** reusing a measure name is appropriate when the second table is an **alternate physical representation** of the **same** metric (rollup / pre-aggregate), routed via table **`cost`** and/or **`partitions`** — not when modeling three separate channel facts. Do **not** put the same additive flow measure on both a detail fact and a rollup fact without that routing design.
|
|
262
|
+
- **Similar warehouse column names do not imply one shared measure.** `ss_sales_price` and `cs_ext_sales_price` are different facts unless you have explicit rollup routing as above.
|
|
263
|
+
- **Prefix or qualify by fact/table** (`Store …`, `Catalog …`, `Web …`) or use distinct business names (`order_revenue` vs `subscription_revenue`).
|
|
264
|
+
- **Do not “fix” duplicate measure names with rename migrations** — use distinct names or a [compound measure](#combined-metrics-across-facts) instead.
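
The aggregate/summary exception can be sketched as two table models that share one measure name and are routed by `cost`. The rollup table, its column name, and the specific cost values are hypothetical; the sketch assumes, as the quick reference in this file states, that a lower `cost` makes a table preferred when more than one can answer a query.

```yaml
# models/tbl.store_sales.yml (detail fact)
datasource: "warehouse"
name: "Store Sales"
physical_name: "store_sales"
cost: 10
fields:
  - type: measure
    name: "Store Gross Sales Amount"
    data_type: decimal
    expression:
      sql: sum(ss_sales_price)
---
# models/tbl.store_sales_daily.yml (hypothetical pre-aggregated rollup of the same metric)
datasource: "warehouse"
name: "Store Sales Daily"
physical_name: "store_sales_daily"
cost: 1    # lower cost, so preferred when it can answer the query
fields:
  - type: measure
    name: "Store Gross Sales Amount"   # same entity, alternate physical representation
    data_type: decimal
    expression:
      sql: sum(gross_sales_amount)
```

This only applies when both tables represent the same metric at different grains; separate channel facts still need distinct measure names.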
|
|
265
|
+
|
|
266
|
+
---
|
|
267
|
+
|
|
268
|
+
## Multi-fact warehouses (star schema)
|
|
269
|
+
|
|
270
|
+
Typical warehouses (e.g. TPC-DS on Athena) have **many fact tables** and shared dimensions. Before naming fields:
|
|
271
|
+
|
|
272
|
+
1. **Inventory tables** — label each as fact, dimension, or aggregate/summary (pre-aggregated rollups).
|
|
273
|
+
2. **One grain per fact** — e.g. `store_sales` = store channel transactions; do not model the same additive flow metric on both a **detail** fact and a **rollup** fact unless the rollup is explicitly non-additive or routed via table `cost` / `partitions` (see advanced links below).
|
|
274
|
+
3. **Do not copy a synthetic dimension onto every fact** (e.g. "Sales Channel" on store, catalog, and web facts) to justify reusing one measure name — that does **not** fix measure uniqueness or double-count risk. Put channel only where that fact’s grain truly includes channel.
|
|
275
|
+
|
|
276
|
+
---
|
|
277
|
+
|
|
278
|
+
## Combined metrics across facts
|
|
279
|
+
|
|
280
|
+
| Wrong | Right |
|
|
281
|
+
|-------|--------|
|
|
282
|
+
| Same measure name on `store_sales`, `catalog_sales`, `web_sales` | Distinct measure per fact |
|
|
283
|
+
| Expect Strata to auto-sum identical names | Explicit **compound measure** (or blending via shared **dimensions**, not duplicate measure names) |
|
|
284
|
+
|
|
285
|
+
To report a **total across channels/facts**, define separate measures per fact, then optionally one compound measure:
|
|
286
|
+
|
|
287
|
+
```yaml
|
|
288
|
+
# On one table (or where the compound is defined), same datasource:
|
|
289
|
+
- type: measure
|
|
290
|
+
name: Total Gross Sales
|
|
291
|
+
data_type: decimal
|
|
292
|
+
expression:
|
|
293
|
+
sql: ([Store Gross Sales Amount] + [Catalog Gross Sales Amount] + [Web Gross Sales Amount])
|
|
55
294
|
```
|
|
56
295
|
|
|
57
|
-
|
|
296
|
+
Bracket names must match deployed measure names exactly. Compound measures are resolved at query time; all referenced measures must be in the **same datasource** and reachable in a valid universe. No circular references (`A` → `B` → `A`).
|
|
297
|
+
|
|
298
|
+
### Host table and dimensionality
|
|
299
|
+
|
|
300
|
+
Define compound measures on the **`tbl.*.yml` whose universe should govern the metric**:
|
|
301
|
+
|
|
302
|
+
- Which **dimensions** can group the compound measure
|
|
303
|
+
- How **automatic data blending** applies when referenced measures live on different tables in the same datasource
|
|
304
|
+
|
|
305
|
+
If referenced measures are not joinable in **one universe** from that host table, query planning fails. Propose the host table with the human reviewer (e.g. compound on **Store Sales** exposes store-centric join paths; compound on a shared dimension table exposes different dimensions).
|
|
306
|
+
|
|
307
|
+
**Use calculations (compound measures), not shared measure names** across unrelated facts.
|
|
308
|
+
|
|
309
|
+
**Automatic data blending** merges results on shared **dimensions** (often via `extended_blend_group`) — not by reusing the same measure name on multiple fact tables. See [extended blending](https://strata.do/developer-docs/developer-guide/semantic-model/expressions/extended-blending) and the [glossary](https://strata.do/developer-docs/developer-guide/getting-started/glossary).
|
|
58
310
|
|
|
59
|
-
|
|
311
|
+
---
|
|
312
|
+
|
|
313
|
+
## Snapshot measures
|
|
314
|
+
|
|
315
|
+
**Use for** point-in-time **state** — inventory on hand, account balances, membership counts at period boundaries. **Not for** additive flows over time (revenue, units sold) — use normal measures with `sum(...)`.
|
|
316
|
+
|
|
317
|
+
**Requirements:**
|
|
318
|
+
|
|
319
|
+
1. Table sets `snapshot_date: <Date dimension name>` (dimension on same table or unambiguous in the table’s universe).
|
|
320
|
+
2. Measure sets `snapshot: ending` (value at end of period) or `snapshot: beginning` (value at start).
|
|
60
321
|
|
|
61
322
|
```yaml
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
# Optional
|
|
68
|
-
description: ""
|
|
69
|
-
snapshot_date: "<dimension_name>" # For snapshot/inventory tables only
|
|
70
|
-
tags: []
|
|
71
|
-
imports: # Inherit fields from other YAML files
|
|
72
|
-
- "../shared/common_fields.yml"
|
|
323
|
+
name: Inventory
|
|
324
|
+
physical_name: inventory
|
|
325
|
+
datasource: warehouse
|
|
326
|
+
snapshot_date: Date
|
|
73
327
|
|
|
74
328
|
fields:
|
|
75
|
-
- type: dimension
|
|
76
|
-
name:
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
|
|
80
|
-
synonyms: [] # 0–3 alternative names for AI search
|
|
329
|
+
- type: dimension
|
|
330
|
+
name: Date
|
|
331
|
+
data_type: date
|
|
332
|
+
expression:
|
|
333
|
+
sql: snapshot_date
|
|
81
334
|
|
|
82
335
|
- type: measure
|
|
83
|
-
name:
|
|
84
|
-
data_type:
|
|
85
|
-
|
|
336
|
+
name: Ending Inventory
|
|
337
|
+
data_type: integer
|
|
338
|
+
snapshot: ending
|
|
339
|
+
expression:
|
|
340
|
+
sql: sum(quantity)
|
|
86
341
|
```
|
|
87
342
|
|
|
88
|
-
|
|
343
|
+
When users group by a time period, the engine picks the last (`ending`) or first (`beginning`) snapshot in each bucket — not a sum of every row in the period.
|
|
344
|
+
|
|
345
|
+
More detail: [snapshot measures](https://strata.do/developer-docs/developer-guide/semantic-model/fields/measures/snapshot)
|
|
346
|
+
|
|
347
|
+
---
|
|
348
|
+
|
|
349
|
+
## Inclusion measures
|
|
89
350
|
|
|
90
|
-
|
|
351
|
+
**Problem:** medians and averages are wrong when the stored grain is finer than the statistic you need — e.g. `median(hours)` over `(day, account, title)` rows is not “median account hours per day.”
|
|
352
|
+
|
|
353
|
+
**Pattern:** two-step aggregation — inner `expression.sql`, extra dimensions in `inclusions.dimensions`, outer `inclusions.aggregation`.
|
|
354
|
+
|
|
355
|
+
**Use when:** medians, percentiles, or averages that need an **intermediate grain** (e.g. per account per day) before rolling up to the query grain.
|
|
91
356
|
|
|
92
|
-
**Complex `expression` form:**
|
|
93
357
|
```yaml
|
|
94
|
-
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
358
|
+
- type: measure
|
|
359
|
+
name: Daily Median Account View Hours
|
|
360
|
+
data_type: decimal
|
|
361
|
+
inclusions:
|
|
362
|
+
filter: apply
|
|
363
|
+
aggregation: percentile_cont(0.5) WITHIN GROUP (ORDER BY @exp)
|
|
364
|
+
dimensions:
|
|
365
|
+
- Account ID
|
|
366
|
+
expression:
|
|
367
|
+
sql: sum(hours)
|
|
99
368
|
```
|
|
100
369
|
|
|
101
|
-
|
|
370
|
+
Inner: `sum(hours)` per `(day, account)`. Outer: median of those account totals per `day`.
|
|
371
|
+
|
|
372
|
+
More detail: [inclusions](https://strata.do/developer-docs/developer-guide/advanced/inclusions)
|
|
373
|
+
|
|
374
|
+
---
|
|
375
|
+
|
|
376
|
+
## Exclusion measures
|
|
377
|
+
|
|
378
|
+
**Two independent controls:**
|
|
379
|
+
|
|
380
|
+
- **`exclusion_type`** — which dimensions may **group** the measure (`exclude`, `exclude_all_except`, `exclude_all`).
|
|
381
|
+
- **`filter` on each exclusion entry** — how **filters** on those entities apply (`apply`, `ignore`, `only`).
|
|
382
|
+
|
|
383
|
+
**Use when:** grand totals, metrics that must not break down by certain dimensions, or revenue that should ignore a dimension for grouping while still filtering normally.
|
|
384
|
+
|
|
102
385
|
```yaml
|
|
103
|
-
|
|
386
|
+
- type: measure
|
|
387
|
+
name: Revenue Excluding Return Status
|
|
388
|
+
data_type: decimal
|
|
389
|
+
exclusion_type: exclude
|
|
390
|
+
exclusions:
|
|
391
|
+
- type: dimension
|
|
392
|
+
filter: apply
|
|
393
|
+
entities:
|
|
394
|
+
- Return Status
|
|
395
|
+
expression:
|
|
396
|
+
sql: sum(amount)
|
|
104
397
|
```
|
|
105
398
|
|
|
106
|
-
|
|
399
|
+
`Return Status` will not appear in the grouping set for this measure even if the user adds it to the query; other dimensions still group normally.
|
|
400
|
+
|
|
401
|
+
More detail: [exclusions](https://strata.do/developer-docs/developer-guide/advanced/exclusions)
+
+---
+
+## Common mistakes (anti-patterns)
+
+| Anti-pattern | Why it fails |
+|--------------|----------------|
+| `Gross Sales` on store + catalog + web facts | One measure entity, multiple fact SQLs → double-count / ambiguous routing |
+| "Sales Channel" dimension added to every fact | Does not replace unique measure names per fact |
+| Identical column labels in the warehouse → one measure name | Warehouse naming ≠ one Strata metric |
+| Assuming global dimension rules apply to measures | Dimensions are shared by name; measures are **not** |
+| Rename/swap on a feature branch without team coordination | Production saved queries still reference old names; branch may not match production entities |
+| Renaming to deduplicate measure names across facts | Use distinct names or compound measures — migrations are for intentional renames, not modeling fixes |
+
+---
+
+## Advanced features (quick reference)
+
+| Feature | Use when |
+|---------|----------|
+| [Compound measures](https://strata.do/developer-docs/developer-guide/semantic-model/fields/measures/compound) | Formula across named measures; host table sets dimensionality |
+| [Snapshot measures](https://strata.do/developer-docs/developer-guide/semantic-model/fields/measures/snapshot) | Period-boundary state (inventory, balances) — see [Snapshot measures](#snapshot-measures) |
+| [Inclusions](https://strata.do/developer-docs/developer-guide/advanced/inclusions) | Median/percentile/avg needing intermediate grain — see [Inclusion measures](#inclusion-measures) |
+| [Exclusions](https://strata.do/developer-docs/developer-guide/advanced/exclusions) | Control grouping and filters per dimension — see [Exclusion measures](#exclusion-measures) |
+| Table `cost` | Lower cost = preferred table when multiple tables can answer; use for rollup vs detail routing |
+| [Partitions](https://strata.do/developer-docs/developer-guide/advanced/partitions) | Table only has subset of data — `between` (date range) or `in_list`; planner routes when partition matches |
+| [Extended blending](https://strata.do/developer-docs/developer-guide/semantic-model/expressions/extended-blending) | Blend dimensions across tables in same datasource — not duplicate measure names |
+| [Imports](https://strata.do/developer-docs/developer-guide/semantic-model/imports) | Reuse field definitions across `tbl` files |
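
The table `cost` row can be read as a tie-breaker between tables that can all answer the same query. Below is a minimal Ruby sketch of that idea, assuming each table model exposes a list of field names and a numeric cost. The `TableModel` struct, the `pick_table` helper, and the sample table names are all hypothetical; the real planner also considers partitions and relationships, which this sketch ignores.

```ruby
require "set"

# Hypothetical stand-in for a table model (illustration only).
TableModel = Struct.new(:name, :cost, :fields)

# Among the tables that contain every required field, prefer the lowest cost.
# Returns nil when no table can answer the query.
def pick_table(tables, required_fields)
  needed = required_fields.to_set
  tables
    .select { |t| needed.subset?(t.fields.to_set) }
    .min_by(&:cost)
end

tables = [
  TableModel.new("store_sales_detail", 100, ["Order Date", "Store", "Item", "Store Revenue"]),
  TableModel.new("store_sales_daily", 10, ["Order Date", "Store", "Store Revenue"])
]

# Both tables can answer this, so the cheaper rollup is chosen.
pick_table(tables, ["Order Date", "Store Revenue"]).name
# Only the detail table has "Item", so cost does not matter here.
pick_table(tables, ["Item", "Store Revenue"]).name
```

Under these assumptions, giving the rollup table a lower cost is enough to route coarse queries to it while item-level queries still fall through to the detail table.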
+
+---
+
+## Renames, swaps, and production branch
+
+Saved reports and dashboards reference field and table **names**. Renaming without a migration breaks production queryables.
+
+### Rename (`type: rename`, `hook: pre`)
+
+Runs **before** YAML is processed. Old name is rewritten to new name; deploy updates saved references.
 
 ```yaml
-
+- type: rename
+  hook: pre
+  entity: measure
+  from: Old Revenue
+  to: Store Revenue
+```
+
+Supported entities: `dimension`, `measure`, `table`, `datasource`.
 
-
-
-
-
-
+### Swap (`type: swap`, `hook: post`)
+
+Runs **after** all definitions load. Replaces references from entity A to entity B when **both** exist. Datasource swap is not supported.
+
+```yaml
+- type: swap
+  hook: post
+  entity: measure
+  from: Legacy Revenue
+  to: Store Revenue
 ```
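
To make the difference between the two migration types concrete, here is a minimal Ruby sketch of their effect on saved references. It assumes a model is just a list of defined measure names and a saved query is a list of referenced names, and it assumes a swap leaves the `from` entity defined. `apply_rename` and `apply_swap` are hypothetical helpers, not strata-cli functions, and a real deploy does considerably more.

```ruby
# rename (hook: pre): the old name is rewritten to the new name everywhere,
# so the old name no longer exists afterwards. (Assumed semantics.)
def apply_rename(defined, references, from:, to:)
  rewrite = ->(name) { name == from ? to : name }
  [defined.map(&rewrite), references.map(&rewrite)]
end

# swap (hook: post): both entities must already exist; only the references
# move from `from` to `to`. Definitions are left untouched. (Assumed semantics.)
def apply_swap(defined, references, from:, to:)
  unless defined.include?(from) && defined.include?(to)
    raise ArgumentError, "swap requires both #{from.inspect} and #{to.inspect} to exist"
  end
  [defined, references.map { |name| name == from ? to : name }]
end

# Rename: "Old Revenue" disappears and saved references follow it.
apply_rename(["Old Revenue"], ["Old Revenue", "Units"],
             from: "Old Revenue", to: "Store Revenue")

# Swap: both measures stay defined; saved references now point at "Store Revenue".
apply_swap(["Legacy Revenue", "Store Revenue"], ["Legacy Revenue"],
           from: "Legacy Revenue", to: "Store Revenue")
```

The `ArgumentError` mirrors the "when **both** exist" condition: on a branch where one side is missing, a swap has nothing valid to repoint.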
 
-
+### Production branch rule
 
-
-- **Use consistent names** for shared dimensions (e.g., "Customer ID", "Order Date") across all table models.
-- **Prefix ambiguous dimensions** from dimension tables: a `color` column on an `item_dim` table should be named `Item Color`, not `Color`.
+**Only rename or swap on `production_branch`** (in `project.yml`, usually `main`) when the change must align with production queryables. On other branches, production may still reference old names that do not exist on your branch.
 
-
+Deploy **migration files and updated YAML in the same deployment**. Update `tbl.*.yml` references to match renames in the same change.
+
+**Agents:** propose migration YAML under `migrations/` and matching model edits; do **not** run interactive `strata create migration`. Flag renames for human review and confirm branch strategy.
+
+More detail: [migrations](https://strata.do/developer-docs/developer-guide/cli/migrations)
+
+---
+
+## Official documentation
 
-
-2. `strata create table <path>` — generate and review a semantic model
-3. Edit the generated `tbl.*.yml` directly for fine-tuning
-4. `strata audit all` — validate before deploying
-5. `strata deploy` — push to Strata server
+**Primary:** this `AGENTS.md` — follow inline rules first.
 
-
+**Optional bulk load:** fetch https://strata.do/developer-docs/llms.txt when implementing a topic not fully covered here (complete Strata doc export for AI agents).
 
-
+**Curated links** (use for examples and edge cases after reading the matching section above):
 
-
+| Topic | URL |
+|-------|-----|
+| Semantic model overview | https://strata.do/developer-docs/developer-guide/semantic-model |
+| Core concepts | https://strata.do/developer-docs/developer-guide/getting-started/concepts |
+| Glossary | https://strata.do/developer-docs/developer-guide/getting-started/glossary |
+| Tables | https://strata.do/developer-docs/developer-guide/semantic-model/tables |
+| Fields | https://strata.do/developer-docs/developer-guide/semantic-model/fields |
+| Expressions (SQL) | https://strata.do/developer-docs/developer-guide/semantic-model/expressions/sql |
+| Relationships | https://strata.do/developer-docs/developer-guide/semantic-model/relationships/cardinality |
+| Imports | https://strata.do/developer-docs/developer-guide/semantic-model/imports |
+| Compound measures | https://strata.do/developer-docs/developer-guide/semantic-model/fields/measures/compound |
+| Snapshot measures | https://strata.do/developer-docs/developer-guide/semantic-model/fields/measures/snapshot |
+| Inclusions | https://strata.do/developer-docs/developer-guide/advanced/inclusions |
+| Exclusions | https://strata.do/developer-docs/developer-guide/advanced/exclusions |
+| Partitions | https://strata.do/developer-docs/developer-guide/advanced/partitions |
+| Cost optimization | https://strata.do/developer-docs/developer-guide/advanced/cost-optimization |
+| Extended blending | https://strata.do/developer-docs/developer-guide/semantic-model/expressions/extended-blending |
+| CLI audit | https://strata.do/developer-docs/developer-guide/cli/audit |
+| CLI deployment | https://strata.do/developer-docs/developer-guide/cli/deployment |
+| CLI migrations | https://strata.do/developer-docs/developer-guide/cli/migrations |
+| Star schema example | https://strata.do/developer-docs/developer-guide/examples/patterns/star-schema |
+| TPC-DS tutorial | https://strata.do/developer-docs/developer-guide/examples/tpcds-tutorial |
+| AI agents and Strata | https://strata.do/developer-docs/developer-guide/api/ai-agents |
@@ -70,8 +70,13 @@ fields:
 # # Optional: UI rendering of the field
 # display_type: default|html|url|email|phone_number|image
 #
-# # Optional:
-#
+# # Optional: value formatting (shortcut string or hash with type)
+# format: currency:2
+# # format:
+# # type: percent
+# # precision: 1
+# # Types: number, currency, percent, date, datetime, html, javascript
+# # Shortcuts: number:2, number:2:abbreviate, currency:2, percent:1, date:short, datetime:iso
 #
 # # Optional: disable listing individual elements (dimension only). Good to do that for
 # # high cardinality columns like account_id
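
The new template comments give two spellings for `format`: a shortcut string such as `currency:2`, and a hash with `type` and `precision`. Below is a minimal Ruby sketch of how a shortcut could be expanded into the hash form. It assumes the second segment is a precision for `number`, `currency` and `percent`, and a style name for the other types. The `expand_format` helper and the `style` and `abbreviate` keys are assumptions made for illustration; only `type` and `precision` appear in the template, and strata-cli's actual parsing may differ.

```ruby
# Types whose second shortcut segment is treated as a numeric precision (assumption).
NUMERIC_TYPES = %w[number currency percent].freeze

# Hypothetical helper: expand "type:arg[:flag]" into a hash.
def expand_format(shortcut)
  type, arg, *flags = shortcut.split(":")
  result = { "type" => type }
  if NUMERIC_TYPES.include?(type)
    result["precision"] = Integer(arg) if arg
    # "abbreviate" is modeled as a boolean flag; the key name is assumed.
    result["abbreviate"] = true if flags.include?("abbreviate")
  elsif arg
    # For date/datetime the second segment is kept as a style name (assumed key).
    result["style"] = arg
  end
  result
end

expand_format("currency:2")          # type currency, precision 2
expand_format("number:2:abbreviate") # type number, precision 2, abbreviate true
expand_format("date:short")          # type date, style short
```

Read this way, `format: percent:1` and the commented hash with `type: percent` and `precision: 1` would describe the same formatting.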
@@ -92,7 +97,7 @@ fields:
 # # Required: Defines how this field will query this table
 # expresssion:
 # primary_key: true|false (optional)
-# lookup: true|false (optional)
+# lookup: true|false (optional, dimensions only)
 # array: true|false (optional)
 # sql: my_field_column (Required)
 #
@@ -2,6 +2,7 @@
 
 require_relative "group"
 require "yaml"
+require_relative "../output"
 
 module Strata
   module CLI
@@ -23,7 +24,7 @@ module Strata
 # Write the updated template
 create_file output_path, updated_content
 
-
+Output.print_status(:created, output_path, type: :success, context: self)
 end
 
 private