modscape 1.1.8 → 1.3.0
- package/README.ja.md +160 -24
- package/README.md +144 -33
- package/package.json +5 -4
- package/src/export.js +57 -1
- package/src/import-dbt.js +181 -0
- package/src/index.js +38 -1
- package/src/init.js +17 -7
- package/src/merge.js +74 -0
- package/src/sync-dbt.js +146 -0
- package/src/templates/claude/codegen.md +15 -0
- package/src/templates/claude/{command.md → modeling.md} +0 -2
- package/src/templates/codegen-rules.md +138 -0
- package/src/templates/codex/modscape-codegen/SKILL.md +20 -0
- package/src/templates/codex/{SKILL.md → modscape-modeling/SKILL.md} +10 -0
- package/src/templates/default-model.yaml +1 -1
- package/src/templates/gemini/modscape-codegen/SKILL.md +23 -0
- package/src/templates/gemini/{SKILL.md → modscape-modeling/SKILL.md} +5 -3
- package/src/templates/rules.md +681 -72
- package/visualizer/package.json +1 -1
- package/visualizer-dist/assets/index-D-14ykQt.js +63 -0
- package/visualizer-dist/assets/index-DHPAF3El.css +1 -0
- package/visualizer-dist/index.html +2 -2
- package/visualizer-dist/assets/index-BRGvsqv2.css +0 -1
- package/visualizer-dist/assets/index-D8jperx8.js +0 -63
package/src/templates/rules.md
CHANGED
|
@@ -1,122 +1,731 @@
|
|
|
1
|
-
# Modscape
|
|
1
|
+
# Modscape Modeling Rules for AI Agents
|
|
2
2
|
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
- **Accuracy**: Precisely implement every table, column, and relationship requested.
|
|
6
|
-
- **Completeness**: Do not omit requested details for the sake of brevity.
|
|
7
|
-
- **Expert Guidance**: If a user's instruction contradicts modeling best practices (e.g., mixing grains), **warn the user and suggest an alternative**, but do not ignore the original intent.
|
|
3
|
+
> **Purpose**: This file teaches AI agents how to write valid `model.yaml` files for Modscape.
|
|
4
|
+
> Read this file completely before generating or editing any YAML.
|
|
8
5
|
|
|
9
|
-
|
|
10
|
-
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## QUICK REFERENCE (read this first)
|
|
9
|
+
|
|
10
|
+
```
ROOT KEYS    domains | tables | relationships | annotations | layout
COORDINATES  ONLY in `layout`. NEVER inside tables or domains.
LINEAGE      Use lineage.upstream (not relationships) for mart/aggregated tables.
parentId     Set in `layout` on every table placed inside a domain (in addition to listing it in `domains.tables`).
IDs          Every object (table, domain, annotation) needs a unique `id`.
sampleData   No header row; values follow the order of `columns`. At least 3 realistic data rows.
Grid         All x/y values must be multiples of 40.
```
|
|
19
|
+
|
|
20
|
+
---
|
|
21
|
+
|
|
22
|
+
## 1. Root Structure
|
|
23
|
+
|
|
24
|
+
A valid `model.yaml` may contain only the following top-level keys. `tables` is always required; the others are required or optional as noted.
|
|
25
|
+
|
|
26
|
+
```yaml
|
|
27
|
+
domains: # (array) visual containers — OPTIONAL but recommended
|
|
28
|
+
tables: # (array) entity definitions — REQUIRED
|
|
29
|
+
relationships: # (array) ER cardinality edges — OPTIONAL
|
|
30
|
+
annotations: # (array) sticky notes / callouts — OPTIONAL
|
|
31
|
+
layout: # (object) ALL coordinates — REQUIRED if any objects exist
|
|
32
|
+
```
|
|
33
|
+
|
|
34
|
+
**MUST NOT** add any other top-level keys. They will be ignored or cause errors.
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## 2. Tables
|
|
39
|
+
|
|
40
|
+
### 2-1. Required and Optional Fields
|
|
41
|
+
|
|
42
|
+
| Field | Required | Description |
|
|
43
|
+
|-------|----------|-------------|
|
|
44
|
+
| `id` | **REQUIRED** | Unique identifier used as a key in `layout`, `domains.tables`, `lineage.upstream`, etc. Use snake_case. |
|
|
45
|
+
| `name` | **REQUIRED** | Conceptual (business) name shown large on the canvas. |
|
|
46
|
+
| `logical_name` | optional | Formal business name shown medium. Omit if same as `name`. |
|
|
47
|
+
| `physical_name` | optional | Actual database table name shown small. |
|
|
48
|
+
| `appearance` | optional | Visual type, icon, color. |
|
|
49
|
+
| `conceptual` | optional | AI-friendly business context metadata. |
|
|
50
|
+
| `lineage` | optional | Upstream table IDs. Only for `mart` / aggregated tables. |
|
|
51
|
+
| `columns` | optional | Column definitions. |
|
|
52
|
+
| `sampleData` | optional | 2D array of sample rows. Strongly recommended. |
|
|
53
|
+
|
|
54
|
+
### 2-2. `appearance` Fields
|
|
55
|
+
|
|
56
|
+
```yaml
appearance:
  type: fact              # REQUIRED if used. See table below.
  sub_type: transaction   # optional free text (transaction | periodic | accumulating | ...)
  scd: type2              # optional. dimension tables only. type0|type1|type2|type3|type4|type6
  icon: "💰"              # optional. any single emoji.
  color: "#e0f2fe"        # optional. hex or CSS color for the header.
```
|
|
64
|
+
|
|
65
|
+
**`appearance.type` values:**
|
|
66
|
+
|
|
67
|
+
| type | Use when... |
|
|
68
|
+
|------|-------------|
|
|
69
|
+
| `fact` | Events, transactions, measurements. Has measures (numbers) and FK columns. |
|
|
70
|
+
| `dimension` | Entities, master data, reference lists. Descriptive attributes. |
|
|
71
|
+
| `mart` | Aggregated or consumer-facing output. **Always add `lineage.upstream`.** |
|
|
72
|
+
| `hub` | Data Vault: stores a single unique business key. |
|
|
73
|
+
| `link` | Data Vault: joins two or more hubs (transaction or relationship). |
|
|
74
|
+
| `satellite` | Data Vault: descriptive attributes of a hub, tracked over time. |
|
|
75
|
+
| `table` | Generic. Use when none of the above apply. |
|
|
76
|
+
|
|
77
|
+
**MUST NOT** use `scd` on `fact`, `mart`, `hub`, `link`, or `satellite` tables.
|
|
78
|
+
|
|
79
|
+
### 2-3. `conceptual` Fields (AI-readable business context)
|
|
80
|
+
|
|
81
|
+
```yaml
conceptual:
  description: "One row per order line item."
  tags: [WHAT, HOW_MUCH]   # BEAM* tags: WHO | WHAT | WHEN | WHERE | HOW | COUNT | HOW_MUCH
  businessDefinitions:
    revenue: "Net revenue after discounts and returns."
```
|
|
88
|
+
|
|
89
|
+
### 2-4. `columns` Fields
|
|
90
|
+
|
|
91
|
+
Each column has an `id` plus optional `logical` and `physical` blocks.
|
|
92
|
+
|
|
93
|
+
```yaml
columns:
  - id: order_id                      # REQUIRED. Unique within the table. sampleData values follow column order.
    logical:
      name: "Order ID"                # Display name
      type: Int                       # Int | String | Decimal | Date | Timestamp | Boolean | ...
      description: "Surrogate key."   # optional
      isPrimaryKey: true              # optional. default false.
      isForeignKey: false             # optional. default false.
      isPartitionKey: false           # optional. default false.
      isMetadata: false               # optional. true for audit cols: load_date, record_source, hash_diff
      additivity: fully               # optional. fully=summable | semi=balance/stock | non=price/rate/ID
    physical:                         # optional. override when warehouse names/types differ.
      name: order_id_pk
      type: "BIGINT"
      constraints: [NOT NULL, UNIQUE]
```
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
|
|
113
|
+
## 3. Relationships (ER Cardinality)
|
|
114
|
+
|
|
115
|
+
Use `relationships` **only** for structural ER connections between tables.
|
|
116
|
+
|
|
117
|
+
```yaml
relationships:
  - from:
      table: dim_customers    # table id
      column: customer_key    # column id (optional but recommended)
    to:
      table: fct_orders
      column: customer_key
    type: one-to-many
```
|
|
127
|
+
|
|
128
|
+
**`type` values:**
|
|
129
|
+
|
|
130
|
+
| type | Typical usage |
|
|
131
|
+
|------|--------------|
|
|
132
|
+
| `one-to-one` | Lookup table / vertical split |
|
|
133
|
+
| `one-to-many` | Dimension → Fact *(most common)* |
|
|
134
|
+
| `many-to-one` | Fact → Dimension *(inverse notation of above)* |
|
|
135
|
+
| `many-to-many` | Via a bridge / link table |
|
|
136
|
+
|
|
137
|
+
**MUST NOT** use `relationships` to express data lineage (use `lineage.upstream` instead).
|
|
138
|
+
|
|
139
|
+
---
|
|
140
|
+
|
|
141
|
+
## 4. Data Lineage
|
|
142
|
+
|
|
143
|
+
`lineage.upstream` declares which source tables a derived table is built from.
|
|
144
|
+
This is rendered as animated arrows in **Lineage Mode**. It is separate from ER relationships.
|
|
145
|
+
|
|
146
|
+
```yaml
tables:
  - id: mart_revenue
    appearance: { type: mart }
    lineage:
      upstream:
        - fct_orders    # list of source table IDs
        - dim_dates
```
|
|
155
|
+
|
|
156
|
+
### When to use lineage vs relationships
|
|
157
|
+
|
|
158
|
+
| Situation | Use |
|
|
159
|
+
|-----------|-----|
|
|
160
|
+
| `dim_customers` → `fct_orders` (FK join) | `relationships` |
|
|
161
|
+
| `fct_orders` + `dim_dates` → `mart_revenue` (aggregation) | `lineage.upstream` |
|
|
162
|
+
|
|
163
|
+
**MUST** define `lineage.upstream` for every `mart` or aggregated table.
|
|
164
|
+
**MUST NOT** define `lineage.upstream` for raw tables (`fact`, `dimension`, `hub`, `link`, `satellite`).
|
|
165
|
+
**MUST NOT** add a `relationships` entry for a connection already expressed in `lineage.upstream`.
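The three rules above, together with the valid relationship types from Section 3, can be checked mechanically. The following is a minimal sketch, assuming the model has already been parsed into a plain Python dictionary; the function name `lineage_problems` is hypothetical and not part of Modscape.

```python
# Hypothetical helper, not part of the Modscape CLI.
VALID_REL_TYPES = {"one-to-one", "one-to-many", "many-to-one", "many-to-many"}
RAW_TYPES = {"fact", "dimension", "hub", "link", "satellite"}


def lineage_problems(model: dict) -> list[str]:
    """Return human-readable violations of the lineage/relationship rules."""
    problems = []
    lineage_edges = set()  # (upstream_id, downstream_id) pairs

    for table in model.get("tables", []):
        table_type = table.get("appearance", {}).get("type")
        upstream = table.get("lineage", {}).get("upstream", [])
        if table_type == "mart" and not upstream:
            problems.append(f"{table['id']}: mart without lineage.upstream")
        if table_type in RAW_TYPES and upstream:
            problems.append(f"{table['id']}: raw table with lineage.upstream")
        lineage_edges.update((src, table["id"]) for src in upstream)

    for rel in model.get("relationships", []):
        src, dst = rel["from"]["table"], rel["to"]["table"]
        if rel.get("type") not in VALID_REL_TYPES:
            problems.append(f"{src} -> {dst}: invalid type {rel.get('type')!r}")
        # A relationship in either direction between lineage-linked tables is a duplicate.
        if (src, dst) in lineage_edges or (dst, src) in lineage_edges:
            problems.append(f"{src} -> {dst}: duplicates a lineage edge")

    return problems
```

An empty list means the model passes these checks; the sketch does not verify that the referenced table IDs exist.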
|
|
166
|
+
|
|
167
|
+
#### Example: correct separation
|
|
168
|
+
|
|
169
|
+
```yaml
# CORRECT
tables:
  - id: mart_revenue
    appearance: { type: mart }
    lineage:
      upstream: [fct_orders, dim_dates]   # lineage only

relationships:
  - from: { table: dim_customers, column: customer_key }
    to: { table: fct_orders, column: customer_key }
    type: one-to-many                     # ER only

# WRONG: do not add a relationships entry for the same connection as lineage
relationships:
  - from: { table: fct_orders }
    to: { table: mart_revenue }
    type: lineage                         # ❌ never do this
```
|
|
188
|
+
|
|
189
|
+
---
|
|
190
|
+
|
|
191
|
+
## 5. Domains
|
|
192
|
+
|
|
193
|
+
```yaml
domains:
  - id: sales_ops                      # REQUIRED. Used as key in layout.
    name: "Sales Operations"           # REQUIRED. Display name.
    description: "..."                 # optional
    color: "rgba(59, 130, 246, 0.1)"   # optional. rgba recommended.
    tables:                            # REQUIRED. List of table IDs inside this domain.
      - fct_orders
      - dim_customers
    isLocked: false                    # optional. true = prevent drag on canvas.
```
|
|
204
|
+
|
|
205
|
+
**MUST** list only table IDs that actually exist in `tables`.
|
|
206
|
+
**MUST** add a layout entry for the domain with `width` and `height`.
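Both domain rules above are referential checks, so they lend themselves to a quick script. This is a sketch only, assuming the model is already loaded as a Python dictionary; `domain_problems` is a hypothetical name, not a Modscape function.

```python
def domain_problems(model: dict) -> list[str]:
    """Check that domains list only existing tables and have a sized layout entry."""
    table_ids = {table["id"] for table in model.get("tables", [])}
    layout = model.get("layout", {})
    problems = []

    for domain in model.get("domains", []):
        for table_id in domain.get("tables", []):
            if table_id not in table_ids:
                problems.append(f"{domain['id']}: unknown table {table_id}")
        entry = layout.get(domain["id"], {})
        if "width" not in entry or "height" not in entry:
            problems.append(f"{domain['id']}: layout entry needs width and height")

    return problems
```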
|
|
207
|
+
|
|
208
|
+
---
|
|
209
|
+
|
|
210
|
+
## 6. Layout
|
|
211
|
+
|
|
212
|
+
**All coordinates live here.** Never put `x`, `y`, `width`, or `height` inside `tables` or `domains`.
|
|
213
|
+
|
|
214
|
+
### 6-1. Field Reference
|
|
215
|
+
|
|
216
|
+
| Field | Required for | Description |
|
|
217
|
+
|-------|-------------|-------------|
|
|
218
|
+
| `x` | all entries | Canvas x coordinate (integer, multiple of 40) |
|
|
219
|
+
| `y` | all entries | Canvas y coordinate (integer, multiple of 40) |
|
|
220
|
+
| `width` | domains | Total pixel width of the domain container |
|
|
221
|
+
| `height` | domains | Total pixel height of the domain container |
|
|
222
|
+
| `parentId` | tables inside a domain | ID of the containing domain. Makes coordinates relative to domain origin. |
|
|
223
|
+
| `isLocked` | domains or tables | Prevents drag when true |
|
|
224
|
+
|
|
225
|
+
### 6-2. Domain Size Formula
|
|
226
|
+
|
|
227
|
+
Calculate domain dimensions so tables fit without overflow:
|
|
228
|
+
|
|
229
|
+
```
|
|
230
|
+
width = (numCols * 320) + ((numCols - 1) * 80) + 160
|
|
231
|
+
height = (numRows * 240) + ((numRows - 1) * 80) + 160
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
Examples:
|
|
235
|
+
- 1 col × 1 row → width: 480, height: 400
|
|
236
|
+
- 2 col × 1 row → width: 880, height: 400
|
|
237
|
+
- 2 col × 2 row → width: 880, height: 720
|
|
238
|
+
- 3 col × 2 row → width: 1280, height: 720
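The formula above can be expressed as a small helper. This is an illustrative sketch; the function name `domain_size` and the constant names are assumptions, while the numbers come straight from the formula (320×240 tables, 80px gaps, 160px total padding per axis).

```python
TABLE_W, TABLE_H = 320, 240   # standard table size used by the formula
GAP, PADDING = 80, 160        # gap between tables, total padding per axis


def domain_size(num_cols: int, num_rows: int) -> tuple[int, int]:
    """Return (width, height) for a domain holding a num_cols x num_rows grid."""
    if num_cols < 1 or num_rows < 1:
        raise ValueError("a domain needs at least one column and one row")
    width = num_cols * TABLE_W + (num_cols - 1) * GAP + PADDING
    height = num_rows * TABLE_H + (num_rows - 1) * GAP + PADDING
    return width, height


print(domain_size(3, 2))  # (1280, 720)
```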
|
|
239
|
+
|
|
240
|
+
### 6-3. Table Positioning Inside a Domain
|
|
241
|
+
|
|
242
|
+
When `parentId` is set, `x`/`y` are **relative to the domain's top-left corner (0, 0)**.
|
|
243
|
+
|
|
244
|
+
```yaml
layout:
  sales_ops:
    x: 0          # absolute canvas position
    y: 0
    width: 880
    height: 400
  dim_customers:
    x: 80         # 80px from domain's left edge
    y: 80         # 80px from domain's top edge
    parentId: sales_ops
  fct_orders:
    x: 480        # 480px from domain's left edge
    y: 80
    parentId: sales_ops
```
|
|
260
|
+
|
|
261
|
+
**MUST NOT** let any table's right edge (`x + 320`) or bottom edge (`y + 240`) exceed the domain's `width` or `height`.
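The boundary rule above can be verified against a parsed `layout` object. The sketch below assumes the standard 320×240 table footprint named in the rule; `overflowing_tables` is a hypothetical helper, and entries whose `parentId` has no layout entry of its own are skipped rather than reported.

```python
TABLE_W, TABLE_H = 320, 240  # standard table footprint from the rule above


def overflowing_tables(layout: dict) -> list[str]:
    """Return IDs of tables whose right or bottom edge exceeds the parent domain."""
    bad = []
    for node_id, entry in layout.items():
        parent = layout.get(entry.get("parentId"))
        if parent is None:  # domain entry, standalone table, or unknown parent
            continue
        if (entry["x"] + TABLE_W > parent["width"]
                or entry["y"] + TABLE_H > parent["height"]):
            bad.append(node_id)
    return bad
```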
|
|
262
|
+
|
|
263
|
+
### 6-4. Layout Flow Conventions
|
|
264
|
+
|
|
265
|
+
- **ER diagrams**: Dimension/Hub tables TOP, Fact/Link tables BOTTOM
|
|
266
|
+
- **Lineage diagrams**: Upstream (source) LEFT, Downstream (mart) RIGHT
|
|
267
|
+
- **Grid**: All `x` and `y` values must be multiples of 40
|
|
268
|
+
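- **Spacing**: Minimum gap of 120px between top-level nodes (domains and standalone tables). Tables inside a domain use the 80px gap built into the Section 6-2 formula.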
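The grid convention above is easy to check or repair programmatically. A minimal sketch, assuming a parsed `layout` dictionary; both function names are hypothetical.

```python
GRID = 40  # all x/y values must be multiples of this


def off_grid_entries(layout: dict) -> list[tuple[str, str, int]]:
    """Return (node_id, axis, value) for every x/y that is not a multiple of GRID."""
    return [
        (node_id, axis, entry[axis])
        for node_id, entry in layout.items()
        for axis in ("x", "y")
        if axis in entry and entry[axis] % GRID != 0
    ]


def snap(value: int) -> int:
    """Round to the nearest multiple of GRID (halfway values round up)."""
    return (value + GRID // 2) // GRID * GRID
```

Snapping changes positions, so re-check domain boundaries after applying it.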
|
|
269
|
+
|
|
270
|
+
### 6-5. Layout Template
|
|
271
|
+
|
|
272
|
+
```yaml
layout:
  # --- Domain ---
  <domain_id>:
    x: <canvas_x>      # absolute
    y: <canvas_y>
    width: <W>         # use formula above
    height: <H>

  # --- Table inside domain ---
  <table_id>:
    x: <relative_x>    # relative to domain origin
    y: <relative_y>
    parentId: <domain_id>

  # --- Standalone table ---
  <table_id>:
    x: <canvas_x>      # absolute
    y: <canvas_y>
```
|
|
292
|
+
|
|
293
|
+
---
|
|
294
|
+
|
|
295
|
+
## 7. Annotations
|
|
296
|
+
|
|
297
|
+
```yaml
annotations:
  - id: note_001            # REQUIRED. Unique ID.
    type: sticky            # REQUIRED. sticky | callout
    text: "..."             # REQUIRED. Note content.
    color: "#fef9c3"        # optional. background color.
    targetId: fct_orders    # optional. ID of the object to attach to.
    targetType: table       # required if targetId is set. table | domain | relationship | column
    offset:
      x: 100                # offset from target's top-left. if no targetId, this is absolute canvas position.
      y: -80                # negative y = above the target.
```
|
|
309
|
+
|
|
310
|
+
---
|
|
311
|
+
|
|
312
|
+
## 8. Sample Data
|
|
313
|
+
|
|
314
|
+
Every table SHOULD include `sampleData`.
|
|
315
|
+
|
|
316
|
+
```yaml
sampleData:
  - [1001, 1, 150.00, "COMPLETED"]   # each row = one data record
  - [1002, 2, 89.50, "PENDING"]
  - [1003, 1, 210.00, "COMPLETED"]
```
|
|
322
|
+
|
|
323
|
+
**Rules:**
|
|
324
|
+
- Each row is a plain data record. No header row.
|
|
325
|
+
- The order of values MUST match the order of `columns` defined in the table.
|
|
326
|
+
- Use realistic values. Do NOT use "test1", "foo", "xxx".
|
|
327
|
+
- Numeric measures should be plausible business amounts.
|
|
328
|
+
- Dates should be in ISO 8601 format: `"2024-01-15"` or `"2024-01-15T00:00:00Z"`.
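The structural part of these rules (row width matches the column list, and enough rows are present) can be checked with a short function. This is a sketch under assumptions: `sample_data_problems` is a hypothetical name, and the three-row minimum is taken from the Quick Reference rather than from this section.

```python
def sample_data_problems(table: dict, min_rows: int = 3) -> list[str]:
    """Check that sampleData rows line up with the table's columns."""
    columns = table.get("columns", [])
    rows = table.get("sampleData", [])
    problems = []

    if len(rows) < min_rows:
        problems.append(
            f"{table['id']}: expected at least {min_rows} rows, got {len(rows)}"
        )
    for index, row in enumerate(rows):
        if len(row) != len(columns):
            problems.append(
                f"{table['id']}: row {index} has {len(row)} values "
                f"for {len(columns)} columns"
            )
    return problems
```

Whether values are realistic still needs human or AI review; the sketch only catches shape errors.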
|
|
329
|
+
|
|
330
|
+
---
|
|
331
|
+
|
|
332
|
+
## 9. Implementation Hints
|
|
333
|
+
|
|
334
|
+
`implementation` is an **optional** block inside each table. AI agents read it to generate dbt / Spark / SQLMesh code. Omitting it is fine — the visualizer works without it.
|
|
11
335
|
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
336
|
+
```yaml
tables:
  - id: fct_orders
    appearance: { type: fact }
    implementation:
      materialization: incremental     # table | view | incremental | ephemeral
      incremental_strategy: merge      # merge | append | delete+insert
      unique_key: order_id             # column id used for upsert
      partition_by:
        field: event_date
        granularity: day               # day | month | year | hour
      cluster_by: [customer_id, region_id]
      grain: [month_key, region_id]    # GROUP BY columns (mart only)
      measures:
        - column: total_revenue        # output column id in this table
          agg: sum                     # sum | count | count_distinct | avg | min | max
          source_column: amount        # upstream column id (use <table_id>.<col_id> to disambiguate)
```
|
|
354
|
+
|
|
355
|
+
### AI Inference Defaults (when `implementation` is absent)
|
|
356
|
+
|
|
357
|
+
| `appearance.type` | `appearance.scd` | Inferred `materialization` |
|
|
358
|
+
|------------------|-----------------|--------------------------|
|
|
359
|
+
| `fact` | — | `incremental` |
|
|
360
|
+
| `dimension` | `type2` | `table` (snapshot pattern) |
|
|
361
|
+
| `dimension` | other | `table` |
|
|
362
|
+
| `mart` | — | `table` |
|
|
363
|
+
| `hub` / `link` / `satellite` | — | `incremental` |
|
|
364
|
+
| `table` | — | `view` |
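The inference table above amounts to a lookup keyed on `appearance.type`, since `scd` does not change the inferred value. The sketch below is one way an agent-side script might apply it; `infer_materialization` is a hypothetical name, and treating a missing `appearance.type` as the generic `table` type is an assumption, not something the rules state.

```python
INFERRED = {
    "fact": "incremental",
    "dimension": "table",   # type2 dimensions also map to table (snapshot pattern)
    "mart": "table",
    "hub": "incremental",
    "link": "incremental",
    "satellite": "incremental",
    "table": "view",
}


def infer_materialization(table: dict) -> str:
    """Prefer an explicit implementation.materialization, else use the defaults."""
    explicit = table.get("implementation", {}).get("materialization")
    if explicit:
        return explicit
    # Assumption: a table without appearance.type is treated as generic `table`.
    table_type = table.get("appearance", {}).get("type", "table")
    return INFERRED[table_type]
```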
|
|
365
|
+
|
|
366
|
+
**Rules:**
|
|
367
|
+
- `measures` and `grain` are for `mart` tables only.
|
|
368
|
+
- `incremental_strategy` and `unique_key` are only relevant when `materialization: incremental`.
|
|
369
|
+
- When `source_column` is ambiguous across multiple upstream tables, qualify it as `<table_id>.<column_id>` (e.g., `fct_orders.amount`).
|
|
370
|
+
- **MUST NOT** define `implementation` inside `domains`, `relationships`, or `annotations`.
|
|
17
371
|
|
|
18
372
|
---
|
|
19
373
|
|
|
20
|
-
##
|
|
21
|
-
To ensure a professional and clean diagram, AI agents MUST use the following numeric standards:
|
|
374
|
+
## 10. Common Mistakes (Before → After)
|
|
22
375
|
|
|
23
|
-
###
|
|
24
|
-
- **Grid Snapping**: All `x` and `y` values MUST be multiples of **40** (e.g., 0, 40, 80, 120).
|
|
25
|
-
- **Standard Table Width**: `320`
|
|
26
|
-
- **Standard Table Height**: `240` (base)
|
|
27
|
-
- **Node Spacing (Gap)**: Minimum `120` between nodes.
|
|
376
|
+
### ❌ Coordinates inside a table definition
|
|
28
377
|
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
|
|
378
|
+
```yaml
# WRONG
tables:
  - id: fct_orders
    x: 200   # ❌ coordinates do not belong here
    y: 400
```
|
|
385
|
+
|
|
386
|
+
```yaml
|
|
387
|
+
# CORRECT
|
|
388
|
+
tables:
|
|
389
|
+
- id: fct_orders
|
|
390
|
+
name: Orders
|
|
36
391
|
|
|
37
|
-
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
- **Height**: `(Rows * 240) + ((Rows - 1) * 80) + 160` (Padding). *Example: 2-row domain = 720px high.*
|
|
43
|
-
- **Boundary Constraint**: NEVER place a table such that its right/bottom edge exceeds the domain's `width`/`height`.
|
|
392
|
+
layout:
|
|
393
|
+
fct_orders:
|
|
394
|
+
x: 200 # ✅ coordinates belong in layout
|
|
395
|
+
y: 400
|
|
396
|
+
```
|
|
44
397
|
|
|
45
398
|
---
|
|
46
399
|
|
|
47
|
-
|
|
48
|
-
Bridge the gap between business and tech by populating all three layers:
|
|
400
|
+
### ❌ Using relationships for lineage
|
|
49
401
|
|
|
50
|
-
|
|
51
|
-
|
|
52
|
-
|
|
402
|
+
```yaml
# WRONG
relationships:
  - from: { table: fct_orders }
    to: { table: mart_revenue }
    type: lineage   # ❌ 'lineage' is not a valid relationship type
```
|
|
409
|
+
|
|
410
|
+
```yaml
# CORRECT
tables:
  - id: mart_revenue
    appearance: { type: mart }
    lineage:
      upstream: [fct_orders]   # ✅ express lineage here
```
|
|
53
418
|
|
|
54
419
|
---
|
|
55
420
|
|
|
56
|
-
|
|
57
|
-
AI agents MUST analyze the nature of data to choose the correct classification and methodology.
|
|
421
|
+
### ❌ Table listed in domain but missing from layout
|
|
58
422
|
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
62
|
-
-
|
|
63
|
-
|
|
423
|
+
```yaml
|
|
424
|
+
# WRONG
|
|
425
|
+
domains:
|
|
426
|
+
- id: sales_ops
|
|
427
|
+
tables: [fct_orders, dim_customers] # dim_customers listed here...
|
|
64
428
|
|
|
65
|
-
|
|
66
|
-
|
|
67
|
-
|
|
429
|
+
layout:
|
|
430
|
+
sales_ops: { x: 0, y: 0, width: 880, height: 400 }
|
|
431
|
+
fct_orders: { x: 480, y: 80, parentId: sales_ops }
|
|
432
|
+
# ❌ dim_customers has no layout entry → will render at origin (0,0)
|
|
433
|
+
```
|
|
68
434
|
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
435
|
+
```yaml
# CORRECT: every table in a domain MUST have a layout entry
layout:
  sales_ops: { x: 0, y: 0, width: 880, height: 400 }
  dim_customers: { x: 80, y: 80, parentId: sales_ops }   # ✅
  fct_orders: { x: 480, y: 80, parentId: sales_ops }     # ✅
```
|
|
72
442
|
|
|
73
443
|
---
|
|
74
444
|
|
|
75
|
-
|
|
76
|
-
|
|
77
|
-
|
|
78
|
-
|
|
445
|
+
### ❌ Table overflows domain boundary
|
|
446
|
+
|
|
447
|
+
```yaml
# WRONG: domain width is 480 but table at x:280 + width:320 = 600 > 480
layout:
  small_domain: { x: 0, y: 0, width: 480, height: 400 }
  fct_orders: { x: 280, y: 80, parentId: small_domain }   # ❌ right edge = 600
```
|
|
453
|
+
|
|
454
|
+
```yaml
# CORRECT: use the formula: 1 col = width 480
layout:
  small_domain: { x: 0, y: 0, width: 480, height: 400 }
  fct_orders: { x: 80, y: 80, parentId: small_domain }   # ✅ right edge = 400
```
|
|
79
460
|
|
|
80
461
|
---
|
|
81
462
|
|
|
82
|
-
##
|
|
83
|
-
|
|
84
|
-
-
|
|
85
|
-
|
|
463
|
+
## 11. dbt Project Integration
|
|
464
|
+
|
|
465
|
+
If the user has a dbt project, AI agents SHOULD recommend using the built-in import commands instead of writing YAML from scratch.
|
|
466
|
+
|
|
467
|
+
### 11-1. Commands
|
|
468
|
+
|
|
469
|
+
```bash
|
|
470
|
+
# Prerequisite: generate manifest.json first
|
|
471
|
+
dbt parse
|
|
472
|
+
|
|
473
|
+
# Import a dbt project into Modscape YAML (one-time)
|
|
474
|
+
modscape dbt import [project-dir] [options]
|
|
475
|
+
|
|
476
|
+
# Sync dbt changes into existing Modscape YAML (incremental)
|
|
477
|
+
modscape dbt sync [project-dir] [options]
|
|
478
|
+
```
|
|
479
|
+
|
|
480
|
+
**`dbt import` options:**
|
|
481
|
+
|
|
482
|
+
| Option | Description |
|
|
483
|
+
|--------|-------------|
|
|
484
|
+
| `-o, --output <dir>` | Output directory (default: `modscape-<project-name>`) |
|
|
485
|
+
| `--split-by folder` | One YAML file per dbt folder |
|
|
486
|
+
| `--split-by schema` | One YAML file per database schema |
|
|
487
|
+
| `--split-by tag` | One YAML file per dbt tag |
|
|
488
|
+
|
|
489
|
+
### 11-2. What `dbt import` generates
|
|
490
|
+
|
|
491
|
+
The command reads `target/manifest.json` and produces YAML with:
|
|
492
|
+
|
|
493
|
+
| Field | Source | Notes |
|
|
494
|
+
|-------|--------|-------|
|
|
495
|
+
| `id` | `node.unique_id` | Format: `model.project.name` or `source.project.src.table` |
|
|
496
|
+
| `name` | `node.name` | Model / source name |
|
|
497
|
+
| `physical_name` | `node.alias` | Falls back to `node.name` |
|
|
498
|
+
| `conceptual.description` | `node.description` | From dbt docs |
|
|
499
|
+
| `columns[].logical.name/type/description` | `node.columns` | From dbt schema.yml |
|
|
500
|
+
| `lineage.upstream` | `node.depends_on.nodes` | Auto-populated |
|
|
501
|
+
| `appearance.type` | — | **Always `table`. Must be reclassified.** |
|
|
502
|
+
| `sampleData` | — | **Not generated. Must be added.** |
|
|
503
|
+
| `layout` | — | **Not generated. Must be added.** |
|
|
504
|
+
| `domains` | dbt folder structure | Auto-grouped by `fqn[1]` |
|
|
505
|
+
|
|
506
|
+
### 11-3. What AI agents MUST do after `dbt import`
|
|
507
|
+
|
|
508
|
+
After running `modscape dbt import`, the generated YAML needs enrichment. AI agents MUST:
|
|
509
|
+
|
|
510
|
+
1. **Reclassify `appearance.type`** — All tables default to `type: table`. Inspect the table name and columns to assign the correct type (`fact`, `dimension`, `mart`, etc.).
|
|
511
|
+
- Tables named `fct_*` → `fact`
|
|
512
|
+
- Tables named `dim_*` → `dimension`
|
|
513
|
+
- Tables named `mart_*` or `rpt_*` → `mart`
|
|
514
|
+
- Tables named `hub_*` → `hub`, `lnk_*` → `link`, `sat_*` → `satellite`
|
|
515
|
+
|
|
516
|
+
2. **Add `layout`** — The import does not generate coordinates. Calculate domain sizes and add `layout` entries for all tables and domains using the formula in Section 6.
|
|
517
|
+
|
|
518
|
+
3. **Add `sampleData`** — The import does not generate sample data. Add at least 3 realistic rows per table.
|
|
519
|
+
|
|
520
|
+
4. **Do NOT re-generate `lineage.upstream`** — It is already correctly populated from `depends_on.nodes`.
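The naming heuristics in step 1 can be applied as a first pass before inspecting columns. A minimal sketch, assuming the table's `name` field (not the dbt `unique_id`) is passed in; `classify_by_name` is a hypothetical helper and its result is only a starting point for reclassification.

```python
PREFIX_TO_TYPE = {
    "fct_": "fact",
    "dim_": "dimension",
    "mart_": "mart",
    "rpt_": "mart",
    "hub_": "hub",
    "lnk_": "link",
    "sat_": "satellite",
}


def classify_by_name(name: str) -> str:
    """Map a model name to an appearance.type using the prefix conventions above."""
    for prefix, table_type in PREFIX_TO_TYPE.items():
        if name.startswith(prefix):
            return table_type
    return "table"  # no known prefix: keep the imported default
```

Names without a recognized prefix keep `table`, so those tables still need a manual look at their columns.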
|
|
521
|
+
|
|
522
|
+
### 11-4. `dbt sync` — Incremental updates
|
|
523
|
+
|
|
524
|
+
Use `modscape dbt sync` when the dbt project has changed (new models, updated columns, etc.) and you want to update the existing Modscape YAML without losing manual edits.
|
|
525
|
+
|
|
526
|
+
**What `sync` overwrites:**
|
|
527
|
+
- `name`, `logical_name`, `physical_name`
|
|
528
|
+
- `conceptual.description`
|
|
529
|
+
- `columns` (all)
|
|
530
|
+
- `lineage.upstream`
|
|
531
|
+
|
|
532
|
+
**What `sync` preserves (safe to edit manually):**
|
|
533
|
+
- `appearance` (type, icon, color, scd)
|
|
534
|
+
- `sampleData`
|
|
535
|
+
- `layout`
|
|
536
|
+
- `domains`
|
|
537
|
+
- `annotations`
|
|
538
|
+
- Any fields not listed above
|
|
539
|
+
|
|
540
|
+
> **Workflow**: `dbt import` once → enrich with AI → `dbt sync` when dbt changes → re-enrich as needed.
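The overwrite and preserve lists above can be pictured as a per-table merge. The following sketch illustrates the documented behavior only; it is not the code in `src/sync-dbt.js`. The function name `sync_table` is hypothetical, and leaving a field untouched when dbt supplies no value is an assumption.

```python
import copy


def sync_table(existing: dict, from_dbt: dict) -> dict:
    """Overlay dbt-owned fields on an existing table, keeping manual edits."""
    merged = copy.deepcopy(existing)  # appearance, sampleData, etc. survive as-is

    for key in ("name", "logical_name", "physical_name", "columns"):
        if key in from_dbt:
            merged[key] = copy.deepcopy(from_dbt[key])

    if "description" in from_dbt.get("conceptual", {}):
        merged.setdefault("conceptual", {})["description"] = (
            from_dbt["conceptual"]["description"]
        )
    if "upstream" in from_dbt.get("lineage", {}):
        merged.setdefault("lineage", {})["upstream"] = list(
            from_dbt["lineage"]["upstream"]
        )
    return merged
```

Because `columns` is replaced wholesale, any manual column-level edits would be lost in this model of the behavior, which matches the "`columns` (all)" entry in the overwrite list.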
|
|
541
|
+
|
|
542
|
+
### 11-5. Table ID format in dbt-imported models
|
|
543
|
+
|
|
544
|
+
In dbt-imported YAML, table IDs are dbt `unique_id` strings, not short names:
|
|
545
|
+
|
|
546
|
+
```yaml
# dbt-imported table ID examples
id: "model.my_project.fct_orders"
id: "source.my_project.raw.orders"
id: "seed.my_project.product_categories"

# lineage.upstream also uses unique_id format
lineage:
  upstream:
    - "model.my_project.stg_orders"
    - "source.my_project.raw.customers"
```
|
|
558
|
+
|
|
559
|
+
**MUST NOT** shorten these IDs. They are the join keys between `tables`, `domains.tables`, `lineage.upstream`, and `layout`.
|
|
86
560
|
|
|
87
561
|
---
|
|
88
562
|
|
|
89
|
-
##
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
563
|
+
## 12. Merging YAML Files
|
|
564
|
+
|
|
565
|
+
When a user asks to **combine, merge, or consolidate** multiple YAML model files, use the built-in `merge` command instead of editing YAML manually.
|
|
566
|
+
|
|
567
|
+
```bash
|
|
568
|
+
# Merge specific files
|
|
569
|
+
modscape merge sales.yaml marketing.yaml -o combined.yaml
|
|
570
|
+
|
|
571
|
+
# Merge all YAML files in a directory
|
|
572
|
+
modscape merge ./models -o combined.yaml
|
|
573
|
+
|
|
574
|
+
# Merge multiple directories
|
|
575
|
+
modscape merge ./sales ./marketing -o combined.yaml
|
|
576
|
+
```
|
|
577
|
+
|
|
578
|
+
**Merge behavior:**
|
|
579
|
+
|
|
580
|
+
| Section | Behavior |
|
|
581
|
+
|---------|----------|
|
|
582
|
+
| `tables` | Deduplicated by `id`. First occurrence wins on conflict. |
|
|
583
|
+
| `relationships` | All entries included (no deduplication). |
|
|
584
|
+
| `domains` | Deduplicated by `id`. First occurrence wins on conflict. |
|
|
585
|
+
| `layout` | **Not included in output.** Must be added after merging. |
|
|
586
|
+
| `annotations` | **Not included in output.** Must be added after merging. |
|
|
587
|
+
|
|
588
|
+
**What AI agents MUST do after merge:**
|
|
589
|
+
|
|
590
|
+
1. **Add `layout`** — Run `modscape dev <output>` and use auto-layout, or calculate coordinates manually using the formula in Section 6.
|
|
591
|
+
2. **Check for relationship duplication** — If the same relationship exists in multiple source files, it will appear twice. Deduplicate manually if needed.
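The merge behavior table above can be modeled in a few lines, which also shows why relationships may need deduplication afterwards. This is a sketch of the documented semantics, not the implementation in `src/merge.js`; both function names are hypothetical, and the key used to detect duplicate relationships is an assumption.

```python
def merge_models(models: list[dict]) -> dict:
    """Combine models: tables/domains deduplicated by id, first occurrence wins."""
    merged = {"domains": [], "tables": [], "relationships": []}
    seen = {"domains": set(), "tables": set()}

    for model in models:
        for section in ("domains", "tables"):
            for item in model.get(section, []):
                if item["id"] not in seen[section]:
                    seen[section].add(item["id"])
                    merged[section].append(item)
        # Relationships are concatenated without deduplication.
        merged["relationships"].extend(model.get("relationships", []))

    return merged  # layout and annotations are intentionally left out


def dedupe_relationships(relationships: list[dict]) -> list[dict]:
    """Drop repeated relationships, keyed on endpoints and type (an assumption)."""
    seen, unique = set(), []
    for rel in relationships:
        key = (rel["from"]["table"], rel["from"].get("column"),
               rel["to"]["table"], rel["to"].get("column"), rel.get("type"))
        if key not in seen:
            seen.add(key)
            unique.append(rel)
    return unique
```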
|
|
93
592
|
|
|
94
593
|
---
|
|
95
594
|
|
|
96
|
-
##
|
|
595
|
+
## 13. Complete Example
|
|
596
|
+
|
|
97
597
|
```yaml
|
|
98
598
|
domains:
|
|
99
599
|
- id: sales_domain
|
|
100
600
|
name: "Sales Operations"
|
|
101
|
-
|
|
601
|
+
description: "Core transactional data."
|
|
602
|
+
color: "rgba(239, 68, 68, 0.1)"
|
|
603
|
+
tables: [dim_customers, fct_orders]
|
|
604
|
+
|
|
605
|
+
- id: analytics_domain
|
|
606
|
+
name: "Analytics & Insights"
|
|
607
|
+
color: "rgba(245, 158, 11, 0.1)"
|
|
608
|
+
tables: [mart_monthly_revenue]
|
|
102
609
|
|
|
103
610
|
tables:
|
|
611
|
+
- id: dim_customers
|
|
612
|
+
name: "Customers"
|
|
613
|
+
logical_name: "Customer Master"
|
|
614
|
+
physical_name: "dim_customers_v2"
|
|
615
|
+
appearance:
|
|
616
|
+
type: dimension
|
|
617
|
+
scd: type2
|
|
618
|
+
icon: "👤"
|
|
619
|
+
conceptual:
|
|
620
|
+
description: "One row per unique customer version (SCD Type 2)."
|
|
621
|
+
tags: [WHO]
|
|
622
|
+
columns:
|
|
623
|
+
- id: customer_key
|
|
624
|
+
logical: { name: "Customer Key", type: Int, isPrimaryKey: true }
|
|
625
|
+
- id: customer_name
|
|
626
|
+
logical: { name: "Name", type: String }
|
|
627
|
+
- id: dw_valid_from
|
|
628
|
+
logical: { name: "Valid From", type: Timestamp, isMetadata: true }
|
|
629
|
+
sampleData:
|
|
630
|
+
- [1, "Acme Corp", "2024-01-01T00:00:00Z"]
|
|
631
|
+
- [2, "Beta Ltd", "2024-03-15T00:00:00Z"]
|
|
632
|
+
- [3, "Gamma Inc", "2024-06-01T00:00:00Z"]
|
|
633
|
+
|
|
104
634
|
- id: fct_orders
|
|
105
635
|
name: "Orders"
|
|
106
636
|
logical_name: "Order Transactions"
|
|
107
637
|
physical_name: "fct_sales_orders"
|
|
108
638
|
appearance: { type: fact, sub_type: transaction, icon: "🛒" }
|
|
639
|
+
conceptual:
|
|
640
|
+
description: "One row per order line item."
|
|
641
|
+
tags: [WHAT, HOW_MUCH]
|
|
642
|
+
implementation:
|
|
643
|
+
materialization: incremental
|
|
644
|
+
incremental_strategy: merge
|
|
645
|
+
unique_key: order_id
|
|
646
|
+
partition_by: { field: order_date, granularity: day }
|
|
647
|
+
cluster_by: [customer_key]
|
|
109
648
|
columns:
|
|
110
649
|
- id: order_id
|
|
111
|
-
logical: { name: "ID", type: Int, isPrimaryKey: true }
|
|
650
|
+
logical: { name: "Order ID", type: Int, isPrimaryKey: true }
|
|
651
|
+
physical: { name: "order_id", type: "BIGINT", constraints: [NOT NULL] }
|
|
652
|
+
- id: customer_key
|
|
653
|
+
logical: { name: "Customer Key", type: Int, isForeignKey: true }
|
|
112
654
|
- id: amount
|
|
113
655
|
logical: { name: "Amount", type: Decimal, additivity: fully }
|
|
114
656
|
sampleData:
|
|
115
|
-
- [
|
|
116
|
-
- [
|
|
117
|
-
- [
|
|
657
|
+
- [1001, 1, 150.00]
|
|
658
|
+
- [1002, 2, 89.50]
|
|
659
|
+
- [1003, 1, 210.00]
|
|
660
|
+
|
|
661
|
+
- id: mart_monthly_revenue
|
|
662
|
+
name: "Monthly Revenue"
|
|
663
|
+
logical_name: "Executive Revenue Summary"
|
|
664
|
+
physical_name: "mart_finance_monthly_revenue_agg"
|
|
665
|
+
appearance: { type: mart, icon: "📈" }
|
|
666
|
+
lineage: # mart → use lineage, not relationships
|
|
667
|
+
upstream:
|
|
668
|
+
- fct_orders
|
|
669
|
+
- dim_customers
|
|
670
|
+
implementation:
|
|
671
|
+
materialization: table
|
|
672
|
+
grain: [month_key]
|
|
673
|
+
measures:
|
|
674
|
+
- column: total_revenue
|
|
675
|
+
agg: sum
|
|
676
|
+
source_column: fct_orders.amount
|
|
677
|
+
columns:
|
|
678
|
+
- id: month_key
|
|
679
|
+
logical: { name: "Month", type: String, isPrimaryKey: true }
|
|
680
|
+
- id: total_revenue
|
|
681
|
+
logical: { name: "Revenue", type: Decimal, additivity: fully }
|
|
682
|
+
sampleData:
|
|
683
|
+
- ["2024-01", 12450.50]
|
|
684
|
+
- ["2024-02", 15200.00]
|
|
685
|
+
- ["2024-03", 18900.75]
|
|
686
|
+
|
|
687
|
+
relationships: # ER only — not for lineage
|
|
688
|
+
- from: { table: dim_customers, column: customer_key }
|
|
689
|
+
to: { table: fct_orders, column: customer_key }
|
|
690
|
+
type: one-to-many
|
|
691
|
+
|
|
692
|
+
annotations:
|
|
693
|
+
- id: note_001
|
|
694
|
+
type: sticky
|
|
695
|
+
text: "Grain: one row per order line item."
|
|
696
|
+
targetId: fct_orders
|
|
697
|
+
targetType: table
|
|
698
|
+
offset: { x: 100, y: -80 }
|
|
118
699
|
|
|
119
700
|
layout:
|
|
120
|
-
|
|
121
|
-
|
|
701
|
+
# Domains — width/height calculated by formula
|
|
702
|
+
# sales_domain: 2 tables side by side → 2-col × 1-row → w:880, h:400
|
|
703
|
+
sales_domain:
|
|
704
|
+
x: 0
|
|
705
|
+
y: 0
|
|
706
|
+
width: 880
|
|
707
|
+
height: 400
|
|
708
|
+
|
|
709
|
+
# Tables inside sales_domain — coordinates relative to domain origin
|
|
710
|
+
dim_customers:
|
|
711
|
+
x: 80
|
|
712
|
+
y: 80
|
|
713
|
+
parentId: sales_domain
|
|
714
|
+
|
|
715
|
+
fct_orders:
|
|
716
|
+
x: 480
|
|
717
|
+
y: 80
|
|
718
|
+
parentId: sales_domain
|
|
719
|
+
|
|
720
|
+
# analytics_domain: 1 table → 1-col × 1-row → w:480, h:400
|
|
721
|
+
analytics_domain:
|
|
722
|
+
x: 1000
|
|
723
|
+
y: 0
|
|
724
|
+
width: 480
|
|
725
|
+
height: 400
|
|
726
|
+
|
|
727
|
+
mart_monthly_revenue:
|
|
728
|
+
x: 80
|
|
729
|
+
y: 80
|
|
730
|
+
parentId: analytics_domain
|
|
122
731
|
```
|