modscape 1.1.1 → 1.1.3
- package/package.json +1 -1
- package/src/templates/rules.md +105 -103
- package/visualizer/package.json +1 -1
- package/visualizer-dist/assets/index-BnLiPShL.js +63 -0
- package/visualizer-dist/assets/index-Bu1U91mc.css +1 -0
- package/visualizer-dist/index.html +2 -2
- package/visualizer-dist/assets/index-BHqdts7X.css +0 -1
- package/visualizer-dist/assets/index-Dtgsjgib.js +0 -63
package/package.json CHANGED
package/src/templates/rules.md CHANGED
@@ -1,120 +1,122 @@
-# Data Modeling Rules
-
-##
-
-
-
-
-
-
--
-
-
-
-
-
-
-
-
-
-##
-
-
-###
--
-
-
--
-
-
-
-
--
--
-
-
-
-
-
-
-
-- **
-- **
-
-
-
-
-
+# Modscape Data Modeling Rules (AI-Optimized)
+
+## 0. Foundational Principle: Instruction Fidelity
+AI agents MUST prioritize the user's specific instructions above all else.
+- **Accuracy**: Precisely implement every table, column, and relationship requested.
+- **Completeness**: Do not omit requested details for the sake of brevity.
+- **Expert Guidance**: If a user's instruction contradicts modeling best practices (e.g., mixing grains), **warn the user and suggest an alternative**, but do not ignore the original intent.
+
+## CRITICAL: YAML Architecture
+AI agents MUST follow this root-level structure. Schema violations will cause parsing errors.
+
+1. **`domains`**: (Array) Visual groupings.
+2. **`tables`**: (Array) Entity definitions. **NEVER put `x` or `y` coordinates here.**
+3. **`relationships`**: (Array) ER connections.
+4. **`annotations`**: (Array) Sticky notes and callouts.
+5. **`layout`**: (Dictionary) **MANDATORY**. All coordinates MUST live here, keyed by object ID.
+
+---
+
+## 1. Beautiful Layout Heuristics
+To ensure a professional and clean diagram, AI agents MUST use the following numeric standards:
+
+### Standard Metrics
+- **Grid Snapping**: All `x` and `y` values MUST be multiples of **40** (e.g., 0, 40, 80, 120).
+- **Standard Table Width**: `320`
+- **Standard Table Height**: `240` (base)
+- **Node Spacing (Gap)**: Minimum `120` between nodes.
+
+### Directional Flow
+- **Data Lineage (Horizontal)**:
+- Upstream (Source) tables on the **LEFT**.
+- Downstream (Target) tables on the **RIGHT**.
+- **ER Relationships (Vertical)**:
+- Master/Dimension/Hub tables on the **TOP**.
+- Fact/Transaction/Link tables on the **BOTTOM**.
+
+### Domain Containers
+- Tables inside a domain are positioned **relative** to the domain's (0,0) origin.
+- **Domain Packing (Arithmetic Rule)**:
+To ensure tables fit perfectly inside a domain, calculate dimensions as follows:
+- **Width**: `(Cols * 320) + ((Cols - 1) * 80) + 160` (Padding). *Example: 2-col domain = 880px wide.*
+- **Height**: `(Rows * 240) + ((Rows - 1) * 80) + 160` (Padding). *Example: 2-row domain = 720px high.*
+- **Boundary Constraint**: NEVER place a table such that its right/bottom edge exceeds the domain's `width`/`height`.
+
+---
+
+## 2. Table Naming Hierarchy (3-Layer)
+Bridge the gap between business and tech by populating all three layers:
 
-
-
+1. **Conceptual Name (`name`)**: Business title (e.g., "Customers"). High-level clarity.
+2. **Logical Name (`logical_name`)**: Formal modeling name (e.g., "Customer Master"). Hidden if identical to `name`.
+3. **Physical Name (`physical_name`)**: Actual database table name (e.g., `dim_customers_v1`).
 
-
-- **Schema Standards**: `raw`, `staging`, `analytics`, or `mart`.
-- **Data Types**: Use standard SQL types (e.g., `VARCHAR`, `BIGINT`, `NUMBER(38,2)`, `DATE`, `TIMESTAMP_NTZ`).
+---
 
-##
-
-- **Metadata Columns**: Mark technical columns (e.g., `updated_at`) with `isMetadata: true` (🕒 icon).
-- **Additivity**:
-- `fully`: Can be summed across all dimensions (Σ icon).
-- `semi`: Summing is restricted (e.g., bank balance) (Σ~ icon).
-- `non`: Summing is never valid (e.g., unit price) (⊘ icon).
+## 3. Modeling Strategy & Intelligence
+AI agents MUST analyze the nature of data to choose the correct classification and methodology.
 
-
-
--
+### Table Classification Heuristics
+- **Fact (`fact`)**: Data represents **Events, Transactions, or Measurements** (e.g., "Sales", "Clicks"). Usually has numbers (measures) and foreign keys.
+- **Dimension (`dimension`)**: Data represents **Entities, People, or Reference Lists** (e.g., "Customers", "Products"). Contains descriptive attributes.
+- **Hub (`hub`)**: Data represents a **Unique Business Key** (e.g., "Customer ID"). Used in Data Vault for core entity identification.
+- **Satellite (`satellite`)**: Data represents **Descriptive Attributes of a Hub over time**. Always linked to a Hub.
 
-
-
--
-- Use `description` to explain the domain's business purpose.
+### Defining the Grain (The "1-Row Rule")
+- Before adding columns, define the **Grain**: What does one row represent? (e.g., "One line item per invoice").
+- **STRICT**: NEVER mix grains in a single table. Aggregated measures and atomic transactions MUST be in separate tables.
 
-
+### Methodology Selection
+- **Star Schema**: Use for most business reporting. Prioritize user-friendliness and query performance.
+- **Data Vault 2.0**: Use for high-integration environments with many source systems. Prioritize scalability and auditability over direct queryability.
+
+---
+
+## 4. Logical Column Rules
+- **Key Flags**: Mark `isPrimaryKey`, `isForeignKey`, or `isPartitionKey`.
+- **Metadata**: Mark technical columns (e.g., `dw_load_date`) with `isMetadata: true`.
+- **Additivity**: `fully` (Summable), `semi` (Balance), `non` (Price/ID).
+
+---
+
+## 5. Sample Data Stories
+**Every table MUST include high-quality sample data.**
+- **Format**: 2D array. First row is Header IDs.
+- **Storytelling**: Provide at least 3 rows representing a real business scenario. Avoid "test1", "test2". Use realistic names, dates, and amounts.
+
+---
+
+## 6. Prohibitions & Anti-Patterns
+- **NO NESTED LAYOUT**: Never put `x` or `y` inside `tables[...]` or `domains[...]`.
+- **NO FLOATS**: Use only integers for coordinates.
+- **NO FRAGMENTED LINEAGE**: Always define `lineage.upstream` for derived tables.
+
+---
+
+## 7. Golden Schema Example
 ```yaml
 domains:
-  - id:
-    name: Sales
-    color: "rgba(59, 130, 246, 0.1)"
+  - id: sales_domain
+    name: "Sales Operations"
     tables: [fct_orders]
 
 tables:
   - id: fct_orders
-    name: Orders
-    logical_name: "
-    physical_name: "
-    appearance:
-      type: fact
-      sub_type: transaction
-      icon: "🛒"
-    physical:
-      name: fct_orders
-      schema: analytics
+    name: "Orders"
+    logical_name: "Order Transactions"
+    physical_name: "fct_sales_orders"
+    appearance: { type: fact, sub_type: transaction, icon: "🛒" }
     columns:
       - id: order_id
-        logical: { name: ID, type: Int, isPrimaryKey: true }
-        physical: { name: order_id, type: BIGINT }
+        logical: { name: "ID", type: Int, isPrimaryKey: true }
       - id: amount
-        logical: { name: Amount, type: Decimal, additivity: fully }
-        physical: { name: total_amount, type: NUMBER(18,2) }
+        logical: { name: "Amount", type: Decimal, additivity: fully }
     sampleData:
-      - [order_id, amount
-      - [1001, 50.0
-      - [1002, 120.5
-
-relationships:
-  - from: { table: dim_customers, column: customer_id }
-    to: { table: fct_orders, column: customer_id }
-    type: one-to-many
-```
+      - [order_id, amount]
+      - [1001, 50.0]
+      - [1002, 120.5]
 
-
-
-
-
-## 11. The Golden Rules
-- When updating `model.yaml`, always self-audit against these rules.
-- **Utilize the 3-layer naming (Conceptual, Logical, Physical) for clarity.**
-- **Generate realistic `sampleData` for every new entity.**
-- **Provide `physical` mappings to make the model implementation-ready.**
-- Use `isPartitionKey` for large tables to inform performance design.
+layout:
+  sales_domain: { x: 0, y: 0, width: 480, height: 400 }
+  fct_orders: { x: 80, y: 80 } # Relative to domain
+```
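The "Domain Packing (Arithmetic Rule)", "Grid Snapping", and "Boundary Constraint" rules added to rules.md in this diff are plain arithmetic, so they can be checked mechanically. The following is a minimal Python sketch and not part of the modscape package. The function names (`domain_size`, `is_snapped`, `fits_in_domain`) are hypothetical, the constants are copied from the rules text, and the boundary check assumes a table of the standard 320x240 size with non-negative relative coordinates.

```python
# Illustrative helpers only; names are hypothetical, constants come from rules.md.
GRID = 40           # "Grid Snapping": coordinates are multiples of 40
TABLE_WIDTH = 320   # "Standard Table Width"
TABLE_HEIGHT = 240  # "Standard Table Height" (base)
GAP = 80            # gap term used by the Domain Packing formula
PADDING = 160       # padding term used by the Domain Packing formula


def domain_size(cols: int, rows: int) -> tuple[int, int]:
    """Return (width, height) of a domain holding a cols x rows grid of tables."""
    if cols < 1 or rows < 1:
        raise ValueError("cols and rows must be at least 1")
    width = cols * TABLE_WIDTH + (cols - 1) * GAP + PADDING
    height = rows * TABLE_HEIGHT + (rows - 1) * GAP + PADDING
    return width, height


def is_snapped(x, y) -> bool:
    """True when both coordinates are integers lying on the 40px grid."""
    return all(isinstance(v, int) and v % GRID == 0 for v in (x, y))


def fits_in_domain(x: int, y: int, domain_width: int, domain_height: int) -> bool:
    """True when a standard-size table at relative (x, y) stays inside the domain.

    Assumption: coordinates are relative to the domain origin and non-negative.
    """
    return (
        x >= 0
        and y >= 0
        and x + TABLE_WIDTH <= domain_width
        and y + TABLE_HEIGHT <= domain_height
    )


print(domain_size(2, 2))                 # (880, 720), the two examples in the rules
print(domain_size(1, 1))                 # (480, 400), the golden example's domain
print(fits_in_domain(80, 80, 480, 400))  # True for fct_orders in the golden example
```

Under these assumptions the golden example is self-consistent: a one-table domain works out to 480x400, and `fct_orders` at (80, 80) is grid-aligned and ends at (400, 320), inside the domain. The packing formula uses an 80px gap even though the "Node Spacing" rule states a minimum of 120, so the sketch keeps the formula's value and does not try to reconcile the two.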