opencode-skills-collection 2.0.0 → 2.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (90)
  1. package/bundled-skills/.antigravity-install-manifest.json +6 -1
  2. package/bundled-skills/docs/integrations/jetski-cortex.md +3 -3
  3. package/bundled-skills/docs/integrations/jetski-gemini-loader/README.md +1 -1
  4. package/bundled-skills/docs/maintainers/repo-growth-seo.md +3 -3
  5. package/bundled-skills/docs/maintainers/skills-update-guide.md +1 -1
  6. package/bundled-skills/docs/users/bundles.md +1 -1
  7. package/bundled-skills/docs/users/claude-code-skills.md +1 -1
  8. package/bundled-skills/docs/users/gemini-cli-skills.md +1 -1
  9. package/bundled-skills/docs/users/getting-started.md +1 -1
  10. package/bundled-skills/docs/users/kiro-integration.md +1 -1
  11. package/bundled-skills/docs/users/usage.md +4 -4
  12. package/bundled-skills/docs/users/visual-guide.md +4 -4
  13. package/bundled-skills/manage-skills/SKILL.md +187 -0
  14. package/bundled-skills/monte-carlo-monitor-creation/SKILL.md +222 -0
  15. package/bundled-skills/monte-carlo-monitor-creation/references/comparison-monitor.md +426 -0
  16. package/bundled-skills/monte-carlo-monitor-creation/references/custom-sql-monitor.md +207 -0
  17. package/bundled-skills/monte-carlo-monitor-creation/references/metric-monitor.md +292 -0
  18. package/bundled-skills/monte-carlo-monitor-creation/references/table-monitor.md +231 -0
  19. package/bundled-skills/monte-carlo-monitor-creation/references/validation-monitor.md +404 -0
  20. package/bundled-skills/monte-carlo-prevent/SKILL.md +252 -0
  21. package/bundled-skills/monte-carlo-prevent/references/TROUBLESHOOTING.md +23 -0
  22. package/bundled-skills/monte-carlo-prevent/references/parameters.md +32 -0
  23. package/bundled-skills/monte-carlo-prevent/references/workflows.md +478 -0
  24. package/bundled-skills/monte-carlo-push-ingestion/SKILL.md +363 -0
  25. package/bundled-skills/monte-carlo-push-ingestion/references/anomaly-detection.md +87 -0
  26. package/bundled-skills/monte-carlo-push-ingestion/references/custom-lineage.md +203 -0
  27. package/bundled-skills/monte-carlo-push-ingestion/references/direct-http-api.md +207 -0
  28. package/bundled-skills/monte-carlo-push-ingestion/references/prerequisites.md +150 -0
  29. package/bundled-skills/monte-carlo-push-ingestion/references/push-lineage.md +160 -0
  30. package/bundled-skills/monte-carlo-push-ingestion/references/push-metadata.md +158 -0
  31. package/bundled-skills/monte-carlo-push-ingestion/references/push-query-logs.md +219 -0
  32. package/bundled-skills/monte-carlo-push-ingestion/references/validation.md +257 -0
  33. package/bundled-skills/monte-carlo-push-ingestion/scripts/sample_verify.py +357 -0
  34. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_and_push_lineage.py +70 -0
  35. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_and_push_metadata.py +65 -0
  36. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_and_push_query_logs.py +70 -0
  37. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_lineage.py +214 -0
  38. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_metadata.py +160 -0
  39. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_query_logs.py +164 -0
  40. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/push_lineage.py +198 -0
  41. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/push_metadata.py +193 -0
  42. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/push_query_logs.py +207 -0
  43. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_and_push_metadata.py +71 -0
  44. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_and_push_query_logs.py +64 -0
  45. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_metadata.py +253 -0
  46. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_query_logs.py +149 -0
  47. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/push_metadata.py +190 -0
  48. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/push_query_logs.py +208 -0
  49. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_and_push_lineage.py +83 -0
  50. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_and_push_metadata.py +77 -0
  51. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_and_push_query_logs.py +83 -0
  52. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_lineage.py +240 -0
  53. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_metadata.py +212 -0
  54. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_query_logs.py +204 -0
  55. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/push_lineage.py +192 -0
  56. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/push_metadata.py +178 -0
  57. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/push_query_logs.py +200 -0
  58. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_and_push_lineage.py +119 -0
  59. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_and_push_metadata.py +119 -0
  60. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_and_push_query_logs.py +117 -0
  61. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_lineage.py +265 -0
  62. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_metadata.py +313 -0
  63. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_query_logs.py +284 -0
  64. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/push_lineage.py +309 -0
  65. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/push_metadata.py +245 -0
  66. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/push_query_logs.py +255 -0
  67. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_and_push_lineage.py +78 -0
  68. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_and_push_metadata.py +80 -0
  69. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_and_push_query_logs.py +88 -0
  70. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_lineage.py +235 -0
  71. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_metadata.py +219 -0
  72. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_query_logs.py +239 -0
  73. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/push_lineage.py +178 -0
  74. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/push_metadata.py +178 -0
  75. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/push_query_logs.py +196 -0
  76. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_and_push_lineage.py +154 -0
  77. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_and_push_metadata.py +137 -0
  78. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_and_push_query_logs.py +137 -0
  79. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_lineage.py +349 -0
  80. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_metadata.py +329 -0
  81. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_query_logs.py +254 -0
  82. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/push_lineage.py +307 -0
  83. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/push_metadata.py +228 -0
  84. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/push_query_logs.py +248 -0
  85. package/bundled-skills/monte-carlo-push-ingestion/scripts/test_template_sdk_usage.py +340 -0
  86. package/bundled-skills/monte-carlo-validation-notebook/SKILL.md +685 -0
  87. package/bundled-skills/monte-carlo-validation-notebook/scripts/generate_notebook_url.py +141 -0
  88. package/bundled-skills/monte-carlo-validation-notebook/scripts/resolve_dbt_schema.py +161 -0
  89. package/package.json +1 -1
  90. package/skills_index.json +503 -61
@@ -0,0 +1,32 @@
1
+ # MCP Parameter Notes
2
+
3
+ Important parameter details for Monte Carlo MCP tools. Consult when making API
4
+ calls to avoid common mistakes.
5
+
6
+ ---
7
+
8
+ ## `getAlerts` — use snake_case parameters
9
+
10
+ The MCP tool uses Python snake_case, **not** the camelCase params from the MC web UI:
11
+
12
+ ```
13
+ ✓ created_after (not createdTime.after)
14
+ ✓ created_before (not createdTime.before)
15
+ ✓ order_by (not orderBy)
16
+ ✓ table_mcons (not tableMcons)
17
+ ```
18
+
19
+ Always provide `created_after` and `created_before`. Max window is 60 days.
20
+ Use `getCurrentTime()` to get the current ISO timestamp when needed.
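
A minimal sketch of building the two required timestamps while respecting the 60-day limit. The helper name `alert_window` is hypothetical, and the exact timestamp format the MCP tool accepts is an assumption (ISO 8601 with a UTC offset is used here).

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

MAX_WINDOW_DAYS = 60  # limit stated in the parameter notes


def alert_window(days_back: int, now: datetime | None = None) -> dict[str, str]:
    """Build created_after / created_before as ISO 8601 strings."""
    if not 1 <= days_back <= MAX_WINDOW_DAYS:
        raise ValueError(f"days_back must be between 1 and {MAX_WINDOW_DAYS}")
    end = now or datetime.now(timezone.utc)
    start = end - timedelta(days=days_back)
    return {
        "created_after": start.isoformat(),
        "created_before": end.isoformat(),
    }
```

The returned dictionary can be merged into the other `getAlerts` arguments, for example `table_mcons`.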
21
+
22
+ ---
23
+
24
+ ## `search` — finding the right table identifier
25
+
26
+ MC uses MCONs (Monte Carlo Object Names) as table identifiers. Always use
27
+ `search` first to resolve a table name to its MCON before calling `getTable`,
28
+ `getAssetLineage`, or `getAlerts`.
29
+
30
+ ```
31
+ search(query="orders_status") → returns mcon, full_table_id, warehouse
32
+ ```
@@ -0,0 +1,478 @@
1
+ # Workflow Details
2
+
3
+ Detailed step-by-step instructions for each Monte Carlo Prevent workflow.
4
+ These are referenced from the main SKILL.md — consult the relevant section when
5
+ executing a workflow.
6
+
7
+ ---
8
+
9
+ ## Workflow 1: Table health check — when opening or editing a model
10
+
11
+ When the user opens a dbt model or mentions a table, run this sequence automatically:
12
+
13
+ ```
14
+ 1. search(query="<table_name>") → get the full MCON/table identifier
15
+ 2. getTable(mcon="<mcon>") → schema, freshness, row count, importance score, monitoring status
16
+ 3. getAssetLineage(mcon="<mcon>") → upstream sources, downstream dependents
17
+ 4. getAlerts(created_after="<7 days ago>", created_before="<now>", table_mcons=["<mcon>"]) → active alerts
18
+ ```
19
+
20
+ Summarize for the user:
21
+ - **Health**: last updated, row count, is it monitored?
22
+ - **Lineage**: N upstream sources, M downstream consumers (name the important ones)
23
+ - **Alerts**: any active/unacknowledged incidents — lead with these if present
24
+ - **Risk signals** (lite): flag if importance score is high, if key assets are downstream, or if alerts are already firing — these indicate the table warrants extra care before modification
25
+
26
+ Example summary to offer unprompted when a dbt model file is opened:
27
+ > "The table `orders_status` was last updated 2 hours ago with 142K rows. It has 3 downstream dependents including `order_status_snapshot` (key asset). There are 2 active freshness alerts — this table warrants extra care before modification. Want me to run a full change impact assessment?"
28
+
29
+ **Auto-escalation rule — after completing steps 1–4 above:**
30
+
31
+ First, check whether the user has expressed intent to modify the model
32
+ in this session (e.g. mentioned a change, asked to add/edit/fix something).
33
+
34
+ IF change intent has been expressed AND any of the following are true:
35
+ - One or more active/unacknowledged alerts exist on the table
36
+ - One or more downstream dependents are key assets
37
+ - The table's importance score is above 0.8
38
+ → Ask the user before running Workflow 4:
39
+ "This is a high-importance table with [N active alerts / key asset
40
+ dependents / importance score 0.989]. Do you want me to run a full
41
+ change impact assessment before proceeding? (yes/no)"
42
+ → Wait for confirmation. If yes → run Workflow 4.
43
+ If no → proceed but note: "Skipping impact assessment at your request."
44
+
45
+ IF risk signals exist but NO change intent has been expressed:
46
+ → Surface the health summary and note the risk signals only:
47
+ "This is a high-importance table with key asset dependents. When
48
+ you're ready to make changes, say 'run impact assessment' or just
49
+ describe your change and I'll run it automatically."
50
+ → Do NOT run Workflow 4. Do NOT ask about running Workflow 4.
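
The auto-escalation rule above can be sketched as a small decision function. The function name and return labels are hypothetical. The source does not say what to do when change intent exists but no risk signal is present, so returning `"summary_only"` in that case is an assumption.

```python
IMPORTANCE_THRESHOLD = 0.8  # threshold from the auto-escalation rule


def escalation_action(
    change_intent: bool,
    active_alerts: int,
    key_asset_dependents: int,
    importance_score: float,
) -> str:
    """Return 'ask_before_workflow_4', 'note_risk_only', or 'summary_only'."""
    risk = (
        active_alerts > 0
        or key_asset_dependents > 0
        or importance_score > IMPORTANCE_THRESHOLD
    )
    if not risk:
        return "summary_only"  # assumption: no rule is given for this case
    # Risk signals exist: ask only when the user has expressed change intent.
    return "ask_before_workflow_4" if change_intent else "note_risk_only"
```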
51
+
52
+ ### New model creation variant
53
+
54
+ When the user is creating a new .sql dbt model file (not editing an existing one):
55
+
56
+ 1. Parse all {{ ref('...') }} and {{ source('...', '...') }} calls from the SQL
57
+ 2. For each referenced table, run the standard Workflow 1 health check:
58
+ search() → getTable() → getAlerts()
59
+ 3. Surface a consolidated upstream health summary:
60
+ "Your new model references N upstream tables. Here's their current health:"
61
+ - List each with: last updated, active alerts (if any), key asset flag
62
+ 4. Flag any upstream table with active alerts as a risk:
63
+ "⚠️ <table_name> has <N> active alerts — your new model will inherit this data quality issue"
64
+
65
+ Skip getAssetLineage for new models — they have no downstream dependents yet.
66
+ Skip Workflow 4 for new models — there is no existing blast radius to assess.
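
Step 1 of the new-model variant (parsing `ref` and `source` calls) might look like the sketch below. It is a regex approximation rather than a Jinja parser: it assumes single-argument `ref('...')` and two-argument `source('...', '...')` calls with literal string arguments. The function name is hypothetical.

```python
import re

REF_RE = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")
SOURCE_RE = re.compile(
    r"\{\{\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}"
)


def upstream_references(sql: str) -> list[str]:
    """Return refs first, then sources as 'source.table', duplicates removed."""
    names = [m.group(1) for m in REF_RE.finditer(sql)]
    names += [f"{m.group(1)}.{m.group(2)}" for m in SOURCE_RE.finditer(sql)]
    return list(dict.fromkeys(names))  # dict preserves insertion order
```

Each returned name would then be passed to `search` to resolve its MCON before the health check.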
67
+
68
+ ---
69
+
70
+ ## Workflow 2: Add a monitor — when new transformation logic is added
71
+
72
+ > **For detailed monitor creation guidance** — including parameter validation, field-type compatibility checks, and common error prevention — see the `monitor-creation` skill (`skills/monitor-creation/SKILL.md`). The workflow below is a quick-path for the common "just added a column, offer a monitor" case within a prevent session.
73
+
74
+ When the user adds a new column, filter, or business rule, suggest adding a monitor. First, choose the monitor type based on what the new logic does:
75
+
76
+ ```
77
+ - New column with a row-level condition (null check, range, regex)
78
+ → createValidationMonitorMac
79
+
80
+ - New aggregate metric (row count, sum, average, percentile over time)
81
+ → createMetricMonitorMac
82
+
83
+ - Logic that should match another table or a prior time period
84
+ → createComparisonMonitorMac
85
+
86
+ - Complex business rule that doesn't fit the above
87
+ → createCustomSqlMonitorMac
88
+ ```
89
+
90
+ Then run the appropriate sequence:
91
+
92
+ ```
93
+ 1. Read the SQL file being edited to extract the specific transformation logic:
94
+ - Confirm the file path from conversation context (do not guess or assume)
95
+ - If no file path is clear, ask the engineer: "Which file contains the new logic?"
96
+ - Extract the specific new column definition, filter condition, or business rule
97
+ - Use this logic directly when constructing the monitor condition in step 3
98
+
99
+ 2. For validation monitors: getValidationPredicates() → show what validation types are available
100
+ For all types: determine the right tool from the selection guide above
101
+ 3. Call the selected create*MonitorMac tool:
102
+ - createValidationMonitorMac(mcon, description, condition_sql) → returns YAML
103
+ - createMetricMonitorMac(mcon, description, metric, operator) → returns YAML
104
+ - createComparisonMonitorMac(source_table, target_table, metric) → returns YAML
105
+ - createCustomSqlMonitorMac(mcon, description, sql) → returns YAML
106
+ ⚠ If createValidationMonitorMac fails (e.g. column doesn't exist yet in the live table),
107
+ fall back to createCustomSqlMonitorMac with an explicit SQL query instead.
108
+ 4. Save the YAML to <project>/monitors/<table_name>.yml
109
+ 5. Run: montecarlo monitors apply --dry-run (to preview)
110
+ 6. Run: montecarlo monitors apply --auto-yes (to apply)
111
+ ```
112
+
113
+ **Important — YAML format for `monitors apply`:**
114
+ All `create*MonitorMac` tools return YAML that is not directly compatible with `montecarlo monitors apply`. Reformat the output into a standalone monitor file with `montecarlo:` as the root key. The second-level key matches the monitor type: `custom_sql:`, `validation:`, `metric:`, or `comparison:`. The example below shows `custom_sql:` — substitute the appropriate key for other monitor types.
115
+
116
+ ```yaml
117
+ # monitors/<table_name>.yml ← monitor definitions only, NOT montecarlo.yml
118
+ montecarlo:
119
+   custom_sql:
120
+     - warehouse: <warehouse_name>
121
+       name: <monitor_name>
122
+       description: <description>
123
+       schedule:
124
+         interval_minutes: 720
125
+         start_time: '<ISO timestamp>'
126
+       sql: <your validation SQL>
127
+       alert_conditions:
128
+         - operator: GT
129
+           threshold_value: 0.0
130
+ ```
131
+
132
+ The `montecarlo.yml` project config is a **separate file** in the project root containing only:
133
+ ```yaml
134
+ # montecarlo.yml ← project config only, NOT monitor definitions
135
+ version: 1
136
+ namespace: <your-namespace>
137
+ default_resource: <warehouse_name>
138
+ ```
139
+
140
+ Do NOT put `version:`, `namespace:`, or `default_resource:` inside monitor definition files.
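
A small pre-flight check for the file-separation rule above, written as a sketch over an already-parsed YAML document (a plain `dict`). The function name is hypothetical, and the set of monitor-type keys is limited to the four named in this section, so treating any other key as unexpected is an assumption.

```python
PROJECT_ONLY_KEYS = {"version", "namespace", "default_resource"}
MONITOR_TYPES = {"custom_sql", "validation", "metric", "comparison"}


def monitor_file_problems(doc: dict) -> list[str]:
    """List structural problems in a parsed monitors/<table_name>.yml document."""
    problems = []
    misplaced = PROJECT_ONLY_KEYS & set(doc)
    if misplaced:
        problems.append(f"project-config keys in monitor file: {sorted(misplaced)}")
    body = doc.get("montecarlo")
    if not isinstance(body, dict):
        problems.append("missing 'montecarlo' root key")
        return problems
    unknown = set(body) - MONITOR_TYPES
    if unknown:
        problems.append(f"unexpected monitor type keys: {sorted(unknown)}")
    return problems
```

An empty list means the document has the expected shape; it says nothing about whether the monitor itself is valid.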
141
+
142
+ ---
143
+
144
+ ## Workflow 3: Alert triage — when investigating an active incident
145
+
146
+ ```
147
+ 1. getAlerts(
148
+ created_after="<start>",
149
+ created_before="<end>",
150
+ order_by="-createdTime",
151
+ statuses=["NOT_ACKNOWLEDGED"]
152
+ ) → list open alerts
153
+ 2. getTable(mcon="<affected_table_mcon>") → check current table state
154
+ 3. getAssetLineage(mcon="<mcon>") → identify upstream cause or downstream blast radius
155
+ 4. getQueriesForTable(mcon="<mcon>") → recent queries that might explain the anomaly
156
+ ```
157
+
158
+ To respond to an alert:
159
+ - `updateAlert(alert_id="<id>", status="ACKNOWLEDGED")` — acknowledge it
160
+ - `setAlertOwner(alert_id="<id>", owner="<email>")` — assign ownership
161
+ - `createOrUpdateAlertComment(alert_id="<id>", comment="<text>")` — add context
162
+
163
+ ---
164
+
165
+ ## Workflow 4: Change impact assessment — REQUIRED before modifying a model
166
+
167
+ **Trigger:** Any expressed intent to add, rename, drop, or change a column, join, filter, or model logic. Run this immediately — before writing any code — even if the user hasn't asked for it.
168
+
169
+ ### Bugfixes and reverts require impact assessment too
170
+
171
+ When the user says "fix", "revert", "restore", or "undo", run this workflow
172
+ before writing any code — even if the change seems small or safe.
173
+
174
+ A revert that undoes a column addition or changes join logic has the same
175
+ blast radius as the original change. Downstream models may have already
176
+ adapted to the "incorrect" behavior, meaning the fix itself could break them.
177
+
178
+ Pay special attention to:
179
+ - Whether the revert removes a column other models now depend on
180
+ - Whether downstream models reference the specific logic being reverted
181
+ - Whether active alerts may be related to the change being reverted
182
+
183
+ When the user is about to rename or drop a column, change a join condition, alter a filter, or refactor a model's logic, run this sequence to surface the blast radius before any changes are committed:
184
+
185
+ ```
186
+ 1. search(query="<table_name>") + getTable(mcon="<mcon>")
187
+ → importance score, query volume (reads/writes per day), key asset flag
188
+
189
+ 2. getAssetLineage(mcon="<mcon>")
190
+ → full list of downstream dependents; for each, note whether it is a key asset
191
+
192
+ 3. getTable(mcon="<downstream_mcon>") for each key downstream asset
193
+ → importance score, last updated, monitoring status
194
+
195
+ 4. getAlerts(
196
+ created_after="<7 days ago>",
197
+ created_before="<now>",
198
+ table_mcons=["<mcon>", "<downstream_mcon_1>", ...],
199
+ statuses=["NOT_ACKNOWLEDGED"]
200
+ )
201
+ → any active incidents already affecting this table or its dependents
202
+
203
+ 5. getQueriesForTable(mcon="<mcon>")
204
+ → recent queries; scan for references to the specific columns being changed
205
+ → use getQueryData(query_id="<id>") to fetch full SQL for ambiguous cases
206
+
207
+ 5b. Supplementary local search for downstream dbt refs:
208
+ - Search the local models/ directory for ref('<table_name>') (single-hop only)
209
+ - Compare results against getAssetLineage output from step 2
210
+ - If any local models reference this table but are NOT in MC's lineage results:
211
+ "⚠️ Found N local model(s) referencing this table not yet in MC's lineage: [list]"
212
+ - If no models/ directory exists in the current project, skip silently
213
+ - MC lineage remains the authoritative source — local grep is supplementary only
214
+
215
+ 6. getMonitors(mcon="<mcon>")
216
+ → which monitors are watching columns or metrics affected by the change
217
+ ```
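
Step 5b (the supplementary local search) could be sketched as follows. The function name is hypothetical, and two assumptions are made: a model's name equals its `.sql` file stem, and the lineage results have already been reduced to a set of model names that can be compared directly.

```python
from __future__ import annotations

import re
from pathlib import Path


def local_refs_missing_from_lineage(
    models_dir: Path, table_name: str, lineage_names: set[str]
) -> list[str]:
    """Return local models that ref() the table but are absent from lineage."""
    if not models_dir.is_dir():
        return []  # no models/ directory: skip silently
    pattern = re.compile(r"ref\(\s*['\"]" + re.escape(table_name) + r"['\"]\s*\)")
    found = {
        path.stem
        for path in models_dir.rglob("*.sql")
        if pattern.search(path.read_text(encoding="utf-8"))
    }
    return sorted(found - lineage_names)
```

A non-empty result feeds the warning message in step 5b; the lineage returned by `getAssetLineage` stays authoritative.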
218
+
219
+ ### Risk tier assessment
220
+
221
+ | Tier | Conditions |
222
+ |---|---|
223
+ | 🔴 High | Key asset downstream, OR active alerts already firing, OR >50 reads/day |
224
+ | 🟡 Medium | Non-key assets downstream, OR monitors on affected columns, OR moderate query volume |
225
+ | 🟢 Low | No downstream dependents, no active alerts, low query volume |
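
The tier table can be read as an ordered check, highest tier first. In the sketch below the function name and parameters are hypothetical, and `MODERATE_READS_PER_DAY` is an assumed cut-off because the table does not define "moderate query volume".

```python
HIGH_READS_PER_DAY = 50      # threshold from the tier table
MODERATE_READS_PER_DAY = 10  # assumption: "moderate" is not defined in the source


def risk_tier(
    *,
    key_asset_downstream: bool,
    active_alerts: int,
    reads_per_day: float,
    downstream_count: int,
    monitors_on_affected_columns: int,
) -> str:
    """Return 'high', 'medium', or 'low' following the tier table."""
    if key_asset_downstream or active_alerts > 0 or reads_per_day > HIGH_READS_PER_DAY:
        return "high"
    if (
        downstream_count > 0
        or monitors_on_affected_columns > 0
        or reads_per_day >= MODERATE_READS_PER_DAY
    ):
        return "medium"
    return "low"
```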
226
+
227
+ ### Multi-model changes
228
+
229
+ When the user is changing multiple models in the same session or same domain
230
+ (e.g., 3 timeseries models, 4 criticality_score models):
231
+
232
+ - Run a single consolidated impact assessment across all changed tables
233
+ - Deduplicate downstream dependents — if two changed tables share a downstream
234
+ dependent, count it once and note that it's affected by multiple upstream changes
235
+ - Present a unified blast radius report rather than N separate reports
236
+ - Escalate risk tier if the combined blast radius is larger than any individual table
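
Deduplicating downstream dependents across several changed tables can be sketched as an inversion of the per-table lineage lists. The function name and the input shape (changed table mapped to its list of downstream names) are assumptions for illustration.

```python
from __future__ import annotations

from collections import defaultdict


def consolidate_downstream(lineage: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each downstream table to the changed tables that feed it."""
    affected_by: dict[str, list[str]] = defaultdict(list)
    for changed_table, dependents in lineage.items():
        for dependent in dict.fromkeys(dependents):  # drop repeats within one list
            affected_by[dependent].append(changed_table)
    return dict(affected_by)
```

The number of keys is the deduplicated blast radius, and any entry with more than one upstream table is a dependent affected by multiple changes.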
237
+
238
+ Example consolidated report header:
239
+ "## Change Impact: 3 models in timeseries domain
240
+ Combined downstream blast radius: 28 tables (deduplicated)
241
+ Highest risk table: timeseries_detector_routing (22 downstream refs)"
242
+
243
+ ### Report format
244
+
245
+ ```
246
+ ## Change Impact: <table_name>
247
+
248
+ Risk: 🔴 High / 🟡 Medium / 🟢 Low
249
+
250
+ Downstream blast radius:
251
+ - <N> tables depend on this model
252
+ - Key assets affected: <list or "none">
253
+
254
+ Active incidents:
255
+ - <alert title, status> or "none"
256
+
257
+ Column exposure (for columns being changed):
258
+ - Found in <N> recent queries (e.g. <query snippet>)
259
+
260
+ Monitor coverage:
261
+ - <monitor name> watches <metric> — will be affected by this change
262
+ - If zero custom monitors exist → append:
263
+ "⚠️ No custom monitors on this table. After making your changes,
264
+ I'll suggest a monitor for the new logic — or say 'add a monitor'
265
+ to do it now."
266
+
267
+ Recommendation:
268
+ - <specific callout, e.g. "Notify owners of downstream_table before deploying",
269
+ "Coordinate with the freshness alert owner", "Add a monitor for the new column">
270
+ ```
271
+
272
+ If risk is 🔴 High:
273
+ 1. Call `getAudiences()` to retrieve configured notification audiences
274
+ 2. Include in the recommendation: "Notify: <audience names / channels>"
275
+ 3. Proactively suggest:
276
+ - Notifying owners of downstream key assets (`setAlertOwner` / `createOrUpdateAlertComment` on active alerts)
277
+ - Adding a monitor for the new logic before deploying (Workflow 2)
278
+ - Running `montecarlo monitors apply --dry-run` after changes to verify nothing breaks
279
+
280
+ ### Synthesis: translate findings into code recommendations
281
+
282
+ After presenting the impact report, use the findings to shape your code suggestion.
283
+ Do not present MC data and then write code as if the data wasn't there.
284
+ Explicitly connect each key finding to a specific recommendation:
285
+
286
+ - Active alerts firing on the table:
287
+ → Recommend deferring or minimally scoping the change until alerts are resolved
288
+ → Explain: "There are N active alerts on this table — making this change now
289
+ risks compounding an existing data quality issue"
290
+
291
+ - Key assets downstream:
292
+ → Recommend defensive coding patterns: null guards, backward-compatible changes,
293
+ additive-only schema changes where possible
294
+ → Explain: "X downstream key assets depend on this table — I'd recommend
295
+ writing this as [specific pattern] to avoid breaking [specific dependent]"
296
+
297
+ - Monitors on affected columns:
298
+ → Call out that the change will affect monitor coverage
299
+ → Recommend updating monitors alongside the code change (offer Workflow 2)
300
+ → Explain: "The existing monitor on [column] will need to be updated to
301
+ account for this change"
302
+
303
+ - New output column or logic being added:
304
+ → Always offer Workflow 2 after the impact assessment, regardless
305
+ of existing monitor coverage
306
+ → Do not skip this step even if risk tier is 🟢 Low
307
+ → Say explicitly: "This adds new output logic — would you like me
308
+ to generate a monitor for it? I can add a null check, range
309
+ validation, or custom SQL rule."
310
+ → Wait for the user's response before proceeding with the edit
311
+
312
+ - High read volume (>50 reads/day):
313
+ → Recommend extra caution around column renames or removals
314
+ → Suggest backward-compatible transition (add new column, deprecate old one)
315
+ → Explain: "This table has [N] reads/day — a column rename without a
316
+ transition period would break downstream consumers immediately"
317
+
318
+ - Column renames, even inside CTEs:
319
+ → Never assume a CTE-internal rename is safe. Always check:
320
+ 1. Does this column appear in the final SELECT, directly or
321
+ via a CTE that feeds into the final SELECT?
322
+ 2. If yes — treat as a breaking change. Recommend a
323
+ backward-compatible transition: add the correctly-named
324
+ column, keep the old one temporarily, remove in a
325
+ follow-up PR.
326
+ 3. If truly internal and never surfaces in output — confirm
327
+ this explicitly before proceeding.
328
+ → Explain: "Even though this column is defined in a CTE, if it
329
+ surfaces in the final SELECT it is a public output column —
330
+ renaming it breaks any downstream model selecting it by name."
331
+
332
+ ---
333
+
334
+ ## Workflow 5: Change validation queries — after a code change is made
335
+
336
+ **Trigger:** Explicit engineer intent only. Activate when the engineer says something like:
337
+ - "generate validation queries", "validate this change", "I'm done with this change"
338
+ - "let me test this", "write queries to check this", "ready to commit"
339
+
340
+ **Required session context — do not activate without both:**
341
+ 1. Workflow 4 (change impact assessment) has run for this table in this session
342
+ 2. A file edit was made to a `.sql` or dbt model file for that same table
343
+
344
+ **Do NOT activate automatically after file edits. Do NOT proactively offer after Workflow 4 or file edits. The engineer asks when they are ready.**
345
+
346
+ ---
347
+
348
+ ### What this workflow does
349
+
350
+ Using the context already in the session — the Workflow 4 findings, the file diff, and the `getTable` result — generate 3–5 targeted SQL validation queries that directly test whether this specific change behaved as intended.
351
+
352
+ These are not generic templates. Use the semantic meaning of the change from Workflow 4 context: which columns changed and why, what business logic was affected, what downstream models depend on this table, and what monitors exist. A null check on a new `days_since_contract_start` column should verify it is never negative and never null for rows with a `contract_start_date` — not just check for nulls generically.
353
+
354
+ ---
355
+
356
+ ### Step 1 — Identify the change type from session context
357
+
358
+ From Workflow 4 findings and the file diff, classify the primary change. A change may span multiple types — classify the dominant one and note secondaries:
359
+
360
+ - **New column** — a new output column was added to the SELECT
361
+ - **Filter change** — a WHERE clause, IN-list, or CASE condition was modified
362
+ - **Join change** — a JOIN condition or join target was modified
363
+ - **Column rename or drop** — an existing output column was renamed or removed
364
+ - **Parameter change** — a hardcoded threshold, constant, or numeric value was changed
365
+ - **New model** — the file was newly created, no production baseline exists
366
+
367
+ ---
368
+
369
+ ### Step 2 — Determine warehouse context from Workflow 4
370
+
371
+ From the `getTable` result already in session context, extract:
372
+ - **Fully qualified table name** — e.g. `analytics.prod_internal_bi.client_hub_master`
373
+ - **Warehouse type** — Snowflake, BigQuery, Redshift, Databricks
374
+ - **Schema** — already resolved, do not re-derive
375
+
376
+ Use the correct SQL dialect for the warehouse type. Key differences:
377
+
378
+ | Warehouse | Date diff | Current timestamp | Notes |
379
+ |---|---|---|---|
380
+ | Snowflake | `DATEDIFF('day', a, b)` | `CURRENT_TIMESTAMP()` | `QUALIFY` supported |
381
+ | BigQuery | `DATE_DIFF(a, b, DAY)` | `CURRENT_TIMESTAMP()` | Use subquery instead of `QUALIFY` |
382
+ | Redshift | `DATEDIFF('day', a, b)` | `GETDATE()` | |
383
+ | Databricks | `DATEDIFF(a, b)` | `CURRENT_TIMESTAMP()` | |
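
One detail the table's `a, b` notation hides is argument order: Snowflake and Redshift take the start value first, while BigQuery's `DATE_DIFF` and Databricks' `DATEDIFF` take the end value first. The sketch below (hypothetical helper name) emits an expression for "end minus start, in days" per warehouse. It assumes DATE-typed operands; BigQuery timestamps would need `TIMESTAMP_DIFF` instead.

```python
def days_between_sql(warehouse: str, start: str, end: str) -> str:
    """Return a SQL expression for (end - start) in whole days."""
    w = warehouse.lower()
    if w in ("snowflake", "redshift"):
        return f"DATEDIFF('day', {start}, {end})"  # start first, then end
    if w == "bigquery":
        return f"DATE_DIFF({end}, {start}, DAY)"   # end first, then start
    if w == "databricks":
        return f"DATEDIFF({end}, {start})"         # end first, then start
    raise ValueError(f"unsupported warehouse: {warehouse}")
```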
384
+
385
+ For the dev database, use the placeholder `<YOUR_DEV_DATABASE>` with a comment instructing the engineer to replace it. Do not guess the dev database name.
386
+
387
+ ---
388
+
389
+ ### Step 3 — Apply database targeting rules (mandatory)
390
+
391
+ These rules are not negotiable — violating them produces queries that will fail at runtime:
392
+
393
+ - **Columns or logic that only exist post-change** → dev database only. Never query production for a column that doesn't exist there yet.
394
+ - **Comparison queries (before vs after)** → both production and dev databases
395
+ - **New model (no production baseline)** → dev database only for all queries
396
+ - **Row count comparison** → always include; query both databases, except for a new model, which has no production baseline (query dev only)
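
The targeting rules can be sketched as a lookup with the new-model rule taking precedence, since a new model has no production table to query. The function name and the `query_kind` labels are hypothetical.

```python
def target_databases(query_kind: str, new_model: bool = False) -> tuple[str, ...]:
    """Return which databases a validation query should run against."""
    if new_model:
        return ("dev",)  # no production baseline exists
    if query_kind in ("row_count", "comparison"):
        return ("prod", "dev")
    if query_kind == "post_change_only":
        return ("dev",)  # the column or logic does not exist in production yet
    raise ValueError(f"unknown query kind: {query_kind}")
```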
397
+
398
+ ---
399
+
400
+ ### Step 4 — Generate targeted validation queries
401
+
402
+ Always include a row count comparison regardless of change type — it's the baseline signal that something unexpected happened.
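
A sketch of generating the baseline row count comparison. The helper name is hypothetical, and it assumes a three-part `database.schema.table` name with the dev copy living under the same schema and table name; adjust if the project uses a different dev layout.

```python
DEV_PLACEHOLDER = "<YOUR_DEV_DATABASE>"


def row_count_comparison_sql(fully_qualified_table: str) -> str:
    """Build a prod-vs-dev row count query for a database.schema.table name."""
    # Raises ValueError if the name does not have exactly three parts.
    _, schema, table = fully_qualified_table.split(".")
    dev_table = f"{DEV_PLACEHOLDER}.{schema}.{table}"
    return (
        f"-- Replace {DEV_PLACEHOLDER} with your dev database before running.\n"
        f"SELECT 'prod' AS source, COUNT(*) AS row_count FROM {fully_qualified_table}\n"
        "UNION ALL\n"
        f"SELECT 'dev' AS source, COUNT(*) AS row_count FROM {dev_table};"
    )
```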
403
+
404
+ Then generate change-specific queries based on what needs to be validated for this change type. Use the exact conditions, column names, and business logic from the diff and Workflow 4 findings — not generic placeholders. The goal for each change type:
405
+
406
+ **New column:** Verify the column is non-null where it should be non-null (based on its business meaning), that its value range is plausible, and that its distribution makes sense given the underlying data. Query dev only.
407
+
408
+ **Filter change:** Verify that only the intended rows were reclassified — generate a before/after count showing how many rows were added or removed by the new condition using the exact filter logic from the diff, and a sample of the rows that changed classification. The sample helps the engineer confirm the right records moved.
409
+
410
+ **Join change:** Verify that the join didn't introduce duplicates — a uniqueness check on the join key is essential. Also verify row count didn't change unexpectedly. Query dev for uniqueness, both databases for row count.
411
+
412
+ **Column rename or drop:** Verify the old column name is absent and the new column (if renamed) is present in the dev schema. Also verify that downstream models referencing the old column name are identified — use the local ref() grep results from Workflow 4 if available.
413
+
414
+ **Parameter or threshold change:** Verify the distribution of values affected by the change — how many rows moved above or below the new threshold, and whether the count matches the engineer's expectation. Query both databases to compare before and after.
415
+
416
+ **New model:** No production comparison possible. Verify row count is non-zero and plausible, sample rows look correct, and key columns are non-null. Query dev only.
417
+
418
+ ---
419
+
420
+ ### Step 5 — Add change-specific context to each query
421
+
422
+ For every query, include a SQL comment block that explains:
423
+ - What the query is checking
424
+ - What a healthy result looks like **for this specific change**
425
+ - What would indicate a problem
426
+
427
+ Derive this context from Workflow 4 findings. Use the business meaning of the change, not generic descriptions. For example, for adding `days_since_contract_start`:
428
+
429
+ ```sql
430
+ /*
431
+ Null rate check: days_since_contract_start (new column, dev only)
432
+ What to look for:
433
+ - Null count should equal workspaces with no contract_start_date
434
+ - All rows with contract_start_date should have a non-null, non-negative value
435
+ - Values above 3650 (~10 years) are suspicious and may indicate a data issue
436
+ */
437
+ ```
438
+
439
+ This is what differentiates these queries from generic validation — the comment tells the engineer exactly what pass and fail look like for their specific change.
440
+
441
+ ---
442
+
443
+ ### Step 6 — Save to local file
444
+
445
+ Save all generated queries to:
446
+ ```
447
+ validation/<table_name>_<YYYYMMDD_HHMM>.sql
448
+ ```
449
+
450
+ Include a header at the top of the file:
451
+ ```sql
452
+ /*
453
+ Validation queries for: <fully_qualified_table>
454
+ Change type: <change type from Step 1>
455
+ Generated: <timestamp>
456
+ Workflow 4 risk tier: <tier from this session>
457
+
458
+ Instructions:
459
+ 1. Replace <YOUR_DEV_DATABASE> with your personal or branch database
460
+ 2. Run the row count comparison first
461
+ 3. Run change-specific queries to validate intended behavior
462
+ 4. Unexpected results should be investigated before merging
463
+ */
464
+ ```
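
Building the output path and the file header might look like this sketch. The function name is hypothetical, the `Generated:` timestamp format is an assumption, and the `Instructions:` lines from the header template are omitted for brevity.

```python
from datetime import datetime


def validation_file(
    table_name: str,
    fully_qualified_table: str,
    change_type: str,
    risk_tier: str,
    now: datetime,
) -> tuple[str, str]:
    """Return (relative path, SQL comment header) for a validation file."""
    path = f"validation/{table_name}_{now.strftime('%Y%m%d_%H%M')}.sql"
    header = "\n".join(
        [
            "/*",
            f"Validation queries for: {fully_qualified_table}",
            f"Change type: {change_type}",
            f"Generated: {now.isoformat(timespec='minutes')}",
            f"Workflow 4 risk tier: {risk_tier}",
            "*/",
        ]
    )
    return path, header
```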
465
+
466
+ Then tell the engineer:
467
+ > "Validation queries saved to `validation/<table_name>_<timestamp>.sql`.
468
+ > Replace `<YOUR_DEV_DATABASE>` with your dev database and run in Snowflake
469
+ > or your preferred SQL client to verify the change behaved as expected."
470
+
471
+ ---
472
+
473
+ ### What this workflow does NOT do
474
+ - Does not execute queries (Phase 2)
475
+ - Does not require warehouse MCP connection
476
+ - Does not generate Monte Carlo notebook YAML
477
+ - Does not trigger automatically — only on explicit engineer request
478
+ - Does not activate if Workflow 4 has not run for this table in this session