opencode-skills-collection 2.0.0 → 2.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (90)
  1. package/bundled-skills/.antigravity-install-manifest.json +6 -1
  2. package/bundled-skills/docs/integrations/jetski-cortex.md +3 -3
  3. package/bundled-skills/docs/integrations/jetski-gemini-loader/README.md +1 -1
  4. package/bundled-skills/docs/maintainers/repo-growth-seo.md +3 -3
  5. package/bundled-skills/docs/maintainers/skills-update-guide.md +1 -1
  6. package/bundled-skills/docs/users/bundles.md +1 -1
  7. package/bundled-skills/docs/users/claude-code-skills.md +1 -1
  8. package/bundled-skills/docs/users/gemini-cli-skills.md +1 -1
  9. package/bundled-skills/docs/users/getting-started.md +1 -1
  10. package/bundled-skills/docs/users/kiro-integration.md +1 -1
  11. package/bundled-skills/docs/users/usage.md +4 -4
  12. package/bundled-skills/docs/users/visual-guide.md +4 -4
  13. package/bundled-skills/manage-skills/SKILL.md +187 -0
  14. package/bundled-skills/monte-carlo-monitor-creation/SKILL.md +222 -0
  15. package/bundled-skills/monte-carlo-monitor-creation/references/comparison-monitor.md +426 -0
  16. package/bundled-skills/monte-carlo-monitor-creation/references/custom-sql-monitor.md +207 -0
  17. package/bundled-skills/monte-carlo-monitor-creation/references/metric-monitor.md +292 -0
  18. package/bundled-skills/monte-carlo-monitor-creation/references/table-monitor.md +231 -0
  19. package/bundled-skills/monte-carlo-monitor-creation/references/validation-monitor.md +404 -0
  20. package/bundled-skills/monte-carlo-prevent/SKILL.md +252 -0
  21. package/bundled-skills/monte-carlo-prevent/references/TROUBLESHOOTING.md +23 -0
  22. package/bundled-skills/monte-carlo-prevent/references/parameters.md +32 -0
  23. package/bundled-skills/monte-carlo-prevent/references/workflows.md +478 -0
  24. package/bundled-skills/monte-carlo-push-ingestion/SKILL.md +363 -0
  25. package/bundled-skills/monte-carlo-push-ingestion/references/anomaly-detection.md +87 -0
  26. package/bundled-skills/monte-carlo-push-ingestion/references/custom-lineage.md +203 -0
  27. package/bundled-skills/monte-carlo-push-ingestion/references/direct-http-api.md +207 -0
  28. package/bundled-skills/monte-carlo-push-ingestion/references/prerequisites.md +150 -0
  29. package/bundled-skills/monte-carlo-push-ingestion/references/push-lineage.md +160 -0
  30. package/bundled-skills/monte-carlo-push-ingestion/references/push-metadata.md +158 -0
  31. package/bundled-skills/monte-carlo-push-ingestion/references/push-query-logs.md +219 -0
  32. package/bundled-skills/monte-carlo-push-ingestion/references/validation.md +257 -0
  33. package/bundled-skills/monte-carlo-push-ingestion/scripts/sample_verify.py +357 -0
  34. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_and_push_lineage.py +70 -0
  35. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_and_push_metadata.py +65 -0
  36. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_and_push_query_logs.py +70 -0
  37. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_lineage.py +214 -0
  38. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_metadata.py +160 -0
  39. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/collect_query_logs.py +164 -0
  40. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/push_lineage.py +198 -0
  41. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/push_metadata.py +193 -0
  42. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery/push_query_logs.py +207 -0
  43. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_and_push_metadata.py +71 -0
  44. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_and_push_query_logs.py +64 -0
  45. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_metadata.py +253 -0
  46. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/collect_query_logs.py +149 -0
  47. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/push_metadata.py +190 -0
  48. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/bigquery-iceberg/push_query_logs.py +208 -0
  49. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_and_push_lineage.py +83 -0
  50. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_and_push_metadata.py +77 -0
  51. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_and_push_query_logs.py +83 -0
  52. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_lineage.py +240 -0
  53. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_metadata.py +212 -0
  54. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/collect_query_logs.py +204 -0
  55. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/push_lineage.py +192 -0
  56. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/push_metadata.py +178 -0
  57. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/databricks/push_query_logs.py +200 -0
  58. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_and_push_lineage.py +119 -0
  59. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_and_push_metadata.py +119 -0
  60. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_and_push_query_logs.py +117 -0
  61. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_lineage.py +265 -0
  62. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_metadata.py +313 -0
  63. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/collect_query_logs.py +284 -0
  64. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/push_lineage.py +309 -0
  65. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/push_metadata.py +245 -0
  66. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/hive/push_query_logs.py +255 -0
  67. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_and_push_lineage.py +78 -0
  68. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_and_push_metadata.py +80 -0
  69. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_and_push_query_logs.py +88 -0
  70. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_lineage.py +235 -0
  71. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_metadata.py +219 -0
  72. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/collect_query_logs.py +239 -0
  73. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/push_lineage.py +178 -0
  74. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/push_metadata.py +178 -0
  75. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/redshift/push_query_logs.py +196 -0
  76. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_and_push_lineage.py +154 -0
  77. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_and_push_metadata.py +137 -0
  78. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_and_push_query_logs.py +137 -0
  79. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_lineage.py +349 -0
  80. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_metadata.py +329 -0
  81. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/collect_query_logs.py +254 -0
  82. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/push_lineage.py +307 -0
  83. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/push_metadata.py +228 -0
  84. package/bundled-skills/monte-carlo-push-ingestion/scripts/templates/snowflake/push_query_logs.py +248 -0
  85. package/bundled-skills/monte-carlo-push-ingestion/scripts/test_template_sdk_usage.py +340 -0
  86. package/bundled-skills/monte-carlo-validation-notebook/SKILL.md +685 -0
  87. package/bundled-skills/monte-carlo-validation-notebook/scripts/generate_notebook_url.py +141 -0
  88. package/bundled-skills/monte-carlo-validation-notebook/scripts/resolve_dbt_schema.py +161 -0
  89. package/package.json +1 -1
  90. package/skills_index.json +503 -61
@@ -0,0 +1,404 @@
1
+ # Validation Monitor Reference
2
+
3
+ Detailed reference for building `createValidationMonitorMac` tool calls.
4
+
5
+ ## When to Use
6
+
7
+ Use a validation monitor when the user wants to:
8
+
9
+ - Check that specific fields are never null
10
+ - Validate that values are within an allowed set (e.g., status in 'active', 'pending', 'inactive')
11
+ - Enforce referential integrity (field values exist in another table)
12
+ - Apply row-level business rules (e.g., "amount must be positive")
13
+ - Combine multiple conditions with AND/OR logic
14
+
15
+ ---
16
+
17
+ ## Getting the Logic Right: Conditions Match INVALID Data
18
+
19
+ This is the single most confusing aspect of validation monitors and the number one source of mistakes. **Conditions describe what INVALID data looks like -- the data you want to be alerted about.** They do NOT describe what valid data looks like.
20
+
21
+ Think of it this way: the monitor scans rows and fires an alert when it finds rows matching the condition. So the condition must match the BAD rows.
22
+
23
+ | User wants | Condition should match | Common mistake |
24
+ |------------|----------------------|----------------|
25
+ | "id should never be null" | id IS NULL (alert when null found) | id IS NOT NULL (would alert on every valid row) |
26
+ | "status must be in [active, pending]" | status NOT IN [active, pending] (alert on unexpected values) | status IN [active, pending] (would alert on valid rows) |
27
+ | "amount must be positive" | amount IS NEGATIVE (alert on bad values) | amount > 0 (would alert on valid rows) |
28
+ | "email must not be empty" | email IS NULL **OR** email = '' (alert on missing) | email IS NOT NULL (would alert on valid rows) |
29
+
30
+ **Before building any condition, ask yourself: "If a row matches this condition, is the row INVALID?" If the answer is no, the logic is backwards.**
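To make the inversion concrete, here is a small local simulation in Python. It is only a sketch: the sample rows and the helper `matching_rows` are hypothetical and nothing here calls Monte Carlo. It shows why a condition written for valid data matches the healthy rows instead of the bad ones.

```python
# Illustrative only: the rows and the helper below are made up and evaluated locally.
rows = [
    {"id": 1, "status": "active"},
    {"id": None, "status": "pending"},
    {"id": 3, "status": "archived"},
]


def matching_rows(rows, condition):
    """Return the rows a condition matches, i.e. the rows that would raise an alert."""
    return [row for row in rows if condition(row)]


# Correct: the condition describes INVALID rows (id is null).
correct = matching_rows(rows, lambda row: row["id"] is None)

# Backwards: the condition describes VALID rows, so every healthy row matches.
backwards = matching_rows(rows, lambda row: row["id"] is not None)

print(len(correct), len(backwards))  # prints: 1 2
```

The correct condition flags only the single broken row, while the backwards one flags both healthy rows and misses the null.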
31
+
32
+ ---
33
+
34
+ ## Pre-Step: Verify Field Existence
35
+
36
+ Before constructing the `alert_condition`, verify that every field name you plan to reference exists in the table's column list. This is the number two source of validation monitor failures -- referencing columns that do not exist or are misspelled.
37
+
38
+ 1. You should already have the column list from `getTable` with `include_fields: true` (done in Step 2 of the main skill).
39
+ 2. For every field name in your planned conditions, confirm it appears in the column list exactly as spelled (field names are case-sensitive on most warehouses).
40
+ 3. If a field does not exist, stop and ask the user to clarify the correct column name. Do not guess.
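The check above can be sketched as a small helper that walks a condition tree and reports any FIELD name missing from the column list. This is a minimal sketch: the function names are hypothetical, and the column list is assumed to be a plain list of strings already extracted from the `getTable` response (whose exact shape is not shown here). SQL conditions are not parsed, so fields referenced inside raw SQL still need a manual check.

```python
def referenced_fields(node):
    """Collect every FIELD name referenced by UNARY/BINARY nodes in a condition tree."""
    fields = set()
    if node.get("type") == "GROUP":
        for child in node.get("conditions", []):
            fields |= referenced_fields(child)
    else:
        # UNARY nodes use "value"; BINARY nodes use "left" and "right".
        for key in ("value", "left", "right"):
            for descriptor in node.get(key, []):
                if descriptor.get("type") == "FIELD":
                    fields.add(descriptor["field"])
    return fields


def missing_fields(alert_condition, columns):
    """Return referenced fields that are absent from the column list (case-sensitive)."""
    return sorted(referenced_fields(alert_condition) - set(columns))


condition = {
    "type": "GROUP",
    "operator": "OR",
    "conditions": [
        {
            "type": "UNARY",
            "predicate": {"name": "null", "negated": False},
            "value": [{"type": "FIELD", "field": "amount"}],
        },
        {
            "type": "UNARY",
            "predicate": {"name": "is_negative", "negated": False},
            "value": [{"type": "FIELD", "field": "Quantity"}],
        },
    ],
}

# "Quantity" differs from "quantity" only in case, so it is reported as missing.
problems = missing_fields(condition, ["id", "amount", "quantity"])
print(problems)
```

A non-empty result is the signal to stop and ask the user for the correct column name rather than guessing.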
41
+
42
+ ---
43
+
44
+ ## Required Parameters
45
+
46
+ | Parameter | Type | Description |
47
+ |-----------|------|-------------|
48
+ | `name` | string | Unique identifier for the monitor. Use a descriptive slug (e.g., `orders_not_null_check`). |
49
+ | `description` | string | Human-readable description of what the monitor checks. |
50
+ | `table` | string | Table MCON (preferred) or `database:schema.table` format. If not MCON, also pass `warehouse`. |
51
+ | `alert_condition` | object | Condition tree defining when to alert (see Alert Condition Structure below). |
52
+
53
+ ## Optional Parameters
54
+
55
+ | Parameter | Type | Description |
56
+ |-----------|------|-------------|
57
+ | `warehouse` | string | Warehouse name or UUID. Required if `table` is not an MCON. |
58
+ | `domain_id` | string (uuid) | Domain UUID (use `getDomains` to list). |
59
+
60
+ ---
61
+
62
+ ## Alert Condition Structure
63
+
64
+ The top level of `alert_condition` must always be a GROUP node. This GROUP contains one or more conditions combined with AND or OR logic.
65
+
66
+ ```json
67
+ {
68
+ "type": "GROUP",
69
+ "operator": "AND",
70
+ "conditions": [...]
71
+ }
72
+ ```
73
+
74
+ ### Condition Types
75
+
76
+ There are four condition types: UNARY, BINARY, SQL, and GROUP.
77
+
78
+ #### UNARY (single-value checks)
79
+
80
+ Used for predicates that operate on a single field with no comparison value.
81
+
82
+ ```json
83
+ {
84
+ "type": "UNARY",
85
+ "predicate": {"name": "null", "negated": false},
86
+ "value": [{"type": "FIELD", "field": "column_name"}]
87
+ }
88
+ ```
89
+
90
+ - `predicate.name` -- the predicate to apply (see Predicates Reference below).
91
+ - `predicate.negated` -- set to `true` to invert the predicate (e.g., `null` with `negated: true` means "is NOT null").
92
+ - `value` -- an array with a single value descriptor (usually a FIELD reference).
93
+
94
+ #### BINARY (comparison checks)
95
+
96
+ Used for predicates that compare a field against a value.
97
+
98
+ ```json
99
+ {
100
+ "type": "BINARY",
101
+ "predicate": {"name": "greater_than", "negated": false},
102
+ "left": [{"type": "FIELD", "field": "column_name"}],
103
+ "right": [{"type": "LITERAL", "literal": "0"}]
104
+ }
105
+ ```
106
+
107
+ - `left` -- the left-hand side of the comparison (typically a FIELD reference).
108
+ - `right` -- the right-hand side (typically a LITERAL value, SQL expression, or FIELD reference).
109
+ - Both `left` and `right` are arrays of value descriptors.
110
+
111
+ #### SQL (custom SQL expression)
112
+
113
+ Used for complex conditions that are difficult to express with UNARY/BINARY nodes. The SQL expression should evaluate to true for INVALID rows.
114
+
115
+ ```json
116
+ {
117
+ "type": "SQL",
118
+ "sql": "amount <= 0 OR amount >= 1000000"
119
+ }
120
+ ```
121
+
122
+ #### GROUP (nested conditions)
123
+
124
+ Used to combine multiple conditions with AND or OR logic. Groups can be nested.
125
+
126
+ ```json
127
+ {
128
+ "type": "GROUP",
129
+ "operator": "OR",
130
+ "conditions": [
131
+ {"type": "UNARY", "...": "..."},
132
+ {"type": "BINARY", "...": "..."}
133
+ ]
134
+ }
135
+ ```
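As an illustration of how the four node shapes fit together, the sketch below builds them with small Python helpers. The helper names (`field`, `literal`, `unary`, `binary`, `group`) are hypothetical conveniences, not part of any Monte Carlo SDK; only the dictionary layout follows the structures shown above.

```python
import json


def field(name):
    """FIELD value descriptor referencing a column."""
    return {"type": "FIELD", "field": name}


def literal(value):
    """LITERAL value descriptor; the value is stored as a string."""
    return {"type": "LITERAL", "literal": str(value)}


def unary(predicate, column, negated=False):
    """UNARY node: a predicate applied to a single field."""
    return {
        "type": "UNARY",
        "predicate": {"name": predicate, "negated": negated},
        "value": [field(column)],
    }


def binary(predicate, column, values, negated=False):
    """BINARY node: a field compared against one or more literal values."""
    return {
        "type": "BINARY",
        "predicate": {"name": predicate, "negated": negated},
        "left": [field(column)],
        "right": [literal(v) for v in values],
    }


def group(operator, *conditions):
    """GROUP node combining child conditions with AND or OR."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    return {"type": "GROUP", "operator": operator, "conditions": list(conditions)}


# Top level is always a GROUP: alert when amount is null OR discount_pct exceeds 100.
alert_condition = group(
    "OR",
    unary("null", "amount"),
    binary("greater_than", "discount_pct", [100]),
)
print(json.dumps(alert_condition, indent=2))
```

Building nodes through helpers like these keeps the `value`, `left`, and `right` arrays consistently shaped, which is easy to get wrong when writing the JSON by hand.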
136
+
137
+ ---
138
+
139
+ ## Value Types
140
+
141
+ Value descriptors appear in the `value`, `left`, and `right` arrays of UNARY and BINARY conditions.
142
+
143
+ | Type | Field | Description | Example |
144
+ |------|-------|-------------|---------|
145
+ | `FIELD` | `"field": "column_name"` | References a column in the table. | `{"type": "FIELD", "field": "user_id"}` |
146
+ | `LITERAL` | `"literal": "value"` | A static value (always a string, even for numbers). | `{"type": "LITERAL", "literal": "100"}` |
147
+ | `SQL` | `"sql": "SELECT ..."` | A SQL expression or subquery. | `{"type": "SQL", "sql": "SELECT MAX(id) FROM ref_table"}` |
148
+
149
+ ---
150
+
151
+ ## Predicates Reference
152
+
153
+ Before building conditions, call `getValidationPredicates` to get the full list of supported predicates for the connected warehouse. The list below covers common predicates but may not be exhaustive.
154
+
155
+ ### Unary Predicates
156
+
157
+ These predicates take no comparison value -- they check a property of the field itself.
158
+
159
+ | Predicate | Description | Example use |
160
+ |-----------|-------------|-------------|
161
+ | `null` | Field value is null. | Alert on null ids. |
162
+ | `is_negative` | Field value is negative. | Alert on negative amounts. |
163
+ | `is_between_0_and_1` | Field value is between 0 and 1 (inclusive). | Alert on rates that should be percentages (0-100). |
164
+ | `is_future_date` | Field value is a date/timestamp in the future. | Alert on future-dated records. |
165
+ | `is_uuid` | Field value matches UUID format. | Alert on non-UUID values in a UUID field (use with `negated: true`). |
166
+
167
+ ### Binary Predicates
168
+
169
+ These predicates compare a field against a value.
170
+
171
+ | Predicate | Right-hand side | Description | Example use |
172
+ |-----------|----------------|-------------|-------------|
173
+ | `equal` | Single LITERAL | Field equals the given value. | Alert when `status` equals `'deleted'`. |
174
+ | `greater_than` | Single LITERAL | Field is greater than the given value. | Alert when `discount_pct` exceeds 100. |
175
+ | `less_than` | Single LITERAL | Field is less than the given value. | Alert when `quantity` is below 0. |
176
+ | `in_set` | Multiple LITERALs | Field value is in the given set. | Alert when `status` is in an invalid set (see example below). |
177
+ | `contains` | Single LITERAL | Field value contains the given substring. | Alert when `email` contains `'test@'`. |
178
+ | `starts_with` | Single LITERAL | Field value starts with the given prefix. | Alert when `phone` starts with `'000'`. |
179
+ | `between` | Two LITERALs | Field value is between the two given values (inclusive). | Alert when `score` is between 0 and 10 (if that range is invalid). |
180
+
181
+ ### Using `negated` to Invert Predicates
182
+
183
+ Any predicate can be inverted by setting `"negated": true` in the predicate object. This is essential for "must be in set" validations:
184
+
185
+ - **"status must be in [active, pending]"** becomes `in_set` with values `["active", "pending"]` and `negated: true` -- meaning "alert when status is NOT in [active, pending]".
186
+ - **"id must not be null"** becomes `null` with `negated: false` -- meaning "alert when id IS null" (no inversion needed since the condition already matches invalid data).
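The two rules above can be sketched as functions that translate a user-facing rule into the condition matching invalid rows. The function names are hypothetical and chosen for illustration; the emitted dictionaries follow the UNARY and BINARY layouts described earlier.

```python
def must_be_in_set(column, allowed):
    """Rule 'column must be one of allowed': alert when the value is NOT in the set."""
    return {
        "type": "BINARY",
        # negated=True flips in_set so the condition matches the invalid rows.
        "predicate": {"name": "in_set", "negated": True},
        "left": [{"type": "FIELD", "field": column}],
        "right": [{"type": "LITERAL", "literal": str(v)} for v in allowed],
    }


def must_not_be_null(column):
    """Rule 'column must not be null': alert when the value IS null."""
    return {
        "type": "UNARY",
        # No negation needed: the null predicate already matches the invalid rows.
        "predicate": {"name": "null", "negated": False},
        "value": [{"type": "FIELD", "field": column}],
    }


status_rule = must_be_in_set("status", ["active", "pending"])
id_rule = must_not_be_null("id")
print(status_rule["predicate"], id_rule["predicate"])
```

Naming the helpers after the user's rule, and hiding the `negated` flag inside them, is one way to avoid writing the logic backwards.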
187
+
188
+ ---
189
+
190
+ ## Examples
191
+
192
+ ### Alert when id is null
193
+
194
+ Verify that `id` exists in the table schema from `getTable` before proceeding.
195
+
196
+ ```json
197
+ {
198
+ "name": "orders_id_not_null",
199
+ "description": "Alert when order id is null",
200
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
201
+ "alert_condition": {
202
+ "type": "GROUP",
203
+ "operator": "AND",
204
+ "conditions": [
205
+ {
206
+ "type": "UNARY",
207
+ "predicate": {"name": "null", "negated": false},
208
+ "value": [{"type": "FIELD", "field": "id"}]
209
+ }
210
+ ]
211
+ }
212
+ }
213
+ ```
214
+
215
+ The condition matches rows where `id` IS NULL -- these are the invalid rows we want to be alerted about.
216
+
217
+ ### Alert when status is not in allowed set
218
+
219
+ Verify that `status` exists in the table schema from `getTable` before proceeding.
220
+
221
+ ```json
222
+ {
223
+ "name": "orders_status_allowed_values",
224
+ "description": "Alert when order status is outside the allowed set",
225
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
226
+ "alert_condition": {
227
+ "type": "GROUP",
228
+ "operator": "AND",
229
+ "conditions": [
230
+ {
231
+ "type": "BINARY",
232
+ "predicate": {"name": "in_set", "negated": true},
233
+ "left": [{"type": "FIELD", "field": "status"}],
234
+ "right": [
235
+ {"type": "LITERAL", "literal": "active"},
236
+ {"type": "LITERAL", "literal": "pending"},
237
+ {"type": "LITERAL", "literal": "inactive"}
238
+ ]
239
+ }
240
+ ]
241
+ }
242
+ }
243
+ ```
244
+
245
+ Note `negated: true` -- the predicate is `in_set`, but we want to alert when the value is NOT in the set. This catches any unexpected status values.
246
+
247
+ ### Alert when amount is negative
248
+
249
+ Verify that `amount` exists in the table schema from `getTable` before proceeding.
250
+
251
+ ```json
252
+ {
253
+ "name": "orders_positive_amount",
254
+ "description": "Alert when order amount is negative",
255
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
256
+ "alert_condition": {
257
+ "type": "GROUP",
258
+ "operator": "AND",
259
+ "conditions": [
260
+ {
261
+ "type": "UNARY",
262
+ "predicate": {"name": "is_negative", "negated": false},
263
+ "value": [{"type": "FIELD", "field": "amount"}]
264
+ }
265
+ ]
266
+ }
267
+ }
268
+ ```
269
+
270
+ The condition matches rows where `amount` is negative -- these are the invalid rows.
271
+
272
+ ### Combined conditions: null OR negative
273
+
274
+ Verify that both `amount` and `quantity` exist in the table schema from `getTable` before proceeding.
275
+
276
+ ```json
277
+ {
278
+ "name": "orders_amount_quality",
279
+ "description": "Alert when amount is null or quantity is negative",
280
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
281
+ "alert_condition": {
282
+ "type": "GROUP",
283
+ "operator": "OR",
284
+ "conditions": [
285
+ {
286
+ "type": "UNARY",
287
+ "predicate": {"name": "null", "negated": false},
288
+ "value": [{"type": "FIELD", "field": "amount"}]
289
+ },
290
+ {
291
+ "type": "UNARY",
292
+ "predicate": {"name": "is_negative", "negated": false},
293
+ "value": [{"type": "FIELD", "field": "quantity"}]
294
+ }
295
+ ]
296
+ }
297
+ }
298
+ ```
299
+
300
+ The OR operator means an alert fires if either condition matches -- the row has a null amount OR a negative quantity.
301
+
302
+ ### Between check with nested AND/OR
303
+
304
+ Verify that `score` and `status` exist in the table schema from `getTable` before proceeding.
305
+
306
+ ```json
307
+ {
308
+ "name": "records_score_validation",
309
+ "description": "Alert when score is outside 0-100 range for active records",
310
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++warehouse:metrics.records",
311
+ "alert_condition": {
312
+ "type": "GROUP",
313
+ "operator": "AND",
314
+ "conditions": [
315
+ {
316
+ "type": "BINARY",
317
+ "predicate": {"name": "equal", "negated": false},
318
+ "left": [{"type": "FIELD", "field": "status"}],
319
+ "right": [{"type": "LITERAL", "literal": "active"}]
320
+ },
321
+ {
322
+ "type": "BINARY",
323
+ "predicate": {"name": "between", "negated": true},
324
+ "left": [{"type": "FIELD", "field": "score"}],
325
+ "right": [
326
+ {"type": "LITERAL", "literal": "0"},
327
+ {"type": "LITERAL", "literal": "100"}
328
+ ]
329
+ }
330
+ ]
331
+ }
332
+ }
333
+ ```
334
+
335
+ This uses `between` with `negated: true` to alert when score is outside the 0-100 range, but only for active records (the AND operator requires both conditions to match).
336
+
337
+ ### Referential integrity with SQL subquery
338
+
339
+ Verify that `customer_id` exists in the table schema from `getTable` before proceeding.
340
+
341
+ ```json
342
+ {
343
+ "name": "orders_valid_customer",
344
+ "description": "Alert when customer_id does not exist in customers table",
345
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++analytics:core.orders",
346
+ "alert_condition": {
347
+ "type": "GROUP",
348
+ "operator": "AND",
349
+ "conditions": [
350
+ {
351
+ "type": "SQL",
352
+ "sql": "customer_id IS NOT NULL AND customer_id NOT IN (SELECT id FROM analytics.core.customers)"
353
+ }
354
+ ]
355
+ }
356
+ }
357
+ ```
358
+
359
+ The SQL condition type is useful for referential integrity checks that require subqueries. The `customer_id IS NOT NULL` guard avoids alerting on null values (which should be caught by a separate null check if needed).
360
+
361
+ ### Contains and starts_with checks
362
+
363
+ Verify that `email` and `phone` exist in the table schema from `getTable` before proceeding.
364
+
365
+ ```json
366
+ {
367
+ "name": "contacts_format_validation",
368
+ "description": "Alert when email contains test data or phone has invalid prefix",
369
+ "table": "MCON++a1b2c3d4-e5f6-7890-abcd-ef1234567890++1++1++warehouse:crm.contacts",
370
+ "alert_condition": {
371
+ "type": "GROUP",
372
+ "operator": "OR",
373
+ "conditions": [
374
+ {
375
+ "type": "BINARY",
376
+ "predicate": {"name": "contains", "negated": false},
377
+ "left": [{"type": "FIELD", "field": "email"}],
378
+ "right": [{"type": "LITERAL", "literal": "@test.example.com"}]
379
+ },
380
+ {
381
+ "type": "BINARY",
382
+ "predicate": {"name": "starts_with", "negated": false},
383
+ "left": [{"type": "FIELD", "field": "phone"}],
384
+ "right": [{"type": "LITERAL", "literal": "000"}]
385
+ }
386
+ ]
387
+ }
388
+ }
389
+ ```
390
+
391
+ ---
392
+
393
+ ## Fallback: Custom SQL Monitor
394
+
395
+ If `createValidationMonitorMac` fails -- for example because a referenced column does not exist yet in the live table, or the warehouse does not support a particular predicate -- fall back to `createCustomSqlMonitorMac` with an explicit SQL query instead.
396
+
397
+ A custom SQL monitor lets you express any validation logic as a SQL query that returns rows or a count. This is always available as a backup when the structured validation condition tree cannot express what you need or encounters an API error.
398
+
399
+ When falling back:
400
+
401
+ 1. Translate the intended validation logic into a SQL query.
402
+ 2. The SQL should select rows that violate the rule (matching the same "conditions match INVALID data" principle).
403
+ 3. Use `createCustomSqlMonitorMac` with the translated query.
404
+ 4. Inform the user that you used a custom SQL monitor as a fallback and explain why.
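Steps 1 and 2 of the fallback can be sketched as follows. This is a minimal illustration, not the tool's interface: `violation_query` is a hypothetical helper, the table name is an example, and the parameters that `createCustomSqlMonitorMac` actually accepts are not shown here. The helper interpolates strings directly, so it is only suitable for trusted table names and predicates.

```python
def violation_query(table, invalid_predicate):
    """Build a query that selects the rows violating a rule.

    `invalid_predicate` must describe INVALID rows, mirroring the
    "conditions match INVALID data" principle.
    """
    return f"SELECT * FROM {table} WHERE {invalid_predicate}"


# Rule: status must be one of active, pending, inactive.
sql = violation_query(
    "analytics.core.orders",
    "status NOT IN ('active', 'pending', 'inactive')",
)
print(sql)
```

The resulting query returns only violating rows, so an empty result means the rule holds.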
@@ -0,0 +1,252 @@
1
+ ---
2
+ name: monte-carlo-prevent
3
+ description: "Surfaces Monte Carlo data observability context (table health, alerts, lineage, blast radius) before SQL/dbt edits."
4
+ category: data
5
+ risk: safe
6
+ source: community
7
+ source_repo: monte-carlo-data/mc-agent-toolkit
8
+ source_type: community
9
+ date_added: "2026-04-08"
10
+ author: monte-carlo-data
11
+ tags: [data-observability, dbt, schema, monte-carlo, lineage]
12
+ tools: [claude, cursor, codex]
13
+ ---
14
+
15
+ # Monte Carlo Prevent Skill
16
+
17
+ This skill brings Monte Carlo's data observability context directly into your editor. When you're modifying a dbt model or SQL pipeline, use it to surface table health, lineage, active alerts, and to generate monitors-as-code without leaving Claude Code.
18
+
19
+ Reference files live next to this skill file. **Use the Read tool** (not MCP resources) to access them:
20
+
21
+ - Full workflow step-by-step instructions: `references/workflows.md` (relative to this file)
22
+ - MCP parameter details: `references/parameters.md` (relative to this file)
23
+ - Troubleshooting: `references/TROUBLESHOOTING.md` (relative to this file)
24
+
25
+ ## When to activate this skill
26
+
27
+ **Do not wait to be asked.** Run the appropriate workflow automatically whenever the user:
28
+
29
+ - References or opens a `.sql` file or dbt model (files in `models/`) → run Workflow 1
30
+ - Mentions a table name, dataset, or dbt model name in passing → run Workflow 1
31
+
32
+ - Describes a planned change to a model (new column, join update, filter change, refactor) → **STOP — run Workflow 4 before writing any code**
33
+
34
+ - Adds a new column, metric, or output expression to an existing
35
+ model → run Workflow 4 first, then ALWAYS offer Workflow 2
36
+ regardless of risk tier — do not skip the monitor offer
37
+ - Asks about data quality, freshness, row counts, or anomalies → run Workflow 1
38
+ - Wants to triage or respond to a data quality alert → run Workflow 3
39
+
40
+ Present the results as context the engineer needs before proceeding — not as a response to a question.
41
+
42
+ ## When NOT to activate this skill
43
+
44
+ Do not invoke Monte Carlo tools for:
45
+
46
+ - Seed files (files in seeds/ directory)
47
+ - Analysis files (files in analyses/ directory)
48
+ - One-off or ad-hoc SQL scripts not part of a dbt project
49
+ - Configuration files (dbt_project.yml, profiles.yml, packages.yml)
50
+ - Test files unless the user is specifically asking about data quality
51
+
52
+ If uncertain whether a file is a dbt model, check for {{ ref() }} or {{ source() }}
53
+ Jinja references — if absent, do not activate.
54
+
55
+ ### Macros and snapshots — gate edits, skip auto-context
56
+
57
+ Macro files (`macros/`) and snapshot files (`snapshots/`) are **not** models, so
58
+ do not auto-fetch Monte Carlo context (Workflow 1) when they are opened. However,
59
+ macros are inlined into every model that calls them at compile time — a one-line
60
+ macro change can silently alter dozens of models. Snapshots control historical
61
+ tracking and are similarly sensitive.
62
+
63
+ **The pre-edit hook gates these files.** If the hook fires for a macro or snapshot,
64
+ identify which models are affected and run the change impact assessment (Workflow 4)
65
+ for those models before proceeding with the edit.
66
+
67
+ ---
68
+
69
+ ## REQUIRED: Change impact assessment before any SQL edit
70
+
71
+ **Before editing or writing any SQL for a dbt model or pipeline, you MUST run Workflow 4.**
72
+
73
+ This applies whenever the user expresses intent to modify a model — including phrases like:
74
+
75
+ - "I want to add a column…"
76
+ - "Let me add / I'm adding…"
77
+ - "I'd like to change / update / rename…"
78
+ - "Can you add / modify / refactor…"
79
+ - "Let's add…" / "Add a `<column>` column"
80
+ - Any other description of a planned schema or logic change
81
+ - "Exclude / filter out / remove [records/customers/rows]…"
82
+ - "Adjust / increase / decrease [threshold/parameter/value]…"
83
+ - "Fix / bugfix / patch [issue/bug]…"
84
+ - "Revert / restore / undo [change/previous behavior]…"
85
+ - "Disable / enable [feature/logic/flag]…"
86
+ - "Clean up / remove [references/columns/code]…"
87
+ - "Implement [backend/feature] for…"
88
+ - "Create [models/dbt models] for…" (when modifying existing referenced tables)
89
+ - "Increase / decrease / change [max_tokens/threshold/date constant/numeric parameter]…"
90
+ - Any change to a hardcoded value, constant, or configuration parameter within SQL
91
+ - "Drop / remove / delete [column/field/table]"
92
+ - "Rename [column/field] to [new name]"
93
+ - "Add [column]" (short imperative form, e.g. "add a created_at column")
94
+ - Any single-verb imperative command targeting a column, table, or model
95
+ (e.g. "drop X", "rename Y", "add Z", "remove W")
96
+
97
+ Parameter changes (threshold values, date constants, numeric limits) appear
98
+ safe but silently change model output. Treat them the same as logic changes
99
+ for impact assessment purposes.
100
+
101
+ **Do not write or edit any SQL until the change impact assessment (Workflow 4) has been presented to the user.** The assessment must come first — not after the edit, not in parallel.
102
+
103
+ ---
104
+
105
+ ## Pre-edit gate — check before modifying any file
106
+
107
+ **Before calling Edit, Write, or MultiEdit on any `.sql` or dbt model
108
+ file, you MUST check:**
109
+
110
+ 1. Has the synthesis step been run for THIS SPECIFIC CHANGE in the
111
+ current prompt?
112
+ 2. **If YES** → proceed with the edit
113
+ 3. **If NO** → stop immediately, run Workflow 4, present the full
114
+ report with synthesis connected to this specific change.
115
+ **If risk is High or Medium:** ask "Do you want me to proceed
116
+ with the edit?" and wait for explicit confirmation.
117
+ **If risk is Low:** use judgment — proceed if straightforward
118
+ and no concerns found, otherwise ask before editing.
119
+
120
+ **Important: "Workflow 4 already ran this session" is NOT sufficient
121
+ to proceed.** Each distinct change prompt requires its own synthesis
122
+ step connecting the MC findings to that specific change.
123
+
124
+ The synthesis must reference the specific columns, filters, or logic
125
+ being changed in the current prompt — not just general table health.
126
+
127
+ Example:
128
+
129
+ - ✅ "Given 34 downstream models depend on is_paying_workspace,
130
+ adding 'MC Internal' to the exclusion list will exclude these
131
+ workspaces from all downstream health scores and exports.
132
+ Confirm?"
133
+ - ❌ "Workflow 4 already ran. Making the edit now."
134
+
135
+ The only exception: if the user explicitly acknowledges the risk
136
+ and confirms they want to skip (e.g. "I know the risks, just make
137
+ the change") — proceed but note the skipped assessment.
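The gate above can be read as a small decision function. The following is a minimal sketch of that reading, not the plugin's actual hook logic; `GateState`, `Risk`, `Action`, and `pre_edit_gate` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


class Action(Enum):
    RUN_WORKFLOW_4 = "stop, run Workflow 4, present the full report"
    ASK_CONFIRMATION = "ask: Do you want me to proceed with the edit?"
    PROCEED = "proceed with the edit"
    PROCEED_AND_NOTE_SKIP = "proceed, noting the skipped assessment"


@dataclass
class GateState:
    # True only if synthesis ran for THIS change in the current prompt.
    synthesis_for_this_change: bool
    # Risk tier from the report; defaults to the most conservative tier.
    risk: Risk = Risk.HIGH
    user_confirmed: bool = False
    # The user explicitly acknowledged the risk and asked to skip.
    user_waived_assessment: bool = False
    # Low risk only: something surprising or worth flagging was found.
    low_risk_concerns: bool = False


def pre_edit_gate(state: GateState) -> Action:
    """Decide what to do before an Edit, Write, or MultiEdit call."""
    if state.user_waived_assessment:
        return Action.PROCEED_AND_NOTE_SKIP
    if not state.synthesis_for_this_change:
        # An earlier Workflow 4 run in the session does not count.
        return Action.RUN_WORKFLOW_4
    if state.risk in (Risk.HIGH, Risk.MEDIUM):
        return Action.PROCEED if state.user_confirmed else Action.ASK_CONFIRMATION
    if state.low_risk_concerns and not state.user_confirmed:
        return Action.ASK_CONFIRMATION
    return Action.PROCEED
```

Note that "use judgment" for Low risk is reduced here to a single boolean, which is a simplification of the rule rather than a faithful encoding of it.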
138
+
139
+ ## Available MCP tools
140
+
141
+ All tools are available via the `monte-carlo` MCP server.
142
+
143
+ | Tool | Purpose |
144
+ | ---------------------------- | -------------------------------------------------------------------- |
145
+ | `testConnection` | Verify auth and connectivity |
146
+ | `search` | Find tables/assets by name |
147
+ | `getTable` | Schema, stats, metadata for a table |
148
+ | `getAssetLineage` | Upstream/downstream dependencies (call with mcons array + direction) |
149
+ | `getAlerts` | Active incidents and alerts |
150
+ | `getMonitors` | Monitor configs — filter by table using mcons array |
151
+ | `getQueriesForTable` | Recent query history |
152
+ | `getQueryData` | Full SQL for a specific query |
153
+ | `createValidationMonitorMac` | Generate validation monitors-as-code YAML |
154
+ | `createMetricMonitorMac` | Generate metric monitors-as-code YAML |
155
+ | `createComparisonMonitorMac` | Generate comparison monitors-as-code YAML |
156
+ | `createCustomSqlMonitorMac` | Generate custom SQL monitors-as-code YAML |
157
+ | `getValidationPredicates` | List available validation rule types |
158
+ | `updateAlert` | Update alert status/severity |
159
+ | `setAlertOwner` | Assign alert ownership |
160
+ | `createOrUpdateAlertComment` | Add comments to alerts |
161
+ | `getAudiences` | List notification audiences |
162
+ | `getDomains` | List MC domains |
163
+ | `getUser` | Current user info |
164
+ | `getCurrentTime` | ISO timestamp for API calls |
165
+
166
+ ## Core workflows
167
+
168
+ Each workflow has detailed step-by-step instructions in `references/workflows.md` (open it with the Read tool).
169
+
170
+ ### 1. Table health check
171
+
172
+ **When:** User opens a dbt model or mentions a table.
173
+ **What:** Surfaces health, lineage, alerts, and risk signals. Auto-escalates to Workflow 4 if change intent is detected and risk signals are present.
174
+
175
+ ### 2. Add a monitor
176
+
177
+ **When:** New column, filter, or business rule is added to a model.
178
+ **What:** Suggests and generates monitors-as-code YAML using the appropriate `create*MonitorMac` tool. Saves to `monitors/<table_name>.yml`.
179
+
180
+ ### 3. Alert triage
181
+
182
+ **When:** User is investigating an active data quality incident.
183
+ **What:** Lists open alerts, checks table state, traces lineage for root cause, reviews recent queries.
184
+
185
+ ### 4. Change impact assessment — REQUIRED before modifying a model
186
+
187
+ **When:** Any intent to modify a dbt model's logic, columns, joins, or filters.
188
+ **What:** Surfaces blast radius, downstream dependencies, active incidents, monitor coverage, and query exposure. Produces a risk-tiered report with synthesis connecting findings to specific code recommendations. See `references/workflows.md` for the full assessment sequence, report format, and synthesis rules.
189
+
190
+ ### 5. Change validation queries
191
+
192
+ **When:** Explicit engineer request only (e.g. "validate this change", "ready to commit").
193
+ **What:** Generates 3–5 targeted SQL queries to verify the change behaved as intended. Uses Workflow 4 context, so both the impact assessment and the file edit must already have happened in the current session.
194
+
195
+ ---
196
+
197
+ ## Post-synthesis confirmation rules
198
+
199
+ Always end the synthesis with one clear, specific recommendation in plain English:
200
+ "Given the above, I recommend: [specific action]"
201
+
202
+ **If the risk is High or Medium:** STOP and wait for confirmation before editing
203
+ any file. You must ask the engineer and receive an explicit "yes", "go ahead",
204
+ "proceed", or similar confirmation before making code changes.
205
+ Say: "Do you want me to proceed with the edit?"
206
+ Do NOT say: "Proceeding with the edit." — that skips the engineer's decision.
207
+
208
+ **If the risk is Low:** Use your judgment based on the synthesis findings. If
209
+ the change is straightforward and the synthesis found no concerns, you may
210
+ proceed. If anything is surprising or worth flagging, ask before editing.
211
+
212
+ ---
213
+
214
+ ## Session markers
215
+
216
+ These markers coordinate between the skill and the plugin's hooks. Output each
217
+ on its own line when the condition is met.
218
+
219
+ ### Impact check complete
220
+
221
+ After the engineer confirms (High/Medium) or after presenting the synthesis (Low),
222
+ output one marker per assessed table. **IMPORTANT: use only the table/model name, not the full MCON:**
223
+
224
+ <!-- MC_IMPACT_CHECK_COMPLETE: <table_name> -->
225
+
226
+ (Use the model filename without the `.sql` extension, NOT a fully qualified name such as "acme.analytics.orders" or "prod.public.client_hub")
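As an illustration of the naming rule, here is a minimal sketch that derives the marker text from a model file path. The function names and the example path are hypothetical; only the marker format itself comes from the text above.

```python
from pathlib import PurePosixPath


def table_name_from_model(model_path: str) -> str:
    """Return the model filename without its .sql extension."""
    path = PurePosixPath(model_path)
    if path.suffix != ".sql":
        # Rejects fully qualified names such as "acme.analytics.orders".
        raise ValueError(f"expected a .sql model file, got {model_path!r}")
    return path.stem


def impact_check_marker(model_path: str) -> str:
    """Format the impact-check marker for one assessed model file."""
    return f"<!-- MC_IMPACT_CHECK_COMPLETE: {table_name_from_model(model_path)} -->"
```

For a hypothetical path `models/marts/orders.sql`, this yields a marker that names `orders` only.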
227
+
228
+ How many markers to emit depends on how the assessment was triggered:
229
+
230
+ **Hook-triggered** (the pre-edit hook blocked an edit and instructed you to run
231
+ the assessment): Be strict — only emit markers for tables whose lineage **and**
232
+ monitor coverage were fetched directly via Monte Carlo tools in this session. If
233
+ the engineer describes changes to multiple tables but only one was formally
234
+ assessed, emit only one marker. The pre-edit hook will gate the other tables and
235
+ prompt for their own Workflow 4 runs.
236
+
237
+ **Voluntarily invoked** (the engineer proactively asked for an impact assessment):
238
+ Be looser — emit markers for all tables the assessment meaningfully covered, even
239
+ if some were assessed via lineage context rather than direct MC tool calls. The
240
+ engineer is already safety-conscious; don't force redundant assessments for tables
241
+ they clearly considered.
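The difference between the two modes can be sketched as a filter over the tables discussed in the session. This assumes each table carries three flags describing how it was assessed; `AssessedTable` and `tables_to_mark` are hypothetical names, not part of the plugin.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AssessedTable:
    name: str
    lineage_fetched: bool      # lineage fetched directly via MC tools
    monitors_fetched: bool     # monitor coverage fetched directly via MC tools
    covered_by_context: bool = False  # meaningfully covered via lineage context


def tables_to_mark(tables: List[AssessedTable], hook_triggered: bool) -> List[str]:
    """Return the table names that should receive an impact-check marker."""
    marked = []
    for table in tables:
        directly_assessed = table.lineage_fetched and table.monitors_fetched
        # Strict mode accepts only direct assessments; loose mode also
        # accepts tables the assessment covered through lineage context.
        if directly_assessed or (not hook_triggered and table.covered_by_context):
            marked.append(table.name)
    return marked
```

In strict mode, a table with lineage but no monitor lookup is left unmarked, so the pre-edit hook will still gate it.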
242
+
243
+ ### Monitor coverage gap
244
+
245
+ When Workflow 4 finds zero custom monitors on a table's affected columns, output:
246
+
247
+ <!-- MC_MONITOR_GAP: <table_name> -->
248
+
249
+ Use only the table/model name (NOT the full MCON). This allows the plugin's hooks
250
+ to remind the engineer about monitor coverage at commit time. Only output this
251
+ marker when the gap is specifically about the columns or logic being changed —
252
+ not for general table-level monitor absence.
@@ -0,0 +1,23 @@
1
+ ## Troubleshooting
2
+
3
+ ### MCP connection fails:
4
+ ```bash
5
+ # Verify the server is reachable
6
+ curl -s -o /dev/null -w "%{http_code}" https://integrations.getmontecarlo.com/mcp/
7
+ ```
8
+
9
+ **If using the plugin (OAuth):** Run `/mcp` in Claude Code, select the `monte-carlo` server, and re-authenticate. If the browser flow doesn't complete, copy the callback URL from your browser's address bar into the URL prompt that appears in Claude Code.
10
+
11
+ **Legacy (header-based auth, for MCP clients without HTTP transport):** Check that `x-mcd-id` and `x-mcd-token` are set correctly in your MCP config. The key has the format `<KEY_ID>:<KEY_SECRET>`; its two parts go in separate headers, the key ID in `x-mcd-id` and the key secret in `x-mcd-token`.
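A quick way to check the header split is to derive both headers from the combined key. This is a minimal sketch that assumes the key ID belongs in `x-mcd-id` and the secret in `x-mcd-token`; `mcd_headers` is a hypothetical helper, and the key used in any call is a placeholder.

```python
from typing import Dict


def mcd_headers(api_key: str) -> Dict[str, str]:
    """Split a '<KEY_ID>:<KEY_SECRET>' string into the two auth headers."""
    # Split on the first colon only, so a secret containing ':' stays intact.
    key_id, separator, key_secret = api_key.partition(":")
    if not separator or not key_id or not key_secret:
        raise ValueError("expected a key of the form <KEY_ID>:<KEY_SECRET>")
    # Assumption: ID goes in x-mcd-id, secret in x-mcd-token.
    return {"x-mcd-id": key_id, "x-mcd-token": key_secret}
```

If either header in your MCP config holds the whole `<KEY_ID>:<KEY_SECRET>` string, that is a likely cause of the authentication failure.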
12
+
13
+
14
+ ### Monitor creation errors:
15
+
16
+ **`montecarlo monitors apply` fails with "Unknown field":**
17
+ Monitor definition files must have `montecarlo:` as the root key. Do not copy the `validation:` or `custom_sql:` output from the MCP tools directly; reformat it using the `montecarlo: > custom_sql:` structure shown in Workflow 2.
18
+
19
+ **`montecarlo monitors apply` fails with "Not a Monte Carlo project":**
20
+ Ensure `montecarlo.yml` (the project config) exists in the working directory. This file must contain only `version`, `namespace`, and `default_resource` — not monitor definitions.
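The "only these keys" rule can be checked mechanically once the file has been parsed into a dictionary. A minimal sketch, assuming the YAML is already loaded by whatever parser you use; `unexpected_project_keys` and the sample values are hypothetical.

```python
from typing import Any, Dict, List

# The three keys the project config is allowed to contain.
ALLOWED_PROJECT_KEYS = {"version", "namespace", "default_resource"}


def unexpected_project_keys(config: Dict[str, Any]) -> List[str]:
    """Return top-level keys that do not belong in montecarlo.yml."""
    return sorted(set(config) - ALLOWED_PROJECT_KEYS)


# Example: monitor definitions mistakenly placed in the project config.
bad_config = {"version": 1, "default_resource": "warehouse", "montecarlo": {}}
print(unexpected_project_keys(bad_config))
```

A non-empty result suggests that monitor definitions were placed in the project config and should move to a separate monitor file.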
21
+
22
+ **`createValidationMonitorMac` fails with a Snowflake error:**
23
+ This tool validates the condition SQL against the live table. If the column doesn't exist yet (e.g. you're writing the monitor before deploying the model change), fall back to `createCustomSqlMonitorMac` with an explicit SQL query instead.