coalesce-transform-mcp 0.5.0 → 0.5.1-alpha.1

Files changed (2)
  1. package/README.md +452 -498
  2. package/package.json +1 -1
package/README.md CHANGED
@@ -6,10 +6,7 @@
6
6
  [![Install in Cursor](https://img.shields.io/badge/Cursor-Install_MCP-000?style=flat&logo=cursor)](https://cursor.com/install-mcp?name=coalesce-transform&config=eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyJjb2FsZXNjZS10cmFuc2Zvcm0tbWNwIl19)
7
7
  [![License: MIT](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
8
8
 
9
- MCP server for [Coalesce](https://coalesce.io/). Built for **Snowflake [Cortex Code](https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code-cli) (CoCo)** - with first-class support for every other MCP client (Claude Code, Claude Desktop, Cursor, VS Code, Windsurf). Manage nodes, pipelines, environments, jobs, and runs, and drive the local-first [`coa`](https://www.npmjs.com/package/@coalescesoftware/coa) CLI from the same server: validate a project, preview DDL/DML, plan a deployment, and apply it to a cloud environment.
10
-
11
- - **Cloud REST tools** - build pipelines declaratively, edit node YAML, review lineage, run deployed jobs, audit documentation.
12
- - **Local COA CLI tools** - validate projects before check-in, preview generated DDL/DML (`--dry-run`), run `plan → deploy → refresh` cycles. COA is bundled - no separate install.
9
+ MCP server for [Coalesce](https://coalesce.io/). Built for **Snowflake [Cortex Code](https://docs.snowflake.com/en/user-guide/cortex-code/cortex-code-cli) (CoCo)** - with first-class support for every other MCP client (Claude Code, Claude Desktop, Cursor, VS Code, Windsurf). Manage nodes, pipelines, environments, jobs, and runs, and drive the local-first Coalesce CLI from the same server: validate a project, preview DDL/DML, plan a deployment, and apply it to a cloud environment.
13
10
 
14
11
  ---
15
12
 
@@ -17,34 +14,22 @@ MCP server for [Coalesce](https://coalesce.io/). Built for **Snowflake [Cortex C
17
14
 
18
15
  | | Task | Jump to |
19
16
  | :-: | ---- | ------- |
20
- | 📦 | Install for my AI client | [Installation](#installation) |
21
- | 🚀 | Get running in 2 minutes | [Quick start](#quick-start) |
17
+ | 🚀 | Get running in 2 minutes | [Quick Start](#quick-start) |
22
18
  | 🎛️ | Customize agent behavior | [Skills](#skills) |
23
19
  | 🔍 | Find a specific tool | [Tools](#tools) |
20
+ | 📦 | Walk through the full setup | [Full Installation](#full-installation) |
24
21
  | 🔑 | Authenticate (env var or `~/.coa/config`) | [Credentials](#credentials) |
25
22
  | 🌐 | Run against multiple Coalesce environments | [Multiple environments](#multiple-environments) |
26
23
  | 🔒 | Lock prod down to read-only | [Safety model](docs/safety-model.md) |
27
- | 🧰 | Use the `coa` CLI tools | [Using the COA CLI tools](#using-the-coa-cli-tools) |
28
- | 🧪 | Try a prerelease build | [Prerelease channel](docs/prerelease.md) |
29
- | 🩺 | Debug "why isn't auth working?" | [Diagnosing setup](docs/diagnosing-setup.md) |
30
24
 
31
25
  ---
32
26
 
33
- ## Installation
27
+ ## Quick Start
34
28
 
35
29
  Each link below opens a short install guide with a click-to-install button (where supported) and the manual config.
36
30
 
37
- | Client | Install guide |
38
- | ------ | ------------- |
39
- | ❄️ **Snowflake Cortex Code (CoCo)** | [docs/installation-guides/cortex-code.md](docs/installation-guides/cortex-code.md) |
40
- | Cursor | [docs/installation-guides/cursor.md](docs/installation-guides/cursor.md) |
41
- | VS Code | [docs/installation-guides/vscode.md](docs/installation-guides/vscode.md) |
42
- | VS Code Insiders | [docs/installation-guides/vscode-insiders.md](docs/installation-guides/vscode-insiders.md) |
43
- | Claude Code (CLI) | [docs/installation-guides/claude-code.md](docs/installation-guides/claude-code.md) |
44
- | Claude Desktop | [docs/installation-guides/claude-desktop.md](docs/installation-guides/claude-desktop.md) |
45
- | Windsurf | [docs/installation-guides/windsurf.md](docs/installation-guides/windsurf.md) |
46
-
47
- Or expand the dropdown for your client below to paste directly without leaving this page.
31
+ > [!TIP]
32
+ > **❄️ Snowflake Cortex Code + coalesce-transform-mcp.** CoCo is Snowflake's AI coding CLI - it already knows your warehouse, role, and data. Drop this MCP in and an agent can plan pipelines, create nodes, run DML, and verify results in a single session, all under Snowflake's auth model. **[Install in Cortex Code →](docs/installation-guides/cortex-code.md)**
48
33
 
49
34
  <details>
50
35
  <summary><b>❄️ Install in Snowflake Cortex Code (CoCo)</b></summary>
@@ -218,485 +203,197 @@ Windsurf does **not** expand `${VAR}` - paste the literal token, or drop the `en
218
203
 
219
204
  </details>
220
205
 
221
- > [!CAUTION]
222
- > **Never hardcode credentials in git-tracked config files.** Only Claude Code's `.mcp.json` expands `${VAR}` from your shell env. For any other client, keep secrets in `~/.coa/config` or a secrets manager your client integrates with - don't commit literals into these JSON files.
206
+ <br>
223
207
 
224
208
  > [!TIP]
225
- > **❄️ Snowflake Cortex Code + coalesce-transform-mcp.** CoCo is Snowflake's AI coding CLI - it already knows your warehouse, role, and data. Drop this MCP in and an agent can plan pipelines, create nodes, run DML, and verify results in a single session, all under Snowflake's auth model. **[Install in Cortex Code →](docs/installation-guides/cortex-code.md)**
226
-
227
- > [!TIP]
228
- > The two surfaces are orthogonal. Use both, one, or neither. Every destructive tool - on either surface - requires explicit confirmation before running. New? Run the `/coalesce-setup` prompt after install - it walks you through anything missing.
209
+ >
210
+ > ### 🚀 New? Run the `/coalesce-setup` prompt after install
211
+ >
212
+ > It walks you through anything missing.
229
213
 
230
214
  ---
231
215
 
232
- ## Quick start
233
-
234
- **Requirements:**
235
-
236
- - [Node.js](https://nodejs.org/) 22+
237
- - A [Coalesce](https://coalesce.io/) account with a workspace
238
- - An MCP-compatible AI client (see [Installation](#installation))
239
- - Snowflake credentials - only if you plan to use run tools or `coa_create`/`coa_run` (see [Credentials](#credentials))
240
- - Install footprint is ~76 MB unpacked (the bundled `@coalescesoftware/coa` CLI ships its own runtime; the MCP tarball itself is under 1 MB)
216
+ ## Skills
241
217
 
242
- **1. Clone your project.** If your team already has a Coalesce project in Git, clone it locally - the bundled `coa` CLI operates on a project directory, so most local create/run tools require one on disk:
218
+ **Skills are editable markdown that shapes how the agent reasons about your Coalesce project.** Ship your team's naming conventions, grain definitions, and layering patterns as context - every agent on the server instantly picks them up. No fine-tuning, no prompt engineering, just markdown you edit and commit.
243
219
 
244
- ```bash
245
- git clone <your-coalesce-project-repo-url>
246
- cd my-project
247
- ```
220
+ Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolves to default content, user-augmented content, or a full user override - see [docs/context-skills.md](docs/context-skills.md) for the resolution order and customization walkthrough.
248
221
 
249
- Don't have a Git-linked project yet? In the Coalesce UI, open your workspace → **Settings → Git** and connect a repo (or create one via your Git provider and paste the URL). Coalesce will commit the project skeleton on first push; clone that repo locally once it's populated.
222
+ **24 skills, grouped into 6 families:**
250
223
 
251
224
  <details>
252
- <summary>What's in a Coalesce project directory?</summary>
253
225
 
254
- ```text
255
- my-project/
256
- ├── data.yml # Root metadata (fileVersion, platformKind)
257
- ├── locations.yml # Storage location manifest
258
- ├── nodes/ # Pipeline nodes (.yml for V1, .sql for V2)
259
- ├── nodeTypes/ # Node type definitions with templates
260
- ├── environments/ # Environment configs with storage mappings
261
- ├── macros/ # Reusable SQL macros
262
- ├── jobs/ # Job definitions
263
- └── subgraphs/ # Subgraph definitions
264
- ```
226
+ <summary>
227
+ <picture>
228
+ <source media="(prefers-color-scheme: dark)" srcset="docs/icons/book-dark.png">
229
+ <source media="(prefers-color-scheme: light)" srcset="docs/icons/book-light.png">
230
+ <img src="docs/icons/book-light.png" width="28" height="28" alt="book">
231
+ </picture>
232
+ <b>Foundations &mdash; the shared context every agent starts with</b>
233
+ </summary>
265
234
 
266
- **V1 vs V2** - the format is pinned by `fileVersion` in `data.yml`. **V1** (`fileVersion: 1` or `2`) stores each node as a single YAML file with columns, transforms, and config inline. **V2** (`fileVersion: 3`) is SQL-first: the node body lives in a `.sql` file using `@id` / `@nodeType` annotations and `{{ ref() }}` references, with YAML retained for config. New projects default to V2; existing V1 projects keep working unchanged.
235
+ - **`overview`** - General Coalesce concepts, response guidelines, and operational constraints
236
+ - **`tool-usage`** - Best practices for tool batching, parallelization, and SQL conversion
237
+ - **`id-discovery`** - Resolving project, workspace, environment, job, run, node, and org IDs
238
+ - **`storage-mappings`** - Storage location concepts, `{{ ref() }}` syntax, and reference patterns
239
+ - **`ecosystem-boundaries`** - Scope of this MCP vs adjacent data-engineering MCPs (Snowflake, Fivetran, dbt, Catalog)
240
+ - **`data-engineering-principles`** - Node type selection, layered architecture, methodology detection, materialization strategies
241
+ - **`sql-platform-selection`** - Determining the active SQL platform from project metadata
267
242
 
268
243
  </details>
269
244
 
270
- Point the MCP at this directory by setting `repoPath` in `~/.coa/config` or `COALESCE_REPO_PATH` in your env block.
271
-
272
- **2. Create `workspaces.yml`.** This file is **required** for `coa_create` / `coa_run` and their dry-run variants. It maps each storage location declared in `locations.yml` to a physical database + schema for local development. It's typically gitignored (per-developer), so cloning the project does not give it to you - you have to create it.
273
-
274
- The `/coalesce-setup` prompt detects a missing `workspaces.yml` and walks you through it. If you'd rather do it directly, pick one of:
245
+ <details>
275
246
 
276
- - **Let COA bootstrap it** (easiest): from the project root, run
247
+ <summary>
248
+ <picture>
249
+ <source media="(prefers-color-scheme: dark)" srcset="docs/icons/file-dark.png">
250
+ <source media="(prefers-color-scheme: light)" srcset="docs/icons/file-light.png">
251
+ <img src="docs/icons/file-light.png" width="28" height="28" alt="file">
252
+ </picture>
253
+ <b>SQL platform rules &mdash; per-warehouse conventions for node SQL</b>
254
+ </summary>
277
255
 
278
- ```bash
279
- npx @coalescesoftware/coa doctor --fix
280
- ```
256
+ - **`sql-snowflake`** - Snowflake-specific SQL conventions for node SQL
257
+ - **`sql-databricks`** - Databricks-specific SQL conventions for node SQL
258
+ - **`sql-bigquery`** - BigQuery-specific SQL conventions for node SQL
281
259
 
282
- Or from your MCP client, call the `coa_bootstrap_workspaces` tool (requires `confirmed: true`) which runs the same command.
260
+ </details>
283
261
 
284
- > [!WARNING]
285
- > **The generated file contains placeholder values.** `coa doctor --fix` seeds `database`/`schema` with defaults that won't match your real warehouse. Open the file and replace every placeholder before running `coa_create` / `coa_run` - otherwise the generated DDL/DML will target the wrong (or non-existent) database.
262
+ <details>
286
263
 
287
- - **Hand-write it.** Authoritative schema (from `coa describe schema workspaces` - no top-level wrapper, no `fileVersion`):
264
+ <summary>
265
+ <picture>
266
+ <source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-commit-dark.png">
267
+ <source media="(prefers-color-scheme: light)" srcset="docs/icons/git-commit-light.png">
268
+ <img src="docs/icons/git-commit-light.png" width="28" height="28" alt="git-commit">
269
+ </picture>
270
+ <b>Node editing &amp; payloads &mdash; how the agent reasons about node bodies</b>
271
+ </summary>
288
272
 
289
- ```yaml
290
- # workspaces.yml - keys are workspace names; `dev` is the default if --workspace is omitted
291
- dev:
292
- connection: snowflake # required - name of the connection block COA should use
293
- locations: # optional - one entry per storage location name from locations.yml
294
- SRC_INGEST_TASTY_BITES:
295
- database: JESSE_DEV # required
296
- schema: INGEST_TASTY_BITES # required
297
- ETL_STAGE:
298
- database: JESSE_DEV
299
- schema: ETL_STAGE
300
- ANALYTICS:
301
- database: JESSE_DEV
302
- schema: ANALYTICS
303
- ```
273
+ - **`node-creation-decision-tree`** - Choosing between predecessor-based creation, updates, and full replacements
274
+ - **`node-payloads`** - Working with workspace node bodies, metadata, config, and array-replacement risks
275
+ - **`hydrated-metadata`** - Coalesce hydrated metadata structures for advanced node payload editing
276
+ - **`intelligent-node-configuration`** - How intelligent config completion works, schema resolution, automatic field detection
277
+ - **`node-operations`** - Editing existing nodes: joins, columns, config fields, and SQL-to-graph conversion
278
+ - **`aggregation-patterns`** - JOIN ON generation, GROUP BY detection, and join-to-aggregation conversion
304
279
 
305
- Verify with `coa_doctor` (or `npx @coalescesoftware/coa doctor`) - it checks `data.yml`, `workspaces.yml`, credentials, and warehouse connectivity end to end.
280
+ </details>
306
281
 
307
- **3. Pick an auth path:**
282
+ <details>
308
283
 
309
- <table>
310
- <tr>
311
- <th>Option A - env var</th>
312
- <th>Option B - reuse <code>~/.coa/config</code></th>
313
- </tr>
314
- <tr valign="top">
315
- <td>
284
+ <summary>
285
+ <picture>
286
+ <source media="(prefers-color-scheme: dark)" srcset="docs/icons/repo-dark.png">
287
+ <source media="(prefers-color-scheme: light)" srcset="docs/icons/repo-light.png">
288
+ <img src="docs/icons/repo-light.png" width="28" height="28" alt="repo">
289
+ </picture>
290
+ <b>Node type selection &mdash; picking the right node type for each step</b>
291
+ </summary>
316
292
 
317
- Simplest for first-time MCP users. Generate a `COALESCE_ACCESS_TOKEN` from Coalesce Deploy User Settings, then include it in your client config:
293
+ - **`node-type-selection-guide`** - When to use each Coalesce node type (Stage/Work vs Dimension/Fact vs specialized)
294
+ - **`node-type-corpus`** - Node type discovery, corpus search, and metadata patterns
318
295
 
319
- ```json
320
- {
321
- "env": {
322
- "COALESCE_ACCESS_TOKEN": "<YOUR_TOKEN>"
323
- }
324
- }
325
- ```
296
+ </details>
326
297
 
327
- </td>
328
- <td>
298
+ <details>
329
299
 
330
- Best if you already use the `coa` CLI - the server reads the same profile file, so nothing to duplicate. Drop the `env` block entirely:
300
+ <summary>
301
+ <picture>
302
+ <source media="(prefers-color-scheme: dark)" srcset="docs/icons/workflow-dark.png">
303
+ <source media="(prefers-color-scheme: light)" srcset="docs/icons/workflow-light.png">
304
+ <img src="docs/icons/workflow-light.png" width="28" height="28" alt="workflow">
305
+ </picture>
306
+ <b>Pipeline workflows &mdash; end-to-end pipeline building</b>
307
+ </summary>
331
308
 
332
- ```json
333
- {
334
- "command": "npx",
335
- "args": ["coalesce-transform-mcp"]
336
- }
337
- ```
309
+ - **`pipeline-workflows`** - Building pipelines end-to-end: node type selection, multi-node sequences, execution
310
+ - **`intent-pipeline-guide`** - Using `build_pipeline_from_intent` to create pipelines from natural language
311
+ - **`pipeline-review-guide`** - Using `review_pipeline` for pipeline analysis and optimization
312
+ - **`pipeline-workshop-guide`** - Using pipeline workshop tools for iterative, conversational pipeline building
338
313
 
339
- See [Credentials](#credentials) for the profile schema.
314
+ </details>
340
315
 
341
- </td>
342
- </tr>
343
- </table>
316
+ <details>
344
317
 
345
- When both sources set a field, the env var wins.
318
+ <summary>
319
+ <picture>
320
+ <source media="(prefers-color-scheme: dark)" srcset="docs/icons/beaker-dark.png">
321
+ <source media="(prefers-color-scheme: light)" srcset="docs/icons/beaker-light.png">
322
+ <img src="docs/icons/beaker-light.png" width="28" height="28" alt="beaker">
323
+ </picture>
324
+ <b>Run operations &mdash; starting, retrying, diagnosing runs</b>
325
+ </summary>
346
326
 
347
- **4. Install the server** via one of the [Installation](#installation) paths above.
327
+ - **`run-operations`** - Starting, retrying, polling, diagnosing, and canceling Coalesce runs
328
+ - **`run-diagnostics-guide`** - Using `diagnose_run_failure` to analyze failed runs and determine fixes
348
329
 
349
- **5. Restart your client,** then run the `/coalesce-setup` prompt to verify everything is wired up.
330
+ </details>
350
331
 
351
- If you have more than one Coalesce environment to manage, see [Multiple environments](#multiple-environments).
352
332
 
353
333
  ---
354
334
 
355
- ## Configuration
335
+ ## Tools
356
336
 
357
- ### Credentials
337
+ > [!NOTE]
338
+ >
339
+ > ### Legend
340
+ >
341
+ > - ⚠️ **Destructive** - the tool needs `confirmed: true` before it will run.
342
+ > - 🧰 **Bundled `coa` CLI** - runs locally against a project directory. The tool needs a `projectPath` pointing at a folder that contains `data.yml`.
343
+ > - **Preflight validation** - destructive 🧰 tools run a safety check before shelling out. See [Safety model](docs/safety-model.md).
358
344
 
359
- The server reads credentials from two sources and merges them with **env-wins precedence** - a matching env var always overrides the profile value, so you can pin a single field per session without editing the config file. Call `diagnose_setup` to see which source supplied each value.
345
+ <!-- start of tool reference -->
360
346
 
361
- #### Source 1: `~/.coa/config` (shared with the `coa` CLI)
347
+ <details>
362
348
 
363
- COA stores credentials in a standard INI file. You create it by hand, or let `coa` write it as you use the CLI. The MCP reads the profile selected by `COALESCE_PROFILE` (default `[default]`) and maps the keys below onto their matching env vars.
349
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/project-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/project-light.png"><img src="docs/icons/project-light.png" width="28" height="28" alt="project"></picture> <b>Discovery</b> &mdash; list, get, and search across workspaces, nodes, jobs, and runs</summary>
364
350
 
365
- ```ini
366
- [default]
367
- token=<your-coalesce-refresh-token>
368
- domain=https://your-org.app.coalescesoftware.io
369
- snowflakeAccount=<your-snowflake-account> # e.g., abc12345.us-east-1 - required by coa CLI
370
- snowflakeUsername=YOUR_USER
371
- snowflakeRole=YOUR_ROLE
372
- snowflakeWarehouse=YOUR_WAREHOUSE
373
- snowflakeKeyPairKey=/Users/you/.coa/rsa_key.p8
374
- snowflakeAuthType=KeyPair
375
- orgID=<your-org-id> # optional; fallback for cancel-run
376
- repoPath=/Users/you/path/to/repo # optional; for repo-backed tools
377
- cacheDir=/Users/you/.coa/cache # optional; per-profile cache isolation
351
+ **Environments, workspaces, projects**
378
352
 
379
- [staging]
380
- # …additional profiles; select with COALESCE_PROFILE
381
- ```
353
+ - **`list_environments`** - List all available environments
354
+ - **`get_environment`** - Get details of a specific environment
355
+ - **`list_workspaces`** - List all workspaces
356
+ - **`get_workspace`** - Get details of a specific workspace
357
+ - **`list_projects`** - List all projects
358
+ - **`get_project`** - Get project details
382
359
 
383
- **Key mapping** - each profile key maps to an env var of the same concept:
360
+ **Nodes**
384
361
 
385
- | Profile key | Env var |
386
- | ----------- | ------- |
387
- | `token` | `COALESCE_ACCESS_TOKEN` |
388
- | `domain` | `COALESCE_BASE_URL` |
389
- | `snowflake*` (all keys) | `SNOWFLAKE_*` (matching suffix) |
390
- | `orgID` | `COALESCE_ORG_ID` |
391
- | `repoPath` | `COALESCE_REPO_PATH` |
392
- | `cacheDir` | `COALESCE_CACHE_DIR` |
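
The env-wins merge and key mapping described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the server's actual code: `mergeCredentials`, `PROFILE_TO_ENV`, `Settings`, and `Resolved` are hypothetical names, the `snowflake*` keys are left out for brevity, and treating an empty string as "unset" is an assumption the README does not confirm.

```typescript
// Hypothetical sketch of env-wins precedence; not the server's real implementation.
type Settings = Record<string, string | undefined>;

// Profile key -> env var, copied from the mapping table (snowflake* keys omitted).
const PROFILE_TO_ENV: Record<string, string> = {
  token: "COALESCE_ACCESS_TOKEN",
  domain: "COALESCE_BASE_URL",
  orgID: "COALESCE_ORG_ID",
  repoPath: "COALESCE_REPO_PATH",
  cacheDir: "COALESCE_CACHE_DIR",
};

interface Resolved {
  value: string;
  source: "env" | "profile"; // which source supplied the value
}

function mergeCredentials(profile: Settings, env: Settings): Record<string, Resolved> {
  const resolved: Record<string, Resolved> = {};
  for (const [profileKey, envVar] of Object.entries(PROFILE_TO_ENV)) {
    const fromEnv = env[envVar];
    const fromProfile = profile[profileKey];
    if (fromEnv !== undefined && fromEnv !== "") {
      // A matching env var always overrides the profile value.
      resolved[envVar] = { value: fromEnv, source: "env" };
    } else if (fromProfile !== undefined && fromProfile !== "") {
      // Otherwise fall back to the selected ~/.coa/config profile.
      resolved[envVar] = { value: fromProfile, source: "profile" };
    }
    // Keys set in neither source are simply absent from the result.
  }
  return resolved;
}
```

Recording the `source` next to each value mirrors the kind of per-field report the README attributes to `diagnose_setup`, though the real output format may differ.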
362
+ - **`list_environment_nodes`** - List nodes in an environment
363
+ - **`list_workspace_nodes`** - List nodes in a workspace
364
+ - **`get_environment_node`** - Get a specific environment node
365
+ - **`get_workspace_node`** - Get a specific workspace node
366
+ - **`analyze_workspace_patterns`** - Detect package adoption, pipeline layers, methodology, and generate recommendations
367
+ - **`list_workspace_node_types`** - List distinct node types observed in current workspace nodes
393
368
 
394
- Notes:
369
+ **Jobs, subgraphs, runs**
395
370
 
396
- - `snowflakeAuthType` is read by COA itself (no env var) - include it when using key-pair auth.
397
- - `orgID`, `repoPath`, and `cacheDir` are MCP-specific - the COA CLI ignores them.
398
- - Only the fields the MCP needs are shown above. COA's config supports many more - run `npx @coalescesoftware/coa describe config` for the authoritative reference. Unknown keys are ignored.
371
+ - **`list_environment_jobs`** - List all jobs for an environment
372
+ - **`get_environment_job`** - Get details of a specific job
373
+ - **`list_workspace_subgraphs`** - List subgraphs in a workspace
374
+ - **`get_workspace_subgraph`** - Get details of a specific subgraph
375
+ - **`list_runs`** - List runs with optional filters
376
+ - **`get_run`** - Get details of a specific run
377
+ - **`get_run_results`** - Get results of a completed run
378
+ - **`get_run_details`** - Run metadata plus results in one call
399
379
 
400
- If `~/.coa/config` doesn't exist the server runs env-only - startup never fails on a missing or malformed profile file; it just logs a stderr warning.
380
+ **Search**
401
381
 
402
- #### Source 2: env vars in your MCP config
382
+ - **`search_workspace_content`** - Search node SQL, column names, descriptions, and config values
383
+ - **`audit_documentation_coverage`** - Scan all workspace nodes/columns for missing descriptions
403
384
 
404
- <!-- ENV_METADATA_CORE_TABLE_START -->
405
- | Variable | Description | Default |
406
- | -------- | -------- | -------- |
407
- | `COALESCE_ACCESS_TOKEN` | Bearer token from the Coalesce Deploy tab. Optional when `~/.coa/config` provides a `token`. | — |
408
- | `COALESCE_PROFILE` | Selects which `~/.coa/config` profile to load. | `default` |
409
- | `COALESCE_BASE_URL` | Region-specific base URL. | `https://app.coalescesoftware.io (US)` |
410
- | `COALESCE_ORG_ID` | Fallback org ID for cancel-run. Also readable from `orgID` in the active ~/.coa/config profile. | — |
411
- | `COALESCE_REPO_PATH` | Local repo root for repo-backed tools and pipeline planning. Also readable from `repoPath` in the active ~/.coa/config profile. | — |
412
- | `COALESCE_CACHE_DIR` | Base directory for the local data cache. When set, cache files are written here instead of the working directory. Also readable from `cacheDir` in the active ~/.coa/config profile. | — |
413
- | `COALESCE_MCP_AUTO_CACHE_MAX_BYTES` | JSON size threshold before auto-caching to disk. | `32768` |
414
- | `COALESCE_MCP_LINEAGE_TTL_MS` | In-memory lineage cache TTL in milliseconds. | `1800000` |
415
- | `COALESCE_MCP_MAX_REQUEST_BODY_BYTES` | Max outbound API request body size. | `524288` |
416
- | `COALESCE_MCP_READ_ONLY` | When `true`, hides all write/mutation tools during registration. Only read, list, search, cache, analyze, review, diagnose, and plan tools are exposed. | `false` |
417
- | `COALESCE_MCP_SKILLS_DIR` | Directory for customizable AI skill resources. When set, reads context resources from this directory and seeds defaults on first run. Users can augment or override any skill. | — |
418
- <!-- ENV_METADATA_CORE_TABLE_END -->
385
+ **Local project & cloud CLI**
419
386
 
420
- #### Snowflake credentials (run tools only)
387
+ - 🧰 **`coa_list_project_nodes`** - List all nodes defined in a local project (pre-deploy)
388
+ - 🧰 **`coa_list_environments`** - List deployment environments via the cloud CLI
389
+ - 🧰 **`coa_list_environment_nodes`** - List deployed nodes in an environment via the cloud CLI
390
+ - 🧰 **`coa_list_runs`** - List pipeline runs in a cloud environment (or across all environments)
421
391
 
422
- `start_run`, `retry_run`, `run_and_wait`, `retry_and_wait`, and the warehouse-touching COA tools (`coa_create`, `coa_run`) need Snowflake credentials. These normally come from `~/.coa/config`. Override any field via env var:
392
+ </details>
423
393
 
424
- <!-- ENV_METADATA_SNOWFLAKE_TABLE_START -->
425
- | Variable | Required | Description |
426
- | -------- | -------- | -------- |
427
- | `SNOWFLAKE_ACCOUNT` | Yes | Snowflake account identifier (e.g., `abc12345.us-east-1`). Required by the local `coa` CLI and `coa doctor`; not used by the MCP's REST run path. |
428
- | `SNOWFLAKE_USERNAME` | Yes | Snowflake account username |
429
- | `SNOWFLAKE_KEY_PAIR_KEY` | No | Path to PEM-encoded private key (required if SNOWFLAKE_PAT not set) |
430
- | `SNOWFLAKE_PAT` | No | Snowflake Programmatic Access Token (alternative to key pair) |
431
- | `SNOWFLAKE_KEY_PAIR_PASS` | No | Passphrase for encrypted keys |
432
- | `SNOWFLAKE_WAREHOUSE` | Yes | Snowflake compute warehouse |
433
- | `SNOWFLAKE_ROLE` | Yes | Snowflake user role |
434
- <!-- ENV_METADATA_SNOWFLAKE_TABLE_END -->
394
+ <details>
435
395
 
436
- "Required" means one of env OR the matching `~/.coa/config` field must supply the value. **`SNOWFLAKE_PAT` is env-only** - COA's config uses `snowflakePassword` for Basic auth (a different concept), which this server deliberately doesn't read.
437
-
438
- #### Field-level overrides
439
-
440
- <details>
441
- <summary>Pin a profile but override one field without editing the config file</summary>
442
-
443
- ```json
444
- {
445
- "coalesce-transform": {
446
- "command": "npx",
447
- "args": ["coalesce-transform-mcp"],
448
- "env": {
449
- "COALESCE_PROFILE": "staging",
450
- "SNOWFLAKE_ROLE": "TRANSFORMER_ADMIN"
451
- }
452
- }
453
- }
454
- ```
455
-
456
- Reads: "use the `[staging]` profile, but override its `snowflakeRole`."
457
-
458
- </details>
459
-
460
- ### Multiple environments
461
-
462
- <details>
463
- <summary>Register dev / staging / prod as separate namespaced servers</summary>
464
-
465
- If you work across several Coalesce environments (dev/staging/prod, or multiple orgs), register the package once per profile under distinct server names:
466
-
467
- ```json
468
- {
469
- "mcpServers": {
470
- "coalesce-prod": {
471
- "command": "npx",
472
- "args": ["coalesce-transform-mcp"],
473
- "env": {
474
- "COALESCE_PROFILE": "prod",
475
- "COALESCE_MCP_READ_ONLY": "true"
476
- }
477
- },
478
- "coalesce-dev": {
479
- "command": "npx",
480
- "args": ["coalesce-transform-mcp"],
481
- "env": { "COALESCE_PROFILE": "dev" }
482
- }
483
- }
484
- }
485
- ```
486
-
487
- Why this pattern:
488
-
489
- - **Namespaced tools.** The client surfaces `coalesce-prod__*` vs `coalesce-dev__*`, so an agent can't accidentally mutate the wrong environment.
490
- - **Per-environment safety.** Pair prod with `COALESCE_MCP_READ_ONLY=true` to hide every write tool on that server while leaving dev fully writable.
491
- - **No per-call profile juggling.** Each server is pinned at startup.
492
-
493
- Skip this pattern if you only use one environment - a single registration is simpler. For 2–3 environments it's worth the extra config; beyond that, each server is a separate Node process, so consider whether you actually need them all loaded at once.
494
-
495
- </details>
496
-
497
- ### Using the COA CLI tools
498
-
499
- COA is bundled - no extra install. Usage notes:
500
-
501
- - **Local commands** (`coa_doctor`, `coa_validate`, `coa_dry_run_create`, `coa_dry_run_run`, `coa_create`, `coa_run`, `coa_plan`) need a COA project directory (one that contains `data.yml`). Pass the path via the `projectPath` tool argument.
502
- - **Cloud commands** (`coa_list_environments`, `coa_list_environment_nodes`, `coa_list_runs`, `coa_deploy`, `coa_refresh`) read credentials from `~/.coa/config` - the same file the MCP uses. Populate it once and both surfaces agree.
503
- - **Profile resolution.** Cloud tools accept an optional `profile` arg. When omitted, they fall back to `COALESCE_PROFILE`, then to COA's own `[default]` - so you don't have to pass it on every call.
504
- - **Warehouse-touching commands** (`coa_create`, `coa_run`) need a valid `workspaces.yml` in the project root with storage-location mappings. Preflight catches a missing file before execution.
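
The profile fallback chain for cloud tools described above can be illustrated with a short sketch. `resolveProfile` is a hypothetical helper, not part of the package, and treating an empty string the same as an omitted value is an assumption.

```typescript
// Hypothetical sketch of the profile fallback chain; names are illustrative only.
function resolveProfile(
  profileArg: string | undefined,
  env: Record<string, string | undefined>,
): string {
  // 1. explicit `profile` tool argument
  // 2. COALESCE_PROFILE from the environment
  // 3. COA's own [default] profile
  return profileArg || env["COALESCE_PROFILE"] || "default";
}
```

For example, `resolveProfile(undefined, { COALESCE_PROFILE: "staging" })` yields `"staging"`, while passing `"prod"` as the first argument takes priority over the env var.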
505
-
506
- ### Safety model
507
-
508
- Three layers prevent destructive surprises. See [docs/safety-model.md](docs/safety-model.md) for the full breakdown (tool annotations, read-only mode, explicit confirmation, COA preflight validation).
509
-
510
- - **Tool annotations** - every tool carries MCP `readOnlyHint` / `destructiveHint` / `idempotentHint`. The ⚠️ marker in [Tools](#tools) marks `destructiveHint: true` tools.
511
- - **`COALESCE_MCP_READ_ONLY=true`** hides all write/mutation tools at server startup. Use it for audits, agent sandboxes, or pairing with a prod profile.
512
- - **Explicit confirmation** on destructive ops - `delete_*`, `propagate_column_change`, `cancel_run`, `clear_data_cache`, `coa_create`, `coa_run`, `coa_deploy`, `coa_refresh` all require `confirmed: true`.
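
As a rough illustration of the explicit-confirmation layer, a guard along these lines would reject a destructive call unless `confirmed: true` is passed. The tool names come from the list above; `needsConfirmation`, `assertConfirmed`, and the error message are hypothetical, and the real server may implement the check differently.

```typescript
// Hypothetical guard; tool names are from the README, function names are not.
const DESTRUCTIVE_TOOLS = new Set<string>([
  "propagate_column_change",
  "cancel_run",
  "clear_data_cache",
  "coa_create",
  "coa_run",
  "coa_deploy",
  "coa_refresh",
]);

function needsConfirmation(toolName: string): boolean {
  // `delete_*` is matched by prefix; everything else by exact name.
  return toolName.startsWith("delete_") || DESTRUCTIVE_TOOLS.has(toolName);
}

function assertConfirmed(toolName: string, args: { confirmed?: boolean }): void {
  // Only a literal `true` counts; a missing or false flag blocks the call.
  if (needsConfirmation(toolName) && args.confirmed !== true) {
    throw new Error(`${toolName} is destructive; pass confirmed: true to run it`);
  }
}
```

Read-only tools pass through untouched, so the guard can sit in front of every tool call without special-casing.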
513
-
514
- ### More configuration
515
-
516
- - [Prerelease channel](docs/prerelease.md) - point `npx` at `@alpha` for preview builds.
517
- - [Diagnosing setup](docs/diagnosing-setup.md) - the `diagnose_setup` probe and `/coalesce-setup` MCP prompt.
518
-
519
- ---
520
-
521
- ## Skills
522
-
523
- **Skills are editable markdown that shapes how the agent reasons about your Coalesce project.** Ship your team's naming conventions, grain definitions, and layering patterns as context - every agent on the server instantly picks them up. No fine-tuning, no prompt engineering, just markdown you edit and commit.
524
-
525
- Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolves to default content, user-augmented content, or a full user override - see [docs/context-skills.md](docs/context-skills.md) for the resolution order and customization walkthrough.
526
-
527
- **24 skills, grouped into 6 families:**
528
-
529
- | | Family | Skills | Covers |
530
- | --- | ------ | :----: | ------ |
531
- | <picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/book-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/book-light.png"><img src="docs/icons/book-light.png" width="20" height="20" alt="book"></picture> | **Foundations** | 7 | Core concepts, tool usage, ID discovery, storage mappings, ecosystem scope |
532
- | <picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/file-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/file-light.png"><img src="docs/icons/file-light.png" width="20" height="20" alt="file"></picture> | **SQL platform rules** | 3 | Per-warehouse conventions for node SQL (Snowflake, Databricks, BigQuery) |
533
- | <picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-commit-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/git-commit-light.png"><img src="docs/icons/git-commit-light.png" width="20" height="20" alt="git-commit"></picture> | **Node editing & payloads** | 6 | Decision tree, payload shape, hydrated metadata, joins, config completion |
534
- | <picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/repo-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/repo-light.png"><img src="docs/icons/repo-light.png" width="20" height="20" alt="repo"></picture> | **Node type selection** | 2 | When to use Stage/Work vs Dimension/Fact vs specialized node types |
535
- | <picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/workflow-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/workflow-light.png"><img src="docs/icons/workflow-light.png" width="20" height="20" alt="workflow"></picture> | **Pipeline workflows** | 4 | End-to-end pipeline building, intent, review, and workshop patterns |
536
- | <picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/beaker-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/beaker-light.png"><img src="docs/icons/beaker-light.png" width="20" height="20" alt="beaker"></picture> | **Run operations** | 2 | Starting, retrying, polling, diagnosing, and canceling runs |
537
-
538
- <details>
539
-
540
- <summary>
541
- <picture>
542
- <source media="(prefers-color-scheme: dark)" srcset="docs/icons/book-dark.png">
543
- <source media="(prefers-color-scheme: light)" srcset="docs/icons/book-light.png">
544
- <img src="docs/icons/book-light.png" width="20" height="20" alt="book">
545
- </picture>
546
- &nbsp;<b>Foundations</b> &mdash; the shared context every agent starts with
547
- </summary>
548
-
549
- - **`overview`** - General Coalesce concepts, response guidelines, and operational constraints
550
- - **`tool-usage`** - Best practices for tool batching, parallelization, and SQL conversion
551
- - **`id-discovery`** - Resolving project, workspace, environment, job, run, node, and org IDs
552
- - **`storage-mappings`** - Storage location concepts, `{{ ref() }}` syntax, and reference patterns
553
- - **`ecosystem-boundaries`** - Scope of this MCP vs adjacent data-engineering MCPs (Snowflake, Fivetran, dbt, Catalog)
554
- - **`data-engineering-principles`** - Node type selection, layered architecture, methodology detection, materialization strategies
555
- - **`sql-platform-selection`** - Determining the active SQL platform from project metadata
556
-
557
- </details>
558
-
559
- <details>
560
-
561
- <summary>
562
- <picture>
563
- <source media="(prefers-color-scheme: dark)" srcset="docs/icons/file-dark.png">
564
- <source media="(prefers-color-scheme: light)" srcset="docs/icons/file-light.png">
565
- <img src="docs/icons/file-light.png" width="20" height="20" alt="file">
566
- </picture>
567
- &nbsp;<b>SQL platform rules</b> &mdash; per-warehouse conventions for node SQL
568
- </summary>
569
-
570
- - **`sql-snowflake`** - Snowflake-specific SQL conventions for node SQL
571
- - **`sql-databricks`** - Databricks-specific SQL conventions for node SQL
572
- - **`sql-bigquery`** - BigQuery-specific SQL conventions for node SQL
573
-
574
- </details>
575
-
576
- <details>
577
-
578
- <summary>
579
- <picture>
580
- <source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-commit-dark.png">
581
- <source media="(prefers-color-scheme: light)" srcset="docs/icons/git-commit-light.png">
582
- <img src="docs/icons/git-commit-light.png" width="20" height="20" alt="git-commit">
583
- </picture>
584
- &nbsp;<b>Node editing &amp; payloads</b> &mdash; how the agent reasons about node bodies
585
- </summary>
586
-
587
- - **`node-creation-decision-tree`** - Choosing between predecessor-based creation, updates, and full replacements
588
- - **`node-payloads`** - Working with workspace node bodies, metadata, config, and array-replacement risks
589
- - **`hydrated-metadata`** - Coalesce hydrated metadata structures for advanced node payload editing
590
- - **`intelligent-node-configuration`** - How intelligent config completion works, schema resolution, automatic field detection
591
- - **`node-operations`** - Editing existing nodes: joins, columns, config fields, and SQL-to-graph conversion
592
- - **`aggregation-patterns`** - JOIN ON generation, GROUP BY detection, and join-to-aggregation conversion
593
-
594
- </details>
595
-
596
- <details>
597
-
598
- <summary>
599
- <picture>
600
- <source media="(prefers-color-scheme: dark)" srcset="docs/icons/repo-dark.png">
601
- <source media="(prefers-color-scheme: light)" srcset="docs/icons/repo-light.png">
602
- <img src="docs/icons/repo-light.png" width="20" height="20" alt="repo">
603
- </picture>
604
- &nbsp;<b>Node type selection</b> &mdash; picking the right node type for each step
605
- </summary>
606
-
607
- - **`node-type-selection-guide`** - When to use each Coalesce node type (Stage/Work vs Dimension/Fact vs specialized)
608
- - **`node-type-corpus`** - Node type discovery, corpus search, and metadata patterns
609
-
610
- </details>
611
-
612
- <details>
613
-
614
- <summary>
615
- <picture>
616
- <source media="(prefers-color-scheme: dark)" srcset="docs/icons/workflow-dark.png">
617
- <source media="(prefers-color-scheme: light)" srcset="docs/icons/workflow-light.png">
618
- <img src="docs/icons/workflow-light.png" width="20" height="20" alt="workflow">
619
- </picture>
620
- &nbsp;<b>Pipeline workflows</b> &mdash; end-to-end pipeline building
621
- </summary>
622
-
623
- - **`pipeline-workflows`** - Building pipelines end-to-end: node type selection, multi-node sequences, execution
624
- - **`intent-pipeline-guide`** - Using `build_pipeline_from_intent` to create pipelines from natural language
625
- - **`pipeline-review-guide`** - Using `review_pipeline` for pipeline analysis and optimization
626
- - **`pipeline-workshop-guide`** - Using pipeline workshop tools for iterative, conversational pipeline building
627
-
628
- </details>
629
-
630
- <details>
631
-
632
- <summary>
633
- <picture>
634
- <source media="(prefers-color-scheme: dark)" srcset="docs/icons/beaker-dark.png">
635
- <source media="(prefers-color-scheme: light)" srcset="docs/icons/beaker-light.png">
636
- <img src="docs/icons/beaker-light.png" width="20" height="20" alt="beaker">
637
- </picture>
638
- &nbsp;<b>Run operations</b> &mdash; starting, retrying, diagnosing runs
639
- </summary>
640
-
641
- - **`run-operations`** - Starting, retrying, polling, diagnosing, and canceling Coalesce runs
642
- - **`run-diagnostics-guide`** - Using `diagnose_run_failure` to analyze failed runs and determine fixes
643
-
644
- </details>
645
-
646
- > [!TIP]
647
- > **Companion resources:** 10 topics under `coalesce://coa/describe/*` surface the bundled COA CLI's self-describing documentation, version-pinned to the shipping CLI. Topics: `overview`, `commands`, `selectors`, `schemas`, `workflow`, `structure`, `concepts`, `sql-format`, `node-types`, `config`. Use the `coa_describe` tool for parameterized variants.
648
-
649
- ---
650
-
651
- ## Tools
652
-
653
- ⚠️ = Destructive (requires `confirmed: true`). 🧰 = Runs bundled `coa` CLI.
654
-
655
- <!-- start of tool reference -->
656
-
657
- <details>
658
-
659
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/project-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/project-light.png"><img src="docs/icons/project-light.png" width="20" height="20" alt="project"></picture> Discovery</summary>
660
-
661
- **Environments, workspaces, projects**
662
-
663
- - **`list_environments`** - List all available environments
664
- - **`get_environment`** - Get details of a specific environment
665
- - **`list_workspaces`** - List all workspaces
666
- - **`get_workspace`** - Get details of a specific workspace
667
- - **`list_projects`** - List all projects
668
- - **`get_project`** - Get project details
669
-
670
- **Nodes**
671
-
672
- - **`list_environment_nodes`** - List nodes in an environment
673
- - **`list_workspace_nodes`** - List nodes in a workspace
674
- - **`get_environment_node`** - Get a specific environment node
675
- - **`get_workspace_node`** - Get a specific workspace node
676
- - **`analyze_workspace_patterns`** - Detect package adoption, pipeline layers, methodology, and generate recommendations
677
- - **`list_workspace_node_types`** - List distinct node types observed in current workspace nodes
678
-
679
- **Jobs, subgraphs, runs**
680
-
681
- - **`list_environment_jobs`** - List all jobs for an environment
682
- - **`get_environment_job`** - Get details of a specific job
683
- - **`list_workspace_subgraphs`** - List subgraphs in a workspace
684
- - **`get_workspace_subgraph`** - Get details of a specific subgraph
685
- - **`list_runs`** - List runs with optional filters
686
- - **`get_run`** - Get details of a specific run
687
- - **`get_run_results`** - Get results of a completed run
688
- - **`get_run_details`** - Run metadata plus results in one call
689
-
690
- **Search**
691
-
692
- - **`search_workspace_content`** - Search node SQL, column names, descriptions, and config values
693
- - **`audit_documentation_coverage`** - Scan all workspace nodes/columns for missing descriptions
694
-
695
- </details>
696
-
697
- <details>
698
-
699
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/workflow-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/workflow-light.png"><img src="docs/icons/workflow-light.png" width="20" height="20" alt="workflow"></picture> Pipeline building</summary>
396
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/workflow-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/workflow-light.png"><img src="docs/icons/workflow-light.png" width="28" height="28" alt="workflow"></picture> <b>Pipeline building</b> &mdash; plan, create, and iterate on multi-node pipelines</summary>
700
397
 
701
398
  **Plan & build**
702
399
 
@@ -715,11 +412,16 @@ Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolv
715
412
  - **`get_pipeline_workshop_status`** - Get the current state of a workshop session
716
413
  - **`pipeline_workshop_close`** - Close a workshop session and release resources
717
414
 
415
+ **Local project validation & planning**
416
+
417
+ - 🧰 **`coa_validate`** - Validate YAML schemas and scan a local project for configuration problems
418
+ - 🧰 **`coa_plan`** - Generate a deployment plan JSON by diffing a local project against a cloud environment (non-destructive)
419
+
718
420
  </details>
719
421
 
720
422
  <details>
721
423
 
722
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-commit-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/git-commit-light.png"><img src="docs/icons/git-commit-light.png" width="20" height="20" alt="git-commit"></picture> Node editing</summary>
424
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-commit-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/git-commit-light.png"><img src="docs/icons/git-commit-light.png" width="28" height="28" alt="git-commit"></picture> <b>Node editing</b> &mdash; create, update, delete, and configure workspace nodes</summary>
723
425
 
724
426
  **Create**
725
427
 
@@ -753,7 +455,7 @@ Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolv
753
455
 
754
456
  <details>
755
457
 
756
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/beaker-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/beaker-light.png"><img src="docs/icons/beaker-light.png" width="20" height="20" alt="beaker"></picture> Runs & execution</summary>
458
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/beaker-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/beaker-light.png"><img src="docs/icons/beaker-light.png" width="28" height="28" alt="beaker"></picture> <b>Runs & execution</b> &mdash; start, retry, poll, diagnose, and cancel runs</summary>
757
459
 
758
460
  - **`start_run`** - Start a new run; requires Snowflake auth
759
461
  - **`run_and_wait`** - Start a run and poll until completion
@@ -765,11 +467,20 @@ Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolv
765
467
  - **`get_environment_overview`** - Environment details with full node list
766
468
  - **`get_environment_health`** - Dashboard: node counts, run statuses, failed runs in last 24h, stale nodes, dependency health
767
469
 
470
+ **Local execution (bundled CLI)**
471
+
472
+ - 🧰 **`coa_dry_run_create`** - Preview DDL without executing (does **not** validate columns/types exist in warehouse)
473
+ - 🧰 **`coa_dry_run_run`** - Preview DML without executing (same caveat)
474
+ - 🧰 **`coa_create`** - Run DDL (CREATE/REPLACE) against the warehouse for selected nodes ⚠️
475
+ - 🧰 **`coa_run`** - Run DML (INSERT/MERGE) to populate selected nodes ⚠️
476
+ - 🧰 **`coa_deploy`** - Apply a plan JSON to a cloud environment ⚠️
477
+ - 🧰 **`coa_refresh`** - Run DML for selected nodes in an already-deployed environment (no local project required) ⚠️
478
+
768
479
  </details>
769
480
 
770
481
  <details>
771
482
 
772
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-branch-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/git-branch-light.png"><img src="docs/icons/git-branch-light.png" width="20" height="20" alt="git-branch"></picture> Lineage & impact</summary>
483
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/git-branch-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/git-branch-light.png"><img src="docs/icons/git-branch-light.png" width="28" height="28" alt="git-branch"></picture> <b>Lineage & impact</b> &mdash; trace dependencies, analyze impact, propagate column changes</summary>
773
484
 
774
485
  - **`get_upstream_nodes`** - Walk the full upstream dependency graph for a node
775
486
  - **`get_downstream_nodes`** - Walk the full downstream dependency graph for a node
@@ -781,7 +492,7 @@ Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolv
781
492
 
782
493
  <details>
783
494
 
784
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/repo-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/repo-light.png"><img src="docs/icons/repo-light.png" width="20" height="20" alt="repo"></picture> Repo-backed node types</summary>
495
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/repo-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/repo-light.png"><img src="docs/icons/repo-light.png" width="28" height="28" alt="repo"></picture> <b>Repo-backed node types</b> &mdash; inspect committed node-type definitions, variants, and templates</summary>
785
496
 
786
497
  - **`list_repo_packages`** - List package aliases and enabled node-type coverage from a committed Coalesce repo
787
498
  - **`list_repo_node_types`** - List exact resolvable committed node-type identifiers from `nodeTypes/`
@@ -795,58 +506,24 @@ Set `COALESCE_MCP_SKILLS_DIR` to make skills editable on disk. Each skill resolv
795
506
 
796
507
  <details>
797
508
 
798
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/tools-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/tools-light.png"><img src="docs/icons/tools-light.png" width="20" height="20" alt="tools"></picture> COA CLI</summary>
509
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/file-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/file-light.png"><img src="docs/icons/file-light.png" width="28" height="28" alt="file"></picture> <b>Projects, environments & git accounts</b> &mdash; admin CRUD for top-level resources</summary>
799
510
 
800
- All local tools accept a `projectPath` argument and validate that it contains `data.yml` before shelling out. Destructive tools run preflight validation; see [Safety model](docs/safety-model.md).
511
+ - **`create_environment`** - Create a new environment within a project
512
+ - **`delete_environment`** - Delete an environment ⚠️
513
+ - **`create_project`** - Create a new project
514
+ - **`update_project`** - Update a project
515
+ - **`delete_project`** - Delete a project ⚠️
516
+ - **`list_git_accounts`** - List all git accounts
517
+ - **`get_git_account`** - Get git account details
518
+ - **`create_git_account`** - Create a new git account
519
+ - **`update_git_account`** - Update a git account
520
+ - **`delete_git_account`** - Delete a git account ⚠️
801
521
 
802
- **Read-only, local**
522
+ </details>
803
523
 
804
- - 🧰 **`coa_doctor`** - Check config, credentials, and warehouse connectivity
805
- - 🧰 **`coa_validate`** - Validate YAML schemas and scan for configuration problems
806
- - 🧰 **`coa_list_project_nodes`** - List all nodes defined in a local project (pre-deploy)
807
- - 🧰 **`coa_dry_run_create`** - Preview DDL without executing (does **not** validate columns/types exist in warehouse)
808
- - 🧰 **`coa_dry_run_run`** - Preview DML without executing (same caveat)
524
+ <details>
809
525
 
810
- **Read-only, cloud**
811
-
812
- - 🧰 **`coa_list_environments`** - List deployment environments
813
- - 🧰 **`coa_list_environment_nodes`** - List deployed nodes in an environment
814
- - 🧰 **`coa_list_runs`** - List pipeline runs in a cloud environment
815
-
816
- **Describe**
817
-
818
- - 🧰 **`coa_describe`** - Fetch a section of COA's self-describing documentation by topic + optional subtopic
819
-
820
- **Write & deploy**
821
-
822
- - 🧰 **`coa_plan`** - Generate a deployment plan JSON by diffing local project against a cloud environment (non-destructive)
823
- - 🧰 **`coa_create`** - Run DDL (CREATE/REPLACE) against the warehouse for selected nodes ⚠️
824
- - 🧰 **`coa_run`** - Run DML (INSERT/MERGE) to populate selected nodes ⚠️
825
- - 🧰 **`coa_deploy`** - Apply a plan JSON to a cloud environment ⚠️
826
- - 🧰 **`coa_refresh`** - Run DML for selected nodes in an already-deployed environment (no local project required) ⚠️
827
-
828
- </details>
829
-
830
- <details>
831
-
832
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/file-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/file-light.png"><img src="docs/icons/file-light.png" width="20" height="20" alt="file"></picture> Projects, environments & git accounts</summary>
833
-
834
- - **`create_environment`** - Create a new environment within a project
835
- - **`delete_environment`** - Delete an environment ⚠️
836
- - **`create_project`** - Create a new project
837
- - **`update_project`** - Update a project
838
- - **`delete_project`** - Delete a project ⚠️
839
- - **`list_git_accounts`** - List all git accounts
840
- - **`get_git_account`** - Get git account details
841
- - **`create_git_account`** - Create a new git account
842
- - **`update_git_account`** - Update a git account
843
- - **`delete_git_account`** - Delete a git account ⚠️
844
-
845
- </details>
846
-
847
- <details>
848
-
849
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/shield-lock-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/shield-lock-light.png"><img src="docs/icons/shield-lock-light.png" width="20" height="20" alt="shield-lock"></picture> Users & roles</summary>
526
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/shield-lock-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/shield-lock-light.png"><img src="docs/icons/shield-lock-light.png" width="28" height="28" alt="shield-lock"></picture> <b>Users & roles</b> &mdash; assign and remove org, project, and environment roles</summary>
850
527
 
851
528
  - **`list_org_users`** - List all organization users
852
529
  - **`get_user_roles`** - Get roles for a specific user
@@ -861,7 +538,7 @@ All local tools accept a `projectPath` argument and validate that it contains `d
861
538
 
862
539
  <details>
863
540
 
864
- <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/book-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/book-light.png"><img src="docs/icons/book-light.png" width="20" height="20" alt="book"></picture> Cache, skills & setup</summary>
541
+ <summary><picture><source media="(prefers-color-scheme: dark)" srcset="docs/icons/book-dark.png"><source media="(prefers-color-scheme: light)" srcset="docs/icons/book-light.png"><img src="docs/icons/book-light.png" width="28" height="28" alt="book"></picture> <b>Cache, skills & setup</b> &mdash; local snapshots, customizable skills, and setup diagnostics</summary>
865
542
 
866
543
  **Cache snapshots**
867
544
 
@@ -875,6 +552,8 @@ All local tools accept a `projectPath` argument and validate that it contains `d
875
552
 
876
553
  - **`personalize_skills`** - Export bundled skill files to a local directory for customization
877
554
  - **`diagnose_setup`** - Stateless probe reporting configured setup pieces; pairs with the `/coalesce-setup` MCP prompt
555
+ - 🧰 **`coa_doctor`** - Check config, credentials, and warehouse connectivity end-to-end for a local project
556
+ - 🧰 **`coa_describe`** - Fetch a section of COA's self-describing documentation by topic + optional subtopic (also available as `coalesce://coa/describe/*` resources)
878
557
 
879
558
  </details>
880
559
 
@@ -882,6 +561,281 @@ All local tools accept a `projectPath` argument and validate that it contains `d
882
561
 
883
562
  ---
884
563
 
564
+ ## Full Installation
565
+
566
+ **Requirements:**
567
+
568
+ - [Node.js](https://nodejs.org/) 22+
569
+ - A [Coalesce](https://coalesce.io/) account with a workspace
570
+ - An MCP-compatible AI client (see [Quick Start](#quick-start))
571
+ - Snowflake credentials - only if you plan to use run tools or `coa_create`/`coa_run` (see [Credentials](#credentials))
572
+ - Install footprint is ~76 MB unpacked (the bundled `@coalescesoftware/coa` CLI ships its own runtime; the MCP tarball itself is under 1 MB)
573
+
574
+ **1. Clone your project.** If your team already has a Coalesce project in Git, clone it locally - the bundled `coa` CLI operates on a project directory, so most local create/run tools require one on disk:
575
+
576
+ ```bash
577
+ git clone <your-coalesce-project-repo-url>
578
+ cd my-project   # substitute the directory name that git clone created
579
+ ```
580
+
581
+ Don't have a Git-linked project yet? In the Coalesce UI, open your workspace → **Settings → Git** and connect a repo (or create one via your Git provider and paste the URL). Coalesce will commit the project skeleton on first push; clone that repo locally once it's populated.
582
+
583
+ <details>
584
+ <summary><b>What's in a Coalesce project directory?</b></summary>
585
+
586
+ ```text
587
+ my-project/
588
+ ├── data.yml # Root metadata (fileVersion, platformKind)
589
+ ├── locations.yml # Storage location manifest
590
+ ├── nodes/ # Pipeline nodes (.yml for V1, .sql for V2)
591
+ ├── nodeTypes/ # Node type definitions with templates
592
+ ├── environments/ # Environment configs with storage mappings
593
+ ├── macros/ # Reusable SQL macros
594
+ ├── jobs/ # Job definitions
595
+ └── subgraphs/ # Subgraph definitions
596
+ ```
597
+
598
+ **V1 vs V2** - the format is pinned by `fileVersion` in `data.yml`. **V1** (`fileVersion: 1` or `2`) stores each node as a single YAML file with columns, transforms, and config inline. **V2** (`fileVersion: 3`) is SQL-first: the node body lives in a `.sql` file using `@id` / `@nodeType` annotations and `{{ ref() }}` references, with YAML retained for config. New projects default to V2; existing V1 projects keep working unchanged.
599
+
600
+ </details>
601
+
602
+ Point the MCP at this directory by setting `repoPath` in `~/.coa/config` or `COALESCE_REPO_PATH` in your env block.
603
+
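The `fileVersion` rule from the V1 vs V2 note above can be sketched as a small lookup. This is an illustration in Python, not code from the server; `project_format` is a hypothetical helper name, and it assumes `fileVersion` has already been read from `data.yml` as an integer.

```python
def project_format(file_version):
    """Map `fileVersion` from data.yml to the format name used in this README.

    Hypothetical helper for illustration only.
    """
    if file_version in (1, 2):
        return "V1"  # one YAML file per node, columns and config inline
    if file_version == 3:
        return "V2"  # SQL-first: .sql node bodies, YAML retained for config
    # Anything else is outside what this README documents.
    raise ValueError(f"unrecognized fileVersion: {file_version!r}")
```

Raising on an unknown value, rather than guessing a format, keeps a future `fileVersion` from being silently treated as V1 or V2.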
604
+ **2. Create `workspaces.yml`.** This file is **required** for `coa_create` / `coa_run` and their dry-run variants. It maps each storage location declared in `locations.yml` to a physical database + schema for local development. It's typically gitignored (per-developer), so cloning the project does not give it to you - you have to create it.
605
+
606
+ The `/coalesce-setup` prompt detects a missing `workspaces.yml` and walks you through it. If you'd rather do it directly, pick one of:
607
+
608
+ - **Ask your agent to bootstrap it** (easiest): prompt the agent to call the `coa_bootstrap_workspaces` tool (it needs `confirmed: true`, so the agent will ask before running).
609
+
610
+ > [!WARNING]
611
+ > **The generated file contains placeholder values.** The bootstrap tool seeds `database`/`schema` with defaults that won't match your real warehouse. Ask the agent to open the file with you and replace every placeholder before calling `coa_create` / `coa_run` - otherwise the generated DDL/DML will target the wrong (or non-existent) database.
612
+
613
+ - **Hand-write it.** Ask the agent to fetch the authoritative schema via the `coa_describe` tool (`topic: "schema"`, `subtopic: "workspaces"`) - no top-level wrapper, no `fileVersion`.
614
+
615
+ <details>
616
+ <summary><b>Example <code>workspaces.yml</code></b></summary>
617
+
618
+ ```yaml
619
+ # workspaces.yml - keys are workspace names; `dev` is the default if --workspace is omitted
620
+ dev:
621
+ connection: snowflake # required - name of the connection block COA should use
622
+ locations: # optional - one entry per storage location name from locations.yml
623
+ SRC_INGEST_TASTY_BITES:
624
+ database: JESSE_DEV # required
625
+ schema: INGEST_TASTY_BITES # required
626
+ ETL_STAGE:
627
+ database: JESSE_DEV
628
+ schema: ETL_STAGE
629
+ ANALYTICS:
630
+ database: JESSE_DEV
631
+ schema: ANALYTICS
632
+ ```
633
+
634
+ </details>
635
+
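As a rough sanity check on a hand-written file, the shape shown in the example can be verified before running `coa_create` / `coa_run`. The sketch below is an assumption-laden illustration: `unmapped_locations` is a hypothetical helper, it works on an already-parsed dictionary rather than reading YAML, and it only checks the fields the example marks as required. Because `locations` is marked optional in the example, a reported gap may be acceptable in some setups; `coa_doctor` remains the authoritative check.

```python
def unmapped_locations(location_names, workspaces, workspace="dev"):
    """Return a list of problems for one workspace entry (hypothetical check).

    `location_names` are the storage location names from locations.yml;
    `workspaces` is the parsed content of workspaces.yml.
    """
    entry = workspaces.get(workspace)
    if entry is None:
        return [f"workspace '{workspace}' is not defined"]

    problems = []
    if not entry.get("connection"):
        problems.append(f"{workspace}: missing 'connection'")

    locations = entry.get("locations") or {}
    for name in location_names:
        mapping = locations.get(name)
        if mapping is None:
            problems.append(f"{name}: no mapping")
            continue
        # The example marks both fields as required for each location.
        for field in ("database", "schema"):
            if not mapping.get(field):
                problems.append(f"{name}: missing '{field}'")
    return problems


# Example input mirroring the structure of the sample file above.
example = {
    "dev": {
        "connection": "snowflake",
        "locations": {
            "ETL_STAGE": {"database": "JESSE_DEV", "schema": "ETL_STAGE"},
            "ANALYTICS": {"database": "JESSE_DEV"},
        },
    }
}
problems = unmapped_locations(["ETL_STAGE", "ANALYTICS", "SRC"], example)
```

Here `problems` flags `ANALYTICS` for its missing `schema` and `SRC` for having no entry at all.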
636
+ Ask your agent to verify the setup - e.g. *"Run `coa_doctor` on my project and summarize the results."* It checks `data.yml`, `workspaces.yml`, credentials, and warehouse connectivity end to end.
637
+
638
+ **3. Pick an auth path:**
639
+
640
+ <table>
641
+ <tr>
642
+ <th>Option A - env var</th>
643
+ <th>Option B - reuse <code>~/.coa/config</code></th>
644
+ </tr>
645
+ <tr valign="top">
646
+ <td>
647
+
648
+ Simplest for first-time MCP users. Generate a `COALESCE_ACCESS_TOKEN` from Coalesce → Deploy → User Settings, then include it in your client config:
649
+
650
+ ```json
651
+ {
652
+ "env": {
653
+ "COALESCE_ACCESS_TOKEN": "<YOUR_TOKEN>"
654
+ }
655
+ }
656
+ ```
657
+
658
+ </td>
659
+ <td>
660
+
661
+ Best if you already use the `coa` CLI - the server reads the same profile file, so nothing to duplicate. Drop the `env` block entirely:
662
+
663
+ ```json
664
+ {
665
+ "command": "npx",
666
+ "args": ["coalesce-transform-mcp"]
667
+ }
668
+ ```
669
+
670
+ See [Credentials](#credentials) for the profile schema.
671
+
672
+ </td>
673
+ </tr>
674
+ </table>
675
+
676
+ When both sources set a field, the env var wins.
677
+
678
+ **4. Install the server** via one of the [Quick Start](#quick-start) paths above.
679
+
680
+ **5. Restart your client,** then run the `/coalesce-setup` prompt to verify everything is wired up.
681
+
682
+ If you have more than one Coalesce environment to manage, see [Multiple environments](#multiple-environments).
683
+
684
+ ### Credentials
685
+
686
+ The server reads credentials from two sources and merges them with **env-wins precedence** - a matching env var always overrides the profile value, so you can pin a single field per session without editing the config file. Call `diagnose_setup` to see which source supplied each value.
687
+
688
+ #### Source 1: `~/.coa/config` (shared with the `coa` CLI)
689
+
690
+ COA stores credentials in a standard INI file. Create it by hand, or let `coa` write it as you use the CLI. The MCP reads the profile selected by `COALESCE_PROFILE` (default `[default]`) and maps the keys below onto their matching env vars.
691
+
692
+ ```ini
693
+ [default]
694
+ token=<your-coalesce-refresh-token>
695
+ domain=https://your-org.app.coalescesoftware.io
696
+ snowflakeAccount=<your-snowflake-account> # e.g., abc12345.us-east-1 - required by coa CLI
697
+ snowflakeUsername=YOUR_USER
698
+ snowflakeRole=YOUR_ROLE
699
+ snowflakeWarehouse=YOUR_WAREHOUSE
700
+ snowflakeKeyPairKey=/Users/you/.coa/rsa_key.p8
701
+ snowflakeAuthType=KeyPair
702
+ orgID=<your-org-id> # optional; fallback for cancel-run
703
+ repoPath=/Users/you/path/to/repo # optional; for repo-backed tools
704
+ cacheDir=/Users/you/.coa/cache # optional; per-profile cache isolation
705
+
706
+ [staging]
707
+ # …additional profiles; select with COALESCE_PROFILE
708
+ ```
709
+
710
+ **Key mapping** - each profile key maps to the env var for the same setting:
711
+
712
+ | Profile key | Env var |
713
+ | ----------- | ------- |
714
+ | `token` | `COALESCE_ACCESS_TOKEN` |
715
+ | `domain` | `COALESCE_BASE_URL` |
716
+ | `snowflake*` (all keys) | `SNOWFLAKE_*` (matching suffix) |
717
+ | `orgID` | `COALESCE_ORG_ID` |
718
+ | `repoPath` | `COALESCE_REPO_PATH` |
719
+ | `cacheDir` | `COALESCE_CACHE_DIR` |
720
+
721
+ Notes:
722
+
723
+ - `snowflakeAuthType` is read by COA itself (no env var) - include it when using key-pair auth.
724
+ - `orgID`, `repoPath`, and `cacheDir` are MCP-specific - the COA CLI ignores them.
725
+ - Only the fields the MCP needs are shown above. COA's config supports many more - run `npx @coalescesoftware/coa describe config` for the authoritative reference. Unknown keys are ignored.
726
+
727
+ If `~/.coa/config` doesn't exist, the server runs env-only. Startup never fails on a missing or malformed profile file; it just logs a warning to stderr.
728
+
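The env-wins merge described above can be sketched as follows. This is a minimal illustration in Python, not the server's implementation: `resolve_settings` and `KEY_TO_ENV` are hypothetical names, the `snowflake*` keys are left out for brevity, and the way inline `#` comments and malformed files are handled here is an assumption about INI parsing in general rather than a statement about COA's parser.

```python
import configparser

# Profile key -> env var, taken from the key-mapping table above
# (snowflake* keys omitted for brevity).
KEY_TO_ENV = {
    "token": "COALESCE_ACCESS_TOKEN",
    "domain": "COALESCE_BASE_URL",
    "orgID": "COALESCE_ORG_ID",
    "repoPath": "COALESCE_REPO_PATH",
    "cacheDir": "COALESCE_CACHE_DIR",
}


def resolve_settings(config_text, env):
    """Merge one profile with env vars; env wins (hypothetical helper).

    Returns {env_var_name: (value, source)} so the source of each value
    stays visible, similar in spirit to what `diagnose_setup` reports.
    """
    parser = configparser.ConfigParser(
        inline_comment_prefixes=("#",), interpolation=None
    )
    parser.optionxform = str  # keep camelCase keys such as orgID
    try:
        parser.read_string(config_text or "")
    except configparser.Error:
        parser = None  # malformed file: fall back to env-only

    profile_name = env.get("COALESCE_PROFILE", "default")
    profile = {}
    if parser is not None and parser.has_section(profile_name):
        profile = dict(parser[profile_name])

    resolved = {}
    for key, env_name in KEY_TO_ENV.items():
        if env.get(env_name):
            resolved[env_name] = (env[env_name], "env")
        elif profile.get(key):
            resolved[env_name] = (profile[key], f"profile:{profile_name}")
    return resolved
```

Calling `resolve_settings(text, {"COALESCE_PROFILE": "staging", "COALESCE_ORG_ID": "42"})` would take every field from `[staging]` except the org ID, which comes from the env var.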
729
+ #### Source 2: env vars in your MCP config
730
+
731
+ <!-- ENV_METADATA_CORE_TABLE_START -->
732
+ | Variable | Description | Default |
733
+ | -------- | -------- | -------- |
734
+ | `COALESCE_ACCESS_TOKEN` | Bearer token from the Coalesce Deploy tab. Optional when `~/.coa/config` provides a `token`. | — |
735
+ | `COALESCE_PROFILE` | Selects which `~/.coa/config` profile to load. | `default` |
736
+ | `COALESCE_BASE_URL` | Region-specific base URL. | `https://app.coalescesoftware.io` (US) |
737
+ | `COALESCE_ORG_ID` | Fallback org ID for cancel-run. Also readable from `orgID` in the active `~/.coa/config` profile. | — |
738
+ | `COALESCE_REPO_PATH` | Local repo root for repo-backed tools and pipeline planning. Also readable from `repoPath` in the active `~/.coa/config` profile. | — |
739
+ | `COALESCE_CACHE_DIR` | Base directory for the local data cache. When set, cache files are written here instead of the working directory. Also readable from `cacheDir` in the active `~/.coa/config` profile. | — |
740
+ | `COALESCE_MCP_AUTO_CACHE_MAX_BYTES` | JSON size threshold before auto-caching to disk. | `32768` |
741
+ | `COALESCE_MCP_LINEAGE_TTL_MS` | In-memory lineage cache TTL in milliseconds. | `1800000` |
742
+ | `COALESCE_MCP_MAX_REQUEST_BODY_BYTES` | Max outbound API request body size. | `524288` |
743
+ | `COALESCE_MCP_READ_ONLY` | When `true`, hides all write/mutation tools during registration. Only read, list, search, cache, analyze, review, diagnose, and plan tools are exposed. | `false` |
744
+ | `COALESCE_MCP_SKILLS_DIR` | Directory for customizable AI skill resources. When set, reads context resources from this directory and seeds defaults on first run. Users can augment or override any skill. | — |
745
+ <!-- ENV_METADATA_CORE_TABLE_END -->
746
+
747
+ #### Snowflake credentials (run tools only)
748
+
749
+ `start_run`, `retry_run`, `run_and_wait`, `retry_and_wait`, and the warehouse-touching COA tools (`coa_create`, `coa_run`) need Snowflake credentials. These normally come from `~/.coa/config`. Override any field via env var:
750
+
751
+ <!-- ENV_METADATA_SNOWFLAKE_TABLE_START -->
752
+ | Variable | Required | Description |
753
+ | -------- | -------- | -------- |
754
+ | `SNOWFLAKE_ACCOUNT` | Yes | Snowflake account identifier (e.g., `abc12345.us-east-1`). Required by the local `coa` CLI and `coa doctor`; not used by the MCP's REST run path. |
755
+ | `SNOWFLAKE_USERNAME` | Yes | Snowflake account username |
756
+ | `SNOWFLAKE_KEY_PAIR_KEY` | No | Path to PEM-encoded private key (required if `SNOWFLAKE_PAT` is not set) |
757
+ | `SNOWFLAKE_PAT` | No | Snowflake Programmatic Access Token (alternative to key pair) |
758
+ | `SNOWFLAKE_KEY_PAIR_PASS` | No | Passphrase for encrypted keys |
759
+ | `SNOWFLAKE_WAREHOUSE` | Yes | Snowflake compute warehouse |
760
+ | `SNOWFLAKE_ROLE` | Yes | Snowflake user role |
761
+ <!-- ENV_METADATA_SNOWFLAKE_TABLE_END -->
762
+
763
+ "Required" means the value must come from either the env var or the matching `~/.coa/config` field. **`SNOWFLAKE_PAT` is env-only** - COA's config uses `snowflakePassword` for Basic auth (a different concept), which this server deliberately doesn't read.
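The "env OR config" rule and the PAT exception can be sketched as a single check. This is an illustration under stated assumptions, not the server's validation logic: `missingSnowflakeSettings` is a hypothetical function, and for simplicity the profile values are assumed to be already keyed by their env var names rather than by their `~/.coa/config` field names.

```typescript
// Hypothetical sketch; the real server's validation may differ.
type SnowflakeSettings = Record<string, string | undefined>;

// Variables marked "Yes" in the Required column above.
const REQUIRED_SNOWFLAKE_VARS = [
  "SNOWFLAKE_ACCOUNT",
  "SNOWFLAKE_USERNAME",
  "SNOWFLAKE_WAREHOUSE",
  "SNOWFLAKE_ROLE",
];

// Returns the names that neither the environment nor the profile supplies.
// `profile` is assumed to hold config values re-keyed by env var name.
function missingSnowflakeSettings(
  env: SnowflakeSettings,
  profile: SnowflakeSettings,
): string[] {
  const pick = (name: string): string | undefined => env[name] || profile[name];
  const missing = REQUIRED_SNOWFLAKE_VARS.filter((name) => !pick(name));

  // SNOWFLAKE_PAT is env-only, so the profile is never consulted for it.
  const hasPat = Boolean(env["SNOWFLAKE_PAT"]);
  const hasKeyPair = Boolean(pick("SNOWFLAKE_KEY_PAIR_KEY"));
  if (!hasPat && !hasKeyPair) {
    missing.push("SNOWFLAKE_KEY_PAIR_KEY or SNOWFLAKE_PAT");
  }
  return missing;
}
```

An empty result means every required value and at least one auth method is available from some source.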
764
+
765
+ #### Field-level overrides
766
+
767
+ <details>
768
+ <summary><b>Pin a profile but override one field without editing the config file</b></summary>
769
+
770
+ ```json
771
+ {
772
+   "coalesce-transform": {
773
+     "command": "npx",
774
+     "args": ["coalesce-transform-mcp"],
775
+     "env": {
776
+       "COALESCE_PROFILE": "staging",
777
+       "SNOWFLAKE_ROLE": "TRANSFORMER_ADMIN"
778
+     }
779
+   }
780
+ }
781
+ ```
782
+
783
+ This reads as: "use the `[staging]` profile, but override its `snowflakeRole` with `TRANSFORMER_ADMIN`."
784
+
785
+ </details>
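The override above amounts to "load the named profile, then let env vars win field by field". A minimal sketch of that merge follows. The env-to-field pairs are the ones this README names (`token`, `orgID`, `repoPath`, `cacheDir`, `snowflakeRole`); `resolveSettings` itself, the in-memory profile shape, and the exact precedence rules are assumptions for illustration only.

```typescript
// Hypothetical sketch of profile + env merging; not the package's code.
type ProfileFields = Record<string, string>;
type ProfileMap = Record<string, ProfileFields>;
type EnvOverrides = Record<string, string | undefined>;

// Env var -> config field pairs mentioned in this README.
const ENV_TO_FIELD: Record<string, string> = {
  COALESCE_ACCESS_TOKEN: "token",
  COALESCE_ORG_ID: "orgID",
  COALESCE_REPO_PATH: "repoPath",
  COALESCE_CACHE_DIR: "cacheDir",
  SNOWFLAKE_ROLE: "snowflakeRole",
};

// Start from the selected profile, then apply any env var that is set.
function resolveSettings(profiles: ProfileMap, env: EnvOverrides): ProfileFields {
  const profileName = env["COALESCE_PROFILE"] || "default";
  const resolved: ProfileFields = { ...(profiles[profileName] ?? {}) };
  for (const [envName, field] of Object.entries(ENV_TO_FIELD)) {
    const value = env[envName];
    if (value) {
      resolved[field] = value; // env wins over the profile field
    }
  }
  return resolved;
}
```

With a `staging` profile whose `snowflakeRole` is some other role, passing `COALESCE_PROFILE=staging` and `SNOWFLAKE_ROLE=TRANSFORMER_ADMIN` yields the profile's other fields unchanged and `snowflakeRole` set to `TRANSFORMER_ADMIN`.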
786
+
787
+ ### Multiple environments
788
+
789
+ <details>
790
+ <summary><b>Register dev / staging / prod as separate namespaced servers</b></summary>
791
+
792
+ If you work across several Coalesce environments (dev/staging/prod, or multiple orgs), register the package once per profile under distinct server names:
793
+
794
+ ```json
795
+ {
796
+   "mcpServers": {
797
+     "coalesce-prod": {
798
+       "command": "npx",
799
+       "args": ["coalesce-transform-mcp"],
800
+       "env": {
801
+         "COALESCE_PROFILE": "prod",
802
+         "COALESCE_MCP_READ_ONLY": "true"
803
+       }
804
+     },
805
+     "coalesce-dev": {
806
+       "command": "npx",
807
+       "args": ["coalesce-transform-mcp"],
808
+       "env": { "COALESCE_PROFILE": "dev" }
809
+     }
810
+   }
811
+ }
812
+ ```
813
+
814
+ Why this pattern:
815
+
816
+ - **Namespaced tools.** The client surfaces `coalesce-prod__*` vs `coalesce-dev__*`, so an agent can't accidentally mutate the wrong environment.
817
+ - **Per-environment safety.** Pair prod with `COALESCE_MCP_READ_ONLY=true` to hide every write tool on that server while leaving dev fully writable.
818
+ - **No per-call profile juggling.** Each server is pinned at startup.
819
+
820
+ Skip this pattern if you only use one environment - a single registration is simpler. For 2–3 environments it's worth the extra config; beyond that, each server is a separate Node process, so consider whether you actually need them all loaded at once.
821
+
822
+ </details>
823
+
824
+ ### Safety model
825
+
826
+ Three layers prevent destructive surprises. See [docs/safety-model.md](docs/safety-model.md) for the full breakdown of these layers, plus COA preflight validation.
827
+
828
+ - **Tool annotations** - every tool carries MCP `readOnlyHint` / `destructiveHint` / `idempotentHint`. The ⚠️ marker in [Tools](#tools) marks `destructiveHint: true` tools.
829
+ - **`COALESCE_MCP_READ_ONLY=true`** hides all write/mutation tools at server startup. Use it for audits, agent sandboxes, or pairing with a prod profile.
830
+ - **Explicit confirmation** on destructive ops - `delete_*`, `propagate_column_change`, `cancel_run`, `clear_data_cache`, `coa_create`, `coa_run`, `coa_deploy`, `coa_refresh` all require `confirmed: true`.
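The read-only filter and the confirmation gate described above compose naturally: one runs at registration time, the other at call time. The sketch below shows one way that could look. It is not the server's implementation: `GatedTool`, its `mutates` and `requiresConfirmation` fields, `visibleTools`, and `assertConfirmed` are all hypothetical names, and how the real server classifies each tool is not specified here.

```typescript
// Hypothetical shapes and helpers; the real server may classify tools differently.
interface GatedTool {
  name: string;
  mutates: boolean;              // assumed flag: tool writes or mutates state
  requiresConfirmation: boolean; // assumed flag: tool needs confirmed: true
}

// Registration-time gate: in read-only mode, mutating tools are never exposed.
function visibleTools(tools: GatedTool[], readOnly: boolean): GatedTool[] {
  return readOnly ? tools.filter((tool) => !tool.mutates) : tools;
}

// Call-time gate: gated tools refuse to run without an explicit confirmation.
function assertConfirmed(tool: GatedTool, args: { confirmed?: boolean }): void {
  if (tool.requiresConfirmation && args.confirmed !== true) {
    throw new Error(`${tool.name} requires confirmed: true`);
  }
}
```

Keeping the two gates separate means a tool hidden by read-only mode never reaches the confirmation check, while a writable server still blocks an unconfirmed destructive call.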
831
+
832
+ ### More configuration
833
+
834
+ - [Prerelease channel](docs/prerelease.md) - point `npx` at `@alpha` for preview builds.
835
+ - [Diagnosing setup](docs/diagnosing-setup.md) - the `diagnose_setup` probe and `/coalesce-setup` MCP prompt.
836
+
837
+ ---
838
+
885
839
  ## Design notes
886
840
 
887
841
  - **SQL override is disallowed.** Nodes are built via YAML/config (columns, transforms, join conditions), not raw SQL. Template generation strips `overrideSQLToggle`, and write helpers reject `overrideSQL` fields.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "coalesce-transform-mcp",
3
- "version": "0.5.0",
3
+ "version": "0.5.1-alpha.1",
4
4
  "mcpName": "io.github.jessemarshall/coalesce-transform",
5
5
  "description": "MCP server for the Coalesce Transform API; run tools support Snowflake Key Pair and PAT auth",
6
6
  "main": "dist/index.js",