pmx-canvas 0.1.35 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +461 -0
- package/Readme.md +14 -2
- package/dist/canvas/index.js +82 -41
- package/dist/json-render/index.js +89 -334
- package/dist/types/client/nodes/ExtAppFrame.d.ts +2 -0
- package/dist/types/mcp/canvas-access.d.ts +12 -159
- package/dist/types/server/ax-context.d.ts +1 -1
- package/dist/types/server/ax-state-manager.d.ts +256 -0
- package/dist/types/server/ax-state.d.ts +29 -1
- package/dist/types/server/ax-wait.d.ts +23 -0
- package/dist/types/server/canvas-operations.d.ts +1 -12
- package/dist/types/server/canvas-state.d.ts +46 -14
- package/dist/types/server/html-surface.d.ts +7 -0
- package/dist/types/server/index.d.ts +66 -26
- package/dist/types/server/operations/composites.d.ts +121 -0
- package/dist/types/server/operations/http.d.ts +7 -0
- package/dist/types/server/operations/index.d.ts +8 -0
- package/dist/types/server/operations/invoker.d.ts +13 -0
- package/dist/types/server/operations/mcp.d.ts +15 -0
- package/dist/types/server/operations/ops/annotation.d.ts +2 -0
- package/dist/types/server/operations/ops/app.d.ts +33 -0
- package/dist/types/server/operations/ops/ax-await.d.ts +2 -0
- package/dist/types/server/operations/ops/ax-shared.d.ts +31 -0
- package/dist/types/server/operations/ops/ax-state.d.ts +2 -0
- package/dist/types/server/operations/ops/ax-timeline.d.ts +2 -0
- package/dist/types/server/operations/ops/ax-work.d.ts +2 -0
- package/dist/types/server/operations/ops/batch.d.ts +19 -0
- package/dist/types/server/operations/ops/edges.d.ts +2 -0
- package/dist/types/server/operations/ops/groups.d.ts +2 -0
- package/dist/types/server/operations/ops/json-render.d.ts +31 -0
- package/dist/types/server/operations/ops/nodes.d.ts +62 -0
- package/dist/types/server/operations/ops/query.d.ts +2 -0
- package/dist/types/server/operations/ops/snapshots.d.ts +2 -0
- package/dist/types/server/operations/ops/validate.d.ts +2 -0
- package/dist/types/server/operations/ops/viewport.d.ts +2 -0
- package/dist/types/server/operations/ops/webview.d.ts +2 -0
- package/dist/types/server/operations/registry.d.ts +15 -0
- package/dist/types/server/operations/types.d.ts +116 -0
- package/dist/types/server/operations/webview-runner.d.ts +69 -0
- package/docs/RELEASE.md +5 -0
- package/docs/adr-001-bun-only-runtime.md +46 -0
- package/docs/api-stability.md +57 -0
- package/docs/ax-host-adapter-contract.md +65 -0
- package/docs/ax-state-contract.md +72 -0
- package/docs/http-api.md +34 -2
- package/docs/mcp.md +64 -11
- package/docs/plans/plan-005-operation-registry.md +84 -0
- package/docs/plans/plan-006-mcp-tool-consolidation.md +109 -0
- package/docs/plans/plan-007-ax-domain.md +99 -0
- package/docs/plans/plan-008-registry-finish.md +91 -0
- package/docs/screenshot.png +0 -0
- package/docs/tech-debt-assessment-2026-06.md +90 -0
- package/package.json +3 -3
- package/skills/pmx-canvas/SKILL.md +233 -185
- package/skills/pmx-canvas/evals/evals.json +3 -3
- package/skills/pmx-canvas/references/codex-app-adapter.md +24 -11
- package/skills/pmx-canvas/references/github-copilot-app-adapter.md +31 -1
- package/src/cli/agent.ts +52 -31
- package/src/client/nodes/ExtAppFrame.tsx +73 -5
- package/src/client/nodes/HtmlNode.tsx +12 -3
- package/src/client/nodes/McpAppNode.tsx +12 -3
- package/src/json-render/renderer/index.tsx +3 -0
- package/src/mcp/canvas-access.ts +43 -774
- package/src/mcp/server.ts +190 -2001
- package/src/server/ax-context.ts +7 -1
- package/src/server/ax-state-manager.ts +808 -0
- package/src/server/ax-state.ts +89 -2
- package/src/server/ax-wait.ts +56 -0
- package/src/server/canvas-operations.ts +2 -328
- package/src/server/canvas-schema.ts +2 -2
- package/src/server/canvas-state.ts +140 -382
- package/src/server/html-surface.ts +49 -11
- package/src/server/index.ts +136 -192
- package/src/server/operations/composites.ts +355 -0
- package/src/server/operations/http.ts +103 -0
- package/src/server/operations/index.ts +65 -0
- package/src/server/operations/invoker.ts +87 -0
- package/src/server/operations/mcp.ts +221 -0
- package/src/server/operations/ops/annotation.ts +60 -0
- package/src/server/operations/ops/app.ts +447 -0
- package/src/server/operations/ops/ax-await.ts +216 -0
- package/src/server/operations/ops/ax-shared.ts +38 -0
- package/src/server/operations/ops/ax-state.ts +249 -0
- package/src/server/operations/ops/ax-timeline.ts +381 -0
- package/src/server/operations/ops/ax-work.ts +635 -0
- package/src/server/operations/ops/batch.ts +365 -0
- package/src/server/operations/ops/edges.ts +166 -0
- package/src/server/operations/ops/groups.ts +176 -0
- package/src/server/operations/ops/json-render.ts +691 -0
- package/src/server/operations/ops/nodes.ts +1047 -0
- package/src/server/operations/ops/query.ts +281 -0
- package/src/server/operations/ops/snapshots.ts +366 -0
- package/src/server/operations/ops/validate.ts +37 -0
- package/src/server/operations/ops/viewport.ts +219 -0
- package/src/server/operations/ops/webview.ts +339 -0
- package/src/server/operations/registry.ts +79 -0
- package/src/server/operations/types.ts +150 -0
- package/src/server/operations/webview-runner.ts +77 -0
- package/src/server/server.ts +253 -2170
- package/src/server/web-artifacts.ts +6 -2
package/docs/mcp.md CHANGED
@@ -1,10 +1,18 @@
 # MCP reference
 
-PMX Canvas ships an MCP stdio server with **
+PMX Canvas ships an MCP stdio server with **83 tools** + **14 core resources**,
 plus per-skill resources at `canvas://skills/<name>`. The server emits
 `notifications/resources/updated` when canvas state changes — humans pin
 nodes in the browser, agents are notified immediately.
 
+> **Consolidation in progress (plan-006/008).** The 83 tools are 14 action-discriminated
+> **composites** (recommended — see below) plus 69 legacy single-purpose tools.
+> The composites fold the legacy tools behind an `action` (and, for `canvas_ax_gate`,
+> a `kind`) enum; each action dispatches to the same operation, so behavior is
+> identical. Folded legacy tools are marked `Deprecated:` in their descriptions and
+> are removed in v0.3 per [`api-stability.md`](api-stability.md). **Prefer the
+> composites.**
+
 ## Connect
 
 Add to your agent's MCP config:
@@ -22,18 +30,59 @@ Add to your agent's MCP config:
 
 The canvas auto-starts on first tool call.
 
-##
+## Composite tools (recommended)
+
+Action-discriminated tools that consolidate the single-purpose tools. Each maps
+its `action` to the same operation the legacy tool used, so results are identical.
+
+| Composite | `action` values | Replaces |
+|-----------|-----------------|----------|
+| `canvas_node` | `add` · `get` · `update` · `remove` | `canvas_add_node`, `canvas_get_node`, `canvas_update_node`, `canvas_remove_node`, `canvas_add_html_node` (`add` + `type:"html"`), `canvas_add_html_primitive` (`add` + `type:"html"`, `primitive:"<kind>"`), `canvas_refresh_webpage_node` (`update` + `refresh:true`) |
+| `canvas_render` | `describe-schema` · `validate` · `add-json-render` · `stream-json-render` · `add-graph` | `canvas_describe_schema`, `canvas_validate_spec`, `canvas_add_json_render_node`, `canvas_stream_json_render_node`, `canvas_add_graph_node` |
+| `canvas_edge` | `add` · `remove` | `canvas_add_edge`, `canvas_remove_edge` |
+| `canvas_group` | `create` · `add` · `ungroup` | `canvas_create_group`, `canvas_group_nodes`, `canvas_ungroup` |
+| `canvas_history` | `undo` · `redo` | `canvas_undo`, `canvas_redo` |
+| `canvas_view` | `arrange` · `focus` · `fit` · `clear` | `canvas_arrange`, `canvas_focus_node`, `canvas_fit_view`, `canvas_clear` |
+| `canvas_query` | `search` · `layout` | `canvas_search`, `canvas_get_layout` |
+| `canvas_webview` | `status` · `start` · `stop` · `resize` · `evaluate` | `canvas_webview_status`, `canvas_webview_start`, `canvas_webview_stop`, `canvas_resize`, `canvas_evaluate` |
+| `canvas_app` | `open-mcp-app` · `diagram` · `build-artifact` | `canvas_open_mcp_app`, `canvas_add_diagram`, `canvas_build_web_artifact` |
+| `canvas_ax_state` | `get` · `set-focus` · `set-policy` · `report-capability` | `canvas_get_ax`, `canvas_set_ax_focus`, `canvas_set_ax_policy`, `canvas_report_host_capability` |
+| `canvas_ax_work` | `add` · `update` · `annotate` | `canvas_add_work_item`, `canvas_update_work_item`, `canvas_add_review_annotation` |
+| `canvas_ax_gate` | `request` · `resolve` · `await` × kind `approval` \| `elicitation` \| `mode` | `canvas_request_approval`, `canvas_resolve_approval`, `canvas_await_approval`, `canvas_request_elicitation`, `canvas_respond_elicitation`, `canvas_await_elicitation`, `canvas_request_mode`, `canvas_resolve_mode`, `canvas_await_mode` (9 → 1) |
+| `canvas_ax_timeline` | `read` · `record-event` · `add-evidence` · `send-steering` | `canvas_get_ax_timeline`, `canvas_record_ax_event`, `canvas_add_evidence`, `canvas_send_steering` |
+| `canvas_ax_delivery` | `claim` · `mark` | `canvas_claim_ax_delivery`, `canvas_mark_ax_delivery` |
+
+Field names match the underlying operation (e.g. `canvas_view { action: "focus", id }`,
+`canvas_group { action: "create", childIds }`). `canvas_ax_gate` has two discriminators:
+`{ kind, action }` — e.g. `{ kind: "approval", action: "request", title }`,
+`{ kind: "elicitation", action: "resolve", id, response }`,
+`{ kind: "mode", action: "await", id, timeoutMs }`. (The approval machine-readable
+action identifier is passed as `approvalAction`, since `action` is the lifecycle
+discriminator.) `canvas_app` folds the external / built-content tools:
+`{ action: "open-mcp-app", transport, toolName }`, `{ action: "diagram", elements }`
+(the hosted Excalidraw preset), and `{ action: "build-artifact", title, appTsx }`
+(build-artifact can run for minutes on a cold workspace — set a long client
+timeout). `canvas_ax_interaction`, `canvas_ingest_activity`, and
+`canvas_invoke_command` stay standalone (trust-boundary / firehose / execution-intent
+tools). `canvas_screenshot` also stays standalone — it returns a binary image payload
+the composite/registry JSON wire shape does not model. (Wave 5 folded
+`canvas_refresh_webpage_node` → `canvas_node { action: "update", refresh: true }` after
+fixing `node.update`'s `formatResult` to surface a FAILED refresh as `isError` +
+`{ ok:false, error }` instead of masking it as a false `{ ok:true }`.) Snapshots
+fold as their registry slice lands.
+
+## Tools (legacy single-purpose)
 
 | Tool | Description |
 |------|-------------|
 | `canvas_add_node` | Add a node (markdown, status, context, file, webpage, html, etc.) |
-| `canvas_add_html_node` | Create an `html` node from a self-contained HTML/JS document (sandboxed iframe) |
-| `canvas_add_html_primitive` | Create a reusable generated HTML communication primitive as a sandboxed `html` node |
+| `canvas_add_html_node` | **Deprecated** → `canvas_node { action: "add", type: "html" }`. Create an `html` node from a self-contained HTML/JS document (sandboxed iframe) |
+| `canvas_add_html_primitive` | **Deprecated** → `canvas_node { action: "add", type: "html", primitive: "<kind>" }`. Create a reusable generated HTML communication primitive as a sandboxed `html` node |
 | `canvas_add_diagram` | Hand-drawn diagram via the hosted Excalidraw MCP App (preset alias for `canvas_open_mcp_app`) |
 | `canvas_open_mcp_app` | Open any [MCP Apps](https://modelcontextprotocol.io/docs/extensions/apps) server's `ui://` resource as an iframe node |
 | `canvas_describe_schema` | Describe the running server's create schemas, examples, json-render catalog, and HTML primitive catalog |
 | `canvas_validate_spec` | Validate a json-render spec, graph payload, or HTML primitive payload without creating a node |
-| `canvas_refresh_webpage_node` | Re-fetch and update a webpage node from its stored URL |
+| `canvas_refresh_webpage_node` | **Deprecated** → `canvas_node { action: "update", refresh: true }`. Re-fetch and update a webpage node from its stored URL |
 | `canvas_add_json_render_node` | Create a native json-render node from a validated spec |
 | `canvas_stream_json_render_node` | Progressively build a json-render node from SpecStream JSON-Patch ops (live/streaming panels) |
 | `canvas_add_graph_node` | Create a native graph node (line, bar, pie, area, scatter, radar, stacked-bar, composed, sparkline, dot-plot, bullet, slopegraph) |
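Reviewer sketch (not part of the diff): the "Replaces" column in the composite table above reads as a rewrite rule from each deprecated tool call to a composite call. A minimal illustration of that rule for a handful of rows follows. The helper `toCompositeCall` and the `ToolCall` type are hypothetical names invented here; only the tool names and the `action` / `kind` / `type` / `refresh` values come from the table.

```typescript
// Hypothetical helper, not shipped by pmx-canvas: rewrites a few deprecated
// tool calls into the composite form listed in the "Replaces" column.
type Args = Record<string, unknown>;
type ToolCall = { tool: string; args: Args };

const LEGACY_TO_COMPOSITE: Partial<Record<string, (args: Args) => ToolCall>> = {
  canvas_add_html_node: (a) => ({ tool: "canvas_node", args: { ...a, action: "add", type: "html" } }),
  canvas_refresh_webpage_node: (a) => ({ tool: "canvas_node", args: { ...a, action: "update", refresh: true } }),
  canvas_focus_node: (a) => ({ tool: "canvas_view", args: { ...a, action: "focus" } }),
  canvas_await_approval: (a) => ({ tool: "canvas_ax_gate", args: { ...a, kind: "approval", action: "await" } }),
  canvas_respond_elicitation: (a) => ({ tool: "canvas_ax_gate", args: { ...a, kind: "elicitation", action: "resolve" } }),
};

function toCompositeCall(call: ToolCall): ToolCall {
  const rewrite = LEGACY_TO_COMPOSITE[call.tool];
  // Tools without a rewrite rule (e.g. the standalone canvas_screenshot) pass through unchanged.
  return rewrite ? rewrite(call.args) : call;
}

const rewritten = toCompositeCall({ tool: "canvas_refresh_webpage_node", args: { id: "n1" } });
console.log(JSON.stringify(rewritten));
```

`canvas_request_approval` is deliberately left out of the sketch: per the paragraph above, its machine-readable action identifier moves to `approvalAction`, so a plain spread of the legacy arguments would not be enough.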
@@ -72,6 +121,10 @@ searchable and readable in pinned/spatial context.
 | `canvas_respond_elicitation` | Respond to / resolve a pending elicitation |
 | `canvas_request_mode` | Request a workflow `mode-request` transition (plan/execute/autonomous) |
 | `canvas_resolve_mode` | Resolve a pending mode request |
+| `canvas_ingest_activity` | Ingest a harness-forwarded agent activity (tool/session event); the board auto-reacts with kind-driven, overridable defaults (failure → work item + review + evidence; `tool-result`+success → evidence). Makes AX bidirectional |
+| `canvas_await_approval` | Block until an approval gate resolves (human approves/rejects in the browser) or the timeout elapses (`timeoutMs` 0 = immediate read). Gates that actually gate |
+| `canvas_await_elicitation` | Block until an elicitation is answered or the timeout elapses |
+| `canvas_await_mode` | Block until a mode request resolves or the timeout elapses |
 | `canvas_invoke_command` | Invoke a registry command (`pmx.plan`, `pmx.execute`, `pmx.promote-context`, `pmx.summarize`, `pmx.review`); records a `command` agent-event, unknown names rejected |
 | `canvas_set_ax_policy` | Patch the canvas-bound tool/prompt policy (`tools.allowed\|excluded\|approvalRequired`, `prompt.systemAppend\|mode`); patches merge and are normalized |
 | `canvas_pin_nodes` | Pin nodes to include in agent context |
@@ -168,10 +221,10 @@ MCP for tools/resources and the in-app Browser for the live `/workbench` view.
 No separate PMX renderer is needed. Prefer MCP over the CLI for Codex-native
 operation; keep the CLI for fallback scripts and manual debugging.
 
-Use `canvas://ax-context` or `
-When Codex-hosted steering sets the current attention
-`
-focus came from. The full workflow lives in
+Use `canvas://ax-context` or `canvas_ax_state { action: "get" }` to read
+pinned/focused context. When Codex-hosted steering sets the current attention
+target, call `canvas_ax_state { action: "set-focus", source: "codex" }` so the
+AX state records where the focus came from. The full workflow lives in
 `skills/pmx-canvas/references/codex-app-adapter.md`.
 
 ## Annotation Visibility
@@ -197,8 +250,8 @@ in doubt:
 
 - `json-render` → `canvas_add_json_render_node`
 - `graph` → `canvas_add_graph_node`
-- `html-primitive` → `canvas_add_html_primitive`
-- `html` → `canvas_add_html_node`
+- `html-primitive` → `canvas_node { action: "add", type: "html", primitive: "<kind>" }` (or the deprecated `canvas_add_html_primitive`)
+- `html` → `canvas_node { action: "add", type: "html" }` (or the deprecated `canvas_add_html_node`)
 - `web-artifact` → `canvas_build_web_artifact`
 - `mcp-app` → `canvas_open_mcp_app`
 - `group` → `canvas_create_group`
@@ -0,0 +1,84 @@
+# Plan 005 — Operation Registry: one definition site per canvas operation
+
+**Status:** In progress (branch `refactor/v0.2-operation-registry`)
+**Date:** 2026-06-12
+**Motivation:** docs/tech-debt-assessment-2026-06.md item 1. Every operation is hand-written 5–6 times (CanvasStateManager/canvas-operations, PmxCanvas SDK, HTTP handler, MCP tool, CLI command, plus Local/Remote CanvasAccess in src/mcp/canvas-access.ts). Documented bug classes caused by this: fix applied to one of two mutation paths (LRN-20260606-006), enum guard not updated for new member (LRN-20260607-005), shared readJson hardening killing the batch bare-array shape (LRN-20260608-002).
+
+## Confirmed live drift this refactor erases
+
+- `PmxCanvas.addNode` uses `fileMode: 'path'` while `handleCanvasAddNode` uses `fileMode: 'auto'`.
+- Node-update merge logic exists in three diverging versions: `handleCanvasUpdateNode` has webpage `titleSource`, html top-level `html`/`axCapabilities`, group children, `refresh:true` delegation; `PmxCanvas.updateNode` and batch `node.update` have none of these.
+- `canvas_remove_node` over local access silently succeeds on a missing id while the HTTP path 404s.
+- The per-type default-size ladder is copy-pasted in `handleCanvasAddNode`, `executeCanvasBatch`, and `PmxCanvas.addNode`.
+
+## Registry core design
+
+New directory `src/server/operations/`:
+
+```
+types.ts        Operation<I,O>, OperationContext, OperationError, defineOperation()
+registry.ts     register/get/list, executeOperation(), setOperationEventEmitter()
+http.ts         route table + dispatchOperationRoute(req, url): Promise<Response | null>
+invoker.ts      OperationInvoker: LocalOperationInvoker | HttpOperationInvoker
+mcp.ts          registerOperationTools(server, getInvoker)
+ops/nodes.ts    slice 1: node.add / node.get / node.update / node.remove / layout.get
+index.ts        imports all ops/* files and registers them (single registration site)
+```
+
+Key contracts:
+
+- `Operation<I,O>` fields: `name` ('node.add', doubles as batch op name), `mutates` (true → registry emits `canvas-layout-update` after success), `input` (a ZodObject; MUST be loose/passthrough — legacy ignores unknown keys, strict parsing would be an invisible API break), `http { method, path (EXACT legacy path, ':param' segments), readInput?, serialize? }`, `mcp { toolName (frozen legacy name), description, extraShape? (MCP-only presentation flags like full/verbose), formatResult? } | null`, `handler(input, ctx)` — the single implementation, mutating via canvasState/canvas-operations so mutation history records automatically.
+- `OperationError(message, status 400|404|409)` maps to HTTP status + `{ ok:false, error }` and MCP `isError: true`.
+- `executeOperation(name, rawInput)`: validate → run → emit. The ONE execution path. zod failures → OperationError(400).
+- SSE: `setOperationEventEmitter` injected from server.ts at module top level (same pattern as `setCanvasLayoutUpdateEmitter`). Handlers never emit `canvas-layout-update` themselves; `mutates` is the single source. Extra events (focus, viewport) go through `ctx.emit`.
+- `http.ts` route matching is segment-count exact so `/node/:id` never swallows `/node/:id/refresh`. Dispatch inserted in the server.ts fetch handler immediately before the first legacy `/api/canvas/*` check; registry routes shadow legacy ones and the legacy block is deleted in the same commit that registers the op.
+- The shared body reader preserves array bodies (per-op `readInput` decides; the shared reader never coerces) — structural fix for the batch bare-array bug class.
+- `invoker.ts`: `LocalOperationInvoker` wraps `executeOperation`; `HttpOperationInvoker(baseUrl)` builds the request from `op.http.path` template (fills `:id` from input, GET flags to query, rest as JSON body). MCP uses local or HTTP invoker depending on CanvasAccess mode; CLI uses `HttpOperationInvoker(getBaseUrl())`; SDK wraps the handler core functions directly to stay synchronous.
+- `mcp.ts`: iterates the registry, passes `{ ...op.input.shape, ...extraShape }` to `server.tool()` (zod v4 shapes pass through unchanged), invokes via the invoker, formats with `formatResult` (where compactNodePayload/createdNodePayload live).
+
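Reviewer sketch (not part of the diff): the "validate → run → emit" contract in the key contracts above can be illustrated with a dependency-free miniature. This is an assumption-heavy sketch, not the package's code. The plan validates with a zod object, while a plain `parse` function stands in here; the real `executeOperation` may be asynchronous and takes an `OperationContext`, both omitted; and the 404 for an unknown operation name is a guess.

```typescript
// Illustrative miniature of the registry contract. Names mirror the plan,
// but every signature here is an assumption.
class OperationError extends Error {
  readonly status: 400 | 404 | 409;
  constructor(message: string, status: 400 | 404 | 409) {
    super(message);
    this.status = status;
  }
}

interface Operation<I, O> {
  name: string;                 // e.g. 'node.remove'
  mutates: boolean;             // true → the registry emits after success
  parse: (raw: unknown) => I;   // stand-in for the zod input schema
  handler: (input: I) => O;     // the single implementation
}

const registry = new Map<string, Operation<any, any>>();
let emitLayoutUpdate: () => void = () => {};

function register<I, O>(op: Operation<I, O>): void {
  registry.set(op.name, op);
}

function setOperationEventEmitter(emit: () => void): void {
  emitLayoutUpdate = emit;
}

// The one execution path: validate → run → emit.
function executeOperation(name: string, rawInput: unknown): unknown {
  const op = registry.get(name);
  if (!op) throw new OperationError(`unknown operation: ${name}`, 404); // assumed status
  const input = op.parse(rawInput);       // validation failures throw OperationError(400)
  const result = op.handler(input);       // a throw here skips the emit below
  if (op.mutates) emitLayoutUpdate();     // handlers never emit this themselves
  return result;
}

// Usage: a toy node.remove over an in-memory id set.
const nodes = new Set<string>(["n1"]);
register<{ id: string }, { removed: string }>({
  name: "node.remove",
  mutates: true,
  parse: (raw) => {
    const id = (raw as { id?: unknown } | null)?.id;
    if (typeof id !== "string") throw new OperationError("id must be a string", 400);
    return { id };
  },
  handler: ({ id }) => {
    if (!nodes.delete(id)) throw new OperationError(`node not found: ${id}`, 404);
    return { removed: id };
  },
});
```

Because the emit sits after the handler call, a failed operation produces no layout event, which is the property the plan's parity test counts SSE frames to protect.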
+## Slice 1 — node CRUD + layout (this slice)
+
+Ops: `node.add`, `node.get`, `node.update`, `node.remove`, `layout.get`.
+
+- Export `NODE_TYPES` tuple once; derive the zod enum and replace the `VALID_NODE_TYPES` Set — structural fix for the enum-guard near-miss.
+- `node.add` handler = union of `handleCanvasAddNode` + `createCanvasWebpageNode` + `createCanvasHtmlPrimitiveNode`, calling existing `addCanvasNode`/`createCanvasGroup`/`buildHtmlPrimitive`/`refreshCanvasWebpageNode` etc. Default-size ladder becomes one exported `defaultNodeSize(type)`. `http.serialize` = existing `buildNodeResponse` shape, byte-identical wire format. The json-render/graph/web-artifact redirect errors keep their exact current messages.
+- `node.update` = shared `buildNodePatch(existing, input)` carrying the HTTP superset semantics (titleSource, html top-level fields, group children, refresh delegation). SDK `updateNode` delegates to it — drift disappears.
+- `node.remove` = `closeNodeAppSession` + `removeCanvasNode`; missing id → OperationError 404 (unifies the silent local-remove asymmetry — see parity test note below).
+- `node.get`/`layout.get` keep `withContextPinReadState`/serialization in `http.serialize`; MCP keeps compact/full payload behavior via `formatResult`.
+- SDK keeps `fileMode: 'path'` as an explicit visible parameter instead of forked code.
+
+Legacy deleted in this slice: `handleCanvasAddNode`, `createCanvasWebpageNode`, `createCanvasHtmlPrimitiveNode`, `handleCanvasUpdateNode`, inline state/node GET/PATCH/DELETE routes, `VALID_NODE_TYPES`, five MCP `server.tool` blocks, orphaned CanvasAccess methods, the SDK's forked merge logic.
+
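Reviewer sketch (not part of the diff): the first two slice-1 bullets above describe deriving the type guard from one exported tuple and collapsing the copy-pasted size ladder into `defaultNodeSize(type)`. A minimal illustration of why that removes the "enum guard not updated" bug class follows. The member list is a partial, illustrative subset of node kinds named elsewhere in these docs, and every size value is made up.

```typescript
// Assumption: this member list is illustrative and incomplete; the real tuple
// lives in the package source.
const NODE_TYPES = ["markdown", "status", "context", "file", "webpage", "html", "group"] as const;
type NodeType = (typeof NODE_TYPES)[number];

// The guard is derived from the tuple, so a new member is accepted automatically.
function isNodeType(value: unknown): value is NodeType {
  return typeof value === "string" && NODE_TYPES.some((t) => t === value);
}

// One size table instead of three copies. Record<NodeType, ...> makes a missing
// entry a compile error when the tuple grows. Sizes are invented for the sketch.
const DEFAULT_SIZES: Record<NodeType, { width: number; height: number }> = {
  markdown: { width: 400, height: 300 },
  status: { width: 280, height: 120 },
  context: { width: 360, height: 240 },
  file: { width: 480, height: 360 },
  webpage: { width: 520, height: 400 },
  html: { width: 520, height: 400 },
  group: { width: 600, height: 400 },
};

function defaultNodeSize(type: NodeType): { width: number; height: number } {
  return DEFAULT_SIZES[type];
}

console.log(isNodeType("html"), defaultNodeSize("group"));
```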
+## Migration order after slice 1
+
+1. Edges (mechanical; DELETE body takes `edge_id`, schema accepts both)
+2. Arrange/viewport/focus/fit/clear (focus emits 3 extra events via ctx.emit; fit is mutates:false with manual viewport emit)
+3. Groups
+4. Pins/search/spatial-context/summary/history/undo/redo
+5. Snapshots (restore keeps its deferred emit mechanism)
+6. json-render/graph/stream (alias triangle heightPx/nodeHeight/height absorbed into one schema)
+7. AX domain (read + mutate sub-slices; long-poll waitMs in readInput; structured denial bodies preserved)
+8. Webpage refresh/diagram/mcp-app open/web-artifact/html-surface (side-channel semantics; mutates:false, own their emits; one op per commit)
+9. Batch last — meta-operation dispatching `executeOperation` per entry with layout emission suppressed + single final emit; deletes the 290-line switch in canvas-operations.ts
+
+Theme/annotations/code-graph/schema/prompt/trace endpoints are single-transport, low-duplication; they may stay legacy indefinitely.
+
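Reviewer sketch (not part of the diff): step 9 above describes batch as a meta-operation that runs each entry with layout emission suppressed and emits once at the end. One possible shape for that is sketched below. The suppression flag, `runOp`, and `executeBatch` are hypothetical names; the plan does not say how suppression is implemented, and whether a failed batch should still emit is left open there.

```typescript
// Illustrative only: `runOp` stands in for executeOperation, and the flag is a
// guess at how "suppress per-entry emits, emit once at the end" could work.
type BatchEntry = { op: string; input: unknown };

let suppressLayoutEmit = false;
const emitted: string[] = [];

function emitLayoutUpdate(): void {
  if (!suppressLayoutEmit) emitted.push("canvas-layout-update");
}

function runOp(entry: BatchEntry): { op: string; ok: boolean } {
  // A real handler would mutate canvas state; in this toy every op mutates.
  emitLayoutUpdate();
  return { op: entry.op, ok: true };
}

function executeBatch(entries: BatchEntry[]): { op: string; ok: boolean }[] {
  suppressLayoutEmit = true;
  try {
    return entries.map(runOp);
  } finally {
    suppressLayoutEmit = false;
    // Single final emit. This sketch also emits when an entry throws, on the
    // assumption that earlier entries may already have changed the layout.
    emitLayoutUpdate();
  }
}

const batchResults = executeBatch([
  { op: "node.add", input: {} },
  { op: "node.update", input: {} },
]);
console.log(batchResults.length, emitted.length);
```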
+## Verification (every slice)
+
+1. `bun run typecheck`
+2. Targeted: `PMX_CANVAS_DISABLE_BROWSER_OPEN=1 bun test tests/unit/operation-parity.test.ts tests/unit/mcp-tool-freeze.test.ts tests/unit/server-api.test.ts tests/unit/mcp-server.test.ts tests/unit/cli-node.test.ts tests/unit/canvas-operations.test.ts tests/unit/pmx-canvas-sdk.test.ts`
+3. Full unit: `bun run test`
+4. Milestones (after slices 1, 6, batch): `bun run test:web-canvas` + `bun run test:e2e-cli`
+
+Safety nets already in place (committed before any registry code): `tests/unit/operation-parity.test.ts` (cross-surface parity, SSE counts, junk-key tolerance, pinned asymmetries) and `tests/unit/mcp-tool-freeze.test.ts` (69 tool names + 14 fixed resource URIs frozen).
+
+Parity note: the parity test currently PINS the local-remove silent-success asymmetry. Slice 1 deliberately unifies it to a 404-style error on all surfaces; update that one pinned assertion in the same commit, with a CHANGELOG note.
+
+## Risks
+
+- zod strictness: schemas must be loose; parity test has junk-key cases.
+- Route shadowing: segment-exact matching + registry self-test against known still-legacy paths.
+- SSE drift: double emit / missing emit — parity test counts frames.
+- MCP-against-remote: ensure at least one test exercises each migrated tool through RemoteCanvasAccess/HttpOperationInvoker (mcp-server.test.ts daemon mode covers this).
+- Import cycles: operations/ never imports server.ts (emitter injected) or index.ts (SDK imports the cores).
+- Batch is highest-risk: last, separately committed, one-commit revert.
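Reviewer sketch (not part of the diff): the route-shadowing risk above relies on segment-count-exact matching, so that `/node/:id` never swallows `/node/:id/refresh`. A minimal matcher with that property is sketched below. `matchRoute` is a hypothetical name, the paths are the shortened forms the plan itself uses, and the real `http.ts` route table may match differently.

```typescript
// Hypothetical matcher: a template matches only when the segment counts are equal.
function matchRoute(template: string, pathname: string): Record<string, string> | null {
  const want = template.split("/").filter((s) => s.length > 0);
  const got = pathname.split("/").filter((s) => s.length > 0);
  if (want.length !== got.length) return null; // '/node/:id' vs '/node/x/refresh' stops here
  const params: Record<string, string> = {};
  for (let i = 0; i < want.length; i++) {
    const w = want[i] ?? "";
    const g = got[i] ?? "";
    if (w.charAt(0) === ":") {
      params[w.slice(1)] = g;   // ':id' captures the whole segment
    } else if (w !== g) {
      return null;              // literal segments must match exactly
    }
  }
  return params;
}

console.log(matchRoute("/node/:id", "/node/abc"));
console.log(matchRoute("/node/:id", "/node/abc/refresh"));
```

A prefix-based matcher would accept the second call; comparing lengths first is what lets a registry route and a still-legacy longer route coexist during the migration.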
@@ -0,0 +1,109 @@
+# Plan 006: MCP tool consolidation (69 tools to 21)
+
+**Status:** In progress — wave 1 landed (7 canvas composites) and the AX wave landed (5 composites: `canvas_ax_state`, `canvas_ax_work`, `canvas_ax_gate`, `canvas_ax_timeline`, `canvas_ax_delivery`). Remaining: `canvas_snapshot` (name held by the legacy save tool until v0.3), `canvas_app`/`canvas_webview` (need plan-005 item 8), and the deferred actions `refresh` / `add-primitive` / `remove-annotation` / board validation.
+**Date:** 2026-06-12
+**Depends on:** plan-005 (operation registry). Slices 1-4 are migrated; consolidation lands per-domain as the corresponding registry slices complete.
+**Motivation:** docs/tech-debt-assessment-2026-06.md item 2. Governed by docs/api-stability.md (deprecation: marked in one minor, removed in the next).
+
+## Rationale
+
+69 tools is a tax on every connected agent: each tool name + description + input schema is sent to every MCP client and consumes context window before the agent has done anything. Worse, it degrades tool selection. An agent choosing between `canvas_request_approval`, `canvas_resolve_approval`, and `canvas_await_approval` (times three gate kinds, nine tools) picks worse than one choosing `canvas_ax_gate` with a clear action enum and one good description. Agents reliably pick the right tool from ~20 well-described tools; at 69 the descriptions compete with each other.
+
+The registry makes this cheap. A consolidated tool is one `mcp` block whose `extraShape` adds an action discriminator and whose `buildInput` dispatches to the existing registered operations. No handler logic moves; the consolidation is presentation-layer only, which is exactly what the registry was built to make safe.
+
+## Current surface (69 tools, from tests/unit/mcp-tool-freeze.test.ts)
+
+Grouped by domain: node CRUD + creation variants (15), edges (2), view (4), groups (3), snapshots + diff (6), undo/redo (2), search/validate/schema (4), pins (1), batch (1), webview automation (6), AX (25).
+
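Reviewer sketch (not part of the diff): the Rationale above says a composite adds an action discriminator and a `buildInput` that dispatches to already-registered operations, with no handler logic moving. A small illustration of that presentation-layer dispatch follows. The name `buildInput` comes from the plan, but its signature, the `Composite` shape, and the use of a `Map` are assumptions; the `node.*` operation names are the ones plan-005 lists for slice 1.

```typescript
// Sketch only: the real mcp block and buildInput signature are not shown in
// this diff, so the shapes below are invented for illustration.
type ToolInput = Record<string, unknown>;

interface Composite {
  toolName: string;
  actions: Map<string, string>; // action value → registered operation name
}

const canvasNode: Composite = {
  toolName: "canvas_node",
  actions: new Map([
    ["add", "node.add"],
    ["get", "node.get"],
    ["update", "node.update"],
    ["remove", "node.remove"],
  ]),
};

// Strips the discriminator and resolves the target operation; the remaining
// fields are forwarded untouched, so the operation sees its usual input.
function buildInput(composite: Composite, args: ToolInput): { op: string; input: ToolInput } {
  const { action, ...rest } = args;
  const op = typeof action === "string" ? composite.actions.get(action) : undefined;
  if (op === undefined) {
    const allowed = Array.from(composite.actions.keys()).join(" | ");
    throw new Error(`${composite.toolName}: action must be one of ${allowed}`);
  }
  return { op, input: rest };
}

console.log(JSON.stringify(buildInput(canvasNode, { action: "remove", id: "n1" })));
```

A `Map` is used instead of a plain object lookup so that an action such as `"toString"` cannot resolve to an inherited property.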
18
|
+
## Proposed surface (21 tools)
|
|
19
|
+
|
|
20
|
+
### Composites
|
|
21
|
+
|
|
22
|
+
**1. `canvas_node`**: folds `canvas_add_node`, `canvas_add_html_node`, `canvas_get_node`, `canvas_update_node`, `canvas_remove_node`, `canvas_refresh_webpage_node`.
|
|
23
|
+
Action enum: `add | get | update | remove | refresh`.
|
|
24
|
+
Sketch: `{ action, id?, type?, title?, content?, html?, x?, y?, width?, height?, data?, ...patch fields }`. `add` requires `type` (full node-type enum; html nodes stop needing a dedicated tool since `html` is already a first-class field). `refresh` covers the webpage re-fetch (`node.update` already has `refresh: true` delegation in the registry, so this is an alias action, not new logic). Spec-driven types (json-render, graph, web-artifact) keep their existing redirect errors pointing at `canvas_render`.
|
|
25
|
+
|
|
26
|
+
**2. `canvas_render`**: folds `canvas_add_json_render_node`, `canvas_stream_json_render_node`, `canvas_add_graph_node`, `canvas_add_html_primitive`, `canvas_validate_spec`, `canvas_describe_schema`.
|
|
27
|
+
Action enum: `describe-schema | validate | add-json-render | stream-json-render | add-graph | add-primitive`.
|
|
28
|
+
Sketch: `{ action, spec?, graph?, kind?, payload?, nodeId?, title?, x?, y?, ... }`. One tool owns "spec-driven content": discover the schema, validate, create. The alias triangle (heightPx/nodeHeight/height) is already absorbed by the registry slice 6 schema.
|
|
29
|
+
|
|
30
|
+
**3. `canvas_app`**: folds `canvas_open_mcp_app`, `canvas_add_diagram`, `canvas_build_web_artifact`.
|
|
31
|
+
Action enum: `open-mcp-app | diagram | build-artifact`.
|
|
32
|
+
Sketch: `{ action, serverUrl?, tool?, args?, elements?, files?, entry?, title?, ... }`. External and built content with side-channel semantics, kept apart from plain node CRUD because their inputs share nothing with it.
|
|
33
|
+
|
|
34
|
+
**4. `canvas_edge`**: folds `canvas_add_edge`, `canvas_remove_edge`.
|
|
35
|
+
Action enum: `add | remove`.
|
|
36
|
+
Sketch: `{ action, id?, from?, to?, type?, label?, style?, animated? }`.
|
|
37
|
+
|
|
38
|
+
**5. `canvas_view`**: folds `canvas_arrange`, `canvas_focus_node`, `canvas_fit_view`, `canvas_clear`, `canvas_remove_annotation`.
|
|
39
|
+
Action enum: `arrange | focus | fit | clear | remove-annotation`.
|
|
40
|
+
Sketch: `{ action, nodeId?, annotationId?, strategy?, padding? }`. `remove-annotation` lives here as "canvas surface housekeeping"; it is an overlay operation, not node CRUD (judgment call, see Risks).
|
|
41
|
+
|
|
42
|
+
**6. `canvas_group`**: folds `canvas_create_group`, `canvas_group_nodes`, `canvas_ungroup`.
|
|
43
|
+
Action enum: `create | add | ungroup`.
|
|
44
|
+
Sketch: `{ action, groupId?, title?, nodeIds? }`.
|
|
45
|
+
|
|
46
|
+
**7. `canvas_snapshot`**: folds `canvas_snapshot`, `canvas_list_snapshots`, `canvas_restore`, `canvas_delete_snapshot`, `canvas_gc_snapshots`, `canvas_diff`.
|
|
47
|
+
Action enum: `save | list | restore | delete | gc | diff`.
|
|
48
|
+
Sketch: `{ action, name?, id?, keep?, dryRun?, all? }`.
|
|
49
|
+
|
|
50
|
+
**8. `canvas_history`**: folds `canvas_undo`, `canvas_redo`.
|
|
51
|
+
Action enum: `undo | redo`.
|
|
52
|
+
Sketch: `{ action }`.
|
|
53
|
+
|
|
54
|
+
**9. `canvas_query`**: folds `canvas_search`, `canvas_get_layout`, `canvas_validate`.
|
|
55
|
+
Action enum: `search | layout | validate`.
|
|
56
|
+
Sketch: `{ action, query?, limit?, full? }`. The three "read the board" entry points under one description that teaches the cheap-to-expensive ladder (search before layout).
|
|
57
|
+
|
|
58
|
+
**10. `canvas_webview`**: folds `canvas_webview_status`, `canvas_webview_start`, `canvas_webview_stop`, `canvas_resize`, `canvas_evaluate`.
|
|
59
|
+
Action enum: `status | start | stop | resize | evaluate`.
|
|
60
|
+
Sketch: `{ action, width?, height?, expression? }`.
|
|
61
|
+
|
|
62
|
+
**11. `canvas_ax_state`**: folds `canvas_get_ax`, `canvas_set_ax_focus`, `canvas_set_ax_policy`, `canvas_report_host_capability`.
|
|
63
|
+
Action enum: `get | set-focus | set-policy | report-capability`.
|
|
64
|
+
Sketch: `{ action, focus?, policy?, capability? }`.
|
|
65
|
+
|
|
66
|
+
**12. `canvas_ax_work`**: folds `canvas_add_work_item`, `canvas_update_work_item`, `canvas_add_review_annotation`.
|
|
67
|
+
Action enum: `add | update | annotate`.
|
|
68
|
+
Sketch: `{ action, id?, title?, status?, detail?, nodeIds?, body?, anchor? }`.
|
|
69
|
+
|
|
70
|
+
**13. `canvas_ax_gate`**: folds the nine gate tools: `canvas_request_approval`, `canvas_resolve_approval`, `canvas_await_approval`, `canvas_request_elicitation`, `canvas_respond_elicitation`, `canvas_await_elicitation`, `canvas_request_mode`, `canvas_resolve_mode`, `canvas_await_mode`.
|
|
71
|
+
Two discriminators: `kind: approval | elicitation | mode` and `action: request | resolve | await`.
|
|
72
|
+
Sketch: `{ kind, action, id?, title?, detail?, nodeIds?, decision?, response?, mode?, timeoutMs? }`. `resolve` carries `decision` for approval/mode and `response` for elicitation. The biggest single win: 9 tools to 1, and the request/await pairing finally reads as one lifecycle.
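A minimal sketch of how the two discriminators could resolve to one operation name. It assumes the `ax.<kind>.<verb>` naming that plan-007 uses for the AX registry ops (`request`, `resolve`/`respond`, and `get` for the gate read). The `resolveGateOp` helper, the `GateArgs` type, and the error wording are hypothetical, not the shipped `buildInput`.

```typescript
type GateKind = "approval" | "elicitation" | "mode";
type GateAction = "request" | "resolve" | "await";

interface GateArgs {
  kind: GateKind;
  action: GateAction;
  id?: string;
  decision?: string;
  response?: unknown;
  timeoutMs?: number;
}

// Hypothetical helper: resolves (kind, action) to a single operation name.
function resolveGateOp(args: GateArgs): string {
  if (args.action === "request") return `ax.${args.kind}.request`;
  // `await` maps onto the gate read, which is where the wait happens.
  if (args.action === "await") return `ax.${args.kind}.get`;

  // action === "resolve": elicitations answer with `response`,
  // approvals and mode requests answer with `decision`.
  if (args.kind === "elicitation") {
    if (args.decision !== undefined) {
      throw new Error('canvas_ax_gate: "decision" is not valid for kind "elicitation"; use "response"');
    }
    return "ax.elicitation.respond";
  }
  if (args.response !== undefined) {
    throw new Error(`canvas_ax_gate: "response" is not valid for kind "${args.kind}"; use "decision"`);
  }
  return `ax.${args.kind}.resolve`;
}
```

Keeping the mapping in one function means the nine `(kind, action)` pairs share a single place where a mismatched field is rejected.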
|
|
73
|
+
|
|
74
|
+
**14. `canvas_ax_timeline`**: folds `canvas_get_ax_timeline`, `canvas_record_ax_event`, `canvas_add_evidence`, `canvas_send_steering`.
|
|
75
|
+
Action enum: `read | record-event | add-evidence | send-steering`.
|
|
76
|
+
Sketch: `{ action, kind?, summary?, payload?, evidenceType?, message?, limit? }`.
|
|
77
|
+
|
|
78
|
+
**15. `canvas_ax_delivery`**: folds `canvas_claim_ax_delivery`, `canvas_mark_ax_delivery`.
|
|
79
|
+
Action enum: `claim | mark`.
|
|
80
|
+
Sketch: `{ action, consumer?, id? }`.
|
|
81
|
+
|
|
82
|
+
### Kept standalone (composition would hurt)
|
|
83
|
+
|
|
84
|
+
- **16. `canvas_batch`**: already the meta-operation; folding anything into it inverts the design.
|
|
85
|
+
- **17. `canvas_pin_nodes`**: the flagship human-context primitive; deserves its own description so agents find it.
|
|
86
|
+
- **18. `canvas_screenshot`**: returns an MCP image payload; mixing return types inside a composite makes `formatResult` and client handling worse.
|
|
87
|
+
- **19. `canvas_ax_interaction`**: the single normalized trust-boundary envelope; it already is a composite by design.
|
|
88
|
+
- **20. `canvas_ingest_activity`**: adapter firehose with reaction semantics; distinct caller (harness, not agent).
|
|
89
|
+
- **21. `canvas_invoke_command`**: execution-intent tool; allowlist and approval-policy relevant, so it must stay individually nameable.
|
|
90
|
+
|
|
91
|
+
Every one of the 69 legacy tools maps to exactly one row above; nothing is dropped without a successor.
|
|
92
|
+
|
|
93
|
+
## Migration
|
|
94
|
+
|
|
95
|
+
1. **v0.2 minors: add consolidated tools alongside legacy.** Each consolidated tool ships when its registry slice lands (plan-005 migration order). Implementation per tool: one registry `mcp` registration with an `action` (and for gates, `kind`) discriminator in `extraShape`, a `buildInput` that maps the composite args onto the existing operation input, dispatching via the operation name. Legacy tools keep working unchanged.
|
|
96
|
+
2. **Same minors: mark legacy deprecated.** Each legacy tool description gets a leading `Deprecated: use canvas_x with action "y".` line, plus `### Deprecated` CHANGELOG entries and docs/mcp.md updates, per the api-stability contract.
|
|
97
|
+
3. **v0.3.0: remove legacy tools.** `### Breaking` CHANGELOG entry listing every removed tool and its replacement.
|
|
98
|
+
4. **Freeze test updated in two deliberate steps.** Step one (v0.2): the frozen list grows to 69 + 21 = 90 names as consolidated tools land (additive, not breaking). Step two (v0.3.0): the list shrinks to the 21 survivors in the same commit that deletes the legacy registrations. Both edits are intentional per the freeze test's contract.
|
|
99
|
+
5. **Verification per step:** `bun test tests/unit/mcp-tool-freeze.test.ts tests/unit/operation-parity.test.ts tests/unit/mcp-server.test.ts`, plus one parity case per composite asserting that the composite action and its legacy tool produce identical results through the same operation.
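The deprecation line from step 2 can be derived from the composite definitions rather than written by hand. The sketch below assumes a hypothetical `CompositeDef` shape (an action-to-legacy-tool map) and a hypothetical `deriveDeprecationNotes` helper. Only the `Deprecated: use canvas_x with action "y".` wording comes from the plan.

```typescript
// Hypothetical shape: each composite lists which legacy tool each action replaces.
interface CompositeDef {
  name: string;
  actions: Record<string, string>; // action name -> legacy tool name
}

// Builds one deprecation line per folded legacy tool.
function deriveDeprecationNotes(composites: CompositeDef[]): Record<string, string> {
  const notes: Record<string, string> = {};
  for (const composite of composites) {
    for (const action of Object.keys(composite.actions)) {
      const legacy = composite.actions[action];
      // A composite that reuses a legacy name (canvas_snapshot) cannot
      // deprecate itself, so that entry is skipped in this sketch.
      if (legacy === composite.name) continue;
      notes[legacy] = `Deprecated: use ${composite.name} with action "${action}".`;
    }
  }
  return notes;
}

const composites: CompositeDef[] = [
  { name: "canvas_history", actions: { undo: "canvas_undo", redo: "canvas_redo" } },
  {
    name: "canvas_ax_delivery",
    actions: { claim: "canvas_claim_ax_delivery", mark: "canvas_mark_ax_delivery" },
  },
];
const notes = deriveDeprecationNotes(composites);
```

Deriving the notes keeps the legacy descriptions and the composite action enums from drifting apart during the overlap window.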
|
|
100
|
+
|
|
101
|
+
The interim 90-tool surface is worse than 69 for one or two minors. Accepted: the alternative (flag-day rename) breaks every existing client at once with no migration window.
|
|
102
|
+
|
|
103
|
+
## Risks
|
|
104
|
+
|
|
105
|
+
- **MCP clients with tool allowlists.** A client allowlisting `canvas_add_node` gets nothing when the tool disappears in v0.3, and a coarse `canvas_node` allowlist grants add AND remove together. Consolidation moves the permission boundary from tool name to action param, which allowlist-based policy cannot see. Mitigations: the v0.2 overlap window, loud CHANGELOG + docs/mcp.md migration table, and keeping the sensitive standalones (`canvas_invoke_command`, `canvas_ax_interaction`, `canvas_ingest_activity`) individually nameable. For finer control PMX's own AX policy (`canvas_set_ax_policy` `tools.approvalRequired`) remains the recommended layer.
|
|
106
|
+
- **Action-enum schema bloat.** A composite's schema is the union of its members' fields, mostly optional. If a composite's description plus schema approaches the combined size of the tools it replaced, the consolidation bought nothing; measure serialized listTools size before and after (target: well over 50% reduction).
|
|
107
|
+
- **Worse errors for wrong field/action combinations.** `buildInput` must reject mismatches loudly (OperationError 400 naming the action and the offending field), not silently ignore fields the action does not use.
|
|
108
|
+
- **Placement judgment calls** (`remove-annotation` under `canvas_view`, `refresh` under `canvas_node`, `evaluate` under `canvas_webview`) are cheap to revisit before v0.3 freezes the surface; after that they are contract.
|
|
109
|
+
- **Stale agent muscle memory.** Skills, docs, and CLAUDE.md reference legacy names everywhere. The v0.3 commit must sweep `skills/`, `docs/`, and the MCP `canvas_describe_schema` routing map in the same change, or agents will be steered at tools that no longer exist.
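The "reject mismatches loudly" requirement from the Risks list above could look roughly like this for `canvas_group`. The per-action field table is an assumption made for illustration (the plan only gives the combined sketch `{ action, groupId?, title?, nodeIds? }`), and the `OperationError` class here is a local stand-in for the real one.

```typescript
// Local stand-in for the project's OperationError (shape assumed).
class OperationError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

type GroupAction = "create" | "add" | "ungroup";

// Assumed field ownership per action; the real mapping may differ.
const ALLOWED_FIELDS: Record<GroupAction, readonly string[]> = {
  create: ["title", "nodeIds"],
  add: ["groupId", "nodeIds"],
  ungroup: ["groupId"],
};

function isGroupAction(value: unknown): value is GroupAction {
  return value === "create" || value === "add" || value === "ungroup";
}

// Rejects unknown actions and any field the chosen action does not use.
function checkGroupArgs(args: Record<string, unknown>): GroupAction {
  const action = args.action;
  if (!isGroupAction(action)) {
    throw new OperationError(400, `canvas_group: unknown action ${JSON.stringify(action)}`);
  }
  for (const field of Object.keys(args)) {
    if (field === "action" || args[field] === undefined) continue;
    if (ALLOWED_FIELDS[action].indexOf(field) === -1) {
      throw new OperationError(
        400,
        `canvas_group: field "${field}" is not valid for action "${action}"`,
      );
    }
  }
  return action;
}
```

The error names both the action and the offending field, which is the part a silent-ignore implementation would lose.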
|
|
@@ -0,0 +1,99 @@
|
|
|
1
|
+
# Plan 007 — AX domain: state split + orphan-bug fix + registry migration + tool consolidation
|
|
2
|
+
|
|
3
|
+
**Status:** Proposed
|
|
4
|
+
**Date:** 2026-06-13
|
|
5
|
+
**Depends on:** plan-005 (operation registry — slices 1–4 merged), plan-006 (MCP tool consolidation — wave 1 merged).
|
|
6
|
+
**Motivation:** Retires the remaining hard parts of three tech-debt items at once, all of which converge on the AX domain:
|
|
7
|
+
- **Item 1** (operation registry) — migration item 7: the ~25 AX operations are still hand-written 4× (CanvasStateManager → PmxCanvas SDK → HTTP handler → MCP tool → CanvasAccess). ~37 AX HTTP routes + 25 AX MCP tools.
|
|
8
|
+
- **Item 3** (CanvasStateManager split): AX state lives as `_axState` inside the 2,498-line `CanvasStateManager`; node deletion silently orphans AX items; the snapshot-vs-audit contract is undocumented as code.
|
|
9
|
+
- **Item 2** (tool consolidation), wave 2: the AX tools are the biggest single win (25 → 5 composites plus the 3 standalones, including 9 gate tools → 1), but blocked until AX ops are registry-backed.
|
|
10
|
+
|
|
11
|
+
## The AX state contract (authoritative)
|
|
12
|
+
|
|
13
|
+
Three partitions, confirmed against the code. This contract is the spec for the split.
|
|
14
|
+
|
|
15
|
+
| Partition | Members | Storage | Snapshotted | Cleared by `canvas_clear` | Cleared by `restore` |
|
|
16
|
+
|-----------|---------|---------|:-----------:|:-------------------------:|:--------------------:|
|
|
17
|
+
| **Canvas-bound** | `focus`, `workItems`, `approvalGates`, `reviewAnnotations`, `elicitations`, `modeRequests`, `policy` | in-memory `_axState` + one JSON blob in `ax_state` table | ✅ | ✅ | ✅ (replaced by snapshot's AX) |
|
|
18
|
+
| **Timeline (audit-only)** | `agent-event`, `evidence-item`, `steering-message` | `ax_events` / `ax_evidence` / `ax_steering` tables, 500-row retention, sequential ids | ❌ | ❌ | ❌ |
|
|
19
|
+
| **Host/session** | `host-capability` | `ax_host_capabilities` table | ❌ | ❌ | ❌ |
|
|
20
|
+
|
|
21
|
+
Rules: canvas-bound state travels with the canvas (snapshot/restore/clear); timeline + host data are diagnostic and survive all three. This is mostly already in CLAUDE.md — plan-007 makes it the documented module boundary.
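The partition table can also be read as data. The sketch below encodes it as a lookup so a test could assert the contract directly. The type names and the `survives` helper are hypothetical, while the members and flags are copied from the table.

```typescript
type Partition = "canvas-bound" | "timeline" | "host-session";

interface PartitionRule {
  snapshotted: boolean;
  clearedByCanvasClear: boolean;
  clearedByRestore: boolean;
}

// One row per partition, mirroring the table above.
const AX_PARTITIONS: Record<Partition, PartitionRule> = {
  "canvas-bound": { snapshotted: true, clearedByCanvasClear: true, clearedByRestore: true },
  timeline: { snapshotted: false, clearedByCanvasClear: false, clearedByRestore: false },
  "host-session": { snapshotted: false, clearedByCanvasClear: false, clearedByRestore: false },
};

const MEMBER_PARTITION: Record<string, Partition> = {
  focus: "canvas-bound",
  workItems: "canvas-bound",
  approvalGates: "canvas-bound",
  reviewAnnotations: "canvas-bound",
  elicitations: "canvas-bound",
  modeRequests: "canvas-bound",
  policy: "canvas-bound",
  "agent-event": "timeline",
  "evidence-item": "timeline",
  "steering-message": "timeline",
  "host-capability": "host-session",
};

// Hypothetical helper: does this member survive `canvas_clear` or `restore`?
function survives(member: string, op: "clear" | "restore"): boolean {
  const partition = MEMBER_PARTITION[member];
  if (partition === undefined) throw new Error(`unknown AX member "${member}"`);
  const rule = AX_PARTITIONS[partition];
  return op === "clear" ? !rule.clearedByCanvasClear : !rule.clearedByRestore;
}
```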
|
|
22
|
+
|
|
23
|
+
## Slice A — AX state extraction + orphan-bug fix + contract doc (tech-debt item 3)
|
|
24
|
+
|
|
25
|
+
**Extraction.** `_axState` is a single separable field; timeline ops are already DB-direct; normalization lives cleanly in `ax-state.ts`. Move the canvas-bound state + its ~17 mutators / ~13 readers + the timeline-direct ops into a dedicated `AxStateManager` (new `src/server/ax-state-manager.ts`). `CanvasStateManager` keeps a `private ax: AxStateManager` and **delegates** its existing AX methods to it — so the public surface (SDK, HTTP, MCP all call `canvasState.addWorkItem(...)` etc.) is byte-stable and no caller changes. The manager takes a node-id-validity callback (injected) so normalization still runs on write. Net: `CanvasStateManager` sheds ~600 lines; AX state becomes independently testable.
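A reduced sketch of the delegation shape described above, under the assumption that the injected callback is a simple `(nodeId) => boolean` predicate. Only `AxStateManager`, `CanvasStateManager`, the `ax` field, and `addWorkItem` are names from the plan. The `WorkItem` fields and `addNode` are simplified placeholders.

```typescript
type NodeIdValidator = (nodeId: string) => boolean;

interface WorkItem { id: string; title: string; nodeIds: string[]; }
interface WorkItemInput { id: string; title: string; nodeIds?: string[]; }

// Owns the canvas-bound AX state; knows nothing about the node store itself.
class AxStateManager {
  private readonly workItems: WorkItem[] = [];
  private readonly isValidNodeId: NodeIdValidator;

  constructor(isValidNodeId: NodeIdValidator) {
    this.isValidNodeId = isValidNodeId;
  }

  addWorkItem(input: WorkItemInput): WorkItem {
    // Normalization on write: keep only node ids the canvas still knows.
    const nodeIds = (input.nodeIds ?? []).filter((id) => this.isValidNodeId(id));
    const item: WorkItem = { id: input.id, title: input.title, nodeIds };
    this.workItems.push(item);
    return item;
  }
}

// Keeps its public AX methods and forwards them, so callers do not change.
class CanvasStateManager {
  private readonly nodeIds: string[] = [];
  private readonly ax: AxStateManager = new AxStateManager(
    (id) => this.nodeIds.indexOf(id) !== -1,
  );

  addNode(id: string): void {
    this.nodeIds.push(id);
  }

  addWorkItem(input: WorkItemInput): WorkItem {
    return this.ax.addWorkItem(input);
  }
}
```

Because the validator is injected, `AxStateManager` can be unit-tested with a stub predicate and no canvas at all.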
|
|
26
|
+
|
|
27
|
+
**The orphan bug** (`canvas-state.ts:1459`): `removeNode()` → `applyAxState()` → `normalizeAxForCurrentNodes()` re-normalizes AX against the surviving node set. Precise current behavior (verified):
|
|
28
|
+
- Work items / approval gates / elicitations / mode requests: `normalizeNodeIds` (`ax-state.ts:255`) **strips the dangling node id but keeps the item** — already "soft", but *silent*.
|
|
29
|
+
- Node-anchored review annotations (`anchorType:'node'`): **dropped entirely** (`ax-state.ts:577-582`) — correct (meaningless without their node), but *silent*.
|
|
30
|
+
- No event, no audit, on either. Same silent re-normalization runs on `restore()` (`canvas-state.ts:1063`).
|
|
31
|
+
|
|
32
|
+
**Decision (chosen): soft-orphan + audit.** The data semantics already match soft-orphan, so the fix is the **audit**: when node deletion strips a node ref from a work item/gate/elicitation/mode, or drops a node-anchored review annotation, record one auditable timeline event (`source:'system'`) summarizing what was re-anchored/removed and the trigger node — so the human and a resuming agent can see it instead of work silently changing. Needs a general-purpose system/audit event kind (add `'note'` to `PmxAxEventKind`; `kind` is stored as TEXT so no DB migration). Node-anchored review drop is kept (per the decision), now audited. Scope the audit to `removeNode` (the live bug); `restore` replaces the whole canvas wholesale and its snapshot AX was already consistent when saved.
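The decision can be sketched as a pure function: strip or drop as today, but return one audit note describing what changed. Only work items and review annotations are modelled here (gates, elicitations, and mode requests would follow the work-item branch). The `AuditNote` payload fields and the `softOrphan` name are assumptions; `kind: 'note'`, `source: 'system'`, and `anchorType: 'node'` come from the text.

```typescript
interface WorkItem { id: string; nodeIds: string[]; }
interface ReviewAnnotation { id: string; anchorType: string; nodeId?: string; }
interface AxSlice { workItems: WorkItem[]; reviewAnnotations: ReviewAnnotation[]; }

interface AuditNote {
  kind: "note";
  source: "system";
  summary: string;
  // Payload layout is hypothetical.
  payload: { removedNodeId: string; strippedFrom: string[]; droppedAnnotations: string[] };
}

function softOrphan(
  state: AxSlice,
  removedNodeId: string,
): { state: AxSlice; audit: AuditNote | null } {
  const strippedFrom: string[] = [];
  const droppedAnnotations: string[] = [];

  // Soft-orphan: keep the item, strip only the dangling node id.
  const workItems = state.workItems.map((item) => {
    if (item.nodeIds.indexOf(removedNodeId) === -1) return item;
    strippedFrom.push(item.id);
    return { ...item, nodeIds: item.nodeIds.filter((id) => id !== removedNodeId) };
  });

  // Node-anchored annotations are meaningless without their node: drop them.
  const reviewAnnotations = state.reviewAnnotations.filter((annotation) => {
    const orphaned = annotation.anchorType === "node" && annotation.nodeId === removedNodeId;
    if (orphaned) droppedAnnotations.push(annotation.id);
    return !orphaned;
  });

  const changed = strippedFrom.length > 0 || droppedAnnotations.length > 0;
  const audit: AuditNote | null = changed
    ? {
        kind: "note",
        source: "system",
        summary: `Node ${removedNodeId} removed: ${strippedFrom.length} item(s) re-anchored, ${droppedAnnotations.length} annotation(s) dropped`,
        payload: { removedNodeId, strippedFrom, droppedAnnotations },
      }
    : null;

  return { state: { workItems, reviewAnnotations }, audit };
}
```

Returning `null` when nothing referenced the node keeps the timeline free of no-op entries.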
|
|
33
|
+
|
|
34
|
+
**Contract doc:** formalize the partition table above in `docs/ax-state-contract.md` (or a CLAUDE.md section) as the authoritative snapshot-vs-audit spec.
|
|
35
|
+
|
|
36
|
+
Slice A is independent of the registry migration (delegation keeps the surface stable) and is the highest-correctness-value piece. The orphan fix alone is a small, shippable change that can land first.
|
|
37
|
+
|
|
38
|
+
## Slice B — AX registry migration (tech-debt item 1 / plan-005 item 7)
|
|
39
|
+
|
|
40
|
+
Define AX operations in `src/server/operations/ops/ax.ts` (split into `ax-state.ts` / `ax-timeline.ts` files if large), following the established pattern (loose zod schemas, `OperationError`, `http.serialize`, `mcp.formatResult/buildInput`, frozen tool names, `ctx.emit` for the AX SSE frames). Delete the legacy HTTP handler + route + MCP tool block + orphaned CanvasAccess method in the same change per op.
|
|
41
|
+
|
|
42
|
+
**Fits the simple synchronous model** (`mutates: false`, emit `ax-state-changed` or `ax-event-created` via `ctx.emit`; these are NOT layout mutations):
|
|
43
|
+
- State: `ax.get`, `ax.focus.set`, `ax.policy.get`, `ax.policy.set`, `ax.host-capability.report`
|
|
44
|
+
- Work/review: `ax.work.create`, `ax.work.update`, `ax.review.add`, `ax.review.update`
|
|
45
|
+
- Gates (create/resolve): `ax.approval.request`/`.resolve`, `ax.elicitation.request`/`.respond`, `ax.mode.request`/`.resolve`
|
|
46
|
+
- Timeline: `ax.timeline.get`, `ax.event.record`, `ax.evidence.add`, `ax.steer`
|
|
47
|
+
- Delivery: `ax.delivery.pending` (loop-safe consumer scoping preserved), `ax.delivery.mark`
|
|
48
|
+
- Commands: `ax.command.invoke` (allowlist-gated; records a timeline event only)
|
|
49
|
+
|
|
50
|
+
**Needs special handling (own sub-slice):**
|
|
51
|
+
- **Gate reads with long-poll** (`ax.approval.get` / `ax.elicitation.get` / `ax.mode.get`, the `await_*` tools): the HTTP `?waitMs=` blocks via `waitForAxResolution()` + `req.signal`. Migrate using a custom `http.readInput` that performs the wait and returns the resolved-or-`pending` value; the MCP `await` action passes `timeoutMs` through to the handler (no abort signal off-HTTP, timeout still honored). **Fallback:** if this abstraction turns ugly, leave the 3 `await_*` tools + their GET routes legacy and fold them into `canvas_ax_gate`'s `await` action in a later step (report as deferred, plan-005-style — do not force a bad abstraction).
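A rough sketch of the wait itself, assuming an in-memory gate store: the read resolves as soon as the gate is resolved, or returns the still-`pending` gate when the timeout or an abort fires. `GateStore`, `AbortLike`, and the method names are hypothetical and are not the real `waitForAxResolution()` signature.

```typescript
interface Gate { id: string; status: "pending" | "resolved"; decision?: string; }

// Minimal structural stand-in for the part of AbortSignal used here.
interface AbortLike {
  aborted: boolean;
  addEventListener(type: "abort", listener: () => void): void;
}

class GateStore {
  private gates: Record<string, Gate> = {};
  private waiters: Record<string, Array<() => void>> = {};

  request(id: string): Gate {
    const gate: Gate = { id, status: "pending" };
    this.gates[id] = gate;
    return gate;
  }

  get(id: string): Gate {
    const gate = this.gates[id];
    if (!gate) throw new Error(`gate "${id}" not found`);
    return gate;
  }

  resolve(id: string, decision: string): Gate {
    const gate = this.get(id);
    gate.status = "resolved";
    gate.decision = decision;
    // Wake every pending read; copy first because waiters remove themselves.
    for (const wake of (this.waiters[id] ?? []).slice()) wake();
    return gate;
  }

  // Returns the resolved gate, or the still-pending gate on timeout/abort.
  waitForResolution(id: string, timeoutMs: number, signal?: AbortLike): Promise<Gate> {
    const current = this.get(id);
    if (current.status !== "pending" || timeoutMs <= 0 || signal?.aborted) {
      return Promise.resolve(current);
    }
    return new Promise<Gate>((resolve) => {
      let done = false;
      const finish = (): void => {
        if (done) return;
        done = true;
        clearTimeout(timer);
        this.waiters[id] = (this.waiters[id] ?? []).filter((w) => w !== finish);
        resolve(this.get(id));
      };
      const timer = setTimeout(finish, timeoutMs);
      (this.waiters[id] ??= []).push(finish);
      signal?.addEventListener("abort", finish);
    });
  }
}
```

The HTTP path would pass `req.signal` as the third argument. The MCP `await` action would omit it and rely on `timeoutMs` alone, matching the "timeout still honored" note above.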
|
|
52
|
+
|
|
53
|
+
**Stays as a sidecar (NOT a registry op), but routed through the shared op cores:**
|
|
54
|
+
- **`ax.interaction`** (`applyAxInteraction`, `src/server/ax-interaction.ts`): the single re-validation trust boundary for sandboxed-surface envelopes, with `sourceSurface` scope-clamping. Keep `POST /api/canvas/ax/interaction` + `canvas_ax_interaction` as-is, but point its per-type dispatch at the SAME operation cores the registry ops call — so interaction and direct calls can never diverge.
|
|
55
|
+
- **`ax.activity.ingest`** (`canvas_ingest_activity`): harness firehose with kind-driven auto-reactions firing 3–4 SSE events; distinct caller shape. Stays standalone.
|
|
56
|
+
|
|
57
|
+
**Preserve exactly:** SSE event names (`ax-state-changed`, `ax-event-created`), the resource-notification fan-out (`canvas://ax`, `ax-context`, `ax-timeline`, `ax-work`, `ax-pending-steering`, `ax-delivery`), structured denial bodies (`resolve` on a missing/already-resolved gate; node-anchored review requiring a real node id), `source` defaulting (`'mcp'` for MCP, `'api'` for HTTP). The SDK's AX methods become thin wrappers over the op cores; CanvasAccess Local/Remote AX methods are deleted (the invoker replaces them) — the same local-vs-remote unification class as slices 1–4.
|
|
58
|
+
|
|
59
|
+
## Slice C — AX tool consolidation (tech-debt item 2 / plan-006 wave 2)
|
|
60
|
+
|
|
61
|
+
Additive composites (per `docs/api-stability.md`; same mechanism as wave 1 — derived schema + reused op `buildInput`/`formatResult`, deprecation prefix on the folded legacy tools, removed in v0.3):
|
|
62
|
+
|
|
63
|
+
1. **`canvas_ax_state`** — `get | set-focus | set-policy | report-capability`
|
|
64
|
+
2. **`canvas_ax_work`** — `add | update | annotate` (work items + review annotations)
|
|
65
|
+
3. **`canvas_ax_gate`** — two discriminators `kind: approval|elicitation|mode` × `action: request|resolve|await`. **9 tools → 1.** The single biggest consolidation win.
|
|
66
|
+
4. **`canvas_ax_timeline`** — `read | record-event | add-evidence | send-steering`
|
|
67
|
+
5. **`canvas_ax_delivery`** — `claim | mark`
|
|
68
|
+
|
|
69
|
+
**Stay standalone** (plan-006 §19–21): `canvas_ax_interaction` (trust-boundary envelope), `canvas_ingest_activity` (harness firehose), `canvas_invoke_command` (gated execution intent, allowlist/approval-policy relevant). Freeze list grows by 5 (additive); legacy AX tools gain the `Deprecated:` prefix.
|
|
70
|
+
|
|
71
|
+
## Migration order
|
|
72
|
+
|
|
73
|
+
1. **A.1** orphan-bug fix + audit note (small, shippable first; behavior change — see decision).
|
|
74
|
+
2. **A.2** extract `AxStateManager`, delegate from `CanvasStateManager`, document the contract.
|
|
75
|
+
3. **B.1** migrate the simple AX state/work/gate-mutate/timeline/delivery/command ops; delete their legacy handlers/tools/CanvasAccess methods; SDK wraps cores.
|
|
76
|
+
4. **B.2** gate-read long-poll sub-slice (custom `readInput`; fallback = leave `await_*` legacy).
|
|
77
|
+
5. **B.3** re-point `applyAxInteraction` at the shared cores (no behavior change).
|
|
78
|
+
6. **C** add the 5 AX composites + deprecate legacy AX tools.
|
|
79
|
+
|
|
80
|
+
Each step is its own commit; B.1 is internally parallelizable per primitive (work / approvals / elicitations / modes / review / timeline / delivery) — the dynamic-workflow fit.
|
|
81
|
+
|
|
82
|
+
## Risks
|
|
83
|
+
|
|
84
|
+
- **Behavior change (orphan fix).** Soft-orphan changes long-standing silent-drop semantics. Mitigation: the explicit decision recorded in Slice A; a parity-test case pins the new behavior; CHANGELOG `### Changed` entry.
|
|
85
|
+
- **State extraction regressions.** `_axState` is touched by snapshot/restore/clear/load and the orphan path. Mitigation: delegation keeps the public surface identical; the existing AX + snapshot unit tests must pass untouched; add a node-delete-orphan test.
|
|
86
|
+
- **Long-poll abstraction.** The `await_*` ops are the only AX ops that don't fit the synchronous model. Mitigation: custom `readInput`; documented fallback to keep them legacy.
|
|
87
|
+
- **Trust-boundary drift.** `applyAxInteraction` must call the same cores as the registry ops, or the sandboxed-surface path diverges from the direct path. Mitigation: extract mutation cores first, route both through them; the interaction-scope tests stay untouched.
|
|
88
|
+
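- **MCP-against-remote.** A migrated AX tool can pass against the local invoker and still break over the remote path. Mitigation: at least one test per migrated AX tool through RemoteCanvasAccess/HttpOperationInvoker (mcp-server daemon mode).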
|
|
89
|
+
- **Surface size.** Interim tool count grows again (additive); accepted per plan-006.
|
|
90
|
+
|
|
91
|
+
## Verification (every slice)
|
|
92
|
+
|
|
93
|
+
1. `bun run typecheck`
|
|
94
|
+
2. Targeted: `PMX_CANVAS_DISABLE_BROWSER_OPEN=1 bun test tests/unit/operation-parity.test.ts tests/unit/mcp-tool-freeze.test.ts tests/unit/mcp-server.test.ts tests/unit/mcp-composites.test.ts tests/unit/server-api.test.ts tests/unit/canvas-state.test.ts` (+ AX-specific suites, + a new node-delete-orphan test)
|
|
95
|
+
3. Full unit: `bun test tests/unit`
|
|
96
|
+
4. e2e gate on the PR (`test` + `e2e` required checks).
|
|
97
|
+
5. A parity case per migrated tool (composite action == legacy tool, both through the same op) and per new behavior (soft-orphan + audit note).
|
|
98
|
+
|
|
99
|
+
Tool-name freeze + operation-parity edits are deliberate and called out in the same commit, with CHANGELOG entries (`### Added` composites, `### Deprecated` legacy, `### Changed` orphan semantics).
|
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
# Plan 008 — Finish the operation-registry refactor (plan-005 items 8–9 + plan-006 completion)
|
|
2
|
+
|
|
3
|
+
**Status:** Complete
|
|
4
|
+
**Date:** 2026-06-15
|
|
5
|
+
**Depends on:** plan-005 (registry — slices 1–7 merged), plan-006 (consolidation — waves 1–2 merged), plan-007 (AX domain — merged).
|
|
6
|
+
**Motivation:** Close out the registry refactor so v0.2 ships a coherent, complete surface. After this, the registry covers every n-way-duplicated operation; the only legacy left is the deliberate single-transport / poor-fit set.
|
|
7
|
+
|
|
8
|
+
## Verdicts (from the remaining-surface investigation)
|
|
9
|
+
|
|
10
|
+
| Operation | Verdict | Why |
|
|
11
|
+
|---|---|---|
|
|
12
|
+
| `canvas_validate` (board validation) | **Migrate** — `validate.get` op (read) | Pure read; clean fit; unblocks `canvas_query` validate action |
|
|
13
|
+
| `canvas_remove_annotation` | **Migrate** — `annotation.remove` op | Trivial DELETE-by-id mutation; unblocks `canvas_view` remove-annotation action |
|
|
14
|
+
| webview `status`/`start`/`stop`/`resize`/`evaluate` | **DONE (Wave 3)** — migrated via runner injection; `canvas_webview` composite shipped | The webview machinery (`startCanvasAutomationWebView` …) lives in `server.ts`, which `operations/` must NOT import (the isolation rule). Resolved with a **runner-injection** pattern (`src/server/operations/webview-runner.ts` + `setWebviewRunner` in `server.ts`), exactly mirroring `setOperationEventEmitter`. Ops in `src/server/operations/ops/webview.ts`; composite `canvas_webview` folds the 5 actions |
|
|
15
|
+
| `canvas_screenshot` | **Stay legacy** (intentional) | Returns a binary image payload the registry JSON wire shape does not model. Stays a standalone hand-written tool (and route) |
|
|
16
|
+
| `canvas_refresh_webpage_node` | **DONE (Wave 5)** — deprecate-only fold into `canvas_node` | The original "needs `refresh` action + per-action input injection" rationale was wrong: `refresh:true` is a plain param reachable via `canvas_node {action:"update", refresh:true}`. The audit surfaced a real failure-path GAP (`node.update`'s `formatResult` masked a FAILED refresh as `{ ok:true }` with no `isError` over the local invoker), which was then **closed** — `formatResult` now passes a refresh result through verbatim and surfaces `ok:false` as `isError`, matching the standalone tool and the HTTP 400. With the failure path equivalent, the standalone is deprecated → `canvas_node {action:"update", refresh:true}` (see Wave 5) |
|
|
17
|
+
| `canvas_add_html_node` / `canvas_add_html_primitive` | **DONE (Wave 5)** — deprecate-only fold into `canvas_node` | Verdict REVERSED: no mechanism needed. `node.add` already routes `type:"html"` + `primitive\|kind` → `createHtmlPrimitiveNode` and merges the top-level html fields into node data, so `canvas_node {action:"add", type:"html", …}` is already equivalent. Deprecate-only: a `Deprecated: use canvas_node …` prefix on each standalone description; no new action / op / SDK / freeze change |
|
|
18
|
+
| `canvas_open_mcp_app` / `canvas_add_diagram` / `canvas_build_web_artifact` | **DONE (Wave 4)** — migrated as `mcpapp.open` / `diagram.open` / `webartifact.build`; `canvas_app` composite shipped | The earlier "poor fit" verdict was wrong on reflection: `executeOperation` is async (the long-running build fits — the caveat is MCP-client timeouts, not registry fit), and the runtimes are server-independent **domain modules** (`mcp-app-runtime` / `diagram-presets` / `web-artifacts`), not `server.ts` — so the op handlers call them directly, NO runner injection. `web-artifacts.ts` was made server-independent by switching its one `emitPrimaryWorkbenchEvent` to the already-injected `emitCanvasLayoutUpdate`. The three ops are `mutates:false` (they emit `ext-app-open`/`ext-app-result` via `ctx.emit`, or web-artifacts emits its own layout frame). `canvas_app` folds the 3 |
|
|
19
|
+
| `canvas_batch` | **Migrate last** — `canvas.batch` meta-op | The remaining registry slice; deletes the 290-line switch |
|
|
20
|
+
| `canvas_ax_interaction` / `canvas_ingest_activity` | **Stay legacy** (already decided — trust boundary / firehose) | plan-007 |
|
|
21
|
+
|
|
22
|
+
## Wave 1 — clean migrations + the two free composite actions
|
|
23
|
+
|
|
24
|
+
Two new ops (follow the established pattern; delete legacy handler + route + MCP tool + orphaned CanvasAccess per op). Both are server-independent (no `server.ts`/`index.ts` import):
|
|
25
|
+
- **`validate.get`** — `GET /api/canvas/validate`, mutates:false, no emit; serialize = `validateCanvasLayout(canvasState.getLayout())`; MCP `canvas_validate` (no args).
|
|
26
|
+
- **`annotation.remove`** — `DELETE /api/canvas/annotation/:id`, mutates:true (auto layout emit); 404 on missing; returns `{ ok:true, removed:id }`; MCP `canvas_remove_annotation { id }`.
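To make the `mutates` distinction concrete, here is a toy registry under stated assumptions: a hypothetical `OperationDef` shape, a synchronous `executeOperation` (the real one is described elsewhere in this plan as async), a stand-in `OperationError`, and a placeholder result for `validate.get` in place of `validateCanvasLayout(...)`.

```typescript
// Local stand-in for the project's OperationError (shape assumed).
class OperationError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.status = status;
  }
}

// Hypothetical, heavily reduced operation definition.
interface OperationDef<I, O> {
  name: string;
  mutates: boolean;
  handler: (input: I) => O;
}

const annotations: Array<{ id: string }> = [{ id: "a1" }, { id: "a2" }];
const emittedFrames: string[] = [];

// mutates:true ops get an automatic layout frame; reads emit nothing.
function executeOperation<I, O>(op: OperationDef<I, O>, input: I): O {
  const result = op.handler(input);
  if (op.mutates) emittedFrames.push("canvas-layout-update");
  return result;
}

const annotationRemove: OperationDef<{ id: string }, { ok: true; removed: string }> = {
  name: "annotation.remove",
  mutates: true,
  handler: ({ id }) => {
    const index = annotations.map((a) => a.id).indexOf(id);
    if (index === -1) throw new OperationError(404, `annotation "${id}" not found`);
    annotations.splice(index, 1);
    return { ok: true, removed: id };
  },
};

const validateGet: OperationDef<undefined, { ok: boolean }> = {
  name: "validate.get",
  mutates: false,
  // Placeholder result; the real op serializes the layout validation.
  handler: () => ({ ok: true }),
};
```

A failed removal throws before the emit, so a 404 never produces a layout frame in this sketch.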
|
|
27
|
+
|
|
28
|
+
Consolidation (additive; these are clean `action→op`, NO mechanism extension — that's why they're in scope and refresh/add-primitive are not):
|
|
29
|
+
- **`canvas_query`** + `validate` action → `validate.get`. Deprecate `canvas_validate`.
|
|
30
|
+
- **`canvas_view`** + `remove-annotation` action → `annotation.remove`. Deprecate `canvas_remove_annotation`.
|
|
31
|
+
|
|
32
|
+
Tool names: the 2 migrated tools keep their names (hand-written → registry-served); no freeze-count change. Deprecation prefixes auto-derive from the composite definitions.
|
|
33
|
+
|
|
34
|
+
**Deferred (documented, not in this campaign):** `canvas_node` refresh + `canvas_render` add-primitive + `canvas_add_html_node` folding (need a per-action input-injection mechanism — over-engineering for niche actions), `canvas_snapshot` (v0.3 name collision). These legacy tools keep working, unchanged. (`canvas_webview` is **done** — Wave 3 below; its server.ts coupling was resolved with runner injection. `canvas_app` is **done** — Wave 4 below; the "poor fit" verdict was reversed. `canvas_screenshot` intentionally stays standalone — binary payload.) **Update (Wave 5):** the html deferral was wrong — `node.add` already exposes the params, so the html fold is deprecate-only (no mechanism). Only the refresh fold turned out to be a genuine result-equivalence gap. See Wave 5.
|
|
35
|
+
|
|
36
|
+
## Wave 2 — batch (plan-005 item 9, last, highest risk)
|
|
37
|
+
|
|
38
|
+
Convert `executeCanvasBatch` (the ~290-line switch in canvas-operations.ts) into a `canvas.batch` registry meta-op:
|
|
39
|
+
- **`runWithSuppressedEmits(fn)` wraps the batch loop** — the registry runs each op through the normal `executeOperation` path while a depth counter suppresses both auto `canvas-layout-update` frames and explicit `ctx.emit` events. The batch meta-op then emits one final layout frame.
|
|
40
|
+
- The `canvas.batch` handler: read `{ operations:[...] }` or a bare `[...]` (shared array-preserving reader); for each entry resolve `$ref`/`assign` against prior results, then dispatch the legacy batch allowlist through `executeOperation` inside `runWithSuppressedEmits`; collect `results`/`refs`; on failure record `failedIndex`/`error` and stop (preserve current semantics). `mutates:false` + ONE manual `ctx.emit('canvas-layout-update')` at the end. Result shape `{ ok, results, refs, failedIndex?, error? }` byte-identical.
|
|
41
|
+
- All 11 batch op names (`node.add/update/remove`, `graph.add`, `edge.add`, `group.create/add/remove`, `pin.set` [+ add/remove modes], `snapshot.save`, `arrange`) are already registered — names match.
|
|
42
|
+
- Delete the switch. Per-entry mutation history still records individually (undo per step preserved).
|
|
43
|
+
- **Risk: highest** — last, separately committed, one-commit revert. Verify: every op name in batch + standalone, `$ref` chaining, bare-array + `{operations}` shapes, SSE single-final-emit (operation-parity counts frames), failure at each index, local + remote.
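The depth-counter idea from the list above can be sketched synchronously. Everything here is illustrative: the two toy ops, the `{ $ref: "name.field" }` reference syntax, and the choice to emit the final frame even after a failure are assumptions, not the shipped `canvas.batch` behavior.

```typescript
const frames: string[] = [];
let suppressDepth = 0;

function emit(event: string): void {
  if (suppressDepth > 0) return; // inside a batch: swallow per-op frames
  frames.push(event);
}

function runWithSuppressedEmits<T>(fn: () => T): T {
  suppressDepth += 1;
  try {
    return fn();
  } finally {
    suppressDepth -= 1;
  }
}

type Args = Record<string, unknown>;
const nodeIds: string[] = [];
// Toy stand-ins for registered ops; each emits as it would standalone.
const ops: Record<string, (input: Args) => Args> = {
  "node.add": () => {
    const id = `n${nodeIds.length + 1}`;
    nodeIds.push(id);
    emit("canvas-layout-update");
    return { id };
  },
  "edge.add": (input) => {
    const { from, to } = input;
    if (typeof from !== "string" || typeof to !== "string") {
      throw new Error("edge.add requires from and to");
    }
    emit("canvas-layout-update");
    return { id: `${from}->${to}` };
  },
};

interface BatchEntry { op: string; assign?: string; args?: Args; }
interface BatchResult {
  ok: boolean; results: unknown[]; refs: Args; failedIndex?: number; error?: string;
}

// Assumed syntax: { $ref: "name.field" } reads a field of an earlier assigned result.
function resolveRefs(args: Args, refs: Args): Args {
  const out: Args = {};
  for (const key of Object.keys(args)) {
    const value = args[key];
    if (typeof value === "object" && value !== null && "$ref" in value) {
      const parts = String((value as { $ref: unknown }).$ref).split(".");
      const target = refs[parts[0]] as Args | undefined;
      if (!target || !(parts[1] in target)) throw new Error(`unresolved $ref "${parts.join(".")}"`);
      out[key] = target[parts[1]];
    } else {
      out[key] = value;
    }
  }
  return out;
}

function runBatch(entries: BatchEntry[]): BatchResult {
  const outcome = runWithSuppressedEmits((): BatchResult => {
    const results: unknown[] = [];
    const refs: Args = {};
    for (let i = 0; i < entries.length; i++) {
      const entry = entries[i];
      try {
        const handler = ops[entry.op];
        if (!handler) throw new Error(`unknown op "${entry.op}"`);
        const result = handler(resolveRefs(entry.args ?? {}, refs));
        results.push(result);
        if (entry.assign) refs[entry.assign] = result;
      } catch (err) {
        // Stop at the first failure and report where it happened.
        const error = err instanceof Error ? err.message : String(err);
        return { ok: false, results, refs, failedIndex: i, error };
      }
    }
    return { ok: true, results, refs };
  });
  emit("canvas-layout-update"); // the single final frame
  return outcome;
}
```

A three-entry batch (two `node.add` entries with `assign`, then an `edge.add` that references both) therefore produces three results but only one recorded frame.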
|
|
44
|
+
|
|
45
|
+
## Wave 3 — webview (DONE) — runner injection
|
|
46
|
+
|
|
47
|
+
Migrate the 5 browser-automation tools (`status`/`start`/`stop`/`resize`/`evaluate`) to the registry. The blocker was that the webview machinery lives in `server.ts`, which `operations/` must NOT import. Resolved with **runner injection**, mirroring `setOperationEventEmitter`:
|
|
48
|
+
|
|
49
|
+
- **`src/server/operations/webview-runner.ts`** declares a `WebviewRunner` interface (`status` / `start` / `stop` / `resize` / `evaluate`) + a module-level injected instance with `setWebviewRunner(runner)` and `getWebviewRunner()` (throws a clear error if not injected). `screenshot` is intentionally NOT in the runner (binary).
|
|
50
|
+
- **`server.ts`** calls `setWebviewRunner({ … })` at module load (same point as `setOperationEventEmitter`), wiring the real `getCanvasAutomationWebViewStatus` / `startCanvasAutomationWebView` / `stopCanvasAutomationWebView` / `resizeCanvasAutomationWebView` / `evaluateCanvasAutomationWebView` functions. The `start` closure carries the success/error asymmetry the legacy route preserved (200 ok; 503 server-not-running; 501 unsupported runtime; 500 supported-failure — with the webview status in the error body so callers can read `lastError`).
|
|
51
|
+
- **`src/server/operations/ops/webview.ts`** — 5 `mutates:false` ops (webview is a side surface — no `canvas-layout-update` frame). Each handler calls `getWebviewRunner()`, never `server.ts`. Routes match the legacy paths exactly (`GET /api/workbench/webview`, `POST …/start`, `DELETE …/webview`, `POST …/resize`, `POST …/evaluate`). The `evaluate` op preserves the exact arg validation (exactly one of `expression`/`script`, MCP message in `buildInput`, HTTP message in the handler), the async-IIFE script wrap, and the arbitrary-eval trust posture (relocated, unchanged). `start`'s MCP `buildInput` sandboxes `dataStoreDir` to the workspace (MCP-only, as the legacy tool did). The fetch handler dispatches `/api/workbench/webview*` through `dispatchOperationRoute` (a null return falls through to the still-hand-written screenshot route).
|
|
52
|
+
- **Composite** `canvas_webview` (additive): `status` → `webview.status`, `start` → `webview.start`, `stop` → `webview.stop`, `resize` → `webview.resize`, `evaluate` → `webview.evaluate`. Deprecation prefixes auto-derive via `buildCompositeDeprecationNotes`.
|
|
53
|
+
- **Deleted legacy:** the 5 MCP tool blocks (`mcp/server.ts`), the 5 HTTP handlers + routes + `parseCanvasAutomationWebViewRequestBody` (`server.ts`), and the orphaned `CanvasAccess` methods (`startAutomationWebView` / `stopAutomationWebView` / `resizeAutomationWebView` / `evaluateAutomationWebView` + the `WebViewEnvelope` / `WebViewStopEnvelope` / `WebViewEvaluateEnvelope` interfaces + the `AutomationWebViewOptions` / `AutomationEvaluateResult` type aliases). **KEPT:** `CanvasAccess.getAutomationWebViewStatus` + `screenshotAutomationWebView` (the standalone `canvas_screenshot` tool needs both) and the public SDK `PmxCanvas` webview methods.
|
|
54
|
+
- **Freeze:** `canvas_webview` is a new tool name → freeze list 81 → 82 (the only deliberate freeze change). The 5 legacy webview tool names stay in the list (still registered, now registry-served).
|
|
55
|
+
- **Divergence (documented):** the only allowed unification is the local-vs-remote error asymmetry. For `evaluate`/`resize` a runtime error throws an `OperationError` (HTTP 400 `{ ok:false, error }` — no `webview` field, vs the legacy handler's `{ ok:false, error, webview }`); the MCP result is byte-identical (isError + bare message) because the HTTP invoker reads only `error`. No test asserts that error body shape.
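The runner-injection pattern described above reduces to a module-level slot with a setter and a throwing getter. In this sketch the `WebviewRunner` method signatures and the `WebviewStatus` shape are assumptions (and synchronous for brevity). Only the names `WebviewRunner`, `setWebviewRunner`, and `getWebviewRunner` come from the plan.

```typescript
// Assumed status shape; `lastError` is mentioned in the plan, the rest is illustrative.
interface WebviewStatus { running: boolean; lastError?: string; }

interface WebviewRunner {
  status(): WebviewStatus;
  start(): WebviewStatus;
  stop(): WebviewStatus;
  resize(width: number, height: number): WebviewStatus;
  evaluate(expression: string): unknown;
}

// Module-level slot: operations/ never imports server.ts, it only reads this.
let injectedRunner: WebviewRunner | null = null;

function setWebviewRunner(runner: WebviewRunner): void {
  injectedRunner = runner;
}

function getWebviewRunner(): WebviewRunner {
  if (!injectedRunner) {
    throw new Error("webview runner not injected; call setWebviewRunner() first");
  }
  return injectedRunner;
}

// An op handler depends on the interface only, never on the server module.
function webviewStatusOp(): WebviewStatus {
  return getWebviewRunner().status();
}
```

The server side would call `setWebviewRunner({ ... })` once at module load with the real functions. Tests can inject a fake runner the same way.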
|
|
56
|
+
|
|
57
|
+
## Wave 4 — external / built-content apps (DONE) — the reversed "poor fit"
|
|
58
|
+
|
|
59
|
+
Migrate the 3 deferred external/built-content tools to the registry. The deferral was wrong on reflection: `executeOperation` is async (the long-running web-artifact build fits — its "long-running" caveat is about MCP-client timeouts, not registry fit), and the runtimes are server-independent **domain modules**, not `server.ts`. So the op handlers call them DIRECTLY — no runner injection.
|
|
60
|
+
|
|
61
|
+
- **Server-independence (verified):** `mcp-app-runtime.ts` (`openMcpApp` as `openExternalMcpApp` / `closeMcpAppSession`), `diagram-presets.ts` (`buildExcalidrawOpenMcpAppInput` / `ensureExcalidrawCheckpointId` / `isExcalidrawCreateView`), and `ext-app-lookup.ts` (`findCanvasExtAppNodeId`) import no `server.ts`/`index.ts`. `web-artifacts.ts` had ONE coupling — `import { emitPrimaryWorkbenchEvent } from './server.js'` used in `openWebArtifactInCanvas`. Switched it to the already-injected `emitCanvasLayoutUpdate` (exported from `canvas-operations.ts`, wired by `server.ts` via `setCanvasLayoutUpdateEmitter` to the same `emitPrimaryWorkbenchEvent('canvas-layout-update', { layout })`). `web-artifacts.ts` is now server-independent → no runner injection, no import cycle (`operations/ops/app.ts → web-artifacts.ts → canvas-operations.ts`, never `server.ts`).
|
|
62
|
+
- **`src/server/operations/ops/app.ts`** — 3 `mutates:false` ops + a shared `openMcpAppCore(input, ctx)`:
|
|
63
|
+
- **`mcpapp.open`** → `canvas_open_mcp_app` → `POST /api/canvas/mcp-app/open`. The relocated legacy SDK `openMcpApp` body: `openExternalMcpApp`, the `Date.now()-Math.random()` `toolCallId`, prior-session `closeMcpAppSession`, the Excalidraw checkpoint tagging, `ext-app-open` + `ext-app-result` via `ctx.emit` (→ the registry emitter → `emitPrimaryWorkbenchEvent`), node-id resolution via `findCanvasExtAppNodeId`. Returns `{ ok, id?, nodeId, toolCallId, sessionId, resourceUri }` byte-identical.
|
|
64
|
+
  - **`diagram.open`** → `canvas_add_diagram` → `POST /api/canvas/diagram`. Thin preset: `buildExcalidrawOpenMcpAppInput`, then delegates to `openMcpAppCore` (the SSE pair fires ONCE; `diagram.open` does not re-emit). Same return shape.
|
|
65
|
+
  - **`webartifact.build`** → `canvas_build_web_artifact` → `POST /api/canvas/web-artifact`. Async handler awaits `buildWebArtifactOnCanvas` and returns the byte-identical metadata envelope `{ ok, path, bytes, projectPath, openedInCanvas, startedAt, completedAt, durationMs, timeoutMs, id?, nodeId, url, metadata, logs, stdout?, stderr? }`. A long-running (minutes) build is fine for an async op, so no timeouts were added. `projectPath`/`outputPath` are sandboxed via `web-artifacts.ts` `resolveWorkspacePath` (the legacy HTTP `resolveWorkspacePath` + MCP `safeWorkspacePath` unified to one server-side check).
|
|
66
|
+
- **SDK stays public:** `PmxCanvas.openMcpApp` / `addDiagram` delegate to `executeOperation('mcpapp.open' | 'diagram.open', input)` (cast to `OpenMcpAppCoreResult`) — the same single execution path, so the ext-app-* frames fire once via `ctx.emit`. `PmxCanvas.buildWebArtifact` calls `buildWebArtifactOnCanvas` directly (its documented return is the full `WebArtifactCanvasBuildResult`, not the wire envelope; the op core IS the same runtime, so no divergence).
|
|
67
|
+
- **Composite** `canvas_app` (additive): `open-mcp-app` → `mcpapp.open`, `diagram` → `diagram.open`, `build-artifact` → `webartifact.build`. Deprecation prefixes auto-derive via `buildCompositeDeprecationNotes`.
|
|
68
|
+
- **Deleted legacy:** the 3 MCP tool blocks (`mcp/server.ts`) + the now-orphaned `safeWorkspacePath`/`workspaceRoot`/`isPathInside` helpers + the `node:path` import; the 3 HTTP handlers (`handleCanvasOpenMcpApp` / `handleCanvasAddDiagram` / `handleCanvasBuildWebArtifact`) + `runAndEmitOpenMcpApp` + `RunAndEmitOpenMcpAppParams` + `randomExtAppToolCallId` + `parseExternalMcpTransportConfig` + `normalizeStringRecord` + their routes + the orphaned imports (`openMcpApp`, `ExternalMcpTransportConfig`, `buildExcalidrawOpenMcpAppInput`, `buildWebArtifactOnCanvas`, `resolveWorkspacePath`) (`server.ts`); the orphaned `CanvasAccess` methods (`openMcpApp` / `addDiagram` / `buildWebArtifact` on both local + remote impls + the interface) and their type aliases (`OpenMcpAppInput` / `OpenMcpAppResult` / `AddDiagramInput` / `WebArtifactInput` / `WebArtifactResult`); the SDK's now-orphaned private `findCanvasExtAppNodeId` method + its `ext-app-lookup` import.
|
|
69
|
+
- **Freeze:** `canvas_app` is a new tool name → freeze list 82 → 83. The 3 legacy tool names stay in the list (still registered, now registry-served).
|
|
70
|
+
- **Divergence (documented):** (1) the local-vs-remote error asymmetry — `mcpapp.open`'s node-precondition failures throw `OperationError` (404 missing node, 400 non-ext-app node), which the legacy HTTP handler returned as explicit 404/400 and the legacy SDK threw as a plain `Error`; over MCP both become a bare-message isError. (2) The canonical core is the SDK shape: the legacy HTTP `runAndEmitOpenMcpApp` returned two extra fields (`serverName`, `toolName`) and used the existing-node title as a fallback for in-place updates; the unified op returns the SDK shape (no `serverName`/`toolName`) and uses `opened.tool.title ?? opened.tool.name` for the title. No test asserts those dropped fields or the in-place title fallback.
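The error asymmetry in divergence (1) can be sketched as follows. `OperationError` and the 404/400 cases come from the bullet above; `CanvasNode`, `requireExtAppNode`, `toHttpResponse`, `toMcpResult`, the `'ext-app'` type string, and the error message wording are hypothetical and only illustrate how one thrown error is rendered differently per surface.

```typescript
// Error carrying an HTTP-style status, as the op's precondition checks are
// described to throw. The constructor signature is an assumption.
class OperationError extends Error {
  constructor(message: string, readonly status: number) {
    super(message);
    this.name = 'OperationError';
  }
}

interface CanvasNode {
  id: string;
  type: string;
}

// Hypothetical precondition check mirroring the 404 / 400 cases.
function requireExtAppNode(nodes: Map<string, CanvasNode>, nodeId: string): CanvasNode {
  const node = nodes.get(nodeId);
  if (!node) throw new OperationError(`Node ${nodeId} not found`, 404);
  if (node.type !== 'ext-app') {
    throw new OperationError(`Node ${nodeId} is not an ext-app node`, 400);
  }
  return node;
}

// HTTP surface: the status code survives.
function toHttpResponse(run: () => unknown): { status: number; body: unknown } {
  try {
    return { status: 200, body: run() };
  } catch (err) {
    if (err instanceof OperationError) {
      return { status: err.status, body: { ok: false, error: err.message } };
    }
    throw err;
  }
}

// MCP surface: the status is dropped; only the bare message and isError remain.
function toMcpResult(run: () => unknown): { isError?: true; text: string } {
  try {
    return { text: JSON.stringify(run()) };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { isError: true, text: message };
  }
}
```

Under these assumptions a missing node yields status 404 over HTTP but only `isError` plus the message over MCP, which is the asymmetry the bullet records.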
|
|
71
|
+
|
|
72
|
+
## Wave 5 — the final fold (DONE) — the reversed "needs input injection" verdict
|
|
73
|
+
|
|
74
|
+
Fold the 3 deferred html/webpage tools. The key discovery: the assumption that a fold needs a NEW composite action plus a per-action INPUT-INJECTION mechanism the composite layer lacks was **wrong**. The registry `node.add` / `node.update` ops already absorb the behaviors via plain params, and the `canvas_node` composite already exposes those params (its add/update action schemas derive from `node.add`/`node.update`'s `inputShape` + `extraShape`).
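To illustrate the point that a composite action needs no input injection when its schema derives from the underlying op, here is a dependency-free sketch. The names `inputShape`, `extraShape`, `node.add`, `node.update`, and the `add`/`update` actions come from the text; `Shape`, `FieldKind`, `describeAction`, `routeCompositeCall`, and the specific field lists are simplified assumptions (the real shapes are Zod schemas, which are not reproduced here).

```typescript
// Simplified stand-in for a schema shape: field name -> kind of value.
type FieldKind = 'string' | 'boolean' | 'object';
type Shape = Record<string, FieldKind>;

interface OperationDef {
  name: string;
  inputShape: Shape;
  extraShape?: Shape;
}

// Illustrative subsets of the fields named in the changelog.
const nodeAdd: OperationDef = {
  name: 'node.add',
  inputShape: { type: 'string', title: 'string', html: 'string', primitive: 'string' },
  extraShape: { axCapabilities: 'object' },
};

const nodeUpdate: OperationDef = {
  name: 'node.update',
  inputShape: { id: 'string', refresh: 'boolean' },
};

const canvasNodeActions: Record<string, OperationDef> = {
  add: nodeAdd,
  update: nodeUpdate,
};

// The action schema is derived, not hand-written: inputShape plus extraShape.
function describeAction(action: string): string[] {
  const op = canvasNodeActions[action];
  if (!op) throw new Error(`Unknown action: ${action}`);
  return Object.keys({ ...op.inputShape, ...(op.extraShape ?? {}) });
}

// Routing strips `action` and forwards the remaining params unchanged, so
// anything the op accepts is reachable through the composite.
function routeCompositeCall(
  args: { action: string } & Record<string, unknown>,
): { op: string; input: Record<string, unknown> } {
  const { action, ...input } = args;
  const op = canvasNodeActions[action];
  if (!op) throw new Error(`Unknown action: ${action}`);
  return { op: op.name, input };
}
```

With this arrangement, a call such as `{ action: 'add', type: 'html', primitive: 'choice-grid' }` reaches `node.add` with plain params, which is the behavior the fold relies on.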
|
|
75
|
+
|
|
76
|
+
- **Verified equivalence (file:line):**
|
|
77
|
+
  - `node.add` shape (`src/server/operations/ops/nodes.ts` `nodeAddShape`, lines ~677–688) already carries `html` / `primitive` / `kind` / `presentation` / `slideTitles` / `embeddedNodeIds` / `embeddedUrls` / `summary` / `agentSummary` / `description`; it is a `z.looseObject`, so `strictSize` / `axCapabilities` pass through.
|
|
78
|
+
  - The `node.add` handler routes `type:"html"` + `primitive|kind` → `createHtmlPrimitiveNode` (line ~768) and bare `type:"html"` → `createBasicCanvasNode`, which MERGES the top-level html fields into node data (lines ~387–399), exactly mirroring the legacy SDK `addHtmlNode` / `addHtmlPrimitive`.
|
|
79
|
+
  - `node.update` shape already has `refresh` (line ~860); the handler routes `webpage` + `refresh === true` → `refreshCanvasWebpageNode` (lines ~923–934).
|
|
80
|
+
- **Deprecate-only (html, 2 of 3):** prefix each standalone tool description in `src/mcp/server.ts` with `Deprecated: use canvas_node …` (matching the auto-derived composite-note wording). `canvas_add_html_node` → `action:"add", type:"html"`; `canvas_add_html_primitive` → `action:"add", type:"html", primitive:"<kind>"`. **Direct prefix, not `buildCompositeDeprecationNotes`:** that helper is keyed by registry-OPERATION name and only reaches registry-served tools; these 2 are hand-written tools, so a direct description prefix is the correct one-place edit. **No new action, no mechanism, no op/SDK change, no freeze-count change** (the 2 tools stay registered, just annotated; `canvas_node` is already frozen).
|
|
81
|
+
- **GAP found AND closed (refresh):** the audit surfaced that `refresh:true` IS reachable via `canvas_node {action:"update", refresh:true}` but the RESULT diverged on the failure path. `refreshCanvasWebpageNode` returns `{ ok, id, error? }` (no `node` field); `node.update`'s `formatResult` read `body.node`, found none, and returned a hardcoded `{ ok:true, id }`. Over the MCP-default **`LocalOperationInvoker`** the handler result is returned WITHOUT throwing (the HTTP `status: => 400` mapping never runs locally), so a FAILED refresh surfaced as `{ ok:true, id }` with **no `isError`** — a false success (a live bug reachable today, independent of any deprecation). **Fix:** `node.update`'s `formatResult` now, when there is no `node`, passes the body through verbatim and sets `isError` when `body.ok === false` — matching the HTTP 400 and the legacy `canvas_refresh_webpage_node` (`{ ok:false, id, error }` + `isError`). With the failure path equivalent, the standalone is deprecated → `canvas_node {action:"update", refresh:true}`. `axCapabilities` was also added to `nodeAddShape` + `node.add`'s `extraShape` so the html AX-bridge config is **advertised** (it previously only passed through `z.looseObject`, invisible to schema-guided agents migrating off `canvas_add_html_node`).
|
|
82
|
+
- **Parity tests** (`tests/unit/mcp-composites.test.ts`, head-to-head pattern): (1) `canvas_node add type:"html"` vs `canvas_add_html_node` — same node type/title + `data.presentation`/`slideTitles`/`html`/`embeddedNodeIds`/`embeddedUrls`/`axCapabilities` read back via `canvas_node get full:true`; (2) `canvas_node add type:"html" primitive:"choice-grid" strictSize` vs `canvas_add_html_primitive` — same `type:"html"` + `data.htmlPrimitive === kind` + `data.strictSize`; (3) `canvas_node update refresh:true` vs `canvas_refresh_webpage_node` on the FAILURE path (a connection-refused `http://127.0.0.1:1` — deterministic, no network egress) — both `isError` + `ok:false`.
|
|
83
|
+
- **Remaining legacy after Wave 5:** only `canvas_snapshot` (v0.3 name collision — the save-snapshot tool still holds the name) and `canvas_screenshot` (binary image payload, intentionally standalone). Every other n-way-duplicated operation is now registry-backed and folded into a composite.
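The refresh gap and its fix can be sketched as a before/after pair. The body shape `{ ok, id, error? }`, the missing `node` field, the hardcoded `{ ok:true, id }`, and the pass-through plus `isError` rule come from the bullet above; `UpdateBody`, `FormattedResult`, and both function names are hypothetical, and the "before" version is a reconstruction from the description rather than the original source.

```typescript
interface UpdateBody {
  ok: boolean;
  id: string;
  node?: { id: string; type: string };
  error?: string;
}

interface FormattedResult {
  payload: UpdateBody;
  isError?: true;
}

// Before (reconstructed): with no `node`, success was hardcoded, so a failed
// refresh was reported as { ok: true, id } without isError.
function formatResultBefore(body: UpdateBody): FormattedResult {
  if (body.node) return { payload: { ok: true, id: body.id, node: body.node } };
  return { payload: { ok: true, id: body.id } };
}

// After: with no `node`, pass the body through verbatim and flag failures.
function formatResultAfter(body: UpdateBody): FormattedResult {
  if (body.node) return { payload: { ok: true, id: body.id, node: body.node } };
  return body.ok === false ? { payload: body, isError: true } : { payload: body };
}
```

The difference only shows on the failure path: for a body such as `{ ok: false, id, error }` the old formatter reports success, while the new one returns the same body with `isError` set.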
|
|
84
|
+
|
|
85
|
+
## Verification (every wave)
|
|
86
|
+
|
|
87
|
+
1. `bun run typecheck`
|
|
88
|
+
2. Targeted: `operation-parity`, `mcp-tool-freeze`, `mcp-server`, `mcp-composites`, `server-api`, `cli-node`, `canvas-operations`, `pmx-canvas-sdk` (+ the batch/webview/validate suites)
|
|
89
|
+
3. Full `bun test tests/unit`
|
|
90
|
+
4. Guard tests (operation-parity / mcp-tool-freeze / mcp-server) edited only deliberately; wire shapes + tool names byte-compatible; `operations/` never imports `server.ts`/`index.ts`.
|
|
91
|
+
5. `dist/types` regenerated before the PR.
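The import-boundary rule in step 4 could be checked with something like the sketch below. This is not the project's actual guard test, which is not shown here; `FORBIDDEN_MODULES`, `resolveRelative`, and `findForbiddenImports` are hypothetical, and the `src/server/` layout is inferred from the paths quoted earlier in this changelog. The check resolves each relative specifier so that an import of `operations/index` is not mistaken for the server's `index.ts`.

```typescript
// Modules that files under operations/ must not import (extension stripped).
// The src/server/ layout is an assumption based on paths quoted in this log.
const FORBIDDEN_MODULES = ['src/server/server', 'src/server/index'];

// Resolve a relative specifier against the importing file, without node:path.
function resolveRelative(fromFile: string, specifier: string): string {
  const parts = fromFile.split('/').slice(0, -1); // directory of the importing file
  for (const segment of specifier.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') parts.pop();
    else parts.push(segment);
  }
  return parts.join('/').replace(/\.(?:js|ts)$/, '');
}

// Return the relative specifiers in `source` that resolve to a forbidden module.
function findForbiddenImports(fromFile: string, source: string): string[] {
  // Matches `from '...'`, side-effect `import '...'`, and dynamic `import('...')`.
  const importRe = /\b(?:from|import)\s*\(?\s*['"]([^'"]+)['"]/g;
  const hits: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = importRe.exec(source)) !== null) {
    const specifier = match[1];
    if (!specifier || !specifier.startsWith('.')) continue; // skip bare packages
    if (FORBIDDEN_MODULES.indexOf(resolveRelative(fromFile, specifier)) !== -1) {
      hits.push(specifier);
    }
  }
  return hits;
}
```

A regex scan like this is only a heuristic (it would also match import-like text inside comments or strings), so a real guard might prefer a parser-based check.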
|
package/docs/screenshot.png
CHANGED
|
Binary file
|