pmx-canvas 0.1.36 → 0.2.0
- package/CHANGELOG.md +409 -0
- package/Readme.md +2 -2
- package/dist/json-render/index.js +89 -334
- package/dist/types/mcp/canvas-access.d.ts +5 -171
- package/dist/types/server/ax-state-manager.d.ts +256 -0
- package/dist/types/server/ax-state.d.ts +1 -1
- package/dist/types/server/canvas-operations.d.ts +1 -12
- package/dist/types/server/canvas-state.d.ts +3 -23
- package/dist/types/server/index.d.ts +6 -24
- package/dist/types/server/operations/composites.d.ts +121 -0
- package/dist/types/server/operations/http.d.ts +7 -0
- package/dist/types/server/operations/index.d.ts +8 -0
- package/dist/types/server/operations/invoker.d.ts +13 -0
- package/dist/types/server/operations/mcp.d.ts +15 -0
- package/dist/types/server/operations/ops/annotation.d.ts +2 -0
- package/dist/types/server/operations/ops/app.d.ts +33 -0
- package/dist/types/server/operations/ops/ax-await.d.ts +2 -0
- package/dist/types/server/operations/ops/ax-shared.d.ts +31 -0
- package/dist/types/server/operations/ops/ax-state.d.ts +2 -0
- package/dist/types/server/operations/ops/ax-timeline.d.ts +2 -0
- package/dist/types/server/operations/ops/ax-work.d.ts +2 -0
- package/dist/types/server/operations/ops/batch.d.ts +19 -0
- package/dist/types/server/operations/ops/edges.d.ts +2 -0
- package/dist/types/server/operations/ops/groups.d.ts +2 -0
- package/dist/types/server/operations/ops/json-render.d.ts +31 -0
- package/dist/types/server/operations/ops/nodes.d.ts +62 -0
- package/dist/types/server/operations/ops/query.d.ts +2 -0
- package/dist/types/server/operations/ops/snapshots.d.ts +2 -0
- package/dist/types/server/operations/ops/validate.d.ts +2 -0
- package/dist/types/server/operations/ops/viewport.d.ts +2 -0
- package/dist/types/server/operations/ops/webview.d.ts +2 -0
- package/dist/types/server/operations/registry.d.ts +15 -0
- package/dist/types/server/operations/types.d.ts +116 -0
- package/dist/types/server/operations/webview-runner.d.ts +69 -0
- package/docs/RELEASE.md +5 -0
- package/docs/adr-001-bun-only-runtime.md +46 -0
- package/docs/api-stability.md +57 -0
- package/docs/ax-state-contract.md +72 -0
- package/docs/mcp.md +60 -11
- package/docs/plans/plan-005-operation-registry.md +84 -0
- package/docs/plans/plan-006-mcp-tool-consolidation.md +109 -0
- package/docs/plans/plan-007-ax-domain.md +99 -0
- package/docs/plans/plan-008-registry-finish.md +91 -0
- package/docs/tech-debt-assessment-2026-06.md +90 -0
- package/package.json +3 -3
- package/skills/pmx-canvas/SKILL.md +192 -186
- package/skills/pmx-canvas/evals/evals.json +3 -3
- package/skills/pmx-canvas/references/codex-app-adapter.md +13 -14
- package/skills/pmx-canvas/references/github-copilot-app-adapter.md +4 -5
- package/src/cli/agent.ts +52 -31
- package/src/mcp/canvas-access.ts +30 -830
- package/src/mcp/server.ts +162 -2014
- package/src/server/ax-state-manager.ts +808 -0
- package/src/server/ax-state.ts +2 -2
- package/src/server/canvas-operations.ts +2 -328
- package/src/server/canvas-schema.ts +2 -2
- package/src/server/canvas-state.ts +95 -465
- package/src/server/index.ts +54 -190
- package/src/server/operations/composites.ts +355 -0
- package/src/server/operations/http.ts +103 -0
- package/src/server/operations/index.ts +65 -0
- package/src/server/operations/invoker.ts +87 -0
- package/src/server/operations/mcp.ts +221 -0
- package/src/server/operations/ops/annotation.ts +60 -0
- package/src/server/operations/ops/app.ts +447 -0
- package/src/server/operations/ops/ax-await.ts +216 -0
- package/src/server/operations/ops/ax-shared.ts +38 -0
- package/src/server/operations/ops/ax-state.ts +249 -0
- package/src/server/operations/ops/ax-timeline.ts +381 -0
- package/src/server/operations/ops/ax-work.ts +635 -0
- package/src/server/operations/ops/batch.ts +365 -0
- package/src/server/operations/ops/edges.ts +166 -0
- package/src/server/operations/ops/groups.ts +176 -0
- package/src/server/operations/ops/json-render.ts +691 -0
- package/src/server/operations/ops/nodes.ts +1047 -0
- package/src/server/operations/ops/query.ts +281 -0
- package/src/server/operations/ops/snapshots.ts +366 -0
- package/src/server/operations/ops/validate.ts +37 -0
- package/src/server/operations/ops/viewport.ts +219 -0
- package/src/server/operations/ops/webview.ts +339 -0
- package/src/server/operations/registry.ts +79 -0
- package/src/server/operations/types.ts +150 -0
- package/src/server/operations/webview-runner.ts +77 -0
- package/src/server/server.ts +158 -2255
- package/src/server/web-artifacts.ts +6 -2
package/docs/adr-001-bun-only-runtime.md
ADDED
|
@@ -0,0 +1,46 @@
|
|
|
1
|
+
# ADR-001: Bun-only runtime, MCP + HTTP as the universal surfaces
|
|
2
|
+
|
|
3
|
+
**Status:** Accepted
|
|
4
|
+
**Date:** 2026-06-12
|
|
5
|
+
**Context for:** v0.2 stability release (docs/tech-debt-assessment-2026-06.md, Phase 2)
|
|
6
|
+
|
|
7
|
+
## Context
|
|
8
|
+
|
|
9
|
+
pmx-canvas ships as TypeScript source executed directly by Bun. `package.json` points `main`/`exports` at `src/server/index.ts`, the `bin` entry is `src/cli/index.ts` with a `#!/usr/bin/env bun` shebang, and `engines` requires `bun >= 1.3.14`. There is no compiled JS distribution; `dist/` carries only the client bundle, the json-render viewer, and type declarations.
|
|
10
|
+
|
|
11
|
+
The runtime dependence on Bun is not incidental. `Bun.serve` is the HTTP + SSE server, `bun:sqlite` is the persistence layer, `Bun.WebView` backs the screenshot/evaluate automation tools, and `bun test` is the test runner. These are load-bearing APIs across `src/server/`, not shims that a bundler could paper over.
|
|
12
|
+
|
|
13
|
+
The recurring question is whether v0.2 should add a Node-compatible dual build (ESM + CJS dist, ported server internals) so Node projects can `import 'pmx-canvas'`. The tech debt assessment already leaned no; this ADR makes the decision explicit and binding.
|
|
14
|
+
|
|
15
|
+
The decisive observation: nobody integrates with pmx-canvas by importing it. Agents connect over MCP (stdio). Everything else (scripts, harnesses, CI, other languages) talks HTTP + SSE on localhost. The in-process SDK is a convenience for Bun-native tooling and our own CLI, not the distribution channel. A Node build would be sustained effort (porting `Bun.serve`, replacing `bun:sqlite`, a second test matrix, a real build step where today there is none) spent on the least differentiated integration path, while the operation registry refactor is actively consolidating the surfaces that do matter.
|
|
16
|
+
|
|
17
|
+
## Decision
|
|
18
|
+
|
|
19
|
+
pmx-canvas stays Bun-only for the SDK and runtime.
|
|
20
|
+
|
|
21
|
+
1. MCP (stdio) and the HTTP API are the universal integration surfaces. They are runtime-agnostic by construction: any client that can spawn a process or open a socket can use them.
|
|
22
|
+
2. No Node dual-build (ESM + CJS dist of the server/SDK) will be produced. The package continues to ship TypeScript source executed by Bun.
|
|
23
|
+
3. The programmatic SDK (`import { createCanvas } from 'pmx-canvas'`) is documented as Bun-runtime-only.
|
|
24
|
+
4. Bun-specific APIs (`Bun.serve`, `bun:sqlite`, `Bun.WebView`, `bun test`) remain first-class; no compatibility shims are added for hypothetical Node hosting.
|
|
25
|
+
|
|
26
|
+
## Consequences
|
|
27
|
+
|
|
28
|
+
The honest costs:
|
|
29
|
+
|
|
30
|
+
- **Node-based programmatic consumers cannot import the package.** A Node project that wants canvas access must run the server (any of: `bunx pmx-canvas`, a daemon, MCP auto-start) and integrate over HTTP or MCP. This is a real limitation for anyone wanting in-process embedding from Node, and we are accepting it deliberately.
|
|
31
|
+
- **`bin` requires bun on PATH.** The CLI shebang is `#!/usr/bin/env bun`. `npm install -g pmx-canvas` succeeds but the binary fails at invocation on a machine without bun.
|
|
32
|
+
- **npx-style installation has a sharp edge.** `npx pmx-canvas` fetches the package fine but execution still resolves the bun shebang; without bun installed it fails with a confusing error rather than a clear "install bun" message. `bunx pmx-canvas` is the canonical one-shot command and docs must lead with it. The README and CLI install docs should state the bun prerequisite up front, and a fast preflight check in `src/cli/index.ts` that prints an actionable message when bun is missing is cheap insurance (worth doing, not required by this ADR).
|
|
33
|
+
- **MCP client configs must spawn bun.** MCP server entries point at `bunx pmx-canvas --mcp` (or a bun invocation), not `node`/`npx`. Example configs in docs must be consistent about this.
|
|
34
|
+
- **We forgo the npm-ecosystem long tail.** Some integrations will never happen because `import` was the only path their authors would take. We judge that tail small relative to the agents-over-MCP center of mass.
|
|
35
|
+
|
|
36
|
+
What we gain: zero build step for the server, one runtime to test against, continued free use of `bun:sqlite` and `Bun.serve` without abstraction layers, and engineering time pointed at the operation registry and AX surface instead of distribution plumbing.
|
|
37
|
+
|
|
38
|
+
## Alternatives considered
|
|
39
|
+
|
|
40
|
+
- **Full Node dual-build (ESM + CJS dist).** Requires replacing `Bun.serve` (Hono/Express adapter), `bun:sqlite` (better-sqlite3, a native dependency with its own install pain), and dropping or forking `Bun.WebView`. Doubles the test matrix permanently. Rejected: high sustained cost on the path with the least demand.
|
|
41
|
+
- **Node-compatible SDK client only (thin HTTP wrapper published for Node).** Cheaper, but it is just a typed fetch client; any consumer can write one in an afternoon, and an official one creates a second public surface to version and freeze. Rejected for v0.2; can be revisited if real demand appears, without violating this ADR (the server stays Bun-only either way).
|
|
42
|
+
- **Compile-to-single-binary (`bun build --compile`).** Solves "bun on PATH" for the CLI but not programmatic import, adds per-platform release artifacts, and complicates the MCP spawn story. Out of scope for v0.2; does not change this decision.
|
|
43
|
+
|
|
44
|
+
## Revisit triggers
|
|
45
|
+
|
|
46
|
+
Reopen this ADR if (a) a major MCP host platform cannot spawn bun, or (b) repeated, concrete integration requests arrive that HTTP/MCP genuinely cannot serve (in-process embedding with shared memory, for example). Absent those, Bun-only stands.
|
package/docs/api-stability.md
ADDED
|
@@ -0,0 +1,57 @@
|
|
|
1
|
+
# API Stability Contract (v0.2)
|
|
2
|
+
|
|
3
|
+
**Status:** Accepted, effective from v0.2.0
|
|
4
|
+
**Date:** 2026-06-12
|
|
5
|
+
**Context:** docs/tech-debt-assessment-2026-06.md item 7 (breaking patch releases, no deprecation path) and Phase 2 of the direction proposal. See also docs/adr-001-bun-only-runtime.md for which surfaces are universal.
|
|
6
|
+
|
|
7
|
+
The problem this fixes: 0.1.35 and 0.1.36 both changed HTTP contract behavior in patch releases. Consumers could not pin safely. From v0.2.0, they can.
|
|
8
|
+
|
|
9
|
+
## Public surfaces
|
|
10
|
+
|
|
11
|
+
These four surfaces are the contract. Anything not listed here is internal.
|
|
12
|
+
|
|
13
|
+
1. **HTTP API:** all `/api/canvas/*` routes documented in `docs/http-api.md`: method, path, request shape, response shape, and status codes.
|
|
14
|
+
2. **MCP surface:** tool names, tool input schemas (field names, types, required/optional status), and the fixed resource URIs (`canvas://layout`, `canvas://pinned-context`, and the rest of the frozen 14). Per-skill resources (`canvas://skills/<name>`) track the `skills/` directory and are explicitly not frozen by name.
|
|
15
|
+
3. **CLI:** the `pmx-canvas` subcommands and flags documented in `docs/cli.md` (including `serve`, `--mcp`, `--port`, `--theme`), their argument shapes, and their output formats where documented as machine-readable.
|
|
16
|
+
4. **SDK:** the exports of the package entry (`src/server/index.ts` via the `exports` map): the `PmxCanvas` class surface, `createCanvas`, and the exported types and helpers. Bun runtime only, per ADR-001.
|
|
17
|
+
|
|
18
|
+
## Policy
|
|
19
|
+
|
|
20
|
+
We are in 0.x semver, and we use it honestly:
|
|
21
|
+
|
|
22
|
+
- **Minor versions (0.2 → 0.3) may break public surfaces.** Breaking changes are allowed only at minor boundaries.
|
|
23
|
+
- **Patch versions never break public surfaces.** A patch may fix bugs, tighten validation of inputs that were never accepted as documented, and add purely additive fields. If a documented request that worked stops working, or a documented response shape changes, that is not a patch.
|
|
24
|
+
- **Every breaking change gets a CHANGELOG entry under a `### Breaking` heading before release.** Not after, not in a follow-up. The release checklist in docs/RELEASE.md treats a breaking change without that heading as a release blocker.
|
|
25
|
+
- **MCP tool names are frozen by `tests/unit/mcp-tool-freeze.test.ts`.** The test pins the literal tool-name list and the fixed resource URIs. Renaming or removing a tool requires editing that test in the same commit, which makes the break deliberate and reviewable rather than accidental. If you find yourself updating the freeze test, you owe a `### Breaking` entry and a minor version.
|
|
26
|
+
- **Additive changes are always allowed:** new tools, new routes, new optional input fields, new response fields. Consumers must tolerate unknown fields in responses.
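The "consumers must tolerate unknown fields" rule can be sketched from the consumer side: read only the fields you depend on and ignore the rest, so an additive release cannot break parsing. The `NodeSummary` shape and its field names are hypothetical and are not taken from `docs/http-api.md`.

```typescript
// Hypothetical response shape; the field names are illustrative only.
interface NodeSummary {
  id: string;
  title: string;
}

// Pick the known fields and ignore everything else, so response fields
// added in a later release do not break this consumer.
function parseNodeSummary(raw: unknown): NodeSummary {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("expected an object");
  }
  const rec = raw as Record<string, unknown>;
  if (typeof rec.id !== "string" || typeof rec.title !== "string") {
    throw new Error("missing required field: id or title");
  }
  return { id: rec.id, title: rec.title };
}

// An extra field (here `addedLater`) is dropped rather than rejected.
const parsed = parseNodeSummary({ id: "n1", title: "Notes", addedLater: 42 });
console.log(parsed);
```

A strict parser that rejects unknown keys would turn every additive change on the server into a break on the client, which is the failure mode the policy is meant to rule out.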
|
|
27
|
+
|
|
28
|
+
## Deprecation
|
|
29
|
+
|
|
30
|
+
A public surface is marked deprecated at least one minor version before removal. Concretely:
|
|
31
|
+
|
|
32
|
+
1. Mark it in the docs (`docs/http-api.md`, `docs/mcp.md`, `docs/cli.md`, or `docs/sdk.md`) and in the MCP tool description where applicable.
|
|
33
|
+
2. Record it in the CHANGELOG under `### Deprecated` in the minor that deprecates it.
|
|
34
|
+
3. Remove it no earlier than the next minor, with a `### Breaking` entry naming the replacement.
|
|
35
|
+
|
|
36
|
+
So a tool deprecated in 0.2.x survives all of 0.2.x and may be removed in 0.3.0. Plan-006 (MCP tool consolidation) is the first consumer of this mechanism.
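The deprecation window can be written down as a small version check. This is a sketch of the stated rule only; `earliestRemoval` and `mayRemoveIn` are hypothetical helper names, and treating a major bump as also permitting removal is an assumption.

```typescript
// Parse "0.2.3" into numeric parts; throws on malformed input.
function parseVersion(v: string): [number, number, number] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(v);
  if (!m) throw new Error(`not an x.y.z version: ${v}`);
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

// Earliest release that may remove a surface deprecated in `deprecatedIn`:
// the first release of the next minor.
function earliestRemoval(deprecatedIn: string): string {
  const [major, minor] = parseVersion(deprecatedIn);
  return `${major}.${minor + 1}.0`;
}

// True if `candidate` is allowed to drop a surface deprecated in `deprecatedIn`.
// Assumption: a higher major version also counts as past the window.
function mayRemoveIn(deprecatedIn: string, candidate: string): boolean {
  const [dMajor, dMinor] = parseVersion(deprecatedIn);
  const [cMajor, cMinor] = parseVersion(candidate);
  return cMajor > dMajor || (cMajor === dMajor && cMinor > dMinor);
}

console.log(earliestRemoval("0.2.4"));
console.log(mayRemoveIn("0.2.0", "0.2.9"));
```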
|
|
37
|
+
|
|
38
|
+
## Explicitly out of contract
|
|
39
|
+
|
|
40
|
+
These can change in any release without notice:
|
|
41
|
+
|
|
42
|
+
- **SSE event payload internals.** The existence of the `/api/workbench/events` stream is public; the field-level shape of individual event frames is not. Build on the HTTP read endpoints, not on event internals.
|
|
43
|
+
- **Undocumented endpoints.** Anything reachable but not in `docs/http-api.md` (internal prompt/trace/theme plumbing, browser-only routes) is internal.
|
|
44
|
+
- **The `.pmx-canvas/` on-disk layout.** `canvas.db` schema, artifact directory structure, daemon pid/log files. Migrations keep old data readable; the format itself is ours to change.
|
|
45
|
+
- **Anything under `src/` not exported from the package entry.** Deep imports (`pmx-canvas/src/server/whatever`) get no stability promise even where the file layout makes them possible.
|
|
46
|
+
- **Browser UI:** DOM structure, CSS custom properties, client bundle internals.
|
|
47
|
+
|
|
48
|
+
## Enforcement: the operation registry
|
|
49
|
+
|
|
50
|
+
The contract is only as real as its single source of truth. The operation registry (`src/server/operations/`, docs/plans/plan-005-operation-registry.md) gives each canvas operation exactly one zod input schema and one handler, from which the HTTP route, MCP tool, CLI command, and SDK method derive. One schema per operation means one place where the contract lives, one diff to review when it changes, and no cross-surface drift of the kind that produced the 0.1.x breakages (the same operation behaving differently over HTTP vs local MCP access).
|
|
51
|
+
|
|
52
|
+
Two mechanical guards back the policy:
|
|
53
|
+
|
|
54
|
+
- `tests/unit/mcp-tool-freeze.test.ts`: tool names and fixed resource URIs cannot change silently.
|
|
55
|
+
- `tests/unit/operation-parity.test.ts`: migrated operations behave identically across surfaces, including tolerance of unknown input keys (schemas stay loose; strict parsing would be an invisible break).
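The idea behind a freeze guard can be sketched as a comparison between a pinned name list and the live one. This is not the content of `tests/unit/mcp-tool-freeze.test.ts`; `diffToolNames` and `isBreaking` are hypothetical helpers, and only the tool names are taken from the MCP reference.

```typescript
interface FreezeDiff {
  removed: string[]; // pinned names that no longer exist (breaking)
  added: string[];   // new names not yet pinned (additive)
}

// Compare the pinned list against the names the server currently registers.
function diffToolNames(
  frozen: readonly string[],
  current: readonly string[],
): FreezeDiff {
  const frozenSet = new Set(frozen);
  const currentSet = new Set(current);
  return {
    removed: frozen.filter((name) => !currentSet.has(name)),
    added: current.filter((name) => !frozenSet.has(name)),
  };
}

// A removal or rename breaks consumers; a pure addition does not.
function isBreaking(diff: FreezeDiff): boolean {
  return diff.removed.length > 0;
}

const frozenNames = ["canvas_add_node", "canvas_get_node"];
const liveNames = ["canvas_add_node", "canvas_node"];
console.log(diffToolNames(frozenNames, liveNames));
```

A rename shows up as one removal plus one addition, so it is flagged as breaking even though the tool count is unchanged.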
|
|
56
|
+
|
|
57
|
+
Operations not yet migrated to the registry are covered by the same policy; the registry just makes compliance cheap rather than a matter of discipline.
|
package/docs/ax-state-contract.md
ADDED
|
@@ -0,0 +1,72 @@
|
|
|
1
|
+
# AX state contract (authoritative)
|
|
2
|
+
|
|
3
|
+
The agent-experience (AX) state is split into **three partitions** with distinct
|
|
4
|
+
storage and lifecycle rules. This document is the authoritative spec for the
|
|
5
|
+
snapshot-vs-audit boundary; it is the documented module boundary for
|
|
6
|
+
`AxStateManager` (`src/server/ax-state-manager.ts`), which `CanvasStateManager`
|
|
7
|
+
holds and delegates to.
|
|
8
|
+
|
|
9
|
+
| Partition | Members | Storage | Snapshotted | Cleared by `canvas_clear` | Cleared by `restore` |
|-----------|---------|---------|:-----------:|:-------------------------:|:--------------------:|
| **Canvas-bound** | `focus`, `workItems`, `approvalGates`, `reviewAnnotations`, `elicitations`, `modeRequests`, `policy` | in-memory `_axState` + one JSON blob in the `ax_state` table | ✅ | ✅ | ✅ (replaced by the snapshot's AX) |
| **Timeline (audit-only)** | `agent-event`, `evidence-item`, `steering-message` | `ax_events` / `ax_evidence` / `ax_steering` tables, 500-row retention, sequential ids | ❌ | ❌ | ❌ |
| **Host/session** | `host-capability` | `ax_host_capabilities` table | ❌ | ❌ | ❌ |
|
|
14
|
+
|
|
15
|
+
**Rules.** Canvas-bound state travels with the canvas (snapshot / restore / clear);
|
|
16
|
+
timeline and host data are diagnostic and survive all three. Timeline rows are
|
|
17
|
+
append-only, retention-bounded (`AX_TIMELINE_RETENTION = 500` per table), and
|
|
18
|
+
read via `canvas_get_ax_timeline` / `canvas://ax-timeline`. The host-capability
|
|
19
|
+
row is reported by adapters and read via `canvas_get_ax`.
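The timeline rules (append-only, retention-bounded, sequential ids) can be sketched with an in-memory stand-in. The actual storage is the SQLite tables named above; `BoundedTimeline` is a hypothetical class, and the assumption that ids keep increasing after old rows are trimmed is ours, not stated by the contract.

```typescript
const AX_TIMELINE_RETENTION = 500; // per-table retention from the contract

interface TimelineRow<T> {
  id: number;
  payload: T;
}

// In-memory stand-in for one timeline table.
class BoundedTimeline<T> {
  private rows: TimelineRow<T>[] = [];
  private nextId = 1;
  private readonly retention: number;

  constructor(retention: number = AX_TIMELINE_RETENTION) {
    this.retention = retention;
  }

  // Append a row, then trim the oldest rows beyond the retention bound.
  append(payload: T): TimelineRow<T> {
    const row: TimelineRow<T> = { id: this.nextId++, payload };
    this.rows.push(row);
    if (this.rows.length > this.retention) {
      this.rows.splice(0, this.rows.length - this.retention);
    }
    return row;
  }

  // Rows are only ever read or appended; there is no update or delete API.
  read(): readonly TimelineRow<T>[] {
    return this.rows;
  }
}

const timeline = new BoundedTimeline<string>(3);
for (const note of ["a", "b", "c", "d", "e"]) timeline.append(note);
console.log(timeline.read().map((row) => row.id));
```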
|
|
20
|
+
|
|
21
|
+
## Read surfaces
|
|
22
|
+
|
|
23
|
+
- **Canvas-bound:** `canvas_get_ax`, `canvas://ax`, `canvas://ax-context`, `canvas://ax-work`
|
|
24
|
+
- **Timeline:** `canvas_get_ax_timeline`, `canvas://ax-timeline`, `canvas://ax-pending-steering`, `canvas://ax-delivery`
|
|
25
|
+
- **Host:** `canvas_get_ax`
|
|
26
|
+
|
|
27
|
+
## Node-deletion semantics (soft-orphan + audit)
|
|
28
|
+
|
|
29
|
+
When a node is removed, the canvas-bound partition is re-normalized against the
|
|
30
|
+
surviving node set (`AxStateManager.revalidateAfterNodeRemoval`):
|
|
31
|
+
|
|
32
|
+
- **Work items / approval gates / elicitations / mode requests** that referenced
|
|
33
|
+
the deleted node keep the item but **strip the dangling node id** ("re-anchored").
|
|
34
|
+
The data semantics are soft-orphan: the work is not destroyed.
|
|
35
|
+
- **Node-anchored review annotations** (`anchorType: 'node'`) for the deleted node
|
|
36
|
+
are **dropped entirely** ("removed") — they are meaningless without their node.
|
|
37
|
+
|
|
38
|
+
This re-normalization was previously **silent**. It now records exactly one
|
|
39
|
+
auditable **timeline** event when (and only when) something was actually affected:
|
|
40
|
+
|
|
41
|
+
```
kind: 'note'
source: 'system'
summary: 'Node "<title>" deleted — re-anchored N AX item(s),
          removed M node-anchored review annotation(s). [(focus anchor cleared)]'
data: {
  systemEvent: 'ax-node-orphan',
  removedNodeId: '<node id>',
  reanchoredIds: [ ...work/gate/elicitation/mode ids... ],
  removedReviewIds: [ ...review annotation ids... ],
  reanchoredFocus: <boolean>,   // true if focus.nodeIds referenced the deleted node
}
```
|
|
54
|
+
|
|
55
|
+
The audit lives in the **timeline** (audit partition) — correct per the contract:
|
|
56
|
+
it is diagnostic continuity, not canvas-bound state, so it survives clear/restore
|
|
57
|
+
and is not part of any snapshot. `recordAxEvent` is timeline-only and does not
|
|
58
|
+
re-enter the canvas-bound normalization path, so there is no recursion.
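The soft-orphan rule and the "audit only when something was affected" condition can be sketched as a pure function. This is an illustration, not the body of `AxStateManager.revalidateAfterNodeRemoval`: the `AnchoredItem`, `ReviewAnnotation`, and `CanvasBoundAx` shapes are hypothetical, focus handling is omitted, and only the audit field names come from the event shape documented here.

```typescript
// Hypothetical, simplified shapes for the canvas-bound partition.
interface AnchoredItem { id: string; nodeIds: string[] }
interface ReviewAnnotation { id: string; anchorType: string; nodeId?: string }
interface CanvasBoundAx { items: AnchoredItem[]; reviews: ReviewAnnotation[] }

interface OrphanAudit {
  systemEvent: "ax-node-orphan";
  removedNodeId: string;
  reanchoredIds: string[];
  removedReviewIds: string[];
}

function revalidateAfterNodeRemoval(
  state: CanvasBoundAx,
  removedNodeId: string,
): { state: CanvasBoundAx; audit: OrphanAudit | null } {
  // Soft-orphan: keep the item, strip only the dangling node id.
  const reanchoredIds: string[] = [];
  const items = state.items.map((item) => {
    if (!item.nodeIds.includes(removedNodeId)) return item;
    reanchoredIds.push(item.id);
    return { ...item, nodeIds: item.nodeIds.filter((n) => n !== removedNodeId) };
  });

  // Node-anchored review annotations for the deleted node are dropped.
  const removedReviewIds: string[] = [];
  const reviews = state.reviews.filter((review) => {
    const drop = review.anchorType === "node" && review.nodeId === removedNodeId;
    if (drop) removedReviewIds.push(review.id);
    return !drop;
  });

  // Produce an audit record only when something was actually affected.
  const affected = reanchoredIds.length > 0 || removedReviewIds.length > 0;
  const audit: OrphanAudit | null = affected
    ? { systemEvent: "ax-node-orphan", removedNodeId, reanchoredIds, removedReviewIds }
    : null;
  return { state: { items, reviews }, audit };
}
```

Returning the audit record instead of writing it lets the caller decide whether to append it to the timeline, which is one way to keep a replay under suppressed recording from appending a second note.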
|
|
59
|
+
|
|
60
|
+
The audit is scoped to `removeNode` (the live, observable change). `restore`
|
|
61
|
+
replaces the whole canvas wholesale and its snapshot AX was already consistent
|
|
62
|
+
when it was saved, so it is not audited.
|
|
63
|
+
|
|
64
|
+
**Append-only / undo semantics.** The note records a historical fact (at time T,
|
|
65
|
+
deleting node X re-anchored these items), not current state. It is **not rolled
|
|
66
|
+
back on undo** and **not duplicated on redo**: undo restores the canvas-bound AX
|
|
67
|
+
state (the re-anchoring is reversed in the live state) but leaves the note as a
|
|
68
|
+
record; redo replays `removeNode` inside suppressed recording
|
|
69
|
+
(`_suppressRecordingDepth > 0`), which re-runs the re-normalization but does
|
|
70
|
+
**not** append a second note. Consumers should read `reanchoredIds` /
|
|
71
|
+
`removedReviewIds` against the *current* canvas-bound state, not assume the
|
|
72
|
+
referenced items are still re-anchored.
|
package/docs/mcp.md
CHANGED
|
@@ -1,10 +1,18 @@
|
|
|
1
1
|
# MCP reference
|
|
2
2
|
|
|
3
|
-
PMX Canvas ships an MCP stdio server with **
|
|
3
|
+
PMX Canvas ships an MCP stdio server with **83 tools** + **14 core resources**,
|
|
4
4
|
plus per-skill resources at `canvas://skills/<name>`. The server emits
|
|
5
5
|
`notifications/resources/updated` when canvas state changes — humans pin
|
|
6
6
|
nodes in the browser, agents are notified immediately.
|
|
7
7
|
|
|
8
|
+
> **Consolidation in progress (plan-006/008).** The 83 tools are 14 action-discriminated
|
|
9
|
+
> **composites** (recommended — see below) plus 69 legacy single-purpose tools.
|
|
10
|
+
> The composites fold the legacy tools behind an `action` (and, for `canvas_ax_gate`,
|
|
11
|
+
> a `kind`) enum; each action dispatches to the same operation, so behavior is
|
|
12
|
+
> identical. Folded legacy tools are marked `Deprecated:` in their descriptions and
|
|
13
|
+
> are removed in v0.3 per [`api-stability.md`](api-stability.md). **Prefer the
|
|
14
|
+
> composites.**
|
|
15
|
+
|
|
8
16
|
## Connect
|
|
9
17
|
|
|
10
18
|
Add to your agent's MCP config:
|
|
@@ -22,18 +30,59 @@ Add to your agent's MCP config:
|
|
|
22
30
|
|
|
23
31
|
The canvas auto-starts on first tool call.
|
|
24
32
|
|
|
25
|
-
##
|
|
33
|
+
## Composite tools (recommended)
|
|
34
|
+
|
|
35
|
+
Action-discriminated tools that consolidate the single-purpose tools. Each maps
|
|
36
|
+
its `action` to the same operation the legacy tool used, so results are identical.
|
|
37
|
+
|
|
38
|
+
| Composite | `action` values | Replaces |
|-----------|-----------------|----------|
| `canvas_node` | `add` · `get` · `update` · `remove` | `canvas_add_node`, `canvas_get_node`, `canvas_update_node`, `canvas_remove_node`, `canvas_add_html_node` (`add` + `type:"html"`), `canvas_add_html_primitive` (`add` + `type:"html"`, `primitive:"<kind>"`), `canvas_refresh_webpage_node` (`update` + `refresh:true`) |
| `canvas_render` | `describe-schema` · `validate` · `add-json-render` · `stream-json-render` · `add-graph` | `canvas_describe_schema`, `canvas_validate_spec`, `canvas_add_json_render_node`, `canvas_stream_json_render_node`, `canvas_add_graph_node` |
| `canvas_edge` | `add` · `remove` | `canvas_add_edge`, `canvas_remove_edge` |
| `canvas_group` | `create` · `add` · `ungroup` | `canvas_create_group`, `canvas_group_nodes`, `canvas_ungroup` |
| `canvas_history` | `undo` · `redo` | `canvas_undo`, `canvas_redo` |
| `canvas_view` | `arrange` · `focus` · `fit` · `clear` | `canvas_arrange`, `canvas_focus_node`, `canvas_fit_view`, `canvas_clear` |
| `canvas_query` | `search` · `layout` | `canvas_search`, `canvas_get_layout` |
| `canvas_webview` | `status` · `start` · `stop` · `resize` · `evaluate` | `canvas_webview_status`, `canvas_webview_start`, `canvas_webview_stop`, `canvas_resize`, `canvas_evaluate` |
| `canvas_app` | `open-mcp-app` · `diagram` · `build-artifact` | `canvas_open_mcp_app`, `canvas_add_diagram`, `canvas_build_web_artifact` |
| `canvas_ax_state` | `get` · `set-focus` · `set-policy` · `report-capability` | `canvas_get_ax`, `canvas_set_ax_focus`, `canvas_set_ax_policy`, `canvas_report_host_capability` |
| `canvas_ax_work` | `add` · `update` · `annotate` | `canvas_add_work_item`, `canvas_update_work_item`, `canvas_add_review_annotation` |
| `canvas_ax_gate` | `request` · `resolve` · `await` × kind `approval` \| `elicitation` \| `mode` | `canvas_request_approval`, `canvas_resolve_approval`, `canvas_await_approval`, `canvas_request_elicitation`, `canvas_respond_elicitation`, `canvas_await_elicitation`, `canvas_request_mode`, `canvas_resolve_mode`, `canvas_await_mode` (9 → 1) |
| `canvas_ax_timeline` | `read` · `record-event` · `add-evidence` · `send-steering` | `canvas_get_ax_timeline`, `canvas_record_ax_event`, `canvas_add_evidence`, `canvas_send_steering` |
| `canvas_ax_delivery` | `claim` · `mark` | `canvas_claim_ax_delivery`, `canvas_mark_ax_delivery` |
|
|
54
|
+
|
|
55
|
+
Field names match the underlying operation (e.g. `canvas_view { action: "focus", id }`,
|
|
56
|
+
`canvas_group { action: "create", childIds }`). `canvas_ax_gate` has two discriminators:
|
|
57
|
+
`{ kind, action }` — e.g. `{ kind: "approval", action: "request", title }`,
|
|
58
|
+
`{ kind: "elicitation", action: "resolve", id, response }`,
|
|
59
|
+
`{ kind: "mode", action: "await", id, timeoutMs }`. (The approval machine-readable
|
|
60
|
+
action identifier is passed as `approvalAction`, since `action` is the lifecycle
|
|
61
|
+
discriminator.) `canvas_app` folds the external / built-content tools:
|
|
62
|
+
`{ action: "open-mcp-app", transport, toolName }`, `{ action: "diagram", elements }`
|
|
63
|
+
(the hosted Excalidraw preset), and `{ action: "build-artifact", title, appTsx }`
|
|
64
|
+
(build-artifact can run for minutes on a cold workspace — set a long client
|
|
65
|
+
timeout). `canvas_ax_interaction`, `canvas_ingest_activity`, and
|
|
66
|
+
`canvas_invoke_command` stay standalone (trust-boundary / firehose / execution-intent
|
|
67
|
+
tools). `canvas_screenshot` also stays standalone — it returns a binary image payload
|
|
68
|
+
the composite/registry JSON wire shape does not model. (Wave 5 folded
|
|
69
|
+
`canvas_refresh_webpage_node` → `canvas_node { action: "update", refresh: true }` after
|
|
70
|
+
fixing `node.update`'s `formatResult` to surface a FAILED refresh as `isError` +
|
|
71
|
+
`{ ok:false, error }` instead of masking it as a false `{ ok:true }`.) Snapshots
|
|
72
|
+
fold as their registry slice lands.
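The two-discriminator shape of `canvas_ax_gate` can be sketched as a lookup from `{ kind, action }` to the legacy tool it replaces. The tool names come from the composite table; the pairing of each cell (for example `elicitation` + `resolve` mapping to `canvas_respond_elicitation`) is inferred from those names, and `legacyToolFor` is a hypothetical helper, not part of the package.

```typescript
type GateKind = "approval" | "elicitation" | "mode";
type GateAction = "request" | "resolve" | "await";

// 3 kinds x 3 actions = the 9 legacy tools folded into one composite.
const GATE_LEGACY_TOOL: Record<GateKind, Record<GateAction, string>> = {
  approval: {
    request: "canvas_request_approval",
    resolve: "canvas_resolve_approval",
    await: "canvas_await_approval",
  },
  elicitation: {
    request: "canvas_request_elicitation",
    resolve: "canvas_respond_elicitation", // inferred pairing: legacy verb is "respond"
    await: "canvas_await_elicitation",
  },
  mode: {
    request: "canvas_request_mode",
    resolve: "canvas_resolve_mode",
    await: "canvas_await_mode",
  },
};

// Resolve a composite call to the legacy tool name it stands in for.
function legacyToolFor(kind: GateKind, action: GateAction): string {
  return GATE_LEGACY_TOOL[kind][action];
}

console.log(legacyToolFor("approval", "request"));
```

Typing the table as a `Record` over both unions means a missing kind or action is a compile-time error, which is one way to keep a composite exhaustive as actions are added.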
|
|
73
|
+
|
|
74
|
+
## Tools (legacy single-purpose)
|
|
26
75
|
|
|
27
76
|
| Tool | Description |
|
|
28
77
|
|------|-------------|
|
|
29
78
|
| `canvas_add_node` | Add a node (markdown, status, context, file, webpage, html, etc.) |
|
|
30
|
-
| `canvas_add_html_node` | Create an `html` node from a self-contained HTML/JS document (sandboxed iframe) |
|
|
31
|
-
| `canvas_add_html_primitive` | Create a reusable generated HTML communication primitive as a sandboxed `html` node |
|
|
79
|
+
| `canvas_add_html_node` | **Deprecated** → `canvas_node { action: "add", type: "html" }`. Create an `html` node from a self-contained HTML/JS document (sandboxed iframe) |
|
|
80
|
+
| `canvas_add_html_primitive` | **Deprecated** → `canvas_node { action: "add", type: "html", primitive: "<kind>" }`. Create a reusable generated HTML communication primitive as a sandboxed `html` node |
|
|
32
81
|
| `canvas_add_diagram` | Hand-drawn diagram via the hosted Excalidraw MCP App (preset alias for `canvas_open_mcp_app`) |
|
|
33
82
|
| `canvas_open_mcp_app` | Open any [MCP Apps](https://modelcontextprotocol.io/docs/extensions/apps) server's `ui://` resource as an iframe node |
|
|
34
83
|
| `canvas_describe_schema` | Describe the running server's create schemas, examples, json-render catalog, and HTML primitive catalog |
|
|
35
84
|
| `canvas_validate_spec` | Validate a json-render spec, graph payload, or HTML primitive payload without creating a node |
|
|
36
|
-
| `canvas_refresh_webpage_node` | Re-fetch and update a webpage node from its stored URL |
|
|
85
|
+
| `canvas_refresh_webpage_node` | **Deprecated** → `canvas_node { action: "update", refresh: true }`. Re-fetch and update a webpage node from its stored URL |
|
|
37
86
|
| `canvas_add_json_render_node` | Create a native json-render node from a validated spec |
|
|
38
87
|
| `canvas_stream_json_render_node` | Progressively build a json-render node from SpecStream JSON-Patch ops (live/streaming panels) |
|
|
39
88
|
| `canvas_add_graph_node` | Create a native graph node (line, bar, pie, area, scatter, radar, stacked-bar, composed, sparkline, dot-plot, bullet, slopegraph) |
|
|
@@ -172,10 +221,10 @@ MCP for tools/resources and the in-app Browser for the live `/workbench` view.
|
|
|
172
221
|
No separate PMX renderer is needed. Prefer MCP over the CLI for Codex-native
|
|
173
222
|
operation; keep the CLI for fallback scripts and manual debugging.
|
|
174
223
|
|
|
175
|
-
Use `canvas://ax-context` or `
|
|
176
|
-
When Codex-hosted steering sets the current attention
|
|
177
|
-
`
|
|
178
|
-
focus came from. The full workflow lives in
|
|
224
|
+
Use `canvas://ax-context` or `canvas_ax_state { action: "get" }` to read
|
|
225
|
+
pinned/focused context. When Codex-hosted steering sets the current attention
|
|
226
|
+
target, call `canvas_ax_state { action: "set-focus", source: "codex" }` so the
|
|
227
|
+
AX state records where the focus came from. The full workflow lives in
|
|
179
228
|
`skills/pmx-canvas/references/codex-app-adapter.md`.
|
|
180
229
|
|
|
181
230
|
## Annotation Visibility
|
|
@@ -201,8 +250,8 @@ in doubt:
|
|
|
201
250
|
|
|
202
251
|
- `json-render` → `canvas_add_json_render_node`
|
|
203
252
|
- `graph` → `canvas_add_graph_node`
|
|
204
|
-
- `html-primitive` → `canvas_add_html_primitive`
|
|
205
|
-
- `html` → `canvas_add_html_node`
|
|
253
|
+
- `html-primitive` → `canvas_node { action: "add", type: "html", primitive: "<kind>" }` (or the deprecated `canvas_add_html_primitive`)
|
|
254
|
+
- `html` → `canvas_node { action: "add", type: "html" }` (or the deprecated `canvas_add_html_node`)
|
|
206
255
|
- `web-artifact` → `canvas_build_web_artifact`
|
|
207
256
|
- `mcp-app` → `canvas_open_mcp_app`
|
|
208
257
|
- `group` → `canvas_create_group`
|
package/docs/plans/plan-005-operation-registry.md
ADDED
|
@@ -0,0 +1,84 @@
|
|
|
1
|
+
# Plan 005 — Operation Registry: one definition site per canvas operation
|
|
2
|
+
|
|
3
|
+
**Status:** In progress (branch `refactor/v0.2-operation-registry`)
|
|
4
|
+
**Date:** 2026-06-12
|
|
5
|
+
**Motivation:** docs/tech-debt-assessment-2026-06.md item 1. Every operation is hand-written 5–6 times (CanvasStateManager/canvas-operations, PmxCanvas SDK, HTTP handler, MCP tool, CLI command, plus Local/Remote CanvasAccess in src/mcp/canvas-access.ts). Documented bug classes caused by this: fix applied to one of two mutation paths (LRN-20260606-006), enum guard not updated for new member (LRN-20260607-005), shared readJson hardening killing the batch bare-array shape (LRN-20260608-002).
|
|
6
|
+
|
|
7
|
+
## Confirmed live drift this refactor erases
|
|
8
|
+
|
|
9
|
+
- `PmxCanvas.addNode` uses `fileMode: 'path'` while `handleCanvasAddNode` uses `fileMode: 'auto'`.
|
|
10
|
+
- Node-update merge logic exists in three diverging versions: `handleCanvasUpdateNode` has webpage `titleSource`, html top-level `html`/`axCapabilities`, group children, `refresh:true` delegation; `PmxCanvas.updateNode` and batch `node.update` have none of these.
|
|
11
|
+
- `canvas_remove_node` over local access silently succeeds on a missing id while the HTTP path 404s.
|
|
12
|
+
- The per-type default-size ladder is copy-pasted in `handleCanvasAddNode`, `executeCanvasBatch`, and `PmxCanvas.addNode`.
|
|
13
|
+
|
|
14
|
+
## Registry core design
|
|
15
|
+
|
|
16
|
+
New directory `src/server/operations/`:
|
|
17
|
+
|
|
18
|
+
```
|
|
19
|
+
types.ts Operation<I,O>, OperationContext, OperationError, defineOperation()
|
|
20
|
+
registry.ts register/get/list, executeOperation(), setOperationEventEmitter()
|
|
21
|
+
http.ts route table + dispatchOperationRoute(req, url): Promise<Response | null>
|
|
22
|
+
invoker.ts OperationInvoker: LocalOperationInvoker | HttpOperationInvoker
|
|
23
|
+
mcp.ts registerOperationTools(server, getInvoker)
|
|
24
|
+
ops/nodes.ts slice 1: node.add / node.get / node.update / node.remove / layout.get
|
|
25
|
+
index.ts imports all ops/* files and registers them (single registration site)
|
|
26
|
+
```
|
|
27
|
+
|
|
28
|
+
Key contracts:
|
|
29
|
+
|
|
30
|
+
- `Operation<I,O>` fields: `name` ('node.add', doubles as batch op name), `mutates` (true → registry emits `canvas-layout-update` after success), `input` (a ZodObject; MUST be loose/passthrough — legacy ignores unknown keys, strict parsing would be an invisible API break), `http { method, path (EXACT legacy path, ':param' segments), readInput?, serialize? }`, `mcp { toolName (frozen legacy name), description, extraShape? (MCP-only presentation flags like full/verbose), formatResult? } | null`, `handler(input, ctx)` — the single implementation, mutating via canvasState/canvas-operations so mutation history records automatically.
|
|
31
|
+
- `OperationError(message, status 400|404|409)` maps to HTTP status + `{ ok:false, error }` and MCP `isError: true`.
|
|
32
|
+
- `executeOperation(name, rawInput)`: validate → run → emit. The ONE execution path. zod failures → OperationError(400).
|
|
33
|
+
- SSE: `setOperationEventEmitter` injected from server.ts at module top level (same pattern as `setCanvasLayoutUpdateEmitter`). Handlers never emit `canvas-layout-update` themselves; `mutates` is the single source. Extra events (focus, viewport) go through `ctx.emit`.
|
|
34
|
+
- `http.ts` route matching is segment-count exact so `/node/:id` never swallows `/node/:id/refresh`. Dispatch inserted in the server.ts fetch handler immediately before the first legacy `/api/canvas/*` check; registry routes shadow legacy ones and the legacy block is deleted in the same commit that registers the op.
|
|
35
|
+
- The shared body reader preserves array bodies (per-op `readInput` decides; the shared reader never coerces) — structural fix for the batch bare-array bug class.
|
|
36
|
+
- `invoker.ts`: `LocalOperationInvoker` wraps `executeOperation`; `HttpOperationInvoker(baseUrl)` builds the request from `op.http.path` template (fills `:id` from input, GET flags to query, rest as JSON body). MCP uses local or HTTP invoker depending on CanvasAccess mode; CLI uses `HttpOperationInvoker(getBaseUrl())`; SDK wraps the handler core functions directly to stay synchronous.
|
|
37
|
+
- `mcp.ts`: iterates the registry, passes `{ ...op.input.shape, ...extraShape }` to `server.tool()` (zod v4 shapes pass through unchanged), invokes via the invoker, formats with `formatResult` (where compactNodePayload/createdNodePayload live).
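
The validate, run, emit path described in the contracts above can be sketched roughly as follows. This is a simplified illustration, not the package's code: a hand-written `parse` function stands in for the loose zod schema so the block has no dependencies, the in-memory `nodes` map and `emitted` array are hypothetical, and the 404 for an unknown operation name is an assumption.

```typescript
// Hypothetical, simplified stand-ins for the contracts listed above.
type ErrorStatus = 400 | 404 | 409;

class OperationError extends Error {
  readonly status: ErrorStatus;
  constructor(message: string, status: ErrorStatus) {
    super(message);
    this.name = "OperationError";
    this.status = status;
  }
}

interface OperationContext {
  emit(event: string, payload: unknown): void;
}

interface Operation<I, O> {
  name: string;
  mutates: boolean;
  parse(raw: unknown): I; // stand-in for a loose zod schema
  handler(input: I, ctx: OperationContext): O;
}

const registry = new Map<string, Operation<any, any>>();
const emitted: string[] = []; // records emitted event names for inspection
const ctx: OperationContext = { emit: (event) => { emitted.push(event); } };

function defineOperation<I, O>(op: Operation<I, O>): Operation<I, O> {
  registry.set(op.name, op);
  return op;
}

// The single execution path: validate, run, then emit if the op mutates.
function executeOperation(name: string, rawInput: unknown): unknown {
  const op = registry.get(name);
  if (!op) throw new OperationError(`Unknown operation: ${name}`, 404); // assumption
  let input: unknown;
  try {
    input = op.parse(rawInput);
  } catch (err) {
    // Validation failures surface as a 400, mirroring "zod failures -> OperationError(400)".
    throw new OperationError(err instanceof Error ? err.message : "Invalid input", 400);
  }
  const result = op.handler(input, ctx);
  if (op.mutates) ctx.emit("canvas-layout-update", null);
  return result;
}

// Example registration: a minimal node.remove over an in-memory map.
const nodes = new Map<string, { id: string }>([["n1", { id: "n1" }]]);

defineOperation<{ id: string }, { ok: true }>({
  name: "node.remove",
  mutates: true,
  parse(raw) {
    if (typeof raw !== "object" || raw === null) throw new Error("Expected an object");
    const id = (raw as Record<string, unknown>).id;
    if (typeof id !== "string") throw new Error("id must be a string");
    return { id }; // unknown keys are ignored, not rejected
  },
  handler({ id }) {
    if (!nodes.delete(id)) throw new OperationError(`Node not found: ${id}`, 404);
    return { ok: true };
  },
});
```

Because the handler never emits the layout event itself, a failed run (validation error or missing node) produces no SSE frame, which is the property the parity test's frame counting relies on.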
|
|
38
|
+
|
|
39
|
+
## Slice 1 — node CRUD + layout (this slice)
|
|
40
|
+
|
|
41
|
+
Ops: `node.add`, `node.get`, `node.update`, `node.remove`, `layout.get`.
|
|
42
|
+
|
|
43
|
+
- Export `NODE_TYPES` tuple once; derive the zod enum and replace the `VALID_NODE_TYPES` Set — structural fix for the enum-guard near-miss.
|
|
44
|
+
- `node.add` handler = union of `handleCanvasAddNode` + `createCanvasWebpageNode` + `createCanvasHtmlPrimitiveNode`, calling existing `addCanvasNode`/`createCanvasGroup`/`buildHtmlPrimitive`/`refreshCanvasWebpageNode` etc. Default-size ladder becomes one exported `defaultNodeSize(type)`. `http.serialize` = existing `buildNodeResponse` shape, byte-identical wire format. The json-render/graph/web-artifact redirect errors keep their exact current messages.
|
|
45
|
+
- `node.update` = shared `buildNodePatch(existing, input)` carrying the HTTP superset semantics (titleSource, html top-level fields, group children, refresh delegation). SDK `updateNode` delegates to it — drift disappears.
|
|
46
|
+
- `node.remove` = `closeNodeAppSession` + `removeCanvasNode`; missing id → OperationError 404 (unifies the silent local-remove asymmetry — see parity test note below).
|
|
47
|
+
- `node.get`/`layout.get` keep `withContextPinReadState`/serialization in `http.serialize`; MCP keeps compact/full payload behavior via `formatResult`.
|
|
48
|
+
- SDK keeps `fileMode: 'path'` as an explicit visible parameter instead of forked code.
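
A minimal sketch of the `NODE_TYPES` and `defaultNodeSize(type)` idea from the bullets above. The member list is an illustrative subset taken from type names mentioned elsewhere in this diff, and every size value is made up; only the shape of the technique is the point.

```typescript
// Illustrative subset only; the real tuple and the real sizes are not shown in this plan.
const NODE_TYPES = [
  "html",
  "webpage",
  "group",
  "json-render",
  "graph",
  "web-artifact",
  "mcp-app",
] as const;

type NodeType = (typeof NODE_TYPES)[number];

// One guard derived from the tuple replaces a separately maintained Set.
function isNodeType(value: unknown): value is NodeType {
  return typeof value === "string" && (NODE_TYPES as readonly string[]).includes(value);
}

interface NodeSize {
  width: number;
  height: number;
}

// Record<NodeType, ...> makes a missing entry a compile error when a member is added.
const DEFAULT_NODE_SIZES: Record<NodeType, NodeSize> = {
  html: { width: 480, height: 320 },
  webpage: { width: 640, height: 480 },
  group: { width: 600, height: 400 },
  "json-render": { width: 480, height: 360 },
  graph: { width: 520, height: 360 },
  "web-artifact": { width: 640, height: 480 },
  "mcp-app": { width: 640, height: 480 },
};

function defaultNodeSize(type: NodeType): NodeSize {
  return DEFAULT_NODE_SIZES[type];
}
```

Deriving both the guard and the size table from one tuple is what turns the enum-guard near-miss into a type error rather than a runtime surprise.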
|
|
49
|
+
|
|
50
|
+
Legacy deleted in this slice: `handleCanvasAddNode`, `createCanvasWebpageNode`, `createCanvasHtmlPrimitiveNode`, `handleCanvasUpdateNode`, inline state/node GET/PATCH/DELETE routes, `VALID_NODE_TYPES`, five MCP `server.tool` blocks, orphaned CanvasAccess methods, the SDK's forked merge logic.
|
|
51
|
+
|
|
52
|
+
## Migration order after slice 1
|
|
53
|
+
|
|
54
|
+
1. Edges (mechanical; DELETE body takes `edge_id`, schema accepts both)
|
|
55
|
+
2. Arrange/viewport/focus/fit/clear (focus emits 3 extra events via ctx.emit; fit is mutates:false with manual viewport emit)
|
|
56
|
+
3. Groups
|
|
57
|
+
4. Pins/search/spatial-context/summary/history/undo/redo
|
|
58
|
+
5. Snapshots (restore keeps its deferred emit mechanism)
|
|
59
|
+
6. json-render/graph/stream (alias triangle heightPx/nodeHeight/height absorbed into one schema)
|
|
60
|
+
7. AX domain (read + mutate sub-slices; long-poll waitMs in readInput; structured denial bodies preserved)
|
|
61
|
+
8. Webpage refresh/diagram/mcp-app open/web-artifact/html-surface (side-channel semantics; mutates:false, own their emits; one op per commit)
|
|
62
|
+
9. Batch last — meta-operation dispatching `executeOperation` per entry with layout emission suppressed + single final emit; deletes the 290-line switch in canvas-operations.ts
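
Item 9 (batch as a meta-operation with one final emit) might look roughly like this. The entry shape `{ op, input }`, the module-level suppression flag, and the choice to still emit once when a later entry throws are all assumptions made for the sketch.

```typescript
interface RegisteredOp {
  mutates: boolean;
  run(input: Record<string, unknown>): unknown;
}

const handlers = new Map<string, RegisteredOp>();
const layoutFrames: string[] = []; // stands in for SSE frames sent to clients
let suppressLayoutEmit = false;

function executeOperation(name: string, input: Record<string, unknown>): unknown {
  const op = handlers.get(name);
  if (!op) throw new Error(`Unknown operation: ${name}`);
  const result = op.run(input);
  if (op.mutates && !suppressLayoutEmit) layoutFrames.push("canvas-layout-update");
  return result;
}

interface BatchEntry {
  op: string;
  input: Record<string, unknown>;
}

// Dispatches each entry through executeOperation with per-op layout emission
// suppressed, then emits a single frame if any entry mutated the canvas.
function executeBatch(entries: BatchEntry[]): unknown[] {
  const results: unknown[] = [];
  let mutated = false;
  suppressLayoutEmit = true;
  try {
    for (const entry of entries) {
      results.push(executeOperation(entry.op, entry.input));
      if (handlers.get(entry.op)?.mutates) mutated = true;
    }
  } finally {
    suppressLayoutEmit = false;
    if (mutated) layoutFrames.push("canvas-layout-update");
  }
  return results;
}

// Two toy operations to exercise the batch path.
handlers.set("node.add", { mutates: true, run: (input) => ({ id: input.id }) });
handlers.set("node.get", { mutates: false, run: (input) => ({ id: input.id }) });
```

Running three entries through `executeBatch` yields one layout frame instead of one per mutating entry, while a direct `executeOperation` call outside a batch still emits on its own.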
|
|
63
|
+
|
|
64
|
+
Theme/annotations/code-graph/schema/prompt/trace endpoints are single-transport, low-duplication; they may stay legacy indefinitely.
|
|
65
|
+
|
|
66
|
+
## Verification (every slice)
|
|
67
|
+
|
|
68
|
+
1. `bun run typecheck`
|
|
69
|
+
2. Targeted: `PMX_CANVAS_DISABLE_BROWSER_OPEN=1 bun test tests/unit/operation-parity.test.ts tests/unit/mcp-tool-freeze.test.ts tests/unit/server-api.test.ts tests/unit/mcp-server.test.ts tests/unit/cli-node.test.ts tests/unit/canvas-operations.test.ts tests/unit/pmx-canvas-sdk.test.ts`
|
|
70
|
+
3. Full unit: `bun run test`
|
|
71
|
+
4. Milestones (after slices 1, 6, batch): `bun run test:web-canvas` + `bun run test:e2e-cli`
|
|
72
|
+
|
|
73
|
+
Safety nets already in place (committed before any registry code): `tests/unit/operation-parity.test.ts` (cross-surface parity, SSE counts, junk-key tolerance, pinned asymmetries) and `tests/unit/mcp-tool-freeze.test.ts` (69 tool names + 14 fixed resource URIs frozen).
|
|
74
|
+
|
|
75
|
+
Parity note: the parity test currently PINS the local-remove silent-success asymmetry. Slice 1 deliberately unifies it to a 404-style error on all surfaces; update that one pinned assertion in the same commit, with a CHANGELOG note.
|
|
76
|
+
|
|
77
|
+
## Risks
|
|
78
|
+
|
|
79
|
+
- zod strictness: schemas must be loose; parity test has junk-key cases.
|
|
80
|
+
- Route shadowing: segment-exact matching + registry self-test against known still-legacy paths.
|
|
81
|
+
- SSE drift: double emit / missing emit — parity test counts frames.
|
|
82
|
+
- MCP-against-remote: ensure at least one test exercises each migrated tool through RemoteCanvasAccess/HttpOperationInvoker (mcp-server.test.ts daemon mode covers this).
|
|
83
|
+
- Import cycles: operations/ never imports server.ts (emitter injected) or index.ts (SDK imports the cores).
|
|
84
|
+
- Batch is highest-risk: last, separately committed, one-commit revert.
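
For the route-shadowing risk, segment-count exact matching can be sketched as below. `matchRoute` is a hypothetical helper name, and the `/api/canvas/node/...` paths are only examples in the style the plan describes.

```typescript
// Returns extracted ':param' values on a match, or null when the route does not apply.
function matchRoute(template: string, pathname: string): Record<string, string> | null {
  const templateSegments = template.split("/").filter(Boolean);
  const pathSegments = pathname.split("/").filter(Boolean);

  // Exact segment count: '/node/:id' can never swallow '/node/:id/refresh'.
  if (templateSegments.length !== pathSegments.length) return null;

  const params: Record<string, string> = {};
  for (let i = 0; i < templateSegments.length; i++) {
    const expected = templateSegments[i];
    const actual = pathSegments[i];
    if (expected.startsWith(":")) {
      params[expected.slice(1)] = actual;
    } else if (expected !== actual) {
      return null;
    }
  }
  return params;
}
```

A dispatcher built on this returns `null` for any path it does not own, which lets the fetch handler fall through to the still-legacy routes.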
|
|
@@ -0,0 +1,109 @@
|
|
|
1
|
+
# Plan 006: MCP tool consolidation (69 tools to 21)
|
|
2
|
+
|
|
3
|
+
**Status:** In progress — wave 1 landed (7 canvas composites) and the AX wave landed (5 composites: `canvas_ax_state`, `canvas_ax_work`, `canvas_ax_gate`, `canvas_ax_timeline`, `canvas_ax_delivery`). Remaining: `canvas_snapshot` (name held by the legacy save tool until v0.3), `canvas_app`/`canvas_webview` (need plan-005 item 8), and the deferred actions `refresh` / `add-primitive` / `remove-annotation` / board validation.
|
|
4
|
+
**Date:** 2026-06-12
|
|
5
|
+
**Depends on:** plan-005 (operation registry). Slices 1-4 are migrated; consolidation lands per-domain as the corresponding registry slices complete.
|
|
6
|
+
**Motivation:** docs/tech-debt-assessment-2026-06.md item 2. Governed by docs/api-stability.md (deprecation: marked in one minor, removed in the next).
|
|
7
|
+
|
|
8
|
+
## Rationale
|
|
9
|
+
|
|
10
|
+
69 tools is a tax on every connected agent: each tool name + description + input schema is sent to every MCP client and consumes context window before the agent has done anything. Worse, it degrades tool selection. An agent choosing between `canvas_request_approval`, `canvas_resolve_approval`, and `canvas_await_approval` (times three gate kinds, nine tools) picks worse than one choosing `canvas_ax_gate` with a clear action enum and one good description. Agents reliably pick the right tool from ~20 well-described tools; at 69 the descriptions compete with each other.
|
|
11
|
+
|
|
12
|
+
The registry makes this cheap. A consolidated tool is one `mcp` block whose `extraShape` adds an action discriminator and whose `buildInput` dispatches to the existing registered operations. No handler logic moves; the consolidation is presentation-layer only, which is exactly what the registry was built to make safe.
|
|
13
|
+
|
|
14
|
+
## Current surface (69 tools, from tests/unit/mcp-tool-freeze.test.ts)
|
|
15
|
+
|
|
16
|
+
Grouped by domain: node CRUD + creation variants (15), edges (2), view (4), groups (3), snapshots + diff (6), undo/redo (2), search/validate/schema (4), pins (1), batch (1), webview automation (6), AX (25).
|
|
17
|
+
|
|
18
|
+
## Proposed surface (21 tools)
|
|
19
|
+
|
|
20
|
+
### Composites
|
|
21
|
+
|
|
22
|
+
**1. `canvas_node`**: folds `canvas_add_node`, `canvas_add_html_node`, `canvas_get_node`, `canvas_update_node`, `canvas_remove_node`, `canvas_refresh_webpage_node`.
|
|
23
|
+
Action enum: `add | get | update | remove | refresh`.
|
|
24
|
+
Sketch: `{ action, id?, type?, title?, content?, html?, x?, y?, width?, height?, data?, ...patch fields }`. `add` requires `type` (full node-type enum; html nodes stop needing a dedicated tool since `html` is already a first-class field). `refresh` covers the webpage re-fetch (`node.update` already has `refresh: true` delegation in the registry, so this is an alias action, not new logic). Spec-driven types (json-render, graph, web-artifact) keep their existing redirect errors pointing at `canvas_render`.
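
As a rough illustration of the action discriminator, a `buildInput`-style mapping for `canvas_node` could look like the following. The function name `buildNodeInput` and the `{ operation, input }` return shape are assumptions; the operation names and the `refresh: true` alias come from this plan and plan 005.

```typescript
type NodeAction = "add" | "get" | "update" | "remove" | "refresh";

interface CompositeArgs {
  action: NodeAction;
  [key: string]: unknown;
}

interface Dispatch {
  operation: string;
  input: Record<string, unknown>;
}

// Maps the composite's action onto an existing registered operation.
// No handler logic lives here; it only reshapes arguments.
function buildNodeInput(args: CompositeArgs): Dispatch {
  const { action, ...rest } = args;
  switch (action) {
    case "add":
      return { operation: "node.add", input: rest };
    case "get":
      return { operation: "node.get", input: rest };
    case "update":
      return { operation: "node.update", input: rest };
    case "remove":
      return { operation: "node.remove", input: rest };
    case "refresh":
      // Alias action: reuses node.update's refresh delegation.
      return { operation: "node.update", input: { id: rest.id, refresh: true } };
  }
}
```

The exhaustive `switch` means adding a new action to `NodeAction` without a matching case fails type checking under strict settings.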
|
|
25
|
+
|
|
26
|
+
**2. `canvas_render`**: folds `canvas_add_json_render_node`, `canvas_stream_json_render_node`, `canvas_add_graph_node`, `canvas_add_html_primitive`, `canvas_validate_spec`, `canvas_describe_schema`.
|
|
27
|
+
Action enum: `describe-schema | validate | add-json-render | stream-json-render | add-graph | add-primitive`.
|
|
28
|
+
Sketch: `{ action, spec?, graph?, kind?, payload?, nodeId?, title?, x?, y?, ... }`. One tool owns "spec-driven content": discover the schema, validate, create. The alias triangle (heightPx/nodeHeight/height) is already absorbed by the registry slice 6 schema.
|
|
29
|
+
|
|
30
|
+
**3. `canvas_app`**: folds `canvas_open_mcp_app`, `canvas_add_diagram`, `canvas_build_web_artifact`.
|
|
31
|
+
Action enum: `open-mcp-app | diagram | build-artifact`.
|
|
32
|
+
Sketch: `{ action, serverUrl?, tool?, args?, elements?, files?, entry?, title?, ... }`. External and built content with side-channel semantics, kept apart from plain node CRUD because their inputs share nothing with it.
|
|
33
|
+
|
|
34
|
+
**4. `canvas_edge`**: folds `canvas_add_edge`, `canvas_remove_edge`.
|
|
35
|
+
Action enum: `add | remove`.
|
|
36
|
+
Sketch: `{ action, id?, from?, to?, type?, label?, style?, animated? }`.
|
|
37
|
+
|
|
38
|
+
**5. `canvas_view`**: folds `canvas_arrange`, `canvas_focus_node`, `canvas_fit_view`, `canvas_clear`, `canvas_remove_annotation`.
|
|
39
|
+
Action enum: `arrange | focus | fit | clear | remove-annotation`.
|
|
40
|
+
Sketch: `{ action, nodeId?, annotationId?, strategy?, padding? }`. `remove-annotation` lives here as "canvas surface housekeeping"; it is an overlay operation, not node CRUD (judgment call, see Risks).
|
|
41
|
+
|
|
42
|
+
**6. `canvas_group`**: folds `canvas_create_group`, `canvas_group_nodes`, `canvas_ungroup`.
|
|
43
|
+
Action enum: `create | add | ungroup`.
|
|
44
|
+
Sketch: `{ action, groupId?, title?, nodeIds? }`.
|
|
45
|
+
|
|
46
|
+
**7. `canvas_snapshot`**: folds `canvas_snapshot`, `canvas_list_snapshots`, `canvas_restore`, `canvas_delete_snapshot`, `canvas_gc_snapshots`, `canvas_diff`.
|
|
47
|
+
Action enum: `save | list | restore | delete | gc | diff`.
|
|
48
|
+
Sketch: `{ action, name?, id?, keep?, dryRun?, all? }`.
|
|
49
|
+
|
|
50
|
+
**8. `canvas_history`**: folds `canvas_undo`, `canvas_redo`.
|
|
51
|
+
Action enum: `undo | redo`.
|
|
52
|
+
Sketch: `{ action }`.
|
|
53
|
+
|
|
54
|
+
**9. `canvas_query`**: folds `canvas_search`, `canvas_get_layout`, `canvas_validate`.
|
|
55
|
+
Action enum: `search | layout | validate`.
|
|
56
|
+
Sketch: `{ action, query?, limit?, full? }`. The three "read the board" entry points under one description that teaches the cheap-to-expensive ladder (search before layout).
|
|
57
|
+
|
|
58
|
+
**10. `canvas_webview`**: folds `canvas_webview_status`, `canvas_webview_start`, `canvas_webview_stop`, `canvas_resize`, `canvas_evaluate`.
|
|
59
|
+
Action enum: `status | start | stop | resize | evaluate`.
|
|
60
|
+
Sketch: `{ action, width?, height?, expression? }`.
|
|
61
|
+
|
|
62
|
+
**11. `canvas_ax_state`**: folds `canvas_get_ax`, `canvas_set_ax_focus`, `canvas_set_ax_policy`, `canvas_report_host_capability`.
|
|
63
|
+
Action enum: `get | set-focus | set-policy | report-capability`.
|
|
64
|
+
Sketch: `{ action, focus?, policy?, capability? }`.
|
|
65
|
+
|
|
66
|
+
**12. `canvas_ax_work`**: folds `canvas_add_work_item`, `canvas_update_work_item`, `canvas_add_review_annotation`.
|
|
67
|
+
Action enum: `add | update | annotate`.
|
|
68
|
+
Sketch: `{ action, id?, title?, status?, detail?, nodeIds?, body?, anchor? }`.
|
|
69
|
+
|
|
70
|
+
**13. `canvas_ax_gate`**: folds the nine gate tools: `canvas_request_approval`, `canvas_resolve_approval`, `canvas_await_approval`, `canvas_request_elicitation`, `canvas_respond_elicitation`, `canvas_await_elicitation`, `canvas_request_mode`, `canvas_resolve_mode`, `canvas_await_mode`.
|
|
71
|
+
Two discriminators: `kind: approval | elicitation | mode` and `action: request | resolve | await`.
|
|
72
|
+
Sketch: `{ kind, action, id?, title?, detail?, nodeIds?, decision?, response?, mode?, timeoutMs? }`. `resolve` carries `decision` for approval/mode and `response` for elicitation. The biggest single win: 9 tools to 1, and the request/await pairing finally reads as one lifecycle.
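
The two-discriminator dispatch can be pictured as a lookup table. The operation names below are the ones plan 007 lists for slice B (note that elicitation uses `.respond`, and that `await` maps onto the gate read); the helper names are hypothetical.

```typescript
type GateKind = "approval" | "elicitation" | "mode";
type GateAction = "request" | "resolve" | "await";

// kind x action -> registered operation name (names as listed in plan 007).
const GATE_OPERATIONS: Record<GateKind, Record<GateAction, string>> = {
  approval: {
    request: "ax.approval.request",
    resolve: "ax.approval.resolve",
    await: "ax.approval.get",
  },
  elicitation: {
    request: "ax.elicitation.request",
    resolve: "ax.elicitation.respond",
    await: "ax.elicitation.get",
  },
  mode: {
    request: "ax.mode.request",
    resolve: "ax.mode.resolve",
    await: "ax.mode.get",
  },
};

function gateOperation(kind: GateKind, action: GateAction): string {
  return GATE_OPERATIONS[kind][action];
}

// `resolve` carries `decision` for approval/mode and `response` for elicitation.
function resolvePayloadField(kind: GateKind): "decision" | "response" {
  return kind === "elicitation" ? "response" : "decision";
}
```

Keeping the table typed as a full `Record` over both unions means all nine combinations must be present for the code to compile.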
|
|
73
|
+
|
|
74
|
+
**14. `canvas_ax_timeline`**: folds `canvas_get_ax_timeline`, `canvas_record_ax_event`, `canvas_add_evidence`, `canvas_send_steering`.
|
|
75
|
+
Action enum: `read | record-event | add-evidence | send-steering`.
|
|
76
|
+
Sketch: `{ action, kind?, summary?, payload?, evidenceType?, message?, limit? }`.
|
|
77
|
+
|
|
78
|
+
**15. `canvas_ax_delivery`**: folds `canvas_claim_ax_delivery`, `canvas_mark_ax_delivery`.
|
|
79
|
+
Action enum: `claim | mark`.
|
|
80
|
+
Sketch: `{ action, consumer?, id? }`.
|
|
81
|
+
|
|
82
|
+
### Kept standalone (composition would hurt)
|
|
83
|
+
|
|
84
|
+
- **16. `canvas_batch`**: already the meta-operation; folding anything into it inverts the design.
|
|
85
|
+
- **17. `canvas_pin_nodes`**: the flagship human-context primitive; deserves its own description so agents find it.
|
|
86
|
+
- **18. `canvas_screenshot`**: returns an MCP image payload; mixing return types inside a composite makes `formatResult` and client handling worse.
|
|
87
|
+
- **19. `canvas_ax_interaction`**: the single normalized trust-boundary envelope; it already is a composite by design.
|
|
88
|
+
- **20. `canvas_ingest_activity`**: adapter firehose with reaction semantics; distinct caller (harness, not agent).
|
|
89
|
+
- **21. `canvas_invoke_command`**: execution-intent tool; allowlist and approval-policy relevant, so it must stay individually nameable.
|
|
90
|
+
|
|
91
|
+
Every one of the 69 legacy tools maps to exactly one row above; nothing is dropped without a successor.
|
|
92
|
+
|
|
93
|
+
## Migration
|
|
94
|
+
|
|
95
|
+
1. **v0.2 minors: add consolidated tools alongside legacy.** Each consolidated tool ships when its registry slice lands (plan-005 migration order). Implementation per tool: one registry `mcp` registration with an `action` (and for gates, `kind`) discriminator in `extraShape`, a `buildInput` that maps the composite args onto the existing operation input, dispatching via the operation name. Legacy tools keep working unchanged.
|
|
96
|
+
2. **Same minors: mark legacy deprecated.** Each legacy tool description gets a leading `Deprecated: use canvas_x with action "y".` line, plus `### Deprecated` CHANGELOG entries and docs/mcp.md updates, per the api-stability contract.
|
|
97
|
+
3. **v0.3.0: remove legacy tools.** `### Breaking` CHANGELOG entry listing every removed tool and its replacement.
|
|
98
|
+
4. **Freeze test updated in two deliberate steps.** Step one (v0.2): the frozen list grows to 69 + 21 = 90 names as consolidated tools land (additive, not breaking). Step two (v0.3.0): the list shrinks to the 21 survivors in the same commit that deletes the legacy registrations. Both edits are intentional per the freeze test's contract.
|
|
99
|
+
5. **Verification per step:** `bun test tests/unit/mcp-tool-freeze.test.ts tests/unit/operation-parity.test.ts tests/unit/mcp-server.test.ts`, plus one parity case per composite asserting that the composite action and its legacy tool produce identical results through the same operation.
|
|
100
|
+
|
|
101
|
+
The interim 90-tool surface is worse than 69 for one or two minors. Accepted: the alternative (flag-day rename) breaks every existing client at once with no migration window.
|
|
102
|
+
|
|
103
|
+
## Risks
|
|
104
|
+
|
|
105
|
+
- **MCP clients with tool allowlists.** A client allowlisting `canvas_add_node` gets nothing when the tool disappears in v0.3, and a coarse `canvas_node` allowlist grants add AND remove together. Consolidation moves the permission boundary from tool name to action param, which allowlist-based policy cannot see. Mitigations: the v0.2 overlap window, loud CHANGELOG + docs/mcp.md migration table, and keeping the sensitive standalones (`canvas_invoke_command`, `canvas_ax_interaction`, `canvas_ingest_activity`) individually nameable. For finer control PMX's own AX policy (`canvas_set_ax_policy` `tools.approvalRequired`) remains the recommended layer.
|
|
106
|
+
- **Action-enum schema bloat.** A composite's schema is the union of its members' fields, mostly optional. If a composite's description plus schema approaches the combined size of the tools it replaced, the consolidation bought nothing; measure serialized listTools size before and after (target: well over 50% reduction).
|
|
107
|
+
- **Worse errors for wrong field/action combinations.** `buildInput` must reject mismatches loudly (OperationError 400 naming the action and the offending field), not silently ignore fields the action does not use.
|
|
108
|
+
- **Placement judgment calls** (`remove-annotation` under `canvas_view`, `refresh` under `canvas_node`, `evaluate` under `canvas_webview`) are cheap to revisit before v0.3 freezes the surface; after that they are contract.
|
|
109
|
+
- **Stale agent muscle memory.** Skills, docs, and CLAUDE.md reference legacy names everywhere. The v0.3 commit must sweep `skills/`, `docs/`, and the MCP `canvas_describe_schema` routing map in the same change, or agents will be steered at tools that no longer exist.
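
One way to handle the wrong field/action risk without breaking junk-key tolerance is to reject only fields that belong to a different action of the same composite. The sketch below uses `canvas_webview`'s fields; the per-action field assignment is an assumption, as is the helper name.

```typescript
class OperationError extends Error {
  readonly status: 400 | 404 | 409;
  constructor(message: string, status: 400 | 404 | 409) {
    super(message);
    this.name = "OperationError";
    this.status = status;
  }
}

type WebviewAction = "status" | "start" | "stop" | "resize" | "evaluate";

// Assumed mapping of composite fields to the actions that use them.
const FIELDS_BY_ACTION: Record<WebviewAction, readonly string[]> = {
  status: [],
  start: [],
  stop: [],
  resize: ["width", "height"],
  evaluate: ["expression"],
};

const ALL_COMPOSITE_FIELDS = new Set<string>(Object.values(FIELDS_BY_ACTION).flat());

// Throws when a known composite field is supplied with an action that does not use it.
// Keys the composite does not know about are left alone (loose-schema behavior).
function assertFieldsMatchAction(action: WebviewAction, args: Record<string, unknown>): void {
  const allowed = FIELDS_BY_ACTION[action];
  for (const [field, value] of Object.entries(args)) {
    if (field === "action" || value === undefined) continue;
    if (ALL_COMPOSITE_FIELDS.has(field) && !allowed.includes(field)) {
      throw new OperationError(
        `Action "${action}" does not accept field "${field}"`,
        400,
      );
    }
  }
}
```

The error message names both the action and the offending field, which is the behavior the risk item asks for.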
|
|
@@ -0,0 +1,99 @@
|
|
|
1
|
+
# Plan 007 — AX domain: state split + orphan-bug fix + registry migration + tool consolidation
|
|
2
|
+
|
|
3
|
+
**Status:** Proposed
|
|
4
|
+
**Date:** 2026-06-13
|
|
5
|
+
**Depends on:** plan-005 (operation registry — slices 1–4 merged), plan-006 (MCP tool consolidation — wave 1 merged).
|
|
6
|
+
**Motivation:** Retires the remaining hard parts of three tech-debt items at once, all of which converge on the AX domain:
|
|
7
|
+
- **Item 1** (operation registry) — migration item 7: the ~25 AX operations are still hand-written 4× (CanvasStateManager → PmxCanvas SDK → HTTP handler → MCP tool → CanvasAccess). ~37 AX HTTP routes + 25 AX MCP tools.
|
|
8
|
+
- **Item 3** (CanvasStateManager split): AX state lives as `_axState` inside the 2,498-line `CanvasStateManager`; node deletion silently orphans AX items; the snapshot-vs-audit contract is undocumented as code.
|
|
9
|
+
- **Item 2** (tool consolidation) — wave 2: the AX tools are the biggest single win (25 → 5 composites, including 9 gate tools → 1), but blocked until AX ops are registry-backed.
|
|
10
|
+
|
|
11
|
+
## The AX state contract (authoritative)
|
|
12
|
+
|
|
13
|
+
Three partitions, confirmed against the code. This contract is the spec for the split.
|
|
14
|
+
|
|
15
|
+
| Partition | Members | Storage | Snapshotted | Cleared by `canvas_clear` | Cleared by `restore` |
|
|
16
|
+
|-----------|---------|---------|:-----------:|:-------------------------:|:--------------------:|
|
|
17
|
+
| **Canvas-bound** | `focus`, `workItems`, `approvalGates`, `reviewAnnotations`, `elicitations`, `modeRequests`, `policy` | in-memory `_axState` + one JSON blob in `ax_state` table | ✅ | ✅ | ✅ (replaced by snapshot's AX) |
|
|
18
|
+
| **Timeline (audit-only)** | `agent-event`, `evidence-item`, `steering-message` | `ax_events` / `ax_evidence` / `ax_steering` tables, 500-row retention, sequential ids | ❌ | ❌ | ❌ |
|
|
19
|
+
| **Host/session** | `host-capability` | `ax_host_capabilities` table | ❌ | ❌ | ❌ |
|
|
20
|
+
|
|
21
|
+
Rules: canvas-bound state travels with the canvas (snapshot/restore/clear); timeline + host data are diagnostic and survive all three. This is mostly already in CLAUDE.md — plan-007 makes it the documented module boundary.
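
The rule can be pictured with a toy store. Everything here (the `AxStore` shape, string arrays instead of real records, the helper names) is a simplification for illustration; only the partition behavior mirrors the table.

```typescript
interface CanvasBoundAx {
  focus: string[];
  workItems: string[];
}

interface AxStore {
  canvasBound: CanvasBoundAx; // travels with the canvas
  timeline: string[]; // audit-only, survives snapshot/restore/clear
  hostCapabilities: string[]; // host/session, survives as well
}

function copyCanvasBound(state: CanvasBoundAx): CanvasBoundAx {
  return { focus: [...state.focus], workItems: [...state.workItems] };
}

// A snapshot captures only the canvas-bound partition.
function takeSnapshot(store: AxStore): CanvasBoundAx {
  return copyCanvasBound(store.canvasBound);
}

// Restore replaces the canvas-bound partition and leaves the others untouched.
function restoreSnapshot(store: AxStore, snapshot: CanvasBoundAx): void {
  store.canvasBound = copyCanvasBound(snapshot);
}

// Clear resets the canvas-bound partition only.
function clearCanvas(store: AxStore): void {
  store.canvasBound = { focus: [], workItems: [] };
}
```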
|
|
22
|
+
|
|
23
|
+
## Slice A — AX state extraction + orphan-bug fix + contract doc (tech-debt item 3)
|
|
24
|
+
|
|
25
|
+
**Extraction.** `_axState` is a single separable field; timeline ops are already DB-direct; normalization lives cleanly in `ax-state.ts`. Move the canvas-bound state + its ~17 mutators / ~13 readers + the timeline-direct ops into a dedicated `AxStateManager` (new `src/server/ax-state-manager.ts`). `CanvasStateManager` keeps a `private ax: AxStateManager` and **delegates** its existing AX methods to it — so the public surface (SDK, HTTP, MCP all call `canvasState.addWorkItem(...)` etc.) is byte-stable and no caller changes. The manager takes a node-id-validity callback (injected) so normalization still runs on write. Net: `CanvasStateManager` sheds ~600 lines; AX state becomes independently testable.
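
The delegation shape described above, reduced to a single method, might look like this. The class names match the plan, but the members, the `WorkItem` shape, and the callback signature are illustrative assumptions rather than the shipped `ax-state-manager` API.

```typescript
interface WorkItem {
  id: string;
  title: string;
  nodeIds: string[];
}

class AxStateManager {
  private readonly isValidNodeId: (id: string) => boolean;
  private readonly workItems: WorkItem[] = [];

  // The node-id validity check is injected, so this class never imports canvas state.
  constructor(isValidNodeId: (id: string) => boolean) {
    this.isValidNodeId = isValidNodeId;
  }

  addWorkItem(item: WorkItem): WorkItem {
    // Normalization still runs on write: unknown node ids are stripped.
    const normalized: WorkItem = {
      ...item,
      nodeIds: item.nodeIds.filter((id) => this.isValidNodeId(id)),
    };
    this.workItems.push(normalized);
    return normalized;
  }

  listWorkItems(): readonly WorkItem[] {
    return this.workItems;
  }
}

class CanvasStateManager {
  private readonly nodeIds = new Set<string>();
  private readonly ax = new AxStateManager((id) => this.nodeIds.has(id));

  addNode(id: string): void {
    this.nodeIds.add(id);
  }

  // Existing public AX methods keep their signatures and simply delegate.
  addWorkItem(item: WorkItem): WorkItem {
    return this.ax.addWorkItem(item);
  }

  listWorkItems(): readonly WorkItem[] {
    return this.ax.listWorkItems();
  }
}
```

Because callers still go through `CanvasStateManager`, the AX manager can be unit-tested on its own by passing any predicate as the validity callback.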
|
|
26
|
+
|
|
27
|
+
**The orphan bug** (`canvas-state.ts:1459`): `removeNode()` → `applyAxState()` → `normalizeAxForCurrentNodes()` re-normalizes AX against the surviving node set. Precise current behavior (verified):
|
|
28
|
+
- Work items / approval gates / elicitations / mode requests: `normalizeNodeIds` (`ax-state.ts:255`) **strips the dangling node id but keeps the item** — already "soft", but *silent*.
|
|
29
|
+
- Node-anchored review annotations (`anchorType:'node'`): **dropped entirely** (`ax-state.ts:577-582`) — correct (meaningless without their node), but *silent*.
|
|
30
|
+
- No event, no audit, on either. Same silent re-normalization runs on `restore()` (`canvas-state.ts:1063`).
|
|
31
|
+
|
|
32
|
+
**Decision (chosen): soft-orphan + audit.** The data semantics already match soft-orphan, so the fix is the **audit**: when node deletion strips a node ref from a work item/gate/elicitation/mode, or drops a node-anchored review annotation, record one auditable timeline event (`source:'system'`) summarizing what was re-anchored/removed and the trigger node — so the human and a resuming agent can see it instead of work silently changing. Needs a general-purpose system/audit event kind (add `'note'` to `PmxAxEventKind`; `kind` is stored as TEXT so no DB migration). Node-anchored review drop is kept (per the decision), now audited. Scope the audit to `removeNode` (the live bug); `restore` replaces the whole canvas wholesale and its snapshot AX was already consistent when saved.
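
A minimal sketch of the soft-orphan plus audit behavior, covering only work items and review annotations (gates, elicitations, and mode requests would follow the work-item path). The record shapes, the function name, and the summary wording are hypothetical; `kind: 'note'` and `source: 'system'` follow the decision above.

```typescript
interface WorkItem {
  id: string;
  nodeIds: string[];
}

interface ReviewAnnotation {
  id: string;
  anchorType: string; // 'node' for node-anchored annotations
  nodeId?: string;
}

interface AuditEvent {
  kind: "note";
  source: "system";
  summary: string;
}

interface OrphanResult {
  workItems: WorkItem[];
  annotations: ReviewAnnotation[];
  audit: AuditEvent | null; // null when the removal touched no AX state
}

function applyNodeRemoval(
  removedNodeId: string,
  workItems: WorkItem[],
  annotations: ReviewAnnotation[],
): OrphanResult {
  const reanchored: string[] = [];
  const dropped: string[] = [];

  // Soft-orphan: strip the dangling node id but keep the item.
  const nextWorkItems = workItems.map((item) => {
    if (!item.nodeIds.includes(removedNodeId)) return item;
    reanchored.push(item.id);
    return { ...item, nodeIds: item.nodeIds.filter((id) => id !== removedNodeId) };
  });

  // Node-anchored annotations are meaningless without their node, so they are dropped.
  const nextAnnotations = annotations.filter((annotation) => {
    const orphaned = annotation.anchorType === "node" && annotation.nodeId === removedNodeId;
    if (orphaned) dropped.push(annotation.id);
    return !orphaned;
  });

  const changed = reanchored.length > 0 || dropped.length > 0;
  const audit: AuditEvent | null = changed
    ? {
        kind: "note",
        source: "system",
        summary:
          `Node ${removedNodeId} removed: re-anchored [${reanchored.join(", ")}], ` +
          `dropped annotations [${dropped.join(", ")}]`,
      }
    : null;

  return { workItems: nextWorkItems, annotations: nextAnnotations, audit };
}
```

Returning the audit event instead of writing it lets the caller decide where to record it, which keeps the normalization step itself free of storage concerns.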
|
|
33
|
+
|
|
34
|
+
**Contract doc:** formalize the partition table above in `docs/ax-state-contract.md` (or a CLAUDE.md section) as the authoritative snapshot-vs-audit spec.
|
|
35
|
+
|
|
36
|
+
Slice A is independent of the registry migration (delegation keeps the surface stable) and is the highest-correctness-value piece. The orphan fix alone is a small, shippable change that can land first.
|
|
37
|
+
|
|
38
|
+
## Slice B — AX registry migration (tech-debt item 1 / plan-005 item 7)
|
|
39
|
+
|
|
40
|
+
Define AX operations in `src/server/operations/ops/ax.ts` (split into `ax-state.ts` / `ax-timeline.ts` files if large), following the established pattern (loose zod schemas, `OperationError`, `http.serialize`, `mcp.formatResult/buildInput`, frozen tool names, `ctx.emit` for the AX SSE frames). Delete the legacy HTTP handler + route + MCP tool block + orphaned CanvasAccess method in the same change per op.
|
|
41
|
+
|
|
42
|
+
**Fits the simple synchronous model** (`mutates: false`, emit `ax-state-changed` or `ax-event-created` via `ctx.emit`; these are NOT layout mutations):
|
|
43
|
+
- State: `ax.get`, `ax.focus.set`, `ax.policy.get`, `ax.policy.set`, `ax.host-capability.report`
|
|
44
|
+
- Work/review: `ax.work.create`, `ax.work.update`, `ax.review.add`, `ax.review.update`
|
|
45
|
+
- Gates (create/resolve): `ax.approval.request`/`.resolve`, `ax.elicitation.request`/`.respond`, `ax.mode.request`/`.resolve`
|
|
46
|
+
- Timeline: `ax.timeline.get`, `ax.event.record`, `ax.evidence.add`, `ax.steer`
|
|
47
|
+
- Delivery: `ax.delivery.pending` (loop-safe consumer scoping preserved), `ax.delivery.mark`
|
|
48
|
+
- Commands: `ax.command.invoke` (allowlist-gated; records a timeline event only)
|
|
49
|
+
|
|
50
|
+
**Needs special handling (own sub-slice):**
|
|
51
|
+
- **Gate reads with long-poll** (`ax.approval.get` / `ax.elicitation.get` / `ax.mode.get`, the `await_*` tools): the HTTP `?waitMs=` blocks via `waitForAxResolution()` + `req.signal`. Migrate using a custom `http.readInput` that performs the wait and returns the resolved-or-`pending` value; the MCP `await` action passes `timeoutMs` through to the handler (no abort signal off-HTTP, timeout still honored). **Fallback:** if this abstraction turns ugly, leave the 3 `await_*` tools + their GET routes legacy and fold them into `canvas_ax_gate`'s `await` action in a later step (report as deferred, plan-005-style — do not force a bad abstraction).
|
|
52
|
+
|
|
53
|
+
**Stays as a sidecar (NOT a registry op), but routed through the shared op cores:**
|
|
54
|
+
- **`ax.interaction`** (`applyAxInteraction`, `src/server/ax-interaction.ts`): the single re-validation trust boundary for sandboxed-surface envelopes, with `sourceSurface` scope-clamping. Keep `POST /api/canvas/ax/interaction` + `canvas_ax_interaction` as-is, but point its per-type dispatch at the SAME operation cores the registry ops call — so interaction and direct calls can never diverge.
|
|
55
|
+
- **`ax.activity.ingest`** (`canvas_ingest_activity`): harness firehose with kind-driven auto-reactions firing 3–4 SSE events; distinct caller shape. Stays standalone.
|
|
56
|
+
|
|
57
|
+
**Preserve exactly:** SSE event names (`ax-state-changed`, `ax-event-created`), the resource-notification fan-out (`canvas://ax`, `ax-context`, `ax-timeline`, `ax-work`, `ax-pending-steering`, `ax-delivery`), structured denial bodies (`resolve` on a missing/already-resolved gate; node-anchored review requiring a real node id), `source` defaulting (`'mcp'` for MCP, `'api'` for HTTP). The SDK's AX methods become thin wrappers over the op cores; CanvasAccess Local/Remote AX methods are deleted (the invoker replaces them) — the same local-vs-remote unification class as slices 1–4.
|
|
58
|
+
|
|
59
|
+
## Slice C — AX tool consolidation (tech-debt item 2 / plan-006 wave 2)
|
|
60
|
+
|
|
61
|
+
Additive composites (per `docs/api-stability.md`; same mechanism as wave 1 — derived schema + reused op `buildInput`/`formatResult`, deprecation prefix on the folded legacy tools, removed in v0.3):
|
|
62
|
+
|
|
63
|
+
1. **`canvas_ax_state`** — `get | set-focus | set-policy | report-capability`
|
|
64
|
+
2. **`canvas_ax_work`** — `add | update | annotate` (work items + review annotations)
|
|
65
|
+
3. **`canvas_ax_gate`** — two discriminators `kind: approval|elicitation|mode` × `action: request|resolve|await`. **9 tools → 1.** The single biggest consolidation win.
|
|
66
|
+
4. **`canvas_ax_timeline`** — `read | record-event | add-evidence | send-steering`
|
|
67
|
+
5. **`canvas_ax_delivery`** — `claim | mark`
|
|
68
|
+
|
|
69
|
+
**Stay standalone** (plan-006 §19–21): `canvas_ax_interaction` (trust-boundary envelope), `canvas_ingest_activity` (harness firehose), `canvas_invoke_command` (gated execution intent, allowlist/approval-policy relevant). Freeze list grows by 5 (additive); legacy AX tools gain the `Deprecated:` prefix.
|
|
70
|
+
|
|
71
|
+
## Migration order
|
|
72
|
+
|
|
73
|
+
1. **A.1** orphan-bug fix + audit note (small, shippable first; behavior change — see decision).
|
|
74
|
+
2. **A.2** extract `AxStateManager`, delegate from `CanvasStateManager`, document the contract.
|
|
75
|
+
3. **B.1** migrate the simple AX state/work/gate-mutate/timeline/delivery/command ops; delete their legacy handlers/tools/CanvasAccess methods; SDK wraps cores.
|
|
76
|
+
4. **B.2** gate-read long-poll sub-slice (custom `readInput`; fallback = leave `await_*` legacy).
|
|
77
|
+
5. **B.3** re-point `applyAxInteraction` at the shared cores (no behavior change).
|
|
78
|
+
6. **C** add the 5 AX composites + deprecate legacy AX tools.
|
|
79
|
+
|
|
80
|
+
Each step is its own commit; B.1 is internally parallelizable per primitive (work / approvals / elicitations / modes / review / timeline / delivery) — the dynamic-workflow fit.
|
|
81
|
+
|
|
82
|
+
## Risks
|
|
83
|
+
|
|
84
|
+
- **Behavior change (orphan fix).** Soft-orphan changes long-standing silent-drop semantics. Mitigation: the explicit decision recorded in Slice A; a parity-test case pins the new behavior; CHANGELOG `### Changed` entry.
|
|
85
|
+
- **State extraction regressions.** `_axState` is touched by snapshot/restore/clear/load and the orphan path. Mitigation: delegation keeps the public surface identical; the existing AX + snapshot unit tests must pass untouched; add a node-delete-orphan test.
|
|
86
|
+
- **Long-poll abstraction.** The `await_*` ops are the only AX ops that don't fit the synchronous model. Mitigation: custom `readInput`; documented fallback to keep them legacy.
|
|
87
|
+
- **Trust-boundary drift.** `applyAxInteraction` must call the same cores as the registry ops, or the sandboxed-surface path diverges from the direct path. Mitigation: extract mutation cores first, route both through them; the interaction-scope tests stay untouched.
|
|
88
|
+
- **MCP-against-remote.** At least one test per migrated AX tool through RemoteCanvasAccess/HttpOperationInvoker (mcp-server daemon mode).
|
|
89
|
+
- **Surface size.** Interim tool count grows again (additive); accepted per plan-006.
|
|
90
|
+
|
|
91
|
+
## Verification (every slice)
|
|
92
|
+
|
|
93
|
+
1. `bun run typecheck`
|
|
94
|
+
2. Targeted: `PMX_CANVAS_DISABLE_BROWSER_OPEN=1 bun test tests/unit/operation-parity.test.ts tests/unit/mcp-tool-freeze.test.ts tests/unit/mcp-server.test.ts tests/unit/mcp-composites.test.ts tests/unit/server-api.test.ts tests/unit/canvas-state.test.ts` (+ AX-specific suites, + a new node-delete-orphan test)
|
|
95
|
+
3. Full unit: `bun test tests/unit`
|
|
96
|
+
4. e2e gate on the PR (`test` + `e2e` required checks).
|
|
97
|
+
5. A parity case per migrated tool (composite action == legacy tool, both through the same op) and per new behavior (soft-orphan + audit note).
|
|
98
|
+
|
|
99
|
+
Tool-name freeze + operation-parity edits are deliberate and called out in the same commit, with CHANGELOG entries (`### Added` composites, `### Deprecated` legacy, `### Changed` orphan semantics).
|