pmx-canvas 0.1.26 → 0.1.27
- package/.github/extensions/pmx-canvas/extension.mjs +191 -0
- package/CHANGELOG.md +74 -0
- package/Readme.md +74 -27
- package/dist/canvas/index.js +82 -82
- package/dist/json-render/index.css +1 -1
- package/dist/json-render/index.js +944 -164
- package/dist/types/json-render/catalog.d.ts +195 -20
- package/dist/types/json-render/charts/components.d.ts +7 -0
- package/dist/types/json-render/charts/definitions.d.ts +13 -1
- package/dist/types/json-render/charts/tufte-components.d.ts +65 -0
- package/dist/types/json-render/charts/tufte-definitions.d.ts +164 -0
- package/dist/types/json-render/directives.d.ts +23 -0
- package/dist/types/json-render/renderer/index.d.ts +1 -0
- package/dist/types/json-render/server.d.ts +32 -1
- package/dist/types/mcp/canvas-access.d.ts +62 -0
- package/dist/types/server/ax-state.d.ts +170 -0
- package/dist/types/server/canvas-db.d.ts +17 -1
- package/dist/types/server/canvas-operations.d.ts +45 -0
- package/dist/types/server/canvas-schema.d.ts +5 -1
- package/dist/types/server/canvas-state.d.ts +95 -4
- package/dist/types/server/index.d.ts +114 -2
- package/dist/types/server/mutation-history.d.ts +1 -1
- package/docs/cli.md +42 -0
- package/docs/http-api.md +64 -0
- package/docs/mcp.md +23 -5
- package/docs/node-types.md +1 -1
- package/docs/screenshots/codex-app.png +0 -0
- package/docs/screenshots/github-copilot-app.png +0 -0
- package/docs/sdk.md +19 -1
- package/package.json +10 -7
- package/skills/control-session-orchestrator/SKILL.md +359 -0
- package/skills/control-session-orchestrator/evals/evals.json +75 -0
- package/skills/data-analysis/SKILL.md +6 -0
- package/skills/pmx-canvas/SKILL.md +50 -4
- package/skills/pmx-canvas/references/github-copilot-app-adapter.md +6 -0
- package/skills/tufte-viz/SKILL.md +157 -0
- package/skills/tufte-viz/references/analytical-design.md +217 -0
- package/skills/tufte-viz/references/tufte-principles.md +147 -0
- package/src/cli/agent.ts +280 -2
- package/src/cli/index.ts +2 -1
- package/src/client/nodes/ExtAppFrame.tsx +23 -1
- package/src/client/nodes/McpAppNode.tsx +6 -2
- package/src/json-render/catalog.ts +22 -1
- package/src/json-render/charts/components.tsx +97 -10
- package/src/json-render/charts/definitions.ts +19 -2
- package/src/json-render/charts/extra-components.tsx +5 -4
- package/src/json-render/charts/tufte-components.tsx +383 -0
- package/src/json-render/charts/tufte-definitions.ts +128 -0
- package/src/json-render/directives.ts +29 -0
- package/src/json-render/renderer/index.css +101 -0
- package/src/json-render/renderer/index.tsx +33 -0
- package/src/json-render/server.ts +257 -5
- package/src/mcp/canvas-access.ts +261 -0
- package/src/mcp/server.ts +496 -7
- package/src/server/ax-context.ts +8 -3
- package/src/server/ax-state.ts +447 -0
- package/src/server/canvas-db.ts +184 -1
- package/src/server/canvas-operations.ts +107 -0
- package/src/server/canvas-schema.ts +26 -3
- package/src/server/canvas-state.ts +349 -2
- package/src/server/index.ts +234 -2
- package/src/server/mutation-history.ts +6 -0
- package/src/server/server.ts +419 -2
package/docs/cli.md
CHANGED
|
@@ -125,6 +125,48 @@ pmx-canvas ax focus node-1 node-2 # Set AX focus
|
|
|
125
125
|
pmx-canvas ax focus --clear # Clear AX focus
|
|
126
126
|
```
|
|
127
127
|
|
|
128
|
+
### AX primitives
|
|
129
|
+
|
|
130
|
+
Host-agnostic agent-experience primitives. Timeline entries persist for
|
|
131
|
+
diagnostics (retention-bounded, not snapshotted); work items, approval gates,
|
|
132
|
+
and review annotations are canvas-bound and ride snapshots/restore.
|
|
133
|
+
|
|
134
|
+
```bash
|
|
135
|
+
# Timeline
|
|
136
|
+
pmx-canvas ax event add --kind tool-start --summary "ran tests"
|
|
137
|
+
pmx-canvas ax steer "focus on the failing test first"
|
|
138
|
+
pmx-canvas ax evidence add --kind test-output --title "unit pass"
|
|
139
|
+
pmx-canvas ax timeline --limit 50
|
|
140
|
+
|
|
141
|
+
# Work items (canvas-bound)
|
|
142
|
+
pmx-canvas ax work add --title "Wire up auth" --status in-progress node-1
|
|
143
|
+
pmx-canvas ax work update <id> --status done
|
|
144
|
+
pmx-canvas ax work list
|
|
145
|
+
|
|
146
|
+
# Approval gates (canvas-bound; pending → approved/rejected)
|
|
147
|
+
pmx-canvas ax approval request --title "Deploy to prod" --action deploy.prod
|
|
148
|
+
pmx-canvas ax approval resolve <id> --decision approved
|
|
149
|
+
pmx-canvas ax approval list
|
|
150
|
+
|
|
151
|
+
# Review annotations (canvas-bound)
|
|
152
|
+
pmx-canvas ax review add --body "off-by-one" --kind finding --severity error --file src/x.ts
|
|
153
|
+
pmx-canvas ax review list
|
|
154
|
+
|
|
155
|
+
# Host capability (own partition; survives clear)
|
|
156
|
+
pmx-canvas ax host report --host copilot --canvas --session-messaging
|
|
157
|
+
pmx-canvas ax host status
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
## Copilot adapter
|
|
161
|
+
|
|
162
|
+
Install the bundled GitHub Copilot extension adapter into a repo. The adapter
|
|
163
|
+
maps onto the same neutral AX surfaces (it never makes the core GitHub-specific).
|
|
164
|
+
|
|
165
|
+
```bash
|
|
166
|
+
pmx-canvas copilot install-extension --dry-run # Preview target, writes nothing
|
|
167
|
+
pmx-canvas copilot install-extension --yes # Install/overwrite into .github/extensions/pmx-canvas/
|
|
168
|
+
```
|
|
169
|
+
|
|
128
170
|
## WebView automation
|
|
129
171
|
|
|
130
172
|
Drive a headless Bun.WebView (Chromium or WebKit) pointed at the workbench:
|
package/docs/http-api.md
CHANGED
|
@@ -117,6 +117,70 @@ curl -X PATCH http://localhost:4313/api/canvas/ax \
|
|
|
117
117
|
-d '{"focus":{"nodeIds":["node-1"],"source":"api"}}'
|
|
118
118
|
```
|
|
119
119
|
|
|
120
|
+
## AX primitives (timeline, work, host)
|
|
121
|
+
|
|
122
|
+
Host-agnostic agent-experience primitives across three state partitions.
|
|
123
|
+
Canvas-bound state (work items, approval gates, review annotations) rides
|
|
124
|
+
canvas snapshots; timeline state (events, evidence, steering) persists for
|
|
125
|
+
diagnostics but is retention-bounded and not restored by snapshots; the host
|
|
126
|
+
capability is reported by adapters and survives `canvas_clear`.
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
# Timeline — record a normalized agent-event
|
|
130
|
+
curl -X POST http://localhost:4313/api/canvas/ax/event \
|
|
131
|
+
-H "Content-Type: application/json" \
|
|
132
|
+
-d '{"kind":"tool-start","summary":"ran tests","source":"api"}'
|
|
133
|
+
|
|
134
|
+
# Timeline — send a steering message to the active agent session
|
|
135
|
+
curl -X POST http://localhost:4313/api/canvas/ax/steer \
|
|
136
|
+
-H "Content-Type: application/json" \
|
|
137
|
+
-d '{"message":"focus on the failing test first","source":"api"}'
|
|
138
|
+
|
|
139
|
+
# Timeline — record an evidence item (logs/tool-result/screenshot/file/diff/test-output)
|
|
140
|
+
curl -X POST http://localhost:4313/api/canvas/ax/evidence \
|
|
141
|
+
-H "Content-Type: application/json" \
|
|
142
|
+
-d '{"kind":"test-output","title":"unit pass","source":"api"}'
|
|
143
|
+
|
|
144
|
+
# Timeline — read the bounded timeline (default limit 50, max 200)
|
|
145
|
+
curl "http://localhost:4313/api/canvas/ax/timeline?limit=50"
|
|
146
|
+
|
|
147
|
+
# Canvas-bound — add / update a work item
|
|
148
|
+
curl -X POST http://localhost:4313/api/canvas/ax/work \
|
|
149
|
+
-H "Content-Type: application/json" \
|
|
150
|
+
-d '{"title":"Wire up auth","status":"in-progress","nodeIds":["node-1"],"source":"api"}'
|
|
151
|
+
curl -X PATCH http://localhost:4313/api/canvas/ax/work/<id> \
|
|
152
|
+
-H "Content-Type: application/json" \
|
|
153
|
+
-d '{"status":"done"}'
|
|
154
|
+
curl http://localhost:4313/api/canvas/ax/work
|
|
155
|
+
|
|
156
|
+
# Canvas-bound — request / resolve an approval gate (pending → approved/rejected)
|
|
157
|
+
curl -X POST http://localhost:4313/api/canvas/ax/approval \
|
|
158
|
+
-H "Content-Type: application/json" \
|
|
159
|
+
-d '{"title":"Deploy to prod","action":"deploy.prod","source":"api"}'
|
|
160
|
+
curl -X POST http://localhost:4313/api/canvas/ax/approval/<id>/resolve \
|
|
161
|
+
-H "Content-Type: application/json" \
|
|
162
|
+
-d '{"decision":"approved","source":"api"}'
|
|
163
|
+
curl http://localhost:4313/api/canvas/ax/approval
|
|
164
|
+
|
|
165
|
+
# Canvas-bound — add a review annotation (comment/finding) anchored to node/file/region
|
|
166
|
+
curl -X POST http://localhost:4313/api/canvas/ax/review \
|
|
167
|
+
-H "Content-Type: application/json" \
|
|
168
|
+
-d '{"body":"off-by-one","kind":"finding","severity":"error","anchorType":"file","file":"src/x.ts","source":"api"}'
|
|
169
|
+
curl http://localhost:4313/api/canvas/ax/review
|
|
170
|
+
|
|
171
|
+
# Host/session — report and read host capability
|
|
172
|
+
curl -X PUT http://localhost:4313/api/canvas/ax/host-capability \
|
|
173
|
+
-H "Content-Type: application/json" \
|
|
174
|
+
-d '{"host":"copilot","canvas":true,"sessionMessaging":true,"source":"api"}'
|
|
175
|
+
curl http://localhost:4313/api/canvas/ax/host-capability
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
Validation: `/ax/event` requires a valid `kind` + `summary` (400 otherwise);
|
|
179
|
+
`/ax/evidence` requires `kind` + `title`; `/ax/steer`, `/ax/work`,
|
|
180
|
+
`/ax/approval`, `/ax/review` require their primary field; `PATCH /ax/work/:id`
|
|
181
|
+
and `PATCH /ax/review/:id` return 404 for unknown IDs; approval resolve returns
|
|
182
|
+
404 if the gate is missing or already resolved.
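The validation rules above can be mirrored client-side before a request is sent. The following is a minimal sketch, not the server's implementation: `precheckAxBody` and its helper names are hypothetical, the event kinds are taken from the `canvas_record_ax_event` description in the MCP docs (assumed to match what the HTTP route accepts), and the "primary field" per route is inferred from the curl request bodies rather than stated by the docs.

```typescript
// Kinds copied from the docs; the helper and its names are hypothetical.
const EVENT_KINDS: readonly string[] = [
  "prompt", "assistant-message", "tool-start", "tool-result",
  "failure", "approval", "steering",
];
const EVIDENCE_KINDS: readonly string[] = [
  "logs", "tool-result", "screenshot", "file", "diff", "test-output",
];

type AxRoute = "event" | "evidence" | "steer" | "work" | "approval" | "review";

// Assumed "primary field" per route, inferred from the curl request bodies.
const PRIMARY_FIELD: Record<"steer" | "work" | "approval" | "review", string> = {
  steer: "message",
  work: "title",
  approval: "title",
  review: "body",
};

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

/** Returns a list of problems; an empty list means the body looks acceptable. */
function precheckAxBody(route: AxRoute, body: Record<string, unknown>): string[] {
  const problems: string[] = [];
  if (route === "event" || route === "evidence") {
    const kinds = route === "event" ? EVENT_KINDS : EVIDENCE_KINDS;
    const textField = route === "event" ? "summary" : "title";
    const kind = body["kind"];
    if (!isNonEmptyString(kind) || kinds.indexOf(kind) === -1) {
      problems.push("kind must be one of: " + kinds.join(", "));
    }
    if (!isNonEmptyString(body[textField])) problems.push(textField + " is required");
    return problems;
  }
  // After the early return, `route` narrows to the four remaining routes.
  if (!isNonEmptyString(body[PRIMARY_FIELD[route]])) {
    problems.push(PRIMARY_FIELD[route] + " is required");
  }
  return problems;
}
```

A caller could run this check first and skip the request when the list is non-empty; the server remains the authority and may still answer 400 or 404.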
|
|
183
|
+
|
|
120
184
|
## Diagrams (Excalidraw preset)
|
|
121
185
|
|
|
122
186
|
```bash
|
package/docs/mcp.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# MCP reference
|
|
2
2
|
|
|
3
|
-
PMX Canvas ships an MCP stdio server with **
|
|
3
|
+
PMX Canvas ships an MCP stdio server with **56 tools** + **12 core resources**,
|
|
4
4
|
plus per-skill resources at `canvas://skills/<name>`. The server emits
|
|
5
5
|
`notifications/resources/updated` when canvas state changes — humans pin
|
|
6
6
|
nodes in the browser, agents are notified immediately.
|
|
@@ -35,7 +35,8 @@ The canvas auto-starts on first tool call.
|
|
|
35
35
|
| `canvas_validate_spec` | Validate a json-render spec, graph payload, or HTML primitive payload without creating a node |
|
|
36
36
|
| `canvas_refresh_webpage_node` | Re-fetch and update a webpage node from its stored URL |
|
|
37
37
|
| `canvas_add_json_render_node` | Create a native json-render node from a validated spec |
|
|
38
|
-
| `
|
|
38
|
+
| `canvas_stream_json_render_node` | Progressively build a json-render node from SpecStream JSON-Patch ops (live/streaming panels) |
|
|
39
|
+
| `canvas_add_graph_node` | Create a native graph node (line, bar, pie, area, scatter, radar, stacked-bar, composed, sparkline, dot-plot, bullet, slopegraph) |
|
|
39
40
|
| `canvas_build_web_artifact` | Build a bundled HTML artifact and open it on the canvas |
|
|
40
41
|
|
|
41
42
|
`canvas_add_html_node` accepts optional `summary`, `agentSummary`, `embeddedNodeIds`, and
|
|
@@ -51,8 +52,19 @@ searchable and readable in pinned/spatial context.
|
|
|
51
52
|
| `canvas_arrange` | Auto-arrange (grid/column/flow) |
|
|
52
53
|
| `canvas_validate` | Validate collisions, containment, and missing edge endpoints |
|
|
53
54
|
| `canvas_focus_node` | Pan viewport to a node; use CLI `focus --no-pan` when you only need to select/raise |
|
|
54
|
-
| `
|
|
55
|
+
| `canvas_fit_view` | Fit the canvas viewport to all nodes or a selected subset |
|
|
56
|
+
| `canvas_get_ax` | Read the PMX AX state (focus, work items, approvals, review annotations, host capability) plus pinned/focused context |
|
|
55
57
|
| `canvas_set_ax_focus` | Set the host-agnostic AX focus node set; adapters can pass a source such as `codex` |
|
|
58
|
+
| `canvas_record_ax_event` | Record a normalized timeline `agent-event` (prompt/assistant-message/tool-start/tool-result/failure/approval/steering) |
|
|
59
|
+
| `canvas_send_steering` | Record a `steering-message`: a user instruction from the surface to the active agent session |
|
|
60
|
+
| `canvas_get_ax_timeline` | Read the bounded AX timeline (events, evidence, steering) plus counts |
|
|
61
|
+
| `canvas_add_work_item` | Add a canvas-bound `work-item` (visible task/plan/status tied to nodes) |
|
|
62
|
+
| `canvas_update_work_item` | Update a work item's title/status/detail/nodeIds by ID |
|
|
63
|
+
| `canvas_request_approval` | Request human approval via an `approval-gate` (pending) before a high-impact action |
|
|
64
|
+
| `canvas_resolve_approval` | Resolve a pending approval gate (`approved`/`rejected`) |
|
|
65
|
+
| `canvas_add_evidence` | Record an `evidence-item` on the timeline (logs/tool-result/screenshot/file/diff/test-output) |
|
|
66
|
+
| `canvas_add_review_annotation` | Add a canvas-bound `review-annotation` (comment/finding) anchored to a node, file, or region |
|
|
67
|
+
| `canvas_report_host_capability` | Report a host/session `host-capability` for diagnostics |
|
|
56
68
|
| `canvas_pin_nodes` | Pin nodes to include in agent context |
|
|
57
69
|
| `canvas_clear` | Clear all nodes and edges |
|
|
58
70
|
| `canvas_snapshot` | Save current canvas as a named snapshot |
|
|
@@ -82,8 +94,10 @@ Individual bundled skills are also readable at `canvas://skills/<name>`.
|
|
|
82
94
|
| Resource | Description |
|
|
83
95
|
|----------|-------------|
|
|
84
96
|
| `canvas://pinned-context` | Content of pinned nodes + nearby unpinned neighbors |
|
|
85
|
-
| `canvas://ax` | PMX AX state,
|
|
86
|
-
| `canvas://ax-context` | Agent-readable pinned and focused AX context |
|
|
97
|
+
| `canvas://ax` | PMX AX state: focus, work items, approval gates, review annotations |
|
|
98
|
+
| `canvas://ax-context` | Agent-readable pinned and focused AX context, plus timeline summary and host capability |
|
|
99
|
+
| `canvas://ax-work` | Canvas-bound AX work: work items, approval gates, review annotations |
|
|
100
|
+
| `canvas://ax-timeline` | Bounded AX timeline: recent agent-events, evidence, and steering messages |
|
|
87
101
|
| `canvas://schema` | Running-server create schemas and json-render catalog metadata |
|
|
88
102
|
| `canvas://layout` | Full canvas state (all nodes, edges, viewport) |
|
|
89
103
|
| `canvas://summary` | Compact overview: counts, pinned titles, viewport |
|
|
@@ -99,6 +113,10 @@ changes:
|
|
|
99
113
|
|
|
100
114
|
- Pin changes notify `canvas://pinned-context`, `canvas://ax`, and `canvas://ax-context`
|
|
101
115
|
- AX focus changes notify `canvas://ax` and `canvas://ax-context`
|
|
116
|
+
- Canvas-bound AX mutations (work items, approval gates, review annotations,
|
|
117
|
+
host capability) notify `canvas://ax`, `canvas://ax-work`, and `canvas://ax-context`
|
|
118
|
+
- AX timeline mutations (agent-events, evidence, steering) notify
|
|
119
|
+
`canvas://ax-timeline` and `canvas://ax-context`
|
|
102
120
|
- All mutations notify `canvas://layout`, `canvas://summary`,
|
|
103
121
|
`canvas://spatial-context`, `canvas://history`, and `canvas://code-graph`
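For a client that caches resources, the notification rules listed above can be written as a lookup table. This is a sketch only: `MutationKind`, `EXTRA`, and `resourcesToReread` are hypothetical names, and it reads "all mutations" literally, assuming AX mutations also trigger the general layout/summary notifications.

```typescript
// Hypothetical client-side table; resource URIs are copied from the list above.
type MutationKind = "pin" | "ax-focus" | "ax-canvas-bound" | "ax-timeline" | "other";

// Notified on every mutation, per the last bullet.
const ALWAYS: string[] = [
  "canvas://layout",
  "canvas://summary",
  "canvas://spatial-context",
  "canvas://history",
  "canvas://code-graph",
];

// Additional resources notified for each mutation kind.
const EXTRA: Record<MutationKind, string[]> = {
  "pin": ["canvas://pinned-context", "canvas://ax", "canvas://ax-context"],
  "ax-focus": ["canvas://ax", "canvas://ax-context"],
  "ax-canvas-bound": ["canvas://ax", "canvas://ax-work", "canvas://ax-context"],
  "ax-timeline": ["canvas://ax-timeline", "canvas://ax-context"],
  "other": [],
};

/** Resources a caching client would re-read after a mutation of the given kind. */
function resourcesToReread(kind: MutationKind): string[] {
  return EXTRA[kind].concat(ALWAYS);
}
```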
|
|
104
122
|
|
package/docs/node-types.md
CHANGED
|
@@ -19,7 +19,7 @@ see [MCP tools](mcp.md), [HTTP API](http-api.md), and [SDK](sdk.md).
|
|
|
19
19
|
| `webpage` | Persisted webpage snapshot with stored URL, extracted text, refresh |
|
|
20
20
|
| `mcp-app` | Tool-backed hosted MCP App iframes (Excalidraw, etc.) |
|
|
21
21
|
| `json-render` | Structured UI from JSON specs (cards, tables, forms) |
|
|
22
|
-
| `graph` | Charts (line, bar, pie, area, scatter, radar, stacked-bar, composed) |
|
|
22
|
+
| `graph` | Charts (line, bar, pie, area, scatter, radar, stacked-bar, composed, plus Tufte primitives: sparkline, dot-plot, bullet, slopegraph) |
|
|
23
23
|
| `html` | Self-contained HTML/JS in a sandboxed iframe |
|
|
24
24
|
| `web-artifact` | Bundled React/Tailwind artifact (full single-file app) |
|
|
25
25
|
| `group` | Spatial container/frame around other nodes |
|
|
Binary file
|
|
Binary file
|
package/docs/sdk.md
CHANGED
|
@@ -85,9 +85,27 @@ console.log(canvas.validate());
|
|
|
85
85
|
console.log(canvas.getLayout());
|
|
86
86
|
|
|
87
87
|
// AX context for host adapters
|
|
88
|
-
canvas.setAxFocus(
|
|
88
|
+
canvas.setAxFocus([n1], { source: 'sdk' });
|
|
89
89
|
console.log(canvas.getAxState());
|
|
90
90
|
console.log(canvas.getAxContext());
|
|
91
|
+
|
|
92
|
+
// AX primitives — host-agnostic agent-experience layer
|
|
93
|
+
// Timeline (persisted for diagnostics, retention-bounded, not snapshotted)
|
|
94
|
+
canvas.recordAxEvent({ kind: 'tool-start', summary: 'ran tests' }, { source: 'sdk' });
|
|
95
|
+
canvas.addEvidence({ kind: 'test-output', title: 'unit pass' }, { source: 'sdk' });
|
|
96
|
+
canvas.sendSteering('focus on the failing test first', { source: 'sdk' });
|
|
97
|
+
console.log(canvas.getAxTimeline({ limit: 50 }));
|
|
98
|
+
|
|
99
|
+
// Canvas-bound (rides snapshots + restore, cleared by canvas.clear())
|
|
100
|
+
const work = canvas.addWorkItem({ title: 'Wire up auth', status: 'in-progress', nodeIds: [n1] }, { source: 'sdk' });
|
|
101
|
+
canvas.updateWorkItem(work.id, { status: 'done' });
|
|
102
|
+
const gate = canvas.requestApproval({ title: 'Deploy to prod', action: 'deploy.prod' }, { source: 'sdk' });
|
|
103
|
+
canvas.resolveApproval(gate.id, 'approved', { source: 'sdk' });
|
|
104
|
+
canvas.addReviewAnnotation({ body: 'off-by-one', kind: 'finding', severity: 'error', anchorType: 'file', file: 'src/x.ts' }, { source: 'sdk' });
|
|
105
|
+
|
|
106
|
+
// Host/session capability (own table, survives clear)
|
|
107
|
+
canvas.reportHostCapability({ host: 'copilot', canvas: true, sessionMessaging: true }, { source: 'sdk' });
|
|
108
|
+
console.log(canvas.getHostCapability());
|
|
91
109
|
```
|
|
92
110
|
|
|
93
111
|
## WebView automation
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "pmx-canvas",
|
|
3
|
-
"version": "0.1.
|
|
3
|
+
"version": "0.1.27",
|
|
4
4
|
"description": "Spatial canvas workbench for coding agents — infinite 2D canvas with agent-native CLI, MCP integration, nodes, edges, file watching, and snapshots",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "./src/server/index.ts",
|
|
@@ -56,18 +56,21 @@
|
|
|
56
56
|
},
|
|
57
57
|
"dependencies": {
|
|
58
58
|
"@joplin/turndown-plugin-gfm": "^1.0.64",
|
|
59
|
-
"@json-render/core": "
|
|
60
|
-
"@json-render/
|
|
61
|
-
"@json-render/react": "
|
|
62
|
-
"@json-render/
|
|
59
|
+
"@json-render/core": "0.19.0",
|
|
60
|
+
"@json-render/devtools": "0.19.0",
|
|
61
|
+
"@json-render/devtools-react": "0.19.0",
|
|
62
|
+
"@json-render/directives": "0.19.0",
|
|
63
|
+
"@json-render/mcp": "0.19.0",
|
|
64
|
+
"@json-render/react": "0.19.0",
|
|
65
|
+
"@json-render/shadcn": "0.19.0",
|
|
63
66
|
"@modelcontextprotocol/ext-apps": "^1.3.1",
|
|
64
67
|
"@modelcontextprotocol/sdk": "^1.0.0",
|
|
65
68
|
"@preact/signals": "^2.0.0",
|
|
66
69
|
"@types/turndown": "^5.0.6",
|
|
67
70
|
"marked": "^15.0.0",
|
|
68
71
|
"preact": "^10.25.0",
|
|
69
|
-
"react": "^19.2.
|
|
70
|
-
"react-dom": "^19.2.
|
|
72
|
+
"react": "^19.2.3",
|
|
73
|
+
"react-dom": "^19.2.3",
|
|
71
74
|
"recharts": "^3.2.1",
|
|
72
75
|
"turndown": "^7.2.4",
|
|
73
76
|
"zod": "^4.3.6"
|
package/skills/control-session-orchestrator/SKILL.md
|
@@ -0,0 +1,359 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: control-session-orchestrator
|
|
3
|
+
description: >
|
|
4
|
+
Control-plane workflow for coordinating multi-agent, multi-session project work from a single
|
|
5
|
+
Codex, GitHub Copilot, or agent-app control session. Use this skill whenever the user asks to
|
|
6
|
+
orchestrate agents, create or steer worker sessions, run a workflow-like effort, fan out
|
|
7
|
+
audits/research/migrations, coordinate parallel implementation streams, monitor other project
|
|
8
|
+
sessions, or compare this control-session pattern to Claude Code dynamic workflows. This skill is
|
|
9
|
+
especially relevant when the current session can spawn persistent project sessions and those
|
|
10
|
+
sessions can spawn their own subagents, creating a two-level orchestration hierarchy.
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
# Control Session Orchestrator
|
|
14
|
+
|
|
15
|
+
Use the current session as the control plane for project work that is too broad, risky, or
|
|
16
|
+
stateful for one conversation. The control session owns intent, decomposition, routing, status,
|
|
17
|
+
verification, and consolidation. Worker sessions own scoped execution. Worker subagents are local
|
|
18
|
+
implementation/research/audit helpers inside each worker session.
|
|
19
|
+
|
|
20
|
+
## Mental model
|
|
21
|
+
|
|
22
|
+
```
|
|
23
|
+
User
|
|
24
|
+
-> Control session (strategy, dispatch, tracking, integration)
|
|
25
|
+
-> Worker project session A (persistent branch/workstream)
|
|
26
|
+
-> Subagents for research, implementation, review, tests
|
|
27
|
+
-> Worker project session B (persistent branch/workstream)
|
|
28
|
+
-> Subagents for local fan-out
|
|
29
|
+
-> Verifier/reviewer session (optional independent gate)
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
This is similar to dynamic workflows, but the orchestration is human-readable and session-native
|
|
33
|
+
instead of a runtime script. Use it when persistence, branches, PRs, human steering, or cross-session
|
|
34
|
+
continuity matter more than fully automated fan-out.
|
|
35
|
+
|
|
36
|
+
A code runtime gets reliability for free (validated results, barriers, budgets, dedup, resume). A
|
|
37
|
+
prompt-driven control plane only gets it if you make state machine-checkable. Two contracts do that
|
|
38
|
+
without a runtime: a required **worker result block** and a durable **control-state manifest** (see
|
|
39
|
+
[Machine-checkable contracts](#machine-checkable-contracts)). Everything else in this skill keys off
|
|
40
|
+
those two artifacts — without them, "is this worker done and passing?" is a guess, not a field read.
|
|
41
|
+
|
|
42
|
+
## Supported control apps
|
|
43
|
+
|
|
44
|
+
This skill is app-agnostic. First discover which orchestration tools are available in the current
|
|
45
|
+
session, then adapt the same control workflow to that surface.
|
|
46
|
+
|
|
47
|
+
| Capability | Codex app | GitHub Copilot app | Fallback |
|
|
48
|
+
|---|---|---|---|
|
|
49
|
+
| Find worker sessions | List/search project threads | List/search app sessions | Ask user for target session links/IDs |
|
|
50
|
+
| Create persistent workstreams | Create or reuse Codex threads/worktrees when available | Create or reuse Copilot app sessions/workspaces when available | Use local subagents only |
|
|
51
|
+
| Steer an existing workstream | Send a follow-up prompt to the thread | Send a follow-up prompt to the session | Ask user to paste the prompt into the worker |
|
|
52
|
+
| Local fan-out | Spawn subagents from this session or ask workers to spawn their own | Use Copilot's available agent/session tools | Keep work local |
|
|
53
|
+
| Tracking | Thread titles, pins, branches, PRs, canvas nodes, compact status tables | Session names, branches, PRs, issues, canvas nodes, compact status tables | Markdown status table |
|
|
54
|
+
|
|
55
|
+
Do not assume specific GitHub Copilot or Codex tool names. Use the tools exposed in the current
|
|
56
|
+
environment, and say which control surface is active before dispatching workers.
|
|
57
|
+
|
|
58
|
+
## When to use
|
|
59
|
+
|
|
60
|
+
Use this skill for:
|
|
61
|
+
|
|
62
|
+
- Codebase-wide audits, migrations, or parity checks
|
|
63
|
+
- Parallel investigation across modules, services, features, or PRs
|
|
64
|
+
- Work that benefits from independent implementer and verifier sessions
|
|
65
|
+
- Large features where design, implementation, testing, and review should be split
|
|
66
|
+
- Project-control prompts like "coordinate agents", "spin up sessions", "run a workflow",
|
|
67
|
+
"make workers handle this", "monitor the other sessions", or "act as control"
|
|
68
|
+
- Situations where worker sessions may themselves use subagents for local research, coding, or review
|
|
69
|
+
|
|
70
|
+
Do not use it for a simple one-file fix, a quick answer, or a task where a single local subagent is
|
|
71
|
+
enough. Orchestration has overhead; spend it only when coordination reduces risk or increases
|
|
72
|
+
throughput.
|
|
73
|
+
|
|
74
|
+
## Machine-checkable contracts
|
|
75
|
+
|
|
76
|
+
These are the session-native analog of a runtime's typed results and durable run state. They stay
|
|
77
|
+
human-readable, but they are **required**, not advisory — the control session parses them instead of
|
|
78
|
+
re-reading prose.
|
|
79
|
+
|
|
80
|
+
### Worker result block
|
|
81
|
+
|
|
82
|
+
Every worker MUST end its report with a fenced ` ```json ` block tagged `control-result`. The control
|
|
83
|
+
session reads this block (never the surrounding prose) to update state, dedup, and decide routing.
|
|
84
|
+
|
|
85
|
+
```json control-result
|
|
86
|
+
{
|
|
87
|
+
"worker_id": "auth-api",
|
|
88
|
+
"wave_id": "w1",
|
|
89
|
+
"unit_key": "service/auth",
|
|
90
|
+
"scope": "src/auth/** — refresh-token rotation",
|
|
91
|
+
"status": "complete",
|
|
92
|
+
"files_changed": ["src/auth/rotate.ts"],
|
|
93
|
+
"verification": { "command": "pnpm test auth", "result": "pass", "evidence": "42 passed" },
|
|
94
|
+
"subagents_used": "2 — one research, one test author",
|
|
95
|
+
"risks": ["rotation interacts with logout; covered by test"],
|
|
96
|
+
"next_step": "ready for review session",
|
|
97
|
+
"report_ref": "thread/PR/path to the full report"
|
|
98
|
+
}
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
The block must be **strict JSON** (no comments/trailing commas) so it parses. `status` is one of
|
|
102
|
+
`complete | blocked | needs-decision | failed`; `verification.result` is one of `pass | fail | not-run`.
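To illustrate why the block is required to be strict JSON, here is a minimal sketch of how a control session might extract and check it. The function name `parseControlResult` and the choice to validate only a few fields are assumptions made for illustration; the skill itself does not prescribe a parser.

```typescript
type WorkerStatus = "complete" | "blocked" | "needs-decision" | "failed";
type VerificationResult = "pass" | "fail" | "not-run";

interface ControlResult {
  worker_id: string;
  unit_key: string;
  status: WorkerStatus;
  verification: { result: VerificationResult };
}

const WORKER_STATUSES: readonly string[] = ["complete", "blocked", "needs-decision", "failed"];
const VERIFICATION_RESULTS: readonly string[] = ["pass", "fail", "not-run"];

// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`" + "`" + "`";

/** Hypothetical parser: returns the last control-result block in a report, or throws. */
function parseControlResult(report: string): ControlResult {
  const pattern = new RegExp(
    FENCE + "json control-result[ \\t]*\\r?\\n([\\s\\S]*?)\\r?\\n" + FENCE,
    "g",
  );
  let body: string | null = null;
  let match = pattern.exec(report);
  while (match !== null) {
    body = match[1]; // keep the last block, since workers end their report with it
    match = pattern.exec(report);
  }
  if (body === null) throw new Error("no control-result block found");

  const parsed = JSON.parse(body); // throws on comments or trailing commas
  if (typeof parsed.worker_id !== "string" || typeof parsed.unit_key !== "string") {
    throw new Error("worker_id and unit_key must be strings");
  }
  if (WORKER_STATUSES.indexOf(parsed.status) === -1) {
    throw new Error("invalid status: " + parsed.status);
  }
  const result = parsed.verification ? parsed.verification.result : undefined;
  if (VERIFICATION_RESULTS.indexOf(result) === -1) {
    throw new Error("invalid verification.result: " + result);
  }
  return parsed as ControlResult;
}
```

Because the surrounding prose is never inspected, a report without a parseable block fails loudly instead of being guessed at.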
|
|
103
|
+
|
|
104
|
+
### Control-state manifest
|
|
105
|
+
|
|
106
|
+
One durable artifact that **is** the source of truth for the mission — a pinned control thread, a
|
|
107
|
+
tracking-issue body, a canvas node, or a committed `control/state.json`. Re-read and update it every
|
|
108
|
+
turn; keep the conversation for decisions, not state. One row per **unit** (unit-keyed, so the same
|
|
109
|
+
unit is never dispatched twice — this is the dedup ledger).
|
|
110
|
+
|
|
111
|
+
```json
|
|
112
|
+
{
|
|
113
|
+
"mission": "MCP tool parity audit",
|
|
114
|
+
"non_goals": ["no behavior changes"],
|
|
115
|
+
"success_criteria": ["every tool present in server, HTTP, SDK, docs or flagged"],
|
|
116
|
+
"budget": { "max_concurrent_workers": 5, "max_total_workers": 25, "spawned": 0, "in_flight": 0 },
|
|
117
|
+
"convergence": { "rule": "single-pass", "k_empty": 2, "empty_streak": 0, "target": null, "current": 0 },
|
|
118
|
+
"workers": [
|
|
119
|
+
{
|
|
120
|
+
"unit_key": "surface/http",
|
|
121
|
+
"worker_id": "http-audit",
|
|
122
|
+
"session_ref": "thread-or-session id/link",
|
|
123
|
+
"scope": "HTTP API surface",
|
|
124
|
+
"branch_or_pr": "—",
|
|
125
|
+
"status": "pending",
|
|
126
|
+
"wave_id": "w1",
|
|
127
|
+
"last_update": "ISO-8601",
|
|
128
|
+
"evidence_ref": "report_ref from the result block"
|
|
129
|
+
}
|
|
130
|
+
],
|
|
131
|
+
"decisions": [],
|
|
132
|
+
"open_followups": []
|
|
133
|
+
}
|
|
134
|
+
```
|
|
135
|
+
|
|
136
|
+
Rules:
|
|
137
|
+
|
|
138
|
+
- **Worker status** (what a worker self-reports in its result block): `complete | blocked |
|
|
139
|
+
needs-decision | failed`.
|
|
140
|
+
- **Manifest unit status** (the superset the control session maintains): `pending | dispatched |
|
|
141
|
+
needs-decision | blocked | stalled | complete | failed | dropped`. Worker-reported values are a
|
|
142
|
+
subset of these, so setting a unit's status from a worker block (Step 5) is always valid.
|
|
143
|
+
- **Terminal** states — a unit is closed — are `complete | failed | dropped`. Everything else is
|
|
144
|
+
non-terminal and must be resolved, or explicitly converted to `dropped` with a reason, before the
|
|
145
|
+
mission closes (Step 8).
|
|
146
|
+
- `budget.in_flight` is the number of rows currently `dispatched`. Increment `spawned` and `in_flight`
|
|
147
|
+
on dispatch; decrement `in_flight` when a unit leaves `dispatched`; recompute it from the rows on
|
|
148
|
+
rehydrate.
|
|
149
|
+
- `convergence.rule` is one of `single-pass | loop-until-dry | loop-until-budget |
|
|
150
|
+
accumulate-to-target`. `k_empty`/`empty_streak` are used only by `loop-until-dry`; `target`/`current`
|
|
151
|
+
only by `accumulate-to-target` (`target` = the count or coverage goal, `current` = progress so far).
|
|
152
|
+
- `dropped`/`failed` units MUST carry a reason in `open_followups`.
|
|
153
|
+
|
|
154
|
+
This manifest is what a fresh control session rehydrates from (Step 0).
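The terminal-state and reason rules above lend themselves to a mechanical check. The sketch below is one possible reading, not part of the skill: `closeBlockers` is a hypothetical helper, and the `Followup` shape (`unit_key` plus `reason`) is an assumption, since the manifest example only shows an empty `open_followups` array.

```typescript
// Status vocabulary copied from the manifest rules.
type UnitStatus =
  | "pending" | "dispatched" | "needs-decision" | "blocked"
  | "stalled" | "complete" | "failed" | "dropped";

const TERMINAL: readonly UnitStatus[] = ["complete", "failed", "dropped"];

interface UnitRow {
  unit_key: string;
  status: UnitStatus;
}

// Assumed shape: the source only shows an empty `open_followups` array.
interface Followup {
  unit_key: string;
  reason: string;
}

function isTerminal(status: UnitStatus): boolean {
  return TERMINAL.indexOf(status) !== -1;
}

/** Lists what still blocks closing the mission; an empty list means it can close. */
function closeBlockers(workers: UnitRow[], followups: Followup[]): string[] {
  const blockers: string[] = [];
  for (const row of workers) {
    if (!isTerminal(row.status)) {
      blockers.push(row.unit_key + " is still " + row.status);
      continue;
    }
    const needsReason = row.status === "failed" || row.status === "dropped";
    const hasReason = followups.some(
      (f) => f.unit_key === row.unit_key && f.reason.trim().length > 0,
    );
    if (needsReason && !hasReason) {
      blockers.push(row.unit_key + " is " + row.status + " without a reason in open_followups");
    }
  }
  return blockers;
}
```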
|
|
155
|
+
|
|
156
|
+
## Control workflow
|
|
157
|
+
|
|
158
|
+
### 0. Rehydrate (resume an in-flight mission)
|
|
159
|
+
|
|
160
|
+
On session start, look for an existing control-state manifest for this mission. If one exists:
|
|
161
|
+
|
|
162
|
+
- Load it; treat it as the source of truth.
|
|
163
|
+
- Re-attach to workers by `session_ref` and reconcile each worker's *real* status (read the thread/PR)
|
|
164
|
+
before any new dispatch.
|
|
165
|
+
- Recompute `budget.in_flight` from the rows still marked `dispatched`.
|
|
166
|
+
- Do NOT re-dispatch a unit whose status is `dispatched` or `complete` — route a follow-up instead.
|
|
167
|
+
|
|
168
|
+
If no manifest exists, this is a new mission — create one during Step 1.
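The bookkeeping part of rehydration can be sketched as a pure function over the manifest. This is an illustration under assumptions: `rehydrate` and `doNotRedispatch` are hypothetical names, the interfaces cover only the fields used here, and reconciling each worker's real status by reading its thread or PR is left out because it depends on the control surface.

```typescript
type UnitStatus =
  | "pending" | "dispatched" | "needs-decision" | "blocked"
  | "stalled" | "complete" | "failed" | "dropped";

interface UnitRow {
  unit_key: string;
  status: UnitStatus;
  session_ref?: string;
}

interface Budget {
  max_concurrent_workers: number;
  max_total_workers: number;
  spawned: number;
  in_flight: number;
}

interface Manifest {
  budget: Budget;
  workers: UnitRow[];
}

/**
 * Step 0 sketch: recompute in_flight from the rows and list the units that
 * must not be re-dispatched (they get a follow-up instead).
 */
function rehydrate(manifest: Manifest): { manifest: Manifest; doNotRedispatch: string[] } {
  const dispatched = manifest.workers.filter((row) => row.status === "dispatched");
  const doNotRedispatch = manifest.workers
    .filter((row) => row.status === "dispatched" || row.status === "complete")
    .map((row) => row.unit_key);
  return {
    // The stored counter is ignored; the rows are the source of truth.
    manifest: { ...manifest, budget: { ...manifest.budget, in_flight: dispatched.length } },
    doNotRedispatch,
  };
}
```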
|
|
169
|
+
|
|
170
|
+
### 1. Frame the mission
|
|
171
|
+
|
|
172
|
+
Before spawning anything, capture (and write into the manifest):
|
|
173
|
+
|
|
174
|
+
- Objective and non-goals
|
|
175
|
+
- Repositories, branches, PRs, or issues in scope
|
|
176
|
+
- File or subsystem boundaries for each workstream
|
|
177
|
+
- Success criteria and verification gates
|
|
178
|
+
- Merge/integration expectations
|
|
179
|
+
- Any "do not touch" constraints
|
|
180
|
+
|
|
181
|
+
Also set explicit limits up front (manifest `budget` and `convergence`):
|
|
182
|
+
|
|
183
|
+
- `max_concurrent_workers` (default ~4–6) — never more in flight at once
|
|
184
|
+
- `max_total_workers` — a lifetime backstop for the whole mission (e.g. 25)
|
|
185
|
+
- optional token / cost / time ceiling
|
|
186
|
+
- the convergence rule: `single-pass` for bounded missions; `loop-until-dry`, `loop-until-budget`,
|
|
187
|
+
or `accumulate-to-target` for open-ended audits/migrations/parity sweeps
|
|
188
|
+
|
|
189
|
+
If any boundary is ambiguous and could cause conflicting edits, ask before dispatch.
|
|
190
|
+
|
|
191
|
+
### 2. Detect the control surface
|
|
192
|
+
|
|
193
|
+
Before dispatch, identify the available app tools:
|
|
194
|
+
|
|
195
|
+
- Codex app: thread/session tools such as list, create/read, send-message, rename, pin/archive, plus
|
|
196
|
+
optional local subagent tools.
|
|
197
|
+
- GitHub Copilot app: session or workspace tools exposed by the app connector, plus any available
|
|
198
|
+
GitHub issue/PR/branch controls.
|
|
199
|
+
- Generic agent app: any combination of session, task, subagent, branch, issue, PR, or automation
|
|
200
|
+
tools.
|
|
201
|
+
|
|
202
|
+
If no persistent-session tools are available, downgrade to a local multi-agent plan and explain the
|
|
203
|
+
limitation. Do not invent a backend.
|
|
204
|
+
|
|
205
|
+
### 3. Choose the topology

Pick the smallest useful topology:

- **One worker**: isolated implementation or bug fix that should live in its own project session
- **Parallel workers**: independent modules, packages, endpoints, tests, or docs
- **Research then implementation**: exploratory sessions report findings before coding starts
- **Implementer + verifier**: one session changes code, another reviews or verifies independently
- **Control-only**: no workers yet; just inspect state, list sessions, or plan the dispatch

Prefer separate sessions when workers may edit overlapping history, need different branches, or need
long-running context. Prefer local subagents inside one session when the task is exploratory and does
not need persistent branch state.

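The session-versus-subagent preference above reduces to a short decision function. This is only a sketch: the function name, its boolean inputs, and the `one-worker` default are assumptions layered on the stated preference, not part of the skill.

```python
def choose_execution_mode(overlapping_history: bool, needs_branches: bool,
                          long_running: bool, exploratory: bool) -> str:
    """Apply the stated preference; returns a label, not a tool call."""
    if overlapping_history or needs_branches or long_running:
        return "separate-sessions"
    if exploratory:
        return "local-subagents"
    return "one-worker"  # assumed default: the smallest topology that still does work
```
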
### 4. Dispatch workers with complete prompts

Respect the budget: **never dispatch while `in_flight >= max_concurrent_workers`**; queue the unit
(`status: pending`) and log it instead. On reaching `max_total_workers` or a token/cost ceiling, STOP
dispatching and surface a *Decision needed* rather than spawning more. Dispatch is an **atomic
manifest update**: set the unit's row to `status: dispatched` (with `session_ref`, `worker_id`,
`wave_id`, `last_update`) and increment `spawned` and `in_flight` together; if the dispatch fails to
start, leave the row `pending` and advance neither counter. Decrement `in_flight` when a unit leaves
`dispatched` (it reaches a terminal state, or returns to `needs-decision`/`blocked`/`stalled`) so
queued units can start. This keeps `in_flight` equal to the count of `dispatched` rows that Step 0
recomputes.

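A minimal sketch of the dispatch rule above, assuming the manifest is a plain dict with `budget`, `units`, and a top-level `wave_id`. That shape, the `start_worker` callback, and the returned labels are hypothetical; the skill does not prescribe a storage layout.

```python
from datetime import datetime, timezone

def try_dispatch(manifest: dict, unit_key: str, start_worker) -> str:
    """Dispatch one pending unit, or leave it queued. Returns the outcome label."""
    budget, row = manifest["budget"], manifest["units"][unit_key]
    if budget["spawned"] >= budget["max_total_workers"]:
        return "decision-needed"      # lifetime cap reached: stop and surface it
    if budget["in_flight"] >= budget["max_concurrent_workers"]:
        return "queued"               # row stays `pending`
    try:
        session_ref, worker_id = start_worker(unit_key)   # hypothetical callback
    except Exception:
        return "failed-to-start"      # row stays `pending`, counters untouched
    # The row and both counters change together, only after the worker has started.
    row.update(status="dispatched", session_ref=session_ref, worker_id=worker_id,
               wave_id=manifest["wave_id"],
               last_update=datetime.now(timezone.utc).isoformat())
    budget["spawned"] += 1
    budget["in_flight"] += 1
    return "dispatched"

def leave_dispatched(manifest: dict, unit_key: str, new_status: str) -> None:
    """Move a unit out of `dispatched` and free its concurrency slot."""
    row = manifest["units"][unit_key]
    if row["status"] == "dispatched":
        manifest["budget"]["in_flight"] -= 1
    row["status"] = new_status
```

Because the counters only move alongside a row transition, `in_flight` can always be recomputed by counting `dispatched` rows.
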
Each worker prompt should be self-contained. Include:

- The mission and exact scope (and its `unit_key`)
- Files, subsystems, issue/PR links, and branch expectations
- What the worker may and may not change
- Verification commands or acceptance criteria
- Whether it may create commits, PRs, or only report back
- The required result block

Worker prompt template:

```text
You are worker <name> for <project>.

Mission: <specific outcome>
unit_key / wave_id: <key> / <wave>
Scope: <files/subsystems/issue/PR>
Do not touch: <boundaries>
Approach: <expected plan or constraints>
Verification: <commands/checks/evidence>

You MAY use your own subagents for local research, implementation, and review, but you remain
accountable for this scope and the final report. Do NOT create or steer further persistent project
sessions; if the work needs another full workstream, say so in next_step.

End your report with a fenced ```json control-result block (see the contract). Populate every field;
record subagents you used in subagents_used. The control session reads only that block.
```

When using Codex app controls, prefer to rename and pin important worker/control threads so the
session graph stays legible. When using GitHub Copilot app controls, use the corresponding session or
workspace labels if exposed.

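To keep prompts self-contained, the header of the worker prompt template can be filled mechanically and rejected when a field is empty. The sketch below is an assumption-laden helper: `render_prompt_head`, `REQUIRED`, and the dict keys are hypothetical names, and only the header lines of the template are reproduced.

```python
PROMPT_HEAD = """You are worker {name} for {project}.

Mission: {mission}
unit_key / wave_id: {unit_key} / {wave_id}
Scope: {scope}
Do not touch: {do_not_touch}
Approach: {approach}
Verification: {verification}
"""

REQUIRED = ("name", "project", "mission", "unit_key", "wave_id",
            "scope", "do_not_touch", "approach", "verification")

def render_prompt_head(fields: dict) -> str:
    """Fill the template header, refusing to produce an incomplete prompt."""
    missing = [k for k in REQUIRED if not str(fields.get(k, "")).strip()]
    if missing:
        raise ValueError("incomplete worker prompt, missing: " + ", ".join(missing))
    return PROMPT_HEAD.format(**fields)
```

Failing before dispatch is cheaper than discovering a missing scope or boundary after a worker has started editing.
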
### 5. Track state centrally

The control-state manifest, not the conversation, is the single source of truth; update it every
turn. From each worker's result block, set the unit's `status`, `branch_or_pr`, `last_update`, and
`evidence_ref`. Keep the control session's context focused on summaries and decisions, not full
transcripts; the full report lives at `report_ref`.

Track at least, per unit: `unit_key`, `worker_id`, `session_ref`, scope, status, branch/PR, last
update, blocker, and verification state. Canvas nodes or a SQL/todo table are good backends for the
manifest when the app exposes them.

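One way to picture the per-turn update is a function that copies only summary fields from an accepted result block into the unit's row. This sketch assumes the result block carries `branch_or_pr` and `evidence_ref` under those same names, which the text implies but the contract (not shown here) would have to confirm; `apply_result` itself is hypothetical.

```python
from datetime import datetime, timezone

def apply_result(row: dict, result: dict, report_ref: str) -> dict:
    """Copy summary fields from an accepted control-result block into the unit's row."""
    row["status"] = result["status"]
    row["branch_or_pr"] = result.get("branch_or_pr")
    row["evidence_ref"] = result.get("evidence_ref")
    row["report_ref"] = report_ref   # pointer only; the transcript is not stored in the row
    row["last_update"] = datetime.now(timezone.utc).isoformat()
    return row
```
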
### 6. Route follow-ups (result-gate)

When a worker reports, first run the **result-gate**:

- Parse the `control-result` block. If a required field is missing or malformed, or the status is
  inconsistent with evidence (e.g. `status: complete` with `verification.result != pass`), do NOT
  accept it; send exactly one standardized re-prompt asking only for the corrected block. Cap at 2
  retries, then escalate to the user.
- Accept completed work only when the block validates AND meets the success criteria.

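The result-gate above might look roughly like the following. It is a sketch, not the contract: `REQUIRED_FIELDS` is an assumed subset of the real field list, the fence is matched as a literal `json control-result` info string, and the action labels are hypothetical.

```python
import json
import re

REQUIRED_FIELDS = ("unit_key", "status", "verification")  # assumed subset of the contract
MAX_RETRIES = 2
FENCE = "`" * 3  # built indirectly so this example can sit inside a fenced block

def find_problem(report: str) -> str:
    """Return a description of the first problem, or "" if the block validates."""
    match = re.search(FENCE + r"json control-result\n(.*?)\n" + FENCE, report, re.DOTALL)
    if match is None:
        return "no control-result block found"
    try:
        block = json.loads(match.group(1))
    except json.JSONDecodeError:
        return "block is not valid JSON"
    if not isinstance(block, dict):
        return "block is not a JSON object"
    missing = [name for name in REQUIRED_FIELDS if name not in block]
    if missing:
        return "missing fields: " + ", ".join(missing)
    verification = block["verification"]
    passed = isinstance(verification, dict) and verification.get("result") == "pass"
    if block["status"] == "complete" and not passed:
        return "status is complete but verification.result is not pass"
    return ""

def gate_result(report: str, retries_so_far: int) -> str:
    """Decide the next action: accept, re-prompt, or escalate."""
    if not find_problem(report):
        return "accept"
    return "re-prompt" if retries_so_far < MAX_RETRIES else "escalate"
```

Here `accept` only means the block validates; the mission's success criteria still have to be checked before the work is accepted.
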
Then route:

- Send targeted follow-ups for missing verification, scope drift, or blockers.
- Avoid duplicating a worker's investigation unless its result is incomplete or suspect (check the
  unit ledger first).
- If two or more workers conflict, pause integration and resolve ownership before more edits happen.

### 7. Iterate waves to convergence

For multi-wave missions, after routing a wave's follow-ups, apply the declared `convergence.rule`
before consolidating:

- **single-pass**: one wave; skip to consolidate.
- **loop-until-dry**: keep opening units until `k_empty` consecutive waves produce zero *new*
  (deduped) units; maintain `empty_streak` in the manifest.
- **loop-until-budget**: stop when a budget cap is hit.
- **accumulate-to-target**: stop when the target count/coverage is reached.

"New" and "dry" are measured against the manifest's set of `unit_key`s, not memory. Never stop
silently; write why iteration ended (`open_followups` / `decisions`).

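The four convergence rules above can be sketched as a single check run after each wave. The dict layout, the `target` field, and the use of `max_total_workers` as the only budget cap are assumptions made for illustration; a real manifest may also track token, cost, or time ceilings.

```python
def should_continue(manifest: dict, new_unit_keys: set) -> tuple:
    """Apply convergence.rule after a wave; returns (continue?, reason)."""
    conv, budget = manifest["convergence"], manifest["budget"]
    # "New" is judged against the manifest's unit_keys, not against memory.
    fresh = set(new_unit_keys) - set(manifest["units"])
    rule = conv["rule"]
    if rule == "single-pass":
        return False, "single-pass: one wave only"
    if rule == "loop-until-dry":
        conv["empty_streak"] = 0 if fresh else conv["empty_streak"] + 1
        if conv["empty_streak"] >= conv["k_empty"]:
            return False, f"dry: {conv['empty_streak']} consecutive waves with no new units"
    if rule == "accumulate-to-target":
        done = sum(1 for row in manifest["units"].values() if row["status"] == "complete")
        if done >= conv["target"]:      # `target` is a hypothetical field name
            return False, f"target reached: {done}/{conv['target']}"
    if budget["spawned"] >= budget["max_total_workers"]:
        return False, "budget cap reached"
    return True, f"{len(fresh)} new unit(s) to open"
```

The returned reason is what would be written to `decisions` or `open_followups`, so iteration never ends without a recorded cause.
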
### 8. Verify and consolidate

Before declaring the mission done:

- Run or delegate the agreed verification gate.
- Review diffs or ask an independent reviewer session for high-signal findings.
- Ensure worker outputs are integrated in the right branch/session.

**Wave-join / completeness gate:** the mission is complete only when **every** manifest worker row is
in a **terminal** state: `complete`, `failed`, or `dropped`. Non-terminal rows (`pending`,
`dispatched`, `needs-decision`, `blocked`, `stalled`) must first be resolved; a unit that cannot be
resolved (e.g. a worker that never reported by its checkpoint and was marked `stalled`) must be
explicitly converted to `dropped` with a reason. Only then may the mission be declared *"complete
with N dropped: <ids + reasons>"*. Never close with a non-terminal row, and never drop silently.
Enumerate every dispatched unit in the final summary.

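The completeness gate above is mechanical enough to sketch directly. The status names come from the text; the `reason` field and the function name are hypothetical.

```python
TERMINAL = {"complete", "failed", "dropped"}

def completeness_gate(units: dict) -> tuple[bool, str]:
    """Return (may_close, summary) for the manifest's worker rows."""
    open_rows = sorted(k for k, row in units.items() if row["status"] not in TERMINAL)
    if open_rows:
        return False, "non-terminal units: " + ", ".join(open_rows)
    dropped = {k: row for k, row in units.items() if row["status"] == "dropped"}
    # A drop without a recorded reason counts as a silent drop and blocks closing.
    unexplained = sorted(k for k, row in dropped.items() if not row.get("reason"))
    if unexplained:
        return False, "dropped without a reason: " + ", ".join(unexplained)
    if not dropped:
        return True, "complete"
    detail = "; ".join(f"{k} ({row['reason']})" for k, row in sorted(dropped.items()))
    return True, f"complete with {len(dropped)} dropped: {detail}"
```
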
**Pull cadence (no push signal):** a session-native control plane has no "worker done" event to wake
it. After dispatching a wave, define the next checkpoint trigger (a follow-up turn, a status-table
poll, or a user ping) and never leave a wave un-joined.

For PR-bound work, keep the control session responsible for final PR readiness and review routing.

## Safety rules

- Do not spawn workers for trivial tasks.
- Do not let multiple workers edit the same files unless explicitly coordinated.
- Do not assume a named app connector exists; discover it and fall back honestly.
- Do not silently create branches, commits, pushes, or PRs; follow the user's consent and repo rules.
- Do not ask workers to share secrets or sensitive data across sessions.
- Worker subagents are leaf helpers: they MUST NOT create or steer further persistent sessions. The
  hierarchy is exactly two levels (control -> worker -> subagents); a worker that needs another full
  workstream reports that need to the control session.
- Enforce the concurrency and total-fan-out caps; never exceed them silently. Dropped, skipped, or
  failed units MUST be recorded with a reason (no silent truncation).
- If using an in-place checkout, be extra careful: other user-owned changes may already exist.
- If the plan changes materially, update the user and the workers before continuing.

## Recommended reporting format

Use a compact control-plane update (rows derived from the manifest):

```markdown
**Status:** <on track | blocked | needs decision | complete>
**Budget:** in-flight <X/Y> · spawned <A/B> · wave <N> (empty-streak <E>)

| Workstream | Session | Scope | State | Evidence |
|---|---|---|---|---|
| <name> | <id/name> | <scope> | <state> | <test/report/PR> |

**Decision needed:** <only if blocked>
```

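Since the rows are derived from the manifest, the update can be rendered rather than hand-written. The sketch below assumes the same hypothetical dict layout as a plain-dict manifest (`budget`, `convergence`, `units`, `wave_id`) and uses `unit_key` as the workstream name; none of that is mandated by the format itself.

```python
def render_update(status: str, manifest: dict, decision: str = "") -> str:
    """Render the compact control-plane update from manifest data."""
    b, c = manifest["budget"], manifest["convergence"]
    lines = [
        f"**Status:** {status}",
        f"**Budget:** in-flight {b['in_flight']}/{b['max_concurrent_workers']} · "
        f"spawned {b['spawned']}/{b['max_total_workers']} · "
        f"wave {manifest['wave_id']} (empty-streak {c['empty_streak']})",
        "",
        "| Workstream | Session | Scope | State | Evidence |",
        "|---|---|---|---|---|",
    ]
    for key, row in manifest["units"].items():
        lines.append(
            f"| {key} | {row.get('session_ref', '-')} | {row.get('scope', '-')} "
            f"| {row['status']} | {row.get('evidence_ref', '-')} |"
        )
    if decision:  # only emitted when the control session is actually blocked
        lines += ["", f"**Decision needed:** {decision}"]
    return "\n".join(lines)
```
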
Keep user-facing updates concise. The control session should make coordination legible, not flood the
user with every worker's transcript.