@femtomc/mu-agent 26.2.104 → 26.2.106
- package/README.md +5 -1
- package/assets/mu-tui-logo.png +0 -0
- package/package.json +2 -2
- package/prompts/skills/code-mode/SKILL.md +134 -0
- package/prompts/skills/control-flow/SKILL.md +164 -0
- package/prompts/skills/crons/SKILL.md +3 -1
- package/prompts/skills/heartbeats/SKILL.md +3 -1
- package/prompts/skills/hud/SKILL.md +37 -3
- package/prompts/skills/model-routing/SKILL.md +336 -0
- package/prompts/skills/mu/SKILL.md +11 -3
- package/prompts/skills/{hierarchical-work-protocol → orchestration}/SKILL.md +25 -3
- package/prompts/skills/planning/SKILL.md +20 -2
- package/prompts/skills/setup-discord/SKILL.md +7 -0
- package/prompts/skills/setup-neovim/SKILL.md +7 -0
- package/prompts/skills/subagents/SKILL.md +53 -10
- package/prompts/skills/tmux/SKILL.md +149 -0
package/README.md
CHANGED
@@ -28,7 +28,11 @@ into `~/.mu/skills/` (or `$MU_HOME/skills/`) by the CLI store-initialization pat
 - `memory`
 - `planning`
 - `hud`
-- `
+- `orchestration`
+- `control-flow`
+- `model-routing`
+- `code-mode`
+- `tmux`
 - `subagents`
 - `heartbeats`
 - `crons`
package/assets/mu-tui-logo.png
CHANGED
Binary file
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "@femtomc/mu-agent",
-"version": "26.2.
+"version": "26.2.106",
 "description": "Shared operator runtime for mu assistant sessions and serve extensions.",
 "keywords": [
 "mu",
@@ -25,7 +25,7 @@
 "themes/**"
 ],
 "dependencies": {
-"@femtomc/mu-core": "26.2.
+"@femtomc/mu-core": "26.2.106",
 "@mariozechner/pi-agent-core": "^0.54.2",
 "@mariozechner/pi-ai": "^0.54.2",
 "@mariozechner/pi-coding-agent": "^0.54.2",
@@ -0,0 +1,134 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: code-mode
|
|
3
|
+
description: "Runs lightweight tmux-backed REPL workflows so agents can execute code and engineer context without bloating prompt history."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# code-mode
|
|
7
|
+
|
|
8
|
+
Use this skill when a task is better solved by iterative code execution in a live
|
|
9
|
+
REPL than by stuffing intermediate data into chat context.
|
|
10
|
+
|
|
11
|
+
The core idea is intentionally simple: `tmux` provides a persistent runtime shell,
|
|
12
|
+
and the agent drives it with `bash`.
|
|
13
|
+
|
|
14
|
+
## Contents
|
|
15
|
+
|
|
16
|
+
- [Core contract](#core-contract)
|
|
17
|
+
- [tmux skill dependency](#tmux-skill-dependency)
|
|
18
|
+
- [Minimal tmux execution loop](#minimal-tmux-execution-loop)
|
|
19
|
+
- [Context engineering contract](#context-engineering-contract)
|
|
20
|
+
- [Language-specific quick starts](#language-specific-quick-starts)
|
|
21
|
+
- [Integration with other mu skills](#integration-with-other-mu-skills)
|
|
22
|
+
- [Evaluation scenarios](#evaluation-scenarios)
|
|
23
|
+
|
|
24
|
+
## Core contract
|
|
25
|
+
|
|
26
|
+
1. **Use persistent runtime state, not prompt state**
|
|
27
|
+
- Keep working data in REPL variables, files, and process memory.
|
|
28
|
+
- Only return compact summaries/artifacts to chat.
|
|
29
|
+
|
|
30
|
+
2. **One session per task scope**
|
|
31
|
+
- Reuse the same tmux session while solving one problem.
|
|
32
|
+
- Use distinct session names for unrelated tasks to avoid state bleed.
|
|
33
|
+
|
|
34
|
+
3. **Bounded command passes**
|
|
35
|
+
- Send one coherent code block per pass.
|
|
36
|
+
- Capture output, summarize, decide next pass.
|
|
37
|
+
|
|
38
|
+
4. **On-demand discovery**
|
|
39
|
+
- Ask runtime for definitions/help only when needed (`help(...)`, `dir(...)`,
|
|
40
|
+
`.help`, etc.) instead of loading everything up front.
|
|
41
|
+
|
|
42
|
+
5. **No extra harness required**
|
|
43
|
+
- Trusted local workflows can stay minimal: `tmux` + `bash` + REPL.
|
|
44
|
+
|
|
45
|
+
## tmux skill dependency
|
|
46
|
+
|
|
47
|
+
Before mutating tmux session state, load **`tmux`** and follow its canonical
|
|
48
|
+
session lifecycle and bounded pass protocol.
|
|
49
|
+
|
|
50
|
+
- Treat `tmux` as source-of-truth for create/reuse, capture, fan-out, and teardown.
|
|
51
|
+
- This `code-mode` skill defines REPL/context-engineering behavior only.
|
|
52
|
+
|
|
53
|
+
## Minimal tmux execution loop
|
|
54
|
+
|
|
55
|
+
Follow the `Bounded execution protocol` in the `tmux` skill to create
|
|
56
|
+
sessions, run commands with a completion marker, and capture output.
|
|
57
|
+
|
|
58
|
+
Example session creation for a REPL:
|
|
59
|
+
|
|
60
|
+
```bash
|
|
61
|
+
session="mu-code-py"
|
|
62
|
+
tmux has-session -t "$session" 2>/dev/null || tmux new-session -d -s "$session" "python3 -q"
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
Then send your REPL commands and the marker, according to the `tmux` skill.
|
|
66
|
+
Tear down the session when finished.
|
|
67
|
+
|
|
68
|
+
## Context engineering contract
|
|
69
|
+
|
|
70
|
+
Use the runtime to compress context before speaking:
|
|
71
|
+
|
|
72
|
+
1. Load raw data into files/variables.
|
|
73
|
+
2. Execute code that filters, slices, or aggregates.
|
|
74
|
+
3. Persist useful artifacts (`summary.json`, `notes.md`, `results.csv`).
|
|
75
|
+
4. Report only:
|
|
76
|
+
- key findings,
|
|
77
|
+
- confidence/limits,
|
|
78
|
+
- artifact paths and next action.
|
|
79
|
+
|
|
80
|
+
Practical rules:
|
|
81
|
+
|
|
82
|
+
- Prefer computed summaries over pasted raw logs.
|
|
83
|
+
- Keep long transcripts in files; cite paths.
|
|
84
|
+
- Recompute when uncertain instead of guessing from stale text.
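
The four steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `summarize_log`, the `ERROR` line convention, and the file names used in the usage note are hypothetical and are not part of the skill itself.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_log(log_path: Path, artifact_path: Path, top_n: int = 3) -> dict:
    """Aggregate a raw log into a compact summary and persist it as an artifact."""
    # Step 1: load raw data into runtime memory (it never enters chat context).
    lines = log_path.read_text(encoding="utf-8").splitlines()

    # Step 2: filter and aggregate. Here: count ERROR lines grouped by message.
    errors = Counter(
        line.split("ERROR", 1)[1].strip() for line in lines if "ERROR" in line
    )
    summary = {
        "total_lines": len(lines),
        "error_lines": sum(errors.values()),
        "top_errors": errors.most_common(top_n),
    }

    # Step 3: persist the useful artifact to disk.
    artifact_path.write_text(json.dumps(summary, indent=2), encoding="utf-8")

    # Step 4: hand back only the compact findings plus the artifact path.
    return {"artifact": str(artifact_path), **summary}
```

Inside a REPL session, a call such as `summarize_log(Path("app.log"), Path("summary.json"))` would leave the full log on disk and return only the counts and the artifact path for reporting.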
|
|
85
|
+
|
|
86
|
+
## Language-specific quick starts
|
|
87
|
+
|
|
88
|
+
Python:
|
|
89
|
+
|
|
90
|
+
```bash
|
|
91
|
+
tmux new-session -d -s mu-code-py "python3 -q"
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
Node:
|
|
95
|
+
|
|
96
|
+
```bash
|
|
97
|
+
tmux new-session -d -s mu-code-node "node"
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
SQLite:
|
|
101
|
+
|
|
102
|
+
```bash
|
|
103
|
+
tmux new-session -d -s mu-code-sql "sqlite3 data.db"
|
|
104
|
+
```
|
|
105
|
+
|
|
106
|
+
Shell-only REPL (for pipelines/tools):
|
|
107
|
+
|
|
108
|
+
```bash
|
|
109
|
+
tmux new-session -d -s mu-code-sh "bash --noprofile --norc -i"
|
|
110
|
+
```
|
|
111
|
+
|
|
112
|
+
## Integration with other mu skills
|
|
113
|
+
|
|
114
|
+
- Use with `planning` when a plan step needs exploratory coding.
|
|
115
|
+
- Use with `orchestration`/`subagents` by assigning one tmux session per worker.
|
|
116
|
+
- Use with `control-flow` for explicit retry/termination policy around code passes.
|
|
117
|
+
- Use with `heartbeats`/`crons` when bounded code passes should run on schedule.
|
|
118
|
+
|
|
119
|
+
## Evaluation scenarios
|
|
120
|
+
|
|
121
|
+
1. **Exploratory data pass**
|
|
122
|
+
- Setup: large raw text/log corpus.
|
|
123
|
+
- Expected: agent uses REPL transforms to produce concise findings + artifact path,
|
|
124
|
+
without dumping full corpus into chat.
|
|
125
|
+
|
|
126
|
+
2. **Multi-pass debugging**
|
|
127
|
+
- Setup: bug reproduction requires iterative commands.
|
|
128
|
+
- Expected: same tmux session is reused across passes; state continuity reduces
|
|
129
|
+
repeated setup and prompt churn.
|
|
130
|
+
|
|
131
|
+
3. **Language swap with same control pattern**
|
|
132
|
+
- Setup: compare Python and Node approaches.
|
|
133
|
+
- Expected: same tmux send/capture loop works across both REPLs with only session
|
|
134
|
+
bootstrap command changed.
|
|
@@ -0,0 +1,164 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: control-flow
|
|
3
|
+
description: "Defines compositional control-flow policies for orchestration DAGs (for example review-gated retry loops) using protocol-preserving transitions."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# control-flow
|
|
7
|
+
|
|
8
|
+
Use this skill when work needs explicit loop/termination policy on top of the
|
|
9
|
+
shared orchestration protocol.
|
|
10
|
+
|
|
11
|
+
## Contents
|
|
12
|
+
|
|
13
|
+
- [Purpose](#purpose)
|
|
14
|
+
- [Required dependencies](#required-dependencies)
|
|
15
|
+
- [Core contract](#core-contract)
|
|
16
|
+
- [Review-gated policy (`flow:review-gated-v1`)](#review-gated-policy-flowreview-gated-v1)
|
|
17
|
+
- [Transition table](#transition-table)
|
|
18
|
+
- [Planning handoff contract](#planning-handoff-contract)
|
|
19
|
+
- [Subagents/heartbeat execution contract](#subagentsheartbeat-execution-contract)
|
|
20
|
+
- [HUD visibility and teardown](#hud-visibility-and-teardown)
|
|
21
|
+
- [Evaluation scenarios](#evaluation-scenarios)
|
|
22
|
+
|
|
23
|
+
## Purpose
|
|
24
|
+
|
|
25
|
+
Control-flow policies are overlays. They do not replace orchestration protocol
|
|
26
|
+
semantics; they guide which protocol primitive to apply next.
|
|
27
|
+
|
|
28
|
+
Examples:
|
|
29
|
+
- review-gated retries
|
|
30
|
+
- bounded retry + human escalation
|
|
31
|
+
- checkpoint/approval gates
|
|
32
|
+
|
|
33
|
+
## Required dependencies
|
|
34
|
+
|
|
35
|
+
Load these skills before applying control-flow policies:
|
|
36
|
+
|
|
37
|
+
- `orchestration` (protocol primitives/invariants)
|
|
38
|
+
- `subagents` (durable execution runtime)
|
|
39
|
+
- `heartbeats` and/or `crons` (scheduler clock)
|
|
40
|
+
- `hud` (required visibility/handoff surface)
|
|
41
|
+
|
|
42
|
+
## Core contract
|
|
43
|
+
|
|
44
|
+
1. **Overlay, don’t fork protocol**
|
|
45
|
+
- Keep `hierarchical-work.protocol/v1` + `proto:hierarchical-work-v1`.
|
|
46
|
+
- Do not invent new protocol IDs for policy variants.
|
|
47
|
+
|
|
48
|
+
2. **Policy metadata lives in `flow:*`**
|
|
49
|
+
- Keep policy tags/metadata orthogonal to `kind:*` and `ctx:*`.
|
|
50
|
+
|
|
51
|
+
3. **Transitions compile to protocol primitives**
|
|
52
|
+
- Use only `spawn|fork|expand|ask|complete|serial` plus normal issue lifecycle
|
|
53
|
+
commands (`claim/open/close/dep`).
|
|
54
|
+
|
|
55
|
+
4. **Bounded pass per tick**
|
|
56
|
+
- One control-flow transition decision and one bounded mutation bundle per
|
|
57
|
+
heartbeat pass; verify then exit.
|
|
58
|
+
|
|
59
|
+
## Review-gated policy (`flow:review-gated-v1`)
|
|
60
|
+
|
|
61
|
+
### Tag vocabulary
|
|
62
|
+
|
|
63
|
+
- `flow:review-gated-v1` — subtree uses review-gated policy
|
|
64
|
+
- `flow:attempt` — implementation attempt node
|
|
65
|
+
- `flow:review` — review gate node
|
|
66
|
+
|
|
67
|
+
Optional metadata in issue body/forum packet:
|
|
68
|
+
- `max_review_rounds=<N>` (default recommended: 3)
|
|
69
|
+
|
|
70
|
+
### Required shape per round
|
|
71
|
+
|
|
72
|
+
For round `k` under policy scope:
|
|
73
|
+
|
|
74
|
+
- `attempt_k` (executable; usually `kind:spawn` or `kind:fork`)
|
|
75
|
+
- `review_k` (executable; usually `kind:fork`, `ctx:inherit`)
|
|
76
|
+
- edge: `attempt_k blocks review_k`
|
|
77
|
+
|
|
78
|
+
### Critical invariant
|
|
79
|
+
|
|
80
|
+
When review fails, **do not leave the review node closed as `needs_work`**.
|
|
81
|
+
That keeps the DAG non-final forever.
|
|
82
|
+
|
|
83
|
+
Instead:
|
|
84
|
+
1. record verdict in forum (`VERDICT: needs_work`)
|
|
85
|
+
2. spawn `attempt_{k+1}` + `review_{k+1}`
|
|
86
|
+
3. add `attempt_{k+1} blocks review_{k+1}`
|
|
87
|
+
4. close `review_k` with `outcome=expanded`
|
|
88
|
+
|
|
89
|
+
This preserves full audit history while keeping finalization reachable.
|
|
90
|
+
|
|
91
|
+
## Transition table
|
|
92
|
+
|
|
93
|
+
Given current round `(attempt_k, review_k)`:
|
|
94
|
+
|
|
95
|
+
1. **attempt not finished**
|
|
96
|
+
- action: continue attempt execution (normal worker loop)
|
|
97
|
+
|
|
98
|
+
2. **attempt finished, review pending**
|
|
99
|
+
- action: run review_k
|
|
100
|
+
|
|
101
|
+
3. **review verdict = pass**
|
|
102
|
+
- action: `complete(review_k)` with `success`
|
|
103
|
+
- if subtree validates final, disable supervising heartbeat
|
|
104
|
+
|
|
105
|
+
4. **review verdict = needs_work, rounds < max**
|
|
106
|
+
- action: apply fail->expand transition (spawn next round + close review_k as `expanded`)
|
|
107
|
+
|
|
108
|
+
5. **review verdict = needs_work, rounds >= max**
|
|
109
|
+
- action: create `kind:ask` escalation node (`ctx:human`, `actor:user`)
|
|
110
|
+
- downstream work blocks on that ask node
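
As a rough sketch, the transition table above can be read as a pure decision function. The `RoundState` type and the returned action labels are hypothetical names chosen for illustration, and the sketch assumes that "rounds" means the current round index `k`; it only selects a transition and does not issue any protocol commands.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoundState:
    """Hypothetical snapshot of the current (attempt_k, review_k) pair."""

    round_index: int               # k, starting at 1 (assumed meaning of "rounds")
    attempt_finished: bool
    review_verdict: Optional[str]  # None while pending, else "pass" or "needs_work"


def next_transition(state: RoundState, max_review_rounds: int = 3) -> str:
    """Pick exactly one transition for the current bounded pass."""
    if not state.attempt_finished:
        return "continue_attempt"           # row 1
    if state.review_verdict is None:
        return "run_review"                 # row 2
    if state.review_verdict == "pass":
        return "complete_review_success"    # row 3
    if state.review_verdict == "needs_work":
        if state.round_index < max_review_rounds:
            return "expand_next_round"      # row 4: fail->expand
        return "escalate_ask"               # row 5: human escalation
    raise ValueError(f"unknown review verdict: {state.review_verdict!r}")
```

Keeping the decision separate from the mutation makes it easier to check that each heartbeat pass applies one transition and then exits.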
|
|
111
|
+
|
|
112
|
+
## Planning handoff contract
|
|
113
|
+
|
|
114
|
+
When planning a review-gated subtree:
|
|
115
|
+
|
|
116
|
+
1. Tag policy scope root (or selected goal node) with `flow:review-gated-v1`.
|
|
117
|
+
2. Create round-1 pair (`flow:attempt`, `flow:review`) + dependency edge.
|
|
118
|
+
3. Encode acceptance criteria for attempt + review explicitly.
|
|
119
|
+
4. Record max rounds policy in body/forum packet.
|
|
120
|
+
|
|
121
|
+
## Subagents/heartbeat execution contract
|
|
122
|
+
|
|
123
|
+
Per orchestrator tick:
|
|
124
|
+
|
|
125
|
+
1. `read_tree` + ready-set + round-state inspection.
|
|
126
|
+
2. Select one transition from the table above.
|
|
127
|
+
3. Apply one bounded transition bundle.
|
|
128
|
+
4. Verify with:
|
|
129
|
+
- `mu issues ready --root <root-id> --tag proto:hierarchical-work-v1 --pretty`
|
|
130
|
+
- `mu issues validate <root-id>`
|
|
131
|
+
5. Post one concise ORCH_PASS update.
|
|
132
|
+
6. If final: disable heartbeat program.
|
|
133
|
+
|
|
134
|
+
Reusable bounded heartbeat prompt fragment:
|
|
135
|
+
|
|
136
|
+
```text
|
|
137
|
+
Use skills orchestration, control-flow, subagents, and hud.
|
|
138
|
+
For root <root-id>, enforce flow:review-gated-v1 with spawn-per-attempt rounds.
|
|
139
|
+
Run exactly one bounded control-flow transition pass, verify DAG state,
|
|
140
|
+
post one ORCH_PASS, and stop. If validate is final, disable the supervising
|
|
141
|
+
heartbeat and report completion.
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
## HUD visibility and teardown
|
|
145
|
+
|
|
146
|
+
HUD usage is not optional for active control-flow execution.
|
|
147
|
+
|
|
148
|
+
- If subagents HUD is already active, publish control-flow state in that HUD doc
|
|
149
|
+
(for example policy mode, round counters, escalation state).
|
|
150
|
+
- If running control-flow standalone, own a dedicated `hud_id:"control-flow"` doc.
|
|
151
|
+
- Update HUD state each bounded pass before reporting ORCH_PASS output.
|
|
152
|
+
|
|
153
|
+
- Follow the HUD ownership and teardown protocol from the `hud` skill when completing or handing off.
|
|
154
|
+
|
|
155
|
+
## Evaluation scenarios
|
|
156
|
+
|
|
157
|
+
1. **Single-pass review success**
|
|
158
|
+
- attempt_1 succeeds, review_1 succeeds, subtree validates final, heartbeat disables.
|
|
159
|
+
|
|
160
|
+
2. **One failed review then success**
|
|
161
|
+
- review_1 fails -> expand to round 2; review_2 succeeds -> final.
|
|
162
|
+
|
|
163
|
+
3. **Max-round escalation**
|
|
164
|
+
- repeated failed reviews hit `max_review_rounds`; ask node created and execution blocks awaiting human input.
|
|
@@ -152,7 +152,9 @@ post concise summary, then exit.

 For DAG execution workloads, combine with:
 - `planning`
-- `
+- `orchestration`
+- `control-flow` (when explicit loop/termination policy is required)
+- `model-routing` (when per-issue model/provider/thinking policy is required)
 - `subagents`
 - `heartbeats` (for short-cadence wake loops)

@@ -145,7 +145,9 @@ packet mechanics, raw issue-ID lists). Include them only when diagnosing a block

 For hierarchical DAG execution, pair this skill with:
 - `planning`
-- `
+- `orchestration`
+- `control-flow` (when explicit loop/termination policy is required)
+- `model-routing` (when per-issue model/provider/thinking policy is required)
 - `subagents`

 For wall-clock scheduling semantics (`at`, `every`, `cron`), use `crons`.
@@ -18,7 +18,8 @@ This skill is the canonical HUD reference for:
 - [Core contract](#core-contract)
 - [HudDoc shape](#huddoc-shape)
 - [Recommended turn loop](#recommended-turn-loop)
-- [
+- [Ownership and teardown protocol](#ownership-and-teardown-protocol)
+- [Planning, subagents, and model-routing profiles](#planning-subagents-and-model-routing-profiles)
 - [Determinism and rendering limits](#determinism-and-rendering-limits)
 - [Evaluation scenarios](#evaluation-scenarios)

@@ -135,12 +136,45 @@ Example checklist doc:
|
|
|
135
136
|
|
|
136
137
|
4. Keep response text and HUD state aligned (no contradictions).
|
|
137
138
|
|
|
138
|
-
##
|
|
139
|
+
## Ownership and teardown protocol
|
|
140
|
+
|
|
141
|
+
HUD-using skills should treat HUD state as owned, explicit, and non-optional when
|
|
142
|
+
that skill declares HUD-required behavior.
|
|
143
|
+
|
|
144
|
+
1. **Own explicit `hud_id` values**
|
|
145
|
+
- Each active skill owns one canonical doc id (for example `planning`,
|
|
146
|
+
`subagents`, `control-flow`, `model-routing`).
|
|
147
|
+
- Prefer `remove <hud_id>` over `clear` to avoid deleting other skills’ docs.
|
|
148
|
+
|
|
149
|
+
2. **Teardown is mandatory at skill end**
|
|
150
|
+
- When a HUD-owning skill completes, remove its doc(s).
|
|
151
|
+
- If no other HUD-owning skill is active next, turn HUD off.
|
|
152
|
+
|
|
153
|
+
3. **Teardown during HUD-to-HUD handoff**
|
|
154
|
+
- Remove current skill doc(s), keep HUD enabled, then let next skill set its doc.
|
|
155
|
+
- Never leave stale docs from prior skills during handoff.
|
|
156
|
+
|
|
157
|
+
Example teardown/handoff calls:
|
|
158
|
+
|
|
159
|
+
```json
|
|
160
|
+
{"action":"remove","hud_id":"planning"}
|
|
161
|
+
{"action":"set","doc":{"v":1,"hud_id":"subagents","title":"Subagents HUD","snapshot_compact":"HUD(subagents)","updated_at_ms":1771853115001}}
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
Example full teardown (no next HUD skill):
|
|
165
|
+
|
|
166
|
+
```json
|
|
167
|
+
{"action":"remove","hud_id":"subagents"}
|
|
168
|
+
{"action":"off"}
|
|
169
|
+
```
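
The teardown and handoff rules above can also be expressed as a small helper that builds the action list. This is a sketch only: `teardown_actions` is a hypothetical name, and it merely assembles the `remove`/`off` payloads shown in the examples rather than sending them anywhere.

```python
from typing import Dict, List, Sequence


def teardown_actions(
    owned_hud_ids: Sequence[str], next_hud_skill_active: bool
) -> List[Dict[str, str]]:
    """Build the HUD actions a skill emits when it completes or hands off."""
    # Remove only the docs this skill owns; never `clear` other skills' docs.
    actions = [{"action": "remove", "hud_id": hud_id} for hud_id in owned_hud_ids]
    # Turn the HUD off only when no other HUD-owning skill takes over next.
    if not next_hud_skill_active:
        actions.append({"action": "off"})
    return actions
```

During a HUD-to-HUD handoff the helper returns only the `remove` actions, leaving the HUD enabled so the next skill can set its own doc.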
|
|
170
|
+
|
|
171
|
+
## Planning, subagents, and model-routing profiles
|
|
139
172
|
|
|
140
173
|
Use profile-specific `hud_id` values:
|
|
141
174
|
|
|
142
175
|
- planning profile: `hud_id: "planning"`
|
|
143
176
|
- subagents profile: `hud_id: "subagents"`
|
|
177
|
+
- model-routing profile: `hud_id: "model-routing"`
|
|
144
178
|
|
|
145
179
|
Treat these as conventions layered on top of this generic contract.
|
|
146
180
|
|
|
@@ -168,4 +202,4 @@ If behavior is unclear, inspect implementation/tests before guessing:
 - Expected: `subagents` doc updates queue/activity/chips after each bounded pass.

 3. **HUD reset handoff**
-- Expected: after phase completion, HUD is
+- Expected: after phase completion, owned HUD docs are removed; HUD is turned off when no next HUD skill is active, or ownership cleanly hands off to the next skill with no stale docs.
@@ -0,0 +1,336 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: model-routing
|
|
3
|
+
description: "Adds a model-selection overlay for issue DAG execution, recommending provider/model/thinking per issue from live harness capabilities."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# model-routing
|
|
7
|
+
|
|
8
|
+
Use this skill when execution should choose different models for different issue
|
|
9
|
+
kinds (for example code vs docs), while preserving orchestration protocol
|
|
10
|
+
semantics.
|
|
11
|
+
|
|
12
|
+
## Contents
|
|
13
|
+
|
|
14
|
+
- [Purpose](#purpose)
|
|
15
|
+
- [Required dependencies](#required-dependencies)
|
|
16
|
+
- [Core contract](#core-contract)
|
|
17
|
+
- [Overlay identity (`route:model-routing-v1`)](#overlay-identity-routemodel-routing-v1)
|
|
18
|
+
- [Tag vocabulary](#tag-vocabulary)
|
|
19
|
+
- [Recommendation packet contract](#recommendation-packet-contract)
|
|
20
|
+
- [Selection algorithm (deterministic)](#selection-algorithm-deterministic)
|
|
21
|
+
- [Transition table](#transition-table)
|
|
22
|
+
- [Planning handoff contract](#planning-handoff-contract)
|
|
23
|
+
- [Subagents/heartbeat execution contract](#subagentsheartbeat-execution-contract)
|
|
24
|
+
- [Failure + fallback policy](#failure--fallback-policy)
|
|
25
|
+
- [HUD visibility and teardown](#hud-visibility-and-teardown)
|
|
26
|
+
- [Evaluation scenarios](#evaluation-scenarios)
|
|
27
|
+
|
|
28
|
+
## Purpose
|
|
29
|
+
|
|
30
|
+
Model-routing policies are overlays. They do not replace `orchestration`
|
|
31
|
+
protocol semantics.
|
|
32
|
+
|
|
33
|
+
Examples:
|
|
34
|
+
- use a strong coding model for implementation leaves
|
|
35
|
+
- use a stronger writing model for docs/synthesis leaves
|
|
36
|
+
- choose lower-cost fast models for routine triage
|
|
37
|
+
- escalate to deeper thinking for high-risk or complex nodes
|
|
38
|
+
|
|
39
|
+
## Required dependencies
|
|
40
|
+
|
|
41
|
+
Load these skills before applying model-routing policies:
|
|
42
|
+
|
|
43
|
+
- `orchestration` (protocol primitives/invariants)
|
|
44
|
+
- `subagents` (durable execution runtime)
|
|
45
|
+
- `heartbeats` and/or `crons` (scheduler clock)
|
|
46
|
+
- `hud` (required visibility/handoff surface)
|
|
47
|
+
- `control-flow` (optional; when loop/termination overlays are also active)
|
|
48
|
+
|
|
49
|
+
## Core contract
|
|
50
|
+
|
|
51
|
+
1. **Overlay, don’t fork protocol**
|
|
52
|
+
- Keep `hierarchical-work.protocol/v1` + `proto:hierarchical-work-v1`.
|
|
53
|
+
- Do not redefine `kind:*`, `ctx:*`, issue lifecycle semantics, or DAG validity.
|
|
54
|
+
|
|
55
|
+
2. **Harness is source-of-truth**
|
|
56
|
+
- Drive recommendations from `mu control harness --json`.
|
|
57
|
+
- Only consider authenticated providers unless policy explicitly allows otherwise.
|
|
58
|
+
|
|
59
|
+
3. **Recommend, then apply**
|
|
60
|
+
- Route decisions are explicit artifacts (forum packets + optional tags),
|
|
61
|
+
not hidden implicit behavior.
|
|
62
|
+
|
|
63
|
+
4. **Non-blocking by default**
|
|
64
|
+
- Routing failure should degrade safely (fallback model / default model)
|
|
65
|
+
unless a hard requirement cannot be met.
|
|
66
|
+
|
|
67
|
+
5. **Bounded pass per tick**
|
|
68
|
+
- One routing decision and one bounded mutation/action bundle per heartbeat pass.
|
|
69
|
+
|
|
70
|
+
6. **Per-issue/session overrides preferred**
|
|
71
|
+
- Use `mu exec --provider/--model/--thinking` or `mu turn ...` overrides.
|
|
72
|
+
- Avoid changing workspace-global operator defaults for per-issue routing.
|
|
73
|
+
|
|
74
|
+
## Overlay identity (`route:model-routing-v1`)
|
|
75
|
+
|
|
76
|
+
- Tag scope root (or selected subtree root) with: `route:model-routing-v1`
|
|
77
|
+
- Routing metadata remains orthogonal to `kind:*`, `ctx:*`, and `flow:*`.
|
|
78
|
+
|
|
79
|
+
## Tag vocabulary
|
|
80
|
+
|
|
81
|
+
Recommended routing tags (policy metadata):
|
|
82
|
+
|
|
83
|
+
- Scope:
|
|
84
|
+
- `route:model-routing-v1`
|
|
85
|
+
- Task family:
|
|
86
|
+
- `route:task:code`
|
|
87
|
+
- `route:task:docs`
|
|
88
|
+
- `route:task:research`
|
|
89
|
+
- `route:task:ops`
|
|
90
|
+
- `route:task:review`
|
|
91
|
+
- `route:task:synth`
|
|
92
|
+
- `route:task:general`
|
|
93
|
+
- Depth intent:
|
|
94
|
+
- `route:depth:fast`
|
|
95
|
+
- `route:depth:balanced`
|
|
96
|
+
- `route:depth:deep`
|
|
97
|
+
- Budget intent:
|
|
98
|
+
- `route:budget:low`
|
|
99
|
+
- `route:budget:balanced`
|
|
100
|
+
- `route:budget:premium`
|
|
101
|
+
- Hard modality requirement:
|
|
102
|
+
- `route:modality:image` (omit for text-only)
|
|
103
|
+
- Pin indicator:
|
|
104
|
+
- `route:pin` (exact provider/model comes from packet metadata)
|
|
105
|
+
|
|
106
|
+
Notes:
|
|
107
|
+
- Keep tags concise and stable.
|
|
108
|
+
- Put detailed routing config in forum packets (not in tag strings).
|
|
109
|
+
|
|
110
|
+
## Recommendation packet contract
|
|
111
|
+
|
|
112
|
+
Post one `ROUTE_RECOMMENDATION` packet to `issue:<issue-id>` before launching work
|
|
113
|
+
with a selected model.
|
|
114
|
+
|
|
115
|
+
Suggested packet shape (JSON block inside forum message):
|
|
116
|
+
|
|
117
|
+
```text
|
|
118
|
+
ROUTE_RECOMMENDATION:
|
|
119
|
+
{
|
|
120
|
+
"version": "route:model-routing-v1",
|
|
121
|
+
"issue_id": "<issue-id>",
|
|
122
|
+
"harness_fingerprint": "<sha256>",
|
|
123
|
+
"selected": {
|
|
124
|
+
"provider": "<provider>",
|
|
125
|
+
"model": "<model>",
|
|
126
|
+
"thinking": "<thinking-level>"
|
|
127
|
+
},
|
|
128
|
+
"alternates": [
|
|
129
|
+
{ "provider": "<provider>", "model": "<model>", "thinking": "<thinking-level>" }
|
|
130
|
+
],
|
|
131
|
+
"constraints": {
|
|
132
|
+
"task": "code|docs|research|ops|review|synth|general",
|
|
133
|
+
"depth": "fast|balanced|deep",
|
|
134
|
+
"budget": "low|balanced|premium",
|
|
135
|
+
"modality": "text|image",
|
|
136
|
+
"min_context_window": 0
|
|
137
|
+
},
|
|
138
|
+
"rationale": [
|
|
139
|
+
"provider authenticated",
|
|
140
|
+
"supports required thinking level",
|
|
141
|
+
"meets context/modality constraints",
|
|
142
|
+
"best score under budget/depth policy"
|
|
143
|
+
],
|
|
144
|
+
"created_at_ms": 0
|
|
145
|
+
}
|
|
146
|
+
```
|
|
147
|
+
|
|
148
|
+
Optional root-level packet for custom preferences:
|
|
149
|
+
|
|
150
|
+
```text
|
|
151
|
+
ROUTE_POLICY:
|
|
152
|
+
{
|
|
153
|
+
"version": "route:model-routing-v1",
|
|
154
|
+
"task_preferences": {
|
|
155
|
+
"code": [
|
|
156
|
+
{ "provider": "openai-codex", "model": "gpt-5.3-codex", "thinking": "xhigh" }
|
|
157
|
+
],
|
|
158
|
+
"docs": [
|
|
159
|
+
{ "provider": "openrouter", "model": "google/gemini-3.1-pro-preview", "thinking": "high" }
|
|
160
|
+
]
|
|
161
|
+
}
|
|
162
|
+
}
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
If a preference entry is unavailable under current harness/auth state, skip it and
|
|
166
|
+
continue deterministic fallback selection.
|
|
167
|
+
|
|
168
|
+
## Selection algorithm (deterministic)
|
|
169
|
+
|
|
170
|
+
### Inputs
|
|
171
|
+
|
|
172
|
+
- Issue tags (`route:task:*`, `route:depth:*`, `route:budget:*`, `route:modality:image`, `route:pin`)
|
|
173
|
+
- Optional `ROUTE_POLICY` and per-issue constraints from forum/body
|
|
174
|
+
- Live harness snapshot (`mu control harness --json`)
|
|
175
|
+
|
|
176
|
+
### Step 1: gather live capabilities
|
|
177
|
+
|
|
178
|
+
```bash
|
|
179
|
+
mu control harness --json --pretty
|
|
180
|
+
```
|
|
181
|
+
|
|
182
|
+
### Step 2: build candidate set
|
|
183
|
+
|
|
184
|
+
1. Start from authenticated providers only.
|
|
185
|
+
2. Flatten model entries across providers.
|
|
186
|
+
3. Filter by hard requirements:
|
|
187
|
+
- required modality (`text` and optional `image`)
|
|
188
|
+
- minimum context window (if specified)
|
|
189
|
+
- pin requirement (`route:pin`) if specified
|
|
190
|
+
4. Resolve target thinking from depth intent:
|
|
191
|
+
- `fast` -> `minimal`
|
|
192
|
+
- `balanced` -> `medium`
|
|
193
|
+
- `deep` -> `xhigh` if available, else `high`
|
|
194
|
+
5. Clamp chosen thinking to model-supported `thinking_levels`.
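
Items 4 and 5 above might be sketched as follows. The ordering in `THINKING_ORDER` (including a `low` level) is an assumption made for illustration, as is the choice to return `None` when a model supports no level at or below the target; the real harness may report levels differently.

```python
from typing import Optional, Sequence

# Assumed ordering from shallowest to deepest; not confirmed by the skill text.
THINKING_ORDER = ["minimal", "low", "medium", "high", "xhigh"]

# Depth intent -> preferred thinking level, per the mapping in the list above.
DEPTH_TO_THINKING = {"fast": "minimal", "balanced": "medium", "deep": "xhigh"}


def resolve_thinking(depth: str, supported: Sequence[str]) -> Optional[str]:
    """Map depth intent to a thinking level, clamped to what the model supports."""
    target_rank = THINKING_ORDER.index(DEPTH_TO_THINKING[depth])
    # Walk downward from the target so the nearest supported lower level wins.
    for level in reversed(THINKING_ORDER[: target_rank + 1]):
        if level in supported:
            return level
    # Nothing at or below the target is supported: caller treats this as a
    # failed thinking-compatibility gate (an assumption of this sketch).
    return None
```

With this clamp, `deep` on a model that lacks `xhigh` resolves to `high`, which matches the "`xhigh` if available, else `high`" rule.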
|
|
195
|
+
|
|
196
|
+
### Step 3: score candidates
|
|
197
|
+
|
|
198
|
+
Use deterministic score components (example):
|
|
199
|
+
|
|
200
|
+
- Hard-fit gates (must pass): auth, modality, context, thinking compatibility
|
|
201
|
+
- Soft score:
|
|
202
|
+
- task preference match (`ROUTE_POLICY`/task family)
|
|
203
|
+
- reasoning/xhigh capability vs depth
|
|
204
|
+
- context headroom
|
|
205
|
+
- budget penalty from per-token cost
|
|
206
|
+
|
|
207
|
+
Tie-breaker: lower estimated cost, then lexicographic `provider/model`.
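
A minimal sketch of deterministic scoring with the tie-breaker above is shown below. The `Candidate` fields, the weights, and the headroom rule are all hypothetical placeholders; only the ordering (score, then lower cost, then lexicographic `provider/model`) follows the text. Candidates are assumed to have already passed the hard-fit gates.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical flattened model entry that already passed the hard gates."""

    provider: str
    model: str
    context_window: int
    cost_per_mtok: float          # illustrative per-token cost proxy
    preferred: bool = False       # matches a ROUTE_POLICY task preference
    supports_xhigh: bool = False


def soft_score(c: Candidate, depth: str, budget: str, min_context_window: int = 0) -> float:
    """Combine the soft components with made-up weights."""
    score = 0.0
    if c.preferred:
        score += 3.0                                   # task preference match
    if depth == "deep" and c.supports_xhigh:
        score += 2.0                                   # reasoning capability vs depth
    if c.context_window >= 2 * min_context_window:
        score += 1.0                                   # context headroom
    penalty = {"low": 1.0, "balanced": 0.5, "premium": 0.0}[budget]
    return score - penalty * c.cost_per_mtok           # budget penalty


def rank(cands: Sequence[Candidate], depth: str, budget: str,
         min_context_window: int = 0) -> List[Candidate]:
    """Sort best-first: higher score, then lower cost, then provider/model."""
    return sorted(
        cands,
        key=lambda c: (
            -soft_score(c, depth, budget, min_context_window),
            c.cost_per_mtok,
            f"{c.provider}/{c.model}",
        ),
    )
```

Because the sort key is a total order over immutable fields, the same harness snapshot and tags always produce the same ranking.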
|
|
208
|
+
|
|
209
|
+
### Step 4: select + alternates
|
|
210
|
+
|
|
211
|
+
- pick top candidate as `selected`
|
|
212
|
+
- keep next N as `alternates` (recommended N=2)
|
|
213
|
+
- post `ROUTE_RECOMMENDATION` packet
|
|
214
|
+
|
|
215
|
+
### Step 5: apply selection
|
|
216
|
+
|
|
217
|
+
For one-shot execution:
|
|
218
|
+
|
|
219
|
+
```bash
|
|
220
|
+
mu exec --provider <provider> --model <model> --thinking <thinking> \
|
|
221
|
+
"Use skills subagents, orchestration, model-routing, and hud. Work issue <issue-id>."
|
|
222
|
+
```
|
|
223
|
+
|
|
224
|
+
For existing session turn:
|
|
225
|
+
|
|
226
|
+
```bash
|
|
227
|
+
mu turn --session-kind cp_operator --session-id <session-id> \
|
|
228
|
+
--provider <provider> --model <model> --thinking <thinking> \
|
|
229
|
+
--body "Continue issue <issue-id> with current routing selection."
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
## Transition table
|
|
233
|
+
|
|
234
|
+
Given an executable issue under `route:model-routing-v1`:
|
|
235
|
+
|
|
236
|
+
1. **No routing decision yet**
|
|
237
|
+
- action: compute recommendation + post `ROUTE_RECOMMENDATION` packet
|
|
238
|
+
|
|
239
|
+
2. **Routing decision exists and still valid**
|
|
240
|
+
- action: execute issue using selected provider/model/thinking
|
|
241
|
+
|
|
242
|
+
3. **Selected route fails at launch/runtime**
|
|
243
|
+
- action: choose next alternate, post `ROUTE_FALLBACK`, retry bounded once
|
|
244
|
+
|
|
245
|
+
4. **All alternates exhausted**
|
|
246
|
+
- action: degrade to harness default model, post `ROUTE_DEGRADED`
|
|
247
|
+
|
|
248
|
+
5. **Hard requirement unmet (no valid candidates)**
|
|
249
|
+
- action: create `kind:ask` node (`ctx:human`, `actor:user`) requesting
|
|
250
|
+
provider auth/config change or constraint relaxation
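
The routing transition table above can likewise be sketched as a pure decision function. `RouteState` and the action labels are hypothetical, and treating a stale decision the same as a missing one is an assumption of this sketch rather than something the table states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteState:
    """Hypothetical per-issue routing snapshot read at the start of a pass."""

    candidates_available: bool   # at least one model passes the hard gates
    has_valid_decision: bool     # a ROUTE_RECOMMENDATION exists and is not stale
    route_failed: bool           # selected route failed at launch/runtime
    alternates_remaining: int


def next_route_action(state: RouteState) -> str:
    """Pick exactly one routing action for the current bounded pass."""
    if not state.candidates_available:
        return "ask_human"             # row 5: hard requirement unmet
    if not state.has_valid_decision:
        return "post_recommendation"   # row 1 (stale treated as missing)
    if not state.route_failed:
        return "execute"               # row 2
    if state.alternates_remaining > 0:
        return "post_fallback"         # row 3: next alternate, retry once
    return "post_degraded"             # row 4: fall back to harness default
```

Checking the hard-requirement case first keeps the function from recommending or executing a route when no candidate can satisfy the constraints.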
|
|
251
|
+
|
|
252
|
+
## Planning handoff contract
|
|
253
|
+
|
|
254
|
+
When planning a routed subtree:
|
|
255
|
+
|
|
256
|
+
1. Tag policy scope with `route:model-routing-v1`.
|
|
257
|
+
2. Tag executable nodes with task/depth/budget intent.
|
|
258
|
+
3. Record any hard constraints (modality/context) in issue body or forum packet.
|
|
259
|
+
4. Optionally add root `ROUTE_POLICY` preferences.
|
|
260
|
+
5. Ensure DAG remains valid under `orchestration` invariants:
|
|
261
|
+
- `mu issues ready --root <root-id> --tag proto:hierarchical-work-v1 --pretty`
|
|
262
|
+
- `mu issues validate <root-id>`
|
|
263
|
+
|
|
264
|
+
## Subagents/heartbeat execution contract
|
|
265
|
+
|
|
266
|
+
Per orchestrator tick:
|
|
267
|
+
|
|
268
|
+
1. Read tree + ready set + latest route packet on target issue.
|
|
269
|
+
2. Read harness snapshot once per pass.
|
|
270
|
+
3. Select one routing transition from the table above.
|
|
271
|
+
4. Apply one bounded mutation bundle (recommend/fallback/ask/execute-start).
|
|
272
|
+
5. Verify with:
|
|
273
|
+
- `mu issues ready --root <root-id> --tag proto:hierarchical-work-v1 --pretty`
|
|
274
|
+
- `mu issues validate <root-id>`
|
|
275
|
+
6. Update HUD state.
|
|
276
|
+
7. Post one concise `ORCH_PASS` status update.
|
|
277
|
+
8. If root is final, disable supervising heartbeat.
|
|
278
|
+
|
|
279
|
+
Reusable heartbeat prompt fragment:
|
|
280
|
+
|
|
281
|
+
```text
|
|
282
|
+
Use skills orchestration, model-routing, subagents, and hud.
|
|
283
|
+
For root <root-id>, enforce route:model-routing-v1.
|
|
284
|
+
Run exactly one bounded routing/orchestration transition pass: compute or validate
|
|
285
|
+
one issue's model recommendation from live `mu control harness` capabilities,
|
|
286
|
+
apply one action, verify DAG state, post one ORCH_PASS, then stop.
|
|
287
|
+
If validate is final, disable the supervising heartbeat and report completion.
|
|
288
|
+
```
|
|
289
|
+
|
|
290
|
+
## Failure + fallback policy
|
|
291
|
+
|
|
292
|
+
1. **Provider/model unavailable or auth drift**
|
|
293
|
+
- post `ROUTE_FALLBACK`
|
|
294
|
+
- move to next alternate
|
|
295
|
+
|
|
296
|
+
2. **Thinking level unsupported for selected model**
|
|
297
|
+
- clamp to nearest supported lower level
|
|
298
|
+
- post rationale in fallback packet
|
|
299
|
+
|
|
300
|
+
3. **No candidates satisfy hard constraints**
|
|
301
|
+
- create `kind:ask` escalation with clear options:
|
|
302
|
+
- authenticate provider X
|
|
303
|
+
- relax modality/context/depth constraint
|
|
304
|
+
- approve default-model execution
|
|
305
|
+
|
|
306
|
+
4. **Auditability requirement**
|
|
307
|
+
- every route change emits forum packet (`ROUTE_RECOMMENDATION`,
|
|
308
|
+
`ROUTE_FALLBACK`, `ROUTE_DEGRADED`)
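The fallback rules above can be read as a small selection function. The sketch below is one possible Python reading, not the skill's implementation: the `Candidate` shape, the ordering of thinking levels passed in as `order`, and the choice to skip a candidate that has no supported level at or below the requested one are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    provider: str
    model: str
    authenticated: bool
    thinking_levels: tuple[str, ...]  # levels this model supports


def clamp_thinking(requested, supported, order):
    """Nearest supported level at or below `requested`, or None.

    `order` lists thinking levels from lowest to highest (assumed shape).
    """
    rank = {level: i for i, level in enumerate(order)}
    if requested not in rank:
        raise ValueError(f"unknown thinking level: {requested}")
    usable = [l for l in supported if l in rank and rank[l] <= rank[requested]]
    return max(usable, key=rank.__getitem__, default=None)


def next_route(candidates, failed, requested_thinking, order):
    """Pick the first usable candidate, or signal a `kind:ask` escalation."""
    for cand in candidates:
        if (cand.provider, cand.model) in failed or not cand.authenticated:
            continue  # rule 1: unavailable or auth drift, move to next alternate
        level = clamp_thinking(requested_thinking, cand.thinking_levels, order)
        if level is None:
            continue  # assumption: no lower level to clamp to, try next alternate
        degraded = bool(failed) or level != requested_thinking
        return {
            "packet": "ROUTE_FALLBACK" if degraded else "ROUTE_RECOMMENDATION",
            "provider": cand.provider,
            "model": cand.model,
            "thinking": level,  # rule 2: clamped level goes in the packet
        }
    return {"packet": None, "escalate": "kind:ask"}  # rule 3
```

Every returned route names a packet type, which keeps the auditability rule (item 4) visible at the call site.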
|
|
309
|
+
|
|
310
|
+
## HUD visibility and teardown
|
|
311
|
+
|
|
312
|
+
HUD usage is not optional for active model-routing execution.
|
|
313
|
+
|
|
314
|
+
- If subagents HUD is active, publish routing state there (selected model,
|
|
315
|
+
alternates remaining, last fallback reason).
|
|
316
|
+
- If running model-routing standalone, own `hud_id:"model-routing"`.
|
|
317
|
+
- Update HUD each bounded pass before ORCH_PASS output.
|
|
318
|
+
- Follow `hud` skill ownership/teardown protocol on completion or handoff.
|
|
319
|
+
|
|
320
|
+
## Evaluation scenarios
|
|
321
|
+
|
|
322
|
+
1. **Coding leaf selects deep coding model**
|
|
323
|
+
- Setup: `route:task:code`, `route:depth:deep`, authenticated coding provider.
|
|
324
|
+
- Expected: recommendation picks a deep reasoning coding model and starts work.
|
|
325
|
+
|
|
326
|
+
2. **Docs leaf prefers writing model**
|
|
327
|
+
- Setup: `route:task:docs` with root `ROUTE_POLICY` preference for docs.
|
|
328
|
+
- Expected: recommendation uses preferred docs model when available, otherwise fallback.
|
|
329
|
+
|
|
330
|
+
3. **Auth/provider drift fallback**
|
|
331
|
+
- Setup: selected provider becomes unauthenticated mid-run.
|
|
332
|
+
- Expected: `ROUTE_FALLBACK` packet and alternate selection in next bounded pass.
|
|
333
|
+
|
|
334
|
+
4. **Hard requirement escalation**
|
|
335
|
+
- Setup: issue requires image input but no authenticated image-capable models.
|
|
336
|
+
- Expected: `kind:ask` node created; downstream remains blocked until user action.
|
|
@@ -177,8 +177,12 @@ mu cron --help
|
|
|
177
177
|
```
|
|
178
178
|
|
|
179
179
|
When work is multi-step and issue-graph driven, use `planning` to shape the DAG,
|
|
180
|
-
then `hud` for canonical HUD behavior, then `
|
|
181
|
-
semantics consistent, then `
|
|
180
|
+
then `hud` for canonical HUD behavior, then `orchestration` to keep DAG
|
|
181
|
+
semantics consistent, then `control-flow` for explicit loop/termination policy,
|
|
182
|
+
then `model-routing` for per-issue provider/model/thinking selection overlays,
|
|
183
|
+
then `subagents` for durable execution.
|
|
184
|
+
For REPL-driven exploration and context compression, use `code-mode`.
|
|
185
|
+
For persistent terminal sessions and worker fan-out mechanics, use `tmux`.
|
|
182
186
|
For recurring bounded automation loops, use `heartbeats`.
|
|
183
187
|
For wall-clock schedules (one-shot, interval, cron-expression), use `crons`.
|
|
184
188
|
|
|
@@ -201,7 +205,11 @@ For wall-clock schedules (one-shot, interval, cron-expression), use `crons`.
|
|
|
201
205
|
- Historical context retrieval and index maintenance: **`memory`**
|
|
202
206
|
- Planning/decomposition and DAG review: **`planning`**
|
|
203
207
|
- HUD contract/state updates across surfaces: **`hud`**
|
|
204
|
-
- Shared DAG semantics for planning + execution: **`
|
|
208
|
+
- Shared DAG semantics for planning + execution: **`orchestration`**
|
|
209
|
+
- Loop/termination policy overlays (review gates, retries, escalation): **`control-flow`**
|
|
210
|
+
- Per-issue model/provider/thinking selection overlays: **`model-routing`**
|
|
211
|
+
- Live REPL execution and context engineering via tmux: **`code-mode`**
|
|
212
|
+
- Persistent tmux session management + worker fan-out primitives: **`tmux`**
|
|
205
213
|
- Durable multi-agent orchestration: **`subagents`**
|
|
206
214
|
- Recurring bounded automation scheduling: **`heartbeats`**
|
|
207
215
|
- Wall-clock scheduling workflows: **`crons`**
|
|
@@ -1,16 +1,19 @@
|
|
|
1
1
|
---
|
|
2
|
-
name:
|
|
3
|
-
description: "Defines the shared
|
|
2
|
+
name: orchestration
|
|
3
|
+
description: "Defines the shared planning/execution orchestration protocol for issue-DAG work. Use when creating, validating, or executing protocol-driven DAG work."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
|
-
#
|
|
6
|
+
# orchestration
|
|
7
7
|
|
|
8
8
|
Use this skill when work should flow through one shared protocol from planning to execution.
|
|
9
9
|
|
|
10
|
+
This skill supersedes the previous `hierarchical-work-protocol` skill name.
|
|
11
|
+
|
|
10
12
|
## Contents
|
|
11
13
|
|
|
12
14
|
- [Protocol identity](#protocol-identity)
|
|
13
15
|
- [Canonical tags and node roles](#canonical-tags-and-node-roles)
|
|
16
|
+
- [Policy overlays](#policy-overlays)
|
|
14
17
|
- [Protocol primitives](#protocol-primitives)
|
|
15
18
|
- [Required invariants](#required-invariants)
|
|
16
19
|
- [Planning handoff contract](#planning-handoff-contract)
|
|
@@ -62,6 +65,25 @@ Node role rules:
|
|
|
62
65
|
- Must include: `proto:hierarchical-work-v1`, `kind:ask`, `ctx:human`, `actor:user`
|
|
63
66
|
- Must be non-executable (`node:agent` removed)
|
|
64
67
|
|
|
68
|
+
## Policy overlays
|
|
69
|
+
|
|
70
|
+
Policy overlays are layered on top of this protocol and should not redefine
|
|
71
|
+
protocol primitives or `kind:*` semantics.
|
|
72
|
+
|
|
73
|
+
- Keep orchestration protocol tags/kinds as source-of-truth for structure.
|
|
74
|
+
- Represent policy-specific behavior with overlay tags/metadata:
|
|
75
|
+
- loop/termination policy (for example review gates, retry rounds,
|
|
76
|
+
escalation thresholds): `flow:*`
|
|
77
|
+
- per-issue model/provider/thinking routing policy: `route:*`
|
|
78
|
+
- Compile overlay decisions into existing primitives (`spawn`, `fork`, `expand`,
|
|
79
|
+
`ask`, `complete`, `serial`) and per-turn/per-session model overrides
|
|
80
|
+
(`mu exec` / `mu turn` with `--provider --model --thinking`) instead of
|
|
81
|
+
introducing ad-hoc mutations.
|
|
82
|
+
|
|
83
|
+
Current compositional overlay skills:
|
|
84
|
+
- `control-flow` (for example `flow:review-gated-v1` behavior)
|
|
85
|
+
- `model-routing` (for example `route:model-routing-v1` behavior)
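To make the layering concrete, the sketch below partitions an issue's tags into protocol structure and the two overlay namespaces named above (`flow:*`, `route:*`). It is an illustrative Python helper, not part of `mu`; the function name `split_tags` and the returned dictionary shape are assumptions.

```python
def split_tags(tags):
    """Partition issue tags into protocol tags and policy-overlay tags."""
    groups = {"protocol": [], "flow": [], "route": []}
    for tag in tags:
        prefix = tag.split(":", 1)[0]
        # Only flow:* and route:* are overlays; everything else is structure.
        key = prefix if prefix in ("flow", "route") else "protocol"
        groups[key].append(tag)
    return groups
```

An orchestrator can then validate structure from `groups["protocol"]` alone and consult the overlay groups only when choosing which overlay skill to load.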
|
|
86
|
+
|
|
65
87
|
## Protocol primitives
|
|
66
88
|
|
|
67
89
|
### `read_tree`
|
|
@@ -21,11 +21,13 @@ Use this skill when the user asks for planning, decomposition, or a staged execu
|
|
|
21
21
|
## Planning HUD is required
|
|
22
22
|
|
|
23
23
|
For this skill, the planning HUD is the primary status/communication surface.
|
|
24
|
+
HUD usage is not optional for planning turns.
|
|
24
25
|
|
|
25
26
|
- Keep HUD state in sync with real planning progress.
|
|
26
27
|
- Update HUD before and after each major planning turn.
|
|
27
28
|
- Use `waiting_on_user`, `next_action`, and `blocker` to communicate exactly what the user needs to do.
|
|
28
29
|
- Include a HUD snapshot in user-facing planning updates.
|
|
30
|
+
- Tear down or hand off HUD state explicitly when planning ends or transitions to another HUD-owning skill.
|
|
29
31
|
|
|
30
32
|
Default per-turn HUD loop:
|
|
31
33
|
|
|
@@ -33,6 +35,7 @@ Default per-turn HUD loop:
|
|
|
33
35
|
2. Keep checklist progress and root issue linkage synchronized with the live issue DAG.
|
|
34
36
|
3. Emit `snapshot` (`compact` or `multiline`) and reflect it in your response.
|
|
35
37
|
|
|
38
|
+
|
|
36
39
|
## HUD skill dependency
|
|
37
40
|
|
|
38
41
|
Before emitting or mutating planning HUD state, load **`hud`** and follow its canonical contract.
|
|
@@ -43,7 +46,7 @@ Before emitting or mutating planning HUD state, load **`hud`** and follow its ca
|
|
|
43
46
|
## Shared protocol dependency
|
|
44
47
|
|
|
45
48
|
This skill plans DAGs for execution by `subagents`, so planning must follow the
|
|
46
|
-
shared protocol in **`
|
|
49
|
+
shared protocol in **`orchestration`**.
|
|
47
50
|
|
|
48
51
|
Before creating or reshaping DAG nodes, load that skill and use its canonical:
|
|
49
52
|
|
|
@@ -54,6 +57,15 @@ Before creating or reshaping DAG nodes, load that skill and use its canonical:
|
|
|
54
57
|
|
|
55
58
|
Do not invent alternate protocol names or tag schemas.
|
|
56
59
|
|
|
60
|
+
If the user asks for explicit loop/termination behavior (for example review-gated
|
|
61
|
+
retry rounds), load **`control-flow`** and encode policy via `flow:*` overlays
|
|
62
|
+
without changing orchestration protocol semantics.
|
|
63
|
+
|
|
64
|
+
If the user asks for per-issue model/provider/thinking recommendations based on
|
|
65
|
+
live harness capabilities, load **`model-routing`** and encode policy via
|
|
66
|
+
`route:*` overlays plus route packets (for example `ROUTE_POLICY`) without
|
|
67
|
+
changing orchestration protocol semantics.
|
|
68
|
+
|
|
57
69
|
## Core contract
|
|
58
70
|
|
|
59
71
|
1. **Investigate first**
|
|
@@ -64,6 +76,7 @@ Do not invent alternate protocol names or tag schemas.
|
|
|
64
76
|
- Create root and child issues that comply with `hierarchical-work.protocol/v1`.
|
|
65
77
|
- Encode dependencies so the DAG reflects execution order and synth fan-in.
|
|
66
78
|
- Add clear titles, scope, acceptance criteria, and protocol tags.
|
|
79
|
+
- When model specialization is required, attach explicit `route:*` intent tags/constraints to executable nodes.
|
|
67
80
|
|
|
68
81
|
3. **Drive communication through the planning HUD**
|
|
69
82
|
- Load `hud` and use its canonical `mu_hud`/`HudDoc` contract.
|
|
@@ -82,7 +95,10 @@ Do not invent alternate protocol names or tag schemas.
|
|
|
82
95
|
- Do not begin broad execution until the user signals satisfaction.
|
|
83
96
|
|
|
84
97
|
6. **After user approval, ask user about next steps**
|
|
85
|
-
- On user acceptance of the plan,
|
|
98
|
+
- On user acceptance of the plan, tear down planning HUD ownership.
|
|
99
|
+
- If handing off to another HUD-owning skill (for example `subagents`), remove
|
|
100
|
+
`hud_id:"planning"` and keep HUD on for the next skill.
|
|
101
|
+
- If no next HUD-owning skill starts immediately, remove the planning doc and turn the HUD off.
|
|
86
102
|
- Read the `subagents` skill and offer to supervise subagents to execute the plan.
|
|
87
103
|
|
|
88
104
|
## Suggested workflow
|
|
@@ -193,6 +209,7 @@ Required HUD updates during the loop:
|
|
|
193
209
|
- Re-emit the `planning` HUD doc with current `phase`, checklist progress, `waiting_on_user`, `next_action`, and `blocker` after each meaningful planning step.
|
|
194
210
|
- Use `{"action":"snapshot","snapshot_format":"compact"}` for concise user-facing HUD lines.
|
|
195
211
|
- Keep `updated_at_ms` monotonic across updates so latest doc wins deterministically.
|
|
212
|
+
- On plan completion/handoff, remove `hud_id:"planning"` and apply handoff/off semantics from the `hud` skill.
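The `updated_at_ms` requirement above is easy to violate when two updates land in the same millisecond or the system clock steps backwards. A minimal Python sketch of one way to keep the value strictly increasing follows; the helper name `next_updated_at_ms` is hypothetical and not part of the `hud` contract.

```python
import time


def next_updated_at_ms(previous_ms=None, now_ms=None):
    """Return a timestamp strictly greater than `previous_ms`.

    `now_ms` defaults to the wall clock; it is a parameter so the
    behaviour can be checked deterministically.
    """
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    if previous_ms is None:
        return now_ms
    # If the clock did not advance (or went backwards), bump by one.
    return max(now_ms, previous_ms + 1)
```

Keeping the previous value alongside the HUD doc and passing it back in on each update is enough for the latest doc to win deterministically.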
|
|
196
213
|
|
|
197
214
|
## Effective HUD usage heuristics
|
|
198
215
|
|
|
@@ -223,4 +240,5 @@ Required HUD updates during the loop:
|
|
|
223
240
|
- Keep tasks small enough to complete in one focused pass.
|
|
224
241
|
- Explicitly call out uncertain assumptions for user confirmation.
|
|
225
242
|
- Prefer reversible plans and incremental checkpoints.
|
|
243
|
+
- If `model-routing` is in scope, route intent/constraints are explicit and non-conflicting.
|
|
226
244
|
- HUD state must be fresh, accurate, and aligned with user-visible status updates.
|
|
@@ -9,6 +9,13 @@ Use this skill when the user asks to set up Discord messaging for `mu`.
|
|
|
9
9
|
|
|
10
10
|
Goal: get Discord `/mu` ingress working with minimal user effort outside the terminal.
|
|
11
11
|
|
|
12
|
+
## Contents
|
|
13
|
+
|
|
14
|
+
- [Required user-provided inputs](#required-user-provided-inputs)
|
|
15
|
+
- [Agent-first workflow](#agent-first-workflow)
|
|
16
|
+
- [Evaluation scenarios](#evaluation-scenarios)
|
|
17
|
+
- [Notes and caveats](#notes-and-caveats)
|
|
18
|
+
|
|
12
19
|
## Required user-provided inputs
|
|
13
20
|
|
|
14
21
|
- Public webhook base URL reachable by Discord (for example `https://mu.example.com`)
|
|
@@ -9,6 +9,13 @@ Use this skill when the user asks to set up the Neovim messaging channel (`mu.nv
|
|
|
9
9
|
|
|
10
10
|
Goal: get `:Mu ...` working against `mu` control-plane with minimal user-side editor actions.
|
|
11
11
|
|
|
12
|
+
## Contents
|
|
13
|
+
|
|
14
|
+
- [Required user-provided inputs](#required-user-provided-inputs)
|
|
15
|
+
- [Agent-first workflow](#agent-first-workflow)
|
|
16
|
+
- [Evaluation scenarios](#evaluation-scenarios)
|
|
17
|
+
- [Safety and UX requirements](#safety-and-ux-requirements)
|
|
18
|
+
|
|
12
19
|
## Required user-provided inputs
|
|
13
20
|
|
|
14
21
|
- Confirmation that `mu.nvim` is installed (or permission for the agent to provide install snippet)
|
|
@@ -9,7 +9,10 @@ description: "Orchestrates issue-driven subagent execution with heartbeat superv
|
|
|
9
9
|
|
|
10
10
|
- [Purpose (what this skill is for)](#purpose-what-this-skill-is-for)
|
|
11
11
|
- [Shared protocol dependency](#shared-protocol-dependency)
|
|
12
|
+
- [Control-flow dependency](#control-flow-dependency)
|
|
13
|
+
- [Model-routing dependency](#model-routing-dependency)
|
|
12
14
|
- [HUD skill dependency](#hud-skill-dependency)
|
|
15
|
+
- [tmux skill dependency](#tmux-skill-dependency)
|
|
13
16
|
- [When to use](#when-to-use)
|
|
14
17
|
- [Success condition](#success-condition)
|
|
15
18
|
- [Dispatch modes](#dispatch-modes)
|
|
@@ -36,7 +39,7 @@ Source of truth remains in `mu issues` + `mu forum`.
|
|
|
36
39
|
|
|
37
40
|
## Shared protocol dependency
|
|
38
41
|
|
|
39
|
-
This skill executes DAG work defined by **`
|
|
42
|
+
This skill executes DAG work defined by **`orchestration`**.
|
|
40
43
|
|
|
41
44
|
Before orchestration begins, load that skill and enforce:
|
|
42
45
|
|
|
@@ -46,13 +49,45 @@ Before orchestration begins, load that skill and enforce:
|
|
|
46
49
|
|
|
47
50
|
Do not run subagent orchestration against alternate protocol tags.
|
|
48
51
|
|
|
52
|
+
## Control-flow dependency
|
|
53
|
+
|
|
54
|
+
When a subtree declares explicit loop/termination policy (for example
|
|
55
|
+
`flow:review-gated-v1`), load **`control-flow`** and apply policy transitions as
|
|
56
|
+
an overlay on orchestration primitives.
|
|
57
|
+
|
|
58
|
+
- Keep DAG structure protocol-valid (`orchestration` remains source-of-truth).
|
|
59
|
+
- Compile control-flow decisions into protocol primitives (`spawn`, `expand`,
|
|
60
|
+
`ask`, `complete`, `serial`), not ad-hoc mutations.
|
|
61
|
+
|
|
62
|
+
## Model-routing dependency
|
|
63
|
+
|
|
64
|
+
When a subtree declares per-issue model/provider/thinking policy (for example
|
|
65
|
+
`route:model-routing-v1`), load **`model-routing`** and apply routing transitions
|
|
66
|
+
as an overlay on orchestration primitives.
|
|
67
|
+
|
|
68
|
+
- Keep DAG structure protocol-valid (`orchestration` remains source-of-truth).
|
|
69
|
+
- Drive recommendations from live harness capabilities (`mu control harness --json`).
|
|
70
|
+
- Apply route selections with per-turn/per-session overrides (`mu exec`/`mu turn`
|
|
71
|
+
`--provider --model --thinking`) instead of mutating workspace-global defaults.
|
|
72
|
+
- Emit auditable route packets (`ROUTE_RECOMMENDATION`, `ROUTE_FALLBACK`,
|
|
73
|
+
`ROUTE_DEGRADED`) in forum topics.
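As an illustration of applying a route selection per invocation instead of changing workspace defaults, the Python sketch below builds an argv list for `mu exec` with the three override flags named above. The flag names come from this skill text; that each flag takes one value and may precede the prompt is an assumption, and `build_exec_argv` is a hypothetical helper.

```python
import shlex


def build_exec_argv(prompt, provider, model, thinking):
    """Assemble a `mu exec` invocation with per-turn routing overrides."""
    return [
        "mu", "exec",
        "--provider", provider,   # assumed: flag takes one value
        "--model", model,
        "--thinking", thinking,
        prompt,                   # assumed: prompt is the trailing argument
    ]


def as_shell(argv):
    """Render argv as a safely quoted shell line, e.g. for tmux new-session."""
    return shlex.join(argv)
```

Passing the list to `subprocess.run` directly avoids shell quoting; `as_shell` is only needed when the command has to be embedded in another command string.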
|
|
74
|
+
|
|
49
75
|
## HUD skill dependency
|
|
50
76
|
|
|
51
77
|
Before emitting or mutating subagent HUD state, load **`hud`** and follow its canonical contract.
|
|
78
|
+
HUD usage is not optional for this skill.
|
|
52
79
|
|
|
53
80
|
- Treat `hud` as source-of-truth for generic `mu_hud` actions, `HudDoc` shape, and rendering constraints.
|
|
54
81
|
- This subagents skill defines orchestration-specific conventions only (for example `hud_id: "subagents"`, queue/activity semantics).
|
|
55
82
|
|
|
83
|
+
## tmux skill dependency
|
|
84
|
+
|
|
85
|
+
Before spawning/inspecting worker sessions, load **`tmux`** and follow its
|
|
86
|
+
canonical session lifecycle and bounded send/capture protocol.
|
|
87
|
+
|
|
88
|
+
- Treat `tmux` as source-of-truth for session ownership, completion markers, and teardown.
|
|
89
|
+
- This subagents skill defines orchestration semantics and queue policy.
|
|
90
|
+
|
|
56
91
|
## When to use
|
|
57
92
|
|
|
58
93
|
- Work is represented as issue-scoped deliverables with explicit outcomes.
|
|
@@ -103,14 +138,15 @@ mu issues ready --root <root-id> --tag proto:hierarchical-work-v1 --pretty
|
|
|
103
138
|
mu forum read issue:<root-id> --limit 20 --pretty
|
|
104
139
|
```
|
|
105
140
|
|
|
106
|
-
2. Choose exactly one action/primitive from `
|
|
141
|
+
2. Choose exactly one action/primitive from `orchestration`.
|
|
107
142
|
3. Apply it.
|
|
108
143
|
4. Verify (`get`, `children`, `ready`, `validate`).
|
|
109
|
-
5.
|
|
144
|
+
5. Update `hud_id:"subagents"` (required) and emit a compact snapshot.
|
|
145
|
+
6. Post a human-facing `ORCH_PASS` update to forum:
|
|
110
146
|
- start with a short title that captures status in plain language
|
|
111
147
|
- follow with one concise paragraph covering: project objective context, milestone moved this pass, impact, overall progress, and next high-level step
|
|
112
148
|
- include queue/worker/drift internals only when diagnosing blocker/anomaly.
|
|
113
|
-
|
|
149
|
+
7. Exit tick.
|
|
114
150
|
|
|
115
151
|
Stop automation when `mu issues validate <root-id>` returns final.
|
|
116
152
|
|
|
@@ -120,6 +156,7 @@ For claimed issue `<issue-id>` under `<root-id>`:
|
|
|
120
156
|
|
|
121
157
|
1. Run `read_tree`.
|
|
122
158
|
2. Choose one primitive:
|
|
159
|
+
- route policy present and no valid route decision -> apply one `model-routing` transition
|
|
123
160
|
- missing input -> `ask`
|
|
124
161
|
- needs decomposition -> `expand`
|
|
125
162
|
- directly solvable -> `complete`
|
|
@@ -132,7 +169,7 @@ Repeat bounded passes until issue closes.
|
|
|
132
169
|
## Bootstrap and queue targeting
|
|
133
170
|
|
|
134
171
|
If root DAG does not yet exist, create it using the
|
|
135
|
-
`
|
|
172
|
+
`orchestration` bootstrap template first.
|
|
136
173
|
|
|
137
174
|
During orchestration, always scope queue reads with protocol tag:
|
|
138
175
|
|
|
@@ -147,9 +184,9 @@ mu issues ready --root <root-id> --tag proto:hierarchical-work-v1 --pretty
|
|
|
147
184
|
```bash
|
|
148
185
|
mu heartbeats create \
|
|
149
186
|
--title "hierarchical-work-v1 <root-id>" \
|
|
150
|
-
--reason
|
|
187
|
+
--reason orchestration_v1 \
|
|
151
188
|
--every-ms 15000 \
|
|
152
|
-
--prompt "Use skills subagents,
|
|
189
|
+
--prompt "Use skills subagents, orchestration, control-flow, model-routing, and hud for root <root-id>. Run exactly one bounded orchestration pass: inspect the proto:hierarchical-work-v1 queue, perform exactly one corrective orchestration action (including in_progress-without-worker drift recovery) or claim/work-start one ready issue, then verify state. If flow:* policy tags are present, apply one control-flow transition from the control-flow skill in this pass. If route:* policy tags are present, apply one model-routing transition from the model-routing skill in this pass using live `mu control harness` capabilities and per-turn provider/model/thinking overrides. Report human-facing progress as a titled status note plus one concise paragraph that explains project context, milestone moved, impact, overall progress, and next high-level step; avoid low-level orchestration internals unless diagnosing a blocker/anomaly. Post a matching ORCH_PASS update to issue:<root-id>. Stop when 'mu issues validate <root-id>' is final."
|
|
153
190
|
```
|
|
154
191
|
|
|
155
192
|
Reusable status-voice add-on for heartbeat prompts (copy/paste):
|
|
@@ -169,13 +206,13 @@ run_id="$(date +%Y%m%d-%H%M%S)"
|
|
|
169
206
|
for issue_id in $(mu issues ready --root <root-id> --tag proto:hierarchical-work-v1 --json | jq -r '.[].id' | head -n 3); do
|
|
170
207
|
session="mu-sub-${run_id}-${issue_id}"
|
|
171
208
|
tmux new-session -d -s "$session" \
|
|
172
|
-
"cd '$PWD' && mu exec 'Use skills subagents,
|
|
209
|
+
"cd '$PWD' && mu exec 'Use skills subagents, orchestration, control-flow, model-routing, and hud. Work issue ${issue_id} using hierarchical-work.protocol/v1. If flow:* policy tags are present, apply the control-flow overlay before selecting the next primitive. If route:* policy tags are present, apply the model-routing overlay using live harness capabilities before selecting the next primitive. Claim first, then run one full control loop.' ; rc=\$?; echo __MU_DONE__:\$rc"
|
|
173
210
|
done
|
|
174
211
|
```
|
|
175
212
|
|
|
176
213
|
## Subagents HUD
|
|
177
214
|
|
|
178
|
-
|
|
215
|
+
HUD usage is required for this skill. Truth still lives in issues/forum.
|
|
179
216
|
|
|
180
217
|
```text
|
|
181
218
|
/mu hud on
|
|
@@ -194,7 +231,8 @@ Tool: `mu_hud`
|
|
|
194
231
|
- actions: refresh/spawn command hooks (if desired)
|
|
195
232
|
- metadata: include `style_preset:"subagents"` for consistent renderer emphasis
|
|
196
233
|
- Example update:
|
|
197
|
-
- `{"action":"set","doc":{"
|
|
234
|
+
- `{"action":"set", "doc": {"hud_id":"subagents", ...}}` (see `hud` skill for exact shape)
|
|
235
|
+
- Follow the HUD ownership and teardown protocol from `hud` skill for completion and handoff.
|
|
198
236
|
|
|
199
237
|
## Evaluation scenarios
|
|
200
238
|
|
|
@@ -210,12 +248,17 @@ Tool: `mu_hud`
|
|
|
210
248
|
- Setup: worker encounters missing critical input.
|
|
211
249
|
- Expected: skill applies protocol `ask` semantics, creates a human-input node, and downstream work remains blocked until the answer issue closes.
|
|
212
250
|
|
|
251
|
+
4. **Model-routing overlay with fallback**
|
|
252
|
+
- Setup: ready issue tagged `route:model-routing-v1` and selected model fails at launch.
|
|
253
|
+
- Expected: one bounded pass emits `ROUTE_FALLBACK`, selects alternate/provider fallback deterministically, and continues execution without violating DAG protocol rules.
|
|
254
|
+
|
|
213
255
|
## Reconciliation
|
|
214
256
|
|
|
215
257
|
- Run `mu issues validate <root-id>` before declaring completion.
|
|
216
258
|
- Merge synth-node outputs into one final user-facing result.
|
|
217
259
|
- Convert unresolved gaps into new child issues tagged `proto:hierarchical-work-v1`.
|
|
218
260
|
- Tear down temporary tmux sessions.
|
|
261
|
+
- Tear down or hand off `hud_id:"subagents"` ownership following the `hud` skill protocol.
|
|
219
262
|
|
|
220
263
|
## Safety
|
|
221
264
|
|
|
@@ -0,0 +1,149 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: tmux
|
|
3
|
+
description: "Provides canonical tmux session patterns for persistent REPLs, bounded command execution, and parallel worker fan-out."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# tmux
|
|
7
|
+
|
|
8
|
+
Use this skill when other workflows need durable shell state, long-lived REPLs,
|
|
9
|
+
or parallel worker sessions.
|
|
10
|
+
|
|
11
|
+
This is a transport/runtime primitive skill. It does not define task semantics;
|
|
12
|
+
it defines how to run sessions reliably.
|
|
13
|
+
|
|
14
|
+
## Contents
|
|
15
|
+
|
|
16
|
+
- [Core contract](#core-contract)
|
|
17
|
+
- [Session lifecycle primitives](#session-lifecycle-primitives)
|
|
18
|
+
- [Bounded execution protocol](#bounded-execution-protocol)
|
|
19
|
+
- [Parallel fan-out pattern](#parallel-fan-out-pattern)
|
|
20
|
+
- [Teardown and diagnostics](#teardown-and-diagnostics)
|
|
21
|
+
- [Integration map](#integration-map)
|
|
22
|
+
- [Evaluation scenarios](#evaluation-scenarios)
|
|
23
|
+
|
|
24
|
+
## Core contract
|
|
25
|
+
|
|
26
|
+
1. **One logical task scope per session**
|
|
27
|
+
- Reuse a session for one task/thread.
|
|
28
|
+
- Use distinct names for unrelated tasks.
|
|
29
|
+
|
|
30
|
+
2. **Create-or-reuse, do not assume**
|
|
31
|
+
- Always check `tmux has-session` before creating.
|
|
32
|
+
|
|
33
|
+
3. **Bound command passes**
|
|
34
|
+
- Send one coherent pass, capture output, then decide the next pass.
|
|
35
|
+
- Use completion markers when possible.
|
|
36
|
+
|
|
37
|
+
4. **Prefer explicit ownership**
|
|
38
|
+
- Track which sessions this run created.
|
|
39
|
+
- Tear down owned sessions at completion/handoff.
|
|
40
|
+
|
|
41
|
+
5. **Keep it simple**
|
|
42
|
+
- `tmux` is a substrate; avoid extra protocol complexity unless the task requires it.
|
|
43
|
+
|
|
44
|
+
## Session lifecycle primitives
|
|
45
|
+
|
|
46
|
+
List and inspect:
|
|
47
|
+
|
|
48
|
+
```bash
tmux list-sessions
```
|
|
51
|
+
|
|
52
|
+
Create or reuse a shell session:
|
|
53
|
+
|
|
54
|
+
```bash
session="mu-shell-main"
tmux has-session -t "$session" 2>/dev/null || tmux new-session -d -s "$session" "bash --noprofile --norc -i"
```
|
|
58
|
+
|
|
59
|
+
Attach for manual inspection:
|
|
60
|
+
|
|
61
|
+
```bash
tmux attach -t "$session"
```
|
|
64
|
+
|
|
65
|
+
## Bounded execution protocol
|
|
66
|
+
|
|
67
|
+
Send one command pass and wait for a marker:
|
|
68
|
+
|
|
69
|
+
```bash
session="mu-shell-main"
# %N (nanoseconds) is a GNU date extension; any unique suffix works.
token="__MU_DONE_$(date +%s%N)__"

tmux send-keys -t "$session" "echo start && pwd && ls" C-m
tmux send-keys -t "$session" "echo $token" C-m

# Poll up to 80 times (about 4 s in total) for the marker.
for _ in $(seq 1 80); do
  out="$(tmux capture-pane -pt "$session" -S -200)"
  # -x matches a whole line, so the echoed command line that also
  # contains the token does not end the loop early.
  printf '%s\n' "$out" | grep -qxF "$token" && break
  sleep 0.05
done

printf "%s\n" "$out"
```
|
|
84
|
+
|
|
85
|
+
Use this same pattern for REPL sessions (`python3 -q`, `node`, `sqlite3`, etc.).
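One subtlety of marker polling: an interactive shell echoes the typed `echo <token>` command into the pane, so a plain substring match can fire before the marker has actually been printed. The Python sketch below matches whole lines only. It is an illustrative helper under that assumption, not part of `mu`; `capture` is injected so the loop can be exercised without tmux, and `tmux_capture` wraps the same `capture-pane` call used above.

```python
import subprocess
import time
from typing import Callable, Optional


def tmux_capture(session: str, history: int = 200) -> str:
    """Capture recent pane text; requires tmux and an existing session."""
    result = subprocess.run(
        ["tmux", "capture-pane", "-p", "-t", session, "-S", f"-{history}"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def marker_seen(pane_text: str, token: str) -> bool:
    """True when some line consists solely of the token."""
    return any(line.rstrip() == token for line in pane_text.splitlines())


def wait_for_marker(
    capture: Callable[[], str],
    token: str,
    attempts: int = 80,
    delay_s: float = 0.05,
) -> Optional[str]:
    """Poll `capture` until the marker line appears; None on timeout."""
    for _ in range(attempts):
        out = capture()
        if marker_seen(out, token):
            return out
        time.sleep(delay_s)
    return None
```

Typical use would be `wait_for_marker(lambda: tmux_capture("mu-shell-main"), token)`; a `None` result signals a timeout that the caller should treat as an incomplete pass.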
|
|
86
|
+
|
|
87
|
+
## Parallel fan-out pattern
|
|
88
|
+
|
|
89
|
+
Spawn one session per independent unit of work:
|
|
90
|
+
|
|
91
|
+
```bash
run_id="$(date +%Y%m%d-%H%M%S)"
for work_id in a b c; do
  session="mu-worker-${run_id}-${work_id}"
  tmux new-session -d -s "$session" "bash -lc 'echo START:${work_id}; sleep 1; echo DONE:${work_id}'"
done
```
|
|
98
|
+
|
|
99
|
+
Inspect recent output from all workers:
|
|
100
|
+
|
|
101
|
+
```bash
for s in $(tmux list-sessions -F '#S' | grep '^mu-worker-'); do
  echo "=== $s ==="
  tmux capture-pane -pt "$s" -S -60 | tail -n 20
done
```
|
|
107
|
+
|
|
108
|
+
## Teardown and diagnostics
|
|
109
|
+
|
|
110
|
+
Kill one session:
|
|
111
|
+
|
|
112
|
+
```bash
tmux kill-session -t "$session"
```
|
|
115
|
+
|
|
116
|
+
Kill owned worker set by prefix:
|
|
117
|
+
|
|
118
|
+
```bash
for s in $(tmux list-sessions -F '#S' | grep '^mu-worker-20260224-'); do
  tmux kill-session -t "$s"
done
```
|
|
123
|
+
|
|
124
|
+
Quick diagnostics checklist:
|
|
125
|
+
|
|
126
|
+
1. `tmux list-sessions` (does session exist?)
|
|
127
|
+
2. `tmux capture-pane -pt <session> -S -200` (what actually happened?)
|
|
128
|
+
3. check marker presence / timeout behavior
|
|
129
|
+
4. recreate session if shell state is irrecoverably bad
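Prefix-based teardown is only safe when the prefix identifies exactly the sessions this run created. The Python sketch below filters session names by the full `mu-worker-<run_id>-` prefix used in the fan-out pattern; `owned_worker_sessions` and `list_sessions` are hypothetical helper names, and the second one needs a running tmux server.

```python
import subprocess


def list_sessions():
    """Return current tmux session names; requires a running tmux server."""
    result = subprocess.run(
        ["tmux", "list-sessions", "-F", "#S"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def owned_worker_sessions(session_names, run_id, prefix="mu-worker"):
    """Keep only sessions named `<prefix>-<run_id>-<work_id>`."""
    wanted = f"{prefix}-{run_id}-"
    return [name for name in session_names if name.startswith(wanted)]
```

Matching on the whole run id, including the time component, avoids killing workers from another run started on the same day.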
|
|
130
|
+
|
|
131
|
+
## Integration map
|
|
132
|
+
|
|
133
|
+
- `code-mode`: tmux-backed REPL persistence and context engineering loops
|
|
134
|
+
- `subagents`: tmux fan-out for parallel worker execution
|
|
135
|
+
- `heartbeats` / `crons`: schedule bounded passes that dispatch into tmux workers
|
|
136
|
+
|
|
137
|
+
## Evaluation scenarios
|
|
138
|
+
|
|
139
|
+
1. **Persistent REPL continuity**
|
|
140
|
+
- Setup: run multi-pass Python debugging task.
|
|
141
|
+
- Expected: same session reused; state persists across passes.
|
|
142
|
+
|
|
143
|
+
2. **Bounded pass completion**
|
|
144
|
+
- Setup: command that emits long output.
|
|
145
|
+
- Expected: completion marker reliably terminates capture loop.
|
|
146
|
+
|
|
147
|
+
3. **Parallel worker fan-out**
|
|
148
|
+
- Setup: three independent work items.
|
|
149
|
+
- Expected: one session per item, inspectable output, clean teardown.
|