@yemi33/minions 0.1.1883 → 0.1.1885

@@ -0,0 +1,213 @@
1
+ # Runtime Adapters
2
+
3
+ How Minions abstracts over CLI runtimes (Claude Code, GitHub Copilot CLI, future
4
+ adapters). Engine code never special-cases a runtime by name — every CLI-specific
5
+ behavior is hidden behind an adapter object resolved through `resolveRuntime()`.
6
+
7
+ ## What lives in `engine/runtimes/`
8
+
9
+ | File | Role |
10
+ |------|------|
11
+ | `engine/runtimes/index.js` | Adapter registry. `resolveRuntime(name)`, `listRuntimes()`, `registerRuntime()`. Engine code MUST go through these. |
12
+ | `engine/runtimes/claude.js` | Claude Code adapter. Owns binary probe, `--system-prompt-file`, JSONL parser, model shorthands, budget cap, bare mode. |
13
+ | `engine/runtimes/copilot.js` | GitHub Copilot CLI adapter. Owns standalone-vs-`gh-copilot` resolution, stdin-only prompt delivery, `https://api.githubcopilot.com/models` discovery, effort `'max' → 'xhigh'` mapping. |
14
+
15
+ `resolveRuntime(name)` throws when `name` is unknown so misconfigurations surface
16
+ at dispatch time instead of producing silent fallbacks deep inside spawn logic.
17
+ The default name is `'claude'`, which matches the `engine.defaultCli` fallback.
18
+
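The throw-on-unknown behavior described above can be sketched as follows. This is a minimal stand-in, not the contents of `engine/runtimes/index.js`: the `Map`-based storage, the `registerRuntime(name, adapter)` argument list, the default-parameter handling, and the error wording are all assumptions made for illustration.

```javascript
// Illustrative stand-in for an adapter registry; not the real module.
const registry = new Map();

// Assumed signature: the source only names registerRuntime(), not its arguments.
function registerRuntime(name, adapter) {
  registry.set(name, adapter);
}

function listRuntimes() {
  return [...registry.keys()];
}

// Falls back to 'claude' only when no name is given at all.
function resolveRuntime(name = 'claude') {
  const adapter = registry.get(name);
  if (!adapter) {
    // Fail loudly at dispatch time instead of silently picking a fallback.
    throw new Error(`Unknown runtime "${name}". Registered: ${listRuntimes().join(', ')}`);
  }
  return adapter;
}

// Placeholder adapters with just enough shape for the sketch.
registerRuntime('claude', { name: 'claude', capabilities: { systemPromptFile: true } });
registerRuntime('copilot', { name: 'copilot', capabilities: { systemPromptFile: false } });
```

Throwing here means a typo in `engine.defaultCli` surfaces as one clear error rather than as a confusing spawn failure later.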
19
+ ## Adapter Interface
20
+
21
+ Every adapter exports the following surface. The Claude adapter is the canonical
22
+ implementation — copy it as the template for new runtimes and only override the
23
+ methods that genuinely differ.
24
+
25
+ | Member | Type / Returns | Purpose |
26
+ |--------|----------------|---------|
27
+ | `name` | string | Adapter identifier (matches the registry key). |
28
+ | `capabilities` | flag object (see below) | Feature flags consumed by engine code. |
29
+ | `resolveBinary({ env, config? })` | `{ bin, native, leadingArgs }` or null | Probe PATH, npm-globals, package overrides; cached to `engine/<name>-caps.json`. `leadingArgs` is `[]` for direct binaries and `['copilot']` for the `gh copilot` extension fallback. |
30
+ | `capsFile` | absolute path | Cache file for `resolveBinary()` results. |
31
+ | `installHint` | string | Surfaced when `resolveBinary()` returns null. Consumed by `engine/preflight.js` and `engine/spawn-agent.js`. |
32
+ | `listModels()` | `Promise<{id,name,provider}[] \| null>` | Model catalog or null when discovery isn't supported. |
33
+ | `modelsCache` | absolute path | Cache file for `listModels()` results. |
34
+ | `spawnScript` | absolute path | Path to the runtime-agnostic spawn wrapper (currently `engine/spawn-agent.js` for both runtimes). |
35
+ | `buildArgs(opts)` | `string[]` | CLI args excluding the binary itself. |
36
+ | `buildSpawnFlags(opts)` | `string[]` | Optional; engine-side flags consumed by `engine/spawn-agent.js` (overrides the default `_buildAgentSpawnFlags`). |
37
+ | `buildPrompt(promptText, sysPromptText)` | string | Final prompt delivered to the CLI. Adapters that lack `--system-prompt-file` (Copilot) prepend a `<system>` block here. |
38
+ | `getUserAssetDirs({ homeDir })` | `string[]` | Runtime-native global asset roots passed to spawn as `--add-dir` so worktrees still see them. |
39
+ | `getSkillRoots({ homeDir, project? })` | `[{ scope, dir, projectName? }]` | Where `collectSkillFiles` looks for native + project skill markdown. |
40
+ | `getSkillWriteTargets({ homeDir, project? })` | `{ personal, project }` | Where `extractSkillsFromOutput` writes auto-extracted skills. |
41
+ | `getResumeSessionId(...)` | string or null | Pre-spawn resume probe; runtime-agnostic stamp prevents cross-runtime session-ID confusion. |
42
+ | `saveSession(...)` | void | Writes `agents/<id>/session.json` after a successful turn. |
43
+ | `resolveModel(input)` | string or undefined | Shorthand expansion / passthrough. Returning undefined causes the adapter to omit `--model` and the CLI uses its own default. |
44
+ | `modelLooksFamiliar(model)` | boolean | Heuristic powering the preflight "stale model after CLI switch" warning. |
45
+ | `parseOutput(raw)` | `{ text, usage, sessionId, model }` | Final-event parser. |
46
+ | `parseStreamChunk(line)` | event object or null | Single JSONL line → typed event. |
47
+ | `parseError(rawOutput)` | `{ message, code, retriable }` | Codes: `auth-failure`, `context-limit`, `budget-exceeded`, `crash`, null. |
48
+ | `createStreamConsumer(ctx)` | consumer object | Stream accumulator used by `engine/llm.js`. |
49
+ | `detectPermissionGate`, `getPromptDeliveryMode`, `usesSystemPromptFile`, `classifyFailure` | misc | Adapter-owned policy that engine code reads through accessors instead of branching on `runtime.name`. |
50
+
51
+ The contract surface is documented inline at the top of `engine/runtimes/claude.js`
52
+ (see the JSDoc block at the start of that file). Keep that block authoritative —
53
+ this table is the human-readable mirror.
54
+
55
+ ## Capability Flags
56
+
57
+ Engine code branches on flags, never on runtime names. Add a flag whenever a
58
+ real behavioral split appears between adapters.
59
+
60
+ | Flag | Claude | Copilot | Gates |
61
+ |------|--------|---------|-------|
62
+ | `streaming` | ✓ | ✓ | JSONL events on stdout. |
63
+ | `sessionResume` | ✓ | ✓ | `--resume <id>`. |
64
+ | `midRunSessionId` | ✓ | ✗ | Resumable session ID emitted before the terminal `result` event; when false, steering waits for a checkpoint. |
65
+ | `systemPromptFile` | ✓ | ✗ | sysprompt via `--system-prompt-file` (else inlined into stdin by `buildPrompt`). |
66
+ | `effortLevels` | ✓ | ✓ | `--effort low\|medium\|high\|xhigh`. Copilot maps `'max' → 'xhigh'` inside `resolveModel`/`buildArgs`; Claude leaves `'max'` alone. |
67
+ | `costTracking` | ✓ | ✗ | USD + token counts in the result event. Copilot only emits `premiumRequests`. |
68
+ | `modelShorthands` | ✓ | ✗ | Bare `sonnet`/`opus`/`haiku` accepted. Copilot expects full model IDs (`claude-sonnet-4.5`, `gpt-5.4`). |
69
+ | `modelDiscovery` | ✗ | ✓ | `listModels()` returns a real catalog (Copilot reads `https://api.githubcopilot.com/models`). |
70
+ | `promptViaArg` | ✗ | ✗ | If true, prompt goes via `--prompt <text>` instead of stdin. False on both because the Windows command-line length limit (~32 KB) breaks `-p "<long-prompt>"` outright. |
71
+ | `budgetCap` | ✓ | ✗ | `--max-budget-usd <n>`. |
72
+ | `bareMode` | ✓ | ✗ | `--bare` (suppresses CLAUDE.md auto-discovery). Closest Copilot equivalent is `--no-custom-instructions`, gated separately. |
73
+ | `fallbackModel` | ✓ | ✗ | `--fallback-model <id>` on rate-limit. |
74
+ | `sessionPersistenceControl` | ✓ | ✗ | Engine writes `session.json`. Copilot manages session state internally in `~/.copilot/session-state/`. |
75
+ | `resumePromptCarryover` | ✗ | ✓ | CC resume turns prepend recent visible Q&A in stdin because Copilot's session store is opaque to Minions. |
76
+ | `resumeBookkeepingTurn` | ✓ | — | Claude CLI injects a synthetic "Continue from where you left off." meta turn on `--resume`; CC prompts must tell the model not to treat it as user intent. |
77
+ | `streamConsumer` | ✓ | ✓ | Adapter implements `createStreamConsumer(ctx)` — required by `engine/llm.js` accumulator. |
78
+
79
+ When a behavior is genuinely uniform across all current adapters, it still gets
80
+ a flag if a future runtime might disagree — flags are cheap.
81
+
82
+ ## CC vs Agent Runtime Independence
83
+
84
+ Command Center (CC) / doc-chat and the agent fleet are two **independent**
85
+ runtime paths. Setting `engine.ccCli: copilot` for CC alone must NOT switch the
86
+ agent fleet to Copilot, and vice versa. Both paths fall through to
87
+ `engine.defaultCli` (the fleet-wide default), but they never see each other's
88
+ overrides.
89
+
90
+ Six helpers in `engine/shared.js` are the single source of truth for runtime/model
91
+ resolution. Engine code MUST go through them instead of reading config keys
92
+ directly.
93
+
94
+ | Helper | Chain |
95
+ |--------|-------|
96
+ | `resolveAgentCli(agent, engine)` | `agent.cli` → `engine.defaultCli` → `'claude'` |
97
+ | `resolveCcCli(engine)` | `engine.ccCli` → `engine.defaultCli` → `'claude'` |
98
+ | `resolveAgentModel(agent, engine)` | `agent.model` → `engine.defaultModel` → undefined |
99
+ | `resolveCcModel(engine)` | `engine.ccModel` → `engine.defaultModel` → undefined |
100
+ | `resolveAgentMaxBudget(agent, engine)` | `agent.maxBudgetUsd` → `engine.maxBudgetUsd`. Honors literal `0`. |
101
+ | `resolveAgentBareMode(agent, engine)` | `agent.bareMode` → `engine.claudeBareMode` → false. Strict null check so per-agent `false` overrides engine `true`. |
102
+
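The fallback chains in the table can be read as plain functions. The sketch below re-implements four of them from the table alone; the bodies are assumptions, and the real helpers in `engine/shared.js` may differ in detail.

```javascript
// Hypothetical re-implementations of the documented chains (not the engine/shared.js source).
function resolveAgentCli(agent = {}, engine = {}) {
  // agent.cli -> engine.defaultCli -> 'claude'; engine.ccCli is never consulted.
  return agent.cli || engine.defaultCli || 'claude';
}

function resolveCcCli(engine = {}) {
  // engine.ccCli -> engine.defaultCli -> 'claude'; no agent settings are read.
  return engine.ccCli || engine.defaultCli || 'claude';
}

function resolveAgentMaxBudget(agent = {}, engine = {}) {
  // Nullish coalescing (not ||) so a literal 0 is honored.
  return agent.maxBudgetUsd ?? engine.maxBudgetUsd;
}

function resolveAgentBareMode(agent = {}, engine = {}) {
  // Strict null check: an explicit per-agent false wins over engine true.
  if (agent.bareMode != null) return Boolean(agent.bareMode);
  if (engine.claudeBareMode != null) return Boolean(engine.claudeBareMode);
  return false;
}
```

With these definitions, `resolveAgentCli({}, { ccCli: 'copilot' })` yields `'claude'`, which is the independence property the section describes.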
103
+ Agent dispatch resolves the runtime once at spawn time:
104
+
105
+ ```js
106
+ // engine.js spawnAgent (~line 1158)
107
+ const runtime = resolveRuntime(shared.resolveAgentCli(agentConfig, engineConfig));
108
+ ```
109
+
110
+ CC / doc-chat resolves the runtime via `engine/llm.js` `_resolveRuntimeFor()`:
111
+
112
+ ```js
113
+ // engine/llm.js (~line 247)
114
+ if (!runtimeName && callOpts.engineConfig) runtimeName = resolveCcCli(callOpts.engineConfig);
115
+ ```
116
+
117
+ Tests in `test/unit/runtime-fleet-helpers.test.js` enforce that
118
+ `resolveAgentCli` does NOT fall through to `engine.ccCli` and that `resolveCcCli`
119
+ does NOT inspect any agent settings.
120
+
121
+ ## `--cli` / `--model` / `--effort` Knobs
122
+
123
+ Three CLI flags switch the fleet without dashboard interaction. They forward
124
+ through `engine/cli.js` `parseRuntimeFleetFlags()` and persist into `config.json`.
125
+
126
+ | Flag | Persists to | Notes |
127
+ |------|-------------|-------|
128
+ | `--cli <runtime>` | `engine.defaultCli` | Validated against the runtime registry; unknown runtimes exit non-zero with the registered list. Implicitly clears any explicit `engine.ccCli` (with a warning naming the prior value). |
129
+ | `--model <id>` | `engine.defaultModel` | Empty-string sentinel (`--model ''`) **deletes** `engine.defaultModel` — never pin to a literal empty string. Compatibility with `defaultCli` is checked; mismatch emits an incompatibility warning suggesting `--model ''`. |
130
+ | `--effort <level>` | dispatched as `opts.effort` | One of `low\|medium\|high\|xhigh`. Copilot maps `'max' → 'xhigh'`; Claude passes `'max'` through unchanged. Capability-gated by `effortLevels`. |
131
+
132
+ Examples:
133
+
134
+ ```bash
135
+ minions start --cli copilot --model claude-sonnet-4.5
136
+ minions restart --cli claude --model '' # '' DELETES defaultModel
137
+ minions config set-cli copilot --model gpt-5.4
138
+ ```
139
+
140
+ Inside the adapters, `buildArgs(opts)` translates the resolved values into the
141
+ CLI-specific argv:
142
+
143
+ ```js
144
+ // engine/runtimes/copilot.js buildArgs
145
+ if (model) args.push('--model', String(model));
146
+ if (mappedEffort) args.push('--effort', mappedEffort);
147
+ ```
148
+
149
+ When `resolveAgentModel` returns undefined the adapter omits `--model` from
150
+ `buildArgs` entirely and the underlying CLI uses its own default.
151
+
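Putting the two behaviors together (the `'max' → 'xhigh'` mapping and the omitted `--model`), a Copilot-style `buildArgs` might look roughly like this. The function name `buildCopilotArgs` and the place where the mapping happens are assumptions; only the two `args.push` lines come from the excerpt above.

```javascript
// Sketch of a Copilot-style argv builder; the real one is in engine/runtimes/copilot.js.
function buildCopilotArgs({ model, effort } = {}) {
  const args = [];
  // Assumed location of the mapping: Copilot has no 'max' level, so it becomes 'xhigh'.
  const mappedEffort = effort === 'max' ? 'xhigh' : effort;
  if (model) args.push('--model', String(model));
  if (mappedEffort) args.push('--effort', mappedEffort);
  return args;
}
```

Called with no model, the result contains no `--model` entry at all, so the CLI is free to apply its own default.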
152
+ ## Per-Runtime Skill Roots
153
+
154
+ Skills live in runtime-native directories so a skill placed there is genuinely
155
+ visible to that runtime when invoked outside Minions. The adapter's
156
+ `getSkillRoots()` and `getSkillWriteTargets()` are the discovery and write
157
+ contracts.
158
+
159
+ | Adapter | Personal skill root | Project skill roots | Auto-extraction write targets |
160
+ |---------|---------------------|---------------------|-------------------------------|
161
+ | Claude | `~/.claude/skills` | `<repo>/.claude/skills` | personal: `~/.claude/skills`; project: `<repo>/.claude/skills` |
162
+ | Copilot | `~/.copilot/skills` (+ `~/.agents/skills`) | `<repo>/.github/skills`, `<repo>/.agents/skills` | personal: `~/.copilot/skills`; project: `<repo>/.github/skills` |
163
+
164
+ `getUserAssetDirs()` returns the union of runtime-native global roots and is
165
+ passed to spawn as `--add-dir` so spawned agents read the same global assets the
166
+ user sees outside Minions. The cross-runtime portable location
167
+ `~/.agents/skills` is included by every adapter that opts into it (Copilot today;
168
+ add it to new adapters when the runtime can read directly from there).
169
+
170
+ ## The "No Runtime-Name Branching" Rule
171
+
172
+ The whole point of this layer: **engine code MUST gate behavior on
173
+ `runtime.capabilities.*` flags, not on `runtime.name === 'claude'` comparisons.**
174
+
175
+ If you find yourself wanting to write `if (runtime.name === 'copilot')` in
176
+ engine code, the correct response is one of:
177
+
178
+ 1. Add a capability flag to the contract and set it on every adapter.
179
+ 2. Add a method to the adapter and call it through the runtime object.
180
+ 3. Move the special case into the adapter's `buildArgs` / `buildPrompt` /
181
+ `parseOutput` where it belongs.
182
+
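As an illustration of options 1 and 2, the sketch below gates behavior on capability flags and an adapter method instead of comparing names. The two adapter objects, the `planSpawn` helper, and the exact layout of the `<system>` block are hypothetical; only the flag names and CLI flags are taken from this document.

```javascript
// Hypothetical adapters carrying only the members this sketch needs.
const claude = {
  name: 'claude',
  capabilities: { systemPromptFile: true, budgetCap: true },
  buildPrompt: (promptText) => promptText,
};
const copilot = {
  name: 'copilot',
  capabilities: { systemPromptFile: false, budgetCap: false },
  // The exact <system> block layout is an assumption.
  buildPrompt: (promptText, sysPromptText) =>
    `<system>\n${sysPromptText}\n</system>\n\n${promptText}`,
};

// Engine-side helper (hypothetical): no runtime.name comparison anywhere.
function planSpawn(runtime, { prompt, sysPrompt, sysPromptPath, maxBudgetUsd }) {
  const args = [];
  if (runtime.capabilities.systemPromptFile) {
    args.push('--system-prompt-file', sysPromptPath);
  }
  if (runtime.capabilities.budgetCap && maxBudgetUsd != null) {
    args.push('--max-budget-usd', String(maxBudgetUsd));
  }
  return { args, stdin: runtime.buildPrompt(prompt, sysPrompt) };
}
```

A third adapter only has to set its flags and implement `buildPrompt`; `planSpawn` itself stays unchanged.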
183
+ Tests in `test/unit/runtime-adapters.test.js` and `test/unit/copilot-adapter.test.js`
184
+ enforce the contract; the source-inspection tests in `test/unit.test.js` look
185
+ for `resolveRuntime(...)` call sites and reject runtime-name comparisons in new
186
+ code.
187
+
188
+ ## Adding a New Runtime
189
+
190
+ 1. Create `engine/runtimes/<name>.js` implementing the full adapter contract.
191
+ Copy `claude.js` as the template and override only the members that differ.
192
+ Set a useful `installHint`.
193
+ 2. Register it in `engine/runtimes/index.js`:
194
+ ```js
195
+ registry.set('<name>', require('./<name>'));
196
+ ```
197
+ 3. Add unit coverage modeled on `test/unit/runtime-adapters.test.js`.
198
+
199
+ Once registered, the dashboard `/api/runtimes` endpoint, the `--cli` flag
200
+ validator, the agent CLI dropdown, the `engine/preflight.js` per-runtime binary
201
+ check, and the model discovery cache all light up automatically. Use capability
202
+ flags — never special-case in engine code.
203
+
204
+ ## See Also
205
+
206
+ - `CLAUDE.md` → "Runtime Adapters" section — short reference table that mirrors
207
+ this doc and is enforced by `test/unit.test.js`.
208
+ - `docs/copilot-cli-schema.md` — empirical Copilot CLI behavior captured during
209
+ the spike that produced the Copilot adapter; cite it when changing
210
+ Copilot-specific capability flags.
211
+ - `engine/runtimes/claude.js` JSDoc header — authoritative adapter contract.
212
+ - `engine/shared.js` `resolveAgentCli` / `resolveCcCli` etc. — the only
213
+ supported way to read runtime / model selection from config.
@@ -0,0 +1,116 @@
1
+ # Team Memory
2
+
3
+ Minions agents share knowledge through a layered memory system. Before an agent does any independent research — web searches, broad codebase exploration, external doc reading — it is expected to check what the team already knows.
4
+
5
+ This document describes the four-tier read order the engine enforces in dispatch prompts, the rationale, and the constraint that `knowledge/` is **sweep-write-only**.
6
+
7
+ ## Why team memory exists
8
+
9
+ Multiple agents work the same repository in parallel and across days. Without shared memory:
10
+
11
+ - Agents re-research the same questions (cost, time, context churn).
12
+ - Agents re-litigate decisions the team already made (drift, regressions).
13
+ - Human-flagged warnings (e.g., "do not self-approve PRs") get ignored because they live outside the prompt.
14
+
15
+ Team memory closes the loop. The engine injects pinned context and recent team notes into every dispatch prompt automatically, and the playbooks instruct agents to consult `knowledge/` and `notes/inbox/` on disk before going wider. This keeps work cheap, decisions sticky, and prior learnings reusable.
16
+
17
+ ## The four-tier read order
18
+
19
+ Every agent prompt (built by `engine/playbook.js`) is rendered with the rule: **check team memory first, then look outside.** The canonical read order (from `playbooks/shared-rules.md`) is:
20
+
21
+ 1. **`pinned.md`** — critical context flagged by the human teammate. **READ FIRST.**
22
+ 2. **`knowledge/`** — categorized KB entries written by the consolidation sweep.
23
+ 3. **`notes.md`** — consolidated team knowledge, decisions, and rolling context.
24
+ 4. **`notes/inbox/`** — recent agent findings not yet consolidated.
25
+
26
+ Agents are also encouraged, after the four primary tiers, to skim `agents/*/live-output.log` for related tasks and the `resultSummary` of prior completed work items on the same topic. Only after exhausting team memory should an agent fall back to web search, broad codebase exploration, or external docs.
27
+
28
+ ### 1. `pinned.md` — human-flagged critical context
29
+
30
+ `pinned.md` lives at the Minions root (`MINIONS_DIR/pinned.md`). It holds notes the human teammate has explicitly pinned through the dashboard — short, high-signal rules and warnings that **must** be visible to every agent on every dispatch.
31
+
32
+ The engine injects the file verbatim into the prompt under a "Pinned Context (CRITICAL — READ FIRST)" heading and caps it at 4 KB to protect the prompt budget.
33
+
34
+ Typical contents include credential constraints (e.g., shared `gh` auth means PR self-approval is blocked), workflow policies (e.g., merging policy, TDD requirement), and active setup state. If pinned context contradicts a playbook instruction, pinned context wins for the duration of the dispatch.
35
+
36
+ ### 2. `knowledge/` — categorized KB entries (sweep-write-only)
37
+
38
+ `knowledge/` is the long-form team knowledge base. Files are grouped by category:
39
+
40
+ - `knowledge/architecture/` — design decisions, system overviews
41
+ - `knowledge/conventions/` — coding patterns, repo norms, standards
42
+ - `knowledge/project-notes/` — per-project context and observations
43
+ - `knowledge/build-reports/` — CI/build investigation results
44
+ - `knowledge/reviews/` — PR review findings and rationale
45
+ - `knowledge/agents/<agentId>.md` — per-agent personal memory (curated by the sweep)
46
+
47
+ Filenames follow the `YYYY-MM-DD-<agent>-<slug>.md` convention so chronological sorting just works.
48
+
49
+ Agents read `knowledge/` directly with grep/glob/view. They **do not** write to it. See [Sweep-write-only constraint](#sweep-write-only-constraint) below.
50
+
51
+ ### 3. `notes.md` — consolidated team notes
52
+
53
+ `notes.md` is the rolling team notebook. The consolidation sweep periodically merges high-signal findings from the inbox into dated `### YYYY-MM-DD` sections at the top of `notes.md`, with links back to the underlying KB entries.
54
+
55
+ The engine injects `notes.md` into every prompt under a "Team Notes (MUST READ)" heading. When the file exceeds `ENGINE_DEFAULTS.maxNotesPromptBytes`, the engine keeps the most recent 10 sections, truncates the body, and appends a footer noting how many older sections live on disk so agents can read the full file when they need depth.
56
+
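A rough sketch of the section-keeping step, under stated assumptions: the function name `trimNotesForPrompt`, the footer wording, and the way sections are split are all invented for illustration, and the additional body truncation mentioned above is left out.

```javascript
// Sketch only: the real truncation lives in the engine and may differ.
const DATED_HEADING = /^### \d{4}-\d{2}-\d{2}/;

function trimNotesForPrompt(notes, maxBytes, keepSections = 10) {
  if (Buffer.byteLength(notes, 'utf8') <= maxBytes) return notes;

  // Split in front of every dated heading; the sweep writes newest sections first.
  const parts = notes.split(/^(?=### \d{4}-\d{2}-\d{2})/m);
  // Anything before the first dated heading is kept as a preamble.
  const preamble = DATED_HEADING.test(parts[0]) ? '' : parts.shift();

  const kept = parts.slice(0, keepSections);
  const dropped = parts.length - kept.length;
  const footer = dropped > 0
    ? `\n_${dropped} older section(s) omitted; read notes.md on disk for the full history._\n`
    : '';
  return preamble + kept.join('') + footer;
}

// Usage: twelve dated sections and a byte budget small enough to force trimming.
const sample = '# Team Notes\n' + Array.from({ length: 12 }, (_, i) =>
  `### 2026-01-${String(12 - i).padStart(2, '0')}\nnote ${i}\n`).join('');
const trimmed = trimNotesForPrompt(sample, 50);
```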
57
+ ### 4. `notes/inbox/` — fresh agent findings
58
+
59
+ `notes/inbox/` holds the freshest, un-consolidated learnings. Every agent writes exactly one inbox file at the end of a successful task to a path the playbook supplies (`notes/inbox/<agent>-<work-item-id>-<date>-<time>.md`). The file always begins with a YAML frontmatter block (`id`, `agent`, `date`) so the consolidation sweep can track it.
60
+
61
+ Inbox notes are the canonical place to record:
62
+
63
+ - New patterns or conventions discovered while implementing.
64
+ - Gotchas, warnings, or repro recipes for future agents.
65
+ - File-and-line citations supporting any claims.
66
+
67
+ The consolidation sweep is what eventually promotes inbox notes into `knowledge/` and `notes.md`. Until then, agents should grep the inbox themselves for very recent findings on adjacent work.
68
+
69
+ #### Failure-path exception
70
+
71
+ Inbox writes are **success artifacts only**. If a task fails, is blocked, is cancelled, or ends partial, the agent must **not** create an inbox note — the failure is reported through the completion JSON, the PR/work-item comment, or the task-specific failure channel instead. This keeps the inbox a clean signal and prevents noise from drowning out durable findings.
72
+
73
+ ## Rationale
74
+
75
+ The tiers are deliberately ranked by **trust × recency**:
76
+
77
+ - `pinned.md` is human-curated and authoritative — it overrides everything else.
78
+ - `knowledge/` is curated by the sweep from validated agent findings — high-signal but slightly delayed.
79
+ - `notes.md` mirrors `knowledge/` in summarized form, optimized for prompt injection.
80
+ - `notes/inbox/` is raw, recent, and unverified: useful for the freshest context, but its claims should be treated critically.
81
+
82
+ Following this order achieves two goals:
83
+
84
+ 1. **Avoid re-research.** If another agent already investigated the same module, fix, or design question, the answer is somewhere in this stack. Reading saves a costly round-trip.
85
+ 2. **Respect team decisions.** Pinned policies and consolidated conventions encode prior agreement (often with the human). Skipping team memory and acting on first principles is how regressions and drift happen.
86
+
87
+ ## Sweep-write-only constraint
88
+
89
+ **Agents must never delete, move, or overwrite files in `knowledge/`.** Only the consolidation sweep (`engine/consolidation.js`) writes to `knowledge/`. This rule is restated in every dispatch prompt under a "Knowledge Base Rules" heading.
90
+
91
+ Why:
92
+
93
+ - The sweep classifies inbox notes into the right category, generates the canonical filename, and updates the `notes.md` summary in lockstep with the `knowledge/` write. Manual edits skip those side effects and desync the system.
94
+ - Agents work in parallel. Letting any agent rewrite `knowledge/` entries makes consolidation non-deterministic and makes blame attribution impossible.
95
+ - Knowledge base entries are referenced by date-stamped path. Rewriting an entry breaks every existing link.
96
+
97
+ If an agent thinks a `knowledge/` file is wrong, the correct response is to **note the disagreement in the agent's own inbox file**. The sweep will pick up the note on the next consolidation cycle and either supersede or correct the existing entry.
98
+
99
+ The same constraint applies to `knowledge/agents/<agentId>.md` — those are curated by the sweep and should not be hand-edited.
100
+
101
+ ## Quick reference for agents
102
+
103
+ ```
104
+ 1. pinned.md ← critical, human-flagged. Read first.
105
+ 2. knowledge/ ← categorized KB. Read, never write.
106
+ 3. notes.md ← consolidated team notes. Read.
107
+ 4. notes/inbox/ ← fresh, unconsolidated findings. Read; write one on success.
108
+ ```
109
+
110
+ After the four tiers, optionally consult `agents/*/live-output.log` and prior `resultSummary` fields. Only then go outside.
111
+
112
+ ## See also
113
+
114
+ - `playbooks/shared-rules.md` — the canonical instruction injected into every dispatch.
115
+ - `engine/playbook.js` — where `pinned.md`, `notes.md`, and per-agent memory get spliced into prompts.
116
+ - `engine/consolidation.js` — the sweep that promotes inbox notes into `knowledge/` and `notes.md`.
@@ -1,5 +1,7 @@
1
1
  # Teams Production Endpoint Migration
2
2
 
3
+ > Last verified: 2026-05-12. `botbuilder` dependency confirmed in `package.json` (4.23.3); `/api/bot` route confirmed in `dashboard.js` (`handleTeamsBot`).
4
+
3
5
  Guide for migrating the Minions Teams bot from a Dev Tunnel to a stable public HTTPS endpoint for production use. Choose one of the three deployment options below based on your infrastructure.
4
6
 
5
7
  **Key fact:** The Azure Bot messaging endpoint URL can be changed at any time in the Azure Portal — it takes effect immediately. No bot reinstallation is needed in Teams. This means you can switch between Dev Tunnel and production endpoints freely.
@@ -120,7 +122,7 @@ Containerize the dashboard and deploy to Azure Container Apps with a stable FQDN
120
122
  CMD ["node", "dashboard.js"]
121
123
  ```
122
124
 
123
- > Minions has zero npm dependencies beyond Node.js built-ins, so `npm ci` may be a no-op. The botbuilder package (added by this feature branch) is the exception.
125
+ > Minions ships with `botbuilder` as its only runtime dependency (declared in `package.json`); other operations rely on Node.js built-ins, so `npm ci` is a fast install.
124
126
 
125
127
  2. **Build and push to Azure Container Registry:**
126
128
 
@@ -0,0 +1,201 @@
1
+ # Watches
2
+
3
+ Persistent monitoring jobs that fire inbox notifications and follow-up actions when a target hits a condition. Watches survive engine restarts and are checked every 3 ticks (~3 minutes by default).
4
+
5
+ ## What a Watch Is
6
+
7
+ A watch is a small JSON record persisted to `engine/watches.json`. It binds:
8
+
9
+ | Field | Purpose |
10
+ |--------------|-------------------------------------------------------------------------|
11
+ | `target` | The thing to watch (PR number, work item id, plan filename, ...) |
12
+ | `targetType` | Which registered target type (`pr`, `work-item`, `meeting`, ...) |
13
+ | `condition` | One of the conditions the target type accepts (e.g. `merged`, `failed`) |
14
+ | `interval` | Min ms between checks (default 5min, floor 60s) |
15
+ | `notify` | `'inbox'` writes an inbox alert on trigger (set `'none'` to suppress) |
16
+ | `owner` | Inbox owner that receives notifications (`'human'` by default) |
17
+ | `stopAfter` | `0` = run forever for change conditions / fire-once for absolute; `N` = expire after N triggers |
18
+ | `onNotMet` | `null` or `'notify'` — write a per-poll inbox note when the condition is not yet met |
19
+ | `action` | Optional follow-up action (see "Follow-Up Actions" below) |
20
+ | `status` | `WATCH_STATUS.ACTIVE` \| `PAUSED` \| `TRIGGERED` \| `EXPIRED` |
21
+
22
+ `createWatch()` allocates a `watch-<uid>` id, defaults the fields above, and persists atomically via `mutateJsonFileLocked` *(source: `engine/watches.js:103-145`)*.
23
+
24
+ ## Lifecycle (`WATCH_STATUS`)
25
+
26
+ Defined in `engine/shared.js:1557`:
27
+
28
+ | Status | Meaning |
29
+ |-------------|-------------------------------------------------------------------------|
30
+ | `active` | Eligible for evaluation each tick |
31
+ | `paused` | Skipped by `checkWatches`; persists indefinitely until resumed/deleted |
32
+ | `triggered` | Reserved status (set on demand by callers; not auto-applied) |
33
+ | `expired` | Auto-set when `stopAfter` is reached, or on first trigger for absolute conditions when `stopAfter === 0`. The watch is left on disk for audit but no longer evaluated *(source: `engine/watches.js:305-310`)* |
34
+
35
+ Pause/resume flips the `status` field via `POST /api/watches/update` *(source: `engine/watches.js:153-178`, `dashboard.js:6400-6412`)*.
36
+
37
+ ## Conditions (`WATCH_CONDITION`)
38
+
39
+ Defined in `engine/shared.js:1573-1592`. Conditions split into two families:
40
+
41
+ ### Absolute conditions (`WATCH_ABSOLUTE_CONDITIONS`)
42
+ *(source: `engine/shared.js:1595-1599`)*
43
+
44
+ `merged`, `build-fail`, `build-pass`, `completed`, `failed`, `concluded`, `approved`, `rejected`.
45
+
46
+ When `stopAfter === 0`, these are **fire-once** — the engine flips the watch to `expired` after the first trigger so a permanently-merged PR doesn't keep notifying *(source: `engine/watches.js:305-310`)*.
47
+
48
+ ### Change-based conditions
49
+ `status-change`, `any`, `new-comments`, `vote-change`, `stage-complete`, `ran`, `enabled`, `disabled`, `activity-change`.
50
+
51
+ These compare the live entity against the watch's `_lastState` snapshot and run forever when `stopAfter === 0`. Baseline `_lastState` is captured on the first check so the very next change triggers the watch *(source: `engine/watches.js:262-266`)*.
52
+
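The difference between the two families can be sketched in a few lines. This is an illustration built from the description above, not code from `engine/watches.js`: the helper names `detectChange` and `applyExpiration` are hypothetical, and comparing snapshots with `JSON.stringify` is an assumption.

```javascript
const WATCH_ABSOLUTE_CONDITIONS = new Set([
  'merged', 'build-fail', 'build-pass', 'completed',
  'failed', 'concluded', 'approved', 'rejected',
]);

// Change-based: the first check only records a baseline, later checks diff against it.
function detectChange(watch, currentState) {
  if (watch._lastState === undefined) {
    watch._lastState = currentState;
    return false;
  }
  // Snapshot comparison via JSON is an assumption of this sketch.
  const changed = JSON.stringify(watch._lastState) !== JSON.stringify(currentState);
  watch._lastState = currentState;
  return changed;
}

// Called after a trigger has been counted on the watch.
function applyExpiration(watch) {
  const fireOnce = watch.stopAfter === 0 && WATCH_ABSOLUTE_CONDITIONS.has(watch.condition);
  const limitReached = watch.stopAfter > 0 && watch.triggerCount >= watch.stopAfter;
  if (fireOnce || limitReached) watch.status = 'expired';
  return watch;
}
```

With `stopAfter: 0`, a `merged` watch expires after its first trigger, while a `status-change` watch stays active.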
53
+ ## Target Types — `TARGET_TYPES` Registry
54
+
55
+ Target-type behavior in `engine/watches.js` is **data-driven via a registry** *(source: `engine/watches.js:50-72`)*. Each spec must provide:
56
+
57
+ - `label` — human name shown in dashboard pickers
58
+ - `description` — short help text
59
+ - `conditions` — non-empty array of accepted condition keys (also acts as the per-type allowlist)
60
+ - `fetchEntity(target, state)` — entity-or-null lookup
61
+ - `captureState(entity)` — snapshot used for change-detection diffs
62
+ - `evaluate(condition, entity, prevState, target)` — returns `{ triggered, message }`
63
+
64
+ The registry IS the allowlist for `createWatch` and `/api/watches/target-types`; the old hardcoded "pr or work-item" check is gone. Add a new target type at runtime with `registerTargetType(type, spec)` and look one up with `getTargetType(type)`. `listTargetTypes()` returns the serializable form used by the dashboard *(source: `engine/watches.js:62-88`)*.
65
+
66
+ ### Built-in target types
67
+
68
+ The eight built-ins are registered at module load *(source: `engine/watches.js:399-748`)*. Constants live at `engine/shared.js:1563-1572` (`WATCH_TARGET_TYPE`).
69
+
70
+ | `targetType` | Target value | Conditions | Notes |
71
+ |---------------|--------------------------------------|----------------------------------------------------------------------------|-------|
72
+ | `pr` | PR number, canonical id, or display id | `merged`, `build-fail`, `build-pass`, `status-change`, `any`, `new-comments`, `vote-change` | Reads from `pull-requests.json` for any project; `new-comments` watches `humanFeedback.lastProcessedCommentDate` |
73
+ | `work-item` | Work item id | `completed`, `failed`, `status-change`, `any` | `completed` matches `DONE_STATUSES`; `failed` matches `WI_STATUS.FAILED` |
74
+ | `meeting` | Meeting id | `concluded`, `status-change`, `any` | `concluded` fires on terminal status (`completed`, `archived`) |
75
+ | `plan` | PRD JSON filename or plan id | `approved`, `rejected`, `completed`, `status-change`, `any` | Looked up by `_source` (PRD file), `_sourcePlan` (.md), or `id`; uses `PLAN_STATUS` |
76
+ | `schedule` | Schedule id | `ran`, `enabled`, `disabled`, `status-change`, `any` | `ran` fires when `lastRun` advances; `enabled`/`disabled` fire on the flip |
77
+ | `pipeline` | Pipeline id (latest run is tracked) | `completed`, `failed`, `stage-complete`, `status-change`, `any` | `failed` covers `PIPELINE_STATUS.FAILED` and `STOPPED`; `stage-complete` only counts within the same `runId` |
78
+ | `dispatch` | Dispatch entry id | `completed`, `failed`, `status-change`, `any` | Looks across `pending` / `active` / `completed` lists |
79
+ | `agent` | Agent id | `activity-change`, `status-change`, `any` | `activity-change` fires only on transitions in/out of `'working'` |
80
+
81
+ `evaluateWatch` dispatches to `tt.evaluate(...)`; unknown target types return `"Unknown target type: ..."` and unknown conditions return `"Unknown condition: ..."` — both are non-triggering *(source: `engine/watches.js:208-224`)*.
82
+
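To make the registry contract and the dispatch concrete, here is a self-contained sketch. It uses a local `Map` rather than the real `TARGET_TYPES`, the validation messages are invented, and the `'done'` status and the "Target not found" message are assumptions; only the spec member names and the two "Unknown ..." messages come from the text above.

```javascript
// Local stand-in for the registry; not the engine/watches.js implementation.
const TARGET_TYPES = new Map();

function registerTargetType(type, spec) {
  for (const fn of ['fetchEntity', 'captureState', 'evaluate']) {
    if (typeof spec[fn] !== 'function') throw new Error(`${type}: missing ${fn}()`);
  }
  if (!Array.isArray(spec.conditions) || spec.conditions.length === 0) {
    throw new Error(`${type}: conditions must be a non-empty array`);
  }
  TARGET_TYPES.set(type, spec);
}

// Simplified work-item type; 'done' as the completed status is an assumption.
registerTargetType('work-item', {
  label: 'Work item',
  description: 'Watch a work item by id',
  conditions: ['completed', 'status-change'],
  fetchEntity: (target, state) => (state.workItems || []).find((wi) => wi.id === target) || null,
  captureState: (entity) => ({ status: entity.status }),
  evaluate(condition, entity, prevState, target) {
    if (condition === 'completed') {
      return { triggered: entity.status === 'done', message: `${target} is ${entity.status}` };
    }
    const changed = Boolean(prevState) && prevState.status !== entity.status;
    return { triggered: changed, message: `${target} status is ${entity.status}` };
  },
});

function evaluateWatch(watch, state) {
  const tt = TARGET_TYPES.get(watch.targetType);
  if (!tt) return { triggered: false, message: `Unknown target type: ${watch.targetType}` };
  if (!tt.conditions.includes(watch.condition)) {
    return { triggered: false, message: `Unknown condition: ${watch.condition}` };
  }
  const entity = tt.fetchEntity(watch.target, state);
  if (!entity) return { triggered: false, message: 'Target not found' };
  return tt.evaluate(watch.condition, entity, watch._lastState, watch.target);
}
```

Because the `conditions` array doubles as the allowlist, an unsupported condition is rejected before `evaluate` is ever called.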
83
+ ## Tick Integration
84
+
85
+ `engine.js` calls `checkWatches(config, state)` every 3 ticks (~3 min at the default 60s tick) inside its own `safe('checkWatches', ...)` block *(source: `engine.js:4480-4538`)*. The engine builds the state object from cached project files + module reads:
86
+
87
+ ```
88
+ {
89
+ pullRequests, workItems, // flattened across all projects + central
90
+ meetings, plans, // optional — try/catch'd, missing modules give []
91
+ scheduleRuns, pipelineRuns,
92
+ dispatch, agents, config,
93
+ }
94
+ ```
95
+
96
+ `checkWatches` walks the persisted watches and, inside a single `mutateJsonFileLocked` callback *(source: `engine/watches.js:241-345`)*:
97
+
98
+ 1. Skips paused/expired watches and any watch checked within its `interval`.
99
+ 2. Captures a baseline `_lastState` on first check (so change conditions have something to diff).
100
+ 3. Calls `evaluateWatch(watch, state)`.
101
+ 4. On trigger: increments `triggerCount`, sets `last_triggered`, queues an inbox notification (if `notify === 'inbox'`), and snapshots an action task for any configured `watch.action`.
102
+ 5. Applies fire-once / `stopAfter` expiration.
103
+ 6. On non-trigger: writes a per-poll inbox note when `onNotMet === 'notify'`.
104
+ 7. Refreshes `_lastState` for the next check.
105
+
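The numbered steps above can be sketched as a pure function over an in-memory array. This is an illustration under stated assumptions, not the body of `checkWatches`: `checkWatchesSketch`, the injected `evaluate` and `snapshot` callbacks, and the choice of milliseconds for `interval` are hypothetical, and expiration (step 5) is left out to keep the sketch short.

```javascript
// Walks watches in memory and returns the notifications to send afterwards.
// Assumptions: interval is in milliseconds; evaluate() returns { triggered, message };
// snapshot() returns the comparable state of the watched entity.
function checkWatchesSketch(watches, state, evaluate, snapshot, now = Date.now()) {
  const pending = []; // sent only after the (real) lock would be released
  for (const watch of watches) {
    // Step 1: skip paused/expired watches and those checked within their interval.
    if (watch.status !== 'active') continue;
    if (watch.last_checked && now - Date.parse(watch.last_checked) < watch.interval) continue;
    watch.last_checked = new Date(now).toISOString();

    // Step 2: capture a baseline on the first check so change conditions can diff.
    const current = snapshot(watch, state);
    if (watch._lastState === undefined) watch._lastState = current;

    // Step 3: evaluate against the current state.
    const result = evaluate(watch, state);

    if (result.triggered) {
      // Step 4: record the trigger and queue the notification.
      watch.triggerCount = (watch.triggerCount || 0) + 1;
      watch.last_triggered = watch.last_checked;
      if (watch.notify === 'inbox') pending.push({ watchId: watch.id, message: result.message });
    } else if (watch.onNotMet === 'notify') {
      // Step 6: optional per-poll note when the condition is not met.
      pending.push({ watchId: watch.id, message: result.message });
    }

    // Step 7: refresh the stored state for the next check.
    watch._lastState = current;
  }
  return pending;
}
```

Returning the queued notifications rather than writing them inline reflects the split described in the text: bookkeeping happens under the lock, I/O happens after it.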
106
+ I/O happens **outside the lock**: notifications via `writeToInbox`, follow-up actions via `_runActionTask` (`Promise` per action, failures isolated). Each action's result is persisted back onto the watch as `_lastActionResult` in a follow-up locked write *(source: `engine/watches.js:330-377`)*.
107
+
108
+ ## Follow-Up Actions on Trigger
109
+
110
+ `watch.action` is an optional structured action that runs after the inbox notification fires. Action types live in a sibling registry in `engine/watch-actions.js` and are validated at create/update time *(source: `engine/watches.js:112-115`, `engine/watch-actions.js:50-73`)*. `GET /api/watches/action-types` returns the live list for dashboard pickers.
111
+
112
+ ### Built-in actions
113
+
114
+ | Action type | What it does |
115
+ |------------------------|-----------------------------------------------------------------------------------------------|
116
+ | `notify` | Explicit inbox write; lets you customize `owner`/`body` instead of the default trigger string |
117
+ | `dispatch-work-item` | Append a new WI to the project (or central) `work-items.json` with `createdBy: "watch:<id>"` |
118
+ | `run-skill` | Wrapper around `dispatch-work-item` that asks the agent to run a specific `.claude` skill |
119
+ | `webhook` | `http`/`https` request to an arbitrary URL (10s safety timeout, JSON or string body) |
120
+ | `cancel-work-item` | Flip a WI to `WI_STATUS.CANCELLED` across all known work-items files |
121
+ | `trigger-pipeline` | Start a new pipeline run (skipped if the pipeline already has an active run) |
122
+ | `archive-plan` | Set PRD `status="archived"` + `archivedAt` |
123
+ | `resume-plan` | Set PRD `status=PLAN_STATUS.ACTIVE` and clear `planStale` |
124
+
125
+ Constants live in `WATCH_ACTION_TYPE` (`engine/shared.js:1605-1616`); handlers in `engine/watch-actions.js:185-464`.
126
+
127
+ ### Templating
128
+
129
+ Action params support `{{var}}` substitution from the trigger context *(source: `engine/watch-actions.js:83-99`)*. Built-in vars from `buildTriggerContext` *(source: `engine/watch-actions.js:107-154`)*:
130
+
131
+ - Always: `target`, `targetType`, `condition`, `watchId`, `triggerCount`, `message`
132
+ - All scalar fields from `newState` (e.g. `status`, `buildStatus`, `reviewStatus`, `lastRun`)
133
+ - Type-specific aliases: `prNumber`, `branch`, `prId`, `prUrl`, `project` (pr); `workItemId`, `workItemTitle`, `project` (work-item); `meetingId`, `meetingTitle` (meeting); `planFile`, `planMd` (plan); `pipelineId`, `runId` (pipeline); `dispatchId` (dispatch); `agentId` (agent)
134
+
135
+ Unknown vars are left as `{{var}}` so callers can detect them downstream.
136
+
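A minimal sketch of `{{var}}` substitution that leaves unknown variables untouched, as described above. The function name `renderTemplate` and the restriction to word characters inside the braces are assumptions; the actual helper in `engine/watch-actions.js` may accept a different syntax.

```javascript
// Replaces {{name}} with ctx[name]; unknown names are left as-is so callers
// can detect them downstream. Non-string values pass through unchanged.
function renderTemplate(value, ctx) {
  if (typeof value !== 'string') return value;
  return value.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    Object.prototype.hasOwnProperty.call(ctx, name) ? String(ctx[name]) : match
  );
}

// Example: one known var, one unknown var.
const rendered = renderTemplate('Verify PR {{prNumber}} on {{branch}} ({{nope}})', {
  prNumber: 1234,
  branch: 'main',
});
console.log(rendered); // prints "Verify PR 1234 on main ({{nope}})"
```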
137
+ ### Failure isolation
138
+
139
+ Action handlers must return `{ ok: false, summary }` rather than throw — `runWatchAction` wraps them in try/catch as a backstop *(source: `engine/watch-actions.js:160-178`)*. A bad action never blocks other watches in the same tick, and the failed result is persisted onto the watch as `_lastActionResult` for debugging.
140
+
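The backstop pattern can be sketched as a wrapper that converts both unknown types and thrown errors into a failed result. The `handlers` table, `runWatchActionSketch`, and the flat action shape (`{ type, ...params }`) are hypothetical; only the `{ ok, summary }` contract and the `"unknown action type"` message come from the text.

```javascript
// Hypothetical handler table; the real registry lives in engine/watch-actions.js.
const handlers = {
  notify: async (action) => ({ ok: true, summary: `noted: ${action.body}` }),
  broken: async () => {
    throw new Error('boom'); // a misbehaving handler that throws instead of returning
  },
};

// Always resolves to { type, ok, summary }; never rejects.
async function runWatchActionSketch(action) {
  const handler = handlers[action.type];
  if (!handler) {
    return { type: action.type, ok: false, summary: 'unknown action type' };
  }
  try {
    const result = await handler(action);
    return { type: action.type, ...result };
  } catch (err) {
    return { type: action.type, ok: false, summary: `handler threw: ${err.message}` };
  }
}
```

Because the wrapper never rejects, a caller can run one promise per action and persist each result without a failure in one action affecting the others.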
141
+ ## Dashboard Surface
142
+
143
+ | Route | Method | Handler | Purpose |
144
+ |--------------------------------|--------|-------------------------------|-------------------------------------------------------------------------|
145
+ | `/watches` | GET | `dashboard/pages/watches.html` | Watches page (sidebar link in `dashboard/layout.html:75`) |
146
+ | `/api/watches` | GET | `handleWatchesList` | List all watches |
147
+ | `/api/watches/target-types` | GET | `handleWatchesTargetTypes` | Live registry from `listTargetTypes()` — used by the create-watch modal |
148
+ | `/api/watches/action-types` | GET | `handleWatchesActionTypes` | Live registry from `listActionTypes()` |
149
+ | `/api/watches` | POST | `handleWatchesCreate` | Create a watch (body: `target, targetType, condition, interval?, owner?, description?, project?, notify?, stopAfter?, onNotMet?, action?`) |
150
+ | `/api/watches/update` | POST | `handleWatchesUpdate` | Pause/resume/modify (body: `id, status?, interval?, description?, notify?, stopAfter?, onNotMet?, condition?, action?`) |
151
+ | `/api/watches/delete` | POST | `handleWatchesDelete` | Delete (body: `id`) |
152
+
153
+ *(source: route table in `dashboard.js:7395-7400`; handlers in `dashboard.js:6370-6422`)*
154
+
155
+ The watches page is rendered by `dashboard/js/render-watches.js`; the **+ New** button calls `openCreateWatchModal()` which fetches `/api/watches/target-types` and `/api/watches/action-types` to populate the picker dynamically *(source: `dashboard/pages/watches.html:3`, `dashboard/js/render-watches.js:354-414`)*.
156
+
157
+ The CC state preamble injects a `Watches: <count>` line via `getWatches()` so the chat brain knows how many watches exist *(source: `dashboard.js:1342`)*.
158
+
159
+ ## Command Center Integration
160
+
161
+ CC creates, pauses, resumes, and deletes watches by calling the REST API directly via its `Bash` tool — the older `===ACTIONS===` `create-watch`/`delete-watch`/`pause-watch`/`resume-watch` actions have been retired *(source: `test/unit.test.js:16976-17008`, `test/unit/watches-module.test.js:673`)*. To use a watch from CC chat, ask it something like:
162
+
163
+ > "Watch PR 1234 for build-pass and dispatch a verify work-item when it fires."
164
+
165
+ CC discovers the endpoints from `GET /api/routes`, then sends `POST /api/watches` with the appropriate `targetType`, `condition`, and `action` payload.
166
+
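For orientation, a request like the one CC issues could look roughly as follows. This is a sketch under assumptions: the top-level field names come from the `POST /api/watches` row in the route table, but the `'pr'` target-type value, the flat `action` shape, the `title` parameter, the base URL, and a JSON response body are guesses made for illustration.

```javascript
// Builds a create-watch body for "watch a PR for build-pass, then dispatch a work item".
// The action shape ({ type, title }) is hypothetical.
function buildCreateWatchBody({ prId, owner }) {
  return {
    target: prId,
    targetType: 'pr',
    condition: 'build-pass',
    owner,
    notify: 'inbox',
    stopAfter: 0,
    action: { type: 'dispatch-work-item', title: 'Verify PR {{prNumber}}' },
  };
}

// Sends the body to the dashboard. baseUrl is an assumption (pass whatever
// host and port your dashboard listens on). Requires Node 18+ for global fetch.
async function createWatch(baseUrl, body) {
  const res = await fetch(`${baseUrl}/api/watches`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  return res.json();
}
```

Check `GET /api/watches/target-types` and `GET /api/watches/action-types` for the values your server actually accepts before relying on the literals above.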
167
+ ## Expected Outputs
168
+
169
+ When a watch fires:
170
+
171
+ - `engine/watches.json` entry gets `triggerCount++`, `last_triggered=<ISO>`, `_lastTriggerMessage=<message>`, and a refreshed `_lastState`.
172
+ - An inbox file lands at `notes/inbox/<owner>/watch-<watchId>-<n>.md` (when `notify === 'inbox'`).
173
+ - If `action` is set, a follow-up runs (work item, webhook POST, plan archive, ...) and writes `_lastActionResult` (`{ type, ok, summary, dispatchedItemId?, at }`) back onto the watch.
174
+ - `engine/log.json` gets `Watch triggered: <id> — <message>` and `Watch <id> action <type>: <summary>`.
175
+
176
+ Absolute conditions firing under `stopAfter === 0` flip `status` to `expired`; `stopAfter > 0` flips it to `expired` once `triggerCount >= stopAfter`.
177
+
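The expiration rule above can be written as a small function applied after a trigger has been counted. A minimal sketch: `applyExpiration` and the `isAbsoluteCondition` flag are hypothetical names, and treating a missing `stopAfter` as `0` is an assumption.

```javascript
// Call after triggerCount has been incremented for this trigger.
function applyExpiration(watch, isAbsoluteCondition) {
  const stopAfter = watch.stopAfter || 0; // assumption: missing means 0
  if (stopAfter === 0) {
    // Fire-once: only absolute conditions expire on their first trigger.
    if (isAbsoluteCondition) watch.status = 'expired';
  } else if (watch.triggerCount >= stopAfter) {
    // Counted expiry: applies once the trigger budget is used up.
    watch.status = 'expired';
  }
  return watch;
}
```

Under this reading, a change condition with `stopAfter === 0` stays active indefinitely, while any watch with `stopAfter > 0` stops after that many triggers.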
178
+ ## Common Failure Modes
179
+
180
+ | Symptom | Likely cause / debug |
181
+ |-------------------------------------------------------------------------|----------------------------------------------------------------------------------------|
182
+ | Watch never fires | Check `status === 'active'`; check `last_checked` advancing each cycle; confirm engine tick is running and `interval` isn't longer than your test window |
183
+ | `evaluateWatch` returns `"<label> <target> not found"` | `fetchEntity` got nothing back — wrong `target` (e.g. PR display id vs canonical id), the target type isn't loaded, or the underlying file (PR cache, plan PRD) doesn't exist |
184
+ | `"Unknown target type"` / `"Unknown condition"` | The registry doesn't recognise the value. Check `GET /api/watches/target-types` to see what's registered server-side; condition must be in that target type's `conditions[]` |
185
+ | Change condition fires immediately on first tick | Won't happen — baseline `_lastState` is captured on the first check before `evaluate` runs *(source: `engine/watches.js:262-266`)*. If you see this, suspect manual edits to `watches.json` |
186
+ | Absolute watch keeps firing instead of firing once | `stopAfter` is set to a non-zero value, so the watch stays active until `triggerCount >= stopAfter`; only `stopAfter === 0` triggers fire-once expiration |
187
+ | Action runs but inbox notification doesn't | `notify` field isn't `'inbox'`, or `owner` is empty. `notify` and `action` are independent — both can fire, or only one |
188
+ | `_lastActionResult.ok === false` with `"unknown action type"` | The `action.type` isn't registered. List with `listActionTypes()` / `GET /api/watches/action-types` |
189
+ | Webhook action returns `"only http/https allowed"` | URLs must use `http://` or `https://` schemes; other protocols are rejected by design *(source: `engine/watch-actions.js:297-299`)* |
190
+ | Trigger fires but follow-up `dispatch-work-item` is missing | Check the engine log for `Watch <id> action <type>: <summary>`. Common reasons: missing `title`, the project's `work-items.json` couldn't be written, or the WI landed in central `work-items.json` because no project was specified |
191
+ | Watch `_lastActionResult` shows `"timeout"` for webhook | Webhooks have a 10s safety timeout to keep the watches tick fast *(source: `engine/watch-actions.js:334-337`)* |
192
+ | `checkWatches` block crashes silently | Wrapped in `safe('checkWatches', ...)` so one failure doesn't abort the tick *(source: `engine.js:4484`)*. Inspect `engine/log.json` for `Watch check error (<id>)` lines. Regression #1088: the block must use `getProjects(config)`, never the long-removed `PROJECTS` constant |
193
+
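The two webhook rows in the table above (scheme rejection and the 10s timeout) can be reproduced with standard Node APIs. This is a sketch, not the handler in `engine/watch-actions.js`: `checkWebhookUrl` and `postWebhook` are hypothetical names, and the `'invalid URL'` and `HTTP <status>` summaries are assumptions; only the `"only http/https allowed"` and `"timeout"` strings come from the table.

```javascript
// Rejects anything that is not http:// or https://.
function checkWebhookUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, summary: 'invalid URL' };
  }
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    return { ok: false, summary: 'only http/https allowed' };
  }
  return { ok: true, url };
}

// Posts a JSON or string body with a safety timeout (Node 18+ for global fetch).
async function postWebhook(raw, body, timeoutMs = 10000) {
  const checked = checkWebhookUrl(raw);
  if (!checked.ok) return checked;
  try {
    const res = await fetch(checked.url, {
      method: 'POST',
      body: typeof body === 'string' ? body : JSON.stringify(body),
      signal: AbortSignal.timeout(timeoutMs), // aborts with a TimeoutError
    });
    return { ok: res.ok, summary: `HTTP ${res.status}` };
  } catch (err) {
    return { ok: false, summary: err.name === 'TimeoutError' ? 'timeout' : String(err.message) };
  }
}
```

Running `checkWebhookUrl` against a stored watch's URL is a quick way to tell a scheme rejection apart from a slow or unreachable endpoint.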
194
+ ## See Also
195
+
196
+ - `engine/shared.js:1557-1616` — `WATCH_STATUS`, `WATCH_TARGET_TYPE`, `WATCH_CONDITION`, `WATCH_ABSOLUTE_CONDITIONS`, `WATCH_ACTION_TYPE` constants
197
+ - `engine/watches.js` — registry, lifecycle, tick integration
198
+ - `engine/watch-actions.js` — action registry and built-in handlers
199
+ - `dashboard/pages/watches.html`, `dashboard/js/render-watches.js` — dashboard UI
200
+ - `test/unit/watches-module.test.js`, `test/unit/watch-actions.test.js` — module-level tests
201
+ - [`auto-discovery.md`](auto-discovery.md) — overall tick cycle context
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@yemi33/minions",
3
- "version": "0.1.1883",
3
+ "version": "0.1.1885",
4
4
  "description": "Multi-agent AI dev team that runs from ~/.minions/ — five autonomous agents share a single engine, dashboard, and knowledge base",
5
5
  "bin": {
6
6
  "minions": "bin/minions.js"
package/playbooks/docs.md CHANGED
@@ -92,7 +92,7 @@ git commit -m "{{commit_message}}"
92
92
  git push -u origin {{branch_name}}
93
93
  ```
94
94
 
95
- PR creation is MANDATORY for docs tasks — docs go through the same review flow as code.
95
+ Create the PR for docs tasks — docs go through the same review flow as code.
96
96
  Use the appropriate repo-host tooling for PR creation. For Azure DevOps, prefer the
97
97
  `az` CLI first and use the ADO MCP only as a fallback.
98
98
 
package/playbooks/fix.md CHANGED
@@ -102,15 +102,4 @@ Your task is complete when each review finding has either been fixed or answered
102
102
 
103
103
  ## Completion
104
104
 
105
- After finishing, write the JSON completion report described in the shared rules. Also output this structured completion block as a compatibility fallback:
106
-
107
- ```completion
108
- status: done | partial | failed
109
- files_changed: <comma-separated list of key files changed>
110
- tests: pass | fail | skipped | N/A
111
- pr: PR-<number> or N/A
112
- failure_class: N/A
113
- pending: <any remaining work, or none>
114
- ```
115
-
116
- Replace the values with your actual results.
105
+ After finishing, write the JSON completion report described in the shared rules (schema: `docs/completion-reports.md`). A fenced ` ```completion ` block in stdout is accepted only as a compatibility fallback — do not duplicate the schema here.
@@ -84,15 +84,4 @@ Your task is complete when the requested implementation is delivered, the valida
84
84
 
85
85
  ## Completion
86
86
 
87
- After finishing, write the JSON completion report described in the shared rules. Also output this structured completion block as a compatibility fallback:
88
-
89
- ```completion
90
- status: done | partial | failed
91
- files_changed: <comma-separated list of key files changed>
92
- tests: pass | fail | skipped | N/A
93
- pr: N/A
94
- failure_class: N/A
95
- pending: <any remaining work, or none>
96
- ```
97
-
98
- Replace the values with your actual results.
87
+ After finishing, write the JSON completion report described in the shared rules (schema: `docs/completion-reports.md`). A fenced ` ```completion ` block in stdout is accepted only as a compatibility fallback — do not duplicate the schema here.
@@ -83,12 +83,12 @@ git push -u origin {{branch_name}}
83
83
 
84
84
  {{pr_create_instructions}}
85
85
 
86
- PR creation is MANDATORY for implement tasks because the engine tracks review and completion from the PR.
86
+ Create the PR for implement tasks because the engine tracks review and completion from the PR.
87
87
 
88
88
  Include build/test status and run instructions in the PR description. If the project has a runnable app, include the localhost URL.
89
89
 
90
90
  ## When to Stop
91
91
 
92
- Your task is complete when the requested implementation is delivered, the validation story is truthful and sufficient for review, the branch is pushed, and the PR exists. Your final message MUST include the PR URL so the engine can track it.
92
+ Your task is complete when the requested implementation is delivered, the validation story is truthful and sufficient for review, the branch is pushed, and the PR exists. Include the PR URL in your final message so the engine can track it.
93
93
 
94
94
  Do NOT run `gh pr merge` or any other merge command on your own PR. The engine reviews and merges PRs through a separate review cycle. Self-merging is prohibited.