@yemi33/squad 0.1.0

@@ -0,0 +1,277 @@
1
+ # Self-Improvement Loop
2
+
3
+ How the squad learns from its own work and gets better over time.
4
+
5
+ ## Overview
6
+
7
+ The squad has four core self-improvement mechanisms that form a continuous feedback loop; a fifth, agent-authored skills, is covered in section 5:
8
+
9
+ ```
10
+ Agent completes task
11
+
12
+ ├─ 1. Learnings Inbox → notes.md (all future agents see it)
13
+ ├─ 2. Per-Agent History → history.md (agent sees its own past)
14
+ ├─ 3. Review Feedback Loop → author gets reviewer's findings
15
+ └─ 4. Quality Metrics → engine/metrics.json (tracks performance)
16
+ ```
17
+
18
+ ## 1. Learnings Inbox → notes.md
19
+
20
+ **The core loop.** Every playbook instructs agents to write findings to `notes/inbox/`. The engine consolidates these into `notes.md`, which is injected into every future playbook prompt.
21
+
22
+ ### Flow
23
+
24
+ ```
25
+ Agent finishes task
26
+ → writes notes/inbox/<agent>-<date>.md
27
+ → engine checks inbox on each tick
28
+ → at 5+ files: consolidateInbox() runs
29
+ → items categorized (reviews, feedback, learnings, other)
30
+ → summary appended to notes.md
31
+ → originals moved to notes/archive/
32
+ → notes.md injected into every future agent prompt
33
+ ```
34
+
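The flow above can be sketched in a few lines of Node.js. This is a minimal illustration, not the engine's actual `consolidateInbox()`: the function name `consolidateInboxSketch`, the `### Consolidation` heading, and the one-line summary rule (first non-empty line, truncated) are all assumptions made for the example.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical sketch. notesDir is assumed to contain inbox/ (and later archive/);
// notesFile is the path to notes.md.
function consolidateInboxSketch(notesDir, notesFile, threshold = 5) {
  const inboxDir = path.join(notesDir, 'inbox');
  const archiveDir = path.join(notesDir, 'archive');
  const files = fs.readdirSync(inboxDir).filter((f) => f.endsWith('.md')).sort();

  // Below the threshold: do nothing and wait for a later tick.
  if (files.length < threshold) return 0;

  // One summary line per file: its first non-empty line, truncated.
  const lines = files.map((f) => {
    const text = fs.readFileSync(path.join(inboxDir, f), 'utf8');
    const first = text.split('\n').find((l) => l.trim() !== '') || '(empty)';
    return `- ${f}: ${first.trim().slice(0, 100)}`;
  });

  // Append the summary under an assumed heading format.
  const date = new Date().toISOString().slice(0, 10);
  fs.appendFileSync(
    notesFile,
    `\n### Consolidation ${date} (${files.length} items)\n${lines.join('\n')}\n`
  );

  // Move originals to the archive so they are not consolidated twice.
  fs.mkdirSync(archiveDir, { recursive: true });
  for (const f of files) {
    fs.renameSync(path.join(inboxDir, f), path.join(archiveDir, f));
  }
  return files.length;
}
```

Calling it once per engine tick gives the behaviour described: nothing happens until the inbox reaches the threshold, then the whole batch is summarized and archived together.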
35
+ ### Smart Consolidation
36
+
37
+ The engine doesn't just dump files — it categorizes them:
38
+ - **Reviews** — files containing "review" or PR references
39
+ - **Feedback** — review feedback files for authors
40
+ - **Learnings** — build summaries, explorations, implementation notes
41
+ - **Other** — everything else
42
+
43
+ Each category gets a header with item count and one-line summaries.
44
+
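As a rough illustration of the categories above, the sketch below sorts notes into the four buckets and renders a header with an item count plus one-line summaries. The matching patterns, the function names `categorizeNote` and `renderConsolidation`, and the `####` header style are assumptions for the example; the engine's real rules may differ.

```javascript
// Hypothetical categorizer: the patterns are guesses, not the engine's rules.
function categorizeNote(fileName, content) {
  const name = fileName.toLowerCase();
  const text = content.toLowerCase();
  const prRef = /\bpr[-#]?\d+\b/;
  // Feedback files also mention a PR, so they are checked before reviews.
  if (name.startsWith('feedback-')) return 'feedback';
  if (name.includes('review') || prRef.test(name) || prRef.test(text)) return 'reviews';
  if (/(build|explor|implement)/.test(name + ' ' + text)) return 'learnings';
  return 'other';
}

// notes: [{ fileName, content }]
function renderConsolidation(notes) {
  const groups = { reviews: [], feedback: [], learnings: [], other: [] };
  for (const { fileName, content } of notes) {
    // One-line summary: first non-empty line with any heading marks removed.
    const first = content.split('\n').find((l) => l.trim() !== '') || '(empty)';
    groups[categorizeNote(fileName, content)].push(`- ${first.replace(/^#+\s*/, '').trim()}`);
  }
  return Object.entries(groups)
    .filter(([, items]) => items.length > 0)
    .map(([category, items]) => `#### ${category} (${items.length})\n${items.join('\n')}`)
    .join('\n\n');
}
```

Checking for feedback first matters because a file such as `feedback-dallas-from-ripley-pr123-2026-03-12.md` would otherwise match the PR-reference rule and land under reviews.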
45
+ ### Auto-Pruning
46
+
47
+ When `notes.md` exceeds 50KB, the engine prunes old consolidation sections, keeping the header and last 8 consolidations. This prevents the file from growing unbounded while retaining recent institutional knowledge.
48
+
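A minimal sketch of the pruning rule, assuming each consolidation section begins with a `### Consolidation` heading (that marker and the name `pruneNotesSketch` are hypothetical; only the 50KB limit and the "header plus last 8" rule come from the text above):

```javascript
// Hypothetical pruner: keeps the file header and the newest `keep` sections.
function pruneNotesSketch(text, maxBytes = 50 * 1024, keep = 8) {
  // Size is measured in bytes, not characters.
  if (Buffer.byteLength(text, 'utf8') <= maxBytes) return text;

  // Split just before each section heading; anything before the first
  // heading is treated as the file header.
  const parts = text.split(/^(?=### Consolidation)/m);
  const header = /^### Consolidation/.test(parts[0]) ? '' : parts.shift();

  // Sections are appended over time, so the last ones are the most recent.
  return header + parts.slice(-keep).join('');
}
```

Note that this sketch prunes by section count only; if eight sections together still exceed the limit, the result stays above 50KB.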
49
+ ## 2. Per-Agent History
50
+
51
+ Each agent maintains a `history.md` file that tracks its last 20 tasks. This is injected into the agent's system prompt so it knows what it did recently.
52
+
53
+ ### Flow
54
+
55
+ ```
56
+ Agent finishes task
57
+ → engine calls updateAgentHistory()
58
+ → prepends entry to agents/<name>/history.md
59
+ → trims to 20 entries
60
+ → next time agent spawns: history.md injected into system prompt
61
+ ```
62
+
63
+ ### What's tracked per entry
64
+
65
+ - Timestamp
66
+ - Task description
67
+ - Type (implement, review, fix, explore)
68
+ - Result (success/error)
69
+ - Project name
70
+ - Branch name
71
+ - Dispatch ID
72
+
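The prepend-and-trim behaviour can be illustrated as follows. The entry shape uses the fields listed above, but the property names, the function names, and the rendered Markdown layout are assumptions rather than the engine's actual `updateAgentHistory()` output.

```javascript
const MAX_HISTORY_ENTRIES = 20;

// Newest entry goes first; anything beyond the limit falls off the end.
function prependHistoryEntry(entries, entry, max = MAX_HISTORY_ENTRIES) {
  return [entry, ...entries].slice(0, max);
}

// Hypothetical rendering of history.md from the entry list.
function renderHistory(agentName, entries) {
  const body = entries.map((e) =>
    `### ${e.timestamp}: ${e.task}\n` +
    `- Type: ${e.type}\n` +
    `- Result: ${e.result}\n` +
    `- Project: ${e.project}\n` +
    `- Branch: ${e.branch}\n` +
    `- Dispatch ID: ${e.dispatchId}\n`
  );
  return `# ${agentName} history\n\n${body.join('\n')}`;
}
```

Keeping the list newest-first means a plain `slice(0, 20)` drops the oldest tasks, which matches the "last 20 tasks" rule.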
73
+ ### Why it matters
74
+
75
+ Without history, an agent has no memory. It might:
76
+ - Re-explore code it already explored yesterday
77
+ - Repeat mistakes it made last session
78
+ - Not know that it already has a PR open for a similar feature
79
+
80
+ With history, the agent sees "I already implemented M005 yesterday (success)" and can build on that context.
81
+
82
+ ## 3. Review Feedback Loop
83
+
84
+ When a reviewer (e.g., Ripley) reviews a PR and writes findings, the engine automatically creates a feedback file for the PR author (e.g., Dallas) in the inbox.
85
+
86
+ ### Flow
87
+
88
+ ```
89
+ Ripley reviews Dallas's PR
90
+ → writes notes/inbox/ripley-review-pr123-2026-03-12.md
91
+ → engine detects review completion
92
+ → updatePrAfterReview() runs
93
+ → createReviewFeedbackForAuthor() runs
94
+ → creates notes/inbox/feedback-dallas-from-ripley-pr123-2026-03-12.md
95
+ → next consolidation: feedback appears in notes.md
96
+ → Dallas's next task: he sees what Ripley flagged
97
+ ```
98
+
99
+ ### What the feedback file contains
100
+
101
+ ```markdown
102
+ # Review Feedback for Dallas
103
+
104
+ **PR:** PR-123 — Add retry logic
105
+ **Reviewer:** Ripley
106
+ **Date:** 2026-03-12
107
+
108
+ ## What the reviewer found
109
+ <full content of Ripley's review>
110
+
111
+ ## Action Required
112
+ Read this feedback carefully. When you work on similar tasks
113
+ in the future, avoid the patterns flagged here.
114
+ ```
115
+
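To make the naming and layout concrete, here is a small sketch that builds the feedback file name and body from a review. It follows the layout of the template above, but `buildReviewFeedback` and its parameter names are hypothetical, and it is not the engine's `createReviewFeedbackForAuthor()`.

```javascript
// Hypothetical builder for a feedback file addressed to the PR author.
function buildReviewFeedback({ author, reviewer, prId, prTitle, date, reviewText }) {
  // File name pattern taken from the flow above:
  // feedback-<author>-from-<reviewer>-pr<id>-<date>.md
  const fileName =
    `feedback-${author.toLowerCase()}-from-${reviewer.toLowerCase()}-pr${prId}-${date}.md`;

  const content = [
    `# Review Feedback for ${author}`,
    '',
    `**PR:** PR-${prId} - ${prTitle}`,
    `**Reviewer:** ${reviewer}`,
    `**Date:** ${date}`,
    '',
    '## What the reviewer found',
    reviewText.trim(),
    '',
    '## Action Required',
    'Read this feedback carefully. When you work on similar tasks',
    'in the future, avoid the patterns flagged here.',
    '',
  ].join('\n');

  return { fileName, content };
}
```

Writing the result into `notes/inbox/` is enough for it to be picked up by the next consolidation.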
116
+ ### Why it matters
117
+
118
+ Without this, review findings exist only in the inbox file under the reviewer's name, so the author sees them only if they happen to surface in the consolidated notes.md. The feedback loop ensures the author gets a direct, targeted lesson from every review.
119
+
120
+ ## 4. Quality Metrics
121
+
122
+ The engine tracks per-agent performance metrics in `engine/metrics.json` and updates them after every task completion and PR review.
123
+
124
+ ### Metrics tracked
125
+
126
+ | Metric | Updated when |
127
+ |--------|-------------|
128
+ | `tasksCompleted` | Agent exits with code 0 |
129
+ | `tasksErrored` | Agent exits with non-zero code |
130
+ | `prsCreated` | Agent completes an implement task |
131
+ | `prsApproved` | Reviewer approves the agent's PR |
132
+ | `prsRejected` | Reviewer requests changes on the agent's PR |
133
+ | `reviewsDone` | Agent completes a review task |
134
+ | `lastTask` | Every completion |
135
+ | `lastCompleted` | Every completion |
136
+
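The update rules in the table can be expressed as a single reducer. This is a sketch under stated assumptions: the event shapes (`kind`, `taskType`, `exitCode`, `verdict`, and so on) are invented for the example, and it assumes that "completes" for `prsCreated` and `reviewsDone` means exit code 0.

```javascript
function emptyMetrics() {
  return {
    tasksCompleted: 0, tasksErrored: 0, prsCreated: 0,
    prsApproved: 0, prsRejected: 0, reviewsDone: 0,
    lastTask: null, lastCompleted: null,
  };
}

// Hypothetical events:
//   { kind: 'completion', agent, taskType, exitCode, task, at }
//   { kind: 'review', author, verdict }  // verdict: 'approved' | 'changes-requested'
function applyMetricsEvent(metrics, event) {
  // Review verdicts are credited to the PR author, not the reviewer.
  const agent = event.kind === 'review' ? event.author : event.agent;
  const m = (metrics[agent] = metrics[agent] || emptyMetrics());

  if (event.kind === 'completion') {
    const ok = event.exitCode === 0;
    if (ok) m.tasksCompleted += 1; else m.tasksErrored += 1;
    if (ok && event.taskType === 'implement') m.prsCreated += 1;
    if (ok && event.taskType === 'review') m.reviewsDone += 1;
    m.lastTask = event.task;       // updated on every completion
    m.lastCompleted = event.at;    // updated on every completion
  } else if (event.kind === 'review') {
    if (event.verdict === 'approved') m.prsApproved += 1;
    else if (event.verdict === 'changes-requested') m.prsRejected += 1;
  }
  return metrics;
}
```

The resulting object can be serialized with `JSON.stringify` to produce a file shaped like `engine/metrics.json`, keyed by agent name.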
137
+ ### Where metrics are visible
138
+
139
+ - **CLI:** `node engine.js status` shows a metrics table
140
+ - **Dashboard:** "Agent Metrics" section with approval rates color-coded (green ≥70%, red <70%)
141
+
142
+ ### Sample output
143
+
144
+ ```
145
+ Metrics:
146
+ Agent Done Err PRs Approved Rejected Reviews
147
+ ----------------------------------------------------------
148
+ dallas 12 1 8 6 (75%) 2 0
149
+ ripley 0 0 0 0 (-) 0 10
150
+ ralph 5 0 4 3 (75%) 1 0
151
+ rebecca 3 0 2 2 (100%) 0 0
152
+ lambert 2 0 0 0 (-) 0 4
153
+ ```
154
+
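The `Approved` column and the dashboard colours can be derived from the stored counters. In this sketch the approval rate is assumed to be approved PRs divided by reviewed PRs (approved plus rejected); the sample output is also consistent with dividing by `prsCreated`, so treat the denominator as a guess. The function names are hypothetical.

```javascript
// Returns a whole-number percentage, or null when no PR has been reviewed yet.
function approvalRate(m) {
  const reviewed = m.prsApproved + m.prsRejected;
  return reviewed === 0 ? null : Math.round((m.prsApproved / reviewed) * 100);
}

// Formats the "Approved" cell, e.g. "6 (75%)" or "0 (-)".
function approvalCell(m) {
  const rate = approvalRate(m);
  return `${m.prsApproved} (${rate === null ? '-' : rate + '%'})`;
}

// Dashboard colour: green at 70% and above, red below, none without data.
function approvalColor(m) {
  const rate = approvalRate(m);
  if (rate === null) return 'none';
  return rate >= 70 ? 'green' : 'red';
}
```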
155
+ ### Future use
156
+
157
+ Metrics are currently informational — displayed in status and dashboard. Planned uses:
158
+ - **Routing adaptation:** If an agent's approval rate drops below a threshold, deprioritize them for implementation tasks
159
+ - **Auto-escalation:** If an agent errors 3 times in a row, pause their dispatch and alert
160
+ - **Capacity planning:** Track throughput per agent to optimize `maxConcurrent`
161
+
162
+ ## How It All Connects
163
+
164
+ ```
165
+ ┌──────────────────────┐
166
+ │ notes.md │
167
+ │ (institutional │
168
+ │ knowledge) │
169
+ └──────┬───────────────┘
170
+ │ injected into every playbook
171
+
172
+ ┌──────────┐ ┌──────────────────┐ ┌──────────┐
173
+ │ history │──injects──│ Agent works │──writes──→│ inbox/ │
174
+ │ .md │ │ on task │ │ *.md │
175
+ │ (past │ └────────┬─────────┘ └────┬─────┘
176
+ │ tasks) │ │ │
177
+ └──────────┘ │ on completion │ consolidateInbox()
178
+ ▲ ▼ ▼
179
+ │ ┌──────────────────┐ ┌──────────┐
180
+ └─updateHistory───│ Engine │─prune──→│ notes    │
181
+ │ post-hooks │ │ .md │
182
+ └────────┬─────────┘ └──────────┘
183
+
184
+ ┌───────────┼───────────┐
185
+ ▼ ▼ ▼
186
+ ┌──────────┐ ┌──────────┐ ┌──────────┐
187
+ │ metrics │ │ feedback │ │ history │
188
+ │ .json │ │ for │ │ .md │
189
+ │ │ │ author │ │ updated │
190
+ └──────────┘ └──────────┘ └──────────┘
191
+ ```
192
+
193
+ ## 5. Skills — Agent-Discovered Workflows
194
+
195
+ When an agent discovers a repeatable multi-step procedure, it can save it as a **skill** — a structured, reusable workflow compatible with Claude Code's skill system. Skills are stored in two locations:
196
+
197
+ - **Squad-wide:** `~/.squad/skills/<name>.md` — shared across all agents, no PR required
198
+ - **Project-specific:** `<project>/.claude/skills/<name>/SKILL.md` — scoped to one repo, requires a PR
199
+
200
+ ### Flow
201
+
202
+ ```
203
+ Agent discovers repeatable pattern during task
204
+ → writes skills/<name>.md with frontmatter (name, description, trigger, allowed-tools)
205
+ → engine detects new skill files on next tick
206
+ → builds skill index (name + trigger + file path)
207
+ → index injected into every agent's system prompt
208
+ → future agents see "Available Skills" and follow matching ones
209
+ → skills are also invocable via Claude Code's /skill-name command
210
+ ```
211
+
212
+ ### Skill Format
213
+
214
+ ```markdown
215
+ ---
216
+ name: fix-yarn-lock-conflict
217
+ description: Resolves yarn.lock merge conflicts by regenerating the lockfile
218
+ trigger: when merging branches that both modified yarn.lock
219
+ allowed-tools: Bash, Read, Write
220
+ author: dallas
221
+ created: 2026-03-12
222
+ project: any
223
+ ---
224
+
225
+ # Skill: Fix Yarn Lock Conflicts
226
+
227
+ ## When to Use
228
+ When a git merge or rebase produces conflicts in yarn.lock.
229
+
230
+ ## Steps
231
+ 1. Delete yarn.lock
232
+ 2. Run `yarn install` to regenerate
233
+ 3. Stage the new yarn.lock
234
+ 4. Continue the merge/rebase
235
+
236
+ ## Notes
237
+ - Never manually edit yarn.lock
238
+ - Always run `yarn build` after regenerating to verify
239
+ ```
240
+
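One way to picture the skill index is a reader that pulls `name` and `trigger` out of each file's frontmatter and emits one line per skill. The sketch below handles only flat `key: value` pairs rather than full YAML, and the function names and the index line format are assumptions, not the engine's implementation.

```javascript
// Minimal frontmatter reader: flat "key: value" pairs only, not full YAML.
function parseFrontmatter(text) {
  const match = text.match(/^---\r?\n([\s\S]*?)\r?\n---/);
  if (!match) return null;
  const meta = {};
  for (const line of match[1].split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx === -1) continue; // ignore lines without a key
    meta[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return meta;
}

// skills: [{ filePath, text }]. Returns the block to inject into a prompt.
function buildSkillIndex(skills) {
  const rows = [];
  for (const { filePath, text } of skills) {
    const meta = parseFrontmatter(text);
    // Skip files that lack the fields the index needs.
    if (!meta || !meta.name || !meta.trigger) continue;
    rows.push(`- ${meta.name}: ${meta.trigger} (${filePath})`);
  }
  return rows.length ? `## Available Skills\n${rows.join('\n')}` : '';
}
```

Only the index goes into the prompt in this sketch; an agent that finds a matching trigger would then open the file at the listed path to read the full steps.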
241
+ ### How it differs from notes.md
242
+
243
+ | | notes.md | Skills |
244
+ |---|---|---|
245
+ | **Format** | Free-form prose, appended by engine | Structured with frontmatter, one file per workflow |
246
+ | **Granularity** | Rules, conventions, findings | Step-by-step procedures |
247
+ | **Authored by** | Engine (consolidation) | Agents directly |
248
+ | **Trigger** | Always injected (all context) | Agent matches trigger to current situation |
249
+ | **Lifespan** | Grows until it exceeds 50KB, then pruned to the last 8 consolidations | Permanent, individually editable |
250
+ | **Claude Code** | Not directly invocable | Invocable via `/skill-name` |
251
+
252
+ ### When agents should create skills
253
+
254
+ - Multi-step procedures they had to figure out (build setup, deployment, migration)
255
+ - Error recovery patterns (how to fix a specific class of failure)
256
+ - Project-specific workflows that aren't documented elsewhere
257
+ - Cross-repo coordination steps
258
+
259
+ ## Configuration
260
+
261
+ | Setting | Default | What it controls |
262
+ |---------|---------|-----------------|
263
+ | `engine.inboxConsolidateThreshold` | 5 | Files needed before consolidation triggers |
264
+ | notes.md max size | 50KB | Auto-prunes old sections above this |
265
+ | Agent history entries | 20 | Max entries kept in history.md |
266
+ | Metrics file | `engine/metrics.json` | Auto-created on first completion |
267
+
268
+ ## Files
269
+
270
+ | File | Purpose | Written by |
271
+ |------|---------|-----------|
272
+ | `notes/inbox/*.md` | Agent findings drop-box | Agents |
273
+ | `notes/archive/*.md` | Archived inbox files | Engine (consolidation) |
274
+ | `notes.md` | Accumulated team knowledge | Engine (consolidation) |
275
+ | `agents/<name>/history.md` | Per-agent task history | Engine (post-completion) |
276
+ | `engine/metrics.json` | Quality metrics per agent | Engine (post-completion + review) |
277
+ | `notes/inbox/feedback-*.md` | Review feedback for authors | Engine (post-review) |
@@ -0,0 +1,49 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * Wrapper for @azure-devops/mcp that fetches an ADO token via azureauth
4
+ * broker (no browser popup) and sets AZURE_DEVOPS_EXT_PAT before launching
5
+ * the MCP server.
6
+ */
7
+ const { execSync, spawn } = require('child_process');
8
+ const path = require('path');
9
+
10
+ // Fetch token via azureauth broker (corp tool, no browser)
11
+ let token;
12
+ try {
13
+ token = execSync('azureauth ado token --mode broker --output token --timeout 1', {
14
+ encoding: 'utf8',
15
+ timeout: 30000,
16
+ windowsHide: true,
17
+ }).trim();
18
+ } catch (e) {
19
+ // Fallback: try with web mode (may open browser as last resort)
20
+ try {
21
+ token = execSync('azureauth ado token --mode web --output token --timeout 5', {
22
+ encoding: 'utf8',
23
+ timeout: 120000,
24
+ windowsHide: true,
25
+ }).trim();
26
+ } catch (e2) {
27
+ process.stderr.write('ado-mcp-wrapper: Failed to get ADO token: ' + e2.message + '\n');
28
+ process.exit(1);
29
+ }
30
+ }
31
+
32
+ // Launch the actual MCP server with the token in env
33
+ const args = process.argv.slice(2);
34
+ const child = spawn(process.platform === 'win32' ? 'npx.cmd' : 'npx', [
35
+ '-y',
36
+ '--registry=https://registry.npmjs.org/',
37
+ '@azure-devops/mcp@latest',
38
+ ...args
39
+ ], {
40
+ stdio: 'inherit',
41
+ env: { ...process.env, AZURE_DEVOPS_EXT_PAT: token },
42
+ windowsHide: true,
43
+ });
44
+
45
+ child.on('exit', (code, signal) => process.exit(code ?? (signal ? 1 : 0))); // code is null when the child is killed by a signal
46
+ child.on('error', (err) => {
47
+ process.stderr.write('ado-mcp-wrapper: ' + err.message + '\n');
48
+ process.exit(1);
49
+ });
@@ -0,0 +1,98 @@
1
+ #!/usr/bin/env node
2
+ /**
3
+ * spawn-agent.js — Wrapper to spawn claude CLI safely
4
+ * Reads prompt and system prompt from files, avoiding shell metacharacter issues.
5
+ *
6
+ * Usage: node spawn-agent.js <prompt-file> <sysprompt-file> [claude-args...]
7
+ */
8
+
9
+ const { spawn, execSync } = require('child_process');
10
+ const fs = require('fs');
11
+ const path = require('path');
12
+
13
+ const [,, promptFile, sysPromptFile, ...extraArgs] = process.argv;
14
+
15
+ if (!promptFile || !sysPromptFile) {
16
+ console.error('Usage: node spawn-agent.js <prompt-file> <sysprompt-file> [args...]');
17
+ process.exit(1);
18
+ }
19
+
20
+ const prompt = fs.readFileSync(promptFile, 'utf8');
21
+ const sysPrompt = fs.readFileSync(sysPromptFile, 'utf8');
22
+
23
+ // Clean CLAUDECODE env vars
24
+ const env = { ...process.env };
25
+ delete env.CLAUDECODE;
26
+ delete env.CLAUDE_CODE_ENTRYPOINT;
27
+ for (const key of Object.keys(env)) {
28
+ if (key.startsWith('CLAUDE_CODE') || key.startsWith('CLAUDECODE_')) delete env[key];
29
+ }
30
+
31
+ // Resolve claude binary — find the actual JS entry point
32
+ let claudeBin;
33
+ const searchPaths = [
34
+ // npm global install locations
35
+ path.join(process.env.npm_config_prefix || '', 'node_modules', '@anthropic-ai', 'claude-code', 'cli.js'),
36
+ 'C:/.tools/.npm-global/node_modules/@anthropic-ai/claude-code/cli.js',
37
+ path.join(process.env.APPDATA || '', 'npm', 'node_modules', '@anthropic-ai', 'claude-code', 'cli.js'),
38
+ ];
39
+ for (const p of searchPaths) {
40
+ if (p && fs.existsSync(p)) { claudeBin = p; break; }
41
+ }
42
+ // Fallback: parse the shell wrapper
43
+ if (!claudeBin) {
44
+ try {
45
+ const which = execSync('bash -c "which claude"', { encoding: 'utf8', env }).trim();
46
+ const wrapper = execSync(`bash -c "cat '${which}'"`, { encoding: 'utf8', env });
47
+ const m = wrapper.match(/node_modules\/@anthropic-ai\/claude-code\/cli\.js/);
48
+ if (m) {
49
+ const basedir = path.dirname(which.replace(/^\/c\//, 'C:/').replace(/\//g, path.sep));
50
+ claudeBin = path.join(basedir, 'node_modules', '@anthropic-ai', 'claude-code', 'cli.js');
51
+ }
52
+ } catch {}
53
+ }
54
+
55
+ // Debug log
56
+ const debugPath = path.join(__dirname, 'spawn-debug.log');
57
+ fs.writeFileSync(debugPath, `spawn-agent.js at ${new Date().toISOString()}\nclaudeBin=${claudeBin || 'not found'}\nprompt=${promptFile}\nsysPrompt=${sysPromptFile}\nextraArgs=${extraArgs.join(' ')}\n`);
58
+
59
+ const cliArgs = ['-p', '--system-prompt', sysPrompt, ...extraArgs];
60
+
61
+ if (!claudeBin) {
62
+ fs.appendFileSync(debugPath, 'FATAL: Cannot find claude-code cli.js\n');
63
+ process.exit(1);
64
+ }
65
+
66
+ const proc = spawn(process.execPath, [claudeBin, ...cliArgs], {
67
+ stdio: ['pipe', 'pipe', 'pipe'],
68
+ env
69
+ });
70
+
71
+ fs.appendFileSync(debugPath, `PID=${proc.pid || 'none'}\n`);
72
+
73
+ // Write PID file for parent engine to verify spawn
74
+ const pidFile = promptFile.replace(/prompt-/, 'pid-').replace(/\.md$/, '.pid');
75
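+ if (pidFile !== promptFile) fs.writeFileSync(pidFile, String(proc.pid || '')); // skip if the name did not change, to avoid overwriting the prompt file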
76
+
77
+ // Send prompt via stdin
78
+ proc.stdin.write(prompt);
79
+ proc.stdin.end();
80
+
81
+ // Capture stderr separately for debugging
82
+ let stderrBuf = '';
83
+ proc.stderr.on('data', (chunk) => {
84
+ stderrBuf += chunk.toString();
85
+ process.stderr.write(chunk);
86
+ });
87
+
88
+ // Pipe stdout to parent
89
+ proc.stdout.pipe(process.stdout);
90
+
91
+ proc.on('close', (code) => {
92
+ fs.appendFileSync(debugPath, `EXIT: code=${code}\nSTDERR: ${stderrBuf.slice(0, 500)}\n`);
93
+ process.exit(code ?? 1); // code is null when the child was killed by a signal; report that as a failure
94
+ });
95
+ proc.on('error', (err) => {
96
+ fs.appendFileSync(debugPath, `ERROR: ${err.message}\n`);
97
+ process.exit(1);
98
+ });