pi-crew 0.2.25 → 0.3.1

package/CHANGELOG.md CHANGED
@@ -1,5 +1,37 @@
  # Changelog
 
+ ## [0.3.0] — Phase 3a+3b: Discovery Cache, Dynamic Agent Registry, Rich TUI Rendering (2026-05-23)
+
+ ### Phase 3a: Agent Discovery Cache
+ - **500ms TTL cache** with max 32 entries and per-cwd invalidation
+ - **FIFO eviction** when cache is full
+ - Cache pruned on every `discoverAgents()` call
+ - `invalidateAgentDiscoveryCache(cwd?)` exposed for explicit invalidation
+
+ ### Phase 3b: Dynamic Agent Registry
+ - **`registerDynamicAgent(config)`** — runtime agent registration with cache invalidation
+ - **`unregisterDynamicAgent(name)`** — throws on missing agent
+ - **`listDynamicAgents()`** — returns all registered dynamic agents
+ - Dynamic agents get **highest priority** over discovered agents (security: project < builtin < user < dynamic)
+ - **CrewRegistry v2** — extended from v1 with `registerAgent`/`unregisterAgent`/`listDynamicAgents`
+ - Factory `installCrewGlobalRegistry()` for clean initialization
+
+ ### Rich TUI Tool Rendering
+ - **New `src/ui/tool-render.ts`** (304 lines) — shared rendering module ported from pi-subagent4
+ - **`renderTeamToolCall`** — collapsed: `team action='run' (default) "goal preview"` / expanded: header + goal streaming
+ - **`renderAgentToolCall`** — collapsed: `Agent explorer "prompt preview"` / expanded: header + prompt
+ - **`renderTeamToolResult`** — `[status] goal text` for run actions / compact info for others
+ - **`renderAgentToolResult`** — status icons (⟳○✓✗) + output lines for agent results
+ - **`renderAgentProgress`** — icon + header + tool log + context gauge + usage line (↑↓RW$ctx)
+ - Helpers: `formatTokens`, `formatDuration`, `formatContextUsage`, `truncLine`, `formatToolPreview`
+ - All tools use **`@mariozechner/pi-tui`** Components (Container, Text, Spacer) directly
+ - `renderCall`/`renderResult` added to: `team`, `Agent` tools
+
+ ### Tests
+ - **1662 tests pass** (1652 unit + 46 integration + 4 new)
+ - New test suites: `agent-discovery-cache.test.ts` (10 tests), `tool-render.test.ts` (10 tests)
+ - Bug fix: `allAgents` priority corrected (discovery: project < builtin < user; dynamic separate/highest)
+
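Review note: the Phase 3a entry above describes a cache with a 500 ms TTL, at most 32 entries, FIFO eviction, pruning on each lookup, and per-cwd invalidation. The following is a minimal sketch of one way those properties could fit together. It is not pi-crew's actual code: `DiscoveryCache`, `AgentList`, and the injectable clock are hypothetical names chosen for illustration, and only the two constants come from the changelog.

```typescript
// Hypothetical sketch: class, type, and method names are illustrative only.
type AgentList = string[]; // stand-in for whatever agent discovery returns

interface CacheEntry {
  value: AgentList;
  expiresAt: number; // epoch milliseconds
}

const TTL_MS = 500; // from the changelog
const MAX_ENTRIES = 32; // from the changelog

class DiscoveryCache {
  // A Map iterates in insertion order, so its first key is the oldest entry.
  private readonly entries = new Map<string, CacheEntry>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(cwd: string): AgentList | undefined {
    this.prune(); // expired entries are dropped on every lookup
    return this.entries.get(cwd)?.value;
  }

  set(cwd: string, value: AgentList): void {
    this.prune();
    // Assumption: overwriting a key moves it to the back of the FIFO order.
    this.entries.delete(cwd);
    while (this.entries.size >= MAX_ENTRIES) {
      const oldest = this.entries.keys().next().value;
      if (oldest === undefined) break;
      this.entries.delete(oldest); // FIFO eviction
    }
    this.entries.set(cwd, { value, expiresAt: this.now() + TTL_MS });
  }

  // With no argument, clears everything; otherwise only the given cwd.
  invalidate(cwd?: string): void {
    if (cwd === undefined) this.entries.clear();
    else this.entries.delete(cwd);
  }

  get size(): number {
    return this.entries.size;
  }

  private prune(): void {
    const t = this.now();
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= t) this.entries.delete(key);
    }
  }
}
```

Passing the clock into the constructor keeps expiry testable without real timers. Whether the real cache refreshes an entry's FIFO position on overwrite is not stated in the changelog.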
  ## [0.2.21] — 3 Bugs Fixed — Background Runner, Child-pi stdin, Phantom Runs (2026-05-22)
 
  ## [0.2.25] — CI Fixes & needs_attention Terminal Status (2026-05-22)
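Review note: the 0.3.0 entry in this hunk states the agent priority order as project < builtin < user < dynamic and says `unregisterDynamicAgent(name)` throws on a missing agent. A rough sketch of that layering follows, assuming a simple "later layer overwrites earlier" merge. The three function names come from the changelog; `AgentConfig`, `AgentSource`, `resolveAgents`, and the overwrite-on-duplicate behaviour of registration are assumptions made for illustration.

```typescript
// Hypothetical sketch: only the three function names and the priority
// order come from the changelog; all other shapes are assumed.
type AgentSource = "project" | "builtin" | "user" | "dynamic";

interface AgentConfig {
  name: string;
  source: AgentSource;
  description?: string;
}

const dynamicAgents = new Map<string, AgentConfig>();

function registerDynamicAgent(config: Omit<AgentConfig, "source">): void {
  // Assumption: registering an existing name replaces it.
  dynamicAgents.set(config.name, { ...config, source: "dynamic" });
  // The changelog says the real function also invalidates the discovery cache.
}

function unregisterDynamicAgent(name: string): void {
  if (!dynamicAgents.delete(name)) {
    throw new Error(`Dynamic agent not registered: ${name}`);
  }
}

function listDynamicAgents(): AgentConfig[] {
  return [...dynamicAgents.values()];
}

interface Discovered {
  project: AgentConfig[];
  builtin: AgentConfig[];
  user: AgentConfig[];
}

// Later layers overwrite earlier ones: project < builtin < user < dynamic.
function resolveAgents(discovered: Discovered): Map<string, AgentConfig> {
  const merged = new Map<string, AgentConfig>();
  const layers = [
    discovered.project,
    discovered.builtin,
    discovered.user,
    listDynamicAgents(),
  ];
  for (const layer of layers) {
    for (const agent of layer) merged.set(agent.name, agent);
  }
  return merged;
}
```

Placing project-level agents at the bottom means a checked-out repository cannot shadow a builtin or user-defined agent of the same name, which matches the "security" remark in the changelog line.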
package/README.md CHANGED
@@ -9,7 +9,7 @@ npm: pi-crew
  repo: https://github.com/baphuongna/pi-crew
  ```
 
- **v0.2.21**: 3 bugs fixed — background runner session_shutdown survival, child-pi stdin hang, phantom runs from temp workspaces. See [CHANGELOG.md](CHANGELOG.md) and [docs/pi-crew-bugs.md](docs/pi-crew-bugs.md).
+ **v0.2.25**: See [CHANGELOG.md](CHANGELOG.md) and [docs/pi-crew-bugs.md](docs/pi-crew-bugs.md).
 
  ---
 
@@ -116,33 +116,61 @@ security-reviewer · test-engineer · verifier · writer
 
  ---
 
- ## Runtime Safety
+ ## Runtime Modes
 
- By default, `run` launches each task as a **separate child Pi process**. Workers execute independently and stream output to durable state.
+ pi-crew supports multiple runtime modes for task execution:
 
- Scaffold/dry-run mode (no real workers):
+ | Mode | Description |
+ |------|-------------|
+ | `auto` (default) | Uses `child-process` unless overridden by config |
+ | `child-process` | Spawns real `pi` child processes — each task runs in isolation |
+ | `scaffold` | Dry-run mode — renders prompts and persists artifacts without executing |
+ | `live-session` (experimental) | In-process session execution within the parent Pi |
 
  ```json
- { "runtime": { "mode": "scaffold" } }
+ // Use scaffold mode (no real workers, just prompts)
+ { "action": "run", "team": "default", "goal": "...", "runtime": { "mode": "scaffold" } }
+
+ // Disable workers globally
+ { "executeWorkers": false }
  ```
 
- Disable workers globally:
+ ## Async Runs
+
+ Async runs are **detached** from the session — they survive session switches and reloads. Pi-crew notifies when complete.
 
  ```json
- { "executeWorkers": false }
+ { "action": "run", "team": "default", "goal": "...", "async": true }
+ ```
+
+ ```text
+ /team-run --async Investigate failing tests
  ```
 
- Worktree mode is **opt-in** and requires a clean repo:
+ Background runs use `node --import jiti-register.mjs` for TypeScript support. See [docs/runtime-flow.md](docs/runtime-flow.md) for details.
+
+ ## Worktree Isolation
+
+ Worktree mode creates an **isolated git worktree per task** — safe for parallel edits to the same branch.
 
  ```json
  {
    "action": "run",
    "team": "implementation",
    "goal": "Refactor auth",
-   "workspaceMode": "worktree"
+   "worktree": { "enabled": true }
  }
  ```
 
+ ```text
+ /team-run --worktree Refactor auth
+ ```
+
+ Requirements:
+ - Git repository
+ - Clean working tree (no uncommitted changes in the main worktree)
+ - Worktrees auto-cleanup on run completion/cancel
+
  ---
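Review note: the Runtime Modes table in this hunk says `auto` uses `child-process` unless overridden by config. As a small sketch of what that resolution might look like, assuming an explicit per-run mode takes precedence over a configured one: `resolveRuntimeMode` and the precedence order are assumptions, and only the four mode names and the `child-process` default are taken from the table.

```typescript
// Hypothetical sketch: the function name and precedence are assumed.
type RuntimeMode = "auto" | "child-process" | "scaffold" | "live-session";
type ConcreteMode = Exclude<RuntimeMode, "auto">;

function resolveRuntimeMode(
  requested: RuntimeMode | undefined, // e.g. from the run action's runtime.mode
  configured: RuntimeMode | undefined, // e.g. from runtime.mode in config
): ConcreteMode {
  // The first concrete (non-auto) choice wins: per-run request, then config.
  for (const candidate of [requested, configured]) {
    if (candidate !== undefined && candidate !== "auto") return candidate;
  }
  // Documented default when nothing overrides `auto`.
  return "child-process";
}
```

Returning a type that excludes `"auto"` lets the compiler confirm that downstream code never has to handle the unresolved value.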
 
  ## Configuration
@@ -158,45 +186,85 @@ Worktree mode is **opt-in** and requires a clean repo:
  ### Quick Config
 
  ```text
- /team-config # view
- /team-config asyncByDefault=true # update
- /team-config runtime.mode=scaffold # scaffold mode
- /team-config --unset=asyncByDefault # reset
- /team-config autonomous.profile=assisted --project # project scope
+ /team-config # view all settings
+ /team-config get runtime.mode # read one key
+ /team-config set runtime.mode=scaffold # scaffold mode
+ /team-config set asyncByDefault=true # async by default
+ /team-config unset runtime.mode # reset to default
+ /team-config --project # project scope
+ /team-settings path # show config file path
  ```
 
  ### Key Settings
 
- | Section | Key Settings | Default |
- |---------|-------------|---------|
- | **Runtime** | `mode`: `auto` \| `scaffold` \| `child-process` \| `live-session` | `auto` |
- | **Concurrency** | `limits.maxConcurrentWorkers` | workflow-dependent (2–4) |
- | **Async** | `asyncByDefault`, `runtime.groupJoin` | `false`, `smart` |
+ | Section | Keys | Default |
+ |---------|------|---------|
+ | **Runtime** | `mode`: `auto` \| `child-process` \| `scaffold` \| `live-session` | `auto` |
+ | | `maxTurns`, `graceTurns`, `groupJoin`, `requirePlanApproval` | various |
+ | **Concurrency** | `limits.maxConcurrentWorkers` | workflow-dependent |
+ | | `limits.maxTaskDepth`, `limits.maxChildrenPerTask` | 2, 5 |
+ | **Async** | `asyncByDefault` | `false` |
+ | | `runtime.groupJoin`: `off` \| `group` \| `smart` | `smart` |
  | **Autonomy** | `profile`: `manual` \| `suggested` \| `assisted` \| `aggressive` | `suggested` |
- | **UI** | `widgetPlacement`, `dashboardPlacement`, `showModel`, `showTokens` | compact widget |
- | **Reliability** | `autoRetry`, `autoRecover`, `deadletterThreshold`, `retryPolicy` | all opt-in |
- | **Observability** | `prometheus.enabled`, `otlp.enabled` | opt-in |
+ | | `autonomous.injectPolicy`, `preferAsyncForLongTasks` | true, false |
+ | **UI** | `widgetPlacement`, `dashboardPlacement` | compact widget |
+ | | `showModel`, `showTokens` | display controls |
+ | **Reliability** | `autoRetry`, `autoRecover`, `deadletterThreshold` | opt-in |
+ | **Observability** | `prometheus.enabled`, `otlp.enabled`, `heartbeatStaleMs` | opt-in |
+ | **Worktree** | `worktree.enabled` | disabled by default |
 
  > ⚠️ **Trust boundary**: project config cannot override sensitive execution controls (workers, runtime mode, autonomy, agent overrides). Set those in **user config** only.
 
- 📖 Full config reference: [docs/configuration.md](docs/configuration.md) *(coming soon — see [docs/usage.md](docs/usage.md) for now)*
+ 📖 Full config reference: [docs/commands-reference.md#team-settings--config-management](docs/commands-reference.md) and [schema.json](schema.json)
 
  ---
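Review note: the trust-boundary callout in this hunk says project config cannot override sensitive execution controls (workers, runtime mode, autonomy, agent overrides). One way such a rule could be enforced at merge time is sketched below. Everything here is an assumption for illustration: the top-level key names in `SENSITIVE_KEYS` are guesses at how those four areas are spelled, the merge is shallow, and treating all of `runtime` as sensitive is deliberately coarse.

```typescript
// Hypothetical sketch: key names and merge rules are assumptions,
// not pi-crew's actual config loader.
type Config = Record<string, unknown>;

// Assumed top-level keys for: workers, runtime mode, autonomy, agent overrides.
const SENSITIVE_KEYS: readonly string[] = [
  "executeWorkers",
  "runtime",
  "autonomous",
  "agents",
];

interface MergeResult {
  merged: Config;
  ignored: string[]; // sensitive keys dropped from the project config
}

function mergeConfig(user: Config, project: Config): MergeResult {
  const ignored: string[] = [];
  const safeProject: Config = {};
  for (const [key, value] of Object.entries(project)) {
    if (SENSITIVE_KEYS.includes(key)) ignored.push(key);
    else safeProject[key] = value;
  }
  // Assumption: project settings win over user settings for non-sensitive keys.
  return { merged: { ...user, ...safeProject }, ignored };
}
```

Returning the ignored keys rather than silently dropping them would let a caller warn the user that a project file tried to change a protected setting.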
 
  ## Tool Actions
 
  ```json
- { "action": "run", "team": "default", "goal": "..." } // execute
- { "action": "status", "runId": "team_..." } // monitor
- { "action": "cancel", "runId": "team_..." } // stop
- { "action": "resume", "runId": "team_..." } // continue
- { "action": "recommend", "goal": "..." } // get advice
- { "action": "list" } // discover
- { "action": "create", "resource": "agent", ... } // extend
- { "action": "doctor" } // diagnose
+ // Execute workflow (foreground or async)
+ { "action": "run", "team": "default", "goal": "..." }
+ { "action": "run", "team": "default", "goal": "...", "async": true }
+
+ // Monitor & control
+ { "action": "status", "runId": "team_..." }
+ { "action": "summary", "runId": "team_..." }
+ { "action": "events", "runId": "team_..." }
+ { "action": "artifacts", "runId": "team_..." }
+ { "action": "cancel", "runId": "team_..." }
+ { "action": "resume", "runId": "team_..." }
+
+ // Discovery
+ { "action": "list" }
+ { "action": "get", "resource": "team", "name": "default" }
+ { "action": "recommend", "goal": "Refactor auth flow" }
+
+ // Resource management
+ { "action": "create", "resource": "agent", "config": { "name": "api-reviewer", ... } }
+ { "action": "update", "resource": "team", "name": "backend", "config": { ... } }
+ { "action": "delete", "resource": "workflow", "name": "quick-review" }
+ { "action": "validate" }
+
+ // Run maintenance
+ { "action": "cleanup", "runId": "team_..." }
+ { "action": "forget", "runId": "team_...", "confirm": true }
+ { "action": "prune", "olderThanDays": 7, "confirm": true }
+ { "action": "export", "runId": "team_..." }
+ { "action": "import", "path": "/path/to/bundle.tar.gz" }
+
+ // Environment & configuration
+ { "action": "doctor", "config": { "smokeChildPi": true } }
+ { "action": "config" }
+ { "action": "init", "config": { "copyBuiltins": true } }
+ { "action": "autonomy", "profile": "assisted" }
+
+ // Advanced
+ { "action": "api", "runId": "team_...", "operation": "read-manifest" }
+ { "action": "plan", "team": "default", "goal": "..." }
+ { "action": "worktrees", "runId": "team_..." }
  ```
 
- 📖 Full actions reference: [docs/actions-reference.md](docs/actions-reference.md)
+ 📖 Full actions reference (28 actions): [docs/actions-reference.md](docs/actions-reference.md)
 
  ---
 
@@ -16,20 +16,20 @@ Maps pi-crew behavior to proof. Every row must have real validation evidence.
 
  | Story | Contract | Unit | Integration | CI | Status | Evidence |
  |-------|----------|------|-------------|-----|--------|----------|
- | Core team run | `docs/product/team-run.md` | yes | yes | yes 3/3 | implemented | 1621 tests pass |
- | Child process runner | `docs/product/child-process.md` | yes | no | yes 3/3 | implemented | child-pi.ts tests |
- | Async runner | `docs/product/async-runner.md` | yes | no | yes 3/3 | implemented | async-runner tests |
- | Live session | `docs/product/live-session.md` | yes | no | yes 3/3 | implemented | live-session tests |
- | State durability | `docs/product/state.md` | yes | no | yes 3/3 | implemented | state-store tests |
- | Worktree isolation | `docs/product/worktree.md` | yes | no | yes 3/3 | implemented | worktree tests |
- | Team tool API | `docs/product/team-tool.md` | yes | no | yes 3/3 | implemented | api tests |
- | Group join | `docs/product/group-join.md` | yes | no | yes 3/3 | implemented | group-join tests |
- | Model fallback | `docs/product/model-fallback.md` | yes | no | yes 3/3 | implemented | model-fallback tests |
- | Conflict detection | `docs/product/conflict-detect.md` | yes | no | yes 3/3 | implemented | conflict-detect tests |
- | Crash recovery | `docs/product/crash-recovery.md` | yes | no | yes 3/3 | implemented | crash-recovery tests |
- | Effectiveness guard | `docs/product/effectiveness.md` | yes | no | yes 3/3 | implemented | effectiveness tests |
- | Windows EBUSY | `docs/product/platform.md` | yes | no | yes 3/3 | implemented | rmSyncRetry tests |
- | Depth guard | `docs/product/runtime-safety.md` | yes | no | yes 3/3 | implemented | depth-guard tests |
+ | Core team run | `docs/product/team-run.md` | yes | yes | yes 3/3 | implemented | 1655 tests pass (268 unit + 14 integration files) |
+ | Child process runner | `docs/product/child-process.md` | yes | yes | yes 3/3 | implemented | child-pi-pool.test.ts, child-pi-timeout.test.ts, mock-child-run.test.ts |
+ | Async runner | `docs/product/async-runner.md` | yes | yes | yes 3/3 | implemented | async-runner.test.ts, async-restart-recovery.test.ts |
+ | Live session | `docs/product/live-session.md` | yes | no | yes 3/3 | implemented | live-session-context.test.ts, live-session-runtime.test.ts |
+ | State durability | `docs/product/state.md` | yes | yes | yes 3/3 | implemented | state-store.test.ts, state-contracts.test.ts, phase3-runtime.test.ts |
+ | Worktree isolation | `docs/product/worktree.md` | yes | yes | yes 3/3 | implemented | worktree-manager.test.ts, worktree-run.test.ts |
+ | Team tool API | `docs/product/team-tool.md` | yes | yes | yes 3/3 | implemented | team-tool-dispatch.test.ts, extension-api-surface.test.ts, operator-experience.test.ts |
+ | Group join | `docs/product/group-join.md` | yes | yes | yes 3/3 | implemented | phase6-runtime-hardening.test.ts |
+ | Model fallback | `docs/product/model-fallback.md` | yes | no | yes 3/3 | implemented | model-fallback.test.ts |
+ | Conflict detection | `docs/product/conflict-detect.md` | yes | no | yes 3/3 | implemented | conflict-detect.test.ts, delta-conflict.test.ts |
+ | Crash recovery | `docs/product/crash-recovery.md` | yes | yes | yes 3/3 | implemented | recovery-recipes.test.ts, async-restart-recovery.test.ts |
+ | Effectiveness guard | `docs/product/effectiveness.md` | yes | no | yes 3/3 | implemented | effectiveness-guard.test.ts |
+ | Windows EBUSY | `docs/product/platform.md` | yes | yes | yes 3/3 | implemented | phase6-runtime-hardening.test.ts |
+ | Depth guard | `docs/product/runtime-safety.md` | yes | no | yes 3/3 | implemented | subagent-depth.test.ts, completion-guard.test.ts |
 
  ## Evidence Rules
 
@@ -42,8 +42,10 @@ Maps pi-crew behavior to proof. Every row must have real validation evidence.
  ## Validation Commands
 
  ```bash
- npm test # Run all unit tests (1600+)
+ npm test # Run all unit tests (1655 tests across 268 unit files + 14 integration files)
  npm run typecheck # TypeScript check + strip-types import
  npm run check # Biome lint + format
+ npm run test:unit # Unit tests only (fast, parallel)
+ npm run test:integration # Integration tests only (sequential)
  gh run list --limit 1 # Check latest CI status
  ```
@@ -0,0 +1,305 @@
+ # Feature Analysis: 3 Features to Port from pi-subagent4
+
+ ## 1. Safe Bash Tool
+
+ ### Current State in pi-crew
+ - **No dangerous command blocking** - pi-crew relies on user config
+ - `src/utils/env-filter.ts` has `sanitizeEnvSecrets()` for env var filtering, but nothing for bash commands
+ - `src/runtime/skill-instructions.ts` references `safe-bash` skill, but it's a guidance document, not enforcement
+
+ ### How subagent4 Does It
+
+ ```typescript
+ // tools/safe-bash.ts
+ const DANGEROUS_PATTERNS = [
+   /\brm\s+(-[a-zA-Z]*f[a-zA-Z]*\s+)?(-[a-zA-Z]*r[a-zA-Z]*\s+)?(\/|~\/?\s|~\/?\b)/,
+   /\bsudo\b/,
+   /\bmkfs\b/,
+   /\bdd\s+if=/,
+   /:\(\)\s*\{\s*:\|:&\s*\}\s*;:/,
+   />\s*\/dev\/[sh]d[a-z]/,
+   /\bchmod\s+(-[a-zA-Z]+\s+)?777\s+\//,
+   /\bchown\s+(-[a-zA-Z]+\s+)?root/,
+   /\bcurl\s.*\|\s*(ba)?sh/,
+   /\bwget\s.*\|\s*(ba)?sh/,
+   /\bshutdown\b/,
+   /\breboot\b/,
+   /\binit\s+0\b/,
+   /\bkill\s+-9\s+1\b/,
+   /\bkillall\b/,
+ ];
+
+ function isDangerous(command: string): string | null {
+   const normalized = command.replace(/\\\n/g, " ");
+   for (const pattern of DANGEROUS_PATTERNS) {
+     if (pattern.test(normalized)) {
+       return `Command blocked by safe_bash: matches dangerous pattern ${pattern}`;
+     }
+   }
+   return null;
+ }
+
+ // Wraps pi's built-in bash tool
+ pi.registerTool({
+   name: "safe_bash",
+   execute(toolCallId, params, signal, onUpdate, ctx) {
+     const danger = isDangerous(params.command);
+     if (danger) throw new Error(danger);
+     return bashTool.execute(toolCallId, params, signal, onUpdate);
+   }
+ });
+ ```
+
+ ### Implementation Options for pi-crew
+
+ **Option A: Wrapper Tool (Recommended)**
+ ```typescript
+ // src/tools/safe-bash.ts
+ // Extends pi's bash tool with pattern blocking
+ // Registered as a custom tool that agents can use instead of bash
+ ```
+
+ **Option B: Config-based Pattern Matching**
+ ```typescript
+ // In pi-crew config
+ {
+   "tools": {
+     "bash": {
+       "safeMode": true,
+       "blockedPatterns": ["rm -rf /", "sudo", "mkfs", ...]
+     }
+   }
+ }
+ ```
+
+ **Option C: Skill-based Guidance**
+ ```typescript
+ // Already exists: skills/safe-bash/SKILL.md
+ // But this is guidance only, not enforcement
+ ```
+
+ ### Effort Assessment
+ | Aspect | Estimate |
+ |--------|----------|
+ | Code complexity | Low (~60 lines) |
+ | Integration points | 1 (bash tool wrapper) |
+ | Testing needed | Medium (regex pattern coverage) |
+ | **Total effort** | **0.5-1 day** |
+
+ ### Risks
+ - **Pattern gaps**: Regex may miss edge cases (e.g., `curl -sL` with `|` on separate line)
+ - **Performance**: Pattern matching on every command adds latency
+ - **User override**: Users might need to bypass for legitimate uses
+
+ ### Recommendation
+ **IMPLEMENT** - Low effort, high value. Start with Option A (wrapper tool) and iterate.
+
+ ---
+
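Review note: section 1 above lists "regex pattern coverage" as the main testing need and "pattern gaps" as the main risk. A table-driven check makes both concrete. The sketch below copies three patterns verbatim from the quoted `DANGEROUS_PATTERNS` and runs a handful of commands through the same `isDangerous` logic; the `Case` type, `CASES` table, and `runCases` helper are hypothetical names added for illustration, and the blocked-message text is shortened.

```typescript
// Three patterns copied verbatim from the DANGEROUS_PATTERNS list quoted above.
const DANGEROUS_PATTERNS: RegExp[] = [
  /\brm\s+(-[a-zA-Z]*f[a-zA-Z]*\s+)?(-[a-zA-Z]*r[a-zA-Z]*\s+)?(\/|~\/?\s|~\/?\b)/,
  /\bsudo\b/,
  /\bcurl\s.*\|\s*(ba)?sh/,
];

function isDangerous(command: string): string | null {
  // Join backslash-continued lines before matching, as in the quoted code.
  const normalized = command.replace(/\\\n/g, " ");
  for (const pattern of DANGEROUS_PATTERNS) {
    if (pattern.test(normalized)) return `blocked: ${pattern}`;
  }
  return null;
}

// Hypothetical test table: `blocked` records what these patterns actually do,
// which is not always what one would want them to do.
interface Case {
  command: string;
  blocked: boolean;
  note: string;
}

const CASES: Case[] = [
  { command: "rm -rf /", blocked: true, note: "canonical case" },
  { command: "sudo apt-get install jq", blocked: true, note: "any sudo" },
  { command: "curl -sL https://example.com/install.sh | sh", blocked: true, note: "pipe to shell" },
  { command: "rm -rf ./build", blocked: false, note: "relative path passes" },
  { command: "rm -r -f /", blocked: false, note: "gap: flags split as -r -f" },
  { command: "rm /tmp/scratch.txt", blocked: true, note: "over-blocking: any absolute path" },
];

// Returns one message per case whose result differs from the table.
function runCases(cases: Case[]): string[] {
  return cases
    .filter((c) => (isDangerous(c.command) !== null) !== c.blocked)
    .map((c) => `unexpected result for "${c.command}" (${c.note})`);
}
```

The last two rows show that the `rm` pattern misses `-r -f` written as separate flags and blocks any `rm` on an absolute path, so both false negatives and false positives are worth pinning down in tests before the wrapper ships.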
+ ## 2. Dynamic Agent Registration
+
+ ### Current State in pi-crew
+ - **Static configuration**: Agents defined in `.team.md` files
+ - **No runtime API**: Can't add/remove agents after startup
+ - **Manifest-based**: Agents loaded from manifest at run start
+
+ ### How subagent4 Does It
+
+ ```typescript
+ // Global bridge for cross-module access
+ (globalThis as any).__pi_subagents = { registerAgent, unregisterAgent };
+
+ export function registerAgent(config: AgentConfig): void {
+   // Validate not already registered
+   if (agents.find((a) => a.name === config.name)) {
+     throw new Error(`Agent already registered: ${config.name}`);
+   }
+   // Check allowlist if PI_SUBAGENT_ALLOWED is set
+   if (SUBAGENT_ALLOWLIST && !SUBAGENT_ALLOWLIST.includes(config.name)) return;
+   agents.push(config);
+ }
+
+ export function unregisterAgent(name: string): void {
+   agents = agents.filter((a) => a.name !== name);
+ }
+
+ // Agent config schema
+ interface AgentConfig {
+   name: string; // "scout", "researcher", "worker"
+   model: string; // "haiku-4-5", "sonnet-4-6"
+   tools: string[]; // ["read", "grep", "find", "ls"]
+   systemPrompt?: string; // Custom system prompt
+   subagentAgents?: string[]; // For worker: ["scout", "researcher"]
+ }
+ ```
+
+ ### Implementation Options for pi-crew
+
+ **Option A: Manifest Extension API**
+ ```typescript
+ // Add to team-tool.ts
+ export function registerAgent(config: AgentConfig): void {
+   // Validate against schema
+   // Add to global agent registry
+   // Notify active runs to reload
+ }
+ ```
+
+ **Option B: globalThis Bridge (subagent4 style)**
+ ```typescript
+ // In extension/register.ts
+ (globalThis as any).__pi_crew = {
+   registerAgent: (config: AgentConfig) => { ... },
+   unregisterAgent: (name: string) => { ... },
+   listAgents: () => { ... }
+ };
+ ```
+
+ **Option C: File-based Hot Reload**
+ ```typescript
+ // Watch .team.md files for changes
+ // Reload agents on file change
+ // No API change needed
+ ```
+
+ ### Effort Assessment
+ | Aspect | Estimate |
+ |--------|----------|
+ | Code complexity | Medium (~150 lines) |
+ | Integration points | 3 (extension, team-tool, runtime) |
+ | State management | Complex (need to handle active runs) |
+ | **Total effort** | **2-3 days** |
+
+ ### Use Cases Enabled
+ 1. **Plugin system**: Third-party agents can register at runtime
+ 2. **Dynamic workflows**: Agents added based on project needs
+ 3. **A/B testing**: Swap agents without restart
+
+ ### Risks
+ - **Race conditions**: Concurrent registration could cause duplicates
+ - **State sync**: Active runs might use stale agent list
+ - **Security**: Allowlist enforcement needed to prevent unauthorized agents
+
+ ### Recommendation
+ **DEFER** - Medium effort, unclear value. Current manifest-based approach works for most use cases. Revisit if plugin system becomes a priority.
+
+ ---
+
+ ## 3. JSON Event Stream Parsing
+
+ ### Current State in pi-crew
+ - **Lifecycle events**: spawn, spawn_error, response_timeout, etc.
+ - **No tool-level events**: No visibility into what tools are running
+ - **Completion-based**: Only sees final result, not progress
+
+ ### How subagent4 Does It
+
+ ```typescript
+ // stdout JSON event stream parsing
+ child.stdout.on("data", (data) => {
+   const lines = data.toString().split("\n");
+   for (const line of lines) {
+     if (!line.trim() || !line.startsWith("{")) continue;
+     const evt = JSON.parse(line);
+
+     // Event types handled
+     if (evt.type === "tool_execution_start") {
+       // Tool started - update UI, track count
+     }
+     if (evt.type === "tool_execution_update") {
+       // Tool progress - nested subagent results
+     }
+     if (evt.type === "tool_execution_end") {
+       // Tool completed - finalize
+     }
+     if (evt.type === "message_end") {
+       // Final output + usage stats
+     }
+   }
+ });
+
+ // Tool args preview extraction
+ function extractToolArgsPreview(args: Record<string, unknown>): string {
+   if (args.command) return flatten(String(args.command));
+   if (args.path) return flatten(String(args.path));
+   if (args.query) return `"${flatten(String(args.query))}"`;
+   // ... more types
+ }
+ ```
+
+ ### Implementation Options for pi-crew
+
+ **Option A: Event Stream Bridge (Recommended)**
+ ```typescript
+ // src/runtime/event-stream-bridge.ts
+ // Parses JSON events from child stdout
+ // Emits structured events to event bus
+ // Updates task state in real-time
+
+ interface ToolEvent {
+   type: "tool_execution_start" | "tool_execution_end" | "tool_execution_update";
+   toolName: string;
+   toolCallId: string;
+   args?: Record<string, unknown>;
+   result?: unknown;
+   timestamp: number;
+ }
+ ```
+
+ **Option B: Periodic Snapshot Polling**
+ ```typescript
+ // Poll child process state every N seconds
+ // Less real-time, but simpler implementation
+ // Lower fidelity but still useful
+ ```
+
+ **Option C: Log-based Analysis**
+ ```typescript
+ // Parse .events.jsonl files after completion
+ // No real-time, but enables post-run analysis
+ // Good for debugging, not for live UI
+ ```
+
+ ### Effort Assessment
+ | Aspect | Estimate |
+ |--------|----------|
+ | Code complexity | High (~300 lines) |
+ | Integration points | 4 (child-pi, event-bus, task-runner, UI) |
+ | Error handling | Complex (malformed JSON, partial events) |
+ | **Total effort** | **3-5 days** |
+
+ ### Benefits Enabled
+ 1. **Live tool progress**: See what tools are running in real-time
+ 2. **Nested subagent visibility**: See child subagent activity
+ 3. **Token usage tracking**: Real-time context window monitoring
+ 4. **Error isolation**: Know exactly which tool failed
+ 5. **Better UX**: Progress indicators, not just spinner
+
+ ### Risks
+ - **Event format changes**: Pi might change JSON event format
+ - **Performance overhead**: JSON parsing on every stdout chunk
+ - **Buffer handling**: Partial JSON lines need buffering
+
+ ### Recommendation
+ **IMPLEMENT** - High effort, high value. This would significantly improve UX. Start with Option A and target `tool_execution_start/end` events first (most impactful).
+
+ ---
+
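Review note: section 3 above flags two hazards that the quoted `child.stdout.on("data", ...)` handler does not address. A chunk can end in the middle of a JSON line, and `JSON.parse` throws on malformed input. A minimal sketch of a line buffer that handles both follows. `JsonLineBuffer` and `StreamEvent` are hypothetical names, and the only assumption about the event format is the string `type` field that appears in the quoted code.

```typescript
// Hypothetical sketch: a chunk-safe parser for newline-delimited JSON events.
interface StreamEvent {
  type: string;
  [key: string]: unknown;
}

class JsonLineBuffer {
  // Holds the trailing partial line between chunks.
  private pending = "";

  // Feed one decoded stdout chunk; returns the complete events it finished.
  push(chunk: string): StreamEvent[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is either "" (chunk ended on a newline) or a partial line.
    this.pending = lines.pop() ?? "";

    const events: StreamEvent[] = [];
    for (const raw of lines) {
      const line = raw.trim();
      if (!line.startsWith("{")) continue; // skip blanks and plain log output
      try {
        const parsed: unknown = JSON.parse(line);
        if (
          typeof parsed === "object" &&
          parsed !== null &&
          typeof (parsed as { type?: unknown }).type === "string"
        ) {
          events.push(parsed as StreamEvent);
        }
      } catch {
        // Malformed JSON: drop the line rather than crash the stream handler.
      }
    }
    return events;
  }
}
```

In a Node child-process setup, each `data` chunk would be passed to `push` and the returned events dispatched by `type`. Calling `setEncoding("utf8")` on the stdout stream first avoids splitting multi-byte characters across chunks.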
+ ## Summary
+
+ | Feature | Effort | Value | Priority | Recommendation |
+ |---------|--------|-------|----------|----------------|
+ | Safe Bash | Low (0.5-1 day) | High | P0 | **IMPLEMENT NOW** |
+ | Dynamic Registration | Medium (2-3 days) | Medium | P2 | DEFER |
+ | JSON Event Stream | High (3-5 days) | High | P1 | **IMPLEMENT** |
+
+ ### Recommended Roadmap
+
+ **Phase 1 (This week)**
+ - Safe bash tool with pattern blocklist
+
+ **Phase 2 (Next sprint)**
+ - JSON event stream parsing for tool progress
+
+ **Phase 3 (Future)**
+ - Dynamic agent registration (if needed)