pi-crew 0.8.13 → 0.9.0
- package/CHANGELOG.md +296 -0
- package/README.md +118 -2
- package/docs/FEATURE_INTAKE.md +1 -1
- package/docs/HARNESS.md +20 -19
- package/docs/PROJECT_REVIEW.md +132 -133
- package/docs/PROJECT_REVIEW_FIXES.md +130 -131
- package/docs/actions-reference.md +127 -121
- package/docs/architecture.md +1 -1
- package/docs/code-review-2026-05-11.md +134 -134
- package/docs/commands-reference.md +108 -106
- package/docs/comparison-pi-subagents-vs-pi-crew.md +105 -105
- package/docs/deep-review-report.md +1 -1
- package/docs/dynamic-workflows.md +90 -0
- package/docs/fixes/BATCH_A_H1_H2.md +17 -17
- package/docs/fixes/bug-007-async-notifier-stale-ctx.md +23 -23
- package/docs/followup-plan-2026-05-12.md +135 -135
- package/docs/followup-review-2026-05-12.md +86 -86
- package/docs/followup-review-round3-2026-05-12.md +123 -123
- package/docs/goals.md +59 -0
- package/docs/implementation-plan-top3.md +4 -4
- package/docs/issue-29-analysis.md +2 -2
- package/docs/oh-my-pi-research.md +154 -154
- package/docs/optimization-plan.md +2 -0
- package/docs/perf/baseline-2026-05.md +9 -9
- package/docs/perf/final-report-2026-05.md +2 -2
- package/docs/perf/sprint-1-report.md +2 -2
- package/docs/perf/sprint-2-report.md +1 -1
- package/docs/perf/upgrade-plan-2026-05.md +72 -72
- package/docs/pi-crew-bugs.md +230 -230
- package/docs/pi-crew-investigation-report.md +102 -102
- package/docs/pi-crew-test-round5.md +4 -4
- package/docs/runtime-analysis-child-vs-live.md +57 -57
- package/docs/runtime-migration-in-process-analysis.md +97 -97
- package/install.mjs +3 -2
- package/package.json +2 -4
- package/skills/orchestration/SKILL.md +11 -11
- package/src/agents/agent-config.ts +4 -0
- package/src/config/config.ts +39 -0
- package/src/config/types.ts +11 -0
- package/src/extension/action-suggestions.ts +2 -1
- package/src/extension/async-notifier.ts +10 -0
- package/src/extension/help.ts +14 -0
- package/src/extension/project-init.ts +7 -20
- package/src/extension/registration/commands.ts +27 -0
- package/src/extension/team-tool/destructive-gate.ts +1 -1
- package/src/extension/team-tool/goal-wrap.ts +288 -0
- package/src/extension/team-tool/goal.ts +405 -0
- package/src/extension/team-tool/run.ts +103 -4
- package/src/extension/team-tool/workflow-manage.ts +194 -0
- package/src/extension/team-tool.ts +20 -0
- package/src/hooks/types.ts +3 -1
- package/src/runtime/async-runner.ts +24 -2
- package/src/runtime/background-runner.ts +68 -19
- package/src/runtime/child-pi.ts +6 -1
- package/src/runtime/completion-guard.ts +1 -1
- package/src/runtime/dynamic-workflow-context.ts +450 -0
- package/src/runtime/dynamic-workflow-runner.ts +180 -0
- package/src/runtime/global-worker-cap.ts +96 -0
- package/src/runtime/goal-evaluator.ts +294 -0
- package/src/runtime/goal-loop-runner.ts +612 -0
- package/src/runtime/goal-state-store.ts +209 -0
- package/src/runtime/pi-args.ts +10 -2
- package/src/runtime/result-extractor.ts +32 -0
- package/src/runtime/team-runner.ts +11 -1
- package/src/runtime/verification-gates.ts +85 -5
- package/src/runtime/verification-integrity.ts +110 -0
- package/src/runtime/verification-worktree.ts +136 -0
- package/src/runtime/workspace-lock.ts +448 -0
- package/src/schema/config-schema.ts +26 -0
- package/src/schema/team-tool-schema.ts +39 -4
- package/src/state/atomic-write.ts +9 -0
- package/src/state/contracts.ts +14 -0
- package/src/state/crew-init.ts +18 -5
- package/src/state/event-log.ts +7 -1
- package/src/state/state-store.ts +2 -0
- package/src/state/types.ts +82 -0
- package/src/state/worker-atomic-writer.ts +176 -0
- package/src/utils/redaction.ts +104 -24
- package/src/workflows/discover-workflows.ts +25 -1
- package/src/workflows/workflow-config.ts +13 -0
- package/teams/parallel-research.team.md +1 -1
- package/workflows/examples/hello.dwf.ts +24 -0
@@ -1,7 +1,7 @@
-# pi-crew Sprint 1 Report — UI
+# pi-crew Sprint 1 Report — Low-risk UI smoothness
 
 Date: 2026-05-14
-Branch: `perf/sprint-1` (
+Branch: `perf/sprint-1` (cut from `perf/baseline-bench`)
 Status: complete
 
 ## Items shipped
@@ -2,110 +2,110 @@
 
 Date: 2026-05-14
 Owner: pi-crew maintainers
-
+Base branch: `perf/baseline-bench`
 Status: in-progress (Sprint 0 starting)
 
-##
+## Purpose
 
-
+Improve performance and UI smoothness while maintaining stability, following the `AGENTS.md` task loop and the 5 current ADRs (durable state, child-process for async, depth guard, execFileSync, no parameter properties). This plan consolidates 30 analyzed upgrade items, split into 5 sprints plus 1 wrap-up phase.
 
-
+The tracks:
 
-1. **UI
-2. **Runtime/state** —
-3.
-4. **
-5. **Build/test feedback loop** — bench gate, test concurrency, watch mode,
+1. **Smoother UI** — cut sync I/O out of the render path.
+2. **Runtime/state** — reduce syscalls, lazy imports, warm pool, refactor large files.
+3. **Stability** — backpressure, heartbeat, early cancel, mailbox archive, Windows kill-tree.
+4. **Low-cost telemetry** — stream sink, OTLP gzip, sampled progress, histogram buckets.
+5. **Build/test feedback loop** — bench gate, test concurrency, watch mode, bundling.
 
-##
+## Process applied to every PR
 
-- Branch: `perf/<sprint>-<id>-<slug>`
-- Lane (
--
+- Branch: `perf/<sprint>-<id>-<slug>` cut from `perf/baseline-bench`.
+- Lane (per `AGENTS.md`): tiny / normal / high-risk.
+- Required validation before merge:
 - `npm run typecheck`
 - `npm run check:lazy-imports`
 - `npm test`
-- `npm run bench:check` —
--
-- Update `CHANGELOG.md` (
-- Update `docs/TEST_MATRIX.md`
--
--
--
+- `npm run bench:check` — no regression > 15%
+- Documentation:
+- Update `CHANGELOG.md` (grouped by sprint).
+- Update `docs/TEST_MATRIX.md` when adding tests.
+- Write an ADR for any contract change (`docs/decisions/`).
+- Do not combine > 2 items into a single PR. No "drive-by" refactors.
+- Each high-risk item must have a kill switch in config (`runtime.experimental.<feature>=false`).
 
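The `bench:check` rule in this section (no regression > 15%) can be pictured as a comparison of a current bench run against the recorded baseline. The sketch below is only an illustration under assumptions: the `BenchResults` shape, the `findRegressions` name, and the sample numbers are hypothetical, and the real layout of `test/bench/baseline.json` and the logic of `scripts/bench-check.mjs` are not shown in this diff.

```typescript
// Hypothetical shape: bench name -> measured time in ms (not the real baseline.json layout).
type BenchResults = Record<string, number>;

interface Regression {
  name: string;
  baselineMs: number;
  currentMs: number;
  changePct: number;
}

// Return every bench whose current time exceeds its baseline by more than thresholdPct.
function findRegressions(
  baseline: BenchResults,
  current: BenchResults,
  thresholdPct = 15,
): Regression[] {
  const regressions: Regression[] = [];
  for (const [name, baselineMs] of Object.entries(baseline)) {
    const currentMs = current[name];
    // Assumption: benches missing from the current run are skipped rather than failed.
    if (currentMs === undefined || baselineMs <= 0) continue;
    const changePct = ((currentMs - baselineMs) / baselineMs) * 100;
    if (changePct > thresholdPct) {
      regressions.push({ name, baselineMs, currentMs, changePct });
    }
  }
  return regressions;
}

// Illustrative numbers only: +10% passes the gate, +20% fails it.
const baseline: BenchResults = { "register-startup": 400, "render-flush": 10 };
const current: BenchResults = { "register-startup": 440, "render-flush": 12 };
const failed = findRegressions(baseline, current);
console.log(failed.map((r) => r.name)); // [ 'render-flush' ]
```

A gate script built this way would exit non-zero when the returned list is non-empty, which is what lets it block a merge.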
-## Sprint 0 — Baseline & gate (2
+## Sprint 0 — Baseline & gate (2 days)
 
-
+Goal: measure before optimizing.
 
 | ID | Task | Files | Lane |
 |---|---|---|---|
 | S0-1 | Profile script | `scripts/profile-startup.mjs` | tiny |
-| S0-2 | Bench harness 3
+| S0-2 | Bench harness, 3 files | `test/bench/{register-startup,render-flush,snapshot-cache}.bench.ts`, `test/bench/baseline.json` | normal |
 | S0-3 | `npm run bench` + `bench:check` | `package.json`, `scripts/bench-check.mjs` | tiny |
-| S0-4 |
+| S0-4 | Base branch `perf/baseline-bench` | — | — |
 | S0-5 | Capture baseline | `docs/perf/baseline-2026-05.md` | tiny |
 
-Exit criteria: `npm run bench`
+Exit criteria: `npm run bench` is stable, baseline recorded.
 
-## Sprint 1 — UI
+## Sprint 1 — Low-risk UI smoothness (5 days)
 
 | ID | Item | Lane | Acceptance |
 |---|---|---|---|
-| 1.1 | renderTick no-sync | tiny | Render skeleton
-| 1.2 | Async snapshot stamps | normal | Sync version
-| 1.4 | Stamp version counter | tiny |
-| 1.5 | Stamp agents O(1) | tiny | 1 stat/run
-| 1.8 | Powerbar dedup hash | tiny | 100
-| 1.9 | subagent.completed coalescer | tiny | 10 events
+| 1.1 | renderTick no-sync | tiny | Render skeleton when preload is not ready; test that a thrown fs.statSync does not crash. |
+| 1.2 | Async snapshot stamps | normal | Sync version only in the CLI handler; bench p95 -30%. |
+| 1.4 | Stamp version counter | tiny | Use `events.jsonl.seq` instead of `combineStamps(size)`. |
+| 1.5 | Stamp agents O(1) | tiny | 1 stat/run instead of N. |
+| 1.8 | Powerbar dedup hash | tiny | 100 emits with the same payload → 1 event. |
+| 1.9 | subagent.completed coalescer | tiny | 10 events within 30 ms → 1 invalidate. |
 | 1.10 | Mascot pause idle | tiny | Config `ui.mascotPauseIdleMs`. |
 
 Exit: `render-flush.bench.ts` -30%, `snapshot-cache.bench.ts` -20%.
 
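Item 1.8 states its acceptance test as "100 emits with the same payload → 1 event". One way such a dedup could work is to hash each payload and drop it when the hash matches the previous one. This is a minimal sketch, not the package's implementation: `createDedupEmitter`, the payload shape, and the choice of SHA-1 over `JSON.stringify` output are all assumptions made for illustration.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: wraps an emit function so a payload identical to the
// previous one is dropped. Returns true when the payload was actually emitted.
function createDedupEmitter<T>(emit: (payload: T) => void): (payload: T) => boolean {
  let lastHash: string | undefined;
  return (payload: T): boolean => {
    // Assumption: payloads are built the same way each time, so the
    // key-order sensitivity of JSON.stringify does not matter here.
    const hash = createHash("sha1").update(JSON.stringify(payload)).digest("hex");
    if (hash === lastHash) return false; // unchanged since the last emit, skip
    lastHash = hash;
    emit(payload);
    return true;
  };
}

const delivered: { text: string }[] = [];
const emitPowerbar = createDedupEmitter((p: { text: string }) => {
  delivered.push(p);
});
for (let i = 0; i < 100; i++) emitPowerbar({ text: "3 agents running" });
console.log(delivered.length); // 1
```

Keeping only the last hash, rather than a set of all seen hashes, means a payload that changes and later changes back is emitted again, which is usually what a status bar wants.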
-## Sprint 2 —
+## Sprint 2 — Cut sync I/O from the hot path (5 days)
 
 | ID | Item | Lane | Acceptance |
 |---|---|---|---|
 | 2.7 | Lazy import phase 2 | tiny | `register` end-to-end -200 ms. |
-| 2.10 | projectCrewRoot cache | tiny | 1000
-| 4.1 | Metric-sink stream | tiny | 10k
-| 4.4 | Progress sample 1/10 + first/last | tiny | 100 progress → 12
-| 2.1 | Atomic-write coalescer | normal |
-| 2.2 | Events.jsonl buffer 20 ms | normal | flushSync
+| 2.10 | projectCrewRoot cache | tiny | 1000 calls → 1 stat. |
+| 4.1 | Metric-sink stream | tiny | 10k metrics → 0 sync IO on hot path. |
+| 4.4 | Progress sample 1/10 + first/last | tiny | 100 progress events → 12 in jsonl. |
+| 2.1 | Atomic-write coalescer | normal | A crash in the window does not corrupt; recovery test. |
+| 2.2 | Events.jsonl buffer 20 ms | normal | flushSync on cleanupRuntime + session_before_switch. |
 | 2.3 | Rotation threshold 4 MB | tiny | Append 4 MB → rotate. |
-| 1.3 | FS watcher native | normal | Render < 100 ms
+| 1.3 | FS watcher native | normal | Render < 100 ms from FS event; poll fallback on ENOSYS. |
 
-Exit: 0 sync IO
+Exit: 0 sync IO in `RenderScheduler.flush`, register start ≤ 400 ms.
 
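Item 2.10 ("1000 calls → 1 stat") describes a memoized root lookup. The sketch below shows the general idea with a per-directory cache and an injected probe function so the stat count can be observed. The names `createCrewRootCache` and `probe`, and the `.crew` suffix, are hypothetical; the actual `projectCrewRoot` logic is not part of this diff.

```typescript
// Hypothetical resolver factory: `probe` stands in for whatever filesystem
// check locates the crew root for a working directory.
function createCrewRootCache(probe: (cwd: string) => string): (cwd: string) => string {
  const cache = new Map<string, string>();
  return (cwd: string): string => {
    const hit = cache.get(cwd);
    if (hit !== undefined) return hit; // served from memory, no filesystem access
    const root = probe(cwd);
    cache.set(cwd, root);
    return root;
  };
}

let statCalls = 0;
const projectCrewRoot = createCrewRootCache((cwd) => {
  statCalls++; // a real probe would stat the directory here
  return `${cwd}/.crew`; // illustrative path only
});
for (let i = 0; i < 1000; i++) projectCrewRoot("/repo");
console.log(statCalls); // 1
```

This sketch never invalidates an entry, so it assumes the root for a given directory does not move during the process lifetime.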
-## Sprint 3 — Refactor & UI selectors (5
+## Sprint 3 — Refactor & UI selectors (5 days)
 
 | ID | Item | Lane | Acceptance |
 |---|---|---|---|
-| 2.8 |
-| 2.9 |
-| 1.6 | Dashboard pane independent | normal | 1 task
-| 1.7 | Memoized snapshot slice | normal | 2
-| 5.1 | Test concurrency 4 | tiny |
+| 2.8 | Split out adaptive-plan | normal | `team-runner.ts` < 45 KB; lazy import when workflow ≠ implementation. |
+| 2.9 | Split out config.ts | normal | `config.ts` < 20 KB; hot path does not import drift/suggestions. |
+| 1.6 | Dashboard pane independent | normal | 1 task changes → only the agents-pane re-renders. |
+| 1.7 | Memoized snapshot slice | normal | 2 identical get calls on the same cache → same reference. |
+| 5.1 | Test concurrency 4 | tiny | each test does its own mkdtemp for a private PI_TEAMS_HOME. |
 
-Exit: dashboard FPS +50%
+Exit: dashboard FPS +50%
+Exit: dashboard FPS +50% while a run is active.
 
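Item 1.7 asks that two identical reads of the same cache return the same reference, which is the property a pane needs in order to skip re-rendering (item 1.6). A minimal sketch of that idea follows, assuming a snapshot that carries a version counter which changes whenever its content changes. The `Snapshot` shape and `memoizeSlice` are hypothetical and not taken from the package.

```typescript
// Hypothetical snapshot shape; the real one is not shown in this diff.
interface Snapshot {
  version: number; // assumed to change whenever the snapshot content changes
  agents: { id: string; status: string }[];
  tasks: { id: string; done: boolean }[];
}

// Memoize a derived slice on the snapshot version, so an unchanged
// snapshot yields the very same object reference on every call.
function memoizeSlice<R>(select: (s: Snapshot) => R): (s: Snapshot) => R {
  let lastVersion: number | undefined;
  let lastResult: R | undefined;
  return (s: Snapshot): R => {
    if (lastVersion === s.version) return lastResult as R;
    lastVersion = s.version;
    lastResult = select(s);
    return lastResult;
  };
}

const runningAgents = memoizeSlice((s) => s.agents.filter((a) => a.status === "running"));
const snap: Snapshot = { version: 7, agents: [{ id: "a1", status: "running" }], tasks: [] };
console.log(runningAgents(snap) === runningAgents(snap)); // true
```

Without the memo, `filter` would allocate a fresh array on each call and a reference comparison in the renderer would always report a change.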
-## Sprint 4 —
+## Sprint 4 — Stability & telemetry (4 days)
 
 | ID | Item | Lane | Acceptance |
 |---|---|---|---|
-| 3.1 | Backpressure stdout | normal | Stress 50 MB output → memory
-| 3.2 | Heartbeat backoff | tiny | Stale → poll 1 s;
+| 3.1 | Backpressure stdout | normal | Stress 50 MB output → memory does not exceed cap. |
+| 3.2 | Heartbeat backoff | tiny | Stale → poll every 1 s; healthy → every 5 s. |
 | 3.5 | Cancel propagate < 200 ms | normal | Stream-parse JSONL + signal check. |
 | 3.6 | Deadletter cooldown | tiny | Config `reliability.deadletterCooldownMs`. |
-| 3.7 | Idempotent resume
-| 3.8 | Kill-tree
+| 3.7 | Idempotent resume by attemptId | tiny | Resume 3 times → artifact is not duplicated. |
+| 3.8 | Kill-tree on Windows | normal | SIGKILL fail → `taskkill /F /T`. |
 | 3.4 | Atomic-write jitter | tiny | Jitter ±20%, max 8 attempts. |
-| 3.3 | Mailbox auto-archive | normal | 11 MB → rotate
+| 3.3 | Mailbox auto-archive | normal | 11 MB → rotate into blob-store. |
 | 4.2 | OTLP gzip + delta | tiny | Content-Encoding: gzip; counter delta. |
 | 4.3 | Histogram buckets pre-tuned | tiny | `crew.task.duration_ms` buckets `[50,200,500,1k,5k,30k,120k]`. |
 
-Exit: cancel < 200 ms, no OOM
+Exit: cancel < 200 ms, no OOM under stress, no deadletter repetition.
 
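Item 3.4 gives two numbers for the atomic-write retry: jitter of ±20% and at most 8 attempts. The sketch below shows one way those two constraints could combine in a synchronous retry loop. It is an assumption-laden illustration: `jitteredDelay`, `sleepSync`, `retrySync`, the 10 ms base delay, and the constant (non-exponential) base are all hypothetical, and the package's actual retry code is not shown in this diff.

```typescript
// Delay with ±20% jitter; `rand` returns a value in [0, 1) and is injectable for tests.
function jitteredDelay(baseMs: number, rand: () => number = Math.random): number {
  const factor = 1 + (rand() * 2 - 1) * 0.2; // in [0.8, 1.2)
  return baseMs * factor;
}

// Block the current thread for `ms`: Atomics.wait on a private buffer
// is never notified, so it simply times out.
function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

interface RetryOptions {
  baseMs?: number; // assumed default, not taken from the package
  maxAttempts?: number;
  rand?: () => number;
  sleep?: (ms: number) => void;
}

// Run `op` up to maxAttempts times, sleeping a jittered delay after each failure
// except the last one; rethrow the last error if every attempt fails.
function retrySync<T>(op: () => T, opts: RetryOptions = {}): T {
  const { baseMs = 10, maxAttempts = 8, rand = Math.random, sleep = sleepSync } = opts;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return op();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) sleep(jitteredDelay(baseMs, rand));
    }
  }
  throw lastError;
}

// Usage: an operation that fails twice, with sleep and randomness stubbed out.
let calls = 0;
const delays: number[] = [];
const result = retrySync(
  () => {
    calls++;
    if (calls < 3) throw new Error("EBUSY");
    return "written";
  },
  { baseMs: 10, rand: () => 0.5, sleep: (ms) => { delays.push(ms); } },
);
console.log(result, calls, delays);
```

The jitter spreads out retries from several writers that failed at the same moment, so they are less likely to collide again on the next attempt.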
-## Sprint 5 —
+## Sprint 5 — High-risk backlog + ADR (1 week)
 
 | ID | Item | Lane | ADR |
 |---|---|---|---|
@@ -115,33 +115,33 @@ Exit: cancel < 200 ms, no OOM under stress, no deadletter repetition.
 | 2.5 | Lazy materialize agent records | normal | — |
 | 5.2 | Watch mode test | tiny | — |
 
-
+Each item: ADR + kill switch + dual-ship migration if needed.
 
-##
+## Wrap-up
 
-- `docs/perf/sprint-<n>-report.md`
-- `docs/perf/final-report-2026-05.md`
-- Update `docs/next-upgrade-roadmap.md`
+- `docs/perf/sprint-<n>-report.md` at the end of each sprint.
+- `docs/perf/final-report-2026-05.md` comparing baseline vs final.
+- Update `docs/next-upgrade-roadmap.md` to mark completed items.
 
 ## Risk register
 
 | Risk | Sprint | Mitigation |
 |---|---|---|
-| Coalescer
-| FS watcher
-|
-| Warm pool
-| Binary index migration | 5 | Read both binary + JSONL
-| Concurrency=4 unit
+| Coalescer loses events on crash | 2 | flushSync in exit hook; crash-recovery integration test. |
+| FS watcher fails on network filesystems | 2 | Detect ENOSYS/EPERM → poll fallback. |
+| Bundling breaks Pi extension load | 5 | Prototype + smoke first; dual-ship for 1 release. |
+| Warm pool leaks state | 5 | Pool process starts fresh, has a nonce; reuse failure → discard. |
+| Binary index migration | 5 | Read both binary + JSONL for 2 releases. |
+| Concurrency=4 unit tests flaky | 3 | Audit tests using a shared HOME; each test does its own mkdtemp. |
 
-##
+## Measurement goals
 
-| Metric | Baseline (Sprint 0) | Target |
+| Metric | Baseline (Sprint 0) | Target | Expected sprint for improvement |
 |---|---|---|---|
 | `register.ts` end-to-end | TBD | < 400 ms | 2 |
-| Widget first frame
-| `runTeamTask` cold | TBD | -2
-| Dashboard FPS
+| Widget first frame after session_start | TBD | < 150 ms | 1 |
+| `runTeamTask` cold | TBD | -2 to -4 s (warm pool) | 5 |
+| Dashboard FPS while a run is active | TBD | +50% | 3 |
 | events.jsonl tail 32 KB parse | TBD | < 5 ms | 2 |
-| CPU idle
+| CPU idle when run completed | TBD | < 1% | 1 |
 | Cancel round-trip | TBD | < 200 ms | 4 |