npm - peaks-cli - Versions diffs - 1.2.8 → 1.2.9 - Mend

peaks-cli 1.2.8 → 1.2.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (31)

package/README.md +12 -0
package/bin/peaks.js +0 -0
package/dist/src/cli/commands/project-commands.js +1 -1
package/dist/src/cli/commands/scan-commands.js +22 -0
package/dist/src/services/memory/project-memory-service.d.ts +1 -1
package/dist/src/services/memory/project-memory-service.js +52 -23
package/dist/src/services/scan/libraries-service.d.ts +24 -0
package/dist/src/services/scan/libraries-service.js +419 -0
package/dist/src/services/scan/libraries-types.d.ts +59 -0
package/dist/src/services/scan/libraries-types.js +9 -0
package/dist/src/services/skills/skill-runbook-service.js +34 -1
package/dist/src/services/workflow/autonomous-resume-writer.js +7 -7
package/dist/src/shared/change-id.d.ts +30 -0
package/dist/src/shared/change-id.js +40 -6
package/dist/src/shared/paths.d.ts +1 -1
package/dist/src/shared/paths.js +2 -1
package/dist/src/shared/version.d.ts +1 -1
package/dist/src/shared/version.js +1 -1
package/package.json +1 -1
package/schemas/library-breaking-changes.data.json +141 -0
package/schemas/library-breaking-changes.meta.json +6 -0
package/schemas/library-breaking-changes.schema.json +50 -0
package/skills/peaks-qa/SKILL.md +12 -0
package/skills/peaks-rd/SKILL.md +145 -2
package/skills/peaks-solo/SKILL.md +76 -316
package/skills/peaks-solo/references/runbook.md +166 -0
package/skills/peaks-solo/references/workflow-gates-and-types.md +177 -0
package/skills/peaks-solo-resume/SKILL.md +81 -0
package/skills/peaks-solo-status/SKILL.md +120 -0
package/skills/peaks-solo-test/SKILL.md +84 -0
package/skills/peaks-txt/SKILL.md +8 -5

package/skills/peaks-rd/SKILL.md CHANGED

@@ -281,8 +281,8 @@ You cannot declare a phase complete from memory. Each gate below is a `ls` or `g
 >
 > | Type | rd:implemented requires | rd:qa-handoff also requires |
 > |---|---|---|
-> | feature / refactor | `rd/tech-doc.md` | `rd/code-review.md` + `rd/security-review.md` |
-> | bugfix | `rd/bug-analysis.md` (lighter than tech-doc; root cause + fix + regression test plan) | `rd/code-review.md` + `rd/security-review.md` |
+> | feature / refactor | `rd/tech-doc.md` | `rd/code-review.md` + `rd/security-review.md` + `rd/perf-baseline.md` (filled Results table, or `N/A — no perf surface` in Notes) + **`qa/test-cases/<rid>.md`** (added in slice 004; pre-drafted by the 4th sub-agent in the parallel fan-out) |
+> | bugfix | `rd/bug-analysis.md` (lighter than tech-doc; root cause + fix + regression test plan) | `rd/code-review.md` + `rd/security-review.md` + **`qa/test-cases/<rid>.md`**; `rd/perf-baseline.md` only when the bug is performance-shaped (matches the L449-452 "When this applies" criteria) |
 > | config | (none) | `rd/security-review.md` only |
 > | docs / chore | (none) | (none) |
 >
@@ -387,6 +387,32 @@ peaks scan diff-vs-scope --rid <rid> --project <repo> --session-id <sid> --json
 #   before re-running. Auto-allowed paths (test files, .peaks/, __mocks__/) never need a pattern.
 ```
+**Peaks-Cli Gate B9 — RD-side perf-baseline output present (when slice has a user-perceivable perf surface):**
+```bash
+ls .peaks/<id>/rd/perf-baseline.md 2>&1
+# Expected: .peaks/<id>/rd/perf-baseline.md
+# "No such file" + slice is feature / refactor / bugfix-when-perf → BLOCKED.
+#   Run the perf-baseline sub-agent from "Parallel review fan-out" below (or
+#   `peaks perf baseline --apply` inline), then fill in the Results table
+#   with measurements (lighthouse / k6 / autocannon / project-local bench —
+#   the CLI does not run these; that is the RD's job), then re-verify.
+# "No such file" + slice is docs / chore / pure-bugfix-no-perf → OK to proceed;
+#   this gate does not apply to those slice types.
+# File exists but Results table is empty (only the header row, no data rows) →
+#   BLOCKED. The sub-agent scaffolds the file; the main RD loop must fill in
+#   the Path / route | Workload | Tool | Metric | Baseline | Threshold table
+#   with actual numbers before handoff.
+# File contains the marker `N/A — no perf surface` in its Notes section →
+#   OK to proceed. This is the explicit opt-out the sub-agent writes when
+#   the slice has no user-perceivable perf surface (e.g. a feature that only
+#   adds an internal flag with no runtime cost, or a refactor that does not
+#   alter any hot path).
+#
+# The CLI enforcement table below the section header also gates this at the
+# `peaks request transition rd:qa-handoff` call, so a missing or empty file
+# is rejected by the CLI with `code: PREREQUISITES_MISSING`.
+```
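The Gate B9 outcomes listed in the comments above can be sketched as a pure function. This is a minimal illustration, not the CLI's implementation: `evaluatePerfBaselineGate`, `countResultRows`, and the `SliceType` labels are hypothetical names. The sketch also assumes that the Results table is a Markdown pipe table whose header row starts with `| Path / route`, and that finding the marker anywhere in the file is good enough (the source says it lives in the Notes section).

```typescript
// Hypothetical slice labels; the source names these slice types only in prose.
type SliceType = "feature" | "refactor" | "bugfix-perf" | "bugfix-no-perf" | "docs" | "chore";
type GateResult = "ok" | "blocked";

// Opt-out marker quoted from the gate description.
const NA_MARKER = "N/A — no perf surface";

// Count data rows of the Results table.
// Assumption: header row, then a |---| separator row, then zero or more data rows.
function countResultRows(markdown: string): number {
  const lines = markdown.split("\n").map((line) => line.trim());
  const header = lines.findIndex((line) => line.startsWith("| Path / route"));
  if (header === -1) return 0;
  let rows = 0;
  for (const line of lines.slice(header + 1)) {
    if (!line.startsWith("|")) break;           // the table has ended
    if (/^\|[\s:|-]+\|?$/.test(line)) continue; // separator row, not data
    rows += 1;
  }
  return rows;
}

// fileContent is null when rd/perf-baseline.md does not exist.
function evaluatePerfBaselineGate(slice: SliceType, fileContent: string | null): GateResult {
  const gateApplies = slice === "feature" || slice === "refactor" || slice === "bugfix-perf";
  if (!gateApplies) return "ok";                     // docs / chore / pure-bugfix-no-perf
  if (fileContent === null) return "blocked";        // "No such file"
  if (fileContent.includes(NA_MARKER)) return "ok";  // explicit opt-out
  return countResultRows(fileContent) > 0 ? "ok" : "blocked"; // header-only table blocks
}
```

Reading the file and passing its contents (or `null`) is left to the caller, so the decision logic stays testable without touching disk.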
 ## Project standards preflight
 Before RD planning or implementation work in a code repository, call the Peaks-Cli CLI:
@@ -396,6 +422,37 @@ Before RD planning or implementation work in a code repository, call the Peaks-C
 If `CLAUDE.md` is missing, treat creation as the preferred path. If `CLAUDE.md` already exists, use `standards update` to decide whether to append a managed index block or surface review-only suggestions. Apply only when write authorization exists; otherwise keep the CLI output as a preflight next action. Do not hand-write standards file mutations inside the skill.
+## Library version awareness (3rd-party breaking-change gate)
+After `peaks scan libraries` lands the dependency list under `## Library versions` in `rd/project-scan.md`, RD MUST cross-check the slice's diff against `schemas/library-breaking-changes.data.json` before writing any 3rd-party API call. Concretely:
+1. **Read the project's `## Library versions` section** in `.peaks/<session-id>/rd/project-scan.md`. Identify the `name` + `major` of every dependency the slice imports from.
+2. **Open `schemas/library-breaking-changes.data.json`** (LLM reads via the `Read` tool). For each library where the installed `major` matches a `toMajor` in the table, load the corresponding `breakingChanges[]` list.
+3. **For each `import` statement in the slice's diff** (e.g. `import { Drawer } from 'antd'`), check whether the imported symbol or its prop signature matches any `breakingChanges[].api` entry for the library's installed major.
+4. **On a hit**:
+   - **Warn the LLM in the slice's handoff**: in `.peaks/<session-id>/rd/requests/<rid>.md` under `## Implementation evidence`, append a one-line note per hit: `- [lib-version] <library> <installed version> imports <api>; breaking-change rule says use <replacement> instead.`
+   - **Persist a `lesson` memory** at the END of `.peaks/<session-id>/rd/project-scan.md` (or the tech-doc, or the handoff — any of these is read by future RD runs):
+     ```
+     <!-- peaks-memory:start -->
+     title: <library> <installed major> requires <api> → <replacement>
+     kind: lesson
+     ---
+     Observed in slice <rid>: project is on <library>@<major> and the diff imported <api> which is on the breaking-changes list. Use <replacement> instead. Source: schemas/library-breaking-changes.data.json.
+     <!-- peaks-memory:end -->
+     ```
+   - The next RD run will see this lesson in `peaks project memories` and skip the same drift.
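Steps 1 to 4 above amount to a lookup followed by a string template. The sketch below is an illustration under stated assumptions, not the package's code. The source names only `toMajor` and `breakingChanges[].api`; the `library` and `replacement` field names, every interface here, and the function `findBreakingChangeNotes` are hypothetical, and the real layout of `library-breaking-changes.data.json` may differ.

```typescript
// Assumed shapes. Only `toMajor` and `breakingChanges[].api` appear in the source.
interface BreakingChange { api: string; replacement: string; }
interface LibraryRule { library: string; toMajor: number; breakingChanges: BreakingChange[]; }
interface InstalledLibrary { name: string; version: string; major: number; }
interface ImportedSymbol { library: string; symbol: string; }

// Returns one handoff note per hit, in the "[lib-version]" format from step 4.
function findBreakingChangeNotes(
  installed: InstalledLibrary[],
  rules: LibraryRule[],
  imports: ImportedSymbol[],
): string[] {
  const notes: string[] = [];
  for (const imp of imports) {
    const lib = installed.find((l) => l.name === imp.library);
    if (!lib) continue;
    // Step 2: only rules whose toMajor equals the installed major apply.
    const rule = rules.find((r) => r.library === lib.name && r.toMajor === lib.major);
    if (!rule) continue;
    for (const change of rule.breakingChanges) {
      // Step 3: match the symbol itself ("Drawer") or one of its members ("Drawer.width").
      if (change.api === imp.symbol || change.api.startsWith(imp.symbol + ".")) {
        notes.push(
          `- [lib-version] ${lib.name} ${lib.version} imports ${change.api}; ` +
            `breaking-change rule says use ${change.replacement} instead.`,
        );
      }
    }
  }
  return notes;
}
```

A caller would pass the parsed `## Library versions` section as `installed` and the imports extracted from the diff as `imports`. Matching on the symbol name alone can over-report when a diff imports a component but never touches the changed prop, so each note is a prompt to check rather than a confirmed defect.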
+**Why this exists**: the LLM's training data lags the latest major versions. The user hit `[antd: Drawer] width is deprecated. Please use size instead` in an antd v6 project because the LLM wrote v5-style code. The breaking-changes table is the canonical place for "library X at major Y has these known migrations" so the LLM doesn't have to guess.
+**Out of scope**: the breaking-changes table is hand-curated; auto-syncing from upstream changelogs (Context7, etc.) is a follow-up slice. Per-slice the LLM only reads the table — it does NOT maintain it.
+**Data freshness check (read schemas/library-breaking-changes.meta.json first)**:
+- Before reading `schemas/library-breaking-changes.data.json`, also read `schemas/library-breaking-changes.meta.json`.
+- Compute `ageInDays = (today - meta.lastUpdated)`. The LLM is responsible for this date math.
+- If `ageInDays > meta.freshnessPolicyDays` (default 180 days), surface a **freshness warning** in the handoff: `- [data-staleness] library-breaking-changes.data.json is ${ageInDays} days old (last touched ${meta.lastUpdated}); the breaking-changes below may miss library X's recent major. Re-verify against the library's official changelog before relying on these substitutions.`
+- The warning is **informational**, not blocking. A stale table is better than no table. The LLM still applies the entries it has, just with the caveat.
+- When a row in the table matches an `import` in the diff AND the table is fresh, proceed without the warning.
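The date math in the freshness check is small enough to write out. The following is a minimal sketch, assuming `meta.lastUpdated` is an ISO date string such as `2025-01-01` (the source does not state the format). The names `FreshnessMeta`, `ageInDays`, and `freshnessWarning` are hypothetical; the 180-day default and the warning text come from the bullets above.

```typescript
// Assumed shape of library-breaking-changes.meta.json; only the two field
// names below are mentioned in the source.
interface FreshnessMeta {
  lastUpdated: string;          // assumed ISO date, e.g. "2025-01-01"
  freshnessPolicyDays?: number; // falls back to 180 when absent
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days elapsed between lastUpdated and today.
function ageInDays(lastUpdated: string, today: Date): number {
  return Math.floor((today.getTime() - new Date(lastUpdated).getTime()) / MS_PER_DAY);
}

// Returns the informational handoff line, or null when the table is fresh.
function freshnessWarning(meta: FreshnessMeta, today: Date): string | null {
  const policyDays = meta.freshnessPolicyDays ?? 180;
  const age = ageInDays(meta.lastUpdated, today);
  if (age <= policyDays) return null;
  return (
    `- [data-staleness] library-breaking-changes.data.json is ${age} days old ` +
    `(last touched ${meta.lastUpdated}); the breaking-changes below may miss library X's recent major. ` +
    `Re-verify against the library's official changelog before relying on these substitutions.`
  );
}
```

Because the warning is informational, a caller would append the returned line to the handoff and still apply the table entries.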
 ## GStack integration and code dry-runs
 Use gstack as a concrete engineering workflow reference for `Think → Plan → Build → Review → Test → Ship → Reflect`:
@@ -496,6 +553,7 @@ RD cannot mark a development slice complete until all of these are true. Each ga
 4. for frontend or UI-affecting slices, RD self-test has launched the app and used Playwright MCP for real browser end-to-end validation with visible-browser confirmation (install via `peaks mcp plan/apply --capability playwright-mcp.browser-validation --yes` if not yet present; navigate with `mcp__playwright__browser_navigate`, capture with `browser_snapshot` / `browser_take_screenshot` / `browser_console_messages` / `browser_network_requests`, sanitize route/actions and observations before retention, record acceptance result, close with `browser_close`); if login, CAPTCHA, SSO, or MFA appears, the headed browser is already visible — wait for the user to complete login and explicitly confirm completion before continuing;
 5. code review has been performed with findings recorded and CRITICAL/HIGH issues fixed before progression; unresolved CRITICAL/HIGH findings only allow a blocked handoff; **→ verified by Peaks-Cli Gate B3** — evidence file must exist at `.peaks/<id>/rd/code-review.md`
 6. security review has been performed for the changed surface, with CRITICAL/HIGH issues fixed before progression and particular attention to user input, file system access, external calls, auth, secrets, and dependency changes; **→ verified by Peaks-Cli Gate B4** — evidence file must exist at `.peaks/<id>/rd/security-review.md`
+6.5. perf-baseline output is in place for any slice with a user-perceivable performance surface — `peaks perf baseline --apply` has been run and `.peaks/<session-id>/rd/perf-baseline.md` exists with the Results table filled in with measurements (or `N/A — no perf surface` in Notes for slices without a perf surface). For docs / chore / pure-bugfix-no-perf, the file is not required. Run the fan-out from "Parallel review fan-out" below; **→ verified by Peaks-Cli Gate B9** — evidence file must exist at `.peaks/<id>/rd/perf-baseline.md` and contain a non-empty Results table or the N/A marker.
 7. the post-check dry-run has passed and is linked in the handoff;
 8. the tech-doc artifact (`.peaks/<session-id>/rd/tech-doc.md`) is written and linked from the request artifact. **→ verified by Peaks-Cli Gate B**
 9. the RD request artifact body has no unfilled placeholders, TBD markers, or bare-bullet stubs (`peaks request lint <rid> --role rd`). **→ verified by Peaks-Cli Gate B5**
@@ -505,6 +563,89 @@ RD cannot mark a development slice complete until all of these are true. Each ga
 If any gate fails, return to development for fixes or hand off as blocked. Do not describe the work as done, shippable, or ready for QA.
+## Parallel review fan-out (code-review + security-review + perf-baseline + qa-test-cases)
+**When RD reaches the end of implementation, the four review activities (code review, security review, perf baseline, AND QA test-cases draft) run in parallel via Task() sub-agents, not sequentially.** This is the same fan-out pattern peaks-solo uses for the post-PRD swarm (see `peaks-solo/SKILL.md` "Peaks-Cli Swarm parallel phase" L659-764). RD itself, when it is the main loop, behaves as a sub-agent orchestrator: it issues 4 Task() calls in a single message and waits for all to return before aggregating findings and transitioning to `qa-handoff`.
+**Why 4 sub-agents (added in slice 004):** the original 3-way fan-out (code-review + security-review + perf-baseline) cut the RD→QA wall-clock by running 3 LLM writes in parallel, but `qa/test-cases/<rid>.md` was still written sequentially by QA's main loop AFTER the RD handoff landed. Drafting QA test-cases in the same fan-out means the QA main loop's first action is "execute the pre-drafted test plan + write test-report" instead of "draft a test plan from scratch + execute + write report". Wall-clock drop: ~30-40% on the RD→QA-verdict segment for `feature` / `refactor` / `bugfix` slices.
+**When to fan out:**
+- Feature / refactor slices: all four sub-agents always run.
+- Bugfix slices: code-review + security-review + qa-test-cases always run; perf-baseline runs only when the bug is performance-shaped (matches the "When this applies" criteria in the perf-baseline section above).
+- Config / docs / chore slices: no fan-out (no review surface). Document N/A in the request artifact. (qa-test-cases also skipped — config / docs / chore have no acceptance surface to validate.)
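The "when to fan out" rules above reduce to a mapping from slice type to the set of sub-agents to launch. The sketch below is one way to encode it. `SliceType`, `SubAgent`, and `subAgentsFor` are hypothetical names, and the `perfShaped` flag stands in for the "When this applies" criteria that the skill defines elsewhere.

```typescript
type SliceType = "feature" | "refactor" | "bugfix" | "config" | "docs" | "chore";
type SubAgent = "code-review" | "security-review" | "perf-baseline" | "qa-test-cases";

// perfShaped only matters for bugfix slices; it is ignored for every other type.
function subAgentsFor(slice: SliceType, perfShaped: boolean = false): SubAgent[] {
  switch (slice) {
    case "feature":
    case "refactor":
      // All four sub-agents always run.
      return ["code-review", "security-review", "perf-baseline", "qa-test-cases"];
    case "bugfix":
      // perf-baseline joins only when the bug is performance-shaped.
      return perfShaped
        ? ["code-review", "security-review", "perf-baseline", "qa-test-cases"]
        : ["code-review", "security-review", "qa-test-cases"];
    default:
      // config / docs / chore: no fan-out; N/A is documented in the request artifact.
      return [];
  }
}
```

For example, `subAgentsFor("bugfix")` yields three sub-agents and `subAgentsFor("docs")` yields none, in which case the main loop records N/A instead of issuing any `Task()` calls.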
+**The Task() template (mirror of peaks-solo L705-717):**
+```
+Task(
+  subagent_type="general-purpose",
+  description="<role> review for rid=<rid>",
+  prompt="<role contract below>, plus runtime args: project=<repo>, session-id=<sid>, request-id=<rid>. Write your evidence file at .peaks/<sid>/<evidence-path> and return ONLY the path. Do not call Skill(...). Do not set presence. Do not prompt the user. Do not commit, push, install hooks, or mutate settings.json. Do not edit any source file — review only."
+)
+```
+Note: sub-agents 1-3 write to `rd/<evidence-path>`, sub-agent 4 writes to `qa/test-cases/<rid>.md` (QA's dir). The role name in the description differentiates them.
+**Sub-agent 1 — code-reviewer (always runs for feature / refactor / bugfix):**
+- Read the git diff for this slice (`git diff main...HEAD` or equivalent).
+- Read `.peaks/<sid>/rd/tech-doc.md` for slice intent.
+- Inspect for: correctness, type safety, error handling, mutation patterns, file-size, naming, dead code, regressions, contract drift.
+- Output: `.peaks/<sid>/rd/code-review.md` with sections: Summary, Findings (CRITICAL/HIGH/MEDIUM/LOW with file:line), Required Fixes (CRITICAL+HIGH only), Recommended (MEDIUM+LOW), Verdict (pass | return-to-rd | blocked).
+- Required for Gate B3.
+**Sub-agent 2 — security-reviewer (always runs for feature / refactor / bugfix):**
+- Read the git diff and the file list.
+- Read `.peaks/<sid>/rd/tech-doc.md` for the slice's threat model.
+- Inspect for: hardcoded secrets, unsanitized input, path traversal, SQL injection, XSS, missing auth, dependency changes, external API surface, command injection via Bash guards.
+- Output: `.peaks/<sid>/rd/security-review.md` with the same shape (Summary, Findings, Required Fixes, Recommended, Verdict).
+- Required for Gate B4.
+**Sub-agent 3 — perf-baseline-reviewer (feature / refactor / bugfix-when-perf only):**
+- Read the git diff and the slice's PRD/tech-doc for any mentioned numbers (LCP / FCP / TBT / p95 / rps).
+- Run `peaks perf baseline --project <repo> --apply --reason "parallel fan-out for rid=<rid>"` to scaffold `.peaks/<sid>/rd/perf-baseline.md` (idempotent: re-run is a no-op per `src/services/perf/perf-baseline-service.ts:188-201`).
+- Inspect the slice for a user-perceivable performance surface (route, hook, API, render, hot loop, N+1).
+- Decide: perf surface exists → leave the scaffold in place for the main RD loop to fill in the Results table with actual measurements (lighthouse / k6 / autocannon / project-local bench — the CLI does NOT run these). No perf surface → write `N/A — no perf surface` in the file's Notes section and return.
+- Output: `.peaks/<sid>/rd/perf-baseline.md` (scaffolded, or N/A stub), plus a one-line return string: `perf-baseline: scaffolded — main loop must fill Results table` OR `perf-baseline: N/A — no perf surface`.
+- Required for Gate B9. The Results-table-filling happens in the main RD loop AFTER the fan-out returns and BEFORE `rd:qa-handoff` transition.
+**Sub-agent 4 — qa-test-cases-writer (always runs for feature / refactor / bugfix; added in slice 004):**
+- Read the git diff for this slice.
+- Read `.peaks/<sid>/rd/tech-doc.md` (or `bug-analysis.md` for bugfix) for the slice's acceptance criteria.
+- Read `.peaks/<sid>/prd/requests/<rid>.md` for the user's "Acceptance criteria" section.
+- Draft the test plan: enumerate every acceptance criterion from the PRD as a separate test case; for each, write a `ts` test snippet (using vitest, jest, or the project's test framework per the existing test files), assert the expected outcome, and link the test to the PRD criterion by ID or section reference.
+- Include the standard test plan sections: ## Test cases (with `ts` code blocks), ## Test case summary (table), ## Mandatory validation gates (units / API / browser / security / performance), ## Regression matrix, ## Verdict.
+- The test cases do NOT need to be executed by this sub-agent — execution is the QA main loop's job, AFTER the RD handoff lands. The sub-agent's contract is: "produce a runnable, exhaustive, type-correct test plan that QA can execute verbatim."
+- Output: `.peaks/<sid>/qa/test-cases/<rid>.md`.
+- Required for Gate C (RD-side qa-handoff transition, added in slice 004). When this file is present at RD's qa-handoff transition, QA's main loop can skip its own "draft test plan" step and proceed directly to "execute pre-drafted test plan + write test-report + security-findings + performance-findings + verdict".
+- Failure mode: if the PRD is missing or the acceptance criteria are too vague to enumerate, this sub-agent returns `blocked` with a `blockedReason` like `prd-missing` or `acceptance-criteria-vague`; the main RD loop then escalates via AskUserQuestion before falling back to inline QA test-case drafting.
+**Hard prohibitions on all 4 sub-agents (mirror of peaks-solo L729-734):**
+- Do NOT call `Skill(skill="...")` — would re-enter RD or another skill and break the fan-out.
+- Do NOT call `peaks skill presence:set` — only the main RD loop owns presence.
+- Do NOT open interactive user prompts. If something is unclear, return `blocked` and let the main loop handle the user.
+- Do NOT commit, push, install hooks, or mutate `~/.claude/settings.json` or `.claude/settings.json`. Only the main RD loop holds those permissions.
+- Do NOT edit any source file under `src/`, `tests/`, `skills/`, `bin/`, `scripts/`, `docs/`, `schemas/`. Review only. (Sub-agent 4, qa-test-cases-writer, may write test code in the test plan body, but does NOT write to `tests/` files on disk — that is the QA main loop's job, after the verdict is issued.)
+**Aggregation (after all 4 sub-agents return):**
+1. Restore presence: `peaks skill presence:set peaks-rd --project <repo> --gate review-fan-out-converged`
+2. Run the 4 `ls` checks (Gate B3 code-review, Gate B4 security-review, Gate B9 perf-baseline, Gate C2 qa-test-cases — the last one is a new check added in slice 004).
+3. Read each evidence file. Aggregate CRITICAL/HIGH across code-review + security-review.
+4. If any CRITICAL or HIGH finding exists in code-review.md or security-review.md: fix in the main RD loop, then re-launch ONLY the affected sub-agent(s) to verify the fix. Loop until clean, or mark as blocked if the issue cannot be resolved.
+5. For perf-baseline: if scaffolded, run the project's perf measurement tool, fill in the Results table (Path / route | Workload | Tool | Metric | Baseline | Threshold), and `git diff` the file to confirm the table has data (not just the header row). If N/A, no measurement needed.
+6. For qa-test-cases: the file is now pre-drafted by sub-agent 4. The main RD loop does NOT re-draft it; it only needs the `ls` check for (a) the file exists, because the sub-agent's contract already guarantees that (b) every PRD acceptance criterion is enumerated and (c) every `ts` test snippet is syntactically valid. If the sub-agent's draft is incomplete, fix it inline in the main RD loop (small edits only) OR re-launch the sub-agent (large re-drafts).
+7. Re-run all 4 `ls` checks to confirm the evidence files are present and not empty.
+8. Only then transition `peaks request transition <rid> --role rd --state qa-handoff --project <repo> --json`.
+**Degradation when a sub-agent fails or returns blocked:**
+- code-review sub-agent fails: fall back to inline RD code review (the L486-506 Gate B3 is still required; only the fan-out is degraded). TXT handoff note: `code-review-subagent-degraded-to-inline`.
+- security-review sub-agent fails: same fallback. TXT note: `security-review-subagent-degraded-to-inline`.
+- perf-baseline sub-agent fails: same fallback. TXT note: `perf-baseline-subagent-degraded-to-inline`.
+- qa-test-cases sub-agent fails: fall back to inline QA test-case drafting at the start of QA's main loop (i.e. QA drafts the test cases itself, instead of receiving the pre-drafted file). TXT note: `qa-test-cases-subagent-degraded-to-inline-qa-draft`. The wall-clock win is reduced but not eliminated — QA's drafting is still faster than writing from scratch because the test plan can be drafted from the PRD + tech-doc directly.
+- 2 or more fail: do not hand off as clean; transition to `qa-handoff` with `--allow-incomplete --reason "<degradation>"` OR block.
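The degradation rules above can be read as a small decision function: each failed sub-agent contributes its TXT note, and two or more failures rule out a clean handoff. The sketch below illustrates that reading. `planDegradedHandoff`, `HandoffPlan`, and `cleanHandoffAllowed` are hypothetical names; the note strings are copied from the bullets.

```typescript
type SubAgent = "code-review" | "security-review" | "perf-baseline" | "qa-test-cases";

// TXT handoff notes as listed in the degradation rules.
const DEGRADATION_NOTES: Record<SubAgent, string> = {
  "code-review": "code-review-subagent-degraded-to-inline",
  "security-review": "security-review-subagent-degraded-to-inline",
  "perf-baseline": "perf-baseline-subagent-degraded-to-inline",
  "qa-test-cases": "qa-test-cases-subagent-degraded-to-inline-qa-draft",
};

interface HandoffPlan {
  notes: string[];              // one TXT note per failed sub-agent
  cleanHandoffAllowed: boolean; // false once two or more sub-agents failed
}

function planDegradedHandoff(failed: SubAgent[]): HandoffPlan {
  // De-duplicate so a sub-agent that failed twice is counted once.
  const unique = Array.from(new Set(failed));
  return {
    notes: unique.map((agent) => DEGRADATION_NOTES[agent]),
    cleanHandoffAllowed: unique.length < 2,
  };
}
```

When `cleanHandoffAllowed` is false, the main loop chooses between `--allow-incomplete --reason "<degradation>"` and blocking, as the last bullet says; the sketch does not make that choice.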
+**Why this works (3-loop repair closure):** the original 3-loop repair pain (`qa return for perf → rd fix → qa return for perf again → ...`) was caused by perf being QA-only. This fan-out moves perf measurement to the RD side AND runs it in parallel with the other reviews, so the RD handoff is complete on the first attempt instead of after several cycles. **Slice 004 extends the same pattern to QA test-cases** so the QA→verdict loop is also faster on the first attempt.
 ## Refactor hard gates
 If a request is refactor, cleanup, architecture adjustment, module split, or technical debt work:
@@ -646,4 +787,6 @@ Do not run upstream installer flows, mutate agent settings, or commit `.codegrap
 Do not bypass PRD/QA artifacts. Do not install hooks, agents, MCP, or settings. Ask the Peaks-Cli CLI to handle runtime side effects.
+Do not bypass the parallel review fan-out when the slice has a code-review / security-review / perf-baseline / qa-test-cases surface; see `## Parallel review fan-out` above for the contract. The four review activities run as a parallel fan-out, not sequentially; sequential re-implementation of the same logic by the main RD loop defeats the wall-clock benefit and is treated as a red-line violation.
 Reference: `references/refactor-workflow.md`.