npm - pi-dev - Versions diffs - 0.1.1 - Mend

pi-dev 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (40)

package/LICENSE +28 -0
package/README.md +117 -0
package/dist/cli.js +73 -0
package/dist/install.js +101 -0
package/dist/manifest.js +28 -0
package/dist/paths.js +14 -0
package/package.json +48 -0
package/presets/preferences.md +74 -0
package/skills/diagnose/SKILL.md +117 -0
package/skills/diagnose/scripts/hitl-loop.template.sh +41 -0
package/skills/do/SKILL.md +180 -0
package/skills/grill-with-docs/ADR-FORMAT.md +47 -0
package/skills/grill-with-docs/CONTEXT-FORMAT.md +77 -0
package/skills/grill-with-docs/SKILL.md +88 -0
package/skills/improve-codebase-architecture/DEEPENING.md +37 -0
package/skills/improve-codebase-architecture/INTERFACE-DESIGN.md +44 -0
package/skills/improve-codebase-architecture/LANGUAGE.md +53 -0
package/skills/improve-codebase-architecture/SKILL.md +71 -0
package/skills/migrate/SKILL.md +231 -0
package/skills/recon-with-vision/SKILL.md +106 -0
package/skills/setup/SKILL.md +121 -0
package/skills/setup/domain.md +51 -0
package/skills/setup/issue-tracker-github.md +22 -0
package/skills/setup/issue-tracker-gitlab.md +23 -0
package/skills/setup/issue-tracker-local.md +19 -0
package/skills/setup/triage-labels.md +15 -0
package/skills/taste/SKILL.md +148 -0
package/skills/tdd/SKILL.md +109 -0
package/skills/tdd/deep-modules.md +33 -0
package/skills/tdd/interface-design.md +31 -0
package/skills/tdd/mocking.md +59 -0
package/skills/tdd/refactoring.md +10 -0
package/skills/tdd/tests.md +61 -0
package/skills/to-issues/SKILL.md +81 -0
package/skills/to-prd/SKILL.md +74 -0
package/skills/triage/AGENT-BRIEF.md +168 -0
package/skills/triage/OUT-OF-SCOPE.md +101 -0
package/skills/triage/SKILL.md +111 -0
package/skills/where/SKILL.md +108 -0
package/skills/zoom-out/SKILL.md +7 -0

package/LICENSE ADDED

@@ -0,0 +1,28 @@
+MIT License
+Copyright (c) 2025 jason2077
+The engineering skills bundled in `skills/` (notably `diagnose`, `tdd`, `to-prd`,
+`to-issues`, `triage`, `grill-with-docs`, `improve-codebase-architecture`,
+`zoom-out`, `recon-with-vision`, and `setup`) derive from Matt Pocock's `skills`
+project. Their respective licenses and attributions apply. The `do`, `migrate`,
+`taste`, and `where` skills, plus the CLI under `src/`, are MIT-licensed
+additions by this project.
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

package/README.md ADDED

@@ -0,0 +1,117 @@
+# pi-dev
+> An autonomous engineering skill framework for the [pi](https://github.com/badlogic/pi) runtime.
+>
+> Built on the shoulders of [Matt Pocock's skills](https://github.com/mattpocock/skills) — the structure, vocabulary, and underlying engineering discipline are his. This package layers a **single entry point**, a **strict migration gate**, and **per-project preferences** on top so the skills can run end-to-end inside the pi runtime without ceremony.
+## What this gives you
+After one install and one onboarding, your interaction with the agent collapses to **three commands**:
+```
+/do       — do the engineering work end-to-end
+/taste    — view or update your engineering preferences
+/where    — recall prior pi sessions for this cwd
+```
+That is the whole interface. Everything else (`/diagnose`, `/tdd`, `/to-prd`, `/to-issues`, `/triage`, `/grill-with-docs`, `/improve-codebase-architecture`, `/zoom-out`, `/recon-with-vision`, `/migrate`, `/setup`) is invoked automatically by `/do` based on intent and scope.
+## Install
+Requires Node ≥ 20 and the pi runtime.
+```bash
+# Install the skills + seed your global preferences
+npx pi-dev@latest install
+# Later: refresh skills only (keeps your preferences)
+npx pi-dev@latest update
+# See what's installed
+npx pi-dev list
+# Verify your environment
+npx pi-dev doctor
+```
+What `install` does:
+- Copies the skill folders to `~/.pi/agent/skills/` (the directory the pi runtime reads).
+- Seeds `~/.pi/agent/preferences.md` with sensible global defaults — only if you do not already have one.
+What `update` does:
+- Refreshes the skill folders.
+- **Does not** overwrite your global preferences. Pass `--include-prefs` if you want to re-seed.
+## How it works
+```
+[ /do ]  ──►  bootstrap
+              ├─ migration gate (refuses to run on un-migrated repos)
+              ├─ load merged preferences (global → project → package)
+              ├─ classify intent + scope from the user's message
+              └─ execute the right chain of skills, end-to-end
+```
+The first time you run `/do` in a new repo, the gate will trigger:
+1. **`/migrate`** — audits `AGENTS.md` / `CLAUDE.md` / handoff systems / context-and-ADR layouts, normalises them, and stamps a migration marker.
+2. **`/setup`** — scaffolds `docs/agents/{issue-tracker,triage-labels,domain}.md` if needed.
+3. **`/taste`** (onboarding mode) — auto-detects project signals, asks at most a handful of questions where the project diverges from your global preferences, writes `docs/agents/preferences.md`.
+After that one-time onboarding, `/do` runs without interruption. Side effects (issue creation, label application, commits, PRs) follow your `auto-*` preferences literally.
+## Preferences in three layers
+```
+~/.pi/agent/preferences.md            # global — your engineering taste
+docs/agents/preferences.md            # project — overrides for this repo
+packages/<pkg>/preferences.md         # package — overrides for this subtree (optional)
+```
+Skills merge them in order; last write wins per key. You can edit any of these files directly. `/taste` is just a guided way to do the same thing.
+## What's a skill?
+Each skill is a directory with a `SKILL.md`. The pi runtime reads the directory name and the skill's frontmatter to expose it as `/<name>`. The contents are pure Markdown — instructions for the agent, not code. That is the whole framework.
+## Credits
+The engineering discipline encoded here — `tdd`, `diagnose`, `to-prd`, `to-issues`, `triage`, `grill-with-docs`, `improve-codebase-architecture`, `zoom-out`, `recon-with-vision`, and the `setup` template — comes from **Matt Pocock**'s skills repo. The vertical-slice / tracer-bullet vocabulary, the deletion test, the seam/adapter language, and the triage state machine are all his work.
+This package adds:
+- `do` — a one-shot orchestrator that picks the right chain
+- `migrate` — a strict gate that normalises any repo into the canonical shape
+- `taste` — three-layer preferences with project-aware onboarding
+- `where` — pi-session recall for multi-session work
+If you find these skills useful, support the upstream:
+- Matt Pocock — https://www.mattpocock.com
+- pi runtime — https://github.com/badlogic/pi
+## Release process
+Versioning, changelog, tagging, and npm publishing are fully automated:
+1. Land changes on `main` using **Conventional Commits** (`feat:`, `fix:`,
+   `perf:`, `refactor:`, `docs:`, `chore:`, etc.).
+2. [release-please](https://github.com/googleapis/release-please) opens or
+   updates a **Release PR** that bumps the version and writes `CHANGELOG.md`.
+3. Merge the Release PR. release-please tags the commit and creates a GitHub
+   Release.
+4. The `Release` workflow rebuilds, retests, and runs `npm publish --access
+   public` automatically.
+Required one-time setup (maintainer):
+- Add an `NPM_TOKEN` repository secret with a granular publish token from
+  npmjs.com (Read and write, 2FA bypass enabled).
+- If you flip the repo to public later, add `--provenance` to the publish
+  command in `.github/workflows/release.yml` for sigstore attestation.
+## License
+MIT.
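Reviewer note: the "last write wins per key" rule from the README's "Preferences in three layers" section can be pictured with a short sketch. It assumes each layer holds `- key: value` lines like the bundled preset. `parsePrefs` and `mergePrefs` are hypothetical names, not functions shipped in this package, and the value `never` is purely illustrative. The README describes the merge as something the skills do while reading Markdown, so this is only an illustration of the rule.

```javascript
// Hypothetical helper: parse "- key: value" lines into a plain object.
function parsePrefs(markdown) {
  const prefs = {};
  for (const line of markdown.split("\n")) {
    const match = /^- ([a-z0-9-]+):\s*(.+)$/.exec(line.trim());
    if (match) prefs[match[1]] = match[2].trim();
  }
  return prefs;
}

// Merge layers in read order (global, project, package); later layers win per key.
function mergePrefs(...layers) {
  return Object.assign({}, ...layers.map(parsePrefs));
}

const globalPrefs = "- verbosity: minimal\n- auto-pr: branch-push-only";
const projectPrefs = "- auto-pr: never"; // illustrative value only
const merged = mergePrefs(globalPrefs, projectPrefs);
console.log(merged);
```

Keys that appear only in an earlier layer survive unchanged; a key repeated in a later layer is replaced as a whole.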

package/dist/cli.js ADDED

@@ -0,0 +1,73 @@
+#!/usr/bin/env node
+import { readFileSync } from "node:fs";
+import { dirname, join } from "node:path";
+import { fileURLToPath } from "node:url";
+import { install, update, uninstallSkill, listInstalled, doctor } from "./install.js";
+const args = process.argv.slice(2);
+const cmd = args[0];
+function help() {
+    console.log(`pi-dev — autonomous engineering skill framework for the pi runtime
+Usage:
+  pi-dev install [--skip-prefs]       Copy skills into ~/.pi/agent/skills and seed
+                                       ~/.pi/agent/preferences.md if missing.
+  pi-dev update [--include-prefs]     Refresh skills. By default keeps your global
+                                       preferences. Pass --include-prefs to overwrite.
+  pi-dev list                         Show installed skills + global prefs path.
+  pi-dev uninstall <skill>            Soft-remove a skill (renamed to .removed-…).
+  pi-dev doctor                       Check ~/.pi layout and external CLIs.
+  pi-dev version                      Print version.
+  pi-dev help                         This message.
+After install, in any pi session, you primarily call:
+  /do      — the one-shot engineering entry point
+  /taste   — view or update preferences
+  /where   — recall prior pi sessions for this cwd
+All other skills are invoked automatically by /do.
+`);
+}
+function getFlag(name) {
+    return args.includes(`--${name}`);
+}
+switch (cmd) {
+    case "install":
+        install({ skipPrefs: getFlag("skip-prefs") });
+        break;
+    case "update":
+        update({ skipPrefs: !getFlag("include-prefs") });
+        break;
+    case "list":
+        listInstalled();
+        break;
+    case "uninstall": {
+        const name = args[1];
+        if (!name) {
+            console.error("Usage: pi-dev uninstall <skill>");
+            process.exit(1);
+        }
+        uninstallSkill(name);
+        break;
+    }
+    case "doctor":
+        doctor();
+        break;
+    case "version":
+    case "--version":
+    case "-v": {
+        const here = dirname(fileURLToPath(import.meta.url));
+        const pkg = JSON.parse(readFileSync(join(here, "..", "package.json"), "utf8"));
+        console.log(pkg.version);
+        break;
+    }
+    case "help":
+    case "--help":
+    case "-h":
+    case undefined:
+        help();
+        break;
+    default:
+        console.error(`Unknown command: ${cmd}\n`);
+        help();
+        process.exit(1);
+}

package/dist/install.js ADDED

@@ -0,0 +1,101 @@
+import { existsSync, mkdirSync, cpSync, copyFileSync, renameSync } from "node:fs";
+import { execSync } from "node:child_process";
+import { join } from "node:path";
+import { SKILLS } from "./manifest.js";
+import { PI_AGENT_DIR, PI_SKILLS_DIR, PI_GLOBAL_PREFS, PKG_SKILLS_DIR, PKG_GLOBAL_PREFS_PRESET, } from "./paths.js";
+export function install(opts = {}) {
+    mkdirSync(PI_SKILLS_DIR, { recursive: true });
+    let copied = 0;
+    for (const skill of SKILLS) {
+        const src = join(PKG_SKILLS_DIR, skill.name);
+        const dst = join(PI_SKILLS_DIR, skill.name);
+        if (!existsSync(src)) {
+            console.warn(`  skip ${skill.name} (source not found in package)`);
+            continue;
+        }
+        if (existsSync(dst) && !opts.force) {
+            // Replace contents but keep the directory; safer than rmSync for skills
+            // the user may have customised lightly.
+            cpSync(src, dst, { recursive: true, force: true });
+        }
+        else {
+            cpSync(src, dst, { recursive: true });
+        }
+        copied++;
+    }
+    console.log(`Installed ${copied} skill(s) into ${PI_SKILLS_DIR}`);
+    if (opts.skipPrefs) {
+        console.log("Skipped global preferences (use --include-prefs on update to merge in new keys).");
+        return;
+    }
+    if (!existsSync(PI_GLOBAL_PREFS)) {
+        if (existsSync(PKG_GLOBAL_PREFS_PRESET)) {
+            copyFileSync(PKG_GLOBAL_PREFS_PRESET, PI_GLOBAL_PREFS);
+            console.log(`Seeded global preferences at ${PI_GLOBAL_PREFS}`);
+        }
+    }
+    else {
+        console.log(`Existing global preferences kept at ${PI_GLOBAL_PREFS} (use --include-prefs to merge new keys).`);
+    }
+}
+export function update(opts = {}) {
+    // Update is install with force; it skips global prefs unless skipPrefs is explicitly false.
+    install({ ...opts, force: true, skipPrefs: opts.skipPrefs ?? true });
+}
+export function uninstallSkill(name) {
+    const dst = join(PI_SKILLS_DIR, name);
+    if (!existsSync(dst)) {
+        console.error(`Skill not installed: ${name}`);
+        process.exit(1);
+    }
+    // Soft delete: rename to .removed-<ts>
+    const stamp = new Date().toISOString().replace(/[:.]/g, "-");
+    const moved = join(PI_SKILLS_DIR, `.removed-${stamp}-${name}`);
+    renameSync(dst, moved);
+    console.log(`Uninstalled ${name} (moved to ${moved}).`);
+}
+export function listInstalled() {
+    console.log(`Skills directory: ${PI_SKILLS_DIR}\n`);
+    for (const skill of SKILLS) {
+        const path = join(PI_SKILLS_DIR, skill.name, "SKILL.md");
+        const installed = existsSync(path);
+        const tag = skill.kind === "human" ? "[user]   " : "[support]";
+        const status = installed ? "ok" : "missing";
+        console.log(`  ${tag} /${skill.name.padEnd(34)} ${status}  — ${skill.summary}`);
+    }
+    console.log(`\nGlobal preferences: ${existsSync(PI_GLOBAL_PREFS) ? PI_GLOBAL_PREFS : "(missing)"}`);
+}
+export function doctor() {
+    const issues = [];
+    if (!existsSync(PI_AGENT_DIR))
+        issues.push(`~/.pi/agent missing — run: pi-dev install`);
+    if (!existsSync(PI_SKILLS_DIR))
+        issues.push(`~/.pi/agent/skills missing — run: pi-dev install`);
+    for (const skill of SKILLS.filter((s) => s.kind === "human")) {
+        if (!existsSync(join(PI_SKILLS_DIR, skill.name, "SKILL.md"))) {
+            issues.push(`Human skill /${skill.name} not installed — run: pi-dev update`);
+        }
+    }
+    if (!existsSync(PI_GLOBAL_PREFS)) {
+        issues.push(`Global preferences missing — run: pi-dev install (or copy presets/preferences.md manually)`);
+    }
+    // Quick external check (best-effort)
+    const checkCmd = (cmd, label) => {
+        try {
+            execSync(`${cmd} --version`, { stdio: "ignore" });
+        }
+        catch {
+            issues.push(`${label} not found in PATH (optional but recommended).`);
+        }
+    };
+    checkCmd("git", "git");
+    checkCmd("gh", "gh CLI");
+    if (issues.length === 0) {
+        console.log("All checks passed.");
+        return;
+    }
+    console.log("Doctor report:");
+    for (const i of issues)
+        console.log("  - " + i);
+    process.exit(1);
+}
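Reviewer note: the soft-delete naming used by `uninstallSkill` is easier to see with a fixed date. The sketch below reuses the same `toISOString().replace(/[:.]/g, "-")` transformation; `removedName` is a hypothetical helper introduced only for illustration and does not exist in the package.

```javascript
// Same transformation as in uninstallSkill: ":" and "." are replaced so the
// timestamp is safe to use inside a directory name.
function removedName(skillName, date) {
  const stamp = date.toISOString().replace(/[:.]/g, "-");
  return `.removed-${stamp}-${skillName}`;
}

console.log(removedName("tdd", new Date("2025-05-08T12:34:56.789Z")));
// prints .removed-2025-05-08T12-34-56-789Z-tdd
```

Because the new name starts with a dot and carries a timestamp, repeated uninstalls of the same skill should not collide unless they happen within the same millisecond.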

package/dist/manifest.js ADDED

@@ -0,0 +1,28 @@
+/**
+ * The canonical list of skills in this framework.
+ *
+ * `human` skills are what the user calls directly: /do, /taste, /where.
+ * `support` skills are auto-invoked by /do or /migrate.
+ *
+ * The skill names match the directory names under `skills/`.
+ */
+export const SKILLS = [
+    // Human-facing entry points
+    { name: "do", kind: "human", summary: "Do the engineering work end-to-end." },
+    { name: "taste", kind: "human", summary: "View / update / onboard preferences." },
+    { name: "where", kind: "human", summary: "Recall prior pi sessions for this cwd." },
+    // Auto-invoked support skills
+    { name: "migrate", kind: "support", summary: "Strict migration gate before /do can run." },
+    { name: "setup", kind: "support", summary: "Scaffold issue-tracker / triage / domain docs." },
+    { name: "diagnose", kind: "support", summary: "Reproducible diagnosis loop for hard bugs." },
+    { name: "tdd", kind: "support", summary: "Red→green→refactor in vertical slices." },
+    { name: "to-prd", kind: "support", summary: "Synthesise current context into a PRD." },
+    { name: "to-issues", kind: "support", summary: "Break a plan into AFK-grabbable issues." },
+    { name: "triage", kind: "support", summary: "Move issues through the triage state machine." },
+    { name: "grill-with-docs", kind: "support", summary: "Stress-test a plan against domain docs." },
+    { name: "improve-codebase-architecture", kind: "support", summary: "Find deepening opportunities." },
+    { name: "zoom-out", kind: "support", summary: "Map the area before diving in." },
+    { name: "recon-with-vision", kind: "support", summary: "Lock extractor schemas via vision." },
+];
+export const HUMAN_SKILLS = SKILLS.filter((s) => s.kind === "human");
+export const SUPPORT_SKILLS = SKILLS.filter((s) => s.kind === "support");

package/dist/paths.js ADDED

@@ -0,0 +1,14 @@
+import { homedir } from "node:os";
+import { join, resolve, dirname } from "node:path";
+import { fileURLToPath } from "node:url";
+export const HOME = homedir();
+export const PI_AGENT_DIR = join(HOME, ".pi", "agent");
+export const PI_SKILLS_DIR = join(PI_AGENT_DIR, "skills");
+export const PI_GLOBAL_PREFS = join(PI_AGENT_DIR, "preferences.md");
+const __filename = fileURLToPath(import.meta.url);
+const __dirname = dirname(__filename);
+/** Resolve the package root regardless of whether we run from src/ or dist/. */
+export const PKG_ROOT = resolve(__dirname, "..");
+export const PKG_SKILLS_DIR = join(PKG_ROOT, "skills");
+export const PKG_PRESETS_DIR = join(PKG_ROOT, "presets");
+export const PKG_GLOBAL_PREFS_PRESET = join(PKG_PRESETS_DIR, "preferences.md");

package/package.json ADDED

@@ -0,0 +1,48 @@
+{
+  "name": "pi-dev",
+  "version": "0.1.1",
+  "description": "An autonomous engineering skill framework for the pi runtime — built on Matt Pocock's skills.",
+  "type": "module",
+  "bin": {
+    "pi-dev": "./dist/cli.js"
+  },
+  "files": [
+    "dist",
+    "skills",
+    "presets",
+    "README.md",
+    "LICENSE"
+  ],
+  "scripts": {
+    "build": "tsc -p tsconfig.json && chmod +x dist/cli.js",
+    "dev": "tsc -p tsconfig.json --watch",
+    "test": "vitest run",
+    "prepublishOnly": "npm run build"
+  },
+  "engines": {
+    "node": ">=20"
+  },
+  "keywords": [
+    "pi",
+    "ai",
+    "agent",
+    "skills",
+    "engineering",
+    "matt-pocock"
+  ],
+  "author": "jason2077",
+  "license": "MIT",
+  "devDependencies": {
+    "@types/node": "^22.0.0",
+    "typescript": "^5.6.0",
+    "vitest": "^2.1.9"
+  },
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/jason2077/pi-dev.git"
+  },
+  "bugs": {
+    "url": "https://github.com/jason2077/pi-dev/issues"
+  },
+  "homepage": "https://github.com/jason2077/pi-dev#readme"
+}

package/presets/preferences.md ADDED

@@ -0,0 +1,74 @@
+# Global Engineering Preferences
+> User-level defaults. Project-level files at `docs/agents/preferences.md` override these key-by-key. Package-level files at `packages/<pkg>/preferences.md` override project-level.
+>
+> Read order in skills: global → project → package. Last write wins per key.
+>
+> Edit this file directly to update.
+last-updated: 2025-05-08
+## 1. Lifecycle
+- stage: growth
+- change-budget: module
+## 2. Engineering philosophy
+- simplicity-bias: simple-first
+- completeness-bias: feature-complete
+- over-engineering-tolerance: 0
+- durability-target: long-lived
+## 3. Testing
+- test-priority-order: local-live > integration > unit
+- wait-budget-seconds: 30
+- wait-budget-exceeded-action: redesign-test
+- test-design-bar: design-first
+- coverage-scope: critical-paths
+- mocking-stance: fakes
+- local-live-policy: mandatory
+- ops-live-policy: risk-gated
+- regression-test-locations: <package>/test/regressions/<slug>.test.ts (default; override per-project)
+## 4. Architecture
+- module-depth-preference: deep
+- dedup-trigger: never-preemptive
+- adr-threshold: hard-to-reverse-only
+## 5. Process automation
+- auto-create-issues: preview-then-yes
+- auto-apply-labels: yes
+- auto-commit-per-slice: staged-only
+- auto-pr: branch-push-only
+- interrupt-on-ambiguity: confidence<0.5
+## 6. Communication
+- verbosity: minimal
+- explanation-style: decisions-only
+- language: ko
+## 7. Free notes
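+- **Live testing has two layers**: local-live (the agent boots, runs, and verifies the local app/subsystem) is **mandatory**. ops-live (real production/staging data) is risk-gated.
+- Fast verification with unit/integration tests is the baseline. Even so, going live surfaces new problems 100% of the time → never skip local-live.
+- If there is no local-live entry point, the agent **builds a harness**. The user does not act as the test runner.
+- For infrastructure dependencies (DB/external services), bring the application up live even if that means borrowing a production connection.
+- **Test design** comes before test writing. If the design is not settled, hold off on writing.
+- Do not sit idle while a single cycle runs past 30 seconds. If it exceeds that, redesign immediately.
+- Strong aversion to over-engineering. "Simple yet powerful": focus on feature completeness + efficiency.
+- Prefer durable things. Mark one-off code explicitly as one-off.
+- More conservative than the "rule of three". No preemptive extraction unless the user explicitly requests it.
+- Be conservative with AI auto-commits and PRs. Stop at staging; do not go as far as push.
+- **No handoff files.** Work state lives only in code (git), the issue tracker, and the merged prefs set.
+- **Migration is mandatory.** `/do` refuses to start in a repo that has not been migrated.
+## Risk-gated rules (default)
+`ops-live-policy: risk-gated`: enforced for external APIs / payments / authentication / schema and migrations / core business rules / releases. Skipped for internal refactoring, type cleanup, docs, and test-only changes. When in doubt, enforce.
+`local-live-policy: mandatory`: enforced for every change that affects runtime behavior. The only exceptions are docs-only / type-only / test-only diffs.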
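Reviewer note: the risk-gated rule for `ops-live-policy` reads as a three-way decision: forced triggers win, a fully skippable change is exempt, and anything ambiguous is enforced. A minimal sketch of that reading follows. The category labels and `opsLiveRequired` are hypothetical; the preset states the rule in prose for the agent and ships no such function.

```javascript
// Hypothetical category labels, taken from the wording of the preset's rule.
const FORCED = new Set([
  "external-api", "payment", "auth", "schema-migration", "core-business-rule", "release",
]);
const SKIPPABLE = new Set(["internal-refactor", "type-cleanup", "docs", "test-only"]);

// Returns true when an ops-live check should run for a change.
function opsLiveRequired(categories) {
  if (categories.some((c) => FORCED.has(c))) return true; // any forced trigger wins
  if (categories.length > 0 && categories.every((c) => SKIPPABLE.has(c))) return false;
  return true; // ambiguous or unclassified: enforce
}

console.log(opsLiveRequired(["docs"]));            // prints false
console.log(opsLiveRequired(["docs", "payment"])); // prints true
console.log(opsLiveRequired(["unknown-change"]));  // prints true
```

Defaulting to `true` for unknown categories mirrors the "when in doubt, enforce" clause.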

package/skills/diagnose/SKILL.md ADDED

@@ -0,0 +1,117 @@
+---
+name: diagnose
+description: Disciplined diagnosis loop for hard bugs and performance regressions. Reproduce → minimise → hypothesise → instrument → fix → regression-test. Use when user says "diagnose this" / "debug this", reports a bug, says something is broken/throwing/failing, or describes a performance regression.
+---
+# Diagnose
+A discipline for hard bugs. Skip phases only when explicitly justified.
+When exploring the codebase, read `docs/agents/domain.md` first if it exists, use the project's domain glossary to get a clear mental model of the relevant modules, and check ADRs in the area you're touching.
+## Phase 1 — Build a feedback loop
+**This is the skill.** Everything else is mechanical. If you have a fast, deterministic, agent-runnable pass/fail signal for the bug, you will find the cause — bisection, hypothesis-testing, and instrumentation all just consume that signal. If you don't have one, no amount of staring at code will save you.
+Spend disproportionate effort here. **Be aggressive. Be creative. Refuse to give up.**
+### Ways to construct one — try them in roughly this order
+1. **Failing test** at whatever seam reaches the bug — unit, integration, e2e.
+2. **Curl / HTTP script** against a running dev server.
+3. **CLI invocation** with a fixture input, diffing stdout against a known-good snapshot.
+4. **Headless browser script** (Playwright / Puppeteer) — drives the UI, asserts on DOM/console/network.
+5. **Replay a captured trace.** Save a real network request / payload / event log to disk; replay it through the code path in isolation.
+6. **Throwaway harness.** Spin up a minimal subset of the system (one service, mocked deps) that exercises the bug code path with a single function call.
+7. **Property / fuzz loop.** If the bug is "sometimes wrong output", run 1000 random inputs and look for the failure mode.
+8. **Bisection harness.** If the bug appeared between two known states (commit, dataset, version), automate "boot at state X, check, repeat" so you can `git bisect run` it.
+9. **Differential loop.** Run the same input through old-version vs new-version (or two configs) and diff outputs.
+10. **HITL bash script.** Last resort. If a human must click, drive _them_ with `scripts/hitl-loop.template.sh` so the loop is still structured. Captured output feeds back to you.
+Build the right feedback loop, and the bug is 90% fixed.
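Reviewer note: item 7 in the list above (the property / fuzz loop) can be sketched in a few lines. This is an illustration under stated assumptions, not part of the skill: `makeRng`, `buggySort`, `isAscending`, and `findCounterexample` are hypothetical names, the generator is a plain linear congruential one chosen only so that a failing input can be replayed from its seed, and the bug is planted on purpose.

```javascript
// Small seeded pseudo-random generator (linear congruential), so that a
// failing input can be reproduced by re-running with the same seed.
function makeRng(seed) {
  let state = seed >>> 0;
  return () => {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296;
  };
}

// Deliberately buggy function under test: Array.prototype.sort without a
// comparator sorts numbers as strings, so [10, 9] stays [10, 9].
const buggySort = (xs) => [...xs].sort();

// Property: the output must be in ascending numeric order.
const isAscending = (xs) => xs.every((x, i) => i === 0 || xs[i - 1] <= x);

// Try up to `runs` random inputs and return the first one that breaks the property.
function findCounterexample(seed, runs = 1000) {
  const rng = makeRng(seed);
  for (let run = 0; run < runs; run++) {
    const input = Array.from({ length: 5 }, () => Math.floor(rng() * 100));
    if (!isAscending(buggySort(input))) return { run, input };
  }
  return null;
}

console.log(findCounterexample(1));
```

The returned `input` is a ready-made fixture for a failing test, which feeds directly into option 1 of the same list.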
+### Iterate on the loop itself
+Treat the loop as a product. Once you have _a_ loop, ask:
+- Can I make it faster? (Cache setup, skip unrelated init, narrow the test scope.)
+- Can I make the signal sharper? (Assert on the specific symptom, not "didn't crash".)
+- Can I make it more deterministic? (Pin time, seed RNG, isolate filesystem, freeze network.)
+A 30-second flaky loop is barely better than no loop. A 2-second deterministic loop is a debugging superpower.
+### Non-deterministic bugs
+The goal is not a clean repro but a **higher reproduction rate**. Loop the trigger 100×, parallelise, add stress, narrow timing windows, inject sleeps. A 50%-flake bug is debuggable; 1% is not — keep raising the rate until it's debuggable.
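Reviewer note: "raising the rate" presupposes measuring it. A minimal sketch, assuming the trigger can be wrapped in a function that throws when the bug appears; `reproductionRate` and the stand-in `flaky` trigger are hypothetical and only illustrate the bookkeeping.

```javascript
// Run a trigger many times and report the fraction of runs that failed.
// `trigger` is any function that throws when the bug appears.
function reproductionRate(trigger, runs = 100) {
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    try {
      trigger(i);
    } catch {
      failures++;
    }
  }
  return failures / runs;
}

// Stand-in for a flaky code path: fails on every fourth call. It is
// deterministic here so the sketch is repeatable; a real trigger would
// exercise the actual buggy path.
const flaky = (i) => {
  if (i % 4 === 0) throw new Error("simulated failure");
};

console.log(reproductionRate(flaky, 100)); // prints 0.25
```

Re-running the same measurement after each change (added stress, narrower timing window) shows whether the change actually made the bug easier to hit.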
+### When you genuinely cannot build a loop
+Stop and say so explicitly. List what you tried. Ask the user for: (a) access to whatever environment reproduces it, (b) a captured artifact (HAR file, log dump, core dump, screen recording with timestamps), or (c) permission to add temporary production instrumentation. Do **not** proceed to hypothesise without a loop.
+Do not proceed to Phase 2 until you have a loop you believe in.
+## Phase 2 — Reproduce
+Run the loop. Watch the bug appear.
+Confirm:
+- [ ] The loop produces the failure mode the **user** described — not a different failure that happens to be nearby. Wrong bug = wrong fix.
+- [ ] The failure is reproducible across multiple runs (or, for non-deterministic bugs, reproducible at a high enough rate to debug against).
+- [ ] You have captured the exact symptom (error message, wrong output, slow timing) so later phases can verify the fix actually addresses it.
+Do not proceed until you reproduce the bug.
+## Phase 3 — Hypothesise
+Generate **3–5 ranked hypotheses** before testing any of them. Single-hypothesis generation anchors on the first plausible idea.
+Each hypothesis must be **falsifiable**: state the prediction it makes.
+> Format: "If <X> is the cause, then <changing Y> will make the bug disappear / <changing Z> will make it worse."
+If you cannot state the prediction, the hypothesis is a vibe — discard or sharpen it.
+**Show the ranked list to the user before testing.** They often have domain knowledge that re-ranks instantly ("we just deployed a change to #3"), or know hypotheses they've already ruled out. Cheap checkpoint, big time saver. Don't block on it — proceed with your ranking if the user is AFK.
+## Phase 4 — Instrument
+Each probe must map to a specific prediction from Phase 3. **Change one variable at a time.**
+Tool preference:
+1. **Debugger / REPL inspection** if the env supports it. One breakpoint beats ten logs.
+2. **Targeted logs** at the boundaries that distinguish hypotheses.
+3. Never "log everything and grep".
+**Tag every debug log** with a unique prefix, e.g. `[DEBUG-a4f2]`. Cleanup at the end becomes a single grep. Untagged logs survive; tagged logs die.
+**Perf branch.** For performance regressions, logs are usually wrong. Instead: establish a baseline measurement (timing harness, `performance.now()`, profiler, query plan), then bisect. Measure first, fix second.
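Reviewer note: a baseline measurement of the kind the perf branch asks for can be as small as the sketch below. It assumes a runtime where `performance.now()` is available as a global (as in current Node and browsers); `medianMs` and `workload` are hypothetical names, and the workload is a placeholder for the code path under suspicion.

```javascript
// Time a function over several runs and return the median in milliseconds.
// The median is less sensitive to one-off spikes than the mean.
function medianMs(fn, runs = 15) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    fn();
    samples.push(performance.now() - start);
  }
  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// Placeholder workload; replace with the code path under suspicion.
const workload = () => {
  let total = 0;
  for (let i = 0; i < 1e5; i++) total += Math.sqrt(i);
  return total;
};

const baseline = medianMs(workload);
console.log(`baseline median: ${baseline.toFixed(3)} ms`);
```

Recording this number at a known-good and a known-bad state gives the pass/fail threshold that a bisection needs.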
+## Phase 5 — Fix + regression test
+Write the regression test **before the fix** — but only if there is a **correct seam** for it.
+A correct seam is one where the test exercises the **real bug pattern** as it occurs at the call site. If the only available seam is too shallow (single-caller test when the bug needs multiple callers, unit test that can't replicate the chain that triggered the bug), a regression test there gives false confidence.
+**If no correct seam exists, that itself is the finding.** Note it. The codebase architecture is preventing the bug from being locked down. Flag this for the next phase.
+If a correct seam exists:
+1. Turn the minimised repro into a failing test at that seam.
+2. Watch it fail.
+3. Apply the fix.
+4. Watch it pass.
+5. Re-run the Phase 1 feedback loop against the original (un-minimised) scenario.
+## Phase 6 — Cleanup + post-mortem
+Required before declaring done:
+- [ ] Original repro no longer reproduces (re-run the Phase 1 loop)
+- [ ] Regression test passes (or absence of seam is documented)
+- [ ] All `[DEBUG-...]` instrumentation removed (`grep` the prefix)
+- [ ] Throwaway prototypes deleted (or moved to a clearly-marked debug location)
+- [ ] The hypothesis that turned out correct is stated in the commit / PR message — so the next debugger learns
+**Then ask: what would have prevented this bug?** If the answer involves architectural change (no good test seam, tangled callers, hidden coupling) hand off to the `/improve-codebase-architecture` skill with the specifics. Make the recommendation **after** the fix is in, not before — you have more information now than when you started.
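Reviewer note: the tagging rule from Phase 4 and the cleanup check from Phase 6 fit together as sketched below. `makeDebugLogger` and `leftoverDebugLines` are hypothetical helpers, not part of the skill; in practice the check is a plain `grep` for the prefix, and this only shows the same idea in code.

```javascript
// Build a logger whose every line carries one unique, greppable prefix.
function makeDebugLogger(tag, sink = console.log) {
  const prefix = `[DEBUG-${tag}]`;
  return { prefix, log: (...args) => sink(prefix, ...args) };
}

// Return the 1-based line numbers in `source` that still contain the prefix.
function leftoverDebugLines(source, prefix) {
  return source
    .split("\n")
    .map((line, index) => (line.includes(prefix) ? index + 1 : 0))
    .filter((n) => n > 0);
}

const debug = makeDebugLogger("a4f2");
debug.log("cache miss for key", "user:42");

// A source snippet that still contains one tagged log line.
const sample = 'const x = 1;\nconsole.log("[DEBUG-a4f2] x", x);\nreturn x;';
console.log(leftoverDebugLines(sample, debug.prefix)); // prints [ 2 ]
```

An empty result from the leftover check corresponds to ticking the "instrumentation removed" box in the checklist.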

package/skills/diagnose/scripts/hitl-loop.template.sh ADDED

@@ -0,0 +1,41 @@
+#!/usr/bin/env bash
+# Human-in-the-loop reproduction loop.
+# Copy this file, edit the steps below, and run it.
+# The agent runs the script; the user follows prompts in their terminal.
+#
+# Usage:
+#   bash hitl-loop.template.sh
+#
+# Two helpers:
+#   step "<instruction>"          → show instruction, wait for Enter
+#   capture VAR "<question>"      → show question, read response into VAR
+#
+# At the end, captured values are printed as KEY=VALUE for the agent to parse.
+set -euo pipefail
+step() {
+  printf '\n>>> %s\n' "$1"
+  read -r -p "    [Enter when done] " _
+}
+capture() {
+  local var="$1" question="$2" answer
+  printf '\n>>> %s\n' "$question"
+  read -r -p "    > " answer
+  printf -v "$var" '%s' "$answer"
+}
+# --- edit below ---------------------------------------------------------
+step "Open the app at http://localhost:3000 and sign in."
+capture ERRORED "Click the 'Export' button. Did it throw an error? (y/n)"
+capture ERROR_MSG "Paste the error message (or 'none'):"
+# --- edit above ---------------------------------------------------------
+printf '\n--- Captured ---\n'
+printf 'ERRORED=%s\n' "$ERRORED"
+printf 'ERROR_MSG=%s\n' "$ERROR_MSG"
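Reviewer note: the script prints its captured values as `KEY=VALUE` lines after a `--- Captured ---` marker "for the agent to parse". One way that parsing could look is sketched below; `parseCaptured` is a hypothetical helper, the sample output is made up, and the sketch assumes each captured value fits on a single line.

```javascript
// Parse the KEY=VALUE lines that follow the "--- Captured ---" marker.
function parseCaptured(output) {
  const marker = "--- Captured ---";
  const start = output.indexOf(marker);
  if (start === -1) return {};
  const captured = {};
  for (const line of output.slice(start + marker.length).split("\n")) {
    const eq = line.indexOf("="); // split on the first "=" so values may contain "="
    if (eq > 0) captured[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return captured;
}

// Made-up sample of what the script might print at the end of a run.
const sampleOutput =
  "\n--- Captured ---\nERRORED=y\nERROR_MSG=TypeError: x is undefined\n";
console.log(parseCaptured(sampleOutput));
```

Everything before the marker (the prompts and the user's keystrokes) is ignored, so the interactive part of the transcript cannot be mistaken for a captured value.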