pi-dev 0.1.1

Files changed (40)
  1. package/LICENSE +28 -0
  2. package/README.md +117 -0
  3. package/dist/cli.js +73 -0
  4. package/dist/install.js +101 -0
  5. package/dist/manifest.js +28 -0
  6. package/dist/paths.js +14 -0
  7. package/package.json +48 -0
  8. package/presets/preferences.md +74 -0
  9. package/skills/diagnose/SKILL.md +117 -0
  10. package/skills/diagnose/scripts/hitl-loop.template.sh +41 -0
  11. package/skills/do/SKILL.md +180 -0
  12. package/skills/grill-with-docs/ADR-FORMAT.md +47 -0
  13. package/skills/grill-with-docs/CONTEXT-FORMAT.md +77 -0
  14. package/skills/grill-with-docs/SKILL.md +88 -0
  15. package/skills/improve-codebase-architecture/DEEPENING.md +37 -0
  16. package/skills/improve-codebase-architecture/INTERFACE-DESIGN.md +44 -0
  17. package/skills/improve-codebase-architecture/LANGUAGE.md +53 -0
  18. package/skills/improve-codebase-architecture/SKILL.md +71 -0
  19. package/skills/migrate/SKILL.md +231 -0
  20. package/skills/recon-with-vision/SKILL.md +106 -0
  21. package/skills/setup/SKILL.md +121 -0
  22. package/skills/setup/domain.md +51 -0
  23. package/skills/setup/issue-tracker-github.md +22 -0
  24. package/skills/setup/issue-tracker-gitlab.md +23 -0
  25. package/skills/setup/issue-tracker-local.md +19 -0
  26. package/skills/setup/triage-labels.md +15 -0
  27. package/skills/taste/SKILL.md +148 -0
  28. package/skills/tdd/SKILL.md +109 -0
  29. package/skills/tdd/deep-modules.md +33 -0
  30. package/skills/tdd/interface-design.md +31 -0
  31. package/skills/tdd/mocking.md +59 -0
  32. package/skills/tdd/refactoring.md +10 -0
  33. package/skills/tdd/tests.md +61 -0
  34. package/skills/to-issues/SKILL.md +81 -0
  35. package/skills/to-prd/SKILL.md +74 -0
  36. package/skills/triage/AGENT-BRIEF.md +168 -0
  37. package/skills/triage/OUT-OF-SCOPE.md +101 -0
  38. package/skills/triage/SKILL.md +111 -0
  39. package/skills/where/SKILL.md +108 -0
  40. package/skills/zoom-out/SKILL.md +7 -0
package/LICENSE ADDED
@@ -0,0 +1,28 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 jason2077
4
+
5
+ The engineering skills bundled in `skills/` (notably `diagnose`, `tdd`, `to-prd`,
6
+ `to-issues`, `triage`, `grill-with-docs`, `improve-codebase-architecture`,
7
+ `zoom-out`, `recon-with-vision`, and `setup`) derive from Matt Pocock's `skills`
8
+ project. Their respective licenses and attributions apply. The `do`, `migrate`,
9
+ `taste`, and `where` skills, plus the CLI under `src/`, are MIT-licensed
10
+ additions by this project.
11
+
12
+ Permission is hereby granted, free of charge, to any person obtaining a copy
13
+ of this software and associated documentation files (the "Software"), to deal
14
+ in the Software without restriction, including without limitation the rights
15
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
16
+ copies of the Software, and to permit persons to whom the Software is
17
+ furnished to do so, subject to the following conditions:
18
+
19
+ The above copyright notice and this permission notice shall be included in all
20
+ copies or substantial portions of the Software.
21
+
22
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
23
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
24
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
25
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
26
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
27
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
28
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,117 @@
1
+ # pi-dev
2
+
3
+ > An autonomous engineering skill framework for the [pi](https://github.com/badlogic/pi) runtime.
4
+ >
5
+ > Built on the shoulders of [Matt Pocock's skills](https://github.com/mattpocock/skills) — the structure, vocabulary, and underlying engineering discipline are his. This package layers a **single entry point**, a **strict migration gate**, and **per-project preferences** on top so the skills can run end-to-end inside the pi runtime without ceremony.
6
+
7
+ ## What this gives you
8
+
9
+ After one install and one onboarding, your interaction with the agent collapses to **three commands**:
10
+
11
+ ```
12
+ /do — do the engineering work end-to-end
13
+ /taste — view or update your engineering preferences
14
+ /where — recall prior pi sessions for this cwd
15
+ ```
16
+
17
+ That is the whole interface. Everything else (`/diagnose`, `/tdd`, `/to-prd`, `/to-issues`, `/triage`, `/grill-with-docs`, `/improve-codebase-architecture`, `/zoom-out`, `/recon-with-vision`, `/migrate`, `/setup`) is invoked automatically by `/do` based on intent and scope.
18
+
19
+ ## Install
20
+
21
+ Requires Node ≥ 20 and the pi runtime.
22
+
23
+ ```bash
24
+ # Install the skills + seed your global preferences
25
+ npx pi-dev@latest install
26
+
27
+ # Later: refresh skills only (keeps your preferences)
28
+ npx pi-dev@latest update
29
+
30
+ # See what's installed
31
+ npx pi-dev list
32
+
33
+ # Verify your environment
34
+ npx pi-dev doctor
35
+ ```
36
+
37
+ What `install` does:
38
+
39
+ - Copies the skill folders to `~/.pi/agent/skills/` (the directory the pi runtime reads).
40
+ - Seeds `~/.pi/agent/preferences.md` with sensible global defaults — only if you do not already have one.
41
+
42
+ What `update` does:
43
+
44
+ - Refreshes the skill folders.
45
+ - **Does not** overwrite your global preferences. Pass `--include-prefs` if you want to re-seed.
46
+
47
+ ## How it works
48
+
49
+ ```
50
+ [ /do ] ──► bootstrap
51
+ ├─ migration gate (refuses to run on un-migrated repos)
52
+ ├─ load merged preferences (global → project → package)
53
+ ├─ classify intent + scope from the user's message
54
+ └─ execute the right chain of skills, end-to-end
55
+ ```
56
+
57
+ The first time you run `/do` in a new repo, the gate will trigger:
58
+
59
+ 1. **`/migrate`** — audits `AGENTS.md` / `CLAUDE.md` / handoff systems / context-and-ADR layouts, normalises them, and stamps a migration marker.
60
+ 2. **`/setup`** — scaffolds `docs/agents/{issue-tracker,triage-labels,domain}.md` if needed.
61
+ 3. **`/taste`** (onboarding mode) — auto-detects project signals, asks at most a handful of questions where the project diverges from your global preferences, writes `docs/agents/preferences.md`.
62
+
63
+ After that one-time onboarding, `/do` runs without interruption. Side effects (issue creation, label application, commits, PRs) follow your `auto-*` preferences literally.
64
+
65
+ ## Preferences in three layers
66
+
67
+ ```
68
+ ~/.pi/agent/preferences.md # global — your engineering taste
69
+ docs/agents/preferences.md # project — overrides for this repo
70
+ packages/<pkg>/preferences.md # package — overrides for this subtree (optional)
71
+ ```
72
+
73
+ Skills merge them in order; last write wins per key. You can edit any of these files directly. `/taste` is just a guided way to do the same thing.
74
+
75
+ ## What's a skill?
76
+
77
+ Each skill is a directory with a `SKILL.md`. The pi runtime reads the directory name and the skill's frontmatter to expose it as `/<name>`. The contents are pure Markdown — instructions for the agent, not code. That is the whole framework.
78
+
79
+ ## Credits
80
+
81
+ The engineering discipline encoded here — `tdd`, `diagnose`, `to-prd`, `to-issues`, `triage`, `grill-with-docs`, `improve-codebase-architecture`, `zoom-out`, `recon-with-vision`, and the `setup` template — comes from **Matt Pocock**'s skills repo. The vertical-slice / tracer-bullet vocabulary, the deletion test, the seam/adapter language, and the triage state machine are all his work.
82
+
83
+ This package adds:
84
+
85
+ - `do` — a one-shot orchestrator that picks the right chain
86
+ - `migrate` — a strict gate that normalises any repo into the canonical shape
87
+ - `taste` — three-layer preferences with project-aware onboarding
88
+ - `where` — pi-session recall for multi-session work
89
+
90
+ If you find these skills useful, support the upstream:
91
+
92
+ - Matt Pocock — https://www.mattpocock.com
93
+ - pi runtime — https://github.com/badlogic/pi
94
+
95
+ ## Release process
96
+
97
+ Versioning, changelog, tagging, and npm publishing are fully automated:
98
+
99
+ 1. Land changes on `main` using **Conventional Commits** (`feat:`, `fix:`,
100
+ `perf:`, `refactor:`, `docs:`, `chore:`, etc.).
101
+ 2. [release-please](https://github.com/googleapis/release-please) opens or
102
+ updates a **Release PR** that bumps the version and writes `CHANGELOG.md`.
103
+ 3. Merge the Release PR. release-please tags the commit and creates a GitHub
104
+ Release.
105
+ 4. The `Release` workflow rebuilds, retests, and runs `npm publish --access
106
+ public` automatically.
107
+
108
+ Required one-time setup (maintainer):
109
+
110
+ - Add an `NPM_TOKEN` repository secret with a granular publish token from
111
+ npmjs.com (Read and write, 2FA bypass enabled).
112
+ - If you flip the repo to public later, add `--provenance` to the publish
113
+ command in `.github/workflows/release.yml` for sigstore attestation.
114
+
115
+ ## License
116
+
117
+ MIT.
package/dist/cli.js ADDED
@@ -0,0 +1,73 @@
1
+ #!/usr/bin/env node
2
+ import { readFileSync } from "node:fs";
3
+ import { dirname, join } from "node:path";
4
+ import { fileURLToPath } from "node:url";
5
+ import { install, update, uninstallSkill, listInstalled, doctor } from "./install.js";
6
+ const args = process.argv.slice(2);
7
+ const cmd = args[0];
8
+ function help() {
9
+ console.log(`pi-dev — autonomous engineering skill framework for the pi runtime
10
+
11
+ Usage:
12
+ pi-dev install [--skip-prefs] Copy skills into ~/.pi/agent/skills and seed
13
+ ~/.pi/agent/preferences.md if missing.
14
+ pi-dev update [--include-prefs] Refresh skills. By default keeps your global
15
+ preferences. Pass --include-prefs to overwrite.
16
+ pi-dev list Show installed skills + global prefs path.
17
+ pi-dev uninstall <skill> Soft-remove a skill (renamed to .removed-…).
18
+ pi-dev doctor Check ~/.pi layout and external CLIs.
19
+ pi-dev version Print version.
20
+ pi-dev help This message.
21
+
22
+ After install, in any pi session, you primarily call:
23
+ /do — the one-shot engineering entry point
24
+ /taste — view or update preferences
25
+ /where — recall prior pi sessions for this cwd
26
+
27
+ All other skills are invoked automatically by /do.
28
+ `);
29
+ }
30
+ function getFlag(name) {
31
+ return args.includes(`--${name}`);
32
+ }
33
+ switch (cmd) {
34
+ case "install":
35
+ install({ skipPrefs: getFlag("skip-prefs") });
36
+ break;
37
+ case "update":
38
+ update({ skipPrefs: !getFlag("include-prefs") });
39
+ break;
40
+ case "list":
41
+ listInstalled();
42
+ break;
43
+ case "uninstall": {
44
+ const name = args[1];
45
+ if (!name) {
46
+ console.error("Usage: pi-dev uninstall <skill>");
47
+ process.exit(1);
48
+ }
49
+ uninstallSkill(name);
50
+ break;
51
+ }
52
+ case "doctor":
53
+ doctor();
54
+ break;
55
+ case "version":
56
+ case "--version":
57
+ case "-v": {
58
+ const here = dirname(fileURLToPath(import.meta.url));
59
+ const pkg = JSON.parse(readFileSync(join(here, "..", "package.json"), "utf8"));
60
+ console.log(pkg.version);
61
+ break;
62
+ }
63
+ case "help":
64
+ case "--help":
65
+ case "-h":
66
+ case undefined:
67
+ help();
68
+ break;
69
+ default:
70
+ console.error(`Unknown command: ${cmd}\n`);
71
+ help();
72
+ process.exit(1);
73
+ }
package/dist/install.js ADDED
@@ -0,0 +1,101 @@
1
+ import { existsSync, mkdirSync, cpSync, copyFileSync, renameSync } from "node:fs";
2
+ import { execSync } from "node:child_process";
3
+ import { join } from "node:path";
4
+ import { SKILLS } from "./manifest.js";
5
+ import { PI_AGENT_DIR, PI_SKILLS_DIR, PI_GLOBAL_PREFS, PKG_SKILLS_DIR, PKG_GLOBAL_PREFS_PRESET, } from "./paths.js";
6
+ export function install(opts = {}) {
7
+ mkdirSync(PI_SKILLS_DIR, { recursive: true });
8
+ let copied = 0;
9
+ for (const skill of SKILLS) {
10
+ const src = join(PKG_SKILLS_DIR, skill.name);
11
+ const dst = join(PI_SKILLS_DIR, skill.name);
12
+ if (!existsSync(src)) {
13
+ console.warn(` skip ${skill.name} (source not found in package)`);
14
+ continue;
15
+ }
16
+ if (existsSync(dst) && !opts.force) {
17
+ // Replace contents but keep the directory; safer than rmSync for skills
18
+ // the user may have customised lightly.
19
+ cpSync(src, dst, { recursive: true, force: true });
20
+ }
21
+ else {
22
+ cpSync(src, dst, { recursive: true });
23
+ }
24
+ copied++;
25
+ }
26
+ console.log(`Installed ${copied} skill(s) into ${PI_SKILLS_DIR}`);
27
+ if (opts.skipPrefs) {
28
+ console.log("Skipped global preferences (use --include-prefs on update to merge in new keys).");
29
+ return;
30
+ }
31
+ if (!existsSync(PI_GLOBAL_PREFS)) {
32
+ if (existsSync(PKG_GLOBAL_PREFS_PRESET)) {
33
+ copyFileSync(PKG_GLOBAL_PREFS_PRESET, PI_GLOBAL_PREFS);
34
+ console.log(`Seeded global preferences at ${PI_GLOBAL_PREFS}`);
35
+ }
36
+ }
37
+ else {
38
+ console.log(`Existing global preferences kept at ${PI_GLOBAL_PREFS} (use --include-prefs to merge new keys).`);
39
+ }
40
+ }
41
+ export function update(opts = {}) {
42
+ // Update is install with force, but defaults to NOT overwriting global prefs.
43
+ install({ ...opts, force: true, skipPrefs: opts.skipPrefs ?? true });
44
+ }
45
+ export function uninstallSkill(name) {
46
+ const dst = join(PI_SKILLS_DIR, name);
47
+ if (!existsSync(dst)) {
48
+ console.error(`Skill not installed: ${name}`);
49
+ process.exit(1);
50
+ }
51
+ // Soft delete: rename to .removed-<ts>
52
+ const stamp = new Date().toISOString().replace(/[:.]/g, "-");
53
+ const moved = join(PI_SKILLS_DIR, `.removed-${stamp}-${name}`);
54
+ renameSync(dst, moved);
55
+ console.log(`Uninstalled ${name} (moved to ${moved}).`);
56
+ }
57
+ export function listInstalled() {
58
+ console.log(`Skills directory: ${PI_SKILLS_DIR}\n`);
59
+ for (const skill of SKILLS) {
60
+ const path = join(PI_SKILLS_DIR, skill.name, "SKILL.md");
61
+ const installed = existsSync(path);
62
+ const tag = skill.kind === "human" ? "[user] " : "[support]";
63
+ const status = installed ? "ok" : "missing";
64
+ console.log(` ${tag} /${skill.name.padEnd(34)} ${status} — ${skill.summary}`);
65
+ }
66
+ console.log(`\nGlobal preferences: ${existsSync(PI_GLOBAL_PREFS) ? PI_GLOBAL_PREFS : "(missing)"}`);
67
+ }
68
+ export function doctor() {
69
+ const issues = [];
70
+ if (!existsSync(PI_AGENT_DIR))
71
+ issues.push(`~/.pi/agent missing — run: pi-dev install`);
72
+ if (!existsSync(PI_SKILLS_DIR))
73
+ issues.push(`~/.pi/agent/skills missing — run: pi-dev install`);
74
+ for (const skill of SKILLS.filter((s) => s.kind === "human")) {
75
+ if (!existsSync(join(PI_SKILLS_DIR, skill.name, "SKILL.md"))) {
76
+ issues.push(`Human skill /${skill.name} not installed — run: pi-dev update`);
77
+ }
78
+ }
79
+ if (!existsSync(PI_GLOBAL_PREFS)) {
80
+ issues.push(`Global preferences missing — run: pi-dev install (or copy presets/preferences.md manually)`);
81
+ }
82
+ // Quick external check (best-effort)
83
+ const checkCmd = (cmd, label) => {
84
+ try {
85
+ execSync(`${cmd} --version`, { stdio: "ignore" });
86
+ }
87
+ catch {
88
+ issues.push(`${label} not found in PATH (optional but recommended).`);
89
+ }
90
+ };
91
+ checkCmd("git", "git");
92
+ checkCmd("gh", "gh CLI");
93
+ if (issues.length === 0) {
94
+ console.log("All checks passed.");
95
+ return;
96
+ }
97
+ console.log("Doctor report:");
98
+ for (const i of issues)
99
+ console.log(" - " + i);
100
+ process.exit(1);
101
+ }
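The CLI passes `skipPrefs: !getFlag("include-prefs")` into `update`, so `update` needs a default that skips preferences when the option is unset but still honours an explicit `false`. Below is a minimal sketch of that defaulting, contrasted with a truthiness-based ternary that collapses every input to `true`. `resolveSkipPrefs` and `ternarySkipPrefs` are hypothetical helper names used only for illustration; they are not part of the package.

```javascript
// Hypothetical helpers for illustration; neither name exists in pi-dev.

// Nullish default: only `undefined`/`null` fall back to `true`,
// so an explicit `false` (from --include-prefs) is preserved.
function resolveSkipPrefs(opts = {}) {
  return opts.skipPrefs ?? true;
}

// Truthiness-based ternary: `false` and `undefined` are both falsy,
// so every input ends up as `true`.
function ternarySkipPrefs(opts = {}) {
  return !opts.skipPrefs ? true : opts.skipPrefs;
}

for (const opts of [{}, { skipPrefs: true }, { skipPrefs: false }]) {
  console.log(JSON.stringify(opts), resolveSkipPrefs(opts), ternarySkipPrefs(opts));
}
```

Note that even with `skipPrefs: false`, `install` as shown only seeds the preferences file when it does not already exist, so an existing file is kept either way.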
package/dist/manifest.js ADDED
@@ -0,0 +1,28 @@
1
+ /**
2
+ * The canonical list of skills in this framework.
3
+ *
4
+ * `human` skills are what the user calls directly: /do, /taste, /where.
5
+ * `support` skills are auto-invoked by /do or /migrate.
6
+ *
7
+ * The skill names match the directory names under `skills/`.
8
+ */
9
+ export const SKILLS = [
10
+ // Human-facing entry points
11
+ { name: "do", kind: "human", summary: "Do the engineering work end-to-end." },
12
+ { name: "taste", kind: "human", summary: "View / update / onboard preferences." },
13
+ { name: "where", kind: "human", summary: "Recall prior pi sessions for this cwd." },
14
+ // Auto-invoked support skills
15
+ { name: "migrate", kind: "support", summary: "Strict migration gate before /do can run." },
16
+ { name: "setup", kind: "support", summary: "Scaffold issue-tracker / triage / domain docs." },
17
+ { name: "diagnose", kind: "support", summary: "Reproducible diagnosis loop for hard bugs." },
18
+ { name: "tdd", kind: "support", summary: "Red→green→refactor in vertical slices." },
19
+ { name: "to-prd", kind: "support", summary: "Synthesise current context into a PRD." },
20
+ { name: "to-issues", kind: "support", summary: "Break a plan into AFK-grabbable issues." },
21
+ { name: "triage", kind: "support", summary: "Move issues through the triage state machine." },
22
+ { name: "grill-with-docs", kind: "support", summary: "Stress-test a plan against domain docs." },
23
+ { name: "improve-codebase-architecture", kind: "support", summary: "Find deepening opportunities." },
24
+ { name: "zoom-out", kind: "support", summary: "Map the area before diving in." },
25
+ { name: "recon-with-vision", kind: "support", summary: "Lock extractor schemas via vision." },
26
+ ];
27
+ export const HUMAN_SKILLS = SKILLS.filter((s) => s.kind === "human");
28
+ export const SUPPORT_SKILLS = SKILLS.filter((s) => s.kind === "support");
package/dist/paths.js ADDED
@@ -0,0 +1,14 @@
1
+ import { homedir } from "node:os";
2
+ import { join, resolve, dirname } from "node:path";
3
+ import { fileURLToPath } from "node:url";
4
+ export const HOME = homedir();
5
+ export const PI_AGENT_DIR = join(HOME, ".pi", "agent");
6
+ export const PI_SKILLS_DIR = join(PI_AGENT_DIR, "skills");
7
+ export const PI_GLOBAL_PREFS = join(PI_AGENT_DIR, "preferences.md");
8
+ const __filename = fileURLToPath(import.meta.url);
9
+ const __dirname = dirname(__filename);
10
+ /** Resolve the package root regardless of whether we run from src/ or dist/. */
11
+ export const PKG_ROOT = resolve(__dirname, "..");
12
+ export const PKG_SKILLS_DIR = join(PKG_ROOT, "skills");
13
+ export const PKG_PRESETS_DIR = join(PKG_ROOT, "presets");
14
+ export const PKG_GLOBAL_PREFS_PRESET = join(PKG_PRESETS_DIR, "preferences.md");
package/package.json ADDED
@@ -0,0 +1,48 @@
1
+ {
2
+ "name": "pi-dev",
3
+ "version": "0.1.1",
4
+ "description": "An autonomous engineering skill framework for the pi runtime — built on Matt Pocock's skills.",
5
+ "type": "module",
6
+ "bin": {
7
+ "pi-dev": "./dist/cli.js"
8
+ },
9
+ "files": [
10
+ "dist",
11
+ "skills",
12
+ "presets",
13
+ "README.md",
14
+ "LICENSE"
15
+ ],
16
+ "scripts": {
17
+ "build": "tsc -p tsconfig.json && chmod +x dist/cli.js",
18
+ "dev": "tsc -p tsconfig.json --watch",
19
+ "test": "vitest run",
20
+ "prepublishOnly": "npm run build"
21
+ },
22
+ "engines": {
23
+ "node": ">=20"
24
+ },
25
+ "keywords": [
26
+ "pi",
27
+ "ai",
28
+ "agent",
29
+ "skills",
30
+ "engineering",
31
+ "matt-pocock"
32
+ ],
33
+ "author": "jason2077",
34
+ "license": "MIT",
35
+ "devDependencies": {
36
+ "@types/node": "^22.0.0",
37
+ "typescript": "^5.6.0",
38
+ "vitest": "^2.1.9"
39
+ },
40
+ "repository": {
41
+ "type": "git",
42
+ "url": "git+https://github.com/jason2077/pi-dev.git"
43
+ },
44
+ "bugs": {
45
+ "url": "https://github.com/jason2077/pi-dev/issues"
46
+ },
47
+ "homepage": "https://github.com/jason2077/pi-dev#readme"
48
+ }
package/presets/preferences.md ADDED
@@ -0,0 +1,74 @@
1
+ # Global Engineering Preferences
2
+
3
+ > User-level defaults. Project-level files at `docs/agents/preferences.md` override these key-by-key. Package-level files at `packages/<pkg>/preferences.md` override project-level.
4
+ >
5
+ > Read order in skills: global → project → package. Last write wins per key.
6
+ >
7
+ > Edit this file directly to update.
8
+
9
+ last-updated: 2025-05-08
10
+
11
+ ## 1. Lifecycle
12
+
13
+ - stage: growth
14
+ - change-budget: module
15
+
16
+ ## 2. Engineering philosophy
17
+
18
+ - simplicity-bias: simple-first
19
+ - completeness-bias: feature-complete
20
+ - over-engineering-tolerance: 0
21
+ - durability-target: long-lived
22
+
23
+ ## 3. Testing
24
+
25
+ - test-priority-order: local-live > integration > unit
26
+ - wait-budget-seconds: 30
27
+ - wait-budget-exceeded-action: redesign-test
28
+ - test-design-bar: design-first
29
+ - coverage-scope: critical-paths
30
+ - mocking-stance: fakes
31
+ - local-live-policy: mandatory
32
+ - ops-live-policy: risk-gated
33
+ - regression-test-locations: <package>/test/regressions/<slug>.test.ts (default; override per-project)
34
+
35
+ ## 4. Architecture
36
+
37
+ - module-depth-preference: deep
38
+ - dedup-trigger: never-preemptive
39
+ - adr-threshold: hard-to-reverse-only
40
+
41
+ ## 5. Process automation
42
+
43
+ - auto-create-issues: preview-then-yes
44
+ - auto-apply-labels: yes
45
+ - auto-commit-per-slice: staged-only
46
+ - auto-pr: branch-push-only
47
+ - interrupt-on-ambiguity: confidence<0.5
48
+
49
+ ## 6. Communication
50
+
51
+ - verbosity: minimal
52
+ - explanation-style: decisions-only
53
+ - language: ko
54
+
55
+ ## 7. Free notes
56
+
57
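+ - **Live testing has two tiers**: local-live (the agent boots, runs, and verifies the local app or subsystem) is **mandatory**. ops-live (real data in production/staging) is risk-gated.
58
+ - Fast verification with unit/integration tests is the baseline. Even so, going live surfaces new problems 100% of the time → never skip local-live.
59
+ - If there is no local live entry point, the agent **builds a harness**. The user does not act as the test runner.
60
+ - For infrastructure dependencies (DB/external services), bring the application up live even if that means borrowing a production connection.
61
+ - **Test design** comes before test writing. If the design is not settled, hold off on writing.
62
+ - Do not sit idle through a cycle longer than 30 seconds. If it goes over, redesign immediately.
63
+ - Strong aversion to over-engineering. "Simple yet powerful": focus on feature completeness + efficiency.
64
+ - Prefer durable things. Mark one-off code explicitly as one-off.
65
+ - More conservative than the "rule of three". No preemptive extraction unless the user explicitly asks for it.
66
+ - Be conservative with AI auto-commits and PRs. Do not go as far as push; stop at stage.
67
+ - **No handoff files.** Work state lives only in code (git) / the issue tracker / the merged prefs set.
68
+ - **Migration is mandatory.** `/do` refuses to start in a repo that has not been migrated.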
69
+
70
+ ## Risk-gated rules (default)
71
+
72
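+ `ops-live-policy: risk-gated`. Forced triggers: external APIs / payments / auth / schema and migrations / core business rules / releases. Skipped for internal refactoring, type cleanup, docs, and test-only changes. When in doubt, force it.
73
+
74
+ `local-live-policy: mandatory`. Enforced for every change that affects runtime behaviour. The only exceptions are docs-only / type-only / test-only diffs.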
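The "global → project → package, last write wins per key" rule stated at the top of this preferences file can be illustrated mechanically. The skills are Markdown instructions, so in practice the agent performs the merge; the sketch below only shows the override semantics. `parsePrefs` and `mergePrefs` are hypothetical names, and the `maintenance` and `function` values are invented for the example.

```javascript
// Hypothetical sketch: parse "- key: value" lines and merge layers so that
// later layers override earlier ones per key. Not the package's own code.
function parsePrefs(markdown) {
  const prefs = {};
  for (const line of markdown.split("\n")) {
    // Match list items such as "- stage: growth"; headings and
    // free-form notes that do not start with a lowercase key are skipped.
    const m = /^\s*-\s+([a-z][a-z0-9-]*):\s*(.+?)\s*$/.exec(line);
    if (m) prefs[m[1]] = m[2];
  }
  return prefs;
}

function mergePrefs(...layers) {
  // Object.assign applies sources left to right, so the last layer wins.
  return Object.assign({}, ...layers.map(parsePrefs));
}

const globalPrefs = "## 1. Lifecycle\n- stage: growth\n- change-budget: module\n";
const projectPrefs = "- stage: maintenance\n"; // invented override
const packagePrefs = "- change-budget: function\n"; // invented override

console.log(mergePrefs(globalPrefs, projectPrefs, packagePrefs));
```

Keys that a later layer does not mention keep the value from the earlier layer, which is why a project file only needs to list its overrides.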
package/skills/diagnose/SKILL.md ADDED
@@ -0,0 +1,117 @@
1
+ ---
2
+ name: diagnose
3
+ description: Disciplined diagnosis loop for hard bugs and performance regressions. Reproduce → minimise → hypothesise → instrument → fix → regression-test. Use when user says "diagnose this" / "debug this", reports a bug, says something is broken/throwing/failing, or describes a performance regression.
4
+ ---
5
+
6
+ # Diagnose
7
+
8
+ A discipline for hard bugs. Skip phases only when explicitly justified.
9
+
10
+ When exploring the codebase, read `docs/agents/domain.md` first if it exists, use the project's domain glossary to get a clear mental model of the relevant modules, and check ADRs in the area you're touching.
11
+
12
+ ## Phase 1 — Build a feedback loop
13
+
14
+ **This is the skill.** Everything else is mechanical. If you have a fast, deterministic, agent-runnable pass/fail signal for the bug, you will find the cause — bisection, hypothesis-testing, and instrumentation all just consume that signal. If you don't have one, no amount of staring at code will save you.
15
+
16
+ Spend disproportionate effort here. **Be aggressive. Be creative. Refuse to give up.**
17
+
18
+ ### Ways to construct one — try them in roughly this order
19
+
20
+ 1. **Failing test** at whatever seam reaches the bug — unit, integration, e2e.
21
+ 2. **Curl / HTTP script** against a running dev server.
22
+ 3. **CLI invocation** with a fixture input, diffing stdout against a known-good snapshot.
23
+ 4. **Headless browser script** (Playwright / Puppeteer) — drives the UI, asserts on DOM/console/network.
24
+ 5. **Replay a captured trace.** Save a real network request / payload / event log to disk; replay it through the code path in isolation.
25
+ 6. **Throwaway harness.** Spin up a minimal subset of the system (one service, mocked deps) that exercises the bug code path with a single function call.
26
+ 7. **Property / fuzz loop.** If the bug is "sometimes wrong output", run 1000 random inputs and look for the failure mode.
27
+ 8. **Bisection harness.** If the bug appeared between two known states (commit, dataset, version), automate "boot at state X, check, repeat" so you can `git bisect run` it.
28
+ 9. **Differential loop.** Run the same input through old-version vs new-version (or two configs) and diff outputs.
29
+ 10. **HITL bash script.** Last resort. If a human must click, drive _them_ with `scripts/hitl-loop.template.sh` so the loop is still structured. Captured output feeds back to you.
30
+
31
+ Build the right feedback loop, and the bug is 90% fixed.
32
+
33
+ ### Iterate on the loop itself
34
+
35
+ Treat the loop as a product. Once you have _a_ loop, ask:
36
+
37
+ - Can I make it faster? (Cache setup, skip unrelated init, narrow the test scope.)
38
+ - Can I make the signal sharper? (Assert on the specific symptom, not "didn't crash".)
39
+ - Can I make it more deterministic? (Pin time, seed RNG, isolate filesystem, freeze network.)
40
+
41
+ A 30-second flaky loop is barely better than no loop. A 2-second deterministic loop is a debugging superpower.
42
+
43
+ ### Non-deterministic bugs
44
+
45
+ The goal is not a clean repro but a **higher reproduction rate**. Loop the trigger 100×, parallelise, add stress, narrow timing windows, inject sleeps. A 50%-flake bug is debuggable; 1% is not — keep raising the rate until it's debuggable.
46
+
47
+ ### When you genuinely cannot build a loop
48
+
49
+ Stop and say so explicitly. List what you tried. Ask the user for: (a) access to whatever environment reproduces it, (b) a captured artifact (HAR file, log dump, core dump, screen recording with timestamps), or (c) permission to add temporary production instrumentation. Do **not** proceed to hypothesise without a loop.
50
+
51
+ Do not proceed to Phase 2 until you have a loop you believe in.
52
+
53
+ ## Phase 2 — Reproduce
54
+
55
+ Run the loop. Watch the bug appear.
56
+
57
+ Confirm:
58
+
59
+ - [ ] The loop produces the failure mode the **user** described — not a different failure that happens to be nearby. Wrong bug = wrong fix.
60
+ - [ ] The failure is reproducible across multiple runs (or, for non-deterministic bugs, reproducible at a high enough rate to debug against).
61
+ - [ ] You have captured the exact symptom (error message, wrong output, slow timing) so later phases can verify the fix actually addresses it.
62
+
63
+ Do not proceed until you reproduce the bug.
64
+
65
+ ## Phase 3 — Hypothesise
66
+
67
+ Generate **3–5 ranked hypotheses** before testing any of them. Single-hypothesis generation anchors on the first plausible idea.
68
+
69
+ Each hypothesis must be **falsifiable**: state the prediction it makes.
70
+
71
+ > Format: "If <X> is the cause, then <changing Y> will make the bug disappear / <changing Z> will make it worse."
72
+
73
+ If you cannot state the prediction, the hypothesis is a vibe — discard or sharpen it.
74
+
75
+ **Show the ranked list to the user before testing.** They often have domain knowledge that re-ranks instantly ("we just deployed a change to #3"), or know hypotheses they've already ruled out. Cheap checkpoint, big time saver. Don't block on it — proceed with your ranking if the user is AFK.
76
+
77
+ ## Phase 4 — Instrument
78
+
79
+ Each probe must map to a specific prediction from Phase 3. **Change one variable at a time.**
80
+
81
+ Tool preference:
82
+
83
+ 1. **Debugger / REPL inspection** if the env supports it. One breakpoint beats ten logs.
84
+ 2. **Targeted logs** at the boundaries that distinguish hypotheses.
85
+ 3. Never "log everything and grep".
86
+
87
+ **Tag every debug log** with a unique prefix, e.g. `[DEBUG-a4f2]`. Cleanup at the end becomes a single grep. Untagged logs survive; tagged logs die.
88
+
89
+ **Perf branch.** For performance regressions, logs are usually wrong. Instead: establish a baseline measurement (timing harness, `performance.now()`, profiler, query plan), then bisect. Measure first, fix second.
90
+
91
+ ## Phase 5 — Fix + regression test
92
+
93
+ Write the regression test **before the fix** — but only if there is a **correct seam** for it.
94
+
95
+ A correct seam is one where the test exercises the **real bug pattern** as it occurs at the call site. If the only available seam is too shallow (single-caller test when the bug needs multiple callers, unit test that can't replicate the chain that triggered the bug), a regression test there gives false confidence.
96
+
97
+ **If no correct seam exists, that itself is the finding.** Note it. The codebase architecture is preventing the bug from being locked down. Flag this for the next phase.
98
+
99
+ If a correct seam exists:
100
+
101
+ 1. Turn the minimised repro into a failing test at that seam.
102
+ 2. Watch it fail.
103
+ 3. Apply the fix.
104
+ 4. Watch it pass.
105
+ 5. Re-run the Phase 1 feedback loop against the original (un-minimised) scenario.
106
+
107
+ ## Phase 6 — Cleanup + post-mortem
108
+
109
+ Required before declaring done:
110
+
111
+ - [ ] Original repro no longer reproduces (re-run the Phase 1 loop)
112
+ - [ ] Regression test passes (or absence of seam is documented)
113
+ - [ ] All `[DEBUG-...]` instrumentation removed (`grep` the prefix)
114
+ - [ ] Throwaway prototypes deleted (or moved to a clearly-marked debug location)
115
+ - [ ] The hypothesis that turned out correct is stated in the commit / PR message — so the next debugger learns
116
+
117
+ **Then ask: what would have prevented this bug?** If the answer involves architectural change (no good test seam, tangled callers, hidden coupling), hand off to the `/improve-codebase-architecture` skill with the specifics. Make the recommendation **after** the fix is in, not before: you have more information now than when you started.
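For the non-deterministic case in Phase 1, the quantity to track is the reproduction rate across repeated runs of the trigger. Below is a minimal sketch of such a counter, assuming the trigger can be wrapped in a synchronous function that returns `true` (or throws) when the failure appears. `reproductionRate` and `flaky` are hypothetical names, and `flaky` is a deterministic stand-in so the example is repeatable.

```javascript
// Hypothetical sketch: estimate how often a trigger reproduces a bug.
// `trigger` is any function that returns true when the failure shows up.
function reproductionRate(trigger, runs = 100) {
  let failures = 0;
  for (let i = 0; i < runs; i++) {
    try {
      if (trigger(i)) failures++;
    } catch {
      failures++; // a thrown error also counts as a reproduction
    }
  }
  return failures / runs;
}

// Stand-in for a flaky code path: fails on every 4th attempt.
// A real trigger would exercise the buggy code instead.
const flaky = (i) => i % 4 === 0;

console.log(`repro rate: ${reproductionRate(flaky, 100) * 100}%`);
```

Re-running the same counter after adding stress or narrowing timing windows gives a direct before/after number for whether the loop is getting sharper.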
package/skills/diagnose/scripts/hitl-loop.template.sh ADDED
@@ -0,0 +1,41 @@
1
+ #!/usr/bin/env bash
2
+ # Human-in-the-loop reproduction loop.
3
+ # Copy this file, edit the steps below, and run it.
4
+ # The agent runs the script; the user follows prompts in their terminal.
5
+ #
6
+ # Usage:
7
+ # bash hitl-loop.template.sh
8
+ #
9
+ # Two helpers:
10
+ # step "<instruction>" → show instruction, wait for Enter
11
+ # capture VAR "<question>" → show question, read response into VAR
12
+ #
13
+ # At the end, captured values are printed as KEY=VALUE for the agent to parse.
14
+
15
+ set -euo pipefail
16
+
17
+ step() {
18
+ printf '\n>>> %s\n' "$1"
19
+ read -r -p " [Enter when done] " _
20
+ }
21
+
22
+ capture() {
23
+ local var="$1" question="$2" answer
24
+ printf '\n>>> %s\n' "$question"
25
+ read -r -p " > " answer
26
+ printf -v "$var" '%s' "$answer"
27
+ }
28
+
29
+ # --- edit below ---------------------------------------------------------
30
+
31
+ step "Open the app at http://localhost:3000 and sign in."
32
+
33
+ capture ERRORED "Click the 'Export' button. Did it throw an error? (y/n)"
34
+
35
+ capture ERROR_MSG "Paste the error message (or 'none'):"
36
+
37
+ # --- edit above ---------------------------------------------------------
38
+
39
+ printf '\n--- Captured ---\n'
40
+ printf 'ERRORED=%s\n' "$ERRORED"
41
+ printf 'ERROR_MSG=%s\n' "$ERROR_MSG"
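The template prints its captured values as `KEY=VALUE` lines after a `--- Captured ---` marker so the agent can parse them. Below is a minimal sketch of one way that output could be read back, assuming the whole script output is available as a single string. `parseCaptured` is a hypothetical helper and the sample output is invented.

```javascript
// Hypothetical sketch: read the KEY=VALUE lines printed after "--- Captured ---".
function parseCaptured(output) {
  const marker = "--- Captured ---";
  const start = output.indexOf(marker);
  if (start === -1) return {};
  const captured = {};
  for (const line of output.slice(start + marker.length).split("\n")) {
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // skip blank lines and lines without a key
    // Split on the first "=" only, so values may themselves contain "=".
    captured[line.slice(0, eq)] = line.slice(eq + 1);
  }
  return captured;
}

// Invented sample of what the script might print at the end of a run.
const sample = "\n--- Captured ---\nERRORED=y\nERROR_MSG=TypeError: a=b failed\n";
console.log(parseCaptured(sample));
```

Splitting on the first `=` matters here because pasted error messages often contain `=` themselves.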