pi-dev 0.2.3 → 0.2.5

package/README.md CHANGED
@@ -35,21 +35,22 @@ You ask. It classifies. It executes. It reports.
35
35
  - **A migration gate that won't let you cheat.** `/do` refuses to touch an un-migrated repo. `/migrate` audits `AGENTS.md`, `CLAUDE.md`, scoped agent dirs, handoff systems, and ADR layouts; archives the noise; stamps a marker; and on every re-entry runs drift probes so banned conventions can't sneak back in.
36
36
  - **No handoff files. Ever.** State of work lives in three places only — code (git), the issue tracker, and merged preferences. No `docs/handoff/`, no `.scratch/flow/`, no SESSION_*.md littering your repo.
37
37
  - **Local-live verification the agent owns.** A one-time playbook captures how to boot your stack. From then on the agent boots it, drives it, and quotes the evidence in the summary. You are not the test runner.
38
- - **Vertical-slice issues with AI disclaimers baked in.** `/to-issues` produces independently-grabbable tickets — each with acceptance criteria, allowed-touch-set, and out-of-scope — and the disclaimer required by `/triage` lands on every one.
38
+ - **Vertical-slice issues that read like a spec.** `/to-issues` produces independently-grabbable tickets — each with acceptance criteria, allowed-touch-set, and out-of-scope — written as plain spec so the next worker agent picks them up as input, not as a meta-tagged AI artefact.
39
+ - **One minimal plugin where prose is provably not enough.** Ships `pi-flow`: a single pi extension whose only job is to catch `/do` ending a turn with `follow-up: ... run /do ...` mid-chain (the hand-back pattern that turns one prompt into ten "진행해" ("go ahead") replies) and steer back into the next phase. Toggle in `~/.pi/agent/settings.json` → `piFlow.enabled`. We add a new guard only when an audit shows a rule the model demonstrably can't see itself drift past.
39
40
 
40
41
  ## Install
41
42
 
42
43
  Requires Node ≥ 20 and the [pi runtime](https://github.com/badlogic/pi).
43
44
 
44
45
  ```bash
45
- # Interactive: pick global vs project-local, then install + seed preferences
46
+ # Interactive: pick global vs project-local, then install skills + extensions + seed preferences
46
47
  npx pi-dev@latest install
47
48
 
48
49
  # Or be explicit:
49
- npx pi-dev@latest install --global # ~/.pi/agent/skills/ (every repo)
50
- npx pi-dev@latest install --local # ./.pi/skills/ (this repo only)
50
+ npx pi-dev@latest install --global # ~/.pi/agent/{skills,extensions}/ (every repo)
51
+ npx pi-dev@latest install --local # ./.pi/{skills,extensions}/ (this repo only)
51
52
 
52
- # Refresh skills (auto-detects scope from disk; preferences are kept)
53
+ # Refresh skills + extensions (auto-detects scope from disk; preferences are kept)
53
54
  npx pi-dev@latest update
54
55
 
55
56
  # See what's installed under each scope
@@ -64,10 +65,11 @@ npx pi-dev doctor
64
65
  | | global | local |
65
66
  | --- | --- | --- |
66
67
  | Skills | `~/.pi/agent/skills/` | `<repo>/.pi/skills/` |
68
+ | Extensions | `~/.pi/agent/extensions/` | `<repo>/.pi/extensions/` |
67
69
  | Preferences | `~/.pi/agent/preferences.md` | `<repo>/.pi/preferences.md` |
68
70
  | Pi sessions see it from | every cwd | only this repo |
69
- | Goes in the repo's git? | no | yes (commit `.pi/skills/`, gitignore `.pi/sessions/`) |
70
- | Use when | the skills are part of *your* engineering taste | the skills are part of *the project's* contract |
71
+ | Goes in the repo's git? | no | yes (commit `.pi/skills/` + `.pi/extensions/`, gitignore `.pi/sessions/`) |
72
+ | Use when | the framework is part of *your* engineering taste | the framework is part of *the project's* contract |
71
73
 
72
74
  Non-interactive runs (CI, piped input) default to `--global` silently. Pass `-y` to skip the prompt and accept the default.
73
75
 
package/dist/install.js CHANGED
@@ -3,7 +3,19 @@ import { execSync } from "node:child_process";
3
3
  import { join } from "node:path";
4
4
  import { createInterface } from "node:readline";
5
5
  import { SKILLS, CONSUMER_SKILLS } from "./manifest.js";
6
- import { PKG_SKILLS_DIR, PKG_GLOBAL_PREFS_PRESET, destFor, } from "./paths.js";
6
+ import { PKG_SKILLS_DIR, PKG_EXTENSIONS_DIR, PKG_GLOBAL_PREFS_PRESET, destFor, } from "./paths.js";
7
+ /**
8
+ * Extensions shipped with pi-dev. Each is a subdirectory under `extensions/`
9
+ * that pi auto-discovers from `~/.pi/agent/extensions/<name>/` (global) or
10
+ * `.pi/extensions/<name>/` (local). The directory name is the install name
11
+ * verbatim — no prefix, no transform.
12
+ *
13
+ * Keep this list short. A new entry is justified only by an audit signal
14
+ * showing prose alone has ≥60% miss rate on a mechanically checkable rule.
15
+ */
16
+ const EXTENSIONS = [
17
+ { name: "pi-flow", summary: "Steer mid-chain hand-backs back into the next phase." },
18
+ ];
7
19
  function ask(question) {
8
20
  const rl = createInterface({ input: process.stdin, output: process.stdout });
9
21
  return new Promise((res) => {
@@ -39,8 +51,9 @@ async function resolveScope(opts) {
39
51
  }
40
52
  export async function install(opts = {}) {
41
53
  const scope = await resolveScope(opts);
42
- const { agentDir, skillsDir, prefsFile } = destFor(scope);
54
+ const { agentDir, skillsDir, extensionsDir, prefsFile } = destFor(scope);
43
55
  mkdirSync(skillsDir, { recursive: true });
56
+ mkdirSync(extensionsDir, { recursive: true });
44
57
  const skillsToInstall = opts.includeMaintainer ? SKILLS : CONSUMER_SKILLS;
45
58
  let copied = 0;
46
59
  for (const skill of skillsToInstall) {
@@ -59,6 +72,25 @@ export async function install(opts = {}) {
59
72
  copied++;
60
73
  }
61
74
  console.log(`Installed ${copied} skill(s) into ${skillsDir} [${scope}]`);
75
+ let extsCopied = 0;
76
+ for (const ext of EXTENSIONS) {
77
+ const src = join(PKG_EXTENSIONS_DIR, ext.name);
78
+ const dst = join(extensionsDir, ext.name);
79
+ if (!existsSync(src)) {
80
+ console.warn(` skip extension ${ext.name} (source not found in package)`);
81
+ continue;
82
+ }
83
+ if (existsSync(dst) && !opts.force) {
84
+ cpSync(src, dst, { recursive: true, force: true });
85
+ }
86
+ else {
87
+ cpSync(src, dst, { recursive: true });
88
+ }
89
+ extsCopied++;
90
+ }
91
+ if (extsCopied > 0) {
92
+ console.log(`Installed ${extsCopied} extension(s) into ${extensionsDir} [${scope}]`);
93
+ }
62
94
  if (opts.skipPrefs) {
63
95
  console.log("Skipped preferences (pass --include-prefs on update to merge in new keys).");
64
96
  return;
package/dist/paths.js CHANGED
@@ -4,23 +4,31 @@ import { fileURLToPath } from "node:url";
4
4
  export const HOME = homedir();
5
5
  export const PI_AGENT_DIR = join(HOME, ".pi", "agent");
6
6
  export const PI_SKILLS_DIR = join(PI_AGENT_DIR, "skills");
7
+ export const PI_EXTENSIONS_DIR = join(PI_AGENT_DIR, "extensions");
7
8
  export const PI_GLOBAL_PREFS = join(PI_AGENT_DIR, "preferences.md");
8
9
  const __filename = fileURLToPath(import.meta.url);
9
10
  const __dirname = dirname(__filename);
10
11
  /** Resolve the package root regardless of whether we run from src/ or dist/. */
11
12
  export const PKG_ROOT = resolve(__dirname, "..");
12
13
  export const PKG_SKILLS_DIR = join(PKG_ROOT, "skills");
14
+ export const PKG_EXTENSIONS_DIR = join(PKG_ROOT, "extensions");
13
15
  export const PKG_PRESETS_DIR = join(PKG_ROOT, "presets");
14
16
  export const PKG_GLOBAL_PREFS_PRESET = join(PKG_PRESETS_DIR, "preferences.md");
15
17
  /** Compute install destinations for a given scope. `local` resolves against `cwd`. */
16
18
  export function destFor(scope, cwd = process.cwd()) {
17
19
  if (scope === "global") {
18
- return { agentDir: PI_AGENT_DIR, skillsDir: PI_SKILLS_DIR, prefsFile: PI_GLOBAL_PREFS };
20
+ return {
21
+ agentDir: PI_AGENT_DIR,
22
+ skillsDir: PI_SKILLS_DIR,
23
+ extensionsDir: PI_EXTENSIONS_DIR,
24
+ prefsFile: PI_GLOBAL_PREFS,
25
+ };
19
26
  }
20
27
  const agentDir = join(cwd, ".pi");
21
28
  return {
22
29
  agentDir,
23
30
  skillsDir: join(agentDir, "skills"),
31
+ extensionsDir: join(agentDir, "extensions"),
24
32
  prefsFile: join(agentDir, "preferences.md"),
25
33
  };
26
34
  }
@@ -0,0 +1,40 @@
1
+ # pi-flow
2
+
3
+ The default plugin that ships with `pi-dev`. One job: stop `/do` from
4
+ quietly handing the chain back to you between phases.
5
+
6
+ ## What it does
7
+
8
+ Watches the end of every assistant turn. If the last visible text contains
9
+ a `follow-up: ... run /do ...` line **without** a preceding
10
+ `chain complete — no further action.` line, it queues a follow-up steer
11
+ that prompts the next phase. Your next turn is the next `[flow N/M]`
12
+ status line, not "진행해" ("go ahead").
13
+
14
+ ## Why one guard, not five
15
+
16
+ `pi-flow` is intentionally minimal. Each new guard taxes every tool call,
17
+ slows iteration, and tempts everyone to disable the whole plugin. So:
18
+
19
+ - We add a guard only when a real audit shows ≥60% violation on a rule
20
+ the model can't see itself drift past.
21
+ - We remove a guard when the underlying root cause is gone (e.g. a
22
+ `.gitignore` lockout makes a path-write guard redundant).
23
+ - Everything else stays in `SKILL.md` prose where the cost is zero.
24
+
25
+ The hugn 2026-05 audit produced exactly one such finding (mid-chain
26
+ hand-back, 8/13 sessions). That is what this guard exists for.
27
+
28
+ ## Toggle
29
+
30
+ `~/.pi/agent/settings.json`:
31
+
32
+ ```json
33
+ {
34
+ "piFlow": {
35
+ "enabled": false
36
+ }
37
+ }
38
+ ```
39
+
40
+ Default: on.
@@ -0,0 +1,65 @@
1
+ /**
2
+ * pi-flow
3
+ *
4
+ * Minimal runtime plugin that keeps /do on-chain.
5
+ *
6
+ * One guard, one event: when an assistant turn ends with a
7
+ * `follow-up: ... run /do ...` line that is *not* preceded by the canonical
8
+ * `chain complete — no further action.` terminator, the chain is being
9
+ * handed back mid-flight. The plugin queues a steer reminder so the next
10
+ * turn re-enters the chain at the next phase instead of waiting for the user
11
+ * to type "진행해" ("go ahead") / "go" / "다음" ("next").
12
+ *
13
+ * Rationale: the hugn 2026-05 audit showed 8/13 /do sessions ending with
14
+ * chain-depth 0 and 82% of user messages being short re-entry nudges. Every
15
+ * other candidate guard had near-zero ROI — the taboo-path writes were already
16
+ * stopped by a .gitignore lockout, and other rules were taken out of policy.
17
+ * One predicate, one steer.
18
+ *
19
+ * Toggle:
20
+ * ~/.pi/agent/settings.json → "piFlow": { "enabled": false }
21
+ */
22
+
23
+ import type { ExtensionAPI } from "@earendil-works/pi-coding-agent";
24
+ import { existsSync, readFileSync } from "node:fs";
25
+ import { homedir } from "node:os";
26
+ import { join } from "node:path";
27
+
28
+ const HANDBACK = /^follow-up:\s.*\brun\s+`?\/?do\b/m;
29
+ const TERMINATOR = /chain complete\s+\u2014\s+no further action/;
30
+
31
+ function isEnabled(): boolean {
32
+ const settingsPath = join(homedir(), ".pi", "agent", "settings.json");
33
+ if (!existsSync(settingsPath)) return true;
34
+ try {
35
+ const raw = JSON.parse(readFileSync(settingsPath, "utf-8")) as Record<string, unknown>;
36
+ const cfg = (raw["piFlow"] ?? {}) as { enabled?: boolean };
37
+ return cfg.enabled !== false;
38
+ } catch {
39
+ return true;
40
+ }
41
+ }
42
+
43
+ export default function piFlow(pi: ExtensionAPI) {
44
+ if (!isEnabled()) return;
45
+
46
+ pi.on("message_end", async (event) => {
47
+ if (event.message.role !== "assistant") return;
48
+ const text = event.message.content
49
+ .filter((b): b is { type: "text"; text: string } => b.type === "text")
50
+ .map((b) => b.text)
51
+ .join("\n");
52
+ if (!text || !HANDBACK.test(text) || TERMINATOR.test(text)) return;
53
+
54
+ pi.sendMessage(
55
+ {
56
+ customType: "pi-flow",
57
+ content:
58
+ "[pi-flow] mid-chain hand-back detected. `follow-up: … run /do …` is the Step-7 terminator, " +
59
+ "not a phase boundary. Print the next `[flow N/M]` status line and continue the chain.",
60
+ display: true,
61
+ },
62
+ { triggerTurn: true, deliverAs: "followUp" },
63
+ );
64
+ });
65
+ }
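The predicate in `index.ts` can be probed outside the pi runtime. The sketch below is a Python port of the two regexes, offered as an illustration rather than as part of the package: `is_mid_chain_handback` and the sample strings are hypothetical, and the port assumes Python's `re` handles `\s`, `\b`, and multiline `^` the same way the JavaScript engine does for inputs like these.

```python
import re

# Ported from the HANDBACK / TERMINATOR constants in index.ts.
HANDBACK = re.compile(r"^follow-up:\s.*\brun\s+`?/?do\b", re.MULTILINE)
TERMINATOR = re.compile(r"chain complete\s+\u2014\s+no further action")


def is_mid_chain_handback(text: str) -> bool:
    """True when a turn hands the chain back without the terminator line."""
    if not text:
        return False
    return bool(HANDBACK.search(text)) and not TERMINATOR.search(text)


# Made-up turn endings, one per outcome.
samples = {
    "mid-chain": "Phase 2 done.\nfollow-up: phase 3 \u2014 run /do tests",
    "finished": "chain complete \u2014 no further action.\nfollow-up: push \u2014 run /do push",
    "plain": "All phases verified.",
}
for label, text in samples.items():
    print(label, is_mid_chain_handback(text))
```

Two properties of the original carry over to the port. The terminator suppresses the steer wherever it appears in the turn text, not only when it precedes the `follow-up:` line, and a `follow-up:` line with leading whitespace does not match because the pattern is anchored at the start of a line.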
@@ -0,0 +1,9 @@
1
+ {
2
+ "name": "pi-flow",
3
+ "version": "0.0.0",
4
+ "private": true,
5
+ "description": "Default pi-dev plugin: keeps /do on-chain by steering mid-chain hand-backs.",
6
+ "pi": {
7
+ "extensions": ["./index.ts"]
8
+ }
9
+ }
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "pi-dev",
3
- "version": "0.2.3",
3
+ "version": "0.2.5",
4
4
  "description": "An autonomous engineering skill framework for the pi runtime — built on Matt Pocock's skills.",
5
5
  "type": "module",
6
6
  "bin": {
@@ -9,6 +9,7 @@
9
9
  "files": [
10
10
  "dist",
11
11
  "skills",
12
+ "extensions",
12
13
  "presets",
13
14
  "README.md",
14
15
  "LICENSE"
@@ -39,10 +40,10 @@
39
40
  },
40
41
  "repository": {
41
42
  "type": "git",
42
- "url": "git+https://github.com/jason2077/pi-dev.git"
43
+ "url": "git+https://github.com/jason2077/pi-flow.git"
43
44
  },
44
45
  "bugs": {
45
- "url": "https://github.com/jason2077/pi-dev/issues"
46
+ "url": "https://github.com/jason2077/pi-flow/issues"
46
47
  },
47
- "homepage": "https://github.com/jason2077/pi-dev#readme"
48
+ "homepage": "https://github.com/jason2077/pi-flow#readme"
48
49
  }
@@ -26,33 +26,9 @@ The point: **one request → one finished outcome**, with as few user interrupti
26
26
  - a phase's terminal predicate fails twice and the failure cannot be characterised, OR
27
27
  - the Ambiguity protocol fires (preferences truly silent on a one-shot decision), OR
28
28
  - user interrupts.
29
- 5. **No handoff files.** State of work lives in three places only: **code (git), issue tracker, merged preferences**. Do not create `.scratch/flow/`, `docs/handoff/`, or any session-log file. Phase outputs are remembered in-context; persistent decisions are committed to code or filed as issues.
29
+ 5. **No handoff files.** State of work lives in three places only: **code (git), issue tracker, merged preferences**. Do not create `.scratch/flow/`, `docs/handoff/`, `.handoff/`, or any session-log file (`SESSION_*.md`, `HANDOFF_*.md`). Phase outputs are remembered in-context; persistent decisions are committed to code or filed as issues.
30
30
  6. **Side-effect gates respect prefs literally.** `auto-create-issues`, `auto-apply-labels`, `auto-commit-per-slice`, `auto-pr` follow merged prefs without reinterpretation.
31
31
  7. **Status line per phase.** `[flow N/M] <phase-name> — <one-sentence what>`.
32
- 8. **Issue-write rule (no exceptions).** Every issue body written from a `/do` flow — whether you invoke `/to-issues`, `/triage`, or call `gh issue create` / `gh issue edit` directly from a bash tool — MUST start with this disclaimer as the first non-blank line:
33
-
34
- ```
35
- > *This was generated by AI during triage.*
36
- ```
37
-
38
- This rule is global and binding regardless of which skill you think you are in. Before publishing any issue body:
39
-
40
- - Build the body with `--body-file <path>` or a heredoc, never with inline `--body "..."` (heredocs make the disclaimer visible in the diff and the file is auditable).
41
- - The first non-blank line of the body file must be the disclaimer literal above.
42
- - Immediately after `gh issue create` / `gh issue edit` returns, run one self-check on the resulting body and halt the flow if it fails:
43
-
44
- ```bash
45
- gh issue view <num> --json body --jq '.body' | awk 'NF{print; exit}' | grep -q 'generated by AI' \
46
- || { echo "AI disclaimer missing on issue #<num>"; exit 1; }
47
- ```
48
-
49
- **Anti-patterns** (delete and redo if you catch any in your own draft):
50
-
51
- - `gh issue create --title "…" --body "..."` (inline body, no disclaimer check)
52
- - First line of issue body being `## Goal`, `## Scope`, `## Problem Statement`, `**Parent epic:**`, or any heading other than the disclaimer.
53
- - Skipping the post-create `gh issue view ... | grep generated by AI` self-check because "I included the disclaimer in the heredoc".
54
-
55
- Skills `/to-issues` and `/triage` repeat this rule for the case where you do enter them. This Hard rule is the binding statement for the case where you do not.
56
32
 
57
33
  ## Process
58
34
 
@@ -155,6 +131,8 @@ Then for each phase:
155
131
  - Ending the turn with `follow-up: <next planned phase> — run /do …` when N < M. The `follow-up:` line is reserved for *post-chain* deferred work (push, manual ops live, prefs refresh). Using it to describe the *next planned phase of this chain* is a disguised hand-back — start the phase instead.
156
132
  - Treating a short user nudge ("진행해", "다음", "계속", "ㄱㄱ", "go", "이어서"; roughly "go ahead", "next", "continue", "go go", "go", "carry on") as a new request. If the planned chain has unfinished phases, those nudges are noise: re-enter the chain at the next phase, do not re-classify and re-inject `/do`.
157
133
 
134
+ The `pi-flow` plugin (installed by default) watches the end of each assistant turn: a `follow-up: ... run /do ...` line *not* preceded by `chain complete — no further action.` is treated as a mid-chain hand-back and steered back into the next phase automatically. The prose above is still the canonical contract; the plugin is the safety net.
135
+
158
136
  The only place a wrap-up belongs is **Step 7 — Final summary**, after the last phase has met its terminal predicate.
159
137
 
160
138
  ### Step 5 — Live verification
@@ -219,7 +197,7 @@ If you emitted anything that looks like a wrap-up ("All set!", "Done.", "Let me
219
197
  | `grill-with-docs` / `grill-lite` | every open question answered, deferred with rationale, or escalated; relevant CONTEXT/ADR updates committed |
220
198
  | `to-prd` | PRD published to issue tracker with `needs-triage` (per `auto-create-issues`) |
221
199
  | `to-issues` | all slices created on tracker; each is independently grabbable; labels applied per `auto-apply-labels` |
222
- | `triage` | issue carries exactly one state label; AI disclaimer present (verified via the Hard rule #8 self-check, not just inserted into the heredoc) |
200
+ | `triage` | issue carries exactly one state label |
223
201
  | `diagnose` | reproducible pass/fail loop exists AND root cause identified AND regression test exists |
224
202
  | `tdd` | new test red→green; project's check command clean; commit per `auto-commit-per-slice` |
225
203
  | `improve-codebase-architecture` | proposed deepening either applied or recorded as a follow-up issue |
@@ -50,22 +50,37 @@ Always from inside the pi-dev checkout (see Pre-flight).
50
50
  - **Consumer repo path** (or its sessions directory) to audit. The maintainer names a repo on this machine; you resolve its sessions dir.
51
51
  - Optional: a specific skill name to focus the audit on (`do`, `migrate`, `triage`, …).
52
52
  - Optional: a date range.
53
- - **Fix scope** per finding — `framework` or `consumer-prefs`. Defaults set in Step 5.5; maintainer can flip individual rows before applying.
53
+ - **Fix scope** per finding — `framework`, `extension`, or `consumer-prefs`. Defaults set in Step 5.5; maintainer can flip individual rows before applying.
54
54
 
55
55
  ## Fix scopes
56
56
 
57
- pi-runtime today loads skill bodies from a single location (`~/.pi/agent/skills/<name>/SKILL.md` for global installs, `<repo>/.pi/skills/<name>/SKILL.md` for local installs). The framework's 3-layer override is on **preferences**, not on SKILL bodies. So a finding lands in one of two places:
57
+ pi-runtime loads three artefact kinds the framework can ship:
58
+
59
+ - **Skill bodies** — `~/.pi/agent/skills/<name>/SKILL.md` (global) or `<repo>/.pi/skills/<name>/SKILL.md` (local). Pure prose.
60
+ - **Extensions** — `~/.pi/agent/extensions/<name>/` (global) or `<repo>/.pi/extensions/<name>/` (local). TypeScript modules auto-loaded via jiti; can subscribe to `tool_call`, `message_end`, `before_agent_start`, etc., and can block tool calls or inject steer messages. pi-dev ships **one** by default — `pi-flow` — and the bar to add a second is high (see below).
61
+ - **Preferences** — `docs/agents/preferences.md` (per repo) and `~/.pi/agent/preferences.md` (per machine). 3-layer override on prose-level decisions.
62
+
63
+ A finding lands in exactly one of:
58
64
 
59
65
  | scope | lands in | reaches | propagation | when to pick |
60
66
  | --- | --- | --- | --- | --- |
61
- | **framework** | this repo's `skills/<name>/SKILL.md` | every consumer after the next `npx pi-dev update` | release-please → npm publish | the SKILL.md wording itself is wrong; gap shows up generically |
62
- | **consumer-prefs** | the audited consumer repo's `docs/agents/preferences.md` (Project taboos / Diagnosis posture / Local-live playbook / Free notes whichever section fits) | only that repo, on every `/do` bootstrap | regular consumer-repo commit | gap is the consumer repo's domain / paths / conventions, not the SKILL.md |
67
+ | **framework** | this repo's `skills/<name>/SKILL.md` | every consumer after the next `npx pi-dev update` | release-please → npm publish | the SKILL.md wording is wrong; gap shows up generically; prose alone is plausibly enough |
68
+ | **extension** | this repo's `extensions/pi-flow/index.ts` (extend) or a new `extensions/<name>/` (rare) | every consumer after `npx pi-dev update` | release-please → npm publish | a real audit shows prose can't recover the rule, AND the rule is a simple deterministic predicate, AND skipping the guard would silently waste ≥1 user prompt per session |
69
+ | **consumer-prefs** | the audited consumer repo's `docs/agents/preferences.md` | only that repo, on every `/do` bootstrap | regular consumer-repo commit | gap is the consumer repo's domain / paths / conventions, not the SKILL.md |
63
70
 
64
71
  Notes:
65
72
 
66
73
  - A `framework` apply is **always** mirrored into `~/.pi/agent/skills/<name>/` on this machine so the next session picks it up immediately, without waiting for npm.
74
+ - An `extension` apply is **always** mirrored into `~/.pi/agent/extensions/<name>/` on this machine for the same reason. The package directory name is the install name verbatim (no `pi-dev-` prefix).
67
75
  - A `consumer-prefs` apply touches no pi-dev files. It is committed to the consumer repo only.
68
- - A single audit may produce a mix of framework and consumer-prefs findings. Decide scope per finding, not per audit.
76
+ - A single audit may produce a mix of all three. Decide scope per finding, not per audit.
77
+
78
+ **Extension scope is the most expensive option.** Every guard runs on every relevant tool call or turn, slows iteration, and tempts the maintainer to disable the whole plugin. Default to `framework` / `consumer-prefs` and reach for `extension` only when the audit makes the ROI undeniable. The bar:
79
+
80
+ - The rule was in-context as SKILL.md prose at audit time **and** was violated on ≥60% of opportunities across two consecutive audits.
81
+ - The predicate is deterministic and short — a tool name plus an arg regex, or a tail-of-turn regex — with no LLM call.
82
+ - The failure cost is concrete — the model wastes one or more user prompts per session re-entering a chain, or commits a destructive artefact.
83
+ - A simpler remedy (gitignore lockout, prefs taboo, prose anti-pattern) does **not** plausibly close the gap. If it does, take the simpler remedy.
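The numeric part of the bar above can be checked mechanically. This is a minimal sketch under two assumptions: each audit is summarised as a hypothetical `(violations, opportunities)` pair in chronological order, and "across two consecutive audits" means the threshold holds in each of the two most recent audits. The remaining criteria (deterministic predicate, concrete cost, no simpler remedy) still need human judgement.

```python
from typing import Sequence, Tuple

VIOLATION_BAR = 0.60  # the >=60% threshold quoted in the bar


def meets_numeric_bar(audits: Sequence[Tuple[int, int]]) -> bool:
    """True if the two most recent audits both reach the violation bar."""
    if len(audits) < 2:
        return False  # a single audit is not "two consecutive audits"
    recent = audits[-2:]
    return all(opp > 0 and viol / opp >= VIOLATION_BAR for viol, opp in recent)


# 8/13 is the hand-back figure quoted for the hugn 2026-05 audit (about 61.5%).
# The second pair is invented purely for illustration.
print(meets_numeric_bar([(8, 13)]))
print(meets_numeric_bar([(8, 13), (7, 10)]))
```

Under this reading a single audit is never enough on its own, however high its violation rate.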
69
84
 
70
85
  ## Session-data location & format
71
86
 
@@ -84,7 +99,27 @@ Each line is one of these record types:
84
99
  | `session` | session header — has `cwd`, `timestamp`, `version`, `id` |
85
100
  | `model_change` | provider / modelId switch |
86
101
  | `thinking_level_change` | reasoning effort knob |
87
- | `message` | a user or assistant turn |
102
+ | `message` | a user / assistant / toolResult turn |
103
+
104
+ **Record shape for `message` (do not skim this — it has bitten parsers before):**
105
+
106
+ ```jsonc
107
+ {
108
+ "type": "message",
109
+ "id": "...",
110
+ "parentId": "...",
111
+ "timestamp": "2026-05-11T16:46:49.795Z",
112
+ "message": { // ← NESTED. role/content are HERE, not at top level.
113
+ "role": "user" | "assistant" | "toolResult",
114
+ "content": [ ...blocks ],
115
+ "timestamp": "...",
116
+ // assistant-only extras: api, provider, model, usage, stopReason, responseId
117
+ // toolResult-only extras: toolCallId, toolName, isError
118
+ }
119
+ }
120
+ ```
121
+
122
+ A correct read path is `rec["message"]["role"]` and `rec["message"]["content"]`. Reading the top level instead (`rec.get("role")` / `rec.get("content")`) returns `None` for every record and silently produces a zero-row signal table; if your first pass shows all counters at 0, this is almost certainly why.
88
123
 
89
124
  `message.content` is a **list of blocks**, each block has a `type`:
90
125
 
@@ -95,8 +130,19 @@ Each line is one of these record types:
95
130
  | `toolCall` | `{id, name, arguments}` | **pi uses this** — NOT Anthropic SDK's `tool_use` |
96
131
  | `toolResult` | `{id, output}` | **pi uses this** — NOT `tool_result` |
97
132
 
133
+ Roles in practice: `user`, `assistant`, and **`toolResult`** (yes, role and block type share the name; a `toolResult`-role message contains one or more `text` blocks holding the tool output). Treat `toolResult`-role messages as siblings of the originating `toolCall` — do not double-count them as user/assistant turns.
134
+
98
135
  Tool names are **lower-case** (`bash`, `read`, `edit`, `write`, `glob`, `grep`, etc.). Build any parser around `toolCall` / `toolResult` first, then fall back to Anthropic-shaped blocks for robustness.
99
136
 
137
+ **Parser pre-flight (mandatory before Step 2 aggregates).** After you write the one-shot Python parser, run it on the newest 1–2 `.jsonl` files and assert the following are non-zero for any session that obviously had work done:
138
+
139
+ ```python
140
+ assert tool_count, "toolCall extraction returned 0 — check rec['message']['content'], not rec['content']"
141
+ assert user_msg_total or skill_inject_count, "no user messages parsed — same nested-message bug"
142
+ ```
143
+
144
+ If either assertion would fail, fix the parser before producing the signal table. A zero-row table is never a finding — it is a parser bug.
145
+
100
146
  **Critical detail:** the pi runtime injects each skill's `SKILL.md` content into the conversation as a `<skill name="..." location="...">…</skill>` block embedded inside a **user-role** message (system-side injection, but the role is user). This is how you tell the classifier loaded a skill. Count these to see what `/do` actually picked.
101
147
 
102
148
  Timestamps on `message` records are ISO strings; some other record types use int millis. Handle both.
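The format notes above can be pulled together into a small parser. The sketch below is one possible shape, not the maintainer's script: `tally`, `to_millis`, and the sample records are hypothetical, and it assumes `message.content` is always a list of dict blocks as described. It reads the nested `message` object, counts `toolCall` blocks with the Anthropic-shaped `tool_use` as a fallback, separates runtime skill injections from human user turns, and accepts both timestamp encodings.

```python
import json
import re
from datetime import datetime

SKILL_TAG = re.compile(r'<skill name="[^"]*"')
TOOL_CALL_TYPES = {"toolCall", "tool_use"}  # pi shape first, Anthropic shape as fallback
ROLES = ("user", "assistant", "toolResult")


def to_millis(ts):
    """Normalise a timestamp that is either int millis or an ISO string."""
    if isinstance(ts, (int, float)):
        return int(ts)
    # Swap a trailing "Z" so fromisoformat() also works before Python 3.11.
    parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return round(parsed.timestamp() * 1000)


def tally(lines):
    """Count roles, tool calls and skill injections across JSONL lines."""
    counts = {"user": 0, "assistant": 0, "toolResult": 0,
              "tool_calls": 0, "skill_injects": 0}
    for line in lines:
        if not line.strip():
            continue
        rec = json.loads(line)
        if rec.get("type") != "message":
            continue
        msg = rec.get("message") or {}  # nested: role/content live here
        blocks = msg.get("content") or []
        text = "\n".join(b.get("text", "") for b in blocks if b.get("type") == "text")
        counts["tool_calls"] += sum(1 for b in blocks if b.get("type") in TOOL_CALL_TYPES)
        role = msg.get("role")
        if role == "user" and SKILL_TAG.search(text):
            counts["skill_injects"] += 1  # injected SKILL.md, not a human turn
        elif role in ROLES:
            counts[role] += 1
    return counts


# Made-up records in the documented shape.
sample = [
    json.dumps({"type": "session", "cwd": "/repo"}),
    json.dumps({"type": "message", "message": {
        "role": "user", "content": [{"type": "text", "text": "add a healthcheck"}]}}),
    json.dumps({"type": "message", "message": {
        "role": "assistant",
        "content": [{"type": "toolCall", "id": "t1", "name": "bash", "arguments": {}}]}}),
]
counts = tally(sample)
# Pre-flight: a zero here points at a parser bug, not at a finding.
assert counts["tool_calls"], "toolCall extraction returned 0"
assert counts["user"] or counts["skill_injects"], "no user messages parsed"
print(counts)
```

Keeping `toolResult` as its own counter avoids double-counting tool output as user or assistant turns.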
@@ -133,7 +179,7 @@ Run a single pass over every targeted `.jsonl` and tally:
133
179
  - Count user messages shorter than ~80 chars; these are usually nudges ("진행해", "다음은?", "끝났어?"; roughly "go ahead", "what's next?", "is it done?"). A high proportion means `/do` is handing the flow back too often.
134
180
  - Count user messages containing correction markers (`아니`, `wait`, `stop`, `취소`, `다시`, `그만`, `undo`, `revert`; the Korean markers mean roughly "no", "cancel", "again", "stop"). These mark interventions.
135
181
  - Count `<!-- migrated: ... -->` marker date vs. any post-marker `docs/handoff/` / `.scratch/flow/` / `SESSION_*.md` writes. Drift = handoff lockout failed.
136
- - Count tracker writes (`gh issue create`, `gh pr create`) and check whether `generated by AI` (the `/triage` disclaimer) appears in the surrounding text. Missing disclaimer = `/to-issues` / `/triage` predicate violated.
182
+ - Count tracker writes (`gh issue create`, `gh pr create`) and inspect the bodies; the bodies are plain spec consumed by future worker agents, so look for issues with shape problems (missing parent, no acceptance criteria, etc.) rather than meta tags.
137
183
 
138
184
  Use a deterministic Python or shell script you write once and check into `/tmp` for the duration of the run. Do not eyeball big JSONLs by hand.
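The nudge and correction tallies are simple enough to sketch. The example below is illustrative only: `nudge_stats` and the sample messages are hypothetical, the 80-character cut-off is the approximate figure from the list above, and markers are matched as plain case-insensitive substrings, which is an assumption (it would also count "stop" inside "stopwatch").

```python
CORRECTION_MARKERS = ("아니", "wait", "stop", "취소", "다시", "그만", "undo", "revert")
NUDGE_MAX_CHARS = 80  # the "~80 chars" cut-off; tune per repo


def nudge_stats(user_texts):
    """Tally short nudges and correction markers over user message texts."""
    texts = [t.strip() for t in user_texts]
    short = sum(1 for t in texts if len(t) < NUDGE_MAX_CHARS)
    corrections = sum(
        1 for t in texts if any(m in t.lower() for m in CORRECTION_MARKERS)
    )
    total = len(texts)
    return {
        "total": total,
        "short": short,
        "corrections": corrections,
        "short_ratio": short / total if total else 0.0,
    }


# Made-up user messages: one nudge, one correction, one real request.
stats = nudge_stats([
    "진행해",
    "wait, revert that last commit",
    "Please add a /healthz endpoint that returns build info and wire it into the smoke playbook.",
])
print(stats)
```

Feed it only human user turns; skill-injection messages carry the user role too and would distort the ratio.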
139
185
 
@@ -160,7 +206,7 @@ Produce a small table per audited skill. Each row:
160
206
  | signal | observed | expected (predicate / rule) | severity |
161
207
  | ------------------------------ | -------- | ---------------------------------- | -------- |
162
208
  | /do → next-skill chain depth | 1/5 | ≥1 per chain step (M phases) | 🔴 |
163
- | AI disclaimer on slice issues | 0/8 | every issue starts with disclaimer | 🔴 |
209
+ | /do hand-back via follow-up | 8/13 | 0 mid-chain hand-backs | 🔴 |
164
210
  | post-marker handoff writes | 9 | 0 | 🟡 |
165
211
  | ... | | | |
166
212
  ```
@@ -176,7 +222,7 @@ For every 🔴 / 🟡 row, quote the smallest piece of evidence that makes the g
176
222
 
177
223
  - a user message timestamp + first 80 chars,
178
224
  - a tool-call command that violates a taboo,
179
- - a missing disclaimer in a created issue body.
225
+ - the first 80 chars of an issue body that violates the slice template.
180
226
 
181
227
  If a finding cannot be backed by an excerpt, it is not actionable yet — demote to a TODO and keep digging.
182
228
 
@@ -184,9 +230,15 @@ If a finding cannot be backed by an excerpt, it is not actionable yet — demote
184
230
 
185
231
  For each 🔴 / 🟡 finding, pick a default scope using the heuristic below, then show the table once and let the maintainer flip individual rows before applying.
186
232
 
233
+ Default to `extension` if the finding matches **any** of:
234
+
235
+ - The rule was already in SKILL.md prose at the time of the violating session **and** was violated repeatedly (i.e. prose recall is provably low for this rule).
236
+ - The predicate is mechanically expressible against `event.input` (tool name + arg substring / path glob) or `event.message.content` (text tail shape).
237
+ - The fix is a *refusal* or a *steer*, not a *reminder*.
238
+
187
239
  Default to `framework` if the finding matches **any** of:
188
240
 
189
- - Cites SKILL.md wording / phase / predicate / rule numbers.
241
+ - Cites SKILL.md wording / phase / predicate / rule numbers, and the rule was not yet in place.
190
242
  - The proposed fix is a generic anti-pattern string, a terminator literal, a runway line, or a lockout that any repo would benefit from.
191
243
  - The same gap would plausibly show up in two or more consumer repos.
192
244
 
@@ -199,12 +251,12 @@ Default to `consumer-prefs` if the finding matches **any** of:
199
251
  Present the scope-decision table:
200
252
 
201
253
  ```
202
- | # | finding (short) | default scope | target file | flip? |
203
- | - | ---------------------------------------- | ---------------- | ------------------------------------ | ----- |
204
- | 1 | /do hands flow back between phases | framework | pi-dev:skills/do/SKILL.md | |
205
- | 2 | docs/handoff/ resurrected after marker | framework | pi-dev:skills/migrate/SKILL.md | |
206
- | 3 | retro-action-item label still alive | consumer-prefs | hugn:docs/agents/preferences.md | |
207
- | 4 | smoke command name changed in S058 | consumer-prefs | hugn:docs/agents/preferences.md | |
254
+ | # | finding (short) | default scope | target file | flip? |
255
+ | - | ---------------------------------------- | ---------------- | ---------------------------------------------------- | ----- |
256
+ | 1 | /do hands flow back between phases | extension | pi-dev:extensions/pi-flow/index.ts (message_end) | |
257
+ | 2 | post-marker handoff writes | framework | pi-dev:skills/migrate/SKILL.md (gitignore lockout) | |
258
+ | 3 | retro-action-item label still alive | consumer-prefs | hugn:docs/agents/preferences.md | |
259
+ | 4 | smoke command name changed in S058 | consumer-prefs | hugn:docs/agents/preferences.md | |
208
260
  ```
209
261
 
210
262
  Ask once: "OK to proceed with these scopes? Reply with row numbers to flip, or `go`." Apply the flips and move on.
@@ -220,6 +272,15 @@ For each 🔴 / 🟡 finding, draft the smallest possible edit that, **if it had
220
272
  - Prefer **terminal markers** ("the summary's last line must be one of these two literals: …") over qualitative descriptions of "good wrap-up".
221
273
  - Update **at most three skills per run.** More than that means findings aren't anchored well enough.
222
274
 
275
+ **Extension findings (target: pi-dev `extensions/pi-flow/index.ts`, or rarely a new `extensions/<name>/`):**
276
+
277
+ - Add the handler to `pi-flow` rather than creating a new extension, unless the new concern is large enough to be independently toggleable.
278
+ - A guard must be **deterministic** — same input → same outcome. No LLM calls, no time-of-day branches.
279
+ - A guard must be **narrowly scoped** — tool name + arg shape, or end-of-turn regex. No free-text classification.
280
+ - A guard must be **toggleable** via `~/.pi/agent/settings.json`. Default on. A user who disables it must still be able to read the prose contract.
281
+ - Trim the corresponding SKILL.md prose to a one-line *why* + a pointer to the plugin. Do not delete the why — the model still needs to know the intent when the plugin is off.
282
+ - Update **at most one handler per run.** Two changes at once destroy the next audit's ability to attribute movement.
283
+
223
284
  **Consumer-prefs findings (target: that repo's `docs/agents/preferences.md`):**
224
285
 
225
286
  - Pick the *narrowest* existing section that fits before adding a new one. Mapping:
@@ -234,10 +295,10 @@ For each 🔴 / 🟡 finding, draft the smallest possible edit that, **if it had
234
295
  | glossary / context term clarification | `## Glossary alignment` |
235
296
  | rationale that doesn't fit elsewhere | `## Free notes` (one paragraph max, dated) |
236
297
 
237
- - One bullet per finding. Reference the evidence ticket ("S058 smoke name", "#103 missing disclaimer") so the line stays auditable.
298
+ - One bullet per finding. Reference the evidence ticket ("S058 smoke name", "#103 missing parent ref") so the line stays auditable.
238
299
  - Do **not** invent new top-level sections unless three findings legitimately share one.
239
300
 
240
- Show all drafts as one unified diff per target file before applying. Group by target file: pi-dev's `skills/<name>/SKILL.md` first (framework), then the consumer's `docs/agents/preferences.md` (consumer-prefs).
301
+ Show all drafts as one unified diff per target file before applying. Group by target file: pi-dev's `extensions/pi-flow/index.ts` first (extension), then `skills/<name>/SKILL.md` (framework), then the consumer's `docs/agents/preferences.md` (consumer-prefs).
241
302
 
242
303
  ### 7 — Apply, release, verify (branches on scope)
243
304
 
@@ -250,6 +311,16 @@ Run both branches if the audit produced mixed-scope findings. Each branch has it
250
311
  3. `git push origin main`; release-please opens the version-bump PR; merge it; npm publish runs automatically.
251
312
  4. Confirm `npm view pi-dev@latest version` matches the bumped tag.
252
313
 
314
+ **7a′. Extension branch** — only if any finding was approved as `extension`:
315
+
316
+ 1. Edit `extensions/pi-flow/index.ts` (or, only if justified by Step 5.5 bar, add a new `extensions/<name>/` directory with `index.ts`, `package.json`, `README.md`).
317
+ 2. If a new extension was added, register it in the `EXTENSIONS` array in `src/install.ts` so `pi-dev install` and `pi-dev update` propagate it.
318
+ 3. `npm run build` to ensure `install.ts` still compiles.
319
+ 4. Smoke-test with `node dist/cli.js install --local --skip-prefs -y` in `/tmp/<fresh-dir>`. Verify the extension landed under `.pi/extensions/<name>/`.
320
+ 5. `git add extensions/<name>/ src/install.ts src/paths.ts && git commit -m "feat(pi-flow): <one-liner anchoring the evidence>"`. Commit body must cite the signal.
321
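+ 6. Mirror to live: `mkdir -p ~/.pi/agent/extensions/<name> && cp -r extensions/<name>/. ~/.pi/agent/extensions/<name>/` so the **next** session picks up the change without waiting for npm. (Copying `extensions/<name>/.` avoids nesting a second `<name>/` directory when the destination already exists.)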
322
+ 7. `git push origin main`; release-please → npm publish as usual.
323
+
253
324
  **7b. Consumer-prefs branch** — only if any finding was approved as `consumer-prefs`:
254
325
 
255
326
  1. In the audited consumer repo: edit `docs/agents/preferences.md` per the drafts from Step 6. Keep the migration marker at the very end of the file undisturbed.
@@ -260,7 +331,7 @@ Run both branches if the audit produced mixed-scope findings. Each branch has it
260
331
  **Verification (both branches).** After the next pi session in the affected repo:
261
332
 
262
333
  1. Re-run **this skill** scoped to the last 24 h.
263
- 2. Confirm each previously-🔴 signal has moved (chain depth up, intervention rate down, taboo writes gone, disclaimer coverage at 100%, etc.).
334
+ 2. Confirm each previously-🔴 signal has moved (chain depth up, intervention rate down, taboo writes gone, etc.).
264
335
  3. If a signal did not move, the fix wording was too weak — file a follow-up audit, do not re-write from scratch.
265
336
 
266
337
  ## Terminal predicate
@@ -268,9 +339,9 @@ Run both branches if the audit produced mixed-scope findings. Each branch has it
268
339
  This skill is done when **all four** are true:
269
340
 
270
341
  1. A signal table with severities and evidence excerpts has been presented.
271
- 2. Each finding has an approved scope (`framework` / `consumer-prefs` / `defer`) on record, defaulted by Step 5.5 and confirmed by the maintainer.
342
+ 2. Each finding has an approved scope (`framework` / `extension` / `consumer-prefs` / `defer`) on record, defaulted by Step 5.5 and confirmed by the maintainer.
272
343
  3. Either (a) zero 🔴 findings — flow is healthy, recorded as "no change this cycle", OR (b) each 🔴 finding has landed in its scope's target file.
273
- 4. For any landed change: if `framework`, the npm version has bumped (`npm view pi-dev@latest version`); if `consumer-prefs`, the consumer repo has the commit on its push-stream. Either way, the next-session re-audit plan is stated.
344
+ 4. For any landed change: if `framework` or `extension`, the npm version has bumped (`npm view pi-dev@latest version`) and the live mirror is in place; if `consumer-prefs`, the consumer repo has the commit on its push-stream. Either way, the next-session re-audit plan is stated.
274
345
 
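The four conditions above can be read as one boolean check. The sketch below is illustrative only: `Finding`, `AuditState`, and `isDone` are hypothetical names, the boolean fields stand in for checks the maintainer performs by hand (for example `npm view pi-dev@latest version`), and the skill itself defines no such data structure.

```typescript
type Scope = "framework" | "extension" | "consumer-prefs" | "defer";

// Hypothetical record of one audit finding.
interface Finding {
  severity: "red" | "yellow";
  approvedScope?: Scope; // undefined until the maintainer confirms a scope
  landed: boolean;       // the fix is in its scope's target file
}

// Hypothetical snapshot of the audit at wrap-up time.
interface AuditState {
  signalTablePresented: boolean;
  findings: Finding[];
  npmVersionBumped: boolean;
  liveMirrorInPlace: boolean;
  consumerCommitPushed: boolean;
  reAuditPlanStated: boolean;
}

function isDone(s: AuditState): boolean {
  // 2. Every finding has an approved scope on record.
  const allScoped = s.findings.every((f) => f.approvedScope !== undefined);

  // 3. Zero red findings, or every red finding has landed
  //    (`every` is vacuously true when there are no red findings).
  const redsResolved = s.findings
    .filter((f) => f.severity === "red")
    .every((f) => f.landed);

  // 4. Each landed change meets its scope's release condition.
  const landed = s.findings.filter((f) => f.landed);
  const releaseOk = landed.every((f) => {
    if (f.approvedScope === "framework" || f.approvedScope === "extension") {
      return s.npmVersionBumped && s.liveMirrorInPlace;
    }
    if (f.approvedScope === "consumer-prefs") return s.consumerCommitPushed;
    return true; // deferred findings carry no release condition
  });
  const planOk = landed.length === 0 || s.reAuditPlanStated;

  // 1. The signal table has been presented.
  return s.signalTablePresented && allScoped && redsResolved && releaseOk && planOk;
}
```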
275
346
  The summary's **last line** must be one of:
276
347
 
@@ -55,11 +55,7 @@ For each approved slice, publish a new issue to the issue tracker. Use the issue
55
55
 
56
56
  Publish issues in dependency order (blockers first) so you can reference real issue identifiers in the "Blocked by" field.
57
57
 
58
- **Every issue body MUST start with the AI disclaimer** (same wording as `/triage`). This is a hard requirement — no exceptions, including auto-generated slices. After writing the body, grep your own draft for the disclaimer string; if missing, do not publish.
59
-
60
58
  <issue-template>
61
- > *This was generated by AI during triage.*
62
-
63
59
  ## Parent
64
60
 
65
61
  A reference to the parent issue on the issue tracker (if the source was an existing issue, otherwise omit this section).
@@ -82,15 +78,13 @@ Or "None - can start immediately" if no blockers.
82
78
 
83
79
  </issue-template>
84
80
 
85
- **Publishing pattern (GitHub).** Use `--body-file -` with a heredoc, never inline `--body "..."`, so the disclaimer and markdown structure survive shell quoting verbatim:
81
+ **Publishing pattern (GitHub).** Use `--body-file -` with a heredoc, never inline `--body "..."`, so markdown structure survives shell quoting verbatim:
86
82
 
87
83
  ```bash
88
84
  gh issue create \
89
85
  --title "[Slice N] <title>" \
90
86
  --label needs-triage \
91
87
  --body-file - <<'EOF'
92
- > *This was generated by AI during triage.*
93
-
94
88
  ## Parent
95
89
  #<parent>
96
90
 
@@ -99,11 +93,4 @@ gh issue create \
99
93
  EOF
100
94
  ```
101
95
 
102
- After creation, immediately verify the disclaimer landed:
103
-
104
- ```bash
105
- gh issue view <n> --json body --jq '.body' | head -1 | grep -q 'generated by AI' \
106
- || echo "FAIL: disclaimer missing on #<n>"
107
- ```
108
-
109
96
  Do NOT close or modify any parent issue.
@@ -7,12 +7,6 @@ description: Triage issues through a state machine driven by triage roles. Use w
7
7
 
8
8
  Move issues on the project issue tracker through a small state machine of triage roles.
9
9
 
10
- Every comment or issue posted to the issue tracker during triage **must** start with this disclaimer:
11
-
12
- ```
13
- > *This was generated by AI during triage.*
14
- ```
15
-
16
10
  ## Reference docs
17
11
 
18
12
  Before acting on the issue tracker, read the repo-local setup docs if they exist: