pi-diffwarden 0.26.1

package/README.md ADDED
@@ -0,0 +1,846 @@
1
+ # Diffwarden
2
+
3
+ [![version](https://img.shields.io/badge/version-0.26.1-blue.svg)](CHANGELOG.md)
4
+ [![license](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
5
+
6
+ Independent PR guardian skill. You tell your coding agent "use diffwarden on this PR" and it reviews the pull request like a careful senior engineer: reads the diff, CI checks, and review comments; finds bugs and risks; fixes safe ones; verifies; and stops before doing anything dangerous.
7
+
8
+ It never auto-merges, never force-pushes, and never weakens your tests or CI to make a check go green.
9
+
10
+ ## Contents
11
+
12
+ - [Command reference](#command-reference)
13
+ - [Workspace review (no git required)](#workspace-review-no-git-required)
14
+ - [Review uncommitted changes (no PR)](#review-uncommitted-changes-no-pr)
15
+ - [Auto-detected mode (code vs plan)](#auto-detected-mode-code-vs-plan)
16
+ - [Web-augmented review (opt-in)](#web-augmented-review-opt-in)
17
+ - [Loop until merge-ready (c5/5)](#loop-until-merge-ready-c55)
18
+ - [Optional orchestration](#optional-orchestration)
19
+ - [What it actually does](#what-it-actually-does)
20
+ - [Is this for me?](#is-this-for-me)
21
+ - [Prerequisites (do this first)](#prerequisites-do-this-first)
22
+ - [Pi Agent](#pi-agent)
23
+ - [Install](#install)
24
+ - [Slash commands](#slash-commands)
25
+ - [Codex CLI](#codex-cli)
26
+ - [Your first run (step by step)](#your-first-run-step-by-step)
27
+ - [Modes / flags](#modes--flags)
28
+ - [Common recipes](#common-recipes)
29
+ - [What it will and won't do](#what-it-will-and-wont-do)
30
+ - [Core loop](#core-loop)
31
+ - [Troubleshooting / FAQ](#troubleshooting--faq)
32
+ - [Contributing](#contributing)
33
+ - [Files](#files)
34
+ - [Version](#version)
35
+
36
+ ## Command reference
37
+
38
+ Invoke with `/diffwarden` (or the optional `/dw` alias). v0.26.1 uses five primary commands: `review`, `loop`, `status`, `comment`, and `help`. Target arg: `workspace` (current folder, git not required), a local target (`local`, `staged`), a PR (`#123`, full URL, or omit for current-branch PR), or a plan/docs file (`path/to/file.md`). Natural-language prompts still work — see [Slash commands](#slash-commands).
39
+
40
+ **What works out of the box:** once the skill is installed (see [Install](#install)), `/diffwarden` registers in **Claude Code** automatically (it matches the skill name). The shorthand `/dw` needs command files in Claude Code/Cursor. **Codex CLI is different** — see [Codex CLI](#codex-cli): use `$diffwarden` or `/skills`, not `/dw` or `/diffwarden`.
41
+
42
+ | Command | What it does |
43
+ |---------|--------------|
44
+ | `/diffwarden review [<target>]` | Read-only review + fix plan. No edits, commits, or push. Lean output by default. |
45
+ | `/diffwarden loop [<target>]` | Review → fix safe issues → verify → rescore until `c5/5`. Local edits only unless `--commit` or `--push`. |
46
+ | `/diffwarden status [<target>]` | Score/snapshot only (checks, confidence, blockers). |
47
+ | `/diffwarden comment [<pr>]` | Short PR review comment. Asks for explicit approval before posting. PR-only. |
48
+ | `/diffwarden help` | List commands. Bare `/diffwarden` = help. Use `/dw help --verbose` for advanced flags. |
49
+
50
+ **Targets:**
51
+
52
+ | Target | Meaning |
53
+ |--------|---------|
54
+ | `workspace` | Current folder — git not required (see [Workspace review](#workspace-review-no-git-required)) |
55
+ | `local` / `worktree` / `staged` | Git working tree or staged changes, no PR |
56
+ | `#123`, PR URL, or omitted | GitHub PR (current branch when omitted) |
57
+ | `path/to/file.md` | Plan, docs, guides, tutorials (see [Auto-detected mode](#auto-detected-mode-code-vs-plan)) |
58
+
59
+ | Flag | Effect |
60
+ |------|--------|
61
+ | `--mvp` | Stop loop at `c4/5` when only P3/info remains. |
62
+ | `--verbose` | Full detailed report (iterations, verification, changed files, risks, …). Off by default. |
63
+ | `--orchestrate` | Split review and fix across configured models (see [Optional orchestration](#optional-orchestration)). Off by default. |
64
+ | `--commit` | Commit verified changes (git modes only, after verification). |
65
+ | `--push` | Commit + push verified changes (PR mode only, after PR head recheck). |
66
+ | `--security` | Security-focused review (auth, injection, SSRF, secrets, path traversal, crypto, data loss). |
67
+ | `--as-code` / `--as-plan` | Force code or plan/document mode for `review`/`loop`. |
68
+ | `--web` / `--research` | Opt into [web-augmented review](#web-augmented-review-opt-in). Per-finding consent. |
69
+ | `--reply` / `--resolve` | Reply on or resolve PR review threads (explicit approval required). |
70
+ | `--dry-run` | Review only; no edits (= `review`). |
71
+ | `--max N` | Loop iterations (default `3`, hard max `5`). |
72
+
73
+ **Compatibility aliases** (still work; not shown in short help):
74
+
75
+ | Alias | Equivalent |
76
+ |-------|------------|
77
+ | `fix` | `loop` |
78
+ | `prepare` | `loop --push` |
79
+ | `security` | `review --security` |
80
+ | `review-plan <file>` | `review <file> --as-plan` |
81
+ | `fix-plan <file>` | `loop <file> --as-plan` |
82
+
83
+ ```text
84
+ /dw review workspace
85
+ /dw loop workspace
86
+ /dw loop #123 --commit
87
+ /dw loop #123 --push
88
+ /dw comment #123
89
+ /dw status
90
+ /dw help
91
+ /dw help --verbose
92
+ ```
93
+
94
+ ## Workspace review (no git required)
95
+
96
+ Use `workspace` to review the current folder when there is no git repo, no branch, a detached HEAD, or no open PR. Diffwarden discovers files, detects the stack, and reviews high-signal code, config, tests, and docs. It skips PR detection, CI, and GitHub comments.
97
+
98
+ ```text
99
+ /dw review workspace # read-only workspace review
100
+ /dw loop workspace # review + fix safe issues (backs up before editing)
101
+ /dw status workspace # score only
102
+ ```
103
+
104
+ Auto-fallback: when no explicit PR target is given and git/branch/PR is missing, Diffwarden falls back to workspace mode instead of blocking.
105
+
106
+ `loop workspace` creates a reversible baseline in `.diffwarden/backups/<timestamp>/` before the first edit. Workspace mode does not commit, push, or post PR comments. `--push` and `--commit` are rejected for `workspace`.
107
+
108
+ ## Review uncommitted changes (no PR)
109
+
110
+ Pass a local target instead of a PR to review your working tree before you
111
+ commit or open a PR. No GitHub access, no CI, no review threads — just the diff,
112
+ your project context, and the same review pipeline.
113
+
114
+ | Target | Diff scope |
115
+ |--------|------------|
116
+ | `local` / `worktree` | All changes vs `HEAD` **plus** untracked files (gitignored excluded). |
117
+ | `staged` | Staged changes only (`git diff --cached`). |
118
+
119
+ ```text
120
+ /dw review local # read-only review of everything uncommitted
121
+ /dw review staged # review only what you've git add-ed
122
+ /dw loop local # review + apply safe fixes (no commit/push unless --commit)
123
+ /dw review local --security # security-focused pass on uncommitted changes
124
+ ```
125
+
126
+ Valid with `review`, `loop`, and `status`. Everything that defines a
127
+ review still runs — classification, severity, confidence score, fix loop,
128
+ verification, security checklist. What's skipped (no PR exists): PR detection,
129
+ CI, review/issue comments, `comment`, and push. `loop local` edits the working tree only unless you pass `--commit`. `comment` and `--push` are rejected with a local target.
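For orientation, the two diff scopes in the table above correspond roughly to the plain git commands below. This is a sketch for previewing which files a local target would cover, not Diffwarden's own implementation. The helper names `dw_scope_local` and `dw_scope_staged` are hypothetical, and the sketch assumes the repository already has at least one commit so that `HEAD` resolves.

```bash
# Hypothetical helpers (not part of Diffwarden): list the files that the
# documented local scopes would include.

# local / worktree: tracked changes vs HEAD, plus untracked files that are
# not gitignored.
dw_scope_local() {
  git diff --name-only HEAD
  git ls-files --others --exclude-standard
}

# staged: only what has been git add-ed.
dw_scope_staged() {
  git diff --cached --name-only
}
```

Running `dw_scope_local` before `/dw review local` gives a quick sense of how much the review will read.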
130
+
131
+ ## Auto-detected mode (code vs plan)
132
+
133
+ `review` and `loop` work on **either** code or a plan/docs file.
134
+ Diffwarden classifies the *target* and runs the matching mode — you do
135
+ not pick a separate subcommand.
136
+
137
+ | Target | Detected mode |
138
+ |--------|---------------|
139
+ | `workspace` | code (folder scan, git optional) |
140
+ | `#123`, `123`, full PR URL, `current`, or omitted | code |
141
+ | `local`, `staged`, `worktree` | code |
142
+ | a single prose `.md` plan/docs file | document/plan |
143
+ | `--as-code` flag | code (forced) |
144
+ | `--as-plan` flag | plan (forced) |
145
+ | **mixed** signals (e.g. a PR *and* a `.md` plan) | **asks you; default code** |
146
+
147
+ ```text
148
+ /dw review #123 # PR review
149
+ /dw review # current branch PR or git-local/workspace fallback
150
+ /dw review workspace # folder review, no git required
151
+ /dw review docs/plan.md # document/plan review
152
+ /dw review docs/plan.md --as-code # force code review of the file
153
+ /dw loop #123 # PR fix loop
154
+ /dw loop docs/plan.md # revise document in place (backs up to <file>.orig)
155
+ /dw loop docs/plan.md --as-plan # explicit document mode
156
+ ```
157
+
158
+ `--as-code` / `--as-plan` override the detector; on a mix of signals Diffwarden
159
+ **asks first** (defaulting to code only if you don't choose) — it never silently
160
+ guesses. Document mode never touches a PR and leaves git alone by default: `review` is read-only;
160
+ `loop` revises only the target document (backing up to `<file>.orig`) and
161
+ commits only when you pass `--commit` in a git context; it never pushes. `comment` is PR-only.
163
+
164
+ > The older `review-plan` / `fix-plan` names still work as **hidden back-compat
165
+ > aliases** (equivalent to `review <file> --as-plan` / `loop <file> --as-plan`), but `review` /
166
+ > `loop` on a `.md` file is the way to invoke document mode now.
167
+
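The detection table in this section can be read as a small classifier. The sketch below is illustrative only: the real detector lives in the skill playbook, `dw_detect_mode` is a hypothetical name, and two choices here are assumptions not stated in the README: any `.md` target is treated as a document without checking that it is prose, and an unrecognized target returns `ask`.

```bash
# Hypothetical sketch of the target-to-mode table; not Diffwarden's code.
# Usage: dw_detect_mode [--as-code|--as-plan] [<target>...]
# Prints: code, plan, or ask.
dw_detect_mode() {
  local arg forced="" target="" mixed=0
  for arg in "$@"; do
    case "$arg" in
      --as-code) forced="code" ;;
      --as-plan) forced="plan" ;;
      *)
        # A second target (e.g. a PR and a .md plan) counts as mixed signals.
        if [ -n "$target" ]; then mixed=1; fi
        target="$arg"
        ;;
    esac
  done
  # Explicit flags override the detector.
  if [ -n "$forced" ]; then echo "$forced"; return; fi
  if [ "$mixed" -eq 1 ]; then echo "ask"; return; fi
  case "$target" in
    ""|current|workspace|local|staged|worktree) echo "code" ;;
    *.md) echo "plan" ;;
    \#*|*/pull/*) echo "code" ;;
    *[!0-9]*) echo "ask" ;;   # unknown target (assumption: ask the user)
    *) echo "code" ;;         # bare PR number such as 123
  esac
}
```

For example, `dw_detect_mode docs/plan.md --as-code` prints `code`, matching the forced-mode row of the table.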
168
+ ## Web-augmented review (opt-in)
169
+
170
+ Off by default. Diffwarden grounds its findings against your repo and the diff —
171
+ **never the internet** — unless you turn this on with `--web` (alias
172
+ `--research`). Even then it never searches silently: on an **uncertain** finding
173
+ it asks first and waits for your `y`.
174
+
175
+ Two gates, both required:
176
+
177
+ 1. **You pass `--web`.** Without it, Diffwarden never touches the network for a
178
+ review. (The only other network call is the help-path version check.)
179
+ 2. **Per finding, it asks and waits:**
180
+
181
+ ```text
182
+ I am unsure about <finding>. Search the web to verify? [y/N]
183
+ Query (redacted): "<minimal finding descriptor>"
184
+ ```
185
+
186
+ Default is **No**. Anything but `y` skips the search and keeps the finding
187
+ `local-only`. No batch-approve, no assuming consent from the flag.
188
+
189
+ **When it offers a search:** only on genuine uncertainty — a low-confidence
190
+ finding, something time-sensitive (a CVE, a security advisory, a deprecation, a
191
+ current best practice or idiomatic pattern), or when you asked for a
192
+ deep/verbose review. High-confidence, locally-provable findings are never sent
193
+ out.
194
+
195
+ **What leaves your machine:** the **minimal finding descriptor only** — the
196
+ abstract shape of the issue (e.g. "Express open-redirect via unvalidated user
197
+ input"). Never your code, diff, secrets, tokens, file paths, or internal names.
198
+ The exact redacted query is shown in the prompt — what you approve is what's
199
+ sent. A web search is egress to a third party that may be logged or indexed;
200
+ that's why it's gated, redacted, and minimized.
201
+
202
+ **Output:** every finding is marked `web-verified` (a consented search grounded
203
+ it; URL cited) or `local-only` (the default). Web grounding **never** raises
204
+ severity on its own and never bypasses a safety cap — severity and the
205
+ confidence score stay Diffwarden's own judgment.
206
+
207
+ Valid on `review`, `loop`, and `review --security` (code targets, including
208
+ `local` / `staged` / `workspace`), and compatible with `--dry-run`.
209
+ Rejected on `status` (snapshot only) and on document mode (`--as-plan`
210
+ or a `.md` docs target) — document critique grounds against your repo, not the web.
211
+
212
+ ```text
213
+ /dw review #123 --web # asks [y/N] before grounding any uncertain finding
214
+ /dw loop --web --security # security run reads raw; web grounding still per-finding gated
215
+ ```
216
+
217
+ ## Loop until merge-ready (c5/5)
218
+
219
+ `loop` is the primary review-fix-verify command. Each iteration: collect evidence → classify top blocker → compute confidence → fix safe scoped issue → verify → rescore. It stops at **c5/5**, at **c4/5** with `--mvp`, or on a safety stop.
220
+
221
+ ### Lean output (default)
222
+
223
+ Loop prints one line per iteration, then final `Status:` and `Level:` lines — no long evidence blocks unless `--verbose`:
224
+
225
+ ```text
226
+ c2/5 P1 src/auth.ts:44 — missing ownership check
227
+ c3/5 P2 tests missing for changed branch
228
+ c4/5 mvp-ready — only P3/info remains
229
+ c5/5 clean
230
+
231
+ Status: ready
232
+ Level: 5/5
233
+ ```
234
+
235
+ Every final review ends with `Status:` then `Level:`. Review output is also lean by default:
236
+
237
+ ```text
238
+ Findings:
239
+ - P1 src/auth.ts:44 — missing ownership check
240
+ - P2 tests/auth.test.ts — missing coverage for denied update
241
+
242
+ Status: not-ready
243
+ Level: 2/5
244
+ ```
245
+
246
+ Use `--verbose` for the full report (iterations, verification, changed files, risks, next action, how to test).
247
+
248
+ ### Commands
249
+
250
+ | Goal | Command |
251
+ |------|---------|
252
+ | Loop locally (no commit/push) | `/dw loop` or `/dw loop --max 5` |
253
+ | Loop + commit | `/dw loop --commit` |
254
+ | Loop + commit + push (PR only) | `/dw loop --push` or `/dw loop #123 --push` |
255
+ | Check score only | `/dw status` |
256
+ | Stop at MVP (c4/5) | `/dw loop --mvp` |
257
+ | Post short PR comment | `/dw comment #123` |
258
+
259
+ Natural language: `Use diffwarden on the current PR --max 5`
260
+
261
+ Default **3** iterations; hard max **5** unless you explicitly ask for more in chat.
262
+
263
+ ### What c5/5 means
264
+
265
+ All must be true (scope-dependent — PR vs local vs workspace):
266
+
267
+ - Required CI checks pass (PR mode) or grounded local verification passes
268
+ - No actionable findings remain
269
+ - No open P0/P1/security issue
270
+ - PR description adequate (PR mode) or document/workspace ready
271
+
272
+ Score is recomputed from evidence every iteration. **c5/5 does not auto-merge** — you merge.
273
+
274
+ ### Confidence scale (short)
275
+
276
+ | Score | Meaning |
277
+ |-------|---------|
278
+ | `c5/5` | Clean / merge-ready (loop stops) |
279
+ | `c4/5` | MVP-ready — only P3 / informational items left (`--mvp` stops here) |
280
+ | `c3/5` | P2 issues or missing targeted test / verification |
281
+ | `c2/5` | P1 issue or failing required check |
282
+ | `c1/5` | P0, security, or hard build failure |
283
+
284
+ Safety caps: unresolved P0/security → max `1/5`; failing required check → max `2/5`;
285
+ needs-user-decision → max `3/5` until you decide.
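One way to read the safety caps is as upper bounds applied on top of whatever score the evidence would otherwise support, with the strictest cap winning. The function below is a minimal sketch under that reading; `dw_cap_score` and its argument order are hypothetical, and Diffwarden's actual scoring is done by the agent following the playbook, not by a script like this.

```bash
# Hypothetical sketch of the safety caps; not Diffwarden's scoring code.
# Usage: dw_cap_score <raw 1-5> <p0_or_security 0|1> <failing_check 0|1> <needs_decision 0|1>
dw_cap_score() {
  local score=$1
  # Apply the loosest cap first so the strictest applicable cap wins.
  if [ "$4" -eq 1 ] && [ "$score" -gt 3 ]; then score=3; fi  # needs-user-decision
  if [ "$3" -eq 1 ] && [ "$score" -gt 2 ]; then score=2; fi  # failing required check
  if [ "$2" -eq 1 ] && [ "$score" -gt 1 ]; then score=1; fi  # unresolved P0/security
  echo "c${score}/5"
}
```

A cap never raises a score: a raw `2` with only a pending user decision stays at `c2/5`.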
286
+
287
+ ### Evidence-based findings
288
+
289
+ - Actionable findings need **anchor + quote** — not model guesswork.
290
+ - Anchors: `file:line`, check name, PR field, or comment/thread id.
291
+ - Fix plans reference only files that are in the diff or that the agent has read; verification commands must exist in the project's manifests.
292
+ - Verification is built into `loop` (no `--verify` flag on `review`).
293
+ - `--verbose` loop adds structured `verify: pass|fail|skipped` reporting.
294
+
295
+ ### When it stops before c5/5
296
+
297
+ | Reason | What to do |
298
+ |--------|------------|
299
+ | Hit the iteration limit (`--max`) | Run again: `/dw loop --max 5` |
300
+ | `--mvp` and c4/5 | Done for MVP — merge or continue without `--mvp` |
301
+ | Needs user decision (API, product, migration…) | Answer in chat, re-run |
302
+ | Same finding repeats | Agent stops — fix root cause manually |
303
+ | CI still pending | Wait for green, then `/dw status` |
304
+ | Dirty unrelated files | Clean worktree or stash first |
305
+
306
+ ### Example workflow
307
+
308
+ ```text
309
+ /dw status
310
+ /dw loop --max 5
311
+ /dw loop #123 --push
312
+ /dw comment #123
313
+ ```
314
+
315
+ ## Optional orchestration
316
+
317
+ Diffwarden can optionally split review and fix work across different models using
318
+ `--orchestrate`. This is off by default. See
319
+ [docs/orchestration.md](docs/orchestration.md).
320
+
321
+ ## What it actually does
322
+
323
+ This repo is **not an app**. It is one markdown playbook (`skills/diffwarden/SKILL.md`) that teaches an AI coding agent (Claude Code, Copilot CLI, Cursor, etc.) a safe, repeatable way to babysit a pull request.
324
+
325
+ Given a PR, the agent:
326
+
327
+ 1. Checks your environment is safe to work in (git repo, logged into GitHub, right branch).
328
+ 2. Reads everything: the diff, CI status, inline review comments, bot comments.
329
+ 3. Sorts findings into: must-fix now, FYI, already fixed, or "ask the human".
330
+ Actionable items need anchor + quote (file/line, check, PR field, or comment).
331
+ 4. Ranks by severity (P0 security/data-loss down to P3 polish).
332
+ 5. Writes a small fix plan, applies safe fixes, and runs discovered tests/linters
333
+ to prove they work (`loop` only — `review` is read-only).
334
+ 6. Optionally posts the review on GitHub or commits fixes — only if you allow it.
335
+ 7. Loops until the PR is merge-ready, blocked, or it needs your decision.
336
+
337
+ ## Is this for me?
338
+
339
+ Use it if you want to:
340
+
341
+ - check a PR before merging it
342
+ - get failing CI checks fixed safely
343
+ - review a teammate's PR and leave comments on GitHub
344
+ - do a focused security pass on changed code
345
+
346
+ Don't use it for: deploying to production, auto-merging, rewriting git history, or large refactors unrelated to the PR.
347
+
348
+ ## Prerequisites (do this first)
349
+
350
+ You need a coding agent that can read skills and run shell commands. Examples: Claude Code, Codex, GitHub Copilot CLI, Cursor, OpenCode, Pi Agent. The installer targets Claude Code, Codex, Cursor, and Pi directly; any other skill-loading agent works via manual copy ([Install](#install) Option C/D).
351
+
352
+ **For PR review** you also need `git`, GitHub CLI (`gh`), and a logged-in GitHub session:
353
+
354
+ ```bash
355
+ git --version
356
+ gh --version
357
+ gh auth status # should say "Logged in to github.com"
358
+ gh auth login # run this if it doesn't
359
+ ```
360
+
361
+ **For workspace or document review** git and `gh` are optional. Diffwarden falls back to workspace mode when git/branch/PR is unavailable.
362
+
363
+ Optional: export `GH_TOKEN` (or `GITHUB_TOKEN`) for CI/automation when `gh auth
364
+ login` is not available. Diffwarden tries `gh auth status` first; if you are
365
+ logged in, it ignores env tokens for that session so `gh` uses your user. With
366
+ no active user, it validates env tokens with `gh api user`. It never searches
367
+ files or config for tokens.
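The auth order described above can be sketched in a few lines of shell. This is an approximation for checking your own setup, not Diffwarden's code: `dw_auth_source` is a hypothetical name, and unsetting the tokens in a subshell is only one possible way to make `gh auth status` reflect the stored login. `gh auth status` and `gh api user` are standard GitHub CLI commands.

```bash
# Sketch of the documented auth order; not Diffwarden's own code.
# dw_auth_source is a hypothetical name. Prints gh-login, env-token, or none.
dw_auth_source() {
  # 1. Check for a stored gh login. The env tokens are unset inside the
  #    subshell so this check reflects the stored login, not the token.
  if ( unset GH_TOKEN GITHUB_TOKEN; gh auth status >/dev/null 2>&1 ); then
    echo "gh-login"
    return 0
  fi
  # 2. No active user: validate an env token, if one is set.
  if [ -n "${GH_TOKEN:-${GITHUB_TOKEN:-}}" ] && gh api user >/dev/null 2>&1; then
    echo "env-token"
    return 0
  fi
  echo "none"
  return 1
}
```

If this prints `none`, run `gh auth login` or export a valid token before starting a PR review.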
368
+
369
+ ## Pi Agent
370
+
371
+ Diffwarden can be used with Pi Agent three ways: installer/manual skill copy, prompt templates, or optional Pi package extension.
372
+
373
+ Diffwarden core behavior stays agent-neutral. The extension only adds native `/dw` and `/diffwarden` commands that forward to `/skill:diffwarden`, plus bundled skill discovery.
374
+
375
+ ### Pi package extension
376
+
377
+ > Security: Pi extensions run with full local permissions. Review `extensions/diffwarden/index.ts` before installing.
378
+
379
+ ```bash
380
+ pi install npm:pi-diffwarden@0.26.1 # global
381
+ pi install -l npm:pi-diffwarden@0.26.1 # project
382
+
383
+ # Git source also works:
384
+ pi install git:github.com/jperocho/diffwarden@v0.26.1
385
+ ```
386
+
387
+ The package loads `extensions/diffwarden/index.ts`, which discovers `skills/diffwarden/SKILL.md` from this repo. Restart Pi Agent or run `/reload` after installing.
388
+
389
+ ### Manual install
390
+
391
+ Copy the Diffwarden skill into one of Pi Agent's skill locations:
392
+
393
+ ```bash
394
+ # project scope, loaded after the project is trusted
395
+ mkdir -p .pi/skills/diffwarden
396
+ cp skills/diffwarden/SKILL.md .pi/skills/diffwarden/SKILL.md
397
+
398
+ # global scope
399
+ mkdir -p ~/.pi/agent/skills/diffwarden
400
+ cp skills/diffwarden/SKILL.md ~/.pi/agent/skills/diffwarden/SKILL.md
401
+ ```
402
+
403
+ Pi also discovers skills from `.agents/skills/` and `~/.agents/skills/`, so the Codex-compatible install path works in Pi too.
404
+
405
+ Optional `/dw` and `/diffwarden` aliases can be installed as Pi prompt templates when you do not use the extension package:
406
+
407
+ ```bash
408
+ # project scope
409
+ mkdir -p .pi/prompts
410
+ cp skills/diffwarden/prompts/dw.md .pi/prompts/dw.md
411
+ cp skills/diffwarden/prompts/diffwarden.md .pi/prompts/diffwarden.md
412
+
413
+ # global scope
414
+ mkdir -p ~/.pi/agent/prompts
415
+ cp skills/diffwarden/prompts/dw.md ~/.pi/agent/prompts/dw.md
416
+ cp skills/diffwarden/prompts/diffwarden.md ~/.pi/agent/prompts/diffwarden.md
417
+ ```
418
+
419
+ The prompt templates must pass arguments through with `$ARGUMENTS` so `/dw loop workspace` expands to a Diffwarden invocation with `loop workspace` intact.
420
+
421
+ Restart Pi Agent or run `/reload` after installing. Without prompt templates or the extension, invoke the skill with `/skill:diffwarden` or plain chat.
422
+
423
+ ### Installer
424
+
425
+ ```bash
426
+ ./install.sh --pi --project
427
+ ./install.sh --pi --global
428
+ ./install.sh --pi --pi-root ~/.pi/agent --global
429
+ ./install.sh --pi --pi-root ./.pi --project
430
+ ./install.sh --pi --dry-run
431
+ ```
432
+
433
+ ### Usage
434
+
435
+ ```text
436
+ /dw review workspace
437
+ /dw loop workspace
438
+ /dw review
439
+ /dw loop
440
+ /diffwarden review
441
+ /diffwarden loop
442
+ /skill:diffwarden loop workspace
443
+ ```
444
+
445
+ ### Extension behavior
446
+
447
+ The extension registers native `/dw` and `/diffwarden` commands, offers basic argument completions, and sends `/skill:diffwarden <args>` to the agent. It does not add tool interception, auto-merge, auto-push, file writes, or background processes.
448
+
449
+ ## Install
450
+
451
+ **Global install is recommended** — Diffwarden is a reusable reviewer/fixer that should be available in every workspace. Project install is still supported for team/repo-specific distribution.
452
+
453
+ There is **no `npx`/skills.sh step** — that loader proved flaky, so Diffwarden
454
+ installs with its own script or a plain copy. Both place the same files:
455
+
456
+ - the skill itself → `<root>/.claude/skills/diffwarden/SKILL.md` (Claude Code),
457
+ `<root>/.agents/skills/diffwarden/SKILL.md` (Codex), and/or
458
+ `<root>/.cursor/skills/diffwarden/SKILL.md` (Cursor),
459
+ - the optional `/dw` and `/diffwarden` slash-command files → `<root>/.claude/commands/`
460
+ and/or `<root>/.cursor/commands/` (Claude Code and Cursor only — Codex does not
461
+ use command files; see [Codex CLI](#codex-cli)),
462
+
463
+ where `<root>` is your project folder (project scope) or `$HOME` (global scope).
464
+
465
+ **Option A — installer (recommended).** It detects which agents you have, asks
466
+ where to install (global recommended), copies the skill + command files into the right places,
467
+ skips files already up to date, and never overwrites a changed file without
468
+ asking.
469
+
470
+ > **Security — inspect before you run.** Diffwarden is a safety tool; don't
471
+ > pipe a script straight into a shell on its word. Download it, read it, then
472
+ > run it. The installer pins to a release tag, uses HTTPS only, never uses
473
+ > `sudo`, and only writes under `.claude/`, `.cursor/`, `.agents/`, Pi roots
474
+ > (`skills/` + `prompts/` only), and optional `~/.config/diffwarden/` when you
475
+ > confirm orchestration defaults.
476
+
477
+ ```bash
478
+ # Recommended: download → read → run
479
+ curl -fsSLO https://raw.githubusercontent.com/jperocho/diffwarden/v0.26.1/install.sh
480
+ less install.sh # read it first
481
+ bash install.sh # interactive: detects agents, asks scope, confirms
482
+
483
+ # Or run it straight from a clone (no network):
484
+ git clone https://github.com/jperocho/diffwarden
485
+ cd diffwarden && ./install.sh
486
+ ```
487
+
488
+ Useful flags (see `./install.sh --help`):
489
+
490
+ ```bash
491
+ ./install.sh --dry-run # show the plan, write nothing
492
+ ./install.sh --claude --project # Claude Code, current repo only
493
+ ./install.sh --codex --global # Codex, all projects on this machine
494
+ ./install.sh --cursor --global # Cursor, all projects on this machine
495
+ ./install.sh --pi --global # Pi Agent, all projects on this machine
496
+ ./install.sh --yes # non-interactive (accept detected defaults)
497
+ ./install.sh --force # overwrite differing files without prompting
498
+ ```
499
+
500
+ **Option B — manual copy.** Do exactly what the installer does, by hand. Pick a
501
+ `<root>` (`.` for this project, `~` for global) and the matching agent location:
502
+
503
+ ```bash
504
+ # Claude Code, project scope
505
+ mkdir -p .claude/skills/diffwarden .claude/commands
506
+ cp skills/diffwarden/SKILL.md .claude/skills/diffwarden/SKILL.md
507
+ cp skills/diffwarden/commands/dw.md .claude/commands/
508
+ cp skills/diffwarden/commands/diffwarden.md .claude/commands/
509
+
510
+ # Codex, project or global scope (same skill path; invoke with $diffwarden)
511
+ mkdir -p .agents/skills/diffwarden
512
+ cp skills/diffwarden/SKILL.md .agents/skills/diffwarden/SKILL.md
513
+ # global: mkdir -p ~/.agents/skills/diffwarden && cp ... ~/.agents/skills/diffwarden/
514
+
515
+ # Cursor, project scope — same files under .cursor/
516
+ mkdir -p .cursor/skills/diffwarden .cursor/commands
517
+ cp skills/diffwarden/SKILL.md .cursor/skills/diffwarden/SKILL.md
518
+ cp skills/diffwarden/commands/dw.md .cursor/commands/
519
+ cp skills/diffwarden/commands/diffwarden.md .cursor/commands/
520
+ ```
521
+
522
+ For global Claude Code/Cursor scope, swap the leading `.` for `~`. For global
523
+ Codex skills, use `~/.agents/skills/diffwarden/SKILL.md`.
524
+
525
+ Claude Code and Codex load skills at session start — restart (or `/clear` in
526
+ Codex) after installing. Codex invocation details: [Codex CLI](#codex-cli).
527
+
528
+ **Optional — caveman mode for token savings.** Diffwarden runs long review loops
529
+ (diffs, CI logs, threads), so it pairs well with the [`caveman`](https://github.com/JuliusBrussee/caveman)
530
+ skill, which compresses agent output ~75% with no loss of technical substance. If
531
+ `caveman` is loaded, Diffwarden runs in caveman mode automatically; if not, it prints
532
+ a one-time install tip and continues normally.
533
+
534
+ Caveman activation differs by agent:
535
+
536
+ - **Claude Code / Codex / Gemini** — hook-driven, auto-activates per session once installed.
537
+ - **Cursor / Windsurf / Cline / Copilot** — no hook system; activation is a static
538
+ rule file. For Cursor, install the rule into `.cursor/rules/`:
539
+
540
+ ```bash
541
+ npx skills add JuliusBrussee/caveman -a cursor --with-init
542
+ ```
543
+
544
+ > **Caution for this repo only:** `--with-init` also writes repo-root `AGENTS.md`,
545
+ > which in this project is a symlink to `CLAUDE.md`. Running it here would modify
546
+ > project instructions. Instead, copy just the Cursor rule by hand:
547
+ >
548
+ > ```bash
549
+ > mkdir -p .cursor/rules
550
+ > cp ~/.claude/plugins/marketplaces/caveman/src/rules/caveman-activate.md \
551
+ > .cursor/rules/caveman.mdc
552
+ > echo ".cursor/rules/caveman.mdc" >> .gitignore # keep out of the distributable
553
+ > ```
554
+
555
+ Cursor reads only `.cursor/` and repo-root `AGENTS.md`; it never reads Claude's
556
+ `~/.claude` install, so the two stay isolated.
557
+
558
+ **Option C — other agents / custom skill folder.** Copy the skill wherever your
559
+ agent loads skills from:
560
+
561
+ ```bash
562
+ mkdir -p ~/.config/agent-skills/diffwarden
563
+ cp skills/diffwarden/SKILL.md ~/.config/agent-skills/diffwarden/SKILL.md
564
+ ```
565
+
566
+ **Option D — no skill loader.** Paste the contents of `skills/diffwarden/SKILL.md` into your agent's context before you give it the PR task.
567
+
568
+ ## Slash commands
569
+
570
+ Examples and natural-language form. Full command table: [Command reference](#command-reference).
571
+
572
+ ```text
573
+ /diffwarden review #123
574
+ /diffwarden loop
575
+ /diffwarden loop #123 --push
576
+ /diffwarden comment #123
577
+ /dw status
578
+ /dw loop workspace
579
+ /dw help
580
+ ```
581
+
582
+ Natural-language equivalents:
583
+
584
+ ```text
585
+ Use diffwarden on the current PR --dry-run
586
+ Use diffwarden on PR https://github.com/owner/repo/pull/123 --no-push
587
+ ```
588
+
589
+ ## Codex CLI
590
+
591
+ Codex installs and runs Diffwarden as a **skill**, not as custom slash commands.
592
+ The grammar is the same; only the prefix changes.
593
+
594
+ ### Supported
595
+
596
+ | How | Example |
597
+ | --- | --- |
598
+ | Skill install path | `.agents/skills/diffwarden/SKILL.md` or `~/.agents/skills/diffwarden/SKILL.md` |
599
+ | Explicit invocation | `$diffwarden review`, `$diffwarden loop workspace` |
600
+ | Skill picker | `/skills` → choose **diffwarden** |
601
+ | Plain chat | Works when the task matches the skill description (implicit load) |
602
+
603
+ After install, restart Codex or run `/clear` so it rescans skills.
604
+
605
+ ### Not supported (and why)
606
+
607
+ | What you might expect | Why it does not work |
608
+ | --- | --- |
609
+ | `/diffwarden`, `/dw` in the `/` menu | Codex `/` commands are **built-in only** (`/skills`, `/review`, `/model`, …). Custom slash commands from skill or command files are not loaded. OpenAI directs skill use through `$skill-name` instead ([codex#11817](https://github.com/openai/codex/issues/11817)). |
610
+ | `/prompts:diffwarden`, `/prompts:dw` | **Custom prompts** in `~/.codex/prompts/` were [deprecated](https://developers.openai.com/codex/custom-prompts) and **removed in the March 2026 Codex release** (0.117 series). OpenAI consolidated on Agent Skills as the standard; overlapping prompt-slash machinery was dropped ([codex#15941](https://github.com/openai/codex/issues/15941)). |
611
+ | `.codex/commands/` or `.codex/skills/` | Legacy paths. Current Codex reads skills from `.agents/skills` / `~/.agents/skills` per [customization docs](https://developers.openai.com/codex/concepts/customization). |
612
+
613
+ There is no `/dw` shorthand on Codex unless you add a separate skill named `dw`.
614
+ Use `$diffwarden` — same subcommands and flags as the [command reference](#command-reference).
615
+
616
+ ```text
617
+ $diffwarden review
618
+ $diffwarden loop workspace
619
+ $diffwarden loop #123 --push
620
+ $diffwarden comment #123
621
+ $diffwarden status
622
+ /skills # pick diffwarden from the menu
623
+ ```
624
+
625
+ ## Your first run (step by step)
626
+
627
+ 1. `cd` into your repo and switch to the PR's branch.
628
+ 2. Confirm you're set up:
629
+
630
+ ```bash
631
+ gh auth status
632
+ gh pr view # should show the current PR
633
+ ```
634
+
635
+ 3. In your agent, type:
636
+
637
+ ```text
638
+ /diffwarden review
639
+ ```
640
+
641
+ Or the long form:
642
+
643
+ ```text
644
+ Use diffwarden on the current PR --dry-run
645
+ ```
646
+
647
+ Both mean **review and plan only — change nothing.** Best way to start: zero risk.
648
+
649
+ 4. Read the report. It lists findings, severity, and a fix plan.
650
+ 5. When ready to let it act:
651
+
652
+ ```text
653
+ /diffwarden loop
654
+ ```
655
+
656
+ Or with explicit push on a PR:
657
+
658
+ ```text
659
+ /diffwarden loop #123 --push
660
+ ```
661
+
662
+ If you omit the PR number/URL, it detects the PR from your current branch.
663
+
664
+ ## Modes / flags
665
+
666
+ Add these after the command. Combine freely. Short help shows primary flags; use `/dw help --verbose` for the full list.
667
+
668
+ | Flag | What it does |
669
+ |------|--------------|
670
+ | `--mvp` | Stop loop at `c4/5` when only P3/info remains. |
671
+ | `--verbose` | Full detailed report instead of lean output. |
672
+ | `--orchestrate` | Optional reviewer/fixer model split ([docs/orchestration.md](docs/orchestration.md)). |
673
+ | `--commit` | Commit verified changes (git modes, after verification). |
674
+ | `--push` | Commit + push verified changes (PR mode only, after head recheck). |
675
+ | `--as-code` / `--as-plan` | Force code or document mode, overriding the [target detector](#auto-detected-mode-code-vs-plan). |
676
+ | `--dry-run` | Review and plan only. No edits, commits, pushes, or comments. **Start here.** |
677
+ | `--security` | Prioritize security: auth, injection, SSRF, secrets, path traversal, crypto, data loss. |
678
+ | `--reply` / `--resolve` | Reply on or resolve PR review threads (explicit OK each run). |
679
+ | `--web` / `--research` | Opt into [web-augmented review](#web-augmented-review-opt-in). Per-finding consent. |
680
+ | `--max N` | Maximum number of loop iterations. Default `3`; hard max `5`. |
681
+
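The `--max N` bounds in the table above can be pictured as a small clamp. The sketch below is illustrative only: the function name `clamp_max_iterations` is hypothetical, and two behaviors are assumptions rather than documented Diffwarden behavior: falling back to the default for missing or non-numeric input, and capping (instead of rejecting) values above `5`.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of Diffwarden: resolve a requested --max value.
# Assumptions: empty or non-numeric input falls back to the default (3);
# values above the hard max (5) are capped instead of rejected.
clamp_max_iterations() {
  local requested="${1:-}"
  local default_max=3 hard_max=5

  # Accept only positive integers without leading zeros; otherwise use the default.
  if ! [[ "$requested" =~ ^[1-9][0-9]*$ ]]; then
    echo "$default_max"
    return 0
  fi

  # Any multi-digit value is already above the cap; checking the length first
  # also avoids integer overflow on absurdly long inputs.
  if (( ${#requested} > 1 || requested > hard_max )); then
    echo "$hard_max"
  else
    echo "$requested"
  fi
}
```

Under these assumptions, `clamp_max_iterations 9` prints `5` and `clamp_max_iterations` with no argument prints `3`.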
682
+ ## Common recipes
683
+
684
+ **Review your own PR before merge (safe, read-only):**
685
+
686
+ ```text
687
+ /diffwarden review
688
+ ```
689
+
690
+ **Review a teammate's PR and post a short comment:**
691
+
692
+ ```text
693
+ /diffwarden comment #123
694
+ ```
695
+
696
+ After you confirm, it posts a `COMMENT`-type review with inline P-level notes. It will **not** approve or request changes on the PR; that decision stays yours.
697
+
698
+ **Security-focused pass:**
699
+
700
+ ```text
701
+ /diffwarden review #123 --security
702
+ ```
703
+
704
+ **Review a folder with no git repo:**
705
+
706
+ ```text
707
+ /dw review workspace
708
+ /dw loop workspace
709
+ ```
710
+
711
+ **Address review feedback and reply on threads:**
712
+
713
+ ```text
714
+ /diffwarden loop #123 --reply --resolve
715
+ ```
716
+
717
+ **Let it fix safe issues locally, but don't push:**
718
+
719
+ ```text
720
+ /diffwarden loop
721
+ ```
722
+
723
+ **Loop until merge-ready and push (PR):**
724
+
725
+ ```text
726
+ /diffwarden loop #123 --push
727
+ ```
728
+
729
+ ## What it will and won't do
730
+
731
+ **Will:**
732
+
733
+ - Read diffs, checks, and comments.
734
+ - Fix safe, in-scope issues and run tests to verify.
735
+ - Reply on reviewer comment threads (with `--reply` + your OK).
736
+ - Resolve fixed threads (with `--resolve` + your OK).
737
+ - Post short comment-only reviews (with `comment` + your OK).
738
+ - Commit/push **only** when you pass `--commit` or `--push`.
739
+
740
+ **Won't (the safety promise):**
741
+
742
+ - No auto-merge.
743
+ - No force-push, no `git reset --hard`, no history rewrite.
744
+ - No blind push — it checks the PR head didn't change first.
745
+ - No approving or requesting changes on a PR.
746
+ - No resolving human review comments without your explicit approval.
747
+ - No weakening CI, tests, lint, branch protection, auth, secrets, or infra config to make checks pass.
748
+
749
+ **Stops and asks** on: dirty unrelated files, ambiguous risk, the PR head changing mid-run, protected branches, or a loop that isn't converging.
750
+
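The "no blind push" rule above can be sketched as a comparison between the PR head recorded at review time and the head seen just before pushing. This is a minimal illustration, not Diffwarden's implementation: `head_unchanged` is a hypothetical name, and the commented `gh pr view --json headRefOid` wiring is one assumed way to read the PR head.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of Diffwarden.
# Succeeds only when both SHAs are non-empty and identical.
head_unchanged() {
  local reviewed_sha="${1:-}" current_sha="${2:-}"
  [[ -n "$reviewed_sha" && "$reviewed_sha" == "$current_sha" ]]
}

# Assumed wiring (needs an authenticated gh and an open PR, so it is left
# commented out):
#   reviewed_sha="$(gh pr view --json headRefOid --jq .headRefOid)"
#   # ... review, fix, verify ...
#   current_sha="$(gh pr view --json headRefOid --jq .headRefOid)"
#   if head_unchanged "$reviewed_sha" "$current_sha"; then
#     git push
#   else
#     echo "PR head changed mid-run; stopping." >&2
#   fi
```

Treating an empty SHA as a mismatch is a deliberate choice in this sketch: if the head cannot be read, the safe outcome is to stop rather than push.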
751
+ ## Core loop
752
+
753
+ ```text
754
+ preflight -> detect PR -> collect evidence -> classify -> plan fixes -> apply safe fixes -> verify -> optional commit/push -> optional thread replies/resolve -> optional post-review -> re-check -> report
755
+ ```
756
+
757
+ ## Troubleshooting / FAQ
758
+
759
+ **"`/dw` doesn't show in the `/` menu."** For Claude Code/Cursor, the command files weren't installed; copy `dw.md` / `diffwarden.md` into `.claude/commands/`, `~/.claude/commands/`, `.cursor/commands/`, or `~/.cursor/commands/`, then restart. For Pi Agent, install prompt templates or the Pi package extension, then run `/reload`. For **Codex CLI**, `/dw` and `/diffwarden` are **not supported**: the `/` menu lists built-in commands only. Install the skill to `.agents/skills/diffwarden/` (or `~/.agents/skills/diffwarden/`), restart or run `/clear`, then use `$diffwarden review ...` or `/skills`. See [Codex CLI](#codex-cli) for why `/prompts:dw` also no longer works on Codex ≥ 0.117.
760
+
761
+ **"Caveman mode doesn't activate in Cursor."** Cursor has no hook system, so caveman
762
+ needs a static rule file at `.cursor/rules/caveman.mdc` (see [Install](#install)).
763
+ Check with `ls .cursor/rules/caveman.mdc`. Absent → caveman is inactive and Diffwarden
764
+ shows the install tip instead of compact output.
765
+
766
+ **"It says I'm not authenticated."** Run `gh auth login`, then `gh auth status`
767
+ to confirm. For CI with no `gh` user, export a valid `GH_TOKEN`. If you are
768
+ logged in via `gh` but a stale `GH_TOKEN` is set, Diffwarden unsets it so your
769
+ user login wins.
770
+
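The stale-`GH_TOKEN` handling described above can be approximated as follows. This is a sketch under stated assumptions, not Diffwarden's code: `prefer_gh_login` is a hypothetical name, "stale" is taken to mean that `gh auth status` fails while the token is exported but succeeds once it is removed, and the sketch relies on `gh auth status` exiting non-zero for a bad token.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of Diffwarden.
# Keeps a working GH_TOKEN, drops a stale one when the stored gh login works,
# and fails when neither credential is usable.
prefer_gh_login() {
  # No token exported: nothing to decide.
  [[ -n "${GH_TOKEN:-}" ]] || return 0

  # The token authenticates (the CI case): keep it.
  if gh auth status >/dev/null 2>&1; then
    return 0
  fi

  # The token fails. Probe the stored login in a subshell so the probe itself
  # does not change the caller's environment.
  if ( unset GH_TOKEN; gh auth status >/dev/null 2>&1 ); then
    unset GH_TOKEN
    return 0
  fi

  # Neither the token nor a stored login works.
  return 1
}
```

Because the function unsets `GH_TOKEN` in the calling shell, it must be sourced or defined in the same shell session that later runs `gh`; calling it from a child script would have no lasting effect.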
771
+ **"It can't find a PR."** Make sure you're on the PR's branch, or pass the number/URL explicitly: `... on PR 123`.
772
+
773
+ **"It refuses to run on `main`/`master`."** By design. Switch to the PR's feature branch first.
774
+
775
+ **"It won't fix my failing CI by editing the workflow."** Also by design — it never weakens CI/tests to go green. Fix the real cause.
776
+
777
+ **"Will it merge my PR?"** No. Never. You merge.
778
+
779
+ **"Can it review a PR from a fork?"** It can review and (with `comment`) post findings. It usually can't push fixes to a fork branch, so omit `--push` and treat its fixes as suggestions.
780
+
781
+ **"It stopped early."** It hit a safety stop (dirty worktree, ambiguous risk, head changed, max iterations). Read the report — it says why and what to do next.
782
+
783
+ ## Contributing
784
+
785
+ Contributions welcome — fork, branch, PR.
786
+
787
+ ```bash
788
+ # 1. Fork on GitHub, then:
789
+ git clone https://github.com/<you>/diffwarden
790
+ cd diffwarden
791
+ git checkout -b my-change
792
+
793
+ # 2. Make the change. Before pushing, run the same checks CI runs:
794
+ bash -n install.sh # shell syntax
795
+ shellcheck install.sh # shell lint
796
+ node -e 'JSON.parse(require("fs").readFileSync("package.json","utf8"))'
797
+ npm pack --dry-run
798
+ ./install.sh --pi --pi-root /tmp/pi-agent --project --dry-run
799
+ # Optional local Pi smoke, when pi is installed:
800
+ pi -e ./extensions/diffwarden/index.ts --help
801
+ # if you bumped the version, confirm it matches in every file (see below)
802
+
803
+ # 3. Push to your fork and open a PR against main.
804
+ git push -u origin my-change
805
+ ```
806
+
807
+ **Branch protection on `main`.** `main` is protected by repository rulesets —
808
+ plan your PR around them:
809
+
810
+ - **No direct pushes to `main`** — every change lands through a pull request,
811
+ including the maintainer's. Pushing to `main` is rejected.
812
+ - **1 approving review required** before merge.
813
+ - **CI must pass** (`bash -n` + `shellcheck` on `install.sh`, Pi installer
814
+ dry-runs, Pi package static checks, `npm pack --dry-run`, plus a version-sync
815
+ check). This is enforced for everyone — the maintainer can't merge red CI.
816
+ - **Squash merge only** — keeps `main` linear. Your PR's commits are squashed
817
+ into one on merge, so a clean PR title is the commit message.
818
+ - **No force-push, no branch deletion** on `main`.
819
+
820
+ The maintainer may merge without a second reviewer (solo project) but is still
821
+ held to "PR required" and "CI green" — same as everyone else.
822
+
823
+ **Touching the skill or its version?** `SKILL.md` is the source of truth; if you
824
+ change it, update `README.md` and `CHANGELOG.md` to match. The version string is
825
+ duplicated across six places and must stay in sync (CI fails otherwise) — see
826
+ [`CLAUDE.md`](CLAUDE.md) for the exact list and the project's editing rules.
827
+
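A local version-sync check along the lines described above might look like the sketch below. It is illustrative only: `check_version_sync` is a hypothetical name, the real CI check may work differently, and the authoritative list of files lives in `CLAUDE.md`, so any file names you pass are your own choice.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not the project's CI script.
# Usage: check_version_sync VERSION FILE...
# Reports every file that does not contain VERSION as a literal string.
check_version_sync() {
  local version="$1"
  shift
  local status=0 file

  for file in "$@"; do
    # -F matches the version literally, so the dots are not regex wildcards.
    # Caveat: this is a substring match, so "0.26.1" also matches "0.26.10".
    if ! grep -qF -- "$version" "$file" 2>/dev/null; then
      echo "version $version not found in $file" >&2
      status=1
    fi
  done

  return "$status"
}
```

For example, `check_version_sync 0.26.1 README.md package.json` returns non-zero and names the offending file if either one lacks the string. A missing file is reported the same way as a mismatch.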
828
+ ## Files
829
+
830
+ - `skills/diffwarden/SKILL.md` — the skill/playbook (the actual product).
831
+ - `skills/diffwarden/commands/` — optional `/dw` `/diffwarden` slash files for Claude Code and Cursor.
832
+ - `skills/diffwarden/prompts/` — Pi Agent prompt templates for `/dw` and `/diffwarden`.
833
+ - `extensions/diffwarden/index.ts` — optional Pi extension command wrapper and resource discovery.
834
+ - `package.json` — Pi/npm package manifest for installing the extension from npm, git, or local paths.
835
+ - `docs/orchestration.md` — optional model orchestration guide.
836
+ - `install.sh` — installer that detects agents and copies the skill + command files into place.
837
+ - `.github/workflows/ci.yml` — CI: runs `bash -n` and `shellcheck` on `install.sh`, Pi installer dry-runs, Pi package static checks, `npm pack --dry-run`, and the version-sync check.
838
+ - `README.md` — this guide.
839
+ - `CHANGELOG.md` — release notes.
840
+ - `CLAUDE.md` / `AGENTS.md` — agent guidance (`AGENTS.md` is a symlink to `CLAUDE.md`).
841
+ - `LICENSE` — MIT.
842
+ - `.gitignore` — local/editor/cache exclusions.
843
+
844
+ ## Version
845
+
846
+ Current version: `v0.26.1`