@apogeelabs/the-agency 0.7.0 → 0.9.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@apogeelabs/the-agency",
- "version": "0.7.0",
+ "version": "0.9.0",
  "description": "Centralized Claude Code agents, commands, and workflows",
  "type": "module",
  "bin": {
@@ -10,9 +10,9 @@ Before generating any tests, complete these steps in order:

  1. **Branch analysis** (Coverage-Driven Test Planning): Read the source, enumerate every branch, map each to a test scenario.
  2. **Superfluous test check** (Superfluous Test Prevention): Verify each planned scenario covers a distinct branch, not the same branch with different values.
- 3. **Execution location check** (CRITICAL RULE #1): Execute methods under test in `beforeEach()`, not in `it()` blocks.
- 4. **Mock configuration check** (CRITICAL RULE #2): Configure mock behavior in `beforeEach`, not at module level.
- 5. **Callback invocation check** (CRITICAL RULE #3): Use `mockImplementation` for callbacks, not `mock.calls[N][M]()`.
+ 3. **Execution location check** (CRITICAL RULE 1): Execute methods under test in `beforeEach()`, not in `it()` blocks.
+ 4. **Mock configuration check** (CRITICAL RULE 2): Configure mock behavior in `beforeEach`, not at module level.
+ 5. **Callback invocation check** (CRITICAL RULE 3): Use `mockImplementation` for callbacks, not `mock.calls[N][M]()`.

  Once tests are generated:

@@ -21,7 +21,7 @@ Once tests are generated:

  ---

- ## CRITICAL RULE #1: Test Execution Location
+ ## CRITICAL RULE 1: Test Execution Location

  ⚠️ **NON-NEGOTIABLE RULE** ⚠️

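In miniature, the rule looks like this. A runnable sketch, assuming a hypothetical `service.activate` subject; the tiny `beforeEach`/`it` harness below stands in for Jest so the shape executes on its own:

```javascript
// Toy stand-ins for Jest's beforeEach/it, just to make the pattern executable.
const hooks = [];
function beforeEach(fn) { hooks.push(fn); }
function it(name, fn) { hooks.forEach((hook) => hook()); fn(); }

// Invented subject under test.
const service = { activate: (user) => ({ ...user, active: true }) };

let result;
beforeEach(() => {
  // GOOD: the method under test executes in beforeEach...
  result = service.activate({ name: "ada" });
});

it("should mark the user active", () => {
  // ...so the it() block is a pure assertion, with no execution or conditions.
  if (!result.active) throw new Error("expected user to be active");
});
```

The `describe("when ...")` layer is omitted here; in real Jest the same split holds: scenarios set up and execute in `beforeEach`, and `it` only asserts.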
@@ -97,7 +97,7 @@ Before writing any test suite, verify:

  ---

- ## CRITICAL RULE #2: Mock Configuration Location
+ ## CRITICAL RULE 2: Mock Configuration Location

  ⚠️ **NON-NEGOTIABLE RULE** ⚠️

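The shape of the rule, as a runnable sketch: `mockFetchUser` is an invented name, and the hand-rolled `fn()` stands in for `jest.fn()` so the example runs without Jest.

```javascript
// Minimal jest.fn() stand-in: bare mock plus mockReturnValue.
function fn() {
  const mock = (...args) => (mock.impl ? mock.impl(...args) : undefined);
  mock.mockReturnValue = (value) => { mock.impl = () => value; return mock; };
  return mock;
}

// Module level: declaration only, a bare mock with no behavior attached.
const mockFetchUser = fn();

// Per-scenario configuration belongs in beforeEach, not at module level.
function beforeEachForScenario() {
  mockFetchUser.mockReturnValue({ id: 1, name: "ada" });
}

beforeEachForScenario();
console.log(mockFetchUser().name); // "ada"
```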
@@ -152,7 +152,7 @@ const mockLuxon = {

  ---

- ## CRITICAL RULE #3: Callback Invocation via mockImplementation
+ ## CRITICAL RULE 3: Callback Invocation via mockImplementation

  ⚠️ **NON-NEGOTIABLE RULE** ⚠️

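A runnable sketch of the pattern this rule names (the `subscribe` mock and its topic/callback signature are invented, and the hand-rolled `fn()` mimics just enough of `jest.fn()` to run standalone):

```javascript
// Minimal jest.fn() stand-in with calls tracking and mockImplementation.
function fn() {
  const mock = (...args) => {
    mock.calls.push(args);
    return mock.impl ? mock.impl(...args) : undefined;
  };
  mock.calls = [];
  mock.mockImplementation = (impl) => { mock.impl = impl; return mock; };
  return mock;
}

// GOOD: the mock's implementation invokes the callback as part of the call,
// mirroring mockSubscribe.mockImplementation((topic, cb) => cb("payload")),
// instead of digging through mock.calls[N][M]() after the fact.
const mockSubscribe = fn().mockImplementation((topic, callback) => callback("payload"));

let received;
mockSubscribe("orders", (message) => { received = message; });
console.log(received); // "payload"
```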
@@ -329,7 +329,7 @@ describe("when status is inactive", () => {
  - Inner `describe` blocks handle specific scenarios ("when X condition")
  - **Test descriptions**:
  - `describe` blocks handle the "when" conditions/scenarios
- - `it` blocks focus purely on "should" assertions (no conditions). See **CRITICAL RULE #1** above.
+ - `it` blocks focus purely on "should" assertions (no conditions). See **CRITICAL RULE 1** above.
  - Avoid redundancy - don't repeat conditions in both `describe` and `it`
  - **Grouping logic**:
  - Group related test cases under shared setup scenarios
@@ -366,7 +366,7 @@ describe("when status is inactive", () => {
  - **State initialization**: Variables initialized in `beforeEach`
  - **Fresh state guarantee**:
  - Call `jest.clearAllMocks()` and `jest.resetModules()` in outer `beforeEach`
- - **Do NOT use `jest.resetAllMocks()`** in the outer `beforeEach` — it wipes implementations, which breaks module-level `mockReturnThis()` chains (see CRITICAL RULE #2 exception)
+ - **Do NOT use `jest.resetAllMocks()`** in the outer `beforeEach` — it wipes implementations, which breaks module-level `mockReturnThis()` chains (see CRITICAL RULE 2 exception)
  - Use `mockReset()` on **individual mocks** when a nested `beforeEach` needs to replace behavior already configured by an outer `beforeEach`
  - **Never use `mockRestore()`** — it only applies to `jest.spyOn`, which this codebase does not use

@@ -385,14 +385,14 @@ Am I in the **outer** `beforeEach`?

  ## Mocking Strategy

- - **Module-level mock declarations**: Dependencies mocked above test suites with bare `jest.fn()` — see **CRITICAL RULE #2** for configuration rules
+ - **Module-level mock declarations**: Dependencies mocked above test suites with bare `jest.fn()` — see **CRITICAL RULE 2** for configuration rules
  - **Mock behavior configuration**: Configured per scenario in `beforeEach` blocks
  - **Mock naming**: Consistent mock prefix (e.g., `mockLogger`, `mockGetMessageBroker`)
  - **Selective mocking**: Mock only the methods being used unless it significantly raises complexity
  - **Mock verification**: Always verify mock calls when behavior depends on them
  - **Mock realism**: Ensure mocks behave like real implementations (same async patterns, error types)
  - **Mock methods**:
- - Use `mockImplementation` when invoking callbacks passed to the mocked function — see **CRITICAL RULE #3**
+ - Use `mockImplementation` when invoking callbacks passed to the mocked function — see **CRITICAL RULE 3**
  - Use `mockResolvedValue` for async returns
  - Use `mockReturnValue` for sync returns
  - Prefer "Once" versions when appropriate (`mockResolvedValueOnce`, etc.)
@@ -11,6 +11,27 @@ Make this code bulletproof by writing the tests the developer didn't think of. A

  You have NO knowledge of how this code was written. You are seeing it for the first time. You are NOT rewriting the developer's tests — you're adding what's missing.

+ ## Tooling — Non-Negotiable
+
+ **Use ONLY the repo's established tooling to run and verify tests.** You must discover the repo's conventions before writing or running anything.
+
+ ### Discovery (do this FIRST, before writing any tests)
+
+ 1. Read `package.json` (root and any relevant workspace package). Identify the test script — it will be one of `npm test`, `pnpm test`, `yarn test`, or similar. That is your test runner. Period.
+ 2. If the repo is a monorepo, identify the workspace tool (`pnpm --filter`, `npm -w`, `yarn workspace`, `turbo run`, `nx run`, etc.) and use that to scope test runs to the relevant package.
+ 3. Read the test framework config (e.g., `jest.config.ts`, `vitest.config.ts`, `.mocharc.*`) to understand module resolution, transforms, and path aliases.
+ 4. Read `.ai/UnitTestGeneration.md` and `.ai/UnitTestExamples.md` if they exist. These are your style guide. Follow them exactly.
+
+ ### The Rules
+
+ - **Run tests with the repo's test script.** `npm test`, `pnpm test`, `pnpm --filter <pkg> test`, etc. Whatever `package.json` says.
+ - **DO NOT use `node -e`, `npx tsx`, `npx jest`, `ts-node`, or any ad-hoc command to run, compile, or verify code.** Ever. No exceptions. The repo has a test runner. Use it.
+ - **DO NOT improvise test runners or verification methods.** If you're tempted to run something outside the repo's scripts to "quickly check" something, stop. Use the test script.
+ - **DO NOT install packages, add dependencies, or modify package.json.** You write tests using what's already available.
+ - **When running tests, scope them.** Don't run the entire test suite when you only need to verify one file. Use the test runner's built-in filtering (e.g., `pnpm test -- --testPathPattern=path/to/file` for Jest, or equivalent).
+
+ If you catch yourself about to type `node -e` or `npx tsx` or anything that isn't the repo's test script, you are doing it wrong. Back away from the keyboard.
+
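The discovery step boils down to reading the manifest. A sketch of what that lookup finds, using an inline sample manifest (invented for illustration) rather than a real file read:

```javascript
// Sample package.json contents, invented for illustration.
const manifest = JSON.parse(`{
  "name": "sample",
  "scripts": { "test": "jest --passWithNoTests" },
  "workspaces": ["packages/*"]
}`);

// The scripts.test entry is the test runner; a workspaces field signals a monorepo.
const testScript = manifest.scripts && manifest.scripts.test;
const isMonorepo = Boolean(manifest.workspaces);
console.log(testScript);  // "jest --passWithNoTests"
console.log(isMonorepo);  // true
```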
  ## Input

  1. Read the build plan from `docs/build-plans/` to understand intended behavior.
@@ -97,12 +118,14 @@ Map every branch (`if`/`else`, `try`/`catch`, `switch`, early returns) in the so

  ## Process

- 1. Audit existing tests. Catalog what's covered.
- 2. Identify gaps by category.
- 3. Prioritize: likely to happen OR catastrophic if it does.
- 4. **Before writing any test**, verify it against the anti-patterns above. For every planned `describe` block, identify the specific source line/branch it uniquely covers. If you cannot, drop it.
- 5. Write tests. Follow the existing test framework and patterns exactly.
- 6. Write your report.
+ 1. **Discover repo tooling.** Follow the Tooling — Non-Negotiable section above. Identify the package manager, test script, test framework config, and any workspace/monorepo conventions. Do this BEFORE reading any source code.
+ 2. Audit existing tests. Catalog what's covered.
+ 3. Identify gaps by category.
+ 4. Prioritize: likely to happen OR catastrophic if it does.
+ 5. **Before writing any test**, verify it against the anti-patterns above. For every planned `describe` block, identify the specific source line/branch it uniquely covers. If you cannot, drop it.
+ 6. Write tests. Follow the existing test framework and patterns exactly.
+ 7. Run tests using the repo's test script. Fix any failures before proceeding.
+ 8. Write your report.

  ## Output

@@ -146,9 +169,10 @@ Actual bugs discovered during test hardening.

  ## Verification

- Before writing the report, verify:
+ Before writing the report, verify by **reading your own code and the source code** — not by running ad-hoc commands:

- 1. Every new `describe` block covers a code path no existing test covers.
+ 1. Every new `describe` block covers a code path no existing test covers. Verify this by reading the source, not by running coverage tools.
  2. No tests target `.tsx` files or barrel exports.
  3. Tests are added to existing test files, not new ones.
  4. No existing tests were modified.
+ 5. All tests pass when run with the repo's test script. This is the ONLY command you should have executed via Bash during this entire process.
@@ -2,15 +2,15 @@

  ## Goal

- Prepare and create a draft pull request for the current branch. Run pre-submission checks, generate a PR title and description, collect testing steps, and create the draft PR.
+ Prepare and create a draft pull request for the current branch. Run pre-submission checks, generate a full PR review, collect optional author testing notes, and create the draft PR.

  This command takes no arguments. It operates on the currently checked-out branch.

- <!-- Sibling command: review-pr.md uses the same plugin discovery/loading flow but with deeper evaluation. If the plugin format changes, update both files. -->
+ <!-- Sibling command: review-pr.md uses the same analysis approach and plugin discovery/loading flow. If the analysis format or plugin format changes, update both files. -->

  ## Constraints

- - This command is conversational. There are multiple points where you pause and wait for developer input (Step 2, Step 6, Step 7, Step 8). Don't try to rush through without their responses.
+ - This command is conversational. There are multiple points where you pause and wait for developer input (Step 2, Step 10, Step 11, Step 12). Don't try to rush through without their responses.
  - Check failures are informational, not blocking. The developer can still create the PR even if checks fail.
  - Always create the PR as a draft. No option to toggle this.
  - If the developer wants to bail at any point, respect that. Don't push them to continue.
@@ -67,34 +67,140 @@ The developer can accept the default or specify another branch. Use whatever the

  ## Step 3: Gather Diff

- Check that there are commits ahead of the target branch:
+ Fetch the latest state of the target branch from the remote so comparisons reflect what's actually on the remote, not a potentially stale local copy:

  ```bash
- git log --oneline $TARGET..HEAD
+ git fetch origin $TARGET
+ ```
+
+ ### 3.1: Resolve Comparison Ref
+
+ Default to `origin/$TARGET` (the remote tracking ref) for all diff and log comparisons. Before proceeding, check whether a local copy of the target branch exists and is ahead of the remote:
+
+ ```bash
+ git rev-list --count origin/$TARGET..$TARGET 2>/dev/null
+ ```
+
+ - **If the command fails** (no local branch exists), or **returns 0** (local is behind or even with remote): use `origin/$TARGET` silently. No prompt needed.
+ - **If the count is greater than 0**: the local branch has commits not yet on the remote. Ask the developer:
+
+ > **Your local `{$TARGET}` is {count} commit(s) ahead of `origin/{$TARGET}`.** Use local or remote for comparison? (default: remote)
+
+ Use whichever ref the developer chooses as `$COMPARE_REF` for all subsequent diff and log commands. If they accept the default or don't respond, use `origin/$TARGET`.
+
+ ### 3.2: Check for Commits
+
+ Check that there are commits ahead of the comparison ref:
+
+ ```bash
+ git log --oneline $COMPARE_REF..HEAD
  ```

  **If no commits are returned**, stop and output:

- > **No commits ahead of `{$TARGET}`. Nothing to PR.** You may need to rebase.
+ > **No commits ahead of `{$COMPARE_REF}`. Nothing to PR.** You may need to rebase.

- Gather the change data:
+ ### 3.3: Gather Change Data

  ```bash
  # File list with change stats
- git diff --stat $TARGET...HEAD
+ git diff --stat $COMPARE_REF...HEAD

  # Full diff
- git diff $TARGET...HEAD
+ git diff $COMPARE_REF...HEAD

  # Commit log (excluding merges)
- git log --no-merges --oneline $TARGET..HEAD
+ git log --no-merges --oneline $COMPARE_REF..HEAD
  ```

- ## Step 4: Review Plugin Checks
+ ## Step 4: Categorize and Filter Files
+
+ Before analysis, categorize the changed files:
+
+ **Noise files (acknowledge but don't analyze deeply):**
+
+ - `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml` -> "lock file updated"
+ - `*.min.js`, `*.min.css`, `dist/*`, `build/*` -> "generated/bundled files"
+ - Binary files (images, fonts, etc.) -> "binary file added/modified/deleted"
+
+ **Categorize by area:**
+
+ Group changed files by top-level directory. If the repo uses workspaces (check for a `workspaces` field in `package.json` or the presence of `pnpm-workspace.yaml`), use the workspace definitions to inform grouping. Otherwise, fall back to top-level directory names.
+
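The categorization above amounts to a two-way split. A sketch with invented file paths; the noise patterns mirror the lists:

```javascript
// Noise patterns from the lists above: lock files, minified/bundled output.
const NOISE = [
  /(^|\/)(package-lock\.json|yarn\.lock|pnpm-lock\.yaml)$/,
  /\.min\.(js|css)$/,
  /^(dist|build)\//,
];

function categorize(files) {
  const noise = [];
  const byArea = {};
  for (const file of files) {
    if (NOISE.some((pattern) => pattern.test(file))) { noise.push(file); continue; }
    // Fall back to top-level directory names for area grouping.
    const area = file.includes("/") ? file.split("/")[0] : "(root)";
    (byArea[area] = byArea[area] || []).push(file);
  }
  return { noise, byArea };
}

const result = categorize(["package-lock.json", "src/auth/login.ts", "docs/readme.md"]);
console.log(result.noise);               // ["package-lock.json"]
console.log(Object.keys(result.byArea)); // ["src", "docs"]
```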
+ ## Step 5: Analyze Changes
+
+ ### For Small PRs (fewer than 30 changed files):
+
+ Analyze **per-commit**. For each commit, produce a narrative block with:
+
+ 1. **The Mental Model Shift** -- plain English explanation of how the codebase's story changed. Focus on "why" and "so what," not restating the diff. Highlight:
+ - Architectural shifts or new patterns introduced
+ - Patterns retired or approaches abandoned
+ - Code that appears orphaned, half-finished, or disconnected
+
+ 2. **What Changed Structurally** -- numbered list of concrete changes:
+ - Collapse repetitive/mechanical changes (e.g., "~15 files updated imports from X to Y")
+ - Call out meaningful changes individually with enough context to understand impact
+
+ ### For Large PRs (30 or more changed files):
+
+ Analyze **by theme/area** rather than per-commit. Group changes by package or functional area:

- Review plugin checks are loaded dynamically from `.ai/review-checks/`. Each check file is a markdown file with YAML frontmatter.
+ 1. First, identify the major themes/areas touched
+ 2. For each theme, produce a narrative block (Mental Model Shift + Structural Changes)
+ 3. After per-area analysis, produce a **holistic summary** that captures cross-cutting changes and overall architectural impact

- <!-- This is the same discovery/loading flow as review-pr.md, but with lighter evaluation: pass/fail with brief file/line pointers rather than full reviewer narrative. -->
+ ### Analysis Guidance:
+
+ - **Explain the "why" and "so what"**, not just the "what"
+ - **Collapse noise**: If 50 files have the same mechanical change, that's one bullet point
+ - **Highlight signal**: New public APIs, changed interfaces, deleted capabilities, behavioral changes
+ - **Flag disconnects**: Code that doesn't seem to connect to anything, half-implemented features, TODOs left behind
+ - **Note removals**: Deleted code is as important as added code -- what capability is gone?
+
+ ## Step 6: Identify Risks
+
+ Scan for and report:
+
+ **Security concerns:**
+
+ - Auth/authorization changes
+ - New API endpoints or route changes
+ - Injection vectors (SQL, command, XSS)
+ - Sensitive data in logs or error messages
+ - Secrets or credentials (even if they look like placeholders)
+ - Changes to validation or sanitization
+
+ **Behavioral risks:**
+
+ - Changes to public APIs or interfaces
+ - Modified default values or fallback behavior
+ - Error handling changes that might swallow errors
+ - Timing or ordering changes in async code
+
+ **Architectural concerns:**
+
+ - Layer boundary violations (e.g., handler calling repository directly)
+ - New circular dependencies
+ - Assumptions in one layer that depend on implementation details of another
+ - Patterns that diverge from established codebase conventions
+
+ **Dependency concerns:**
+
+ - New dependencies added
+ - Dependencies removed (what relied on them?)
+ - Major version bumps
+ - Dependencies with known security issues
+
+ **Before finalizing risk callouts related to test files (`.test.ts`, `.spec.ts`):**
+
+ Read `.ai/UnitTestGeneration.md` (if it exists) and cross-reference any test-related findings against the project's testing conventions. Do NOT flag patterns that conform to those guidelines -- they are intentional, not risks.
+
+ ## Step 7: Tribal Knowledge Checks
+
+ Tribal knowledge checks are loaded dynamically from `.ai/review-checks/`. Each check file is a markdown file with YAML frontmatter.
+
+ <!-- This is the same discovery/loading flow as review-pr.md. -->

  **Expected check file format:**

@@ -116,7 +222,7 @@ applies_when: Changed files include .ts files in src/
  ls .ai/review-checks/*.md 2>/dev/null
  ```

- 2. **If the directory does not exist or contains no `.md` files**, skip the entire review plugin checks section silently no error, no placeholder text. Proceed to Step 5.
+ 2. **If the directory does not exist or contains no `.md` files**, skip the entire Tribal Knowledge Checks section silently -- no error, no placeholder text. Proceed to Step 8.

  3. **If files are found**, read each one:

@@ -126,34 +232,60 @@ cat .ai/review-checks/*.md

  4. For each file, parse the YAML frontmatter to extract `name` and `applies_when`. If a file is missing frontmatter or has invalid/unparseable YAML, skip it and note: "Skipped `(unknown)`: missing or invalid frontmatter."

- 5. Evaluate each file's `applies_when` value against the list of changed files from the diff (gathered in Step 3). Use your judgment `applies_when` is natural language, not a glob pattern. Match generously but sensibly.
+ 5. Evaluate each file's `applies_when` value against the list of changed files from the diff (gathered in Step 3). Use your judgment -- `applies_when` is natural language, not a glob pattern. Match generously but sensibly.

- 6. For each check group where `applies_when` matches, evaluate each check item against the diff. Output format: **pass/fail per check with brief file/line pointers for failures** enough breadcrumb to find and fix the issue, not a full review narrative.
+ 6. For each check group where `applies_when` matches, include its checks in the output under a heading using the `name` from frontmatter. Evaluate each check against the actual diff.

- 7. If files exist but **none** of their `applies_when` criteria match the diff, skip the check results silently.
-
- Store the check results — they'll be included in the PR body later.
+ 7. If files exist but **none** of their `applies_when` criteria match the diff, skip the Tribal Knowledge Checks section entirely.

  Display the check results to the developer as you go so they can see what passed and what needs attention.

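The frontmatter parse in step 4 can be sketched as follows; real code would use a YAML library, and this toy parser handles only the flat `key: value` form. The sample check file is invented, with `applies_when` phrased like the example in the format section:

```javascript
// Invented sample check file with YAML frontmatter.
const sample = `---
name: TypeScript conventions
applies_when: Changed files include .ts files in src/
---

- [ ] Exported functions have explicit return types.
`;

// Extract flat key: value pairs between the --- fences; null means
// "missing frontmatter: skip this file and note it".
function parseFrontmatter(text) {
  const match = text.match(/^---\n([\s\S]*?)\n---/);
  if (!match) return null;
  const fields = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx > 0) fields[line.slice(0, idx).trim()] = line.slice(idx + 1).trim();
  }
  return fields;
}

const fm = parseFrontmatter(sample);
console.log(fm.name);         // "TypeScript conventions"
console.log(fm.applies_when); // "Changed files include .ts files in src/"
```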
- ## Step 5: Generate Title and Description
+ ## Step 8: Testing Recommendations
+
+ Based on the changes, provide concrete testing recommendations. This is NOT generic "write more tests" advice -- recommendations must be tied to the actual changes.
+
+ ### What to recommend:
+
+ **Manual verification:**
+
+ - Specific user flows or scenarios the reviewer should manually test
+ - Edge cases introduced by the changes that aren't obvious from reading the code
+ - Integration points that might behave differently after these changes
+
+ **Automated test coverage:**
+
+ - New code paths that lack corresponding tests
+ - Behavioral changes that existing tests might not cover
+ - Specific test scenarios to add (with enough detail to write the test)
+ - **Do NOT recommend tests for React components (`.tsx` files).** We do not unit test React components.
+
+ **Regression risks:**
+
+ - Existing functionality that might be affected and should be regression tested
+ - Areas where the change's assumptions might conflict with existing behavior
+
+ ### Guidance:
+
+ - Be specific: "Test the login flow with an expired token" not "test authentication"
+ - Reference the actual changes: "The new `validateApiKey` middleware should be tested with missing, invalid, and expired keys"
+ - Prioritize: If there are many potential tests, highlight the most important ones first
+ - If the PR includes good test coverage already, acknowledge that and note any gaps
+
+ ## Step 9: Generate Title

  Analyze the diff and commit history gathered in Step 3. Generate:

  - **Title**: Concise, reflects the nature of the changes (bug fix, feature, refactor, etc.). Keep it under 70 characters.
- - **Description**: Summarize what changed and why. Focus on the "so what" — not a restatement of the diff. Draw from commit messages and the actual code changes.
-
- For large diffs, prioritize analyzing `git diff --stat` and the commit log, then selectively read diffs for the most significant files rather than trying to process the entire diff.

- ## Step 6: Collect Testing Steps
+ ## Step 10: Collect Author Testing Notes (Optional)

  Ask the developer:

- > **Testing steps how should a reviewer verify these changes?**
+ > **Do you have any additional testing steps you'd like to include?** These will appear in the PR as "PR Author Testing Recommendations" alongside the generated testing analysis. Feel free to skip if the generated recommendations cover it.

- The developer provides their own testing steps as free-form text. These are NOT AI-generated. Wait for their input before proceeding.
+ If the developer provides testing steps, store them for inclusion in the PR body. If they skip or decline, proceed without them.

- ## Step 7: Present Full Preview
+ ## Step 11: Present Full Preview

  Show the developer the complete PR preview:

@@ -164,24 +296,59 @@ Show the developer the complete PR preview:

  ---

+ # {title}
+
+ **Branch**: {current branch} -> {$TARGET}
+ **Changes**: {additions} additions, {deletions} deletions across {changed file count} files
+
+ ---
+
  ## Summary

- {generated description}
+ [Per-commit or per-theme narrative blocks from Step 5]

- ## Test Plan
+ ---

- {developer-provided testing steps}
+ ## Risk Callouts
+
+ [Risks from Step 6, or "No significant risks identified"]
+
+ ---
+
+ ## Tribal Knowledge Checks
+
+ [Only if matching check files were found in Step 7. Omit section entirely otherwise.]
+
+ ### [name from check file frontmatter]
+
+ - [x] [Check passed or N/A]
+ - [ ] **[Check failed]**: [Specific finding with file:line references]
+
+ ---
+
+ ## Testing Recommendations
+
+ ### Manual Verification
+
+ - [Specific scenario to test manually]
+
+ ### Automated Test Gaps
+
+ - [Specific test that should be written]
+
+ ### Regression Risks
+
+ - [Existing functionality to regression test]
  ```

- If check results exist from Step 4, also show:
+ If the developer provided author testing notes in Step 10, also show:

  ```
- <details>
- <summary>Pre-submission Checks</summary>
+ ---

- {check results pass/fail with names}
+ ## PR Author Testing Recommendations

- </details>
+ {developer-provided testing steps}
  ```

@@ -190,7 +357,7 @@ Then ask:

  Let the developer make edits. Iterate until they're satisfied.

- ## Step 8: Push Branch
+ ## Step 12: Push Branch

  Check whether the branch has an upstream and is up to date:

@@ -218,32 +385,60 @@ git push -u origin $(git branch --show-current)

  **If the branch is already up to date with the remote**, skip this step silently.

- ## Step 9: Create Draft PR
+ ## Step 13: Create Draft PR

- Assemble the PR body:
+ Assemble the PR body using the full review content from Steps 5-8:

  ```
+ # {title}
+
+ **Branch**: {current branch} -> {$TARGET}
+ **Changes**: {additions} additions, {deletions} deletions across {changed file count} files
+
+ ---
+
  ## Summary

- {description}
+ {per-commit or per-theme narrative blocks}

- ## Test Plan
+ ---
+
+ ## Risk Callouts

- {testing steps}
+ {risks, or "No significant risks identified"}
  ```

- If check results exist from Step 4, append:
+ If tribal knowledge check results exist from Step 7, append:

  ```
- <details>
- <summary>Pre-submission Checks</summary>
+ ---

- {check results pass/fail with names}
+ ## Tribal Knowledge Checks

- </details>
+ {check results with pass/fail and file:line references}
  ```

- If no check results (no plugin files found or none matched), omit the `<details>` section entirely.
+ If no check results (no plugin files found or none matched), omit the Tribal Knowledge Checks section entirely.
+
+ Append the testing recommendations from Step 8:
+
+ ```
+ ---
+
+ ## Testing Recommendations
+
+ {testing recommendations}
+ ```
+
+ If the developer provided author testing notes in Step 10, append:
+
+ ```
+ ---
+
+ ## PR Author Testing Recommendations
+
+ {developer-provided testing steps}
+ ```

  Create the draft PR:

@@ -50,19 +50,52 @@ gh pr view --json number,title,baseRefName,headRefName,body,additions,deletions,

  ## Step 3: Gather Diff Information

- Use `gh pr diff` to get the diff as GitHub sees it. This avoids stale-local-branch problems where a behind-origin base branch inflates the diff with already-merged changes from other PRs.
+ ### 3.1: Resolve Comparison Ref
+
+ Fetch the latest state of the base branch from the remote:
+
+ ```bash
+ git fetch origin $baseRefName
+ ```
+
+ Default to using the remote for comparison (via `gh pr diff`). Before proceeding, check whether the local copy of the base branch is ahead of the remote:

  ```bash
- # File list with change stats
- gh pr diff --stat
+ git rev-list --count origin/$baseRefName..$baseRefName 2>/dev/null
+ ```
+
+ - **If the command fails** (no local branch exists), or **returns 0** (local is behind or even with remote): use the remote. No prompt needed.
+ - **If the count is greater than 0**: the local branch has commits not yet on the remote. Ask the developer:
+
+ > **Your local `{baseRefName}` is {count} commit(s) ahead of `origin/{baseRefName}`.** Use local or remote for comparison? (default: remote)
+
+ Store the choice as `$DIFF_MODE` — either `remote` (default) or `local`.
+
+ ### 3.2: Get the Diff
+
+ **If `$DIFF_MODE` is `remote`** (the 95% case):

- # Full diff (for analysis)
+ Use `gh pr diff` to get the full diff as GitHub sees it. This avoids stale-local-branch problems where a behind-origin base branch inflates the diff with already-merged changes from other PRs.
+
+ ```bash
  gh pr diff
  ```

- **For commit messages**, use the `commits` array already captured in Step 2 — that is the authoritative commit list for this PR. Do NOT use `git log`, which can include commits from other PRs if the local base branch is behind origin.
+ **If `$DIFF_MODE` is `local`:**

- **Cross-check**: Compare the file count from `gh pr diff --stat` against the `changedFiles` value from Step 2. If they diverge significantly, flag the discrepancy in your output and prefer the `gh pr view` / `gh pr diff` data as the source of truth.
+ Use local git to diff against the local base branch:
+
+ ```bash
+ git diff $baseRefName...HEAD
+ ```
+
+ ⚠️ Note: local diffs may differ slightly from GitHub's view in repos with complex merge histories.
+
+ ### 3.3: Stats and Commit Messages
+
+ File-level stats (additions, deletions, file count) are already available from the `gh pr view --json` output captured in Step 2 — don't duplicate that work here.
+
+ **For commit messages**, use the `commits` array already captured in Step 2 — that is the authoritative commit list for this PR. Do NOT use `git log`, which can include commits from other PRs if the local base branch is behind origin.

  ## Step 4: Categorize and Filter Files