@paw-workflow/cli 0.0.1
- package/README.md +124 -0
- package/bin/paw.js +82 -0
- package/dist/agents/PAW-Review.agent.md +86 -0
- package/dist/agents/PAW.agent.md +171 -0
- package/dist/skills/paw-code-research/SKILL.md +209 -0
- package/dist/skills/paw-docs-guidance/SKILL.md +163 -0
- package/dist/skills/paw-git-operations/SKILL.md +196 -0
- package/dist/skills/paw-impl-review/SKILL.md +178 -0
- package/dist/skills/paw-implement/SKILL.md +153 -0
- package/dist/skills/paw-init/SKILL.md +118 -0
- package/dist/skills/paw-plan-review/SKILL.md +117 -0
- package/dist/skills/paw-planning/SKILL.md +217 -0
- package/dist/skills/paw-pr/SKILL.md +157 -0
- package/dist/skills/paw-review-baseline/SKILL.md +268 -0
- package/dist/skills/paw-review-correlation/SKILL.md +307 -0
- package/dist/skills/paw-review-critic/SKILL.md +373 -0
- package/dist/skills/paw-review-feedback/SKILL.md +437 -0
- package/dist/skills/paw-review-gap/SKILL.md +639 -0
- package/dist/skills/paw-review-github/SKILL.md +336 -0
- package/dist/skills/paw-review-impact/SKILL.md +569 -0
- package/dist/skills/paw-review-response/SKILL.md +118 -0
- package/dist/skills/paw-review-understanding/SKILL.md +372 -0
- package/dist/skills/paw-review-workflow/SKILL.md +239 -0
- package/dist/skills/paw-spec/SKILL.md +257 -0
- package/dist/skills/paw-spec-research/SKILL.md +138 -0
- package/dist/skills/paw-spec-review/SKILL.md +101 -0
- package/dist/skills/paw-status/SKILL.md +160 -0
- package/dist/skills/paw-transition/SKILL.md +134 -0
- package/dist/skills/paw-work-shaping/SKILL.md +99 -0
- package/dist/skills/paw-workflow/SKILL.md +142 -0
- package/lib/commands/install.js +103 -0
- package/lib/commands/list.js +18 -0
- package/lib/commands/uninstall.js +95 -0
- package/lib/commands/upgrade.js +119 -0
- package/lib/manifest.js +42 -0
- package/lib/paths.js +42 -0
- package/lib/registry.js +41 -0
- package/package.json +40 -0
|
@@ -0,0 +1,373 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: paw-review-critic
|
|
3
|
+
description: Critically assesses generated review comments for usefulness, accuracy, and appropriateness, adding assessment sections.
|
|
4
|
+
metadata:
|
|
5
|
+
version: "0.0.1"
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
# PAW Review Critic Skill
|
|
9
|
+
|
|
10
|
+
Critically assess generated review comments to help reviewers make informed decisions about what feedback to include, modify, or skip.
|
|
11
|
+
|
|
12
|
+
> **Reference**: Follow Core Review Principles from `paw-review-workflow` skill.
|
|
13
|
+
|
|
14
|
+
## Prerequisites
|
|
15
|
+
|
|
16
|
+
Verify `ReviewComments.md` exists in `.paw/reviews/<identifier>/`.
|
|
17
|
+
|
|
18
|
+
Also verify access to all supporting artifacts:
|
|
19
|
+
- `ReviewContext.md` (PR metadata)
|
|
20
|
+
- `CodeResearch.md` (baseline understanding)
|
|
21
|
+
- `DerivedSpec.md` (PR intent)
|
|
22
|
+
- `ImpactAnalysis.md` (system-wide effects)
|
|
23
|
+
- `GapAnalysis.md` (categorized findings)
|
|
24
|
+
|
|
25
|
+
If `ReviewComments.md` is missing, report blocked status: Feedback Generation must complete first.
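The prerequisite check can be sketched as a pure function over the file names found in `.paw/reviews/<identifier>/`. This is a minimal illustration, not PAW's implementation: `checkPrerequisites`, `PrereqStatus`, and the `incomplete` status for missing supporting artifacts are assumptions (the skill only defines the blocked case).

```typescript
// Hypothetical helper: names and return shape are illustrative, not part of PAW.
type PrereqStatus =
  | { status: "ready" }
  | { status: "blocked"; reason: string }
  | { status: "incomplete"; missing: string[] }; // assumption: the skill does not name this case

// Supporting artifacts listed in the Prerequisites section.
const SUPPORTING_ARTIFACTS: readonly string[] = [
  "ReviewContext.md",
  "CodeResearch.md",
  "DerivedSpec.md",
  "ImpactAnalysis.md",
  "GapAnalysis.md",
];

function checkPrerequisites(presentFiles: readonly string[]): PrereqStatus {
  const present = new Set(presentFiles);
  // ReviewComments.md is the hard requirement: without it the activity is blocked.
  if (!present.has("ReviewComments.md")) {
    return {
      status: "blocked",
      reason: "ReviewComments.md is missing; Feedback Generation must complete first.",
    };
  }
  // Report which supporting artifacts are absent so the caller can decide how to proceed.
  const missing = SUPPORTING_ARTIFACTS.filter((name) => !present.has(name));
  return missing.length > 0 ? { status: "incomplete", missing } : { status: "ready" };
}
```

A caller would list the review directory, pass the file names in, and stop early on `blocked`.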
|
|
26
|
+
|
|
27
|
+
## Core Responsibilities
|
|
28
|
+
|
|
29
|
+
- Read and understand all generated review comments
|
|
30
|
+
- Critically evaluate each comment's usefulness and accuracy
|
|
31
|
+
- Consider alternative perspectives and trade-offs
|
|
32
|
+
- Add assessment sections to ReviewComments.md
|
|
33
|
+
- Provide recommendations (Include, Modify, Skip) with justification
|
|
34
|
+
- Help the reviewer make informed decisions about feedback quality
|
|
35
|
+
|
|
36
|
+
## Process Steps
|
|
37
|
+
|
|
38
|
+
### Step 1: Read All Review Comments
|
|
39
|
+
|
|
40
|
+
Understand the complete feedback landscape:
|
|
41
|
+
|
|
42
|
+
**Load ReviewComments.md:**
|
|
43
|
+
- Read all inline comments
|
|
44
|
+
- Read all thread comments
|
|
45
|
+
- Read questions for author
|
|
46
|
+
- Understand summary comment framing
|
|
47
|
+
|
|
48
|
+
**Load Supporting Context:**
|
|
49
|
+
- Review GapAnalysis.md findings that generated each comment
|
|
50
|
+
- Reference CodeResearch.md for baseline patterns
|
|
51
|
+
- Check ImpactAnalysis.md for system-wide context
|
|
52
|
+
- Understand DerivedSpec.md intent
|
|
53
|
+
|
|
54
|
+
**Identify Relationships:**
|
|
55
|
+
- Note batched findings (multiple locations in one comment)
|
|
56
|
+
- Identify linked comments (related but separate)
|
|
57
|
+
- Understand categorization (Must/Should/Could)
|
|
58
|
+
|
|
59
|
+
### Step 2: Critical Assessment
|
|
60
|
+
|
|
61
|
+
For each review comment, evaluate multiple dimensions:
|
|
62
|
+
|
|
63
|
+
#### Usefulness Evaluation
|
|
64
|
+
|
|
65
|
+
Ask: "Does this comment truly improve code quality? Is it actionable?"
|
|
66
|
+
|
|
67
|
+
**High Usefulness:**
|
|
68
|
+
- Fixes actual bug with clear failure mode
|
|
69
|
+
- Prevents production issue (security, data loss, crash)
|
|
70
|
+
- Improves maintainability significantly with concrete benefits
|
|
71
|
+
- Adds essential test coverage for risky code
|
|
72
|
+
- Addresses critical design flaw
|
|
73
|
+
|
|
74
|
+
**Medium Usefulness:**
|
|
75
|
+
- Improves code quality (clarity, consistency)
|
|
76
|
+
- Adds useful tests for non-critical paths
|
|
77
|
+
- Enhances error handling for edge cases
|
|
78
|
+
- Improves documentation for complex code
|
|
79
|
+
- Suggests better patterns with clear advantages
|
|
80
|
+
|
|
81
|
+
**Low Usefulness:**
|
|
82
|
+
- Stylistic preference without concrete benefit
|
|
83
|
+
- Minimal impact on quality or maintainability
|
|
84
|
+
- Already addressed elsewhere in PR or codebase
|
|
85
|
+
- Over-engineering for current requirements
|
|
86
|
+
- Bikeshedding (arguing about trivial details)
|
|
87
|
+
|
|
88
|
+
#### Accuracy Validation
|
|
89
|
+
|
|
90
|
+
Ask: "Are evidence references correct? Is the diagnosis sound?"
|
|
91
|
+
|
|
92
|
+
**Verify:**
|
|
93
|
+
- File:line references point to actual code
|
|
94
|
+
- Diagnosis matches actual code behavior (not misread)
|
|
95
|
+
- Baseline pattern comparison is fair and relevant
|
|
96
|
+
- Impact assessment is realistic (not exaggerated)
|
|
97
|
+
- Best practice citation is appropriate for context
|
|
98
|
+
- Code suggestion would actually fix the issue
|
|
99
|
+
|
|
100
|
+
**Flag if:**
|
|
101
|
+
- Evidence references are incorrect or outdated
|
|
102
|
+
- Diagnosis misunderstands code intent
|
|
103
|
+
- Baseline pattern cited isn't actually analogous
|
|
104
|
+
- Impact is speculative without concrete evidence
|
|
105
|
+
- Suggestion would introduce new problems
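The first item under **Verify** (file:line references point to actual code) is mechanical enough to sketch. The example below assumes references are written as `file.ext:LINE` or `file.ext:START-END`, as in `auth.ts:45`, and that the caller supplies current line counts per file. `parseEvidenceRefs` and `findInvalidRefs` are hypothetical names, and the pattern is a heuristic that can also match things like `host.com:8080`.

```typescript
// Hypothetical helpers for checking file:line evidence references.
interface EvidenceRef {
  file: string;
  startLine: number;
  endLine: number;
}

function parseEvidenceRefs(text: string): EvidenceRef[] {
  // Matches `path.ext:LINE` or `path.ext:START-END`, e.g. auth.ts:45 or auth.ts:45-50.
  const pattern = /([\w.\/-]+\.\w+):(\d+)(?:-(\d+))?/g;
  const refs: EvidenceRef[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    const file = match[1] ?? "";
    const startLine = Number(match[2]);
    const endRaw = match[3];
    // A single-line reference has the same start and end line.
    const endLine = endRaw !== undefined ? Number(endRaw) : startLine;
    refs.push({ file, startLine, endLine });
  }
  return refs;
}

function findInvalidRefs(
  refs: readonly EvidenceRef[],
  lineCounts: ReadonlyMap<string, number>,
): EvidenceRef[] {
  // A reference is invalid if the file is unknown or the range falls outside the file.
  return refs.filter((ref) => {
    const total = lineCounts.get(ref.file);
    return (
      total === undefined ||
      ref.startLine < 1 ||
      ref.endLine < ref.startLine ||
      ref.endLine > total
    );
  });
}
```

This only catches references that no longer resolve; whether the diagnosis matches the code's behavior still needs a human or agent reading the lines.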
|
|
106
|
+
|
|
107
|
+
#### Alternative Perspective Exploration
|
|
108
|
+
|
|
109
|
+
Ask: "What might the initial reviewer have missed? Are there valid reasons for the current approach?"
|
|
110
|
+
|
|
111
|
+
**Consider:**
|
|
112
|
+
- Project-specific context not captured in artifacts
|
|
113
|
+
- Time/complexity trade-offs for suggested change
|
|
114
|
+
- Intentional design decisions with valid rationale
|
|
115
|
+
- Performance/readability trade-offs
|
|
116
|
+
- Technical debt consciously accepted
|
|
117
|
+
- Platform/framework limitations
|
|
118
|
+
|
|
119
|
+
**Identify:**
|
|
120
|
+
- Cases where current approach might be deliberate
|
|
121
|
+
- Situations where "better" is subjective
|
|
122
|
+
- Comments that are too prescriptive vs exploratory
|
|
123
|
+
- Recommendations that conflict with other constraints
|
|
124
|
+
|
|
125
|
+
#### Trade-off Analysis
|
|
126
|
+
|
|
127
|
+
Ask: "Are there valid reasons to do it the current way? What are the costs of changing?"
|
|
128
|
+
|
|
129
|
+
**Evaluate:**
|
|
130
|
+
- Effort required vs benefit gained
|
|
131
|
+
- Risk introduced by change (new bugs, regressions)
|
|
132
|
+
- Complexity added by "better" solution
|
|
133
|
+
- Consistency with rest of codebase vs ideal pattern
|
|
134
|
+
- Timing (now vs later with more information)
|
|
135
|
+
|
|
136
|
+
**Balance:**
|
|
137
|
+
- Perfect vs good enough for current context
|
|
138
|
+
- Immediate needs vs future flexibility
|
|
139
|
+
- Code purity vs pragmatic delivery
|
|
140
|
+
|
|
141
|
+
### Step 3: Add Assessment Sections
|
|
142
|
+
|
|
143
|
+
Append assessment after each comment's rationale in ReviewComments.md:
|
|
144
|
+
|
|
145
|
+
**Assessment Structure:**
|
|
146
|
+
```markdown
|
|
147
|
+
**Assessment:**
|
|
148
|
+
- **Usefulness**: <High|Medium|Low> - <justification>
|
|
149
|
+
- **Accuracy**: <validation of evidence and diagnosis>
|
|
150
|
+
- **Alternative Perspective**: <other valid interpretations or approaches>
|
|
151
|
+
- **Trade-offs**: <reasons current approach might be acceptable>
|
|
152
|
+
- **Recommendation**: <Include as-is | Modify to... | Skip because...>
|
|
153
|
+
```
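The template above maps naturally onto a small record type. The sketch below shows one way to model the five components and render them in the same layout; the `Assessment` interface and `renderAssessment` function are illustrative assumptions, not part of the skill.

```typescript
// Hypothetical model of the five-part assessment described in the template.
type Usefulness = "High" | "Medium" | "Low";

interface Assessment {
  usefulness: Usefulness;
  usefulnessJustification: string;
  accuracy: string;
  alternativePerspective: string;
  tradeOffs: string;
  recommendation: string; // "Include as-is", "Modify to ...", or "Skip because ..."
}

// Renders the assessment as the markdown block appended after a comment's rationale.
function renderAssessment(a: Assessment): string {
  return [
    "**Assessment:**",
    `- **Usefulness**: ${a.usefulness} - ${a.usefulnessJustification}`,
    `- **Accuracy**: ${a.accuracy}`,
    `- **Alternative Perspective**: ${a.alternativePerspective}`,
    `- **Trade-offs**: ${a.tradeOffs}`,
    `- **Recommendation**: ${a.recommendation}`,
  ].join("\n");
}
```

Keeping all five fields required in the type means an incomplete assessment fails at compile time rather than at review time.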
|
|
154
|
+
|
|
155
|
+
**CRITICAL - Where Assessments Go:**
|
|
156
|
+
|
|
157
|
+
| Add to ReviewComments.md | DO NOT Post Externally |
|
|
158
|
+
|-------------------------|------------------------|
|
|
159
|
+
| Append assessment sections after rationale | No GitHub posting |
|
|
160
|
+
| Keep assessments local to reviewer's workspace | Not visible to PR author |
|
|
161
|
+
| Use assessments to inform reviewer's decisions | No external platform posting |
|
|
162
|
+
|
|
163
|
+
**Why**: Assessments help the reviewer decide what feedback to give, but showing this internal evaluation process to the PR author would be confusing and potentially counterproductive.
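Because assessments must stay local, any step that later publishes a comment has to leave the assessment out. The sketch below shows one possible filter, assuming the block is a `**Assessment:**` line followed by `- ` bullets as in the structure above. `stripAssessment` is a hypothetical helper; how `paw-review-github` actually selects what to post is not shown in this skill.

```typescript
// Hypothetical safeguard: removes the local-only assessment block from a comment.
function stripAssessment(commentMarkdown: string): string {
  const kept: string[] = [];
  let inAssessment = false;
  for (const line of commentMarkdown.split("\n")) {
    if (line.trim() === "**Assessment:**") {
      inAssessment = true; // drop the header and start skipping its bullets
      continue;
    }
    if (inAssessment) {
      if (line.startsWith("- ")) continue; // assessment bullet, drop it
      inAssessment = false;
      if (line.trim() === "") continue; // drop the blank line that ended the block
    }
    kept.push(line);
  }
  return kept.join("\n");
}
```

Running every outgoing comment through a filter like this gives a second line of defense if an assessment is ever left inside text headed for an external platform.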
|
|
164
|
+
|
|
165
|
+
**Example Assessment:**
|
|
166
|
+
|
|
167
|
+
```markdown
|
|
168
|
+
### File: `auth.ts` | Lines: 45-50
|
|
169
|
+
|
|
170
|
+
**Type**: Must
|
|
171
|
+
**Category**: Safety
|
|
172
|
+
|
|
173
|
+
Missing null check before accessing user.profile could cause runtime error.
|
|
174
|
+
|
|
175
|
+
**Suggestion:**
|
|
176
|
+
~~~typescript
|
|
177
|
+
if (user?.profile) {
|
|
178
|
+
  return user.profile.name;
|
|
179
|
+
}
|
|
180
|
+
return 'Anonymous';
|
|
181
|
+
~~~
|
|
182
|
+
|
|
183
|
+
**Rationale:**
|
|
184
|
+
- **Evidence**: `auth.ts:45` shows direct access to user.profile.name
|
|
185
|
+
- **Baseline Pattern**: Similar code in `auth.ts:120` uses null checks for user objects
|
|
186
|
+
- **Impact**: Null pointer exception if user profile not loaded
|
|
187
|
+
- **Best Practice**: Defensive programming - validate before access
|
|
188
|
+
|
|
189
|
+
**Assessment:**
|
|
190
|
+
- **Usefulness**: High - Prevents actual runtime crash. User profile loading is conditional based on auth provider, so null case is realistic.
|
|
191
|
+
- **Accuracy**: Evidence confirmed. auth.ts:45 does access user.profile.name without check. Baseline pattern at auth.ts:120 does use optional chaining for similar access.
|
|
192
|
+
- **Alternative Perspective**: Could argue that profile should always exist if user is authenticated, but auth provider variance makes this risky assumption.
|
|
193
|
+
- **Trade-offs**: Minimal cost to add check. No downside to defensive code here.
|
|
194
|
+
- **Recommendation**: Include as-is. Clear safety improvement with concrete failure mode.
|
|
195
|
+
|
|
196
|
+
**Posted**: ✓ Pending review comment ID: <id>
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
### Recommendation Guidelines
|
|
200
|
+
|
|
201
|
+
**Include as-is:**
|
|
202
|
+
- High usefulness + accurate diagnosis + no major alternatives
|
|
203
|
+
- Clear benefit with minimal cost
|
|
204
|
+
- Addresses concrete issue with evidence
|
|
205
|
+
- Aligns with codebase patterns
|
|
206
|
+
- Reviewer confident in recommendation
|
|
207
|
+
|
|
208
|
+
**Modify to...:**
|
|
209
|
+
- Core issue is valid but suggestion needs adjustment
|
|
210
|
+
- Tone could be more/less direct
|
|
211
|
+
- Could be batched with related comment
|
|
212
|
+
- Suggestion is too prescriptive vs suggesting exploration
|
|
213
|
+
- Evidence is correct but impact overstated
|
|
214
|
+
|
|
215
|
+
**Skip because...:**
|
|
216
|
+
- Low usefulness (stylistic preference, minimal impact)
|
|
217
|
+
- Inaccurate diagnosis or evidence
|
|
218
|
+
- Valid alternative explanation exists
|
|
219
|
+
- Already addressed elsewhere
|
|
220
|
+
- Cost outweighs benefit
|
|
221
|
+
- Not appropriate for this review cycle
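The three guideline lists above can be read as a rough decision order: reasons to skip win first, then reasons to modify, otherwise include. The sketch below encodes that reading. It is a simplification under stated assumptions: `CritiqueSignals` and `suggestRecommendation` are hypothetical, the judgment behind each boolean stays with the critic, and treating an accurate Medium-usefulness comment as "Include as-is" is an assumption the guidelines do not spell out.

```typescript
// Hypothetical encoding of the recommendation guidelines as a decision order.
type UsefulnessRating = "High" | "Medium" | "Low";
type RecommendationKind = "Include as-is" | "Modify" | "Skip";

interface CritiqueSignals {
  usefulness: UsefulnessRating;
  diagnosisAccurate: boolean; // evidence and diagnosis both hold up
  needsAdjustment: boolean; // suggestion, tone, batching, or impact wording should change
  costOutweighsBenefit: boolean;
}

function suggestRecommendation(s: CritiqueSignals): RecommendationKind {
  // Skip: inaccurate, low value, or not worth the cost.
  if (!s.diagnosisAccurate || s.usefulness === "Low" || s.costOutweighsBenefit) {
    return "Skip";
  }
  // Modify: the core issue is valid but the comment needs reworking.
  if (s.needsAdjustment) {
    return "Modify";
  }
  // Include: accurate, useful, and fine as written.
  return "Include as-is";
}
```

The result is only a starting point; the written justification in the assessment still has to explain the choice, and the reviewer can override it.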
|
|
222
|
+
|
|
223
|
+
## Assessment Guidelines
|
|
224
|
+
|
|
225
|
+
### Usefulness Calibration
|
|
226
|
+
|
|
227
|
+
**Avoid Grade Inflation:**
|
|
228
|
+
- Not every suggestion is "High" usefulness
|
|
229
|
+
- Style preferences are typically "Low" even if correct
|
|
230
|
+
- Medium is appropriate for incremental improvements
|
|
231
|
+
|
|
232
|
+
**Focus on Impact:**
|
|
233
|
+
- What actually breaks vs what could theoretically be better?
|
|
234
|
+
- User-facing impact vs internal code cleanliness?
|
|
235
|
+
- Maintainability boost that saves real time vs theoretical elegance?
|
|
236
|
+
|
|
237
|
+
**Consider Context:**
|
|
238
|
+
- Is this a critical production system or experimental prototype?
|
|
239
|
+
- Is this a hot path or rarely-executed edge case?
|
|
240
|
+
- Is this public API or internal implementation?
|
|
241
|
+
|
|
242
|
+
### Accuracy Rigor
|
|
243
|
+
|
|
244
|
+
**Verify Evidence:**
|
|
245
|
+
- Check that file:line references are current (not stale)
|
|
246
|
+
- Confirm code behavior matches description
|
|
247
|
+
- Validate that baseline pattern is truly analogous
|
|
248
|
+
|
|
249
|
+
**Challenge Assumptions:**
|
|
250
|
+
- Is the "problem" actually problematic in this context?
|
|
251
|
+
- Could the current code be intentionally designed this way?
|
|
252
|
+
- Is the suggestion actually an improvement or just different?
|
|
253
|
+
|
|
254
|
+
**Check Suggestions:**
|
|
255
|
+
- Would the proposed fix actually work?
|
|
256
|
+
- Would it introduce new issues (performance, complexity)?
|
|
257
|
+
- Is it compatible with the framework/platform?
|
|
258
|
+
|
|
259
|
+
### Alternative Perspective Depth
|
|
260
|
+
|
|
261
|
+
**Steelman, Don't Strawman:**
|
|
262
|
+
- Present the strongest case for the current approach
|
|
263
|
+
- Consider legitimate trade-offs; don't just defend poor code
|
|
264
|
+
- Acknowledge when criticism is valid but timing might be wrong
|
|
265
|
+
|
|
266
|
+
**Common Valid Alternatives:**
|
|
267
|
+
- "Premature optimization" - current simple approach sufficient for now
|
|
268
|
+
- "Technical debt acknowledged" - team aware, will address later
|
|
269
|
+
- "Platform limitation" - workaround necessary given constraints
|
|
270
|
+
- "Readability trade-off" - more explicit code despite verbosity
|
|
271
|
+
|
|
272
|
+
### Trade-off Realism
|
|
273
|
+
|
|
274
|
+
**Quantify When Possible:**
|
|
275
|
+
- "Would require refactoring 5 files" vs "simple one-line fix"
|
|
276
|
+
- "Adds 10% performance overhead" vs "negligible impact"
|
|
277
|
+
- "Increases complexity from 3 conditionals to 8" vs "simplifies logic"
|
|
278
|
+
|
|
279
|
+
**Acknowledge Uncertainty:**
|
|
280
|
+
- "Unknown if this path is hot enough to matter"
|
|
281
|
+
- "Unclear if this pattern will generalize to future cases"
|
|
282
|
+
- "Would need profiling to confirm performance impact"
|
|
283
|
+
|
|
284
|
+
## Guardrails
|
|
285
|
+
|
|
286
|
+
**Advisory Only:**
|
|
287
|
+
- Assessments help the reviewer decide; they don't make final decisions
|
|
288
|
+
- Reviewer can override any recommendation
|
|
289
|
+
- Purpose is to inform, not to dictate
|
|
290
|
+
|
|
291
|
+
**Critical Thinking:**
|
|
292
|
+
- Question assumptions in generated comments
|
|
293
|
+
- Consider alternative interpretations
|
|
294
|
+
- Don't rubber-stamp every comment as useful
|
|
295
|
+
|
|
296
|
+
**Local Only:**
|
|
297
|
+
- NEVER post assessments to GitHub or external platforms
|
|
298
|
+
- Assessments remain in ReviewComments.md only
|
|
299
|
+
- Internal decision-making tool, not external communication
|
|
300
|
+
|
|
301
|
+
**Respectful Tone:**
|
|
302
|
+
- Assessment is about comment quality, not personal critique
|
|
303
|
+
- Focus on improving feedback, not judging the Feedback Generation skill
|
|
304
|
+
- Acknowledge when comments are well-crafted
|
|
305
|
+
|
|
306
|
+
**Context-Aware:**
|
|
307
|
+
- Reference all available artifacts for complete picture
|
|
308
|
+
- Consider project-specific patterns from CodeResearch.md
|
|
309
|
+
- Understand PR intent from DerivedSpec.md
|
|
310
|
+
- Factor in system-wide impacts from ImpactAnalysis.md
|
|
311
|
+
|
|
312
|
+
**Balanced Perspective:**
|
|
313
|
+
- Don't be reflexively negative or positive
|
|
314
|
+
- Some comments will be excellent, others questionable
|
|
315
|
+
- Honest assessment serves the reviewer and the PR author best
|
|
316
|
+
|
|
317
|
+
## Iteration Summary
|
|
318
|
+
|
|
319
|
+
After adding assessments to all comments, append an Iteration Summary section to ReviewComments.md:
|
|
320
|
+
|
|
321
|
+
```markdown
|
|
322
|
+
---
|
|
323
|
+
|
|
324
|
+
## Iteration Summary
|
|
325
|
+
|
|
326
|
+
### Comments to Update (Based on Critique)
|
|
327
|
+
|
|
328
|
+
| Original Comment | Recommendation | Update Guidance |
|
|
329
|
+
|------------------|----------------|-----------------|
|
|
330
|
+
| File: auth.ts L45-50 | Modify | Soften tone; acknowledge valid alternative |
|
|
331
|
+
| File: api.ts L88 | Skip | Stylistic preference, not actionable |
|
|
332
|
+
| File: db.ts L120 | Include as-is | High value, accurate |
|
|
333
|
+
|
|
334
|
+
### Counts
|
|
335
|
+
- **Comments to Include as-is**: X
|
|
336
|
+
- **Comments to Modify**: Y
|
|
337
|
+
- **Comments to Skip**: Z (will not post to GitHub, retained in ReviewComments.md)
|
|
338
|
+
|
|
339
|
+
### Notes for Feedback Response
|
|
340
|
+
[Any specific guidance for the feedback skill when updating comments]
|
|
341
|
+
```
|
|
342
|
+
|
|
343
|
+
**Skip Clarification**: Comments marked "Skip" remain in ReviewComments.md for documentation but will NOT be posted to GitHub by `paw-review-github`. The reviewer can override by changing the recommendation before the feedback response pass.
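The table and counts in the Iteration Summary can be derived from the same list of verdicts, which keeps the two from drifting apart. A minimal sketch follows, assuming each assessed comment is reduced to a label, a verdict, and a guidance string; `SummaryRow` and `renderIterationSummary` are hypothetical names, and the notes section is left for the critic to write by hand.

```typescript
// Hypothetical generator for the table and counts of the Iteration Summary.
type Verdict = "Include as-is" | "Modify" | "Skip";

interface SummaryRow {
  comment: string; // e.g. "File: auth.ts L45-50"
  verdict: Verdict;
  guidance: string;
}

function renderIterationSummary(rows: readonly SummaryRow[]): string {
  // Counts are computed from the rows so they always agree with the table.
  const count = (v: Verdict): number => rows.filter((r) => r.verdict === v).length;
  const tableRows = rows.map((r) => `| ${r.comment} | ${r.verdict} | ${r.guidance} |`);
  return [
    "## Iteration Summary",
    "",
    "### Comments to Update (Based on Critique)",
    "",
    "| Original Comment | Recommendation | Update Guidance |",
    "|------------------|----------------|-----------------|",
    ...tableRows,
    "",
    "### Counts",
    `- **Comments to Include as-is**: ${count("Include as-is")}`,
    `- **Comments to Modify**: ${count("Modify")}`,
    `- **Comments to Skip**: ${count("Skip")} (will not post to GitHub, retained in ReviewComments.md)`,
  ].join("\n");
}
```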
|
|
344
|
+
|
|
345
|
+
## Validation Checklist
|
|
346
|
+
|
|
347
|
+
Before completing, verify:
|
|
348
|
+
|
|
349
|
+
- [ ] Assessment added for every inline comment
|
|
350
|
+
- [ ] Assessment added for every thread comment
|
|
351
|
+
- [ ] All assessments have all five components (Usefulness, Accuracy, Alternative Perspective, Trade-offs, Recommendation)
|
|
352
|
+
- [ ] Usefulness ratings calibrated (not inflated)
|
|
353
|
+
- [ ] Evidence validation performed (file:line references checked)
|
|
354
|
+
- [ ] Alternative perspectives genuinely considered
|
|
355
|
+
- [ ] Trade-offs realistically evaluated
|
|
356
|
+
- [ ] Recommendations actionable and justified
|
|
357
|
+
- [ ] Assessments remain in ReviewComments.md (NOT posted externally)
|
|
358
|
+
- [ ] Iteration Summary section appended with update table and counts
|
|
359
|
+
- [ ] Tone is respectful and constructive
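The structural items in this checklist, in particular that every assessment has all five components, can be checked mechanically. The sketch below assumes each component appears as a `- **Name**:` bullet, matching the assessment structure; `missingComponents` is a hypothetical helper. Calibration, tone, and whether alternatives were genuinely considered remain judgment calls.

```typescript
// Hypothetical check that an assessment block contains all five components.
const REQUIRED_COMPONENTS: readonly string[] = [
  "Usefulness",
  "Accuracy",
  "Alternative Perspective",
  "Trade-offs",
  "Recommendation",
];

// Returns the names of components that do not appear as `- **Name**:` bullets.
function missingComponents(assessmentBlock: string): string[] {
  return REQUIRED_COMPONENTS.filter(
    (name) => !assessmentBlock.includes(`- **${name}**:`),
  );
}
```

An empty result means the block is structurally complete; anything else names the components still to be written.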
|
|
360
|
+
|
|
361
|
+
## Completion Response
|
|
362
|
+
|
|
363
|
+
```
|
|
364
|
+
Activity complete.
|
|
365
|
+
Artifact saved: .paw/reviews/<identifier>/ReviewComments.md (assessments added)
|
|
366
|
+
Status: Success
|
|
367
|
+
|
|
368
|
+
Iteration Summary:
|
|
369
|
+
- Include as-is: N comments
|
|
370
|
+
- Modify: M comments (with update guidance in assessments)
|
|
371
|
+
- Skip: K comments (retained in artifact, will not post to GitHub)
|
|
372
|
+
|
|
373
|
+
Next: Run paw-review-feedback in Critique Response Mode to finalize comments with **Final**: markers.
|