ace-task 0.31.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (68)
  1. checksums.yaml +7 -0
  2. data/.ace-defaults/nav/protocols/skill-sources/ace-task.yml +19 -0
  3. data/.ace-defaults/nav/protocols/wfi-sources/ace-task.yml +19 -0
  4. data/.ace-defaults/task/config.yml +25 -0
  5. data/CHANGELOG.md +518 -0
  6. data/README.md +52 -0
  7. data/Rakefile +12 -0
  8. data/exe/ace-task +22 -0
  9. data/handbook/guides/task-definition.g.md +156 -0
  10. data/handbook/skills/as-bug-analyze/SKILL.md +26 -0
  11. data/handbook/skills/as-bug-fix/SKILL.md +27 -0
  12. data/handbook/skills/as-task-document-unplanned/SKILL.md +27 -0
  13. data/handbook/skills/as-task-draft/SKILL.md +24 -0
  14. data/handbook/skills/as-task-finder/SKILL.md +27 -0
  15. data/handbook/skills/as-task-plan/SKILL.md +30 -0
  16. data/handbook/skills/as-task-review/SKILL.md +25 -0
  17. data/handbook/skills/as-task-review-questions/SKILL.md +25 -0
  18. data/handbook/skills/as-task-update/SKILL.md +21 -0
  19. data/handbook/skills/as-task-work/SKILL.md +41 -0
  20. data/handbook/templates/task/draft.template.md +166 -0
  21. data/handbook/templates/task/file-modification-checklist.template.md +26 -0
  22. data/handbook/templates/task/technical-approach.template.md +26 -0
  23. data/handbook/workflow-instructions/bug/analyze.wf.md +458 -0
  24. data/handbook/workflow-instructions/bug/fix.wf.md +512 -0
  25. data/handbook/workflow-instructions/task/document-unplanned.wf.md +222 -0
  26. data/handbook/workflow-instructions/task/draft.wf.md +552 -0
  27. data/handbook/workflow-instructions/task/finder.wf.md +22 -0
  28. data/handbook/workflow-instructions/task/plan.wf.md +489 -0
  29. data/handbook/workflow-instructions/task/review-plan.wf.md +144 -0
  30. data/handbook/workflow-instructions/task/review-questions.wf.md +411 -0
  31. data/handbook/workflow-instructions/task/review-work.wf.md +146 -0
  32. data/handbook/workflow-instructions/task/review.wf.md +351 -0
  33. data/handbook/workflow-instructions/task/update.wf.md +118 -0
  34. data/handbook/workflow-instructions/task/work.wf.md +106 -0
  35. data/lib/ace/task/atoms/task_file_pattern.rb +68 -0
  36. data/lib/ace/task/atoms/task_frontmatter_defaults.rb +46 -0
  37. data/lib/ace/task/atoms/task_id_formatter.rb +62 -0
  38. data/lib/ace/task/atoms/task_validation_rules.rb +51 -0
  39. data/lib/ace/task/cli/commands/create.rb +105 -0
  40. data/lib/ace/task/cli/commands/doctor.rb +206 -0
  41. data/lib/ace/task/cli/commands/list.rb +73 -0
  42. data/lib/ace/task/cli/commands/plan.rb +119 -0
  43. data/lib/ace/task/cli/commands/show.rb +58 -0
  44. data/lib/ace/task/cli/commands/status.rb +77 -0
  45. data/lib/ace/task/cli/commands/update.rb +183 -0
  46. data/lib/ace/task/cli.rb +83 -0
  47. data/lib/ace/task/models/task.rb +46 -0
  48. data/lib/ace/task/molecules/path_utils.rb +20 -0
  49. data/lib/ace/task/molecules/subtask_creator.rb +130 -0
  50. data/lib/ace/task/molecules/task_config_loader.rb +92 -0
  51. data/lib/ace/task/molecules/task_creator.rb +115 -0
  52. data/lib/ace/task/molecules/task_display_formatter.rb +221 -0
  53. data/lib/ace/task/molecules/task_doctor_fixer.rb +510 -0
  54. data/lib/ace/task/molecules/task_doctor_reporter.rb +264 -0
  55. data/lib/ace/task/molecules/task_frontmatter_validator.rb +138 -0
  56. data/lib/ace/task/molecules/task_loader.rb +119 -0
  57. data/lib/ace/task/molecules/task_plan_cache.rb +190 -0
  58. data/lib/ace/task/molecules/task_plan_generator.rb +141 -0
  59. data/lib/ace/task/molecules/task_plan_prompt_builder.rb +91 -0
  60. data/lib/ace/task/molecules/task_reparenter.rb +247 -0
  61. data/lib/ace/task/molecules/task_resolver.rb +115 -0
  62. data/lib/ace/task/molecules/task_scanner.rb +129 -0
  63. data/lib/ace/task/molecules/task_structure_validator.rb +154 -0
  64. data/lib/ace/task/organisms/task_doctor.rb +199 -0
  65. data/lib/ace/task/organisms/task_manager.rb +353 -0
  66. data/lib/ace/task/version.rb +7 -0
  67. data/lib/ace/task.rb +37 -0
  68. metadata +197 -0
@@ -0,0 +1,411 @@
---
doc-type: workflow
purpose: Resolve clarifying questions for tasks and implementation readiness
ace-docs:
  last-updated: '2026-03-21'
---

# Review Questions Workflow Instruction

## Goal

Interactively review and resolve questions in tasks marked with `needs_review: true`, capturing answers and updating task definitions to make them implementation-ready without requiring further clarification.

## Prerequisites

- One or more tasks exist with the `needs_review: true` flag
- Understanding of task review question format and structure
- Authority to make implementation decisions or access to stakeholders
- Write access to task files in `.ace-tasks/`
- Access to the `ace-task list` tool for finding tasks

## Project Context Loading

- Read and follow: `ace-bundle wfi://bundle`
- Load existing review workflow: `ace-bundle wfi://task/review`

## Process Steps

1. **Find Next Task Needing Review:**

```bash
# List tasks by status (needs_review is a metadata field, not a filter)
ace-task list --status draft
ace-task list --status pending

# You'll need to check task files manually for the needs_review: true flag
# Or use ace-search to find tasks with the flag:
cd .ace-tasks && ace-search "needs_review: true" --content
```

**Selection Strategy:**
- Prioritize HIGH priority tasks first
- Within the same priority, select the oldest tasks
- Consider task dependencies (review prerequisites first)
- Note the task path for loading
- **Note**: `needs_review` is a task metadata field that must be checked by reading task files

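Because `needs_review` lives in each task's YAML frontmatter rather than in an `ace-task list` filter, the scan can also be scripted. A minimal Ruby sketch — the `find_tasks_needing_review` helper is illustrative, not part of the ace-task API, and it assumes the `.ace-tasks/` markdown layout described above:

```ruby
require "yaml"

# Scan a task directory for files whose YAML frontmatter sets needs_review: true.
def find_tasks_needing_review(dir = ".ace-tasks")
  Dir.glob(File.join(dir, "**", "*.md")).select do |path|
    text = File.read(path)
    # Frontmatter is the block between the leading "---" markers.
    match = text.match(/\A---\n(.*?)\n---/m)
    next false unless match

    meta = YAML.safe_load(match[1]) || {}
    meta["needs_review"] == true
  end
end
```

A call like `find_tasks_needing_review.sort` then yields a stable review queue without shelling out to `ace-search`.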
2. **Load and Analyze Task Questions:**

- **Read Task File:**
  ```bash
  # Read the selected task
  cat [task-path]
  ```

- **Identify Question Structure:**
  - Locate the `## Review Questions (Pending Human Input)` section
  - Note question priorities: [HIGH], [MEDIUM], [LOW]
  - Review research context for each question
  - Understand suggested defaults and rationale

- **Prepare Question Presentation:**
  - Group questions by priority level
  - Order within groups by dependency/logic flow
  - Prepare to present context with each question

3. **Interactive Question Review Process:**

### For Each Question (Priority Order):

**a. Present Question with Full Context:**
```markdown
========================================
QUESTION [1/N] - [PRIORITY LEVEL]
========================================

**Question**: [Question text]

**Research Conducted**:
[Research findings from task]

**Current Context**:
[Relevant project/technical context]

**Suggested Default**:
[Default recommendation with rationale]

**Why Human Input Needed**:
[Business/design decision reasoning]

**Potential Options**:
1. [Option A with implications]
2. [Option B with implications]
3. [Option C with implications]
4. [Custom answer]

Please provide your decision:
```

**b. Capture User Answer:**
- Record the exact answer provided
- Ask for optional rationale if not clear
- Confirm understanding before proceeding
- Note any follow-up implications

**c. Document Answer Format:**
```markdown
### [RESOLVED] Original Question Title
- **Decision**: [User's answer]
- **Rationale**: [Why this choice was made]
- **Implications**: [What this means for implementation]
- **Resolved by**: [User/Role]
- **Date**: [YYYY-MM-DD]
```

4. **Save Answers Progressively:**

**After Each Answer:**
- Update the task file immediately
- Move the question from the pending to the resolved section
- Preserve the original question for the audit trail
- Add resolution details

**Answer Integration Pattern:**
```markdown
## Review Questions (Resolved)

### ✅ [RESOLVED] How should we handle session timeouts?
- **Original Priority**: HIGH
- **Decision**: Implement 12-hour sessions with 2-hour idle timeout
- **Rationale**: Balances security with user convenience per OWASP
- **Implementation Notes**:
  - Use refresh tokens for extension
  - Log timeout events for monitoring
- **Resolved by**: Product Owner
- **Date**: 2025-01-30

## Review Questions (Pending Human Input)

### [MEDIUM] Remaining Question
- [ ] [Question still needing answer...]
```

5. **Update Task Definition with Answers:**

**Integration Points by Task Section:**

### Technical Specifications:
- Add concrete configuration values from answers
- Update the implementation approach based on decisions
- Specify exact thresholds, limits, quotas

### Implementation Notes:
- Document specific technical choices made
- Add configuration examples with resolved values
- Include edge case handling per decisions

### Success Criteria:
- Update measurable targets with specific values
- Add validation criteria from answers
- Include performance thresholds decided

### Configuration Files:
- Update code examples with actual values
- Replace placeholders with decisions
- Add comments explaining choices

**Example Integration:**
```javascript
// Before (with question)
numberOfRuns: 3, // TODO: How many runs for reliability?

// After (with answer integrated)
numberOfRuns: 3, // Confirmed: 3 runs for median reliability (decided 2025-01-30)
```

6. **Complete Review Session:**

**When All Questions Answered:**
- Remove the `needs_review: true` flag from metadata
- Move all questions to the Resolved section
- Add a review completion note

**Completion Metadata Update:**
```yaml
---
id: v.0.2.0+task.123
status: draft # Or current status
priority: high
estimate: 4-6h # Update if needed based on decisions
dependencies: none
# needs_review: true # REMOVED
review_completed: 2025-01-30
reviewed_by: [User/Role]
---
```

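This metadata rewrite can also be scripted. A hedged Ruby sketch — the `complete_review!` helper is hypothetical, not an ace-task command, and it only edits the frontmatter block in place:

```ruby
require "yaml"
require "date"

# Rewrite a task file's frontmatter: drop needs_review, record completion.
def complete_review!(path, reviewed_by:)
  text = File.read(path)
  match = text.match(/\A---\n(.*?)\n---\n/m)
  raise "no frontmatter in #{path}" unless match

  meta = YAML.safe_load(match[1], permitted_classes: [Date]) || {}
  meta.delete("needs_review")
  meta["review_completed"] = Date.today.to_s
  meta["reviewed_by"] = reviewed_by
  body = match.post_match
  # Re-emit frontmatter (to_yaml already starts with "---\n") plus the body.
  File.write(path, "---\n#{meta.to_yaml.sub(/\A---\n/, "")}---\n#{body}")
end
```

Run against a completed task, e.g. `complete_review!(".ace-tasks/task.123.md", reviewed_by: "Product Owner")`, it leaves the task body untouched and only updates the metadata block.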
**Add Implementation Readiness Note:**
```markdown
## Review Completion Summary

**Date**: 2025-01-30
**Reviewed by**: [User/Role]
**Questions Resolved**: 5 (3 HIGH, 2 MEDIUM)
**Implementation Readiness**: ✅ Ready for implementation

**Key Decisions Made**:
- Lighthouse CI will run on all builds with sampling
- Performance thresholds: 5-point warning, 10-point failure
- Mobile-first testing with 3 runs for reliability
- 50ms monitoring overhead budget approved
- BigQuery integration deferred to Phase 2
```

7. **Handle Partial Reviews:**

**If Review Must Be Interrupted:**
- Save all answered questions immediately
- Keep the `needs_review: true` flag
- Add a progress note with a timestamp
- Document which questions remain

**Progress Note Format:**
```markdown
## Review Progress Notes

### Session: 2025-01-30 14:30
- Resolved: 3 of 5 questions
- Remaining: 2 MEDIUM priority questions
- Blocked on: Need input from DevOps team
- Next steps: Schedule follow-up for remaining items
```

8. **Batch Review Mode (Optional):**

**For Multiple Tasks:**
```bash
# Generate review queue (find tasks with the needs_review flag)
(cd .ace-tasks && ace-search "needs_review: true" --content --files-with-matches) > review-queue.txt

# Process each task systematically
while read -r task; do
  echo "Reviewing: $task"
  # Follow steps 2-6 for each task
done < review-queue.txt
```

**Batch Summary Report:**
```markdown
## Batch Review Summary - 2025-01-30

**Tasks Reviewed**: 3
**Questions Resolved**: 12 total
- Task.123: 5 questions ✅
- Task.124: 4 questions ✅
- Task.125: 3 questions (2 resolved, 1 pending)

**Common Decisions**:
- All performance monitoring at 25% sampling
- Consistent 5/10 point threshold strategy
- Mobile-first testing approach approved
```

## Success Criteria

- All HIGH priority questions answered with clear decisions
- Answers documented with rationale and implications
- Task definition updated with concrete implementation details
- Configuration examples include actual decided values
- `needs_review` flag removed when fully resolved
- Task achieves "implementation-ready" state
- No ambiguity remains that would block implementation
- Review completion summary added to task

276
+
277
+ ### Performance/Threshold Questions
278
+ ```markdown
279
+ **Question**: What performance degradation threshold should trigger build failure?
280
+ **Answer Template**:
281
+ - Warning threshold: [X points/percent]
282
+ - Failure threshold: [Y points/percent]
283
+ - Applies to: [specific metrics]
284
+ - Exception handling: [if any]
285
+ ```
286
+
287
+ ### Configuration/Setup Questions
288
+ ```markdown
289
+ **Question**: Should this run in CI/CD or local only?
290
+ **Answer Template**:
291
+ - Environments: [local, CI, production]
292
+ - Trigger conditions: [PR, merge, manual]
293
+ - Resource limits: [if applicable]
294
+ - Cost considerations: [if applicable]
295
+ ```
296
+
297
+ ### Feature Scope Questions
298
+ ```markdown
299
+ **Question**: Should we include [feature X] in this implementation?
300
+ **Answer Template**:
301
+ - Include in current scope: [Yes/No]
302
+ - If deferred, target step: [Step N]
303
+ - Dependencies affected: [list]
304
+ - Alternative approach: [if not included]
305
+ ```
306
+
307
+ ### Technical Approach Questions
308
+ ```markdown
309
+ **Question**: Which library/tool should we use for [purpose]?
310
+ **Answer Template**:
311
+ - Selected option: [Library/Tool name]
312
+ - Version constraint: [if specific]
313
+ - Rationale: [why chosen]
314
+ - Fallback option: [if first choice fails]
315
+ ```
316
+
## Integration with Task Workflows

### Before review-questions:
- `review-task`: Generates questions needing answers
- `draft-task`: Creates tasks that may need clarification

### After review-questions:
- `plan-task`: Can proceed with clear requirements
- `work-on-task`: Implementation without ambiguity
- Task is ready for execution without blockers

### Parallel workflows:
- `create-adr`: Document significant technical decisions
- `create-test-cases`: Define tests based on decisions

## Error Handling

### Common Issues:

**"No tasks need review"**
- Run `ace-task list needs-review` (preset) or `cd .ace-tasks && ace-search "needs_review: true" --content`
- Check if reviews were already completed
- Look for tasks with questions but a missing flag

**"Cannot parse question format"**
- Ensure questions follow the standard format
- Check for a `## Review Questions` section
- Verify the markdown structure is valid

**"Conflicting answers"**
- Review previous decisions for consistency
- Document why this case differs
- Consider creating an ADR for significant changes

## Usage Examples

### Example 1: Single Task Review
```
User: "Review questions for the Lighthouse CI task"

Process:
1. Load task.123 with 5 pending questions
2. Present each question with context:
   Q1 [HIGH]: "Should Lighthouse CI run on all builds?"
   - Show research about build times
   - Present cost implications
   - Suggest: "PR checks + production"
3. Capture answer: "Yes, but with different configs"
4. Document decision with rationale
5. Update task config examples with decision
6. Continue through all 5 questions
7. Remove needs_review flag
8. Add completion summary
Result: Task ready for implementation
```

### Example 2: Batch Review Session
```
User: "Review all pending task questions"

Process:
1. Find 3 tasks needing review (123, 124, 125)
2. Start with highest priority (task.123)
3. Work through all questions systematically
4. Save progress after each task
5. Generate batch summary report
6. Flag any that need follow-up
Result: 2 tasks ready, 1 needs additional input
```

### Example 3: Partial Review with Handoff
```
User: "Review what I can answer for task.124"

Process:
1. Load 4 questions from task.124
2. Answer technical questions (2 resolved)
3. Flag business questions for Product Owner
4. Save partial progress with notes
5. Keep needs_review flag active
6. Document what remains and who should answer
Result: Partial resolution with clear next steps
```

## Key Value: Structured Decision Capture

This workflow ensures:
1. **No Lost Context**: All research and reasoning preserved
2. **Audit Trail**: Clear record of who decided what and why
3. **Implementation Clarity**: Developers have exact values and approaches
4. **Efficient Reviews**: Questions presented with full context for quick decisions
5. **Progressive Resolution**: Can handle partial reviews and handoffs
6. **Batch Processing**: Efficiently review multiple tasks in one session

The workflow transforms tasks from "blocked on questions" to "ready to build" through systematic, documented decision-making.
@@ -0,0 +1,146 @@
---
doc-type: workflow
purpose: Review implementation work before completion
ace-docs:
  last-updated: '2026-03-21'
---

# Review Work Workflow Instruction

## Goal

Critically evaluate work execution output for completeness, credibility, and delivery readiness. This workflow acts as the adversarial quality gate between execution and delivery. Work that passes this review should be complete enough to serve as a PR description or implementation report.

## When to Use

- As Phase 2 (self-critique) in a work execution step
- After any implementation report is produced, before declaring work complete
- When reviewing execution quality in ace-assign pipeline steps

+ ## Evaluation Dimensions
21
+
22
+ Evaluate the work output against these six dimensions. Score each as **PASS**, **WEAK**, or **FAIL**.
23
+
24
+ ### 1. Plan Adherence
25
+
26
+ Every item from the implementation plan must be addressed in the execution output. No silent drops.
27
+
28
+ **PASS:** Every plan item has a corresponding execution result — completed, modified with rationale, or explicitly deferred.
29
+ **WEAK:** Most plan items addressed but 1-2 minor items not mentioned.
30
+ **FAIL:** Plan items silently dropped with no explanation.
31
+
32
+ **Check for:**
33
+ - Plan items with no corresponding execution mention
34
+ - Scope changes without documented rationale
35
+ - New work introduced that wasn't in the plan (scope creep)
36
+ - Deferred items without justification
37
+
38
+ ### 2. Change Credibility
39
+
40
+ Every claimed change must reference specific file paths, use valid code patterns, and match the project's actual structure.
41
+
42
+ **PASS:** All changes reference real file paths, use correct syntax, and match project conventions.
43
+ **WEAK:** Most changes are specific but some lack file paths or use approximate descriptions.
44
+ **FAIL:** Vague claims like "updated the module" or references to non-existent patterns.
45
+
46
+ **Check for:**
47
+ - Changes described without file paths
48
+ - Code snippets that don't match the project's language or framework conventions
49
+ - References to files or modules that don't exist in the project
50
+ - Descriptions too vague to verify ("improved error handling")
51
+
52
+ ### 3. Test Coverage Verification
53
+
54
+ Tests must include concrete assertions and cover edge cases, not just happy paths.
55
+
56
+ **PASS:** Test scenarios named with specific inputs, expected outputs, and edge case coverage.
57
+ **WEAK:** Tests cover happy paths but edge cases are thin or unspecified.
58
+ **FAIL:** Generic "tests added" claims or no test evidence for code changes.
59
+
60
+ **Check for:**
61
+ - "Tests pass" without listing what was tested
62
+ - Missing edge case coverage for boundary conditions
63
+ - No error path testing
64
+ - Test file paths not specified
65
+
66
+ ### 4. Convention Compliance
67
+
68
+ Naming, style, error messages, and patterns must match established project conventions.
69
+
70
+ **PASS:** All changes follow project naming patterns, code style, and error message conventions.
71
+ **WEAK:** Minor deviations that don't affect functionality.
72
+ **FAIL:** Systematic convention violations or introduction of inconsistent patterns.
73
+
74
+ **Check for:**
75
+ - Naming that breaks established conventions (snake_case vs camelCase, prefixes, etc.)
76
+ - Error messages that don't follow project patterns
77
+ - File placement that violates project structure
78
+ - New patterns introduced without justification when existing patterns apply
79
+
80
+ ### 5. Risk Mitigation Evidence
81
+
82
+ Risks identified in the plan must have corresponding mitigation actions in the execution.
83
+
84
+ **PASS:** Each identified risk has a documented mitigation action or resolution.
85
+ **WEAK:** Most risks addressed but some mitigations are implicit rather than explicit.
86
+ **FAIL:** Risks from the plan ignored in execution, or new risks introduced without mitigation.
87
+
88
+ **Check for:**
89
+ - Plan risks with no corresponding mitigation evidence
90
+ - New risks introduced during execution without acknowledgment
91
+ - Cross-package impacts not verified
92
+ - Breaking changes without backward compatibility consideration
93
+
94
+ ### 6. Delivery Readiness
95
+
96
+ The execution output must be complete enough that a reviewer can assess the full scope of changes.
97
+
98
+ **PASS:** Output includes complete change manifest, test results, and remaining work (if any) clearly documented.
99
+ **WEAK:** Output covers main changes but missing minor details a reviewer would need.
100
+ **FAIL:** Output is incomplete — missing change descriptions, no test evidence, or unclear what was actually done.
101
+
102
+ **Check for:**
103
+ - Missing summary of what changed and why
104
+ - No test execution evidence
105
+ - Unclear boundary between completed and remaining work
106
+ - Missing information a PR reviewer would need
107
+
108
+ ## Output Format
109
+
110
+ Produce the critique in this structure:
111
+
112
+ ```markdown
113
+ ## Work Critique
114
+
115
+ **Verdict:** SHIP IT | NEEDS REVISION | INCOMPLETE
116
+
117
+ ### Dimension Scores
118
+
119
+ | Dimension | Score | Notes |
120
+ |-----------|-------|-------|
121
+ | Plan Adherence | PASS/WEAK/FAIL | One-line finding |
122
+ | Change Credibility | PASS/WEAK/FAIL | One-line finding |
123
+ | Test Coverage Verification | PASS/WEAK/FAIL | One-line finding |
124
+ | Convention Compliance | PASS/WEAK/FAIL | One-line finding |
125
+ | Risk Mitigation Evidence | PASS/WEAK/FAIL | One-line finding |
126
+ | Delivery Readiness | PASS/WEAK/FAIL | One-line finding |
127
+
128
+ ### Critical Findings
129
+ - [List specific issues that MUST be fixed before delivery]
130
+
131
+ ### Strengths
132
+ - [List what the execution does well]
133
+ ```
134
+
135
+ ## Verdict Criteria
136
+
137
+ - **SHIP IT:** No FAIL scores, at most one WEAK score
138
+ - **NEEDS REVISION:** No more than two FAIL scores, or three+ WEAK scores
139
+ - **INCOMPLETE:** Three or more FAIL scores
140
+
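These criteria are mechanical enough to express in code. A minimal Ruby sketch of one way to resolve them exhaustively — the `verdict` helper is illustrative, not part of ace-task, and it takes the six dimension scores as `:pass`, `:weak`, or `:fail`:

```ruby
# Map the six dimension scores (:pass, :weak, :fail) to a verdict string.
def verdict(scores)
  fails = scores.values.count(:fail)
  weaks = scores.values.count(:weak)
  return "INCOMPLETE" if fails >= 3
  return "SHIP IT" if fails.zero? && weaks <= 1

  "NEEDS REVISION" # one or two FAILs, or two or more WEAKs
end
```

Checking the branches in this order makes the three verdicts mutually exclusive: INCOMPLETE wins at three or more FAILs, SHIP IT requires a near-clean sheet, and everything in between needs revision.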
## Review Principles

- Be adversarial. Your job is to find gaps between the plan and the execution, not to validate effort.
- Compare the plan and execution item-by-item. Every plan item needs a resolution.
- Demand specificity. "Updated the code" is not evidence of a change.
- A shipped report is a commitment. Ensure every claim is verifiable.