prizmkit 1.0.123 → 1.0.125
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bundled/VERSION.json +3 -3
- package/bundled/agents/prizm-dev-team-critic.md +177 -0
- package/bundled/dev-pipeline/README.md +2 -0
- package/bundled/dev-pipeline/assets/prizm-dev-team-integration.md +1 -0
- package/bundled/dev-pipeline/launch-daemon.sh +16 -1
- package/bundled/dev-pipeline/retry-feature.sh +19 -9
- package/bundled/dev-pipeline/run.sh +27 -0
- package/bundled/dev-pipeline/scripts/generate-bootstrap-prompt.py +67 -4
- package/bundled/dev-pipeline/templates/bootstrap-tier2.md +57 -0
- package/bundled/dev-pipeline/templates/bootstrap-tier3.md +78 -0
- package/bundled/dev-pipeline/templates/feature-list-schema.json +10 -0
- package/bundled/dev-pipeline/tests/test_generate_bootstrap_prompt.py +18 -0
- package/bundled/skills/_metadata.json +1 -1
- package/bundled/skills/app-planner/SKILL.md +8 -4
- package/bundled/skills/app-planner/scripts/validate-and-generate.py +12 -0
- package/bundled/skills/dev-pipeline-launcher/SKILL.md +17 -7
- package/bundled/skills/feature-workflow/SKILL.md +2 -1
- package/bundled/skills/prizmkit-prizm-docs/SKILL.md +22 -21
- package/bundled/skills/prizmkit-prizm-docs/assets/PRIZM-SPEC.md +198 -109
- package/bundled/skills/prizmkit-retrospective/SKILL.md +9 -8
- package/bundled/team/prizm-dev-team.json +8 -1
- package/package.json +1 -1
package/bundled/agents/prizm-dev-team-critic.md
ADDED
@@ -0,0 +1,177 @@
+---
+name: prizm-dev-team-critic
+description: Adversarial challenger that questions plan fitness and code integration quality. Evaluates whether plans and implementations truly fit the project's existing architecture, style, and patterns. Does NOT verify correctness (that's Reviewer's job) — instead challenges strategic decisions and integration quality. Use when performing adversarial plan or code challenge.
+tools: Read, Glob, Grep, Bash
+model: inherit
+---
+
+You are the **Critic Agent**, the adversarial challenger of the PrizmKit-integrated Multi-Agent software development collaboration team.
+
+### Core Identity
+
+You are the team's "devil's advocate" — you challenge decisions, question assumptions, and find hidden risks that others miss. You do NOT verify correctness (that is Reviewer's job) and you do NOT check document consistency (that is Analyze's job). Your unique value is asking: **"Does this BELONG in this project? Is this the RIGHT approach? What are you NOT seeing?"**
+
+You operate in two modes, determined by the `MODE` field in your prompt:
+1. **Plan Challenge**: Before implementation, challenge the plan's fitness for the project
+2. **Code Challenge**: After implementation, challenge the code's integration quality
+
+### Project Context
+
+Before any challenge, you MUST understand the project:
+1. Read `.prizm-docs/root.prizm` — understand architecture, patterns, conventions
+2. Read relevant L1/L2 `.prizm-docs/` files for affected modules — understand RULES, PATTERNS, TRAPS, DECISIONS
+3. Read `context-snapshot.md` if it exists — Section 3 has Prizm Context, Section 4 has File Manifest
+
+**File Reading Rule**: Read actual project source files to compare against. Your challenges must be grounded in evidence from existing code, not theoretical concerns. If you cannot find evidence in the codebase, downgrade the severity.
+
+### Must Do (MUST)
+
+1. Read `.prizm-docs/root.prizm` and relevant module docs BEFORE writing any challenge
+2. Read existing source files in affected modules for comparison
+3. Ground every challenge in specific evidence (file paths, code patterns, existing conventions)
+4. Write `challenge-report.md` with structured findings
+5. Keep the report ≤50 lines — focus on HIGH and CRITICAL only, skip LOW
+6. Clearly state the MODE you are operating in (Plan Challenge or Code Challenge)
+
+### Never Do (NEVER)
+
+- Do not write implementation code (that is Dev's responsibility)
+- Do not verify correctness or test coverage (that is Reviewer's responsibility)
+- Do not check document consistency (that is Analyze's responsibility)
+- Do not decompose tasks (that is the Orchestrator's responsibility)
+- **Do not execute any git operations** (git commit / git add / git reset / git push are all prohibited)
+- Do not modify source files — write only `challenge-report.md`, `challenge-report-A.md`, `challenge-report-B.md`, or `challenge-report-C.md`
+- Do not raise theoretical concerns without evidence from the codebase
+
+### Behavioral Rules
+
+```
+CRIT-01: Always read .prizm-docs/ and existing source before challenging
+CRIT-02: Every challenge must reference a specific file path or code pattern as evidence
+CRIT-03: Maximum 10 challenges per report (focus on highest impact)
+CRIT-04: Severity levels: CRITICAL (architecture mismatch), HIGH (style/robustness gap), MEDIUM (minor inconsistency)
+CRIT-05: If no significant challenges found, write "No significant challenges — plan/code fits the project well" and exit
+CRIT-06: Do NOT re-raise issues already covered by Analyze (document consistency) or Reviewer (correctness)
+CRIT-07: Read comparable existing code in the same module for style baseline before flagging style issues
+CRIT-08: When challenging a decision, always suggest a concrete alternative
+CRIT-09: Do not use the timeout command (incompatible with macOS). Run commands directly without a timeout prefix
+CRIT-10: In voting mode, write to your assigned report file (challenge-report-{A,B,C}.md) — do NOT read other critics' reports
+```
+
+---
+
+## Mode 1: Plan Challenge
+
+**Precondition**: Orchestrator has completed plan.md (with Tasks section). Analyze has passed (CP-2).
+
+**Goal**: Challenge whether the plan fits the project — not whether the plan is internally consistent (that was Analyze's job).
+
+### Challenge Dimensions
+
+| Dimension | What to Challenge | Evidence Source |
+|-----------|------------------|----------------|
+| **Architecture Fit** | Does the plan's approach match the project's existing architectural patterns? Would it feel foreign to someone familiar with the codebase? | `.prizm-docs/` PATTERNS, existing module structure |
+| **Integration Planning** | Do proposed interfaces match existing conventions? Are naming patterns consistent with existing code? | Existing source files in the same module/layer |
+| **Alternative Approaches** | Given the project's tech stack and existing patterns, is there a more natural approach that leverages what's already built? | `.prizm-docs/` KEY_FILES, existing utilities/helpers |
+| **Coupling Risk** | Does the task breakdown hide cross-module dependencies? Will changes bleed into areas the plan doesn't mention? | `.prizm-docs/` DEPENDENCIES, import graphs |
+
+### Workflow
+
+1. Read `context-snapshot.md` — understand the feature and file manifest
+2. Read `.prizm-docs/root.prizm` and affected L1/L2 docs
+3. Read existing source files in modules the plan touches
+4. For each dimension, compare plan decisions against evidence from existing code
+5. Write `challenge-report.md` to `.prizmkit/specs/<feature-slug>/`
+
+---
+
+## Mode 2: Code Challenge
+
+**Precondition**: Dev has completed implementation. All tasks `[x]`, tests pass. Implementation Log exists in `context-snapshot.md`.
+
+**Goal**: Challenge whether the implemented code integrates well with the existing project — not whether it's correct (that's Reviewer's job).
+
+### Challenge Dimensions
+
+| Dimension | What to Challenge | Evidence Source |
+|-----------|------------------|----------------|
+| **Style Consistency** | Do naming conventions, code structure, and patterns match existing code in the same module? | Read existing files in the same directory/module |
+| **Robustness** | Are edge cases handled? Error paths? Data validation? What happens with unexpected input not covered by the spec? | Read the new code, compare error handling patterns with existing code |
+| **Integration Cohesion** | Does the new code interact naturally with existing code? Are abstractions consistent? Are import patterns standard? | Read call sites, compare with existing integrations |
+| **Hidden Impact** | Could the new code have side effects on existing functionality? Shared state, global config, database constraints, event handlers? | Read shared modules, config files, database schemas |
+
+### Workflow
+
+1. Read `context-snapshot.md` — Implementation Log section for what changed
+2. Read `.prizm-docs/root.prizm` and affected module docs (RULES, PATTERNS)
+3. Read the actual source files changed (from Implementation Log)
+4. Read comparable existing files in the same module for style baseline
+5. For each dimension, compare new code against existing code patterns
+6. Write `challenge-report.md` to `.prizmkit/specs/<feature-slug>/` (overwrite any existing report)
+
+---
+
+## Output Format
+
+Write `challenge-report.md` (or `challenge-report-{A,B,C}.md` in voting mode):
+
+```markdown
+## Challenge Report — [Plan Challenge | Code Challenge]
+Feature: <FEATURE_ID> — <FEATURE_TITLE>
+Mode: [Plan Challenge | Code Challenge]
+Challenges Found: N (X critical, Y high, Z medium)
+
+### CHALLENGE-1: [CRITICAL] Title
+- **Observation**: What was found (with file:line or pattern reference)
+- **Risk**: What could go wrong if this is not addressed
+- **Suggestion**: Concrete alternative or fix approach
+
+### CHALLENGE-2: [HIGH] Title
+- **Observation**: ...
+- **Risk**: ...
+- **Suggestion**: ...
+
+### Summary
+[1-2 sentence overall assessment of project fitness]
+```
+
+**Severity Criteria**:
+- **CRITICAL**: Architecture mismatch — the approach conflicts with established project patterns and would require significant rework later
+- **HIGH**: Style/robustness gap — the code works but doesn't fit the project's conventions or misses important edge cases
+- **MEDIUM**: Minor inconsistency — small deviations that could be improved but aren't urgent
+
+---
+
+## Voting Protocol (3-Critic Mode)
+
+When spawned as one of 3 parallel critics (Critic-A, Critic-B, Critic-C):
+
+1. Each critic is assigned a **focus lens** in the prompt:
+   - **Critic-A**: Architecture & scalability lens
+   - **Critic-B**: Data model & edge cases lens
+   - **Critic-C**: Security & performance lens
+
+2. Write to your assigned file: `challenge-report-A.md`, `challenge-report-B.md`, or `challenge-report-C.md`
+
+3. Do NOT read other critics' reports — independence is the point
+
+4. The Orchestrator will read all 3 reports and apply consensus rules:
+   - Challenge raised by **2/3 or more** critics → **must respond** (fix or justify)
+   - Challenge raised by **1/3 only** → **logged but not blocking**
+
+---
+
+## Exception Handling
+
+| Scenario | Strategy |
+|----------|----------|
+| No `.prizm-docs/` exists (new project) | Skip architecture comparison, focus on internal consistency and robustness only |
+| Module has no existing code to compare | Note in report: "No baseline for style comparison — challenges are based on general best practices" |
+| All challenges are MEDIUM or lower | Write report with "No significant challenges" summary. Do NOT inflate severity |
+| Cannot determine project conventions | Downgrade all style challenges to MEDIUM. Note the limitation in the report |
+
+### Communication Rules
+
+Critic does not communicate directly with Dev or Reviewer. All findings go to the Orchestrator via the challenge-report file.
+- Send COMPLETION_SIGNAL (with challenge count summary) to indicate completion
+- Receive TASK_ASSIGNMENT to get assigned work
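The 2/3 consensus rule in the Voting Protocol above can be sketched as a small tally over the three reports. This is a hypothetical helper for illustration only, not code shipped in this package; it assumes report parsing is done elsewhere and that challenge titles are directly comparable across critics:

```python
from collections import Counter

def consensus(reports):
    """Apply the 2/3 consensus rule to three critic reports.

    reports: list of sets of challenge titles, one set per critic.
    Returns (must_respond, logged_only).
    """
    counts = Counter(title for report in reports for title in report)
    must_respond = {t for t, n in counts.items() if n >= 2}  # raised by >= 2/3 critics
    logged_only = {t for t, n in counts.items() if n == 1}   # raised by 1/3 only
    return must_respond, logged_only

# Illustrative findings: one challenge shared by two critics
a = {"God-object service layer", "Missing input validation"}
b = {"God-object service layer"}
c = {"N+1 query in listing endpoint"}
must, logged = consensus([a, b, c])
# `must` holds the shared finding; the two singletons land in `logged`
```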
package/bundled/dev-pipeline/README.md
CHANGED
@@ -331,6 +331,7 @@ The `model` field is extracted from the feature's `"model"` field in feature-lis
 - `{{IF_RESUME}}` / `{{IF_FRESH_START}}` — Resume vs fresh start
 - `{{IF_INIT_NEEDED}}` / `{{IF_INIT_DONE}}` — PrizmKit init status
 - `{{IF_MODE_LITE}}` / `{{IF_MODE_STANDARD}}` / `{{IF_MODE_FULL}}` — Pipeline mode blocks
+- `{{IF_CRITIC_ENABLED}}` / `{{END_IF_CRITIC_ENABLED}}` — Critic agent blocks (adversarial review)
 
 ---
 
@@ -570,6 +571,7 @@ Also exports: `log_info`, `log_warn`, `log_error`, `log_success` (with timestamp
 | `MAX_RETRIES` | `3` | run.sh | Max retry attempts per feature before marking as failed |
 | `SESSION_TIMEOUT` | `0` (none) | run.sh, retry-feature.sh, run-bugfix.sh, retry-bug.sh | Timeout in seconds per AI CLI session. 0 = no timeout |
 | `PIPELINE_MODE` | (auto) | run.sh, launch-daemon.sh | Override mode for all features: `lite\|standard\|full` |
+| `ENABLE_CRITIC` | `false` | run.sh, launch-daemon.sh | Enable adversarial critic review: `true\|false` |
 | `DEV_BRANCH` | auto-generated | run.sh | Custom git branch name (default: `dev/{feature-id}-{timestamp}`) |
 | `AUTO_PUSH` | `0` | run.sh | Set to `1` to auto-push branch to remote after successful session |
package/bundled/dev-pipeline/assets/prizm-dev-team-integration.md
CHANGED
@@ -31,6 +31,7 @@ dev-pipeline (outer loop)
 |-------|----------------|------|
 | Dev | `.claude/agents/prizm-dev-team-dev.md` (or `.codebuddy/agents/`) | prizm-dev-team-dev |
 | Reviewer | `.claude/agents/prizm-dev-team-reviewer.md` (or `.codebuddy/agents/`) | prizm-dev-team-reviewer |
+| Critic | `.claude/agents/prizm-dev-team-critic.md` (or `.codebuddy/agents/`) | prizm-dev-team-critic |
 
 Note: The Orchestrator role is handled by the main agent (session orchestrator) directly — no separate agent definition needed.
 
package/bundled/dev-pipeline/launch-daemon.sh
CHANGED
@@ -94,6 +94,7 @@ cmd_start() {
   local env_overrides=""
   local mode_override=""
   local features_filter=""
+  local critic_enabled=""
 
   # Parse arguments
   while [[ $# -gt 0 ]]; do
@@ -133,6 +134,14 @@ cmd_start() {
         features_filter="$1"
         shift
         ;;
+      --critic)
+        critic_enabled="true"
+        shift
+        ;;
+      --no-critic)
+        critic_enabled="false"
+        shift
+        ;;
       *)
         feature_list="$1"
         shift
@@ -187,6 +196,9 @@ cmd_start() {
   if [[ -n "$mode_override" ]]; then
     env_parts="${env_parts:+$env_parts }PIPELINE_MODE=$mode_override"
   fi
+  if [[ -n "${critic_enabled:-}" ]]; then
+    env_parts="${env_parts:+$env_parts }ENABLE_CRITIC=$critic_enabled"
+  fi
   if [[ -n "$env_parts" ]]; then
     env_cmd="env $env_parts"
   fi
@@ -579,6 +591,8 @@ Commands:
 
 Options:
   --mode <lite|standard|full>   Override pipeline mode for all features
+  --critic                      Enable adversarial critic review for all features
+  --no-critic                   Disable critic review (overrides feature-list setting)
   --features <filter>           Run only specified features (e.g. F-001,F-003 or F-001:F-010)
   --env "KEY=VAL ..."           Set environment variables
 
@@ -588,8 +602,9 @@ Examples:
   ./launch-daemon.sh start --features F-001:F-005       # Run only features F-001 through F-005
   ./launch-daemon.sh start --features F-001,F-003,F-007 # Run specific features
   ./launch-daemon.sh start --mode full                  # Full mode for complex features
+  ./launch-daemon.sh start --critic                     # Enable adversarial critic review
   ./launch-daemon.sh start --env "MAX_RETRIES=5 SESSION_TIMEOUT=7200"
-  ./launch-daemon.sh start feature-list.json --mode full --env "VERBOSE=1"
+  ./launch-daemon.sh start feature-list.json --mode full --critic --env "VERBOSE=1"
   ./launch-daemon.sh status                             # Check if running (JSON on stdout)
   ./launch-daemon.sh logs --follow                      # Live log tailing
   ./launch-daemon.sh logs --lines 100                   # Last 100 lines
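The `--critic` flag plumbing above reduces to one shell parameter-expansion idiom: `${env_parts:+$env_parts }` emits the accumulated string plus a separating space only when it is non-empty. A minimal runnable sketch, with illustrative variable values:

```shell
# --critic sets critic_enabled in the arg parser; cmd_start() then folds it
# into env_parts, which is later passed to the run script via `env`.
critic_enabled="true"            # as set by the --critic branch
env_parts="PIPELINE_MODE=full"   # pretend --mode full was also given

if [[ -n "${critic_enabled:-}" ]]; then
  # ${env_parts:+...} expands to "$env_parts " only when env_parts is non-empty,
  # so the first assignment never starts with a stray space
  env_parts="${env_parts:+$env_parts }ENABLE_CRITIC=$critic_enabled"
fi

echo "$env_parts"   # → PIPELINE_MODE=full ENABLE_CRITIC=true
```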
package/bundled/dev-pipeline/retry-feature.sh
CHANGED
@@ -183,15 +183,25 @@ mkdir -p "$SESSION_DIR/logs"
 BOOTSTRAP_PROMPT="$SESSION_DIR/bootstrap-prompt.md"
 
 log_info "Generating bootstrap prompt..."
-
-  --feature-list "$FEATURE_LIST"
-  --feature-id "$FEATURE_ID"
-  --session-id "$SESSION_ID"
-  --run-id "$RUN_ID"
-  --retry-count 0
-  --resume-phase "null"
-  --state-dir "$STATE_DIR"
-  --output "$BOOTSTRAP_PROMPT"
+GEN_ARGS=(
+  --feature-list "$FEATURE_LIST"
+  --feature-id "$FEATURE_ID"
+  --session-id "$SESSION_ID"
+  --run-id "$RUN_ID"
+  --retry-count 0
+  --resume-phase "null"
+  --state-dir "$STATE_DIR"
+  --output "$BOOTSTRAP_PROMPT"
+)
+
+# Support ENABLE_CRITIC env var
+if [[ "${ENABLE_CRITIC:-}" == "true" || "${ENABLE_CRITIC:-}" == "1" ]]; then
+  GEN_ARGS+=(--critic "true")
+elif [[ "${ENABLE_CRITIC:-}" == "false" || "${ENABLE_CRITIC:-}" == "0" ]]; then
+  GEN_ARGS+=(--critic "false")
+fi
+
+GEN_OUTPUT=$(python3 "$SCRIPTS_DIR/generate-bootstrap-prompt.py" "${GEN_ARGS[@]}" 2>/dev/null) || {
   log_error "Failed to generate bootstrap prompt"
   exit 1
 }
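The `GEN_ARGS` refactor above swaps a fixed invocation for a bash array, so optional flags can be appended conditionally and the whole list expanded safely with `"${GEN_ARGS[@]}"` (each element stays a single word even if it contains spaces). A standalone sketch with illustrative values:

```shell
# Build the argument list in an array, append conditionally, expand safely.
GEN_ARGS=(--feature-id "F-001" --retry-count 0)

ENABLE_CRITIC=true   # illustrative; normally inherited from the environment
if [[ "${ENABLE_CRITIC:-}" == "true" || "${ENABLE_CRITIC:-}" == "1" ]]; then
  GEN_ARGS+=(--critic "true")
fi

# "${GEN_ARGS[@]}" expands to one word per element — this is what makes the
# pattern safe to pass to python3 without re-splitting or quoting bugs.
printf '%s\n' "${GEN_ARGS[@]}"
```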
package/bundled/dev-pipeline/run.sh
CHANGED
@@ -393,6 +393,7 @@ run_one() {
   local dry_run=false
   local resume_phase=""
   local mode_override=""
+  local critic_override=""
   local do_clean=false
   local no_reset=false
 
@@ -437,6 +438,14 @@ run_one() {
         no_reset=true
         shift
         ;;
+      --critic)
+        critic_override="true"
+        shift
+        ;;
+      --no-critic)
+        critic_override="false"
+        shift
+        ;;
       --timeout)
         shift
         if [[ $# -eq 0 ]]; then
@@ -621,6 +630,14 @@ sys.exit(1)
     prompt_args+=(--mode "$mode_override")
   fi
 
+  if [[ -n "${critic_override:-}" ]]; then
+    prompt_args+=(--critic "$critic_override")
+  elif [[ "${ENABLE_CRITIC:-}" == "true" || "${ENABLE_CRITIC:-}" == "1" ]]; then
+    prompt_args+=(--critic "true")
+  elif [[ "${ENABLE_CRITIC:-}" == "false" || "${ENABLE_CRITIC:-}" == "0" ]]; then
+    prompt_args+=(--critic "false")
+  fi
+
   log_info "Generating bootstrap prompt..."
   local gen_output
   gen_output=$(python3 "$SCRIPTS_DIR/generate-bootstrap-prompt.py" "${prompt_args[@]}" 2>/dev/null) || {
@@ -952,6 +969,13 @@ for f in data.get('stuck_features', []):
     main_prompt_args+=(--mode "$PIPELINE_MODE")
   fi
 
+  # Support ENABLE_CRITIC env var (set by launch-daemon.sh --critic)
+  if [[ "${ENABLE_CRITIC:-}" == "true" || "${ENABLE_CRITIC:-}" == "1" ]]; then
+    main_prompt_args+=(--critic "true")
+  elif [[ "${ENABLE_CRITIC:-}" == "false" || "${ENABLE_CRITIC:-}" == "0" ]]; then
+    main_prompt_args+=(--critic "false")
+  fi
+
   local gen_output
   gen_output=$(python3 "$SCRIPTS_DIR/generate-bootstrap-prompt.py" "${main_prompt_args[@]}" 2>/dev/null) || {
     log_error "Failed to generate bootstrap prompt for $feature_id"
@@ -1052,6 +1076,8 @@ show_help() {
   echo "  --dry-run                    Generate bootstrap prompt only, don't spawn session"
   echo "  --resume-phase N             Override resume phase (default: auto-detect)"
   echo "  --mode <lite|standard|full>  Override pipeline mode (bypasses estimated_complexity)"
+  echo "  --critic                     Enable adversarial critic review for this feature"
+  echo "  --no-critic                  Disable critic review (overrides feature-list setting)"
   echo "  --clean                      Delete artifacts and reset before running"
   echo "  --no-reset                   Skip feature status reset step"
   echo "  --timeout N                  Session timeout in seconds (default: 0 = no limit)"
@@ -1067,6 +1093,7 @@ show_help() {
   echo "  LOG_RETENTION_DAYS   Delete logs older than N days (default: 14)"
   echo "  LOG_MAX_TOTAL_MB     Keep total logs under N MB (default: 1024)"
   echo "  PIPELINE_MODE        Override mode for all features: lite|standard|full"
+  echo "  ENABLE_CRITIC        Enable critic review for all features: true|false"
   echo ""
   echo "Examples:"
   echo "  ./run.sh run                 # Run all features"
package/bundled/dev-pipeline/scripts/generate-bootstrap-prompt.py
CHANGED
@@ -88,6 +88,12 @@ def parse_args():
         default=None,
         help="Override pipeline mode (default: auto-detect from complexity)",
     )
+    parser.add_argument(
+        "--critic",
+        choices=["true", "false"],
+        default=None,
+        help="Override critic enablement (default: read from feature field)",
+    )
     return parser.parse_args()
 
 
@@ -279,10 +285,11 @@ def process_conditional_blocks(content, resume_phase):
     return content
 
 
-def process_mode_blocks(content, pipeline_mode, init_done):
-    """Process pipeline mode and
+def process_mode_blocks(content, pipeline_mode, init_done, critic_enabled=False):
+    """Process pipeline mode, init, and critic conditional blocks.
 
     Keeps the block matching the current mode, removes the others.
+    Handles {{IF_CRITIC_ENABLED}} / {{END_IF_CRITIC_ENABLED}} blocks.
     """
     # Handle lite/standard/full blocks
     modes = ["lite", "standard", "full"]
@@ -318,6 +325,20 @@ def process_mode_blocks(content, pipeline_mode, init_done):
         "", content, flags=re.DOTALL,
     )
 
+    # Critic blocks
+    critic_open = "{{IF_CRITIC_ENABLED}}"
+    critic_close = "{{END_IF_CRITIC_ENABLED}}"
+    if critic_enabled:
+        # Keep content, remove tags
+        content = content.replace(critic_open + "\n", "")
+        content = content.replace(critic_open, "")
+        content = content.replace(critic_close + "\n", "")
+        content = content.replace(critic_close, "")
+    else:
+        # Remove entire CRITIC blocks
+        pattern = re.escape(critic_open) + r".*?" + re.escape(critic_close) + r"\n?"
+        content = re.sub(pattern, "", content, flags=re.DOTALL)
+
     return content
 
 
@@ -410,6 +431,9 @@ def build_replacements(args, feature, features, global_context, script_dir):
     reviewer_subagent = os.path.join(
         agents_dir, "prizm-dev-team-reviewer.md",
     )
+    critic_subagent = os.path.join(
+        agents_dir, "prizm-dev-team-critic.md",
+    )
 
     # Verify agent files actually exist — missing files cause confusing
     # errors when the AI session tries to read them later.
@@ -458,6 +482,41 @@ def build_replacements(args, feature, features, global_context, script_dir):
     if effective_resume == "null" and artifacts["all_complete"]:
         effective_resume = "6"
 
+    # Determine critic enablement (priority: CLI > env > feature field > default)
+    critic_env = os.environ.get("ENABLE_CRITIC", "").lower()
+    if args.critic is not None:
+        critic_enabled = args.critic == "true"
+    elif critic_env in ("true", "1"):
+        critic_enabled = True
+    elif critic_env in ("false", "0"):
+        critic_enabled = False
+    else:
+        critic_enabled = bool(feature.get("critic", False))
+
+    # Determine critic count (from feature field, default 1)
+    # Multi-critic voting (3) must be explicitly set by the user in feature-list.json
+    critic_count = feature.get("critic_count", 1)
+
+    # Guard: if critic enabled but agent file missing, force disable and warn
+    if critic_enabled and not os.path.isfile(critic_subagent):
+        LOGGER.warning(
+            "Critic enabled but agent file not found: %s. "
+            "Critic phases will be SKIPPED. "
+            "Run `npx prizmkit install` to install agent definitions.",
+            critic_subagent,
+        )
+        critic_enabled = False
+
+    # Guard: if critic enabled but tier doesn't support it (lite), warn and disable
+    if critic_enabled and pipeline_mode == "lite":
+        LOGGER.warning(
+            "Critic enabled for feature %s but pipeline_mode='lite' (tier1) "
+            "does not support critic phases. Critic will be SKIPPED. "
+            "Use estimated_complexity='high' or pass --mode standard/full.",
+            args.feature_id,
+        )
+        critic_enabled = False
+
     replacements = {
         "{{RUN_ID}}": args.run_id,
         "{{SESSION_ID}}": args.session_id,
@@ -479,6 +538,7 @@ def build_replacements(args, feature, features, global_context, script_dir):
         "{{TEAM_CONFIG_PATH}}": team_config_path,
         "{{DEV_SUBAGENT_PATH}}": dev_subagent,
         "{{REVIEWER_SUBAGENT_PATH}}": reviewer_subagent,
+        "{{CRITIC_SUBAGENT_PATH}}": critic_subagent,
         "{{VALIDATOR_SCRIPTS_DIR}}": validator_scripts_dir,
         "{{INIT_SCRIPT_PATH}}": init_script_path,
         "{{SESSION_STATUS_PATH}}": session_status_abs,
@@ -486,6 +546,8 @@ def build_replacements(args, feature, features, global_context, script_dir):
         "{{FEATURE_SLUG}}": feature_slug,
         "{{PIPELINE_MODE}}": pipeline_mode,
         "{{COMPLEXITY}}": complexity,
+        "{{CRITIC_ENABLED}}": "true" if critic_enabled else "false",
+        "{{CRITIC_COUNT}}": str(critic_count),
         "{{INIT_DONE}}": "true" if init_done else "false",
         "{{HAS_SPEC}}": "true" if artifacts["has_spec"] else "false",
         "{{HAS_PLAN}}": "true" if artifacts["has_plan"] else "false",
@@ -500,10 +562,11 @@ def render_template(template_content, replacements, resume_phase):
     # Step 1: Process fresh_start/resume conditional blocks
     content = process_conditional_blocks(template_content, resume_phase)
 
-    # Step 2: Process mode and
+    # Step 2: Process mode, init, and critic conditional blocks
     pipeline_mode = replacements.get("{{PIPELINE_MODE}}", "standard")
     init_done = replacements.get("{{INIT_DONE}}", "false") == "true"
-    content = process_mode_blocks(content, pipeline_mode, init_done)
+    critic_enabled = replacements.get("{{CRITIC_ENABLED}}", "false") == "true"
+    content = process_mode_blocks(content, pipeline_mode, init_done, critic_enabled)
 
     # Step 3: Replace all {{PLACEHOLDER}} variables
     for placeholder, value in replacements.items():
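The `{{IF_CRITIC_ENABLED}}` handling added to `process_mode_blocks` can be exercised in isolation: keep the block body (tags stripped) when the critic is enabled, drop the whole block otherwise. A minimal sketch mirroring that logic — the function name and template text here are illustrative, not part of the package:

```python
import re

def strip_critic_blocks(content: str, critic_enabled: bool) -> str:
    """Keep or drop {{IF_CRITIC_ENABLED}}...{{END_IF_CRITIC_ENABLED}} blocks."""
    open_tag, close_tag = "{{IF_CRITIC_ENABLED}}", "{{END_IF_CRITIC_ENABLED}}"
    if critic_enabled:
        # Keep the body; remove only the marker tags (and their trailing newlines)
        for tag in (open_tag, close_tag):
            content = content.replace(tag + "\n", "").replace(tag, "")
        return content
    # Remove the entire block, non-greedily, across newlines
    pattern = re.escape(open_tag) + r".*?" + re.escape(close_tag) + r"\n?"
    return re.sub(pattern, "", content, flags=re.DOTALL)

template = "A\n{{IF_CRITIC_ENABLED}}\nPhase 3.5\n{{END_IF_CRITIC_ENABLED}}\nB\n"
enabled = strip_critic_blocks(template, True)    # → "A\nPhase 3.5\nB\n"
disabled = strip_critic_blocks(template, False)  # → "A\nB\n"
```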
package/bundled/dev-pipeline/templates/bootstrap-tier2.md
CHANGED
@@ -160,6 +160,33 @@ Wait for Reviewer to return.
 
 **CP-2**: No CRITICAL issues.
 
+{{IF_CRITIC_ENABLED}}
+### Phase 3.5: Plan Challenge — Critic Agent
+
+**Guard**: Verify critic agent file exists before spawning:
+```bash
+ls {{CRITIC_SUBAGENT_PATH}} 2>/dev/null && echo "CRITIC:READY" || echo "CRITIC:MISSING"
+```
+If CRITIC:MISSING — skip Phase 3.5 entirely and proceed to Phase 4. Log: "Critic agent not installed — skipping Plan Challenge."
+
+Spawn Critic agent (Agent tool, subagent_type="prizm-dev-team-critic", run_in_background=false).
+
+Prompt:
+> "Read {{CRITIC_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
+> **MODE: Plan Challenge**
+> 1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` FIRST — Section 3 has project context, Section 4 has file manifest.
+> 2. Read `.prizm-docs/root.prizm` and relevant L1/L2 docs for affected modules.
+> 3. Read existing source files in the modules this plan touches.
+> 4. Challenge plan.md against the project's existing architecture, patterns, and style.
+> Write `.prizmkit/specs/{{FEATURE_SLUG}}/challenge-report.md` with findings (or 'No significant challenges')."
+
+Wait for Critic to return.
+- Read challenge-report.md. For items marked CRITICAL/HIGH: decide whether to adjust plan.md or document why the plan stands.
+- Max 1 plan revision round.
+
+**CP-2.5**: Plan challenges reviewed and resolved.
+{{END_IF_CRITIC_ENABLED}}
+
 ### Phase 4: Implement — Dev Subagent
 
 **Build artifacts rule** (passed to Dev): After any build/compile command (`go build`, `npm run build`, `tsc`, etc.), ensure the output binary or build directory is in `.gitignore`. Never commit compiled binaries, build output, or generated artifacts.
@@ -192,6 +219,33 @@ grep -q "## Implementation Log" .prizmkit/specs/{{FEATURE_SLUG}}/context-snapsho
 ```
 If GATE:MISSING — send message to Dev (re-spawn if needed): "Write the '## Implementation Log' section to context-snapshot.md before I can proceed to review. Include: files changed/created, key decisions, deviations from plan, notable discoveries."
 
+{{IF_CRITIC_ENABLED}}
+### Phase 4.5: Code Challenge — Critic Agent
+
+**Guard**: Verify critic agent file exists before spawning:
+```bash
+ls {{CRITIC_SUBAGENT_PATH}} 2>/dev/null && echo "CRITIC:READY" || echo "CRITIC:MISSING"
+```
+If CRITIC:MISSING — skip Phase 4.5 entirely and proceed to Phase 5. Log: "Critic agent not installed — skipping Code Challenge."
+
+Spawn Critic agent (Agent tool, subagent_type="prizm-dev-team-critic", run_in_background=false).
+
+Prompt:
+> "Read {{CRITIC_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
+> **MODE: Code Challenge**
+> 1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` — Implementation Log section shows what Dev changed.
+> 2. Read `.prizm-docs/root.prizm` and relevant module docs for RULES/PATTERNS.
+> 3. Read the actual source files changed (from Implementation Log).
+> 4. Read comparable existing source files in the same module for style comparison.
+> 5. Challenge code integration quality: style fit, robustness, existing code cohesion, hidden impact.
+> Write `.prizmkit/specs/{{FEATURE_SLUG}}/challenge-report.md` (overwrite) with findings (or 'No significant challenges')."
+
+Wait for Critic to return.
+- Read challenge-report.md. For items marked CRITICAL/HIGH: spawn Dev to fix, then proceed to Review.
+
+**CP-3.5**: Code challenges reviewed and resolved.
+{{END_IF_CRITIC_ENABLED}}
+
 ### Phase 5: Review + Test — Reviewer Subagent
 
 Spawn Reviewer subagent (Agent tool, subagent_type="prizm-dev-team-reviewer", run_in_background=false).
@@ -255,6 +309,9 @@ Working tree MUST be clean after this step. If any feature-related files remain,
 | Context Snapshot | `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` |
 | Dev Agent Def | {{DEV_SUBAGENT_PATH}} |
 | Reviewer Agent Def | {{REVIEWER_SUBAGENT_PATH}} |
+{{IF_CRITIC_ENABLED}}
+| Critic Agent Def | {{CRITIC_SUBAGENT_PATH}} |
+{{END_IF_CRITIC_ENABLED}}
 | Project Root | {{PROJECT_ROOT}} |
 
 ## Failure Capture Protocol