prizmkit 1.1.68 → 1.1.70

Files changed (55)
  1. package/bundled/VERSION.json +3 -3
  2. package/bundled/dev-pipeline/lib/heartbeat.sh +5 -5
  3. package/bundled/dev-pipeline/scripts/generate-bootstrap-prompt.py +11 -12
  4. package/bundled/dev-pipeline/scripts/parse-stream-progress.py +217 -18
  5. package/bundled/dev-pipeline/templates/agent-prompts/dev-implement.md +36 -22
  6. package/bundled/dev-pipeline/templates/agent-prompts/reviewer-review.md +1 -1
  7. package/bundled/dev-pipeline/templates/bootstrap-tier2.md +19 -1
  8. package/bundled/dev-pipeline/templates/bootstrap-tier3.md +19 -1
  9. package/bundled/dev-pipeline/templates/bugfix-bootstrap-prompt.md +24 -21
  10. package/bundled/dev-pipeline/templates/refactor-bootstrap-prompt.md +32 -24
  11. package/bundled/dev-pipeline/templates/sections/ac-verification-checklist.md +4 -10
  12. package/bundled/dev-pipeline/templates/sections/context-budget-rules.md +1 -0
  13. package/bundled/dev-pipeline/templates/sections/feature-context.md +16 -11
  14. package/bundled/dev-pipeline/templates/sections/phase-browser-verification-auto.md +17 -26
  15. package/bundled/dev-pipeline/templates/sections/phase-browser-verification-opencli.md +1 -1
  16. package/bundled/dev-pipeline/templates/sections/phase-browser-verification.md +1 -1
  17. package/bundled/dev-pipeline/templates/sections/phase-context-snapshot-base.md +1 -1
  18. package/bundled/dev-pipeline/templates/sections/phase-critic-plan-full.md +10 -0
  19. package/bundled/dev-pipeline/templates/sections/phase-critic-plan.md +10 -0
  20. package/bundled/dev-pipeline/templates/sections/phase-implement-agent.md +14 -9
  21. package/bundled/dev-pipeline/templates/sections/phase-implement-full.md +14 -9
  22. package/bundled/dev-pipeline/templates/sections/phase-implement-lite.md +8 -17
  23. package/bundled/dev-pipeline/templates/sections/phase-plan-lite.md +1 -1
  24. package/bundled/dev-pipeline/templates/sections/phase-review-agent.md +5 -1
  25. package/bundled/dev-pipeline/templates/sections/phase-review-full.md +6 -2
  26. package/bundled/dev-pipeline/templates/sections/phase-specify-plan-full.md +1 -1
  27. package/bundled/dev-pipeline/templates/sections/task-contract.md +34 -0
  28. package/bundled/dev-pipeline/templates/sections/test-failure-recovery-agent.md +27 -46
  29. package/bundled/dev-pipeline/templates/sections/test-failure-recovery-lite.md +27 -37
  30. package/bundled/dev-pipeline/tests/test_generate_bootstrap_prompt.py +13 -0
  31. package/bundled/dev-pipeline-windows/scripts/generate-bootstrap-prompt.py +11 -12
  32. package/bundled/dev-pipeline-windows/scripts/parse-stream-progress.py +217 -18
  33. package/bundled/dev-pipeline-windows/templates/agent-prompts/dev-implement.md +36 -22
  34. package/bundled/dev-pipeline-windows/templates/agent-prompts/reviewer-review.md +1 -1
  35. package/bundled/dev-pipeline-windows/templates/bugfix-bootstrap-prompt.md +24 -21
  36. package/bundled/dev-pipeline-windows/templates/refactor-bootstrap-prompt.md +32 -24
  37. package/bundled/dev-pipeline-windows/templates/sections/ac-verification-checklist.md +4 -10
  38. package/bundled/dev-pipeline-windows/templates/sections/context-budget-rules.md +1 -0
  39. package/bundled/dev-pipeline-windows/templates/sections/feature-context.md +16 -11
  40. package/bundled/dev-pipeline-windows/templates/sections/phase-browser-verification-auto.md +22 -10
  41. package/bundled/dev-pipeline-windows/templates/sections/phase-context-snapshot-base.md +1 -1
  42. package/bundled/dev-pipeline-windows/templates/sections/phase-critic-plan-full.md +10 -0
  43. package/bundled/dev-pipeline-windows/templates/sections/phase-critic-plan.md +10 -0
  44. package/bundled/dev-pipeline-windows/templates/sections/phase-implement-agent.md +14 -9
  45. package/bundled/dev-pipeline-windows/templates/sections/phase-implement-full.md +14 -9
  46. package/bundled/dev-pipeline-windows/templates/sections/phase-implement-lite.md +8 -19
  47. package/bundled/dev-pipeline-windows/templates/sections/phase-plan-lite.md +1 -1
  48. package/bundled/dev-pipeline-windows/templates/sections/phase-review-agent.md +5 -1
  49. package/bundled/dev-pipeline-windows/templates/sections/phase-review-full.md +6 -2
  50. package/bundled/dev-pipeline-windows/templates/sections/phase-specify-plan-full.md +1 -1
  51. package/bundled/dev-pipeline-windows/templates/sections/task-contract.md +34 -0
  52. package/bundled/dev-pipeline-windows/templates/sections/test-failure-recovery-agent.md +27 -46
  53. package/bundled/dev-pipeline-windows/templates/sections/test-failure-recovery-lite.md +27 -37
  54. package/bundled/skills/_metadata.json +1 -1
  55. package/package.json +1 -1
@@ -1,5 +1,5 @@
 {
-  "frameworkVersion": "1.1.68",
-  "bundledAt": "2026-06-09T14:36:58.835Z",
-  "bundledFrom": "82060fd"
+  "frameworkVersion": "1.1.70",
+  "bundledAt": "2026-06-10T03:59:21.944Z",
+  "bundledFrom": "e948b58"
 }
@@ -90,8 +90,8 @@ PY
         fi
         prev_child_activity_signature="$child_activity_signature"
 
-        # Track progress staleness. A Codex parent can sit in `wait`
-        # while child transcripts keep growing, so child activity counts.
+        # Track progress staleness. Parent sessions can sit in a wait/polling
+        # tool while child transcripts keep growing, so child activity counts.
         if [[ $growth -eq 0 && $child_growth -eq 0 ]]; then
             stale_seconds=$((stale_seconds + heartbeat_interval))
         else
@@ -174,9 +174,9 @@ PY
         fi
 
         # Stale-kill: auto-terminate process if no progress for too long.
-        # Codex parent sessions can sit on the `wait` tool while a spawned
-        # subagent is still doing useful work. Give that valid wait a longer
-        # stale window; normal single-agent stalls still use the base limit.
+        # Parent sessions can wait on spawned work; child transcript growth
+        # counts as progress above, while silent waits still use the active
+        # stale window to surface stuck agents promptly.
         if [[ $effective_stale_kill_threshold -gt 0 && $stale_seconds -ge $effective_stale_kill_threshold ]]; then
             local stale_mins=$((stale_seconds / 60))
             echo -e " ${RED}[HEARTBEAT]${NC} ${mins}m${secs}s | log: ${size_display} | ${RED}STALE-KILL: no progress for ${stale_mins}m (threshold: ${effective_stale_kill_threshold}s)${NC}"
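The two heartbeat hunks above describe one rule: the stale counter grows only when neither the parent log nor any child transcript grew, and the kill fires once the counter reaches a non-zero threshold. A minimal Python sketch of that rule follows. The function names are hypothetical, and resetting the counter to zero on progress is an assumption, since the `else` branch is cut off by the hunk.

```python
def update_stale_seconds(stale_seconds, growth, child_growth, heartbeat_interval):
    """Return the stale counter after one heartbeat tick."""
    if growth == 0 and child_growth == 0:
        return stale_seconds + heartbeat_interval
    # Assumption: any growth resets the counter (the else branch is not shown).
    return 0


def should_stale_kill(stale_seconds, threshold):
    """Mirror the stale-kill guard: a threshold of 0 disables the kill."""
    return threshold > 0 and stale_seconds >= threshold


stale = 0
# (parent log growth, child transcript growth) per heartbeat tick
for growth, child_growth in [(0, 0), (0, 512), (0, 0), (0, 0)]:
    stale = update_stale_seconds(stale, growth, child_growth, heartbeat_interval=30)

print(stale, should_stale_kill(stale, threshold=60))  # 60 True
```

The second tick shows the point of the change: the parent is silent, but child transcript growth still counts as progress.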
@@ -248,7 +248,7 @@ def extract_baseline_failures(test_cmd, project_root):
 def format_ac_checklist(acceptance_criteria):
     """Format acceptance criteria as a markdown checkbox list."""
     if not acceptance_criteria:
-        return "- [ ] (no acceptance criteria specified)"
+        return "- (no Verification Gates specified)"
     lines = []
     for item in acceptance_criteria:
         lines.append("- [ ] {}".format(item))
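For reference, the behaviour after this hunk can be sketched as a standalone function. The final `return "\n".join(lines)` is an assumption, because the hunk ends before the function's return statement.

```python
def format_ac_checklist(acceptance_criteria):
    """Format acceptance criteria as a markdown checkbox list."""
    if not acceptance_criteria:
        # 1.1.70 emits a plain bullet here instead of an unchecked box.
        return "- (no Verification Gates specified)"
    lines = []
    for item in acceptance_criteria:
        lines.append("- [ ] {}".format(item))
    # Assumption: lines are joined with newlines (not visible in the hunk).
    return "\n".join(lines)


print(format_ac_checklist([]))
print(format_ac_checklist(["login works", "logout clears session"]))
```

One plausible reading of the change is that an empty list no longer renders as an unchecked item that a later completeness check could count as pending.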
@@ -285,10 +285,10 @@ def format_user_context(user_context):
     if not items:
         return ""
     lines = [
-        "### User-Provided Context (HIGHEST PRIORITY)",
-        "",
-        "> The following materials were provided by the user. "
-        "They take precedence over AI inference.",
+        "> These materials were provided by the user and are authoritative "
+        "when they clarify or constrain this feature. They do not expand "
+        "the current scope by themselves; use the Task Contract to decide "
+        "what belongs to this session.",
         "",
     ]
     for item in items:
@@ -932,6 +932,10 @@ def assemble_sections(pipeline_mode, sections_dir, init_done, is_resume,
     mission += "\n\n" + tier_desc
     sections.append(("mission", mission))
 
+    # --- Task Contract: single source of current scope and gates ---
+    sections.append(("task-contract",
+                     load_section(sections_dir, "task-contract.md")))
+
     # --- Feature Context (XML-wrapped, optimization 3) ---
     sections.append(("feature-context",
                      load_section(sections_dir, "feature-context.md")))
@@ -1041,13 +1045,8 @@ def assemble_sections(pipeline_mode, sections_dir, init_done, is_resume,
                          load_section(sections_dir,
                                       "test-failure-recovery-agent.md")))
 
-    # --- AC Verification Checklist (all tiers) ---
-    ac_checklist_path = os.path.join(sections_dir, "ac-verification-checklist.md")
-    if os.path.isfile(ac_checklist_path):
-        sections.append(("ac-verification-checklist",
-                         load_section(sections_dir,
-                                      "ac-verification-checklist.md")))
-
+    # Verification Gates are included in Task Contract. Keep AC in one place so
+    # background context and implementation prompts cannot redefine scope.
     # --- Review (only for agent tiers) ---
     if pipeline_mode == "full":
         sections.append(("phase-review",
@@ -63,7 +63,8 @@ PHASE_KEYWORDS = {
 class ProgressTracker:
     """Tracks progress state from stream-json events."""
 
-    def __init__(self):
+    def __init__(self, session_log=None):
+        self.session_log_path = Path(session_log).expanduser() if session_log else None
         self.message_count = 0
         self.current_tool = None
         self.current_tool_input_summary = ""
@@ -78,12 +79,19 @@ class ProgressTracker:
         self.active_subagent_count = 0
         self.subagent_status_counts = Counter()
         self.codex_child_thread_ids = set()
+        self.claude_session_id = ""
+        self.claude_cwd = ""
+        self.claude_task_states = {}
         self.child_session_files = []
         self.child_total_bytes = 0
         self.child_activity_signature = ""
         self.last_child_activity_at = ""
         self._codex_child_session_paths = {}
+        self._claude_child_session_files = []
         self._last_child_scan_at = 0.0
+        self._last_claude_fallback_scan_at = 0.0
+        self._last_claude_fallback_scan_key = ""
+        self._claude_fallback_scan_interval_seconds = 10.0
         self._text_buffer = ""
         self._in_tool_use = False
         self._current_tool_input_parts = []
@@ -195,11 +203,76 @@ class ProgressTracker:
             self.is_active = True
 
         elif event_type == "system":
-            # System events (hooks, init, etc.) — track but don't count as messages
+            # System events (hooks, init, task notifications, etc.) — track but don't count as messages.
             self.event_format = self.event_format or "stream-json"
             subtype = event.get("subtype", "")
             if subtype == "init":
                 self.is_active = True
+                session_id = event.get("session_id")
+                if isinstance(session_id, str) and session_id.strip():
+                    self.claude_session_id = session_id.strip()
+                cwd = event.get("cwd")
+                if isinstance(cwd, str) and cwd.strip():
+                    self.claude_cwd = cwd.strip()
+            elif subtype == "task_started":
+                task_id = event.get("task_id")
+                if isinstance(task_id, str) and task_id.strip():
+                    self.claude_task_states[task_id.strip()] = {
+                        "status": "running",
+                        "summary": str(event.get("description") or "")[:120],
+                        "tool_use_id": str(event.get("tool_use_id") or ""),
+                        "task_type": str(event.get("task_type") or ""),
+                        "subagent_type": str(event.get("subagent_type") or ""),
+                    }
+                    self._update_claude_subagent_status_counts()
+            elif subtype in ("task_updated", "task_progress"):
+                task_id = event.get("task_id")
+                if isinstance(task_id, str) and task_id.strip():
+                    state = self.claude_task_states.setdefault(task_id.strip(), {})
+                    patch = event.get("patch") if isinstance(event.get("patch"), dict) else {}
+                    status = patch.get("status") or event.get("status")
+                    if status:
+                        state["status"] = str(status)
+                    summary = patch.get("summary") or patch.get("description") or event.get("summary") or event.get("description")
+                    if summary:
+                        state["summary"] = str(summary)[:120]
+                    else:
+                        state.setdefault("summary", "")
+                    tool_use_id = patch.get("tool_use_id") or event.get("tool_use_id")
+                    if tool_use_id:
+                        state["tool_use_id"] = str(tool_use_id)
+                    else:
+                        state.setdefault("tool_use_id", "")
+                    task_type = patch.get("task_type") or event.get("task_type")
+                    if task_type:
+                        state["task_type"] = str(task_type)
+                    else:
+                        state.setdefault("task_type", "")
+                    subagent_type = patch.get("subagent_type") or event.get("subagent_type")
+                    if subagent_type:
+                        state["subagent_type"] = str(subagent_type)
+                    else:
+                        state.setdefault("subagent_type", "")
+                    self._update_claude_subagent_status_counts()
+            elif subtype == "task_notification":
+                task_id = event.get("task_id")
+                if isinstance(task_id, str) and task_id.strip():
+                    state = self.claude_task_states.setdefault(task_id.strip(), {})
+                    status = event.get("status") or "completed"
+                    state["status"] = str(status)
+                    state["summary"] = str(event.get("summary") or state.get("summary") or "")[:120]
+                    state.setdefault("tool_use_id", str(event.get("tool_use_id") or ""))
+                    task_type = event.get("task_type")
+                    if task_type:
+                        state["task_type"] = str(task_type)
+                    else:
+                        state.setdefault("task_type", "")
+                    subagent_type = event.get("subagent_type")
+                    if subagent_type:
+                        state["subagent_type"] = str(subagent_type)
+                    else:
+                        state.setdefault("subagent_type", "")
+                    self._update_claude_subagent_status_counts()
 
         # ── Claude API raw stream format ────────────────────────────
         elif event_type == "message_start":
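The task-event branches added in this hunk fold `system` events into a per-task state dict. A reduced sketch of that folding, tracking only `status` and `summary`, might look like this. The function name `apply_task_event` and the sample events are hypothetical; the field names (`subtype`, `task_id`, `patch`, `description`) are the ones the hunk reads.

```python
def apply_task_event(task_states, event):
    """Fold one system event into a dict of per-task state (reduced sketch)."""
    task_id = event.get("task_id")
    if not isinstance(task_id, str) or not task_id.strip():
        return task_states  # events without a usable id are ignored
    task_id = task_id.strip()
    subtype = event.get("subtype", "")
    if subtype == "task_started":
        task_states[task_id] = {
            "status": "running",
            "summary": str(event.get("description") or "")[:120],
        }
    elif subtype in ("task_updated", "task_progress"):
        state = task_states.setdefault(task_id, {})
        patch = event.get("patch") if isinstance(event.get("patch"), dict) else {}
        # Values inside "patch" win over top-level event fields.
        status = patch.get("status") or event.get("status")
        if status:
            state["status"] = str(status)
    elif subtype == "task_notification":
        state = task_states.setdefault(task_id, {})
        # A notification without an explicit status is treated as completion.
        state["status"] = str(event.get("status") or "completed")
    return task_states


# Hypothetical event sequence for one task.
events = [
    {"type": "system", "subtype": "task_started", "task_id": "t1",
     "description": "dev: implement feature"},
    {"type": "system", "subtype": "task_updated", "task_id": "t1",
     "patch": {"status": "in_progress"}},
    {"type": "system", "subtype": "task_notification", "task_id": "t1"},
]

states = {}
history = []
for event in events:
    apply_task_event(states, event)
    history.append(states["t1"]["status"])

print(history)  # ['running', 'in_progress', 'completed']
```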
@@ -391,16 +464,135 @@ class ProgressTracker:
                 pass
         return str(matches[0])
 
+    def _is_tracked_claude_subagent_state(self, state):
+        """Return true for Claude Code task events representing in-process agents."""
+        if not isinstance(state, dict):
+            return False
+        task_type = str(state.get("task_type") or "")
+        task_type_lower = task_type.lower()
+        subagent_type = str(state.get("subagent_type") or "")
+        if task_type_lower == "local_bash":
+            return False
+        tracked_types = {"in_process_teammate", "subagent", "agent", "teammate"}
+        if task_type_lower in tracked_types:
+            return True
+        if task_type_lower == "local_agent" and subagent_type:
+            return True
+        summary = str(state.get("summary") or "")
+        return bool(
+            not task_type
+            and summary.lower().startswith(("dev:", "critic:", "reviewer:", "agent:"))
+        )
+
+    def _has_tracked_claude_subagent_task(self):
+        """Return true once a Claude Code local-agent/subagent task has been observed."""
+        return any(
+            self._is_tracked_claude_subagent_state(state)
+            for state in self.claude_task_states.values()
+        )
+
+    def _update_claude_subagent_status_counts(self):
+        """Track Claude Code in-process teammate task state counts."""
+        counts = Counter()
+        active = 0
+        inactive_statuses = {
+            "completed",
+            "failed",
+            "cancelled",
+            "canceled",
+            "killed",
+            "stopped",
+            "success",
+            "error",
+        }
+        for state in self.claude_task_states.values():
+            if not self._is_tracked_claude_subagent_state(state):
+                continue
+            status = str(state.get("status") or "unknown")
+            counts[status] += 1
+            if status.lower() not in inactive_statuses:
+                active += 1
+                summary = state.get("summary") or state.get("subagent_type")
+                if summary:
+                    self.last_text_snippet = str(summary).strip()[:120]
+                    self._detect_phase(str(summary))
+        self.subagent_status_counts = counts
+        self.active_subagent_count = active
+
+    def _claude_projects_dir(self):
+        """Return the Claude Code projects directory for transcript lookup."""
+        projects_dir = os.environ.get("CLAUDE_PROJECTS_DIR")
+        if projects_dir:
+            return Path(projects_dir).expanduser()
+        claude_config_dir = os.environ.get("CLAUDE_CONFIG_DIR")
+        if claude_config_dir:
+            return Path(claude_config_dir).expanduser() / "projects"
+        claude_home = os.environ.get("CLAUDE_HOME")
+        if claude_home:
+            return Path(claude_home).expanduser() / "projects"
+        return Path.home() / ".claude" / "projects"
+
+    def _claude_project_key(self):
+        """Encode cwd the same way Claude Code stores project transcript dirs."""
+        cwd = self.claude_cwd
+        if not cwd:
+            return ""
+        return cwd.replace("\\", "-").replace("/", "-").replace(":", "")
+
+    def _find_claude_child_session_files(self):
+        """Find Claude Code subagent transcripts for this parent session."""
+        if not self.claude_session_id:
+            return []
+
+        projects_dir = self._claude_projects_dir()
+        if not projects_dir.exists():
+            return []
+
+        candidates = []
+        project_key = self._claude_project_key()
+        if project_key:
+            candidates.append(
+                projects_dir / project_key / self.claude_session_id / "subagents"
+            )
+
+        for candidate in candidates:
+            if candidate.exists():
+                try:
+                    return sorted(candidate.glob("*.jsonl"))
+                except OSError:
+                    return []
+
+        # Fallback for non-standard cwd encoding or custom Claude homes. Avoid
+        # repeatedly walking every stored transcript before any Agent task exists.
+        if not self._has_tracked_claude_subagent_task():
+            return []
+
+        fallback_scan_key = f"{projects_dir}:{self.claude_session_id}"
+        now = time.monotonic()
+        if (
+            self._last_claude_fallback_scan_key == fallback_scan_key
+            and now - self._last_claude_fallback_scan_at < self._claude_fallback_scan_interval_seconds
+        ):
+            return self._claude_child_session_files
+        self._last_claude_fallback_scan_key = fallback_scan_key
+        self._last_claude_fallback_scan_at = now
+        try:
+            matches = sorted(projects_dir.rglob(f"{self.claude_session_id}/subagents/*.jsonl"))
+        except OSError:
+            return []
+        return matches
+
     def refresh_child_session_activity(self, force=False):
-        """Refresh Codex child transcript file stats.
+        """Refresh child transcript file stats.
 
         The heartbeat monitor uses this activity signature to treat subagent
-        transcript growth as real progress while the parent Codex session is
-        blocked in `wait`.
+        transcript growth as real progress while the parent session is blocked
+        waiting for a child agent/tool result. Supports Codex child threads and
+        Claude Code in-process teammate transcripts.
         """
         previous_signature = self.child_activity_signature
 
-        if not self.codex_child_thread_ids:
+        if not self.codex_child_thread_ids and not self.claude_session_id:
             self.child_session_files = []
             self.child_total_bytes = 0
             self.child_activity_signature = ""
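The transcript lookup added above depends on how `_claude_project_key` turns the session's working directory into a directory name. A small sketch of that encoding and of the first candidate path it produces is shown here. The helper names and sample paths are hypothetical, and `PurePosixPath` is used only so the printed result is the same on every platform. Whether this matches Claude Code's real on-disk layout is the diff's claim, not something verified here; the diff itself adds a fallback scan for non-standard encodings.

```python
from pathlib import PurePosixPath


def claude_project_key(cwd):
    """Replace path separators with '-' and drop drive-letter colons."""
    if not cwd:
        return ""
    return cwd.replace("\\", "-").replace("/", "-").replace(":", "")


def subagent_dir(projects_dir, cwd, session_id):
    """Build <projects_dir>/<project_key>/<session_id>/subagents."""
    return PurePosixPath(projects_dir) / claude_project_key(cwd) / session_id / "subagents"


print(claude_project_key("/home/dev/my-app"))        # -home-dev-my-app
print(claude_project_key("C:\\Users\\dev\\my-app"))  # C-Users-dev-my-app
print(subagent_dir("/home/dev/.claude/projects", "/home/dev/my-app", "abc123"))
# /home/dev/.claude/projects/-home-dev-my-app/abc123/subagents
```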
@@ -420,6 +612,7 @@ class ProgressTracker:
                     found = self._find_codex_child_session_file(thread_id)
                     if found:
                         self._codex_child_session_paths[thread_id] = found
+            self._claude_child_session_files = self._find_claude_child_session_files()
             self._last_child_scan_at = now
 
         files = []
425
618
  files = []
@@ -427,24 +620,22 @@ class ProgressTracker:
         total_bytes = 0
         max_mtime = 0.0
 
-        for thread_id in sorted(self.codex_child_thread_ids):
-            path = self._codex_child_session_paths.get(thread_id)
-            if not path:
-                continue
+        def add_file(kind, identifier, path):
+            nonlocal total_bytes, max_mtime
             try:
                 stat = os.stat(path)
             except OSError:
-                continue
-
+                return
+            path_str = str(path)
             total_bytes += stat.st_size
             max_mtime = max(max_mtime, stat.st_mtime)
-            signature_parts.append(
-                f"{thread_id}:{stat.st_size}:{getattr(stat, 'st_mtime_ns', int(stat.st_mtime * 1_000_000_000))}"
-            )
+            mtime_ns = getattr(stat, "st_mtime_ns", int(stat.st_mtime * 1_000_000_000))
+            signature_parts.append(f"{kind}:{identifier}:{stat.st_size}:{mtime_ns}")
             files.append(
                 {
-                    "thread_id": thread_id,
-                    "path": path,
+                    "kind": kind,
+                    "thread_id": identifier,
+                    "path": path_str,
                     "size": stat.st_size,
                     "mtime": datetime.fromtimestamp(
                         stat.st_mtime, timezone.utc
@@ -452,6 +643,14 @@ class ProgressTracker:
                 }
             )
 
+        for thread_id in sorted(self.codex_child_thread_ids):
+            path = self._codex_child_session_paths.get(thread_id)
+            if path:
+                add_file("codex", thread_id, path)
+
+        for path in self._claude_child_session_files:
+            add_file("claude", path.stem, path)
+
         self.child_session_files = files
         self.child_total_bytes = total_bytes
         self.child_activity_signature = "|".join(signature_parts)
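The activity signature assembled above is a `|`-joined list of `kind:identifier:size:mtime_ns` parts, so any transcript growth changes the string. The standalone sketch below reproduces just that idea; the function name `activity_signature` and the temporary file names are hypothetical.

```python
import os
import tempfile


def activity_signature(entries):
    """Build a signature from (kind, identifier, path) tuples."""
    parts = []
    for kind, identifier, path in entries:
        try:
            stat = os.stat(path)
        except OSError:
            continue  # missing or unreadable files do not contribute
        mtime_ns = getattr(stat, "st_mtime_ns", int(stat.st_mtime * 1_000_000_000))
        parts.append(f"{kind}:{identifier}:{stat.st_size}:{mtime_ns}")
    return "|".join(parts)


with tempfile.TemporaryDirectory() as tmp:
    transcript = os.path.join(tmp, "agent-1.jsonl")
    with open(transcript, "w") as handle:
        handle.write('{"type": "assistant"}\n')
    before = activity_signature([("claude", "agent-1", transcript)])

    # Appending a line changes the file size, and therefore the signature.
    with open(transcript, "a") as handle:
        handle.write('{"type": "assistant"}\n')
    after = activity_signature([("claude", "agent-1", transcript)])

    missing = activity_signature([("codex", "gone", os.path.join(tmp, "missing.jsonl"))])

print(before != after, missing == "")  # True True
```

Including the `kind` prefix keeps a Codex thread id and a Claude transcript stem from colliding in the same signature.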
@@ -519,7 +718,7 @@ def atomic_write_json(data, filepath):
 
 def tail_and_parse(session_log, progress_file, poll_interval=0.5):
     """Tail session log and parse stream-json events."""
-    tracker = ProgressTracker()
+    tracker = ProgressTracker(session_log)
     last_write_state = None
 
     def state_key(state):
@@ -1,30 +1,44 @@
 "Read {{DEV_SUBAGENT_PATH}}. Implement feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}).
-**IMPORTANT**: Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` FIRST — Section 3 has Prizm Context (TRAPS/RULES), Section 4 has File Manifest with paths and interfaces.
-⚠️ DO NOT re-read source files already listed in Section 4 File Manifest unless you need implementation detail beyond the interface summary.
-1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` for full context.
-2. Run `/prizmkit-implement` to execute the tasks in plan.md. Run tests with: `{{TEST_CMD}}`. Known baseline failures (pre-existing, not your fault): `{{BASELINE_FAILURES}}`.
-3. If plan.md has more than 5 tasks: run `/compact` after completing every 3 tasks to manage context budget. If `/compact` is unavailable, continue without it.
-4. After implement completes, verify the '## Implementation Log' section was written to context-snapshot.md.
 
-## Acceptance Criteria Verification
+## Required Inputs
 
-Update the AC Verification Checklist in context-snapshot.md by marking each item [x] as you verify it:
-- As you complete each task, verify the corresponding acceptance criteria
-- Check the AC Checklist at the end of implementation
-- All [ ] must become [x] if any AC remains unverified, the feature is incomplete
-- Document any AC that cannot be verified due to test failures
+1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` first.
+2. Use Section 4 File Manifest for targeted reads.
+3. Do not expand scope beyond the Task Contract and Verification Gates.
+4. Do not re-read source files already listed in Section 4 unless implementation details are missing from the manifest.
 
-## Test Failure Recovery (Convergence-Based)
+## Work
 
-If tests fail, use convergence recovery — keep fixing while progress is being made:
+Run `/prizmkit-implement` to execute `plan.md`.
 
-1. **Run tests, record results**: count failures, exclude baseline failures
-2. **Check termination**: All pass → done | Plateau (same failures 3 rounds) → stop | Failures decreased → continue
-3. **Fix and iterate**: analyze, apply fix, re-run `($TEST_CMD)`, go back to step 1
+Test command:
+`{{TEST_CMD}}`
 
-**Key rule**: If failures decrease (even by 1), plateau counter resets.
-**Do NOT block completion** if unable to resolve — only NEW REGRESSIONS (not in baseline) require fixing.
-**If any AC cannot be verified** due to test failure: the feature is incomplete, add to failure notes.
+Known baseline failures:
+`{{BASELINE_FAILURES}}`
 
-4. Do NOT execute any git commands (no git add/commit/reset/push).
-Do NOT exit until all tasks are [x], the '## Implementation Log' section is written, and AC Verification Checklist is 100% complete in context-snapshot.md."
+If plan.md has more than 5 tasks, run `/compact` after completing every 3 tasks to manage context budget. If `/compact` is unavailable, continue without it.
+
+## Required Outputs
+
+Before returning, append `## Implementation Log` to `context-snapshot.md` with:
+- files changed/created
+- key decisions
+- deviations from plan
+- test results
+- Verification Gate status
+- unresolved blockers, if any
+
+## Protocol References
+
+- Follow the global Context Budget Rules.
+- Carry forward the Dev-isolated subset: skip scaffold/generated files listed in `context-snapshot.md`; verify dependency versions before install/build commands that resolve dependencies; after build/compile commands, ensure outputs are ignored and never commit generated artifacts.
+- If tests fail, follow this Test Failure Recovery subset: classify failures as baseline, new regression, brittle test, or environment/tooling; fix new regressions and brittle tests while progress is being made; document baseline failures; write `failure-log.md` for blockers.
+- Do not run git commands; staging and commit are handled by the orchestrator.
+
+Do not return success unless:
+1. implementation tasks are complete;
+2. `## Implementation Log` exists;
+3. every Verification Gate is verified.
+
+If any Verification Gate is blocked, write `failure-log.md` and return a blocked/incomplete result instead of success."
@@ -1,5 +1,5 @@
 "Read {{REVIEWER_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
-1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/spec.md` (if it exists) for goals and acceptance criteria; if spec.md does not exist, read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` Section 1 instead
+1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/spec.md` (if it exists) for goals and Verification Gates; if spec.md does not exist, read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` Section 1 Task Contract instead
 2. Read `.prizmkit/specs/{{FEATURE_SLUG}}/plan.md` for architecture decisions and completed tasks
 3. Run /prizmkit-code-review with artifact_dir=.prizmkit/specs/{{FEATURE_SLUG}}/. The skill will run its internal review-fix loop (Reviewer → filter → Dev fix, max 3 rounds) and write review-report.md.
 4. Run the full test suite using `{{TEST_CMD}}`. When running tests: `({{TEST_CMD}}) 2>&1 | tee /tmp/review-test-out.txt | tail -20`, then grep `/tmp/review-test-out.txt` for details — do NOT re-run the suite multiple times.
@@ -14,6 +14,12 @@ You are the **session orchestrator**. Implement Feature {{FEATURE_ID}}: "{{FEATU
 
 **Tier 2 — Dual Agent**: You handle context + planning directly. Then spawn Dev and Reviewer subagents. Spawn Dev and Reviewer agents via the Agent tool.
 
+**Agent spawn failure policy (all Agent tool calls)**:
+- If spawning Dev, Reviewer, or Critic fails with team/config/lock errors, retry at most once.
+- If the second attempt fails, do not keep spawning variants and do not enter artifact polling for Implementation Log, challenge report, or review report markers.
+- Use the documented inline/recovery fallback for that phase: write the required report yourself where possible, complete remaining Dev work directly in the orchestrator when safe, or write `failure-log.md` with the spawn error and last observable state before stopping for recovery.
+- Apply the same cap to any re-spawn for report repair or resume prompts; do not burn multiple minutes on identical team/config/lock failures.
+
 ### Feature Description
 
 {{FEATURE_DESCRIPTION}}
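The spawn failure policy added in this hunk is a bounded retry: one initial attempt, at most one retry, then a fallback instead of further spawning or polling. The templates express this as instructions to the orchestrating agent, not as code, so the following is only an illustrative sketch. Every name in it (`spawn_with_cap`, `always_locked`, `write_failure_log`) is hypothetical, and `RuntimeError` stands in for team/config/lock errors.

```python
def spawn_with_cap(spawn, fallback, max_attempts=2):
    """Try spawn() up to max_attempts times, then hand the last error to fallback()."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return spawn()
        except RuntimeError as error:  # stand-in for team/config/lock errors
            last_error = error
    # No further spawning and no artifact polling: go straight to the fallback.
    return fallback(last_error)


attempts = []


def always_locked():
    """Hypothetical spawn that fails every time."""
    attempts.append(1)
    raise RuntimeError("team lock held")


def write_failure_log(error):
    """Hypothetical fallback that records the spawn error."""
    return "failure-log.md: {}".format(error)


result = spawn_with_cap(always_locked, write_failure_log)
print(len(attempts), result)  # 2 failure-log.md: team lock held
```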
@@ -163,6 +169,8 @@ Before proceeding past CP-1, verify:
 
 Spawn Reviewer agent (Agent tool, subagent_type="prizm-dev-team-reviewer", mode="plan", run_in_background=false).
 
+Spawn failure cap: for team/config/lock errors, retry at most once for this Reviewer spawn. If the second attempt fails, do not poll for report artifacts; fix/check the plan inline or write `failure-log.md` before stopping for recovery.
+
 Prompt:
 > "Read {{REVIEWER_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
 > 1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` FIRST — Section 3 has project context, Section 4 has file manifest.
@@ -186,6 +194,8 @@ If CRITIC:MISSING — skip Phase 3.5 entirely and proceed to Phase 4. Log: "Crit
 
 Spawn Critic agent (Agent tool, subagent_type="prizm-dev-team-critic", mode="plan", run_in_background=false).
 
+Spawn failure cap: for team/config/lock errors, retry at most once for this Critic spawn. If the second attempt fails, do not poll for `challenge-report.md`; perform the plan challenge inline and record the fallback.
+
 Prompt:
 > "Read {{CRITIC_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
 > **MODE: Plan Challenge**
@@ -208,6 +218,8 @@ Wait for Critic to return.
 
 Spawn Dev subagent (Agent tool, subagent_type="prizm-dev-team-dev", run_in_background=false).
 
+Spawn failure cap: for team/config/lock errors, retry at most once for this Dev spawn. If the second attempt fails, do not poll for `## Implementation Log`; write `failure-log.md` and either implement remaining tasks directly in the orchestrator or stop for recovery.
+
 Prompt:
 > "Read {{DEV_SUBAGENT_PATH}}. Implement feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}).
 > **IMPORTANT**: Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` FIRST — Section 3 has Prizm Context (TRAPS/RULES), Section 4 has File Manifest with paths and interfaces.
@@ -232,6 +244,8 @@ If GATE:MISSING — send message to Dev (re-spawn if needed): "Write the '## Imp
 
 Spawn Reviewer subagent (Agent tool, subagent_type="prizm-dev-team-reviewer", run_in_background=false).
 
+Spawn failure cap: for team/config/lock errors, retry at most once for this Reviewer spawn. If the second attempt fails, do not poll for `review-report.md`; write `failure-log.md` with the spawn error and last observable state before stopping or performing an inline fallback.
+
 Prompt:
 > "Read {{REVIEWER_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
 > 1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/spec.md` for goals and acceptance criteria
@@ -248,7 +262,11 @@ After Reviewer agent returns, verify the review report was written:
  ```bash
  grep -q "## Verdict" .prizmkit/specs/{{FEATURE_SLUG}}/review-report.md && echo "GATE:PASS" || echo "GATE:MISSING"
  ```
- If GATE:MISSING — send message to Reviewer (re-spawn if needed): "Write review-report.md to .prizmkit/specs/{{FEATURE_SLUG}}/."
+ If GATE:MISSING:
+ - Do not re-spawn Reviewer or re-run `/prizmkit-code-review` in an unbounded report-repair loop.
+ - Perform one bounded status check; retry at most once: inspect Reviewer output, code-review skill output, `review-report.md` path, and any Reviewer/Dev spawn messages.
+ - If the missing report is caused by team/config/lock errors from Reviewer or the internal code-review loop, write `failure-log.md` with the spawn/skill error and last observable state.
+ - If the report is still missing after that single check/retry, either perform a safe inline fallback review and write `review-report.md` with `## Verdict`, or stop with a clear recovery failure.
 
  Read `review-report.md` and check the Verdict:
  - `PASS` → proceed to next phase
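
Reviewer's note: the new `GATE:MISSING` handling in this hunk amounts to one status check, at most one retry, and then a fallback or stop. The sketch below is one way to read that flow, not code from the package. The function names, the `retry_once` callback, and the `RECOVERY:NEEDED` label are hypothetical; only the `## Verdict` marker and the `review-report.md` / `failure-log.md` file names come from the template.

```python
import tempfile
from pathlib import Path
from typing import Callable

VERDICT_MARKER = "## Verdict"  # marker the template greps for


def gate_status(report: Path) -> str:
    """Mirror the grep gate: PASS only if the report exists and contains the marker."""
    try:
        text = report.read_text(encoding="utf-8")
    except OSError:
        return "GATE:MISSING"
    return "GATE:PASS" if VERDICT_MARKER in text else "GATE:MISSING"


def handle_missing_report(
    spec_dir: Path,
    retry_once: Callable[[], None],  # hypothetical: one Reviewer retry or inline review
    spawn_error: str = "",
) -> str:
    """Bounded repair: one status check, at most one retry, then give up."""
    report = spec_dir / "review-report.md"
    if gate_status(report) == "GATE:PASS":
        return "GATE:PASS"
    if spawn_error:
        # Record the error and the last observable state before doing anything else.
        (spec_dir / "failure-log.md").write_text(
            f"spawn/skill error: {spawn_error}\n"
            "last state: review-report.md missing or without a verdict\n",
            encoding="utf-8",
        )
    retry_once()  # the single permitted retry; there is no loop here
    if gate_status(report) == "GATE:PASS":
        return "GATE:PASS"
    return "RECOVERY:NEEDED"  # hypothetical label for "stop with a clear recovery failure"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = handle_missing_report(
            Path(tmp), retry_once=lambda: None, spawn_error="lock held"
        )
        print(result)
```

The absence of any loop is the point: the removed line allowed an open-ended "re-spawn if needed", while the added lines bound the repair to a single retry.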
@@ -14,6 +14,12 @@ You are the **session orchestrator**. Implement Feature {{FEATURE_ID}}: "{{FEATU
 
  **Tier 3 — Full Team**: For complex features, use the full pipeline (Phase 0–6) with Dev + Reviewer agents spawned via the Agent tool.
 
+ **Agent spawn failure policy (all Agent tool calls)**:
+ - If spawning Dev, Reviewer, or Critic fails with team/config/lock errors, retry at most once.
+ - If the second attempt fails, do not keep spawning variants and do not enter artifact polling for Implementation Log, challenge report, or review report markers.
+ - Use the documented inline/recovery fallback for that phase: write the required report yourself where possible, complete remaining Dev work directly in the orchestrator when safe, or write `failure-log.md` with the spawn error and last observable state before stopping for recovery.
+ - Apply the same cap to any re-spawn for report repair or resume prompts; do not burn multiple minutes on identical team/config/lock failures.
+
  ### Feature Description
 
  {{FEATURE_DESCRIPTION}}
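
Reviewer's note: the spawn failure policy added in this hunk is a retry cap of two attempts in total, followed by a fallback instead of polling. The sketch below illustrates that shape under stated assumptions. `SpawnError`, `spawn_with_cap`, and the callbacks are hypothetical names for illustration; the template itself is prose instructions to an orchestrating agent, not executable code.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

MAX_ATTEMPTS = 2  # first attempt plus at most one retry


class SpawnError(Exception):
    """Hypothetical stand-in for a team/config/lock spawn failure."""


def spawn_with_cap(
    spawn: Callable[[], T],
    fallback: Callable[[Optional[SpawnError]], T],
) -> T:
    """Try to spawn at most twice, then hand the last error to the fallback."""
    last_error: Optional[SpawnError] = None
    for _ in range(MAX_ATTEMPTS):
        try:
            return spawn()
        except SpawnError as exc:
            last_error = exc  # keep the error for the failure log
    # No artifact polling after the cap is hit: go straight to the
    # inline/recovery fallback for this phase.
    return fallback(last_error)


if __name__ == "__main__":
    attempts = []

    def always_locked() -> str:
        attempts.append(1)
        raise SpawnError("team lock held")

    outcome = spawn_with_cap(always_locked, lambda err: f"fallback after: {err}")
    print(len(attempts), outcome)
```

In this reading, the per-phase "Spawn failure cap" lines later in the diff differ only in which fallback they pass: inline review, inline plan challenge, direct implementation in the orchestrator, or writing `failure-log.md` and stopping.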
@@ -190,6 +196,8 @@ Before proceeding past CP-1, verify:
 
  Spawn Reviewer agent (Agent tool, subagent_type="prizm-dev-team-reviewer", mode="plan", run_in_background=false).
 
+ Spawn failure cap: for team/config/lock errors, retry at most once for this Reviewer spawn. If the second attempt fails, do not poll for report artifacts; fix/check the plan inline or write `failure-log.md` before stopping for recovery.
+
  Prompt:
  > "Read {{REVIEWER_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
  > 1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` FIRST — Section 3 has project context, Section 4 has file manifest.
@@ -217,6 +225,8 @@ If CRITIC:MISSING — skip Phase 3.5 entirely and proceed to Phase 4. Log: "Crit
 
  Spawn Critic agent (Agent tool, subagent_type="prizm-dev-team-critic", mode="plan", run_in_background=false).
 
+ Spawn failure cap: for team/config/lock errors, retry at most once for this Critic spawn. If the second attempt fails, do not poll for challenge reports; perform the plan challenge inline and record the fallback.
+
  Prompt:
  > "Read {{CRITIC_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
  > **MODE: Plan Challenge**
@@ -263,6 +273,8 @@ grep -c '^\- \[ \]' .prizmkit/specs/{{FEATURE_SLUG}}/plan.md 2>/dev/null || true
 
  Spawn Dev agent (Agent tool, subagent_type="prizm-dev-team-dev", run_in_background=false).
 
+ Spawn failure cap: for team/config/lock errors, retry at most once for this Dev spawn. If the second attempt fails, do not poll for `## Implementation Log`; write `failure-log.md` and either implement remaining tasks directly in the orchestrator or stop for recovery.
+
  Prompt:
  > "Read {{DEV_SUBAGENT_PATH}}. Implement feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}).
  > **IMPORTANT**: Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` FIRST — Section 3 has Prizm Context (TRAPS/RULES), Section 4 has File Manifest with paths and interfaces.
@@ -297,6 +309,8 @@ All tasks `[x]`, tests pass.
 
  Spawn Reviewer agent (Agent tool, subagent_type="prizm-dev-team-reviewer", run_in_background=false).
 
+ Spawn failure cap: for team/config/lock errors, retry at most once for this Reviewer spawn. If the second attempt fails, do not poll for `review-report.md`; write `failure-log.md` with the spawn error and last observable state before stopping or performing an inline fallback.
+
  Prompt:
  > "Read {{REVIEWER_SUBAGENT_PATH}}. For feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}):
  > 1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/spec.md` for goals and acceptance criteria
@@ -313,7 +327,11 @@ After Reviewer agent returns, verify the review report was written:
  ```bash
  grep -q "## Verdict" .prizmkit/specs/{{FEATURE_SLUG}}/review-report.md && echo "GATE:PASS" || echo "GATE:MISSING"
  ```
- If GATE:MISSING — send message to Reviewer (re-spawn if needed): "Write review-report.md to .prizmkit/specs/{{FEATURE_SLUG}}/."
+ If GATE:MISSING:
+ - Do not re-spawn Reviewer or re-run `/prizmkit-code-review` in an unbounded report-repair loop.
+ - Perform one bounded status check; retry at most once: inspect Reviewer output, code-review skill output, `review-report.md` path, and any Reviewer/Dev spawn messages.
+ - If the missing report is caused by team/config/lock errors from Reviewer or the internal code-review loop, write `failure-log.md` with the spawn/skill error and last observable state.
+ - If the report is still missing after that single check/retry, either perform a safe inline fallback review and write `review-report.md` with `## Verdict`, or stop with a clear recovery failure.
 
  Read `review-report.md` and check the Verdict:
  - `PASS` → proceed to next phase