prizmkit 1.1.70 → 1.1.74
- package/bundled/VERSION.json +3 -3
- package/bundled/agents/prizm-dev-team-dev.md +11 -1
- package/bundled/dev-pipeline/lib/common.sh +427 -0
- package/bundled/dev-pipeline/lib/heartbeat.sh +101 -36
- package/bundled/dev-pipeline/run-feature.sh +109 -29
- package/bundled/dev-pipeline/scripts/parse-stream-progress.py +198 -3
- package/bundled/dev-pipeline/scripts/update-feature-status.py +27 -3
- package/bundled/dev-pipeline/templates/agent-prompts/dev-implement.md +21 -0
- package/bundled/dev-pipeline/templates/bootstrap-tier2.md +1 -1
- package/bundled/dev-pipeline/templates/bootstrap-tier3.md +5 -9
- package/bundled/dev-pipeline/templates/sections/feature-context.md +3 -18
- package/bundled/dev-pipeline/templates/sections/phase-commit-full.md +11 -0
- package/bundled/dev-pipeline/templates/sections/phase-commit.md +11 -0
- package/bundled/dev-pipeline/templates/sections/phase-context-snapshot-agent-suffix.md +1 -1
- package/bundled/dev-pipeline/templates/sections/phase-context-snapshot-base.md +6 -12
- package/bundled/dev-pipeline/templates/sections/phase-context-snapshot-lite-suffix.md +10 -3
- package/bundled/dev-pipeline/templates/sections/phase-implement-agent.md +1 -0
- package/bundled/dev-pipeline/templates/sections/phase-specify-plan-full.md +4 -8
- package/bundled/dev-pipeline-windows/lib/common.ps1 +61 -1
- package/bundled/dev-pipeline-windows/lib/pipeline.ps1 +325 -16
- package/bundled/dev-pipeline-windows/scripts/parse-stream-progress.py +198 -3
- package/bundled/dev-pipeline-windows/scripts/update-feature-status.py +27 -3
- package/bundled/dev-pipeline-windows/templates/agent-prompts/dev-implement.md +21 -0
- package/bundled/dev-pipeline-windows/templates/agent-prompts/reviewer-review.md +1 -1
- package/bundled/dev-pipeline-windows/templates/bootstrap-prompt.md +27 -0
- package/bundled/dev-pipeline-windows/templates/bootstrap-tier1.md +543 -14
- package/bundled/dev-pipeline-windows/templates/bootstrap-tier2.md +664 -14
- package/bundled/dev-pipeline-windows/templates/bootstrap-tier3.md +741 -14
- package/bundled/dev-pipeline-windows/templates/bugfix-bootstrap-prompt.md +2 -2
- package/bundled/dev-pipeline-windows/templates/feature-list-schema.json +1 -1
- package/bundled/dev-pipeline-windows/templates/refactor-bootstrap-prompt.md +1 -1
- package/bundled/dev-pipeline-windows/templates/refactor-list-schema.json +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/context-budget-rules.md +3 -3
- package/bundled/dev-pipeline-windows/templates/sections/failure-capture.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/feature-context.md +3 -18
- package/bundled/dev-pipeline-windows/templates/sections/phase-browser-verification-auto.md +239 -40
- package/bundled/dev-pipeline-windows/templates/sections/phase-browser-verification-opencli.md +75 -26
- package/bundled/dev-pipeline-windows/templates/sections/phase-browser-verification.md +142 -36
- package/bundled/dev-pipeline-windows/templates/sections/phase-commit-full.md +13 -2
- package/bundled/dev-pipeline-windows/templates/sections/phase-commit.md +12 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-context-snapshot-agent-suffix.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-context-snapshot-base.md +7 -17
- package/bundled/dev-pipeline-windows/templates/sections/phase-context-snapshot-lite-suffix.md +10 -3
- package/bundled/dev-pipeline-windows/templates/sections/phase-critic-plan-full.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-critic-plan.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-implement-agent.md +3 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-implement-full.md +7 -3
- package/bundled/dev-pipeline-windows/templates/sections/phase-implement-lite.md +1 -3
- package/bundled/dev-pipeline-windows/templates/sections/phase-plan-agent.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-plan-lite.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-review-agent.md +1 -1
- package/bundled/dev-pipeline-windows/templates/sections/phase-review-full.md +2 -2
- package/bundled/dev-pipeline-windows/templates/sections/phase-specify-plan-full.md +13 -17
- package/bundled/dev-pipeline-windows/templates/sections/phase0-test-baseline.md +2 -4
- package/bundled/dev-pipeline-windows/templates/sections/subagent-timeout-recovery.md +1 -1
- package/bundled/skills/_metadata.json +1 -1
- package/package.json +1 -1
@@ -77,13 +77,16 @@ FEATURE_LIST=""
 # Branch tracking (for cleanup on interrupt)
 _ORIGINAL_BRANCH=""
 _DEV_BRANCH_NAME=""
+_SPAWN_FEATURE_SLUG=""
+_SPAWN_EXIT_CODE=0
 
 # ============================================================
 # Shared: Spawn an AI CLI session and wait for result
 # ============================================================
 
 # Spawns an AI CLI session with heartbeat + timeout, waits for completion,
-# checks session status
+# and checks session status. Canonical status updates happen after the caller
+# returns to the original branch.
 #
 # Arguments:
 # $1 - feature_id
@@ -105,6 +108,9 @@ spawn_and_wait_session() {
     local session_log="$session_dir/logs/session.log"
     local progress_json="$session_dir/logs/progress.json"
 
+    _SPAWN_FEATURE_SLUG=""
+    _SPAWN_EXIT_CODE=0
+
     local effective_model="${feature_model:-$MODEL}"
     local cbc_pid
     prizm_start_ai_session "$bootstrap_prompt" "$session_log" "$effective_model"
@@ -144,6 +150,7 @@ spawn_and_wait_session() {
     if [[ $exit_code -eq 143 ]]; then
         exit_code=124
     fi
+    _SPAWN_EXIT_CODE="$exit_code"
 
     # Check for stale-kill marker (heartbeat killed the process due to no progress)
     local stale_kill_marker="$session_dir/logs/stale-kill.json"
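The hunk above folds exit code 143 into 124 before storing it in `_SPAWN_EXIT_CODE`. 143 is the value a shell reports for a process terminated by SIGTERM (128 + 15), and 124 is the code GNU `timeout` returns on expiry, so both end up on the timed-out path. A minimal Python sketch of that mapping follows; the function name is hypothetical and not part of the package.

```python
import signal


def normalize_exit_code(exit_code: int) -> int:
    """Map a SIGTERM-style exit code (128 + 15 = 143) onto the timeout code 124."""
    sigterm_exit = 128 + int(signal.SIGTERM)
    # Any other exit code is passed through unchanged.
    return 124 if exit_code == sigterm_exit else exit_code
```

Normalizing once, right after the wait, means later branches only need to test for a single timeout value.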
@@ -174,7 +181,28 @@ spawn_and_wait_session() {
     project_root="$PROJECT_ROOT"
     local default_branch="$base_branch"
 
-
+    local semantic_finalized=false
+    local semantic_feature_slug=""
+    local semantic_commit_sha=""
+    local was_ai_runtime_error=false
+    if prizm_detect_ai_runtime_error "$session_log" "$progress_json"; then
+        was_ai_runtime_error=true
+    fi
+
+    if prizm_feature_semantically_complete "$feature_list" "$feature_id" "$project_root" "$default_branch" "$PRIZMKIT_DIR"; then
+        semantic_finalized=true
+        semantic_feature_slug="$PRIZM_SEMANTIC_FEATURE_SLUG"
+        semantic_commit_sha="$PRIZM_SEMANTIC_COMMIT_SHA"
+        if [[ $exit_code -ne 0 || "$was_stale_killed" == true || "$was_ai_runtime_error" == true ]]; then
+            log_warn "Session ended with a failure signal after semantic completion; treating as finalized success"
+            log_warn "Semantic completion commit: ${semantic_commit_sha:-unknown}"
+        fi
+        session_status="success"
+    elif [[ "$was_ai_runtime_error" == true ]]; then
+        log_warn "Session failed due to structured AI runtime/context error"
+        log_warn "AI runtime errors are retried without consuming code retry budget"
+        session_status="infra_error"
+    elif [[ $exit_code -eq 124 ]]; then
         log_warn "Session timed out after ${SESSION_TIMEOUT}s"
         session_status="timed_out"
     elif [[ "$was_infra_error" == true ]]; then
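The branch order in the hunk above matters: semantic completion wins over every failure signal, a structured AI runtime error is checked next, and only then is the timeout code considered. The sketch below restates that precedence in Python. The function name and boolean parameters are hypothetical stand-ins for the shell variables, and the branches that follow the timeout check in the real script are not visible in this hunk, so the sketch returns `None` there.

```python
from typing import Optional


def classify_session(exit_code: int,
                     semantically_complete: bool,
                     ai_runtime_error: bool) -> Optional[str]:
    """Return the session status chosen by the first matching rule."""
    if semantically_complete:
        # A finalized feature commit outranks any later failure signal.
        return "success"
    if ai_runtime_error:
        # Retried without consuming the code retry budget, per the log message.
        return "infra_error"
    if exit_code == 124:
        return "timed_out"
    # Remaining branches are outside the visible hunk.
    return None
```

For example, a session that timed out after its feature was already committed is still classified as a success.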
@@ -222,15 +250,31 @@ spawn_and_wait_session() {
     # ── Post-success validation ──────────────────────────────────────────
     if [[ "$session_status" == "success" ]]; then
         if git -C "$project_root" rev-parse --is-inside-work-tree >/dev/null 2>&1; then
-            # Auto-commit any remaining dirty files produced during the session
             local dirty_files=""
             dirty_files=$(git -C "$project_root" status --porcelain 2>/dev/null || true)
             if [[ -n "$dirty_files" ]]; then
-
-
-
-
-
+                if [[ "$semantic_finalized" == true ]]; then
+                    local post_completion_slug="$semantic_feature_slug"
+                    if [[ -z "$post_completion_slug" ]]; then
+                        post_completion_slug=$(prizm_feature_slug_from_list "$feature_list" "$feature_id" 2>/dev/null || true)
+                    fi
+                    if [[ -n "$post_completion_slug" ]] && prizm_preserve_post_completion_dirty "$project_root" "$PRIZMKIT_DIR/specs/${post_completion_slug}" "$feature_id" "$session_id"; then
+                        log_warn "Post-completion dirty changes preserved under $PRIZMKIT_DIR/specs/${post_completion_slug}/"
+                        log_warn "They were not included in the finalized feature commit."
+                    else
+                        log_warn "Could not safely preserve post-completion dirty changes; preserving dev branch for manual finalization"
+                        session_status="finalization_needed"
+                    fi
+                else
+                    # Auto-commit any remaining dirty files produced during a normal
+                    # clean success path. Semantic finalization explicitly avoids this
+                    # so delayed post-commit findings cannot be merged into main.
+                    log_info "Auto-committing remaining session artifacts..."
+                    git -C "$project_root" add -A 2>/dev/null || true
+                    git -C "$project_root" commit --no-verify --amend --no-edit 2>/dev/null \
+                        || git -C "$project_root" commit --no-verify -m "chore($feature_id): include remaining session artifacts" 2>/dev/null \
+                        || true
+                fi
             fi
         fi
     fi
@@ -242,7 +286,10 @@ spawn_and_wait_session() {
 
     # Write lightweight session summary for post-session inspection
     local feature_slug
-
+    if [[ -n "$semantic_feature_slug" ]]; then
+        feature_slug="$semantic_feature_slug"
+    else
+        feature_slug=$(python3 -c "
 import json, re, sys
 flist, fid = sys.argv[1], sys.argv[2]
 with open(flist) as f:
@@ -258,9 +305,11 @@ for feat in data.get('features', []):
         sys.exit(0)
 sys.exit(1)
 " "$feature_list" "$feature_id" 2>/dev/null) || {
-
-
-
+            log_warn "Could not resolve feature slug for $feature_id — session summary and artifact validation will be skipped"
+            feature_slug=""
+        }
+    fi
+    _SPAWN_FEATURE_SLUG="$feature_slug"
 
     # Validate key artifacts exist after successful session
     if [[ "$session_status" == "success" && -n "$feature_slug" ]]; then
@@ -315,16 +364,6 @@ sys.exit(0)
         fi
     fi
 
-    # Check if session produced a failure-log for future retries
-    if [[ "$session_status" != "success" && -n "$feature_slug" ]]; then
-        local failure_log="$PRIZMKIT_DIR/specs/${feature_slug}/failure-log.md"
-        if [[ -f "$failure_log" ]]; then
-            log_info "FAILURE_LOG: Session wrote failure-log.md — will be available to next retry"
-        else
-            log_info "FAILURE_LOG: No failure-log.md written by session"
-        fi
-    fi
-
     # Propagate completion notes for dependency context (only on success)
     if [[ "$session_status" == "success" && -n "$feature_slug" ]]; then
         local summary_path="$PRIZMKIT_DIR/specs/$feature_slug/completion-summary.json"
@@ -342,7 +381,45 @@ sys.exit(0)
         fi
     fi
 
-    #
+    # Return status via global variable (avoids $() swallowing stdout)
+    _SPAWN_RESULT="$session_status"
+}
+
+finalize_feature_status_after_branch_return() {
+    local feature_id="$1"
+    local feature_list="$2"
+    local session_id="$3"
+    local session_status="$4"
+    local max_retries="$5"
+    local session_dir="$6"
+    local base_branch="${7:-main}"
+
+    local feature_slug="${_SPAWN_FEATURE_SLUG:-}"
+    local progress_json="$session_dir/logs/progress.json"
+    local stale_kill_marker="$session_dir/logs/stale-kill.json"
+    local exit_code="${_SPAWN_EXIT_CODE:-0}"
+
+    # Check if session produced a failure-log for future retries; synthesize one
+    # after branch return so canonical diagnostics live on the original branch.
+    if [[ "$session_status" != "success" && -n "$feature_slug" ]]; then
+        local failure_log="$PRIZMKIT_DIR/specs/${feature_slug}/failure-log.md"
+        local checkpoint_file_for_failure="$PRIZMKIT_DIR/specs/${feature_slug}/workflow-checkpoint.json"
+        if [[ -f "$failure_log" ]]; then
+            log_info "FAILURE_LOG: Session wrote failure-log.md — will be available to next retry"
+        else
+            prizm_synthesize_failure_log \
+                "$failure_log" "$feature_id" "$session_id" "$session_status" "$exit_code" \
+                "$stale_kill_marker" "$progress_json" "$checkpoint_file_for_failure" "$PROJECT_ROOT" "$base_branch"
+            if [[ -f "$failure_log" ]]; then
+                log_info "FAILURE_LOG: Runtime synthesized failure-log.md — will be available to next retry"
+            else
+                log_info "FAILURE_LOG: No failure-log.md written by session"
+            fi
+        fi
+    fi
+
+    # Update feature status on the original branch. The caller commits the
+    # resulting feature-list diff immediately after this helper returns.
     local update_output
     update_output=$(python3 "$SCRIPTS_DIR/update-feature-status.py" \
         --feature-list "$feature_list" \
@@ -357,9 +434,6 @@ sys.exit(0)
     }
 
     _SPAWN_ITEM_STATUS="$(printf '%s' "$update_output" | prizm_extract_update_new_status)"
-
-    # Return status via global variable (avoids $() swallowing stdout)
-    _SPAWN_RESULT="$session_status"
 }
 
 # ============================================================
@@ -896,7 +970,7 @@ else:
         else
             log_warn "Auto-merge failed — dev branch preserved: $_DEV_BRANCH_NAME"
             log_warn "Merge manually: git checkout $_ORIGINAL_BRANCH && git rebase $_DEV_BRANCH_NAME"
-
+            session_status="merge_conflict"
         fi
     elif [[ -n "$_DEV_BRANCH_NAME" ]]; then
         # Session failed — preserve dev branch for inspection
@@ -907,6 +981,9 @@ else:
     # GUARANTEED: always return to original branch regardless of success/failure/merge outcome
     branch_ensure_return "$_proj_root" "$_ORIGINAL_BRANCH"
 
+    finalize_feature_status_after_branch_return \
+        "$feature_id" "$feature_list" "$session_id" "$session_status" 999 "$session_dir" "$_ORIGINAL_BRANCH"
+
     # Commit feature status update on the original branch (after guaranteed return)
     if ! git -C "$_proj_root" diff --quiet "$feature_list" 2>/dev/null; then
         git -C "$_proj_root" add "$feature_list"
@@ -1318,7 +1395,6 @@ DEPLOY_PROMPT_EOF
         "$feature_id" "$feature_list" "$session_id" \
         "$bootstrap_prompt" "$session_dir" "$MAX_RETRIES" "$feature_model" "$_ORIGINAL_BRANCH"
     local session_status="$_SPAWN_RESULT"
-    local item_status_after_session="${_SPAWN_ITEM_STATUS:-}"
 
     # Merge per-feature dev branch back to original on success
     if [[ "$session_status" == "success" && -n "$_DEV_BRANCH_NAME" ]]; then
@@ -1327,7 +1403,7 @@ DEPLOY_PROMPT_EOF
         else
             log_warn "Auto-merge failed — dev branch preserved: $_DEV_BRANCH_NAME"
             log_warn "Merge manually: git checkout $_ORIGINAL_BRANCH && git rebase $_DEV_BRANCH_NAME"
-
+            session_status="merge_conflict"
         fi
     elif [[ -n "$_DEV_BRANCH_NAME" ]]; then
         # Session failed — preserve dev branch for inspection
@@ -1338,6 +1414,10 @@ DEPLOY_PROMPT_EOF
     # GUARANTEED: always return to original branch regardless of success/failure/merge outcome
     branch_ensure_return "$_proj_root" "$_ORIGINAL_BRANCH"
 
+    finalize_feature_status_after_branch_return \
+        "$feature_id" "$feature_list" "$session_id" "$session_status" "$MAX_RETRIES" "$session_dir" "$_ORIGINAL_BRANCH"
+    local item_status_after_session="${_SPAWN_ITEM_STATUS:-}"
+
     # Commit feature status update on the original branch (after guaranteed return)
     if ! git -C "$_proj_root" diff --quiet "$feature_list" 2>/dev/null; then
         git -C "$_proj_root" add "$feature_list"
@@ -17,6 +17,7 @@ The script runs until:
 import argparse
 import json
 import os
+import re
 import signal
 import sys
 import tempfile
@@ -59,6 +60,58 @@ PHASE_KEYWORDS = {
     },
 }
 
+CONTEXT_ERROR_PATTERNS = [
+    re.compile(pattern, re.IGNORECASE)
+    for pattern in (
+        r"context_too_large",
+        r"model_context_window_exceeded",
+        r"Your input exceeds the context window",
+        r"input exceeds the context window",
+        r"context window of this model",
+        r"context window exceeded",
+        r"invalid_request_error.*context window",
+        r"context window.*invalid_request_error",
+    )
+]
+
+ERROR_CONTEXT_PATTERNS = [
+    re.compile(pattern, re.IGNORECASE)
+    for pattern in (
+        r"\bapi error\b",
+        r"invalid_request_error",
+        r"\bstatus\s*[:=]?\s*(400|413)\b",
+        r"\bapi_error_status\b",
+        r"\bapi_error_code\b",
+        r"\blast_result_is_error\b\s*[\"':=]*\s*true\b",
+        r"\bis_error\b\s*[\"':=]*\s*true\b",
+    )
+]
+
+
+def _has_error_context(text):
+    """Return true when free text looks like a runtime/provider error."""
+    if not text:
+        return False
+    return any(pattern.search(text) for pattern in ERROR_CONTEXT_PATTERNS)
+
+
+def detect_api_error_code(text, require_error_context=False):
+    """Return a normalized fatal/runtime error code from terminal text.
+
+    Structured terminal result/error events and raw stderr can be matched
+    directly. Ordinary assistant prose is noisier: it may mention the phrase
+    "input exceeds the context window" while explaining a test or recovery
+    rule, so callers can require additional error-like context there.
+    """
+    if not text:
+        return ""
+    if require_error_context and not _has_error_context(text):
+        return ""
+    for pattern in CONTEXT_ERROR_PATTERNS:
+        if pattern.search(text):
+            return "context_too_large"
+    return ""
+
 
 class ProgressTracker:
     """Tracks progress state from stream-json events."""
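The effect of `require_error_context` is easiest to see on two inputs that both contain a context-window phrase. The sketch below is a reduced, self-contained copy of the detector using only a subset of the patterns from the hunk above; the shortened names (`CONTEXT_PATTERNS`, `ERROR_MARKERS`, `detect`) and the two sample strings are illustrative assumptions, not package code.

```python
import re

# Subset of the context-window phrases listed in the hunk.
CONTEXT_PATTERNS = [re.compile(p, re.IGNORECASE) for p in (
    r"context_too_large",
    r"input exceeds the context window",
)]

# Subset of the markers that make free text look like a provider error.
ERROR_MARKERS = [re.compile(p, re.IGNORECASE) for p in (
    r"\bapi error\b",
    r"invalid_request_error",
    r"\bstatus\s*[:=]?\s*(400|413)\b",
)]


def detect(text, require_error_context=False):
    """Return "context_too_large" when the text matches, else ""."""
    if not text:
        return ""
    if require_error_context and not any(p.search(text) for p in ERROR_MARKERS):
        return ""
    if any(p.search(text) for p in CONTEXT_PATTERNS):
        return "context_too_large"
    return ""


# Assistant prose that merely discusses the condition.
prose = "The test covers the case where input exceeds the context window."
# A line shaped like a provider error message.
stderr_line = "API Error: 400 invalid_request_error: Your input exceeds the context window"
```

With the gate enabled, `detect(prose, require_error_context=True)` returns an empty string while `detect(stderr_line, require_error_context=True)` returns `"context_too_large"`; without the gate, both inputs match.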
@@ -73,6 +126,12 @@ class ProgressTracker:
         self.tool_call_counts = Counter()
         self.total_tool_calls = 0
         self.last_text_snippet = ""
+        self.last_result_is_error = False
+        self.api_error_status = None
+        self.api_error_code = ""
+        self.terminal_result_text = ""
+        self.terminal_success_at = ""
+        self.fatal_error_code = ""
         self.is_active = True
         self.errors = []
         self.event_format = ""
@@ -164,11 +223,13 @@ class ProgressTracker:
             elif event_type == "turn.failed":
                 error = event.get("error") or event.get("message") or "Codex turn failed"
                 self.errors.append(str(error))
+                self._detect_terminal_error(str(error))
                 self.current_tool = None
 
             elif event_type == "error":
                 error = event.get("error") or event.get("message") or "Unknown error"
                 self.errors.append(str(error))
+                self._detect_terminal_error(str(error))
 
             return
 
@@ -196,12 +257,51 @@ class ProgressTracker:
                     if text.strip():
                         self.last_text_snippet = text.strip()[:120]
                     self._detect_phase(text)
+                    self._detect_terminal_error(text, require_error_context=True)
 
         elif event_type == "tool_result" or event_type == "user":
             # tool_result contains output from tool execution
             self.event_format = self.event_format or "stream-json"
             self.is_active = True
 
+            # Check for error patterns in tool_result content (supports both formats):
+            # A) Top-level tool_result events: event["content"] is the result text
+            # B) Nested user events: event["message"]["content"][] has type=="tool_result" items
+            content_candidates = []
+
+            # Format A: top-level tool_result
+            if event_type == "tool_result":
+                content_candidates.append(str(event.get("content", "")))
+
+            # Format B: nested inside user event
+            if event_type == "user":
+                message = event.get("message", {})
+                content_list = message.get("content", [])
+                if isinstance(content_list, list):
+                    for item in content_list:
+                        if isinstance(item, dict) and item.get("type") == "tool_result":
+                            content_candidates.append(str(item.get("content", "")))
+
+            for result_text in content_candidates:
+                if "shorter than the provided offset" in result_text:
+                    self.errors.append({
+                        "type": "read_offset_overflow",
+                        "tool": self.current_tool,
+                        "at": datetime.now(timezone.utc).isoformat(),
+                    })
+                    break # one error per event is enough
+                elif "Wasted call" in result_text:
+                    self.errors.append({
+                        "type": "wasted_call",
+                        "tool": self.current_tool,
+                        "at": datetime.now(timezone.utc).isoformat(),
+                    })
+                    break
+
+            # Keep only last 20 errors to prevent unbounded growth in progress.json
+            if len(self.errors) > 20:
+                self.errors = self.errors[-20:]
+
         elif event_type == "system":
             # System events (hooks, init, task notifications, etc.) — track but don't count as messages.
             self.event_format = self.event_format or "stream-json"
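The two event shapes handled in the hunk above (a top-level `tool_result` event, and `tool_result` items nested inside a `user` message) can be collected by one helper. The sketch below is a standalone illustration; the helper name is hypothetical, and it assumes the event type is stored under the `"type"` key, which the hunk does not show.

```python
def tool_result_texts(event):
    """Collect tool_result payloads from either supported event shape."""
    texts = []
    event_type = event.get("type")  # assumed location of the event type
    if event_type == "tool_result":
        # Format A: the result text sits directly on the event.
        texts.append(str(event.get("content", "")))
    elif event_type == "user":
        # Format B: results are items inside message.content.
        content_list = event.get("message", {}).get("content", [])
        if isinstance(content_list, list):
            for item in content_list:
                if isinstance(item, dict) and item.get("type") == "tool_result":
                    texts.append(str(item.get("content", "")))
    return texts
```

Keeping the extraction separate from the pattern checks lets both formats share the same `"shorter than the provided offset"` and `"Wasted call"` tests.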
@@ -274,6 +374,28 @@ class ProgressTracker:
                 state.setdefault("subagent_type", "")
             self._update_claude_subagent_status_counts()
 
+        elif event_type == "result":
+            self.event_format = self.event_format or "stream-json"
+            self.is_active = False
+            result_text = event.get("result") or event.get("message") or ""
+            error_obj = event.get("error")
+            if isinstance(error_obj, dict):
+                error_text = " ".join(
+                    str(error_obj.get(key) or "")
+                    for key in ("type", "code", "message")
+                    if error_obj.get(key)
+                )
+                result_text = " ".join(part for part in (str(result_text), error_text) if part)
+            api_error_code = event.get("api_error_code") or event.get("error_code") or ""
+            if isinstance(error_obj, dict) and not api_error_code:
+                api_error_code = error_obj.get("code") or error_obj.get("type") or ""
+            self._record_terminal_result(
+                text=str(result_text or ""),
+                is_error=bool(event.get("is_error")),
+                api_error_status=event.get("api_error_status"),
+                api_error_code=str(api_error_code or ""),
+            )
+
         # ── Claude API raw stream format ────────────────────────────
         elif event_type == "message_start":
             self.event_format = self.event_format or "stream-json"
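The `result` branch above merges a structured `error` object into the result text and falls back to the error's `code` or `type` when no explicit error code is present. A minimal standalone sketch of that flattening step follows; the function name and the sample event are hypothetical.

```python
def flatten_result_event(event):
    """Return (text, api_error_code) for a terminal result event."""
    result_text = event.get("result") or event.get("message") or ""
    api_error_code = event.get("api_error_code") or event.get("error_code") or ""
    error_obj = event.get("error")
    if isinstance(error_obj, dict):
        # Append the non-empty type/code/message fields to the result text.
        error_text = " ".join(
            str(error_obj[key]) for key in ("type", "code", "message") if error_obj.get(key)
        )
        result_text = " ".join(part for part in (str(result_text), error_text) if part)
        if not api_error_code:
            # Prefer the error's code, then its type, as in the hunk.
            api_error_code = error_obj.get("code") or error_obj.get("type") or ""
    return str(result_text), str(api_error_code)


sample = {
    "type": "result",
    "is_error": True,
    "error": {"type": "invalid_request_error", "message": "context window exceeded"},
}
```

Flattening first means the text detector sees the error type and message together, so a combined pattern such as `invalid_request_error.*context window` can match.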
@@ -316,6 +438,7 @@ class ProgressTracker:
                         self.last_text_snippet = stripped[:120]
                     # Try to detect phase from text
                     self._detect_phase(text)
+                    self._detect_terminal_error(text, require_error_context=True)
 
             elif delta_type == "input_json_delta":
                 partial = delta.get("partial_json", "")
@@ -331,21 +454,73 @@ class ProgressTracker:
                     self._extract_tool_summary(full_input)
                     self._detect_phase(full_input)
             else:
-                # Text block finished - detect phase from accumulated text
+                # Text block finished - detect phase and terminal errors from accumulated text
                 if self._text_buffer:
                     self._detect_phase(self._text_buffer)
+                    self._detect_terminal_error(
+                        self._text_buffer,
+                        require_error_context=True,
+                    )
             self._in_tool_use = False
             self._current_tool_input_parts = []
 
         elif event_type == "error":
             error_msg = event.get("error", {}).get("message", "Unknown error")
             self.errors.append(error_msg)
+            self._detect_terminal_error(str(error_msg))
 
         # Check for subagent indicator
         if event.get("parent_tool_use_id"):
             # This is a sub-agent event; tool name is still tracked normally
             pass
 
+    def _record_terminal_result(self, text="", is_error=False, api_error_status=None, api_error_code=""):
+        """Record a Claude Code terminal result event."""
+        terminal_text = str(text or "")
+        self.last_result_is_error = bool(is_error)
+        if api_error_status not in (None, ""):
+            try:
+                self.api_error_status = int(api_error_status)
+            except (TypeError, ValueError):
+                self.api_error_status = api_error_status
+        error_like_result = (
+            self.last_result_is_error
+            or api_error_status not in (None, "")
+            or bool(api_error_code)
+            or _has_error_context(terminal_text)
+        )
+        normalized_code = detect_api_error_code(
+            " ".join([str(api_error_code or ""), terminal_text]),
+            require_error_context=not error_like_result,
+        )
+        if normalized_code:
+            self.api_error_code = normalized_code
+            self.fatal_error_code = normalized_code
+        elif api_error_code:
+            self.api_error_code = str(api_error_code)
+        self.terminal_result_text = terminal_text[:1000]
+        if terminal_text.strip():
+            self.last_text_snippet = terminal_text.strip()[:120]
+        if not self.last_result_is_error and not self.fatal_error_code:
+            self.terminal_success_at = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
+        elif self.fatal_error_code:
+            self.errors.append(self.fatal_error_code)
+
+    def _detect_terminal_error(self, text, require_error_context=False):
+        """Detect fatal context-window errors from unstructured text."""
+        code = detect_api_error_code(
+            str(text or ""),
+            require_error_context=require_error_context,
+        )
+        if not code:
+            return
+        self.last_result_is_error = True
+        self.api_error_code = code
+        self.fatal_error_code = code
+        self.terminal_result_text = str(text or "")[:1000]
+        if text:
+            self.last_text_snippet = str(text).strip()[:120]
+
     def _detect_phase(self, text):
         """Detect pipeline phase from text content.
 
@@ -692,6 +867,12 @@ class ProgressTracker:
             "child_activity_signature": self.child_activity_signature,
             "last_child_activity_at": self.last_child_activity_at,
             "last_text_snippet": self.last_text_snippet,
+            "last_result_is_error": self.last_result_is_error,
+            "api_error_status": self.api_error_status,
+            "api_error_code": self.api_error_code,
+            "terminal_result_text": self.terminal_result_text,
+            "terminal_success_at": self.terminal_success_at,
+            "fatal_error_code": self.fatal_error_code,
             "is_active": self.is_active,
             "errors": self.errors[-10:], # Keep last 10 errors
         }
@@ -728,6 +909,12 @@ def tail_and_parse(session_log, progress_file, poll_interval=0.5):
             state["current_phase"],
             state["total_tool_calls"],
             state.get("child_activity_signature", ""),
+            state.get("last_result_is_error"),
+            state.get("api_error_status"),
+            state.get("api_error_code", ""),
+            state.get("fatal_error_code", ""),
+            state.get("terminal_result_text", ""),
+            tuple(state.get("errors", [])),
         )
 
     # Wait for log file to appear
@@ -752,11 +939,19 @@ def tail_and_parse(session_log, progress_file, poll_interval=0.5):
                     event = json.loads(line)
                     tracker.process_event(event)
                 except json.JSONDecodeError:
-                    # Not a JSON line (could be stderr mixed in)
-                    #
+                    # Not a JSON line (could be stderr mixed in). Use it as a
+                    # text snippet and only treat it as terminal when it has a
+                    # strong API/runtime error marker; ordinary assistant prose
+                    # can discuss context limits without being fatal.
                     stripped = line.strip()
                     if stripped and len(stripped) > 5:
                         tracker.last_text_snippet = stripped[:120]
+                        tracker._detect_terminal_error(stripped, require_error_context=True)
+                        current_state = tracker.to_dict()
+                        current_state_key = state_key(current_state)
+                        if current_state_key != last_write_state:
+                            atomic_write_json(current_state, progress_file)
+                            last_write_state = current_state_key
                     continue
 
                 # Write progress if state changed
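The hunk above writes `progress.json` only when a derived state key changes, and it does so through `atomic_write_json`. The body of that helper is not part of this diff, so the sketch below shows one common way to implement such a write (temporary file in the same directory, then `os.replace`), purely as an assumption. The `write_if_changed` wrapper is likewise hypothetical.

```python
import json
import os
import tempfile


def atomic_write_json(data, path):
    """Write JSON to a temp file beside `path`, then swap it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            json.dump(data, handle)
        # os.replace swaps the file in one step, so readers never see a partial file.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def write_if_changed(state, path, last_key, key_fn):
    """Write `state` only when its key differs from `last_key`; return the new key."""
    key = key_fn(state)
    if key != last_key:
        atomic_write_json(state, path)
    return key
```

A caller would keep the returned key and pass it back on the next poll, which mirrors how `last_write_state` is updated in the hunk.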
@@ -49,6 +49,7 @@ SESSION_STATUS_VALUES = [
     "commit_missing",
     "docs_missing",
     "merge_conflict",
+    "finalization_needed",
 ]
 
 TERMINAL_STATUSES = {"completed", "failed", "skipped", "auto_skipped", "split"}
@@ -644,7 +645,25 @@ def action_update(args, feature_list_path, state_dir):
         fs["degraded_reason"] = session_status
         fs["resume_from_phase"] = None
         fs["sessions"] = []
-
+        if session_id:
+            fs["last_session_id"] = session_id
+            fs["last_failed_session_id"] = session_id
+
+        err = update_feature_in_list(feature_list_path, feature_id, new_status)
+        if err:
+            error_out("Failed to update .prizmkit/plans/feature-list.json: {}".format(err))
+            return
+    elif session_status == "finalization_needed":
+        # Runtime preserved dirty post-completion changes but could not safely
+        # clean them for automatic merge. Preserve the dev branch and stop for
+        # manual finalization instead of spending code retry budget.
+        new_status = "failed"
+        fs["degraded_reason"] = session_status
+        fs["resume_from_phase"] = None
+        fs["finalization_needed_count"] = fs.get("finalization_needed_count", 0) + 1
+        if session_id:
+            fs["last_session_id"] = session_id
+            fs["last_failed_session_id"] = session_id
 
         err = update_feature_in_list(feature_list_path, feature_id, new_status)
         if err:
@@ -657,6 +676,8 @@ def action_update(args, feature_list_path, state_dir):
|
|
|
657
676
|
new_status = "pending"
|
|
658
677
|
fs["infra_error_count"] = fs.get("infra_error_count", 0) + 1
|
|
659
678
|
fs["last_infra_error_session_id"] = session_id
|
|
679
|
+
if session_id:
|
|
680
|
+
fs["last_session_id"] = session_id
|
|
660
681
|
fs["resume_from_phase"] = None
|
|
661
682
|
|
|
662
683
|
err = update_feature_in_list(feature_list_path, feature_id, new_status)
|
|
@@ -673,6 +694,9 @@ def action_update(args, feature_list_path, state_dir):
|
|
|
673
694
|
new_status = "pending"
|
|
674
695
|
|
|
675
696
|
fs["resume_from_phase"] = None
|
|
697
|
+
if session_id:
|
|
698
|
+
fs["last_session_id"] = session_id
|
|
699
|
+
fs["last_failed_session_id"] = session_id
|
|
676
700
|
# Keep sessions list and last_session_id for debugging
|
|
677
701
|
|
|
678
702
|
err = update_feature_in_list(feature_list_path, feature_id, new_status)
|
|
@@ -712,9 +736,9 @@ def action_update(args, feature_list_path, state_dir):
|
|
|
712
736
|
}
|
|
713
737
|
if auto_skipped_features:
|
|
714
738
|
summary["auto_skipped"] = [info["feature_id"] for info in auto_skipped_features]
|
|
715
|
-
if session_status in ("commit_missing", "docs_missing", "merge_conflict"):
|
|
739
|
+
if session_status in ("commit_missing", "docs_missing", "merge_conflict", "finalization_needed"):
|
|
716
740
|
summary["degraded_reason"] = session_status
|
|
717
|
-
summary["restart_policy"] = "finalization_retry"
|
|
741
|
+
summary["restart_policy"] = "manual_finalization" if session_status == "finalization_needed" else "finalization_retry"
|
|
718
742
|
elif session_status == "infra_error":
|
|
719
743
|
summary["restart_policy"] = "infra_retry"
|
|
720
744
|
summary["infra_error_count"] = fs.get("infra_error_count", 0)
|
|
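The `finalization_needed` handling added in these hunks can be summarised in a short sketch. The function names `record_finalization_needed` and `restart_policy_for` are hypothetical and exist only for illustration; the field names and policy strings are taken from the diff. Returning `None` for statuses not shown in these hunks is an assumption.

```python
# Statuses that the summary treats as degraded, as listed in the diff.
DEGRADED_STATUSES = (
    "commit_missing",
    "docs_missing",
    "merge_conflict",
    "finalization_needed",
)


def record_finalization_needed(fs, session_id=None):
    """Mirror the new branch: mark the feature failed and count the event."""
    fs["degraded_reason"] = "finalization_needed"
    fs["resume_from_phase"] = None
    fs["finalization_needed_count"] = fs.get("finalization_needed_count", 0) + 1
    if session_id:
        fs["last_session_id"] = session_id
        fs["last_failed_session_id"] = session_id
    return "failed"  # new_status in the diff


def restart_policy_for(session_status):
    """Pick the restart policy string that the summary reports."""
    if session_status in DEGRADED_STATUSES:
        if session_status == "finalization_needed":
            return "manual_finalization"
        return "finalization_retry"
    if session_status == "infra_error":
        return "infra_retry"
    return None  # assumption: other statuses are handled elsewhere
```

The point of the separate policy is visible here: `finalization_needed` stops for manual work instead of sharing the automatic `finalization_retry` path used by the other three degraded statuses.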
@@ -1,5 +1,23 @@
|
|
|
1
1
|
"Read {{DEV_SUBAGENT_PATH}}. Implement feature {{FEATURE_ID}} (slug: {{FEATURE_SLUG}}).
|
|
2
2
|
|
|
3
|
+
## Task Summary Card
|
|
4
|
+
|
|
5
|
+
**Objective**: Implement {{FEATURE_TITLE}}.
|
|
6
|
+
|
|
7
|
+
**Primary files** (see context-snapshot.md Section 4 for full manifest):
|
|
8
|
+
- Review plan.md Tasks section for the complete task-to-file mapping.
|
|
9
|
+
- Each task's `— file:` suffix identifies the target file.
|
|
10
|
+
|
|
11
|
+
**Test command**: `{{TEST_CMD}}`
|
|
12
|
+
|
|
13
|
+
**Known baseline failures**: `{{BASELINE_FAILURES}}`
|
|
14
|
+
|
|
15
|
+
**DO NOT**:
|
|
16
|
+
- Re-read source files already listed in context-snapshot.md Section 4 File Manifest
|
|
17
|
+
- Create new files unless a plan.md task explicitly requires it
|
|
18
|
+
- Run git commands
|
|
19
|
+
- Use mock success data or fake rows in UI/tests
|
|
20
|
+
|
|
3
21
|
## Required Inputs
|
|
4
22
|
|
|
5
23
|
1. Read `.prizmkit/specs/{{FEATURE_SLUG}}/context-snapshot.md` first.
|
|
@@ -35,6 +53,9 @@ Before returning, append `## Implementation Log` to `context-snapshot.md` with:
|
|
|
35
53
|
- Carry forward the Dev-isolated subset: skip scaffold/generated files listed in `context-snapshot.md`; verify dependency versions before install/build commands that resolve dependencies; after build/compile commands, ensure outputs are ignored and never commit generated artifacts.
|
|
36
54
|
- If tests fail, follow this Test Failure Recovery subset: classify failures as baseline, new regression, brittle test, or environment/tooling; fix new regressions and brittle tests while progress is being made; document baseline failures; write `failure-log.md` for blockers.
|
|
37
55
|
- Do not run git commands; staging and commit are handled by the orchestrator.
|
|
56
|
+
- **Edit safety**: If an Edit fails with 'String to replace not found', grep for the target text before retrying. Never guess file offsets — verify them with a Read or grep first.
|
|
57
|
+
- **Read safety**: If 3 consecutive Reads to the same file return 'shorter than offset' or 'Wasted call', STOP and report BLOCKED.
|
|
58
|
+
- **Test early**: Run `{{TEST_CMD}}` after every 3 successful Edit operations. Capture output to /tmp/test-out.txt and grep for failures.
|
|
38
59
|
|
|
39
60
|
Do not return success unless:
|
|
40
61
|
1. implementation tasks are complete;
|
|
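The "Test early" rule above asks the agent to capture test output to a file and search it for failures. A rough sketch of that loop in Python follows, assuming the test command is passed as an argument list. `run_tests_and_scan` and the `FAIL|ERROR` pattern are hypothetical choices for illustration; the template itself does not specify which markers to grep for.

```python
import re
import subprocess

# Assumed failure markers; adjust to the test runner actually in use.
FAILURE_PATTERN = re.compile(r"FAIL|ERROR")


def run_tests_and_scan(command, output_path):
    """Run the test command, save combined output, and return matching lines."""
    with open(output_path, "w", encoding="utf-8") as handle:
        completed = subprocess.run(
            command,
            stdout=handle,
            stderr=subprocess.STDOUT,  # keep stderr in the same capture file
        )
    with open(output_path, encoding="utf-8") as handle:
        failures = [
            line.rstrip("\n") for line in handle if FAILURE_PATTERN.search(line)
        ]
    return completed.returncode, failures
```

Writing to a file first and scanning afterwards keeps long test output out of the agent's context while still surfacing the failing lines.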
@@ -131,7 +131,7 @@ If MISSING — build it now:
|
|
|
131
131
|
```bash
|
|
132
132
|
find . -maxdepth 2 -type d -not -path '*/node_modules/*' -not -path '*/.git/*' -not -path '*/dist/*' -not -path '*/build/*' -not -path '*/__pycache__/*' -not -path '*/vendor/*' | sed -e 's;[^/]*/;|____;g;s;____|; |;g'
|
|
133
133
|
```
|
|
134
|
-
- **Section 3 —
|
|
134
|
+
- **Section 3 — Key TRAPS & RULES**: relevant TRAPS/RULES from prizm-docs (not full copies)
|
|
135
135
|
- **Section 4 — File Manifest**: For each file relevant to this feature, list: file path, why it's needed (modify/reference/test), key interface signatures (function names + params + return types). Do NOT include full file content — agents read files on-demand. Format:
|
|
136
136
|
### Files to Modify
|
|
137
137
|
| File | Why Needed | Key Interfaces |
|