@codename_inc/spectre 5.1.0 → 5.2.0

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@codename_inc/spectre",
- "version": "5.1.0",
+ "version": "5.2.0",
  "type": "module",
  "bin": {
  "spectre": "./bin/spectre.js"
@@ -1,5 +1,5 @@
  {
  "name": "spectre",
- "version": "5.1.0",
+ "version": "5.2.0",
  "description": "Agentic coding workflow with session memory. spectre guides you through Scope, Plan, Execute, Clean, Test, Rebase, and Extract phases."
  }
@@ -171,6 +171,8 @@ Optional user input to seed this workflow.
  - **MEDIUM**: Quality improvements, test coverage, configuration, performance (non-critical)
  - **LOW**: Documentation, polish, cleanup
 
+ **Evidence rule:** Every CRITICAL or HIGH finding MUST include (1) `file:line` and (2) a reproducible failure scenario or exploit path describing observable behavior. Findings without an evidence chain are auto-downgraded one severity level. "Could potentially" is not evidence.
+
  **Perform comprehensive analysis covering all aspects:**
 
  ### 🔧 Foundation & Correctness
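
Note on the hunk above: the evidence rule is stated only in prose. A minimal sketch of how it could be applied mechanically follows, under loud assumptions: the `Finding` dataclass, its field names, and `apply_evidence_rule` are hypothetical and do not appear in the package, and "evidence chain" is taken to mean a non-empty `file:line` location plus a non-empty scenario.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Severity levels named in the diff, in ascending order.
SEVERITIES = ["LOW", "MEDIUM", "HIGH", "CRITICAL"]


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record; field names are assumptions."""
    severity: str
    description: str
    location: Optional[str] = None  # expected shape: "file:line"
    scenario: Optional[str] = None  # reproducible failure or exploit path


def has_evidence(finding: Finding) -> bool:
    """True when both parts of the evidence chain are present."""
    has_location = bool(finding.location) and ":" in finding.location
    has_scenario = bool(finding.scenario and finding.scenario.strip())
    return has_location and has_scenario


def apply_evidence_rule(finding: Finding) -> Finding:
    """Downgrade CRITICAL/HIGH findings by one level if evidence is missing."""
    if finding.severity in ("CRITICAL", "HIGH") and not has_evidence(finding):
        lower = SEVERITIES[SEVERITIES.index(finding.severity) - 1]
        return replace(finding, severity=lower)
    return finding
```

Under these assumptions, a CRITICAL finding with no location becomes HIGH, and MEDIUM or LOW findings pass through unchanged.
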
@@ -1,6 +1,6 @@
  ---
  name: execute
- description: 👻 | Adaptive Wave-Based Build -> Code_Review -> Validate Flow
+ description: 👻 | Adaptive Wave-Based Build with Per-Wave Verification Gate
  user-invocable: true
  ---
 
@@ -11,9 +11,9 @@ user-invocable: true
  Treat the current command arguments as this workflow's input. When invoked from a slash command, use the forwarded `$ARGUMENTS` value.
 
 
- # execute: Adaptive Task Execution with Quality Gates
+ # execute: Adaptive Task Execution with Per-Wave Verification
 
- Execute tasks in parallel waves with full scope context, adapt based on learnings, code review loop, validate requirements. Outcome: complete implementation with verified quality and E2E requirement coverage.
+ Execute tasks in parallel waves with full scope context, verify each wave before proceeding, adapt based on learnings, audit cross-wave integration, generate manual test guide. Outcome: complete implementation with verified quality and E2E requirement coverage.
 
  ## ARGUMENTS
 
@@ -39,7 +39,9 @@ $ARGUMENTS
 
  2. **Dispatch Wave**: Launch parallel @dev subagents (1 per task batch)
  - **CRITICAL**: Each subagent MUST read `SCOPE_DOCS` before executing
- - Each receives: task batch assignment, dependency completion reports, SCOPE_DOCS paths
+ - Each receives: task batch assignment, SCOPE_DOCS paths, and (after wave 1) a **Prior-Wave Context** block
+ - **Prior-Wave Context** (REQUIRED in waves 2+): the orchestrator appends each prior wave's @dev Completion Reports verbatim into this wave's dispatch prompt under a `## Prior-Wave Context` header. Includes Completed tasks, Files changed, Scope signal, Discoveries, and Guidance from each prior batch. This is how state is carried forward — there is no separate state file.
+ - **Test discovery**: instruct @dev to use the project's native related-test command (`jest --findRelatedTests <file>`, `pytest` by path, `vitest related`, `cargo test <path>`). Do not create parallel test files for code already covered.
  - Instruct: "Read scope docs first to understand E2E UX and integration points. Load @skill-spectre:spectre-tdd, then execute tasks sequentially using its TDD methodology. **Commit after each parent task** with conventional commit format (e.g., `feat(module): add X`, `fix(module): resolve Y`). Return completion report with **Implementation Insights** + **E2E Completeness Check**."
 
  **E2E Completeness Check** (subagent returns one per batch):
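
Note on the hunk above: the Prior-Wave Context block replaces a state file with prompt concatenation. The sketch below illustrates that idea only. The function name `build_dispatch_prompt`, its parameters, and the `## Task Batch` and `## Scope Docs` headers are hypothetical; the one element taken from the diff is the `## Prior-Wave Context` header and the requirement that prior reports are appended verbatim from wave 2 onward.

```python
from typing import List


def build_dispatch_prompt(
    task_batch: str,
    scope_docs: List[str],
    prior_reports: List[str],
) -> str:
    """Assemble a @dev dispatch prompt (hypothetical layout).

    prior_reports holds the completion reports of all earlier waves;
    it is empty for wave 1, so no context block is emitted then.
    """
    parts = ["## Task Batch", task_batch, "", "## Scope Docs"]
    parts += [f"- {path}" for path in scope_docs]
    if prior_reports:
        # Waves 2+: append reports unmodified, as the diff requires.
        parts += ["", "## Prior-Wave Context"]
        parts += prior_reports
    return "\n".join(parts)
```

Because the reports are passed through unmodified, whatever structure the @dev subagent used (Completed tasks, Files changed, Scope signal, Discoveries, Guidance) reaches the next wave intact.
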
@@ -47,15 +49,64 @@ $ARGUMENTS
  - 🟡 Gap — [specific functionality missing for E2E UX]
  - 🔴 Blocker — [cannot deliver spec without changes to other tasks]
 
- 3. **Mark Complete**: Update tasks doc with `[x]` for completed tasks
+ 3. **Per-Wave Verification Gate**: Verify the wave's output before adapting or advancing.
 
- 4. **Reflect**: Review completion reports for:
+ **3a. Deterministic pre-gate (no AI)**
+ - Detect project commands from `package.json` / `pyproject.toml` / `Cargo.toml` / `Makefile`
+ - Run lint, typecheck, build — whichever apply
+ - If any fail: dispatch @dev to fix the failures, re-run the gate. Do NOT invoke @reviewer until all deterministic checks pass.
+
+ **3b. Parallel review lenses (single message, two @reviewer dispatches)**
+
+ Build each reviewer prompt from:
+ - Wave diff: `git diff <parent-of-first-wave-commit>..HEAD`
+ - Acceptance criteria: verbatim text from scope/tasks docs for this wave's tasks
+ - Files-touched manifest
+
+ **Forbidden in reviewer prompts**: @dev completion reports, implementer rationale, orchestrator paraphrase of "what the dev did and why". The reviewer is a clean room — diff + criteria only.
+
+ **Lens 1 — security + correctness**
+ - OWASP Top-10, injection, auth, secrets, data exposure
+ - Logic, edge cases, state transitions
+ - Scope adherence (flag only in-scope issues; do not flag missing out-of-scope work)
+
+ **Lens 2 — wiring**
+ - Apply the Defined → Connected → Reachable methodology:
+ - Defined: code exists in a file
+ - Connected: code is imported/called by other code
+ - Reachable: a user action can trigger the code path
+ - For each new function/component, grep for usage (not just definition)
+ - For UI features, trace render-backward: JSX ← variable ← source ← user action
+ - Flag dead computations (computed but never reach output) and old code paths still active when replaced
+
+ **Severity & evidence rule** (enforced in both lens prompts):
+ - Every CRITICAL or HIGH finding MUST include:
+ 1. `file:line` reference
+ 2. A reproducible failure scenario or exploit path describing observable behavior
+ - Findings without an evidence chain are auto-downgraded one severity level. "Could potentially" is not evidence.
+ - Each finding includes a hash: `sha256(file_path + line + finding_category)` for the fix-loop ledger (3c).
+
+ **3c. Bounded fix loop**
+
+ If lens dispatches return CRITICAL/HIGH:
+ - **Iteration cap**: 3 fix waves maximum
+ - **Hash ledger**: maintain a set of finding hashes addressed. If a finding with a hash already in the ledger reappears in a later review, classify as "reviewer disagreement" and escalate to user — do NOT re-queue.
+ - **Fix/test ratio**: monitor changes per fix wave. If test-file changes > 0.5 × implementation-file changes, halt and surface to user — likely "fixing the test instead of the bug."
+ - **Diff-growth circuit-breaker**: if cumulative fix-wave diff grows > 25% per iteration, halt and surface — fixes are adding surface area, not reducing it.
+ - **Dispatch fix**: parallel @dev subagents address each CRITICAL/HIGH finding. Each fix-dev receives the finding's full evidence chain (file:line + scenario), not just the description.
+ - **Re-verify**: after fixes commit, return to 3a (deterministic) then 3b (lenses).
+
+ **3d. Exit condition**: No CRITICAL/HIGH remain, OR iteration cap reached and user has been notified of unresolved findings.
+
+ 4. **Mark Complete**: Update tasks doc with `[x]` for completed tasks
+
+ 5. **Reflect**: Review completion reports for:
  - Scope signals (🟡/🟠/🔴) from implementation insights
  - E2E completeness gaps (🟡/🔴) from completeness checks
- - **If** all ⚪ across both → skip to step 6
+ - **If** all ⚪ across both → skip to step 7
  - **Else** → adapt tasks
 
- 5. **Adapt** (only if triggered):
+ 6. **Adapt** (only if triggered):
  - Modify future tasks with learned context
  - Add tasks for E2E gaps with `[ADDED - E2E gap]` prefix
  - Add required sub-tasks with `[ADDED]` prefix
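
Note on the hunk above: the finding hash and the hash ledger from 3c can be illustrated with the standard library. This is a sketch, not the package's implementation. The diff gives only the formula `sha256(file_path + line + finding_category)`; the unit-separator byte between fields is an assumption added here so that `("a", 12)` and `("a1", 2)` cannot collide, and the `FixLedger` class and its method names are hypothetical.

```python
import hashlib
from typing import Iterable, List, Set, Tuple


def finding_hash(file_path: str, line: int, category: str) -> str:
    """sha256 over file path, line, and category.

    The "\x1f" separator is an assumption; the source only says the
    three fields are combined.
    """
    payload = "\x1f".join([file_path, str(line), category])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class FixLedger:
    """Hypothetical ledger of finding hashes already sent to a fix wave."""

    def __init__(self) -> None:
        self.addressed: Set[str] = set()

    def triage(self, hashes: Iterable[str]) -> Tuple[List[str], List[str]]:
        """Split hashes into (requeue, escalate).

        A hash already in the ledger is treated as reviewer disagreement
        and escalated to the user instead of being re-queued.
        """
        requeue: List[str] = []
        escalate: List[str] = []
        for h in hashes:
            (escalate if h in self.addressed else requeue).append(h)
        return requeue, escalate

    def mark_addressed(self, hashes: Iterable[str]) -> None:
        self.addressed.update(hashes)
```

One consequence worth noting: because the line number is part of the hash input, a fix that shifts a finding to a different line yields a different hash, so the ledger as specified would treat it as a new finding.
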
@@ -63,34 +114,28 @@ $ARGUMENTS
  - Flag cross-task integration issues to remaining waves
  - **Guardrails**: ❌ No "nice-to-have" additions, ❌ No scope expansion, ✅ Only adapt for spec compliance
 
- 6. **Next Wave**: Identify next tasks, gather relevant completion reports, return to step 1
-
- ## Step 2 - Code Review Loop
-
- - **Action** — ExecutedeveviewLoop: Until no critical/high feedback:
+ 7. **Next Wave**: Identify next tasks, gather prior-wave completion reports for the Prior-Wave Context block, return to step 1
 
- 1. **Spawn Review**: @dev subagent runs `Skill(code_review)` (Claude slash route: `/spectre:code_review`)
- 2. **Analyze**: Identify critical/high items
- - **If** none → exit loop
- 3. **Address**: Parallel @dev subagents fix feedback
- 4. **Re-verify**: Return to step 1
+ ## Step 2 - Cross-Wave Validate
 
- ## Step 3 - Validate Requirements
+ - **Action** SpawnValidation: @analyst runs `Skill(validate)` (Claude slash route: `/spectre:validate`) with **narrowed scope**:
+ - Focus: cross-wave integration audit (did later waves silently break earlier waves' wiring?) + scope-creep audit (anything implemented that is NOT in the acceptance criteria?) + dead-computation sweep across the full cumulative diff
+ - Skip: per-area wiring verification (already done per-wave in Step 1.3b's wiring lens)
 
- - **Action** — SpawnValidation: @reviewer runs `Skill(validate)` (Claude slash route: `/spectre:validate`) with task list
- - **Action** — AddressGaps: If high priority gaps → dispatch @dev subagents to fix
+ - **Action** — AddressGaps: If high priority gaps surface dispatch @dev subagents to fix.
 
- ## Step 4 - Prepare for QA
+ ## Step 3 - Prepare for QA
 
  - **Action** — GenerateTestGuide: @dev runs `Skill(create_test_guide)` (Claude slash route: `/spectre:create_test_guide`)
  - Save to `{OUT_DIR}/test_guide.md`
 
- ## Step 5 - Report
+ ## Step 4 - Report
 
  - **Action** — SummarizeCompletion:
- - Tasks completed, waves executed, code review iterations, validation status
+ - Tasks completed, waves executed, per-wave fix-loop iteration counts, validation status
  - Test guide location
  - **Task Evolution Summary**: Adaptations made (or "None - original plan executed")
  - **E2E Gaps Addressed**: Summary of completeness issues found and resolved
+ - **Unresolved Findings** (if any): Any CRITICAL/HIGH that hit the fix-loop cap and were escalated to user
 
  - **Action** — RenderFooter: Use `@skill-spectre:spectre-guide` skill for Next Steps
@@ -171,6 +171,8 @@ Optional user input to seed this workflow.
  - **MEDIUM**: Quality improvements, test coverage, configuration, performance (non-critical)
  - **LOW**: Documentation, polish, cleanup
 
+ **Evidence rule:** Every CRITICAL or HIGH finding MUST include (1) `file:line` and (2) a reproducible failure scenario or exploit path describing observable behavior. Findings without an evidence chain are auto-downgraded one severity level. "Could potentially" is not evidence.
+
  **Perform comprehensive analysis covering all aspects:**
 
  ### 🔧 Foundation & Correctness
@@ -1,6 +1,6 @@
  ---
  name: "execute"
- description: "👻 | Adaptive Wave-Based Build -> Code_Review -> Validate Flow"
+ description: "👻 | Adaptive Wave-Based Build with Per-Wave Verification Gate"
  user-invocable: true
  ---
 
@@ -11,9 +11,9 @@ user-invocable: true
  Treat the current command arguments as this workflow's input. When invoked from a slash command, use the forwarded `$ARGUMENTS` value.
 
 
- # execute: Adaptive Task Execution with Quality Gates
+ # execute: Adaptive Task Execution with Per-Wave Verification
 
- Execute tasks in parallel waves with full scope context, adapt based on learnings, code review loop, validate requirements. Outcome: complete implementation with verified quality and E2E requirement coverage.
+ Execute tasks in parallel waves with full scope context, verify each wave before proceeding, adapt based on learnings, audit cross-wave integration, generate manual test guide. Outcome: complete implementation with verified quality and E2E requirement coverage.
 
  ## ARGUMENTS
 
@@ -39,7 +39,9 @@ $ARGUMENTS
 
  2. **Dispatch Wave**: Launch parallel @dev subagents (1 per task batch)
  - **CRITICAL**: Each subagent MUST read `SCOPE_DOCS` before executing
- - Each receives: task batch assignment, dependency completion reports, SCOPE_DOCS paths
+ - Each receives: task batch assignment, SCOPE_DOCS paths, and (after wave 1) a **Prior-Wave Context** block
+ - **Prior-Wave Context** (REQUIRED in waves 2+): the orchestrator appends each prior wave's @dev Completion Reports verbatim into this wave's dispatch prompt under a `## Prior-Wave Context` header. Includes Completed tasks, Files changed, Scope signal, Discoveries, and Guidance from each prior batch. This is how state is carried forward — there is no separate state file.
+ - **Test discovery**: instruct @dev to use the project's native related-test command (`jest --findRelatedTests <file>`, `pytest` by path, `vitest related`, `cargo test <path>`). Do not create parallel test files for code already covered.
  - Instruct: "Read scope docs first to understand E2E UX and integration points. Load Skill(spectre-tdd), then execute tasks sequentially using its TDD methodology. **Commit after each parent task** with conventional commit format (e.g., `feat(module): add X`, `fix(module): resolve Y`). Return completion report with **Implementation Insights** + **E2E Completeness Check**."
 
  **E2E Completeness Check** (subagent returns one per batch):
@@ -47,15 +49,64 @@ $ARGUMENTS
  - 🟡 Gap — [specific functionality missing for E2E UX]
  - 🔴 Blocker — [cannot deliver spec without changes to other tasks]
 
- 3. **Mark Complete**: Update tasks doc with `[x]` for completed tasks
+ 3. **Per-Wave Verification Gate**: Verify the wave's output before adapting or advancing.
 
- 4. **Reflect**: Review completion reports for:
+ **3a. Deterministic pre-gate (no AI)**
+ - Detect project commands from `package.json` / `pyproject.toml` / `Cargo.toml` / `Makefile`
+ - Run lint, typecheck, build — whichever apply
+ - If any fail: dispatch @dev to fix the failures, re-run the gate. Do NOT invoke @reviewer until all deterministic checks pass.
+
+ **3b. Parallel review lenses (single message, two @reviewer dispatches)**
+
+ Build each reviewer prompt from:
+ - Wave diff: `git diff <parent-of-first-wave-commit>..HEAD`
+ - Acceptance criteria: verbatim text from scope/tasks docs for this wave's tasks
+ - Files-touched manifest
+
+ **Forbidden in reviewer prompts**: @dev completion reports, implementer rationale, orchestrator paraphrase of "what the dev did and why". The reviewer is a clean room — diff + criteria only.
+
+ **Lens 1 — security + correctness**
+ - OWASP Top-10, injection, auth, secrets, data exposure
+ - Logic, edge cases, state transitions
+ - Scope adherence (flag only in-scope issues; do not flag missing out-of-scope work)
+
+ **Lens 2 — wiring**
+ - Apply the Defined → Connected → Reachable methodology:
+ - Defined: code exists in a file
+ - Connected: code is imported/called by other code
+ - Reachable: a user action can trigger the code path
+ - For each new function/component, grep for usage (not just definition)
+ - For UI features, trace render-backward: JSX ← variable ← source ← user action
+ - Flag dead computations (computed but never reach output) and old code paths still active when replaced
+
+ **Severity & evidence rule** (enforced in both lens prompts):
+ - Every CRITICAL or HIGH finding MUST include:
+ 1. `file:line` reference
+ 2. A reproducible failure scenario or exploit path describing observable behavior
+ - Findings without an evidence chain are auto-downgraded one severity level. "Could potentially" is not evidence.
+ - Each finding includes a hash: `sha256(file_path + line + finding_category)` for the fix-loop ledger (3c).
+
+ **3c. Bounded fix loop**
+
+ If lens dispatches return CRITICAL/HIGH:
+ - **Iteration cap**: 3 fix waves maximum
+ - **Hash ledger**: maintain a set of finding hashes addressed. If a finding with a hash already in the ledger reappears in a later review, classify as "reviewer disagreement" and escalate to user — do NOT re-queue.
+ - **Fix/test ratio**: monitor changes per fix wave. If test-file changes > 0.5 × implementation-file changes, halt and surface to user — likely "fixing the test instead of the bug."
+ - **Diff-growth circuit-breaker**: if cumulative fix-wave diff grows > 25% per iteration, halt and surface — fixes are adding surface area, not reducing it.
+ - **Dispatch fix**: parallel @dev subagents address each CRITICAL/HIGH finding. Each fix-dev receives the finding's full evidence chain (file:line + scenario), not just the description.
+ - **Re-verify**: after fixes commit, return to 3a (deterministic) then 3b (lenses).
+
+ **3d. Exit condition**: No CRITICAL/HIGH remain, OR iteration cap reached and user has been notified of unresolved findings.
+
+ 4. **Mark Complete**: Update tasks doc with `[x]` for completed tasks
+
+ 5. **Reflect**: Review completion reports for:
  - Scope signals (🟡/🟠/🔴) from implementation insights
  - E2E completeness gaps (🟡/🔴) from completeness checks
- - **If** all ⚪ across both → skip to step 6
+ - **If** all ⚪ across both → skip to step 7
  - **Else** → adapt tasks
 
- 5. **Adapt** (only if triggered):
+ 6. **Adapt** (only if triggered):
  - Modify future tasks with learned context
  - Add tasks for E2E gaps with `[ADDED - E2E gap]` prefix
  - Add required sub-tasks with `[ADDED]` prefix
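
Note on the hunk above: the three halting guards in 3c (iteration cap, fix/test ratio, diff-growth circuit-breaker) reduce to simple threshold checks. The sketch below uses the thresholds from the diff (3 waves, 0.5, 25%), but the unit of measurement is an assumption: the diff does not say whether "changes" means changed lines or changed files, so changed-line counts are assumed. The function name `check_fix_wave`, its parameters, and the returned reason strings are hypothetical.

```python
from typing import Optional

MAX_FIX_WAVES = 3         # "3 fix waves maximum"
TEST_RATIO_LIMIT = 0.5    # test changes > 0.5 x implementation changes
DIFF_GROWTH_LIMIT = 0.25  # cumulative diff grows > 25% per iteration


def check_fix_wave(
    iteration: int,
    test_changes: int,
    impl_changes: int,
    prev_cumulative: int,
    cumulative: int,
) -> Optional[str]:
    """Return a halt reason for this fix wave, or None to continue.

    iteration is 1-based. All counts are assumed to be changed lines.
    prev_cumulative is the cumulative fix diff before this wave
    (0 for the first fix wave, which skips the growth check).
    """
    if iteration > MAX_FIX_WAVES:
        return "iteration cap reached"
    if test_changes > TEST_RATIO_LIMIT * impl_changes:
        return "fix/test ratio exceeded"
    if prev_cumulative > 0:
        growth = (cumulative - prev_cumulative) / prev_cumulative
        if growth > DIFF_GROWTH_LIMIT:
            return "diff growth exceeded"
    return None
```

In every halting case the diff calls for surfacing the situation to the user, so a caller would report the returned reason instead of dispatching another fix wave.
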
@@ -63,34 +114,28 @@ $ARGUMENTS
  - Flag cross-task integration issues to remaining waves
  - **Guardrails**: ❌ No "nice-to-have" additions, ❌ No scope expansion, ✅ Only adapt for spec compliance
 
- 6. **Next Wave**: Identify next tasks, gather relevant completion reports, return to step 1
-
- ## Step 2 - Code Review Loop
-
- - **Action** — ExecutedeveviewLoop: Until no critical/high feedback:
+ 7. **Next Wave**: Identify next tasks, gather prior-wave completion reports for the Prior-Wave Context block, return to step 1
 
- 1. **Spawn Review**: @dev subagent runs `Skill(code_review)` (Claude slash route: `code_review`)
- 2. **Analyze**: Identify critical/high items
- - **If** none → exit loop
- 3. **Address**: Parallel @dev subagents fix feedback
- 4. **Re-verify**: Return to step 1
+ ## Step 2 - Cross-Wave Validate
 
- ## Step 3 - Validate Requirements
+ - **Action** SpawnValidation: @analyst runs `Skill(validate)` (Claude slash route: `validate`) with **narrowed scope**:
+ - Focus: cross-wave integration audit (did later waves silently break earlier waves' wiring?) + scope-creep audit (anything implemented that is NOT in the acceptance criteria?) + dead-computation sweep across the full cumulative diff
+ - Skip: per-area wiring verification (already done per-wave in Step 1.3b's wiring lens)
 
- - **Action** — SpawnValidation: @reviewer runs `Skill(validate)` (Claude slash route: `validate`) with task list
- - **Action** — AddressGaps: If high priority gaps → dispatch @dev subagents to fix
+ - **Action** — AddressGaps: If high priority gaps surface dispatch @dev subagents to fix.
 
- ## Step 4 - Prepare for QA
+ ## Step 3 - Prepare for QA
 
  - **Action** — GenerateTestGuide: @dev runs `Skill(create_test_guide)` (Claude slash route: `create_test_guide`)
  - Save to `{OUT_DIR}/test_guide.md`
 
- ## Step 5 - Report
+ ## Step 4 - Report
 
  - **Action** — SummarizeCompletion:
- - Tasks completed, waves executed, code review iterations, validation status
+ - Tasks completed, waves executed, per-wave fix-loop iteration counts, validation status
  - Test guide location
  - **Task Evolution Summary**: Adaptations made (or "None - original plan executed")
  - **E2E Gaps Addressed**: Summary of completeness issues found and resolved
+ - **Unresolved Findings** (if any): Any CRITICAL/HIGH that hit the fix-loop cap and were escalated to user
 
  - **Action** — RenderFooter: Use `Skill(spectre-guide)` skill for Next Steps