forge-orkes 0.3.8 → 0.3.10
- package/package.json +1 -1
- package/template/.claude/agents/executor.md +37 -50
- package/template/.claude/agents/planner.md +33 -41
- package/template/.claude/agents/researcher.md +24 -26
- package/template/.claude/agents/reviewer.md +45 -53
- package/template/.claude/agents/verifier.md +30 -50
- package/template/.claude/skills/architecting/SKILL.md +32 -46
- package/template/.claude/skills/beads-integration/SKILL.md +27 -43
- package/template/.claude/skills/debugging/SKILL.md +34 -35
- package/template/.claude/skills/designing/SKILL.md +33 -52
- package/template/.claude/skills/discussing/SKILL.md +139 -180
- package/template/.claude/skills/executing/SKILL.md +85 -157
- package/template/.claude/skills/forge/SKILL.md +101 -148
- package/template/.claude/skills/initializing/SKILL.md +104 -144
- package/template/.claude/skills/planning/SKILL.md +65 -67
- package/template/.claude/skills/quick-tasking/SKILL.md +25 -31
- package/template/.claude/skills/researching/SKILL.md +22 -32
- package/template/.claude/skills/reviewing/SKILL.md +406 -0
- package/template/.claude/skills/securing/SKILL.md +19 -19
- package/template/.claude/skills/upgrading/SKILL.md +19 -27
- package/template/.claude/skills/verifying/SKILL.md +53 -81
- package/template/CLAUDE.md +7 -10
- package/template/.claude/skills/auditing/SKILL.md +0 -314
- package/template/.claude/skills/refactoring/SKILL.md +0 -168
package/template/.claude/skills/verifying/SKILL.md
CHANGED
|
@@ -1,86 +1,69 @@
|
|
|
1
1
|
---
|
|
2
2
|
name: verifying
|
|
3
|
-
description: "
|
|
3
|
+
description: "Prove completed work delivers what was promised. Goal-backward verification with 3 levels: Observable Truths, Artifacts, and Key Links. Task completion ≠ goal achievement."
|
|
4
4
|
---
|
|
5
5
|
|
|
6
6
|
# Verifying
|
|
7
7
|
|
|
8
|
-
Prove completed work
|
|
8
|
+
Prove completed work delivers what was promised. Task completion ≠ goal achievement.
|
|
9
9
|
|
|
10
10
|
## Core Question
|
|
11
11
|
|
|
12
|
-
|
|
13
|
-
Ask: "Does the user get what they were promised?"
|
|
12
|
+
Not "Did we complete all tasks?" but "Does the user get what they were promised?"
|
|
14
13
|
|
|
15
14
|
## Load Context
|
|
16
15
|
|
|
17
|
-
|
|
16
|
+
After `/clear`, load with fresh eyes — don't carry the executor's assumptions:
|
|
18
17
|
|
|
19
18
|
```
|
|
20
19
|
Read: .forge/state/milestone-{id}.yml → current phase, plans completed
|
|
21
20
|
Read: .forge/project.yml → tech stack (for running tests)
|
|
22
21
|
Read: .forge/phases/m{M}-{N}-{name}/plan-{NN}.md → must_haves (truths, artifacts, key_links)
|
|
23
|
-
Read: .forge/context.md → locked decisions
|
|
22
|
+
Read: .forge/context.md → locked decisions
|
|
24
23
|
Read: .forge/requirements.yml → requirement IDs for coverage check
|
|
25
24
|
```
|
|
26
25
|
|
|
27
|
-
This is critical — the verifier should assess the code with fresh eyes, not carry the executor's assumptions. Load must_haves from the plan files and verify against the actual codebase.
|
|
28
|
-
|
|
29
26
|
## 3-Level Goal-Backward Verification
|
|
30
27
|
|
|
31
28
|
### Level 1: Observable Truths
|
|
32
29
|
|
|
33
30
|
Read the plan's `must_haves.truths`. For each truth:
|
|
34
31
|
|
|
35
|
-
1. Design a test
|
|
32
|
+
1. Design a test that proves/disproves it
|
|
36
33
|
2. Run the test
|
|
37
|
-
3. Record
|
|
34
|
+
3. Record: **VERIFIED** | **FAILED** | **UNCERTAIN**
|
|
38
35
|
|
|
39
36
|
```markdown
|
|
40
37
|
| Truth | Test | Result |
|
|
41
38
|
|-------|------|--------|
|
|
42
|
-
| User can
|
|
43
|
-
|
|
|
44
|
-
| Profile photo uploads correctly | Upload image, check display | FAILED — 413 error on large files |
|
|
39
|
+
| User can edit their bio | Click edit, type, save, check persistence | VERIFIED |
|
|
40
|
+
| Profile photo uploads correctly | Upload image, check display | FAILED — 413 on large files |
|
|
45
41
|
```
|
|
46
42
|
|
|
47
43
|
### Level 2: Artifacts (Exists → Substantive → Wired)
|
|
48
44
|
|
|
49
|
-
Read the plan's `must_haves.artifacts`. For each
|
|
45
|
+
Read the plan's `must_haves.artifacts`. For each:
|
|
50
46
|
|
|
51
47
|
| Check | Question | Pass Criteria |
|
|
52
48
|
|-------|----------|---------------|
|
|
53
|
-
| **Exists** | Does the file exist? |
|
|
54
|
-
| **Substantive** |
|
|
55
|
-
| **Wired** |
|
|
49
|
+
| **Exists** | Does the file exist? | Present at specified path |
|
|
50
|
+
| **Substantive** | Real code, not a stub? | Exceeds min_lines, no placeholders, real logic |
|
|
51
|
+
| **Wired** | Imported and used? | Has importers, called in production paths |
|
|
56
52
|
|
|
57
53
|
```markdown
|
|
58
54
|
| Artifact | Exists | Substantive | Wired | Status |
|
|
59
55
|
|----------|--------|-------------|-------|--------|
|
|
60
56
|
| src/components/Profile.tsx | ✓ | ✓ (87 lines) | ✓ (imported in routes) | VERIFIED |
|
|
61
|
-
| src/api/users/[id].ts | ✓ | ✓ (43 lines) | ✓ (called by Profile) | VERIFIED |
|
|
62
57
|
| src/hooks/useProfile.ts | ✓ | ✗ (returns {}) | - | STUB |
|
|
63
58
|
```
|
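Reviewer note: the Exists, Substantive, Wired sequence above can be sketched as a small script. This is a minimal illustration only, not part of the package. The `MISSING` and `ORPHANED` labels, the placeholder markers, and the import-line heuristic are assumptions; the skill itself names only `VERIFIED` and `STUB`, and it also asks that the artifact be called in production paths, which this sketch does not check.

```python
import re
from pathlib import Path

# Assumed placeholder markers; the skill only says "no placeholders".
PLACEHOLDER = re.compile(r"TODO|FIXME|Not implemented|Placeholder")
SOURCE_SUFFIXES = {".ts", ".tsx", ".js", ".jsx"}


def check_artifact(root: Path, rel_path: str, min_lines: int) -> str:
    """Run Exists -> Substantive -> Wired and return a status label."""
    target = root / rel_path
    if not target.is_file():
        return "MISSING"  # hypothetical label, not from the skill

    text = target.read_text(encoding="utf-8")
    code_lines = [line for line in text.splitlines() if line.strip()]
    # Substantive: must exceed min_lines and contain no placeholder markers.
    if len(code_lines) <= min_lines or PLACEHOLDER.search(text):
        return "STUB"

    # Wired: another source file mentions the module name on an import line.
    importer = re.compile(
        r"\b(import|from|require)\b.*\b" + re.escape(target.stem) + r"\b"
    )
    for other in root.rglob("*"):
        if other == target or other.suffix not in SOURCE_SUFFIXES or not other.is_file():
            continue
        if importer.search(other.read_text(encoding="utf-8")):
            return "VERIFIED"
    return "ORPHANED"  # hypothetical label: exists, but nothing imports it
```

Checking existence first keeps the later checks cheap: a missing file never reaches the import scan.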
|
64
59
|
|
|
65
60
|
### Stub Detection Red Flags
|
|
66
61
|
|
|
67
|
-
|
|
68
|
-
|
|
69
|
-
**React components:**
|
|
70
|
-
- `<div>Component</div>` or `<div>Placeholder</div>`
|
|
71
|
-
- `onClick={() => {}}` (empty handler)
|
|
72
|
-
- `onChange={() => console.log()}` (logging-only handler)
|
|
73
|
-
- Hardcoded data instead of API calls
|
|
62
|
+
**React components:** `<div>Placeholder</div>`, `onClick={() => {}}`, `onChange={() => console.log()}`, hardcoded data instead of API calls
|
|
74
63
|
|
|
75
|
-
**API endpoints:**
|
|
76
|
-
- `return Response.json({ message: "Not implemented" })`
|
|
77
|
-
- Empty arrays without database queries
|
|
78
|
-
- TODO comments in response path
|
|
64
|
+
**API endpoints:** `return Response.json({ message: "Not implemented" })`, empty arrays without DB queries, TODO in response path
|
|
79
65
|
|
|
80
|
-
**Wiring:**
|
|
81
|
-
- `fetch()` without await/then
|
|
82
|
-
- Query without returning result
|
|
83
|
-
- Form with only `preventDefault`
|
|
66
|
+
**Wiring:** `fetch()` without await/then, query without returning result, form with only `preventDefault`
|
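Reviewer note: several of the red flags listed above are plain textual patterns, so a line-by-line regex scan can catch them. The sketch below is an assumption about how one might automate that; the label names and the exact regexes are illustrative, and flags such as "hardcoded data instead of API calls" or "`fetch()` without await/then" need real code analysis and are left out.

```python
import re

# Illustrative patterns for a subset of the listed red flags.
RED_FLAGS = {
    "placeholder markup": re.compile(r"<div>\s*(Placeholder|Component)\s*</div>"),
    "empty click handler": re.compile(r"onClick=\{\s*\(\)\s*=>\s*\{\s*\}\s*\}"),
    "logging-only handler": re.compile(r"onChange=\{\s*\(\)\s*=>\s*console\.log\("),
    "not-implemented response": re.compile(r"Not implemented"),
    "TODO marker": re.compile(r"\bTODO\b"),
}


def scan_for_stubs(source: str) -> list[tuple[int, str]]:
    """Return (line number, flag label) for every red flag found in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in RED_FLAGS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

A hit is a prompt to read the surrounding code, not proof of a stub; a `TODO` in a comment outside the response path, for example, would still be reported.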
|
84
67
|
|
|
85
68
|
### Level 3: Key Links
|
|
86
69
|
|
|
@@ -88,8 +71,8 @@ Read the plan's `must_haves.key_links`. Verify each connection:
|
|
|
88
71
|
|
|
89
72
|
| Pattern | How to Verify |
|
|
90
73
|
|---------|---------------|
|
|
91
|
-
| Component → API | Find fetch/axios call
|
|
92
|
-
| API → Database | Find ORM/query call
|
|
74
|
+
| Component → API | Find fetch/axios call pointing to API route |
|
|
75
|
+
| API → Database | Find ORM/query call returning data |
|
|
93
76
|
| Form → Handler | Find onSubmit with API call and response handling |
|
|
94
77
|
| State → Render | Find state variable used in JSX (not just declared) |
|
|
95
78
|
|
|
@@ -97,12 +80,11 @@ Read the plan's `must_haves.key_links`. Verify each connection:
|
|
|
97
80
|
| From | To | Via | Pattern | Status |
|
|
98
81
|
|------|----|-----|---------|--------|
|
|
99
82
|
| Profile.tsx | /api/users/[id] | fetch in useEffect | fetch.*api/users | VERIFIED |
|
|
100
|
-
| ProfileForm.tsx | /api/users/[id] | PUT on submit | method.*PUT | VERIFIED |
|
|
101
83
|
```
|
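Reviewer note: the Pattern column in the key-links table reads like a regular expression (`fetch.*api/users`), which suggests each link can be checked by searching the source file for that pattern. The following is a minimal sketch under that assumption; the `KeyLink` class and `verify_link` function are hypothetical names, and file contents are passed in as a dictionary to keep the example self-contained.

```python
import re
from dataclasses import dataclass


@dataclass
class KeyLink:
    source_file: str  # the "From" column
    target: str       # the "To" column
    pattern: str      # regex expected to appear in the source file


def verify_link(link: KeyLink, sources: dict[str, str]) -> str:
    """Return VERIFIED if the link's pattern occurs in its source file."""
    text = sources.get(link.source_file)
    if text is None:
        return "FAILED"  # the source file itself is missing
    return "VERIFIED" if re.search(link.pattern, text) else "FAILED"
```

A regex match only shows that the call is present; whether the response is actually used still needs the Level 1 behavioural test.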
|
102
84
|
|
|
103
85
|
## Anti-Pattern Scan
|
|
104
86
|
|
|
105
|
-
After 3-level verification
|
|
87
|
+
After 3-level verification:
|
|
106
88
|
|
|
107
89
|
- [ ] No `TODO` or `FIXME` in production code paths
|
|
108
90
|
- [ ] No `console.log` as error handling
|
|
@@ -112,27 +94,24 @@ After 3-level verification, scan for common issues:
|
|
|
112
94
|
|
|
113
95
|
## Requirements Coverage
|
|
114
96
|
|
|
115
|
-
Cross-reference
|
|
97
|
+
Cross-reference against `.forge/requirements.yml`:
|
|
116
98
|
|
|
117
99
|
```markdown
|
|
118
100
|
| Requirement | Status | Evidence |
|
|
119
101
|
|-------------|--------|----------|
|
|
120
102
|
| FR-001 | Verified | Profile renders, tests pass |
|
|
121
|
-
| FR-002 | Verified | Edit flow works end-to-end |
|
|
122
103
|
| FR-003 | Partial | Upload works for small files, fails > 5MB |
|
|
123
104
|
```
|
|
124
105
|
|
|
125
106
|
## Verdict
|
|
126
107
|
|
|
127
|
-
Based on all verification levels:
|
|
128
|
-
|
|
129
108
|
### PASSED
|
|
130
109
|
All truths verified, all artifacts substantive and wired, all key links connected, requirements covered.
|
|
131
|
-
→ Route to `
|
|
110
|
+
→ Route to `reviewing` skill for health audit + refactoring review.
|
|
132
111
|
|
|
133
112
|
### GAPS FOUND
|
|
134
113
|
Some truths failed or artifacts are stubs.
|
|
135
|
-
→ Document gaps
|
|
114
|
+
→ Document gaps:
|
|
136
115
|
|
|
137
116
|
```yaml
|
|
138
117
|
gaps:
|
|
@@ -150,8 +129,8 @@ gaps:
|
|
|
150
129
|
→ Return to `planning` skill in gap-closure mode.
|
|
151
130
|
|
|
152
131
|
### HUMAN VERIFICATION NEEDED
|
|
153
|
-
|
|
154
|
-
→ List
|
|
132
|
+
Items that can't be verified automatically (visual appearance, real-time behavior, external services).
|
|
133
|
+
→ List for manual check:
|
|
155
134
|
|
|
156
135
|
```markdown
|
|
157
136
|
## Human Verification Items
|
|
@@ -162,68 +141,61 @@ Some items can't be verified automatically (visual appearance, real-time behavio
|
|
|
162
141
|
|
|
163
142
|
## Re-Verification Mode
|
|
164
143
|
|
|
165
|
-
|
|
144
|
+
After gap closure:
|
|
166
145
|
1. Load previous verification results
|
|
167
146
|
2. Full 3-level check on previously failed items
|
|
168
|
-
3. Quick regression check on
|
|
169
|
-
4. Merge results
|
|
147
|
+
3. Quick regression check on passed items (exists + basic sanity)
|
|
148
|
+
4. Merge results, issue updated verdict
|
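Reviewer note: step 4 of re-verification (merge results, issue updated verdict) can be pictured as overlaying the re-checked items on the previous run. This sketch assumes results are kept as a mapping from item name to `VERIFIED`, `FAILED`, or `UNCERTAIN`; mapping `UNCERTAIN` to the human-verification verdict is an assumption, not something the skill states.

```python
def merge_results(
    previous: dict[str, str], rechecked: dict[str, str]
) -> tuple[dict[str, str], str]:
    """Overlay re-checked items on previous results and derive a verdict."""
    merged = {**previous, **rechecked}  # re-checked entries win
    statuses = set(merged.values())
    if "FAILED" in statuses:
        verdict = "GAPS FOUND"
    elif "UNCERTAIN" in statuses:
        verdict = "HUMAN VERIFICATION NEEDED"  # assumed mapping
    else:
        verdict = "PASSED"
    return merged, verdict
```

Because re-checked entries overwrite earlier ones, a regression found during the quick check on previously passed items flips the verdict back to `GAPS FOUND`.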
|
170
149
|
|
|
171
150
|
## Desire Paths Retrospective
|
|
172
151
|
|
|
173
|
-
After
|
|
152
|
+
After verification completes (PASSED or GAPS FOUND), run a quick retrospective on framework usage patterns. Update `.forge/state/index.yml → desire_paths` (global, not per-milestone).
|
|
174
153
|
|
|
175
154
|
### Collect Signals
|
|
176
155
|
|
|
177
|
-
|
|
156
|
+
**1. Deviation patterns**: Read `.forge/state/milestone-{id}.yml → deviations`. Repeating?
|
|
157
|
+
- Same Rule 1 fix in multiple places → plan template should include this check
|
|
158
|
+
- Same Rule 2 addition everywhere → constitution needs a new article
|
|
159
|
+
- Same Rule 3 issue → project setup is missing something
|
|
178
160
|
|
|
179
|
-
**
|
|
180
|
-
- Same Rule 1 fix in multiple places → maybe the plan template should include this check
|
|
181
|
-
- Same Rule 2 addition everywhere → maybe the constitution needs a new article
|
|
182
|
-
- Same Rule 3 issue → maybe the project setup is missing something
|
|
161
|
+
**2. Tier overrides**: User override tier detection? Log detected vs. chosen. Repeated overrides = wrong heuristics.
|
|
183
162
|
|
|
184
|
-
**
|
|
163
|
+
**3. Skipped steps**: User ask to skip workflow steps? Repeated skips = friction without value.
|
|
185
164
|
|
|
186
|
-
**
|
|
165
|
+
**4. Recurring friction**: Same problem from previous sessions? Check prior `desire_paths`. Increment counts.
|
|
187
166
|
|
|
188
|
-
**
|
|
167
|
+
**5. Agent struggles**: Agent need multiple attempts or human intervention? Log task type and failure pattern.
|
|
189
168
|
|
|
190
|
-
**
|
|
191
|
-
|
|
192
|
-
**6. User corrections**: Did the user correct the same thing multiple times? (e.g., "remember to use the Card component, not a div", "always add error boundaries"). These are implicit rules that should become explicit.
|
|
169
|
+
**6. User corrections**: User correct the same thing multiple times? Implicit rules that should become explicit.
|
|
193
170
|
|
|
194
171
|
### Surface Recommendations
|
|
195
172
|
|
|
196
|
-
When any pattern reaches **3+ occurrences**, surface it
|
|
197
|
-
|
|
198
|
-
*"I've noticed a recurring pattern: [{description}] has come up {N} times now. This suggests we should evolve the framework. Options:"*
|
|
173
|
+
When any pattern reaches **3+ occurrences**, surface it:
|
|
199
174
|
|
|
200
175
|
| Pattern Type | Suggested Evolution |
|
|
201
176
|
|-------------|-------------------|
|
|
202
|
-
| Repeated deviations | Add pre-check to planning
|
|
177
|
+
| Repeated deviations | Add pre-check to planning, or new constitutional article |
|
|
203
178
|
| Tier overrides | Adjust detection heuristics in forge skill |
|
|
204
|
-
| Skipped steps | Make step optional
|
|
205
|
-
| Recurring friction | Add
|
|
206
|
-
| Agent struggles | Add examples or anti-patterns to the skill
|
|
207
|
-
| User corrections | Add
|
|
179
|
+
| Skipped steps | Make step optional, or merge into another step |
|
|
180
|
+
| Recurring friction | Add guidance to relevant skill, or create a template |
|
|
181
|
+
| Agent struggles | Add examples or anti-patterns to the relevant skill |
|
|
182
|
+
| User corrections | Add rule to constitution, context.md, or relevant skill |
|
|
208
183
|
|
|
209
|
-
|
|
210
|
-
- *"
|
|
211
|
-
- *"
|
|
212
|
-
- *"
|
|
213
|
-
- *"Should I add a new constitutional article: 'Error Boundaries Required'?"*
|
|
184
|
+
Propose concrete actions:
|
|
185
|
+
- *"Add 'always use Card component for content containers' to design-system.md?"*
|
|
186
|
+
- *"Add null-check verification step to planning template?"*
|
|
187
|
+
- *"Make research phase optional for Standard tier in this project?"*
|
|
214
188
|
|
|
215
|
-
Only suggest
|
|
189
|
+
Only suggest at 3+ occurrences. One-off issues are noise.
|
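Reviewer note: the 3+ occurrence rule amounts to a running count per pattern with a fixed threshold. A minimal sketch, assuming desire paths are stored as a mapping from pattern description to count (the actual layout of `desire_paths` in `.forge/state/index.yml` is not shown in this diff):

```python
from collections import Counter
from typing import Optional

THRESHOLD = 3  # "Only suggest at 3+ occurrences"


def patterns_to_surface(
    signals: list[str], prior_counts: Optional[dict[str, int]] = None
) -> dict[str, int]:
    """Add this session's signals to prior counts; return those at 3+."""
    counts = Counter(prior_counts or {})
    counts.update(signals)  # each signal increments its pattern by one
    return {pattern: n for pattern, n in counts.items() if n >= THRESHOLD}
```

Carrying the prior counts forward is what lets a pattern seen once per session still reach the threshold over several sessions.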
|
216
190
|
|
|
217
191
|
## Phase Handoff
|
|
218
192
|
|
|
219
|
-
After
|
|
193
|
+
After PASSED verdict:
|
|
220
194
|
|
|
221
|
-
1. **
|
|
222
|
-
2. **Update state** — Set `current.status` to `
|
|
195
|
+
1. **Persist** — Confirm verification results documented, desire paths logged to `.forge/state/index.yml`
|
|
196
|
+
2. **Update state** — Set `current.status` to `reviewing` in `.forge/state/milestone-{id}.yml`
|
|
223
197
|
3. **Recommend context clear:**
|
|
224
198
|
|
|
225
|
-
*"Verification
|
|
226
|
-
|
|
227
|
-
*Ready to continue? Clear context and invoke `/forge` to resume."*
|
|
199
|
+
*"Verification passed. State written. `/clear` then `/forge` to continue with reviewing."*
|
|
228
200
|
|
|
229
|
-
|
|
201
|
+
If GAPS found, route back to planning in gap-closure mode. Context clear applies after re-verified PASSED verdict.
|
package/template/CLAUDE.md
CHANGED
|
@@ -29,11 +29,11 @@ Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
|
|
|
29
29
|
|
|
30
30
|
### Standard (hours)
|
|
31
31
|
**Triggers:** new feature, component, significant refactor, multi-file change
|
|
32
|
-
**Flow:** → `researching` → `discussing` → `planning` → `executing` → `verifying` → `
|
|
32
|
+
**Flow:** → `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing` → done
|
|
33
33
|
|
|
34
34
|
### Full (days)
|
|
35
35
|
**Triggers:** new project, major milestone, complex multi-system feature, architectural decisions needed
|
|
36
|
-
**Flow:** → `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `
|
|
36
|
+
**Flow:** → `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing` → done
|
|
37
37
|
**Optional additions:** `designing` (UI work), `securing` (auth/data/API), `debugging` (stuck on issue)
|
|
38
38
|
|
|
39
39
|
## Skill Routing
|
|
@@ -48,8 +48,7 @@ Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
|
|
|
48
48
|
| Break work into executable tasks with gates | `planning` | Standard, Full |
|
|
49
49
|
| Build code with deviation rules + atomic commits | `executing` | All |
|
|
50
50
|
| Prove work actually delivers on goals | `verifying` | Standard, Full |
|
|
51
|
-
| Audit
|
|
52
|
-
| Review refactoring opportunities after milestone audit | `refactoring` | Standard, Full |
|
|
51
|
+
| Audit health + catalog refactoring opportunities | `reviewing` | Standard, Full |
|
|
53
52
|
| Fix a small, scoped issue fast | `quick-tasking` | Quick |
|
|
54
53
|
| Build UI with design system consistency | `designing` | When UI involved |
|
|
55
54
|
| Review security before shipping | `securing` | When auth/data/API involved |
|
|
@@ -71,7 +70,7 @@ Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
|
|
|
71
70
|
When a task touches 20+ files or a complex subsystem, spawn a fresh executor agent with isolated context. This prevents context rot — the #1 cause of quality degradation in long sessions.
|
|
72
71
|
|
|
73
72
|
### Context Handoff Between Phases
|
|
74
|
-
Each phase writes its outputs to `.forge/` before completing. At every phase boundary (researching → discussing → planning → executing → verifying →
|
|
73
|
+
Each phase writes its outputs to `.forge/` before completing. At every phase boundary (researching → discussing → planning → executing → verifying → reviewing), the completing skill recommends clearing context (`/clear`) before the next phase begins. The next phase loads what it needs from disk. This is advisory — skip for short phases where context is under 40%. See the `forge` skill's "Context Handoff Protocol" for full details.
|
|
75
74
|
|
|
76
75
|
### Lazy Loading
|
|
77
76
|
Skills load only when invoked. CLAUDE.md stays in context; skill details load on demand. This keeps base context lean (~300 lines) while making full framework available.
|
|
@@ -84,9 +83,7 @@ Skills load only when invoked. CLAUDE.md stays in context; skill details load on
|
|
|
84
83
|
| `planner` | Planning with constitutional gates | Read + Write (plan files only) | Planning phases |
|
|
85
84
|
| `executor` | Building with deviation rules | All dev tools | Execution phases |
|
|
86
85
|
| `verifier` | Goal-backward verification | Read + Bash (test execution) | Verification phases |
|
|
87
|
-
| `
|
|
88
|
-
| `architecture-auditor` | Structural health assessor | Read, Grep, Glob | Auditing phase |
|
|
89
|
-
| `reviewer` | Security + code quality audit | Read-only + npm audit | Before shipping |
|
|
86
|
+
| `reviewer` | Security + architecture + refactoring audit | Read, Bash, Grep, Glob | Reviewing phase |
|
|
90
87
|
|
|
91
88
|
## Project Init (First Run)
|
|
92
89
|
|
|
@@ -124,7 +121,7 @@ Project state lives in `.forge/`:
|
|
|
124
121
|
- `state/milestone-{id}.yml` — Per-milestone cursor: current position, progress, decisions, blockers, deviations
|
|
125
122
|
- `context.md` — Locked user decisions + deferred ideas (created during discuss phase)
|
|
126
123
|
- `plan.md` — Per-phase task plans with must_haves frontmatter
|
|
127
|
-
- `refactor-backlog.yml` — Refactoring opportunities cataloged
|
|
124
|
+
- `refactor-backlog.yml` — Refactoring opportunities cataloged during milestone reviews, worked via quick-tasking
|
|
128
125
|
|
|
129
126
|
### Milestones
|
|
130
127
|
Milestones group phases into concurrent work streams. Each milestone has its own state file, so different sessions can work on different milestones without conflicts. On resume, Forge shows active milestones and asks which one to work on.
|
|
@@ -133,7 +130,7 @@ Milestones group phases into concurrent work streams. Each milestone has its own
|
|
|
133
130
|
YAML for anything agents parse programmatically (project, requirements, roadmap, state). Markdown for human-facing content (constitution, context, verification reports). Never free-form prose for machine state.
|
|
134
131
|
|
|
135
132
|
### Milestone Completion: Status vs. Percentage
|
|
136
|
-
**`current.status` is the authoritative workflow position.** A milestone is only complete when `current.status == complete`. The `progress.overall_percent` field measures task completion — not workflow completion. A milestone at 100% task completion still needs verifying
|
|
133
|
+
**`current.status` is the authoritative workflow position.** A milestone is only complete when `current.status == complete`. The `progress.overall_percent` field measures task completion — not workflow completion. A milestone at 100% task completion still needs verifying and reviewing before it is done. On resume, always check and display `current.status` to determine next steps.
|
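Reviewer note: the distinction between status and percentage is easy to get wrong in tooling that reads the state file. A minimal sketch, assuming the milestone YAML has already been parsed into a dictionary with `current.status` and `progress.overall_percent` keys as named above (the function name is hypothetical):

```python
def is_milestone_complete(state: dict) -> bool:
    """A milestone is complete only when current.status says so."""
    # progress.overall_percent is deliberately ignored: it tracks tasks,
    # not workflow position.
    return state.get("current", {}).get("status") == "complete"
```

For example, a state with `overall_percent: 100` and `status: reviewing` is still not complete under this check.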
|
137
134
|
|
|
138
135
|
## Deviation Rules (Executor Decision Tree)
|
|
139
136
|
|
|
package/template/.claude/skills/auditing/SKILL.md
DELETED
|
@@ -1,314 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: auditing
|
|
3
|
-
description: "Use after verifying passes to assess overall application health before milestone completion. Runs security audit (10 categories) and architecture audit (scaling, maintainability, code health). This is the pre-release gate — it answers 'is this codebase healthy enough to ship?'"
|
|
4
|
-
---
|
|
5
|
-
|
|
6
|
-
# Auditing: Health Audit Before Milestone Completion
|
|
7
|
-
|
|
8
|
-
You are the pre-release gate. After `verifying` confirms the work delivers what was promised, you assess whether the codebase is healthy enough to ship. Two parallel audits — security and architecture — produce a structured health report that determines whether the milestone can complete.
|
|
9
|
-
|
|
10
|
-
## When to Trigger
|
|
11
|
-
|
|
12
|
-
- **Automatically** after `verifying` returns a PASSED verdict (Standard and Full tiers)
|
|
13
|
-
- **On-demand** at any time via user request
|
|
14
|
-
|
|
15
|
-
## Process Overview
|
|
16
|
-
|
|
17
|
-
1. Read project context (`.forge/project.yml`) to determine tech stack
|
|
18
|
-
2. Scope the audit — glob all source files, summarize what will be scanned
|
|
19
|
-
3. Spawn two parallel subagents: Security Audit + Architecture Audit
|
|
20
|
-
4. Collect results, score per-category, determine overall status
|
|
21
|
-
5. Write health report to `.forge/audits/milestone-{id}-health-report.md`
|
|
22
|
-
6. Route based on results: healthy → complete, issues → user decides
|
|
23
|
-
|
|
24
|
-
## Step 1: Read Context
|
|
25
|
-
|
|
26
|
-
```
|
|
27
|
-
Read: .forge/project.yml → tech stack, framework, database, dependencies
|
|
28
|
-
Read: .forge/state/milestone-{id}.yml → milestone ID and name
|
|
29
|
-
Read: .forge/constitution.md → active architectural gates (if exists)
|
|
30
|
-
```
|
|
31
|
-
|
|
32
|
-
Determine which security categories apply based on the tech stack. For example:
|
|
33
|
-
- No database → SQL/NoSQL Injection is N/A
|
|
34
|
-
- No frontend → XSS Prevention is N/A
|
|
35
|
-
- No CI/CD config → Pipeline Security is N/A
|
|
36
|
-
|
|
37
|
-
## Step 2: Scope the Audit
|
|
38
|
-
|
|
39
|
-
```
|
|
40
|
-
Glob: src/**/*.{ts,tsx,js,jsx,py,go,rs,java} (adapt to project language)
|
|
41
|
-
Glob: **/*.env*, **/docker-compose*, **/.github/workflows/*
|
|
42
|
-
Glob: **/next.config*, **/vite.config*, **/webpack.config*
|
|
43
|
-
```
|
|
44
|
-
|
|
45
|
-
Present scope summary to user:
|
|
46
|
-
*"Health audit scope: {N} source files, {N} config files. Scanning for security vulnerabilities (10 categories) and architectural health (4 dimensions). This will take a moment."*
|
|
47
|
-
|
|
48
|
-
Build explicit file lists for each subagent — pass file paths, not globs, so nothing is missed.
|
|
49
|
-
|
|
50
|
-
## Step 3: Spawn Parallel Audits
|
|
51
|
-
|
|
52
|
-
Spawn both audits as fresh-context subagents. Each receives:
|
|
53
|
-
- The explicit file list for their scope
|
|
54
|
-
- The tech stack from `project.yml`
|
|
55
|
-
- Their specific audit instructions (below)
|
|
56
|
-
|
|
57
|
-
### Part 1: Security Audit (subagent)
|
|
58
|
-
|
|
59
|
-
Spawn a security auditor agent with a fresh context window.
|
|
60
|
-
|
|
61
|
-
**10 Security Categories:**
|
|
62
|
-
|
|
63
|
-
| # | Category | What It Checks |
|
|
64
|
-
|---|----------|---------------|
|
|
65
|
-
| 1 | Authentication & Authorization | Every endpoint has auth middleware; role checks before data access |
|
|
66
|
-
| 2 | Data Scoping / Tenant Isolation | Queries scoped to correct user/tenant; no cross-tenant data leaks |
|
|
67
|
-
| 3 | Input Validation | Request bodies/params validated before use in queries or logic |
|
|
68
|
-
| 4 | Error Information Leakage | No stack traces, DB schemas, or internal details in API responses |
|
|
69
|
-
| 5 | XSS Prevention | No unsanitized user content injected into DOM |
|
|
70
|
-
| 6 | SQL/NoSQL Injection | All queries use parameterized placeholders, no string interpolation |
|
|
71
|
-
| 7 | Secrets Management | No hardcoded keys/tokens; `.env` in `.gitignore`; `process.env` usage |
|
|
72
|
-
| 8 | CORS Policy | No wildcard `*` origins in production; appropriate method restrictions |
|
|
73
|
-
| 9 | HTTP Security Headers | CSP, X-Frame-Options, HSTS, X-Content-Type-Options, Referrer-Policy |
|
|
74
|
-
| 10 | CI/CD Pipeline Security | Secrets via secrets context, not hardcoded in workflow files |
|
|
75
|
-
|
|
76
|
-
**Agent behavior rules:**
|
|
77
|
-
- Read every file in the provided list. No sampling or skipping.
|
|
78
|
-
- Every finding must have: file path, line number, what's wrong, severity, remediation.
|
|
79
|
-
- Understand context before flagging — read surrounding code, check for middleware, wrappers, and higher-order protections.
|
|
80
|
-
- Document intentionally public endpoints; don't flag them as vulnerabilities.
|
|
81
|
-
- Severity is firm: `critical` = exploitable vulnerability, `warning` = defense-in-depth gap, `info` = observation.
|
|
82
|
-
- Prefer false negatives over false positives — only flag what you're confident about.
|
|
83
|
-
- Categories that don't apply to this project's stack → mark as N/A with brief explanation.
|
|
84
|
-
|
|
85
|
-
**Project adaptation:** Adapt checks to the detected stack:
|
|
86
|
-
- Express vs Next.js vs Fastify endpoint patterns
|
|
87
|
-
- PostgreSQL vs MongoDB vs SQLite query patterns
|
|
88
|
-
- GitHub Actions vs GitLab CI vs other CI systems
|
|
89
|
-
- React vs Vue vs Svelte frontend patterns
|
|
90
|
-
|
|
91
|
-
**Output format** (return to orchestrator):
|
|
92
|
-
|
|
93
|
-
```yaml
|
|
94
|
-
security_audit:
|
|
95
|
-
files_scanned: N
|
|
96
|
-
categories:
|
|
97
|
-
- id: 1
|
|
98
|
-
name: "Authentication & Authorization"
|
|
99
|
-
status: passed | warning | critical | na
|
|
100
|
-
findings:
|
|
101
|
-
- file: "src/api/users.ts"
|
|
102
|
-
line: 42
|
|
103
|
-
severity: critical | warning | info
|
|
104
|
-
issue: "Description of what's wrong"
|
|
105
|
-
remediation: "How to fix it"
|
|
106
|
-
notes: "Optional context about intentional decisions"
|
|
107
|
-
```
|
|
108
|
-
|
|
109
|
-
### Part 2: Architecture Audit (subagent)
|
|
110
|
-
|
|
111
|
-
Spawn an architecture auditor agent with a fresh context window.
|
|
112
|
-
|
|
113
|
-
**4 Architecture Dimensions:**
|
|
114
|
-
|
|
115
|
-
| Dimension | What It Checks |
|
|
116
|
-
|-----------|---------------|
|
|
117
|
-
| **Scalability** | Synchronous blocking calls, missing pagination, unbounded queries, N+1 query patterns, missing caching opportunities, single points of failure, hardcoded limits |
|
|
118
|
-
| **Maintainability** | Code complexity hotspots (files >300 lines, deeply nested logic >4 levels, god components/classes), circular dependencies, duplicated logic that warrants abstraction |
|
|
119
|
-
| **Code Health** | Dead code / unused exports, TODO/FIXME inventory with age, test coverage gaps (untested critical paths), stale/vulnerable dependencies |
|
|
120
|
-
| **Structural Quality** | Separation of concerns violations (business logic in UI layer), inconsistent patterns across similar features, missing error boundaries, API contract consistency |
|
|
121
|
-
|
|
122
|
-
**Agent behavior rules:**
|
|
123
|
-
- Check actual code, not theoretical concerns.
|
|
124
|
-
- Every finding references specific files with evidence.
|
|
125
|
-
- Severity: `critical` = architectural debt that will cause production issues or block future work, `warning` = quality concern worth addressing, `info` = improvement opportunity.
|
|
126
|
-
- Respect existing ADRs in `.forge/decisions/` — don't flag intentional architectural choices as issues.
|
|
127
|
-
- Respect constitutional articles in `.forge/constitution.md` — if the constitution permits a pattern, don't flag it.
|
|
128
|
-
|
|
129
|
-
**Output format** (return to orchestrator):
|
|
130
|
-
|
|
131
|
-
```yaml
|
|
132
|
-
architecture_audit:
|
|
133
|
-
files_scanned: N
|
|
134
|
-
dimensions:
|
|
135
|
-
- name: "Scalability"
|
|
136
|
-
status: passed | warning | critical
|
|
137
|
-
findings:
|
|
138
|
-
- file: "src/api/products.ts"
|
|
139
|
-
line: 87
|
|
140
|
-
severity: critical | warning | info
|
|
141
|
-
issue: "Unbounded query with no pagination"
|
|
142
|
-
remediation: "Add limit/offset parameters"
|
|
143
|
-
- name: "Maintainability"
|
|
144
|
-
status: passed | warning | critical
|
|
145
|
-
findings: []
|
|
146
|
-
- name: "Code Health"
|
|
147
|
-
status: passed | warning | critical
|
|
148
|
-
findings: []
|
|
149
|
-
- name: "Structural Quality"
|
|
150
|
-
status: passed | warning | critical
|
|
151
|
-
findings: []
|
|
152
|
-
```
|
|
153
|
-
|
|
154
|
-
## Step 4: Score Results
|
|
155
|
-
|
|
156
|
-
After both subagents return, compute scores.
|
|
157
|
-
|
|
158
|
-
**Per-category scoring:**
|
|
159
|
-
|
|
160
|
-
| Status | Meaning |
|
|
161
|
-
|--------|---------|
|
|
162
|
-
| `passed` | No issues found |
|
|
163
|
-
| `warning` | Non-critical issues (info-level also maps here) |
|
|
164
|
-
| `critical` | Real vulnerabilities or architectural blockers |
|
|
165
|
-
| `na` | Category doesn't apply to this project |
|
|
166
|
-
|
|
167
|
-
**Overall status:**
|
|
168
|
-
|
|
169
|
-
| Overall | Condition |
|
|
170
|
-
|---------|-----------|
|
|
171
|
-
| `passed` | ALL categories and dimensions passed or N/A |
|
|
172
|
-
| `warnings_only` | One or more warnings, zero critical |
|
|
173
|
-
| `issues_found` | One or more critical findings |
|
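Reviewer note: the overall-status table in this removed skill describes a simple precedence rule, with critical outranking warning and warning outranking passed. It can be sketched as below; the function name is hypothetical, and whether the new `reviewing` skill keeps the same rule is not visible in this part of the diff.

```python
def overall_status(statuses: list[str]) -> str:
    """Collapse per-category statuses into the overall audit status."""
    if "critical" in statuses:
        return "issues_found"
    if "warning" in statuses:
        return "warnings_only"
    return "passed"  # everything is "passed" or "na"
```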
|
174
|
-
|
|
175
|
-
## Step 5: Write Health Report
|
|
176
|
-
|
|
177
|
-
Create `.forge/audits/` directory if needed. Write to `.forge/audits/milestone-{id}-health-report.md`.
|
|
178
|
-
|
|
179
|
-
**YAML frontmatter:**
|
|
180
|
-
|
|
181
|
-
```yaml
|
|
182
|
-
---
|
|
183
|
-
milestone_id: {id}
|
|
184
|
-
milestone_name: "{name}"
|
|
185
|
-
audited: "{ISO 8601 timestamp}"
|
|
186
|
-
status: passed | warnings_only | issues_found
|
|
187
|
-
security:
|
|
188
|
-
status: passed | warnings_only | issues_found
|
|
189
|
-
categories_passed: N
|
|
190
|
-
categories_warning: N
|
|
191
|
-
categories_critical: N
|
|
192
|
-
categories_na: N
|
|
193
|
-
architecture:
|
|
194
|
-
status: passed | warnings_only | issues_found
|
|
195
|
-
scalability: passed | warning | critical
|
|
196
|
-
maintainability: passed | warning | critical
|
|
197
|
-
code_health: passed | warning | critical
|
|
198
|
-
structural_quality: passed | warning | critical
|
|
199
|
-
total_files_scanned: N
|
|
200
|
-
---
|
|
201
|
-
```
|
|
202
|
-
|
**Body structure:**

```markdown
# Health Audit Report: {milestone name}

## Executive Summary
{1-3 sentences: overall health assessment, key findings, recommendation}

## Security Findings

### Category 1: Authentication & Authorization — {STATUS}
| File | Line | Severity | Issue | Remediation |
|------|------|----------|-------|-------------|
| ... | ... | ... | ... | ... |

{Repeat for each category. N/A categories get a single line: "N/A — {reason}"}

## Architecture Findings

### Scalability — {STATUS}
| File | Line | Severity | Issue | Remediation |
|------|------|----------|-------|-------------|
| ... | ... | ... | ... | ... |

{Repeat for each dimension}

## Public Endpoints
{List of intentionally public endpoints documented during security audit}

## Files Scanned
{Count and list of all files scanned across both audits}
```

**Health trend tracking:** If a previous audit exists for an earlier milestone (check `.forge/audits/` for prior reports), compare results and note improvements or regressions in the executive summary.

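As an illustration of the trend comparison, the sketch below takes the architecture dimensions from two reports as plain dictionaries and labels each dimension. It assumes the frontmatter has already been parsed elsewhere; `compare_dimensions` and the `improved` / `regressed` / `unchanged` labels are hypothetical names, not part of the skill.

```python
# Lower rank is healthier; values mirror the frontmatter's passed | warning | critical.
RANK = {"passed": 0, "warning": 1, "critical": 2}


def compare_dimensions(previous: dict, current: dict) -> dict:
    """Label each dimension found in both reports as improved, regressed, or unchanged."""
    trend = {}
    for name, now in current.items():
        before = previous.get(name)
        if before not in RANK or now not in RANK:
            # Skip dimensions missing from the earlier report or with unknown values.
            continue
        if RANK[now] < RANK[before]:
            trend[name] = "improved"
        elif RANK[now] > RANK[before]:
            trend[name] = "regressed"
        else:
            trend[name] = "unchanged"
    return trend
```

The resulting labels could feed a single sentence in the executive summary, such as naming the dimensions that regressed since the previous milestone.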
## Step 6: Route Based on Results

### HEALTHY (all passed)

Update `.forge/state/milestone-{id}.yml`:
- Set `current.status` to `refactoring`

Present to user:
*"Health audit passed. No security vulnerabilities or architectural concerns found. Moving to refactoring review."*

→ Route to `refactoring` skill.

### NEEDS ATTENTION (critical issues found)

Do NOT mark milestone complete. Present to user:

*"Health audit found critical issues that should be addressed before shipping:"*

Inline the top 3 findings per critical category so the user sees them immediately (don't make them open the report).

Then offer choices:

*"Options:"*
- **A. Fix critical issues:** return to `planning` in fix mode with findings as requirements
- **B. Accept risk and continue:** document accepted risks in report, proceed to refactoring review

If user chooses A:
- Create fix requirements from critical findings
- Route to `planning` skill in fix mode
- After fix execution, re-run `auditing` (not the full `verifying` pass, just the audit)

If user chooses B:
- Append "Accepted Risks" section to the health report with user's acknowledgment
- Update `.forge/state/milestone-{id}.yml`: set `current.status` to `refactoring`
- → Route to `refactoring` skill.

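The "top 3 findings per critical category" selection can be sketched as follows. This is one interpretation under stated assumptions: a category counts as critical when it has at least one `critical` finding, findings are dictionaries with hypothetical `category` and `severity` keys, and severities are limited to `critical` and `warning`.

```python
from collections import defaultdict

# Assumed severity values; lower rank sorts first.
SEVERITY_RANK = {"critical": 0, "warning": 1}


def top_findings(findings, limit=3):
    """Return up to `limit` findings for each category that has a critical finding."""
    by_category = defaultdict(list)
    for finding in findings:
        by_category[finding["category"]].append(finding)

    top = {}
    for category, items in by_category.items():
        if any(f["severity"] == "critical" for f in items):
            # sorted() is stable, so findings of equal severity keep report order.
            ranked = sorted(
                items,
                key=lambda f: SEVERITY_RANK.get(f["severity"], len(SEVERITY_RANK)),
            )
            top[category] = ranked[:limit]
    return top
```

Categories with only warnings are left out, so the inline summary stays focused on what blocks shipping.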
### ACCEPTABLE WITH CAVEATS (warnings only)

Present to user:

*"Health audit passed with warnings: no critical issues, but {N} items worth noting. See the full report at `.forge/audits/milestone-{id}-health-report.md`."*

Then offer choices:
- **A. Continue to refactoring review:** accept warnings as known items
- **B. Fix warnings:** address before continuing

If user chooses A:
- Document accepted warnings in report
- Update `.forge/state/milestone-{id}.yml`: set `current.status` to `refactoring`
- → Route to `refactoring` skill.

If user chooses B:
- Create fix requirements from warning findings
- Route to `planning` in fix mode
- After fix execution, re-run `auditing`

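The three routes in Step 6 can be summarised as a small decision function. This is a sketch, not the skill's implementation: `next_skill` is a hypothetical name, and it assumes the frontmatter `status` values map onto the three headings (`passed` to HEALTHY, `issues_found` to NEEDS ATTENTION, `warnings_only` to ACCEPTABLE WITH CAVEATS).

```python
from typing import Optional


def next_skill(status: str, choice: Optional[str] = None) -> str:
    """Return the skill to route to after the audit.

    Note that the meaning of A and B differs between the two interactive cases:
    for issues_found, A means fix; for warnings_only, A means continue.
    """
    if status == "passed":
        return "refactoring"
    if status not in ("issues_found", "warnings_only"):
        raise ValueError(f"unknown status: {status!r}")
    if choice not in ("A", "B"):
        raise ValueError("this status needs a user choice of 'A' or 'B'")
    if status == "issues_found":
        # A: fix critical issues; B: accept risk and continue.
        return "planning" if choice == "A" else "refactoring"
    # warnings_only: A continues to refactoring; B fixes the warnings first.
    return "refactoring" if choice == "A" else "planning"
```

A `planning` result implies fix mode, followed by a re-run of `auditing` once the fixes have been executed.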
## Gate Type: Soft Gate

This is a soft gate: fixing critical issues before completion is strongly recommended, but the user can accept the risk and proceed. Rationale:
- Some issues may be acceptable known risks for the deployment context
- Some findings may be false positives despite the conservative flagging approach
- Non-production or internal tools may have different risk tolerances
- The user always has final authority over ship decisions

The report documents the decision either way, creating an audit trail.

## Phase Handoff

After auditing routes to refactoring (all three paths: HEALTHY, accepted risk, accepted warnings):

1. **Verify persistence:** Confirm the health report is written to `.forge/audits/milestone-{id}-health-report.md`
2. **Update state:** Set `current.status` to `refactoring` in `.forge/state/milestone-{id}.yml`
3. **Recommend context clear:**

*"Health audit complete. Report written to `.forge/audits/`. I recommend clearing context (`/clear`) before the refactoring review. The refactoring scanner spawns a fresh agent with the git diff and health report, so a clean context ensures accurate scanning.*

*Ready to continue? Clear context and invoke `/forge` to resume."*
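The first two handoff steps can be sketched as a single check-then-update function. It assumes the milestone state has already been loaded into a dictionary (reading and writing the YAML file is left out, since the Python standard library has no YAML parser); `hand_off`, `root`, and the in-memory `state` shape are hypothetical.

```python
from pathlib import Path


def hand_off(root: Path, milestone_id: str, state: dict) -> dict:
    """Verify the report exists, then return a copy of state with status 'refactoring'."""
    report = root / ".forge" / "audits" / f"milestone-{milestone_id}-health-report.md"
    if not report.is_file():
        # Refuse to advance the milestone if the report was never persisted.
        raise FileNotFoundError(f"health report missing: {report}")

    updated = dict(state)
    # Preserve any other keys under `current`; only the status changes.
    updated["current"] = {**state.get("current", {}), "status": "refactoring"}
    return updated
```

Returning a new dictionary instead of mutating the input keeps the previous state available if the write back to `.forge/state/milestone-{id}.yml` fails.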