forge-orkes 0.3.8 → 0.3.9
- package/package.json +1 -1
- package/template/.claude/skills/forge/SKILL.md +12 -15
- package/template/.claude/skills/reviewing/SKILL.md +437 -0
- package/template/.claude/skills/verifying/SKILL.md +3 -3
- package/template/CLAUDE.md +7 -10
- package/template/.claude/skills/auditing/SKILL.md +0 -314
- package/template/.claude/skills/refactoring/SKILL.md +0 -168
package/package.json
CHANGED
package/template/.claude/skills/forge/SKILL.md
CHANGED
@@ -29,7 +29,7 @@ Check for state files in this order:
 4. **If one milestone:** Auto-select it. Inform user: *"Resuming milestone: [{name}] — status: {current.status}, tasks: {overall_percent}%"*
 5. **If no active milestones:** Proceed to init or ask user to create one.
 6. Load the selected milestone's state file (`.forge/state/milestone-{id}.yml`)
-7. **Route based on `current.status`, NOT on `overall_percent`.** The `current.status` field is the authoritative workflow position. A milestone is only complete when `current.status` equals `complete`. Even if `overall_percent` is 100%, the milestone still needs to go through
+7. **Route based on `current.status`, NOT on `overall_percent`.** The `current.status` field is the authoritative workflow position. A milestone is only complete when `current.status` equals `complete`. Even if `overall_percent` is 100%, the milestone still needs to go through remaining workflow steps (verifying, reviewing) before it is truly done.
 8. Report position briefly, then **immediately route to the next skill** (see Step 3: Mandatory Auto-Routing):
 - **Workflow status** (`current.status`) — this is the primary indicator of where you are
 - Phase progress using precise terminology: "Executed" (code done, not verified), "Verified", "Pending", "In progress" — **never say "Complete" for a phase that hasn't been verified**
@@ -85,7 +85,7 @@ Match ANY:
 - Integration with external service
 - Estimated 1-8 hours of work

-→ Route through: `researching` → `discussing` → `planning` → `executing` → `verifying` → `
+→ Route through: `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing`

 ### Full Tier
 Match ANY:
@@ -96,7 +96,7 @@ Match ANY:
 - Estimated days of work
 - User says "full", "complex", "project", "milestone"

-→ Route through: `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `
+→ Route through: `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing`
 → Add `designing` if UI work involved
 → Add `securing` if auth/data/API touched

@@ -140,12 +140,11 @@ This is a **briefing, not a prompt** — the user sees where they are and what's
 | `discussing` | Invoke `Skill(discussing)`, then → `planning` (or `architecting` for Full) |
 | `planning` | Invoke `Skill(planning)`, then → `executing` |
 | `executing` | Invoke `Skill(executing)`, then → `verifying` |
-| `verifying` | Invoke `Skill(verifying)`, then → `
-| `
-| `refactoring` | Invoke `Skill(refactoring)`, then → `complete` |
+| `verifying` | Invoke `Skill(verifying)`, then → `reviewing` |
+| `reviewing` | Invoke `Skill(reviewing)`, then → `complete` |
 | `complete` | Milestone is done. Ask user what's next. |

-- **Never treat a milestone as complete just because `overall_percent` is 100%.** Task completion and workflow completion are different. All planned tasks being done (100%) means execution is finished — verification
+- **Never treat a milestone as complete just because `overall_percent` is 100%.** Task completion and workflow completion are different. All planned tasks being done (100%) means execution is finished — verification and reviewing still need to run.
 - Skip completed phases (phases before `current.status`)
 - Resume from current phase

@@ -154,7 +153,7 @@ This is a **briefing, not a prompt** — the user sees where they are and what's
 Sometimes a session ends before the executing skill advances `current.status`. On resume, detect and fix this:

 - **If `current.status == executing`**: Check if all phases in the roadmap have been executed (all plans completed, commits made). If YES → advance `current.status` to `verifying` in the state file, then route to verifying. If NO → route to executing for the next unexecuted phase.
-- **If `current.status == verifying`**: Check if verification report exists. If YES and it passed → advance to `
+- **If `current.status == verifying`**: Check if verification report exists. If YES and it passed → advance to `reviewing`. If NO → route to verifying.
 - **General rule**: If the work for the current status is done but the status wasn't advanced (session ended mid-handoff), advance it now and route to the next skill.

 ### Phase Status Wording
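The routing table and stale-status rules in the two hunks above can be read as a small state machine. The following is a minimal sketch of that reading, not code from the package: the names `NEXT_STATUS`, `is_complete`, and `reconcile_status` and the `work_done` flag are hypothetical, and only the status names come from the diff.

```python
# Hypothetical sketch of the 0.3.9 routing rules; not code shipped in forge-orkes.
# Only the tail of the workflow shown in the diff is modelled here.
NEXT_STATUS = {
    "planning": "executing",
    "executing": "verifying",
    "verifying": "reviewing",
    "reviewing": "complete",
}


def is_complete(current_status: str, overall_percent: int) -> bool:
    """A milestone is complete only when current.status says so."""
    # overall_percent is deliberately ignored: 100% means execution finished,
    # not that verifying and reviewing have run.
    return current_status == "complete"


def reconcile_status(current_status: str, work_done: bool) -> str:
    """Advance a stale status left behind when a session ended mid-handoff."""
    if work_done and current_status in NEXT_STATUS:
        return NEXT_STATUS[current_status]
    return current_status
```

Under these assumptions, a milestone at `verifying` with a passed report moves to `reviewing`, and a milestone at 100% task completion is still not treated as complete.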
@@ -168,7 +167,7 @@ When reporting phase progress on resume, use precise terminology to avoid confus
 | Not yet started | **"Pending"** | No work done yet |
 | Currently in progress | **"In progress"** | Partially executed |

-**Never say a phase is "Complete" unless it has passed through the full workflow** (executed + verified +
+**Never say a phase is "Complete" unless it has passed through the full workflow** (executed + verified + reviewed). Use "Executed" for phases where code is done but verification hasn't run. This prevents users from thinking a phase is fully done when it still needs verification.

 ### On-Demand Discussion

@@ -184,8 +183,7 @@ While working at any tier, if you encounter:
 | Missing validation/error handling/null checks | Auto-add, document | Rule 2 |
 | Broken import/dep/config blocking progress | Auto-fix, document | Rule 3 |
 | Need new DB table, service layer, library swap | **STOP. Ask user.** | Rule 4 |
-| After verifying passes | Run health audit
-| After auditing passes | Review refactoring opportunities | `refactoring` |
+| After verifying passes | Run health audit + refactoring review | `reviewing` |

 When uncertain → Rule 4 (ask). Never silently make architectural decisions.

@@ -202,7 +200,7 @@ Each phase produces persistent artifacts (state files, plans, reports, backlogs)
 Recommend clearing context at every phase boundary in Standard and Full tiers:

 ```
-researching → [clear] → discussing → [clear] → architecting → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] →
+researching → [clear] → discussing → [clear] → architecting → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] → reviewing
 ```

 **Skip the recommendation when:**
@@ -233,8 +231,7 @@ Each skill ends with a standard handoff message. The pattern is:
 | architecting | ADRs in `.forge/decisions/`, data models, API contracts | planning reads decisions |
 | planning | Plans in `.forge/phases/m{M}-{N}-{name}/`, requirements.yml, roadmap.yml, context.md | executing reads plans |
 | executing | Committed code, execution summary, milestone state updated | verifying reads must_haves from plans |
-| verifying | Verification report, desire paths updated |
-| auditing | Health report in `.forge/audits/` | refactoring reads health report + git diff |
+| verifying | Verification report, desire paths updated | reviewing reads project.yml + source files + git diff |

 ### Context Loading on Resume

@@ -243,7 +240,7 @@ When a skill starts after a context clear, it must load its required state from
 ## State Transitions

 ```
-not_started → [init if new] → researching → [clear] → discussing → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] →
+not_started → [init if new] → researching → [clear] → discussing → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] → reviewing → complete
 ↗ debugging (if stuck)
 ↗ designing (if UI)
 ↗ securing (if auth/data)
package/template/.claude/skills/reviewing/SKILL.md
ADDED
@@ -0,0 +1,437 @@
+---
+name: reviewing
+description: "Use after verifying passes to assess codebase health and catalog improvement opportunities. Combines security audit (10 categories), architecture audit (4 dimensions), and refactoring scan (6 categories) into a single review pass. This is the pre-completion gate — it answers 'is this healthy enough to ship, and what could be better?'"
+---
+
+# Reviewing: Health Audit + Refactoring Review
+
+You are the pre-completion gate. After `verifying` confirms the work delivers what was promised, you assess codebase health AND catalog improvement opportunities in a single review pass. Three parallel scans — security, architecture, and refactoring — produce a structured report that determines whether the milestone can complete.
+
+## When to Trigger
+
+- **Automatically** after `verifying` returns a PASSED verdict (Standard and Full tiers)
+- **On-demand** at any time via user request
+
+## Process Overview
+
+1. Read project context (`.forge/project.yml`) to determine tech stack
+2. Scope the review — glob all source files, determine milestone diff
+3. Spawn three parallel subagents: Security Audit + Architecture Audit + Refactoring Scan
+4. Collect results, score per-category, determine overall status
+5. Write health report to `.forge/audits/milestone-{id}-health-report.md`
+6. Write accepted refactoring items to `.forge/refactor-backlog.yml`
+7. Route based on results: healthy → complete, critical issues → user decides
+
+## Step 1: Read Context
+
+```
+Read: .forge/project.yml → tech stack, framework, database, dependencies
+Read: .forge/state/milestone-{id}.yml → milestone ID and name
+Read: .forge/constitution.md → active architectural gates (if exists)
+Read: .forge/refactor-backlog.yml → existing backlog items (if any)
+```
+
+Determine which security categories apply based on the tech stack. For example:
+- No database → SQL/NoSQL Injection is N/A
+- No frontend → XSS Prevention is N/A
+- No CI/CD config → Pipeline Security is N/A
+
+Determine the milestone's starting point for the git diff (for refactoring scan):
+- Check git log for the commit tagged or noted as the milestone start
+- If unavailable, use the first commit after the previous milestone's completion date
+- Fallback: ask the user for the starting commit or branch
+
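The category-applicability examples in Step 1 above amount to a lookup from stack traits to skipped categories. Here is a minimal sketch under stated assumptions: the trait keys (`has_database`, `has_frontend`, `has_ci_config`) and the function name are hypothetical and do not come from `project.yml`; only the three example rules are taken from the skill text.

```python
from typing import Dict

# (trait key, security category, reason when the trait is absent).
# The trait keys are illustrative; the skill text does not define them.
NA_RULES = [
    ("has_database", "SQL/NoSQL Injection", "no database"),
    ("has_frontend", "XSS Prevention", "no frontend"),
    ("has_ci_config", "CI/CD Pipeline Security", "no CI/CD config"),
]


def not_applicable(stack: Dict[str, bool]) -> Dict[str, str]:
    """Return {category: reason} for categories that can be marked N/A."""
    skipped = {}
    for trait, category, reason in NA_RULES:
        # An unknown trait defaults to True so the category stays in scope.
        if not stack.get(trait, True):
            skipped[category] = reason
    return skipped
```

Defaulting unknown traits to "present" is a design choice of this sketch: it keeps a category in scope unless the stack clearly rules it out.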
+## Step 2: Scope the Review
+
+```
+Glob: src/**/*.{ts,tsx,js,jsx,py,go,rs,java} (adapt to project language)
+Glob: **/*.env*, **/docker-compose*, **/.github/workflows/*
+Glob: **/next.config*, **/vite.config*, **/webpack.config*
+```
+
+Also get the diff file list for the refactoring scan:
+```
+git diff --name-only {milestone_start}..HEAD
+```
+
+Present scope summary to user:
+*"Review scope: {N} source files, {N} config files, {N} files changed in this milestone. Scanning security (10 categories), architecture (4 dimensions), and refactoring opportunities (6 categories). This will take a moment."*
+
+Build explicit file lists for each subagent — pass file paths, not globs, so nothing is missed.
+
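Step 2 above asks for explicit file paths rather than globs. One way to build such a list is sketched below, assuming the patterns are expanded with Python's `pathlib`; the function name `expand_globs` is hypothetical. Note that `pathlib` does not expand brace sets such as `{ts,tsx}`, so this sketch expects one pattern per extension.

```python
from pathlib import Path
from typing import Iterable, List


def expand_globs(root: Path, patterns: Iterable[str]) -> List[str]:
    """Expand glob patterns into a sorted, de-duplicated list of file paths."""
    found = set()
    for pattern in patterns:
        for path in root.glob(pattern):
            # Keep regular files only; directories can also match a pattern.
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    return sorted(found)
```

A call such as `expand_globs(Path("."), ["src/**/*.ts", "src/**/*.py"])` would give each subagent a fixed list of relative paths to read.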
+## Step 3: Spawn Parallel Scans
+
+Spawn all three scans as fresh-context subagents. Each receives the explicit file list for their scope, the tech stack from `project.yml`, and their specific instructions below.
+
+### Part 1: Security Audit (subagent)
+
+Spawn a security auditor agent with a fresh context window.
+
+**10 Security Categories:**
+
+| # | Category | What It Checks |
+|---|----------|---------------|
+| 1 | Authentication & Authorization | Every endpoint has auth middleware; role checks before data access |
+| 2 | Data Scoping / Tenant Isolation | Queries scoped to correct user/tenant; no cross-tenant data leaks |
+| 3 | Input Validation | Request bodies/params validated before use in queries or logic |
+| 4 | Error Information Leakage | No stack traces, DB schemas, or internal details in API responses |
+| 5 | XSS Prevention | No unsanitized user content injected into DOM |
+| 6 | SQL/NoSQL Injection | All queries use parameterized placeholders, no string interpolation |
+| 7 | Secrets Management | No hardcoded keys/tokens; `.env` in `.gitignore`; `process.env` usage |
+| 8 | CORS Policy | No wildcard `*` origins in production; appropriate method restrictions |
+| 9 | HTTP Security Headers | CSP, X-Frame-Options, HSTS, X-Content-Type-Options, Referrer-Policy |
+| 10 | CI/CD Pipeline Security | Secrets via secrets context, not hardcoded in workflow files |
+
+**Agent behavior rules:**
+- Read every file in the provided list. No sampling or skipping.
+- Every finding must have: file path, line number, what's wrong, severity, remediation.
+- Understand context before flagging — read surrounding code, check for middleware, wrappers, and higher-order protections.
+- Document intentionally public endpoints; don't flag them as vulnerabilities.
+- Severity is firm: `critical` = exploitable vulnerability, `warning` = defense-in-depth gap, `info` = observation.
+- Prefer false negatives over false positives — only flag what you're confident about.
+- Categories that don't apply to this project's stack → mark as N/A with brief explanation.
+
+**Project adaptation:** Adapt checks to the detected stack:
+- Express vs Next.js vs Fastify endpoint patterns
+- PostgreSQL vs MongoDB vs SQLite query patterns
+- GitHub Actions vs GitLab CI vs other CI systems
+- React vs Vue vs Svelte frontend patterns
+
+**Output format** (return to orchestrator):
+
+```yaml
+security_audit:
+  files_scanned: N
+  categories:
+    - id: 1
+      name: "Authentication & Authorization"
+      status: passed | warning | critical | na
+      findings:
+        - file: "src/api/users.ts"
+          line: 42
+          severity: critical | warning | info
+          issue: "Description of what's wrong"
+          remediation: "How to fix it"
+      notes: "Optional context about intentional decisions"
+```
+
+### Part 2: Architecture Audit (subagent)
+
+Spawn an architecture auditor agent with a fresh context window.
+
+**4 Architecture Dimensions:**
+
+| Dimension | What It Checks |
+|-----------|---------------|
+| **Scalability** | Synchronous blocking calls, missing pagination, unbounded queries, N+1 query patterns, missing caching opportunities, single points of failure, hardcoded limits |
+| **Maintainability** | Code complexity hotspots (files >300 lines, deeply nested logic >4 levels, god components/classes), circular dependencies, duplicated logic that warrants abstraction |
+| **Code Health** | Dead code / unused exports, TODO/FIXME inventory with age, test coverage gaps (untested critical paths), stale/vulnerable dependencies |
+| **Structural Quality** | Separation of concerns violations (business logic in UI layer), inconsistent patterns across similar features, missing error boundaries, API contract consistency |
+
+**Agent behavior rules:**
+- Check actual code, not theoretical concerns.
+- Every finding references specific files with evidence.
+- Severity: `critical` = architectural debt that will cause production issues or block future work, `warning` = quality concern worth addressing, `info` = improvement opportunity.
+- Respect existing ADRs in `.forge/decisions/` — don't flag intentional architectural choices as issues.
+- Respect constitutional articles in `.forge/constitution.md` — if the constitution permits a pattern, don't flag it.
+
+**Output format** (return to orchestrator):
+
+```yaml
+architecture_audit:
+  files_scanned: N
+  dimensions:
+    - name: "Scalability"
+      status: passed | warning | critical
+      findings:
+        - file: "src/api/products.ts"
+          line: 87
+          severity: critical | warning | info
+          issue: "Unbounded query with no pagination"
+          remediation: "Add limit/offset parameters"
+    - name: "Maintainability"
+      status: passed | warning | critical
+      findings: []
+    - name: "Code Health"
+      status: passed | warning | critical
+      findings: []
+    - name: "Structural Quality"
+      status: passed | warning | critical
+      findings: []
+```
+
+### Part 3: Refactoring Scan (subagent)
+
+Spawn a refactoring scanner agent with a fresh context window. Pass it only the files changed during the milestone (from the git diff).
+
+**6 Refactoring Categories:**
+
+| # | Category | What to Look For |
+|---|----------|-----------------|
+| 1 | **Duplication** | Similar logic in 2+ places that could be extracted into a shared function, hook, or utility |
+| 2 | **Complexity hotspots** | Functions >50 lines, nesting >3 levels deep, high cyclomatic complexity, overly long files |
+| 3 | **Naming & clarity** | Unclear variable/function names, misleading abstractions, functions that do more than their name suggests |
+| 4 | **Pattern inconsistency** | Same thing done differently across the milestone's files (e.g., error handling, data fetching, state management) |
+| 5 | **Dead code** | Unused functions, unreachable branches, commented-out code left behind, unused imports |
+| 6 | **Abstraction issues** | Over-engineered helpers used once, repeated inline code that warrants extraction, premature or missing abstractions |
+
+**Agent behavior rules:**
+- Read every file in the diff. No sampling.
+- Every finding must reference a specific file and line range.
+- Understand context — don't flag intentional patterns documented in the constitution.
+- Don't duplicate findings from the security or architecture audits.
+- Estimate effort for each item: `quick` (< 30 min, under 50 lines) or `standard` (needs planning).
+- Suggest a concrete approach for each finding, not just "refactor this."
+- Prefer fewer high-quality findings over many low-signal ones.
+
+**Output format** (return to orchestrator):
+
+```yaml
+refactoring_scan:
+  files_scanned: N
+  findings:
+    - category: duplication
+      file: "src/api/users.ts"
+      lines: "42-67"
+      description: "Duplicate validation logic — same email check in createUser and updateUser"
+      effort: quick
+      suggested_approach: "Extract shared validateEmail() helper to src/utils/validation.ts"
+```
+
+## Step 4: Score Results
+
+After all three subagents return, compute scores.
+
+**Per-category scoring (security + architecture):**
+
+| Status | Meaning |
+|--------|---------|
+| `passed` | No issues found |
+| `warning` | Non-critical issues (info-level also maps here) |
+| `critical` | Real vulnerabilities or architectural blockers |
+| `na` | Category doesn't apply to this project |
+
+**Overall health status:**
+
+| Overall | Condition |
+|---------|-----------|
+| `passed` | ALL categories and dimensions passed or N/A |
+| `warnings_only` | One or more warnings, zero critical |
+| `issues_found` | One or more critical findings |
+
+**Refactoring findings** are separate from the health status — they never block completion.
+
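The two scoring tables in Step 4 above reduce to a pair of small functions. This is a sketch of one reading of those tables, not the package's implementation; the function names are hypothetical, while the status and severity strings are the ones the skill defines.

```python
from typing import Iterable, List


def category_status(severities: List[str]) -> str:
    """Map the finding severities of one category to a category status."""
    if "critical" in severities:
        return "critical"
    if severities:
        # Per the table, info-level findings also map to `warning`.
        return "warning"
    return "passed"


def overall_status(statuses: Iterable[str]) -> str:
    """Combine category and dimension statuses into the overall health status."""
    statuses = list(statuses)
    if "critical" in statuses:
        return "issues_found"
    if "warning" in statuses:
        return "warnings_only"
    # Every remaining entry is `passed` or `na`.
    return "passed"
```

Refactoring findings are intentionally not an input to either function, which matches the rule that they never block completion.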
+## Step 5: Write Health Report
+
+Create `.forge/audits/` directory if needed. Write to `.forge/audits/milestone-{id}-health-report.md`.
+
+**YAML frontmatter:**
+
+```yaml
+---
+milestone_id: {id}
+milestone_name: "{name}"
+reviewed: "{ISO 8601 timestamp}"
+status: passed | warnings_only | issues_found
+security:
+  status: passed | warnings_only | issues_found
+  categories_passed: N
+  categories_warning: N
+  categories_critical: N
+  categories_na: N
+architecture:
+  status: passed | warnings_only | issues_found
+  scalability: passed | warning | critical
+  maintainability: passed | warning | critical
+  code_health: passed | warning | critical
+  structural_quality: passed | warning | critical
+refactoring:
+  findings_count: N
+  quick_count: N
+  standard_count: N
+total_files_scanned: N
+---
+```
+
+**Body structure:**
+
+```markdown
+# Review Report: {milestone name}
+
+## Executive Summary
+{1-3 sentences: overall health assessment, key findings, refactoring highlights, recommendation}
+
+## Security Findings
+
+### Category 1: Authentication & Authorization — {STATUS}
+| File | Line | Severity | Issue | Remediation |
+|------|------|----------|-------|-------------|
+| ... | ... | ... | ... | ... |
+
+{Repeat for each category. N/A categories get a single line: "N/A — {reason}"}
+
+## Architecture Findings
+
+### Scalability — {STATUS}
+| File | Line | Severity | Issue | Remediation |
+|------|------|----------|-------|-------------|
+| ... | ... | ... | ... | ... |
+
+{Repeat for each dimension}
+
+## Refactoring Opportunities
+
+### Duplication ({N} items)
+| File | Lines | Description | Effort | Approach |
+|------|-------|-------------|--------|----------|
+| ... | ... | ... | quick/standard | ... |
+
+{Repeat for each refactoring category with findings}
+
+## Public Endpoints
+{List of intentionally public endpoints documented during security audit}
+
+## Files Scanned
+{Count and list of all files scanned across all three scans}
+```
+
+**Health trend tracking:** If a previous audit exists for an earlier milestone (check `.forge/audits/` for prior reports), compare results and note improvements or regressions in the executive summary.
+
+## Step 6: Present Results + Triage Refactoring
+
+### Health Results
+
+Present the health status first — this is the gate.
+
+**If HEALTHY (all passed):**
+*"Health audit passed. No security vulnerabilities or architectural concerns found."*
+
+**If NEEDS ATTENTION (critical issues):**
+*"Review found critical issues that should be addressed before shipping:"*
+Inline the top 3 findings per critical category so the user sees them immediately.
+
+**If WARNINGS ONLY:**
+*"Review passed with warnings — no critical issues, but {N} items worth noting. See the full report at `.forge/audits/milestone-{id}-health-report.md`."*
+
+### Refactoring Triage
+
+After presenting health results, show refactoring findings for triage. Group by category, max 10 initially:
+
+*"I also found {N} refactoring opportunities in the code built during this milestone:"*
+
+For each category with findings:
+*"**Duplication** ({N} items):*
+*1. `src/api/users.ts:42-67` — Duplicate email validation in createUser and updateUser. Quick fix: extract shared helper. [Accept / Dismiss]*
+*2. ...*"
+
+The user can respond with:
+- **Accept** (individual item) → add to backlog
+- **Dismiss** (individual item) → skip, not a real issue or intentional
+- **Accept all** → bulk add all remaining items to backlog
+- **Dismiss all** → skip everything, no backlog items added
+
+For dismissed items, optionally ask for a brief reason (helps calibrate future scans).
+
+## Step 7: Write Backlog + Route
+
+### Write Refactoring Backlog
+
+Read existing `.forge/refactor-backlog.yml` (if any). Determine the next item ID by incrementing from the highest existing ID.
+
+Append accepted items to `.forge/refactor-backlog.yml`:
+
+```yaml
+items:
+  - id: R001
+    milestone: 1
+    category: duplication
+    file: "src/api/users.ts"
+    lines: "42-67"
+    description: "Duplicate validation logic — same email check in createUser and updateUser"
+    effort: quick
+    suggested_approach: "Extract shared validateEmail() helper"
+    status: pending
+    added: "2026-03-18"
+    completed: null
+```
+
+If the file doesn't exist yet, create it from the template at `.forge/templates/refactor-backlog.yml`.
+
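The "increment from the highest existing ID" rule above can be sketched as follows. The ID format is an assumption inferred from the single `R001` example (the letter `R` plus a zero-padded three-digit number); the function name `next_item_id` is hypothetical.

```python
import re
from typing import Iterable


def next_item_id(existing_ids: Iterable[str]) -> str:
    """Return the next backlog ID after the highest existing one."""
    numbers = []
    for item_id in existing_ids:
        # Assumed format: "R" followed by digits, as in the R001 example.
        match = re.fullmatch(r"R(\d+)", item_id)
        if match:
            numbers.append(int(match.group(1)))
    next_number = max(numbers, default=0) + 1
    return f"R{next_number:03d}"
```

Taking the maximum rather than the item count keeps IDs unique even if earlier items were removed from the file.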
+### Route Based on Health Status
+
+#### HEALTHY or WARNINGS ONLY (user accepts)
+
+Update `.forge/state/milestone-{id}.yml`:
+- Set `current.status` to `complete`
+
+Update `.forge/state/index.yml`:
+- Set milestone status to `complete`
+- Update `last_updated` timestamp
+
+Present to user:
+*"Milestone [{name}] is complete. {N} refactoring items are in the backlog for whenever you want to tackle them."*
+
+If Beads is installed, run `bd complete` to update the dependency graph.
+
+#### NEEDS ATTENTION (critical issues found)
+
+Do NOT mark milestone complete. Present choices:
+
+*"Options:"*
+- **A. Fix critical issues** — return to `planning` in fix mode with findings as requirements
+- **B. Accept risk and continue** — document accepted risks in report, complete the milestone
+
+If user chooses A:
+- Create fix requirements from critical findings
+- Route to `planning` skill in fix mode
+- After fix execution + re-verification, re-run `reviewing` (not full verification — just this review)
+
+If user chooses B:
+- Append "Accepted Risks" section to the health report with user's acknowledgment
+- Complete the milestone (same as HEALTHY path above)
+
+#### WARNINGS ONLY (user wants to fix)
+
+If user wants to fix warnings instead of accepting:
+- Create fix requirements from warning findings
+- Route to `planning` in fix mode
+- After fix execution, re-run `reviewing`
+
+## Gate Type: Mixed
+
+- **Security critical findings** → soft gate (user can accept risk, but strongly recommended to fix)
+- **Architecture critical findings** → soft gate (same — user has final authority)
+- **Warnings** → advisory (noted in report, user chooses)
+- **Refactoring items** → never block (cataloged to backlog for future work)
+
+The report documents the decision either way, creating an audit trail.
+
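The routing and gate rules above can be summarized in one decision function. This is a sketch under assumptions: the `user_choice` values `"fix"` and `"accept"` and the function name are hypothetical stand-ins for the options the skill presents; the status and skill names are taken from the text.

```python
from typing import Optional


def route_after_review(overall: str, user_choice: Optional[str] = None) -> str:
    """Return the next workflow position after a review: complete or planning."""
    if overall == "passed":
        return "complete"
    if overall not in ("warnings_only", "issues_found"):
        raise ValueError(f"unknown overall status: {overall}")
    # Both warnings and critical findings are soft gates: the user decides.
    if user_choice == "fix":
        return "planning"  # fix mode, with findings as requirements
    if user_choice == "accept":
        return "complete"  # accepted risks are recorded in the report
    raise ValueError("a user decision is required before routing")
```

Raising when no choice is given reflects the rule that a milestone with findings is never completed silently.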
+## Backlog Lifecycle
+
+Backlog items follow this lifecycle:
+
+```
+pending → in_progress → done
+pending → dismissed (during triage or later review)
+```
+
+Items with `effort: quick` can be picked up directly via `quick-tasking`.
+Items with `effort: standard` should go through the Standard tier flow.
+
+When working a backlog item:
+1. `forge` surfaces it as an available task
+2. User selects it
+3. Route to `quick-tasking` or Standard tier based on effort
+4. On completion, update the item's `status` to `done` and set `completed` date
+
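The backlog lifecycle above is a small state machine with an effort-based route. A minimal sketch follows, assuming backlog items are plain dictionaries shaped like the YAML example; the names `ALLOWED`, `advance`, and `route_for`, and the `"standard-tier"` label, are hypothetical.

```python
# Allowed status transitions, taken from the lifecycle diagram.
ALLOWED = {
    "pending": {"in_progress", "dismissed"},
    "in_progress": {"done"},
    "done": set(),
    "dismissed": set(),
}


def advance(item: dict, new_status: str, today: str) -> dict:
    """Return a copy of the item moved to new_status, or raise if not allowed."""
    if new_status not in ALLOWED[item["status"]]:
        raise ValueError(f"{item['status']} -> {new_status} is not allowed")
    updated = dict(item, status=new_status)
    if new_status == "done":
        updated["completed"] = today  # set the completed date on completion
    return updated


def route_for(effort: str) -> str:
    """Pick the flow for a backlog item based on its effort estimate."""
    return "quick-tasking" if effort == "quick" else "standard-tier"
```

Returning a copy instead of mutating the item keeps the original record available until the updated backlog is written back.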
+## Phase Handoff
+
+After reviewing completes (all paths: HEALTHY, accepted risk, accepted warnings):
+
+1. **Verify persistence** — Confirm health report is written to `.forge/audits/milestone-{id}-health-report.md` and refactoring backlog is updated
+2. **Update state** — Set `current.status` to `complete` in `.forge/state/milestone-{id}.yml`
+3. **Present completion:**
+
+*"Milestone [{name}] complete. Review report at `.forge/audits/milestone-{id}-health-report.md`. {N} refactoring items in backlog.*
+
+*Start new work with `/forge` or tackle backlog items anytime."*
package/template/.claude/skills/verifying/SKILL.md
CHANGED
@@ -128,7 +128,7 @@ Based on all verification levels:

 ### PASSED
 All truths verified, all artifacts substantive and wired, all key links connected, requirements covered.
-→ Route to `
+→ Route to `reviewing` skill for health audit + refactoring review before milestone completion.

 ### GAPS FOUND
 Some truths failed or artifacts are stubs.
@@ -219,10 +219,10 @@ Only suggest changes when there's clear evidence (3+ occurrences). One-off issue
 After verification completes with a PASSED verdict:
 
 1. **Verify persistence** — Confirm verification results are documented, desire paths retrospective is logged to `.forge/state/index.yml`
-2. **Update state** — Set `current.status` to `
+2. **Update state** — Set `current.status` to `reviewing` in `.forge/state/milestone-{id}.yml`
 3. **Recommend context clear:**
 
-*"Verification phase complete — all truths verified, artifacts substantive and wired. I recommend clearing context (`/clear`) before the
+*"Verification phase complete — all truths verified, artifacts substantive and wired. I recommend clearing context (`/clear`) before the review — the reviewing skill spawns fresh subagents anyway, and a clean orchestrator context ensures nothing is missed.*
 
 *Ready to continue? Clear context and invoke `/forge` to resume."*
 
package/template/CLAUDE.md
CHANGED
@@ -29,11 +29,11 @@ Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
 
 ### Standard (hours)
 **Triggers:** new feature, component, significant refactor, multi-file change
-**Flow:** → `researching` → `discussing` → `planning` → `executing` → `verifying` → `
+**Flow:** → `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing` → done
 
 ### Full (days)
 **Triggers:** new project, major milestone, complex multi-system feature, architectural decisions needed
-**Flow:** → `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `
+**Flow:** → `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing` → done
 **Optional additions:** `designing` (UI work), `securing` (auth/data/API), `debugging` (stuck on issue)
 
 ## Skill Routing
@@ -48,8 +48,7 @@ Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
 | Break work into executable tasks with gates | `planning` | Standard, Full |
 | Build code with deviation rules + atomic commits | `executing` | All |
 | Prove work actually delivers on goals | `verifying` | Standard, Full |
-| Audit
-| Review refactoring opportunities after milestone audit | `refactoring` | Standard, Full |
+| Audit health + catalog refactoring opportunities | `reviewing` | Standard, Full |
 | Fix a small, scoped issue fast | `quick-tasking` | Quick |
 | Build UI with design system consistency | `designing` | When UI involved |
 | Review security before shipping | `securing` | When auth/data/API involved |
@@ -71,7 +70,7 @@ Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
 When a task touches 20+ files or a complex subsystem, spawn a fresh executor agent with isolated context. This prevents context rot — the #1 cause of quality degradation in long sessions.
 
 ### Context Handoff Between Phases
-Each phase writes its outputs to `.forge/` before completing. At every phase boundary (researching → discussing → planning → executing → verifying →
+Each phase writes its outputs to `.forge/` before completing. At every phase boundary (researching → discussing → planning → executing → verifying → reviewing), the completing skill recommends clearing context (`/clear`) before the next phase begins. The next phase loads what it needs from disk. This is advisory — skip for short phases where context is under 40%. See the `forge` skill's "Context Handoff Protocol" for full details.
 
 ### Lazy Loading
 Skills load only when invoked. CLAUDE.md stays in context; skill details load on demand. This keeps base context lean (~300 lines) while making full framework available.
@@ -84,9 +83,7 @@ Skills load only when invoked. CLAUDE.md stays in context; skill details load on
 | `planner` | Planning with constitutional gates | Read + Write (plan files only) | Planning phases |
 | `executor` | Building with deviation rules | All dev tools | Execution phases |
 | `verifier` | Goal-backward verification | Read + Bash (test execution) | Verification phases |
-| `
-| `architecture-auditor` | Structural health assessor | Read, Grep, Glob | Auditing phase |
-| `reviewer` | Security + code quality audit | Read-only + npm audit | Before shipping |
+| `reviewer` | Security + architecture + refactoring audit | Read, Bash, Grep, Glob | Reviewing phase |
 
 ## Project Init (First Run)
 
@@ -124,7 +121,7 @@ Project state lives in `.forge/`:
 - `state/milestone-{id}.yml` — Per-milestone cursor: current position, progress, decisions, blockers, deviations
 - `context.md` — Locked user decisions + deferred ideas (created during discuss phase)
 - `plan.md` — Per-phase task plans with must_haves frontmatter
-- `refactor-backlog.yml` — Refactoring opportunities cataloged
+- `refactor-backlog.yml` — Refactoring opportunities cataloged during milestone reviews, worked via quick-tasking
 
 ### Milestones
 Milestones group phases into concurrent work streams. Each milestone has its own state file, so different sessions can work on different milestones without conflicts. On resume, Forge shows active milestones and asks which one to work on.
@@ -133,7 +130,7 @@ Milestones group phases into concurrent work streams. Each milestone has its own
 YAML for anything agents parse programmatically (project, requirements, roadmap, state). Markdown for human-facing content (constitution, context, verification reports). Never free-form prose for machine state.
 
 ### Milestone Completion: Status vs. Percentage
-**`current.status` is the authoritative workflow position.** A milestone is only complete when `current.status == complete`. The `progress.overall_percent` field measures task completion — not workflow completion. A milestone at 100% task completion still needs verifying
+**`current.status` is the authoritative workflow position.** A milestone is only complete when `current.status == complete`. The `progress.overall_percent` field measures task completion — not workflow completion. A milestone at 100% task completion still needs verifying and reviewing before it is done. On resume, always check and display `current.status` to determine next steps.
 
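The status-versus-percentage rule in this hunk can be illustrated with a short Python sketch. It assumes the milestone state file has already been parsed into a dictionary with the `current.status` and `progress.overall_percent` fields named above. The `is_complete` and `describe` helpers are hypothetical and only show the decision, not how Forge implements it.

```python
def is_complete(state: dict) -> bool:
    # Only current.status decides completion; the percentage is ignored on purpose.
    return state["current"]["status"] == "complete"


def describe(state: dict) -> str:
    """Build a resume message that leads with the workflow status."""
    status = state["current"]["status"]
    percent = state["progress"]["overall_percent"]
    if is_complete(state):
        return f"complete (tasks: {percent}%)"
    return f"resume at '{status}' (tasks: {percent}%)"


# All tasks are done, but the milestone still has to pass the reviewing step.
state = {"current": {"status": "reviewing"}, "progress": {"overall_percent": 100}}
print(describe(state))
```

Keeping the percentage out of `is_complete` entirely makes it impossible for a 100% task count to short-circuit the remaining workflow steps.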
 ## Deviation Rules (Executor Decision Tree)
 
package/template/.claude/skills/auditing/SKILL.md
@@ -1,314 +0,0 @@
----
-name: auditing
-description: "Use after verifying passes to assess overall application health before milestone completion. Runs security audit (10 categories) and architecture audit (scaling, maintainability, code health). This is the pre-release gate — it answers 'is this codebase healthy enough to ship?'"
----
-
-# Auditing: Health Audit Before Milestone Completion
-
-You are the pre-release gate. After `verifying` confirms the work delivers what was promised, you assess whether the codebase is healthy enough to ship. Two parallel audits — security and architecture — produce a structured health report that determines whether the milestone can complete.
-
-## When to Trigger
-
-- **Automatically** after `verifying` returns a PASSED verdict (Standard and Full tiers)
-- **On-demand** at any time via user request
-
-## Process Overview
-
-1. Read project context (`.forge/project.yml`) to determine tech stack
-2. Scope the audit — glob all source files, summarize what will be scanned
-3. Spawn two parallel subagents: Security Audit + Architecture Audit
-4. Collect results, score per-category, determine overall status
-5. Write health report to `.forge/audits/milestone-{id}-health-report.md`
-6. Route based on results: healthy → complete, issues → user decides
-
-## Step 1: Read Context
-
-```
-Read: .forge/project.yml → tech stack, framework, database, dependencies
-Read: .forge/state/milestone-{id}.yml → milestone ID and name
-Read: .forge/constitution.md → active architectural gates (if exists)
-```
-
-Determine which security categories apply based on the tech stack. For example:
-- No database → SQL/NoSQL Injection is N/A
-- No frontend → XSS Prevention is N/A
-- No CI/CD config → Pipeline Security is N/A
-
-## Step 2: Scope the Audit
-
-```
-Glob: src/**/*.{ts,tsx,js,jsx,py,go,rs,java} (adapt to project language)
-Glob: **/*.env*, **/docker-compose*, **/.github/workflows/*
-Glob: **/next.config*, **/vite.config*, **/webpack.config*
-```
-
-Present scope summary to user:
-*"Health audit scope: {N} source files, {N} config files. Scanning for security vulnerabilities (10 categories) and architectural health (4 dimensions). This will take a moment."*
-
-Build explicit file lists for each subagent — pass file paths, not globs, so nothing is missed.
-
-## Step 3: Spawn Parallel Audits
-
-Spawn both audits as fresh-context subagents. Each receives:
-- The explicit file list for their scope
-- The tech stack from `project.yml`
-- Their specific audit instructions (below)
-
-### Part 1: Security Audit (subagent)
-
-Spawn a security auditor agent with a fresh context window.
-
-**10 Security Categories:**
-
-| # | Category | What It Checks |
-|---|----------|---------------|
-| 1 | Authentication & Authorization | Every endpoint has auth middleware; role checks before data access |
-| 2 | Data Scoping / Tenant Isolation | Queries scoped to correct user/tenant; no cross-tenant data leaks |
-| 3 | Input Validation | Request bodies/params validated before use in queries or logic |
-| 4 | Error Information Leakage | No stack traces, DB schemas, or internal details in API responses |
-| 5 | XSS Prevention | No unsanitized user content injected into DOM |
-| 6 | SQL/NoSQL Injection | All queries use parameterized placeholders, no string interpolation |
-| 7 | Secrets Management | No hardcoded keys/tokens; `.env` in `.gitignore`; `process.env` usage |
-| 8 | CORS Policy | No wildcard `*` origins in production; appropriate method restrictions |
-| 9 | HTTP Security Headers | CSP, X-Frame-Options, HSTS, X-Content-Type-Options, Referrer-Policy |
-| 10 | CI/CD Pipeline Security | Secrets via secrets context, not hardcoded in workflow files |
-
-**Agent behavior rules:**
-- Read every file in the provided list. No sampling or skipping.
-- Every finding must have: file path, line number, what's wrong, severity, remediation.
-- Understand context before flagging — read surrounding code, check for middleware, wrappers, and higher-order protections.
-- Document intentionally public endpoints; don't flag them as vulnerabilities.
-- Severity is firm: `critical` = exploitable vulnerability, `warning` = defense-in-depth gap, `info` = observation.
-- Prefer false negatives over false positives — only flag what you're confident about.
-- Categories that don't apply to this project's stack → mark as N/A with brief explanation.
-
-**Project adaptation:** Adapt checks to the detected stack:
-- Express vs Next.js vs Fastify endpoint patterns
-- PostgreSQL vs MongoDB vs SQLite query patterns
-- GitHub Actions vs GitLab CI vs other CI systems
-- React vs Vue vs Svelte frontend patterns
-
-**Output format** (return to orchestrator):
-
-```yaml
-security_audit:
-files_scanned: N
-categories:
-- id: 1
-name: "Authentication & Authorization"
-status: passed | warning | critical | na
-findings:
-- file: "src/api/users.ts"
-line: 42
-severity: critical | warning | info
-issue: "Description of what's wrong"
-remediation: "How to fix it"
-notes: "Optional context about intentional decisions"
-```
-
-### Part 2: Architecture Audit (subagent)
-
-Spawn an architecture auditor agent with a fresh context window.
-
-**4 Architecture Dimensions:**
-
-| Dimension | What It Checks |
-|-----------|---------------|
-| **Scalability** | Synchronous blocking calls, missing pagination, unbounded queries, N+1 query patterns, missing caching opportunities, single points of failure, hardcoded limits |
-| **Maintainability** | Code complexity hotspots (files >300 lines, deeply nested logic >4 levels, god components/classes), circular dependencies, duplicated logic that warrants abstraction |
-| **Code Health** | Dead code / unused exports, TODO/FIXME inventory with age, test coverage gaps (untested critical paths), stale/vulnerable dependencies |
-| **Structural Quality** | Separation of concerns violations (business logic in UI layer), inconsistent patterns across similar features, missing error boundaries, API contract consistency |
-
-**Agent behavior rules:**
-- Check actual code, not theoretical concerns.
-- Every finding references specific files with evidence.
-- Severity: `critical` = architectural debt that will cause production issues or block future work, `warning` = quality concern worth addressing, `info` = improvement opportunity.
-- Respect existing ADRs in `.forge/decisions/` — don't flag intentional architectural choices as issues.
-- Respect constitutional articles in `.forge/constitution.md` — if the constitution permits a pattern, don't flag it.
-
-**Output format** (return to orchestrator):
-
-```yaml
-architecture_audit:
-files_scanned: N
-dimensions:
-- name: "Scalability"
-status: passed | warning | critical
-findings:
-- file: "src/api/products.ts"
-line: 87
-severity: critical | warning | info
-issue: "Unbounded query with no pagination"
-remediation: "Add limit/offset parameters"
-- name: "Maintainability"
-status: passed | warning | critical
-findings: []
-- name: "Code Health"
-status: passed | warning | critical
-findings: []
-- name: "Structural Quality"
-status: passed | warning | critical
-findings: []
-```
-
-## Step 4: Score Results
-
-After both subagents return, compute scores.
-
-**Per-category scoring:**
-
-| Status | Meaning |
-|--------|---------|
-| `passed` | No issues found |
-| `warning` | Non-critical issues (info-level also maps here) |
-| `critical` | Real vulnerabilities or architectural blockers |
-| `na` | Category doesn't apply to this project |
-
-**Overall status:**
-
-| Overall | Condition |
-|---------|-----------|
-| `passed` | ALL categories and dimensions passed or N/A |
-| `warnings_only` | One or more warnings, zero critical |
-| `issues_found` | One or more critical findings |
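The overall-status table in this removed skill reduces to a precedence rule: any `critical` wins, then any `warning`, otherwise the result is `passed`. A minimal Python sketch follows, assuming the per-category statuses have been collected into a flat list; the `overall_status` function is a hypothetical illustration and does not appear in the package.

```python
from typing import Iterable


def overall_status(statuses: Iterable[str]) -> str:
    """Collapse per-category statuses into the overall status (hypothetical helper)."""
    seen = set(statuses)
    if "critical" in seen:
        # One or more critical findings.
        return "issues_found"
    if "warning" in seen:
        # One or more warnings, zero critical.
        return "warnings_only"
    # Everything left is 'passed' or 'na'.
    return "passed"


print(overall_status(["passed", "na", "warning"]))
print(overall_status(["passed", "critical", "warning"]))
```

Checking `critical` first is what makes a single critical finding override any number of passes and warnings.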
-
-## Step 5: Write Health Report
-
-Create `.forge/audits/` directory if needed. Write to `.forge/audits/milestone-{id}-health-report.md`.
-
-**YAML frontmatter:**
-
-```yaml
----
-milestone_id: {id}
-milestone_name: "{name}"
-audited: "{ISO 8601 timestamp}"
-status: passed | warnings_only | issues_found
-security:
-status: passed | warnings_only | issues_found
-categories_passed: N
-categories_warning: N
-categories_critical: N
-categories_na: N
-architecture:
-status: passed | warnings_only | issues_found
-scalability: passed | warning | critical
-maintainability: passed | warning | critical
-code_health: passed | warning | critical
-structural_quality: passed | warning | critical
-total_files_scanned: N
----
-```
-
-**Body structure:**
-
-```markdown
-# Health Audit Report: {milestone name}
-
-## Executive Summary
-{1-3 sentences: overall health assessment, key findings, recommendation}
-
-## Security Findings
-
-### Category 1: Authentication & Authorization — {STATUS}
-| File | Line | Severity | Issue | Remediation |
-|------|------|----------|-------|-------------|
-| ... | ... | ... | ... | ... |
-
-{Repeat for each category. N/A categories get a single line: "N/A — {reason}"}
-
-## Architecture Findings
-
-### Scalability — {STATUS}
-| File | Line | Severity | Issue | Remediation |
-|------|------|----------|-------|-------------|
-| ... | ... | ... | ... | ... |
-
-{Repeat for each dimension}
-
-## Public Endpoints
-{List of intentionally public endpoints documented during security audit}
-
-## Files Scanned
-{Count and list of all files scanned across both audits}
-```
-
-**Health trend tracking:** If a previous audit exists for an earlier milestone (check `.forge/audits/` for prior reports), compare results and note improvements or regressions in the executive summary.
-
-## Step 6: Route Based on Results
-
-### HEALTHY (all passed)
-
-Update `.forge/state/milestone-{id}.yml`:
-- Set `current.status` to `refactoring`
-
-Present to user:
-*"Health audit passed. No security vulnerabilities or architectural concerns found. Moving to refactoring review."*
-
-→ Route to `refactoring` skill.
-
-### NEEDS ATTENTION (critical issues found)
-
-Do NOT mark milestone complete. Present to user:
-
-*"Health audit found critical issues that should be addressed before shipping:"*
-
-Inline the top 3 findings per critical category so the user sees them immediately (don't make them open the report).
-
-Then offer choices:
-
-*"Options:"*
-- **A. Fix critical issues** — return to `planning` in fix mode with findings as requirements
-- **B. Accept risk and continue** — document accepted risks in report, proceed to refactoring review
-
-If user chooses A:
-- Create fix requirements from critical findings
-- Route to `planning` skill in fix mode
-- After fix execution, re-run `auditing` (not full `verifying` — just the audit)
-
-If user chooses B:
-- Append "Accepted Risks" section to the health report with user's acknowledgment
-- Update `.forge/state/milestone-{id}.yml`: set `current.status` to `refactoring`
-- → Route to `refactoring` skill.
-
-### ACCEPTABLE WITH CAVEATS (warnings only)
-
-Present to user:
-
-*"Health audit passed with warnings — no critical issues, but {N} items worth noting. See the full report at `.forge/audits/milestone-{id}-health-report.md`."*
-
-Then offer choices:
-- **A. Continue to refactoring review** — accept warnings as known items
-- **B. Fix warnings** — address before continuing
-
-If user chooses A:
-- Document accepted warnings in report
-- Update `.forge/state/milestone-{id}.yml`: set `current.status` to `refactoring`
-- → Route to `refactoring` skill.
-
-If user chooses B:
-- Create fix requirements from warning findings
-- Route to `planning` in fix mode
-- After fix execution, re-run `auditing`
-
-## Gate Type: Soft Gate
-
-This is a soft gate — critical issues strongly recommend fixing before completion, but the user can accept risk and proceed. Rationale:
-- Some issues may be acceptable known risks for the deployment context
-- Some findings may be false positives despite the conservative flagging approach
-- Non-production or internal tools may have different risk tolerances
-- The user always has final authority over ship decisions
-
-The report documents the decision either way, creating an audit trail.
-
-## Phase Handoff
-
-After auditing routes to refactoring (all three paths: HEALTHY, accepted risk, accepted warnings):
-
-1. **Verify persistence** — Confirm health report is written to `.forge/audits/milestone-{id}-health-report.md`
-2. **Update state** — Set `current.status` to `refactoring` in `.forge/state/milestone-{id}.yml`
-3. **Recommend context clear:**
-
-*"Health audit complete. Report written to `.forge/audits/`. I recommend clearing context (`/clear`) before the refactoring review — the refactoring scanner spawns a fresh agent with the git diff and health report, so a clean context ensures accurate scanning.*
-
-*Ready to continue? Clear context and invoke `/forge` to resume."*
package/template/.claude/skills/refactoring/SKILL.md
@@ -1,168 +0,0 @@
----
-name: refactoring
-description: "Review code built during a milestone for refactoring opportunities. Runs after auditing passes. Produces a structured backlog of improvements the user can work through via quick-tasking. Soft gate — review items, add to backlog, complete milestone."
----
-
-# Refactoring: Post-Milestone Code Review for Improvement Opportunities
-
-You review the code built during a milestone and catalog opportunities for improvement. This is not about correctness (that's `verifying`) or health (that's `auditing`) — it's about identifying code that works but could be cleaner, simpler, or more consistent.
-
-## When to Trigger
-
-- **Automatically** after `auditing` completes (all three paths: HEALTHY, NEEDS ATTENTION after fix, ACCEPTABLE WITH CAVEATS after accept)
-- **On-demand** at any time via user request
-
-## Step 1: Read Context
-
-```
-Read: .forge/project.yml → tech stack, conventions
-Read: .forge/state/milestone-{id}.yml → milestone ID, name, phases completed
-Read: .forge/audits/milestone-{id}-health-report.md → health findings (avoid overlap)
-Read: .forge/refactor-backlog.yml → existing backlog items (if any)
-Read: .forge/constitution.md → active architectural gates (if exists)
-```
-
-Determine the milestone's starting point for the git diff:
-- Check git log for the commit tagged or noted as the milestone start
-- If unavailable, use the first commit after the previous milestone's completion date
-- Fallback: ask the user for the starting commit or branch
-
-## Step 2: Scan Milestone Code
-
-Spawn a fresh agent with isolated context. Pass it:
-- The explicit list of files changed during the milestone (from `git diff --name-only {start}..HEAD`)
-- The tech stack from `project.yml`
-- The health report findings (so it doesn't duplicate auditing's work)
-- The constitution (so it respects intentional decisions)
-
-**The agent scans for 6 categories:**
-
-| # | Category | What to Look For |
-|---|----------|-----------------|
-| 1 | **Duplication** | Similar logic in 2+ places that could be extracted into a shared function, hook, or utility |
-| 2 | **Complexity hotspots** | Functions >50 lines, nesting >3 levels deep, high cyclomatic complexity, overly long files |
-| 3 | **Naming & clarity** | Unclear variable/function names, misleading abstractions, functions that do more than their name suggests |
-| 4 | **Pattern inconsistency** | Same thing done differently across the milestone's files (e.g., error handling, data fetching, state management) |
-| 5 | **Dead code** | Unused functions, unreachable branches, commented-out code left behind, unused imports |
-| 6 | **Abstraction issues** | Over-engineered helpers used once, repeated inline code that warrants extraction, premature or missing abstractions |
-
-**Agent behavior rules:**
-- Read every file in the diff. No sampling.
-- Every finding must reference a specific file and line range.
-- Understand context — don't flag intentional patterns documented in the constitution.
-- Don't duplicate findings already in the health report from auditing.
-- Estimate effort for each item: `quick` (< 30 min, under 50 lines) or `standard` (needs planning).
-- Suggest a concrete approach for each finding, not just "refactor this."
-- Prefer fewer high-quality findings over many low-signal ones.
-
-**Output format** (return to orchestrator):
-
-```yaml
-findings:
-- category: duplication
-file: "src/api/users.ts"
-lines: "42-67"
-description: "Duplicate validation logic — same email check in createUser and updateUser"
-effort: quick
-suggested_approach: "Extract shared validateEmail() helper to src/utils/validation.ts"
-- category: complexity
-file: "src/components/Dashboard.tsx"
-lines: "120-245"
-description: "Dashboard render function is 125 lines with 4 levels of nesting"
-effort: standard
-suggested_approach: "Extract stat cards, chart section, and filter bar into subcomponents"
-```
-
-## Step 3: Present Findings to User
-
-Group findings by category. Show each with:
-- File and line range
-- What the issue is
-- Estimated effort
-- Suggested approach
-
-Present top findings (max 10 initially). If there are more, mention the count.
-
-*"I found {N} refactoring opportunities in the code built during this milestone:"*
-
-Then for each category with findings:
-
-*"**Duplication** ({N} items):*
-*1. `src/api/users.ts:42-67` — Duplicate email validation in createUser and updateUser. Quick fix: extract shared helper. [Accept / Dismiss]*
-*2. ...*"
-
-## Step 4: User Triage
-
-The user can respond with:
-- **Accept** (individual item) → add to backlog
-- **Dismiss** (individual item) → skip, not a real issue or intentional
-- **Accept all** → bulk add all remaining items to backlog
-- **Dismiss all** → skip everything, no backlog items added
-
-For dismissed items, optionally ask for a brief reason (helps calibrate future scans).
-
-## Step 5: Write Backlog
-
-Read existing `.forge/refactor-backlog.yml` (if any). Determine the next item ID by incrementing from the highest existing ID.
-
-Append accepted items to `.forge/refactor-backlog.yml`:
-
-```yaml
-items:
-- id: R001
-milestone: 1
-category: duplication
-file: "src/api/users.ts"
-lines: "42-67"
-description: "Duplicate validation logic — same email check in createUser and updateUser"
-effort: quick
-suggested_approach: "Extract shared validateEmail() helper"
-status: pending
-added: "2026-03-18"
-completed: null
-```
-
-If the file doesn't exist yet, create it from the template at `.forge/templates/refactor-backlog.yml`.
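The ID rule in this removed step ("incrementing from the highest existing ID") can be sketched in a few lines of Python. The sketch assumes IDs always look like the `R001` example, meaning an `R` followed by digits, and that new IDs are zero-padded to three digits; both are inferred from that single example and are not stated by the skill. The `next_item_id` function is a hypothetical name.

```python
import re
from typing import List


def next_item_id(existing_ids: List[str]) -> str:
    """Return the next backlog ID after the highest existing one (hypothetical helper)."""
    numbers = []
    for item_id in existing_ids:
        match = re.fullmatch(r"R(\d+)", item_id)
        if match:
            # Malformed IDs are skipped rather than treated as errors.
            numbers.append(int(match.group(1)))
    # An empty backlog starts at R001; padding width is an assumption.
    return f"R{max(numbers, default=0) + 1:03d}"


print(next_item_id([]))
print(next_item_id(["R001", "R007", "R003"]))
```

Taking the maximum rather than counting items keeps IDs unique even when earlier entries have been deleted from the backlog file.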
-
-Present summary:
-*"Added {N} items to the refactor backlog. {M} dismissed. You can work these anytime — pending items with `effort: quick` will show up as available Quick tasks when you start a session."*
-
-## Step 6: Route
-
-Update `.forge/state/milestone-{id}.yml`:
-- Set `current.status` to `complete`
-
-Update `.forge/state/index.yml`:
-- Set milestone status to `complete`
-- Update `last_updated` timestamp
-
-Present to user:
-*"Milestone [{name}] is complete. {N} refactoring items are in the backlog for whenever you want to tackle them."*
-
-If Beads is installed, run `bd complete` to update the dependency graph.
-
-## Gate Type: Soft Gate
-
-This is a soft gate — it presents opportunities but never blocks milestone completion. Rationale:
-- Refactoring is improvement, not correctness. The code already works (verified) and is healthy (audited).
-- Users should review opportunities but aren't forced to act on them immediately.
-- Backlog items persist across sessions and can be worked whenever it makes sense.
-- Some items may become irrelevant as the codebase evolves — that's fine.
-
-## Backlog Lifecycle
-
-Backlog items follow this lifecycle:
-
-```
-pending → in_progress → done
-pending → dismissed (during triage or later review)
-```
-
-Items with `effort: quick` can be picked up directly via `quick-tasking`.
-Items with `effort: standard` should go through the Standard tier flow.
-
-When working a backlog item:
-1. `forge` surfaces it as an available task
-2. User selects it
-3. Route to `quick-tasking` or Standard tier based on effort
-4. On completion, update the item's `status` to `done` and set `completed` date