forge-orkes 0.3.7 → 0.3.9
- package/package.json +1 -1
- package/template/.claude/settings.json +6 -2
- package/template/.claude/skills/executing/SKILL.md +69 -1
- package/template/.claude/skills/forge/SKILL.md +12 -15
- package/template/.claude/skills/initializing/SKILL.md +80 -1
- package/template/.claude/skills/reviewing/SKILL.md +437 -0
- package/template/.claude/skills/verifying/SKILL.md +3 -3
- package/template/.forge/templates/project.yml +11 -0
- package/template/CLAUDE.md +29 -11
- package/template/.claude/skills/auditing/SKILL.md +0 -314
- package/template/.claude/skills/refactoring/SKILL.md +0 -168
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"forge": {
|
|
3
|
-
"version": "0.
|
|
3
|
+
"version": "0.1.0",
|
|
4
4
|
"default_tier": "standard",
|
|
5
5
|
"beads_integration": false,
|
|
6
6
|
"context_gates": {
|
|
@@ -17,7 +17,11 @@
|
|
|
17
17
|
"refactor",
|
|
18
18
|
"chore",
|
|
19
19
|
"docs"
|
|
20
|
-
]
|
|
20
|
+
],
|
|
21
|
+
"verification": {
|
|
22
|
+
"auto_fix": true,
|
|
23
|
+
"max_retries": 2
|
|
24
|
+
}
|
|
21
25
|
},
|
|
22
26
|
"hooks": {
|
|
23
27
|
"PostToolUse": [
|
|
@@ -92,7 +92,8 @@ For each task in the plan:
|
|
|
92
92
|
5. **Verify** using the verify step (run tests, inspect output)
|
|
93
93
|
6. **Confirm** done criteria are met
|
|
94
94
|
7. **Commit** atomically
|
|
95
|
-
8. **
|
|
95
|
+
8. **Run verification gate** — execute configured verification commands (see Verification Gate below)
|
|
96
|
+
9. **Mark complete** — `TaskUpdate` the native task to `completed`
|
|
96
97
|
|
|
97
98
|
## TDD Flow (When task type="tdd")
|
|
98
99
|
|
|
@@ -127,6 +128,73 @@ feat(auth-01): implement JWT-based login
|
|
|
127
128
|
- Include integration test for login flow
|
|
128
129
|
```
|
|
129
130
|
|
|
131
|
+
## Verification Gate
|
|
132
|
+
|
|
133
|
+
After each task commit, run the project's configured verification commands to catch regressions immediately. This is mechanical enforcement — not optional, not agent-directed.
|
|
134
|
+
|
|
135
|
+
### Load Config
|
|
136
|
+
|
|
137
|
+
Read `.forge/project.yml → verification`. If `verification.commands` is empty or missing, skip this section entirely.
|
|
138
|
+
|
|
139
|
+
```yaml
|
|
140
|
+
# Example from project.yml:
|
|
141
|
+
verification:
|
|
142
|
+
commands:
|
|
143
|
+
- cmd: "npm run lint"
|
|
144
|
+
advisory: false
|
|
145
|
+
- cmd: "npm test"
|
|
146
|
+
advisory: false
|
|
147
|
+
- cmd: "npx tsc --noEmit"
|
|
148
|
+
advisory: true # pre-existing failures — warn only
|
|
149
|
+
auto_fix: true
|
|
150
|
+
max_retries: 2
|
|
151
|
+
```
|
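The loading rule above can be sketched in Python. This is only an illustration: it assumes `.forge/project.yml` has already been parsed into a plain dict (the YAML parser is not shown), and `VerificationCommand`, `VerificationConfig`, and `load_verification` are hypothetical names, not part of forge-orkes. The fallback values for `auto_fix` and `max_retries` copy the example config and are an assumption too.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class VerificationCommand:
    cmd: str
    advisory: bool = False


@dataclass
class VerificationConfig:
    commands: list[VerificationCommand] = field(default_factory=list)
    auto_fix: bool = True      # assumed default, taken from the example config
    max_retries: int = 2       # assumed default, taken from the example config


def load_verification(project: dict[str, Any]) -> Optional[VerificationConfig]:
    """Return the gate config, or None when the gate should be skipped."""
    section = project.get("verification") or {}
    raw_commands = section.get("commands") or []
    if not raw_commands:
        # Empty or missing `verification.commands`: skip the gate entirely.
        return None
    commands = [
        VerificationCommand(cmd=c["cmd"], advisory=bool(c.get("advisory", False)))
        for c in raw_commands
    ]
    return VerificationConfig(
        commands=commands,
        auto_fix=bool(section.get("auto_fix", True)),
        max_retries=int(section.get("max_retries", 2)),
    )
```

Returning `None` rather than an empty config keeps the "skip this section entirely" case explicit for the caller.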
|
152
|
+
|
|
153
|
+
### Execution Flow
|
|
154
|
+
|
|
155
|
+
For each verification command, in order:
|
|
156
|
+
|
|
157
|
+
1. **Run the command** via Bash
|
|
158
|
+
2. **If it passes** → move to next command
|
|
159
|
+
3. **If it fails:**
|
|
160
|
+
- **Advisory command?** → Log warning: *"Advisory: `{cmd}` failed (pre-existing issue, not blocking)."* Move to next command.
|
|
161
|
+
- **Non-advisory + `auto_fix: false`?** → Log failure and continue to next task. Don't block.
|
|
162
|
+
- **Non-advisory + `auto_fix: true`?** → Enter auto-fix loop (below)
|
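As a reading aid, the branching in the execution flow above can be written as one pure function. The names `Action` and `decide` are hypothetical and exist only for this sketch. It models only the per-command decision, not how the command is run or what happens after a logged failure.

```python
from enum import Enum


class Action(Enum):
    NEXT_COMMAND = "next_command"   # command passed
    LOG_ADVISORY = "log_advisory"   # warn only, keep going
    LOG_FAILURE = "log_failure"     # record failure, do not block
    AUTO_FIX = "auto_fix"           # enter the auto-fix loop


def decide(passed: bool, advisory: bool, auto_fix: bool) -> Action:
    """Map one verification result to the action the flow prescribes."""
    if passed:
        return Action.NEXT_COMMAND
    if advisory:
        return Action.LOG_ADVISORY
    if not auto_fix:
        return Action.LOG_FAILURE
    return Action.AUTO_FIX
```

Note that the advisory check comes before the `auto_fix` check, so an advisory command never enters the fix loop even when `auto_fix` is true.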
|
163
|
+
|
|
164
|
+
### Auto-Fix Loop
|
|
165
|
+
|
|
166
|
+
When a non-advisory verification command fails with `auto_fix: true`:
|
|
167
|
+
|
|
168
|
+
```
|
|
169
|
+
Attempt 1:
|
|
170
|
+
1. Read the command output (error messages, failing tests, lint errors)
|
|
171
|
+
2. Identify the issue — is it caused by the current task's changes?
|
|
172
|
+
- YES → fix the code, stage fixes, amend the commit
|
|
173
|
+
- NO (pre-existing) → mark this command as advisory for this session, log warning, continue
|
|
174
|
+
3. Re-run the verification command
|
|
175
|
+
4. If passes → continue to next command
|
|
176
|
+
5. If fails → attempt 2 (up to max_retries)
|
|
177
|
+
|
|
178
|
+
After max_retries exhausted:
|
|
179
|
+
→ Log the failure in the execution summary
|
|
180
|
+
→ Continue to next task (don't block the whole plan)
|
|
181
|
+
→ The verifying skill will catch persistent failures later
|
|
182
|
+
```
|
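The loop above could look roughly like this in Python. It is a sketch under stated assumptions: `run` and `try_fix` are hypothetical callables supplied by the caller (the real skill has the agent read output and amend the commit), and `max_retries` is read as the total number of fix attempts.

```python
from typing import Callable


def auto_fix_loop(
    cmd: str,
    run: Callable[[str], bool],
    try_fix: Callable[[str], bool],
    max_retries: int = 2,
) -> str:
    """Return "passed", "advisory", or "failed" for one failing command.

    run(cmd)     -> True when the verification command passes.
    try_fix(cmd) -> True when the failure came from the current task and a
                    fix was applied; False when the failure is pre-existing.
    """
    for _attempt in range(max_retries):
        if not try_fix(cmd):
            # Pre-existing failure: treat the command as advisory this session.
            return "advisory"
        if run(cmd):
            return "passed"
    # Retries exhausted: the caller logs this and moves on to the next task.
    return "failed"
```

A `"failed"` result is not raised as an error, which matches the rule that an exhausted loop must not block the rest of the plan.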
|
183
|
+
|
|
184
|
+
### Integration with 3-Strike Rule
|
|
185
|
+
|
|
186
|
+
Verification auto-fix attempts count toward the task's 3-strike limit. If a task has already used 2 strikes on implementation issues, it gets 1 verification retry max.
|
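The arithmetic behind that rule is small enough to show directly. The function name and the `STRIKE_LIMIT` constant are hypothetical; the sketch assumes implementation strikes and verification retries draw from the same budget of three.

```python
STRIKE_LIMIT = 3  # the task-level 3-strike limit


def verification_retries_allowed(strikes_used: int, max_retries: int) -> int:
    """Verification retries left once earlier strikes are accounted for."""
    remaining = max(0, STRIKE_LIMIT - strikes_used)
    return min(max_retries, remaining)
```

With `max_retries: 2`, a task that has used 2 strikes gets `min(2, 3 - 2) = 1` retry, and a task with no strikes gets the full 2.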
|
187
|
+
|
|
188
|
+
### What NOT to Fix in Verification
|
|
189
|
+
|
|
190
|
+
- **Pre-existing failures** not caused by the current task → mark advisory
|
|
191
|
+
- **Flaky tests** that pass on re-run without code changes → note in summary, don't count as a strike
|
|
192
|
+
- **Unrelated warnings** (deprecation notices, non-blocking lint info) → ignore
|
|
193
|
+
|
|
194
|
+
### Quick Tier
|
|
195
|
+
|
|
196
|
+
Verification gates also run for Quick tier tasks (`quick-tasking` skill). After the commit, run all non-advisory verification commands once. If they fail, show the output and let the agent fix — but limit to 1 retry (Quick tier shouldn't spend time in fix loops).
|
|
197
|
+
|
|
130
198
|
## Context Engineering
|
|
131
199
|
|
|
132
200
|
### When to Spawn Fresh Agent
|
|
@@ -29,7 +29,7 @@ Check for state files in this order:
|
|
|
29
29
|
4. **If one milestone:** Auto-select it. Inform user: *"Resuming milestone: [{name}] — status: {current.status}, tasks: {overall_percent}%"*
|
|
30
30
|
5. **If no active milestones:** Proceed to init or ask user to create one.
|
|
31
31
|
6. Load the selected milestone's state file (`.forge/state/milestone-{id}.yml`)
|
|
32
|
-
7. **Route based on `current.status`, NOT on `overall_percent`.** The `current.status` field is the authoritative workflow position. A milestone is only complete when `current.status` equals `complete`. Even if `overall_percent` is 100%, the milestone still needs to go through
|
|
32
|
+
7. **Route based on `current.status`, NOT on `overall_percent`.** The `current.status` field is the authoritative workflow position. A milestone is only complete when `current.status` equals `complete`. Even if `overall_percent` is 100%, the milestone still needs to go through remaining workflow steps (verifying, reviewing) before it is truly done.
|
|
33
33
|
8. Report position briefly, then **immediately route to the next skill** (see Step 3: Mandatory Auto-Routing):
|
|
34
34
|
- **Workflow status** (`current.status`) — this is the primary indicator of where you are
|
|
35
35
|
- Phase progress using precise terminology: "Executed" (code done, not verified), "Verified", "Pending", "In progress" — **never say "Complete" for a phase that hasn't been verified**
|
|
@@ -85,7 +85,7 @@ Match ANY:
|
|
|
85
85
|
- Integration with external service
|
|
86
86
|
- Estimated 1-8 hours of work
|
|
87
87
|
|
|
88
|
-
→ Route through: `researching` → `discussing` → `planning` → `executing` → `verifying` → `
|
|
88
|
+
→ Route through: `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing`
|
|
89
89
|
|
|
90
90
|
### Full Tier
|
|
91
91
|
Match ANY:
|
|
@@ -96,7 +96,7 @@ Match ANY:
|
|
|
96
96
|
- Estimated days of work
|
|
97
97
|
- User says "full", "complex", "project", "milestone"
|
|
98
98
|
|
|
99
|
-
→ Route through: `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `
|
|
99
|
+
→ Route through: `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing`
|
|
100
100
|
→ Add `designing` if UI work involved
|
|
101
101
|
→ Add `securing` if auth/data/API touched
|
|
102
102
|
|
|
@@ -140,12 +140,11 @@ This is a **briefing, not a prompt** — the user sees where they are and what's
|
|
|
140
140
|
| `discussing` | Invoke `Skill(discussing)`, then → `planning` (or `architecting` for Full) |
|
|
141
141
|
| `planning` | Invoke `Skill(planning)`, then → `executing` |
|
|
142
142
|
| `executing` | Invoke `Skill(executing)`, then → `verifying` |
|
|
143
|
-
| `verifying` | Invoke `Skill(verifying)`, then → `
|
|
144
|
-
| `
|
|
145
|
-
| `refactoring` | Invoke `Skill(refactoring)`, then → `complete` |
|
|
143
|
+
| `verifying` | Invoke `Skill(verifying)`, then → `reviewing` |
|
|
144
|
+
| `reviewing` | Invoke `Skill(reviewing)`, then → `complete` |
|
|
146
145
|
| `complete` | Milestone is done. Ask user what's next. |
|
|
147
146
|
|
|
148
|
-
- **Never treat a milestone as complete just because `overall_percent` is 100%.** Task completion and workflow completion are different. All planned tasks being done (100%) means execution is finished — verification
|
|
147
|
+
- **Never treat a milestone as complete just because `overall_percent` is 100%.** Task completion and workflow completion are different. All planned tasks being done (100%) means execution is finished — verification and reviewing still need to run.
|
|
149
148
|
- Skip completed phases (phases before `current.status`)
|
|
150
149
|
- Resume from current phase
|
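A minimal Python sketch of the routing rule, covering only the rows visible in this hunk. The dict layout of the state (a `current.status` key, with `overall_percent` stored beside it) is an assumption, `NEXT_STATUS`, `is_milestone_complete`, and `next_status` are hypothetical names, and the Full-tier detour through `architecting` is left out.

```python
from typing import Any, Optional

# Transitions copied from the routing table rows shown in this hunk.
NEXT_STATUS = {
    "discussing": "planning",
    "planning": "executing",
    "executing": "verifying",
    "verifying": "reviewing",
    "reviewing": "complete",
}


def is_milestone_complete(state: dict[str, Any]) -> bool:
    """Only `current.status` decides completion; `overall_percent` is ignored."""
    return state["current"]["status"] == "complete"


def next_status(state: dict[str, Any]) -> Optional[str]:
    """Status that follows the current one, or None when nothing follows."""
    return NEXT_STATUS.get(state["current"]["status"])
```

A state with `overall_percent` at 100 and `current.status` at `executing` still routes to `verifying`, which is the case the warning above describes.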
|
151
150
|
|
|
@@ -154,7 +153,7 @@ This is a **briefing, not a prompt** — the user sees where they are and what's
|
|
|
154
153
|
Sometimes a session ends before the executing skill advances `current.status`. On resume, detect and fix this:
|
|
155
154
|
|
|
156
155
|
- **If `current.status == executing`**: Check if all phases in the roadmap have been executed (all plans completed, commits made). If YES → advance `current.status` to `verifying` in the state file, then route to verifying. If NO → route to executing for the next unexecuted phase.
|
|
157
|
-
- **If `current.status == verifying`**: Check if verification report exists. If YES and it passed → advance to `
|
|
156
|
+
- **If `current.status == verifying`**: Check if verification report exists. If YES and it passed → advance to `reviewing`. If NO → route to verifying.
|
|
158
157
|
- **General rule**: If the work for the current status is done but the status wasn't advanced (session ended mid-handoff), advance it now and route to the next skill.
|
|
159
158
|
|
|
160
159
|
### Phase Status Wording
|
|
@@ -168,7 +167,7 @@ When reporting phase progress on resume, use precise terminology to avoid confus
|
|
|
168
167
|
| Not yet started | **"Pending"** | No work done yet |
|
|
169
168
|
| Currently in progress | **"In progress"** | Partially executed |
|
|
170
169
|
|
|
171
|
-
**Never say a phase is "Complete" unless it has passed through the full workflow** (executed + verified +
|
|
170
|
+
**Never say a phase is "Complete" unless it has passed through the full workflow** (executed + verified + reviewed). Use "Executed" for phases where code is done but verification hasn't run. This prevents users from thinking a phase is fully done when it still needs verification.
|
|
172
171
|
|
|
173
172
|
### On-Demand Discussion
|
|
174
173
|
|
|
@@ -184,8 +183,7 @@ While working at any tier, if you encounter:
|
|
|
184
183
|
| Missing validation/error handling/null checks | Auto-add, document | Rule 2 |
|
|
185
184
|
| Broken import/dep/config blocking progress | Auto-fix, document | Rule 3 |
|
|
186
185
|
| Need new DB table, service layer, library swap | **STOP. Ask user.** | Rule 4 |
|
|
187
|
-
| After verifying passes | Run health audit
|
|
188
|
-
| After auditing passes | Review refactoring opportunities | `refactoring` |
|
|
186
|
+
| After verifying passes | Run health audit + refactoring review | `reviewing` |
|
|
189
187
|
|
|
190
188
|
When uncertain → Rule 4 (ask). Never silently make architectural decisions.
|
|
191
189
|
|
|
@@ -202,7 +200,7 @@ Each phase produces persistent artifacts (state files, plans, reports, backlogs)
|
|
|
202
200
|
Recommend clearing context at every phase boundary in Standard and Full tiers:
|
|
203
201
|
|
|
204
202
|
```
|
|
205
|
-
researching → [clear] → discussing → [clear] → architecting → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] →
|
|
203
|
+
researching → [clear] → discussing → [clear] → architecting → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] → reviewing
|
|
206
204
|
```
|
|
207
205
|
|
|
208
206
|
**Skip the recommendation when:**
|
|
@@ -233,8 +231,7 @@ Each skill ends with a standard handoff message. The pattern is:
|
|
|
233
231
|
| architecting | ADRs in `.forge/decisions/`, data models, API contracts | planning reads decisions |
|
|
234
232
|
| planning | Plans in `.forge/phases/m{M}-{N}-{name}/`, requirements.yml, roadmap.yml, context.md | executing reads plans |
|
|
235
233
|
| executing | Committed code, execution summary, milestone state updated | verifying reads must_haves from plans |
|
|
236
|
-
| verifying | Verification report, desire paths updated |
|
|
237
|
-
| auditing | Health report in `.forge/audits/` | refactoring reads health report + git diff |
|
|
234
|
+
| verifying | Verification report, desire paths updated | reviewing reads project.yml + source files + git diff |
|
|
238
235
|
|
|
239
236
|
### Context Loading on Resume
|
|
240
237
|
|
|
@@ -243,7 +240,7 @@ When a skill starts after a context clear, it must load its required state from
|
|
|
243
240
|
## State Transitions
|
|
244
241
|
|
|
245
242
|
```
|
|
246
|
-
not_started → [init if new] → researching → [clear] → discussing → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] →
|
|
243
|
+
not_started → [init if new] → researching → [clear] → discussing → [clear] → planning → [clear] → executing → [clear] → verifying → [clear] → reviewing → complete
|
|
247
244
|
↗ debugging (if stuck)
|
|
248
245
|
↗ designing (if UI)
|
|
249
246
|
↗ securing (if auth/data)
|
|
@@ -212,6 +212,67 @@ From this, auto-detect:
|
|
|
212
212
|
- Database (from dependencies or config)
|
|
213
213
|
- Project name and description (from package.json or README)
|
|
214
214
|
|
|
215
|
+
### Discovery Step 1.5: Verification Command Detection
|
|
216
|
+
|
|
217
|
+
Auto-detect verification commands from the project's package manifest and config:
|
|
218
|
+
|
|
219
|
+
```bash
|
|
220
|
+
# Node.js projects — scan package.json scripts
|
|
221
|
+
Read: package.json → scripts
|
|
222
|
+
|
|
223
|
+
# Look for these common script names:
|
|
224
|
+
# test, test:unit, test:integration, test:e2e
|
|
225
|
+
# lint, lint:fix, eslint
|
|
226
|
+
# typecheck, type-check, tsc, check-types
|
|
227
|
+
# check (often runs multiple checks)
|
|
228
|
+
# build (catches type errors in compiled languages)
|
|
229
|
+
|
|
230
|
+
# Python projects
|
|
231
|
+
Check: Makefile, tox.ini, pyproject.toml for test/lint commands
|
|
232
|
+
# e.g., pytest, ruff check, mypy
|
|
233
|
+
|
|
234
|
+
# Go projects
|
|
235
|
+
# Standard: go test ./..., go vet ./...
|
|
236
|
+
|
|
237
|
+
# Rust projects
|
|
238
|
+
# Standard: cargo test, cargo clippy
|
|
239
|
+
```
|
|
240
|
+
|
|
241
|
+
**Auto-detection rules for Node.js (most common):**
|
|
242
|
+
|
|
243
|
+
| `package.json` script | Maps to verification command |
|
|
244
|
+
|----------------------|------------------------------|
|
|
245
|
+
| `test` or `test:unit` | `npm test` or `npm run test:unit` |
|
|
246
|
+
| `lint` | `npm run lint` |
|
|
247
|
+
| `typecheck` or `type-check` or `check-types` | `npm run typecheck` (etc.) |
|
|
248
|
+
| `tsc` in scripts or `typescript` in devDeps | `npx tsc --noEmit` |
|
|
249
|
+
| `eslint` in devDeps but no `lint` script | `npx eslint .` |
|
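The table above can be approximated in Python as follows. This is a sketch, not the skill's actual logic: `detect_node_commands` is a hypothetical name, "`tsc` in scripts" is read as a script named `tsc`, and the precedence between rows (an explicit script wins over the `npx` fallback) is an assumption.

```python
from typing import Any


def detect_node_commands(package_json: dict[str, Any]) -> list[str]:
    """Suggest verification commands from a parsed package.json."""
    scripts = package_json.get("scripts") or {}
    dev_deps = package_json.get("devDependencies") or {}
    commands: list[str] = []

    # Tests: prefer `test`, fall back to `test:unit`.
    if "test" in scripts:
        commands.append("npm test")
    elif "test:unit" in scripts:
        commands.append("npm run test:unit")

    # Lint: a `lint` script wins; otherwise use eslint directly if installed.
    if "lint" in scripts:
        commands.append("npm run lint")
    elif "eslint" in dev_deps:
        commands.append("npx eslint .")

    # Type checking: a named script wins; otherwise call tsc directly.
    typecheck_names = ("typecheck", "type-check", "check-types")
    typecheck = next((n for n in typecheck_names if n in scripts), None)
    if typecheck is not None:
        commands.append(f"npm run {typecheck}")
    elif "tsc" in scripts or "typescript" in dev_deps:
        commands.append("npx tsc --noEmit")

    return commands
```

The result is only a list of candidates; the user still confirms or adjusts it before anything is written to `project.yml`.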
|
250
|
+
|
|
251
|
+
**Advisory mode detection:** After identifying commands, run each one once silently to check baseline health. Commands that fail on the current codebase (before Forge changes anything) are marked `advisory: true` — they'll run but won't block execution. This prevents Forge from being blocked by pre-existing tech debt.
|
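One way to sketch the baseline pass in Python, assuming a command counts as passing when it exits with status 0. `run_silently` and `baseline` are hypothetical names; the runner is injectable so the classification can be exercised without running real project commands.

```python
import subprocess
from typing import Callable


def run_silently(cmd: str) -> bool:
    """Run a shell command with output discarded; True on exit status 0."""
    result = subprocess.run(
        cmd,
        shell=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def baseline(
    commands: list[str],
    run: Callable[[str], bool] = run_silently,
) -> list[dict]:
    """Mark each command advisory when it already fails on the current codebase."""
    return [{"cmd": cmd, "advisory": not run(cmd)} for cmd in commands]
```

For example, `baseline(["npm run lint", "npx tsc --noEmit"])` run before any Forge changes would mark only the commands that already fail as `advisory: True`.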
|
252
|
+
|
|
253
|
+
```yaml
|
|
254
|
+
# Example auto-detected config:
|
|
255
|
+
verification:
|
|
256
|
+
commands:
|
|
257
|
+
- cmd: "npm run lint"
|
|
258
|
+
advisory: false # passes on current codebase
|
|
259
|
+
- cmd: "npm test"
|
|
260
|
+
advisory: false # passes on current codebase
|
|
261
|
+
- cmd: "npx tsc --noEmit"
|
|
262
|
+
advisory: true # FAILED on current codebase — pre-existing type errors
|
|
263
|
+
auto_fix: true
|
|
264
|
+
max_retries: 2
|
|
265
|
+
```
|
|
266
|
+
|
|
267
|
+
Present findings to user:
|
|
268
|
+
|
|
269
|
+
*"I detected these verification commands from your project:*
|
|
270
|
+
- *`npm run lint` — passes currently*
|
|
271
|
+
- *`npm test` — passes currently*
|
|
272
|
+
- *`npx tsc --noEmit` — currently failing (pre-existing type errors, will run in advisory mode)*
|
|
273
|
+
|
|
274
|
+
*These will run automatically after each task to catch regressions. Want to add, remove, or adjust any?"*
|
|
275
|
+
|
|
215
276
|
### Discovery Step 2: Design System Detection
|
|
216
277
|
|
|
217
278
|
```bash
|
|
@@ -336,6 +397,24 @@ Then:
|
|
|
336
397
|
- Skip design-system.md creation
|
|
337
398
|
- Disable Article V in the constitution
|
|
338
399
|
|
|
400
|
+
### Greenfield Step 2.5: Verification Commands
|
|
401
|
+
|
|
402
|
+
Ask: *"What verification commands should run after each task? Common options:"*
|
|
403
|
+
|
|
404
|
+
| If your stack includes... | Suggested commands |
|
|
405
|
+
|--------------------------|-------------------|
|
|
406
|
+
| TypeScript | `npx tsc --noEmit` |
|
|
407
|
+
| ESLint / Biome | `npm run lint` |
|
|
408
|
+
| Jest / Vitest / Mocha | `npm test` |
|
|
409
|
+
| Python + pytest | `pytest` |
|
|
410
|
+
| Python + ruff | `ruff check .` |
|
|
411
|
+
| Go | `go test ./...`, `go vet ./...` |
|
|
412
|
+
| Rust | `cargo test`, `cargo clippy` |
|
|
413
|
+
|
|
414
|
+
Pre-fill based on the tech stack chosen in Step 1. User confirms or adjusts. Write to the `verification` section of `project.yml`.
|
|
415
|
+
|
|
416
|
+
If the user doesn't want verification gates: set `verification.commands: []` — the executing skill will skip the gate entirely.
|
|
417
|
+
|
|
339
418
|
### Greenfield Step 3: Constitutional Setup
|
|
340
419
|
|
|
341
420
|
Present the 9 constitutional articles grouped by domain:
|
|
@@ -363,7 +442,7 @@ Ask: *"Which of these articles apply? I'd recommend [suggest based on stack and
|
|
|
363
442
|
|
|
364
443
|
## Finalize Init (Both Paths)
|
|
365
444
|
|
|
366
|
-
1. Write `.forge/project.yml` with all gathered/discovered info
|
|
445
|
+
1. Write `.forge/project.yml` with all gathered/discovered info (including `verification` section with detected/configured commands)
|
|
367
446
|
2. Write `.forge/constitution.md` with selected articles
|
|
368
447
|
3. Write `.forge/design-system.md` (if design system configured)
|
|
369
448
|
4. Initialize milestone-aware state:
|