forge-orkes 0.3.7 → 0.3.8

package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "forge-orkes",
- "version": "0.3.7",
+ "version": "0.3.8",
  "description": "Set up the Forge meta-prompting framework for Claude Code in your project",
  "bin": {
  "create-forge": "./bin/create-forge.js"
@@ -1,6 +1,6 @@
  {
  "forge": {
- "version": "0.3.1",
+ "version": "0.1.0",
  "default_tier": "standard",
  "beads_integration": false,
  "context_gates": {
@@ -17,7 +17,11 @@
  "refactor",
  "chore",
  "docs"
- ]
+ ],
+ "verification": {
+ "auto_fix": true,
+ "max_retries": 2
+ }
  },
  "hooks": {
  "PostToolUse": [
@@ -92,7 +92,8 @@ For each task in the plan:
  5. **Verify** using the verify step (run tests, inspect output)
  6. **Confirm** done criteria are met
  7. **Commit** atomically
- 8. **Mark complete** — `TaskUpdate` the native task to `completed`
+ 8. **Run verification gate** — execute configured verification commands (see Verification Gate below)
+ 9. **Mark complete** — `TaskUpdate` the native task to `completed`

  ## TDD Flow (When task type="tdd")
 
@@ -127,6 +128,73 @@ feat(auth-01): implement JWT-based login
  - Include integration test for login flow
  ```

+ ## Verification Gate
+
+ After each task commit, run the project's configured verification commands to catch regressions immediately. This is mechanical enforcement — not optional, not agent-directed.
+
+ ### Load Config
+
+ Read `.forge/project.yml → verification`. If `verification.commands` is empty or missing, skip this section entirely.
+
+ ```yaml
+ # Example from project.yml:
+ verification:
+   commands:
+     - cmd: "npm run lint"
+       advisory: false
+     - cmd: "npm test"
+       advisory: false
+     - cmd: "npx tsc --noEmit"
+       advisory: true # pre-existing failures — warn only
+   auto_fix: true
+   max_retries: 2
+ ```
+
+ ### Execution Flow
+
+ For each verification command, in order:
+
+ 1. **Run the command** via Bash
+ 2. **If it passes** → move to next command
+ 3. **If it fails:**
+    - **Advisory command?** → Log warning: *"Advisory: `{cmd}` failed (pre-existing issue, not blocking)."* Move to next command.
+    - **Non-advisory + `auto_fix: false`?** → Log failure and continue to next task. Don't block.
+    - **Non-advisory + `auto_fix: true`?** → Enter auto-fix loop (below)
+
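The branching in the execution flow above can be sketched as a small pure function. This is an illustrative reading of the rules, not code from the package; the names `Action`, `Command`, and `decide` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NEXT_COMMAND = "next_command"  # command passed: move to the next one
    WARN = "warn"                  # advisory failure: log a warning, move on
    LOG_FAILURE = "log_failure"    # blocking failure, auto_fix off: log, don't block
    AUTO_FIX = "auto_fix"          # blocking failure, auto_fix on: enter the fix loop


@dataclass
class Command:
    cmd: str
    advisory: bool = False


def decide(command: Command, passed: bool, auto_fix: bool) -> Action:
    """Map one verification result to the next step described in the flow."""
    if passed:
        return Action.NEXT_COMMAND
    if command.advisory:
        return Action.WARN
    return Action.AUTO_FIX if auto_fix else Action.LOG_FAILURE
```

Keeping the decision separate from actually running the command makes each branch easy to check in isolation.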
+ ### Auto-Fix Loop
+
+ When a non-advisory verification command fails with `auto_fix: true`:
+
+ ```
+ Attempt 1:
+   1. Read the command output (error messages, failing tests, lint errors)
+   2. Identify the issue — is it caused by the current task's changes?
+      - YES → fix the code, stage fixes, amend the commit
+      - NO (pre-existing) → mark this command as advisory for this session, log warning, continue
+   3. Re-run the verification command
+   4. If passes → continue to next command
+   5. If fails → attempt 2 (up to max_retries)
+
+ After max_retries exhausted:
+   → Log the failure in the execution summary
+   → Continue to next task (don't block the whole plan)
+   → The verifying skill will catch persistent failures later
+ ```
+
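A minimal sketch of the retry loop described above, assuming the command has already failed once before the loop starts. The callables `run` and `try_fix` are hypothetical stand-ins for re-running the command and for the agent's diagnose-and-fix step; the return strings are also illustrative.

```python
from typing import Callable


def auto_fix_loop(
    run: Callable[[], bool],
    try_fix: Callable[[], str],
    max_retries: int = 2,
) -> str:
    """Return "passed", "advisory", or "failed".

    run() re-runs the verification command and returns True when it passes.
    try_fix() inspects the failure and returns "fixed" after amending the
    commit, or "pre_existing" when the task's changes did not cause it.
    """
    for _attempt in range(max_retries):
        if try_fix() == "pre_existing":
            return "advisory"  # downgrade for this session and continue
        if run():
            return "passed"
    # Retries exhausted (or max_retries == 0): log the failure, don't block the plan.
    return "failed"
```

With `max_retries=0` the loop body never runs, so the function reports the failure immediately.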
+ ### Integration with 3-Strike Rule
+
+ Verification auto-fix attempts count toward the task's 3-strike limit. If a task has already used 2 strikes on implementation issues, it gets 1 verification retry max.
+
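One way to read the arithmetic: the retries available are the smaller of `max_retries` and the strikes the task has left. The formula below is an interpretation of the sentence above, not something the package states; `verification_retries_allowed` and `STRIKE_LIMIT` are hypothetical names.

```python
STRIKE_LIMIT = 3  # the task-level 3-strike limit


def verification_retries_allowed(strikes_used: int, max_retries: int = 2) -> int:
    """Cap verification retries by the strikes the task has not yet used."""
    remaining = max(STRIKE_LIMIT - strikes_used, 0)
    return min(max_retries, remaining)
```

For the case in the text, 2 strikes already used with `max_retries: 2` leaves `min(2, 3 - 2) = 1` retry.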
+ ### What NOT to Fix in Verification
+
+ - **Pre-existing failures** not caused by the current task → mark advisory
+ - **Flaky tests** that pass on re-run without code changes → note in summary, don't count as a strike
+ - **Unrelated warnings** (deprecation notices, non-blocking lint info) → ignore
+
+ ### Quick Tier
+
+ Verification gates also run for Quick tier tasks (`quick-tasking` skill). After the commit, run all non-advisory verification commands once. If they fail, show the output and let the agent fix — but limit to 1 retry (Quick tier shouldn't spend time in fix loops).
+
  ## Context Engineering

  ### When to Spawn Fresh Agent
@@ -212,6 +212,67 @@ From this, auto-detect:
  - Database (from dependencies or config)
  - Project name and description (from package.json or README)

+ ### Discovery Step 1.5: Verification Command Detection
+
+ Auto-detect verification commands from the project's package manifest and config:
+
+ ```bash
+ # Node.js projects — scan package.json scripts
+ Read: package.json → scripts
+
+ # Look for these common script names:
+ # test, test:unit, test:integration, test:e2e
+ # lint, lint:fix, eslint
+ # typecheck, type-check, tsc, check-types
+ # check (often runs multiple checks)
+ # build (catches type errors in compiled languages)
+
+ # Python projects
+ Check: Makefile, tox.ini, pyproject.toml for test/lint commands
+ # e.g., pytest, ruff check, mypy
+
+ # Go projects
+ # Standard: go test ./..., go vet ./...
+
+ # Rust projects
+ # Standard: cargo test, cargo clippy
+ ```
+
+ **Auto-detection rules for Node.js (most common):**
+
+ | `package.json` script | Maps to verification command |
+ |----------------------|------------------------------|
+ | `test` or `test:unit` | `npm test` or `npm run test:unit` |
+ | `lint` | `npm run lint` |
+ | `typecheck` or `type-check` or `check-types` | `npm run typecheck` (etc.) |
+ | `tsc` in scripts or `typescript` in devDeps | `npx tsc --noEmit` |
+ | `eslint` in devDeps but no `lint` script | `npx eslint .` |
+
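The mapping table above could be applied roughly as follows. This is a sketch under assumptions the table leaves open: "`tsc` in scripts" is read as a script *named* `tsc`, and a dedicated typecheck script takes precedence over the `npx tsc --noEmit` fallback. `detect_node_commands` is a hypothetical helper, not part of the package.

```python
import json

# Script names the table treats as type-check entry points.
TYPECHECK_SCRIPTS = ("typecheck", "type-check", "check-types")


def detect_node_commands(package_json_text: str) -> list[str]:
    """Derive verification commands from the text of a package.json file."""
    pkg = json.loads(package_json_text)
    scripts = pkg.get("scripts", {})
    dev_deps = pkg.get("devDependencies", {})
    commands = []

    if "test" in scripts:
        commands.append("npm test")
    elif "test:unit" in scripts:
        commands.append("npm run test:unit")

    if "lint" in scripts:
        commands.append("npm run lint")
    elif "eslint" in dev_deps:
        commands.append("npx eslint .")

    typecheck = next((name for name in TYPECHECK_SCRIPTS if name in scripts), None)
    if typecheck:
        commands.append(f"npm run {typecheck}")
    elif "tsc" in scripts or "typescript" in dev_deps:
        commands.append("npx tsc --noEmit")

    return commands
```

The function takes the file contents as a string so it can be exercised without touching the filesystem.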
+ **Advisory mode detection:** After identifying commands, run each one once silently to check baseline health. Commands that fail on the current codebase (before Forge changes anything) are marked `advisory: true` — they'll run but won't block execution. This prevents Forge from being blocked by pre-existing tech debt.
+
+ ```yaml
+ # Example auto-detected config:
+ verification:
+   commands:
+     - cmd: "npm run lint"
+       advisory: false # passes on current codebase
+     - cmd: "npm test"
+       advisory: false # passes on current codebase
+     - cmd: "npx tsc --noEmit"
+       advisory: true # FAILED on current codebase — pre-existing type errors
+   auto_fix: true
+   max_retries: 2
+ ```
+
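The baseline check behind advisory mode can be sketched as one silent run per command, treating a non-zero exit code as "already failing". This is an illustration only; `classify_baseline` is a hypothetical name, and the assumption that exit status alone decides pass or fail is mine.

```python
import subprocess


def classify_baseline(commands: list[str]) -> list[dict]:
    """Run each command once, silently, and mark failures as advisory."""
    config = []
    for cmd in commands:
        # capture_output keeps the run quiet; only the exit status is used.
        result = subprocess.run(cmd, shell=True, capture_output=True)
        config.append({"cmd": cmd, "advisory": result.returncode != 0})
    return config
```

The resulting list has the same shape as the `commands` entries in the example config above.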
+ Present findings to user:
+
+ *"I detected these verification commands from your project:*
+ - *`npm run lint` — passes currently*
+ - *`npm test` — passes currently*
+ - *`npx tsc --noEmit` — currently failing (pre-existing type errors, will run in advisory mode)*
+
+ *These will run automatically after each task to catch regressions. Want to add, remove, or adjust any?"*
+
  ### Discovery Step 2: Design System Detection

  ```bash
@@ -336,6 +397,24 @@ Then:
  - Skip design-system.md creation
  - Disable Article V in the constitution

+ ### Greenfield Step 2.5: Verification Commands
+
+ Ask: *"What verification commands should run after each task? Common options:"*
+
+ | If your stack includes... | Suggested commands |
+ |--------------------------|-------------------|
+ | TypeScript | `npx tsc --noEmit` |
+ | ESLint / Biome | `npm run lint` |
+ | Jest / Vitest / Mocha | `npm test` |
+ | Python + pytest | `pytest` |
+ | Python + ruff | `ruff check .` |
+ | Go | `go test ./...`, `go vet ./...` |
+ | Rust | `cargo test`, `cargo clippy` |
+
+ Pre-fill based on the tech stack chosen in Step 1. User confirms or adjusts. Write to the `verification` section of `project.yml`.
+
+ If the user doesn't want verification gates: set `verification.commands: []` — the executing skill will skip the gate entirely.
+
  ### Greenfield Step 3: Constitutional Setup

  Present the 9 constitutional articles grouped by domain:
@@ -363,7 +442,7 @@ Ask: *"Which of these articles apply? I'd recommend [suggest based on stack and

  ## Finalize Init (Both Paths)

- 1. Write `.forge/project.yml` with all gathered/discovered info
+ 1. Write `.forge/project.yml` with all gathered/discovered info (including `verification` section with detected/configured commands)
  2. Write `.forge/constitution.md` with selected articles
  3. Write `.forge/design-system.md` (if design system configured)
  4. Initialize milestone-aware state:
@@ -30,6 +30,17 @@ constraints:
    - "" # e.g., "No custom auth — use Clerk"
    - "" # e.g., "No server-side rendering"

+ verification:
+   commands: # Shell commands run after each task commit
+     - "" # e.g., "npm run lint"
+     - "" # e.g., "npm test"
+     - "" # e.g., "npx tsc --noEmit"
+   auto_fix: true # On failure, agent fixes and retries
+   max_retries: 2 # Max auto-fix attempts per command (0 = fail immediately)
+   # Commands are auto-detected during init from package.json scripts.
+   # Advisory mode: commands that were already failing before Forge started
+   # run but don't block — they log warnings only.
+
  success_criteria: # How do we know we're done?
    - "" # e.g., "User can create and edit posts"
    - "" # e.g., "All tests pass with >80% coverage"
@@ -115,7 +115,7 @@ For Quick tier tasks, init is skipped — just do the work.
  ## State Management

  Project state lives in `.forge/`:
- - `project.yml` — Vision, stack, design system, constraints (< 5 KB)
+ - `project.yml` — Vision, stack, design system, verification commands, constraints (< 5 KB)
  - `constitution.md` — Active architectural gates (selected during init)
  - `design-system.md` — Component mapping table (generated during init)
  - `requirements.yml` — Structured requirements with `[NEEDS CLARIFICATION]` markers
@@ -146,6 +146,27 @@ When the executor encounters issues during building:

  Priority: Rule 4 first (stop if architectural). Then Rules 1-3 (auto-fix). Uncertain? → Rule 4 (ask).

+ ## Verification Gates
+
+ After each task commit, the executor runs configured verification commands from `project.yml`:
+
+ ```yaml
+ verification:
+   commands:
+     - cmd: "npm run lint"
+     - cmd: "npm test"
+     - cmd: "npx tsc --noEmit"
+       advisory: true # pre-existing failures — warn only
+   auto_fix: true # agent fixes and retries on failure
+   max_retries: 2 # max auto-fix attempts per command
+ ```
+
+ - **Auto-detected during init** from `package.json` scripts (test, lint, typecheck)
+ - **Advisory mode**: commands that were already failing before Forge started run but don't block
+ - **Auto-fix loop**: on failure, agent reads output, fixes code, amends commit, re-runs (up to max_retries)
+ - **3-strike integration**: verification retries count toward the task's 3-strike limit
+ - Empty `commands` list = no verification gate (opt-out)
+
  ## Beads Integration (Optional)

  When Beads is installed, Forge gains persistent cross-session memory: