forge-orkes 0.9.4 → 0.9.6
- package/package.json +1 -1
- package/template/.claude/agents/verifier.md +5 -0
- package/template/.claude/settings.json +1 -1
- package/template/.claude/skills/discussing/SKILL.md +1 -6
- package/template/.claude/skills/executing/SKILL.md +18 -5
- package/template/.claude/skills/forge/SKILL.md +11 -1
- package/template/.claude/skills/initializing/SKILL.md +11 -0
- package/template/.claude/skills/verifying/SKILL.md +68 -0
- package/template/.forge/templates/framework-absorption/gsd.md +1 -1
- package/template/.forge/templates/interface-detection.md +17 -0
- package/template/.forge/templates/plan.md +1 -0
- package/template/.forge/templates/project.yml +28 -7
package/package.json
CHANGED
package/template/.claude/agents/verifier.md
CHANGED
@@ -60,6 +60,7 @@ Run: {n} | Passed: {n} | Failed: {n} | Coverage: {if available}
 Read: .forge/phases/m{M}-{N}-{name}/plan.md → extract must_haves
 Read: .forge/state/milestone-{id}.yml → reported progress
 Read: .forge/context.md → locked decisions
+Read: .forge/deferred-issues.md → known pre-existing failures (if exists; treat as advisory)
 ```

 ### 2. Truths
@@ -90,6 +91,10 @@ npm run build 2>&1
 npm run lint 2>&1
 ```

+Cross-reference failures against `deferred-issues.md`:
+- Match on test name or summary → advisory; label "Pre-existing (deferred: DI-{N})"
+- No match → regression; include in Issues (Critical) and gaps
+
 ### 6. Anti-Pattern Scan

 | Pattern | Command | Severity |
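The cross-reference rule added to verifier.md can be sketched as below. This is a minimal illustration, not Forge's implementation: the diff does not show the layout of `deferred-issues.md`, so the `DeferredIssue` record and the match rule (exact test name, or test name appearing in the summary) are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeferredIssue:
    """Hypothetical record for one entry in .forge/deferred-issues.md."""
    issue_id: str   # e.g. "DI-3"
    test_name: str
    summary: str


def classify_failures(failures, deferred):
    """Split failing test names into (regressions, advisory labels).

    A failure is advisory when it matches a deferred issue by test name
    or appears in that issue's summary; anything else is a regression.
    """
    regressions, advisory = [], []
    for name in failures:
        match = next(
            (d for d in deferred if name == d.test_name or name in d.summary),
            None,
        )
        if match is None:
            regressions.append(name)
        else:
            advisory.append(f"{name}: Pre-existing (deferred: {match.issue_id})")
    return regressions, advisory
```

Only the regressions list would feed the Issues (Critical) section; the advisory labels are reported but do not fail verification.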
package/template/.claude/settings.json
CHANGED
@@ -51,7 +51,7 @@
 "hooks": [
 {
 "type": "command",
-"command": "if [ ! -f
+"command": "if [ ! -f \"$(git rev-parse --show-toplevel 2>/dev/null)/.forge/.active-skill\" ]; then echo \"[Forge] No active skill. Invoke /forge or /quick-tasking before editing code. To bypass: touch .forge/.active-skill\" >&2; exit 2; fi"
 }
 ]
 },
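For readers who find the one-line shell hook hard to scan, the same check can be restated in Python. This is only a sketch of the logic visible in the added line; the function names `repo_root` and `hook_exit_code` are hypothetical and not part of the package.

```python
import subprocess
import sys
from pathlib import Path

MESSAGE = ("[Forge] No active skill. Invoke /forge or /quick-tasking before "
           "editing code. To bypass: touch .forge/.active-skill")


def repo_root():
    """Return the git top-level directory, or "" if git fails or is missing."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def hook_exit_code(root):
    """Mirror the hook: 0 when the marker file exists, 2 otherwise."""
    marker = Path(f"{root}/.forge/.active-skill")
    if marker.is_file():
        return 0
    print(MESSAGE, file=sys.stderr)
    return 2
```

Outside a git repository `repo_root()` returns an empty string, so the marker path resolves to `/.forge/.active-skill`, as it would in the shell version.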
package/template/.claude/skills/discussing/SKILL.md
CHANGED
@@ -23,21 +23,16 @@ Structured conversation: approach, trade-offs, decisions. Clarity, not artifacts

 ## Progressive Persistence

-
-> Write to `.forge/context.md` after EVERY confirmed decision, before asking the next question.
-> If you reach convergence without having written each decision individually — you already violated this rule.
+**Decisions persist immediately. This is a hard gate — not a guideline.**

 After each user response that confirms a decision:

 1. **STOP** — do not continue the discussion
 2. **Write to disk** — append to `.forge/context.md` (create from template if missing)
-- **Path is always `.forge/context.md`** — never `.forge/phases/*/context.md`, never inline memory
 3. **THEN** continue to the next question

 Never accumulate decisions in working memory. Never batch writes to convergence.

-**Common failure mode:** Agent says "decisions live in conversation; planning writes context.md." This is WRONG. Discussing writes context.md. Planning reads it.
-
 ### Write Protocol

 **On first decision of the session:**
package/template/.claude/skills/executing/SKILL.md
CHANGED
@@ -10,6 +10,18 @@ description: "Build to plan with atomic commits, deviation rules, and context en
 - [ ] Context.md locked decisions noted
 - [ ] Constitution.md gates satisfied
 - [ ] Milestone state updated to `status: executing`
+- [ ] Baseline snapshot captured (see below)
+
+## Baseline Snapshot
+
+Run **before the first task begins**. Makes failure causality mechanical — no self-assessment.
+
+1. Run all non-advisory verification commands
+2. Record which tests/checks pass and which fail — this is the baseline
+3. After each commit: failures **not in baseline** = introduced by this task = **must fix under Rule 3**
+4. Failures **present in baseline** = pre-existing → append to `.forge/deferred-issues.md`, mark `status: pending`
+
+Skip only if re-entering an in-progress execution and `deferred-issues.md` already documents all current failures.

 ## Deviation Rules

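The baseline comparison in steps 3 and 4 of the new section reduces to a set difference. The sketch below assumes failures are identified by plain string names; `triage_after_commit` is a hypothetical helper, and how Forge actually records the baseline is not shown in this diff.

```python
def triage_after_commit(baseline_failures, current_failures):
    """Compare failures after a commit against the baseline snapshot.

    Returns (introduced, pre_existing): introduced failures must be fixed
    by the current task; pre-existing ones are deferred.
    """
    baseline = set(baseline_failures)
    current = set(current_failures)
    introduced = sorted(current - baseline)    # not in baseline: fix now
    pre_existing = sorted(current & baseline)  # in baseline: defer
    return introduced, pre_existing
```

Because the split depends only on the recorded snapshot, no judgment about who caused a failure is needed, which is the point the added text makes about causality being mechanical.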
@@ -116,9 +128,9 @@ For each command, in order:
 ```
 Attempt 1:
 1. Read error output
-2.
-- YES →
-- NO
+2. In baseline snapshot?
+- YES (pre-existing) → mark advisory for this session; append to .forge/deferred-issues.md; continue
+- NO (introduced by this task) → fix code, stage fixes, amend commit
 3. Re-run command
 4. Pass → next command
 5. Fail → next attempt (up to max_retries)
@@ -133,7 +145,7 @@ After max_retries exhausted:
 Verification retries count toward the task's 3-strike limit. 2 strikes used = 1 verification retry max.

 ### Do NOT Fix
-- **Pre-existing failures**
+- **Pre-existing failures** present in baseline snapshot → mark advisory; append to `.forge/deferred-issues.md`
 - **Flaky tests** passing on re-run without changes → note in summary, no strike
 - **Unrelated warnings** (deprecation, non-blocking lint) → ignore

@@ -190,7 +202,8 @@ After completing all tasks in a plan:
 - src/components/Login.tsx (created)

 ## Notes
-[
+[Genuine handoff context only: environment requirements, seed data, external services needed (e.g. "Redis must be running for auth tests").
+Do NOT list test failures here — pre-existing failures belong in deferred-issues.md; task-introduced failures must be fixed before committing.]
 ```

 ## State Updates
package/template/.claude/skills/forge/SKILL.md
CHANGED
@@ -9,7 +9,7 @@ Entry point. Detect tier, route skills, manage transitions. New projects → ini

 ## Step 1: Read State

-### Milestone Selection
+### 1.1 Milestone Selection

 Check state files:
 1. `.forge/state/index.yml` → milestone-aware
@@ -33,6 +33,8 @@ Beads enabled (`forge.beads_integration: true`) → `bd prime`. Optional.

 Read `.forge/context.md`. **Needs Resolution** unchecked → warn: *"{N} unresolved discrepancies."* Quick proceeds; Standard/Full blocked at planning.

+### 1.2 Backlog + Desire Paths
+
 Check `.forge/refactor-backlog.yml`:
 - *"{N} refactors ({Q} quick, {S} standard). Tackle first?"*
 - Pick → `quick-tasking`/Standard, set `in_progress`. Decline → proceed.
@@ -41,6 +43,14 @@ Check `desire_paths` 3+ occurrences:
 - *"Recurring: [{description}] ({N}x). Fix via [suggestion]?"*
 - Agree → apply, reset. Decline → note, don't nag.

+### 1.3 Interface Check
+
+Check `interface` in `project.yml`:
+- Field present → skip. (If scalar `none` found, normalize to `[none]`; see `.forge/templates/interface-detection.md` for empty-field canon.)
+- Field absent → detect using `.forge/templates/interface-detection.md`. No match → write `[none]`.
+- Prompt: *"No `interface` field found. Detected: [{list}]. Add to project.yml? (yes/no)"*
+- Yes → write `interface: [{list}]` to project.yml. No → skip for this session.
+
 No `project.yml` + Standard/Full → **Step 2A**. State exists → **Step 2B**.

 ## Step 1.5: Lifecycle Operations
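The present/absent branching in the new 1.3 Interface Check can be illustrated with a small helper over an already-parsed `project.yml` dictionary. The function names are hypothetical, and wrapping scalars other than `none` in a list is an assumption; the source only specifies the `none` case.

```python
def normalize_interface(project):
    """Return (interface_list, needs_detection) for a parsed project.yml dict."""
    if "interface" not in project:
        return None, True            # field absent: run detection
    value = project["interface"]
    if value is None:                # `interface:` with no value
        return [], False
    if isinstance(value, str):       # scalar `none` becomes [none]
        return [value], False        # assumption: other scalars wrapped too
    return list(value), False


def declares_interface(interfaces):
    """False for absent, [] and [none]; True when a real surface is listed."""
    return bool(interfaces) and list(interfaces) != ["none"]
```

A caller would run detection only when `needs_detection` is true and otherwise leave the field alone apart from the list normalization.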
package/template/.claude/skills/initializing/SKILL.md
CHANGED
@@ -143,6 +143,11 @@ Bash: wc -l src/**/*.{ts,tsx,js,jsx} 2>/dev/null | tail -1 # codebase size

 Auto-detect: language, framework, build tools, tests, DB, name.

+### Step 1.2: Interface Type
+
+Detect which surfaces the project exposes. See `.forge/templates/interface-detection.md` for detection signals. No match → `[none]`.
+Present: *"Detected interfaces: [{list}]. Override?"* — user can correct before project.yml is written.
+
 ### Step 1.5: Verification Commands

 ```bash
@@ -258,6 +263,12 @@ Present grouped. User confirms/adds/removes.

 User describes project → `.forge/project.yml`: name, goal, stack, constraints, success criteria, risks.

+### Step 1.5: Interface Type
+
+*"What interfaces does this project expose? (browser / cli / api / desktop / native-apple / none — comma-separate multiples)"*
+
+Validate each term against `.forge/templates/interface-detection.md` type vocabulary. On unrecognized term, prompt: *"Did you mean [closest match]? Valid: browser | cli | api | desktop | native-apple | none."* Write validated answer as `interface: [...]` in project.yml.
+
 ### Step 2: Design System

 *"UI library?"*
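The validation step added here (check each term against the vocabulary, suggest the closest match otherwise) could look like the sketch below. The skill text does not say how "closest match" is computed; using `difflib.get_close_matches` from the Python standard library is an assumption, as is the `validate_interfaces` name.

```python
import difflib

VALID_INTERFACES = ("browser", "cli", "api", "desktop", "native-apple", "none")


def validate_interfaces(answer):
    """Parse a comma-separated answer into (valid, suggestions).

    `suggestions` maps each unrecognized term to its closest valid term,
    or None when nothing is similar enough.
    """
    valid, suggestions = [], {}
    for raw in answer.split(","):
        term = raw.strip().lower()
        if not term:
            continue
        if term in VALID_INTERFACES:
            if term not in valid:          # keep order, drop duplicates
                valid.append(term)
        else:
            close = difflib.get_close_matches(term, VALID_INTERFACES, n=1)
            suggestions[term] = close[0] if close else None
    return valid, suggestions
```

A non-empty `suggestions` map corresponds to the "Did you mean ...?" prompt; only the `valid` list would be written as `interface: [...]`.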
package/template/.claude/skills/verifying/SKILL.md
CHANGED
@@ -21,8 +21,76 @@ Read: .forge/project.yml → tech stack (for running tests)
 Read: .forge/phases/m{M}-{N}-{name}/plan-{NN}.md → must_haves (truths, artifacts, key_links)
 Read: .forge/context.md → locked decisions
 Read: .forge/requirements.yml → requirement IDs for coverage check
+Read: .forge/deferred-issues.md → known pre-existing failures (if exists; treat as advisory)
 ```

+## Deferred Issues
+
+If `.forge/deferred-issues.md` exists, load it before running any tests.
+
+When test results come in, cross-reference failures against known deferred issue IDs:
+
+- **Failure matches a deferred ID** → advisory only — note as "Pre-existing (deferred: DI-{N})", do NOT fail verification for this
+- **Failure not in deferred-issues.md** → regression introduced after the baseline — **FAIL** and include in gaps
+
+In the Test Results section of the report, split accordingly:
+
+```markdown
+## Test Results
+Run: {n} | Passed: {n} | Failed: {n}
+
+### Regressions (block release)
+- {test name}: {error}
+
+### Pre-existing / Deferred (advisory)
+- DI-{N}: {test name} — {summary from deferred-issues.md}
+```
+
+Verdict: FAIL only triggers for regressions. Known deferred failures do not block a PASSED verdict — they are tracked separately and must be worked off via the refactor backlog or a dedicated fix phase.
+
+## Interface Testing Gate
+
+Read `interface` from `.forge/project.yml`. If absent, `[]`, or `[none]` → skip gate, proceed to 3-level verification.
+Also read `interface_tools` (optional override map).
+
+For each declared interface type, check for tests:
+
+| Type | Detection | Expected Tool | Suggested Command |
+|------|-----------|---------------|-------------------|
+| `browser` | Glob `**/*.spec.ts`, `**/*.spec.js`, `**/e2e/**`, `**/playwright/**` | Playwright | `npx playwright test` |
+| `cli` | Grep test files for `spawn`, `execSync`, `subprocess.run`, `exec(` | Any runner | subprocess integration tests |
+| `api` | Grep test files for `supertest`, `request(`, `httpx`, `TestClient`, `net/http` | Node→supertest · Python→httpx · Go→net/http | runner-dependent |
+| `desktop` | Glob `**/*.spec.ts`, `**/*.spec.js`, `**/e2e/**`, `**/playwright/**` | Playwright | `npx playwright test` |
+| `native-apple` | Glob `**/*UITests/**`, `**/*Tests/**` | XCUITest | `xcodebuild test` |
+
+If `interface_tools.{type}` is set → use its `tool` and `cmd` instead of defaults.
+
+**All types have tests** → gate passes silently. Proceed to 3-level verification.
+
+**Any type missing tests** → BLOCK:
+
+```
+INTERFACE TESTING GATE — BLOCKED
+
+Interface type: {type}
+Expected: {tool} tests
+Detection: {glob or grep used}
+Found: none
+
+Add {type} tests before verifying:
+browser: Add Playwright specs in e2e/ or *.spec.ts files
+cli: Add integration tests that spawn the CLI as a child process and assert exit codes + stdout
+api: Add HTTP integration tests using {default tool for detected language}
+desktop: Add Playwright tests targeting the Electron/Tauri window
+native-apple: Add XCUITest targets and run via xcodebuild test
+
+Re-run verifying after tests are added.
+```
+
+**Do NOT invoke Skill(testing).** Return BLOCKED and wait — the user or executor must add tests explicitly.
+
+If detection is ambiguous (e.g. API tests hard to grep definitively) → lean toward PASS to avoid false blocks; note uncertainty in the verdict.
+
 ## 3-Level Goal-Backward Verification

 ### Level 1: Observable Truths
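The skip/pass/block decision of the Interface Testing Gate can be condensed into one function. This is a sketch under stated assumptions: test discovery (the globs and greps in the table) is taken as already done and passed in as `tests_found`, and the names `interface_gate` and `DEFAULT_TOOLS` are hypothetical.

```python
# Default expected tool per interface type, taken from the table in the skill.
DEFAULT_TOOLS = {
    "browser": "Playwright",
    "cli": "Any runner",
    "api": "supertest / httpx / net/http",
    "desktop": "Playwright",
    "native-apple": "XCUITest",
}


def interface_gate(interfaces, tests_found, interface_tools=None):
    """Return ("SKIP" | "PASS" | "BLOCKED", report_lines).

    interfaces:      value of `interface` from project.yml (may be None)
    tests_found:     dict mapping type -> list of detected test files
    interface_tools: optional override map from project.yml
    """
    declared = [t for t in (interfaces or []) if t != "none"]
    if not declared:
        return "SKIP", []
    overrides = interface_tools or {}
    missing = []
    for itype in declared:
        if tests_found.get(itype):
            continue  # at least one test detected for this type
        tool = overrides.get(itype, {}).get(
            "tool", DEFAULT_TOOLS.get(itype, "unknown")
        )
        missing.append(
            f"Interface type: {itype} | Expected: {tool} tests | Found: none"
        )
    return ("BLOCKED", missing) if missing else ("PASS", [])
```

The "lean toward PASS when detection is ambiguous" rule is not modelled here; a caller would apply it when building `tests_found`.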
package/template/.forge/templates/framework-absorption/gsd.md
CHANGED
@@ -22,7 +22,7 @@ CONTEXT.md with "NON-NEGOTIABLE" or "DEFERRED" sections
 | `ROADMAP.md` | `.forge/roadmap.yml` | Extract phases, milestones. Add dependency references between phases. Convert to YAML. |
 | `STATE.md` | `.forge/state/index.yml` + `.forge/state/milestone-{id}.yml` | Extract current phase number, progress, active blockers, recent decisions. Split into global index and per-milestone state. |
 | `CONTEXT.md` | `.forge/context.md` | NON-NEGOTIABLE → Locked Decisions. DEFERRED → Deferred Ideas. DISCRETION → Discretion Areas. Minimal format change needed. |
-| `PLAN.md` | `.forge/phases/{N}-{name}/plan.md` | Keep XML task format. Add `must_haves` YAML frontmatter. Split per-phase if combined. |
+| `PLAN.md` | `.forge/phases/m{M}-{N}-{name}/plan.md` | Keep XML task format. Add `must_haves` YAML frontmatter. Split per-phase if combined. |
 | `references/ui-brand.md` | `.forge/design-system.md` | Extract component rules, brand guidelines. Convert to component mapping table format. |
 | `references/tdd.md` | `.forge/constitution.md` Article II | Inform Test-First article gates with project-specific testing approach. |
 | `references/verification-patterns.md` | Informs `verifying` skill | Extract project-specific verification patterns. Add to constitution or context. |
package/template/.forge/templates/interface-detection.md
ADDED
@@ -0,0 +1,17 @@
+# Interface Detection Signals
+
+Used by `initializing` (Step 1.2), `forge` (Step 1, interface check), and `verifying` (Interface Testing Gate) to detect and validate interface types.
+
+If ANY signal matches for a type, include that type in the result array.
+
+| Signal | Type |
+|--------|------|
+| `next`/`react`/`vue`/`svelte`/`astro`/`nuxt`/`remix`/`angular` in deps | `browser` |
+| `bin` field in package.json OR `commander`/`yargs`/`meow` in name/deps OR `cobra`/`urfave/cli` in go.mod OR `click`/`typer`/`argparse` in requirements.txt | `cli` |
+| `express`/`fastapi`/`flask`/`django`/`gin`/`echo`/`koa`/`hono`/`nestjs`/`fiber` in deps | `api` |
+| `electron`/`tauri` in deps | `desktop` |
+| `.xcodeproj`/`.xcworkspace`/`Package.swift`/`Info.plist` at project root | `native-apple` |
+
+No match → `[none]`. Multiple matches → array (e.g. `[browser, api]`).
+
+**Empty field:** absent, `[]`, and `[none]` all mean no interface declared. Scalar `none` normalizes to `[none]` — arrays only are valid on write.
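The signal table in this new template maps naturally onto a lookup. The sketch below is a simplification and rests on assumptions: dependency names from package.json, go.mod and requirements.txt are flattened into one `deps` list, matching is by exact name (so scoped or module-path names would need extra handling), and `detect_interfaces` is a hypothetical name.

```python
DEP_SIGNALS = {
    "browser": {"next", "react", "vue", "svelte", "astro", "nuxt", "remix", "angular"},
    "cli": {"commander", "yargs", "meow", "cobra", "urfave/cli",
            "click", "typer", "argparse"},
    "api": {"express", "fastapi", "flask", "django", "gin", "echo",
            "koa", "hono", "nestjs", "fiber"},
    "desktop": {"electron", "tauri"},
}
APPLE_SUFFIXES = (".xcodeproj", ".xcworkspace")
APPLE_FILES = {"Package.swift", "Info.plist"}


def detect_interfaces(deps, root_entries, has_bin_field=False):
    """Return the list of detected interface types, or ["none"].

    deps:          dependency names gathered from the project's manifests
    root_entries:  file and directory names at the project root
    has_bin_field: True when package.json declares a `bin` field
    """
    found = []
    dep_set = set(deps)
    for itype in ("browser", "cli", "api", "desktop"):
        if dep_set & DEP_SIGNALS[itype] or (itype == "cli" and has_bin_field):
            found.append(itype)
    if any(e in APPLE_FILES or e.endswith(APPLE_SUFFIXES) for e in root_entries):
        found.append("native-apple")
    return found or ["none"]
```

Any single matching signal adds its type, and an empty result collapses to `["none"]`, matching the rules stated in the template.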
package/template/.forge/templates/project.yml
CHANGED
@@ -14,6 +14,26 @@ tech_stack:
 testing: "" # e.g., Vitest, Jest, Pytest
 other: [] # Additional key dependencies

+interface: [none] # Surfaces this project exposes: browser | cli | api | desktop | native-apple | none
+# Array — e.g. [browser, api] for full-stack projects
+
+interface_tools: # Optional: override default test tool per interface type. Set manually or during init.
+# browser:
+# tool: cypress
+# cmd: "npx cypress run"
+# cli:
+# tool: bats
+# cmd: "bats test/"
+# api:
+# tool: httpx
+# cmd: "pytest tests/api/"
+# desktop:
+# tool: playwright
+# cmd: "npx playwright test"
+# native-apple:
+# tool: xcodebuild
+# cmd: "xcodebuild test -scheme MyApp"
+
 design_system:
 library: "" # e.g., PrimeReact, Material-UI, shadcn/ui, Ant Design, Chakra UI, none
 version: "" # e.g., 10.x, 5.x
@@ -31,15 +51,16 @@ constraints:
 - "" # e.g., "No server-side rendering"

 verification:
-commands: # Shell commands run after each task commit
---
-
---
+commands: # Shell commands run after each task commit. Auto-detected during init.
+# - cmd: "npm run lint"
+# advisory: false # true = warn only (for pre-existing failures)
+# - cmd: "npm test"
+# advisory: false
+# - cmd: "npx tsc --noEmit"
+# advisory: true # pre-existing type errors — warn, don't block
 auto_fix: true # On failure, agent fixes and retries
 max_retries: 2 # Max auto-fix attempts per command (0 = fail immediately)
-#
-# Advisory mode: commands that were already failing before Forge started
-# run but don't block — they log warnings only.
+# Advisory mode: commands already failing before Forge started run but don't block — warn only.

 success_criteria: # How do we know we're done?
 - "" # e.g., "User can create and edit posts"
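The `advisory` flag on each verification command separates warnings from blocking failures. A minimal sketch of that split follows, assuming the `commands` list has been parsed into dictionaries with `cmd` and optional `advisory` keys; `run_verification` is a hypothetical name, and `auto_fix` and `max_retries` are deliberately not modelled.

```python
import subprocess


def run_verification(commands):
    """Run each {"cmd": str, "advisory": bool} entry once.

    Returns (blocking, warnings): commands that failed and must block,
    and commands that failed but are advisory (warn only).
    """
    blocking, warnings = [], []
    for entry in commands:
        result = subprocess.run(
            entry["cmd"], shell=True, capture_output=True, text=True
        )
        if result.returncode == 0:
            continue
        if entry.get("advisory", False):   # missing flag treated as blocking
            warnings.append(entry["cmd"])
        else:
            blocking.append(entry["cmd"])
    return blocking, warnings
```

With the commented example in the template, a failing `npx tsc --noEmit` would land in `warnings`, while a failing `npm test` would land in `blocking`.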