forge-orkes 0.3.11 → 0.3.13

@@ -1,151 +1,149 @@
1
1
  <!-- forge:start -->
2
2
  # Forge
3
3
 
4
- A lean meta-prompting framework for Claude Code. Synthesizes context engineering (GSD) and constitutional governance (Spec-Kit) on Claude Code's native primitives.
4
+ Lean meta-prompting framework for Claude Code. Context engineering + constitutional governance on native primitives.
5
5
 
6
6
  ## Critical: No Native Plan Mode
7
7
 
8
- **NEVER use the `EnterPlanMode` tool when the Forge framework is active** (i.e., when `.forge/` exists or a Forge skill is running). Forge has its own `planning` skill that writes structured plans to `.forge/phases/`. Claude Code's native plan mode writes to a separate plan file with a different format — this conflicts with Forge's workflow and state management.
8
+ **NEVER use `EnterPlanMode` when Forge active.** Forge writes plans to `.forge/phases/`; native plan mode conflicts.
9
9
 
10
- When the workflow reaches the planning phase, **invoke the `planning` skill using the `Skill` tool**; do not enter native plan mode. This applies to all tiers (Standard and Full). The same rule applies to all other Forge phases: always invoke the corresponding Forge skill, never substitute a native Claude Code behavior.
10
+ All phases: **invoke via `Skill` tool**, never native behavior. `planning` → `Skill(planning)`, not `EnterPlanMode`.
11
11
 
12
12
  ## Core Principles
13
13
 
14
- 1. **Lean by default, powerful when needed.** Quick fixes skip ceremony. Complex features get full governance. The framework adapts — you don't.
15
- 2. **Native-first.** Skills, agents, hooks, plugins — use Claude Code's built-in systems. No custom JavaScript, no reinvented orchestration. Periodically audit Forge features against Claude Code's current native capabilities — if a native tool now handles what a Forge feature does, deprecate the Forge version. Use native tools for session-scoped concerns (task UI, exploration) and Forge state for cross-session persistence. When in doubt, prefer native.
16
- 3. **Context is sacred.** Every token earns its place. Size-gate all artifacts, lazy-load skills, spawn fresh agents for isolated work.
17
- 4. **Decisions are contracts.** User decisions lock before building begins. Downstream agents honor contracts or flag violations — never silently override.
18
- 5. **Verify against goals, not tasks.** "Does it work?" beats "Did we complete the checklist?" Goal-backward verification at every tier.
19
- 6. **Never forget.** Project state persists via `.forge/state/` and survives session boundaries. Milestones can be worked on concurrently across sessions. Compatible with external memory tools like Beads for deeper cross-session context.
20
- 7. **Pave the desire paths.** When agents repeatedly deviate, users repeatedly correct, or steps repeatedly get skipped — that's a signal, not a failure. Track these patterns and evolve the framework to match how it's actually used.
14
+ 1. **Lean default, powerful when needed.** Quick → skip ceremony. Complex → full governance.
15
+ 2. **Native-first.** Claude Code built-ins only. No custom JS. Deprecate Forge when native catches up.
16
+ 3. **Context is sacred.** Size-gate artifacts, lazy-load skills, fresh agents for isolated work.
17
+ 4. **Decisions are contracts.** Lock before building. Honor or flag — never silently override.
18
+ 5. **Verify goals, not tasks.** "Does it work?" > "Did we finish checklist?"
19
+ 6. **Never forget.** `.forge/state/` persists across sessions. Milestones concurrent. Beads-compatible.
20
+ 7. **Pave desire paths.** Repeated deviations/corrections = signal → evolve framework.
21
21
 
22
22
  ## Workflow Tiers
23
23
 
24
- Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
24
+ Auto-detects complexity. Override: "Use Quick/Standard/Full tier."
25
25
 
26
26
  ### Quick (minutes)
27
- **Triggers:** bug fix, typo, config change, dependency bump, < 50 lines changed
28
- **Flow:** `quick-tasking` → commit → done
27
+ **Triggers:** bug fix, typo, config, dep bump, <50 lines
28
+ **Flow:** `quick-tasking` → commit → done
29
29
 
30
30
  ### Standard (hours)
31
- **Triggers:** new feature, component, significant refactor, multi-file change
32
- **Flow:** `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing` → done
31
+ **Triggers:** new feature, component, significant refactor, multi-file
32
+ **Flow:** `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing` → done
33
33
 
34
34
  ### Full (days)
35
- **Triggers:** new project, major milestone, complex multi-system feature, architectural decisions needed
36
- **Flow:** `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing` → done
37
- **Optional additions:** `designing` (UI work), `securing` (auth/data/API), `debugging` (stuck on issue)
35
+ **Triggers:** new project, major milestone, complex multi-system, architectural decisions
36
+ **Flow:** `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing` → done
37
+ **Optional:** `designing` (UI), `securing` (auth/data/API), `debugging` (stuck)
38
38
 
39
39
  ## Skill Routing
40
40
 
41
- | When you need to... | Use skill | Tier |
42
- |---------------------|-----------|------|
43
- | Start any task (entry point) | `forge` | All |
44
- | Set up a new project (brownfield or greenfield) | `initializing` | First run only |
45
- | Investigate codebase, tech, or requirements | `researching` | Standard, Full |
46
- | Talk through approach, trade-offs, or revisit a plan | `discussing` | Standard, Full (also on-demand) |
47
- | Make architectural decisions with rationale | `architecting` | Full |
48
- | Break work into executable tasks with gates | `planning` | Standard, Full |
49
- | Build code with deviation rules + atomic commits | `executing` | All |
50
- | Prove work actually delivers on goals | `verifying` | Standard, Full |
51
- | Audit health + catalog refactoring opportunities | `reviewing` | Standard, Full |
52
- | Fix a small, scoped issue fast | `quick-tasking` | Quick |
53
- | Build UI with design system consistency | `designing` | When UI involved |
54
- | Review security before shipping | `securing` | When auth/data/API involved |
55
- | Debug systematically with hypotheses | `debugging` | When stuck |
56
- | Upgrade Forge framework files | `upgrading` | On-demand |
57
- | Use Beads for cross-session memory | `beads-integration` | When Beads installed |
41
+ | Need to... | Skill | Tier |
42
+ |------------|-------|------|
43
+ | Start any task | `forge` | All |
44
+ | Set up new project | `initializing` | First run |
45
+ | Investigate codebase/tech/requirements | `researching` | Standard, Full |
46
+ | Talk through approach/trade-offs | `discussing` | Standard, Full (on-demand) |
47
+ | Architectural decisions | `architecting` | Full |
48
+ | Break work into tasks with gates | `planning` | Standard, Full |
49
+ | Build with deviation rules + atomic commits | `executing` | All |
50
+ | Prove work delivers on goals | `verifying` | Standard, Full |
51
+ | Audit health + catalog refactoring | `reviewing` | Standard, Full |
52
+ | Small scoped fix | `quick-tasking` | Quick |
53
+ | UI with design system | `designing` | When UI |
54
+ | Security review | `securing` | When auth/data/API |
55
+ | Systematic debugging | `debugging` | When stuck |
56
+ | Upgrade Forge files | `upgrading` | On-demand |
57
+ | Cross-session memory | `beads-integration` | When Beads installed |
58
58
 
59
59
  ## Context Engineering
60
60
 
61
- ### Size Gates (enforced by agents)
62
- | Artifact | Max Size | Why |
63
- |----------|----------|-----|
64
- | `project.yml` | 5 KB | Forces clarity — if you can't describe it concisely, you don't understand it |
65
- | `requirements.yml` | 50 KB | Prevents scope creep — trim to v1 essentials |
66
- | `plan.md` (per task) | 30 KB | Keeps executor context under 50% — quality degrades above this |
67
- | `constitution.md` | 10 KB | Immutable gates should be scannable, not novels |
61
+ ### Size Gates
62
+ | Artifact | Max | Reason |
63
+ |----------|-----|--------|
64
+ | `project.yml` | 5 KB | Forces clarity |
65
+ | `requirements.yml` | 50 KB | Prevents scope creep |
66
+ | `plan.md` | 30 KB | Keeps executor context <50% |
67
+ | `constitution.md` | 10 KB | Gates must be scannable |
68
68
 
69
69
  ### Fresh Agent Pattern
70
- When a task touches 20+ files or a complex subsystem, spawn a fresh executor agent with isolated context. This prevents context rot — the #1 cause of quality degradation in long sessions.
70
+ Task touches 20+ files or complex subsystem → spawn fresh executor with isolated context. Prevents context rot.
71
71
 
72
- ### Context Handoff Between Phases
73
- Each phase writes its outputs to `.forge/` before completing. At every phase boundary (researching → discussing → planning → executing → verifying → reviewing), the completing skill recommends clearing context (`/clear`) before the next phase begins. The next phase loads what it needs from disk. This is advisory — skip for short phases where context is under 40%. See the `forge` skill's "Context Handoff Protocol" for full details.
72
+ ### Context Handoff
73
+ Each phase writes outputs to `.forge/` before completing. At phase boundaries, recommend `/clear`. Next phase loads from disk. Advisory — skip for short phases under 40% context.
74
74
 
75
75
  ### Lazy Loading
76
- Skills load only when invoked. CLAUDE.md stays in context; skill details load on demand. This keeps base context lean (~300 lines) while making full framework available.
76
+ Skills load only when invoked. CLAUDE.md stays in context; skill details on demand. Base context ~300 lines.
77
+
78
+ ## Model Routing
79
+
80
+ Configure in `project.yml`:
81
+
82
+ ```yaml
83
+ models:
84
+   default: sonnet # fallback
85
+   parent_session: sonnet # advisory — warns on mismatch
86
+   skills: # per-skill overrides
87
+     architecting: opus
88
+     planning: opus
89
+     verifying: haiku
90
+     quick-tasking: haiku
91
+ ```
92
+
93
+ **Precedence:** `skills.X` → `default` → parent model.
94
+ **Parent session advisory** — warns if mismatch, cannot auto-switch. Enforced at review gate.
95
+ **Agents model-agnostic** — skills set model at spawn. One routing table.
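
The precedence rule added in this hunk (`skills.X` → `default` → parent model) can be sketched in Python. This is only an illustration under stated assumptions, not Forge's implementation: `resolve_model` and the `MODELS` dict are hypothetical names, and the dict simply mirrors how the `models:` block might look once parsed from `project.yml`.

```python
from typing import Optional

# Hypothetical parsed form of the `models:` block from project.yml.
MODELS = {
    "default": "sonnet",
    "parent_session": "sonnet",
    "skills": {
        "architecting": "opus",
        "planning": "opus",
        "verifying": "haiku",
        "quick-tasking": "haiku",
    },
}


def resolve_model(skill: str, models: dict,
                  parent_model: Optional[str] = None) -> Optional[str]:
    """Apply the documented precedence: skills.X, then default, then parent model."""
    override = (models.get("skills") or {}).get(skill)
    if override:
        return override              # per-skill override wins
    if models.get("default"):
        return models["default"]     # configured fallback
    return parent_model              # nothing configured: inherit from parent


print(resolve_model("planning", MODELS))   # per-skill override
print(resolve_model("executing", MODELS))  # falls back to `default`
```

Keeping the lookup in one function matches the "one routing table" idea: agents stay model-agnostic and the skill passes the resolved model at spawn time.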
77
96
 
78
97
  ## Agents
79
98
 
80
- | Agent | Role | Tools | When Used |
81
- |-------|------|-------|-----------|
82
- | `researcher` | Investigation specialist | Read-only (Read, Glob, Grep, WebFetch) | Research phases |
83
- | `planner` | Planning with constitutional gates | Read + Write (plan files only) | Planning phases |
84
- | `executor` | Building with deviation rules | All dev tools | Execution phases |
85
- | `verifier` | Goal-backward verification | Read + Bash (test execution) | Verification phases |
86
- | `reviewer` | Security + architecture + refactoring audit | Read, Bash, Grep, Glob | Reviewing phase |
99
+ | Agent | Role | Tools | When |
100
+ |-------|------|-------|------|
101
+ | `researcher` | Investigation | Read-only (Read, Glob, Grep, WebFetch) | Research |
102
+ | `planner` | Planning + constitutional gates | Read + Write (plan files) | Planning |
103
+ | `executor` | Building + deviation rules | All dev tools | Execution |
104
+ | `verifier` | Goal-backward verification | Read + Bash (tests) | Verification |
105
+ | `reviewer` | Security + architecture + refactoring | Read, Bash, Grep, Glob | Reviewing |
87
106
 
88
107
  ## Project Init (First Run)
89
108
 
90
- When `forge` detects no `.forge/project.yml`, it auto-detects the project type and runs the appropriate init:
109
+ No `.forge/project.yml` → auto-detect project type:
91
110
 
92
- **Brownfield** (existing codebase detected — has package.json, src/, .git/):
93
- 1. **Framework detection** — checks for existing meta-frameworks (GSD, Spec-Kit, BMAD, custom). If found, offers: absorb & convert, archive & start fresh, or keep both. Also detects companion tools (Beads, etc.) and preserves their configuration.
94
- 2. **Framework absorption** — reads existing framework docs, converts project knowledge to Forge format (PROJECT.md → project.yml, etc.), archives originals in `.forge/archive/`
95
- 3. **Docs vs. code verification** — cross-references all framework documentation against the actual codebase. Flags discrepancies (stale docs, missing features, drifted tech stack). The codebase is the source of truth, not the docs.
96
- 4. **Tech stack scan** — auto-detects language, framework, dependencies, codebase size
97
- 5. **Design system detection** — identifies component library from imports and dependencies, builds mapping
98
- 6. **Pattern analysis** — detects testing style, commit conventions, architecture patterns
99
- 7. **Constitutional inference** — suggests articles based on what's already in the codebase
100
- 8. **User confirmation** — presents findings and discrepancies, user corrects/approves
111
+ **Brownfield** (has package.json/src/.git): detect frameworks → absorb → verify docs vs code → scan stack → detect design system → analyze patterns → infer constitution → user confirms.
101
112
 
102
- **Greenfield** (new project):
103
- 1. **Project basics** — asks user to describe the project, fills in tech stack and constraints
104
- 2. **Design system** — which component library? Researches and builds component mapping in `.forge/design-system.md`
105
- 3. **Constitution** — walks through 9 articles grouped by domain. User selects which apply.
106
- 4. **State init** — writes `project.yml`, `constitution.md`, `design-system.md`, `state/index.yml`, `state/milestone-1.yml`
113
+ **Greenfield**: project basics → design system → constitution → state init.
107
114
 
108
- Example design system configs for PrimeReact, MUI, and shadcn/ui ship in `.forge/templates/design-systems/`.
109
-
110
- For Quick tier tasks, init is skipped — just do the work.
115
+ Quick tier skips init.
111
116
 
112
117
  ## State Management
113
118
 
114
- Project state lives in `.forge/`:
115
- - `project.yml` — Vision, stack, design system, verification commands, constraints (< 5 KB)
116
- - `constitution.md` — Active architectural gates (selected during init)
117
- - `design-system.md` — Component mapping table (generated during init)
119
+ State lives in `.forge/`:
120
+ - `project.yml` — Vision, stack, design system, verification, constraints (<5KB)
121
+ - `constitution.md` — Active architectural gates
122
+ - `design-system.md` — Component mapping table
118
123
  - `requirements.yml` — Structured requirements with `[NEEDS CLARIFICATION]` markers
119
124
  - `roadmap.yml` — Phases, milestones, dependencies
120
- - `state/index.yml` — Global state: active milestones list, desire_paths, metrics
121
- - `state/milestone-{id}.yml` — Per-milestone cursor: current position, progress, decisions, blockers, deviations
122
- - `context.md` — Locked user decisions + deferred ideas (created during discuss phase)
123
- - `plan.md` — Per-phase task plans with must_haves frontmatter
124
- - `refactor-backlog.yml` — Refactoring opportunities cataloged during milestone reviews, worked via quick-tasking
125
-
126
- ### Milestones
127
- Milestones group phases into concurrent work streams. Each milestone has its own state file, so different sessions can work on different milestones without conflicts. On resume, Forge shows active milestones and asks which one to work on.
128
-
129
- ### Machine-Readable State
130
- YAML for anything agents parse programmatically (project, requirements, roadmap, state). Markdown for human-facing content (constitution, context, verification reports). Never free-form prose for machine state.
125
+ - `state/index.yml` — Global: active milestones, desire_paths, metrics
126
+ - `state/milestone-{id}.yml` — Per-milestone cursor: position, progress, decisions, blockers
127
+ - `context.md` — Locked decisions + deferred ideas (discuss phase)
128
+ - `plan.md` — Task plans with must_haves frontmatter
129
+ - `refactor-backlog.yml` — Refactoring catalog, worked via quick-tasking
131
130
 
132
- ### Milestone Completion: Status vs. Percentage
133
- **`current.status` is the authoritative workflow position.** A milestone is only complete when `current.status == complete`. The `progress.overall_percent` field measures task completion — not workflow completion. A milestone at 100% task completion still needs verifying and reviewing before it is done. On resume, always check and display `current.status` to determine next steps.
131
+ **Milestones** group phases into concurrent streams. Own state file — no conflicts across sessions.
132
+ **Format**: YAML for machine state, Markdown for human content.
133
+ **`current.status` is authoritative.** Complete only at `current.status == complete`. 100% tasks ≠ done — still needs verifying + reviewing.
134
134
 
135
- ## Deviation Rules (Executor Decision Tree)
135
+ ## Deviation Rules
136
136
 
137
- When the executor encounters issues during building:
137
+ 1. **Bug blocking task** → Auto-fix. Document "Rule 1."
138
+ 2. **Missing critical functionality** (error handling, validation, auth, null checks) → Auto-add. "Rule 2."
139
+ 3. **Blocking infrastructure** (missing dep, wrong types, broken imports) → Auto-fix. "Rule 3."
140
+ 4. **Architectural change** (new DB table, service layer, library switch) → **STOP. Ask user.** "Rule 4."
138
141
 
139
- 1. **Bug blocking current task** → Auto-fix. Document in summary with "Rule 1."
140
- 2. **Missing critical functionality** (error handling, validation, auth, null checks) → Auto-add. Document with "Rule 2."
141
- 3. **Blocking infrastructure issue** (missing dep, wrong types, broken imports) → Auto-fix. Document with "Rule 3."
142
- 4. **Architectural change needed** (new DB table, service layer, library switch) → **STOP. Checkpoint with user.** Document with "Rule 4."
143
-
144
- Priority: Rule 4 first (stop if architectural). Then Rules 1-3 (auto-fix). Uncertain? → Rule 4 (ask).
142
+ Priority: Rule 4 first. Then 1-3. Uncertain → Rule 4.
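
The priority order of the deviation rules can be expressed as a small decision function. The sketch below is an assumption-laden illustration: `decide` and its return values are hypothetical, and picking the lowest-numbered rule among Rules 1-3 is an arbitrary choice, since the source only says Rule 4 is checked first and that uncertainty falls back to Rule 4.

```python
from typing import Set, Tuple


def decide(applicable: Set[int]) -> Tuple[str, str]:
    """Map the set of deviation rules that apply to (action, note for the summary).

    An empty set stands for "uncertain" and is treated as Rule 4.
    """
    known = {1, 2, 3, 4}
    # Rule 4 first: architectural change, unknown input, or uncertainty stops the run.
    if not applicable or 4 in applicable or not applicable <= known:
        return ("stop_and_ask", "Rule 4")
    rule = min(applicable)  # assumption: lowest-numbered rule is documented
    action = "auto_add" if rule == 2 else "auto_fix"
    return (action, f"Rule {rule}")


print(decide({1, 4}))  # architectural concern present, so stop and ask
print(decide({3}))     # blocking infrastructure issue, auto-fix
```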
145
143
 
146
144
  ## Verification Gates
147
145
 
148
- After each task commit, the executor runs configured verification commands from `project.yml`:
146
+ Post-commit verification from `project.yml`:
149
147
 
150
148
  ```yaml
151
149
  verification:
@@ -154,31 +152,23 @@ verification:
154
152
  - cmd: "npm test"
155
153
  - cmd: "npx tsc --noEmit"
156
154
  advisory: true # pre-existing failures — warn only
157
- auto_fix: true # agent fixes and retries on failure
158
- max_retries: 2 # max auto-fix attempts per command
155
+ auto_fix: true # fix and retry on failure
156
+ max_retries: 2 # max attempts per command
159
157
  ```
160
158
 
161
- - **Auto-detected during init** from `package.json` scripts (test, lint, typecheck)
162
- - **Advisory mode**: commands that were already failing before Forge started run but don't block
163
- - **Auto-fix loop**: on failure, agent reads output, fixes code, amends commit, re-runs (up to max_retries)
164
- - **3-strike integration**: verification retries count toward the task's 3-strike limit
165
- - Empty `commands` list = no verification gate (opt-out)
159
+ - Auto-detected from `package.json` scripts during init
160
+ - Advisory mode: pre-existing failures warn, don't block
161
+ - Auto-fix loop: read output → fix → amend → re-run (up to max_retries)
162
+ - 3-strike: retries count toward task limit
163
+ - Empty commands = no gate (opt-out)
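
The gate behavior listed above can be sketched as a loop. This is a hedged illustration, not the executor's actual code: `run_gate`, `run`, and `fix` are hypothetical names, the command runner and the fixer are injected so the logic can be exercised without a shell, and two details are assumptions because the source does not specify them: advisory commands are not auto-fixed, and `auto_fix`/`max_retries` apply to every command. The 3-strike accounting is left out.

```python
from typing import Callable, Dict, List


def run_gate(
    commands: List[Dict],
    run: Callable[[str], bool],    # hypothetical: runs a command, True on success
    fix: Callable[[str], None],    # hypothetical: agent reads output, fixes, amends commit
    auto_fix: bool = True,
    max_retries: int = 2,
) -> Dict[str, List[str]]:
    """Classify each verification command as passed, warned (advisory), or blocked."""
    result: Dict[str, List[str]] = {"passed": [], "warned": [], "blocked": []}
    for entry in commands:  # an empty list means no gate at all
        cmd = entry["cmd"]
        ok = run(cmd)
        if not ok and entry.get("advisory"):
            result["warned"].append(cmd)  # pre-existing failure: warn, don't block
            continue
        retries = 0
        while not ok and auto_fix and retries < max_retries:
            fix(cmd)
            retries += 1
            ok = run(cmd)  # re-run after each fix attempt
        result["passed" if ok else "blocked"].append(cmd)
    return result
```

In a real setup `run` could wrap `subprocess.run(cmd, shell=True).returncode == 0`; that wiring is omitted here.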
166
164
 
167
165
  ## Beads Integration (Optional)
168
166
 
169
- When Beads is installed, Forge gains persistent cross-session memory:
170
- - **Session start:** `bd prime` injects ~1-2K tokens of project context
171
- - **Task selection:** `bd ready` returns unblocked tasks sorted by priority
172
- - **Task completion:** `bd complete` updates dependency graph
173
- - **Memory hygiene:** `bd compact` summarizes old closed tasks
174
-
175
- Without Beads, Forge uses `.forge/state/` + Claude Code's Session Memory. Beads adds depth for long-horizon multi-session projects.
167
+ With Beads installed: `bd prime` (session context), `bd ready` (unblocked tasks), `bd complete` (update deps), `bd compact` (summarize old). Without Beads, `.forge/state/` + Session Memory suffice.
176
168
 
177
169
  ## Atomic Commits
178
170
 
179
- Every task gets its own commit. Format: `{type}({scope}): {description}`
180
-
171
+ One commit per task. Format: `{type}({scope}): {description}`
181
172
  Types: `feat`, `fix`, `test`, `refactor`, `chore`, `docs`
182
- Scope: phase-plan or feature area
183
- Never use `git add .` or `git add -A` — stage files individually.
173
+ Never `git add .` or `git add -A` — stage individually.
184
174
  <!-- forge:end -->
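
The commit format `{type}({scope}): {description}` with the six listed types lends itself to a simple check. The following is a minimal sketch, assuming a scope is a non-empty token without spaces or parentheses (the source does not define scope syntax); `is_valid_commit` is a hypothetical helper, not part of Forge.

```python
import re

# Types taken from the "Atomic Commits" section.
COMMIT_TYPES = ("feat", "fix", "test", "refactor", "chore", "docs")

# Assumption: scope is one token with no whitespace or parentheses.
COMMIT_RE = re.compile(
    r"^(?P<type>%s)\((?P<scope>[^()\s]+)\): (?P<description>\S.*)$"
    % "|".join(COMMIT_TYPES)
)


def is_valid_commit(subject: str) -> bool:
    """Return True if the commit subject matches {type}({scope}): {description}."""
    return COMMIT_RE.match(subject) is not None


print(is_valid_commit("fix(auth): handle expired tokens"))
print(is_valid_commit("fix: no scope"))
```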