npm - forge-orkes - Versions diffs - 0.3.10 → 0.3.13 - Mend

forge-orkes 0.3.10 → 0.3.13

Files changed (15)

package/bin/create-forge.js +16 -5
package/package.json +1 -1
package/template/.claude/agents/executor.md +44 -95
package/template/.claude/agents/planner.md +46 -75
package/template/.claude/agents/researcher.md +34 -70
package/template/.claude/agents/reviewer.md +54 -117
package/template/.claude/agents/verifier.md +51 -102
package/template/.claude/skills/discussing/SKILL.md +69 -121
package/template/.claude/skills/forge/SKILL.md +129 -129
package/template/.claude/skills/initializing/SKILL.md +72 -174
package/template/.claude/skills/planning/SKILL.md +92 -118
package/template/.claude/skills/reviewing/SKILL.md +82 -177
package/template/.forge/templates/constitution.md +10 -0
package/template/.forge/templates/project.yml +17 -0
package/template/CLAUDE.md +105 -115

package/template/CLAUDE.md CHANGED

@@ -1,151 +1,149 @@
 <!-- forge:start -->
 # Forge
-A lean meta-prompting framework for Claude Code. Synthesizes context engineering (GSD) and constitutional governance (Spec-Kit) on Claude Code's native primitives.
+Lean meta-prompting framework for Claude Code. Context engineering + constitutional governance on native primitives.
 ## Critical: No Native Plan Mode
-**NEVER use the `EnterPlanMode` tool when the Forge framework is active** (i.e., when `.forge/` exists or a Forge skill is running). Forge has its own `planning` skill that writes structured plans to `.forge/phases/`. Claude Code's native plan mode writes to a separate plan file with a different format — this conflicts with Forge's workflow and state management.
+**NEVER use `EnterPlanMode` when Forge active.** Forge writes plans to `.forge/phases/` — native plan mode conflicts.
-When the workflow reaches the planning phase, **invoke the `planning` skill using the `Skill` tool** — do not enter native plan mode. This applies to all tiers (Standard and Full). The same rule applies to all other Forge phases: always invoke the corresponding Forge skill, never substitute a native Claude Code behavior.
+All phases: **invoke via `Skill` tool**, never native behavior. `planning` → `Skill(planning)`, not `EnterPlanMode`.
 ## Core Principles
-1. **Lean by default, powerful when needed.** Quick fixes skip ceremony. Complex features get full governance. The framework adapts — you don't.
-2. **Native-first.** Skills, agents, hooks, plugins — use Claude Code's built-in systems. No custom JavaScript, no reinvented orchestration. Periodically audit Forge features against Claude Code's current native capabilities — if a native tool now handles what a Forge feature does, deprecate the Forge version. Use native tools for session-scoped concerns (task UI, exploration) and Forge state for cross-session persistence. When in doubt, prefer native.
-3. **Context is sacred.** Every token earns its place. Size-gate all artifacts, lazy-load skills, spawn fresh agents for isolated work.
-4. **Decisions are contracts.** User decisions lock before building begins. Downstream agents honor contracts or flag violations — never silently override.
-5. **Verify against goals, not tasks.** "Does it work?" beats "Did we complete the checklist?" Goal-backward verification at every tier.
-6. **Never forget.** Project state persists via `.forge/state/` and survives session boundaries. Milestones can be worked on concurrently across sessions. Compatible with external memory tools like Beads for deeper cross-session context.
-7. **Pave the desire paths.** When agents repeatedly deviate, users repeatedly correct, or steps repeatedly get skipped — that's a signal, not a failure. Track these patterns and evolve the framework to match how it's actually used.
+1. **Lean default, powerful when needed.** Quick → skip ceremony. Complex → full governance.
+2. **Native-first.** Claude Code built-ins only. No custom JS. Deprecate Forge when native catches up.
+3. **Context is sacred.** Size-gate artifacts, lazy-load skills, fresh agents for isolated work.
+4. **Decisions are contracts.** Lock before building. Honor or flag — never silently override.
+5. **Verify goals, not tasks.** "Does it work?" > "Did we finish checklist?"
+6. **Never forget.** `.forge/state/` persists across sessions. Milestones concurrent. Beads-compatible.
+7. **Pave desire paths.** Repeated deviations/corrections = signal → evolve framework.
 ## Workflow Tiers
-Forge auto-detects complexity. Override with: "Use Quick/Standard/Full tier."
+Auto-detects complexity. Override: "Use Quick/Standard/Full tier."
 ### Quick (minutes)
-**Triggers:** bug fix, typo, config change, dependency bump, < 50 lines changed
-**Flow:** → `quick-tasking` → commit → done
+**Triggers:** bug fix, typo, config, dep bump, <50 lines
+**Flow:** `quick-tasking` → commit → done
 ### Standard (hours)
-**Triggers:** new feature, component, significant refactor, multi-file change
-**Flow:** → `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing` → done
+**Triggers:** new feature, component, significant refactor, multi-file
+**Flow:** `researching` → `discussing` → `planning` → `executing` → `verifying` → `reviewing` → done
 ### Full (days)
-**Triggers:** new project, major milestone, complex multi-system feature, architectural decisions needed
-**Flow:** → `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing` → done
-**Optional additions:** `designing` (UI work), `securing` (auth/data/API), `debugging` (stuck on issue)
+**Triggers:** new project, major milestone, complex multi-system, architectural decisions
+**Flow:** `researching` → `discussing` → `architecting` → `planning` → `executing` → `verifying` → `reviewing` → done
+**Optional:** `designing` (UI), `securing` (auth/data/API), `debugging` (stuck)
 ## Skill Routing
-| When you need to... | Use skill | Tier |
-|---------------------|-----------|------|
-| Start any task (entry point) | `forge` | All |
-| Set up a new project (brownfield or greenfield) | `initializing` | First run only |
-| Investigate codebase, tech, or requirements | `researching` | Standard, Full |
-| Talk through approach, trade-offs, or revisit a plan | `discussing` | Standard, Full (also on-demand) |
-| Make architectural decisions with rationale | `architecting` | Full |
-| Break work into executable tasks with gates | `planning` | Standard, Full |
-| Build code with deviation rules + atomic commits | `executing` | All |
-| Prove work actually delivers on goals | `verifying` | Standard, Full |
-| Audit health + catalog refactoring opportunities | `reviewing` | Standard, Full |
-| Fix a small, scoped issue fast | `quick-tasking` | Quick |
-| Build UI with design system consistency | `designing` | When UI involved |
-| Review security before shipping | `securing` | When auth/data/API involved |
-| Debug systematically with hypotheses | `debugging` | When stuck |
-| Upgrade Forge framework files | `upgrading` | On-demand |
-| Use Beads for cross-session memory | `beads-integration` | When Beads installed |
+| Need to... | Skill | Tier |
+|------------|-------|------|
+| Start any task | `forge` | All |
+| Set up new project | `initializing` | First run |
+| Investigate codebase/tech/requirements | `researching` | Standard, Full |
+| Talk through approach/trade-offs | `discussing` | Standard, Full (on-demand) |
+| Architectural decisions | `architecting` | Full |
+| Break work into tasks with gates | `planning` | Standard, Full |
+| Build with deviation rules + atomic commits | `executing` | All |
+| Prove work delivers on goals | `verifying` | Standard, Full |
+| Audit health + catalog refactoring | `reviewing` | Standard, Full |
+| Small scoped fix | `quick-tasking` | Quick |
+| UI with design system | `designing` | When UI |
+| Security review | `securing` | When auth/data/API |
+| Systematic debugging | `debugging` | When stuck |
+| Upgrade Forge files | `upgrading` | On-demand |
+| Cross-session memory | `beads-integration` | When Beads installed |
 ## Context Engineering
-### Size Gates (enforced by agents)
-| Artifact | Max Size | Why |
-|----------|----------|-----|
-| `project.yml` | 5 KB | Forces clarity — if you can't describe it concisely, you don't understand it |
-| `requirements.yml` | 50 KB | Prevents scope creep — trim to v1 essentials |
-| `plan.md` (per task) | 30 KB | Keeps executor context under 50% — quality degrades above this |
-| `constitution.md` | 10 KB | Immutable gates should be scannable, not novels |
+### Size Gates
+| Artifact | Max | Reason |
+|----------|-----|--------|
+| `project.yml` | 5 KB | Forces clarity |
+| `requirements.yml` | 50 KB | Prevents scope creep |
+| `plan.md` | 30 KB | Keeps executor context <50% |
+| `constitution.md` | 10 KB | Gates must be scannable |
 ### Fresh Agent Pattern
-When a task touches 20+ files or a complex subsystem, spawn a fresh executor agent with isolated context. This prevents context rot — the #1 cause of quality degradation in long sessions.
+Task touches 20+ files or complex subsystem → spawn fresh executor with isolated context. Prevents context rot.
-### Context Handoff Between Phases
-Each phase writes its outputs to `.forge/` before completing. At every phase boundary (researching → discussing → planning → executing → verifying → reviewing), the completing skill recommends clearing context (`/clear`) before the next phase begins. The next phase loads what it needs from disk. This is advisory — skip for short phases where context is under 40%. See the `forge` skill's "Context Handoff Protocol" for full details.
+### Context Handoff
+Each phase writes outputs to `.forge/` before completing. At phase boundaries, recommend `/clear`. Next phase loads from disk. Advisory — skip for short phases under 40% context.
 ### Lazy Loading
-Skills load only when invoked. CLAUDE.md stays in context; skill details load on demand. This keeps base context lean (~300 lines) while making full framework available.
+Skills load only when invoked. CLAUDE.md stays in context; skill details on demand. Base context ~300 lines.
+## Model Routing
+Configure in `project.yml`:
+```yaml
+models:
+  default: sonnet          # fallback
+  parent_session: sonnet   # advisory — warns on mismatch
+  skills:                  # per-skill overrides
+    architecting: opus
+    planning: opus
+    verifying: haiku
+    quick-tasking: haiku
+```
+**Precedence:** `skills.X` → `default` → parent model.
+**Parent session advisory** — warns if mismatch, cannot auto-switch. Enforced at review gate.
+**Agents model-agnostic** — skills set model at spawn. One routing table.
 ## Agents
-| Agent | Role | Tools | When Used |
-|-------|------|-------|-----------|
-| `researcher` | Investigation specialist | Read-only (Read, Glob, Grep, WebFetch) | Research phases |
-| `planner` | Planning with constitutional gates | Read + Write (plan files only) | Planning phases |
-| `executor` | Building with deviation rules | All dev tools | Execution phases |
-| `verifier` | Goal-backward verification | Read + Bash (test execution) | Verification phases |
-| `reviewer` | Security + architecture + refactoring audit | Read, Bash, Grep, Glob | Reviewing phase |
+| Agent | Role | Tools | When |
+|-------|------|-------|------|
+| `researcher` | Investigation | Read-only (Read, Glob, Grep, WebFetch) | Research |
+| `planner` | Planning + constitutional gates | Read + Write (plan files) | Planning |
+| `executor` | Building + deviation rules | All dev tools | Execution |
+| `verifier` | Goal-backward verification | Read + Bash (tests) | Verification |
+| `reviewer` | Security + architecture + refactoring | Read, Bash, Grep, Glob | Reviewing |
 ## Project Init (First Run)
-When `forge` detects no `.forge/project.yml`, it auto-detects the project type and runs the appropriate init:
+No `.forge/project.yml` → auto-detect project type:
-**Brownfield** (existing codebase detected — has package.json, src/, .git/):
-1. **Framework detection** — checks for existing meta-frameworks (GSD, Spec-Kit, BMAD, custom). If found, offers: absorb & convert, archive & start fresh, or keep both. Also detects companion tools (Beads, etc.) and preserves their configuration.
-2. **Framework absorption** — reads existing framework docs, converts project knowledge to Forge format (PROJECT.md → project.yml, etc.), archives originals in `.forge/archive/`
-3. **Docs vs. code verification** — cross-references all framework documentation against the actual codebase. Flags discrepancies (stale docs, missing features, drifted tech stack). The codebase is the source of truth, not the docs.
-4. **Tech stack scan** — auto-detects language, framework, dependencies, codebase size
-5. **Design system detection** — identifies component library from imports and dependencies, builds mapping
-6. **Pattern analysis** — detects testing style, commit conventions, architecture patterns
-7. **Constitutional inference** — suggests articles based on what's already in the codebase
-8. **User confirmation** — presents findings and discrepancies, user corrects/approves
+**Brownfield** (has package.json/src/.git): detect frameworks → absorb → verify docs vs code → scan stack → detect design system → analyze patterns → infer constitution → user confirms.
-**Greenfield** (new project):
-1. **Project basics** — asks user to describe the project, fills in tech stack and constraints
-2. **Design system** — which component library? Researches and builds component mapping in `.forge/design-system.md`
-3. **Constitution** — walks through 9 articles grouped by domain. User selects which apply.
-4. **State init** — writes `project.yml`, `constitution.md`, `design-system.md`, `state/index.yml`, `state/milestone-1.yml`
+**Greenfield**: project basics → design system → constitution → state init.
-Example design system configs for PrimeReact, MUI, and shadcn/ui ship in `.forge/templates/design-systems/`.
-For Quick tier tasks, init is skipped — just do the work.
+Quick tier skips init.
 ## State Management
-Project state lives in `.forge/`:
-- `project.yml` — Vision, stack, design system, verification commands, constraints (< 5 KB)
-- `constitution.md` — Active architectural gates (selected during init)
-- `design-system.md` — Component mapping table (generated during init)
+State lives in `.forge/`:
+- `project.yml` — Vision, stack, design system, verification, constraints (<5KB)
+- `constitution.md` — Active architectural gates
+- `design-system.md` — Component mapping table
 - `requirements.yml` — Structured requirements with `[NEEDS CLARIFICATION]` markers
 - `roadmap.yml` — Phases, milestones, dependencies
-- `state/index.yml` — Global state: active milestones list, desire_paths, metrics
-- `state/milestone-{id}.yml` — Per-milestone cursor: current position, progress, decisions, blockers, deviations
-- `context.md` — Locked user decisions + deferred ideas (created during discuss phase)
-- `plan.md` — Per-phase task plans with must_haves frontmatter
-- `refactor-backlog.yml` — Refactoring opportunities cataloged during milestone reviews, worked via quick-tasking
-### Milestones
-Milestones group phases into concurrent work streams. Each milestone has its own state file, so different sessions can work on different milestones without conflicts. On resume, Forge shows active milestones and asks which one to work on.
-### Machine-Readable State
-YAML for anything agents parse programmatically (project, requirements, roadmap, state). Markdown for human-facing content (constitution, context, verification reports). Never free-form prose for machine state.
+- `state/index.yml` — Global: active milestones, desire_paths, metrics
+- `state/milestone-{id}.yml` — Per-milestone cursor: position, progress, decisions, blockers
+- `context.md` — Locked decisions + deferred ideas (discuss phase)
+- `plan.md` — Task plans with must_haves frontmatter
+- `refactor-backlog.yml` — Refactoring catalog, worked via quick-tasking
-### Milestone Completion: Status vs. Percentage
-**`current.status` is the authoritative workflow position.** A milestone is only complete when `current.status == complete`. The `progress.overall_percent` field measures task completion — not workflow completion. A milestone at 100% task completion still needs verifying and reviewing before it is done. On resume, always check and display `current.status` to determine next steps.
+**Milestones** group phases into concurrent streams. Own state file — no conflicts across sessions.
+**Format**: YAML for machine state, Markdown for human content.
+**`current.status` is authoritative.** Complete only at `current.status == complete`. 100% tasks ≠ done — still needs verifying + reviewing.
-## Deviation Rules (Executor Decision Tree)
+## Deviation Rules
-When the executor encounters issues during building:
+1. **Bug blocking task** → Auto-fix. Document "Rule 1."
+2. **Missing critical functionality** (error handling, validation, auth, null checks) → Auto-add. "Rule 2."
+3. **Blocking infrastructure** (missing dep, wrong types, broken imports) → Auto-fix. "Rule 3."
+4. **Architectural change** (new DB table, service layer, library switch) → **STOP. Ask user.** "Rule 4."
-1. **Bug blocking current task** → Auto-fix. Document in summary with "Rule 1."
-2. **Missing critical functionality** (error handling, validation, auth, null checks) → Auto-add. Document with "Rule 2."
-3. **Blocking infrastructure issue** (missing dep, wrong types, broken imports) → Auto-fix. Document with "Rule 3."
-4. **Architectural change needed** (new DB table, service layer, library switch) → **STOP. Checkpoint with user.** Document with "Rule 4."
-Priority: Rule 4 first (stop if architectural). Then Rules 1-3 (auto-fix). Uncertain? → Rule 4 (ask).
+Priority: Rule 4 first. Then 1-3. Uncertain → Rule 4.
 ## Verification Gates
-After each task commit, the executor runs configured verification commands from `project.yml`:
+Post-commit verification from `project.yml`:
 ```yaml
 verification:
@@ -154,31 +152,23 @@ verification:
     - cmd: "npm test"
     - cmd: "npx tsc --noEmit"
       advisory: true           # pre-existing failures — warn only
-  auto_fix: true               # agent fixes and retries on failure
-  max_retries: 2               # max auto-fix attempts per command
+  auto_fix: true               # fix and retry on failure
+  max_retries: 2               # max attempts per command
 ```
-- **Auto-detected during init** from `package.json` scripts (test, lint, typecheck)
-- **Advisory mode**: commands that were already failing before Forge started run but don't block
-- **Auto-fix loop**: on failure, agent reads output, fixes code, amends commit, re-runs (up to max_retries)
-- **3-strike integration**: verification retries count toward the task's 3-strike limit
-- Empty `commands` list = no verification gate (opt-out)
+- Auto-detected from `package.json` scripts during init
+- Advisory mode: pre-existing failures warn, don't block
+- Auto-fix loop: read output → fix → amend → re-run (up to max_retries)
+- 3-strike: retries count toward task limit
+- Empty commands = no gate (opt-out)
 ## Beads Integration (Optional)
-When Beads is installed, Forge gains persistent cross-session memory:
-- **Session start:** `bd prime` injects ~1-2K tokens of project context
-- **Task selection:** `bd ready` returns unblocked tasks sorted by priority
-- **Task completion:** `bd complete` updates dependency graph
-- **Memory hygiene:** `bd compact` summarizes old closed tasks
-Without Beads, Forge uses `.forge/state/` + Claude Code's Session Memory. Beads adds depth for long-horizon multi-session projects.
+With Beads installed: `bd prime` (session context), `bd ready` (unblocked tasks), `bd complete` (update deps), `bd compact` (summarize old). Without Beads, `.forge/state/` + Session Memory suffice.
 ## Atomic Commits
-Every task gets its own commit. Format: `{type}({scope}): {description}`
+One commit per task. Format: `{type}({scope}): {description}`
 Types: `feat`, `fix`, `test`, `refactor`, `chore`, `docs`
-Scope: phase-plan or feature area
-Never use `git add .` or `git add -A` — stage files individually.
+Never `git add .` or `git add -A` — stage individually.
 <!-- forge:end -->
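
The Model Routing section added in this diff states the precedence `skills.X` → `default` → parent model. A minimal sketch of that lookup follows, assuming the `models` block from `project.yml` has already been parsed into a plain object. The function name `resolveModel` and its parameters are hypothetical and do not come from the package; the diff does not show how forge-orkes performs this resolution.

```javascript
// Hypothetical helper, not part of forge-orkes: illustrates the stated
// precedence skills.X -> default -> parent model.
function resolveModel(models, skill, parentModel) {
  // 1. A per-skill override wins when present.
  const perSkill = models?.skills?.[skill];
  if (perSkill) return perSkill;
  // 2. Otherwise fall back to the configured default.
  if (models?.default) return models.default;
  // 3. With no routing configured, inherit the parent session's model.
  return parentModel;
}

// Mirrors the YAML example in the diff, parsed into an object.
const models = {
  default: "sonnet",
  parent_session: "sonnet",
  skills: {
    architecting: "opus",
    planning: "opus",
    verifying: "haiku",
    "quick-tasking": "haiku",
  },
};

console.log(resolveModel(models, "planning", "sonnet"));   // per-skill override
console.log(resolveModel(models, "executing", "opus"));    // falls back to default
console.log(resolveModel(undefined, "executing", "opus")); // falls back to parent
```

The sketch deliberately ignores `parent_session`: the diff describes that key as advisory (it warns on mismatch and cannot auto-switch), so it plays no part in choosing the model for a spawned skill.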