npm - forge-orkes - Versions diffs - 0.9.4 → 0.9.6 - Mend

forge-orkes 0.9.4 → 0.9.6

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12) hide show

package/package.json +1 -1
package/template/.claude/agents/verifier.md +5 -0
package/template/.claude/settings.json +1 -1
package/template/.claude/skills/discussing/SKILL.md +1 -6
package/template/.claude/skills/executing/SKILL.md +18 -5
package/template/.claude/skills/forge/SKILL.md +11 -1
package/template/.claude/skills/initializing/SKILL.md +11 -0
package/template/.claude/skills/verifying/SKILL.md +68 -0
package/template/.forge/templates/framework-absorption/gsd.md +1 -1
package/template/.forge/templates/interface-detection.md +17 -0
package/template/.forge/templates/plan.md +1 -0
package/template/.forge/templates/project.yml +28 -7

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "forge-orkes",
-  "version": "0.9.4",
+  "version": "0.9.6",
   "description": "Set up the Forge meta-prompting framework for Claude Code in your project",
   "bin": {
     "create-forge": "./bin/create-forge.js"

package/template/.claude/agents/verifier.md CHANGED Viewed

@@ -60,6 +60,7 @@ Run: {n} | Passed: {n} | Failed: {n} | Coverage: {if available}
 Read: .forge/phases/m{M}-{N}-{name}/plan.md → extract must_haves
 Read: .forge/state/milestone-{id}.yml → reported progress
 Read: .forge/context.md → locked decisions
+Read: .forge/deferred-issues.md → known pre-existing failures (if exists; treat as advisory)
 ```
 ### 2. Truths
@@ -90,6 +91,10 @@ npm run build 2>&1
 npm run lint 2>&1
 ```
+Cross-reference failures against `deferred-issues.md`:
+- Match on test name or summary → advisory; label "Pre-existing (deferred: DI-{N})"
+- No match → regression; include in Issues (Critical) and gaps
 ### 6. Anti-Pattern Scan
 | Pattern | Command | Severity |

package/template/.claude/settings.json CHANGED Viewed

@@ -51,7 +51,7 @@
         "hooks": [
           {
             "type": "command",
-            "command": "if [ ! -f .forge/.active-skill ]; then echo \"[Forge] No active skill. Invoke /forge or /quick-tasking before editing code. To bypass: touch .forge/.active-skill\" >&2; exit 2; fi"
+            "command": "if [ ! -f \"$(git rev-parse --show-toplevel 2>/dev/null)/.forge/.active-skill\" ]; then echo \"[Forge] No active skill. Invoke /forge or /quick-tasking before editing code. To bypass: touch .forge/.active-skill\" >&2; exit 2; fi"
           }
         ]
       },

package/template/.claude/skills/discussing/SKILL.md CHANGED Viewed

@@ -23,21 +23,16 @@ Structured conversation: approach, trade-offs, decisions. Clarity, not artifacts
 ## Progressive Persistence
-> **HARD GATE — NOT A GUIDELINE. NO EXCEPTIONS.**
-> Write to `.forge/context.md` after EVERY confirmed decision, before asking the next question.
-> If you reach convergence without having written each decision individually — you already violated this rule.
+**Decisions persist immediately. This is a hard gate — not a guideline.**
 After each user response that confirms a decision:
 1. **STOP** — do not continue the discussion
 2. **Write to disk** — append to `.forge/context.md` (create from template if missing)
-   - **Path is always `.forge/context.md`** — never `.forge/phases/*/context.md`, never inline memory
 3. **THEN** continue to the next question
 Never accumulate decisions in working memory. Never batch writes to convergence.
-**Common failure mode:** Agent says "decisions live in conversation; planning writes context.md." This is WRONG. Discussing writes context.md. Planning reads it.
 ### Write Protocol
 **On first decision of the session:**

package/template/.claude/skills/executing/SKILL.md CHANGED Viewed

@@ -10,6 +10,18 @@ description: "Build to plan with atomic commits, deviation rules, and context en
 - [ ] Context.md locked decisions noted
 - [ ] Constitution.md gates satisfied
 - [ ] Milestone state updated to `status: executing`
+- [ ] Baseline snapshot captured (see below)
+## Baseline Snapshot
+Run **before the first task begins**. Makes failure causality mechanical — no self-assessment.
+1. Run all non-advisory verification commands
+2. Record which tests/checks pass and which fail — this is the baseline
+3. After each commit: failures **not in baseline** = introduced by this task = **must fix under Rule 3**
+4. Failures **present in baseline** = pre-existing → append to `.forge/deferred-issues.md`, mark `status: pending`
+Skip only if re-entering an in-progress execution and `deferred-issues.md` already documents all current failures.
 ## Deviation Rules
@@ -116,9 +128,9 @@ For each command, in order:
 ```
 Attempt 1:
   1. Read error output
-  2. Caused by current task?
-     - YES → fix code, stage fixes, amend commit
-     - NO → mark advisory for this session; append to .forge/deferred-issues.md; continue
+  2. In baseline snapshot?
+     - YES (pre-existing) → mark advisory for this session; append to .forge/deferred-issues.md; continue
+     - NO (introduced by this task) → fix code, stage fixes, amend commit
   3. Re-run command
   4. Pass → next command
   5. Fail → next attempt (up to max_retries)
@@ -133,7 +145,7 @@ After max_retries exhausted:
 Verification retries count toward the task's 3-strike limit. 2 strikes used = 1 verification retry max.
 ### Do NOT Fix
-- **Pre-existing failures** not from current task → mark advisory; append to `.forge/deferred-issues.md`
+- **Pre-existing failures** present in baseline snapshot → mark advisory; append to `.forge/deferred-issues.md`
 - **Flaky tests** passing on re-run without changes → note in summary, no strike
 - **Unrelated warnings** (deprecation, non-blocking lint) → ignore
@@ -190,7 +202,8 @@ After completing all tasks in a plan:
 - src/components/Login.tsx (created)
 ## Notes
-[Anything the verifier should know]
+[Genuine handoff context only: environment requirements, seed data, external services needed (e.g. "Redis must be running for auth tests").
+Do NOT list test failures here — pre-existing failures belong in deferred-issues.md; task-introduced failures must be fixed before committing.]
 ```
 ## State Updates

package/template/.claude/skills/forge/SKILL.md CHANGED Viewed

@@ -9,7 +9,7 @@ Entry point. Detect tier, route skills, manage transitions. New projects → ini
 ## Step 1: Read State
-### Milestone Selection
+### 1.1 Milestone Selection
 Check state files:
 1. `.forge/state/index.yml` → milestone-aware
@@ -33,6 +33,8 @@ Beads enabled (`forge.beads_integration: true`) → `bd prime`. Optional.
 Read `.forge/context.md`. **Needs Resolution** unchecked → warn: *"{N} unresolved discrepancies."* Quick proceeds; Standard/Full blocked at planning.
+### 1.2 Backlog + Desire Paths
 Check `.forge/refactor-backlog.yml`:
 - *"{N} refactors ({Q} quick, {S} standard). Tackle first?"*
 - Pick → `quick-tasking`/Standard, set `in_progress`. Decline → proceed.
@@ -41,6 +43,14 @@ Check `desire_paths` 3+ occurrences:
 - *"Recurring: [{description}] ({N}x). Fix via [suggestion]?"*
 - Agree → apply, reset. Decline → note, don't nag.
+### 1.3 Interface Check
+Check `interface` in `project.yml`:
+- Field present → skip. (If scalar `none` found, normalize to `[none]`; see `.forge/templates/interface-detection.md` for empty-field canon.)
+- Field absent → detect using `.forge/templates/interface-detection.md`. No match → write `[none]`.
+- Prompt: *"No `interface` field found. Detected: [{list}]. Add to project.yml? (yes/no)"*
+- Yes → write `interface: [{list}]` to project.yml. No → skip for this session.
 No `project.yml` + Standard/Full → **Step 2A**. State exists → **Step 2B**.
 ## Step 1.5: Lifecycle Operations

package/template/.claude/skills/initializing/SKILL.md CHANGED Viewed

@@ -143,6 +143,11 @@ Bash: wc -l src/**/*.{ts,tsx,js,jsx} 2>/dev/null | tail -1  # codebase size
 Auto-detect: language, framework, build tools, tests, DB, name.
+### Step 1.2: Interface Type
+Detect which surfaces the project exposes. See `.forge/templates/interface-detection.md` for detection signals. No match → `[none]`.
+Present: *"Detected interfaces: [{list}]. Override?"* — user can correct before project.yml is written.
 ### Step 1.5: Verification Commands
 ```bash
@@ -258,6 +263,12 @@ Present grouped. User confirms/adds/removes.
 User describes project → `.forge/project.yml`: name, goal, stack, constraints, success criteria, risks.
+### Step 1.5: Interface Type
+*"What interfaces does this project expose? (browser / cli / api / desktop / native-apple / none — comma-separate multiples)"*
+Validate each term against `.forge/templates/interface-detection.md` type vocabulary. On unrecognized term, prompt: *"Did you mean [closest match]? Valid: browser | cli | api | desktop | native-apple | none."* Write validated answer as `interface: [...]` in project.yml.
 ### Step 2: Design System
 *"UI library?"*

package/template/.claude/skills/verifying/SKILL.md CHANGED Viewed

@@ -21,8 +21,76 @@ Read: .forge/project.yml → tech stack (for running tests)
 Read: .forge/phases/m{M}-{N}-{name}/plan-{NN}.md → must_haves (truths, artifacts, key_links)
 Read: .forge/context.md → locked decisions
 Read: .forge/requirements.yml → requirement IDs for coverage check
+Read: .forge/deferred-issues.md → known pre-existing failures (if exists; treat as advisory)
 ```
+## Deferred Issues
+If `.forge/deferred-issues.md` exists, load it before running any tests.
+When test results come in, cross-reference failures against known deferred issue IDs:
+- **Failure matches a deferred ID** → advisory only — note as "Pre-existing (deferred: DI-{N})", do NOT fail verification for this
+- **Failure not in deferred-issues.md** → regression introduced after the baseline — **FAIL** and include in gaps
+In the Test Results section of the report, split accordingly:
+```markdown
+## Test Results
+Run: {n} | Passed: {n} | Failed: {n}
+### Regressions (block release)
+- {test name}: {error}
+### Pre-existing / Deferred (advisory)
+- DI-{N}: {test name} — {summary from deferred-issues.md}
+```
+Verdict: FAIL only triggers for regressions. Known deferred failures do not block a PASSED verdict — they are tracked separately and must be worked off via the refactor backlog or a dedicated fix phase.
+## Interface Testing Gate
+Read `interface` from `.forge/project.yml`. If absent, `[]`, or `[none]` → skip gate, proceed to 3-level verification.
+Also read `interface_tools` (optional override map).
+For each declared interface type, check for tests:
+| Type | Detection | Expected Tool | Suggested Command |
+|------|-----------|---------------|-------------------|
+| `browser` | Glob `**/*.spec.ts`, `**/*.spec.js`, `**/e2e/**`, `**/playwright/**` | Playwright | `npx playwright test` |
+| `cli` | Grep test files for `spawn`, `execSync`, `subprocess.run`, `exec(` | Any runner | subprocess integration tests |
+| `api` | Grep test files for `supertest`, `request(`, `httpx`, `TestClient`, `net/http` | Node→supertest · Python→httpx · Go→net/http | runner-dependent |
+| `desktop` | Glob `**/*.spec.ts`, `**/*.spec.js`, `**/e2e/**`, `**/playwright/**` | Playwright | `npx playwright test` |
+| `native-apple` | Glob `**/*UITests/**`, `**/*Tests/**` | XCUITest | `xcodebuild test` |
+If `interface_tools.{type}` is set → use its `tool` and `cmd` instead of defaults.
+**All types have tests** → gate passes silently. Proceed to 3-level verification.
+**Any type missing tests** → BLOCK:
+```
+INTERFACE TESTING GATE — BLOCKED
+Interface type: {type}
+Expected: {tool} tests
+Detection: {glob or grep used}
+Found: none
+Add {type} tests before verifying:
+  browser:       Add Playwright specs in e2e/ or *.spec.ts files
+  cli:           Add integration tests that spawn the CLI as a child process and assert exit codes + stdout
+  api:           Add HTTP integration tests using {default tool for detected language}
+  desktop:       Add Playwright tests targeting the Electron/Tauri window
+  native-apple:  Add XCUITest targets and run via xcodebuild test
+Re-run verifying after tests are added.
+```
+**Do NOT invoke Skill(testing).** Return BLOCKED and wait — the user or executor must add tests explicitly.
+If detection is ambiguous (e.g. API tests hard to grep definitively) → lean toward PASS to avoid false blocks; note uncertainty in the verdict.
 ## 3-Level Goal-Backward Verification
 ### Level 1: Observable Truths

package/template/.forge/templates/framework-absorption/gsd.md CHANGED Viewed

@@ -22,7 +22,7 @@ CONTEXT.md with "NON-NEGOTIABLE" or "DEFERRED" sections
 | `ROADMAP.md` | `.forge/roadmap.yml` | Extract phases, milestones. Add dependency references between phases. Convert to YAML. |
 | `STATE.md` | `.forge/state/index.yml` + `.forge/state/milestone-{id}.yml` | Extract current phase number, progress, active blockers, recent decisions. Split into global index and per-milestone state. |
 | `CONTEXT.md` | `.forge/context.md` | NON-NEGOTIABLE → Locked Decisions. DEFERRED → Deferred Ideas. DISCRETION → Discretion Areas. Minimal format change needed. |
-| `PLAN.md` | `.forge/phases/{N}-{name}/plan.md` | Keep XML task format. Add `must_haves` YAML frontmatter. Split per-phase if combined. |
+| `PLAN.md` | `.forge/phases/m{M}-{N}-{name}/plan.md` | Keep XML task format. Add `must_haves` YAML frontmatter. Split per-phase if combined. |
 | `references/ui-brand.md` | `.forge/design-system.md` | Extract component rules, brand guidelines. Convert to component mapping table format. |
 | `references/tdd.md` | `.forge/constitution.md` Article II | Inform Test-First article gates with project-specific testing approach. |
 | `references/verification-patterns.md` | Informs `verifying` skill | Extract project-specific verification patterns. Add to constitution or context. |

package/template/.forge/templates/interface-detection.md ADDED Viewed

@@ -0,0 +1,17 @@
+# Interface Detection Signals
+Used by `initializing` (Step 1.2), `forge` (Step 1, interface check), and `verifying` (Interface Testing Gate) to detect and validate interface types.
+If ANY signal matches for a type, include that type in the result array.
+| Signal | Type |
+|--------|------|
+| `next`/`react`/`vue`/`svelte`/`astro`/`nuxt`/`remix`/`angular` in deps | `browser` |
+| `bin` field in package.json OR `commander`/`yargs`/`meow` in name/deps OR `cobra`/`urfave/cli` in go.mod OR `click`/`typer`/`argparse` in requirements.txt | `cli` |
+| `express`/`fastapi`/`flask`/`django`/`gin`/`echo`/`koa`/`hono`/`nestjs`/`fiber` in deps | `api` |
+| `electron`/`tauri` in deps | `desktop` |
+| `.xcodeproj`/`.xcworkspace`/`Package.swift`/`Info.plist` at project root | `native-apple` |
+No match → `[none]`. Multiple matches → array (e.g. `[browser, api]`).
+**Empty field:** absent, `[]`, and `[none]` all mean no interface declared. Scalar `none` normalizes to `[none]` — arrays only are valid on write.

package/template/.forge/templates/plan.md CHANGED Viewed

@@ -1,6 +1,7 @@
 # Phase Plan Template
 Copy to `.forge/phases/m{M}-{N}-{name}/plan-{NN}.md` for each plan in a phase.
+`{M}` = milestone ID, `{N}` = phase number within milestone, `{name}` = kebab-case phase name.
 ---

package/template/.forge/templates/project.yml CHANGED Viewed

@@ -14,6 +14,26 @@ tech_stack:
   testing: ""                       # e.g., Vitest, Jest, Pytest
   other: []                         # Additional key dependencies
+interface: [none]                      # Surfaces this project exposes: browser | cli | api | desktop | native-apple | none
+                                       # Array — e.g. [browser, api] for full-stack projects
+interface_tools:                       # Optional: override default test tool per interface type. Set manually or during init.
+  # browser:
+  #   tool: cypress
+  #   cmd: "npx cypress run"
+  # cli:
+  #   tool: bats
+  #   cmd: "bats test/"
+  # api:
+  #   tool: httpx
+  #   cmd: "pytest tests/api/"
+  # desktop:
+  #   tool: playwright
+  #   cmd: "npx playwright test"
+  # native-apple:
+  #   tool: xcodebuild
+  #   cmd: "xcodebuild test -scheme MyApp"
 design_system:
   library: ""                       # e.g., PrimeReact, Material-UI, shadcn/ui, Ant Design, Chakra UI, none
   version: ""                       # e.g., 10.x, 5.x
@@ -31,15 +51,16 @@ constraints:
     - ""                            # e.g., "No server-side rendering"
 verification:
-  commands:                           # Shell commands run after each task commit
-    - ""                              # e.g., "npm run lint"
-    - ""                              # e.g., "npm test"
-    - ""                              # e.g., "npx tsc --noEmit"
+  commands:                           # Shell commands run after each task commit. Auto-detected during init.
+    # - cmd: "npm run lint"
+    #   advisory: false               # true = warn only (for pre-existing failures)
+    # - cmd: "npm test"
+    #   advisory: false
+    # - cmd: "npx tsc --noEmit"
+    #   advisory: true                # pre-existing type errors — warn, don't block
   auto_fix: true                      # On failure, agent fixes and retries
   max_retries: 2                      # Max auto-fix attempts per command (0 = fail immediately)
-  # Commands are auto-detected during init from package.json scripts.
-  # Advisory mode: commands that were already failing before Forge started
-  # run but don't block — they log warnings only.
+  # Advisory mode: commands already failing before Forge started run but don't block — warn only.
 success_criteria:                   # How do we know we're done?
   - ""                              # e.g., "User can create and edit posts"