npm - forge-orkes - Versions diffs - 0.3.11 → 0.3.14 - Mend

forge-orkes 0.3.11 → 0.3.14

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (15)

package/package.json +1 -1
package/template/.claude/agents/executor.md +44 -95
package/template/.claude/agents/planner.md +46 -75
package/template/.claude/agents/researcher.md +34 -70
package/template/.claude/agents/reviewer.md +54 -117
package/template/.claude/agents/verifier.md +51 -102
package/template/.claude/skills/discussing/SKILL.md +69 -121
package/template/.claude/skills/executing/SKILL.md +20 -3
package/template/.claude/skills/forge/SKILL.md +129 -129
package/template/.claude/skills/initializing/SKILL.md +72 -174
package/template/.claude/skills/planning/SKILL.md +92 -118
package/template/.claude/skills/reviewing/SKILL.md +88 -175
package/template/.forge/templates/constitution.md +10 -0
package/template/.forge/templates/project.yml +17 -0
package/template/CLAUDE.md +105 -115

package/template/.claude/skills/reviewing/SKILL.md CHANGED

@@ -1,26 +1,26 @@
 ---
 name: reviewing
-description: "Post-verification health gate. Runs security audit (10 categories), architecture audit (4 dimensions), and refactoring scan (6 categories) in parallel. Determines if the milestone can ship and catalogs improvement opportunities."
+description: "Post-verification health gate. Security (10), architecture (4), refactoring (6) in parallel. Gates shipping + catalogs improvements."
 ---
-# Reviewing: Health Audit + Refactoring Review
+# Reviewing
-The pre-completion gate. After `verifying` confirms deliverables, assess codebase health and catalog improvements. Three parallel scans produce a structured report that gates milestone completion.
+After `verifying` passes, 3 parallel scans assess health + catalog improvements.
 ## Triggers
-- **Auto:** after `verifying` returns PASSED (Standard/Full tiers)
-- **Manual:** on user request
+- **Auto:** after `verifying` PASSED (Standard/Full)
+- **Manual:** user request
 ## Process
-1. Read project context from `.forge/project.yml`
-2. Scope the review — glob source files, determine milestone diff
-3. Spawn three parallel subagents: Security + Architecture + Refactoring
-4. Collect results, score per-category, determine overall status
-5. Write health report to `.forge/audits/milestone-{id}-health-report.md`
-6. Write accepted refactoring items to `.forge/refactor-backlog.yml`
-7. Route: healthy → complete, critical → user decides
+1. Read context from `.forge/project.yml`
+2. Scope -- glob sources, milestone diff
+3. Spawn 3 subagents: Security + Architecture + Refactoring
+4. Score, determine status
+5. Write `.forge/audits/milestone-{id}-health-report.md`
+6. Write `.forge/refactor-backlog.yml`
+7. Route: healthy->complete, critical->user decides
 ## Step 1: Read Context
@@ -29,43 +29,34 @@ Read: .forge/project.yml → tech stack, framework, database, dependencies
 Read: .forge/state/milestone-{id}.yml → milestone ID and name
 Read: .forge/constitution.md → active architectural gates (if exists)
 Read: .forge/refactor-backlog.yml → existing backlog items (if any)
+Read: .forge/deferred-issues.md → pre-existing failures logged during execution (if exists)
 ```
-Skip inapplicable security categories based on tech stack:
-- No database → SQL/NoSQL Injection N/A
-- No frontend → XSS Prevention N/A
-- No CI/CD config → Pipeline Security N/A
+Skip by stack: no DB->SQL/NoSQL N/A, no frontend->XSS N/A, no CI/CD->Pipeline N/A.
-Determine the milestone's git diff starting point:
-- Check git log for the milestone start commit
-- Fallback: first commit after previous milestone's completion date
-- Last resort: ask the user
+Diff start: git log milestone start -> fallback: first commit after prev milestone -> ask user.
-## Step 2: Scope the Review
+## Step 2: Scope
 ```
-Glob: src/**/*.{ts,tsx,js,jsx,py,go,rs,java} (adapt to project language)
+Glob: src/**/*.{ts,tsx,js,jsx,py,go,rs,java} (adapt to language)
 Glob: **/*.env*, **/docker-compose*, **/.github/workflows/*
 Glob: **/next.config*, **/vite.config*, **/webpack.config*
 ```
-Get diff file list for refactoring scan:
 ```
 git diff --name-only {milestone_start}..HEAD
 ```
-Present scope summary:
-*"Review scope: {N} source files, {N} config files, {N} changed files. Scanning security (10), architecture (4), refactoring (6)."*
+Present: *"Scope: {N} source, {N} config, {N} changed. Scanning security(10), arch(4), refactoring(6)."*
-Build explicit file lists for each subagent — pass paths, not globs.
+File lists per subagent (paths, not globs).
-## Step 3: Spawn Parallel Scans
+## Step 3: Parallel Scans
-Spawn all three as fresh-context subagents. Each receives: explicit file list, tech stack from `project.yml`, instructions below.
+Fresh-context subagents with file list + tech stack.
-### Part 1: Security Audit (subagent)
-**10 Security Categories:**
+### Part 1: Security Audit
 | # | Category | Checks |
 |---|----------|--------|
@@ -80,18 +71,7 @@ Spawn all three as fresh-context subagents. Each receives: explicit file list, t
 | 9 | HTTP Security Headers | CSP, X-Frame-Options, HSTS, X-Content-Type-Options, Referrer-Policy |
 | 10 | CI/CD Pipeline Security | Secrets via secrets context, not hardcoded in workflows |
-**Rules:**
-- Read every file. No sampling.
-- Every finding needs: file path, line number, issue, severity, remediation.
-- Check surrounding code for middleware/wrappers before flagging.
-- Document intentionally public endpoints — don't flag them.
-- Severity: `critical` = exploitable vulnerability, `warning` = defense-in-depth gap, `info` = observation.
-- Prefer false negatives over false positives.
-- Inapplicable categories → N/A with brief reason.
-**Adapt checks to detected stack:** Express/Next.js/Fastify endpoints, PostgreSQL/MongoDB/SQLite queries, GitHub Actions/GitLab CI, React/Vue/Svelte frontends.
-**Output format:**
+**Rules:** Every file, no sampling. Finding = path + line + issue + severity + remediation. Check middleware before flagging. Doc public endpoints, don't flag. `critical`=exploitable, `warning`=defense gap, `info`=observation. False negatives > false positives. Inapplicable -> N/A. Adapt to detected stack.
 ```yaml
 security_audit:
@@ -109,24 +89,16 @@ security_audit:
       notes: "Optional context about intentional decisions"
 ```
-### Part 2: Architecture Audit (subagent)
-**4 Dimensions:**
+### Part 2: Architecture Audit
 | Dimension | Checks |
 |-----------|--------|
-| **Scalability** | Synchronous blocking, missing pagination, unbounded queries, N+1 patterns, missing caching, single points of failure, hardcoded limits |
-| **Maintainability** | Files >300 lines, nesting >4 levels, god components/classes, circular deps, duplicated logic warranting abstraction |
-| **Code Health** | Dead code/unused exports, TODO/FIXME inventory with age, untested critical paths, stale/vulnerable deps |
-| **Structural Quality** | Business logic in UI layer, inconsistent patterns across features, missing error boundaries, API contract inconsistency |
+| **Scalability** | Synchronous blocking, missing pagination, unbounded queries, N+1, missing caching, SPOFs, hardcoded limits |
+| **Maintainability** | Files >300 lines, nesting >4, god components/classes, circular deps, dup logic |
+| **Code Health** | Dead code/unused exports, TODO/FIXME inventory, untested critical paths, stale deps, deferred issues in `.forge/deferred-issues.md` |
+| **Structural Quality** | Biz logic in UI, inconsistent patterns, missing error boundaries, API contract drift |
-**Rules:**
-- Check actual code, not theoretical concerns.
-- Every finding references specific files with evidence.
-- Severity: `critical` = debt causing production issues or blocking future work, `warning` = quality concern, `info` = improvement opportunity.
-- Respect existing ADRs in `.forge/decisions/` and constitutional articles — don't flag intentional choices.
-**Output format:**
+**Rules:** Actual code, not theory. Specific files + evidence. `critical`=prod issues/blocking, `warning`=quality, `info`=improvement. Respect ADRs + constitution.
 ```yaml
 architecture_audit:
@@ -151,31 +123,20 @@ architecture_audit:
       findings: []
 ```
-### Part 3: Refactoring Scan (subagent)
-Pass only files changed during the milestone (from git diff).
+### Part 3: Refactoring Scan
-**6 Categories:**
+Milestone-changed files only (git diff).
 | # | Category | Look For |
 |---|----------|----------|
-| 1 | **Duplication** | Similar logic in 2+ places extractable into shared function/hook/utility |
-| 2 | **Complexity hotspots** | Functions >50 lines, nesting >3 levels, high cyclomatic complexity, overly long files |
-| 3 | **Naming & clarity** | Unclear names, misleading abstractions, functions exceeding their name's scope |
-| 4 | **Pattern inconsistency** | Same concern handled differently across milestone files (error handling, data fetching, state) |
-| 5 | **Dead code** | Unused functions, unreachable branches, commented-out code, unused imports |
-| 6 | **Abstraction issues** | Over-engineered one-use helpers, repeated inline code warranting extraction, premature/missing abstractions |
-**Rules:**
-- Read every file in the diff. No sampling.
-- Every finding references a specific file and line range.
-- Don't flag patterns documented in the constitution.
-- Don't duplicate security or architecture findings.
-- Estimate effort: `quick` (< 30 min, < 50 lines) or `standard` (needs planning).
-- Suggest a concrete approach, not "refactor this."
-- Fewer high-quality findings over many low-signal ones.
-**Output format:**
+| 1 | **Duplication** | Similar logic in 2+ places -> shared function/hook/util |
+| 2 | **Complexity** | Functions >50 lines, nesting >3, high cyclomatic |
+| 3 | **Naming** | Unclear names, misleading abstractions, scope > name |
+| 4 | **Inconsistency** | Same concern handled differently across files |
+| 5 | **Dead code** | Unused functions, unreachable branches, commented code |
+| 6 | **Abstraction** | Over-engineered helpers, missing extraction, premature abstractions |
+**Rules:** Every diff file, no sampling. File + line range. Skip constitutional patterns. No security/arch dupes. Effort: `quick`(<30min, <50 lines) or `standard`. Concrete approach. Quality > quantity.
 ```yaml
 refactoring_scan:
@@ -189,32 +150,30 @@ refactoring_scan:
       suggested_approach: "Extract shared validateEmail() helper to src/utils/validation.ts"
 ```
-## Step 4: Score Results
+## Step 4: Score
-**Per-category scoring (security + architecture):**
+**Per-category:**
 | Status | Meaning |
 |--------|---------|
 | `passed` | No issues |
-| `warning` | Non-critical issues |
-| `critical` | Exploitable vulnerabilities or architectural blockers |
-| `na` | Category doesn't apply |
+| `warning` | Non-critical |
+| `critical` | Exploitable vulns or blockers |
+| `na` | N/A |
-**Overall health:**
+**Overall:**
 | Overall | Condition |
 |---------|-----------|
-| `passed` | All categories passed or N/A |
-| `warnings_only` | One+ warnings, zero critical |
-| `issues_found` | One+ critical findings |
+| `passed` | All passed/N/A |
+| `warnings_only` | 1+ warnings, 0 critical |
+| `issues_found` | 1+ critical |
-Refactoring findings are separate — they never block completion.
+Refactoring never blocks.
-## Step 5: Write Health Report
+## Step 5: Write Report
-Create `.forge/audits/` if needed. Write to `.forge/audits/milestone-{id}-health-report.md`.
-**YAML frontmatter:**
+Create `.forge/audits/` if needed:
 ```yaml
 ---
@@ -242,13 +201,11 @@ total_files_scanned: N
 ---
 ```
-**Body structure:**
 ```markdown
 # Review Report: {milestone name}
 ## Executive Summary
-{1-3 sentences: health assessment, key findings, refactoring highlights, recommendation}
+{1-3 sentences: health, findings, refactoring, recommendation}
 ## Security Findings
@@ -257,7 +214,7 @@ total_files_scanned: N
 |------|------|----------|-------|-------------|
 | ... | ... | ... | ... | ... |
-{Repeat per category. N/A categories: single line "N/A — {reason}"}
+{Repeat per category. N/A: "N/A — {reason}"}
 ## Architecture Findings
@@ -275,56 +232,41 @@ total_files_scanned: N
 |------|-------|-------------|--------|----------|
 | ... | ... | ... | quick/standard | ... |
-{Repeat per category with findings}
+{Repeat per category}
 ## Public Endpoints
-{Intentionally public endpoints documented during security audit}
+{Intentionally public endpoints}
 ## Files Scanned
-{Count and list of all files scanned}
+{Count + list}
 ```
-If a previous milestone audit exists in `.forge/audits/`, compare results and note improvements/regressions in the executive summary.
-## Step 6: Present Results + Triage Refactoring
-### Health Results
-Present health status first — this is the gate.
-**HEALTHY (all passed):**
-*"Health audit passed. No security vulnerabilities or architectural concerns."*
+Previous audit? Note improvements/regressions in summary.
-**NEEDS ATTENTION (critical issues):**
-*"Critical issues found:"*
-Inline top 3 findings per critical category.
+## Step 6: Present + Triage
-**WARNINGS ONLY:**
-*"Passed with warnings — no critical issues, {N} items worth noting. Full report: `.forge/audits/milestone-{id}-health-report.md`."*
+Health status first (gates completion):
+- **HEALTHY:** *"Passed. No vulns or arch concerns."*
+- **NEEDS ATTENTION:** *"Critical issues:"* top 3 per critical category.
+- **WARNINGS:** *"Passed with warnings -- {N} noted. Report: `.forge/audits/milestone-{id}-health-report.md`."*
-### Refactoring Triage
+Refactoring triage (max 10): *"{N} opportunities:"*
+*"**Duplication** ({N}): 1. `src/api/users.ts:42-67` -- Dup email validation. [Accept/Dismiss]*"
-Show findings grouped by category (max 10 initially):
+**Accept**->backlog | **Dismiss**->skip | **Accept all** | **Dismiss all**
-*"{N} refactoring opportunities found:"*
+Deferred issues triage: If `.forge/deferred-issues.md` has `status: pending` items, surface them:
+*"**Test debt** ({N} pre-existing failures): 1. `{summary}` -- first seen {date}. [Accept/Dismiss/Fix-now]*"*
-Per category:
-*"**Duplication** ({N}):*
-*1. `src/api/users.ts:42-67` — Duplicate email validation. Quick fix: extract helper. [Accept / Dismiss]*"
+- **Accept** → add to refactor-backlog.yml as `category: test-debt`, mark `status: triaged` in deferred-issues.md
+- **Fix-now** → route to `planning` fix mode before completing milestone
+- **Dismiss** → mark `status: dismissed` in deferred-issues.md with reason
-User responses:
-- **Accept** → add to backlog
-- **Dismiss** → skip (optionally ask reason to calibrate future scans)
-- **Accept all** → bulk add remaining
-- **Dismiss all** → skip everything
+## Step 7: Backlog + Route
-## Step 7: Write Backlog + Route
+### Backlog
-### Write Refactoring Backlog
-Read existing `.forge/refactor-backlog.yml`. Determine next item ID by incrementing from highest existing.
-Append accepted items:
+Read `.forge/refactor-backlog.yml`. Next ID = max + 1. Append:
 ```yaml
 items:
@@ -341,41 +283,23 @@ items:
     completed: null
 ```
-If file doesn't exist, create from `.forge/templates/refactor-backlog.yml`.
-### Route Based on Health Status
-#### HEALTHY or WARNINGS ONLY (accepted)
-Update `.forge/state/milestone-{id}.yml`: set `current.status` to `complete`.
-Update `.forge/state/index.yml`: set milestone status to `complete`, update `last_updated`.
-*"Milestone [{name}] complete. {N} refactoring items in backlog."*
+Missing? Create from `.forge/templates/refactor-backlog.yml`.
-If Beads installed, run `bd complete`.
+### Route
-#### NEEDS ATTENTION (critical issues)
+**HEALTHY/WARNINGS (accepted):** `current.status: complete` in milestone + index. *"Milestone [{name}] complete. {N} backlog items."* Beads: `bd complete`.
-Do NOT mark complete. Present choices:
+**CRITICAL:** Don't complete. A) Fix->`planning` fix mode->re-verify->re-review. B) Accept risk->doc in report->complete.
-- **A. Fix critical issues** — return to `planning` in fix mode with findings as requirements
-- **B. Accept risk** — document accepted risks in report, complete the milestone
+**WARNINGS (fix):** ->`planning` fix mode->fix->re-review.
-If A: create fix requirements from findings → route to `planning` → after fix + re-verification → re-run `reviewing`.
-If B: append "Accepted Risks" section to report → complete milestone (same as HEALTHY path).
+## Gates
-#### WARNINGS ONLY (user wants to fix)
+- **Security/arch critical** -> soft gate (accept risk)
+- **Warnings** -> advisory
+- **Refactoring** -> never block
-Create fix requirements from warnings → route to `planning` in fix mode → after fix → re-run `reviewing`.
-## Gate Type: Mixed
-- **Security critical** → soft gate (user can accept risk)
-- **Architecture critical** → soft gate (user has final authority)
-- **Warnings** → advisory (noted in report, user chooses)
-- **Refactoring items** → never block (cataloged to backlog)
-The report documents the decision either way — audit trail.
+Report = audit trail.
 ## Backlog Lifecycle
@@ -384,23 +308,12 @@ pending → in_progress → done
 pending → dismissed (during triage or later)
 ```
-`effort: quick` items → pick up via `quick-tasking`.
-`effort: standard` items → Standard tier flow.
+`quick` -> `quick-tasking`. `standard` -> Standard tier.
-Working a backlog item:
-1. `forge` surfaces it as available
-2. User selects it
-3. Route to `quick-tasking` or Standard tier based on effort
-4. On completion, set `status: done` and `completed` date
+`forge` surfaces -> user selects -> route by effort -> `status: done` + date.
 ## Phase Handoff
-After reviewing completes (all paths):
-1. Confirm health report written to `.forge/audits/milestone-{id}-health-report.md` and backlog updated
-2. Set `current.status` to `complete` in `.forge/state/milestone-{id}.yml`
-3. Present:
-*"Milestone [{name}] complete. Report: `.forge/audits/milestone-{id}-health-report.md`. {N} refactoring items in backlog.*
-*Start new work with `/forge` or tackle backlog items anytime."*
+1. Confirm report + backlog
+2. Set `current.status: complete`
+3. *"Milestone [{name}] complete. Report: `.forge/audits/milestone-{id}-health-report.md`. {N} backlog items. `/forge` or backlog."*
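
The Step 4 scoring rule in the rewritten SKILL.md is compact enough to sketch in code. The following is a minimal illustration and not part of the package: `overall_status` and its input shape are hypothetical, and it assumes each security and architecture category has already been assigned one of the four per-category statuses from the table. Refactoring findings are left out on purpose, since the skill states they never block.

```python
from typing import Iterable

# Per-category statuses listed in the Step 4 table.
VALID_STATUSES = {"passed", "warning", "critical", "na"}


def overall_status(category_statuses: Iterable[str]) -> str:
    """Collapse per-category statuses into the overall health status."""
    statuses = list(category_statuses)
    unknown = set(statuses) - VALID_STATUSES
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    if "critical" in statuses:
        return "issues_found"      # 1+ critical
    if "warning" in statuses:
        return "warnings_only"     # 1+ warnings, 0 critical
    return "passed"                # all passed or N/A
```

Checking `critical` before `warning` mirrors the table's precedence: one critical category yields `issues_found` regardless of how many warnings accompany it.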

package/template/.forge/templates/constitution.md CHANGED

@@ -116,6 +116,16 @@ Production code must be debuggable. Logging, error reporting, and health checks
 _Add custom articles below during project init. Follow the same format: rule + gate checkboxes._
+### Article XII: Token Efficiency
+Minimize framework token footprint. Compress prose, route models by task complexity, eliminate duplication across load paths.
+**Gate:**
+- [ ] No duplicated content across load paths (CLAUDE.md, skills, agents)
+- [ ] All framework prose uses compressed style (terse, fragments OK, no filler)
+- [ ] Model routing configured in project.yml with justified defaults
+- [ ] Periodic compression audit: heaviest skill files measured and trimmed if >10KB
 ---
 ## Amending the Constitution
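
The last gate in Article XII (measure the heaviest skill files and trim those over 10KB) could be checked with a short script. This is a sketch under assumptions, not something shipped in the package: `oversized_skills` is a hypothetical helper, the `.claude/skills/<name>/SKILL.md` layout is taken from the file list in this diff, and "10KB" is read as 10 * 1024 bytes, which the article does not specify.

```python
from pathlib import Path

# Assumption: the article's "10KB" is interpreted as 10 * 1024 bytes.
LIMIT_BYTES = 10 * 1024


def oversized_skills(root: Path, limit: int = LIMIT_BYTES) -> list[tuple[str, int]]:
    """Return (path, size) for skill files larger than `limit`, heaviest first."""
    sizes = [
        (path.as_posix(), path.stat().st_size)
        for path in root.glob(".claude/skills/*/SKILL.md")
    ]
    heavy = [(path, size) for path, size in sizes if size > limit]
    return sorted(heavy, key=lambda item: item[1], reverse=True)
```

Calling `oversized_skills(Path("."))` from a project root would list the candidates for a compression pass; an empty list means the gate is satisfied under the stated reading of the threshold.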

package/template/.forge/templates/project.yml CHANGED

@@ -46,6 +46,23 @@ success_criteria:                   # How do we know we're done?
   - ""                              # e.g., "All tests pass with >80% coverage"
   - ""                              # e.g., "Page load < 2 seconds"
+models:
+  default: sonnet                      # Fallback for skills without explicit override
+  parent_session: sonnet               # Advisory — forge warns on mismatch
+  skills:
+    architecting: opus                 # Deep reasoning for structural decisions
+    planning: opus                     # Deep reasoning for task decomposition
+    researching: sonnet                # Solid analysis for investigation
+    executing: sonnet                  # Solid code generation
+    verifying: haiku                   # Structured/mechanical verification
+    reviewing: sonnet                  # Needs judgment for audit findings
+    quick-tasking: haiku               # Fast for small changes
+    initializing: sonnet               # Needs analysis for project detection
+    designing: sonnet                  # UI decisions need good judgment
+    securing: sonnet                   # Security analysis needs depth
+    debugging: sonnet                  # Investigation needs solid reasoning
+    discussing: sonnet                 # Conversation needs natural fluency
 risks:                              # What could go wrong?
   - risk: ""
     mitigation: ""
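
The comments in the new `models` block describe a lookup with a fallback: a skill uses its own entry under `skills`, otherwise `default`. The diff does not show how forge itself resolves this, so the following is only a sketch of that reading. `resolve_model` and `parent_session_mismatch` are hypothetical names, and the configuration is assumed to be already parsed into a plain dict (only a few of the skills are reproduced).

```python
# Assumption: project.yml has been parsed into a dict shaped like this.
MODELS = {
    "default": "sonnet",
    "parent_session": "sonnet",
    "skills": {
        "planning": "opus",
        "verifying": "haiku",
        "reviewing": "sonnet",
    },
}


def resolve_model(models: dict, skill: str) -> str:
    """Return the per-skill override if present, else the default model."""
    overrides = models.get("skills") or {}
    return overrides.get(skill, models["default"])


def parent_session_mismatch(models: dict, active_model: str) -> bool:
    """True when the running session model differs from the advisory value."""
    expected = models.get("parent_session")
    return expected is not None and expected != active_model
```

With this reading, `resolve_model(MODELS, "planning")` gives `opus`, while a skill with no entry falls back to `sonnet`. The mismatch check only reports; the template marks `parent_session` as advisory, so nothing here blocks execution.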