forgedev 1.1.3 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (53)

package/README.md +2 -1
package/bin/devforge.js +2 -1
package/docs/00-README.md +310 -0
package/docs/01-universal-prompt-library.md +1049 -0
package/docs/02-claude-code-mastery-playbook.md +283 -0
package/docs/03-multi-agent-verification.md +565 -0
package/docs/04-errata-and-verification-checklist.md +284 -0
package/docs/05-universal-scaffolder-vision.md +452 -0
package/docs/06-confidence-assessment-and-repo-prompt.md +407 -0
package/docs/errata.md +58 -0
package/docs/multi-agent-verification.md +66 -0
package/docs/plans/.gitkeep +0 -0
package/docs/playbook.md +95 -0
package/docs/prompt-library.md +160 -0
package/docs/uat/UAT_CHECKLIST.csv +9 -0
package/docs/uat/UAT_TEMPLATE.md +163 -0
package/package.json +10 -2
package/src/claude-configurator.js +1 -0
package/src/cli.js +5 -5
package/src/index.js +3 -3
package/src/utils.js +1 -1
package/templates/base/docs/plans/.gitkeep +0 -0
package/templates/base/docs/uat/UAT_CHECKLIST.csv.template +2 -0
package/templates/base/docs/uat/UAT_TEMPLATE.md.template +22 -0
package/templates/claude-code/agents/build-error-resolver.md +3 -2
package/templates/claude-code/agents/code-quality-reviewer.md +1 -1
package/templates/claude-code/agents/database-reviewer.md +1 -1
package/templates/claude-code/agents/doc-updater.md +1 -1
package/templates/claude-code/agents/harness-optimizer.md +26 -0
package/templates/claude-code/agents/loop-operator.md +2 -1
package/templates/claude-code/agents/product-strategist.md +124 -0
package/templates/claude-code/agents/security-reviewer.md +1 -0
package/templates/claude-code/agents/spec-validator.md +31 -1
package/templates/claude-code/agents/uat-validator.md +4 -0
package/templates/claude-code/claude-md/base.md +1 -0
package/templates/claude-code/claude-md/nextjs.md +1 -1
package/templates/claude-code/commands/code-review.md +7 -1
package/templates/claude-code/commands/full-audit.md +3 -2
package/templates/claude-code/commands/workflows.md +3 -0
package/templates/claude-code/hooks/scripts/autofix-polyglot.mjs +20 -10
package/templates/claude-code/hooks/scripts/autofix-python.mjs +3 -4
package/templates/claude-code/hooks/scripts/autofix-typescript.mjs +3 -3
package/templates/claude-code/hooks/scripts/guard-protected-files.mjs +2 -2
package/templates/claude-code/skills/git-workflow/SKILL.md +2 -2
package/templates/claude-code/skills/nextjs/SKILL.md +1 -1
package/templates/claude-code/skills/playwright/SKILL.md +6 -5
package/templates/claude-code/skills/security-web/SKILL.md +1 -0
package/templates/infra/github-actions/.github/workflows/ci.yml.template +49 -0
package/templates/testing/pytest/backend/tests/__init__.py +0 -0
package/templates/testing/pytest/backend/tests/conftest.py.template +11 -0
package/templates/testing/pytest/backend/tests/test_health.py.template +10 -0
package/templates/testing/vitest/vitest.config.ts.template +18 -0
package/CLAUDE.md +0 -38

package/docs/06-confidence-assessment-and-repo-prompt.md ADDED

@@ -0,0 +1,407 @@
+# Confidence Assessment & The Repo Prompt
+---
+## Gap Analysis: Failover and UAT
+### UAT Coverage: WEAK (3/10)
+**What's there:**
+- Prompt library mentions "acceptance criteria" in spec writing (Flow 1)
+- Multi-agent doc's spec-validator checks requirements as IMPLEMENTED/PARTIAL/MISSING
+- AI quality auditor checks for graceful degradation
+**What's MISSING:**
+- No dedicated UAT flow or prompt
+- No UAT scenario template
+- No staging/pre-production verification step
+- No manual testing checklist that pairs with automated tests
+- No "run through the app as a real user" verification step
+- No smoke test protocol after deployment
+- No UAT sign-off gate before marking a feature complete
+### Failover Coverage: MODERATE (5/10)
+**What's there:**
+- AI quality auditor checks for graceful fallback (point 7 of the 7-point audit)
+- Production readiness agent checks error handling and recovery
+- Security agent checks for AI unavailability handling
+**What's MISSING:**
+- No circuit breaker patterns in the scaffolding
+- No health check endpoint generation
+- No retry/backoff strategy guidance
+- No database connection failover
+- No external service timeout configuration
+- No graceful shutdown handling
+- No queue/dead letter patterns for async operations
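The missing retry/backoff guidance listed above could look roughly like the following. This is a minimal sketch, not DevForge code: `backoffDelayMs`, `retryWithBackoff`, and all default values are hypothetical names and numbers chosen for illustration.

```javascript
// Hypothetical helpers: names and defaults are illustrative assumptions.

// Delay before retry number `attempt` (0-based), doubling each time and capped.
function backoffDelayMs(attempt, baseDelayMs = 100, maxDelayMs = 5000) {
  return Math.min(maxDelayMs, baseDelayMs * 2 ** attempt);
}

function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Runs `operation` until it succeeds or `retries` extra attempts are used up.
async function retryWithBackoff(operation, options = {}) {
  const {
    retries = 3,
    baseDelayMs = 100,
    maxDelayMs = 5000,
    sleep = defaultSleep, // injectable so tests do not have to wait
  } = options;
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await sleep(backoffDelayMs(attempt, baseDelayMs, maxDelayMs));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

A database connection or external service call would be passed in as `operation`. A production version would probably also add random jitter to the delay and a per-attempt timeout, which this sketch leaves out.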
+---
+## Confidence Assessment: Will These Documents Reduce Recurring Issues?
+| Issue You Described | Coverage | Confidence | Why |
+|--------------------|----------|-----------|-----|
+| "I always have to ask if it's sure many times" | Stop hooks + completion protocol | **8/10** | Hooks are deterministic — Claude physically can't say "done" with broken types/tests. The 20% gap is that hooks only catch what scripts can check — logic errors still need human review. |
+| "Missing things in instructions" | Lean CLAUDE.md (~150 lines) + skills | **7/10** | Cutting from 650+ to 150 lines directly addresses instruction-following degradation. Skills load domain knowledge on demand without bloating every session. The gap: you still need discipline to keep it lean over time. |
+| "Playwright regression issues" | Testing skill + pre-commit hooks + data-testid enforcement | **6/10** | The patterns are sound (no waitForTimeout, semantic locators, independent tests). But Playwright flakiness often comes from app-specific timing issues that no generic pattern can predict. You'll still need to tune for your app. |
+| "Things wouldn't work as intended" | Spec-validator agent + audit-wiring command | **7/10** | Dead feature detection (endpoint exists but nothing calls it) catches the most common "looks done but isn't" problem. The gap: business logic correctness still requires human judgment. |
+| "Quality of AI output" | 7-point prompt audit + AI quality auditor agent | **8/10** | This is actually the strongest area. The concrete data embedding rule alone would have prevented most AI quality issues. The anti-hallucination boundary and confidence scoring are well-established patterns. |
+| "Doesn't group frontend/backend well" | Directory-scoped CLAUDE.md + task separation prompts | **7/10** | Backend-first-then-frontend with /clear between works well. The gap: Claude sometimes still sneaks in cross-domain changes if the task description is ambiguous. |
+| "Waste tokens troubleshooting" | Atomic tasks + /clear + plan mode | **7/10** | Smaller tasks = less context degradation = fewer retries. Plan mode catches issues before code. The gap: you need the discipline to actually /clear between tasks. |
+| **UAT / user acceptance** | **WEAK** | **3/10** | **This is the biggest gap. No structured UAT flow exists in the current documents.** |
+| **Failover / production resilience** | **MODERATE** | **5/10** | **Partially covered by production-readiness agent but no scaffolded patterns.** |
+### Overall Confidence: **6.5/10**
+The documents would meaningfully reduce your issues — probably cut "are you sure?"
+cycles by 60-70%. But they're strongest on code quality and weakest on UAT and
+production resilience. Those gaps need to be filled.
+---
+## The Repo Concept
+**Name suggestions (matching the mission):**
+| Name | Why | npm availability |
+|------|-----|-----------------|
+| `forge-init` | You're forging production-ready projects from raw ideas | Check `npx forge-init` |
+| `scaffold-ai` | Scaffolding + AI-first development | Check `npx scaffold-ai` |
+| `launchpad-dev` | Launch pad for any project | Check `npx launchpad-dev` |
+| `init-forge` | Forging from init | Check `npx init-forge` |
+| `devforge` | Developer's forge | Check `npx devforge` |
+| `buildkit-ai` | Build kit with AI infrastructure | Check `npx buildkit-ai` |
+| `startship` | Start + ship (you start, you ship) | Check `npx startship` |
+My recommendation: **`devforge`** — short, memorable, captures the idea of
+forging production-ready projects from raw materials.
+---
+## The Prompt (Drop This Into Claude Code)
+This is the single prompt that builds the entire repo from scratch. It's long
+because it's a complete specification. Copy the whole thing.
+```
+I'm building a new open-source CLI tool called devforge (or [your chosen name]).
+It's a universal, AI-first project scaffolding tool that:
+1. Asks what you're building (web app, API, full-stack, mobile, CLI, AI service, etc.)
+2. Recommends the optimal tech stack based on the service type
+3. Scaffolds the project with the right structure, configs, and dependencies
+4. Ships with Claude Code infrastructure (CLAUDE.md, hooks, skills, agents, commands) tailored to the selected stack
+5. Includes UAT templates, failover patterns, and production readiness checks
+## Repository structure to create:
+```
+devforge/
+├── package.json                    # CLI package, bin: "devforge"
+├── README.md                       # How to install and use
+├── LICENSE                         # MIT
+├── CLAUDE.md                       # For developing devforge itself
+├── .claude/
+│   ├── settings.json               # Hooks for devforge development
+│   └── agents/
+│       └── code-quality-reviewer.md
+│
+├── bin/
+│   └── devforge.js                 # CLI entry point (#!/usr/bin/env node)
+│
+├── src/
+│   ├── index.js                    # Main orchestrator
+│   ├── prompts.js                  # Interactive CLI prompts (Inquirer.js)
+│   ├── recommender.js              # Service type → stack recommendation engine
+│   ├── composer.js                 # Template composition engine
+│   ├── claude-configurator.js      # Generates .claude/ directory for the project
+│   ├── uat-generator.js            # Generates UAT templates and checklists
+│   └── utils.js                    # File operations, logging, colors
+│
+├── templates/
+│   ├── base/                       # Every project gets this
+│   │   ├── .gitignore.template
+│   │   ├── README.md.template
+│   │   └── docs/
+│   │       ├── plans/              # Empty plans directory
+│   │       └── uat/                # UAT templates
+│   │           ├── UAT_TEMPLATE.md
+│   │           └── UAT_CHECKLIST.csv
+│   │
+│   ├── frontend/
+│   │   ├── nextjs/                 # Next.js App Router + TypeScript + Tailwind + Shadcn
+│   │   └── react-vite/             # React + Vite + TypeScript + Tailwind
+│   │
+│   ├── backend/
+│   │   ├── fastapi/                # FastAPI + SQLAlchemy 2.0 + Pydantic v2
+│   │   ├── hono/                   # Hono + TypeScript
+│   │   └── express/                # Express + TypeScript
+│   │
+│   ├── database/
+│   │   ├── prisma-postgres/        # Prisma + PostgreSQL
+│   │   └── sqlalchemy-postgres/    # SQLAlchemy + PostgreSQL + Alembic
+│   │
+│   ├── auth/
+│   │   ├── nextauth/               # NextAuth.js
+│   │   └── jwt-custom/             # Custom JWT
+│   │
+│   ├── testing/
+│   │   ├── vitest/                 # Vitest config + example test
+│   │   ├── playwright/             # Playwright config + example E2E + fixture patterns
+│   │   └── pytest/                 # Pytest config + example test + fixtures
+│   │
+│   ├── infra/
+│   │   ├── docker-compose/         # docker-compose.yml template
+│   │   └── github-actions/         # CI/CD workflow template
+│   │
+│   └── claude-code/                # Claude Code infrastructure modules
+│       ├── hooks/
+│       │   ├── typescript.json     # PostToolUse: eslint, Stop: tsc + eslint
+│       │   ├── python.json         # PostToolUse: ruff, Stop: pyright + ruff
+│       │   └── polyglot.json       # Both TypeScript and Python hooks
+│       │
+│       ├── claude-md/
+│       │   ├── nextjs.md           # CLAUDE.md template for Next.js projects
+│       │   ├── fastapi.md          # CLAUDE.md template for FastAPI projects
+│       │   ├── fullstack.md        # CLAUDE.md template for full-stack projects
+│       │   └── base.md             # Base CLAUDE.md template (universal rules)
+│       │
+│       ├── skills/
+│       │   ├── nextjs/SKILL.md
+│       │   ├── fastapi/SKILL.md
+│       │   ├── playwright/SKILL.md
+│       │   ├── security-web/SKILL.md
+│       │   ├── security-api/SKILL.md
+│       │   └── ai-prompts/SKILL.md
+│       │
+│       ├── agents/
+│       │   ├── code-quality-reviewer.md   # Universal code quality agent
+│       │   ├── security-reviewer.md       # Universal security agent
+│       │   ├── spec-validator.md          # Universal spec compliance agent
+│       │   ├── production-readiness.md    # Universal production readiness agent
+│       │   └── uat-validator.md           # UAT verification agent (NEW)
+│       │
+│       └── commands/
+│           ├── verify-all.md
+│           ├── audit-spec.md
+│           ├── audit-wiring.md
+│           ├── audit-security.md
+│           ├── pre-pr.md
+│           └── run-uat.md          # UAT execution command (NEW)
+│
+├── docs/
+│   ├── universal-prompt-library.md # The complete prompt library (all 6 flows)
+│   ├── multi-agent-verification.md # Agent architecture documentation
+│   ├── playbook.md                 # CLAUDE.md structuring guide
+│   └── errata.md                   # Known issues and testing checklist
+│
+└── tests/
+    ├── recommender.test.js         # Tests for stack recommendation logic
+    ├── composer.test.js            # Tests for template composition
+    └── claude-configurator.test.js # Tests for .claude/ generation
+```
+## The CLI flow:
+```
+$ npx devforge my-app
+🔨 Welcome to DevForge!
+What are you building?
+  1. Web app (SPA / SSR / static)
+  2. API / backend service
+  3. Full-stack app (frontend + backend)
+  4. Mobile app (cross-platform)
+  5. CLI tool / utility
+  6. AI/ML powered service
+  7. Desktop app
+  8. Browser extension
+  9. Microservice / serverless
+  10. Describe it (AI recommends)
+> _
+[After selection, ask refinement questions:]
+- Language preference? (TypeScript, Python, Go, Rust)
+- Need authentication? (y/n)
+- Need AI/LLM integration? (y/n)
+- Need file uploads? (y/n)
+- Need real-time features? (y/n)
+- Deployment target? (Docker, Vercel, AWS, GCP)
+- Include Claude Code infrastructure? (y/n) — default yes
+[Show recommendation, let user confirm/adjust]
+[Scaffold the project]
+[Generate Claude Code infrastructure]
+[Generate UAT templates]
+[Show "next steps" with exact commands to run]
+```
+## Critical features to include:
+### 1. UAT Templates (REQUIRED)
+Every generated project must include:
+- `docs/uat/UAT_TEMPLATE.md` — a scenario pack template:
+  ```markdown
+  # UAT Scenario Pack: [Project Name]
+  ## Pre-Conditions
+  - [ ] Application is deployed to staging
+  - [ ] Test accounts are created
+  - [ ] Test data is seeded
+  ## Scenarios
+  ### UAT-001: [Feature Name] — Happy Path
+  **Priority:** P0
+  **Preconditions:** [what must be true before testing]
+  **Steps:**
+  1. [action]
+  2. [action]
+  3. [action]
+  **Expected Result:** [what should happen]
+  **Actual Result:** ___
+  **Status:** PASS / FAIL / BLOCKED / NOT RUN
+  **Tester:** ___
+  **Date:** ___
+  **Notes:** ___
+  ### UAT-002: [Feature Name] — Error Handling
+  ...
+  ```
+- `docs/uat/UAT_CHECKLIST.csv` — machine-readable tracking:
+  ```csv
+  UAT_ID,Feature,Priority,Status,Tester,Date,Defect_ID,Notes
+  UAT-001,[Feature],P0,NOT RUN,,,,
+  ```
+- A `run-uat` Claude Code command:
+  ```markdown
+  <!-- .claude/commands/run-uat.md -->
+  Read docs/uat/UAT_TEMPLATE.md.
+  For each P0 scenario:
+  1. Check if automated tests exist that cover this scenario
+  2. If automated: run the test and report PASS/FAIL
+  3. If not automated: flag as MANUAL REQUIRED
+  4. Update UAT_CHECKLIST.csv with results
+  Output:
+  - Automated coverage: X/Y scenarios have automated tests
+  - Results: X passed, Y failed, Z need manual testing
+  - Blocking issues: list any P0 failures
+  ```
+- A `uat-validator` agent:
+  ```markdown
+  <!-- .claude/agents/uat-validator.md -->
+  You are a QA engineer validating UAT scenarios.
+  Read-only. Never modify code.
+  For each UAT scenario:
+  1. Verify the feature exists in the codebase
+  2. Check if there's a corresponding automated test
+  3. If automated test exists, verify it covers the scenario's steps
+  4. Flag gaps: scenarios without tests, tests without scenarios
+  Output a traceability matrix:
+  | UAT ID | Feature | Has Automated Test? | Test File | Coverage |
+  ```
+### 2. Failover / Production Resilience Patterns (REQUIRED)
+Every generated backend must include:
+- Health check endpoint (`/health` or `/healthz`)
+- Graceful shutdown handler
+- Database connection retry with exponential backoff
+- External service timeout configuration
+- Structured error responses (never leak stack traces)
+For AI-powered services, additionally:
+- AI service fallback (rule-based when AI is unavailable)
+- AI response validation (Pydantic/Zod, not raw strings)
+- AI timeout + retry configuration
+- Rate limit handling
+### 3. The Prompt Library
+Include the complete prompt library from docs/universal-prompt-library.md
+in every generated project at `docs/prompt-library.md`. This gives every
+developer on the project access to the 6 flows and utility prompts.
+### 4. Pre-built Verification Chain
+Every generated project ships with the full agent verification chain:
+- code-quality-reviewer.md (tailored to the selected stack)
+- security-reviewer.md (tailored to web/API/full-stack)
+- spec-validator.md (universal)
+- production-readiness.md (tailored to the deployment target)
+- uat-validator.md (universal)
+## Implementation approach:
+- Use Node.js for the CLI (so it's npx-installable)
+- Use Inquirer.js for interactive prompts
+- Use simple file copying + string replacement for templates (no complex template engine needed for v1)
+- Test with Vitest
+- Start with 3 stack combinations:
+  1. Next.js full-stack (extends your existing next-init)
+  2. FastAPI backend service
+  3. Next.js frontend + FastAPI backend (polyglot full-stack)
+- Make it easy to add more stacks later (each stack is just a folder in templates/)
+## What to do RIGHT NOW:
+1. Read this entire specification
+2. Create the repo structure
+3. Implement the CLI flow (prompts.js → recommender.js → composer.js → claude-configurator.js)
+4. Implement the first 3 stack templates
+5. Implement the Claude Code infrastructure generation
+6. Implement the UAT template generation
+7. Write tests for the recommendation engine
+8. Test by running `node bin/devforge.js test-app` locally
+Build this phase by phase. Start with the CLI flow and base template.
+Show me the plan before writing any code.
+```
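The `UAT_CHECKLIST.csv` layout defined in the prompt above lends itself to a small summary step like the one `run-uat` is asked to report. The sketch below is an illustration only: `summarizeUatChecklist` is a hypothetical helper, and it assumes the CSV has no quoted fields or embedded commas.

```javascript
// Hypothetical helper; assumes a simple CSV without quoted fields.
function summarizeUatChecklist(csvText) {
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);
  const header = headerLine.split(",").map((name) => name.trim());

  // Turn each data row into an object keyed by the header names.
  const records = rows
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const cells = line.split(",");
      return Object.fromEntries(
        header.map((name, i) => [name, (cells[i] ?? "").trim()])
      );
    });

  // Count scenarios per Status value (PASS, FAIL, BLOCKED, NOT RUN).
  const counts = {};
  for (const record of records) {
    counts[record.Status] = (counts[record.Status] ?? 0) + 1;
  }

  // P0 failures are the blocking issues the run-uat output asks for.
  const blocking = records
    .filter((record) => record.Priority === "P0" && record.Status === "FAIL")
    .map((record) => record.UAT_ID);

  return { total: records.length, counts, blocking };
}
```

The returned `blocking` list corresponds to the "Blocking issues" line of the `run-uat` output. A real implementation would likely use a proper CSV parser so that notes containing commas do not shift columns.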
+---
+## After The Repo Is Built: Your Workflow
+```
+1. You have a product idea
+        ↓
+2. Run: npx devforge my-new-saas
+        ↓
+3. Answer: "Full-stack app" → "TypeScript + Python" → "Yes auth" → "Yes AI" → "Docker"
+        ↓
+4. DevForge generates:
+   - Next.js frontend + FastAPI backend
+   - PostgreSQL + pgvector
+   - NextAuth + JWT
+   - Playwright + Pytest
+   - Docker Compose
+   - CLAUDE.md tailored to this stack
+   - Hooks (eslint, tsc, ruff, pyright)
+   - Skills (nextjs, fastapi, playwright, security, ai-prompts)
+   - Agents (code-quality, security, spec-validator, production-readiness, uat-validator)
+   - Commands (verify-all, audit-spec, audit-wiring, pre-pr, run-uat)
+   - UAT templates
+   - Health check endpoints
+   - Graceful shutdown handlers
+   - docs/prompt-library.md (the complete 6-flow prompt library)
+        ↓
+5. cd my-new-saas && npm install && npm run dev
+        ↓
+6. Open Claude Code. Everything is already configured.
+   - Hooks enforce quality on every edit
+   - Skills provide domain knowledge on demand
+   - Agents verify your work before PRs
+   - UAT templates track acceptance testing
+   - Prompt library guides every task
+        ↓
+7. You just... build.
+```
+No setup. No "are you sure?" cycles. No troubleshooting hooks.
+No writing CLAUDE.md from scratch. No figuring out which agents to create.
+Scaffold, run, build.

package/docs/errata.md ADDED

@@ -0,0 +1,58 @@
+# DevForge Errata & Testing Checklist
+Known issues, limitations, and manual testing procedures.
+---
+## Known Issues
+### No ESLint
+The `post-edit.sh` hook is a no-op because ESLint is not yet configured. The `stop-quality-gate.sh` only runs Vitest. When ESLint is added, both hooks should be updated.
+### Windows Bash Compatibility
+Hook scripts require `bash` and `jq`. On Windows:
+- Git Bash provides `bash`
+- `jq` must be installed separately (e.g., via `choco install jq` or `scoop install jq`)
+- `chmod +x` may not persist on NTFS — hooks use `bash script.sh` invocation to avoid this
+### Template Variable Edge Cases
+- Template substitution is simple regex: `{{(\w+)}}` → replacement
+- Nested braces like `{{{VAR}}}` will produce `{value}` (the outer pair of braces is preserved)
+- Variables not in the vars map are left as-is (no error thrown)
+- Binary files with `{{` patterns will not be modified (only `.template` files are processed)
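The substitution rules in this list can be reproduced with a few lines of JavaScript. This is a sketch of the described behavior, not the package's actual implementation: `renderTemplate` and the shape of `vars` are assumptions.

```javascript
// Hypothetical helper illustrating the rules above; not DevForge's real API.
function renderTemplate(text, vars) {
  // {{NAME}} where NAME is one or more word characters (\w+).
  return text.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    // Variables missing from the map are left as-is; no error is thrown.
    Object.prototype.hasOwnProperty.call(vars, name) ? String(vars[name]) : match
  );
}

const vars = { PROJECT_NAME: "my-app" };
console.log(renderTemplate("# {{PROJECT_NAME}}", vars));   // # my-app
console.log(renderTemplate("{{{PROJECT_NAME}}}", vars));   // {my-app}
console.log(renderTemplate("{{UNKNOWN}}", vars));          // {{UNKNOWN}}
```

The second call shows why triple braces keep one brace on each side: the regex matches the inner `{{PROJECT_NAME}}` and leaves the surrounding characters alone.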
+### V1 Stack Limitations
+- Only 3 stack combinations supported
+- No Go, Rust, React SPA, React Native, Tauri, or browser extension support
+- "Describe it" option (AI recommendation) is not implemented in V1
+---
+## Manual Testing Checklist
+Before any release, verify these manually:
+### CLI Flow
+- [ ] `node bin/devforge.js` (no args) shows usage error
+- [ ] `node bin/devforge.js test-output` starts the interactive flow
+- [ ] Ctrl+C during prompts exits cleanly (no partial output)
+- [ ] Selecting each of the 3 supported stacks completes successfully
+- [ ] Selecting an unsupported stack shows a helpful message
+### Output Verification
+- [ ] All `{{VAR}}` placeholders are replaced in output
+- [ ] No `.template` extensions remain in output filenames
+- [ ] `.gitignore` includes `.env` in every generated project
+- [ ] `CLAUDE.md` is generated with stack-specific content
+- [ ] Health check endpoint code exists in generated project
+- [ ] Graceful shutdown handler exists in generated project
+### Claude Code Infrastructure
+- [ ] `.claude/settings.json` generated with correct hooks for the stack
+- [ ] All 5 agents present in `.claude/agents/`
+- [ ] Relevant skills present in `.claude/skills/`
+- [ ] All 6 commands present in `.claude/commands/`
+- [ ] `docs/prompt-library.md` exists in generated project
+### Clean Up
+- [ ] `rm -rf test-output/` after each test run

package/docs/multi-agent-verification.md ADDED

@@ -0,0 +1,66 @@
+# Multi-Agent Verification Chain
+How the 5 review agents work together to verify DevForge code quality.
+---
+## The 5 Agents
+### 1. Code Quality Reviewer
+**Focus:** Code correctness and conventions
+- ESM import patterns (`.js` extensions, no `require()`)
+- Single-responsibility functions in `src/`
+- Template `{{VARIABLE_NAME}}` conventions
+- Chalk for user output, stderr for errors
+- No dead code or unused imports
+### 2. Security Reviewer
+**Focus:** CLI-specific security (not web security)
+- Path traversal in template composition (can output escape the target directory?)
+- Command injection via project names
+- File system safety (no symlink following)
+- No secrets in templates or generated output
+- Dependency safety
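The path traversal question in this checklist usually comes down to a containment check before every write. The following is a minimal sketch under that assumption. `resolveInside` is a hypothetical helper, and it does not cover the symlink case the list also mentions.

```javascript
import path from "node:path";

// Hypothetical helper: resolves `relativePath` against `targetDir` and
// throws if the result would land outside the target directory.
function resolveInside(targetDir, relativePath) {
  const root = path.resolve(targetDir);
  const candidate = path.resolve(root, relativePath);
  const rel = path.relative(root, candidate);

  // A relative path that climbs out of root, or an absolute one
  // (e.g. another drive on Windows), means the output escaped.
  const escapes =
    rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
  if (escapes) {
    throw new Error(`Refusing to write outside ${root}: ${relativePath}`);
  }
  return candidate;
}
```

A composer would call this for each template file before copying it. Guarding against symlinks would additionally need a real-path check on the existing parent directory, which is left out here.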
+### 3. Spec Validator
+**Focus:** Requirement traceability
+- Every feature in README → has implementation in `src/`
+- Every stack in recommender → has templates in `templates/`
+- Every Claude Code feature → has templates in `templates/claude-code/`
+- Reports IMPLEMENTED / PARTIAL / MISSING / DIVERGED
+### 4. Production Readiness Reviewer
+**Focus:** npm publish readiness
+- Package.json correctness (bin, engines, type, version)
+- CLI behavior (--help, --version, exit codes, Ctrl+C)
+- Cross-platform compatibility (path.join, no hardcoded separators)
+- Error handling (existing dirs, invalid names, missing templates)
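Rejecting invalid names early also helps with the command injection concern raised for the security reviewer. A sketch of such a check follows. `validateProjectName` and its allowlist are assumptions for illustration, not the rules DevForge applies. The 214-character cap mirrors npm's package name limit.

```javascript
// Hypothetical validator; the allowlist is an assumption, not DevForge's rule set.
function validateProjectName(name) {
  if (typeof name !== "string" || name.length === 0) {
    return { valid: false, reason: "name is required" };
  }
  if (name.length > 214) {
    return { valid: false, reason: "name is longer than 214 characters" };
  }
  // Lowercase letters, digits, dot, underscore, hyphen; must start with
  // a letter or digit, so "../x", "-rf", and names with spaces are rejected.
  if (!/^[a-z0-9][a-z0-9._-]*$/.test(name)) {
    return { valid: false, reason: "name contains unsupported characters" };
  }
  return { valid: true, reason: "" };
}
```

With an allowlist like this, shell metacharacters and path separators never reach file system or child process calls, so the CLI can print `reason` and exit with a non-zero code instead.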
+### 5. UAT Validator
+**Focus:** Test coverage completeness
+- Maps UAT scenarios in `docs/uat/UAT_TEMPLATE.md` to tests in `tests/`
+- Reports coverage matrix
+- Flags P0 scenarios without automated tests
+- Suggests test implementations for gaps
+---
+## Orchestration
+### Via `/project:verify-all`
+Runs all 5 agents sequentially after `npx vitest run`. Summarizes findings by severity.
+### Via `/project:pre-pr`
+Runs code-quality and security agents on the PR diff only (not full codebase).
+### Manual
+Launch any agent individually from the Claude Code agents panel.
+---
+## Key Principles
+1. **All agents are read-only** — they have `disallowedTools: [Write, Edit, MultiEdit]`
+2. **Self-verification** — each agent re-checks findings before reporting to reduce false positives
+3. **Tailored to DevForge** — agents check CLI-specific patterns, not generic web app patterns
+4. **Severity-based output** — findings are grouped as critical/high/medium/low

package/docs/plans/.gitkeep ADDED

File without changes

package/docs/playbook.md ADDED

@@ -0,0 +1,95 @@
+# DevForge AI Infrastructure Playbook
+How the Claude Code development infrastructure is structured for DevForge.
+---
+## Layer 0: CLAUDE.md (<50 lines)
+The root `CLAUDE.md` contains only what Claude needs on every interaction:
+- Project identity and purpose
+- Directory map
+- Build/test/lint commands
+- Key rules (ESM only, template conventions, read-only docs)
+Everything else goes in hooks, agents, commands, or skills.
+---
+## Layer 1: Hooks (3 scripts in .claude/hooks/)
+### protect-files.sh (PreToolUse)
+Blocks edits to: `.env`, `.env.*`, `package-lock.json`, `.git/`, `docs/0*` reference docs.
+Triggered on Write, Edit, MultiEdit. Exit 2 = blocked.
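The blocking rule described here can be sketched in Node as well. This is an illustrative port, not the contents of `protect-files.sh`. The function names are hypothetical, and it assumes the hook payload on stdin exposes the edited path as `tool_input.file_path`.

```javascript
// Protected paths come from the description above; everything else is assumed.
function isProtected(filePath) {
  const p = filePath.replace(/\\/g, "/").replace(/^\.\//, "");
  const base = p.split("/").pop();
  return (
    base === ".env" ||
    base.startsWith(".env.") ||
    base === "package-lock.json" ||
    p === ".git" || p.startsWith(".git/") || p.includes("/.git/") ||
    /(^|\/)docs\/0[^/]*$/.test(p) // docs/0* reference docs
  );
}

// Maps a hook payload to an exit code: 2 blocks the edit, 0 allows it.
function decide(hookInput) {
  const filePath = hookInput?.tool_input?.file_path ?? "";
  return isProtected(filePath)
    ? { exitCode: 2, message: `Blocked: ${filePath} is a protected file` }
    : { exitCode: 0, message: "" };
}

// Wiring for use as an actual hook script: read JSON from stdin, then exit.
async function runAsHook() {
  let raw = "";
  for await (const chunk of process.stdin) raw += chunk;
  const { exitCode, message } = decide(JSON.parse(raw));
  if (message) console.error(message);
  process.exit(exitCode);
}
// In a real hook file, the last line would call runAsHook().
```

Keeping `decide` separate from the stdin handling makes the rule easy to unit test without spawning a process.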
+### post-edit.sh (PostToolUse)
+Currently a no-op stub — DevForge has no ESLint yet. When ESLint is added, this will auto-lint `.js` files after every edit. Non-blocking (exit 0 always).
+### stop-quality-gate.sh (Stop)
+Runs `npx vitest run` before Claude can mark a task as done. Prevents completing work with broken tests. Has infinite-loop guard via `stop_hook_active` check.
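The infinite-loop guard mentioned above can be expressed as a small decision function. This is a sketch only: `stopGateDecision` is a hypothetical name, the test result is passed in rather than computed, and it assumes exit code 2 blocks a Stop event in the same way as the "Exit 2 = blocked" rule for PreToolUse.

```javascript
// Hypothetical decision function for a Stop hook.
function stopGateDecision(hookInput, testsPassed) {
  // stop_hook_active is set when an earlier Stop hook already forced Claude
  // to keep going. Let it stop now; otherwise a permanently failing test
  // run would keep the session looping forever.
  if (hookInput.stop_hook_active === true) {
    return { exitCode: 0, message: "" };
  }
  return testsPassed
    ? { exitCode: 0, message: "" }
    : { exitCode: 2, message: "Tests are failing; fix them before finishing." };
}
```

In a real hook, `testsPassed` would come from the exit status of `npx vitest run`, and `message` would be written to stderr before exiting.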
+---
+## Layer 2: Agents (5 read-only reviewers in .claude/agents/)
+All agents have `disallowedTools: [Write, Edit, MultiEdit]` — they can only read and report.
+| Agent | Focus |
+|---|---|
+| code-quality-reviewer | ESM patterns, single responsibility, template conventions |
+| security-reviewer | Path traversal, command injection, file system safety |
+| spec-validator | Requirement traceability against a spec file |
+| production-readiness | CLI packaging, cross-platform, error handling |
+| uat-validator | UAT scenario to test coverage mapping |
+---
+## Layer 3: Commands (10 slash commands in .claude/commands/)
+### Daily Workflow (4 commands)
+| Command | When to Use |
+|---|---|
+| `/project:help` | Don't know what to do — guides you to the right workflow |
+| `/project:status` | Quick dashboard of tests, git state, recent commits |
+| `/project:next` | Suggests what to work on next based on context |
+| `/project:done` | Verifies task completion before moving on |
+### Verification (6 commands)
+| Command | What It Does |
+|---|---|
+| `/project:verify-all` | Runs all 5 agents + tests |
+| `/project:audit-spec` | Checks implementation vs specification |
+| `/project:audit-wiring` | Finds dead/unwired code and templates |
+| `/project:audit-security` | Security review of src/ and bin/ |
+| `/project:pre-pr` | Full pre-PR checklist |
+| `/project:run-uat` | Execute UAT scenarios, report coverage |
+---
+## Adding New Infrastructure
+### Adding a hook
+1. Create `.claude/hooks/[name].sh` with `#!/bin/bash` and stdin JSON parsing
+2. Add to `.claude/settings.json` under the appropriate event (PreToolUse/PostToolUse/Stop)
+3. Make executable: `chmod +x .claude/hooks/[name].sh`
+### Adding an agent
+1. Create `.claude/agents/[name].md` with YAML frontmatter including `disallowedTools`
+2. Add a review checklist tailored to DevForge
+3. Include a self-verification protocol
+### Adding a command
+1. Create `.claude/commands/[name].md` with instructions
+2. Use DevForge's real commands (not placeholders)
+3. Reference it in the help command if it's a common workflow
+---
+## Why No ESLint Yet
+DevForge is a small CLI tool (~900 lines of JS). The CLAUDE.md notes ESLint is planned ("when eslint is added"). When it's added:
+1. Install: `npm install -D eslint @eslint/js`
+2. Create `eslint.config.js` (flat config for ESM)
+3. Update `post-edit.sh` to run `npx eslint --fix`
+4. Update `stop-quality-gate.sh` to add `npx eslint .`
+5. Update CLAUDE.md lint command
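Step 2 above could start from something like the following `eslint.config.js`. This is a minimal sketch assuming ESLint's flat config format and the `@eslint/js` package from step 1. The `files` globs are guesses based on the directory names mentioned in these docs.

```javascript
// eslint.config.js (flat config, ESM); globs below are assumptions.
import js from "@eslint/js";

export default [
  // ESLint's built-in recommended rule set.
  js.configs.recommended,
  {
    files: ["bin/**/*.js", "src/**/*.js", "tests/**/*.js"],
    languageOptions: {
      ecmaVersion: "latest",
      sourceType: "module", // the project is ESM only
    },
  },
];
```

Node globals such as `process` and `console` would probably also need to be declared under `languageOptions.globals` (for example via the `globals` package) to avoid `no-undef` reports.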