npm - specweave - Versions diffs - 1.0.256 → 1.0.259 - Mend

specweave 1.0.256 → 1.0.259

Files changed (113)

package/plugins/specweave/skills/auto/SKILL.md CHANGED

@@ -10,23 +10,11 @@ argument-hint: "[INCREMENT_IDS...] [OPTIONS]"
 !`s="auto"; for d in .specweave/skill-memories .claude/skill-memories "$HOME/.claude/skill-memories"; do p="$d/$s.md"; [ -f "$p" ] && awk '/^## Learnings$/{ok=1;next}/^## /{ok=0}ok' "$p" && break; done 2>/dev/null; true`
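Reviewer note: the `awk` filter in the context line above prints everything under a `## Learnings` heading up to the next level-2 heading. A rough Python equivalent is sketched below to make the behavior easier to check; `extract_learnings` is a hypothetical name and is not part of the package.

```python
def extract_learnings(markdown: str) -> str:
    """Return the body of the '## Learnings' section, mirroring the awk filter."""
    captured = []
    capturing = False
    for line in markdown.splitlines():
        if line == "## Learnings":
            # Heading itself is skipped (awk: ok=1; next)
            capturing = True
            continue
        if line.startswith("## "):
            # Any other level-2 heading ends the section (awk: ok=0)
            capturing = False
        if capturing:
            captured.append(line)
    return "\n".join(captured)
```

Deeper headings such as `### detail` stay inside the captured section, because only lines starting with `## ` followed by a space switch capturing off.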
-**Start autonomous execution session using Claude Code's Stop Hook.**
-## How to Use
-When user says "auto" or "autonomous" or "keep working" or provides a task description, you should:
+## Project Context
-1. **Understand the user's intent**: What do they want to work on?
-2. **Find or create the increment**: Check for active increments, or create new ones if needed
-3. **Set up auto session directly** (see Execution section):
-   - Read config and find increments using Read/Glob tools
-   - Activate increments by editing metadata.json via Edit tool
-   - Write session marker (`.specweave/state/auto-mode.json`) via Write tool
-   - Map quality flags to `successCriteria` in the marker
-4. **MANDATORY: Display stop conditions banner** - Users MUST see when auto mode will stop BEFORE work begins! See "Step 1.5" in Execution section.
-5. **Start working**: Execute /sw:do on tasks, mark them complete, let framework hooks handle sync
+!`.specweave/scripts/skill-context.sh auto 2>/dev/null; true`
-Now work on the increment tasks. When you try to exit, the stop hook will check completion conditions and feed the next task back to you. Continue until all tasks are complete and quality gates pass.
+**Start autonomous execution session using Claude Code's Stop Hook.**
 ## Usage
@@ -34,1082 +22,149 @@ Now work on the increment tasks. When you try to exit, the stop hook will check
 /sw:auto [INCREMENT_IDS...] [OPTIONS]
 ```
-:::tip Claude Code's Game-Changing Features for Auto Mode
-**Compact Command (VSCode)** — Use `compact` mode to keep Claude Code inside your VSCode window. Work continuously for **hours** in the same session without context switching between terminal and editor. Perfect for long auto mode sessions!
-**STOP Hooks with Subagents** — Stop hooks now work with spawned subagents! This means `/sw:auto` can validate quality gates at EVERY level of execution. When auto mode spawns specialized agents (QA, Security, Performance), the stop hook validates their results before allowing the session to continue.
-:::
-## Arguments
-- `INCREMENT_IDS`: One or more increment IDs to process (e.g., `0001`, `0001-feature`)
-  - **NEW BEHAVIOR**: If omitted, auto mode will:
-    1. Check for active/in-progress increments
-    2. If none found, **intelligently create increments** based on user context
-    3. Match existing planned increments to user intent OR extend them
+- `INCREMENT_IDS`: One or more increment IDs (e.g., `0001`, `0001-feature`). If omitted, finds active increments or intelligently creates new ones.
 ## Options
 | Option | Description | Default |
 |--------|-------------|---------|
-| `--max-turns N` | Maximum hook invocations before hard stop | **20** |
-| `--simple` | Simple mode (minimal context) | false |
+| `--max-turns N` | Max hook invocations before hard stop | 20 |
+| `--simple` | Minimal context mode | false |
 | `--dry-run` | Preview without starting | false |
 | `--all-backlog` | Process all backlog items | false |
 | `--skip-gates G1,G2` | Pre-approve specific gates | None |
-| `--no-increment`, `--no-inc` | Skip auto-creation (require existing increments) | false |
-| `--yes`, `-y` | Auto-approve increment plan (skip user approval) | false |
-| `--tdd`, `--strict` | Enable TDD strict mode - ALL tests must pass | false |
-| **`--build`** | Build must pass before completion | false |
-| **`--tests`** | Tests must pass before completion (unit + integration) | false |
-| **`--e2e`** | E2E tests must pass before completion | false |
-| **`--lint`** | Linting must pass before completion | false |
-| **`--types`** | Type-checking must pass before completion | false |
-| **`--cov <n>`** | Code coverage must meet threshold (%) | 80 |
-| **`--e2e-cov <n>`** | E2E coverage must meet threshold (%) | 70 |
-| **`--cmd "<command>"`** | Custom command must pass before completion | None |
-:::warning Turn limit is a SAFETY NET
-The primary completion criteria is **all tasks complete + all ACs satisfied** (grep-based, reliable). The turn limit (default: 20) is a backup safety net. Quality gates (tests, build, coverage) are enforced by the MODEL before running `/sw:done`, NOT by the stop hook.
-:::
-## Completion Architecture (v5)
-**Two-layer design**: Stop hook gates exit. Model enforces quality before `/sw:done`.
-### What the Stop Hook Does (v5)
-The stop hook (`stop-auto-v5.sh`, ~166 lines) is a **simple gate**:
-1. Checks if auto mode is active (auto-mode.json marker)
-2. Counts pending tasks in `tasks.md` (grep-based, reliable)
-3. Counts open ACs in `spec.md` (grep-based, reliable)
-4. If work remains: **block** with concise message
-5. If all complete: **approve** (model should then run `/sw:done`)
-The hook does NOT run tests, builds, or any external commands. It has no PATH dependency.
-### What the Model Does (before /sw:done)
-When `--build`, `--tests`, etc. flags are passed, they are stored as **success criteria** in `auto-mode.json`. The model reads these criteria and MUST verify them before running `/sw:done`:
-| Flag | successCriteria type | Model Responsibility |
-|------|---------------------|---------------------|
-| `--build` | `build_succeeds` | Run build command, fix failures |
-| `--tests` | `tests_pass` | Run tests, fix failures |
-| `--e2e` | `tests_pass` (description: "E2E tests") | Run E2E tests, fix failures |
-| `--lint` | `custom_command` (command: lint cmd) | Run linter, fix issues |
-| `--types` | `custom_command` (command: tsc) | Run type checker, fix errors |
-| `--cov N` | `tests_pass` (threshold: N) | Write tests to meet threshold |
-| `--cmd "..."` | `custom_command` | Run custom command |
-The model runs in a proper shell environment where npm/node ARE available.
-### Completion Flow
-```
-Tasks done + ACs satisfied
-  → Hook approves exit
-  → Model verifies quality criteria (build, tests, etc.)
-  → Model runs /sw:done
-  → /sw:done runs PM validation gates
-  → Increment closed
-```
-### Safety Nets
-| Mechanism | Default | Purpose |
-|-----------|---------|---------|
-| Turn limit | 20 | Hard stop after N hook invocations |
-| Staleness | 2h | Auto-cleanup sessions inactive > 2 hours |
-| Dedup | 30s | Prevent rapid-fire blocks |
-Configure in `.specweave/config.json`:
-```json
-{ "auto": { "maxTurns": 50, "maxSessionAge": 7200 } }
-```
-### Best Practices
-1. **Start Simple**: Use `--build --tests` for basic quality gates
-2. **Add Coverage Gradually**: Start with `--cov 70`, increase over time
-3. **Don't Skip E2E**: Use `--e2e` for user-facing features
-4. **Custom Commands**: Use `--cmd` for project-specific checks
-## Intelligent Increment Creation (NEW!)
-**Auto mode now creates increments automatically when none exist!**
-### Decision Flow
-```
-/sw:auto invoked
-     │
-     ▼
-Are INCREMENT_IDS specified? ──YES──> Use specified increments
-     │
-     NO
-     ▼
-Active increment exists? ──YES──> Use active increment
-     │
-     NO
-     ▼
---no-increment/--no-inc flag? ──YES──> ERROR: No increments found
-     │
-     NO (DEFAULT)
-     ▼
-INTELLIGENT INCREMENT CREATION
-     │
-     ├─> Analyze user context
-     ├─> Check for matching planned/backlog increments
-     ├─> Match existing OR create new increment(s)
-     │
-     ▼
-Auto mode starts with new/matched increment(s)
-```
-### Intelligence Patterns
-The LLM will analyze the context and decide:
-1. **Match Existing**: If user says "continue the auth feature" → finds `0002-user-authentication`
-2. **Extend Existing**: If user says "add password reset" → extends auth increment with new tasks
-3. **Create New**: If user says "build a payment system" → creates `0003-payment-integration`
-4. **Multiple Increments**: If user says "finish all pending features" → creates queue from backlog
-5. **Ask User**: If ambiguous, LLM will ask clarifying questions before creating
+| `--no-increment` | Require existing increments (no auto-creation) | false |
+| `--yes`, `-y` | Auto-approve increment plan | false |
+| `--tdd`, `--strict` | TDD strict mode (RED->GREEN->REFACTOR enforced) | false |
+| `--build` | Build must pass before completion | false |
+| `--tests` | Tests must pass before completion | false |
+| `--e2e` | E2E tests must pass before completion | false |
+| `--lint` | Linting must pass before completion | false |
+| `--types` | Type-checking must pass before completion | false |
+| `--cov <n>` | Code coverage threshold (%) | 80 |
+| `--cmd "<command>"` | Custom command must pass | None |
-### Examples
+## Core Loop
-```bash
-# User says: "Let's ship the dashboard feature"
-/sw:auto
-# → LLM finds 0004-dashboard in backlog, activates it
-# User says: "Build a user profile page with avatar upload"
-/sw:auto
-# → LLM creates 0005-user-profile-page with spec + tasks
-# User says: "I want to work on auth and notifications"
-/sw:auto
-# → LLM creates queue: [0001-authentication, 0002-notifications]
-# User says: "Just work on what's already planned"
-/sw:auto --no-increment  # or --no-inc
-# → ERROR if no active increment (strict mode)
-```
-## How It Works
-```
-1. User runs /sw:auto (with or without IDs)
-           │
-           ▼
-2. Skill sets up session directly (no CLI needed):
-   ├─ Read config.json for TDD mode, auto settings
-   ├─ Glob/Read increments to find active/backlog
-   ├─ Edit metadata.json to activate increments
-   └─ Write .specweave/state/auto-mode.json (session marker)
-           │
-           ▼
-3. Claude starts working on tasks
-   └─ /sw:do executes tasks
-           │
-           ▼
-4. Claude tries to exit (naturally)
-           │
-           ▼
-5. Stop Hook intercepts (stop-auto-v5.sh)
-   ├─ Checks: auto-mode.json exists with active=true
-   ├─ Checks: All tasks complete? (grep [ ] in tasks.md)
-   ├─ Checks: All ACs satisfied? (grep [ ] in spec.md)
-   └─ Checks: Turn limit reached?
-           │
-   ┌──────┴──────┐
-   ▼             ▼
-INCOMPLETE    COMPLETE
-   │             │
-   ▼             ▼
-Block exit    Approve exit
-Re-feed       Session ends
-prompt
 ```
-## Examples
-### Basic Usage
-```bash
-# Start auto on current increment
-/sw:auto
-# Start on specific increment
-/sw:auto 0001-user-auth
-# Multiple increments
-/sw:auto 0001 0002 0003
+IMPLEMENT task -> TEST -> FAIL? -> FIX -> PASS -> mark complete -> NEXT task
 ```
-### With Options
-```bash
-# Increase turn limit
-/sw:auto --max-turns 50
-# Simple mode (minimal context)
-/sw:auto --simple
-# Preview only
-/sw:auto --dry-run
-# All backlog items
-/sw:auto --all-backlog
-```
-### Pre-approve Gates
-```bash
-# Skip deploy gate (pre-approved)
-/sw:auto --skip-gates deploy
-# Multiple gates
-/sw:auto --skip-gates "deploy,migrate"
-```
-## Session Management
-### Check Status
-```bash
-/sw:auto-status
-```
-### Cancel Session
-```bash
-/sw:cancel-auto
-```
-### Resume After Crash
-Just run `/sw:do` - it will detect incomplete tasks and continue.
-Or use Claude Code's built-in:
-```bash
-/resume           # Pick session to resume
-claude --continue # Continue last session
-```
-## Configuration
-In `.specweave/config.json`:
-```json
-{
-  "auto": {
-    "enabled": true,
-    "maxIterations": 500,
-    "maxHours": 120,
-    "testCommand": "npm test",
-    "coverageThreshold": 80,
-    "enforceTestFirst": false,
-    "humanGated": {
-      "patterns": ["deploy", "migrate", "publish"],
-      "timeout": 1800
-    }
-  }
-}
-```
-**Note**: The stop hook checks task/AC completion via grep. Quality gates (tests, build) are the model's responsibility before running `/sw:done`.
-## Completion Signals
-The session ends when ANY of these occur:
-1. **All tasks complete + tests passed** - tasks.md has all `[x]` AND tests were executed
-2. **Completion promise** - Output contains `<!-- auto-complete:DONE -->`
-3. **Max iterations** - Reached configured limit (default: 500)
-4. **Max hours** - Time limit exceeded (default: 120 hours / 5 days)
-5. **User cancellation** - `/sw:cancel-auto`
-6. **Human gate timeout** - Gate pending too long
-**IMPORTANT**: When tasks are all marked done, the stop hook approves exit. The model MUST verify quality criteria (from `--build`, `--tests`, etc. flags stored in `auto-mode.json`) before running `/sw:done`.
-## Simple Mode (--simple)
-Pure stop hook loop behavior:
-- Minimal context in re-feed prompt
-- No session state UI
-- No queue management
-- Just: loop + tasks.md completion + max iterations
-```bash
-/sw:auto --simple
-```
-## Safety Features
-- **Human Gates**: Sensitive operations require approval
-- **Circuit Breakers**: External service failures handled gracefully
-- **Turn Limit**: Hard stop after maxTurns (default: 20)
-- **stop_hook_active**: Prevents infinite continuation loops
-- **Sound Notifications**: Audible alerts when Claude stops working
-## Sound Notifications
-**Auto mode plays a satisfying sound when work completes successfully!**
-### When Sound Plays
-| Event | Sound | Platforms | Meaning |
-|-------|-------|-----------|---------|
-| **Session Complete (Success)** | Glass.aiff (macOS)<br>complete.oga (Linux)<br>Windows Notify (Windows) | All | All tasks done, tests passing - work finished! |
-**Sound plays ONLY on complete success** - when all tasks are done AND all tests pass. This way you know when to check back without being interrupted during ongoing work.
-### Cross-Platform Support
-The sound notification works automatically on:
-- **macOS**: Glass.aiff (satisfying chime)
-- **Linux**: PulseAudio/ALSA/speaker-test fallbacks
-- **Windows**: PowerShell beeps
-Sounds fail gracefully on systems without audio support.
-## v2.3 Per-Agent Stop Hook Behavior (NEW!)
-**CRITICAL: The stop hook runs PER AGENT, not globally!**
-### How It Works
-```
-Main Agent (Claude Code)
-    │
-    ├── Stop hook invoked when main agent tries to exit
-    │
-    ├── Spawns Subagent A (Task tool)
-    │   └── Subagent A completes → returns to main agent
-    │       (NO stop hook for subagent exit by default)
-    │
-    ├── Spawns Subagent B (Task tool with stop_hooks enabled)
-    │   └── Stop hook CAN be invoked if configured
-    │
-    └── Main agent tries to exit → Stop hook invoked
-```
-### Key Implications
-1. **Turn count = main agent loops**: When you see "Turn 8/20", that's 8 times the MAIN agent tried to exit, not subagent work.
-2. **Subagent work is "free"**: Spawning specialized agents (QA, Security, etc.) doesn't consume turns from the main loop.
-3. **Shared session state**: All agents (main + sub) share the same `auto-session.json`, so task completion is tracked globally.
-4. **Completion validation at main level**: The stop hook checks task/AC completion when the MAIN agent tries to exit.
-### Configuration
-To enable stop hooks for subagents (advanced):
-```typescript
-// In Task tool call
-{
-  "stop_hooks": true,  // Enable stop hook for this subagent
-  "inherit_session": true  // Share session state with parent
-}
-```
-### Best Practices
-- Let subagents do specialized work without worrying about iterations
-- Main agent orchestrates and validates via stop hook
-- Use `--max-turns` as a safety net, not a target
-- Primary completion = tasks complete + ACs satisfied
-## v2.1 Reliability Improvements
-Auto mode includes reliability features for long-running sessions:
-| Feature | Description |
-|---------|-------------|
-| **Context Management** | Triggers compaction at ~150k tokens, saves checkpoints |
-| **Heartbeat/Watchdog** | Detects stale sessions (>5 min no activity) |
-| **Failure Classification** | Transient (retry), Fixable (AI fix), External (pause) |
-| **Checkpoints** | Task-level recovery from crashes |
-| **Command Timeouts** | 10 min test, 5 min build (configurable) |
-**Configuration** (`config.json`):
-```json
-{
-  "auto": {
-    "contextThreshold": 150000,
-    "timeouts": { "test": 600, "build": 300 }
-  }
-}
-```
-**Logs**: `.specweave/logs/auto-iterations.log`
-## v2.2 TDD Strict Mode & Stop Reason Tracking
-### TDD Strict Mode
-**Enable TDD strict mode for RED->GREEN->REFACTOR discipline:**
-```bash
-/sw:auto --tdd 0001-feature   # or --strict
-```
-**TDD configuration priority** (highest to lowest):
-1. Command flag (`--tdd`, `--strict`)
-2. Increment `metadata.json` (`"tddMode": true`)
-3. spec.md frontmatter (`tdd: true`)
-4. Global `config.json` (`testing.defaultTestMode: "TDD"`)
-**For full TDD workflow details, see:** `sw:tdd-orchestrator` skill
-### Test Framework Auto-Detection
-Auto mode discovers and runs test commands automatically:
-| Framework | Detection |
-|-----------|-----------|
-| npm/Vitest/Jest | `package.json`, config files |
-| Playwright/Cypress | Config files, `/e2e` dirs |
-| Pytest/Go/Cargo | `go.mod`, `Cargo.toml`, `pytest.ini` |
-| Xcode/Swift | `.xcodeproj`, `Package.swift` |
-### Stop Reason Tracking
-Stop reasons logged to `.specweave/logs/auto-stop-reasons.log`:
-| Category | Success | Description |
-|----------|---------|-------------|
-| `all_tasks_complete` | Yes | All tests pass, all tasks done |
-| `completion_promise` | Yes | `<!-- auto-complete:DONE -->` |
-| `max_iterations_reached` | No | Safety limit hit |
-| `test_failures_exhausted` | No | 3 retry attempts failed |
-| `human_gate_pending` | Paused | Waiting for approval |
-## UI/UX Quality Gates (NEW!)
-Auto mode now includes comprehensive UI/UX quality gates that run automatically when E2E tests are detected.
-### Accessibility Audit
-When `@axe-core/playwright` or similar accessibility testing tools are detected, auto mode:
-- Parses accessibility audit results from test output
-- **Blocks** on critical and serious violations (WCAG Level A/AA)
-- **Warns** on moderate and minor violations
-- Shows detailed violation report with fix suggestions
-**Violation Severity Handling:**
-| Severity | Action | Example |
-|----------|--------|---------|
-| Critical | **BLOCKS** completion | Missing alt text, form without labels |
-| Serious | **BLOCKS** completion | Color contrast, missing document lang |
-| Moderate | Warning only | Landmark regions |
-| Minor | Warning only | Empty headings |
-**Enable in your tests:**
-```typescript
-import { injectAxe, checkA11y } from '@axe-core/playwright';
-test('page is accessible', async ({ page }) => {
-  await page.goto('/');
-  await injectAxe(page);
-  await checkA11y(page);
-});
-```
-### Console Error Detection
-Auto mode parses E2E test output for console errors:
-- **Blocks** on uncaught exceptions
-- **Blocks** on `console.error` from application code
-- **Excludes** expected dev tool messages (React DevTools, HMR, etc.)
-**Automatic exclusions:**
-- React/Apollo DevTools prompts
-- HMR messages
-- Vite dev server messages
-- Favicon loading failures
-**Add custom exclusions in config:**
-```json
-{
-  "auto": {
-    "consoleErrors": {
-      "excludePatterns": ["Expected test error"]
-    }
-  }
-}
-```
-### UI State Coverage
-Auto mode detects and reports on UI state test coverage:
-| State | Detection | Recommendation |
-|-------|-----------|----------------|
-| Loading | Spinners, skeletons, `aria-busy` | Test loading/skeleton states |
-| Error | Error boundaries, 404/500 pages | Test error handling |
-| Empty | No data, no results | Test empty state displays |
-Shows warning if states are detected but not explicitly tested.
-## Increment Queue Transition (NEW!)
-Auto mode now handles multi-increment queues with smooth transitions.
-### Completion Summary
-When an increment completes, auto mode shows:
-```
-INCREMENT COMPLETE: 0001-user-auth
-SUMMARY:
-  Tasks: 15/15 | Duration: 45m
-  Tests: 42 passed, 0 failed
-  Status: All acceptance criteria met
-NEXT INCREMENT: 0002-notifications
-  Queue: 2 increment(s) remaining
-```
-### Skip Failed Increments
-If an increment fails after 3 retry attempts, you can skip it:
-```bash
-/sw:skip-increment
-```
-This will:
-1. Mark the increment as "skipped" (not failed, not completed)
-2. Log failure details for later review
-3. Move to the next increment in queue
-4. Continue auto mode execution
-**Use when:**
-- A blocking issue requires external resolution
-- You want to prioritize other work
-- The issue needs human investigation
-## Auto-Execute with Credentials (MANDATORY)
-**In auto mode, ALL agents MUST follow the auto-execute skill rules:**
-### The Golden Rule
-```
-FORBIDDEN: "Next Steps: Run wrangler deploy"
-FORBIDDEN: "Execute the schema in Supabase SQL Editor"
-FORBIDDEN: "Set secret via: wrangler secret put..."
-REQUIRED: Execute commands DIRECTLY using available credentials
-```
-### Credential Lookup Order
-Before ANY deployment task, check for credentials:
-1. **`.env` file** - Primary credential storage
-2. **Environment variables** - Already loaded in session
-3. **CLI tool auth** - `wrangler whoami`, `gh auth status`, etc.
-4. **Config files** - `wrangler.toml`, `.specweave/config.json`
-### If Credentials Found -> AUTO-EXECUTE
-```bash
-# Example: Supabase migration
-if grep -q "DATABASE_URL" .env; then
-  source .env
-  psql "$DATABASE_URL" -f schema.sql
-fi
-# Example: Wrangler deployment
-if wrangler whoami 2>/dev/null; then
-  wrangler deploy
-fi
-```
-### If Credentials Missing -> ASK, Don't Show Manual Steps
-```markdown
-**Credential Required for Auto-Execution**
-I need your Supabase database URL to execute the migration.
-**Please paste your DATABASE_URL:**
-[I will save to .env and continue automatically]
-```
-**After user provides credential:**
-1. Save to `.env`
-2. EXECUTE immediately
-3. Continue auto mode
----
-## Quality Gates
-**Before running `/sw:done`, verify these gates based on success criteria in `auto-mode.json`:**
-1. All tasks marked `[x]` in tasks.md
-2. All ACs marked `[x]` in spec.md
-3. If `--build` flag: build passes
-4. If `--tests` flag: tests pass
-5. If `--e2e` flag: E2E tests pass
-6. If `--lint` flag: linting passes
-7. If `--types` flag: type-checking passes
-8. If `--cov N` flag: coverage meets threshold
-The stop hook validates gates 1-2 via grep. Gates 3-8 are the model's responsibility.
----
+Stop hook gates exit when tasks/ACs remain. Model enforces quality gates (build/tests/lint) before `/sw:done`.
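Reviewer note: the gate described in the line above can be sketched as follows. This assumes, as the removed 1.0.256 text states, that remaining work is detected by looking for unchecked `[ ]` boxes in `tasks.md` and `spec.md`, and that the turn limit is a hard stop. The function names and the returned dictionary are illustrative, not the real hook's output format.

```python
def count_unchecked(text: str) -> int:
    """Count lines containing an unchecked box, similar to a line-based grep."""
    return sum(1 for line in text.splitlines() if "[ ]" in line)


def gate_decision(tasks_md: str, spec_md: str, turn: int, max_turns: int = 20) -> dict:
    """Decide whether an exit attempt is blocked (hypothetical result shape)."""
    if turn >= max_turns:
        # Safety net: stop looping once the turn limit is reached
        return {"decision": "approve", "reason": "turn limit reached"}
    pending_tasks = count_unchecked(tasks_md)
    open_acs = count_unchecked(spec_md)
    if pending_tasks or open_acs:
        return {
            "decision": "block",
            "reason": f"{pending_tasks} task(s), {open_acs} AC(s) remaining",
        }
    return {"decision": "approve", "reason": "all tasks and ACs complete"}
```

The sketch runs no builds or tests, matching the split of responsibilities in the diff: quality gates are left to the model before `/sw:done`.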
 ## Execution
-**CRITICAL: You MUST show STOP CONDITIONS to user BEFORE starting work!**
-When this command is invoked:
+### Step 1: Set Up Auto Session
-### Step 1: MANDATORY - Set Up Auto Session (DO THIS FIRST!)
+Use Read/Write/Edit/Glob tools directly (no CLI needed):
-**Set up the auto session directly using Read/Write/Edit/Glob tools. No CLI needed.**
+**1a. Read config** — `.specweave/config.json`: `auto.enabled`, `auto.maxTurns` (default 20), `testing.defaultTestMode`, `testing.tddEnforcement`
-#### Step 1a: Read Configuration
+**1b. Find increments:**
+- If IDs specified: Glob `.specweave/increments/{ID}*/metadata.json`, verify exists
+- If no IDs: find active/in-progress increments. If none, check backlog/planned. If none at all, go to Step 2 (Intelligent Creation).
-```
-Read .specweave/config.json to extract:
-  - auto.enabled (default: true) — if false, STOP
-  - auto.maxTurns (default: 20)
-  - testing.defaultTestMode — check for "TDD"
-  - testing.tddEnforcement — "strict" | "warn" | "off"
-  - auto.requireTests (default: false)
-```
-#### Step 1b: Find Increments
-**If INCREMENT_IDS specified by user:**
-- For each ID, Glob `.specweave/increments/{ID}*/metadata.json`
-- Read each metadata.json, verify it exists
-- If not found, warn user and STOP
-**If no IDs specified:**
-- Glob `.specweave/increments/*/metadata.json`
-- Read each, find ones with status "active" or "in-progress"
-- If none active, check for "backlog" or "planned"
-- If none at all → proceed to Step 2 (Intelligent Increment Creation)
+**1c. Activate increments** — Edit `metadata.json`: set `"status": "active"`, update timestamp
-#### Step 1c: Activate Increments
-For each increment that needs activation:
-- Edit its `metadata.json`: set `"status": "active"` and `"updated"` to current ISO timestamp
-#### Step 1d: Create Session Marker
-**Write `.specweave/state/auto-mode.json` using the Write tool** with this schema:
+**1d. Write session marker** — `.specweave/state/auto-mode.json`:
 ```json
 {
   "active": true,
-  "timestamp": "2026-02-11T12:00:00.000Z",
-  "incrementIds": ["0001-feature-name"],
+  "timestamp": "<ISO>",
+  "incrementIds": ["0001-feature"],
   "tddMode": false,
   "requireTests": false,
-  "userGoal": "optional user prompt text",
+  "userGoal": "optional",
   "successCriteria": [
-    { "type": "tasks_complete", "description": "All tasks marked complete in tasks.md", "required": true },
-    { "type": "acs_satisfied", "description": "All acceptance criteria satisfied in spec.md", "required": true }
+    { "type": "tasks_complete", "description": "All tasks marked complete", "required": true },
+    { "type": "acs_satisfied", "description": "All ACs satisfied", "required": true }
   ],
   "successSummary": "All tasks and acceptance criteria complete"
 }
 ```
-**Map quality flags to successCriteria entries:**
-| User Flag | Add to successCriteria |
-|-----------|----------------------|
-| `--tests` | `{ "type": "tests_pass", "description": "All tests must pass", "required": true }` |
-| `--build` | `{ "type": "build_succeeds", "description": "Build must succeed", "required": true }` |
-| `--e2e` | `{ "type": "tests_pass", "description": "E2E tests must pass", "required": true }` |
-| `--lint` | `{ "type": "custom_command", "description": "Linting must pass", "required": true, "command": "<lint-cmd>" }` |
-| `--types` | `{ "type": "custom_command", "description": "Type-checking must pass", "required": true, "command": "npx tsc --noEmit" }` |
-| `--cov N` | `{ "type": "tests_pass", "description": "Coverage must meet N%", "required": true, "threshold": N }` |
-| `--cmd "X"` | `{ "type": "custom_command", "description": "Custom: X", "required": true, "command": "X" }` |
-| `--tdd` | Set `"tddMode": true` in marker |
-**Always include `tasks_complete` and `acs_satisfied` as base criteria.**
-Ensure `.specweave/state/` directory exists (create with Bash `mkdir -p` if needed).
-**Proceed to Step 1.5**
-### Step 1.5: MANDATORY - Analyze Tests & Display Stop Conditions
-**CRITICAL: You MUST analyze the test situation and output SPECIFIC stop conditions BEFORE starting any task work!**
-#### Step 1.5a: Detect Existing Tests
-Run these commands to detect what tests exist:
-```bash
-# Check for test frameworks
-ls package.json 2>/dev/null && cat package.json | grep -E '"(jest|vitest|mocha|playwright|cypress)"' || true
-ls vitest.config.* jest.config.* playwright.config.* cypress.config.* 2>/dev/null || true
-# Count existing test files
-find . -name "*.test.ts" -o -name "*.test.tsx" -o -name "*.spec.ts" -o -name "*.test.js" 2>/dev/null | wc -l
-find . -name "*.e2e.ts" -o -name "*.e2e-spec.ts" -path "*/e2e/*" -name "*.spec.ts" 2>/dev/null | wc -l
-# Check if tests can run
-npm test --help 2>/dev/null | head -1 || true
-```
-#### Step 1.5b: Determine Test Strategy
-Based on what you find, determine:
-**IF tests exist:**
-- List the EXACT test commands that will be run
-- List the SPECIFIC test files that will validate this work
+Map flags to extra `successCriteria` entries:
+- `--tests` -> `{ "type": "tests_pass", "required": true }`
+- `--build` -> `{ "type": "build_succeeds", "required": true }`
+- `--e2e` -> `{ "type": "tests_pass", "description": "E2E tests", "required": true }`
+- `--lint` -> `{ "type": "custom_command", "command": "<lint-cmd>", "required": true }`
+- `--types` -> `{ "type": "custom_command", "command": "npx tsc --noEmit", "required": true }`
+- `--cov N` -> `{ "type": "tests_pass", "threshold": N, "required": true }`
+- `--cmd "X"` -> `{ "type": "custom_command", "command": "X", "required": true }`
+- `--tdd` -> set `"tddMode": true`
-**IF tests DON'T exist yet:**
-- You MUST plan what tests need to be created as part of the tasks
-- List the specific test files you will CREATE during auto mode
-- These tests become part of the stop criteria
+Always include `tasks_complete` and `acs_satisfied` as base criteria. Ensure `.specweave/state/` dir exists.
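Reviewer note: a minimal sketch of the flag-to-`successCriteria` mapping listed above, to show the marker shape end to end. The `flags` dictionary, the function names, and the `lint_cmd` parameter are assumptions made for illustration; the skill itself writes the marker with the Write tool rather than with code like this.

```python
import json
from datetime import datetime, timezone

BASE_CRITERIA = [
    {"type": "tasks_complete", "description": "All tasks marked complete", "required": True},
    {"type": "acs_satisfied", "description": "All ACs satisfied", "required": True},
]


def criteria_for_flags(flags: dict, lint_cmd: str = "<lint-cmd>") -> list:
    """Map hypothetical parsed options, e.g. {"tests": True, "cov": 80}, to criteria."""
    extra = []
    if flags.get("tests"):
        extra.append({"type": "tests_pass", "required": True})
    if flags.get("build"):
        extra.append({"type": "build_succeeds", "required": True})
    if flags.get("e2e"):
        extra.append({"type": "tests_pass", "description": "E2E tests", "required": True})
    if flags.get("lint"):
        # The diff leaves the lint command as a placeholder, so it stays one here
        extra.append({"type": "custom_command", "command": lint_cmd, "required": True})
    if flags.get("types"):
        extra.append({"type": "custom_command", "command": "npx tsc --noEmit", "required": True})
    if flags.get("cov") is not None:
        extra.append({"type": "tests_pass", "threshold": flags["cov"], "required": True})
    if flags.get("cmd"):
        extra.append({"type": "custom_command", "command": flags["cmd"], "required": True})
    # Base criteria are always present, copied so callers cannot mutate the constant
    return [dict(c) for c in BASE_CRITERIA] + extra


def build_marker(increment_ids, flags: dict, user_goal: str = "") -> dict:
    """Assemble the auto-mode.json payload using the field names shown in the diff."""
    return {
        "active": True,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "incrementIds": list(increment_ids),
        "tddMode": bool(flags.get("tdd")),
        "requireTests": False,
        "userGoal": user_goal,
        "successCriteria": criteria_for_flags(flags),
        "successSummary": "All tasks and acceptance criteria complete",
    }
```

Serializing the result with `json.dumps(marker, indent=2)` produces a document with the same keys as the JSON example in the hunk.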
-#### Step 1.5c: Output the Stop Conditions Banner
+### Step 1.5: MANDATORY - Display Stop Conditions
-**Output this banner with SPECIFIC test information:**
+**You MUST output a stop conditions banner BEFORE starting work.** Detect test frameworks, count test files, then show:
 ```
 AUTO MODE STARTING
-══════════════════════════════════════════════════════════════════════════════
-Increment: [INCREMENT_ID]
-Tasks: [X] pending
-══════════════════════════════════════════════════════════════════════════════
-TESTS THAT MUST PASS FOR COMPLETION:
-Unit/Integration Tests:
-  Command: [EXACT_TEST_COMMAND]
-  Files:
-    - [test-file-1.test.ts] - [what it tests]
-    - [test-file-2.test.ts] - [what it tests]
-    - [NEW] [test-file-3.test.ts] - [will be created for X]
-E2E Tests (if applicable):
-  Command: [EXACT_E2E_COMMAND]
-  Files:
-    - [auth.e2e.ts] - [login/logout flows]
-    - [NEW] [checkout.e2e.ts] - [will be created for payment flow]
-══════════════════════════════════════════════════════════════════════════════
-SESSION WILL COMPLETE WHEN:
-  - All [X] tasks marked complete
-  - [TEST_COMMAND] passes (0 failures)
-  - [E2E_COMMAND] passes (if E2E tests exist)
-  - /sw:done validation passes
-SESSION WILL PAUSE/STOP IF:
-  - Tests fail 3 times in a row → pauses for human review
-  - User runs /sw:cancel-auto
-  - Max iterations reached (safety limit)
-══════════════════════════════════════════════════════════════════════════════
-Check progress: /sw:auto-status
-Cancel: close session or /sw:cancel-auto
-══════════════════════════════════════════════════════════════════════════════
-```
-#### Step 1.5d: Fill in ALL placeholders with REAL values
-**Required placeholders:**
-- `[INCREMENT_ID]`: Actual increment ID (e.g., `0001-user-auth`)
-- `[X]`: Number of pending tasks from tasks.md
-- `[EXACT_TEST_COMMAND]`: Real command like `npm test` or `npx vitest run`
-- `[EXACT_E2E_COMMAND]`: Real command like `npx playwright test`
-- `[test-file-*.ts]`: Real test file names with brief description
-- `[NEW]`: Mark any test files that will be CREATED during auto mode
-**Examples of GOOD vs BAD:**
-BAD (vague):
-```
-Tests: All tests passing (unit + E2E if present)
+======================================================================
+Increment: [ID] | Tasks: [N] pending
+======================================================================
+TESTS THAT MUST PASS:
+  Unit: [command] - [N] test files ([list key ones])
+  E2E: [command] - [N] test files (if applicable)
+  [NEW] files to be created during auto mode
+======================================================================
+COMPLETES WHEN: All tasks done + tests pass + /sw:done passes
+STOPS IF: 3 consecutive test failures | /sw:cancel-auto | max turns
+======================================================================
 ```
-GOOD (specific):
-```
-Unit Tests:
-  Command: npm test
-  Files:
-    - src/auth/auth.service.test.ts - JWT token generation
-    - src/auth/login.test.ts - login validation
-    - [NEW] src/auth/logout.test.ts - will create for logout flow
+Fill ALL placeholders with real values. Be specific about test files and commands.
-E2E Tests:
-  Command: npx playwright test
-  Files:
-    - tests/auth.e2e.ts - full login/logout user journey
-```
+### Step 1.6: TDD Enforcement (if TDD mode enabled)
-**DO NOT SKIP THIS STEP!** Users MUST see the EXACT tests that will determine success.
+Check TDD priority: `--tdd` flag > increment `metadata.json` > `config.json`
-### Step 1.6: TDD Enforcement Check (when TDD mode enabled)
+If TDD enabled, validate tasks.md has `[RED]`/`[GREEN]`/`[REFACTOR]` markers. If no markers found:
+- `strict`: BLOCK — cannot proceed, fix tasks first
+- `warn`: show warning, continue without enforcement
+- `off`: skip silently
-**If TDD mode is detected (`--tdd` flag, config, or increment metadata), enforce TDD discipline!**
+Enforcement rules: `[RED]` tasks complete freely. `[GREEN]` requires its `[RED]` done first. `[REFACTOR]` requires its `[GREEN]` done first.
-#### Step 1.6a: TDD Marker Validation (CRITICAL!)
+### Step 2: Intelligent Increment Creation (when none found)
-**BEFORE checking TDD order, validate that tasks.md HAS TDD markers:**
+Analyze context and decide:
+- **Match existing**: find planned/backlog increment matching user intent, activate it
+- **Extend existing**: add tasks to active incomplete increment
+- **Create new**: invoke `/sw:increment "description"` then set up session
+- **Multiple**: activate all matching, include in session marker
+- **Ambiguous**: ask user to choose
-```bash
-INCREMENT_PATH=".specweave/increments/<id>"
-TASKS_FILE="$INCREMENT_PATH/tasks.md"
+Then return to Step 1c-1d to set up the session, then Step 1.5 for the banner.
-# Count TDD markers
-RED_COUNT=$(grep -c '\[RED\]' "$TASKS_FILE" 2>/dev/null || echo "0")
-GREEN_COUNT=$(grep -c '\[GREEN\]' "$TASKS_FILE" 2>/dev/null || echo "0")
-REFACTOR_COUNT=$(grep -c '\[REFACTOR\]' "$TASKS_FILE" 2>/dev/null || echo "0")
-TOTAL_MARKERS=$((RED_COUNT + GREEN_COUNT + REFACTOR_COUNT))
-echo "TDD Marker Check: RED=$RED_COUNT, GREEN=$GREEN_COUNT, REFACTOR=$REFACTOR_COUNT"
-```
+### Step 3: Execute Tasks
-**If TDD_MODE == true BUT TOTAL_MARKERS == 0:**
+1. Run `/sw:do` in a loop (stop hook handles continuation)
+2. Mark tasks complete in tasks.md, update spec.md ACs
+3. Run tests after each task
+4. Before `/sw:done`: verify all quality gates from `successCriteria`
+5. On completion output `<!-- auto-complete:DONE -->`
-```
-TDD MODE ENABLED BUT NO TDD MARKERS IN TASKS
-Your configuration has TDD mode enabled:
-  - --tdd flag: [yes/no]
-  - config.json: testing.defaultTestMode = "TDD"
-  - Enforcement level: [strict/warn/off]
-But tasks.md contains NO [RED], [GREEN], [REFACTOR] markers:
-  - [RED] markers found: 0
-  - [GREEN] markers found: 0
-  - [REFACTOR] markers found: 0
-TDD ORDER ENFORCEMENT WILL BE BYPASSED!
-The enforcement checks for task markers to validate RED->GREEN->REFACTOR order.
-Without markers, tasks can be completed in ANY order - defeating TDD discipline.
-CAUSE: Tasks were likely created:
-  - Manually (without using /sw:increment)
-  - Before TDD mode was enabled in config
-  - By copying from a non-TDD template
-FIX OPTIONS:
-1. (Recommended) Regenerate tasks with TDD structure:
-   /sw:increment "your-feature"
-   This will create proper RED->GREEN->REFACTOR triplets
-2. Add markers manually to existing tasks:
-   ### T-001: [RED] Write failing test for feature
-   ### T-002: [GREEN] Implement feature to pass test
-   ### T-003: [REFACTOR] Clean up feature code
-3. Disable TDD mode if not needed:
-   Set testing.defaultTestMode: "test-after" in config.json
-4. Continue without TDD enforcement (not recommended):
-   Set testing.tddEnforcement: "off" in config.json
-```
-**Behavior based on tddEnforcement:**
-| Enforcement | TDD Enabled + No Markers | Action |
-|-------------|--------------------------|--------|
-| `strict` | **BLOCKS** | Cannot proceed - fix tasks.md first |
-| `warn` | **WARNS** | Shows warning, continues without enforcement |
-| `off` | **Silent** | Skips all TDD checks |
-**Example (strict mode, no markers):**
-```
-TDD MARKER VALIDATION FAILED (strict mode)
-Cannot start auto mode with TDD enabled but no task markers.
-Run /sw:increment to regenerate tasks with TDD structure.
-Or set tddEnforcement: "warn" to continue anyway.
-```
+## Credential Auto-Execution
-#### Step 1.6b: TDD Mode Detection
+In auto mode, execute deployment commands directly using available credentials. Check `.env`, env vars, CLI auth (`wrangler whoami`, `gh auth status`). If credentials missing, ask user — never output manual steps.
-```bash
-# Check TDD mode from multiple sources (priority order)
-TDD_MODE=false
-TDD_ENFORCEMENT="warn"  # Default
-# 1. Check command-line flag (highest priority)
-if [[ "$*" == *"--tdd"* ]] || [[ "$*" == *"--strict"* ]]; then
-  TDD_MODE=true
-  TDD_ENFORCEMENT="strict"
-fi
-# 2. Check increment metadata
-if [[ -f "$INCREMENT_PATH/metadata.json" ]]; then
-  INC_TDD=$(jq -r '.tddMode // .testMode' "$INCREMENT_PATH/metadata.json" 2>/dev/null)
-  [[ "$INC_TDD" == "true" || "$INC_TDD" == "TDD" || "$INC_TDD" == "tdd" ]] && TDD_MODE=true
-fi
-# 3. Check global config
-if [[ -f ".specweave/config.json" ]]; then
-  GLOBAL_TDD=$(jq -r '.testing.defaultTestMode' ".specweave/config.json" 2>/dev/null)
-  [[ "$GLOBAL_TDD" == "TDD" || "$GLOBAL_TDD" == "tdd" ]] && TDD_MODE=true
-  # Check enforcement level
-  TDD_ENFORCEMENT=$(jq -r '.testing.tddEnforcement // "warn"' ".specweave/config.json" 2>/dev/null)
-fi
-```
-**If TDD_MODE == true, add TDD enforcement to stop conditions banner:**
-```
-TDD MODE ACTIVE - RED->GREEN->REFACTOR ENFORCEMENT
-TDD Task Order Enforcement:
-  Enforcement Level: [strict|warn|off]
-  [RED] tasks can be completed freely
-  [GREEN] tasks REQUIRE their [RED] counterpart completed first
-  [REFACTOR] tasks REQUIRE their [GREEN] counterpart completed first
-  strict mode: BLOCKS completion of out-of-order tasks
-  warn mode: Shows warning but allows (not recommended)
-```
-**TDD Enforcement during task execution:**
-```typescript
-// Before marking ANY [GREEN] or [REFACTOR] task complete:
-function enforceTDDOrder(task: Task, allTasks: Task[], enforcement: string): void {
-  const phase = extractPhase(task.title); // [RED], [GREEN], [REFACTOR]
-  if (!phase || phase === 'RED') return; // RED can always proceed
-  const tripletBase = Math.floor((task.number - 1) / 3) * 3 + 1;
-  if (phase === 'GREEN') {
-    const redTask = allTasks.find(t => t.number === tripletBase && t.title.includes('[RED]'));
-    if (redTask && redTask.status !== 'completed') {
-      const msg = `TDD VIOLATION: Cannot complete ${task.id} [GREEN] before ${redTask.id} [RED]`;
-      if (enforcement === 'strict') {
-        throw new Error(msg); // BLOCKS completion
-      } else {
-        console.warn(`${msg}`); // Warns but allows
-      }
-    }
-  }
-  if (phase === 'REFACTOR') {
-    const greenTask = allTasks.find(t => t.number === tripletBase + 1 && t.title.includes('[GREEN]'));
-    if (greenTask && greenTask.status !== 'completed') {
-      const msg = `TDD VIOLATION: Cannot complete ${task.id} [REFACTOR] before ${greenTask.id} [GREEN]`;
-      if (enforcement === 'strict') {
-        throw new Error(msg); // BLOCKS completion
-      } else {
-        console.warn(`${msg}`); // Warns but allows
-      }
-    }
-  }
-}
-```
-**Best practice for TDD in auto mode:**
-1. Enable `--tdd` flag for strict enforcement: `/sw:auto --tdd 0001`
-2. Or set globally: `testing.defaultTestMode: "TDD"` + `testing.tddEnforcement: "strict"`
-3. Tasks should be grouped in triplets: T-001 [RED], T-002 [GREEN], T-003 [REFACTOR]
-### Step 2: INTELLIGENT INCREMENT CREATION (when no increments found)
-**When no active or matching increments were found in Step 1b:**
-1. **Analyze context** (ULTRATHINK):
-   - Read recent conversation history
-   - Check user prompt for feature descriptions
-   - Scan `.specweave/increments/` for planned/backlog items
-   - Look for patterns: "build X", "implement Y", "add Z feature"
-3. **Make intelligent decision:**
-   **A. Match existing increment:**
-   ```
-   # User said: "work on the login feature"
-   # Found: .specweave/increments/0002-user-login-system (status: planned)
-   # Action: Activate it via Edit metadata.json, then set up session (Step 1c-1d)
-   ```
-   **B. Extend existing increment:**
-   ```
-   # User said: "add password reset to auth"
-   # Found: .specweave/increments/0001-authentication (status: active, incomplete)
-   # Action: Add tasks to existing tasks.md, include in session marker
-   ```
-   **C. Create new increment(s):**
-   ```
-   # User said: "build a payment integration with Stripe"
-   # No matching increments found
-   # Action: Create new increment via /sw:increment, then set up session (Step 1c-1d)
-   /sw:increment "Payment integration with Stripe"
-   ```
-   **D. Multiple increments:**
-   ```
-   # User said: "finish all pending features"
-   # Found: multiple backlog/planned increments
-   # Action: Activate all via Edit metadata.json, include all in session marker
-   ```
-   **E. Ask user (if ambiguous):**
-   ```markdown
-   I found several potential matches for your request:
-   1. **0002-user-authentication** (planned) - Add auth system
-   2. **0005-oauth-integration** (backlog) - Third-party auth
-   Which would you like to work on?
-   - Both (in sequence)
-   - Just authentication
-   - Just OAuth
-   - Something else (please describe)
-   ```
-4. **Set up session** — Go back to Step 1c-1d to activate and create the marker.
-5. **Proceed to Step 1.5** - Display the stop conditions banner!
-### Step 3: Start Task Execution
-**After displaying the stop conditions banner (Step 1.5), begin work:**
-1. **Execute /sw:do in a loop** (stop hook handles continuation):
-   - Work on tasks
-   - Mark complete in tasks.md
-   - Update spec.md ACs
-   - Sync to external tools
-   - Run tests after each task
-2. **On completion**:
-   ```
-   Auto Session Complete!
-   <!-- auto-complete:DONE -->
+## Session Management
-   Duration: 2h 34m
-   Iterations: 47
-   Tasks Completed: 42/42
-   Tests Passed: 156/156
-   Coverage: 87%
-   ```
+- **Status**: `/sw:auto-status`
+- **Cancel**: `/sw:cancel-auto`
+- **Resume after crash**: `/sw:do` or `claude --continue`
+- **Multi-agent**: use `/sw:team-lead` instead
-## Multi-Agent Work
+## Safety
-For coordinated multi-agent execution across different workstreams (frontend, backend, database, etc.), use `/sw:team-lead` instead. It provides structured orchestration of specialized agents working on the same increment.
+| Mechanism | Default |
+|-----------|---------|
+| Turn limit | 20 |
+| Staleness cleanup | 2h |
+| Human gates | `deploy`, `migrate`, `publish` patterns |
 ## Related Commands
@@ -1117,7 +172,6 @@ For coordinated multi-agent execution across different workstreams (frontend, ba
 |---------|---------|
 | `/sw:auto-status` | Check session status |
 | `/sw:cancel-auto` | Cancel session |
-| `/sw:skip-increment` | Skip failed increment and continue queue |
-| `/sw:do` | Execute tasks (also works standalone) |
-| `/sw:progress` | Show increment progress |
+| `/sw:do` | Execute tasks (standalone) |
+| `/sw:progress` | Show progress |
 | `/sw:team-lead` | Multi-agent orchestration |
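
Note on the Step 1.6 change above: the new SKILL.md condenses the TDD enforcement rules into one sentence (`[RED]` tasks complete freely, `[GREEN]` requires its `[RED]`, `[REFACTOR]` requires its `[GREEN]`). The following is a minimal sketch of that rule, not SpecWeave's implementation. The `Task` shape and the names `phaseOf`, `findPrerequisite`, and `checkTddOrder` are hypothetical. It also assumes tasks are listed in RED, GREEN, REFACTOR order, so it pairs a task with the nearest preceding task of the required phase rather than using the triplet arithmetic from the removed code.

```typescript
// Hypothetical types: the real SpecWeave task model is not shown in this diff.
type Phase = "RED" | "GREEN" | "REFACTOR";
type Enforcement = "strict" | "warn" | "off";

interface Task {
  id: string;         // e.g. "T-002"
  title: string;      // e.g. "[GREEN] Implement login"
  completed: boolean;
}

// Extract the TDD phase marker from a task title, or null if there is none.
function phaseOf(title: string): Phase | null {
  const m = title.match(/\[(RED|GREEN|REFACTOR)\]/);
  return m ? (m[1] as Phase) : null;
}

// Assumption: the prerequisite is the nearest preceding task that carries
// the required phase ([RED] for a [GREEN] task, [GREEN] for a [REFACTOR] task).
function findPrerequisite(tasks: Task[], index: number): Task | null {
  const phase = phaseOf(tasks[index].title);
  if (phase === null || phase === "RED") return null; // RED and unmarked tasks are free
  const needed: Phase = phase === "GREEN" ? "RED" : "GREEN";
  for (let i = index - 1; i >= 0; i--) {
    if (phaseOf(tasks[i].title) === needed) return tasks[i];
  }
  return null;
}

// Returns null when the task may be completed, or a warning message in
// "warn" mode. Throws in "strict" mode. "off" skips the check entirely.
function checkTddOrder(
  tasks: Task[],
  index: number,
  enforcement: Enforcement
): string | null {
  if (enforcement === "off") return null;
  const prereq = findPrerequisite(tasks, index);
  if (prereq === null || prereq.completed) return null;
  const msg = `TDD violation: ${tasks[index].id} requires ${prereq.id} to be completed first`;
  if (enforcement === "strict") throw new Error(msg);
  return msg;
}
```

Under these assumptions, calling `checkTddOrder` on a `[GREEN]` task whose `[RED]` task is incomplete throws in `strict` mode, returns a warning string in `warn` mode, and returns `null` in `off` mode.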