npm - specweave - Versions diffs - 1.0.354 → 1.0.355 - Mend

specweave 1.0.354 → 1.0.355

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (9)

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "specweave",
-  "version": "1.0.354",
+  "version": "1.0.355",
   "description": "Spec-driven development framework for AI coding agents. Works with Claude Code, Codex, Antigravity, Cursor, Copilot & more. 100+ skills, 49 CLI commands, verified skill certification, autonomous execution, and living documentation.",
   "type": "module",
   "main": "dist/index.js",

package/plugins/specweave/skills/team-lead/SKILL.md CHANGED

@@ -316,6 +316,24 @@ Plan review MUST NOT block other agents. Review plans as they arrive — agents
 For very large features, the team lead MAY split work into multiple increments per domain for better tracking and independent closure. Decide this during initial analysis (Step 1), before spawning agents.
+### Task Cap Per Agent (CRITICAL — Context Overflow Prevention)
+**Maximum 15 tasks per agent.** Agents with more tasks accumulate too much context in auto-mode, leading to extended thinking loops and stuck agents.
+When distributing tasks from the master spec:
+1. Count tasks per domain
+2. If a domain has >15 tasks: **split into 2 agents** (e.g., `jira-agent-a`, `jira-agent-b`) with non-overlapping task ranges
+3. If splitting isn't natural, group tasks into phases and create 2 increments per domain
+```
+Domain tasks analysis:
+  Frontend: 12 tasks -> 1 agent (OK)
+  Backend:  8 tasks  -> 1 agent (OK)
+  JIRA:     23 tasks -> SPLIT into 2 agents (tasks 1-12, tasks 13-23)
+```
+**Why**: Each auto-mode iteration adds context (spec reads, edits, test outputs). At 20+ tasks, accumulated context causes the model to enter extended thinking (30+ min) and effectively hang. The 15-task cap keeps agents within a safe context budget.
 ---
 ## 4. Agent Spawn Prompt Templates
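The task-cap rule added in the hunk above can be sketched as a small splitting function. This is a minimal sketch, not code from the specweave package: `splitDomain`, `AgentAssignment`, and the `-agent-a`/`-agent-b` naming scheme are hypothetical, and splitting into more than two agents is an assumption (the skill text only describes a two-way split).

```typescript
// Hypothetical helper; none of these names come from the specweave package.
const MAX_TASKS_PER_AGENT = 15; // cap stated in SKILL.md

interface AgentAssignment {
  agent: string;     // e.g. "jira-agent-a"
  firstTask: number; // 1-based, inclusive
  lastTask: number;  // 1-based, inclusive
}

function splitDomain(
  domain: string,
  taskCount: number,
  cap: number = MAX_TASKS_PER_AGENT
): AgentAssignment[] {
  if (taskCount <= 0) return [];
  const agentCount = Math.ceil(taskCount / cap);
  if (agentCount === 1) {
    return [{ agent: `${domain}-agent`, firstTask: 1, lastTask: taskCount }];
  }
  // Spread tasks as evenly as possible; earlier agents absorb the remainder.
  const base = Math.floor(taskCount / agentCount);
  const extra = taskCount % agentCount;
  const assignments: AgentAssignment[] = [];
  let next = 1;
  for (let i = 0; i < agentCount; i++) {
    const size = base + (i < extra ? 1 : 0);
    const suffix = String.fromCharCode("a".charCodeAt(0) + i);
    assignments.push({
      agent: `${domain}-agent-${suffix}`,
      firstTask: next,
      lastTask: next + size - 1,
    });
    next += size;
  }
  return assignments;
}
```

With these assumptions, `splitDomain("jira", 23)` yields `jira-agent-a` with tasks 1-12 and `jira-agent-b` with tasks 13-23, the same non-overlapping ranges as the JIRA line in the skill's example.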
@@ -494,36 +512,38 @@ Task({
 ## 8. Quality Gates
-Every agent MUST run quality validation before signaling completion.
+Quality gates are split: agents handle tests, team-lead handles closure (grill, done, judge-llm). This prevents context overflow in agents from loading 4+ additional skill definitions during closure.
-### Per-Agent Quality Gate
+### Per-Agent Quality Gate (Lightweight)
 ```
 Agent Workflow:
-  1. Execute all assigned tasks (prefer /sw:auto for autonomous execution)
+  1. Execute all assigned tasks via /sw:auto --simple
   2. Run all tests for owned code (unit + integration + E2E)
   3. Run linter/type-check for owned code
-  4. Run /sw:grill
-  5. If tests fail -> fix issues and repeat from step 2. Do NOT signal completion until all tests pass.
-  6. If /sw:grill passes -> attempt closure via /sw:done
-  7. If /sw:grill fails -> fix issues, repeat from step 2
-  8. Signal COMPLETION via SendMessage
+  4. If tests fail -> fix issues and repeat from step 2
+  5. Do NOT signal completion until all tests pass
+  6. Signal COMPLETION via SendMessage (include task count, test results summary)
+  7. Do NOT run /sw:grill or /sw:done — team-lead handles closure centrally
 ```
-### Orchestrator Quality Gate
+**Why agents don't run /sw:done**: The /sw:done skill invokes 4 sub-skills (grill, judge-llm, sync-docs, qa), each loading a full SKILL.md. After 15+ tasks of auto-mode context, this pushes agents into extended thinking (30+ min hangs). Centralizing closure on the team-lead (which has a cleaner context) avoids this.
-After all agents complete, the orchestrator (team lead) runs a final validation:
+### Orchestrator Quality Gate (Centralized Closure)
+After all agents complete, the team-lead runs closure **centrally** for each increment:
 ```
 Orchestrator Final Check:
   1. All agents signaled COMPLETION
   2. No unresolved BLOCKING_ISSUE messages
   3. Run full test suite (all domains combined)
-  4. Run /sw:grill on the combined increment
-  5. Run /sw:done --auto <id> for each increment in dependency order
-  6. If any /sw:done --auto fails, report the failure and continue with remaining increments
-  7. If all pass -> /sw:team-merge
-  8. If failures -> identify owning agent, send fix request via SendMessage
+  4. For EACH increment in dependency order:
+     a. Run /sw:grill on the increment
+     b. Run /sw:done --auto <id>
+     c. If /sw:done fails, report the failure and continue with remaining increments
+  5. If all pass -> /sw:team-merge
+  6. If failures -> identify owning agent, send fix request via SendMessage
 ```
 ### Grill Checklist per Domain
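Step 4 of the new Orchestrator Final Check (close each increment in dependency order, continue past failures) can be illustrated as follows. This is a sketch under stated assumptions: `/sw:grill` and `/sw:done` are skills the team-lead invokes, not functions, so the `StepRunner` callback, the `Increment` shape, and the `/sw:grill <id>` argument form are all hypothetical. Skipping `/sw:done` when the grill fails is also an assumption; the skill text does not say what happens in that case.

```typescript
// Hypothetical types; the real closure steps are skill invocations.
interface Increment {
  id: string;
  dependsOn: string[]; // ids of increments that must close first
}

type StepRunner = (command: string) => boolean; // true = step passed

// Repeatedly emit increments whose dependencies are no longer pending.
// Dependencies that are not in the list are treated as already satisfied.
function dependencyOrder(increments: Increment[]): string[] {
  const order: string[] = [];
  let pending = increments.slice();
  while (pending.length > 0) {
    const pendingIds = pending.map((inc) => inc.id);
    const ready = pending.filter((inc) =>
      inc.dependsOn.every((dep) => pendingIds.indexOf(dep) === -1)
    );
    if (ready.length === 0) {
      throw new Error("dependency cycle between increments");
    }
    for (const inc of ready) order.push(inc.id);
    pending = pending.filter((inc) => ready.indexOf(inc) === -1);
  }
  return order;
}

function closeIncrements(
  increments: Increment[],
  run: StepRunner
): { closed: string[]; failed: string[] } {
  const closed: string[] = [];
  const failed: string[] = [];
  for (const id of dependencyOrder(increments)) {
    // Steps 4a and 4b; assumption: a failed grill skips /sw:done.
    const ok = run(`/sw:grill ${id}`) && run(`/sw:done --auto ${id}`);
    if (ok) closed.push(id);
    else failed.push(id);
    // Step 4c: keep going with the remaining increments either way.
  }
  return { closed, failed };
}
```

An empty `failed` list corresponds to step 5 (proceed to `/sw:team-merge`); a non-empty one corresponds to step 6 (send a fix request to the owning agent).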
@@ -539,6 +559,36 @@ Orchestrator Final Check:
 ---
+## 8b. Agent Timeout and Stuck Detection
+Agents can get stuck in extended thinking if their context overflows. The team-lead MUST monitor for stuck agents.
+### Timeout Rules
+| Condition | Action |
+|-----------|--------|
+| Agent idle >20 min after last message | Send `STATUS_CHECK` message to agent |
+| No response to STATUS_CHECK within 5 min | Declare agent stuck |
+| Agent stuck | Log warning, proceed with other agents, handle stuck agent's increment manually in team-merge |
+| All agents stuck | STOP team, report to user |
+### Stuck Agent Recovery
+When an agent is declared stuck:
+1. Do NOT wait for it — proceed with closure of other agents' increments
+2. Note the stuck agent's increment ID and last known task progress
+3. During /sw:team-merge, the stuck agent's increment is left open for manual completion
+4. Send shutdown_request to the stuck agent to free resources
+### Preventing Stuck Agents
+- Enforce the 15-task cap (Section 3b)
+- Agents use `--simple` flag in auto-mode (reduces context per iteration)
+- Agents do NOT run /sw:done (team-lead handles closure centrally)
+- If an agent's task count exceeds 15 despite the cap, the team-lead should split it before spawning
+---
 ## 9. Workflow Summary
 ```
@@ -556,9 +606,10 @@ Orchestrator Final Check:
   │     │     └── Wait for CONTRACT_READY after approval
   │     └── Phase 2: Spawn backend + frontend + testing
   │           └── Receive PLAN_READY, review & approve via SendMessage
-  ├── Step 5: Monitor progress via SendMessage
-  ├── Step 6: Quality gates (each agent runs /sw:grill)
-  └── Step 7: Merge and close (/sw:team-merge)
+  ├── Step 5: Monitor progress via SendMessage (timeout: 20min idle → STATUS_CHECK)
+  ├── Step 6: Agents signal COMPLETION (tests pass, no /sw:grill or /sw:done on agents)
+  ├── Step 7: Team-lead runs /sw:grill + /sw:done --auto per increment (centralized closure)
+  └── Step 8: Merge and close (/sw:team-merge)
 ```
 **IMPORTANT**: The intended entry point is: `/sw:increment` → `/sw:do` (detects 3+ domains) → `/sw:team-lead`.
@@ -596,6 +647,8 @@ To execute, run without --dry-run.
 | **Agent stuck on trust folder** | Agent spawned without `bypassPermissions` | ALWAYS use `mode: "bypassPermissions"` — NEVER `mode: "plan"`. Trust prompts require interactive input agents cannot provide |
 | **Agents editing same files** | Overlapping file ownership patterns | Review ownership map; reassign conflicting files to a single owner; use `--dry-run` to validate before launch |
 | **Token cost too high** | Too many agents or overly large prompts | Reduce `--max-agents`; use `--domains` to limit scope; split feature into smaller increments |
+| **Agent stuck in extended thinking** | Too many tasks (>15) causing context overflow | Enforce 15-task cap per agent; split large domains into 2 agents; agents use `--simple` mode |
+| **Agent hung on /sw:done** | Closure loads 4+ skill definitions into already-full context | Agents should NOT run /sw:done — team-lead handles closure centrally |
 | **Contract agent takes too long** | Large schema or complex type system | Set a timeout in the agent prompt; if stuck >15 min, check agent output and consider splitting the contract work |
 | **Phase 2 starts before Phase 1 finishes** | CONTRACT_READY not received yet | Ensure upstream agents send CONTRACT_READY via SendMessage before team-lead spawns downstream |
 | **Agent fails mid-task** | Build error, test failure, or dependency issue | Send message to agent to fix; restart the agent with `/sw:auto` on its increment |
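
The Timeout Rules table added in section 8b of SKILL.md describes a small state machine per agent. The sketch below encodes only the thresholds from that table (20 minutes idle, 5 minutes to answer a `STATUS_CHECK`). Everything else is hypothetical: the `AgentActivity` record, the minute-based clock, and the assumption that any message received after the `STATUS_CHECK` counts as a response.

```typescript
// Thresholds from the Timeout Rules table; all names are hypothetical.
const IDLE_LIMIT_MIN = 20;
const STATUS_CHECK_GRACE_MIN = 5;

interface AgentActivity {
  name: string;
  lastMessageAt: number;      // minutes on any monotonic clock
  statusCheckSentAt?: number; // set once STATUS_CHECK has gone out
}

type AgentAction = "none" | "send_status_check" | "declare_stuck";

function nextAction(agent: AgentActivity, now: number): AgentAction {
  const sentAt = agent.statusCheckSentAt;
  // Still waiting on a reply to an earlier STATUS_CHECK.
  if (sentAt !== undefined && agent.lastMessageAt <= sentAt) {
    return now - sentAt > STATUS_CHECK_GRACE_MIN ? "declare_stuck" : "none";
  }
  // No check outstanding (or the agent answered it): apply the idle rule.
  return now - agent.lastMessageAt > IDLE_LIMIT_MIN
    ? "send_status_check"
    : "none";
}

// "All agents stuck -> STOP team, report to user".
function teamShouldStop(actions: AgentAction[]): boolean {
  return actions.length > 0 && actions.every((a) => a === "declare_stuck");
}
```

A team-lead loop would call `nextAction` for each agent on every monitoring pass and, for agents declared stuck, follow the Stuck Agent Recovery steps rather than waiting.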

package/plugins/specweave/skills/team-lead/agents/backend.md CHANGED

@@ -42,13 +42,12 @@ WORKFLOW:
        content: "PLAN_READY: [increment path]. [summary of planned tasks and files].",
        summary: "Backend plan ready for review" })
   9. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
-  10. Execute tasks autonomously: prefer /sw:auto for autonomous execution
+  10. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
   11. Generate or update OpenAPI spec if API routes change
   12. Run all tests for owned code (unit + integration): npm test
-  13. Run quality gate: /sw:grill
-  14. Do NOT signal completion until all tests pass
-  15. After auto completes, attempt closure via /sw:done
-  16. Signal completion via SendMessage to team-lead
+  13. Do NOT signal completion until all tests pass
+  14. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
+  15. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
 RULES:
   - WRITE only to files you own (listed above)

package/plugins/specweave/skills/team-lead/agents/database.md CHANGED

@@ -33,13 +33,12 @@ WORKFLOW:
   8. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
   9. Generate Prisma migration: npx prisma migrate dev --name <migration-name>
   10. Write seed data if needed
-  11. Execute tasks autonomously: prefer /sw:auto for autonomous execution
+  11. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
   12. Run all tests for owned code (migration, seed): npm test
-  13. Run quality gate: /sw:grill
-  14. Do NOT signal completion until all tests pass
-  15. Signal CONTRACT_READY with schema details via SendMessage to team-lead
-  16. After auto completes, attempt closure via /sw:done
-  17. Signal completion via SendMessage to team-lead
+  13. Do NOT signal completion until all tests pass
+  14. Signal CONTRACT_READY with schema details via SendMessage to team-lead
+  15. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
+  16. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
 RULES:
   - WRITE only to files you own (listed above)

package/plugins/specweave/skills/team-lead/agents/frontend.md CHANGED

@@ -44,12 +44,11 @@ WORKFLOW:
        content: "PLAN_READY: [increment path]. [summary of planned tasks and files].",
        summary: "Frontend plan ready for review" })
   9. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
-  10. Execute tasks autonomously: prefer /sw:auto for autonomous execution
+  10. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
   11. Run all tests for owned code (unit + integration): npm test
-  12. Run quality gate: /sw:grill
-  13. Do NOT signal completion until all tests pass
-  14. After auto completes, attempt closure via /sw:done
-  15. Signal completion via SendMessage to team-lead
+  12. Do NOT signal completion until all tests pass
+  13. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
+  14. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
 RULES:
   - WRITE only to files you own (listed above)

package/plugins/specweave/skills/team-lead/agents/security.md CHANGED

@@ -34,13 +34,12 @@ WORKFLOW:
   8. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
   9. Implement auth/authz middleware if needed
   10. Add input validation and sanitization
-  11. Execute tasks autonomously: prefer /sw:auto for autonomous execution
+  11. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
   12. Run all tests for owned code (security tests): npm test
   13. Run security audit tools (npm audit, dependency check)
-  14. Run quality gate: /sw:grill
-  15. Do NOT signal completion until all tests pass
-  16. After auto completes, attempt closure via /sw:done
-  17. Signal completion with security findings summary via SendMessage to team-lead
+  14. Do NOT signal completion until all tests pass
+  15. Signal COMPLETION via SendMessage to team-lead with summary of tasks done, test results, and security findings
+  16. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
 RULES:
   - WRITE only to files you own (listed above)

package/plugins/specweave/skills/team-lead/agents/testing.md CHANGED

@@ -40,12 +40,11 @@ WORKFLOW:
   9. Write unit tests for new services/components
   10. Write integration tests for API endpoints
   11. Write E2E tests for user journeys
-  12. Execute tasks autonomously: prefer /sw:auto for autonomous execution
+  12. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
   13. Run all tests (unit + integration + E2E): npm test && npx playwright test
   14. Do NOT signal completion until all tests pass -- if tests fail, fix and repeat
-  15. Run quality gate: /sw:grill
-  16. After auto completes, attempt closure via /sw:done
-  17. Signal completion via SendMessage to team-lead
+  15. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
+  16. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
 RULES:
   - WRITE only to test files (listed above)

package/src/templates/AGENTS.md.template CHANGED

@@ -38,13 +38,17 @@
 <!-- SECTION:orchestration required -->
 ## Workflow Orchestration
-### 1. Plan Before Code
+### 1. Plan Before Code (MANDATORY)
-BEFORE implementing ANY non-trivial task (3+ steps):
+BEFORE implementing ANY task — create an increment FIRST:
 1. Create increment: spec.md (WHAT/WHY) + plan.md (HOW) + tasks.md (checklist)
 2. Get user approval before implementing
 3. If something goes sideways → STOP and re-plan
+**No exceptions for "simple" tasks** — "simple", "quick", "basic" still require an increment. The only exception: user explicitly says "don't create an increment."
+**Setup/config actions are NOT implementation** — "connect github", "setup sync", "import issues" → use the matching setup command directly, not the increment workflow.
 See **Task Format** and **User Story Format** sections for templates.
 ### 2. Verify Before Done
@@ -84,6 +88,7 @@ Good: npm run build → node script.js → Success
 - Avoid over-engineering and premature optimization
 - One function = one responsibility
 - If you can delete code and tests still pass, delete it
+- **Match tooling to complexity** — simple tasks (calculator, todo) need 0 domain plugins and vanilla code. Don't load heavyweight frameworks, design systems, or i18n for trivial features
 ### No Laziness
 - Don't leave TODO comments for "later"
@@ -139,6 +144,8 @@ Good: npm run build → node script.js → Success
 | `/sw-github:sync 0001` | Sync increment to GitHub issue |
 | `/sw-jira:sync 0001` | Sync to Jira |
 | `/sw-ado:sync 0001` | Sync to Azure DevOps |
+| `/sw:sync-setup` | Connect GitHub/Jira/ADO integration |
+| `/sw:import` | Import issues from external tools |
 <!-- /SECTION -->
 ---
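
The routing rules this template change introduces (setup requests go straight to a setup command, everything else needs an increment unless the user explicitly opts out) can be illustrated with a toy router. This is purely a sketch: the substring matching is a naive assumption and not how SpecWeave detects intent, `routeRequest` and the `"no-increment"` label are hypothetical, and mapping "setup sync" to `/sw:sync-setup` is inferred from the command table rather than stated in the template.

```typescript
// Illustrative only; phrase matching here is a naive assumption.
type Route = "/sw:sync-setup" | "/sw:import" | "/sw:increment" | "no-increment";

function routeRequest(request: string): Route {
  const text = request.toLowerCase();
  // Setup/config actions are not implementation: use the setup command.
  if (text.indexOf("import issues") !== -1) return "/sw:import";
  if (
    text.indexOf("connect github") !== -1 ||
    text.indexOf("setup sync") !== -1
  ) {
    return "/sw:sync-setup";
  }
  // The only exception to "increment first" is an explicit opt-out.
  if (text.indexOf("don't create an increment") !== -1) return "no-increment";
  // "simple", "quick", "basic" are not opt-outs, so they land here too.
  return "/sw:increment";
}
```

Under these assumptions, a request such as "Build a simple calculator" still routes to `/sw:increment`, while "connect github" routes to `/sw:sync-setup`.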

package/src/templates/CLAUDE.md.template CHANGED

@@ -9,6 +9,10 @@
 |---|---|
 | **"RESTART REQUIRED"** | ALL tools blocked → STOP, wait for restart |
 | **"SKILL FIRST"** | Call shown skill FIRST → chain domain skills → implement |
+**"SKILL FIRST" is mandatory** — "simple", "quick", "basic" are NOT opt-out phrases. The ONLY exception: user explicitly says "don't create an increment" or similar. Perceived simplicity never overrides hook instructions.
+**Setup actions are NOT implementation** — "connect github", "setup sync", "import issues" → route to the matching setup skill (`sw:sync-setup`, `sw:import`, `sw:progress-sync`), NOT `/sw:increment`.
 <!-- /SECTION -->
 <!-- SECTION:header required -->
@@ -28,6 +32,12 @@
 2. **Implementation**: Invoke domain skill per tech (React → `frontend:architect`, .NET → `backend:dotnet`, Stripe → `payments:payment-core`, etc.)
 3. **Closure**: `sw:grill` runs automatically via `/sw:done`
+**Complexity gate** — before chaining domain skills:
+1. **Tech stack specified?** → Chain ONLY the matching skill. If unspecified, ASK or default to minimal (vanilla JS/HTML, simple Express)
+2. **Complexity triage** → Simple (calculator, todo) = 0 domain plugins. Medium (auth, dashboard) = 1-2. Complex (SaaS) = full chain
+3. **Sanity check** → Would a senior engineer use this tool for this task? If obviously not, don't invoke it
+4. **Never** load all available plugins for a domain — pick ONE per domain based on the actual tech stack
 If auto-activation fails, invoke explicitly: `Skill({ skill: "name" })`
 <!-- /SECTION -->
@@ -55,6 +65,10 @@ SpecWeave auto-detects product descriptions and routes to `/sw:increment`:
 **Signals** (5+ = auto-route): Project name | Features list (3+) | Tech stack | Timeline/MVP | Problem statement | Business model
 **Opt-out phrases**: "Just brainstorm first" | "Don't plan yet" | "Quick discussion" | "Let's explore ideas"
+**NOT opt-out phrases**: "simple" | "quick" | "basic" | "small" — these still require `/sw:increment`
+**Setup/config requests bypass auto-detection** → route directly to the matching skill (e.g., `sw:sync-setup`, `sw:import`)
 <!-- /SECTION -->
 <!-- SECTION:metarule required -->
@@ -123,6 +137,8 @@ Good: npm run build → node script.js → Success
 | `/sw:done` | Close |
 | `/sw:progress-sync` | Sync progress to all external tools |
 | `/sw-github:push` | Push progress to GitHub |
+| `/sw:sync-setup` | Connect GitHub/Jira/ADO integration |
+| `/sw:import` | Import issues from external tools |
 **Natural language**: "Let's build X" → `/sw:increment` | "What's status?" → `/sw:progress` | "We're done" → `/sw:done` | "Ship while sleeping" → `/sw:auto`
@@ -251,7 +267,7 @@ Plugins load automatically. Manual: `vskill install --repo anton-abyzov/vskill -
 <!-- SECTION:principles -->
 ## Principles
-1. **Spec-first**: `/sw:increment` before coding
+1. **Spec-first**: `/sw:increment` before coding — mandatory for ALL implementation requests, no exceptions unless user explicitly opts out
 2. **Docs = truth**: Specs guide implementation
 3. **Simplicity First**: Minimal code, minimal impact
 4. **No Laziness**: Root causes, senior standards