specweave 1.0.354 → 1.0.355
- package/package.json +1 -1
- package/plugins/specweave/skills/team-lead/SKILL.md +71 -18
- package/plugins/specweave/skills/team-lead/agents/backend.md +4 -5
- package/plugins/specweave/skills/team-lead/agents/database.md +5 -6
- package/plugins/specweave/skills/team-lead/agents/frontend.md +4 -5
- package/plugins/specweave/skills/team-lead/agents/security.md +4 -5
- package/plugins/specweave/skills/team-lead/agents/testing.md +3 -4
- package/src/templates/AGENTS.md.template +9 -2
- package/src/templates/CLAUDE.md.template +17 -1
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "specweave",
|
|
3
|
-
"version": "1.0.
|
|
3
|
+
"version": "1.0.355",
|
|
4
4
|
"description": "Spec-driven development framework for AI coding agents. Works with Claude Code, Codex, Antigravity, Cursor, Copilot & more. 100+ skills, 49 CLI commands, verified skill certification, autonomous execution, and living documentation.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"main": "dist/index.js",
|
|
package/plugins/specweave/skills/team-lead/SKILL.md
CHANGED
@@ -316,6 +316,24 @@ Plan review MUST NOT block other agents. Review plans as they arrive — agents
|
|
|
316
316
|
|
|
317
317
|
For very large features, the team lead MAY split work into multiple increments per domain for better tracking and independent closure. Decide this during initial analysis (Step 1), before spawning agents.
|
|
318
318
|
|
|
319
|
+
### Task Cap Per Agent (CRITICAL — Context Overflow Prevention)
|
|
320
|
+
|
|
321
|
+
**Maximum 15 tasks per agent.** Agents with more tasks accumulate too much context in auto-mode, leading to extended thinking loops and stuck agents.
|
|
322
|
+
|
|
323
|
+
When distributing tasks from the master spec:
|
|
324
|
+
1. Count tasks per domain
|
|
325
|
+
2. If a domain has >15 tasks: **split into 2 agents** (e.g., `jira-agent-a`, `jira-agent-b`) with non-overlapping task ranges
|
|
326
|
+
3. If splitting isn't natural, group tasks into phases and create 2 increments per domain
|
|
327
|
+
|
|
328
|
+
```
|
|
329
|
+
Domain tasks analysis:
|
|
330
|
+
Frontend: 12 tasks -> 1 agent (OK)
|
|
331
|
+
Backend: 8 tasks -> 1 agent (OK)
|
|
332
|
+
JIRA: 23 tasks -> SPLIT into 2 agents (tasks 1-12, tasks 13-23)
|
|
333
|
+
```
|
|
334
|
+
|
|
335
|
+
**Why**: Each auto-mode iteration adds context (spec reads, edits, test outputs). At 20+ tasks, accumulated context causes the model to enter extended thinking (30+ min) and effectively hang. The 15-task cap keeps agents within a safe context budget.
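The splitting rule above can be sketched as a small helper. This is a minimal illustration, not SpecWeave code: `split_domain` and the `<domain>-agent-<letter>` naming are hypothetical, and using more than two agents when a domain exceeds 30 tasks is an assumption that extends the "split into 2 agents" rule.

```python
import math

MAX_TASKS_PER_AGENT = 15  # cap stated in the section above


def split_domain(domain: str, task_count: int, cap: int = MAX_TASKS_PER_AGENT):
    """Return (agent_name, first_task, last_task) tuples, each covering at most `cap` tasks."""
    if task_count <= cap:
        return [(f"{domain}-agent", 1, task_count)]
    # Assumption: use as few agents as possible and balance the ranges between them.
    agents = math.ceil(task_count / cap)
    base, extra = divmod(task_count, agents)
    ranges, start = [], 1
    for i in range(agents):
        size = base + (1 if i < extra else 0)  # the first `extra` agents take one more task
        suffix = chr(ord("a") + i)             # hypothetical naming: jira-agent-a, jira-agent-b
        ranges.append((f"{domain}-agent-{suffix}", start, start + size - 1))
        start += size
    return ranges


print(split_domain("jira", 23))
# [('jira-agent-a', 1, 12), ('jira-agent-b', 13, 23)]
```

For the 23-task JIRA domain this reproduces the non-overlapping ranges from the example (tasks 1-12 and 13-23).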
|
|
336
|
+
|
|
319
337
|
---
|
|
320
338
|
|
|
321
339
|
## 4. Agent Spawn Prompt Templates
|
|
@@ -494,36 +512,38 @@ Task({
|
|
|
494
512
|
|
|
495
513
|
## 8. Quality Gates
|
|
496
514
|
|
|
497
|
-
|
|
515
|
+
Quality gates are split: agents handle tests, team-lead handles closure (grill, done, judge-llm). This prevents context overflow in agents from loading 4+ additional skill definitions during closure.
|
|
498
516
|
|
|
499
|
-
### Per-Agent Quality Gate
|
|
517
|
+
### Per-Agent Quality Gate (Lightweight)
|
|
500
518
|
|
|
501
519
|
```
|
|
502
520
|
Agent Workflow:
|
|
503
|
-
1. Execute all assigned tasks
|
|
521
|
+
1. Execute all assigned tasks via /sw:auto --simple
|
|
504
522
|
2. Run all tests for owned code (unit + integration + E2E)
|
|
505
523
|
3. Run linter/type-check for owned code
|
|
506
|
-
4.
|
|
507
|
-
5.
|
|
508
|
-
6.
|
|
509
|
-
7.
|
|
510
|
-
8. Signal COMPLETION via SendMessage
|
|
524
|
+
4. If tests fail -> fix issues and repeat from step 2
|
|
525
|
+
5. Do NOT signal completion until all tests pass
|
|
526
|
+
6. Signal COMPLETION via SendMessage (include task count, test results summary)
|
|
527
|
+
7. Do NOT run /sw:grill or /sw:done — team-lead handles closure centrally
|
|
511
528
|
```
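The agent workflow above is essentially a test-and-fix loop that ends in a single completion message. A rough sketch follows, assuming hypothetical `run_tests` and `fix_issues` callbacks; the `max_rounds` limit and the message wording are illustrative assumptions, since the skill itself only says to repeat until tests pass.

```python
from typing import Callable


def run_quality_gate(
    run_tests: Callable[[], bool],
    fix_issues: Callable[[], None],
    task_count: int,
    max_rounds: int = 5,  # assumption: the skill itself sets no upper bound
) -> str:
    """Repeat test -> fix until tests pass, then return the message to send to the team-lead."""
    for round_no in range(1, max_rounds + 1):
        if run_tests():
            # Only signal completion once tests pass; closure is left to the team-lead.
            return f"COMPLETION: {task_count} tasks done, tests passed (round {round_no})"
        fix_issues()
    # Hypothetical escalation path once the round limit is hit.
    return f"BLOCKING_ISSUE: tests still failing after {max_rounds} rounds"
```

The function never calls anything resembling `/sw:grill` or `/sw:done`, which mirrors step 7 of the workflow.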
|
|
512
529
|
|
|
513
|
-
|
|
530
|
+
**Why agents don't run /sw:done**: The /sw:done skill invokes 4 sub-skills (grill, judge-llm, sync-docs, qa), each loading a full SKILL.md. After 15+ tasks of auto-mode context, this pushes agents into extended thinking (30+ min hangs). Centralizing closure on the team-lead (which has a cleaner context) avoids this.
|
|
514
531
|
|
|
515
|
-
|
|
532
|
+
### Orchestrator Quality Gate (Centralized Closure)
|
|
533
|
+
|
|
534
|
+
After all agents complete, the team-lead runs closure **centrally** for each increment:
|
|
516
535
|
|
|
517
536
|
```
|
|
518
537
|
Orchestrator Final Check:
|
|
519
538
|
1. All agents signaled COMPLETION
|
|
520
539
|
2. No unresolved BLOCKING_ISSUE messages
|
|
521
540
|
3. Run full test suite (all domains combined)
|
|
522
|
-
4.
|
|
523
|
-
|
|
524
|
-
|
|
525
|
-
|
|
526
|
-
|
|
541
|
+
4. For EACH increment in dependency order:
|
|
542
|
+
a. Run /sw:grill on the increment
|
|
543
|
+
b. Run /sw:done --auto <id>
|
|
544
|
+
c. If /sw:done fails, report the failure and continue with remaining increments
|
|
545
|
+
5. If all pass -> /sw:team-merge
|
|
546
|
+
6. If failures -> identify owning agent, send fix request via SendMessage
|
|
527
547
|
```
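Step 4 of the final check (close each increment in dependency order, keep going after a failure) could look roughly like the sketch below. The `grill` and `done` callbacks stand in for `/sw:grill` and `/sw:done --auto <id>`; they, the `deps` mapping, and the increment IDs in the example are hypothetical.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set


def close_increments(
    deps: Dict[str, Set[str]],      # increment id -> ids it depends on
    grill: Callable[[str], None],   # stand-in for /sw:grill
    done: Callable[[str], bool],    # stand-in for /sw:done --auto <id>; True on success
) -> List[str]:
    """Run grill then done for every increment, dependencies first; return the failed ids."""
    failed = []
    for increment in TopologicalSorter(deps).static_order():
        grill(increment)
        if not done(increment):
            failed.append(increment)  # report the failure and continue with the rest
    return failed


deps = {"0002-backend": {"0001-database"}, "0003-frontend": {"0002-backend"}}
failed = close_increments(deps, grill=lambda i: None, done=lambda i: True)
print("run /sw:team-merge" if not failed else f"send fix requests for {failed}")
```

An empty result corresponds to step 5 (merge); a non-empty one corresponds to step 6 (fix requests to the owning agents).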
|
|
528
548
|
|
|
529
549
|
### Grill Checklist per Domain
|
|
@@ -539,6 +559,36 @@ Orchestrator Final Check:
|
|
|
539
559
|
|
|
540
560
|
---
|
|
541
561
|
|
|
562
|
+
## 8b. Agent Timeout and Stuck Detection
|
|
563
|
+
|
|
564
|
+
Agents can get stuck in extended thinking if their context overflows. The team-lead MUST monitor for stuck agents.
|
|
565
|
+
|
|
566
|
+
### Timeout Rules
|
|
567
|
+
|
|
568
|
+
| Condition | Action |
|
|
569
|
+
|-----------|--------|
|
|
570
|
+
| Agent idle >20 min after last message | Send `STATUS_CHECK` message to agent |
|
|
571
|
+
| No response to STATUS_CHECK within 5 min | Declare agent stuck |
|
|
572
|
+
| Agent stuck | Log warning, proceed with other agents, handle stuck agent's increment manually in team-merge |
|
|
573
|
+
| All agents stuck | STOP team, report to user |
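The timeout table above amounts to a small decision function per agent plus a team-level check. A minimal sketch, assuming the team-lead tracks idle minutes and the time since an unanswered `STATUS_CHECK`; the function names and return labels are hypothetical.

```python
from typing import List, Optional

IDLE_LIMIT_MIN = 20         # idle time after the last message before a STATUS_CHECK
STATUS_CHECK_LIMIT_MIN = 5  # time allowed to answer a STATUS_CHECK


def next_action(idle_minutes: float, minutes_since_status_check: Optional[float]) -> str:
    """Decide what the team-lead does for one agent.

    minutes_since_status_check is None when no STATUS_CHECK is pending.
    """
    if minutes_since_status_check is not None:
        # A STATUS_CHECK was sent and has not been answered yet.
        if minutes_since_status_check >= STATUS_CHECK_LIMIT_MIN:
            return "DECLARE_STUCK"
        return "WAIT"
    if idle_minutes > IDLE_LIMIT_MIN:
        return "SEND_STATUS_CHECK"
    return "WAIT"


def team_action(agent_states: List[str]) -> str:
    """Team-level rule: stop only when every agent is stuck."""
    if agent_states and all(state == "stuck" for state in agent_states):
        return "STOP_TEAM"
    if "stuck" in agent_states:
        return "PROCEED_WITHOUT_STUCK"
    return "CONTINUE"
```

Treating exactly 5 minutes without a reply as stuck is one reading of "within 5 min"; a stricter or looser boundary would be equally consistent with the table.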
|
|
574
|
+
|
|
575
|
+
### Stuck Agent Recovery
|
|
576
|
+
|
|
577
|
+
When an agent is declared stuck:
|
|
578
|
+
1. Do NOT wait for it — proceed with closure of other agents' increments
|
|
579
|
+
2. Note the stuck agent's increment ID and last known task progress
|
|
580
|
+
3. During /sw:team-merge, the stuck agent's increment is left open for manual completion
|
|
581
|
+
4. Send shutdown_request to the stuck agent to free resources
|
|
582
|
+
|
|
583
|
+
### Preventing Stuck Agents
|
|
584
|
+
|
|
585
|
+
- Enforce the 15-task cap (Section 3b)
|
|
586
|
+
- Agents use `--simple` flag in auto-mode (reduces context per iteration)
|
|
587
|
+
- Agents do NOT run /sw:done (team-lead handles closure centrally)
|
|
588
|
+
- If an agent's task count exceeds 15 despite the cap, the team-lead should split it before spawning
|
|
589
|
+
|
|
590
|
+
---
|
|
591
|
+
|
|
542
592
|
## 9. Workflow Summary
|
|
543
593
|
|
|
544
594
|
```
|
|
@@ -556,9 +606,10 @@ Orchestrator Final Check:
|
|
|
556
606
|
│ │ └── Wait for CONTRACT_READY after approval
|
|
557
607
|
│ └── Phase 2: Spawn backend + frontend + testing
|
|
558
608
|
│ └── Receive PLAN_READY, review & approve via SendMessage
|
|
559
|
-
├── Step 5: Monitor progress via SendMessage
|
|
560
|
-
├── Step 6:
|
|
561
|
-
|
|
609
|
+
├── Step 5: Monitor progress via SendMessage (timeout: 20min idle → STATUS_CHECK)
|
|
610
|
+
├── Step 6: Agents signal COMPLETION (tests pass, no /sw:grill or /sw:done on agents)
|
|
611
|
+
├── Step 7: Team-lead runs /sw:grill + /sw:done --auto per increment (centralized closure)
|
|
612
|
+
└── Step 8: Merge and close (/sw:team-merge)
|
|
562
613
|
```
|
|
563
614
|
|
|
564
615
|
**IMPORTANT**: The intended entry point is: `/sw:increment` → `/sw:do` (detects 3+ domains) → `/sw:team-lead`.
|
|
@@ -596,6 +647,8 @@ To execute, run without --dry-run.
|
|
|
596
647
|
| **Agent stuck on trust folder** | Agent spawned without `bypassPermissions` | ALWAYS use `mode: "bypassPermissions"` — NEVER `mode: "plan"`. Trust prompts require interactive input agents cannot provide |
|
|
597
648
|
| **Agents editing same files** | Overlapping file ownership patterns | Review ownership map; reassign conflicting files to a single owner; use `--dry-run` to validate before launch |
|
|
598
649
|
| **Token cost too high** | Too many agents or overly large prompts | Reduce `--max-agents`; use `--domains` to limit scope; split feature into smaller increments |
|
|
650
|
+
| **Agent stuck in extended thinking** | Too many tasks (>15) causing context overflow | Enforce 15-task cap per agent; split large domains into 2 agents; agents use `--simple` mode |
|
|
651
|
+
| **Agent hung on /sw:done** | Closure loads 4+ skill definitions into already-full context | Agents should NOT run /sw:done — team-lead handles closure centrally |
|
|
599
652
|
| **Contract agent takes too long** | Large schema or complex type system | Set a timeout in the agent prompt; if stuck >15 min, check agent output and consider splitting the contract work |
|
|
600
653
|
| **Phase 2 starts before Phase 1 finishes** | CONTRACT_READY not received yet | Ensure upstream agents send CONTRACT_READY via SendMessage before team-lead spawns downstream |
|
|
601
654
|
| **Agent fails mid-task** | Build error, test failure, or dependency issue | Send message to agent to fix; restart the agent with `/sw:auto` on its increment |
|
|
package/plugins/specweave/skills/team-lead/agents/backend.md
CHANGED
@@ -42,13 +42,12 @@ WORKFLOW:
|
|
|
42
42
|
content: "PLAN_READY: [increment path]. [summary of planned tasks and files].",
|
|
43
43
|
summary: "Backend plan ready for review" })
|
|
44
44
|
9. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
|
|
45
|
-
10. Execute tasks autonomously:
|
|
45
|
+
10. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
|
|
46
46
|
11. Generate or update OpenAPI spec if API routes change
|
|
47
47
|
12. Run all tests for owned code (unit + integration): npm test
|
|
48
|
-
13.
|
|
49
|
-
14.
|
|
50
|
-
15.
|
|
51
|
-
16. Signal completion via SendMessage to team-lead
|
|
48
|
+
13. Do NOT signal completion until all tests pass
|
|
49
|
+
14. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
|
|
50
|
+
15. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
|
|
52
51
|
|
|
53
52
|
RULES:
|
|
54
53
|
- WRITE only to files you own (listed above)
|
|
package/plugins/specweave/skills/team-lead/agents/database.md
CHANGED
@@ -33,13 +33,12 @@ WORKFLOW:
|
|
|
33
33
|
8. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
|
|
34
34
|
9. Generate Prisma migration: npx prisma migrate dev --name <migration-name>
|
|
35
35
|
10. Write seed data if needed
|
|
36
|
-
11. Execute tasks autonomously:
|
|
36
|
+
11. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
|
|
37
37
|
12. Run all tests for owned code (migration, seed): npm test
|
|
38
|
-
13.
|
|
39
|
-
14.
|
|
40
|
-
15. Signal
|
|
41
|
-
16.
|
|
42
|
-
17. Signal completion via SendMessage to team-lead
|
|
38
|
+
13. Do NOT signal completion until all tests pass
|
|
39
|
+
14. Signal CONTRACT_READY with schema details via SendMessage to team-lead
|
|
40
|
+
15. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
|
|
41
|
+
16. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
|
|
43
42
|
|
|
44
43
|
RULES:
|
|
45
44
|
- WRITE only to files you own (listed above)
|
|
package/plugins/specweave/skills/team-lead/agents/frontend.md
CHANGED
@@ -44,12 +44,11 @@ WORKFLOW:
|
|
|
44
44
|
content: "PLAN_READY: [increment path]. [summary of planned tasks and files].",
|
|
45
45
|
summary: "Frontend plan ready for review" })
|
|
46
46
|
9. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
|
|
47
|
-
10. Execute tasks autonomously:
|
|
47
|
+
10. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
|
|
48
48
|
11. Run all tests for owned code (unit + integration): npm test
|
|
49
|
-
12.
|
|
50
|
-
13.
|
|
51
|
-
14.
|
|
52
|
-
15. Signal completion via SendMessage to team-lead
|
|
49
|
+
12. Do NOT signal completion until all tests pass
|
|
50
|
+
13. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
|
|
51
|
+
14. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
|
|
53
52
|
|
|
54
53
|
RULES:
|
|
55
54
|
- WRITE only to files you own (listed above)
|
|
package/plugins/specweave/skills/team-lead/agents/security.md
CHANGED
@@ -34,13 +34,12 @@ WORKFLOW:
|
|
|
34
34
|
8. WAIT for "PLAN_APPROVED" message. If "PLAN_REJECTED", revise and re-submit.
|
|
35
35
|
9. Implement auth/authz middleware if needed
|
|
36
36
|
10. Add input validation and sanitization
|
|
37
|
-
11. Execute tasks autonomously:
|
|
37
|
+
11. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
|
|
38
38
|
12. Run all tests for owned code (security tests): npm test
|
|
39
39
|
13. Run security audit tools (npm audit, dependency check)
|
|
40
|
-
14.
|
|
41
|
-
15.
|
|
42
|
-
16.
|
|
43
|
-
17. Signal completion with security findings summary via SendMessage to team-lead
|
|
40
|
+
14. Do NOT signal completion until all tests pass
|
|
41
|
+
15. Signal COMPLETION via SendMessage to team-lead with summary of tasks done, test results, and security findings
|
|
42
|
+
16. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
|
|
44
43
|
|
|
45
44
|
RULES:
|
|
46
45
|
- WRITE only to files you own (listed above)
|
|
package/plugins/specweave/skills/team-lead/agents/testing.md
CHANGED
@@ -40,12 +40,11 @@ WORKFLOW:
|
|
|
40
40
|
9. Write unit tests for new services/components
|
|
41
41
|
10. Write integration tests for API endpoints
|
|
42
42
|
11. Write E2E tests for user journeys
|
|
43
|
-
12. Execute tasks autonomously:
|
|
43
|
+
12. Execute tasks autonomously: /sw:auto --simple (minimal context mode to prevent context overflow)
|
|
44
44
|
13. Run all tests (unit + integration + E2E): npm test && npx playwright test
|
|
45
45
|
14. Do NOT signal completion until all tests pass -- if tests fail, fix and repeat
|
|
46
|
-
15.
|
|
47
|
-
16.
|
|
48
|
-
17. Signal completion via SendMessage to team-lead
|
|
46
|
+
15. Signal COMPLETION via SendMessage to team-lead with summary of tasks done and test results
|
|
47
|
+
16. Do NOT run /sw:done or /sw:grill yourself — team-lead handles closure centrally
|
|
49
48
|
|
|
50
49
|
RULES:
|
|
51
50
|
- WRITE only to test files (listed above)
|
|
package/src/templates/AGENTS.md.template
CHANGED
@@ -38,13 +38,17 @@
|
|
|
38
38
|
<!-- SECTION:orchestration required -->
|
|
39
39
|
## Workflow Orchestration
|
|
40
40
|
|
|
41
|
-
### 1. Plan Before Code
|
|
41
|
+
### 1. Plan Before Code (MANDATORY)
|
|
42
42
|
|
|
43
|
-
BEFORE implementing ANY
|
|
43
|
+
BEFORE implementing ANY task — create an increment FIRST:
|
|
44
44
|
1. Create increment: spec.md (WHAT/WHY) + plan.md (HOW) + tasks.md (checklist)
|
|
45
45
|
2. Get user approval before implementing
|
|
46
46
|
3. If something goes sideways → STOP and re-plan
|
|
47
47
|
|
|
48
|
+
**No exceptions for "simple" tasks** — "simple", "quick", "basic" still require an increment. The only exception: user explicitly says "don't create an increment."
|
|
49
|
+
|
|
50
|
+
**Setup/config actions are NOT implementation** — "connect github", "setup sync", "import issues" → use the matching setup command directly, not the increment workflow.
|
|
51
|
+
|
|
48
52
|
See **Task Format** and **User Story Format** sections for templates.
|
|
49
53
|
|
|
50
54
|
### 2. Verify Before Done
|
|
@@ -84,6 +88,7 @@ Good: npm run build → node script.js → Success
|
|
|
84
88
|
- Avoid over-engineering and premature optimization
|
|
85
89
|
- One function = one responsibility
|
|
86
90
|
- If you can delete code and tests still pass, delete it
|
|
91
|
+
- **Match tooling to complexity** — simple tasks (calculator, todo) need 0 domain plugins and vanilla code. Don't load heavyweight frameworks, design systems, or i18n for trivial features
|
|
87
92
|
|
|
88
93
|
### No Laziness
|
|
89
94
|
- Don't leave TODO comments for "later"
|
|
@@ -139,6 +144,8 @@ Good: npm run build → node script.js → Success
|
|
|
139
144
|
| `/sw-github:sync 0001` | Sync increment to GitHub issue |
|
|
140
145
|
| `/sw-jira:sync 0001` | Sync to Jira |
|
|
141
146
|
| `/sw-ado:sync 0001` | Sync to Azure DevOps |
|
|
147
|
+
| `/sw:sync-setup` | Connect GitHub/Jira/ADO integration |
|
|
148
|
+
| `/sw:import` | Import issues from external tools |
|
|
142
149
|
<!-- /SECTION -->
|
|
143
150
|
|
|
144
151
|
---
|
|
package/src/templates/CLAUDE.md.template
CHANGED
@@ -9,6 +9,10 @@
|
|
|
9
9
|
|---|---|
|
|
10
10
|
| **"RESTART REQUIRED"** | ALL tools blocked → STOP, wait for restart |
|
|
11
11
|
| **"SKILL FIRST"** | Call shown skill FIRST → chain domain skills → implement |
|
|
12
|
+
|
|
13
|
+
**"SKILL FIRST" is mandatory** — "simple", "quick", "basic" are NOT opt-out phrases. The ONLY exception: user explicitly says "don't create an increment" or similar. Perceived simplicity never overrides hook instructions.
|
|
14
|
+
|
|
15
|
+
**Setup actions are NOT implementation** — "connect github", "setup sync", "import issues" → route to the matching setup skill (`sw:sync-setup`, `sw:import`, `sw:progress-sync`), NOT `/sw:increment`.
|
|
12
16
|
<!-- /SECTION -->
|
|
13
17
|
|
|
14
18
|
<!-- SECTION:header required -->
|
|
@@ -28,6 +32,12 @@
|
|
|
28
32
|
2. **Implementation**: Invoke domain skill per tech (React → `frontend:architect`, .NET → `backend:dotnet`, Stripe → `payments:payment-core`, etc.)
|
|
29
33
|
3. **Closure**: `sw:grill` runs automatically via `/sw:done`
|
|
30
34
|
|
|
35
|
+
**Complexity gate** — before chaining domain skills:
|
|
36
|
+
1. **Tech stack specified?** → Chain ONLY the matching skill. If unspecified, ASK or default to minimal (vanilla JS/HTML, simple Express)
|
|
37
|
+
2. **Complexity triage** → Simple (calculator, todo) = 0 domain plugins. Medium (auth, dashboard) = 1-2. Complex (SaaS) = full chain
|
|
38
|
+
3. **Sanity check** → Would a senior engineer use this tool for this task? If obviously not, don't invoke it
|
|
39
|
+
4. **Never** load all available plugins for a domain — pick ONE per domain based on the actual tech stack
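One way to read the complexity gate above is as a per-complexity plugin budget combined with a one-skill-per-domain rule. The sketch below is an illustration only: `select_skills`, the budget table, and the candidate-list shape are hypothetical, and capping "medium" at 2 takes the upper end of the stated 1-2 range.

```python
from typing import Dict, List, Optional

# Budget of domain plugins per complexity level; None means the full chain.
PLUGIN_BUDGET: Dict[str, Optional[int]] = {"simple": 0, "medium": 2, "complex": None}


def select_skills(complexity: str, candidates: Dict[str, List[str]]) -> List[str]:
    """Pick at most one skill per domain, then trim to the budget for this complexity.

    `candidates` maps a domain to the skills matching the stated tech stack,
    ordered by preference (hypothetical input shape).
    """
    one_per_domain = [skills[0] for skills in candidates.values() if skills]
    budget = PLUGIN_BUDGET[complexity]
    return one_per_domain if budget is None else one_per_domain[:budget]


stack = {
    "frontend": ["frontend:architect"],
    "backend": ["backend:dotnet"],
    "payments": ["payments:payment-core"],
}
print(select_skills("simple", stack))  # []
print(select_skills("medium", stack))  # ['frontend:architect', 'backend:dotnet']
```

The "would a senior engineer use this tool" check is a judgment call and is deliberately left out of the sketch.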
|
|
40
|
+
|
|
31
41
|
If auto-activation fails, invoke explicitly: `Skill({ skill: "name" })`
|
|
32
42
|
<!-- /SECTION -->
|
|
33
43
|
|
|
@@ -55,6 +65,10 @@ SpecWeave auto-detects product descriptions and routes to `/sw:increment`:
|
|
|
55
65
|
**Signals** (5+ = auto-route): Project name | Features list (3+) | Tech stack | Timeline/MVP | Problem statement | Business model
|
|
56
66
|
|
|
57
67
|
**Opt-out phrases**: "Just brainstorm first" | "Don't plan yet" | "Quick discussion" | "Let's explore ideas"
|
|
68
|
+
|
|
69
|
+
**NOT opt-out phrases**: "simple" | "quick" | "basic" | "small" — these still require `/sw:increment`
|
|
70
|
+
|
|
71
|
+
**Setup/config requests bypass auto-detection** → route directly to the matching skill (e.g., `sw:sync-setup`, `sw:import`)
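The routing rules above (setup requests go to a setup skill, opt-out phrases skip planning, everything else still gets an increment) can be sketched as a simple phrase matcher. This deliberately ignores the signal-counting auto-detection and is not how SpecWeave matches requests; the phrase-to-skill mapping and the `no-increment` label are assumptions.

```python
# Opt-out phrases quoted from the template, lowercased for matching.
OPT_OUT_PHRASES = (
    "just brainstorm first",
    "don't plan yet",
    "quick discussion",
    "let's explore ideas",
)

# Assumed mapping from example setup phrases to the setup skills named in the template.
SETUP_ROUTES = {
    "connect github": "sw:sync-setup",
    "setup sync": "sw:sync-setup",
    "import issues": "sw:import",
}


def route(request: str) -> str:
    """Return the skill (or 'no-increment') a request would be routed to in this sketch."""
    text = request.lower()
    for phrase, skill in SETUP_ROUTES.items():
        if phrase in text:
            return skill              # setup/config requests bypass auto-detection
    if any(phrase in text for phrase in OPT_OUT_PHRASES):
        return "no-increment"         # explicit opt-out: no planning yet
    return "sw:increment"             # "simple", "quick", "basic" do not opt out


print(route("Build a quick, simple calculator"))  # sw:increment
print(route("Connect GitHub to this repo"))       # sw:sync-setup
```

Note that "quick" on its own still routes to `sw:increment`; only the full phrase "Quick discussion" counts as an opt-out.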
|
|
58
72
|
<!-- /SECTION -->
|
|
59
73
|
|
|
60
74
|
<!-- SECTION:metarule required -->
|
|
@@ -123,6 +137,8 @@ Good: npm run build → node script.js → Success
|
|
|
123
137
|
| `/sw:done` | Close |
|
|
124
138
|
| `/sw:progress-sync` | Sync progress to all external tools |
|
|
125
139
|
| `/sw-github:push` | Push progress to GitHub |
|
|
140
|
+
| `/sw:sync-setup` | Connect GitHub/Jira/ADO integration |
|
|
141
|
+
| `/sw:import` | Import issues from external tools |
|
|
126
142
|
|
|
127
143
|
**Natural language**: "Let's build X" → `/sw:increment` | "What's status?" → `/sw:progress` | "We're done" → `/sw:done` | "Ship while sleeping" → `/sw:auto`
|
|
128
144
|
|
|
@@ -251,7 +267,7 @@ Plugins load automatically. Manual: `vskill install --repo anton-abyzov/vskill -
|
|
|
251
267
|
<!-- SECTION:principles -->
|
|
252
268
|
## Principles
|
|
253
269
|
|
|
254
|
-
1. **Spec-first**: `/sw:increment` before coding
|
|
270
|
+
1. **Spec-first**: `/sw:increment` before coding — mandatory for ALL implementation requests, no exceptions unless user explicitly opts out
|
|
255
271
|
2. **Docs = truth**: Specs guide implementation
|
|
256
272
|
3. **Simplicity First**: Minimal code, minimal impact
|
|
257
273
|
4. **No Laziness**: Root causes, senior standards
|