claude-capsule-kit 3.0.0
- package/README.md +281 -0
- package/agents/agent-developer.md +206 -0
- package/agents/architecture-explorer.md +90 -0
- package/agents/brainstorm-coordinator.md +120 -0
- package/agents/code-reviewer.md +135 -0
- package/agents/context-librarian.md +227 -0
- package/agents/context-manager.md +151 -0
- package/agents/database-architect.md +107 -0
- package/agents/database-navigator.md +136 -0
- package/agents/debugger.md +121 -0
- package/agents/devops-sre.md +102 -0
- package/agents/error-detective.md +128 -0
- package/agents/git-workflow-manager.md +212 -0
- package/agents/github-issue-tracker.md +252 -0
- package/agents/product-dx-specialist.md +99 -0
- package/agents/refactoring-specialist.md +159 -0
- package/agents/security-engineer.md +102 -0
- package/agents/session-summarizer.md +126 -0
- package/agents/system-architect.md +103 -0
- package/bin/cck.js +1624 -0
- package/commands/crew-setup.md +75 -0
- package/commands/load-session.md +68 -0
- package/commands/sessions.md +55 -0
- package/commands/statusline.md +51 -0
- package/commands/sync-disable.md +35 -0
- package/commands/sync-enable.md +32 -0
- package/commands/sync.md +31 -0
- package/crew/lib/activity-monitor.js +128 -0
- package/crew/lib/crew-config-reader.js +255 -0
- package/crew/lib/health-monitor.js +171 -0
- package/crew/lib/merge-pilot.js +340 -0
- package/crew/lib/prompt-generator.js +268 -0
- package/crew/lib/role-presets.js +63 -0
- package/crew/lib/task-decomposer.js +382 -0
- package/crew/lib/team-spawner.sh +557 -0
- package/crew/lib/team-state-manager.js +155 -0
- package/crew/lib/worktree-gc.js +357 -0
- package/crew/lib/worktree-manager.sh +700 -0
- package/docs/AGENT_ROUTING_GUIDE.md +655 -0
- package/docs/AGENT_TEAMS_WORKTREE_MODE.md +681 -0
- package/docs/BEST_PRACTICES.md +194 -0
- package/docs/CAPSULE_DEGRADATION_RCA.md +577 -0
- package/docs/SKILLS_ORCHESTRATION_ARCHITECTURE.md +455 -0
- package/docs/SUPER_CLAUDE_SYSTEM_ARCHITECTURE.md +1647 -0
- package/docs/TOOL_ENFORCEMENT_REFERENCE.md +418 -0
- package/hooks/check-refresh-needed.sh +77 -0
- package/hooks/detect-changes.sh +90 -0
- package/hooks/keyword-triggers.sh +66 -0
- package/hooks/lib/crew-detect.js +241 -0
- package/hooks/lib/handoff-generator.js +158 -0
- package/hooks/load-from-journal.sh +41 -0
- package/hooks/post-tool-use.js +212 -0
- package/hooks/pre-compact.js +77 -0
- package/hooks/pre-edit-analysis.sh +68 -0
- package/hooks/pre-tool-use.sh +212 -0
- package/hooks/prompt-submit-memory.sh +87 -0
- package/hooks/quality-check.sh +48 -0
- package/hooks/session-end.js +133 -0
- package/hooks/session-start.js +439 -0
- package/hooks/stop.sh +66 -0
- package/hooks/suggest-discoveries.sh +84 -0
- package/hooks/summarize-session.sh +122 -0
- package/hooks/sync-to-journal.sh +77 -0
- package/hooks/sync-todowrite.sh +37 -0
- package/hooks/tool-auto-suggest.sh +77 -0
- package/hooks/user-prompt-submit.sh +71 -0
- package/lib/audit-logger.sh +120 -0
- package/lib/sandbox-validator.sh +194 -0
- package/lib/tool-runner.sh +274 -0
- package/package.json +67 -0
- package/scripts/postinstall.js +4 -0
- package/scripts/show-capsule-visual.sh +103 -0
- package/scripts/show-capsule.sh +113 -0
- package/scripts/show-deps-tree.sh +66 -0
- package/scripts/show-stats-dashboard.sh +52 -0
- package/scripts/show-stats.sh +79 -0
- package/skills/code-review/SKILL.md +520 -0
- package/skills/crew/SKILL.md +395 -0
- package/skills/debug/SKILL.md +473 -0
- package/skills/deep-context/SKILL.md +446 -0
- package/skills/task-router/SKILL.md +390 -0
- package/skills/workflow/SKILL.md +370 -0
- package/templates/CLAUDE.md +124 -0
- package/templates/crew-config.json +21 -0
- package/templates/settings-hooks.json +74 -0
- package/templates/statusline-command.sh +208 -0
- package/tools/context-query/context-query.js +312 -0
- package/tools/context-query/context-query.sh +5 -0
- package/tools/context-query/tool.json +42 -0
- package/tools/dependency-scanner/dependency-scanner.sh +53 -0
- package/tools/dependency-scanner/tool.json +8 -0
- package/tools/find-circular/find-circular.sh +41 -0
- package/tools/find-circular/tool.json +36 -0
- package/tools/find-dead-code/find-dead-code.sh +41 -0
- package/tools/find-dead-code/tool.json +36 -0
- package/tools/impact-analysis/impact-analysis.sh +99 -0
- package/tools/impact-analysis/tool.json +38 -0
- package/tools/progressive-reader/progressive-reader.sh +14 -0
- package/tools/progressive-reader/tool.json +69 -0
- package/tools/query-deps/query-deps.sh +69 -0
- package/tools/query-deps/tool.json +34 -0
- package/tools/stats/stats.js +299 -0
- package/tools/stats/stats.sh +5 -0
- package/tools/stats/tool.json +34 -0
- package/tools/token-counter/README.md +73 -0
- package/tools/token-counter/token-counter.py +202 -0
- package/tools/token-counter/tool.json +40 -0
package/README.md
ADDED
|
@@ -0,0 +1,281 @@
|
|
|
1
|
+
<p align="center">
|
|
2
|
+
<img src="./.github/capsule-hero.png" alt="Claude Capsule Kit" width="100%" />
|
|
3
|
+
</p>
|
|
4
|
+
|
|
5
|
+
<p align="center">
|
|
6
|
+
<a href="https://www.npmjs.com/package/claude-capsule-kit"><img src="https://img.shields.io/npm/v/claude-capsule-kit.svg" alt="npm"></a>
|
|
7
|
+
<a href="https://opensource.org/licenses/MIT"><img src="https://img.shields.io/badge/License-MIT-yellow.svg" alt="License: MIT"></a>
|
|
8
|
+
<a href="https://claude.ai"><img src="https://img.shields.io/badge/Claude_Code-Compatible-orange.svg" alt="Claude Code"></a>
|
|
9
|
+
</p>
|
|
10
|
+
|
|
11
|
+
<h3 align="center">A toolkit that makes Claude Code better at engineering.</h3>
|
|
12
|
+
|
|
13
|
+
<p align="center">
|
|
14
|
+
Session memory. Dependency analysis. Large file navigation. 18 specialist agents.<br/>
|
|
15
|
+
Crew teams for parallel multi-branch work. All automatic, zero configuration.
|
|
16
|
+
</p>
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## Install
|
|
21
|
+
|
|
22
|
+
```bash
|
|
23
|
+
npm install -g claude-capsule-kit
|
|
24
|
+
cck setup
|
|
25
|
+
```
|
|
26
|
+
|
|
27
|
+
Restart Claude Code. Everything activates automatically via hooks.
|
|
28
|
+
|
|
29
|
+
---
|
|
30
|
+
|
|
31
|
+
## Why CCK
|
|
32
|
+
|
|
33
|
+
Claude Code is powerful out of the box. But on real codebases, you run into limits:
|
|
34
|
+
|
|
35
|
+
- **Session isolation** — Claude starts fresh every time. Previous context, discoveries, and decisions are lost.
|
|
36
|
+
- **No dependency awareness** — Claude doesn't know your import graph. It can't tell you what breaks when you change a file.
|
|
37
|
+
- **Large file blindness** — Files over 50KB get truncated. Claude can't navigate framework internals or generated code.
|
|
38
|
+
- **Single-threaded work** — One branch, one task at a time. No way to parallelize across features.
|
|
39
|
+
- **Agent amnesia** — Sub-agents start with zero context. Findings from one agent don't flow to others.
|
|
40
|
+
|
|
41
|
+
CCK fills these gaps with a set of tools, hooks, and agents that plug into Claude Code's extension system. Nothing is patched or hacked — it's all built on official hooks, skills, commands, and agent routing.
|
|
42
|
+
|
|
43
|
+
---
|
|
44
|
+
|
|
45
|
+
## What's in the kit
|
|
46
|
+
|
|
47
|
+
### Session Memory
|
|
48
|
+
|
|
49
|
+
Hooks capture every file read, edit, and agent invocation into a local SQLite database powered by [**blink-query**](https://github.com/arpitnath/blink-query) — a DNS-inspired knowledge resolution layer built for AI agents.
|
|
50
|
+
|
|
51
|
+
| Event | What happens |
|---|---|
| You read/edit a file | Operation logged to capsule.db |
| Session ends | Summary saved with branch context |
| Next session starts | Previous context restored automatically |
| Context window fills up | Continuity doc saved before compaction |
| You switch branches | Context switches with you |
|
|
58
|
+
|
|
59
|
+
No manual logging. No `/save` commands. It just works.
|
|
60
|
+
|
|
61
|
+
```bash
|
|
62
|
+
cck stats overview # See what's been captured
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
### Dependency Tools
|
|
66
|
+
|
|
67
|
+
Claude Code doesn't know your import graph. CCK builds one.
|
|
68
|
+
|
|
69
|
+
| Tool | What it answers |
|---|---|
| `query-deps` | What does this file import? Who imports it? |
| `impact-analysis` | What breaks if I change this file? |
| `find-circular` | Are there circular dependencies? |
| `find-dead-code` | What code is never imported? |
|
|
75
|
+
|
|
76
|
+
These use a pre-built dependency graph (via the Go-based `dependency-scanner`) — instant results instead of Claude scanning files one by one. The Pre-Tool-Use hook automatically suggests the right tool when Claude reaches for a file.
|
|
77
|
+
|
|
78
|
+
```bash
|
|
79
|
+
cck build # Build Go binaries (dependency-scanner, progressive-reader)
|
|
80
|
+
```
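The questions in the table above reduce to graph traversals over a file-level import graph. The following is a minimal sketch, not CCK's implementation (the real scanner is a Go binary whose output format is not shown here). The `IMPORTS` data and the function names `reverse_graph`, `impacted_by`, and `find_dead` are hypothetical and exist only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical import graph: file -> files it imports.
IMPORTS = {
    "src/app.ts": ["src/api.ts", "src/ui.ts"],
    "src/api.ts": ["src/db.ts"],
    "src/ui.ts": ["src/api.ts"],
    "src/db.ts": [],
}


def reverse_graph(imports):
    """Invert the graph: file -> set of files that import it."""
    importers = defaultdict(set)
    for source, targets in imports.items():
        for target in targets:
            importers[target].add(source)
    return importers


def impacted_by(changed, imports):
    """Return every file that directly or transitively imports `changed`."""
    importers = reverse_graph(imports)
    seen, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


def find_dead(imports, entry_points):
    """Return files that nothing imports, excluding known entry points."""
    importers = reverse_graph(imports)
    return sorted(
        f for f in imports if f not in importers and f not in entry_points
    )
```

Building the reversed graph once is what makes "who imports it?" and "what breaks?" cheap to answer: each query becomes a breadth-first walk instead of a fresh scan of every file.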
|
|
81
|
+
|
|
82
|
+
### Large File Navigation
|
|
83
|
+
|
|
84
|
+
Files over 50KB hit Claude Code's token limit. CCK's `progressive-reader` parses the AST and splits files into navigable chunks — functions, classes, sections.
|
|
85
|
+
|
|
86
|
+
```bash
|
|
87
|
+
# Claude uses this automatically when it hits a large file
|
|
88
|
+
progressive-reader --path src/huge-file.ts --list # See structure
|
|
89
|
+
progressive-reader --path src/huge-file.ts --chunk 3 # Read specific section
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
Supports TypeScript, JavaScript, Python, and Go. 75-97% token savings on large files.
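To illustrate the idea of AST-based chunking, here is a small sketch using Python's standard `ast` module on Python source only. It is not the `progressive-reader` implementation (which is a Go binary), and the function names `list_chunks` and `read_chunk` are assumptions that loosely mirror the `--list` and `--chunk` flags shown above.

```python
import ast


def list_chunks(source: str):
    """Return (index, kind, name, start_line, end_line) for top-level definitions."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            chunks.append(
                (len(chunks), kind, node.name, node.lineno, node.end_lineno)
            )
    return chunks


def read_chunk(source: str, index: int) -> str:
    """Return only the source lines that belong to one chunk."""
    _, _, _, start, end = list_chunks(source)[index]
    return "\n".join(source.splitlines()[start - 1:end])


SAMPLE = (
    "import os\n"
    "\n"
    "def load(path):\n"
    "    return open(path).read()\n"
    "\n"
    "class Cache:\n"
    "    def get(self, key):\n"
    "        return None\n"
)
```

Listing the structure first and reading one chunk at a time is where the token savings come from: the reader only pays for the function or class it actually needs.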
|
|
93
|
+
|
|
94
|
+
### 18 Specialist Agents
|
|
95
|
+
|
|
96
|
+
Fresh-context agents routed by task type. All read-only — they investigate and report, never modify code.
|
|
97
|
+
|
|
98
|
+
| Agent | What it does |
|---|---|
| `error-detective` | Root cause analysis of errors with structured RCA reports |
| `debugger` | Step-through debugging, stack trace analysis, breakpoint strategies |
| `code-reviewer` | Pre-commit review for bugs, security, performance, code quality |
| `architecture-explorer` | Codebase architecture, service boundaries, integration points |
| `refactoring-specialist` | Safe refactoring plans that preserve existing behavior |
| `security-engineer` | Threat modeling, cryptographic systems, compliance review |
| `database-navigator` | Schema exploration, data models, database relationships |
| `database-architect` | Schema design, query performance, indexing strategies |
| `git-workflow-manager` | Branching strategies, merge conflicts, git best practices |
| `system-architect` | Technical architecture, algorithms, scalability analysis |
| `devops-sre` | Production readiness, monitoring, deployment strategies |
| `brainstorm-coordinator` | Multi-perspective design decisions with parallel specialists |
| `product-dx-specialist` | API design, developer workflows, developer experience |
| `context-librarian` | Context retrieval from capsule records and codebase patterns |
| `context-manager` | Context optimization, conversation summarization, handoff prep |
| `github-issue-tracker` | Create and manage GitHub issues with proper formatting |
| `session-summarizer` | Session summaries for cross-device continuation |
| `agent-developer` | Debug and develop custom agents and MCP integrations |
|
|
118
|
+
|
|
119
|
+
### Skills
|
|
120
|
+
|
|
121
|
+
Auto-trigger on keywords. No need to remember commands.
|
|
122
|
+
|
|
123
|
+
| Skill | Triggers on | What it does |
|---|---|---|
| `/crew` | "team", "parallel", "multi-branch" | Launch and coordinate crew teams |
| `/crew-setup` | manual | Check prerequisites for crew teams |
| `/workflow` | "complex task", "multi-step" | 5-phase systematic task execution |
| `/debug` | "error", "bug", "failing" | RCA-first debugging with specialist agents |
| `/deep-context` | "understand codebase", "need background" | Progressive context building |
| `/code-review` | manual | Pre-commit quality review |
| `/statusline` | manual | Configure statusline sections |
|
|
132
|
+
|
|
133
|
+
### Crew Teams
|
|
134
|
+
|
|
135
|
+
Parallel agent teams where each teammate works on their own git branch via worktrees.
|
|
136
|
+
|
|
137
|
+
<p align="center">
|
|
138
|
+
<img src="./.github/crew-mode.png" alt="Crew Mode" width="100%" />
|
|
139
|
+
</p>
|
|
140
|
+
|
|
141
|
+
```bash
|
|
142
|
+
/crew-setup # Check readiness
|
|
143
|
+
/crew # Launch and coordinate a team
|
|
144
|
+
```
|
|
145
|
+
|
|
146
|
+
Define teams in `.crew-config.json`:
|
|
147
|
+
|
|
148
|
+
```json
{
  "team": {
    "name": "my-feature",
    "teammates": [
      { "name": "backend", "branch": "feat/api", "role": "developer", "focus": "Build REST API" },
      { "name": "frontend", "branch": "feat/ui", "role": "developer", "focus": "Build React UI" }
    ]
  },
  "project": { "main_branch": "main" }
}
```
|
|
160
|
+
|
|
161
|
+
For larger teams, group teammates into **crews**:
|
|
162
|
+
|
|
163
|
+
```json
{
  "team": {
    "name": "v2-release",
    "crews": [
      {
        "name": "frontend",
        "teammates": [
          { "name": "ui-dev", "branch": "feat/ui", "role": "developer", "focus": "React components" }
        ]
      },
      {
        "name": "backend",
        "teammates": [
          { "name": "api-dev", "branch": "feat/api", "role": "developer", "focus": "REST endpoints" }
        ]
      }
    ]
  }
}
```
|
|
184
|
+
|
|
185
|
+
Use `--crew <name>` to scope commands to a crew group (e.g. `cck crew merge --crew frontend`).
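As a rough illustration of how the two config layouts relate, the sketch below flattens either shape into one list of teammates and optionally filters by crew name. It is an assumption about how such scoping could work, not the logic in `crew-config-reader.js`. The function name `teammates` is hypothetical, and ignoring the crew filter for the flat layout is a choice made only for this example.

```python
import json
from typing import Optional

CONFIG = json.loads("""
{
  "team": {
    "name": "v2-release",
    "crews": [
      {"name": "frontend", "teammates": [
        {"name": "ui-dev", "branch": "feat/ui", "role": "developer", "focus": "React components"}
      ]},
      {"name": "backend", "teammates": [
        {"name": "api-dev", "branch": "feat/api", "role": "developer", "focus": "REST endpoints"}
      ]}
    ]
  }
}
""")


def teammates(config: dict, crew: Optional[str] = None) -> list:
    """Flatten a crew config into a list of teammates, optionally scoped to one crew."""
    team = config["team"]
    if "crews" in team:
        groups = team["crews"]
        if crew is not None:
            groups = [g for g in groups if g["name"] == crew]
        # Tag each teammate with the crew it came from.
        return [dict(m, crew=g["name"]) for g in groups for m in g["teammates"]]
    # Flat layout: no crew groups, so the filter does not apply (assumption).
    return [dict(m, crew=None) for m in team.get("teammates", [])]
```

Calling `teammates(CONFIG, crew="frontend")` returns only the `ui-dev` entry, which is the kind of narrowing a crew-scoped command would need.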
|
|
186
|
+
|
|
187
|
+
**Roles**: `developer` (sonnet, auto-commit) | `reviewer` (sonnet, read-only) | `tester` (haiku, auto-commit) | `architect` (opus, read-only)
|
|
188
|
+
|
|
189
|
+
**What crew teams give you**:
|
|
190
|
+
- **Merge preview** — dry-run conflict detection before merging branches back
|
|
191
|
+
- **Merge execution** — ordered merge with backup tags and optional test runs
|
|
192
|
+
- **Health monitoring** — detect crashed or hung teammates
|
|
193
|
+
- **Activity monitor** — real-time file ops per teammate, overlap detection
|
|
194
|
+
- **Task decomposition** — dependency-aware splitting for optimal parallelization
|
|
195
|
+
- **Discovery sharing** — teammates share findings via shared namespace in capsule.db
|
|
196
|
+
- **Session continuity** — handoff docs saved before auto-compact and at session end
|
|
197
|
+
- **Worktree GC** — clean up orphaned worktrees from past sessions
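Overlap detection, mentioned under the activity monitor above, can be pictured as grouping file operations by path and flagging paths touched by more than one teammate. This is a simplified sketch under that assumption. The `find_overlaps` name and the `(teammate, file_path)` input shape are hypothetical, not taken from `activity-monitor.js`.

```python
from collections import defaultdict


def find_overlaps(ops):
    """Given (teammate, file_path) pairs, return files touched by 2+ teammates."""
    touched = defaultdict(set)
    for teammate, path in ops:
        touched[path].add(teammate)
    return {
        path: sorted(names)
        for path, names in touched.items()
        if len(names) > 1
    }


# Hypothetical activity log for two teammates.
OPS = [
    ("backend", "src/api.ts"),
    ("frontend", "src/ui.tsx"),
    ("frontend", "src/api.ts"),
    ("backend", "src/api.ts"),
]
```

Files reported here are the likeliest sources of merge conflicts, so they are worth checking before running a merge preview.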
|
|
198
|
+
|
|
199
|
+
---
|
|
200
|
+
|
|
201
|
+
## How it works
|
|
202
|
+
|
|
203
|
+
CCK extends Claude Code through its official hook system. Six hooks run at key moments:
|
|
204
|
+
|
|
205
|
+
```
SessionStart  → Restore previous context, inject discoveries, detect crew membership
PostToolUse   → Capture file ops and agent results to capsule.db
PreToolUse    → Enforce dependency tools, block large file reads, suggest right tool
PreCompact    → Save session continuity doc before context window compacts
SessionEnd    → Save session summary with branch, file count, agent count
Stop          → Quality check after responses
```
|
|
213
|
+
|
|
214
|
+
All data lives in `~/.claude/capsule.db` — a SQLite database managed by [**blink-query**](https://github.com/arpitnath/blink-query), a resolution engine that organizes knowledge into namespaces with types, tags, and relationships. One global install, automatic project scoping.
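To make the capture-and-restore flow concrete, here is a minimal sketch using Python's standard `sqlite3` module with an in-memory database. It does not reflect blink-query's actual schema or API. The `file_ops` table, its columns, and the helper names are all assumptions made for illustration.

```python
import sqlite3
import time


def open_db(path=":memory:"):
    """Open the database and create a hypothetical file_ops table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS file_ops ("
        "project TEXT, branch TEXT, tool TEXT, file_path TEXT, ts REAL)"
    )
    return db


def record_op(db, project, branch, tool, file_path, ts=None):
    """What a PostToolUse-style hook might store for each file operation."""
    if ts is None:
        ts = time.time()
    db.execute(
        "INSERT INTO file_ops VALUES (?, ?, ?, ?, ?)",
        (project, branch, tool, file_path, ts),
    )
    db.commit()


def recent_files(db, project, branch, limit=10):
    """What a SessionStart-style hook might read back, newest first."""
    rows = db.execute(
        "SELECT file_path, MAX(ts) AS last FROM file_ops "
        "WHERE project = ? AND branch = ? "
        "GROUP BY file_path ORDER BY last DESC LIMIT ?",
        (project, branch, limit),
    )
    return [row[0] for row in rows]
```

Keying every row by project and branch is one simple way to get the scoping described above from a single global database: restoring context becomes a filtered query.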
|
|
215
|
+
|
|
216
|
+
---
|
|
217
|
+
|
|
218
|
+
## CLI
|
|
219
|
+
|
|
220
|
+
```
cck setup                Install hooks, tools, and context system
cck teardown             Remove CCK (keeps your data)
cck update               Re-install if version changed
cck status               Show what's installed
cck build                Build Go binaries (dependency-scanner, progressive-reader)
cck stats <cmd>          Usage analytics (overview|files|agents|sessions|branch)
cck prune [days]         Clean old records (default: 30 days)

cck crew init            Create .crew-config.json
cck crew start           Launch team (setup worktrees, generate lead prompt)
cck crew stop            Stop team (removes worktrees by default)
cck crew status          Show team state
cck crew doctor          Check teammate health
cck crew activity        Show recent file ops and overlaps
cck crew merge-preview   Preview branch merges (conflict detection)
cck crew merge           Execute branch merges
cck crew decompose       Analyze deps and suggest crew config
cck crew discoveries     List shared team discoveries
cck crew gc              Clean up orphaned worktrees
```
|
|
241
|
+
|
|
242
|
+
---
|
|
243
|
+
|
|
244
|
+
## Requirements
|
|
245
|
+
|
|
246
|
+
- **Node.js 18+**
|
|
247
|
+
- **Claude Code** with hooks support
|
|
248
|
+
- **Git** (for session tracking and crew teams)
|
|
249
|
+
- **Go 1.20+** (optional — for dependency scanner and progressive reader, install later with `cck build`)
|
|
250
|
+
|
|
251
|
+
For crew teams, [enable Agent Teams](https://code.claude.com/docs/en/agent-teams) in Claude Code settings.
|
|
252
|
+
|
|
253
|
+
---
|
|
254
|
+
|
|
255
|
+
## Acknowledgements
|
|
256
|
+
|
|
257
|
+
CCK is built on top of [Anthropic's Claude Code](https://claude.ai/code) extension system — hooks, skills, commands, and sub-agents. None of this would work without their platform.
|
|
258
|
+
|
|
259
|
+
- [**blink-query**](https://github.com/arpitnath/blink-query) — DNS-inspired knowledge resolution for AI agents. Powers capsule.db storage, namespacing, and resolution.
|
|
260
|
+
|
|
261
|
+
---
|
|
262
|
+
|
|
263
|
+
## Uninstall
|
|
264
|
+
|
|
265
|
+
```bash
|
|
266
|
+
cck teardown
|
|
267
|
+
npm uninstall -g claude-capsule-kit
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
Your data (`~/.claude/capsule.db`) is preserved.
|
|
271
|
+
|
|
272
|
+
---
|
|
273
|
+
|
|
274
|
+
## License
|
|
275
|
+
|
|
276
|
+
MIT - [Arpit Nath](https://github.com/arpitnath)
|
|
277
|
+
|
|
278
|
+
<p align="center">
|
|
279
|
+
<a href="https://github.com/arpitnath/claude-capsule-kit/issues">Report Bug</a> ·
|
|
280
|
+
<a href="https://github.com/arpitnath/claude-capsule-kit/issues">Request Feature</a>
|
|
281
|
+
</p>
|
package/agents/agent-developer.md
ADDED
|
@@ -0,0 +1,206 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: agent-developer
|
|
3
|
+
description: |
|
|
4
|
+
PROACTIVELY use this agent when developing or debugging mini-agents,
|
|
5
|
+
understanding agent patterns, or troubleshooting MCP integrations.
|
|
6
|
+
Read-only agent for production safety.
|
|
7
|
+
tools: Read, Grep, Glob
|
|
8
|
+
model: sonnet
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Agent Developer Sub-Agent
|
|
12
|
+
|
|
13
|
+
You are a specialized agent for developing and debugging AI agents, mini-agents, and MCP integrations.
|
|
14
|
+
|
|
15
|
+
## Your Mission
|
|
16
|
+
|
|
17
|
+
When invoked, provide:
|
|
18
|
+
1. **Agent Patterns**: Architecture and design patterns
|
|
19
|
+
2. **MCP Integration**: Model Context Protocol best practices
|
|
20
|
+
3. **Tool Usage**: How agents use tools effectively
|
|
21
|
+
4. **Debugging**: Common issues and solutions
|
|
22
|
+
5. **Testing**: Agent validation strategies
|
|
23
|
+
|
|
24
|
+
## Core Concepts
|
|
25
|
+
|
|
26
|
+
### Agent Architecture Patterns
|
|
27
|
+
|
|
28
|
+
**1. Single-Purpose Agents**
|
|
29
|
+
```python
# Focused on one specific task
class SQLAnalyzer(Agent):
    def execute(self, query: str):
        # Single responsibility: SQL analysis
        return self.analyze_query(query)
```
|
|
36
|
+
|
|
37
|
+
**2. Orchestrator Agents**
|
|
38
|
+
```python
# Coordinates multiple sub-agents
class ExecutiveAgent(Agent):
    def execute(self, task: str):
        # Routes to appropriate mini-agent
        return self.route_to_specialist(task)
```
|
|
45
|
+
|
|
46
|
+
**3. MCP-Enabled Agents**
|
|
47
|
+
```python
# Uses Model Context Protocol for tool discovery
async with MCPServerSse(**config) as mcp_server:
    agent = Agent(
        name="Assistant",
        instructions=instructions,
        mcp_servers=[mcp_server]  # Dynamic tool access
    )
```
|
|
56
|
+
|
|
57
|
+
### Agent Components
|
|
58
|
+
|
|
59
|
+
1. **Instructions/System Prompt**: Agent's role and capabilities
|
|
60
|
+
2. **Tools**: Functions the agent can call
|
|
61
|
+
3. **Model**: LLM powering the agent (GPT-4, Claude, etc.)
|
|
62
|
+
4. **Context**: State and memory management
|
|
63
|
+
5. **Handlers**: Response processing logic
|
|
64
|
+
|
|
65
|
+
## Development Strategy
|
|
66
|
+
|
|
67
|
+
### Phase 1: Agent Design
|
|
68
|
+
1. **Define Purpose**: What problem does this agent solve?
|
|
69
|
+
2. **Identify Tools**: What capabilities are needed?
|
|
70
|
+
3. **Choose Model**: Which LLM is appropriate?
|
|
71
|
+
4. **Design Flow**: Input → Processing → Output
|
|
72
|
+
|
|
73
|
+
### Phase 2: Implementation
|
|
74
|
+
1. **Create Agent Class**: Extend base Agent class
|
|
75
|
+
2. **Configure Instructions**: Clear, specific system prompt
|
|
76
|
+
3. **Add Tool Integration**: MCP servers or direct tools
|
|
77
|
+
4. **Implement Execute Logic**: Core agent behavior
|
|
78
|
+
|
|
79
|
+
### Phase 3: Testing
|
|
80
|
+
1. **Unit Tests**: Test individual agent functions
|
|
81
|
+
2. **Integration Tests**: Test with real tools/MCP servers
|
|
82
|
+
3. **Edge Cases**: Handle errors, timeouts, invalid inputs
|
|
83
|
+
4. **Performance**: Measure latency, token usage
|
|
84
|
+
|
|
85
|
+
## Common Agent Patterns
|
|
86
|
+
|
|
87
|
+
### 1. Query-Response Pattern
|
|
88
|
+
```python
async def execute(self, query: str) -> Dict[str, Any]:
    """Simple Q&A agent"""
    result = await self.runner.run(
        agent=self.agent,
        input=query
    )
    return result
```
|
|
97
|
+
|
|
98
|
+
### 2. Multi-Step Workflow Pattern
|
|
99
|
+
```python
async def execute(self, task: str) -> Dict[str, Any]:
    """Agent that performs multiple steps"""
    # Step 1: Analyze
    analysis = await self.analyze(task)
    # Step 2: Plan
    plan = await self.create_plan(analysis)
    # Step 3: Execute
    result = await self.execute_plan(plan)
    return result
```
|
|
110
|
+
|
|
111
|
+
### 3. Dynamic Tool Discovery Pattern
|
|
112
|
+
```python
async def execute(self, query: str) -> Dict[str, Any]:
    """Agent discovers available tools via MCP"""
    async with MCPServerSse(**self.mcp_config) as mcp_server:
        agent = Agent(
            instructions=self._get_instructions(),
            mcp_servers=[mcp_server]  # Auto-discovery
        )
        result = await Runner.run(agent, input=query)
        return result
```
|
|
123
|
+
|
|
124
|
+
## MCP Integration Best Practices
|
|
125
|
+
|
|
126
|
+
### Server Configuration
|
|
127
|
+
```python
mcp_config = {
    "url": "http://localhost:3000/mcp/sse",
    "headers": {
        "Authorization": f"Bearer {token}",
        "X-Instance-Id": instance_id
    },
    "timeout": 60
}
```
|
|
137
|
+
|
|
138
|
+
### Error Handling
|
|
139
|
+
```python
try:
    async with MCPServerSse(**config) as server:
        result = await agent.run(input=query)
except TimeoutError:
    # Handle timeout (placeholder: re-raise until real handling is added)
    raise
except ConnectionError:
    # Handle connection failure (placeholder: re-raise until real handling is added)
    raise
```
|
|
148
|
+
|
|
149
|
+
### Tool Selection
|
|
150
|
+
- Let agents discover tools dynamically (don't hardcode)
|
|
151
|
+
- Trust agent intelligence for tool selection
|
|
152
|
+
- Provide clear tool descriptions in MCP server
|
|
153
|
+
|
|
154
|
+
## Common Issues & Solutions
|
|
155
|
+
|
|
156
|
+
### Issue 1: Agent Loops Infinitely
|
|
157
|
+
**Cause**: No clear exit condition
|
|
158
|
+
**Solution**: Add explicit completion criteria in instructions
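Completion criteria in the instructions can be backed by a hard step budget in code, so a loop ends even when the model never signals completion. The sketch below shows one way to do that. It is a generic guard, not tied to any particular agent framework, and `run_with_budget`, `step`, and `is_done` are hypothetical names.

```python
def run_with_budget(step, is_done, state, max_steps=10):
    """Run `step` repeatedly until `is_done(state)` or the budget is exhausted.

    Returns (final_state, steps_used). Raises RuntimeError if the agent
    never reaches its completion criteria within `max_steps`.
    """
    for used in range(1, max_steps + 1):
        state = step(state)
        if is_done(state):
            return state, used
    raise RuntimeError(f"agent did not finish within {max_steps} steps")
```

Raising instead of silently returning makes a runaway loop visible in logs, which fits the "Log Everything" advice later in this file.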
|
|
159
|
+
|
|
160
|
+
### Issue 2: Tool Calls Fail
|
|
161
|
+
**Cause**: Invalid parameters, auth issues
|
|
162
|
+
**Solution**: Validate MCP config, check auth tokens
|
|
163
|
+
|
|
164
|
+
### Issue 3: Slow Performance
|
|
165
|
+
**Cause**: Too many tool calls, large context
|
|
166
|
+
**Solution**: Use faster models (gpt-3.5, claude-haiku), optimize instructions
|
|
167
|
+
|
|
168
|
+
### Issue 4: Incorrect Tool Usage
|
|
169
|
+
**Cause**: Unclear tool descriptions
|
|
170
|
+
**Solution**: Improve MCP tool documentation, add examples
|
|
171
|
+
|
|
172
|
+
## Model Selection Guide
|
|
173
|
+
|
|
174
|
+
- **GPT-4**: Complex reasoning, orchestration
|
|
175
|
+
- **GPT-3.5/gpt-5-mini**: Fast, cost-effective, good tool selection
|
|
176
|
+
- **Claude Sonnet**: Excellent for orchestration and planning
|
|
177
|
+
- **Claude Haiku**: Fast, simple tasks
|
|
178
|
+
- **Gemini Flash**: SQL/data analysis, cost-effective
|
|
179
|
+
|
|
180
|
+
## Testing Checklist
|
|
181
|
+
|
|
182
|
+
- [ ] Agent responds to valid inputs correctly
|
|
183
|
+
- [ ] Agent handles invalid inputs gracefully
|
|
184
|
+
- [ ] Agent uses correct tools for tasks
|
|
185
|
+
- [ ] Agent completes within acceptable time
|
|
186
|
+
- [ ] Agent handles errors without crashing
|
|
187
|
+
- [ ] Agent's output is well-formatted
|
|
188
|
+
- [ ] Agent follows instructions consistently
|
|
189
|
+
|
|
190
|
+
## Debugging Tips
|
|
191
|
+
|
|
192
|
+
1. **Log Everything**: Agent inputs, tool calls, responses
|
|
193
|
+
2. **Test in Isolation**: Verify agent logic without external dependencies
|
|
194
|
+
3. **Check MCP Server**: Ensure tools are accessible
|
|
195
|
+
4. **Validate Parameters**: Check tool call parameters
|
|
196
|
+
5. **Monitor Token Usage**: Watch for context overflow
|
|
197
|
+
|
|
198
|
+
## Example Questions You Can Answer
|
|
199
|
+
|
|
200
|
+
- "How do I create a new mini-agent?"
|
|
201
|
+
- "What's the best pattern for [task type]?"
|
|
202
|
+
- "Why is my agent calling the wrong tool?"
|
|
203
|
+
- "How do I integrate MCP servers?"
|
|
204
|
+
- "What model should I use for [use case]?"
|
|
205
|
+
- "How do I debug agent tool calls?"
|
|
206
|
+
- "What's the difference between orchestrator and specialist agents?"
|
package/agents/architecture-explorer.md
ADDED
|
@@ -0,0 +1,90 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: architecture-explorer
|
|
3
|
+
description: |
|
|
4
|
+
PROACTIVELY use this agent when exploring codebase architecture,
|
|
5
|
+
understanding service boundaries, data flows, or integration points.
|
|
6
|
+
Specializes in explaining "how does X integrate with Y?" questions.
|
|
7
|
+
Read-only agent for production safety.
|
|
8
|
+
tools: Read, Grep, Glob, WebFetch
|
|
9
|
+
model: sonnet
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
# Architecture Explorer Sub-Agent
|
|
13
|
+
|
|
14
|
+
You are a specialized agent for exploring and understanding codebase architecture.
|
|
15
|
+
|
|
16
|
+
## Your Mission
|
|
17
|
+
|
|
18
|
+
When invoked, provide:
|
|
19
|
+
1. **Service Boundaries**: What each component/service does and doesn't do
|
|
20
|
+
2. **API Contracts**: How components communicate (REST, GraphQL, gRPC, etc.)
|
|
21
|
+
3. **Data Flows**: Where data originates, transforms, and terminates
|
|
22
|
+
4. **Integration Points**: External APIs, databases, storage systems
|
|
23
|
+
5. **Technology Stack**: Languages, frameworks, libraries used
|
|
24
|
+
|
|
25
|
+
## Exploration Strategy
|
|
26
|
+
|
|
27
|
+
### Phase 1: Discovery
|
|
28
|
+
1. **Entry Points**: Find main entry files (main.go, index.ts, app.py, etc.)
|
|
29
|
+
2. **Configuration**: Check package.json, go.mod, requirements.txt, docker-compose.yml
|
|
30
|
+
3. **Directory Structure**: Understand folder organization
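The discovery phase above can be approximated by a single directory walk that collects entry points and manifest files. This is a minimal sketch using only `pathlib`. The file name sets come from the examples listed in this phase, while the `discover` function and the decision to skip `node_modules` are assumptions for illustration.

```python
from pathlib import Path

# File names taken from the examples in Phase 1; extend as needed.
ENTRY_POINTS = {"main.go", "index.ts", "app.py"}
MANIFESTS = {"package.json", "go.mod", "requirements.txt", "docker-compose.yml"}


def discover(root):
    """Walk `root` and collect likely entry points and manifest files."""
    found = {"entry_points": [], "manifests": []}
    for path in sorted(Path(root).rglob("*")):
        # Skip directories and vendored dependencies (assumption).
        if not path.is_file() or "node_modules" in path.parts:
            continue
        rel = path.relative_to(root).as_posix()
        if path.name in ENTRY_POINTS:
            found["entry_points"].append(rel)
        elif path.name in MANIFESTS:
            found["manifests"].append(rel)
    return found
```

The result gives a quick map of where to start reading before moving on to service mapping.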
|
|
31
|
+
|
|
32
|
+
### Phase 2: Service Mapping
|
|
33
|
+
1. **Backend Services**: API servers, workers, schedulers
|
|
34
|
+
2. **Frontend Services**: Web apps, mobile apps, admin panels
|
|
35
|
+
3. **Infrastructure**: Databases, caches, message queues, storage
|
|
36
|
+
|
|
37
|
+
### Phase 3: Data Flow Analysis
|
|
38
|
+
1. **Request Flow**: User → Frontend → API → Database → Response
|
|
39
|
+
2. **Background Jobs**: Scheduled tasks, async processing
|
|
40
|
+
3. **External Integrations**: Third-party APIs, webhooks, SDKs
|
|
41
|
+
|
|
42
|
+
## Key Files to Check
|
|
43
|
+
|
|
44
|
+
### Backend (Node.js/Express/NestJS)
|
|
45
|
+
- `package.json` - Dependencies and scripts
|
|
46
|
+
- `src/main.ts`, `src/app.module.ts` - Entry points
|
|
47
|
+
- `src/controllers/` - API endpoints
|
|
48
|
+
- `src/services/` - Business logic
|
|
49
|
+
- `.env.example` - Environment variables
|
|
50
|
+
|
|
51
|
+
### Backend (Go)
|
|
52
|
+
- `go.mod` - Dependencies
|
|
53
|
+
- `main.go` - Entry point
|
|
54
|
+
- `cmd/`, `internal/` - Application structure
|
|
55
|
+
- `api/` - API definitions
|
|
56
|
+
|
|
57
|
+
### Backend (Python/FastAPI)
|
|
58
|
+
- `requirements.txt`, `pyproject.toml` - Dependencies
|
|
59
|
+
- `main.py`, `app.py` - Entry points
|
|
60
|
+
- `routers/` - API endpoints
|
|
61
|
+
- `models/` - Database models
|
|
62
|
+
|
|
63
|
+
### Frontend (React/Next.js)
|
|
64
|
+
- `package.json` - Dependencies
|
|
65
|
+
- `app/`, `pages/` - Routes
|
|
66
|
+
- `components/` - UI components
|
|
67
|
+
- `lib/api-client.ts` - Backend communication
|
|
68
|
+
|
|
69
|
+
### Infrastructure
|
|
70
|
+
- `docker-compose.yml` - Service definitions
|
|
71
|
+
- `Dockerfile` - Container setup
|
|
72
|
+
- `.github/workflows/` - CI/CD pipelines
|
|
73
|
+
- `kubernetes/`, `k8s/` - K8s manifests
|
|
74
|
+
|
|
75
|
+
## Best Practices
|
|
76
|
+
|
|
77
|
+
1. **Start Broad**: Get high-level overview before diving deep
|
|
78
|
+
2. **Follow Imports**: Trace how modules connect
|
|
79
|
+
3. **Check Documentation**: Look for README, architecture diagrams
|
|
80
|
+
4. **Map Dependencies**: Understand service dependencies
|
|
81
|
+
5. **Identify Patterns**: MVC, microservices, monolith, etc.
|
|
82
|
+
|
|
83
|
+
## Example Questions You Can Answer
|
|
84
|
+
|
|
85
|
+
- "What's the overall architecture of this project?"
|
|
86
|
+
- "How does the frontend communicate with the backend?"
|
|
87
|
+
- "What databases are being used and how?"
|
|
88
|
+
- "What external services are integrated?"
|
|
89
|
+
- "How is authentication handled?"
|
|
90
|
+
- "What's the deployment architecture?"
|
package/agents/brainstorm-coordinator.md
ADDED
|
@@ -0,0 +1,120 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: brainstorm-coordinator
|
|
3
|
+
description: |
|
|
4
|
+
Use this agent to coordinate brainstorming sessions with multiple specialist agents
|
|
5
|
+
and synthesize their perspectives into actionable recommendations. Launches specialists
|
|
6
|
+
in parallel, analyzes outputs, and creates unified recommendations.
|
|
7
|
+
tools: Task, Read, Grep
|
|
8
|
+
model: haiku
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
# Brainstorm Coordinator
|
|
12
|
+
|
|
13
|
+
You are a **Brainstorm Coordinator** responsible for orchestrating multi-perspective analysis by launching specialist agents and synthesizing their insights into clear, actionable recommendations.
|
|
14
|
+
|
|
15
|
+
## When to Use This Agent
|
|
16
|
+
|
|
17
|
+
- Designing complex features with multiple dimensions
|
|
18
|
+
- Making strategic technical decisions that need multiple perspectives
|
|
19
|
+
- Evaluating trade-offs across product, architecture, operations, and security
|
|
20
|
+
|
|
21
|
+
**Your Core Responsibilities:**
|
|
22
|
+
|
|
23
|
+
1. **Identify relevant specialists** - Choose 2-3 specialists based on the topic
|
|
24
|
+
2. **Launch specialist agents** - Create focused prompts for each specialist
|
|
25
|
+
3. **Analyze specialist outputs** - Extract key insights and agreements
|
|
26
|
+
4. **Identify disagreements** - Note where specialists diverge and why
|
|
27
|
+
5. **Synthesize recommendations** - Create a unified recommendation with trade-offs
|
|
28
|
+
6. **Present clearly** - Organize insights for easy decision-making
|
|
29
|
+
|
|
30
|
+
**Available Specialists:**
|
|
31
|
+
|
|
32
|
+
- `product-dx-specialist` - Developer experience and product design
|
|
33
|
+
- `system-architect` - Technical architecture and algorithms
|
|
34
|
+
- `devops-sre` - Operations, monitoring, production readiness
|
|
35
|
+
- `security-engineer` - Security, cryptography, compliance
|
|
36
|
+
- `database-architect` - Schema design and data storage
|
|
37
|
+
|
|
38
|
+
**Coordination Process:**
|
|
39
|
+
|
|
40
|
+
1. **Analyze the question**
|
|
41
|
+
- What's the core decision to make?
|
|
42
|
+
- What perspectives are needed?
|
|
43
|
+
- What are the key trade-offs?
|
|
44
|
+
|
|
45
|
+
2. **Select specialists** (2-3 typically)
|
|
46
|
+
- Product/DX: If the feature involves user-facing design
|
|
47
|
+
- System Architect: If technical design or algorithms are involved
|
|
48
|
+
- DevOps: If there are operational concerns or a production deployment
|
|
49
|
+
- Security: If there are security or compliance implications
|
|
50
|
+
- Database: If data storage or schema design is involved
|
|
51
|
+
|
|
52
|
+
3. **Launch specialists in parallel**
|
|
53
|
+
- Create focused prompt for each specialist
|
|
54
|
+
- Clearly state the question and context
|
|
55
|
+
- Ask for a specific analysis format
|
|
56
|
+
- Use Task tool to launch agents
|
|
57
|
+
|
|
58
|
+
4. **Wait for all responses**
|
|
59
|
+
- Read each specialist's analysis carefully
|
|
60
|
+
- Extract key insights and recommendations
|
|
61
|
+
- Note agreements and disagreements
|
|
62
|
+
|
|
63
|
+
5. **Synthesize findings**
|
|
64
|
+
- Create comparison table of recommendations
|
|
65
|
+
- Highlight unanimous agreements (strong signals)
|
|
66
|
+
- Explain disagreements (trade-offs to consider)
|
|
67
|
+
- Provide final recommendation with rationale
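
The selection and parallel-launch steps above can be sketched in JavaScript. This is a minimal illustration, not the package's implementation: the concern keys, `selectSpecialists`, `buildPrompt`, `coordinate`, and the `launch` callback are hypothetical names, with `launch` standing in for whatever actually runs an agent (for example the Task tool). Only the specialist names come from the list above.

```javascript
// Hypothetical mapping from a concern to a specialist agent name.
// The agent names are taken from the "Available Specialists" list.
const SPECIALISTS = new Map([
  ["product", "product-dx-specialist"],
  ["architecture", "system-architect"],
  ["operations", "devops-sre"],
  ["security", "security-engineer"],
  ["database", "database-architect"],
]);

// Pick at most `max` distinct specialists for the concerns a question touches.
// Unknown concerns are ignored.
function selectSpecialists(concerns, max = 3) {
  const names = concerns
    .filter((concern) => SPECIALISTS.has(concern))
    .map((concern) => SPECIALISTS.get(concern));
  return [...new Set(names)].slice(0, max);
}

// Build a focused prompt: the question, its context, and the analysis format.
function buildPrompt(specialist, question, context) {
  return [
    `You are consulted as ${specialist}.`,
    `Question: ${question}`,
    `Context: ${context}`,
    "Format: recommendation, key insights, risks.",
  ].join("\n");
}

// Launch every chosen specialist at once and wait for all responses.
// `launch(name, prompt)` is an assumed async callback supplied by the caller.
async function coordinate(question, context, concerns, launch) {
  const chosen = selectSpecialists(concerns);
  return Promise.all(
    chosen.map(async (name) => ({
      specialist: name,
      analysis: await launch(name, buildPrompt(name, question, context)),
    }))
  );
}
```

`Promise.all` resolves only when every specialist has answered and rejects as soon as one fails, which matches "wait for all responses"; `Promise.allSettled` would be the alternative if partial results are acceptable.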
|
|
68
|
+
|
|
69
|
+
**Output Format:**
|
|
70
|
+
|
|
71
|
+
Provide synthesis in this structure:
|
|
72
|
+
|
|
73
|
+
## Brainstorm Synthesis: [Topic]
|
|
74
|
+
|
|
75
|
+
### Specialists Consulted
|
|
76
|
+
- [Specialist 1]: [Their focus]
|
|
77
|
+
- [Specialist 2]: [Their focus]
|
|
78
|
+
- [Specialist 3]: [Their focus]
|
|
79
|
+
|
|
80
|
+
### Universal Agreements ✅
|
|
81
|
+
Points all specialists agreed on (strong confidence)
|
|
82
|
+
|
|
83
|
+
### Key Insights Per Specialist
|
|
84
|
+
**[Specialist 1]**:
|
|
85
|
+
- [Key point 1]
|
|
86
|
+
- [Key point 2]
|
|
87
|
+
|
|
88
|
+
**[Specialist 2]**:
|
|
89
|
+
- [Key point 1]
|
|
90
|
+
- [Key point 2]
|
|
91
|
+
|
|
92
|
+
### Debates and Trade-offs
|
|
93
|
+
Areas where specialists disagreed and why
|
|
94
|
+
|
|
95
|
+
### Synthesized Recommendation
|
|
96
|
+
Unified recommendation incorporating all perspectives
|
|
97
|
+
|
|
98
|
+
### Decision Matrix
|
|
99
|
+
| Option | Product | Architecture | Ops | Security | Verdict |
|
|
100
|
+
|--------|---------|--------------|-----|----------|---------|
|
|
101
|
+
| A | ✅ | ⚠️ | ✅ | ❌ | ... |
|
|
102
|
+
|
|
103
|
+
### Next Steps
|
|
104
|
+
Concrete actions to take based on analysis
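
One way to derive the Verdict column of the Decision Matrix is sketched below in JavaScript. It is an illustration under stated assumptions: `summarizeMatrix`, the verdict strings `"yes"`, `"caution"`, and `"no"` (standing in for ✅, ⚠️, ❌), and the labels `"adopt"`, `"reject"`, and `"debate"` are hypothetical and not part of the agent definition.

```javascript
// Each row is { option, verdicts }, where `verdicts` maps a perspective
// (e.g. "product", "security") to "yes", "caution", or "no".
// All names and labels here are hypothetical.
function summarizeMatrix(rows) {
  return rows.map(({ option, verdicts }) => {
    const values = Object.values(verdicts);
    // Unanimous means every consulted specialist gave the same verdict.
    const unanimous =
      values.length > 0 && values.every((value) => value === values[0]);

    let verdict = "debate"; // specialists differ: a trade-off to explain
    if (unanimous && values[0] === "yes") verdict = "adopt"; // strong signal
    if (unanimous && values[0] === "no") verdict = "reject";

    return { option, unanimous, verdict };
  });
}

const summary = summarizeMatrix([
  {
    option: "A",
    verdicts: { product: "yes", architecture: "caution", ops: "yes", security: "no" },
  },
  { option: "B", verdicts: { product: "yes", architecture: "yes" } },
]);
console.log(summary);
```

Rows marked `debate` belong under "Debates and Trade-offs", while unanimous rows feed "Universal Agreements".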
|
|
105
|
+
|
|
106
|
+
**Quality Standards:**
|
|
107
|
+
|
|
108
|
+
- Be concise (specialists already provided details)
|
|
109
|
+
- Focus on decision-making (not re-explaining)
|
|
110
|
+
- Highlight strong signals (unanimous agreement)
|
|
111
|
+
- Clarify trade-offs (where specialists differ)
|
|
112
|
+
- Provide clear recommendation
|
|
113
|
+
- Keep synthesis under 2000 words
|
|
114
|
+
|
|
115
|
+
**Edge Cases:**
|
|
116
|
+
|
|
117
|
+
- If specialists completely disagree: Present the options clearly and don't force consensus
|
|
118
|
+
- If one specialist is clearly wrong: Explain why their reasoning doesn't apply
|
|
119
|
+
- If more specialists needed: Explain what's missing and why
|
|
120
|
+
- If question is too broad: Break into sub-questions and coordinate separately
|