@vibe-hero/server 0.1.0
- package/LICENSE +190 -0
- package/README.md +151 -0
- package/dist/catalog/bundled/claude-code/.gitkeep +0 -0
- package/dist/catalog/bundled/claude-code/context-management.yaml +302 -0
- package/dist/catalog/bundled/claude-code/planning.yaml +313 -0
- package/dist/catalog/bundled/claude-code/subagents.yaml +357 -0
- package/dist/catalog/bundled/general/.gitkeep +0 -0
- package/dist/catalog/bundled/general/_placeholder.yaml +39 -0
- package/dist/catalog/bundled/general/task-decomposition.yaml +390 -0
- package/dist/catalog/bundled/index.d.ts +39 -0
- package/dist/catalog/bundled/index.d.ts.map +1 -0
- package/dist/catalog/bundled/index.js +41 -0
- package/dist/catalog/bundled/index.js.map +1 -0
- package/dist/catalog/fetcher.d.ts +201 -0
- package/dist/catalog/fetcher.d.ts.map +1 -0
- package/dist/catalog/fetcher.js +452 -0
- package/dist/catalog/fetcher.js.map +1 -0
- package/dist/catalog/loader.d.ts +165 -0
- package/dist/catalog/loader.d.ts.map +1 -0
- package/dist/catalog/loader.js +241 -0
- package/dist/catalog/loader.js.map +1 -0
- package/dist/catalog/resolve.d.ts +85 -0
- package/dist/catalog/resolve.d.ts.map +1 -0
- package/dist/catalog/resolve.js +103 -0
- package/dist/catalog/resolve.js.map +1 -0
- package/dist/cli/getOffer.d.ts +38 -0
- package/dist/cli/getOffer.d.ts.map +1 -0
- package/dist/cli/getOffer.js +150 -0
- package/dist/cli/getOffer.js.map +1 -0
- package/dist/cli/index.d.ts +46 -0
- package/dist/cli/index.d.ts.map +1 -0
- package/dist/cli/index.js +88 -0
- package/dist/cli/index.js.map +1 -0
- package/dist/config.d.ts +34 -0
- package/dist/config.d.ts.map +1 -0
- package/dist/config.js +63 -0
- package/dist/config.js.map +1 -0
- package/dist/engine/elo.d.ts +76 -0
- package/dist/engine/elo.d.ts.map +1 -0
- package/dist/engine/elo.js +79 -0
- package/dist/engine/elo.js.map +1 -0
- package/dist/engine/graduation.d.ts +108 -0
- package/dist/engine/graduation.d.ts.map +1 -0
- package/dist/engine/graduation.js +161 -0
- package/dist/engine/graduation.js.map +1 -0
- package/dist/engine/lapse.d.ts +80 -0
- package/dist/engine/lapse.d.ts.map +1 -0
- package/dist/engine/lapse.js +125 -0
- package/dist/engine/lapse.js.map +1 -0
- package/dist/engine/selection.d.ts +84 -0
- package/dist/engine/selection.d.ts.map +1 -0
- package/dist/engine/selection.js +119 -0
- package/dist/engine/selection.js.map +1 -0
- package/dist/grading/deterministic.d.ts +102 -0
- package/dist/grading/deterministic.d.ts.map +1 -0
- package/dist/grading/deterministic.js +118 -0
- package/dist/grading/deterministic.js.map +1 -0
- package/dist/grading/freeform.d.ts +64 -0
- package/dist/grading/freeform.d.ts.map +1 -0
- package/dist/grading/freeform.js +85 -0
- package/dist/grading/freeform.js.map +1 -0
- package/dist/index.d.ts +52 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +91 -0
- package/dist/index.js.map +1 -0
- package/dist/observation/hookEvents.d.ts +113 -0
- package/dist/observation/hookEvents.d.ts.map +1 -0
- package/dist/observation/hookEvents.js +170 -0
- package/dist/observation/hookEvents.js.map +1 -0
- package/dist/observation/offers.d.ts +215 -0
- package/dist/observation/offers.d.ts.map +1 -0
- package/dist/observation/offers.js +327 -0
- package/dist/observation/offers.js.map +1 -0
- package/dist/observation/source.d.ts +133 -0
- package/dist/observation/source.d.ts.map +1 -0
- package/dist/observation/source.js +105 -0
- package/dist/observation/source.js.map +1 -0
- package/dist/profile/migrate.d.ts +122 -0
- package/dist/profile/migrate.d.ts.map +1 -0
- package/dist/profile/migrate.js +147 -0
- package/dist/profile/migrate.js.map +1 -0
- package/dist/profile/store.d.ts +84 -0
- package/dist/profile/store.d.ts.map +1 -0
- package/dist/profile/store.js +267 -0
- package/dist/profile/store.js.map +1 -0
- package/dist/schemas/common.d.ts +95 -0
- package/dist/schemas/common.d.ts.map +1 -0
- package/dist/schemas/common.js +106 -0
- package/dist/schemas/common.js.map +1 -0
- package/dist/schemas/content.d.ts +828 -0
- package/dist/schemas/content.d.ts.map +1 -0
- package/dist/schemas/content.js +219 -0
- package/dist/schemas/content.js.map +1 -0
- package/dist/schemas/profile.d.ts +599 -0
- package/dist/schemas/profile.d.ts.map +1 -0
- package/dist/schemas/profile.js +177 -0
- package/dist/schemas/profile.js.map +1 -0
- package/dist/schemas/tools.d.ts +1581 -0
- package/dist/schemas/tools.d.ts.map +1 -0
- package/dist/schemas/tools.js +286 -0
- package/dist/schemas/tools.js.map +1 -0
- package/dist/tools/config.d.ts +51 -0
- package/dist/tools/config.d.ts.map +1 -0
- package/dist/tools/config.js +104 -0
- package/dist/tools/config.js.map +1 -0
- package/dist/tools/gate.d.ts +50 -0
- package/dist/tools/gate.d.ts.map +1 -0
- package/dist/tools/gate.js +67 -0
- package/dist/tools/gate.js.map +1 -0
- package/dist/tools/guidance.d.ts +36 -0
- package/dist/tools/guidance.d.ts.map +1 -0
- package/dist/tools/guidance.js +117 -0
- package/dist/tools/guidance.js.map +1 -0
- package/dist/tools/listTopics.d.ts +55 -0
- package/dist/tools/listTopics.d.ts.map +1 -0
- package/dist/tools/listTopics.js +78 -0
- package/dist/tools/listTopics.js.map +1 -0
- package/dist/tools/offers.d.ts +60 -0
- package/dist/tools/offers.d.ts.map +1 -0
- package/dist/tools/offers.js +152 -0
- package/dist/tools/offers.js.map +1 -0
- package/dist/tools/placeholders.d.ts +27 -0
- package/dist/tools/placeholders.d.ts.map +1 -0
- package/dist/tools/placeholders.js +49 -0
- package/dist/tools/placeholders.js.map +1 -0
- package/dist/tools/recordObservation.d.ts +52 -0
- package/dist/tools/recordObservation.d.ts.map +1 -0
- package/dist/tools/recordObservation.js +87 -0
- package/dist/tools/recordObservation.js.map +1 -0
- package/dist/tools/startQuiz.d.ts +82 -0
- package/dist/tools/startQuiz.d.ts.map +1 -0
- package/dist/tools/startQuiz.js +180 -0
- package/dist/tools/startQuiz.js.map +1 -0
- package/dist/tools/status.d.ts +59 -0
- package/dist/tools/status.d.ts.map +1 -0
- package/dist/tools/status.js +133 -0
- package/dist/tools/status.js.map +1 -0
- package/dist/tools/submitAnswer.d.ts +156 -0
- package/dist/tools/submitAnswer.d.ts.map +1 -0
- package/dist/tools/submitAnswer.js +402 -0
- package/dist/tools/submitAnswer.js.map +1 -0
- package/dist/tools/types.d.ts +82 -0
- package/dist/tools/types.d.ts.map +1 -0
- package/dist/tools/types.js +48 -0
- package/dist/tools/types.js.map +1 -0
- package/dist/tools/us2/standing.d.ts +111 -0
- package/dist/tools/us2/standing.d.ts.map +1 -0
- package/dist/tools/us2/standing.js +143 -0
- package/dist/tools/us2/standing.js.map +1 -0
- package/package.json +62 -0
@@ -0,0 +1,313 @@
+# Topic: planning (claude-code)
+#
+# Covers Claude Code's planning mode (ExitPlanMode), the TodoWrite/TodoRead
+# task list as a planning primitive, when to plan vs. act, and how to
+# structure multi-step work. Tiers 100–500.
+
+id: planning
+class:
+  kind: tool
+  tool: claude-code
+title: Planning Mode & Task Management
+summary: >-
+  Using Claude Code's plan mode and the TodoWrite task list to structure complex
+  work — when to plan before acting, how to break tasks down, and how to
+  maintain progress across a long session.
+
+triggerSignals:
+  - tool: claude-code
+    match:
+      toolName: ExitPlanMode
+    weight: 1
+  - tool: claude-code
+    match:
+      toolName: TodoWrite
+    weight: 0.9
+  - tool: claude-code
+    match:
+      toolName: TodoRead
+    weight: 0.7
+  - tool: claude-code
+    match:
+      toolNamePattern: "^ExitPlanMode$"
+    weight: 1
+
+items:
+  # ── Tier 100 — Remember ──────────────────────────────────────────────────
+  - id: planning-100-mc-a
+    tier: 100
+    bloom: remember
+    difficulty: 150
+    type: multiple_choice
+    prompt: >-
+      What does Claude Code's "plan mode" mean in practice?
+    choices:
+      - id: a
+        text: >-
+          The model runs a special subprocess that generates a plan file
+      - id: b
+        text: >-
+          Claude Code reads files and reasons about a task without executing
+          any changes, then awaits approval before acting
+      - id: c
+        text: >-
+          A scheduled background job that plans the next day's work
+      - id: d
+        text: >-
+          A restricted mode where only Bash commands are allowed
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      In plan mode, Claude Code explores the problem space (reading files,
+      thinking through the approach) without making any edits or running
+      commands. It presents the plan for user review. Calling ExitPlanMode
+      signals that planning is complete and implementation can begin.
+
+  - id: planning-100-mc-b
+    tier: 100
+    bloom: remember
+    difficulty: 160
+    type: multiple_choice
+    prompt: >-
+      Which tool call signals that Claude Code is done planning and ready to
+      begin implementing?
+    choices:
+      - id: a
+        text: TodoWrite with status "in_progress"
+      - id: b
+        text: Bash with "start"
+      - id: c
+        text: ExitPlanMode
+      - id: d
+        text: Read with the implementation file path
+    answerKey:
+      kind: choice
+      correctChoiceId: c
+    guidance: >-
+      ExitPlanMode is the explicit tool call that ends the planning phase and
+      transitions Claude Code into implementation. Observing this call in the
+      transcript is a strong signal that the agent has finished scoping the
+      work and is about to start making changes.
+
+  # ── Tier 200 — Understand ───────────────────────────────────────────────
+  - id: planning-200-mc-a
+    tier: 200
+    bloom: understand
+    difficulty: 250
+    type: multiple_choice
+    prompt: >-
+      Why should you mark a TodoWrite task as "in_progress" before starting it,
+      rather than marking it "completed" only when you finish?
+    choices:
+      - id: a
+        text: >-
+          The MCP server requires it to track billing
+      - id: b
+        text: >-
+          Marking in_progress signals to the user that work is happening and
+          creates a recovery checkpoint if the session is interrupted
+      - id: c
+        text: >-
+          It prevents other agents from picking up the same task
+      - id: d
+        text: >-
+          TodoWrite ignores status unless you use in_progress first
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      The in_progress status serves two purposes: it shows the user that the
+      agent is actively working on something (not silently idle), and it leaves
+      a clear recovery marker if context is compacted or the session is
+      interrupted mid-task. If you only mark completed at the end, a
+      compaction event could erase all progress signals.
+
+  - id: planning-200-sa-a
+    tier: 200
+    bloom: understand
+    difficulty: 260
+    type: short_answer
+    prompt: >-
+      What are the three valid status values for a task in the Claude Code
+      Todo list (TodoWrite)?
+    answerKey:
+      kind: keyword
+      anyOf:
+        - "pending, in_progress, completed"
+        - "pending in_progress completed"
+        - "pending"
+        - "in_progress"
+        - "completed"
+      normalize: lower
+    guidance: >-
+      The three Todo task statuses are: `pending` (not yet started),
+      `in_progress` (currently being worked on), and `completed` (done).
+      Transitioning through these statuses as you work lets both the user and
+      future context reconstruction understand what has been accomplished.
+
+  # ── Tier 300 — Apply ────────────────────────────────────────────────────
+  - id: planning-300-mc-a
+    tier: 300
+    bloom: apply
+    difficulty: 350
+    type: multiple_choice
+    prompt: >-
+      A user asks you to implement a new API endpoint. This requires: (1) adding
+      a route, (2) writing a handler, (3) adding a test, and (4) updating docs.
+      In what order should you use TodoWrite?
+    choices:
+      - id: a
+        text: >-
+          Write all four tasks at once before starting any of them; then mark
+          each in_progress → completed as you go
+      - id: b
+        text: >-
+          Write one task, complete it, then write the next task
+      - id: c
+        text: >-
+          Write all tasks and immediately mark them all completed to show the
+          full plan
+      - id: d
+        text: >-
+          Skip TodoWrite — four steps is small enough to hold in context
+    answerKey:
+      kind: choice
+      correctChoiceId: a
+    guidance: >-
+      The right pattern is to write all tasks upfront (creating the full
+      checklist), then work through them one at a time: mark each in_progress
+      before starting, completed when done. Writing tasks one at a time loses
+      the upfront clarity; marking all completed immediately is dishonest.
+      Even four tasks benefits from explicit tracking — sessions can be
+      interrupted.
+
+  - id: planning-300-mc-b
+    tier: 300
+    bloom: apply
+    difficulty: 360
+    type: multiple_choice
+    prompt: >-
+      You are in plan mode researching a complex feature request. You have read
+      three files. The user has NOT yet approved your plan. Should you call
+      ExitPlanMode and start editing?
+    choices:
+      - id: a
+        text: >-
+          Yes — you have enough information to proceed
+      - id: b
+        text: >-
+          Yes — plan mode is just a suggestion, not a gate
+      - id: c
+        text: >-
+          No — ExitPlanMode should only be called after presenting the plan
+          and receiving user confirmation to proceed
+      - id: d
+        text: >-
+          No — you must read every file in the repo before exiting plan mode
+    answerKey:
+      kind: choice
+      correctChoiceId: c
+    guidance: >-
+      Plan mode is a human-in-the-loop gate. Its purpose is to surface the
+      proposed approach for review before any changes are made. Calling
+      ExitPlanMode without user confirmation undermines that gate. The correct
+      flow is: explore → present plan → wait for approval → ExitPlanMode →
+      implement.
+
+  # ── Tier 400 — Analyze ──────────────────────────────────────────────────
+  - id: planning-400-mc-a
+    tier: 400
+    bloom: analyze
+    difficulty: 430
+    type: multiple_choice
+    prompt: >-
+      Mid-way through a large implementation, context is compacted. You had five
+      Todo tasks; two were completed. What should you do immediately after
+      compaction to recover correctly?
+    choices:
+      - id: a
+        text: >-
+          Restart the entire task from scratch — compaction invalidates all
+          previous work
+      - id: b
+        text: >-
+          Call TodoRead to inspect the current task list, confirm which tasks
+          are completed, then resume from the first pending/in_progress task
+      - id: c
+        text: >-
+          Trust that the summary captured everything and continue from where
+          you think you left off
+      - id: d
+        text: >-
+          Ask the user to list what was done
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      The Todo list is the canonical state that survives compaction — it is
+      persisted separately from the conversation history. After compaction,
+      call TodoRead first to see what is completed vs. pending, then continue
+      from the first incomplete task. "Trusting the summary" risks duplicating
+      completed work or skipping tasks the summary glossed over.
+
+  - id: planning-400-sa-a
+    tier: 400
+    bloom: analyze
+    difficulty: 440
+    type: short_answer
+    prompt: >-
+      A task is straightforward and can be completed in a single tool call.
+      Should you use TodoWrite for it? Give the principle behind your answer.
+    answerKey:
+      kind: keyword
+      anyOf:
+        - "no"
+        - "not necessary"
+        - "unnecessary"
+        - "simple tasks"
+        - "single step"
+        - "overhead"
+      normalize: lower
+    guidance: >-
+      TodoWrite adds value for multi-step or interruptible work. For a single,
+      simple, instantly-completable task it is unnecessary overhead — the tool
+      call costs context without providing a meaningful recovery checkpoint or
+      progress signal. Apply TodoWrite when the work has meaningful sub-steps,
+      spans multiple files, or could be interrupted.
+
+  # ── Tier 500 — Evaluate ─────────────────────────────────────────────────
+  - id: planning-500-mc-a
+    tier: 500
+    bloom: evaluate
+    difficulty: 480
+    type: multiple_choice
+    prompt: >-
+      A teammate's workflow always enters plan mode for every task, even trivial
+      one-liner fixes. Evaluate this approach and identify the main cost.
+    choices:
+      - id: a
+        text: >-
+          Good discipline — plan mode prevents all implementation mistakes
+      - id: b
+        text: >-
+          Acceptable — the overhead is negligible because plan mode is free
+      - id: c
+        text: >-
+          Over-engineering: for trivial changes plan mode adds a round-trip
+          latency and a user-confirmation burden with no commensurate reduction
+          in risk
+      - id: d
+        text: >-
+          Dangerous — plan mode can modify files if not used carefully
+    answerKey:
+      kind: choice
+      correctChoiceId: c
+    guidance: >-
+      Plan mode is most valuable for non-trivial changes where the approach
+      is not obvious and mistakes are costly. Applying it to trivial fixes
+      (renaming a variable, fixing a typo) imposes a confirmation round-trip
+      and interrupts flow without meaningful risk reduction. The judgment call
+      is: does the complexity/risk of this change warrant a planning gate? If
+      not, proceed directly. Good tooling use requires calibrating tool
+      overhead against the value it delivers.
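Reviewer note on the hunk above: the planning items describe one workflow in several places (write the full checklist up front, move each task through `pending`, `in_progress`, `completed`, and after an interruption resume from the first incomplete task). A minimal TypeScript sketch of that workflow follows. The `Todo` type and the `createTodos`, `setStatus`, and `nextTodo` helpers are hypothetical names chosen for illustration; they are not part of this package or of Claude Code's TodoWrite/TodoRead tools, and only the three status values come from the catalog text.

```typescript
// Hypothetical model of a TodoWrite-style task list. These types and helpers
// are illustrative assumptions, not the package's or Claude Code's real API.
type TodoStatus = "pending" | "in_progress" | "completed";

interface Todo {
  id: string;
  content: string;
  status: TodoStatus;
}

// Write the whole checklist up front: every task starts as "pending".
function createTodos(contents: string[]): Todo[] {
  return contents.map((content, i) => ({
    id: String(i + 1),
    content,
    status: "pending" as TodoStatus,
  }));
}

// Return a new list with one task moved to the given status.
function setStatus(todos: Todo[], id: string, status: TodoStatus): Todo[] {
  return todos.map((t) => (t.id === id ? { ...t, status } : t));
}

// After an interruption, resume from the first task that is not completed.
function nextTodo(todos: Todo[]): Todo | undefined {
  return todos.find((t) => t.status !== "completed");
}

let todos = createTodos(["add route", "write handler", "add test", "update docs"]);
todos = setStatus(todos, "1", "in_progress");
todos = setStatus(todos, "1", "completed");
todos = setStatus(todos, "2", "in_progress");

// Simulated interruption: the list alone says where to resume.
const resumeAt = nextTodo(todos);
console.log(resumeAt?.content); // prints "write handler"
```

Keeping the status on each task, rather than inferring progress from conversation history, is what lets the list double as a recovery checkpoint in this sketch.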
@@ -0,0 +1,357 @@
+# Topic: subagents (claude-code)
+#
+# Covers Claude Code's subagent / Task tool: spawning isolated agent threads,
+# when to use isolation vs. direct edits, passing context, background execution,
+# and interpreting results. Tiers 100–500.
+
+id: subagents
+class:
+  kind: tool
+  tool: claude-code
+title: Subagents & the Task Tool
+summary: >-
+  Understanding when and how to spawn isolated subagent threads with the Task
+  tool — isolation modes, context passing, background execution, and result
+  handling.
+
+triggerSignals:
+  - tool: claude-code
+    match:
+      toolName: Task
+    weight: 1
+  - tool: claude-code
+    match:
+      toolNamePattern: "^Agent$"
+    weight: 0.9
+  - tool: claude-code
+    match:
+      mcpToolPattern: ".*subagent.*"
+    weight: 0.6
+
+items:
+  # ── Tier 100 — Remember ──────────────────────────────────────────────────
+  - id: subagents-100-mc-a
+    tier: 100
+    bloom: remember
+    difficulty: 150
+    type: multiple_choice
+    prompt: >-
+      What is the primary purpose of the Task tool in Claude Code?
+    choices:
+      - id: a
+        text: To run shell commands in a subprocess
+      - id: b
+        text: To spawn an isolated subagent thread that runs independently
+      - id: c
+        text: To create a new git branch for each change
+      - id: d
+        text: To open a second terminal session
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      The Task tool launches a new, context-isolated agent thread. Unlike Bash
+      (which runs shell commands) it gives the subagent its own context window,
+      its own tool access, and returns a single result message when done.
+
+  - id: subagents-100-mc-b
+    tier: 100
+    bloom: remember
+    difficulty: 160
+    type: multiple_choice
+    prompt: >-
+      Which parameter on the Agent/Task tool call signals that the subagent
+      should write files into a separate git worktree rather than the parent's
+      working tree?
+    choices:
+      - id: a
+        text: run_in_background
+      - id: b
+        text: isolation
+      - id: c
+        text: worktree
+      - id: d
+        text: fork
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      Setting `isolation: "worktree"` on an Agent call tells Claude Code to
+      create a temporary git worktree for the subagent. Its file changes stay
+      isolated from the parent working tree and the path/branch are returned in
+      the result.
+
+  # ── Tier 200 — Understand ───────────────────────────────────────────────
+  - id: subagents-200-mc-a
+    tier: 200
+    bloom: understand
+    difficulty: 250
+    type: multiple_choice
+    prompt: >-
+      A subagent spawned with `isolation: "worktree"` makes no file changes
+      during its run. What happens to the worktree when the subagent finishes?
+    choices:
+      - id: a
+        text: The worktree is committed and merged automatically
+      - id: b
+        text: The worktree is automatically cleaned up (deleted)
+      - id: c
+        text: The worktree persists as a stash entry
+      - id: d
+        text: The parent is asked whether to keep or delete it
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      When a worktree-isolated subagent makes no changes, Claude Code cleans up
+      the temporary worktree automatically. A worktree is only preserved (and
+      its path/branch returned) when the subagent actually wrote files.
+
+  - id: subagents-200-sa-a
+    tier: 200
+    bloom: understand
+    difficulty: 260
+    type: short_answer
+    prompt: >-
+      You want a subagent to do read-only research across the codebase without
+      touching any files in the parent's working tree. Which iso:skip sentinel
+      should you add to the description field so the worktree guard is
+      satisfied — and why is a worktree unnecessary here?
+    answerKey:
+      kind: keyword
+      anyOf:
+        - "iso:skip"
+        - "[iso:skip]"
+      normalize: trim
+    guidance: >-
+      For read-only subagents (inspection, search, no file writes) you append
+      `[iso:skip]` to the description rather than setting isolation. A worktree
+      is only needed when the subagent writes files that must stay separate from
+      the parent tree. Read-only work needs neither isolation.
+
+  # ── Tier 300 — Apply ────────────────────────────────────────────────────
+  - id: subagents-300-mc-a
+    tier: 300
+    bloom: apply
+    difficulty: 350
+    type: multiple_choice
+    prompt: >-
+      You spawn two subagents in parallel — one to write a migration script and
+      one to validate existing tests. Which combination of isolation settings
+      is correct?
+    choices:
+      - id: a
+        text: >-
+          Migration: isolation "worktree" — Validator: [iso:skip] in description
+      - id: b
+        text: >-
+          Migration: [iso:skip] in description — Validator: isolation "worktree"
+      - id: c
+        text: Both use isolation "worktree"
+      - id: d
+        text: Both use [iso:skip] in description
+    answerKey:
+      kind: choice
+      correctChoiceId: a
+    guidance: >-
+      The migration agent writes files that should stay isolated until reviewed,
+      so it gets `isolation: "worktree"`. The validator only reads test output
+      and never writes to the repo, so it gets `[iso:skip]` in the description
+      — no worktree needed for read-only work.
+
+  - id: subagents-300-mc-b
+    tier: 300
+    bloom: apply
+    difficulty: 360
+    type: multiple_choice
+    prompt: >-
+      A subagent must edit files directly in the parent's working tree (its
+      changes are meant to land in the current checkout immediately). Which
+      isolation choice is correct?
+    choices:
+      - id: a
+        text: isolation "worktree" — changes land in the parent tree via merge
+      - id: b
+        text: >-
+          [iso:skip] in description, no isolation field — the subagent writes
+          to the parent working tree directly
+      - id: c
+        text: isolation "remote" — uses a cloud environment
+      - id: d
+        text: Any isolation mode; the parent always sees the changes
+    answerKey:
+      kind: choice
+      correctChoiceId: b
+    guidance: >-
+      When you need a subagent's edits to land in the current checkout
+      immediately, use `[iso:skip]` (no isolation field). A worktree would
+      put the changes in a separate branch, preventing the parent from seeing
+      them directly. "Otherwise writes to parent tree" is exactly the third
+      case for skipping isolation.
+
+  # ── Tier 400 — Analyze ──────────────────────────────────────────────────
+  - id: subagents-400-mc-a
+    tier: 400
+    bloom: analyze
+    difficulty: 430
+    type: multiple_choice
+    prompt: >-
+      You receive a subagent summary saying "I fixed the bug and updated the
+      tests." Before reporting the work as done to the user, what should you
+      verify and why?
+    choices:
+      - id: a
+        text: >-
+          Nothing — if the subagent says it's done, the work is complete
+      - id: b
+        text: >-
+          Rerun all CI checks to be sure, but the file changes can be trusted
+      - id: c
+        text: >-
+          Inspect the actual file changes; a subagent's summary describes intent,
+          not necessarily what it did
+      - id: d
+        text: >-
+          Ask the user to verify — the parent agent cannot read subagent output
+    answerKey:
+      kind: choice
+      correctChoiceId: c
+    guidance: >-
+      "Trust but verify" is a key subagent principle. A subagent's result
+      message describes what it *intended* to do. You must check the actual
+      diff/file state before reporting success. Summaries can be optimistic or
+      incomplete; the files are the ground truth.
+
+  - id: subagents-400-sa-a
+    tier: 400
+    bloom: analyze
+    difficulty: 440
+    type: short_answer
+    prompt: >-
+      Name the parameter you set on an Agent tool call to let it run
+      concurrently with other work, so you are notified when it completes
+      rather than waiting for it.
+    answerKey:
+      kind: keyword
+      anyOf:
+        - run_in_background
+        - "run_in_background: true"
+        - run_in_background=true
+      normalize: lower
+    guidance: >-
+      Setting `run_in_background: true` on an Agent call starts the subagent
+      asynchronously. The parent continues with other work and is notified
+      automatically when the background agent completes. You should NOT poll
+      or sleep — the notification arrives when the subagent finishes.
+
|
|
247
|
+
# ── Tier 500 — Evaluate / Create ────────────────────────────────────────
|
|
248
|
+
- id: subagents-500-ff-a
|
|
249
|
+
tier: 500
|
|
250
|
+
bloom: evaluate
|
|
251
|
+
difficulty: 485
|
|
252
|
+
type: free_form
|
|
253
|
+
prompt: >-
|
|
254
|
+
Explain when you should NOT spawn parallel subagents, and why. Give at
|
|
255
|
+
least three distinct situations where parallelising with the Agent/Task
|
|
256
|
+
tool would be the wrong choice, and describe the failure mode each
|
|
257
|
+
situation produces.
    rubric:
      criteria:
        - id: shared-state-conflict
          text: >-
            Identifies that tasks sharing mutable state (the same files,
            database rows, or in-memory structures) are unsafe to parallelize
            because concurrent writes produce race conditions, corrupt output,
            or silently overwrite each other's work.
        - id: dependency-ordering
          text: >-
            Identifies that tasks with a dependency relationship — where the
            output of one is the required input of the next — must run
            sequentially; launching the downstream agent before the upstream
            result is ready means the downstream prompt is under-specified or
            based on stale information.
        - id: context-cost-overhead
          text: >-
            Recognises that each subagent carries a full, independent context
            window and token budget; spawning many agents for trivial or
            fast-finishing tasks wastes money and latency relative to doing the
            work directly in the parent context.
        - id: sequential-simpler
          text: >-
            Recognises that when tasks are few, small, or tightly coupled,
            sequential execution in the parent is simpler to reason about,
            easier to debug, and avoids the "trust but verify" overhead that
            every subagent result imposes on the parent.
      referenceAnswer: >-
        Parallel subagents are wrong in at least three situations:

        1. Shared mutable state. If two agents write to the same file, the same
        database record, or the same in-memory store, their changes race. The
        second writer silently overwrites the first, or both produce a partially
        merged, corrupt result. Example: spawning a migration writer and a test
        updater in parallel when both need to edit the same schema file.

        2. Sequential dependency. If agent B needs agent A's output as its
        input, launching them together means B operates on missing or stale
        information. The result is an under-specified prompt, a hallucinated
        implementation, or a second pass that undoes what the first did.
        Example: research-then-write tasks where the write agent must read the
        research findings before it can produce an accurate implementation.

        3. Trivial tasks where overhead exceeds benefit. Each subagent opens a
        new context window, pays full prompt-token cost, and returns a result
        the parent must verify. For a quick grep, a small edit, or a single
        file read, doing the work directly in the parent is faster, cheaper,
        and requires no trust-but-verify round-trip. Spawning subagents for
        tiny tasks multiplies cost without multiplying useful throughput.

        A fourth valid situation: when the total number of tasks is small (two
        or three) and they are tightly coupled, sequential execution in the
        parent keeps the full context in one place, produces one coherent
        history, and is simpler to debug if something goes wrong.
      passThreshold: 0.75
    guidance: >-
      Parallelism is powerful but has hard limits. Candidates who only recite
      "use background agents for independent tasks" without articulating the
      failure modes of over-parallelisation have not internalised the tradeoffs.
      Key signals: shared-state races, dependency ordering, and cost-vs-benefit
      for trivial tasks. Full marks require at least three distinct situations
      with their failure modes, not just a list of slogans.


  - id: subagents-500-mc-a
    tier: 500
    bloom: evaluate
    difficulty: 480
    type: multiple_choice
    prompt: >-
      A task requires searching the codebase (read-only) and then, based on
      the findings, writing a new module. Which subagent strategy is most
      efficient and why?
    choices:
      - id: a
        text: >-
          Use one subagent with [iso:skip] for both steps — simplest path
      - id: b
        text: >-
          Use a foreground read-only subagent first ([iso:skip]), wait for its
          findings, then make an informed decision on how to write the module
          either directly or via a second worktree-isolated subagent
      - id: c
        text: >-
          Use two background subagents in parallel — one reads, one writes —
          so they finish faster
      - id: d
        text: >-
          Always use isolation "worktree" for any task that may write files,
          even if the read phase comes first
    answerKey:
      kind: choice
      correctChoiceId: b
    guidance: >-
      The read step must complete before the write step can be designed well —
      they are sequential, not parallel. Using a foreground read-only subagent
      with [iso:skip] for the research phase, then synthesizing what was
      learned before spawning a writer, gives the parent agent maximum
      information to write a self-contained, accurate prompt for the second
      agent. Launching a writer in parallel before reading is premature.
|
|
File without changes
|