@vibe-hero/server 0.1.0

Files changed (150)
  1. package/LICENSE +190 -0
  2. package/README.md +151 -0
  3. package/dist/catalog/bundled/claude-code/.gitkeep +0 -0
  4. package/dist/catalog/bundled/claude-code/context-management.yaml +302 -0
  5. package/dist/catalog/bundled/claude-code/planning.yaml +313 -0
  6. package/dist/catalog/bundled/claude-code/subagents.yaml +357 -0
  7. package/dist/catalog/bundled/general/.gitkeep +0 -0
  8. package/dist/catalog/bundled/general/_placeholder.yaml +39 -0
  9. package/dist/catalog/bundled/general/task-decomposition.yaml +390 -0
  10. package/dist/catalog/bundled/index.d.ts +39 -0
  11. package/dist/catalog/bundled/index.d.ts.map +1 -0
  12. package/dist/catalog/bundled/index.js +41 -0
  13. package/dist/catalog/bundled/index.js.map +1 -0
  14. package/dist/catalog/fetcher.d.ts +201 -0
  15. package/dist/catalog/fetcher.d.ts.map +1 -0
  16. package/dist/catalog/fetcher.js +452 -0
  17. package/dist/catalog/fetcher.js.map +1 -0
  18. package/dist/catalog/loader.d.ts +165 -0
  19. package/dist/catalog/loader.d.ts.map +1 -0
  20. package/dist/catalog/loader.js +241 -0
  21. package/dist/catalog/loader.js.map +1 -0
  22. package/dist/catalog/resolve.d.ts +85 -0
  23. package/dist/catalog/resolve.d.ts.map +1 -0
  24. package/dist/catalog/resolve.js +103 -0
  25. package/dist/catalog/resolve.js.map +1 -0
  26. package/dist/cli/getOffer.d.ts +38 -0
  27. package/dist/cli/getOffer.d.ts.map +1 -0
  28. package/dist/cli/getOffer.js +150 -0
  29. package/dist/cli/getOffer.js.map +1 -0
  30. package/dist/cli/index.d.ts +46 -0
  31. package/dist/cli/index.d.ts.map +1 -0
  32. package/dist/cli/index.js +88 -0
  33. package/dist/cli/index.js.map +1 -0
  34. package/dist/config.d.ts +34 -0
  35. package/dist/config.d.ts.map +1 -0
  36. package/dist/config.js +63 -0
  37. package/dist/config.js.map +1 -0
  38. package/dist/engine/elo.d.ts +76 -0
  39. package/dist/engine/elo.d.ts.map +1 -0
  40. package/dist/engine/elo.js +79 -0
  41. package/dist/engine/elo.js.map +1 -0
  42. package/dist/engine/graduation.d.ts +108 -0
  43. package/dist/engine/graduation.d.ts.map +1 -0
  44. package/dist/engine/graduation.js +161 -0
  45. package/dist/engine/graduation.js.map +1 -0
  46. package/dist/engine/lapse.d.ts +80 -0
  47. package/dist/engine/lapse.d.ts.map +1 -0
  48. package/dist/engine/lapse.js +125 -0
  49. package/dist/engine/lapse.js.map +1 -0
  50. package/dist/engine/selection.d.ts +84 -0
  51. package/dist/engine/selection.d.ts.map +1 -0
  52. package/dist/engine/selection.js +119 -0
  53. package/dist/engine/selection.js.map +1 -0
  54. package/dist/grading/deterministic.d.ts +102 -0
  55. package/dist/grading/deterministic.d.ts.map +1 -0
  56. package/dist/grading/deterministic.js +118 -0
  57. package/dist/grading/deterministic.js.map +1 -0
  58. package/dist/grading/freeform.d.ts +64 -0
  59. package/dist/grading/freeform.d.ts.map +1 -0
  60. package/dist/grading/freeform.js +85 -0
  61. package/dist/grading/freeform.js.map +1 -0
  62. package/dist/index.d.ts +52 -0
  63. package/dist/index.d.ts.map +1 -0
  64. package/dist/index.js +91 -0
  65. package/dist/index.js.map +1 -0
  66. package/dist/observation/hookEvents.d.ts +113 -0
  67. package/dist/observation/hookEvents.d.ts.map +1 -0
  68. package/dist/observation/hookEvents.js +170 -0
  69. package/dist/observation/hookEvents.js.map +1 -0
  70. package/dist/observation/offers.d.ts +215 -0
  71. package/dist/observation/offers.d.ts.map +1 -0
  72. package/dist/observation/offers.js +327 -0
  73. package/dist/observation/offers.js.map +1 -0
  74. package/dist/observation/source.d.ts +133 -0
  75. package/dist/observation/source.d.ts.map +1 -0
  76. package/dist/observation/source.js +105 -0
  77. package/dist/observation/source.js.map +1 -0
  78. package/dist/profile/migrate.d.ts +122 -0
  79. package/dist/profile/migrate.d.ts.map +1 -0
  80. package/dist/profile/migrate.js +147 -0
  81. package/dist/profile/migrate.js.map +1 -0
  82. package/dist/profile/store.d.ts +84 -0
  83. package/dist/profile/store.d.ts.map +1 -0
  84. package/dist/profile/store.js +267 -0
  85. package/dist/profile/store.js.map +1 -0
  86. package/dist/schemas/common.d.ts +95 -0
  87. package/dist/schemas/common.d.ts.map +1 -0
  88. package/dist/schemas/common.js +106 -0
  89. package/dist/schemas/common.js.map +1 -0
  90. package/dist/schemas/content.d.ts +828 -0
  91. package/dist/schemas/content.d.ts.map +1 -0
  92. package/dist/schemas/content.js +219 -0
  93. package/dist/schemas/content.js.map +1 -0
  94. package/dist/schemas/profile.d.ts +599 -0
  95. package/dist/schemas/profile.d.ts.map +1 -0
  96. package/dist/schemas/profile.js +177 -0
  97. package/dist/schemas/profile.js.map +1 -0
  98. package/dist/schemas/tools.d.ts +1581 -0
  99. package/dist/schemas/tools.d.ts.map +1 -0
  100. package/dist/schemas/tools.js +286 -0
  101. package/dist/schemas/tools.js.map +1 -0
  102. package/dist/tools/config.d.ts +51 -0
  103. package/dist/tools/config.d.ts.map +1 -0
  104. package/dist/tools/config.js +104 -0
  105. package/dist/tools/config.js.map +1 -0
  106. package/dist/tools/gate.d.ts +50 -0
  107. package/dist/tools/gate.d.ts.map +1 -0
  108. package/dist/tools/gate.js +67 -0
  109. package/dist/tools/gate.js.map +1 -0
  110. package/dist/tools/guidance.d.ts +36 -0
  111. package/dist/tools/guidance.d.ts.map +1 -0
  112. package/dist/tools/guidance.js +117 -0
  113. package/dist/tools/guidance.js.map +1 -0
  114. package/dist/tools/listTopics.d.ts +55 -0
  115. package/dist/tools/listTopics.d.ts.map +1 -0
  116. package/dist/tools/listTopics.js +78 -0
  117. package/dist/tools/listTopics.js.map +1 -0
  118. package/dist/tools/offers.d.ts +60 -0
  119. package/dist/tools/offers.d.ts.map +1 -0
  120. package/dist/tools/offers.js +152 -0
  121. package/dist/tools/offers.js.map +1 -0
  122. package/dist/tools/placeholders.d.ts +27 -0
  123. package/dist/tools/placeholders.d.ts.map +1 -0
  124. package/dist/tools/placeholders.js +49 -0
  125. package/dist/tools/placeholders.js.map +1 -0
  126. package/dist/tools/recordObservation.d.ts +52 -0
  127. package/dist/tools/recordObservation.d.ts.map +1 -0
  128. package/dist/tools/recordObservation.js +87 -0
  129. package/dist/tools/recordObservation.js.map +1 -0
  130. package/dist/tools/startQuiz.d.ts +82 -0
  131. package/dist/tools/startQuiz.d.ts.map +1 -0
  132. package/dist/tools/startQuiz.js +180 -0
  133. package/dist/tools/startQuiz.js.map +1 -0
  134. package/dist/tools/status.d.ts +59 -0
  135. package/dist/tools/status.d.ts.map +1 -0
  136. package/dist/tools/status.js +133 -0
  137. package/dist/tools/status.js.map +1 -0
  138. package/dist/tools/submitAnswer.d.ts +156 -0
  139. package/dist/tools/submitAnswer.d.ts.map +1 -0
  140. package/dist/tools/submitAnswer.js +402 -0
  141. package/dist/tools/submitAnswer.js.map +1 -0
  142. package/dist/tools/types.d.ts +82 -0
  143. package/dist/tools/types.d.ts.map +1 -0
  144. package/dist/tools/types.js +48 -0
  145. package/dist/tools/types.js.map +1 -0
  146. package/dist/tools/us2/standing.d.ts +111 -0
  147. package/dist/tools/us2/standing.d.ts.map +1 -0
  148. package/dist/tools/us2/standing.js +143 -0
  149. package/dist/tools/us2/standing.js.map +1 -0
  150. package/package.json +62 -0
package/dist/catalog/bundled/claude-code/planning.yaml
@@ -0,0 +1,313 @@
1
+ # Topic: planning (claude-code)
2
+ #
3
+ # Covers Claude Code's planning mode (ExitPlanMode), the TodoWrite/TodoRead
4
+ # task list as a planning primitive, when to plan vs. act, and how to
5
+ # structure multi-step work. Tiers 100–500.
6
+
7
+ id: planning
8
+ class:
9
+ kind: tool
10
+ tool: claude-code
11
+ title: Planning Mode & Task Management
12
+ summary: >-
13
+ Using Claude Code's plan mode and the TodoWrite task list to structure complex
14
+ work — when to plan before acting, how to break tasks down, and how to
15
+ maintain progress across a long session.
16
+
17
+ triggerSignals:
18
+ - tool: claude-code
19
+ match:
20
+ toolName: ExitPlanMode
21
+ weight: 1
22
+ - tool: claude-code
23
+ match:
24
+ toolName: TodoWrite
25
+ weight: 0.9
26
+ - tool: claude-code
27
+ match:
28
+ toolName: TodoRead
29
+ weight: 0.7
30
+ - tool: claude-code
31
+ match:
32
+ toolNamePattern: "^ExitPlanMode$"
33
+ weight: 1
34
+
35
+ items:
36
+ # ── Tier 100 — Remember ──────────────────────────────────────────────────
37
+ - id: planning-100-mc-a
38
+ tier: 100
39
+ bloom: remember
40
+ difficulty: 150
41
+ type: multiple_choice
42
+ prompt: >-
43
+ What does Claude Code's "plan mode" mean in practice?
44
+ choices:
45
+ - id: a
46
+ text: >-
47
+ The model runs a special subprocess that generates a plan file
48
+ - id: b
49
+ text: >-
50
+ Claude Code reads files and reasons about a task without executing
51
+ any changes, then awaits approval before acting
52
+ - id: c
53
+ text: >-
54
+ A scheduled background job that plans the next day's work
55
+ - id: d
56
+ text: >-
57
+ A restricted mode where only Bash commands are allowed
58
+ answerKey:
59
+ kind: choice
60
+ correctChoiceId: b
61
+ guidance: >-
62
+ In plan mode, Claude Code explores the problem space (reading files,
63
+ thinking through the approach) without making any edits or running
64
+ commands. It presents the plan for user review. Calling ExitPlanMode
65
+ signals that planning is complete and implementation can begin.
66
+
67
+ - id: planning-100-mc-b
68
+ tier: 100
69
+ bloom: remember
70
+ difficulty: 160
71
+ type: multiple_choice
72
+ prompt: >-
73
+ Which tool call signals that Claude Code is done planning and ready to
74
+ begin implementing?
75
+ choices:
76
+ - id: a
77
+ text: TodoWrite with status "in_progress"
78
+ - id: b
79
+ text: Bash with "start"
80
+ - id: c
81
+ text: ExitPlanMode
82
+ - id: d
83
+ text: Read with the implementation file path
84
+ answerKey:
85
+ kind: choice
86
+ correctChoiceId: c
87
+ guidance: >-
88
+ ExitPlanMode is the explicit tool call that ends the planning phase and
89
+ transitions Claude Code into implementation. Observing this call in the
90
+ transcript is a strong signal that the agent has finished scoping the
91
+ work and is about to start making changes.
92
+
93
+ # ── Tier 200 — Understand ───────────────────────────────────────────────
94
+ - id: planning-200-mc-a
95
+ tier: 200
96
+ bloom: understand
97
+ difficulty: 250
98
+ type: multiple_choice
99
+ prompt: >-
100
+ Why should you mark a TodoWrite task as "in_progress" before starting it,
101
+ rather than marking it "completed" only when you finish?
102
+ choices:
103
+ - id: a
104
+ text: >-
105
+ The MCP server requires it to track billing
106
+ - id: b
107
+ text: >-
108
+ Marking in_progress signals to the user that work is happening and
109
+ creates a recovery checkpoint if the session is interrupted
110
+ - id: c
111
+ text: >-
112
+ It prevents other agents from picking up the same task
113
+ - id: d
114
+ text: >-
115
+ TodoWrite ignores status unless you use in_progress first
116
+ answerKey:
117
+ kind: choice
118
+ correctChoiceId: b
119
+ guidance: >-
120
+ The in_progress status serves two purposes: it shows the user that the
121
+ agent is actively working on something (not silently idle), and it leaves
122
+ a clear recovery marker if context is compacted or the session is
123
+ interrupted mid-task. If you only mark completed at the end, a
124
+ compaction event could erase all progress signals.
125
+
126
+ - id: planning-200-sa-a
127
+ tier: 200
128
+ bloom: understand
129
+ difficulty: 260
130
+ type: short_answer
131
+ prompt: >-
132
+ What are the three valid status values for a task in the Claude Code
133
+ Todo list (TodoWrite)?
134
+ answerKey:
135
+ kind: keyword
136
+ anyOf:
137
+ - "pending, in_progress, completed"
138
+ - "pending in_progress completed"
139
+ - "pending"
140
+ - "in_progress"
141
+ - "completed"
142
+ normalize: lower
143
+ guidance: >-
144
+ The three Todo task statuses are: `pending` (not yet started),
145
+ `in_progress` (currently being worked on), and `completed` (done).
146
+ Transitioning through these statuses as you work lets both the user and
147
+ future context reconstruction understand what has been accomplished.
148
+
149
+ # ── Tier 300 — Apply ────────────────────────────────────────────────────
150
+ - id: planning-300-mc-a
151
+ tier: 300
152
+ bloom: apply
153
+ difficulty: 350
154
+ type: multiple_choice
155
+ prompt: >-
156
+ A user asks you to implement a new API endpoint. This requires: (1) adding
157
+ a route, (2) writing a handler, (3) adding a test, and (4) updating docs.
158
+ In what order should you use TodoWrite?
159
+ choices:
160
+ - id: a
161
+ text: >-
162
+ Write all four tasks at once before starting any of them; then mark
163
+ each in_progress → completed as you go
164
+ - id: b
165
+ text: >-
166
+ Write one task, complete it, then write the next task
167
+ - id: c
168
+ text: >-
169
+ Write all tasks and immediately mark them all completed to show the
170
+ full plan
171
+ - id: d
172
+ text: >-
173
+ Skip TodoWrite — four steps is small enough to hold in context
174
+ answerKey:
175
+ kind: choice
176
+ correctChoiceId: a
177
+ guidance: >-
178
+ The right pattern is to write all tasks upfront (creating the full
179
+ checklist), then work through them one at a time: mark each in_progress
180
+ before starting, completed when done. Writing tasks one at a time loses
181
+ the upfront clarity; marking all completed immediately is dishonest.
182
+ Even four tasks benefit from explicit tracking, since sessions can be
183
+ interrupted.
184
+
185
+ - id: planning-300-mc-b
186
+ tier: 300
187
+ bloom: apply
188
+ difficulty: 360
189
+ type: multiple_choice
190
+ prompt: >-
191
+ You are in plan mode researching a complex feature request. You have read
192
+ three files. The user has NOT yet approved your plan. Should you call
193
+ ExitPlanMode and start editing?
194
+ choices:
195
+ - id: a
196
+ text: >-
197
+ Yes — you have enough information to proceed
198
+ - id: b
199
+ text: >-
200
+ Yes — plan mode is just a suggestion, not a gate
201
+ - id: c
202
+ text: >-
203
+ No — ExitPlanMode should only be called after presenting the plan
204
+ and receiving user confirmation to proceed
205
+ - id: d
206
+ text: >-
207
+ No — you must read every file in the repo before exiting plan mode
208
+ answerKey:
209
+ kind: choice
210
+ correctChoiceId: c
211
+ guidance: >-
212
+ Plan mode is a human-in-the-loop gate. Its purpose is to surface the
213
+ proposed approach for review before any changes are made. Calling
214
+ ExitPlanMode without user confirmation undermines that gate. The correct
215
+ flow is: explore → present plan → wait for approval → ExitPlanMode →
216
+ implement.
217
+
218
+ # ── Tier 400 — Analyze ──────────────────────────────────────────────────
219
+ - id: planning-400-mc-a
220
+ tier: 400
221
+ bloom: analyze
222
+ difficulty: 430
223
+ type: multiple_choice
224
+ prompt: >-
225
+ Mid-way through a large implementation, context is compacted. You had five
226
+ Todo tasks; two were completed. What should you do immediately after
227
+ compaction to recover correctly?
228
+ choices:
229
+ - id: a
230
+ text: >-
231
+ Restart the entire task from scratch — compaction invalidates all
232
+ previous work
233
+ - id: b
234
+ text: >-
235
+ Call TodoRead to inspect the current task list, confirm which tasks
236
+ are completed, then resume from the first pending/in_progress task
237
+ - id: c
238
+ text: >-
239
+ Trust that the summary captured everything and continue from where
240
+ you think you left off
241
+ - id: d
242
+ text: >-
243
+ Ask the user to list what was done
244
+ answerKey:
245
+ kind: choice
246
+ correctChoiceId: b
247
+ guidance: >-
248
+ The Todo list is the canonical state that survives compaction — it is
249
+ persisted separately from the conversation history. After compaction,
250
+ call TodoRead first to see what is completed vs. pending, then continue
251
+ from the first incomplete task. "Trusting the summary" risks duplicating
252
+ completed work or skipping tasks the summary glossed over.
253
+
254
+ - id: planning-400-sa-a
255
+ tier: 400
256
+ bloom: analyze
257
+ difficulty: 440
258
+ type: short_answer
259
+ prompt: >-
260
+ A task is straightforward and can be completed in a single tool call.
261
+ Should you use TodoWrite for it? Give the principle behind your answer.
262
+ answerKey:
263
+ kind: keyword
264
+ anyOf:
265
+ - "no"
266
+ - "not necessary"
267
+ - "unnecessary"
268
+ - "simple tasks"
269
+ - "single step"
270
+ - "overhead"
271
+ normalize: lower
272
+ guidance: >-
273
+ TodoWrite adds value for multi-step or interruptible work. For a single,
274
+ simple, instantly-completable task it is unnecessary overhead — the tool
275
+ call costs context without providing a meaningful recovery checkpoint or
276
+ progress signal. Apply TodoWrite when the work has meaningful sub-steps,
277
+ spans multiple files, or could be interrupted.
278
+
279
+ # ── Tier 500 — Evaluate ─────────────────────────────────────────────────
280
+ - id: planning-500-mc-a
281
+ tier: 500
282
+ bloom: evaluate
283
+ difficulty: 480
284
+ type: multiple_choice
285
+ prompt: >-
286
+ A teammate's workflow always enters plan mode for every task, even trivial
287
+ one-liner fixes. Evaluate this approach and identify the main cost.
288
+ choices:
289
+ - id: a
290
+ text: >-
291
+ Good discipline — plan mode prevents all implementation mistakes
292
+ - id: b
293
+ text: >-
294
+ Acceptable — the overhead is negligible because plan mode is free
295
+ - id: c
296
+ text: >-
297
+ Over-engineering: for trivial changes plan mode adds round-trip
298
+ latency and a user-confirmation burden with no commensurate reduction
299
+ in risk
300
+ - id: d
301
+ text: >-
302
+ Dangerous — plan mode can modify files if not used carefully
303
+ answerKey:
304
+ kind: choice
305
+ correctChoiceId: c
306
+ guidance: >-
307
+ Plan mode is most valuable for non-trivial changes where the approach
308
+ is not obvious and mistakes are costly. Applying it to trivial fixes
309
+ (renaming a variable, fixing a typo) imposes a confirmation round-trip
310
+ and interrupts flow without meaningful risk reduction. The judgment call
311
+ is: does the complexity/risk of this change warrant a planning gate? If
312
+ not, proceed directly. Good tooling use requires calibrating tool
313
+ overhead against the value it delivers.
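Annotation on planning.yaml: the items above describe a Todo lifecycle (`pending`, `in_progress`, `completed`) and a recovery step after context compaction. The following is a minimal Python sketch of that recovery rule, not code from @vibe-hero/server or Claude Code. The `Todo` class and `next_task` function are hypothetical names introduced for illustration, and the list is a plain in-memory stand-in for whatever a TodoRead call would return.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Status values named in catalog item planning-200-sa-a.
VALID_STATUSES = ("pending", "in_progress", "completed")


@dataclass
class Todo:
    """Hypothetical stand-in for one entry of the Todo list."""

    content: str
    status: str = "pending"

    def __post_init__(self) -> None:
        # Reject anything outside the three documented statuses.
        if self.status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def next_task(todos: list[Todo]) -> Optional[Todo]:
    """Return the first task that is not completed, or None if all are done."""
    for todo in todos:
        if todo.status != "completed":
            return todo
    return None


# Scenario from planning-400-mc-a: five tasks, two already completed,
# one interrupted while in progress.
todos = [
    Todo("add route", "completed"),
    Todo("write handler", "completed"),
    Todo("add test", "in_progress"),
    Todo("update docs"),
    Todo("announce change"),
]

resume = next_task(todos)
print(resume.content if resume else "nothing left to do")
```

The sketch resumes from list order alone, which is why writing all tasks upfront matters: the list, rather than the compacted conversation summary, decides where work continues.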
package/dist/catalog/bundled/claude-code/subagents.yaml
@@ -0,0 +1,357 @@
1
+ # Topic: subagents (claude-code)
2
+ #
3
+ # Covers Claude Code's subagent / Task tool: spawning isolated agent threads,
4
+ # when to use isolation vs. direct edits, passing context, background execution,
5
+ # and interpreting results. Tiers 100–500.
6
+
7
+ id: subagents
8
+ class:
9
+ kind: tool
10
+ tool: claude-code
11
+ title: Subagents & the Task Tool
12
+ summary: >-
13
+ Understanding when and how to spawn isolated subagent threads with the Task
14
+ tool — isolation modes, context passing, background execution, and result
15
+ handling.
16
+
17
+ triggerSignals:
18
+ - tool: claude-code
19
+ match:
20
+ toolName: Task
21
+ weight: 1
22
+ - tool: claude-code
23
+ match:
24
+ toolNamePattern: "^Agent$"
25
+ weight: 0.9
26
+ - tool: claude-code
27
+ match:
28
+ mcpToolPattern: ".*subagent.*"
29
+ weight: 0.6
30
+
31
+ items:
32
+ # ── Tier 100 — Remember ──────────────────────────────────────────────────
33
+ - id: subagents-100-mc-a
34
+ tier: 100
35
+ bloom: remember
36
+ difficulty: 150
37
+ type: multiple_choice
38
+ prompt: >-
39
+ What is the primary purpose of the Task tool in Claude Code?
40
+ choices:
41
+ - id: a
42
+ text: To run shell commands in a subprocess
43
+ - id: b
44
+ text: To spawn an isolated subagent thread that runs independently
45
+ - id: c
46
+ text: To create a new git branch for each change
47
+ - id: d
48
+ text: To open a second terminal session
49
+ answerKey:
50
+ kind: choice
51
+ correctChoiceId: b
52
+ guidance: >-
53
+ The Task tool launches a new, context-isolated agent thread. Unlike Bash
54
+ (which runs shell commands) it gives the subagent its own context window,
55
+ its own tool access, and returns a single result message when done.
56
+
57
+ - id: subagents-100-mc-b
58
+ tier: 100
59
+ bloom: remember
60
+ difficulty: 160
61
+ type: multiple_choice
62
+ prompt: >-
63
+ Which parameter on the Agent/Task tool call signals that the subagent
64
+ should write files into a separate git worktree rather than the parent's
65
+ working tree?
66
+ choices:
67
+ - id: a
68
+ text: run_in_background
69
+ - id: b
70
+ text: isolation
71
+ - id: c
72
+ text: worktree
73
+ - id: d
74
+ text: fork
75
+ answerKey:
76
+ kind: choice
77
+ correctChoiceId: b
78
+ guidance: >-
79
+ Setting `isolation: "worktree"` on an Agent call tells Claude Code to
80
+ create a temporary git worktree for the subagent. Its file changes stay
81
+ isolated from the parent working tree and the path/branch are returned in
82
+ the result.
83
+
84
+ # ── Tier 200 — Understand ───────────────────────────────────────────────
85
+ - id: subagents-200-mc-a
86
+ tier: 200
87
+ bloom: understand
88
+ difficulty: 250
89
+ type: multiple_choice
90
+ prompt: >-
91
+ A subagent spawned with `isolation: "worktree"` makes no file changes
92
+ during its run. What happens to the worktree when the subagent finishes?
93
+ choices:
94
+ - id: a
95
+ text: The worktree is committed and merged automatically
96
+ - id: b
97
+ text: The worktree is automatically cleaned up (deleted)
98
+ - id: c
99
+ text: The worktree persists as a stash entry
100
+ - id: d
101
+ text: The parent is asked whether to keep or delete it
102
+ answerKey:
103
+ kind: choice
104
+ correctChoiceId: b
105
+ guidance: >-
106
+ When a worktree-isolated subagent makes no changes, Claude Code cleans up
107
+ the temporary worktree automatically. A worktree is only preserved (and
108
+ its path/branch returned) when the subagent actually wrote files.
109
+
110
+ - id: subagents-200-sa-a
111
+ tier: 200
112
+ bloom: understand
113
+ difficulty: 260
114
+ type: short_answer
115
+ prompt: >-
116
+ You want a subagent to do read-only research across the codebase without
117
+ touching any files in the parent's working tree. Which iso:skip sentinel
118
+ should you add to the description field so the worktree guard is
119
+ satisfied — and why is a worktree unnecessary here?
120
+ answerKey:
121
+ kind: keyword
122
+ anyOf:
123
+ - "iso:skip"
124
+ - "[iso:skip]"
125
+ normalize: trim
126
+ guidance: >-
127
+ For read-only subagents (inspection, search, no file writes) you append
128
+ `[iso:skip]` to the description rather than setting isolation. A worktree
129
+ is only needed when the subagent writes files that must stay separate from
130
+ the parent tree. Read-only work needs no isolation.
131
+
132
+ # ── Tier 300 — Apply ────────────────────────────────────────────────────
133
+ - id: subagents-300-mc-a
134
+ tier: 300
135
+ bloom: apply
136
+ difficulty: 350
137
+ type: multiple_choice
138
+ prompt: >-
139
+ You spawn two subagents in parallel — one to write a migration script and
140
+ one to validate existing tests. Which combination of isolation settings
141
+ is correct?
142
+ choices:
143
+ - id: a
144
+ text: >-
145
+ Migration: isolation "worktree" — Validator: [iso:skip] in description
146
+ - id: b
147
+ text: >-
148
+ Migration: [iso:skip] in description — Validator: isolation "worktree"
149
+ - id: c
150
+ text: Both use isolation "worktree"
151
+ - id: d
152
+ text: Both use [iso:skip] in description
153
+ answerKey:
154
+ kind: choice
155
+ correctChoiceId: a
156
+ guidance: >-
157
+ The migration agent writes files that should stay isolated until reviewed,
158
+ so it gets `isolation: "worktree"`. The validator only reads test output
159
+ and never writes to the repo, so it gets `[iso:skip]` in the description
160
+ — no worktree needed for read-only work.
161
+
162
+ - id: subagents-300-mc-b
163
+ tier: 300
164
+ bloom: apply
165
+ difficulty: 360
166
+ type: multiple_choice
167
+ prompt: >-
168
+ A subagent must edit files directly in the parent's working tree (its
169
+ changes are meant to land in the current checkout immediately). Which
170
+ isolation choice is correct?
171
+ choices:
172
+ - id: a
173
+ text: isolation "worktree" — changes land in the parent tree via merge
174
+ - id: b
175
+ text: >-
176
+ [iso:skip] in description, no isolation field — the subagent writes
177
+ to the parent working tree directly
178
+ - id: c
179
+ text: isolation "remote" — uses a cloud environment
180
+ - id: d
181
+ text: Any isolation mode; the parent always sees the changes
182
+ answerKey:
183
+ kind: choice
184
+ correctChoiceId: b
185
+ guidance: >-
186
+ When you need a subagent's edits to land in the current checkout
187
+ immediately, use `[iso:skip]` (no isolation field). A worktree would
188
+ put the changes in a separate branch, preventing the parent from seeing
189
+ them directly. "Otherwise writes to parent tree" is exactly the third
190
+ case for skipping isolation.
191
+
192
+ # ── Tier 400 — Analyze ──────────────────────────────────────────────────
193
+ - id: subagents-400-mc-a
194
+ tier: 400
195
+ bloom: analyze
196
+ difficulty: 430
197
+ type: multiple_choice
198
+ prompt: >-
199
+ You receive a subagent summary saying "I fixed the bug and updated the
200
+ tests." Before reporting the work as done to the user, what should you
201
+ verify and why?
202
+ choices:
203
+ - id: a
204
+ text: >-
205
+ Nothing — if the subagent says it's done, the work is complete
206
+ - id: b
207
+ text: >-
208
+ Rerun all CI checks to be sure, but the file changes can be trusted
209
+ - id: c
210
+ text: >-
211
+ Inspect the actual file changes; a subagent's summary describes intent,
212
+ not necessarily what it did
213
+ - id: d
214
+ text: >-
215
+ Ask the user to verify — the parent agent cannot read subagent output
216
+ answerKey:
217
+ kind: choice
218
+ correctChoiceId: c
219
+ guidance: >-
220
+ "Trust but verify" is a key subagent principle. A subagent's result
221
+ message describes what it *intended* to do. You must check the actual
222
+ diff/file state before reporting success. Summaries can be optimistic or
223
+ incomplete; the files are the ground truth.
224
+
225
+ - id: subagents-400-sa-a
226
+ tier: 400
227
+ bloom: analyze
228
+ difficulty: 440
229
+ type: short_answer
230
+ prompt: >-
231
+ Name the parameter you set on an Agent tool call to let it run
232
+ concurrently with other work, so you are notified when it completes
233
+ rather than waiting for it.
234
+ answerKey:
235
+ kind: keyword
236
+ anyOf:
237
+ - run_in_background
238
+ - "run_in_background: true"
239
+ - run_in_background=true
240
+ normalize: lower
241
+ guidance: >-
242
+ Setting `run_in_background: true` on an Agent call starts the subagent
243
+ asynchronously. The parent continues with other work and is notified
244
+ automatically when the background agent completes. You should NOT poll
245
+ or sleep — the notification arrives when the subagent finishes.
246
+
247
+ # ── Tier 500 — Evaluate / Create ────────────────────────────────────────
248
+ - id: subagents-500-ff-a
249
+ tier: 500
250
+ bloom: evaluate
251
+ difficulty: 485
252
+ type: free_form
253
+ prompt: >-
254
+ Explain when you should NOT spawn parallel subagents, and why. Give at
255
+ least three distinct situations where parallelising with the Agent/Task
256
+ tool would be the wrong choice, and describe the failure mode each
257
+ situation produces.
258
+ rubric:
259
+ criteria:
260
+ - id: shared-state-conflict
261
+ text: >-
262
+ Identifies that tasks sharing mutable state (the same files,
263
+ database rows, or in-memory structures) are unsafe to parallelize
264
+ because concurrent writes produce race conditions, corrupt output,
265
+ or silently overwrite each other's work.
266
+ - id: dependency-ordering
267
+ text: >-
268
+ Identifies that tasks with a dependency relationship — where the
269
+ output of one is the required input of the next — must run
270
+ sequentially; launching the downstream agent before the upstream
271
+ result is ready means the downstream prompt is under-specified or
272
+ based on stale information.
273
+ - id: context-cost-overhead
274
+ text: >-
275
+ Recognises that each subagent carries a full, independent context
276
+ window and token budget; spawning many agents for trivial or
277
+ fast-finishing tasks wastes money and latency relative to doing the
278
+ work directly in the parent context.
279
+ - id: sequential-simpler
280
+ text: >-
281
+ Recognises that when tasks are few, small, or tightly coupled,
282
+ sequential execution in the parent is simpler to reason about,
283
+ easier to debug, and avoids the "trust but verify" overhead that
284
+ every subagent result imposes on the parent.
285
+ referenceAnswer: >-
286
+ Parallel subagents are wrong in at least three situations:
287
+
288
+ 1. Shared mutable state. If two agents write to the same file, the same
289
+ database record, or the same in-memory store, their changes race. The
290
+ second writer silently overwrites the first, or both produce a partially
291
+ merged, corrupt result. Example: spawning a migration writer and a test
292
+ updater in parallel when both need to edit the same schema file.
293
+
294
+ 2. Sequential dependency. If agent B needs agent A's output as its
295
+ input, launching them together means B operates on missing or stale
296
+ information. The result is an under-specified prompt, a hallucinated
297
+ implementation, or a second pass that undoes what the first did.
298
+ Example: research-then-write tasks where the write agent must read the
299
+ research findings before it can produce an accurate implementation.
300
+
301
+ 3. Trivial tasks where overhead exceeds benefit. Each subagent opens a
302
+ new context window, pays full prompt-token cost, and returns a result
303
+ the parent must verify. For a quick grep, a small edit, or a single
304
+ file read, doing the work directly in the parent is faster, cheaper,
305
+ and requires no trust-but-verify round-trip. Spawning subagents for
306
+ tiny tasks multiplies cost without multiplying useful throughput.
307
+
308
+ A fourth valid situation: when the total number of tasks is small (two
309
+ or three) and they are tightly coupled, sequential execution in the
310
+ parent keeps the full context in one place, produces one coherent
311
+ history, and is simpler to debug if something goes wrong.
312
+ passThreshold: 0.75
313
+ guidance: >-
314
+ Parallelism is powerful but has hard limits. Candidates who only recite
315
+ "use background agents for independent tasks" without articulating the
316
+ failure modes of over-parallelisation have not internalised the tradeoffs.
317
+ Key signals: shared-state races, dependency ordering, and cost-vs-benefit
318
+ for trivial tasks. Full marks require at least three distinct situations
319
+ with their failure modes, not just a list of slogans.
320
+
321
+
322
+ - id: subagents-500-mc-a
323
+ tier: 500
324
+ bloom: evaluate
325
+ difficulty: 480
326
+ type: multiple_choice
327
+ prompt: >-
328
+ A task requires searching the codebase (read-only) and then, based on
329
+ the findings, writing a new module. Which subagent strategy is most
330
+ efficient and why?
331
+ choices:
332
+ - id: a
333
+ text: >-
334
+ Use one subagent with [iso:skip] for both steps — simplest path
335
+ - id: b
336
+ text: >-
337
+ Use a foreground read-only subagent first ([iso:skip]), wait for its
338
+ findings, then make an informed decision on how to write the module,
339
+ either directly or via a second worktree-isolated subagent
340
+ - id: c
341
+ text: >-
342
+ Use two background subagents in parallel — one reads, one writes —
343
+ so they finish faster
344
+ - id: d
345
+ text: >-
346
+ Always use isolation "worktree" for any task that may write files,
347
+ even if the read phase comes first
348
+ answerKey:
349
+ kind: choice
350
+ correctChoiceId: b
351
+ guidance: >-
352
+ The read step must complete before the write step can be designed well —
353
+ they are sequential, not parallel. Using a foreground read-only subagent
354
+ with [iso:skip] for the research phase, then synthesizing what was
355
+ learned before spawning a writer, gives the parent agent maximum
356
+ information to write a self-contained, accurate prompt for the second
357
+ agent. Launching a writer in parallel before reading is premature.
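Annotation on subagents.yaml: item subagents-500-ff-a pairs four rubric criteria with `passThreshold: 0.75`. This diff does not show how the grader combines criteria, so the Python sketch below assumes the simplest reading: every criterion counts equally and the score is the fraction of criteria met. The function names `rubric_score`, `passes`, and `min_criteria_needed` are hypothetical.

```python
import math

# Criterion ids copied from the subagents-500-ff-a rubric.
CRITERIA = (
    "shared-state-conflict",
    "dependency-ordering",
    "context-cost-overhead",
    "sequential-simpler",
)
PASS_THRESHOLD = 0.75  # passThreshold from the same item


def rubric_score(met: set) -> float:
    """Fraction of rubric criteria met (equal weighting is an assumption)."""
    return len(met & set(CRITERIA)) / len(CRITERIA)


def passes(met: set) -> bool:
    """True when the score reaches the pass threshold."""
    return rubric_score(met) >= PASS_THRESHOLD


def min_criteria_needed() -> int:
    """Smallest number of met criteria that reaches the threshold."""
    return math.ceil(PASS_THRESHOLD * len(CRITERIA))


print(min_criteria_needed())
print(passes({"shared-state-conflict", "dependency-ordering"}))
```

Under this assumption an answer must satisfy three of the four criteria to pass, which lines up with the guidance text asking for at least three distinct situations.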
package/dist/catalog/bundled/general/.gitkeep
File without changes
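Annotation on the `triggerSignals` blocks in planning.yaml and subagents.yaml: each signal pairs a match rule (`toolName`, `toolNamePattern`, or `mcpToolPattern`) with a weight. The Python sketch below shows one plausible way to evaluate those rules against an observed tool call. It is not the package's matcher: treating `toolName` as an exact match, treating the two pattern keys as regular expressions, and keeping the highest weight per topic are all assumptions, and `rule_matches` and `topic_weights` are hypothetical names.

```python
import re

# (topic id, match rule, weight) triples copied from the two catalog files.
SIGNALS = [
    ("planning", {"toolName": "ExitPlanMode"}, 1.0),
    ("planning", {"toolName": "TodoWrite"}, 0.9),
    ("planning", {"toolName": "TodoRead"}, 0.7),
    ("planning", {"toolNamePattern": "^ExitPlanMode$"}, 1.0),
    ("subagents", {"toolName": "Task"}, 1.0),
    ("subagents", {"toolNamePattern": "^Agent$"}, 0.9),
    ("subagents", {"mcpToolPattern": ".*subagent.*"}, 0.6),
]


def rule_matches(rule, tool_name, mcp_tool=None):
    """Check one match rule against an observed call (semantics assumed)."""
    if "toolName" in rule:
        return tool_name == rule["toolName"]
    if "toolNamePattern" in rule:
        return re.search(rule["toolNamePattern"], tool_name) is not None
    if "mcpToolPattern" in rule:
        if mcp_tool is None:
            return False
        return re.search(rule["mcpToolPattern"], mcp_tool) is not None
    return False


def topic_weights(tool_name, mcp_tool=None):
    """Map each matching topic to its strongest signal weight."""
    best = {}
    for topic, rule, weight in SIGNALS:
        if rule_matches(rule, tool_name, mcp_tool):
            best[topic] = max(weight, best.get(topic, 0.0))
    return best


print(topic_weights("TodoWrite"))
print(topic_weights("mcp_call", mcp_tool="spawn_subagent_v2"))
```

Taking the maximum instead of a sum keeps the duplicate ExitPlanMode signals (exact name and anchored pattern) from counting twice; a summing matcher would behave differently.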