@a5c-ai/babysitter-gemini 4.0.153
- package/GEMINI.md +392 -0
- package/README.md +295 -0
- package/bin/cli.js +337 -0
- package/bin/postinstall.js +79 -0
- package/bin/preuninstall.js +73 -0
- package/commands/assimilate.md +37 -0
- package/commands/call.md +7 -0
- package/commands/cleanup.md +20 -0
- package/commands/contrib.md +33 -0
- package/commands/doctor.md +426 -0
- package/commands/forever.md +7 -0
- package/commands/help.md +244 -0
- package/commands/observe.md +12 -0
- package/commands/plan.md +7 -0
- package/commands/plugins.md +255 -0
- package/commands/project-install.md +17 -0
- package/commands/resume.md +8 -0
- package/commands/retrospect.md +55 -0
- package/commands/user-install.md +17 -0
- package/commands/yolo.md +7 -0
- package/gemini-extension.json +18 -0
- package/hooks/after-agent.sh +101 -0
- package/hooks/hooks.json +32 -0
- package/hooks/session-start.sh +129 -0
- package/package.json +50 -0
- package/plugin.json +48 -0
- package/scripts/sync-command-surfaces.js +62 -0
- package/versions.json +4 -0
package/GEMINI.md
ADDED
|
@@ -0,0 +1,392 @@
|
|
|
1
|
+
# Babysitter -- Orchestration Context for Gemini CLI
|
|
2
|
+
|
|
3
|
+
Babysitter is an event-sourced workflow orchestrator. When active, it runs an **in-session loop** driven by the AfterAgent hook: each turn, you perform one orchestration step, then stop -- the hook re-injects the prompt to continue.
|
|
4
|
+
|
|
5
|
+
Orchestrate `.a5c/runs/<runId>/` through iterative execution. Use the SDK CLI to drive the orchestration loop.
|
|
6
|
+
|
|
7
|
+
## Dependencies
|
|
8
|
+
|
|
9
|
+
### Babysitter SDK and CLI
|
|
10
|
+
|
|
11
|
+
Read the SDK version from the extension manifest to ensure version compatibility:
|
|
12
|
+
|
|
13
|
+
```bash
SDK_VERSION=$(node -e "try{console.log(JSON.parse(require('fs').readFileSync('${GEMINI_EXTENSION_PATH}/versions.json','utf8')).sdkVersion||'latest')}catch{console.log('latest')}")
npm i -g @a5c-ai/babysitter-sdk@$SDK_VERSION
```
|
|
17
|
+
|
|
18
|
+
Then use the CLI alias: `CLI="babysitter"`
|
|
19
|
+
|
|
20
|
+
**Alternatively, use the CLI alias:** `CLI="npx -y @a5c-ai/babysitter-sdk@$SDK_VERSION"`
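
The version lookup used by both install styles can be sketched as a small Node.js helper. This is only an illustration of the fallback logic in the one-liner above; the function names `resolveSdkVersion` and `sdkPackageSpec` are hypothetical and not part of the SDK.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: returns the pinned SDK version from
// <extensionPath>/versions.json, or "latest" when the file is missing,
// unreadable, not valid JSON, or has no `sdkVersion` field.
function resolveSdkVersion(extensionPath) {
  try {
    const raw = fs.readFileSync(path.join(extensionPath, "versions.json"), "utf8");
    return JSON.parse(raw).sdkVersion || "latest";
  } catch {
    return "latest";
  }
}

// Hypothetical helper: builds the npm package spec that both
// `npm i -g ...` and `npx -y ...` would receive.
function sdkPackageSpec(extensionPath) {
  return `@a5c-ai/babysitter-sdk@${resolveSdkVersion(extensionPath)}`;
}
```

Falling back to `latest` keeps installation working when the manifest is absent, at the cost of losing the version pin.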
|
|
21
|
+
|
|
22
|
+
### jq
|
|
23
|
+
|
|
24
|
+
Make sure `jq` is installed and available on your `PATH`; if it is not, install it.
|
|
25
|
+
|
|
26
|
+
---
|
|
27
|
+
|
|
28
|
+
## When Babysitter Is Active
|
|
29
|
+
|
|
30
|
+
The AfterAgent hook fires after every agent turn. If a babysitter run is bound to this session (via `GEMINI_SESSION_ID`), the hook will:
|
|
31
|
+
- **Block exit** (`{"decision":"block","reason":"...","systemMessage":"..."}`) if the run is not yet complete
|
|
32
|
+
- **Allow exit** (`{}` or `{"decision":"allow"}`) once you output `<promise>COMPLETION_PROOF</promise>` matching the run's `completionProof`
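
The block/allow rule above can be sketched as a pure function. This is a minimal illustration, not the actual hook logic (which lives in the SDK CLI); the names `extractPromise` and `afterAgentDecision`, the shape of the `run` argument, and the placeholder message strings are all assumptions.

```javascript
// Hypothetical helper: returns the text inside the first
// <promise>...</promise> tag of the agent output, or null if there is none.
function extractPromise(agentOutput) {
  const match = /<promise>([\s\S]*?)<\/promise>/.exec(agentOutput || "");
  return match ? match[1].trim() : null;
}

// Hypothetical helper: decides the hook response for one finished turn.
// `run` is null when no run is bound to the session; otherwise it carries
// the run's `completionProof` (assumed to be set only once the run completed).
function afterAgentDecision(run, agentOutput) {
  if (!run) return {}; // nothing bound to this session: allow exit
  const proof = extractPromise(agentOutput);
  if (run.completionProof && proof === run.completionProof) {
    return {}; // proof matches: allow exit
  }
  // Placeholder strings; the real reason and systemMessage come from the SDK.
  return {
    decision: "block",
    reason: "run not yet complete",
    systemMessage: "continue with the next iteration",
  };
}
```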
|
|
33
|
+
|
|
34
|
+
---
|
|
35
|
+
|
|
36
|
+
## Core Iteration Workflow
|
|
37
|
+
|
|
38
|
+
The babysitter workflow has 8 steps:
|
|
39
|
+
|
|
40
|
+
### 1. Create or find the process for the run
|
|
41
|
+
|
|
42
|
+
#### Interview phase
|
|
43
|
+
|
|
44
|
+
##### Interactive mode (default)
|
|
45
|
+
|
|
46
|
+
Interview the user about intent, requirements, goal, and scope using the `ask_user` tool, before setting up the in-session loop.
|
|
47
|
+
|
|
48
|
+
This is a multi-step phase for understanding the intent and choosing the perspective from which to build the process. It draws on research in the target repo, brief online research if needed, any additional instructions, clarifications of the intent, requirements, goal, and scope, and the process library (processes, specializations, skills, subagents, methodologies, references, etc.), which also serves as a guide for methodology building. The library is located at `[skill-root]/process/specializations/**/**/**`, `[skill-root]/process/methodologies/`, and `[skill-root]/process/contrib/[contributor-username]/`.
|
|
49
|
+
|
|
50
|
+
The first step should be to look at the state of the repo, then find the most relevant processes, specializations, skills, subagents, methodologies, references, etc. to use as a reference. Use the babysitter CLI discover command to find the relevant processes, skills, subagents, etc. at various stages.
|
|
51
|
+
|
|
52
|
+
This phase can then chain online research, repo research, user questions, and other steps, one after another, until the intent, requirements, goal, and scope are clear and the user is satisfied with the understanding. After each step, decide what type of step to take next. Do not plan more than one step ahead in this phase. The same step type can be used more than once.
|
|
53
|
+
|
|
54
|
+
##### Non-interactive mode (running without ask_user tool)
|
|
55
|
+
|
|
56
|
+
When running non-interactively, skip the interview phase entirely. Instead:
|
|
57
|
+
1. Parse the initial prompt to extract intent, scope, and requirements.
|
|
58
|
+
2. Research the repo structure to understand the codebase.
|
|
59
|
+
3. Search the process library for the most relevant specialization/methodology.
|
|
60
|
+
4. Proceed directly to the process creation phase using the extracted requirements.
|
|
61
|
+
|
|
62
|
+
#### User Profile Integration
|
|
63
|
+
|
|
64
|
+
Before building the process, check for an existing user profile to personalize the orchestration:
|
|
65
|
+
|
|
66
|
+
1. **Read user profile**: Run `babysitter profile:read --user --json` to load the user profile from `~/.a5c/user-profile.json`. **Always use the CLI for profile operations -- never import or call SDK profile functions directly.**
|
|
67
|
+
|
|
68
|
+
2. **Pre-fill context**: Use the profile to understand the user's specialties, expertise levels, preferences, and communication style. This informs how you conduct the interview (skip questions the profile already answers) and how you build the process.
|
|
69
|
+
|
|
70
|
+
3. **Breakpoint density**: Use the `breakpointTolerance` field to calibrate breakpoint placement in the generated process:
|
|
71
|
+
- `minimal`/`low` (expert users): Fewer breakpoints -- only at critical decision points
|
|
72
|
+
- `moderate` (intermediate users): Standard breakpoints at phase boundaries
|
|
73
|
+
- `high`/`maximum` (novice users): More breakpoints -- add review gates after each implementation step
|
|
74
|
+
- Always respect `alwaysBreakOn` for operations that must always pause (e.g., destructive-git, deploy)
|
|
75
|
+
|
|
76
|
+
4. **Tool preferences**: Use `toolPreferences` and `installedSkills`/`installedAgents` to prioritize which agents and skills to use in the process.
|
|
77
|
+
|
|
78
|
+
5. **Communication style**: Adapt process descriptions and breakpoint questions to match the user's `communicationStyle` preferences.
|
|
79
|
+
|
|
80
|
+
6. **If no profile exists**: Proceed normally with the interview phase.
|
|
81
|
+
|
|
82
|
+
7. **CLI profile commands (mandatory)**: All profile operations MUST use the babysitter CLI:
|
|
83
|
+
- `babysitter profile:read --user --json` -- Read user profile as JSON
|
|
84
|
+
- `babysitter profile:read --project --json` -- Read project profile as JSON
|
|
85
|
+
- `babysitter profile:write --user --input <file> --json` -- Write user profile from file
|
|
86
|
+
- `babysitter profile:write --project --input <file> --json` -- Write project profile from file
|
|
87
|
+
- `babysitter profile:merge --user --input <file> --json` -- Merge partial updates into user profile
|
|
88
|
+
- `babysitter profile:merge --project --input <file> --json` -- Merge partial updates into project profile
|
|
89
|
+
- `babysitter profile:render --user` -- Render user profile as readable markdown
|
|
90
|
+
- `babysitter profile:render --project` -- Render project profile as readable markdown
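
The breakpoint-density rule from item 3 can be sketched as follows. This is an illustration under stated assumptions: the `step` descriptor (`operation`, `critical`, `phaseBoundary`), the density labels, and the `moderate` default for a missing profile are hypothetical; only `breakpointTolerance`, its five levels, and `alwaysBreakOn` come from the text above.

```javascript
// Tolerance levels from the profile guidance, grouped by resulting density.
// The density labels themselves are hypothetical.
const DENSITY_BY_TOLERANCE = {
  minimal: "critical-only",
  low: "critical-only",
  moderate: "phase-boundaries",
  high: "every-step",
  maximum: "every-step",
};

// Hypothetical helper: decides whether a generated process should place a
// breakpoint before `step`.
function needsBreakpoint(profile, step) {
  const alwaysBreakOn = (profile && profile.alwaysBreakOn) || [];
  // alwaysBreakOn wins regardless of tolerance.
  if (alwaysBreakOn.includes(step.operation)) return true;
  // Assumption: treat a missing profile or unknown level as "moderate".
  const tolerance = (profile && profile.breakpointTolerance) || "moderate";
  const density = DENSITY_BY_TOLERANCE[tolerance] || "phase-boundaries";
  if (density === "every-step") return true;
  if (density === "phase-boundaries") return Boolean(step.critical || step.phaseBoundary);
  return Boolean(step.critical);
}
```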
|
|
91
|
+
|
|
92
|
+
#### Process creation phase
|
|
93
|
+
|
|
94
|
+
After the interview phase, create the complete custom process files (JS and JSON) for the run according to the "Process Creation Guidelines and methodologies" section. Also install the latest `@a5c-ai/babysitter-sdk` into `.a5c/package.json` if it is not already installed. **IMPORTANT**: When installing into `.a5c/`, use `npm i --prefix .a5c @a5c-ai/babysitter-sdk@latest` or a subshell `(cd .a5c && npm i @a5c-ai/babysitter-sdk@latest)` so that the working directory is never left inside `.a5c/`; a CWD left there causes doubled path resolution bugs.
|
|
95
|
+
|
|
96
|
+
**IMPORTANT -- Path resolution**: Always use **absolute paths** for `--entry` when calling `run:create`, and always run the CLI from the **project root** directory (not from `.a5c/`).
|
|
97
|
+
|
|
98
|
+
After the process is created and before creating the run:
|
|
99
|
+
- **Interactive mode**: Describe the process at a high level to the user and ask for confirmation using the `ask_user` tool. Also generate `[process-name].mermaid.md` and `[process-name].process.md` files describing it.
|
|
100
|
+
- **Non-interactive mode**: Proceed directly to creating the run without user confirmation.
|
|
101
|
+
|
|
102
|
+
### 2. Create run and bind session (single command)
|
|
103
|
+
|
|
104
|
+
**For new runs:**
|
|
105
|
+
|
|
106
|
+
```bash
$CLI run:create \
  --process-id <id> \
  --entry <path>#<export> \
  --inputs <file> \
  --prompt "$PROMPT" \
  --harness gemini-cli \
  --session-id "${GEMINI_SESSION_ID}" \
  --state-dir ".a5c/state" \
  --json
```
|
|
117
|
+
|
|
118
|
+
**Required flags:**
|
|
119
|
+
- `--process-id <id>` -- unique identifier for the process definition
|
|
120
|
+
- `--entry <path>#<export>` -- path to the process JS file and its named export
|
|
121
|
+
- `--prompt "$PROMPT"` -- the user's initial prompt/request text
|
|
122
|
+
- `--harness gemini-cli` -- activates Gemini CLI session binding
|
|
123
|
+
- `--session-id "${GEMINI_SESSION_ID}"` -- the Gemini session identifier
|
|
124
|
+
|
|
125
|
+
**Optional flags:**
|
|
126
|
+
- `--inputs <file>` -- path to a JSON file with process inputs
|
|
127
|
+
- `--run-id <id>` -- override auto-generated run ID
|
|
128
|
+
- `--runs-dir <dir>` -- override runs directory (default: `.a5c/runs`)
|
|
129
|
+
- `--state-dir <dir>` -- state directory for session binding (default: `.a5c/state`)
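
As a sketch of how the flags above fit together, the helper below assembles the `run:create` argument list, resolves `--entry` to an absolute path as required by the path-resolution note, and rejects missing required values. The function name `runCreateArgs` and the option keys are hypothetical; only the flag names come from the lists above.

```javascript
const path = require("node:path");

// Hypothetical option keys for the required run:create flags.
const REQUIRED = ["processId", "entryFile", "entryExport", "prompt", "sessionId"];

// Hypothetical helper: builds the argv for `run:create`.
function runCreateArgs(projectRoot, opts) {
  for (const key of REQUIRED) {
    if (!opts[key]) throw new Error(`missing required option: ${key}`);
  }
  // Always pass an absolute path for --entry.
  const entry = `${path.resolve(projectRoot, opts.entryFile)}#${opts.entryExport}`;
  const args = [
    "run:create",
    "--process-id", opts.processId,
    "--entry", entry,
    "--prompt", opts.prompt,
    "--harness", "gemini-cli",
    "--session-id", opts.sessionId,
  ];
  // Optional flags are only emitted when a value was supplied.
  if (opts.inputs) args.push("--inputs", opts.inputs);
  if (opts.runId) args.push("--run-id", opts.runId);
  if (opts.runsDir) args.push("--runs-dir", opts.runsDir);
  if (opts.stateDir) args.push("--state-dir", opts.stateDir);
  args.push("--json");
  return args;
}
```

One way to use the result would be `child_process.execFileSync("babysitter", args, { cwd: projectRoot })`, which keeps the working directory at the project root rather than `.a5c/`.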
|
|
130
|
+
|
|
131
|
+
**For resuming existing runs:**
|
|
132
|
+
|
|
133
|
+
```bash
$CLI session:resume \
  --state-dir ".a5c/state" \
  --run-id <runId> --runs-dir .a5c/runs --json
```
|
|
138
|
+
|
|
139
|
+
### 3. Run Iteration
|
|
140
|
+
|
|
141
|
+
```bash
$CLI run:iterate .a5c/runs/<runId> --json --iteration <n>
```
|
|
144
|
+
|
|
145
|
+
**Output:**
|
|
146
|
+
```json
{
  "iteration": 1,
  "status": "executed|waiting|completed|failed|none",
  "action": "executed-tasks|waiting|none",
  "reason": "auto-runnable-tasks|breakpoint-waiting|terminal-state",
  "count": 3,
  "completionProof": "only-present-when-completed",
  "metadata": { "runId": "...", "processId": "..." }
}
```
|
|
157
|
+
|
|
158
|
+
**Status values:**
|
|
159
|
+
- `"executed"` - Tasks executed, continue looping
|
|
160
|
+
- `"waiting"` - Breakpoint/sleep, pause until released
|
|
161
|
+
- `"completed"` - Run finished successfully
|
|
162
|
+
- `"failed"` - Run failed with error
|
|
163
|
+
- `"none"` - No pending effects
|
|
164
|
+
|
|
165
|
+
**Common mistake to avoid:**
|
|
166
|
+
- WRONG: Calling run:iterate, performing the effect, posting the result, then calling run:iterate again in the same turn
|
|
167
|
+
- CORRECT: Calling run:iterate, performing the effect, posting the result, then STOPPING so the AfterAgent hook triggers the next iteration
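
The status values above can be mapped to a next step for the current turn. The sketch below is illustrative only: `nextStep` and the action labels are hypothetical names, and the input is assumed to be the parsed JSON printed by `run:iterate`.

```javascript
// Hypothetical helper: maps a parsed run:iterate result to what the agent
// should do in the current turn. It never triggers another run:iterate;
// the AfterAgent hook is responsible for the next iteration.
function nextStep(result) {
  switch (result.status) {
    case "executed":
      // Tasks ran; stop the turn so the hook can continue the loop.
      return { action: "continue" };
    case "waiting":
      // A breakpoint or sleep is pending and must be resolved first.
      return { action: "resolve-pending" };
    case "completed":
      // completionProof is only present in this state.
      return { action: "emit-proof", output: `<promise>${result.completionProof}</promise>` };
    case "failed":
      return { action: "recover" };
    case "none":
      return { action: "idle" };
    default:
      throw new Error(`unexpected run:iterate status: ${result.status}`);
  }
}
```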
|
|
168
|
+
|
|
169
|
+
### 4. Get Effects
|
|
170
|
+
|
|
171
|
+
```bash
$CLI task:list .a5c/runs/<runId> --pending --json
```
|
|
174
|
+
|
|
175
|
+
### 5. Perform Effects
|
|
176
|
+
|
|
177
|
+
Run the effect outside the SDK (yourself, via the hook, or via another worker). Once the effect has been executed, whether by delegating to a sub-agent via `@agent` or by other means, post the outcome summary into the run by calling `task:post`.
|
|
178
|
+
|
|
179
|
+
IMPORTANT:
|
|
180
|
+
- Delegate using `@agent` for sub-agent delegation when possible.
|
|
181
|
+
- Make sure the change was actually performed, not merely described or implied.
|
|
182
|
+
- Instruct the agent to perform the task in full and return only the summary result in the requested schema.
|
|
183
|
+
|
|
184
|
+
#### 5.1 Breakpoint Handling
|
|
185
|
+
|
|
186
|
+
##### 5.1.1 Interactive mode
|
|
187
|
+
|
|
188
|
+
If running in interactive mode, use the `ask_user` tool to ask the user the breakpoint question.
|
|
189
|
+
|
|
190
|
+
**CRITICAL: Response validation rules:**
|
|
191
|
+
- The `ask_user` call MUST include explicit "Approve" and "Reject" options
|
|
192
|
+
- If `ask_user` returns empty or no selection: treat as **NOT approved**
|
|
193
|
+
- NEVER fabricate, synthesize, or infer approval text
|
|
194
|
+
- NEVER assume approval from ambiguous responses
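
The validation rules above amount to a strict mapping from the `ask_user` selection to the posted value. In this sketch, `breakpointResult` is a hypothetical helper and the exact shape of the `ask_user` return value is an assumption; the `approved` and `response` fields match the posting examples in this section.

```javascript
// Hypothetical helper: turns an ask_user selection into the value to post.
// Only the exact "Approve" option counts as approval; empty, missing, or
// ambiguous selections are treated as NOT approved. The response text is
// passed through from the user and never synthesized.
function breakpointResult(selection, userText) {
  const approved = selection === "Approve";
  return {
    status: "ok", // rejections are also posted with --status ok
    value: {
      approved,
      response: typeof userText === "string" ? userText : "",
    },
  };
}
```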
|
|
195
|
+
|
|
196
|
+
**Breakpoint posting examples:**
|
|
197
|
+
|
|
198
|
+
```bash
# CORRECT: User approved
echo '{"approved": true, "response": "Looks good, proceed"}' > tasks/<effectId>/output.json
$CLI task:post <runId> <effectId> --status ok --value tasks/<effectId>/output.json

# CORRECT: User rejected (ALWAYS use --status ok, not --status error)
echo '{"approved": false, "response": "Stop here"}' > tasks/<effectId>/output.json
$CLI task:post <runId> <effectId> --status ok --value tasks/<effectId>/output.json
```
|
|
207
|
+
|
|
208
|
+
##### 5.1.2 Non-interactive mode
|
|
209
|
+
|
|
210
|
+
Skip the `ask_user` tool. Resolve the breakpoint by selecting the best option according to context, then post the result via `task:post`.
|
|
211
|
+
|
|
212
|
+
### 6. Results Posting
|
|
213
|
+
|
|
214
|
+
**IMPORTANT**: Do NOT write `result.json` directly. The SDK owns that file.
|
|
215
|
+
|
|
216
|
+
**Workflow:**
|
|
217
|
+
|
|
218
|
+
1. Write the result **value** to a separate file (e.g., `tasks/<effectId>/output.json`).
|
|
219
|
+
2. Post the result:
|
|
220
|
+
```bash
$CLI task:post .a5c/runs/<runId> <effectId> \
  --status ok \
  --value tasks/<effectId>/output.json \
  --json
```
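
The two-step workflow above can be sketched as one helper that writes the value file and returns the matching `task:post` arguments. The name `stageTaskResult` is hypothetical, and the sketch assumes that the `--value` path is resolved relative to the run directory, which the source does not state explicitly.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: writes the result value to tasks/<effectId>/output.json
// under the run directory and returns the argv for `task:post`.
// It never touches result.json, which is owned by the SDK.
function stageTaskResult(runDir, effectId, value) {
  const relative = path.join("tasks", effectId, "output.json");
  const absolute = path.join(runDir, relative);
  fs.mkdirSync(path.dirname(absolute), { recursive: true });
  fs.writeFileSync(absolute, JSON.stringify(value, null, 2));
  // Assumption: --value is interpreted relative to the run directory.
  return ["task:post", runDir, effectId, "--status", "ok", "--value", relative, "--json"];
}
```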
|
|
226
|
+
|
|
227
|
+
### 7. STOP after every phase once the run is bound to the session
|
|
228
|
+
|
|
229
|
+
The AfterAgent hook drives the loop, not you. After run:create or run-session association and after each effect is posted, you MUST stop the session and return control. The AfterAgent hook will call you back to continue.
|
|
230
|
+
|
|
231
|
+
### 8. Completion Proof
|
|
232
|
+
|
|
233
|
+
When the run is completed, the CLI will emit a `completionProof` value. You must return that exact value wrapped in a `<promise>...</promise>` tag:
|
|
234
|
+
|
|
235
|
+
```
<promise>THE_PROOF_VALUE</promise>
```
|
|
238
|
+
|
|
239
|
+
---
|
|
240
|
+
|
|
241
|
+
## Task Kinds
|
|
242
|
+
|
|
243
|
+
**CRITICAL RULE: NEVER use `node` kind effects in generated processes.**
|
|
244
|
+
|
|
245
|
+
| Kind | Description | Executor | When to use |
|------|-------------|----------|-------------|
| `agent` | LLM agent | Agent runtime | **Default for all tasks** |
| `skill` | Skill invocation | Skill system | When a matching installed skill exists |
| `shell` | Shell command | Local shell | Only for existing CLI tools, tests, git, linters, builds |
| `breakpoint` | Human approval | UI/CLI | Decision gates requiring user input |
| `sleep` | Time gate | Scheduler | Time-based pauses |
|
|
252
|
+
|
|
253
|
+
### Agent Task Example
|
|
254
|
+
|
|
255
|
+
```javascript
export const agentTask = defineTask('agent-scorer', (args, taskCtx) => ({
  kind: 'agent',
  title: 'Agent scoring',
  agent: {
    name: 'quality-scorer',
    prompt: {
      role: 'QA engineer',
      task: 'Score results 0-100',
      context: { ...args },
      instructions: ['Review', 'Score', 'Recommend'],
      outputFormat: 'JSON'
    },
    outputSchema: {
      type: 'object',
      required: ['score']
    }
  },
  io: {
    inputJsonPath: `tasks/${taskCtx.effectId}/input.json`,
    outputJsonPath: `tasks/${taskCtx.effectId}/result.json`
  }
}));
```
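
Before posting an agent's output, it can be useful to check it against the `required` list of the task's `outputSchema`. The helper below is a minimal sketch; `missingRequiredFields` is a hypothetical name, and it checks only key presence rather than performing full schema validation.

```javascript
// Hypothetical helper: lists the required keys that are absent from `output`.
// Only the `required` array of the schema is consulted.
function missingRequiredFields(outputSchema, output) {
  const required = (outputSchema && outputSchema.required) || [];
  if (output === null || typeof output !== "object") return [...required];
  return required.filter((key) => !(key in output));
}

// Schema fragment copied from the agent task example above.
const scorerSchema = { type: "object", required: ["score"] };
```

An empty array means every required field is present; a non-empty one suggests asking the sub-agent to redo the task in the requested schema.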
|
|
279
|
+
|
|
280
|
+
---
|
|
281
|
+
|
|
282
|
+
## Quick Commands Reference
|
|
283
|
+
|
|
284
|
+
**Install SDK:**
```bash
SDK_VERSION=$(node -e "try{console.log(JSON.parse(require('fs').readFileSync('${GEMINI_EXTENSION_PATH}/versions.json','utf8')).sdkVersion||'latest')}catch{console.log('latest')}")
npm i -g @a5c-ai/babysitter-sdk@$SDK_VERSION
```

**Create run (with session binding):**
```bash
$CLI run:create --process-id <id> --entry <path>#<export> --inputs <file> \
  --prompt "$PROMPT" --harness gemini-cli \
  --session-id "${GEMINI_SESSION_ID}" --state-dir ".a5c/state" --json
```

**Check status:**
```bash
$CLI run:status <runId> --json
```

**View events:**
```bash
$CLI run:events <runId> --limit 20 --reverse
```

**List tasks:**
```bash
$CLI task:list <runId> --pending --json
```

**Post task result:**
```bash
$CLI task:post <runId> <effectId> --status <ok|error> --json
```

**Iterate:**
```bash
$CLI run:iterate <runId> --json --iteration <n>
```
|
|
321
|
+
|
|
322
|
+
---
|
|
323
|
+
|
|
324
|
+
## Recovery from failure
|
|
325
|
+
|
|
326
|
+
If at any point the run fails because of SDK issues or a corrupted state or journal, analyze the error and the journal events. Restore the last known good state, adapt as needed, and try to continue the run.
|
|
327
|
+
|
|
328
|
+
---
|
|
329
|
+
|
|
330
|
+
## Process Creation Guidelines and methodologies
|
|
331
|
+
|
|
332
|
+
- When building UX and full stack applications, integrate/link the main pages of the frontend with functionality created for every phase of the development process (where relevant), so that there is a way to test the functionality of the app as we go.
|
|
333
|
+
|
|
334
|
+
- Unless otherwise specified, prefer quality-gated iterative development loops in the process.
|
|
335
|
+
|
|
336
|
+
- You can change the process after the run is created, or during the run, if you discover new information or requirements.
|
|
337
|
+
|
|
338
|
+
- The process should be a comprehensive and complete solution to the user request.
|
|
339
|
+
|
|
340
|
+
- The process should usually be a composition (in code) of multiple processes from the process library, each utilizing a different process from the library as a reference.
|
|
341
|
+
|
|
342
|
+
- Include verification and refinement steps (and loops) for planning phases and integration phases.
|
|
343
|
+
|
|
344
|
+
- Create the process with (and around) the available skills and subagents. (Check which are available first.)
|
|
345
|
+
|
|
346
|
+
- Prefer incremental work that allows testing and experimentation as we go.
|
|
347
|
+
|
|
348
|
+
### Process File Discovery Markers
|
|
349
|
+
|
|
350
|
+
When creating process files, include `@skill` and `@agent` markers in the JSDoc header:
|
|
351
|
+
|
|
352
|
+
```javascript
/**
 * @process specializations/web-development/react-app-development
 * @description React app development with TDD
 * @skill frontend-design specializations/web-development/skills/frontend-design/SKILL.md
 * @agent frontend-architect specializations/web-development/agents/frontend-architect/AGENT.md
 */
```
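
To illustrate how such markers could be consumed, the sketch below extracts them from the first JSDoc block of a process file. It is not the SDK's discovery implementation; `parseDiscoveryMarkers` and the returned object shape are hypothetical.

```javascript
// Hypothetical helper: reads @process, @description, @skill and @agent
// markers from the first /** ... */ block in `source`.
function parseDiscoveryMarkers(source) {
  const markers = { process: null, description: null, skills: [], agents: [] };
  const header = /\/\*\*([\s\S]*?)\*\//.exec(source);
  if (!header) return markers;
  for (const rawLine of header[1].split("\n")) {
    // Drop the leading " * " decoration of a JSDoc line.
    const line = rawLine.replace(/^\s*\*\s?/, "").trim();
    const tag = /^@(\w+)\s+(.*)$/.exec(line);
    if (!tag) continue;
    const [, name, rest] = tag;
    if (name === "process") markers.process = rest;
    else if (name === "description") markers.description = rest;
    else if (name === "skill" || name === "agent") {
      // Marker format: @skill <id> <path-to-definition>
      const [id, file] = rest.split(/\s+/);
      markers[name === "skill" ? "skills" : "agents"].push({ id, file });
    }
  }
  return markers;
}
```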
|
|
360
|
+
|
|
361
|
+
---
|
|
362
|
+
|
|
363
|
+
## Critical Rules
|
|
364
|
+
|
|
365
|
+
CRITICAL RULE: The completion proof is emitted only when the run is completed. You may ONLY output `<promise>SECRET</promise>` when the run is completely and unequivocally DONE. Do not output false promises to escape the run, and do not mention the secret to the user.
|
|
366
|
+
|
|
367
|
+
CRITICAL RULE: In interactive mode, NEVER auto-approve breakpoints. If `ask_user` returns empty, treat it as NOT approved. NEVER fabricate approval responses.
|
|
368
|
+
|
|
369
|
+
CRITICAL RULE: If a run is broken, failed, or in an unknown state, one way to recover is to remove the last bad entries from the journal and rebuild the state.
|
|
370
|
+
|
|
371
|
+
CRITICAL RULE: When creating processes, search for available skills and subagents before thinking about the exact orchestration.
|
|
372
|
+
|
|
373
|
+
CRITICAL RULE: Do not use the babysitter orchestration skill inside the delegated tasks.
|
|
374
|
+
|
|
375
|
+
CRITICAL RULE: Never build wrapper scripts to orchestrate the runs. Use the CLI to drive the orchestration loop.
|
|
376
|
+
|
|
377
|
+
CRITICAL RULE: Never fall back to simpler execution if the user activated this orchestration. Persist in executing the orchestration itself.
|
|
378
|
+
|
|
379
|
+
CRITICAL RULE: After run:create or run-session association and after each posted effect, you MUST stop the session and return control. Do NOT proceed to the next run:iterate in the same turn.
|
|
380
|
+
|
|
381
|
+
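CRITICAL RULE: NEVER use `kind: 'node'` in generated process files. Work tasks MUST use `kind: 'agent'` or `kind: 'skill'`; `shell`, `breakpoint`, and `sleep` remain available only for the purposes listed under Task Kinds.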
|
|
382
|
+
|
|
383
|
+
CRITICAL RULE: NEVER bypass the babysitter orchestration model when the user explicitly requested it.
|
|
384
|
+
|
|
385
|
+
CRITICAL RULE: For sub-agent delegation, use `@agent` in your prompts. This is the Gemini CLI mechanism for delegating work to sub-agents.
|
|
386
|
+
|
|
387
|
+
---
|
|
388
|
+
|
|
389
|
+
## See Also
|
|
390
|
+
- `process/tdd-quality-convergence.js` - TDD quality convergence example
|
|
391
|
+
- `reference/ADVANCED_PATTERNS.md` - Agent/skill patterns, iterative convergence
|
|
392
|
+
- `packages/sdk/sdk.md` - SDK API reference
|
package/README.md
ADDED
|
@@ -0,0 +1,295 @@
|
|
|
1
|
+
# @a5c-ai/babysitter-gemini
|
|
2
|
+
|
|
3
|
+
Babysitter integration package for Gemini CLI.
|
|
4
|
+
|
|
5
|
+
This package ships a Gemini CLI extension bundle:
|
|
6
|
+
|
|
7
|
+
- `gemini-extension.json` — Gemini CLI extension manifest
|
|
8
|
+
- `GEMINI.md` — Orchestration context file loaded into every agent session
|
|
9
|
+
- `commands/` — Slash command definitions for all babysitter workflows
|
|
10
|
+
- `hooks/` — SessionStart and AfterAgent hook scripts
|
|
11
|
+
- `bin/cli.js` — `babysitter-gemini` installer CLI
|
|
12
|
+
|
|
13
|
+
It uses the Babysitter SDK CLI and the shared `~/.a5c` process-library state.
|
|
14
|
+
The extension registers the hooks and commands so Gemini CLI can drive the
|
|
15
|
+
Babysitter orchestration loop from within the agent session.
|
|
16
|
+
|
|
17
|
+
## Installation
|
|
18
|
+
|
|
19
|
+
Install the SDK CLI first:

```bash
npm install -g @a5c-ai/babysitter-sdk
```

Then install the Gemini extension globally:

```bash
npm install -g @a5c-ai/babysitter-gemini
babysitter-gemini install --global
```

Or install to a specific workspace only:

```bash
babysitter-gemini install --workspace /path/to/project
```

For development, use a symlink instead of copying files:

```bash
babysitter-gemini install --symlink
```

Alternatively, install through the SDK harness helper:

```bash
babysitter harness:install-plugin gemini-cli
```
|
|
49
|
+
|
|
50
|
+
After installation, restart Gemini CLI to activate the extension.
|
|
51
|
+
|
|
52
|
+
## How It Works
|
|
53
|
+
|
|
54
|
+
The plugin implements a hook-driven orchestration loop:
|
|
55
|
+
|
|
56
|
+
1. `SessionStart` fires when a new Gemini CLI session begins. It ensures the correct SDK CLI version is installed (pinned via `versions.json`) and initializes session state under `.a5c/state/`.

2. The `GEMINI.md` context file is loaded into every session, instructing the agent on the full 8-step orchestration workflow, from interviewing the user and creating a process definition through iterating effects and posting results.

3. The agent performs **one orchestration phase per turn**, then stops.

4. `AfterAgent` fires after every agent turn. It checks whether a babysitter run is bound to the current session. If the run is not yet complete, the hook returns `{"decision":"block","reason":"...","systemMessage":"..."}` to keep the session alive and inject the next iteration prompt. Once the agent emits `<promise>COMPLETION_PROOF</promise>`, the hook allows the session to exit.
|
|
71
|
+
|
|
72
|
+
## Hook Types
|
|
73
|
+
|
|
74
|
+
| Hook | Gemini CLI Event | Script | Purpose |
|------|-----------------|--------|---------|
| Session initialization | `SessionStart` | `hooks/session-start.sh` | Installs the correct SDK version, creates session state |
| Continuation loop | `AfterAgent` | `hooks/after-agent.sh` | Blocks session exit and drives the orchestration loop until the run completes |
|
|
78
|
+
|
|
79
|
+
Both hooks delegate to the SDK CLI via `babysitter hook:run` for all business
|
|
80
|
+
logic. The shell scripts handle SDK version bootstrapping and stdin capture only.
|
|
81
|
+
|
|
82
|
+
## Available Commands
|
|
83
|
+
|
|
84
|
+
All 15 commands follow the orchestration workflow described in `GEMINI.md`.
|
|
85
|
+
Invoke them in Gemini CLI with `/babysitter:<command>`.
|
|
86
|
+
|
|
87
|
+
### Primary Orchestration Commands

| Command | Description |
|---------|-------------|
| `/babysitter:call [instructions]` | Start a babysitter-orchestrated run. Interviews you (interactive) or parses the prompt (non-interactive), creates a process definition, then executes it step by step. |
| `/babysitter:plan [instructions]` | Generate a detailed execution plan without running anything. Stops after Phase 1. |
| `/babysitter:yolo [instructions]` | Start a run in fully autonomous mode: all breakpoints are auto-approved and no user interaction is requested. |
| `/babysitter:forever [instructions]` | Start a run that loops indefinitely with sleep intervals between iterations. |
| `/babysitter:resume [run-id]` | Resume a paused or interrupted run. If no run ID is given, discovers all runs under `.a5c/runs/` and suggests which to continue. |

### Diagnostic and Analysis Commands

| Command | Description |
|---------|-------------|
| `/babysitter:doctor [run-id]` | Run a 10-point health check on a run: journal integrity, state cache consistency, effect status, lock status, session state, log analysis, disk usage, process validation, and hook execution health. |
| `/babysitter:retrospect [run-id...]` | Analyze completed runs and suggest process improvements. Supports single run, multiple IDs, or `--all` for aggregate cross-run analysis. |

### Lifecycle Management Commands

| Command | Description |
|---------|-------------|
| `/babysitter:assimilate [target]` | Convert an external methodology, harness, or specification into native babysitter process definitions with skills and agents. Accepts a repo URL, harness name, or spec path. |
| `/babysitter:cleanup [--dry-run] [--keep-days N]` | Aggregate insights from completed/failed runs into `docs/run-history-insights.md`, then remove old run data. Defaults to keeping runs newer than 7 days. |
| `/babysitter:observe [--watch-dir dir]` | Launch the real-time observer dashboard (`@a5c-ai/babysitter-observer-dashboard`) and open it in the browser. |

### Setup Commands

| Command | Description |
|---------|-------------|
| `/babysitter:user-install` | First-time onboarding: installs dependencies, interviews you about specialties and preferences, and builds your user profile at `~/.a5c/user-profile.json`. |
| `/babysitter:project-install` | Onboard a project: researches the codebase, builds the project profile, installs recommended tools, and optionally configures CI/CD. |

### Plugin and Community Commands

| Command | Description |
|---------|-------------|
| `/babysitter:plugins [action]` | Manage babysitter plugins: list installed plugins, install from marketplace, update, uninstall, or configure. |
| `/babysitter:contrib [feedback]` | Submit a bug report, feature request, bugfix PR, library contribution, or documentation answer to the babysitter project. |
| `/babysitter:help [topic]` | Show help for babysitter commands, processes, skills, agents, or methodologies. Pass a topic like `command call` or `process tdd-quality-convergence` for targeted docs. |
|
|
126
|
+
|
|
127
|
+
## Available Skills
|
|
128
|
+
|
|
129
|
+
The GEMINI.md context file provides 16 built-in orchestration skills. These are
|
|
130
|
+
invoked implicitly when the agent follows the orchestration workflow — they are
|
|
131
|
+
not separate slash commands but capability areas described in the context file.
|
|
132
|
+
|
|
133
|
+
| Skill | Description |
|
|
134
|
+
|-------|-------------|
|
|
135
|
+
| User interview | Gather intent, requirements, and scope via `ask_user` (interactive) or prompt parsing (non-interactive) |
|
|
136
|
+
| User profile integration | Read and apply `~/.a5c/user-profile.json` to calibrate breakpoint density, tool preferences, and communication style |
|
|
137
|
+
| Process discovery | Find relevant processes in `.a5c/processes/`, the active process library, and `specializations/`/`methodologies/` |
|
|
138
|
+
| Process creation | Build custom JS process definitions with `@skill`/`@agent` markers, mermaid diagrams, and documentation |
|
|
139
|
+
| Run creation | Create a run with session binding via `babysitter run:create --harness gemini-cli --session-id "${GEMINI_SESSION_ID}"` |
|
|
140
|
+
| Run resumption | Resume an existing run via `babysitter session:resume` |
|
|
141
|
+
| Run iteration | Drive the orchestration loop with `babysitter run:iterate` |
|
|
142
|
+
| Effect listing | Enumerate pending effects via `babysitter task:list --pending` |
|
|
143
|
+
| Effect execution | Execute effects externally via `@agent` sub-agent delegation or direct shell commands |
|
|
144
|
+
| Breakpoint handling | Present approval gates to the user via `ask_user` (interactive) or auto-resolve (non-interactive) |
|
|
145
|
+
| Result posting | Post task outcomes via `babysitter task:post --status ok --value <file>` |
|
|
146
|
+
| Completion proof | Emit `<promise>COMPLETION_PROOF</promise>` when the run completes so that the AfterAgent hook stops blocking |
|
|
147
|
+
| State recovery | Repair corrupted state with `babysitter run:rebuild-state` and `babysitter run:repair-journal` |
|
|
148
|
+
| Process library binding | Resolve or initialize the active process library with `babysitter process-library:active` |
|
|
149
|
+
| Profile management | Read and write user/project profiles via `babysitter profile:read`, `profile:write`, `profile:merge`, and `profile:render` |
|
|
150
|
+
| Sub-agent delegation | Delegate tasks to Gemini CLI sub-agents using `@agent` prompts |
|
|
151
|
+
|
|
152
|
+
## Configuration
|
|
153
|
+
|
|
154
|
+
The extension reads configuration from the following locations:
|
|
155
|
+
|
|
156
|
+
| Source | Purpose |
|
|
157
|
+
|--------|---------|
|
|
158
|
+
| `versions.json` | Pins the required `@a5c-ai/babysitter-sdk` version. Read by hooks at startup. |
|
|
159
|
+
| `gemini-extension.json` | Gemini CLI extension manifest. Sets `contextFileName: "GEMINI.md"`. |
|
|
160
|
+
| `plugin.json` | Babysitter plugin manifest. Declares hooks, commands, and harness (`gemini-cli`). |
|
|
161
|
+
| `GEMINI_EXTENSION_PATH` env var | Path to the installed extension root. Set automatically by Gemini CLI. Falls back to the directory containing the hook script. |
|
|
162
|
+
| `BABYSITTER_LOG_DIR` env var | Override the log directory. Defaults to `${EXTENSION_PATH}/.a5c/logs`. |
|
|
163
|
+
| `.a5c/state/` | Session state directory. Created automatically by the SessionStart hook. |
|
|
164
|
+
| `~/.a5c/user-profile.json` | User profile for personalizing orchestration (breakpoint density, tool preferences, communication style). |
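
The two environment-variable rows above describe fallback chains. Assuming the hooks resolve them with ordinary shell parameter defaults (the shipped hook code is not reproduced here), the logic can be sketched as follows. `resolve_log_dir` and its fallback-root argument are illustrative names, not part of the package:

```bash
# resolve_log_dir FALLBACK_ROOT: print the effective log directory.
# Sketch of the documented fallbacks, not the shipped hook code.
resolve_log_dir() {
  # Extension root: GEMINI_EXTENSION_PATH if set, else the caller's fallback.
  ext_root="${GEMINI_EXTENSION_PATH:-$1}"
  # Log directory: BABYSITTER_LOG_DIR if set, else <extension root>/.a5c/logs.
  echo "${BABYSITTER_LOG_DIR:-${ext_root}/.a5c/logs}"
}
```

For example, `resolve_log_dir ~/.gemini/extensions/babysitter-gemini` prints the directory to inspect when neither variable is set.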
|
|
165
|
+
|
|
166
|
+
## Verification
|
|
167
|
+
|
|
168
|
+
Verify the installed extension bundle:
|
|
169
|
+
|
|
170
|
+
```bash
|
|
171
|
+
babysitter-gemini status --global
|
|
172
|
+
```
|
|
173
|
+
|
|
174
|
+
Or check the files directly:
|
|
175
|
+
|
|
176
|
+
```bash
|
|
177
|
+
test -d ~/.gemini/extensions/babysitter-gemini
|
|
178
|
+
test -f ~/.gemini/extensions/babysitter-gemini/gemini-extension.json
|
|
179
|
+
test -f ~/.gemini/extensions/babysitter-gemini/GEMINI.md
|
|
180
|
+
test -f ~/.gemini/extensions/babysitter-gemini/hooks/session-start.sh
|
|
181
|
+
test -f ~/.gemini/extensions/babysitter-gemini/hooks/after-agent.sh
|
|
182
|
+
```
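
The `test` commands above exit silently, so a missing file is easy to overlook. A minimal sketch that loops over the same paths and names each missing one; `check_bundle` is a hypothetical helper, not a command shipped with the extension:

```bash
# check_bundle ROOT: report which expected extension files are missing.
# Hypothetical helper; the path list mirrors the manual checks above.
check_bundle() {
  missing=0
  for rel in gemini-extension.json GEMINI.md hooks/session-start.sh hooks/after-agent.sh; do
    if [ ! -f "$1/$rel" ]; then
      echo "missing: $rel"
      missing=1
    fi
  done
  # Non-zero exit status when at least one file is absent.
  return "$missing"
}
```

Call it as `check_bundle ~/.gemini/extensions/babysitter-gemini && echo "bundle OK"`.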
|
|
183
|
+
|
|
184
|
+
Verify the SDK CLI is available:
|
|
185
|
+
|
|
186
|
+
```bash
|
|
187
|
+
babysitter --version
|
|
188
|
+
```
|
|
189
|
+
|
|
190
|
+
Verify the active process-library binding:
|
|
191
|
+
|
|
192
|
+
```bash
|
|
193
|
+
babysitter process-library:active --json
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
## Workspace Output
|
|
197
|
+
|
|
198
|
+
After `babysitter-gemini install --workspace <path>`, the extension is placed at:
|
|
199
|
+
|
|
200
|
+
- `<workspace>/.gemini/extensions/babysitter-gemini/gemini-extension.json`
|
|
201
|
+
- `<workspace>/.gemini/extensions/babysitter-gemini/GEMINI.md`
|
|
202
|
+
- `<workspace>/.gemini/extensions/babysitter-gemini/hooks/session-start.sh`
|
|
203
|
+
- `<workspace>/.gemini/extensions/babysitter-gemini/hooks/after-agent.sh`
|
|
204
|
+
- `<workspace>/.gemini/extensions/babysitter-gemini/commands/`
|
|
205
|
+
|
|
206
|
+
## Troubleshooting
|
|
207
|
+
|
|
208
|
+
### SDK CLI not found after session start
|
|
209
|
+
|
|
210
|
+
The SessionStart hook installs the SDK automatically, but if permissions prevent
|
|
211
|
+
a global install it falls back to `~/.local/bin` (make sure that directory is on your `PATH`). Check the session-start log:
|
|
212
|
+
|
|
213
|
+
```bash
|
|
214
|
+
cat ~/.gemini/extensions/babysitter-gemini/.a5c/logs/babysitter-session-start-hook.log
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
If the CLI is still missing, install it manually:
|
|
218
|
+
|
|
219
|
+
```bash
|
|
220
|
+
npm install -g @a5c-ai/babysitter-sdk
|
|
221
|
+
```
|
|
222
|
+
|
|
223
|
+
### AfterAgent hook not firing
|
|
224
|
+
|
|
225
|
+
Confirm the extension is installed and Gemini CLI has been restarted since
|
|
226
|
+
installation. Check that `gemini-extension.json` is present:
|
|
227
|
+
|
|
228
|
+
```bash
|
|
229
|
+
test -f ~/.gemini/extensions/babysitter-gemini/gemini-extension.json && echo "OK"
|
|
230
|
+
```
|
|
231
|
+
|
|
232
|
+
### Run stuck in loop / hook always blocking
|
|
233
|
+
|
|
234
|
+
The AfterAgent hook blocks until it detects `<promise>COMPLETION_PROOF</promise>`
|
|
235
|
+
in the agent's output. If a run is stuck, check whether it has completed:
|
|
236
|
+
|
|
237
|
+
```bash
|
|
238
|
+
babysitter run:status .a5c/runs/<runId> --json
|
|
239
|
+
```
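
The marker check described above can be approximated with a pattern match. This is a sketch only: it assumes the agent's output has been saved to a file and that `COMPLETION_PROOF` stands for a run-specific value. It is not the hook's actual implementation, and `has_completion_proof` is a hypothetical name:

```bash
# has_completion_proof FILE: succeed if FILE contains a non-empty
# <promise>...</promise> marker. Illustrative only.
has_completion_proof() {
  grep -Eq '<promise>[^<]+</promise>' "$1"
}
```

If the saved output lacks the marker, the loop is expected to keep going, so the run status is the next thing to check.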
|
|
240
|
+
|
|
241
|
+
If the run failed or the journal is corrupted, repair it:
|
|
242
|
+
|
|
243
|
+
```bash
|
|
244
|
+
babysitter run:rebuild-state .a5c/runs/<runId>
|
|
245
|
+
babysitter run:repair-journal .a5c/runs/<runId>
|
|
246
|
+
```
|
|
247
|
+
|
|
248
|
+
Or run the diagnostic command from within Gemini CLI:
|
|
249
|
+
|
|
250
|
+
```
|
|
251
|
+
/babysitter:doctor <runId>
|
|
252
|
+
```
|
|
253
|
+
|
|
254
|
+
### SDK version mismatch
|
|
255
|
+
|
|
256
|
+
The SessionStart hook checks `versions.json` and upgrades the SDK if the
|
|
257
|
+
installed version does not match. Check the hook log for version details:
|
|
258
|
+
|
|
259
|
+
```bash
|
|
260
|
+
grep version ~/.gemini/extensions/babysitter-gemini/.a5c/logs/babysitter-session-start-hook.log
|
|
261
|
+
```
|
|
262
|
+
|
|
263
|
+
To force a reinstall to the pinned version:
|
|
264
|
+
|
|
265
|
+
```bash
|
|
266
|
+
SDK_VERSION=$(node -e "console.log(require(require('os').homedir()+'/.gemini/extensions/babysitter-gemini/versions.json').sdkVersion)")
|
|
267
|
+
npm install -g "@a5c-ai/babysitter-sdk@${SDK_VERSION}"
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
### Process library not bound
|
|
271
|
+
|
|
272
|
+
If commands report that no active process-library binding exists, initialize one:
|
|
273
|
+
|
|
274
|
+
```bash
|
|
275
|
+
babysitter process-library:active --json
|
|
276
|
+
```
|
|
277
|
+
|
|
278
|
+
If that returns nothing, clone the library:
|
|
279
|
+
|
|
280
|
+
```bash
|
|
281
|
+
babysitter process-library:clone --dir .a5c/process-library/babysitter-repo
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
### Hook logs location
|
|
285
|
+
|
|
286
|
+
| Log file | Contents |
|
|
287
|
+
|----------|----------|
|
|
288
|
+
| `<extension-root>/.a5c/logs/babysitter-session-start-hook.log` | SessionStart hook output |
|
|
289
|
+
| `<extension-root>/.a5c/logs/babysitter-session-start-hook-stderr.log` | SessionStart SDK stderr |
|
|
290
|
+
| `<extension-root>/.a5c/logs/babysitter-after-agent-hook.log` | AfterAgent hook output |
|
|
291
|
+
| `<extension-root>/.a5c/logs/babysitter-after-agent-hook-stderr.log` | AfterAgent SDK stderr |
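
To inspect all four logs at once, a small loop is enough. The sketch below assumes the file names listed in the table; `show_hook_logs` is a hypothetical helper, and the log directory is passed in explicitly because it may be overridden by `BABYSITTER_LOG_DIR`:

```bash
# show_hook_logs LOG_DIR [LINES]: print the tail of each hook log that exists.
# Hypothetical helper; file names are taken from the table above.
show_hook_logs() {
  for name in session-start-hook session-start-hook-stderr after-agent-hook after-agent-hook-stderr; do
    f="$1/babysitter-$name.log"
    if [ -f "$f" ]; then
      echo "== $f"
      # Default to the last 20 lines when no count is given.
      tail -n "${2:-20}" "$f"
    fi
  done
}
```

For example, `show_hook_logs ~/.gemini/extensions/babysitter-gemini/.a5c/logs 50` prints the last 50 lines of each log that is present.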
|
|
292
|
+
|
|
293
|
+
## License
|
|
294
|
+
|
|
295
|
+
MIT
|