@a5c-ai/babysitter-opencode 0.1.1-staging.0dc03363
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +169 -0
- package/bin/cli.cjs +194 -0
- package/bin/cli.js +55 -0
- package/bin/install-shared.cjs +406 -0
- package/bin/install.cjs +97 -0
- package/bin/install.js +110 -0
- package/bin/uninstall.cjs +90 -0
- package/bin/uninstall.js +46 -0
- package/commands/assimilate.md +37 -0
- package/commands/call.md +7 -0
- package/commands/cleanup.md +20 -0
- package/commands/contrib.md +33 -0
- package/commands/doctor.md +426 -0
- package/commands/forever.md +7 -0
- package/commands/help.md +244 -0
- package/commands/observe.md +12 -0
- package/commands/plan.md +7 -0
- package/commands/plugins.md +255 -0
- package/commands/project-install.md +17 -0
- package/commands/resume.md +8 -0
- package/commands/retrospect.md +55 -0
- package/commands/status.md +8 -0
- package/commands/user-install.md +17 -0
- package/commands/yolo.md +7 -0
- package/hooks/hooks.json +46 -0
- package/hooks/session-created.js +180 -0
- package/hooks/session-idle.js +122 -0
- package/hooks/shell-env.js +86 -0
- package/hooks/tool-execute-after.js +105 -0
- package/hooks/tool-execute-before.js +107 -0
- package/package.json +46 -0
- package/plugin.json +25 -0
- package/scripts/sync-command-docs.cjs +105 -0
- package/scripts/sync-command-surfaces.js +52 -0
- package/skills/assimilate/SKILL.md +38 -0
- package/skills/babysit/SKILL.md +35 -0
- package/skills/call/SKILL.md +8 -0
- package/skills/cleanup/SKILL.md +21 -0
- package/skills/contrib/SKILL.md +34 -0
- package/skills/doctor/SKILL.md +427 -0
- package/skills/forever/SKILL.md +8 -0
- package/skills/help/SKILL.md +245 -0
- package/skills/observe/SKILL.md +13 -0
- package/skills/plan/SKILL.md +8 -0
- package/skills/plugins/SKILL.md +257 -0
- package/skills/project-install/SKILL.md +18 -0
- package/skills/resume/SKILL.md +9 -0
- package/skills/retrospect/SKILL.md +56 -0
- package/skills/user-install/SKILL.md +18 -0
- package/skills/yolo/SKILL.md +8 -0
- package/versions.json +4 -0
|
@@ -0,0 +1,427 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: doctor
|
|
3
|
+
description: Diagnose babysitter run health - journal integrity, state cache, effects, locks, sessions, logs, and disk usage
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# doctor
|
|
7
|
+
|
|
8
|
+
You are a diagnostic agent for the babysitter runtime. Your job is to perform a comprehensive health check across 10 areas and produce a structured diagnostic report. Follow each section methodically. Track results as you go and produce the final summary at the end.
|
|
9
|
+
|
|
10
|
+
Initialize a results tracker with these 10 checks, all starting as PENDING:
|
|
11
|
+
1. Run Discovery
|
|
12
|
+
2. Journal Integrity
|
|
13
|
+
3. State Cache Consistency
|
|
14
|
+
4. Effect Status
|
|
15
|
+
5. Lock Status
|
|
16
|
+
6. Session State
|
|
17
|
+
7. Log Analysis
|
|
18
|
+
8. Disk Usage
|
|
19
|
+
9. Process Validation
|
|
20
|
+
10. Hook Execution Health
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## 1. Run Discovery
|
|
25
|
+
|
|
26
|
+
**Goal:** Identify the target run and display its metadata.
|
|
27
|
+
|
|
28
|
+
- List all runs by running: `ls -lt .a5c/runs/`
|
|
29
|
+
- If the user provided a run ID argument, use that as the run ID. Otherwise, use the most recent run directory (the first entry from the listing).
|
|
30
|
+
- Store the resolved run ID and construct the run directory path: `.a5c/runs/<runId>`
|
|
31
|
+
- Verify the run directory exists. If it does not exist, report FAIL for this check and stop the entire diagnostic (no run to diagnose).
|
|
32
|
+
- Show run metadata by running: `npx babysitter run:status .a5c/runs/<runId> --json`
|
|
33
|
+
- Parse and display: runId, processId, entrypoint/importPath, createdAt, current state.
|
|
34
|
+
- Mark this check as PASS.
|
|
35
|
+
|
|
36
|
+
---
|
|
37
|
+
|
|
38
|
+
## 2. Journal Integrity
|
|
39
|
+
|
|
40
|
+
**Goal:** Verify the append-only event journal is well-formed and uncorrupted.
|
|
41
|
+
|
|
42
|
+
- List all journal events by running: `npx babysitter run:events .a5c/runs/<runId> --json`
|
|
43
|
+
- List all files in `.a5c/runs/<runId>/journal/` sorted by name.
|
|
44
|
+
- If the journal directory is empty or missing, mark as FAIL and note "No journal entries found."
|
|
45
|
+
|
|
46
|
+
For each journal file (named `<seq>.<ulid>.json`):
|
|
47
|
+
|
|
48
|
+
**Sequential numbering check:**
|
|
49
|
+
- Extract the sequence number prefix from each filename (e.g., `000001` from `000001.01JAXYZ.json`).
|
|
50
|
+
- Verify sequence numbers are contiguous starting from 000001 with no gaps.
|
|
51
|
+
- If gaps found, mark as WARN and list the missing sequence numbers.
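
The contiguity check above can be sketched in Node as follows. This is a minimal illustration, not the SDK's own logic: `findSequenceGaps` is a hypothetical helper, and it assumes every journal filename starts with a numeric sequence prefix followed by a dot.

```javascript
// Sketch: list missing sequence numbers among journal filenames.
// Assumption: names look like "<seq>.<ulid>.json" with a numeric <seq> prefix.
function findSequenceGaps(filenames) {
  const seqs = filenames
    .map((name) => /^(\d+)\./.exec(name))
    .filter((match) => match !== null)
    .map((match) => Number(match[1]))
    .sort((a, b) => a - b);

  const present = new Set(seqs);
  const highest = seqs.length > 0 ? seqs[seqs.length - 1] : 0;
  const missing = [];
  // The journal is expected to start at 000001, so count up from 1.
  for (let seq = 1; seq <= highest; seq += 1) {
    if (!present.has(seq)) missing.push(String(seq).padStart(6, "0"));
  }
  return missing;
}

console.log(
  findSequenceGaps([
    "000001.01JAXYZ.json",
    "000002.01JAXZ0.json",
    "000004.01JAXZ2.json",
  ])
); // → [ '000003' ]
```

Files that do not match the prefix pattern are skipped here; they belong to the orphan check rather than the gap check.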
|
|
52
|
+
|
|
53
|
+
**Checksum verification:**
|
|
54
|
+
|
|
55
|
+
The SDK computes checksums as follows: it first builds the event payload **without** the `checksum` field (`{ type, recordedAt, data }`), serializes it with `JSON.stringify(payload, null, 2) + "\n"` (pretty-printed with a trailing newline), then computes SHA256 of that string. To verify:
|
|
56
|
+
|
|
57
|
+
- Read each journal file as JSON.
|
|
58
|
+
- Extract and remove the `checksum` field from the parsed object.
|
|
59
|
+
- Re-serialize the remaining object with `JSON.stringify(remaining, null, 2) + "\n"` — **must** use 2-space indentation and a trailing newline to match the SDK.
|
|
60
|
+
- Compute SHA256 (hex) of that exact string.
|
|
61
|
+
- Compare computed checksum with the stored checksum.
|
|
62
|
+
- If any mismatch, mark as FAIL and list the corrupt files.
|
|
63
|
+
|
|
64
|
+
Example bash one-liner for a single file:
|
|
65
|
+
```bash
node -e "const fs=require('fs'); const f=process.argv[1]; const obj=JSON.parse(fs.readFileSync(f,'utf8')); const stored=obj.checksum; delete obj.checksum; const expected=require('crypto').createHash('sha256').update(JSON.stringify(obj,null,2)+'\n').digest('hex'); console.log(stored===expected?'OK':'MISMATCH',f)" <file>
```
|
|
68
|
+
|
|
69
|
+
**Timestamp monotonicity check:**
|
|
70
|
+
- Extract `recordedAt` from each event.
|
|
71
|
+
- Verify each timestamp is >= the previous one.
|
|
72
|
+
- If any timestamp goes backward, mark as WARN and list the offending entries.
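
A minimal sketch of the monotonicity check, assuming `recordedAt` is an ISO 8601 string that `Date.parse` understands (the source does not state the timestamp format). `findBackwardTimestamps` is a hypothetical helper name.

```javascript
// Sketch: report events whose recordedAt is earlier than the previous event's.
// Assumption: events are already in journal (sequence) order and recordedAt
// is an ISO 8601 string.
function findBackwardTimestamps(events) {
  const offenders = [];
  for (let i = 1; i < events.length; i += 1) {
    const previous = Date.parse(events[i - 1].recordedAt);
    const current = Date.parse(events[i].recordedAt);
    if (current < previous) {
      offenders.push({ index: i, recordedAt: events[i].recordedAt });
    }
  }
  return offenders;
}

console.log(
  findBackwardTimestamps([
    { recordedAt: "2024-01-01T00:00:00Z" },
    { recordedAt: "2024-01-01T00:00:05Z" },
    { recordedAt: "2024-01-01T00:00:03Z" },
  ])
);
```

Equal timestamps are accepted, matching the `>=` rule above. An unparseable timestamp yields `NaN` and is not flagged by this sketch, so a real check may want to report it separately.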
|
|
73
|
+
|
|
74
|
+
**Event type summary:**
|
|
75
|
+
- Count events by type: RUN_CREATED, EFFECT_REQUESTED, EFFECT_RESOLVED, STOP_HOOK_INVOKED, RUN_COMPLETED, RUN_FAILED, and any other types encountered.
|
|
76
|
+
- Display the counts in a table.
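
Counting events by type is a simple tally. The sketch below assumes each parsed event exposes a `type` string, as the checksum description suggests; `countEventTypes` is a hypothetical name.

```javascript
// Sketch: tally journal events by their `type` field.
function countEventTypes(events) {
  const counts = new Map();
  for (const event of events) {
    counts.set(event.type, (counts.get(event.type) || 0) + 1);
  }
  // Convert to a plain object so it can be printed or serialized easily.
  return Object.fromEntries(counts);
}

console.log(
  countEventTypes([
    { type: "RUN_CREATED" },
    { type: "EFFECT_REQUESTED" },
    { type: "EFFECT_REQUESTED" },
    { type: "EFFECT_RESOLVED" },
  ])
);
```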
|
|
77
|
+
|
|
78
|
+
**Orphan detection:**
|
|
79
|
+
- Flag any files in the journal directory that do not match the expected `<seq>.<ulid>.json` naming pattern.
|
|
80
|
+
|
|
81
|
+
If all sub-checks pass, mark as PASS. If any sub-check is WARN, mark as WARN. If any sub-check is FAIL, mark as FAIL.
|
|
82
|
+
|
|
83
|
+
---
|
|
84
|
+
|
|
85
|
+
## 3. State Cache Consistency
|
|
86
|
+
|
|
87
|
+
**Goal:** Verify the derived state cache matches the current journal.
|
|
88
|
+
|
|
89
|
+
- Check if `.a5c/runs/<runId>/state/state.json` exists.
|
|
90
|
+
- If it does not exist, mark as WARN and recommend: `npx babysitter run:rebuild-state .a5c/runs/<runId>`
|
|
91
|
+
|
|
92
|
+
If it exists:
|
|
93
|
+
- Read `state.json` and extract the `journalHead` field (contains `seq`, `ulid`, and `checksum`).
|
|
94
|
+
- Determine the actual last journal entry by reading the last file in `.a5c/runs/<runId>/journal/` (highest sequence number).
|
|
95
|
+
- Extract the sequence number and ULID from the last journal filename, and the checksum from its content.
|
|
96
|
+
- Compare:
|
|
97
|
+
- `journalHead.seq` should match the last journal file's sequence number.
|
|
98
|
+
- `journalHead.ulid` should match the last journal file's ULID.
|
|
99
|
+
- `journalHead.checksum` should match the last journal file's checksum.
|
|
100
|
+
- If all match, mark as PASS.
|
|
101
|
+
- If any mismatch, mark as WARN and recommend: `npx babysitter run:rebuild-state .a5c/runs/<runId>`
|
|
102
|
+
- Also verify `schemaVersion` field is present and report its value.
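
The comparison above could look like the following sketch. `compareJournalHead` is a hypothetical helper. Whether `journalHead.seq` is stored as a number or as a zero-padded string is not stated in the source, so the sketch compares it numerically as an assumption.

```javascript
// Sketch: compare state.json's journalHead against the last journal file.
// Assumptions: lastFileName is "<seq>.<ulid>.json"; lastEvent is the parsed
// content of that file and carries a `checksum` field.
function compareJournalHead(journalHead, lastFileName, lastEvent) {
  const [seqText, ulid] = lastFileName.split(".");
  const mismatches = [];
  if (Number(journalHead.seq) !== Number(seqText)) mismatches.push("seq");
  if (journalHead.ulid !== ulid) mismatches.push("ulid");
  if (journalHead.checksum !== lastEvent.checksum) mismatches.push("checksum");
  return mismatches; // empty array means the cache matches the journal
}

console.log(
  compareJournalHead(
    { seq: 3, ulid: "01JAXYZ", checksum: "abc" },
    "000004.01JAXZ2.json",
    { checksum: "def" }
  )
);
```

An empty result corresponds to PASS; any entry corresponds to WARN with the `run:rebuild-state` recommendation.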
|
|
103
|
+
|
|
104
|
+
---
|
|
105
|
+
|
|
106
|
+
## 4. Effect Status
|
|
107
|
+
|
|
108
|
+
**Goal:** Identify stuck, errored, or pending effects.
|
|
109
|
+
|
|
110
|
+
- Run: `npx babysitter task:list .a5c/runs/<runId> --json`
|
|
111
|
+
- Run: `npx babysitter task:list .a5c/runs/<runId> --pending --json`
|
|
112
|
+
- Parse the JSON output from both commands.
|
|
113
|
+
|
|
114
|
+
**All effects summary:**
|
|
115
|
+
- Count total effects, resolved effects, and pending effects.
|
|
116
|
+
- Group and count effects by `kind` (node, breakpoint, orchestrator_task, sleep, etc.).
|
|
117
|
+
|
|
118
|
+
**Stuck effect detection:**
|
|
119
|
+
- For each pending effect, check its `requestedAt` timestamp.
|
|
120
|
+
- If any pending effect was requested more than 30 minutes ago, flag it as STUCK.
|
|
121
|
+
- List stuck effects with their effectId, kind, taskId, and age.
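
A sketch of the 30-minute rule, assuming the `--pending --json` output can be reduced to an array of objects with `effectId`, `kind`, `taskId`, and an ISO 8601 `requestedAt`. The exact JSON shape is not shown in the source, and `findStuckEffects` is a hypothetical helper.

```javascript
// Sketch: flag pending effects requested more than 30 minutes ago.
const STUCK_THRESHOLD_MS = 30 * 60 * 1000;

function findStuckEffects(pendingEffects, now = Date.now()) {
  const stuck = [];
  for (const effect of pendingEffects) {
    const ageMs = now - Date.parse(effect.requestedAt);
    if (ageMs > STUCK_THRESHOLD_MS) {
      stuck.push({
        effectId: effect.effectId,
        kind: effect.kind,
        taskId: effect.taskId,
        ageMinutes: Math.floor(ageMs / 60000),
      });
    }
  }
  return stuck;
}

// Usage with a fixed "now" so the result is reproducible.
const exampleNow = Date.parse("2024-01-01T01:00:00Z");
console.log(
  findStuckEffects(
    [
      { effectId: "a", kind: "node", taskId: "t1", requestedAt: "2024-01-01T00:00:00Z" },
      { effectId: "b", kind: "sleep", taskId: "t2", requestedAt: "2024-01-01T00:45:00Z" },
    ],
    exampleNow
  )
);
```

Passing `now` explicitly keeps the function easy to test; in a live check the default `Date.now()` is used.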
|
|
122
|
+
|
|
123
|
+
**Error detection:**
|
|
124
|
+
- Identify any effects with error status in their results.
|
|
125
|
+
- List errored effects with their effectId and error message.
|
|
126
|
+
|
|
127
|
+
**Pending summary:**
|
|
128
|
+
- Summarize pending effects grouped by kind with count per kind.
|
|
129
|
+
|
|
130
|
+
Mark as PASS if no stuck or errored effects. Mark as WARN if there are pending effects older than 30 minutes. Mark as FAIL if there are errored effects.
|
|
131
|
+
|
|
132
|
+
---
|
|
133
|
+
|
|
134
|
+
## 5. Lock Status
|
|
135
|
+
|
|
136
|
+
**Goal:** Detect stale or orphaned run locks.
|
|
137
|
+
|
|
138
|
+
- Check if `.a5c/runs/<runId>/run.lock` exists.
|
|
139
|
+
- If it does not exist, mark as PASS ("No lock held -- run is not actively being iterated").
|
|
140
|
+
|
|
141
|
+
If it exists:
|
|
142
|
+
- Read the lock file (JSON with `pid`, `owner`, `acquiredAt`).
|
|
143
|
+
- Display the lock info: PID, owner, acquired time, and age of the lock.
|
|
144
|
+
- Check if the PID is still alive by running: `kill -0 <pid> 2>/dev/null; echo $?` (exit code 0 means alive, non-zero means dead). On Windows/MINGW, use `tasklist //FI "PID eq <pid>" 2>/dev/null` or equivalent.
|
|
145
|
+
- If the process is alive, mark as PASS ("Lock held by active process").
|
|
146
|
+
- If the process is dead, mark as FAIL ("Stale lock detected -- process <pid> is no longer running").
|
|
147
|
+
- Recommend: `rm .a5c/runs/<runId>/run.lock`
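
The PID liveness test above can also be done from Node, which avoids the separate Windows branch. This sketch uses `process.kill(pid, 0)`, where signal `0` only tests for the existence of the process. `isProcessAlive` is a hypothetical helper, not part of the babysitter CLI.

```javascript
// Sketch: check whether a PID from run.lock still refers to a live process.
function isProcessAlive(pid) {
  try {
    // Signal 0 performs the existence and permission check only;
    // no signal is delivered to the target process.
    process.kill(pid, 0);
    return true;
  } catch (err) {
    // EPERM: the process exists but is owned by another user.
    // ESRCH (or anything else): treat the process as gone.
    return err.code === "EPERM";
  }
}

console.log(isProcessAlive(process.pid)); // → true
```

A live PID does not prove the process is the original lock owner, since PIDs can be reused. Comparing the lock's `acquiredAt` with the process start time would be a stricter check.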
|
|
148
|
+
|
|
149
|
+
---
|
|
150
|
+
|
|
151
|
+
## 6. Session State
|
|
152
|
+
|
|
153
|
+
**Goal:** Inspect babysitter session files for health and detect runaway loops.
|
|
154
|
+
|
|
155
|
+
- Search for session state files using Glob:
|
|
156
|
+
- `plugins/babysitter/skills/babysit/state/*.md`
|
|
157
|
+
- `.a5c/state/*.md`
|
|
158
|
+
- `.a5c/state/*.json`
|
|
159
|
+
- For each session state file found:
|
|
160
|
+
- Read the file and extract available information: iteration count, associated runId, timestamps, session status.
|
|
161
|
+
- Display: filename, iteration count, runId (if present), last activity time.
|
|
162
|
+
|
|
163
|
+
**Runaway loop detection:**
|
|
164
|
+
- If any session file contains iteration timing data, compute the average time between iterations.
|
|
165
|
+
- If the average iteration time is less than 3 seconds, flag as WARN ("Possible runaway loop detected -- average iteration time is under 3 seconds").
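
A sketch of the runaway-loop heuristic, assuming the session file yields a list of ISO 8601 iteration timestamps. The session file layout is not described in the source, so extracting that list is left out, and both helper names are hypothetical.

```javascript
// Sketch: average seconds between consecutive iteration timestamps.
function averageIterationSeconds(timestamps) {
  if (timestamps.length < 2) return null; // not enough data to measure
  const times = timestamps.map((t) => Date.parse(t)).sort((a, b) => a - b);
  const spanMs = times[times.length - 1] - times[0];
  // N timestamps define N - 1 intervals.
  return spanMs / (times.length - 1) / 1000;
}

// WARN condition from the check above: average under 3 seconds.
function isRunawayLoop(timestamps) {
  const average = averageIterationSeconds(timestamps);
  return average !== null && average < 3;
}

console.log(
  isRunawayLoop([
    "2024-01-01T00:00:00Z",
    "2024-01-01T00:00:02Z",
    "2024-01-01T00:00:04Z",
  ])
); // → true
```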
|
|
166
|
+
|
|
167
|
+
**Session classification:**
|
|
168
|
+
- Active: session has recent activity (within last 30 minutes).
|
|
169
|
+
- Stale: session has no activity for more than 30 minutes.
|
|
170
|
+
- Display counts of active vs stale sessions.
|
|
171
|
+
|
|
172
|
+
Mark as PASS if no issues. Mark as WARN if runaway loops or stale sessions detected.
|
|
173
|
+
|
|
174
|
+
---
|
|
175
|
+
|
|
176
|
+
## 7. Log Analysis
|
|
177
|
+
|
|
178
|
+
**Goal:** Analyze babysitter log files for errors, warnings, and stop hook decisions.
|
|
179
|
+
|
|
180
|
+
Read the last 50 lines of each of these log files (if they exist):
|
|
181
|
+
- `$CLAUDE_PLUGIN_ROOT/.a5c/logs/hooks.log`
|
|
182
|
+
- `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter-stop-hook.log`
|
|
183
|
+
- `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter-stop-hook-stderr.log`
|
|
184
|
+
- `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter-session-start-hook.log`
|
|
185
|
+
- `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter-session-start-hook-stderr.log`
|
|
186
|
+
- `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter.log`
|
|
187
|
+
- Any relevant logs under `$HOME/.a5c/logs/`, including run-specific and session-specific logs
|
|
188
|
+
|
|
189
|
+
|
|
190
|
+
For each log file:
|
|
191
|
+
- If the file does not exist, note it as "Not found (OK if hooks have not run yet)."
|
|
192
|
+
- If the file exists, analyze its content.
|
|
193
|
+
|
|
194
|
+
**Stop hook analysis (babysitter-stop-hook.log):**
|
|
195
|
+
- Count lines containing "approve" vs "block" decisions (case-insensitive).
|
|
196
|
+
- Display the approve/block ratio.
|
|
197
|
+
- Show the last 20 stop hook decision entries (lines containing "approve" or "block").
|
|
198
|
+
- Count and display CLI exit codes from lines containing "CLI exit code=".
|
|
199
|
+
|
|
200
|
+
**Stderr analysis (babysitter-stop-hook-stderr.log, babysitter-session-start-hook-stderr.log):**
|
|
201
|
+
- If stderr logs contain content, display the last 20 lines from each.
|
|
202
|
+
- Look for common failure patterns: "command not found", "MODULE_NOT_FOUND", "ENOENT", "EACCES", "permission denied", "npm ERR", "Cannot find module".
|
|
203
|
+
- Flag any stderr content as a potential issue.
|
|
204
|
+
|
|
205
|
+
**Error/Warning detection (all logs):**
|
|
206
|
+
- Count and list lines containing "ERROR" or "WARN" (case-insensitive).
|
|
207
|
+
- Display the last 10 error/warning lines from each log.
|
|
208
|
+
|
|
209
|
+
Mark as FAIL if any ERROR lines are found. Otherwise, mark as WARN if WARN lines are found or stderr logs have content. Mark as PASS only if there are no ERROR or WARN lines and stderr logs are empty.
|
|
210
|
+
|
|
211
|
+
---
|
|
212
|
+
|
|
213
|
+
## 8. Disk Usage
|
|
214
|
+
|
|
215
|
+
**Goal:** Report disk consumption and identify oversized files.
|
|
216
|
+
|
|
217
|
+
- Run `du -sh .a5c/runs/<runId>` for the total run directory size.
|
|
218
|
+
- Run `du -sh` on each subdirectory:
|
|
219
|
+
- `.a5c/runs/<runId>/journal/`
|
|
220
|
+
- `.a5c/runs/<runId>/tasks/`
|
|
221
|
+
- `.a5c/runs/<runId>/blobs/`
|
|
222
|
+
- `.a5c/runs/<runId>/state/`
|
|
223
|
+
- `.a5c/runs/<runId>/process/` (if it exists)
|
|
224
|
+
|
|
225
|
+
- Display results in a table: directory, size.
|
|
226
|
+
|
|
227
|
+
**Large file detection:**
|
|
228
|
+
- Find individual files larger than 10MB within the run directory: `find .a5c/runs/<runId> -type f -size +10M -exec ls -lh {} \;`
|
|
229
|
+
- If any found, list them with their paths and sizes.
|
|
230
|
+
|
|
231
|
+
- Report the total run directory size prominently.
|
|
232
|
+
|
|
233
|
+
Mark as FAIL if total size > 2GB. Otherwise, mark as WARN if total size > 500MB or any file > 10MB. Mark as PASS if total size is at most 500MB and no file exceeds 10MB.
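
The thresholds can be expressed as a small classifier. This is a sketch under two assumptions: sizes are given in bytes, and units are binary (1MB = 1024 × 1024 bytes, 2GB = 2048MB), which matches how `find -size +10M` counts. `classifyDiskUsage` is a hypothetical helper.

```javascript
// Sketch: map run directory size and largest file size to a check status.
const BYTES_PER_MB = 1024 * 1024;

function classifyDiskUsage(totalBytes, largestFileBytes) {
  // FAIL takes precedence over WARN when both conditions hold.
  if (totalBytes > 2048 * BYTES_PER_MB) return "FAIL";
  if (totalBytes > 500 * BYTES_PER_MB || largestFileBytes > 10 * BYTES_PER_MB) {
    return "WARN";
  }
  return "PASS";
}

console.log(classifyDiskUsage(120 * BYTES_PER_MB, 2 * BYTES_PER_MB)); // → PASS
console.log(classifyDiskUsage(120 * BYTES_PER_MB, 25 * BYTES_PER_MB)); // → WARN
console.log(classifyDiskUsage(3000 * BYTES_PER_MB, 1 * BYTES_PER_MB)); // → FAIL
```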
|
|
234
|
+
|
|
235
|
+
---
|
|
236
|
+
|
|
237
|
+
## 9. Process Validation
|
|
238
|
+
|
|
239
|
+
**Goal:** Verify the process entrypoint and SDK dependency are valid.
|
|
240
|
+
|
|
241
|
+
- Read `.a5c/runs/<runId>/run.json` and extract the `importPath` (or `entrypoint`) field.
|
|
242
|
+
- Check if the referenced process file exists on disk. Use Glob or file read to verify.
|
|
243
|
+
- If the file does not exist, mark as FAIL ("Process entrypoint not found on disk").
|
|
244
|
+
|
|
245
|
+
**SDK dependency check:**
|
|
246
|
+
- Read `.a5c/package.json` (if it exists) or the project root `package.json`.
|
|
247
|
+
- Check for `@a5c-ai/babysitter-sdk` in `dependencies` or `devDependencies`.
|
|
248
|
+
- Report the installed version.
|
|
249
|
+
- If the dependency is missing, mark as WARN.
|
|
250
|
+
- If present, verify it looks like a valid semver version and mark as PASS.
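
A sketch of the dependency lookup on an already-parsed `package.json`. The version pattern is a loose approximation of semver with an optional `^` or `~` range prefix, not a full range parser, and returning WARN for a non-matching spec is an assumption because the source only defines the PASS and missing cases. `checkSdkDependency` is a hypothetical helper.

```javascript
// Sketch: look up @a5c-ai/babysitter-sdk in a parsed package.json object.
function checkSdkDependency(pkg) {
  const name = "@a5c-ai/babysitter-sdk";
  const dependencies = pkg.dependencies || {};
  const devDependencies = pkg.devDependencies || {};
  const spec = dependencies[name] || devDependencies[name] || null;
  if (spec === null) return { status: "WARN", spec: null };

  // Loose check: optional ^ or ~, then MAJOR.MINOR.PATCH with optional
  // pre-release and build metadata.
  const semverLike = /^[\^~]?\d+\.\d+\.\d+(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/;
  return { status: semverLike.test(spec) ? "PASS" : "WARN", spec };
}

console.log(
  checkSdkDependency({ dependencies: { "@a5c-ai/babysitter-sdk": "^0.1.1" } })
);
```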
|
|
251
|
+
|
|
252
|
+
---
|
|
253
|
+
|
|
254
|
+
## 10. Hook Execution Health
|
|
255
|
+
|
|
256
|
+
**Goal:** Verify that the stop hook and session-start hook are properly configured, can execute, and have been running. If the stop hook has NOT been running, diagnose why.
|
|
257
|
+
|
|
258
|
+
### 10a. Hook Registration
|
|
259
|
+
|
|
260
|
+
- Locate the plugin root. Check for `CLAUDE_PLUGIN_ROOT` env var, or search for `plugins/babysitter/hooks/hooks.json` by walking up from the current directory.
|
|
261
|
+
- If found, read `hooks.json` and verify:
|
|
262
|
+
- A `Stop` hook entry exists with a command referencing `babysitter-stop-hook.sh`.
|
|
263
|
+
- A `SessionStart` hook entry exists with a command referencing `babysitter-session-start-hook.sh`.
|
|
264
|
+
- If `hooks.json` is not found, mark as FAIL ("Hook registration file not found — hooks are not registered with Claude Code").
|
|
265
|
+
|
|
266
|
+
### 10b. Hook Script Availability
|
|
267
|
+
|
|
268
|
+
- Locate the hook scripts relative to the plugin root:
|
|
269
|
+
- `hooks/babysitter-stop-hook.sh`
|
|
270
|
+
- `hooks/babysitter-session-start-hook.sh`
|
|
271
|
+
- For each script:
|
|
272
|
+
- Check if the file exists.
|
|
273
|
+
- Check if it is executable (`test -x <path>`).
|
|
274
|
+
- If any script is missing or not executable, mark as FAIL and list which scripts are missing/not-executable.
|
|
275
|
+
|
|
276
|
+
### 10c. CLI Availability (babysitter command)
|
|
277
|
+
|
|
278
|
+
The hooks delegate to the `babysitter` CLI. Check if it is available:
|
|
279
|
+
- Run: `command -v babysitter 2>/dev/null && babysitter --version 2>/dev/null`
|
|
280
|
+
- If the command is found, display its path and version. Mark sub-check as PASS.
|
|
281
|
+
- If not found, check the user-local prefix: `$HOME/.local/bin/babysitter --version 2>/dev/null`
|
|
282
|
+
- If neither is found, mark sub-check as FAIL ("babysitter CLI not found — hooks will fail with exit code 127. Install with: `npm i -g @a5c-ai/babysitter-sdk`").
|
|
283
|
+
|
|
284
|
+
### 10d. Stop Hook Execution Evidence
|
|
285
|
+
|
|
286
|
+
Check whether the stop hook has actually been invoked during this run's lifetime:
|
|
287
|
+
|
|
288
|
+
**From log files:**
|
|
289
|
+
- Read `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter-stop-hook.log` (if it exists).
|
|
290
|
+
- Count the number of "Hook script invoked" lines. This is the total invocation count.
|
|
291
|
+
- Count the number of "CLI exit code=" lines and extract exit codes.
|
|
292
|
+
- If the log file does not exist or has zero invocations, the stop hook has NOT been running.
|
|
293
|
+
|
|
294
|
+
**From journal events:**
|
|
295
|
+
- Search the run's journal events for `STOP_HOOK_INVOKED` type events (using the run:events output from section 2 if available).
|
|
296
|
+
- Count the number of STOP_HOOK_INVOKED events.
|
|
297
|
+
- If present, display the last 5 with their timestamps and decision data.
|
|
298
|
+
- If no STOP_HOOK_INVOKED events exist in the journal, note that the stop hook has not recorded any decisions for this run.
|
|
299
|
+
|
|
300
|
+
**From stderr:**
|
|
301
|
+
- Read `$CLAUDE_PLUGIN_ROOT/.a5c/logs/babysitter-stop-hook-stderr.log`.
|
|
302
|
+
- If it contains error output, display it and diagnose:
|
|
303
|
+
- "command not found" or exit code 127 → CLI not installed (see 10c)
|
|
304
|
+
- "MODULE_NOT_FOUND" or "Cannot find module" → SDK package corrupted or not built
|
|
305
|
+
- "ENOENT" → Missing file referenced by the hook
|
|
306
|
+
- "EACCES" or "permission denied" → Permission issue on hook script or CLI
|
|
307
|
+
- "npm ERR" → npm installation failure during hook execution
|
|
308
|
+
|
|
309
|
+
### 10e. Stop Hook Not Running — Root Cause Diagnosis
|
|
310
|
+
|
|
311
|
+
If the stop hook shows NO evidence of execution (no log entries, no journal events, zero invocations):
|
|
312
|
+
|
|
313
|
+
Perform these diagnostic steps in order and report the first failure found:
|
|
314
|
+
|
|
315
|
+
1. **Plugin not installed**: Check if `plugins/babysitter/` exists relative to the project root and if `CLAUDE_PLUGIN_ROOT` is set. If the plugin directory doesn't exist, report: "Plugin not installed — the babysitter plugin directory is missing."
|
|
316
|
+
|
|
317
|
+
2. **Plugin not enabled**: Check for Claude settings files:
|
|
318
|
+
- `~/.claude/settings.json` — look for `babysitter` in `enabledPlugins`.
|
|
319
|
+
- `~/.claude/plugins/installed_plugins.json` — look for `babysitter` in the plugins list.
|
|
320
|
+
- If not found in either, report: "Plugin not enabled in Claude Code settings."
|
|
321
|
+
|
|
322
|
+
3. **hooks.json not registered**: If `hooks.json` doesn't contain a `Stop` hook entry (checked in 10a), report: "Stop hook not registered in hooks.json."
|
|
323
|
+
|
|
324
|
+
4. **Hook script missing or not executable**: If the stop hook script doesn't exist or isn't executable (checked in 10b), report with the specific file path.
|
|
325
|
+
|
|
326
|
+
5. **CLI not available**: If `babysitter` CLI is not found (checked in 10c), report: "babysitter CLI not installed — hook script will fail silently."
|
|
327
|
+
|
|
328
|
+
6. **Hook running but failing silently**: If the log file exists but shows exit codes other than 0, or if stderr has content, report: "Stop hook is being invoked but failing — see stderr log for details."
|
|
329
|
+
|
|
330
|
+
7. **No active session**: If no session state files exist (from section 6), report: "No active babysitter session — the stop hook only activates when a session is bound to a run."
|
|
331
|
+
|
|
332
|
+
8. **All checks pass but hook still not running**: Report: "All prerequisites are met but the stop hook shows no evidence of execution. Possible causes: Claude Code may not be invoking plugin hooks (check Claude Code version), or the session may have ended before the hook could fire."
|
|
333
|
+
|
|
334
|
+
### 10f. Verdict
|
|
335
|
+
|
|
336
|
+
Mark as PASS if:
|
|
337
|
+
- Hook registration is correct (10a)
|
|
338
|
+
- Hook scripts exist and are executable (10b)
|
|
339
|
+
- CLI is available (10c)
|
|
340
|
+
- There is evidence of stop hook execution (10d) with exit code 0
|
|
341
|
+
|
|
342
|
+
Mark as WARN if:
|
|
343
|
+
- Hooks are registered and scripts exist, but there's no evidence of execution yet
|
|
344
|
+
- Stop hook ran but had non-zero exit codes
|
|
345
|
+
|
|
346
|
+
Mark as FAIL if:
|
|
347
|
+
- Hook registration is missing
|
|
348
|
+
- Hook scripts are missing or not executable
|
|
349
|
+
- CLI is not available
|
|
350
|
+
- Stop hook is failing (consistent non-zero exit codes or stderr errors)
|
|
351
|
+
|
|
352
|
+
---
|
|
353
|
+
|
|
354
|
+
## Final Report
|
|
355
|
+
|
|
356
|
+
After completing all 10 checks, produce the diagnostic report in this format:
|
|
357
|
+
|
|
358
|
+
```
============================================
BABYSITTER DIAGNOSTIC REPORT
Run: <runId>
Time: <current timestamp>
============================================

OVERALL HEALTH: <HEALTHY | WARNING | CRITICAL>

--------------------------------------------
CHECK RESULTS
--------------------------------------------

| #  | Check                    | Status   |
|----|--------------------------|----------|
| 1  | Run Discovery            | <status> |
| 2  | Journal Integrity        | <status> |
| 3  | State Cache Consistency  | <status> |
| 4  | Effect Status            | <status> |
| 5  | Lock Status              | <status> |
| 6  | Session State            | <status> |
| 7  | Log Analysis             | <status> |
| 8  | Disk Usage               | <status> |
| 9  | Process Validation       | <status> |
| 10 | Hook Execution Health    | <status> |

--------------------------------------------
ISSUES & RECOMMENDATIONS
--------------------------------------------

<For each WARN or FAIL check, list:>
- [WARN|FAIL] <Check name>: <description of issue>
  Fix: <specific actionable command or instruction>

--------------------------------------------
```
|
|
394
|
+
|
|
395
|
+
**Overall health determination:**
|
|
396
|
+
- **HEALTHY**: All 10 checks are PASS.
|
|
397
|
+
- **WARNING**: At least one check is WARN but none are FAIL.
|
|
398
|
+
- **CRITICAL**: At least one check is FAIL.
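
The roll-up rule is small enough to sketch directly. `overallHealth` is a hypothetical helper; throwing when a check is still PENDING is an assumption, since the source only defines the outcome for finished checks.

```javascript
// Sketch: derive the overall health from the ten per-check statuses.
function overallHealth(statuses) {
  if (statuses.includes("FAIL")) return "CRITICAL";
  if (statuses.includes("WARN")) return "WARNING";
  if (statuses.every((status) => status === "PASS")) return "HEALTHY";
  // Anything else (for example PENDING) means the diagnostic is incomplete.
  throw new Error("Some checks are not finished (expected PASS, WARN, or FAIL)");
}

console.log(overallHealth(["PASS", "PASS", "WARN", "PASS"])); // → WARNING
```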
|
|
399
|
+
|
|
400
|
+
Present the full detailed findings for each check BEFORE the summary table, so the user can see the evidence. End with the summary table and recommendations. Also, create a single HTML report file with all the findings that uses the arwes UI framework and open it for the user in the browser.
|
|
401
|
+
|
|
402
|
+
---
|
|
403
|
+
|
|
404
|
+
## Escalation: Claude /debug
|
|
405
|
+
|
|
406
|
+
If any check results in FAIL and the root cause is unclear after your own analysis -- especially for environment issues, hook execution failures, CLI availability problems, or permission errors that may relate to the Claude Code runtime itself -- invoke the built-in Claude `/debug` command to get additional diagnostic context from the Claude Code environment. This is particularly useful for:
|
|
407
|
+
- Hook scripts that should be running but show no evidence of execution (check 10)
|
|
408
|
+
- Permission or path resolution issues that don't match expected behavior
|
|
409
|
+
- Unexpected CLI behavior that might be a Claude Code environment issue rather than a babysitter issue
|
|
410
|
+
|
|
411
|
+
Call `/debug` with a summary of the failing check and what you've already ruled out, so it can focus on environment-level causes.
|
|
412
|
+
|
|
413
|
+
---
|
|
414
|
+
|
|
415
|
+
## After Diagnosis: Contribute Back
|
|
416
|
+
|
|
417
|
+
After diagnosing issues, prompt the user to report or fix what was found. They don't need to fix the bug themselves; just reporting it is valuable. Suggest `/babysitter:contrib` based on the situation:
|
|
418
|
+
|
|
419
|
+
- **Found a bug but didn't fix it**: `/babysitter:contrib bug report: [what the doctor found, e.g. "state cache rebuild silently drops EFFECT_RESOLVED events when journal has duplicate invocation keys"]`
|
|
420
|
+
- **Found and fixed a bug**: `/babysitter:contrib bugfix: [description of the fix]`
|
|
421
|
+
- **Found confusing or missing docs that made diagnosis harder**: `/babysitter:contrib documentation question: [what was unclear or missing]`
|
|
422
|
+
- **Found an issue in a plugin**: `/babysitter:contrib bug report: [plugin-name] [description]`
|
|
423
|
+
- **Improved a process or skill during diagnosis**: `/babysitter:contrib library contribution: [description]`
|
|
424
|
+
|
|
425
|
+
Example prompt after diagnosis:
|
|
426
|
+
|
|
427
|
+
> "Diagnosis found a stale lock -- process 12847 crashed without cleanup. This is a known edge case in the orchestration loop. Even if you don't want to fix it yourself, reporting it helps: run `/babysitter:contrib bug report: orchestration loop doesn't release lock on unhandled rejection` to open an issue."
|
|
@@ -0,0 +1,8 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: forever
|
|
3
|
+
description: Use this command to start babysitting a never-ending babysitter run.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# forever
|
|
7
|
+
|
|
8
|
+
Invoke the babysitter:babysit skill (using the Skill tool) and follow its instructions (SKILL.md), but create a process that uses an infinite loop and a ctx.sleep to create a never-ending babysitter loop. An example of such a process is a recurring process that reads new support tickets and tries to resolve them, then sleeps for 4 hours and repeats.
|
|
@@ -0,0 +1,245 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: help
|
|
3
|
+
description: Help and documentation for babysitter command usage, processes, skills, agents, and methodologies. Use this command to understand how to use babysitter effectively.
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# help
|
|
7
|
+
|
|
8
|
+
## if no arguments provided:
|
|
9
|
+
|
|
10
|
+
show this message:
|
|
11
|
+
|
|
12
|
+
```
|
|
13
|
+
Welcome to the Babysitter Help Center! Here you can find documentation and guidance on how to use Babysitter effectively.
|
|
14
|
+
|
|
15
|
+
Documentation: Explore our comprehensive documentation to understand Babysitter's features, processes, skills, agents, and methodologies. Read the Docs: https://github.com/a5c-ai/babysitter
|
|
16
|
+
|
|
17
|
+
Or ask specific questions about commands, processes, skills, agents, methodologies, domains, or specialties to get targeted help.
|
|
18
|
+
|
|
19
|
+
Just type /babysitter:help followed by your question or the topic you want to learn more about.
|
|
20
|
+
|
|
21
|
+
|
|
22
|
+
PRIMARY COMMANDS
|
|
23
|
+
================
|
|
24
|
+
|
|
25
|
+
/babysitter:call [input]
|
|
26
|
+
Start a babysitter-orchestrated run. Babysitter analyzes your request, interviews you
|
|
27
|
+
to gather requirements, selects or creates the best process definition (from 50+
|
|
28
|
+
domain-specific processes covering science, business, engineering, and more), then
|
|
29
|
+
executes it step by step with breakpoints where you can steer direction.
|
|
30
|
+
|
|
31
|
+
How it works: The babysitter skill reads your input, explores the process library to
|
|
32
|
+
find matching processes, interviews you to refine scope, creates an SDK run with
|
|
33
|
+
run:create, and orchestrates iterations with run:iterate -- dispatching tasks,
|
|
34
|
+
handling breakpoints, and posting results until the run completes or you pause it.
|
|
35
|
+
|
|
36
|
+
Example: /babysitter:call migrate our Express.js REST API to Fastify, keeping all
|
|
37
|
+
existing routes and middleware behavior identical, with integration tests proving
|
|
38
|
+
parity
|
|
39
|
+
|
|
40
|
+
|
|
41
|
+
/babysitter:resume [run id or name]
|
|
42
|
+
Resume a paused or interrupted babysitter run. If you don't specify a run, babysitter
|
|
43
|
+
discovers all runs under .a5c/runs/, shows their status (created, waiting, completed,
|
|
44
|
+
failed), and suggests which incomplete run to pick up based on its process, pending
|
|
45
|
+
effects, and last activity.
|
|
46
|
+
|
|
47
|
+
How it works: Reads run metadata and journal, rebuilds state cache if stale, identifies
|
|
48
|
+
pending effects (breakpoints awaiting approval, tasks needing results), and continues
|
|
49
|
+
orchestration from exactly where it left off -- no work is repeated thanks to the
|
|
50
|
+
replay engine.
|
|
51
|
+
|
|
52
|
+
Example: /babysitter:resume
|
|
53
|
+
(discovers runs and offers: "Run abc123 is waiting on a breakpoint in the 'review
|
|
54
|
+
test results' phase of your API migration -- resume this one?")
|
|
55
|
+
|
|
56
|
+
|
|
57
|
+
/babysitter:yolo [input]
|
|
58
|
+
Start a babysitter run in fully autonomous mode. Identical to /call but all breakpoints
|
|
59
|
+
are auto-approved and no user interaction is requested. The babysitter makes every
|
|
60
|
+
decision on its own until the run completes or hits a critical failure it can't recover
|
|
61
|
+
from. Best for well-understood tasks where you trust the process.
|
|
62
|
+
|
|
63
|
+
How it works: Same orchestration as /call, but the process context is configured to
|
|
64
|
+
skip breakpoint effects -- instead of pausing for human approval, each breakpoint
|
|
65
|
+
resolves immediately with an auto-approve result.
|
|
66
|
+
|
|
67
|
+
Example: /babysitter:yolo add comprehensive unit tests for all functions in
|
|
68
|
+
src/utils/ using vitest with >90% branch coverage
|
|
69
|
+
|
|
70
|
+
|
|
71
|
+
/babysitter:plan [input]
Generate a detailed execution plan without running anything. Babysitter goes through
the full interview and process selection flow, designs the process definition with
all tasks, breakpoints, and dependencies, but stops before creating the actual SDK run.
You get a complete plan you can review, modify, or execute later with /call.

How it works: Runs the babysitter skill's planning phase only -- analyzes input,
matches to domain processes, interviews for requirements, then outputs the process
definition file and a human-readable execution plan showing each phase, task, and
decision point.

Example: /babysitter:plan redesign our database schema to support multi-tenancy,
migrate existing data, and update all queries -- I want to review the plan before
we touch anything


/babysitter:forever [input]
Start a babysitter run that loops indefinitely with sleep intervals. Designed for
ongoing operational tasks: monitoring, periodic maintenance, continuous improvement,
or recurring workflows. The process uses an infinite loop with ctx.sleepUntil() to
pause between iterations.

How it works: Creates a process definition with a while(true) loop. Each cycle performs
the task (e.g., check metrics, process tickets, run audits), then calls ctx.sleepUntil()
to pause for a configured interval. The run stays in "waiting" state during sleep and
resumes automatically when the sleep expires on the next orchestration iteration.

Example: /babysitter:forever every 4 hours, check our GitHub issues labeled "bug",
attempt to reproduce and fix any that look straightforward, and submit PRs for the fixes


SECONDARY COMMANDS
==================

/babysitter:doctor [issue]
Run a comprehensive 10-point health check on a babysitter run. Inspects journal
integrity (checksum verification, sequence gaps, timestamp ordering), state cache
consistency, stuck/errored effects, stale locks, session state, log files, disk usage,
process validation, and hook execution health. Produces a structured diagnostic report
with PASS/WARN/FAIL status per check and specific fix commands.

If no run ID is provided, automatically targets the most recent run. Can also diagnose
environment-wide issues like missing CLI, unregistered hooks, or plugin problems.

Example: /babysitter:doctor
(checks the latest run: "CRITICAL -- Check 5 Lock Status: FAIL -- stale lock detected,
process 12847 is no longer running. Fix: rm .a5c/runs/abc123/run.lock")


/babysitter:assimilate [target]
Convert an external methodology, AI coding harness, or specification into native
babysitter process definitions. Takes a GitHub repo URL, harness name, or spec file
and produces a complete process package with skills/ and agents/ directories.

Two workflows are available:
- Methodology assimilation: clones the repo, learns its procedures and commands,
converts manual flows into babysitter processes with refactored skills and agents
- Harness integration: wires babysitter's SDK into a specific AI coding tool
(codex, opencode, gemini-cli, antigravity, etc.) so it can orchestrate runs

Example: /babysitter:assimilate https://github.com/some-org/their-deployment-playbook
(clones the repo, analyzes their deployment procedures, and generates babysitter
processes that replicate the same workflow with proper task definitions and breakpoints)


/babysitter:user-install
First-time onboarding for new babysitter users. Installs dependencies, runs an
interactive interview about your development specialties, preferred tools, coding
style, and how much autonomy you want babysitter to have. Builds a user profile
stored at ~/.a5c/user-profile.json that personalizes future runs.

Uses the cradle/user-install process which covers: dependency verification, user
interview (expertise areas, preferred languages, IDE, terminal setup), profile
generation, tool configuration, and optional global plugin installation.

Example: /babysitter:user-install
(walks you through: "What's your primary programming language? What frameworks do
you use most? Do you prefer babysitter to auto-approve routine tasks or always ask?")


/babysitter:project-install
Onboard a new or existing project for babysitter orchestration. Researches the
codebase (reads package.json, scans directory structure, identifies frameworks and
patterns), interviews you about project goals and workflows, generates a project
profile at .a5c/project-profile.json, and optionally sets up CI/CD integration.

Uses the cradle/project-install process which covers: codebase analysis, project
interview, profile creation, recommended plugin installation, hook configuration,
and optional CI pipeline setup.

Example: /babysitter:project-install
(scans your repo: "I see this is a Next.js 16 app with Tailwind, using vitest for
tests and PostgreSQL. What are your main development goals for this project?")


/babysitter:retrospect [run id or name]
Analyze a completed run to extract lessons and improve future runs. Reviews what
happened (journal events, task results, timing, errors), evaluates the process that
was followed, and suggests concrete improvements to process definitions, skills,
and agents. Interactive -- multiple breakpoints let you steer the analysis and
decide which improvements to implement.

Covers: run result analysis, process effectiveness review, improvement suggestions,
implementation of changes, and routing to /contrib if improvements belong in the
shared process library.

Example: /babysitter:retrospect
(analyzes the last run: "The API migration run completed but the 'verify parity'
phase took 8 iterations because test assertions were too brittle. Suggestion: add
a fuzzy comparison step before strict assertion. Implement this fix?")


/babysitter:plugins [action]
Manage babysitter plugins: list installed plugins, browse marketplaces, install,
update, configure, uninstall, or create new plugins. Plugins are version-managed
instruction packages (not executable code) that guide the agent through install,
configure, and uninstall steps via markdown files.

Without arguments: shows installed plugins (name, version, marketplace, dates) and
available marketplaces. With arguments: routes to the specific action.

Key actions:
- install <name> --global|--project: fetch install.md from marketplace and execute
- configure <name> --global|--project: fetch configure.md and walk through options
- update <name> --global|--project: resolve migration chain via BFS and apply steps
- uninstall <name> --global|--project: fetch uninstall.md and execute removal
- create: scaffold a new plugin package with the meta/plugin-creation process

Example: /babysitter:plugins install sound-hooks --project
(fetches sound-hooks from marketplace, reads install.md, walks you through player
detection, sound selection, hook configuration, and registers in plugin-registry.json)


/babysitter:contrib [feedback]
Submit feedback or contribute to the babysitter project. Routes to the appropriate
workflow based on what you want to do:

Issue-based (opens GitHub issue in a5c-ai/babysitter):
- Bug report: describe a bug in the SDK, CLI, or process library
- Feature request: propose a new feature or enhancement
- Documentation question: flag undocumented behavior or missing docs

PR-based (forks repo, creates branch, submits PR):
- Bugfix: you already have a fix ready
- Feature implementation: you've built a new feature
- Library contribution: new or improved process/skill/agent for the library
- Harness integration: CI/CD or IDE integration

Without arguments: shows all contribution types and helps you pick the right one.
Breakpoints are placed before all GitHub actions (fork, star, PR, issue) so you
can review before anything is submitted.

Example: /babysitter:contrib bug report: plugin:update-registry fails when the
marketplace hasn't been cloned yet, even though the registry update doesn't need
marketplace access


/babysitter:observe
Launch the babysitter observer dashboard -- a real-time web UI that monitors active
and past runs. Displays task progress, journal events, orchestration state, and
effect status in your browser. Useful when running /yolo or /forever to watch
progress without interrupting the run.

How it works: Runs npx @yoavmayer/babysitter-observer-dashboard@latest which watches
the .a5c/runs/ directory (or a parent directory containing multiple projects) and
serves a live dashboard. The process is blocking -- it runs until you stop it.

Example: /babysitter:observe
(opens browser showing all runs with live-updating task
status, journal event stream, and effect resolution timeline)
```

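The /babysitter:forever loop described in the help text can be sketched roughly as follows. This is an illustration only, not the SDK's actual process API: the help text names `ctx.sleepUntil()` but not its signature, so the millisecond-timestamp argument, `ctx.task`, `ctx.now`, `foreverProcess`, and `createFakeContext` are all assumptions made for this example.

```javascript
// Hypothetical process definition for a /forever-style run.
// ASSUMPTION: ctx.task, ctx.now and the sleepUntil(timestampMs) signature
// are invented for illustration; only the name ctx.sleepUntil is documented.
const FOUR_HOURS_MS = 4 * 60 * 60 * 1000;

async function foreverProcess(inputs, ctx) {
  let cycle = 0;
  while (true) {
    cycle += 1;
    // One unit of recurring work (check metrics, process tickets, ...).
    await ctx.task('check-bugs', { cycle, label: inputs.label });
    // Pause until the next cycle is due; a real run would sit in the
    // "waiting" state here until a later orchestration iteration.
    await ctx.sleepUntil(ctx.now() + inputs.intervalMs);
  }
}

// Stand-in context so the sketch runs outside the SDK. It records calls and
// ends the otherwise infinite loop by throwing after `maxCycles` sleeps.
function createFakeContext(maxCycles) {
  const log = [];
  let clock = 0;
  let sleeps = 0;
  return {
    log,
    now: () => clock,
    task: async (name, args) => {
      log.push(`task:${name}#${args.cycle}`);
    },
    sleepUntil: async (wakeAt) => {
      log.push(`sleepUntil:${wakeAt}`);
      clock = wakeAt; // pretend the sleep has elapsed
      sleeps += 1;
      if (sleeps >= maxCycles) throw new Error('stop');
    },
  };
}
```

Calling `foreverProcess({ label: 'bug', intervalMs: FOUR_HOURS_MS }, createFakeContext(2))` runs two cycles against the fake context and then rejects with the `stop` error, which is how the demo escapes the loop.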
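The stale-lock finding in the /babysitter:doctor example ("process 12847 is no longer running") suggests a check along these lines. This is a sketch under assumptions: the real format of `run.lock` is not documented here, so a JSON file with a numeric `pid` field is assumed, and `checkLock` and `isProcessAlive` are hypothetical helper names. The only real API relied on is Node's `process.kill(pid, 0)`, which tests for a process without sending a signal.

```javascript
const fs = require('node:fs');

// Returns true if a process with this PID currently exists.
function isProcessAlive(pid) {
  try {
    process.kill(pid, 0); // signal 0: existence check only, nothing is sent
    return true;
  } catch (err) {
    // EPERM means the process exists but is owned by another user.
    return err.code === 'EPERM';
  }
}

// ASSUMPTION: the lock file is JSON containing a numeric `pid` field.
// `isAlive` is injectable so the check can be exercised without real PIDs.
function checkLock(lockPath, isAlive = isProcessAlive) {
  if (!fs.existsSync(lockPath)) {
    return { status: 'PASS', detail: 'no lock file' };
  }
  let pid;
  try {
    pid = JSON.parse(fs.readFileSync(lockPath, 'utf8')).pid;
  } catch {
    return { status: 'WARN', detail: 'lock file is not valid JSON' };
  }
  if (!Number.isInteger(pid)) {
    return { status: 'WARN', detail: 'lock file has no numeric pid' };
  }
  if (isAlive(pid)) {
    return { status: 'PASS', detail: `lock held by running process ${pid}` };
  }
  return {
    status: 'FAIL',
    detail: `stale lock detected, process ${pid} is no longer running`,
    fix: `rm ${lockPath}`,
  };
}
```

For instance, `checkLock('.a5c/runs/abc123/run.lock')` would return a FAIL result with an `rm` fix command when the recorded process has exited.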
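The `update` action above is described as resolving a migration chain via BFS. A minimal sketch of that idea, assuming each migration is an edge from one version to another, might look like this. The `{ from, to, file }` shape, the file names, and `resolveMigrationChain` are hypothetical and not taken from the plugin registry format.

```javascript
// Breadth-first search over plugin versions.
// ASSUMPTION: each migration is { from, to, file }; the real marketplace
// layout may differ. Returns the shortest list of migrations leading from
// `fromVersion` to `toVersion`, [] if they are equal, or null if no path exists.
function resolveMigrationChain(migrations, fromVersion, toVersion) {
  if (fromVersion === toVersion) return [];

  // Index migrations by their starting version.
  const outgoing = new Map();
  for (const m of migrations) {
    if (!outgoing.has(m.from)) outgoing.set(m.from, []);
    outgoing.get(m.from).push(m);
  }

  const cameFrom = new Map(); // version -> migration used to reach it
  const visited = new Set([fromVersion]);
  const queue = [fromVersion];

  while (queue.length > 0) {
    const version = queue.shift();
    for (const m of outgoing.get(version) ?? []) {
      if (visited.has(m.to)) continue;
      visited.add(m.to);
      cameFrom.set(m.to, m);
      if (m.to === toVersion) {
        // Walk back from the target to rebuild the chain in apply order.
        const chain = [];
        for (let v = toVersion; v !== fromVersion; v = cameFrom.get(v).from) {
          chain.unshift(cameFrom.get(v));
        }
        return chain;
      }
      queue.push(m.to);
    }
  }
  return null; // target version is unreachable
}
```

Because BFS explores versions in order of distance, the first time the target is reached the chain has the fewest possible steps, so a direct migration is preferred over a longer sequence of intermediate ones.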
## if arguments provided:

If the argument is "command [command name]", "process [process name]", "skill [skill name]", "agent [agent name]", or "methodology [methodology name]", read the relevant files first, then show the detailed documentation for that specific command, process, skill, agent, or methodology.
|