@a5c-ai/babysitter-codex 0.1.6-staging.f48a64a2 → 0.1.6

This diff shows the changes between two publicly released versions of the package, as they appear in their public registry. It is provided for informational purposes only.
Files changed (33)
  1. package/.app.json +3 -0
  2. package/.codex-plugin/plugin.json +47 -0
  3. package/README.md +41 -108
  4. package/assets/icon.svg +7 -0
  5. package/assets/logo.svg +8 -0
  6. package/bin/install-shared.js +509 -0
  7. package/bin/install.js +29 -433
  8. package/bin/uninstall.js +19 -120
  9. package/{.codex/hooks → hooks}/babysitter-session-start.sh +0 -0
  10. package/{.codex/hooks → hooks}/babysitter-stop-hook.sh +0 -0
  11. package/{.codex/hooks.json → hooks.json} +3 -3
  12. package/package.json +9 -5
  13. package/scripts/team-install.js +32 -424
  14. package/skills/babysit/SKILL.md +798 -0
  15. package/{.codex/skills → skills}/call/SKILL.md +1 -0
  16. package/{.codex/skills → skills}/yolo/SKILL.md +3 -0
  17. package/.codex/config.toml +0 -25
  18. package/.codex/skills/babysit/SKILL.md +0 -385
  19. package/SKILL.md +0 -385
  20. package/{.codex/hooks → hooks}/user-prompt-submit.sh +0 -0
  21. package/{.codex/skills → skills}/assimilate/SKILL.md +0 -0
  22. package/{.codex/skills → skills}/doctor/SKILL.md +0 -0
  23. package/{.codex/skills → skills}/forever/SKILL.md +0 -0
  24. package/{.codex/skills → skills}/help/SKILL.md +0 -0
  25. package/{.codex/skills → skills}/issue/SKILL.md +0 -0
  26. package/{.codex/skills → skills}/model/SKILL.md +0 -0
  27. package/{.codex/skills → skills}/observe/SKILL.md +0 -0
  28. package/{.codex/skills → skills}/plan/SKILL.md +0 -0
  29. package/{.codex/skills → skills}/project-install/SKILL.md +0 -0
  30. package/{.codex/skills → skills}/resume/SKILL.md +0 -0
  31. package/{.codex/skills → skills}/retrospect/SKILL.md +0 -0
  32. package/{.codex/skills → skills}/team-install/SKILL.md +0 -0
  33. package/{.codex/skills → skills}/user-install/SKILL.md +0 -0
package/{.codex/skills → skills}/call/SKILL.md
@@ -12,5 +12,6 @@ Resolve the request in `call` mode:
  - treat everything after `$call` as the initial Babysitter request for a new
  orchestration run
  - create the process, create the run, and enter the Babysitter loop
+ - using this always means the user meant an interactive run.
  - do not create a separate command surface here; this skill only forwards into
  `babysit`
package/{.codex/skills → skills}/yolo/SKILL.md
@@ -12,5 +12,8 @@ Resolve the request in `yolo` mode:
  - treat everything after `$yolo` as the autonomous execution request
  - follow the `babysit` skill contract while optimizing for minimal manual
  interruption
+ - using this means the user wants to run autonomously with minimal manual
+ interruption, so optimize for that by skipping or minimizing any steps that
+ would require user input or decision-making during the run
  - do not create a separate command surface here; this skill only forwards into
  `babysit`
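Taken together, the two added lines pin an interaction style to each wrapper: `call` implies an interactive run and `yolo` implies an autonomous one. A minimal sketch of that mapping follows. The function name `interaction_style` and the labels it prints are hypothetical, and modes not touched by this diff are reported as unspecified rather than guessed.

```shell
# Hypothetical helper: map a wrapper skill name to the interaction style
# implied by the lines added in this release. Only `call` and `yolo` are
# covered by the diff; every other mode is reported as "unspecified".
interaction_style() {
  case "$1" in
    call) echo "interactive" ;;
    yolo) echo "autonomous" ;;
    *)    echo "unspecified" ;;
  esac
}

interaction_style call
interaction_style yolo
```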
package/.codex/config.toml
@@ -1,25 +0,0 @@
- # Project-scoped Codex configuration template for babysitter-codex.
- #
- # This package now assumes the real Codex lifecycle hook engine:
- # - `.codex/hooks.json` registers SessionStart, UserPromptSubmit, and Stop
- # - `features.codex_hooks = true` enables that engine on supported platforms
- # - `.a5c` stores babysitter session and run state
- #
- # Global install should materialize `~/.codex/hooks.json` and `~/.codex/hooks/`.
- # `team-install` / `project-install` should materialize workspace-local
- # `.codex/hooks.json` and `.codex/hooks/` for repo-level pinning.
-
- approval_policy = "on-request"
- sandbox_mode = "workspace-write"
- project_doc_max_bytes = 65536
-
- [sandbox_workspace_write]
- writable_roots = [".a5c", ".codex"]
-
- [features]
- codex_hooks = true
- multi_agent = true
-
- [agents]
- max_depth = 3
- max_threads = 4
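The removed template is where this package previously set `codex_hooks = true` under `[features]`. A minimal sketch, assuming a POSIX shell with `grep`, for checking whether an existing Codex config file still carries that flag. The function name `has_codex_hooks` and the default path are hypothetical, and the check is a line-based text match rather than a real TOML parse.

```shell
# Hypothetical helper: succeed if the given config file contains an
# uncommented `codex_hooks = true` line. This is a plain text match, not a
# TOML parser, so it does not verify that the line sits under [features].
has_codex_hooks() {
  grep -Eq '^[[:space:]]*codex_hooks[[:space:]]*=[[:space:]]*true([[:space:]]|#|$)' "$1"
}

# Example usage against an assumed path; adjust to the actual config location.
config="${1:-$HOME/.codex/config.toml}"
if [ -f "$config" ] && has_codex_hooks "$config"; then
  echo "codex_hooks is enabled in $config"
else
  echo "codex_hooks is not enabled in $config"
fi
```

Whether the new plugin layout still depends on this flag is not stated in the diff, so the check is only a diagnostic aid.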
@@ -1,385 +0,0 @@
- ---
- name: babysit
- description: >-
- Run babysitter workflows from Codex using the installed babysit skill bundle,
- Codex mode-wrapper skills, Codex hooks/config, and the Babysitter SDK runtime
- loop. Use when the user wants to babysit a task, start or resume a run,
- diagnose run health, install Codex integration, or assimilate a methodology.
- ---
-
- # babysit
-
- Babysitter on Codex is implemented as:
-
- - the installed core skill under `~/.codex/skills/babysit` or `.codex/skills/babysit`
- - installed mode-wrapper skills under `~/.codex/skills/<mode>` or `.codex/skills/<mode>`
- - global `~/.codex/hooks.json` and `~/.codex/hooks/`
- - global `~/.codex/config.toml`
- - optional workspace `.codex/hooks.json` and `.codex/hooks/`
- - optional workspace `.codex/config.toml`
- - workspace `.a5c/`
- - shared global `.a5c/` process-library state
- - the Babysitter SDK CLI for `run:create`, `run:iterate`, `run:status`,
- `task:list`, `task:post`, and process-library binding
-
- This package supports only the hooks model for the Codex plugin path. Do not
- introduce an app-server loop, an external orchestrator, or fake plugin-manifest
- machinery for the Codex integration.
-
- ## Choosing a Mode
-
- Use this skill whenever it is invoked directly, and whenever one of the
- installed mode-wrapper skills such as `$call`, `$plan`, `$resume`, or `$yolo`
- loads it.
-
- Choose the mode from either:
-
- 1. the direct user intent when the skill is invoked as `$babysit`
- 2. the installed wrapper skill name when the user invoked `$call`, `$plan`,
- `$resume`, `$yolo`, and the rest
-
- | User intent | Mode |
- |-------------|------|
- | Start an orchestration run | `call` |
- | Work an issue-centric flow | `issue` |
- | Run autonomously | `yolo` |
- | Run continuously / recurring workflow | `forever` |
- | Resume an existing run | `resume` |
- | Plan without executing | `plan` |
- | Observe or inspect a run | `observe` |
- | Summarize a completed run | `retrospect` |
- | Diagnose run health | `doctor` |
- | Change or inspect model routing | `model` |
- | Help and documentation | `help` |
- | Install into a project | `project-install` |
- | Install user profile/setup | `user-install` |
- | Install team-pinned setup | `team-install` |
- | Assimilate external methodology | `assimilate` |
-
- Deprecated prompt aliases are not the Codex command surface anymore. Do not
- depend on `.codex/prompts` for normal operation.
-
- ## Dependencies
-
- ### Babysitter SDK and CLI
-
- Use the installed CLI alias:
-
- ```bash
- CLI="babysitter"
- ```
-
- If it is not available on the path, use:
-
- ```bash
- CLI="npx -y @a5c-ai/babysitter-sdk"
- ```
-
- ### jq
-
- Make sure `jq` is available in the path. Install it if missing.
-
- ## Core Iteration Workflow
-
- The Babysitter workflow has 8 steps:
-
- 1. **Create or find the process** - interview the user or parse the prompt,
- research the repo and process library, and build a process definition
- 2. **Create run and bind session** - create the run via the Babysitter CLI and
- bind it to the current Codex session honestly
- 3. **Run iteration** - execute one orchestration step
- 4. **Get effects** - inspect pending effects
- 5. **Perform effects** - execute the requested tasks through skills, agents, or
- shell work
- 6. **Post results** - commit results back through `task:post`
- 7. **Stop and yield** - the Codex stop hook decides whether to continue
- 8. **Completion proof** - finish only when the emitted proof is returned
-
- ### 1. Create or find the process for the run
-
- #### Interview phase
-
- ##### Interactive mode (default)
-
- Interview the user for intent, requirements, goals, scope, and constraints
- before entering the hook-driven loop.
-
- This phase should be iterative and adaptive:
-
- - inspect the current repo state first
- - resolve the active process-library root with
- `babysitter process-library:active --json`
- - conduct an actual search against that active process library before writing a
- process
- - research the repo, online references, methodologies, specializations, skills,
- agents, and related processes as needed
- - ask the user follow-up questions when the intent or constraints are still not
- clear
-
- Do not plan more than one step ahead during the interview phase. After each
- step, decide the next best step from the current evidence.
-
- The `process-library:active` command bootstraps the shared global SDK process
- library automatically if no binding exists yet. Read:
-
- - `binding.dir` as the active process-library root that must be searched
- - `defaultSpec.cloneDir` as the cloned repo root when adjacent repo-level
- material is needed
-
- After that, treat `specializations/**/**/**`, `methodologies/`, `contrib/`, and
- `reference/` as paths relative to `binding.dir`.
-
- ##### Non-interactive mode
-
- When running non-interactively:
-
- 1. parse the initial prompt to extract intent, scope, and constraints
- 2. inspect the repo structure
- 3. resolve the active process-library root with
- `babysitter process-library:active --json`
- 4. search that active library for the most relevant specialization,
- methodology, process, skill, or agent
- 5. proceed directly to process creation
-
- Do not skip the active-library search step.
-
- #### User Profile Integration
-
- Before building the process, check for an existing user profile:
-
- 1. run `babysitter profile:read --user --json`
- 2. use the profile to pre-fill user preferences, expertise, and communication
- style
- 3. calibrate breakpoint density from `breakpointTolerance`
- 4. prefer tools, skills, and agents the user already uses
- 5. adapt explanations and breakpoint text to the user's communication style
- 6. if no profile exists, proceed normally and consider suggesting `$user-install`
-
- All profile read/write/merge/render operations must go through the Babysitter
- CLI, never direct SDK imports.
-
- #### Process creation phase
-
- After the interview phase, create the full custom process files for the run
- according to the process-library patterns and the process-creation guidelines
- below.
-
- Install `@a5c-ai/babysitter-sdk` into `.a5c/` if it is missing. When doing so,
- run the install from the project root and use either `npm i --prefix .a5c ...`
- or a subshell so the working directory does not stay inside `.a5c/`.
-
- Always use an **absolute path** for `--entry` when calling `run:create`.
-
- After the process is created and before creating the run:
-
- - in interactive mode, describe the process at a high level, generate
- `[process-name].diagram.md` and `[process-name].process.md`, and get user
- confirmation before proceeding
- - in non-interactive mode, proceed directly to `run:create`
-
- Common mistakes to avoid:
-
- - wrong: skipping repo/process-library research before writing the process
- - wrong: bypassing the orchestration model with helper scripts or inline logic
- - wrong: using `kind: 'node'` in generated tasks
- - correct: use `agent` or `skill` tasks for reasoning work, with `shell` only
- for existing CLIs, tests, linters, git, or builds
- - correct: include verification loops, refinement loops, quality gates, and
- breakpoints where appropriate
-
- ### 2. Create run and bind session
-
- For new runs:
-
- ```bash
- $CLI run:create \
- --process-id <id> \
- --entry <absolute-path>#<export> \
- --inputs <file> \
- --prompt "$PROMPT" \
- --harness codex \
- --state-dir .a5c \
- --plugin-root "${CODEX_PLUGIN_ROOT}" \
- --json
- ```
-
- Required flags:
-
- - `--process-id <id>` - unique identifier for the process definition
- - `--entry <absolute-path>#<export>` - process JS file plus named export
- - `--prompt "$PROMPT"` - the user's initial request
- - `--harness codex` - activates Codex session binding
- - `--state-dir .a5c` - required for honest workspace-local Codex session state
- - `--plugin-root "${CODEX_PLUGIN_ROOT}"` - plugin root used for session/state
- resolution
-
- Optional flags:
-
- - `--inputs <file>` - process input JSON
- - `--run-id <id>` - override the generated run id
- - `--runs-dir <dir>` - override the default runs directory
-
- Inside a real Codex hook/session environment, do **not** pass `--session-id`
- explicitly. The Codex adapter auto-resolves the session/thread id from
- `CODEX_THREAD_ID`, `CODEX_SESSION_ID`, or `CODEX_ENV_FILE`. Only pass
- `--session-id` in out-of-band recovery flows where no ambient Codex session
- identity exists.
-
- In normal Codex usage, `run:create` must bind the session into the active
- workspace `.a5c`, not the global `~/.a5c`, so the Stop hook can find the same
- session state file in later turns.
-
- For resuming existing runs in a manual recovery flow:
-
- ```bash
- $CLI session:resume \
- --session-id <id> \
- --state-dir .a5c \
- --run-id <runId> \
- --runs-dir .a5c/runs \
- --json
- ```
-
- ### 3. Run iteration
-
- ```bash
- $CLI run:iterate .a5c/runs/<runId> --json --iteration <n> --plugin-root "${CODEX_PLUGIN_ROOT}"
- ```
-
- Status values:
-
- - `"executed"` - tasks executed, continue looping
- - `"waiting"` - breakpoint or sleep is pending
- - `"completed"` - run finished successfully
- - `"failed"` - run failed
- - `"none"` - no runnable effects exist
-
- ### 4. Get effects
-
- ```bash
- $CLI task:list .a5c/runs/<runId> --pending --json
- ```
-
- ### 5. Perform effects
-
- Run the effect externally to the SDK, then post the outcome summary with
- `task:post`.
-
- Important:
-
- - delegate using Codex skills or agent tooling when possible
- - make sure the requested change actually happened
- - do not describe or imply success without verifying the requested effect
- - do not use the `babysit` skill itself inside delegated task execution
-
- #### 5.1 Breakpoint handling
-
- ##### Interactive mode
-
- Ask the user explicitly for approval. If the Codex environment provides a
- structured question UI, include explicit approve/reject options. If not, ask in
- chat and require an explicit approval response.
-
- Never infer approval from silence, ambiguity, or dismissal.
-
- Breakpoint rejections must still be posted with `--status ok` and a value such
- as `{"approved": false, "response": "..."}`.
-
- ##### Non-interactive mode
-
- Choose the best option from context and post the result. Rejections still use
- `--status ok` with `{"approved": false}`.
-
- ### 6. Results posting
-
- Never write `result.json` directly.
-
- Workflow:
-
- 1. write the result value to `tasks/<effectId>/output.json`
- 2. call `task:post` with `--value tasks/<effectId>/output.json`
- 3. let the SDK write `result.json`, append the journal event, and update state
-
- ### 7. Stop after every phase after run-session association
-
- After `run:create` or any posted effect result, end the current assistant turn
- and yield back to the Codex hook loop. Do not run multiple `run:iterate` steps
- in the same turn.
-
- ### 8. Completion proof
-
- When `run:iterate` or `run:status` returns `completionProof`, return that exact
- value wrapped in `<promise>...</promise>`.
-
- ## Hook Loop
-
- Global install must materialize `~/.codex/hooks.json`, `~/.codex/hooks/`, and
- `~/.codex/config.toml`.
-
- Workspace onboarding may also materialize `.codex/hooks.json`,
- `.codex/hooks/`, and `.codex/config.toml` for repo-local pinning.
-
- Both levels must provide:
-
- 1. `SessionStart` seeds `.a5c` session state
- 2. `UserPromptSubmit` performs prompt-time transformations when needed
- 3. `Stop` decides whether the run is complete or Codex should receive the next
- Babysitter iteration context
-
- ## Task Kinds
-
- Never generate `kind: 'node'` effects.
-
- | Kind | When to use |
- |------|-------------|
- | `agent` | default for planning, implementation, analysis, debugging, scoring, research |
- | `skill` | when a matching installed skill exists |
- | `shell` | existing CLI tools, tests, git, linters, builds |
- | `breakpoint` | human approval gates |
- | `sleep` | time gates |
-
- ## Process Creation Guidelines
-
- - always research the repo and the active process library before writing the
- process
- - prefer composing multiple relevant library processes rather than copying just
- one template blindly
- - include verification and refinement loops
- - prefer processes that close the widest practical quality loop
- - add `@skill` and `@agent` discovery markers to generated process files for
- the dependencies you actually selected
- - prefer incremental work that can be tested as you go
-
- Search for relevant processes, skills, agents, methodologies, and references
- in:
-
- 1. `.a5c/processes/`
- 2. the active process-library root from `binding.dir`
- 3. the cloned repo root from `defaultSpec.cloneDir` when adjacent material is
- needed
-
- ## Codex-Specific Rules
-
- - `$babysit` is the core skill
- - `$call`, `$plan`, `$resume`, `$yolo`, and the other mode skills are thin
- wrappers that must only load `babysit` for the matching mode
- - do not revive prompt aliases as a parallel integration surface
- - do not fabricate a session id
- - use `notify` only for telemetry or monitoring, never as the orchestration
- control loop
-
- ## Critical Rules
-
- CRITICAL RULE: The completion proof is emitted only when the run is truly
- completed. Output `<promise>SECRET</promise>` only when the orchestration status
- is completed.
-
- CRITICAL RULE: Never bypass the Babysitter orchestration model when this skill
- is active. Do not replace it with ad-hoc direct execution.
-
- CRITICAL RULE: Never build helper scripts or wrapper programs to drive the run.
- Use the CLI and the hook loop directly.
-
- CRITICAL RULE: In interactive mode, never auto-approve breakpoints.
-
- CRITICAL RULE: Do not use `kind: 'node'` in generated process files.
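The `run:iterate` status values listed in the removed skill text suggest a small dispatch step after each iteration. A minimal sketch, assuming the status string has already been extracted from the command's JSON output. The function name `next_action` and the action labels it prints are hypothetical and are not part of the Babysitter CLI.

```shell
# Hypothetical helper: translate a run:iterate status into the next step
# described by the removed skill text. The printed labels are illustrative.
next_action() {
  case "$1" in
    executed)  echo "continue-loop" ;;            # tasks executed, keep looping
    waiting)   echo "handle-wait" ;;              # breakpoint or sleep is pending
    completed) echo "return-completion-proof" ;;  # run finished successfully
    failed)    echo "report-failure" ;;           # run failed
    none)      echo "no-runnable-effects" ;;      # nothing left to run
    *)         echo "unknown-status"; return 1 ;; # not a documented status
  esac
}

next_action waiting
```

How the status is read from the JSON is left out on purpose, because the output schema is not shown in this diff.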