instar 1.2.57 → 1.2.59

Files changed (41)
  1. package/.claude/skills/autonomous/hooks/autonomous-stop-hook.sh +31 -0
  2. package/.claude/skills/autonomous/scripts/setup-autonomous.sh +25 -0
  3. package/dist/commands/server.d.ts.map +1 -1
  4. package/dist/commands/server.js +21 -1
  5. package/dist/commands/server.js.map +1 -1
  6. package/dist/core/PostUpdateMigrator.js +2 -2
  7. package/dist/core/PostUpdateMigrator.js.map +1 -1
  8. package/dist/core/SessionManager.d.ts +6 -0
  9. package/dist/core/SessionManager.d.ts.map +1 -1
  10. package/dist/core/SessionManager.js +10 -0
  11. package/dist/core/SessionManager.js.map +1 -1
  12. package/dist/core/frameworkSessionLaunch.d.ts +34 -3
  13. package/dist/core/frameworkSessionLaunch.d.ts.map +1 -1
  14. package/dist/core/frameworkSessionLaunch.js +42 -3
  15. package/dist/core/frameworkSessionLaunch.js.map +1 -1
  16. package/dist/core/types.d.ts +12 -0
  17. package/dist/core/types.d.ts.map +1 -1
  18. package/dist/core/types.js.map +1 -1
  19. package/dist/server/CapabilityIndex.d.ts.map +1 -1
  20. package/dist/server/CapabilityIndex.js +2 -0
  21. package/dist/server/CapabilityIndex.js.map +1 -1
  22. package/dist/server/routes.d.ts.map +1 -1
  23. package/dist/server/routes.js +54 -0
  24. package/dist/server/routes.js.map +1 -1
  25. package/dist/threadline/PipeSessionSpawner.d.ts +10 -0
  26. package/dist/threadline/PipeSessionSpawner.d.ts.map +1 -1
  27. package/dist/threadline/PipeSessionSpawner.js +6 -0
  28. package/dist/threadline/PipeSessionSpawner.js.map +1 -1
  29. package/dist/threadline/ThreadlineBootstrap.d.ts.map +1 -1
  30. package/dist/threadline/ThreadlineBootstrap.js +5 -16
  31. package/dist/threadline/ThreadlineBootstrap.js.map +1 -1
  32. package/dist/threadline/mcpEntry.d.ts +25 -0
  33. package/dist/threadline/mcpEntry.d.ts.map +1 -0
  34. package/dist/threadline/mcpEntry.js +38 -0
  35. package/dist/threadline/mcpEntry.js.map +1 -0
  36. package/package.json +1 -1
  37. package/src/data/builtin-manifest.json +62 -62
  38. package/upgrades/1.2.58.md +77 -0
  39. package/upgrades/1.2.59.md +67 -0
  40. package/upgrades/side-effects/codex-multiagent-threadline.md +69 -0
  41. package/upgrades/side-effects/goal-native-delegation.md +76 -0
@@ -0,0 +1,67 @@
1
+ # Upgrade Guide — NEXT (autonomous mode delegates to native /goal)
2
+
3
+ <!-- bump: minor -->
4
+ <!-- minor = new capability, backward compatible -->
5
+
6
+ ## What Changed
7
+
8
+ **New: where the framework has a native /goal loop, autonomous mode hands the finish-line to it.**
9
+
10
+ Phase 2 of `docs/specs/goal-completion-evaluator.md`. When an autonomous job is started with a
11
+ completion condition AND the framework provides native /goal (Claude Code >= 2.1.139), instar
12
+ now **injects `/goal <condition>` into the session** — using its core session-input mechanism
13
+ (`SessionManager.sendInput` / tmux send-keys, the same way it injects any message) — and marks
14
+ the job `goal_mode: native`. The framework's own /goal loop (with its own independent evaluator)
15
+ then drives completion, and instar's stop-hook **defers** the continue/stop decision to it
16
+ (approves each turn so native /goal stays in control). Where native /goal is absent, instar's
17
+ own completion evaluator (shipped previously) drives completion instead, so this works everywhere.
18
+
19
+ instar stays in charge of what native /goal doesn't cover: it still enforces **emergency-stop**
20
+ and **duration expiry** on a native-goal job by injecting `/goal clear` first, then standing the
21
+ job down. Multi-topic orchestration, cap/quota, and messaging remain instar's.
22
+
23
+ - Endpoints: `POST /autonomous/native-goal/set {topicId, condition}` (inject + mark native),
24
+ `POST /autonomous/native-goal/clear {topicId}`.
25
+ - `setup-autonomous.sh` auto-detects Claude Code >= 2.1.139 and activates native /goal when a
26
+ condition is set; otherwise falls back to instar's own evaluator.
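
As a rough illustration of the two endpoints listed above, the sketch below only builds the request descriptors and does not send anything. The route paths and the `topicId` / `condition` body fields come from the notes; the helper name `buildNativeGoalRequest`, the numeric `topicId` type, the base URL, and the bearer-token header are assumptions made for the example, not instar's documented client API.

```typescript
// Hypothetical helper. Only the paths and the {topicId, condition} fields are
// taken from the upgrade notes; everything else is an illustrative assumption.
interface NativeGoalRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildNativeGoalRequest(
  baseUrl: string,
  action: "set" | "clear",
  topicId: number, // assumption: topic ids are numeric
  condition?: string,
  authToken?: string, // assumption: the routes accept a bearer token
): NativeGoalRequest {
  if (action === "set" && (condition === undefined || condition.trim() === "")) {
    throw new Error("native-goal/set needs a non-empty condition");
  }
  const payload: { topicId: number; condition?: string } = { topicId };
  if (action === "set") {
    payload.condition = condition;
  }
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (authToken) {
    headers["Authorization"] = `Bearer ${authToken}`;
  }
  return {
    // Trim trailing slashes so the joined path has exactly one separator.
    url: `${baseUrl.replace(/\/+$/, "")}/autonomous/native-goal/${action}`,
    method: "POST",
    headers,
    body: JSON.stringify(payload),
  };
}
```

The returned descriptor could be handed to any HTTP client; per the notes, an unknown topic is expected to produce a 404.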
27
+
28
+ ## What to Tell Your User
29
+
30
+ When the tool I'm running already has its own "keep going until done" feature (Claude Code and
31
+ Codex both added one called goal), I now hand my finish-line straight to it instead of running
32
+ my own judge on top — no double-checking, and I use the tool's native machinery. Where the tool
33
+ doesn't have it, my own judge still does the job. Either way I keep the safety controls (stop
34
+ everything, time limits) and the ability to run several jobs at once. Nothing to set up.
35
+
36
+ ## Summary of New Capabilities
37
+
38
+ - Native /goal delegation: instar injects `/goal <condition>` into the session and lets the
39
+ framework's loop own completion (`goal_mode: native`).
40
+ - `POST /autonomous/native-goal/set` and `/autonomous/native-goal/clear`.
41
+ - Auto-detection of Claude Code >= 2.1.139 in `setup-autonomous.sh`.
42
+ - Stop-hook defers completion to native /goal while still enforcing emergency-stop + duration
43
+ (clears the native goal first).
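
The version gate mentioned above (Claude Code >= 2.1.139) amounts to a numeric, component-wise comparison. The real detection lives in `setup-autonomous.sh` and is not shown in this diff, so the following is only a sketch of the idea: the function names and the assumption that the version appears as a dotted `major.minor.patch` triple somewhere in the tool's output are illustrative.

```typescript
// Illustrative only: the shipped check is a shell script not reproduced here.
interface Version {
  major: number;
  minor: number;
  patch: number;
}

// Finds the first "x.y.z" triple in a string, e.g. in a --version banner.
function parseVersion(text: string): Version | null {
  const m = /(\d+)\.(\d+)\.(\d+)/.exec(text);
  if (!m) return null;
  return { major: Number(m[1]), minor: Number(m[2]), patch: Number(m[3]) };
}

// Numeric comparison, so "2.10.0" correctly ranks above "2.1.139".
function isAtLeast(versionText: string, minimumText: string): boolean {
  const v = parseVersion(versionText);
  const min = parseVersion(minimumText);
  if (!v || !min) return false; // unknown version: do not assume native /goal
  if (v.major !== min.major) return v.major > min.major;
  if (v.minor !== min.minor) return v.minor > min.minor;
  return v.patch >= min.patch;
}

const NATIVE_GOAL_MIN_VERSION = "2.1.139"; // threshold stated in the notes

// Native /goal is used only when the version qualifies AND a condition is set.
function shouldUseNativeGoal(versionText: string, condition: string | undefined): boolean {
  const hasCondition = condition !== undefined && condition.trim() !== "";
  return hasCondition && isAtLeast(versionText, NATIVE_GOAL_MIN_VERSION);
}
```

Returning `false` for an unparseable version keeps the fallback path (instar's own evaluator) as the default, which matches the "otherwise falls back" wording in the notes.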
44
+
45
+ ## Migration Notes
46
+
47
+ Existing agents receive the updated hook + setup via
48
+ `PostUpdateMigrator.migrateAutonomousStopHookTopicKeyed` (marker bumped to the native-/goal
49
+ signature). No action required.
50
+
51
+ ## Evidence
52
+
53
+ - **Hook (behavioral):** `autonomous-completion-condition.test.ts` — `goal_mode: native` defers
54
+ (approves/exits, job retained, native /goal stays in control); emergency-stop and duration
55
+ expiry still clear state + exit in native mode.
56
+ - **Integration:** `autonomous-sessions-api.test.ts` — `native-goal/set` injects `/goal
57
+ <condition>` into the topic's session (verified via the recorded `sendInput` call) and flips
58
+ `goal_mode: native`; `native-goal/clear` injects `/goal clear`; 404 on unknown topic.
59
+ - tsc clean; 174 affected tests green.
60
+
61
+ ## Note on approach
62
+
63
+ The original spec sketched Phase 2 via a `ThreadGoalSlot` provider primitive. The shipped
64
+ approach instead drives native /goal by **injecting the slash command** through instar's
65
+ existing session-input mechanism — simpler, and the correct use of a capability instar already
66
+ has (rather than treating "no programmatic /goal API" as a blocker). Same intent (delegate to
67
+ native /goal where present), better mechanism.
@@ -0,0 +1,69 @@
1
+ # Side-Effects Review — Codex Multi-Agent Threadline Robustness
2
+
3
+ Spec: docs/specs/CODEX-MULTIAGENT-THREADLINE-SPEC.md (approved, converged)
4
+ Change: let Codex-framework agents reply to Threadline messages — (A) targeted
5
+ MCP-permitting launch for reply workers, (B) per-agent threadline MCP override,
6
+ (C) wire both reply paths.
7
+
8
+ ## 1. Over-block — what legitimate inputs does this reject that it shouldn't?
9
+
10
+ None new. The launch-profile selector is additive: jobs keep `-s
11
+ workspace-write` (unchanged); reply workers add bypass; explicit
12
+ `codexSandboxMode` still wins. The `-c` MCP override only ever points codex at
13
+ THIS agent's own threadline entry — it cannot reject anything.
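
The precedence described above (an explicit `codexSandboxMode` wins, reply workers get bypass, everything else stays on `workspace-write`) can be sketched as a small selector. The option names `codexSandboxMode` and `codexAllowMcpTools` appear in the review; their types, the `SandboxProfile` shape, and the function name are assumptions, and the sketch deliberately returns a label rather than real codex argv.

```typescript
// Hypothetical shapes: only the two option names come from the review text.
interface CodexLaunchOptions {
  codexSandboxMode?: string; // explicit operator choice, if any
  codexAllowMcpTools?: boolean; // set for Threadline reply workers
}

type SandboxProfile =
  | { kind: "explicit"; mode: string }
  | { kind: "bypass" }
  | { kind: "workspace-write" };

function selectSandboxProfile(opts: CodexLaunchOptions): SandboxProfile {
  // 1. An explicit sandbox mode always wins.
  if (opts.codexSandboxMode) {
    return { kind: "explicit", mode: opts.codexSandboxMode };
  }
  // 2. Reply workers that must call MCP tools run with bypass.
  if (opts.codexAllowMcpTools) {
    return { kind: "bypass" };
  }
  // 3. Jobs and generic spawns keep the unchanged default.
  return { kind: "workspace-write" };
}
```

Because the default branch is untouched, callers that never set `codexAllowMcpTools` see the same profile as before, which is the additive property the section relies on.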
14
+
15
+ ## 2. Under-block — what failure modes does this still miss?
16
+
17
+ - Codex agents without threadline configured: no override emitted (correct — no
18
+ threadline to reply through). Not a miss.
19
+ - The reply worker still depends on the deployed relay / local delivery being
20
+ reachable — out of scope here (transport already works).
21
+ - A future THIRD reply path that calls `buildHeadlessLaunch` directly would need
22
+ the same two flags. Mitigated by routing all reply spawns through the two
23
+ known paths; documented in the spec.
24
+
25
+ ## 3. Level-of-abstraction fit
26
+
27
+ Correct layer. The launch-profile + MCP-override are properties of HOW a codex
28
+ session is launched → they belong in `frameworkSessionLaunch` (the builder) with
29
+ the data computed once at boot in `server.ts`. The single-source-of-truth
30
+ resolver (`mcpEntry.ts`) is shared with `ThreadlineBootstrap` so registration and
31
+ per-spawn override cannot drift.
32
+
33
+ ## 4. Signal vs authority compliance
34
+
35
+ Compliant. No blocking authority added. The selectors (`codexAllowMcpTools`,
36
+ `codexThreadlineMcp`, `config.threadline`-present) are capability/launch choices,
37
+ not gates that block information flow. See docs/signal-vs-authority.md.
38
+
39
+ ## 5. Interactions
40
+
41
+ - Fix A ↔ Fix B: independent (sandbox profile vs MCP entry); both emitted in the
42
+ same codex argv, no conflict.
43
+ - Fix B ↔ existing shared-config registration: intentional — `-c` wins per spawn;
44
+ the shared `~/.codex/config.toml` entry stays (Claude parity / discovery) but
45
+ is non-authoritative for codex spawns. No double-fire (the result is a single MCP
46
+ server named "threadline").
47
+ - Jobs/dispatch (routes.ts generic spawn, JobScheduler): do NOT set
48
+ `codexAllowMcpTools` → unchanged (`workspace-write`). No shadowing.
49
+ - ThreadlineBootstrap refactor is behavior-preserving (`resolveThreadlineMcpEntry`
50
+ returns the identical `{command,args}`; `absDir` still declared for downstream
51
+ ~/.claude.json registration).
52
+
53
+ ## 6. External surfaces
54
+
55
+ - Other agents: a codex agent can now actually reply over Threadline. This is net-new
56
+ outbound traffic it previously couldn't send. Bounded by the trust gate (replies only
57
+ to trusted peers, who already messaged it).
58
+ - Security posture: codex reply workers run unsandboxed (full bypass) — the only
59
+ mode that permits the MCP call (verified, finding 2). Scheduled jobs remain
60
+ sandboxed. Operator signed off (topic 12304).
61
+ - Timing/runtime: none introduced; the override is computed once at boot.
62
+
63
+ ## 7. Rollback cost
64
+
65
+ Code-only, no migration/state. Revert `frameworkSessionLaunch.ts`, `mcpEntry.ts`,
66
+ `ThreadlineBootstrap` refactor, `server.ts`/`SessionManager`/`types.ts`/
67
+ `PipeSessionSpawner` wiring. The shared-config registration is untouched, so a
68
+ revert cannot strand `~/.codex` or `.instar` state. Existing agents pick up the
69
+ launch fix on normal update (no config rewrite).
@@ -0,0 +1,76 @@
1
+ # Side-Effects Review — native /goal delegation (Phase 2)
2
+
3
+ **Version / slug:** `goal-native-delegation`
4
+ **Date:** 2026-05-24
5
+ **Author:** echo
6
+ **Second-pass reviewer:** internal conformance pass
7
+
8
+ ## Summary of the change
9
+
10
+ Where the framework has a native /goal loop (Claude Code >= 2.1.139), autonomous mode delegates
11
+ completion to it: instar **injects `/goal <condition>`** into the session via
12
+ `SessionManager.sendInput` (tmux send-keys — its existing session-input mechanism), marks the
13
+ job `goal_mode: native`, and the stop-hook **defers** the continue/stop decision to native
14
+ /goal (approves each turn). instar still enforces emergency-stop + duration by injecting
15
+ `/goal clear` first. Phase 2 of `docs/specs/goal-completion-evaluator.md`. `src/`: two routes
16
+ (`/autonomous/native-goal/set|clear`) + capability-index entry + migration marker bump. Non-src:
17
+ hook native branch, setup auto-detection.
18
+
19
+ ## Decision-point inventory
20
+ - Stop-hook `goal_mode: native` branch — **modify**: defer completion to native /goal (approve);
21
+ still enforce emergency-stop + duration (clear native first). This REMOVES instar's completion
22
+ authority for native topics by design (native /goal is the authority there).
23
+ - `POST /autonomous/native-goal/set` / `clear` — **add**: inject the slash command + flip
24
+ goal_mode. Thin; the side-effect is the session injection.
25
+ - `setup-autonomous.sh` native detection — **modify** (`.claude/`): activate native mode when
26
+ Claude Code >= 2.1.139 + a condition is set.
27
+
28
+ ## 1. Over-block
29
+ - In native mode instar approves (never blocks) for completion, so instar cannot over-block.
30
+ Native /goal's own hook decides. No new over-block path.
31
+
32
+ ## 2. Under-block (false "done" / premature exit)
33
+ - instar approving in native mode does NOT cause a false done: in Claude Code's hook composition
34
+ a `block` from native /goal wins over instar's `approve`, so native /goal keeps the session
35
+ working until ITS evaluator confirms the condition. If native /goal somehow isn't active, the
36
+ session would exit — mitigated because goal_mode:native is only set after a successful inject
37
+ (the set endpoint flips the flag only when sendInput returns true).
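
The mitigation above hinges on ordering: the flag is flipped only after the injection reports success. A minimal sketch of that ordering follows, assuming a `sendInput` callback that returns a boolean; the real `SessionManager.sendInput` signature is not shown in this diff, and the `AutonomousJob` shape is hypothetical.

```typescript
// Hypothetical job record; only the goal_mode: native marker is from the review.
interface AutonomousJob {
  topicId: number;
  goal_mode?: "native";
}

// Assumed callback shape standing in for SessionManager.sendInput.
type SendInput = (topicId: number, text: string) => boolean;

function setNativeGoal(job: AutonomousJob, condition: string, sendInput: SendInput): boolean {
  // Inject the slash command first.
  const injected = sendInput(job.topicId, `/goal ${condition}`);
  // Mark the job native only if the session actually received the command,
  // so a failed inject leaves instar's own evaluator in charge.
  if (injected) {
    job.goal_mode = "native";
  }
  return injected;
}
```

If the flag were set before the injection, a failed send would leave a job that defers to a native /goal loop that never started, which is exactly the premature-exit case this section describes.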
38
+
39
+ ## 3. Level-of-abstraction fit
40
+ - Correct + the point of the change: drive the framework's native feature via instar's own
41
+ session-input mechanism, rather than reimplementing or treating "no /goal API" as a blocker.
42
+
43
+ ## 4. Blocking authority
44
+ - [x] In native mode instar **yields** completion authority to native /goal (reduces instar's
45
+ authority — safe direction) while retaining its terminal STOP concerns (emergency/duration) by
46
+ clearing the native goal. No new brittle authority added.
47
+
48
+ ## 5. Interactions (the key one: two Stop hooks)
49
+ - instar's hook + native /goal's hook both fire each turn. Resolved by composition: instar
50
+ approves (completion) so native /goal's block keeps control; instar only force-stops on
51
+ emergency/duration, and does so by clearing native /goal first (so they don't fight).
52
+ - **Emergency-stop** already kills the session via the sentinel path (native /goal dies with it);
53
+ the hook also clears native /goal on the flag. **Duration** clears native /goal then exits.
54
+ - Falls back cleanly to the instar evaluator (Phase 1) when native /goal is absent.
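
The composition rules in this section can be condensed into one decision function. The shipped hook is a shell script (`autonomous-stop-hook.sh`) that is not reproduced here, so the state fields, outcome labels, and the handling of non-native jobs below are assumptions chosen to mirror the prose, not the hook's actual interface.

```typescript
// Hypothetical view of what the hook reads each turn.
interface HookState {
  goalMode?: string; // "native" when completion is delegated
  emergencyStop: boolean;
  expiresAtMs?: number; // absent means no duration limit
}

type HookOutcome =
  | { action: "defer-to-native" } // approve the turn, keep the job
  | {
      action: "stand-down";
      reason: "emergency-stop" | "duration-expired";
      clearNativeGoalFirst: boolean; // inject `/goal clear` before stopping
    }
  | { action: "run-own-evaluator" }; // Phase 1 path when native /goal is absent

function decideStop(state: HookState, nowMs: number): HookOutcome {
  const native = state.goalMode === "native";
  // Terminal stop concerns stay with instar in every mode.
  if (state.emergencyStop) {
    return { action: "stand-down", reason: "emergency-stop", clearNativeGoalFirst: native };
  }
  if (state.expiresAtMs !== undefined && nowMs >= state.expiresAtMs) {
    return { action: "stand-down", reason: "duration-expired", clearNativeGoalFirst: native };
  }
  // Otherwise completion belongs to native /goal if active, else to instar.
  return native ? { action: "defer-to-native" } : { action: "run-own-evaluator" };
}
```

Checking the stop conditions before the native branch is what keeps the two hooks from fighting: instar never blocks for completion in native mode, and it only ends the job after the native goal has been cleared.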
55
+
56
+ ## 6. External surfaces
57
+ - **Session injection:** instar types `/goal <condition>` / `/goal clear` into the agent's own
58
+ tmux session (send-keys). This is instar's established mechanism (initial-message injection).
59
+ No new external/credential surface.
60
+ - **HTTP:** two authed routes under the already-claimed `/autonomous` prefix.
61
+
62
+ ## 7. Rollback cost
63
+ - Low. Reverting restores instar's own evaluator everywhere (Phase 1 still in main). A
64
+ `goal_mode: native` left in a state file is ignored by an older hook (falls to the evaluator/
65
+ promise path). Migration marker is content-sniffed (rollback re-deploys cleanly).
66
+
67
+ ## 8. Test evidence
68
+ - Hook: native defers (approve/exit, retained) + emergency/duration clear+exit. Integration:
69
+ set injects `/goal <cond>` (verified sendInput) + flips goal_mode; clear injects `/goal clear`;
70
+ 404 unknown topic. tsc clean; 174 affected tests green.
71
+
72
+ ## Deviation from the original spec
73
+ Spec Phase 2 sketched a `ThreadGoalSlot` provider primitive. Shipped instead via direct slash-
74
+ command injection through `SessionManager.sendInput` — simpler and the correct use of an existing
75
+ instar capability (per maintainer direction: "we already input text into sessions; use that to
76
+ call /goal"). Same intent, better mechanism; `ThreadGoalSlot` left unimplemented (not needed).