instar 1.2.57 → 1.2.59
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude/skills/autonomous/hooks/autonomous-stop-hook.sh +31 -0
- package/.claude/skills/autonomous/scripts/setup-autonomous.sh +25 -0
- package/dist/commands/server.d.ts.map +1 -1
- package/dist/commands/server.js +21 -1
- package/dist/commands/server.js.map +1 -1
- package/dist/core/PostUpdateMigrator.js +2 -2
- package/dist/core/PostUpdateMigrator.js.map +1 -1
- package/dist/core/SessionManager.d.ts +6 -0
- package/dist/core/SessionManager.d.ts.map +1 -1
- package/dist/core/SessionManager.js +10 -0
- package/dist/core/SessionManager.js.map +1 -1
- package/dist/core/frameworkSessionLaunch.d.ts +34 -3
- package/dist/core/frameworkSessionLaunch.d.ts.map +1 -1
- package/dist/core/frameworkSessionLaunch.js +42 -3
- package/dist/core/frameworkSessionLaunch.js.map +1 -1
- package/dist/core/types.d.ts +12 -0
- package/dist/core/types.d.ts.map +1 -1
- package/dist/core/types.js.map +1 -1
- package/dist/server/CapabilityIndex.d.ts.map +1 -1
- package/dist/server/CapabilityIndex.js +2 -0
- package/dist/server/CapabilityIndex.js.map +1 -1
- package/dist/server/routes.d.ts.map +1 -1
- package/dist/server/routes.js +54 -0
- package/dist/server/routes.js.map +1 -1
- package/dist/threadline/PipeSessionSpawner.d.ts +10 -0
- package/dist/threadline/PipeSessionSpawner.d.ts.map +1 -1
- package/dist/threadline/PipeSessionSpawner.js +6 -0
- package/dist/threadline/PipeSessionSpawner.js.map +1 -1
- package/dist/threadline/ThreadlineBootstrap.d.ts.map +1 -1
- package/dist/threadline/ThreadlineBootstrap.js +5 -16
- package/dist/threadline/ThreadlineBootstrap.js.map +1 -1
- package/dist/threadline/mcpEntry.d.ts +25 -0
- package/dist/threadline/mcpEntry.d.ts.map +1 -0
- package/dist/threadline/mcpEntry.js +38 -0
- package/dist/threadline/mcpEntry.js.map +1 -0
- package/package.json +1 -1
- package/src/data/builtin-manifest.json +62 -62
- package/upgrades/1.2.58.md +77 -0
- package/upgrades/1.2.59.md +67 -0
- package/upgrades/side-effects/codex-multiagent-threadline.md +69 -0
- package/upgrades/side-effects/goal-native-delegation.md +76 -0

@@ -0,0 +1,67 @@

# Upgrade Guide: NEXT (autonomous mode delegates to native /goal)

<!-- bump: minor -->
<!-- minor = new capability, backward compatible -->

## What Changed

**New: where the framework has a native /goal loop, autonomous mode hands the finish-line to it.**

Phase 2 of `docs/specs/goal-completion-evaluator.md`. When an autonomous job is started with a
completion condition AND the framework provides native /goal (Claude Code >= 2.1.139), instar
now **injects `/goal <condition>` into the session** and marks the job `goal_mode: native`. The
injection uses instar's core session-input mechanism (`SessionManager.sendInput` / tmux
send-keys, the same way it injects any message). The framework's own /goal loop (with its own
independent evaluator) then drives completion, and instar's stop-hook **defers** the
continue/stop decision to it (approves each turn so native /goal stays in control). Where native
/goal is absent, instar's own completion evaluator (shipped previously) drives completion
instead; that path works everywhere.

instar stays in charge of what native /goal doesn't cover: it still enforces **emergency-stop**
and **duration expiry** on a native-goal job by injecting `/goal clear` first, then standing the
job down. Multi-topic orchestration, cap/quota, and messaging remain instar's.

- Endpoints: `POST /autonomous/native-goal/set {topicId, condition}` (inject + mark native),
  `POST /autonomous/native-goal/clear {topicId}`.
- `setup-autonomous.sh` auto-detects Claude Code >= 2.1.139 and activates native /goal when a
  condition is set; otherwise falls back to instar's own evaluator.
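
The set/clear behavior described above can be sketched roughly as follows. This is a minimal illustration, not instar's actual route code: the `AutonomousJob` shape, the `SendInput` signature, the 502 status on a failed injection, and the choice to flip back to `"evaluator"` on clear are all assumptions made for the example. Only the 404 on an unknown topic, the injected strings, and the "flip the flag only when the injection succeeds" rule come from the notes in this diff.

```typescript
// Hypothetical types for illustration; instar's real types are not shown in this diff.
type GoalMode = "native" | "evaluator";

interface AutonomousJob {
  topicId: string;
  condition?: string;
  goalMode: GoalMode;
}

// Stand-in for SessionManager.sendInput: true means the text reached the session.
type SendInput = (topicId: string, text: string) => boolean;

interface RouteResult {
  status: number;
  goalMode?: GoalMode;
  error?: string;
}

function setNativeGoal(
  jobs: Map<string, AutonomousJob>,
  sendInput: SendInput,
  topicId: string,
  condition: string,
): RouteResult {
  const job = jobs.get(topicId);
  if (!job) return { status: 404, error: "unknown topic" };

  // Inject first. The job is marked native only after a successful injection,
  // so a job never claims native mode without a native /goal actually running.
  const injected = sendInput(topicId, `/goal ${condition}`);
  if (!injected) return { status: 502, error: "injection failed" }; // status code is assumed

  job.condition = condition;
  job.goalMode = "native";
  return { status: 200, goalMode: job.goalMode };
}

function clearNativeGoal(
  jobs: Map<string, AutonomousJob>,
  sendInput: SendInput,
  topicId: string,
): RouteResult {
  const job = jobs.get(topicId);
  if (!job) return { status: 404, error: "unknown topic" };

  const injected = sendInput(topicId, "/goal clear");
  if (!injected) return { status: 502, error: "injection failed" }; // status code is assumed

  job.goalMode = "evaluator"; // assumed target mode after a clear
  return { status: 200, goalMode: job.goalMode };
}
```

Passing `sendInput` in as a parameter mirrors how the integration test in the Evidence section verifies the injection through a recorded `sendInput` call.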

## What to Tell Your User

When the tool I'm running already has its own "keep going until done" feature (Claude Code and
Codex both added one called goal), I now hand my finish-line straight to it instead of running
my own judge on top: no double-checking, and I use the tool's native machinery. Where the tool
doesn't have it, my own judge still does the job. Either way I keep the safety controls (stop
everything, time limits) and the ability to run several jobs at once. Nothing to set up.

## Summary of New Capabilities

- Native /goal delegation: instar injects `/goal <condition>` into the session and lets the
  framework's loop own completion (`goal_mode: native`).
- `POST /autonomous/native-goal/set` and `POST /autonomous/native-goal/clear`.
- Auto-detection of Claude Code >= 2.1.139 in `setup-autonomous.sh`.
- Stop-hook defers completion to native /goal while still enforcing emergency-stop + duration
  (clears the native goal first).
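
The version gate in the list above depends on a numeric comparison, since a plain string comparison would rank `2.1.99` above `2.1.139`. The actual detection lives in `setup-autonomous.sh` and is not shown in this diff. The sketch below only illustrates the comparison logic, and it assumes the version output contains a dotted `major.minor.patch` triple somewhere in the string. The function names are hypothetical.

```typescript
// Minimum version taken from the upgrade notes.
const MIN_NATIVE_GOAL_VERSION = "2.1.139";

// Extract the first major.minor.patch triple from arbitrary version output.
function parseVersion(raw: string): [number, number, number] | null {
  const m = raw.match(/(\d+)\.(\d+)\.(\d+)/);
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// Numeric, component-wise comparison: true when version >= minimum.
function atLeast(version: string, minimum: string): boolean {
  const v = parseVersion(version);
  const min = parseVersion(minimum);
  if (!v || !min) return false; // unparseable output falls back to "not supported"
  const [vMajor, vMinor, vPatch] = v;
  const [mMajor, mMinor, mPatch] = min;
  if (vMajor !== mMajor) return vMajor > mMajor;
  if (vMinor !== mMinor) return vMinor > mMinor;
  return vPatch >= mPatch;
}

// Native mode needs both a supported version and a completion condition;
// anything else falls back to instar's own evaluator.
function chooseGoalMode(
  versionOutput: string | null,
  condition?: string,
): "native" | "evaluator" {
  const hasCondition = condition !== undefined && condition.trim() !== "";
  if (hasCondition && versionOutput !== null && atLeast(versionOutput, MIN_NATIVE_GOAL_VERSION)) {
    return "native";
  }
  return "evaluator";
}
```

Treating unparseable or missing version output as "not supported" keeps the fallback on the evaluator path, which the notes describe as working everywhere.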

## Migration Notes

Existing agents receive the updated hook + setup via
`PostUpdateMigrator.migrateAutonomousStopHookTopicKeyed` (marker bumped to the native-/goal
signature). No action required.

## Evidence

- **Hook (behavioral):** `autonomous-completion-condition.test.ts`: `goal_mode: native` defers
  (approves/exits, job retained, native /goal stays in control); emergency-stop and duration
  expiry still clear state + exit in native mode.
- **Integration:** `autonomous-sessions-api.test.ts`: `native-goal/set` injects
  `/goal <condition>` into the topic's session (verified via the recorded `sendInput` call) and
  flips `goal_mode: native`; `native-goal/clear` injects `/goal clear`; 404 on unknown topic.
- tsc clean; 174 affected tests green.

## Note on approach

The original spec sketched Phase 2 via a `ThreadGoalSlot` provider primitive. The shipped
approach instead drives native /goal by **injecting the slash command** through instar's
existing session-input mechanism. This is simpler, and it is the correct use of a capability
instar already has (rather than treating "no programmatic /goal API" as a blocker). Same intent
(delegate to native /goal where present), better mechanism.

@@ -0,0 +1,69 @@

# Side-Effects Review: Codex Multi-Agent Threadline Robustness

Spec: docs/specs/CODEX-MULTIAGENT-THREADLINE-SPEC.md (approved, converged)
Change: let Codex-framework agents reply to Threadline messages via (A) a targeted
MCP-permitting launch for reply workers, (B) a per-agent threadline MCP override, and
(C) wiring of both reply paths.

## 1. Over-block: what legitimate inputs does this reject that it shouldn't?

None new. The launch-profile selector is additive: jobs keep `-s workspace-write`
(unchanged); reply workers add bypass; explicit `codexSandboxMode` still wins. The
`-c` MCP override only ever points codex at THIS agent's own threadline entry, so it
cannot reject anything.
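
The precedence described above (explicit `codexSandboxMode` first, then the reply-worker bypass, then the unchanged job default) can be sketched as a small selector. This is an illustration under assumptions, not the code in `frameworkSessionLaunch`: the option object shape is hypothetical, and the bypass flag string is an assumed stand-in, since the review only says that reply workers "add bypass".

```typescript
// Hypothetical option shape; the field names come from the review text.
interface CodexLaunchOptions {
  codexSandboxMode?: string;    // explicit operator choice, always wins
  codexAllowMcpTools?: boolean; // set only by Threadline reply workers
}

// Assumed flag for the "full bypass" profile; the review does not spell it out.
const ASSUMED_BYPASS_ARGS: readonly string[] = ["--dangerously-bypass-approvals-and-sandbox"];

// Additive selector: callers that set nothing keep the pre-existing default.
function sandboxArgs(opts: CodexLaunchOptions): string[] {
  if (opts.codexSandboxMode) return ["-s", opts.codexSandboxMode];
  if (opts.codexAllowMcpTools) return [...ASSUMED_BYPASS_ARGS];
  return ["-s", "workspace-write"];
}
```

Because the default branch is untouched, scheduled jobs and generic spawns that never set `codexAllowMcpTools` produce the same argv as before, which is the sense in which the change is additive.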

## 2. Under-block: what failure modes does this still miss?

- Codex agents without threadline configured: no override emitted (correct, since there is
  no threadline to reply through). Not a miss.
- The reply worker still depends on the deployed relay / local delivery being
  reachable; this is out of scope here (transport already works).
- A future THIRD reply path that calls `buildHeadlessLaunch` directly would need
  the same two flags. Mitigated by routing all reply spawns through the two
  known paths; documented in the spec.

## 3. Level-of-abstraction fit

Correct layer. The launch-profile + MCP-override are properties of HOW a codex
session is launched, so they belong in `frameworkSessionLaunch` (the builder) with
the data computed once at boot in `server.ts`. The single-source-of-truth
resolver (`mcpEntry.ts`) is shared with `ThreadlineBootstrap` so registration and
per-spawn override cannot drift.
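
A rough sketch of why a shared resolver prevents drift: both consumers derive their output from the same `{command, args}` value. Everything below is illustrative. The resolver signature, the `node` command, the script path, and the exact `-c` key format are assumptions; the review only states that `resolveThreadlineMcpEntry` returns `{command,args}` and that the override is passed with `-c`.

```typescript
interface McpEntry {
  command: string;
  args: string[];
}

// Hypothetical resolver: the one place that decides what the threadline entry is.
// The command and path are placeholders, not the values instar actually uses.
function resolveThreadlineMcpEntry(agentDir: string): McpEntry {
  return { command: "node", args: [`${agentDir}/threadline-mcp.js`] };
}

// Consumer 1: the registration record written at bootstrap.
function registrationRecord(agentDir: string): { threadline: McpEntry } {
  return { threadline: resolveThreadlineMcpEntry(agentDir) };
}

// Consumer 2: per-spawn override arguments. The key format is an assumption.
function codexOverrideArgs(entry: McpEntry): string[] {
  return [
    "-c", `mcp_servers.threadline.command=${JSON.stringify(entry.command)}`,
    "-c", `mcp_servers.threadline.args=${JSON.stringify(entry.args)}`,
  ];
}
```

If the entry ever changes, it changes in the resolver, and both the registration and the per-spawn override pick it up together.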

## 4. Signal vs authority compliance

Compliant. No blocking authority added. The selectors (`codexAllowMcpTools`,
`codexThreadlineMcp`, `config.threadline`-present) are capability/launch choices,
not gates that block information flow. See docs/signal-vs-authority.md.

## 5. Interactions

- Fix A ↔ Fix B: independent (sandbox profile vs MCP entry); both emitted in the
  same codex argv, no conflict.
- Fix B ↔ existing shared-config registration: intentional. `-c` wins per spawn;
  the shared `~/.codex/config.toml` entry stays (Claude parity / discovery) but
  is non-authoritative for codex spawns. No double-fire (only one MCP server named
  "threadline" results).
- Jobs/dispatch (routes.ts generic spawn, JobScheduler): do NOT set
  `codexAllowMcpTools`, so they are unchanged (`workspace-write`). No shadowing.
- ThreadlineBootstrap refactor is behavior-preserving (`resolveThreadlineMcpEntry`
  returns the identical `{command,args}`; `absDir` still declared for downstream
  ~/.claude.json registration).

## 6. External surfaces

- Other agents: a codex agent can now actually reply over Threadline, which is net new
  outbound traffic it previously couldn't send. Bounded by the trust gate (replies only
  to trusted peers, who already messaged it).
- Security posture: codex reply workers run unsandboxed (full bypass), the only
  mode that permits the MCP call (verified, finding 2). Scheduled jobs remain
  sandboxed. Operator signed off (topic 12304).
- Timing/runtime: none introduced; the override is computed once at boot.

## 7. Rollback cost

Code-only, no migration/state. Revert `frameworkSessionLaunch.ts`, `mcpEntry.ts`, the
`ThreadlineBootstrap` refactor, and the `server.ts`/`SessionManager`/`types.ts`/
`PipeSessionSpawner` wiring. The shared-config registration is untouched, so a
revert cannot strand `~/.codex` or `.instar` state. Existing agents pick up the
launch fix on normal update (no config rewrite).

@@ -0,0 +1,76 @@

# Side-Effects Review: native /goal delegation (Phase 2)

**Version / slug:** `goal-native-delegation`
**Date:** 2026-05-24
**Author:** echo
**Second-pass reviewer:** internal conformance pass

## Summary of the change

Where the framework has a native /goal loop (Claude Code >= 2.1.139), autonomous mode delegates
completion to it: instar **injects `/goal <condition>`** into the session via
`SessionManager.sendInput` (tmux send-keys, its existing session-input mechanism), marks the
job `goal_mode: native`, and the stop-hook **defers** the continue/stop decision to native
/goal (approves each turn). instar still enforces emergency-stop + duration by injecting
`/goal clear` first. Phase 2 of `docs/specs/goal-completion-evaluator.md`. `src/`: two routes
(`/autonomous/native-goal/set|clear`) + capability-index entry + migration marker bump. Non-src:
hook native branch, setup auto-detection.

## Decision-point inventory
- Stop-hook `goal_mode: native` branch (**modify**): defer completion to native /goal (approve);
  still enforce emergency-stop + duration (clear native first). This REMOVES instar's completion
  authority for native topics by design (native /goal is the authority there).
- `POST /autonomous/native-goal/set` / `clear` (**add**): inject the slash command + flip
  goal_mode. Thin; the side-effect is the session injection.
- `setup-autonomous.sh` native detection (**modify**, `.claude/`): activate native mode when
  Claude Code >= 2.1.139 + a condition is set.

## 1. Over-block
- In native mode instar approves (never blocks) for completion, so instar cannot over-block.
  Native /goal's own hook decides. No new over-block path.

## 2. Under-block (false "done" / premature exit)
- instar approving in native mode does NOT cause a false done: in Claude Code's hook composition
  a `block` from native /goal wins over instar's `approve`, so native /goal keeps the session
  working until ITS evaluator confirms the condition. If native /goal somehow isn't active, the
  session would exit; this is mitigated because `goal_mode: native` is only set after a
  successful inject (the set endpoint flips the flag only when sendInput returns true).
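
The composition rule this argument relies on can be modeled in a few lines. The sketch below encodes the review's own claim that a `block` from any Stop hook wins over an `approve`. It is a model of that claim for reasoning purposes, not Claude Code's implementation, and the type and function names are hypothetical.

```typescript
type HookDecision = "approve" | "block";

// Model of the stated rule: the session may stop only if no hook blocks.
function composeStopHooks(decisions: HookDecision[]): HookDecision {
  return decisions.some((d) => d === "block") ? "block" : "approve";
}

// Native /goal active and not yet satisfied: its block keeps the session working.
const whileWorking = composeStopHooks(["approve", "block"]);

// Native /goal satisfied: both hooks approve and the session may stop.
const whenDone = composeStopHooks(["approve", "approve"]);

// Native /goal not active at all: only instar's approve fires and the session exits.
// This is the premature-exit case that the "flip only after a successful inject" guard targets.
const nativeMissing = composeStopHooks(["approve"]);
```

The third case shows why instar's unconditional `approve` is only safe when native /goal is actually running in the session.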

## 3. Level-of-abstraction fit
- Correct + the point of the change: drive the framework's native feature via instar's own
  session-input mechanism, rather than reimplementing or treating "no /goal API" as a blocker.

## 4. Blocking authority
- [x] In native mode instar **yields** completion authority to native /goal (reduces instar's
  authority, the safe direction) while retaining its terminal STOP concerns (emergency/duration)
  by clearing the native goal. No new brittle authority added.

## 5. Interactions (the key one: two Stop hooks)
- instar's hook + native /goal's hook both fire each turn. Resolved by composition: instar
  approves (completion) so native /goal's block keeps control; instar only force-stops on
  emergency/duration, and does so by clearing native /goal first (so they don't fight).
- **Emergency-stop** already kills the session via the sentinel path (native /goal dies with it);
  the hook also clears native /goal on the flag. **Duration** clears native /goal then exits.
- Falls back cleanly to the instar evaluator (Phase 1) when native /goal is absent.
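
The ordering of the stop-hook's branches, as described in this section, might look like the sketch below. The real hook is a shell script (`autonomous-stop-hook.sh`) that this diff does not show, so the state fields, the outcome shape, and the function name are hypothetical. Only the ordering (terminal stops first, then native deferral, then the evaluator fallback) is taken from the review.

```typescript
type GoalMode = "native" | "evaluator";

// Hypothetical view of the per-topic job state.
interface HookState {
  goalMode: GoalMode;
  emergencyStop: boolean;
  startedAtMs: number;
  durationMs: number;
}

interface HookOutcome {
  decision: "approve" | "block"; // approve lets the session stop; block keeps it working
  clearNativeGoal: boolean;      // inject `/goal clear` before standing down
  retainJob: boolean;            // keep the job state for the next turn
}

function decideStop(
  state: HookState,
  nowMs: number,
  evaluatorSaysDone: () => boolean,
): HookOutcome {
  const expired = nowMs - state.startedAtMs >= state.durationMs;

  // 1. Terminal stops stay with instar; clear native /goal first so the two hooks don't fight.
  if (state.emergencyStop || expired) {
    return { decision: "approve", clearNativeGoal: state.goalMode === "native", retainJob: false };
  }

  // 2. Native mode: approve every turn and let native /goal's own hook decide.
  if (state.goalMode === "native") {
    return { decision: "approve", clearNativeGoal: false, retainJob: true };
  }

  // 3. Fallback: instar's own evaluator decides.
  const done = evaluatorSaysDone();
  return { decision: done ? "approve" : "block", clearNativeGoal: false, retainJob: !done };
}
```

Checking the terminal conditions before the native branch is what keeps emergency-stop and duration expiry enforceable even though completion authority has been handed over.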

## 6. External surfaces
- **Session injection:** instar types `/goal <condition>` / `/goal clear` into the agent's own
  tmux session (send-keys). This is instar's established mechanism (initial-message injection).
  No new external/credential surface.
- **HTTP:** two authenticated routes under the already-claimed `/autonomous` prefix.

## 7. Rollback cost
- Low. Reverting restores instar's own evaluator everywhere (Phase 1 still in main). A
  `goal_mode: native` left in a state file is ignored by an older hook (falls to the evaluator/
  promise path). Migration marker is content-sniffed (rollback re-deploys cleanly).

## 8. Test evidence
- Hook: native defers (approve/exit, retained) + emergency/duration clear+exit. Integration:
  set injects `/goal <cond>` (verified sendInput) + flips goal_mode; clear injects `/goal clear`;
  404 unknown topic. tsc clean; 174 affected tests green.

## Deviation from the original spec
Spec Phase 2 sketched a `ThreadGoalSlot` provider primitive. Shipped instead via direct
slash-command injection through `SessionManager.sendInput`, which is simpler and the correct use
of an existing instar capability (per maintainer direction: "we already input text into sessions;
use that to call /goal"). Same intent, better mechanism; `ThreadGoalSlot` left unimplemented
(not needed).