instar 1.2.58 → 1.2.60

@@ -0,0 +1,76 @@
+ # Side-Effects Review: native /goal delegation (Phase 2)
+
+ **Version / slug:** `goal-native-delegation`
+ **Date:** 2026-05-24
+ **Author:** echo
+ **Second-pass reviewer:** internal conformance pass
+
+ ## Summary of the change
+
+ Where the framework has a native /goal loop (Claude Code >= 2.1.139), autonomous mode delegates
+ completion to it: instar **injects `/goal <condition>`** into the session via
+ `SessionManager.sendInput` (tmux send-keys, its existing session-input mechanism), marks the
+ job `goal_mode: native`, and the stop-hook **defers** the continue/stop decision to native
+ /goal by approving each turn. instar still enforces emergency-stop and the duration limit, and
+ injects `/goal clear` before stopping in either case. This is Phase 2 of
+ `docs/specs/goal-completion-evaluator.md`. Changes under `src/`: two routes
+ (`/autonomous/native-goal/set` and `/autonomous/native-goal/clear`), a capability-index entry,
+ and a migration marker bump. Changes outside `src/`: the hook's native branch and setup
+ auto-detection.
+
+ ## Decision-point inventory
+ - Stop-hook `goal_mode: native` branch (**modify**): defer completion to native /goal (approve);
+ still enforce emergency-stop + duration (clear native first). This REMOVES instar's completion
+ authority for native topics by design (native /goal is the authority there).
+ - `POST /autonomous/native-goal/set` / `clear` (**add**): inject the slash command + flip
+ `goal_mode`. Thin; the side-effect is the session injection.
+ - `setup-autonomous.sh` native detection (**modify**, `.claude/`): activate native mode when
+ Claude Code is >= 2.1.139 and a condition is set.
+
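The detection rule in the last bullet (version at least 2.1.139 and a condition present) can be sketched as below. This is an illustration only: `setup-autonomous.sh` is a shell script whose contents are not shown in this diff, and the helper names `parseVersion`, `versionAtLeast`, and `shouldUseNativeGoal` are hypothetical.

```typescript
// Threshold taken from the review text; everything else here is a hypothetical sketch.
const MIN_NATIVE_GOAL_VERSION = "2.1.139";

// Extract the leading dotted-numeric part, e.g. "2.1.140 (suffix)" -> [2, 1, 140].
function parseVersion(v: string): number[] {
  const match = v.trim().match(/^\d+(\.\d+)*/);
  if (!match) return [];
  return match[0].split(".").map((part) => Number(part));
}

// Numeric, component-wise comparison (so 2.10.0 is newer than 2.1.139).
function versionAtLeast(actual: string, minimum: string): boolean {
  const a = parseVersion(actual);
  const b = parseVersion(minimum);
  if (a.length === 0) return false; // unparseable version: do not assume native support
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true; // all components equal
}

// Native mode needs both a new-enough version and a non-empty condition.
function shouldUseNativeGoal(claudeVersion: string, condition: string | undefined): boolean {
  const hasCondition = condition !== undefined && condition.trim().length > 0;
  return hasCondition && versionAtLeast(claudeVersion, MIN_NATIVE_GOAL_VERSION);
}
```

Comparing components numerically rather than as strings matters here, because a plain string comparison would rank `2.10.0` below `2.1.139`.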
+ ## 1. Over-block
+ - In native mode instar always approves on completion and never blocks, so instar itself cannot
+ over-block. Native /goal's own hook decides. No new over-block path.
+
+ ## 2. Under-block (false "done" / premature exit)
+ - instar approving in native mode does NOT cause a false done: in Claude Code's hook composition
+ a `block` from native /goal wins over instar's `approve`, so native /goal keeps the session
+ working until ITS evaluator confirms the condition. If native /goal is not actually active, the
+ session would exit. This is mitigated because `goal_mode: native` is only set after a successful
+ inject (the set endpoint flips the flag only when `sendInput` returns true).
+
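The mitigation above (flip the flag only after a successful inject) can be sketched as follows. The real signature of `SessionManager.sendInput` is not shown in this diff, so the `SendInput` type is an assumed stand-in, and the `"evaluator"` mode name, the `JobState` shape, and `setNativeGoal` are hypothetical.

```typescript
// "native" comes from the review; "evaluator" is a hypothetical name for the non-native mode.
type GoalMode = "native" | "evaluator";

interface JobState {
  goal_mode: GoalMode;
  condition?: string;
}

// Assumed stand-in for SessionManager.sendInput: returns true when the text was injected.
type SendInput = (sessionName: string, text: string) => boolean;

function setNativeGoal(
  state: JobState,
  sessionName: string,
  condition: string,
  sendInput: SendInput,
): JobState {
  const injected = sendInput(sessionName, `/goal ${condition}`);
  // If the injection failed, leave goal_mode untouched so the hook never
  // defers to a native /goal that was not actually started.
  if (!injected) return state;
  return { ...state, goal_mode: "native", condition };
}
```

Ordering the inject before the state change means a failed inject leaves the job on its previous completion path rather than in a mode with no active evaluator.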
+ ## 3. Level-of-abstraction fit
+ - Correct, and the point of the change: drive the framework's native feature via instar's own
+ session-input mechanism, rather than reimplementing it or treating "no /goal API" as a blocker.
+
+ ## 4. Blocking authority
+ - [x] In native mode instar **yields** completion authority to native /goal (this reduces
+ instar's authority, which is the safe direction) while retaining its terminal STOP concerns
+ (emergency/duration) by clearing the native goal. No new brittle authority added.
+
+ ## 5. Interactions (the key one: two Stop hooks)
+ - instar's hook + native /goal's hook both fire each turn. Resolved by composition: instar
+ approves (completion) so native /goal's block keeps control; instar only force-stops on
+ emergency/duration, and does so by clearing native /goal first (so they don't fight).
+ - **Emergency-stop** already kills the session via the sentinel path (native /goal dies with it);
+ the hook also clears native /goal when the emergency flag is set. **Duration** clears native
+ /goal then exits.
+ - Falls back cleanly to the instar evaluator (Phase 1) when native /goal is absent.
+
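The composition described in this section can be modelled with a small sketch. It is a simplified model and not the shipped hook: the `HookInput` fields, `instarNativeBranch`, and `composeStopVerdicts` are hypothetical names, and the "any block wins" rule simply restates the composition behaviour this review relies on.

```typescript
type Verdict = "approve" | "block";

// Hypothetical inputs for the hook's native branch.
interface HookInput {
  emergencyStop: boolean;
  startedAtMs: number;
  maxDurationMs: number;
  nowMs: number;
}

interface HookResult {
  verdict: Verdict;
  inject: string[]; // slash commands to type into the session before exiting
}

// In native mode instar always approves; on emergency or expiry it also clears
// native /goal first, so the native hook stops blocking and the session can exit.
function instarNativeBranch(input: HookInput): HookResult {
  const expired = input.nowMs - input.startedAtMs >= input.maxDurationMs;
  if (input.emergencyStop || expired) {
    return { verdict: "approve", inject: ["/goal clear"] };
  }
  return { verdict: "approve", inject: [] };
}

// Composition rule as described in the review: a single block wins over approvals.
function composeStopVerdicts(verdicts: Verdict[]): Verdict {
  return verdicts.some((v) => v === "block") ? "block" : "approve";
}
```

Under this model instar's approve never overrides a native block, which is why the only way for instar to end the session is to remove the native goal first.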
+ ## 6. External surfaces
+ - **Session injection:** instar types `/goal <condition>` / `/goal clear` into the agent's own
+ tmux session (send-keys). This is instar's established mechanism (initial-message injection).
+ No new external/credential surface.
+ - **HTTP:** two authenticated routes under the already-claimed `/autonomous` prefix.
+
+ ## 7. Rollback cost
+ - Low. Reverting restores instar's own evaluator everywhere (Phase 1 is still in main). A
+ `goal_mode: native` left in a state file is ignored by an older hook, which falls through to
+ the evaluator/promise path. The migration marker is content-sniffed, so a rollback re-deploys
+ cleanly.
+
+ ## 8. Test evidence
+ - Hook tests: native defers (approve/exit, retained) + emergency/duration clear+exit.
+ Integration tests: `set` injects `/goal <cond>` (verified via `sendInput`) and flips
+ `goal_mode`; `clear` injects `/goal clear`; an unknown topic returns 404. `tsc` is clean;
+ 174 affected tests are green.
+
+ ## Deviation from the original spec
+ Spec Phase 2 sketched a `ThreadGoalSlot` provider primitive. This change ships direct
+ slash-command injection through `SessionManager.sendInput` instead, which is simpler and the
+ correct use of an existing instar capability (per maintainer direction: "we already input text
+ into sessions; use that to call /goal"). Same intent, better mechanism; `ThreadGoalSlot` is left
+ unimplemented (not needed).
@@ -0,0 +1,43 @@
+ # Side-Effects Review: HumanAsDetectorLog
+
+ **Change**: Ports Dawn's human-as-detector pattern into Instar. Treats every human-caught
+ coherence break as a first-class signal about which automated layer failed to catch it.
+
+ ## Files
+ - `src/monitoring/HumanAsDetectorLog.ts` (new): singleton, deterministic no-LLM classifier,
+ append-only `.instar/metrics/human-as-detector.jsonl`, `summarizeByLayer()` heat map, plus
+ the `observeInboundMessage()` gating helper (inbound-human-only).
+ - `tests/unit/HumanAsDetectorLog.test.ts` (new): 19 unit tests (classifier/observe/heat-map +
+ gating-helper wiring integrity).
+ - `tests/integration/human-as-detector-routes.test.ts` (new): 3 tests via real `createRoutes`.
+ - `tests/e2e/human-as-detector-lifecycle.test.ts` (new): 2 tests, live HTTP boot + disk.
+ - `src/server/routes.ts`: adds read-only `GET /human-as-detector/summary` (singleton-backed).
+ - `src/commands/server.ts`: `configure()` at startup; `observeInboundMessage()` chained onto
+ `telegram.onMessageLogged` (chains prior callbacks; only inbound human messages).
+
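A per-layer heat map over an append-only JSONL log can be computed roughly as below. This is a sketch under assumptions: the actual record schema and the body of `summarizeByLayer()` are not shown in this diff, so the `layer` field name and the `countByLayer` helper are hypothetical.

```typescript
// Counts records per layer from JSONL text (one JSON object per line).
// Assumes each record carries a string `layer` field; that field name is hypothetical.
function countByLayer(jsonl: string): Record<string, number> {
  // Null-prototype object so layer names like "constructor" cannot collide with builtins.
  const counts: Record<string, number> = Object.create(null);
  for (const line of jsonl.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue; // skip blank lines
    let parsed: unknown;
    try {
      parsed = JSON.parse(trimmed);
    } catch {
      continue; // tolerate a malformed or partially written line
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const layer = (parsed as { layer?: unknown }).layer;
    if (typeof layer !== "string") continue; // ignore records without a layer
    counts[layer] = (counts[layer] ?? 0) + 1;
  }
  return counts;
}
```

Skipping unparseable lines keeps the summary readable even if an append was interrupted mid-write, which suits a best-effort log.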
+ ## Side effects
+ - **New disk write**: appends to `.instar/metrics/human-as-detector.jsonl` only when an
+ inbound human message matches a correction signal. Best-effort; wrapped in try/catch; never
+ throws into message handling.
+ - **Console**: one `[HUMAN-AS-DETECTOR]` warn line per detected signal (mirrors
+ DegradationReporter's loud-not-silent convention).
+ - **No network, no LLM, no external calls.** Classifier is pure regex over a conservative set
+ of patterns.
+ - **No behavior change to message handling**: the hook only observes; prior `onMessageLogged`
+ callbacks (TopicMemory dual-write, PresenceProxy, keep-watching) are preserved via chaining.
+ - **New endpoint** is read-only and singleton-backed (always available; no 503 path).
+
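The chaining and never-throw behaviour listed above can be sketched as follows. The real wiring in `src/commands/server.ts` is not shown in this diff, so the `LoggedMessage` shape, its `fromHuman` flag, and `chainObserver` are hypothetical.

```typescript
// Hypothetical message shape; the real onMessageLogged payload is not shown here.
interface LoggedMessage {
  fromHuman: boolean;
  text: string;
}

type MessageCallback = (msg: LoggedMessage) => void;

// Wraps any previously registered callback and adds an observe-only step after it.
function chainObserver(prior: MessageCallback | undefined, observe: MessageCallback): MessageCallback {
  return (msg) => {
    if (prior) prior(msg); // prior callbacks run exactly as before
    if (!msg.fromHuman) return; // gate: inbound human messages only
    try {
      observe(msg);
    } catch {
      // Best-effort: an observer failure must never reach message handling.
    }
  };
}
```

Only the new observer is wrapped in try/catch, so the prior callbacks keep their original error behaviour.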
+ ## Risk
+ - Low. Additive, isolated module. Worst case on a logic bug: a spurious JSONL line or a missed
+ signal. Neither affects message delivery (observe never throws into the caller).
+ - False-positive risk on the classifier is bounded by the `totalWeight >= 2` threshold (lone
+ weak signals like "actually," are ignored).
+ - Rollback cost: trivial. Drop the module, the endpoint, and the ~6 wiring lines; no schema,
+ no migration, no config default.
+
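The `totalWeight >= 2` threshold can be illustrated with a small weighted-regex classifier. Only the threshold and the weak "actually," example come from this review; the other patterns, their weights, and the names `SIGNALS`, `totalWeight`, and `isCorrection` are hypothetical.

```typescript
interface Signal {
  pattern: RegExp;
  weight: number;
}

// Hypothetical pattern set: strong signals weigh 2, weak ones weigh 1.
const SIGNALS: Signal[] = [
  { pattern: /\bthat'?s (wrong|incorrect|not right)\b/i, weight: 2 },
  { pattern: /\byou (forgot|missed|skipped)\b/i, weight: 2 },
  { pattern: /\bactually,/i, weight: 1 },
  { pattern: /\bagain\b/i, weight: 1 },
];

// Sum the weights of every pattern that matches the message.
function totalWeight(text: string): number {
  let total = 0;
  for (const signal of SIGNALS) {
    if (signal.pattern.test(text)) total += signal.weight;
  }
  return total;
}

// A message counts as a correction only at or above the threshold of 2,
// so a single weak signal on its own is ignored.
function isCorrection(text: string): boolean {
  return totalWeight(text) >= 2;
}
```

With these example weights, a lone "actually," scores 1 and is ignored, while one strong signal or two weak signals together reach the threshold.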
+ ## Signal vs authority
+ - Pure signal. The log only *records* and *summarizes*; it has no blocking authority and gates
+ nothing. Consumers (a human reading the heat map, or future evolution tooling) decide.
+
+ ## Verification
+ - `npx tsc --noEmit`: clean (no new errors).
+ - `npx vitest run` on the three new test files: 24/24 pass across unit + integration + e2e.