instar 1.3.0 → 1.3.2

@@ -0,0 +1,108 @@
1
+ # Side-Effects Review — Codex Intelligence-Provider Clean-Call Fix
2
+
3
+ **Version / slug:** `fix-codex-intel-clean-call`
4
+ **Date:** 2026-05-26
5
+ **Author:** Echo
6
+ **Spec:** `docs/specs/CODEX-INTELLIGENCE-PROVIDER-CLEAN-CALL-SPEC.md` (converged + approved)
7
+
8
+ ## Summary of the change
9
+
10
+ `CodexCliIntelligenceProvider.evaluate()` ran `codex exec --cd <agent project dir>` for
11
+ every internal LLM "judgment" call (message classification, terminal-output analysis,
12
+ arc extraction, usher, coherence, etc.). Running in the project dir made Codex load the
13
+ full ~26 KB `AGENTS.md` identity AND fire the project's `.codex/hooks.json`
14
+ (session_start / user_prompt_submit / stop) on **every** call — ~1,550 such calls/day,
15
+ causing notification spam (session_start firing constantly) and spawn-storm delivery
16
+ failures (12 heavyweight spawns/minute saturating the machine).
17
+
18
+ The fix runs these calls in an empty, owner-only scratch dir instead — the Codex analog
19
+ of `ClaudeCliIntelligenceProvider`'s `--setting-sources user`. No identity, no project
20
+ hooks.
21
+
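As a rough illustration of the invocation change described above, the argument list could be assembled as follows. This is a minimal sketch, not the provider's actual code: `buildJudgmentArgs` is a hypothetical name, and passing the prompt as the final positional argument is an assumption. Only `exec`, `--cd`, and `-c project_doc_max_bytes=0` are taken from this review.

```typescript
// Hypothetical helper: builds the argv for one stateless judgment call.
// Only `exec`, `--cd`, and `-c project_doc_max_bytes=0` come from the review.
function buildJudgmentArgs(scratchDir: string, prompt: string): string[] {
  return [
    "exec",
    "--cd", scratchDir,              // empty scratch dir, never the agent project dir
    "-c", "project_doc_max_bytes=0", // skip any AGENTS.md found on the cwd walk-up
    prompt,                          // assumption: prompt is the last positional argument
  ];
}
```

Because the scratch dir contains no `.codex/hooks.json` and no `AGENTS.md`, a call built this way should load neither project hooks nor identity.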
22
+ **Files changed (source):**
23
+ - `src/core/CodexCliIntelligenceProvider.ts` — `evaluate()` now uses an `mkdtempSync`
24
+ scratch dir for `--cd` (not the project dir) + `-c project_doc_max_bytes=0`; added the
25
+ `resolveIntelligenceScratchDir()` helper; removed the now-dead `workingDirectory` field
26
+ (kept on the options type for API compat).
27
+
28
+ **Files changed (tests):**
29
+ - `tests/unit/CodexCliIntelligenceProvider.test.ts` — updated the `--cd` assertion (it
30
+ previously asserted the buggy project-dir behavior) + added 7 cases covering the
31
+ scratch-dir contract, 0700 perms, unguessable name, and tmp-reaper recovery (12 total).
32
+
33
+ **Files changed (spec / report / release notes):**
34
+ - `docs/specs/CODEX-INTELLIGENCE-PROVIDER-CLEAN-CALL-SPEC.md` (+ `.eli16.md`)
35
+ - `docs/specs/reports/codex-intelligence-provider-clean-call-convergence.md`
36
+ - `upgrades/NEXT.md`
37
+
38
+ ## Decision-point inventory
39
+
40
+ - **Scratch dir, not the project dir** — the core fix. Judgment calls are cwd-independent
41
+ (per the existing code comment), so an empty cwd is correct.
42
+ - **`mkdtempSync` (random suffix, 0700), not a fixed name** — convergence security finding:
43
+ a fixed `/tmp` name on Linux is plantable (`.codex/hooks.json` squatting; not gated by
44
+ `project_doc_max_bytes`). The unguessable, owner-only dir closes that vector.
45
+ - **Re-verify-before-use** — recreate the dir if a tmp-reaper deleted it during a
46
+ long-lived process.
47
+ - **`-c project_doc_max_bytes=0`** — belt-and-suspenders for an `AGENTS.md` on the cwd
48
+ walk-up; real key, already used in `contextScopeControl.ts`.
49
+ - **Drop `workingDirectory` as exec cwd** — verified only `route.ts` passes it, and only
50
+ for its own PreferenceStore DB path, never the codex cwd.
51
+
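The scratch-dir decisions above (random suffix, owner-only, re-verify before use) can be sketched roughly as below. The function names, the prefix, and the module-level cache are assumptions made for illustration; the real `resolveIntelligenceScratchDir()` may differ. The 0700 mode relies on `mkdtempSync` behavior on POSIX systems.

```typescript
import { existsSync, mkdtempSync, rmSync, statSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Cached path of the scratch dir for this process (hypothetical module state).
let cachedScratchDir: string | undefined;

// Returns an empty scratch dir, creating it on first use and recreating it
// if something (for example a tmp-reaper) deleted it in the meantime.
function resolveScratchDir(prefix = "intel-scratch-"): string {
  if (cachedScratchDir === undefined || !existsSync(cachedScratchDir)) {
    // mkdtempSync appends random characters, so the name is not plantable in advance.
    cachedScratchDir = mkdtempSync(join(tmpdir(), prefix));
  }
  return cachedScratchDir;
}

// True when only the owner has any permission bits set (POSIX only).
function isOwnerOnly(dir: string): boolean {
  return (statSync(dir).mode & 0o777) === 0o700;
}

// Demo-only helper: simulates a tmp-reaper removing the cached dir.
function simulateReaper(): void {
  if (cachedScratchDir !== undefined) {
    rmSync(cachedScratchDir, { recursive: true, force: true });
  }
}
```

The `existsSync` re-check costs one stat per call, which is small next to a process spawn, and it keeps a long-lived process from handing Codex a cwd that no longer exists.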
52
+ ## Over-block / under-block analysis
53
+
54
+ - **Over-block:** none. The provider gates nothing; it only changes the cwd of a spawn.
55
+ Judgment calls that worked before continue to work (the fake-codex unit tests confirm
56
+ the full arg contract).
57
+ - **Under-block:** the *intended* behavioral subtraction is "stop loading identity + firing
58
+ hooks for judgment calls." There is no path where a judgment call legitimately needed the
59
+ identity or hooks — they are stateless classifications/extractions. If a future caller
60
+ did need project context, it must pass it in the prompt (as all current callers do), not
61
+ rely on cwd.
62
+
63
+ ## Level-of-abstraction fit
64
+
65
+ The fix lives in the single provider that owns the `codex exec` invocation — the same layer
66
+ where the Claude sibling already solves the identical problem with `--setting-sources user`.
67
+ No higher-level orchestration or config knob is introduced; the concern is local to the
68
+ spawn, so the fix is local to the spawn. Correct altitude.
69
+
70
+ ## Signal-vs-authority compliance
71
+
72
+ N/A in the gate sense — this change neither detects nor blocks anything. It is a pure
73
+ invocation-hygiene fix. It does not touch any sentinel/gate authority boundary.
74
+
75
+ ## Interactions
76
+
77
+ - **Claude provider:** untouched; asymmetry (flag vs scratch-cwd) is intentional and
78
+ documented — Codex has no single equivalent flag.
79
+ - **Callers (`reflect.ts`, `route.ts`, `server.ts`):** none depend on the codex cwd
80
+ content; verified during integration review. No behavior change for them beyond the
81
+ intended one.
82
+ - **Concurrency:** `mkdtempSync` once + cached + `existsSync` re-check; no race under the
83
+ high call volume (idempotent, read-only dir).
84
+ - **Monitoring layer:** positive interaction — the session_start hook no longer fires on
85
+ judgment spawns, so PresenceProxy/standby stops mistaking them for real sessions
86
+ (the notification-spam root cause).
87
+
88
+ ## Rollback cost
89
+
90
+ Trivial and isolated. Revert the single source file (and its test). No persisted state, no
91
+ schema, no config/hook/template/migration to unwind — the only on-disk footprint is an
92
+ empty 0700 tmp dir that the OS reaps on its own. Reverting restores the prior (buggy but
93
+ functional) behavior with zero data implications.
94
+
95
+ ## Migration parity
96
+
97
+ Code-only change inside the compiled provider. No agent-installed file
98
+ (settings/hooks/config/templates/skills) references the old behavior, so **no
99
+ `PostUpdateMigrator` entry is required** — existing Codex agents receive the fix via the
100
+ normal package update path. Verified by grep during integration review.
101
+
102
+ ## Testing evidence
103
+
104
+ - Unit: 12 tests in `CodexCliIntelligenceProvider.test.ts` pass; sibling env-allowlist (4)
105
+ + factory (10) tests unaffected; clean `tsc` build.
106
+ - Live / bug-fix evidence bar: the before/after rollout reproduction on a real Codex agent
107
+ (identity-loaded before, bare after) is run as the post-merge test-as-self gate and
108
+ recorded before the fix is declared shipped.
@@ -0,0 +1,45 @@
1
+ # Side-Effects Review — Deterministic "open this" (CMT-529)
2
+
3
+ **Version / slug:** `threadline-open-this-deterministic`
4
+ **Date:** 2026-05-26
5
+ **Author:** Echo
6
+ **Second-pass reviewer:** (pending — required; message-routing intercept)
7
+
8
+ ## Summary of the change
9
+
10
+ Makes "open this" / "tie this to <topic>" in the Threadline hub topic a DETERMINISTIC structural intercept instead of agent-interpreted behavior (which failed: the agent rambled instead of binding). New `src/threadline/hubCommands.ts` exports `parseHubCommand` (pure, tightly anchored) and `bindHubConversation` (shared logic extracted from the route; discriminated result; readable, scrubbed topic name; `autoPick`). The intercept lands in `telegram.onTopicMessage` (`wireTelegramRouting`, `src/commands/server.ts`), the convergence point that BOTH inbound paths reach, via a late-bound `getHubDeps()` accessor (the deps are constructed after wiring). `POST /threadline/hub/bind` is refactored to call the same helper (`autoPick=false` → 409 preserved for the API). The `CollaborationSurfacer.load()` legacy migration now stamps `surfacedAt` by index (previously epoch) so ordering works. Decision point: the hub-command intercept (routing, not block/allow).
11
+
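The late-bound accessor and the fail-open behavior described in the summary can be illustrated with a small stand-in. Everything here is a simplified assumption: `HubDeps`, `wireRouting`, and `setHubDeps` are hypothetical names, and the real handler does far more than return a label.

```typescript
// Hypothetical, simplified stand-in for the hub dependencies.
interface HubDeps {
  bind(text: string): string;
}

// Wires a topic-message handler. The accessor is only called at message time,
// so the deps may be constructed after wiring.
function wireRouting(getHubDeps?: () => HubDeps | undefined) {
  return function onTopicMessage(text: string): "bound" | "agent" {
    try {
      const deps = getHubDeps?.();
      if (deps && /^open(?:\s+this)?\s*[.!]?$/i.test(text.trim())) {
        deps.bind(text);
        return "bound"; // structural bind; skip session injection
      }
    } catch (err) {
      // Fail-open: log and let the message reach the agent.
      console.error("hub intercept failed, falling through", err);
    }
    return "agent";
  };
}

let hubDeps: HubDeps | undefined;
const onTopicMessage = wireRouting(() => hubDeps);

// Called later, once the deps exist.
function setHubDeps(deps: HubDeps): void {
  hubDeps = deps;
}
```

Making the accessor optional keeps older call sites working unchanged: without it, every message simply falls through to the agent.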
12
+ ## Decision-point inventory
13
+
14
+ - `telegram.onTopicMessage` hub-command intercept — **add** — for the hub topic + a matched command, bind structurally + return before session injection.
15
+ - `POST /threadline/hub/bind` — **modify** (refactor to shared helper; behavior unchanged for the API).
16
+ - `CollaborationSurfacer.load()` legacy `surfacedAt` — **modify** — index-based ordering.
17
+
18
+ ## 1. Over-block
19
+ No block/allow surface. The intercept only fires when `topicId === hub` AND `parseHubCommand` matches a tightly-anchored command (`/^open(?:\s+this)?\s*[.!]?$/i`, `/^(?:tie|bind)\s+this\s+to\s+.../i`). Ordinary hub chat ("can you open this and explain?") returns null → falls through to the agent. FAIL-OPEN: any intercept error logs + falls through.
20
+
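A minimal sketch of a tightly anchored classifier in the spirit of `parseHubCommand` follows. The `open` pattern is the one quoted above. The tail of the tie/bind pattern is elided in this review, so the capture group used here is an assumption for illustration only, as are the result shape and the trimming of input.

```typescript
type HubCommand =
  | { kind: "open" }
  | { kind: "tie"; target: string };

// Quoted verbatim from the review.
const OPEN_RE = /^open(?:\s+this)?\s*[.!]?$/i;
// Assumption: the review elides the tail of this pattern; `(.+?)\s*$` is illustrative.
const TIE_RE = /^(?:tie|bind)\s+this\s+to\s+(.+?)\s*$/i;

// Pure classifier: returns a command for an exact match, otherwise null.
function parseHubCommand(text: string): HubCommand | null {
  const trimmed = text.trim();
  if (OPEN_RE.test(trimmed)) return { kind: "open" };
  const target = TIE_RE.exec(trimmed)?.[1];
  if (target !== undefined) return { kind: "tie", target };
  return null; // ordinary chat falls through to the agent
}
```

Anchoring both ends with `^` and `$` is what lets "can you open this and explain?" fall through: the command must be the whole message, not a substring of it.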
21
+ ## 2. Under-block
22
+ A creatively-phrased command ("open it please") won't match → falls through to the agent (who still has the #392 CLAUDE.md guidance as backstop). Acceptable; common forms covered.
23
+
24
+ ## 3. Level-of-abstraction fit
25
+ Correct: the intercept sits at `onTopicMessage` alongside the existing `/new`/slash/fix-command interceptions — the single seam both the lifeline-forward and server-polling paths converge on (avoids the dual-path dead-code trap that bit the sentinel + warrants-reply gate). `bindHubConversation` composes existing primitives (ConversationStore.mutate, findOrCreateForumTopic, CommitmentTracker.mutate, surfacer.markBound/noteInHub) — no re-implementation.
26
+
27
+ ## 4. Signal vs authority compliance
28
+ No new blocking authority. The intercept is a deterministic router (match → bind → return), the bind is an authoritative state mutation on explicit operator action, parseHubCommand is a pure classifier. Per `docs/signal-vs-authority.md`: routers/sinks, not gates.
29
+
30
+ ## 5. Interactions
31
+ - **No double-bind/post:** `bindHubConversation` posts to the new topic + one `noteInHub`; the intercept does NOT add a third message (reuses the helper's confirmation), and `return`s so no session injection. One bind per command (markBound makes a re-issued "open this" pick the next unbound).
32
+ - **Order vs other intercepts:** sits after the `/new` and slash intercepts (and, on the forward path, after the sentinel), so emergency-stop still wins. No shadowing.
33
+ - **autoPick split:** intercept (human) autoPick=true → most-recent; API path autoPick=false → 409. Distinct, intentional.
34
+
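The autoPick split and the discriminated result can be sketched as below. The type and function names, the reason strings, and the status assignments other than ambiguity → 409 are assumptions; the review only states that the route maps 400/404/409/500 and that `autoPick=false` preserves the 409.

```typescript
type BindFailure = "invalid" | "not_found" | "ambiguous" | "internal";

type BindResult =
  | { ok: true; threadId: string }
  | { ok: false; reason: BindFailure };

interface Candidate {
  threadId: string;
  surfacedAt: string; // ISO timestamp; compares correctly as a string
}

// Hypothetical mapping used by the HTTP route only; the helper never sees req/res.
const STATUS_FOR_FAILURE: Record<BindFailure, number> = {
  invalid: 400,
  not_found: 404,
  ambiguous: 409,
  internal: 500,
};

function statusFor(result: BindResult): number {
  return result.ok ? 200 : STATUS_FOR_FAILURE[result.reason];
}

// Picks the conversation to bind. With autoPick (human intercept) the most
// recently surfaced candidate wins; without it (API) ambiguity is an error.
function pickConversation(candidates: Candidate[], autoPick: boolean): BindResult {
  if (candidates.length === 0) return { ok: false, reason: "not_found" };
  if (candidates.length > 1 && !autoPick) return { ok: false, reason: "ambiguous" };
  const mostRecent = candidates.reduce((best, c) => (c.surfacedAt > best.surfacedAt ? c : best));
  return { ok: true, threadId: mostRecent.threadId };
}
```

Returning a plain result value rather than writing to the response lets the Telegram intercept and the HTTP route share one helper while each decides how to report the outcome.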
35
+ ## 6. External surfaces
36
+ New `getHubDeps` param on `wireTelegramRouting` (internal). Hub stays silent. Topic name from gist is **capped ~40 chars, charset-scrubbed, and falls back to `<peer> · <threadId8>` on empty/credential-like gist** (a cold first message could contain a secret, so it must never be splashed into a chat-list-visible title). Template + `migrateClaudeMd` note that "open this" is now structural (Agent Awareness + Migration Parity).
37
+
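The naming rules above could look roughly like this. It is a sketch under stated assumptions: the review does not say which characters survive scrubbing or how a credential-like gist is detected, so the allow-list and the `CREDENTIAL_LIKE` heuristic below are illustrative guesses, and the signature is hypothetical.

```typescript
const MAX_TOPIC_NAME = 40;

// Assumed heuristic: secret-ish keywords or a long unbroken token.
const CREDENTIAL_LIKE = /(?:api[_-]?key|secret|token|password|bearer)\b|[A-Za-z0-9_-]{24,}/i;

// Derives a chat-list-visible topic title from a message gist.
function topicNameFor(gist: string, peer: string, threadId: string): string {
  const fallback = `${peer} · ${threadId.slice(0, 8)}`;
  if (CREDENTIAL_LIKE.test(gist)) return fallback; // never surface a possible secret
  const scrubbed = gist
    .replace(/[^A-Za-z0-9 .,'-]/g, " ") // assumed allow-list of safe characters
    .replace(/\s+/g, " ")
    .trim();
  if (scrubbed.length === 0) return fallback;
  return scrubbed.slice(0, MAX_TOPIC_NAME).trim();
}
```

The credential check runs on the raw gist before scrubbing, so a secret cannot slip through by being broken up during character replacement.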
38
+ ## 7. Rollback cost
39
+ Localized: new hubCommands module, one route refactor, one intercept block + a param threaded to two call sites, one load() timestamp line, the naming helper. Clean `git revert`. The legacy `surfacedAt` change is read-time + idempotent (no data migration). `getHubDeps` is an optional param (older callers unaffected).
40
+
41
+ ## Second-pass review
42
+
43
+ **Concur with the review** (independent reviewer, 2026-05-26). All seven checks verified against the diff: (1) `getHubDeps` late-bind is TDZ/null-safe — the closure only runs at message-time, long after the `const` deps initialize; the `&& telegram` guard narrows `TelegramAdapter|undefined`; `commitmentTracker` is unconditionally constructed + null-guarded internally; tsc clean. (2) Intercept gates on `getHubTopicId()` match, falls through otherwise, whole block try/catch fail-open. (3) `parseHubCommand` anchoring verified empirically across 15 inputs — "open this"/"open"/"Open This." fire; "can you open this and explain?" / "open this conversation please" fall through. (4) `bindHubConversation` returns a discriminated result, never touches res/req; route maps 400/404/409/500. (5) Early `return` skips no essential side-effect — consistent with the adjacent `/`/`/new` intercepts (no message logging in this handler). (6) `topicNameFor` caps 40 / scrubs / credential-fallback. (7) The CMT-529 migrator re-patch is idempotent + correctly scoped (matches only OLD CMT-519 agents; non-greedy anchored regex; 2nd run no-op).
44
+
45
+ Non-blocking notes (no action needed): an 18-digit "tie this to <huge number>" would be treated as a name not an id (unrealistic, harmless); legacy `surfacedAt=new Date(index+1)` ISO strings stay lexicographically monotonic well past realistic array sizes.
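
The index-based `surfacedAt` stamping noted above can be sketched as follows. The record shape and function name are assumptions; only the `new Date(index + 1)` stamping and the read-time, idempotent nature of the migration come from the review.

```typescript
interface LegacyRecord {
  threadId: string;
  surfacedAt?: string;
}

// Stamps records that lack `surfacedAt` with a timestamp derived from their
// array index, so lexicographic order of the ISO strings matches array order.
function stampLegacySurfacedAt(records: LegacyRecord[]): Required<LegacyRecord>[] {
  return records.map((record, index) => ({
    threadId: record.threadId,
    // index + 1 milliseconds after the epoch, e.g. index 0 → "1970-01-01T00:00:00.001Z"
    surfacedAt: record.surfacedAt ?? new Date(index + 1).toISOString(),
  }));
}
```

Records that already carry a `surfacedAt` are left alone, which is why running the stamp a second time changes nothing.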