npm - instar - Versions diffs - 1.2.65 → 1.2.67 - Mend

instar 1.2.65 → 1.2.67

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (27)

package/dist/commands/init.d.ts.map +1 -1
package/dist/commands/init.js +21 -0
package/dist/commands/init.js.map +1 -1
package/dist/core/PostUpdateMigrator.d.ts.map +1 -1
package/dist/core/PostUpdateMigrator.js +38 -1
package/dist/core/PostUpdateMigrator.js.map +1 -1
package/dist/core/codexCapabilities.d.ts +31 -0
package/dist/core/codexCapabilities.d.ts.map +1 -0
package/dist/core/codexCapabilities.js +56 -0
package/dist/core/codexCapabilities.js.map +1 -0
package/dist/core/frameworkSessionLaunch.d.ts.map +1 -1
package/dist/core/frameworkSessionLaunch.js +13 -0
package/dist/core/frameworkSessionLaunch.js.map +1 -1
package/dist/core/installCodexHooks.d.ts +52 -0
package/dist/core/installCodexHooks.d.ts.map +1 -0
package/dist/core/installCodexHooks.js +113 -0
package/dist/core/installCodexHooks.js.map +1 -0
package/package.json +1 -1
package/src/data/builtin-manifest.json +17 -17
package/upgrades/1.2.66.md +57 -0
package/upgrades/1.2.67.md +57 -0
package/upgrades/side-effects/codex-enforcement-hooks-p1.md +32 -0
package/upgrades/side-effects/codex-enforcement-hooks-p1b.md +29 -0
package/upgrades/side-effects/codex-enforcement-hooks-p2.md +36 -0
package/upgrades/side-effects/codex-enforcement-hooks-p3.md +25 -0
package/upgrades/side-effects/codex-enforcement-hooks-p5c.md +38 -0
package/upgrades/side-effects/codex-hook-trust-bypass.md +37 -0

package/upgrades/side-effects/codex-enforcement-hooks-p1.md ADDED

@@ -0,0 +1,32 @@
+# Side-Effects Review: Codex enforcement hooks — P1 (installCodexHooks writer + tests)
+## Change
+New module `src/core/installCodexHooks.ts` + `tests/unit/installCodexHooks.test.ts` (6 tests). The module writes/merges instar's safety-gate registrations into a Codex agent's per-project `<projectDir>/.codex/hooks.json`, mapping the existing gate scripts (`external-operation-gate.js`, `grounding-before-messaging.sh`, `response-review.js`, `deferral-detector.js`, `session-start.sh`, `telegram-topic-context.sh`) to Codex's verified hook events (PreToolUse, PermissionRequest, Stop, SessionStart, UserPromptSubmit), using the verified Codex hooks.json schema.
+## Scope of effect (this commit)
+- **Capability-only, NOT yet wired.** This commit adds the writer + tests; it is NOT yet invoked from the init/refresh path (that is P1b, with a wiring-integrity test). So there is **no runtime behavior change** until the wiring lands — the function is inert until called.
+- When wired, it writes a single file: `<projectDir>/.codex/hooks.json`. Nothing else.
+## Scoping (correctness-critical)
+- Writes the **per-project** `.codex/hooks.json`, never the global `~/.codex/`. The global root is shared with the operator's personal desktop Codex and every other Codex project — global hooks would intercept the operator's personal sessions. Per-project scoping confines the gates to this agent's project dir. Unit-tested (asserts the path is not under `~/.codex`).
+## Merge-safety
+- Instar-owned entries are identified by command path containing `.instar/hooks/instar/`. On re-run, instar groups are replaced; any user-added Codex hooks are preserved verbatim. Idempotent (re-run yields identical file). Both behaviors unit-tested.
+## Signal vs Authority
+- The writer carries no runtime authority. The hooks it registers are low-context triggers that route to the server-side authority gates (`/operations/evaluate`, `/review/evaluate`). The writer just emits config; nothing in it decides allow/deny.
+## Over/under-block, abstraction
+- N/A — config writer, not a gate. The registered gates are the existing, unchanged authorities. No new decision boundary introduced here.
+## Migration parity
+- Not in this commit. `migrateCodexHooks()` (P3) will backfill existing Codex agents. Tracked in the spec phase plan; not deferred-and-forgotten.
+## Rollback
+- Trivial: delete the two files. No deployed effect (unwired). Once wired, removing the instar entries from `.codex/hooks.json` is a clean revert with no data migration.
+## Tests
+- 6 unit tests: per-project location (not global), all five events with the verified schema, absolute cwd-independent script paths, idempotency, user-hook preservation + instar replace, pure builder. All passing; tsc clean.
+## Publish
+- Feature branch `echo/codex-enforcement-hooks`. Not shipped; no separate publish.
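
The merge-safety rule in the P1 review above (instar-owned groups replaced, user hooks preserved, re-run idempotent) can be sketched roughly as follows. This is not the code from `installCodexHooks.ts`, which this diff view does not show; the `HookGroup`/`HooksFile` shapes and the `mergeHooks` name are assumptions made for illustration. Only the ownership marker `.instar/hooks/instar/` comes from the review itself.

```typescript
// Hypothetical shapes: the real hooks.json schema is not shown in this diff.
interface HookCommand { type: string; command: string }
interface HookGroup { matcher?: string; hooks: HookCommand[] }
interface HooksFile { hooks: Record<string, HookGroup[]> }

// Ownership marker named in the review: instar entries live under this path.
const INSTAR_MARKER = ".instar/hooks/instar/";

// A group counts as instar-owned if any of its commands points into the marker path.
function isInstarOwned(group: HookGroup): boolean {
  return group.hooks.some((h) => h.command.indexOf(INSTAR_MARKER) !== -1);
}

// Drop previously written instar groups, keep user groups untouched,
// then append the fresh instar groups for each event.
function mergeHooks(existing: HooksFile, instar: Record<string, HookGroup[]>): HooksFile {
  const events = Object.keys(existing.hooks)
    .concat(Object.keys(instar))
    .filter((event, index, all) => all.indexOf(event) === index);
  const merged: Record<string, HookGroup[]> = {};
  for (const event of events) {
    const userGroups = (existing.hooks[event] ?? []).filter((g) => !isInstarOwned(g));
    merged[event] = userGroups.concat(instar[event] ?? []);
  }
  return { hooks: merged };
}
```

Because the old instar groups are filtered out before the new ones are appended, running the merge twice with the same input produces the same file, which is the idempotency property the unit tests are described as checking.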

package/upgrades/side-effects/codex-enforcement-hooks-p1b.md ADDED

@@ -0,0 +1,29 @@
+# Side-Effects Review: Codex enforcement hooks — P1b (wire installCodexHooks into init/refresh)
+## Change
+`src/commands/init.ts`: `refreshHooksAndSettings()` now calls `installCodexHooks(projectDir)` gated on `enabledFrameworks.includes('codex-cli')`, mirroring the existing `claudeEnabled → installClaudeSettings` block. Plus a wiring-integrity test (`tests/unit/codex-hooks-wiring.test.ts`).
+`refreshHooksAndSettings` is the single path that both `instar init` (line ~1097) and the update path invoke — so this one call site covers BOTH new and existing codex agents.
+## Runtime behavior change
+- Codex-cli agents now get `<projectDir>/.codex/hooks.json` written on init AND on every update/refresh. Claude-only agents are unaffected (gated). Both verified by the wiring-integrity test (codex → file created with instar gates; claude-only → no file; both → file created).
+## Sequencing risk (IMPORTANT — captured, not deferred-and-forgotten)
+- The wiring registers the existing gate scripts (`external-operation-gate.js`, `response-review.js`, etc.) into Codex's hook events. Those scripts currently parse **Claude's** hook stdin payload; the **Codex-payload shim is P2**. So on a *live* Codex session, the scripts could misbehave (parse errors → potentially an erroneous exit-2 block) until P2 lands.
+- **Why this is safe now:** this is a feature branch, NOT deployed. codey runs the released v1.2.53, which does not carry this wiring. No live codex agent has these hooks yet.
+- **Hard sequencing constraint for the build:** **P2 (gate-script Codex-payload shim) MUST land before the P6 deploy.** The phase plan already orders P2 → P6; this review makes the constraint explicit. Do not deploy P1 wiring without P2.
+## Signal vs Authority
+- Unchanged from P1: the hooks route to the server-side authority gates; the wiring just ensures they're registered for codex agents. No new authority.
+## Migration parity
+- `instar init` + update both flow through `refreshHooksAndSettings`, so existing codex agents get the wiring on their next update. (A dedicated `migrateCodexHooks` / always-overwrite-instar-owned hardening is P3.)
+## Rollback
+- Remove the `codexEnabled` block in `refreshHooksAndSettings` + the import. No deployed effect (branch-only).
+## Tests
+- Wiring-integrity: 3 tests (codex → wired, claude-only → not wired, both → wired). Plus the P1a unit suite (6). tsc + lint clean.
+## Publish
+- Feature branch. Not shipped.

package/upgrades/side-effects/codex-enforcement-hooks-p2.md ADDED

@@ -0,0 +1,36 @@
+# Side-Effects Review: Codex enforcement hooks — P2 (shell-gate works under Codex)
+## Change
+Closes the gap from §4.2d: as wired in P1, Codex's native `shell`/`exec`/`apply_patch` tools would have passed UNGATED.
+1. **stdin shim** for the two arg-reading safety scripts so they read `tool_input.command` from Codex's stdin JSON when no positional arg is present (Claude's `$1` path unchanged):
+   - `dangerous-command-guard.sh` — both source copies kept consistent: `PostUpdateMigrator.getDangerousCommandGuard()` (always-overwrite/migration canonical) AND the inline duplicate in `init.ts`.
+   - `grounding-before-messaging.sh` — `PostUpdateMigrator.getGroundingBeforeMessaging()` (canonical, used by migration + init).
+2. **Mapping fix:** added `dangerous-command-guard.sh` to `installCodexHooks` `buildInstarCodexHookGroups()` PreToolUse group (coupled with the shim — mapping without shim would be the false-install trap).
+## Why
+`external-operation-gate.js` only gates `mcp__*` tools (exits 0 otherwise). Codex's destructive class is native `shell`/`exec`/`apply_patch`, which Claude gates via `dangerous-command-guard` on the Bash matcher. P1 omitted that gate from the Codex mapping, and the script couldn't read Codex's input anyway. Both fixed.
+## Over/under-block
+- The blocked catastrophic patterns are identical to Claude's (`rm -rf /`, `mkfs.`, `dd if=`, fork-bomb, etc.) — no Codex-specific over-block. Empty/garbage stdin → INPUT empty → no match → pass (no false-block); tested.
+## Signal vs Authority
+- `dangerous-command-guard` is a deterministic low-context guard on catastrophic patterns + a config safety-level gate; nuanced authority remains the server-side gate. Adding it to Codex mirrors Claude's posture — no new authority.
+## Near-silence
+- Blocks write the reason to **stderr** (the agent sees it), never a user message. No notification spam.
+## Migration parity
+- The scripts are always-overwritten on migration (getDangerousCommandGuard / getGroundingBeforeMessaging), so existing Codex agents get the shim on update; the mapping runs via refreshHooksAndSettings (P1b) on init + update.
+## Rollback
+- Revert the 3 shim insertions + the one mapping-line addition. No data migration.
+## Tests (real, no mocks)
+- Unit: 9 (incl. new assertion that dangerous-command-guard is in the Codex PreToolUse mapping).
+- **Integration (the proof): 5** — generates the REAL script via refreshHooksAndSettings, then: BLOCKS `rm -rf /` via Codex stdin/no-arg (exit 2), PASSES benign via stdin, still BLOCKS via Claude arg path (regression), no false-block on garbage stdin.
+## Sequencing
+- Satisfies the P2 hard-constraint (shell gating works under Codex) before any P6 deploy. Remaining: PermissionRequest exit-2 confirmation (P4), the live codey E2E (P5).
+## Publish
+- Feature branch. Not shipped.
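
The P2 stdin shim described above lives in shell scripts that this diff view does not show. As a rough illustration of the same decision logic in TypeScript: the positional argument wins when present, otherwise `tool_input.command` is read from the stdin JSON, and empty or garbage stdin yields an empty string that matches nothing. The function names and the three-pattern list are assumptions; the patterns are an illustrative subset of those the review mentions, not the shipped guard's list.

```typescript
// Illustrative subset only; the shipped guard's full pattern list is not shown in this diff.
const CATASTROPHIC: RegExp[] = [
  /rm\s+-rf\s+\/(?![A-Za-z0-9._-])/, // "rm -rf /" but not "rm -rf /tmp/x"
  /\bmkfs\./,
  /\bdd\s+if=/,
];

// Resolve the command to inspect: argument first (Claude path), stdin JSON second (Codex path).
function resolveCommand(arg: string | undefined, stdin: string): string {
  if (arg) return arg;
  try {
    const payload = JSON.parse(stdin);
    const command = payload?.tool_input?.command;
    return typeof command === "string" ? command : "";
  } catch {
    return ""; // empty or unparseable stdin: nothing to match, so the guard passes
  }
}

// Exit code convention from the review: 2 blocks, 0 passes.
function guardExitCode(arg: string | undefined, stdin: string): number {
  const command = resolveCommand(arg, stdin);
  return CATASTROPHIC.some((pattern) => pattern.test(command)) ? 2 : 0;
}
```

Treating unparseable input as an empty command keeps the guard from false-blocking, at the cost of relying on the payload shape being right, which is the gap the later P5c review addresses.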

package/upgrades/side-effects/codex-enforcement-hooks-p3.md ADDED

@@ -0,0 +1,25 @@
+# Side-Effects Review: Codex enforcement hooks — P3 (migration parity)
+## Change
+`PostUpdateMigrator.migrateHooks()` now calls `installCodexHooks(this.config.projectDir)` gated on `getEnabledFrameworks().includes('codex-cli')`, writing the per-project `.codex/hooks.json` for existing Codex agents on update. + import.
+## Why (Migration Parity Standard — non-negotiable)
+`installCodexHooks` ran ONLY from init's `refreshHooksAndSettings` (verified: that function's sole caller is `init.ts`). Existing agents update via `PostUpdateMigrator`, which wrote the gate SCRIPTS (with the P2 shim) but NOT the `.codex/hooks.json` registration. So without this, an existing Codex agent would get the updated guard scripts yet never the registration that makes Codex fire them — "works for new agents only" = broken. This closes it.
+## Scope
+- On update, codex-cli agents get `.codex/hooks.json` written/refreshed (idempotent; preserves user-added Codex hooks via the command-path ownership check). Claude-only agents unaffected (gated). The referenced gate scripts are written earlier in the same `migrateHooks` pass.
+## Idempotency
+- Tested: repeated migration yields exactly one instar PreToolUse group (no accumulation).
+## Signal vs Authority / Over-block
+- Unchanged from P1/P2 — this only ensures the registration reaches existing agents. No new authority, no new block patterns.
+## Rollback
+- Revert the `migrateHooks` codex block + the import. No data migration.
+## Tests
+- 3 migration tests: codex-cli agent → `.codex/hooks.json` written (with dangerous-command-guard in PreToolUse); claude-only → not written; idempotent across repeated migrations. Full P1–P3 sweep: 17 green. tsc + lint clean.
+## Publish
+- Feature branch. Not shipped.

package/upgrades/side-effects/codex-enforcement-hooks-p5c.md ADDED

@@ -0,0 +1,38 @@
+# Side-Effects Review: Codex enforcement hooks — P5c (the guard actually fires)
+## Change
+Two source fixes that make the Codex PreToolUse gate actually fire on the real Codex engine (previously it was registered but silently never invoked):
+1. **`src/core/installCodexHooks.ts`** — PreToolUse + PermissionRequest matcher changed `'*'` → `'.*'`. Codex treats the matcher as a regex against the tool name; a bare `*` is an invalid quantifier (no preceding atom) that matches nothing, so the gate never fired. `.*` matches all tool calls.
+2. **`src/core/PostUpdateMigrator.ts`** (dangerous-command-guard + grounding-before-messaging generators) and **`src/commands/init.ts`** (inline dangerous-command-guard) — the stdin shim now reads `tool_input.command OR tool_input.cmd`. Codex's shell tool is `exec_command` and delivers the command in `tool_input.cmd`; Claude uses `tool_input.command`. The prior shim read only `command`, so even when fired against Codex it saw an empty string.
+## Why
+Live verification (host codex-cli v0.133.0, interactive, hooks trusted) showed SessionStart + UserPromptSubmit hooks fired but the PreToolUse dangerous-command-guard did NOT — even trusted. Diagnosing from the Codex session rollout log revealed the matcher was an invalid regex AND the command field name differed. The earlier P5b conclusion ("root cause = hook-trust model") was a red herring for the non-firing symptom: trust gates whether hooks run at all, but with trust granted the guard still failed to fire for these two reasons.
+## Scope / blast radius
+- Codex agents only in effect: `.codex/hooks.json` is written solely for `enabledFrameworks.includes('codex-cli')`. Claude-only agents are unaffected (the guard scripts gained a stdin fallback that is inert when `$1` is supplied — Claude's existing arg path is unchanged and still tested).
+- With matcher `.*`, all three PreToolUse hooks now fire on every Codex tool call. Verified non-harmful: `external-operation-gate.js` exits 0 for any non-`mcp__*` tool (so `exec_command` passes straight through); `grounding-before-messaging.sh` only blocks when the command matches its messaging regex; `dangerous-command-guard.sh` only blocks catastrophic/risky patterns. No false-block surface introduced.
+- The `cmd`-field fallback is additive: `command or cmd or ''`. Claude payloads (`command`) are read first and unchanged.
+## Signal vs Authority / over-block
+- Unchanged authority model: hooks remain low-context triggers that exit-2 on deterministic catastrophic patterns or route to the server-side gate. No new block patterns added; only the delivery (matcher + field) was corrected so existing patterns reach the guard.
+## Migration parity
+- Both the init generator (`init.ts`) and the update generator (`PostUpdateMigrator.ts`) carry the shim fix, so new AND existing Codex agents get the working guard. The matcher fix lives in `installCodexHooks.ts`, called from both `refreshHooksAndSettings` (init) and `migrateHooks` (update, P3).
+## Live proof (evidence bar)
+Regenerated codey's hooks from freshly-built source via the real `refreshHooksAndSettings` path (no hand-patch, no debug instrumentation), launched real interactive Codex 0.133, instructed it to run `echo 'rm -rf /'` → Codex displayed `• PreToolUse hook (blocked) — BLOCKED: Catastrophic command detected: rm -rf /` and did not execute it. First confirmed firing of the Codex enforcement guard in the real engine. Before the fix the identical setup ran the command unblocked.
+## Tests
+- `tests/integration/codex-dangerous-command-block.test.ts` rewritten to the verified Codex shape (`tool_name: 'exec_command'`, `tool_input: { cmd, yield_time_ms }`) — would have failed before the `cmd` shim; plus a Claude-stdin (`command`) case so both field paths are covered.
+- `tests/unit/installCodexHooks.test.ts` asserts `PreToolUse`/`PermissionRequest` matcher === `'.*'` (regression guard against the invalid `*`).
+- Full codex suite: 19 green (7 + 3 + 3 + 6). tsc clean.
+## Rollback
+- Revert the matcher to its prior value and drop the `cmd` fallback in the three generators. No data migration. (Rollback re-breaks Codex enforcement — not advised.)
+## Follow-on (tracked, NOT deferred-broken)
+- **P6a managed hooks**: trust remains a separate concern — even `--dangerously-bypass-hook-trust` still pops an interactive trust prompt + a model-upsell prompt (would freeze unattended autonomy), and a trust-gated hook lets the agent decline ("continue without trusting"), so it can disable its own guard. Managed hooks (run-by-policy, agent-can't-disable) fix both. Genuine design fork → paused for Justin's input. Does not block this correctness fix shipping.
+## Publish
+- Feature branch `echo/codex-enforcement-hooks`. Targets release v1.2.57 once P6 (awareness + crossreview) completes.
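
The two P5c fixes above can be illustrated with a small sketch. JavaScript's `RegExp` is used here only as a stand-in for whatever regex engine Codex uses, which this diff does not specify. In JavaScript a bare `*` is rejected at compile time, and the sketch maps that failure to "never fires" to mirror the behaviour the review reports. Both function names are hypothetical.

```typescript
// Returns whether a hook with this matcher would fire for the given tool name.
// An invalid pattern is treated as matching nothing (assumption mirroring the review).
function matcherFires(matcher: string, toolName: string): boolean {
  try {
    return new RegExp(matcher).test(toolName);
  } catch {
    return false;
  }
}

// Field fallback as described in the review: `command or cmd or ''`.
// Claude's `command` is read first; Codex's `cmd` is the fallback.
function commandFromToolInput(toolInput: { command?: string; cmd?: string }): string {
  return toolInput.command || toolInput.cmd || "";
}
```

Under these assumptions, `matcherFires("*", "exec_command")` is false while `matcherFires(".*", "exec_command")` is true, and a payload carrying only `cmd` still yields a non-empty command for the guard to inspect.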

package/upgrades/side-effects/codex-hook-trust-bypass.md ADDED

@@ -0,0 +1,37 @@
+# Side-Effects Review: Codex hook-trust bypass (P6a — autonomy)
+## Change
+- **New** `src/core/codexCapabilities.ts`: memoized `codexSupportsHookTrustBypass(binaryPath)` — probes `codex --help` once per binary path, returns whether `--dangerously-bypass-hook-trust` is supported. Fails closed on any error.
+- **`src/core/frameworkSessionLaunch.ts`**: both the interactive (`codexCliBuilder`) and headless (`codexCliHeadlessBuilder`) codex builders append `--dangerously-bypass-hook-trust` when the probe passes.
+## Why
+Codex requires interactive trust of command hooks before they run; that prompt freezes an unattended/autonomous session and offers a "continue without trusting" escape that lets an agent decline its own guards. The flag (codex 0.133) runs instar's already-vetted hooks with no prompt. Per Justin's delegation ("choose what's best for Instar", 2026-05-24), chosen as a **per-agent launch flag** over system-managed hooks — see spec §10 P6a for the full rationale.
+## Scope / blast radius
+- **Codex launches only.** Claude launches are untouched (separate builder). The flag is appended only to codex argv, and only when the probed binary supports it.
+- **Safe-by-construction:** instar writes the hooks (`installCodexHooks`) AND owns the launch command — there is no untrusted third-party hook this prompt would have protected against. The "danger" of the flag (running arbitrary untrusted hooks) does not apply.
+- **No system-wide effect:** unlike managed hooks, this changes nothing outside instar's own spawned codex processes. The operator's personal `codex` launches do not get the flag and still prompt normally.
+- **Older codex (<0.133):** the probe returns false → flag omitted → the launch still succeeds (an unknown flag would otherwise fail the whole launch) and degrades to the safe-by-blocking interactive trust-prompt path.
+## Signal vs Authority / over-block
+- No change to what the hooks block. This only affects whether they run unprompted. No new authority, no new patterns.
+## Performance
+- The probe runs `codex --help` at most once per binary path per process (memoized), at session-spawn time (infrequent), 5s timeout. Negligible.
+## Migration parity
+- **None required.** This is instar's own dist code (the launch builders), not an agent-installed file (`.claude/settings.json` / `.codex/hooks.json` / hook scripts / config defaults). Existing agents receive it on the normal npm update + server restart. (The P5c hooks.json/guard changes that this depends on already shipped in v1.2.66 with their own migration.)
+## Unbypassability (honest bound)
+- The agent cannot remove the flag from a launch instar constructs → it cannot turn off the trust-bypass. It could still attempt to edit `.codex/hooks.json` / the guard script to neuter the hook itself; that residual vector is covered by always-overwrite-on-update of instar-owned hooks + the file-edit gates, not by this change. Absolute unbypassability would require system-managed hooks, deliberately not chosen (see spec §10 P6a).
+## Tests
+- `tests/unit/codexCapabilities.test.ts` (5): supported/unsupported `--help`, fail-closed on missing + empty path, memoization (cached true survives binary deletion).
+- `tests/unit/frameworkSessionLaunch.test.ts` (+4): interactive & headless builders append the flag when the fake binary advertises it, omit it when it doesn't; prompt stays the final positional arg in headless.
+- Live-proven end-to-end on real codex 0.133 (no trust granted → no prompt → guard still blocked `rm -rf /`). `tsc` clean; 53 launch/capability tests green.
+## Rollback
+- Remove the two `if (codexSupportsHookTrustBypass(...)) argv.push(...)` blocks and delete `codexCapabilities.ts`. No data migration. (Rollback re-introduces the autonomous-hang on the trust prompt.)
+## Publish
+- Branch `echo/codex-hook-trust-bypass`. Patch → next release.
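
A memoized, fail-closed capability probe of the kind described in the review above might look like the sketch below. It is not the contents of `codexCapabilities.ts`, which this diff view does not show. The function names, the argv-building helper, and the choice to cache negative results as well as positive ones are all assumptions; the flag name, the `--help` probe, and the 5s timeout come from the review.

```typescript
import { execFileSync } from "node:child_process";

const BYPASS_FLAG = "--dangerously-bypass-hook-trust";
const probeCache = new Map<string, boolean>();

// Probe `<binary> --help` at most once per path; any error means "not supported".
function supportsHookTrustBypass(binaryPath: string): boolean {
  const cached = probeCache.get(binaryPath);
  if (cached !== undefined) return cached;
  let supported = false;
  try {
    if (binaryPath) {
      const help = execFileSync(binaryPath, ["--help"], {
        encoding: "utf8",
        timeout: 5000,
        stdio: ["ignore", "pipe", "ignore"],
      });
      supported = help.indexOf(BYPASS_FLAG) !== -1;
    }
  } catch {
    supported = false; // missing binary, timeout, non-zero exit: fail closed
  }
  probeCache.set(binaryPath, supported); // assumption: negative results are cached too
  return supported;
}

// Hypothetical headless builder: the flag goes before the prompt,
// so the prompt stays the final positional argument.
function buildHeadlessArgv(binaryPath: string, flags: string[], prompt: string): string[] {
  const extra = supportsHookTrustBypass(binaryPath) ? [BYPASS_FLAG] : [];
  return flags.concat(extra, [prompt]);
}
```

Omitting the flag when the probe fails means an older binary still launches, which matches the degradation path the review describes for codex versions before 0.133.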