@ironbee-ai/cli 0.31.0 → 0.33.0

Files changed (73)
  1. package/CHANGELOG.md +8 -0
  2. package/dist/clients/base.js +1 -1
  3. package/dist/clients/claude/agents/ironbee-scenario.md +40 -11
  4. package/dist/clients/claude/agents/ironbee-verifier.md +40 -4
  5. package/dist/clients/claude/commands/ironbee-manage-scenario.md +2 -1
  6. package/dist/clients/claude/hooks/require-verdict.js +2 -2
  7. package/dist/clients/claude/hooks/require-verification.js +3 -3
  8. package/dist/clients/claude/hooks/track-action-monitor.js +1 -1
  9. package/dist/clients/claude/hooks/track-action.js +1 -1
  10. package/dist/clients/claude/index.js +4 -4
  11. package/dist/clients/claude/platforms/scenario.terminal.md +26 -0
  12. package/dist/clients/claude/platforms/skill.browser.md +1 -1
  13. package/dist/clients/claude/platforms/skill.terminal.md +62 -0
  14. package/dist/clients/codex/agents/ironbee-scenario.md +39 -10
  15. package/dist/clients/codex/agents/ironbee-verifier.md +39 -3
  16. package/dist/clients/codex/commands/ironbee-manage-scenario/SKILL.main.md +21 -6
  17. package/dist/clients/codex/commands/ironbee-manage-scenario/SKILL.md +2 -1
  18. package/dist/clients/codex/commands/ironbee-search-scenario/SKILL.main.md +3 -0
  19. package/dist/clients/codex/commands/ironbee-sync-scenario/SKILL.main.md +4 -1
  20. package/dist/clients/codex/commands/ironbee-verify/SKILL.main.md +4 -0
  21. package/dist/clients/codex/hooks/require-verification.js +1 -1
  22. package/dist/clients/codex/hooks/track-action.js +1 -1
  23. package/dist/clients/codex/index.js +2 -2
  24. package/dist/clients/codex/platforms/command-verify.terminal.md +61 -0
  25. package/dist/clients/codex/platforms/rule.terminal.md +31 -0
  26. package/dist/clients/codex/platforms/scenario.terminal.md +36 -0
  27. package/dist/clients/codex/platforms/skill.browser.md +1 -1
  28. package/dist/clients/codex/platforms/skill.terminal.md +57 -0
  29. package/dist/clients/codex/rules/ironbee-verification.main.md +3 -0
  30. package/dist/clients/codex/skills/ironbee-verification.main.md +14 -0
  31. package/dist/clients/codex/util.js +1 -1
  32. package/dist/clients/cursor/commands/ironbee-manage-scenario/SKILL.md +21 -6
  33. package/dist/clients/cursor/commands/ironbee-search-scenario/SKILL.md +3 -0
  34. package/dist/clients/cursor/commands/ironbee-sync-scenario/SKILL.md +4 -1
  35. package/dist/clients/cursor/commands/ironbee-verify/SKILL.md +4 -0
  36. package/dist/clients/cursor/hooks/require-verdict.js +2 -2
  37. package/dist/clients/cursor/hooks/require-verification.js +3 -3
  38. package/dist/clients/cursor/hooks/track-action-monitor.js +1 -1
  39. package/dist/clients/cursor/hooks/track-action.js +1 -1
  40. package/dist/clients/cursor/index.js +1 -1
  41. package/dist/clients/cursor/platforms/command-verify.terminal.md +61 -0
  42. package/dist/clients/cursor/platforms/rule.terminal.md +31 -0
  43. package/dist/clients/cursor/platforms/scenario.terminal.md +29 -0
  44. package/dist/clients/cursor/platforms/skill.browser.md +1 -1
  45. package/dist/clients/cursor/platforms/skill.terminal.md +54 -0
  46. package/dist/clients/cursor/rules/ironbee-verification.mdc +3 -0
  47. package/dist/clients/cursor/skills/ironbee-verification.md +14 -0
  48. package/dist/clients/registry.js +1 -1
  49. package/dist/commands/config.js +2 -2
  50. package/dist/commands/hook.js +22 -19
  51. package/dist/commands/install.js +1 -1
  52. package/dist/commands/platform-suggest.js +2 -0
  53. package/dist/commands/scenario.js +1 -1
  54. package/dist/commands/terminal.js +1 -0
  55. package/dist/hooks/core/actions.js +9 -7
  56. package/dist/hooks/core/run-checks.js +7 -0
  57. package/dist/hooks/core/verification-context.js +19 -15
  58. package/dist/hooks/core/verify-gate.js +35 -21
  59. package/dist/import/claude/events/tool-call.js +1 -1
  60. package/dist/import/codex/events/tool-call.js +1 -1
  61. package/dist/index.js +1 -1
  62. package/dist/lib/config.js +1 -1
  63. package/dist/lib/event.js +1 -1
  64. package/dist/lib/headless.js +1 -0
  65. package/dist/lib/install-version.js +1 -1
  66. package/dist/lib/platform-section.js +5 -4
  67. package/dist/lib/prompt.js +6 -5
  68. package/dist/lib/scenario-staleness.js +1 -1
  69. package/dist/tui/config/schema.js +1 -1
  70. package/dist/tui/platforms/area.js +2 -2
  71. package/dist/tui/projects/area.js +4 -4
  72. package/dist/tui/shell/session.js +5 -5
  73. package/package.json +1 -1
@@ -61,7 +61,7 @@ This is NOT a verification cycle — you submit no verdict and do not gate compl
61
61
  - **passes** → still current. (non-check) `scenario-update` to stamp `ironbee.commit` → current HEAD
62
62
  (read via `git rev-parse HEAD`) + `ironbee.liveValidated: true`; done. `scenario-update`
63
63
  shallow-replaces metadata, so read the current metadata and re-send it MERGED with these two
64
- keys — don't drop `coveredPaths` / `group` / `argsSchema`.
64
+ keys — don't drop `coveredPaths` / `group`. (Omit `params` to keep the stored contract.)
65
65
  - **fails due to DRIFT** (the *mechanics* broke — the way to reach / drive the flow changed, not the
66
66
  expected outcome) → repair the SCRIPT mechanics only, `scenario-update`, re-run until green, then
67
67
  stamp commit / liveValidated.
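The read-merge-write step this hunk describes can be sketched as follows. This is an illustration only: `stampFreshness` is a hypothetical helper, the commit hashes and path are made up, and the sketch assumes the metadata is a flat object with dotted keys such as `ironbee.commit`, which the diff does not confirm.

```js
// Hypothetical helper (not part of @ironbee-ai/cli): merge the two freshness
// stamps into the metadata that is already stored, so that a shallow-replacing
// update does not drop the other keys.
function stampFreshness(currentMetadata, headCommit) {
  return {
    ...currentMetadata,              // keeps coveredPaths, group, order, ...
    'ironbee.commit': headCommit,    // value read via `git rev-parse HEAD`
    'ironbee.liveValidated': true,
  };
}

// Made-up stored metadata, assumed to use flat dotted keys.
const storedMetadata = {
  'ironbee.coveredPaths': ['src/login.js'],
  'ironbee.group': 'checkout',
  'ironbee.commit': 'abc1234',
  'ironbee.liveValidated': false,
};

const mergedMetadata = stampFreshness(storedMetadata, 'def5678');
console.log(mergedMetadata);
```

The merged object is what would be sent back as the full `metadata` on `scenario-update`; per the changed line above, `params` would simply be left out of that call to keep the stored contract.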
@@ -139,30 +139,56 @@ their results.
139
139
 
140
140
  ## Script format
141
141
  A scenario `script` is JS run in the devtools sandbox (async — top-level `await`/`return` work).
142
- It reads params from the `args` binding and invokes the platform's tools via `callTool`:
142
+ It reads its inputs from the `args` binding and invokes the platform's tools via `callTool`:
143
143
 
144
144
  ```js
145
- const { baseUrl } = args; // declared via argsSchema
145
+ const { baseUrl } = args; // declared in the scenario's `params` contract
146
146
  const result = await callTool('<bare-tool-name>', { /* tool input */ });
147
147
  return { ok: true };
148
148
  ```
149
149
 
150
- `args` is opaque to devtools; document the expected shape in the scenario's `description` and the
151
- `argsSchema` metadata. **Discover the available `callTool` tool names for a platform from your
152
- connected MCP tool schemas** (the bare names) — don't guess.
150
+ Declare each input the script reads via the first-class **`params`** contract (see §Parameters),
151
+ not the old opaque `argsSchema` metadata key. **Discover the available `callTool` tool names for a
152
+ platform from your connected MCP tool schemas** (the bare names) — don't guess.
153
+
154
+ ## Parameters (`params`) — typed, defaulted, validated
155
+ For a parametric scenario, declare each input via the first-class **`params`** array on
156
+ `scenario-add` / `scenario-update` (a top-level field, NOT inside `metadata` — this supersedes the
157
+ old `argsSchema` metadata convention). Each entry:
158
+
159
+ - `name` (required) — the `args` key the script reads (e.g. `baseUrl`).
160
+ - `description` — what the param is (agent/human-facing).
161
+ - `type` — `string` / `number` / `boolean` / `object` / `array`. Omit for an untyped passthrough;
162
+ `object` / `array` are shallow-checked at the top level only (inner shape not validated).
163
+ - `default` — applied when the caller omits the arg; for `object` / `array` it doubles as the
164
+ concrete shape example. **Capture sensible defaults from the live-authoring run** so the scenario
165
+ re-runs "as captured" with zero args.
166
+ - `example` — documentation-only concrete shape, surfaced when there's no `default` (typically for
167
+ `object` / `array`). Never injected or validated.
168
+ - `required` — `true` rejects the run when there's no value AND no `default`.
169
+
170
+ `scenario-run` then applies defaults for omitted args, enforces `required`, and shallow-validates
171
+ declared types: re-running after a fix needs no re-derived args (`scenario-run { name }` reproduces
172
+ the captured values), and a wrong-type / required-missing run fails loudly instead of running with
173
+ `undefined`. The declared params ride in `scenario-list` / `-search` / `-run` output, so the
174
+ contract is visible without reading the script. Pass `args` only to OVERRIDE a default. A scenario
175
+ with no `params` keeps the fully-opaque `args` passthrough (document its shape in `description`).
176
+
177
+ **`scenario-update` shallow-replaces `params`** (same as `metadata`): to change one entry, re-send
178
+ the FULL `params` array; omit `params` entirely to keep the stored contract.
153
179
 
154
180
  ## Metadata conventions (stamp these on add/update)
155
181
  - `ironbee.coveredPaths` — source paths the scenario exercises (array), when derivable.
156
- - `argsSchema` — declared params, e.g. `{ "baseUrl": "string" }`.
157
- **Mandatory for any parametric scenario** (run reads it to know what to ask).
158
182
  - `ironbee.liveValidated` — `true` when you validated the scenario by running it end-to-end against
159
183
  the live app this session; `false` when authored source-only (`draft`, or the app couldn't be
160
184
  started). Always stamp it.
161
185
  - `ironbee.commit` — the commit the scenario was authored against (`git rev-parse HEAD`).
162
186
  - `ironbee.group` / `ironbee.order` — for a high-level scenario split across platforms: a shared
163
187
  group slug + integer run order.
164
- - `scenario-update` does a **shallow replace** of metadata — to change one key, re-send the FULL
165
- metadata object (read it first, merge, write back).
188
+ - `scenario-update` does a **shallow replace** of metadata (and of `params`) — to change one key,
189
+ re-send the FULL object / array (read it first, merge, write back).
190
+ - (The scenario's typed input contract is the first-class **`params`** field — see §Parameters —
191
+ NOT a metadata key.)
166
192
 
167
193
  The platform sections below tell you each enabled cycle's server, tool prefix, and store dir.
168
194
 
@@ -177,3 +203,6 @@ The platform sections below tell you each enabled cycle's server, tool prefix, a
177
203
 
178
204
  <!--IRONBEE:PLATFORM:android-->
179
205
  <!--/IRONBEE:PLATFORM:android-->
206
+
207
+ <!--IRONBEE:PLATFORM:terminal-->
208
+ <!--/IRONBEE:PLATFORM:terminal-->
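The `params` behavior added in this file (defaults for omitted args, `required` enforcement, shallow type checks) can be sketched roughly as below. This is a re-implementation under stated assumptions, not IronBee's code: `resolveArgs` and `shallowTypeOf` are hypothetical names, and the parameter entries and default URL are invented for illustration.

```js
// Hypothetical sketch of the described contract; not the package's implementation.

// Top-level type only: inner shape of objects and arrays is not inspected.
function shallowTypeOf(value) {
  if (Array.isArray(value)) return 'array';
  if (value === null) return 'null';
  return typeof value; // 'string' | 'number' | 'boolean' | 'object' | ...
}

function resolveArgs(params, args = {}) {
  const resolved = { ...args };
  for (const p of params) {
    // Apply the declared default when the caller omitted the arg.
    if (resolved[p.name] === undefined && 'default' in p) {
      resolved[p.name] = p.default;
    }
    const value = resolved[p.name];
    if (value === undefined) {
      // No value and no default: reject only if the param is required.
      if (p.required) throw new Error(`missing required param: ${p.name}`);
      continue;
    }
    // Entries without a `type` pass through unchecked.
    if (p.type && shallowTypeOf(value) !== p.type) {
      throw new Error(`param ${p.name}: expected ${p.type}, got ${shallowTypeOf(value)}`);
    }
  }
  return resolved;
}

// Invented contract for illustration.
const exampleParams = [
  { name: 'baseUrl', type: 'string', default: 'http://localhost:3000' },
  { name: 'retries', type: 'number', required: true },
];

console.log(resolveArgs(exampleParams, { retries: 2 }));
```

Note that `example` plays no role in the sketch, matching the description that it is documentation-only and never injected or validated.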
@@ -29,14 +29,30 @@ The delegating prompt may tell you what to verify in one of two ways:
29
29
  gate's required-tools for you (as long as the scenario exercises them).
30
30
  **On a PASS verdict, also keep the scenario fresh:** `*_scenario-update` its `ironbee.commit`
31
31
  → current HEAD (`git rev-parse HEAD`) + `liveValidated: true` — read the current metadata and
32
- re-send it MERGED (shallow replace; don't drop `coveredPaths` / `group` / `argsSchema`). On a
32
+ re-send it MERGED (shallow replace; don't drop `coveredPaths` / `group`; omit `params` to keep
33
+ the stored typed contract). On a
33
34
  FAIL / defect, do NOT stamp (leave it for `$ironbee-sync-scenario scenario:<name>` or the user).
34
35
  - **A FREE-TEXT scenario / file path** — anything else is authoritative: verify exactly what it
35
36
  describes, driving each active cycle's tools to exercise precisely the flows, states, and endpoints
36
37
  it names (this replaces the default "exercise the changed pages/endpoints").
37
38
 
38
39
  Map each `checks` entry to a scenario step, each `issues` entry to a step that failed. If no scenario
39
- is given at all, exercise the changed pages/endpoints for each active cycle.
40
+ is given at all, exercise the changed pages/endpoints for each active cycle **plus the downstream
41
+ flows they feed** (see *Verify end-to-end* below).
42
+
43
+ ## Verify end-to-end — trace the blast radius (don't stop at the edited file)
44
+
45
+ A change's defect most often surfaces not on the edited file's own surface but in a **downstream
46
+ consumer** of what the change produces — wherever its output is read back, stored, rendered, or acted
47
+ on. Before driving tools, spend ONE quick pass reading/grepping the code to map the blast radius:
48
+ identify what the change produces and which other surfaces consume it, then exercise the FULL flow
49
+ from where the change is produced through to where its effect is observable — not only the surface the
50
+ edited file owns. A feature that works at its source but breaks in a downstream consumer is a **FAIL**.
51
+
52
+ This holds even when the consumer was not itself edited: the place you should have updated but didn't
53
+ never appears in the changed-files list, so don't let that list bound your verification — **follow the
54
+ data, not the diff.** Keep the mapping quick (a focused scan, not a full audit) so it doesn't eat the
55
+ speed budget.
40
56
 
41
57
  ## Session id — you don't need it
42
58
  The `ironbee hook` commands resolve the session automatically from your environment
@@ -59,6 +75,23 @@ echo '{"status":"pass","checks":["..."]}' | ironbee hook submit-verdict
59
75
  echo '{}' | ironbee hook verification-start --intent fix
60
76
  ```
61
77
  (No declared mode → plain form as above, no flag.)
78
+ 1.5. **Run the project checks FIRST (lint/test/…)** — the deterministic first step of every
79
+ verification cycle. Run them with a **generous timeout** (they may take minutes):
80
+ ```
81
+ echo '{}' | ironbee hook run-checks
82
+ ```
83
+ This runs the project's configured `verification.checks` and records the results IronBee's
84
+ gate reads.
85
+
86
+ 🛑 **HARD STOP — IF ANY REQUIRED CHECK FAILS, THE VERIFICATION HAS ALREADY FAILED.** Do **NOT**
87
+ drive the devtools tools. Do **NOT** submit a pass. Do **NOT** rationalize the failure away — it
88
+ is **NOT your call** whether a required failure is "just a planted fixture", "unrelated to my
89
+ change", "pre-existing", or "not really broken": IronBee marked the check **required**, so a
90
+ non-zero exit **IS** a verification failure, full stop. Immediately submit a **fail** verdict
91
+ whose `issues` are the failing checks (the gate enforces the fix).
92
+
93
+ Only when **every** required check PASSES do you continue to the application/devtools flow below.
94
+ (If it reports "no checks configured", just continue.)
62
95
  2. Build and start the application **only if it isn't already running** (check
63
96
  `docker compose ps` / process output / config — don't guess ports). **Track whether YOU
64
97
  started it**: if it was already up, the user or main agent owns it — leave it alone.
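The hard-stop rule in step 1.5 amounts to a simple decision over the check results. The sketch below assumes a result shape of `{ name, required, exitCode }` and a fail verdict of `{ status: 'fail', issues: [...] }` that mirrors the pass verdict shown earlier; neither is confirmed by the diff, and `verdictFromChecks` is a hypothetical helper, not the output format of `ironbee hook run-checks`.

```js
// Hypothetical sketch: decide whether verification must stop after the checks.
// The result shape { name, required, exitCode } is an assumption.
function verdictFromChecks(results) {
  const failed = results.filter((r) => r.required && r.exitCode !== 0);
  if (failed.length > 0) {
    // Any required non-zero exit is a failure; no judgment call is made here.
    return {
      status: 'fail',
      issues: failed.map((r) => `${r.name} exited with code ${r.exitCode}`),
    };
  }
  return null; // nothing blocks: continue to the application/devtools flow
}

// Invented results for illustration.
const sampleResults = [
  { name: 'lint', required: true, exitCode: 0 },
  { name: 'test', required: true, exitCode: 1 },
  { name: 'typecheck', required: false, exitCode: 2 },
];

console.log(JSON.stringify(verdictFromChecks(sampleResults)));
```

An empty result list returns `null`, which corresponds to the "no checks configured, just continue" case.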
@@ -118,7 +151,7 @@ Each tool call is a separate LLM round-trip, and that round-trip — not the too
118
151
  — is the dominant cost of a verification. Drive the tools in as few turns as you can:
119
152
 
120
153
  - **Batch a scope's work into ONE `*_execute` call.** Each cycle exposes a batch tool
121
- (`bdt_execute` / `ndt_execute` / `bedt_execute` / `adt_execute`) that runs many steps in
154
+ (`bdt_execute` / `ndt_execute` / `bedt_execute` / `adt_execute` / `tdt_execute`) that runs many steps in
122
155
  one turn — nest each as a `callTool('<tool>', { … })`. A batch nests only that cycle's own
123
156
  tools (you can't mix servers in one `*_execute`). It's a JS sandbox, so a later step
124
157
  can reuse a value an earlier `callTool` returned
@@ -147,3 +180,6 @@ Each tool call is a separate LLM round-trip, and that round-trip — not the too
147
180
 
148
181
  <!--IRONBEE:PLATFORM:android-->
149
182
  <!--/IRONBEE:PLATFORM:android-->
183
+
184
+ <!--IRONBEE:PLATFORM:terminal-->
185
+ <!--/IRONBEE:PLATFORM:terminal-->
@@ -69,23 +69,35 @@ tools directly: that keeps it gate-orthogonal — no `verification_id`, can't fa
69
69
  > passes" means fixing the SCRIPT, never working around the app.)
70
70
 
71
71
  ## Script format
72
- JS run in the devtools sandbox (async — top-level `await`/`return` work); reads params from `args`:
72
+ JS run in the devtools sandbox (async — top-level `await`/`return` work); reads its inputs from `args`:
73
73
 
74
74
  ```js
75
- const { baseUrl } = args; // declared via argsSchema
75
+ const { baseUrl } = args; // declared in the scenario's `params` contract
76
76
  const result = await callTool('<bare-tool-name>', { /* tool input */ });
77
77
  return { ok: true };
78
78
  ```
79
79
 
80
80
  Discover the available `callTool` tool names for a platform from your connected MCP schemas — don't
81
- guess. Document the expected `args` in the `description` + the `argsSchema` metadata.
81
+ guess. Declare each input via the first-class **`params`** contract (§Parameters), not `argsSchema`.
82
+
83
+ ## Parameters (`params`) — typed, defaulted, validated
84
+ Declare a parametric scenario's inputs via the first-class **`params`** array on
85
+ `scenario-add` / `scenario-update` (top-level field, NOT metadata — supersedes the old `argsSchema`
86
+ convention). Each entry: `name` (required — the `args` key the script reads), `description`, `type`
87
+ (`string`/`number`/`boolean`/`object`/`array`; `object`/`array` shallow-checked at the top level),
88
+ `default` (applied when the arg is omitted — **capture it from the live-authoring run** so the
89
+ scenario re-runs "as captured" with zero args), `example` (doc-only shape when there's no `default`),
90
+ `required` (reject the run when there's no value AND no `default`). `scenario-run` applies defaults,
91
+ enforces `required`, shallow-validates declared types, and surfaces `params` in list/search/run
92
+ output. Pass `args` only to OVERRIDE a default. `scenario-update` shallow-replaces `params` (re-send
93
+ the full array; omit to keep the stored contract).
82
94
 
83
95
  ## Metadata conventions (stamp on add/update)
84
- - `argsSchema` — declared params, e.g. `{ "baseUrl": "string" }`. **Mandatory for parametric scenarios.**
85
96
  - `ironbee.coveredPaths` — source paths exercised (array), when derivable.
86
97
  - `ironbee.group` / `ironbee.order` — for a cross-platform split.
87
- - `*_scenario-update` does a **shallow replace** of metadata — to change one key, re-send the FULL
88
- metadata object (read it first, merge, write back).
98
+ - `*_scenario-update` does a **shallow replace** of metadata (and of `params`) — to change one key,
99
+ re-send the FULL object / array (read it first, merge, write back). The typed input contract is the
100
+ first-class `params` field (§Parameters), not a metadata key.
89
101
 
90
102
  The platform sections below list each enabled cycle's server, tool prefix, and store dir.
91
103
 
@@ -100,3 +112,6 @@ The platform sections below list each enabled cycle's server, tool prefix, and s
100
112
 
101
113
  <!--IRONBEE:PLATFORM:android-->
102
114
  <!--/IRONBEE:PLATFORM:android-->
115
+
116
+ <!--IRONBEE:PLATFORM:terminal-->
117
+ <!--/IRONBEE:PLATFORM:terminal-->
@@ -32,7 +32,8 @@ custom agent. This is NOT a verification cycle — it submits no verdict and doe
32
32
  right platform, authors the script — **against the live app by default** (starts the app if needed,
33
33
  observes the real behavior, validates by running once, then cleans up — deletes any probe /
34
34
  throwaway scenarios it added and stops what it started; `draft` skips this)
35
- — and stamps metadata (`argsSchema` for parametric ones).
35
+ — and declares the typed `params` contract for parametric ones (defaults captured from the run)
36
+ plus stamps metadata.
36
37
  **Delete and fuzzy-resolved update ask you to confirm** the matched scenario first — relay that
37
38
  to the user and pass their answer back. **Wait for the sub-agent in the same turn.**
38
39
  3. **Relay** the sub-agent's summary (what it created / updated / deleted, on which platform).
@@ -35,3 +35,6 @@ The platform sections below list each enabled cycle's server, tool prefix, and s
35
35
 
36
36
  <!--IRONBEE:PLATFORM:android-->
37
37
  <!--/IRONBEE:PLATFORM:android-->
38
+
39
+ <!--IRONBEE:PLATFORM:terminal-->
40
+ <!--/IRONBEE:PLATFORM:terminal-->
@@ -23,7 +23,7 @@ is NOT a verification cycle — no verdict, no gate.
23
23
  - **passes** → still current; (non-check) `*_scenario-update` to stamp `ironbee.commit` → HEAD
24
24
  (read via `git rev-parse HEAD`) + `ironbee.liveValidated: true`. `*_scenario-update`
25
25
  shallow-replaces metadata — read current metadata and re-send it MERGED with these two keys
26
- (don't drop `coveredPaths` / `group` / `argsSchema`).
26
+ (don't drop `coveredPaths` / `group`; omit `params` to keep the stored typed contract).
27
27
  - **mechanical DRIFT** (the way to reach / drive the flow changed, not the expected outcome) →
28
28
  repair the SCRIPT mechanics only, `*_scenario-update`, re-run until green, then stamp.
29
29
  - **real DEFECT** (the expected outcome is unreachable — the app broke) → **STOP, report, do NOT
@@ -53,3 +53,6 @@ running anything, use `ironbee scenario status`.)
53
53
 
54
54
  <!--IRONBEE:PLATFORM:android-->
55
55
  <!--/IRONBEE:PLATFORM:android-->
56
+
57
+ <!--IRONBEE:PLATFORM:terminal-->
58
+ <!--/IRONBEE:PLATFORM:terminal-->
@@ -72,6 +72,7 @@ stripping a leading `fix` / `report` mode token.
72
72
  ```
73
73
  echo '{"session_id":"<your-session-id>"}' | ironbee hook verification-start --intent fix
74
74
  ```
75
+ 1.5. **Run the project checks FIRST (lint/test/…)**: `echo '{"session_id":"<your-session-id>"}' | ironbee hook run-checks` (generous shell timeout — they may take minutes). Runs the configured `verification.checks` and records the results the gate reads. 🛑 **IF ANY REQUIRED CHECK FAILS, THE VERIFICATION HAS ALREADY FAILED — STOP.** It is **NOT your call** whether the failure is "just a fixture", "unrelated", or "pre-existing" — a required non-zero exit **IS** a failure. Do **NOT** touch the devtools tools or submit a pass; submit a **fail** verdict whose `issues` are the failing checks (the gate enforces the fix). Only when **every** required check PASSES do you continue. ("no checks configured" → continue.)
75
76
  2. **Build and start** the application if not already running (don't guess ports). Track what YOU started.
76
77
  3. **For every active cycle, run its flow** — driven by the scenario above when supplied, otherwise
77
78
  per the platform sections near the bottom of this file. All active cycles must be exercised within
@@ -102,6 +103,9 @@ stripping a leading `fix` / `report` mode token.
102
103
  <!--IRONBEE:PLATFORM:android-->
103
104
  <!--/IRONBEE:PLATFORM:android-->
104
105
 
106
+ <!--IRONBEE:PLATFORM:terminal-->
107
+ <!--/IRONBEE:PLATFORM:terminal-->
108
+
105
109
  ---
106
110
 
107
111
  ## When to FAIL
@@ -3,7 +3,7 @@
3
3
  Start verification first:
4
4
  echo '{"session_id":"${t}"}' | ironbee hook verification-start
5
5
 
6
- Then use the verification tools for the active cycle(s) \u2014 mcp__browser-devtools__bdt_* for browser, mcp__node-devtools__ndt_* for node, mcp__backend-devtools__bedt_* for backend, mcp__android-devtools__adt_* for android.`;process.stdout.write(JSON.stringify({hookSpecificOutput:{hookEventName:"PreToolUse",permissionDecision:"deny",permissionDecisionReason:p}})),process.exit(0);return}const _=r.tool_name??"",S=(0,f.extractCodexMcpServer)(_),c=(0,A.recordingToolsForServer)(S),j=c!==null?(0,f.canonicalizeCodexToolName)(_.split("__").pop()??""):"";if(!s&&!g&&c!==null&&(0,i.isRecordingRequired)(n)&&!(0,i.isRecordingActive)(n)&&j!==c.startTool){const p=`BLOCKED: Recording is required but not started.
6
+ Then use the verification tools for the active cycle(s) \u2014 mcp__browser-devtools__bdt_* for browser, mcp__node-devtools__ndt_* for node, mcp__backend-devtools__bedt_* for backend, mcp__android-devtools__adt_* for android, mcp__terminal-devtools__tdt_* for terminal.`;process.stdout.write(JSON.stringify({hookSpecificOutput:{hookEventName:"PreToolUse",permissionDecision:"deny",permissionDecisionReason:p}})),process.exit(0);return}const _=r.tool_name??"",S=(0,f.extractCodexMcpServer)(_),c=(0,A.recordingToolsForServer)(S),j=c!==null?(0,f.canonicalizeCodexToolName)(_.split("__").pop()??""):"";if(!s&&!g&&c!==null&&(0,i.isRecordingRequired)(n)&&!(0,i.isRecordingActive)(n)&&j!==c.startTool){const p=`BLOCKED: Recording is required but not started.
7
7
 
8
8
  1. Start recording NOW:
9
9
  Use mcp__${c.server}__${c.startTool}
@@ -1 +1 @@
1
- "use strict";var N=Object.defineProperty;var K=Object.getOwnPropertyDescriptor;var W=Object.getOwnPropertyNames;var X=Object.prototype.hasOwnProperty;var b=(t,e)=>N(t,"name",{value:e,configurable:!0});var Z=(t,e)=>{for(var o in e)N(t,o,{get:e[o],enumerable:!0})},ee=(t,e,o,n)=>{if(e&&typeof e=="object"||typeof e=="function")for(let i of W(e))!X.call(t,i)&&i!==o&&N(t,i,{get:()=>e[i],enumerable:!(n=K(e,i))||n.enumerable});return t};var te=t=>ee(N({},"__esModule",{value:!0}),t);var se={};Z(se,{run:()=>ie});module.exports=te(se);var T=require("../../../hooks/core/actions"),v=require("../../../hooks/core/nested-tools"),$=require("../../../import/ids"),L=require("../../../lib/runtime-paths"),r=require("../../../hooks/core/session-state"),P=require("../../../hooks/core/tool-use-stash"),U=require("../../../lib/config"),a=require("../../../lib/logger"),h=require("../../../lib/output"),q=require("../../../lib/recording-tools"),H=require("../../../lib/stdin"),x=require("../../../queue"),d=require("../util");function A(t){if(t==null)return 0;if(typeof t=="string")try{return Buffer.byteLength(t,"utf8")}catch{return 0}try{return Buffer.byteLength(JSON.stringify(t),"utf8")}catch{return 0}}b(A,"safeStringifyBytes");function oe(t){if(t==null)return{isError:!1,errorText:void 0};if(typeof t=="object"&&t!==null){const e=t;if(e.isError===!0||e.is_error===!0){const o=e.error??e.message??e.errorMessage;return{isError:!0,errorText:typeof o=="string"?o:JSON.stringify(e).slice(0,500)}}}if(typeof t=="string"){const e=t;if(/(?:^|\n)Process exited with code [1-9]/.test(e)||/^Exit code:\s*[1-9]/m.test(e)||/apply_patch verification failed/i.test(e)||/failed to find expected lines/i.test(e)||/^\s*Error\b/.test(e)||/(?:^|\n)\[Request interrupted by user\]/.test(e)||/modified since (?:last )?read|stale read/i.test(e)||/file (?:is )?too large|exceeds/i.test(e)||/file not found|No such file or directory|does not 
exist/i.test(e))return{isError:!0,errorText:e.slice(0,500)}}return{isError:!1,errorText:void 0}}b(oe,"detectFailure");function ne(t){if(t===null||typeof t!="object")return;const e=t._metadata;if(e===null||typeof e!="object")return;const o=e.toolCallId;if(typeof o=="string"&&/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(o))return o}b(ne,"extractMetadataToolCallId");function re(t,e){const o=(0,P.consumeToolUseData)(t,e);if(!o?.start_ns)return null;try{const n=process.hrtime.bigint()-BigInt(o.start_ns);return Number(n/1000000n)}catch(n){return a.logger.debug(`failed to derive duration from stash: ${n}`),null}}b(re,"deriveDurationMs");async function ie(t){const e=(0,d.parseCodexHookStdin)((0,H.readStdin)()),o=e.session_id??"default",n=(0,L.sessionDir)(t,o),i=`${n}/actions.jsonl`;(0,a.setLogFile)(`${n}/session.log`);const y=e.tool_name??"",s=e.tool_use_id??"",g=e.tool_input,R=g&&typeof g=="object"?{...g,_metadata:void 0}:void 0,C=e.tool_response,l=(0,d.extractCodexMcpServer)(y),D=l==="browser-devtools"||l==="node-devtools"||l==="backend-devtools"||l==="android-devtools",z=re(o,s),c=(0,d.classifyCodexTool)(y),F=D&&(0,v.isNestedToolContainer)(c.tool_name,l),J=F?(0,v.extractNestedToolCallsFromResponse)(C,l):null,f=J!==null?{isError:!1,errorText:void 0}:oe(C);if(D){const w=c.tool_name,u=(0,q.recordingToolsForServer)(l);u!==null&&(w===u.startTool?(0,r.setRecordingActive)(n,!0):w===u.stopTool&&(0,r.setRecordingActive)(n,!1));const E=(0,r.getActiveActivityId)(n),m={...(0,T.baseFields)(i),type:"tool_call",timestamp:Date.now(),tool_type:c.tool_type,tool_name:c.tool_name,mcp_server:c.mcp_server??l,tool_input:R,tool_input_size:A(R),tool_response:f.isError?void 0:C,tool_response_size:f.isError?0:A(C),duration:z};E&&(m.activity_id=E);const B=ne(g);B!==void 0?m.id=B:s.length>0&&(m.id=(0,$.deriveToolCallEventIdFromToolUseId)(o,s)),s&&(m.tool_use_id=s);const k=(0,r.getActiveVerificationId)(n);k&&(m.verification_id=k);const 
S=(0,r.getActiveTraceId)(n);if(S&&(m.trace_id=S),f.isError&&(m.error=f.errorText),await(0,T.appendAction)(i,m),F&&!f.isError){const G=J??(0,v.extractNestedToolCalls)(R??g,l);for(const _ of G){u!==null&&(_.name===u.startTool?((0,r.setRecordingActive)(n,!0),a.logger.debug(`track-action (nested): recording started (${u.cycle})`)):_.name===u.stopTool&&((0,r.setRecordingActive)(n,!1),a.logger.debug(`track-action (nested): recording stopped (${u.cycle})`)));const I={...(0,T.baseFields)(i),type:"tool_call",timestamp:_.startTime??Date.now(),tool_name:_.name,tool_type:"mcp",tool_input:_.args,duration:_.duration??null,mcp_server:l,nested:!0,...s?{parent_tool_use_id:s}:{}};E&&(I.activity_id=E),k&&(I.verification_id=k),S&&(I.trace_id=S),await(0,T.appendAction)(i,I),a.logger.debug(`track-action (nested): ${_.name}`)}}(0,h.writeAndExit)(JSON.stringify({}),0);return}if(!(0,U.isJobQueueEnabled)(t)){(0,h.writeAndExit)(JSON.stringify({}),0);return}const M=(0,r.getActiveActivityId)(n),V=(0,d.extractCodexToolInput)(y,g),Q=A(g),Y=f.isError?0:A(C),p={...(0,T.baseFields)(i),type:"tool_call",timestamp:Date.now(),tool_type:c.tool_type,tool_name:c.tool_name||(0,d.normalizeCodexToolName)(y),mcp_server:c.mcp_server,tool_input:V,tool_input_size:Q,tool_response_size:Y,duration:z};M&&(p.activity_id=M),s.length>0&&(p.id=(0,$.deriveToolCallEventIdFromToolUseId)(o,s)),s&&(p.tool_use_id=s);const O=(0,r.getActiveVerificationId)(n);O&&(p.verification_id=O);const j=(0,r.getActiveTraceId)(n);j&&(p.trace_id=j),f.isError&&(p.error=f.errorText);try{(0,x.submit)(t,o,x.SEND_EVENT_TYPE,p)}catch(w){w instanceof x.JobTooLargeError?a.logger.debug(`track-action: wire event too large for tool_call ${y}; dropping`):a.logger.debug(`queue submit failed for tool_call ${y}: ${w}`)}(0,h.writeAndExit)(JSON.stringify({}),0)}b(ie,"run");0&&(module.exports={run});
1
+ "use strict";var N=Object.defineProperty;var K=Object.getOwnPropertyDescriptor;var W=Object.getOwnPropertyNames;var X=Object.prototype.hasOwnProperty;var b=(t,e)=>N(t,"name",{value:e,configurable:!0});var Z=(t,e)=>{for(var o in e)N(t,o,{get:e[o],enumerable:!0})},ee=(t,e,o,n)=>{if(e&&typeof e=="object"||typeof e=="function")for(let i of W(e))!X.call(t,i)&&i!==o&&N(t,i,{get:()=>e[i],enumerable:!(n=K(e,i))||n.enumerable});return t};var te=t=>ee(N({},"__esModule",{value:!0}),t);var se={};Z(se,{run:()=>ie});module.exports=te(se);var T=require("../../../hooks/core/actions"),v=require("../../../hooks/core/nested-tools"),$=require("../../../import/ids"),L=require("../../../lib/runtime-paths"),r=require("../../../hooks/core/session-state"),P=require("../../../hooks/core/tool-use-stash"),U=require("../../../lib/config"),a=require("../../../lib/logger"),h=require("../../../lib/output"),q=require("../../../lib/recording-tools"),H=require("../../../lib/stdin"),x=require("../../../queue"),d=require("../util");function A(t){if(t==null)return 0;if(typeof t=="string")try{return Buffer.byteLength(t,"utf8")}catch{return 0}try{return Buffer.byteLength(JSON.stringify(t),"utf8")}catch{return 0}}b(A,"safeStringifyBytes");function oe(t){if(t==null)return{isError:!1,errorText:void 0};if(typeof t=="object"&&t!==null){const e=t;if(e.isError===!0||e.is_error===!0){const o=e.error??e.message??e.errorMessage;return{isError:!0,errorText:typeof o=="string"?o:JSON.stringify(e).slice(0,500)}}}if(typeof t=="string"){const e=t;if(/(?:^|\n)Process exited with code [1-9]/.test(e)||/^Exit code:\s*[1-9]/m.test(e)||/apply_patch verification failed/i.test(e)||/failed to find expected lines/i.test(e)||/^\s*Error\b/.test(e)||/(?:^|\n)\[Request interrupted by user\]/.test(e)||/modified since (?:last )?read|stale read/i.test(e)||/file (?:is )?too large|exceeds/i.test(e)||/file not found|No such file or directory|does not 
exist/i.test(e))return{isError:!0,errorText:e.slice(0,500)}}return{isError:!1,errorText:void 0}}b(oe,"detectFailure");function ne(t){if(t===null||typeof t!="object")return;const e=t._metadata;if(e===null||typeof e!="object")return;const o=e.toolCallId;if(typeof o=="string"&&/^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i.test(o))return o}b(ne,"extractMetadataToolCallId");function re(t,e){const o=(0,P.consumeToolUseData)(t,e);if(!o?.start_ns)return null;try{const n=process.hrtime.bigint()-BigInt(o.start_ns);return Number(n/1000000n)}catch(n){return a.logger.debug(`failed to derive duration from stash: ${n}`),null}}b(re,"deriveDurationMs");async function ie(t){const e=(0,d.parseCodexHookStdin)((0,H.readStdin)()),o=e.session_id??"default",n=(0,L.sessionDir)(t,o),i=`${n}/actions.jsonl`;(0,a.setLogFile)(`${n}/session.log`);const y=e.tool_name??"",s=e.tool_use_id??"",g=e.tool_input,R=g&&typeof g=="object"?{...g,_metadata:void 0}:void 0,C=e.tool_response,l=(0,d.extractCodexMcpServer)(y),D=l==="browser-devtools"||l==="node-devtools"||l==="backend-devtools"||l==="android-devtools"||l==="terminal-devtools",z=re(o,s),c=(0,d.classifyCodexTool)(y),F=D&&(0,v.isNestedToolContainer)(c.tool_name,l),J=F?(0,v.extractNestedToolCallsFromResponse)(C,l):null,f=J!==null?{isError:!1,errorText:void 0}:oe(C);if(D){const w=c.tool_name,u=(0,q.recordingToolsForServer)(l);u!==null&&(w===u.startTool?(0,r.setRecordingActive)(n,!0):w===u.stopTool&&(0,r.setRecordingActive)(n,!1));const E=(0,r.getActiveActivityId)(n),m={...(0,T.baseFields)(i),type:"tool_call",timestamp:Date.now(),tool_type:c.tool_type,tool_name:c.tool_name,mcp_server:c.mcp_server??l,tool_input:R,tool_input_size:A(R),tool_response:f.isError?void 0:C,tool_response_size:f.isError?0:A(C),duration:z};E&&(m.activity_id=E);const B=ne(g);B!==void 0?m.id=B:s.length>0&&(m.id=(0,$.deriveToolCallEventIdFromToolUseId)(o,s)),s&&(m.tool_use_id=s);const k=(0,r.getActiveVerificationId)(n);k&&(m.verification_id=k);const 
S=(0,r.getActiveTraceId)(n);if(S&&(m.trace_id=S),f.isError&&(m.error=f.errorText),await(0,T.appendAction)(i,m),F&&!f.isError){const G=J??(0,v.extractNestedToolCalls)(R??g,l);for(const _ of G){u!==null&&(_.name===u.startTool?((0,r.setRecordingActive)(n,!0),a.logger.debug(`track-action (nested): recording started (${u.cycle})`)):_.name===u.stopTool&&((0,r.setRecordingActive)(n,!1),a.logger.debug(`track-action (nested): recording stopped (${u.cycle})`)));const I={...(0,T.baseFields)(i),type:"tool_call",timestamp:_.startTime??Date.now(),tool_name:_.name,tool_type:"mcp",tool_input:_.args,duration:_.duration??null,mcp_server:l,nested:!0,...s?{parent_tool_use_id:s}:{}};E&&(I.activity_id=E),k&&(I.verification_id=k),S&&(I.trace_id=S),await(0,T.appendAction)(i,I),a.logger.debug(`track-action (nested): ${_.name}`)}}(0,h.writeAndExit)(JSON.stringify({}),0);return}if(!(0,U.isJobQueueEnabled)(t)){(0,h.writeAndExit)(JSON.stringify({}),0);return}const M=(0,r.getActiveActivityId)(n),V=(0,d.extractCodexToolInput)(y,g),Q=A(g),Y=f.isError?0:A(C),p={...(0,T.baseFields)(i),type:"tool_call",timestamp:Date.now(),tool_type:c.tool_type,tool_name:c.tool_name||(0,d.normalizeCodexToolName)(y),mcp_server:c.mcp_server,tool_input:V,tool_input_size:Q,tool_response_size:Y,duration:z};M&&(p.activity_id=M),s.length>0&&(p.id=(0,$.deriveToolCallEventIdFromToolUseId)(o,s)),s&&(p.tool_use_id=s);const O=(0,r.getActiveVerificationId)(n);O&&(p.verification_id=O);const j=(0,r.getActiveTraceId)(n);j&&(p.trace_id=j),f.isError&&(p.error=f.errorText);try{(0,x.submit)(t,o,x.SEND_EVENT_TYPE,p)}catch(w){w instanceof x.JobTooLargeError?a.logger.debug(`track-action: wire event too large for tool_call ${y}; dropping`):a.logger.debug(`queue submit failed for tool_call ${y}: ${w}`)}(0,h.writeAndExit)(JSON.stringify({}),0)}b(ie,"run");0&&(module.exports={run});
@@ -1,3 +1,3 @@
- "use strict";var M=Object.defineProperty;var re=Object.getOwnPropertyDescriptor;var se=Object.getOwnPropertyNames;var ae=Object.prototype.hasOwnProperty;var S=(f,e)=>M(f,"name",{value:e,configurable:!0});var le=(f,e)=>{for(var o in e)M(f,o,{get:e[o],enumerable:!0})},ce=(f,e,o,r)=>{if(e&&typeof e=="object"||typeof e=="function")for(let i of se(e))!ae.call(f,i)&&i!==o&&M(f,i,{get:()=>e[i],enumerable:!(r=re(e,i))||r.enumerable});return f};var de=f=>ce(M({},"__esModule",{value:!0}),f);var fe={};le(fe,{CodexClient:()=>ue});module.exports=de(fe);var s=require("fs"),m=require("path"),U=require("../../lib/gitignore"),b=require("../../lib/logger"),l=require("../../lib/output"),O=require("../../lib/fs-prune"),d=require("../../lib/config"),C=require("../../lib/platform-section"),n=require("./util"),H=require("./thread-map"),q=require("../../lib/runtime-paths"),W=require("./hooks/verify-gate"),D=require("./hooks/activity-end"),X=require("./hooks/session-start"),Y=require("./hooks/activity-start"),z=require("./hooks/require-verification"),Q=require("./hooks/require-verdict"),Z=require("./hooks/clear-verdict"),j=require("./hooks/track-action"),ee=require("./hooks/track-action-monitor"),oe=require("./hooks/track-action-pre"),ne=require("./hooks/subagent-start"),te=require("./hooks/subagent-stop");const B="~/.ironbee/projects",E="browser-devtools",A="node-devtools",_="backend-devtools",I="android-devtools",ge="ironbee",$="ironbee-verifier",V=30,L="Verifies recent code changes through real browser/runtime/backend tools and submits the IronBee verdict. Spawn this custom agent (by agent_type) after editing code to run the verification cycle out-of-band \u2014 it drives the devtools tools, judges the result, and records the verdict in the shared session. It does NOT edit code.",x="ironbee-scenario",J=["ironbee-manage-scenario","ironbee-search-scenario","ironbee-sync-scenario"],F="Manages and searches reusable IronBee verification scenarios via the devtools scenario tools. 
Spawn this custom agent (by agent_type) from the scenario slash commands to author/update/delete saved scenarios and find them by name/description/metadata. NOT a verification cycle (running a saved scenario to verify is done via $ironbee-verify scenario:<name>).";function P(f){return(0,m.join)(__dirname,"..",f,"platforms")}S(P,"platformsDirFor");function y(f){return l.pc.dim(f)}S(y,"codexColor");function G(f){return f.hooks.some(e=>e.command.includes(ge))}S(G,"isIronBeeHookGroup");function me(f){const e=Object.keys(f);return e.length===0?!0:e.length===1&&e[0]==="hooks"?Object.keys(f.hooks??{}).length===0:!1}S(me,"isCodexHooksEmpty");class ue{constructor(){this.name="codex";this.supportsVerifierModel=!0}static{S(this,"CodexClient")}detect(e){return(0,s.existsSync)((0,m.join)(e,".agents","skills","ironbee-verify"))}resolveProjectDir(){return process.env.CODEX_PROJECT_DIR??process.env.IRONBEE_PROJECT_DIR??process.cwd()}install(e,o){const r=o??(0,d.loadConfig)(e),i=(0,d.getVerificationMode)(r),t=i!=="monitor",a=(0,d.getCodexVerifierMode)(r);this.cleanupArtifacts(e);const g=(0,n.codexHooksJsonPath)(e);if(this.mergeHooksConfig(g,i,a),this.mergeConfigToml(e,r,t,a),t&&(i==="enforce"&&this.writeAgentsMdBlock(e,r,a),this.writeSkills(e,i==="enforce",r,a),(0,C.syncPlatformSectionsToConfig)(e,P)),(0,U.ensureIronBeeGitignored)(e),console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} hooks ${l.pc.dim("\u2192")} ${l.pc.dim(g)}`),console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} config ${l.pc.dim("\u2192")} ${l.pc.dim((0,n.codexConfigTomlPath)(e))}`),t){const p=a==="main-agent"?`${l.pc.yellow("main-agent")} (the main agent drives the devtools tools directly)`:`${l.pc.bold("sub-agent")} (delegated to the ironbee-verifier custom agent)`;console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} verify ${l.pc.dim("\u2192")} ${p}`)}i==="enforce"?(console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} agents ${l.pc.dim("\u2192")} ${l.pc.dim((0,m.join)(e,"AGENTS.md"))}`),console.log(` 
${l.pc.dim("\u2192")} ${y("[codex]")} skill ${l.pc.dim("\u2192")} ${l.pc.dim((0,m.join)(e,".agents","skills","ironbee-verification","SKILL.md"))}`),console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} command ${l.pc.dim("\u2192")} ${l.pc.dim((0,m.join)(e,".agents","skills","ironbee-verify","SKILL.md"))}`)):i==="assist"?(console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} ${l.pc.yellow("assist mode")} (verification.auto: false) \u2014 manual $ironbee-verify only, no enforcement`),console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} command ${l.pc.dim("\u2192")} ${l.pc.dim((0,m.join)(e,".agents","skills","ironbee-verify","SKILL.md"))}`)):console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} ${l.pc.yellow("monitoring-only mode")} (verification.enable: false)`),console.log(),console.log(` ${l.pc.yellow("\u26A0")} ${l.pc.yellow("Codex requires one-time TUI setup:")}`),console.log(` ${l.pc.yellow("1.")} Run ${l.pc.bold("/hooks")} in a fresh Codex session to review and trust IronBee hooks`),console.log(` ${l.pc.yellow("2.")} Restart any open Codex sessions to pick up new hook config`)}uninstall(e){this.cleanupArtifacts(e),(0,s.existsSync)((0,n.codexHooksJsonPath)(e))||this.removeFeaturesHooksFlag(e),(0,O.pruneEmptyDirs)((0,m.join)(e,".codex"));const o=(0,H.codexThreadMapPath)(e);if((0,s.existsSync)(o))try{(0,s.unlinkSync)(o)}catch(r){b.logger.debug(`failed to remove codex thread map: ${r}`)}console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} removed hooks, MCP entries, AGENTS.md block, and skills`)}removeFeaturesHooksFlag(e){const o=(0,n.codexConfigTomlPath)(e);if((0,s.existsSync)(o))try{const r=(0,s.readFileSync)(o,"utf-8");let i=(0,n.removeFeaturesHooks)(r);i=(0,n.removeSandboxWritableRoot)(i,B),i.trim().length===0?(0,s.unlinkSync)(o):i!==r&&(0,s.writeFileSync)(o,i)}catch(r){b.logger.debug(`failed to strip [features] hooks from config.toml: ${r}`)}}cleanupArtifacts(e){this.migrateAwayFromUserLevel();const 
o=(0,n.codexHooksJsonPath)(e);this.removeIronBeeHooks(o),this.maybeDeleteEmptyHooks(o),this.removeIronBeeMcpServers(e),this.removeVerifierAgentToml(e),this.removeScenarioAgentToml(e);const r=(0,m.join)(e,"AGENTS.md");if((0,s.existsSync)(r))try{const t=(0,s.readFileSync)(r,"utf-8"),a=(0,n.stripAgentsMdBlock)(t);a===null?(0,s.unlinkSync)(r):a!==t&&(0,s.writeFileSync)(r,a)}catch(t){b.logger.debug(`failed to strip AGENTS.md block: ${t}`)}const i=(0,m.join)(e,".agents","skills");this.removeDir((0,m.join)(i,"ironbee-verification")),this.removeDir((0,m.join)(i,"ironbee-verify"));for(const t of J)this.removeDir((0,m.join)(i,t));this.removeDir((0,m.join)(i,"ironbee-run-scenario")),(0,O.pruneEmptyDirs)((0,m.join)(e,".agents"))}async runVerifyGate(e){await(0,W.run)(e)}async runActivityEnd(e){await(0,D.run)(e)}async runSessionStart(e){await(0,X.run)(e)}async runActivityStart(e){await(0,Y.run)(e)}async runRequireVerification(e,o){await(0,z.run)(e,o)}async runRequireVerdict(e,o){await(0,Q.run)(e,o)}async runClearVerdict(e){await(0,Z.run)(e)}async runTrackAction(e){await(0,j.run)(e)}async runTrackActionMonitor(e){await(0,ee.run)(e)}async runTrackActionPre(e){await(0,oe.run)(e)}async runSubagentStart(e){await(0,ne.run)(e)}async runSubagentStop(e){await(0,te.run)(e)}resolveAgentSessionId(e,o){const r=process.env.CODEX_THREAD_ID;if(typeof r=="string"&&r.length>0&&o)return(0,H.lookupThreadSession)(o,r)}async runSessionEnd(e){b.logger.debug("session-end: no-op on Codex (no SessionEnd hook event)")}mergeHooksConfig(e,o,r){const i=o!=="monitor",t=o==="assist"?" 
--soft":"";(0,s.mkdirSync)((0,m.dirname)(e),{recursive:!0});let a={hooks:{}};if((0,s.existsSync)(e))try{a=JSON.parse((0,s.readFileSync)(e,"utf-8")),a.hooks||(a.hooks={})}catch(v){b.logger.debug(`failed to parse ${e}: ${v}`),a={hooks:{}}}for(const v of Object.keys(a.hooks)){const c=a.hooks[v].filter(h=>!G(h));c.length===0?delete a.hooks[v]:a.hooks[v]=c}const g=S((v,c,h)=>{a.hooks[v]||(a.hooks[v]=[]),a.hooks[v].push({matcher:c,hooks:[{type:"command",command:h}]})},"addGroup");g("SessionStart",".*","ironbee hook session-start --client codex"),g("UserPromptSubmit",".*","ironbee hook activity-start --client codex"),g("PreToolUse",".*","ironbee hook track-action-pre --client codex"),i&&(g("PreToolUse","^mcp__(browser|node|backend|android)[-_]devtools__.*",`ironbee hook require-verification --client codex${t}`),g("PreToolUse","^apply_patch$",`ironbee hook require-verdict --client codex${t}`),g("PostToolUse","^apply_patch$","ironbee hook clear-verdict --client codex"),r==="sub-agent"&&g("SubagentStart",".*","ironbee hook subagent-start --client codex")),g("SubagentStop",".*","ironbee hook subagent-stop --client codex"),g("PostToolUse",".*",i?"ironbee hook track-action --client codex":"ironbee hook track-action-monitor --client codex"),g("Stop",".*",o==="enforce"?"ironbee hook verify-gate --client codex":"ironbee hook activity-end --client codex"),(0,s.writeFileSync)(e,JSON.stringify(a,null,2))}removeIronBeeHooks(e){if((0,s.existsSync)(e))try{const o=(0,s.readFileSync)(e,"utf-8"),r=JSON.parse(o);if(!r.hooks)return;let i=!1;for(const t of Object.keys(r.hooks)){const a=r.hooks[t].filter(g=>!G(g));a.length!==r.hooks[t].length&&(i=!0),a.length===0?delete r.hooks[t]:r.hooks[t]=a}i&&(0,s.writeFileSync)(e,JSON.stringify(r,null,2))}catch(o){b.logger.debug(`failed to strip IronBee hooks from ${e}: ${o}`)}}maybeDeleteEmptyHooks(e){if((0,s.existsSync)(e))try{const o=JSON.parse((0,s.readFileSync)(e,"utf-8"));me(o)&&(0,s.unlinkSync)(e)}catch(o){b.logger.debug(`failed to inspect ${e} for 
emptiness: ${o}`)}}mergeConfigToml(e,o,r,i){(0,s.mkdirSync)((0,m.join)(e,".codex"),{recursive:!0});let t=(0,n.readCodexConfigToml)(e);if(t=(0,n.ensureFeaturesHooksTrue)(t),t=r&&(0,q.resolveRuntimeLocation)(e)==="external"?(0,n.ensureSandboxWritableRoot)(t,B):(0,n.removeSandboxWritableRoot)(t,B),t=(0,n.removeMcpServer)(t,E),t=(0,n.removeMcpServer)(t,A),t=(0,n.removeMcpServer)(t,_),t=(0,n.removeMcpServer)(t,I),r&&i==="main-agent"){t=this.upsertSessionMcpServers(t,e,o),t=(0,n.removeAgentsTable)(t,$),t=(0,n.removeAgentsTable)(t,x),t=(0,n.removeMultiAgentV2SpawnMetadata)(t),this.removeVerifierAgentToml(e),this.removeScenarioAgentToml(e),(0,n.writeCodexConfigToml)(e,t);return}if(r){const g=(0,d.getVerificationModel)(o,"codex"),p=(0,s.existsSync)((0,n.userCodexConfigTomlPath)())?(0,s.readFileSync)((0,n.userCodexConfigTomlPath)(),"utf-8"):"",u=(0,n.extractTomlTopLevelModel)(t)===null&&(0,n.extractTomlTopLevelModel)(p)===null;g===void 0&&u&&console.log(` ${l.pc.dim("\u2192")} ${y("[codex]")} ${l.pc.yellow("\u26A0 no model for the verifier")} \u2014 the ${l.pc.bold("ironbee-verifier")} sub-agent inherits the session model, but neither this project's .codex/config.toml nor ~/.codex/config.toml has a top-level ${l.pc.bold("model")}, so it may fail to spawn ("could not resolve the child model"). 
Fix: set ${l.pc.bold("model")} in ~/.codex/config.toml, or set ${l.pc.bold("verification.model")} in your ironbee config.`),this.writeVerifierAgentToml(e,o,g),t=(0,n.upsertAgentsTable)(t,$,[`description = ${JSON.stringify(L)}`,`config_file = ${JSON.stringify(`agents/${$}.toml`)}`]),t=(0,n.ensureMultiAgentV2SpawnMetadataExposed)(t),this.writeScenarioAgentToml(e,o,g),t=(0,n.upsertAgentsTable)(t,x,[`description = ${JSON.stringify(F)}`,`config_file = ${JSON.stringify(`agents/${x}.toml`)}`])}else t=(0,n.removeAgentsTable)(t,$),t=(0,n.removeAgentsTable)(t,x),t=(0,n.removeMultiAgentV2SpawnMetadata)(t),this.removeVerifierAgentToml(e),this.removeScenarioAgentToml(e);(0,n.writeCodexConfigToml)(e,t)}writeVerifierAgentToml(e,o,r){this.writeCustomAgentToml(e,o,r,$,L,"skill","read-only")}writeScenarioAgentToml(e,o,r){this.writeCustomAgentToml(e,o,r,x,F,"scenario","read-only")}writeCustomAgentToml(e,o,r,i,t,a,g){const p=(0,m.join)(__dirname,"agents",`${i}.md`);let u;try{u=(0,s.readFileSync)(p,"utf-8")}catch(k){b.logger.debug(`failed to read agent source ${p}: ${k}`);return}const v=P("codex");for(const k of d.ALL_CYCLES){const w=(0,d.isCycleEnabled)(o,k)?ie=>{const N=(0,m.join)(v,(0,C.fragmentFilename)(a,k,ie));return(0,s.existsSync)(N)?(0,s.readFileSync)(N,"utf-8").trimEnd():null}:null;u=(0,C.applyPlatformSection)(u,k,w,`${i}.toml`)}const c=[];c.push(`name = ${JSON.stringify(i)}`),c.push(`description = ${JSON.stringify(t)}`),c.push(`sandbox_mode = ${JSON.stringify(g)}`),r&&c.push(`model = ${JSON.stringify(r)}`),c.push("developer_instructions = '''"),c.push(u.replace(/'''/g,"```").trimEnd()),c.push("'''");const h=S((k,T,w)=>{k&&(c.push(""),c.push(`[mcp_servers.${T}]`),c.push(...K(w)),c.push(`startup_timeout_sec = ${V}`),c.push("required = true"),c.push('default_tools_approval_mode = 
"approve"'))},"addCycle");h((0,d.isCycleEnabled)(o,"browser"),E,(0,d.getMcpServerEntry)(e)),h((0,d.isCycleEnabled)(o,"node"),A,(0,d.getNodeDevToolsMcpEntry)(e)),h((0,d.isCycleEnabled)(o,"backend"),_,(0,d.getBackendDevToolsMcpEntry)(e)),h((0,d.isCycleEnabled)(o,"android"),I,(0,d.getAndroidDevToolsMcpEntry)(e));const R=(0,n.codexAgentTomlPath)(e,i);(0,s.mkdirSync)((0,m.dirname)(R),{recursive:!0}),(0,s.writeFileSync)(R,c.join(`
+ "use strict";var P=Object.defineProperty;var le=Object.getOwnPropertyDescriptor;var ce=Object.getOwnPropertyNames;var de=Object.prototype.hasOwnProperty;var S=(f,e)=>P(f,"name",{value:e,configurable:!0});var me=(f,e)=>{for(var o in e)P(f,o,{get:e[o],enumerable:!0})},ge=(f,e,o,r)=>{if(e&&typeof e=="object"||typeof e=="function")for(let i of ce(e))!de.call(f,i)&&i!==o&&P(f,i,{get:()=>e[i],enumerable:!(r=le(e,i))||r.enumerable});return f};var ue=f=>ge(P({},"__esModule",{value:!0}),f);var be={};me(be,{CodexClient:()=>pe});module.exports=ue(be);var s=require("fs"),q=require("os"),m=require("path"),W=require("../../lib/gitignore"),p=require("../../lib/logger"),c=require("../../lib/output"),N=require("../../lib/fs-prune"),D=require("../../lib/headless"),l=require("../../lib/config"),C=require("../../lib/platform-section"),n=require("./util"),B=require("./thread-map"),X=require("../../lib/runtime-paths"),Y=require("./hooks/verify-gate"),z=require("./hooks/activity-end"),Q=require("./hooks/session-start"),Z=require("./hooks/activity-start"),j=require("./hooks/require-verification"),ee=require("./hooks/require-verdict"),oe=require("./hooks/clear-verdict"),ne=require("./hooks/track-action"),te=require("./hooks/track-action-monitor"),ie=require("./hooks/track-action-pre"),re=require("./hooks/subagent-start"),se=require("./hooks/subagent-stop");const O="~/.ironbee/projects",w="browser-devtools",A="node-devtools",_="backend-devtools",I="android-devtools",R="terminal-devtools",fe="ironbee",$="ironbee-verifier",L=30,J="Verifies recent code changes through real browser/runtime/backend tools and submits the IronBee verdict. Spawn this custom agent (by agent_type) after editing code to run the verification cycle out-of-band \u2014 it drives the devtools tools, judges the result, and records the verdict in the shared session. 
It does NOT edit code.",x="ironbee-scenario",F=["ironbee-manage-scenario","ironbee-search-scenario","ironbee-sync-scenario"],G="Manages and searches reusable IronBee verification scenarios via the devtools scenario tools. Spawn this custom agent (by agent_type) from the scenario slash commands to author/update/delete saved scenarios and find them by name/description/metadata. NOT a verification cycle (running a saved scenario to verify is done via $ironbee-verify scenario:<name>).";function H(f){return(0,m.join)(__dirname,"..",f,"platforms")}S(H,"platformsDirFor");function k(f){return c.pc.dim(f)}S(k,"codexColor");function K(f){return f.hooks.some(e=>e.command.includes(fe))}S(K,"isIronBeeHookGroup");function ve(f){const e=Object.keys(f);return e.length===0?!0:e.length===1&&e[0]==="hooks"?Object.keys(f.hooks??{}).length===0:!1}S(ve,"isCodexHooksEmpty");class pe{constructor(){this.name="codex";this.supportsVerifierModel=!0}static{S(this,"CodexClient")}detect(e){return(0,s.existsSync)((0,m.join)(e,".agents","skills","ironbee-verify"))}resolveProjectDir(){return process.env.CODEX_PROJECT_DIR??process.env.IRONBEE_PROJECT_DIR??process.cwd()}async runHeadlessPrompt(e,o){const r=(0,s.mkdtempSync)((0,m.join)((0,q.tmpdir)(),"ironbee-codex-")),i=(0,m.join)(r,"last.txt");try{await(0,D.runHeadlessCommand)("codex",["exec","--sandbox","read-only","--skip-git-repo-check","-o",i,e],{cwd:o.projectDir,timeoutMs:o.timeoutMs,signal:o.signal});try{return(0,s.readFileSync)(i,"utf8")}catch{return""}}finally{try{(0,s.rmSync)(r,{recursive:!0,force:!0})}catch{}}}install(e,o){const r=o??(0,l.loadConfig)(e),i=(0,l.getVerificationMode)(r),t=i!=="monitor",a=(0,l.getCodexVerifierMode)(r);this.cleanupArtifacts(e);const g=(0,n.codexHooksJsonPath)(e);if(this.mergeHooksConfig(g,i,a),this.mergeConfigToml(e,r,t,a),t&&(i==="enforce"&&this.writeAgentsMdBlock(e,r,a),this.writeSkills(e,i==="enforce",r,a),(0,C.syncPlatformSectionsToConfig)(e,H)),(0,W.ensureIronBeeGitignored)(e),console.log(` 
${c.pc.dim("\u2192")} ${k("[codex]")} hooks ${c.pc.dim("\u2192")} ${c.pc.dim(g)}`),console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} config ${c.pc.dim("\u2192")} ${c.pc.dim((0,n.codexConfigTomlPath)(e))}`),t){const h=a==="main-agent"?`${c.pc.yellow("main-agent")} (the main agent drives the devtools tools directly)`:`${c.pc.bold("sub-agent")} (delegated to the ironbee-verifier custom agent)`;console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} verify ${c.pc.dim("\u2192")} ${h}`)}i==="enforce"?(console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} agents ${c.pc.dim("\u2192")} ${c.pc.dim((0,m.join)(e,"AGENTS.md"))}`),console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} skill ${c.pc.dim("\u2192")} ${c.pc.dim((0,m.join)(e,".agents","skills","ironbee-verification","SKILL.md"))}`),console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} command ${c.pc.dim("\u2192")} ${c.pc.dim((0,m.join)(e,".agents","skills","ironbee-verify","SKILL.md"))}`)):i==="assist"?(console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} ${c.pc.yellow("assist mode")} (verification.auto: false) \u2014 manual $ironbee-verify only, no enforcement`),console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} command ${c.pc.dim("\u2192")} ${c.pc.dim((0,m.join)(e,".agents","skills","ironbee-verify","SKILL.md"))}`)):console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} ${c.pc.yellow("monitoring-only mode")} (verification.enable: false)`),console.log(),console.log(` ${c.pc.yellow("\u26A0")} ${c.pc.yellow("Codex requires one-time TUI setup:")}`),console.log(` ${c.pc.yellow("1.")} Run ${c.pc.bold("/hooks")} in a fresh Codex session to review and trust IronBee hooks`),console.log(` ${c.pc.yellow("2.")} Restart any open Codex sessions to pick up new hook config`)}uninstall(e){this.cleanupArtifacts(e),(0,s.existsSync)((0,n.codexHooksJsonPath)(e))||this.removeFeaturesHooksFlag(e),(0,N.pruneEmptyDirs)((0,m.join)(e,".codex"));const 
o=(0,B.codexThreadMapPath)(e);if((0,s.existsSync)(o))try{(0,s.unlinkSync)(o)}catch(r){p.logger.debug(`failed to remove codex thread map: ${r}`)}console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} removed hooks, MCP entries, AGENTS.md block, and skills`)}removeFeaturesHooksFlag(e){const o=(0,n.codexConfigTomlPath)(e);if((0,s.existsSync)(o))try{const r=(0,s.readFileSync)(o,"utf-8");let i=(0,n.removeFeaturesHooks)(r);i=(0,n.removeSandboxWritableRoot)(i,O),i.trim().length===0?(0,s.unlinkSync)(o):i!==r&&(0,s.writeFileSync)(o,i)}catch(r){p.logger.debug(`failed to strip [features] hooks from config.toml: ${r}`)}}cleanupArtifacts(e){this.migrateAwayFromUserLevel();const o=(0,n.codexHooksJsonPath)(e);this.removeIronBeeHooks(o),this.maybeDeleteEmptyHooks(o),this.removeIronBeeMcpServers(e),this.removeVerifierAgentToml(e),this.removeScenarioAgentToml(e);const r=(0,m.join)(e,"AGENTS.md");if((0,s.existsSync)(r))try{const t=(0,s.readFileSync)(r,"utf-8"),a=(0,n.stripAgentsMdBlock)(t);a===null?(0,s.unlinkSync)(r):a!==t&&(0,s.writeFileSync)(r,a)}catch(t){p.logger.debug(`failed to strip AGENTS.md block: ${t}`)}const i=(0,m.join)(e,".agents","skills");this.removeDir((0,m.join)(i,"ironbee-verification")),this.removeDir((0,m.join)(i,"ironbee-verify"));for(const t of F)this.removeDir((0,m.join)(i,t));this.removeDir((0,m.join)(i,"ironbee-run-scenario")),(0,N.pruneEmptyDirs)((0,m.join)(e,".agents"))}async runVerifyGate(e){await(0,Y.run)(e)}async runActivityEnd(e){await(0,z.run)(e)}async runSessionStart(e){await(0,Q.run)(e)}async runActivityStart(e){await(0,Z.run)(e)}async runRequireVerification(e,o){await(0,j.run)(e,o)}async runRequireVerdict(e,o){await(0,ee.run)(e,o)}async runClearVerdict(e){await(0,oe.run)(e)}async runTrackAction(e){await(0,ne.run)(e)}async runTrackActionMonitor(e){await(0,te.run)(e)}async runTrackActionPre(e){await(0,ie.run)(e)}async runSubagentStart(e){await(0,re.run)(e)}async runSubagentStop(e){await(0,se.run)(e)}resolveAgentSessionId(e,o){const 
r=process.env.CODEX_THREAD_ID;if(typeof r=="string"&&r.length>0&&o)return(0,B.lookupThreadSession)(o,r)}async runSessionEnd(e){p.logger.debug("session-end: no-op on Codex (no SessionEnd hook event)")}mergeHooksConfig(e,o,r){const i=o!=="monitor",t=o==="assist"?" --soft":"";(0,s.mkdirSync)((0,m.dirname)(e),{recursive:!0});let a={hooks:{}};if((0,s.existsSync)(e))try{a=JSON.parse((0,s.readFileSync)(e,"utf-8")),a.hooks||(a.hooks={})}catch(v){p.logger.debug(`failed to parse ${e}: ${v}`),a={hooks:{}}}for(const v of Object.keys(a.hooks)){const d=a.hooks[v].filter(b=>!K(b));d.length===0?delete a.hooks[v]:a.hooks[v]=d}const g=S((v,d,b)=>{a.hooks[v]||(a.hooks[v]=[]),a.hooks[v].push({matcher:d,hooks:[{type:"command",command:b}]})},"addGroup");g("SessionStart",".*","ironbee hook session-start --client codex"),g("UserPromptSubmit",".*","ironbee hook activity-start --client codex"),g("PreToolUse",".*","ironbee hook track-action-pre --client codex"),i&&(g("PreToolUse","^mcp__(browser|node|backend|android|terminal)[-_]devtools__.*",`ironbee hook require-verification --client codex${t}`),g("PreToolUse","^apply_patch$",`ironbee hook require-verdict --client codex${t}`),g("PostToolUse","^apply_patch$","ironbee hook clear-verdict --client codex"),r==="sub-agent"&&g("SubagentStart",".*","ironbee hook subagent-start --client codex")),g("SubagentStop",".*","ironbee hook subagent-stop --client codex"),g("PostToolUse",".*",i?"ironbee hook track-action --client codex":"ironbee hook track-action-monitor --client codex"),g("Stop",".*",o==="enforce"?"ironbee hook verify-gate --client codex":"ironbee hook activity-end --client codex"),(0,s.writeFileSync)(e,JSON.stringify(a,null,2))}removeIronBeeHooks(e){if((0,s.existsSync)(e))try{const o=(0,s.readFileSync)(e,"utf-8"),r=JSON.parse(o);if(!r.hooks)return;let i=!1;for(const t of Object.keys(r.hooks)){const a=r.hooks[t].filter(g=>!K(g));a.length!==r.hooks[t].length&&(i=!0),a.length===0?delete 
r.hooks[t]:r.hooks[t]=a}i&&(0,s.writeFileSync)(e,JSON.stringify(r,null,2))}catch(o){p.logger.debug(`failed to strip IronBee hooks from ${e}: ${o}`)}}maybeDeleteEmptyHooks(e){if((0,s.existsSync)(e))try{const o=JSON.parse((0,s.readFileSync)(e,"utf-8"));ve(o)&&(0,s.unlinkSync)(e)}catch(o){p.logger.debug(`failed to inspect ${e} for emptiness: ${o}`)}}mergeConfigToml(e,o,r,i){(0,s.mkdirSync)((0,m.join)(e,".codex"),{recursive:!0});let t=(0,n.readCodexConfigToml)(e);if(t=(0,n.ensureFeaturesHooksTrue)(t),t=r&&(0,X.resolveRuntimeLocation)(e)==="external"?(0,n.ensureSandboxWritableRoot)(t,O):(0,n.removeSandboxWritableRoot)(t,O),t=(0,n.removeMcpServer)(t,w),t=(0,n.removeMcpServer)(t,A),t=(0,n.removeMcpServer)(t,_),t=(0,n.removeMcpServer)(t,I),t=(0,n.removeMcpServer)(t,R),r&&i==="main-agent"){t=this.upsertSessionMcpServers(t,e,o),t=(0,n.removeAgentsTable)(t,$),t=(0,n.removeAgentsTable)(t,x),t=(0,n.removeMultiAgentV2SpawnMetadata)(t),this.removeVerifierAgentToml(e),this.removeScenarioAgentToml(e),(0,n.writeCodexConfigToml)(e,t);return}if(r){const g=(0,l.getVerificationModel)(o,"codex"),h=(0,s.existsSync)((0,n.userCodexConfigTomlPath)())?(0,s.readFileSync)((0,n.userCodexConfigTomlPath)(),"utf-8"):"",u=(0,n.extractTomlTopLevelModel)(t)===null&&(0,n.extractTomlTopLevelModel)(h)===null;g===void 0&&u&&console.log(` ${c.pc.dim("\u2192")} ${k("[codex]")} ${c.pc.yellow("\u26A0 no model for the verifier")} \u2014 the ${c.pc.bold("ironbee-verifier")} sub-agent inherits the session model, but neither this project's .codex/config.toml nor ~/.codex/config.toml has a top-level ${c.pc.bold("model")}, so it may fail to spawn ("could not resolve the child model"). 
Fix: set ${c.pc.bold("model")} in ~/.codex/config.toml, or set ${c.pc.bold("verification.model")} in your ironbee config.`),this.writeVerifierAgentToml(e,o,g),t=(0,n.upsertAgentsTable)(t,$,[`description = ${JSON.stringify(J)}`,`config_file = ${JSON.stringify(`agents/${$}.toml`)}`]),t=(0,n.ensureMultiAgentV2SpawnMetadataExposed)(t),this.writeScenarioAgentToml(e,o,g),t=(0,n.upsertAgentsTable)(t,x,[`description = ${JSON.stringify(G)}`,`config_file = ${JSON.stringify(`agents/${x}.toml`)}`])}else t=(0,n.removeAgentsTable)(t,$),t=(0,n.removeAgentsTable)(t,x),t=(0,n.removeMultiAgentV2SpawnMetadata)(t),this.removeVerifierAgentToml(e),this.removeScenarioAgentToml(e);(0,n.writeCodexConfigToml)(e,t)}writeVerifierAgentToml(e,o,r){this.writeCustomAgentToml(e,o,r,$,J,"skill","read-only")}writeScenarioAgentToml(e,o,r){this.writeCustomAgentToml(e,o,r,x,G,"scenario","read-only")}writeCustomAgentToml(e,o,r,i,t,a,g){const h=(0,m.join)(__dirname,"agents",`${i}.md`);let u;try{u=(0,s.readFileSync)(h,"utf-8")}catch(y){p.logger.debug(`failed to read agent source ${h}: ${y}`);return}const v=H("codex");for(const y of l.ALL_CYCLES){const E=(0,l.isCycleEnabled)(o,y)?ae=>{const V=(0,m.join)(v,(0,C.fragmentFilename)(a,y,ae));return(0,s.existsSync)(V)?(0,s.readFileSync)(V,"utf-8").trimEnd():null}:null;u=(0,C.applyPlatformSection)(u,y,E,`${i}.toml`)}const d=[];d.push(`name = ${JSON.stringify(i)}`),d.push(`description = ${JSON.stringify(t)}`),d.push(`sandbox_mode = ${JSON.stringify(g)}`),r&&d.push(`model = ${JSON.stringify(r)}`),d.push("developer_instructions = '''"),d.push(u.replace(/'''/g,"```").trimEnd()),d.push("'''");const b=S((y,T,E)=>{y&&(d.push(""),d.push(`[mcp_servers.${T}]`),d.push(...U(E)),d.push(`startup_timeout_sec = ${L}`),d.push("required = true"),d.push('default_tools_approval_mode = 
"approve"'))},"addCycle");b((0,l.isCycleEnabled)(o,"browser"),w,(0,l.getMcpServerEntry)(e)),b((0,l.isCycleEnabled)(o,"node"),A,(0,l.getNodeDevToolsMcpEntry)(e)),b((0,l.isCycleEnabled)(o,"backend"),_,(0,l.getBackendDevToolsMcpEntry)(e)),b((0,l.isCycleEnabled)(o,"android"),I,(0,l.getAndroidDevToolsMcpEntry)(e)),b((0,l.isCycleEnabled)(o,"terminal"),R,(0,l.getTerminalDevToolsMcpEntry)(e));const M=(0,n.codexAgentTomlPath)(e,i);(0,s.mkdirSync)((0,m.dirname)(M),{recursive:!0}),(0,s.writeFileSync)(M,d.join(`
  `)+`
- `)}upsertSessionMcpServers(e,o,r){let i=e;const t=S((a,g,p)=>{if(!a)return;const u=[...K(p),`startup_timeout_sec = ${V}`,'default_tools_approval_mode = "approve"'];i=(0,n.upsertMcpServer)(i,g,u)},"addCycle");return t((0,d.isCycleEnabled)(r,"browser"),E,(0,d.getMcpServerEntry)(o)),t((0,d.isCycleEnabled)(r,"node"),A,(0,d.getNodeDevToolsMcpEntry)(o)),t((0,d.isCycleEnabled)(r,"backend"),_,(0,d.getBackendDevToolsMcpEntry)(o)),t((0,d.isCycleEnabled)(r,"android"),I,(0,d.getAndroidDevToolsMcpEntry)(o)),i}removeVerifierAgentToml(e){const o=(0,n.codexAgentTomlPath)(e,$);if((0,s.existsSync)(o))try{(0,s.unlinkSync)(o)}catch(r){b.logger.debug(`failed to remove verifier agent toml: ${r}`)}}removeScenarioAgentToml(e){const o=(0,n.codexAgentTomlPath)(e,x);if((0,s.existsSync)(o))try{(0,s.unlinkSync)(o)}catch(r){b.logger.debug(`failed to remove scenario agent toml: ${r}`)}}removeIronBeeMcpServers(e){let o=(0,n.readCodexConfigToml)(e);o&&(o=(0,n.removeMcpServer)(o,E),o=(0,n.removeMcpServer)(o,A),o=(0,n.removeMcpServer)(o,_),o=(0,n.removeMcpServer)(o,I),o=(0,n.removeAgentsTable)(o,$),o=(0,n.removeAgentsTable)(o,x),o=(0,n.removeMultiAgentV2SpawnMetadata)(o),(0,n.writeCodexConfigToml)(e,o))}migrateAwayFromUserLevel(){const e=(0,n.userCodexHooksJsonPath)();this.removeIronBeeHooks(e),this.maybeDeleteEmptyHooks(e);const o=(0,n.userCodexConfigTomlPath)();if((0,s.existsSync)(o))try{let i=(0,s.readFileSync)(o,"utf-8");const t=i;i=(0,n.removeMcpServer)(i,E),i=(0,n.removeMcpServer)(i,A),i=(0,n.removeMcpServer)(i,_),i=(0,n.removeMcpServer)(i,I),i=(0,n.removeAgentsTable)(i,$),i=(0,n.removeMultiAgentV2SpawnMetadata)(i),i!==t&&(0,s.writeFileSync)(o,i)}catch(i){b.logger.debug(`migrate: failed to clean user-level config.toml: ${i}`)}const r=(0,n.userCodexAgentTomlPath)($);if((0,s.existsSync)(r))try{(0,s.unlinkSync)(r)}catch(i){b.logger.debug(`migrate: failed to remove user-level verifier toml: ${i}`)}}writeAgentsMdBlock(e,o,r){const 
i=(0,m.join)(e,"AGENTS.md"),t=r==="main-agent"?"ironbee-verification.main.md":"ironbee-verification.md",a=(0,m.join)(__dirname,"rules",t);let g;try{g=(0,s.readFileSync)(a,"utf-8")}catch(c){b.logger.debug(`failed to read rule source ${a}: ${c}`);return}const p=P("codex");for(const c of d.ALL_CYCLES){const R=(0,d.isCycleEnabled)(o,c)?k=>{const T=(0,m.join)(p,(0,C.fragmentFilename)("rule",c,k));if(!(0,s.existsSync)(T)){const w=k.length>0?`${c}:${k}`:c;return b.logger.debug(`AGENTS.md platform-section ${w}: missing fragment ${T}, using placeholder`),null}return(0,s.readFileSync)(T,"utf-8").trimEnd()}:null;g=(0,C.applyPlatformSection)(g,c,R,"AGENTS.md")}const u=(0,s.existsSync)(i)?(0,s.readFileSync)(i,"utf-8"):"",v=(0,n.upsertAgentsMdBlock)(u,g);(0,s.writeFileSync)(i,v)}writeSkills(e,o,r,i){const t=(0,m.join)(e,".agents","skills"),a=i==="main-agent";if(o){const u=(0,m.join)(t,"ironbee-verification");(0,s.mkdirSync)(u,{recursive:!0});const v=(0,m.join)(__dirname,"skills",a?"ironbee-verification.main.md":"ironbee-verification.md");try{let c=(0,s.readFileSync)(v,"utf-8");a&&(c=this.spliceCycleFragments(c,"skill",r,"ironbee-verification/SKILL.md")),(0,s.writeFileSync)((0,m.join)(u,"SKILL.md"),c)}catch(c){b.logger.debug(`failed to copy skill ${v}: ${c}`)}}const g=(0,m.join)(t,"ironbee-verify");(0,s.mkdirSync)(g,{recursive:!0});const p=(0,m.join)(__dirname,"commands","ironbee-verify",a?"SKILL.main.md":"SKILL.md");try{let u=(0,s.readFileSync)(p,"utf-8");a&&(u=this.spliceCycleFragments(u,"command-verify",r,"ironbee-verify/SKILL.md")),(0,s.writeFileSync)((0,m.join)(g,"SKILL.md"),u)}catch(u){b.logger.debug(`failed to copy verify command ${p}: ${u}`)}for(const u of J){const v=(0,m.join)(t,u);(0,s.mkdirSync)(v,{recursive:!0});const c=(0,m.join)(__dirname,"commands",u,a?"SKILL.main.md":"SKILL.md");try{let h=(0,s.readFileSync)(c,"utf-8");a&&(h=this.spliceCycleFragments(h,"scenario",r,`${u}/SKILL.md`)),(0,s.writeFileSync)((0,m.join)(v,"SKILL.md"),h)}catch(h){b.logger.debug(`failed to 
copy scenario command ${c}: ${h}`)}}}spliceCycleFragments(e,o,r,i){const t=P("codex");let a=e;for(const g of d.ALL_CYCLES){const u=(0,d.isCycleEnabled)(r,g)?v=>{const c=(0,m.join)(t,(0,C.fragmentFilename)(o,g,v));return(0,s.existsSync)(c)?(0,s.readFileSync)(c,"utf-8").trimEnd():null}:null;a=(0,C.applyPlatformSection)(a,g,u,i)}return a}removeDir(e){if((0,s.existsSync)(e))try{(0,s.rmSync)(e,{recursive:!0,force:!0})}catch(o){b.logger.debug(`failed to remove ${e}: ${o}`)}}}function K(f){return(0,n.tomlBodyFromRecord)(f)}S(K,"mcpEntryToTomlBody");0&&(module.exports={CodexClient});
3
+ `)}upsertSessionMcpServers(e,o,r){let i=e;const t=S((a,g,h)=>{if(!a)return;const u=[...U(h),`startup_timeout_sec = ${L}`,'default_tools_approval_mode = "approve"'];i=(0,n.upsertMcpServer)(i,g,u)},"addCycle");return t((0,l.isCycleEnabled)(r,"browser"),w,(0,l.getMcpServerEntry)(o)),t((0,l.isCycleEnabled)(r,"node"),A,(0,l.getNodeDevToolsMcpEntry)(o)),t((0,l.isCycleEnabled)(r,"backend"),_,(0,l.getBackendDevToolsMcpEntry)(o)),t((0,l.isCycleEnabled)(r,"android"),I,(0,l.getAndroidDevToolsMcpEntry)(o)),t((0,l.isCycleEnabled)(r,"terminal"),R,(0,l.getTerminalDevToolsMcpEntry)(o)),i}removeVerifierAgentToml(e){const o=(0,n.codexAgentTomlPath)(e,$);if((0,s.existsSync)(o))try{(0,s.unlinkSync)(o)}catch(r){p.logger.debug(`failed to remove verifier agent toml: ${r}`)}}removeScenarioAgentToml(e){const o=(0,n.codexAgentTomlPath)(e,x);if((0,s.existsSync)(o))try{(0,s.unlinkSync)(o)}catch(r){p.logger.debug(`failed to remove scenario agent toml: ${r}`)}}removeIronBeeMcpServers(e){let o=(0,n.readCodexConfigToml)(e);o&&(o=(0,n.removeMcpServer)(o,w),o=(0,n.removeMcpServer)(o,A),o=(0,n.removeMcpServer)(o,_),o=(0,n.removeMcpServer)(o,I),o=(0,n.removeMcpServer)(o,R),o=(0,n.removeAgentsTable)(o,$),o=(0,n.removeAgentsTable)(o,x),o=(0,n.removeMultiAgentV2SpawnMetadata)(o),(0,n.writeCodexConfigToml)(e,o))}migrateAwayFromUserLevel(){const e=(0,n.userCodexHooksJsonPath)();this.removeIronBeeHooks(e),this.maybeDeleteEmptyHooks(e);const o=(0,n.userCodexConfigTomlPath)();if((0,s.existsSync)(o))try{let i=(0,s.readFileSync)(o,"utf-8");const t=i;i=(0,n.removeMcpServer)(i,w),i=(0,n.removeMcpServer)(i,A),i=(0,n.removeMcpServer)(i,_),i=(0,n.removeMcpServer)(i,I),i=(0,n.removeMcpServer)(i,R),i=(0,n.removeAgentsTable)(i,$),i=(0,n.removeMultiAgentV2SpawnMetadata)(i),i!==t&&(0,s.writeFileSync)(o,i)}catch(i){p.logger.debug(`migrate: failed to clean user-level config.toml: ${i}`)}const r=(0,n.userCodexAgentTomlPath)($);if((0,s.existsSync)(r))try{(0,s.unlinkSync)(r)}catch(i){p.logger.debug(`migrate: failed to 
remove user-level verifier toml: ${i}`)}}writeAgentsMdBlock(e,o,r){const i=(0,m.join)(e,"AGENTS.md"),t=r==="main-agent"?"ironbee-verification.main.md":"ironbee-verification.md",a=(0,m.join)(__dirname,"rules",t);let g;try{g=(0,s.readFileSync)(a,"utf-8")}catch(d){p.logger.debug(`failed to read rule source ${a}: ${d}`);return}const h=H("codex");for(const d of l.ALL_CYCLES){const M=(0,l.isCycleEnabled)(o,d)?y=>{const T=(0,m.join)(h,(0,C.fragmentFilename)("rule",d,y));if(!(0,s.existsSync)(T)){const E=y.length>0?`${d}:${y}`:d;return p.logger.debug(`AGENTS.md platform-section ${E}: missing fragment ${T}, using placeholder`),null}return(0,s.readFileSync)(T,"utf-8").trimEnd()}:null;g=(0,C.applyPlatformSection)(g,d,M,"AGENTS.md")}const u=(0,s.existsSync)(i)?(0,s.readFileSync)(i,"utf-8"):"",v=(0,n.upsertAgentsMdBlock)(u,g);(0,s.writeFileSync)(i,v)}writeSkills(e,o,r,i){const t=(0,m.join)(e,".agents","skills"),a=i==="main-agent";if(o){const u=(0,m.join)(t,"ironbee-verification");(0,s.mkdirSync)(u,{recursive:!0});const v=(0,m.join)(__dirname,"skills",a?"ironbee-verification.main.md":"ironbee-verification.md");try{let d=(0,s.readFileSync)(v,"utf-8");a&&(d=this.spliceCycleFragments(d,"skill",r,"ironbee-verification/SKILL.md")),(0,s.writeFileSync)((0,m.join)(u,"SKILL.md"),d)}catch(d){p.logger.debug(`failed to copy skill ${v}: ${d}`)}}const g=(0,m.join)(t,"ironbee-verify");(0,s.mkdirSync)(g,{recursive:!0});const h=(0,m.join)(__dirname,"commands","ironbee-verify",a?"SKILL.main.md":"SKILL.md");try{let u=(0,s.readFileSync)(h,"utf-8");a&&(u=this.spliceCycleFragments(u,"command-verify",r,"ironbee-verify/SKILL.md")),(0,s.writeFileSync)((0,m.join)(g,"SKILL.md"),u)}catch(u){p.logger.debug(`failed to copy verify command ${h}: ${u}`)}for(const u of F){const v=(0,m.join)(t,u);(0,s.mkdirSync)(v,{recursive:!0});const d=(0,m.join)(__dirname,"commands",u,a?"SKILL.main.md":"SKILL.md");try{let 
b=(0,s.readFileSync)(d,"utf-8");a&&(b=this.spliceCycleFragments(b,"scenario",r,`${u}/SKILL.md`)),(0,s.writeFileSync)((0,m.join)(v,"SKILL.md"),b)}catch(b){p.logger.debug(`failed to copy scenario command ${d}: ${b}`)}}}spliceCycleFragments(e,o,r,i){const t=H("codex");let a=e;for(const g of l.ALL_CYCLES){const u=(0,l.isCycleEnabled)(r,g)?v=>{const d=(0,m.join)(t,(0,C.fragmentFilename)(o,g,v));return(0,s.existsSync)(d)?(0,s.readFileSync)(d,"utf-8").trimEnd():null}:null;a=(0,C.applyPlatformSection)(a,g,u,i)}return a}removeDir(e){if((0,s.existsSync)(e))try{(0,s.rmSync)(e,{recursive:!0,force:!0})}catch(o){p.logger.debug(`failed to remove ${e}: ${o}`)}}}function U(f){return(0,n.tomlBodyFromRecord)(f)}S(U,"mcpEntryToTomlBody");0&&(module.exports={CodexClient});
@@ -0,0 +1,61 @@
1
+ <!-- Terminal verification is ENABLED for this project. -->
2
+
3
+ ## Terminal Mode (when `terminal.verifyPatterns` matches an edited file)
4
+
5
+ > **Precondition: the change must have terminal-observable behavior.** If the change is a web-only UI with no command-line / REPL / TUI surface, this section does NOT apply — `tdt_*` tools spawn a program attached to a PTY. Just do browser verification.
6
+
7
+ If the project has terminal verification enabled (`ironbee terminal enable` once at setup) and your edits touch matching paths, the Stop hook also enforces a terminal cycle. The same `verification-start` covers both cycles; one platform-agnostic verdict covers both.
8
+
9
+ ### Mode behavior (terminal cycle)
10
+ - **default** (no arg or `default`): exercise only the commands / code paths your diff touched.
11
+ - **full**: exercise every terminal-reachable code path from files matching `terminal.verifyPatterns`.
12
+ - `visual` / `functional`: browser-only modes; terminal cycle behaves as `default` when they are passed.
13
+
14
+ ### Steps (run within step 3 of the Universal steps above)
15
+ 1. **Pick an evidence path** for the changed code:
16
+ - **Run-evidence** (proves a non-interactive command works): run the affected command one-shot with `mcp__terminal-devtools__tdt_pty_run` — it spawns the command attached to a PTY, runs it to completion, and returns the FULL output plus exit code. Confirm the output shows the expected result AND the exit code matches expectation. Best for CLIs, build targets, scripts, and test runs.
17
+ - **Interactive-evidence** (proves a REPL / shell / TUI change works):
18
+ - Spawn the program: `mcp__terminal-devtools__tdt_pty_start` (returns a `paneId`).
19
+ - Drive input: `mcp__terminal-devtools__tdt_interaction_send-keys` (tmux key syntax — `Enter`, `C-c`, `Up`, `Tab`, …) and `mcp__terminal-devtools__tdt_interaction_send-text` (literal text).
20
+ - Synchronize before reading: `mcp__terminal-devtools__tdt_sync_wait-for` (block until the expected output appears — prefer over delays).
21
+ - Capture output: `mcp__terminal-devtools__tdt_content_capture` — `mode: stream` for line-oriented programs (REPLs, shells; incremental `since` cursor reads only new lines), `mode: screen` for full-screen TUIs. Confirm it shows the expected result.
22
+ - Stop the pane: `mcp__terminal-devtools__tdt_pty_stop`.
23
+ - Auxiliary (NOT gate evidence): `mcp__terminal-devtools__tdt_sync_wait-for-idle`, `mcp__terminal-devtools__tdt_content_get-cursor`, `mcp__terminal-devtools__tdt_pty_resize`, `mcp__terminal-devtools__tdt_pty_signal`, `mcp__terminal-devtools__tdt_pty_list`.
24
+ 2. **Submit verdict** — platform-agnostic, just status + checks (+ issues/fixes).
25
+
26
+ ### Verdict (platform-agnostic)
27
+ ```json
28
+ {
29
+ "session_id": "...",
30
+ "status": "pass",
31
+ "checks": ["`mycli build` exits 0 with the new summary line", "REPL `:help` lists the new command"]
32
+ }
33
+ ```
34
+
35
+ For a multi-cycle pass, both browser and terminal pass criteria must hold.
36
+
37
+ ---
38
+
39
+ ## Default Mode (terminal cycle)
40
+
41
+ Focus on the commands or code paths your diff touched — not the entire program.
42
+
43
+ ### 1. Study the changes
44
+ 1. Run `git diff --name-only` and `git diff --name-only HEAD~1`
45
+ 2. **Ignore `.ironbee/`, `.claude/`, `.cursor/`** — tool config, not application code
46
+ 3. **Read the full diff** for every terminal file in scope — note new commands, changed flags, new output lines, changed exit codes, new REPL/TUI behavior
47
+ 4. Before spawning, identify: which command / subcommand / REPL command / TUI screen is affected? What input exercises it? What output / exit code proves it works?
48
+
49
+ ### 2. Verify against the running program
50
+ - **Run-evidence**: run the affected command via `tdt_pty_run`; the output must show the expected result and the exit code must match expectation
51
+ - **Interactive-evidence**: spawn the program, drive the affected input flow (send-keys / send-text), wait for the expected output (`tdt_sync_wait-for`), and capture it (`tdt_content_capture`) — the capture must show the expected state after your change
52
+
53
+ ---
54
+
55
+ ## Full Mode (`$ironbee-verify full`, terminal cycle)
56
+
57
+ Verify every terminal-reachable code path from files matching `terminal.verifyPatterns`, not just the changed files. Do NOT run `git diff` or scope to recent changes.
58
+
59
+ - Exercise every command / subcommand / REPL command / TUI screen in scope
60
+ - Drive at least one happy-path flow AND one error-path flow per command (confirm both the success output/exit `0` and the expected failure output/non-zero exit)
61
+ - Capture output (run-evidence or interactive-evidence) for each path; no unexpected crashes or stack traces
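The mode rules in the file above (default when no argument or `default` is given, `full` for everything in scope, and `visual` / `functional` falling back to default for the terminal cycle) can be sketched as a small lookup. This is only an illustration: the function name `terminalCycleMode` and the `null` return for arguments outside those four modes are assumptions, not part of the package.

```javascript
// Hypothetical helper (not from @ironbee-ai/cli): resolve the verify argument
// to the mode the terminal cycle effectively runs in.
function terminalCycleMode(arg) {
  // No argument, or the literal "default", selects default mode.
  if (arg === undefined || arg === "" || arg === "default") return "default";
  // "full" exercises every terminal-reachable path in scope.
  if (arg === "full") return "full";
  // "visual" and "functional" are browser-only; the terminal cycle
  // behaves as "default" when they are passed.
  if (arg === "visual" || arg === "functional") return "default";
  // Anything else is outside this sketch (assumption: signal it with null).
  return null;
}

console.log(terminalCycleMode("visual"));
```

The fallback for `visual` / `functional` is kept explicit rather than folded into a catch-all, so an unrecognized argument is not silently treated as default.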
@@ -0,0 +1,31 @@
1
+ <!-- Terminal verification is ENABLED for this project. The Stop hook
2
+ enforces a terminal cycle whenever an edited file matches
3
+ `terminal.verifyPatterns`. -->
4
+
5
+ ## Terminal cycle
6
+
7
+ Changes to terminal files ALSO require verification through the **terminal-devtools** MCP server (prefix `tdt_`) IF the file matches `terminal.verifyPatterns`. Terminal-cycle verification means spawning the affected program attached to a PTY and confirming its behavior: either running the command one-shot and checking its output and exit code, OR driving an interactive session (REPL / shell / TUI) and capturing the rendered output.
8
+
9
+ Both cycles can be active simultaneously (e.g. you edit both a React component and a CLI command in the same task). One `verification-start` covers all active cycles; one platform-agnostic verdict covers them all; one retry counter applies globally.
10
+
11
+ ### ⚠️ `terminal-devtools` is ONLY for terminal-observable behavior
12
+
13
+ `terminal-devtools` drives CLIs, REPLs, shells, and TUIs through a PTY. It does NOT apply to web-only UI changes with no command-line surface. If the change produces no terminal-observable output (stdout / stderr / exit code / rendered TUI), do NOT call `tdt_*` tools — use the browser cycle for web-only projects.
14
+
15
+ **Misconfiguration recovery.** If the terminal cycle is enforced for a change that has no terminal-observable behavior, the operator enabled the terminal cycle by mistake. The Stop hook will keep blocking with `incomplete_tools` for the terminal cycle. Don't attempt to spawn a PTY. Instead, stop and clearly report to the user that this change has no terminal-observable behavior, and ask them to run `ironbee terminal disable` to unblock the gate.
16
+
17
+ ### Terminal-cycle additions to the main flow
18
+
19
+ These attach to the **Required steps** above — they don't replace any step. Numbering follows the main flow:
20
+
21
+ - **Within step 3 (run flow):** also run the terminal flow by picking ONE evidence path:
22
+ - **Run-evidence**: run the affected command one-shot (`tdt_pty_run`) and confirm its output AND exit code match expectation
23
+ - **Interactive-evidence**: spawn a pane (`tdt_pty_start`) → drive input (`tdt_interaction_send-keys` / `tdt_interaction_send-text`) → synchronize (`tdt_sync_wait-for`) → capture output (`tdt_content_capture`, `mode: stream` for REPLs/shells, `mode: screen` for TUIs) → stop the pane (`tdt_pty_stop`). Auxiliary only (NOT evidence): `tdt_sync_wait-for-idle`, `tdt_content_get-cursor`, `tdt_pty_resize` / `tdt_pty_signal` / `tdt_pty_list`.
24
+ - **Within step 6 (submit verdict):** submit one platform-agnostic verdict with `status` + `checks` (+ `issues`/`fixes` as needed). Terminal-cycle pass criteria: (command ran via `tdt_pty_run` with output + exit code confirmed) OR (pane spawned AND input driven AND output captured showing the expected result).
25
+
26
+ ### Additional BANNED for terminal cycle
27
+
28
+ - Calling `tdt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
29
+ - **Calling `tdt_*` tools when the change has NO terminal-observable behavior.** For web-only projects, use only the browser cycle.
30
+ - Claiming `status: pass` for a terminal cycle when no evidence path was exercised.
31
+ - Claiming `status: pass` on the run-evidence path without confirming the exit code, or on the interactive-evidence path without capturing output that shows the expected result.
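The terminal-cycle pass criteria stated in the rule file above are a disjunction of two conjunctions: the run-evidence path (command ran via `tdt_pty_run` with output and exit code confirmed) or the interactive-evidence path (pane spawned, input driven, output captured showing the expected result). A minimal sketch of that predicate follows. The record shape and every field name are assumptions made for illustration and do not reflect IronBee's internal schema.

```javascript
// Hypothetical evidence record; field names are illustrative only.
function terminalCyclePasses(evidence) {
  // Run-evidence path: one-shot run with BOTH output and exit code confirmed.
  const runPath =
    evidence.ranViaPtyRun &&
    evidence.outputConfirmed &&
    evidence.exitCodeConfirmed;

  // Interactive-evidence path: pane spawned AND input driven AND the
  // captured output shows the expected result.
  const interactivePath =
    evidence.paneSpawned &&
    evidence.inputDriven &&
    evidence.captureShowsExpected;

  return Boolean(runPath || interactivePath);
}

// A run whose exit code was never checked does not qualify.
console.log(terminalCyclePasses({ ranViaPtyRun: true, outputConfirmed: true }));
```

Under these assumptions, a one-shot run with an unconfirmed exit code fails the check, which matches the last banned item in the list above.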
@@ -0,0 +1,36 @@
1
+ ### terminal platform (enabled)
2
+ - **Use for**: CLI / REPL / shell / TUI scenarios driven through a PTY.
3
+ - **Server**: `terminal-devtools` · **scenario tools**: the `tdt_scenario-*` tools
4
+ (`tdt_scenario-add` / `-update` / `-delete` / `-list` / `-search` / `-run`).
5
+ - **Store**: project → `.ironbee/scenarios/tdt`, global → `~/.ironbee/scenarios/tdt` (the
6
+ server's `SCENARIOS_DIR`; you pass `scope`, the server resolves the path).
7
+ - Scenario **scripts** call this platform's tools via `callTool('<bare-tool>', {...})` — discover
8
+ the available `tdt_*` tool names from your connected MCP tool schemas; don't guess.
9
+
10
+ **What to test & how — capture the SAME evidence the verifier would** (a scenario runs FOR
11
+ verification, so its script must collect what the terminal cycle collects). In the script, pick an
12
+ **evidence path** for the changed code area:
13
+ 1. **Run-evidence path** — run the affected command one-shot with `tdt_pty_run` (with
14
+ `returnOutput: true`): it spawns the command attached to a PTY, runs it to completion, and returns
15
+ the FULL output plus exit code. Put the returned output AND exit code in your result; the verifier
16
+ reads them to judge whether the change behaved correctly. Best for non-interactive CLIs, build
17
+ targets, scripts, and test runs.
18
+ 2. **Interactive-evidence path** — drive a live session:
19
+ - Spawn the program: `tdt_pty_start` (returns a `paneId` you reference for the rest of the script).
20
+ - Drive input: `tdt_interaction_send-keys` (tmux key syntax — `Enter`, `C-c`, `Up`, `Tab`, …) and
21
+ `tdt_interaction_send-text` (literal text).
22
+ - **Synchronize before reading** — `tdt_sync_wait-for` to block until the expected output appears
23
+ (prefer over fixed delays).
24
+ - Capture output: `tdt_content_capture` (with `returnOutput: true`) — `mode: stream` for
25
+ line-oriented programs (REPLs, shells; incremental `since` cursor reads only new lines),
26
+ `mode: screen` for full-screen TUIs. Its captured text is what the verifier reads.
27
+ - Stop the pane: `tdt_pty_stop`.
28
+ - Optional helpers (NOT evidence): `tdt_sync_wait-for-idle` (wait until output settles),
29
+ `tdt_content_get-cursor` (read the stream cursor), `tdt_pty_resize` / `tdt_pty_signal` /
30
+ `tdt_pty_list`.
31
+
32
+ `return` the evidence — the captured output text, the exit code (run-evidence) — **plus explicit
33
+ pass/fail assertions**. That returned result is what `$ironbee-verify scenario:<name>` reads to judge
34
+ functional correctness (from the output text and exit code). **`terminal-devtools` has no
35
+ screenshots / video** — there is no visual artifact to capture; the captured text and exit code ARE
36
+ the evidence. **`terminal-devtools` is for terminal-observable behavior only.**
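The closing requirement of the fragment above, returning the captured output and exit code plus explicit pass/fail assertions, could look roughly like the sketch below for the run-evidence path. It deliberately leaves out the tool call itself, since argument names should come from the connected MCP tool schemas. The `{ output, exitCode }` input shape, the `expected` object, and the function name `buildRunEvidenceResult` are all assumptions.

```javascript
// Hypothetical: `run` is whatever the one-shot run returned, normalized by
// the script to { output, exitCode }. `expected` is written by the author.
function buildRunEvidenceResult(run, expected) {
  const assertions = [
    {
      name: `exit code is ${expected.exitCode}`,
      pass: run.exitCode === expected.exitCode,
    },
    {
      name: `output contains "${expected.outputIncludes}"`,
      pass: run.output.includes(expected.outputIncludes),
    },
  ];
  return {
    output: run.output,     // raw evidence the verifier reads
    exitCode: run.exitCode, // raw evidence the verifier reads
    assertions,             // explicit pass/fail statements
    pass: assertions.every((a) => a.pass),
  };
}

// Illustrative values only; the output text is made up for the example.
const result = buildRunEvidenceResult(
  { output: "build finished\nsummary: 3 targets", exitCode: 0 },
  { exitCode: 0, outputIncludes: "summary:" }
);
console.log(result.pass, result.assertions.map((a) => a.name));
```

Returning the raw output alongside the assertions keeps the evidence inspectable even when an assertion is written too loosely.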
@@ -6,7 +6,7 @@
6
6
 
7
7
  > **Recording (only when `recording.enable` is on in config):** the gate blocks every other browser tool until you first call `mcp__browser-devtools__bdt_content_start-recording`, and `submit-verdict` rejects with `"recording is still active"` unless you call `mcp__browser-devtools__bdt_content_stop-recording` after the steps below. **Treat start/stop as bookends around steps 1-5.** The same is enforced as step 6 of the Universal flow.
8
8
 
9
- 1. **Navigate**: `mcp__browser-devtools__bdt_navigation_go-to` — go to the affected page(s)
9
+ 1. **Navigate**: `mcp__browser-devtools__bdt_navigation_go-to` — go to the affected page(s) **AND any downstream page that renders or consumes what the change produces** — verify the change's effect where it's observed, not only the page the edited file owns
10
10
  2. **Interact**: actually exercise what changed — click buttons, fill forms, submit data, trigger workflows. Don't just look at the page.
11
11
  3. **Screenshot**: `mcp__browser-devtools__bdt_content_take-screenshot` — capture the final visual state
12
12
  4. **Accessibility**: `mcp__browser-devtools__bdt_a11y_take-aria-snapshot` — verify page structure