@ironbee-ai/cli 0.10.0 → 0.10.2
- package/CHANGELOG.md +15 -1
- package/README.md +15 -40
- package/dist/analysis/scoring.d.ts.map +1 -1
- package/dist/analysis/scoring.js +6 -4
- package/dist/analysis/scoring.js.map +1 -1
- package/dist/analysis/verdict-details.d.ts +5 -5
- package/dist/analysis/verdict-details.d.ts.map +1 -1
- package/dist/analysis/verdict-details.js +5 -5
- package/dist/analysis/verdict-details.js.map +1 -1
- package/dist/analysis/verification-quality.d.ts +4 -3
- package/dist/analysis/verification-quality.d.ts.map +1 -1
- package/dist/analysis/verification-quality.js +7 -44
- package/dist/analysis/verification-quality.js.map +1 -1
- package/dist/analytics/emit.d.ts.map +1 -1
- package/dist/analytics/emit.js +7 -6
- package/dist/analytics/emit.js.map +1 -1
- package/dist/analytics/projection.d.ts +12 -10
- package/dist/analytics/projection.d.ts.map +1 -1
- package/dist/analytics/projection.js +13 -11
- package/dist/analytics/projection.js.map +1 -1
- package/dist/clients/claude/commands/ironbee-verify.md +5 -5
- package/dist/clients/claude/hooks/require-verdict.js +2 -2
- package/dist/clients/claude/hooks/session-start.d.ts.map +1 -1
- package/dist/clients/claude/hooks/session-start.js +10 -8
- package/dist/clients/claude/hooks/session-start.js.map +1 -1
- package/dist/clients/claude/platforms/command-verify.backend.md +6 -31
- package/dist/clients/claude/platforms/command-verify.browser.md +2 -2
- package/dist/clients/claude/platforms/command-verify.node.md +8 -14
- package/dist/clients/claude/platforms/rule.backend.md +2 -2
- package/dist/clients/claude/platforms/rule.browser.md +1 -1
- package/dist/clients/claude/platforms/rule.node.md +3 -4
- package/dist/clients/claude/platforms/skill.backend.md +10 -41
- package/dist/clients/claude/platforms/skill.browser.md +4 -7
- package/dist/clients/claude/platforms/skill.node.md +11 -26
- package/dist/clients/claude/rules/ironbee-verification.md +3 -4
- package/dist/clients/claude/skills/ironbee-verification.md +8 -6
- package/dist/clients/cursor/commands/ironbee-verify/SKILL.md +5 -5
- package/dist/clients/cursor/hooks/require-verdict.js +2 -2
- package/dist/clients/cursor/hooks/session-start.d.ts.map +1 -1
- package/dist/clients/cursor/hooks/session-start.js +10 -8
- package/dist/clients/cursor/hooks/session-start.js.map +1 -1
- package/dist/clients/cursor/platforms/command-verify.backend.md +6 -31
- package/dist/clients/cursor/platforms/command-verify.browser.md +2 -2
- package/dist/clients/cursor/platforms/command-verify.node.md +8 -14
- package/dist/clients/cursor/platforms/rule.backend.md +2 -2
- package/dist/clients/cursor/platforms/rule.browser.md +1 -1
- package/dist/clients/cursor/platforms/rule.node.md +3 -4
- package/dist/clients/cursor/platforms/skill.backend.md +10 -41
- package/dist/clients/cursor/platforms/skill.browser.md +4 -7
- package/dist/clients/cursor/platforms/skill.node.md +11 -26
- package/dist/clients/cursor/rules/ironbee-verification.mdc +3 -4
- package/dist/clients/cursor/skills/ironbee-verification.md +8 -6
- package/dist/commands/analyze.d.ts.map +1 -1
- package/dist/commands/analyze.js +0 -9
- package/dist/commands/analyze.js.map +1 -1
- package/dist/commands/login.js +1 -1
- package/dist/commands/login.js.map +1 -1
- package/dist/commands/status.d.ts.map +1 -1
- package/dist/commands/status.js +0 -4
- package/dist/commands/status.js.map +1 -1
- package/dist/commands/verification-toggle.d.ts +7 -5
- package/dist/commands/verification-toggle.d.ts.map +1 -1
- package/dist/commands/verification-toggle.js +7 -5
- package/dist/commands/verification-toggle.js.map +1 -1
- package/dist/commands/verification.d.ts +24 -0
- package/dist/commands/verification.d.ts.map +1 -0
- package/dist/commands/verification.js +65 -0
- package/dist/commands/verification.js.map +1 -0
- package/dist/commands/verify.d.ts.map +1 -1
- package/dist/commands/verify.js +1 -34
- package/dist/commands/verify.js.map +1 -1
- package/dist/hooks/core/actions.d.ts +12 -54
- package/dist/hooks/core/actions.d.ts.map +1 -1
- package/dist/hooks/core/actions.js +50 -4
- package/dist/hooks/core/actions.js.map +1 -1
- package/dist/hooks/core/submit-verdict.d.ts.map +1 -1
- package/dist/hooks/core/submit-verdict.js +2 -3
- package/dist/hooks/core/submit-verdict.js.map +1 -1
- package/dist/hooks/core/verify-gate.d.ts +10 -6
- package/dist/hooks/core/verify-gate.d.ts.map +1 -1
- package/dist/hooks/core/verify-gate.js +59 -163
- package/dist/hooks/core/verify-gate.js.map +1 -1
- package/dist/import/claude/analytics-runner.js +1 -1
- package/dist/import/claude/analytics-runner.js.map +1 -1
- package/dist/index.js +3 -6
- package/dist/index.js.map +1 -1
- package/dist/lib/collector.d.ts +37 -4
- package/dist/lib/collector.d.ts.map +1 -1
- package/dist/lib/collector.js +68 -8
- package/dist/lib/collector.js.map +1 -1
- package/dist/lib/config.d.ts +6 -4
- package/dist/lib/config.d.ts.map +1 -1
- package/dist/lib/config.js +2 -1
- package/dist/lib/config.js.map +1 -1
- package/dist/lib/platform-section.d.ts +1 -1
- package/dist/lib/platform-section.js +1 -1
- package/package.json +1 -1
- package/dist/commands/disable-verification.d.ts +0 -16
- package/dist/commands/disable-verification.d.ts.map +0 -1
- package/dist/commands/disable-verification.js +0 -39
- package/dist/commands/disable-verification.js.map +0 -1
- package/dist/commands/enable-verification.d.ts +0 -14
- package/dist/commands/enable-verification.d.ts.map +0 -1
- package/dist/commands/enable-verification.js +0 -37
- package/dist/commands/enable-verification.js.map +0 -1
@@ -4,7 +4,7 @@

  > **Precondition: the backend must actually be Node.js.** If you see `pom.xml`, `build.gradle`, `requirements.txt`, `pyproject.toml`, `go.mod`, `Cargo.toml`, etc., this section does NOT apply — `ndt_*` tools won't connect to non-Node processes. Just do browser verification.

- If the project has node backend verification enabled (`ironbee node enable` once at setup, by an operator who confirmed the backend is Node.js) and your edits touch matching paths (e.g. `server/**`, `pages/api/**`), the Stop hook also enforces a Node cycle. The same `verification-start` covers both cycles;
+ If the project has node backend verification enabled (`ironbee node enable` once at setup, by an operator who confirmed the backend is Node.js) and your edits touch matching paths (e.g. `server/**`, `pages/api/**`), the Stop hook also enforces a Node cycle. The same `verification-start` covers both cycles; one platform-agnostic verdict covers both.

  ### Mode behavior (node cycle)
  - **default** (no arg or `default`): probe / log only the code paths your diff touched. Map each changed file → the handler(s) it affects → place probes there.
@@ -16,26 +16,20 @@ If the project has node backend verification enabled (`ironbee node enable` once
  2. **Connect**: `mcp__node-devtools__ndt_debug_connect` with one of `pid` / `processName` / `containerId` / `containerName` / `inspectorPort` / `wsUrl`. Inspector is auto-activated via SIGUSR1 if needed.
  3. **Pick an evidence path** for each changed code path:
  - **Probe path** (proves the code path executed): `mcp__node-devtools__ndt_debug_put-tracepoint` (or `put-logpoint` / `put-exceptionpoint`) at the changed code, exercise the path (e.g. trigger the API call from the browser), then `mcp__node-devtools__ndt_debug_get-probe-snapshots`. At least one probe must come back with `triggered: true`.
- - **Log path** (proves no errors): exercise the path, then `mcp__node-devtools__ndt_debug_get-logs` with the error level filter.
+ - **Log path** (proves no errors): exercise the path, then `mcp__node-devtools__ndt_debug_get-logs` with the error level filter.
  4. **Disconnect** (optional): `mcp__node-devtools__ndt_debug_disconnect`.
- 5. **Submit verdict**
+ 5. **Submit verdict** — platform-agnostic, just status + checks (+ issues/fixes).

- ### Verdict (
+ ### Verdict (platform-agnostic)
  ```json
  {
  "session_id": "...",
  "status": "pass",
- "checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"]
- "node_processes_connected": ["pid:12345 (next-server)"],
- "node_probes_set": [
- { "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
- ],
- "node_probe_snapshots_collected": 1,
- "node_log_errors": []
+ "checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"]
  }
  ```

- For a multi-cycle pass, both browser and node criteria must hold
+ For a multi-cycle pass, both browser and node pass criteria must hold.

  ---

@@ -54,7 +48,7 @@ Focus on the code you changed — not the entire Node service.
  - **Exercise the path end-to-end** (trigger from browser, curl, or the backend cycle if active)
  - **Each touched probe must report `triggered: true`** in `mcp__node-devtools__ndt_debug_get-probe-snapshots`
  - **Check one edge case per new branch** — invalid input, missing field, auth failure, …
- - **Logs** — `mcp__node-devtools__ndt_debug_get-logs` at error level;
+ - **Logs** — `mcp__node-devtools__ndt_debug_get-logs` at error level; no ERROR-level entries are expected for `pass`

  ---

@@ -64,4 +58,4 @@ Probe every Node code path reachable from files matching `node.verifyPatterns`,

  - Place probes at every handler / route / service entry point in scope, plus key internal branch points (early returns, error catches, conditional middleware)
  - Exercise each path with at least one happy-path call AND one failure-path call
- -
+ - No ERROR-level log entries are expected after the full run — any unexpected log error is a fail, regardless of when it was introduced

@@ -16,7 +16,7 @@ These attach to the **Required steps** above — they don't replace any step. Nu
  - **Protocol-call** — identify the affected endpoint(s) → call the matching protocol tool (`bedt_request_http` / `bedt_request_grpc` / `bedt_request_graphql` / `bedt_request_websocket-open` / `bedt_request_replay`) → inspect status / body / `traceId` → chain follow-up calls when verifying side effects. **A 4xx / 5xx response is a normal result, not an error** — only transport failures populate the `error` field.
  - **Log evidence** — `bedt_log_register-source` (file / docker / kubernetes) → `bedt_log_read` (point-in-time, supports `tail` / `since-until` / `pattern` / `level` / `parseJson` + `jsonFilter` / `contextBefore-After` / `select` / `coalesce`) OR `bedt_log_follow` + `bedt_log_get-followed` (streaming). Fit for jobs / queue workers / async handlers, or any case where an external driver is hitting the endpoint and you only need to verify what the server logged. `bedt_log_register-source` is mandatory on this path (the gate counts it as the setup step).
  - **DB evidence** — `bedt_db_connect` (named, `connectionStringEnv` preferred, default readonly) → ONE of `bedt_db_query` / `bedt_db_describe-table` / `bedt_db_list-tables` / `bedt_db_snapshot` (+ optional `bedt_db_diff`) / `bedt_db_get-changes`. Fit for migrations, seed-data changes, query-result regressions, and any code change whose side effect lives in a relational DB. `bedt_db_connect` is mandatory on this path (same anti-fluke rule as `log-evidence` — the connection name is on the wire).
- - **Within step 6 (submit verdict):**
+ - **Within step 6 (submit verdict):** submit one platform-agnostic verdict (`status` + `checks` + optionally `issues` / `fixes`). The gate requires that at least one evidence path (protocol-call, log-evidence, or DB-evidence) was exercised in your `bedt_*` tool calls.

  ### Trace correlation (`o11y_*` is auxiliary, not evidence)

@@ -26,7 +26,7 @@ IronBee already injects the active verification `traceId` into every backend too

  - Calling `bedt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
  - Treating a 4xx / 5xx response as a transport failure when the test was specifically asking for that error condition (e.g. "POST should reject malformed body with 400"). Decide PASS/FAIL based on the test's intent, not the status code's HTTP-class default.
- -
+ - Claiming `status: pass` for a backend cycle without exercising at least one evidence path (no `bedt_request_*` call, no `bedt_log_register-source` + read, no `bedt_db_connect` + read). The gate will reject.
  - Inferring backend behavior by reading code without exercising any evidence path. The cycle is satisfied only by making a real protocol call, reading real logs, or inspecting real DB state on the running service.
  - Reading a pre-existing log source / DB unrelated to your task to fake the log-evidence or db-evidence path. `bedt_log_register-source` and `bedt_db_connect` are required setup steps on those paths so the registration / connection is on the wire.
  - Opening a DB connection with `allowWrites: true` to "set up" verification data without an explicit need (seed / migration). Read-only is the default for a reason — flipping it widens the blast radius if a query goes wrong.

@@ -9,7 +9,7 @@
  Run the **Browser cycle** flow when an edited file activates it: navigate (`bdt_navigation_go-to`) → functionally test → screenshot (`bdt_content_take-screenshot`) → accessibility (`bdt_a11y_take-aria-snapshot`) → console (`bdt_o11y_get-console-messages`). All four are MANDATORY.

  - If `recording.enable` is on, the gate forces `bdt_content_start-recording` BEFORE the steps above and rejects the verdict if you don't call `bdt_content_stop-recording` AFTER them. Always pair start/stop around the steps above.
- -
+ - The verdict you submit carries only semantic judgment (`status`, `checks`, optionally `issues` / `fixes`). The gate enforces that the required `bdt_*` tools were called and that `checks` is non-empty.

  ## Browser-cycle BANNED

@@ -6,7 +6,7 @@

  Backend file changes IF the file matches `node.verifyPatterns` ALSO require verification through the **node-devtools** MCP server (prefix `ndt_`). Node-cycle verification means attaching to the running Node process via the V8 inspector, setting a probe (tracepoint / logpoint / exceptionpoint) at the changed code, exercising the path so the probe fires, and reading snapshots — OR inspecting runtime error logs.

- Both cycles can be active simultaneously (e.g. you edit both a React component and an API handler in the same task). One `verification-start` covers all active cycles; one verdict
+ Both cycles can be active simultaneously (e.g. you edit both a React component and an API handler in the same task). One `verification-start` covers all active cycles; one platform-agnostic verdict covers them all; one retry counter applies globally.

  ### ⚠️ `node-devtools` is ONLY for Node.js backends

@@ -19,11 +19,10 @@ Both cycles can be active simultaneously (e.g. you edit both a React component a
  These attach to the **Required steps** above — they don't replace any step. Numbering follows the main flow:

  - **Within step 3 (run flow):** also run the node flow: connect (`ndt_debug_connect`) → set probe (`ndt_debug_put-tracepoint` / `put-logpoint` / `put-exceptionpoint`) AND exercise + read snapshots (`ndt_debug_get-probe-snapshots`), OR exercise + read logs (`ndt_debug_get-logs`). When both browser and node cycles are active, run BOTH within the same verification cycle.
- - **Within step 6 (submit verdict):**
+ - **Within step 6 (submit verdict):** submit one platform-agnostic verdict with `status` + `checks` (+ `issues`/`fixes` as needed). Node-cycle pass criteria: process connected, probe triggered (or log path used with no ERROR entries).

  ### Additional BANNED for node cycle

  - Calling `ndt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
  - **Calling `ndt_*` tools when the project's backend is NOT Node.js** (Java / Python / Go / Rust / .NET / Ruby / PHP / Elixir / etc.). Use the browser cycle only for non-Node backends.
- - Claiming `status: pass` for a node cycle when no probe triggered AND
- - Submitting a node-only verdict that omits `node_processes_connected` — every node-cycle verdict requires this field non-empty.
+ - Claiming `status: pass` for a node cycle when no probe triggered AND no log path was used — those criteria must hold.

@@ -65,62 +65,31 @@ The IronBee verification cycle already pins a W3C trace id on every backend tool

  ### Submit verdict

-
+ The verdict is platform-agnostic — `status`, `checks`, and (when applicable) `issues` / `fixes`. The gate enforces that AT LEAST one backend evidence path was exercised in your `bedt_*` tool calls (protocol-call OR log-evidence OR DB-evidence) and that `checks` is non-empty.

  ```json
  {
  "session_id": "<sid>",
  "status": "pass",
- "checks": ["POST /api/orders returned 201 with order id", "GET /api/orders/:id reflects new order"]
- "backend_endpoints_called": [
- "POST http://localhost:3000/api/orders",
- "GET http://localhost:3000/api/orders/42"
- ],
- "backend_response_statuses": [201, 200],
- "backend_traces_collected": ["00-1234abcd-...01"]
+ "checks": ["POST /api/orders returned 201 with order id", "GET /api/orders/:id reflects new order"]
  }
  ```

- Or, for a log-evidence path:
-
- ```json
- {
- "session_id": "<sid>",
- "status": "pass",
- "checks": ["api-server logged 'order 42 created' on POST /api/orders", "no ERROR-level lines after the change"],
- "backend_log_sources_read": ["api-server"]
- }
- ```
-
- Or, for a DB-evidence path:
-
- ```json
- {
- "session_id": "<sid>",
- "status": "pass",
- "checks": ["users table has new email_verified column with default false", "row count unchanged after migration"],
- "backend_db_connections_read": ["app"]
- }
- ```
-
- At least one of `backend_endpoints_called`, `backend_log_sources_read`, or `backend_db_connections_read` must be non-empty. The optional protocol-call fields (`backend_response_statuses`, `backend_traces_collected`) are strongly recommended when you used the protocol-call path — same order as `backend_endpoints_called`. Status interpretation is YOUR call: there is no automatic pass-criteria override on this cycle. If the agent claims `status: pass` and the evidence is structurally valid, the gate honors it.
-
  ## Multi-cycle (browser + backend, or browser + node + backend)

- Common case: a feature edit touches a `.tsx` page (browser-cycle) and a `routes/orders.ts` (node-cycle if Node.js, plus backend-cycle for protocol verification when `backend` is enabled). All active cycles must be satisfied for `status: pass`. **Single** `verification-start`, **single** verdict, **single** retry counter cover all of them.
+ Common case: a feature edit touches a `.tsx` page (browser-cycle) and a `routes/orders.ts` (node-cycle if Node.js, plus backend-cycle for protocol verification when `backend` is enabled). All active cycles must be satisfied for `status: pass`. **Single** `verification-start`, **single** verdict, **single** retry counter cover all of them. The verdict shape doesn't change with the number of active cycles — same minimal verdict regardless:

  ```json
  {
  "session_id": "<sid>",
  "status": "pass",
- "
-
-
-
-
-
- "backend_traces_collected": ["00-1234abcd-...01"]
+ "checks": [
+ "checkout page renders",
+ "POST /api/orders returned 201",
+ "tracepoint at handler.ts:42 fired once",
+ "orders table reflects the new row"
+ ]
  }
  ```

- For a multi-cycle pass, EVERY active cycle's pass criteria must hold.
+ For a multi-cycle pass, EVERY active cycle's pass criteria must hold.

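The hunk above says the gate looks for at least one exercised evidence path among the `bedt_*` tool calls. As a rough illustration only, the sketch below classifies a list of tool names into the three documented paths. The function name `backendEvidencePaths` and the idea of passing plain tool-name strings are assumptions made for this example; the package's real gate logic is not shown in this diff and may differ.

```javascript
// Hypothetical helper, NOT IronBee's gate implementation.
// Input: names of bedt_* tools called during one verification cycle.
// Output: which documented evidence paths appear to have been exercised.
function backendEvidencePaths(toolCalls) {
  const called = new Set(toolCalls);
  const paths = [];

  // Protocol-call path: any bedt_request_* tool counts.
  if (toolCalls.some((name) => name.startsWith("bedt_request_"))) {
    paths.push("protocol-call");
  }

  // Log-evidence path: register-source is the mandatory setup step,
  // followed by a point-in-time read or a read of followed output.
  const logRead =
    called.has("bedt_log_read") || called.has("bedt_log_get-followed");
  if (called.has("bedt_log_register-source") && logRead) {
    paths.push("log-evidence");
  }

  // DB-evidence path: connect is mandatory, then one inspection tool.
  const dbReads = [
    "bedt_db_query",
    "bedt_db_describe-table",
    "bedt_db_list-tables",
    "bedt_db_snapshot",
    "bedt_db_get-changes",
  ];
  if (called.has("bedt_db_connect") && dbReads.some((n) => called.has(n))) {
    paths.push("db-evidence");
  }

  return paths;
}
```

Under these assumptions, an empty result would correspond to the case the docs list as banned: claiming `status: pass` without exercising any evidence path.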
@@ -14,18 +14,15 @@

  All four tools are MANDATORY (the Stop hook checks each). Functional interaction is expected for every verification.

- ###
+ ### Verdict fields
+ The verdict is platform-agnostic — you submit only semantic judgment:
+
  ```json
  {
  "session_id": "<sid>",
  "status": "pass",
- "
- "checks": ["form submits successfully", "new item appears in list", "no console errors"],
- "console_errors": 0,
- "network_failures": 0
+ "checks": ["form submits successfully", "new item appears in list", "no console errors"]
  }
  ```

- For `status: "pass"` (browser cycle): `console_errors === 0` AND `network_failures === 0`.
-
  On fail, include `issues`. On pass after a previous fail, include `fixes`.

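The new verdict shape described in this hunk (`session_id`, `status`, non-empty `checks`, `issues` on fail, `fixes` on pass after a previous fail) can be sketched as a small pre-submit check. This is a minimal sketch under assumptions: `verdictProblems` and its `previousStatus` parameter are hypothetical names introduced here, and the checks mirror only what the diffed docs state, not the gate's actual code.

```javascript
// Hypothetical validator for the documented platform-agnostic verdict.
// It only mirrors the rules quoted in the docs; it is not the real gate.
function verdictProblems(verdict, previousStatus) {
  const problems = [];
  const isNonEmptyStringArray = (v) =>
    Array.isArray(v) && v.length > 0 && v.every((s) => typeof s === "string");

  if (typeof verdict.session_id !== "string" || verdict.session_id === "") {
    problems.push("session_id is required");
  }
  if (verdict.status !== "pass" && verdict.status !== "fail") {
    problems.push('status must be "pass" or "fail"');
  }
  // The docs say `checks` is always required and must be non-empty.
  if (!isNonEmptyStringArray(verdict.checks)) {
    problems.push("checks must be a non-empty array");
  }
  // "On fail, include `issues`."
  if (verdict.status === "fail" && !isNonEmptyStringArray(verdict.issues)) {
    problems.push("issues should be included on fail");
  }
  // "On pass after a previous fail, include `fixes`."
  if (
    verdict.status === "pass" &&
    previousStatus === "fail" &&
    !isNonEmptyStringArray(verdict.fixes)
  ) {
    problems.push("fixes should be included on pass after a previous fail");
  }
  return problems;
}
```

A caller might run this before piping the JSON to `ironbee hook submit-verdict` and treat a non-empty result as a reason to fix the verdict first.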
@@ -28,50 +28,35 @@ If you see `pom.xml`, `build.gradle`, `requirements.txt`, `pyproject.toml`, `go.
  - Read collected snapshots: `ndt_debug_get-probe-snapshots`. At least one probe must come back with `triggered: true`.
  - **Log path** (proves no errors during execution):
  - Exercise the path.
- - Read errors: `ndt_debug_get-logs` with the error-level filter.
+ - Read errors: `ndt_debug_get-logs` with the error-level filter.
  4. **Disconnect** (optional): `ndt_debug_disconnect`.

- ###
+ ### Verdict fields
+ The verdict is platform-agnostic — you submit only semantic judgment:
+
  ```json
  {
  "session_id": "<sid>",
  "status": "pass",
- "checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"]
- "node_processes_connected": ["pid:12345 (next-server)"],
- "node_probes_set": [
- { "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
- ],
- "node_probe_snapshots_collected": 1,
- "node_log_errors": []
+ "checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"]
  }
  ```

-
- - If probes were set, at least one must have
- - If only logs were used,
+ Node-cycle pass criteria:
+ - If probes were set, at least one must have triggered (proves the code path executed).
+ - If only logs were used, no ERROR-level entries.
  - If both forms were used, both conditions must hold.

  ## Multi-cycle (browser + node simultaneously)

- Common case: in the same task you edit a `.tsx` component (browser-cycle) and a `server/api/*.ts` handler (node-cycle). Both cycles activate. **Single** `verification-start`, **single** `verdict.json`, **single** retry counter cover both.
-
- Submit ONE verdict carrying fields for every active cycle:
+ Common case: in the same task you edit a `.tsx` component (browser-cycle) and a `server/api/*.ts` handler (node-cycle). Both cycles activate. **Single** `verification-start`, **single** `verdict.json`, **single** retry counter cover both. The verdict shape doesn't change — one verdict regardless of how many cycles ran:

  ```json
  {
  "session_id": "<sid>",
  "status": "pass",
- "
- "checks": ["checkout submits", "POST /api/orders returned 201", "no console errors"],
- "console_errors": 0,
- "network_failures": 0,
- "node_processes_connected": ["pid:12345 (next-server)"],
- "node_probes_set": [
- { "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
- ],
- "node_probe_snapshots_collected": 1,
- "node_log_errors": []
+ "checks": ["checkout submits", "POST /api/orders returned 201", "no console errors"]
  }
  ```

- For a multi-cycle `pass`, BOTH cycles' pass criteria must hold.
+ For a multi-cycle `pass`, BOTH cycles' pass criteria must hold.

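The node-cycle pass criteria added in this hunk combine two conditions: a triggered probe when probes were set, and no ERROR-level entries when the log path was used. The sketch below encodes that reading as a predicate. The function name `nodeCyclePasses` and its input fields (`probes`, `logsRead`, `errorLogCount`) are hypothetical, chosen for illustration; treating "neither path used" as a failure follows the banned-list entry about claiming `pass` with no probe triggered and no log path used.

```javascript
// Hypothetical predicate for the documented node-cycle pass criteria.
// probes:        array of { triggered: boolean } for probes that were set
// logsRead:      whether the log path (ndt_debug_get-logs) was used
// errorLogCount: number of ERROR-level entries seen on the log path
function nodeCyclePasses({ probes = [], logsRead = false, errorLogCount = 0 }) {
  const probesUsed = probes.length > 0;

  // Neither evidence path was used: nothing backs a pass.
  if (!probesUsed && !logsRead) return false;

  // If probes were set, at least one must have triggered.
  const probeOk = !probesUsed || probes.some((p) => p.triggered === true);

  // If logs were used, there must be no ERROR-level entries.
  const logOk = !logsRead || errorLogCount === 0;

  // If both forms were used, both conditions must hold.
  return probeOk && logOk;
}
```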
@@ -18,13 +18,12 @@ Skip start if already running. Fix build errors before proceeding. **Don't guess
  1. Build and start the application if not already running.
  2. **Start verification**: `echo '{"session_id":"<your-session-id>"}' | ironbee hook verification-start` — required before any devtools tool call.
  3. **Run the per-cycle flow for every active cycle** — see the platform sections near the bottom of this file. Multiple cycles can be active in the same Stop run; every one of them must be exercised within this single verification cycle.
- 4. Stop the dev server when done.
+ 4. Stop the dev server when done — every cycle, including the final one.
  5. **Honor any cycle-specific teardown** noted in the platform sections BEFORE submit-verdict.
  6. **IMMEDIATELY submit your verdict** — do NOT edit any code before submitting: `echo '<verdict-json>' | ironbee hook submit-verdict`.
- -
- - Status `pass` is overridden to fail by the gate when evidence doesn't back it.
+ - Platform-agnostic shape: `status`, `checks` (always required); add `issues` on fail; add `fixes` on pass-after-fail. One verdict regardless of how many cycles ran.

- The Stop hook checks tool usage
+ The Stop hook checks tool usage for every active cycle and that the verdict carries non-empty `checks`. After EVERY verification attempt, you MUST submit a verdict before doing anything else — even if it failed. Do not skip to fixing code.

  If verification fails: submit fail verdict → fix the code → re-verify → submit again.

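Step 6 above pipes a JSON verdict into `ironbee hook submit-verdict` through a single-quoted `echo`. A check string that itself contains a single quote would end that quoting early, so the sketch below shows one way to build the line safely. `submitVerdictCommand` is a hypothetical helper written for this example and is not part of the package; only the `echo '<verdict-json>' | ironbee hook submit-verdict` form comes from the docs.

```javascript
// Hypothetical helper: builds the shell line shown in the docs from a
// verdict object, escaping single quotes for POSIX-style shells.
function submitVerdictCommand(verdict) {
  const json = JSON.stringify(verdict);
  // Inside single quotes nothing can be escaped, so each ' becomes:
  // close quote, backslash-escaped quote, reopen quote  ->  '\''
  const quoted = "'" + json.replace(/'/g, "'\\''") + "'";
  return `echo ${quoted} | ironbee hook submit-verdict`;
}

// Example: a fail verdict whose issue text contains an apostrophe.
const line = submitVerdictCommand({
  session_id: "<sid>",
  status: "fail",
  checks: ["checkout submits"],
  issues: ["order total doesn't update"],
});
console.log(line);
```

Some shells' built-in `echo` interprets backslash sequences, so text containing backslashes may need a different transport; that detail is outside what the docs specify.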
@@ -44,16 +44,18 @@ If already running, skip start. If the build fails, fix it before proceeding.
  ```
  Devtools tools are blocked without this.
  3. Build and start the application if not already running.
- 4. **Run the per-cycle flows for every active cycle.** See the platform sections near the bottom of this file — each enabled cycle's section has its own flow steps
- 5. Stop the dev server when verification is complete.
+ 4. **Run the per-cycle flows for every active cycle.** See the platform sections near the bottom of this file — each enabled cycle's section has its own flow steps and mandatory tools. All active cycles must be exercised within this one verification cycle.
+ 5. Stop the dev server when verification is complete (every cycle — including the final one).
  6. **Honor any cycle-specific teardown** noted in the platform sections (e.g. recording stop) BEFORE submitting your verdict.
  7. **Submit your verdict immediately** — do NOT edit any code first:
  ```
  echo '<verdict-json>' | ironbee hook submit-verdict
  ```
- -
- -
- -
+ - Verdict shape is platform-agnostic: `status`, `checks`, optionally `issues` / `fixes`. One verdict regardless of how many cycles ran.
+ - Pass → `{ "session_id": "...", "status": "pass", "checks": [...] }`
+ - Fail → add `"issues": [...]` describing what failed.
+ - Pass after a previous fail → add `"fixes": [...]` describing what was repaired.
+ - **The Stop hook enforces that you called the required tools for every active cycle and that the verdict carries non-empty `checks`.**
  8. If failed → fix → rebuild → go back to step 2 → repeat until pass.

  <!--IRONBEE:PLATFORM:browser-->
@@ -67,7 +69,7 @@ If already running, skip start. If the build fails, fix it before proceeding.

  ## Important
  - **Always submit a verdict after every verification attempt** — both pass AND fail. Fail verdicts are tracked for analytics.
- - The Stop hook checks
+ - The Stop hook checks that the required tools were used for every active cycle and that the verdict carries non-empty `checks`.
  - Submit verdicts via `ironbee hook submit-verdict`, never write `verdict.json` directly.
  - Every code edit (Write/Edit) automatically clears your session's verdict.
  - After 3 failed verification attempts, you may complete but must report unresolved issues.
@@ -6,18 +6,18 @@ disable-model-invocation: true
|
|
|
6
6
|
|
|
7
7
|
# IronBee Verify
|
|
8
8
|
|
|
9
|
-
Verify the current code changes through real tools. The gate runs every cycle that has been wired up for this project, and all active cycles must be satisfied within a single verification cycle for `status: pass`. Each cycle has its own tools
|
|
9
|
+
Verify the current code changes through real tools. The gate runs every cycle that has been wired up for this project, and all active cycles must be satisfied within a single verification cycle for `status: pass`. Each cycle has its own tools and flow — **see the platform sections near the bottom of this file** for which cycles apply and what to call. The verdict shape itself is platform-agnostic (`status`, `checks`, `issues?`, `fixes?`); the gate enforces that you called each cycle's required tools and that `checks` is non-empty.
|
|
10
10
|
|
|
11
11
|
## Universal steps
|
|
12
12
|
|
|
13
13
|
1. **Start verification**: Run `echo '{"session_id":"<your-session-id>"}' | ironbee hook verification-start` via terminal.
|
|
14
14
|
2. **Build and start** the application if not already running.
|
|
15
15
|
3. **For every active cycle, run its flow** as described in the platform sections near the bottom of this file. All active cycles must be exercised within this same verification cycle.
|
|
16
|
-
4. **Stop** the dev server when verification is complete.
|
|
16
|
+
4. **Stop** the dev server when verification is complete (every cycle — including the final one).
|
|
17
17
|
5. **Honor any cycle-specific teardown** noted in the platform sections BEFORE submitting your verdict.
|
|
18
|
-
6. **Submit your verdict** via terminal.
|
|
19
|
-
- Pass: `echo '{"session_id":"...","status":"pass",
|
|
20
|
-
- Fail: `echo '{"session_id":"...","status":"fail",
|
|
18
|
+
6. **Submit your verdict** via terminal. One verdict covers every active cycle:
|
|
19
|
+
- Pass: `echo '{"session_id":"...","status":"pass","checks":["..."]}' | ironbee hook submit-verdict`
|
|
20
|
+
- Fail: `echo '{"session_id":"...","status":"fail","checks":["..."],"issues":["describe what failed"]}' | ironbee hook submit-verdict`
|
|
21
21
|
7. **If failed** → collect ALL issues first (finish testing every active cycle), submit one fail verdict with all issues, then fix everything, rebuild, and re-verify. Do not fix one issue at a time — batch fixes to avoid repeated build/restart cycles.
|
|
22
22
|
8. If pass after a previous fail, include `"fixes"` in the verdict describing what was fixed.
|
|
23
23
|
|
|
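The verdict rules stated in this hunk (required `session_id`, `status`, and a non-empty `checks`, plus `issues` on a fail) can be sketched as a small pre-flight check. This is only an illustration under stated assumptions: `validateVerdict` and `submitCommand` are hypothetical helper names, not part of the IronBee CLI, and the real gate also inspects which tools were called, which this sketch does not model.

```javascript
// Hypothetical helper: checks a verdict object against the rules quoted in
// the hunk above. It is not IronBee's actual gate implementation.
function validateVerdict(verdict) {
  const errors = [];
  if (typeof verdict.session_id !== "string" || verdict.session_id.length === 0) {
    errors.push("session_id is required");
  }
  if (verdict.status !== "pass" && verdict.status !== "fail") {
    errors.push('status must be "pass" or "fail"');
  }
  if (!Array.isArray(verdict.checks) || verdict.checks.length === 0) {
    errors.push("checks must be a non-empty array");
  }
  const hasIssues = Array.isArray(verdict.issues) && verdict.issues.length > 0;
  if (verdict.status === "fail" && !hasIssues) {
    errors.push("issues is required on fail");
  }
  return errors;
}

// Hypothetical helper: renders the terminal command in the same form as
// step 6. Values containing a single quote would need shell escaping,
// which this sketch deliberately leaves out.
function submitCommand(verdict) {
  return `echo '${JSON.stringify(verdict)}' | ironbee hook submit-verdict`;
}
```

Running the check before piping the payload to `ironbee hook submit-verdict` is a convenience only; the hook remains the authority on whether a verdict is accepted.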
@@ -44,8 +44,8 @@ async function run(projectDir) {
|
|
|
44
44
|
permission: "deny",
|
|
45
45
|
agent_message: `BLOCKED: You used verification tools (browser-devtools / node-devtools / backend-devtools) but did not submit a verdict. You MUST submit a verdict (pass or fail) before editing code.
|
|
46
46
|
|
|
47
|
-
Submit your verdict first
|
|
48
|
-
echo '{"session_id":"${sessionId}","status":"fail","checks":[...],"issues":["describe what failed"]
|
|
47
|
+
Submit your verdict first:
|
|
48
|
+
echo '{"session_id":"${sessionId}","status":"fail","checks":["..."],"issues":["describe what failed"]}' | ironbee hook submit-verdict
|
|
49
49
|
|
|
50
50
|
Then you can edit code to fix the issues.`,
|
|
51
51
|
};
|
|
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"session-start.d.ts","sourceRoot":"","sources":["../../../../src/clients/cursor/hooks/session-start.ts"],"names":[],"mappings":"AAAA;;;;;;GAMG;AAuBH,wBAAsB,GAAG,CAAC,UAAU,EAAE,MAAM,GAAG,OAAO,CAAC,IAAI,CAAC,
|
|
1
|
+
{"version":3,"file":"session-start.d.ts","sourceRoot":"","sources":["../../../../src/clients/cursor/hooks/session-start.ts"],"names":[],"mappings":"AAAA;;;;;;GAMG;AAuBH,wBAAsB,GAAG,CAAC,UAAU,EAAE,MAAM,GAAG,OAAO,CAAC,IAAI,CAAC,CA4E3D"}
|
|
@@ -42,23 +42,25 @@ async function run(projectDir) {
|
|
|
42
42
|
};
|
|
43
43
|
await (0, actions_1.appendAction)(actionsFile, entry);
|
|
44
44
|
await (0, session_state_1.reconcileSessionState)(sessionDir, actionsFile, actions_1.appendAction);
|
|
45
|
-
|
|
45
|
+
const verificationEnabled = (0, config_1.getVerificationEnabled)((0, config_1.loadConfig)(projectDir));
|
|
46
|
+
await (0, telemetry_1.trackSessionStart)("cursor", sessionId, verificationEnabled);
|
|
46
47
|
logger_1.logger.debug(`session-start: ${sessionId}`);
|
|
48
|
+
// Monitoring mode: no enforcement hooks are installed and the agent
|
|
49
|
+
// should not be told to verify or submit verdicts. Empty JSON so
|
|
50
|
+
// Cursor injects no additional_context.
|
|
51
|
+
if (!verificationEnabled) {
|
|
52
|
+
(0, output_1.writeAndExit)(JSON.stringify({}), 0);
|
|
53
|
+
return;
|
|
54
|
+
}
|
|
47
55
|
const verdictPass = JSON.stringify({
|
|
48
56
|
session_id: sessionId,
|
|
49
57
|
status: "pass",
|
|
50
|
-
pages_tested: ["http://localhost:3000/affected-page"],
|
|
51
58
|
checks: ["form submits successfully", "new item appears in list"],
|
|
52
|
-
console_errors: 0,
|
|
53
|
-
network_failures: 0,
|
|
54
59
|
});
|
|
55
60
|
const verdictFail = JSON.stringify({
|
|
56
61
|
session_id: sessionId,
|
|
57
62
|
status: "fail",
|
|
58
|
-
pages_tested: ["http://localhost:3000/affected-page"],
|
|
59
63
|
checks: ["form renders", "submit button unresponsive"],
|
|
60
|
-
console_errors: 2,
|
|
61
|
-
network_failures: 0,
|
|
62
64
|
issues: ["button click handler not firing", "TypeError in console"],
|
|
63
65
|
});
|
|
64
66
|
const context = `IRONBEE VERIFICATION — SESSION ACTIVE
|
|
@@ -75,7 +77,7 @@ Submit via terminal:
|
|
|
75
77
|
On fail (issues is required):
|
|
76
78
|
echo '${verdictFail}' | ironbee hook submit-verdict
|
|
77
79
|
|
|
78
|
-
Required fields: session_id, status,
|
|
80
|
+
Required fields: session_id, status, checks
|
|
79
81
|
On fail, include: issues (array of strings describing what failed)
|
|
80
82
|
On pass after a previous fail, include: fixes (array of strings describing what was fixed)`;
|
|
81
83
|
const output = {
|
|
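The control flow this hunk adds to the Cursor `session-start` hook can be summarised in a standalone sketch: when verification is disabled the hook emits empty JSON, otherwise it emits instructions that use the trimmed verdict shape. `buildSessionStartOutput` is a hypothetical name, the context text is abbreviated, and the `additional_context` output key is an assumption taken from the comment in the hunk rather than from the full source.

```javascript
// Sketch of the branch added in this hunk. buildSessionStartOutput is a
// hypothetical name; the real hook writes its result with writeAndExit.
function buildSessionStartOutput(sessionId, verificationEnabled) {
  if (!verificationEnabled) {
    // Monitoring mode: empty JSON, so no extra context reaches the agent.
    return JSON.stringify({});
  }
  // Verdict example without the removed pages_tested / console_errors /
  // network_failures fields.
  const verdictPass = JSON.stringify({
    session_id: sessionId,
    status: "pass",
    checks: ["form submits successfully", "new item appears in list"],
  });
  const context = [
    "Submit via terminal:",
    `echo '${verdictPass}' | ironbee hook submit-verdict`,
    "Required fields: session_id, status, checks",
  ].join("\n");
  // The additional_context key is assumed, not confirmed by the hunk.
  return JSON.stringify({ additional_context: context });
}
```

Keeping the disabled branch ahead of any message construction mirrors the early `return` in the hunk, so monitoring-only sessions never see verdict instructions.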
@@ -1 +1 @@
|
|
|
1
|
-
{"version":3,"file":"session-start.js","sourceRoot":"","sources":["../../../../src/clients/cursor/hooks/session-start.ts"],"names":[],"mappings":";AAAA;;;;;;GAMG;;AAuBH,
|
|
1
|
+
{"version":3,"file":"session-start.js","sourceRoot":"","sources":["../../../../src/clients/cursor/hooks/session-start.ts"],"names":[],"mappings":";AAAA;;;;;;GAMG;;AAuBH,kBA4EC;AAjGD,yDAA2F;AAC3F,qEAAwF;AACxF,gDAAyE;AACzE,gDAAyD;AACzD,gDAAmD;AACnD,8CAA+C;AAC/C,sDAA2D;AAepD,KAAK,UAAU,GAAG,CAAC,UAAkB;IACxC,IAAI,KAA8B,CAAC;IACnC,IAAI,CAAC;QACD,KAAK,GAAG,IAAI,CAAC,KAAK,CAAC,IAAA,iBAAS,GAAE,CAA4B,CAAC;IAC/D,CAAC;IAAC,OAAO,CAAU,EAAE,CAAC;QAClB,eAAM,CAAC,KAAK,CAAC,0BAA0B,CAAC,EAAE,CAAC,CAAC;QAC5C,IAAA,qBAAY,EAAC,IAAI,CAAC,SAAS,CAAC,EAAE,CAAC,EAAE,CAAC,CAAC,CAAC;QACpC,OAAO;IACX,CAAC;IAED,MAAM,SAAS,GAAW,KAAK,CAAC,eAAe,IAAI,SAAS,CAAC;IAC7D,MAAM,WAAW,GAAW,GAAG,UAAU,sBAAsB,SAAS,gBAAgB,CAAC;IACzF,IAAA,mBAAU,EAAC,GAAG,UAAU,sBAAsB,SAAS,cAAc,CAAC,CAAC;IAEvE,MAAM,UAAU,GAAW,GAAG,UAAU,sBAAsB,SAAS,EAAE,CAAC;IAC1E,wEAAwE;IACxE,wEAAwE;IACxE,IAAA,4BAAY,EAAC,UAAU,EAAE,KAAK,CAAC,UAAU,IAAI,SAAS,CAAC,CAAC;IAExD,MAAM,KAAK,GAAuB;QAC9B,GAAG,IAAA,oBAAU,EAAC,WAAW,CAAC;QAC1B,IAAI,EAAE,eAAe;QACrB,SAAS,EAAE,IAAI,CAAC,GAAG,EAAE;QACrB,UAAU,EAAE,SAAS;QACrB,MAAM,EAAE,QAAQ;QAChB,MAAM,EAAE,SAAS;KACpB,CAAC;IAEF,MAAM,IAAA,sBAAY,EAAC,WAAW,EAAE,KAAK,CAAC,CAAC;IACvC,MAAM,IAAA,qCAAqB,EAAC,UAAU,EAAE,WAAW,EAAE,sBAAY,CAAC,CAAC;IACnE,MAAM,mBAAmB,GAAY,IAAA,+BAAsB,EAAC,IAAA,mBAAU,EAAC,UAAU,CAAC,CAAC,CAAC;IACpF,MAAM,IAAA,6BAAiB,EAAC,QAAQ,EAAE,SAAS,EAAE,mBAAmB,CAAC,CAAC;IAClE,eAAM,CAAC,KAAK,CAAC,kBAAkB,SAAS,EAAE,CAAC,CAAC;IAE5C,oEAAoE;IACpE,iEAAiE;IACjE,wCAAwC;IACxC,IAAI,CAAC,mBAAmB,EAAE,CAAC;QACvB,IAAA,qBAAY,EAAC,IAAI,CAAC,SAAS,CAAC,EAAE,CAAC,EAAE,CAAC,CAAC,CAAC;QACpC,OAAO;IACX,CAAC;IAED,MAAM,WAAW,GAAW,IAAI,CAAC,SAAS,CAAC;QACvC,UAAU,EAAE,SAAS;QACrB,MAAM,EAAE,MAAM;QACd,MAAM,EAAE,CAAC,2BAA2B,EAAE,0BAA0B,CAAC;KACpE,CAAC,CAAC;IACH,MAAM,WAAW,GAAW,IAAI,CAAC,SAAS,CAAC;QACvC,UAAU,EAAE,SAAS;QACrB,MAAM,EAAE,MAAM;QACd,MAAM,EAAE,CAAC,cAAc,EAAE,4BAA4B,CAAC;QACtD,MAAM,EAAE,CAAC,iCAAiC,EAAE,sBAAsB,CAAC;KACtE,CAAC,CAAC;IAEH,MAAM,OAAO,GAAW;cACd,SAAS;;;;;;;;UAQb,WAAW;;;UAGX,WAAW;;;;2FAIsE,CAAC;IAExF,MAAM,MAAM,GAA6B;QACrC,kBAAkB,EAAE,OAAO
;KAC9B,CAAC;IACF,IAAA,qBAAY,EAAC,IAAI,CAAC,SAAS,CAAC,MAAM,CAAC,EAAE,CAAC,CAAC,CAAC;AAC5C,CAAC"}
|
|
@@ -2,7 +2,7 @@
|
|
|
2
2
|
|
|
3
3
|
## Backend Mode (when `backend.verifyPatterns` matches an edited file)
|
|
4
4
|
|
|
5
|
-
If the project has the backend protocol cycle enabled (`ironbee backend enable` once at setup) and your edits touch matching paths (e.g. `server/**`, `api/**`, `routes/**`, `controllers/**`), the Stop hook also enforces a Backend cycle. The same `verification-start` covers every active cycle;
|
|
5
|
+
If the project has the backend protocol cycle enabled (`ironbee backend enable` once at setup) and your edits touch matching paths (e.g. `server/**`, `api/**`, `routes/**`, `controllers/**`), the Stop hook also enforces a Backend cycle. The same `verification-start` covers every active cycle; one platform-agnostic verdict covers them all.
|
|
6
6
|
|
|
7
7
|
This cycle is **runtime- and language-agnostic** — it works for Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala. The agent makes real protocol calls (HTTP / gRPC / GraphQL / WebSocket) against the running service, inspects logs, OR reads database state; it never attaches to a process.
|
|
8
8
|
|
|
@@ -45,46 +45,21 @@ The cycle is satisfied by ANY ONE of three evidence paths: protocol-call (you dr
|
|
|
45
45
|
5. **Trace correlation (optional, `o11y_*` primitives):** IronBee already pins the verification cycle's traceId on every backend tool call via `_metadata.traceId` (outranks any session pin), so the orchestrator's correlation root is authoritative. Use `MCP:bedt_o11y_get-trace-context` to read it, then pass it to `MCP:bedt_log_read { pattern: "<traceId>" }` to slice logs for one flow. `MCP:bedt_o11y_new-trace-id` / `MCP:bedt_o11y_set-trace-context` are available when you want to anchor a flow under an explicit id (e.g. integration-test runs).
|
|
46
46
|
6. **Submit verdict** including the fields matching the path(s) you exercised. If browser and/or node cycles are also active, include their fields in the SAME verdict — do not submit two verdicts.
|
|
47
47
|
|
|
48
|
-
### Verdict (
|
|
48
|
+
### Verdict (platform-agnostic)
|
|
49
49
|
|
|
50
|
-
|
|
51
|
-
```json
|
|
52
|
-
{
|
|
53
|
-
"session_id": "...",
|
|
54
|
-
"status": "pass",
|
|
55
|
-
"checks": ["POST /api/orders returned 201 with order id", "GET /api/orders/:id reflects new order"],
|
|
56
|
-
"backend_endpoints_called": [
|
|
57
|
-
"POST http://localhost:3000/api/orders",
|
|
58
|
-
"GET http://localhost:3000/api/orders/42"
|
|
59
|
-
],
|
|
60
|
-
"backend_response_statuses": [201, 200],
|
|
61
|
-
"backend_traces_collected": ["00-1234abcd-...01"]
|
|
62
|
-
}
|
|
63
|
-
```
|
|
64
|
-
|
|
65
|
-
Log-evidence path:
|
|
66
|
-
```json
|
|
67
|
-
{
|
|
68
|
-
"session_id": "...",
|
|
69
|
-
"status": "pass",
|
|
70
|
-
"checks": ["api-server logged 'order 42 created' on POST /api/orders", "no ERROR-level lines after the change"],
|
|
71
|
-
"backend_log_sources_read": ["api-server"]
|
|
72
|
-
}
|
|
73
|
-
```
|
|
50
|
+
The verdict shape is the same regardless of which evidence path (protocol-call / log / db) you took — `status` + `checks` (+ `issues` / `fixes` as needed):
|
|
74
51
|
|
|
75
|
-
DB-evidence path:
|
|
76
52
|
```json
|
|
77
53
|
{
|
|
78
54
|
"session_id": "...",
|
|
79
55
|
"status": "pass",
|
|
80
|
-
"checks": ["
|
|
81
|
-
"backend_db_connections_read": ["app"]
|
|
56
|
+
"checks": ["POST /api/orders returned 201 with order id", "GET /api/orders/:id reflects new order"]
|
|
82
57
|
}
|
|
83
58
|
```
|
|
84
59
|
|
|
85
|
-
|
|
60
|
+
The gate requires that AT LEAST one evidence path was actually exercised in your tool calls — `MCP:bedt_request_*` for protocol-call, `MCP:bedt_log_register-source` + `MCP:bedt_log_read*` / `_follow` for log-evidence, or `MCP:bedt_db_connect` + a read/diff/snapshot/get-changes for DB-evidence. If none were used, the gate will reject.
|
|
86
61
|
|
|
87
|
-
For a multi-cycle pass (browser + backend, or browser + node + backend), every active cycle's
|
|
62
|
+
For a multi-cycle pass (browser + backend, or browser + node + backend), every active cycle's pass criteria must hold.
|
|
88
63
|
|
|
89
64
|
---
|
|
90
65
|
|
|
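The "at least one evidence path" rule described above can be approximated by classifying the tool names an agent called during the cycle. The sketch below is an assumption-laden illustration: `exercisedEvidencePaths` is a hypothetical function, the input is assumed to be a flat list of tool-name strings, and the matching rules are inferred from the tool prefixes quoted in the text rather than taken from the gate's source.

```javascript
// Hypothetical classifier over recorded tool-call names. The rules follow
// the prefixes quoted in the text and only approximate the real gate.
function exercisedEvidencePaths(toolCalls) {
  // Accept names with or without the "MCP:" prefix.
  const names = toolCalls.map((name) => name.replace(/^MCP:/, ""));
  const has = (predicate) => names.some(predicate);
  const paths = [];

  // Protocol-call: any bedt_request_* tool.
  if (has((n) => n.startsWith("bedt_request_"))) {
    paths.push("protocol-call");
  }

  // Log evidence: registering a source is mandatory, followed by a read or follow.
  const logSetup = has((n) => n === "bedt_log_register-source");
  const logRead = has((n) => n.startsWith("bedt_log_read") || n === "bedt_log_follow");
  if (logSetup && logRead) {
    paths.push("log-evidence");
  }

  // DB evidence: connecting is mandatory, followed by one read-style call.
  const dbReads = [
    "bedt_db_query",
    "bedt_db_describe-table",
    "bedt_db_list-tables",
    "bedt_db_snapshot",
    "bedt_db_diff",
    "bedt_db_get-changes",
  ];
  if (has((n) => n === "bedt_db_connect") && has((n) => dbReads.includes(n))) {
    paths.push("db-evidence");
  }
  return paths;
}
```

An empty result corresponds to the case the text says the gate will reject; a log read without `bedt_log_register-source`, for example, yields no path here.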
@@ -38,8 +38,8 @@ If no argument is given, use **default** mode. `default` and `full` apply to eve
|
|
|
38
38
|
6. **Stop** the dev server when verification is complete
|
|
39
39
|
7. **If recording was started, stop it now** — `MCP:bdt_content_stop-recording`. submit-verdict rejects with `"recording is still active"` when this step is skipped. (Recording is a server-side opt-in via `recording.enable` — when on, the gate forces `MCP:bdt_content_start-recording` BEFORE the steps above and demands the matching stop here.)
|
|
40
40
|
8. **Submit your verdict** via terminal:
|
|
41
|
-
- Pass: `echo '{"session_id":"...","status":"pass","
|
|
42
|
-
- Fail: `echo '{"session_id":"...","status":"fail","
|
|
41
|
+
- Pass: `echo '{"session_id":"...","status":"pass","checks":["..."]}' | ironbee hook submit-verdict`
|
|
42
|
+
- Fail: `echo '{"session_id":"...","status":"fail","checks":["..."],"issues":["describe what failed"]}' | ironbee hook submit-verdict`
|
|
43
43
|
9. **If failed** → collect ALL issues first (finish testing all affected pages), submit one fail verdict with all issues, then fix everything, rebuild, and re-verify. Do not fix one issue at a time — batch fixes to avoid repeated build/restart cycles.
|
|
44
44
|
10. If pass after a previous fail, include `"fixes"` in the verdict describing what was fixed
|
|
45
45
|
|
|
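Steps 9 and 10 above (collect every issue before submitting one fail verdict, then include `"fixes"` on the later pass) can be sketched as a single verdict builder. `buildVerdict` and the `{ check, ok, issue }` result shape are hypothetical and exist only for this illustration; the verdict fields themselves are the ones named in the hunk.

```javascript
// Hypothetical builder: turns a list of per-check results into one verdict.
// results: [{ check: string, ok: boolean, issue?: string }]
// fixes:   descriptions of what was fixed since a previous fail, if any.
function buildVerdict(sessionId, results, fixes = []) {
  const checks = results.map((r) => r.check);
  // Gather ALL issues first so a single fail verdict carries every one.
  const issues = results.filter((r) => !r.ok).map((r) => r.issue);
  if (issues.length > 0) {
    return { session_id: sessionId, status: "fail", checks, issues };
  }
  const verdict = { session_id: sessionId, status: "pass", checks };
  // Only attach fixes when this pass follows an earlier fail.
  if (fixes.length > 0) {
    verdict.fixes = fixes;
  }
  return verdict;
}
```

A first run with one failing check produces a `fail` verdict listing that issue; after the batch of fixes, a second call with all checks passing and a non-empty `fixes` array produces the `pass` verdict described in step 10.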
@@ -4,7 +4,7 @@
|
|
|
4
4
|
|
|
5
5
|
> **Precondition: the backend must actually be Node.js.** If you see `pom.xml`, `build.gradle`, `requirements.txt`, `pyproject.toml`, `go.mod`, `Cargo.toml`, etc., this section does NOT apply — `MCP:ndt_*` tools won't connect to non-Node processes. Just do browser verification.
|
|
6
6
|
|
|
7
|
-
If the project has node backend verification enabled (`ironbee node enable` once at setup, by an operator who confirmed the backend is Node.js) and your edits touch matching paths (e.g. `server/**`, `pages/api/**`), the stop hook also enforces a Node cycle. The same `verification-start` covers both cycles;
|
|
7
|
+
If the project has node backend verification enabled (`ironbee node enable` once at setup, by an operator who confirmed the backend is Node.js) and your edits touch matching paths (e.g. `server/**`, `pages/api/**`), the stop hook also enforces a Node cycle. The same `verification-start` covers both cycles; one platform-agnostic verdict covers both.
|
|
8
8
|
|
|
9
9
|
### Mode behavior (node cycle)
|
|
10
10
|
- **default** (no arg or `default`): probe / log only the code paths your diff touched. Map each changed file → the handler(s) it affects → place probes there.
|
|
@@ -16,26 +16,20 @@ If the project has node backend verification enabled (`ironbee node enable` once
|
|
|
16
16
|
2. **Connect**: `MCP:ndt_debug_connect` with one of `pid` / `processName` / `containerId` / `containerName` / `inspectorPort` / `wsUrl`. Inspector is auto-activated via SIGUSR1 if needed.
|
|
17
17
|
3. **Pick an evidence path** for each changed code path:
|
|
18
18
|
- **Probe path** (proves the code path executed): `MCP:ndt_debug_put-tracepoint` (or `put-logpoint` / `put-exceptionpoint`) at the changed code, exercise the path (e.g. trigger the API call from the browser), then `MCP:ndt_debug_get-probe-snapshots`. At least one probe must come back with `triggered: true`.
|
|
19
|
-
- **Log path** (proves no errors): exercise the path, then `MCP:ndt_debug_get-logs` with the error level filter.
|
|
19
|
+
- **Log path** (proves no errors): exercise the path, then `MCP:ndt_debug_get-logs` with the error level filter.
|
|
20
20
|
4. **Disconnect** (optional): `MCP:ndt_debug_disconnect`.
|
|
21
|
-
5. **Submit verdict**
|
|
21
|
+
5. **Submit verdict** — platform-agnostic, just status + checks (+ issues/fixes).
|
|
22
22
|
|
|
23
|
-
### Verdict (
|
|
23
|
+
### Verdict (platform-agnostic)
|
|
24
24
|
```json
|
|
25
25
|
{
|
|
26
26
|
"session_id": "...",
|
|
27
27
|
"status": "pass",
|
|
28
|
-
"checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"]
|
|
29
|
-
"node_processes_connected": ["pid:12345 (next-server)"],
|
|
30
|
-
"node_probes_set": [
|
|
31
|
-
{ "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
|
|
32
|
-
],
|
|
33
|
-
"node_probe_snapshots_collected": 1,
|
|
34
|
-
"node_log_errors": []
|
|
28
|
+
"checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"]
|
|
35
29
|
}
|
|
36
30
|
```
|
|
37
31
|
|
|
38
|
-
For a multi-cycle pass, both browser and node criteria must hold
|
|
32
|
+
For a multi-cycle pass, both browser and node pass criteria must hold.
|
|
39
33
|
|
|
40
34
|
---
|
|
41
35
|
|
|
@@ -54,7 +48,7 @@ Focus on the code you changed — not the entire Node service.
|
|
|
54
48
|
- **Exercise the path end-to-end** (trigger from browser, curl, or the backend cycle if active)
|
|
55
49
|
- **Each touched probe must report `triggered: true`** in `MCP:ndt_debug_get-probe-snapshots`
|
|
56
50
|
- **Check one edge case per new branch** — invalid input, missing field, auth failure, …
|
|
57
|
-
- **Logs** — `MCP:ndt_debug_get-logs` at error level;
|
|
51
|
+
- **Logs** — `MCP:ndt_debug_get-logs` at error level; no ERROR-level entries are expected for `pass`
|
|
58
52
|
|
|
59
53
|
---
|
|
60
54
|
|
|
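The default-mode criteria listed above (every placed probe reports `triggered: true`, and no ERROR-level log entries) can be expressed as a small check over already-collected data. This is a sketch under assumptions: `nodeCyclePasses` is a hypothetical function, the probe objects reuse the `{ location, triggered }` fields shown in the removed verdict example, and the `{ level, message }` log shape is invented for illustration rather than taken from `MCP:ndt_debug_get-logs`.

```javascript
// Hypothetical check over data the agent has already collected.
// probes: [{ location: string, triggered: boolean }]
// logs:   [{ level: string, message: string }]  (assumed shape)
function nodeCyclePasses(probes, logs) {
  // Every probe that was placed must have fired.
  const untriggered = probes
    .filter((p) => p.triggered !== true)
    .map((p) => p.location);
  // Any ERROR-level entry counts against a pass.
  const errorCount = logs.filter(
    (l) => String(l.level).toLowerCase() === "error"
  ).length;
  // An empty probe list is allowed here because the log path alone can
  // serve as the evidence path for a changed code path.
  return {
    pass: untriggered.length === 0 && errorCount === 0,
    untriggered,
    errorCount,
  };
}
```

The returned `untriggered` locations and `errorCount` are convenient raw material for the `issues` array of a fail verdict.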
@@ -64,4 +58,4 @@ Probe every Node code path reachable from files matching `node.verifyPatterns`,
|
|
|
64
58
|
|
|
65
59
|
- Place probes at every handler / route / service entry point in scope, plus key internal branch points (early returns, error catches, conditional middleware)
|
|
66
60
|
- Exercise each path with at least one happy-path call AND one failure-path call
|
|
67
|
-
-
|
|
61
|
+
- No ERROR-level log entries are expected after the full run — any unexpected log error is a fail, regardless of when it was introduced
|
|
@@ -16,7 +16,7 @@ These attach to the **Required steps** above — they don't replace any step. Nu
|
|
|
16
16
|
- **Protocol-call** — identify the affected endpoint(s) → call the matching protocol tool (`bedt_request_http` / `bedt_request_grpc` / `bedt_request_graphql` / `bedt_request_websocket-open` / `bedt_request_replay`) → inspect status / body / `traceId` → chain follow-up calls when verifying side effects. **A 4xx / 5xx response is a normal result, not an error** — only transport failures populate the `error` field.
|
|
17
17
|
- **Log evidence** — `bedt_log_register-source` (file / docker / kubernetes) → `bedt_log_read` (point-in-time, supports `tail` / `since-until` / `pattern` / `level` / `parseJson` + `jsonFilter` / `contextBefore-After` / `select` / `coalesce`) OR `bedt_log_follow` + `bedt_log_get-followed` (streaming). Fit for jobs / queue workers / async handlers, or any case where an external driver is hitting the endpoint and you only need to verify what the server logged. `bedt_log_register-source` is mandatory on this path (the gate counts it as the setup step).
|
|
18
18
|
- **DB evidence** — `bedt_db_connect` (named, `connectionStringEnv` preferred, default readonly) → ONE of `bedt_db_query` / `bedt_db_describe-table` / `bedt_db_list-tables` / `bedt_db_snapshot` (+ optional `bedt_db_diff`) / `bedt_db_get-changes`. Fit for migrations, seed-data changes, query-result regressions, and any code change whose side effect lives in a relational DB. `bedt_db_connect` is mandatory on this path (same anti-fluke rule as `log-evidence` — the connection name is on the wire).
|
|
19
|
-
- **Within step 6 (submit verdict):**
|
|
19
|
+
- **Within step 6 (submit verdict):** submit one platform-agnostic verdict (`status` + `checks` + optionally `issues` / `fixes`). The gate requires that at least one evidence path (protocol-call, log-evidence, or DB-evidence) was exercised in your `MCP:bedt_*` tool calls.
|
|
20
20
|
|
|
21
21
|
### Trace correlation (`o11y_*` is auxiliary, not evidence)
|
|
22
22
|
|
|
@@ -26,7 +26,7 @@ IronBee already injects the active verification `traceId` into every backend too
|
|
|
26
26
|
|
|
27
27
|
- Calling `bedt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
|
|
28
28
|
- Treating a 4xx / 5xx response as a transport failure when the test was specifically asking for that error condition (e.g. "POST should reject malformed body with 400"). Decide PASS/FAIL based on the test's intent, not the status code's HTTP-class default.
|
|
29
|
-
-
|
|
29
|
+
- Claiming `status: pass` for a backend cycle without exercising at least one evidence path (no `MCP:bedt_request_*` call, no `MCP:bedt_log_register-source` + read, no `MCP:bedt_db_connect` + read). The gate will reject.
|
|
30
30
|
- Inferring backend behavior by reading code without exercising any evidence path. The cycle is satisfied only by making a real protocol call, reading real logs, or inspecting real DB state on the running service.
|
|
31
31
|
- Reading a pre-existing log source / DB unrelated to your task to fake the log-evidence or db-evidence path. `bedt_log_register-source` and `bedt_db_connect` are required setup steps on those paths so the registration / connection is on the wire.
|
|
32
32
|
- Opening a DB connection with `allowWrites: true` to "set up" verification data without an explicit need (seed / migration). Read-only is the default for a reason — flipping it widens the blast radius if a query goes wrong.
|