@ironbee-ai/cli 0.9.1 → 0.9.2
- package/CHANGELOG.md +6 -0
- package/README.md +7 -7
- package/dist/clients/claude/hooks/require-verdict.js +1 -1
- package/dist/clients/claude/platforms/command-verify.backend.md +42 -10
- package/dist/clients/claude/platforms/command-verify.node.md +8 -8
- package/dist/clients/claude/platforms/rule.backend.md +9 -6
- package/dist/clients/claude/platforms/rule.node.md +3 -3
- package/dist/clients/claude/platforms/skill.backend.md +34 -6
- package/dist/clients/claude/platforms/skill.node.md +10 -10
- package/dist/clients/cursor/hooks/require-verdict.js +1 -1
- package/dist/clients/cursor/platforms/command-verify.backend.md +42 -10
- package/dist/clients/cursor/platforms/command-verify.node.md +8 -8
- package/dist/clients/cursor/platforms/rule.backend.md +9 -6
- package/dist/clients/cursor/platforms/rule.node.md +3 -3
- package/dist/clients/cursor/platforms/skill.backend.md +34 -6
- package/dist/clients/cursor/platforms/skill.node.md +10 -10
- package/dist/hooks/core/actions.d.ts +12 -8
- package/dist/hooks/core/actions.d.ts.map +1 -1
- package/dist/hooks/core/actions.js.map +1 -1
- package/dist/hooks/core/submit-verdict.js +1 -1
- package/dist/hooks/core/submit-verdict.js.map +1 -1
- package/dist/hooks/core/verify-gate.d.ts.map +1 -1
- package/dist/hooks/core/verify-gate.js +31 -20
- package/dist/hooks/core/verify-gate.js.map +1 -1
- package/dist/lib/config.d.ts +13 -2
- package/dist/lib/config.d.ts.map +1 -1
- package/dist/lib/config.js +26 -2
- package/dist/lib/config.js.map +1 -1
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,11 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.9.2 (2026-05-11)
|
|
4
|
+
|
|
5
|
+
### Features
|
|
6
|
+
|
|
7
|
+
* **backend:** support log domain in the backend platform ([#11](https://github.com/ironbee-ai/ironbee-cli/issues/11)) ([4512275](https://github.com/ironbee-ai/ironbee-cli/commit/4512275b5f90c501481f4b07c13057fea475e3fb))
|
|
8
|
+
|
|
3
9
|
## 0.9.1 (2026-05-11)
|
|
4
10
|
|
|
5
11
|
### Bug Fixes
|
package/README.md
CHANGED
|
@@ -372,7 +372,7 @@ When the agent tries to complete a task, IronBee runs these checks:
|
|
|
372
372
|
- **Browser cycle**: navigate, screenshot, accessibility snapshot, console check (all-of)
|
|
373
373
|
- **Node cycle**: connect; then either probe path (`(put-tracepoint | put-logpoint | put-exceptionpoint) AND get-probe-snapshots`) OR log path (`get-logs`)
|
|
374
374
|
4. **Does a verdict exist?** — The agent must submit a single verdict carrying evidence for every active cycle via `ironbee hook submit-verdict`.
|
|
375
|
-
5. **Is the verdict valid?** — Per active cycle: browser fields (pages_tested, console_errors, network_failures) and/or node fields (
|
|
375
|
+
5. **Is the verdict valid?** — Per active cycle: browser fields (pages_tested, console_errors, network_failures) and/or node fields (node_processes_connected, node_probes_set / node_log_errors).
|
|
376
376
|
6. **Pass or fail?** — `status: "pass"` is honored only if every active cycle's evidence backs the claim. The gate overrides to fail if it doesn't.
|
|
377
377
|
7. **Retry limit** — After `maxRetries` failed attempts (default 3, single global counter), the agent is allowed to complete but must report unresolved issues.
|
|
378
378
|
|
|
@@ -419,23 +419,23 @@ On pass after a previous fail, include a `fixes` array describing what was fixed
|
|
|
419
419
|
}
|
|
420
420
|
```
|
|
421
421
|
|
|
422
|
-
For a **node-cycle** verdict (probe path), use the `
|
|
422
|
+
For a **node-cycle** verdict (probe path), use the `node_*` fields instead of (or alongside) the browser fields:
|
|
423
423
|
|
|
424
424
|
```json
|
|
425
425
|
{
|
|
426
426
|
"session_id": "<your-session-id>",
|
|
427
427
|
"status": "pass",
|
|
428
428
|
"checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"],
|
|
429
|
-
"
|
|
430
|
-
"
|
|
429
|
+
"node_processes_connected": ["pid:12345 (next-server)"],
|
|
430
|
+
"node_probes_set": [
|
|
431
431
|
{ "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
|
|
432
432
|
],
|
|
433
|
-
"
|
|
434
|
-
"
|
|
433
|
+
"node_probe_snapshots_collected": 1,
|
|
434
|
+
"node_log_errors": []
|
|
435
435
|
}
|
|
436
436
|
```
|
|
437
437
|
|
|
438
|
-
If both cycles are active, populate browser fields **and** `
|
|
438
|
+
If both cycles are active, populate browser fields **and** `node_*` fields in the same verdict — both cycles' pass criteria must hold for the gate to honor `status: "pass"`.
|
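As a rough sketch of the single-verdict rule above, the JavaScript below merges browser and node evidence into one object before it is serialized for `ironbee hook submit-verdict`. The helper name `buildVerdict`, the sample values, and the array shape used for `pages_tested` are assumptions for illustration; only the field names come from the README.

```javascript
// Hypothetical helper: combine evidence from every active cycle into ONE verdict.
// Field names follow the README; the function itself is illustrative only.
function buildVerdict(base, ...cycleEvidence) {
  return Object.assign({}, base, ...cycleEvidence);
}

const verdict = buildVerdict(
  {
    session_id: "<your-session-id>",
    status: "pass",
    checks: ["checkout submits", "POST /api/orders returned 201"],
  },
  // Browser-cycle fields (the array shape of pages_tested is an assumption).
  { pages_tested: ["/checkout"], console_errors: 0, network_failures: 0 },
  // Node-cycle fields, copied from the README example.
  {
    node_processes_connected: ["pid:12345 (next-server)"],
    node_probes_set: [
      { type: "tracepoint", location: "src/api/orders.ts:42", triggered: true },
    ],
    node_probe_snapshots_collected: 1,
    node_log_errors: [],
  }
);

// Serialized form, suitable for piping to `ironbee hook submit-verdict` on stdin.
const payload = JSON.stringify(verdict);
console.log(payload);
```

Building the object once and serializing it once keeps the agent from submitting two partial verdicts, which the text above rules out.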
|
439
439
|
|
|
440
440
|
The agent must submit a verdict after **every** verification attempt — both pass and fail. File edits are blocked until a verdict is submitted after using devtools tools.
|
|
441
441
|
|
package/dist/clients/claude/hooks/require-verdict.js
CHANGED
|
@@ -40,7 +40,7 @@ async function run(projectDir) {
|
|
|
40
40
|
if ((0, actions_1.hasToolCallsSinceLastVerdict)(actionsFile)) {
|
|
41
41
|
process.stderr.write(`BLOCKED: You used verification tools (browser-devtools / node-devtools / backend-devtools) but did not submit a verdict. You MUST submit a verdict (pass or fail) before editing code.
|
|
42
42
|
|
|
43
|
-
Submit your verdict first (include cycle-appropriate fields — browser fields for bdt_*,
|
|
43
|
+
Submit your verdict first (include cycle-appropriate fields — browser fields for bdt_*, node_* for ndt_*, backend_endpoints_called/backend_response_statuses for bedt_*):
|
|
44
44
|
echo '{"session_id":"${sessionId}","status":"fail","checks":[...],"issues":["describe what failed"], ...}' | ironbee hook submit-verdict
|
|
45
45
|
|
|
46
46
|
Then you can edit code to fix the issues.
|
package/dist/clients/claude/platforms/command-verify.backend.md
CHANGED
|
@@ -7,24 +7,37 @@ If the project has the backend protocol cycle enabled (`ironbee backend enable`
|
|
|
7
7
|
This cycle is **runtime- and language-agnostic** — it works for Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala. The agent makes real protocol calls (HTTP / gRPC / GraphQL / WebSocket) against the running service and inspects the responses; it never attaches to a process.
|
|
8
8
|
|
|
9
9
|
### Mode behavior (backend cycle)
|
|
10
|
-
- **default** (no arg or `default`):
|
|
11
|
-
- **full**:
|
|
10
|
+
- **default** (no arg or `default`): exercise the endpoints your diff touched via ONE of the two evidence paths (protocol-call OR log evidence — see below). Map each changed file → the route(s) / handler(s) / RPC method(s) it exposes, then either call them yourself and chain follow-ups to verify side effects, OR set up log capture and verify the server-side trace of an external driver hitting them.
|
|
11
|
+
- **full**: cover every endpoint reachable from files matching `backend.verifyPatterns`, not just the changed files. Cover the success path, at least one error path, and any auth-gated variant for each.
|
|
12
12
|
- `visual` / `functional`: browser-only modes; this cycle behaves as `default` when they are passed.
|
|
13
13
|
|
|
14
14
|
### Steps (additive to the browser flow above)
|
|
15
|
+
|
|
16
|
+
The cycle is satisfied by **either** a protocol-call path (you drive the request) **or** a log-evidence path (something else drives it; you read the resulting logs). Pick whichever fits the task.
|
|
17
|
+
|
|
15
18
|
1. **Confirm the backend service is running** (the user's dev server / Docker compose / k8s port-forward / …). Don't start the service yourself — ask the user if it's not obvious.
|
|
16
19
|
2. **Identify the affected endpoint(s)**. Map your code change to wire-level addresses (URL, gRPC service+method, GraphQL operation, WebSocket path).
|
|
17
|
-
3. **
|
|
20
|
+
3. **Exercise ONE evidence path** (or both):
|
|
21
|
+
|
|
22
|
+
**Path A — Protocol-call (you drive the request):**
|
|
18
23
|
- `mcp__backend-devtools__bedt_request_http` — HTTP/1.1 + HTTP/2 (ALPN auto-negotiates).
|
|
19
24
|
- `mcp__backend-devtools__bedt_request_grpc` — unary + 3 streaming modes; `.proto` text or descriptor.
|
|
20
25
|
- `mcp__backend-devtools__bedt_request_graphql` — query/mutation/persisted query.
|
|
21
26
|
- `mcp__backend-devtools__bedt_request_websocket-open` then `bedt_request_websocket-send` / `bedt_request_websocket-receive` / `bedt_request_websocket-close` for stateful WS sessions.
|
|
22
27
|
- `mcp__backend-devtools__bedt_request_replay` — re-issue a captured curl command or HAR entry.
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
28
|
+
- **Inspect the response** — `status`, body, headers, `traceId`. **4xx/5xx and gRPC non-OK are normal results, not transport errors.** Decide PASS/FAIL based on what the test actually requires.
|
|
29
|
+
|
|
30
|
+
**Path B — Log evidence (an external driver hits the endpoint; you read the logs):**
|
|
31
|
+
- `mcp__backend-devtools__bedt_log_register-source` — register the running service's log destination (`type: "file"` + `path`, `type: "docker"` + `container`, or `type: "kubernetes"` + `pod`).
|
|
32
|
+
- `mcp__backend-devtools__bedt_log_read` / `bedt_log_read-multi` — point-in-time read with filters (`tail`, `since`/`until`, `pattern`, `level`, `parseJson` + `jsonFilter`, `contextBefore`/`contextAfter`, `select`, `coalesce`).
|
|
33
|
+
- `mcp__backend-devtools__bedt_log_follow` + `bedt_log_get-followed` (and `bedt_log_stop-follow`) — streaming follow when you need to capture lines that emit AFTER you trigger.
|
|
34
|
+
- **Verify the lines match the expectation** — error gone, expected line present, trace-id chained through, no unexpected ERROR-level entries.
|
|
35
|
+
4. **Chain follow-up calls (Path A)** to verify side effects (POST then GET, set then list, etc.). Use `bedt_request_set-default-headers` for auth tokens (host-scoped), `bedt_request_set-cookies` for session state. **OR chain follow-up reads (Path B)** to capture downstream log output (job started → job finished → no retries).
|
|
36
|
+
5. **Submit verdict** including the fields matching the path(s) you exercised. If browser and/or node cycles are also active, include their fields in the SAME verdict — do not submit two verdicts.
|
|
26
37
|
|
|
27
38
|
### Verdict (backend-cycle fields)
|
|
39
|
+
|
|
40
|
+
Protocol-call path:
|
|
28
41
|
```json
|
|
29
42
|
{
|
|
30
43
|
"session_id": "...",
|
|
@@ -39,7 +52,17 @@ This cycle is **runtime- and language-agnostic** — it works for Node, Java, Py
|
|
|
39
52
|
}
|
|
40
53
|
```
|
|
41
54
|
|
|
42
|
-
|
|
55
|
+
Log-evidence path:
|
|
56
|
+
```json
|
|
57
|
+
{
|
|
58
|
+
"session_id": "...",
|
|
59
|
+
"status": "pass",
|
|
60
|
+
"checks": ["api-server logged 'order 42 created' on POST /api/orders", "no ERROR-level lines after the change"],
|
|
61
|
+
"backend_log_sources_read": ["api-server"]
|
|
62
|
+
}
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
At least one of `backend_endpoints_called` or `backend_log_sources_read` must be non-empty. `backend_response_statuses` and `backend_traces_collected` are optional but strongly recommended on the protocol-call path — same order as `backend_endpoints_called`. There is no automatic pass-criteria override on this cycle: if `status: pass` and the evidence is structurally valid, the gate honors it.
|
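The structural rule above can be sketched as a small check. This is a minimal illustration, not the gate's actual code: the function name `checkBackendEvidence` is hypothetical, and treating `backend_response_statuses` as an array with one entry per endpoint is an assumption drawn from the "same order" wording.

```javascript
// Illustrative structural check for backend-cycle evidence; NOT the gate's real code.
function checkBackendEvidence(verdict) {
  const endpoints = verdict.backend_endpoints_called ?? [];
  const logSources = verdict.backend_log_sources_read ?? [];
  const problems = [];

  // At least one of the two evidence arrays must be non-empty.
  if (endpoints.length === 0 && logSources.length === 0) {
    problems.push("need backend_endpoints_called or backend_log_sources_read");
  }

  // Assumption: statuses are a parallel array in the same order as the endpoints.
  const statuses = verdict.backend_response_statuses;
  if (Array.isArray(statuses) && statuses.length !== endpoints.length) {
    problems.push("backend_response_statuses should have one entry per endpoint");
  }
  return problems;
}

// Log-evidence verdict from the example above: no problems reported.
console.log(checkBackendEvidence({ backend_log_sources_read: ["api-server"] }));
```

An empty result only means the evidence is structurally present; whether `status: pass` is justified remains the agent's call on this cycle.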
|
43
66
|
|
|
44
67
|
For a multi-cycle pass (browser + backend, or browser + node + backend), every active cycle's evidence must be present — claiming `pass` without one cycle's fields will be overridden to fail.
|
|
45
68
|
|
|
@@ -53,22 +76,31 @@ Focus on the endpoints you changed — not every endpoint in the service.
|
|
|
53
76
|
1. Run `git diff --name-only` and `git diff --name-only HEAD~1`
|
|
54
77
|
2. **Ignore `.ironbee/`, `.claude/`, `.cursor/`** — tool config, not application code
|
|
55
78
|
3. **Read the full diff** for route / handler / controller / service files in scope — note the wire-level address (HTTP method+path, gRPC service+method, GraphQL operation, WebSocket path), request shape, response shape, side effects (DB writes, downstream calls, queue puts)
|
|
56
|
-
4. Before opening the request
|
|
79
|
+
4. Before opening the request or log tools, you should be able to answer: what endpoints did I touch? What does each return on the happy path? What does each return on the error path? What side effects need verification? Which side (request or log) is easier to drive for this task?
|
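Steps 1 and 2 above can be sketched as a filter over the text printed by `git diff --name-only`. The function name `changedAppFiles` and the sample paths are hypothetical; the ignored prefixes are the three directories named in step 2.

```javascript
// Prefixes the instructions say to ignore: tool config, not application code.
const IGNORED_PREFIXES = [".ironbee/", ".claude/", ".cursor/"];

// Takes the raw text printed by `git diff --name-only` (one path per line)
// and returns only application files, de-duplicated.
function changedAppFiles(gitDiffOutput) {
  const files = gitDiffOutput
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .filter((file) => !IGNORED_PREFIXES.some((prefix) => file.startsWith(prefix)));
  // The two diff commands in step 1 can list the same file twice.
  return [...new Set(files)];
}

// Hypothetical combined output of both commands.
const sample =
  ".ironbee/config.json\nsrc/api/orders.ts\n.claude/settings.json\nsrc/api/orders.ts\n";
console.log(changedAppFiles(sample)); // [ 'src/api/orders.ts' ]
```

This sketch only matches the ignored directories at the repository root; a monorepo with nested tool-config folders would need a broader match.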
|
57
80
|
|
|
58
81
|
### 2. Verify against the running service
|
|
82
|
+
Pick the evidence path that fits the task:
|
|
83
|
+
|
|
84
|
+
**Protocol-call path** (you drive the request):
|
|
59
85
|
- **Call each changed endpoint** with the matching tool — `mcp__backend-devtools__bedt_request_http` / `bedt_request_grpc` / `bedt_request_graphql` / `bedt_request_websocket-open`
|
|
60
86
|
- **Cross-reference the response against the diff** — status, body shape, headers, gRPC status code
|
|
61
87
|
- **Chain a follow-up call** to verify side effects (POST then GET, set then list, mutation then query, …)
|
|
62
88
|
- **Test one error path** per new branch — invalid body, missing field, missing auth, 404 path
|
|
63
89
|
- **Capture `traceId`** when available — useful for joining with downstream cycle evidence
|
|
64
90
|
|
|
91
|
+
**Log-evidence path** (an external driver hits it; you read the logs):
|
|
92
|
+
- **Register the service's log source** with `mcp__backend-devtools__bedt_log_register-source` (file / docker / kubernetes)
|
|
93
|
+
- **Read or follow** with `bedt_log_read` / `bedt_log_read-multi` (point-in-time, filter by `pattern` / `level` / `since-until` / `jsonFilter`) or `bedt_log_follow` (streaming for after-the-trigger capture)
|
|
94
|
+
- **Correlate with `traceId`** — use `pattern: "<traceId>"` to pull only the lines for one request
|
|
95
|
+
- **Verify the expected line is present** AND **no unexpected ERROR-level entries appeared** for the touched route(s)
|
|
96
|
+
|
|
65
97
|
---
|
|
66
98
|
|
|
67
99
|
## Full Mode (`/ironbee-verify full`, backend cycle)
|
|
68
100
|
|
|
69
101
|
Exercise every endpoint reachable from files matching `backend.verifyPatterns`, not just the changed files. Do NOT run `git diff` or scope to recent changes.
|
|
70
102
|
|
|
71
|
-
- Hit every route / RPC method / GraphQL operation / WebSocket lifecycle in scope
|
|
103
|
+
- Hit every route / RPC method / GraphQL operation / WebSocket lifecycle in scope (protocol-call) OR cover them via the log feed when an external driver / test suite drives them (log evidence)
|
|
72
104
|
- Cover the success path AND at least one error path for each
|
|
73
105
|
- Cover any auth-gated variant (unauthenticated, wrong role) where authentication is present
|
|
74
|
-
- Any unexpected error response during the run is a fail, regardless of when it was introduced
|
|
106
|
+
- Any unexpected error response or unexpected ERROR-level log line during the run is a fail, regardless of when it was introduced
|
package/dist/clients/claude/platforms/command-verify.node.md
CHANGED
|
@@ -16,9 +16,9 @@ If the project has node backend verification enabled (`ironbee node enable` once
|
|
|
16
16
|
2. **Connect**: `mcp__node-devtools__ndt_debug_connect` with one of `pid` / `processName` / `containerId` / `containerName` / `inspectorPort` / `wsUrl`. Inspector is auto-activated via SIGUSR1 if needed.
|
|
17
17
|
3. **Pick an evidence path** for each changed code path:
|
|
18
18
|
- **Probe path** (proves the code path executed): `mcp__node-devtools__ndt_debug_put-tracepoint` (or `put-logpoint` / `put-exceptionpoint`) at the changed code, exercise the path (e.g. trigger the API call from the browser), then `mcp__node-devtools__ndt_debug_get-probe-snapshots`. At least one probe must come back with `triggered: true`.
|
|
19
|
-
- **Log path** (proves no errors): exercise the path, then `mcp__node-devtools__ndt_debug_get-logs` with the error level filter. `
|
|
19
|
+
- **Log path** (proves no errors): exercise the path, then `mcp__node-devtools__ndt_debug_get-logs` with the error level filter. `node_log_errors` must be empty for `status: pass`.
|
|
20
20
|
4. **Disconnect** (optional): `mcp__node-devtools__ndt_debug_disconnect`.
|
|
21
|
-
5. **Submit verdict** including `
|
|
21
|
+
5. **Submit verdict** including `node_*` fields. If browser cycle is also active, include browser fields in the SAME verdict — do not submit two verdicts.
|
|
22
22
|
|
|
23
23
|
### Verdict (node-cycle fields)
|
|
24
24
|
```json
|
|
@@ -26,12 +26,12 @@ If the project has node backend verification enabled (`ironbee node enable` once
|
|
|
26
26
|
"session_id": "...",
|
|
27
27
|
"status": "pass",
|
|
28
28
|
"checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"],
|
|
29
|
-
"
|
|
30
|
-
"
|
|
29
|
+
"node_processes_connected": ["pid:12345 (next-server)"],
|
|
30
|
+
"node_probes_set": [
|
|
31
31
|
{ "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
|
|
32
32
|
],
|
|
33
|
-
"
|
|
34
|
-
"
|
|
33
|
+
"node_probe_snapshots_collected": 1,
|
|
34
|
+
"node_log_errors": []
|
|
35
35
|
}
|
|
36
36
|
```
|
|
37
37
|
|
|
@@ -54,7 +54,7 @@ Focus on the code you changed — not the entire Node service.
|
|
|
54
54
|
- **Exercise the path end-to-end** (trigger from browser, curl, or the backend cycle if active)
|
|
55
55
|
- **Each touched probe must report `triggered: true`** in `mcp__node-devtools__ndt_debug_get-probe-snapshots`
|
|
56
56
|
- **Check one edge case per new branch** — invalid input, missing field, auth failure, …
|
|
57
|
-
- **Logs** — `mcp__node-devtools__ndt_debug_get-logs` at error level; `
|
|
57
|
+
- **Logs** — `mcp__node-devtools__ndt_debug_get-logs` at error level; `node_log_errors` must be empty for `pass`
|
|
58
58
|
|
|
59
59
|
---
|
|
60
60
|
|
|
@@ -64,4 +64,4 @@ Probe every Node code path reachable from files matching `node.verifyPatterns`,
|
|
|
64
64
|
|
|
65
65
|
- Place probes at every handler / route / service entry point in scope, plus key internal branch points (early returns, error catches, conditional middleware)
|
|
66
66
|
- Exercise each path with at least one happy-path call AND one failure-path call
|
|
67
|
-
- `
|
|
67
|
+
- `node_log_errors` must be empty after the full run — any unexpected log error is a fail, regardless of when it was introduced
|
package/dist/clients/claude/platforms/rule.backend.md
CHANGED
|
@@ -4,20 +4,23 @@
|
|
|
4
4
|
|
|
5
5
|
## Backend cycle (runtime-agnostic protocol verification)
|
|
6
6
|
|
|
7
|
-
When the file matches `backend.verifyPatterns`, the Stop hook ALSO requires verification through the **backend-devtools** MCP server (prefix `bedt_`).
|
|
7
|
+
When the file matches `backend.verifyPatterns`, the Stop hook ALSO requires verification through the **backend-devtools** MCP server (prefix `bedt_`). The cycle is satisfied by **either** a real protocol call (HTTP / gRPC / GraphQL / WebSocket) against the running service, **or** by reading logs from the running service (file / docker / kubernetes) when an external driver is hitting the endpoint. Pick whichever fits the task — language- and framework-independent in both cases.
|
|
8
8
|
|
|
9
|
-
The backend cycle and the node cycle are **independent**. Node attaches to a Node.js V8 inspector with non-blocking probes (`ndt_*`); backend drives wire protocols from outside (`bedt_*`). They can be active in the same task; both must be satisfied for `status: pass`.
|
|
9
|
+
The backend cycle and the node cycle are **independent**. Node attaches to a Node.js V8 inspector with non-blocking probes (`ndt_*`); backend drives wire protocols and/or reads logs from outside (`bedt_*`). They can be active in the same task; both must be satisfied for `status: pass`.
|
|
10
10
|
|
|
11
11
|
### Backend-cycle additions to the main flow
|
|
12
12
|
|
|
13
13
|
These attach to the **Required steps** above — they don't replace any step. Numbering follows the main flow:
|
|
14
14
|
|
|
15
|
-
- **Within step 3 (run flow):** also run the backend flow
|
|
16
|
-
- **
|
|
15
|
+
- **Within step 3 (run flow):** also run the backend flow via ONE of two evidence paths:
|
|
16
|
+
- **Protocol-call** — identify the affected endpoint(s) → call the matching protocol tool (`bedt_request_http` / `bedt_request_grpc` / `bedt_request_graphql` / `bedt_request_websocket-open` / `bedt_request_replay`) → inspect status / body / `traceId` → chain follow-up calls when verifying side effects. **A 4xx / 5xx response is a normal result, not an error** — only transport failures populate the `error` field.
|
|
17
|
+
- **Log evidence** — `bedt_log_register-source` (file / docker / kubernetes) → `bedt_log_read` (point-in-time, supports `tail` / `since-until` / `pattern` / `level` / `parseJson` + `jsonFilter` / `contextBefore-After` / `select` / `coalesce`) OR `bedt_log_follow` + `bedt_log_get-followed` (streaming). Fit for jobs / queue workers / async handlers, or any case where an external driver is hitting the endpoint and you only need to verify what the server logged. `bedt_log_register-source` is mandatory on this path (the gate counts it as the setup step).
|
|
18
|
+
- **Within step 6 (submit verdict):** include `backend_endpoints_called` for protocol-call evidence (optionally with `backend_response_statuses` and `backend_traces_collected`), and/or `backend_log_sources_read` for log-evidence. **At least one** of those two arrays must be non-empty. One verdict carries fields for every active cycle.
|
|
17
19
|
|
|
18
20
|
### Additional BANNED for backend cycle
|
|
19
21
|
|
|
20
22
|
- Calling `bedt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
|
|
21
23
|
- Treating a 4xx / 5xx response as a transport failure when the test was specifically asking for that error condition (e.g. "POST should reject malformed body with 400"). Decide PASS/FAIL based on the test's intent, not the status code's HTTP-class default.
|
|
22
|
-
- Submitting a backend
|
|
23
|
-
- Inferring backend behavior by reading code without
|
|
24
|
+
- Submitting a backend verdict that omits BOTH `backend_endpoints_called` and `backend_log_sources_read` — at least one of those evidence arrays must be non-empty.
|
|
25
|
+
- Inferring backend behavior by reading code without exercising either evidence path. The cycle is satisfied only by making a real protocol call or by reading real logs from the running service.
|
|
26
|
+
- Reading a pre-existing log source unrelated to your task to fake the log-evidence path. `bedt_log_register-source` is a required setup step on that path so the registration is on the wire.
|
package/dist/clients/claude/platforms/rule.node.md
CHANGED
|
@@ -19,11 +19,11 @@ Both cycles can be active simultaneously (e.g. you edit both a React component a
|
|
|
19
19
|
These attach to the **Required steps** above — they don't replace any step. Numbering follows the main flow:
|
|
20
20
|
|
|
21
21
|
- **Within step 3 (run flow):** also run the node flow: connect (`ndt_debug_connect`) → set probe (`ndt_debug_put-tracepoint` / `put-logpoint` / `put-exceptionpoint`) AND exercise + read snapshots (`ndt_debug_get-probe-snapshots`), OR exercise + read logs (`ndt_debug_get-logs`). When both browser and node cycles are active, run BOTH within the same verification cycle.
|
|
22
|
-
- **Within step 6 (submit verdict):** include `
|
|
22
|
+
- **Within step 6 (submit verdict):** include `node_*` fields (`node_processes_connected` non-empty, plus `node_probes_set` and/or `node_log_errors`). One verdict carries fields for every active cycle.
|
|
23
23
|
|
|
24
24
|
### Additional BANNED for node cycle
|
|
25
25
|
|
|
26
26
|
- Calling `ndt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
|
|
27
27
|
- **Calling `ndt_*` tools when the project's backend is NOT Node.js** (Java / Python / Go / Rust / .NET / Ruby / PHP / Elixir / etc.). Use the browser cycle only for non-Node backends.
|
|
28
|
-
- Claiming `status: pass` for a node cycle when no probe triggered AND `
|
|
29
|
-
- Submitting a node-only verdict that omits `
|
|
28
|
+
- Claiming `status: pass` for a node cycle when no probe triggered AND `node_log_errors` was never collected.
|
|
29
|
+
- Submitting a node-only verdict that omits `node_processes_connected` — every node-cycle verdict requires this field non-empty.
|
package/dist/clients/claude/platforms/skill.backend.md
CHANGED
|
@@ -5,11 +5,15 @@
|
|
|
5
5
|
|
|
6
6
|
## Backend cycle — runtime- and language-agnostic
|
|
7
7
|
|
|
8
|
-
The **backend protocol cycle** verifies backend changes by driving real protocol calls (HTTP / gRPC / GraphQL / WebSocket) against the running service and reading the responses. It works for ANY backend runtime: Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala — the agent never attaches to a process
|
|
8
|
+
The **backend protocol cycle** verifies backend changes by driving real protocol calls (HTTP / gRPC / GraphQL / WebSocket) against the running service and reading the responses, OR by inspecting the logs of the running service (file / docker / kubernetes) when an external driver is hitting the endpoint. It works for ANY backend runtime: Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala — the agent never attaches to a process.
|
|
9
9
|
|
|
10
|
-
**This is different from the node cycle.** Node-cycle (`ndt_*`) attaches to a V8 inspector and sets non-blocking probes inside a running Node.js process — it's Node-only. Backend-cycle (`bedt_*`) makes outside-in protocol calls
|
|
10
|
+
**This is different from the node cycle.** Node-cycle (`ndt_*`) attaches to a V8 inspector and sets non-blocking probes inside a running Node.js process — it's Node-only. Backend-cycle (`bedt_*`) makes outside-in protocol calls and/or reads logs of any service. They can be active at the same time when both are enabled.
|
|
11
11
|
|
|
12
|
-
## Backend flow
|
|
12
|
+
## Backend flow (two evidence paths — at least one is required)
|
|
13
|
+
|
|
14
|
+
You can satisfy the cycle via **protocol-call evidence** (you drive the request yourself), **log evidence** (something else drives the request, you read the resulting logs), or both. Pick whichever fits the task; one is enough.
|
|
15
|
+
|
|
16
|
+
### Path A — Protocol-call evidence
|
|
13
17
|
|
|
14
18
|
1. **Confirm a backend service is running** (the user's dev server, Docker compose, k8s port-forward, …). The agent itself does not start the service — ask the user if uncertain.
|
|
15
19
|
2. **Identify the affected endpoint(s)** for your code change. Look at routes / handlers / controllers in the changed files. Map them to wire-level addresses (URL, gRPC service+method, GraphQL operation name, WebSocket path).
|
|
@@ -25,9 +29,22 @@ The **backend protocol cycle** verifies backend changes by driving real protocol
|
|
|
25
29
|
4. **Inspect the response** — `status` (HTTP / gRPC code), body, headers, returned `traceId` (always W3C `traceparent`).
|
|
26
30
|
**4xx/5xx and gRPC non-OK are normal results, not errors.** A test for "404 Not Found" SHOULD return 404. Only transport-level failures (DNS, TLS, timeout, abort) populate the response's `error` field. Decide PASS/FAIL based on what your task actually requires.
|
|
27
31
|
5. **Chain follow-up calls** if you need to verify side effects (e.g. POST then GET to confirm the new resource is readable). Use `bedt_request_set-default-headers` to pin auth tokens once per host, `bedt_request_set-cookies` for session cookies — both stay scoped to that target across calls.
|
|
28
|
-
6. **Submit verdict** including `backend_*` fields. If browser and/or node cycles are also active, include their fields in the SAME verdict — do not submit two verdicts.
|
|
29
32
|
|
|
30
|
-
###
|
|
33
|
+
### Path B — Log evidence (when an external driver hits the endpoint)
|
|
34
|
+
|
|
35
|
+
Useful when an integration test, the user, or a deploy script is already driving the protocol — your job is to verify "side B" by reading what the server logged. Also a fit for jobs / queue workers / cron handlers where there is no synchronous request to make.
|
|
36
|
+
|
|
37
|
+
1. **Register the log source** with `mcp__backend-devtools__bedt_log_register-source` — pick `type: "file"` for any process whose stdout is redirected to a file, `type: "docker"` for a container (`container: "<name|id>"`), `type: "kubernetes"` for a pod (`pod`, optionally `kubernetesContainer` / `namespace`). Source names are session-unique; re-register to overwrite. Listing/check helpers: `bedt_log_list-sources`, `bedt_log_check-source`.
|
|
38
|
+
2. **Read or follow the source**:
|
|
39
|
+
- `mcp__backend-devtools__bedt_log_read` / `bedt_log_read-multi` — point-in-time read across one or many sources. Filters: `tail` (last N lines), `since` / `until` (ISO-8601 — natively docker; file sources require `parseJson: true` so timestamp is extracted from a JSON field), `pattern` (substring; use for trace-id correlation), `level` (ERROR/WARN/INFO/DEBUG/TRACE/FATAL), `limit`, `parseJson`, `jsonFilter` (dot-path equality predicates against parsed JSON), `contextBefore` / `contextAfter`, `select` (dot-path projection to trim verbose JSONL), `coalesce` (fold multi-line stack traces into one line).
|
|
40
|
+
- `mcp__backend-devtools__bedt_log_follow` — open a streaming subscription that pushes lines into a ring buffer; `bedt_log_get-followed` drains it on demand; `bedt_log_stop-follow` tears it down. Use this when you need to capture logs that emit AFTER your trigger.
|
|
41
|
+
3. **Verify the lines you got match the expectation** — error gone, expected log line present, trace-id chained through. Plain-text and JSON sources are both supported; JSON sources accept structural predicates (`jsonFilter: { 'level': 'error', 'route': '/api/orders' }`).
|
|
42
|
+
4. **Unregister when done** — `bedt_log_unregister-source` cleans up. Optional; the session tears them down at end too.
|
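To make "dot-path equality predicates against parsed JSON" concrete, here is a minimal local sketch of that idea applied to JSONL lines. It does not call or reproduce `bedt_log_read`; the helper names, the sample log lines, and the choice to skip non-JSON lines are all assumptions, and the real tool may handle nesting and type coercion differently.

```javascript
// Resolve a dot-path such as "req.route" against a parsed JSON value.
function getByPath(obj, path) {
  return path
    .split(".")
    .reduce((cur, key) => (cur == null ? undefined : cur[key]), obj);
}

// Keep only lines whose parsed object strictly equals every predicate value.
function filterJsonLines(lines, jsonFilter) {
  const kept = [];
  for (const line of lines) {
    let record;
    try {
      record = JSON.parse(line);
    } catch {
      continue; // this sketch simply skips plain-text lines
    }
    const matches = Object.entries(jsonFilter).every(
      ([path, expected]) => getByPath(record, path) === expected
    );
    if (matches) kept.push(record);
  }
  return kept;
}

// Hypothetical log lines from an "api-server" source.
const logLines = [
  '{"level":"info","route":"/api/orders","msg":"order 42 created"}',
  '{"level":"error","route":"/api/orders","msg":"db timeout"}',
  '{"level":"error","route":"/api/users","msg":"not found"}',
  "plain text line",
];
const errors = filterJsonLines(logLines, { level: "error", route: "/api/orders" });
console.log(errors.length); // 1
```

An empty result for the touched route is the kind of "no unexpected ERROR-level entries" evidence step 3 asks for.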
|
43
|
+
|
|
44
|
+
### Submit verdict
|
|
45
|
+
|
|
46
|
+
Include the fields matching the path(s) you exercised. If browser and/or node cycles are also active, include their fields in the SAME verdict — do not submit two verdicts.
|
|
47
|
+
|
|
31
48
|
```json
|
|
32
49
|
{
|
|
33
50
|
"session_id": "<sid>",
|
|
@@ -42,7 +59,18 @@ The **backend protocol cycle** verifies backend changes by driving real protocol
|
|
|
42
59
|
}
|
|
43
60
|
```
|
|
44
61
|
|
|
45
|
-
|
|
62
|
+
Or, for a log-evidence path:
|
|
63
|
+
|
|
64
|
+
```json
|
|
65
|
+
{
|
|
66
|
+
"session_id": "<sid>",
|
|
67
|
+
"status": "pass",
|
|
68
|
+
"checks": ["api-server logged 'order 42 created' on POST /api/orders", "no ERROR-level lines after the change"],
|
|
69
|
+
"backend_log_sources_read": ["api-server"]
|
|
70
|
+
}
|
|
71
|
+
```
|
|
72
|
+
|
|
73
|
+
At least one of `backend_endpoints_called` or `backend_log_sources_read` must be non-empty. The optional protocol-call fields (`backend_response_statuses`, `backend_traces_collected`) are strongly recommended when you used the protocol-call path — same order as `backend_endpoints_called`. Status interpretation is YOUR call: there is no automatic pass-criteria override on this cycle. If the agent claims `status: pass` and the evidence is structurally valid, the gate honors it.
|
|
46
74
|
|
|
47
75
|
## Multi-cycle (browser + backend, or browser + node + backend)
|
|
48
76
|
|
package/dist/clients/claude/platforms/skill.node.md
CHANGED
|
@@ -28,7 +28,7 @@ If you see `pom.xml`, `build.gradle`, `requirements.txt`, `pyproject.toml`, `go.
|
|
|
28
28
|
- Read collected snapshots: `ndt_debug_get-probe-snapshots`. At least one probe must come back with `triggered: true`.
|
|
29
29
|
- **Log path** (proves no errors during execution):
|
|
30
30
|
- Exercise the path.
|
|
31
|
-
- Read errors: `ndt_debug_get-logs` with the error-level filter. `
|
|
31
|
+
- Read errors: `ndt_debug_get-logs` with the error-level filter. `node_log_errors` must be empty for `status: pass`.
|
|
32
32
|
4. **Disconnect** (optional): `ndt_debug_disconnect`.
|
|
33
33
|
|
|
34
34
|
### Node verdict fields
|
|
@@ -37,18 +37,18 @@ If you see `pom.xml`, `build.gradle`, `requirements.txt`, `pyproject.toml`, `go.
|
|
|
37
37
|
"session_id": "<sid>",
|
|
38
38
|
"status": "pass",
|
|
39
39
|
"checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"],
|
|
40
|
-
"
|
|
41
|
-
"
|
|
40
|
+
"node_processes_connected": ["pid:12345 (next-server)"],
|
|
41
|
+
"node_probes_set": [
|
|
42
42
|
{ "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
|
|
43
43
|
],
|
|
44
|
-
"
|
|
45
|
-
"
|
|
44
|
+
"node_probe_snapshots_collected": 1,
|
|
45
|
+
"node_log_errors": []
|
|
46
46
|
}
|
|
47
47
|
```
|
|
48
48
|
|
|
49
49
|
For `status: "pass"` (node cycle):
|
|
50
50
|
- If probes were set, at least one must have `triggered: true` (proves the code path executed).
|
|
51
|
-
- If only logs were used, `
|
|
51
|
+
- If only logs were used, `node_log_errors.length === 0` (no errors observed).
|
|
52
52
|
- If both forms were used, both conditions must hold.
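
The three conditions above can be read as one predicate. The sketch below is a hedged illustration, not the hook's real code: `nodePassAllowed` is a hypothetical name, and it assumes that a present `node_log_errors` array means logs were actually collected and that a verdict with neither form of evidence cannot pass.

```javascript
// Hypothetical helper mirroring the three bullets above for the node cycle.
function nodePassAllowed(verdict) {
  const probes = verdict.node_probes_set ?? [];
  const usedProbes = probes.length > 0;
  // Assumption: a present `node_log_errors` array means logs were collected.
  const usedLogs = Array.isArray(verdict.node_log_errors);

  // Assumption: with neither form of evidence, "pass" is not allowed.
  if (!usedProbes && !usedLogs) return false;

  // If probes were set, at least one must have triggered.
  const probeOk = !usedProbes || probes.some((p) => p.triggered === true);
  // If logs were used, no errors may have been observed.
  const logsOk = !usedLogs || verdict.node_log_errors.length === 0;
  // When both forms were used, both conditions must hold.
  return probeOk && logsOk;
}
```

Applied to the example verdict above (one triggered tracepoint, empty `node_log_errors`), the predicate returns `true`; flipping `triggered` to `false` or adding a log error makes it return `false`.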
|
|
53
53
|
|
|
54
54
|
## Multi-cycle (browser + node simultaneously)
|
|
@@ -65,12 +65,12 @@ Submit ONE verdict carrying fields for every active cycle:
|
|
|
65
65
|
"checks": ["checkout submits", "POST /api/orders returned 201", "no console errors"],
|
|
66
66
|
"console_errors": 0,
|
|
67
67
|
"network_failures": 0,
|
|
68
|
-
"
|
|
69
|
-
"
|
|
68
|
+
"node_processes_connected": ["pid:12345 (next-server)"],
|
|
69
|
+
"node_probes_set": [
|
|
70
70
|
{ "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
|
|
71
71
|
],
|
|
72
|
-
"
|
|
73
|
-
"
|
|
72
|
+
"node_probe_snapshots_collected": 1,
|
|
73
|
+
"node_log_errors": []
|
|
74
74
|
}
|
|
75
75
|
```
|
|
76
76
|
|
|
@@ -44,7 +44,7 @@ async function run(projectDir) {
|
|
|
44
44
|
permission: "deny",
|
|
45
45
|
agent_message: `BLOCKED: You used verification tools (browser-devtools / node-devtools / backend-devtools) but did not submit a verdict. You MUST submit a verdict (pass or fail) before editing code.
|
|
46
46
|
|
|
47
|
-
Submit your verdict first (include cycle-appropriate fields — browser fields for bdt_*,
|
|
47
|
+
Submit your verdict first (include cycle-appropriate fields — browser fields for bdt_*, node_* for ndt_*, backend_endpoints_called/backend_response_statuses for bedt_*):
|
|
48
48
|
echo '{"session_id":"${sessionId}","status":"fail","checks":[...],"issues":["describe what failed"], ...}' | ironbee hook submit-verdict
|
|
49
49
|
|
|
50
50
|
Then you can edit code to fix the issues.`,
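
A note on the `echo '...' | ironbee hook submit-verdict` form in the message above: check strings can themselves contain single quotes (for example `'order 42 created'`), which would end the shell quoting early. The following is a minimal sketch of building the command line safely. `buildSubmitCommand` is a hypothetical helper, not part of the package, and it assumes a POSIX-style shell.

```javascript
// Hypothetical helper: builds the shell line that pipes a verdict into the hook.
function buildSubmitCommand(verdict) {
  const json = JSON.stringify(verdict);
  // Inside single quotes a POSIX shell takes everything literally, so an
  // embedded single quote is written as: close quote, escaped quote, reopen.
  const quoted = "'" + json.replace(/'/g, "'\\''") + "'";
  return `echo ${quoted} | ironbee hook submit-verdict`;
}
```

Some shells' `echo` interprets backslash escapes, so `printf '%s\n'` may be the safer choice when the JSON contains escaped characters; the sketch keeps `echo` only to match the message above.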
|
|
@@ -7,24 +7,37 @@ If the project has the backend protocol cycle enabled (`ironbee backend enable`
|
|
|
7
7
|
This cycle is **runtime- and language-agnostic** — it works for Node, Java, Python, Go, Rust, Ruby, .NET, PHP, Elixir, Kotlin, Scala. The agent makes real protocol calls (HTTP / gRPC / GraphQL / WebSocket) against the running service and inspects the responses; it never attaches to a process.
|
|
8
8
|
|
|
9
9
|
### Mode behavior (backend cycle)
|
|
10
|
-
- **default** (no arg or `default`):
|
|
11
|
-
- **full**:
|
|
10
|
+
- **default** (no arg or `default`): exercise the endpoints your diff touched via ONE of the two evidence paths (protocol-call OR log evidence — see below). Map each changed file → the route(s) / handler(s) / RPC method(s) it exposes, then either call them yourself and chain follow-ups to verify side effects, OR set up log capture and verify the server-side trace of an external driver hitting them.
|
|
11
|
+
- **full**: cover every endpoint reachable from files matching `backend.verifyPatterns`, not just the changed files. Cover the success path, at least one error path, and any auth-gated variant for each.
|
|
12
12
|
- `visual` / `functional`: browser-only modes; this cycle behaves as `default` when they are passed.
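
The mode rules above reduce to a small mapping. This sketch is illustrative only: `resolveBackendMode` is a hypothetical name, and treating any unrecognized argument as `default` is an assumption the text does not confirm.

```javascript
// Hypothetical helper: maps the mode argument to the mode the backend
// cycle actually runs in.
function resolveBackendMode(arg) {
  if (arg === "full") return "full";
  // No argument, "default", and the browser-only modes "visual" / "functional"
  // all behave as "default" on this cycle.
  // Assumption: any other value also falls back to "default".
  return "default";
}
```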
|
|
13
13
|
|
|
14
14
|
### Steps (additive to the browser flow above)
|
|
15
|
+
|
|
16
|
+
The cycle is satisfied by **either** a protocol-call path (you drive the request) **or** a log-evidence path (something else drives it; you read the resulting logs). Pick whichever fits the task.
|
|
17
|
+
|
|
15
18
|
1. **Confirm the backend service is running** (the user's dev server / Docker compose / k8s port-forward / …). Don't start the service yourself — ask the user if it's not obvious.
|
|
16
19
|
2. **Identify the affected endpoint(s)**. Map your code change to wire-level addresses (URL, gRPC service+method, GraphQL operation, WebSocket path).
|
|
17
|
-
3. **
|
|
20
|
+
3. **Exercise ONE evidence path** (or both):
|
|
21
|
+
|
|
22
|
+
**Path A — Protocol-call (you drive the request):**
|
|
18
23
|
- `MCP:bedt_request_http` — HTTP/1.1 + HTTP/2 (ALPN auto-negotiates).
|
|
19
24
|
- `MCP:bedt_request_grpc` — unary + 3 streaming modes; `.proto` text or descriptor.
|
|
20
25
|
- `MCP:bedt_request_graphql` — query/mutation/persisted query.
|
|
21
26
|
- `MCP:bedt_request_websocket-open` then `bedt_request_websocket-send` / `bedt_request_websocket-receive` / `bedt_request_websocket-close` for stateful WS sessions.
|
|
22
27
|
- `MCP:bedt_request_replay` — re-issue a captured curl command or HAR entry.
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
28
|
+
- **Inspect the response** — `status`, body, headers, `traceId`. **4xx/5xx and gRPC non-OK are normal results, not transport errors.** Decide PASS/FAIL based on what the test actually requires.
|
|
29
|
+
|
|
30
|
+
**Path B — Log evidence (an external driver hits the endpoint; you read the logs):**
|
|
31
|
+
- `MCP:bedt_log_register-source` — register the running service's log destination (`type: "file"` + `path`, `type: "docker"` + `container`, or `type: "kubernetes"` + `pod`).
|
|
32
|
+
- `MCP:bedt_log_read` / `bedt_log_read-multi` — point-in-time read with filters (`tail`, `since`/`until`, `pattern`, `level`, `parseJson` + `jsonFilter`, `contextBefore`/`contextAfter`, `select`, `coalesce`).
|
|
33
|
+
- `MCP:bedt_log_follow` + `bedt_log_get-followed` (and `bedt_log_stop-follow`) — streaming follow when you need to capture lines that emit AFTER you trigger.
|
|
34
|
+
- **Verify the lines match the expectation** — error gone, expected line present, trace-id chained through, no unexpected ERROR-level entries.
|
|
35
|
+
4. **Chain follow-up calls (Path A)** to verify side effects (POST then GET, set then list, etc.). Use `bedt_request_set-default-headers` for auth tokens (host-scoped), `bedt_request_set-cookies` for session state. **OR chain follow-up reads (Path B)** to capture downstream log output (job started → job finished → no retries).
|
|
36
|
+
5. **Submit verdict** including the fields matching the path(s) you exercised. If browser and/or node cycles are also active, include their fields in the SAME verdict — do not submit two verdicts.
|
|
26
37
|
|
|
27
38
|
### Verdict (backend-cycle fields)
|
|
39
|
+
|
|
40
|
+
Protocol-call path:
|
|
28
41
|
```json
|
|
29
42
|
{
|
|
30
43
|
"session_id": "...",
|
|
@@ -39,7 +52,17 @@ This cycle is **runtime- and language-agnostic** — it works for Node, Java, Py
|
|
|
39
52
|
}
|
|
40
53
|
```
|
|
41
54
|
|
|
42
|
-
|
|
55
|
+
Log-evidence path:
|
|
56
|
+
```json
|
|
57
|
+
{
|
|
58
|
+
"session_id": "...",
|
|
59
|
+
"status": "pass",
|
|
60
|
+
"checks": ["api-server logged 'order 42 created' on POST /api/orders", "no ERROR-level lines after the change"],
|
|
61
|
+
"backend_log_sources_read": ["api-server"]
|
|
62
|
+
}
|
|
63
|
+
```
|
|
64
|
+
|
|
65
|
+
At least one of `backend_endpoints_called` or `backend_log_sources_read` must be non-empty. `backend_response_statuses` and `backend_traces_collected` are optional but strongly recommended on the protocol-call path — same order as `backend_endpoints_called`. There is no automatic pass-criteria override on this cycle: if `status: pass` and the evidence is structurally valid, the gate honors it.
|
|
43
66
|
|
|
44
67
|
For a multi-cycle pass (browser + backend, or browser + node + backend), every active cycle's evidence must be present — claiming `pass` without one cycle's fields will be overridden to fail.
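
The multi-cycle rule can be sketched as follows. This is a hedged illustration, not the gate's code: `hasEvidence` and `effectiveStatus` are hypothetical names, and which fields count as "present" for the browser and node cycles is an assumption (two browser counters, and a non-empty `node_processes_connected`).

```javascript
// Hypothetical per-cycle evidence checks; only the field names come from the text.
const hasEvidence = {
  // Assumption: the browser cycle is evidenced by its two counters being present.
  browser: (v) =>
    typeof v.console_errors === "number" && typeof v.network_failures === "number",
  // Assumption: the node cycle is evidenced by a non-empty process list.
  node: (v) => (v.node_processes_connected ?? []).length > 0,
  backend: (v) =>
    (v.backend_endpoints_called ?? []).length > 0 ||
    (v.backend_log_sources_read ?? []).length > 0,
};

// A claimed "pass" only stands if every active cycle has its fields present.
function effectiveStatus(verdict, activeCycles) {
  if (verdict.status !== "pass") return verdict.status;
  const missing = activeCycles.filter((cycle) => !hasEvidence[cycle](verdict));
  return missing.length === 0 ? "pass" : "fail";
}
```

With browser and backend fields in one verdict, `effectiveStatus(verdict, ["browser", "backend"])` keeps `"pass"`; adding `"node"` to the active cycles without any node fields turns the same verdict into `"fail"`.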
|
|
45
68
|
|
|
@@ -53,22 +76,31 @@ Focus on the endpoints you changed — not every endpoint in the service.
|
|
|
53
76
|
1. Run `git diff --name-only` and `git diff --name-only HEAD~1`
|
|
54
77
|
2. **Ignore `.ironbee/`, `.claude/`, `.cursor/`** — tool config, not application code
|
|
55
78
|
3. **Read the full diff** for route / handler / controller / service files in scope — note the wire-level address (HTTP method+path, gRPC service+method, GraphQL operation, WebSocket path), request shape, response shape, side effects (DB writes, downstream calls, queue puts)
|
|
56
|
-
4. Before opening the request
|
|
79
|
+
4. Before opening the request or log tools, you should be able to answer: what endpoints did I touch? What does each return on the happy path? What does each return on the error path? What side effects need verification? Which side (request or log) is easier to drive for this task?
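
Steps 1 and 2 above amount to merging two file lists and dropping tool-config paths. Below is a minimal sketch, assuming the raw text output of the two `git diff --name-only` commands is already available as strings; `changedAppFiles` is a hypothetical helper, and matching the ignored directories only at the repository root is an assumption.

```javascript
// Tool-config directories named in step 2; paths under them are not application code.
const IGNORED_PREFIXES = [".ironbee/", ".claude/", ".cursor/"];

// Takes the raw output of one or more `git diff --name-only` runs (one path
// per line) and returns the de-duplicated application files.
function changedAppFiles(...diffOutputs) {
  const files = diffOutputs
    .flatMap((out) => out.split("\n"))
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    // Assumption: ignored directories sit at the repository root.
    .filter((line) => !IGNORED_PREFIXES.some((prefix) => line.startsWith(prefix)));
  return [...new Set(files)];
}
```

The resulting list is the starting point for step 3: each remaining file is mapped to the routes or handlers it exposes.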
|
|
57
80
|
|
|
58
81
|
### 2. Verify against the running service
|
|
82
|
+
Pick the evidence path that fits the task:
|
|
83
|
+
|
|
84
|
+
**Protocol-call path** (you drive the request):
|
|
59
85
|
- **Call each changed endpoint** with the matching tool — `MCP:bedt_request_http` / `MCP:bedt_request_grpc` / `MCP:bedt_request_graphql` / `MCP:bedt_request_websocket-open`
|
|
60
86
|
- **Cross-reference the response against the diff** — status, body shape, headers, gRPC status code
|
|
61
87
|
- **Chain a follow-up call** to verify side effects (POST then GET, set then list, mutation then query, …)
|
|
62
88
|
- **Test one error path** per new branch — invalid body, missing field, missing auth, 404 path
|
|
63
89
|
- **Capture `traceId`** when available — useful for joining with downstream cycle evidence
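
When testing an error path, the expected status is the error itself, so a 400 can be a pass. The sketch below judges one recorded call result against the intended status. It is illustrative only: `judgeResponse` is a hypothetical helper, and the result shape (`status`, plus an `error` field set only on transport failure) is an assumption about what the request tools return.

```javascript
// Hypothetical helper: judges one recorded protocol-call result against what
// the check intended. `result` is assumed to look like { status, error }.
function judgeResponse(result, expectedStatus) {
  // Assumption: only transport failures (connection refused, timeout, ...)
  // set `error`; in that case no usable response exists.
  if (result.error) {
    return { pass: false, reason: `transport failure: ${result.error}` };
  }
  // A 4xx/5xx is a normal result; it passes when it is what the test asked for.
  if (result.status === expectedStatus) {
    return { pass: true, reason: `got expected status ${expectedStatus}` };
  }
  return { pass: false, reason: `expected ${expectedStatus}, got ${result.status}` };
}
```

For a check like "POST should reject a malformed body with 400", `judgeResponse({ status: 400 }, 400)` passes, while an unexpected 500 on the happy path does not.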
|
|
64
90
|
|
|
91
|
+
**Log-evidence path** (an external driver hits it; you read the logs):
|
|
92
|
+
- **Register the service's log source** with `MCP:bedt_log_register-source` (file / docker / kubernetes)
|
|
93
|
+
- **Read or follow** with `MCP:bedt_log_read` / `bedt_log_read-multi` (point-in-time, filter by `pattern` / `level` / `since-until` / `jsonFilter`) or `MCP:bedt_log_follow` (streaming for after-the-trigger capture)
|
|
94
|
+
- **Correlate with `traceId`** — use `pattern: "<traceId>"` to pull only the lines for one request
|
|
95
|
+
- **Verify the expected line is present** AND **no unexpected ERROR-level entries appeared** for the touched route(s)
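
The last two bullets can be sketched as a check over log entries that a read has already returned. This is a hedged illustration: `checkLogEvidence` is a hypothetical helper, and the entry shape `{ level, message }` is an assumption, since the actual output format of the log tools is not shown here.

```javascript
// Hypothetical helper: evaluates log entries already returned by a log read.
// Each entry is assumed to look like { level, message }.
function checkLogEvidence(entries, { traceId, expected }) {
  // Keep only the lines belonging to one request, as a `pattern` filter would.
  const forRequest = traceId
    ? entries.filter((e) => e.message.includes(traceId))
    : entries;

  // The expected line must be present...
  const expectedSeen = forRequest.some((e) => e.message.includes(expected));
  // ...and no ERROR-level entries may appear among the selected lines.
  const errors = forRequest.filter((e) => String(e.level).toUpperCase() === "ERROR");

  return { pass: expectedSeen && errors.length === 0, expectedSeen, errors };
}
```

Filtering by `traceId` first keeps unrelated errors from other requests out of the judgment; without a trace id, every ERROR-level entry in the window counts against the result.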
|
|
96
|
+
|
|
65
97
|
---
|
|
66
98
|
|
|
67
99
|
## Full Mode (`/ironbee-verify full`, backend cycle)
|
|
68
100
|
|
|
69
101
|
Exercise every endpoint reachable from files matching `backend.verifyPatterns`, not just the changed files. Do NOT run `git diff` or scope to recent changes.
|
|
70
102
|
|
|
71
|
-
- Hit every route / RPC method / GraphQL operation / WebSocket lifecycle in scope
|
|
103
|
+
- Hit every route / RPC method / GraphQL operation / WebSocket lifecycle in scope (protocol-call) OR cover them via the log feed when an external driver / test suite drives them (log evidence)
|
|
72
104
|
- Cover the success path AND at least one error path for each
|
|
73
105
|
- Cover any auth-gated variant (unauthenticated, wrong role) where authentication is present
|
|
74
|
-
- Any unexpected error response during the run is a fail, regardless of when it was introduced
|
|
106
|
+
- Any unexpected error response or unexpected ERROR-level log line during the run is a fail, regardless of when it was introduced
|
|
@@ -16,9 +16,9 @@ If the project has node backend verification enabled (`ironbee node enable` once
|
|
|
16
16
|
2. **Connect**: `MCP:ndt_debug_connect` with one of `pid` / `processName` / `containerId` / `containerName` / `inspectorPort` / `wsUrl`. Inspector is auto-activated via SIGUSR1 if needed.
|
|
17
17
|
3. **Pick an evidence path** for each changed code path:
|
|
18
18
|
- **Probe path** (proves the code path executed): `MCP:ndt_debug_put-tracepoint` (or `put-logpoint` / `put-exceptionpoint`) at the changed code, exercise the path (e.g. trigger the API call from the browser), then `MCP:ndt_debug_get-probe-snapshots`. At least one probe must come back with `triggered: true`.
|
|
19
|
-
- **Log path** (proves no errors): exercise the path, then `MCP:ndt_debug_get-logs` with the error level filter. `
|
|
19
|
+
- **Log path** (proves no errors): exercise the path, then `MCP:ndt_debug_get-logs` with the error level filter. `node_log_errors` must be empty for `status: pass`.
|
|
20
20
|
4. **Disconnect** (optional): `MCP:ndt_debug_disconnect`.
|
|
21
|
-
5. **Submit verdict** including `
|
|
21
|
+
5. **Submit verdict** including `node_*` fields. If browser cycle is also active, include browser fields in the SAME verdict — do not submit two verdicts.
|
|
22
22
|
|
|
23
23
|
### Verdict (node-cycle fields)
|
|
24
24
|
```json
|
|
@@ -26,12 +26,12 @@ If the project has node backend verification enabled (`ironbee node enable` once
|
|
|
26
26
|
"session_id": "...",
|
|
27
27
|
"status": "pass",
|
|
28
28
|
"checks": ["POST /api/orders returned 201", "tracepoint at handler.ts:42 fired once"],
|
|
29
|
-
"
|
|
30
|
-
"
|
|
29
|
+
"node_processes_connected": ["pid:12345 (next-server)"],
|
|
30
|
+
"node_probes_set": [
|
|
31
31
|
{ "type": "tracepoint", "location": "src/api/orders.ts:42", "triggered": true }
|
|
32
32
|
],
|
|
33
|
-
"
|
|
34
|
-
"
|
|
33
|
+
"node_probe_snapshots_collected": 1,
|
|
34
|
+
"node_log_errors": []
|
|
35
35
|
}
|
|
36
36
|
```
|
|
37
37
|
|
|
@@ -54,7 +54,7 @@ Focus on the code you changed — not the entire Node service.
|
|
|
54
54
|
- **Exercise the path end-to-end** (trigger from browser, curl, or the backend cycle if active)
|
|
55
55
|
- **Each touched probe must report `triggered: true`** in `MCP:ndt_debug_get-probe-snapshots`
|
|
56
56
|
- **Check one edge case per new branch** — invalid input, missing field, auth failure, …
|
|
57
|
-
- **Logs** — `MCP:ndt_debug_get-logs` at error level; `
|
|
57
|
+
- **Logs** — `MCP:ndt_debug_get-logs` at error level; `node_log_errors` must be empty for `pass`
|
|
58
58
|
|
|
59
59
|
---
|
|
60
60
|
|
|
@@ -64,4 +64,4 @@ Probe every Node code path reachable from files matching `node.verifyPatterns`,
|
|
|
64
64
|
|
|
65
65
|
- Place probes at every handler / route / service entry point in scope, plus key internal branch points (early returns, error catches, conditional middleware)
|
|
66
66
|
- Exercise each path with at least one happy-path call AND one failure-path call
|
|
67
|
-
- `
|
|
67
|
+
- `node_log_errors` must be empty after the full run — any unexpected log error is a fail, regardless of when it was introduced
|
|
@@ -4,20 +4,23 @@
|
|
|
4
4
|
|
|
5
5
|
## Backend cycle (runtime-agnostic protocol verification)
|
|
6
6
|
|
|
7
|
-
When the file matches `backend.verifyPatterns`, the Stop hook ALSO requires verification through the **backend-devtools** MCP server (prefix `bedt_`).
|
|
7
|
+
When the file matches `backend.verifyPatterns`, the Stop hook ALSO requires verification through the **backend-devtools** MCP server (prefix `bedt_`). The cycle is satisfied by **either** a real protocol call (HTTP / gRPC / GraphQL / WebSocket) against the running service, **or** by reading logs from the running service (file / docker / kubernetes) when an external driver is hitting the endpoint. Pick whichever fits the task — language- and framework-independent in both cases.
|
|
8
8
|
|
|
9
|
-
The backend cycle and the node cycle are **independent**. Node attaches to a Node.js V8 inspector with non-blocking probes (`ndt_*`); backend drives wire protocols from outside (`bedt_*`). They can be active in the same task; both must be satisfied for `status: pass`.
|
|
9
|
+
The backend cycle and the node cycle are **independent**. Node attaches to a Node.js V8 inspector with non-blocking probes (`ndt_*`); backend drives wire protocols and/or reads logs from outside (`bedt_*`). They can be active in the same task; both must be satisfied for `status: pass`.
|
|
10
10
|
|
|
11
11
|
### Backend-cycle additions to the main flow
|
|
12
12
|
|
|
13
13
|
These attach to the **Required steps** above — they don't replace any step. Numbering follows the main flow:
|
|
14
14
|
|
|
15
|
-
- **Within step 3 (run flow):** also run the backend flow
|
|
16
|
-
- **
|
|
15
|
+
- **Within step 3 (run flow):** also run the backend flow via ONE of two evidence paths:
|
|
16
|
+
- **Protocol-call** — identify the affected endpoint(s) → call the matching protocol tool (`bedt_request_http` / `bedt_request_grpc` / `bedt_request_graphql` / `bedt_request_websocket-open` / `bedt_request_replay`) → inspect status / body / `traceId` → chain follow-up calls when verifying side effects. **A 4xx / 5xx response is a normal result, not an error** — only transport failures populate the `error` field.
|
|
17
|
+
- **Log evidence** — `bedt_log_register-source` (file / docker / kubernetes) → `bedt_log_read` (point-in-time, supports `tail` / `since-until` / `pattern` / `level` / `parseJson` + `jsonFilter` / `contextBefore-After` / `select` / `coalesce`) OR `bedt_log_follow` + `bedt_log_get-followed` (streaming). Fit for jobs / queue workers / async handlers, or any case where an external driver is hitting the endpoint and you only need to verify what the server logged. `bedt_log_register-source` is mandatory on this path (the gate counts it as the setup step).
|
|
18
|
+
- **Within step 6 (submit verdict):** include `backend_endpoints_called` for protocol-call evidence (optionally with `backend_response_statuses` and `backend_traces_collected`), and/or `backend_log_sources_read` for log-evidence. **At least one** of those two arrays must be non-empty. One verdict carries fields for every active cycle.
|
|
17
19
|
|
|
18
20
|
### Additional BANNED for backend cycle
|
|
19
21
|
|
|
20
22
|
- Calling `bedt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
|
|
21
23
|
- Treating a 4xx / 5xx response as a transport failure when the test was specifically asking for that error condition (e.g. "POST should reject malformed body with 400"). Decide PASS/FAIL based on the test's intent, not the status code's HTTP-class default.
|
|
22
|
-
- Submitting a backend
|
|
23
|
-
- Inferring backend behavior by reading code without
|
|
24
|
+
- Submitting a backend verdict that omits BOTH `backend_endpoints_called` and `backend_log_sources_read` — at least one of those evidence arrays must be non-empty.
|
|
25
|
+
- Inferring backend behavior by reading code without exercising either evidence path. The cycle is satisfied only by making a real protocol call or by reading real logs from the running service.
|
|
26
|
+
- Reading a pre-existing log source unrelated to your task to fake the log-evidence path. `bedt_log_register-source` is a required setup step on that path so the registration is on the wire.
|
|
@@ -19,11 +19,11 @@ Both cycles can be active simultaneously (e.g. you edit both a React component a
|
|
|
19
19
|
These attach to the **Required steps** above — they don't replace any step. Numbering follows the main flow:
|
|
20
20
|
|
|
21
21
|
- **Within step 3 (run flow):** also run the node flow: connect (`MCP:ndt_debug_connect`) → set probe (`MCP:ndt_debug_put-tracepoint` / `put-logpoint` / `put-exceptionpoint`) AND exercise + read snapshots (`MCP:ndt_debug_get-probe-snapshots`), OR exercise + read logs (`MCP:ndt_debug_get-logs`). When both browser and node cycles are active, run BOTH within the same verification cycle.
|
|
22
|
-
- **Within step 6 (submit verdict):** include `
|
|
22
|
+
- **Within step 6 (submit verdict):** include `node_*` fields (`node_processes_connected` non-empty, plus `node_probes_set` and/or `node_log_errors`). One verdict carries fields for every active cycle.
|
|
23
23
|
|
|
24
24
|
### Additional BANNED for node cycle
|
|
25
25
|
|
|
26
26
|
- Calling `MCP:ndt_*` tools without first opening a verification cycle (`ironbee hook verification-start`).
|
|
27
27
|
- **Calling `MCP:ndt_*` tools when the project's backend is NOT Node.js** (Java / Python / Go / Rust / .NET / Ruby / PHP / Elixir / etc.). Use the browser cycle only for non-Node backends.
|
|
28
|
-
- Claiming `status: pass` for a node cycle when no probe triggered AND `
|
|
29
|
-
- Submitting a node-only verdict that omits `
|
|
28
|
+
- Claiming `status: pass` for a node cycle when no probe triggered AND `node_log_errors` was never collected.
|
|
29
|
+
- Submitting a node-only verdict that omits `node_processes_connected` — every node-cycle verdict requires this field non-empty.
|