@windyroad/risk-scorer 0.8.0 → 0.9.0-preview.311

@@ -1,5 +1,5 @@
1
1
  {
2
2
  "name": "wr-risk-scorer",
3
- "version": "0.8.0",
3
+ "version": "0.9.0",
4
4
  "description": "Pipeline risk scoring, commit/push/release gates for Claude Code"
5
5
  }
package/README.md CHANGED
@@ -65,6 +65,7 @@ The plugin includes six specialised agents:
65
65
  | `wr-risk-scorer:plan` | Reviews implementation plans for risk |
66
66
  | `wr-risk-scorer:policy` | Validates `RISK-POLICY.md` for ISO 31000 compliance |
67
67
  | `wr-risk-scorer:external-comms` | Reviews drafts of outbound prose (gh issues/PRs, advisories, npm publish, changeset bodies) for confidential-information leaks per `RISK-POLICY.md` |
68
+ | `wr-risk-scorer:inbound-report` | Reviews inbound third-party reports (problem-report issues, Q&A discussions, security-advisory submissions) for Request-risk + Fix-risk per `RISK-POLICY.md` § Inbound Report Risk Classes — sibling of `:external-comms` (NOT extension). Consumed by the assessment-pipeline (P079 / ADR-062). Serves JTBD-301 (verdict-on-close acknowledgement) + JTBD-001 (mechanical-stage carve-out). |
68
69
 
69
70
  ## On-demand assessment skills
70
71
 
@@ -73,6 +74,7 @@ The plugin includes six specialised agents:
73
74
  | `/wr-risk-scorer:assess-wip` | WIP risk nudge for the current uncommitted diff |
74
75
  | `/wr-risk-scorer:assess-release` | Pipeline risk assessment for the unpushed queue (pre-satisfies the commit gate) |
75
76
  | `/wr-risk-scorer:assess-external-comms` | External-comms leak review for a draft outbound body (pre-satisfies the external-comms gate) |
77
+ | `/wr-risk-scorer:assess-inbound-report` | Inbound-report risk review for a third-party submission — two-axis (Request-risk + Fix-risk) classification per `RISK-POLICY.md` (P079 / ADR-062). Serves JTBD-005 (on-demand assessment) + JTBD-202 (pre-flight governance check). |
76
78
  | `/wr-risk-scorer:create-risk` | Create a standing-risk register entry (interactive authoring; orchestrator-driven prefilled invocation via `--slug` / `--prefill` flags per ADR-059) |
77
79
  | `/wr-risk-scorer:bootstrap-catalog` | Bootstrap `docs/risks/` register from existing `.risk-reports/` corpus per ADR-059 — walks reports, dedupes by ADR-056 slug, emits one `R<NNN>-<slug>.active.md` per unique slug. Idempotent. Auto-triggers from `/install-updates` Step 6.5.1 when register is empty + `RISK-POLICY.md` present + `.risk-reports/` non-empty |
78
80
  | `/wr-risk-scorer:update-policy` | Generate or update `RISK-POLICY.md` |
@@ -0,0 +1,126 @@
1
+ ---
2
+ name: inbound-report
3
+ description: Reviews third-party prose submitted as inbound reports (gh issue bodies labelled problem-report, gh discussions in Q&A categories, gh security-advisory bodies) for two risk axes — Request-risk (info-extraction / backdoor request / malicious-code injection) and Fix-risk (privilege escalation / removal of load-bearing safety check / adopter-attack-surface expansion). Read-only — emits a structured PASS/FAIL verdict consumed by the assessment-pipeline (ADR-062) for branch routing.
4
+ tools:
5
+ - Read
6
+ - Glob
7
+ - Grep
8
+ model: inherit
9
+ ---
10
+
11
+ You are the Inbound-Report Risk Reviewer. Your single job: read the body of an inbound report (a third-party submission against this repo's intake — a `problem-report.yml` issue, a Q&A discussion, or a security-advisory submission) and return a structured PASS/FAIL verdict against RISK-POLICY.md's Inbound Report Risk Classes (Request-risk + Fix-risk).
12
+
13
+ You are read-only. You do NOT write files, do NOT post comments upstream, do NOT modify the inbound report. Your verdict is consumed by `/wr-itil:review-problems` Step 8.5's assessment-pipeline (ADR-062) — the pipeline reads your verdict and routes the report to one of three branches: above-threshold-pushback, clear-malicious-close-with-verdict, or safe-and-valid-local-ticket-create.
14
+
15
+ **Direction of flow**: you review THIRD-PARTY prose flowing INWARD. This is the opposite direction from `wr-risk-scorer:external-comms` (which reviews OUR outbound prose for leaks). The two subagents are siblings, not extensions — the evaluator concerns are semantically distinct (third-party intent vs our-confidential-leakage).
16
+
17
+ ## What you receive
18
+
19
+ The invoking skill (`/wr-risk-scorer:assess-inbound-report`) or the assessment-pipeline provides:
20
+
21
+ - The **report body** verbatim — the exact prose submitted on the intake surface.
22
+ - The **report metadata** — submitter handle, surface (`github-issues` / `github-discussions` / `github-security-advisories`), repo, issue/discussion ID when known.
23
+ - The **JTBD-alignment context** — the assessment-pipeline's prior-step verdict (`aligned-with-existing-JTBD` / `aligned-with-new-JTBD-for-existing-persona` / `not-aligned`) so your judgement composes with the alignment classifier's output rather than re-deriving it.
24
+
25
+ Read `RISK-POLICY.md` (project root) `## Inbound Report Risk Classes` section to get the authoritative class list for both axes.
26
+
27
+ ## Two-axis review
28
+
29
+ ### Axis 1 — Request-risk (is the report itself an attack vector?)
30
+
31
+ For each Request-risk class in `## Inbound Report Risk Classes`, pass the report body against the class definition. Look for:
32
+
33
+ - **Info-extraction**: requests for the maintainer to reveal repository internals, build secrets, deployment paths, credentials, contributor PII, or other non-public information that a legitimate problem report does not need.
34
+ - **Backdoor request**: requests to add a backdoor, weaken a safety check, disable a security feature, expose an internal API, or otherwise compromise the project's integrity disguised as a feature/bug.
35
+ - **Malicious-code injection**: requests to incorporate user-supplied code (script snippets, regex patterns, prompt templates, hook payloads) that read as likely-malicious in the context they would execute.
36
+
37
+ ### Axis 2 — Fix-risk (is fixing the report risky?)
38
+
39
+ Some legitimate-looking reports request changes that are themselves high-risk to ship. For each Fix-risk class:
40
+
41
+ - **Privilege escalation**: the requested fix would let the requester (or others) escalate privilege within the suite or downstream adopters.
42
+ - **Removal of load-bearing safety check**: the requested fix removes a check whose removal increases risk to users.
43
+ - **Adopter-attack-surface expansion**: the requested fix would expand the suite's attack surface across all adopters (e.g. shipping a credential-handling pattern, broadening a permissive default).
44
+
45
+ ## Verdict combinations
46
+
47
+ Combine the two axes into one structured outcome:
48
+
49
+ | Request-risk | Fix-risk | Verdict | Pipeline branch |
50
+ |--------------|----------|---------|-----------------|
51
+ | clear-malicious | (any) | FAIL — `clear-malicious-request` | clear-malicious-close-with-verdict |
52
+ | above-threshold | (any) | FAIL — `above-threshold-risk` | above-threshold-pushback |
53
+ | safe | high | PASS — `safe-high-fix-risk` (continue with maintainer-attention flag) | safe-and-valid-local-ticket-create + flag |
54
+ | safe | low | PASS — `safe-low-fix-risk` | safe-and-valid-local-ticket-create |
55
+
56
+ `clear-malicious-request` is reserved for unambiguous attacks (named info-extraction / backdoor / malicious-code class with high confidence). `above-threshold-risk` covers the policy-ambiguous middle — content that fits a Request-risk class but at lower confidence or with mitigating context.
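The verdict table above can be read as a two-level lookup: Request-risk decides first, and Fix-risk only matters once the request is judged safe. A minimal sketch of that lookup follows. The function name `route_inbound_verdict` and its space-separated output layout are assumptions for illustration; only the class and branch vocabulary comes from the table.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): maps the two axis results
# onto "<verdict> <class> <pipeline branch>" using the table's vocabulary.
route_inbound_verdict() {
  local request_risk="$1" fix_risk="$2"
  case "$request_risk" in
    # Request-risk failures win regardless of the Fix-risk value.
    clear-malicious) echo "FAIL clear-malicious-request clear-malicious-close-with-verdict" ;;
    above-threshold) echo "FAIL above-threshold-risk above-threshold-pushback" ;;
    safe)
      case "$fix_risk" in
        high) echo "PASS safe-high-fix-risk safe-and-valid-local-ticket-create+flag" ;;
        low)  echo "PASS safe-low-fix-risk safe-and-valid-local-ticket-create" ;;
        *)    echo "unknown fix-risk: $fix_risk" >&2; return 1 ;;
      esac ;;
    *) echo "unknown request-risk: $request_risk" >&2; return 1 ;;
  esac
}
```

Rejecting unknown inputs with a non-zero status keeps the sketch from silently routing a report the table does not cover.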
57
+
58
+ ## Verdict format (MANDATORY)
59
+
60
+ End your report with a structured block consumed by the assessment-pipeline + `risk-score-mark.sh` PostToolUse hook. Every field is required.
61
+
62
+ ```
63
+ INBOUND_REPORT_VERDICT: PASS
64
+ INBOUND_REPORT_KEY: <sha256 hex string>
65
+ INBOUND_REPORT_CLASS: <safe-low-fix-risk | safe-high-fix-risk>
66
+ ```
67
+
68
+ OR for a failed review:
69
+
70
+ ```
71
+ INBOUND_REPORT_VERDICT: FAIL
72
+ INBOUND_REPORT_KEY: <sha256 hex string>
73
+ INBOUND_REPORT_CLASS: <above-threshold-risk | clear-malicious-request>
74
+ INBOUND_REPORT_REASON: <one-line description of the axis + class + matched fragment>
75
+ ```
76
+
77
+ Compute the key as:
78
+
79
+ ```
80
+ printf '%s\n%s\n%s' "<report body verbatim>" "<surface name>" "<submitter handle>" | shasum -a 256 | cut -d' ' -f1
81
+ ```
82
+
83
+ The key MUST match the pipeline's computation exactly — a key mismatch means the marker is written for a different report and the assessment-pipeline will re-trigger the subagent on the next pass.
84
+
85
+ ## Grounding (ADR-026)
86
+
87
+ Every FAIL verdict MUST cite:
88
+
89
+ - The specific RISK-POLICY.md class violated (verbatim — copy the bullet from the policy).
90
+ - The axis the class belongs to (Request-risk or Fix-risk).
91
+ - The exact substring from the report body that triggered the call (when the class is content-pattern-based).
92
+ - A one-line explanation of why this submission constitutes the class match.
93
+
94
+ Example:
95
+
96
+ > INBOUND_REPORT_REASON: Axis 1 Request-risk "Info-extraction" class — report body contains "share the exact path of your CI credentials so I can replicate" requesting non-public deployment information; legitimate `problem-report.yml` submissions do not require maintainer credential paths.
97
+
98
+ ## Constraints
99
+
100
+ - You are a reviewer, not an editor — do NOT propose rewrites in the verdict block. (Free prose suggestions outside the verdict block are fine when explaining the FAIL reason.)
101
+ - Do NOT score by analogy when the policy names the class.
102
+ - Do NOT write to `/tmp/` or any marker location yourself — the PostToolUse hook owns that.
103
+ - Do NOT skip the `INBOUND_REPORT_KEY` line; without it, the assessment-pipeline has no key to write the marker against and will re-trigger the subagent on the next pass.
104
+ - Do NOT make a block-list decision (P123 scope) — your verdict feeds the audit-log via ADR-062's clear-malicious branch; block-list enforcement is a separate ticket's concern.
105
+ - When the report body is empty (e.g. a Q&A discussion with only a title), review the title + metadata. If neither carries enough content, FAIL with class `above-threshold-risk` and reason "body unresolvable; cannot review without text" so the maintainer can pre-review manually.
106
+
107
+ ## Below-Appetite Output Rule (ADR-013 Rule 5)
108
+
109
+ When the verdict is PASS at the `safe-low-fix-risk` class, your output may be terse: a one-line "no Inbound Report Risk class matched on either axis; fix risk low" plus the verdict block. Do not pad with advisory prose; policy-authorised submissions proceed silently per ADR-013 Rule 5.
110
+
111
+ ## Above-Appetite (FAIL or safe-high-fix-risk) Output
112
+
113
+ When the verdict is FAIL OR the class is `safe-high-fix-risk`:
114
+
115
+ - **FAIL**: surface the matched class, axis, and triggering substring in PROSE BEFORE the verdict block. The pipeline routes this to the pushback branch (which posts a gated comment per ADR-028 amended); maintainer-side context is the prose, machine-side routing is the block.
116
+ - **safe-high-fix-risk**: surface the fix-risk class the maintainer should weigh BEFORE accepting the local ticket. The pipeline still creates the local ticket (safe-and-valid path) but flags it for maintainer attention.
117
+
118
+ ## ADR cross-references
119
+
120
+ - **ADR-062** (Inbound upstream-report discovery + assessment pipeline) — § Sibling subagent section names this agent + this two-axis framing.
121
+ - **ADR-015** (On-demand assessment skills) — § Scope table includes the sibling `/wr-risk-scorer:assess-inbound-report` skill that wraps this agent for manual invocation.
122
+ - **ADR-028** (External-comms gate, amended) — the pushback / clear-malicious-verdict comments the assessment-pipeline posts after this agent's FAIL verdict ride the P064 + P038 evaluator halves.
123
+ - **ADR-029** (Diagnose before implement) — your verdict follows the hypothesis (axis-class match) / evidence (matched substring or metadata) / structured-verdict (PASS / FAIL + class + key) discipline.
124
+ - **ADR-013 Rule 5** — below-appetite silent-pass output rule applies.
125
+ - **P079** — the parent problem ticket driving this work.
126
+ - **P132** + inverse-P078 — your verdict resolves the branch decision mechanically; the assessment-pipeline does NOT use AskUserQuestion at the branch decision (this is the framework-resolution boundary).
@@ -0,0 +1,225 @@
1
+ #!/usr/bin/env bats
2
+ # Contract assertions for the wr-risk-scorer:inbound-report subagent
3
+ # (RFC-004 Slice B). Sibling of wr-risk-scorer:external-comms — NOT
4
+ # extension. Reviews INBOUND third-party prose on two axes (Request-risk +
5
+ # Fix-risk) per RISK-POLICY.md § Inbound Report Risk Classes.
6
+ #
7
+ # Structural assertions — Permitted Exception to the source-grep ban
8
+ # per ADR-005 / P011 / ADR-037 / ADR-052 § Surface 2. Subagent prompt
9
+ # prose governs LLM-driven verdict behaviour; behavioural-replay
10
+ # testing requires a synthetic agent harness (P012 / P176). Until that
11
+ # harness lands, contract bats assert the load-bearing rubric + structured
12
+ # verdict format are present so future edits don't silently strip them.
13
+ #
14
+ # @problem P079
15
+ # @rfc RFC-004
16
+ # @adr ADR-062 (inbound discovery + assessment pipeline — § Sibling subagent)
17
+ # @adr ADR-015 (on-demand assessment skills — § Scope table)
18
+ # @adr ADR-026 (grounding discipline — every FAIL verdict cites policy class)
19
+ # @adr ADR-029 (diagnose before implement — hypothesis / evidence / structured verdict)
20
+ # @adr ADR-052 (behavioural-tests default + Permitted Exception)
21
+ # @jtbd JTBD-301 (acknowledgement contract grounded in policy classes)
22
+ # @jtbd JTBD-001 (mechanical-stage carve-out via structured verdict)
23
+
24
+ setup() {
25
+ AGENTS_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
26
+ AGENT_FILE="${AGENTS_DIR}/inbound-report.md"
27
+ POLICY_FILE="$(cd "${AGENTS_DIR}/../../.." && pwd)/RISK-POLICY.md"
28
+ }
29
+
30
+ # ──────────────────────────────────────────────────────────────────────────────
31
+ # Frontmatter + tool surface
32
+ # ──────────────────────────────────────────────────────────────────────────────
33
+
34
+ @test "inbound-report.md exists and has frontmatter (RFC-004 Slice B)" {
35
+ [ -f "$AGENT_FILE" ]
36
+ run head -1 "$AGENT_FILE"
37
+ [ "$status" -eq 0 ]
38
+ [ "$output" = "---" ]
39
+ }
40
+
41
+ @test "frontmatter name is 'inbound-report' (sibling of external-comms)" {
42
+ run grep -nE '^name: inbound-report$' "$AGENT_FILE"
43
+ [ "$status" -eq 0 ]
44
+ }
45
+
46
+ @test "frontmatter tools are read-only (Read, Glob, Grep)" {
47
+ # Per ADR-062 § Sibling subagent: read-only contract; subagent emits
48
+ # verdict, PostToolUse hook owns marker writes.
49
+ run grep -nE ' - Read' "$AGENT_FILE"
50
+ [ "$status" -eq 0 ]
51
+ run grep -nE ' - Glob' "$AGENT_FILE"
52
+ [ "$status" -eq 0 ]
53
+ run grep -nE ' - Grep' "$AGENT_FILE"
54
+ [ "$status" -eq 0 ]
55
+ }
56
+
57
+ @test "frontmatter tools do NOT include Write / Edit / Bash (read-only invariant)" {
58
+ run grep -nE '^ - (Write|Edit|Bash)$' "$AGENT_FILE"
59
+ [ "$status" -ne 0 ]
60
+ }
61
+
62
+ @test "frontmatter model is inherit" {
63
+ run grep -nE '^model: inherit$' "$AGENT_FILE"
64
+ [ "$status" -eq 0 ]
65
+ }
66
+
67
+ # ──────────────────────────────────────────────────────────────────────────────
68
+ # Sibling-not-extension positioning (ADR-062 § Sibling subagent)
69
+ # ──────────────────────────────────────────────────────────────────────────────
70
+
71
+ @test "agent prose names sibling-not-extension positioning vs external-comms" {
72
+ # ADR-062 explicitly carves the inbound-report subagent as a sibling
73
+ # (NOT extension) of external-comms. Protects JTBD-101 plugin-developer
74
+ # constraint "must not break existing plugins" by preserving
75
+ # external-comms scope-purity.
76
+ run grep -inE 'sibling.*external-comms|external-comms.*sibling' "$AGENT_FILE"
77
+ [ "$status" -eq 0 ]
78
+ }
79
+
80
+ @test "agent prose names the inbound-direction framing (third-party prose flowing INWARD)" {
81
+ run grep -inE 'INWARD|inbound prose|third-party prose' "$AGENT_FILE"
82
+ [ "$status" -eq 0 ]
83
+ }
84
+
85
+ # ──────────────────────────────────────────────────────────────────────────────
86
+ # Two-axis review structure (Request-risk + Fix-risk)
87
+ # ──────────────────────────────────────────────────────────────────────────────
88
+
89
+ @test "Axis 1 Request-risk documented (attack-vector axis)" {
90
+ run grep -nE 'Axis 1.*Request-risk' "$AGENT_FILE"
91
+ [ "$status" -eq 0 ]
92
+ }
93
+
94
+ @test "Axis 1 enumerates info-extraction / backdoor request / malicious-code injection classes" {
95
+ run grep -inE 'Info-extraction' "$AGENT_FILE"
96
+ [ "$status" -eq 0 ]
97
+ run grep -inE 'Backdoor request' "$AGENT_FILE"
98
+ [ "$status" -eq 0 ]
99
+ run grep -inE 'Malicious-code injection' "$AGENT_FILE"
100
+ [ "$status" -eq 0 ]
101
+ }
102
+
103
+ @test "Axis 2 Fix-risk documented (work-to-be-weighed axis)" {
104
+ run grep -nE 'Axis 2.*Fix-risk' "$AGENT_FILE"
105
+ [ "$status" -eq 0 ]
106
+ }
107
+
108
+ @test "Axis 2 enumerates privilege escalation / removal-of-safety-check / adopter-attack-surface-expansion classes" {
109
+ run grep -inE 'Privilege escalation' "$AGENT_FILE"
110
+ [ "$status" -eq 0 ]
111
+ run grep -inE 'Removal of load-bearing safety check' "$AGENT_FILE"
112
+ [ "$status" -eq 0 ]
113
+ run grep -inE 'Adopter-attack-surface expansion' "$AGENT_FILE"
114
+ [ "$status" -eq 0 ]
115
+ }
116
+
117
+ # ──────────────────────────────────────────────────────────────────────────────
118
+ # Structured verdict block (consumed by assessment-pipeline branch routing)
119
+ # ──────────────────────────────────────────────────────────────────────────────
120
+
121
+ @test "verdict block defines INBOUND_REPORT_VERDICT" {
122
+ run grep -nE 'INBOUND_REPORT_VERDICT' "$AGENT_FILE"
123
+ [ "$status" -eq 0 ]
124
+ }
125
+
126
+ @test "verdict block defines INBOUND_REPORT_KEY (sha256 hex for marker matching)" {
127
+ run grep -nE 'INBOUND_REPORT_KEY' "$AGENT_FILE"
128
+ [ "$status" -eq 0 ]
129
+ }
130
+
131
+ @test "verdict block defines INBOUND_REPORT_CLASS (one of four classifications)" {
132
+ run grep -nE 'INBOUND_REPORT_CLASS' "$AGENT_FILE"
133
+ [ "$status" -eq 0 ]
134
+ }
135
+
136
+ @test "verdict block defines INBOUND_REPORT_REASON for FAIL path" {
137
+ run grep -nE 'INBOUND_REPORT_REASON' "$AGENT_FILE"
138
+ [ "$status" -eq 0 ]
139
+ }
140
+
141
+ # ──────────────────────────────────────────────────────────────────────────────
142
+ # Four classifications enumerated (branch-routing vocabulary)
143
+ # ──────────────────────────────────────────────────────────────────────────────
144
+
145
+ @test "classification safe-low-fix-risk enumerated" {
146
+ run grep -nE 'safe-low-fix-risk' "$AGENT_FILE"
147
+ [ "$status" -eq 0 ]
148
+ }
149
+
150
+ @test "classification safe-high-fix-risk enumerated" {
151
+ run grep -nE 'safe-high-fix-risk' "$AGENT_FILE"
152
+ [ "$status" -eq 0 ]
153
+ }
154
+
155
+ @test "classification above-threshold-risk enumerated" {
156
+ run grep -nE 'above-threshold-risk' "$AGENT_FILE"
157
+ [ "$status" -eq 0 ]
158
+ }
159
+
160
+ @test "classification clear-malicious-request enumerated" {
161
+ run grep -nE 'clear-malicious-request' "$AGENT_FILE"
162
+ [ "$status" -eq 0 ]
163
+ }
164
+
165
+ # ──────────────────────────────────────────────────────────────────────────────
166
+ # Grounding discipline (ADR-026)
167
+ # ──────────────────────────────────────────────────────────────────────────────
168
+
169
+ @test "FAIL verdict requires citing the specific RISK-POLICY.md class" {
170
+ run grep -inE 'cite|class violated' "$AGENT_FILE"
171
+ [ "$status" -eq 0 ]
172
+ }
173
+
174
+ @test "agent prose cites ADR-026 grounding discipline" {
175
+ run grep -nE 'ADR-026' "$AGENT_FILE"
176
+ [ "$status" -eq 0 ]
177
+ }
178
+
179
+ # ──────────────────────────────────────────────────────────────────────────────
180
+ # Read-only constraints + marker boundary
181
+ # ──────────────────────────────────────────────────────────────────────────────
182
+
183
+ @test "agent declares read-only (no file writes / commits / draft modifications)" {
184
+ run grep -inE 'read-only' "$AGENT_FILE"
185
+ [ "$status" -eq 0 ]
186
+ }
187
+
188
+ @test "agent forbids self-writing to /tmp/ or marker locations" {
189
+ # PostToolUse hook owns marker writes per ADR-009; the subagent
190
+ # emits the verdict and the hook computes the marker key.
191
+ run grep -inE 'NOT write to /tmp|PostToolUse hook owns' "$AGENT_FILE"
192
+ [ "$status" -eq 0 ]
193
+ }
194
+
195
+ # ──────────────────────────────────────────────────────────────────────────────
196
+ # Mechanical-stage carve-out integration (P132 — pipeline branch routing)
197
+ # ──────────────────────────────────────────────────────────────────────────────
198
+
199
+ @test "agent prose names the mechanical-stage carve-out integration (P132)" {
200
+ run grep -nE 'P132|mechanical' "$AGENT_FILE"
201
+ [ "$status" -eq 0 ]
202
+ }
203
+
204
+ @test "agent does NOT make a block-list decision (P123 scope carve-out)" {
205
+ # Block-list enforcement is a separate ticket's concern; this subagent's
206
+ # verdict feeds the audit-log via the assessment-pipeline's clear-malicious
207
+ # branch and stops there.
208
+ run grep -inE 'NOT make a block-list|P123' "$AGENT_FILE"
209
+ [ "$status" -eq 0 ]
210
+ }
211
+
212
+ # ──────────────────────────────────────────────────────────────────────────────
213
+ # RISK-POLICY.md integration (the policy classes the agent grounds verdicts against)
214
+ # ──────────────────────────────────────────────────────────────────────────────
215
+
216
+ @test "RISK-POLICY.md has the '## Inbound Report Risk Classes' section the agent reads" {
217
+ [ -f "$POLICY_FILE" ]
218
+ run grep -nE '^## Inbound Report Risk Classes$' "$POLICY_FILE"
219
+ [ "$status" -eq 0 ]
220
+ }
221
+
222
+ @test "agent prose references RISK-POLICY.md § Inbound Report Risk Classes" {
223
+ run grep -nE 'Inbound Report Risk Classes' "$AGENT_FILE"
224
+ [ "$status" -eq 0 ]
225
+ }
@@ -0,0 +1,23 @@
1
+ # Per-package evaluator config for external-comms-gate.sh (ADR-028 amended 2026-05-14).
2
+ # Sourced by the canonical external-comms-gate.sh; NOT synced (each consumer
3
+ # plugin maintains its own .conf).
4
+
5
+ # Short evaluator id — used in marker filenames (external-comms-<id>-reviewed-<key>).
6
+ EXTERNAL_COMMS_EVALUATOR_ID=risk
7
+
8
+ # Subagent type the deny message directs to.
9
+ EXTERNAL_COMMS_SUBAGENT_TYPE=wr-risk-scorer:external-comms
10
+
11
+ # Structured-output prefix the PostToolUse:Agent hook parses from the subagent's
12
+ # stdout (EXTERNAL_COMMS_RISK_VERDICT + EXTERNAL_COMMS_RISK_KEY).
13
+ EXTERNAL_COMMS_VERDICT_PREFIX=EXTERNAL_COMMS_RISK
14
+
15
+ # On-demand skill for pre-flight delegation.
16
+ EXTERNAL_COMMS_ASSESS_SKILL=/wr-risk-scorer:assess-external-comms
17
+
18
+ # Policy file whose absence triggers advisory-only mode.
19
+ EXTERNAL_COMMS_POLICY_FILE=RISK-POLICY.md
20
+
21
+ # Whether to run the leak-pattern pre-filter (lib/leak-detect.sh). Risk evaluator
22
+ # checks confidential-information leaks; voice-tone evaluator does not.
23
+ EXTERNAL_COMMS_LEAK_PREFILTER=yes
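To illustrate how a consumer might turn this config into a marker filename, here is a minimal sketch. The variable name `EXTERNAL_COMMS_EVALUATOR_ID` and the `external-comms-<id>-reviewed-<key>` pattern come from the config comments above; the function `marker_prefix_from_conf` is hypothetical, and loading in a subshell is a choice made here so the caller's environment stays untouched.

```shell
#!/usr/bin/env bash
# Hypothetical loader (not part of the package): sources a conf file in a
# subshell and prints the marker filename prefix for that evaluator.
marker_prefix_from_conf() {
  local conf="$1"
  (
    [ -f "$conf" ] || { echo "missing conf: $conf" >&2; exit 1; }
    # shellcheck source=/dev/null
    . "$conf"
    # ${var:?msg} aborts the subshell with an error if the id is unset or empty.
    : "${EXTERNAL_COMMS_EVALUATOR_ID:?evaluator id missing from $conf}"
    printf 'external-comms-%s-reviewed-' "$EXTERNAL_COMMS_EVALUATOR_ID"
  )
}
```

With the config shown above, the function would print `external-comms-risk-reviewed-`; a missing file or a missing id yields a non-zero status instead of a partial name.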
@@ -1,5 +1,11 @@
1
1
  #!/bin/bash
2
- # PreToolUse hook: gates outbound prose for risk/leak review (P064 / ADR-028 amended).
2
+ # PreToolUse hook: gates outbound prose for evaluator review (P064 / P038 / ADR-028 amended 2026-05-14).
3
+ #
4
+ # This is the CANONICAL hook synced byte-identically into each consumer plugin
5
+ # (risk-scorer, voice-tone, …) via ADR-017 duplicate-script pattern. Each copy
6
+ # sources `${SCRIPT_DIR}/external-comms-evaluator.conf` to determine its
7
+ # evaluator identity (risk / voice-tone / …) — the .conf file is per-package
8
+ # and NOT synced.
3
9
  #
4
10
  # Surface (matched on Bash command text or Edit/Write file_path):
5
11
  # - gh issue create | comment | edit (public issue bodies)
@@ -11,23 +17,26 @@
11
17
  #
12
18
  # Gate behaviour:
13
19
  # 1. BYPASS_RISK_GATE=1 short-circuits the gate (consistent with git-push-gate.sh).
14
- # 2. RISK-POLICY.md absent → advisory-only mode (permits with systemMessage).
20
+ # 2. POLICY_FILE absent → advisory-only mode (permits with systemMessage).
15
21
  # 3. Hybrid leak-pattern pre-filter (lib/leak-detect.sh) hard-fails on
16
22
  # credentials, prod-URL prefixes, business-context-paired financial figures,
17
23
  # or business-context-paired user counts. Deny includes the matched class.
18
- # 4. Otherwise: check for a per-evaluator marker keyed on
24
+ # (Voice-tone evaluator: skips leak pre-filter; leak detection is the
25
+ # risk evaluator's concern; voice-tone reviews tone/voice only.)
26
+ # 4. Otherwise: check for THIS evaluator's per-evaluator marker keyed on
19
27
  # sha256(draft_body + '\n' + surface). Marker present → permit.
20
- # Marker absent → deny with directive to delegate to wr-risk-scorer:external-comms.
28
+ # Marker absent → deny with directive to delegate to this plugin's
29
+ # subagent (configured via external-comms-evaluator.conf).
21
30
  #
22
- # Marker location: ${TMPDIR:-/tmp}/claude-risk-${SESSION_ID}/external-comms-reviewed-<sha256>
23
- # Marker writer: PostToolUse:Agent hook (risk-score-mark.sh) on subagent
24
- # type wr-risk-scorer:external-comms.
31
+ # Marker location: ${TMPDIR:-/tmp}/claude-risk-${SESSION_ID}/external-comms-<EVALUATOR_ID>-reviewed-<sha256>
32
+ # Marker writer: PostToolUse:Agent hook in each consumer plugin
33
+ # (risk-score-mark.sh or external-comms-mark-reviewed.sh) on
34
+ # subagent type wr-<plugin>:external-comms.
25
35
  #
26
- # Composite-marker scheme (combining with wr-voice-tone:agent verdict for
27
- # the same draft) is deferred until P038 lands its evaluator. This iteration
28
- # ships the risk-evaluator side as a standalone gate; the two hooks compose
29
- # at the PreToolUse:Bash matcher level when both packages are installed.
30
- # See ADR-028 amendment Reassessment Criteria.
36
+ # Per-evaluator marker scheme (ADR-028 amended 2026-05-14): when both
37
+ # voice-tone and risk-scorer are installed, both gates fire on the same
38
+ # PreToolUse event; each gate denies until its own per-evaluator marker
39
+ # exists. Gates compose at firing level; no shared composite marker.
31
40
 
32
41
  set -euo pipefail
33
42
 
@@ -35,6 +44,29 @@ SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
35
44
  # shellcheck source=lib/leak-detect.sh
36
45
  source "$SCRIPT_DIR/lib/leak-detect.sh"
37
46
 
47
+ # ---------- Per-package evaluator config (ADR-028 amended 2026-05-14) ----------
48
+ # Each consumer plugin ships its own external-comms-evaluator.conf alongside this
49
+ # byte-identical canonical hook. The .conf defines:
50
+ # EXTERNAL_COMMS_EVALUATOR_ID — short id (risk, voice-tone)
51
+ # EXTERNAL_COMMS_SUBAGENT_TYPE — subagent to delegate to (wr-<plugin>:external-comms)
52
+ # EXTERNAL_COMMS_VERDICT_PREFIX — structured-output prefix the mark hook parses
53
+ # EXTERNAL_COMMS_ASSESS_SKILL — on-demand skill path for manual delegation
54
+ # EXTERNAL_COMMS_POLICY_FILE — policy doc whose absence triggers advisory-only
55
+ # EXTERNAL_COMMS_LEAK_PREFILTER — yes|no — whether to run leak-detect pre-filter
56
+ # Fail-closed if absent: this hook cannot operate without a configured evaluator.
57
+ CONF_FILE="$SCRIPT_DIR/external-comms-evaluator.conf"
58
+ if [ ! -f "$CONF_FILE" ]; then
59
+ echo "ERROR: external-comms-gate.sh requires $CONF_FILE (ADR-028 amended 2026-05-14)" >&2
60
+ exit 0
61
+ fi
62
+ # shellcheck source=/dev/null
63
+ source "$CONF_FILE"
64
+ : "${EXTERNAL_COMMS_EVALUATOR_ID:?evaluator id missing from $CONF_FILE}"
65
+ : "${EXTERNAL_COMMS_SUBAGENT_TYPE:?subagent type missing from $CONF_FILE}"
66
+ : "${EXTERNAL_COMMS_ASSESS_SKILL:?assess-skill missing from $CONF_FILE}"
67
+ EXTERNAL_COMMS_POLICY_FILE="${EXTERNAL_COMMS_POLICY_FILE:-RISK-POLICY.md}"
68
+ EXTERNAL_COMMS_LEAK_PREFILTER="${EXTERNAL_COMMS_LEAK_PREFILTER:-yes}"
69
+
38
70
  # ---------- Bypass ----------
39
71
  if [ "${BYPASS_RISK_GATE:-0}" = "1" ]; then
40
72
  exit 0
@@ -173,31 +205,37 @@ print(json.dumps({'systemMessage': sys.argv[1]}))
173
205
  }
174
206
 
175
207
  # ---------- Advisory-only fallback when policy file is absent ----------
176
- if [ ! -f "RISK-POLICY.md" ]; then
177
- permit_with_advisory "RISK-POLICY.md not found — wr-risk-scorer:external-comms gate is advisory-only on $SURFACE."
208
+ if [ ! -f "$EXTERNAL_COMMS_POLICY_FILE" ]; then
209
+ permit_with_advisory "$EXTERNAL_COMMS_POLICY_FILE not found — $EXTERNAL_COMMS_SUBAGENT_TYPE gate is advisory-only on $SURFACE."
178
210
  exit 0
179
211
  fi
180
212
 
181
- # ---------- Hard-fail leak-pattern pre-filter ----------
182
- if ! leak_detect_scan "$DRAFT"; then
183
- REASON=$(printf 'BLOCKED (P064 external-comms gate): %s on %s. Remove the leak before retrying. Override only if intentional: BYPASS_RISK_GATE=1.' \
184
- "$LEAK_DETECT_REASON" "$SURFACE")
185
- deny_with_reason "$REASON"
186
- exit 0
213
+ # ---------- Hard-fail leak-pattern pre-filter (risk evaluator only) ----------
214
+ # Voice-tone evaluator skips this — leak detection is the risk evaluator's
215
+ # concern. Each per-package external-comms-evaluator.conf sets
216
+ # EXTERNAL_COMMS_LEAK_PREFILTER=yes (risk) or =no (voice-tone).
217
+ if [ "$EXTERNAL_COMMS_LEAK_PREFILTER" = "yes" ]; then
218
+ if ! leak_detect_scan "$DRAFT"; then
219
+ REASON=$(printf 'BLOCKED (external-comms gate / %s evaluator): %s on %s. Remove the leak before retrying. Override only if intentional: BYPASS_RISK_GATE=1.' \
220
+ "$EXTERNAL_COMMS_EVALUATOR_ID" "$LEAK_DETECT_REASON" "$SURFACE")
221
+ deny_with_reason "$REASON"
222
+ exit 0
223
+ fi
187
224
  fi
188
225
 
189
- # ---------- Marker-based gate ----------
226
+ # ---------- Marker-based gate (per-evaluator marker per ADR-028 amended 2026-05-14) ----------
190
227
  SESSION_DIR="${TMPDIR:-/tmp}/claude-risk-${SESSION_ID}"
191
228
  mkdir -p "$SESSION_DIR"
192
229
  KEY=$(printf '%s\n%s' "$DRAFT" "$SURFACE" | shasum -a 256 | cut -d' ' -f1)
193
- MARKER="${SESSION_DIR}/external-comms-reviewed-${KEY}"
230
+ MARKER="${SESSION_DIR}/external-comms-${EXTERNAL_COMMS_EVALUATOR_ID}-reviewed-${KEY}"
194
231
 
195
232
  if [ -f "$MARKER" ]; then
196
233
  exit 0
197
234
  fi
198
235
 
199
236
  # Marker absent — deny + delegate.
200
- REASON=$(printf 'BLOCKED (P064 external-comms gate): %s draft has not been risk-reviewed. Delegate to wr-risk-scorer:external-comms (subagent_type: '"'"'wr-risk-scorer:external-comms'"'"') with the draft body for review. The PostToolUse hook will mark this draft reviewed when the subagent emits EXTERNAL_COMMS_RISK_VERDICT: PASS. Use /wr-risk-scorer:assess-external-comms for an interactive walkthrough. Override only when intentional: BYPASS_RISK_GATE=1.' \
201
- "$SURFACE")
237
+ VERDICT_PREFIX="${EXTERNAL_COMMS_VERDICT_PREFIX:-EXTERNAL_COMMS_${EXTERNAL_COMMS_EVALUATOR_ID^^}}"
238
+ REASON=$(printf 'BLOCKED (external-comms gate / %s evaluator): %s draft has not been reviewed by %s. Delegate to %s (subagent_type: '"'"'%s'"'"') with the draft body for review. The PostToolUse hook will mark this draft reviewed when the subagent emits %s_VERDICT: PASS. Use %s for an interactive walkthrough. Override only when intentional: BYPASS_RISK_GATE=1.' \
239
+ "$EXTERNAL_COMMS_EVALUATOR_ID" "$SURFACE" "$EXTERNAL_COMMS_SUBAGENT_TYPE" "$EXTERNAL_COMMS_SUBAGENT_TYPE" "$EXTERNAL_COMMS_SUBAGENT_TYPE" "$VERDICT_PREFIX" "$EXTERNAL_COMMS_ASSESS_SKILL")
202
240
  deny_with_reason "$REASON"
203
241
  exit 0
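The per-evaluator marker scheme in the hunk above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the shipped hook: the helper names `sha256_hex`, `draft_key`, `marker_path`, and `verdict_prefix` are hypothetical, the `sha256sum` fallback is an assumption for hosts without `shasum`, and `tr` stands in for the `${EXTERNAL_COMMS_EVALUATOR_ID^^}` expansion, which needs Bash 4 or later.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of the package.

# Hash stdin with SHA-256 and print the 64-character hex digest.
# Assumption: fall back to sha256sum where shasum is not installed.
sha256_hex() {
  if command -v shasum >/dev/null 2>&1; then
    shasum -a 256 | cut -d' ' -f1
  else
    sha256sum | cut -d' ' -f1
  fi
}

# Key = sha256(draft + '\n' + surface), the same input the gate hook hashes.
draft_key() {
  printf '%s\n%s' "$1" "$2" | sha256_hex
}

# Marker path for one evaluator: $1 = session dir, $2 = evaluator id, $3 = key.
# Each evaluator gets its own file per draft, so one review cannot satisfy another.
marker_path() {
  printf '%s/external-comms-%s-reviewed-%s' "$1" "$2" "$3"
}

# Default verdict prefix: EXTERNAL_COMMS_<ID in upper case>.
verdict_prefix() {
  printf 'EXTERNAL_COMMS_%s' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')"
}

KEY=$(draft_key "we observed a build failure on Node 20" "gh-issue-create")
marker_path "/tmp/claude-risk-demo" "risk" "$KEY"; echo
verdict_prefix "risk"; echo
```

Because the evaluator id is part of the file name, a marker written for `voice-tone` never satisfies the `risk` gate, which matches the behaviour the "legacy combined marker" test further down asserts.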
@@ -204,9 +204,12 @@ if echo "$SUBAGENT" | grep -qE 'risk-scorer.policy'; then
204
204
  fi
205
205
 
206
206
  # ---------------------------------------------------------------------------
207
- # External-comms reviewer (P064 / ADR-028 amended): write per-draft marker
208
- # keyed on sha256(draft + '\n' + surface). Subagent emits the key; this hook
209
- # trusts and uses it. Marker file: external-comms-reviewed-<key>.
207
+ # External-comms reviewer (P064 / ADR-028 amended 2026-05-14): write
208
+ # per-evaluator marker keyed on sha256(draft + '\n' + surface). Subagent
209
+ # emits the key; this hook trusts and uses it. Marker file:
210
+ # external-comms-risk-reviewed-<key>. The voice-tone evaluator (P038)
211
+ # writes its own peer marker external-comms-voice-tone-reviewed-<key>
212
+ # from packages/voice-tone/hooks/external-comms-mark-reviewed.sh.
210
213
  # ---------------------------------------------------------------------------
211
214
  if echo "$SUBAGENT" | grep -qE 'risk-scorer.external-comms'; then
212
215
  VERDICT_LINE=$(echo "$AGENT_OUTPUT" | grep -E '^EXTERNAL_COMMS_RISK_VERDICT:' | tail -1) || true
@@ -216,7 +219,7 @@ if echo "$SUBAGENT" | grep -qE 'risk-scorer.external-comms'; then
216
219
  # Validate key: 64 hex chars (sha256 output). Reject anything else.
217
220
  if echo "$KEY" | grep -qE '^[0-9a-f]{64}$'; then
218
221
  case "$VERDICT" in
219
- PASS) touch "${RDIR}/external-comms-reviewed-${KEY}" ;;
222
+ PASS) touch "${RDIR}/external-comms-risk-reviewed-${KEY}" ;;
220
223
  FAIL) ;; # Do NOT create marker — draft must be revised
221
224
  *) ;; # Unknown verdict — fail closed
222
225
  esac
@@ -113,11 +113,11 @@ run_hook() {
113
113
  [ -z "$output" ]
114
114
  }
115
115
 
116
- @test "marker present for matching draft+surface key allows the call" {
116
+ @test "per-evaluator marker (external-comms-risk-reviewed-<KEY>) allows the call (ADR-028 amended 2026-05-14)" {
117
117
  DRAFT="we observed a build failure on Node 20"
118
118
  SURFACE="gh-issue-create"
119
119
  KEY=$(printf '%s\n%s' "$DRAFT" "$SURFACE" | shasum -a 256 | cut -d' ' -f1)
120
- touch "${RDIR}/external-comms-reviewed-${KEY}"
120
+ touch "${RDIR}/external-comms-risk-reviewed-${KEY}"
121
121
 
122
122
  INPUT=$(build_bash_input "gh issue create --title T --body '$DRAFT'")
123
123
  run_hook "$INPUT"
@@ -125,6 +125,20 @@ run_hook() {
125
125
  [ -z "$output" ]
126
126
  }
127
127
 
128
+ @test "legacy combined marker (external-comms-reviewed-<KEY>) does NOT satisfy the risk gate (P038 per-evaluator scheme)" {
129
+ DRAFT="we observed a build failure on Node 20"
130
+ SURFACE="gh-issue-create"
131
+ KEY=$(printf '%s\n%s' "$DRAFT" "$SURFACE" | shasum -a 256 | cut -d' ' -f1)
132
+ # Pre-amendment combined marker — should NOT satisfy the new per-evaluator gate.
133
+ touch "${RDIR}/external-comms-reviewed-${KEY}"
134
+
135
+ INPUT=$(build_bash_input "gh issue create --title T --body '$DRAFT'")
136
+ run_hook "$INPUT"
137
+ [ "$status" -eq 0 ]
138
+ [[ "$output" == *"deny"* ]]
139
+ [[ "$output" == *"wr-risk-scorer:external-comms"* ]]
140
+ }
141
+
128
142
  @test "RISK-POLICY.md absent yields advisory-only mode (permits)" {
129
143
  rm -f "$TEST_PROJECT_DIR/RISK-POLICY.md"
130
144
  INPUT=$(build_bash_input "gh issue create --title T --body 'we observed a failure'")
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@windyroad/risk-scorer",
3
- "version": "0.8.0",
3
+ "version": "0.9.0-preview.311",
4
4
  "description": "Pipeline risk scoring, commit/push gates, and secret leak detection",
5
5
  "bin": {
6
6
  "windyroad-risk-scorer": "./bin/install.mjs"
@@ -0,0 +1,131 @@
1
+ ---
2
+ name: wr-risk-scorer:assess-inbound-report
3
+ description: On-demand inbound-report risk review. Reviews a third-party submission against this repo's intake (problem-report issue body, Q&A discussion, security-advisory body) for Request-risk (info-extraction / backdoor request / malicious-code injection) and Fix-risk (privilege escalation / removal of load-bearing safety check / adopter-attack-surface expansion) per RISK-POLICY.md. Delegates to wr-risk-scorer:inbound-report and emits the structured verdict consumed by ADR-062's assessment-pipeline branch routing.
4
+ allowed-tools: Read, Glob, Grep, Bash, AskUserQuestion, Skill
5
+ ---
6
+
7
+ # Inbound-Report Risk Assessment Skill
8
+
9
+ Run a Request-risk + Fix-risk review on demand against a single inbound report — outside the `/wr-itil:review-problems` Step 8.5 assessment-pipeline trigger. Maintainer-facing pre-flight surface per JTBD-005 + JTBD-202; the assessment-pipeline itself invokes the same `wr-risk-scorer:inbound-report` subagent in-loop per ADR-062 § Decision Outcome step 3.
10
+
11
+ This skill is **read-only**. It does not commit, post comments upstream, or modify the inbound report. The marker (when the skill is invoked as a pre-satisfier for the pipeline's per-report gate) is written automatically by the `PostToolUse:Agent` hook (`risk-score-mark.sh`) after the subagent completes — the skill never writes to `${TMPDIR:-/tmp}/claude-risk-*` directly.
12
+
13
+ ## When to use
14
+
15
+ - Before running `/wr-itil:review-problems` when a specific inbound report stands out as ambiguous (e.g. a discussion that mixes a legitimate feature request with a question that smells like info-extraction) — pre-flight the classification.
16
+ - After spotting a suspicious submission via `gh issue list` and wanting a second-pass review before the pipeline runs.
17
+ - During a retro on a misclassified prior report (Reassessment Criterion 1 in ADR-062 — false-positive rate exceeds ~10%) — replay the body through the subagent to surface why the prior verdict landed.
18
+ - As part of a P123 block-list eligibility review (a clear-malicious verdict here is the evidence chain the block-list scaffolding consumes when P123 lands).
19
+
20
+ ## Steps
21
+
22
+ ### 1. Parse arguments
23
+
24
+ Read `$ARGUMENTS` for any of:
25
+
26
+ - A report body verbatim (e.g. the user pastes the issue body).
27
+ - A `gh issue URL` or `<repo>#<issue-number>` reference. The skill fetches the body via `gh issue view --json body,author,title,labels`.
28
+ - A surface hint (`github-issues`, `github-discussions`, `github-security-advisories`).
29
+ - A submitter handle (`@user` or `user`).
30
+ - A JTBD-alignment hint from the assessment-pipeline (`aligned-with-existing-JTBD` / `aligned-with-new-JTBD-for-existing-persona` / `not-aligned`). Optional when invoked manually; required when invoked as a pipeline pre-satisfier.
31
+
32
+ If both body and surface are present, proceed to step 3. If either is missing, go to step 2.
33
+
34
+ ### 2. Resolve missing context
35
+
36
+ If the body is missing AND a `gh issue URL` / `<repo>#<issue-number>` reference was supplied, fetch:
37
+
38
+ ```bash
39
+ gh issue view "$ref" --json body,author,title,labels --jq '.'
40
+ ```
41
+
42
+ Cache the JSON for downstream steps. Fail soft on GitHub API errors: surface the error to the user and fall back to `AskUserQuestion`.
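One way to turn either reference form into arguments for `gh issue view` is sketched below. The helper `build_gh_view_args` is hypothetical and not part of the skill; it assumes the `<repo>` half of `<repo>#<issue-number>` is an `OWNER/REPO` slug suitable for gh's `--repo` flag.

```bash
# Sketch only: build_gh_view_args is a hypothetical helper.
# Prints the arguments to pass to `gh issue view`, one per line.
build_gh_view_args() (
  ref=$1
  case "$ref" in
    http://*|https://*)
      # gh accepts an issue URL directly.
      printf '%s\n' "$ref"
      ;;
    *'#'*)
      # <repo>#<issue-number>: split on the last '#'.
      repo=${ref%'#'*}
      number=${ref##*'#'}
      printf '%s\n' "$number" --repo "$repo"
      ;;
    *)
      # Bare issue number in the current repository.
      printf '%s\n' "$ref"
      ;;
  esac
)

build_gh_view_args 'owner/repo#42'
```

The printed arguments would precede `--json body,author,title,labels` in the fetch command shown above; the function itself never calls the GitHub API, so it can be exercised offline.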
43
+
44
+ If the body is still missing, use `AskUserQuestion`:
45
+
46
+ > "What report do you want me to review? Paste the body verbatim, or give me a `gh issue URL`."
47
+
48
+ If the surface is missing AND cannot be inferred (from the URL pattern or context), use `AskUserQuestion`:
49
+
50
+ - header: "Inbound surface"
51
+ - options:
52
+ 1. `github-issues` (problem-report.yml or similar labelled issue)
53
+ 2. `github-discussions` (Q&A category)
54
+ 3. `github-security-advisories` (private vendor channel)
55
+
56
+ Do not ask if the surface is obvious from the URL / context.
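Inferring the surface from a URL could look roughly like this. The helper `infer_surface` is hypothetical, and the patterns assume reports hosted on github.com with the usual `/issues/`, `/discussions/`, and `/security/advisories/` path segments; anything else returns non-zero so the caller can fall back to `AskUserQuestion`.

```bash
# Sketch only: infer_surface is a hypothetical helper.
# Prints a canonical surface string, or returns 1 when nothing can be inferred.
infer_surface() {
  case "$1" in
    https://github.com/*/*/issues/*)              echo "github-issues" ;;
    https://github.com/*/*/discussions/*)         echo "github-discussions" ;;
    https://github.com/*/*/security/advisories/*) echo "github-security-advisories" ;;
    *) return 1 ;;
  esac
}

infer_surface "https://github.com/owner/repo/discussions/3"
```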
57
+
58
+ ### 3. Construct the review prompt
59
+
60
+ Build a self-contained prompt for the `wr-risk-scorer:inbound-report` subagent that includes:
61
+
62
+ - The **report body** verbatim (between explicit `<report>...</report>` markers so the agent's substring extraction is unambiguous).
63
+ - The **surface** (one of the canonical strings above).
64
+ - The **submitter handle** when known.
65
+ - The **JTBD-alignment hint** when known (composes with the agent's two-axis judgement).
66
+ - A reminder to compute `INBOUND_REPORT_KEY = sha256(body + '\n' + surface + '\n' + submitter)`.
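The prompt assembly and key formula listed above can be sketched as follows. The function names and the `Surface:` / `Submitter:` / `JTBD alignment:` labels are hypothetical; only the `<report>...</report>` markers and the key formula come from the skill text. Treating an unknown submitter as the empty string is also an assumption, as is the `sha256sum` fallback.

```bash
# Sketch only: function names and field labels are hypothetical.

# Hash stdin with SHA-256; assumption: fall back to sha256sum without shasum.
sha256_hex() {
  if command -v shasum >/dev/null 2>&1; then
    shasum -a 256 | cut -d' ' -f1
  else
    sha256sum | cut -d' ' -f1
  fi
}

# INBOUND_REPORT_KEY = sha256(body + '\n' + surface + '\n' + submitter)
inbound_report_key() {
  printf '%s\n%s\n%s' "$1" "$2" "$3" | sha256_hex
}

# $1 = body, $2 = surface, $3 = submitter (may be empty), $4 = JTBD hint (may be empty)
build_review_prompt() {
  printf '<report>\n%s\n</report>\n' "$1"
  printf 'Surface: %s\n' "$2"
  if [ -n "$3" ]; then printf 'Submitter: %s\n' "$3"; fi
  if [ -n "$4" ]; then printf 'JTBD alignment: %s\n' "$4"; fi
  printf '%s\n' "Compute INBOUND_REPORT_KEY = sha256(body + '\\n' + surface + '\\n' + submitter)."
}

build_review_prompt "Build fails on Node 20" "github-issues" "alice" "not-aligned"
inbound_report_key "Build fails on Node 20" "github-issues" "alice"
```

Including the submitter in the hash means the same body filed by two different accounts yields two distinct keys, so each report is reviewed on its own.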
67
+
68
+ ### 4. Delegate to wr-risk-scorer:inbound-report
69
+
70
+ Invoke the subagent via the Skill / Agent tool with `subagent_type: wr-risk-scorer:inbound-report` and the constructed review prompt.
71
+
72
+ Wait for the subagent to complete. The subagent will output a structured verdict block (`INBOUND_REPORT_VERDICT: PASS|FAIL` + `INBOUND_REPORT_KEY: <sha>` + `INBOUND_REPORT_CLASS: <class>` + optional `INBOUND_REPORT_REASON: ...`). The `PostToolUse:Agent` hook (`risk-score-mark.sh`) reads that output and writes the per-report marker automatically.
73
+
74
+ **Do not write to `${TMPDIR:-/tmp}/claude-risk-*` yourself.** The hook is the only correct mechanism.
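For readers who want to see what consuming the verdict block involves, here is a hedged sketch of a parser. `parse_inbound_verdict` is hypothetical and is not the shipped `risk-score-mark.sh` logic; it assumes each field sits on its own line, mirrors the external-comms hook's habit of taking the last matching line and validating the key as 64 hex characters, and deliberately stops short of writing any marker.

```bash
# Sketch only: parse_inbound_verdict is a hypothetical helper.
# Reads subagent output on stdin; prints "VERDICT KEY CLASS" and returns 0 only
# when every field is well-formed. Anything else fails closed (non-zero).
parse_inbound_verdict() (
  output=$(cat)
  # Last line starting with "<NAME>:", with the label and leading spaces removed.
  field() {
    printf '%s\n' "$output" | grep -E "^$1:" | tail -1 | sed -E "s/^$1:[[:space:]]*//"
  }
  verdict=$(field INBOUND_REPORT_VERDICT)
  key=$(field INBOUND_REPORT_KEY)
  class=$(field INBOUND_REPORT_CLASS)

  case "$verdict" in PASS|FAIL) ;; *) exit 1 ;; esac
  printf '%s' "$key" | grep -qE '^[0-9a-f]{64}$' || exit 1
  case "$class" in
    safe-low-fix-risk|safe-high-fix-risk|above-threshold-risk|clear-malicious-request) ;;
    *) exit 1 ;;
  esac
  printf '%s %s %s\n' "$verdict" "$key" "$class"
)
```

Rejecting malformed keys and unknown classes keeps the consumer fail-closed, in the same spirit as the `*) ;; # Unknown verdict` arm in the external-comms hook.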
75
+
76
+ ### 5. Present results
77
+
78
+ Present the full review report to the user. Highlight:
79
+
80
+ - The verdict (PASS / FAIL).
81
+ - The classification (`safe-low-fix-risk` / `safe-high-fix-risk` / `above-threshold-risk` / `clear-malicious-request`).
82
+ - The matched RISK-POLICY.md class + axis (Request-risk / Fix-risk) when FAIL.
83
+ - The exact substrings or metadata signals that triggered each finding when FAIL.
84
+ - The pipeline branch this report would route to under ADR-062 § Decision Outcome (pushback / clear-malicious-close-with-verdict / safe-and-valid-local-ticket-create).
85
+ - For `safe-high-fix-risk`: the fix-risk class the maintainer should weigh before accepting the local ticket (the pipeline creates the ticket but flags it for maintainer attention).
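The class-to-branch routing mentioned in the list above might be sketched as a single `case` statement. `route_branch` is hypothetical, and the authoritative mapping lives in ADR-062, which is not part of this diff. In particular, pairing `above-threshold-risk` with `pushback` is an inference from the three branch names, not something the skill text states.

```bash
# Sketch only: route_branch is hypothetical; ADR-062 defines the real routing.
route_branch() {
  case "$1" in
    safe-low-fix-risk)
      echo "safe-and-valid-local-ticket-create" ;;
    safe-high-fix-risk)
      # Ticket is still created, but flagged for maintainer attention.
      echo "safe-and-valid-local-ticket-create (flag for maintainer attention)" ;;
    above-threshold-risk)
      echo "pushback" ;;  # assumption: inferred pairing
    clear-malicious-request)
      echo "clear-malicious-close-with-verdict" ;;
    *)
      echo "unknown class: $1" >&2
      return 1 ;;
  esac
}

route_branch "safe-high-fix-risk"
```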
86
+
87
+ ### 6. Above-appetite handling (ADR-013 Rule 6 + ADR-062 mechanical-stage carve-out)
88
+
89
+ The branch decision itself is **mechanical** per ADR-062 § Mechanical-stage carve-out (P132). When invoked as a pipeline pre-satisfier, this skill does NOT use `AskUserQuestion` to ask the maintainer "which branch?" — the verdict + class determine the branch deterministically. The maintainer's role is to accept or override the verdict via re-running with corrections, not to pick the branch.
90
+
91
+ When invoked manually as an on-demand pre-flight (NOT as a pipeline pre-satisfier), surface a single `AskUserQuestion` for what the maintainer wants to do next:
92
+
93
+ - header: "Next step"
94
+ - options:
95
+ 1. `Accept verdict + run pipeline` — the maintainer agrees with the classification; `/wr-itil:review-problems` will route accordingly on the next invocation.
96
+ 2. `Override + re-review with extra context` — the maintainer disagrees; pass extra context (e.g. "this submitter is a known good-faith contributor in `<other-repo>`") and re-invoke from step 3.
97
+ 3. `Block reporter (P123 scaffolding)` — surface the audit-log entry for P123 block-list enforcement when that ticket lands. Until then, this option appends to `docs/audits/inbound-discovery-log.md` only.
98
+ 4. `Cancel` — abandon the pre-flight; report intact for later review.
99
+
100
+ When invoked as a pipeline pre-satisfier (via the `/wr-itil:review-problems` Step 8.5 orchestrator), the skill is silent on this step per the mechanical-stage carve-out.
101
+
102
+ ## Composition with the assessment-pipeline
103
+
104
+ This skill and the assessment-pipeline (ADR-062 § Decision Outcome) invoke the same `wr-risk-scorer:inbound-report` subagent. The skill is the maintainer-facing manual surface; the pipeline is the automated bulk-processing surface. Verdict shape is identical across both invocation paths (same `INBOUND_REPORT_VERDICT` + `INBOUND_REPORT_KEY` + `INBOUND_REPORT_CLASS` block); the consuming infrastructure (per-report marker, audit-log append, branch routing) is the same.
105
+
106
+ | Concern | This skill (on-demand) | `/wr-itil:review-problems` Step 8.5 (pipeline) |
107
+ |---------|------------------------|-----------------------------------------------|
108
+ | Invocation | Manual / pre-flight (JTBD-005, JTBD-202) | Automatic, in-loop with channel-config polling |
109
+ | Cardinality | One report per invocation | N reports per pass (channel-config drives N) |
110
+ | Branch decision | Per ADR-062 § Decision Outcome; mechanical | Same |
111
+ | Audit-log append | Yes (via PostToolUse hook) | Yes (via PostToolUse hook) |
112
+ | README rankings impact | None (skill is read-only) | Refreshes `## Inbound Upstream Reports` section in `docs/problems/README.md` Step 9e |
113
+ | AskUserQuestion authority | step 6 above (manual only) | None (mechanical-stage carve-out per P132) |
114
+
115
+ ## ADR cross-references
116
+
117
+ - **ADR-062** (Inbound upstream-report discovery + assessment pipeline) — § Sibling subagent + § Mechanical-stage carve-out.
118
+ - **ADR-015** (On-demand assessment skills) — § Scope table extended with the `assess-inbound-report` row; § Naming Convention `assess-<artifact>` pattern; § Gate Marker Interaction (no skill-side marker writes).
119
+ - **ADR-009** (Gate marker lifecycle) — per-report marker TTL + drift discipline; same as the existing `external-comms-gate` marker.
120
+ - **ADR-013 Rule 1** + Rule 6 — `AskUserQuestion` only at maintainer-direction branches; mechanical-stage carve-out applies to pipeline invocations.
121
+ - **ADR-014** — assessment skills are read-only and exempt from commit obligation.
122
+ - **ADR-028** (External-comms gate, amended) — the pushback / clear-malicious-verdict comments the assessment-pipeline posts after this skill's FAIL verdict ride the P064 + P038 evaluator halves.
123
+ - **ADR-029** (Diagnose before implement) — verdict follows hypothesis / evidence / structured-verdict discipline.
124
+ - **ADR-044** — decision-delegation contract; mechanical-stage carve-out is the category-4 framework-resolution boundary.
125
+ - **P079** — parent ticket; this skill is Slice B per RFC-004.
126
+ - **P123** — blocked-user-list mechanism; composes with the `Block reporter` option in step 6.
127
+ - **JTBD-005** (Invoke Governance Assessments On Demand) — primary persona driver.
128
+ - **JTBD-202** (Pre-Flight Governance Checks) — secondary persona driver.
129
+ - **JTBD-001** (Enforce Governance Without Slowing Down) — mechanical-stage carve-out preserves "without slowing down".
130
+
131
+ $ARGUMENTS
@@ -0,0 +1,132 @@
1
+ #!/usr/bin/env bats
2
+ # Contract assertions for /wr-risk-scorer:assess-inbound-report skill
3
+ # (RFC-004 Slice B — on-demand wrapper per ADR-015). Peer of
4
+ # /wr-risk-scorer:assess-external-comms.
5
+ #
6
+ # Structural assertions — Permitted Exception to the source-grep ban
7
+ # per ADR-005 / P011 / ADR-037 / ADR-052 § Surface 2. SKILL.md prose
8
+ # governs LLM-driven runtime behaviour; behavioural-replay testing
9
+ # requires a synthetic agent harness (P012 / P176). Until that harness
10
+ # lands, contract bats assert the load-bearing contract elements are
11
+ # present so future edits don't silently strip them.
12
+ #
13
+ # @problem P079
14
+ # @rfc RFC-004 (Slice B)
15
+ # @adr ADR-062 (sibling subagent + on-demand wrapper)
16
+ # @adr ADR-015 (on-demand assessment skills — § Scope table extended)
17
+ # @adr ADR-044 (decision-delegation — taste / silent-mechanical authority)
18
+ # @jtbd JTBD-005 (invoke governance assessments on demand)
19
+ # @jtbd JTBD-202 (pre-flight governance checks before release/handover)
20
+ # @jtbd JTBD-001 (mechanical-stage carve-out on pipeline pre-satisfier path)
21
+
22
+ setup() {
23
+ SKILL_DIR="$(cd "$(dirname "$BATS_TEST_FILENAME")/.." && pwd)"
24
+ SKILL_FILE="${SKILL_DIR}/SKILL.md"
25
+ ADR_015="$(cd "${SKILL_DIR}/../../../.." && pwd)/docs/decisions/015-on-demand-assessment-skills.proposed.md"
26
+ }
27
+
28
+ @test "SKILL.md exists and has frontmatter" {
29
+ [ -f "$SKILL_FILE" ]
30
+ run head -1 "$SKILL_FILE"
31
+ [ "$status" -eq 0 ]
32
+ [ "$output" = "---" ]
33
+ }
34
+
35
+ @test "frontmatter name is wr-risk-scorer:assess-inbound-report" {
36
+ run grep -nE '^name: wr-risk-scorer:assess-inbound-report$' "$SKILL_FILE"
37
+ [ "$status" -eq 0 ]
38
+ }
39
+
40
+ @test "frontmatter allowed-tools includes Skill (delegates to subagent)" {
41
+ # ADR-015 § Gate Marker Interaction: on-demand skills MUST delegate
42
+ # via Skill tool; never write markers directly.
43
+ run grep -nE '^allowed-tools:.*Skill' "$SKILL_FILE"
44
+ [ "$status" -eq 0 ]
45
+ }
46
+
47
+ @test "frontmatter allowed-tools includes AskUserQuestion (manual-mode step 6)" {
48
+ # Step 6 (manual invocation only — silent on pipeline pre-satisfier
49
+ # invocations per P132) uses AskUserQuestion to surface next-step
50
+ # options.
51
+ run grep -nE '^allowed-tools:.*AskUserQuestion' "$SKILL_FILE"
52
+ [ "$status" -eq 0 ]
53
+ }
54
+
55
+ @test "frontmatter allowed-tools includes Bash (gh issue fetch in step 2)" {
56
+ # Step 2 can call `gh issue view --json body,author,title,labels` to fetch
57
+ # the report body when only a URL/ref is supplied.
58
+ run grep -nE '^allowed-tools:.*Bash' "$SKILL_FILE"
59
+ [ "$status" -eq 0 ]
60
+ }
61
+
62
+ # ──────────────────────────────────────────────────────────────────────────────
63
+ # Delegation to the sibling subagent (NOT marker self-writes)
64
+ # ──────────────────────────────────────────────────────────────────────────────
65
+
66
+ @test "skill delegates to wr-risk-scorer:inbound-report subagent" {
67
+ run grep -nE 'wr-risk-scorer:inbound-report' "$SKILL_FILE"
68
+ [ "$status" -eq 0 ]
69
+ }
70
+
71
+ @test "skill MUST NOT write to /tmp/ markers directly (ADR-009 + ADR-015 boundary)" {
72
+ # PostToolUse:Agent hook (risk-score-mark.sh) owns marker writes per
73
+ # ADR-009 + ADR-015 § Gate Marker Interaction.
74
+ run grep -inE 'NOT write.*/tmp|PostToolUse hook' "$SKILL_FILE"
75
+ [ "$status" -eq 0 ]
76
+ }
77
+
78
+ # ──────────────────────────────────────────────────────────────────────────────
79
+ # Mechanical-stage carve-out: pipeline pre-satisfier path is silent
80
+ # ──────────────────────────────────────────────────────────────────────────────
81
+
82
+ @test "skill names the mechanical-stage carve-out (P132) for pipeline pre-satisfier path" {
83
+ run grep -inE 'P132|mechanical-stage carve-out' "$SKILL_FILE"
84
+ [ "$status" -eq 0 ]
85
+ }
86
+
87
+ @test "step 6 AskUserQuestion fires ONLY on manual invocation (not pipeline pre-satisfier)" {
88
+ # The carve-out is the load-bearing protection for JTBD-001 + JTBD-006
89
+ # against inverse-P078 drift. The pipeline pre-satisfier path MUST be
90
+ # silent on this step. Match the contract in either direction:
91
+ # manual-only firing OR pipeline-pre-satisfier silent-on-step.
92
+ run grep -inE 'invoked manually.*pre-flight|manual only|silent on this step|silent on.*pipeline pre-satisfier' "$SKILL_FILE"
93
+ [ "$status" -eq 0 ]
94
+ }
95
+
96
+ # ──────────────────────────────────────────────────────────────────────────────
97
+ # Persona anchors (JTBD-005 + JTBD-202)
98
+ # ──────────────────────────────────────────────────────────────────────────────
99
+
100
+ @test "skill cites JTBD-005 (invoke on demand) as primary persona driver" {
101
+ run grep -nE 'JTBD-005' "$SKILL_FILE"
102
+ [ "$status" -eq 0 ]
103
+ }
104
+
105
+ @test "skill cites JTBD-202 (pre-flight governance checks) as secondary persona driver" {
106
+ run grep -nE 'JTBD-202' "$SKILL_FILE"
107
+ [ "$status" -eq 0 ]
108
+ }
109
+
110
+ # ──────────────────────────────────────────────────────────────────────────────
111
+ # ADR-015 Scope table row exists for assess-inbound-report
112
+ # ──────────────────────────────────────────────────────────────────────────────
113
+
114
+ @test "ADR-015 Scope table includes the assess-inbound-report row" {
115
+ [ -f "$ADR_015" ]
116
+ run grep -nE '`assess-inbound-report`' "$ADR_015"
117
+ [ "$status" -eq 0 ]
118
+ run grep -nE '`wr-risk-scorer:inbound-report`' "$ADR_015"
119
+ [ "$status" -eq 0 ]
120
+ }
121
+
122
+ @test "ADR-015 Confirmation checkbox covers assess-inbound-report skill" {
123
+ run grep -nE '\[x\] `packages/risk-scorer/skills/assess-inbound-report/SKILL\.md` created' "$ADR_015"
124
+ [ "$status" -eq 0 ]
125
+ }
126
+
127
+ @test "ADR-015 Related section names ADR-062 + P079 (driver references)" {
128
+ run grep -nE 'ADR-062.*inbound|inbound.*ADR-062' "$ADR_015"
129
+ [ "$status" -eq 0 ]
130
+ run grep -nE 'P079' "$ADR_015"
131
+ [ "$status" -eq 0 ]
132
+ }