npm - unbound-cli - Versions diffs - 1.3.2 → 1.4.0 - Mend

unbound-cli 1.3.2 → 1.4.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (13) hide show

package/PLAN-web-4887.md +477 -0
package/PLAN.md +117 -0
package/README.md +8 -2
package/package.json +1 -1
package/src/commands/policy.js +183 -10
package/src/commands/setup.js +4 -2
package/src/lib/policy-ai-assist.js +503 -0
package/test/eval/README.md +45 -0
package/test/eval/policy-prompts.json +122 -0
package/test/eval/run-eval.js +57 -0
package/test/policy-ai-assist-mcp.test.js +606 -0
package/test/policy-ai-assist-preflight.test.js +66 -0
package/test/policy-ai-assist.test.js +884 -0

package/PLAN-web-4887.md ADDED Viewed

@@ -0,0 +1,477 @@
+# WEB-4887 — AI-Assisted Tool Policy Creation in `unbound` CLI
+**Linear:** https://linear.app/unboundsec/issue/WEB-4887
+**Status:** Spec / implementation plan
+**Owner:** Dinesh Veluswamy
+**Last updated:** 2026-06-18
+---
+## 1. Background
+The web UI's tool-policy creation flow has an AI-assist input that hits a server-side endpoint tuned with our policy schema and is materially better than Claude hand-authoring policy JSON. Today, when a Claude Code user asks `unbound` to create a tool policy, Claude reaches for the flag-based `unbound policy tool create-terminal` / `create-mcp` commands and constructs the arguments itself. That works but underperforms the in-product AI-assist for the same reason the ticket describes — Claude does not have our system prompt, our command-family inventory, or our MCP catalog in context.
+This plan exposes the existing AI-assist endpoints to the CLI and ships a Claude Code skill that steers Claude to use them.
+Endpoints already in production (see `ai-gateway-data/webapp/api/v1/command_policy_handlers.py`):
+| Policy kind | Endpoint | Model | Backend fallback if LLM fails |
+|---|---|---|---|
+| Terminal command | `POST /api/v1/command-policies/assist/` | Cerebras → Groq via ai-gateway classifier | Deterministic family/field extraction |
+| MCP tool | `POST /api/v1/command-policies/assist-mcp/` | Claude Haiku 4.5 (`temp=0.1`, `max_tokens=900`) | None — returns `success: false` |
+Both are admin-only (`@require_admin`), enforce a hard 2000-character input cap (HTTP 200 + `success: false`), and collapse runs of newlines/strip non-printable control chars as prompt-injection defense.
+---
+## 2. Goals
+1. Add an AI-assist code path to `unbound policy tool create-terminal` and `create-mcp` so Claude (and humans) can create policies from a natural-language description.
+2. Ship a Claude Code skill bundled with the CLI install so Claude reliably picks the AI-assist path over hand-authored arguments.
+3. Detect and clearly reject prompts that ask for fields the AI-assist endpoint cannot fill (env scoping, exception clauses, user-group conditions, etc.).
+4. Surface backend errors — especially the 2000-char rejection and the LLM-could-not-determine cases — with actionable CLI messages.
+5. Provide an eval harness that measures the acceptance criterion: "Claude Code should always pick our AI-assist endpoint over creating policies on its own."
+## 3. Non-goals
+- **LLM quality improvements.** The Cerebras terminal classifier and Haiku MCP classifier are out of scope. The user has explicitly flagged this as non-functional for this ticket. We will work around flakiness via UX, not by changing the model.
+- **Cost / Model / Security policy types.** Only tool policies have AI-assist endpoints today. Adding AI-assist to the other three is a separate ticket.
+- **Multi-turn refinement loops** with the user inside the CLI. The endpoints are one-shot; we keep that contract.
+- **Authoring new backend fields.** The CLI must work strictly within the existing form schema and AI-assist response schema.
+---
+## 4. Field whitelist — what AI-assist actually fills
+This is the boundary the skill must teach Claude to respect. Fields outside this set must be passed via existing flags, or the prompt must be rejected with a clear message.
+### 4.1 Terminal command AI-assist (`assist/`)
+**Fills** (from `form_updates` in the response — `command_policy_handlers.py:1340-1393`):
+| Field | Type | Notes |
+|---|---|---|
+| `command_family` | string | Must match a known family (e.g. `filesystem`, `git`, `package_manager`). |
+| `selected_field` | string | One of the family's fields, or the literal `ANY` wildcard. |
+| `field_value` | string | Pattern — auto-detected as EXACT / GLOB / REGEX. `*` for "match everything". |
+| `action` | enum | `audit` \| `block` \| `warn` \| `require_slack_approval`. Omitted if user didn't specify. |
+| `name` | string ≤ 200 | Suggested policy name. Optional. |
+| `description` | string ≤ 500 | Suggested description. Optional. |
+**Does NOT fill** (form fields the user / Claude must set via flags or accept defaults):
+| Form field | Source | CLI handling |
+|---|---|---|
+| `enabled` | `useCreatePolicyForm.ts:21` | Default `true`. Expose `--disabled` flag. |
+| `custom_message` | `useCreatePolicyForm.ts:18` | Only relevant for `BLOCK`/`WARN`. Expose `--custom-message` flag; do not parse from prompt. |
+| `scope_user_group_ids` | `useCreatePolicyForm.ts:39` | Expose `--group` / `--scope-group` flag; do not parse group names from prompt. |
+| `suggestion_id` | UI-only, coverage-gap link | Not relevant to CLI. |
+### 4.2 MCP tool AI-assist (`assist-mcp/`)
+**Fills** (from `mcp_policy_assist_service.py:36-86`):
+| Field | Type | Notes |
+|---|---|---|
+| `mcp_canonical_group_id` | int | Must match an entry in the org's MCP catalog. |
+| `mcp_tools` | string[] | Tool names, post-validated against the group's actual tool list — invented names are dropped. |
+| `name` | string ≤ 80 | Short policy name. |
+| `description` | string | One sentence. |
+| `action` | enum | `AUDIT` \| `BLOCK` \| `WARN` \| `REQUIRE_SLACK_APPROVAL`. Defaults to `AUDIT` if not in prompt. |
+| `custom_message` | string | Only when action is `BLOCK`/`WARN`. |
+**Does NOT fill** (same boundary as terminal):
+- `enabled` — default true, `--disabled` flag.
+- `scope_user_group_ids` — `--group` flag.
+- `suggested_tools` — UI-only, captured from a separate `/resolve-tools/` call.
+### 4.3 Out-of-scope constructs (the "env, archived projects, nuance" cases)
+These are things users will naturally try to express. None of them are representable in the current form schema, and the LLM will either silently drop them or hallucinate. The CLI must detect them in the prompt and warn before sending.
+| Construct | Example phrasing | Why it can't work | CLI response |
+|---|---|---|---|
+| Environment scope | "in staging", "on prod", "only in dev" | No env field on policies. | Warn + ask user to remove or proceed knowing it's ignored. |
+| Repo / project filter | "for the web repo", "except archived projects" | No project field. | Warn. |
+| User-role conditions | "for non-admins", "when the user is on call" | Closest is `scope_user_group_ids`, settable via `--group` only. | Suggest `--group <name>`. |
+| Time conditions | "after 6pm", "outside business hours" | Not representable. | Reject. |
+| Exception clauses | "block all writes EXCEPT to docs" | Single-policy schema is allow-or-deny on one pattern. | Suggest creating two policies. |
+| Compound asks | "block X and audit Y" | One policy per call. | Suggest splitting into N invocations. |
+---
+## 5. Architecture decision
+### 5.1 Add `--prompt` to existing subcommands rather than introduce parallel `-ai` variants
+The existing CLI surface is:
+```
+unbound policy tool create-terminal --name "..." --command-family ... --field K=V --action ... [flags]
+unbound policy tool create-mcp      --name "..." --mcp-server ... --mcp-action-type ... --action ... [flags]
+```
+We add a single `--prompt "<natural language>"` flag to each. When `--prompt` is present:
+1. The CLI calls the corresponding AI-assist endpoint.
+2. The response's `form_updates` are merged with any flags the user also passed (flags take precedence — they're explicit overrides).
+3. The CLI shows the resolved policy to the user (or to Claude, who will surface it to the user) and asks for confirmation before calling the real create endpoint.
+Rejected alternatives:
+- **Parallel `create-terminal-ai` / `create-mcp-ai` subcommands.** Doubles the command surface; gives Claude two near-identical commands to pick between; complicates the skill.
+- **Unified `unbound policy tool assist` that infers terminal vs MCP.** Inference itself is another LLM call. Cleaner UX, but adds a third moving part. Defer to a later ticket.
+### 5.2 Two-phase delivery
+| Phase | Scope | Why this order |
+|---|---|---|
+| **Phase 1** | `create-terminal --prompt` + skill v1 (terminal only) | Terminal endpoint has a deterministic fallback when the LLM fails. Lower-risk path to validate the surface. |
+| **Phase 2** | `create-mcp --prompt` + skill v2 (covers both) | MCP endpoint has no fallback. Ship after Phase 1 has shaken out the error-handling patterns. |
+In Phase 1, the skill explicitly tells Claude: "For MCP policies, use the existing flag-based `create-mcp` — AI-assist for MCP is coming in Phase 2."
+---
+## 6. Phase 1 — Terminal command AI-assist
+### 6.1 CLI surface
+```bash
+unbound policy tool create-terminal \
+  --prompt "block rm -rf in the filesystem family" \
+  [--group <name>] \
+  [--disabled] \
+  [--custom-message "..."] \
+  [--yes] \
+  [--json]
+```
+Behavior:
+- `--prompt` is mutually exclusive with the existing field-specifying flags (`--command-family`, `--field`, `--action`, `--name`, `--description`). If both are passed, error out: "Pass `--prompt` for AI-assist or the field flags for explicit creation, not both."
+- `--group`, `--disabled`, `--custom-message` remain valid alongside `--prompt` because they're out-of-AI-scope (see §4.1).
+- `--yes` skips the confirmation prompt (needed for Claude's non-interactive flows).
+- `--json` emits the resolved policy and the create response as JSON, no TTY formatting.
+### 6.2 Pre-flight checks (CLI side, before any network call)
+In order, fail fast on the first hit:
+1. **Auth present.** Refuse if `unbound login` has not been run.
+2. **Admin role.** Hit `whoami` cache or pre-flight `/me`; if not admin, error: "Tool policy creation requires admin role; current role: <role>."
+3. **Prompt length.** If trimmed length > 1800 characters, reject locally with the same wording the backend uses: "Input is too long (max 2000 characters)." Reject at 1800 to leave headroom for backend normalization.
+4. **Out-of-scope keyword scan.** Lower-cased substring match against this list (kept small and explicit, not a regex puzzle):
+   - `staging`, `production`, `prod`, `dev`, `qa`, `testing`
+   - `archived`, `archive`
+   - `business hour`, `after hour`, `outside hour`, `weekend`
+   - `except`, `unless`, `but not`
+   - `private repo`, `public repo`, `private project`, `public project`
+   On match, warn and confirm: "Your prompt mentions `<token>`. Tool policies cannot scope by environment / project / time / exception clauses. Continue anyway? The endpoint will ignore those parts. [y/N]". Under `--yes`, log the warning but proceed.
+5. **Newline normalization.** Collapse runs of ≥2 newlines to one. (Mirror backend defense — gives us identical sanitization regardless of which side rejects.)
+### 6.3 Network call
+```
+POST {API_BASE}/api/v1/command-policies/assist/
+Authorization: Bearer <token>
+Content-Type: application/json
+{
+  "user_input": "<sanitized prompt>",
+  "current_form_state": {
+    "command_family": "",
+    "selected_field": "",
+    "field_value": "",
+    "action": "",
+    "name": "",
+    "description": ""
+  }
+}
+```
+Timeout: 20 seconds (backend timeout to the gateway is 15s; we add 5s for routing).
+### 6.4 Response handling
+| HTTP | Body | CLI behavior |
+|---|---|---|
+| 200 | `success: true`, `form_updates: {...}`, `explanation` | Merge `form_updates` ← user-passed flags, render preview, confirm. |
+| 200 | `success: false`, `error: "Input is too long..."` | Print verbatim error + suggestion: "Try shortening to under 1800 characters." Exit 2. |
+| 200 | `success: false`, `error: "Could not determine command family..."` | Print verbatim error + suggestion: "Try naming the command type explicitly (e.g. `block git pushes`, `audit npm installs`). Or use `unbound policy tool families` to see available families and pass `--command-family` directly." Exit 2. |
+| 200 | `success: false`, `error: <other>` | Print verbatim error. Exit 2. |
+| 401 / 403 | any | "Authentication failed / not authorized. Tool policies require admin. Run `unbound whoami` to check role." Exit 3. |
+| 400 / 422 | any | "Request validation failed: `<body>`." Exit 2. |
+| 5xx | any | "Server error. Try again, or fall back to flag-based creation: `unbound policy tool create-terminal --command-family ... --field ... --action ...`." Exit 4. |
+| timeout / network | — | "Network error reaching `<host>`. Check connectivity. Falling back to flag-based creation is not blocked." Exit 4. |
+**No automatic retries.** Re-sending the same prompt to the same flaky LLM endpoint won't change the answer in any meaningful way, and silent retries hide latency. If the user explicitly asks Claude to retry, that's a separate invocation.
+### 6.5 Merging AI output with user-passed flags
+Order of precedence (highest first):
+1. Explicit flags on the command line (`--group`, `--custom-message`, `--disabled`).
+2. AI response `form_updates` fields.
+3. Defaults (`enabled: true`, `action: audit` if AI omitted it).
+The merge happens before the confirmation step so the user sees the final shape.
+### 6.6 Confirmation step
+Default behavior — show the resolved policy, ask `Create policy? [Y/n]`. The
+preview reuses the CLI's standard `output.keyValue` rendering (dimmed keys,
+auto-padded columns) for visual consistency with `displayToolPolicy` and the
+other policy-display surfaces.
+Concept-order of rows (rows MAY be skipped when their value is empty — e.g.
+Description and Custom message are dropped rather than printed as `(none)`):
+- Name, Description, Type, Command family, Field, Pattern, Action,
+  Custom message, Scope (groups), Enabled.
+Action is colored — `BLOCK` red, `WARN`/`REQUIRE_SLACK_APPROVAL` yellow,
+`AUDIT` dimmed (low-emphasis default). The AI explanation prints below the
+table in dim text and is omitted entirely when the backend returns an empty
+explanation.
+Exact column widths and labels are not pinned — tests assert on meaningful
+tokens (Name value, Action enum, explanation body), not on column-aligned
+strings.
+Under `--yes`, skip the prompt but always print the resolved policy (Claude will surface it to the user in chat).
+### 6.7 Create call
+On confirmation, call the existing create endpoint exactly as the flag-based path does today (`POST /api/v1/command-policies/`) with the merged payload. No new backend code on the create side.
+### 6.8 Backend change (optional, recommended)
+Add a `source` field to both AI-assist endpoints (`"web"` | `"cli"`, defaulting to `"web"` for backward compat). Lets us measure CLI adoption and the AC pick-rate without parsing user-agent headers. One-line wire change, two-line server change, no behavioral impact.
+### 6.9 Tests
+- **Unit (CLI):**
+  - Prompt length pre-flight at 1799 / 1800 / 1801 chars.
+  - Out-of-scope keyword detection for each token in §6.2.
+  - Flag/prompt mutual exclusion error path.
+  - Merge precedence: flag wins over AI value.
+  - Each HTTP status path in §6.4 routes to the right CLI message.
+- **Integration (against staging):**
+  - 10 representative single-intent prompts → expect `success: true` and a valid policy.
+  - 3 oversize prompts → expect length error, no policy created.
+  - 3 nonsense prompts → expect "could not determine" error.
+- **Eval harness (see §8):**
+  - Pick-rate test: spawn Claude Code with the skill installed, give it 10 natural-language asks ("create a policy that…"), measure whether it invokes `create-terminal --prompt` vs hand-rolls `--command-family ...`.
+---
+## 7. Phase 2 — MCP tool AI-assist — **DELIVERED**
+Phase 2 ships on the same branch (`web-4887-cli-ai-assist-policy`, PR #54) as
+Phase 1. New exports in `src/lib/policy-ai-assist.js`: `runMcpPromptCreate`,
+`mergeAiAndFlagsMcp`, `renderMcpPreview`. The `create-mcp` subcommand now
+accepts `--prompt <text>` as a first-class alternative to the flag-based path.
+### 7.1 CLI surface
+```bash
+unbound policy tool create-mcp \
+  --prompt "audit all Linear writes" \
+  [--group <name>] \
+  [--disabled] \
+  [--custom-message "..."] \
+  [--yes] \
+  [--json]
+```
+Same flag rules as Phase 1.
+### 7.2 Pre-flight additions on top of §6.2
+- **Catalog awareness.** If the prompt names a service that isn't in the org's MCP catalog (fetched via the existing `unbound policy tool mcp-servers` call, cached for the session), warn before sending: "Your prompt mentions `<service>`, which isn't in this org's MCP catalog. The endpoint will likely return an error. Available services: `<list>`. Continue? [y/N]". Optional optimization; skip in v1 if scope creeps.
+### 7.3 Network call
+`POST {API_BASE}/api/v1/command-policies/assist-mcp/` — identical envelope shape to Phase 1, with the MCP-specific `current_form_state` keys.
+### 7.4 Response handling — same matrix as §6.4, with one addition
+| HTTP | Body | CLI behavior |
+|---|---|---|
+| 200 | `success: true`, `mcp_tools: []` (empty array after backend post-validation drops hallucinated names) | Treat as a soft failure: "AI assist could not match any tools in `<group>` for your description. Try naming the tools more directly, or use `unbound policy tool create-mcp --mcp-server ... --mcp-tools tool1,tool2`." Do NOT auto-create a policy with no tools. Exit 2. |
+The MCP endpoint has no backend deterministic fallback, so flakiness surfaces as either `success: false` or `mcp_tools: []`. The CLI handles both as "couldn't figure it out; here's the manual escape hatch."
+Note: the `assist-mcp/` response has no `explanation` field (asymmetric with the
+terminal `assist/` endpoint). `renderMcpPreview` is passed `undefined` for the
+explanation and skips the trailing dim line entirely. The render code is
+forward-compatible: if the backend later adds `explanation`, the existing
+`if (explanation)` guard picks it up with zero CLI change.
+### 7.5 Tests — extend §6.9 with MCP-specific cases
+- Empty `mcp_tools` response routes to the "couldn't match any tools" message.
+- Out-of-catalog service mention warns before sending.
+- Integration: 10 single-service prompts + 3 multi-service prompts (latter should warn or split).
+---
+## 8. Claude Code skill — distribution and content
+The CLI cannot edit Claude's system prompt, but it can drop a skill file. Skills are the closest analog to a CLAUDE.md hook that we can install programmatically.
+### 8.1 Where it lives
+`~/.claude/skills/unbound-tool-policy/SKILL.md` at runtime.
+Source-of-truth lives in the **`setup` repo** at `claude-code/skills/unbound-tool-policy/SKILL.md`, NOT in unbound-cli. The `claude-code/hooks/setup.py` and `claude-code/gateway/setup.py` scripts (which `unbound setup claude-code` and `unbound onboard` download and run) copy it from the setup repo's main branch on every install. This co-locates the skill with the rest of the claude-code per-tool install artifacts (`unbound.py`, `anthropic_key.sh`, etc.) and lets skill content updates ship the moment they merge to setup-repo main — no unbound-cli release required.
+Installed by `unbound setup claude-code` (both subscription and gateway modes) and by `unbound onboard`. Idempotent: overwritten on each install so content updates propagate.
+### 8.2 Phase 1 skill content (terminal only)
+The SKILL.md frontmatter is what Claude reads to decide whether to invoke. Keep the description imperative and specific:
+```markdown
+---
+name: unbound-tool-policy
+description: |
+  Use when the user asks to create a terminal command tool policy in Unbound
+  (e.g. "block rm -rf", "audit git pushes to main"). ALWAYS prefer
+  `unbound policy tool create-terminal --prompt "<NL>"` over hand-authoring
+  the --command-family / --field / --action flags. The CLI calls a
+  server-side AI-assist endpoint tuned on Unbound's policy schema; it
+  outperforms hand-authoring.
+  For MCP tool policies (Linear, GitHub, etc.), AI-assist is not available
+  yet — fall back to flag-based `unbound policy tool create-mcp` for those.
+---
+# Creating an Unbound terminal command policy
+When the user asks for a tool policy targeting terminal commands, invoke:
+    unbound policy tool create-terminal --prompt "<natural language>" [--group <name>] [--yes]
+## Rules for the --prompt string
+1. **One policy per invocation.** If the user wants two things, run the
+   command twice.
+2. **Single-intent, imperative.** Phrase it as "block git pushes to main",
+   "audit npm installs", "warn on rm -rf". Avoid multi-paragraph briefs.
+3. **Stay under 1500 characters.** The endpoint caps at 2000; the CLI
+   pre-trims at 1800. Long prompts get rejected.
+4. **Do NOT include** these — the endpoint cannot represent them and the
+   CLI will warn about them:
+   - environment scope (staging, prod, dev)
+   - project / repo filters
+   - time-based conditions
+   - exception clauses ("except for X")
+   - user-role conditions
+5. **Group scoping** goes on the `--group` flag, not in the prompt.
+6. **Custom block messages** go on `--custom-message`, not in the prompt.
+## When AI-assist fails
+If the CLI returns "Could not determine command family", try one of:
+- Re-phrase the user's intent to name the command type explicitly.
+- Fall back to flag-based: `unbound policy tool families` to list families,
+  then `unbound policy tool create-terminal --command-family ... --field ...`.
+## What success looks like
+The CLI prints a resolved policy preview and asks for confirmation.
+Pass `--yes` to skip the confirm; the preview still prints so the user
+can sanity-check.
+## MCP policies (not this skill, for now)
+For MCP tool policies, use the flag-based form:
+    unbound policy tool create-mcp --name "..." --mcp-server <server> \
+      --mcp-action-type <read|write|destructive> --action AUDIT|BLOCK|WARN
+```
+### 8.3 Phase 2 skill content updates
+When Phase 2 ships, replace the "MCP policies (not this skill, for now)" section with the MCP variant of the same rules, and update the frontmatter description to drop the "MCP not available yet" caveat.
+### 8.4 Belt-and-suspenders: README and `--help`
+- Update `unbound policy tool create-terminal --help` so the imperative language is also in the help output. Claude reads `--help` when uncertain.
+- Update `README.md` "Tool policy examples" section so the first example uses `--prompt`. Manual flag examples remain below as a fallback.
+---
+## 9. Acceptance criterion — testable definition
+The Linear ticket says "Claude code should always pick our AI assist endpoint over creating policies on its own." Rewritten as testable:
+1. **Pick-rate ≥ 90%** on the eval set below.
+2. **Success-rate ≥ 80%** of in-scope eval prompts that reach the endpoint return `success: true`.
+3. **Zero hand-authored YAML / JSON files** in the eval transcript. Claude should never write a policy file to disk and then upload — the CLI is the only path.
+### 9.1 Eval set
+`tests/eval/policy-prompts.json` in this repo. ~20 prompts across:
+- 6 single-intent in-scope ("block rm -rf", "audit git pushes to main", …)
+- 4 detailed in-scope (300–800 chars, single intent, more context)
+- 4 out-of-scope (env, exception, time, compound) — expected outcome: skill steers Claude to warn the user or split
+- 3 oversize (> 2000 chars) — expected outcome: pre-flight rejects
+- 3 nonsense — expected outcome: `success: false`, surfaced
+Eval harness: a script that spawns Claude Code in a clean workspace with the skill installed, feeds each prompt as a user message, and records: which command Claude invoked, whether `--prompt` was used, the endpoint response, the final state. Aggregate into a pick-rate / success-rate report.
+This eval is the artifact that closes the ticket. Without it, "always picks" is unverifiable.
+---
+## 10. Implementation order
+### Phase 1 (1–2 dev-days)
+1. `src/commands/policy.js` — add `--prompt` flag and AI path to `create-terminal`. Branch on `--prompt` presence; preserve all existing flag paths.
+2. New module `src/lib/policy-ai-assist.js` — HTTP client, pre-flight checks, response routing, preview rendering.
+3. Skill file at `skills/unbound-tool-policy/SKILL.md` in the repo; build step copies to `~/.claude/skills/` during `setup claude-code` and `onboard`.
+4. Update `--help` text and `README.md` policy examples.
+5. Unit tests (§6.9).
+6. Eval harness skeleton + the in-scope prompts (§9.1).
+7. Integration sanity check against staging.
+### Phase 2 (1 dev-day, after Phase 1 lands)
+1. Extend `create-mcp` with `--prompt`.
+2. Reuse `src/lib/policy-ai-assist.js`; add MCP-specific pre-flight and response routing.
+3. Update skill file and `--help`.
+4. Extend eval harness with MCP prompts.
+### Backend (optional, parallel to Phase 1)
+Add the `source: "cli" | "web"` field to both AI-assist request handlers and log it. PR in `ai-gateway-data`.
+---
+## 11. Open questions / risks
+1. **Catalog awareness in MCP pre-flight.** The `mcp-servers` list is org-specific and changes. Caching policy: per-session in memory, or persist to `~/.unbound/cache/`? Defer to Phase 2 review.
+2. **Admin-key blast radius.** AI-assist is admin-only. End-user Claude Code sessions running under a non-admin key will get 403s. The skill should detect that on first use and tell Claude to advise the user to ask their admin, instead of looping on retries.
+3. **Eval cost.** Running the eval against real Claude Code instances burns tokens. Cap the eval to one run per CI build, gated on changes to `src/commands/policy.js` or the skill file.
+## 12. Deferred to follow-up tickets
+These are intentionally out of scope for WEB-4887 to keep the surface narrow. Spin off tickets when these become live:
+1. **MCP catalog cache for pre-flight.** Phase 2 pre-flight could check the prompted service against the org's MCP catalog before sending. Cache location (`~/.unbound/cache/`?) and invalidation policy need design. Defer until we see whether the bare endpoint surfaces "no matching service" errors well enough on its own.
+2. **`source: "cli" | "web"` request field on both AI-assist endpoints.** Lets us distinguish CLI usage from web usage in production logs, which is the only way to measure the AC pick-rate at scale without parsing user-agents. Punt to a post-Phase-2 ticket so backend work doesn't gate the CLI rollout.
+3. **AI-assist for Cost / Model / Security policy types.** Today only tool policies have AI-assist endpoints. If pick-rate data shows users want NL creation for the other three, design endpoints and extend the skill.
+~~4. **Per-user skill install for MDM deployments.**~~ — DELIVERED in setup PR #163. Both MDM setup scripts now enumerate device users via `get_all_user_homes()` and install the skill into each user's `~/.claude/skills/` under `_run_as_user(username, _do)` (the existing security-critical privilege-drop primitive). Best-effort per user; one locked-down home does not abort the device rollout. See `claude-code/{hooks,gateway}/mdm/setup.py:setup_tool_policy_skill_for_user` and the README updates in each `mdm/` directory.
+## 13. Skill reinstall policy
+Each `unbound setup claude-code` and `unbound onboard` run **overwrites** `~/.claude/skills/unbound-tool-policy/SKILL.md` with the version fetched from the `setup` repo's main branch at install time. No checksum gating, no diff-and-prompt. Rationale: the skill is a derived artifact of whatever's in the setup repo's main; treating it as an install-time output (not user state) keeps the mental model simple and ensures Claude is always steered with the current rules.
+(The skill source-of-truth was originally bundled in unbound-cli during initial implementation; relocated to the setup repo to follow the existing per-tool-artifact convention. See companion PR https://github.com/websentry-ai/setup/pull/163.)

package/PLAN.md ADDED Viewed

@@ -0,0 +1,117 @@
+# Implementation Plan: WEB-4887 Phase 2 — AI-assisted MCP tool policy creation in `unbound` CLI
+> Generated by /implementation-plan on 2026-06-18. Source: principal-architect.
+>
+> Phase 1 (terminal command AI-assist) is **DELIVERED** on the same branch
+> (`web-4887-cli-ai-assist-policy`, commits `a90713e`..`f4508dc`, PR #54).
+> Phase 1 acceptance criteria still hold — see git history. This document
+> contracts Phase 2 (MCP tool AI-assist) only; downstream engineers should
+> read Phase 1's exports in `src/lib/policy-ai-assist.js` to understand the
+> infrastructure being extended (preflight, admin probe, error matrix,
+> confirmation, BLOCK/WARN guard, flag-precedence merge).
+## Goal
+Add `--prompt <text>` to `unbound policy tool create-mcp` so an admin can describe an MCP tool policy in natural language; the CLI routes the prompt through `POST /api/v1/command-policies/assist-mcp/`, merges the AI response with user-supplied override flags, previews the resolved policy, confirms, and creates it via the existing create endpoint.
+## Established Context (do not override)
+- Single PR — Phase 2 ships on existing branch `web-4887-cli-ai-assist-policy`, same PR #54 as Phase 1. Phase 1 infrastructure is shared.
+- Same `--prompt` flag pattern — add to existing `unbound policy tool create-mcp` subcommand (NOT a parallel `-ai` subcommand). Mutex with MCP classification flags (`--mcp-server`, `--mcp-tool` — singular, what the CLI actually exposes today; `--mcp-action-type`; `--config`). Compatible with `--name`, `--description`, `--action`, `--custom-message`, `--group`, `--disabled`, `--yes`, `--json` as overrides.
+- Endpoint: `POST {API_BASE}/api/v1/command-policies/assist-mcp/`. Admin-only. Request body `{user_input: string, current_form_state: {mcp_canonical_group_id, mcp_tools, name, description, action}}`. Success response: `{success:true, canonical_group_id:int, mcp_tools:string[], name:string, description:string, action:enum, custom_message:string}`. Failure response: `{success:false, error:string}`. No `explanation` field in the response (asymmetric with the terminal endpoint).
+- Field whitelist (spec §4.2). AI-fills: `mcp_canonical_group_id`, `mcp_tools[]`, `name` (≤80 chars), `description`, `action`, `custom_message`. User-via-flags overrides: `--name`, `--description`, `--action`, `--custom-message`, `--group`, `--disabled`. Out-of-scope (mutex with `--prompt`): `--mcp-server`, `--mcp-tool`, `--mcp-action-type`, `--config`.
+- No backend fallback (Haiku endpoint). Two extra failure modes vs Phase 1: `success:true` + `mcp_tools:[]` → soft fail; `success:true` + missing `canonical_group_id` → soft fail. Surface manual escape-hatch and exit 2; do NOT create the policy.
+- Reuse Phase 1 helpers from `src/lib/policy-ai-assist.js`: `validatePromptPreflight`, `loadPrivileges` + module-level `_privilegesCache`, `confirmCreate`, `confirmContinue`, `promptYesNo`, `routeBackendError`, `validateActionOverride`, `OUT_OF_SCOPE_KEYWORDS`, `MAX_PROMPT_LEN`, `TOOL_ACTIONS`, `colorAction`.
+- New helpers added to the SAME file (`src/lib/policy-ai-assist.js`): `runMcpPromptCreate`, `mergeAiAndFlagsMcp`, `renderMcpPreview`, `routeMcpSuccessFalse`. Module grows from ~340 LOC to ~600 LOC — within "wieldy".
+- BLOCK/WARN custom-message guard same shape as Phase 1: post-merge, if action ∈ {BLOCK, WARN} and no `custom_message`, exit 2 with user-vs-AI attributed wording.
+- Out of Scope this ticket: MCP catalog cache pre-flight (spec §7.2), backend `source:"cli"|"web"` telemetry (spec §12.2), AI-assist for Cost/Model/Security (spec §3 / §12.3), MDM-side MCP catalog awareness.
+- Skill update needed in setup PR #163 — separate cross-repo deliverable (see "Sequencing" step 7).
+## Out of Scope
+- MCP catalog cache pre-flight (spec §7.2). The "warn before sending if service not in org's catalog" optimization is deferred.
+- Backend `source: "cli" | "web"` telemetry field on the assist-mcp endpoint (spec §12.2).
+- AI-assist for Cost / Model / Security policy types (spec §3 / §12.3).
+- MDM-side MCP catalog awareness (catalogs are per-org).
+- Reverse-resolving `canonical_group_id` → service display name in the preview "Service" row. v1 shows the integer id; post-create `displayToolPolicy` shows the resolved name from the backend's full serialization. Polish follow-up.
+- Migrating the existing flag-based `create-mcp` body shape (`mcp_server` string + `mcp_tool` singular + `mcp_tool_action_type`) to the AI-assist shape (`mcp_canonical_group_id` int + `mcp_tools[]` array). The flag path stays exactly as today; only the AI-assist branch sends the new shape. Backend accepts both.
+- Renaming `--mcp-tool` (singular, current) to `--mcp-tools` (plural, as referenced in spec §7). Separate flag-rename ticket if wanted.
+- Updating `claude-code/skills/unbound-tool-policy/SKILL.md` in the `setup` repo (PR #163). Cross-repo follow-up commit, NOT bundled with the unbound-cli PR. Sequencing in step 7.
+- Running the eval harness in CI. The terminal eval set at `test/eval/policy-prompts.json` is manual-only; the MCP extension (10 single-service + 3 multi-service prompts per spec §7.5) is also manual-only and noted in `test/eval/README.md` as a follow-up entry — not a Phase 2 file change.
+## Files to Change
+| Path | Change | Acceptance criterion |
+|------|--------|----------------------|
+| `src/commands/policy.js` | (a) Downgrade three `.requiredOption(...)` calls on `create-mcp` (lines ~1586, ~1587, ~1590 — `--name`, `--mcp-server`, `--action`) to `.option(...)`. (b) Add `.option('--prompt <text>', ...)`, `.option('--yes', ...)`. (Confirm `--json` is present; add if absent.) (c) Update the descriptions of `--name`, `--description`, `--action`, `--custom-message` to note "Override AI suggestion when used with --prompt". (d) In the `create-mcp` action handler, branch on `opts.prompt` BEFORE the existing flag validation. Branch body: empty-prompt check (exit 1, message "--prompt cannot be empty."); mutex against `--mcp-server`, `--mcp-tool`, `--mcp-action-type`, `--config` with verbatim wording "Pass --prompt for AI-assist or the field flags for explicit creation, not both." (exit 1); resolve `--group` via `loadFormData` and `resolveByName` to populate `opts.scopeUserGroupIds`; call `runMcpPromptCreate(opts)`; handle `err.exitCode === 0` as user-abort (warn + return); on success, `api.post('/api/v1/command-policies/', { body: result.body })`; success message `MCP policy${body.name ? ` "${body.name}"` : ''} created.`; `displayToolPolicy(unwrapToolPolicy(data))`. (e) Preserve the flag-based path verbatim; add `(or use --prompt for AI assist)` suffix to the three required-flag in-handler errors so non-`--prompt` users see equivalent guidance. (f) Update `addHelpText('after', ...)` so the first example uses `--prompt`. | M2, M3, M5, M6, M7, M9, M10, M11 |
+| `src/lib/policy-ai-assist.js` | Add four new symbols. (a) `routeMcpSuccessFalse(error)` — same shape as the existing `routeSuccessFalse`, but MCP-specific manual-escape-hatch wording (mentions `--mcp-server`/`--mcp-action-type`, not `--command-family`/`unbound policy tool families`). (b) `mergeAiAndFlagsMcp(aiResponse, opts)` — accepts the full AI response (`canonical_group_id`, `mcp_tools`, `name`, `description`, `action`, `custom_message`), returns body `{policy_type:'MCP_TOOL', mcp_canonical_group_id, mcp_tools, name, description, action, enabled, custom_message?, scope_user_group_ids?}`. Flag-precedence rules: trimmed `--name`/`--description`/`--action` win over AI; whitespace-only falls back to AI silently (mirrors Phase 1 regression fix); `--custom-message` always wins; `--disabled` always wins; default action `AUDIT` if neither AI nor flag sets it. (c) `renderMcpPreview(body, explanation)` — same `console.log` + `output.keyValue` shape as `renderTerminalPreview`. Rows in order: Name, Description (skip if empty), Type (`MCP_TOOL`), Service (`canonical_group_id: <int>` — see Out of Scope for why no reverse-resolve), Tools (`body.mcp_tools.join(', ')`), Action (colored via existing `colorAction`), Custom message (skip if empty), Scope (groups, `(org-wide)` if none), Enabled. Header "ℹ Resolved MCP policy (from AI assist):" in cyan. Explanation block guarded by `if (explanation)` — Phase 2 callers pass `undefined` because assist-mcp returns no explanation; forward-compatible if backend later adds it. (d) `runMcpPromptCreate(opts)` — orchestrator mirroring `runTerminalPromptCreate`. Sequence: `config.isLoggedIn()` check (exit 3); `validateActionOverride(opts.action)` early; `validatePromptPreflight(opts)` and warn loop over `warnings`; if warnings present and not `--yes`/`--json`, `confirmContinue()` gate (abort → throw with `exitCode:0`); `loadPrivileges()` + non-admin gate (exit 3 with role-naming message); `api.post('/api/v1/command-policies/assist-mcp/', {body: {user_input: sanitizedPrompt, current_form_state: {mcp_canonical_group_id:'', mcp_tools:[], name:'', description:'', action:''}}})` wrapped in try/catch that routes through `routeBackendError`; on `success === false`, route through `routeMcpSuccessFalse` and throw; **soft-fail check**: if `!response.canonical_group_id` OR `!Array.isArray(response.mcp_tools)` OR `response.mcp_tools.length === 0`, throw `Error` with `exitCode:2` and message `"AI assist could not match any tools for your description. Try naming the service and tools more directly, or use \`unbound policy tool create-mcp --name "..." --mcp-server <server> --mcp-action-type <read\|write\|destructive> --action <action>\`."`; `body = mergeAiAndFlagsMcp(response, opts)`; BLOCK/WARN guard reusing Phase 1's user-vs-AI attribution pattern (check `typeof opts.action === 'string' && opts.action.trim().length > 0` to decide attribution); `renderMcpPreview(body, undefined)` unless `opts.json`; `confirmed = opts.yes || opts.json ? true : await confirmCreate()`; return `{body, confirmed, explanation: undefined}`. (e) Append `runMcpPromptCreate`, `mergeAiAndFlagsMcp`, `renderMcpPreview` to `module.exports` (keep all existing exports verbatim). | M1, M2, M3, M4, M5, M6, M7, M8, M11 |
+| `README.md` | Under "Tool policy examples", in the MCP section, lead the first example with `unbound policy tool create-mcp --prompt "audit all Linear writes"`. Keep the flag-based example below as the documented fallback. Symmetric to Phase 1's terminal section update. | M12 |
+| `PLAN-web-4887.md` | §7: replace "Phase 2 spec, not yet implemented" language with "Phase 2 DELIVERED" markers. At the end of §7.4, add a note that the assist-mcp response has no `explanation` field (preview's explanation block is unconditionally skipped). §11.1 (catalog cache), §12.2 (`source:cli` telemetry), §12.3 (Cost/Model/Security AI-assist) remain in the deferred list — Phase 2 does NOT mark them delivered. | M13 |
+## Files to Create
+| Path | Purpose | Acceptance criterion |
+|------|---------|----------------------|
+| `test/policy-ai-assist-mcp.test.js` | NEW integration test file at the CLI command layer. Uses `node:test` + `commander.parseAsync` exactly like Phase 1's `test/policy-ai-assist.test.js`. Extends the harness pattern: `buildHarness` in this file accepts `assistMcpResponse` / `assistMcpError` options, and the `api.post` stub routes on `path === '/api/v1/command-policies/assist-mcp/'` to either return `assistMcpResponse` or throw `assistMcpError` (otherwise routes to `/api/v1/command-policies/` for the create POST). Defines a local `runCreateMcp(argv)` helper wrapping `program.parseAsync(['node','unbound','policy','tool','create-mcp', ...argv])`. ~50 LOC of harness duplication accepted vs shared-helper plumbing. Defines 12 distinct tests (mutex tests parametrized over a `MUTEX_FLAGS` array; routing tests parametrized over a `ROUTING_CASES` array). Total ~13–15 `test(...)` calls. | M2, M3, M4, M5, M6, M7, M8, M9, M11 |
+## Test Surface
+- **Layer:** CLI command. Tests invoke `unbound policy tool create-mcp` via `commander.parseAsync(['node','unbound','policy','tool','create-mcp', ...argv])` against a fresh `new Command()` with the policy module re-registered per test (`loadFreshModules()` invalidates `require.cache` so module-level `_privilegesCache` and `_formDataCache` are clean). `src/api.js` `.post`/`.get` and `src/config.js` `isLoggedIn`/`getApiKey`/`getBaseUrl` are monkey-patched. NO internal helper unit tests — preflight is already covered by the policy-type-agnostic `test/policy-ai-assist-preflight.test.js` from Phase 1 and is NOT re-covered.
+- **Tests to add or extend:**
+  - `test/policy-ai-assist-mcp.test.js` — covers acceptance criteria M2, M3, M4, M5, M6, M7, M8, M9, M11. Test cases:
+    1. Mutex `--prompt + --mcp-server` → M2.
+    2. Mutex `--prompt + --mcp-tool` → M2.
+    3. Mutex `--prompt + --mcp-action-type` → M2.
+    4. Mutex `--prompt + --config` → M2.
+    5. Empty `--prompt` → M2.
+    6. Happy-path body shape (assist-mcp returns full success response) → M3.
+    7. Merge precedence with all override flags + AI fields preserved where flags absent → M4.
+    8. Whitespace-only `--name` falls back to AI name → M4.
+    9. Soft-fail `mcp_tools:[]` → M8.
+    10. Soft-fail missing `canonical_group_id` → M8.
+    11. AI-attributed BLOCK without `--custom-message` → M11.
+    12. User-attributed `--action BLOCK` without `--custom-message` → M11.
+    13. 7 HTTP routing cases parametrized (200 success, 200 success_false length, 200 success_false generic, 401, 422, 5xx, network/timeout) → exit codes 0/2/2/3/2/4/4 → M5.
+    14. `--yes` prints preview + skips confirm + issues create POST; assert "Resolved MCP policy" header on stdout, Name/Tools/Action rows present, create POST fires → M6, M7.
+    15. `--json` suppresses preview but still POSTs create → M6.
+    16. No `--yes`, readline-stubbed `'n'` → preview prints, no create POST → M6.
+    17. Out-of-scope keyword `staging` + `--yes` → warning logged AND assist-mcp call fires → M9.
+    18. Non-admin probe → exit 3 before assist-mcp call (defensive smoke test against `_privilegesCache` bleed across tests).
+- M1, M10, M12, M13 are verified by inspection / build smoke (M1: `require('../src/lib/policy-ai-assist')` exposes new symbols AND all Phase 1 symbols; M10: existing `test/policy-conditions.test.js` + flag-path tests if any continue to pass — running `npm test` validates this; M12: `README.md` diff; M13: `PLAN-web-4887.md` diff).
+## Acceptance Criteria
+1. **M1.** Given `require('../src/lib/policy-ai-assist')`, when destructured, then the module exports `runMcpPromptCreate`, `mergeAiAndFlagsMcp`, `renderMcpPreview` AND retains every Phase 1 export (`MAX_PROMPT_LEN`, `OUT_OF_SCOPE_KEYWORDS`, `validatePromptPreflight`, `mergeAiAndFlags`, `renderTerminalPreview`, `routeBackendError`, `runTerminalPromptCreate`) verbatim.
+2. **M2.** Given `unbound policy tool create-mcp --prompt "X" --mcp-server linear` (and the three other mutex pairings `--mcp-tool`, `--mcp-action-type`, `--config`), when the command runs, then `process.exitCode === 1` AND `output.error` is called with exactly `Pass --prompt for AI-assist or the field flags for explicit creation, not both.` AND no network calls fire. Given `unbound policy tool create-mcp --prompt ""`, then `process.exitCode === 1` AND `output.error` is called with a message containing `--prompt cannot be empty.` AND no network calls fire.
+3. **M3.** Given `unbound policy tool create-mcp --prompt "audit all linear writes" --yes` and a stubbed `api.post('/api/v1/command-policies/assist-mcp/')` returning `{success:true, canonical_group_id:12, mcp_tools:['create_issue','update_issue'], name:'Audit Linear writes', description:'…', action:'AUDIT'}`, when the command runs, then the body POSTed to `/api/v1/command-policies/` equals `{policy_type:'MCP_TOOL', mcp_canonical_group_id:12, mcp_tools:['create_issue','update_issue'], name:'Audit Linear writes', description:'…', action:'AUDIT', enabled:true}` (custom_message and scope_user_group_ids absent because not supplied).
+4. **M4.** Given a stubbed assist-mcp response `{success:true, canonical_group_id:12, mcp_tools:['t1'], name:'AI Name', description:'AI desc', action:'BLOCK', custom_message:'AI msg'}` and the CLI invoked with `--prompt "..." --name "Custom" --custom-message "Override" --action AUDIT --group GroupX --disabled --yes` and `loadFormData` stubbed to return `{user_groups:[{id:7,name:'GroupX'}]}`, when the command runs, then the create-POST body has `name:'Custom'`, `action:'AUDIT'`, `custom_message:'Override'`, `enabled:false`, `scope_user_group_ids:[7]`, AND retains `mcp_canonical_group_id:12` and `mcp_tools:['t1']` from the AI response. Given the same setup but `--name "   "` (whitespace-only), then the body has `name:'AI Name'` (AI value preserved, whitespace silently ignored).
+5. **M5.** Given each of 7 backend response shapes returned by the assist-mcp stub — (a) 200 + `{success:true, canonical_group_id, mcp_tools, ...}`, (b) 200 + `{success:false, error:"Input is too long (max 2000 characters)."}`, (c) 200 + `{success:false, error:"<generic>"}`, (d) `throw new FakeApiError(401, ...)`, (e) `throw new FakeApiError(422, ...)`, (f) `throw new FakeApiError(503, ...)`, (g) `throw new Error("connect ECONNREFUSED ...")` — when the command runs with `--yes`, then `process.exitCode` is `0 / 2 / 2 / 3 / 2 / 4 / 4` respectively AND `output.error` contains the verbatim wording from Phase 1's spec §6.4 matrix for that row.
+6. **M6.** Given `unbound policy tool create-mcp --prompt "audit linear writes"` without `--yes` and a stubbed successful assist-mcp response, when the command runs against a stubbed `readline.createInterface().question(_, cb)` that calls `cb('n')`, then `console.log` captures emit the header `Resolved MCP policy (from AI assist):` AND key rows for Name, Type (`MCP_TOOL`), Service, Tools (comma-joined), Action, Enabled BEFORE the prompt; AND no POST to `/api/v1/command-policies/` fires (user declined). Description and Custom message rows are omitted when their values are empty. Given `--json`, then NO "Resolved MCP policy" string appears on stdout or in `output.info`, but the create POST still fires.
+7. **M7.** Given `unbound policy tool create-mcp --prompt "audit linear writes" --yes` and a stubbed successful assist-mcp response, when the command runs, then the preview header AND key rows still print on stdout AND no readline prompt is invoked AND a POST to `/api/v1/command-policies/` fires with the merged body.
+8. **M8.** Given a stubbed assist-mcp response `{success:true, canonical_group_id:12, mcp_tools:[], name:'X', action:'AUDIT'}`, when the command runs with `--yes`, then `process.exitCode === 2` AND `output.error` contains the substring `could not match any tools` AND `unbound policy tool create-mcp --name` AND no POST to `/api/v1/command-policies/` fires. Given a stubbed assist-mcp response `{success:true, mcp_tools:['t'], name:'X', action:'AUDIT'}` (no `canonical_group_id`), the same exit code and message wording apply.
+9. **M9.** Given `unbound policy tool create-mcp --prompt "audit staging linear writes" --yes`, when the command runs, then `output.warn` is called with a message containing the substring `` `staging` `` AND a POST to `/api/v1/command-policies/assist-mcp/` is fired (does NOT exit early).
+10. **M10.** Given `unbound policy tool create-mcp --name X --mcp-server linear --mcp-tool create_issue --action AUDIT` (no `--prompt`), when the command runs against a stubbed `api.post('/api/v1/command-policies/')`, then the body POSTed equals `{name:'X', description:'', policy_type:'MCP_TOOL', mcp_server:'linear', mcp_tool:'create_issue', action:'AUDIT', enabled:true, config:{}}` — identical to today's flag-path body shape. Given `unbound policy tool create-mcp` with `--name` omitted, then the in-handler error `--name is required (or use --prompt for AI assist).` fires (replacing commander's auto-emitted `required option not specified` message).
+11. **M11.** Given the merged body has `action ∈ {BLOCK, WARN}` and no `custom_message`, when the command runs with `--yes`, then `process.exitCode === 2` AND no create POST fires AND `output.error` wording is user-attributed (`--action BLOCK requires --custom-message. ...`) when `opts.action` was passed by the user, or AI-attributed (`AI assist set --action to BLOCK ...`) when only the AI returned the action.
+12. **M12.** Given `README.md`, when read, the "Tool policy examples" MCP section's first example uses `unbound policy tool create-mcp --prompt "..."` AND a flag-based example appears below it as a documented fallback.
+13. **M13.** Given `PLAN-web-4887.md`, when read, §7 is marked "Phase 2 DELIVERED" with a cross-link to the Phase 2 commit AND §7.4 includes a note that assist-mcp has no `explanation` field AND §11.1, §12.2, §12.3 remain in the deferred list.
+## Risks & Mitigations
+- **Risk:** Downgrading three `.requiredOption(...)` calls on `create-mcp` to `.option(...)` changes commander's auto-emitted error wording for users who omit `--name` / `--mcp-server` / `--action` on the flag-based path. Existing scripts that grep stderr will see different text.
+  - **Mitigation:** In-handler validation throws functionally equivalent messages (`"--name is required (or use --prompt for AI assist)."`, etc.) — same flag name, same imperative, additional `--prompt` hint. Document the wording change in the PR description. Phase 1 made the exact same change to `create-terminal`.
+- **Risk:** The assist-mcp response has no `explanation` field (asymmetric with the terminal endpoint). A naive `renderMcpPreview` that always prints "AI assist explanation:" would emit an empty trailing line.
+  - **Mitigation:** Reuse Phase 1's `if (explanation) { ... }` guard pattern verbatim. Phase 2 callers pass `explanation: undefined`; the block is skipped. Forward-compatible: if backend later adds `explanation`, the same branch picks it up with zero CLI change.
+- **Risk:** Soft-fail on `mcp_tools:[]` could mask a legitimate "match all tools on this service" intent. The Haiku endpoint's post-validation drops invented names; a legitimate "all writes on Linear" might survive validation as `mcp_tools:[]` if Linear has no write tools.
+  - **Mitigation:** Established Context #5 makes this an explicit product decision: do NOT auto-create with no tools. If backend later wants wildcard semantics, it can return `mcp_tools:['*']` or a sentinel; the CLI passes through unchanged. Surface message points users at the manual flag path for the unambiguous "all destructive tools on Linear" intent.
+- **Risk:** Preview shows `Service: canonical_group_id: 12` (integer) instead of "Linear". Users seeing only the preview won't know what service they're authorizing.
+  - **Mitigation:** v1 limitation, explicit in Out of Scope. Post-create `displayToolPolicy(unwrapToolPolicy(data))` shows the backend's resolved `mcp_server` name in the success output, so the integer-only window is just the pre-confirmation preview. Polish follow-up: have `renderMcpPreview` accept an optional name-resolution map fetched from `/api/v1/command-policies/metadata/mcp-servers/` (one extra round trip; defer unless user feedback demands it).
+- **Risk:** `runMcpPromptCreate` and `runTerminalPromptCreate` diverge subtly over time. The BLOCK/WARN guard wording is duplicated text, not a shared helper.
+  - **Mitigation:** Both functions explicitly call shared `validatePromptPreflight`, `loadPrivileges`, `confirmCreate`, `confirmContinue`, `routeBackendError`, `validateActionOverride`. The duplicated guard wording is ~10 lines — extract into a `assertCustomMessageForBlockWarn(body, opts)` helper if a third policy type ships with AI-assist (per spec §3, deferred).
+- **Risk:** `buildHarness` in `test/policy-ai-assist-mcp.test.js` duplicates ~50 LOC of the Phase 1 harness scaffolding (module-cache invalidation, output-capture, ApiError shape).
+  - **Mitigation:** Duplication is intentional. A shared `test/_helpers/harness.js` would couple Phase 1 and Phase 2 test files; any harness change risks breaking Phase 1's stable test suite. Accepted trade-off: per-file harnesses, ~50 LOC duplication, full Phase 1 / Phase 2 test isolation.
+- **Risk:** Cross-repo SKILL.md update on setup PR #163 lands BEFORE unbound-cli PR #54 — the skill would advertise an MCP `--prompt` path that doesn't exist yet on installed CLIs.
+  - **Mitigation:** Sequencing in step 7 explicitly orders PR #54 to merge first, THEN push the skill update commit to PR #163. CLI versioning ensures users who upgrade get the binary before the skill steers them to the new flag.
+## Sequencing
+1. Extend `src/lib/policy-ai-assist.js`: add `routeMcpSuccessFalse`, `mergeAiAndFlagsMcp`, `renderMcpPreview`, `runMcpPromptCreate`. Append to `module.exports`. ~250 LOC added; module reaches ~600 LOC.
+2. Modify `src/commands/policy.js` `create-mcp` registration: downgrade three `.requiredOption(...)` to `.option(...)`, add `--prompt`/`--yes` (and `--json` if absent), update flag descriptions, branch in the action handler on `opts.prompt`, update `addHelpText` to lead with `--prompt`. ~80 LOC added/modified.
+3. Write `test/policy-ai-assist-mcp.test.js` with the ~13–15 tests enumerated in Test Surface. Run `npm test` — both the new file and the existing Phase 1 suite (`test/policy-ai-assist.test.js`, `test/policy-ai-assist-preflight.test.js`, all other existing tests) must pass.
+4. Update `README.md` "Tool policy examples" MCP section to lead with `--prompt`.
+5. Update `PLAN-web-4887.md` §7 to mark Phase 2 DELIVERED with commit cross-links; append the no-`explanation`-field note at the end of §7.4. Leave §11.1 / §12.2 / §12.3 in the deferred list.
+6. Commit on `web-4887-cli-ai-assist-policy`; PR #54 updates with the Phase 2 changes.
+7. (Cross-repo, AFTER step 6 merges) Push a follow-up commit to setup PR #163: update `claude-code/skills/unbound-tool-policy/SKILL.md` — drop the "MCP not available yet" footer, update the frontmatter description to cover both terminal AND MCP policy creation via `--prompt`, add a "Creating an Unbound MCP tool policy" section mirroring the terminal one (rules: one policy per service per invocation, tool names belong in the prompt, `--group` for scoping, `--custom-message` for BLOCK/WARN, fall back to flag-based on failure). Merge PR #163 after PR #54 is in main so the skill never advertises a flag the installed CLI doesn't have.
+## Open Questions
+None — all decisions are locked in Established Context. The setup PR #163 SKILL.md update is sequencing, not a question.