claude-dev-env 1.17.0 → 1.17.2

@@ -7,8 +7,8 @@ When authoring or refining prompts, ground decisions in these sources. If guidan
7
7
  ### Tier 1: Anthropic (primary authority for Claude)
8
8
 
9
9
  - https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/overview -- overview, links to all sub-guides
10
- - https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices -- the single living reference for Claude's latest models. Covers general principles, XML tags, prefill deprecation, tool use, thinking, agentic systems, overeagerness, evidence-grounding and citing sources before strong claims.
11
- - https://transformer-circuits.pub/2026/emotions/index.html -- emotion concepts research (April 2026): 171 internal activation patterns that causally influence behavior. Key prompt-engineering takeaways: clear criteria and escape routes improve output quality, collaborative framing activates engagement, positive task framing correlates with better results, inviting transparency produces more reliable output. Cross-model caveat: studied on Sonnet 4.5; patterns align with best practices independently.
10
+ - https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices -- the single living reference for Claude's latest models.
11
+ - https://transformer-circuits.pub/2026/emotions/index.html -- emotion concepts research (April 2026). Key takeaways: clear criteria and escape routes, collaborative framing, positive task framing, inviting transparency. Full catalog: `packages/claude-dev-env/docs/emotion-informed-prompt-design.md`.
12
12
  - https://www.anthropic.com/research/emotion-concepts-function -- blog summary of the above paper.
13
13
  - https://platform.claude.com/docs/en/build-with-claude/adaptive-thinking -- adaptive thinking reference; replaces manual budget_tokens with effort-based control.
14
14
  - https://claude.com/blog/harnessing-claudes-intelligence -- harness evolution: primitives Claude already knows, what to stop doing in the harness, deliberate boundaries (context economics, caching, typed tools). Local inventory: `docs/references/anthropic-harnessing-claudes-intelligence-technique-inventory.md`.
@@ -37,7 +37,17 @@ When authoring or refining prompts, ground decisions in these sources. If guidan
37
37
 
38
38
  ### Conflict resolution rule
39
39
 
40
- If sources disagree on a technique, apply in order: Anthropic documentation first (it describes the actual model behavior), then OpenAI/Google/Microsoft (large-scale research with cross-model relevance), then community sources (patterns and intuition, not authoritative on model internals). When Tier 3 contradicts Tier 1, Tier 1 wins without exception.
40
+ If sources disagree, apply tier order: Anthropic first, then OpenAI/Google/Microsoft, then community. Tier 1 wins when conflicting with lower tiers.
41
+
42
+ ### Outcome preview gate and digest (`prompt-generator`)
43
+
44
+ See SKILL.md §§107-115 (Phases 4-5) and `TARGET_OUTPUT.md` for the full contract. **Clipboard safety:** `extract_fenced_xml_content` concatenates every ` ```xml ` block—follow §7 sample formatting so clipboard copy stays the lone artifact body.
45
+
46
+ ### Outcome preview gate and digest (`prompt-generator`)
47
+
48
+ Human checkpoint before the paste-ready artifact ships: the orchestrator runs an **Outcome preview** turn (`### Outcome preview` bullets built from the **preview summary**, defined in SKILL.md Terminology) plus **AskUserQuestion** (**Ship** recommended first, two contextual alternates, **Refine with free text**), then emits `Audit`, a single ` ```xml ` fence, and **`## Outcome digest`** after the fence. Rationale matches collaborative checkpoints in `templates/skill-from-ground-up.md` and the refinement pattern in `templates/skill-refinement-package.md`. `ARCHITECTURE.md` lists all files in this skill package.
49
+
50
+ **Clipboard safety:** `prompt_workflow_gate_core.extract_fenced_xml_content` concatenates every ` ```xml ` block in the message—follow the sample formatting rules in SKILL.md section 7 so clipboard copy stays the lone artifact body. Full contract: `TARGET_OUTPUT.md`.
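The clipboard concern can be illustrated with a minimal sketch. The function below is a hypothetical stand-in, not the real `extract_fenced_xml_content`, whose implementation is not shown here; it only assumes the behavior stated above, that every `xml` fence in the message is concatenated.

```python
import re

FENCE = "`" * 3  # built at runtime so this example does not nest literal fences

# Hypothetical stand-in for extract_fenced_xml_content; the real helper may differ.
XML_FENCE_PATTERN = re.compile(FENCE + r"xml[ \t]*\n(.*?)\n" + FENCE, re.DOTALL)


def concatenate_xml_fences(message: str) -> str:
    """Join the body of every xml fence in the message, in order of appearance."""
    return "\n".join(XML_FENCE_PATTERN.findall(message))


artifact = "<role>Reviewer</role>"
sample = "<example>illustrative snippet</example>"

# Safe: the digest sample is plain text, so only the artifact sits in an xml fence.
safe_message = f"{FENCE}xml\n{artifact}\n{FENCE}\n\n## Outcome digest\n- sample: {sample}\n"

# Unsafe: a second xml fence after the digest is swept into the clipboard payload.
unsafe_message = safe_message + f"\n{FENCE}xml\n{sample}\n{FENCE}\n"

print(concatenate_xml_fences(safe_message) == artifact)    # artifact only
print(concatenate_xml_fences(unsafe_message) == artifact)  # artifact plus stray sample
```

Under that assumption, any sample rendered in its own `xml` fence ends up appended to the pasted artifact, which is why the sample formatting rules matter.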
41
51
 
42
52
  ## Harness design patterns (Anthropic blog, April 2026)
43
53
 
@@ -68,7 +78,7 @@ Jump from concept to the platform specs the post names:
68
78
 
69
79
  ### Prompt caching (Hook 6)
70
80
 
71
- The [Messages API](https://platform.claude.com/docs/en/build-with-claude/working-with-messages) is stateless—re-supply prior actions, tool definitions, and instructions each turn. Maximize [prompt caching](https://platform.claude.com/docs/en/build-with-claude/prompt-caching) hits: **stable prefix first, dynamic tail last**; **append** new content via **messages** instead of rewriting the cached prompt; **avoid mid-session model switches** (caches are model-specific—use a **subagent** for a cheaper model); **treat the tool list as part of the cached prefix** and avoid churn; use **tool search** so dynamic discovery **appends** without invalidating the prefix; for multi-turn agents, **advance breakpoints** toward the latest message (**auto-caching**). Cached input tokens are priced at **10% of base input** per [pricing](https://platform.claude.com/docs/en/about-claude/pricing).
81
+ The Messages API is stateless. Maximize [prompt caching](https://platform.claude.com/docs/en/build-with-claude/prompt-caching): **stable prefix first, dynamic tail last**; **append** via messages; **avoid mid-session model switches** (use a subagent for cheaper models); **treat tool list as cached prefix**; use **tool search** to append without invalidation; **advance breakpoints** toward the latest message. Cached tokens cost **10% of base input**.
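A rough arithmetic sketch of why the stable prefix matters, using only the 10% figure stated above. The token counts and the base price are placeholders, and the sketch ignores any cache-write surcharge or cache expiry, so treat it as an illustration rather than a billing model.

```python
def input_cost(prefix_tokens: int, tail_tokens: int, base_price: float,
               prefix_cached: bool, cached_fraction: float = 0.10) -> float:
    """Cost of one request's input tokens, in the same unit as base_price."""
    prefix_price = base_price * cached_fraction if prefix_cached else base_price
    return prefix_tokens * prefix_price + tail_tokens * base_price


# Hypothetical numbers: a 9,000-token stable prefix and a 1,000-token dynamic tail.
BASE_PRICE = 3.0 / 1_000_000  # placeholder price per input token, not a quoted rate

cold = input_cost(9_000, 1_000, BASE_PRICE, prefix_cached=False)
warm = input_cost(9_000, 1_000, BASE_PRICE, prefix_cached=True)
print(f"cold={cold:.6f} warm={warm:.6f} ratio={warm / cold:.2f}")
```

The larger the share of the request that sits in the unchanged prefix, the closer the warm cost gets to the 10% floor; rewriting the prefix mid-session forfeits that.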
72
82
 
73
83
  ### Typed tools vs bash strings (Hook 7)
74
84
 
@@ -173,12 +183,6 @@ Search for this information in a structured way. As you gather data, develop sev
173
183
  </research_approach>
174
184
  ```
175
185
 
176
- Key elements:
177
- - Define clear **success criteria** for the research question
178
- - Encourage **source verification** across multiple sources
179
- - Track **competing hypotheses** with confidence levels
180
- - **Self-critique** approach and plan regularly
181
-
182
186
  ## Evaluation loop
183
187
 
184
188
  For prompt drafts that must hold up over time:
@@ -203,7 +207,7 @@ When deciding how to approach a problem, choose an approach and commit to it. Av
203
207
 
204
208
  ## Debug JSON schema (prompt-generator pipeline)
205
209
 
206
- Use **only** when the user explicitly requests debug output (for example `show debug`, `full audit table`, `raw internal object`). Default assistant turns stay **audit line + one `xml` fence**; this object is an optional appendix after that pair.
210
+ Use **only** when the user explicitly requests debug output (for example `show debug`, `full audit table`, `raw internal object`). Default assistant turns complete the normal handoff first: one `xml` fence + **`## Outcome digest`** (see also `TARGET_OUTPUT.md`); this JSON object is an optional appendix **after** that handoff.
207
211
 
208
212
  Shape (field names stable for internal audit helpers and Stop-hook leak detection):
209
213
 
@@ -247,4 +251,4 @@ Shape (field names stable for internal audit helpers and Stop-hook leak detectio
247
251
  }
248
252
  ```
249
253
 
250
- `checklist_results` keys must include all **14** compliance row ids from `SKILL.md` §11 (for example `reversible_action_and_safety_check_guidance`, `scope_terms_explicit_and_anchored`).
254
+ `checklist_results` keys must include all **15** compliance row ids from `SKILL.md` §11 (for example `reversible_action_and_safety_check_guidance`, `scope_terms_explicit_and_anchored`).
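A minimal sketch of the key-coverage check this implies. Only the two row ids quoted above come from the source; the full list of 15 lives in `SKILL.md` §11 and is not reproduced here, and the value shape (`"pass"` strings) is an assumption.

```python
from typing import Dict, Iterable, List


def missing_checklist_rows(checklist_results: Dict[str, object],
                           required_row_ids: Iterable[str]) -> List[str]:
    """Return required row ids that are absent from checklist_results, sorted."""
    return sorted(set(required_row_ids) - set(checklist_results))


# Partial list for illustration; the real check would pass all 15 ids from SKILL.md.
REQUIRED_ROW_IDS_SAMPLE = [
    "reversible_action_and_safety_check_guidance",
    "scope_terms_explicit_and_anchored",
]

debug_object = {
    "checklist_results": {
        "scope_terms_explicit_and_anchored": "pass",  # value shape is assumed
    }
}

print(missing_checklist_rows(debug_object["checklist_results"], REQUIRED_ROW_IDS_SAMPLE))
```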
@@ -4,10 +4,11 @@ description: >-
4
4
  Authors repository-grounded XML prompt artifacts for Claude: system and developer
5
5
  instructions, agent harnesses, tool-use patterns, evaluation rubrics, NotebookLM audio
6
6
  customization, and MCP or browser automation steering. Gathers scope through discovery
7
- and AskUserQuestion, runs the default refinement pipeline in a drafting subagent, and
8
- delivers a one-line audit plus one fenced XML block. Trigger when the user asks to write,
9
- refine, or improve steering text for Claude. Execution of the described work belongs in
10
- /agent-prompt only after the user explicitly confirms they want it run.
7
+ and AskUserQuestion, runs the default refinement pipeline in a drafting subagent, runs a
8
+ mandatory Outcome preview AskUserQuestion gate, then delivers one fenced XML block and a
9
+ skimmable Outcome digest after the fence. Trigger when the user asks to
10
+ write, refine, or improve steering text for Claude. Execution of the described work belongs
11
+ in /agent-prompt only after the user explicitly confirms they want it run.
11
12
  ---
12
13
  @packages/claude-dev-env/skills/prompt-generator/REFERENCE.md
13
14
 
@@ -21,39 +22,40 @@ description: >-
21
22
 
22
23
  **Harness hygiene:** Re-test harness assumptions about what Claude cannot do alone on each model generation or major product release—stale compensations bottleneck performance as capabilities improve (Hook 1; [Harnessing Claude's intelligence](https://claude.com/blog/harnessing-claudes-intelligence), inventory `docs/references/anthropic-harnessing-claudes-intelligence-technique-inventory.md`).
23
24
 
24
- **Eval contract:** The user-visible behavior this skill must satisfy is defined in `packages/claude-dev-env/skills/prompt-generator/TARGET_OUTPUT.md`. Automated evals live in `packages/claude-dev-env/skills/prompt-generator/evals/prompt-generator.json`.
25
+ **Eval contract:** The user-visible behavior this skill must satisfy is defined in `packages/claude-dev-env/skills/prompt-generator/TARGET_OUTPUT.md`. Automated evals live in `packages/claude-dev-env/skills/prompt-generator/evals/prompt-generator.json`. **File map:** `ARCHITECTURE.md` lists all files in this skill package and their roles.
25
26
 
26
27
  **Templates:** Under `packages/claude-dev-env/skills/prompt-generator/templates/`, `skill-from-ground-up.md` is the collaborative prompt for **net-new** checkpointed Agent Skill packages; `skill-refinement-package.md` is the sibling prompt for **existing-skill** multi-file refinements and package-aware polish. Skill-builder and skill-writer in this repo require implementers to use the matching template before checkpointed package work.
27
28
 
28
- **Terminology:** **Prompt artifact** — the full XML inside the single user-facing `xml` fence (the paste-ready handoff). **Scope block** — the five-key contract in §3A that grounds instructions. **Default refinement pipeline** — §10: base draft → section refine → merge → 15-row compliance audit → capped fixes (subagent-internal unless draft-only). **Light self-check** — §8: fast pre-return sanity pass (shape, tools, scope, patterns); *not* the compliance audit. **Compliance audit (15-row)** — §11: hook-keyed rows that set the `Audit: pass|fail` numerator. **Execution handoff** — `/agent-prompt` after explicit user intent to run work.
29
+ **Terminology:** **Prompt artifact** — the full XML inside the single user-facing `xml` fence (the paste-ready output). **Outcome digest** — skimmable `## Outcome digest` markdown **after** that fence on the final turn: what executing the prompt produces, inputs or tools, done criteria, short sample (see `TARGET_OUTPUT.md`). **Outcome preview gate** — mandatory `AskUserQuestion` **after** internal drafting returns candidate XML and **before** the final fenced artifact ships; uses `### Outcome preview` bullets plus confirmation options (**Ship** first, two contextual alternates, **Refine with free text**). **Preview summary** — structured fields the drafting subagent returns to the orchestrator: `final_prompt_xml`, `what_executor_produces`, `primary_inputs_or_tools`, `done_when`, `sample_excerpt_markdown` (about twenty lines; follow the sample formatting rules in SKILL.md section 7). **Scope block** — the five-key contract in §3A that grounds instructions. **Default refinement pipeline** — §10: base draft → section refine → merge → 15-row compliance audit → capped fixes (subagent-internal unless draft-only). **Light self-check** — §8: fast pre-return pass on output shape, tools, scope, and patterns; *not* the compliance audit. **Compliance audit (15-row)** — §11: hook-keyed rows the subagent evaluates internally; ships only after the file-based validation loop exits 0. **Execution handoff** — `/agent-prompt` after explicit user intent to run work. **Hook validation block** — structured fields for validation. Fields: `overall_status`, `checklist_results` rows, five scope-anchor tokens, `base_minimal_instruction_layer: true` (signals that the response includes the required minimal instruction scaffolding: scope anchors, checklist rows, and runtime signals), and `on_demand_skill_loading: true` (signals that heavy skills were loaded only when the task explicitly required them, per section 17 context-footprint controls). Stripped before user output. 
All other files reference this single definition.
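The preview summary fields named above could be modeled roughly as follows. The field names come from the Terminology entry; the string types, the hard 20-line cap, and the bullet labels are assumptions made for this sketch.

```python
from dataclasses import dataclass

SAMPLE_EXCERPT_MAX_LINES = 20  # source says "about twenty lines"; exact cap is assumed


@dataclass(frozen=True)
class PreviewSummary:
    final_prompt_xml: str
    what_executor_produces: str
    primary_inputs_or_tools: str
    done_when: str
    sample_excerpt_markdown: str

    def sample_within_limit(self) -> bool:
        """True when the sample excerpt stays within the assumed line cap."""
        return len(self.sample_excerpt_markdown.splitlines()) <= SAMPLE_EXCERPT_MAX_LINES

    def preview_bullets(self) -> str:
        """Render the summary as statement bullets under the preview heading."""
        return "\n".join([
            "### Outcome preview",
            f"- Produces: {self.what_executor_produces}",
            f"- Inputs or tools: {self.primary_inputs_or_tools}",
            f"- Done when: {self.done_when}",
        ])


summary = PreviewSummary(
    final_prompt_xml="<role>Reviewer</role>",
    what_executor_produces="A review report",
    primary_inputs_or_tools="Repository files, Read tool",
    done_when="Every changed file has a verdict",
    sample_excerpt_markdown="- file_a: ok\n- file_b: needs tests",
)
print(summary.preview_bullets())
```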
29
30
 
30
- **Hook-survival invariant (read first):** The fenced XML artifact is the primary deliverable and MUST survive Stop-hook retries. If a Stop hook rejects the response, only the surrounding audit summary and runtime signal scaffolding may change between retries—the XML inside the fence MUST be re-emitted in full on every retry. Recovery pattern: re-emit the complete fenced XML first, then adjust the audit line. Trimming, summarizing, or deferring the prompt artifact to satisfy a hook gate is forbidden.
31
+ **File-based validation loop (read first):** The fenced XML artifact is the primary deliverable. The drafting subagent writes the complete output (fenced XML + Outcome digest + hook validation block) to `data/prompts/.draft-prompt.xml`, runs `python packages/claude-dev-env/hooks/blocking/prompt_workflow_validate.py data/prompts/.draft-prompt.xml`, reads stderr for any `[reason_code] message` violations when exit code is 2, edits the file to fix violations, and re-runs until exit code 0. Only then does the orchestrator strip the hook validation block, output fenced XML + Outcome digest to the user, and delete the temp file. Trimming or summarizing the prompt artifact to pass validation is forbidden.
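The loop described above can be sketched as a small driver. This assumes only what the paragraph states: exit code 0 means pass, exit code 2 means violations, and violations arrive on stderr as `[reason_code] message` lines. The `fix` callback and the `max_attempts` cap are hypothetical additions for the sketch; the source says to re-run until exit code 0 and names no cap.

```python
import re
import subprocess
from typing import Callable, List, Tuple

Violation = Tuple[str, str]
VIOLATION_PATTERN = re.compile(r"^\[(?P<code>[^\]]+)\]\s*(?P<message>.*)$")


def parse_violations(stderr_text: str) -> List[Violation]:
    """Extract (reason_code, message) pairs from validator stderr."""
    violations = []
    for line in stderr_text.splitlines():
        match = VIOLATION_PATTERN.match(line.strip())
        if match:
            violations.append((match.group("code"), match.group("message")))
    return violations


def run_validation_loop(command: List[str],
                        fix: Callable[[List[Violation]], None],
                        max_attempts: int = 5) -> int:
    """Run the validator until it exits 0; return the number of runs it took."""
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(command, capture_output=True, text=True)
        if result.returncode == 0:
            return attempt
        if result.returncode != 2:
            raise RuntimeError(f"unexpected exit code {result.returncode}: {result.stderr}")
        # Exit code 2: hand the parsed violations to the caller's edit step, then re-run.
        fix(parse_violations(result.stderr))
    raise RuntimeError("validation did not pass within the attempt cap")
```

In the workflow above, `command` would be the `prompt_workflow_validate.py` invocation on `data/prompts/.draft-prompt.xml`, and `fix` would edit only the flagged parts of the draft file.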
31
32
 
32
- **Turn shape:** Each orchestrator turn is either **AskUserQuestion** only (then wait for answers), or **`Audit: …` + exactly one `xml` fenced block** (then **send boundary**)—per `TARGET_OUTPUT.md`. Do not substitute free-form question paragraphs for AskUserQuestion; do not append commentary after the closing fence on the default path.
33
+ **Turn shape:** Each orchestrator turn is one of: **AskUserQuestion** only (then wait); **Outcome preview** turn (`### Outcome preview` markdown bullets + **AskUserQuestion** only); or the **final handoff** (one `xml` fence + `## Outcome digest`)—per `TARGET_OUTPUT.md`. Do not substitute free-form question paragraphs for scope clarifications; preview bullets are statements, not standalone interrogative paragraphs.
33
34
 
34
- **Happy path:** (1) Choose scenario **1–4** from the router table. (2) Run discovery when that scenario calls for repo tools. (3) Collect answers through **AskUserQuestion** (one form per round, **2–4** options per field, recommended first). (4) Subagent produces XML, runs **light self-check**, then **15-row compliance audit** + refinement loop. (5) Orchestrator prints **`Audit: pass 15/15`** or **`Audit: fail N/15 [reason]`** and the **complete fenced XML**. (6) **Send boundary:** end the message immediately after the closing fence. (7) If the user names a debug phrase, append the full table / JSON per `TARGET_OUTPUT.md`.
35
+ **Happy path:** (1) Choose scenario **1–4** from the router table. (2) Run discovery when that scenario calls for repo tools. (3) **AskUserQuestion** (one form per round, **2–4** options per field, recommended first). (4) Subagent produces XML plus **preview summary**, runs **light self-check**, then **15-row compliance audit** + refinement loop (all internal). (5) Orchestrator emits **Outcome preview** turn from the preview summary; user confirms or refines (up to **three** preview rounds unless the user raises the cap in chat). (6) Orchestrator prints the **complete fenced XML**, then **`## Outcome digest`**. (7) If the user names a debug phrase, append the full table / JSON **after** digest per `TARGET_OUTPUT.md`.
35
36
 
36
- **Clarity bar:** Ship concrete, outcome-first copy everywhere (AskUserQuestion fields, audit line, XML body): name *what* to do, *where* it applies, and *how* to verify done—per [Be clear and direct](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#be-clear-and-direct) and [Control the format of responses](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#control-the-format-of-responses). This skill **authors** prompts; downstream execution stays out of the default path until `/agent-prompt`.
37
+ **Clarity bar:** Ship concrete, outcome-first copy everywhere (AskUserQuestion fields, XML body, Outcome digest): name *what* to do, *where* it applies, and *how* to verify done—per [Be clear and direct](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#be-clear-and-direct) and [Control the format of responses](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#control-the-format-of-responses). This skill **authors** prompts; downstream execution stays out of the default path until `/agent-prompt`.
37
38
 
38
39
  ## Primary mission: paste-ready XML prompts (overrides other delivery instructions)
39
40
 
40
- **Delivery contract:** Each completed request yields a **repo-grounded XML prompt** a human or agent can paste into a new session. Turns go to discovery, **AskUserQuestion**, subagent drafting, and internal audits until that artifact is ready. **Author vs execution:** this skill ends at the artifact; when the user wants edits, tests, or PRs run for real, they confirm and move to **`/agent-prompt`**.
41
+ **Delivery contract:** Each completed request yields a **repo-grounded XML prompt** a human or agent can paste into a new session, preceded by confirmation at the **Outcome preview gate** and followed by an **Outcome digest** for skimming. **Author vs execution:** this skill ends at the artifact plus digest; when the user wants edits, tests, or PRs run for real, they confirm and move to **`/agent-prompt`**.
41
42
 
42
- **Hook-survival invariant:** Treat the fenced XML as the immutable payload for the user. On every Stop-hook retry, print the **same full** XML between the opening and closing fences; adjust only the one-line audit prefix (or other non-fence scaffolding) if a hook requires a format tweak. Re-emit the **entire** XML body before tweaking surrounding text—never shorten the artifact to pass a gate.
43
+ **Validation loop invariant:** The fenced XML is the immutable payload for paste operations. During the validation loop, keep the XML byte-identical between iterations; adjust only surrounding scaffolding. When a violation is inside the artifact (e.g. negative keywords), edit only the specific flagged lines.
43
44
 
44
- **Orchestrator vs subagent:** The **orchestrator** runs ordered discovery, issues **AskUserQuestion**, and owns the **final** user-visible line: audit + fence. The **subagent** owns base draft, per-section refinement, merge, and the **15-row compliance audit**, returning **only** final XML plus pass/fail counts (no user-facing table)—unless the user asked for **draft-only** / **no refinement**, in which case you may draft inline with the same output shape. Keep hook retries internal; expose at most one short line such as `Retrying: scope anchor missing` before the successful audit + fence.
45
+ **Orchestrator vs subagent:** The **orchestrator** owns discovery, **AskUserQuestion**, the **Outcome preview gate**, and the **final** handoff: read the validated file, strip the hook validation block, output fence + digest, copy to clipboard (respecting `PROMPT_WORKFLOW_SKIP_CLIPBOARD`), and delete `data/prompts/.draft-prompt.xml`. The **subagent** owns base draft, per-section refinement, merge, the **15-row compliance audit**, writing to `data/prompts/.draft-prompt.xml`, and the file-based validation loop until exit 0; returns **pass/fail counts + preview summary** to the orchestrator (no user-facing compliance table). For **draft-only** requests, draft inline with the same preview + handoff shape.
45
46
 
46
- **Interaction shape:** Route clarifications through **AskUserQuestion** only. Close each successful artifact turn with **audit line + one fenced XML block**; keep implementation plans **inside** that XML for the downstream consumer, not as a chat to-do list.
47
+ **Interaction shape:** Route scope clarifications through **AskUserQuestion** only. Close each successful run with **one fenced XML block + Outcome digest**; keep implementation plans **inside** the fenced XML for the downstream consumer, not as a chat to-do list.
47
48
 
48
49
  ## User-visible output contract (mandatory)
49
50
 
50
51
  Match `TARGET_OUTPUT.md`. Summary:
51
52
 
52
- 1. **Questions:** Use **AskUserQuestion** for every clarification (one multi-field form per round); keep normal assistant text free of standalone question paragraphs.
53
+ 1. **Questions:** Use **AskUserQuestion** for every scope clarification (one multi-field form per round); keep normal assistant text free of standalone question paragraphs outside preview bullets.
53
54
  2. **Options:** Supply **2–4** options per question, **recommended option first**; label discovery-sourced choices **`[discovered]`**.
54
- 3. **Final message (exactly):** Line 1 = `Audit: pass 15/15` or `Audit: fail N/15 — [short reason]`; immediately after, output **one** Markdown code fence whose language tag is `xml` and whose body is the **complete** prompt; **send boundary** = right after that fence closes—the visible message is exactly those two consecutive blocks, copy-ready together, before any later user message.
55
- 4. **Full audit table / JSON debug object:** Append only after the user uses an explicit debug phrase such as `show debug`, `full audit table`, or `raw internal object`.
56
- 5. **Commit-and-execute:** Pick a drafting approach, run it to completion, ship the XML; change plans only when **new** facts from the user or tools contradict the earlier scope.
55
+ 3. **Outcome preview turn:** `### Outcome preview` bullet block (preview summary) plus **AskUserQuestion** with **Ship this outcome profile** first, two contextual alternates, **Refine with free text**; cap at three preview rounds unless the user raises the cap in chat.
56
+ 4. **Final message:** **One** ` ```xml ` fence with the **complete** prompt; then **`## Outcome digest`**—**paste-ready section** remains the single `xml` fence for downstream paste.
57
+ 5. **Full audit table / JSON debug object:** Append only after the user uses an explicit debug phrase such as `show debug`, `full audit table`, or `raw internal object`, and only **after** digest.
58
+ 6. **Commit-and-execute:** Pick a drafting approach, carry it through preview confirmation, ship the handoff; change plans only when **new** facts from the user or tools contradict the earlier scope.
57
59
 
58
60
  **Required XML sections** inside the fence: `<role>`, `<background>`, `<instructions>`, `<constraints>`, `<output_format>`. Optional: `<illustrations>`, `<open_question>` (use for unresolved discovery — see structural invariant D in `TARGET_OUTPUT.md`).
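A minimal presence-and-balance check for the required sections, as a sketch. It counts opening and closing tags with regular expressions instead of parsing, on the assumption that the artifact may have several top-level elements and so is not a single-rooted XML document; the function name is hypothetical.

```python
import re
from typing import List

REQUIRED_SECTIONS = ("role", "background", "instructions", "constraints", "output_format")


def sections_needing_repair(prompt_xml: str) -> List[str]:
    """Return required section names that are missing or have unbalanced tags."""
    problems = []
    for tag in REQUIRED_SECTIONS:
        opens = len(re.findall(rf"<{tag}(?:\s[^>]*)?>", prompt_xml))
        closes = len(re.findall(rf"</{tag}>", prompt_xml))
        if opens == 0 or opens != closes:
            problems.append(tag)
    return problems


complete = (
    "<role>r</role><background>b</background><instructions>i</instructions>"
    "<constraints>c</constraints><output_format>o</output_format>"
)
print(sections_needing_repair(complete))
print(sections_needing_repair(complete.replace("</constraints>", "")))
```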
59
61
 
@@ -66,15 +68,17 @@ Match `TARGET_OUTPUT.md`. Summary:
66
68
  | **3 — Long unstructured input** | Many requirements / paths in one message | Verify repo references (packages, shared utils, configs) with targeted tools **before** questions | First question **confirms extracted intent**; ambiguities as **specific** options; **every** user-stated requirement captured in the generated XML by name — track all requirements from the unstructured input and confirm coverage before shipping |
67
69
  | **4 — Noisy context** | Long unrelated thread before `/prompt-generator` | Build the subagent brief from: the user’s literal `/prompt-generator` text, a **≤120-word** summary of on-topic facts, and discovery notes—**exclude** raw stack traces and unrelated tangents | As needed (often Scenario 1-shaped) |
68
70
 
71
+ **Final handoff (all scenarios):** After drafting, every run uses the **Outcome preview** turn, then the final message ` ```xml ` → `## Outcome digest` (`TARGET_OUTPUT.md`).
72
+
69
73
  **Handoff (Scenario 2):** `<background>` must be **self-contained** — state, **decisions**, files touched, next steps, constraints — so a new session needs no prior chat. Preserve prior decisions verbatim in the handoff; quote the exact decision text where precision matters rather than paraphrasing it away.
70
74
 
71
75
  ## Phase ordering (structural invariant A)
72
76
 
73
77
  For the **final** user-visible turn that ships the artifact:
74
78
 
75
- - Compose the message as **audit line → opening fence → XML → closing fence → end**; keep the byte stream free of `tool_use` blocks **between** the opening and closing fences.
79
+ - Compose the message as **opening fence → XML → closing fence → `## Outcome digest` → end**; keep the byte stream free of `tool_use` blocks **between** the opening and closing fences.
76
80
  - **Completeness:** End every numbered step inside `<instructions>` with a complete sentence and a fully written list item. Balance every XML tag explicitly (open and close each `<role>`, `<background>`, `<instructions>`, `<constraints>`, `<output_format>`). The artifact must be copy-pasteable into a new file with zero manual repair.
77
- - Global pipeline: **discovery tools** (when applicable) → **AskUserQuestion** → **subagent** (draft + refinement + internal audit) → **one** orchestrator reply containing only audit line + fence.
81
+ - Global pipeline: **discovery tools** (when applicable) → **AskUserQuestion** → **subagent** (draft + refinement + internal audit + **preview summary**) → **Outcome preview** turn → optional refinement loops → **one** orchestrator reply with fence + digest.
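The final-turn ordering above (fence, then XML, then closing fence, then digest) can be sketched as a single composition step. The function name and the bullet rendering of the digest are assumptions; `TARGET_OUTPUT.md` holds the actual contract.

```python
from typing import List

FENCE = "`" * 3  # built at runtime so this example does not nest literal fences


def compose_final_handoff(prompt_xml: str, digest_bullets: List[str]) -> str:
    """Assemble the final message: one xml fence, then the Outcome digest."""
    digest = "\n".join(f"- {bullet}" for bullet in digest_bullets)
    return (
        f"{FENCE}xml\n{prompt_xml.strip()}\n{FENCE}\n\n"
        f"## Outcome digest\n{digest}\n"
    )


message = compose_final_handoff(
    "<role>Reviewer</role>",
    ["Produces a review report", "Done when every changed file has a verdict"],
)
print(message)
```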
78
82
 
79
83
  ## Interactive discovery mode (default)
80
84
 
@@ -95,12 +99,22 @@ Issue **one** AskUserQuestion with all fields populated from discovery and the u
95
99
 
96
100
  Spawn a **subagent** (Agent tool) with:
97
101
 
98
- - Scenario id (1–4), user goal, discovery summary, AskUserQuestion answers
99
- - Instruction: produce **one** well-formed XML prompt (required sections) + run the internal refinement loop and **15-row compliance audit**; return **only** the final XML string and a pass/fail + fail count for that audit (no user-facing table)
102
+ - Scenario id (1–4), user goal, discovery summary, AskUserQuestion answers (and any **Refine with free text** deltas from prior preview rounds)
103
+ - Instruction: produce **one** well-formed XML prompt (required sections) + run the internal refinement loop and **15-row compliance audit**; write the complete output (fenced XML + Outcome digest + hook validation block) to `data/prompts/.draft-prompt.xml`; run `python packages/claude-dev-env/hooks/blocking/prompt_workflow_validate.py data/prompts/.draft-prompt.xml`; if exit code 2, read stderr violations, edit the draft file, and re-run until exit code 0; return **pass/fail + fail count** for the audit and **preview summary** fields (`what_executor_produces`, `primary_inputs_or_tools`, `done_when`, `sample_excerpt_markdown` following the sample formatting rules in SKILL.md section 7, about twenty lines max)
104
+
105
+ Keep subagent reasoning in the Agent transcript; the user-facing **Outcome preview** turn surfaces the preview summary; the **final** turn contains fence + digest.
106
+
107
+ ### Phase 4 — Outcome preview gate
108
+
109
+ 1. Render `### Outcome preview` from the **preview summary** (bullets only).
110
+ 2. Issue **AskUserQuestion** with **Ship this outcome profile** (recommended), two contextual alternates from discovery, **Refine with free text**.
111
+ 3. On **Ship**, go to Phase 5. On an alternate, merge the alternate into the brief and re-run Phase 3. On **Refine with free text**, merge the user text into the brief and re-run Phase 3. Stop after **three** preview rounds unless the user explicitly raises the cap in chat.
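The three steps above amount to a bounded confirm-or-refine loop, sketched below. `draft` stands in for re-running Phase 3 and `ask` for the AskUserQuestion call; both are hypothetical callables. What happens after the cap is reached is not specified above, so the sketch simply reports that nothing shipped.

```python
from typing import Callable, Tuple


def run_preview_gate(brief: str,
                     draft: Callable[[str], str],
                     ask: Callable[[str], Tuple[str, str]],
                     max_rounds: int = 3) -> Tuple[str, int, bool]:
    """Return (latest preview summary, rounds used, whether the user chose Ship)."""
    summary = draft(brief)
    for round_number in range(1, max_rounds + 1):
        choice, detail = ask(summary)
        if choice == "ship":
            return summary, round_number, True
        # Alternates and free-text refinements both merge into the brief,
        # then the drafting phase runs again.
        brief = f"{brief}\n{detail}"
        summary = draft(brief)
    return summary, max_rounds, False


# Usage with scripted answers: one refinement, then Ship.
answers = iter([("refine", "make it shorter"), ("ship", "")])
result = run_preview_gate(
    "write a review prompt",
    draft=lambda current_brief: f"summary of: {current_brief}",
    ask=lambda _summary: next(answers),
)
print(result)
```

A user raising the cap in chat would map to passing a larger `max_rounds`.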
100
112
 
101
- The orchestrator then prints **`Audit: pass 15/15`** or **`Audit: fail N/15 [reason]`** immediately followed by the fenced XML. Keep subagent reasoning in the Agent transcript; the user-facing turn contains **only** audit + artifact.
113
+ ### Phase 5 — Final handoff
102
114
 
103
- **Draft-only:** If the user explicitly requests no refinement (“quick draft”, “no refinement loop”), the subagent may skip Steps 10–12 below but must still return valid XML and an honest audit line.
115
+ Print the **complete fenced XML**, then **`## Outcome digest`** (tightened copy from the accepted preview summary).
116
+
117
+ **Draft-only:** If the user explicitly requests no refinement (“quick draft”, “no refinement loop”), the subagent may skip Steps 10–12 below but must still return valid XML and a **preview summary**; Phases 4–5 still run so the user confirms shape before paste.
104
118
 
105
119
  ## Workflow (run in order — primarily inside the drafting subagent)
106
120
 
@@ -138,17 +152,17 @@ Apply principles from Anthropic’s prompting guide (see REFERENCE.md): XML sect
138
152
 
139
153
  **Structural invariant D:** Write `<instructions>` / `<constraints>` as direct imperatives (“Open `path/to/file.ts` and …”). Park unresolved items in `<open_question>` tags—one distinct question per tag with the exact decision you need. Inside the fenced XML artifact, use only confident, definitive language: replace hedging phrases (“let me also check”, “actually”, “one more consideration”) and tentative qualifiers (“might be”, “possibly”, “I think”, “could be”) with direct assertions or move genuine uncertainty into `<open_question>` tags.
140
154
 
141
- **Set a role** in the system prompt. Anthropic: "Setting a role in the system prompt focuses Claude's behavior and tone for your use case. Even a single sentence makes a difference."
155
+ **Set a role** in the system prompt; even a single sentence focuses behavior and tone.
142
156
 
143
- **Add motivation behind constraints** in `<background>`. Anthropic: "Providing context or motivation behind your instructions... can help Claude better understand your goals and deliver more targeted responses." Claude generalizes from the explanation.
157
+ **Add motivation behind constraints** in `<background>`. Claude generalizes from the explanation, delivering more targeted responses.
144
158
 
145
- **Frame positively (zero-negative-keyword rule).** Anthropic: state the desired outcome directly. "Your response should be composed of smoothly flowing prose paragraphs" provides clearer guidance than a prohibition-only instruction. Apply this rule absolutely inside the fenced XML artifact across all sections (`<role>`, `<background>`, `<instructions>`, `<constraints>`, `<output_format>`): every instruction states what to do, what to produce, what to enforce. Use affirmative directives exclusively: "only X", "always X", "ensure X", "require X." Banned keywords inside generated XML: "no", "not", "don't", "do not", "never", "avoid", "without", "refrain", "stop", "prevent", "exclude", "prohibit", "forbid", "reject", "cannot", "unless." Also banned: indirect negative patterns such as "instead of X", "rather than X", "as opposed to." Example pass: "Ensure all functions have explicit return types." Example fail: "Do not leave return types implicit." When a boundary is needed, phrase it as what is permitted: "only run commands within the scoped paths" rather than a prohibition.
159
+ **Frame positively (zero-negative-keyword rule).** Anthropic: state the desired outcome directly. "Your response should be composed of smoothly flowing prose paragraphs" provides clearer guidance than a prohibition-only instruction. Apply this rule across all XML sections: every instruction states what to do, what to produce, what to enforce. Use affirmative directives exclusively: "only X", "always X", "ensure X", "require X." Banned keywords inside generated XML: "no", "not", "don't", "do not", "never", "avoid", "without", "refrain", "stop", "prevent", "exclude", "prohibit", "forbid", "reject", "cannot", "unless." Also banned: indirect negatives ("instead of X", "rather than X", "as opposed to"). Example pass: "Ensure all functions have explicit return types." Example fail: "Do not leave return types implicit." When a boundary is needed, phrase it as what is permitted: "only run commands within the scoped paths" rather than a prohibition.
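The zero-negative-keyword rule above can be checked mechanically. The following is a minimal sketch, assuming a plain whole-word scan over the generated XML text; `find_banned_terms` is a hypothetical helper written for illustration and is not the detector shipped in `prompt_workflow_validate.py`.

```python
import re

# Banned terms copied from the rule text; multi-word phrases are matched as phrases.
BANNED_TERMS = [
    "no", "not", "don't", "do not", "never", "avoid", "without", "refrain",
    "stop", "prevent", "exclude", "prohibit", "forbid", "reject", "cannot",
    "unless", "instead of", "rather than", "as opposed to",
]

# Longest terms first so "do not" is reported as one hit and not as "not".
_ALTERNATION = "|".join(
    re.escape(term) for term in sorted(BANNED_TERMS, key=len, reverse=True)
)
# Whole-word match only: "within" must not trigger "without", "note" must not trigger "not".
_PATTERN = re.compile(r"(?<![\w'])(" + _ALTERNATION + r")(?![\w'])", re.IGNORECASE)


def find_banned_terms(text: str) -> list[str]:
    """Return banned terms in order of appearance, lowercased."""
    return [match.group(1).lower() for match in _PATTERN.finditer(text)]


print(find_banned_terms("Ensure all functions have explicit return types."))
print(find_banned_terms("Do not leave return types implicit."))
```

This sketch matches only the exact listed forms, so inflections such as "avoiding" pass through; whether the real validator stems or lemmatizes is not stated in this diff.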
146
160
 
147
- **Emotion-informed framing.** Anthropic's emotion concepts research (2026) shows that internal activation patterns causally influence output quality. Apply: explicit success criteria with "say so if you're unsure" as an accepted answer; collaborative language ("help figure out", "work on this together"); framing tasks as interesting problems rather than chores; constructive, forward-looking tone. Cross-model caveat: studied on Sonnet 4.5; the patterns align with Anthropic's prompting best practices independently. Full pattern catalog and citations: `packages/claude-dev-env/docs/emotion-informed-prompt-design.md`.
161
+ **Emotion-informed framing.** Apply: explicit success criteria with "say so if you're unsure" as an accepted answer; collaborative language ("help figure out", "work on this together"); framing tasks as interesting problems; constructive, forward-looking tone. Full catalog: `packages/claude-dev-env/docs/emotion-informed-prompt-design.md`.
148
162
 
149
163
  **Golden rule check.** Anthropic: "Show your prompt to a colleague with minimal context on the task and ask them to follow it. If they'd be confused, Claude will be too."
150
164
 
151
- **Commit-and-execute pattern.** Anthropic: "When you're deciding how to approach a problem, choose an approach and commit to it. Avoid revisiting decisions unless you encounter new information that directly contradicts your reasoning." For prompts that guide agents through multi-step work, include this pattern so the agent doesn't spin revisiting decisions.
165
+ **Commit-and-execute pattern.** For multi-step agent prompts, include: "Choose an approach and commit to it. Revisit only when new information directly contradicts your reasoning."
152
166
 
153
167
  **Tool-return policy (agent-harness / tool-use prompts):** Require explicit justification before the harness tokenizes full tool outputs; when the next hop needs only a slice or a tool-to-tool handoff, steer authors toward code execution (bash/REPL) so only execution output reaches model-visible context—not every intermediate payload (Hook 2; [Harnessing Claude's intelligence](https://claude.com/blog/harnessing-claudes-intelligence)).
154
168
 
@@ -177,11 +191,12 @@ Use the optional `<illustrations>` section when concrete samples make format, to
177
191
  3. **Triple-backtick inner fence:** When the sample must use backtick fences, emit a **complete pair**: an opening line beginning with three backticks plus an info string (e.g. `` ```bash ``), the sample lines, then a closing line containing only three backticks. The prompt-workflow hook and clipboard path treat that pair as one unit inside the outer `` ```xml `` fence. For the **most stable on-screen rendering** in chat UIs, use step 1 or step 2 above before this option.
178
192
  4. **Cap count:** Include **three to five** distinct illustration blocks (narrative plus optional sample) unless the user’s brief asks for a different depth.
179
193
 
180
- These steps are **machine-facing obligations** for the orchestrator and drafting subagent. The person invoking `/prompt-generator` receives the finished fenced XML; the skill text above is what the model follows when filling `<illustrations>`.
194
+ These steps are instructions for the orchestrator and drafting subagent to follow when filling `<illustrations>`. The person invoking `/prompt-generator` receives the finished fenced XML.
195
+
181
196
 
182
197
  ### 8. Light self-check (subagent, pre-return)
183
198
 
184
- **Two-tier validation — tier 1:** Before the subagent returns XML, run a quick pass on output shape, tool phrasing, scope anchors, and safety / research / agentic patterns as applicable (see REFERENCE.md and patterns below). This **light self-check** is not interchangeable with the **15-row compliance audit** in §11; tier 2 supplies the hook-keyed pass/fail counts for the `Audit:` line.
199
+ Before the subagent returns XML, run a quick pass on output shape, tool phrasing, scope anchors, and applicable patterns (see REFERENCE.md). This **light self-check** (tier 1) is separate from the **15-row compliance audit** (tier 2, §11).
185
200
 
186
201
  Expand the light self-check with this internal checklist when useful:
187
202
 
@@ -201,15 +216,16 @@ Expand the light self-check with this internal checklist when useful:
201
216
 
202
217
  ### 9. Deliver (orchestrator)
203
218
 
204
- The orchestrator’s **only** delivery to the user is:
219
+ The orchestrator’s **final** delivery to the user is, in order: **one** fenced `xml` block (paste-ready prompt artifact), immediately followed by **`## Outcome digest`** containing:
205
220
 
206
- ```text
207
- Audit: pass 15/15
208
- ```
221
+ - **What it does** — plain-language summary of what running this prompt produces
222
+ - **Key inputs** — what the prompt needs to work (files, tools, context)
223
+ - **Done when** — how to tell the prompt succeeded
224
+ - **Quick sample** — short example of what the output looks like (follow the sample formatting rules in SKILL.md section 7 because `extract_fenced_xml_content` concatenates every `xml` fence)
209
225
 
210
- (or `fail N/15 …`), immediately followed by **one** fenced XML block; **send boundary** is immediately after the closing fence so the user receives a copy-ready pair (audit line + artifact) in one assistant message before the conversation continues.
226
+ **Paste-ready section:** Only the span from the opening ` ```xml ` line through the closing ` ``` ` line is intended for clipboard paste into a downstream session; the digest is for reading.
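A rough sketch of the final-message layout described above (one `xml` fence, then `## Outcome digest`), assuming the digest is passed in as plain strings. `compose_final_message` and its parameters are hypothetical names, and the colon after each bold label is an assumption of this sketch. The quick sample is emitted with four leading spaces so the message never gains a second fence.

```python
FENCE = "`" * 3  # built at runtime so this sample holds no literal fence line


def compose_final_message(
    xml_body: str, digest: dict[str, str], sample_lines: list[str]
) -> str:
    """Assemble the fenced artifact followed by the Outcome digest."""
    lines = [FENCE + "xml", xml_body.strip("\n"), FENCE, "", "## Outcome digest", ""]
    for label in ("What it does", "Key inputs", "Done when"):
        lines.append(f"- **{label}**: {digest[label]}")
    lines.append("- **Quick sample**:")
    lines.append("")
    # Four-space indentation keeps the sample out of any additional fence.
    lines.extend("    " + sample for sample in sample_lines)
    return "\n".join(lines)


message = compose_final_message(
    "<role>Reviewer</role>",
    {"What it does": "Reviews a diff", "Key inputs": "Repo path", "Done when": "Report written"},
    ["Finding 1: missing return type"],
)
print(message)
```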
211
227
 
212
- **Render-survival:** When the fenced XML uses tag names that **collide with HTML5 elements** (`section`, `summary`, `details`, `header`, `footer`, `main`, `aside`, `article`, `nav`, `figure`), or when the artifact is **very large**, **write the artifact to a file** and give the user the path together with the usual one-line audit. Add a brief **section inventory** (confirming the five required sections) so the user can trust the file even if the inline fence would render poorly. Required grounding uses `<background>` (the old `context` name matched HTML). Details: **TARGET_OUTPUT.md — Structural invariant E**.
228
+ **Render-survival:** When the fenced XML uses tag names that **collide with HTML5 elements** (`section`, `summary`, `details`, `header`, `footer`, `main`, `aside`, `article`, `nav`, `figure`), or when the artifact is **very large**, **write the artifact to a file** and give the user the path together with the usual one-line audit. Add a brief **section inventory** (confirming the five required sections) so the user can trust the file even if the inline fence would render poorly. Still emit **Outcome digest** (and file path if used) after the inline fence closes. Required grounding uses `<background>` (the old `context` name matched HTML). Details: **TARGET_OUTPUT.md — Structural invariant E**.
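The collision condition in the render-survival rule can be approximated with a short scan. This is a sketch under the assumption that tag names are compared case-insensitively and without namespace handling; `colliding_tags` is a hypothetical helper, and the name list is copied from the rule text.

```python
import re

# HTML5 element names listed in the render-survival rule.
HTML_COLLIDING_NAMES = {
    "section", "summary", "details", "header", "footer",
    "main", "aside", "article", "nav", "figure",
}

# Matches the name of an opening tag; closing tags start with "</" and are skipped.
_OPEN_TAG = re.compile(r"<([A-Za-z_][\w.-]*)")


def colliding_tags(xml_text: str) -> list[str]:
    """Return the sorted tag names in xml_text that collide with HTML5 elements."""
    found = {match.group(1).lower() for match in _OPEN_TAG.finditer(xml_text)}
    return sorted(found & HTML_COLLIDING_NAMES)


print(colliding_tags("<role>r</role><summary>s</summary>"))
```

A non-empty result would be one signal to write the artifact to a file; the "very large" threshold is left undefined in the source, so it is not modeled here.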
213
229
 
214
230
  ### 10. Default refinement mode (subagent-internal)
215
231
 
@@ -225,9 +241,9 @@ Required section list is immutable for this pipeline: `role`, `background`, `ins
225
241
 
226
242
  ### 11. Compliance audit — 15-row checklist (internal, audit numerator)
227
243
 
228
- **Two-tier validation — tier 2:** The `15` in `Audit: pass 15/15` counts these **compliance** rows (stable ids for hooks). Tier 1 is the **light self-check** in §8—keep the steps separate so models do not merge them.
244
+ The 15-row compliance audit counts these **compliance** rows (stable ids for hooks). Keep separate from the **light self-check** (§8, tier 1).
229
245
 
230
- **Runtime Stop hook:** In addition to the 15-row internal audit, the `prompt-workflow-stop-guard` Stop hook enforces **section presence** on prompt-workflow responses: any fenced Markdown XML block must include opening and closing tags for `role`, `background`, `instructions`, `constraints`, and `output_format`. Missing tags trigger a retry before the user sees a passing turn. Pair this with **Structural invariant E** in `TARGET_OUTPUT.md` so users still receive intact XML when chat renderers strip HTML-named tags. `prompt_workflow_gate_core.extract_fenced_xml_content` scans each inner Markdown fence (` ```lang ` through its closing `` ``` `` line) as a unit so hooks and clipboard copy see the **full** XML body, including everything after inner fences inside `<illustrations>`.
246
+ **File-based validation:** The `prompt_workflow_validate.py` CLI enforces **section presence**, **scope anchors**, **checklist rows**, **context-control signals**, **ambiguous scope detection**, and **negative keyword detection** on the draft file. The subagent fixes violations until exit code 0. Pair with **Structural invariant E** in `TARGET_OUTPUT.md` for render-survival. `extract_fenced_xml_content` scans inner Markdown fences as units so validation and clipboard see the **full** XML body.
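Of the checks listed above, section presence is the simplest to picture. The sketch below assumes the five required section names from this pipeline and a plain text search for each opening and closing tag; `missing_sections` is a hypothetical helper and may differ from what `prompt_workflow_validate.py` actually does.

```python
import re

# Required sections named elsewhere in this skill's contract.
REQUIRED_SECTIONS = ("role", "background", "instructions", "constraints", "output_format")


def missing_sections(xml_text: str) -> list[str]:
    """Return required section names lacking an opening or closing tag."""
    missing = []
    for name in REQUIRED_SECTIONS:
        # Allow attributes on the opening tag, but not longer names such as <roles>.
        has_open = re.search(rf"<{name}(\s[^>]*)?>", xml_text) is not None
        has_close = f"</{name}>" in xml_text
        if not (has_open and has_close):
            missing.append(name)
    return missing


draft = "<role>x</role><background>y"
print(missing_sections(draft))
```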
231
247
 
232
248
  | # | Row name |
233
249
  |---|----------|
@@ -251,13 +267,13 @@ For each row, maintain `status`, `evidence_quote`, `source_ref`, and `fix_if_fai
251
267
 
252
268
  ### 12. Debug-only bundle (explicit user request only)
253
269
 
254
- When the user explicitly asks for debug / full audit, emit the markdown table, `scope_block` recap, and the debug JSON **in addition to** the audit line + XML fence.
270
+ When the user explicitly asks for debug / full audit, emit the markdown table, `scope_block` recap, and the debug JSON **in addition to** the XML fence + **Outcome digest**.
255
271
 
256
- **Default user-facing path (keeps Stop hooks green):** After the XML fence, stop; do **not** add a second fenced block, do **not** start the message with `{`, and keep internal pipeline keys (`pipeline_mode`, `scope_block_validation`, `evidence_quotes`, `source_refs`, `corrective_edits`, `retry_count`, `audit_output_contract`, `section_output_contract`, `base_prompt_xml`, `required_sections`) inside the debug JSON only.
272
+ **Default user-facing path:** On non-debug turns, after the `xml` fence, emit **Outcome digest**, then **stop**—do **not** add a second outer fenced block for debug payloads, do **not** start the assistant message with `{`, and keep internal pipeline keys (`pipeline_mode`, `scope_block_validation`, `evidence_quotes`, `source_refs`, `corrective_edits`, `retry_count`, `audit_output_contract`, `section_output_contract`, `base_prompt_xml`, `required_sections`) inside the debug JSON only.
257
273
 
258
- **Debug JSON shape:** Full schema and field definitions: **REFERENCE.md** → **Debug JSON schema (prompt-generator pipeline)**. Use that object only on debug requests; default turns remain audit line + single `xml` fence.
274
+ **Debug JSON shape:** Full schema and field definitions: **REFERENCE.md** → **Debug JSON schema (prompt-generator pipeline)**. Use that object only on debug requests.
259
275
 
260
- **Hook-recovery (default path):** Print the **complete** fenced XML again, then the **one-line** audit; keep every XML section intact while you adjust scaffolding to satisfy the hook.
276
+ **Validation recovery (default path):** Fix the specific issue in `data/prompts/.draft-prompt.xml` and re-run the validator. Keep every XML section inside the fence intact; adjust only scaffolding outside the fence.
261
277
 
262
278
  ### 13. Scope quality rule for generated prompts
263
279
 
@@ -284,9 +300,9 @@ Use `/agent-prompt` only after the user explicitly asks to execute. Refinement s
284
300
 
285
301
  ### 17. Context-footprint controls
286
302
 
287
- Keep orchestrator turns minimal: discovery → AskUserQuestion → subagent → one-line audit + fence. Push heavy drafting to the subagent with a **curated** brief (especially Scenario 4).
303
+ Keep orchestrator turns structured: discovery → **AskUserQuestion** → subagent → **Outcome preview** turn → one final message (fence + digest). Push heavy drafting to the subagent with a **curated** brief (especially Scenario 4).
288
304
 
289
- **Low-context defaults:** Keep the base instruction layer in generated prompts lean—scope anchors, checklist-backed behaviors, and inert-content safety where hooks apply. Store stable enforcement text in hooks/rules instead of pasting full policy into every XML artifact. Load heavy skills only when the user’s task explicitly needs them. Prefer pointers to **REFERENCE.md** over repeating long excerpts; default user-visible output stays audit line + single `xml` fence unless the user requests debug.
305
+ **Low-context defaults:** Keep the base instruction layer in generated prompts lean—scope anchors, checklist-backed behaviors, and inert-content safety where hooks apply. Store stable enforcement text in hooks/rules instead of pasting full policy into every XML artifact. Load heavy skills only when the user’s task explicitly needs them. Prefer pointers to **REFERENCE.md** over repeating long excerpts; default user-visible output stays single `xml` fence + **Outcome digest** unless the user requests debug extras.
290
306
 
291
307
  ## Claude 4.6 considerations
292
308
 
@@ -317,7 +333,7 @@ For commands that delete data, rewrite shared history, or notify other people, o
317
333
  When tests fail or tooling blocks progress, prefer iterative fixes inside the allowed scope. Keep safety hooks (`--verify`, linters) enabled; surface unfamiliar files as questions.
318
334
  ```
319
335
 
320
- **Positive rewrite guidance:** When embedding this pattern into a generated XML artifact, rephrase each line using affirmative directives only (per the zero-negative-keyword rule in §4). Example rewrite for generated output: "Prioritize local, reversible actions: read files, run targeted tests, apply patches within scoped paths. Obtain explicit user approval before running commands that delete data, rewrite shared history, or send external notifications. Keep safety hooks enabled (`--verify`, linters). Surface unfamiliar files as questions for the user."
336
+ **Positive rewrite guidance:** When embedding this pattern into generated XML, apply the zero-negative-keyword rule (§4). Example: "Prioritize local, reversible actions: read files, run targeted tests, apply patches within scoped paths. Obtain explicit user approval before commands that delete data, rewrite shared history, or send external notifications. Keep safety hooks enabled. Surface unfamiliar files as questions for the user."
321
337
 
322
338
  ## Research prompt pattern
323
339
 
@@ -2,19 +2,29 @@
2
2
 
3
3
  This file is the **target output spec** for eval-driven iteration of the `prompt-generator` skill. Evals assert behavior against it; update this document and `SKILL.md` together when the contract changes.
4
4
 
5
+ **File map:** `ARCHITECTURE.md` lists all files in this skill package and their roles.
6
+
5
7
  **Methodology:** [Anthropic — Agent Skills: evaluation and iteration](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#evaluation-and-iteration)
6
8
 
9
+ ## Terminology
10
+
11
+ - **Outcome preview gate** — Mandatory `AskUserQuestion` turn **after** the drafting subagent returns candidate XML internally and **before** the orchestrator emits the fenced artifact. Confirms the user recognizes what executing the generated prompt will produce.
12
+ - **Outcome digest** — Skimmable markdown block **after** the closing ` ``` ` of the single `xml` fence on the final handoff: bullets for downstream deliverables, primary inputs or tools, done criteria, and a short sample excerpt (see `SKILL.md` §9).
13
+
7
14
  ## User-visible output contract
8
15
 
9
- - **Clarity bar:** Every deliverable (AskUserQuestion fields, audit line, XML body) states concrete outcomes, explicit formats, and checkable done-when signals—aligned with Anthropic [Be clear and direct](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#be-clear-and-direct) and [Control the format of responses](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#control-the-format-of-responses). Prefer what to do and how to verify it over empty prohibitions or vague quality adjectives.
16
+ - **Clarity bar:** Every deliverable (`AskUserQuestion` fields, XML body, outcome digest) states concrete outcomes, explicit formats, and checkable done-when signals—aligned with Anthropic [Be clear and direct](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#be-clear-and-direct) and [Control the format of responses](https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-prompting-best-practices#control-the-format-of-responses). Prefer what to do and how to verify it over empty prohibitions or vague quality adjectives.
10
17
  - **Questions:** Deliver every clarifying question through **AskUserQuestion** (one form per round), with **2–4** options per question and the **recommended** option listed **first**. Tag discovery-sourced options **`[discovered]`** when they came from repo search.
18
+ - **Outcome preview turn (mandatory):** Immediately before the final handoff, emit exactly one assistant turn that contains:
19
+ 1. A markdown block titled `### Outcome preview` with bullets only: **What it does**, **Key inputs**, **Done when**, **Quick sample** (about twenty lines max; follow the sample formatting rules in SKILL.md section 7).
20
+ 2. **AskUserQuestion** with **2–4** options: **Ship this outcome profile** (recommended first), two **contextual alternates** grounded in discovery, and **Refine with free text** (starts another drafting loop). Observe the preview round cap per SKILL.md Phase 4.
11
21
  - **Final assistant message (complete handoff in one send):**
12
- 1. **Audit line:** `Audit: pass 15/15` or `Audit: fail N/15 [reason]`
13
- 2. **Artifact:** the full XML prompt inside **one** Markdown code fence whose language tag is `xml`
14
- 3. **Send boundary:** stop typing as soon as the closing fence ends—the message body is exactly those two blocks back-to-back, ready to copy; your next tokens belong to the user’s following turn
15
- - **Full audit table / JSON debug bundle:** Stay internal until the user names debug with a phrase such as `show debug`, `full audit table`, or `raw internal object`; then append the table/JSON after the usual audit line + XML fence.
16
- - **Hook retries:** Keep retry loops inside the subagent or internal pipeline; the user sees at most one short status line such as `Retrying: scope anchor missing` before the successful audit line + fence.
17
- - **Decision stability:** Pick one drafting approach, carry it to a complete XML artifact, then stop. Change approach only when the user or tool results add **new** facts that contradict the earlier plan; if the draft fails checks, fix forward inside the same structure instead of restarting from scratch.
22
+ 1. **Artifact:** the full XML prompt inside **one** Markdown code fence whose language tag is `xml`
23
+ 2. **Outcome digest:** after the closing fence, a `## Outcome digest` section with tightened bullets so the user can verify intent at a glance
24
+ 3. **Paste-ready section:** The paste-ready prompt artifact remains the single ` ```xml ` block; the digest is for reading, not for pasting into the downstream session
25
+ - **Full audit table / JSON debug bundle:** Stay internal until the user names debug with a phrase such as `show debug`, `full audit table`, or `raw internal object`; then append the table/JSON **after** the Outcome digest.
26
+ - **File-based validation loop:** The subagent writes output to `data/prompts/.draft-prompt.xml`, runs the validator CLI, and fixes violations until exit 0. The orchestrator then outputs the validated result to the user.
27
+ - **Decision stability:** Pick one drafting approach, carry it through preview confirmation, then ship; change approach only when **new** facts from the user or tools contradict the earlier plan; if the draft fails checks, fix forward inside the same structure instead of restarting from scratch.
18
28
 
19
29
  ## Scenario 1: Fresh chat with brief goal
20
30
 
@@ -24,7 +34,7 @@ This file is the **target output spec** for eval-driven iteration of the `prompt
24
34
 
25
35
  **Q&A:** One AskUserQuestion with **2–4** questions covering: scope (which subtree), audience (human vs agent consumer), desired downstream output shape, and hard constraints (tests, CODE_RULES, deadlines). Populate options from discovery paths and package names.
26
36
 
27
- **Output:** Send audit line, then one `xml` fence with the full prompt, then stop—the handoff message is complete.
37
+ **Output:** After drafting, run the **Outcome preview** turn; then send `xml` fence and **Outcome digest**—the handoff message is complete.
28
38
 
29
39
  ## Scenario 2: Session handoff
30
40
 
@@ -34,7 +44,7 @@ This file is the **target output spec** for eval-driven iteration of the `prompt
34
44
 
35
45
  **Q&A:** One AskUserQuestion with **1–2** questions, e.g. “Rank these next actions for the new session” or “Exclude these areas from scope,” each with **2–4** concrete options drawn from the thread.
36
46
 
37
- **Output:** Send audit line, then one `xml` fence with the full prompt, then stop—the handoff message is complete.
47
+ **Output:** After drafting, run the **Outcome preview** turn; then send `xml` fence and **Outcome digest**.
38
48
 
39
49
  **Handoff prompt quality:** `<background>` must include the bullet lists above so a new session can continue with **zero** access to this chat. Quote decision text verbatim where precision matters.
40
50
 
@@ -48,13 +58,13 @@ This file is the **target output spec** for eval-driven iteration of the `prompt
48
58
 
49
59
  **Requirements checklist:** The generated XML must mention every user-stated requirement by name (timeouts, selectors, config extraction, TDD, CODE_RULES, test safety, etc.); if one is out of scope, put the reason in `<open_question>`.
50
60
 
51
- **Output:** Send audit line, then one `xml` fence with the full prompt, then stop—the handoff message is complete.
61
+ **Output:** After drafting, run the **Outcome preview** turn; then send `xml` fence and **Outcome digest**.
52
62
 
53
63
  ## Scenario 4: Noisy context, stable output
54
64
 
55
65
  **Trigger:** `/prompt-generator ...` after a long thread with unrelated topics, tool errors, or tangents.
56
66
 
57
- **Output shape:** Same as Scenario 1: audit line, one `xml` fence, immediate send boundary after the closing fence.
67
+ **Output shape:** Same as Scenario 1 for the final message: one `xml` fence, then **Outcome digest**.
58
68
 
59
69
  **Content focus:** Keep the generated XML aligned with the latest `/prompt-generator` request (e.g. “security-focused code review agent”). Populate the subagent brief from: the user’s literal request string, a **one-paragraph** summary of on-topic facts, and path-grounded discovery notes—leave stack traces, failed commands, and off-topic thread history out of that brief so they never reach the XML body.
60
70
 
@@ -62,17 +72,17 @@ This file is the **target output spec** for eval-driven iteration of the `prompt
62
72
 
63
73
  **Delegation:** Give the drafting subagent a **curated** brief under ~2k tokens when possible: request string + summary + discovery snippets—enough context to draft, without attaching the full raw transcript.
64
74
 
65
- ## Structural invariant A — Tool-free artifact tail
75
+ ## Structural invariant A — Tool-free artifact output
66
76
 
67
- - **Order:** discovery tool calls (when used) → AskUserQuestion → subagent (draft + internal audit) → **one** final assistant message.
68
- - **Final message composition:** That message is plain text only, in order: audit line → opening fence → XML body → closing fence → end-of-message. Run every `tool_use` in earlier turns; between the opening and closing fence, emit only the characters of the XML payload.
77
+ - **Order:** discovery tool calls (when used) → **AskUserQuestion** → subagent (draft + internal audit) → **Outcome preview** turn (`### Outcome preview` + **AskUserQuestion**) → optional refinement loops → **one** final assistant message.
78
+ - **Final message composition:** That message is plain text only, in order: opening ` ```xml ` fence → XML body → closing fence → `## Outcome digest` → end-of-message. Run every `tool_use` in earlier turns; between the opening and closing `xml` fence, emit only the characters of the XML payload.
69
79
 
70
80
  ## Structural invariant B — Fenced block closes cleanly
71
81
 
72
- - Use one opening ``` and one closing ``` for the artifact.
82
+ - Use one opening ``` and one closing ``` for the **xml** artifact.
73
83
  - Balance every XML tag; close `<instructions>`, `<background>`, etc. explicitly.
74
84
  - End each numbered step inside `<instructions>` with a complete sentence and a fully written list item.
75
- - The user can copy from the opening ``` through the closing ``` into a new file without manual repair.
85
+ - The user can copy from the opening ``` through the closing ``` of the **xml** fence into a new file without manual repair.
76
86
 
77
87
  ## Structural invariant C — Discovery before lock-in
78
88
 
@@ -90,11 +100,12 @@ This file is the **target output spec** for eval-driven iteration of the `prompt
90
100
  ## Structural invariant E — Render-survival for XML sections
91
101
 
92
102
  - **Problem (HTML):** Tag names used for prompt XML sections can overlap **HTML5 element names**. Chat renderers may treat those tokens as HTML and hide or alter the content between tags. High-risk examples include: `section`, `summary`, `details`, `header`, `footer`, `main`, `aside`, `article`, `nav`, `figure`. The former required name `context` matched an HTML element; **required** sections now use `<background>` for situational grounding so the name stays off that list. The raw assistant text may be complete while the **rendered** message looks like sections are missing.
93
- - **Problem (nested Markdown fences):** A ` ```bash ` (or other inner) line inside the outer ` ```xml ` block is still a line of text in the transcript, but many Markdown renderers treat it as **opening a nested code fence**, which **closes the outer fence early**. Everything after that point (including `</illustrations>` and other closing tags) can appear outside the code block or look “swallowed.” Hooks historically used a regex that stopped at the **first** triple-backtick line; `extract_fenced_xml_content` now walks inner fences (` ```lang ` … closing `` ``` ``) before accepting the outer `` ``` `` that ends the `xml` block.
103
+ - **Problem (nested Markdown fences):** A ` ```bash ` (or other inner) line inside the outer ` ```xml ` block is still a line of text in the transcript, but many Markdown renderers treat it as **opening a nested code fence**, which **closes the outer fence early**. Everything after that point (including `</illustrations>` and other closing tags) can appear outside the code block or look “swallowed.” `extract_fenced_xml_content` now walks inner fences (` ```lang ` … closing `` ``` ``) before accepting the outer `` ``` `` that ends the `xml` block.
104
+ - **Outcome digest:** Follow the sample formatting rules in SKILL.md section 7 inside `## Outcome digest` so a second outer ` ```xml ` block never appears—multiple `xml` fences concatenate in `extract_fenced_xml_content` and would corrupt clipboard copy.
94
105
  - **Primary mitigation:** When the fenced XML artifact **contains any tag whose local name is on the HTML-collision list**, or when the artifact is **large enough that render truncation is likely**, the orchestrator **must write the full artifact to a file** (default: under `data/prompts/` or a path the user supplied earlier) and **paste the absolute file path** in the chat message. Pair the path with a **short section inventory** confirming all five required sections (`role`, `background`, `instructions`, `constraints`, `output_format`) are present in the file.
95
- - **Authoring rules for code inside `<illustrations>` (orchestrator + drafting subagent see `SKILL.md` §7):** (1) Format each sample line with **four leading spaces** inside `<illustrations>` as the default for stable rendered chat. (2) **Or** use a **tilde fence**: `~~~` + optional language on the opening line, body, then `~~~` on its own line. (3) **Or** use a **complete triple-backtick pair** (opening `` ```lang `` line, body, closing `` ``` `` line); hooks and clipboard treat the pair as one unit inside the outer `` ```xml `` fence.
106
+ - **Authoring rules for code inside `<illustrations>`:** Follow the sample formatting rules in SKILL.md section 7. Hooks and clipboard treat complete triple-backtick pairs as one unit inside the outer `` ```xml `` fence.
96
107
  - **Fallback when file write is unavailable:** Escape the **opening angle bracket** of colliding tags (for example `&lt;section>` — user restores `<` when pasting) or use another distinctive wrapper **documented in the same message**, so the user can recover literal XML. State explicitly that the user should restore brackets when copying into another system.
97
- - **Structural safety net:** Regardless of renderer behavior, the **Stop hook section-presence gate** blocks any prompt-workflow response whose fenced XML is missing any required opening/closing section tag pair. Methodology: [Anthropic Agent Skills: evaluation and iteration](https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices#evaluation-and-iteration).
108
+ - **Structural safety net:** The **file-based validation loop** corrects missing section tag pairs before user output. The validator CLI enforces section presence, scope anchors, checklist rows, context-control signals, and positive phrasing.
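The nested-fence behavior described in this invariant can be illustrated with a small line walker. This is a sketch only, assuming that an inner fence always opens with an info string and closes with a bare fence line; `extract_outer_xml_body` is a hypothetical stand-in and is not the package's `extract_fenced_xml_content`.

```python
from typing import Optional

FENCE = "`" * 3  # built at runtime so this sample holds no literal fence line


def extract_outer_xml_body(message: str) -> Optional[str]:
    """Return the body of the first outer xml fence, keeping inner fence pairs whole."""
    body: list[str] = []
    inside_outer = False
    inside_inner = False
    for line in message.splitlines():
        stripped = line.strip()
        if not inside_outer:
            if stripped == FENCE + "xml":
                inside_outer = True
            continue
        if stripped.startswith(FENCE):
            info = stripped[len(FENCE):].strip()
            if inside_inner:
                if not info:
                    inside_inner = False  # bare fence closes the inner pair
                body.append(line)
                continue
            if info:
                inside_inner = True  # fence with an info string opens an inner pair
                body.append(line)
                continue
            return "\n".join(body)  # bare fence outside any inner pair ends the outer block
        body.append(line)
    return None  # the outer fence never closed


sample = "\n".join([
    FENCE + "xml", "<illustrations>", FENCE + "bash", "echo hi", FENCE,
    "</illustrations>", FENCE, "## Outcome digest",
])
print(extract_outer_xml_body(sample))
```

A regex that stops at the first bare fence line would return only the text up to `echo hi`; walking the inner pair first is what keeps `</illustrations>` inside the extracted body.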
98
109
 
99
110
  ## XML artifact (minimum sections)
100
111
 
@@ -106,8 +117,17 @@ Include at least:
106
117
  - `<constraints>...</constraints>`
107
118
  - `<output_format>...</output_format>`
108
119
 
109
- Add `<illustrations>` when format or tone is easy to misunderstand; nest sections when the task has natural hierarchy. **Long code samples belong in `<illustrations>`** using the same ordered choices as Structural invariant E: four-space-indented lines first, then tilde fences, then a complete triple-backtick pair if the brief requires backtick fences (see `SKILL.md` §7).
120
+ Add `<illustrations>` when format or tone is easy to misunderstand; nest sections when the task has natural hierarchy. **Long code samples belong in `<illustrations>`**; follow the sample formatting rules in SKILL.md section 7.
121
+
122
+
123
+ ## File-based validation loop (primary enforcement)
124
+
125
+ The subagent writes the complete artifact (fenced XML + Outcome digest + hook validation block) to `data/prompts/.draft-prompt.xml`, runs the validator, and fixes violations (exit 2) until exit 0. The orchestrator then strips the hook validation block and outputs fence + digest.
126
+
127
+ python packages/claude-dev-env/hooks/blocking/prompt_workflow_validate.py data/prompts/.draft-prompt.xml
128
+
129
+ The same checks are available as a Python function via `from prompt_workflow_validate import validate_prompt_workflow` for integration in tests and tooling.
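The loop described above (write draft, run validator, fix, repeat until exit 0) might be driven as in the sketch below. The CLI path, draft path, and exit codes come from this section; the `max_rounds` cap, the `fix` callback, and the capture of both stdout and stderr are assumptions of this sketch, since the diff does not say where the validator reports violations or whether the loop is bounded.

```python
import subprocess
import sys
from typing import Callable

VALIDATOR = "packages/claude-dev-env/hooks/blocking/prompt_workflow_validate.py"
DRAFT = "data/prompts/.draft-prompt.xml"


def run_validator(draft_path: str = DRAFT) -> tuple[int, str]:
    """Run the validator CLI once and return (exit code, combined output)."""
    completed = subprocess.run(
        [sys.executable, VALIDATOR, draft_path],
        capture_output=True,
        text=True,
    )
    return completed.returncode, completed.stdout + completed.stderr


def validate_until_clean(
    run: Callable[[], tuple[int, str]],
    fix: Callable[[str], None],
    max_rounds: int = 5,  # assumed cap; the source does not state one
) -> bool:
    """Re-run validation, applying fix(report) after each failure, until exit 0."""
    for _ in range(max_rounds):
        exit_code, report = run()
        if exit_code == 0:
            return True
        fix(report)  # hypothetical hook: edit the draft file based on the report
    return False
```

Passing `run_validator` as `run` wires the loop to the real CLI; injecting a fake runner keeps the loop testable without the package installed.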
110
130
 
111
131
  ## Internal 15-row compliance checklist (audit numerator)
112
132
 
113
- The `15` in `Audit: pass 15/15` maps to the named rows in `SKILL.md` (§11 **Compliance audit — 15-row checklist**), including `reversible_action_and_safety_check_guidance` and `scope_terms_explicit_and_anchored`. **Default user path:** keep the table internal; print the expanded table + JSON only after an explicit debug request. On failure, set the audit line to `Audit: fail N/15 — [primary theme]` where the theme names one concrete gap (e.g. `scope_block missing completion_boundary`, `output_format lacks acceptance checks`).
133
+ The 15-row compliance audit maps to the named rows in `SKILL.md` §11. **Default user path:** keep the table internal; print the expanded table + JSON only after an explicit debug request.