npm - @pieerry/harness-kit - Versions diffs - 3.1.2 → 3.3.1 - Mend

@pieerry/harness-kit 3.1.2 → 3.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (64) hide show

package/.claude/plugins/product-manager/evals/prd-quality.md CHANGED Viewed

@@ -9,21 +9,21 @@ Score each dimension 0-10. Cite line numbers when scoring below 7. Weighted tota
 ## Rubric
 ### Clarity (weight 20%)
-Is the problem stated in 1-2 sentences? Names who suffers, how often, what it costs?
+Problem stated in 1-2 sentences? Names who suffers, how often, what it costs?
 - 10: crisp, specific, evidence-backed
 - 5: present but generic
 - 0: missing or vague
 ### Hypothesis (weight 15%)
-Is there an "If we X, then Y will Z, because W" hypothesis with numeric target?
+"If we X, then Y will Z, because W" hypothesis with numeric target?
 - 10: falsifiable, numeric, evidence-tied
 - 5: directional but missing target or evidence
 - 0: aspirational, no measurable claim
 ### Customer specificity (weight 10%)
-Real customers named with reasons each one matters?
+Real customers named with reasons each matters?
 - 10: concrete, differentiated
 - 5: segments named but thin

package/.claude/plugins/product-manager/evals/prd-readiness.md CHANGED Viewed

@@ -3,7 +3,7 @@
 Type: rule-based stage gate
 Mode: advisory
-Decides whether a PRD has enough content for its declared stage to advance. Run after prd-quality passes.
+Decides whether PRD has enough content for declared stage to advance. Run after prd-quality passes.
 ## Stage gates

package/.claude/plugins/product-manager/evals/prp-context-readiness.md CHANGED Viewed

@@ -22,11 +22,11 @@ If any fail, abort and return sensor feedback.
 ## Phase 2: executor simulation
-You are a coding agent that has never seen this codebase. You are handed only this PRP. You may not ask the PM follow-up questions. Read it end-to-end and answer:
+You are coding agent that has never seen this codebase. Handed only this PRP. Cannot ask PM follow-ups. Read end-to-end and answer:
-1. Could you ship with only what is in the PRP plus linked files? (yes / partial / no)
-2. List every place where you would need to stop and ask a human.
-3. List every claim that looks plausible but you cannot verify from the PRP alone.
+1. Could you ship with only what is in PRP plus linked files? (yes / partial / no)
+2. List every place you would need to stop and ask a human.
+3. List every claim looking plausible but you cannot verify from PRP alone.
 4. Rate one-shot-success likelihood from 0 to 1.
 ## Output format
@@ -48,4 +48,4 @@ PM confirms or requests another iteration.
 Phase 1 fails: re-run failing sensor, regenerate failed sections.
 Phase 2 fails criteria: regenerate weakest section based on blocking_questions.
-Phase 3 rejects: collect PM feedback, retry the loop.
+Phase 3 rejects: collect PM feedback, retry loop.

package/.claude/plugins/product-manager/evals/prp-quality.md CHANGED Viewed

@@ -30,7 +30,7 @@ Blueprint specific enough that executor never decides naming, file location, or
 - 0: prose handwaving
 ### Validation executability (weight 15%)
-Validation gates are real commands the executor copy-pastes? Manual checks specific?
+Validation gates are real commands executor copy-pastes? Manual checks specific?
 - 10: copy-paste ready
 - 5: commands with placeholders

package/.claude/plugins/product-manager/guides/pipeline.md CHANGED Viewed

@@ -1,13 +1,13 @@
 # Pipeline
-Shared rules for PRD and PRP generation. Edit retry, approval, publish, and token accounting here.
+Shared rules for PRD and PRP generation. Edit retry, approval, publish, token accounting here.
 ## Stages per artifact
-1. Generate the artifact using the template in skills/{prd|prp}/SKILL.md.
+1. Generate artifact using template in skills/{prd|prp}/SKILL.md.
 2. Apply sensors listed in skills/{prd|prp}/SKILL.md.
 3. Apply evals listed in skills/{prd|prp}/SKILL.md.
-4. Append the approval marker. Publish hook fires.
+4. Append approval marker. Publish hook fires.
 ## Retry policy
@@ -18,7 +18,7 @@ Trigger on:
 - eval weighted total below threshold (default 8.0)
 Strategy:
-1. Read the feedback.
+1. Read feedback.
 2. Regenerate only failed or low-scoring sections.
 3. Re-run sensors and eval.
@@ -42,7 +42,7 @@ Triggers hooks/post-eval-{prd|prp}.sh.
 Local: file stays in outputs/{prd|prp}/.
-Confluence: fires if JIRA_USERNAME and JIRA_API_TOKEN are set. Calls scripts/confluence-publish.py. Missing creds skips silently.
+Confluence: fires if JIRA_USERNAME and JIRA_API_TOKEN set. Calls scripts/confluence-publish.py. Missing creds skips silently.
 After publish, hook appends:
 ```
@@ -51,34 +51,34 @@ After publish, hook appends:
 ## Token accounting
-Each phase brackets a measurable token window. Phases for this plugin:
+Each phase brackets measurable token window. Phases:
 - prd-generate: from skill invocation to first save in outputs/prd/
-- prd-validate: from first save to approval marker (covers sensors and evals)
+- prd-validate: from first save to approval marker (sensors + evals)
 - prp-generate: from skill invocation to first save in outputs/prp/
 - prp-validate: from first save to approval marker
-Markers live in outputs/.markers/{feature_id}.{phase}.{start|end}, each with `{"timestamp": ISO, "session_id": ""}`. Skill writes the .start marker; hooks write the .end markers.
+Markers in outputs/.markers/{feature_id}.{phase}.{start|end}, each `{"timestamp": ISO, "session_id": ""}`. Skill writes .start; hooks write .end.
-After each artifact's eval passes, the publish hook runs scripts/token-phase.py for both of its phases. The script reads the Claude transcript JSONL, sums usage tokens (input, output, cache_read, cache_creation) within each window, and appends a phase entry to outputs/tokens/{feature_id}.json. Markers are deleted after consumption.
+After eval passes, publish hook runs scripts/token-phase.py for both phases. Script reads Claude transcript JSONL, sums usage tokens (input, output, cache_read, cache_creation) within each window, appends phase entry to outputs/tokens/{feature_id}.json. Markers deleted after consumption.
-The post-eval hook also appends an inline summary comment to the artifact:
+Post-eval hook also appends inline summary to artifact:
 ```
 <!-- tokens: outputs/tokens/{feature_id}.json in={N} out={N} cache_r={N} -->
 ```
-Future phases (dev, code review, launch) append new entries to the same outputs/tokens/{feature_id}.json by reusing the feature_id slug.
+Future phases (dev, code review, launch) append entries to same outputs/tokens/{feature_id}.json by reusing feature_id slug.
-If the transcript is not readable, the script logs a warning and exits 0. Token accounting never blocks publish.
+If transcript not readable, script logs warning and exits 0. Token accounting never blocks publish.
 ## Orchestrator order
 1. PRD: stages 1-4.
-2. hooks/pre-prp-check.sh validates the PRD approved marker.
+2. hooks/pre-prp-check.sh validates PRD approved marker.
 3. PRP: stages 1-4.
 4. Return summary.
 ## Stop conditions
-- 3 failed attempts on the same artifact
+- 3 failed attempts on same artifact
 - missing required input after one clarification
 - pre-prp-check.sh denies PRP start

package/.claude/plugins/product-manager/guides/prd-guidelines.md CHANGED Viewed

@@ -2,11 +2,11 @@
 Rules specific to PRDs. General PM rules in product-guidelines.md.
-Business only. Zero technical detail. File paths, APIs, schemas belong in the PRP.
+Business only. Zero technical detail. File paths, APIs, schemas belong in PRP.
 User stories: As a {persona}, I want {action}, so that {benefit}.
-Metrics need baseline, target, horizon in every row. No empty cells.
+Metrics need baseline, target, horizon per row. No empty cells.
 Mermaid only, never ASCII.
@@ -18,10 +18,10 @@ Word count by stage:
 - Solution Review: 800-1200
 - Launch Readiness: 1500-2000
-Required sections at each stage:
+Required sections per stage:
 - Team Kickoff: 1, 2, 8
 - Planning Review: + 5
 - Solution Review: + 4, 7
 - Launch Readiness: all
-Stage progression: do not skip stages. Each gate adds requirements, never removes them.
+Stage progression: do not skip stages. Each gate adds requirements, never removes.

package/.claude/plugins/product-manager/guides/product-guidelines.md CHANGED Viewed

@@ -1,23 +1,23 @@
 # Product Guidelines
-How PMs think about product. Read before drafting a PRD.
+How PMs think about product. Read before drafting PRD.
-## We ship outcomes, not features
+## Ship outcomes, not features
-Every PRD touches a real user or customer cohort. Always answer:
-- Who pays for the pain today?
+Every PRD touches real user or customer cohort. Always answer:
+- Who pays for pain today?
 - Whose workflow changes?
-- What metric in the customer's business or experience moves?
+- What metric in customer's business or experience moves?
-If you cannot answer all three, the PRD is not ready.
+If you cannot answer all three, PRD not ready.
 ## Squads
-Defined per org in `context-library/business-info.md` and `context-library/squads/`. Multi-squad PRDs: name a primary squad, call out cross-squad interfaces.
+Defined per org in `context-library/business-info.md` and `context-library/squads/`. Multi-squad PRDs: name primary squad, call out cross-squad interfaces.
 ## Customer hierarchy
-When prioritizing, make trade-offs between cohorts explicit, not invisible. A common ranking template (override per org):
+When prioritizing, make trade-offs between cohorts explicit, not invisible. Common ranking template (override per org):
 1. Paying customers with high operational impact
 2. Reliability partners (SLAs, contracts)
@@ -26,9 +26,9 @@ When prioritizing, make trade-offs between cohorts explicit, not invisible. A co
 ## Metrics
-Prefer metrics already on existing dashboards. New metrics need an owner to instrument.
+Prefer metrics already on existing dashboards. New metrics need owner to instrument.
-Strong examples (the shape of a strong metric — specific, owned, instrumented):
+Strong examples (shape of strong metric — specific, owned, instrumented):
 - error rate of X by segment
 - first-attempt success rate
 - latency P95 of Y action
@@ -40,17 +40,17 @@ Weak:
 - "Improved efficiency" with no number
 - "Reduced friction" with no baseline
-## Non-goals are mandatory
+## Non-goals mandatory
-Every PRD lists at least one non-goal. If you cannot think of one, you have not scoped the feature.
+Every PRD lists at least one non-goal. If you cannot think of one, you have not scoped feature.
-## Rollout is not "ship it"
+## Rollout not "ship it"
 Every PRD ends with phases, audience per phase, pass criteria per phase, rollback. Feature flags by default.
 ## Kill criteria
-Decide before launch what makes you turn the feature off. Examples:
+Decide before launch what makes you turn feature off. Examples:
 - error rate above X%
 - latency P95 above Yms
 - adoption below Z% after N days
@@ -59,7 +59,7 @@ Decide before launch what makes you turn the feature off. Examples:
 In PRD:
 - why we are building
-- who feels the pain
+- who feels pain
 - success metric and target
 - open questions about scope
@@ -68,7 +68,7 @@ In PRP:
 - API endpoints, validation commands
 - open questions about implementation
-If the question is "should we build this", PRD. If it is "how do we build this", PRP.
+If question is "should we build this", PRD. If "how do we build this", PRP.
 ## Diagrams

package/.claude/plugins/product-manager/guides/prp-guidelines.md CHANGED Viewed

@@ -1,32 +1,32 @@
 # PRP Guidelines
-A PRP is a PRD plus curated codebase intelligence plus an agent runbook. If the executor has to leave the document to figure out what to do, the PRP is incomplete.
+PRP = PRD + curated codebase intelligence + agent runbook. If executor leaves document to figure out what to do, PRP incomplete.
 ## Three pillars
 ### Context
-Everything the executor needs, included or linked. Specific:
+Everything executor needs, included or linked. Specific:
 - file paths with line numbers when localized
 - existing patterns to mimic, one concrete example each
-- external docs scoped to the relevant section
+- external docs scoped to relevant section
 - known gotchas (concurrency, idempotency, retries, timezones)
-If the codebase has a similar feature already, point to it. Coding agents do their best work by analogy.
+If codebase has similar feature, point to it. Coding agents work best by analogy.
 ### Implementation detail
 Pseudocode or numbered steps concrete enough to translate one-to-one. Include:
 - class names, method signatures, return types where decided
-- schema changes with the actual migration file or DDL
+- schema changes with actual migration file or DDL
 - endpoint definitions: method, path, request/response shape
 - logging, metrics, alert hooks
-You do not write the code. You remove every "what should I name this" decision.
+You do not write code. You remove every "what should I name this" decision.
 ### Validation gates
-Real commands the executor runs and passes. Vague checks fail.
+Real commands executor runs and passes. Vague checks fail.
 Bad:
 - "Make sure tests pass"
@@ -36,29 +36,29 @@ Good:
 - `npm --prefix web run typecheck`
 - `./gradlew :billing-service:detekt`
-Manual gates count, but must be specific:
-- [ ] Open `https://staging.example.com/invoices/new` and verify the deadline badge renders for region `EU-WEST`.
+Manual gates count, must be specific:
+- [ ] Open `https://staging.example.com/invoices/new` and verify deadline badge renders for region `EU-WEST`.
-## A good PRP
+## Good PRP
 1. Executor reads top-to-bottom, never tabs out except to view linked files.
-2. No decisions left for the executor that change scope.
-3. Every TODO has an owner.
-4. Validation block is copy-pasteable.
+2. No decisions left for executor that change scope.
+3. Every TODO has owner.
+4. Validation block copy-pasteable.
 ## Anti-patterns
 - Prose mush where bullets would do
-- Restating the PRD instead of linking
+- Restating PRD instead of linking
 - Hidden scope ("also we should refactor X")
 - Optimism about state ("there is probably a helper for this already")
 - Fabricated file paths or class names
-## When to split a PRP
+## When to split PRP
 Split if:
 - scope crosses two repos with independent deploy cycles
 - one PR would exceed ~600 lines
 - independent validation gates pass or fail separately
-Otherwise keep it one. Context fragmentation costs more than a slightly longer doc.
+Otherwise keep one. Context fragmentation costs more than slightly longer doc.

package/.claude/plugins/product-manager/guides/writing-style.md CHANGED Viewed

@@ -1,13 +1,13 @@
 # Writing Style
-How PRDs and PRPs read .
+How PRDs and PRPs read.
 ## Voice
-- Direct. Lead with the point.
+- Direct. Lead with point.
 - Specific. Real numbers, real names, real quotes. "47% abandon at step 3" beats "many users struggle".
 - Conversational, professional. Contractions ok. Use "we", not "I".
-- Action-oriented. Every section helps a reader decide or act.
+- Action-oriented. Every section helps reader decide or act.
 ## Length
@@ -32,21 +32,21 @@ Dead giveaways for AI fluff:
 ## Punctuation
-- No em-dashes. Use commas, periods, parentheses, or a new sentence.
+- No em-dashes. Use commas, periods, parentheses, or new sentence.
 - Oxford comma: yes.
-- One space after a period.
+- One space after period.
 ## Negative parallelism
-Avoid. Lead with the positive:
+Avoid. Lead with positive:
 - bad: "Don't use X. Use Y."
 - good: "Use Y."
 ## AI-detector test
-If a paragraph sounds like a LinkedIn post, rewrite it. Real human writing:
+If paragraph sounds like LinkedIn post, rewrite. Real human writing:
 - occasional sentence starting with "And" or "But"
-- specific details a generic writer would not know
+- specific details generic writer would not know
 - slightly imperfect rhythm
 - mid-paragraph asides
@@ -64,7 +64,7 @@ Not:
 ## Tables
-Use when you have comparison across 3+ items, structured data with same fields per row, or decision matrices. Otherwise use bullets.
+Use when you have comparison across 3+ items, structured data with same fields per row, or decision matrices. Otherwise bullets.
 ## Diagrams

package/.claude/plugins/product-manager/sensors/prd-acceptance-criteria.md CHANGED Viewed

@@ -16,23 +16,23 @@ Mode: hard gate
 ## Checks
-User story format. Every bullet in Solution Overview that starts with "As a" must follow:
+User story format. Every Solution Overview bullet starting with "As a" must follow:
 ```
 - As a {persona}, I want {action}, so that {benefit}.
 ```
-Metric completeness. Success Metrics table: every row must have non-empty Baseline, Target, Horizon. Cells with only "TBD", "N/A", "?", or empty fail.
+Metric completeness. Success Metrics table: every row needs non-empty Baseline, Target, Horizon. Cells with only "TBD", "N/A", "?", or empty fail.
-Guardrails block. A "Guardrails" line must exist in Success Metrics.
+Guardrails block. "Guardrails" line must exist in Success Metrics.
-Kill criteria. A "Kill criteria:" line must exist in Success Metrics and contain at least one digit.
+Kill criteria. "Kill criteria:" line must exist in Success Metrics and contain at least one digit.
 Rollout phases. Rollout table needs at least 2 rows.
-Non-goals listed. Scope and Non-Goals must have a Non-goals subsection with 1-3 bullets.
+Non-goals listed. Scope and Non-Goals needs Non-goals subsection with 1-3 bullets.
-Open gaps budget. Count of `NOT FOUND` markers must be 5 or fewer.
+Open gaps budget. `NOT FOUND` marker count must be 5 or fewer.
 ## On failure

package/.claude/plugins/product-manager/sensors/prd-structure.md CHANGED Viewed

@@ -3,7 +3,7 @@
 Type: deterministic
 Mode: hard gate
-## Required sections (all must be present, in order)
+## Required sections (all present, in order)
 - Problem and Hypothesis
 - Customers
@@ -36,4 +36,4 @@ Mode: hard gate
 ## On failure
-Block publish. Return list of missing sections, forbidden tokens, rule violations. Agent regenerates only failed parts.
+Block publish. Return missing sections, forbidden tokens, rule violations. Agent regenerates failed parts only.

package/.claude/plugins/product-manager/sensors/prp-context-quality.md CHANGED Viewed

@@ -3,7 +3,7 @@
 Type: deterministic plus heuristic
 Mode: hard gate
-A PRP must give an executor enough context to ship without coming back to ask.
+PRP must give executor enough context to ship without coming back to ask.
 ## Required sections
@@ -13,9 +13,9 @@ A PRP must give an executor enough context to ship without coming back to ask.
 ## Checks within Context
-File paths referenced. Repos and files touched table must reference at least 2 real file paths. A file path has `/` and a known extension (java, kt, ts, tsx, js, jsx, vue, py, sql, yaml, yml, md, sh, gradle).
+File paths referenced. Repos and files touched table must reference at least 2 real file paths. File path has `/` and known extension (java, kt, ts, tsx, js, jsx, vue, py, sql, yaml, yml, md, sh, gradle).
-Patterns with concrete examples. Patterns to follow must have at least 1 pattern. Each pattern must include an `Example in codebase:` line pointing to a concrete path.
+Patterns with concrete examples. Patterns to follow needs at least 1 pattern. Each pattern must include `Example in codebase:` line pointing to concrete path.
 Known gotchas. Subsection `### Known gotchas` must exist with at least 1 bullet.
@@ -23,9 +23,9 @@ External documentation. Subsection `### External documentation` must exist. Eith
 ## Checks within Implementation blueprint
-Numbered steps in a fenced code block.
+Numbered steps in fenced code block.
-At least 1 numbered step references a class, file, or command (uppercase or path-shaped token).
+At least 1 numbered step references class, file, or command (uppercase or path-shaped token).
 ## Checks within What
@@ -35,7 +35,7 @@ Success criteria must include at least 1 line matching:
 ## Open gaps budget
-Count of `NEEDS REVIEW`, `NOT FOUND`, or `TBD` markers must be 5 or fewer.
+`NEEDS REVIEW`, `NOT FOUND`, or `TBD` marker count must be 5 or fewer.
 ## On failure

package/.claude/plugins/product-manager/sensors/prp-links.md CHANGED Viewed

@@ -7,7 +7,7 @@ Implemented in scripts/link-validator.py. Same rules when agent self-checks.
 ## Hard checks (block)
-Source PRD resolvable. The "Source PRD:" line must point to a path that exists under outputs/prd/, relative to the PRP file, or relative to the repo root.
+Source PRD resolvable. "Source PRD:" line must point to path existing under outputs/prd/, relative to PRP file, or relative to repo root.
 No localhost URLs. `http://localhost` or `http://127.0.0.1` not allowed as pinned references.
@@ -15,7 +15,7 @@ GitHub permalinks. Links like `github.com/.../blob/{main|master|develop}/...` no
 ## Soft checks (warn)
-Path format. Inline code spans that look like file paths (have `/` and end with extension) should end in a recognized extension.
+Path format. Inline code spans looking like file paths (have `/` and end with extension) should end in recognized extension.
 URL syntax. http/https URLs must be syntactically valid.

package/.claude/plugins/product-manager/sensors/prp-structure.md CHANGED Viewed

@@ -3,7 +3,7 @@
 Type: deterministic
 Mode: hard gate
-## Required sections (all must be present, in order)
+## Required sections (all present, in order)
 - Goal
 - Why

package/.claude/plugins/staff-software-engineer/evals/dev-quality.md ADDED Viewed

@@ -0,0 +1,48 @@
+# Eval: Dev Quality
+Type: LLM-judge
+Mode: quality gate
+Threshold: weighted total >= 8.0
+Score each dimension 0-10. Cite file paths or commit SHAs when below 7. Weighted total = sum(score x weight%) / 100.
+## Rubric
+### Plan fidelity (weight 25%)
+Did implementation cover every step in source plan? Any scope creep or missed items?
+### Files touched specificity (weight 15%)
+Real, concrete paths listed (not placeholders)? Match what plan named?
+### Commit hygiene (weight 20%)
+Conventional Commits prefix correct. Small commits (1-4 files, < 100 lines ideal). One concern per commit. Messages explain why, not what.
+### Test coverage (weight 20%)
+Every new feature/bugfix has matching tests. Edge cases covered. Tests run before commit.
+### Convention adherence (weight 10%)
+Coding style, framework version, package layout match `coding-style.md` and `conventions/{area}.md`.
+### Blocker quality (weight 10%)
+If blockers exist, specific (file:line, error message). If none, state `none` explicitly.
+## On failure (total below 8.0)
+Retry. Regenerate weakest section only. Max 3 attempts.
+## Output
+```json
+{
+  "scores": {
+    "plan_fidelity": 0,
+    "files_touched": 0,
+    "commit_hygiene": 0,
+    "test_coverage": 0,
+    "convention_adherence": 0,
+    "blocker_quality": 0
+  },
+  "weighted_total": 0.0,
+  "feedback": ["dimension: specific issue with file or sha ref"]
+}
+```

package/.claude/plugins/staff-software-engineer/evals/plan-quality.md CHANGED Viewed

@@ -9,22 +9,22 @@ Score each dimension 0-10. Cite line numbers when below 7. Weighted total = sum(
 ## Rubric
 ### Scope clarity (weight 20%)
-Is what is being built clear? Tied to the source PRP?
+What is being built clear? Tied to source PRP?
 ### Files touched specificity (weight 20%)
-Are the files and modules concrete? Real paths, not placeholders?
+Files and modules concrete? Real paths, not placeholders?
 ### Execution flow (weight 20%)
-Are the steps actionable in order? Could a fresh engineer follow them?
+Steps actionable in order? Could fresh engineer follow them?
 ### Risk awareness (weight 15%)
-Are real risks called out with mitigations?
+Real risks called out with mitigations?
 ### Rollout (weight 15%)
-Is there a phased plan or feature flag strategy?
+Phased plan or feature flag strategy?
 ### Tests (weight 10%)
-Does the plan name the test cases to cover?
+Plan names test cases to cover?
 ## On failure (total below 8.0)

package/.claude/plugins/staff-software-engineer/evals/pr-quality.md ADDED Viewed

@@ -0,0 +1,48 @@
+# Eval: PR Quality
+Type: LLM-judge
+Mode: quality gate
+Threshold: weighted total >= 8.0
+Score each dimension 0-10. Cite line refs from PR body when below 7. Weighted total = sum(score x weight%) / 100.
+## Rubric
+### Title quality (weight 20%)
+Conventional Commits prefix correct (`feat`, `fix`, `chore`, etc). Scope present if relevant. <= 70 chars. Concrete, not vague.
+### Summary clarity (weight 20%)
+Bullets explain what changed and why. Reviewer can grasp change without opening files. No marketing language.
+### Test plan completeness (weight 20%)
+Markdown checklist covers golden path and edge cases. Items are checkable actions, not vague ("test it works").
+### Links and refs (weight 15%)
+Source plan and dev paths linked. Ticket id (e.g. `PROJ-123`) referenced if branch carried one.
+### Risk callouts (weight 15%)
+Migration risks, feature flags, rollback plan named when applicable. Marked `none` if truly nothing.
+### Draft / readiness signal (weight 10%)
+Draft status matches stage of work. If `--ready`, CI gates passed and reviewers assignable. If draft, what's still pending is stated.
+## On failure (total below 8.0)
+Retry. Regenerate weakest section only. Max 3 attempts.
+## Output
+```json
+{
+  "scores": {
+    "title_quality": 0,
+    "summary_clarity": 0,
+    "test_plan_completeness": 0,
+    "links_and_refs": 0,
+    "risk_callouts": 0,
+    "draft_readiness": 0
+  },
+  "weighted_total": 0.0,
+  "feedback": ["dimension: specific issue with body line ref"]
+}
+```