npm - hatch3r - Versions diffs - 1.7.1 → 1.7.5 - Mend

hatch3r 1.7.1 → 1.7.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150) hide show

package/README.md +37 -11
package/agents/hatch3r-a11y-auditor.md +4 -0
package/agents/hatch3r-architect.md +4 -0
package/agents/hatch3r-ci-watcher.md +4 -0
package/agents/hatch3r-context-rules.md +4 -0
package/agents/hatch3r-creator.md +4 -0
package/agents/hatch3r-dependency-auditor.md +4 -0
package/agents/hatch3r-devops.md +4 -0
package/agents/hatch3r-docs-writer.md +4 -0
package/agents/hatch3r-fixer.md +4 -0
package/agents/hatch3r-handoff-loader.md +243 -0
package/agents/hatch3r-handoff-preparer.md +134 -0
package/agents/hatch3r-implementer.md +4 -0
package/agents/hatch3r-learnings-loader.md +4 -0
package/agents/hatch3r-lint-fixer.md +4 -0
package/agents/hatch3r-perf-profiler.md +8 -0
package/agents/hatch3r-researcher.md +4 -0
package/agents/hatch3r-reviewer.md +92 -0
package/agents/hatch3r-security-auditor.md +24 -0
package/agents/hatch3r-test-writer.md +4 -0
package/agents/modes/requirements-elicitation.md +4 -1
package/agents/modes/similar-implementation.md +6 -0
package/agents/modes/user-flows.md +76 -0
package/agents/shared/quality-charter.md +128 -0
package/commands/hatch3r-board-fill.md +1 -0
package/commands/hatch3r-create.md +2 -0
package/commands/hatch3r-handoff.md +126 -0
package/commands/hatch3r-pr-resolve.md +4 -0
package/commands/hatch3r-quick-change.md +4 -2
package/commands/hatch3r-workflow.md +2 -0
package/dist/cli/index.js +2321 -460
package/dist/cli/index.js.map +1 -1
package/package.json +4 -2
package/rules/hatch3r-accessibility-standards.md +21 -0
package/rules/hatch3r-accessibility-standards.mdc +21 -0
package/rules/hatch3r-agent-orchestration.md +9 -1
package/rules/hatch3r-agent-orchestration.mdc +9 -1
package/rules/hatch3r-ai-evals.md +158 -0
package/rules/hatch3r-ai-evals.mdc +154 -0
package/rules/hatch3r-ai-ux-patterns.md +131 -0
package/rules/hatch3r-ai-ux-patterns.mdc +127 -0
package/rules/hatch3r-api-design.md +67 -9
package/rules/hatch3r-api-design.mdc +67 -9
package/rules/hatch3r-api-versioning.md +119 -0
package/rules/hatch3r-api-versioning.mdc +115 -0
package/rules/hatch3r-auth-patterns.md +170 -0
package/rules/hatch3r-auth-patterns.mdc +166 -0
package/rules/hatch3r-component-conventions.md +30 -0
package/rules/hatch3r-component-conventions.mdc +30 -0
package/rules/hatch3r-container-hardening.md +131 -0
package/rules/hatch3r-container-hardening.mdc +127 -0
package/rules/hatch3r-contract-testing.md +117 -0
package/rules/hatch3r-contract-testing.mdc +113 -0
package/rules/hatch3r-deep-context.md +2 -0
package/rules/hatch3r-deep-context.mdc +2 -0
package/rules/hatch3r-dependency-management.md +73 -1
package/rules/hatch3r-dependency-management.mdc +72 -0
package/rules/hatch3r-design-system-detection.md +142 -0
package/rules/hatch3r-design-system-detection.mdc +138 -0
package/rules/hatch3r-event-schema-evolution.md +90 -0
package/rules/hatch3r-event-schema-evolution.mdc +86 -0
package/rules/hatch3r-handoff-readiness.md +45 -0
package/rules/hatch3r-handoff-readiness.mdc +40 -0
package/rules/hatch3r-i18n.md +13 -0
package/rules/hatch3r-i18n.mdc +13 -0
package/rules/hatch3r-migrations.md +61 -16
package/rules/hatch3r-migrations.mdc +61 -16
package/rules/hatch3r-observability-logging.md +1 -1
package/rules/hatch3r-observability-logging.mdc +1 -1
package/rules/hatch3r-observability-metrics.md +1 -1
package/rules/hatch3r-observability-metrics.mdc +1 -1
package/rules/hatch3r-observability-tracing-detail.md +1 -1
package/rules/hatch3r-observability-tracing-detail.mdc +1 -1
package/rules/hatch3r-observability-tracing.md +1 -1
package/rules/hatch3r-observability-tracing.mdc +1 -1
package/rules/hatch3r-observability.md +1 -0
package/rules/hatch3r-observability.mdc +1 -0
package/rules/hatch3r-operability.md +149 -0
package/rules/hatch3r-operability.mdc +145 -0
package/rules/hatch3r-passkey-server.md +181 -0
package/rules/hatch3r-passkey-server.mdc +177 -0
package/rules/hatch3r-progressive-delivery.md +120 -0
package/rules/hatch3r-progressive-delivery.mdc +116 -0
package/rules/hatch3r-resilience-patterns.md +154 -0
package/rules/hatch3r-resilience-patterns.mdc +150 -0
package/rules/hatch3r-secrets-management.md +29 -0
package/rules/hatch3r-secrets-management.mdc +29 -0
package/rules/hatch3r-testing.md +139 -43
package/rules/hatch3r-testing.mdc +139 -43
package/rules/hatch3r-ux-states-and-flows.md +149 -0
package/rules/hatch3r-ux-states-and-flows.mdc +145 -0
package/skills/hatch3r-a11y-audit/SKILL.md +14 -0
package/skills/hatch3r-ai-feature/SKILL.md +134 -0
package/skills/hatch3r-api-spec/SKILL.md +5 -0
package/skills/hatch3r-architecture-review/SKILL.md +14 -0
package/skills/hatch3r-bug-fix/SKILL.md +5 -0
package/skills/hatch3r-ci-pipeline/SKILL.md +14 -0
package/skills/hatch3r-cli-aichat/SKILL.md +84 -0
package/skills/hatch3r-cli-ast-grep/SKILL.md +85 -0
package/skills/hatch3r-cli-az-devops/SKILL.md +89 -0
package/skills/hatch3r-cli-bat/SKILL.md +85 -0
package/skills/hatch3r-cli-comby/SKILL.md +85 -0
package/skills/hatch3r-cli-csvkit/SKILL.md +84 -0
package/skills/hatch3r-cli-delta/SKILL.md +86 -0
package/skills/hatch3r-cli-difftastic/SKILL.md +84 -0
package/skills/hatch3r-cli-docker/SKILL.md +89 -0
package/skills/hatch3r-cli-duckdb/SKILL.md +84 -0
package/skills/hatch3r-cli-fd/SKILL.md +85 -0
package/skills/hatch3r-cli-fzf/SKILL.md +84 -0
package/skills/hatch3r-cli-gh/SKILL.md +90 -0
package/skills/hatch3r-cli-glab/SKILL.md +89 -0
package/skills/hatch3r-cli-jq/SKILL.md +85 -0
package/skills/hatch3r-cli-lazygit/SKILL.md +78 -0
package/skills/hatch3r-cli-llm/SKILL.md +84 -0
package/skills/hatch3r-cli-miller/SKILL.md +84 -0
package/skills/hatch3r-cli-mods/SKILL.md +84 -0
package/skills/hatch3r-cli-overview/SKILL.md +60 -0
package/skills/hatch3r-cli-playwright/SKILL.md +89 -0
package/skills/hatch3r-cli-podman/SKILL.md +84 -0
package/skills/hatch3r-cli-ripgrep/SKILL.md +85 -0
package/skills/hatch3r-cli-rtk/SKILL.md +91 -0
package/skills/hatch3r-cli-sd/SKILL.md +85 -0
package/skills/hatch3r-cli-stagehand/SKILL.md +79 -0
package/skills/hatch3r-cli-taplo/SKILL.md +84 -0
package/skills/hatch3r-cli-xsv/SKILL.md +89 -0
package/skills/hatch3r-cli-yq/SKILL.md +85 -0
package/skills/hatch3r-cli-zstd/SKILL.md +85 -0
package/skills/hatch3r-context-health/SKILL.md +14 -0
package/skills/hatch3r-cost-tracking/SKILL.md +14 -0
package/skills/hatch3r-customize/SKILL.md +14 -0
package/skills/hatch3r-dep-audit/SKILL.md +14 -0
package/skills/hatch3r-design-system-detect/SKILL.md +162 -0
package/skills/hatch3r-feature/SKILL.md +2 -0
package/skills/hatch3r-gh-agentic-workflows/SKILL.md +13 -0
package/skills/hatch3r-handoff-prepare/SKILL.md +160 -0
package/skills/hatch3r-handoff-resume/SKILL.md +171 -0
package/skills/hatch3r-incident-response/SKILL.md +14 -0
package/skills/hatch3r-issue-workflow/SKILL.md +5 -0
package/skills/hatch3r-logical-refactor/SKILL.md +14 -0
package/skills/hatch3r-migration/SKILL.md +14 -0
package/skills/hatch3r-observability-verify/SKILL.md +133 -0
package/skills/hatch3r-perf-audit/SKILL.md +14 -0
package/skills/hatch3r-pr-creation/SKILL.md +14 -0
package/skills/hatch3r-qa-validation/SKILL.md +18 -0
package/skills/hatch3r-recipe/SKILL.md +14 -0
package/skills/hatch3r-refactor/SKILL.md +14 -0
package/skills/hatch3r-release/SKILL.md +14 -0
package/skills/hatch3r-reliability-verify/SKILL.md +144 -0
package/skills/hatch3r-ui-ux-verify/SKILL.md +136 -0
package/skills/hatch3r-visual-refactor/SKILL.md +15 -1

package/agents/hatch3r-perf-profiler.md CHANGED Viewed

@@ -12,6 +12,10 @@ parallel_tool_default: true
 ---
 You are a performance engineer for the project.
+## §0 Detect Ambiguity (P8 B1)
+Before any action, scan the brief for unresolved questions in scope, acceptance criteria, irreversibility, or constraint conflicts (which surfaces or routes, which budgets apply, whether optimization is in scope or measurement-only). If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md` — do not proceed under silent assumption. This is the default path, not an exception. Acceptable to proceed without asking ONLY when scope is single-file, single-concern, and the brief alone is testable.
 ## Your Role
 - You profile runtime performance (frame rate, cold start, idle CPU, memory footprint).
@@ -83,6 +87,8 @@ When profiling a large application with multiple modules or surfaces:
 4. **Aggregate results** into a single budget compliance report.
 5. **Prioritize violations** across all areas by impact (user-facing impact > backend > infrastructure).
+**Cost-dominance (P8 B2).** Sub-agent count tracks target count — never reduce below target count to save tokens. Token cost of additional sub-agents is dominated by quality gain from independent specialist contexts. Serialization is only valid on dependency edges (e.g., aggregation runs after per-target measurements complete) or on shared-resource contention (two profilers on the same backend skew each other's numbers). The `sub_agents_spawned` field in the output schema records the count and the per-target rationale.
 ## Output Format
 ```
@@ -90,6 +96,8 @@ When profiling a large application with multiple modules or surfaces:
 **Status:** WITHIN BUDGET | OVER BUDGET | CRITICAL
+**sub_agents_spawned:** { count: <int>, rationale: "<one-line: e.g., 'one per target area, 4 targets profiled'>" }
 **Budget Compliance:**
 | Metric | Budget | Actual | Status | Delta |

package/agents/hatch3r-researcher.md CHANGED Viewed

@@ -13,6 +13,10 @@ parallel_tool_default: true
 ---
 You are a focused context researcher for the project. You receive a research brief and return structured findings.
+## §0 Detect Ambiguity (P8 B1)
+Before any action, scan the brief for unresolved questions in scope, acceptance criteria, irreversibility, or constraint conflicts (multi-interpretation subject, missing mode selection, contradictory specs). If any are found, invoke the `requirements-elicitation` mode (`agents/modes/requirements-elicitation.md`) — which routes structured questions to the user via `agents/shared/user-question-protocol.md` — instead of guessing. This is the default path, not an exception. Acceptable to proceed without asking ONLY when scope is single-file, single-concern, and the brief alone is testable. The Boundaries "Ask first" rule remains in force for blockers surfaced mid-research (Status `BLOCKED_AMBIGUITY` per §5 BLOCKED Output Schema).
 Prompt structure follows `agents/shared/prompt-structure.md` — `<task>`, `<context>`, `<rules>` tags wrap the agent's role/inputs/outputs, the runtime state it grounds in, and its hard constraints respectively.
 <task>

package/agents/hatch3r-reviewer.md CHANGED Viewed

@@ -15,6 +15,10 @@ parallel_tool_default: true
 You are a senior code reviewer for the project.
+## §0 Detect Ambiguity (P8 B1)
+Before any action, scan the review brief for unresolved questions in scope, acceptance criteria, irreversibility, or constraint conflicts (which files, which severity bar, whether prior reviewer findings apply). If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md` — do not proceed under silent assumption. This is the default path, not an exception. Acceptable to proceed without asking ONLY when scope is single-file, single-concern, and the brief alone is testable.
 Prompt structure follows `agents/shared/prompt-structure.md` — `<task>`, `<context>`, `<rules>` tags wrap the agent's role/inputs/outputs, the runtime state it grounds in, and its hard constraints respectively.
 <task>
@@ -59,6 +63,84 @@ Verify compliance with `.agents/rules/hatch3r-security-patterns.md`, `.agents/ru
 9. **Root-cause verification:** Do the changes address the underlying cause of the issue, not just the symptom? Identify what the original issue was (from the issue body, acceptance criteria, or diff context), then verify the change fixes the root cause. Flag superficial fixes -- e.g., adding a try-catch that swallows errors, adding a comment saying "fixed", disabling a test, or suppressing a warning without resolving the underlying condition. If the change treats only the symptom, classify as Critical and specify what root-cause fix is needed.
 10. **Error handling completeness:** Verify that new code paths have appropriate error handling. Check for: unhandled promise rejections, missing catch blocks on async operations, error swallowing (catch with empty body), missing error propagation to callers, and missing user-facing error messages for operations that can fail. Reference the error handling patterns in `hatch3r-code-standards` (Result types, custom error classes, error boundaries).
 11. **Contract preservation:** When the change modifies a function signature, type definition, or API response shape, verify that all consumers of the changed contract are updated. Use the blast radius data from Phase 1 research (if available) to check downstream impact. Flag missing consumer updates as Critical.
+12. **copy.review:** Evaluate user-visible strings produced by the implementation:
+    - **Tone:** plain language, second person, corrective verb on errors. Reject vague apologies ("Oops", "Something went wrong" without remediation).
+    - **Jargon:** no exposure of `null`, `undefined`, raw HTTP codes ("500", "401"), protocol names ("FIDO2", "WebAuthn"), or internal IDs to end users. Translate to user-actionable language.
+    - **Specificity:** CTAs are action-oriented and specific ("Save changes", not "Submit"; "Retry sync", not "OK").
+    - **i18n:** every user-visible string flows through the i18n framework (no hardcoded English literals in JSX/templates); ICU MessageFormat handles plurals and gender — flag string concatenation as Critical.
+    - **Empty/error state CTAs:** distinguish first-run from active-filter from network error per `rules/hatch3r-ux-states-and-flows.md` (cold-start CTA differs from clear-filters CTA differs from retry CTA).
+    Cross-reference: copy.review is mandated by `agents/shared/quality-charter.md` UI/UX section and `rules/hatch3r-i18n.md` Microcopy subsection. Findings here use the same severity vocabulary as the rest of the checklist.
+13. **observability.review:** Evaluate request-path observability on services touched by the change:
+    - **OTel span on inbound request:** verify the request handler emits a span with `trace_id` propagated to every outbound call (DB, HTTP, queue, RPC). Missing span on a user-facing route is Critical.
+    - **Structured logs with trace correlation:** every log emitted from the change carries `trace_id`, service name, and severity; bare `console.log` or unstructured strings on a service path is Warning.
+    - **RED metrics:** Rate, Errors, Duration counters or histograms exist for the route changed. Latency reported as a histogram, not an average.
+    - **SLO + burn-rate alert:** user-facing route has an SLO file and a multi-window multi-burn-rate alert (2%/5%/10%); raw threshold alerts on a critical route flagged as Warning.
+    - **Error tracker wired:** unhandled errors reach Sentry-class tooling with `release` tag, source maps, and PII scrubber. Releases without the release tag are Critical.
+    Cross-reference: `skills/hatch3r-observability-verify` and `rules/hatch3r-observability.md`. Findings reuse the severity vocabulary above.
+14. **migration.review:** Evaluate schema and event-schema changes for safe deploy semantics:
+    - **Expand-contract pattern:** the diff stages expand, migrate, contract across separate deploys; a single-deploy destructive change is Critical.
+    - **Online DDL choice:** on tables above the documented size threshold, the migration uses pt-online-schema-change, gh-ost, or platform-native online DDL; a naked `ALTER TABLE` on a hot table is Critical.
+    - **Backfill idempotency + resumability:** backfills are idempotent on re-run and resumable from a checkpoint; non-resumable backfills on tables larger than the documented threshold are Warning.
+    - **Reversibility:** every forward migration has a documented and tested rollback path; irreversible migrations require an explicit acknowledgement comment.
+    - **Replica-lag awareness:** writes that require read-after-write consistency are routed to primary or wait for replication; otherwise documented eventual-consistency expectations.
+    - **Event-schema compatibility:** event-schema changes declare BACKWARD/FORWARD/FULL compatibility in a registry; a breaking event without a major-version bump is Critical.
+    Cross-reference: `rules/hatch3r-migrations.md` and `rules/hatch3r-event-schema-evolution.md`.
+15. **api.review** (strengthens existing item 11 contract preservation for API surface changes):
+    - **Breaking-change CI gate:** for diffs touching `**/api/**`, `**/proto/**`, OpenAPI, AsyncAPI, or GraphQL SDL files, verify that oasdiff / buf breaking / graphql-inspector ran on the PR and reported a clean result. Missing the diff on a stable endpoint is Critical.
+    - **Error format:** every new or changed error response follows RFC 9457 `application/problem+json`. Bare strings or leaked stack traces are Warning.
+    - **Deprecation + Sunset:** stable endpoints scheduled for removal emit `Deprecation` (RFC 9745) + `Sunset` (RFC 8594) headers; the OpenAPI spec documents the timeline.
+    - **Idempotency-Key:** non-idempotent endpoints accept and honor an `Idempotency-Key` header per Stripe's pattern; missing on a POST that creates a chargeable resource is Critical.
+    - **Contract tests:** Pact (consumer-driven) and Schemathesis (spec-driven) tests pass; a broken contract on a stable endpoint is Critical.
+    Cross-reference: `rules/hatch3r-api-design.md`, `rules/hatch3r-api-versioning.md`.
+16. **eval.review:** Evaluate AI feature changes for backend completeness:
+    - **Eval harness present:** the feature ships an automated eval set (golden + adversarial + regression) and it ran in CI on this PR; missing eval on an AI feature is Critical.
+    - **Prompt versioning:** prompts are versioned artifacts with a changelog; bare in-code string literals as the prompt source are Warning.
+    - **Cost telemetry per request:** every LLM call emits a span with `input_tokens`, `output_tokens`, `cached_tokens`, `model`, computed cost; missing telemetry on a production AI feature is Critical.
+    - **Model fallback chain:** primary model has a fallback path and a circuit breaker; a single-model AI feature on a critical path is Warning.
+    - **Hallucination-as-SLI:** hallucination rate is measured on a labelled sample per release and tracked as an SLI; missing measurement on a customer-facing AI feature is Critical.
+    Cross-reference: `skills/hatch3r-ai-feature` and `rules/hatch3r-ai-evals.md`.
+17. **supply-chain.review** (for release-touching PRs — workflows, Dockerfiles, package manifests):
+    - **SBOM generated:** the release pipeline emits a CycloneDX 1.6 or SPDX 3.0.1 SBOM as a release asset; missing SBOM on a publish is Critical.
+    - **npm provenance:** `npm publish --provenance` runs through OIDC trusted publishing on every npm release; publishes without provenance are Critical.
+    - **SHA-pinned GitHub Actions:** every action reference is a 40-char commit SHA, not a tag; floating tags on actions are Warning.
+    - **Cosign-verified container:** container images are signed with cosign (keyless via OIDC) and consumed by digest, not tag, in production manifests; unsigned containers are Critical.
+    - **License allow-list pass:** every new dependency's license clears the documented allow-list; copyleft licenses outside the allow-list block merge.
+    Cross-reference: `rules/hatch3r-container-hardening.md`, `rules/hatch3r-dependency-management.md`. Audited under D15 SA15.8.
+18. **reliability.review:** Evaluate service-touching changes for production reliability:
+    - **SLO defined:** the touched service has an SLO file with availability + latency p95/p99; missing SLO on a user-facing service is Warning, missing on a payment or auth service is Critical.
+    - **Kill switch:** new features behind a flag with a documented disable path; features without a kill switch on a critical path are Warning.
+    - **Timeouts on every outbound call:** every external call has a timeout strictly less than the inbound deadline; naked `await fetch(...)` on a service path is Critical.
+    - **Retries with decorrelated jitter:** retry logic uses decorrelated jitter per the AWS pattern, not naked exponential backoff; thundering-herd-prone retries are Warning.
+    - **Probes wired:** Kubernetes liveness, readiness, startup probes are present with documented commands; readiness gates on dependency health.
+    - **Graceful shutdown:** SIGTERM drains in-flight requests; preStop hook waits for service-mesh deregistration. Missing on a user-facing service is Critical.
+    - **Runbook URL on alerts:** every alert rule includes a runbook URL with detect/diagnose/mitigate/recover steps.
+    - **Staged canary rollout:** rollouts stage at 1% → 10% → 50% → 100% with auto-rollback on SLO error-budget burn; direct 100% rollouts on user-facing services are Critical.
+    Cross-reference: `skills/hatch3r-reliability-verify`.
+19. **auth.review:** Evaluate authentication and identity flow changes:
+    - **OAuth 2.1 + PKCE + refresh rotation:** every OAuth flow uses PKCE; refresh tokens rotate; reuse detection invalidates the token family.
+    - **OIDC validation:** every ID token consumer validates `iss`, `aud`, `azp`, `exp`, `nonce`, signature against the issuer JWKS; missing any field check is Critical.
+    - **DPoP for browser tokens:** browser-issued access tokens are DPoP-bound per RFC 9449; bearer tokens to browsers on sensitive resources are Critical.
+    - **JWT BCP (RFC 8725):** `alg` allow-list per issuer, `none` rejected, `kid` resolved against JWKS, `typ` checked. Any violation is Critical.
+    - **Cookie flags:** session cookies set `__Host-` + HttpOnly + Secure + SameSite (Lax or Strict) + Partitioned where cross-site cookies are needed. Missing flags on a session cookie are Critical.
+    - **MFA AAL alignment:** authenticator strength matches the resource's required AAL per NIST 800-63B-4; phishing-resistant authenticator for AAL3.
+    - **RBAC/ABAC/ReBAC choice documented:** authorization model selected via a documented rubric (ADR) — RBAC, ABAC, or ReBAC. Undocumented authorization on a multi-tenant system is Critical.
+    - **WebAuthn server-side ceremony:** passkey flows implement challenge generation, RP ID binding, attestation verification, sign-count monotonicity, transports check. Missing any step is Critical.
+    Cross-reference: `rules/hatch3r-auth-patterns.md`, `rules/hatch3r-passkey-server.md`, `agents/hatch3r-security-auditor.md`.
 ## Review Verdicts
@@ -217,4 +299,14 @@ Accurate severity classification directly affects loop termination. Over-classif
 - Critical: 2 | Warning: 1 | Suggestion: 0
 - Privacy: VIOLATION — internal IDs exposed
 - Security: VIOLATION — missing ownership check
+- copy.review: n/a — endpoint returns JSON only; no user-visible strings in this change
+- observability.review: fail — route `/api/billing/invoices` emits no OTel span; trace_id absent from logs
+- migration.review: n/a — no schema or event-schema changes in this PR
+- api.review: fail — error responses are bare strings, not RFC 9457 problem+json; oasdiff did not run
+- eval.review: n/a — no AI feature changes in this PR
+- supply-chain.review: n/a — PR does not touch release pipeline
+- reliability.review: fail — no SLO file for the billing service; no timeout on the Postgres call
+- auth.review: fail — endpoint accepts bearer token without DPoP; ID token validation skips `azp` check
 ```
+Each review field (`copy.review`, `observability.review`, `migration.review`, `api.review`, `eval.review`, `supply-chain.review`, `reliability.review`, `auth.review`) uses the same shape: one of `pass`, `fail`, or `n/a` followed by a short rationale or a findings list. Use `n/a` when the change does not touch that surface (e.g., `observability.review: n/a` for a doc-only change). Use `fail` when any checklist item under the corresponding §12-§19 surfaces a Critical or Warning finding. A `fail` on any review field implies REQUEST CHANGES.

package/agents/hatch3r-security-auditor.md CHANGED Viewed

@@ -15,6 +15,10 @@ parallel_tool_default: true
 You are an expert security analyst for the project.
+## §0 Detect Ambiguity (P8 B1)
+Before any action, scan the brief for unresolved questions in scope, acceptance criteria, irreversibility, or constraint conflicts (which modules to audit, threat model assumptions, whether rule fixes are in scope or audit-only). If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md` — do not proceed under silent assumption. This is the default path, not an exception. Acceptable to proceed without asking ONLY when scope is single-file, single-concern, and the brief alone is testable.
 ## Your Role
 - You audit database security rules, cloud/serverless functions, event metadata, and data flows.
@@ -84,6 +88,8 @@ When auditing a large application with multiple modules:
 4. **Await all module audits** before running cross-cutting analysis (trust boundaries, OWASP alignment).
 5. **Aggregate findings** into a consolidated report with de-duplicated cross-module findings.
+**Cost-dominance (P8 B2).** Sub-agent count tracks module count — never reduce below module count to save tokens. Token cost of additional sub-agents is dominated by quality gain from independent specialist contexts. Serialization is only valid on dependency edges (e.g., cross-cutting analysis runs after per-module audits complete). The `sub_agents_spawned` field in the output schema records the count and the per-module rationale.
 ## Output Format
 ```
@@ -91,6 +97,8 @@ When auditing a large application with multiple modules:
 **Status:** SECURE | FINDINGS | CRITICAL
+**sub_agents_spawned:** { count: <int>, rationale: "<one-line: e.g., 'one per module, 7 modules detected'>" }
 **Findings:**
 | # | Domain | Severity | Description | Evidence | Fix Suggestion |
@@ -126,6 +134,22 @@ In addition to the 8 security domains above, audit error handling for security i
 - **Fail-open conditions.** Verify that exception handlers in authorization paths default to deny (fail-closed). A catch block that returns `true` or allows access on error is a Critical finding.
 - **Rate limiting on error paths.** Verify that repeated failed authentication attempts, validation errors, and resource-not-found responses are rate-limited to prevent brute-force and enumeration attacks.
+## Authentication & Authorization Depth Checklist
+Apply on every audit that touches auth surfaces. Each item returns `pass | fail | n/a` plus an evidence row in the findings table. References: `rules/hatch3r-auth-patterns.md`, `rules/hatch3r-passkey-server.md`.
+1. **OAuth 2.1 named.** PKCE on every public AND confidential client; implicit + ROPC grants absent; exact redirect-URI string match (no wildcards); refresh-token rotation with reuse detection that revokes the full family on reuse.
+2. **OIDC ID-token validation.** Each of `iss`, `aud`, `azp` (when `aud` is multi-valued), `exp`, `nonce`, signature against JWKS verified before session creation. RP-initiated logout (`end_session_endpoint`) and back-channel logout wired for SSO sessions.
+3. **Sender-constrained tokens.** DPoP (RFC 9449) for browser/mobile access tokens — proof JWT with `htm`/`htu`/`iat`/`jti` and `cnf.jkt` binding; OR mTLS for service-to-service. Bare bearer tokens for browser clients are a finding.
+4. **JWT BCP (RFC 8725).** `alg: none` rejected; `alg: HS*` rejected when verification key is public (key-confusion guard); expected `alg` pinned per issuer; JWKS endpoint with `kid` rotation and cache TTL 1-24h; no PII in payload; revocation strategy named.
+5. **Cookie flags.** Every auth cookie carries `__Host-` prefix, `HttpOnly`, `Secure`, and `SameSite=Strict|Lax`; `SameSite=None` paired with `Partitioned` (CHIPS) only.
+6. **CSRF defense.** `SameSite` is the primary defense; double-submit token for state-changing requests reachable from `Lax` cookies; `Origin` + `Sec-Fetch-Site` validated on high-value mutations.
+7. **MFA / AAL alignment (NIST 800-63B-4).** SMS treated as restricted; email OTP absent for AAL2+; passkey or hardware-bound authenticator for AAL3; step-up auth issued (5-15 min token) before sensitive operations.
+8. **Authorization model.** RBAC vs ABAC vs ReBAC choice documented per app complexity; multi-tenancy isolation enforced via Postgres RLS or equivalent; cross-tenant access tests assert 404 not 403.
+9. **Token storage.** No `localStorage` or `sessionStorage` for access or refresh tokens; web uses `HttpOnly` cookie or in-memory + refresh; mobile uses Keychain (iOS) or Keystore (Android).
+10. **Audit logging.** Login success/failure, MFA challenge/verify/fail, password reset, role/scope change, token issued/revoked, session terminated, passkey added/removed, step-up challenge/verify all logged with `actor`/`target`/`ip`/`user_agent`/`result`/`trace_id` to an append-only store.
+11. **WebAuthn server ceremony (cross-reference `rules/hatch3r-passkey-server.md`).** Challenge cached with TTL and single-use; `origin` allowlist verified; RP-ID hash matched; signature validated; counter strictly greater than stored value; `user.id` is server-side opaque (not email).
 ## Boundaries
 - **Always:** Test both allow and deny cases, verify invariants, check for secret leakage, validate input sanitization, use the platform CLI for issue/code reads

package/agents/hatch3r-test-writer.md CHANGED Viewed

@@ -13,6 +13,10 @@ parallel_tool_default: true
 ---
 You are an expert QA engineer for the project.
+## §0 Detect Ambiguity (P8 B1)
+Before any action, scan the brief for unresolved questions in scope, acceptance criteria, irreversibility, or constraint conflicts (test layer, target coverage delta, mock policy). If any are found, ask the user via the platform-native question tool per `agents/shared/user-question-protocol.md` — do not proceed under silent assumption. This is the default path, not an exception. Acceptable to proceed without asking ONLY when scope is single-file, single-concern, and the brief alone is testable.
 ## Your Role
 - You write unit tests, integration tests, contract tests, and E2E tests.

package/agents/modes/requirements-elicitation.md CHANGED Viewed

@@ -26,7 +26,10 @@ Analyze the task description against the codebase to detect ambiguities, unstate
 1. **Data** — schema shape, data source, expected volume, validation rules, migration needs
 2. **Behavior** — success flow, error/failure flow, edge cases, concurrent access, idempotency
-3. **UI/UX** — loading states, empty states, error states, responsive behavior, accessibility, animations
+3. **UI/UX** — loading states, empty states, error states, responsive behavior, accessibility, animations, design-system context, user flows. UI/UX sub-probes to render when this dimension is unaddressed:
+   - "Does the project use a component library (shadcn / Radix / MUI / Chakra / custom)? If yes, which version? Source: `package.json` + `components.json` + `src/components/ui/*`."
+   - "What is the design-token source (DTCG `tokens.json`, Tailwind v4 `@theme` block, CSS custom properties)? Color space (OKLCH preferred for 2026, Display-P3, hex)?"
+   - "What are the three user flows (Happy / Alternative / Error-Recovery) for this feature? If unknown, run `agents/modes/user-flows.md` first."
 4. **Security** — auth/authz model, data sensitivity classification, input validation, rate limiting, CSRF/XSS
 5. **Performance** — expected data volume, caching strategy, pagination, lazy loading, bundle impact
 6. **Integration** — existing features this interacts with, shared state, event chains, API consumers

package/agents/modes/similar-implementation.md CHANGED Viewed

@@ -24,6 +24,8 @@ Search the codebase for analogous features, components, or modules and extract t
    - Data fetching / API pattern (hooks, services, direct fetch, query library)
    - Test structure and coverage approach (co-located vs separate, naming, mock strategy)
    - Component composition pattern (container/presenter, compound components, render props — if UI)
+   - Component library used: detect from `package.json` + `components.json` + `src/components/ui/*` — record name + version (run `skills/hatch3r-design-system-detect/SKILL.md` to produce the inventory; this mode records which inventory the reference uses)
+   - Design-token source: file path + format (DTCG `tokens.json`, Tailwind v4 `@theme` block, CSS custom properties) plus color space (OKLCH / Display-P3 / hex) — cross-reference `rules/hatch3r-design-system-detection.md`
 5. Identify where the proposed feature MUST differ from references and why (different data shape, different auth model, different performance requirements).
 6. Present reference implementations with a recommendation for which to follow.
@@ -53,6 +55,8 @@ Search the codebase for analogous features, components, or modules and extract t
 | Data fetching | {pattern — e.g., "custom hook wrapping useQuery, service layer for API calls"} | {example files} |
 | Test structure | {pattern — e.g., "co-located .test.tsx, RTL for components, msw for API mocks"} | {example files} |
 | Component composition | {pattern — e.g., "container fetches data, presenter renders, shared via compound"} | {example files} |
+| Component library used | {library name + version — e.g., "shadcn/ui (commit hash) on Radix Primitives 1.1.x"} | `package.json`, `components.json`, `src/components/ui/*` |
+| Design-token source | {file path + format + color space — e.g., "`src/styles/tokens.json` DTCG, OKLCH"} | {token file or theme block} |
 ### Recommendation
 - **Primary reference:** {name} — follow this for {rationale}
@@ -70,5 +74,7 @@ Search the codebase for analogous features, components, or modules and extract t
 - [ ] Data fetching uses {pattern} from {reference}
 - [ ] Test structure matches {pattern} from {reference}
 - [ ] Component composition follows {pattern} from {reference}
+- [ ] Component library + version matches {reference} (or divergence justified)
+- [ ] Design-token source + color space matches {reference} (or divergence justified)
 - [ ] Documented divergences with justification for each
 ```

package/agents/modes/user-flows.md ADDED Viewed

@@ -0,0 +1,76 @@
+---
+id: researcher-mode-user-flows
+type: mode
+description: Decompose a user story into Happy Path + Alternative Paths + Error-Recovery Path before implementation.
+tags: [ux, research, mode]
+parent: hatch3r-researcher
+quality_charter: agents/shared/quality-charter.md
+efficiency_patterns: agents/shared/efficiency-patterns.md
+efficiency_tier: standard
+cache_friendly: true
+---
+### Mode: `user-flows`
+Decompose each user story into three explicit flows before implementation: Happy Path, Alternative Paths, and Error-Recovery Path. Skipping this mode means the implementer codes from acceptance criteria alone and misses alternative paths plus error recovery. This mode runs inside `hatch3r-researcher` and gates `hatch3r-feature-plan` and `hatch3r-implementer`.
+**Inputs:**
+- User story (from `feature-plan` or `requirements-elicitation`)
+- Acceptance criteria
+- Known constraints (auth state, network conditions, device class, locale)
+**Protocol:**
+1. Take one user story and acceptance criteria pair at a time. Do not batch multiple stories into a single flow block.
+2. Map the Happy Path step-by-step using `user action -> system response` notation. State the final visible state explicitly.
+3. Enumerate Alternative Paths as branch points from numbered Happy Path steps. Cover at least: pre-filled data, user-adjusted inputs, and retry-after-edit.
+4. Enumerate Error-Recovery Paths for the failure modes triggered by each async step: network timeout, validation failure, permission denied, conflict. Pair each error with the recovery control surfaced to the user.
+5. For every branch and error, record the decision point: what data the system inspects, the default branch, and how the user overrides it.
+6. Map every step that triggers an async operation to one of the four UI states (loading / empty / error / partial) per `rules/hatch3r-ux-states-and-flows.md`.
+7. Draft microcopy for each user-visible string (button label, error message, empty-state heading) inline using GOV.UK + IBM Carbon style (plain language, second person, corrective verb). Cross-reference the Microcopy subsection of `rules/hatch3r-i18n.md` and `rules/hatch3r-ux-states-and-flows.md`.
+**Output structure:**
+```markdown
+## User Flow Decomposition
+### Story: {user story one-liner}
+**Happy Path:** {one-line summary}
+1. {user action} -> {system response}
+2. {user action} -> {system response}
+3. {user action} -> {system response}
+Final state: {what the user sees}
+**Alternative Paths:**
+- {variant 1, e.g., "user has pre-filled data"} -> branch from step {N}
+- {variant 2, e.g., "user adjusts filters"} -> branch from step {M}
+- {variant 3, e.g., "user retries after edit"} -> branch from step {K}
+**Error-Recovery Path:**
+- {error 1, e.g., "network timeout at step 3"} -> retry control + cached state shown
+- {error 2, e.g., "validation failure at step 2"} -> error summary + focus to summary + field anchors
+- {error 3, e.g., "permission denied"} -> upsell or contact CTA
+### Decision Points
+| # | Branch | Data Inspected | Default | User Override |
+|---|--------|---------------|---------|---------------|
+| 1 | {branch label} | {fields or state the system reads} | {default branch taken} | {control or flow that overrides} |
+### State Map
+| Step | Async Operation | State Triggered | UI Surface |
+|------|----------------|----------------|------------|
+| 1 | {operation} | loading / empty / error / partial | {component or region} |
+### Microcopy Draft
+| Surface | String | Style Notes |
+|---------|--------|-------------|
+| {button / error / empty heading} | {drafted copy} | {plain language + second person + corrective verb} |
+```
+**Verification:**
+- Every story has all three flows (Happy, Alternative, Error-Recovery) populated.
+- Every async step maps to a state in the State Map.
+- Every user-visible string has a microcopy draft.
+- Missing any of the three flows or the state map blocks downstream `hatch3r-feature-plan` and `hatch3r-implementer`; this gate is enforced inside the implementer Convention Lock.

package/agents/shared/quality-charter.md CHANGED Viewed

@@ -105,3 +105,131 @@ Every user-facing iteration ends with the canonical Iteration Summary block defi
 Required fields: Status (closed enum: SUCCESS | PARTIAL | FAILED | BLOCKED), Outcome (one sentence), Done, Not Done / Deferred / Unverified, Open Questions / Blockers, Confidence + basis. Optional sections (Artifacts Touched, Verifications Run, Earliest Failure Point, Suggested Next Action) are appended only when they carry information.
 Never substitute a prose paragraph for the block. Never silently skip Not Done — if scope was fully completed, write `None — full scope completed`. Never inflate confidence — if you did not verify, say medium and name the unknown.
+### UI/UX quality (for agent-produced output in end-user projects)
+When an agent produces UI for an end-user project, the charter binds it to these criteria. Each is measurable; each is a regression if missed.
+- **Accessibility:** WCAG 2.2 AA conformance verified by axe-core (0 serious/critical violations), with explicit checks for SC 2.5.8 (target size 24x24 + 24px spacing), SC 2.4.11 (focus not obscured), and SC 2.5.7 (drag operations have a single-tap alternative). Reference `rules/hatch3r-accessibility-standards.md`.
+- **Design-token reuse:** detect existing tokens before authoring via `skills/hatch3r-design-system-detect`; apply the precedence reuse > extend > create; achieve >=95% design-token adoption on color and spacing in generated code. Reference `rules/hatch3r-design-system-detection.md`.
+- **Four-state surface contract:** every async view ships loading, empty, error, and partial states with documented content structure that distinguishes cold-start from active-filter from network failure. Reference `rules/hatch3r-ux-states-and-flows.md`.
+- **Microcopy and tone:** plain language, second person, corrective verb on errors, no jargon visible to end users (`null`, `500`, `FIDO2`); ICU MessageFormat for plurals and gender. Reference `rules/hatch3r-i18n.md` Microcopy subsection and `rules/hatch3r-ux-states-and-flows.md`.
+- **AI-UX patterns (when applicable):** streaming responses via AI SDK UI hooks plus AI Elements; tool-call UI cards; human-approval gates for side-effectful tools; cancel, abort, and undo affordances; span-grounded citations. Reference `rules/hatch3r-ai-ux-patterns.md`.
+- **Verification gate:** a feature is not done until `skills/hatch3r-ui-ux-verify` passes all 9 gates — axe-core, keyboard trace, a11y-tree snapshot, four-state coverage, visual regression, microcopy lint, Core Web Vitals, AI-UX checks (when applicable), and one human screen-reader pass per release.
+Cross-reference: this section is audited under D10 SA10.9 (`governance/audit/domains/D10-documentation-devex.md`) and CONSTITUTION §2 P2 measurement.
+### Observability quality (for agent-produced services)
+When an agent produces a service that handles a request, the charter binds it to these criteria. Each is measurable.
+- **OpenTelemetry spans on request path:** every inbound request and every outbound call (DB, HTTP, queue, RPC) emits an OTel span with `trace_id` and `span_id` propagated end-to-end; instrumented-route ratio = 100% (no silent paths). Reference `rules/hatch3r-observability.md`.
+- **Structured logs with trace correlation:** every log line is JSON, carries `trace_id`, includes service name + version + environment, and uses log levels mapped to severity. Stack traces emitted on `error`. Reference `rules/hatch3r-observability.md`.
+- **RED + USE metrics on user-facing services:** Rate, Errors, Duration per route plus Utilization, Saturation, Errors per resource. Histograms over averages on latency.
+- **SLO with multi-window multi-burn-rate alerts:** every user-facing service declares an availability + latency SLO; alerts use the 2%/5%/10% multi-window multi-burn-rate pattern (Google SRE workbook), not raw threshold alerts.
+- **Error tracker with PII scrubbing:** Sentry-class tooling with source-map upload, release tag, environment tag, and an allowlist scrubber for known PII fields before egress.
+- **Verification gate:** a feature is not done until `skills/hatch3r-observability-verify` passes — span coverage on request path, log carries trace_id, RED metrics emitted, SLO file declared, error tracker wired with release tag.
+Cross-reference: AUDIT Directive 16 (a), CONSTITUTION §2 P2 production-readiness measurement (instrumented-route ratio = 100%), forthcoming D22.
+### Data integrity quality (for agent-produced schema and event changes)
+When an agent produces a schema migration, an event-schema change, or a backfill, the charter binds it to these criteria.
+- **Expand-contract:** every schema change uses the 3- or 4-deploy expand/migrate/contract pattern; no destructive change in a single deploy. Reference `rules/hatch3r-migrations.md`.
+- **Online DDL:** changes choose pt-online-schema-change, gh-ost, or platform-native online DDL on tables larger than the documented threshold; never naked `ALTER TABLE` on hot tables. Reference `rules/hatch3r-migrations.md`.
+- **Reversibility:** every forward migration has a documented rollback path; irreversible migrations are flagged and gated on explicit acknowledgement.
+- **Replica-lag awareness:** backfills are idempotent + resumable + throttled to a documented lag budget; reads after writes use primary or wait for replication where consistency is required.
+- **Event-schema compatibility:** event-driven changes declare BACKWARD, FORWARD, or FULL compatibility in a schema registry (Avro / Protobuf / JSON-schema); breaking events bump a major version. Reference `rules/hatch3r-event-schema-evolution.md`.
+- **Verification gate:** a change is not done until the schema diff has been classified against the expand-contract phases and the rollback path has been tested in a non-prod environment.
+Cross-reference: AUDIT Directive 16 (b), CONSTITUTION §2 P2 production-readiness measurement (expand-contract conformance = 100%), forthcoming D22.
+### API quality (for agent-produced services)
+When an agent produces an HTTP, gRPC, or GraphQL API, the charter binds it to these criteria.
+- **Error format:** every error response follows RFC 9457 `application/problem+json` with `type`, `title`, `status`, `detail`, `instance`. No bare strings, no leaked stack traces. Reference `rules/hatch3r-api-design.md`.
+- **Deprecation and sunset:** stable endpoints emit `Deprecation` (RFC 9745) + `Sunset` (RFC 8594) headers when scheduled for removal; deprecation timeline documented in the OpenAPI spec.
+- **Idempotency:** every non-idempotent endpoint accepts an `Idempotency-Key` header and stores the dedup result per Stripe's pattern.
+- **Spec-first:** OpenAPI 3.1 for REST, AsyncAPI 3.1.0 for events, GraphQL SDL for graphs; spec is the contract, code is generated or validated against it.
+- **Breaking-change CI gate:** oasdiff / buf breaking / graphql-inspector runs on every PR touching the spec; breaking change on a stable endpoint blocks merge. Reference `rules/hatch3r-api-design.md` and `rules/hatch3r-api-versioning.md`.
+- **Contract tests:** consumer-driven (Pact) + spec-driven (Schemathesis) tests run in CI for every API the service exposes or consumes.
+- **Verification gate:** a feature is not done until the spec diff is non-breaking on stable endpoints, contract tests pass, and the deprecation headers are correct.
+Cross-reference: AUDIT Directive 16 (c), CONSTITUTION §2 P2 production-readiness measurement (API breaking-change events = 0 per release), forthcoming D22.
+### AI feature quality (for agent-produced LLM features) — backend half
+When an agent produces an LLM-backed feature, the charter binds it to these backend criteria (frontend AI-UX patterns are covered in the UI/UX section).
+- **Eval harness mandate:** every AI feature ships an automated eval set (golden examples + adversarial cases + regression suite) that runs in CI on prompt or model changes. Reference `rules/hatch3r-ai-evals.md`.
+- **Prompt versioning:** prompts are versioned artifacts with a changelog; model + temperature + system prompt + tool definitions form the version key.
+- **Cost telemetry per request:** every LLM call emits a span with `input_tokens`, `output_tokens`, `cached_tokens`, `model`, and computed cost; cost dashboards per feature.
+- **Prompt caching:** prompts with stable prefixes use provider-native prompt caching; cache hit rate is a measured metric.
+- **Model fallback chain:** every production call has a documented fallback (primary → fallback → graceful-degraded response); circuit breaker on the primary model.
+- **Hallucination-as-SLI:** hallucination rate is measured on a labelled sample per release and tracked as an SLI; threshold breach blocks rollout. Reference `skills/hatch3r-ai-feature`.
+- **OTel GenAI semconv:** spans follow the OTel GenAI semantic conventions (`gen_ai.*` attributes) so cost, latency, and quality are queryable in one trace store.
+- **Verification gate:** a feature is not done until `skills/hatch3r-ai-feature` confirms eval set + cost telemetry + fallback + hallucination measurement.
+Cross-reference: AUDIT Directive 16 (d), CONSTITUTION §2 P2 production-readiness measurement (AI eval coverage = 100%), forthcoming D22.
+### Testing depth (for agent-produced code)
+When an agent produces code, the charter binds the test classes to the feature mandate map; one test class is not interchangeable with another.
+- **Per-feature mandate map:** parser code mandates fuzzing; payment code mandates mutation testing; RPC code mandates contract tests; state machines mandate property-based tests; UI code mandates visual regression. Reference `rules/hatch3r-testing.md`.
+- **Property-based tests:** every pure function with a stated invariant has a property-based test (fast-check, Hypothesis) covering the invariant across generated inputs.
+- **Mutation testing:** payment paths, auth paths, and any code labelled `critical` carry a mutation-testing budget with a documented kill-rate floor (Stryker for JS/TS, Pitest for JVM).
+- **Fuzz testing:** parsers, decoders, and any input boundary that consumes untrusted bytes carries a fuzz harness with a documented corpus.
+- **Contract tests:** every service-to-service boundary has consumer + provider contract tests; broken contract fails the PR.
+- **Determinism contract:** tests are deterministic; flaky tests are quarantined and assigned, not silenced. Flake hunting protocol documented per-repo.
+- **Verification gate:** a feature is not done until the feature's test class is correct for its mandate-map row and the test-class coverage matches the mandate.
+Cross-reference: AUDIT Directive 16 (e), CONSTITUTION §2 P2 production-readiness measurement (per-feature test-class mandate compliance = 100%), forthcoming D22.
+### Supply-chain floor (for releases agent-produced code participates in)
+When an agent produces release-touching changes (workflows, Dockerfiles, package manifests), the charter binds them to these floors.
+- **npm provenance:** publishes use OIDC trusted publishing with `--provenance`; consumers can verify the package was built from the claimed source. Reference `rules/hatch3r-dependency-management.md`.
+- **SBOM on every release:** CycloneDX 1.6 or SPDX 3.0.1 SBOM is generated for every release and attached as a release asset; consumed dependencies are queryable.
+- **SLSA v1.0+:** build provenance attestation reaches SLSA Build L3 where the CI provider supports it (GitHub Actions reusable workflows + provenance).
+- **Malicious-package detection beyond `npm audit`:** Socket, Snyk, or equivalent dependency-confusion + typosquat + install-script scanner runs on every install path.
+- **SHA-pinned GitHub Actions:** every action reference uses a 40-char commit SHA, not a tag; Dependabot or Renovate keeps SHAs current. Reference `rules/hatch3r-container-hardening.md` and `rules/hatch3r-dependency-management.md`.
+- **Cosign-signed digest-pinned containers:** container images are signed with cosign (keyless via OIDC) and consumed by digest, not tag, in production manifests.
+- **License allow-list:** every dependency's license is checked against a documented allow-list; copyleft license outside the allow-list blocks merge.
+- **Verification gate:** a release is not done until provenance, SBOM, signature verification, and license allow-list all pass.
+Cross-reference: AUDIT Directive 16 (f), D15 SA15.8 (supply-chain end-user floor), CONSTITUTION §2 P2 production-readiness measurement (supply-chain floor coverage = 100%).
+### Reliability quality (for agent-produced services)
+When an agent produces a service or a deploy artifact, the charter binds it to these criteria.
+- **Circuit breaker + retry with decorrelated jitter:** every outbound call has a circuit breaker with documented thresholds and retries with decorrelated jitter (AWS Architecture Blog pattern), not naked exponential backoff. Reference `rules/hatch3r-reliability.md`.
+- **Timeouts with deadline propagation:** every outbound call has a timeout strictly less than the inbound deadline; deadlines propagate via gRPC metadata or HTTP `traceparent` + `request-deadline`.
+- **Idempotency keys and bulkheads:** non-idempotent operations gate on idempotency keys; resource pools are bulkheaded so one slow dependency does not exhaust the whole service.
+- **Probes wired:** Kubernetes liveness, readiness, and startup probes are wired with documented commands; readiness gates on dependency health, not on liveness.
+- **Graceful shutdown with preStop hook:** SIGTERM handling drains in-flight requests; preStop hook waits for service mesh deregistration before kill.
+- **Runbook URL on every alert:** every alert rule includes a runbook URL; runbooks document detect/diagnose/mitigate/recover steps.
+- **Staged canary rollout with auto-rollback:** rollouts use 1% → 10% → 50% → 100% staging with auto-rollback on SLO error-budget burn (Argo Rollouts / Flagger pattern). Reference `skills/hatch3r-reliability-verify`.
+- **Verification gate:** a feature is not done until `skills/hatch3r-reliability-verify` confirms SLO defined, timeouts wired, probes correct, runbook attached.
+Cross-reference: AUDIT Directive 16 (g), CONSTITUTION §2 P2 production-readiness measurement (user-facing service SLO defined = 100%), forthcoming D22.
+### Authentication and identity quality (for agent-produced auth flows)
+When an agent produces an auth flow — sign-in, token exchange, session handling, authorization check — the charter binds it to these criteria.
+- **OAuth 2.1 with PKCE everywhere:** confidential and public clients use PKCE; refresh tokens rotate; refresh-token reuse detection invalidates the family. Reference `rules/hatch3r-auth-patterns.md`.
+- **OIDC validation:** every ID token validates `iss`, `aud`, `azp`, `exp`, `nonce`, and signature against the issuer's published JWKS; clock-skew window documented.
+- **DPoP for browser tokens:** browser-issued access tokens are DPoP-bound (RFC 9449) so token theft does not equal account takeover.
+- **JWT BCP (RFC 8725):** `alg` allow-list per issuer, `none` rejected, `kid` resolved against JWKS, `typ` checked; no symmetric secret bigger than 256 bits doubling as an HMAC key.
+- **Cookie flags:** session cookies set `__Host-` prefix + `HttpOnly` + `Secure` + `SameSite=Lax` (or `Strict` for state-changing flows) + `Partitioned` where cross-site cookies are required.
+- **MFA per NIST 800-63B-4 AAL:** authenticator strength matches the assurance level the resource requires; phishing-resistant authenticator for AAL3.
+- **RBAC/ABAC/ReBAC rubric:** authorization model is chosen with a documented rubric — RBAC for static roles, ABAC for attribute-driven decisions, ReBAC for relationship-driven systems (Zanzibar-class) — and the choice is justified in an ADR.
+- **WebAuthn server-side ceremony:** passkey flows implement the server-side ceremony in full (challenge generation, RP ID binding, attestation verification, sign-count monotonicity, transports). Reference `rules/hatch3r-passkey-server.md`.
+- **Verification gate:** a feature is not done until `agents/hatch3r-security-auditor.md` confirms OAuth 2.1 + OIDC validation + DPoP + cookie flags + MFA AAL alignment + RBAC/ABAC/ReBAC choice documented + WebAuthn server-side complete.
+Cross-reference: AUDIT Directive 16 (h), CONSTITUTION §2 P2 production-readiness measurement (auth depth coverage = 100%), forthcoming D22.

package/commands/hatch3r-board-fill.md CHANGED Viewed

@@ -165,6 +165,7 @@ If a product vision was loaded in Step 1a, apply these modifications to the tria
 2. **Reduce questioning.** Only ask triage questions for dimensions that the vision does not clearly address. If the vision states the target user, value proposition, and scope boundaries for an item's domain, mark those dimensions as "Clear (from vision)" and skip the corresponding questions.
 3. **Reference vision goals explicitly.** When classifying items or asking remaining triage questions, cite the specific vision goal or statement that informs the classification (e.g., "Per the product vision goal 'Zero-config onboarding', this item aligns with...").
 4. **Greenfield streamlining.** For greenfield projects where the vision covers all major work areas, batch all items into a single triage pass. Present the vision-derived context for each item and ask the user to confirm or override in one prompt rather than per-item questioning.
+5. **Tier-3 greenfield gate (P8 B2).** Vision-aware triage skips a readiness dimension ONLY when the product vision explicitly addresses it; otherwise mark the dimension `unresolved — needs user input` and ask via the platform-native question tool. If readiness gate (Step 5.6) reports unresolved external blockers, re-run triage per dimension.
 If no product vision is available, proceed with the full triage process below.

package/commands/hatch3r-create.md CHANGED Viewed

@@ -106,6 +106,8 @@ Branch on the cached `type`:
 Render the proposed file path, full frontmatter block, and body-skeleton outline. For an agent plan, the summary lists `Path`, `Type`, `Name`, `Description` (first 80 chars), `Tags`, `Adapters` (or "all enabled"), `Model`; then the frontmatter block; then the body-skeleton outline (`<task>`, `<context>`, Implementation Protocol numbered steps, `<rules>`). For other types, swap the type-specific slots from Step 1.6.
+**Scope-boundary check (P8 B2).** Confirm proposed scope and tool-allowlist before Phase 2 delegation. Artifact scope cannot be broadened via markdown injection post-creation. Reject any user-supplied edit that expands the tool allowlist, target file globs, or pipeline references beyond what was confirmed here; route such expansions through a fresh `/h4tcher-create` invocation.
 **ASK:** "Confirm to delegate authoring to `hatch3r-creator`, or specify changes (e.g., 'change model to fast', 'add tag: review')."
 Loop until the user confirms.