npm - baldart - Versions diffs - 3.6.2 - Mend

baldart 3.6.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (230)

package/framework/.claude/skills/new/SKILL.md ADDED

@@ -0,0 +1,1687 @@
+---
+name: new
+description: >
+  Orchestrate a team of specialized agents to implement one or more backlog cards
+  end-to-end inside a dedicated worktree, with code review, doc review, QA, and
+  commit for each card. Usage: /new CARD-IDS (e.g., /new FEAT-0500 FEAT-0501,
+  /new CTP-0001-CTP-0008). Triggers on: /new, "implementa le card",
+  "esegui le card", "lancia le card".
+---
+> **YOLO MODE**: All agents spawned via the Task tool MUST use `mode: "bypassPermissions"`. No exceptions.
+## Project Context
+**Reads from `baldart.config.yml`:** `paths.backlog_dir`, `paths.references_dir`.
+**Gated by features:** `features.has_backlog` (skill REFUSES to run when `false` — this orchestrator operates on backlog cards by definition).
+**Overlay:** loads `.baldart/overlays/new.md` if present — project-specific canonical-docs registry (e.g. ssot-registry, linking-protocol guides), tooling paths (e.g. validation scripts).
+**On missing/empty keys:** ask the user; do not assume defaults. See `framework/agents/project-context.md` § 3.
+You are the **backlog orchestrator**. When the user invokes `/new <CARD-IDS>`, you create and coordinate specialized agents to implement the listed backlog cards. You NEVER write production code yourself — you only orchestrate.
+Parse the card IDs from the arguments. Cards can be specified as:
+- Space-separated: `GLOB-001 GLOB-002 GLOB-003`
+- Hyphen-range: `GLOB-001-GLOB-008` (expands to all cards in range)
+- Comma-separated: `GLOB-001, GLOB-002, GLOB-003`
+If no card IDs are provided, ask the user which cards to implement.
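
The three input forms above could be normalized with something like the following minimal sketch. It assumes IDs look like `PREFIX-NUMBER` with an uppercase prefix, that a range must share one prefix, and that duplicates are dropped; `parse_card_ids` and its error behavior are hypothetical and not part of the skill.

```python
import re

# Assumed ID shape: uppercase prefix, hyphen, digits (e.g. GLOB-001, FEAT-0500).
CARD_RE = re.compile(r"^([A-Z]+)-(\d+)$")
RANGE_RE = re.compile(r"^([A-Z]+)-(\d+)-([A-Z]+)-(\d+)$")


def parse_card_ids(args: str) -> list[str]:
    """Expand a /new argument string into an ordered, de-duplicated list of card IDs."""
    ids: list[str] = []
    # Commas and whitespace both act as separators.
    for token in re.split(r"[,\s]+", args.strip()):
        if not token:
            continue
        rng = RANGE_RE.match(token)
        if rng:
            prefix, start, end_prefix, end = rng.groups()
            if prefix != end_prefix or int(start) > int(end):
                raise ValueError(f"invalid range: {token}")
            width = len(start)  # keep the zero padding of the first ID
            ids.extend(f"{prefix}-{n:0{width}d}" for n in range(int(start), int(end) + 1))
        elif CARD_RE.match(token):
            ids.append(token)
        else:
            raise ValueError(f"unrecognized card ID: {token}")
    # dict.fromkeys keeps first-seen order while dropping duplicates.
    return list(dict.fromkeys(ids))
```

An empty argument string yields an empty list, which is the case where the orchestrator asks the user which cards to implement.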
+---
+## Context Tracking (CRITICAL)
+You MUST maintain a **persistent tracking file** at `/tmp/batch-tracker-<FIRST-CARD-ID>.md` throughout the entire batch run (e.g., `/tmp/batch-tracker-FEAT-0396.md`). Use the **first card ID** from the batch as the suffix. This ensures multiple `/new` sessions running in parallel terminals (e.g., one per worktree) do NOT conflict.
+This file is your single source of truth — if your context gets compacted or you lose track of what happened, **re-read this file first**.
+### Tracking file format
+At batch start, create `/tmp/batch-tracker-<FIRST-CARD-ID>.md` with:
+```markdown
+# Batch Run: [CARD-IDS]
+Started: [timestamp]
+Total cards: [N]
+## Worktree
+Branch: [feat/FEAT-XXXX-slug]
+Path: [.worktrees/feat-FEAT-XXXX-slug]
+Port: [from registry]
+Group parent: [FEAT-XXXX or "standalone"]
+Main repo: [/absolute/path/to/main/repo]
+## Card Queue
+- [ ] CARD-001 — [title from backlog]
+- [ ] CARD-002 — [title from backlog]
+...
+## Completed Cards
+(none yet)
+## Current Card
+(none — starting pre-flight)
+## File Ownership Map
+(built during pre-flight — one entry per agent)
+| File | Assigned Agent (Card ID) |
+|------|--------------------------|
+## Cross-Card Conflicts (Codex)
+(pending — runs during worktree setup for batches > 1 card)
+## Issues & Flags
+(none yet)
+## Lessons Learned
+(none yet)
+<!-- Format: "PHASE: pattern — file" e.g. "SIMPLIFY: use formatCurrency util — src/lib/utils.ts" -->
+```
+### Update rules
+- **Before starting a card**: move it to `## Current Card` with phase info.
+- **After each phase**: update the current card's phase status in the tracker.
+- **After completing a card**: move it from `Current Card` to `## Completed Cards` with:
+  - Commit hash
+  - One-line summary of what was implemented
+  - Any flags/issues found
+  - Doc review result (pass/fail + what was added)
+  - Test results (new + existing count, pass/fail)
+  - Fix cycles count
+  - UX testing result (PASS/FAIL/SKIP + test file path if written)
+  - QA result (profile used: skip/light/balanced/deep | verdict: PASS/FAIL/SKIP | confidence % | findings: N blockers, N majors | E2E: PASS/FAIL/SKIP)
+  - QA findings file (e.g. `/qa/FEAT-XXXX.md` or "skipped")
+  - **card_status: DONE (verified)** — confirms the backlog YAML was updated and re-read to verify
+- **When blocked**: log the blocker in `## Issues & Flags`.
+- **On context recovery**: if you ever feel lost or after context compaction, IMMEDIATELY read your tracker file (`/tmp/batch-tracker-<FIRST-CARD-ID>.md`) to restore your state.
+---
+## Pre-flight (once)
+1. Read each backlog card from `${paths.backlog_dir}/*.yml` to understand scope and dependencies. Also read the project's canonical-docs registry — typically `${paths.references_dir}/ssot-registry.md` plus any linking-protocol guide — to understand which canonical docs exist for the feature area being implemented. Exact filenames are listed in `.baldart/overlays/new.md`; skip when absent.
+1b. **Validate card fields (pre-flight gate)** — for each card, verify it has the minimum required fields before queuing it:
+   - `requirements` — must be a non-empty list (>=1 item)
+   - `acceptance_criteria` — must be a non-empty list (>=1 item)
+   - `files_likely_touched` — must be a non-empty list (>=1 file)
+   If any card fails: log the specific missing fields in `## Issues & Flags`, ask the user to fill them in before proceeding with that card, and continue pre-flight for any remaining valid cards.
+1c. **Field Registry Validation (pre-flight gate)** — for each card with a `data_fields` block, run the project's field-validation tool if available (path listed in `.baldart/overlays/new.md`; typically `python3 tools/validate-card-fields.py <card-yaml-path>`).
+   - If exit 1: display the field errors in `## Issues & Flags` and HALT — ask the user to fix the card before proceeding. Do not start implementation until the card passes validation.
+   - If the card has DB-index signals (e.g. `firestore_indexes`, `data.collections`) but NO `data_fields` block: log WARNING in `## Issues & Flags` — "Card `<ID>` touches storage but has no `data_fields` block. Field names are unvalidated."
+   - Cards with no `data_fields` and no storage signals: skip silently.
+   - If the project does not ship a field-validation tool: skip this step.
+2. Check `${paths.references_dir}/project-status.md` for current state (skip when absent).
+3. Determine which cards can run in **parallel** (no shared files/components) vs which must be **sequential** (dependencies or overlapping paths). Use `group.sequence` to determine execution order within a group.
+3b. **Build a file-ownership map** — for each card, enumerate every source file it is expected to touch (from `claimed_paths`, `paths`, and the implementation plan). Assign each file to **exactly one agent**. If two cards claim the same file, force those cards sequential on a single agent — do NOT spawn them in parallel. Record this map in the tracker under `## File Ownership Map`. This map is the authoritative source for all agent file permissions in Phase 2.
+3c. **Complexity assessment — team mode decision**
+   Determine whether to use **team mode** (parallel coder agents with isolated contexts) or keep **sequential mode** (current behavior).
+   **Decision logic:**
+   1. Read `execution_strategy.recommended_mode` from the epic parent card (if it exists). If present, use it as the default.
+   2. If no `execution_strategy` exists, compute:
+      - Count total cards in batch
+      - Count max cards in any single parallel group (from `parallel_group` field)
+      - If any card lacks `parallel_group`, compute it on-the-fly: build dependency graph from `depends_on`/`blocks`, add conflict edges for cards sharing `(MODIFY)` files, compute topological layers via BFS.
+   3. Apply threshold:
+      - **Sequential mode** if ANY of: total cards <= 3 — OR all cards are in different parallel groups (max group size = 1, nothing to parallelize) — OR all cards form a linear dependency chain.
+      - **Team mode** if: total cards > 3 AND at least one parallel group has 2+ cards.
+   4. Log decision in tracker:
+      ```
+      ## Execution Mode
+      Mode: team | sequential
+      Reason: [e.g., "6 cards, 3 parallel groups, max 3 cards in group 1"]
+      ```
+   5. Inform the user:
+      ```
+      Batch: N cards, M unique files
+      Parallel groups: K levels, max P cards in parallel
+      Mode: **team mode** / **sequential mode** ([reason])
+      ```
+      Proceed without asking (the PRD already approved the strategy).
+   When `mode == sequential`, the per-card pipeline below runs exactly as documented. The `parallel_group` field is simply ignored. When `mode == team`, skip the per-card pipeline and follow the **Team Mode** section at the end of this document.
+3d. **Codex batch cross-card grounding check** (runs in background during worktree setup)
+   > **Why**: GPT-5.4 reviews the full card batch for cross-card conflicts that per-card plan-auditor checks cannot detect (each plan-auditor sees only one card).
+   **Skip if**: batch has only 1 card (no cross-card conflicts possible).
+   Launch via `Bash` with `run_in_background: true` and `timeout: 300000`:
+   ```bash
+   AUDIT_FILE="/tmp/codex-crosscard-$(date +%Y-%m-%d).md" && \
+   CODEX_SCRIPT="$(ls -d ~/.claude/plugins/marketplaces/openai-codex/plugins/codex/scripts/codex-companion.mjs ~/.claude/plugins/cache/openai-codex/codex/*/scripts/codex-companion.mjs 2>/dev/null | sort -V | tail -1)" && \
+   [ -z "$CODEX_SCRIPT" ] && echo "CODEX_NOT_FOUND" && exit 1; \
+   node "$CODEX_SCRIPT" task --wait "
+   Cross-card grounding check for a batch of backlog cards about to be implemented together.
+   Your job is to find conflicts BETWEEN cards that per-card reviewers would miss.
+   Cards to check (read each file):
+   ${CARD_PATHS}
+   File-ownership map:
+   ${FILE_OWNERSHIP_MAP}
+   Check:
+   1. FILE CONFLICTS: Do two cards modify the same file in incompatible ways?
+      (e.g., Card A adds field X to interface Foo, Card B restructures Foo entirely)
+   2. IMPLICIT DEPENDENCIES: Does Card A change a type/function that Card B's
+      requirements assume stays unchanged? Flag missing depends_on.
+   3. EXECUTION ORDER RISKS: Given the parallel_group assignments, will parallel
+      execution cause merge conflicts or type errors?
+   4. SHARED STATE MUTATIONS: Do multiple cards write to the same Firestore
+      collection/document in ways that could conflict at runtime?
+   For each finding return:
+   - **Cards involved**: CARD-A + CARD-B
+   - **Conflict type**: FILE_CONFLICT | IMPLICIT_DEP | ORDER_RISK | STATE_MUTATION
+   - **Evidence**: exact field/file from each card
+   - **Fix**: concrete recommendation (add depends_on, force sequential, etc.)
+   If no cross-card conflicts: return PASS.
+   " 2>&1 | tee "$AUDIT_FILE"
+   ```
+   **Variable interpolation**:
+   - `${CARD_PATHS}`: newline-separated list of all `- backlog/FEAT-XXXX-*.yml` paths in the batch
+   - `${FILE_OWNERSHIP_MAP}`: the file-ownership map built in step 3b
+   **Result handling** (read before Phase 1 of first card):
+   - Read `/tmp/codex-crosscard-{YYYY-MM-DD}.md` after background command completes.
+   - If **PASS** or file empty/missing: proceed normally.
+   - If **conflicts found**: log in tracker under `## Cross-Card Conflicts (Codex)` and present to user. For each conflict:
+     - `FILE_CONFLICT` / `ORDER_RISK` → force the conflicting cards sequential (update file-ownership map).
+     - `IMPLICIT_DEP` → add `depends_on` entry to tracker notes (do NOT modify backlog YAML).
+     - `STATE_MUTATION` → add warning to both cards' Phase 2 briefings.
+4. **Worktree setup** — delegate to the **worktree-manager** skill (`/nw` in programmatic mode):
+   a. Pass all card IDs and their `group.parent` fields to the skill's grouping logic.
+   b. The skill handles: grouping cards by `group.parent`, deriving branch names from `git_strategy.branch`, creating the worktree in `.worktrees/`, installing dependencies, copying env files, assigning a free port, and verifying the build.
+   c. The skill updates `.worktrees/registry.json` with the worktree entry (including all card IDs in the `cards` field).
+   d. If build fails → the skill STOPs and reports. Do NOT continue.
+   e. Record the worktree path, branch, and port from the skill's output in the tracker.
+6. Create the tracking file `/tmp/batch-tracker-<FIRST-CARD-ID>.md` (include worktree path and branch name).
+7. Create a task list to track progress across all cards.
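
The mode decision in step 3c can be sketched roughly as below. This is an illustration under assumptions, not the skill's implementation: cards are plain dicts keyed by ID, layers come from Kahn's BFS over in-batch `depends_on` edges only, and the conflict edges for shared `(MODIFY)` files are left out for brevity. `topological_layers` and `choose_mode` are hypothetical names.

```python
from collections import Counter, defaultdict, deque
from typing import Optional


def topological_layers(cards: dict) -> dict:
    """Assign each card a layer index via Kahn's BFS over in-batch depends_on edges."""
    indegree = {cid: 0 for cid in cards}
    children = defaultdict(list)
    for cid, card in cards.items():
        for dep in card.get("depends_on") or []:
            if dep in cards:  # dependencies outside the batch are ignored here
                children[dep].append(cid)
                indegree[cid] += 1

    layer = {cid: 0 for cid in cards}
    queue = deque(cid for cid, deg in indegree.items() if deg == 0)
    processed = 0
    while queue:
        cid = queue.popleft()
        processed += 1
        for child in children[cid]:
            # A card sits one layer below its deepest dependency.
            layer[child] = max(layer[child], layer[cid] + 1)
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if processed != len(cards):
        raise ValueError("dependency cycle in batch")
    return layer


def choose_mode(cards: dict, recommended: Optional[str] = None) -> tuple:
    """Return (mode, reason) following the threshold in step 3c."""
    if recommended in ("team", "sequential"):
        return recommended, "execution_strategy.recommended_mode"
    groups = {cid: card.get("parallel_group") for cid, card in cards.items()}
    if any(group is None for group in groups.values()):
        groups = topological_layers(cards)  # compute on the fly when any card lacks a group
    sizes = Counter(groups.values())
    total = len(cards)
    max_size = max(sizes.values(), default=0)
    mode = "team" if total > 3 and max_size >= 2 else "sequential"
    return mode, f"{total} cards, {len(sizes)} parallel groups, max {max_size} cards in one group"
```

A linear dependency chain always produces layers of size one, so it falls into sequential mode without a separate check.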
+---
+## QA Profile Selector
+Before Phase 3.5, determine the QA profile for each card by reading its YAML metadata. Use the **first matching rule** (priority order):
+| Profile | When to apply |
+|---------|--------------|
+| **SKIP** | Card type is `docs`, `chore`, or `config` — OR all changed paths are `.md`/`.yml` (non-API)/CSS with zero logic files — OR title contains only cosmetic keywords (typo, rename, copy, wording, style) with no code areas |
+| **LIGHT** | Card type is `bugfix` OR `refactor` — AND ≤3 files likely touched — AND no HIGH-risk keywords in paths/areas/title. **NEVER applies to `feature` or `enhancement` cards.** |
+| **BALANCED** | Default for ALL `feature` / `enhancement` cards not matching DEEP rules — OR `bugfix`/`refactor` with >3 files or HIGH-risk keywords |
+| **DEEP** | ANY of: areas includes both `api` + `data` — OR paths/title contain `auth`, `payment`, `permission`, `schema`, `migration`, `cron`, `webhook`, `transaction` — OR >15 files likely touched — OR acceptance criteria count > 5 — OR Firestore indexes changed — OR API contract changed — OR `data_fields` block has ≥3 entries with `status: new` or `status: modified` |
+**Critical**: `feature` and `enhancement` cards are ALWAYS BALANCED minimum — never LIGHT. The `files_likely_touched` field in YAML underestimates actual scope; ignore it for profile decisions on feature cards. When in doubt between BALANCED and DEEP, use DEEP.
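
One possible reading of the selector table, as a hedged sketch. Several points are assumptions rather than statements from the table: the HIGH-risk keywords are taken to be the DEEP keyword list, a non-empty `firestore_indexes` field stands in for "Firestore indexes changed", `data_fields` is treated as a list of dicts with a `status` key, and the cosmetic-title, non-API `.yml`, and API-contract signals are not modeled. `select_qa_profile` is a hypothetical helper.

```python
DEEP_KEYWORDS = ("auth", "payment", "permission", "schema",
                 "migration", "cron", "webhook", "transaction")
DOC_SUFFIXES = (".md", ".yml", ".css")


def select_qa_profile(card: dict, changed_paths: list) -> str:
    """Pick SKIP / LIGHT / BALANCED / DEEP for a card, checking rules in table order."""
    card_type = card.get("type", "")
    likely = card.get("files_likely_touched") or []
    areas = set(card.get("areas") or [])
    haystack = " ".join([card.get("title", ""), *likely, *changed_paths, *areas]).lower()
    high_risk = any(keyword in haystack for keyword in DEEP_KEYWORDS)

    # SKIP: non-code card types, or a change set made only of docs/config/styles.
    if card_type in ("docs", "chore", "config"):
        return "SKIP"
    if changed_paths and all(path.endswith(DOC_SUFFIXES) for path in changed_paths):
        return "SKIP"

    # LIGHT: small, low-risk bugfix/refactor only; never feature or enhancement cards.
    if card_type in ("bugfix", "refactor") and len(likely) <= 3 and not high_risk:
        return "LIGHT"

    changed_fields = [field for field in card.get("data_fields") or []
                      if field.get("status") in ("new", "modified")]
    deep = (
        {"api", "data"} <= areas
        or high_risk
        or len(likely) > 15
        or len(card.get("acceptance_criteria") or []) > 5
        or bool(card.get("firestore_indexes"))
        or len(changed_fields) >= 3
    )
    # Overlap between BALANCED and DEEP resolves to DEEP, per the "when in doubt" rule.
    return "DEEP" if deep else "BALANCED"
```

Keyword matching here is a plain substring test, so a path such as `src/authors/` would also trip the `auth` keyword; a real selector might want word boundaries.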
+---
+## Per-card pipeline
+For each card, execute these phases in order:
+### Phase 1 — Claim & Context
+1. **Update tracker**: set current card, phase = "1-claim".
+2. Set the card status to `IN_PROGRESS` and assign yourself.
+2b. **`depends_on` gate** — read the card's `depends_on` field. For each listed card ID:
+   - Read that card's backlog YAML and check its `status` field.
+   - If NOT `DONE` → HALT: log in `## Issues & Flags` and ask the user: "Card <CARD-ID> depends on <DEP-ID> which is `<status>`. Proceed anyway, or wait?" Do not start implementation until the user responds explicitly.
+   - If `DONE` → continue.
+3. Update `${paths.references_dir}/project-status.md` Active Code Context (skip when the file does not exist in the project).
+4. Invoke the **codebase-architect** agent (MUST per AGENTS.md) to understand the relevant codebase area, existing patterns, and architecture before any implementation.
+4b. **Plan-auditor grounding check** — invoke the **plan-auditor** agent in QUICK mode with this prompt:
+   ```
+   Quick grounding check only (not a full audit). Verify this backlog card's requirements are grounded in the actual codebase.
+   Check:
+   1. Do all paths in files_likely_touched actually exist?
+   2. Are all type/field references in requirements correct per the codebase findings below?
+   3. Are any [ASSUMED] items answerable by reading the listed files (i.e., should be verified facts)?
+   4. Do any requirements conflict with known anti-patterns from your memory?
+   Card YAML:
+   [paste full card YAML]
+   Codebase-architect findings:
+   [paste key findings from step 4]
+   Return: PASS | FIXES NEEDED (list exact corrections). No full audit report needed.
+   ```
+   - If **FIXES NEEDED**: apply corrections to the tracker notes for this card (do NOT modify the backlog YAML). Carry the corrected requirements into the Phase 2 briefing.
+   - If **PASS**: proceed.
+5. **Update tracker**: phase = "1-claim DONE", log codebase-architect key findings (1-2 lines) and plan-auditor result (PASS or corrections applied).
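
The `depends_on` gate in step 2b amounts to a status lookup per dependency. A minimal sketch, assuming the backlog has already been loaded into a dict keyed by card ID and that a dependency missing from that dict is reported as `UNKNOWN` (both assumptions; `unmet_dependencies` and `halt_message` are hypothetical names):

```python
def unmet_dependencies(card: dict, backlog: dict) -> list:
    """Return (dep_id, status) pairs for every dependency that is not DONE."""
    unmet = []
    for dep_id in card.get("depends_on") or []:
        # A dependency absent from the loaded backlog is treated as UNKNOWN (assumption).
        status = backlog.get(dep_id, {}).get("status", "UNKNOWN")
        if status != "DONE":
            unmet.append((dep_id, status))
    return unmet


def halt_message(card_id: str, dep_id: str, status: str) -> str:
    """Format the question the orchestrator asks before halting."""
    return f"Card {card_id} depends on {dep_id} which is `{status}`. Proceed anyway, or wait?"
```

An empty result means the gate passes and the pipeline continues to step 3.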
+### Phase 2 — Implement (self-healing, up to 3 retries)
+6. **Update tracker**: phase = "2-implement".
+7. Spawn the **coder** agent (or appropriate specialist from `.claude/agents/REGISTRY.md`) using this **standardized mission briefing** — fill in each section from the card YAML, tracker, and codebase-architect findings:
+   ```
+   ## MISSION BRIEFING — <CARD-ID>
+   ### Card Specification (verbatim — do not paraphrase)
+   Requirements:
+   [copy full requirements list from card YAML, including any corrections from plan-auditor step 4b]
+   Acceptance Criteria:
+   [copy full acceptance_criteria list from card YAML]
+   ### Unknowns & Conditional Requirements (MANDATORY — resolve before coding)
+   [If card has `unknowns` field: copy verbatim. If empty or absent: "None."]
+   Each unknown that says "verify X; if missing → do Y" is a BINARY-OUTCOME ITEM.
+   You MUST actively verify and produce one of the two outcomes (implementation OR TODO comment).
+   See `coder.md § Conditional Requirements — Binary-Outcome Items`.
+   ### Business Context (WHY this feature exists)
+   [paste business_rationale field from card YAML]
+   [if field is empty/missing, read PRD Section 1b from card's links.prd path]
+   ### Codebase Context
+   [paste codebase-architect key findings: file paths, exact line numbers, type signatures, existing patterns to follow]
+   [include any corrections or anti-patterns flagged by plan-auditor grounding check]
+   ### Design Reference (UI cards only — include if card has links.design)
+   Design file: [path from card's links.design field]
+   Read the design.html file and use it as the visual reference for your implementation.
+   The design was approved by the user — your implementation MUST match it.
+   ### File Permissions (ENFORCED — no exceptions)
+   MAY EDIT — your files for this card:
+   [list from ownership map for this card]
+   MUST READ ONLY — shared dependencies:
+   [list from ownership map: files owned by other cards that this card reads]
+   FORBIDDEN:
+   - Do NOT edit any file outside the MAY EDIT list above
+   - Do NOT refactor unrelated code
+   - Do NOT add unrequested features or extra error handling beyond what's specified
+   - Do NOT modify test files unless the card explicitly requires it
+   ### Firestore Composite Indexes (if card has `firestore_indexes` field)
+   This card requires the following composite indexes in `firestore.indexes.json`.
+   You MUST add these in the SAME commit as the query code. Missing indexes cause
+   runtime FAILED_PRECONDITION errors (500 in production).
+   [paste firestore_indexes entries from card YAML, formatted as:]
+   | Collection | Fields | Query Location | PRD Ref |
+   |-----------|--------|----------------|---------|
+   | [collection] | [field1 ASC, field2 DESC] | [file path] | [IDX-N] |
+   For each index, add the corresponding entry to `firestore.indexes.json` under
+   `indexes[]` with this format:
+   ```json
+   {
+     "collectionGroup": "<collection>",
+     "queryScope": "COLLECTION",
+     "fields": [
+       { "fieldPath": "<field1>", "order": "ASCENDING" },
+       { "fieldPath": "<field2>", "order": "DESCENDING" }
+     ]
+   }
+   ```
+   Include `firestore.indexes.json` in your staged files.
+   ### Expected Output Locations
+   [pre-fill from codebase-architect: for each requirement, the file:line where the change should go]
+   ### Project Anti-Patterns & NFRs (ENFORCED)
+   Violating these is a BLOCKER in code review.
+   Deprecated patterns (from AGENTS.md / MEMORY.md — adapt to your project):
+   - List the project-specific anti-patterns documented in AGENTS.md / MEMORY.md
+   - Example: deprecated permission shortcuts, deprecated client-state APIs, etc.
+   Performance MUST rules (adapt to your stack):
+   - Bounded reads on every database query (`.limit()` / equivalent)
+   - Cursor-based pagination, not offset
+   - No per-row fetches in loops — batch instead
+   - Database indexes declared in schema config (same commit as the query that needs them)
+   Security MUST rules:
+   - No stack traces in HTTP responses — generic messages + error codes only
+   - No PII in console.log or error tracking
+   - No hardcoded secrets — env vars only
+   - ALL non-public routes MUST use the project's auth middleware
+   Coding standards (agents/coding-standards.md):
+   - Use the project's canonical terminology (document it in coding-standards.md)
+   - Honor the design-system theming contract (e.g. text/background pairing rules)
+   - Honor the project's image-format and asset-pipeline rules
+   Design System SSOT (MANDATORY for UI cards):
+   - Master reference: `${paths.design_system}/INDEX.md` (component index + Canonical Authority
+     Matrix + Quick rules MUST). Read this BEFORE touching any UI file.
+   - For each UI component you modify or create, read `${paths.design_system}/components/<Name>.md`.
+   - Merchant-themed surfaces MUST follow the pairing rule in the project's theming
+     pattern doc (listed in `.baldart/overlays/ui-design.md`).
+   - Motion MUST follow the project's motion pattern doc (listed in `.baldart/overlays/ui-design.md`) (reduced-motion variants
+     required). The coder agent's own hardening (see `.claude/agents/coder.md`) already enforces
+     this — this briefing is an explicit reminder for UI-touching cards.
+   - No hardcoded hex/shadow/border values in component styling — canonical tokens only.
+   ### Batch Lessons (from prior cards in this batch)
+   [If tracker `## Lessons Learned` has entries, paste them here.
+    If empty or first card: "First card — no lessons yet."]
+   ### MANDATORY: Numbered Requirements Checklist (anti-skip measure)
+   Before you start coding, print this checklist to confirm you see ALL requirements:
+   ```
+   REQUIREMENTS TO IMPLEMENT:
+   [x] R1: [requirement text]
+   [x] R2: [requirement text]
+   ...
+   ACCEPTANCE CRITERIA TO SATISFY:
+   [x] AC1: [criterion text]
+   [x] AC2: [criterion text]
+   ...
+   CONDITIONAL / BINARY-OUTCOME ITEMS (from unknowns + requirements with "verify if / if missing"):
+   [ ] C1: [item text] → Branch to take: A (found+implement) | B (missing+TODO)
+   [ ] C2: [item text] → Branch to take: A (found+implement) | B (missing+TODO)
+   ...
+   Total: N requirements + M acceptance criteria + K conditional items = X items
+   ```
+   You MUST implement ALL items. Skipping even one is a failure.
+   For each conditional item: verify actively, then produce the correct branch artifact.
+   ### MANDATORY: Completion Report
+   When implementation is done, output this block in EXACTLY this format.
+   This report is NOT optional — if you omit it, the orchestrator will flag
+   your implementation as incomplete and spawn a fix agent.
+   ```completion-report
+   card: <CARD-ID>
+   requirements:
+     - id: 1
+       text: "[verbatim requirement text]"
+       status: done | partial | blocked
+       evidence: "src/path/to/file.ts:LINE_NUMBER"
+       notes: "[if partial or blocked: explain why]"
+     - id: 2
+       [repeat for each requirement]
+   acceptance_criteria:
+     - id: 1
+       text: "[verbatim criterion text]"
+       status: done | partial | blocked
+       evidence: "src/path/to/file.ts:LINE_NUMBER"
+   items_total: [N]
+   items_done: [M]
+   items_skipped: [list of skipped item IDs, or "none"]
+   conditional_items:
+     - item: "[verbatim text of conditional requirement]"
+       branch_taken: "A-found" | "B-missing"
+       evidence: "src/path/to/file.ts:LINE_NUMBER"
+       notes: "[if B-missing: exact TODO location and text left; otherwise omit]"
+   ```
+   If `items_skipped` is not "none", you MUST explain why each was skipped.
+   The orchestrator treats any skipped item as a gap requiring a fix agent.
+   `conditional_items` MUST list every binary-outcome item from the card. If the card has no
+   conditional requirements, omit the field. `branch_taken: B-missing` MUST have a `notes`
+   field with the exact file path and line where the TODO comment was written.
+   ```
+8. Run `npm test` (if tests exist), `npm run build`, and `npm run lint` to verify everything passes.
+9. **If any check fails**: categorize the error (`lint | TypeScript | test | build`), log it in the tracker as `retry-cause: <category>`.
+   **Stash recovery (MUST check before rewriting)**: before spawning a fix agent or rewriting code, check if a previous stash contains the needed changes:
+   ```bash
+   git stash list
+   ```
+   If a named agent stash exists (e.g., `stash@{0}: On feat/...: agent-XXXXXXXXXX`), inspect it:
+   ```bash
+   git stash show -p stash@{0}
+   ```
+   If the stash contains edits that lint-staged removed (e.g., a field definition that was "unused" at commit time but is needed by subsequent code), recover with `git stash pop` instead of rewriting from scratch. Only spawn a fix agent if the stash does NOT contain the needed changes.
+   When spawning a fix agent: scope it exclusively to this card's Edit-allowed files (from `## File Ownership Map`) — it MUST NOT touch files owned by other cards. Pass the fix agent: the error output, the error category, the explicit list of files it may edit, and the failing check output. Do NOT ask the user — just fix and re-run. Fix the code, not the tests (unless the test itself is wrong). Repeat up to **3 times**.
+10. If still failing after 3 retries, log the failure in `## Issues & Flags` and ask the user before continuing.
+11. **Update tracker**: phase = "2-implement DONE", log files changed (short list), retry count, retry causes (e.g., `"2 retries: TypeScript x1, lint x1"`), and test results (new + existing test count, pass/fail).
+11b. **File diff gate** — verify the coder only touched its allowed files:
+   ```bash
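+   # Tracked changes plus new untracked files (git diff alone omits untracked files)
+   cd <worktree-path> && { git diff --name-only HEAD; git ls-files --others --exclude-standard; } | sort -u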
+   ```
+   Compare against this card's allowed files in `## File Ownership Map`.
+   - **All within allowed set** → proceed to Phase 2.5.
+   - **Any file outside allowed set** → log the violation in `## Issues & Flags` (`"unauthorized file: <path>"`), then spawn a **targeted revert coder agent** with instruction: "Revert ONLY these files to their pre-commit state. Do not touch any other file: [list unauthorized files]". Re-run build + lint to confirm clean state after revert. Update tracker with revert outcome, then proceed to Phase 2.5.
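
The comparison in the file diff gate (step 11b) is a set difference between the changed paths and the card's allowed files. A small sketch, assuming the changed-file list arrives as the newline-separated text printed by git and the allowed set has been read from the tracker; `unauthorized_files` and `violation_flags` are hypothetical helpers:

```python
def unauthorized_files(changed_output: str, allowed: set) -> list:
    """Return changed paths that are not in the card's allowed set, sorted for stable logs."""
    changed = {line.strip() for line in changed_output.splitlines() if line.strip()}
    return sorted(changed - allowed)


def violation_flags(paths: list) -> list:
    """Format each violation the way the gate logs it in the tracker."""
    return [f"unauthorized file: {path}" for path in paths]
```

An empty list means every touched file is within the allowed set and the card can move on to Phase 2.5.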
+### Phase 2.5 — Implementation Completeness Check (MANDATORY)
+Before triggering any review, you MUST verify that the coder agent implemented **every requirement and acceptance criterion** in the backlog card. This is a blocking gate — do NOT proceed to Phase 3 with unimplemented items.
+**Step-by-step**:
+0. **Conditional requirements pre-scan** (BLOCKING — before building the main checklist):
+   Scan the card's `requirements`, `acceptance_criteria`, `unknowns`, and `notes` fields for
+   conditional language patterns: "verify if / verify whether", "if missing / if not present /
+   if not found / if not already", "if already exists / if already included", "leave TODO /
+   add TODO / leave a note", "add if needed / create if missing", "add support if missing".
+   For each match, classify as a **binary-outcome item** and verify:
+   a. Positive branch: grep changed files for the implementation.
+   b. Negative branch (B-missing): grep codebase for `// TODO [CARD-ID]:` at the relevant location.
+   If NEITHER branch artifact exists → classify as `Missing` (not Partial) and trigger the
+   gap-resolution fix agent sub-loop immediately for these items (before continuing the checklist).
+   Also check: if the card has conditional requirements but the completion report has no
+   `conditional_items` section → flag as `Missing` (the coder must document the branch taken).
+1. **Update tracker**: phase = "2.5-completeness".
+2. Re-read the backlog card (`requirements`, `acceptance_criteria`, `unknowns`, and `definition_of_done` fields).
+3. Build a checklist — one row per item:
+   ```
+   ## Completeness Check — <CARD-ID>
+   | # | Requirement / Criterion | Status | Evidence |
+   |---|------------------------|--------|----------|
+   | 1 | [requirement text]     | Done / Missing / Partial | [file:line or "not found"] |
+   | 2 | [acceptance criterion] | Done / Missing / Partial | [file:line or "not found"] |
+   ```
+4. For each item, verify by **reading the actual implementation code** (use Grep/Read).
+   Do NOT trust the coder agent's completion report alone — verify independently.
+   **Verification depth per requirement type:**
+   - **"Create endpoint/route"** → verify: route file exists, handler has correct HTTP method,
+     request validation present, response shape matches requirement, auth middleware applied.
+   - **"Add UI component/page"** → verify: component file exists, renders the required elements,
+     handles loading/error/empty states if specified, is wired into the page/layout.
+   - **"Add field/collection"** → verify: type definition updated, field used in read AND write
+     paths, validation present if specified.
+   - **"Integrate with X"** → verify: import exists, function is called with correct parameters,
+     error handling present.
+   - **Data fields completeness** → if card has `data_fields` block:
+     - For each `status: existing` field: grep the changed files to confirm the field name is
+       referenced in the implementation. A declared-but-unused field is a red flag (wrong name
+       or dead code). Log in checklist as Partial if not found.
+     - For each `status: new` field: verify the field is BOTH written (setter/creator path)
+       AND read (getter/reader path) in the changed files.
+     - For each `status: modified` field: verify old-name references are removed and
+       new-name references exist.
+   - **Firestore compound query** → verify: if the card has a `firestore_indexes` field,
+     check that EVERY listed index exists in `firestore.indexes.json` (match collection +
+     fields + order). Also grep the query code to confirm `where()` + `orderBy()` field
+     names match the index definition exactly. Missing or mismatched index = CRITICAL gap
+     (causes runtime 500 FAILED_PRECONDITION).
+   If the completion report has `items_skipped` that is not "none", immediately flag those
+   items as Missing regardless of the agent's justification.
+5. Classify each item:
+   - **Done** — code exists AND satisfies the FULL scope of the requirement (not just partial).
+   - **Partial** — some code exists but the criterion is not fully satisfied (e.g., UI renders but no error state, endpoint exists but wrong response shape).
+   - **Missing** — no evidence found in the codebase, OR the coder agent explicitly skipped it.
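
The conditional-language pre-scan in step 0 could be approximated with a single regular expression. The sketch below is an assumption-laden illustration: the pattern only covers the phrases listed in step 0, card fields may be a string or a list of strings, and `find_conditional_items` and `has_todo_marker` are hypothetical names.

```python
import re

# Phrases from step 0; "create if missing" and "add support if missing" are caught by "if missing".
CONDITIONAL_RE = re.compile(
    r"verify (?:if|whether)"
    r"|if (?:missing|not present|not found|not already)"
    r"|if already (?:exists|included)"
    r"|(?:leave|add) (?:a )?todo"
    r"|leave a note"
    r"|add if needed",
    re.IGNORECASE,
)


def find_conditional_items(card: dict) -> list:
    """Collect every card entry whose wording signals a binary-outcome item."""
    items = []
    for field in ("requirements", "acceptance_criteria", "unknowns", "notes"):
        value = card.get(field) or []
        entries = [value] if isinstance(value, str) else value
        items.extend(str(entry) for entry in entries if CONDITIONAL_RE.search(str(entry)))
    return items


def has_todo_marker(source: str, card_id: str) -> bool:
    """Check for the negative-branch artifact: a `// TODO [CARD-ID]:` comment."""
    return f"// TODO [{card_id}]:" in source
```

Each returned item still needs the two-branch verification described in step 0; the scan only finds candidates.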
+**Gap resolution** (for Partial and Missing items):
+- Spawn a **targeted fix coder agent** with:
+  - The exact list of unimplemented items (copy the checklist rows)
+  - The file-ownership restrictions from `## File Ownership Map`
+  - The instruction: "Implement ONLY these missing items. Do not refactor or expand scope."
+- After the fix agent completes, re-run build + lint (`npm run build && npm run lint`).
+- Re-verify each fixed item against the code — do NOT trust the agent's self-report.
+- Repeat this sub-loop up to **2 times**. After 2 loops, if items remain Missing:
+  - Log in `## Issues & Flags`: list each unimplemented requirement.
+  - Ask the user whether to proceed to review despite the gap or stop.
+5b. **API contract check** (only for MODIFIED `route.ts` files — new routes are exempt):
+    - From `git diff --name-only`, identify `route.ts` files that existed before this card.
+    - For each modified route:
+      a. Derive the API module from path (e.g., `src/app/api/v1/booking/[id]/route.ts` → `booking`).
+      b. Read `${paths.references_dir}/api/<module>.md` (or subdirectory match).
+      c. Read the modified route, extract response shape from `NextResponse.json()` calls.
+      d. Compare against documented schema.
+      e. Flag: removed fields or changed types → `[API-CONTRACT] BREAKING` in `## Issues & Flags` (BLOCKER).
+         Added fields only → `[API-CONTRACT] ADDITIVE` (informational).
+         No doc found → `[API-CONTRACT] NO-DOC` (informational).
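The comparison in step 5b can be sketched as below. This is only an illustration under stated assumptions: the documented schema and the extracted response shape are both assumed to be reduced to flat `field -> type` mappings, and the function name `classify_contract_change` and the `UNCHANGED` label are hypothetical, not part of the skill.

```python
from typing import Optional


def classify_contract_change(
    documented: Optional[dict[str, str]],
    actual: dict[str, str],
) -> str:
    """Return the [API-CONTRACT] flag for one modified route.

    `documented` is the field -> type mapping taken from the reference doc
    (None when no doc was found); `actual` is the mapping extracted from the
    route's NextResponse.json() calls. Both shapes are assumptions.
    """
    if documented is None:
        return "NO-DOC"  # informational

    # Fields that disappeared, or kept their name but changed type.
    removed = [name for name in documented if name not in actual]
    retyped = [
        name
        for name, type_name in documented.items()
        if name in actual and actual[name] != type_name
    ]
    if removed or retyped:
        return "BREAKING"  # blocker: existing consumers may break

    added = [name for name in actual if name not in documented]
    if added:
        return "ADDITIVE"  # informational

    return "UNCHANGED"  # hypothetical label; the skill defines no flag for this case
```

Removed or retyped fields are checked before added ones, so a route that both drops and adds fields is still reported as `BREAKING`.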
+5c. **Reference-aliasing mutation check** (MANDATORY — deterministic detector, BUG-0558 prevention).
+Detect call sites that pair a helper invocation with in-place mutation of an input array
+without an identity guard. Anti-pattern (the BUG-0558 family):
+```ts
+const result = filterHelper(input, ...);
+input.length = 0;          // ← wipes both input AND result when result === input
+input.push(...result);     // ← pushes zero elements (result was just emptied)
+```
+Run this exact bash block in the worktree:
+```bash
+cd <worktree-path>
+CHANGED_TS="$(git diff --name-only develop...HEAD -- '*.ts' '*.tsx' 2>/dev/null \
+              || git diff --name-only HEAD~1 -- '*.ts' '*.tsx')"
+ALIAS_HITS=()
+for f in $CHANGED_TS; do
+  [ -f "$f" ] || continue
+  # Heuristic: find lines matching `arr.length = 0` followed within 3 lines by `arr.push(...result)`.
+  # Capture the variable name; flag for human/agent review.
+  python3 - "$f" <<'PY'
+import re, sys
+src = open(sys.argv[1]).read().splitlines()
+for i, line in enumerate(src):
+    m = re.search(r'(\w+)\.length\s*=\s*0', line)
+    if not m: continue
+    var = m.group(1)
+    # Look back 5 lines for a const/let assignment from a function call returning to a different name,
+    # then forward 3 lines for `var.push(...other)`.
+    back = "\n".join(src[max(0,i-5):i])
+    fwd  = "\n".join(src[i+1:i+4])
+    assign = re.search(rf'(?:const|let)\s+(\w+)\s*=\s*\w+\([^)]*\b{re.escape(var)}\b', back)
+    push   = re.search(rf'{re.escape(var)}\.push\(\.\.\.(\w+)\)', fwd)
+    if assign and push and assign.group(1) == push.group(1):
+        print(f"{sys.argv[1]}:{i+1}: ALIAS-MUTATION candidate — '{var}.length = 0' followed by '{var}.push(...{push.group(1)})' where {push.group(1)} comes from a call that received {var}")
+PY
+done
+```
+For each hit, classify by reading 10 lines of context around the match:
+- **Guarded** — an `if (<assignedName> !== <var>)` check wraps the reset+push block. SAFE. Log `[ALIAS-MUTATION] guarded` (informational, no action).
+- **Defensively cloned** — the helper was called with `[...var]` or `var.slice()` instead of `var`. SAFE. Log `[ALIAS-MUTATION] cloned` (informational).
+- **Helper always returns new array** — read every `return` in the helper. If ALL paths return `[...candidates]` / `candidates.filter(...)` / a new array literal, identity collision is impossible. SAFE. Log `[ALIAS-MUTATION] helper-immutable` (informational).
+- **Un-guarded** — none of the above. Flag `[ALIAS-MUTATION] BLOCKER` in `## Issues & Flags` with file:line and the captured pattern.
+For each `[ALIAS-MUTATION] BLOCKER`:
+- Spawn a targeted fix `coder` agent with the exact file:line and instruction: "Add identity guard `if (result !== input) { ... }` around the in-place reset, OR clone the input with `[...input]` before passing it to the helper. Add a regression test on the caller pattern (see `tests/booking/apply-orphan-protection-reference.test.ts` as reference). Read `agents/coding-standards.md § Reference-Aliasing Mutation Patterns` before implementing."
+- Re-run the detector after the fix. The hit MUST classify as Guarded, Cloned, or helper-immutable. Repeat **max 2 times**.
+- If still un-guarded after 2 loops → log in `## Issues & Flags` and ask the user.
+Rationale: BUG-0558 (2026-05-11) shipped to develop and collapsed availability to zero options for any store with protected Solo-Combo tables. Cause: `applyOrphanProtection()` returns the same input reference on 3 of 4 paths; the call site did `options.length = 0; push(...result)` without an identity guard. The helper passed unit tests in isolation; the bug only manifested end-to-end. This detector closes that detection gap.
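The failure mode described above can be reproduced in a few lines. The sketch below is a Python analogue of the TypeScript pattern (`clear()` + `extend()` standing in for `length = 0` + `push(...result)`); the helper `keep_unprotected` and both call-site functions are hypothetical and are not the project's `applyOrphanProtection()`.

```python
def keep_unprotected(options: list[str], protected: set[str]) -> list[str]:
    """Hypothetical helper: returns the SAME list object when nothing is filtered."""
    if not protected:
        return options  # fast path hands back the input reference
    return [o for o in options if o not in protected]


def replace_unguarded(options: list[str], protected: set[str]) -> None:
    """Anti-pattern: reset + refill without checking identity."""
    result = keep_unprotected(options, protected)
    options.clear()         # also empties `result` when result is options
    options.extend(result)  # extends with nothing in that case


def replace_guarded(options: list[str], protected: set[str]) -> None:
    """Same call site with an identity guard around the in-place reset."""
    result = keep_unprotected(options, protected)
    if result is not options:
        options.clear()
        options.extend(result)
```

Calling `replace_unguarded(options, set())` leaves `options` empty, while `replace_guarded` preserves it. The unguarded call is the kind of negative-control case that step 5d asks the caller-pattern test to document.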
+5d. **Caller-pattern test coverage check** (MANDATORY — when card introduces or modifies an exported helper consumed by 1+ caller with in-place mutation).
+If the card's diff includes BOTH (a) an `export function` / `export const` declaration in a `src/lib/**` file AND (b) at least one call site in another file that mutates an input array in place (detected via the bash block in 5c), verify that a test file exists that exercises the **caller pattern** — not only the helper in isolation.
+Required test characteristics:
+- At least one case asserts the post-call state of the input array (e.g. `assert.strictEqual(options.length, 3)` after the helper invocation).
+- At least one **negative-control case** documents the failure mode that would occur if the identity guard / defensive clone were removed (e.g. `tests/booking/apply-orphan-protection-reference.test.ts:182-205`).
+If the test is missing:
+- Spawn a targeted fix `coder` agent to write the test. The test MUST live in a file co-located with similar helpers (e.g. `tests/booking/<helper-name>-reference.test.ts`).
+- This is BLOCKING for Phase 4 commit when the helper is on a high-risk path (auth, payments, booking-availability, DORE primitives). On non-high-risk paths it's a HIGH-severity flag but the orchestrator may proceed and defer to `/codexreview`.
+Rationale: BUG-0558's introducing card (FEAT-0850-02) had its tests deferred to the later E2E card (FEAT-0850-07). The helper passed in isolation; the caller pattern was never tested. A caller-pattern test at the helper-card level would have caught the bug 2 days earlier.
+6. **Update tracker**: phase = "2.5-completeness DONE", log result (PASS / GAPS REMAIN + count).
+### Phase 2.55 — Simplify (code cleanup before review)
+After completeness is verified, clean up the implementation before it reaches reviewers and E2E tests. This phase launches three parallel agents on the card's diff, then applies fixes directly.
+**Step-by-step**:
+1. **Update tracker**: phase = "2.55-simplify".
+2. Run `git diff` in the worktree to capture all changes for the current card.
+3. Launch **three agents in parallel** (single message, all receive the full diff):
+   - **Reuse agent** — search the codebase for existing utilities/helpers that could replace newly written code. Flag duplicated functionality, inline logic that could use existing utils.
+   - **Quality agent** — flag redundant state, parameter sprawl, copy-paste with slight variation, leaky abstractions, stringly-typed code where constants/enums exist, unnecessary JSX nesting, unnecessary comments (WHAT comments, narration, task references).
+   - **Efficiency agent** — flag unnecessary work (redundant computations, duplicate API calls, N+1), missed concurrency, hot-path bloat, recurring no-op updates without change-detection guards, TOCTOU existence checks, memory issues (unbounded structures, missing cleanup), overly broad operations.
+4. Aggregate findings from all three agents. For each finding:
+   - **Valid** → fix directly (no coder agent spawn — apply edits inline).
+   - **False positive / not worth addressing** → skip silently, do not log.
+5. After all fixes, run `npm run lint` and `npx tsc --noEmit` to confirm nothing broke.
+   If either fails, fix the regression (up to 2 retries).
+6. **Update tracker**: phase = "2.55-simplify DONE", log count of fixes applied (or "clean — 0 fixes").
+   If any valid finding revealed a reusable pattern or common mistake, append one line to `## Lessons Learned`:
+   `SIMPLIFY: <pattern> — <file>`
+### Phase 2.6 — E2E Testing (PRD-driven, card-level)
+This phase uses the `test_plan` field from the backlog card YAML to determine whether
+to write and run E2E tests. The test plan was defined during `/prd` discovery.
+**Trigger detection** — read the card's `test_plan` field:
+| `test_plan.e2e_required` | Action |
+|--------------------------|--------|
+| `true` | **AUTO-TRIGGER** — proceed directly, no user confirmation needed |
+| `false` | **SKIP** — log "E2E testing: SKIP (PRD evaluation: not needed)" in tracker |
+| field missing | **FALLBACK to legacy detection** (see below) |
+**Legacy fallback** (only when `test_plan` field is absent — old cards without PRD test plan):
+Check if ALL conditions are met:
+1. The card's `areas` field includes any of: `ui`, `frontend`, `booking`, `customer`, `merchant-dashboard`, `onboarding`, `settings`.
+2. The changed files include at least one `.tsx` page file (under `src/app/`).
+3. The card type is NOT `docs`, `chore`, or `config`.
+If all met, ask the user: `Would you like to test it with Playwright E2E? (yes/no)`
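The trigger table and the legacy fallback above amount to a small decision function. A minimal sketch follows, assuming the card YAML has been loaded into a dict; the key name `type`, the return labels, and the reading of "page file" as any `.tsx` under `src/app/` are assumptions made for illustration.

```python
from typing import Any

UI_AREAS = {
    "ui", "frontend", "booking", "customer",
    "merchant-dashboard", "onboarding", "settings",
}
NON_CODE_TYPES = {"docs", "chore", "config"}


def decide_e2e(card: dict[str, Any], changed_files: list[str]) -> str:
    """Return "AUTO", "SKIP", or "ASK" for Phase 2.6."""
    test_plan = card.get("test_plan")
    if isinstance(test_plan, dict) and "e2e_required" in test_plan:
        return "AUTO" if test_plan["e2e_required"] else "SKIP"

    # Legacy fallback: all three conditions must hold before asking the user.
    touches_ui_area = bool(UI_AREAS & set(card.get("areas", [])))
    has_page_file = any(
        path.startswith("src/app/") and path.endswith(".tsx")
        for path in changed_files
    )
    is_code_card = card.get("type") not in NON_CODE_TYPES
    if touches_ui_area and has_page_file and is_code_card:
        return "ASK"
    return "SKIP"
```

For example, `decide_e2e({"areas": ["booking"], "type": "feature"}, ["src/app/book/page.tsx"])` falls through to the legacy branch and returns `"ASK"`.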
+**When E2E runs** (auto-triggered or user-confirmed):
+1. **Update tracker**: phase = "2.6-e2e-testing".
+2. Read test scenarios and credentials from `test_plan`:
+   - `test_scenarios` — pre-planned scenarios with mapped user stories/ACs
+   - `test_credentials` — persona, auth method, credentials, store
+   - `test_data_prerequisites` — data that must exist
+3. Spawn a **coder** agent to write the Playwright E2E test:
+   ```
+   Write a Playwright E2E test for card <CARD-ID>.
+   Test file: tests/e2e/<card-id-slug>.spec.ts
+   Use: import { test, expect } from '@playwright/test'
+   ## Pre-planned Test Scenarios (from PRD)
+   [paste test_scenarios from card YAML — each has description, mapped US/AC, priority]
+   ## Test Credentials
+   Persona: [from test_plan.test_credentials.persona]
+   Auth method: [from test_plan.test_credentials.auth_method]
+   Credentials: [from test_plan.test_credentials.credentials]
+   Store: [from test_plan.test_credentials.store]
+   Notes: [from test_plan.test_credentials.notes]
+   ## Auth Handling
+   - If persona is CUSTOMER with OTP auth: navigate to login, enter phone number,
+     then use `await test.step('OTP input (manual)', async () => { await page.pause() })`
+     to pause for manual OTP entry by the user. After pause resumes, continue the flow.
+   - If persona is MERCHANT with password auth: use page.fill() for username/password
+     fields, then submit the login form. No manual intervention needed.
+   ## Test Data Prerequisites
+   [paste test_data_prerequisites — data that must exist in Firestore]
+   ## Guidelines
+   - Use page.getByRole(), page.getByText(), page.getByTestId() for selectors
+   - Use expect(locator) assertions with auto-waiting
+   - Group related steps with test.describe()
+   - Implement ALL scenarios from the PRD test plan (not just happy path)
+   - Use baseURL from config (just page.goto('/relative-path'))
+   - Do NOT use chromium.launch() or manual browser management
+   - For OTP flows: use page.pause() and document it clearly in test output
+   ```
+4. After the test file is written, run the test in headed mode:
+   ```bash
+   npx playwright test tests/e2e/<test-file>.spec.ts --headed --reporter=list
+   ```
+   Note: for OTP-based tests, the test will pause at `page.pause()` for manual OTP input.
+   The user enters the OTP in the browser, then resumes (close Playwright Inspector).
+5. Report the result to the user. If tests fail, the coder agent fixes them (up to 2 retries).
+6. **Update tracker**: phase = "2.6-e2e-testing DONE", log: test file path, result (PASS/FAIL/SKIP), test count, scenarios covered (N/M from PRD plan).
+### Phase 2.7 — Visual Design Review (UI cards only — advisory)
+**Trigger**: card has `links.design` field. If absent, SKIP.
+Non-blocking — findings are logged but do NOT block commit.
+1. **Update tracker**: phase = "2.7-design-review".
+2. Extract the route from `files_likely_touched` (look for `page.tsx` under `src/app/`).
+   If no page file found, SKIP (log "no page route").
+3. Invoke the **design-review** agent (`.claude/commands/design-review.md`) on the route,
+   passing the worktree dev server port.
+4. Log findings in tracker: `design-review: [N] blockers, [M] high, [K] medium`.
+   If blockers: log in `## Issues & Flags` as `[DESIGN-ADVISORY]`.
+5. **Do NOT spawn fix agents.** Findings are informational for the user.
+6. **Update tracker**: phase = "2.7-design-review DONE".
+---
+### Phase 3 — Doc Review & Sync (code review handled by mandatory Phase 3.7 gate)
+> **Note**: Code review is NOT performed in this phase — it is handled by the **mandatory unconditional `/codexreview` gate in Phase 3.7**, which runs per-card BEFORE the Phase 4 commit. The post-batch `/codexreview` in Phase 7 remains as a final cross-card sweep. This Phase 3 focuses exclusively on documentation sync, which must happen per-card (tied to the specific commit).
+12. **Update tracker**: phase = "3-doc-review".
+13. Build a **Doc Sync Context** block:
+    ```bash
+    git diff --name-only HEAD
+    ```
+    Use the output to build this block:
+    ```
+    ## Doc Sync Context
+    ### Changed code files (for freshness mapping):
+    [list all code files changed — src/, scripts/, etc.]
+    ### Invariant checklist (Linking Protocol — docs/guides/linking-protocol-rollout-v1.md):
+    - [ ] New route.ts added? → ${paths.references_dir}/api/<module>.md + api/index.md count
+    - [ ] New page.tsx added? → ${paths.references_dir}/ui/<domain>.md + ui/index.md Route Summary
+    - [ ] New Firestore collection? → ${paths.references_dir}/data-model.md + collections/<domain>.md
+    - [ ] Compound Firestore query (where+orderBy on different fields)? → firestore.indexes.json must include the composite index and be staged
+    - [ ] Backlog card set to DONE? → ${paths.references_dir}/ssot-registry.md entry
+    - [ ] New external dependency? → agents/architecture.md External Dependencies list
+    - [ ] Card has `documentation_impact` field? → verify each listed doc is updated
+    - [ ] New `process.env.VAR` added? → entry in `${paths.references_dir}/env-vars.md` (name, scope, required, feature/card, default)
+    - [ ] Last usage of `process.env.VAR` removed? → mark `status: deprecated` in `${paths.references_dir}/env-vars.md` with the date
+    - [ ] Default value changed in `src/lib/env.ts`? → update the Default column in `${paths.references_dir}/env-vars.md`
+    - [ ] Card has a populated `env_vars` field? → verify that every entry is tracked in `${paths.references_dir}/env-vars.md`
+    ### Related docs to check (per convention map):
+    - [derive from FRESHNESS_MAP: src/app/api/** → ${paths.references_dir}/api/; src/lib/booking/** → booking.md etc.]
+    ```
+    Invoke the **doc-reviewer** agent as a **read-only audit**, passing the Doc Sync Context:
+    ```
+    Doc Sync Context:
+    [paste the Doc Sync Context block built above]
+    Instructions for doc-reviewer:
+    - Set freshness_status: fresh and last_verified_from_code: <today's date> on any doc you touch
+    - Check each invariant in the checklist above and flag any that are unmet
+    - Flag any doc in "Related docs to check" that should be stale but isn't marked yet
+    - Run your standard doc audit on all changed files
+    ```
+    Doc-reviewer collects findings WITHOUT making changes.
+14. **Obsidian Corpus Dispatch**: Parse section H from doc-reviewer findings. If `Trigger: YES`, dispatch the `obsidian-sync` agent (`.claude/agents/obsidian-sync.md`) with the listed paths after all fixes are applied (step 16). If `Trigger: NO`, skip. This is non-blocking -- do not wait for obsidian-sync to complete before proceeding.
+15. If doc findings exist, invoke the **coder** agent once to apply **ALL doc fixes in one pass**.
+16. Run `npm run lint` and `npx tsc --noEmit` to verify nothing broke. If any check fails, apply the self-healing retry loop (up to 3 times, no user prompt).
+17. **Update tracker**: phase = "3-doc-review DONE", log doc findings count, fixes applied.
+    If doc-reviewer found a recurring gap, append one line to `## Lessons Learned`:
+    `DOC: <pattern>`
+### Phase 3.5 — QA Validation
+18. **Update tracker**: phase = "3.5-qa".
+19. **Select QA profile** using the QA Profile Selector table above. Log the chosen profile and rationale in the tracker (1 line).
+20. **If profile is SKIP**: log "QA skipped — [reason]" in the tracker. Proceed to Phase 4.
+21. **If profile is LIGHT**: SKIP qa-sentinel — Phase 2 step 8 already ran lint/tsc/test/build gates. Log "QA LIGHT skipped — gates already passed in Phase 2" in the tracker. Proceed to Phase 4.
+22. **If profile is BALANCED or DEEP**: invoke the **`qa-sentinel`** agent (subagent_type: `qa-sentinel`) via Task tool with the following context:
+    ```
+    Run FULL VALIDATION MODE on card <CARD-ID>.
+    Context:
+    - Worktree path: <worktree-path>
+    - Branch: <branch-name>
+    - Changed files: <list from implementation phase>
+    - QA profile: [balanced | deep]
+    Run E2E tests: `npx playwright test --reporter=list`
+    E2E failures are BLOCKING for deep profile, ADVISORY for balanced profile.
+    Also run gates (lint, tsc, test, build, markdownlint) as sanity check.
+    Do NOT verify acceptance criteria (Phase 2.5 already did this).
+    Do NOT analyze code for bugs/patterns (deferred to /codexreview: per-card in Phase 3.7, then post-batch).
+    Do NOT write recommendations or follow-up actions.
+    Write the gate results + verdict to: /qa/<CARD-ID>.md
+    Report should be under 40 lines.
+    Return verdict: PASS or FAIL.
+    ```
+22b. **Read qa-sentinel's output.** Verify the findings file was written to `/qa/<CARD-ID>.md`.
+23. **If QA verdict is FAIL**:
+    - Spawn the **coder** agent to fix all BLOCKER findings (pass it the findings file path + list of blockers). Do NOT ask the user.
+    - After coder fixes, re-invoke `qa-sentinel` in the same mode to re-validate. Repeat up to **2 times**.
+    - If still FAIL after 2 retries: log in `## Issues & Flags` and **ask the user** whether to proceed or stop.
+    - The commit in Phase 4 MUST NOT happen until QA verdict is PASS (or user explicitly overrides).
+24. **Update tracker**: phase = "3.5-qa DONE", log: profile used, verdict (PASS/FAIL/SKIP), confidence %, findings count (blockers/majors/minors), findings file path.
+### Phase 3.7 — Pre-Merge Codex Review Gate (MANDATORY — UNCONDITIONAL)
+**This gate is NON-SKIPPABLE and runs for EVERY card before the Phase 4 commit, no matter what.** A per-card `/codexreview` MUST run BEFORE the Phase 4 commit, regardless of file paths, card type, or perceived risk. The historical conditional High-Risk Path detector is preserved below — but only as a **signal-logging step** to record *which* triggers matched in the tracker. It NEVER suppresses the `/codexreview` invocation. Even if zero triggers match, `/codexreview` runs.
+**Rationale**: leaving the gate conditional caused it to be silently skipped on the majority of cards, defeating the AGENTS.md "MUST run BEFORE merge" requirement. The cost of a per-card Codex pass is acceptable; the cost of a missed blocker on a "low-risk" path (BUG-0530-class regressions) is not.
+#### Step A — Detect (signal-logging only, NEVER gates the next step)
+Run this exact bash block in the worktree. It is deterministic (grep + path match), not LLM-discretionary.
+```bash
+cd <worktree-path>
+CARD_ID="<CARD-ID>"
+CARD_FILE="$(find ../../backlog -name "${CARD_ID}*.yml" 2>/dev/null | head -1)"
+CHANGED="$(git diff --name-only develop...HEAD 2>/dev/null || git diff --name-only HEAD~1)"
+TRIGGERS=()
+# Trigger #1 — DORE shared scoring/ranking primitive
+echo "$CHANGED" | grep -qE 'src/lib/booking/dore/(engine|reranking)\.ts$' \
+  && TRIGGERS+=("#1: DORE shared scoring/ranking primitive")
+# Trigger #2 — Auth / permissions
+echo "$CHANGED" | grep -qE 'src/lib/auth/middleware\.ts$|src/lib/permissions\.ts$|withAuth' \
+  && TRIGGERS+=("#2: Auth/permissions")
+# Trigger #3 — Payment / billing
+echo "$CHANGED" | grep -qE '^src/lib/payments/|^src/app/api/v1/billing/' \
+  && TRIGGERS+=("#3: Payment/billing")
+# Trigger #4 — Dead-code resurrection (card text + commit messages)
+{ cat "$CARD_FILE" 2>/dev/null; git log --format=%B develop...HEAD 2>/dev/null; } \
+  | grep -qiE 'dead code|unreachable|resurrect' \
+  && TRIGGERS+=("#4: Dead-code resurrection")
+# Trigger #5 — Cross-card delta-baseline arithmetic
+{ cat "$CARD_FILE" 2>/dev/null; echo "$CHANGED" | xargs -I{} git show "HEAD:{}" 2>/dev/null; } \
+  | grep -qE 'score_new\s*-\s*score_current|delta-baseline|deltaBaseline' \
+  && TRIGGERS+=("#5: Cross-card delta-baseline arithmetic")
+# Trigger #6 — Reference-aliasing mutation pattern (BUG-0558 family)
+# Card introduces or modifies an exported helper that returns an array/object AND a
+# call site in the diff mutates an input array in place via `length = 0` + `push(...result)`.
+# Even if the call site has an identity guard locally, the high-risk gate runs Codex
+# adversarial review to confirm the contract is robust against future caller patterns.
+echo "$CHANGED" | xargs -I{} sh -c 'git show "HEAD:{}" 2>/dev/null' \
+  | grep -qE '\.length\s*=\s*0' \
+  && echo "$CHANGED" | xargs -I{} sh -c 'git show "HEAD:{}" 2>/dev/null' \
+       | grep -qE '\.push\(\s*\.\.\.' \
+  && TRIGGERS+=("#6: Reference-aliasing mutation pattern (BUG-0558 family)")
+printf '%s\n' "${TRIGGERS[@]}"
+```
+#### Step B — Log detector result (never skips)
+Log the detector output in the tracker for traceability. **This step never short-circuits Step C — `/codexreview` runs unconditionally regardless of whether `TRIGGERS` is empty or populated.**
+- If `TRIGGERS` is empty: log `## Pre-Merge Codex Review: no high-risk triggers matched (gate still runs unconditionally)` and proceed to Step C.
+- If `TRIGGERS` is non-empty: log the matched triggers (see Step C step 1) and proceed to Step C.
+#### Step C — Invoke per-card `/codexreview` (ALWAYS)
+For EVERY card (no conditional skip):
+1. **Log gate invocation** in the tracker under `## Pre-Merge Codex Review`:
+   ```
+   ## Pre-Merge Codex Review
+   - Triggered: unconditional (mandatory pre-merge gate)
+   - Matched high-risk triggers: <list from detector, or "none">
+   - Action: invoking /codexreview <CARD-ID> (per-card, pre-merge gate)
+   ```
+2. **Invoke `/codexreview`** for the single card via the Skill tool:
+   ```
+   Skill: codexreview
+   args: <CARD-ID>
+   ```
+   The orchestrator's `/codexreview` skill runs all 5 review agents (incl. adversarial Codex) + Step 3 false-positive gate + Step 3.5 cross-agent CoVe + Step 4 consolidated report. This is the same pipeline used post-batch in Phase 7 — here it runs per-card and BEFORE commit.
+3. **Read the consolidated report** from `/tmp/codexreview-report-<TIMESTAMP>.md`. Extract:
+   - Verified BLOCKER count
+   - Verified HIGH count
+   - List of `[ripple-expanded]` findings (Step 3.5 widened scope)
+4. **Apply fix sub-loop** (mirror of Phase 3.5 retry pattern):
+   - If 0 BLOCKER and 0 HIGH → log `verdict: PASS — proceeding to Phase 4` in tracker. Done.
+   - If 1+ BLOCKER OR 1+ HIGH → spawn `coder` agent with the report path + list of VERIFIED bugs to fix. After coder fixes, re-invoke `/codexreview <CARD-ID>` to re-validate. Repeat **max 2 times**.
+   - If still BLOCKER/HIGH after 2 retries → log in `## Issues & Flags` and **ask the user** whether to proceed, escalate, or stop. The Phase 4 commit MUST NOT happen until the Pre-Merge Codex Review Gate verdict is PASS or the user explicitly overrides.
+5. **Update tracker**: phase = `3.7-highrisk DONE`, log final verdict, retry count, list of fixed findings, and the report path.
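The fix sub-loop in Step C reduces to one decision per `/codexreview` run. A minimal sketch, assuming the verified BLOCKER and HIGH counts have already been extracted from the consolidated report; the function name and the return labels are hypothetical.

```python
MAX_RETRIES = 2


def gate_decision(blockers: int, highs: int, retries_used: int) -> str:
    """Return the next action for the Phase 3.7 fix sub-loop."""
    if blockers == 0 and highs == 0:
        return "PASS"      # proceed to Phase 4
    if retries_used < MAX_RETRIES:
        return "FIX"       # spawn coder, then re-invoke /codexreview
    return "ASK_USER"      # log in Issues & Flags and ask the user
```

The clean-report check comes first, so a run that passes on its final retry still yields `PASS` rather than escalating.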
+#### Why deterministic detector + agent pipeline
+The detector (Step A) is bash + grep — guaranteed to run, no LLM skip. The downstream pipeline (Step C) is the existing `/codexreview` skill — fully tested, includes Codex adversarial review and Step 3.5 cross-agent CoVe. This combination ensures the AGENTS.md "MUST run BEFORE merge" rule is enforced reliably.
+### Phase 4 — Commit (in worktree, NO merge yet)
+25. **Update tracker**: phase = "4-commit".
+26. Stage and commit **all changes together** in the worktree using format `[CARD-ID] Brief description` (MUST per AGENTS.md). Include all relevant files — implementation, review fixes, QA-driven fixes, and doc updates in a single commit. Do NOT merge or push yet — that happens post-batch.
+    - **IMPORTANT — explicit staging**: NEVER use `git add -A` or `git add .`. Always stage files by explicit name:
+      ```bash
+      cd <worktree-path>
+      # List changed files
+      git diff --name-only
+      git ls-files --others --exclude-standard
+      # Stage only the files this card touched (explicit names)
+      git add src/path/to/file1.ts src/path/to/file2.ts ...
+      # Commit directly — NO stash in worktrees
+      git commit -m "[CARD-ID] Brief description"
+      ```
+    - **WORKTREE COMMIT RULE: NEVER use `git stash` in worktrees.** The stash is shared by every worktree of a repository, because `refs/stash` is stored in the common Git directory (`$GIT_COMMON_DIR`) rather than per worktree. A stash created in one worktree can be popped by another, causing conflicts, data loss, and cascading merge failures. The file ownership map ensures no overlap between cards, so explicit staging is sufficient isolation. The stash pattern in AGENTS.md applies ONLY to the main repo working tree.
+    - **If commit fails** (e.g., lint-staged or pre-commit hook error):
+      1. **Check for stale COMMIT_LOCK first** — parallel agents and crashed processes leave stale lock files. Always clear before retry:
+         ```bash
+         rm -f <main-repo>/.git/worktrees/<worktree-name>/COMMIT_LOCK 2>/dev/null
+         ```
+      2. Re-stage the same files explicitly and retry. Never run `git commit` alone after a failure — the staging area may have been altered by lint-staged auto-fixes. If lint-staged removed an "unused" variable that IS used by code in a later card, check `git diff` to see what changed, restore the needed code, re-stage, and retry.
+    - The Claude Code pre-commit hook automatically skips for worktree commits (Husky handles checks natively in the worktree).
+27. **Mark card DONE (MANDATORY — do NOT skip)**:
+    a. Edit the backlog YAML (`${paths.backlog_dir}/<CARD-ID>.yml`): set `status: DONE`, add `completed_date: <today>`.
+    b. Add implementation notes summarizing what was built.
+    c. **Update `${paths.references_dir}/ssot-registry.md`** — add/update the entry for this card's feature area. The pre-commit doc-freshness hook BLOCKS commits that touch `backlog/` without a corresponding ssot-registry update. Always include ssot-registry.md in the same commit as the backlog YAML.
+    d. **Verify the write**: re-read the YAML file and confirm `status: DONE` is present. If not, retry the edit.
+    e. Stage BOTH the updated YAML AND ssot-registry.md and include them in the Phase 4 commit (or make an immediate follow-up commit if the Phase 4 commit already happened).
+28. **Update tracker**: move card to `## Completed Cards` with commit hash, summary, flags, **and `card_status: DONE (verified)`**.
+### Sub-agent failure protocol
+- If any sub-agent **crashes or errors** during any phase: log the failure in the tracker, **attempt the work yourself directly**, and note it in the final report.
+- Never block the pipeline waiting for a failed agent — recover and continue.
+### Phase 5 — Context Clean & Continue
+29. Archive the card from Active Code Context in `${paths.references_dir}/project-status.md`.
+30. **CONTEXT PURGE**: After updating the tracker, deliberately forget the implementation details of this card. From this point forward, you should NOT reference any code, file contents, or review details from this card — only the summary in the tracker. If you need to recall what happened, read the tracker file. This keeps your working context lean for the next card.
+31. **Update tracker**: clear `## Current Card`, move to next pending card.
+32. Move to the next card.
+---
+## Final review (after all cards)
+> **Primary code reviewer: Codex (GPT-5.4)** — cross-model validation with built-in FP filtering.
+> Claude `code-reviewer` is the automatic fallback if Codex is unavailable.
+> The `/codexreview` command remains available for standalone reviews on demand.
+Once ALL cards are committed in the worktree:
+### Step F.1 — Resolve scope
+1. **Read the tracker file** to get the full picture: card IDs, files changed, commit hashes.
+2. Gather git evidence in the worktree:
+   ```bash
+   cd <worktree-path>
+   git diff --name-only <base-branch>...HEAD
+   ```
+3. Read each card's backlog YAML to collect `acceptance_criteria`, `files_likely_touched`.
+4. Build `review_scope_files` as union of card-indicated files + git-touched files.
+### Step F.2 — Architecture baseline
+5. Invoke **codebase-architect** agent to map existing architecture, critical patterns, and
+   high-risk code paths for regression across the batch scope. This provides grounding context
+   for all downstream review agents.
+### Step F.3 — Codex deep code review (primary) + Claude agents (support)
+> **Primary reviewer: Codex (GPT-5.4)** — cross-model validation. Codex runs the full
+> `/codexreview` protocol: scope resolution, architecture baseline, parallel deep review,
+> and mandatory false-positive validation. Claude agents provide doc review and supplementary checks.
+6. **Launch Codex code review** via `Bash` with `run_in_background: true` and `timeout: 600000` (10 min):
+   ```bash
+   REVIEW_FILE="/tmp/codexreview-batch-$(date +%Y%m%d-%H%M%S).md" && \
+   CODEX_SCRIPT="$(ls -d ~/.claude/plugins/marketplaces/openai-codex/plugins/codex/scripts/codex-companion.mjs ~/.claude/plugins/cache/openai-codex/codex/*/scripts/codex-companion.mjs 2>/dev/null | sort -V | tail -1)" && \
+   [ -z "$CODEX_SCRIPT" ] && { echo "CODEX_NOT_FOUND" | tee "$REVIEW_FILE"; exit 1; }; \
+   node "$CODEX_SCRIPT" task --wait "
+   Run a deep multi-agent code review for these backlog cards. This is a post-implementation
+   review — the code is already written and committed. Your job is to find bugs, regressions,
+   security issues, and quality problems.
+   Cards to review (read each file):
+   ${CARD_PATHS}
+   Files changed (review ALL of these):
+   ${REVIEW_SCOPE_FILES}
+   Architecture baseline (from codebase-architect):
+   ${ARCH_BASELINE}
+   Follow the /codexreview protocol:
+   1. For each card, read its backlog YAML for acceptance_criteria and entrypoints.
+   2. Read every changed file in full.
+   3. Check for: functional bugs, logic flaws, regressions, cross-card inconsistencies,
+      security issues (auth gaps, input validation, multi-tenant isolation),
+      performance issues (unbounded reads, N+1, missing pagination),
+      missing error handling on external boundaries.
+   4. For each finding, return:
+      - finding_id: <CARD-ID>-F###
+      - title: short description
+      - severity: BLOCKER | HIGH | MEDIUM | LOW
+      - confidence: 0-100
+      - evidence: exact file:line + code quote
+      - minimal_fix_direction: what to change
+   5. Run mandatory false-positive check: for each finding, ask 'What is the strongest
+      argument this is a false positive?' Suppress if the FP argument is convincing.
+   6. Classify surviving findings as VERIFIED, FALSE_POSITIVE, or NEEDS_MANUAL_CONFIRMATION.
+   Return ONLY verified findings. If zero verified bugs: state 'No verified bugs found.'
+   " 2>&1 | tee "$REVIEW_FILE"
+   ```
+   **Variable interpolation** (build the command string before execution):
+   - `${CARD_PATHS}`: newline-separated list of backlog YAML paths
+   - `${REVIEW_SCOPE_FILES}`: newline-separated list from Step F.1
+   - `${ARCH_BASELINE}`: key findings from Step F.2 (2-3 paragraphs max)
+7. **In parallel with Codex**, launch Claude support agents (single message):
+   | Agent | `subagent_type` | Focus |
+   |-------|-----------------|-------|
+   | **doc-reviewer** | `doc-reviewer` | Cross-card doc consistency, ssot-registry completeness, invariants |
+   | **api-perf-cost-auditor** | `api-perf-cost-auditor` | API/data/performance/cost defects (skip if no API/data files in scope) |
+   | **qa-sentinel** | `qa-sentinel` | Edge cases, testing gaps, reproducibility |
+   Each agent receives: card IDs, YAML, `review_scope_files`, codebase-architect baseline.
+   Return findings with `finding_id`, `title`, `severity`, `confidence`, `evidence`, `minimal_fix_direction`.
+### Step F.4 — Collect & merge findings
+8. **Read Codex findings** from `$REVIEW_FILE` after the background command completes.
+   - If file exists and contains findings → use as **primary code review source**.
+   - If file is empty, missing, or contains `CODEX_NOT_FOUND` → **fallback**: spawn `code-reviewer`
+     agent (subagent_type: `code-reviewer`) with the same scope and instructions. Log fallback
+     reason in tracker: `"Codex unavailable — fallback to Claude code-reviewer"`.
+9. **Merge all findings** (Codex + Claude agents) into a consolidated list.
+   - Codex findings are already FP-validated (Step F.3 protocol includes it).
+   - Claude agent findings with `confidence < 80` → cross-validate with a second agent.
+   - Classify: `VERIFIED` | `FALSE_POSITIVE` | `NEEDS_MANUAL_CONFIRMATION`.
+   - Only `VERIFIED` findings proceed to fixes.
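+The merge rules above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the `Finding` class, `merge_findings`, and the `cross_validate` callback are hypothetical names, and treating Claude findings with `confidence >= 80` as `VERIFIED` without a second opinion is an assumption the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    finding_id: str
    severity: str          # BLOCKER | HIGH | MEDIUM | LOW
    confidence: int        # 0-100
    source: str            # "codex" or the name of a Claude agent
    status: str = "UNCLASSIFIED"


def merge_findings(codex, claude, cross_validate):
    """Merge both sources and return (verified, others)."""
    merged = []
    for f in codex:
        # Codex output is already FP-validated by the Step F.3 protocol.
        f.status = "VERIFIED"
        merged.append(f)
    for f in claude:
        if f.confidence < 80:
            # Low confidence: ask a second agent (hypothetical callback that
            # returns VERIFIED, FALSE_POSITIVE, or NEEDS_MANUAL_CONFIRMATION).
            f.status = cross_validate(f)
        else:
            # Assumption: high-confidence findings are accepted as verified.
            f.status = "VERIFIED"
        merged.append(f)
    verified = [f for f in merged if f.status == "VERIFIED"]
    others = [f for f in merged if f.status != "VERIFIED"]
    return verified, others
```

+Only the first element of the returned pair would be handed to the fix pass; the second is kept for the tracker counts.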
+### Step F.5 — Apply fixes and final build
+10. **Persist verified findings** to `/tmp/batch-final-review-<FIRST-CARD-ID>.md`.
+11. If VERIFIED findings with severity >= MEDIUM exist, invoke the **coder** agent once to
+    apply ALL fixes in a single pass. Pass only the verified findings, not false positives.
+12. Run final build: `npm run lint && npx tsc --noEmit && npm run build`.
+    If any check fails, apply self-healing retry loop (up to 3 times).
+13. **Update tracker** with final review results:
+    - Review engine: Codex GPT-5.4 (primary) | Claude code-reviewer (fallback)
+    - Total findings raised / verified / false positives / needs-manual
+    - Fixes applied count
+    - Build status (pass/fail + retry count)
+    - Highest severity found
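+A minimal sketch of the two control-flow rules in Step F.5: the severity threshold that triggers the fix pass, and the self-healing retry loop around the final build. The function names and the `run_checks` / `apply_fix` callbacks are hypothetical stand-ins for the real build command and the coder agent.

```python
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "BLOCKER": 3}


def needs_fix_pass(severities):
    """True when at least one verified finding is MEDIUM or above."""
    return any(SEVERITY_RANK[s] >= SEVERITY_RANK["MEDIUM"] for s in severities)


def build_with_self_healing(run_checks, apply_fix, max_retries=3):
    """Run the checks; on failure apply a fix and retry.

    Returns (passed, retry_count), the two values the tracker records.
    """
    retries = 0
    while True:
        if run_checks():
            return True, retries
        if retries == max_retries:
            return False, retries
        apply_fix()
        retries += 1
```

+With `max_retries=3` the checks run at most four times: the initial attempt plus three retries.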
+7. **SSOT & Documentation Activity** (MANDATORY — run BEFORE merge):
+   Summarize all documentation and SSOT updates performed during the batch:
+   ```
+   ## SSOT & Documentation Updates
+   ### Updated documents
+   - `${paths.references_dir}/api/<module>.md`: N new endpoints added
+   - `${paths.references_dir}/ui/<domain>.md`: N new routes added
+   - `${paths.references_dir}/data-model.md`: N new collections
+   - `${paths.references_dir}/ssot-registry.md`: N entries updated/created
+   - `${paths.references_dir}/project-status.md`: context updated
+   ### ADRs created
+   - `${paths.adrs_dir}/ADR-YYYYMMDD-<slug>.md`: [title] (or "No ADR created")
+   ```
+8. **Knowledge Base Sync** (OPTIONAL — only if the project has an external knowledge corpus, e.g. an Obsidian vault, Confluence space, or Notion workspace):
+   If the project ships a `knowledge-sync` agent (or equivalent), invoke it after doc updates so the external corpus stays aligned. If no such agent exists, skip this step.
+9. **Proceed to Phase 6** (post-batch merge & cleanup).
+10. Present a **single summary report** to the user, with one section per card and a batch summary at the end:
+   - **Files changed** (short list per card)
+   - **Test results** (new tests + existing tests count, pass rate at each iteration)
+   - **Build/lint status** (pass + retry count if any)
+   - **Fix cycles** (total number of self-healing retries across phases)
+   - **Final review** (findings: N raised / M verified / K false positives | fixes applied: N | highest severity)
+   - **UX testing** (PASS/FAIL/SKIP | test file path if written)
+   - **QA result** (profile: skip/light/balanced/deep | verdict: PASS/FAIL/SKIP | confidence % | findings: N blockers, N majors, N minors | E2E: PASS/FAIL/SKIP | findings file path)
+   - **Issues needing user attention** (anything unresolved, partially wired, or flagged)
+   - **Commit hashes** (from tracker)
+   - **Merge commit hash** (from Phase 6)
+   - **Card status reconciliation** (Phase 6b: N cards verified DONE, K force-updated)
+   - **Worktree cleanup status** (success/failed)
+   - **Knowledge base sync status** (from step 8: COMPLETATO/FALLITO/PARZIALE, or skipped if step 8 did not run)
+   - Overall implementation status
+10b. **Next Steps & Launch Command** (MANDATORY section — always present):
+   Check for remaining TODO/READY cards in the same epic group:
+   ```bash
+   # List cards under the same parent epic that are still TODO or READY
+   grep -l "parent: <EPIC-ID>" ${paths.backlog_dir}/*.yml | xargs grep -lE "status: (TODO|READY)"
+   ```
+   Present the section:
+   ````
+   ## Next Steps
+   ### Launch implementation (copy and paste)
+   ```
+   /new FEAT-XXXX-05 FEAT-XXXX-06 FEAT-XXXX-07
+   ```
+   ### Remaining cards in the epic
+   | Card | Title | Status | Parallel group |
+   |------|-------|--------|----------------|
+   | FEAT-XXXX-05 | [title] | TODO | 2 |
+   | FEAT-XXXX-06 | [title] | TODO | 2 |
+   ### Cards completed in this session
+   | Card | Title | Commit |
+   |------|-------|--------|
+   | FEAT-XXXX-01 | [title] | abc1234 |
+   | FEAT-XXXX-02 | [title] | def5678 |
+   ````
+   If ALL cards in the epic are DONE:
+   ```
+   ## Next Steps
+   All cards in epic **FEAT-XXXX** have been completed.
+   Ready to deploy: `/deploy` or `git push origin develop`
+   ```
+11. **Proceed to Phase 7** (production readiness checklist).
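+The remaining-cards lookup from step 10b can also be sketched in Python. This is an illustration under stated assumptions: each card lives in `<CARD-ID>.yml`, and `parent:` and `status:` are unquoted top-level keys. The helper names `remaining_cards` and `launch_command` are hypothetical, and the regex scan is a stand-in for a real YAML parser.

```python
import re
from pathlib import Path

STATUS_RE = re.compile(r"^status:\s*(\S+)", re.MULTILINE)


def remaining_cards(backlog_dir, epic_id):
    """Return sorted IDs of cards under epic_id that are still TODO or READY."""
    parent_re = re.compile(rf"^parent:\s*{re.escape(epic_id)}\s*$", re.MULTILINE)
    pending = []
    for path in sorted(Path(backlog_dir).glob("*.yml")):
        text = path.read_text(encoding="utf-8")
        status = STATUS_RE.search(text)
        if parent_re.search(text) and status and status.group(1) in {"TODO", "READY"}:
            pending.append(path.stem)  # file name without .yml is the card ID
    return pending


def launch_command(card_ids):
    """Build the copy-and-paste command, or an empty string if nothing is left."""
    return "/new " + " ".join(card_ids) if card_ids else ""
```

+An empty result corresponds to the "all cards in the epic are DONE" branch of the report.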
+---
+## Phase 6 — Post-batch merge & cleanup (delegated to worktree-manager skill)
+After the final review passes AND all cards are committed in the worktree, delegate the entire merge and cleanup to the **worktree-manager** skill (`/mw` in programmatic mode):
+1. **BEFORE invoking /mw** — verify no uncommitted files remain in the worktree:
+   ```bash
+   cd <worktree-path>
+   git status --porcelain
+   ```
+   If ANY uncommitted files exist (staged, unstaged, or untracked), commit them NOW with `[safety] Auto-commit remaining files before merge`. Do NOT proceed to `/mw` with a dirty worktree — files WILL be lost during rebase.
+2. Invoke `/mw` with:
+   - The worktree path and branch from the tracker
+   - `checksAlreadyPassed: true` (final review + QA already validated the build)
+   - All card IDs for the commit message
+3. The skill handles:
+   - Safety commit of any remaining uncommitted files (step 3).
+   - Rebasing onto latest develop (step 4b): auto-resolves doc conflicts, stops on code conflicts.
+   - Remote merge to develop via `gh pr merge` (step 4c): NEVER `git checkout develop` on the main repo; this respects the absolute terminal-isolation rule that many consumer projects declare in CLAUDE.md.
+   - Post-merge verification, worktree removal, registry cleanup, and remote branch deletion.
+4. **If code merge conflicts** → the skill STOPs and reports. Doc-only conflicts (ssot-registry.md, project-status.md, etc.) are auto-resolved by keeping both sides.
+5. **If post-merge build fails** → the skill STOPs and keeps the worktree intact for investigation.
+6. Record the merge commit hash and result in the tracker.
+### Phase 6b — Backlog Card Status Reconciliation (MANDATORY — ZERO TOLERANCE)
+**This gate runs IMMEDIATELY after Phase 6 merge completes, BEFORE presenting any summary to the user. It is NON-SKIPPABLE.**
+The most common failure mode is leaving cards IN_PROGRESS after merge. This creates SSOT drift and confuses downstream agents (codebase-architect, doc-reviewer). This gate prevents that.
+**Steps:**
+1. **Read the tracker file** to get the full list of card IDs in the batch.
+2. **For EACH card in the batch**, read its backlog YAML (`${paths.backlog_dir}/<CARD-ID>.yml`) and check the `status` field:
+   - If `status: DONE` → OK, skip.
+   - If `status` is anything else (`IN_PROGRESS`, `READY`, `TODO`, etc.) → **FORCE UPDATE to DONE immediately**.
+3. **Early exit — all cards already DONE**: if step 2 found ALL cards are already `status: DONE`, skip steps 4-5 entirely. Log in the tracker:
+   ```
+   ## Phase 6b — Status Reconciliation
+   Cards checked: N
+   Already DONE: N (all cards marked DONE by coder agents)
+   Force-updated to DONE: 0
+   Reconciliation commit: not needed
+   ```
+   Proceed directly to step 6.
+4. **Force update procedure** for non-DONE cards:
+   ```bash
+   # In the MAIN repo (not worktree — worktree is already cleaned up)
+   cd <main-repo-path>
+   ```
+   Edit the backlog YAML: set `status: DONE`, add `completed_date: <today>`, add implementation note: `"Marked DONE by post-merge reconciliation gate (Phase 6b)"`.
+5. **If ANY card was force-updated**: commit in the main repo with these precautions:
+   a. **Clear stale COMMIT_LOCK** (common when coder agents crash or timeout):
+      ```bash
+      # Remove any stale COMMIT_LOCK from main repo or worktrees
+      rm -f <main-repo-path>/.git/COMMIT_LOCK 2>/dev/null
+      find <main-repo-path>/.git/worktrees -name COMMIT_LOCK -delete 2>/dev/null || true
+      ```
+   b. **Stage backlog YAMLs AND related docs** (pre-commit hook requires doc updates):
+      ```bash
+      # Stage only the force-updated backlog YAMLs
+      git add ${paths.backlog_dir}/CARD-001.yml ${paths.backlog_dir}/CARD-002.yml
+      # If ssot-registry.md was modified during the batch, stage it too
+      # (doc-freshness hook blocks commits that touch backlog/ without ssot-registry.md).
+      # `git diff --quiet` exits non-zero only when the file has unstaged changes.
+      git diff --quiet -- ${paths.references_dir}/ssot-registry.md || git add ${paths.references_dir}/ssot-registry.md
+      ```
+   c. **Commit only the files staged in step 5b** (nothing else should be in the index):
+      ```bash
+      git commit -m "[BATCH] Mark cards DONE — post-merge reconciliation"
+      ```
+   d. **If commit fails due to pre-commit hook** (lint-staged, doc-freshness): check the error.
+      - If "no staged files" → all changes were already committed. Log as "not needed".
+      - If doc-freshness requires ssot-registry.md → update its `last_verified_from_code` date, stage it, and retry.
+      - If COMMIT_LOCK → re-run step 5a and retry.
+      - Do NOT retry more than 2 times. Log the failure and move on.
+6. **Update tracker** with reconciliation results:
+   ```
+   ## Phase 6b — Status Reconciliation
+   Cards checked: N
+   Already DONE: M
+   Force-updated to DONE: K [list card IDs]
+   Reconciliation commit: <hash> (or "not needed")
+   ```
+7. **HALT condition**: if a card cannot be updated (e.g., file read error, YAML parse error), log it in `## Issues & Flags` and continue with the remaining cards. Report the failure in the final summary.
+**Why this exists**: Agents frequently skip the DONE marking in Phase 4 (step 27) due to context compaction, commit failures that interrupt the flow, or team mode where Step D.6 gets lost. This gate is the safety net that catches ALL of these cases.
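+The decision logic of this gate is small enough to sketch. The function below is hypothetical: it takes the statuses already read from the backlog YAMLs, with `None` standing for a card whose file could not be read or parsed, and returns the numbers the tracker section needs.

```python
def reconcile(statuses):
    """statuses maps card ID -> status string, or None if the YAML was unreadable."""
    unreadable = sorted(c for c, s in statuses.items() if s is None)
    to_update = sorted(c for c, s in statuses.items() if s is not None and s != "DONE")
    return {
        "cards_checked": len(statuses),
        "already_done": len(statuses) - len(to_update) - len(unreadable),
        "force_update": to_update,        # cards to set to DONE (step 4)
        "issues": unreadable,             # logged under Issues & Flags (step 7)
        "commit_needed": bool(to_update), # False triggers the early exit (step 3)
    }
```

+When `commit_needed` is false, the commit steps are skipped and the tracker records "not needed".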
+### Fail-safe rules (enforced by worktree-manager skill)
+- **Never `git checkout`, `git switch`, `git checkout -b`, or `git branch` on the main repo** from inside this orchestrator. The main repo is shared across parallel terminals. Use `gh pr merge` for develop merges and `git -C <main> pull --ff-only` (only when `HEAD = develop` already) for sync. See `/mw` step 4c.
+- Never merge to `main` — only to `develop`, and only via `gh pr merge` (NOT local checkout).
+- Never force push to `main` or `develop`. `--force-with-lease` on feature branches after rebase is allowed.
+- Never delete a branch before successful merge verification.
+- Never remove a worktree before confirming develop is stable post-merge.
+- Stop execution immediately if any command fails.
+---
+## Phase 7 — Production Readiness Checklist
+After Phase 6 completes (or after the final summary report if Phase 6 is deferred), present a **Production Readiness Checklist** — a clear list of all manual or infrastructure actions required to launch the implemented changes in production.
+### How to detect items
+Scan ALL files changed across the batch (use the tracker's completed cards + `git diff` against the base branch) and check for:
+| Category | Detection signal | Action to report |
+|----------|-----------------|------------------|
+| **Firestore indexes** | New/modified `firestore.indexes.json`, or code using new compound queries with `orderBy`/`where` on multiple fields | `firebase deploy --only firestore:indexes` |
+| **Firestore security rules** | New/modified `firestore.rules` | `firebase deploy --only firestore:rules` |
+| **Storage security rules** | New/modified `storage.rules` | `firebase deploy --only storage:rules` |
+| **Environment variables** | New `process.env.*` references not present in the base branch, new entries in `.env.example` or `.env.local` | 1. Add to Vercel project settings (list each var name + environments) 2. Update `${paths.references_dir}/env-vars.md` Change Log |
+| **Firebase Remote Config** | New `remoteConfig` keys in code | Add keys in Firebase Console > Remote Config |
+| **Firebase Auth providers** | New auth provider configuration | Enable provider in Firebase Console > Authentication |
+| **Scheduled functions / cron** | New or modified cron/scheduled Cloud Functions | Deploy functions: `firebase deploy --only functions` |
+| **Database migrations** | New collections, field renames, data backfills referenced in code or ADRs | Run migration script (specify which) |
+| **New API endpoints** | New route files under `src/app/api/` | Verify CORS/auth config; update API docs if public |
+| **Third-party services** | New API keys, webhook URLs, or external service integrations | Configure in provider dashboard + add secrets to Vercel |
+| **DNS / domain changes** | Hosting or redirect config changes | Update DNS records or Vercel domain settings |
+| **Package upgrades with breaking changes** | Major version bumps in `package.json` | Verify compatibility; check migration guides |
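+For the rows of the table that are driven purely by file paths, detection can be sketched as a pattern lookup over the changed-file list. This is an illustration only: `RULES` covers a subset of the table, the patterns assume these files sit at the repository root, and `detect_items` is a hypothetical helper. Signals that require reading code (new `process.env.*` references, new compound queries) are out of scope here.

```python
from fnmatch import fnmatch

# (category, path pattern, action) for the path-based rows of the table.
RULES = [
    ("Firestore indexes", "firestore.indexes.json", "firebase deploy --only firestore:indexes"),
    ("Firestore security rules", "firestore.rules", "firebase deploy --only firestore:rules"),
    ("Storage security rules", "storage.rules", "firebase deploy --only storage:rules"),
    ("New API endpoints", "src/app/api/*", "Verify CORS/auth config; update API docs if public"),
]


def detect_items(changed_files):
    """Return one checklist item per category that has matching changed files."""
    items = []
    for category, pattern, action in RULES:
        # fnmatch's '*' also matches '/', so nested route files are included.
        hits = [f for f in changed_files if fnmatch(f, pattern)]
        if hits:
            items.append({"category": category, "action": action, "files": hits})
    return items
```

+The `files` list gives each item its reason, which the checklist rules require alongside the command.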
+### Output format
+Present the checklist as a clearly formatted section in the final report:
+```
+## Production Readiness Checklist
+### Required before deploy
+1. **[Firebase Indexes]** Deploy Firestore composite indexes
+   - Command: `firebase deploy --only firestore:indexes`
+   - Reason: New compound query on `redemptions` (storeId + tableNumber + createdAt)
+2. **[Environment Variable]** Add `REDIS_URL` to Vercel
+   - Go to: Vercel > Project Settings > Environment Variables
+   - Required for: Production, Preview
+   - Value: (obtain from Upstash dashboard)
+3. **[Firestore Rules]** Deploy updated security rules
+   - Command: `firebase deploy --only firestore:rules`
+   - Reason: New collection `promoRedemptions` access rules added
+### No action needed
+- No new scheduled functions
+- No database migrations
+- No DNS changes
+### Notes
+- Firebase index deployment can take 5-10 minutes; deploy BEFORE releasing the code
+- Environment variables must be set BEFORE the Vercel deployment triggers
+```
+### Auto-execution of agent-doable tasks
+Before presenting the checklist, **auto-execute** all items that can be performed by the agent
+without manual intervention. Do NOT ask the user for approval — just run them.
+**Auto-executable items** (run via Bash tool, no confirmation needed):
+| Category | Command | Auto-execute? |
+|----------|---------|--------------|
+| Firestore indexes | `firebase deploy --only firestore:indexes --project <your-firebase-project>` | YES |
+| Firestore security rules | `firebase deploy --only firestore:rules --project <your-firebase-project>` | YES |
+| Storage security rules | `firebase deploy --only storage:rules --project <your-firebase-project>` | YES |
+| Scheduled functions | `firebase deploy --only functions --project <your-firebase-project>` | YES |
+**Manual-only items** (report to user, do NOT auto-execute):
+| Category | Why manual |
+|----------|-----------|
+| Environment variables | Requires Vercel dashboard access or secret values |
+| Firebase Remote Config | Requires Firebase Console UI |
+| Firebase Auth providers | Requires Firebase Console UI |
+| Database migrations / backfills | Risk of data loss — needs human judgment |
+| Third-party service config | Requires external dashboards and secrets |
+| DNS / domain changes | Risk of downtime — needs human judgment |
+**Auto-execution procedure:**
+1. For each auto-executable item detected, run the command immediately.
+2. Log the result (success/failure) in the tracker under `## Production Readiness`.
+3. In the final checklist output, mark auto-executed items with their result:
+   ```
+   1. **[Firebase Indexes]** Deploy Firestore composite indexes
+      - Command: `firebase deploy --only firestore:indexes --project <your-firebase-project>`
+      - Result: DEPLOYED (took 45s) | FAILED (error: ...)
+   ```
+4. If an auto-execution FAILS: log the error, mark it as `MANUAL FALLBACK NEEDED`,
+   and include it in the "Required before deploy" section for the user to handle.
+### Firestore Index Verification (MUST — after deploy)
+After `firebase deploy --only firestore:indexes` succeeds, you MUST verify that all indexes
+are actually `READY` before reporting success. The deploy command returns immediately, but
+indexes can take 5-10 minutes to build.
+**Verification procedure:**
+1. **Extract expected collection groups** from the local `firestore.indexes.json`:
+   ```bash
+   cat firestore.indexes.json | python3 -c "
+   import sys, json
+   data = json.load(sys.stdin)
+   groups = sorted(set(i['collectionGroup'] for i in data.get('indexes', [])))
+   print('\n'.join(groups))
+   "
+   ```
+2. **Check index states** via Firestore REST API for each collection group:
+   ```bash
+   TOKEN=$(gcloud auth print-access-token) && \
+   for CG in <collection_groups>; do
+     curl -s "https://firestore.googleapis.com/v1/projects/<your-firebase-project>/databases/(default)/collectionGroups/$CG/indexes" \
+       -H "Authorization: Bearer $TOKEN" 2>/dev/null
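+   done | python3 -c "
+   import sys, json
+   raw = sys.stdin.read()
+   # The loop emits one JSON document per collection group, back to back.
+   # raw_decode parses them one at a time, nested objects included.
+   decoder = json.JSONDecoder()
+   pos, docs, not_ready = 0, 0, []
+   while pos < len(raw):
+       if raw[pos].isspace():
+           pos += 1
+           continue
+       try:
+           data, pos = decoder.raw_decode(raw, pos)
+       except json.JSONDecodeError as exc:
+           not_ready.append(f'ERROR: unparseable API response ({exc.msg})')
+           break
+       docs += 1
+       if not isinstance(data, dict):
+           continue
+       if 'error' in data:
+           not_ready.append('ERROR: ' + str(data['error']))
+       for idx in data.get('indexes', []):
+           state = idx.get('state', 'UNKNOWN')
+           if state != 'READY':
+               fields = ' + '.join(f.get('fieldPath', '?') for f in idx.get('fields', []))
+               name = idx.get('name', '')
+               cg = name.split('/collectionGroups/')[1].split('/')[0] if '/collectionGroups/' in name else '?'
+               not_ready.append(f'{state}: {cg} ({fields})')
+   if docs == 0 and not not_ready:
+       not_ready.append('ERROR: no API response received')
+   if not_ready:
+       print(f'NOT_READY ({len(not_ready)} indexes not ready):')
+       for item in not_ready: print(f'  - {item}')
+   else:
+       print('ALL_READY')
+   "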
+   ```
+3. **Poll if NOT_READY**: re-check every 30 seconds, up to 10 retries (5 minutes max).
+   After each retry, print a status update: `Index verification attempt N/10: M indexes still building...`
+4. **Final status**:
+   - If all indexes are `READY`: mark as `VERIFIED READY` in the checklist
+   - If indexes are still `CREATING` after 10 retries: mark as `DEPLOYED BUT BUILDING — indexes may take a few more minutes` (this is a warning, not a failure)
+   - If any index is `NEEDS_REPAIR` or `ERROR`: mark as `INDEX ERROR — manual intervention required` and include the index details
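+The polling and final-status rules above can be sketched as a small loop. This is a sketch under assumptions: `fetch_states` is a hypothetical callback that returns the list of index state strings (for example by running the REST check from step 2), and `sleep` is injectable so the loop can be exercised without waiting.

```python
import time


def verify_indexes(fetch_states, max_retries=10, interval_s=30, sleep=time.sleep):
    """Poll until all indexes are READY; return the checklist status string."""
    for attempt in range(max_retries + 1):  # attempt 0 is the initial check
        if attempt > 0:
            sleep(interval_s)
        states = fetch_states()
        if any(s in ("NEEDS_REPAIR", "ERROR") for s in states):
            return "INDEX ERROR"            # manual intervention required
        if all(s == "READY" for s in states):
            return "VERIFIED READY"
    # Still building after the last retry: a warning, not a failure.
    return "DEPLOYED BUT BUILDING"
```

+With the defaults, the loop sleeps at most 10 times for 30 seconds each, which matches the 5 minute ceiling stated above.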
+**Checklist output format with verification:**
+```
+1. **[Firebase Indexes]** Deploy Firestore composite indexes
+   - Command: `firebase deploy --only firestore:indexes --project <your-firebase-project>`
+   - Deploy: SUCCESS (took 45s)
+   - Verification: ALL INDEXES READY (155/155) | BUILDING (3 still creating after 5min) | ERROR (details)
+```
+### Rules
+- **Always present this section**, even if the checklist is empty (in that case, state "No infrastructure changes required — deploy is code-only").
+- Order items by **deployment sequence** (items that must happen first go first — e.g., indexes before code deploy, env vars before code deploy).
+- For each item, include the **reason** (which card/feature requires it) and the **exact command or UI path**.
+- If an item is **uncertain** (e.g., you suspect a new index might be needed but aren't sure), mark it with `VERIFY` and explain what to check.
+- **Update the tracker** with the full checklist under a new `## Production Readiness` section.
+---
+## Context recovery protocol
+If at ANY point you are unsure where you are in the batch:
+1. Read your tracker file (`/tmp/batch-tracker-<FIRST-CARD-ID>.md`)
+2. Check `## Current Card` — if populated, resume that card at the listed phase.
+3. Check `## Card Queue` — find the next unchecked card.
+4. Check `## Completed Cards` — know what's already done (don't redo).
+5. Continue the pipeline from where you left off.
+---
+## Parallelism rules
+### Sequential mode (default for small batches)
+- Cards execute one at a time through the full per-card pipeline (Phases 1-5).
+- Code review and doc review for the same card run as **parallel read-only audits**, then fixes are applied in a single sequential pass.
+- This mode is unchanged from the original behavior.
+### Team mode (for complex batches)
+- Cards within the same `parallel_group` run in parallel via isolated coder agents.
+- Groups execute sequentially (group 0 → group 1 → group 2...).
+- Review + QA run ONCE per group (combined), not per card.
+- The orchestrator holds ONLY coordination state, never implementation details.
+- File ownership map is enforced per-agent via MAY EDIT / FORBIDDEN lists.
+- See "Team Mode" section below for full workflow.
+### Common rules (both modes)
+- The file-ownership map is the authoritative source for which files each agent may edit.
+- When running parallel agents, expect "file modified since read" errors on shared files (like the backlog yml) — handle gracefully.
+- When running in parallel, each parallel branch updates the tracker with its own card — use card ID as prefix to avoid conflicts.
+---
+## Team Mode (parallel coder agents with isolated contexts)
+When the complexity assessment (step 3c) selects team mode, the orchestrator changes role: instead of executing each card's pipeline sequentially, it coordinates parallel coder agents — each with its OWN isolated context window.
+**Key principle**: the orchestrator stays LEAN. It holds only:
+- The tracker file path
+- Parallel group status (pending/active/done)
+- Completion verdicts per card (pass/fail + files changed)
+It does NOT accumulate implementation details, codebase-architect findings, or review outputs — those live and die in each agent's isolated context.
+### Team Mode Pre-flight
+After the standard pre-flight (steps 1-7), add:
+1. Read all cards' `parallel_group` field and sort into execution layers.
+2. Update tracker with team mode section:
+   ```
+   ## Team Mode
+   Status: active
+   ## Parallel Groups
+   | Group | Cards | Status |
+   |-------|-------|--------|
+   | 0 | FEAT-01 | pending |
+   | 1 | FEAT-02, FEAT-03 | pending |
+   | 2 | FEAT-04, FEAT-05 | pending |
+   ```
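+Sorting cards into execution layers, as described in the pre-flight, might look like the sketch below. The card dictionaries with `id` and `parallel_group` keys and both function names are assumptions made for illustration; the table layout follows the tracker section shown above.

```python
from collections import defaultdict


def execution_layers(cards):
    """Group card IDs by parallel_group and return [(group, [ids])] in run order."""
    layers = defaultdict(list)
    for card in cards:
        layers[card["parallel_group"]].append(card["id"])
    return [(group, sorted(ids)) for group, ids in sorted(layers.items())]


def tracker_table(layers):
    """Render the Parallel Groups table with every group initially pending."""
    rows = ["| Group | Cards | Status |", "|-------|-------|--------|"]
    rows += [f"| {group} | {', '.join(ids)} | pending |" for group, ids in layers]
    return "\n".join(rows)
```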
+### Per-Group Execution
+Process groups in order (0, 1, 2, ...). Within each group, spawn coder agents IN PARALLEL — one per card.
+#### Step A: Pre-compute shared context (ONCE per group)
+Before spawning coders, the orchestrator runs **codebase-architect** ONCE for the entire group (not per card). This single call covers all cards in the group.
+Prompt:
+```
+Explore the codebase for context relevant to these cards:
+[list card IDs + their scope.summary, one line each]
+Focus on: [combined files_likely_touched from all cards in group]
+Return: file paths, type signatures, existing patterns. Max 30 lines.
+```
+This is the ONLY context the orchestrator accumulates per group. After passing it to the coders, it can be purged.
+#### Step B: Spawn parallel coder agents
+For each card in the current group, spawn a coder agent using the Agent tool. ALL agents for the group MUST be spawned in a **SINGLE message** (multiple Agent tool calls) to run truly in parallel.
+Each coder agent receives a **SELF-CONTAINED** mission briefing that includes EVERYTHING it needs — it will NOT call codebase-architect or plan-auditor itself:
+```
+Agent tool call:
+  subagent_type: "coder"
+  mode: "bypassPermissions"
+  run_in_background: true
+  name: "coder-<CARD-ID>"
+  prompt: |
+    ## AUTONOMOUS CARD IMPLEMENTATION — <CARD-ID>
+    You are implementing this card AUTONOMOUSLY. Complete ALL phases below
+    without external coordination. You have your own isolated context.
+    ### 1. Card Specification (verbatim)
+    Requirements:
+    [copy from card YAML]
+    Acceptance Criteria:
+    [copy from card YAML]
+    ### 2. Codebase Context (pre-computed)
+    [paste codebase-architect findings from Step A]
+    ### 3. Working Directory
+    All work MUST happen in the worktree: <worktree-path>
+    cd to this directory before any file operations.
+    ### 4. File Permissions (ENFORCED)
+    MAY EDIT:
+    [files from ownership map for THIS card only]
+    FORBIDDEN:
+    - ALL files not in the MAY EDIT list
+    - Do NOT create files outside the designated paths
+    ### 5. Design Reference (if UI card)
+    [path to design.html if exists]
+    ### 6. Your Pipeline
+    Execute these steps in order:
+    a) Print the numbered requirements checklist (anti-skip measure)
+    b) Implement ALL requirements
+    c) Run: npx tsc --noEmit && npx eslint --max-warnings=0 <your-files>
+    d) Self-heal up to 3 times if checks fail
+    e) Verify completeness: for each requirement, confirm code exists (read it)
+    f) If any requirement is missing after implementation, implement it now
+    g) Output the completion report (MANDATORY format below)
+    ### 7. Completion Report (MANDATORY)
+    ```completion-report
+    card: <CARD-ID>
+    status: done | partial | failed
+    requirements:
+      - id: 1
+        text: "[text]"
+        status: done | partial | blocked
+        evidence: "file:line"
+    files_changed:
+      - path/to/file.ts
+    build_status: pass | fail
+    lint_status: pass | fail
+    retry_count: N
+    ```
+```
+#### Step C: Wait for group completion
+The orchestrator waits for ALL background agents in the group to complete. It will be notified automatically as each finishes (`run_in_background`).
+For each completed agent:
+1. Read the completion report from the agent's output.
+2. Log to tracker: card ID, status, files changed, build/lint status.
+3. Do NOT read or store implementation details — only the verdict.
+**If an agent fails** (status: failed after 3 retries):
+- Log failure in tracker `## Issues & Flags`.
+- Other agents in the group continue unaffected.
+- After group completes, ask user: retry failed card or skip?
+#### Step D: Post-group review + QA
+After ALL agents in the group complete successfully:
+1. **Build verification** — Run `npm run build` in the worktree to verify combined changes compile. If build fails, identify which card's changes broke it (from `git diff` per card), spawn a targeted fix-coder for those files only.
+2. **Combined review** — Invoke **code-reviewer** + **doc-reviewer** IN PARALLEL (read-only, same as Phase 3) across ALL changes in the group. This is ONE combined review, not per-card — more efficient and catches cross-card issues.
+3. **Apply fixes** — If findings exist, spawn ONE fix-coder to apply all fixes in a single pass. Run build + lint after.
+4. **QA gate** — Select the HIGHEST QA profile across all cards in the group (e.g., if one card is BALANCED and another is DEEP, use DEEP). Invoke **qa-sentinel** once for the group.
+4b. **Pre-Merge Codex Review Gate (MANDATORY — UNCONDITIONAL)** — Invoke `/codexreview` for EACH card in the group, one at a time, BEFORE any commit. This mirrors Phase 3.7 of the sequential path and is non-skippable regardless of file paths or perceived risk. Apply the same fix sub-loop: if the consolidated report shows verified BLOCKER/HIGH findings, spawn a fix-coder and re-run `/codexreview <CARD-ID>` (max 2 retries per card). If still BLOCKER/HIGH after retries, ask the user before proceeding to step 5. The Phase 4-equivalent commits in step 5 MUST NOT happen until every card in the group has a PASS verdict (or explicit user override). Log results in the tracker under `## Pre-Merge Codex Review` per card.
+5. **Commit** — One commit per card using explicit staging from the ownership map. **NO stash in worktrees** (stashes are globally shared via `refs/stash` — see Phase 4 WORKTREE COMMIT RULE):
+   ```bash
+   cd <worktree-path>
+   # For each card in the group:
+   git add <card's-files-only>
+   git commit -m "[CARD-ID] Brief description"
+   ```
+6. **Update backlog (MANDATORY — do NOT skip)** — For EACH card in the group:
+   a. Edit the backlog YAML (`${paths.backlog_dir}/<CARD-ID>.yml`): set `status: DONE`, add `completed_date: <today>`, add implementation notes.
+   b. **Verify the write**: re-read the YAML file and confirm `status: DONE` is present. If not, retry.
+   c. Stage the updated YAML and include it in the card's commit (or as an immediate follow-up commit).
+   d. Log in tracker: `card_status: DONE (verified)` for each card.
+   Note: Phase 6b (Status Reconciliation) will catch any card missed here, but aim for zero misses.
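+Selecting the group's QA profile in step 4 reduces to taking the maximum over an ordered scale. A minimal sketch, assuming the ordering skip < light < balanced < deep and case-insensitive profile names (`group_qa_profile` is a hypothetical helper):

```python
QA_ORDER = ["skip", "light", "balanced", "deep"]


def group_qa_profile(profiles):
    """Return the most thorough profile among the cards of one group."""
    return max((p.lower() for p in profiles), key=QA_ORDER.index)
```

+For example, `group_qa_profile(["BALANCED", "DEEP"])` yields `"deep"`. An empty list raises `ValueError`, which is reasonable here because a group always has at least one card.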
+#### Step E: Context purge + next group
+After committing all cards in the group:
+1. Update tracker: move group to done, log all results per card.
+2. **PURGE**: forget all implementation details, review findings, architect context.
+3. Move to the next pending group (Step A again).
+### Sequential fallback within team mode
+If two cards in the same parallel group are discovered at runtime to have a file conflict (e.g., one coder created an unexpected file that overlaps):
+1. Detect via file-diff gate after Step C.
+2. Revert the later card's conflicting files.
+3. Re-run that card as a standalone sequential step after the group.
+### Dependency gates between groups
+Before starting group N, verify ALL cards in groups 0..N-1 are status: done. If any card in a previous group failed and was skipped:
+- If cards in group N `depends_on` it: skip those dependent cards too, log in `## Issues & Flags`.
+- If no dependency: proceed normally.
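+The dependency gate can be sketched as a filter over the next group's cards. The function and argument names are hypothetical, and one assumption goes beyond the text: a `depends_on` entry that is not part of the current batch is treated as already satisfied.

```python
def gate_group(group_cards, card_status):
    """Split the next group's cards into (runnable, skipped).

    group_cards: {card_id: [depends_on ids]}
    card_status: {card_id: "done" | "failed" | "skipped"} for earlier groups
    """
    runnable, skipped = [], []
    for card_id, deps in group_cards.items():
        # Dependencies outside the batch default to "done" (assumption).
        blocked = [d for d in deps if card_status.get(d, "done") != "done"]
        (skipped if blocked else runnable).append(card_id)
    return runnable, skipped
```

+Skipped cards should themselves be recorded as skipped in `card_status`, so that cards in later groups depending on them are gated the same way.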
+### Post-batch (same as sequential mode)
+After all groups are complete, run the same Final Review, Phase 6 (merge), and Phase 7 (production readiness) as documented above. No changes needed for these phases.
+---
+## Phase 8 — Metrics Log (MANDATORY — runs after Phase 7)
+After the Production Readiness Checklist is complete, log this batch run to the skill
+effectiveness registry. This enables tracking of first-attempt success rate, actionability,
+and QA quality over time.
+**Steps:**
+1. Read the batch tracker file (`/tmp/batch-tracker-<FIRST-CARD-ID>.md`) to extract:
+   - Start timestamp (from `Started:` field)
+   - Card list and total count
+   - For each completed card: fix_cycles, qa_profile, qa_verdict, findings_total, findings_verified, blockers_count
+   - Merge commit hash
+2. Compute aggregate metrics:
+   - `first_attempt_success_rate`: cards with fix_cycles == 0 / total_cards
+   - `mean_fix_cycles`: mean of per-card fix cycle counts
+   - `qa_profiles`: count of {skip, light, balanced, deep}
+   - `qa_pass_first_attempt_rate`: cards with qa_verdict PASS on first attempt / total
+   - `findings_total` / `findings_verified`: sum across all cards
+   - `actionability_rate`: findings_verified / findings_total (conservative proxy; 0 if no findings)
+   - `severity_p0_pct`: blockers_count / findings_total (0 if no findings)
+   - `cycle_time_mins`: minutes from `Started:` timestamp to merge commit time (use `git log -1 --format=%ci <hash>`)
+3. Write ONE JSON line to `docs/metrics/skill-runs.jsonl` (append):
+   ```json
+   {"ts":"<ISO-8601-UTC>","skill":"new","run_id":"batch-<FIRST-CARD-ID>","cards":["FEAT-XXX"],"total_cards":N,"first_attempt_success_rate":0.0,"mean_fix_cycles":0.0,"qa_profiles":{"skip":0,"light":0,"balanced":0,"deep":0},"qa_pass_first_attempt_rate":0.0,"findings_total":0,"findings_verified":0,"actionability_rate":0.0,"severity_p0_pct":0.0,"doc_gaps_found":0,"doc_gaps_fixed":0,"cycle_time_mins":0,"worktree_branch":"","merge_commit":""}
+   ```
+   Use `date -u +%Y-%m-%dT%H:%M:%SZ` for the timestamp. Write with `echo '...' >> docs/metrics/skill-runs.jsonl` via Bash.
+4. Copy the batch tracker to archive:
+   ```bash
+   mkdir -p docs/metrics/archive
+   cp /tmp/batch-tracker-<FIRST-CARD-ID>.md docs/metrics/archive/
+   ```
+5. Note in tracker: `## Metrics Log: WRITTEN (run_id: batch-<FIRST-CARD-ID>)`
+**If `docs/metrics/skill-runs.jsonl` does not exist**: create it first with `mkdir -p docs/metrics && touch docs/metrics/skill-runs.jsonl`.
+**If batch tracker is missing or unreadable**: log "Metrics Log: SKIPPED (tracker not found)" and proceed without blocking.
+**This phase is NON-BLOCKING** — if it fails for any reason, do not abort the run.
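+The aggregate formulas from step 2 can be sketched as one function. This is a sketch, not the skill's own code: the per-card dictionaries and their lowercase `qa_profile` values are assumed inputs extracted from the tracker, and only the metrics whose formulas are spelled out above are computed.

```python
import json


def aggregate(cards):
    """Compute batch metrics from per-card tracker values."""
    total = len(cards)
    findings_total = sum(c["findings_total"] for c in cards)
    findings_verified = sum(c["findings_verified"] for c in cards)
    blockers = sum(c["blockers_count"] for c in cards)
    profiles = {p: 0 for p in ("skip", "light", "balanced", "deep")}
    for c in cards:
        profiles[c["qa_profile"]] += 1
    return {
        "total_cards": total,
        "first_attempt_success_rate": sum(c["fix_cycles"] == 0 for c in cards) / total if total else 0.0,
        "mean_fix_cycles": sum(c["fix_cycles"] for c in cards) / total if total else 0.0,
        "qa_profiles": profiles,
        "findings_total": findings_total,
        "findings_verified": findings_verified,
        # Both ratios fall back to 0 when no findings were raised.
        "actionability_rate": findings_verified / findings_total if findings_total else 0.0,
        "severity_p0_pct": blockers / findings_total if findings_total else 0.0,
    }
```

+Passing the result through `json.dumps` produces a single-line fragment that can be merged with the remaining fields (`ts`, `run_id`, `cards`, and so on) before appending to `docs/metrics/skill-runs.jsonl`.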