npm - cfsa-antigravity - Versions diffs - 6.0.0 → 6.1.0 - Mend

cfsa-antigravity 6.0.0 → 6.1.0

Files changed (47)

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "cfsa-antigravity",
-  "version": "6.0.0",
+  "version": "6.1.0",
   "description": "CFSA Pipeline — Constraint-First Specification Architecture for AI agents. Production-grade from line one.",
   "scripts": {
     "changeset": "changeset",

package/template/.agents/skills/implement-slice-tdd/SKILL.md CHANGED

@@ -90,7 +90,23 @@ After GREEN, scan new imports. If any package lacks a corresponding skill direct
 2. Read `.agents/skills/bootstrap-agents/SKILL.md` and invoke `/bootstrap-agents PIPELINE_STAGE=implement-slice` + the new dependency key
 3. **HARD GATE**: Follow the bootstrap verification protocol (`.agents/skills/prd-templates/references/bootstrap-verification-protocol.md`). Confirm the matching skill is installed before proceeding to REFACTOR.
-No new unregistered dependencies → skip to Step 5.
+No new unregistered dependencies → skip to Step 4.9.
+## 4.9. Slice depth ratio gate (mandatory — derived from spec)
+Read `.agents/skills/prd-templates/references/slice-depth-floor.md` in full. The slice's planned `phase-N.md` entry should already include a computed depth floor and breakdown (added by `/plan-phase-write` § 4.4). If it does not, compute it now from the slice's referenced spec sections.
+1. **Count delivered tests** for this slice. A "delivered test" is a test function that:
+   - Lives in this slice's test files (or shared test files with explicit slice-tag comments), AND
+   - Asserts a concrete behavior tied to a specific spec item (a field, a validation rule, an error code, a role, a state, an edge case)
+   - Smoke tests, snapshot tests with no assertions, and "test that it renders" tests do NOT count.
+2. **Compute ratio**: `delivered_tests / depth_floor`.
+3. **Hard gate**:
+   - `ratio >= 1.0` → proceed to Step 5.
+   - `ratio < 1.0` → **STOP**. List every uncovered spec item (per the floor's breakdown). Return to Step 3 (RED), write failing tests for each uncovered item, then proceed through GREEN again. Do not weaken the floor; do not skip uncovered items by claiming they are "implicit." Every floor item must have an explicit test.
+4. **Anti-cheat**: A test that asserts only `expect(result).toBeTruthy()` or `expect(response.status).toBe(200)` does not satisfy a floor item that requires verifying a specific field, error message, role denial, or state transition. Each floor item must be matched to at least one assertion that exercises that item's specific behavior.
+Record the ratio and the matched item count in the slice progress file (`.memory/pipeline/progress/slices/<slice-id>.md`) under a `## Depth Ratio` section before proceeding.
 ## 5. Refactor
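
The Step 4.9 gate added in the hunk above can be sketched as a small function. This is only an illustration under stated assumptions: the `floorItem` tag on each test and the object shapes are hypothetical, not something the package defines, and requiring an empty `uncovered` list is one reading of the anti-cheat rule.

```javascript
// Sketch of the Step 4.9 gate. Object shapes are hypothetical, not the package's format.
function depthRatioGate(floorItems, deliveredTests) {
  if (floorItems.length === 0) {
    // Assumption: an empty floor means the spec is too thin, so no ratio is computed.
    throw new Error("depth floor is empty; deepen the spec first");
  }
  // ratio = delivered_tests / depth_floor, as defined in Step 4.9.
  const ratio = deliveredTests.length / floorItems.length;
  // Floor items that no delivered test claims to exercise.
  const claimed = new Set(deliveredTests.map((t) => t.floorItem));
  const uncovered = floorItems.filter((item) => !claimed.has(item));
  return { ratio, uncovered, proceed: ratio >= 1.0 && uncovered.length === 0 };
}

const gateResult = depthRatioGate(
  ["email.required", "email.format", "role.viewer.deny"],
  [
    { name: "rejects missing email", floorItem: "email.required" },
    { name: "rejects malformed email", floorItem: "email.format" },
  ]
);
console.log(gateResult.proceed, gateResult.uncovered);
```

Here two tests against a floor of three leave `role.viewer.deny` uncovered, so the sketch reports that the slice should return to RED rather than proceed.
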
@@ -151,7 +167,7 @@ Read `.agents/skills/session-continuity/protocols/11-parallel-synthesis.md` and
 ## 7. Update progress (Mandatory)
-**CRITICAL**: You MUST NOT skip progress updates. Read `.agents/skills/session-continuity/protocols/03-progress-update.md` and follow **every step** — physically edit all four file targets (slice, phase, index, memory).
+**CRITICAL**: You MUST NOT skip progress updates. Read `.agents/skills/session-continuity/protocols/03-progress-update.md` and follow **every step** — including the new step 8 (read-back verification), step 9 (cross-file consistency check), and step 10 (Completion Signature stamp). Physically edit all four file targets (slice, phase, index, memory) and run `node scripts/check-progress-consistency.mjs` before exiting this step.
 ## 8. Completion Gate
@@ -160,19 +176,27 @@ Read `.agents/skills/verification-before-completion/SKILL.md` and apply its disc
 Read `.agents/skills/prd-templates/references/slice-completion-gates.md` and verify every applicable checklist passes:
 - **UI Completeness Check** — FE slices only
 - **Spec Traceability Gate** — all slices
+- **Spec Depth Floor Gate** — all slices
 Read `.agents/skills/prd-templates/references/tdd-testing-policy.md` and run the **QA Anti-Cheat Audit** checklist.
-You may not call `notify_user` until you have edited all four progress file targets (7a–7d).
+**Cross-runtime progress gate (HARD STOP)**: Run `node scripts/check-progress-consistency.mjs`. If exit code is non-zero, you MAY NOT call `notify_user`. Fix every drift item the script reports, then re-run until exit 0. This gate exists because different agent runtimes (Claude, Antigravity, Codex, Factory) share the same `.memory/pipeline/progress/` tree — a silent partial update here means the next session in any runtime resumes from stale state and may redo or skip work.
+You may not call `notify_user` until:
+1. All four progress file targets have been edited (Protocol 3 steps 1–7).
+2. Read-back verification passed for each (Protocol 3 step 8).
+3. `node scripts/check-progress-consistency.mjs` returned exit 0 (Protocol 3 step 9).
+4. Slice file has a `## Completion Signature` block stamped with date, runtime, verifier exit, and depth ratio (Protocol 3 step 10).
 Verify your edits by reading:
-- `.memory/pipeline/progress/slices/phase-NN-slice-NN.md` — Status: complete, [x] criteria
+- `.memory/pipeline/progress/slices/phase-NN-slice-NN.md` — Status: complete, [x] criteria, Depth Ratio >= 1.0, Completion Signature present
 - `.memory/pipeline/progress/phases/phase-NN.md` — incremented progress fraction
 - `.memory/pipeline/progress/index.md` — updated overall percentage
 Your `notify_user` payload **MUST** include:
 1. Raw output from the three reads above
-2. Updated overall progress (e.g., "Overall progress is now 75% (24/32 slices)")
-3. Explicit next command: Run `/implement-slice` for [next slice name]
+2. Verifier exit code (must be `0`)
+3. Updated overall progress (e.g., "Overall progress is now 75% (24/32 slices)")
+4. Explicit next command: Run `/implement-slice` for [next slice name]
 **Infrastructure/Auth slice gate**: If this was the `00-infrastructure` or auth slice, the next command is `/verify-infrastructure`, not `/implement-slice`.
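
The four preconditions for calling `notify_user` in the Completion Gate can be read as a checklist that returns every unmet condition. The sketch below assumes a hypothetical `state` object; its field names are illustrative and are not part of the package.

```javascript
// The four progress targets named in Step 7 of the skill.
const REQUIRED_TARGETS = ["slice", "phase", "index", "memory"];

// Returns the list of reasons notify_user must not be called yet (empty when clear).
// The `state` shape is a hypothetical illustration.
function notifyBlockers(state) {
  const blockers = [];
  const missing = REQUIRED_TARGETS.filter((t) => !state.editedTargets.includes(t));
  if (missing.length > 0) blockers.push(`unedited targets: ${missing.join(", ")}`);
  if (!state.readBackVerified) blockers.push("read-back verification not passed");
  if (state.verifierExit !== 0) blockers.push(`verifier exit ${state.verifierExit}`);
  if (!state.signaturePresent) blockers.push("Completion Signature missing");
  return blockers;
}

console.log(
  notifyBlockers({
    editedTargets: ["slice", "phase", "index"],
    readBackVerified: true,
    verifierExit: 1,
    signaturePresent: false,
  })
);
```

Returning all blockers at once, rather than failing on the first, mirrors the instruction to fix every drift item before re-running the verifier.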

package/template/.agents/skills/plan-phase-write/SKILL.md CHANGED

@@ -41,14 +41,13 @@ For each FE spec in the phase scope:
 The resulting list of slices is derived from the spec, not estimated from feature names. Do not aggregate slices by domain name or by "it feels like one feature."
-Estimate complexity (S/M/L) per derived slice. Flag any slice estimated L — these are candidates for further splitting before ordering begins.
+Estimate complexity (S/M/L) per derived slice as informational metadata only. **Do not split, merge, or drop slices to hit a complexity target.** Slice count is determined by the spec, never by an arbitrary cap.
-**L-slice enforcement**: Any slice marked **L** MUST be reviewed for splitting before Step 3. Present L slices to user: "These slices are estimated Large. Split each into 2-3 smaller slices, or confirm L is acceptable?" Wait for confirmation. Do not proceed to ordering with unreviewed L slices.
+**L-slice handling (informational)**: If a slice is estimated L, surface it in the phase plan with the L tag and a one-line note explaining why it is large (e.g., "covers 4 endpoints with shared transaction boundary"). Splitting an L slice is permitted only when the spec itself contains a natural seam (e.g., two independent BE endpoints) — never to satisfy a count target.
-**Slice count sanity check**: After splitting, count total slices.
-- **1-15 slices** → normal.
-- **16-25 slices** → warn: "Phase has [N] slices. Consider splitting into two phases if unrelated domains are grouped together."
-- **>25 slices** → **STOP**: "Phase has [N] slices — this is too many for one phase. Split into Phase N and Phase N+1, keeping dependency order intact."
+**Slice count is informational**: Report the total slice count in the phase plan header. Do not stop, warn, or restructure based on count alone. A phase has as many slices as the spec produces. If the count feels uncomfortable, the correct response is to verify the spec is correctly scoped to this phase, not to compress slices.
+**Phase splitting** is justified only by **dependency boundaries** (e.g., "auth must ship before any auth-gated feature"), not by slice count. If two independent domains are grouped in one phase by the architecture's phasing section, that is the architecture's call, not the planner's.
 **Good slice**: "User can submit an entity claim form" (one named user flow from the FE interaction spec)
 **Bad slice**: "Implement entity management" (domain name, not a spec-derived user flow)
@@ -129,6 +128,18 @@ Read `.agents/skills/prd-templates/references/operational-templates.md` for the
 - `QA`: Test writing (RED phase — runs FIRST), test verification (GREEN phase — runs LAST)
 - No tag: Contract/schema work, shared infra — handled sequentially by orchestrator
+## 4.4. Slice depth floor (mandatory — derived from spec)
+Read `.agents/skills/prd-templates/references/slice-depth-floor.md` in full. For every slice, compute the **minimum acceptance-criteria count** using the formula in that file. Annotate the slice in `phase-N-draft.md` with the computed floor and its breakdown.
+**Hard gate**:
+1. The acceptance-criteria count written in Step 4 MUST equal or exceed the computed floor for every slice. If it does not, return to Step 4 and add the missing criteria — every missing criterion must trace to a concrete spec item (validation rule, error code, role, state, edge case) that the slice covers.
+2. Apply the **Spec Thinness Detection** table from `slice-depth-floor.md`. If any slice's referenced spec section produces zero items in a required category, **STOP**. Do not write shallow criteria from a thin spec. Tell the user: "Spec [BE/FE/IA §...] is too thin to produce a meaningful slice. Run `/resolve-ambiguity [path]` before continuing." Wait for the spec to be deepened, then re-run this step.
+The floor is the **minimum**, not the maximum. Implementers may add criteria beyond the floor; they may not write fewer.
+This step exists to prevent shallow slices from being green-lit by the criteria-counting gate. Without it, a slice can be planned with 3 criteria when its specs collectively define 18 items that need explicit verification.
 ## 4.5. Identify parallel groups (TDD order)
 Read `.agents/skills/parallel-agents/SKILL.md` and follow its TDD-Order Dispatch methodology for parallel groups and execution order. Flag any tasks that can't be parallelized (shared file dependencies) in the plan.
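
The Spec Thinness Detection rule that Step 4.4 applies can be sketched as a lookup from slice type to required non-zero categories. The slice-type keys and the `counts` shape are hypothetical, and treating "states (≥3: loading, error, success)" as a plain count of three is a simplification made for this sketch.

```javascript
// Minimum item counts per category, following the Spec Thinness Detection table.
// Keys and category names are illustrative labels, not identifiers from the package.
const REQUIRED_CATEGORIES = {
  "be-endpoint": { "happy-path": 1, "field validation": 1, "error codes": 1, authorization: 1 },
  "fe-data-fetching": { states: 3, accessibility: 1 },
  "fe-form": { states: 1, "form fields": 1, accessibility: 1 },
  "role-conditional": { "role variants": 1 },
};

// Returns the categories in which the referenced spec section is too thin.
function thinCategories(sliceType, counts) {
  const required = REQUIRED_CATEGORIES[sliceType] || {};
  return Object.keys(required).filter((c) => (counts[c] || 0) < required[c]);
}

console.log(thinCategories("fe-data-fetching", { states: 2, accessibility: 1 }));
```

A non-empty result corresponds to the STOP branch: the planner asks for the spec to be deepened instead of writing criteria.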

package/template/.agents/skills/prd-templates/references/shard-boundary-analysis.md CHANGED

@@ -88,7 +88,7 @@ The decomposition should produce a shard count proportional to the ideation dept
 | **Small** | 3–5 | 4–8 | Most domains map 1:1 to shards |
 | **Medium** | 6–8 | 7–14 | Some multi-domain splits expected |
 | **Large** | 9–12 | 12–20 | Cross-cutting shards multiply; enforce sub-feature limits aggressively |
-| **Deep** | 13+ | 16–25 | Maximum recommended. Beyond 25 → consider surface-level decomposition or domain grouping |
+| **Deep** | 13+ | 16+ | Beyond 25 → review whether each additional shard reflects a true domain boundary or an over-decomposition; do not force a cap if boundaries are real |
 ### Total Count Thresholds (Post-Skeleton Check)
@@ -98,6 +98,6 @@ After all shard skeletons are created in Step 5, count the total:
 |--------------|--------|
 | **≤ 20** | ✅ Proceed |
 | **21–25** | ⚠️ Warning — "Decomposition produced [N] shards. This will require [N × 3] spec documents (IA + BE + FE) and proportional phase planning. Confirm this is the intended scope or consider grouping related domains." |
-| **> 25** | 🛑 **Hard stop** — "Decomposition produced [N] shards. This exceeds the recommended maximum for sequential agent processing. Present a domain grouping proposal that reduces shard count to ≤ 25." |
+| **> 25** | ⚠️ Informational — "Decomposition produced [N] shards. This is fine if each shard reflects an independent domain boundary. Review the boundary justification for any shard with strong dependency coupling to a sibling and merge if the coupling indicates the same domain." Proceed after acknowledgment. |
-> **Why 25 max**: Each shard produces ~3 spec documents (IA + BE + FE). At 25 shards, that's 75 spec documents. Beyond this, cross-layer consistency checks become unreliable and phase planning produces impractical slice counts.
+> **Why no upper cap**: Shard count must reflect domain boundaries, not an arbitrary processing limit. The justification for splitting or merging shards is **dependency coupling**, not count: shards with high mutual dependency belong together; shards with independent domains belong apart. Spec quality is enforced by the per-shard content-completeness floors in BE/FE, not by limiting total shard count. If 35 shards each represent a distinct domain, 35 shards is correct.
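
After this change, the post-skeleton thresholds no longer include a hard stop. A minimal sketch of the 6.1.0 table, with hypothetical action labels:

```javascript
// Maps a total shard count to the action in the 6.1.0 threshold table.
// The returned strings are illustrative labels, not values the package defines.
function shardCountAction(count) {
  if (count <= 20) return "proceed";
  if (count <= 25) return "warning";
  return "informational"; // no hard stop above 25 in 6.1.0
}

console.log([20, 23, 35].map(shardCountAction));
```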

package/template/.agents/skills/prd-templates/references/slice-completion-gates.md CHANGED

@@ -18,6 +18,14 @@ Non-FE slices skip this block.
 - [ ] No `// DECISION:` annotations exist for behaviors that are actually specified in the BE spec or IA shard (i.e., no spec-defined behavior was treated as an undocumented implementation decision)
 - [ ] The {{CONTRACT_LIBRARY}} contract written in Step 2 matches the delivered implementation field-for-field — no fields added, removed, or renamed during implementation without a corresponding contract update
+## Spec Depth Floor Gate (all slices)
+- [ ] Slice depth floor was computed from BE/FE/IA spec content during `/plan-phase` and recorded in the phase plan (`.memory/pipeline/progress/phases/phase-N.md`) for this slice
+- [ ] Acceptance criteria count for this slice meets or exceeds the recorded depth floor (slice may not ship with fewer criteria than the floor)
+- [ ] Slice depth ratio (`delivered_tests / depth_floor`) is recorded in `.memory/pipeline/progress/slices/<slice-id>.md` under `## Depth Ratio` and is `≥ 1.0`
+- [ ] Anti-cheat: every floor item has at least one delivered test that asserts the specific spec behavior — no `expect(true).toBeTruthy()`, no bare `expect(res.status).toBe(200)`, no snapshot-only or render-only tests counted toward the floor
+- [ ] Every required category from the spec (validation rules, error codes, authorization rows, idempotency behavior, rate limit behavior, observability hooks for BE; state enumeration, role variants, accessibility inventory, navigation behavior, network-degradation for FE) is represented by ≥1 explicit assertion
 ## Resource Cleanup Gate (all slices)
 - [ ] Every database client or connection created in this slice has a corresponding cleanup call (`disconnect`, `close`, `end`, `dispose`) in a `finally` block or lifecycle hook

package/template/.agents/skills/prd-templates/references/slice-depth-floor.md ADDED

@@ -0,0 +1,96 @@
+# Slice Depth Floor
+**Purpose**: Define the minimum depth a slice must have, derived from the specs it covers. This document is the authoritative source for two gates:
+1. **Plan-phase floor** (`/plan-phase-write` Step 4.4) — minimum acceptance-criteria count per slice, computed from spec breadth. Refuses to plan a shallow slice.
+2. **Implement-slice floor** (`/implement-slice-tdd` Step 4.9) — minimum delivered test count per slice, computed from the same formula. Refuses to mark a slice complete if test depth is below the floor.
+The floor is **derived from the specs the slice references** — not invented, not estimated, not negotiable. If the specs are too thin to produce a meaningful floor, the spec is wrong and must be deepened first (see `write-be-spec-write` § Pass 4 and `write-fe-spec-write` § Pass 5).
+---
+## The Formula
+For a single slice, the **expected criteria/test count** is the sum of every item below that the slice's referenced spec sections produce. Count each item exactly once.
+### BE-derived items (per BE spec section the slice covers)
+| Item | What to Count |
+|------|---------------|
+| **Happy-path** | 1 per endpoint covered by the slice |
+| **Field validation** | 1 per validated request field × constraint (e.g., `email` with `required` + `format` + `maxLength` = 3) |
+| **Field validation messages** | 1 per distinct error message a validator can produce |
+| **Error codes** | 1 per error code defined in the BE spec for those endpoints (`4xx`, `5xx` differentiated) |
+| **Authorization** | 1 per role × endpoint combination defined in the BE spec's `## Access Control` (including the deny case for unauthorized roles) |
+| **Ownership / scoping** | 1 per ownership rule (e.g., user can only read their own records) |
+| **Idempotency / concurrency** | 1 per idempotency or race-condition rule from BE Pass 2 (sequencing & concurrency) |
+| **Failure cascade** | 1 per rollback or transaction-boundary rule from BE Pass 3 |
+| **Rate limiting** | 1 per rate-limit rule defined for the endpoint (boundary test that asserts the limit triggers) |
+### FE-derived items (per FE spec component/flow the slice covers)
+| Item | What to Count |
+|------|---------------|
+| **States** | 1 per distinct UI state per data-fetching component: `idle`, `loading`, `error` (per error class), `empty`, `success`, `optimistic-pending`, `optimistic-rollback` (only count states the spec actually defines) |
+| **Role variants** | 1 per non-trivial cell in the component's role × feature matrix (every `hidden`, `locked`, `read-only` cell in addition to `full`) |
+| **Responsive variants** | 1 per breakpoint that changes interaction (not just layout) — e.g., touch swipe replacing drag |
+| **Form fields** | 1 per form field × validation rule (mirrors BE field validation but exercised through the UI) |
+| **Navigation** | 1 per route guard, deep-link entry, browser-back behavior, and multi-tab scenario defined in FE Pass 3 |
+| **Accessibility** | 1 per row in the component's accessibility inventory (WCAG criterion + keyboard binding + screen-reader behavior count as one row) |
+| **Network degradation** | 1 per loading-threshold/retry/timeout rule from FE Pass 2 |
+### IA-derived items (per IA shard section the slice's domain references)
+| Item | What to Count |
+|------|---------------|
+| **Edge cases** | 1 per `## Edge Cases` item in the IA shard relevant to this slice |
+| **Acceptance criteria** | 1 per Given/When/Then in the IA shard's testability section relevant to this slice |
+| **Junior/age-restricted rules** | 1 per content-gating rule that affects this slice |
+---
+## Computing the Floor
+```
+floor(slice) = sum(all items above for every spec section the slice references)
+```
+Annotate each slice in the phase plan with its computed floor and the breakdown:
+```markdown
+### Slice 3: Submit entity claim form
+**Spec depth floor**: 18 criteria
+  - BE 04-entities §3.2: 1 happy + 4 fields × 2 constraints + 5 error codes + 3 roles + 2 ownership = 19 → adjusted 14 (deny-case overlaps)
+  - FE 04-entity-claim §EntityClaimForm: 5 states + 1 role variant + 4 form fields + 2 a11y = 12 → adjusted 4 (form fields overlap with BE)
+  - IA 04-entities §Edge Cases: 0 relevant
+**Floor total**: 18
+```
+The acceptance criteria written in `/plan-phase-write` Step 4 must equal or exceed this floor. The delivered test count in `/implement-slice-tdd` Step 4.9 must equal or exceed this floor.
+---
+## Spec Thinness Detection
+If any spec section the slice references produces **zero items** in a category where items would be expected for the type of work the slice does, the spec is too thin. Apply the rule:
+| Slice Type | Required Non-Zero Categories |
+|------------|------------------------------|
+| Any BE endpoint slice | happy-path, field validation, error codes, authorization |
+| Any FE data-fetching slice | states (≥3: loading, error, success), accessibility (≥1) |
+| Any FE form slice | states, form fields × validation, accessibility |
+| Any role-conditional slice | role variants (≥1) |
+If a required category is zero:
+> ❌ **STOP** — Slice "[name]" references spec section [BE/FE/IA §...] which produces 0 [category] items. The spec is too thin to produce a meaningful slice. Run `/resolve-ambiguity` on the spec before planning this slice. Do not write acceptance criteria from a thin spec — that produces shallow code.
+---
+## Rationale
+Without this floor, slice depth is bounded by the implementer's interpretation of "what's a reasonable test count." Two implementers can read the same FE spec and produce 3 vs 12 tests and both feel justified. The floor turns a subjective bound into a deterministic count derived from spec content.
+Without this floor, the pipeline can produce passing tests for shallow slices that omit half the spec's validation rules, error cases, or role variants — and the existing "tests pass" gate green-lights it. The floor closes that loophole.
+The floor is **not a maximum**. Implementers may exceed it. It is the minimum below which a slice is structurally shallow regardless of how many tests pass.
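
The BE part of the floor formula is a plain sum, with field validation counted per field × constraint. The sketch below reproduces the raw BE count from the annotated example (1 happy path + 4 fields × 2 constraints + 5 error codes + 3 roles + 2 ownership rules). The `section` shape and field names are hypothetical, and the sketch does not model the "adjusted" overlap deduction shown in that example.

```javascript
// One item per (field × constraint), e.g. email with required + format + maxLength = 3.
function fieldValidationCount(fields) {
  return fields.reduce((sum, field) => sum + field.constraints.length, 0);
}

// Raw (unadjusted) BE floor for one spec section. The `section` shape is illustrative.
function rawBackendFloor(section) {
  return (
    section.endpoints +
    fieldValidationCount(section.fields) +
    section.errorCodes +
    section.roleRows +
    section.ownershipRules
  );
}

const claimSection = {
  endpoints: 1,
  fields: [
    { name: "entityId", constraints: ["required", "format"] },
    { name: "email", constraints: ["required", "format"] },
    { name: "reason", constraints: ["required", "maxLength"] },
    { name: "evidenceUrl", constraints: ["required", "format"] },
  ],
  errorCodes: 5,
  roleRows: 3,
  ownershipRules: 2,
};
console.log(rawBackendFloor(claimSection));
```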

package/template/.agents/skills/session-continuity/protocols/01-session-resumption.md CHANGED

@@ -29,10 +29,35 @@
 6. **Read `.memory/wiki/decisions.md`** — load key decisions for context.
-7. **Summarize for the current task**:
+7. **Drift scan (mandatory — runtime-agnostic)**
+   Before trusting the resumption point, run the cross-file consistency verifier:
+   ```
+   node scripts/check-progress-consistency.mjs
+   ```
+   - Exit 0 → progress files agree across slice/phase/index. Trust the resume point and continue to step 8.
+   - Exit 1 → drift. A previous session (possibly in a different runtime — Claude, Antigravity, Codex, Factory) committed work but did not finish updating all four progress targets. **DO NOT silently re-do or skip past the affected slice.** Surface the verifier's full output to the user with this exact framing:
+     ```
+     Progress drift detected from a previous session. Before resuming, the following must be reconciled:
+     <verifier output>
+     Likely cause: another runtime completed work but did not finish updating all progress files (Protocol 3 steps 8–10 were skipped or interrupted).
+     Recommended action: hand-edit the offending files to match the actual completion state on disk (check `git status`, slice's Completion Signature block, and test pass/fail), then re-run this check. I will not advance until the verifier returns 0.
+     ```
+   - Exit 2 → malformed progress files. **STOP**: report and ask the user before attempting repair.
+   If `scripts/check-progress-consistency.mjs` does not exist in the project (older installation), note this and recommend `/sync-kit`, then fall back to manually opening `index.md` + the latest in-progress phase file + that phase's slice files and confirming the fractions and checkboxes match before resuming.
+8. **Summarize for the current task**:
    ```
    Status: Phase 2 in progress — 3/7 slices complete (43%)
    Last session: Completed auth middleware slice, deferred rate limiting (blocked on Redis)
    Blockers: 1 active — Redis connection config needed
+   Drift scan: clean (verifier exit 0)
    Resume at: Phase 2, Slice 4 — Rate limiting middleware
    ```
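
The three verifier exit codes in the drift scan map onto three resumption outcomes. A minimal sketch, assuming a hypothetical `resumeDecision` helper; treating any other exit code as a stop is also an assumption, since the protocol defines only 0, 1 and 2.

```javascript
// Translates the verifier's exit code into a resumption decision and summary line.
function resumeDecision(exitCode) {
  if (exitCode === 0) {
    return { advance: true, summary: "Drift scan: clean (verifier exit 0)" };
  }
  if (exitCode === 1) {
    return { advance: false, summary: "Drift scan: drift detected; reconcile before resuming" };
  }
  if (exitCode === 2) {
    return { advance: false, summary: "Drift scan: malformed progress files; ask the user before repair" };
  }
  // Assumption: unknown exit codes are handled conservatively as a stop.
  return { advance: false, summary: `Drift scan: unexpected verifier exit ${exitCode}` };
}

console.log(resumeDecision(0).summary);
```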

package/template/.agents/skills/session-continuity/protocols/03-progress-update.md CHANGED

@@ -68,3 +68,49 @@
    - **Resolution**: Pinned to v12.0.0, added to package.json overrides
    - **Resolved**: 2026-02-14
    ```
+8. **Read-back verification (mandatory — runtime-agnostic)**
+   After every write in steps 1–6, immediately read each file back and confirm the change landed. This mirrors Protocol 8 step 6 and is required because different agent runtimes (Claude, Antigravity, Codex, Factory) all write to the same `.memory/pipeline/progress/` tree — a silent partial write here means the next session in any runtime will resume from stale state.
+   For each of the four targets you wrote (slice file, phase file, index.md, plus blockers/decisions/patterns if touched):
+   (a) Re-read the file with a fresh read.
+   (b) Confirm the specific bytes you intended to write are present:
+       - Slice file: `**Status**: complete` line is present.
+       - Phase file: the `**Slice N**` row checkbox is `[x]`, and the `**Progress**: X/Y` header reflects the new count.
+       - `index.md`: the phase row's status and `X/Y` cell match the phase file, and `**Overall**: X/Y` is recalculated.
+   (c) If any check fails:
+       - Log: `"Progress write-back failed for [file]: expected [value], saw [actual]"`
+       - Retry the write **once**.
+       - Re-read and re-verify.
+       - If second attempt fails → **STOP**: `"Unable to update [file] after 2 attempts. Do not call notify_user. Do not advance the slice. Investigate before continuing."`
+   (d) On success, log: `"Progress write-back verified: slice/phase/index all consistent."`
+9. **Cross-file consistency check (mandatory — runtime-agnostic)**
+   Run the verifier:
+   ```
+   node scripts/check-progress-consistency.mjs
+   ```
+   (If `scripts/check-progress-consistency.mjs` does not exist in the project — for example, an older installation that has not been synced — fall back to manual cross-checking: open all three files (slice, phase, index) side-by-side and verify that the slice's status, the phase's `X/Y`, and the index's overall `X/Y` are consistent with one another. Then run `/sync-kit` to pull the verifier in.)
+   - Exit code 0 → consistent. Proceed.
+   - Exit code 1 → drift. Read the script's output, fix every reported drift item by editing the affected file(s), and re-run the script until it returns 0. **STOP and do not call `notify_user` until the script returns 0.**
+   - Exit code 2 → malformed file. The progress files are unreadable or missing required headers. **STOP**: report the malformed file and the reason; do not attempt repair without user confirmation, since malformed progress files usually indicate a deeper schema mismatch.
+   This check exists because checkbox edits and fraction edits are easy to forget independently. The script catches the four most common drift patterns: phase fraction ≠ checkbox count, slice file status ≠ phase row checkbox, index overall ≠ sum of phases, index per-phase row ≠ phase file. It runs in any runtime with Node available.
+10. **Stamp the slice with a completion signature** — append (or update) at the bottom of `slices/phase-NN-slice-NN.md`:
+    ```markdown
+    ## Completion Signature
+    - Completed: <ISO date>
+    - Runtime: <claude | antigravity | codex | factory | other>
+    - Verifier: check-progress-consistency.mjs exit 0
+    - Depth ratio: <ratio> (>= 1.0 required)
+    ```
+    The signature is what lets a different runtime opening the project later trust that the slice is genuinely done — not just that someone ticked a box.
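
The four drift patterns named in step 9 can be illustrated on already-parsed progress data. This is not the contents of `check-progress-consistency.mjs`, which this diff does not show; the input shape below is hypothetical and only demonstrates the comparisons.

```javascript
// Checks the four drift patterns on parsed progress data (hypothetical shape).
function findDrift({ sliceStatus, phases, index }) {
  const drift = [];
  let doneSum = 0;
  let totalSum = 0;
  for (const phase of phases) {
    const checked = phase.rows.filter((r) => r.checked).length;
    // Pattern 1: phase fraction vs. checkbox count.
    if (phase.done !== checked) {
      drift.push(`${phase.id}: fraction ${phase.done} != ${checked} checked rows`);
    }
    // Pattern 2: slice file status vs. phase row checkbox.
    for (const row of phase.rows) {
      const complete = sliceStatus[row.slice] === "complete";
      if (complete !== row.checked) drift.push(`${row.slice}: status and checkbox disagree`);
    }
    // Pattern 4: index per-phase row vs. phase file.
    const indexRow = index.phases[phase.id];
    if (!indexRow || indexRow.done !== phase.done || indexRow.total !== phase.total) {
      drift.push(`${phase.id}: index row does not match phase file`);
    }
    doneSum += phase.done;
    totalSum += phase.total;
  }
  // Pattern 3: index overall vs. sum of phases.
  if (index.overall.done !== doneSum || index.overall.total !== totalSum) {
    drift.push("index: overall does not equal sum of phases");
  }
  return drift;
}

console.log(
  findDrift({
    sliceStatus: { s1: "complete", s2: "complete" },
    phases: [{ id: "phase-01", done: 1, total: 2, rows: [{ slice: "s1", checked: true }, { slice: "s2", checked: false }] }],
    index: { overall: { done: 1, total: 2 }, phases: { "phase-01": { done: 1, total: 2 } } },
  })
);
```

In this input the slice file for `s2` says complete but its phase row is unchecked, which is the kind of partial update the protocol is meant to catch.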

package/template/.agents/skills/write-be-spec-write/SKILL.md CHANGED

@@ -53,15 +53,27 @@ Read `.agents/skills/prd-templates/references/be-spec-template.md` for the docum
 Write decision to disk. Continue below.
-### 7.5. Spec complexity gate
+### 7.5. Spec content completeness floor
-Count the total lines in the written BE spec file.
+Spec depth is enforced by **content completeness**, not line count. A short spec that omits validation rules, error codes, or auth coverage is broken; a long spec that enumerates them is correct. There is no upper bound on length.
-| Lines | Action |
-|-------|--------|
-| **≤ 600** | ✅ Pass |
-| **601–800** | ⚠️ Warning — "This BE spec is [N] lines. Consider splitting if endpoint groups are independently testable." Proceed after acknowledgment. |
-| **> 800** | 🛑 **Hard stop** — "This BE spec is [N] lines and will degrade implementation quality. Split the parent IA shard or separate endpoint groups into distinct BE specs." Present the largest sections with line counts. |
+For **every endpoint** in this spec, verify all of the following are explicitly present:
+| Required item | What "present" means |
+|---------------|---------------------|
+| Request schema | Every field with type, constraints (required/optional, min/max, format, enum), and example |
+| Response schema (success) | Full body shape including envelope, pagination metadata, and computed fields |
+| Response schema (each error class) | Body shape for every error code listed below |
+| Validation rules | One row per (field × constraint) with the exact rejection error code and message |
+| Error codes | At minimum: any `4xx` produced by validation, any `4xx` produced by auth/authz, `404` if resource is addressable, `409` if uniqueness or version conflicts apply, `429` if rate-limited, `5xx` for downstream failure cascades |
+| Authorization | One row per role × this endpoint, with allow/deny outcome and any ownership/scoping rule |
+| Idempotency | Explicit statement of behavior on duplicate submission (idempotent / safe-to-retry / not-idempotent + dedupe strategy) |
+| Rate limit | Per-role limit (or "inherits global default" pointing at api-conventions) |
+| Observability | Log fields emitted, metric names incremented, audit-trail entries written |
+**Hard gate**: If any cell above is missing for any endpoint, the spec is incomplete. Fill it in. Do **not** count missing items as "implicit," "obvious," or "covered by conventions" — every endpoint must surface its full table even when it inherits from a shared convention.
+**Length is informational**: Report the line count for the spec in the index entry as metadata only. Do not stop, warn, or split based on length. Splitting is justified only by domain boundaries (e.g., two unrelated entity groups in one shard) — not by size.
 ## 8. Update the BE index
@@ -108,9 +120,49 @@ For each endpoint that mutates data:
 Add any new transaction boundary requirements, rollback specifications, or consistency guarantees.
 **Pass loop guard**: Track total pass count.
-- Passes 1-3 → mandatory.
-- Pass 4-5 → optional, run if prior pass produced significant additions.
-- **After pass 5** → **STOP**: "5 deepening passes completed. Present remaining gaps to user: continue deepening or accept current spec depth?"
+- Passes 1-7 → mandatory.
+- Passes 8+ → optional, run if prior pass produced significant additions.
+- **After pass 10** → **STOP**: "10 deepening passes completed. Present remaining gaps to user: continue deepening or accept current spec depth?"
+### Pass 4: Authorization completeness
+Build a role × endpoint matrix for this spec. Every cell must be one of: `allow`, `allow-own-only`, `allow-team-only`, `deny`, or `deny-with-reason-code`. No empty cells, no "tbd," no "see other doc."
+For every `allow-*-only` cell, the spec must define:
+- The exact ownership predicate (e.g., `record.created_by == auth.user_id`)
+- The error code returned when the predicate fails
+- Whether the predicate is enforced in the application layer, the database layer (RLS / policies), or both
+For every `deny` cell, the spec must define the error code returned. `404` vs `403` for unauthorized reads of existing records must be an explicit decision, not an oversight.
+### Pass 5: Observability and audit trail
+For every endpoint, enumerate:
+- **Structured log entries** emitted on success, on each error class, and on slow-path / degraded-mode execution
+- **Metrics** incremented: counters (request count, error count per class), histograms (latency, payload size), gauges (in-flight requests if relevant)
+- **Audit-trail entries** for any state-changing endpoint: actor, action verb, target entity ID, timestamp, before/after diff (or pointer to it), correlation ID
+- **Trace span attributes** added to the request span beyond defaults (e.g., `entity.id`, `tenant.id`, `feature.flag.X`)
+Add any missing observability hook to the endpoint spec, or to the api-conventions spec if it should be cross-cutting.
+### Pass 6: Rate-limit and abuse-protection edge cases
+For every endpoint:
+- Anonymous-vs-authenticated rate limits (anonymous must always be stricter; if endpoint is auth-only, state explicitly that anonymous receives `401` before rate-limit logic runs)
+- Per-IP vs per-user vs per-tenant limit boundaries
+- Burst behavior (token-bucket vs fixed-window)
+- Behavior when limit is exceeded: `429` with `Retry-After` header, log event, metric increment
+- Abuse pattern handling: brute-force detection on auth endpoints, enumeration protection (return same response for "user not found" and "wrong password"), mass-assignment protection (whitelist allowed fields explicitly per endpoint)
+### Pass 7: Failure-mode partial-state hygiene
+For every multi-step endpoint (e.g., create-then-link-then-notify):
+- Identify each external dependency (database, queue, email, third-party API)
+- For each, specify behavior when that dependency fails: rollback, compensate, queue-for-retry, fail-and-surface-to-user
+- Identify which combinations of partial failures are possible
+- Specify the user-facing error code and message for each combination
+If the endpoint cannot guarantee atomicity, the spec must say so explicitly and define the reconciliation strategy.
 ## 10. Cross-reference check
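The Pass 4 rule added in this hunk (every role × endpoint cell filled with one of five values, plus extra detail on scoped and `deny` cells) can be checked mechanically. The following is a minimal sketch, not part of the package: the function name `find_matrix_gaps`, the dictionary layout, and the field names `predicate`, `error_code`, and `enforced_in` are all assumptions made for illustration.

```python
# Cell values permitted by Pass 4 of the diffed skill file.
ALLOWED_DECISIONS = {
    "allow", "allow-own-only", "allow-team-only", "deny", "deny-with-reason-code",
}
# Hypothetical field names for the three details Pass 4 requires on scoped cells.
SCOPED_FIELDS = ("predicate", "error_code", "enforced_in")


def find_matrix_gaps(roles, endpoints, matrix):
    """Return (role, endpoint, problem) tuples for every cell that fails Pass 4."""
    gaps = []
    for role in roles:
        for endpoint in endpoints:
            cell = matrix.get((role, endpoint))
            if not cell:
                gaps.append((role, endpoint, "empty cell"))
                continue
            decision = cell.get("decision")
            if decision not in ALLOWED_DECISIONS:
                gaps.append((role, endpoint, f"invalid decision {decision!r}"))
            elif decision in ("allow-own-only", "allow-team-only"):
                # Scoped cells need a predicate, an error code, and an enforcement layer.
                for field in SCOPED_FIELDS:
                    if not cell.get(field):
                        gaps.append((role, endpoint, f"missing {field}"))
            elif decision == "deny" and not cell.get("error_code"):
                gaps.append((role, endpoint, "missing error_code"))
    return gaps


# Illustrative data only; the roles and the endpoint are made up.
endpoint = "DELETE /notes/{id}"
matrix = {
    ("admin", endpoint): {"decision": "allow"},
    ("member", endpoint): {
        "decision": "allow-own-only",
        "predicate": "record.created_by == auth.user_id",
        "error_code": 403,
    },
    ("guest", endpoint): {"decision": "tbd"},
}
gaps = find_matrix_gaps(["admin", "member", "guest", "auditor"], [endpoint], matrix)
for gap in gaps:
    print(gap)
```

In this example the `member` cell lacks an enforcement layer, the `guest` cell uses the forbidden "tbd", and the `auditor` cell is missing entirely, so three gaps are reported.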

package/template/.agents/skills/write-fe-spec-write/SKILL.md CHANGED

@@ -42,15 +42,26 @@ Read `.agents/skills/prd-templates/references/fe-spec-template.md` for the docum
 Write decision to disk. Continue below.
-### 6.5. Spec complexity gate
+### 6.5. Spec content completeness floor
-Count the total lines in the written FE spec file.
+Spec depth is enforced by **content completeness**, not line count. There is no upper bound on length.
-| Lines | Action |
-|-------|--------|
-| **≤ 600** | ✅ Pass |
-| **601–800** | ⚠️ Warning — "This FE spec is [N] lines. Consider splitting if component groups are independently testable." Proceed after acknowledgment. |
-| **> 800** | 🛑 **Hard stop** — "This FE spec is [N] lines and will degrade implementation quality. Split into separate FE specs per component group or page." Present the largest sections with line counts. |
+For **every component, view, and form** in this spec, verify all of the following are explicitly present:
+| Required item | What "present" means |
+|---------------|---------------------|
+| State enumeration | Every distinct state defined: `idle`, `loading`, `error` (per error class produced by the BE), `empty`, `success`, plus `optimistic-pending` and `optimistic-rollback` for any optimistic mutation. "Loading" is one state per data dependency, not one shared state. |
+| Role variants | Every cell in the role × component matrix explicitly defined: `full`, `read-only`, `partial-hidden` (with which fields hidden), `disabled` (with which actions disabled), `not-rendered`. No empty cells. |
+| Form fields | Every field with type, validation rules, error message text per rule, on-blur vs on-submit behavior, disabled conditions |
+| Empty / loading / error UI | Concrete copy for empty state, loading skeleton structure, error message + retry affordance per error class |
+| Responsive variants | Every breakpoint that changes interaction behavior (not just layout) listed with the specific behavior change |
+| Accessibility inventory | Per interactive element: WCAG criterion satisfied, keyboard binding, focus order, screen-reader announcement text, ARIA attributes |
+| Navigation behavior | Browser back, deep-link entry, bookmarkability, multi-tab, and unsaved-changes guard for every route |
+| Network degradation | Loading-threshold (when does the loading UI promote to a slow-network UI), retry behavior, offline behavior |
+**Hard gate**: If any cell is missing for any component, the spec is incomplete. Fill it in. Do **not** count missing items as "implicit," "obvious," or "covered by the design system" — every component must surface its full table even when it inherits from a shared pattern.
+**Length is informational**: Report the line count for the spec in the index entry as metadata only. Splitting is justified only by component-group boundaries (e.g., entirely independent surface areas) — not by size.
 ## 7. Update the FE index
@@ -105,9 +116,39 @@ For each component with responsive breakpoints defined:
 Add any new touch interaction specs or mobile-specific behavior.
 **Pass loop guard**: Track total pass count.
-- Passes 1-4 → mandatory.
-- Pass 5 → optional, run if prior pass produced significant additions.
-- **After pass 5** → **STOP**: "5 deepening passes completed. Present remaining gaps to user: continue deepening or accept current spec depth?"
+- Passes 1-7 → mandatory.
+- Passes 8+ → optional, run if prior pass produced significant additions.
+- **After pass 10** → **STOP**: "10 deepening passes completed. Present remaining gaps to user: continue deepening or accept current spec depth?"
+### Pass 5: State enumeration completeness
+For each component that consumes data:
+- List every distinct state the component can be in. At minimum: `idle`, `loading`, `error` (one per BE error class returned), `empty`, `success`, `optimistic-pending` (if mutating), `optimistic-rollback` (if mutating), `disabled` (if any role removes interaction), `degraded` (if network slowness has a separate UI).
+- For each state, define: visual treatment, available interactions, transition triggers in/out, and whether it persists across navigation.
+- For each error state, define the user-visible message text per error class, the retry affordance, and whether the user must take action to clear the state.
+- Empty states must include the empty copy + the affordance to leave the empty state (e.g., "Create your first entity").
+Flag any component where states were collapsed (e.g., all errors share one "something went wrong" message). Spec must enumerate per-class behavior unless the spec explicitly states uniform handling is the design choice.
+### Pass 6: Role-conditional rendering exhaustion
+Build a role × component matrix for this spec. Every cell must be one of: `full`, `read-only`, `partial-hidden`, `disabled`, `not-rendered`. No empty cells, no "tbd," no "see other doc."
+For every `partial-hidden` cell, list which fields/actions are hidden.
+For every `read-only` cell, list which controls become non-interactive.
+For every `disabled` cell, list which actions are disabled and the user-visible explanation (tooltip, badge, inline message).
+For every `not-rendered` cell, specify whether the layout slot collapses or shows a placeholder.
+### Pass 7: Accessibility edge-case enumeration
+For every interactive component:
+- **Keyboard**: Tab order, focus trap behavior in modals, escape behavior, arrow-key navigation in composite widgets (menus, tabs, listboxes), enter/space activation.
+- **Screen reader**: Announcement text for state changes (loading start, loading complete, error appeared, success), live-region usage (`polite` vs `assertive`), label-vs-description distinction.
+- **Focus management**: Where focus moves on modal open/close, on route change, on form submit, on async error.
+- **Color and contrast**: Required contrast ratios per element, behavior under high-contrast mode, behavior when color alone conveys state.
+- **Motion**: Respect for `prefers-reduced-motion`, alternative non-animated affordances for motion-conveyed information.
+If the spec inherits from a design-system accessibility doc, cite the section explicitly per component — do not leave it implicit.
 ## 9. Cross-reference check
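Both spec-writing skills in this release replace the old 5-pass limit with the same pass loop guard: passes 1-7 mandatory, passes 8+ optional, stop after pass 10. One way to read that rule as a decision function is sketched below. The function name and the returned labels are hypothetical; only the thresholds come from the diff.

```python
MANDATORY_PASSES = 7   # "Passes 1-7 -> mandatory"
HARD_STOP_AFTER = 10   # "After pass 10 -> STOP"


def next_pass_action(completed_passes, prior_pass_significant):
    """Decide what to do after `completed_passes` deepening passes have run."""
    if completed_passes >= HARD_STOP_AFTER:
        # Present remaining gaps to the user instead of continuing silently.
        return "stop-and-ask-user"
    if completed_passes < MANDATORY_PASSES:
        return "run-mandatory"
    # Passes 8+ run only if the prior pass produced significant additions.
    return "run-optional" if prior_pass_significant else "finish"


print(next_pass_action(6, False))
print(next_pass_action(7, True))
print(next_pass_action(10, True))
```

Note that the hard stop takes priority over the "significant additions" signal, which matches the diff's wording that the stop applies after pass 10 regardless of what the pass produced.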

package/template/.claude/instructions/workflow.md CHANGED

@@ -37,7 +37,7 @@ Do NOT mark a task complete until all validations pass.
 After completing a workflow or substantial task:
-- **Pattern Extraction**: Read `.claude/skills/session-continuity/protocols/04-pattern-extraction.md` and follow the **Pattern Extraction Protocol**. Reflect on what worked, what didn't, and log reusable patterns to `memory/patterns.md`. Skip only if the task was trivial (routine, nothing new learned).
+- **Pattern Extraction**: Read `.claude/skills/session-continuity/protocols/04-pattern-extraction.md` and follow the **Pattern Extraction Protocol**. Reflect on what worked, what didn't, and log reusable patterns to `.memory/wiki/patterns.md`. Skip only if the task was trivial (routine, nothing new learned).
 - **Session Close**: Read `.claude/skills/session-continuity/protocols/05-session-close.md` and follow the **Session Close Protocol**. Write a session log to `.memory/pipeline/progress/sessions/` so the next session can resume cleanly.
 > These steps are **not optional**. They are what differentiate a pipeline that gets

package/template/.claude/skills/implement-slice-tdd/SKILL.md CHANGED

@@ -51,6 +51,14 @@ parameters:
    - access-control enforcement
 4. Run query optimization and resource cleanup checks where applicable.
+### Step 4.5 — Slice depth ratio gate (mandatory)
+1. Read `.claude/skills/prd-templates/references/slice-depth-floor.md`.
+2. Read the slice's depth floor and breakdown from `phase-N.md` (computed in `/plan-phase-write` Step 5.5). If absent, compute it now from referenced spec sections.
+3. Count delivered tests — only tests asserting concrete spec items count (no smoke, no bare `toBeTruthy`).
+4. Compute ratio = delivered_tests / depth_floor. Hard gate: `< 1.0` returns to RED with a list of every uncovered floor item. Each floor item must have at least one explicit assertion exercising its specific behavior.
+5. Record ratio and matched-item count in `.memory/pipeline/progress/slices/<slice-id>.md` under `## Depth Ratio`.
 ### Step 5 — Validation and synthesis
 1. Run full validation command.
@@ -58,18 +66,23 @@ parameters:
 ### Step 6 — Mandatory progress updates and evidence gate
-1. Update all required progress targets.
-2. Re-read updated files and include raw evidence in completion payload.
-3. Apply slice completion gates and QA anti-cheat audit.
-4. Emit next command with infrastructure/auth gate branching rule.
+1. Update all required progress targets per Protocol 3 steps 1–7 (slice file, phase file, index.md, blockers/decisions/patterns).
+2. Apply Protocol 3 step 8 (read-back verification): re-read each written file and confirm the bytes landed; retry once on failure; STOP after second failed attempt.
+3. Run `node scripts/check-progress-consistency.mjs` (Protocol 3 step 9). Exit 1 → fix every reported drift item and re-run until exit 0. Exit 2 → STOP and surface the malformed-file report. **You may not call `notify_user` until exit is 0.**
+4. Stamp the slice file with a `## Completion Signature` block (Protocol 3 step 10) recording date, runtime, verifier exit, and depth ratio.
+5. Apply slice completion gates (`UI Completeness`, `Spec Traceability`, `Spec Depth Floor`) and QA anti-cheat audit.
+6. Completion payload must include: raw re-reads of slice/phase/index, verifier exit code (`0`), updated overall progress fraction, explicit next command (`/implement-slice` for next slice, or `/verify-infrastructure` if this was the infra/auth slice).
 ## Completion Checklist
 - [ ] RED tests authored and failing
 - [ ] GREEN implementation complete and passing
 - [ ] refactor and completeness checks complete
+- [ ] slice depth ratio ≥ 1.0 against spec floor
 - [ ] validation command passes
-- [ ] progress files updated + re-verified
+- [ ] progress files updated + re-verified (Protocol 3 step 8)
+- [ ] cross-file consistency verifier returned exit 0 (Protocol 3 step 9)
+- [ ] Completion Signature stamped on slice file (Protocol 3 step 10)
 - [ ] completion payload includes required evidence
 ## Next Steps
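Step 4.5 in this file combines two conditions: the ratio `delivered_tests / depth_floor` must be at least 1.0, and every floor item must have at least one explicit assertion. The sketch below illustrates why both are needed. The function name, the item labels, and the test-to-item mapping are assumptions for illustration; the referenced `slice-depth-floor.md` is not shown in this diff and may define things differently.

```python
def depth_ratio_gate(floor_items, tests):
    """Evaluate the slice depth ratio gate.

    `tests` maps a test name to the floor item it asserts, or to None for a
    test that asserts nothing concrete (smoke tests do not count as delivered).
    """
    if not floor_items:
        raise ValueError("depth floor is empty; nothing to measure against")
    asserted = [item for item in tests.values() if item is not None]
    ratio = len(asserted) / len(floor_items)
    covered = set(asserted)
    uncovered = [item for item in floor_items if item not in covered]
    passed = ratio >= 1.0 and not uncovered
    return {
        "ratio": ratio,
        "uncovered": uncovered,
        "action": "proceed" if passed else "return-to-RED",
    }


# Illustrative floor items and tests; all names are made up.
floor_items = ["field:title", "rule:title-required", "error:409-duplicate", "role:viewer-denied"]
tests = {
    "test_title_saved": "field:title",
    "test_title_required": "rule:title-required",
    "test_title_required_blank": "rule:title-required",
    "test_duplicate_conflict": "error:409-duplicate",
    "test_smoke": None,
}
result = depth_ratio_gate(floor_items, tests)
print(result)
```

Here four counted tests against four floor items give a ratio of 1.0, yet the gate still returns to RED because two tests hit the same item and `role:viewer-denied` has no assertion at all.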

package/template/.claude/skills/plan-phase-write/SKILL.md CHANGED

@@ -28,9 +28,9 @@ parameters:
 1. Derive candidate slices from FE interaction specification user flows.
 2. Map each flow to BE endpoints via source map.
 3. Group only when strict dependency criteria are met.
-4. Size slices (S/M/L).
-5. Enforce L-slice review/split decision before ordering.
-6. Enforce slice count sanity gates (warn at 16–25, hard stop >25).
+4. Size slices (S/M/L) as informational metadata only.
+5. Do not split, merge, or drop slices to hit a complexity target. L slices are surfaced with a one-line note; splitting is permitted only when the spec contains a natural seam.
+6. Slice count is informational only — no warn or hard-stop thresholds. Phase splitting is justified by dependency boundaries, never by count.
 ### Step 3 — Spec coverage verification gate
@@ -53,6 +53,13 @@ parameters:
 2. Every criterion must include source citation (`[BE §..]`, `[FE §..]`, `[IA §..]`).
 3. Append each completed slice progressively to `phase-N-draft.md`.
+### Step 5.5 — Slice depth floor (mandatory)
+1. Read `.claude/skills/prd-templates/references/slice-depth-floor.md` in full.
+2. For every slice, compute the minimum acceptance-criteria count using its formula and annotate the slice in `phase-N-draft.md` with the floor and breakdown.
+3. Hard gate: every slice's criteria count MUST equal or exceed the floor. If short, return to Step 5 and add criteria traceable to concrete spec items.
+4. Apply Spec Thinness Detection: if any slice's referenced spec section produces zero items in a required category, STOP and tell the user to run `/resolve-ambiguity` on the thin spec before continuing.
 ### Step 6 — Finalize and generate progress artifacts
 1. Finalize `phase-N.md` from draft source.
@@ -74,6 +81,7 @@ parameters:
 - [ ] coverage gate passed
 - [ ] dependency ordering complete with infra gates
 - [ ] acceptance criteria written with citations
+- [ ] slice depth floor computed and met for every slice
 - [ ] phase draft and final plan written
 - [ ] progress files generated
 - [ ] spec graph refreshed via `memory_compile`
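Step 5.5 above adds two checks per slice: the acceptance-criteria count must meet the floor, and a required category with zero spec items triggers Spec Thinness Detection. A rough sketch of how the two outcomes relate follows. The floor formula lives in `slice-depth-floor.md`, which this diff does not show, so the floor is taken as an input here rather than computed; the function name, category names, and verdict labels are hypothetical.

```python
def check_slice_depth(criteria_count, floor, category_counts):
    """Classify a slice against the Step 5.5 checks.

    `category_counts` maps each required category to the number of spec items
    the slice's referenced spec sections produced for it.
    """
    shortfall = max(0, floor - criteria_count)
    thin_categories = sorted(name for name, count in category_counts.items() if count == 0)
    if thin_categories:
        # A thin spec cannot be fixed by writing more criteria; the spec itself needs work.
        verdict = "stop-run-resolve-ambiguity"
    elif shortfall:
        verdict = "return-to-step-5"
    else:
        verdict = "pass"
    return {"verdict": verdict, "shortfall": shortfall, "thin_categories": thin_categories}


# Illustrative numbers only.
short = check_slice_depth(9, 12, {"fields": 5, "validation_rules": 4, "error_codes": 3})
thin = check_slice_depth(12, 12, {"fields": 5, "validation_rules": 7, "error_codes": 0})
print(short)
print(thin)
```

The sketch reports thinness ahead of a shortfall on the assumption that a floor derived from a thin spec is itself unreliable; the source lists the two checks as separate steps and does not state a priority.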