@ai-dev-methodologies/rlp-desk 0.11.1 → 0.12.0

Files changed (44)

package/docs/rlp-desk/artifact-schema.md ADDED

@@ -0,0 +1,99 @@
+# rlp-desk Artifact Schema (v5.7 §4.25)
+> Worker/Verifier write JSON artifacts that the Leader reads. The schema validator at the READ boundary enforces these contracts. **Violation → BLOCKED `contract_violation/malformed_artifact`** (recoverable).
+## Validated artifacts
+| File | Written by | Read by | `signal_type` |
+|------|-----------|---------|---------------|
+| `<slug>-iter-signal.json` | Worker | Leader (worker poll) | `signal` |
+| `<slug>-verify-verdict.json` (per-US) | Verifier | Leader (verifier poll) | `verdict` |
+| `<slug>-verify-verdict.json` (final ALL) | Verifier | Leader (final-verifier poll) | `verdict` |
+| `<slug>-flywheel-signal.json` | Flywheel | Leader (flywheel poll) | `flywheel_signal` |
+| `<slug>-flywheel-guard-verdict.json` | Guard | Leader (guard poll) | `flywheel_guard_verdict` |
+| `<slug>-done-claim.json` | Worker | Leader (analytics, A4 fallback) | `done_claim` |
+## Required structural fields (validated by `validateArtifact`)
+| Field | Type | Constraint | Notes |
+|-------|------|------------|-------|
+| `slug` | string | === campaign slug | OPTIONAL for backward compat. If present, must match. |
+| `iteration` | integer | ≥ `iteration_floor` (current state.iteration) | OPTIONAL for backward compat. Worker may advance, never regress. |
+| `signal_type` | string | === expected per read context | OPTIONAL for backward compat. Discriminates artifacts at read time. |
+| `us_id` | string | ∈ `usList ∪ {'ALL'}` | OPTIONAL for backward compat. Closed-set check. |
+The validator is structural-minimum + semantic-anchor. It does NOT validate downstream business fields (e.g. `verdict.verdict`, `signal.status`); those are checked by their respective consumers.
+## Examples
+### Valid worker signal
+```json
+{
+  "slug": "sum-fn",
+  "iteration": 1,
+  "signal_type": "signal",
+  "us_id": "US-001",
+  "status": "verify",
+  "summary": "implementation done; tests pass"
+}
+```
+### Valid verifier verdict
+```json
+{
+  "slug": "sum-fn",
+  "iteration": 1,
+  "signal_type": "verdict",
+  "us_id": "US-001",
+  "verdict": "pass",
+  "criteria_results": [...]
+}
+```
+### Violation: wrong slug
+```json
+{
+  "slug": "wrong-campaign",   // ← BLOCKED contract_violation
+  "iteration": 1,
+  ...
+}
+```
+→ `Malformed artifact at slug: expected sum-fn, got wrong-campaign`
+### Violation: us_id outside allowed set
+```json
+{
+  "us_id": "US-999"   // ← BLOCKED contract_violation (US-999 ∉ [US-001, ALL])
+}
+```
+→ `Malformed artifact at us_id: expected one of [US-001, ALL], got US-999`
+### Violation: iteration regress
+```json
+{
+  "iteration": 0   // ← floor is 1; regress not allowed
+}
+```
+→ `Malformed artifact at iteration: expected >= 1, got 0`
+## Backward compatibility
+Existing artifacts written before v5.7 §4.25 do not carry the `slug`/`iteration`/`signal_type`/`us_id` fields. The validator skips any field that is not present (`undefined` is allowed). Workers/Verifiers SHOULD start emitting these fields for stronger contract enforcement, but legacy artifacts continue to work.
+## Feedback loop closure
+When `MalformedArtifactError` fires:
+1. `_handlePollFailure` writes BLOCKED with `reason_category: contract_violation`, `failure_category: malformed_artifact`, `recoverable: true`.
+2. `reason_detail` includes the structured error: `Malformed artifact at <field>: expected <expected>, got <got>`.
+3. Operators reviewing `<slug>-blocked.json` see the precise contract violation and can update the Worker prompt template (`prompts/<slug>.worker.prompt.md`) to require the missing/correct field.
+4. On re-run after fix, the Worker writes a compliant artifact and the campaign proceeds.
+## Authoring guidance
+- Worker prompt templates SHOULD instruct the LLM to include `slug`, `iteration`, `signal_type`, and `us_id` in every JSON artifact.
+- The fix-contract (`buildFixContract` in `campaign-main-loop.mjs`) already feeds verifier failures back to the next Worker; future enhancement: feed `MalformedArtifactError` details directly into the next Worker prompt without requiring user re-run.
+## Audit
+- Schema unit tests: `tests/node/test-artifact-schema.mjs` (7 violation scenarios)
+- E2E: Schema violations are exercised in `tests/sv-gate-full.sh` (REAL campaign E2E asserts `complete.md` or `blocked.md` exists — schema violations route to the latter)
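
Reviewer sketch for `artifact-schema.md`: the table and violation examples above suggest a validator that skips absent fields and throws a structured error on the first mismatch. The block below is a minimal illustration of that idea in plain JavaScript, not the package's code. `validateArtifactSketch`, the `ctx` parameter names, and the constructor shape of `MalformedArtifactError` are assumptions; only the error name, the four field names, and the message format come from the doc.

```javascript
// Hypothetical sketch: the real validateArtifact signature is not shown in this diff.
class MalformedArtifactError extends Error {
  constructor(field, expected, got) {
    // Message format copied from the doc's examples.
    super(`Malformed artifact at ${field}: expected ${expected}, got ${got}`);
    this.name = 'MalformedArtifactError';
    this.field = field;
  }
}

// ctx field names (slug, iterationFloor, signalType, usList) are illustrative.
function validateArtifactSketch(artifact, ctx) {
  const { slug, iterationFloor, signalType, usList } = ctx;

  // Every field is optional for backward compatibility: undefined is skipped.
  if (artifact.slug !== undefined && artifact.slug !== slug) {
    throw new MalformedArtifactError('slug', slug, artifact.slug);
  }
  if (artifact.iteration !== undefined) {
    if (!Number.isInteger(artifact.iteration)) {
      throw new MalformedArtifactError('iteration', 'integer', artifact.iteration);
    }
    // The Worker may advance the iteration but never regress below the floor.
    if (artifact.iteration < iterationFloor) {
      throw new MalformedArtifactError('iteration', `>= ${iterationFloor}`, artifact.iteration);
    }
  }
  if (artifact.signal_type !== undefined && artifact.signal_type !== signalType) {
    throw new MalformedArtifactError('signal_type', signalType, artifact.signal_type);
  }
  if (artifact.us_id !== undefined) {
    // Closed-set check against usList plus the literal 'ALL'.
    const allowed = [...usList, 'ALL'];
    if (!allowed.includes(artifact.us_id)) {
      throw new MalformedArtifactError('us_id', `one of [${allowed.join(', ')}]`, artifact.us_id);
    }
  }
  // Business fields such as status or verdict are left to their consumers.
  return artifact;
}
```

Under these assumptions, calling the sketch with `{ slug: 'sum-fn', iterationFloor: 1, signalType: 'signal', usList: ['US-001'] }` accepts the valid worker signal and an empty legacy object, and rejects the three violation examples with the messages quoted in the doc.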

package/docs/rlp-desk/ci-setup.md ADDED

@@ -0,0 +1,100 @@
+# rlp-desk CI Setup (v5.7 §4.25)
+> SV gate is a mechanical contract: every PR touching `src/node/**`, `src/scripts/**`, `src/commands/rlp-desk.md`, or `src/governance.md` MUST pass `tests/sv-gate-full.sh` before merge.
+## Local development
+### Fast gate (~30s)
+Run before every commit:
+```sh
+zsh tests/sv-gate-fast.sh
+# or
+npm run sv-gate:fast
+```
+Checks:
+- 35+ code-pattern greps (each tracked v5.7 fix has the expected code)
+- All Node unit tests (~50)
+- 5 critical zsh unit tests
+### Full gate (~5 min)
+Run before merge / release:
+```sh
+zsh tests/sv-gate-full.sh
+# or
+npm run sv-gate:full
+```
+Adds:
+- REAL tmux E2E (mocked tmux capture, 9 scenarios)
+- REAL campaign E2E (haiku worker/verifier, max-iter 3, iter-timeout 300s)
+- Asserts `<slug>-complete.md` OR `<slug>-blocked.md` exists post-run (file-guarantee invariant)
+**Pre-conditions for full gate**:
+- Inside a tmux session (`echo $TMUX` not empty)
+- `claude` CLI in PATH
+- `node` >= 16 in PATH
+- `~/.claude/ralph-desk/` synced from latest `src/` (run `bash install.sh` or manual sync)
+## GitHub Actions
+The fast gate runs on every PR via `.github/workflows/sv-gate.yml`:
+```yaml
+name: SV Gate
+on: [push, pull_request]
+jobs:
+  sv-gate-fast:
+    runs-on: macos-latest  # zsh + tmux available
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-node@v4
+        with: { node-version: '22' }
+      - run: bash install.sh    # syncs to ~/.claude/ralph-desk
+        env: { REPO_URL: file://${{ github.workspace }} }
+      - run: zsh tests/sv-gate-fast.sh
+```
+The full gate (with REAL campaign E2E) is NOT run in CI — it requires:
+- Anthropic API key (haiku worker/verifier)
+- Live tmux session (CI runners are non-interactive)
+- ~3-5 min wallclock per run
+Operators MUST run `tests/sv-gate-full.sh` locally before merging to `main`.
+## Branch protection (manual)
+Required for the SV gate to be enforceable:
+1. Go to `https://github.com/<owner>/rlp-desk/settings/branches`
+2. Add rule for `main`:
+   - ✅ Require a pull request before merging
+   - ✅ Require status checks to pass before merging
+   - ✅ Search and select: `sv-gate-fast`
+   - ✅ Require branches to be up to date before merging
+3. Document the manual step here. Branch protection cannot be enforced via committed YAML alone — it is a repo-admin setting.
+## Forks / non-GitHub repos
+`tests/sv-gate-fast.sh` and `tests/sv-gate-full.sh` are pure zsh + Node — no GitHub-specific dependencies. Forks should:
+1. Run `npm run sv-gate:fast` in their CI (Travis, GitLab CI, etc.) using the same OS-level prereqs (macOS or Linux + zsh + tmux + node + claude CLI).
+2. Optionally run `npm run sv-gate:full` in a scheduled job (nightly) since it requires live API key.
+## Gate failure interpretation
+| Failure mode | Meaning | Action |
+|--------------|---------|--------|
+| Code-pattern grep failed | Tracked fix's expected code is missing | Restore the fix or update `tests/sv-gate-fast.sh` if the pattern legitimately changed |
+| Node unit test failed | Behavioral regression | Fix the code; do NOT relax the test |
+| zsh unit test failed | Behavioral regression in shell helpers | Fix the helper |
+| REAL tmux E2E failed | Real tmux capture/send-keys broke | Investigate tmux version or pane state |
+| REAL campaign E2E failed (no sentinel) | **FILE-GUARANTEE VIOLATED** — Worker/Verifier exited without artifact AND backstop did NOT catch | Critical bug; investigate `_ensureTerminalSentinel` and `_handlePollFailure` paths |
+## Memo: SV gate is the contract
+The SV gate exists because AI assistants (including the Leader itself) miss steps. Mechanical .sh verification is the only enforceable contract — code review, "I tested it locally", and unit-test-only verification are not sufficient. Plan v5.7 explicitly forbids commits that have not passed `tests/sv-gate-full.sh`.
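
Reviewer sketch for `ci-setup.md`: the full-gate pre-conditions listed above (tmux session, `claude` in PATH, `node` >= 16) could be checked up front before the ~5 min run starts. The helper below is hypothetical and not part of rlp-desk. `checkFullGatePreconditions` and the injected `hasCommand` callback are names introduced only for illustration, and the fourth pre-condition (a synced `~/.claude/ralph-desk/`) is deliberately not covered.

```javascript
// Hypothetical helper, not part of rlp-desk. Dependencies are injected so the
// check stays a pure function that is easy to unit-test.
function checkFullGatePreconditions({ env, nodeVersion, hasCommand }) {
  const problems = [];

  // "Inside a tmux session": tmux sets the TMUX variable for its child shells.
  if (!env.TMUX) {
    problems.push('not inside a tmux session (TMUX is empty)');
  }

  // "claude CLI in PATH": hasCommand is an assumed callback supplied by the caller.
  if (!hasCommand('claude')) {
    problems.push('claude CLI not found in PATH');
  }

  // "node >= 16": compare the major version only; accepts "v22.1.0" or "22.1.0".
  const major = Number.parseInt(String(nodeVersion).replace(/^v/, '').split('.')[0], 10);
  if (!(major >= 16)) {
    problems.push(`node >= 16 required, found ${nodeVersion}`);
  }

  return problems;
}

// Usage with fake inputs; a real caller might pass process.env and
// process.versions.node instead.
const problems = checkFullGatePreconditions({
  env: { TMUX: '' },
  nodeVersion: '14.21.3',
  hasCommand: (name) => name !== 'claude',
});
console.log(problems.join('\n'));
```

Returning a list of problems rather than failing on the first one lets an operator fix every missing prerequisite in a single pass.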

package/docs/rlp-desk/e2e-scenarios.md ADDED

@@ -0,0 +1,102 @@
+# rlp-desk E2E Test Scenarios (v5.7 §4.25)
+> Two-tier coverage: **Tier A** (deterministic injection, ~ms) runs in `sv-gate-fast`; **Tier B** (real-subprocess + real-tmux + real-claude, seconds–minutes) runs in `sv-gate-full`. Every fix path is covered by at least one tier.
+## Tier A — Deterministic injection (sv-gate-fast)
+Uses `pollForSignal` injection seam (no subprocess spawn) — deterministic, fast, CI-stable.
+| Scenario | Test file | Asserts |
+|----------|-----------|---------|
+| writeSentinelExclusive O_EXCL race | `tests/node/test-sentinel-exclusive.mjs` | First-writer-wins, parent dir create, EEXIST returns no-op, parallel race |
+| Backstop: missing scaffold | `tests/node/test-leader-exit-invariant.mjs` | `_ensureTerminalSentinel` writes `blocked.md` even on `ensureScaffold` throw |
+| Backstop: pollForSignal throws | `tests/node/test-leader-exit-invariant.mjs` | `_handlePollFailure` writes BLOCKED + run() returns blocked status |
+| Backstop: idempotent first-writer-wins | `tests/node/test-leader-exit-invariant.mjs` | Pre-existing BLOCKED is NOT overwritten by backstop |
+| Lying worker (signal missing) | `tests/node/test-lying-worker.mjs` | BLOCKED `infra_failure/worker_exited_without_artifacts` |
+| Lying verifier (per-US verdict missing) | `tests/node/test-lying-worker.mjs` + `tests/node/sv-e2e/test-lying-verifier.mjs` | BLOCKED `verifier_exited_without_artifacts` |
+| Lying final verifier (US-ALL) | `tests/node/sv-e2e/test-lying-verifier.mjs` | BLOCKED `final_verifier_exited_without_artifacts` |
+| Prompt-blocked (default-No worker) | `tests/node/sv-e2e/test-prompt-blocked.mjs` | BLOCKED `prompt_blocked` |
+| Prompt-blocked (default-No verifier) | `tests/node/sv-e2e/test-prompt-blocked.mjs` | BLOCKED `prompt_blocked` (verifier role) |
+| Schema: empty object | `tests/node/test-artifact-schema.mjs` | No crash |
+| Schema: wrong slug | `tests/node/test-artifact-schema.mjs` | BLOCKED `contract_violation/malformed_artifact` |
+| Schema: us_id outside set | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
+| Schema: iteration regress | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
+| Schema: iteration not integer | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
+| Schema: signal_type mismatch | `tests/node/test-artifact-schema.mjs` | BLOCKED `malformed_artifact` |
+| Schema: valid signal (back-compat) | `tests/node/test-artifact-schema.mjs` | No false positive |
+| Auto-dismiss prompt patterns (24+) | `tests/node/test-prompt-dismisser.mjs` | Each `(y/n)`/`[Y/n]`/`[y/N]` variant + scrollback + unknown-fast-fail + claude v2.x trust |
+| Shell quote (Bug 1) | `tests/node/test-shell-quote.mjs` | POSIX single-quote escape for `[1m]` etc. |
+| Opus 1M context | `tests/node/test-opus-1m-context.mjs` | `ANTHROPIC_BETA` prefix, isOpusModel detection |
+**Tier A total**: 50+ tests across 11 files. Runtime: ~0.7s. Always runs in CI.
+## Tier B — Real-subprocess (sv-gate-full)
+Uses real tmux session + real `tmux send-keys` / `capture-pane` / real claude haiku CLI. Slow (~5min) but exercises actual production paths.
+| Scenario | Test | Asserts |
+|----------|------|---------|
+| Real tmux: `[Y/n]` auto-dismiss | `tests/sv-gate-real-e2e.sh` | Real `tmux send-keys Enter` after `auto_dismiss_prompts` |
+| Real tmux: `[y/N]` BLOCK | `tests/sv-gate-real-e2e.sh` | `infra_failure` sentinel written, NO Enter sent |
+| Real tmux: 10s no-progress timeout | `tests/sv-gate-real-e2e.sh` | BLOCKED on freeze regardless of prompt |
+| Real tmux: unknown text + no bracket | `tests/sv-gate-real-e2e.sh` | No false BLOCK, no false Enter |
+| Real tmux: unknown phrasing + `[y/N]` | `tests/sv-gate-real-e2e.sh` | Fast-fail BLOCK (10min wait avoided) |
+| Real tmux: unknown phrasing + `(y/n)` | `tests/sv-gate-real-e2e.sh` | Fast-fail BLOCK |
+| Real tmux: codex `[Y/n]` | `tests/sv-gate-real-e2e.sh` | Auto-dismiss (codex CLI variant) |
+| Real tmux: codex `[y/N]` | `tests/sv-gate-real-e2e.sh` | BLOCK |
+| Real tmux: scrollback contamination | `tests/sv-gate-real-e2e.sh` | Old `[Y/n]` + active `[y/N]` → BLOCK (scan-all) |
+| Real haiku campaign (happy path) | `tests/sv-gate-full.sh` (inline) | `complete.md` written; trust prompt auto-dismissed; tests pass; commit recorded |
+**Tier B total**: 10+ scenarios. Runtime: ~5 min (1 min for tmux scenarios + ~4 min for haiku campaign). Run before merge / release.
+## Coverage matrix (per fix)
+| Fix | Tier A | Tier B | Bug ID |
+|-----|--------|--------|--------|
+| zsh `[1m]` glob | shell-quote | (haiku campaign launches Opus models when promoted) | Bug 1 |
+| tmux silent SV/flywheel | us012 | (haiku campaign exercises tmux mode) | Bug 2/3 |
+| auto_dismiss prompts | prompt-dismisser | real-e2e #1-9 | Bug 4 |
+| A4 fallback prompt guard | a4_fallback | (haiku campaign) | Bug 5 |
+| scrollback contamination | prompt-dismisser | real-e2e #9 | §4.17.b |
+| unknown-prompt fast-fail | prompt-dismisser | real-e2e #5-6 | §4.18 |
+| Node iterTimeout fwd | (verified by haiku campaign actually completing in ≤300s) | full | §4.19 |
+| claude v2.x trust prompt | prompt-dismisser | full (haiku triggers it) | §4.20 |
+| capture window -50 + whitespace norm | prompt-dismisser | full (haiku narrow-pane wrap) | §4.21 |
+| WorkerExitedError | lying-worker | (full campaign covers happy path; injection covers exit) | §4.22 |
+| tail-15 normalized matching | prompt-dismisser | real-e2e | §4.23 |
+| writeSentinelExclusive O_EXCL | sentinel-exclusive | (full campaign uses it for complete.md) | §4.24 |
+| run() try/finally backstop | leader-exit-invariant | (full campaign verifies success path) | §4.24 §1g |
+| _handlePollFailure | lying-worker, lying-verifier, prompt-blocked | (full campaign success path) | §4.25 |
+| validateArtifact schema | artifact-schema | full (haiku artifacts schema-compliant) | §4.25 P1 |
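+Every fix except Node `iterTimeout` forwarding (§4.19), which is verified only by the haiku campaign in Tier B, has at least one Tier A test. Tier B exercises the production-realistic paths (real tmux, real subprocess, real claude haiku).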
+## Running the gates
+```sh
+# Fast gate (Tier A tests run in ~0.7s; run on every commit)
+zsh tests/sv-gate-fast.sh
+# or
+npm run sv-gate:fast
+# Full gate (~5 min, before merge/release)
+zsh tests/sv-gate-full.sh
+# or
+npm run sv-gate:full
+```
+`sv-gate-full` requires:
+- Inside a tmux session (`echo $TMUX` non-empty)
+- `claude` CLI in PATH with valid auth
+- `node >= 16` in PATH
+- `~/.claude/ralph-desk/` synced from latest `src/` (run `bash install.sh`)
+## Adding a new scenario
+1. **Determine tier**:
+   - Deterministic, no subprocess → Tier A
+   - Requires real tmux/claude/network → Tier B
+2. **Tier A**: add `tests/node/sv-e2e/test-<name>.mjs` (or extend existing file). Use `pollForSignal` injection seam. Update `NODE_TESTS` array in `tests/sv-gate-fast.sh`.
+3. **Tier B**: add scenario to `tests/sv-gate-real-e2e.sh` with `reset_pane_state` between scenarios. The script auto-runs in `sv-gate-full.sh`.
+4. **Document**: add row to the Coverage matrix in this file.
+5. **Verify**: run `npm run sv-gate:fast` (Tier A) or `npm run sv-gate:full` (both tiers); both must exit 0.
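
Reviewer sketch for `e2e-scenarios.md`: the Tier A rows rely on a `pollForSignal` injection seam, meaning the test passes in a fake poller instead of spawning a subprocess. The block below illustrates that pattern for the "lying worker" row. It is a simplified stand-in, not the Leader's code: `runWorkerPollSketch` and `writeBlocked` are hypothetical names, the shape of `WorkerExitedError` is assumed, and the `'unknown'` category is a placeholder rather than an rlp-desk value.

```javascript
// Error name taken from the docs; its real shape is an assumption.
class WorkerExitedError extends Error {}

// Hypothetical, heavily simplified stand-in for one Leader poll step.
// Both collaborators are injected, which is what makes the test deterministic.
async function runWorkerPollSketch({ pollForSignal, writeBlocked }) {
  try {
    const signal = await pollForSignal();
    return { status: 'continue', signal };
  } catch (err) {
    const blocked = {
      reason_category: 'infra_failure',
      // 'unknown' is a placeholder for this sketch only.
      failure_category: err instanceof WorkerExitedError
        ? 'worker_exited_without_artifacts'
        : 'unknown',
      reason_detail: err.message,
    };
    writeBlocked(blocked); // the real Leader would write a BLOCKED sentinel here
    return { status: 'blocked', blocked };
  }
}

// Tier A style scenario: the fake poller "lies" by exiting without an artifact.
async function demo() {
  const written = [];
  const result = await runWorkerPollSketch({
    pollForSignal: async () => {
      throw new WorkerExitedError('worker exited without writing iter-signal');
    },
    writeBlocked: (record) => written.push(record),
  });
  return { result, written };
}

demo().then(({ result }) => console.log(result.status));
```

Because nothing touches tmux, the filesystem, or the network, a scenario written this way runs in milliseconds and behaves the same on every CI runner, which is the property the Tier A description asks for.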