voidforge-build 23.20.0 → 23.22.0
- package/dist/.claude/commands/debrief.md +1 -1
- package/dist/.claude/commands/git.md +7 -3
- package/dist/.claude/commands/seal.md +5 -4
- package/dist/CHANGELOG.md +33 -0
- package/dist/CLAUDE.md +3 -2
- package/dist/VERSION.md +3 -1
- package/dist/docs/methods/BUILD_PROTOCOL.md +8 -0
- package/dist/docs/methods/DEVOPS_ENGINEER.md +10 -0
- package/dist/docs/methods/FIELD_MEDIC.md +1 -1
- package/dist/docs/methods/QA_ENGINEER.md +4 -0
- package/dist/docs/methods/RELEASE_MANAGER.md +11 -0
- package/dist/docs/methods/SUB_AGENTS.md +2 -0
- package/dist/docs/patterns/egress-sandbox.sh +43 -0
- package/dist/docs/patterns/nginx-vhost.conf +156 -0
- package/dist/docs/patterns/post-deploy-probe.sh +115 -0
- package/dist/docs/patterns/rls-test-fixture.py +140 -0
- package/dist/docs/patterns/structural-sql-sentinel.py +134 -0
- package/dist/wizard/lib/project-init.d.ts +17 -0
- package/dist/wizard/lib/project-init.js +35 -15
- package/dist/wizard/lib/updater.js +37 -6
- package/package.json +2 -2
package/dist/.claude/commands/debrief.md
CHANGED
|
@@ -13,7 +13,7 @@ Bashir examines the patient. Time to diagnose.
|
|
|
13
13
|
[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true
|
|
14
14
|
```
|
|
15
15
|
|
|
16
|
-
The existence guard is a no-op on projects that predate the gate. **
|
|
16
|
+
The existence guard is a no-op on projects that predate the gate. **Stale-pointer self-repair (#384 RC-3):** `bypass.sh` now repoints a stale pointer (one left by a `/clear`ed or crashed session) to the live session automatically — read from `CLAUDE_CODE_SESSION_ID` — so the bypass lands correctly on the first try. On older Claude Code builds without that env var the legacy behavior applies: if the first sub-agent launch still blocks, re-run the exact same `bypass.sh --light` line once (the first blocked `check.sh` fire repoints the pointer, so the second write lands correctly). See CLAUDE.md "Silver Surfer Gate" → "Stale session pointer — auto-repaired."
|
|
17
17
|
|
|
18
18
|
## Step 0 — Reconstruct the Timeline
|
|
19
19
|
|
package/dist/.claude/commands/git.md
CHANGED
|
@@ -8,8 +8,12 @@
|
|
|
8
8
|
Scope the changes:
|
|
9
9
|
1. Run `git status` — identify staged, unstaged, and untracked files
|
|
10
10
|
2. Run `git diff --stat` — get a summary of what changed
|
|
11
|
-
3.
|
|
12
|
-
|
|
11
|
+
3. **Unrelated / pre-existing-change detection (field report #384 RC-1 — never `git add -A` blind).** Before staging, separate what this session authored from changes that were already in the working tree or fall outside the session's stated scope. Two mechanical checks:
|
|
12
|
+
- **Dependency manifests get special scrutiny.** If any manifest or lockfile appears in the diff — `package.json`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`, `requirements.txt`, `pyproject.toml`, `Cargo.toml` / `Cargo.lock`, `go.mod` / `go.sum`, `Gemfile` / `Gemfile.lock` — read the actual dependency-level diff (`git diff -- <manifest>`), not just the filename. A dependency added / changed / removed that the session did not deliberately introduce is the exact bug this check exists for: the v23.20.0 `vercel` near-miss was a stray `npm install` that added `vercel` to root `dependencies` plus ~5,900 lockfile lines, and a naive `git add -A` would have shipped it into a methodology release. Flag every dependency change for an explicit include/exclude decision and honor "no new dependencies without justification" (CLAUDE.md Coding Standards).
|
|
13
|
+
- **Scope diff.** Cross-check the full changed-file list against what this session actually touched. Anything you did not author this session — a leftover edit, a scratch/probe file, an untracked artifact — is surfaced for an explicit keep/drop decision.
|
|
14
|
+
Present the split — *session-authored (stage these)* vs *pre-existing or out-of-scope (decide)* — and get the include/exclude decision BEFORE Step 4 staging. **Never `git add -A` / `git add .` a release without this split.**
|
|
15
|
+
4. If there are unstaged changes, ask the user: "Stage everything, or should I be selective?" — informed by step 3's split.
|
|
16
|
+
5. If there are no changes at all, stop: "Nothing to version. Working tree is clean."
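The split described in step 3 can be sketched in Python. This is only an illustration, not the command's actual implementation: `split_changes` and its arguments are hypothetical names, the manifest list is copied from step 3, and the caller is assumed to already know which paths the session authored.

```python
from pathlib import PurePosixPath

# Manifest and lockfile names copied from the list in step 3.
MANIFEST_NAMES = frozenset({
    "package.json", "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "requirements.txt", "pyproject.toml", "Cargo.toml", "Cargo.lock",
    "go.mod", "go.sum", "Gemfile", "Gemfile.lock",
})


def split_changes(changed, session_authored):
    """Split changed paths into buckets (hypothetical helper).

    stage     -> paths the session authored
    decide    -> pre-existing or out-of-scope paths needing a keep/drop call
    manifests -> dependency manifests, flagged regardless of authorship
    """
    authored = set(session_authored)
    result = {"stage": [], "decide": [], "manifests": []}
    for path in changed:
        # Match on the file name so workspace packages are caught too.
        if PurePosixPath(path).name in MANIFEST_NAMES:
            result["manifests"].append(path)
        result["stage" if path in authored else "decide"].append(path)
    return result
```

A manifest lands in `manifests` even when the session authored it, because the text asks for a dependency-level read of every manifest in the diff, not only the unexpected ones.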
|
|
13
17
|
|
|
14
18
|
## Step 1 — Analyze (Vision)
|
|
15
19
|
Read the actual diffs and classify every change:
|
|
@@ -71,7 +75,7 @@ Every hit that is not the intentional "Removed" changelog line is either updated
|
|
|
71
75
|
## Step 4 — Commit (Rogers)
|
|
72
76
|
Stage and commit:
|
|
73
77
|
1. Stage all modified version files: `VERSION.md`, the active changelog (`CHANGELOG.md` or `PROJECT_VERSION.md`), **every** bumped `package.json` (all workspace packages, not just the root), and any generated copy re-synced in Step 3
|
|
74
|
-
2. Stage any other files that are part of this release
|
|
78
|
+
2. Stage any other files that are part of this release — explicitly, from Step 0's *session-authored* split, including any prose fixed by the Step 3.5 removal sweep. Stage by path; do **not** `git add -A` / `git add .` (that re-admits the pre-existing/out-of-scope changes Step 0 just excluded — field report #384 RC-1)
|
|
75
79
|
3. Craft commit message in the format: `vX.Y.Z: One-line summary`
|
|
76
80
|
- If elaboration needed, add a blank line then details
|
|
77
81
|
- Match the style of existing commits (check `git log --oneline -10`)
|
package/dist/.claude/commands/seal.md
CHANGED
|
@@ -18,11 +18,12 @@ The ordering is deliberate: commit and push first so the field report and vault
|
|
|
18
18
|
## Step 0 — Preflight (decide which stages apply)
|
|
19
19
|
1. `git status` + `git diff --stat` — is there anything to commit?
|
|
20
20
|
- **Working tree clean:** there is no release to ship. Skip Stages 1–2 (commit + push), tell the user "Nothing to ship — sealing without a release," and proceed to Stage 3 (debrief) + Stage 4 (vault). A clean tree is not an error.
|
|
21
|
-
2.
|
|
22
|
-
3.
|
|
23
|
-
4.
|
|
21
|
+
2. **Unrelated / pre-existing-change detection (field report #384 RC-1).** Run the same split `/git` Step 0 does — separate session-authored changes from changes that were already in the tree or fall outside this session's scope, giving **dependency manifests / lockfiles** (`package.json`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`, `requirements.txt`, `Cargo.lock`, `go.sum`, …) special scrutiny via their dependency-level diff. This is exactly the vigilance that caught the v23.20.0 `vercel` near-miss by hand; doing it at Preflight surfaces it as part of the plan disclosure (step 3) instead of relying on the operator noticing mid-commit. Surface any pre-existing/out-of-scope change for an explicit include/exclude decision; **never let the downstream `/git` stage `git add -A` a release without this split.**
|
|
22
|
+
3. `git rev-parse --abbrev-ref HEAD` — confirm the branch. If on the default branch and a release is being cut, that is fine for this repo's flow; just surface it.
|
|
23
|
+
4. Echo the plan the user is about to authorize: the stages that will run, whether a push will happen, whether a GitHub field report will be filed, **and any pre-existing/out-of-scope changes detected in step 2 (with the include/exclude call)**. This single up-front disclosure is the contract — do not re-ask before each stage (the operator already authorized the whole ritual by invoking `/seal`).
|
|
24
|
+
5. **Arm the gate bypass for Stage 3.** `/debrief --submit` deploys sub-agents (Ezri / O'Brien / Nog / Jake), and the Silver Surfer PreToolUse hook gates *every* Agent launch — `/debrief` is an analysis command, not a Surfer review roster, so it takes the documented bypass (field report #366-F4). Run, existence-guarded:
|
|
24
25
|
`[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true`
|
|
25
|
-
**Stale-pointer
|
|
26
|
+
**Stale-pointer self-repair (#384 RC-3):** `bypass.sh` now reads the live session id from `CLAUDE_CODE_SESSION_ID` and, when the repo pointer is stale (left by a `/clear`ed or crashed session), repoints it to the live session automatically — the bypass lands correctly on the first try. On older Claude Code builds without that env var the legacy behavior applies: if Stage 3's first Agent call is still blocked despite the bypass, re-run the same `bypass.sh --light` line once (the blocked `check.sh` fire repoints the pointer), then retry. Do not fight the gate beyond one re-run; if it still blocks, fall back to `/debrief --solo` for this stage.
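The repointing decision in step 5 can be illustrated with a small Python sketch. The real logic lives in `bypass.sh`, whose source is not shown here, so treat this as a model of the described behavior only: `live_session_id` and `resolve_session` are hypothetical names, and only the `CLAUDE_CODE_SESSION_ID` variable comes from the text.

```python
def live_session_id(env):
    """Read the live session id; empty string models an older build without the variable."""
    return env.get("CLAUDE_CODE_SESSION_ID", "")


def resolve_session(pointer_sid, live_sid):
    """Return (session id to write the bypass flag under, whether the pointer was repointed)."""
    if live_sid and live_sid != pointer_sid:
        # Stale pointer: the live session wins and the pointer is repaired.
        return live_sid, True
    # Either the pointer is already correct or no live id is available
    # (legacy behavior: write where the pointer says and rely on a re-run).
    return pointer_sid, False
```

With an empty live id the function leaves the pointer alone, which corresponds to the legacy fallback where one re-run of the same `bypass.sh --light` line may be needed.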
|
|
26
27
|
|
|
27
28
|
## Step 1 — Ship (Coulson · `/git`)
|
|
28
29
|
Run the full `/git` release flow (version bump → changelog → commit → tag → verify). Pass through `--major` / `--minor` / `--patch` / `--no-tag` if supplied.
|
package/dist/CHANGELOG.md
CHANGED
|
@@ -6,6 +6,39 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this
|
|
|
6
6
|
|
|
7
7
|
---
|
|
8
8
|
|
|
9
|
+
## [23.22.0] - 2026-06-24
|
|
10
|
+
|
|
11
|
+
### `update` now auto-activates `/contextmeter` (matches `init`)
|
|
12
|
+
|
|
13
|
+
- **Fixed** — `npx voidforge-build update` now wires the `/contextmeter` status line + `UserPromptSubmit` awareness hook into `.claude/settings.json`, the same default-on way `init` does. Previously `update` copied `scripts/statusline/` but left the meter inactive until you ran `/contextmeter` by hand. `mergeStatuslineSettings` is now shared by `init` and `update`; it stays idempotent and **non-clobbering** — never overwrites a project's own `statusLine`, never duplicates the awareness hook. `--dry-run` / diff now reports the pending `.claude/settings.json` change honestly (reading the snippet from the source so it's accurate even before the scripts are copied). +3 updater tests (suite 1420→1423).
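The idempotent, non-clobbering merge described above can be modeled in Python. The shipped `mergeStatuslineSettings` is JavaScript and its source is not reproduced here, so this is a sketch under assumptions: the function name is a Python rendering, and the `SNIPPET` commands are hypothetical placeholders rather than the package's real script paths.

```python
import copy

# Hypothetical snippet; the real commands ship with the package.
SNIPPET = {
    "statusLine": {"type": "command", "command": "bash scripts/statusline/example-meter.sh"},
    "hooks": {
        "UserPromptSubmit": [
            {"hooks": [{"type": "command", "command": "bash scripts/statusline/example-hook.sh"}]}
        ]
    },
}


def merge_statusline_settings(settings, snippet=SNIPPET):
    """Return a merged copy of settings without mutating the input."""
    merged = copy.deepcopy(settings)
    # Non-clobbering: keep a project's own statusLine if one exists.
    if "statusLine" not in merged and "statusLine" in snippet:
        merged["statusLine"] = copy.deepcopy(snippet["statusLine"])
    existing = merged.setdefault("hooks", {}).setdefault("UserPromptSubmit", [])
    for entry in snippet.get("hooks", {}).get("UserPromptSubmit", []):
        # Idempotent: an identical hook entry is never appended twice.
        if entry not in existing:
            existing.append(copy.deepcopy(entry))
    return merged
```

Running the merge twice yields the same result as running it once, which is the property that lets both `init` and `update` call it safely.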
|
|
14
|
+
|
|
15
|
+
Both packages bumped in lockstep (version-consistency gate). Dep `^23.21.0` → `^23.22.0`.
|
|
16
|
+
|
|
17
|
+
## [23.21.0] - 2026-06-24
|
|
18
|
+
|
|
19
|
+
### Triaged field reports #382 / #383 / #384 → `/seal`-hardening + DevOps/QA/orchestration fixes + a pattern-distribution gap
|
|
20
|
+
|
|
21
|
+
**From #384 (the v23.20.0 `/seal` session's own debrief):**
|
|
22
|
+
|
|
23
|
+
- **Added** — Release Step 0 *unrelated / pre-existing-change detection* (`/git`, `/seal`, `RELEASE_MANAGER.md`): before staging, the working tree is split into session-authored vs pre-existing/out-of-scope changes, with dependency manifests / lockfiles getting dependency-level scrutiny — the exact vigilance that caught the v23.20.0 `vercel` near-miss, now mechanical. Never `git add -A` a release.
|
|
24
|
+
- **Added** — *Creation-time native-collision gate* (`BUILD_PROTOCOL.md`, `NATIVE_CAPABILITIES.md`): a new command's name is checked against the native command/skill set *before* the file is written, and its `NATIVE_CAPABILITIES` row is added at creation — the check shifts left from release-time re-audit to command-creation (the `/statusline`→`/contextmeter` rework cause).
|
|
25
|
+
- **Fixed** — `scripts/surfer-gate/bypass.sh` *stale-pointer self-repair*: reads the live session id from `CLAUDE_CODE_SESSION_ID` and repoints a pointer left by a `/clear`ed/crashed session, so a single `bypass.sh --light` lands correctly on the first try — no operator re-run. Older CLIs without the env var keep the documented re-run fallback. (+4 regression tests; gate suite 27→31.)
|
|
26
|
+
|
|
27
|
+
**From #382 (QA-isolation prod outage + sandbox / spend / coverage):**
|
|
28
|
+
|
|
29
|
+
- **Added** — `DEVOPS_ENGINEER.md`: locking a shared parent dir must enumerate every traversing service account and grant each an explicit traverse ACL, then `curl` the prod FE and assert 200 (a `/home/ubuntu` `0750` lock 500'd nginx). Plus a headless-OAuth-bootstrap note (SSH `-L` port-forward or paste-the-code fallback).
|
|
30
|
+
- **Added** — `docs/patterns/egress-sandbox.sh`: run a `systemd-run` egress-confined workload under `--uid`/`--gid` so artifacts stay user-owned, not root-owned — `IPAddress*` filtering is a cgroup property and uid-independent.
|
|
31
|
+
- **Added** — `SUB_AGENTS.md`: a global spend ceiling must reserve max-possible in-flight child budget before launching the next child (an $80 cap spent $83.72), or document the overshoot bound.
|
|
32
|
+
- **Added** — `QA_ENGINEER.md`: coverage honesty — count a case covered only at the fidelity actually exercised; record proof in a per-lane ledger; never reclassify a coverage SSOT on partial evidence.
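The spend-ceiling rule from the `SUB_AGENTS.md` bullet above reduces to a single inequality. The sketch below is an assumption about how such a check could look, not the documented implementation; `can_launch` and its parameters are hypothetical names.

```python
def can_launch(cap, spent, in_flight_max, next_child_max):
    """True only if the cap holds even when every running child and the
    next child each spend their maximum possible budget."""
    reserved = sum(in_flight_max)
    return spent + reserved + next_child_max <= cap


def naive_can_launch(cap, spent, next_child_max):
    """The check that overshoots: it ignores budget still in flight."""
    return spent + next_child_max <= cap
```

With a cap of 80, 60 already spent, and two children in flight that may each spend 10, the naive check admits another 10-unit child while the reserving check refuses it. That gap is how a cap can be exceeded by spend that was already committed but not yet recorded.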
|
|
33
|
+
|
|
34
|
+
**Distribution:**
|
|
35
|
+
|
|
36
|
+
- **Fixed** — `prepack.sh` / `copy-assets.sh` now ship **every** `docs/patterns/` file regardless of extension. Globbing by `.ts`/`.tsx`/`.md` had silently dropped the `.sh`/`.py`/`.conf` patterns (`post-deploy-probe.sh`, `nginx-vhost.conf`, `rls-test-fixture.py`, `structural-sql-sentinel.py`) from the published package — an LRN-11 gap, now a future-proof whole-dir copy.
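The difference between the old extension glob and the whole-directory copy can be shown in Python. The real scripts are shell and are not reproduced here, so this is an illustrative model: both function names are hypothetical, and `notes.md` is a made-up file standing in for the patterns that did match the old glob.

```python
import shutil
import tempfile
from pathlib import Path


def copy_by_extension(src, dest, extensions=(".ts", ".tsx", ".md")):
    """Old behavior: anything outside the extension list is silently skipped."""
    dest.mkdir(parents=True, exist_ok=True)
    for path in src.iterdir():
        if path.is_file() and path.suffix in extensions:
            shutil.copy2(path, dest / path.name)
    return sorted(p.name for p in dest.iterdir())


def copy_whole_dir(src, dest):
    """New behavior: copy the directory as a unit, whatever it contains."""
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return sorted(p.name for p in dest.iterdir())


def demo():
    """Build a throwaway patterns dir and copy it both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "patterns"
        src.mkdir()
        for name in ("notes.md", "post-deploy-probe.sh",
                     "nginx-vhost.conf", "rls-test-fixture.py"):
            (src / name).write_text("placeholder\n")
        filtered = copy_by_extension(src, Path(tmp) / "filtered")
        whole = copy_whole_dir(src, Path(tmp) / "whole")
        return filtered, whole
```

The whole-directory copy needs no update when a pattern with a new extension is added, which is what makes it future-proof in the sense the changelog entry describes.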
|
|
37
|
+
|
|
38
|
+
**#383** (`/contextmeter`) shipped in v23.20.0 — closed as implemented; its creation-time-collision proposal is covered by #384 RC-2.
|
|
39
|
+
|
|
40
|
+
Build clean, suite 1420 (gate 27→31). Dep `^23.20.0` → `^23.21.0`.
|
|
41
|
+
|
|
9
42
|
## [23.20.0] - 2026-06-23
|
|
10
43
|
|
|
11
44
|
### Triaged 12 upstream field reports → methodology hardening, + `/seal` and `/contextmeter`
|
package/dist/CLAUDE.md
CHANGED
|
@@ -35,7 +35,7 @@ ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is th
|
|
|
35
35
|
|
|
36
36
|
**Non-review commands with a fixed roster take the bypass, NOT a Surfer muster (#366 F4).** A command like `/debrief` is NOT in the gated-commands list above — but the hook blocks *every* non-Surfer Agent launch regardless of the list, so its command-prescribed sub-agents (Ezri/O'Brien/Nog/Jake) get blocked too. The fix: any fixed-roster, non-review pipeline runs `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true` BEFORE launching its sub-agents. Its roster is command-prescribed, not cherry-picked, so the gate's anti-cherry-pick purpose doesn't apply — the bypass is correct, not a workaround. (The gated list governs *which commands must muster the Surfer*; it does not exempt unlisted commands from the hook.)
|
|
37
37
|
|
|
38
|
-
**
|
|
38
|
+
**Stale session pointer — auto-repaired (#366 F4, fixed in #384 RC-3).** The repo's session pointer can point at a *dead* session (a prior `/clear`ed or crashed session whose dir still exists). Historically `bypass.sh` then wrote the flag to that dead session's dir — the WRONG one — and the live session's launch still blocked until you re-ran the bypass. **`bypass.sh` now self-repairs:** it reads the live session id from `CLAUDE_CODE_SESSION_ID` (the same id Claude Code passes the `PreToolUse` hook as `session_id` — it equals the live transcript's basename), and when that disagrees with the pointer it repoints the pointer to the live session and writes the flag there. A single `bash scripts/surfer-gate/bypass.sh --light` now lands correctly on the first try; no re-run needed. **Legacy fallback:** on older Claude Code builds that don't export `CLAUDE_CODE_SESSION_ID`, `LIVE_SID` is empty and the prior behavior remains — if the first launch still blocks, re-run the same `bypass.sh --light` line once (the first blocked `check.sh` fire repoints the pointer, so the second write lands correctly).
|
|
39
39
|
|
|
40
40
|
**Orchestrator contract** (you run these Bash commands at the right moments — wrap each in an existence guard so projects on older methodology versions don't error):
|
|
41
41
|
|
|
@@ -136,6 +136,7 @@ Reference implementations in `/docs/patterns/`. Match these shapes when writing.
|
|
|
136
136
|
- `codemod-hygiene.md` — after a jscodeshift/recast codemod, strip incidental reformatting so the diff shows only the semantic change (field report #357)
|
|
137
137
|
- `post-deploy-probe.sh` — deploy probe that asserts response content + Content-Type, not HTTP status only, so an SPA catch-all serving index.html for every path can't false-pass into a rollback (field report #371)
|
|
138
138
|
- `exclusion-set-invariant.md` — superset invariant for multi-mechanism exclusion sets: one canonical secret/PII set with `.gitignore` / rsync / scanner derived from it (or a CI assertion) so the three never drift (field report #377)
|
|
139
|
+
- `egress-sandbox.sh` — egress-confined workload (`systemd-run` `IPAddress*` cgroup filter) that drops to the invoking uid/gid so artifacts stay user-owned, not root-owned, while network confinement is preserved (field report #382)
|
|
139
140
|
|
|
140
141
|
## Slash Commands
|
|
141
142
|
|
|
@@ -262,7 +263,7 @@ See `/docs/methods/MUSTER.md` for the full Muster Protocol.
|
|
|
262
263
|
| **Learnings** | `/docs/LEARNINGS.md` | Project-scoped operational knowledge — read at session start if exists |
|
|
263
264
|
| **The Muster** | `/docs/methods/MUSTER.md` | When using `--muster` flag on any command |
|
|
264
265
|
| **Time Vault** | `/docs/methods/TIME_VAULT.md` | Seldon — when preserving session intelligence for transfer |
|
|
265
|
-
| **Patterns** | `/docs/patterns/` | When writing code (
|
|
266
|
+
| **Patterns** | `/docs/patterns/` | When writing code (56 reference implementations) |
|
|
266
267
|
| **Lessons** | `/docs/LESSONS.md` | Cross-project learnings |
|
|
267
268
|
| **Workflows** | `/docs/methods/WORKFLOWS.md` | Dynamic Workflow authoring standard (ADR-067) — when to use, API, gotchas, the ADR-064 gate-launch sequence |
|
|
268
269
|
| **Native Capabilities** | `/docs/NATIVE_CAPABILITIES.md` | Command × native-skill collision tracker (ADR-066) — re-audit each release |
|
package/dist/VERSION.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Version
|
|
2
2
|
|
|
3
|
-
**Current:** 23.
|
|
3
|
+
**Current:** 23.22.0
|
|
4
4
|
|
|
5
5
|
## Versioning Scheme
|
|
6
6
|
|
|
@@ -14,6 +14,8 @@ This project uses [Semantic Versioning](https://semver.org/):
|
|
|
14
14
|
|
|
15
15
|
| Version | Date | Summary |
|
|
16
16
|
|---------|------|---------|
|
|
17
|
+
| 23.22.0 | 2026-06-24 | **`update` now auto-activates `/contextmeter` (matches `init`).** `npx voidforge-build update` wires the context-meter status line + `UserPromptSubmit` awareness hook into `.claude/settings.json` the same default-on way `init` does — previously `update` copied `scripts/statusline/` but left the meter inactive until a manual `/contextmeter` run. `mergeStatuslineSettings` is now shared by init + update, idempotent + non-clobbering (never overwrites a project's own `statusLine`, never duplicates the hook); `--dry-run` reports the pending `.claude/settings.json` change. +3 updater tests (1420→1423). Dep `^23.21.0` → `^23.22.0`. |
|
|
18
|
+
| 23.21.0 | 2026-06-24 | **Triaged field reports #382 / #383 / #384 → `/seal`-hardening + DevOps/QA/orchestration fixes + a pattern-distribution gap.** **#384:** release Step 0 unrelated/pre-existing-change detection (`/git`, `/seal`, `RELEASE_MANAGER.md`) — split session-authored vs out-of-scope changes, dependency-manifest scrutiny, never `git add -A` (the `vercel` near-miss, mechanized); creation-time native-collision gate (`BUILD_PROTOCOL.md`, `NATIVE_CAPABILITIES.md`) — check a new command's name + add its row at creation, not release re-audit; `bypass.sh` stale-pointer self-repair via `CLAUDE_CODE_SESSION_ID` (repoints a dead-session pointer on the first try, no re-run; +4 tests, gate 27→31). **#382:** DevOps ACL-traverse enumeration + post-lock prod-FE 200 check; new `egress-sandbox.sh` pattern (`systemd-run --uid/--gid`, uid-independent egress); SUB_AGENTS spend ceiling reserves in-flight child budget; QA coverage-fidelity honesty + per-lane ledger; headless-OAuth bootstrap note. **Distribution:** `prepack.sh`/`copy-assets.sh` now ship every `docs/patterns/` file regardless of extension (`.sh`/`.py`/`.conf` were silently dropped — LRN-11 gap). #383 closed as already-shipped. Build clean, suite 1420. Dep `^23.20.0` → `^23.21.0`. |
|
|
17
19
|
| 23.20.0 | 2026-06-23 | **Triaged 12 upstream field reports (#364–#378) → methodology hardening, + `/seal` and `/contextmeter`.** Applied every accepted fix across ~41 method docs / agents / patterns / commands (throughput/scale gates, ADR concurrency verification, deny-list discipline, runtime-path tracers, render-gate coverage, OAuth/external-claim verification, HTTP two-principal isolation, dall-e→gpt-image-1 currency, `/debrief` gate-gap docs) + 2 new patterns (`post-deploy-probe.sh`, `exclusion-set-invariant.md`); implemented the two wizard reports — non-destructive CLAUDE.md `update` merge (#368) + legacy-marker detection (#369). **New `/seal`** — session closeout (git → debrief → vault → handoff). **New `/contextmeter`** — context-budget meter + `UserPromptSubmit` awareness hook, default-on (warn 80% / crit 92%), `scripts/statusline/` wired through all four distribution paths + npm `files`. Build clean, suite 1392→1420. Dep `^23.19.0` → `^23.20.0`. |
|
|
18
20
|
| 23.19.0 | 2026-06-13 | **Gauntlet acceptance test → 14 fixes (the ADR-067 re-platform, validated by running it on itself).** Ran the new `gauntlet.workflow.js` live on the v23.13–v23.18 platform code (10-agent Surfer roster → 347 agents → 99 distinct claims → 66 confirmed + 24 crossfire, 0 Critical) and fixed the 3-lens-confirmed findings. **Gate (security):** `_paths.sh` reap was missing `-mindepth 1` → could `rm -rf` the entire `sessions/` tree (every live roster/bypass); the reaper now refreshes `$SESSION_DIR` mtime on activity + threshold raised above the TTL, closing the documented reap-vs-fresh-roster/bypass race; `shasum`→`sha256sum` fallback (gate silently broke on Alpine); `bypass.sh` run before the first hook fire now records a repo-scoped *pending* bypass that `check.sh` promotes (was a silent no-op). **Workflows:** strike no longer re-runs the same ≤5-agent roster twice; crossfire `survives:true+REFUTED` verdicts no longer vanish into no bucket (logged in `crossfireRefutedLog`); dedup keeps the **highest** severity + `raisedBy` (was first-write-wins); guarded `JSON.parse(args)`; undefined-domain prompt guard. **Distribution:** `npx voidforge-build init` now copies `.claude/workflows/` + `AGENT_CLASSIFICATION.md`; `update` now propagates `.claude/workflows` + `scripts/surfer-gate` (both were stranded). **Validation:** new `scripts/validate-workflows.sh` (wraps the runtime shape, then `node --check`) wired into `pretest` — corrects the false "scripts pass `node --check`" claim and gates syntax errors from shipping. **Docs:** `WORKFLOWS.md` example `agentType: a.id`→`a.name` + new gotchas; stale `/tmp/voidforge-*` paths fixed in gate README + CLAUDE.md (ADR-060). **CI:** `recover-partial` derives the version from `package.json` not `github.ref_name` (broke on dispatch); Playwright cache key off the committed manifests not the regenerated lockfile. Gate suite 23→27, full suite 1390→1392. Deferred (field-report candidates): concurrent same-repo pointer collision, `workflow_dispatch` branch guard. Dep `^23.18.0` → `^23.19.0`. |
|
|
19
21
|
| 23.18.0 | 2026-06-13 | **Workflow re-platform of `/gauntlet` + `/assemble` (ADR-067)** — the opportunity ADR-064 unblocked. New `.claude/workflows/gauntlet.workflow.js` (discovery → JS dedupe → 3-lens adversarial REFUTE → crossfire → council, schema-validated) and `assemble-review.workflow.js` (engage+sentinel over a mission diff; build/arch/devops stay prose). New **`docs/methods/WORKFLOWS.md`** authoring standard (API, the #348/#363 gotchas, 16/1000 caps, and the ADR-064 gate-launch sequence: Surfer→record-roster→Workflow). `gauntlet.md`/`assemble.md` gain workflow-execution sections; personas + fix-application + Debate Protocol stay prose. **Distribution gate (Phase 12.75):** `.claude/workflows/` is a new shared category — added to `prepack.sh` (npm) + `copy-assets.sh` (init) so the scripts ship to consumers. Both scripts `node --check`-validated (ESM async-wrapped); the live end-to-end gauntlet run is the acceptance test. Dep `^23.17.0` → `^23.18.0`. |
|
package/dist/docs/methods/BUILD_PROTOCOL.md
CHANGED
|
@@ -450,6 +450,14 @@ Examples of batches that are too big:
|
|
|
450
450
|
|
|
451
451
|
---
|
|
452
452
|
|
|
453
|
+
## Authoring a New Slash Command — Creation-Time Native-Collision Gate (field report #384 RC-2)
|
|
454
|
+
|
|
455
|
+
When you create a new VoidForge slash command (a new `.claude/commands/*.md`), the **native-collision check happens at creation time, not at release re-audit.** ADR-066's `docs/NATIVE_CAPABILITIES.md` re-audit is a release-time backstop; relying on it alone let `/contextmeter` get built as `/statusline` and then renamed mid-build — after docs and scripts already referenced the dead name. Shift the check left:
|
|
456
|
+
|
|
457
|
+
1. **Before writing the file, check the proposed name against the native command/skill set.** Claude Code ships native slash commands and skills (e.g. `/init`, `/review`, `/security-review`, `/code-review`, `/test`, `/commit`, `/statusline`, `/context`, `/deep-research`, plus built-ins). On surfaces with project-local resolution a same-named `.claude/commands/*.md` wins, but on surfaces without it (claude.ai web, some IDE extensions) a colliding **native** capability shadows ours — running ungated and without VoidForge semantics. Consult `docs/NATIVE_CAPABILITIES.md` for the currently-tracked native set.
|
|
458
|
+
2. **If the name collides, resolve it before the name propagates.** Either rename to a non-colliding name (the `/statusline`→`/contextmeter`, `/review`→`/engage`, `/security`→`/sentinel` precedent) or, if coexistence is deliberate, record a `coexist + document` disposition. Decide *before* writing the file, so no doc/script ever references a name you'll have to retract.
|
|
459
|
+
3. **Add the `NATIVE_CAPABILITIES.md` row as part of creating the command** — same commit, not a later audit. The ADR-066 coverage rule (every `.claude/commands/*.md` has a row) is then satisfied at creation, and the release-time re-audit becomes a confirmation rather than a discovery.
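The name check in step 1 can be sketched as a lookup against the native set. The set below only repeats the examples named in step 1; the authoritative list is whatever `docs/NATIVE_CAPABILITIES.md` tracks, and `collides_with_native` is a hypothetical helper name.

```python
# Example native names quoted in step 1; not an exhaustive or authoritative list.
NATIVE_COMMANDS = frozenset({
    "init", "review", "security-review", "code-review", "test",
    "commit", "statusline", "context", "deep-research",
})


def collides_with_native(name, native=NATIVE_COMMANDS):
    """True if a proposed slash-command name matches a native command or skill."""
    normalized = name.strip().lstrip("/").lower()
    return normalized in native
```

Running this before the command file is written is the point of the gate: a `True` result means rename or record a `coexist + document` disposition before any doc or script references the name.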
|
|
460
|
+
|
|
453
461
|
## Principles
|
|
454
462
|
|
|
455
463
|
1. PRD is source of truth. Agents don't override product decisions. If the PRD is ambiguous, flag it and present options — don't decide product direction.
|
package/dist/docs/methods/DEVOPS_ENGINEER.md
CHANGED
|
@@ -45,6 +45,8 @@ When the application spawns child processes (workers, background jobs, PTY sessi
|
|
|
45
45
|
|
|
46
46
|
(Field report #57: shell profiles re-injected environment variables that were explicitly filtered from the PTY environment.)
|
|
47
47
|
|
|
48
|
+
**Egress sandbox: drop to the invoking uid, don't run as root.** When you confine a workload's outbound network with `sudo systemd-run -p IPAddressDeny=…`, pass `--uid`/`--gid` to run it as the invoking user. `IPAddress*` filtering is a cgroup property and is uid-independent, so confinement is fully preserved while artifacts stay user-owned — running as root litters root-owned state that breaks a sibling tool run later as the normal user. See `docs/patterns/egress-sandbox.sh` (field report #382 RC-2).
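A minimal Python sketch of how the confined command line could be assembled is shown here. It only builds the argument list and does not run anything. The helper name is hypothetical, and the choice of `IPAddressDeny=any` with an optional `IPAddressAllow` list is an assumption for illustration; the shipped `egress-sandbox.sh` may use different properties.

```python
def sandbox_argv(command, uid, gid, allow=()):
    """Build a systemd-run invocation that confines egress but runs as uid/gid.

    command: the workload argv, e.g. ["python3", "job.py"]
    allow:   CIDR ranges still reachable (assumed policy: deny everything else)
    """
    argv = [
        "sudo", "systemd-run", "--wait",
        f"--uid={uid}", f"--gid={gid}",   # artifacts stay owned by this user
        "-p", "IPAddressDeny=any",        # assumption: default-deny egress
    ]
    for cidr in allow:
        argv += ["-p", f"IPAddressAllow={cidr}"]
    # "--" separates systemd-run options from the workload's own arguments.
    return argv + ["--"] + list(command)
```

Because the filter is attached to the unit's cgroup, changing `--uid` and `--gid` changes file ownership only, not what the workload can reach.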
|
|
49
|
+
|
|
48
50
|
See NAMING_REGISTRY.md for 70+ additional characters.
|
|
49
51
|
|
|
50
52
|
## Goal
|
|
@@ -406,6 +408,12 @@ When staging and production coexist on the same server, enforce full isolation:
|
|
|
406
408
|
|
|
407
409
|
Convention isn't enough — enforcement is. The pre-push hook is the single most effective protection. (Field report #241: 68-hour production outage from shared infrastructure.)
|
|
408
410
|
|
|
411
|
+
### Locking a shared parent dir: enumerate every traversing service account (field report #382, HIGH)
|
|
412
|
+
|
|
413
|
+
A QA/security isolation step that revokes world-traverse on a home or parent directory can silently break any *other* service whose path runs through it — a cross-phase landmine that surfaces hours later. Real incident: `/home/ubuntu` was set to `0750` plus a traverse ACL granting only the QA containment users (`us-qa`/`us-healer`); nginx (`www-data`) serves the production SPA from a directory *under* `/home/ubuntu`, lost traverse, and 500'd the moment its workers recycled (logrotate).
|
|
414
|
+
|
|
415
|
+
**Rule — when you revoke world-traverse (`o-x`, `0750`, a restrictive ACL) on any home or parent directory, enumerate EVERY service account that traverses it and grant each an explicit traverse ACL.** The web server (`www-data`/`nginx`), the app/runtime user, the process manager, backup/cron users — any uid whose working path descends through the locked dir needs `setfacl -m u:<svc>:--x <dir>` (execute/traverse only, NOT read). Then **verify with a live request, not the ACL listing**: after locking the dir, `curl` the production front-end (and any other co-located service) and assert `200`. `getfacl` proves the ACL is *set*; only a live request proves nothing downstream *broke*. Add to the containment runbook: "after locking a home/parent dir → curl prod FE → assert 200."
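The rule above has two halves, a grant per service account and a live check. The Python sketch below only builds the commands and interprets the result; the helper names and the example account list are hypothetical, and the `setfacl` form is the one quoted in the rule.

```python
def traverse_acl_commands(locked_dir, service_accounts):
    """One traverse-only (--x) ACL grant per service account that descends through locked_dir."""
    return [["setfacl", "-m", f"u:{svc}:--x", locked_dir] for svc in service_accounts]


def status_probe_command(url):
    """curl invocation that prints only the HTTP status code of a live request."""
    return ["curl", "-s", "-o", "/dev/null", "-w", "%{http_code}", url]


def front_end_ok(curl_output):
    """The lock is verified only when the live request returned 200."""
    return curl_output.strip() == "200"
```

Keeping the account list explicit, for example `["www-data", "us-qa", "us-healer"]`, makes the enumeration reviewable: a missing service shows up as a missing entry rather than as an outage hours later.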
|
|
416
|
+
|
|
409
417
|
### Promote gate must verify the staging server's DEPLOYED COMMIT == branch HEAD
|
|
410
418
|
|
|
411
419
|
A staging-first promote gate that checks only "staging branch is ahead of main" + "staging health endpoint returns 200" + "version was bumped" is **structurally blind to the one thing staging-first exists to guarantee**: that the code being promoted actually *ran on staging*. Branch-ahead proves the commit was *pushed*; health-200 proves *some* build is up. Neither proves the staging **server** is running the commit being promoted — a push-to-branch without a redeploy leaves the server lagging the branch, and "ahead + 200" both still pass. Promote at that point and you ship commits to prod that **never executed on staging** — the exact failure staging-first is built to prevent (and the same shape as the "deployed but never reloaded" stale-build outage elsewhere in this doc).
|
|
@@ -478,6 +486,8 @@ Add project-specific exclusions for any directory that receives runtime-generate
|
|
|
478
486
|
|
|
479
487
|
**Live-fire verification per credential (field report #360).** After wiring ANY external credential — analytics, error tracking, ad platform, payment, LLM provider, anything with an API key/secret/token — exercise it against the provider's LIVE API and confirm acceptance before marking the integration done. Env-var-set is NOT done; a structurally-valid value (correct prefix/length) can still be dead. Send the smallest real authenticated request the provider supports (a no-op read, a token introspection, a `whoami`/`accounts:list`, a single test event) and assert a success status, not just a non-error transport. This single live call also surfaces latent integration bugs the stored value can't reveal: a hardcoded/sunsetting API version now returning 404 (pin a current version + add a health check), a missing required header (e.g. `login-customer-id` for a manager→client account), or wrong scopes. Evidence: a Google Ads credential that looked structurally valid was dead (`invalid_client`) and a v17 pin had been retired (404, current v21) — eyeballing would have shipped a silently-broken integration.

+**Interactive OAuth consent on a headless server (field report #382).** Bootstrapping an OAuth credential that uses a loopback redirect (Google's installed-app flow, many provider CLIs) assumes a browser on the *same host* as the consent step — it opens `http://127.0.0.1:<port>/` and waits. On a headless box (deploy server, CI runner) there is no local browser, so the flow hangs. Two fallbacks — document both in any OAuth-bootstrap tooling guidance: (1) **SSH local port-forward** — `ssh -L <port>:127.0.0.1:<port> user@server`, run the consent in your *local* browser, and the loopback redirect resolves back through the tunnel to the server's listener; (2) **paste-the-code / out-of-band** — use the flow's manual-code variant (e.g. `--no-launch-browser`), open the printed auth URL on any machine, and paste the returned code into the headless prompt. Use port-forward when the tool only does loopback; paste-code when it offers an OOB option.
+
**Post-deploy OAuth sign-in failures: discriminate IdP-side from regression before rolling back.** When the first real sign-in after a deploy fails, do NOT reflexively roll back — first locate WHERE it failed. If the error page lives on the IdP's own domain (e.g. `accounts.google.com/info/unknownerror`, with a `rapt` re-auth token) and occurs BEFORE your `/callback` is hit, the failure is on the identity provider, not your migration — typically a stuck re-auth session, not a regression. Confirm your authorize request was well-formed (client_id, redirect_uri, scope, state) and then retry in a fresh/incognito session; an incognito success proves the deploy is fine and the IdP session was transient. Only an error AT your callback (state mismatch, token-exchange 4xx, cookie not set) implicates your code. A reflexive rollback on an IdP-side error falsely blames the migration and fixes nothing. (Field report #357 #3.)
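The "locate WHERE it failed" step can be sketched as a tiny decision function. This is an illustrative sketch only: `locate_signin_failure` and its parameters are hypothetical, and it assumes you can observe the URL of the error page and whether your `/callback` handler logged a request for the attempt.

```python
from urllib.parse import urlsplit


def locate_signin_failure(error_url: str, app_host: str, callback_hit: bool) -> str:
    """Return "idp" or "app" depending on where the sign-in failed.

    error_url    -- URL of the page showing the error
    app_host     -- hostname of your own application (assumed known)
    callback_hit -- whether your /callback handler saw a request for this attempt
    """
    error_host = (urlsplit(error_url).hostname or "").lower()
    on_our_domain = error_host == app_host.lower()
    if not on_our_domain and not callback_hit:
        # Failed on the provider's pages before control returned to us.
        return "idp"
    return "app"
```

An `"idp"` verdict means retrying in a fresh or incognito session before considering a rollback; only an `"app"` verdict implicates the deploy.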

**Post-deploy asset verification:** After deploying, verify specifically the files that *changed* in this deploy — not pre-existing assets. Check: (a) correct content-type header (text/html on a static asset means the file is missing from the deployment), (b) correct content-length (not the index.html fallback size), (c) deployment list shows the correct environment. Do NOT verify only pre-existing assets — they prove the host is up, not that the deploy succeeded. (Field report #114)
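Checks (a) and (b) above can be sketched as a pure function over the response headers of one changed asset. This is a hedged sketch: `check_changed_asset` is a hypothetical helper, and the expected content type and the index.html fallback size are values you would supply from your own build.

```python
from typing import List, Mapping


def check_changed_asset(headers: Mapping[str, str], expected_type: str,
                        index_html_size: int) -> List[str]:
    """Return a list of problems for one asset that changed in this deploy."""
    problems = []
    # Header names are case-insensitive; normalize before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    content_type = lowered.get("content-type", "").split(";")[0].strip().lower()
    if content_type != expected_type:
        problems.append(
            f"content-type is {content_type or 'missing'}, expected {expected_type}"
        )
    length = lowered.get("content-length")
    if length is not None and int(length) == index_html_size:
        problems.append("content-length equals the index.html fallback size")
    return problems
```

An empty list means the asset passed both header checks; check (c), the deployment list, still has to be confirmed against the hosting platform.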
@@ -50,7 +50,7 @@ Transform session failures into structured, actionable field reports that improv
4. **Categorize root causes.** Every failure is one of: methodology gap, tooling limitation, communication failure, scope issue, framework-specific bug, or external dependency.
5. **Severity matters.** Distinguish between "this affects all users" (methodology flaw) and "this was specific to my project" (edge case).
6. **Be actionable.** Every finding should specify: which file should change, what should be added/modified, and which agent is responsible.
-7. **Take the gate bypass before dispatching the DS9 crew (#366 F4).** Bashir's Ezri/O'Brien/Nog/Jake are a fixed, command-prescribed roster — not a cherry-picked review — so `/debrief` is NOT in the Silver Surfer gated-commands list. But the gate's `PreToolUse` hook blocks *every* non-Surfer Agent launch regardless of that list, so the crew would be blocked. BEFORE launching the first sub-agent, run `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true`. The bypass is correct here, not a workaround: a fixed roster can't be cherry-picked, so the gate's anti-cherry-pick purpose doesn't apply. **Stale-pointer
+7. **Take the gate bypass before dispatching the DS9 crew (#366 F4).** Bashir's Ezri/O'Brien/Nog/Jake are a fixed, command-prescribed roster — not a cherry-picked review — so `/debrief` is NOT in the Silver Surfer gated-commands list. But the gate's `PreToolUse` hook blocks *every* non-Surfer Agent launch regardless of that list, so the crew would be blocked. BEFORE launching the first sub-agent, run `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true`. The bypass is correct here, not a workaround: a fixed roster can't be cherry-picked, so the gate's anti-cherry-pick purpose doesn't apply. **Stale-pointer self-repair (#384 RC-3):** `bypass.sh` now detects a stale pointer itself — it reads the live session id from `CLAUDE_CODE_SESSION_ID` and repoints a pointer left by a `/clear`ed or crashed session to the live session, so the bypass lands in the right dir on the first try. On older Claude Code builds without that env var the legacy behavior applies: if the first sub-agent launch still blocks, re-run the same `bypass.sh --light` line once (the first blocked `check.sh` fire repoints to the live session, so the second write lands correctly).

## Root Cause Categories

@@ -301,6 +301,10 @@ A shipped drift-guard (coverage gate, schema-parity `--check` CLI, lint sentinel

A guard failing either condition is **High** — it manufactures false confidence in exactly the regressions it claims to prevent. (Field report #365: a coverage drift-guard shipped a `--check` CLI that enforced weaker invariants than its own pytest suite — silently passing the three likeliest regressions — AND the tests were never wired into CI. The guard looked green while guarding nothing.)

+### Coverage Honesty — Count at the Fidelity Actually Exercised (field report #382)
+
+A coverage SSOT (the canonical "what's tested" ledger) must record each case at the fidelity it was *actually* exercised — never reclassify it as covered on partial evidence. Real incident: a new real-account smoke lane proved the *backend* integration (token → authenticated read) but not the in-browser OAuth round-trip; the temptation was to mark the front-end `external_cases` "covered." It wasn't — only the backend path was. **Rules:** (1) a case is covered only at the layer a test actually drives — a backend token test does not cover the browser consent flow; an API test does not cover the UI that calls it; (2) record proof **per lane** in a ledger (which lane exercised the case, at what fidelity, with what evidence — a log line, a screenshot, a status assertion) so a coverage claim is auditable, not asserted; (3) never promote a case to "covered" in the SSOT on a different-fidelity proxy. Counting partial evidence as full coverage manufactures the same false confidence as a vacuous gate — it just hides in the ledger instead of the code. (Reinforces the verification-discipline theme of #377.)
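The per-lane ledger in rule (2) can be sketched as a list of proof records plus a lookup that refuses different-fidelity proxies. This is a minimal sketch: `Proof`, `is_covered`, and the layer names are hypothetical, not the project's actual ledger format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Proof:
    case: str      # case identifier in the coverage SSOT
    lane: str      # which test lane exercised it
    layer: str     # fidelity actually driven, e.g. "backend" or "browser"
    evidence: str  # log line, screenshot path, or status assertion


def is_covered(ledger: List[Proof], case: str, required_layer: str) -> bool:
    """A case is covered only by evidenced proof at the required layer."""
    return any(
        p.case == case and p.layer == required_layer and bool(p.evidence)
        for p in ledger
    )
```

With a ledger holding only a backend proof for a case, `is_covered(..., "browser")` stays false, which is the behavior rule (3) asks for.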
+
### Safety-Critical Return Value Verification

For systems with safety-critical operations (stop-loss placement, circuit breakers, rollback triggers, payment captures, credential revocations): verify the return value of the safety operation BEFORE transitioning state. The pattern: `call safety operation → check return → only then transition`.
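The call, check, transition pattern can be sketched as follows. This is an illustrative sketch only: `Position`, `protect_then_open`, and the `"halted"` fail-closed state are hypothetical names, and the safety operation is assumed to return `True` on success.

```python
from typing import Callable, Optional


class Position:
    """Hypothetical state holder for the example."""

    def __init__(self) -> None:
        self.state = "pending"


def protect_then_open(position: Position,
                      place_stop_loss: Callable[[], Optional[bool]]) -> bool:
    """Transition to "open" only after the safety operation reports success."""
    placed = place_stop_loss()      # 1. call the safety operation
    if placed is not True:          # 2. check the return value explicitly
        position.state = "halted"   # fail closed; never assume it worked
        return False
    position.state = "open"         # 3. only then transition
    return True
```

Comparing against `True` rather than relying on truthiness means a `None` return (for example from a call that swallowed an error) is treated as a failure.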
@@ -43,6 +43,17 @@ Clean, consistent, well-documented releases. Every version bump tells a story. E
4. **Never auto-push.** Push only when the user explicitly requests it.
5. **Present before executing.** Show the changelog entry, version bump, and commit message for user approval before committing.
6. **Breaking changes get called out.** If MAJOR, explain what breaks and why.
+7. **Never `git add -A` a release.** Run the Step-0 unrelated-change split first (see "Unrelated-Change Detection" below) and stage by path.
+
+## Unrelated-Change Detection — Step 0 (field report #384 RC-1)
+
+A release must ship only what the session authored. The working tree can also carry **pre-existing or out-of-scope changes** — a stray dependency, a leftover edit, an untracked scratch/probe file — and a naive `git add -A` bundles them into the release. The v23.20.0 near-miss: a `vercel` dependency (added to root `dependencies` by a stray `npm install`, +~5,900 `package-lock.json` lines) sat in the tree unrelated to the release; it was caught only because the lead manually diffed `package.json`. Encode that vigilance instead of relying on it.
+
+Before staging (in `/git` Step 0 and `/seal` Step 0):
+
+1. **Dependency manifests get special scrutiny.** If any manifest/lockfile is in the diff — `package.json`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`, `requirements.txt`, `pyproject.toml`, `Cargo.toml`/`Cargo.lock`, `go.mod`/`go.sum`, `Gemfile`/`Gemfile.lock` — read the **dependency-level** diff (`git diff -- <manifest>`), not just the filename. Any dependency added/changed/removed that the session did not deliberately introduce is flagged for an explicit include/exclude decision. This enforces "no new dependencies without justification" (CLAUDE.md Coding Standards) at release time.
+2. **Scope diff.** Compare the full changed-file list against what the session actually touched. Surface anything you did not author this session for an explicit keep/drop decision.
+3. **Present the split** — *session-authored (stage these)* vs *pre-existing or out-of-scope (decide)* — and stage by path from the session-authored side only. `git add -A` / `git add .` re-admits exactly the changes this split exists to exclude.
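The three steps above can be sketched as one pure function over the changed-file list. This is a minimal sketch under stated assumptions: `split_changes` and `MANIFESTS` are hypothetical names, and the set of session-authored paths is assumed to be tracked by the session itself (the changed list could come from `git status --porcelain`).

```python
from typing import Iterable, List, Set, Tuple

# Manifest and lockfile names listed in step 1.
MANIFESTS = {
    "package.json", "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "requirements.txt", "pyproject.toml", "Cargo.toml", "Cargo.lock",
    "go.mod", "go.sum", "Gemfile", "Gemfile.lock",
}


def split_changes(changed: Iterable[str], session_authored: Set[str]
                  ) -> Tuple[List[str], List[str], List[str]]:
    """Split changed paths into (stage, decide, manifests_to_read)."""
    stage: List[str] = []
    decide: List[str] = []
    manifests: List[str] = []
    for path in sorted(changed):
        if path.rsplit("/", 1)[-1] in MANIFESTS:
            manifests.append(path)  # always read the dependency-level diff
        (stage if path in session_authored else decide).append(path)
    return stage, decide, manifests
```

Only the `stage` list is passed to `git add` by path; everything in `decide` waits for an explicit keep/drop decision, and every entry in `manifests` gets a `git diff -- <manifest>` read even when the session authored it.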

## Semver Rules

@@ -289,6 +289,8 @@ Leads inherit the main session's model (Opus). Specialists run on Sonnet for cos

**Effort tiering (per-agent spend lever).** Claude Code exposes an `effort:` level (`low`/`medium`/`high`/`xhigh`/`max`) that controls reasoning depth *independently* of the model tier. Apply by role: **Leads → `xhigh`** (the recommended start for agentic work on Opus 4.8); **Specialists → `medium`** (read-and-report review rarely needs full `high` spend across ~200 agents); **Scouts → OMIT** — **Haiku 4.5 does not support the effort parameter and errors if it is passed.** Haiku also has a **200K context ceiling (not 1M)**: the Surfer pre-scan and scout prompts must fit within it — read agent frontmatter (name/description/tags), not full bodies, on large rosters. **Verified + applied 2026-06-13:** the official sub-agents docs confirm `effort` is a supported frontmatter field; the fleet edit is live — all 20 leads carry `effort: xhigh`, all 201 Sonnet specialists `effort: medium`, the 43 Haiku scouts omit it (ADR-054). New agents should follow the same tiering.

+**Global spend ceiling must reserve in-flight budget (field report #382).** When a multi-child orchestration enforces a global cost ceiling, the launch gate must NOT admit the next child on *cumulative-spent-so-far < ceiling* — that ignores the children already running, so the ceiling is breached by whatever the in-flight children go on to spend (real incident: an $80 cap spent $83.72). Before launching child N, reserve the **max-possible in-flight spend**: admit only if `ceiling − spent_so_far − Σ(per-child ceiling of running children) ≥ next child's per-child ceiling`. That bounds the worst case under the cap. If per-child spend can't be bounded, the total can't be either — then document the overshoot bound explicitly (`ceiling + (concurrency − 1) × max-per-child`) rather than calling the cap hard. In a Dynamic Workflow the `budget` API (WORKFLOWS.md) is the natural enforcement point: gate `agent()` launches on `budget.remaining()` minus reserved in-flight, not on raw `spent()`.
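The reservation rule and the overshoot bound above can be written out directly. This is a sketch of the arithmetic only: `can_launch` and `overshoot_bound` are hypothetical helpers and do not use the `budget` API mentioned in the text.

```python
from typing import Sequence


def can_launch(ceiling: float, spent_so_far: float,
               running_child_ceilings: Sequence[float],
               next_child_ceiling: float) -> bool:
    """Admit the next child only if worst-case in-flight spend still fits."""
    reserved = sum(running_child_ceilings)  # max the running children can spend
    return ceiling - spent_so_far - reserved >= next_child_ceiling


def overshoot_bound(ceiling: float, concurrency: int, max_per_child: float) -> float:
    """Overshoot bound from the text: ceiling + (concurrency - 1) * max-per-child."""
    return ceiling + (concurrency - 1) * max_per_child
```

For example, with an 80 cap, 55 already spent, and two running children capped at 10 each, only 5 remains unreserved, so a third child capped at 10 is refused even though raw spend is well under the cap.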
+
### Tool Restrictions

| Profile | Tools | Agents |
@@ -0,0 +1,43 @@
+#!/usr/bin/env bash
+# egress-sandbox.sh — Pattern: run a network-egress-confined workload WITHOUT
+# making its artifacts root-owned (field report #382 RC-2).
+#
+# PROBLEM. A common "egress sandbox" wraps a workload in `sudo systemd-run` with
+# IPAddressAllow/IPAddressDeny to confine outbound network. Done naively it runs
+# the workload as ROOT (sudo's default), so every file the workload writes —
+# caches, state, lock files, output — is root-owned. A sibling/same-purpose tool
+# run later as the normal user then can't read or overwrite that state and
+# breaks. The egress confinement never REQUIRED root: IPAddress* filtering is a
+# cgroup property (systemd's BPF egress filter) and is uid-independent. Drop the
+# workload back to the invoking user and confinement is fully preserved while
+# artifacts stay user-owned.
+#
+# ── WRONG: runs as root, litters root-owned artifacts ────────────────────────
+#   sudo systemd-run --pipe --wait \
+#     -p IPAddressDeny=any -p IPAddressAllow=10.0.0.0/8 \
+#     my-workload --out ./state   # ./state is now root-owned
+#
+# ── RIGHT: same egress confinement, artifacts owned by the invoking user ──────
+INVOKING_UID="$(id -u)"
+INVOKING_GID="$(id -g)"
+
+sudo systemd-run --pipe --wait \
+  --uid="$INVOKING_UID" --gid="$INVOKING_GID" \
+  -p IPAddressDeny=any \
+  -p IPAddressAllow=localhost \
+  -p IPAddressAllow=10.0.0.0/8 \
+  my-workload --out ./state   # ./state owned by the invoking user
+#
+# WHY IT'S SAFE. IPAddressAllow/IPAddressDeny are enforced by the transient
+# unit's cgroup, which applies regardless of the process uid. --uid/--gid only
+# change the credential the workload runs under — they do not relax the network
+# policy. You get confinement AND user-owned artifacts.
+#
+# VERIFY BOTH HALVES (don't assume — one assertion per property):
+#   1. Confinement: from inside the workload, a connection to a DENIED address
+#      must fail — curl --max-time 3 https://example.org times out/refused.
+#   2. Ownership: stat -c '%U' ./state returns the invoking user, not root.
+#
+# NOTE. IPAddress* allow/deny lists need systemd ≥ 235 with cgroup v2 + BPF; on
+# hosts without it, fall back to a network namespace (`ip netns`) or a per-unit
+# firewall, but keep the same --uid/--gid drop so artifacts stay user-owned.
@@ -0,0 +1,156 @@
+# nginx vhost — Cloudflare-Flexible-compatible reverse proxy with ACME passthrough
+#
+# Reference implementation for a per-tenant origin vhost that sits behind
+# Cloudflare in *Flexible* SSL mode (browser<->Cloudflare is HTTPS, but
+# Cloudflare<->origin is plain HTTP on :80). Use this when the origin app
+# does NOT terminate TLS itself and Cloudflare fronts the zone.
+#
+# Evidence: field report #344 F2 (origin 301 HTTP->HTTPS redirect loops on
+# Cloudflare Flexible zones) and #344 F4a (missing/origin-level security header
+# stack + per-tenant log isolation).
+#
+# When to use it:
+#   - Cloudflare zone SSL mode = Flexible, origin speaks HTTP only.
+#   - You want the security header stack applied at the origin so it survives
+#     even if a Cloudflare Transform/Response-Header rule is removed.
+#   - You need Let's Encrypt http-01 (ACME) to keep working through the proxy.
+#
+# When NOT to use it (use a TLS-terminating vhost instead):
+#   - Cloudflare SSL mode = Full or Full (strict): the origin must serve HTTPS
+#     and you SHOULD redirect HTTP->HTTPS. Adding the 301 below would NOT loop
+#     in that mode. This template deliberately omits the redirect for Flexible.
+#
+# Install:
+#   - Copy to /etc/nginx/sites-available/<tenant>.conf, edit the @@PLACEHOLDERS@@,
+#     symlink into sites-enabled/, then `nginx -t && systemctl reload nginx`.
+#   - Replace @@SERVER_NAME@@, @@UPSTREAM@@, @@TENANT@@ before loading.
+#
+# Placeholders:
+#   @@SERVER_NAME@@  -> e.g. app.example.com
+#   @@UPSTREAM@@     -> origin app host:port, e.g. 127.0.0.1:3000
+#   @@TENANT@@       -> short slug used for log filenames, e.g. acme-corp
+
+# ---------------------------------------------------------------------------
+# Rate-limit zone declaration.
+#
+# IMPORTANT (field report #344 F4a): limit_req_zone is an http{}-context-only
+# directive. It MUST live in the top-level http{} block (e.g. nginx.conf or a
+# file in conf.d/ that is included at http scope) — it is NOT valid inside a
+# server{} or location{} block and `nginx -t` will reject it there with
+# "limit_req_zone directive is not allowed here".
+#
+# Declare the zone ONCE at http{} scope, then *apply* it per-location with
+# `limit_req` (which IS valid in server{}/location{}). The line below is shown
+# here for reference only — move it to your http{} context:
+#
+#   limit_req_zone $binary_remote_addr zone=@@TENANT@@_perip:10m rate=20r/s;
+#
+# The WebSocket upgrade map below is ALSO http{}-context-only — declare it once
+# at http{} scope alongside the rate-limit zone, not inside server{}:
+#
+#   map $http_upgrade $connection_upgrade {
+#       default upgrade;
+#       ''      '';      # no Upgrade header (plain HTTP): omit Connection so upstream keepalive works
+#   }
+# ---------------------------------------------------------------------------
+
+upstream @@TENANT@@_origin {
+    # Origin application. keepalive reduces TCP churn on the proxy hop.
+    server @@UPSTREAM@@;
+    keepalive 32;
+}
+
+server {
+    listen 80;
+    listen [::]:80;
+    server_name @@SERVER_NAME@@;
+
+    # --- Per-tenant access/error logs (field report #344 F4a) -------------
+    # Isolated per tenant so one tenant's traffic/errors never contaminate
+    # another's audit trail. Ensure /var/log/nginx exists and is writable.
+    access_log /var/log/nginx/@@TENANT@@.access.log combined;
+    error_log  /var/log/nginx/@@TENANT@@.error.log warn;
+
+    # --- ACME http-01 passthrough (field report #344 F4a) -----------------
+    # Let's Encrypt validates by fetching /.well-known/acme-challenge/<token>
+    # over plain HTTP. This location MUST be reachable on :80 and MUST NOT be
+    # redirected or proxied to the app, or certificate issuance/renewal fails.
+    # Point root at the webroot your ACME client writes challenges into.
+    location ^~ /.well-known/acme-challenge/ {
+        default_type "text/plain";
+        root /var/www/acme;
+        # No proxy here — serve the challenge token straight from the webroot.
+        # NOTE: nginx add_header inheritance is replace-not-merge, but a block
+        # with zero add_header of its own inherits the parent's. The security
+        # headers declared at server{} scope below are thus also emitted here;
+        # that is harmless for a bare token response. If you must strip them on
+        # this path, redeclare the headers you want (or none) inside this block.
+        try_files $uri =404;
+    }
+
+    # --- Security header stack (field report #344 F4a) --------------------
+    # Applied at the origin so the posture survives even if a Cloudflare
+    # response-header rule is later removed. `always` ensures the headers are
+    # emitted on error responses (4xx/5xx) too, not just 2xx/3xx.
+    #
+    # HSTS-aware: behind Cloudflare Flexible the browser<->edge leg is HTTPS,
+    # so advertising HSTS is correct for the public hostname. Start with a
+    # short max-age while validating, then raise to 1y + preload once certain
+    # the apex and every subdomain are HTTPS-only at the edge.
+    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
+    add_header X-Content-Type-Options "nosniff" always;
+    add_header X-Frame-Options "SAMEORIGIN" always;
+    # frame-ancestors is the modern replacement for X-Frame-Options; both are
+    # sent for backward compatibility with older user agents.
+    add_header Content-Security-Policy "frame-ancestors 'self'" always;
+    add_header Referrer-Policy "strict-origin-when-cross-origin" always;
+
+    # --- Reverse proxy to the origin app ----------------------------------
+    # NOTE (field report #344 F2): there is deliberately NO
+    #     `return 301 https://$host$request_uri;`
+    # and NO `if ($http_x_forwarded_proto != "https") { return 301 ...; }`
+    # here. On a Cloudflare *Flexible* zone the edge talks HTTP to the origin,
+    # so any origin-level HTTP->HTTPS redirect produces an infinite redirect
+    # loop (ERR_TOO_MANY_REDIRECTS). Let Cloudflare handle the HTTPS upgrade
+    # at the edge. If you switch the zone to Full/Full-strict, terminate TLS
+    # at the origin and add the redirect in a separate :80 server block.
+    location / {
+        # Apply the http{}-scoped rate-limit zone here (this is the valid
+        # context for limit_req; the zone itself is declared at http{} scope
+        # per the note at the top of this file).
+        limit_req zone=@@TENANT@@_perip burst=40 nodelay;
+
+        proxy_pass http://@@TENANT@@_origin;
+        proxy_http_version 1.1;
+
+        # Preserve client + protocol context for the app. X-Forwarded-Proto is
+        # hardcoded to https below (not derived from Cloudflare's CF-Visitor),
+        # since the public-facing leg is HTTPS even though this hop is HTTP.
+        proxy_set_header Host $host;
+        proxy_set_header X-Real-IP $remote_addr;
+        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
+        proxy_set_header X-Forwarded-Proto https;
+        proxy_set_header X-Forwarded-Host $host;
+
+        # WebSocket / upgrade support. $connection_upgrade is the http{}-scoped
+        # map declared at the top of this file: it sends "upgrade" for WS and
+        # an empty value (header omitted) for plain HTTP. This single directive
+        # replaces the conflicting `Connection "upgrade"` + `Connection ""` pair
+        # and lets keepalive to the upstream coexist with WebSocket upgrades.
+        proxy_set_header Upgrade $http_upgrade;
+        proxy_set_header Connection $connection_upgrade;
+
+        proxy_connect_timeout 5s;
+        proxy_send_timeout 60s;
+        proxy_read_timeout 60s;
+
+        proxy_buffering on;
+    }
+
+    # Lightweight health endpoint that bypasses the app, for the edge/monitor.
+    location = /healthz {
+        access_log off;
+        default_type "application/json";
+        return 200 '{"status":"ok"}';
+    }
+}
@@ -0,0 +1,115 @@
+#!/usr/bin/env bash
+# Post-Deploy Probe — Assert sensitive paths are NOT publicly served.
+#
+# Reference implementation for .claude/commands/deploy.md Step 4.5.
+# Probes a denylist of paths against a live deploy URL.
+#
+# CONTENT-AWARE, NOT STATUS-ONLY (field report #371). A Single-Page App with a
+# catch-all route returns HTTP 200 for EVERY path — /.env, /.git/config,
+# /id_rsa — by serving index.html. A status-only probe reads those 200s as
+# "EXPOSED" and would trigger a ROLLBACK of a clean deploy (false positive),
+# while a real leak that happens to 200 looks identical to the shell. So we
+# assert on CONTENT and Content-Type, not status alone:
+#   - 200 + text/html shell (<!doctype html> / <html) -> PASS (SPA fallback)
+#   - 200 + non-HTML body (KEY=VALUE, JSON, PEM, "ref:", binary) -> EXPOSED (real leak)
+#   - non-200 -> PASS (not served)
+# A real .env leak is text/plain `KEY=VALUE`; a .git/config is an INI `[core]`
+# block; an id_rsa is a `-----BEGIN ... PRIVATE KEY-----` PEM. None are HTML.
+#
+# Evidence: field reports #305 (32-day credential leak), #303 (methodology
+# exposure), #371 (SPA catch-all status-only false-positive → would-be rollback).
+#
+# Usage:
+#   DEPLOY_URL=https://example.com bash docs/patterns/post-deploy-probe.sh
+#   DEPLOY_URL=https://example.com DEPLOY_PROBE_EXTRA=$'/admin\n/private.key' bash docs/patterns/post-deploy-probe.sh
+
+set -euo pipefail
+
+: "${DEPLOY_URL:?DEPLOY_URL is required (e.g. https://example.com)}"
+
+# Strip trailing slash for clean URL composition.
+DEPLOY_URL="${DEPLOY_URL%/}"
+
+TMP="$(mktemp -t postdeploy-probe.XXXXXX)"
+BODY="$(mktemp -t postdeploy-body.XXXXXX)"
+cleanup() { rm -f "$TMP" "$BODY"; }
+trap cleanup EXIT INT TERM
+
+# Fixed denylist — mirrors Step 4.5 in .claude/commands/deploy.md.
+DENYLIST=(
+  "/.env"
+  "/.env.production"
+  "/.env.local"
+  "/.git/config"
+  "/.git/HEAD"
+  "/.claude/agents/silver-surfer-herald.md"
+  "/docs/methods/FORGE_KEEPER.md"
+  "/HOLOCRON.md"
+  "/CHANGELOG.md"
+  "/VERSION.md"
+  "/package.json"
+  "/tsconfig.json"
+  "/id_rsa"
+  "/.ssh/id_rsa"
+)
+
+# Optional extensible denylist (newline-separated).
+if [[ -n "${DEPLOY_PROBE_EXTRA:-}" ]]; then
+  while IFS= read -r extra; do
+    [[ -n "$extra" ]] && DENYLIST+=("$extra")
+  done <<< "$DEPLOY_PROBE_EXTRA"
+fi
+
+# Decide whether a fetched path is a REAL leak vs an SPA HTML fallback.
+# Inputs: $1 status, $2 content-type header, body file at $BODY.
+# Echoes "leak" or "ok".
+classify() {
+  local status="$1" ctype="$2"
+  # Only a 200 can possibly be a leak; anything else is not served.
+  [[ "$status" == "200" ]] || { echo "ok"; return; }
+
+  # Lowercase the content-type for matching.
+  local ct; ct="$(printf '%s' "$ctype" | tr '[:upper:]' '[:lower:]')"
+
+  # An HTML response is the SPA catch-all shell, not the sensitive file. PASS.
+  if [[ "$ct" == *"text/html"* ]]; then echo "ok"; return; fi
+  # Body-sniff fallback when the server omits/mislabels Content-Type: a leading
+  # <!doctype html> or <html is the SPA shell.
+  if head -c 256 "$BODY" | tr '[:upper:]' '[:lower:]' | grep -qE '<!doctype html|<html'; then
+    echo "ok"; return
+  fi
+
+  # 200 + non-HTML body = the real file is being served. EXPOSED.
+  echo "leak"
+}
+
+hits=0
+checked=0
+
+for path in "${DENYLIST[@]}"; do
+  checked=$((checked + 1))
+  url="${DEPLOY_URL}${path}"
+  # Capture status + content-type, and the body (capped) for sniffing.
+  read -r status ctype < <(
+    curl -s -o "$BODY" --max-time 10 \
+      -w '%{http_code} %{content_type}\n' "$url" 2>/dev/null || echo "000 -"
+  )
+  verdict="$(classify "$status" "$ctype")"
+  if [[ "$verdict" == "leak" ]]; then
+    hits=$((hits + 1))
+    printf 'LEAK %s %-24s -> %s\n' "$status" "$ctype" "$url" | tee -a "$TMP" >&2
+  else
+    printf 'ok %s %-24s -> %s\n' "$status" "$ctype" "$url"
+  fi
+done
+
+printf '{"action":"post-deploy-probe","url":"%s","checked":%d,"hits":%d,"mode":"content-aware"}\n' \
+  "$DEPLOY_URL" "$checked" "$hits"
+
+if (( hits > 0 )); then
+  echo "[post-deploy-probe] ${hits} sensitive path(s) served as non-HTML content. Rollback and fix deploy surface." >&2
+  exit 1
+fi
+
+echo "[post-deploy-probe] clean (SPA HTML fallbacks treated as PASS)"
+exit 0
|
|
@@ -0,0 +1,140 @@
|
|
|
1
|
+
"""
|
|
2
|
+
Pattern: RLS Test Fixture (db_as_app SAVEPOINT)
|
|
3
|
+
|
|
4
|
+
Source: Field report #318 §5. Cara Dune (Union Station, M-05) discovered that
|
|
5
|
+
Testcontainers' default `us_test` user is `SUPERUSER + BYPASSRLS=t`.
|
|
6
|
+
Superusers bypass FORCE RLS at the engine level — the policy doesn't fire.
|
|
7
|
+
Any test using the shared `db` fixture for cross-tenant assertions will
|
|
8
|
+
silently pass even when the policy is broken.
|
|
9
|
+
|
|
10
|
+
Without this pattern, RLS tests that pass in development WILL silently fail
|
|
11
|
+
in production under the runtime non-owner role.
|
|
12
|
+
|
|
13
|
+
Use this pattern in every Python/asyncpg + pytest project with FORCE RLS.
|
|
14
|
+
The same shape ports to SQLAlchemy + sync sessions, psycopg, and Django ORM.
|
|
15
|
+
"""
|
|
16
|
+
|
|
17
|
+
import pytest
|
|
18
|
+
import pytest_asyncio
|
|
19
|
+
import asyncpg
|
|
20
|
+
from contextlib import asynccontextmanager
|
|
21
|
+
from typing import AsyncIterator
|
|
22
|
+
|
|
23
|
+
# ── Fixtures ──────────────────────────────────────────────────────────────
|
|
24
|
+
|
|
25
|
+
|
|
26
|
+
@pytest_asyncio.fixture
|
|
27
|
+
async def db_as_app(db: asyncpg.Connection) -> AsyncIterator[asyncpg.Connection]:
|
|
28
|
+
"""
|
|
29
|
+
Wrap the standard `db` fixture so RLS-sensitive tests run under the app
|
|
30
|
+
role (BYPASSRLS=f), not the SUPERUSER bootstrap role. Connection state
|
|
31
|
+
is restored on test teardown.
|
|
32
|
+
|
|
33
|
+
Use this fixture for any test that asserts an RLS policy fires. Use the
|
|
34
|
+
standard `db` fixture only for schema setup or admin-only operations.
|
|
35
|
+
|
|
36
|
+
Pairs with a `pg_container_app` fixture (below) that provisions an
|
|
37
|
+
app-level role with `LOGIN NOBYPASSRLS NOSUPERUSER` matching the
|
|
38
|
+
runtime DSN identity.
|
|
39
|
+
"""
|
|
40
|
+
await db.execute("SAVEPOINT rls_test")
|
|
41
|
+
try:
|
|
42
|
+
await db.execute(f"SET LOCAL ROLE {APP_ROLE_NAME}")
|
|
43
|
+
# If the test sets a tenant ContextVar, wire it through:
|
|
44
|
+
# await db.execute("SELECT set_config('app.current_org_id', $1, true)", org_id)
|
|
45
|
+
yield db
|
|
46
|
+
finally:
|
|
47
|
+
await db.execute("ROLLBACK TO SAVEPOINT rls_test")
+
+
+@pytest.fixture(scope="session")
+def app_role_name() -> str:
+    return APP_ROLE_NAME
+
+
+# ── Container provisioning (run once per test session) ────────────────────
+
+APP_ROLE_NAME = "unionstation_app"  # Match production DSN identity
+
+
+async def provision_app_role(admin_conn: asyncpg.Connection) -> None:
+    """
+    Create the runtime non-owner role inside the test container. Mirrors
+    production: NOLOGIN if password-less; LOGIN with a fixed test password
+    if the test harness needs to connect as this role directly.
+
+    NOSUPERUSER and NOBYPASSRLS are the load-bearing settings. Without these,
+    the role retains FORCE RLS bypass and the fixture buys you nothing.
+    """
+    await admin_conn.execute(f"""
+        DO $$
+        BEGIN
+            IF NOT EXISTS (SELECT FROM pg_roles WHERE rolname = '{APP_ROLE_NAME}') THEN
+                CREATE ROLE {APP_ROLE_NAME}
+                    LOGIN
+                    NOSUPERUSER
+                    NOBYPASSRLS
+                    NOCREATEDB
+                    NOCREATEROLE
+                    PASSWORD 'test_app_password';
+                GRANT USAGE ON SCHEMA public TO {APP_ROLE_NAME};
+                GRANT SELECT, INSERT, UPDATE, DELETE
+                    ON ALL TABLES IN SCHEMA public TO {APP_ROLE_NAME};
+                GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO {APP_ROLE_NAME};
+            END IF;
+        END $$;
+    """)
+
+
+# ── Usage example ─────────────────────────────────────────────────────────
+
+
+@pytest.mark.asyncio
+async def test_rls_blocks_cross_org_select(db_as_app: asyncpg.Connection) -> None:
+    """
+    Use db_as_app — NOT db — for any RLS-policy assertion. Under the
+    SUPERUSER `db` fixture, this test would pass even if the policy were
+    deleted.
+    """
+    await db_as_app.execute(
+        "SELECT set_config('app.current_org_id', '1', true)"
+    )
+    rows = await db_as_app.fetch("SELECT id, org_id FROM people")
+    assert all(row["org_id"] == 1 for row in rows), \
+        "RLS allowed cross-org rows under FORCE — policy is broken or role has BYPASSRLS=t"
+
+
+# ── Asynccontextmanager variant for non-pytest contexts ───────────────────
+
+
+@asynccontextmanager
+async def as_app_role(conn: asyncpg.Connection) -> AsyncIterator[asyncpg.Connection]:
+    """
+    Imperative variant of the fixture for scripts and one-off RLS exercises.
+    """
+    await conn.execute("SAVEPOINT as_app_role")
+    try:
+        await conn.execute(f"SET LOCAL ROLE {APP_ROLE_NAME}")
+        yield conn
+    finally:
+        await conn.execute("ROLLBACK TO SAVEPOINT as_app_role")
+
+
+# ── Anti-patterns ─────────────────────────────────────────────────────────
+#
+# 1. Using `db` fixture for RLS assertions. SUPERUSER bypass means
+#    every test passes regardless of policy correctness. Production blows up.
+#
+# 2. Provisioning the app role with BYPASSRLS=t "for convenience." Defeats
+#    the entire FORCE RLS deployment.
+#
+# 3. SET ROLE without SAVEPOINT. Test pollution: subsequent tests run under
+#    whichever role the previous test ended in.
+#
+# 4. Skipping ROLLBACK TO SAVEPOINT in the finally branch. Connection
+#    pooling will hand out a connection still scoped to APP_ROLE_NAME.
+#
+# 5. SET LOCAL ROLE inside an asyncpg pool callback (which runs outside any
+#    transaction). Use SET ROLE (session-scoped) plus explicit RESET ROLE
+#    on connection release. See field report #319 §1 — same trap surfaced
+#    in M-04c W3.
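Anti-pattern 5 above can be illustrated with a small sketch. This is not part of the shipped pattern file: `as_app_role_session`, `ExecutesSQL`, and `RecordingConnection` are hypothetical names, and the recording connection is a stand-in used so the example runs without a database. It assumes only that the real connection exposes an awaitable `execute(query)`, as `asyncpg.Connection` does.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, List, Protocol

APP_ROLE_NAME = "unionstation_app"  # same identity the fixture provisions


class ExecutesSQL(Protocol):
    """Minimal surface used here (hypothetical); asyncpg.Connection.execute fits it."""

    async def execute(self, query: str) -> object: ...


@asynccontextmanager
async def as_app_role_session(conn: ExecutesSQL) -> AsyncIterator[ExecutesSQL]:
    """Hypothetical session-scoped variant for code that runs outside a transaction."""
    # SET LOCAL only lasts until the end of the current transaction, so with no
    # transaction open it has no lasting effect. Plain SET ROLE is session-scoped.
    await conn.execute(f"SET ROLE {APP_ROLE_NAME}")
    try:
        yield conn
    finally:
        # Always hand the connection back under its original role.
        await conn.execute("RESET ROLE")


class RecordingConnection:
    """Test double that records statements instead of talking to Postgres."""

    def __init__(self) -> None:
        self.statements: List[str] = []

    async def execute(self, query: str) -> str:
        self.statements.append(query)
        return "OK"


async def demo() -> List[str]:
    conn = RecordingConnection()
    try:
        async with as_app_role_session(conn):
            raise RuntimeError("simulated failure inside the role scope")
    except RuntimeError:
        pass
    return conn.statements


STATEMENTS = asyncio.run(demo())
```

Even when the body raises, the `finally` branch issues `RESET ROLE`, so a pooled connection would not be returned while still scoped to the app role.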
@@ -0,0 +1,134 @@
+"""
+Pattern: Structural SQL Sentinel (with adversarial-test discipline)
+
+Source: Field report #319 §3. V083 sentinel #2 originally used a single regex
+matching `current_setting(...) = ''`. Three Wave 3 reviewers independently
+flagged that the regex misses commuted (`'' = current_setting(...)`),
+cast (`current_setting(...)::text = ''`), IS NULL canonical
+(`current_setting(...) IS NULL`), and coalesce-wrapped variants. Each missed
+form is a future fail-open re-introduction the sentinel is supposed to block.
+
+A single-form structural sentinel is a single point of failure. This pattern
+documents the discipline: every structural sentinel has positive controls,
+adversarial alternation tests, AND fixture-bindability proof.
+
+Use this pattern for any SQL-shape policing — fail-open detection in RLS
+policies, dangerous catalog reads, deprecated function calls, plaintext
+storage in encrypted columns.
+"""
+
+import re
+import pytest
+from typing import Iterable
+
+
+# ── The Sentinel ──────────────────────────────────────────────────────────
+
+# Comprehensive regex that matches all known fail-open forms. Each
+# alternation is a CVE-class pattern that has bitten production at least once.
+FAIL_OPEN_RE = re.compile(
+    r"""
+    (
+        # Direct equality
+        current_setting\([^)]*\)\s*=\s*'' |
+        # Commuted (Postgres doesn't canonicalize operand order)
+        ''\s*=\s*current_setting\([^)]*\) |
+        # Cast on the function call
+        current_setting\([^)]*\)\s*::\s*\w+\s*=\s*'' |
+        # IS NULL form (treats unset GUC as fail-open)
+        current_setting\([^)]*\)\s*IS\s+NULL |
+        # COALESCE wrap
+        coalesce\(\s*current_setting\([^)]*\)\s*,\s*''\s*\)\s*=\s*''
+    )
+    """,
+    re.IGNORECASE | re.VERBOSE,
+)
+
+
+def policy_is_fail_open(policy_qual: str) -> bool:
+    """Return True if the policy expression contains a known fail-open arm."""
+    return bool(FAIL_OPEN_RE.search(policy_qual))
+
+
+# ── Positive controls (must trigger) ─────────────────────────────────────
+
+POSITIVE_FORMS = [
+    # Direct
+    "current_setting('app.current_org_id', true) = ''",
+    # Whitespace tolerance
+    "current_setting('app.current_org_id', true)   =   ''",
+    # Commuted
+    "'' = current_setting('app.current_org_id', true)",
+    # Cast
+    "current_setting('app.current_org_id', true)::text = ''",
+    # IS NULL
+    "current_setting('app.current_org_id', true) IS NULL",
+    # COALESCE wrap
+    "coalesce(current_setting('app.current_org_id', true), '') = ''",
+]
+
+
+# ── Negative controls (must NOT trigger) ──────────────────────────────────
+
+NEGATIVE_FORMS = [
+    # The legitimate org_id check the sentinel is protecting
+    "org_id = current_setting('app.current_org_id', true)::int",
+    "org_id::text = current_setting('app.current_org_id', true)",
+    # Other unrelated comparisons
+    "deleted_at IS NULL",
+    "tenant_id = (SELECT id FROM tenants WHERE name = 'system')",
+]
+
+
+# ── Adversarial-bindability test ──────────────────────────────────────────
+
+
+@pytest.mark.parametrize("form", POSITIVE_FORMS)
+def test_sentinel_catches_fail_open_form(form: str) -> None:
+    """Every known fail-open variant must trip the sentinel."""
+    assert policy_is_fail_open(form), \
+        f"SENTINEL GAP: form did not trip — '{form}'"
+
+
+@pytest.mark.parametrize("form", NEGATIVE_FORMS)
+def test_sentinel_does_not_false_positive(form: str) -> None:
+    """Legitimate policy expressions must not trip the sentinel."""
+    assert not policy_is_fail_open(form), \
+        f"FALSE POSITIVE: legitimate form tripped — '{form}'"
+
+
+# ── Fixture-bindability proof ─────────────────────────────────────────────
+#
+# A structural sentinel is meaningful only if it can FAIL on a deliberate
+# regression. Test that, too:
+
+
+def test_sentinel_can_bind() -> None:
+    """
+    Construct a deliberate regression and assert the sentinel catches it.
+    If this assertion ever flips, either the regex was changed silently
+    or the fail-open form is no longer detectable. Either way, an alert
+    is mandatory.
+    """
+    deliberate_regression = "current_setting('x', true) = ''"
+    assert policy_is_fail_open(deliberate_regression), \
+        "BINDABILITY FAILURE: sentinel cannot fail under any input — it's a no-op"
+
+
+# ── Anti-patterns ─────────────────────────────────────────────────────────
+#
+# 1. SUBSTRING match instead of regex (LIKE '%...%'). Misses commuted, cast,
+#    and IS NULL forms. Field report #319 §3.
+#
+# 2. Single regex variant without alternation. The first migration author who
+#    writes a different form silently re-introduces the class.
+#
+# 3. Positive controls only. Without negative controls, false positives
+#    flood the alert channel and reviewers learn to ignore them.
+#
+# 4. No bindability proof. A sentinel that algebraically cannot fail is a
+#    no-op. See /docs/patterns/adr-verification-gate.md.
+#
+# 5. Sentinel lives in one place (CI grep) without a database-side mirror.
+#    Belt-and-suspenders: lint the policy text in CI AND assert
+#    policy_is_fail_open() against pg_policies.qual at runtime.
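The database-side mirror named in anti-pattern 5 might look roughly like the sketch below. It is an illustration under assumptions, not code from the package: `find_fail_open_policies` and `demo_is_fail_open` are hypothetical names, the rows are hand-written stand-ins for the result of `SELECT tablename, policyname, qual FROM pg_policies`, and the predicate is a reduced single-form regex so the block runs on its own. In real use the pattern file's `policy_is_fail_open` would be passed in instead.

```python
import re
from typing import Callable, Iterable, List, Optional, Tuple

# (tablename, policyname, qual), shaped like rows read from pg_policies.
PolicyRow = Tuple[str, str, Optional[str]]

# Reduced stand-in predicate (direct-equality form only) so this block is
# self-contained. Assumption: the full sentinel would replace it in practice.
_DEMO_RE = re.compile(r"current_setting\([^)]*\)\s*=\s*''", re.IGNORECASE)


def demo_is_fail_open(qual: str) -> bool:
    return bool(_DEMO_RE.search(qual))


def find_fail_open_policies(
    rows: Iterable[PolicyRow],
    is_fail_open: Callable[[str], bool],
) -> List[str]:
    """Return 'table.policy' for every policy whose USING text trips the sentinel."""
    offenders: List[str] = []
    for tablename, policyname, qual in rows:
        # qual is NULL when a policy has no USING expression (e.g. INSERT-only).
        if qual is not None and is_fail_open(qual):
            offenders.append(f"{tablename}.{policyname}")
    return offenders


# Hand-written sample rows; the catalog stores deparsed text with explicit casts.
ROWS: List[PolicyRow] = [
    ("people", "org_isolation",
     "(org_id = (current_setting('app.current_org_id'::text, true))::integer)"),
    ("people", "legacy_open",
     "((current_setting('app.current_org_id'::text, true) = ''::text) OR (org_id = 1))"),
    ("audit_log", "insert_only", None),
]

OFFENDERS = find_fail_open_policies(ROWS, demo_is_fail_open)
```

A runtime check could then assert that the offender list is empty. Because catalog text is deparsed with casts such as `''::text`, the positive controls may need entries copied from real `pg_policies.qual` output as well as from migration source.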
@@ -21,4 +21,21 @@ export interface ProjectResult {
     markerId: string;
     filesCreated: number;
 }
+/**
+ * Wire the /contextmeter status line + awareness hook into settings.json (default-on).
+ * Mirrors mergeSettingsHook: set `statusLine` only when the project doesn't already have
+ * one (never clobber a user's), and append the UserPromptSubmit awareness hook unless an
+ * equivalent is already present (idempotent). Defaults (warn 80 / crit 92) live in the
+ * scripts, so the wired commands carry no env prefix.
+ *
+ * Shared by `init` (copyMethodology) and `update` (applyUpdate) so `update` auto-activates
+ * the meter the same way `init` does (#384 follow-up). Returns `true` when it makes — or,
+ * under `{ dryRun: true }`, WOULD make — a change, so the updater can report the pending
+ * settings edit and decide `applied`. `snippetDir` lets the dry-run read the snippet from
+ * the methodology SOURCE before the scripts have been copied into the project.
+ */
+export declare function mergeStatuslineSettings(projectDir: string, opts?: {
+    dryRun?: boolean;
+    snippetDir?: string;
+}): Promise<boolean>;
 export declare function createProject(config: ProjectConfig): Promise<ProjectResult>;
@@ -206,17 +206,30 @@ async function mergeSettingsHook(projectDir) {
  * one (never clobber a user's), and append the UserPromptSubmit awareness hook unless an
  * equivalent is already present (idempotent). Defaults (warn 80 / crit 92) live in the
  * scripts, so the wired commands carry no env prefix.
+ *
+ * Shared by `init` (copyMethodology) and `update` (applyUpdate) so `update` auto-activates
+ * the meter the same way `init` does (#384 follow-up). Returns `true` when it makes — or,
+ * under `{ dryRun: true }`, WOULD make — a change, so the updater can report the pending
+ * settings edit and decide `applied`. `snippetDir` lets the dry-run read the snippet from
+ * the methodology SOURCE before the scripts have been copied into the project.
  */
-async function mergeStatuslineSettings(projectDir) {
-    const
+export async function mergeStatuslineSettings(projectDir, opts = {}) {
+    const snippetDir = opts.snippetDir ?? join(projectDir, 'scripts', 'statusline');
+    const snippetPath = join(snippetDir, 'settings-snippet.json');
     const settingsPath = join(projectDir, '.claude', 'settings.json');
     if (!existsSync(snippetPath))
-        return;
-
+        return false;
+    let snippet;
+    try {
+        snippet = JSON.parse(await readFile(snippetPath, 'utf-8'));
+    }
+    catch {
+        return false;
+    }
     const snippetStatusLine = snippet?.statusLine;
     const snippetUserPrompt = (snippet?.hooks?.UserPromptSubmit ?? []);
     if (!snippetStatusLine && snippetUserPrompt.length === 0)
-        return;
+        return false;
     let settings = {};
     if (existsSync(settingsPath)) {
         try {
@@ -224,15 +237,15 @@ async function mergeStatuslineSettings(projectDir) {
         }
         catch {
             // Existing settings.json is unreadable — leave it alone.
-            return;
+            return false;
         }
     }
-
-        await mkdir(join(projectDir, '.claude'), { recursive: true });
-    }
+    let changed = false;
     // statusLine: never clobber a project's existing one.
     if (snippetStatusLine && !settings.statusLine) {
-
+        if (!opts.dryRun)
+            settings.statusLine = snippetStatusLine;
+        changed = true;
     }
     // UserPromptSubmit: append the awareness hook unless an equivalent is already present.
     const existingHooks = (settings.hooks ?? {});
@@ -242,12 +255,19 @@ async function mergeStatuslineSettings(projectDir) {
         return hooks.some((h) => typeof h?.command === 'string' && h.command.includes('context-awareness-hook'));
     });
     if (!alreadyHasMeter && snippetUserPrompt.length > 0) {
-
-
-
-
+        if (!opts.dryRun) {
+            settings.hooks = {
+                ...existingHooks,
+                UserPromptSubmit: [...existingUserPrompt, ...snippetUserPrompt],
+            };
+        }
+        changed = true;
     }
-
+    if (changed && !opts.dryRun) {
+        await mkdir(join(projectDir, '.claude'), { recursive: true });
+        await writeFile(settingsPath, JSON.stringify(settings, null, 2) + '\n', 'utf-8');
+    }
+    return changed;
 }
 // ── Identity Injection ───────────────────────────────────
 async function injectIdentity(projectDir, config) {
@@ -8,6 +8,7 @@ import { join } from 'node:path';
 import { execSync } from 'node:child_process';
 import { readMarker, writeMarker, DEFAULT_CLAUDE_MD_STRATEGY } from './marker.js';
 import { planClaudeMdUpdate, UPSTREAM_SUFFIX } from './claude-md-strategy.js';
+import { mergeStatuslineSettings } from './project-init.js';
 /**
  * Decide which `update` mode the given argv selects. Pure — no I/O, no exit.
  *
@@ -103,8 +104,9 @@ export async function diffMethodology(projectDir) {
         { src: 'docs/patterns', dest: 'docs/patterns' },
         { src: 'scripts/thumper', dest: 'scripts/thumper' },
         { src: 'scripts/surfer-gate', dest: 'scripts/surfer-gate' },
-        // Context-meter status line + awareness hook (/contextmeter). Scripts propagate
-        //
+        // Context-meter status line + awareness hook (/contextmeter). Scripts propagate here;
+        // activation (statusLine + UserPromptSubmit hook in settings.json) is wired below the
+        // same way `init` does it, so `update` auto-activates the meter too (#384 follow-up).
         { src: 'scripts/statusline', dest: 'scripts/statusline' },
     ];
     // CLAUDE.md is handled via the non-destructive strategy mechanism (issue #368)
@@ -179,6 +181,22 @@ export async function diffMethodology(projectDir) {
             }
         }
     }
+    // /contextmeter activation (#384 follow-up): `update` now wires the statusLine +
+    // awareness hook into .claude/settings.json the same way `init` does. Report the pending
+    // settings change here so `--dry-run` shows it. The snippet is read from the SOURCE — the
+    // project may not have the statusline scripts yet on this update — and the check is
+    // idempotent + non-clobbering (it returns false once the meter is wired, or when the
+    // project already has its own statusLine).
+    const statuslineNeedsWiring = await mergeStatuslineSettings(projectDir, {
+        dryRun: true,
+        snippetDir: join(sourceRoot, 'scripts', 'statusline'),
+    });
+    if (statuslineNeedsWiring) {
+        const settingsEntry = '.claude/settings.json';
+        if (!plan.modified.includes(settingsEntry) && !plan.added.includes(settingsEntry)) {
+            plan.modified.push(settingsEntry);
+        }
+    }
     return plan;
 }
 // ── Apply Update ─────────────────────────────────────────
@@ -214,13 +232,20 @@ export async function applyUpdate(projectDir) {
             await writeFile(`${claudeMdDestPath}${UPSTREAM_SUFFIX}`, claudeMdPlan.sideFileContent, 'utf-8');
         }
     }
-    //
-    //
-
+    // Skip from the generic verbatim copy loop:
+    //  - CLAUDE.md entries: handled above by the non-destructive strategy.
+    //  - .claude/settings.json: wired below by mergeStatuslineSettings, not copied verbatim
+    //    (the methodology source carries no project settings.json — copying it would throw,
+    //    or in dev clobber the project with the methodology repo's own dogfood settings).
+    const skipVerbatim = new Set([
+        'CLAUDE.md',
+        `CLAUDE.md${UPSTREAM_SUFFIX}`,
+        '.claude/settings.json',
+    ]);
     // Copy added + modified files
     const { mkdir } = await import('node:fs/promises');
     for (const file of [...plan.added, ...plan.modified]) {
-        if (
+        if (skipVerbatim.has(file))
             continue;
         const srcPath = join(sourceRoot, file);
         const destPath = join(projectDir, file);
@@ -228,6 +253,12 @@ export async function applyUpdate(projectDir) {
         await mkdir(destDir, { recursive: true });
         await cp(srcPath, destPath);
     }
+    // /contextmeter activation (#384 follow-up): wire the statusLine + awareness hook into
+    // .claude/settings.json so `update` matches `init`'s default-on behavior. The scripts
+    // were copied above; this turns the meter on. Idempotent + non-clobbering — it never
+    // overwrites a user's own statusLine and never duplicates the awareness hook, so it is
+    // safe to run on every update.
+    await mergeStatuslineSettings(projectDir);
     // Update marker version
     const marker = await readMarker(projectDir);
     if (marker) {
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "voidforge-build",
-  "version": "23.
+  "version": "23.22.0",
   "description": "From nothing, everything. A methodology framework for building with Claude Code.",
   "type": "module",
   "engines": {
@@ -45,7 +45,7 @@
     "@aws-sdk/client-rds": "^3.700.0",
     "@aws-sdk/client-s3": "^3.700.0",
     "@aws-sdk/client-sts": "^3.700.0",
-    "voidforge-build-methodology": "^23.
+    "voidforge-build-methodology": "^23.22.0",
     "node-pty": "^1.2.0-beta.12",
     "ws": "^8.19.0"
   },