voidforge-build 23.10.0 → 23.11.0
- package/dist/.claude/agents/bashir-field-medic.md +1 -0
- package/dist/.claude/agents/coulson-release.md +3 -0
- package/dist/.claude/agents/irulan-historian.md +3 -0
- package/dist/.claude/agents/loki-chaos.md +1 -0
- package/dist/.claude/agents/picard-architecture.md +3 -0
- package/dist/.claude/agents/silver-surfer-herald.md +3 -0
- package/dist/.claude/agents/sisko-campaign.md +3 -0
- package/dist/.claude/commands/architect.md +38 -0
- package/dist/.claude/commands/campaign.md +2 -0
- package/dist/.claude/commands/gauntlet.md +11 -0
- package/dist/.claude/commands/git.md +13 -3
- package/dist/CHANGELOG.md +63 -0
- package/dist/CLAUDE.md +13 -4
- package/dist/VERSION.md +2 -1
- package/dist/docs/methods/AI_INTELLIGENCE.md +15 -0
- package/dist/docs/methods/BACKEND_ENGINEER.md +48 -0
- package/dist/docs/methods/CAMPAIGN.md +196 -1
- package/dist/docs/methods/DEVOPS_ENGINEER.md +16 -0
- package/dist/docs/methods/FORGE_KEEPER.md +18 -0
- package/dist/docs/methods/GAUNTLET.md +2 -0
- package/dist/docs/methods/QA_ENGINEER.md +46 -0
- package/dist/docs/methods/RELEASE_MANAGER.md +59 -0
- package/dist/docs/methods/SECURITY_AUDITOR.md +53 -0
- package/dist/docs/methods/SUB_AGENTS.md +90 -0
- package/dist/docs/methods/SYSTEMS_ARCHITECT.md +42 -2
- package/dist/docs/methods/TESTING.md +17 -0
- package/dist/docs/methods/TIME_VAULT.md +17 -0
- package/dist/docs/patterns/adr-verification-gate.md +80 -0
- package/dist/docs/patterns/ai-eval.ts +87 -0
- package/dist/docs/patterns/ai-prompt-safety.ts +242 -0
- package/dist/docs/patterns/audit-log.ts +132 -0
- package/dist/docs/patterns/llm-state-dedup.ts +246 -0
- package/dist/docs/patterns/middleware.ts +83 -0
- package/dist/docs/patterns/multi-tenant-pool-bypass.ts +134 -0
- package/dist/docs/patterns/multi-tenant-property-test.ts +127 -0
- package/dist/docs/patterns/refactor-extraction.md +96 -0
- package/dist/scripts/voidforge.js +0 -0
- package/dist/wizard/lib/anomaly-detection.d.ts +59 -0
- package/dist/wizard/lib/anomaly-detection.js +122 -0
- package/dist/wizard/lib/asset-scanner.d.ts +23 -0
- package/dist/wizard/lib/asset-scanner.js +107 -0
- package/dist/wizard/lib/build-analytics.d.ts +39 -0
- package/dist/wizard/lib/build-analytics.js +91 -0
- package/dist/wizard/lib/codegen/erd-gen.d.ts +16 -0
- package/dist/wizard/lib/codegen/erd-gen.js +98 -0
- package/dist/wizard/lib/codegen/openapi-gen.d.ts +15 -0
- package/dist/wizard/lib/codegen/openapi-gen.js +79 -0
- package/dist/wizard/lib/codegen/prisma-types.d.ts +15 -0
- package/dist/wizard/lib/codegen/prisma-types.js +44 -0
- package/dist/wizard/lib/codegen/seed-gen.d.ts +16 -0
- package/dist/wizard/lib/codegen/seed-gen.js +128 -0
- package/dist/wizard/lib/correlation-engine.d.ts +59 -0
- package/dist/wizard/lib/correlation-engine.js +152 -0
- package/dist/wizard/lib/desktop-notify.d.ts +27 -0
- package/dist/wizard/lib/desktop-notify.js +98 -0
- package/dist/wizard/lib/image-gen.d.ts +56 -0
- package/dist/wizard/lib/image-gen.js +159 -0
- package/dist/wizard/lib/natural-language-deploy.d.ts +30 -0
- package/dist/wizard/lib/natural-language-deploy.js +186 -0
- package/dist/wizard/lib/project-init.js +57 -0
- package/dist/wizard/lib/route-optimizer.d.ts +28 -0
- package/dist/wizard/lib/route-optimizer.js +93 -0
- package/dist/wizard/lib/service-install.d.ts +18 -0
- package/dist/wizard/lib/service-install.js +182 -0
- package/package.json +1 -1
@@ -48,6 +48,7 @@ Structure your debrief as:
 - Always present the full debrief report for user review before any upstream submission. The user approves what gets sent — no silent submissions.
 - Findings must map to VoidForge's vocabulary: agent names, command names, file paths, pattern references. Generic advice like "improve testing" is useless — say which agent, which check, which pattern.
 - Root causes over blame. Trace each failure to its origin category: methodology gap, missing pattern, agent error, user error, or external dependency.
+- **Verifier agents must run an explicit `git diff` against build-agent claims before signing off.** A build report that says "removed `import httpx`" or "added column `org_id` via migration V076" can be wrong — stale grep cache, wrong artifact verified, or a sandbox database confused for the repo. The verification dispatch must read the git diff against HEAD, not just trust the report's prose. Field reports #316 (build agent reported `import httpx` was removed; the import had never been there) and #317 §2 (Spock verified V076 against a sandbox PG database, not the repo's schema) document this failure mode. Mechanical fix: every verifier dispatch ends with `git diff --stat HEAD -- <touched-files>` printed verbatim, and the verifier compares that diff to the build report's claims.

 ## Required Context

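The verifier rule in the hunk above (compare `git diff --stat HEAD -- <touched-files>` against the build report's claims) could be sketched roughly as follows. This is a minimal illustration, not part of the package: the function names `parse_diff_stat` and `unverified_claims` are hypothetical, and the parser assumes plain `--stat` output without renames or truncated paths.

```python
def parse_diff_stat(stat_output: str) -> set[str]:
    """Collect file paths from `git diff --stat` output.

    Assumption: per-file lines look like ' path | 3 ++-'; the trailing
    summary line ('N files changed, ...') has no '|' and is skipped.
    Renames ('a => b') and '...'-truncated paths are not handled.
    """
    touched = set()
    for line in stat_output.splitlines():
        if "|" not in line:
            continue  # summary line or blank line
        path, _, _ = line.partition("|")
        touched.add(path.strip())
    return touched


def unverified_claims(claimed_files: list[str], stat_output: str) -> list[str]:
    """Return claimed files that do not appear in the diff at all."""
    touched = parse_diff_stat(stat_output)
    return sorted(set(claimed_files) - touched)
```

A claimed file that is missing from the diff does not prove the report wrong, but it is exactly the case where the verifier should read the diff itself before signing off.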
@@ -51,6 +51,9 @@ Structure all output as:
 - **Dynamic counts eliminate hardcoded staleness:** Hardcoded numeric claims ("170+ agents", "13 phases") go stale immediately. Replace with computed values derived from the authoritative data source (array length, directory listing, config object keys).
 - **CLAUDE.md is a contract -- every claim must have a backing file:** The slash command table, agent table, and docs reference table are contracts with the user. Every entry must have a corresponding file. No audit step verified table entries against actual files for 30 versions.
 - **Prepack creates a different compilation environment than dev:** After npm publish, verify on a clean install. prepack.sh copies files from repo root into the package directory — paths that work in dev (relative to repo root) may not resolve in the packaged artifact. Functions that exist in dev may become phantom exports if the source file isn't in the prepack manifest. (Field report #297: daemon-core.ts phantom exports blocked npm publish.)
+- **Per-commit CHANGELOG sibling rule:** Commits touching `src/**`, `docs/adrs/**`, or load-bearing `docs/methods/*.md` MUST stage `CHANGELOG.md` alongside. CHANGELOG drift caught at session end is the symptom of per-commit CHANGELOG omission. Reject commits matching those globs without a CHANGELOG entry; the only valid exception is a pure refactor/move (label `chore:`). (Field report #322: test count trajectory was wrong because CHANGELOG was updated only at session boundaries.)
+- **Pre-push lint sweep:** Before `git push`, discover all `scripts/check-*` and `scripts/lint_*` executables and run each. Push is blocked on any non-zero exit until the finding is resolved (fix code OR add `# <gate>-allowed` waiver with rationale). The orchestrator does not need to know what each script does — only that they all must pass. (Field report #324: 3 separate hotfix loops in one session from forgotten project-specific gates.)
+- **Post-amend SHA pin:** After `git commit --amend`, scan `logs/campaign-state.md` (and `build-state.md`, `gauntlet-state.md` if present) for now-stale SHAs. Pin the post-amend SHA in the state files in the same logical operation as the amend. (Field report #327: every Phase C mission shipped as a `<mission> + <SHA-pin followup>` pair because amends were routine.)

 ## Required Context

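The pre-push lint sweep added in the hunk above amounts to "discover every gate script, run each, block on any non-zero exit". A rough sketch of that loop, assuming the gates take no arguments and run from the repository root (the helper names `discover_gates` and `run_gates` are hypothetical, not taken from the package):

```python
import glob
import os
import subprocess


def discover_gates(repo_root: str) -> list[str]:
    """Find executable files matching scripts/check-* and scripts/lint_*."""
    root = os.path.abspath(repo_root)
    found = []
    for pattern in ("scripts/check-*", "scripts/lint_*"):
        found.extend(glob.glob(os.path.join(root, pattern)))
    # Keep regular files with the executable bit only.
    return sorted(p for p in found if os.path.isfile(p) and os.access(p, os.X_OK))


def run_gates(repo_root: str) -> list[str]:
    """Run every discovered gate; return the ones that exited non-zero."""
    root = os.path.abspath(repo_root)
    failures = []
    for gate in discover_gates(root):
        result = subprocess.run([gate], cwd=root)
        if result.returncode != 0:
            failures.append(gate)
    return failures
```

In this sketch an empty return value from `run_gates` means the push may proceed; waiver handling (`# <gate>-allowed`) is left to the individual scripts.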
@@ -5,6 +5,8 @@ heralding: "Princess Irulan opens the chronicle. Your system's documentation wil
 model: haiku
 tools:
 - Read
+- Write
+- Edit
 - Grep
 - Glob
 ---
@@ -22,6 +24,7 @@ You are Princess Irulan, chronicler of Muad'Dib. You record and audit documentat
 - Check that ADRs exist for significant architectural decisions
 - Verify changelogs reflect actual changes
 - Flag stale documentation that contradicts current code
+- **When the brief asks you to write or update a file, write it.** Your tools include Write and Edit — use them. Returning an audit report when the brief asked for a file produces a wasteful orchestrator redirect. If the file's structure is uncertain, draft the file with TODO markers rather than returning prose. (Field report #322: returned audit text instead of `docs/adrs/INDEX.md` because the prior tool list was Read/Grep/Glob only.)

 ## Output Format

@@ -46,6 +46,7 @@ Findings tagged by severity, with file and line references:
 - Exploit type coercion: `"0"` vs `0` vs `false` vs `null` vs `undefined`. JavaScript's loose equality creates entire categories of bugs.
 - Identify denial-of-service vectors: unbounded loops, regex backtracking (ReDoS), memory bombs (e.g., parsing a 10GB JSON), recursive structures without depth limits.
 - Find information leakage: error messages that include stack traces, debug endpoints left enabled, verbose logging of sensitive data.
+- **Production cohabitation check.** When a host runs both staging and production stacks, verify the test stack cannot reach the production stack's resources: separate API keys (`grep API_KEY prod/.env staging/.env | md5sum`), separate Redis namespaces with auth, no shared Unix group membership (`id staging-user | grep prod-group`), Docker ports bound to `127.0.0.1` not `0.0.0.0` (`docker ps --format '{{.Ports}}'`), DSN allowlists. **Docker port bindings bypass UFW** — verify with `ss -tlnp`, not `ufw status`. Field reports #316 §11 + #241 + #243: cohabitation gaps surface as test stacks accidentally writing to production storage. Loki's job is to *try* the cross-stack path and confirm it's blocked.

 ## Required Context

@@ -50,6 +50,9 @@ Severity: CRITICAL (blocks ship) > HIGH (must fix before prod) > MEDIUM (fix soo
 - **Branch-before-destroying (Operating Rule 8):** Before any destructive git operation (`git rm`, `git revert`, `git reset`, `git checkout --`), verify current branch with `git branch --show-current`. Never run destructive ops on `main` without explicit intent. (Field report #281: scaffold cleanup ran on main instead of scaffold, required 272-file restoration.)
 - **Stubs ship as features:** When stubs are committed "to be implemented later," they almost never are. The codebase grows around them, tests don't cover them, and users encounter stubs as production failures. If a feature can't be fully implemented, don't create the file -- document it in ROADMAP.md.
 - **CLAUDE.md is a contract:** Every entry in the slash command table, agent table, and docs reference table must have a corresponding file. Audit table entries against actual files. (Field report #108: `/dangerroom` listed for 30 versions with no backing file.)
+- **Spec-vs-code review are not the same review.** Code-vs-ADR review confirms the implementation matches the spec. Spec-adversary review confirms the spec is correct. For non-trivial methodology ADRs (statistical, security, financial, identity, multi-tenant), require BOTH passes before Stark implements. The bug that sinks production is usually in the spec, not the code. (Field report #322: ADR-069 FWER family scoping was wrong in the spec; four agents signed off on code-vs-ADR.)
+- **Signing-path audit:** for every file that produces a cryptographic signature (EIP-712, EIP-191, action hashes, HMAC for webhooks, JWT signing, OAuth state signing), verify a golden-vector test exists pinning byte-identical output for fixed inputs. Asymmetry across signing paths in the same codebase is a known regression vector — the test the author didn't write is the one that catches the SDK upgrade that breaks production. (Field report #323: barrierwatch HL had a golden vector; PM did not. 35-agent /architect synthesis caught it.)
+- **Scope-confidence interval on callsite-counted ADRs:** when an ADR's effort estimate is denominated in callsite/file count, require EITHER a verifying grep with pinned `n=N` OR an explicit "±X×" uncertainty annotation. Point estimates are a methodology bug. (Field reports #328 + #329: M-48c.1 estimated 5 lines → 24 references; F-V710-ORG1-DEFAULTS estimated 12 → 65 sites.)

 ### Agent-invented constraints require operator confirmation

@@ -96,6 +96,9 @@ DEPLOYMENT REMINDER: You MUST now launch an Agent sub-process for EVERY agent li
 - **The command's hardcoded manifest is the floor, not the ceiling.** Your job is to add specialists the command didn't think to include. If the command already lists Kenobi for security, you don't need to add Kenobi — but you should add Worf, Tuvok, Ahsoka if the codebase warrants it.
 - **Your roster must be deployed IN FULL.** The orchestrator will be tempted to cherry-pick "key specialists" from your roster. This defeats your purpose. Your curation IS the filter — however many you select, all of them deploy. (Field report: voidforge.build — orchestrator cherry-picked from the roster, admitted it was wrong.)
 - **You MUST be launched. No exceptions.** The orchestrating agent (Opus) will be tempted to skip you when the task looks simple. "4 content-only missions" or "just a text fix" are NOT valid reasons to skip. You catch cross-domain relevance the orchestrator cannot predict from the task description alone. If you are not launched, the command violates protocol. (Field report: voidforge.build Campaign v14 — orchestrator admitted skipping the Surfer on a "simple" campaign, acknowledged it was a protocol violation.)
+- **Returned roster names MUST match `.claude/agents/*.md` basenames exactly.** No `voidforge-` prefix, no display-name aliases, no character-name shorthand. The orchestrator dispatches by basename — a name like `voidforge-systems-architect` or `picard` (without `-architecture` suffix) blocks the launch on the first mismatch. If uncertain, run `ls .claude/agents/` and copy the literal filename minus `.md`. (Field report #318: Surfer twice returned `voidforge-`-prefixed names; each cost 30-60s of orchestrator translation per dispatch.)
+- **Rosters >20 agents need explicit framing.** On mature codebases the optimize-for-coverage instinct can return 50-200 agents in one pass. Past ~25 agents, marginal signal-to-noise drops sharply. Either narrow scope first via `--focus`, or annotate the roster: *"Core N required; remaining are advisory — orchestrator may prune if context is constrained."* Do NOT return raw 50+ rosters and expect deployment. (Field reports #315 + #316 + #318: 53-agent /assess, 218-agent /architect, 58-agent /campaign --plan rosters all required orchestrator pruning.)
+- **Track over-count vs find-count ratio across rounds.** When 3+ agents in a roster flag the same finding in Round 1 of a Gauntlet, that's overlap not signal — the marginal agent added redundancy, not coverage. Across a campaign, if the over-include heuristic consistently produces <50 unique findings per round from 130-agent rosters, soften over-include in subsequent rounds for the same campaign. The rule shifts from "over-include, never under-include" (first pass) to "tighten after de-duplication is observable" (later passes). (Field report #325: 130-agent roster recommended; ~30 actually deployed; Round 1 had Picard A4 + Stark S-009 + Kenobi K-12 all naming the same `pending_actions.json` schema-version gap — three agents finding the same thing in three universes.)

 ## Required Context

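The basename rule in the hunk above (roster names must equal `.claude/agents/*.md` filenames minus `.md`) is mechanical enough to check before dispatch. A minimal sketch, assuming the roster is a plain list of strings; the function names are hypothetical and not taken from the package:

```python
from pathlib import Path


def valid_agent_names(agents_dir: str) -> set[str]:
    """Basenames (filename minus .md) of every agent definition file."""
    return {p.stem for p in Path(agents_dir).glob("*.md")}


def invalid_roster_entries(roster: list[str], agents_dir: str) -> list[str]:
    """Roster entries with no exactly matching agent file, in roster order."""
    known = valid_agent_names(agents_dir)
    return [name for name in roster if name not in known]
```

With `picard-architecture.md` present, a roster of `["picard-architecture", "picard", "voidforge-systems-architect"]` would flag the last two entries, which matches the failure mode the field report describes.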
@@ -53,6 +53,9 @@ Mission briefs follow: Objective, Scope (files/features), Acceptance Criteria, A
 - **Phase completion is NOT a pause point:** In blitz mode, phase boundaries (Phase 1 -> Phase 2) are organizational labels, not gates or rest stops. The only pause triggers are: (1) context >85%, (2) BLOCKED item requiring user input. (Field report #139: agent stopped at phase boundaries twice despite explicit instructions.)
 - **Numeric context checks:** Do not say "context is heavy," "given context usage," or "recommend a fresh session" unless you have run `/context` and the number exceeds 85%.
 - **Cross-reference learnings when generating artifacts:** When generating mission artifacts (code, configs, agent definitions), cross-reference `docs/LEARNINGS.md` and `docs/LESSONS.md` before writing. Artifacts generated from a single source (e.g., NAMING_REGISTRY.md only) will contain 3-12% of operational knowledge. The learnings files contain hard-won rules that prevent repeat failures. (Field report #297: 263 agents deployed without learnings injection — required a full remediation campaign.)
+- **Pause-bias is an anti-pattern in autonomous mode.** When a mission completes, mark the next mission in_progress and START. Status updates are fine ("M-X shipped at <sha>; starting M-Y"). Decision-frame questions are NOT ("continue or pause?"). "Strategic checkpoint" and "context budget" are not valid pause rationales below 85% context. (Field report #323: barrierwatch Phase 2 — operator pushed back on mid-campaign pause-bias; rationalizations had been rejected via project memory prior.)
+- **ROADMAP path disambiguation:** if both `ROADMAP.md` (root) and `docs/ROADMAP.md` exist, root is canonical. Verify with `head -20` on both before reading. (Field report #323.)
+- **Cluster-mission recognition at plan time:** if a single mission entry spans 4+ ADR sections, 4+ sub-components, or 4+ migration steps, split into sub-missions (M-Xa/b/c/d) at plan time, not at execution time. (Field report #326: v7.10 Sisko slate was 9 missions; reality was 21 because clusters weren't recognized upfront.)

 ## Required Context

@@ -82,8 +82,39 @@ In autonomous/blitz mode: append every AGENT_INVENTED constraint to `needs_opera

 Evidence: BarrierWatch campaign (field report #304) invented a $20 kill switch and $50/$50 capital split that took ~90 minutes to remove across 39 files. Both propagated into ROADMAP, source modules, config YAML, tests, and an ADR before the operator reviewed the design.

+## Step 4.6 — Schema-vs-ADR Cross-Check (Spock + Worf)
+
+Before any ADR claiming a property of an existing table or callsite is marked Accepted, validate the claim against code reality. Field reports #312, #313, #316 document a pattern where ADRs say *"every tenant-touching table has `org_id`"* or *"X primitive landed in mission Y"* — and downstream missions discover the claim was aspirational. SQL errors at build time, ~1 day mid-mission rescoping per occurrence.
+
+For each ADR with a "Implementation Scope" or "Existing State" claim, run:
+
+1. **Existing-table claims** → grep schema files for the column/constraint/index. Do NOT trust prose. Spock confirms with `grep -nE "^\s*org_id\s+(INTEGER|UUID|BIGINT)" schema*.sql` per claim.
+2. **"X already landed in mission Y" claims** → Worf empirically inspects the referenced files. If the claim is about a security primitive (paper-gate, allowlist, RLS policy), verification is mandatory before any downstream mission treats it as scope-reduction.
+3. **File-path claims** → `[ -f <path> ] && echo present || echo MISSING` for every path the ADR cites as a deliverable. Reject "Fully implemented in vX.Y" framing for paths that don't exist at HEAD.
+
+If verification fails, the ADR's status is `Proposed`, not `Accepted`, until the gap is closed. Do not apply Riker's review to an unverified claim — the reviewer is testing the *decision*, not the *factual ground state*.
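Checks 1 and 3 of Step 4.6 and the resulting status rule could be combined into one small gate. The sketch below is illustrative only: the function names are hypothetical, the regex mirrors the `grep -nE` pattern quoted in the step, and like that grep it confirms only that some `org_id` column exists in the schema text, not which table declares it.

```python
import re
from pathlib import Path

# Same pattern as the grep in check 1, applied line by line via MULTILINE.
ORG_ID_COLUMN = re.compile(r"^\s*org_id\s+(INTEGER|UUID|BIGINT)", re.MULTILINE)


def schema_declares_org_id(schema_sql: str) -> bool:
    """True if any line of the schema text declares an org_id column."""
    return ORG_ID_COLUMN.search(schema_sql) is not None


def missing_deliverables(paths: list[str], repo_root: str = ".") -> list[str]:
    """Deliverable paths cited by the ADR that are not files under repo_root."""
    root = Path(repo_root)
    return [p for p in paths if not (root / p).is_file()]


def adr_status(schema_ok: bool, missing: list[str]) -> str:
    """Apply the rule: any failed verification keeps the ADR at Proposed."""
    return "Accepted" if schema_ok and not missing else "Proposed"
```

Check 2 (empirical inspection of "already landed" claims) is deliberately left out, since the step assigns it to a reviewer reading the referenced files.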
+
+## Step 4.7 — Implementation Rehearsal for Infrastructure ADRs (Stark or domain lead)
+
+ADRs that specify async lifecycle hooks, connection-pool callbacks, middleware initialization, DB function bodies, signal handlers, or daemon orchestration MUST be spiked against the real library API before Wave 2 sign-off. 4-hour timebox.
+
+Examples of ADRs that require rehearsal:
+- *"Pool callback uses `SET LOCAL` to set the GUC"* — actually fails empirically because `SET LOCAL` is transaction-scoped and the pool callback runs outside any caller-owned transaction. (Field report #316 §2 — would have shipped a tenant-isolation invariant that silently no-ops.)
+- *"Lifespan handler initializes ContextVar"* — rehearsal needs a non-owner role identity to surface RLS-strict behavior the dev superuser silently bypasses.
+- *"Middleware emits `logger.critical` per request"* — rehearsal at expected RPS exposes per-request log flooding.
+
+Rehearsal output: a runnable snippet (or test) that exercises the spec end-to-end, plus a one-line affirmative result *"Rehearsed at <commit-sha>; behavior matches spec"* in the ADR body. Code-level prose without a run record is not rehearsed.
+
 ## Step 5 — ADRs + Decision Review
 Write Architecture Decision Records to `/docs/adrs/` for every non-obvious choice. After writing, **Riker** `subagent_type: Riker` reviews: challenges trade-offs, verifies alternatives were truly considered, checks for second-order effects.
+
+**Spec adversary pass (BEFORE implementation begins):** For non-trivial methodology ADRs (statistical, security, financial, identity, multi-tenant), launch an adversarial agent in parallel with Riker — **Feyd-Rautha**, **Maul**, or **Loki** depending on domain. Their job is different from Riker's. Riker asks "do the trade-offs hold up?" The adversary asks "is the SPECIFICATION asking the right question? Does the algebraic intersection of constraints contain the desired solution? What failure mode did the spec not name?"
+
+Field report #322 (barrierwatch FWER): ADR-069 specified "filter family by p-value alone." Four agents reviewed code-vs-ADR and all signed off. The bug was in the spec — the family should have been scoped to runs that passed the per-run gate. It surfaced when production produced a false-positive alert. A spec-adversary pass would have caught it before implementation.
+
+The rule: code-vs-ADR review confirms fidelity; spec-adversary review confirms correctness. Both run before Stark implements.
+
+**ADR filename rule (ADR-044, field report #315 M6):** Use the orchestrator-assigned filename verbatim. Do NOT also write at the next-sequential ADR number when the number space is contested. When 80+ agents write ADRs in parallel, dual-numbering produces collision pairs that require pre-commit deduplication. One agent, one ADR, one filename — the orchestrator owns the namespace.
 ```
 # ADR-001: [Title]
 ## Status: Accepted
@@ -91,6 +122,13 @@ Write Architecture Decision Records to `/docs/adrs/` for every non-obvious choic
 ## Decision: [What was decided]
 ## Consequences: [Trade-offs, what this enables, what this prevents]
 ## Alternatives: [What else was considered and why it was rejected]
+
+## Implementation Scope
+- **Reality anchor:** Does this ADR describe work that exists at HEAD?
+- YES → "Fully implemented in vX.Y." (verify each deliverable with `ls`/`grep` before writing)
+- NO → "Proposed — to be implemented in vX.Y PR." (do NOT mark Accepted)
+- **Deliverables:** [enumerated paths + a 1-line existence-check command]
+- **Verification gate:** [the test/check that proves the fix is correct, with a Fixture Bindability proof — see `/docs/patterns/adr-verification-gate.md`]
 ```

 ## Conflict Resolution
@@ -27,6 +27,8 @@ The Prophets have shown me the path. Time to execute the plan.

 **In blitz mode, make ALL decisions autonomously. Never ask the user a question. If uncertain, choose the option that preserves quality (e.g., run the Gauntlet, not skip it). The only human interaction in blitz mode is the final completion summary.**

+**Pause-bias anti-pattern (autonomous-by-default per ADR-043 — also applies outside `--blitz`).** When a mission completes, the orchestrator's next action is to mark the next mission `in_progress` and start. Status updates are FINE ("M-7.4 shipped at `6d7f5b3`. Starting M-7.5."); decision-frame questions are NOT ("Continue with M-7.5 or pause?"). Do NOT rationalize a pause as "strategic checkpoint," "context budget management," or "natural milestone." The only valid pause triggers are: (1) `/context` >85% with the actual number cited, (2) a BLOCKED item, (3) a Critical /assemble finding with no auto-fix, (4) the user interrupts. (Field report #323: operator pushed back sharply on a mid-campaign pause-bias instance; the rationalizations had been rejected before via project memory.) See `CAMPAIGN.md` "Pause-Bias Anti-Pattern" for full rules.
+
 **Blitz per-mission checklist** (verify ALL before continuing to next mission):
 1. `/assemble` completed
 2. `/git` committed
@@ -108,6 +108,17 @@ If the Council finds issues:
 2. Re-run the Council (max 2 iterations).
 3. If not converged after 2 rounds, present remaining findings to the user.

+## Production-Parity Verification (mandatory before The Snap)
+
+Before Thanos can render verdict, verify that the test execution backend matches the project's declared production backend (`PROJECT_VERSION.md` Stack section, or equivalent). Run:
+
+```bash
+grep -nE "_backend\s*=\s*['\"]" tests/conftest.py 2>/dev/null
+grep -iE "database|backend|stack" PROJECT_VERSION.md CLAUDE.md 2>/dev/null | head
+```
+
+If `tests/conftest.py` autouse fixture pins a non-prod backend (e.g., `_backend = "sqlite"` while prod is PostgreSQL), the Gauntlet **FAILS** regardless of green test counts. Tests pinned to the wrong backend exercise none of the production-relevant integrations (RLS, asyncpg, advisory locks, LISTEN/NOTIFY, FOR UPDATE SKIP LOCKED). Field report #315 M3: this slipped past 4 dual-backend Union Station Gauntlets. Fail loud.
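The parity rule above can also be expressed as a pure function over the text of `tests/conftest.py`. This is a sketch under stated assumptions: the regex follows the first `grep` in the hunk, the production backend name is supplied by the caller instead of being parsed from `PROJECT_VERSION.md`, and aliases such as `postgres` versus `postgresql` are not normalized. The names `pinned_backends` and `parity_failures` are hypothetical.

```python
import re

# Matches assignments such as: _backend = "sqlite"  or  _backend='postgresql'
BACKEND_PIN = re.compile(r"""_backend\s*=\s*['"]([^'"]+)['"]""")


def pinned_backends(conftest_source: str) -> list[str]:
    """Every backend name pinned via a `_backend = "..."` assignment."""
    return BACKEND_PIN.findall(conftest_source)


def parity_failures(conftest_source: str, prod_backend: str) -> list[str]:
    """Pinned backends that differ (case-insensitively) from production."""
    expected = prod_backend.lower()
    return [b for b in pinned_backends(conftest_source) if b.lower() != expected]
```

Under the rule in the hunk, any non-empty result from `parity_failures` would fail the Gauntlet regardless of how many tests passed.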
+
 ## The Snap — Thanos's Verdict

 **If all domains sign off:**
@@ -35,9 +35,18 @@ Determine the version bump:
 5. User confirms or overrides

 ## Step 3 — Chronicle (Wong)
+
+**Disambiguation: project changelog vs methodology changelog.**
+
+If `PROJECT_VERSION.md` exists at repo root, it is the project's changelog (version history). The repo's `CHANGELOG.md` is voidforge-methodology-scoped (versions match the methodology package, not the project) — do NOT edit it for project work. Update `PROJECT_VERSION.md`'s "Current" / "In Progress" / "Next" lines and add a row to the Version History table instead.
+
+If only `CHANGELOG.md` exists, follow the standard flow — that's a methodology repo or a single-version-history project.
+
+If both files exist and the project is a downstream consumer of VoidForge, the project's history goes in `PROJECT_VERSION.md` and the methodology's bundled CHANGELOG is read-only (Bombadil owns it via `/void` sync). Field report #320 §5 documents the confusion this caused before the disambiguation was written.
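The disambiguation above reduces to a fixed precedence: `PROJECT_VERSION.md` at the repo root wins, otherwise `CHANGELOG.md`. A minimal sketch of that lookup, with a hypothetical function name and a `None` return for the case the text does not cover (neither file present):

```python
from pathlib import Path
from typing import Optional


def active_changelog(repo_root: str) -> Optional[Path]:
    """Pick the file that release notes for the project should go into.

    PROJECT_VERSION.md at the repo root takes precedence; CHANGELOG.md is
    used only when it is the sole candidate. Returns None if neither exists.
    """
    root = Path(repo_root)
    for name in ("PROJECT_VERSION.md", "CHANGELOG.md"):
        candidate = root / name
        if candidate.is_file():
            return candidate
    return None
```

Keeping the precedence in one place means the later consistency check can ask for the same "active changelog" instead of re-deriving it.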
+
 Update all version files:
-1. Read the top of
-2. Write new
+1. Read the top of the **active changelog** (PROJECT_VERSION.md if present at repo root, else CHANGELOG.md) — ~30 lines for format reference.
+2. Write the new entry at the top (after the header), using the categories from Step 1:
 - User-facing language, not file-level details
 - Group by Added/Changed/Fixed/Removed/Security
 - Include today's date
@@ -62,8 +71,9 @@ Confirm everything is consistent:
 2. Check version consistency:
 - `VERSION.md` current version matches
 - `package.json` version matches
--
+- The **active changelog** (PROJECT_VERSION.md if present, else CHANGELOG.md) has an entry for this version
 - Commit message starts with the correct version tag
+- **ROADMAP.md cross-check (field report #309 Fix 4):** if `ROADMAP.md` exists, grep it for the new version string. If milestones in ROADMAP.md reference a higher version than `package.json`, that's drift — surface it and offer to bump. If ROADMAP claims a milestone is "DONE" at a version that doesn't match the just-committed bump, surface that too. Drift between ROADMAP and package.json typically goes unnoticed for weeks.
 3. Run `git status` — verify working tree is clean (no forgotten files)
 4. If any inconsistency found, flag it and offer to fix

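The first half of the ROADMAP.md cross-check in the hunk above (milestones that reference a higher version than `package.json`) could be approximated as below. This is a sketch with assumptions stated up front: versions are plain `X.Y.Z` without pre-release tags, any `X.Y.Z`-shaped token in the roadmap text counts as a version reference, and the function name is hypothetical. The "DONE at a mismatched version" half of the check is not covered.

```python
import re

# Assumption: plain three-part versions, optionally prefixed with 'v'.
VERSION_TOKEN = re.compile(r"\bv?(\d+)\.(\d+)\.(\d+)\b")


def roadmap_versions_ahead(roadmap_text: str, package_version: str) -> list[tuple[int, int, int]]:
    """Versions mentioned in the roadmap that are higher than package.json's."""
    current = tuple(int(part) for part in package_version.split("."))
    mentioned = {
        (int(major), int(minor), int(patch))
        for major, minor, patch in VERSION_TOKEN.findall(roadmap_text)
    }
    # Tuples compare element by element, which matches numeric version order.
    return sorted(v for v in mentioned if v > current)
```

A non-empty result is the drift signal the bullet asks to surface; whether to bump `package.json` or correct the roadmap stays a human decision.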
package/dist/CHANGELOG.md
CHANGED
@@ -6,6 +6,69 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/), and this

 ---

+## [23.11.0] - 2026-05-10
+
+### Field Report Triage — 18 reports closed (#313–#320, #322–#330)
+
+Combined two-batch triage. Batch 1 covers multi-tenant retrofit campaigns and Union Station v7.7-v7.9 closeouts (#313-#320). Batch 2 covers autonomous-mode campaigns + AI-execution agent reports from threadplex-ops, barrierwatch, and Union Station v7.10-v7.11 (#322-#330). 9 new patterns, 18+ methodology sections, operational learnings on 7 agents. No breaking changes.
+
+### Added
+
+**New patterns (9):**
+- **`docs/patterns/adr-verification-gate.md`** — Fixture Bindability discipline. Every ADR's verification gate must include "Can the gate FAIL under this fixture?" with algebraic/empirical rationale. Reality-anchored Implementation Scope (Proposed vs Accepted vs Deferred). Sum-verification for numbered-cohort ADRs. (#313, #314, #316, #318.)
+- **`docs/patterns/audit-log.ts`** — System-event NULL trap resolution: schema relaxation (NULL org_id) vs sentinel+JSONB tag. Append-only invariants. Hash-chained integrity. (#319 §6.)
+- **`docs/patterns/multi-tenant-property-test.ts`** — Property-based isolation test: any orgs A,B; A's writes never appear in B's reads. (#315, #316.)
+- **`docs/patterns/multi-tenant-pool-bypass.ts`** — `pre_org_resolution_scope()` ContextVar wrapper for cross-tenant lifespan/daemon code. (#318, #319.)
+- **`docs/patterns/rls-test-fixture.py`** — `db_as_app` SAVEPOINT pattern defeating the SUPERUSER + BYPASSRLS=t fixture trap. (#318, #319.)
+- **`docs/patterns/structural-sql-sentinel.py`** — Adversarial-test discipline for SQL regex sentinels: commuted comparisons, casts, IS NULL, coalesce coverage. (#320.)
+- **`docs/patterns/refactor-extraction.md`** — 8-commit per-entity large-refactor template with IDOR matrix discipline. (#320.)
+- **`docs/patterns/ai-prompt-safety.ts`** — Type A (instructions to model, statistical) vs Type B (constraints on tool, enforced); AUTHORITY-as-text caveat; SafetyStack reference shape; 3 anti-patterns. (#325, #330.)
+- **`docs/patterns/llm-state-dedup.ts`** — LLM-emitted ids are display labels, not primary keys; content-hash dedup; logical-key fallback for command-string drift; lifecycle-state snapshot completeness. (#330.)
+
+**Pattern extensions:**
+- **`docs/patterns/ai-eval.ts`** — `CLAUDE_PROMPT_EVAL_CATEGORIES` template (prompt-structure invariants, sanitizer round-trip, refusal stability on Tier-3 inputs, JSON schema adherence, cost regression). Bayta's 7-test bats spec as reference. (#325.)
+- **`docs/patterns/middleware.ts`** — Hot-path logging gate (fireOnce / shouldEmit token-bucket) preventing observability-pipeline DoS from naked `logger.critical()` per-request. (#319 §5.)
+
+**New methodology sections:**
+- **`docs/methods/SYSTEMS_ARCHITECT.md`** — Scope-confidence interval for callsite-counted ADRs (verifying grep with pinned `n=N` OR ±X× uncertainty); spec adversary pass before implementation; signing-path audit requirement; service-extraction test-patch checklist. (#322, #323, #324, #326, #328, #329.)
+- **`docs/methods/CAMPAIGN.md`** — Closeout grep pinning (reciprocal to scope-confidence); cluster-mission recognition at plan time; pause-bias anti-pattern (autonomous mode); ROADMAP path disambiguation; pre-split blocker phase; caller-graph audit for silent-default abstractions; V710 acceptance template inheritance counter; operator decision documents (`logs/campaign-decisions-{version}.md`); LOC growth tracker per-mission. (#322, #323, #326, #327, #329.)
+- **`docs/methods/SECURITY_AUDITOR.md`** — Sanitizer Bypass-Class Checklist (7 classes: case-fold, em-dash, novel marker, newline-split, char-class, encoding, length boundary). (#325.)
+- **`docs/methods/QA_ENGINEER.md`** — Strict-Mode Audit Classification (no cosmetic/WARN downgrade without behavioral evidence); Telegram-bot group-chat suffix test. (#325, #330.)
+- **`docs/methods/SUB_AGENTS.md`** — Intentionally Overlapping Mandates (3+ agents on same diff = high-signal convergence); Sub-Agent Review Contract (WARN/cosmetic requires unreachable proof OR real-path test); Agent Capability Matrix. (#322, #324, #330.)
|
|
38
|
+
- **`docs/methods/BACKEND_ENGINEER.md`** — AST Lints Are Cheap (contracts with 8+ duplicates → AST lint + baseline + `--regenerate-baseline`). (#324.)
|
|
39
|
+
- **`docs/methods/RELEASE_MANAGER.md`** — Per-Commit CHANGELOG Discipline (src/**, docs/adrs/**, methods/*.md commits must stage CHANGELOG); Pre-Push Lint Sweep (run all `scripts/check-*`); Post-Amend SHA Pin (detect stale state-file SHAs after `git commit --amend`). (#322, #324, #327.)
|
|
40
|
+
- **`docs/methods/GAUNTLET.md`** + **`.claude/commands/gauntlet.md`** — Production-Parity Exit Criterion (test backend must match production declared in PROJECT_VERSION.md; mismatch FAILS the round regardless of green tests). (#315 M3.)
|
|
41
|
+
- **`docs/methods/AI_INTELLIGENCE.md`** — Event-Ladder Severity Gradient (info < warning < error < fatal monotonic; climactic rung must be fatal). (#319 §4.)
|
|
42
|
+
- **`docs/methods/DEVOPS_ENGINEER.md`** — Production Runtime Topology Authoritative-Source (single supervisor; reconcile `systemctl status` vs `ps -ef` before deploy). (#319 §7.)
|
|
43
|
+
- **`docs/methods/FORGE_KEEPER.md`** — Distribution-vs-Source Drift Check (every CLAUDE.md-cited path must exist post-sync). (#317.)
|
|
44
|
+
- **`docs/methods/TESTING.md`** — Decreasing-Counter Test Markers (e.g., `known_pg_gap`) for tracked migrations; monotonic counter with mission ownership in campaign-state. (#316 §7.)
|
|
45
|
+
- **`docs/methods/TIME_VAULT.md`** — Verification Pass Before Sealing (live psql + code reads for table count, migration head, schema invariants, file paths, test counts, version numbers). (#318.)
|
|
46
|
+
- **`.claude/commands/git.md`** — Project-vs-methodology changelog disambiguation (PROJECT_VERSION.md vs CHANGELOG.md routing); ROADMAP.md cross-check during verification. (#320 §5, #309 Fix 4.)
|
|
47
|
+
- **`.claude/commands/architect.md`** — Spec-adversary pass before implementation. (#322.)
|
|
48
|
+
- **`.claude/commands/campaign.md`** — Pause-bias anti-pattern mirror. (#323.)
|
|
49
|
+
|
|
50
|
+
**Operational learnings (agent definitions):**
|
|
51
|
+
- **`.claude/agents/picard-architecture.md`** — spec-vs-code review distinction; signing-path audit; scope-confidence interval. (#322, #323, #328.)
|
|
52
|
+
- **`.claude/agents/sisko-campaign.md`** — pause-bias prohibition; ROADMAP path disambiguation; cluster-mission recognition. (#323, #326.)
|
|
53
|
+
- **`.claude/agents/coulson-release.md`** — per-commit CHANGELOG sibling rule; pre-push lint sweep; post-amend SHA pin. (#322, #324, #327.)
|
|
54
|
+
- **`.claude/agents/bashir-field-medic.md`** — verifiers run `git diff` against build-agent claims. (#316, #317 §2.)
|
|
55
|
+
- **`.claude/agents/loki-chaos.md`** — production cohabitation check (Docker port bindings bypass UFW). (#316 §11, #241, #243.)
|
|
56
|
+
- **`.claude/agents/irulan-historian.md`** — added Write + Edit tools; behavioral directive to write files when briefed to write. (#322.)
|
|
57
|
+
- **`.claude/agents/silver-surfer-herald.md`** — over-count vs find-count ratio (soften over-include after de-duplication observable). (#325.)
|
|
58
|
+
|
|
59
|
+
**Distribution (closes ADR-051 #317):**
|
|
60
|
+
- **`packages/methodology/package.json`** — `scripts/surfer-gate/` added to npm `files` array.
|
|
61
|
+
- **`packages/methodology/scripts/prepack.sh`** — copies `scripts/surfer-gate/` into package at publish time.
|
|
62
|
+
- **`packages/voidforge/wizard/lib/project-init.ts`** — `chmodShellScripts()` + `mergeSettingsHook()` ship the Surfer Gate to every new project and merge the PreToolUse hook into `.claude/settings.json`. Consumer installs now get mechanical enforcement, not prose-backstop only.
|
|
63
|
+
|
|
64
|
+
### Changed
|
|
65
|
+
|
|
66
|
+
- **`docs/methods/CAMPAIGN.md`** Step 1 (Dax) — cluster-mission recognition inserted between cross-mission data handoff check and acceptance criteria gate.
|
|
67
|
+
- **`docs/methods/SYSTEMS_ARCHITECT.md`** Step 5 — Riker review extended with spec-adversary pass for non-trivial methodology ADRs.
|
|
68
|
+
- **`CLAUDE.md`** — patterns list updated with the 9 new patterns; total patterns now ~50.
|
|
69
|
+
|
|
70
|
+
---
|
|
71
|
+
|
|
9
72
|
## [23.10.0] - 2026-04-20
|
|
10
73
|
|
|
11
74
|
### Field Report Triage — 6 reports closed (#303–#308)
|
package/dist/CLAUDE.md
CHANGED
|
@@ -37,12 +37,12 @@ ADR-051 enforces this gate at the hook level (PreToolUse). The prose below is th
|
|
|
37
37
|
|
|
38
38
|
**Hook enforcement (ADR-051 Phase 5b — live as of v23.8.14; state relocated per ADR-060 in v23.8.18).** A `PreToolUse` hook on the Agent tool (`scripts/surfer-gate/check.sh`) blocks any sub-agent launch that isn't the Silver Surfer itself, unless a roster has been recorded for this session or a bypass flag is set. State lives at `$XDG_RUNTIME_DIR/voidforge-gate/` (Linux) or `$HOME/.voidforge/gate/` (macOS fallback) — per-user, `0700`. This is the permanent enforcement mechanism. The prose above is a human-readable backup.
|
|
39
39
|
|
|
40
|
-
**Orchestrator contract** (you run these Bash commands at the right moments):
|
|
40
|
+
**Orchestrator contract** (you run these Bash commands at the right moments — wrap each in an existence guard so projects on older methodology versions don't error):
|
|
41
41
|
|
|
42
|
-
1. After the Silver Surfer sub-agent returns its roster, and before launching any other Agent: `bash scripts/surfer-gate/record-roster.sh` (optionally pass the roster JSON as the first argument for audit).
|
|
43
|
-
2. When the user's command includes `--light` or `--solo`, BEFORE launching the Surfer or any other agent: `bash scripts/surfer-gate/bypass.sh --light` (or `--solo`).
|
|
42
|
+
1. After the Silver Surfer sub-agent returns its roster, and before launching any other Agent: `[ -x scripts/surfer-gate/record-roster.sh ] && bash scripts/surfer-gate/record-roster.sh || true` (optionally pass the roster JSON as the first argument for audit). The existence guard is a defensive no-op for projects that predate v23.10.0 — when the gate started shipping via the npm methodology package per #317.
|
|
43
|
+
2. When the user's command includes `--light` or `--solo`, BEFORE launching the Surfer or any other agent: `[ -x scripts/surfer-gate/bypass.sh ] && bash scripts/surfer-gate/bypass.sh --light || true` (or `--solo`). **Fails closed on unknown flag values** (ADR-060 v23.8.18 hardening, SEC-003): passing anything other than `--light` or `--solo` makes `bypass.sh` exit 2 with an error. No silent bypass. Because the guard ends in `|| true`, the combined command still returns 0, so rely on the error message rather than the guard's exit status.
|
|
44
44
|
|
|
45
|
-
If you skip step 1, your first non-Surfer Agent call in that turn will be blocked with a clear message and your own log line in `/tmp/voidforge-session-$SESSION_ID/gate.log`. You are expected to comply with the block (launch Surfer / run record-roster), not to fight it.
|
|
45
|
+
If `scripts/surfer-gate/check.sh` exists but you skip step 1, your first non-Surfer Agent call in that turn will be blocked with a clear message and your own log line in `/tmp/voidforge-session-$SESSION_ID/gate.log`. You are expected to comply with the block (launch Surfer / run record-roster), not to fight it. If the script does not exist, your project predates v23.10.0; pull the gate from `tmcleod3/voidforge:scripts/surfer-gate/` and merge `settings-snippet.json` into `.claude/settings.json`, or re-run `npx voidforge-build init` against the methodology source.
|
|
46
46
|
|
|
47
47
|
**Why.** Seven field incidents (logged in `.claude/agents/silver-surfer-herald.md`) document the cost of skipping: the orchestrator cannot predict cross-domain relevance from the command name alone. The hook makes skipping mechanically impossible for non-bypass cases. Launch the Surfer. Every time.
|
|
48
48
|
|
|
@@ -119,6 +119,15 @@ Reference implementations in `/docs/patterns/`. Match these shapes when writing.
|
|
|
119
119
|
- `e2e-test.ts` — Playwright E2E + axe-core a11y: page objects, auth helpers, network mocks, CWV measurement
|
|
120
120
|
- `combobox.tsx` — Accessible combobox with value source management, keyboard nav, async search (+ HTMX)
|
|
121
121
|
- `kongo-integration.ts` — Landing page engine: client, from-PRD generation, growth signal, webhook handlers
|
|
122
|
+
- `adr-verification-gate.md` — Fixture Bindability discipline: gates that algebraically can fail; reality-anchored Implementation Scope
|
|
123
|
+
- `multi-tenant-property-test.ts` — Property-based isolation test (any orgs A,B; A's writes never appear in B's reads)
|
|
124
|
+
- `multi-tenant-pool-bypass.ts` — `pre_org_resolution_scope()` ContextVar wrapper for cross-tenant lifespan/daemon code
|
|
125
|
+
- `rls-test-fixture.py` — `db_as_app` SAVEPOINT pattern (defeats SUPERUSER + BYPASSRLS=t fixture trap)
|
|
126
|
+
- `structural-sql-sentinel.py` — Adversarial-test discipline for SQL regex sentinels (commuted/cast/IS NULL/coalesce coverage)
|
|
127
|
+
- `audit-log.ts` — System-event NULL trap resolution (schema relaxation vs sentinel+JSONB tag); append-only invariants
|
|
128
|
+
- `refactor-extraction.md` — 8-commit per-entity large-refactor template with IDOR matrix discipline
|
|
129
|
+
- `ai-prompt-safety.ts` — Type A (instructions, statistical) vs Type B (constraints, enforced); AUTHORITY-as-text caveat; defense-in-depth stack
|
|
130
|
+
- `llm-state-dedup.ts` — LLM ids are display labels, not keys; content-hash dedup; lifecycle-state snapshot completeness
|
|
122
131
|
|
|
123
132
|
## Slash Commands
|
|
124
133
|
|
package/dist/VERSION.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
# Version
|
|
2
2
|
|
|
3
|
-
**Current:** 23.
|
|
3
|
+
**Current:** 23.11.0
|
|
4
4
|
|
|
5
5
|
## Versioning Scheme
|
|
6
6
|
|
|
@@ -14,6 +14,7 @@ This project uses [Semantic Versioning](https://semver.org/):
|
|
|
14
14
|
|
|
15
15
|
| Version | Date | Summary |
|
|
16
16
|
|---------|------|---------|
|
|
17
|
+
| 23.11.0 | 2026-05-10 | Field Report Triage — 18 reports closed (#313-#320, #322-#330). Two combined batches across multi-tenant retrofit campaigns (#313-#320) and autonomous-mode + AI-execution campaigns (#322-#330). 9 new patterns: adr-verification-gate.md, audit-log.ts, multi-tenant-{pool-bypass,property-test}.ts, rls-test-fixture.py, structural-sql-sentinel.py, refactor-extraction.md, ai-prompt-safety.ts, llm-state-dedup.ts. ai-eval.ts extended with Claude-prompt-eval template. middleware.ts extended with hot-path logging gate. 18 method-doc sections across CAMPAIGN, SYSTEMS_ARCHITECT, SECURITY_AUDITOR, QA_ENGINEER, SUB_AGENTS, BACKEND_ENGINEER, RELEASE_MANAGER, GAUNTLET, AI_INTELLIGENCE, DEVOPS_ENGINEER, FORGE_KEEPER, TESTING, TIME_VAULT. Surfer Gate now ships via npm methodology package + wizard project-init (closes ADR-051 distribution gap #317). Operational learnings added to Picard, Sisko, Coulson, Bashir, Loki, Irulan, Silver Surfer. |
|
|
17
18
|
| 23.10.0 | 2026-04-20 | Field Report Triage — 6 reports closed (#303-#308). 33 approved fixes: SPEC_HANDOFF.md (new method doc), deploy-preflight.ts + post-deploy-probe.sh (new patterns), LEARNINGS LRN-5..10, Deploy Surface Boundary + Deployment Hygiene (field report #305 remediation), Step 2.5 pre-deploy secret scan + Step 4.5 post-deploy probe, TECH_DEBT SLA, npm-name pre-flight, ADR-050 Rename Verification Checklist, Silver Surfer HARD CONSTRAINT + 4 agent operational learnings. |
|
|
18
19
|
| 23.9.2 | 2026-04-20 | CI workflow idempotency + provenance baseline — `publish.yml` guards each publish with "already-published" check so re-runs skip cleanly. Tag-push re-publishes via CI to attach npm provenance attestation (absent on v23.9.1's manual publish). |
|
|
19
20
|
| 23.9.1 | 2026-04-20 | ADR-061 pivot — `@voidforge` npm org unavailable (squat-adjacent). Rebranded publish target to `voidforge-build` / `voidforge-build-methodology` matching the voidforge.build domain. Migration banner for legacy `thevoidforge` / `@voidforge/cli` installs. Farewell releases + npm deprecate for smooth transition. |
|
package/dist/docs/methods/AI_INTELLIGENCE.md
CHANGED
|
@@ -227,6 +227,21 @@ If issues found, return to Phase 3. Maximum 2 iterations.
|
|
|
227
227
|
### AI Gate Bootstrapping (Cold-Start Problem)
|
|
228
228
|
AI-gated approval systems have a cold-start problem: no historical outcomes -> gate rejects all requests -> no operations -> no outcomes. During the first N decisions (configurable, default 20), the gate should approve at reduced size (0.5-0.7x normal) to build a track record. The gate should never reject solely because "no historical data exists." Include explicit prompt guidance: "Lack of history is not a reason to reject — approve at reduced size to build the track record." (Field report #152)
|
|
229
229
|
|
|
230
|
+
### Event-Ladder Severity Gradient
|
|
231
|
+
|
|
232
|
+
When implementing an event ladder (engagement → escalation → climactic), every rung's Sentry / observability severity must be **strictly louder** than the previous rung: `info < warning < error < fatal`. The climactic event MUST emit at `fatal` and must never be demoted to prose-only logs.
|
|
233
|
+
|
|
234
|
+
Common failure mode (field report #319 §4): the immediate event is `error`, the hourly reminder is `warning`, and the deadline-trip becomes `logger.critical(...)` only — **inverting the gradient.** The most consequential event becomes the LEAST observable. This is structural, not malicious — agents implement events one at a time and lose track of relative severity across rungs.
|
|
235
|
+
|
|
236
|
+
**Implementation checklist:**
|
|
237
|
+
|
|
238
|
+
- [ ] List every rung of the ladder with its Sentry level
|
|
239
|
+
- [ ] Verify strictly monotonic ordering: each rung > the previous
|
|
240
|
+
- [ ] Climactic rung is `fatal` (not `critical`-via-logger-only, not `warning` because "we already alerted earlier")
|
|
241
|
+
- [ ] Add a unit test that asserts the ordering — `assert ladder.severities == sorted(ladder.severities)` or equivalent
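The checklist above can be sketched as a small ordering check. This is a minimal illustration, not the project's implementation: the `Severity` enum, the `LADDER` list, and `ladder_violations()` are hypothetical names, and the sketch assumes the strict reading of the gradient (each rung louder than the one before it, climactic rung at `fatal`).

```python
from enum import IntEnum


class Severity(IntEnum):
    # Numeric values only encode the ordering info < warning < error < fatal.
    INFO = 10
    WARNING = 20
    ERROR = 30
    FATAL = 40


# Hypothetical ladder, listed from the first rung to the climactic rung.
LADDER = [
    ("engagement", Severity.WARNING),
    ("escalation", Severity.ERROR),
    ("climactic", Severity.FATAL),
]


def ladder_violations(ladder):
    """Return a list of problems; an empty list means the gradient holds."""
    problems = []
    # Compare each rung with the one before it; equal severity also counts
    # as a violation because the rule is "strictly louder".
    for (prev_name, prev_sev), (name, sev) in zip(ladder, ladder[1:]):
        if sev <= prev_sev:
            problems.append(
                f"{name} ({sev.name}) is not louder than {prev_name} ({prev_sev.name})"
            )
    # The last rung is the climactic event and must be FATAL.
    if ladder and ladder[-1][1] != Severity.FATAL:
        problems.append(f"climactic rung {ladder[-1][0]} must be FATAL")
    return problems
```

A unit test can then assert `ladder_violations(LADDER) == []`, which fails on an inverted ladder such as error, then warning, then a final rung that is not `fatal`.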
|
|
242
|
+
|
|
243
|
+
Pair with `/docs/patterns/middleware.ts` hot-path logging gate (fire-once or rate-limited emission) so the louder severity doesn't translate to log flooding.
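As a rough illustration of the two emission strategies named above (fire-once and rate-limited), the sketch below shows one possible shape in Python. It is not the contents of `middleware.ts`: the `TokenBucket` and `FireOnce` classes and their `should_emit()` methods are hypothetical names chosen for this example, and the sketch is single-threaded (a real hot path would need a lock or per-worker state).

```python
import time


class TokenBucket:
    """Rate-limited gate: allows bursts up to `capacity`, then `refill_per_second`."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = float(capacity)
        self.refill_per_second = float(refill_per_second)
        self._clock = clock          # injectable so tests can control time
        self._tokens = float(capacity)
        self._last = clock()

    def should_emit(self):
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


class FireOnce:
    """Fire-once gate: the first call per key emits, later calls are suppressed."""

    def __init__(self):
        self._seen = set()

    def should_emit(self, key):
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

A request handler would call `should_emit()` before a high-severity log call and skip the call when it returns `False`, so a failing hot path produces a bounded number of events instead of one per request.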
|
|
244
|
+
|
|
230
245
|
## Anti-Patterns
|
|
231
246
|
|
|
232
247
|
| Anti-Pattern | What Happens | Fix |
|
package/dist/docs/methods/BACKEND_ENGINEER.md
CHANGED
|
@@ -156,6 +156,54 @@ Services deployed in ephemeral environments (containers, serverless, spot instan
|
|
|
156
156
|
|
|
157
157
|
Step 2 (Strange's Service Layer) already mandates "stateless composable services." This subsection makes the requirement concrete: stateless means *reconstructable from durable storage within one cycle*. (Field report #274)
|
|
158
158
|
|
|
159
|
+
### AST Lints Are Cheap (8+ duplicates rule)
|
|
160
|
+
|
|
161
|
+
When a contract has 8 or more duplicates in production code (e.g., "every `complete()` call must pass `org_id`", "no router file may import `app.db.backends`", "every webhook handler must call `verify_signature()` before reading the body"), write an AST-based lint with a baseline-grandfather pattern + `--regenerate-baseline` flag.
|
|
162
|
+
|
|
163
|
+
**The pattern:**
|
|
164
|
+
|
|
165
|
+
1. AST query — walk the AST, identify violations by structural pattern (not regex). Avoids false positives on quoted strings, comments, similarly-named functions in different namespaces.
|
|
166
|
+
2. Baseline file — record existing violations at the time the lint is introduced. The CI gate fails only on NEW violations.
|
|
167
|
+
3. `--regenerate-baseline` flag — operator can refresh the baseline when intentionally adding a violation (with PR comment justifying it).
|
|
168
|
+
|
|
169
|
+
**Why this works:**
|
|
170
|
+
|
|
171
|
+
A contract enforced by a checklist gets violated. A contract enforced by code review gets violated when the reviewer is tired. A contract enforced by AST lint with baseline-grandfather is structurally impossible to silently violate — the CI gate flips RED on the offending PR before merge.
|
|
172
|
+
|
|
173
|
+
**Cost-benefit:**
|
|
174
|
+
|
|
175
|
+
- AST lint authoring: ~30-60 min for the first lint in a project; subsequent lints reuse the framework.
|
|
176
|
+
- Baseline file maintenance: trivial; only changes when violations are intentionally added.
|
|
177
|
+
- Violation prevention: every future regression of the contract is caught at PR time.
|
|
178
|
+
|
|
179
|
+
Field report #324 (Union Station v7.8): F11 broadened UNS001 to catch `app.db.{backends,tenant_pool,rls_fail_open}` imports in router files — flagged 4 grandfathered usages immediately, prevented all future regressions. UNS003 (complete-org-id) shipped with empty baseline — every future M-37-style strict-mode regression is caught at PR time.
|
|
180
|
+
|
|
181
|
+
**Reference implementations from the field reports:**
|
|
182
|
+
|
|
183
|
+
- `scripts/lint_router_no_db.py` (PIC-001, Union Station) — routers must not import DB internals
|
|
184
|
+
- `scripts/lint_complete_org_id.py` (UNS003 / M-37 followup) — every `complete()` call passes `org_id`
|
|
185
|
+
- `scripts/check-org-id-defaults.sh` (ADR-137) — no `org_id: int = 1` defaults outside test fixtures (with `# system-org-allowed` waiver convention)
|
|
186
|
+
|
|
187
|
+
When NOT to AST-lint: contracts with <8 duplicates (use code review), one-off cleanups (do the cleanup; lint isn't needed), evolving contracts (the API hasn't stabilized — lint locks you in).
|
|
188
|
+
|
|
189
|
+
### Lifespan & Daemon ContextVar Coverage (Multi-Tenant FORCE-RLS)
|
|
190
|
+
|
|
191
|
+
When sweeping unscoped tenant-pool callsites for a multi-tenant boundary tightening (org_id, tenant_id, workspace_id), grep alone catches HTTP-middleware-served paths. It misses code that runs *outside* the request lifecycle, where the ContextVar is never set by middleware. Under FORCE-RLS with a non-owner role these paths fail immediately, and they were the load-bearing finding from Union Station's M-05 cutover.
|
|
192
|
+
|
|
193
|
+
**Sweep this whole list, not just `grep acquire`:**
|
|
194
|
+
|
|
195
|
+
| Path class | Why grep misses it | Fix |
|
|
196
|
+
|---|---|---|
|
|
197
|
+
| Lifespan startup (`@app.on_event("startup")`, FastAPI lifespan, Django ready) | One-shot at boot, before middleware sets ContextVar | Wrap with `set_current_org_id(org_id)` per-org loop OR `pre_org_resolution_scope()` for cross-tenant work |
|
|
198
|
+
| Scheduler ticks (cron, Celery beat, conductor sweeps) | Fires on day-boundaries; no HTTP request, no ContextVar | Wrap with `pre_org_resolution_scope()` (admin pool, BYPASSRLS) |
|
|
199
|
+
| Leader-elected daemons (queue cleanup, retention, distributed locks) | Long interval; no HTTP context | Same as scheduler ticks |
|
|
200
|
+
| Pool callbacks (`asyncpg` setup, SQLAlchemy event listeners) | Run outside any caller transaction | `SET LOCAL` does NOT work here — use `set_config(..., is_local=false)` + explicit `RESET` on connection release |
|
|
201
|
+
| Admin endpoints intended to bypass tenant scope | Need explicit admin pool acquisition | Use `_get_db_admin()` — never silently fall back to tenant pool |
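The scoping idea behind the table can be sketched with the standard `contextvars` module. This is an illustrative sketch only, not the contents of `multi-tenant-pool-bypass.ts` or Union Station's code: `pre_org_resolution_scope()` and `TenantInvariantError` are names taken from the text, while `org_scope()`, `choose_pool()`, and the string pool labels are hypothetical stand-ins for real pool acquisition.

```python
import contextvars
from contextlib import contextmanager

# Set by HTTP middleware in a real service; unset in lifespan/daemon code.
_current_org_id = contextvars.ContextVar("current_org_id", default=None)
# Explicit marker for cross-tenant work that runs before any org is resolved.
_pre_org_scope = contextvars.ContextVar("pre_org_resolution_scope", default=False)


class TenantInvariantError(RuntimeError):
    """Raised when tenant-scoped code runs without an org or an explicit bypass."""


@contextmanager
def pre_org_resolution_scope():
    """Mark the enclosed block as intentional cross-tenant work."""
    token = _pre_org_scope.set(True)
    try:
        yield
    finally:
        _pre_org_scope.reset(token)


@contextmanager
def org_scope(org_id):
    """Hypothetical helper: bind one org for the enclosed block (per-org loops)."""
    token = _current_org_id.set(org_id)
    try:
        yield
    finally:
        _current_org_id.reset(token)


def choose_pool():
    """Pick a pool label; fail fast instead of silently defaulting."""
    if _pre_org_scope.get():
        return "admin"
    org_id = _current_org_id.get()
    if org_id is None:
        raise TenantInvariantError("no org_id set and not in pre_org_resolution_scope()")
    return f"tenant:{org_id}"
```

A scheduler tick would wrap its body in `with pre_org_resolution_scope():`, while a lifespan hook that iterates over orgs would use `with org_scope(org_id):` per iteration. Code that does neither raises instead of quietly reaching for a default pool.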
|
|
202
|
+
|
|
203
|
+
**Test fixture trap:** the dev superuser (`postgres`, `us_test`) has `BYPASSRLS=t` and silently bypasses FORCE RLS at the engine level. A test that uses the dev superuser will pass even if the policy is broken. Tests that exercise tenant-isolation invariants must run under the non-owner runtime role (`{project}_app`, BYPASSRLS=f). See `/docs/patterns/rls-test-fixture.py` for the `db_as_app` SAVEPOINT pattern. (Field reports #318 + #319.)
|
|
204
|
+
|
|
205
|
+
**Forensic check before declaring sweep complete:** boot the service under the runtime DSN (not the dev DSN), exercise lifespan + one tick of every scheduled job + one admin-bypass codepath. Watch for `TenantInvariantError`. Anything that raises was missed by the sweep. (Field report #319: 4 lifespan/daemon paths surfaced this way during M-05 cutover.)
|
|
206
|
+
|
|
159
207
|
## Step 5 — Deliverables
|
|
160
208
|
|
|
161
209
|
1. BACKEND_AUDIT.md
|