npm - voidforge-build - Versions diffs - 23.19.0 → 23.20.0 - Mend

voidforge-build 23.19.0 → 23.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (59)

package/dist/.claude/agents/celebrimbor-forge-artist.md +1 -0
package/dist/.claude/agents/ducem-token-economics.md +1 -0
package/dist/.claude/agents/galadriel-frontend.md +1 -0
package/dist/.claude/agents/romanoff-integrations.md +4 -0
package/dist/.claude/agents/silver-surfer-herald.md +19 -4
package/dist/.claude/commands/architect.md +4 -3
package/dist/.claude/commands/assemble.md +12 -0
package/dist/.claude/commands/assess.md +1 -0
package/dist/.claude/commands/build.md +8 -0
package/dist/.claude/commands/contextmeter.md +56 -0
package/dist/.claude/commands/debrief.md +10 -0
package/dist/.claude/commands/engage.md +5 -0
package/dist/.claude/commands/git.md +13 -1
package/dist/.claude/commands/imagine.md +1 -1
package/dist/.claude/commands/seal.md +80 -0
package/dist/.claude/commands/ux.md +13 -0
package/dist/.claude/workflows/gauntlet.workflow.js +13 -1
package/dist/CHANGELOG.md +38 -0
package/dist/CLAUDE.md +8 -0
package/dist/HOLOCRON.md +16 -2
package/dist/VERSION.md +2 -1
package/dist/docs/methods/AI_INTELLIGENCE.md +3 -0
package/dist/docs/methods/ASSEMBLER.md +12 -0
package/dist/docs/methods/BUILD_PROTOCOL.md +7 -0
package/dist/docs/methods/CAMPAIGN.md +11 -0
package/dist/docs/methods/DEVOPS_ENGINEER.md +56 -0
package/dist/docs/methods/FIELD_MEDIC.md +1 -0
package/dist/docs/methods/FORGE_ARTIST.md +3 -4
package/dist/docs/methods/GAUNTLET.md +6 -0
package/dist/docs/methods/MUSTER.md +2 -0
package/dist/docs/methods/PRODUCT_DESIGN_FRONTEND.md +18 -0
package/dist/docs/methods/QA_ENGINEER.md +17 -1
package/dist/docs/methods/RELEASE_MANAGER.md +27 -0
package/dist/docs/methods/SECURITY_AUDITOR.md +11 -1
package/dist/docs/methods/SUB_AGENTS.md +31 -0
package/dist/docs/methods/SYSTEMS_ARCHITECT.md +15 -0
package/dist/docs/methods/TESTING.md +2 -0
package/dist/docs/methods/TROUBLESHOOTING.md +2 -2
package/dist/docs/methods/WORKFLOWS.md +14 -0
package/dist/docs/patterns/ai-prompt-safety.ts +85 -0
package/dist/docs/patterns/data-pipeline.ts +59 -1
package/dist/docs/patterns/exclusion-set-invariant.md +62 -0
package/dist/docs/patterns/multi-tenant-property-test.ts +64 -0
package/dist/docs/patterns/oauth-token-lifecycle.ts +21 -0
package/dist/scripts/statusline/README.md +38 -0
package/dist/scripts/statusline/context-awareness-hook.sh +53 -0
package/dist/scripts/statusline/settings-snippet.json +17 -0
package/dist/scripts/statusline/voidforge-statusline.sh +91 -0
package/dist/scripts/voidforge.js +69 -6
package/dist/wizard/lib/claude-md-strategy.d.ts +87 -0
package/dist/wizard/lib/claude-md-strategy.js +198 -0
package/dist/wizard/lib/marker.d.ts +48 -1
package/dist/wizard/lib/marker.js +58 -2
package/dist/wizard/lib/patterns/oauth-token-lifecycle.d.ts +14 -0
package/dist/wizard/lib/patterns/oauth-token-lifecycle.js +21 -0
package/dist/wizard/lib/project-init.js +59 -0
package/dist/wizard/lib/updater.d.ts +19 -0
package/dist/wizard/lib/updater.js +84 -33
package/package.json +2 -2

package/dist/docs/methods/WORKFLOWS.md CHANGED

@@ -53,6 +53,7 @@ return { confirmed: claims.filter((c,i) => verdicts[i]?.survives) }
 5. **Cost lever:** route cheap stages with `agent(p, {model:'haiku'})` (scout pre-scans) and reserve the default model for synthesis — the way the Surfer already runs on Haiku.
 6. **`agentType` resolves by the agent's `name:` display field, NOT the filename** (e.g. `'Picard'`, not `'picard-architecture'`). A filename-style `agentType` fails to resolve and the `agent()` call returns `null` (silently filtered by `.filter(Boolean)`), so the agent simply never runs. If a roster carries both, pass `a.name`. Same rule as the Agent tool's `subagent_type`.
 7. **Validate before shipping:** a workflow script's top-level `await`/`return` make a bare `node --check` fail ("Illegal return statement") — that is expected (the runtime wraps the body in an async fn). Use `npm run validate:workflows` (wired into `pretest`), which reproduces the wrapper before checking, so a real syntax error is caught in CI rather than shipping to npm.
+8. **Repro scratch goes to `mktemp`, never the repo tree** (#366 F5). A workflow's adversarial/repro agents that reproduce a finding via shell (probe scripts, atomic-write `.tmp` files, fixture dirs) MUST write to `$(mktemp -d)` (or `$(mktemp)` for a single file) — isolated, auto-cleaned, invisible to `git add -A`. Never write probe scripts or scratch into the working tree: the gauntlet's gate-race repro littered `.gate-repro-scratch/` and `scripts/surfer-gate/.*-probe.sh` into the repo on two separate runs and was nearly committed. The agent prompt that asks for a shell repro must say *where* to write it. Projects may also `.gitignore` a designated scratch path as a backstop, but the primary rule is `mktemp`. (Same rule for raw Agent dispatch — see `SUB_AGENTS.md`.)
 ## Gate interop (ADR-064) — REQUIRED
@@ -71,6 +72,19 @@ The 264 personas, the Agent Debate Protocol, severity re-rating from votes, the
 Every Workflow run persists its script + a journal. To resume after an edit/kill: `Workflow({scriptPath, resumeFromRunId})` — unchanged `agent()` calls return cached results; the first edited call and everything after re-runs.
+## Recovery — after `/clear` or a crash (#366 F1)
+A background workflow survives **neither** `/clear` **nor** a host crash. Both leave the launching task's output empty (0-byte) or partial — the run did not finish synthesizing, even though the journal on disk may hold dozens of completed `agent()` results. The reflex is to re-run from scratch; for a 60–80-agent gauntlet that throws away ~80 minutes and the token cost of every cached agent. **Resume FIRST.**
+**Recovery procedure:**
+1. **Record the `runId` at launch.** `/gauntlet` and `/assemble` write the workflow `runId` to their state file (and the vault) the moment they invoke the Workflow tool, so a fresh post-`/clear` session can find it. If you don't have it, the runtime can list recent runs for the script.
+2. **On an empty or partial task-output, resume — don't restart.** `Workflow({ scriptPath, resumeFromRunId })` replays the journal: every unchanged `agent()` call returns its cached result instantly, and execution continues from the first incomplete call through the final synthesis. You pay only for what didn't finish.
+3. **Empty-output handling is not "the run failed."** A 0-byte output means the *lead's task* was interrupted, not that the agents didn't run. Check the journal/`runId` before concluding the work was lost.
+4. **What survives:** the script source and the per-call result journal (so cached `agent()` results survive). **What does NOT survive:** in-flight agents at crash time (re-run on resume), and any repro scratch the agents wrote (gone with `mktemp`, as it should be — Gotcha 8). If you *edited* the script after the crash, resume re-runs from the first changed call forward; an unchanged script resumes cleanly.
+Re-running from scratch is correct only when no `runId` is recoverable. Treat blind restart as the fallback, not the default.
 ## Related
 - `SUB_AGENTS.md` — dispatch discipline, model/effort tiering, the find→verify review shape, fan-out residual sweeps.
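The recovery procedure in this hunk reduces to one decision: resume when a `runId` is recoverable, and restart only when it is not. A minimal TypeScript sketch of that decision follows. Only the `{ scriptPath, resumeFromRunId }` argument shape comes from the text; `TaskOutput`, `RecoveryPlan`, and `chooseRecovery` are hypothetical names used for illustration and are not part of the Workflow runtime.

```typescript
// Hypothetical sketch of the "resume FIRST" rule. Only the
// { scriptPath, resumeFromRunId } shape is taken from the text above.
type TaskOutput = { bytes: number; complete: boolean };

type RecoveryPlan =
  | { action: 'none' }
  | { action: 'resume'; args: { scriptPath: string; resumeFromRunId: string } }
  | { action: 'restart'; args: { scriptPath: string } };

function chooseRecovery(
  scriptPath: string,
  output: TaskOutput,
  recordedRunId: string | undefined,
): RecoveryPlan {
  // A finished run needs no recovery at all.
  if (output.complete) return { action: 'none' };
  // Empty (0-byte) or partial output: the lead's task was interrupted, but the
  // journal may still hold cached agent results, so resume when a runId exists.
  if (recordedRunId !== undefined && recordedRunId.length > 0) {
    return { action: 'resume', args: { scriptPath, resumeFromRunId: recordedRunId } };
  }
  // Blind restart is the fallback, used only when no runId is recoverable.
  return { action: 'restart', args: { scriptPath } };
}
```

The `resume` branch carries exactly the arguments the text passes to `Workflow(...)`, so a caller could hand `plan.args` straight to it.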

package/dist/docs/patterns/ai-prompt-safety.ts CHANGED

@@ -288,10 +288,95 @@ const conferenceUrlField: UntrustedExtractionField = {
  * surface the raw value on the review surface for operator edit.
  */
+// --- Deny-list discipline (forbidden-inference / forbidden-token filters) ---
+/**
+ * Pattern for a deny-list that strips or rejects forbidden content an LLM might
+ * emit — e.g. a compliance filter that must NOT let the model infer or assert a
+ * subject's wealth, accreditation, or citizenship. A naive "does the output
+ * contain any banned token?" substring/regex filter false-fires three ways and
+ * is silently un-testable a fourth. Field report #378 (InvestorGraph) hit all
+ * four on a compliance-critical forbidden-inference filter:
+ *
+ *   1. NEGATION / DISCLAIMER false-positive
+ *      The model correctly writing "*no* accreditation evidence" or "citizenship
+ *      unknown" is the SAFE answer — yet a bare token match strips it and
+ *      penalizes the model for being careful. The filter must scope matches to
+ *      POSITIVE assertions: if a negation/disclaimer cue sits adjacent to the
+ *      banned token, the mention is not a leak.
+ *
+ *   2. PROPER-NOUN false-positive
+ *      A contact employed at "Visa", a fund literally named "Trust Fund", a
+ *      company "BIG RICH LLC", a "...High Net Worth Community" group — the banned
+ *      substring appears inside a legitimate entity name the model is allowed to
+ *      report. An allowlist of known proper nouns (and the entity's own
+ *      attribute values — employer, company, group names) must suppress the match.
+ *
+ *   3. HOMOGLYPH / ZERO-WIDTH evasion (false-NEGATIVE — the dangerous direction)
+ *      An adversary (or a quirk of upstream data) writes "аccredited" with a
+ *      Cyrillic 'а', or splits the token with a zero-width joiner, and the banned
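+ *      term sails through. NFKC-normalize and strip zero-width / combining marks
+ *      BEFORE matching so compatibility variants collapse to the canonical form
+ *      the deny-list is written against. NFKC alone leaves cross-script
+ *      homoglyphs (Cyrillic 'а') untouched, so also fold confusables to Latin.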
+ *
+ *   4. TAUTOLOGICAL EVAL (the un-testable trap)
+ *      The safety EVAL's leak-detector must be INDEPENDENT of the production
+ *      filter. If the eval re-imports the same deny-list / regex the filter uses,
+ *      it is structurally incapable of catching the filter's gaps — every term
+ *      the filter misses, the eval also misses, so the eval reports PASS on a
+ *      real leak. Testing a filter with itself is vacuous. The leak-detector
+ *      must be built from an independent oracle (a hand-curated banned-phrase
+ *      set, a second model, an LLM-judge, or human labels).
+ */
+export interface DenyListPolicy {
+  forbiddenTerms: string[]          // canonical, post-NFKC banned tokens/phrases
+  normalizeBeforeMatch: 'nfkc-strip-zerowidth'  // ALWAYS normalize first (guard #3)
+  negationGuard: {                  // guard #1 — a nearby negation/disclaimer un-flags the match
+    enabled: true
+    cues: string[]                  // e.g. ['no', 'not', 'unknown', 'unverified', 'absent', 'lacks']
+    windowTokens: number            // how many tokens of adjacency count as "negating" the term
+  }
+  properNounAllowlist: string[]     // guard #2 — names containing a banned substring that are OK
+  allowEntityAttributeValues: boolean  // guard #2 — also exempt the entity's own employer/company/group fields
+  evalLeakDetector: 'independent'   // guard #4 — MUST NOT reuse this policy's forbiddenTerms
+}
+const accreditationDenyList: DenyListPolicy = {
+  forbiddenTerms: ['accredited', 'net worth', 'high net worth', 'citizenship', 'wealthy'],
+  normalizeBeforeMatch: 'nfkc-strip-zerowidth',
+  negationGuard: {
+    enabled: true,
+    cues: ['no', 'not', 'unknown', 'unverified', 'absent', 'lacks', 'without', 'cannot confirm'],
+    windowTokens: 4,
+  },
+  properNounAllowlist: ['Visa', 'Trust Fund', 'BIG RICH LLC', 'High Net Worth Community'],
+  allowEntityAttributeValues: true,
+  evalLeakDetector: 'independent',
+}
+/* ANTI-PATTERN 5: bare substring/regex deny-list with a self-referential eval
+ *
+ * 'We strip any output line containing a banned term, and our safety eval
+ *  greps the output for the same banned terms — 11/11 pass, ship it.'
+ *
+ * No. Four failures, three loud and one silent:
+ *   - "no accreditation evidence" (the SAFE answer) is stripped + penalized.
+ *   - A contact at "Visa" / a "Trust Fund" is flagged on a proper noun.
+ *   - "аccredited" (Cyrillic а) or a zero-width-split token slips through.
+ *   - The eval reuses the filter's deny-list, so it CANNOT fail on a leak the
+ *     filter misses — 11/11 is a tautology, not evidence of safety.
+ *
+ * Fix: NFKC-normalize + strip zero-width BEFORE matching (defeats evasion);
+ * scope matches to positive assertions via a negation-adjacency guard; suppress
+ * proper-noun / entity-attribute matches via an allowlist; and build the eval's
+ * leak-detector from an INDEPENDENT oracle so it can actually fail.
+ */
 export {
   authorityInstruction,
   denyListEnforcement,
   fsPermsEnforcement,
   threadplexAgentStack,
   conferenceUrlField,
+  accreditationDenyList,
 }
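The `DenyListPolicy` in this hunk is declarative, so a matcher that honors guards #1 to #3 may help make it concrete. The sketch below is one possible reading, not the package's implementation: `DenyListSketch`, `normalizeForMatch`, `findLeaks`, and the five-entry `CONFUSABLES` table are all hypothetical. The table is there because Unicode NFKC does not map Cyrillic letters to their Latin look-alikes; a production filter would more likely use a full confusables data set. Guard #4 (an independent eval oracle) is a process rule and is out of scope here.

```typescript
// Illustrative deny-list matcher; every name here is an assumption.
interface DenyListSketch {
  forbiddenTerms: string[];
  negationCues: string[];
  windowTokens: number;
  properNounAllowlist: string[];
}

// Tiny sample of Cyrillic letters that NFKC leaves as-is (a, e, o, c, p look-alikes).
const CONFUSABLES: Record<string, string> = {
  '\u0430': 'a', '\u0435': 'e', '\u043E': 'o', '\u0441': 'c', '\u0440': 'p',
};

function normalizeForMatch(text: string): string {
  return text
    .normalize('NFKC')                              // compatibility forms
    .replace(/[\u200B-\u200D\u2060\uFEFF]/g, '')    // zero-width characters
    .replace(/[\u0300-\u036F]/g, '')                // leftover combining marks
    .replace(/[\u0430\u0435\u043E\u0441\u0440]/g, (ch) => CONFUSABLES[ch] ?? ch);
}

function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Guard #1: a cue within `windowTokens` tokens on either side un-flags the match.
function isNegated(tokens: string[], start: number, len: number, p: DenyListSketch): boolean {
  const before = tokens.slice(Math.max(0, start - p.windowTokens), start);
  const after = tokens.slice(start + len, start + len + p.windowTokens);
  const context = ` ${before.join(' ')} | ${after.join(' ')} `;
  return p.negationCues.some((cue) => context.includes(` ${cue.toLowerCase()} `));
}

function findLeaks(output: string, policy: DenyListSketch): string[] {
  let text = normalizeForMatch(output);             // guard #3 runs first
  for (const name of policy.properNounAllowlist) {  // guard #2: mask allowed names
    if (name.length === 0) continue;
    text = text.split(normalizeForMatch(name)).join(' ');
  }
  const tokens = tokenize(text);
  const leaks: string[] = [];
  for (const term of policy.forbiddenTerms) {
    const termTokens = tokenize(normalizeForMatch(term));
    if (termTokens.length === 0) continue;
    for (let i = 0; i + termTokens.length <= tokens.length; i++) {
      const hit = termTokens.every((t, k) => tokens[i + k] === t);
      if (hit && !isNegated(tokens, i, termTokens.length, policy)) {
        leaks.push(term);
        break;
      }
    }
  }
  return leaks;
}
```

Proper nouns are masked case-sensitively before lowercasing, so an allowlisted "Trust Fund" does not also exempt a lowercase "trust fund" assertion. Folding Cyrillic to Latin would be wrong for text that is legitimately Cyrillic, so the fold is best limited to the matching copy of the text and never applied to what is shown to users.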

package/dist/docs/patterns/data-pipeline.ts CHANGED

@@ -10,6 +10,11 @@
  * - Batch vs streaming mode toggle — same stages, different execution
  * - Error handling: skip-and-log vs fail-fast configurable per pipeline
  * - Progress reporting callback for observability
+ * - Source-format discovery BEFORE assuming CSV — the first stage detects the
+ *   real input format and dispatches to a SourceAdapter. Never hardcode
+ *   `read_csv`. A "giant contact dump" is frequently NOT a CSV (field report
+ *   #378: a 4k-row export arrived as an Apple Contacts `.abbu` SQLite bundle).
+ *   See the SourceAdapter section in Framework Adaptations below.
  *
  * Agents: Stark (backend), Banner (data), L (monitoring)
  *
@@ -250,6 +255,56 @@ export {
   checkNullRate, checkRange, computeDedupHash,
 };
+// ── Source Adapter (format discovery — field report #378) ──────────────
+//
+// The PRD says "CSV" but the real authorized source is often something else.
+// A pipeline's FIRST stage must DISCOVER the format and dispatch to an adapter,
+// never assume CSV. Each adapter normalizes its source into the same record
+// shape the rest of the pipeline consumes (e.g. a flat contact row). Adding a
+// source = adding an adapter, not editing every downstream stage.
+//
+//   type SourceFormat = 'csv' | 'vcard' | 'sqlite-contacts' | 'json';
+//
+//   /** Sniff the format from extension + magic bytes — do NOT trust the name alone. */
+//   function detectSourceFormat(path: string, head: Buffer): SourceFormat {
+//     const ext = path.toLowerCase();
+//     if (ext.endsWith('.vcf')) return 'vcard';                       // vCard text
+//     if (ext.endsWith('.abbu') || ext.endsWith('.abcddb')) return 'sqlite-contacts'; // Apple Contacts store
+//     if (head.subarray(0, 15).toString() === 'SQLite format 3') return 'sqlite-contacts'; // 16-byte magic ends in NUL, so compare 15 bytes
+//     if (ext.endsWith('.json')) return 'json';
+//     if (head[0] === 0x42 && head[1] === 0x45 && head[2] === 0x47) return 'vcard'; // "BEG" of BEGIN:VCARD
+//     return 'csv';
+//   }
+//
+//   interface SourceAdapter { read(path: string): Promise<Record<string, unknown>[]>; }
+//
+//   // --- vCard (.vcf) ------------------------------------------------------
+//   // STUB: parse with a vCard lib (e.g. `vcf`/`ical.js`); map FN/EMAIL/TEL/ORG
+//   // to the canonical contact record. A single .vcf can hold many VCARD blocks.
+//   const vcardAdapter: SourceAdapter = {
+//     async read(_path) { throw new Error('Implement: split on BEGIN:VCARD, map FN/EMAIL/TEL/ORG'); },
+//   };
+//
+//   // --- SQLite contact stores (.abbu bundle / .abcddb) -------------------
+//   // STUB: an Apple Contacts `.abbu` is a BUNDLE containing an `.abcddb` SQLite
+//   // file; open read-only and SELECT from ZABCDRECORD/ZABCDEMAILADDRESS etc.
+//   // (schema varies by macOS version — probe table names, don't hardcode).
+//   const sqliteContactsAdapter: SourceAdapter = {
+//     async read(_path) { throw new Error('Implement: open .abcddb read-only, join ZABCDRECORD + email/phone tables'); },
+//   };
+//
+//   // --- JSON export -------------------------------------------------------
+//   // STUB: many providers export a JSON array (or NDJSON); validate with Zod
+//   // before mapping — exported JSON is untyped and frequently partial.
+//   const jsonAdapter: SourceAdapter = {
+//     async read(_path) { throw new Error('Implement: parse + Zod-validate, map to canonical record'); },
+//   };
+//
+//   // SECURITY: every one of these formats is a PII export. The default
+//   // .gitignore must cover them up front (*.vcf *.abbu *.abcddb* *.json input
+//   // dumps) — field report #378 logged TWO near-misses where a non-CSV source
+//   // dump sat un-ignored in the repo root.
+//
 // ── Framework Adaptations ───────────────────────────────
 //
 // === Python (pandas/polars) ===
@@ -262,7 +317,10 @@ export {
 //               raise FileNotFoundError(path)
 //
 //       def transform(self, path: str) -> pl.DataFrame:
-//           return pl.read_csv(path)
+//           # Discover the format first — do NOT assume CSV (field report #378).
+//           fmt = detect_source_format(path)          # 'csv'|'vcard'|'sqlite-contacts'|'json'
+//           return SOURCE_ADAPTERS[fmt](path)          # each adapter -> canonical DataFrame
+//           # e.g. sqlite-contacts: sqlite3.connect(f"file:{abcddb}?mode=ro", uri=True)
 //
 //   class CleanStage:
 //       def validate(self, df: pl.DataFrame) -> None:
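The commented-out `detectSourceFormat` above can be made runnable with little change. The sketch below is a variation rather than the package's code: it checks magic bytes before the file extension, in line with the "do NOT trust the name alone" comment, and takes a `Uint8Array` so it does not depend on Node's `Buffer`. The helpers `asciiPrefix` and `toBytes` are hypothetical. The SQLite header string is 16 bytes, "SQLite format 3" followed by a NUL, which is why the check uses a prefix match.

```typescript
type SourceFormat = 'csv' | 'vcard' | 'sqlite-contacts' | 'json';

const SQLITE_MAGIC = 'SQLite format 3'; // the on-disk header adds a trailing NUL byte

// Decode up to `length` leading bytes as one character per byte.
function asciiPrefix(head: Uint8Array, length: number): string {
  return Array.from(head.subarray(0, length), (b) => String.fromCharCode(b)).join('');
}

// Test/demo helper: encode a plain ASCII string as bytes.
function toBytes(s: string): Uint8Array {
  return Uint8Array.from(Array.from(s, (c) => c.charCodeAt(0)));
}

function detectSourceFormat(path: string, head: Uint8Array): SourceFormat {
  const prefix = asciiPrefix(head, 16);
  // Content first: a mislabeled file is classified by what it contains.
  if (prefix.startsWith(SQLITE_MAGIC)) return 'sqlite-contacts';
  if (prefix.startsWith('BEGIN:VCARD')) return 'vcard';
  // Extension second. An .abbu bundle is a directory, so `head` may be empty.
  const lower = path.toLowerCase();
  if (lower.endsWith('.abbu') || lower.endsWith('.abcddb')) return 'sqlite-contacts';
  if (lower.endsWith('.vcf')) return 'vcard';
  if (lower.endsWith('.json')) return 'json';
  return 'csv';
}
```

With this ordering, a SQLite file that was renamed to `contacts.csv` still routes to the SQLite adapter instead of being fed to a CSV parser.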

package/dist/docs/patterns/exclusion-set-invariant.md ADDED

@@ -0,0 +1,62 @@
+# Pattern: Exclusion-Set Superset Invariant
+**When to use:** Any project where MORE THAN ONE mechanism independently enumerates "secret / PII / excluded" files — typically `.gitignore`, an `rsync --exclude` (or `tar --exclude`) deploy list, and a secret-scanner config (gitleaks/trufflehog/detect-secrets). Containment-heavy projects (autonomous agents, deploy pipelines that ship a working tree to a host) are the high-risk case.
+**Source:** Field report #377 §5 (live secret exposure traced to three exclusion mechanisms drifting apart).
+## The Failure Mode
+Each mechanism enumerates "the secret files" by its OWN rules, authored at a different time by a different concern:
+- `.gitignore` keeps secrets OUT OF GIT.
+- `rsync --exclude` (deploy) keeps secrets OFF THE TARGET HOST.
+- the secret-scanner keeps secrets OUT OF COMMITS / CI.
+Because the three lists are written and maintained separately, they drift. A file the `.gitignore` covers shipped through `rsync` world-readable, and the scanner's name patterns never matched it — so a secret excluded from git was deployed to the host and went undetected. Three "secured" mechanisms, zero of them caught the leak, because none of them agreed on the set.
+The trap: each list looks complete in isolation. The bug is in the DELTA between them, which no single mechanism can see.
+## The Pattern — One Canonical Set, the Others are Supersets
+Define ONE canonical secret/PII exclusion set. Every other mechanism's exclusion set must be a SUPERSET of it (it may exclude more — never less). Then assert the invariant in CI so it cannot silently drift.
+1. **Canonical source.** Pick one list as canonical (usually `.gitignore`'s secret section, or a dedicated `secrets.exclude` manifest). This is the minimum set every mechanism must cover.
+2. **Derive, don't duplicate, where possible.** Generate the `rsync --exclude-from=` file and the scanner's path patterns FROM the canonical set at build time. Derivation makes drift structurally impossible; if a mechanism's format can't be derived, fall back to the assertion below.
+3. **Assert the superset invariant.** A CI/provisioning check that fails closed:
+```bash
+# exclusion-set-invariant check — every mechanism must cover the canonical set.
+# Canonical set = the secret/PII globs that MUST be excluded everywhere.
+canonical=$(sort -u docs/security/secrets.exclude)   # one file, one canonical truth
+# Each mechanism exposes its excluded globs (normalize to one-glob-per-line).
+# git_secret_globs and scanner_path_globs are placeholders the project must define.
+gitignore=$(git_secret_globs)        # secret section of .gitignore
+rsync_excl=$(cat deploy/rsync.exclude)
+scanner=$(scanner_path_globs)        # gitleaks/trufflehog allow/deny paths
+fail=0
+for mech in "gitignore:$gitignore" "rsync:$rsync_excl" "scanner:$scanner"; do
+  name="${mech%%:*}"; have="${mech#*:}"
+  # Anything in canonical NOT covered by this mechanism = drift = fail.
+  missing=$(comm -23 <(printf '%s\n' "$canonical" | sort -u) \
+                     <(printf '%s\n' "$have"      | sort -u))
+  if [[ -n "$missing" ]]; then
+    echo "EXCLUSION DRIFT: '$name' is missing canonical entries:" >&2
+    echo "$missing" >&2
+    fail=1
+  fi
+done
+exit "$fail"
+```
+4. **Wire it into the gates.** Run the check in CI AND as a deploy/arming pre-flight (per the field report it was a deploy-time exposure). A new secret pattern added to the canonical set then forces every mechanism to cover it, or the build/deploy fails.
+## The Invariant, Stated
+> `canonical ⊆ gitignore` AND `canonical ⊆ rsync_exclude` AND `canonical ⊆ scanner` — at all times, enforced by an assertion. Supersets are fine; subsets are drift.
+## The Trade-off
+Derivation (step 2) is strictly better than assertion (step 3) — it removes the possibility of drift instead of detecting it — but not every tool accepts a generated exclude format, and some teams want each mechanism's list hand-tunable for its own extra concerns (rsync excluding build artifacts; the scanner allow-listing test fixtures). The superset invariant is the floor that permits those per-mechanism extras while forbidding any mechanism from covering LESS than the canonical secret set. Use derivation where the format allows; fall back to the asserted invariant everywhere else. (Field report #377 §5.)
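The superset invariant is a set difference per mechanism, which can also be expressed outside of shell. The TypeScript sketch below assumes each list has already been read and normalized to one glob per entry; `missingFrom` and `checkSupersetInvariant` are hypothetical names. Like the bash check in the pattern, it compares entries as exact strings and does not model a broader glob covering a narrower one.

```typescript
// Entries of `canonical` that `mechanism` does not list (exact string match).
function missingFrom(canonical: string[], mechanism: string[]): string[] {
  const have = new Set(mechanism.map((g) => g.trim()).filter((g) => g.length > 0));
  return canonical
    .map((g) => g.trim())
    .filter((g) => g.length > 0 && !have.has(g));
}

// Returns only the mechanisms that drifted, each with its missing entries.
// An empty result means canonical is a subset of every mechanism.
function checkSupersetInvariant(
  canonical: string[],
  mechanisms: Record<string, string[]>,
): Record<string, string[]> {
  const drift: Record<string, string[]> = {};
  for (const name of Object.keys(mechanisms)) {
    const missing = missingFrom(canonical, mechanisms[name] ?? []);
    if (missing.length > 0) drift[name] = missing;
  }
  return drift;
}
```

A CI step could fail closed by exiting non-zero whenever the returned object has any keys, mirroring the `exit "$fail"` in the shell version.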

package/dist/docs/patterns/multi-tenant-property-test.ts CHANGED

@@ -34,6 +34,20 @@ declare const harness: {
   listAllReadEndpoints(): string[];
   listAllWriteEndpoints(): string[];
   resetDb(): Promise<void>;
+  // ── Handler-entry (HTTP-level) harness — field report #371 ──────────────
+  // Drives the REAL request entrypoint with a concrete credential, so the
+  // auth→uid wiring is exercised (not just the repository's WHERE org_id).
+  // `principal` is whatever the entrypoint actually authenticates with: a
+  // bearer token, a session cookie, an API key header — give two DISTINCT ones.
+  httpRequest(
+    principal: { headers: Record<string, string> },
+    method: 'GET' | 'POST' | 'PUT' | 'DELETE',
+    path: string,
+    body?: unknown,
+  ): Promise<{ status: number; json: unknown }>;
+  // Two distinct, real principals for the SAME logical resource owner vs other.
+  principalForOrg(org: { apiKey: string; userId: string }): { headers: Record<string, string> };
 };
 // ── The Property ─────────────────────────────────────────────────────────
@@ -85,6 +99,43 @@ describe('multi-tenant isolation property', () => {
     const rowsB = await harness.readAsOrg(orgB, '/api/people');
     expect(rowsB.find((r) => r.org_id === orgA.id)).toBeUndefined();
   });
+  // ── Handler-entry two-principal variant (field report #371) ──────────────
+  // The repository-layer property above can pass while a handler that hardcodes
+  // `uid = 1` leaks across tenants — the repo test never crosses the auth→uid
+  // seam. This variant drives the REAL HTTP entrypoint with TWO DISTINCT
+  // credentials and asserts isolation through the request path. It is the test
+  // that the planted-bug check below must turn red.
+  test('two distinct principals through the real handler do not cross tenants', async () => {
+    const orgA = await harness.createOrg();
+    const orgB = await harness.createOrg();
+    const pA = harness.principalForOrg(orgA);
+    const pB = harness.principalForOrg(orgB);
+    // A writes through the real entrypoint with A's own credential.
+    const created = await harness.httpRequest(pA, 'POST', '/api/people', { name: 'A-secret' });
+    expect(created.status).toBeLessThan(300);
+    const writtenId = (created.json as { id: string }).id;
+    // B reads every list endpoint through the real entrypoint with B's credential.
+    for (const readEndpoint of harness.listAllReadEndpoints()) {
+      const res = await harness.httpRequest(pB, 'GET', readEndpoint);
+      const rows = Array.isArray(res.json) ? (res.json as Array<{ id?: string }>) : [];
+      expect(rows.find((r) => r.id === writtenId)).toBeUndefined();
+    }
+    // Cross-principal direct fetch: B asking for A's row by id must 404, not 403
+    // (404 avoids leaking existence — see CLAUDE.md "Return 404, not 403").
+    const direct = await harness.httpRequest(pB, 'GET', `/api/people/${writtenId}`);
+    expect(direct.status).toBe(404);
+  });
+  // PLANTED-BUG RED-CHECK (field report #371): hardcoding `uid = <owner>` in the
+  // handler MUST turn the two-principal test above RED. If you can introduce
+  // that bug and the suite stays green, your isolation test is not crossing the
+  // auth→uid seam — it is asserting at the repository layer only. Run this once
+  // as a mutation check: patch the handler to ignore the authenticated principal
+  // and pin uid to org A's id; the test above must fail. Revert after proving it.
 });
 function randomPayload(): fc.Arbitrary<unknown> {
@@ -112,8 +163,21 @@ function randomPayload(): fc.Arbitrary<unknown> {
 //         assert not any(r['id'] == written['id'] for r in rows_b), \
 //             f"LEAK: {write_endpoint} -> {read_endpoint}"
 //
+// # Handler-entry two-principal variant (field report #371) — drive the real
+// # entrypoint (FastAPI TestClient / Django test Client) with two distinct
+// # credentials, NOT the repository:
+// #   ra = client.post('/api/people', json={'name': 'A'}, headers=princ_a)
+// #   rb = client.get(f"/api/people/{ra.json()['id']}", headers=princ_b)
+// #   assert rb.status_code == 404      # not 403 — don't leak existence
+// # Mutation check: pin uid=<owner> in the handler; this MUST go red.
+//
 // ── Anti-patterns ────────────────────────────────────────────────────────
 //
+// 0. Asserting isolation only at the repository layer. A handler that
+//    hardcodes uid=1 passes every repo-level test while leaking across
+//    tenants. The isolation test MUST drive the real request entrypoint with
+//    two distinct principals (field report #371). Prove it with the planted
+//    uid red-check.
 // 1. Testing isolation only on known endpoints. The bug is in the endpoint
 //    you forgot. Property tests enumerate the full surface.
 // 2. Using SUPERUSER fixtures. They silently bypass FORCE RLS at the engine

package/dist/docs/patterns/oauth-token-lifecycle.ts CHANGED

@@ -8,6 +8,20 @@
  * - Failure escalation: retry 3x → pause platform → alert → requires_reauth
  * - Token stored as encrypted blob in vault, keyed by platform name
  * - Session token (daemon) rotates every 24 hours (§9.19.15)
+ * - VERIFY EXPIRY + REFRESH-GRANT BEHAVIOR AGAINST THE PROVIDER'S LIVE DOCS AT
+ *   INTEGRATION TIME. The PLATFORM_CONFIGS TTLs below are STARTING ASSUMPTIONS,
+ *   not ground truth — providers change them and "no refresh token / never
+ *   expires" is a common false assumption. Field report #373: a Todoist
+ *   integration shipped on "tokens don't expire," but the modern API issues
+ *   ~1h access tokens WITH a refresh token; the code discarded the refresh
+ *   token + expiry and registered no refresher, so it died ~1h after every
+ *   connect across four sessions — looking exactly like intermittent
+ *   revocation. At integration time: (1) read the provider's OAuth docs and
+ *   quote the verified access-token TTL + whether a refresh_token is issued;
+ *   (2) if a refresh_token exists, PERSIST it and register a refresher — never
+ *   discard it; (3) distinguish "expired" from "revoked" via the API's OWN
+ *   error body, not by inference (an expired token that mimics revocation will
+ *   send you reauth-hunting instead of refreshing).
  *
  * Agents: Breeze (platform relations), Dockson (vault)
  *
@@ -50,6 +64,13 @@ interface PlatformTokenConfig {
   revokeEndpoint?: string;
 }
+// ASSUMPTIONS, NOT GROUND TRUTH (field report #373). These TTLs and the
+// "refreshTokenTtlDays: 0 = never expires" entries are starting points. At
+// integration time, VERIFY each value against the provider's current OAuth
+// docs and the live token response (`expires_in`, presence of `refresh_token`)
+// — a provider that "doesn't expire" today may issue ~1h tokens tomorrow, and
+// a missing refresher then surfaces as recurring prod token-death that mimics
+// revocation. Treat any new platform here the same way before shipping.
 const PLATFORM_CONFIGS: PlatformTokenConfig[] = [
   { platform: 'meta',     accessTokenTtlHours: 1440, refreshTokenTtlDays: 0,  refreshEndpoint: 'https://graph.facebook.com/v19.0/oauth/access_token' },
   { platform: 'google',   accessTokenTtlHours: 1,    refreshTokenTtlDays: 0,  refreshEndpoint: 'https://oauth2.googleapis.com/token' },
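Step (2) of the guidance above, persist the refresh token and register a refresher, can be sketched as a pure function over the token response. The `expires_in` and `refresh_token` field names are the ones the comment cites from the live token response; `StoredToken`, `toStoredToken`, `shouldRefresh`, and the 60-second skew are hypothetical choices for illustration, not the package's vault schema.

```typescript
interface TokenResponse {
  access_token: string;
  expires_in?: number;     // seconds until the access token expires, if reported
  refresh_token?: string;  // present only when the provider issues one
}

interface StoredToken {
  accessToken: string;
  expiresAtMs: number | null;   // null = provider reported no expiry
  refreshToken: string | null;  // never discarded when present
  needsRefresher: boolean;
}

// Derive what to persist from the live response, not from a config-table TTL.
function toStoredToken(res: TokenResponse, nowMs: number): StoredToken {
  const expiresAtMs = typeof res.expires_in === 'number' ? nowMs + res.expires_in * 1000 : null;
  const refreshToken = res.refresh_token ?? null;
  return {
    accessToken: res.access_token,
    expiresAtMs,
    refreshToken,
    needsRefresher: refreshToken !== null && expiresAtMs !== null,
  };
}

// Refresh slightly early so in-flight requests do not hit an expired token.
function shouldRefresh(token: StoredToken, nowMs: number, skewMs: number = 60000): boolean {
  return token.needsRefresher && token.expiresAtMs !== null && nowMs >= token.expiresAtMs - skewMs;
}
```

Driving the refresher from the response means a provider that starts issuing short-lived tokens is handled on the next connect, without waiting for someone to update `PLATFORM_CONFIGS`.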

package/dist/scripts/statusline/README.md ADDED

@@ -0,0 +1,38 @@
+# Context Meter — status line + awareness hook
+Two small scripts that surface how full the context window is — one for the human, one for the model.
+| Script | Wired to | Audience | What it does |
+|--------|----------|----------|--------------|
+| `voidforge-statusline.sh` | `statusLine` (settings.json) | you | Renders one line: model + a colored meter (`⟦████████░░⟧ 78%`) + tokens remaining. Green → yellow → red as the window fills. |
+| `context-awareness-hook.sh` | `UserPromptSubmit` hook | Claude | Once usage crosses a threshold, injects "you have ~X% left, checkpoint soon" into the model's own context each turn. Silent below the threshold. |
+The model can't see its own remaining context. The status line tells *you*; the hook tells *Claude* — so it can wrap up open loops and suggest `/vault` or `/seal` before compaction instead of being surprised by it.
+## Install
+**Default-on.** `npx voidforge-build init` already wires both scripts into a new project's `.claude/settings.json` (warn 80% / crit 92%). Nothing to do for a fresh project.
+To re-install, retune, or activate on a project that predates this feature, run **`/contextmeter`** — it chmods these scripts and merges the right block into `.claude/settings.json`. Or wire it by hand: merge `settings-snippet.json` into `.claude/settings.json`. Remove with `/contextmeter --uninstall`.
+## How it reads context
+- **Status line:** prefers the native `context_window` object Claude Code pipes on stdin (`used_percentage`, `context_window_size`). Falls back to deriving usage from the most recent assistant `message.usage` in `transcript_path` on older Claude Code that doesn't send the field.
+- **Hook:** the hook stdin has no `context_window` object, so it always derives from `transcript_path` (`input_tokens + cache_read_input_tokens + cache_creation_input_tokens`).
+- 1M-token sessions are detected automatically (usage above 200k ⇒ 1,000,000 denominator), or set `VOIDFORGE_CONTEXT_WINDOW`.
+## Tuning (env)
+| Var | Default | Effect |
+|-----|---------|--------|
+| `VOIDFORGE_CONTEXT_WINDOW` | `200000` | Denominator when the size field is absent. |
+| `VOIDFORGE_CONTEXT_WARN_PCT` | `80` | Hook starts speaking — and the meter turns yellow — at this % used. |
+| `VOIDFORGE_CONTEXT_CRIT_PCT` | `92` | Hook escalates to "checkpoint NOW" — and the meter turns red — at this %. |
+Both scripts read the same two thresholds, so the meter's yellow/red bands stay in lockstep with the hook's warn/critical bands. `/contextmeter --warn-pct N` / `--crit-pct N` bake these into the command strings in settings.json so they persist without a shell export.
+## Requirements & caveats
+- **`jq`** is required. Without it the status line prints a one-line "install jq" notice and the hook no-ops — neither ever breaks your session.
+- Only the **first line** of status-line stdout is shown by Claude Code, so the meter is deliberately single-line.
+- **Name:** this ships as `/contextmeter`, not `/statusline` — Claude Code's native `/statusline` and `/context` commands always shadow a same-named project command (see `docs/NATIVE_CAPABILITIES.md`).
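The arithmetic the README describes (sum three usage fields, pick a denominator, compare against the warn and critical thresholds) can be sketched in a few lines. This is an illustration of the documented behavior, not a port of the shell scripts: `contextBand` and its option names are hypothetical, and rounding the percentage down is an assumption.

```typescript
type Band = 'ok' | 'warn' | 'crit';

interface Usage {
  input_tokens?: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

interface MeterOptions {
  windowSize?: number; // corresponds to VOIDFORGE_CONTEXT_WINDOW (default 200000)
  warnPct?: number;    // corresponds to VOIDFORGE_CONTEXT_WARN_PCT (default 80)
  critPct?: number;    // corresponds to VOIDFORGE_CONTEXT_CRIT_PCT (default 92)
}

function contextBand(usage: Usage, opts: MeterOptions = {}) {
  const used =
    (usage.input_tokens ?? 0) +
    (usage.cache_read_input_tokens ?? 0) +
    (usage.cache_creation_input_tokens ?? 0);
  // Usage above 200k implies a 1M-token session, as the README describes.
  const windowSize = used > 200000 ? 1000000 : (opts.windowSize ?? 200000);
  const pct = Math.floor((used * 100) / windowSize); // rounding down is an assumption
  const warn = opts.warnPct ?? 80;
  const crit = opts.critPct ?? 92;
  const band: Band = pct >= crit ? 'crit' : pct >= warn ? 'warn' : 'ok';
  return { used, windowSize, pct, band };
}
```

Because both thresholds feed one function, the meter color and the hook's warning level cannot disagree, which is the lockstep property the Tuning section calls out.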

package/dist/scripts/statusline/context-awareness-hook.sh ADDED

@@ -0,0 +1,53 @@
+#!/usr/bin/env bash
+# context-awareness-hook.sh — UserPromptSubmit hook that injects context-budget
+# awareness INTO Claude's own context as the window fills.
+#
+# The status-line meter is for the human; this hook is for the model. Claude
+# cannot see its own remaining context directly, so each turn (once usage crosses
+# a threshold) this prints a JSON object whose `hookSpecificOutput.additionalContext`
+# Claude receives — "you have ~X% left, checkpoint soon." Below the threshold it is
+# silent, so it adds zero noise until it matters.
+#
+# Cadence: Claude Code has no time/turn-interval hooks — UserPromptSubmit (once per
+# user turn) is the finest cadence available, which is exactly when fresh awareness
+# is useful. Threshold-gated so it behaves like a periodic warning that only speaks
+# near the limit.
+#
+# Requires jq; without it, no-op (exit 0). A hook must never break the turn.
+#
+# Env knobs:
+#   VOIDFORGE_CONTEXT_WINDOW    denominator (default 200000; auto-bumps to 1000000 when usage exceeds 200k)
+#   VOIDFORGE_CONTEXT_WARN_PCT  start warning at this % used (default 80)
+#   VOIDFORGE_CONTEXT_CRIT_PCT  escalate to "checkpoint NOW" at this % (default 92)
+set -uo pipefail
+input="$(cat 2>/dev/null || true)"
+command -v jq >/dev/null 2>&1 || exit 0
+transcript="$(printf '%s' "$input" | jq -r '.transcript_path // empty' 2>/dev/null)"
+[ -n "$transcript" ] && [ -f "$transcript" ] || exit 0
+usage="$(tail -n 400 "$transcript" | jq -c 'select(.message.usage != null) | .message.usage' 2>/dev/null | tail -1)"
+[ -n "$usage" ] || exit 0
+used="$(printf '%s' "$usage" | jq -r '((.input_tokens//0)+(.cache_read_input_tokens//0)+(.cache_creation_input_tokens//0))' 2>/dev/null)"
+used="${used%%.*}"
+[ -n "${used:-}" ] || exit 0
+if [ "$used" -gt 200000 ] 2>/dev/null; then window=1000000; else window="${VOIDFORGE_CONTEXT_WINDOW:-200000}"; fi
+[ "${window:-0}" -gt 0 ] 2>/dev/null || exit 0
+pct=$(( used * 100 / window ))
+warn="${VOIDFORGE_CONTEXT_WARN_PCT:-80}"
+crit="${VOIDFORGE_CONTEXT_CRIT_PCT:-92}"
+[ "$pct" -lt "$warn" ] && exit 0
+rem_k=$(( window > used ? (window - used) / 1000 : 0 ))   # clamp at 0 if usage exceeds the configured window
+if [ "$pct" -ge "$crit" ]; then
+  msg="⚠️ CONTEXT CRITICAL: ~${pct}% of the ${window}-token window is used (~${rem_k}k left). Compaction is imminent — checkpoint NOW: run /vault (or /seal) to preserve session state before the context is summarized, and prefer finishing the current sub-task over starting new work."
+else
+  msg="Context monitor: ~${pct}% of the ${window}-token window is used (~${rem_k}k left). You are approaching the limit — wrap up open loops and consider /vault or /seal to checkpoint before compaction."
+fi
+jq -cn --arg m "$msg" '{hookSpecificOutput:{hookEventName:"UserPromptSubmit",additionalContext:$m}}'
+exit 0
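The hook's arithmetic can be checked in isolation. The sketch below is an illustration only: `context_usage` is a hypothetical name, and it takes a token count directly instead of reading a transcript with `jq`.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the hook's integer arithmetic.
# Prints "<pct> <remaining_k>" for a given number of used tokens.
context_usage() {
  local used="$1" window
  if [ "$used" -gt 200000 ]; then
    window=1000000                                  # usage above 200k implies a 1M-token session
  else
    window="${VOIDFORGE_CONTEXT_WINDOW:-200000}"    # otherwise the env knob or its default
  fi
  echo "$(( used * 100 / window )) $(( (window - used) / 1000 ))"
}

context_usage 170000   # prints "85 30"
context_usage 250000   # prints "25 750"
```

Integer division truncates, so with the default window 183,999 tokens still reports 91% and the critical band at 92% begins at 184,000 tokens.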

package/dist/scripts/statusline/settings-snippet.json ADDED

@@ -0,0 +1,17 @@
+{
+  "statusLine": {
+    "type": "command",
+    "command": "bash scripts/statusline/voidforge-statusline.sh",
+    "padding": 0
+  },
+  "hooks": {
+    "UserPromptSubmit": [
+      {
+        "matcher": "",
+        "hooks": [
+          { "type": "command", "command": "bash scripts/statusline/context-awareness-hook.sh" }
+        ]
+      }
+    ]
+  }
+}
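Wiring by hand means merging the snippet above into an existing settings file. One possible sketch, assuming `jq` is installed: `merge_settings` and the paths in the usage comment are hypothetical, and this is not necessarily how `/contextmeter` performs the merge.

```shell
#!/usr/bin/env bash
# Hypothetical helper: deep-merge a snippet into an existing settings file.
# jq's `*` operator merges objects recursively, but an array on the right-hand
# side REPLACES the array on the left, so any existing UserPromptSubmit hooks
# would be overwritten rather than appended to. Review the output before use.
merge_settings() {
  local settings="$1" snippet="$2"
  jq -s '.[0] * .[1]' "$settings" "$snippet"
}

# Usage (paths are assumptions, relative to the project root):
#   merge_settings .claude/settings.json scripts/statusline/settings-snippet.json > settings.merged.json
```

Writing to a separate file first keeps the original `settings.json` intact until the merged result has been inspected.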

package/dist/scripts/statusline/voidforge-statusline.sh ADDED

@@ -0,0 +1,91 @@
+#!/usr/bin/env bash
+# voidforge-statusline.sh — Context-usage meter for the Claude Code status line.
+#
+# Reads the status-line JSON on stdin and prints ONE line:
+#   <model>  ⟦████████░░⟧ 78% ctx · 44k left
+# The meter is colored green → yellow → red as the context window fills.
+#
+# Source of truth: the native `.context_window` object Claude Code pipes to the
+# status line (`used_percentage`, `context_window_size`). When that field is
+# absent (older Claude Code), it falls back to deriving usage from the most
+# recent assistant `message.usage` in `.transcript_path`.
+#
+# Requires jq. Without jq it prints a minimal line and exits 0 — a status line
+# must NEVER hard-fail (that would blank the bar).
+#
+# Env knobs (shared with the awareness hook so colors and warnings stay in lockstep):
+#   VOIDFORGE_CONTEXT_WINDOW    denominator when the size field is absent (default 200000)
+#   VOIDFORGE_CONTEXT_WARN_PCT  meter turns yellow at this % used (default 80)
+#   VOIDFORGE_CONTEXT_CRIT_PCT  meter turns red at this % used (default 92)
+set -uo pipefail
+input="$(cat 2>/dev/null || true)"
+if ! command -v jq >/dev/null 2>&1; then
+  printf 'VoidForge · ctx meter needs jq (brew install jq)\n'
+  exit 0
+fi
+j() { printf '%s' "$input" | jq -r "$1" 2>/dev/null; }
+model="$(j '.model.display_name // .model.id // "Claude"')"
+pct="$(j '.context_window.used_percentage // empty')"
+window="$(j '.context_window.context_window_size // empty')"
+# Fallback: derive from the transcript when the native field is absent.
+if [ -z "$pct" ]; then
+  transcript="$(j '.transcript_path // empty')"
+  if [ -n "$transcript" ] && [ -f "$transcript" ]; then
+    usage="$(tail -n 400 "$transcript" | jq -c 'select(.message.usage != null) | .message.usage' 2>/dev/null | tail -1)"
+    if [ -n "$usage" ]; then
+      used="$(printf '%s' "$usage" | jq -r '((.input_tokens//0)+(.cache_read_input_tokens//0)+(.cache_creation_input_tokens//0))' 2>/dev/null)"
+      used="${used%%.*}"
+      if [ -z "$window" ]; then
+        if [ "${used:-0}" -gt 200000 ] 2>/dev/null; then window=1000000; else window="${VOIDFORGE_CONTEXT_WINDOW:-200000}"; fi
+      fi
+      if [ -n "${used:-}" ] && [ "${window:-0}" -gt 0 ] 2>/dev/null; then
+        pct=$(( used * 100 / window ))
+      fi
+    fi
+  fi
+fi
+# Coerce to integer; bail to model-only if we still have nothing.
+pct="${pct%%.*}"
+if [ -z "$pct" ]; then
+  printf '%s\n' "$model"
+  exit 0
+fi
+[ -z "$window" ] && window="${VOIDFORGE_CONTEXT_WINDOW:-200000}"
+window="${window%%.*}"
+[ "$pct" -lt 0 ] 2>/dev/null && pct=0
+[ "$pct" -gt 100 ] 2>/dev/null && pct=100
+remaining=$(( window - window * pct / 100 ))
+if [ "$remaining" -ge 1000 ]; then rem_h="$(( remaining / 1000 ))k"; else rem_h="${remaining}"; fi
+# Color band — defaults align with the awareness-hook thresholds (warn 80 → yellow,
+# crit 92 → red) so the meter turns red exactly when the hook goes critical. Both
+# honor the same env vars, so retuning one retunes the other.
+yellow_at="${VOIDFORGE_CONTEXT_WARN_PCT:-80}"
+red_at="${VOIDFORGE_CONTEXT_CRIT_PCT:-92}"
+if   [ "$pct" -ge "$red_at" ];    then color=$'\033[31m'   # red    — checkpoint now
+elif [ "$pct" -ge "$yellow_at" ]; then color=$'\033[33m'   # yellow — getting full
+else                                   color=$'\033[32m'   # green  — healthy
+fi
+reset=$'\033[0m'
+dim=$'\033[2m'
+# 10-cell meter, rounded.
+filled=$(( (pct + 5) / 10 ))
+[ "$filled" -gt 10 ] && filled=10
+[ "$filled" -lt 0 ] && filled=0
+bar=""
+i=0
+while [ "$i" -lt 10 ]; do
+  if [ "$i" -lt "$filled" ]; then bar="${bar}█"; else bar="${bar}░"; fi
+  i=$(( i + 1 ))
+done
+printf '%s %s⟦%s⟧ %d%%%s %sctx · %s left%s\n' "$model" "$color" "$bar" "$pct" "$reset" "$dim" "$rem_h" "$reset"
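The 10-cell rounding used by the meter can be tried on its own. In this sketch `render_bar` is a hypothetical function name; the rounding rule and the cell glyphs are taken from the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: renders the 10-cell bar for a given "% used" value.
render_bar() {
  local pct="$1" filled bar="" i=0
  filled=$(( (pct + 5) / 10 ))                      # round to the nearest cell
  if [ "$filled" -gt 10 ]; then filled=10; fi       # clamp to the bar width
  if [ "$filled" -lt 0 ]; then filled=0; fi
  while [ "$i" -lt 10 ]; do
    if [ "$i" -lt "$filled" ]; then bar="${bar}█"; else bar="${bar}░"; fi
    i=$(( i + 1 ))
  done
  printf '%s\n' "$bar"
}

render_bar 78   # prints ████████░░
render_bar 4    # prints ░░░░░░░░░░
render_bar 95   # prints ██████████
```

Rounding to the nearest cell means the bar looks full from 95% upward, so the percentage and color band carry the precise reading near the limit.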