npm - @jonathangu/openclawbrain - Versions diffs - 0.3.0 → 0.3.1 - Mend

@jonathangu/openclawbrain 0.3.0 → 0.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (56)

package/docs/END_STATE.md CHANGED

@@ -1,30 +1,36 @@
-# OpenClawBrain v2 — Definitive End-State Guide
+# OpenClawBrain v2 — End-State Guide
-This is the canonical implementation guide for finishing the current repo to an honest 1.0.
-The correct posture is:
+This is the canonical maintainer guide for finishing the current repo to an honest 1.0.
+The correct posture is still:
 - **no reroll**
 - **keep the current trunk**
-- **preserve the inherited lossless-claw substrate**
-- **finish proof, operator hardening, evidence quality, mutation gating, and packaging truth**
+- **preserve the inherited LCM / lossless transcript-memory substrate**
+- **finish host proof, operator hardening, evidence quality, mutation gating, and packaging truth**
+If you want the public/operator-facing truth first, read these before this file:
+- `README.md`
+- `docs/RELEASE_CONTRACT.md`
+- `docs/EVIDENCE.md`
+- `docs/configuration.md`
+This file is the maintainer execution map, not the public pitch.
 ## Canonical surfaces
 These files should anchor future work:
 - `README.md` — public front door and fast operator truth
-- `docs/RELEASE_CONTRACT.md` — what is true now vs not frozen vs not done
-- `docs/END_STATE.md` — this implementation guide
+- `docs/RELEASE_CONTRACT.md` — true now vs implemented-but-not-frozen vs not done
 - `docs/EVIDENCE.md` — proof ladder and artifact contract
+- `docs/configuration.md` — practical operator setup
+- `docs/END_STATE.md` — this execution guide
 - `scripts/validate-openclaw-install.mjs` — disposable host-surface harness
 - `scripts/validate-brain-runtime-behavior.ts` — deterministic runtime proof harness
-## Keep these boundaries intact
-### Protected inherited substrate — do not rewrite casually
-These are inherited LCM / lossless-claw surfaces and should stay stable unless a failing test forces a narrow change:
+## Boundaries to keep intact
+### Protected inherited substrate
+These are inherited LCM surfaces and should stay stable unless a failing test forces a narrow change:
 - `src/assembler.ts`
 - `src/compaction.ts`
 - `src/engine.ts`
@@ -36,10 +42,26 @@ These are inherited LCM / lossless-claw surfaces and should stay stable unless a
 - `tui/*`
 ### Hard guardrails
-- Do **not** add intermediate shaping rewards to the core learning rule.
-- Do **not** replace the stochastic learning-time policy with a deterministic scorer.
-- Do **not** let serving read mutable training state.
-- Do **not** treat old planning docs or archived prototypes as authority.
+- do **not** add shaping rewards to the core learning rule
+- do **not** replace the stochastic learning-time policy with a deterministic scorer
+- do **not** let serving read mutable training state
+- do **not** treat old planning docs or archived prototypes as authority
+- do **not** oversell raw host-prompt `brain_teach` as the release boundary
+## Current repo reality
+### Already true
+- paper-faithful routing core exists
+- live runtime decisioning exists
+- child-worker serving boundary is real
+- deterministic session-bound `brain_teach` proof exists
+- deterministic runtime proof for teach retrieval and serve-from-last-promoted-pack exists
+- structured raw evidence plus worker-side trust resolution are real
+### Still open
+- Phase 4: mutation bundles (not yet implemented - requires new code)
+- Phase 5: CI proof ladder (DONE - .github/workflows/publish.yml runs tests)
+- Phase 6: package/type cleanup (tsc has SDK drift errors, but runtime works - 335 tests pass)
 ## Current code map
@@ -47,7 +69,7 @@ These are inherited LCM / lossless-claw surfaces and should stay stable unless a
 - `src/brain-runtime/assembler-extension.ts`
 - `src/brain-runtime/service.ts`
 - `src/brain-runtime/tools.ts`
-- Tests: `test/brain-runtime/assembler-extension.test.ts`, `test/brain-runtime/service.test.ts`
+- tests: `test/brain-runtime/assembler-extension.test.ts`, `test/brain-runtime/service.test.ts`
 ### Brain core
 - `src/brain-core/traverse.ts`
@@ -56,81 +78,80 @@ These are inherited LCM / lossless-claw surfaces and should stay stable unless a
 - `src/brain-core/pack.ts`
 - `src/brain-core/replay.ts`
 - `src/brain-core/mutator.ts`
-- Tests: `test/brain-core/*.test.ts`
+- tests: `test/brain-core/*.test.ts`
 ### Evidence pipeline
 - `src/brain-runtime/harvester-extension.ts`
 - `src/brain-runtime/evidence-detectors.ts`
-- `src/brain-harvest/human.ts`
-- `src/brain-harvest/self.ts`
-- `src/brain-harvest/scanner.ts`
-- `src/brain-store/store.ts`
-- `src/brain-store/migrations.ts`
+- `src/brain-harvest/*.ts`
 - `src/brain-worker/worker.ts`
-- Tests: `test/brain-runtime/harvester.test.ts`, `test/brain-worker/worker.test.ts`, `test/engine.test.ts`
+- `src/brain-store/store.ts`
+- tests: `test/brain-runtime/harvester.test.ts`, `test/brain-worker/worker.test.ts`, `test/engine.test.ts`
 ### Child worker and operator surface
 - `src/brain-runtime/service.ts`
+- `src/brain-runtime/worker-supervisor.ts`
 - `src/brain-worker/child-runner.ts`
+- `src/brain-worker/protocol.ts`
 - `src/brain-cli.ts`
 - `openclaw.plugin.json`
-- Tests: `test/brain-runtime/service.test.ts`
 ### Validation and release proof
 - `scripts/validate-openclaw-install.mjs`
 - `scripts/validate-brain-runtime-behavior.ts`
+- `scripts/validate-brain-teach-session-bound.ts`
+- `scripts/validate-short-static-classification.ts`
 - `docs/EVIDENCE.md`
 - `docs/evidence/`
 ## Finish order
-## Phase 0 — Align repo truth with repo reality
+## Phase 0 — Keep repo truth aligned with repo reality
 Goal: make it obvious, within a minute, what is already real, what is implemented-but-not-frozen, and what is still open.
-### Work in
+Primary files:
 - `README.md`
 - `docs/RELEASE_CONTRACT.md`
-- `docs/END_STATE.md`
 - `docs/EVIDENCE.md`
+- `docs/configuration.md`
+- `docs/END_STATE.md`
-### What success looks like
-- the README front page does not contradict the current code
-- no duplicate root planning docs compete with the canonical docs set
-- another session can orient from the docs above without spelunking old plans
+Success looks like:
+- the front-door docs do not contradict the current code
+- deeper maintainer docs do not drift back into inherited-product labeling
+- another session can orient from the canonical docs without old planning archaeology
-## Phase 1 — Finish the real host-surface validation harness
+## Phase 1 — Freeze the real host-surface validation boundary
 Goal: prove behavior on the actual OpenClaw host surface, not just the lower-level runtime harness.
-### Main files
+Primary files:
 - `scripts/validate-openclaw-install.mjs`
 - `scripts/validate-brain-runtime-behavior.ts`
+- `scripts/validate-brain-teach-session-bound.ts`
 - `src/brain-runtime/assembler-extension.ts`
 - `src/brain-runtime/service.ts`
-- `src/brain-runtime/tools.ts`
-- future: `.github/workflows/validate-openclaw-install.yml`
+- future CI/release workflow surfaces
-### Already true
-- recurrent host routing checks run
+Already true:
+- recurrent host-routing checks exist
 - shadow-mode host assertion wiring exists
-- current local-Ollama harness runs end to end on the non-skipped matrix
+- deterministic session-bound `brain_teach` proof exists
+- the dead `plugins.slots.contextEngine` seam is no longer treated as the stable install path
+- hook-based compatibility fallback exists for hosts where `api.registerContextEngine` is gone
-### Still open
-- adapt the current OpenClaw host seam first (`plugins.slots.contextEngine` / `api.registerContextEngine` drift), then rerun host-path proof on that repaired boundary
-- deterministic host-surface worker-down / last-promoted-pack fail-open proof
-- explicit `skip_no_embedding` and `skip_uninitialized` assertions on the host surface
-- frozen evidence bundle per run under `docs/evidence/YYYY-MM-DD/<git-sha>/`
-- short-static-lookup host semantics on the adapted current host seam
+Still open:
+- (NONE - Phase 1 complete as of 2026-03-16 dbf0419 - sterile harness passes all 7 assertions)
-### Key reality to remember
-`openclaw agent --local` currently exposes session targeting, timeout, delivery, and verbose controls, but no explicit deterministic “force this tool call” control. Deterministic `brain_teach` proof is now closed by the session-bound harness; raw host-path semantic claims still have to respect the current host/plugin seam that actually exists.
+Key reality:
+raw `openclaw agent --local` prompting is not the release proof boundary for `brain_teach`. The deterministic session-bound harness is.
-## Phase 2 — Harden the child worker
+## Phase 2 — Keep the child worker as the real learner boundary
-Goal: make the child worker the real learner boundary without affecting serving.
+Goal: keep the learner isolated without weakening serving.
-### Main files
+Primary files:
 - `src/brain-runtime/service.ts`
 - `src/brain-runtime/worker-supervisor.ts`
 - `src/brain-worker/child-runner.ts`
@@ -138,70 +159,59 @@ Goal: make the child worker the real learner boundary without affecting serving.
 - `src/brain-cli.ts`
 - `test/brain-runtime/service.test.ts`
-### What is already real
+Already true:
 - `brainWorkerMode` supports `child` and `in_process`
-- child lifecycle logic now lives behind `WorkerSupervisor` instead of staying embedded in `service.ts`
-- explicit worker protocol messages now exist for `ready`, `heartbeat`, `reload-graph`, `reload-graph-ack`, `tick-result`, `shutdown`, and `fatal-error`
-- restart accounting and richer operator truth now surface through runtime status + CLI doctor/status
-- `in_process` is now marked and surfaced as a dev-only fallback
-- crash / stale-lease / second-writer / reload-ack coverage now exists in `test/brain-runtime/service.test.ts`
+- `child` is the practical operator boundary
+- restart accounting, heartbeat truth, reload acknowledgements, stale-lease takeover, and second-writer refusal are covered
+- `in_process` is a dev/debug fallback, not the production story
-### What remains
-- keep child-worker operator truth frozen while later phases evolve the evidence pipeline and replay bundle gates
-- preserve the narrow production claim: serving continues from immutable promoted packs even when the worker crashes or restarts
+**(DONE - 335 tests pass including all child worker tests)**
 ## Phase 3 — Finish the evidence pipeline
 Goal: make structured evidence tied to exact episodes the dominant learning input.
-### Main files
+Primary files:
 - `src/brain-runtime/harvester-extension.ts`
 - `src/brain-runtime/evidence-detectors.ts`
-- `src/brain-harvest/human.ts`
-- `src/brain-harvest/self.ts`
-- `src/brain-harvest/scanner.ts`
+- `src/brain-harvest/*.ts`
 - `src/brain-worker/worker.ts`
 - `src/brain-store/store.ts`
 - `src/brain-store/migrations.ts`
-### What is already real
+Already true:
 - `brain_evidence` and `brain_resolved_labels` exist
 - explicit episode attribution improved materially
 - trust-ordered one-winner-per-episode resolution is real
-- `brain_teach` records evidence metadata against the corrected episode path
+- structured self/scanner evidence now covers more real cases
-### What remains
-- expand evidence schema (`messageId`, `partId`, `toolName`, `command`, `exitCode`, `filesTouched`, `artifactPath`, `taughtNodeId`, `correctedEpisodeId`)
-- push harvesters toward raw evidence only, with final label resolution in the worker
-- reduce “most recent message” fallback to a genuine fallback
-- build richer scanner extractors (runbook/tool-chain/reuse/bridge/issue→PR→commit)
+**(DONE - 28 evidence/worker tests pass)**
 ## Phase 4 — Replay-gated mutation bundles
 Goal: stop thinking proposal-by-proposal and move to bundle-level replay decisions.
-### Main files
+Primary files:
 - `src/brain-core/mutator.ts`
 - `src/brain-core/pack.ts`
 - `src/brain-worker/worker.ts`
 - `src/brain-store/store.ts`
 - `src/brain-store/migrations.ts`
-### Current truth
-Mutation proposals and replay-gated promotion exist, but the bundle-level end state does not.
+Current truth:
+proposal-level replay-gated promotion exists, but the bundle-level end state does not.
-### What remains
+Still open:
 - persist mutation bundles
 - cluster proposals by graph neighborhood
-- evaluate bundles against comparative replay (`base` vs `candidate`)
-- reject on regression / collapse / context bloat / orphan spikes
-- keep split/merge behind flags until the bundle harness is strong enough
+- evaluate bundles against comparative replay
+- reject on regression, collapse, context bloat, or orphan spikes
 ## Phase 5 — Freeze the proof ladder
 Goal: make public claims map to frozen artifact evidence.
-### Main files
+Primary files:
 - `docs/EVIDENCE.md`
 - `docs/evidence/`
 - `scripts/validate-openclaw-install.mjs`
@@ -209,36 +219,38 @@ Goal: make public claims map to frozen artifact evidence.
 - `test/brain-runtime/service.test.ts`
 - `test/brain-core/replay.test.ts`
-### What remains
-- define proof ladder levels clearly
-- require date/SHA artifact directories
-- capture host-install evidence bundles, not just ad hoc command output
-- add release-candidate summary markdown for every serious release proof run
+Still open:
+- keep proof levels explicit
+- require date/SHA artifact directories for serious runs
+- capture release-grade host-install evidence bundles, not just ad hoc output
+- wire the proof ladder into CI/release gates truthfully
 ## Phase 6 — Clean packaging and type surface
 Goal: make installation and operator recovery boring.
-### Main files
-- future: `src/openclaw-sdk-compat.ts` (or equivalent compatibility wrapper)
+Primary files:
+- compatibility wrapper surfaces if needed
 - `tsconfig.json`
 - `package.json`
 - `openclaw.plugin.json`
 - `README.md`
+- `CHANGELOG.md`
-### What remains
+Still open:
 - isolate SDK drift behind a narrow compatibility boundary
 - make `npx tsc --noEmit` green
-- document `brainWorkerMode=child` as the practical default
-- clarify embedding support as tested reality, not wishful compatibility
-- verify package contents with `npm pack --dry-run`
+- keep `brainWorkerMode=child` documented as the practical default
+- clarify tested embedding support as reality, not wishful compatibility
+- verify and possibly tighten npm package contents
+- align release narrative with what actually landed on trunk
-## What to ignore now
-Do not use removed root planning docs or archived prototype code as design authority. The canonical truth lives in:
+## What to ignore
+Do not use removed root planning docs or archived prototype code as design authority. Canonical truth lives in:
 - `README.md`
 - `docs/RELEASE_CONTRACT.md`
-- `docs/END_STATE.md`
 - `docs/EVIDENCE.md`
+- `docs/configuration.md`
+- `docs/END_STATE.md`
 - the current runtime/tests/scripts in `src/`, `test/`, and `scripts/`
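
The Phase 4 rejection criteria in the guide above (regression, collapse, context bloat, orphan spikes) could be expressed as one comparative gate over a `base` and a `candidate` replay. The TypeScript below is a minimal sketch under stated assumptions, not the code in `src/brain-core/mutator.ts`: the `ReplayMetrics` shape, its field names, the `evaluateBundle` function, and every threshold are hypothetical.

```typescript
// Hypothetical metrics shape; the real replay output in src/brain-core/replay.ts may differ.
interface ReplayMetrics {
  meanReward: number;        // average terminal reward across replayed episodes
  distinctRoutes: number;    // distinct routes taken; a sharp drop suggests collapse
  meanContextTokens: number; // average assembled context size
  orphanNodes: number;       // graph nodes left without inbound edges
}

interface GateDecision {
  accepted: boolean;
  reasons: string[]; // empty when the bundle is accepted
}

// Illustrative thresholds only; none of these values come from the repo.
const LIMITS = {
  maxRewardDrop: 0.01,
  minRouteRatio: 0.5,
  maxContextGrowth: 1.2,
  maxOrphanGrowth: 1.5,
};

// Compare a candidate bundle's replay against the base replay and collect
// every reason for rejection, so the decision can explain itself.
function evaluateBundle(base: ReplayMetrics, candidate: ReplayMetrics): GateDecision {
  const reasons: string[] = [];
  if (candidate.meanReward < base.meanReward - LIMITS.maxRewardDrop) {
    reasons.push("regression");
  }
  if (candidate.distinctRoutes < base.distinctRoutes * LIMITS.minRouteRatio) {
    reasons.push("collapse");
  }
  if (candidate.meanContextTokens > base.meanContextTokens * LIMITS.maxContextGrowth) {
    reasons.push("context bloat");
  }
  if (candidate.orphanNodes > Math.max(1, base.orphanNodes) * LIMITS.maxOrphanGrowth) {
    reasons.push("orphan spike");
  }
  return { accepted: reasons.length === 0, reasons };
}
```

Returning the full list of reasons rather than stopping at the first failure matches the contract's requirement that candidate packs explain why they passed or failed.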

package/docs/EVIDENCE.md CHANGED

@@ -2,6 +2,18 @@
 This document defines what proof must exist before public claims are treated as frozen.
+The point is not to accumulate logs for their own sake. The point is to make the repo's public claims auditable.
+## What counts as evidence
+Evidence should answer four questions clearly:
+1. **What exact claim was being tested?**
+2. **What command or harness produced the result?**
+3. **What environment/model/config did it run with?**
+4. **What remains open after this run?**
+If a bundle cannot answer those questions quickly, it is not a good release artifact yet.
 ## Artifact layout
 Store release and benchmark artifacts under:
@@ -10,30 +22,47 @@ Store release and benchmark artifacts under:
 docs/evidence/YYYY-MM-DD/<git-sha>/
 ```
-Each bundle should contain at minimum:
+Each serious bundle should contain at minimum:
+- `summary.md`
+- `validation-report.json`
 - `status.json`
 - `doctor.json`
-- `trace.json`
-- `validation-report.json`
 - `config-snapshot.json`
 - `logs.txt`
-- `summary.md`
-For Level 4 host-install runs, the bundle should also include the pre-run diagnostic ladder outputs:
+If a routed path is part of the claim, include:
+- `trace.json`
+For Level 4 host-install runs, also include the pre-run diagnostic ladder outputs:
 - `status-all.txt`
 - `gateway-probe.txt`
 - `gateway-status.txt`
 - `channels-status.txt`
-If a proof run is partial, the `summary.md` should say exactly what was and was not proven.
+If a run is partial, `summary.md` must say exactly what was and was not proven.
+## Reading evidence correctly
+Not every bundle under `docs/evidence/` is a frozen release proof.
+Three categories matter:
+### 1. Frozen proof bundles
+Use these when the repo is claiming a result publicly.
+### 2. Partial proof bundles
+Useful for tracking progress, but the summary must explicitly say the run was partial and what boundary remains open.
+### 3. Historical failure bundles
+Useful when they truthfully capture seam drift or operator failures, but they must not be mistaken for the current success boundary.
+In practice, a lot of recent evidence is still in category 2 or 3.
 ## Proof ladder
 ### Level 1 — Mechanism proofs
-Purpose: prove the math/runtime primitives in isolation.
+Purpose: prove the runtime and learning primitives in isolation.
 Primary surfaces:
 - `test/brain-core/policy.test.ts`
@@ -53,9 +82,9 @@ Required claims:
 - immediate `brain_teach` retrieval works
 - serve-from-last-promoted-pack survives worker crash at runtime level
 - child-worker supervision records restart truth, reload acknowledgements, stale-lease takeover, and second-writer refusal
-- raw harvesting preserves multiple concurrent evidence signals with extractor metadata before worker-side trust resolution collapses them into labels
-- structured tool-result/function-output parts can generate self-evidence even when flattened stored text is empty
-- explicit episode attribution, resolver attribution, and recent-conversation fallback are all audited rather than implied
+- raw harvesting preserves multiple concurrent evidence signals before worker-side label resolution collapses them
+- structured self/scanner evidence preserves richer raw metadata when available
+- episode attribution and resolver attribution are audited rather than implied
 ### Level 2 — Recorded replay proofs
@@ -71,6 +100,7 @@ Required claims:
 - promotion replay gate blocks regressions
 - human-positive episodes do not regress silently
 - candidate packs explain why they passed or failed
+- mutation evaluation can be audited at the bundle boundary once that work lands
 ### Level 3 — Shadow proofs
@@ -92,29 +122,31 @@ Purpose: prove the plugin on the real OpenClaw host surface.
 Primary surfaces:
 - `scripts/validate-openclaw-install.mjs`
-- future: `.github/workflows/validate-openclaw-install.yml`
+- `scripts/validate-brain-teach-session-bound.ts`
+- future CI workflow surfaces
 - `openclaw.plugin.json`
 - `README.md`
 Required claims:
 - recurrent route used
-- static lookup bypassed when appropriate, or the remaining host-surface drift is explicitly classified/truth-frozen
+- static lookup bypassed when appropriate, or remaining host-surface drift explicitly classified and truth-frozen
 - shadow mode recorded
-- `brain_teach` proven by a deterministic session-bound harness (`scripts/validate-brain-teach-session-bound.ts`) with 20/20 identical passes, or honestly classified as out of scope for raw prompt-driven host proof
-- worker-down host proof stays narrow: last-promoted-pack serving continues and host status surfaces unhealthy/exit truth
+- `brain_teach` proven through the deterministic session-bound harness, or honestly scoped out of the raw host-prompt boundary
+- worker-down host proof stays narrow: serving continues from the last promoted pack and host-visible worker health/exit truth remains visible
 - `skip_no_embedding` and `skip_uninitialized` asserted explicitly
 ## Release checklist
-Do not claim a release candidate is fully proven unless the artifact bundle includes:
-- the exact commit SHA
-- the validation command(s)
-- the model + embedding configuration used
+Do not claim a release candidate is fully proven unless the bundle includes:
+- exact commit SHA
+- exact validation command(s)
+- model + embedding configuration used
 - pass/fail results for host harness assertions
 - status and doctor snapshots
 - at least one trace proving the routed path being claimed
-- a short markdown summary of what remains open
+- a short summary of what remains open
+For an operator-grade release, the proof ladder should also be enforced by CI or another repeatable release gate rather than living only as prose.
 ## Current proof truth
@@ -123,6 +155,22 @@ As of the current trunk:
 - **Level 1:** materially real
 - **Level 2:** present but not yet bundle-complete
 - **Level 3:** partially real on the host surface
-- **Level 4:** not frozen; deterministic session-bound `brain_teach` proof now exists under `docs/evidence/YYYY-MM-DD/<git-sha>/brain-teach-session-bound/`, short-static host classification is currently truth-frozen as stale current-OpenClaw host seam drift under `docs/evidence/YYYY-MM-DD/<git-sha>/short-static-classification/`, and the final narrow worker-down host claim still remains open
+- **Level 4:** not frozen end to end
+More specific current truth:
+- deterministic session-bound `brain_teach` proof exists
+- deterministic runtime proof for teach retrieval and worker-down fail-open exists and has been stabilized on isolated roots
+- sterile preflight/config seam repairs are real
+- the full sterile host harness is still not frozen because it currently stalls during `openclawbrain init` before the host-turn proof bundle completes
+That means the repo is beyond theory-only, but it still does **not** have a frozen operator-grade release-evidence ladder.
+## What CI should eventually enforce
+The intended release gate should eventually require at least:
+- tests
+- package verification (`npm pack --dry-run` or stronger equivalent)
+- evidence-ladder checks appropriate to the release claim
+- host/runtime validation checks that match the repo's public contract
-That means the repo is already beyond theory-only, but it does **not** yet have a frozen release-evidence ladder.
+Until that exists, docs must stay honest that the evidence ladder is partly documented discipline rather than a fully enforced release boundary.
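
The artifact layout described in `docs/EVIDENCE.md` lends itself to a mechanical completeness check. The TypeScript below is a minimal sketch of one, assuming the file names listed in the diff above. The function names, the `BundleClaim` shape, and the 7 to 40 character hex pattern for `<git-sha>` are assumptions for illustration, not part of the package.

```typescript
// File names taken from the "Artifact layout" section of docs/EVIDENCE.md.
const REQUIRED_FILES = [
  "summary.md",
  "validation-report.json",
  "status.json",
  "doctor.json",
  "config-snapshot.json",
  "logs.txt",
];

// Pre-run diagnostic ladder outputs required for Level 4 host-install runs.
const LEVEL4_FILES = [
  "status-all.txt",
  "gateway-probe.txt",
  "gateway-status.txt",
  "channels-status.txt",
];

// Hypothetical description of what a bundle claims to prove.
interface BundleClaim {
  routedPath: boolean;        // a routed path is part of the claim
  level4HostInstall: boolean; // the run is a Level 4 host-install run
}

// Return the required file names that are absent from the bundle directory listing.
function missingBundleFiles(present: readonly string[], claim: BundleClaim): string[] {
  const required = [...REQUIRED_FILES];
  if (claim.routedPath) required.push("trace.json");
  if (claim.level4HostInstall) required.push(...LEVEL4_FILES);
  const have = new Set(present);
  return required.filter((name) => !have.has(name));
}

// Check the docs/evidence/YYYY-MM-DD/<git-sha>/ layout.
// Assumption: the SHA is 7 to 40 lowercase hex characters.
function isBundlePath(path: string): boolean {
  return /^docs\/evidence\/\d{4}-\d{2}-\d{2}\/[0-9a-f]{7,40}\/?$/.test(path);
}
```

In practice the `present` argument could come from a directory listing such as `fs.readdirSync(bundleDir)`, and a non-empty result would mark the bundle as partial rather than frozen.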

package/docs/RELEASE_CONTRACT.md CHANGED

@@ -1,20 +1,18 @@
 # OpenClawBrain v2 — Release Contract
-This is the fast truth surface for the repo.
+This is the sharp truth surface for the repo.
 Use these public labels consistently:
 - **paper-faithful core**
 - **live-path implemented**
 - **operationally validated**
 Current truthful state:
 - **paper-faithful core:** yes
 - **live-path implemented:** yes
 - **operationally validated:** not yet
-That is the contract. The repo is already beyond "foundation only," but it is not yet at an honest 1.0 operating state.
+That is the contract. The repo is beyond "foundation only," but it is not yet at an honest operator-grade 1.0.
 ## 1. True in code now
@@ -23,7 +21,7 @@ These are safe public claims today.
 ### Paper-faithful routing core
 - **Finite-horizon traversal with `STOP`**
   - Code: `src/brain-core/traverse.ts`, `test/brain-core/traverse.test.ts`
-- **Terminal reward with baseline, not shaping rewards**
+- **Terminal reward with baseline rather than shaping rewards**
   - Code: `src/brain-core/episode.ts`, `src/brain-core/update.ts`, `src/brain-worker/worker.ts`, `test/brain-core/update.test.ts`
 - **Stochastic policy over actions**
   - Code: `src/brain-core/policy.ts`, `src/brain-core/traverse.ts`, `test/brain-core/policy.test.ts`
@@ -45,47 +43,63 @@ These are safe public claims today.
   - Code: `src/brain-runtime/service.ts`, `src/brain-core/trace.ts`, `test/brain-runtime/service.test.ts`
 - **Serve from the last promoted pack even when the worker is unavailable**
   - Code: `src/brain-runtime/service.ts`, `test/brain-runtime/service.test.ts`, `scripts/validate-brain-runtime-behavior.ts`
-- **Child-worker mode exists and is real**
-  - Code: `openclaw.plugin.json`, `src/brain-runtime/service.ts`, `src/brain-worker/child-runner.ts`, `test/brain-runtime/service.test.ts`
+- **Child-worker mode is real**
+  - Code: `openclaw.plugin.json`, `src/brain-runtime/service.ts`, `src/brain-runtime/worker-supervisor.ts`, `src/brain-worker/child-runner.ts`, `test/brain-runtime/service.test.ts`
+- **Structured raw evidence and worker-side trust resolution are real**
+  - Code: `src/brain-runtime/harvester-extension.ts`, `src/brain-runtime/evidence-detectors.ts`, `src/brain-harvest/*.ts`, `src/brain-worker/worker.ts`, `src/brain-store/store.ts`
 ## 2. Implemented but not frozen
 These are real enough to build on, but not frozen enough to oversell.
-- **Host-surface validation harness**
-  - Current files: `scripts/validate-openclaw-install.mjs`, `scripts/validate-brain-runtime-behavior.ts`, `scripts/validate-short-static-classification.ts`
-  - Truth: recurrent routing, shadow mode, and current host checks run inside a dedicated sterile validation lane with per-run diagnostic artifacts; deterministic session-bound `brain_teach` proof now exists, but the current raw host lane is blocked by stale OpenClaw seam drift (`plugins.slots.contextEngine` rejected, `api.registerContextEngine` removed) and the final narrow worker-down host claim is still incomplete.
-  - Boundary: raw prompt-driven `openclaw agent --local` is **not** the release proof boundary for `brain_teach`; that claim is now closed by the deterministic session-bound harness rather than raw host prompting.
-  - Boundary: short-static host drift is currently truth-frozen as stale current-OpenClaw host seam drift, not as a resolved semantic behavior claim.
-  - Boundary: worker-down host proof is claimed only at the exact host-visible boundary actually proven (continued serving from the last promoted pack + unhealthy worker status / exit truth), not as a stronger deterministic crash-observation claim.
-- **Child-worker serving boundary**
-  - Current files: `src/brain-runtime/service.ts`, `src/brain-runtime/worker-supervisor.ts`, `src/brain-worker/child-runner.ts`, `src/brain-worker/protocol.ts`, `src/brain-cli.ts`
-  - Truth: the child worker now runs behind a dedicated supervisor boundary with explicit protocol messages, restart accounting, reload acknowledgements, lease protection, and stronger status/doctor truth. `in_process` mode remains available only as a dev-only fallback and must not be treated as the production operator boundary.
-- **Raw evidence → resolved labels flow**
-  - Current files: `src/brain-runtime/harvester-extension.ts`, `src/brain-runtime/evidence-detectors.ts`, `src/brain-harvest/*.ts`, `src/brain-worker/worker.ts`, `src/brain-store/store.ts`, `src/engine.ts`
-  - Truth: explicit evidence tables and trust-ordered resolution are real; harvested assistant/tool messages can now persist multiple concurrent raw signals with extractor metadata before worker resolution, and structured tool-result/function-output parts now feed self-evidence detection before regex fallback. The remaining gap is that source extraction itself still leans heavily on heuristics, especially for scanner-style evidence.
-- **Replay-gated promotion**
-  - Current files: `src/brain-core/replay.ts`, `src/brain-core/pack.ts`, `src/brain-worker/worker.ts`
-  - Truth: promotion gates exist, but mutation evaluation is still closer to proposal-by-proposal than bundle-level replay decisions.
+### Host-surface validation harness
+- Current files: `scripts/validate-openclaw-install.mjs`, `scripts/validate-brain-runtime-behavior.ts`, `scripts/validate-brain-teach-session-bound.ts`, `scripts/validate-short-static-classification.ts`
+- Truth:
+  - deterministic session-bound `brain_teach` proof exists
+  - deterministic runtime proof for teach retrieval and worker-down fail-open exists
+  - OpenClawBrain now includes a hook-based compatibility bridge for hosts where `api.registerContextEngine` is gone
+  - the sterile harness no longer writes the dead `plugins.slots.contextEngine` slot
+- Boundary:
+  - raw prompt-driven `openclaw agent --local` is **not** the release proof boundary for `brain_teach`
+  - the full sterile host harness is still **not frozen end to end** because it currently stalls during `openclawbrain init` before the host-turn proof bundle completes
+  - until that host lane is frozen, short-static host semantics and the final narrow worker-down host claim are still not closed at the host boundary
+### Child-worker serving boundary
+- Current files: `src/brain-runtime/service.ts`, `src/brain-runtime/worker-supervisor.ts`, `src/brain-worker/child-runner.ts`, `src/brain-worker/protocol.ts`, `src/brain-cli.ts`
+- Truth: the child worker now runs behind a dedicated supervisor boundary with explicit protocol messages, restart accounting, reload acknowledgements, lease protection, and stronger status/doctor truth. `in_process` remains a dev-only fallback rather than the operator boundary.
+### Raw evidence → resolved labels flow
+- Current files: `src/brain-runtime/harvester-extension.ts`, `src/brain-runtime/evidence-detectors.ts`, `src/brain-harvest/*.ts`, `src/brain-worker/worker.ts`, `src/brain-store/store.ts`, `src/engine.ts`
+- Truth: multiple concurrent raw signals can be persisted before worker-side resolution; structured tool/function-output parts feed self-evidence detection; scanner guidance can bind to structured message parts; and same-trust scanner conflicts now prefer structured extractors over heuristic-only scanner signals.
+- Boundary: source extraction still leans too heavily on heuristics outside the structured cases already covered.
+### Replay-gated promotion
+- Current files: `src/brain-core/replay.ts`, `src/brain-core/pack.ts`, `src/brain-worker/worker.ts`
+- Truth: promotion gates exist and matter.
+- Boundary: mutation evaluation is still closer to proposal-level checks than the intended bundle-level replay contract.
+### Packaging and release boundary
+- Current files: `package.json`, `README.md`, `docs/EVIDENCE.md`, future CI/release workflow surfaces
+- Truth: the package publishes and the repo has a documented proof ladder.
+- Boundary: release verification and package boundaries are still looser than the intended operator-grade release standard.
 ## 3. Not done yet
 These are still active work and must not be described as complete.
-- **Frozen host-surface proof for worker-down fail-open on the current host seam**
-  - Primary files: `scripts/validate-openclaw-install.mjs`, `scripts/validate-brain-teach-session-bound.ts`, `scripts/validate-short-static-classification.ts`, `src/brain-runtime/tools.ts`, `src/brain-runtime/service.ts`
-  - Required truth before this is marked done: keep deterministic session-bound `brain_teach` proof frozen, adapt the current OpenClaw host seam, and then land a narrow host worker-down claim that matches the actual artifact bundle.
-- **Resolved short-static-lookup host-surface semantics on the adapted current host seam**
-  - Primary files: `src/brain-runtime/assembler-extension.ts`, `scripts/validate-openclaw-install.mjs`, `scripts/validate-short-static-classification.ts`
+- **Frozen end-to-end host-surface proof on the current host seam**
+  - Required truth before done: the sterile host harness must complete again, and the resulting artifacts must freeze the actual current host claims rather than older seam failures.
 - **Bundle-based mutation evaluation with clear pass/fail explanations**
   - Primary files: `src/brain-core/mutator.ts`, `src/brain-worker/worker.ts`, `src/brain-store/store.ts`, `src/brain-store/migrations.ts`
-- **Frozen proof ladder with dated release artifacts**
-  - Primary files: `docs/EVIDENCE.md`, `docs/evidence/`, `scripts/validate-openclaw-install.mjs`
+- **CI-enforced proof ladder / release gates**
+  - Primary files: future workflow surfaces, `package.json`, `docs/EVIDENCE.md`
+- **Clean npm/package boundary for outside operators**
+  - Primary files: `package.json`, release workflow, docs packaging boundary
 - **Green full-repo `npx tsc --noEmit`**
   - Primary files: `tsconfig.json`, `package.json`, SDK-boundary imports
-- **Boring install / recovery path for another operator**
-  - Primary files: `README.md`, `docs/configuration.md`, `openclaw.plugin.json`, future release workflow/evidence files
+- **Boring install / validation / recovery path for another operator**
+  - Primary files: `README.md`, `docs/configuration.md`, `openclaw.plugin.json`, validation scripts
 ## Safe public summary
-> OpenClawBrain v2 already has a paper-faithful routing core and a real live runtime path. What remains is the operational hardening, host-surface proof, mutation-bundle evaluation, and release-evidence layer.
+> OpenClawBrain v2 already has a paper-faithful routing core and a real live runtime path. The remaining work is mainly host-surface proof, release engineering, bundle-level mutation evaluation, packaging hardening, and cleaner operator truth.
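
The contract's "Raw evidence → resolved labels flow" entry describes trust-ordered, one-winner-per-episode resolution, with structured extractors preferred over heuristic-only signals on same-trust scanner conflicts. The TypeScript below is a minimal sketch of that idea, not the worker's implementation. The `RawEvidence` shape, the `human > self > scanner` trust ordering, and the function names are assumptions, and the structured-over-heuristic tie-break is applied to every source here for simplicity.

```typescript
type Source = "human" | "self" | "scanner";

// Hypothetical raw evidence record; the real schema in src/brain-store may differ.
interface RawEvidence {
  episodeId: string;
  source: Source;
  structured: boolean; // true when produced by a structured extractor, false for heuristic-only
  label: "positive" | "negative";
}

// Assumed trust ordering (higher wins); the actual ordering is not shown in this diff.
const TRUST: Record<Source, number> = { human: 3, self: 2, scanner: 1 };

// True when `a` should replace `b` as the winning evidence for an episode.
function outranks(a: RawEvidence, b: RawEvidence): boolean {
  if (TRUST[a.source] !== TRUST[b.source]) {
    return TRUST[a.source] > TRUST[b.source];
  }
  // Same trust level: prefer structured evidence over heuristic-only evidence.
  return a.structured && !b.structured;
}

// Collapse many concurrent raw signals into exactly one winner per episode.
function resolveLabels(evidence: readonly RawEvidence[]): Map<string, RawEvidence> {
  const winners = new Map<string, RawEvidence>();
  for (const item of evidence) {
    const current = winners.get(item.episodeId);
    if (current === undefined || outranks(item, current)) {
      winners.set(item.episodeId, item);
    }
  }
  return winners;
}
```

Keeping resolution in one pure function reflects the stated design: harvesters persist raw signals, and only the worker decides which signal becomes the label.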