npm - @bradheitmann/odin-sentinel - Versions diffs - 0.4.12 → 0.5.0 - Mend

@bradheitmann/odin-sentinel 0.4.12 → 0.5.0


Files changed (92)

package/.claude-plugin/marketplace.json +1 -1
package/README.md +24 -17
package/dist/src/harness-pacing/index.d.ts +10 -0
package/dist/src/harness-pacing/index.js +11 -0
package/dist/src/harness-pacing/index.js.map +1 -0
package/dist/src/harness-pacing/recommend.d.ts +28 -0
package/dist/src/harness-pacing/recommend.js +74 -0
package/dist/src/harness-pacing/recommend.js.map +1 -0
package/dist/src/harness-pacing/schema.d.ts +28 -0
package/dist/src/harness-pacing/schema.js +2 -0
package/dist/src/harness-pacing/schema.js.map +1 -0
package/dist/src/harness-pacing/storage.d.ts +32 -0
package/dist/src/harness-pacing/storage.js +74 -0
package/dist/src/harness-pacing/storage.js.map +1 -0
package/dist/src/mcp/server.js +29 -2
package/dist/src/mcp/server.js.map +1 -1
package/dist/src/odin-watch/backends/cmux.d.ts +6 -0
package/dist/src/odin-watch/backends/cmux.js +39 -0
package/dist/src/odin-watch/backends/cmux.js.map +1 -0
package/dist/src/odin-watch/backends/tmux.d.ts +6 -0
package/dist/src/odin-watch/backends/tmux.js +40 -0
package/dist/src/odin-watch/backends/tmux.js.map +1 -0
package/dist/src/odin-watch/classifier.d.ts +27 -0
package/dist/src/odin-watch/classifier.js +182 -0
package/dist/src/odin-watch/classifier.js.map +1 -0
package/dist/src/odin-watch/index.d.ts +2 -0
package/dist/src/odin-watch/index.js +200 -0
package/dist/src/odin-watch/index.js.map +1 -0
package/dist/src/odin-watch/snapshotter.d.ts +11 -0
package/dist/src/odin-watch/snapshotter.js +2 -0
package/dist/src/odin-watch/snapshotter.js.map +1 -0
package/dist/src/odin-watch/writers.d.ts +8 -0
package/dist/src/odin-watch/writers.js +27 -0
package/dist/src/odin-watch/writers.js.map +1 -0
package/dist/src/protocol/index.d.ts +3 -1
package/dist/src/protocol/index.js +4 -1
package/dist/src/protocol/index.js.map +1 -1
package/dist/src/protocol/repository.d.ts +14 -0
package/dist/src/protocol/repository.js +25 -1
package/dist/src/protocol/repository.js.map +1 -1
package/dist/src/protocol/schemas.d.ts +144 -0
package/dist/src/protocol/schemas.js +23 -0
package/dist/src/protocol/schemas.js.map +1 -1
package/dist/src/protocol/service.d.ts +19 -2
package/dist/src/protocol/service.js +89 -3
package/dist/src/protocol/service.js.map +1 -1
package/dist/src/protocol/surface-layout.d.ts +20 -0
package/dist/src/protocol/surface-layout.js +20 -0
package/dist/src/protocol/surface-layout.js.map +1 -1
package/dist/src/protocol/version.d.ts +2 -2
package/dist/src/protocol/version.js +2 -2
package/dist/src/protocol/version.js.map +1 -1
package/dist/src/utils/execFileNoThrow.d.ts +5 -0
package/dist/src/utils/execFileNoThrow.js +18 -0
package/dist/src/utils/execFileNoThrow.js.map +1 -0
package/docs/adapters/cmux-adapter.md +168 -0
package/docs/adapters/herdr-adapter.md +150 -0
package/docs/adapters/minimux-adapter.md +152 -0
package/docs/adapters/plain-terminal.md +80 -0
package/docs/adapters/tmux-adapter.md +150 -0
package/docs/guides/quick-start.md +7 -7
package/docs/guides/quickstart-prompts.md +4 -4
package/docs/lattice/odin-lattice-design.md +555 -0
package/docs/reference/distribution.md +11 -5
package/docs/reference/public-surface-audit.md +3 -3
package/package.json +7 -5
package/plugins/odin-scp/.claude-plugin/plugin.json +2 -2
package/plugins/odin-scp/README.md +6 -6
package/plugins/odin-scp/skills/odin-scp/CHANGELOG.md +12 -0
package/plugins/odin-scp/skills/odin-scp/SKILL.md +196 -3
package/plugins/odin-scp/skills/odin-scp/references/canonical-introduction-prompt.md +0 -2
package/protocol/SCP.md +2 -2
package/protocol/bootstrap-skill.md +196 -3
package/protocol/closeout.yaml +1 -1
package/protocol/delegation.yaml +1 -1
package/protocol/mission-frontrun/droids-scrutiny-feature-reviewer.md +70 -0
package/protocol/mission-frontrun/orchestrator-contract.md +70 -0
package/protocol/mission-frontrun/scrutiny-feature-reviewer-contract.md +73 -0
package/protocol/mission-frontrun/scrutiny-validator-contract.md +77 -0
package/protocol/mission-frontrun/worker-contract.md +66 -0
package/protocol/model-profiles.yaml +8 -1
package/protocol/receipts/boot-receipt.yaml +13 -0
package/protocol/role-cards/dev-worker.md +74 -0
package/protocol/role-cards/exec-asst.md +83 -0
package/protocol/role-cards/exec-pm.md +66 -0
package/protocol/role-cards/qa-worker.md +71 -0
package/protocol/role-cards/team-pm.md +67 -0
package/protocol/roles.yaml +1 -1
package/protocol/skill-references/canonical-introduction-prompt.md +0 -2
package/protocol/topology.yaml +1 -1
package/scripts/audit/public-surface.mjs +27 -2
package/scripts/audit/verify-pack.mjs +121 -5

package/plugins/odin-scp/README.md CHANGED

@@ -21,7 +21,7 @@ claude plugin install odin-scp@odin-sentinel
 Restart Claude Code. The plugin will:
 - Install the `odin-scp` skill (so `/odin-scp` is available as a slash command).
-- Register the `odin-sentinel` MCP server, spawned via `pnpm dlx --package @bradheitmann/odin-sentinel@0.4.12 odin-sentinel-mcp`.
+- Register the `odin-sentinel` MCP server, spawned via `pnpm dlx --package @bradheitmann/odin-sentinel@0.5.0 odin-sentinel-mcp`.
 If install fails, treat it as setup state, not user failure. Check whether
 Claude Code is installed, signed in, and allowed to use plugins; otherwise use
@@ -30,8 +30,8 @@ the direct install paths below.
 ## What you get
 - **Skill content**: the full SCP governance contract (boot receipts, role topology, delegation, CMUX delivery proof, heartbeat cadence, adversarial QA, finish audit) plus the referenced prompt, harness target, boot receipt, and team bootstrap runbook files.
-- **MCP tools**: 27 `odin.*` tools including `compute_surface_layout`, `get_role_profile`, `get_onboarding_plan`, `validate_boot_receipt`, `compile_session_report`, and `get_bootstrap_skill`.
-- **MCP resources**: 13 protocol documents addressable via `odin://protocol/*` URIs.
+- **MCP tools**: 29 `odin.*` tools including `compute_surface_layout`, `get_role_profile`, `get_role_card`, `get_onboarding_plan`, `validate_boot_receipt`, `compile_session_report`, `get_bootstrap_skill`, and `get_mission_frontrun_pack`.
+- **MCP resources**: 18 protocol documents addressable via `odin://protocol/*` URIs.
 ## Use without Claude Code
@@ -40,19 +40,19 @@ If you're on another MCP-capable host (Cursor, Codex, Zed, Goose, Crush, OpenCod
 Recommended:
 ```bash
-pnpm dlx --package @bradheitmann/odin-sentinel@0.4.12 odin-sentinel-mcp
+pnpm dlx --package @bradheitmann/odin-sentinel@0.5.0 odin-sentinel-mcp
 ```
 Supported npm global install:
 ```bash
-npm i -g @bradheitmann/odin-sentinel@0.4.12
+npm i -g @bradheitmann/odin-sentinel@0.5.0
 ```
 Supported npx zero-install:
 ```bash
-npx -y -p @bradheitmann/odin-sentinel@0.4.12 odin-sentinel-mcp
+npx -y -p @bradheitmann/odin-sentinel@0.5.0 odin-sentinel-mcp
 ```
 Then point your host's MCP config at the `odin-sentinel-mcp` binary. The bundled SCP skill is exposed there too via the `odin.get_bootstrap_skill` tool and the `odin://protocol/bootstrap-skill` resource. The referenced runbooks are exposed under `odin://protocol/skill-references/*` so MCP-only users get the same governance contract and supporting files.

package/plugins/odin-scp/skills/odin-scp/CHANGELOG.md CHANGED

@@ -1,5 +1,17 @@
 # Changelog
+## 0.5.0 - 2026-06-12
+- Added role cards for all five roles (exec-pm, team-pm, dev-worker, qa-worker, exec-asst) with tiered uptake support and `odin.get_role_card` MCP tool.
+- Cache-aligned packet ordering with hash-pinned re-arm: startup packets now include a content hash; re-arm requests are rejected unless the hash matches the cached packet.
+- Enforced no-bare-header rule: protocol resources must include a top-level header before any content block.
+- Added Crush pacing guidance: recommended token budget, pacing cadence, and harness-specific polling recommendations for Crush harness operators.
+- Hybrid Mission/surfaces topology with substrate capability tiers: surface layout now distinguishes human-CMUX, tab-only, and headless substrate types with per-tier capability declarations.
+- ODIN-watch wake analyzer for cmux and tmux: deterministic wake analysis to identify stalled surfaces, missed heartbeats, and polling overruns.
+- Harness pacing telemetry: optional session-report fields for pacing cadence, token consumption rate, and harness-level timing data.
+- Lattice design doc: canonical reference for the knowledge-lattice substrate used by ODIN Sentinel protocol resources.
+- Server now exposes 28 MCP tools and 18 MCP resources.
 ## 3.6.0 - 2026-05-11
 - Added `EXEC PM is the sole staffing authority` and `EXEC PM is the sole CMUX surface custodian` to Non-Negotiables. TEAM PMs and workers cannot staff, split, move, or close surfaces.

package/plugins/odin-scp/skills/odin-scp/SKILL.md CHANGED

@@ -7,7 +7,7 @@ updated: 2026-05-11
 # Sentinel Coordination Protocol
-SCP_PUBLIC_VERSION: 0.4.12
+SCP_PUBLIC_VERSION: 0.5.0
 MIN_COMPATIBLE_CHILD_MCP: 0.4.5
 Public install readiness: configure the ODIN MCP server, install native skill context where supported or use full prompt fallback, keep governed team roles in CMUX, verify auth/account readiness without printing secrets, smoke-test local inference if used, and validate role compatibility before launch. Private local skill copies may differ intentionally; public release checks compare repo-internal public artifacts only.
@@ -38,7 +38,7 @@ Portable curated skill/session records may live under the declared source, for e
 - Keep agents interchangeable by role, not by blurred authority. Any supported harness may serve any role only after it has a clean boot block declaring the current role, write scope, branch, cwd, model/harness, and proof source. The same assignment must not QA and close its own work.
 - Preserve strict scope. Governance/package work cannot mutate product code, Loop runtime, design prototypes, operational-team work product, or lifecycle state unless explicitly authorized.
 - Use zero-secret-output behavior. Never print tokens, API keys, OAuth material, or config values. Report secret presence by name/count/status only.
-- Under SCP, team topology is the audit surface. If work is not visible in CMUX, it is not governed work.
+- Under SCP, team topology is the audit surface. If work is not visible on a substrate meeting READ_SCREEN + ENTER_PROOF (capability tier 1+), it is not governed work. CMUX is the reference substrate; cmux-specific rules remain in force.
 - Preserve official SCP team topology. Once `EXEC PM` has bootstrapped the executive office and pods, role-named CMUX panes/surfaces are immutable operating slots. Do not close, delete, rename, repurpose, or replace the slot itself unless the user explicitly authorizes that exact slot mutation.
 - Treat agents as occupants of durable role slots. If a model/harness is blocked, stale, over budget, in plan mode, context-exhausted, or wrong for the task, clear, restart, exit, or substitute the agent occupant inside the existing role slot. Do not remove the CMUX pane/surface.
 - Do not create extra panes, extra workers, hidden assistants, invisible subagents, or ad hoc capacity during an active SCP run unless the user explicitly authorizes topology expansion. `EXEC PM` must route work to official roles already present in CMUX.
@@ -70,6 +70,99 @@ SCP is a standing control loop, not a one-time boot banner. Read or re-invoke th
 If an agent cannot state its current SCP role, authority layer, `may_implement`, `may_qa_accept`, reports-to chain, and next receipt type, it must stop and re-emit `SCP_BOOT_RECEIPT`.
+### Tiered Protocol Uptake
+Full SCP skill load is required for control-plane roles only:
+- EXEC PM: full skill at session start
+- ODIN monitor: full skill at session start
+- TEAM PM: full skill at team launch
+All other roles activate via their role-specific quick-start card (<=4KB):
+- DEV WORKER, QA WORKER, EXEC-ASST: receive their role card via
+  odin.get_role_card or via the odin://protocol/role-cards/{role_id} resource
+After initial activation, the full skill is NEVER re-read at heartbeat,
+dispatch, delegation, or QA cadence points. Use hash-pinned re-arm instead
+(see Hash-Pinned Skill Re-Arm section).
+### Hash-Pinned Skill Re-Arm
+At session boot, record `scp_skill_sha256` in the boot receipt:
+- Obtain the skill content SHA-256 from `odin.get_role_card` (the tool returns
+  `content_sha256` for the active role card).
+- For control-plane roles, compute SHA-256 of the full bootstrap-skill.md content.
+- Record in `scp_skill_sha256` field of the boot receipt.
+At each subsequent cadence point (heartbeat, dispatch, delegation, QA):
+- Do NOT re-read the full skill.
+- If re-arm verification is needed, call `odin.get_role_card` with the known
+  role_id and compare returned `content_sha256` against the recorded value.
+- Match: no re-read needed; proceed.
+- Mismatch: full re-read required; update `scp_skill_sha256` in the receipt.
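A minimal sketch of the re-arm comparison this hunk describes, assuming Node.js and its built-in `node:crypto` module. Only the field names `scp_skill_sha256` and `content_sha256` come from the diff. The `BootReceipt` shape, the `dev-worker` role id, and the helpers `sha256Hex` and `rearmDecision` are hypothetical, and the real `odin.get_role_card` call is stood in for by a plain string.

```typescript
import { createHash } from "node:crypto";

// Hypothetical receipt shape; only scp_skill_sha256 is named in the diff.
interface BootReceipt {
  role_id: string;
  scp_skill_sha256: string;
}

// SHA-256 of UTF-8 text, returned as lowercase hex.
function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

type RearmDecision = "proceed" | "full_reread";

// Compare the hash recorded at boot with the hash reported at a cadence point.
function rearmDecision(receipt: BootReceipt, currentSha256: string): RearmDecision {
  return receipt.scp_skill_sha256 === currentSha256 ? "proceed" : "full_reread";
}

// Placeholder card text; a real card would come from odin.get_role_card.
const cardAtBoot = "# DEV WORKER role card\n(placeholder content)";
const receipt: BootReceipt = {
  role_id: "dev-worker", // assumed role id, for illustration only
  scp_skill_sha256: sha256Hex(cardAtBoot),
};

console.log(rearmDecision(receipt, sha256Hex(cardAtBoot)));
console.log(rearmDecision(receipt, sha256Hex(cardAtBoot + "\nedited")));
```

On a mismatch the caller would re-read the full content and overwrite `scp_skill_sha256` in the receipt, as the hunk requires.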
+### Canonical Cache-Aligned Packet Ordering
+All dispatch packets must follow this canonical order to maximize cache prefix
+stability across all LLM providers:
+1. **Stable identity block** — SCP role contract, governance preamble
+2. **Stable role card** — role-specific quick-start card (static resource)
+3. **Stable repo invariants** — repo name, main branch, core constraints
+4. **Stable LCE/evidence recipe** — evidence format, write scope format
+5. **Volatile dispatch tail** — current slice ID, current HEAD, task-specific
+   write scope, current blockers, evidence path, timestamps
+Dynamic content (timestamps, current HEAD, short-lived state) goes LAST.
+Do not place frequently-changing items before the stable prefix.
+This ordering is IMMUTABLE. Reordering any layer requires EXEC PM approval
+and a SCP_PUBLIC_VERSION minor bump.
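The five-layer ordering above can be sketched as a fixed concatenation, so that two dispatches differ only in their tail. This is an illustration rather than the package's implementation: the layer keys, `assemblePacket`, and `sharedPrefixLength` are hypothetical names, and the sample strings are placeholders.

```typescript
// Hypothetical keys for the five layers listed in the diff, in canonical order.
const LAYER_ORDER = [
  "identity",
  "role_card",
  "repo_invariants",
  "evidence_recipe",
  "dispatch_tail",
] as const;

type Layer = (typeof LAYER_ORDER)[number];
type PacketParts = Record<Layer, string>;

// Join layers in canonical order; the volatile tail always comes last.
function assemblePacket(parts: PacketParts): string {
  return LAYER_ORDER.map((layer) => parts[layer]).join("\n\n");
}

// Length of the leading substring two packets have in common.
function sharedPrefixLength(a: string, b: string): number {
  let i = 0;
  while (i < a.length && i < b.length && a[i] === b[i]) i++;
  return i;
}

const stable = {
  identity: "SCP role contract (placeholder)",
  role_card: "DEV WORKER quick-start card (placeholder)",
  repo_invariants: "repo: example; main branch: main",
  evidence_recipe: "evidence format (placeholder)",
};

const packetA = assemblePacket({ ...stable, dispatch_tail: "slice: S-1\nhead: abc123" });
const packetB = assemblePacket({ ...stable, dispatch_tail: "slice: S-2\nhead: def456" });

// Everything before the tail is byte-identical across the two dispatches.
const stablePrefix = [
  stable.identity,
  stable.role_card,
  stable.repo_invariants,
  stable.evidence_recipe,
].join("\n\n") + "\n\n";

console.log(sharedPrefixLength(packetA, packetB) >= stablePrefix.length);
```

Keeping the order in a single constant means a reordering shows up as a one-line change, which suits the approval requirement stated in the hunk.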
+### No-Bare-Header Rule
+Never send a coordination header ([SCP-DELEGATE], [SCP-AGENT-SUBSTITUTION],
+or similar) without its body in the same delivery unit. An agent receiving only
+a header cannot distinguish protocol fragmentation from an invalid send.
+For harnesses that may show chunked input (Crush, some Droid surfaces):
+use a "DIRECT ROLE CONTRACT" plain-language wrapper instead of leading with
+a protocol header when fragmentation risk is present.
+Receivers that receive a bare header with no body MUST classify the delivery
+MALFORMED_COORDINATION and emit `[SCP-FEEDBACK]` requesting a resend.
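A receiver-side check for the rule above might look like the following sketch. The header pattern and the wording of the feedback message are assumptions; the diff names only the `MALFORMED_COORDINATION` classification and the `[SCP-FEEDBACK]` tag.

```typescript
// Assumed header shape: a line consisting solely of a bracketed SCP tag,
// such as [SCP-DELEGATE]. The diff does not define a formal grammar.
const HEADER_ONLY = /^\[SCP-[A-Z-]+\]$/;

// True when the delivery has at least one header line and no body lines.
function isBareHeader(delivery: string): boolean {
  const lines = delivery
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  return lines.length > 0 && lines.every((line) => HEADER_ONLY.test(line));
}

// Returns a resend request for bare headers, or null for any other delivery.
// The message text is illustrative, not taken from the package.
function feedbackFor(delivery: string): string | null {
  return isBareHeader(delivery)
    ? "[SCP-FEEDBACK] MALFORMED_COORDINATION: header arrived without a body; resend as one delivery unit."
    : null;
}

console.log(feedbackFor("[SCP-DELEGATE]\n"));
console.log(feedbackFor("[SCP-DELEGATE]\nrole: DEV WORKER\nwrite scope: docs/"));
```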
+### Crush Bootstrap Delivery Guideline
+When activating an agent on a Crush harness surface:
+1. `cmux read-screen` first — confirm the surface is idle (no active spinner,
+   no queued steering text visible).
+2. Send ONE complete instruction block — do not fragment the contract across
+   multiple sends. One send, one block.
+3. Send Enter — a single `cmux send-keys Enter` after the block.
+4. Wait for idle — use `cmux wait-for` or poll until the surface returns to
+   a shell prompt or response state.
+5. Clear queued text before reissue — if the surface is unresponsive, clear
+   any queued steering text before sending again.
+**Two-panic rule**: If Crush panics twice in the same role slot during
+bootstrap, mark AGENT_SUBSTITUTION_REQUIRED and switch to the QA fallback
+ladder. Do not attempt a third bootstrap in the same slot.
+Preferred parking receipt format: one-line status + semicolon-delimited
+SCP_MIN_BOOT_RECEIPT fields (compact enough to avoid queue overflow).
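The two-panic rule above amounts to a per-slot counter. A minimal sketch, assuming slots are identified by a string name; `BootstrapPanicTracker` and the `RETRY_ALLOWED` label are hypothetical, while `AGENT_SUBSTITUTION_REQUIRED` is the marker named in the diff.

```typescript
type SlotStatus = "RETRY_ALLOWED" | "AGENT_SUBSTITUTION_REQUIRED";

// Counts bootstrap panics per role slot and applies the two-panic threshold.
class BootstrapPanicTracker {
  private readonly panics = new Map<string, number>();

  // Record one panic in the slot and report whether another attempt is allowed.
  recordPanic(slot: string): SlotStatus {
    const count = (this.panics.get(slot) ?? 0) + 1;
    this.panics.set(slot, count);
    return count >= 2 ? "AGENT_SUBSTITUTION_REQUIRED" : "RETRY_ALLOWED";
  }
}

const tracker = new BootstrapPanicTracker();
console.log(tracker.recordPanic("QA WORKER 1")); // first panic in this slot
console.log(tracker.recordPanic("QA WORKER 1")); // second panic in the same slot
```

Counts are kept per slot, so a panic in one role slot does not count against another.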
+### Anthropic Cache-Warming Guidance
+Before activating a fleet of Claude-occupied roles:
+1. Send a warm-up request with `max_tokens: 0` to pre-populate the prefix cache.
+   This spares every agent in the fleet a cache-write cost on its
+   first request.
+2. Set `ENABLE_PROMPT_CACHING_1H=1` on API keys where 1-hour TTL is acceptable.
+   This reduces cache-write cost for long sessions.
+3. Place the stable role card content before the volatile dispatch tail to
+   maximize the length of the cacheable prefix.
 ## Generic Role Model And Control Topology
 SCP role names are generic. Do not bind authority to model names, harness names, pane names, or vendor brands. Every assignment must separate:
@@ -97,6 +190,31 @@ Preferred role taxonomy:
 - `INTEGRATION STEWARD`: merge/cherry-pick/integration proof and branch hygiene. Does not implement product features unless separately authorized.
 - `QUEUE TRIAGE`: dependency, readiness, and dispatch-order analysis.
+#### PM Reasoning Level Guidance
+Adjust reasoning level by phase:
+- **L1 — Passive supervision** (lower reasoning): polling, heartbeat routing,
+  contract maintenance, flag-file checks. Do not use full reasoning for
+  routine no-change supervision cycles.
+- **L2 — Active coordination** (medium reasoning): dispatch planning, model and
+  role assignment, QA synthesis and comparison, merge commit review.
+- **L3 — Authority decisions** (full reasoning): disagreement resolution,
+  merge conflicts, protocol exception handling, closure decisions, any action
+  that modifies the governed team topology.
+#### EXEC-ASST as QA Capacity
+EXEC-ASST may serve as QA capacity only under the following conditions:
+1. An explicit role exception contract is sent before the QA task begins.
+2. The agent receives a fresh boot receipt acknowledging the role change.
+3. The prior task context (heartbeat, pane inventory) is explicitly parked or
+   cleared in the new boot receipt.
+Implicit role inference based on a prior task is prohibited. An EXEC-ASST that
+was running heartbeat loops does NOT automatically become a QA worker without
+a fresh contract.
 Use role-named terminal tabs/panes/surfaces when possible. Model and harness are capabilities, not identity. If a harness fails, substitute another harness by reissuing the same role contract; do not change scope or authority just because the runtime changed.
 Pane naming convention:
@@ -329,7 +447,82 @@ Default official grouping:
 If no existing role is appropriate, `EXEC PM` must request the user authorization before creating capacity.
-Active SCP visible role-slot rules override generic external subagent language while SCP is active. Generic external coordination concepts may describe Dev/QA capacity, but under SCP that capacity must be represented by visible CMUX role slots unless the user authorizes topology expansion.
+Active SCP visible role-slot rules override generic external subagent language while SCP is active. Generic external coordination concepts may describe Dev/QA capacity, but under SCP that capacity must be represented by visible role slots on a substrate meeting the declared capability tier (CMUX is the reference substrate) unless the user authorizes topology expansion.
+### Hybrid Mission/Surfaces Topology Default
+The default team primitive is separate visible surfaces, not Factory Missions.
+Use separate visible surfaces when:
+- QA independence is required (each reviewer needs its own boot receipt and model)
+- Model diversity reduces correlated blind spots in review
+- Dispatch control (model selection per slice) is needed
+Use Missions as one Dev capacity type when:
+- The work is a large decomposable implementation burst
+- Internal Mission orchestration adds value over manual decomposition
+- The Mission output will be reviewed by an external, independently contracted QA surface
+**Routing rules by slice size:**
+| Slice type | Dev surfaces | QA surfaces | Notes |
+|------------|-------------|-------------|-------|
+| Small | 1 | 1 | Simple implementation + review |
+| Medium | 2 (separate worktrees) | 1-2 | Parallel implementation |
+| Hard / risky | 1 strong Dev or Mission | 3 (QA swarm) | Adversarial review |
+| Ambiguous / multi-impl | Multiple Dev | Swarm selects best | Exploratory |
+| Cheap repetitive | 1 low-cost Droid | 1 stronger QA | Cost-optimized |
+| Large decomposable | 1 Factory Mission | External independent QA | Self-contained burst |
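The routing table above can be carried as a plain lookup. The sketch below transcribes the six rows as they appear in the diff; the `SliceType` keys and the `routeSlice` helper are hypothetical names, and cell values are kept as strings because several are not simple counts.

```typescript
type SliceType =
  | "small"
  | "medium"
  | "hard_risky"
  | "ambiguous_multi_impl"
  | "cheap_repetitive"
  | "large_decomposable";

interface Routing {
  devSurfaces: string;
  qaSurfaces: string;
  notes: string;
}

// Rows copied from the routing table in the diff.
const ROUTING: Record<SliceType, Routing> = {
  small: { devSurfaces: "1", qaSurfaces: "1", notes: "Simple implementation + review" },
  medium: { devSurfaces: "2 (separate worktrees)", qaSurfaces: "1-2", notes: "Parallel implementation" },
  hard_risky: { devSurfaces: "1 strong Dev or Mission", qaSurfaces: "3 (QA swarm)", notes: "Adversarial review" },
  ambiguous_multi_impl: { devSurfaces: "Multiple Dev", qaSurfaces: "Swarm selects best", notes: "Exploratory" },
  cheap_repetitive: { devSurfaces: "1 low-cost Droid", qaSurfaces: "1 stronger QA", notes: "Cost-optimized" },
  large_decomposable: { devSurfaces: "1 Factory Mission", qaSurfaces: "External independent QA", notes: "Self-contained burst" },
};

// Look up the recommended surfaces for a slice type.
function routeSlice(slice: SliceType): Routing {
  return ROUTING[slice];
}

console.log(routeSlice("hard_risky"));
```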
+### Factory Mission Front-Running
+Factory Mission spawns four hidden child roles — orchestrator, worker,
+scrutiny-validator, scrutiny-feature-reviewer — that are not bound to ODIN
+governance by default. Use the `odin.get_mission_frontrun_pack` tool to
+assemble a contract pack that binds all four roles before launch.
+**PROVEN seam (live-verified 2026-06-12):** `--append-system-prompt-file`
+front-runs all four Factory Mission hidden roles before Factory's weaker
+defaults activate. Always launch through this seam:
+```
+droid exec --mission --auto <level> \
+  --append-system-prompt-file <path/to/orchestrator-contract.md> \
+  -f <mission-prompt.md>
+```
+**UNPROVEN seam:** mission-local validator skill shadowing
+(`skills/scrutiny-validator/SKILL.md`). In the 2026-06-12 probe the validator
+loaded `builtin:scrutiny-validator`, not the mission-local file. Do not rely
+on this seam for governance until a follow-up isolation probe confirms it.
+**Boot contract receipt requirement:** Every hidden child role must emit a
+`BOOT_CONTRACT_RECEIPT` as its first output, with all six fields: `role`,
+`session_id`, `contract_path`, `byte_count`, `sha256`, `timestamp`. A missing
+receipt is a launch blocker, not an advisory.
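A launcher could enforce the six-field requirement above with a check along these lines. The field names are taken from the diff; the helpers `missingReceiptFields` and `isLaunchBlocked`, the sample values, and the choice to treat empty strings as missing are assumptions.

```typescript
// The six fields the diff requires in every BOOT_CONTRACT_RECEIPT.
const REQUIRED_FIELDS = [
  "role",
  "session_id",
  "contract_path",
  "byte_count",
  "sha256",
  "timestamp",
] as const;

// List the required fields that are absent, null, or empty in the receipt.
function missingReceiptFields(receipt: Record<string, unknown>): string[] {
  return REQUIRED_FIELDS.filter((field) => {
    const value = receipt[field];
    return value === undefined || value === null || value === "";
  });
}

// A missing receipt, or a receipt missing any field, blocks the launch.
function isLaunchBlocked(receipt: Record<string, unknown> | undefined): boolean {
  return receipt === undefined || missingReceiptFields(receipt).length > 0;
}

// Placeholder values for illustration only.
const sampleReceipt = {
  role: "worker",
  session_id: "session-001",
  contract_path: "contracts/worker-contract.md",
  byte_count: 2048,
  sha256: "0".repeat(64),
  timestamp: "2026-06-12T00:00:00Z",
};

console.log(isLaunchBlocked(sampleReceipt));
console.log(missingReceiptFields({ role: "worker" }));
```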
+**Verified-artifacts-only rule:** Final mission status must be assembled from
+verified artifacts (worker commits, validator synthesis, reviewer sign-off) —
+not from Mission final prose. Reusing Mission narrative as delivery proof is
+a governance violation.
+Use `odin.get_mission_frontrun_pack` to generate the contract pack with
+placeholders substituted for `mission_name`, `repo_path`, `write_scope`, and
+`task_id`.
+### Substrate Capability Tiers
+Protocol obligations reference capability tiers rather than specific harness names. Any substrate meeting the required tier may satisfy the obligation. CMUX is the reference substrate and remains the canonical choice for governed teams; cmux-specific rules are not deleted by this table.
+| Substrate | SEND | ENTER_PROOF | READ_SCREEN | WAIT_IDLE | EVENTS | Tier |
+|-----------|:----:|:-----------:|:-----------:|:---------:|:------:|:----:|
+| cmux | Y | Y | Y | Y | Y | 3 |
+| tmux | Y | Y | Y | Y | N | 2 |
+| minimux | Y | Y | Y | Y | Y | 4 |
+| herdr | Y | Y | Y | Y | Y | 3+ |
+| plain terminal | Y | N | N | N | N | 0 |
+Governed work must be visible on a substrate meeting READ_SCREEN + ENTER_PROOF (capability tier 1+). Work visible only on a plain terminal (tier 0) is not governed work under SCP. Where an obligation requires a substrate meeting EVENTS capability (tier 3+), the obligation cannot be satisfied by tier 0-2 substrates without explicit degraded-mode authorization in the boot receipt.
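The capability table above lends itself to a small lookup that answers the two questions the text raises: whether a substrate qualifies for governed work, and whether it offers a given capability. The sketch below copies the Y/N cells from the diff and leaves out the tier column; `supports` and `isGovernedSubstrate` are hypothetical helper names.

```typescript
type Capability = "SEND" | "ENTER_PROOF" | "READ_SCREEN" | "WAIT_IDLE" | "EVENTS";

const ALL: Capability[] = ["SEND", "ENTER_PROOF", "READ_SCREEN", "WAIT_IDLE", "EVENTS"];

// Capability sets transcribed from the Y/N columns of the table in the diff.
const SUBSTRATES = new Map<string, Capability[]>([
  ["cmux", ALL],
  ["tmux", ["SEND", "ENTER_PROOF", "READ_SCREEN", "WAIT_IDLE"]],
  ["minimux", ALL],
  ["herdr", ALL],
  ["plain terminal", ["SEND"]],
]);

// True when the named substrate lists the capability; unknown names have none.
function supports(substrate: string, capability: Capability): boolean {
  return (SUBSTRATES.get(substrate) ?? []).includes(capability);
}

// Governed work needs both READ_SCREEN and ENTER_PROOF, per the diff.
function isGovernedSubstrate(substrate: string): boolean {
  return supports(substrate, "READ_SCREEN") && supports(substrate, "ENTER_PROOF");
}

console.log(isGovernedSubstrate("tmux"));
console.log(isGovernedSubstrate("plain terminal"));
console.log(supports("tmux", "EVENTS"));
```

An obligation that needs EVENTS would fail the `supports` check on tmux, which is the case where the text calls for explicit degraded-mode authorization in the boot receipt.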
 ## Surface Layout Custodianship

package/plugins/odin-scp/skills/odin-scp/references/canonical-introduction-prompt.md CHANGED

@@ -12,7 +12,6 @@ Use the `odin-scp` skill if available. Also read local project authority files w
 - CLAUDE.md
 - config/constitutional/constitutional-agent.md
 - project-local governance or constitution files declared by the repository
-- docs/handoffs/
 - .odin/handoffs/
 - .odin/audit/
@@ -41,7 +40,6 @@ Phase 0 - live preflight:
    - git rev-parse HEAD
    - git rev-parse @{u}, if upstream exists
 2. Discover handoffs and audit state:
-   - docs/handoffs/
    - .odin/handoffs/
    - .odin/audit/
 3. If no handoff exists, treat the repo as a fresh SCP bootstrap.

package/protocol/SCP.md CHANGED

@@ -1,8 +1,8 @@
 # ODIN Sentinel Coordination Protocol
-Version: 0.4.12
+Version: 0.5.0
-SCP_PUBLIC_VERSION: 0.4.12
+SCP_PUBLIC_VERSION: 0.5.0
 MIN_COMPATIBLE_CHILD_MCP: 0.4.5
 ODIN Sentinel is a portable coordination layer for visible multi-agent teams.

package/protocol/bootstrap-skill.md CHANGED

@@ -7,7 +7,7 @@ updated: 2026-05-11
 # Sentinel Coordination Protocol
-SCP_PUBLIC_VERSION: 0.4.12
+SCP_PUBLIC_VERSION: 0.5.0
 MIN_COMPATIBLE_CHILD_MCP: 0.4.5
 Public install readiness: configure the ODIN MCP server, install native skill context where supported or use full prompt fallback, keep governed team roles in CMUX, verify auth/account readiness without printing secrets, smoke-test local inference if used, and validate role compatibility before launch. Private local skill copies may differ intentionally; public release checks compare repo-internal public artifacts only.
@@ -38,7 +38,7 @@ Portable curated skill/session records may live under the declared source, for e
 - Keep agents interchangeable by role, not by blurred authority. Any supported harness may serve any role only after it has a clean boot block declaring the current role, write scope, branch, cwd, model/harness, and proof source. The same assignment must not QA and close its own work.
 - Preserve strict scope. Governance/package work cannot mutate product code, Loop runtime, design prototypes, operational-team work product, or lifecycle state unless explicitly authorized.
 - Use zero-secret-output behavior. Never print tokens, API keys, OAuth material, or config values. Report secret presence by name/count/status only.
-- Under SCP, team topology is the audit surface. If work is not visible in CMUX, it is not governed work.
+- Under SCP, team topology is the audit surface. If work is not visible on a substrate meeting READ_SCREEN + ENTER_PROOF (capability tier 1+), it is not governed work. CMUX is the reference substrate; cmux-specific rules remain in force.
 - Preserve official SCP team topology. Once `EXEC PM` has bootstrapped the executive office and pods, role-named CMUX panes/surfaces are immutable operating slots. Do not close, delete, rename, repurpose, or replace the slot itself unless the user explicitly authorizes that exact slot mutation.
 - Treat agents as occupants of durable role slots. If a model/harness is blocked, stale, over budget, in plan mode, context-exhausted, or wrong for the task, clear, restart, exit, or substitute the agent occupant inside the existing role slot. Do not remove the CMUX pane/surface.
 - Do not create extra panes, extra workers, hidden assistants, invisible subagents, or ad hoc capacity during an active SCP run unless the user explicitly authorizes topology expansion. `EXEC PM` must route work to official roles already present in CMUX.
@@ -70,6 +70,99 @@ SCP is a standing control loop, not a one-time boot banner. Read or re-invoke th
 If an agent cannot state its current SCP role, authority layer, `may_implement`, `may_qa_accept`, reports-to chain, and next receipt type, it must stop and re-emit `SCP_BOOT_RECEIPT`.
+### Tiered Protocol Uptake
+Full SCP skill load is required for control-plane roles only:
+- EXEC PM: full skill at session start
+- ODIN monitor: full skill at session start
+- TEAM PM: full skill at team launch
+All other roles activate via their role-specific quick-start card (<=4KB):
+- DEV WORKER, QA WORKER, EXEC-ASST: receive their role card via
+  odin.get_role_card or via the odin://protocol/role-cards/{role_id} resource
+After initial activation, the full skill is NEVER re-read at heartbeat,
+dispatch, delegation, or QA cadence points. Use hash-pinned re-arm instead
+(see Hash-Pinned Skill Re-Arm section).
+### Hash-Pinned Skill Re-Arm
+At session boot, record `scp_skill_sha256` in the boot receipt:
+- Obtain the skill content SHA-256 from `odin.get_role_card` (the tool returns
+  `content_sha256` for the active role card).
+- For control-plane roles, compute SHA-256 of the full bootstrap-skill.md content.
+- Record in `scp_skill_sha256` field of the boot receipt.
+At each subsequent cadence point (heartbeat, dispatch, delegation, QA):
+- Do NOT re-read the full skill.
+- If re-arm verification is needed, call `odin.get_role_card` with the known
+  role_id and compare returned `content_sha256` against the recorded value.
+- Match: no re-read needed; proceed.
+- Mismatch: full re-read required; update `scp_skill_sha256` in the receipt.
+### Canonical Cache-Aligned Packet Ordering
+All dispatch packets must follow this canonical order to maximize cache prefix
+stability across all LLM providers:
+1. **Stable identity block** — SCP role contract, governance preamble
+2. **Stable role card** — role-specific quick-start card (static resource)
+3. **Stable repo invariants** — repo name, main branch, core constraints
+4. **Stable LCE/evidence recipe** — evidence format, write scope format
+5. **Volatile dispatch tail** — current slice ID, current HEAD, task-specific
+   write scope, current blockers, evidence path, timestamps
+Dynamic content (timestamps, current HEAD, short-lived state) goes LAST.
+Do not place frequently-changing items before the stable prefix.
+This ordering is IMMUTABLE. Reordering any layer requires EXEC PM approval
+and a SCP_PUBLIC_VERSION minor bump.
+### No-Bare-Header Rule
+Never send a coordination header ([SCP-DELEGATE], [SCP-AGENT-SUBSTITUTION],
+or similar) without its body in the same delivery unit. An agent receiving only
+a header cannot distinguish protocol fragmentation from an invalid send.
+For harnesses that may show chunked input (Crush, some Droid surfaces):
+use a "DIRECT ROLE CONTRACT" plain-language wrapper instead of leading with
+a protocol header when fragmentation risk is present.
+Receivers that receive a bare header with no body MUST classify the delivery
+MALFORMED_COORDINATION and emit `[SCP-FEEDBACK]` requesting a resend.
+### Crush Bootstrap Delivery Guideline
+When activating an agent on a Crush harness surface:
+1. `cmux read-screen` first — confirm the surface is idle (no active spinner,
+   no queued steering text visible).
+2. Send ONE complete instruction block — do not fragment the contract across
+   multiple sends. One send, one block.
+3. Send Enter — a single `cmux send-keys Enter` after the block.
+4. Wait for idle — use `cmux wait-for` or poll until the surface returns to
+   a shell prompt or response state.
+5. Clear queued text before reissue — if the surface is unresponsive, clear
+   any queued steering text before sending again.
+**Two-panic rule**: If Crush panics twice in the same role slot during
+bootstrap, mark AGENT_SUBSTITUTION_REQUIRED and switch to the QA fallback
+ladder. Do not attempt a third bootstrap in the same slot.
+Preferred parking receipt format: one-line status + semicolon-delimited
+SCP_MIN_BOOT_RECEIPT fields (compact enough to avoid queue overflow).
+### Anthropic Cache-Warming Guidance
+Before activating a fleet of Claude-occupied roles:
+1. Send a warm-up request with `max_tokens: 0` to pre-populate the prefix cache.
+   This spares every agent in the fleet a cache-write cost on its
+   first request.
+2. Set `ENABLE_PROMPT_CACHING_1H=1` on API keys where 1-hour TTL is acceptable.
+   This reduces cache-write cost for long sessions.
+3. Place the stable role card content before the volatile dispatch tail to
+   maximize the length of the cacheable prefix.
 ## Generic Role Model And Control Topology
 SCP role names are generic. Do not bind authority to model names, harness names, pane names, or vendor brands. Every assignment must separate:
@@ -97,6 +190,31 @@ Preferred role taxonomy:
 - `INTEGRATION STEWARD`: merge/cherry-pick/integration proof and branch hygiene. Does not implement product features unless separately authorized.
 - `QUEUE TRIAGE`: dependency, readiness, and dispatch-order analysis.
+#### PM Reasoning Level Guidance
+Adjust reasoning level by phase:
+- **L1 — Passive supervision** (lower reasoning): polling, heartbeat routing,
+  contract maintenance, flag-file checks. Do not use full reasoning for
+  routine no-change supervision cycles.
+- **L2 — Active coordination** (medium reasoning): dispatch planning, model and
+  role assignment, QA synthesis and comparison, merge commit review.
+- **L3 — Authority decisions** (full reasoning): disagreement resolution,
+  merge conflicts, protocol exception handling, closure decisions, any action
+  that modifies the governed team topology.
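Illustrative sketch, not part of the package: the three levels above can be expressed as a lookup from PM activity to reasoning level. The activity identifiers are hypothetical labels derived from the bullet text; only the L1/L2/L3 grouping comes from the guidance.

```typescript
type ReasoningLevel = "L1" | "L2" | "L3";

// Hypothetical activity labels, one per example named in the guidance.
type PmActivity =
  | "polling"
  | "heartbeat_routing"
  | "contract_maintenance"
  | "flag_file_check"
  | "dispatch_planning"
  | "role_assignment"
  | "qa_synthesis"
  | "merge_commit_review"
  | "disagreement_resolution"
  | "merge_conflict"
  | "protocol_exception"
  | "closure_decision"
  | "topology_change";

const LEVEL_BY_ACTIVITY: Record<PmActivity, ReasoningLevel> = {
  polling: "L1",
  heartbeat_routing: "L1",
  contract_maintenance: "L1",
  flag_file_check: "L1",
  dispatch_planning: "L2",
  role_assignment: "L2",
  qa_synthesis: "L2",
  merge_commit_review: "L2",
  disagreement_resolution: "L3",
  merge_conflict: "L3",
  protocol_exception: "L3",
  closure_decision: "L3",
  topology_change: "L3",
};

function reasoningLevelFor(activity: PmActivity): ReasoningLevel {
  return LEVEL_BY_ACTIVITY[activity];
}

console.log(reasoningLevelFor("polling"), reasoningLevelFor("topology_change"));
```

Typing the table as `Record<PmActivity, ReasoningLevel>` makes the compiler reject any activity that has no assigned level.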
+#### EXEC-ASST as QA Capacity
+EXEC-ASST may serve as QA capacity only under the following conditions:
+1. An explicit role exception contract is sent before the QA task begins.
+2. The agent receives a fresh boot receipt acknowledging the role change.
+3. The prior task context (heartbeat, pane inventory) is explicitly parked or
+   cleared in the new boot receipt.
+Implicit role inference based on a prior task is prohibited. An EXEC-ASST that
+was running heartbeat loops does NOT automatically become a QA worker without
+a fresh contract.
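Illustrative sketch, not part of the package: the three conditions above can be checked as a gate that lists whatever is still missing before an EXEC-ASST takes QA work. The `RoleChangeRequest` shape and its field names are hypothetical.

```typescript
// Hypothetical record of what has happened before the QA task begins.
interface RoleChangeRequest {
  exceptionContractSent: boolean; // condition 1
  freshBootReceiptReceived: boolean; // condition 2
  priorContextParkedOrCleared: boolean; // condition 3
}

// Returns the unmet conditions; an empty list means QA work may begin.
function qaReassignmentBlockers(req: RoleChangeRequest): string[] {
  const blockers: string[] = [];
  if (!req.exceptionContractSent) {
    blockers.push("no explicit role exception contract");
  }
  if (!req.freshBootReceiptReceived) {
    blockers.push("no fresh boot receipt acknowledging the role change");
  }
  if (!req.priorContextParkedOrCleared) {
    blockers.push("prior task context not parked or cleared");
  }
  return blockers;
}

// An agent that was only running heartbeat loops satisfies none of the conditions.
console.log(
  qaReassignmentBlockers({
    exceptionContractSent: false,
    freshBootReceiptReceived: false,
    priorContextParkedOrCleared: false,
  }),
);
```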
 Use role-named terminal tabs/panes/surfaces when possible. Model and harness are capabilities, not identity. If a harness fails, substitute another harness by reissuing the same role contract; do not change scope or authority just because the runtime changed.
 Pane naming convention:
@@ -329,7 +447,82 @@ Default official grouping:
 If no existing role is appropriate, `EXEC PM` must request user authorization before creating capacity.
-Active SCP visible role-slot rules override generic external subagent language while SCP is active. Generic external coordination concepts may describe Dev/QA capacity, but under SCP that capacity must be represented by visible CMUX role slots unless the user authorizes topology expansion.
+Active SCP visible role-slot rules override generic external subagent language while SCP is active. Generic external coordination concepts may describe Dev/QA capacity, but under SCP that capacity must be represented by visible role slots on a substrate meeting the declared capability tier (CMUX is the reference substrate) unless the user authorizes topology expansion.
+### Hybrid Mission/Surfaces Topology Default
+The default team primitive is separate visible surfaces, not Factory Missions.
+Use separate visible surfaces when:
+- QA independence is required (each reviewer needs its own boot receipt and model)
+- Model diversity reduces correlated blind spots in review
+- Dispatch control (model selection per slice) is needed
+Use Missions as one Dev capacity type when:
+- The work is a large decomposable implementation burst
+- Internal Mission orchestration adds value over manual decomposition
+- The Mission output will be reviewed by an external, independently contracted QA surface
+**Routing rules by slice size:**
+| Slice type | Dev surfaces | QA surfaces | Notes |
+|------------|-------------|-------------|-------|
+| Small | 1 | 1 | Simple implementation + review |
+| Medium | 2 (separate worktrees) | 1-2 | Parallel implementation |
+| Hard / risky | 1 strong Dev or Mission | 3 (QA swarm) | Adversarial review |
+| Ambiguous / multi-impl | Multiple Dev | Swarm selects best | Exploratory |
+| Cheap repetitive | 1 low-cost Droid | 1 stronger QA | Cost-optimized |
+| Large decomposable | 1 Factory Mission | External independent QA | Self-contained burst |
+### Factory Mission Front-Running
+Factory Mission spawns four hidden child roles — orchestrator, worker,
+scrutiny-validator, scrutiny-feature-reviewer — that are not bound to ODIN
+governance by default. Use the `odin.get_mission_frontrun_pack` tool to
+assemble a contract pack that binds all four roles before launch.
+**PROVEN seam (live-verified 2026-06-12):** `--append-system-prompt-file`
+front-runs all four Factory Mission hidden roles before Factory's weaker
+defaults activate. Always launch through this seam:
+```
+droid exec --mission --auto <level> \
+  --append-system-prompt-file <path/to/orchestrator-contract.md> \
+  -f <mission-prompt.md>
+```
+**UNPROVEN seam:** mission-local validator skill shadowing
+(`skills/scrutiny-validator/SKILL.md`). In the 2026-06-12 probe the validator
+loaded `builtin:scrutiny-validator`, not the mission-local file. Do not rely
+on this seam for governance until a follow-up isolation probe confirms it.
+**Boot contract receipt requirement:** Every hidden child role must emit a
+`BOOT_CONTRACT_RECEIPT` as its first output, with all six fields: `role`,
+`session_id`, `contract_path`, `byte_count`, `sha256`, `timestamp`. A missing
+receipt is a launch blocker, not an advisory.
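Illustrative sketch, not part of the package: a launch gate for the receipt requirement above could check that all six fields are present and non-empty. The sketch assumes the receipt arrives as `key: value` lines under a header line; `missingReceiptFields` is a hypothetical helper name.

```typescript
const REQUIRED_FIELDS = [
  "role",
  "session_id",
  "contract_path",
  "byte_count",
  "sha256",
  "timestamp",
] as const;

// Returns the names of required fields that are absent or empty.
function missingReceiptFields(receiptText: string): string[] {
  const fields = new Map<string, string>();
  for (const line of receiptText.split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // header line or other non key/value text
    // Split on the first colon only, so ISO timestamps keep their colons.
    fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }
  return REQUIRED_FIELDS.filter((name) => !fields.get(name));
}

const incomplete = [
  "BOOT_CONTRACT_RECEIPT",
  "role: factory/worker",
  "session_id: example-session",
].join("\n");

// A non-empty result means the launch should be blocked.
console.log(missingReceiptFields(incomplete));
```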
+**Verified-artifacts-only rule:** Final mission status must be assembled from
+verified artifacts (worker commits, validator synthesis, reviewer sign-off) —
+not from Mission final prose. Reusing Mission narrative as delivery proof is
+a governance violation.
+Use `odin.get_mission_frontrun_pack` to generate the contract pack with the
+`mission_name`, `repo_path`, `write_scope`, and `task_id` placeholders filled
+in.
+### Substrate Capability Tiers
+Protocol obligations reference capability tiers rather than specific harness names. Any substrate meeting the required tier may satisfy the obligation. CMUX is the reference substrate and remains the canonical choice for governed teams; cmux-specific rules are not deleted by this table.
+| Substrate | SEND | ENTER_PROOF | READ_SCREEN | WAIT_IDLE | EVENTS | Tier |
+|-----------|:----:|:-----------:|:-----------:|:---------:|:------:|:----:|
+| cmux | Y | Y | Y | Y | Y | 3 |
+| tmux | Y | Y | Y | Y | N | 2 |
+| minimux | Y | Y | Y | Y | Y | 4 |
+| herdr | Y | Y | Y | Y | Y | 3+ |
+| plain terminal | Y | N | N | N | N | 0 |
+Governed work must be visible on a substrate meeting READ_SCREEN + ENTER_PROOF (capability tier 1+). Work visible only on a plain terminal (tier 0) is not governed work under SCP. Where an obligation requires EVENTS capability (tier 3+), tier 0-2 substrates cannot satisfy it without explicit degraded-mode authorization in the boot receipt.
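Illustrative sketch, not part of the package: the capability table above can be encoded as data so that an obligation is checked against capabilities rather than harness names. The capability sets are copied from the table; `substrateSatisfies` and `GOVERNED_MINIMUM` are hypothetical names.

```typescript
type Capability = "SEND" | "ENTER_PROOF" | "READ_SCREEN" | "WAIT_IDLE" | "EVENTS";

const ALL: Capability[] = ["SEND", "ENTER_PROOF", "READ_SCREEN", "WAIT_IDLE", "EVENTS"];

// Capability sets transcribed from the table rows.
const SUBSTRATES: Record<string, ReadonlySet<Capability>> = {
  cmux: new Set<Capability>(ALL),
  tmux: new Set<Capability>(["SEND", "ENTER_PROOF", "READ_SCREEN", "WAIT_IDLE"]),
  minimux: new Set<Capability>(ALL),
  herdr: new Set<Capability>(ALL),
  "plain terminal": new Set<Capability>(["SEND"]),
};

// Minimum stated for governed work: READ_SCREEN + ENTER_PROOF.
const GOVERNED_MINIMUM: Capability[] = ["READ_SCREEN", "ENTER_PROOF"];

// True when the named substrate offers every required capability.
function substrateSatisfies(substrate: string, required: Capability[]): boolean {
  const caps = SUBSTRATES[substrate];
  if (caps === undefined) return false; // unknown substrates satisfy nothing
  return required.every((c) => caps.has(c));
}

console.log(substrateSatisfies("tmux", GOVERNED_MINIMUM)); // governed work allowed
console.log(substrateSatisfies("tmux", ["EVENTS"])); // needs degraded-mode authorization
```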
 ## Surface Layout Custodianship

package/protocol/closeout.yaml CHANGED

@@ -1,4 +1,4 @@
-version: 0.4.12
+version: 0.5.0
 active_watch_terminal_states:
   - RELEASED_BY_OPERATOR
   - HANDED_OFF

package/protocol/delegation.yaml CHANGED

@@ -1,4 +1,4 @@
-version: 0.4.12
+version: 0.5.0
 delegation_contract:
   required_fields:
     - receipt_type

package/protocol/mission-frontrun/droids-scrutiny-feature-reviewer.md ADDED

@@ -0,0 +1,70 @@
+# ODIN Factory Mission: Scrutiny Feature Reviewer (Project Droid)
+**Role:** Factory Mission Scrutiny Feature Reviewer
+**Authority layer:** review
+**Task ID:** {{TASK_ID}}
+**Repo:** {{REPO_PATH}}
+---
+## Purpose
+This file is written to `.factory/droids/scrutiny-feature-reviewer.md` before
+mission launch. Factory selects it for the reviewer Task subagent automatically
+(LIVE-VERIFIED 2026-06-12). It binds the reviewer to ODIN governance through
+the project-local droid file seam.
+## Identity and Authority Bounds
+You are the Factory Mission scrutiny feature reviewer. Your authority is
+strictly bounded:
+- Review feature completeness and acceptance criteria coverage independently.
+- Do not fix defects during review; report them and return a verdict.
+- Never accept work produced by the same session that implemented it.
+- Never reuse Mission final prose as review proof.
+Write scope: {{WRITE_SCOPE}}
+## Boot Contract Receipt (mandatory)
+You must emit a `BOOT_CONTRACT_RECEIPT` as the first output of this session,
+before any other action. The receipt requires all six fields: `role`,
+`session_id`, `contract_path`, `byte_count`, `sha256`, `timestamp`. Fill every
+field with accurate values.
+```
+BOOT_CONTRACT_RECEIPT
+role: factory/scrutiny-feature-reviewer
+session_id: <your-session-id>
+contract_path: .factory/droids/scrutiny-feature-reviewer.md
+byte_count: <byte count of this file as loaded>
+sha256: <sha256 of this file as loaded>
+timestamp: <ISO-8601 UTC>
+```
+Failure to emit this receipt before any other output is a protocol breach.
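Illustrative sketch, not part of the package: the `byte_count` and `sha256` fields of the receipt above can be derived from the contract bytes as loaded. The sketch uses Node's built-in `node:crypto` module; `buildBootReceipt` is a hypothetical helper, and the session id and contract bytes in the usage lines are placeholders.

```typescript
import { createHash } from "node:crypto";

// Builds the receipt text in the field order shown in the template above.
function buildBootReceipt(
  role: string,
  sessionId: string,
  contractPath: string,
  contractBytes: Uint8Array,
  now: Date,
): string {
  // Hash exactly the bytes that were loaded, not a re-encoded string.
  const sha256 = createHash("sha256").update(contractBytes).digest("hex");
  return [
    "BOOT_CONTRACT_RECEIPT",
    `role: ${role}`,
    `session_id: ${sessionId}`,
    `contract_path: ${contractPath}`,
    `byte_count: ${contractBytes.byteLength}`,
    `sha256: ${sha256}`,
    `timestamp: ${now.toISOString()}`, // ISO-8601 in UTC
  ].join("\n");
}

const placeholderContract = new TextEncoder().encode("placeholder contract text");
console.log(
  buildBootReceipt(
    "factory/scrutiny-feature-reviewer",
    "example-session",
    ".factory/droids/scrutiny-feature-reviewer.md",
    placeholderContract,
    new Date(),
  ),
);
```

In real use the bytes would come from reading the contract file, so that the count and digest describe the file as loaded.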
+## Governance Rules
+- No self-accepted QA. You may not accept work produced by the same
+  `session_id` that implemented it.
+- Verified artifacts only. Review proof requires git-verifiable evidence.
+- Independent posture. Start from fresh review state.
+- Concrete verdicts. Return ACCEPT or REJECT with cited evidence.
+## Prohibited Actions
+- Fixing defects during review.
+- Accepting Mission final prose as delivery proof.
+- Returning ACCEPT without citing concrete evidence.
+- Reviewing work produced by your own `session_id`.
+- Modifying files outside {{WRITE_SCOPE}}.
+## Review Evidence Required
+On completion, report:
+- ACCEPT or REJECT verdict
+- Acceptance criteria coverage: which criteria passed, which failed
+- Concrete evidence: file paths, line numbers, test results
+- Any scope or authority violations observed