@beyondwork/docx-react-component 1.0.0 → 1.0.1 (npm version diff, Mend)

This diff shows the contents of publicly available package versions that have been released to one of the supported registries. It is provided for informational purposes only and reflects the changes between package versions as they appear in their respective public registries.

Files changed (704)

package/dist/chunk-32W6IVQE.js +7725 -0
package/dist/chunk-32W6IVQE.js.map +1 -0
package/dist/index.cjs +23722 -0
package/dist/index.cjs.map +1 -0
package/dist/index.d.cts +7 -0
package/dist/index.d.ts +7 -0
package/dist/index.js +16011 -0
package/dist/index.js.map +1 -0
package/dist/public-types-DqCURAz8.d.cts +1152 -0
package/dist/public-types-DqCURAz8.d.ts +1152 -0
package/dist/tailwind.cjs +8295 -0
package/dist/tailwind.cjs.map +1 -0
package/dist/tailwind.d.cts +323 -0
package/dist/tailwind.d.ts +323 -0
package/dist/tailwind.js +553 -0
package/dist/tailwind.js.map +1 -0
package/package.json +52 -31
package/.codex/config.toml +0 -5
package/.corepack/v1/pnpm/10.30.3/.corepack +0 -1
package/.corepack/v1/pnpm/10.30.3/LICENSE +0 -22
package/.corepack/v1/pnpm/10.30.3/README.md +0 -240
package/.corepack/v1/pnpm/10.30.3/dist/node-gyp-bin/node-gyp +0 -6
package/.corepack/v1/pnpm/10.30.3/dist/node-gyp-bin/node-gyp.cmd +0 -5
package/.corepack/v1/pnpm/10.30.3/dist/pnpm.cjs +0 -195400
package/.corepack/v1/pnpm/10.30.3/dist/pnpmrc +0 -2
package/.corepack/v1/pnpm/10.30.3/dist/reflink.darwin-arm64-2HJ4WGO6.node +0 -0
package/.corepack/v1/pnpm/10.30.3/dist/reflink.darwin-x64-3G3H6IW4.node +0 -0
package/.corepack/v1/pnpm/10.30.3/dist/reflink.win32-arm64-msvc-Q6BARPPB.node +0 -0
package/.corepack/v1/pnpm/10.30.3/dist/reflink.win32-x64-msvc-J2TZHRQI.node +0 -0
package/.corepack/v1/pnpm/10.30.3/dist/templates/completion.bash +0 -31
package/.corepack/v1/pnpm/10.30.3/dist/templates/completion.fish +0 -22
package/.corepack/v1/pnpm/10.30.3/dist/templates/completion.ps1 +0 -193
package/.corepack/v1/pnpm/10.30.3/dist/templates/completion.zsh +0 -27
package/.corepack/v1/pnpm/10.30.3/dist/vendor/fastlist-0.3.0-x64.exe +0 -0
package/.corepack/v1/pnpm/10.30.3/dist/vendor/fastlist-0.3.0-x86.exe +0 -0
package/.corepack/v1/pnpm/10.30.3/dist/worker.js +0 -10119
package/.corepack/v1/pnpm/10.30.3/package.json +0 -192
package/.cursor/mcp.json +0 -7
package/.github/workflows/ci.yml +0 -35
package/.mcp.json +0 -7
package/.openclaw/workspace-state.json +0 -4
package/.pnpmrc.json +0 -1
package/.wave-launch.sh +0 -7
package/.workspace-marker +0 -1
package/AGENTS.md +0 -78
package/CHANGELOG.md +0 -177
package/DESIGN.md +0 -929
package/HEARTBEAT.md +0 -7
package/IDENTITY.md +0 -23
package/SOUL.md +0 -36
package/TOOLS.md +0 -40
package/USER.md +0 -17
package/docs/README.md +0 -107
package/docs/agents/wave-cont-eval-role.md +0 -36
package/docs/agents/wave-cont-qa-role.md +0 -52
package/docs/agents/wave-deploy-verifier-role.md +0 -34
package/docs/agents/wave-design-role.md +0 -47
package/docs/agents/wave-documentation-role.md +0 -34
package/docs/agents/wave-infra-role.md +0 -34
package/docs/agents/wave-integration-role.md +0 -37
package/docs/agents/wave-launcher-role.md +0 -41
package/docs/agents/wave-orchestrator-role.md +0 -52
package/docs/agents/wave-planner-role.md +0 -39
package/docs/agents/wave-security-role.md +0 -40
package/docs/architecture/docx/README.md +0 -10
package/docs/architecture/future/README.md +0 -8
package/docs/architecture/ooxml-upgrade-analysis.md +0 -134
package/docs/architecture/platform/shared-openxml-editor-platform.md +0 -153
package/docs/architecture/xlsx/canonical-workbook-model-and-commands.md +0 -187
package/docs/architecture/xlsx/spreadsheet-editor-frontend-architecture.md +0 -150
package/docs/comment-redline-overview.md +0 -350
package/docs/concepts/context7-vs-skills.md +0 -118
package/docs/concepts/operating-modes.md +0 -91
package/docs/concepts/runtime-agnostic-orchestration.md +0 -111
package/docs/concepts/what-is-a-wave.md +0 -217
package/docs/context7/bundles.json +0 -222
package/docs/context7/planner-agent/README.md +0 -28
package/docs/context7/planner-agent/manifest.json +0 -83
package/docs/context7/planner-agent/papers/cooperbench-why-coding-agents-cannot-be-your-teammates-yet.md +0 -3283
package/docs/context7/planner-agent/papers/dova-deliberation-first-multi-agent-orchestration-for-autonomous-research-automation.md +0 -1699
package/docs/context7/planner-agent/papers/dpbench-large-language-models-struggle-with-simultaneous-coordination.md +0 -2251
package/docs/context7/planner-agent/papers/incremental-planning-to-control-a-blackboard-based-problem-solver.md +0 -1729
package/docs/context7/planner-agent/papers/silo-bench-a-scalable-environment-for-evaluating-distributed-coordination-in-multi-agent-llm-systems.md +0 -3747
package/docs/context7/planner-agent/papers/todoevolve-learning-to-architect-agent-planning-systems.md +0 -1675
package/docs/context7/planner-agent/papers/verified-multi-agent-orchestration-a-plan-execute-verify-replan-framework-for-complex-query-resolution.md +0 -1173
package/docs/context7/planner-agent/papers/why-do-multi-agent-llm-systems-fail.md +0 -5211
package/docs/context7/planner-agent/topics/planning-and-orchestration.md +0 -24
package/docs/evals/arm-templates/README.md +0 -13
package/docs/evals/arm-templates/full-wave.json +0 -15
package/docs/evals/arm-templates/single-agent.json +0 -15
package/docs/evals/benchmark-catalog.json +0 -670
package/docs/evals/cases/README.md +0 -47
package/docs/evals/cases/wave-blackboard-inbox-targeting.json +0 -73
package/docs/evals/cases/wave-contradiction-conflict.json +0 -104
package/docs/evals/cases/wave-expert-routing-preservation.json +0 -69
package/docs/evals/cases/wave-hidden-profile-private-evidence.json +0 -81
package/docs/evals/cases/wave-premature-closure-guard.json +0 -71
package/docs/evals/cases/wave-silo-cross-agent-state.json +0 -77
package/docs/evals/cases/wave-simultaneous-lockstep.json +0 -92
package/docs/evals/external-benchmarks.json +0 -85
package/docs/evals/external-command-config.sample.json +0 -9
package/docs/evals/external-command-config.swe-bench-pro.json +0 -8
package/docs/evals/pilots/README.md +0 -47
package/docs/evals/pilots/swe-bench-pro-public-full-wave-review-10.json +0 -64
package/docs/evals/pilots/swe-bench-pro-public-pilot.json +0 -111
package/docs/evals/wave-benchmark-program.md +0 -302
package/docs/guides/planner.md +0 -220
package/docs/guides/recommendations-0.8.9.md +0 -133
package/docs/guides/signal-wrappers.md +0 -165
package/docs/guides/terminal-surfaces.md +0 -96
package/docs/image copy.png +0 -0
package/docs/image.png +0 -0
package/docs/images/image.png +0 -0
package/docs/legal-feedback-architecture.md +0 -498
package/docs/plans/component-cutover-matrix.json +0 -1072
package/docs/plans/component-cutover-matrix.md +0 -307
package/docs/plans/context7-wave-orchestrator.md +0 -155
package/docs/plans/current-state.md +0 -198
package/docs/plans/docx/README.md +0 -9
package/docs/plans/examples/wave-benchmark-improvement.md +0 -108
package/docs/plans/examples/wave-example-live-proof.md +0 -435
package/docs/plans/master-plan.md +0 -224
package/docs/plans/migration.md +0 -538
package/docs/plans/operations/README.md +0 -7
package/docs/plans/operations/wave-10-word-certification.md +0 -87
package/docs/plans/operations/wave-8-railway-staging.md +0 -153
package/docs/plans/operations/wave-9-manual-certification.md +0 -73
package/docs/plans/platform/README.md +0 -9
package/docs/plans/reference/legal-checklist-coverage.md +0 -258
package/docs/plans/wave-orchestrator.md +0 -423
package/docs/plans/waves/README.md +0 -75
package/docs/plans/waves/completed/wave-0.md +0 -195
package/docs/plans/waves/completed/wave-1.md +0 -379
package/docs/plans/waves/completed/wave-10.md +0 -670
package/docs/plans/waves/completed/wave-11.md +0 -335
package/docs/plans/waves/completed/wave-12.md +0 -417
package/docs/plans/waves/completed/wave-13.md +0 -316
package/docs/plans/waves/completed/wave-14.md +0 -319
package/docs/plans/waves/completed/wave-15.md +0 -321
package/docs/plans/waves/completed/wave-16.md +0 -316
package/docs/plans/waves/completed/wave-17.md +0 -331
package/docs/plans/waves/completed/wave-18.md +0 -328
package/docs/plans/waves/completed/wave-2.md +0 -438
package/docs/plans/waves/completed/wave-3.md +0 -435
package/docs/plans/waves/completed/wave-4.md +0 -430
package/docs/plans/waves/completed/wave-5.md +0 -430
package/docs/plans/waves/completed/wave-6.md +0 -430
package/docs/plans/waves/completed/wave-7.md +0 -526
package/docs/plans/waves/completed/wave-8.md +0 -596
package/docs/plans/waves/completed/wave-9.md +0 -552
package/docs/plans/waves/deferred/README.md +0 -14
package/docs/plans/waves/deferred/encrypted-intake-contracts.md +0 -282
package/docs/plans/waves/deferred/legal-feedback-wave-expansion.md +0 -308
package/docs/plans/waves/deferred/wave-encrypted-intake.md +0 -451
package/docs/plans/waves/design/README.md +0 -5
package/docs/plans/waves/design/wave-1-a1.md +0 -309
package/docs/plans/waves/reviews/README.md +0 -5
package/docs/plans/waves/reviews/wave-0-cont-qa.md +0 -151
package/docs/plans/waves/reviews/wave-1-cont-qa.md +0 -46
package/docs/plans/waves/reviews/wave-10-accessibility-and-design.md +0 -51
package/docs/plans/waves/reviews/wave-10-cont-qa.md +0 -24
package/docs/plans/waves/reviews/wave-10-dashboard-proof.md +0 -46
package/docs/plans/waves/reviews/wave-10-performance-signoff.md +0 -55
package/docs/plans/waves/reviews/wave-10-regression-proof.md +0 -23
package/docs/plans/waves/reviews/wave-10-release-audit.md +0 -31
package/docs/plans/waves/reviews/wave-10-service-proof.md +0 -83
package/docs/plans/waves/reviews/wave-10-word-certification.md +0 -31
package/docs/plans/waves/reviews/wave-18-ai-contract-closure.md +0 -277
package/docs/plans/waves/reviews/wave-18-cont-qa.md +0 -255
package/docs/plans/waves/reviews/wave-18-parity-proof.md +0 -271
package/docs/plans/waves/reviews/wave-19-cont-qa.md +0 -59
package/docs/plans/waves/reviews/wave-2-cont-qa.md +0 -72
package/docs/plans/waves/reviews/wave-20-cont-qa.md +0 -60
package/docs/plans/waves/reviews/wave-25-cont-qa.md +0 -48
package/docs/plans/waves/reviews/wave-28-cont-qa.md +0 -46
package/docs/plans/waves/reviews/wave-29-cont-qa.md +0 -53
package/docs/plans/waves/reviews/wave-3-cont-qa.md +0 -53
package/docs/plans/waves/reviews/wave-3-core-proof.md +0 -77
package/docs/plans/waves/reviews/wave-3-validator-proof.md +0 -73
package/docs/plans/waves/reviews/wave-32-cont-qa.md +0 -43
package/docs/plans/waves/reviews/wave-33-cont-qa.md +0 -526
package/docs/plans/waves/reviews/wave-34-cont-qa.md +0 -100
package/docs/plans/waves/reviews/wave-35-cont-qa.md +0 -145
package/docs/plans/waves/reviews/wave-4-cont-qa.md +0 -47
package/docs/plans/waves/reviews/wave-4-structure-proof.md +0 -69
package/docs/plans/waves/reviews/wave-5-comment-proof.md +0 -158
package/docs/plans/waves/reviews/wave-5-cont-qa.md +0 -68
package/docs/plans/waves/reviews/wave-6-cont-qa.md +0 -416
package/docs/plans/waves/reviews/wave-6-redline-proof.md +0 -130
package/docs/plans/waves/reviews/wave-7-cont-qa.md +0 -82
package/docs/plans/waves/reviews/wave-7-ooxml-compliance.md +0 -85
package/docs/plans/waves/reviews/wave-7-preservation-proof.md +0 -119
package/docs/plans/waves/reviews/wave-7-trust-ux.md +0 -87
package/docs/plans/waves/reviews/wave-8-accessibility-and-design.md +0 -128
package/docs/plans/waves/reviews/wave-8-cont-qa.md +0 -92
package/docs/plans/waves/reviews/wave-8-live-proof.md +0 -140
package/docs/plans/waves/reviews/wave-8-security.md +0 -47
package/docs/plans/waves/reviews/wave-9-editor-embedding.md +0 -39
package/docs/plans/waves/reviews/wave-9-fixture-runner.md +0 -56
package/docs/plans/waves/reviews/wave-9-live-proof.md +0 -105
package/docs/plans/waves/reviews/wave-9-usability-and-performance.md +0 -152
package/docs/plans/waves/specs/README.md +0 -5
package/docs/plans/waves/specs/wave-1-component-boundaries.md +0 -322
package/docs/plans/waves/specs/wave-1-ooxml-contracts.md +0 -323
package/docs/plans/waves/specs/wave-1-review-and-ui-contracts.md +0 -339
package/docs/plans/waves/specs/wave-1-runtime-contracts.md +0 -509
package/docs/plans/waves/wave-19.md +0 -341
package/docs/plans/waves/wave-20.md +0 -308
package/docs/plans/waves/wave-21.md +0 -289
package/docs/plans/waves/wave-22.md +0 -221
package/docs/plans/waves/wave-23.md +0 -295
package/docs/plans/waves/wave-24.md +0 -286
package/docs/plans/waves/wave-25.md +0 -313
package/docs/plans/waves/wave-26.md +0 -300
package/docs/plans/waves/wave-27.md +0 -299
package/docs/plans/waves/wave-28.md +0 -368
package/docs/plans/waves/wave-29.md +0 -303
package/docs/plans/waves/wave-30.md +0 -307
package/docs/plans/waves/wave-31.md +0 -231
package/docs/plans/waves/wave-32.md +0 -152
package/docs/plans/waves/wave-33.md +0 -147
package/docs/plans/waves/wave-34.md +0 -148
package/docs/plans/waves/wave-35.md +0 -141
package/docs/plans/waves/wave-36.md +0 -146
package/docs/plans/xlsx/README.md +0 -14
package/docs/plans/xlsx/xlsx-fixture-corpus-and-certification-plan.md +0 -126
package/docs/reference/cli-reference.md +0 -600
package/docs/reference/coordination-and-closure.md +0 -487
package/docs/reference/deep-research-report (15).md +0 -25
package/docs/reference/docx/README.md +0 -10
package/docs/reference/legal-checklist.md +0 -445
package/docs/reference/live-proof-waves.md +0 -199
package/docs/reference/ooxml-compliance.md +0 -129
package/docs/reference/ooxml-feature-parity-matrix.md +0 -172
package/docs/reference/platform/shared-ooxml-platform-guidance.md +0 -77
package/docs/reference/prototype-agent-prompt-legal-fidelity.md +0 -155
package/docs/reference/public-api.md +0 -456
package/docs/reference/repository-guidance.md +0 -58
package/docs/reference/runtime-config/README.md +0 -182
package/docs/reference/runtime-config/claude.md +0 -110
package/docs/reference/runtime-config/codex.md +0 -82
package/docs/reference/runtime-config/opencode.md +0 -93
package/docs/reference/sample-waves.md +0 -105
package/docs/reference/skills.md +0 -237
package/docs/reference/templates/AGENTS.md +0 -78
package/docs/reference/templates/HEARTBEAT.md +0 -7
package/docs/reference/templates/IDENTITY.md +0 -23
package/docs/reference/templates/SOUL.md +0 -36
package/docs/reference/templates/TOOLS.md +0 -40
package/docs/reference/templates/USER.md +0 -17
package/docs/reference/wave-control.md +0 -184
package/docs/reference/wave-planning-lessons.md +0 -167
package/docs/reference/word-review-editor-frontend-architecture.md +0 -479
package/docs/reference/word-review-editor-ux-guide.md +0 -253
package/docs/reference/xlsx/xlsx-ooxml-compliance.md +0 -137
package/docs/research/agent-context-sources.md +0 -178
package/docs/research/coordination-failure-review.md +0 -290
package/docs/research/docx-react-component/Canonical Document Schema Specification for a React-based Word-compatible Editor.md +0 -2317
package/docs/research/docx-react-component/Feature Compatibility Matrix for a React Word Compatible Legal Editor v1.md +0 -219
package/docs/research/docx-react-component/React Component Architecture and Front-End Structure Specification for a Word-Compatible Legal Review Editor.md +0 -1112
package/docs/research/docx-react-component/document_compatibility_and_testing_spec.md +0 -751
package/docs/research/xlsx/raw/README.md +0 -13
package/docs/roadmap.md +0 -174
package/docs/superpowers/plans/2026-03-28-harness-control-bar.md +0 -677
package/docs/superpowers/specs/2026-03-28-harness-control-bar-design.md +0 -274
package/docs/xlsx-react/README.md +0 -38
package/docs/xlsx-react/agent-llm-interaction-layer-docx-xlsx.md +0 -621
package/docs/xlsx-react/canonical-workbook-model-and-commands.md +0 -948
package/docs/xlsx-react/shared-openxml-editor-platform-docx-xlsx.md +0 -228
package/docs/xlsx-react/spreadsheet-editor-component-architecture.md +0 -809
package/docs/xlsx-react/spreadsheet-editor-frontend-architecture.md +0 -537
package/docs/xlsx-react/spreadsheet-editor-ux-guide.md +0 -520
package/docs/xlsx-react/xlsx-editor-research-pack.md +0 -871
package/docs/xlsx-react/xlsx-fixture-corpus-and-certification-plan.md +0 -436
package/docs/xlsx-react/xlsx-ooxml-compliance.md +0 -320
package/examples/README.md +0 -16
package/memory/MEMORY.md +0 -24
package/pnpm-workspace.yaml +0 -4
package/scripts/check-no-authored-js.sh +0 -13
package/scripts/context7-api-check.sh +0 -65
package/scripts/context7-export-env.sh +0 -42
package/scripts/run-context7-mcp.sh +0 -8
package/scripts/run-workspace-tests.sh +0 -15
package/scripts/start-wave-10-local.sh +0 -189
package/scripts/wave-agent-attach.sh +0 -47
package/scripts/wave-auto-answer.sh +0 -118
package/scripts/wave-dashboard-attach.sh +0 -13
package/scripts/wave-launch.sh +0 -273
package/scripts/wave-overnight-supervisor.sh +0 -145
package/scripts/wave-status.sh +0 -379
package/scripts/wave-watch.sh +0 -231
package/services/README.md +0 -17
package/services/openxml-validator/Dockerfile +0 -29
package/services/openxml-validator/OpenXmlValidator.Api.csproj +0 -12
package/services/openxml-validator/Program.cs +0 -436
package/services/openxml-validator/README.md +0 -152
package/services/openxml-validator/railway.json +0 -16
package/services/react-word-editor/.tmp-a4/src/api/public-types.ts +0 -318
package/services/react-word-editor/.tmp-a4/src/ui/WordReviewEditor.tsx +0 -1302
package/services/react-word-editor/.tmp-a4/src/ui/editor-surface/editor-surface.tsx +0 -546
package/services/react-word-editor/.tmp-a4/test/ui/word-review-editor.test.tsx +0 -146
package/services/react-word-editor/.tmp-a4-build/src/api/public-types.js +0 -2
package/services/react-word-editor/.tmp-a4-build/src/ui/WordReviewEditor.js +0 -818
package/services/react-word-editor/.tmp-a4-build/src/ui/editor-surface/editor-surface.js +0 -229
package/services/react-word-editor/.tmp-a4-build/test/ui/word-review-editor.test.js +0 -121
package/services/react-word-editor/.tmp-wave-4-a3-tsconfig.json +0 -21
package/services/react-word-editor/.tmp-wave-4-a3-tsconfig.tsbuildinfo +0 -1
package/services/react-word-editor/Dockerfile +0 -26
package/services/react-word-editor/README.md +0 -254
package/services/react-word-editor/app/api/certification/route.ts +0 -79
package/services/react-word-editor/app/api/demo-sessions/route.ts +0 -109
package/services/react-word-editor/app/api/deploy-health/route.ts +0 -23
package/services/react-word-editor/app/api/exports/[exportId]/route.ts +0 -34
package/services/react-word-editor/app/api/exports/route.ts +0 -81
package/services/react-word-editor/app/api/fixtures/[fixtureId]/run/route.ts +0 -100
package/services/react-word-editor/app/api/health/route.ts +0 -70
package/services/react-word-editor/app/api/runs/[runId]/route.ts +0 -36
package/services/react-word-editor/app/api/scenarios/[scenarioId]/run/route.ts +0 -85
package/services/react-word-editor/app/api/sessions/[sessionId]/route.ts +0 -199
package/services/react-word-editor/app/api/sessions/[sessionId]/source/route.ts +0 -45
package/services/react-word-editor/app/api/uploads/route.ts +0 -70
package/services/react-word-editor/app/api/validate/route.ts +0 -310
package/services/react-word-editor/app/certification/[runId]/page.tsx +0 -14
package/services/react-word-editor/app/certification/page.tsx +0 -32
package/services/react-word-editor/app/dashboard/page.tsx +0 -7
package/services/react-word-editor/app/demo/page.tsx +0 -30
package/services/react-word-editor/app/demo/prototype-client.tsx +0 -1080
package/services/react-word-editor/app/editor/[sessionId]/page.tsx +0 -33
package/services/react-word-editor/app/fixtures/page.tsx +0 -7
package/services/react-word-editor/app/globals.css +0 -121
package/services/react-word-editor/app/layout.tsx +0 -32
package/services/react-word-editor/app/page.tsx +0 -30
package/services/react-word-editor/app/runs/[runId]/page.tsx +0 -34
package/services/react-word-editor/app/wave-10-word-review/page.tsx +0 -7
package/services/react-word-editor/components/harness-control-bar.tsx +0 -289
package/services/react-word-editor/components/harness-editor-session-client.tsx +0 -1214
package/services/react-word-editor/components/harness-workspace-page.tsx +0 -715
package/services/react-word-editor/components/reduced-motion-toggle.tsx +0 -79
package/services/react-word-editor/components/workspace-certification-panel.tsx +0 -307
package/services/react-word-editor/lib/certification-bundle.ts +0 -796
package/services/react-word-editor/lib/certification-store.ts +0 -661
package/services/react-word-editor/lib/demo-fixtures.test.mjs +0 -195
package/services/react-word-editor/lib/demo-fixtures.ts +0 -1519
package/services/react-word-editor/lib/editor-session-summary.test.mjs +0 -68
package/services/react-word-editor/lib/editor-session-summary.ts +0 -14
package/services/react-word-editor/lib/editor-session.ts +0 -228
package/services/react-word-editor/lib/exports-route.test.mjs +0 -32
package/services/react-word-editor/lib/harness-client.ts +0 -347
package/services/react-word-editor/lib/harness-config.json +0 -30
package/services/react-word-editor/lib/harness-config.test.mjs +0 -31
package/services/react-word-editor/lib/harness-config.ts +0 -21
package/services/react-word-editor/lib/harness-editor-datastore.test.mjs +0 -220
package/services/react-word-editor/lib/harness-editor-datastore.ts +0 -161
package/services/react-word-editor/lib/private-mode.test.mjs +0 -42
package/services/react-word-editor/lib/private-mode.ts +0 -61
package/services/react-word-editor/lib/regression-report.test.mjs +0 -352
package/services/react-word-editor/lib/regression-report.ts +0 -896
package/services/react-word-editor/lib/run-artifacts.ts +0 -934
package/services/react-word-editor/lib/run-history.ts +0 -755
package/services/react-word-editor/lib/scenario-artifacts.test.mjs +0 -41
package/services/react-word-editor/lib/scenario-artifacts.ts +0 -44
package/services/react-word-editor/lib/storage.ts +0 -953
package/services/react-word-editor/lib/validator-client.test.mjs +0 -54
package/services/react-word-editor/lib/validator-client.ts +0 -95
package/services/react-word-editor/lib/workspace-navigation.ts +0 -79
package/services/react-word-editor/middleware.ts +0 -35
package/services/react-word-editor/next-env.d.ts +0 -6
package/services/react-word-editor/next.config.mjs +0 -15
package/services/react-word-editor/package.json +0 -38
package/services/react-word-editor/postcss.config.mjs +0 -8
package/services/react-word-editor/railway.json +0 -21
package/services/react-word-editor/scripts/wave-10-certification.mjs +0 -101
package/services/react-word-editor/scripts/wave-9-live-usability-pilot.mjs +0 -911
package/services/react-word-editor/tsconfig.json +0 -39
package/services/react-word-editor/tsconfig.tsbuildinfo +0 -1
package/skills/README.md +0 -48
package/skills/domain-docx-compatibility/SKILL.md +0 -44
package/skills/domain-docx-compatibility/skill.json +0 -19
package/skills/domain-editor-architecture/SKILL.md +0 -49
package/skills/domain-editor-architecture/skill.json +0 -19
package/skills/domain-legal-review/SKILL.md +0 -39
package/skills/domain-legal-review/skill.json +0 -19
package/skills/provider-aws/SKILL.md +0 -117
package/skills/provider-aws/adapters/claude.md +0 -1
package/skills/provider-aws/adapters/codex.md +0 -1
package/skills/provider-aws/references/service-verification.md +0 -39
package/skills/provider-aws/skill.json +0 -54
package/skills/provider-custom-deploy/SKILL.md +0 -64
package/skills/provider-custom-deploy/skill.json +0 -50
package/skills/provider-docker-compose/SKILL.md +0 -96
package/skills/provider-docker-compose/adapters/local.md +0 -1
package/skills/provider-docker-compose/skill.json +0 -53
package/skills/provider-github-release/SKILL.md +0 -121
package/skills/provider-github-release/adapters/claude.md +0 -1
package/skills/provider-github-release/adapters/codex.md +0 -1
package/skills/provider-github-release/skill.json +0 -55
package/skills/provider-kubernetes/SKILL.md +0 -143
package/skills/provider-kubernetes/adapters/claude.md +0 -1
package/skills/provider-kubernetes/adapters/codex.md +0 -1
package/skills/provider-kubernetes/references/kubectl-patterns.md +0 -58
package/skills/provider-kubernetes/skill.json +0 -52
package/skills/provider-railway/SKILL.md +0 -123
package/skills/provider-railway/adapters/claude.md +0 -1
package/skills/provider-railway/adapters/codex.md +0 -1
package/skills/provider-railway/adapters/local.md +0 -1
package/skills/provider-railway/adapters/opencode.md +0 -1
package/skills/provider-railway/references/verification-commands.md +0 -39
package/skills/provider-railway/skill.json +0 -71
package/skills/provider-ssh-manual/SKILL.md +0 -97
package/skills/provider-ssh-manual/skill.json +0 -54
package/skills/repo-coding-rules/SKILL.md +0 -55
package/skills/repo-coding-rules/skill.json +0 -34
package/skills/role-cont-eval/SKILL.md +0 -91
package/skills/role-cont-eval/adapters/codex.md +0 -1
package/skills/role-cont-eval/skill.json +0 -36
package/skills/role-cont-qa/SKILL.md +0 -100
package/skills/role-cont-qa/adapters/claude.md +0 -1
package/skills/role-cont-qa/skill.json +0 -36
package/skills/role-deploy/SKILL.md +0 -97
package/skills/role-deploy/skill.json +0 -36
package/skills/role-design/SKILL.md +0 -50
package/skills/role-design/skill.json +0 -36
package/skills/role-documentation/SKILL.md +0 -76
package/skills/role-documentation/skill.json +0 -36
package/skills/role-implementation/SKILL.md +0 -45
package/skills/role-implementation/skill.json +0 -36
package/skills/role-infra/SKILL.md +0 -81
package/skills/role-infra/skill.json +0 -36
package/skills/role-integration/SKILL.md +0 -91
package/skills/role-integration/skill.json +0 -36
package/skills/role-planner/SKILL.md +0 -39
package/skills/role-planner/skill.json +0 -21
package/skills/role-research/SKILL.md +0 -65
package/skills/role-research/skill.json +0 -36
package/skills/role-security/SKILL.md +0 -60
package/skills/role-security/skill.json +0 -36
package/skills/runtime-claude/SKILL.md +0 -66
package/skills/runtime-claude/skill.json +0 -36
package/skills/runtime-codex/SKILL.md +0 -58
package/skills/runtime-codex/skill.json +0 -36
package/skills/runtime-local/SKILL.md +0 -46
package/skills/runtime-local/skill.json +0 -36
package/skills/runtime-opencode/SKILL.md +0 -58
package/skills/runtime-opencode/skill.json +0 -36
package/skills/signal-hygiene/SKILL.md +0 -51
package/skills/signal-hygiene/skill.json +0 -20
package/skills/tui-design/SKILL.md +0 -77
package/skills/tui-design/references/tui-design.md +0 -259
package/skills/tui-design/skill.json +0 -36
package/skills/wave-core/SKILL.md +0 -141
package/skills/wave-core/references/marker-syntax.md +0 -70
package/skills/wave-core/skill.json +0 -35
package/src/README.md +0 -85
package/src/api/README.md +0 -22
package/src/api/public-types.ts +0 -525
package/src/component-inventory.md +0 -99
package/src/core/README.md +0 -10
package/src/core/commands/README.md +0 -3
package/src/core/commands/formatting-commands.ts +0 -161
package/src/core/commands/image-commands.ts +0 -144
package/src/core/commands/index.ts +0 -1013
package/src/core/commands/list-commands.ts +0 -370
package/src/core/commands/review-commands.ts +0 -108
package/src/core/commands/text-commands.ts +0 -119
package/src/core/schema/README.md +0 -3
package/src/core/schema/text-schema.ts +0 -512
package/src/core/selection/README.md +0 -3
package/src/core/selection/mapping.ts +0 -238
package/src/core/selection/review-anchors.ts +0 -94
package/src/core/state/README.md +0 -3
package/src/core/state/editor-state.ts +0 -580
package/src/core/state/text-transaction.ts +0 -276
package/src/formats/xlsx/io/parse-shared-strings.ts +0 -41
package/src/formats/xlsx/io/parse-sheet.ts +0 -289
package/src/formats/xlsx/io/parse-styles.ts +0 -57
package/src/formats/xlsx/io/parse-workbook.ts +0 -75
package/src/formats/xlsx/io/xlsx-session.ts +0 -306
package/src/formats/xlsx/model/cell.ts +0 -189
package/src/formats/xlsx/model/sheet.ts +0 -244
package/src/formats/xlsx/model/styles.ts +0 -118
package/src/formats/xlsx/model/workbook.ts +0 -449
package/src/io/README.md +0 -10
package/src/io/docx-session.ts +0 -1763
package/src/io/export/README.md +0 -3
package/src/io/export/export-session.ts +0 -165
package/src/io/export/minimal-docx.ts +0 -115
package/src/io/export/reattach-preserved-parts.ts +0 -54
package/src/io/export/serialize-comments.ts +0 -876
package/src/io/export/serialize-footnotes.ts +0 -217
package/src/io/export/serialize-headers-footers.ts +0 -200
package/src/io/export/serialize-main-document.ts +0 -982
package/src/io/export/serialize-numbering.ts +0 -97
package/src/io/export/serialize-revisions.ts +0 -389
package/src/io/export/serialize-runtime-revisions.ts +0 -265
package/src/io/export/serialize-tables.ts +0 -147
package/src/io/export/split-review-boundaries.ts +0 -194
package/src/io/normalize/README.md +0 -3
package/src/io/normalize/normalize-text.ts +0 -437
package/src/io/ooxml/README.md +0 -3
package/src/io/ooxml/parse-comments.ts +0 -779
package/src/io/ooxml/parse-complex-content.ts +0 -287
package/src/io/ooxml/parse-fields.ts +0 -438
package/src/io/ooxml/parse-footnotes.ts +0 -403
package/src/io/ooxml/parse-headers-footers.ts +0 -483
package/src/io/ooxml/parse-inline-media.ts +0 -431
package/src/io/ooxml/parse-main-document.ts +0 -1846
package/src/io/ooxml/parse-numbering.ts +0 -425
package/src/io/ooxml/parse-revisions.ts +0 -658
package/src/io/ooxml/parse-shapes.ts +0 -271
package/src/io/ooxml/parse-tables.ts +0 -568
package/src/io/ooxml/parse-theme.ts +0 -314
package/src/io/ooxml/part-manifest.ts +0 -136
package/src/io/ooxml/revision-boundaries.ts +0 -351
package/src/io/opc/README.md +0 -3
package/src/io/opc/corrupt-package.ts +0 -166
package/src/io/opc/docx-package.ts +0 -74
package/src/io/opc/package-reader.ts +0 -320
package/src/io/opc/package-writer.ts +0 -273
package/src/model/README.md +0 -3
package/src/model/canonical-document.ts +0 -1911
package/src/model/cds-1.0.0.ts +0 -196
package/src/model/snapshot.ts +0 -393
package/src/preservation/README.md +0 -3
package/src/preservation/markup-compatibility.ts +0 -48
package/src/preservation/opaque-fragment-store.ts +0 -89
package/src/preservation/opaque-region.ts +0 -233
package/src/preservation/package-preservation.ts +0 -120
package/src/preservation/preserved-part-manifest.ts +0 -56
package/src/preservation/relationship-retention.ts +0 -57
package/src/preservation/store.ts +0 -185
package/src/review/README.md +0 -16
package/src/review/store/README.md +0 -3
package/src/review/store/comment-anchors.ts +0 -70
package/src/review/store/comment-remapping.ts +0 -154
package/src/review/store/comment-store.ts +0 -331
package/src/review/store/comment-thread.ts +0 -109
package/src/review/store/revision-actions.ts +0 -394
package/src/review/store/revision-store.ts +0 -303
package/src/review/store/revision-types.ts +0 -168
package/src/review/store/runtime-comment-store.ts +0 -43
package/src/runtime/README.md +0 -3
package/src/runtime/ai-action-policy.ts +0 -764
package/src/runtime/document-runtime.ts +0 -969
package/src/runtime/read-only-diagnostics-runtime.ts +0 -232
package/src/runtime/review-runtime.ts +0 -44
package/src/runtime/revision-runtime.ts +0 -107
package/src/runtime/session-capabilities.ts +0 -138
package/src/runtime/surface-projection.ts +0 -570
package/src/runtime/table-commands.ts +0 -84
package/src/runtime/table-schema.ts +0 -125
package/src/ui/README.md +0 -30
package/src/ui/WordReviewEditor.tsx +0 -1283
package/src/ui/comments/README.md +0 -3
package/src/ui/compatibility/README.md +0 -3
package/src/ui/editor-surface/README.md +0 -3
package/src/ui/headless/comment-decoration-model.ts +0 -124
package/src/ui/headless/revision-decoration-model.ts +0 -128
package/src/ui/headless/selection-helpers.ts +0 -34
package/src/ui/headless/use-editor-keyboard.ts +0 -98
package/src/ui/review/README.md +0 -3
package/src/ui/shared/revision-filters.ts +0 -31
package/src/ui/status/README.md +0 -3
package/src/ui/theme/README.md +0 -3
package/src/ui/toolbar/README.md +0 -3
package/src/ui-tailwind/chrome/tw-alert-banner.tsx +0 -48
package/src/ui-tailwind/chrome/tw-selection-toolbar.tsx +0 -44
package/src/ui-tailwind/chrome/tw-unsaved-modal.tsx +0 -58
package/src/ui-tailwind/chrome/use-before-unload.ts +0 -20
package/src/ui-tailwind/editor-surface/pm-command-bridge.ts +0 -139
package/src/ui-tailwind/editor-surface/pm-decorations.ts +0 -98
package/src/ui-tailwind/editor-surface/pm-position-map.ts +0 -123
package/src/ui-tailwind/editor-surface/pm-schema.ts +0 -452
package/src/ui-tailwind/editor-surface/pm-state-from-snapshot.ts +0 -327
package/src/ui-tailwind/editor-surface/search-plugin.ts +0 -157
package/src/ui-tailwind/editor-surface/tw-caret.tsx +0 -12
package/src/ui-tailwind/editor-surface/tw-editor-surface.tsx +0 -150
package/src/ui-tailwind/editor-surface/tw-inline-token.tsx +0 -118
package/src/ui-tailwind/editor-surface/tw-opaque-block.tsx +0 -52
package/src/ui-tailwind/editor-surface/tw-paragraph-block.tsx +0 -151
package/src/ui-tailwind/editor-surface/tw-prosemirror-surface.tsx +0 -215
package/src/ui-tailwind/editor-surface/tw-segment-view.tsx +0 -111
package/src/ui-tailwind/editor-surface/tw-table-node-view.tsx +0 -108
package/src/ui-tailwind/index.ts +0 -61
package/src/ui-tailwind/review/tw-comment-sidebar.tsx +0 -276
package/src/ui-tailwind/review/tw-health-panel.tsx +0 -120
package/src/ui-tailwind/review/tw-review-rail.tsx +0 -120
package/src/ui-tailwind/review/tw-revision-sidebar.tsx +0 -164
package/src/ui-tailwind/status/tw-status-bar.tsx +0 -58
package/src/ui-tailwind/theme/editor-theme.css +0 -190
package/src/ui-tailwind/toolbar/tw-toolbar-icon-button.tsx +0 -48
package/src/ui-tailwind/toolbar/tw-toolbar.tsx +0 -231
package/src/ui-tailwind/tw-review-workspace.tsx +0 -140
package/src/validation/README.md +0 -3
package/src/validation/compatibility-engine.ts +0 -317
package/src/validation/compatibility-report.ts +0 -160
package/src/validation/diagnostics.ts +0 -203
package/src/validation/import-diagnostics.ts +0 -128
package/src/validation/low-priority-word-surfaces.ts +0 -373
package/test/README.md +0 -16
package/test/core/formatting-commands.test.ts +0 -285
package/test/core/image-commands.test.ts +0 -298
package/test/core/mapping.test.ts +0 -186
package/test/core/text-commands.test.ts +0 -176
package/test/fixtures/docx/F01-basic-contract.docx +0 -0
package/test/fixtures/docx/F01-basic-contract.md +0 -33
package/test/fixtures/docx/F02-headings-styles.docx +0 -0
package/test/fixtures/docx/F02-headings-styles.md +0 -33
package/test/fixtures/docx/F03-legal-outline-numbering.docx +0 -0
package/test/fixtures/docx/F03-legal-outline-numbering.md +0 -34
package/test/fixtures/docx/F04-restart-numbering-schedules.docx +0 -0
package/test/fixtures/docx/F04-restart-numbering-schedules.md +0 -33
package/test/fixtures/docx/F05-table-heavy-agreement.docx +0 -0
package/test/fixtures/docx/F05-table-heavy-agreement.md +0 -34
package/test/fixtures/docx/F06-merged-cells-signature-table.docx +0 -0
package/test/fixtures/docx/F06-merged-cells-signature-table.md +0 -34
package/test/fixtures/docx/F07-inline-images-exhibit.docx +0 -0
package/test/fixtures/docx/F07-inline-images-exhibit.md +0 -34
package/test/fixtures/docx/F08-hyperlinks.docx +0 -0
package/test/fixtures/docx/F08-hyperlinks.md +0 -33
package/test/fixtures/docx/F09-comments-single-paragraph.docx +0 -0
package/test/fixtures/docx/F09-comments-single-paragraph.md +0 -33
package/test/fixtures/docx/F10-threaded-comments-resolve.docx +0 -0
package/test/fixtures/docx/F10-threaded-comments-resolve.md +0 -33
package/test/fixtures/docx/F11-redlines-basic.docx +0 -0
package/test/fixtures/docx/F11-redlines-basic.md +0 -33
package/test/fixtures/docx/F12-redlines-paragraph-joins-splits.docx +0 -0
package/test/fixtures/docx/F12-redlines-paragraph-joins-splits.md +0 -33
package/test/fixtures/docx/F13-comments-on-deleted-text.docx +0 -0
package/test/fixtures/docx/F13-comments-on-deleted-text.md +0 -33
package/test/fixtures/docx/F14-revisions-in-tables-and-lists.docx +0 -0
package/test/fixtures/docx/F14-revisions-in-tables-and-lists.md +0 -33
package/test/fixtures/docx/F15-sections-headers-footers.docx +0 -0
package/test/fixtures/docx/F15-sections-headers-footers.md +0 -33
package/test/fixtures/docx/F16-footnotes-endnotes.docx +0 -0
package/test/fixtures/docx/F16-footnotes-endnotes.md +0 -33
package/test/fixtures/docx/F17-fields-and-toc.docx +0 -0
package/test/fixtures/docx/F17-fields-and-toc.md +0 -33
package/test/fixtures/docx/F18-content-controls-template.docx +0 -0
package/test/fixtures/docx/F18-content-controls-template.md +0 -33
package/test/fixtures/docx/F19-custom-xml-doc-assembly.docx +0 -0
package/test/fixtures/docx/F19-custom-xml-doc-assembly.md +0 -35
package/test/fixtures/docx/F20-unknown-ooxml-and-alternatecontent.docx +0 -0
package/test/fixtures/docx/F20-unknown-ooxml-and-alternatecontent.md +0 -33
package/test/fixtures/docx/F21-malformed-broken-docx.docx +0 -0
package/test/fixtures/docx/F21-malformed-broken-docx.md +0 -33
package/test/fixtures/docx/README.md +0 -74
package/test/fixtures/docx/certification-manifest.json +0 -104
package/test/fixtures/docx/fixtures.manifest.json +0 -196
package/test/fixtures/encrypted-docx/README.md +0 -27
package/test/fixtures/encrypted-docx/certification-manifest.json +0 -9
package/test/fixtures/encrypted-docx/fixtures.manifest.json +0 -47
package/test/fixtures/scenarios/docx/README.md +0 -25
package/test/fixtures/scenarios/docx/S01-sow-template.docx +0 -0
package/test/fixtures/scenarios/docx/S01-sow-template.md +0 -30
package/test/fixtures/scenarios/docx/S02-bw-partner-user-licence-agreement-redlines.docx +0 -0
package/test/fixtures/scenarios/docx/S02-bw-partner-user-licence-agreement-redlines.md +0 -32
package/test/fixtures/scenarios/docx/scenario-manifest.json +0 -53
package/test/formats/xlsx/io/xlsx-import.test.ts +0 -766
package/test/formats/xlsx/model/workbook.test.ts +0 -669
package/test/helpers/dom-setup.ts +0 -124
package/test/io/comment-roundtrip.test.ts +0 -272
package/test/io/complex-content-roundtrip.test.ts +0 -632
package/test/io/docx-compatibility-regression.test.ts +0 -199
package/test/io/docx-session.test.ts +0 -1495
package/test/io/footnotes-roundtrip.test.ts +0 -318
package/test/io/headers-footers-roundtrip.test.ts +0 -547
package/test/io/numbering-roundtrip.test.ts +0 -234
package/test/io/package-reader.test.ts +0 -199
package/test/io/paragraph-properties-roundtrip.test.ts +0 -129
package/test/io/preserved-package-roundtrip.test.ts +0 -365
package/test/io/property-completeness.test.ts +0 -292
package/test/io/revision-roundtrip.test.ts +0 -347
package/test/io/structural-blocks.test.ts +0 -202
package/test/io/table-media-roundtrip.test.ts +0 -448
package/test/io/table-properties-roundtrip.test.ts +0 -569
package/test/io/table-roundtrip.test.ts +0 -302
package/test/io/text-roundtrip.test.ts +0 -344
package/test/model/canonical-document.test.ts +0 -285
package/test/preservation/opaque-fragment-store.test.ts +0 -121
package/test/preservation/package-preservation.test.ts +0 -395
package/test/preservation/store.test.ts +0 -84
package/test/review/comment-remapping.test.ts +0 -220
package/test/review/comment-store.test.ts +0 -180
package/test/review/move-revisions.test.ts +0 -143
package/test/review/property-change-revisions.test.ts +0 -225
package/test/review/revision-actions.test.ts +0 -330
package/test/review/revision-store.test.ts +0 -193
package/test/runtime/session-capabilities.test.ts +0 -260
package/test/runtime/table-commands.test.ts +0 -356
package/test/runtime/table-schema.test.ts +0 -221
package/test/runtime/tracked-changes-toggle.test.ts +0 -107
package/test/ui/comment-review-surface.test.tsx +0 -114
package/test/ui/reduced-motion-toggle.test.tsx +0 -137
package/test/ui/word-review-editor.imported-scenarios.test.tsx +0 -169
package/test/ui/word-review-editor.interaction.test.tsx +0 -1198
package/test/ui/word-review-editor.test.js +0 -188
package/test/ui/word-review-editor.test.tsx +0 -280
package/test/ui-tailwind/search-plugin.test.ts +0 -286
package/test/validation/compatibility-engine.test.ts +0 -336
package/test/validation/compatibility-report.test.ts +0 -189
package/test/validation/low-priority-word-surfaces.test.ts +0 -282
package/test/validation/malformed-doc.test.ts +0 -113
package/test-results/.last-run.json +0 -4
package/wave.config.json +0 -406

package/docs/evals/external-benchmarks.json DELETED

@@ -1,85 +0,0 @@
-{
-  "version": 1,
-  "adapters": [
-    {
-      "id": "swe-bench-pro",
-      "title": "SWE-bench Pro",
-      "mode": "direct",
-      "sourceBenchmark": "SWE-bench Pro",
-      "split": "public",
-      "pilotManifestPath": "docs/evals/pilots/swe-bench-pro-public-pilot.json",
-      "officialDocsUrl": "https://scaleapi.github.io/SWE-bench_Pro-os/",
-      "officialCodeUrl": "https://github.com/scaleapi/SWE-bench_Pro-os",
-      "summary": "Contamination-resistant long-horizon software engineering benchmark for public, held-out, and commercial repositories.",
-      "commandTemplate": "",
-      "metrics": ["task-success-rate", "cost-per-solved-task", "wall-clock-per-solved-task"],
-      "notes": [
-        "Use the public split for the first direct external benchmark run and rely on the official verifier for pass or fail.",
-        "Keep the base model, executor, and budget identical across the `single-agent` and `full-wave` arms.",
-        "The second direct benchmark slot is intentionally deferred until the later CooperBench pass."
-      ]
-    },
-    {
-      "id": "skillsbench-style-ablation",
-      "title": "SkillsBench-style Ablation",
-      "mode": "adapted",
-      "sourceBenchmark": "SkillsBench",
-      "summary": "Adapt the SkillsBench methodology to Wave skill bundles by comparing no skills, curated skills, and overbroad skills.",
-      "commandTemplate": "wave benchmark run --arm single-agent --arm multi-agent-minimal --arm full-wave",
-      "metrics": ["pass-rate-delta", "negative-skill-regression-rate", "runtime-cost"],
-      "notes": [
-        "This is a local adaptation rather than a direct external suite.",
-        "The initial repo benchmark runner ships the local corpus and registry, not the full external execution harness."
-      ]
-    },
-    {
-      "id": "evoclaw-style-sequence",
-      "title": "EvoClaw-style Sequence",
-      "mode": "adapted",
-      "sourceBenchmark": "EvoClaw",
-      "summary": "Sequence multiple dependent waves to measure long-horizon maintenance and error accumulation.",
-      "commandTemplate": "wave benchmark run --arm single-agent --arm full-wave --family silo-escape",
-      "metrics": ["milestone-pass-decay", "reopen-rate", "regression-carryover"],
-      "notes": [
-        "Use the local benchmark harness to define milestone DAGs or ordered wave sequences.",
-        "Best used after the deterministic coordination corpus is stable."
-      ]
-    },
-    {
-      "id": "silo-bench-style-coordination",
-      "title": "Silo-Bench-style Coordination",
-      "mode": "adapted",
-      "sourceBenchmark": "Silo-Bench",
-      "summary": "Distributed-information and communication-reasoning-gap evaluations adapted into Wave-native coordination fixtures.",
-      "commandTemplate": "wave benchmark run --family hidden-profile-pooling --family silo-escape",
-      "metrics": ["distributed-info-accuracy", "global-state-reconstruction-rate", "communication-reasoning-gap"],
-      "notes": [
-        "The shipped local cases in docs/evals/cases/ are the first adaptation layer for this family."
-      ]
-    },
-    {
-      "id": "hiddenbench-style-pooling",
-      "title": "HiddenBench-style Pooling",
-      "mode": "adapted",
-      "sourceBenchmark": "HiddenBench",
-      "summary": "Asymmetric-information tasks that focus specifically on whether decision-changing private evidence reaches shared state before closure.",
-      "commandTemplate": "wave benchmark run --family hidden-profile-pooling",
-      "metrics": ["distributed-info-accuracy", "premature-convergence-rate"],
-      "notes": [
-        "This is the recommended next coordination benchmark after the first SWE-bench Pro pilot."
-      ]
-    },
-    {
-      "id": "dpbench-style-contention",
-      "title": "DPBench-style Contention",
-      "mode": "adapted",
-      "sourceBenchmark": "DPBench",
-      "summary": "Simultaneous coordination and contention cases adapted into capability-routing and helper-assignment fixtures.",
-      "commandTemplate": "wave benchmark run --family simultaneous-coordination",
-      "metrics": ["deadlock-rate", "contention-resolution-rate", "symmetry-breaking-rate"],
-      "notes": [
-        "The initial local corpus measures the routing and blocking substrate before live concurrent execution is added."
-      ]
-    }
-  ]
-}

package/docs/evals/external-command-config.sample.json DELETED

@@ -1,9 +0,0 @@
-{
-  "adapters": {
-    "swe-bench-pro": {
-      "single-agent": "external-harness run --benchmark swe-bench-pro --task {task_id} --arm {arm} --model {model_id} --executor {executor_command}",
-      "full-wave": "external-harness run --benchmark swe-bench-pro --task {task_id} --arm {arm} --model {model_id} --executor {executor_command}",
-      "verify": "external-harness verify --benchmark swe-bench-pro --task {task_id} --arm {arm}"
-    }
-  }
-}
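The command templates in the deleted config above carry `{task_id}`, `{arm}`, `{model_id}`, and `{executor_command}` placeholders. The diff does not show how the runner expands them, so the following is only a minimal sketch of one plausible substitution step. The helper name `fillTemplate` and the sample values are hypothetical, not part of the package.

```typescript
// Hypothetical helper (not from the package): expands {name} placeholders
// in a command template. Placeholders without a supplied value are left
// untouched so that a missing value stays visible in the final command.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{([a-z_]+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match,
  );
}

// Template text copied from the "verify" entry of the sample config.
const verifyTemplate =
  "external-harness verify --benchmark swe-bench-pro --task {task_id} --arm {arm}";

const command = fillTemplate(verifyTemplate, {
  task_id: "example-task", // illustrative value, not a real task id
  arm: "single-agent",
});

console.log(command);
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, is a design choice made for this sketch: it makes an incomplete configuration easy to spot before a command is executed.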

package/docs/evals/external-command-config.swe-bench-pro.json DELETED

@@ -1,8 +0,0 @@
-{
-  "adapters": {
-    "swe-bench-pro": {
-      "single-agent": "node \"scripts/wave-orchestrator/swe-bench-pro-task.mjs\" run --instance \"{task_id}\" --arm \"{arm}\" --model \"{model_id}\" --reasoning-effort \"{reasoning_effort}\" --max-wall-clock-minutes \"{max_wall_clock_minutes}\" --max-turns \"{max_turns}\"",
-      "full-wave": "node \"scripts/wave-orchestrator/swe-bench-pro-task.mjs\" run --instance \"{task_id}\" --arm \"{arm}\" --model \"{model_id}\" --reasoning-effort \"{reasoning_effort}\" --max-wall-clock-minutes \"{max_wall_clock_minutes}\" --max-turns \"{max_turns}\""
-    }
-  }
-}

package/docs/evals/pilots/README.md DELETED

@@ -1,47 +0,0 @@
----
-title: "External Benchmark Pilots"
-summary: "Frozen pilot manifests for the first honest direct benchmark runs."
----
-# External Benchmark Pilots
-These manifests freeze the first-run task selections for direct external benchmarks.
-They exist to prevent:
-- ad hoc task picking
-- silent pilot drift between runs
-- unfair re-sampling after seeing results
-The current frozen direct pilot is:
-- `SWE-bench Pro`
-Each manifest records:
-- benchmark id
-- split assumptions
-- sample strategy
-- exact task ids
-- task-level metadata needed for later aggregation
-These manifests are benchmark inputs, not run history.
-If a smaller or narrower batch is needed after the canonical pilot is frozen, create a
-new derivative manifest rather than editing the original file in place.
-Derivative manifests must:
-- name the parent frozen manifest they were derived from
-- explain the deterministic subset rule they use
-- state whether they are review-only or comparison-ready
-Example:
-- `docs/evals/pilots/swe-bench-pro-public-full-wave-review-10.json`
-  is a review-only 10-task subset derived from the frozen 20-task SWE-bench Pro public pilot.
-  It exists for a multi-agent diagnostic sweep and does not replace the canonical
-  single-agent versus full-wave comparison.
-When a derivative review batch is run, inspect the generated `failure-review.md` before
-treating any aggregate score as capability evidence.

package/docs/evals/pilots/swe-bench-pro-public-full-wave-review-10.json DELETED

@@ -1,64 +0,0 @@
-{
-  "version": 1,
-  "id": "swe-bench-pro-public-full-wave-review-10",
-  "benchmarkId": "swe-bench-pro",
-  "title": "SWE-bench Pro Public Full-Wave Review 10",
-  "split": "public",
-  "sampleStrategy": "first-listed-per-repo-from-frozen-20-task-pilot",
-  "sampleSource": "Derived from docs/evals/pilots/swe-bench-pro-public-pilot.json by taking the first listed task for each repository pair in the frozen 20-task public pilot.",
-  "derivedFromManifestPath": "docs/evals/pilots/swe-bench-pro-public-pilot.json",
-  "reviewOnly": true,
-  "reviewScope": "multi-agent-only-diagnostic",
-  "tasks": [
-    {
-      "taskId": "instance_NodeBB__NodeBB-04998908ba6721d64eba79ae3b65a351dcfbc5b5-vnan",
-      "repo": "NodeBB/NodeBB",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_qutebrowser__qutebrowser-f91ace96223cac8161c16dd061907e138fe85111-v059c6fdc75567943479b23ebca7c07b5e9a7f34c",
-      "repo": "qutebrowser/qutebrowser",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_ansible__ansible-f327e65d11bb905ed9f15996024f857a95592629-vba6da65a0f3baefda7a058ebbd0a8dcafb8512f5",
-      "repo": "ansible/ansible",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_internetarchive__openlibrary-4a5d2a7d24c9e4c11d3069220c0685b736d5ecde-v13642507b4fc1f8d234172bf8129942da2c2ca26",
-      "repo": "internetarchive/openlibrary",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_gravitational__teleport-3fa6904377c006497169945428e8197158667910-v626ec2a48416b10a88641359a169d99e935ff037",
-      "repo": "gravitational/teleport",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_navidrome__navidrome-7073d18b54da7e53274d11c9e2baef1242e8769e",
-      "repo": "navidrome/navidrome",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_element-hq__element-web-33e8edb3d508d6eefb354819ca693b7accc695e7",
-      "repo": "element-hq/element-web",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_future-architect__vuls-407407d306e9431d6aa0ab566baa6e44e5ba2904",
-      "repo": "future-architect/vuls",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_flipt-io__flipt-e42da21a07a5ae35835ec54f74004ebd58713874",
-      "repo": "flipt-io/flipt",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_protonmail__webclients-2c3559cad02d1090985dba7e8eb5a129144d9811",
-      "repo": "protonmail/webclients",
-      "repoLanguage": "js"
-    }
-  ]
-}
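The `sampleSource` field of the deleted manifest above describes a deterministic subset rule: take the first listed task for each repository in the frozen 20-task pilot. A minimal sketch of that rule is shown here. The `PilotTask` shape mirrors the fields visible in the manifest, while the function name `firstListedPerRepo` and the sample ids are hypothetical.

```typescript
// Field names mirror the task entries in the manifest.
interface PilotTask {
  taskId: string;
  repo: string;
  repoLanguage: string;
}

// Hypothetical helper: keeps the first listed task for each repo and
// preserves the parent manifest's ordering.
function firstListedPerRepo(tasks: PilotTask[]): PilotTask[] {
  const seenRepos = new Set<string>();
  const subset: PilotTask[] = [];
  for (const task of tasks) {
    if (!seenRepos.has(task.repo)) {
      seenRepos.add(task.repo);
      subset.push(task);
    }
  }
  return subset;
}

// Illustrative parent list with placeholder ids (not real task ids).
const parentTasks: PilotTask[] = [
  { taskId: "nodebb-1", repo: "NodeBB/NodeBB", repoLanguage: "js" },
  { taskId: "nodebb-2", repo: "NodeBB/NodeBB", repoLanguage: "js" },
  { taskId: "flipt-1", repo: "flipt-io/flipt", repoLanguage: "go" },
  { taskId: "flipt-2", repo: "flipt-io/flipt", repoLanguage: "go" },
];

const reviewSubset = firstListedPerRepo(parentTasks);
console.log(reviewSubset.map((task) => task.taskId));
```

Because the rule depends only on the order of the parent manifest, rerunning it against the same frozen file always yields the same subset.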

package/docs/evals/pilots/swe-bench-pro-public-pilot.json DELETED

@@ -1,111 +0,0 @@
-{
-  "version": 1,
-  "id": "swe-bench-pro-public-pilot",
-  "benchmarkId": "swe-bench-pro",
-  "title": "SWE-bench Pro Public Pilot",
-  "split": "public",
-  "sampleStrategy": "fixed-stratified-public-slice",
-  "sampleSource": "First 100 public rows from the Hugging Face dataset viewer, stratified to two tasks per selected repository where available.",
-  "tasks": [
-    {
-      "taskId": "instance_NodeBB__NodeBB-04998908ba6721d64eba79ae3b65a351dcfbc5b5-vnan",
-      "repo": "NodeBB/NodeBB",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_NodeBB__NodeBB-51d8f3b195bddb13a13ddc0de110722774d9bb1b-vf2cf3cbd463b7ad942381f1c6d077626485a1e9e",
-      "repo": "NodeBB/NodeBB",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_qutebrowser__qutebrowser-f91ace96223cac8161c16dd061907e138fe85111-v059c6fdc75567943479b23ebca7c07b5e9a7f34c",
-      "repo": "qutebrowser/qutebrowser",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_qutebrowser__qutebrowser-c580ebf0801e5a3ecabc54f327498bb753c6d5f2-v2ef375ac784985212b1805e1d0431dc8f1b3c171",
-      "repo": "qutebrowser/qutebrowser",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_ansible__ansible-f327e65d11bb905ed9f15996024f857a95592629-vba6da65a0f3baefda7a058ebbd0a8dcafb8512f5",
-      "repo": "ansible/ansible",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_ansible__ansible-a26c325bd8f6e2822d9d7e62f77a424c1db4fbf6-v0f01c69f1e2528b935359cfe578530722bca2c59",
-      "repo": "ansible/ansible",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_internetarchive__openlibrary-4a5d2a7d24c9e4c11d3069220c0685b736d5ecde-v13642507b4fc1f8d234172bf8129942da2c2ca26",
-      "repo": "internetarchive/openlibrary",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_internetarchive__openlibrary-dbbd9d539c6d4fd45d5be9662aa19b6d664b5137-v08d8e8889ec945ab821fb156c04c7d2e2810debb",
-      "repo": "internetarchive/openlibrary",
-      "repoLanguage": "python"
-    },
-    {
-      "taskId": "instance_gravitational__teleport-3fa6904377c006497169945428e8197158667910-v626ec2a48416b10a88641359a169d99e935ff037",
-      "repo": "gravitational/teleport",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_gravitational__teleport-c782838c3a174fdff80cafd8cd3b1aa4dae8beb2",
-      "repo": "gravitational/teleport",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_navidrome__navidrome-7073d18b54da7e53274d11c9e2baef1242e8769e",
-      "repo": "navidrome/navidrome",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_navidrome__navidrome-b65e76293a917ee2dfc5d4b373b1c62e054d0dca",
-      "repo": "navidrome/navidrome",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_element-hq__element-web-33e8edb3d508d6eefb354819ca693b7accc695e7",
-      "repo": "element-hq/element-web",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_element-hq__element-web-5dfde12c1c1c0b6e48f17e3405468593e39d9492-vnan",
-      "repo": "element-hq/element-web",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_future-architect__vuls-407407d306e9431d6aa0ab566baa6e44e5ba2904",
-      "repo": "future-architect/vuls",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_future-architect__vuls-e6c0da61324a0c04026ffd1c031436ee2be9503a",
-      "repo": "future-architect/vuls",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_flipt-io__flipt-e42da21a07a5ae35835ec54f74004ebd58713874",
-      "repo": "flipt-io/flipt",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_flipt-io__flipt-3b2c25ee8a3ac247c3fad13ad8d64ace34ec8ee7",
-      "repo": "flipt-io/flipt",
-      "repoLanguage": "go"
-    },
-    {
-      "taskId": "instance_protonmail__webclients-2c3559cad02d1090985dba7e8eb5a129144d9811",
-      "repo": "protonmail/webclients",
-      "repoLanguage": "js"
-    },
-    {
-      "taskId": "instance_protonmail__webclients-6dcf0d0b0f7965ad94be3f84971afeb437f25b02",
-      "repo": "protonmail/webclients",
-      "repoLanguage": "js"
-    }
-  ]
-}

package/docs/evals/wave-benchmark-program.md DELETED

@@ -1,302 +0,0 @@
----
-title: "Wave Benchmark Program"
-summary: "Locked benchmark spec for Wave-native coordination evaluations, baseline arms, scoring rules, and external benchmark positioning."
----
-# Wave Benchmark Program
-This document is the implementation-side contract for Wave benchmarking.
-It complements:
-- `docs/evals/benchmark-catalog.json` for benchmark vocabulary
-- `docs/evals/cases/` for the deterministic local corpus
-- `docs/evals/external-benchmarks.json` for external adapters and positioning
-- `scripts/wave-orchestrator/benchmark.mjs` for execution and reporting
-## First Public Claim
-The first claim this benchmark program is designed to support is:
-> Under equal executor assumptions, the full Wave orchestration surface improves distributed-state reconstruction, inbox targeting, routing quality, and premature-closure resistance relative to stripped-down baselines.
-This is intentionally narrower than "Wave is better than all coding agents."
-## Benchmark Arms
-The benchmark runner supports these arms:
-- `single-agent`
-  One primary owner operates from a local view of records they authored. No inbound targeted coordination is compiled into that arm, and there is no compiled shared summary, no targeted inboxes, no capability routing, and no explicit closure guard simulation.
-- `multi-agent-minimal`
-  Multiple agents exist, but they only share a minimal global summary. There is no targeted inbox routing and no benchmark-aware closure discipline.
-- `full-wave`
-  The current Wave projection and routing surfaces are used: canonical coordination state, compiled shared summary, targeted inboxes, request assignments, and closure-guard simulation.
-- `full-wave-plus-improvement`
-  Reserved for later benchmark-improvement loops after a baseline is established. The runner supports the arm id, but the initial local corpus focuses on the first three arms.
-## Shipped Native Families
-The first shipped deterministic corpus covers one case in each of the core coordination families:
-- `hidden-profile-pooling`
-- `silo-escape`
-- `blackboard-fidelity`
-- `contradiction-recovery`
-- `simultaneous-coordination`
-- `expertise-leverage`
-It also includes a cross-cutting premature-closure guard case under `hidden-profile-pooling / premature-consensus-guard`.
-## Scoring Rules
-Each benchmark case defines:
-- `familyId`
-- `benchmarkId`
-- `supportedArms`
-- `fixture`
-- `expectations`
-- `scoring.kind`
-- `scoring.primaryMetric`
-- `scoring.thresholds`
-The runner computes case-level metrics from deterministic coordination fixtures using current Wave machinery where possible:
-- `compileSharedSummary()`
-- `compileAgentInbox()`
-- `buildRequestAssignments()`
-- `openClarificationLinkedRequests()`
-The primary metric determines case pass/fail. Directionality comes from the benchmark catalog, not from the case file.
-For reporting above the case level, the runner also computes a direction-aligned score:
-- `higher-is-better` metrics keep their raw score
-- `lower-is-better` metrics are flipped to `100 - rawScore`
-That rule applies to:
-- family `meanScore`
-- overall and family `meanDelta`
-- the `statisticallyConfident` comparison flag
-This keeps a positive delta semantically stable: positive always means "better than baseline" even when a case's raw primary metric is lower-is-better.
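The direction-aligned rule described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the runner's actual code: the names `alignScore` and `mean` and the sample scores are hypothetical, and scores are assumed to lie on a 0 to 100 scale.

```typescript
type Direction = "higher-is-better" | "lower-is-better";

// Hypothetical helper: higher-is-better metrics keep their raw score,
// lower-is-better metrics are flipped to 100 - rawScore.
function alignScore(rawScore: number, direction: Direction): number {
  return direction === "higher-is-better" ? rawScore : 100 - rawScore;
}

function mean(values: number[]): number {
  if (values.length === 0) {
    throw new Error("mean of an empty list is undefined");
  }
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

// Illustrative raw scores for two cases (made-up numbers).
const cases: { direction: Direction; baseline: number; fullWave: number }[] = [
  { direction: "higher-is-better", baseline: 50, fullWave: 80 },
  { direction: "lower-is-better", baseline: 100, fullWave: 0 },
];

// Per-case delta of aligned scores: positive always means "better than baseline".
const alignedDeltas = cases.map(
  (c) => alignScore(c.fullWave, c.direction) - alignScore(c.baseline, c.direction),
);
const meanDelta = mean(alignedDeltas);

console.log(alignedDeltas, meanDelta);
```

In the second case the raw score drops from 100 to 0, yet the aligned delta is +100, which is the stability property the rule is meant to provide.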
-## Significance And Comparative Reporting
-Comparative reporting uses:
-- mean score delta versus the `single-agent` baseline
-- bootstrap confidence intervals over case deltas
-- a confidence rule: only report a statistically confident win when the lower bound of the confidence interval is above zero
-The initial implementation reports the practical delta directly and leaves final publication thresholds to operator judgment. The runner still records the per-case practical win threshold in the case definition so later work can harden claim logic without changing the corpus format.
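The confidence rule above can be illustrated with a percentile bootstrap over per-case deltas. The document does not specify the resample count, the interval level, or the bootstrap variant, so the 1000 resamples, the 95% level, the percentile method, and the seeded generator below are all assumptions made for this sketch. The function names are hypothetical.

```typescript
// Small seeded PRNG (mulberry32) so the sketch is reproducible.
// The choice of generator and seed is an assumption of this example.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

interface BootstrapResult {
  lower: number;
  upper: number;
  statisticallyConfident: boolean;
}

// Hypothetical helper: percentile bootstrap CI over the mean of case deltas.
function bootstrapMeanDelta(deltas: number[], resamples = 1000, seed = 1): BootstrapResult {
  if (deltas.length === 0) {
    throw new Error("at least one case delta is required");
  }
  const random = mulberry32(seed);
  const means: number[] = [];
  for (let i = 0; i < resamples; i++) {
    let sum = 0;
    // Resample the case deltas with replacement.
    for (let j = 0; j < deltas.length; j++) {
      sum += deltas[Math.floor(random() * deltas.length)];
    }
    means.push(sum / deltas.length);
  }
  means.sort((x, y) => x - y);
  const lower = means[Math.floor(0.025 * resamples)];
  const upper = means[Math.min(resamples - 1, Math.floor(0.975 * resamples))];
  // Confident win only when the lower bound of the interval is above zero.
  return { lower, upper, statisticallyConfident: lower > 0 };
}

// Illustrative aligned deltas versus the single-agent baseline (made-up numbers).
const result = bootstrapMeanDelta([12, 8, 20, 5, 15]);
console.log(result);
```

With few cases the interval is wide, which is consistent with the document's choice to leave final publication thresholds to operator judgment.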
-## Corpus Design Rules
-The local case corpus follows these constraints:
-- deterministic and file-backed
-- cheap enough to run in ordinary repo CI or local development
-- focused on Wave-native surfaces, not generic model capability
-- auditable by inspecting the case JSON, generated summaries, inboxes, and assignments
-- extensible to live-run and trace-backed variants later
-The first corpus deliberately exercises projection, routing, and closure logic before attempting expensive live multi-executor runs.
-## Native Benchmarking Mode
-`wave benchmark run` is the native deterministic benchmarking mode.
-This mode is intentionally narrow:
-- it tests the Wave substrate, not generic model capability
-- it holds the coordination fixture constant and varies only the arm behavior
-- it uses current Wave machinery to compile summaries, inboxes, assignments, and closure guards
-- it is cheap and reproducible enough to run in local development and CI
-What it is meant to prove:
-- the blackboard projections preserve decision-changing state
-- targeted inboxes reduce silos instead of creating them
-- capability routing sends the right work to the right owner
-- contradiction handling becomes explicit repair work
-- closure guards resist premature PASS
-What it does not prove by itself:
-- raw coding ability on live repos
-- leaderboard-ready external benchmark performance
-- runtime-specific agent behavior under real tool pressure
-That separation is intentional. Native mode is the first honest proof layer for a MAS tool whose core claim is about shared state, routing, synthesis, and closure discipline.
-## Native Metric Contract
-For each case and arm, the native runner records:
-- `score`
-  The case's primary metric value.
-- `alignedScore`
-  The direction-aligned case score used for family summaries and deltas.
-- `passed`
-  Whether the primary metric satisfied the case threshold.
-- `direction`
-  Whether the metric is `higher-is-better` or `lower-is-better`.
-- `threshold`
-  The configured case threshold for the primary metric.
-- `metrics`
-  The full metric map computed from the deterministic fixture.
-- `details`
-  Supporting breakdowns such as matched global facts, summary facts, targeted inbox recall, assignment precision, distinct assigned agents, and whether the blocking guard tripped.
-- `artifacts`
-  The generated `sharedSummary`, `inboxes`, `assignments`, and `blockingGuard` state used to score the arm.
-The runner also records:
-- `familySummary`
-  Direction-aligned mean score and pass rate per family and arm.
-- `comparisons`
-  Direction-aligned mean delta versus `single-agent`, bootstrap confidence intervals, and a conservative `statisticallyConfident` flag.
-When `waveControl` reporting is enabled, native runs also publish:
-- `benchmark_run`
-  Suite-level metadata, selected arms, family summary, and comparison summary.
-- `benchmark_item`
-  Full per-case, per-arm payloads including `score`, `alignedScore`, `passed`, `metrics`, `details`, and generated artifacts.
-Native mode does **not** emit `verification` or `review` events, because there is no external verifier and no benchmark-validity split to interpret. Those are reserved for `wave benchmark external-run`.
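The per-case, per-arm record described in this section could be typed roughly as follows. The field names come from the list above. The value types, the `buildCaseResult` helper, and the inclusive pass comparison are assumptions of this sketch, since the diff does not show the runner's own definitions.

```typescript
type Direction = "higher-is-better" | "lower-is-better";

// Field names follow the native metric contract; value types are assumed.
interface NativeCaseResult {
  score: number;
  alignedScore: number;
  passed: boolean;
  direction: Direction;
  threshold: number;
  metrics: Record<string, number>;
  details: Record<string, unknown>;
  artifacts: {
    sharedSummary: unknown;
    inboxes: unknown;
    assignments: unknown;
    blockingGuard: unknown;
  };
}

// Hypothetical builder. Assumption: a case passes when the primary metric
// meets the threshold inclusively, in the metric's own direction.
function buildCaseResult(
  primaryMetric: string,
  metrics: Record<string, number>,
  direction: Direction,
  threshold: number,
): NativeCaseResult {
  const score = metrics[primaryMetric];
  if (score === undefined) {
    throw new Error("primary metric missing from metric map: " + primaryMetric);
  }
  const higher = direction === "higher-is-better";
  return {
    score,
    alignedScore: higher ? score : 100 - score,
    passed: higher ? score >= threshold : score <= threshold,
    direction,
    threshold,
    metrics,
    details: {},
    artifacts: { sharedSummary: null, inboxes: null, assignments: null, blockingGuard: null },
  };
}

// Illustrative values only.
const example = buildCaseResult(
  "premature-convergence-rate",
  { "premature-convergence-rate": 0 },
  "lower-is-better",
  0,
);
console.log(example.passed, example.alignedScore);
```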
-## Native Metric Set
-The current deterministic runner logs the following metrics:
-| Metric | Native signal used today | Why it matters for the MAS claim |
-| --- | --- | --- |
-| `distributed-info-accuracy` | Percent of expected global facts visible in the integration-visible state: shared summary, integration-owner view when present, and structured assignment artifacts | Proves the team pooled distributed evidence rather than leaving it siloed |
-| `latent-asymmetry-surfacing-rate` | Clarification recall by explicit record id when a case expects missing-fact surfacing, otherwise targeted inbox recall | Proves the system notices that important evidence is still missing before closure |
-| `premature-convergence-rate` | `100` when a case required a blocking guard and the arm failed to keep it active, else `0` | Proves whether closure discipline resists converging on incomplete state |
-| `global-state-reconstruction-rate` | Percent of required cross-agent facts reconstructed in the integration-visible state rather than only in owner-private inboxes | Proves communication turned into a correct shared picture, not only message traffic |
-| `summary-fact-retention-rate` | Percent of required summary facts preserved in the shared summary | Proves summary compression is trustworthy enough to support downstream synthesis |
-| `communication-reasoning-gap` | `100 - global-state-reconstruction-rate` | Makes failure explicit when agents talk but still fail to integrate correctly |
-| `projection-consistency-rate` | Same summary-fidelity signal, framed for projection integrity | Proves the blackboard projections remain semantically aligned with canonical state |
-| `targeted-inbox-recall` | Percent of expected owner-specific facts present in the right inboxes | Proves targeted context actually reaches the agents who own the work |
-| `integration-coherence-rate` | Global-fact recall used as a proxy for integration fidelity in the deterministic corpus | Proves the synthesis layer reflects the underlying coordination state |
-| `contradiction-detection-rate` | Targeted-fact recall on contradiction-oriented fixtures | Proves conflicting claims become visible instead of being smoothed away |
-| `repair-closure-rate` | Assignment precision for required repair or follow-up work | Proves contradictions and blockers turn into owner-bound resolution work |
-| `false-consensus-rate` | `100` when a contradiction/premature-close guard should have held and did not, else `0` | Proves whether the system is narrating consensus where the state is still unresolved |
-| `deadlock-rate` | `100` when the arm failed to reach the required number of distinct owners in simultaneous-coordination cases, else `0` | Proves whether the team collapses under concurrent coordination pressure |
-| `contention-resolution-rate` | Assignment precision in concurrent blocker cases | Proves simultaneous work can resolve rather than stall |
-| `symmetry-breaking-rate` | Percent of the required distinct owners/choices achieved | Proves the team can break lockstep and avoid same-plan collapse |
-| `expert-preservation-rate` | Targeted-fact recall used on expert-preservation fixtures | Proves the strongest specialist signal survives into the visible decision path |
-| `capability-routing-precision` | Correct assignment rate for capability-routed requests | Proves the routing layer is steering work to the intended owner |
-| `expert-performance-gap` | `100 - expert-preservation-rate` | Makes expert-signal dilution explicit as a failure measure rather than an anecdote |
-Several of these metrics intentionally reuse the same deterministic signals under different benchmark families. The goal is not to grow the metric vocabulary; it is to ask the same core questions from multiple MAS failure angles:
-- did the right facts reach shared state
-- did the right owners receive the right context
-- did conflicts become explicit repair work
-- did closure wait for integrated proof
-The important constraint is that "shared state" here does **not** mean "the union of every owner inbox." The native runner scores global reconstruction from the integration-visible artifacts, so facts that remain split across private owner views do not count as reconstructed.
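A minimal sketch of how the reconstruction, gap, and guard signals described above could be computed. Only the metric definitions come from the table; the type names, field names, and the empty-case rule are assumptions for illustration, not the native runner's API.

```typescript
// Illustrative only: every name here is hypothetical, not taken from the Wave runner.
interface CaseState {
  expectedGlobalFacts: string[];
  // Shared summary, integration-owner view, and structured assignment artifacts.
  integrationVisibleFacts: string[];
  // Owner-private inboxes. Deliberately NOT consulted when scoring reconstruction.
  ownerInboxes: Record<string, string[]>;
}

// Percent of expected global facts present in the integration-visible state.
function globalStateReconstructionRate(state: CaseState): number {
  // Assumption: a case with no expected facts scores 100.
  if (state.expectedGlobalFacts.length === 0) return 100;
  const visible = new Set(state.integrationVisibleFacts);
  const hits = state.expectedGlobalFacts.filter((f) => visible.has(f)).length;
  return (100 * hits) / state.expectedGlobalFacts.length;
}

// `communication-reasoning-gap` = 100 - `global-state-reconstruction-rate`.
function communicationReasoningGap(state: CaseState): number {
  return 100 - globalStateReconstructionRate(state);
}

// Binary guard metrics such as `premature-convergence-rate`:
// 100 when a guard was required and did not hold, else 0.
function guardFailureRate(guardRequired: boolean, guardHeld: boolean): number {
  return guardRequired && !guardHeld ? 100 : 0;
}
```

Facts that appear only in `ownerInboxes` never raise the score, which mirrors the constraint that split private views do not count as reconstructed.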
-## Why These Metrics Matter
-The first public claim is not "Wave is a better model." It is that Wave is a better multi-agent coordination substrate.
-That means the most valuable native metrics are the ones that expose the failure cases from the README:
-- distributed-evidence metrics matter because a MAS that cannot pool private facts has no credible shared-state claim
-- summary and inbox metrics matter because a blackboard is only useful if the projections stay faithful and owner-relevant
-- routing metrics matter because specialist structure only helps if work actually lands on the right owner
-- contradiction and repair metrics matter because visible disagreement without repair is still coordination failure
-- premature-closure metrics matter because a MAS that can always narrate PASS is not proving anything
-- simultaneous-coordination metrics matter because many systems look fine in serial but collapse under concurrent blockers
-In other words, these metrics matter because they test the *coordination mechanism itself*, which is the actual product claim of Wave.
-## External Benchmark Positioning
-The external benchmark registry is split into two modes:
-- `direct`
-  The benchmark is treated as a runnable external suite with a command template or adapter recipe. The current direct target is `SWE-bench Pro`.
-- `adapted`
-  The benchmark is treated as a design reference whose failure mode should be mirrored with repo-local Wave cases. Current adapted targets are `SkillsBench`, `EvoClaw`, `HiddenBench`, `Silo-Bench`, and `DPBench`.
-This keeps the first milestone honest:
-- prove the Wave-specific substrate first
-- then layer in broader external reality checks
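The two registry modes could be modelled as a discriminated union, as in the sketch below. The field names `commandTemplate` and `mirroredBy` and the example values are hypothetical; the source only states that direct entries carry a command template or adapter recipe and that adapted entries are mirrored by repo-local cases.

```typescript
// Hypothetical registry shape; not the actual Wave registry schema.
type ExternalBenchmark =
  | { id: string; mode: "direct"; commandTemplate: string }
  | { id: string; mode: "adapted"; mirroredBy: string[] };

// Only `direct` entries are runnable as external suites.
function runnableBenchmarkIds(registry: ExternalBenchmark[]): string[] {
  return registry.filter((b) => b.mode === "direct").map((b) => b.id);
}

// Adapted entries are design references, so they map to local case ids instead.
function adaptedBenchmarkIds(registry: ExternalBenchmark[]): string[] {
  return registry.filter((b) => b.mode === "adapted").map((b) => b.id);
}
```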
-## Current Direct Benchmark
-The current direct external benchmark is:
-- `SWE-bench Pro`
-Why this benchmark now:
-- it is contamination-resistant relative to older SWE-bench variants
-- it has a public executable harness
-- it exercises real repository bug-fix work without turning the Wave coordination claim into a generic terminal-benchmark claim
-The second direct benchmark slot is intentionally deferred until a later `CooperBench` pass.
-The first direct comparison should include only two arms:
-- `single-agent`
-- `full-wave`
-Both arms must hold the following fixed:
-- model id
-- executor id and command
-- tool permissions
-- temperature and reasoning settings
-- wall-clock budget
-- turn budget
-- retry limit
-- verification harness
-- dataset version or task manifest
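One way to enforce the fixed-settings list is to compare the two arm configurations field by field before a run, as sketched below. The `ArmConfig` shape and its field names are assumptions that paraphrase the list above; they are not the real config schema.

```typescript
// Hypothetical config shape; each field mirrors one item in the list above.
interface ArmConfig {
  modelId: string;
  executorId: string;
  executorCommand: string;
  toolPermissions: string[];
  temperature: number;
  reasoningSettings: string;
  wallClockBudgetSeconds: number;
  turnBudget: number;
  retryLimit: number;
  verificationHarness: string;
  datasetVersion: string;
}

// Returns the names of fields that differ between the two arms.
// An empty result means the comparison is matched on every fixed setting.
function mismatchedFields(a: ArmConfig, b: ArmConfig): string[] {
  const keys = Object.keys(a) as (keyof ArmConfig)[];
  // JSON comparison keeps array fields (tool permissions) order-sensitive.
  return keys.filter((k) => JSON.stringify(a[k]) !== JSON.stringify(b[k]));
}
```

A runner could refuse to label a result as a matched `single-agent` versus `full-wave` comparison whenever this list is non-empty.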
-Execution should be driven through explicit command templates for the official benchmark harnesses rather than ad hoc shell invocation. The config shape lives at `docs/evals/external-command-config.sample.json`, and the local SWE-bench Pro harness is wired through `docs/evals/external-command-config.swe-bench-pro.json`.
-## Review-Only External Subsets
-After the canonical SWE-bench Pro pilot is frozen, narrower review batches may be derived for
-diagnostic work such as a `full-wave`-only sweep.
-Those runs are allowed only when they:
-- derive from an already-frozen pilot manifest instead of re-sampling freely
-- keep the review scope explicit in the manifest and report
-- avoid presenting the result as a matched `single-agent` versus `full-wave` claim
-Example:
-- `docs/evals/pilots/swe-bench-pro-public-full-wave-review-10.json`
-  is a 10-task diagnostic subset derived from the frozen 20-task SWE-bench Pro pilot.
-  It is suitable for multi-agent review work before a later pairwise rerun, but it does
-  not replace the canonical direct comparison.
-## Output Contract
-`wave benchmark run` writes results under `.tmp/wave-benchmarks/latest/` by default:
-- `results.json`
-- `results.md`
-`wave benchmark external-run` writes the same pair in its selected output directory plus:
-- `failure-review.json`
-- `failure-review.md`
-The failure review is the first artifact to inspect for review-only subsets because it
-separates verifier invalidation, setup or harness failures, dry-run planning output, and
-trustworthy patch-quality failures.
-These artifacts are local and reproducible. They are not intended to be committed as run history.
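The output contract above can be expressed as a small lookup that a test or CI step could use to know which files to expect. The function name and the `"run"` / `"external-run"` labels are illustrative assumptions; only the file names and the default directory come from the text.

```typescript
// Hypothetical helper: lists the artifact paths each subcommand is expected to write.
type BenchmarkCommand = "run" | "external-run";

// Default output directory for `wave benchmark run`, per the text above.
const DEFAULT_RUN_DIR = ".tmp/wave-benchmarks/latest";

function expectedArtifacts(command: BenchmarkCommand, outputDir: string = DEFAULT_RUN_DIR): string[] {
  const names = ["results.json", "results.md"];
  if (command === "external-run") {
    // External runs add the failure review pair.
    names.push("failure-review.json", "failure-review.md");
  }
  const dir = outputDir.replace(/\/+$/, ""); // drop trailing slashes
  return names.map((name) => `${dir}/${name}`);
}
```

For `external-run` the caller should pass the selected output directory explicitly, since the default shown here applies only to `wave benchmark run`.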