npm - project-tiny-context-harness - Versions diffs - 0.2.75 → 0.2.77 - Mend

project-tiny-context-harness 0.2.75 → 0.2.77

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (29) hide show

package/README.md +14 -10
package/assets/README.md +15 -11
package/assets/README.zh-CN.md +8 -6
package/assets/skills/context_development_engineer/SKILL.md +44 -42
package/assets/skills/superpowers-long-task/SKILL.md +135 -73
package/dist/commands/index.js +16 -9
package/dist/commands/superpowers.d.ts +1 -0
package/dist/commands/superpowers.js +89 -0
package/dist/lib/plan-acceptance-validator.js +15 -0
package/dist/lib/superpowers-task-compile.d.ts +3 -0
package/dist/lib/superpowers-task-compile.js +223 -0
package/dist/lib/superpowers-task-delivery.d.ts +4 -0
package/dist/lib/superpowers-task-delivery.js +84 -0
package/dist/lib/superpowers-task-derive.d.ts +13 -0
package/dist/lib/superpowers-task-derive.js +204 -0
package/dist/lib/superpowers-task-events.d.ts +1 -0
package/dist/lib/superpowers-task-events.js +13 -0
package/dist/lib/superpowers-task-gates.d.ts +12 -0
package/dist/lib/superpowers-task-gates.js +47 -0
package/dist/lib/superpowers-task-next-slices.d.ts +1 -0
package/dist/lib/superpowers-task-next-slices.js +12 -0
package/dist/lib/superpowers-task-state-schema.d.ts +202 -0
package/dist/lib/superpowers-task-state-schema.js +44 -0
package/dist/lib/superpowers-task-state.d.ts +16 -0
package/dist/lib/superpowers-task-state.js +241 -0
package/dist/lib/superpowers-task-validator.d.ts +4 -0
package/dist/lib/superpowers-task-validator.js +252 -0
package/dist/lib/validators.js +4 -2
package/package.json +1 -1

package/README.md CHANGED Viewed

@@ -86,7 +86,7 @@ Web GPT or another external planning model produces the long-task source inputs
 -> /normal-long-task produces the full checklist and optional generic target-mode prompt
 -> /superpowers-long-task consumes Product / Architecture Source + Technical Realization Plan + Acceptance Checklist when Superpowers execution is needed
 -> Superpowers derives concrete implementation slices
--> execution maintains a plan-conformance matrix and final acceptance verdict
+-> execution maintains task-state.json, append-only events.ndjson and generated derived views
 -> each slice follows the workflow contract + project_context/**
 ```
@@ -94,15 +94,17 @@ For ordinary target-mode preparation, a two-document upstream input remains enou
 The ordinary long-task path uses `/normal-long-task`. It is the non-Superpowers acceptance pass: it can generate or reuse the full acceptance checklist and can produce a generic target-mode prompt.
-The Superpowers long-task path uses `/superpowers-long-task` when three inputs already exist: `Product / Architecture Source`, `Technical Realization Plan` and `Acceptance Checklist`. The product/architecture source preserves original intent and scope; the technical realization plan is the execution blueprint and plan-conformance source; the checklist is the acceptance authority. Two-document compatibility is allowed only when the first document clearly contains both product/architecture source and technical realization plan sections. If only a product/architecture source and checklist exist, the Skill stops for a missing `Technical Realization Plan` instead of generating one. The technical realization plan must already satisfy the required Superpowers-ready Markdown implementation plan fields; if it does, the prompt binds it directly to Superpowers execution rather than regenerating the plan, and if it does not, the Skill stops before generating a prompt. The prompt is Tiny Context's adapter layer, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with Tiny Context authority, conformance and acceptance gates, but it must not redefine or fork Superpowers execution mechanics. It requires `Product Context Delta` and `Technical Context Delta` checks before implementation, a `plan-conformance-matrix.*` process trace for implementation drift and a final AC-by-AC `final-acceptance-verdict.*` before completion. Complete acceptance rows are treated as externally reviewable evidence claims: the checklist supplies the proof chain, fresh evidence must satisfy every required layer, and material drift, missing layers or unapproved sibling substitution prevent `complete`. Goal-mode wording separates `audit_task_complete`, `acceptance_target_status` and `product_goal_complete`: implementation / execution goals complete only at `product_goal_complete=true`; read-only audit goals may end at `audit_task_complete`, but a non-accepted verdict says `Audit workflow completed; acceptance target not complete.` and does not use unqualified `Goal achieved` or `update_goal(status="complete")` as acceptance of the user target.
+The Superpowers long-task path uses `/superpowers-long-task` when three inputs already exist: `Product / Architecture Source`, `Technical Realization Plan` and `Acceptance Checklist`. The product/architecture source preserves original intent and scope; the technical realization plan is the execution blueprint and plan-conformance source; the checklist is the acceptance authority. The Skill does not perform complexity routing: invocation means Superpowers long-task execution was already selected. Two-document compatibility is allowed only when the first document clearly contains both product/architecture source and technical realization plan sections. If only a product/architecture source and checklist exist, the Skill stops with a Missing Fields Report for a missing `Technical Realization Plan` instead of generating one. The technical realization plan must already satisfy the required Superpowers-ready Markdown implementation plan fields; if it does, the prompt binds it directly to Superpowers execution rather than regenerating the plan, and if it does not, the Skill stops before generating a prompt. The prompt is Tiny Context's adapter layer, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with Tiny Context authority, conformance and acceptance gates, but it must not redefine or fork Superpowers execution mechanics. It requires parent-level `Product Context Delta` and `Technical Context Delta` checks before implementation and uses a canonical state kernel under `tmp/ty-context/plan-acceptance/<plan-slug>/`: `task-state.json` is the only execution state source, `events.ndjson` is append-only and `derived/**` contains generated local audit, plan-conformance matrix, final acceptance verdict, progress ledger, evidence index, context alignment and final summary views. Complete acceptance rows are externally reviewable evidence claims derived from `task-state.evidence[]`: the checklist supplies the proof chain, fresh reviewable evidence must satisfy every required layer, and material drift, missing layers or unapproved sibling substitution prevent `complete`. Goal-mode wording separates `audit_task_complete`, `acceptance_target_status` and computed `product_goal_complete`: implementation / execution goals complete only when `ty-context superpowers final-gate` computes `product_goal_complete=true`; read-only audit goals may end at `audit_task_complete`, but a non-accepted verdict says `Audit workflow completed; acceptance target not complete.` and does not use unqualified `Goal achieved` or `update_goal(status="complete")` as acceptance of the user target.
-For non-trivial Superpowers slices, the generated prompt recommends an optional evidence manifest at `tmp/ty-context/plan-acceptance/<plan-slug>-evidence-manifest.md/json`. The evidence manifest is not a fourth input, not durable Context, not proof by itself and not required by `validate-plan-acceptance`; it is a short per-slice source for synchronizing matrix, verdict and audit updates. Default slice guidance is to group 2-4 strongly related missing layers that share an AC, runtime scenario, proof environment or verification path, while single-gap slices are reserved for blockers, contradictions or small metadata cleanup. The prompt also asks executors to classify missing layers, reuse DB/API/Browser environments only with unique proof prefixes and cleanup assertions, and run a stale/overclaim scan after syncing artifacts.
+The three inputs also carry capability-first delivery boundaries. Product / Architecture Source declares `delivery_scope`, `full_population_required`, samples that validate the claim, samples that do not validate it and out-of-scope backlog. Each Technical Realization Plan item declares delivery scope, capability target, representative samples, full-population boundary and non-required population. Each Acceptance Checklist item declares acceptance scope, what it validates and does not validate, sample boundary and full-population requirement. `scope_conflict_requires_decision` blocks completion when source, plan and checklist disagree between system capability build, representative sample validation and full-population operation. Sample evidence or framework-only implementation cannot prove all-provider, all-interface, all-platform or full-population completion unless the AC explicitly allows it; when full population is not explicitly required, generated views report it as `not_in_scope`.
-The generated Superpowers prompt uses Slice Gate / Epoch Gate / Final Gate cadence instead of running a full final gate after every slice. Progress Accounting tracks four distinct metrics: AC acceptance completion, engineering implementation progress, runtime/proof progress and workflow overhead. Longer runs may keep a temporary progress ledger under `tmp/ty-context/plan-acceptance/**`; each slice declares an artifact budget, proof-layer milestone status and cleanup expectation. Workflow overhead backpressure asks executors to batch shared provider/browser/runtime/security epoch proof environments, prune stale artifacts and choose the Next 3-5 high-value clusters that close the most blocking AC/proof-layer gaps.
+For non-trivial Superpowers slices, the generated prompt requires a structured `slice-delta.json`. The executor applies it with `ty-context superpowers apply-slice-delta <workdir> <slice-delta.json>`, then runs `ty-context superpowers derive` and `ty-context superpowers slice-gate`. Each delta records touched plan items/ACs, code changes, closed and remaining proof layers, blockers, cleanup assertions, `progress_value` and canonical evidence records with `proves`, `does_not_prove`, freshness, redaction and reviewability. Default slice guidance is to group 2-4 strongly related missing layers that share an AC, runtime scenario, proof environment or verification path, while single-gap slices are reserved for blockers, contradictions or small metadata cleanup. The prompt also asks executors to classify missing layers, reuse DB/API/Browser environments only with unique proof prefixes and cleanup assertions, and run a stale/overclaim scan after deriving artifacts.
-The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. After `/superpowers-long-task` accepts the input packet, prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`, and completion claims should use `superpowers:verification-before-completion` against both the plan-conformance matrix and final acceptance verdict, followed by `ty-context validate-plan-acceptance <dir>`. When subagents are available, the target prompt asks for a read-only auditor pass after self-evidence and validator checks; the auditor finds gaps with a fixed auditor checklist but does not become proof. Superpowers review and verification remain useful execution checks, but they cannot override Tiny Context gates: passing Superpowers review does not by itself prove plan conformance or checklist acceptance.
+The generated Superpowers prompt uses Slice Gate / Epoch Gate / Final Gate cadence instead of running a full final gate after every slice. Progress Accounting tracks AC acceptance completion, engineering implementation progress, runtime/proof progress, system capability progress, representative sample progress, real object coverage, full population operation progress, artifact budget, proof-layer milestone status and workflow overhead in state and generated `derived/progress-ledger.*`. Workflow overhead backpressure asks executors to batch shared provider/browser/runtime/security epoch proof environments, prune stale artifacts and choose the Next 3-5 high-value clusters that close the most blocking AC/proof-layer gaps.
-The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. The extra Tiny Context gates exist because Superpowers alone can still drift under long-running execution pressure: it strengthens execution discipline, but it does not by itself preserve source authority, prevent scope shrinkage, prove full conformance to the Technical Realization Plan or enforce AC-by-AC evidence against the Acceptance Checklist. A product/architecture source, technical realization plan, acceptance checklist, explicit long-task Skill invocation, target-mode prompt, plan-conformance matrix, final acceptance verdict and optional Superpowers execution layer make implementation conformance and completion evidence recoverable without restoring a phase-gated workflow.
+The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. After `/superpowers-long-task` accepts the input packet, prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`. Final gate order is derive all views, `superpowers:verification-before-completion`, `ty-context validate-superpowers-state <dir>`, `ty-context validate-plan-acceptance <dir>`, read-only auditor when available, rederive/revalidate if auditor fixes changed state or evidence, final stale/overclaim scan, then `ty-context superpowers final-gate <dir>` computes completion. The auditor reconstructs AC proof chains with a fixed auditor checklist and finds gaps, but does not become proof. Superpowers review and verification remain useful execution checks, but they cannot override Tiny Context gates: passing Superpowers review does not by itself prove plan conformance or checklist acceptance.
+The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. The extra Tiny Context gates exist because Superpowers alone can still drift under long-running execution pressure: it strengthens execution discipline, but it does not by itself preserve source authority, prevent scope shrinkage, prove full conformance to the Technical Realization Plan or enforce AC-by-AC evidence against the Acceptance Checklist. A product/architecture source, technical realization plan, acceptance checklist, explicit long-task Skill invocation, target-mode prompt, canonical task state, generated derived views and optional Superpowers execution layer make implementation conformance and completion evidence recoverable without restoring a phase-gated workflow.
 For high-risk product, architecture, technical-plan or acceptance-plan inputs, the workflow contract should be made visible in `plan.md` or an equivalent temporary plan surface before implementation. That plan surface separates Source-to-Context Coverage from Context-to-Implementation Binding. Source-to-Context maps each durable source constraint to an existing Context hit, a required Context update, a task-local-only decision, an explicit out-of-scope decision, a user decision or an under-scoped gap. Context-to-Implementation then maps Context facts to implementation obligations, expected surfaces, implemented paths, forbidden shortcuts and verification paths. `validate-plan-contract` can check the temporary plan for internal consistency, referenced path existence and declared binding consistency; it still does not prove product quality.
@@ -157,7 +159,7 @@ npm ci
 npm run smoke:quickstart
 npm run preview:pack
 cd /path/to/your/test-repo
-npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.75.tgz
+npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.77.tgz
 npx --no-install ty-context init --adopt
 make validate-context
 ```
@@ -272,7 +274,7 @@ Use `npx --no-install ty-context ...` only when you explicitly want the already
 | Full project context export Skill | `<harnessRoot>/skills/context_full_project_export/SKILL.md` | Handles explicit full-project, project-overall, Source Pack or code-level export requests and uses `export-context --source-pack`, `--code-index`, `--task-context`, `--all`, `--full` or `--code` to create temporary artifacts under `tmp/ty-context/context-exports/**`. |
 | Harness upgrade Skill | `<harnessRoot>/skills/context_harness_upgrade/SKILL.md` | Handles explicit Tiny Context / Project Tiny Context Harness upgrade requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project”; it runs the canonical `upgrade` path, handles only migration-scoped `manual_required` / `blocked` follow-up, then runs diagnostics. |
 | Ordinary long-task Skill | `<harnessRoot>/skills/normal-long-task/SKILL.md` | Invoke as `/normal-long-task` to turn a referenced plan, RFC, implementation proposal or two-document upstream input into a falsifiable acceptance checklist and optional generic paste-ready goal/target-mode prompt under `tmp/ty-context/plan-acceptance/**`; if the plan already contains an explicit concrete checklist, the Skill reuses it verbatim in the separate full-checklist file; compact summaries are only navigation/priority, but the Skill does not execute the plan or prove completion. |
-| Superpowers long-task Skill | `<harnessRoot>/skills/superpowers-long-task/SKILL.md` | Invoke as `/superpowers-long-task` when Product / Architecture Source, Technical Realization Plan and Acceptance Checklist exist and Superpowers execution is needed. It emits a Superpowers-specific prompt with Context Delta checks and the official workflow skill names, directly binds a Superpowers-ready external implementation plan when supplied, requires a plan-conformance matrix, final acceptance verdict and externally reviewable evidence discipline during execution, and stops when required input fields are missing. It does not generate the technical plan, checklist or execute the plan. |
+| Superpowers long-task Skill | `<harnessRoot>/skills/superpowers-long-task/SKILL.md` | Invoke as `/superpowers-long-task` when Product / Architecture Source, Technical Realization Plan and Acceptance Checklist exist and Superpowers execution is needed. It emits a Superpowers-specific prompt with Context Delta checks, official workflow skill names, capability-first delivery scope fields, plan-conformance matrix, final acceptance verdict and externally reviewable evidence discipline, and stops when required input fields are missing. It does not generate the technical plan, checklist or execute the plan. |
 | Project-local Skills | `<harnessRoot>/skills/<role>/SKILL.md` | Optional local product/design/development Skills created by the project, such as `product_plan`, `uiux_design` or `development_engineer`. They supersede package-managed default Skills when more specific, are not overwritten by `sync`, and should keep front matter trigger keywords aligned with the project `AGENTS.md` role-trigger rule. |
 | Managed file sync | `make ty-context-sync` or `npx --yes --package project-tiny-context-harness@latest ty-context sync` | Refreshes package-managed guidance, default Skills, Makefile include, context templates, tools and workflow YAML. It does not run migrations or perform semantic Context generation; it may block only direct asset-refresh safety issues such as invalid managed blocks or deprecated managed Skill overrides. |
 | Upgrade | `make ty-context-upgrade` or `npx --yes --package project-tiny-context-harness@latest ty-context upgrade` | Use for releases marked `upgrade-required` or `manual-required`. Builds an upgrade plan, stops before writes when `blocked` items exist, otherwise applies `safe_pending` migrations, runs `sync` and `doctor`, and exits non-zero when manual follow-up or diagnostics remain. |
@@ -288,7 +290,9 @@ Use `npx --no-install ty-context ...` only when you explicitly want the already
 | Harness validation | `make validate-harness` | Composite gate for `validate-context` and `validate-code-modularity`. |
 | Context validation | `npx --yes --package project-tiny-context-harness@latest ty-context validate-context`, `make validate-context` | Checks required project recovery fields, Context graph metadata, declared paths/roles and fake test-execution claims. |
 | Plan contract validation | `npx --yes --package project-tiny-context-harness@latest ty-context validate-plan-contract <plan.md\|dir>` | Checks Source-to-Context Coverage and Context-to-Implementation Binding for structural consistency, referenced path existence and weak-proof complete/bound contradictions. |
-| Plan acceptance validation | `npx --yes --package project-tiny-context-harness@latest ty-context validate-plan-acceptance <dir>` | Checks long-task plan-conformance matrix and final verdict JSON for contradictory complete claims, dangling evidence references, weak-proof complete rows, missing proof layers, material/critical drift, unapproved sibling substitution, blocking auditor findings, raw secrets/tokens/cookies, optional manifest reference drift, generated active-count drift, missing plan/AC cross-references and declared surface/architecture binding gaps. `errors` block; `warnings` / `hygiene` report cleanup. |
+| Superpowers state validation | `npx --yes --package project-tiny-context-harness@latest ty-context validate-superpowers-state <dir>` | Checks canonical Superpowers `task-state.json`, source hashes, graph references, delivery scope fields/conflicts, evidence/proof-layer consistency, stale evidence, sibling substitution, auditor blockers, derived drift and final completion rules. |
+| Plan acceptance validation | `npx --yes --package project-tiny-context-harness@latest ty-context validate-plan-acceptance <dir>` | Checks legacy matrix/verdict artifacts when no state exists; when `task-state.json` exists, validates state-backed derived artifacts. It rejects contradictory complete claims, dangling evidence references, weak-proof complete rows, missing proof layers, material/critical drift, unapproved sibling substitution, blocking auditor findings, raw secrets/tokens/cookies, generated active-count drift, missing plan/AC cross-references and declared surface/architecture binding gaps. `errors` block; `warnings` / `hygiene` report cleanup. |
+| Superpowers state helpers | `npx --yes --package project-tiny-context-harness@latest ty-context superpowers <subcommand>` | Explicit `/superpowers-long-task` state helper for `init`, `compile`, `apply-slice-delta`, `derive`, `slice-gate`, `epoch-gate`, `final-gate` and `next-slices` under `tmp/ty-context/plan-acceptance/**`. |
 | Diagnostics | `make ty-context-doctor` or `npx --yes --package project-tiny-context-harness@latest ty-context doctor` | Reports Harness root, package version, schema version and required Minimal Context paths. |
 | Package source checks | `ty-context package sync-source`, `ty-context package check-source` | Maintainer-only commands for keeping package canonical assets aligned with the source workspace. |
@@ -298,7 +302,7 @@ Technical architecture support is a Minimal Context capability: use restrained `
 For long-running plans, RFCs or implementation proposals, invoke `/normal-long-task` to turn a plan plus relevant Context into a falsifiable acceptance checklist and an optional generic paste-ready goal/target-mode prompt. It also supports a two-document upstream input from Web GPT or another external planner: `Development Plan` for execution direction and `Acceptance and Tests` for target-mode acceptance input. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. The two-document packet path is strict mode: when required fields cannot be fully parsed from both documents, the Skill preserves the inputs, reports the missing fields, and stops without generating a checklist or goal/target-mode prompt. It is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. When the prompt references a full checklist, that checklist is the acceptance authority; compact prompt text is only navigation, priority and recovery guidance.
-When the next step explicitly needs Superpowers, invoke `/superpowers-long-task` on the Product / Architecture Source, Technical Realization Plan and Acceptance Checklist. It emits the `Superpowers input packet` and execution binding so the future executor sees which inputs feed Context Delta assessment, `superpowers:subagent-driven-development`, `superpowers:executing-plans`, TDD, `superpowers:verification-before-completion`, local audit, optional evidence manifest, `plan-conformance-matrix.*`, `final-acceptance-verdict.*`, proof-chain evidence and optional auditor review. This is Tiny Context's adapter layer for Superpowers workflows, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with authority, conformance and acceptance gates, but it must not redefine, duplicate or fork Superpowers execution mechanics; if a future Tiny Context-added step would conflict with, duplicate or override a Superpowers responsibility, stop and surface the boundary conflict instead of silently merging workflows. It cannot replace `/normal-long-task` for ordinary checklist preparation, and it does not derive a technical plan from a product plan; the Technical Realization Plan must already be a Superpowers-ready Markdown implementation plan or the Skill stops before generating a prompt. A two-document packet is accepted only when the first document explicitly contains both product/architecture source and technical realization plan sections. `validate-plan-acceptance` is still a structural validator, not a product-quality proof; the evidence manifest is a slice-level synchronization aid rather than proof or validator input; a subagent auditor is an extra gap-finding pass on top of executor self-evidence and validator checks, not a replacement for either. The generated prompt also disambiguates `audit_task_complete`, `acceptance_target_status` and `product_goal_complete`; implementation / execution goals finish only at `product_goal_complete=true`, while a read-only audit goal can end at `audit_task_complete` only with a non-accepted verdict reported as `Audit workflow completed; acceptance target not complete.`, not as `Goal achieved`.
+When the next step explicitly needs Superpowers, invoke `/superpowers-long-task` on the Product / Architecture Source, Technical Realization Plan and Acceptance Checklist. It emits the `Superpowers input packet` and execution binding so the future executor sees which inputs feed Context Delta assessment, `superpowers:subagent-driven-development`, `superpowers:executing-plans`, TDD, `superpowers:verification-before-completion`, canonical `task-state.json`, append-only `events.ndjson`, generated `derived/**` views, proof-chain evidence and optional auditor review. This is Tiny Context's adapter layer for Superpowers workflows, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with authority, conformance and acceptance gates, but it must not redefine, duplicate or fork Superpowers execution mechanics; if a future Tiny Context-added step would conflict with, duplicate or override a Superpowers responsibility, stop and surface the boundary conflict instead of silently merging workflows. It cannot replace `/normal-long-task` for ordinary checklist preparation, does not route complexity, and does not derive a technical plan from a product plan; the Technical Realization Plan must already be a Superpowers-ready Markdown implementation plan or the Skill stops before generating a prompt. A two-document packet is accepted only when the first document explicitly contains both product/architecture source and technical realization plan sections. Product / Architecture Source, Technical Realization Plan and Acceptance Checklist remain the upstream authorities, while state/derived views/validator/auditor artifacts cannot rewrite them. Capability-first delivery scope stays inside those same three inputs: source, plan items and ACs must explicitly distinguish reusable system capability build, representative sample validation, full population operation and out-of-scope backlog; `scope_conflict_requires_decision` blocks completion, and sample/framework evidence cannot prove full population unless the AC says so. The generated prompt also disambiguates `audit_task_complete`, `acceptance_target_status` and computed `product_goal_complete`; implementation / execution goals finish only when `product_goal_complete=true`, while a read-only audit goal can end at `audit_task_complete` only with a non-accepted verdict reported as `Audit workflow completed; acceptance target not complete.`, not as `Goal achieved`.
 For Product Surface work, `context_surface_contract` turns broad product/page/UI principles into project-owned surface responsibilities. A Product Surface can be a Web page, mobile screen, desktop window, game UI/HUD/menu, CLI/TUI output, extension UI or embedded/device interface. Cross-surface contracts use the existing `contract` role; area-owned screen facts stay in `area` or `subdomain`; repeatable validation paths use `verification`. The Harness does not add a new surface-specific role or create business surface contracts during `init` or `upgrade`. Product Surface Context authoring is not a default product-quality validator; plan validators only check declared temporary surface bindings for structural consistency. Projects that want mandatory task blocks should add a separate project-local Skill, while `product-surface-contract.md` is only a compact managed template for optional Context authoring.

package/assets/README.md CHANGED Viewed

@@ -94,7 +94,7 @@ That smoke packs the local workspace, installs it into a disposable repo, runs `
 ```sh
 npm run preview:pack
 cd /path/to/your/test-repo
-npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.75.tgz
+npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.77.tgz
 npx --no-install ty-context init --adopt
 make validate-context
 ```
@@ -130,7 +130,7 @@ Web GPT or another external planning model produces the long-task source inputs
 -> /normal-long-task produces the full checklist and optional generic target-mode prompt
 -> /superpowers-long-task consumes Product / Architecture Source + Technical Realization Plan + Acceptance Checklist when Superpowers execution is needed
 -> Superpowers derives concrete implementation slices
--> execution maintains a plan-conformance matrix and final acceptance verdict
+-> execution maintains task-state.json, append-only events.ndjson and generated derived views
 -> each slice follows the workflow contract + project_context/**
 ```
@@ -138,15 +138,17 @@ For ordinary target-mode preparation, a two-document upstream input remains enou
 The ordinary long-task path uses `/normal-long-task`. It is the non-Superpowers acceptance pass: it can generate or reuse the full acceptance checklist and can produce a generic target-mode prompt.
-The Superpowers long-task path uses `/superpowers-long-task` when three inputs already exist: `Product / Architecture Source`, `Technical Realization Plan` and `Acceptance Checklist`. The product/architecture source preserves original intent and scope; the technical realization plan is the execution blueprint and plan-conformance source; the checklist is the acceptance authority. The Skill does not perform complexity routing: invocation means Superpowers long-task execution was already selected. Two-document compatibility is allowed only when the first document clearly contains both product/architecture source and technical realization plan sections. If only a product/architecture source and checklist exist, the Skill stops with a Missing Fields Report for a missing `Technical Realization Plan` instead of generating one. The technical realization plan must already satisfy the required Superpowers-ready Markdown implementation plan fields; if it does, the prompt binds it directly to Superpowers execution rather than regenerating the plan, and if it does not, the Skill stops before generating a prompt. The prompt is Tiny Context's adapter layer, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with Tiny Context authority, conformance and acceptance gates, but it must not redefine or fork Superpowers execution mechanics. It requires parent-level `Product Context Delta` and `Technical Context Delta` checks before implementation, slice-level new durable fact checks, a `plan-conformance-matrix.*` process trace for implementation drift and a final AC-by-AC `final-acceptance-verdict.*` before completion. Complete acceptance rows are treated as externally reviewable evidence claims: the checklist supplies the proof chain, fresh evidence must satisfy every required layer, and material drift, missing layers or unapproved sibling substitution prevent `complete`. An Evidence Ledger / proof index is optional, but complete rows must trace to fresh evidence directly or through an optional `evidence_id`. Goal-mode wording separates `audit_task_complete`, `acceptance_target_status` and `product_goal_complete`: implementation / execution goals complete only at `product_goal_complete=true`; read-only audit goals may end at `audit_task_complete`, but a non-accepted verdict says `Audit workflow completed; acceptance target not complete.` and does not use unqualified `Goal achieved` or `update_goal(status="complete")` as acceptance of the user target.
+The Superpowers long-task path uses `/superpowers-long-task` when three inputs already exist: `Product / Architecture Source`, `Technical Realization Plan` and `Acceptance Checklist`. The product/architecture source preserves original intent and scope; the technical realization plan is the execution blueprint and plan-conformance source; the checklist is the acceptance authority. The Skill does not perform complexity routing: invocation means Superpowers long-task execution was already selected. Two-document compatibility is allowed only when the first document clearly contains both product/architecture source and technical realization plan sections. If only a product/architecture source and checklist exist, the Skill stops with a Missing Fields Report for a missing `Technical Realization Plan` instead of generating one. The technical realization plan must already satisfy the required Superpowers-ready Markdown implementation plan fields; if it does, the prompt binds it directly to Superpowers execution rather than regenerating the plan, and if it does not, the Skill stops before generating a prompt. The prompt is Tiny Context's adapter layer, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with Tiny Context authority, conformance and acceptance gates, but it must not redefine or fork Superpowers execution mechanics. It requires parent-level `Product Context Delta` and `Technical Context Delta` checks before implementation and uses a canonical state kernel under `tmp/ty-context/plan-acceptance/<plan-slug>/`: `task-state.json` is the only execution state source, `events.ndjson` is append-only and `derived/**` contains generated local audit, plan-conformance matrix, final acceptance verdict, progress ledger, evidence index, context alignment and final summary views. Complete acceptance rows are externally reviewable evidence claims derived from `task-state.evidence[]`: the checklist supplies the proof chain, fresh reviewable evidence must satisfy every required layer, and material drift, missing layers or unapproved sibling substitution prevent `complete`. Goal-mode wording separates `audit_task_complete`, `acceptance_target_status` and computed `product_goal_complete`: implementation / execution goals complete only when `ty-context superpowers final-gate` computes `product_goal_complete=true`; read-only audit goals may end at `audit_task_complete`, but a non-accepted verdict says `Audit workflow completed; acceptance target not complete.` and does not use unqualified `Goal achieved` or `update_goal(status="complete")` as acceptance of the user target.
-For non-trivial Superpowers slices, the generated prompt now recommends an optional evidence manifest at `tmp/ty-context/plan-acceptance/<plan-slug>-evidence-manifest.md/json`. The evidence manifest is not a fourth input, not durable Context, not proof by itself and not required by `validate-plan-acceptance`; it is a short per-slice source for synchronizing matrix, verdict and audit updates. Default slice guidance is to group 2-4 strongly related missing layers that share an AC, runtime scenario, proof environment or verification path, while single-gap slices are reserved for blockers, contradictions or small metadata cleanup. The prompt also asks executors to classify missing layers, reuse DB/API/Browser environments only with unique proof prefixes and cleanup assertions, and run a stale/overclaim scan after syncing artifacts.
+The three inputs also carry capability-first delivery boundaries. Product / Architecture Source declares `delivery_scope`, `full_population_required`, samples that validate the claim, samples that do not validate it and out-of-scope backlog. Each Technical Realization Plan item declares delivery scope, capability target, representative samples, full-population boundary and non-required population. Each Acceptance Checklist item declares acceptance scope, what it validates and does not validate, sample boundary and full-population requirement. `scope_conflict_requires_decision` blocks completion when source, plan and checklist disagree between system capability build, representative sample validation and full-population operation. Sample evidence or framework-only implementation cannot prove all-provider, all-interface, all-platform or full-population completion unless the AC explicitly allows it; when full population is not explicitly required, generated views report it as `not_in_scope`.
-The generated Superpowers prompt uses Slice Gate / Epoch Gate / Final Gate cadence instead of running a full final gate after every slice. Progress Accounting tracks four distinct metrics: AC acceptance completion, engineering implementation progress, runtime/proof progress and workflow overhead. Longer runs may keep a temporary progress ledger under `tmp/ty-context/plan-acceptance/**`; each slice declares an artifact budget, proof-layer milestone status and cleanup expectation. Workflow overhead backpressure asks executors to batch shared provider/browser/runtime/security epoch proof environments, prune stale artifacts and choose the Next 3-5 high-value clusters that close the most blocking AC/proof-layer gaps.
+For non-trivial Superpowers slices, the generated prompt requires a structured `slice-delta.json`. The executor applies it with `ty-context superpowers apply-slice-delta <workdir> <slice-delta.json>`, then runs `ty-context superpowers derive` and `ty-context superpowers slice-gate`. Each delta records touched plan items/ACs, code changes, closed and remaining proof layers, blockers, cleanup assertions, `progress_value` and canonical evidence records with `proves`, `does_not_prove`, freshness, redaction and reviewability. Default slice guidance is to group 2-4 strongly related missing layers that share an AC, runtime scenario, proof environment or verification path, while single-gap slices are reserved for blockers, contradictions or small metadata cleanup. The prompt also asks executors to classify missing layers, reuse DB/API/Browser environments only with unique proof prefixes and cleanup assertions, and run a stale/overclaim scan after deriving artifacts.
-The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. After `/superpowers-long-task` accepts the input packet, prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`. Final gate order is evidence manifest update when used, matrix/verdict/audit sync, `superpowers:verification-before-completion`, `ty-context validate-plan-acceptance <dir>`, read-only auditor when available, validator rerun if auditor fixes changed artifacts, final stale/overclaim scan, then completion only when no blocking conflict remains. The auditor reconstructs AC proof chains with a fixed auditor checklist and finds gaps, but does not become proof. Superpowers review and verification remain useful execution checks, but they cannot override Tiny Context gates: passing Superpowers review does not by itself prove plan conformance or checklist acceptance.
+The generated Superpowers prompt uses Slice Gate / Epoch Gate / Final Gate cadence instead of running a full final gate after every slice. Progress Accounting tracks AC acceptance completion, engineering implementation progress, runtime/proof progress, system capability progress, representative sample progress, real object coverage, full population operation progress, artifact budget, proof-layer milestone status and workflow overhead in state and generated `derived/progress-ledger.*`. Workflow overhead backpressure asks executors to batch shared provider/browser/runtime/security epoch proof environments, prune stale artifacts and choose the Next 3-5 high-value clusters that close the most blocking AC/proof-layer gaps.
-The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. The extra Tiny Context gates exist because Superpowers alone can still drift under long-running execution pressure: it strengthens execution discipline, but it does not by itself preserve source authority, prevent scope shrinkage, prove full conformance to the Technical Realization Plan or enforce AC-by-AC evidence against the Acceptance Checklist. A product/architecture source, technical realization plan, acceptance checklist, explicit long-task Skill invocation, target-mode prompt, plan-conformance matrix, final acceptance verdict and optional Superpowers execution layer make implementation conformance and completion evidence recoverable without restoring a phase-gated workflow.
+The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. After `/superpowers-long-task` accepts the input packet, prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`. Final gate order is derive all views, `superpowers:verification-before-completion`, `ty-context validate-superpowers-state <dir>`, `ty-context validate-plan-acceptance <dir>`, read-only auditor when available, rederive/revalidate if auditor fixes changed state or evidence, final stale/overclaim scan, then `ty-context superpowers final-gate <dir>` computes completion. The auditor reconstructs AC proof chains with a fixed auditor checklist and finds gaps, but does not become proof. Superpowers review and verification remain useful execution checks, but they cannot override Tiny Context gates: passing Superpowers review does not by itself prove plan conformance or checklist acceptance.
+The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. The extra Tiny Context gates exist because Superpowers alone can still drift under long-running execution pressure: it strengthens execution discipline, but it does not by itself preserve source authority, prevent scope shrinkage, prove full conformance to the Technical Realization Plan or enforce AC-by-AC evidence against the Acceptance Checklist. A product/architecture source, technical realization plan, acceptance checklist, explicit long-task Skill invocation, target-mode prompt, canonical task state, generated derived views and optional Superpowers execution layer make implementation conformance and completion evidence recoverable without restoring a phase-gated workflow.
 For high-risk product, architecture, technical-plan or acceptance-plan inputs, the workflow contract should be made visible in `plan.md` or an equivalent temporary plan surface before implementation. That plan surface separates Source-to-Context Coverage from Context-to-Implementation Binding. Source-to-Context maps each durable source constraint to an existing Context hit, a required Context update, a task-local-only decision, an explicit out-of-scope decision, a user decision or an under-scoped gap. Context-to-Implementation then maps Context facts to implementation obligations, expected surfaces, implemented paths, forbidden shortcuts and verification paths. `validate-plan-contract` can check the temporary plan for internal consistency, referenced path existence and declared binding consistency; it still does not prove product quality.
@@ -309,7 +311,7 @@ No. It checks that recovery facts exist and avoids fake test-result claims. Prod
 It should stay smaller than a full process. Ordinary bug fixes and local refactors do not update Context unless they produce durable product, architecture, API, state or validation facts.
-The default Skills are Minimal Context helpers for explicit product-planning, UI/UX-design, development-engineering, Product Surface Contract, full-project-export, Tiny Context upgrade and explicit long-task requests. Product, screen-flow, surface responsibility and durable engineering conclusions go to `project_context/**`; visual identity and design tokens go to root `DESIGN.md`. Export artifacts are temporary files under `tmp/ty-context/context-exports/**`, not Context. Long-task artifacts are temporary files under `tmp/ty-context/plan-acceptance/**`; they define completion criteria or execution evidence for a referenced plan but do not execute it or become durable Context. The ordinary long-task Skill is invoked as `/normal-long-task`: if the plan already contains an explicit concrete checklist, it reuses that checklist verbatim in the separate full-checklist file. For a two-document upstream input, the external planner should provide a `Development Plan` and an `Acceptance and Tests` packet; `/normal-long-task` preserves both source roles and, only when strict-mode required fields are fully parseable from both documents, turns them into the full checklist plus optional generic target prompt. When a generated prompt references a full checklist, that checklist is the authoritative acceptance standard; the compact prompt summary is only navigation and priority guidance. The Superpowers long-task Skill is invoked as `/superpowers-long-task`: it consumes `Product / Architecture Source`, `Technical Realization Plan` and `Acceptance Checklist`, emits the Superpowers-specific target-mode prompt, directly binds a Superpowers-ready external implementation plan when supplied, requires `Product Context Delta`, `Technical Context Delta`, `plan-conformance-matrix.*`, `final-acceptance-verdict.*` and externally reviewable proof-chain evidence during future execution, and stops if required fields are missing. The Harness upgrade Skill handles requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project” by following the release update mode, using `upgrade` for migration-bearing releases, and limiting manual cleanup to migration-scoped follow-up.
+The default Skills are Minimal Context helpers for explicit product-planning, UI/UX-design, development-engineering, Product Surface Contract, full-project-export, Tiny Context upgrade and explicit long-task requests. Product, screen-flow, surface responsibility and durable engineering conclusions go to `project_context/**`; visual identity and design tokens go to root `DESIGN.md`. Export artifacts are temporary files under `tmp/ty-context/context-exports/**`, not Context. Long-task artifacts are temporary files under `tmp/ty-context/plan-acceptance/**`; they define completion criteria or execution evidence for a referenced plan but do not execute it or become durable Context. The ordinary long-task Skill is invoked as `/normal-long-task`: if the plan already contains an explicit concrete checklist, it reuses that checklist verbatim in the separate full-checklist file. For a two-document upstream input, the external planner should provide a `Development Plan` and an `Acceptance and Tests` packet; `/normal-long-task` preserves both source roles and, only when strict-mode required fields are fully parseable from both documents, turns them into the full checklist plus optional generic target prompt. When a generated prompt references a full checklist, that checklist is the authoritative acceptance standard; the compact prompt summary is only navigation and priority guidance. The Superpowers long-task Skill is invoked as `/superpowers-long-task`: it consumes `Product / Architecture Source`, `Technical Realization Plan` and `Acceptance Checklist`, emits the Superpowers-specific target-mode prompt, directly binds a Superpowers-ready external implementation plan when supplied, requires capability-first delivery scope fields, `Product Context Delta`, `Technical Context Delta`, `plan-conformance-matrix.*`, `final-acceptance-verdict.*` and externally reviewable proof-chain evidence during future execution, and stops if required fields are missing. The Harness upgrade Skill handles requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project” by following the release update mode, using `upgrade` for migration-bearing releases, and limiting manual cleanup to migration-scoped follow-up.
 Multilingual trigger phrases are compatibility details. Public README, npm and launch copy stay English-first, and public/package-managed surfaces must remain English-complete; literal non-English examples are documented only where they explain generated Skill matching and must not be the sole activation path.
@@ -319,9 +321,9 @@ Technical architecture support is a Minimal Context capability: use restrained `
 For long-running plans, RFCs or implementation proposals, invoke `/normal-long-task` to turn a plan plus relevant Context into a falsifiable acceptance checklist and an optional generic paste-ready goal/target-mode prompt. It also supports a two-document upstream input from Web GPT or another external planner: `Development Plan` for execution direction and `Acceptance and Tests` for target-mode acceptance input. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. The two-document packet path is strict mode: when required fields cannot be fully parsed from both documents, the Skill preserves the inputs, reports the missing fields, and stops without generating a checklist or goal/target-mode prompt. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
-When the next step explicitly needs Superpowers, invoke `/superpowers-long-task` on the Product / Architecture Source, Technical Realization Plan and Acceptance Checklist. It emits the `Superpowers input packet` and execution binding so the future executor sees which inputs feed Context Delta assessment, `superpowers:subagent-driven-development`, `superpowers:executing-plans`, TDD, `superpowers:verification-before-completion`, local audit, `plan-conformance-matrix.*`, `final-acceptance-verdict.*`, proof-chain evidence and optional auditor review. This is Tiny Context's adapter layer for Superpowers workflows, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with authority, conformance and acceptance gates, but it must not redefine, duplicate or fork Superpowers execution mechanics; if a future Tiny Context-added step would conflict with, duplicate or override a Superpowers responsibility, stop and surface the boundary conflict instead of silently merging workflows. It cannot replace `/normal-long-task` for ordinary checklist preparation, does not route complexity, and does not derive a technical plan from a product plan; the Technical Realization Plan must already be a Superpowers-ready Markdown implementation plan or the Skill stops before generating a prompt. A two-document packet is accepted only when the first document explicitly contains both product/architecture source and technical realization plan sections. Product / Architecture Source, Technical Realization Plan and Acceptance Checklist remain the upstream authorities, while audit/matrix/verdict/validator/auditor artifacts cannot rewrite them. The generated prompt also disambiguates `audit_task_complete`, `acceptance_target_status` and `product_goal_complete`; implementation / execution goals finish only at `product_goal_complete=true`, while a read-only audit goal can end at `audit_task_complete` only with a non-accepted verdict reported as `Audit workflow completed; acceptance target not complete.`, not as `Goal achieved`.
+When the next step explicitly needs Superpowers, invoke `/superpowers-long-task` on the Product / Architecture Source, Technical Realization Plan and Acceptance Checklist. It emits the `Superpowers input packet` and execution binding so the future executor sees which inputs feed Context Delta assessment, `superpowers:subagent-driven-development`, `superpowers:executing-plans`, TDD, `superpowers:verification-before-completion`, canonical `task-state.json`, append-only `events.ndjson`, generated `derived/**` views, proof-chain evidence and optional auditor review. This is Tiny Context's adapter layer for Superpowers workflows, aligned to the official Superpowers skills while remaining a Tiny Context-owned adapter rather than an upstream-owned schema. It may wrap Superpowers with authority, conformance and acceptance gates, but it must not redefine, duplicate or fork Superpowers execution mechanics; if a future Tiny Context-added step would conflict with, duplicate or override a Superpowers responsibility, stop and surface the boundary conflict instead of silently merging workflows. It cannot replace `/normal-long-task` for ordinary checklist preparation, does not route complexity, and does not derive a technical plan from a product plan; the Technical Realization Plan must already be a Superpowers-ready Markdown implementation plan or the Skill stops before generating a prompt. A two-document packet is accepted only when the first document explicitly contains both product/architecture source and technical realization plan sections. Product / Architecture Source, Technical Realization Plan and Acceptance Checklist remain the upstream authorities, while state/derived views/validator/auditor artifacts cannot rewrite them. Capability-first delivery scope stays inside those same three inputs: source, plan items and ACs must explicitly distinguish reusable system capability build, representative sample validation, full population operation and out-of-scope backlog; `scope_conflict_requires_decision` blocks completion, and sample/framework evidence cannot prove full population unless the AC says so. The generated prompt also disambiguates `audit_task_complete`, `acceptance_target_status` and computed `product_goal_complete`; implementation / execution goals finish only when `product_goal_complete=true`, while a read-only audit goal can end at `audit_task_complete` only with a non-accepted verdict reported as `Audit workflow completed; acceptance target not complete.`, not as `Goal achieved`.
-Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff works well for short tasks, but long tasks with large context windows, multiple handoffs or many verification loops are expected to drift unless product intent, technical implementation target and acceptance target are externalized. Superpowers alone can still drift under this pressure: it strengthens execution discipline, but it does not by itself preserve source authority, prevent scope shrinkage, prove full conformance to the Technical Realization Plan or enforce AC-by-AC evidence against the Acceptance Checklist. Use `/normal-long-task` before long-running execution when ordinary checklist preparation is needed; use `/superpowers-long-task` when the three upstream inputs already exist and Superpowers execution is desired. Treat the optional evidence manifest as a slice-level synchronization aid, the plan-conformance matrix as process trace, the final verdict as final AC-by-AC acceptance evidence, and the local audit only as temporary progress/recovery state. `validate-plan-acceptance` is still an artifact-consistency validator, not a product-quality proof; a subagent auditor is an extra gap-finding pass on top of executor self-evidence and validator checks, not a replacement for either. Passing Superpowers review or verification does not bypass incomplete matrix/verdict rows, weak evidence, missing proof layers or blocking auditor findings.
+Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff works well for short tasks, but long tasks with large context windows, multiple handoffs or many verification loops are expected to drift unless product intent, technical implementation target and acceptance target are externalized. Superpowers alone can still drift under this pressure: it strengthens execution discipline, but it does not by itself preserve source authority, prevent scope shrinkage, prove full conformance to the Technical Realization Plan or enforce AC-by-AC evidence against the Acceptance Checklist. Use `/normal-long-task` before long-running execution when ordinary checklist preparation is needed; use `/superpowers-long-task` when the three upstream inputs already exist and Superpowers execution is desired. Treat `task-state.json` as the only execution state source, `events.ndjson` as append-only, `derived/**` as generated reading views and `task-state.evidence[]` as the canonical evidence ledger. `validate-superpowers-state` and state-backed `validate-plan-acceptance` are still artifact/state-consistency validators, not product-quality proof; a subagent auditor is an extra gap-finding pass on top of executor self-evidence and validator checks, not a replacement for either. Passing Superpowers review or verification does not bypass incomplete state rows, weak evidence, missing proof layers or blocking auditor findings.
 For Product Surface work, `context_surface_contract` turns broad product/page/UI principles into project-owned surface responsibilities. A Product Surface can be a Web page, mobile screen, desktop window, game UI/HUD/menu, CLI/TUI output, extension UI or embedded/device interface. Cross-surface contracts use the existing `contract` role; area-owned screen facts stay in `area` or `subdomain`; repeatable validation paths use `verification`. The Harness does not add a new surface-specific role or create business surface contracts during `init` or `upgrade`. Product Surface Context authoring is not a default product-quality validator; plan validators only check declared temporary surface bindings for structural consistency. Projects that want mandatory task blocks should add a separate project-local Skill, while `product-surface-contract.md` is only a compact managed template for optional Context authoring.
@@ -399,7 +401,9 @@ Use `npx --no-install ty-context ...` only when you explicitly want the already
 | `npx --yes --package project-tiny-context-harness@latest ty-context check-modularity --touched [--limit 300] [--fail-on-warning]` | Reports selected handwritten source files over the physical line-count limit; `--file <path>` and `--base <ref>` select explicit files or branch changes, and config waivers are reported distinctly. |
 | `npx --yes --package project-tiny-context-harness@latest ty-context validate-context` | Checks minimum project recovery fields, Context graph metadata, declared paths/roles and fake test-execution claims. |
 | `npx --yes --package project-tiny-context-harness@latest ty-context validate-plan-contract <plan.md\|dir>` | Checks Source-to-Context Coverage and Context-to-Implementation Binding for structural consistency, referenced path existence and weak-proof complete/bound contradictions. |
-| `npx --yes --package project-tiny-context-harness@latest ty-context validate-plan-acceptance <dir>` | Checks long-task plan-conformance matrix and final verdict JSON for contradictory complete claims, dangling evidence references, weak-proof complete rows, missing proof layers, material/critical drift, unapproved sibling substitution, blocking auditor findings, raw secrets/tokens/cookies, optional manifest reference drift, generated active-count drift, missing plan/AC cross-references and declared surface/architecture binding gaps. `errors` block; `warnings` / `hygiene` report cleanup. |
+| `npx --yes --package project-tiny-context-harness@latest ty-context validate-superpowers-state <dir>` | Checks canonical Superpowers `task-state.json`, source hashes, graph references, delivery scope fields/conflicts, evidence/proof-layer consistency, stale evidence, sibling substitution, auditor blockers, derived drift and final completion rules. |
+| `npx --yes --package project-tiny-context-harness@latest ty-context validate-plan-acceptance <dir>` | Checks legacy matrix/verdict artifacts when no state exists; when `task-state.json` exists, validates state-backed derived artifacts. It rejects contradictory complete claims, dangling evidence references, weak-proof complete rows, missing proof layers, material/critical drift, unapproved sibling substitution, blocking auditor findings, raw secrets/tokens/cookies, generated active-count drift, missing plan/AC cross-references and declared surface/architecture binding gaps. `errors` block; `warnings` / `hygiene` report cleanup. |
+| `npx --yes --package project-tiny-context-harness@latest ty-context superpowers <subcommand>` | Explicit `/superpowers-long-task` state helper for `init`, `compile`, `apply-slice-delta`, `derive`, `slice-gate`, `epoch-gate`, `final-gate` and `next-slices` under `tmp/ty-context/plan-acceptance/**`. |
 | `make validate-context` | Makefile wrapper for `validate-context`. |
 | `make validate-code-modularity` | Hard gate for touched handwritten source modularity; CI can set `TY_CONTEXT_MODULARITY_BASE=<ref>` to audit PR/base changes. |
 | `make validate-harness` | Composite gate for `validate-context` and `validate-code-modularity`. |

package/assets/README.zh-CN.md CHANGED Viewed

@@ -54,13 +54,15 @@ Tiny Context 有两个核心层。Minimal Context 是长期事实源层：说明
 对于长程任务，Harness 提供两个显式调用的长程任务 Skill。普通长程任务用 `/normal-long-task`：它把方案和验收输入临时放到 `tmp/ty-context/plan-acceptance/**`，生成或复用完整验收清单，并可输出普通目标模式文本。如果外部规划模型参与，推荐仍然只给两份产物：`《开发方案》` 作为执行方向和 plan traceability source，`《验收清单和测试用例》` 作为 Codex target-mode acceptance input packet。第一份应包含可逐项追踪的 plan item、预期落点 surface、full scope 与 sampled/optional 边界；第二份应包含 AC、required evidence、测试命令、真实产品路径 / core path、证据分层、无效证据、状态机、local audit 和 blocker。Source Pack 只是临时上传材料，不是 durable Context。如果方案里已经有明确、具体的“验收清单”，`/normal-long-task` 会直接复用那份清单并单独写入完整验收清单文件；两份输入包走 strict mode，如果两份内容无法完整解析出 required fields，或第二份缺少 required evidence、verification method、fail condition、状态机、无效证据规则等必要字段，Skill 会停止并列出缺失项，不生成完整验收清单或目标模式文本。
-Superpowers 长程任务 Skill 用 `/superpowers-long-task`。如果下一步明确要 Superpowers 目标模式文本，推荐在三份输入都存在后调用：`Product / Architecture Source`（产品/架构原始意图源）、`Technical Realization Plan`（具体技术实现方案）和 `Acceptance Checklist`（验收清单）。它不做复杂度分流；调用它表示上游已经决定使用 Superpowers 长程执行。它不要求先跑 `/normal-long-task`，但也不会把产品方案现场翻译成技术方案；如果只有产品/架构方案和验收清单，Skill 会用 Missing Fields Report 停止并报告缺少 `Technical Realization Plan`。两份输入兼容只限第一份明确包含产品/架构源和技术实现方案两个章节。`Technical Realization Plan` 必须已经满足 Superpowers-ready Markdown implementation plan 的必填字段；满足时它跳过方案生成，直接绑定 Superpowers 执行，不满足时直接中断并报告缺失字段，不生成 prompt。它显式输出 `Superpowers 输入包` 和执行绑定，让未来 executor 清楚哪些输入进入 parent-level Product Context Delta / Technical Context Delta、slice-level new durable fact check、subagent/inline execution、TDD、`superpowers:verification-before-completion`、local audit、`plan-conformance-matrix.*` 和 `final-acceptance-verdict.*`。这个 prompt 是面向 Superpowers workflow 的 Tiny Context 适配层，对齐官方 Superpowers skills，但不是上游维护的 schema；它可以在 Superpowers 外层增加 Tiny Context 的权威、对图纸和验收门禁，但不能重新定义、重复或分叉 Superpowers 执行机制。如果未来改动让 Tiny Context 新增步骤和官方 Superpowers 职责冲突、重复或覆盖，应停止修改并提示边界冲突，不要静默合并两套流程。它不生成技术方案或验收清单、不执行计划、不证明完成，也不会把临时清单、矩阵或 verdict 注册成 `project_context/**`。三输入是上游权威，audit / matrix / verdict / validator / auditor 不能改写它们。完整验收行按外部审计证据处理：proof chain 来自验收清单，fresh evidence 必须满足每个 required layer，存在 material drift、缺 required layer 或未批准 sibling substitution 时不能标 `complete`。Evidence Ledger / proof index 文件可选，但 complete 行必须能直接或通过可选 `evidence_id` 追溯 fresh evidence。Goal mode 表述必须区分 `audit_task_complete`、`acceptance_target_status` 和 `product_goal_complete`：实现/执行目标只在 `product_goal_complete=true` 时完成；只读审计目标可在 `audit_task_complete` 时结束，但 verdict 不是 accepted/complete 时，回复写 `Audit workflow completed; acceptance target not complete.`，不能用未限定的 `Goal achieved` 或 `update_goal(status="complete")` 表示用户验收目标已完成。
+Superpowers 长程任务 Skill 用 `/superpowers-long-task`。如果下一步明确要 Superpowers 目标模式文本，推荐在三份输入都存在后调用：`Product / Architecture Source`（产品/架构原始意图源）、`Technical Realization Plan`（具体技术实现方案）和 `Acceptance Checklist`（验收清单）。它不做复杂度分流；调用它表示上游已经决定使用 Superpowers 长程执行。它不要求先跑 `/normal-long-task`，但也不会把产品方案现场翻译成技术方案；如果只有产品/架构方案和验收清单，Skill 会用 Missing Fields Report 停止并报告缺少 `Technical Realization Plan`。两份输入兼容只限第一份明确包含产品/架构源和技术实现方案两个章节。`Technical Realization Plan` 必须已经满足 Superpowers-ready Markdown implementation plan 的必填字段；满足时它跳过方案生成，直接绑定 Superpowers 执行，不满足时直接中断并报告缺失字段，不生成 prompt。它显式输出 `Superpowers 输入包` 和执行绑定，让未来 executor 清楚哪些输入进入 parent-level Product Context Delta / Technical Context Delta、slice-level new durable fact check、subagent/inline execution、TDD、`superpowers:verification-before-completion`、canonical `task-state.json`、append-only `events.ndjson`、generated `derived/**` views、proof-chain evidence 和 optional auditor review。这个 prompt 是面向 Superpowers workflow 的 Tiny Context 适配层，对齐官方 Superpowers skills，但不是上游维护的 schema；它可以在 Superpowers 外层增加 Tiny Context 的权威、对图纸和验收门禁，但不能重新定义、重复或分叉 Superpowers 执行机制。如果未来改动让 Tiny Context 新增步骤和官方 Superpowers 职责冲突、重复或覆盖，应停止修改并提示边界冲突，不要静默合并两套流程。它不生成技术方案或验收清单、不执行计划、不证明完成，也不会把临时 state、derived views 或 verdict 注册成 `project_context/**`。三输入是上游权威，state / derived views / validator / auditor 不能改写它们。`task-state.json` 是唯一执行状态源，`events.ndjson` 追加记录状态变更，`derived/**` 只生成 local audit、plan-conformance matrix、final acceptance verdict、progress ledger、evidence index、context alignment 和 final summary 等阅读视图。完整验收行按外部审计证据处理：proof chain 来自验收清单，fresh evidence 必须通过 `task-state.evidence[]` 满足每个 required layer，存在 material drift、缺 required layer 或未批准 sibling substitution 时不能标 `complete`。Goal mode 表述必须区分 `audit_task_complete`、`acceptance_target_status` 和 computed `product_goal_complete`：实现/执行目标只在 `ty-context superpowers final-gate` 计算出 `product_goal_complete=true` 时完成；只读审计目标可在 `audit_task_complete` 时结束，但 verdict 不是 accepted/complete 时，回复写 `Audit workflow completed; acceptance target not complete.`，不能用未限定的 `Goal achieved` 或 `update_goal(status="complete")` 表示用户验收目标已完成。
-对于非平凡 slice，生成的 Superpowers prompt 会建议使用可选 evidence manifest：`tmp/ty-context/plan-acceptance/<plan-slug>-evidence-manifest.md/json`。这个证据 manifest 是 slice 级“证据小票”，不是第四输入，不是 durable Context，不是 proof 本身，也不是 `validate-plan-acceptance` 必需项；它只帮助 matrix / verdict / audit 从同一份 per-slice evidence 同步，减少 stale wording 和 overclaim。默认 slice 策略是把同一 AC、runtime 场景、proof 环境或验证路径下的 2-4 个强相关 missing layers 合并处理；单 gap slice 只留给 blocker、contradiction 或小型 metadata cleanup。prompt 还会要求先分类 missing layer、复用 DB/API/Browser 环境时使用唯一 proof prefix 和 cleanup assertion，并在同步 artifact 后做 stale/overclaim scan。
+三份输入还必须承载 capability-first delivery 边界。Product / Architecture Source 声明 `delivery_scope`、`full_population_required`、哪些 representative samples 能验证 claim、哪些不能验证、以及 `out_of_scope_backlog`。每个 Technical Realization Plan item 声明 delivery scope、capability target、representative samples、full-population boundary 和 non-required population。每个 Acceptance Checklist item 声明 acceptance scope、`ac_validates`、`ac_does_not_validate`、sample boundary 和 full-population requirement。source / plan / checklist 在 system capability build、representative sample validation、full population operation 之间冲突时，`scope_conflict_requires_decision` 阻塞完成。sample evidence 或 framework-only implementation 不能证明 all-provider、all-interface、all-platform 或 full-population 完成，除非 AC 明确批准；未显式要求 full population 时，generated views 必须报告 `not_in_scope`。
-生成的 Superpowers prompt 使用 Slice Gate / Epoch Gate / Final Gate 分层节奏，而不是每个 slice 后都跑完整 final gate。Progress Accounting 分开记录四类进度：AC acceptance completion、engineering implementation progress、runtime/proof progress 和 workflow overhead。多 slice 或多 agent 执行可在 `tmp/ty-context/plan-acceptance/**` 下维护临时 progress ledger；每个 slice 需要声明 artifact budget、proof-layer milestone 状态和 cleanup expectation。workflow overhead backpressure 要求 executor 批处理共享的 provider/browser/runtime/security epoch proof environment，清理 stale artifact，并选择 Next 3-5 high-value clusters 来优先关闭最多阻塞 AC / proof-layer gap。
+对于非平凡 slice，生成的 Superpowers prompt 要求使用结构化 `slice-delta.json`。executor 通过 `ty-context superpowers apply-slice-delta <workdir> <slice-delta.json>` 应用 delta，然后运行 `ty-context superpowers derive` 和 `ty-context superpowers slice-gate`。每个 delta 记录 touched plan items / ACs、code changes、closed / remaining proof layers、blockers、cleanup assertions、`progress_value`，以及带有 `proves`、`does_not_prove`、freshness、redaction 和 reviewability 的 canonical evidence records。默认 slice 策略是把同一 AC、runtime 场景、proof 环境或验证路径下的 2-4 个强相关 missing layers 合并处理；单 gap slice 只留给 blocker、contradiction 或小型 metadata cleanup。prompt 还会要求先分类 missing layer、复用 DB/API/Browser 环境时使用唯一 proof prefix 和 cleanup assertion，并在生成 derived artifacts 后做 stale/overclaim scan。
-重要使用提示：Minimal Context 有意把 Context 读取顺序、Context / 代码优先级和漂移检查保持为 agent 级软约束，而不是机器强制 edit-order gate。这个取舍适合短任务，但长任务、大上下文、多次交接或多轮验证时预期会漂移。单靠 Superpowers 在这类压力下仍可能漂移：它能增强执行纪律，但本身不负责保留上游 source authority、防止 scope shrinkage、证明完整符合 Technical Realization Plan，或按 Acceptance Checklist 逐 AC 强制证据成立。普通 checklist 准备需要 `/normal-long-task`；已有产品/架构原始意图源、具体技术实现方案和验收清单且需要 Superpowers 时，可直接用 `/superpowers-long-task`。`Product Context Delta` 判断产品逻辑、页面职责、信息架构和验收语义是否需要写入 Context；`Technical Context Delta` 判断 API/schema、模块边界、runtime/state、验证/部署路径和稳定技术取舍是否需要写入 Context。`plan-conformance-matrix.*` 是执行期“对图纸台账”，`final-acceptance-verdict.*` 是最后逐 AC 验收报告，local audit 只是临时进度/恢复状态，不能裁判完成；审计流程完成也不等于被验收目标完成。使用目标模式执行方案时，目标结束条件对齐 `product_goal_complete=true`，只读审计目标才可把 `audit_task_complete` 当元任务结束。最终顺序是 manifest -> matrix/verdict/audit sync -> verification-before-completion -> validator -> read-only auditor -> stale/overclaim scan；若审计后修改 artifact/evidence，需重跑 validator。`validate-plan-contract` 和 `validate-plan-acceptance` 只检查临时 artifact 自洽、引用存在、弱证据 complete 行、缺 required proof layer、material/critical drift、sibling substitution 和已声明的 surface/architecture binding 一致性，不证明产品质量。有 subagent 能力时，Superpowers 目标提示会把 subagent 作为只读 auditor 加在主 agent 自证和 validator 之后；auditor 用固定 auditor checklist 找 gap，不是 proof source。Superpowers review 和 verification 仍然有价值，但不能覆盖 Tiny Context gates；通过 Superpowers review 不等于证明 plan conformance 或 checklist acceptance。
+生成的 Superpowers prompt 使用 Slice Gate / Epoch Gate / Final Gate 分层节奏，而不是每个 slice 后都跑完整 final gate。Progress Accounting 在 state 和 generated `derived/progress-ledger.*` 中记录 AC acceptance completion、engineering implementation progress、runtime/proof progress、system capability progress、representative sample progress、real object coverage、full population operation progress、artifact budget 和 workflow overhead。每个 slice 需要声明 artifact budget、proof-layer milestone 状态和 cleanup expectation。workflow overhead backpressure 要求 executor 批处理共享的 provider/browser/runtime/security epoch proof environment，清理 stale artifact，并选择 Next 3-5 high-value clusters 来优先关闭最多阻塞 AC / proof-layer gap。
+重要使用提示：Minimal Context 有意把 Context 读取顺序、Context / 代码优先级和漂移检查保持为 agent 级软约束，而不是机器强制 edit-order gate。这个取舍适合短任务，但长任务、大上下文、多次交接或多轮验证时预期会漂移。单靠 Superpowers 在这类压力下仍可能漂移：它能增强执行纪律，但本身不负责保留上游 source authority、防止 scope shrinkage、证明完整符合 Technical Realization Plan，或按 Acceptance Checklist 逐 AC 强制证据成立。普通 checklist 准备需要 `/normal-long-task`；已有产品/架构原始意图源、具体技术实现方案和验收清单且需要 Superpowers 时，可直接用 `/superpowers-long-task`。`Product Context Delta` 判断产品逻辑、页面职责、信息架构和验收语义是否需要写入 Context；`Technical Context Delta` 判断 API/schema、模块边界、runtime/state、验证/部署路径和稳定技术取舍是否需要写入 Context。`task-state.json` 是唯一执行状态源，`events.ndjson` 追加记录状态变化，`derived/**` 是生成阅读视图，`task-state.evidence[]` 是 canonical evidence ledger；local audit 只是 generated progress/recovery view，不能裁判完成；审计流程完成也不等于被验收目标完成。使用目标模式执行方案时，目标结束条件对齐 computed `product_goal_complete=true`，只读审计目标才可把 `audit_task_complete` 当元任务结束。最终顺序是 derive all views -> verification-before-completion -> `validate-superpowers-state` -> state-backed `validate-plan-acceptance` -> read-only auditor -> stale/overclaim scan -> `ty-context superpowers final-gate` 计算 completion；若审计后修改 state/evidence，需 rederive 并重跑两个 validator。`validate-plan-contract`、`validate-superpowers-state` 和 `validate-plan-acceptance` 只检查临时 artifact/state 自洽、引用存在、弱证据 complete 行、缺 required proof layer、material/critical drift、sibling substitution 和已声明的 surface/architecture binding 一致性，不证明产品质量。有 subagent 能力时，Superpowers 目标提示会把 subagent 作为只读 auditor 加在主 agent 自证和 validator 之后；auditor 用固定 auditor checklist 找 gap，不是 proof source。Superpowers review 和 verification 仍然有价值，但不能覆盖 Tiny Context gates；通过 Superpowers review 不等于证明 plan conformance 或 checklist acceptance。
 ## 当前最佳实践
@@ -81,9 +83,9 @@ Web GPT 或其他外部规划模型产出长任务源输入
 -> 每个执行片段都回到流程契约 + project_context/**
 ```
-这里的 Superpowers 指具体的 [obra/Superpowers](https://github.com/obra/superpowers) 插件/开源工作流，不是泛化的执行规划替代品。`/superpowers-long-task` 接受输入包后，有 subagent 支持时优先用 `superpowers:subagent-driven-development`，否则用 `superpowers:executing-plans`；涉及行为变更时用 `superpowers:test-driven-development`；完成声明前用 `superpowers:verification-before-completion` 同时检查 plan-conformance matrix 和 final acceptance verdict，然后运行 `ty-context validate-plan-acceptance <dir>`。
+这里的 Superpowers 指具体的 [obra/Superpowers](https://github.com/obra/superpowers) 插件/开源工作流，不是泛化的执行规划替代品。`/superpowers-long-task` 接受输入包后，有 subagent 支持时优先用 `superpowers:subagent-driven-development`，否则用 `superpowers:executing-plans`；涉及行为变更时用 `superpowers:test-driven-development`；完成声明前先 derive all views，再用 `superpowers:verification-before-completion`、`ty-context validate-superpowers-state <dir>`、`ty-context validate-plan-acceptance <dir>` 和 read-only auditor 检查，最后由 `ty-context superpowers final-gate <dir>` 计算 `product_goal_complete`。
-原因是漂移控制。流程契约 + Context 层是软约束，短任务里通常能让 agent 按预期执行；长程任务里，Context 仍然能记录符合预期的事实，但 Context 到代码 的实现步骤会随着上下文窗口变大、多次交接、subagent 拆分和多轮验证而漂移。单靠 Superpowers 也仍可能在复杂长程执行压力下漂移：它增强执行纪律，但不天然保留 source authority、防止 scope shrinkage、证明完整符合 Technical Realization Plan，或按 Acceptance Checklist 逐 AC 强制证据成立。产品/架构原始意图源、具体技术实现方案、验收清单、显式长程任务 Skill 调用、目标模式文本、可选 evidence manifest、plan-conformance matrix、final acceptance verdict 和可选 Superpowers 执行层，把“产品/技术 Context 有没有先对齐”“有没有按图纸实现”和“有没有按验收证据完成”都外化成可恢复、可审计的临时执行标准，同时不恢复阶段式 gate。
+原因是漂移控制。流程契约 + Context 层是软约束，短任务里通常能让 agent 按预期执行；长程任务里，Context 仍然能记录符合预期的事实，但 Context 到代码 的实现步骤会随着上下文窗口变大、多次交接、subagent 拆分和多轮验证而漂移。单靠 Superpowers 也仍可能在复杂长程执行压力下漂移：它增强执行纪律，但不天然保留 source authority、防止 scope shrinkage、证明完整符合 Technical Realization Plan，或按 Acceptance Checklist 逐 AC 强制证据成立。产品/架构原始意图源、具体技术实现方案、验收清单、显式长程任务 Skill 调用、目标模式文本、canonical task state、generated derived views 和可选 Superpowers 执行层，把“产品/技术 Context 有没有先对齐”“有没有按图纸实现”和“有没有按验收证据完成”都外化成可恢复、可审计的临时执行标准，同时不恢复阶段式 gate。
 对于高风险产品方案、架构方案、技术方案或验收方案输入，流程契约应先在 `plan.md` 或等价临时计划面里可见化，再进入实现。这个计划面把 Source-to-Context Coverage 和 Context-to-Implementation Binding 分开：前者把每条长期 source 约束映射到 existing Context hit、Context action、owning Context 和 coverage status；后者把 Context fact 映射到 implementation obligation、expected surfaces、implemented paths、forbidden shortcuts、verification path 和 binding status。Source coverage 仍有 `under_scoped` 或未处理的 `new_context_required` 时不能声称按方案完整实现；binding 仍有 `partial`、`missing`、`blocked`、`needs_user_decision` 或 `contradicted_by_current_state` 时不能声称按 Context 完整落地。

package/assets/skills/context_development_engineer/SKILL.md CHANGED Viewed

@@ -20,15 +20,16 @@ Project-specific engineering rules belong in a separate project-local Skill unde
 1. 先读取 `project_context/global.md`、`project_context/architecture.md` 和 `project_context/context.toml`，按 default area、triggers、read_when 选择相关 context。
 2. 先确认用户目标、约束、成功标准、影响产品域、现有验证 / 部署关键路径和风险；能从代码或 Context 发现的事实不要反复询问用户。
 3. `project_context/**` 决定“应该是什么”：模块职责、归属、架构边界、接口方向、契约语义和禁止依赖；代码决定“现在实现到了哪里”。代码不能静默重定义 Context。
-4. 第一处代码编辑前，若任务影响 durable architecture boundary、module ownership、API / Schema / data contract、state / runtime semantics、dependency direction、verification / deployment semantics 或 durable rationale / tradeoff，先编译当前任务契约；契约第一段用 `Context Delta: none|required` 完成唯一正式长期事实判断，再写本次 `Task Contract`，并显式写 `Architecture Context Hit` 和 `Decision Rationale Hit: existing|required|none`。如果输入包含产品方案、架构方案、技术方案、实现方案或验收方案，先在 `plan.md` 或等价临时计划面做 Source-to-Context Coverage，确认方案中的 durable architecture / ownership / API / runtime / verification constraints 已被现有 Context 覆盖、需要更新、仅属 task-local、显式 out-of-scope、需要用户决策或仍 under-scoped。
-5. 普通 bug fix、局部样式、局部实现漂移修复、小重构、package/release 处理、测试修复或探索性 spike 不强制编译架构 / rationale 任务契约，也不更新 Context；一旦形成长期工程结论，继续对齐或交付前必须回写 Context。不要把 Context 机械补成代码改动摘要。
+4. 第一处代码编辑前，若任务影响 durable architecture boundary、module ownership、API / Schema / data contract、state / runtime semantics、dependency direction、verification / deployment semantics 或 durable rationale / tradeoff，先编译当前任务契约；契约第一段用 `Context Delta: none|required` 完成唯一正式长期事实判断，再写本次 `Task Contract`，并显式写 `Architecture Context Hit` 和 `Decision Rationale Hit: existing|required|none`。如果输入包含产品方案、架构方案、技术方案、实现方案或验收方案，先在 `plan.md` 或等价临时计划面做 Source-to-Context Coverage，确认方案中的 durable architecture / ownership / API / runtime / verification constraints 已被现有 Context 覆盖、需要更新、仅属 task-local、显式 out-of-scope、需要用户决策或仍 under-scoped。
+5. 普通 bug fix、局部样式、局部实现漂移修复、小重构、package/release 处理、测试修复或探索性 spike 不强制编译架构 / rationale 任务契约，也不更新 Context；一旦形成长期工程结论，继续对齐或交付前必须回写 Context。不要把 Context 机械补成代码改动摘要。
 6. 如果代码、搜索结果或相邻实现与 Context 冲突，显式标记为实现漂移、缺失工作或 Context 过期，不要用当前代码形态反推模块归属。
 7. 涉及已有 Context 的实现判断，先做轻量对齐：
    - Context expectation
    - Current code evidence
    - Gap
    - Proposed change
-8. 涉及模块原则、模块逻辑、设计原因、API / Schema、状态语义、验证设计或 capability / metric / acceptance claim 时，先做 Module Principle / Design Gate：列出命中的模块设计上下文来源，说明这些原则 / 逻辑控制本次哪些实现或验证选择，再选择实现路径、验证 claim、probe 参数或 fallback。命令、probe、当前实现形态和被触碰文件大小是执行实例或维护风险，不能反推或覆盖模块设计目标。
+8. 涉及模块原则、模块逻辑、设计原因、API / Schema、状态语义、验证设计或 capability / metric / acceptance claim 时，先做 Module Principle / Design Gate：列出命中的模块设计上下文来源，说明这些原则 / 逻辑控制本次哪些实现或验证选择，再选择实现路径、验证 claim、probe 参数或 fallback。命令、probe、当前实现形态和被触碰文件大小是执行实例或维护风险，不能反推或覆盖模块设计目标。
+   - 对外部产品/架构源、技术实现方案或验收清单中的 delivery / acceptance scope，必须显式区分 `system_capability_build`、`representative_sample_validation`、`full_population_operation`、`full_population_not_required` 和 out-of-scope backlog。不要把若干具体对象运行结果当作可复用系统能力完成，也不要把 framework-only 实现当作全量真实对象已完成；sample provider / interface / page 证据不能替代 all-provider / all-interface / all-platform / full-population 完成，除非 AC 明确批准该边界。
 9. 涉及 Product Surface（Web 页面、移动/桌面屏幕、游戏 UI/HUD/菜单、CLI/TUI 输出、扩展或设备界面）、表单/配置、输入、选择、搜索、筛选、调度/时间、预算/配额/限流或状态反馈的实现方案时，检查当前代码是否只是暴露字段，还是满足了已有 Context、Surface Contract、页面职责和控件任务框架；实现收尾要能给出简短 Surface/Context Conformance 证据。
    - 若存在 Product Surface Contract，Task Contract 必须包含 Surface Contract Hit、main allows/forbids、drilldown ownership、long-task state requirement、implementation drift 和 verification。
    - 若缺失且本任务创建 durable surface responsibility，设置 `Context Delta: required`，先用 `context_surface_contract` 或项目 Context 写入具体 surface 职责，再继续实现。
@@ -45,13 +46,13 @@ Project-specific engineering rules belong in a separate project-local Skill unde
    - 默认只实施高收益、低风险、语义稳定的候选项。
    - 不为一次性代码、不稳定语义或纯粹好看的架构做抽象。
 13. 当人工流程呈现重复、确定性、容易漏步骤或顺序影响正确性时，主动评估是否应沉淀为 repo-local tool/script。脚本应放在 owning module 的工具目录并配测试；可恢复的执行入口、参数约束和适用边界写入对应 verification / deployment Context。Skill 只记录这类脚本化机会识别原则，不承载具体模块命令、provider id、artifact 路径或一次性运行结果。
-14. 需要沉淀长期事实时，只更新 `project_context/**`：
+14. 需要沉淀长期事实时，只更新 `project_context/**`：
    - 全局工程取舍、跨产品域索引或当前状态写入 `global.md`。
    - 产品域 API、数据契约、关键约束、入口和风险写入对应 area / subdomain Context。
    - 跨域接口语义写入 `context_role: contract` 或 manifest role 为 `contract` 的 Context；关键重复验证路径写入 `verification`；关键部署、运行拓扑或云端初始化路径写入 `deployment`；代码入口索引用 `implementation-index`；底层理论源用 `foundation`；历史归档索引用 `archive`。
    - 新 context unit 可新增 `project_context/areas/<unit>.md`，并更新 `global.md#Context Index`；复杂项目同时更新 `project_context/context.toml`。
    - 如果 `upgrade` 自动把深层 `.md` 注册成 area，但语义上更像 foundation / contract / archive，后续应显式调整 manifest role；不要依赖自动迁移判断语义。
-15. 实现收尾时做 `Contract Conformance` 和 Context drift check：确认代码没有引入未沉淀的长期事实，且 Context 没有退化成普通实现摘要；若存在 `plan.md` / 等价临时计划面，必须反查 Source-to-Context Coverage、Context-to-Implementation Binding 和 Task Contract，确认没有未处理的 `under_scoped` / `new_context_required` / `needs_user_decision`，也没有 non-bound implementation rows。交付说明只报告轻量状态：`Context: 已更新 ...` 或 `Context: 本次无长期事实变化`。Conformance 说明本次契约满足情况、未满足或延期项和验证入口；一次性证据、截图结果、测试日志、任务契约和实现摘要不写入 Context。
+15. 实现收尾时做 `Contract Conformance` 和 Context drift check：确认代码没有引入未沉淀的长期事实，且 Context 没有退化成普通实现摘要；若存在 `plan.md` / 等价临时计划面，必须反查 Source-to-Context Coverage、Context-to-Implementation Binding 和 Task Contract，确认没有未处理的 `under_scoped` / `new_context_required` / `needs_user_decision`，也没有 non-bound implementation rows。交付说明只报告轻量状态：`Context: 已更新 ...` 或 `Context: 本次无长期事实变化`。Conformance 说明本次契约满足情况、未满足或延期项和验证入口；一次性证据、截图结果、测试日志、任务契约和实现摘要不写入 Context。
 16. Context 只能声明验证 / 部署关键路径或验收信号，不能伪造“测试已通过”或“部署已成功”。
 17. Verification / Deployment Role Context 只记录长期可复用的重复执行路径事实：特殊准备、最短命令或路径、预期阶段 / 信号、可接受 warning、已排除的重复探索点。不要记录一次性测试日志、完整输出、临时 JSON、CI artifact、测试报告、release ledger、secret、token、cookie、device id 或 raw payload。
@@ -64,37 +65,38 @@ Project-specific engineering rules belong in a separate project-local Skill unde
 ## 任务契约编译
 - 任务契约是当前工程任务的编译产物，不是事实源、tech plan、ADR、implementation doc 或长期 Context；默认留在方案、交付说明或 PR 文本中。
-- `Context Delta` 必须先出现，取值为 `none` 或 `required`：
-  - `none`：本次只是按既有 Context / 架构原则落地，不新增长期事实。
-  - `required`：说明长期事实类型、应写入的 Context / role、需要沉淀的事实，以及明确不写入 Context 的一次性内容。
-- `Task Contract` 用短列表说明 capability、owner、upstream / downstream、allowed / forbidden dependency、input / output / state / persistence、failure / retry / timeout / degraded / recovery、observability、performance、security、non-goals 和 verification path。
-- 高风险工程任务只新增这两个显性 Task Contract 字段，不新增长模板或第二套 durable-fact gate：
-  - `Architecture Context Hit: <architecture.md | area/subdomain Context | contract Context | Module Design Capsule | none>`：命名控制本次技术判断的 Context。若命中 `none` 且本任务创建 durable architecture meaning，`Context Delta` 必须是 `required`。
-  - `Decision Rationale Hit: <existing | required | none>`：`existing` 表示现有 Context 已解释 durable reason；`required` 表示本任务创建或改变 durable rationale、rejected alternative、tradeoff 或 future-change constraint，必须走 `Context Delta: required`；`none` 表示没有稳定 rationale 或变化局部且自明。
-- 触及 Product Surface 时，`Task Contract` 同时说明 surface platform、primary user question、main allows/forbids、drilldown ownership、long-task state requirement、implementation drift 和 conformance verification。
+- `Context Delta` 必须先出现，取值为 `none` 或 `required`：
+  - `none`：本次只是按既有 Context / 架构原则落地，不新增长期事实。
+  - `required`：说明长期事实类型、应写入的 Context / role、需要沉淀的事实，以及明确不写入 Context 的一次性内容。
+- `Task Contract` 用短列表说明 capability、owner、upstream / downstream、allowed / forbidden dependency、input / output / state / persistence、failure / retry / timeout / degraded / recovery、observability、performance、security、non-goals 和 verification path。
+- 高风险工程任务只新增这两个显性 Task Contract 字段，不新增长模板或第二套 durable-fact gate：
+  - `Architecture Context Hit: <architecture.md | area/subdomain Context | contract Context | Module Design Capsule | none>`：命名控制本次技术判断的 Context。若命中 `none` 且本任务创建 durable architecture meaning，`Context Delta` 必须是 `required`。
+  - `Decision Rationale Hit: <existing | required | none>`：`existing` 表示现有 Context 已解释 durable reason；`required` 表示本任务创建或改变 durable rationale、rejected alternative、tradeoff 或 future-change constraint，必须走 `Context Delta: required`；`none` 表示没有稳定 rationale 或变化局部且自明。
+- 触及 Product Surface 时，`Task Contract` 同时说明 surface platform、primary user question、main allows/forbids、drilldown ownership、long-task state requirement、implementation drift 和 conformance verification。
 - 工程 / RFC / 实现类任务的 `Task Contract` 必须包含 `Modularity Check: none|required|exception`：
   - `none`：没有超限计划 / touched 手写源码文件，或本次没有向超限文件增加新职责。
   - `required`：拆分是本次验收条件，应按 abstraction / decomposition scan 的职责边界完成。
   - `exception`：本次触碰超限文件但暂不拆；只有默认 `modularity.policy: scoped_waivers` 允许此路径，且必须已有或同步新增 `<harnessRoot>/config.yaml` `modularity.waivers` 记录文件、收窄分类、原因和后续拆分边界。若项目设置 `modularity.policy: strict_except_generated`，不得用 legacy waiver 绕过超限手写源码，交付说明只记录本次是否新增职责以及为什么没有拆。
-- `Applicable Module Design` 是高风险任务的前置字段：列出命中的 Context / Skill 来源、适用的 Principles、Design Logic 和 Design Rationale，以及它们控制的当前实现或验证选择。
-- `Principle Decision Gate` 要写明首选执行路径、fallback / degraded path 的进入条件，以及什么证据不能证明本次目标。涉及 capability、metric 或 acceptance claim 时，先声明要证明的 claim，再选择命令或 probe。
-- 对长任务、多模块、多 agent、外部产品/架构/技术/实现/验收方案输入、容易发生 `Context Delta` 调头或多轮验证的任务，使用 `plan.md` 或等价临时计划面暂存 `Source-to-Context Coverage`、`Context-to-Implementation Binding`、`Context Delta`、`Task Contract`、`Implementation Steps` 和 `Contract Conformance`；它只是临时执行缓存。
-- small code task 指现有 Context 已足够、且不改变 durable product / architecture / API-schema / runtime-state / verification-deployment / security-redaction / surface ownership 事实的局部实现任务；它按语义风险判断，不按代码行数判断，不应创建 `plan.md`、完整 trace tables、Source-to-Context Coverage 或 Context-to-Implementation Binding，除非它发现长期事实变化或扩展成高风险工作。
-- `Source-to-Context Coverage` 表使用字段：`Source item | Durable constraint | Type | Existing Context Hit | Context action | Owning Context | Coverage status`。这张表只回答 source 约束是否进入或命中 Context，不写实现路径。
-- `Coverage status` 取值：`covered`、`new_context_required`、`context_updated`、`task_local_only`、`out_of_scope_explicit`、`needs_user_decision`、`under_scoped`。存在 `under_scoped` 或未处理的 `new_context_required` / `needs_user_decision` 时，不能声称已按方案完整实现。
-- `Context-to-Implementation Binding` 表使用字段：`Context fact | Implementation obligation | Expected surfaces | Implemented paths | Forbidden shortcuts | Verification path | Binding status`。
-- `Binding status` 取值：`bound`、`partial`、`missing`、`blocked`、`out_of_scope_explicit`、`needs_user_decision`、`contradicted_by_current_state`。runtime/API/worker 项不能只用测试名或 browser checked path 冒充 `bound`。
-- `plan.md` 中出现的长期工程事实必须提炼回 `project_context/**`；否则不要把临时计划当作事实源、交付产物或后续引用依据。
-- `Context Delta: required` 时先更新 `project_context/**`，再继续实现；`none` 时直接按 Task Contract 实现。
-- `Contract Conformance` 是交付前的软检查：实现偏差修实现，契约遗漏回 Task Contract，长期事实缺失或 source coverage under-scoped 回 `Context Delta` 并先更新 Context。
-- 不为 small code task、普通代码修改、bug fix、小重构、package/release 处理、测试修复、探索性 spike 或仅因 touched file 过大强制编译架构 / rationale 任务契约；大文件只走 `Modularity Check` 的拆分 / exception 判断。
+- `Applicable Module Design` 是高风险任务的前置字段：列出命中的 Context / Skill 来源、适用的 Principles、Design Logic 和 Design Rationale，以及它们控制的当前实现或验证选择。
+- `Principle Decision Gate` 要写明首选执行路径、fallback / degraded path 的进入条件，以及什么证据不能证明本次目标。涉及 capability、metric 或 acceptance claim 时，先声明要证明的 claim，再选择命令或 probe。
+- 若 Task Contract 或验收方案涉及 capability-first delivery boundary，必须记录 source/plan/AC 对 `delivery_scope`、`acceptance_scope`、`full_population_required`、representative sample boundary、non-required population / backlog 的一致性；发现 system capability build 与 full population operation 冲突时，按 `scope_conflict_requires_decision` 处理，不能靠实现方便路径或样本证据自行裁决。
+- 对长任务、多模块、多 agent、外部产品/架构/技术/实现/验收方案输入、容易发生 `Context Delta` 调头或多轮验证的任务，使用 `plan.md` 或等价临时计划面暂存 `Source-to-Context Coverage`、`Context-to-Implementation Binding`、`Context Delta`、`Task Contract`、`Implementation Steps` 和 `Contract Conformance`；它只是临时执行缓存。
+- small code task 指现有 Context 已足够、且不改变 durable product / architecture / API-schema / runtime-state / verification-deployment / security-redaction / surface ownership 事实的局部实现任务；它按语义风险判断，不按代码行数判断，不应创建 `plan.md`、完整 trace tables、Source-to-Context Coverage 或 Context-to-Implementation Binding，除非它发现长期事实变化或扩展成高风险工作。
+- `Source-to-Context Coverage` 表使用字段：`Source item | Durable constraint | Type | Existing Context Hit | Context action | Owning Context | Coverage status`。这张表只回答 source 约束是否进入或命中 Context，不写实现路径。
+- `Coverage status` 取值：`covered`、`new_context_required`、`context_updated`、`task_local_only`、`out_of_scope_explicit`、`needs_user_decision`、`under_scoped`。存在 `under_scoped` 或未处理的 `new_context_required` / `needs_user_decision` 时，不能声称已按方案完整实现。
+- `Context-to-Implementation Binding` 表使用字段：`Context fact | Implementation obligation | Expected surfaces | Implemented paths | Forbidden shortcuts | Verification path | Binding status`。
+- `Binding status` 取值：`bound`、`partial`、`missing`、`blocked`、`out_of_scope_explicit`、`needs_user_decision`、`contradicted_by_current_state`。runtime/API/worker 项不能只用测试名或 browser checked path 冒充 `bound`。
+- `plan.md` 中出现的长期工程事实必须提炼回 `project_context/**`；否则不要把临时计划当作事实源、交付产物或后续引用依据。
+- `Context Delta: required` 时先更新 `project_context/**`，再继续实现；`none` 时直接按 Task Contract 实现。
+- `Contract Conformance` 是交付前的软检查：实现偏差修实现，契约遗漏回 Task Contract，长期事实缺失或 source coverage under-scoped 回 `Context Delta` 并先更新 Context。
+- 不为 small code task、普通代码修改、bug fix、小重构、package/release 处理、测试修复、探索性 spike 或仅因 touched file 过大强制编译架构 / rationale 任务契约；大文件只走 `Modularity Check` 的拆分 / exception 判断。
 ## 模块设计上下文写法
 - 模块设计上下文应是 Minimal Context，不是设计论文；只保留短、准、稳定、会影响后续实现或验证选择的内容。
-- `Principles` 写稳定执行约束；`Design Logic` 写模块如何判断、选择、降级或组合能力；`Design Rationale` 只写会改变后续判断的原因、rejected alternative 或 tradeoff。
-- `Current Standard`、`Verification Paths`、阈值、命令和 probe 参数是当前执行实例，不是永久原则；规则变化时更新对应 Context，而不是让旧命令继续定义目标。
-- 不编造 rationale；仅由当前代码形态反推的理由、一次性证据、实现摘要、PR notes、命令输出、截图审查、debug 过程、agent reasoning、完整日志、临时 JSON、raw payload、测试报告和任务契约不进入高频模块原则段。
+- `Principles` 写稳定执行约束；`Design Logic` 写模块如何判断、选择、降级或组合能力；`Design Rationale` 只写会改变后续判断的原因、rejected alternative 或 tradeoff。
+- `Current Standard`、`Verification Paths`、阈值、命令和 probe 参数是当前执行实例，不是永久原则；规则变化时更新对应 Context，而不是让旧命令继续定义目标。
+- 不编造 rationale；仅由当前代码形态反推的理由、一次性证据、实现摘要、PR notes、命令输出、截图审查、debug 过程、agent reasoning、完整日志、临时 JSON、raw payload、测试报告和任务契约不进入高频模块原则段。
 ## 输出边界
@@ -105,17 +107,17 @@ Project-specific engineering rules belong in a separate project-local Skill unde
 ## 建议沉淀位置
-- `global.md#Design Rationale`：跨模块工程取舍。
-- `architecture.md#Design Rationale`：架构级选择、rejected alternatives 和 tradeoffs。
-- `global.md#Current State`：影响后续恢复的实现状态。
-- `areas/*.md#User / System Contract`：模块可见行为、API、CLI、UI 或数据契约。
-- `areas/*.md#Module Design Capsule`：模块级 principles、design logic 和会影响后续判断的 rationale。
-- `areas/*.md#Core Data / API / State`：关键数据结构、接口、状态流或规则。
-- `areas/*.md#Key Constraints`：性能、安全、兼容、集成或维护约束。
-- role=`contract` Context：跨域 API / schema / event / interface 语义及其 durable rationale。
-- role=`decision-rationale` Context：更大或跨切面的稳定设计原因。
-- `areas/*.md#Code Entry Points`：未来 agent 需要快速定位的代码入口。
-- `areas/*/verification.md` 或 role=`verification` Context：关键测试、smoke、CI、probe 或验证重复执行路径。
-- `areas/*/deployment.md` 或 role=`deployment` Context：关键部署、云端初始化、运行拓扑、健康检查或回滚重复执行路径。
-- `DESIGN.md`：视觉 identity、design token 和视觉 rationale。
-- `project_context/context.toml`：复杂项目的产品域 area/context_unit、role、触发词、按需读取策略和可选边界规则。
+- `global.md#Design Rationale`：跨模块工程取舍。
+- `architecture.md#Design Rationale`：架构级选择、rejected alternatives 和 tradeoffs。
+- `global.md#Current State`：影响后续恢复的实现状态。
+- `areas/*.md#User / System Contract`：模块可见行为、API、CLI、UI 或数据契约。
+- `areas/*.md#Module Design Capsule`：模块级 principles、design logic 和会影响后续判断的 rationale。
+- `areas/*.md#Core Data / API / State`：关键数据结构、接口、状态流或规则。
+- `areas/*.md#Key Constraints`：性能、安全、兼容、集成或维护约束。
+- role=`contract` Context：跨域 API / schema / event / interface 语义及其 durable rationale。
+- role=`decision-rationale` Context：更大或跨切面的稳定设计原因。
+- `areas/*.md#Code Entry Points`：未来 agent 需要快速定位的代码入口。
+- `areas/*/verification.md` 或 role=`verification` Context：关键测试、smoke、CI、probe 或验证重复执行路径。
+- `areas/*/deployment.md` 或 role=`deployment` Context：关键部署、云端初始化、运行拓扑、健康检查或回滚重复执行路径。
+- `DESIGN.md`：视觉 identity、design token 和视觉 rationale。
+- `project_context/context.toml`：复杂项目的产品域 area/context_unit、role、触发词、按需读取策略和可选边界规则。