project-tiny-context-harness (npm) 0.2.64 → 0.2.66

Files changed (5)

package/README.md +6 -4
package/assets/README.md +6 -4
package/assets/README.zh-CN.md +3 -3
package/assets/skills/plan_acceptance_checklist_compiler/SKILL.md +85 -31
package/package.json +1 -1

package/README.md CHANGED

@@ -80,12 +80,14 @@ workflow contract + project_context/** -> implementation -> verification -> drif
 For long-running tasks, externalize the target first:
 ```text
-Web GPT or another external planning model produces a plan
+Web GPT or another external planning model produces a two-document upstream input
 -> plan acceptance checklist Skill produces a goal/target-mode prompt
 -> Superpowers derives concrete implementation slices
 -> each slice follows the workflow contract + project_context/**
 ```
+For best target-mode results, the two-document upstream input should be a `Development Plan` for execution direction and an `Acceptance and Tests` packet for acceptance authority. The development plan preserves the original requirement source and implementation direction; the acceptance packet supplies ACs, required evidence, tests, real product/core paths, evidence layers, invalid evidence, state-machine rules, local audit requirements and blockers. Source Pack exports are temporary upload material for external planning, not durable Context.
 The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
 The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
@@ -139,7 +141,7 @@ npm ci
 npm run smoke:quickstart
 npm run preview:pack
 cd /path/to/your/test-repo
-npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
+npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.66.tgz
 npx --no-install ty-context init --adopt
 make validate-context
 ```
@@ -253,7 +255,7 @@ Use `npx --no-install ty-context ...` only when you explicitly want the already
 | Product Surface Contract Skill | `<harnessRoot>/skills/context_surface_contract/SKILL.md` | Handles explicit Product Surface Contract, Screen Contract, surface responsibility and main/drilldown ownership work; it compiles project-owned surface contracts into `project_context/**` without adding a new context role or gate. |
 | Full project context export Skill | `<harnessRoot>/skills/context_full_project_export/SKILL.md` | Handles explicit full-project, Source Pack or code-level export requests and uses `export-context --source-pack`, `--code-index`, `--task-context`, `--all`, `--full` or `--code` to create temporary artifacts under `tmp/ty-context/context-exports/**`. |
 | Harness upgrade Skill | `<harnessRoot>/skills/context_harness_upgrade/SKILL.md` | Handles explicit Tiny Context / Project Tiny Context Harness upgrade requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project”; it runs the canonical `upgrade` path, handles only migration-scoped `manual_required` / `blocked` follow-up, then runs diagnostics. |
-| Plan acceptance checklist Skill | `<harnessRoot>/skills/plan_acceptance_checklist_compiler/SKILL.md` | Handles explicit requests to turn a referenced plan, RFC or implementation proposal into a falsifiable acceptance checklist and paste-ready goal/target-mode prompt under `tmp/ty-context/plan-acceptance/**`; if the plan already contains an explicit concrete checklist, the Skill reuses it verbatim in the separate full-checklist file; generated prompts may reference a full checklist as the authoritative acceptance standard and use compact summaries only for navigation/priority, but the Skill does not execute the plan or prove completion. |
+| Plan acceptance checklist Skill | `<harnessRoot>/skills/plan_acceptance_checklist_compiler/SKILL.md` | Handles explicit requests to turn a referenced plan, RFC, implementation proposal or two-document upstream input into a falsifiable acceptance checklist and paste-ready goal/target-mode prompt under `tmp/ty-context/plan-acceptance/**`; if the plan already contains an explicit concrete checklist, the Skill reuses it verbatim in the separate full-checklist file; generated prompts may reference a full checklist as the authoritative acceptance standard and use compact summaries only for navigation/priority, but the Skill does not execute the plan or prove completion. |
 | Project-local Skills | `<harnessRoot>/skills/<role>/SKILL.md` | Optional local product/design/development Skills created by the project, such as `product_plan`, `uiux_design` or `development_engineer`. They supersede package-managed default Skills when more specific, are not overwritten by `sync`, and should keep front matter trigger keywords aligned with the project `AGENTS.md` role-trigger rule. |
 | Managed file sync | `make ty-context-sync` or `npx --yes --package project-tiny-context-harness@latest ty-context sync` | Refreshes package-managed guidance, default Skills, Makefile include, context templates, tools and workflow YAML. It does not run migrations or perform semantic Context generation; it may block only direct asset-refresh safety issues such as invalid managed blocks or deprecated managed Skill overrides. |
 | Upgrade | `make ty-context-upgrade` or `npx --yes --package project-tiny-context-harness@latest ty-context upgrade` | Use for releases marked `upgrade-required` or `manual-required`. Builds an upgrade plan, stops before writes when `blocked` items exist, otherwise applies `safe_pending` migrations, runs `sync` and `doctor`, and exits non-zero when manual follow-up or diagnostics remain. |
@@ -275,7 +277,7 @@ For high-risk product, UI/UX and engineering tasks that affect durable architect
 Technical architecture support is a Minimal Context capability: use restrained `architecture.md`, area Module Design Capsules and existing `contract` / `decision-rationale` roles when durable architecture or rationale matters. Do not invent rationale; store stable reasons, rejected alternatives or tradeoffs only in the smallest durable Context surface when they will affect future implementation or verification choices.
-For long-running plans, RFCs or implementation proposals, the plan acceptance checklist Skill can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. It is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. When the prompt references a full checklist, that checklist is the acceptance authority; compact prompt text is only navigation, priority and recovery guidance.
+For long-running plans, RFCs or implementation proposals, the plan acceptance checklist Skill can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. It also supports a two-document upstream input from Web GPT or another external planner: `Development Plan` for execution direction and `Acceptance and Tests` for target-mode acceptance input. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. The two-document packet path is strict mode: when required fields cannot be fully parsed from both documents, the compiler preserves the inputs, reports the missing fields, and stops without generating a checklist or goal/target-mode prompt. It is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. When the prompt references a full checklist, that checklist is the acceptance authority; compact prompt text is only navigation, priority and recovery guidance.
 For Product Surface work, `context_surface_contract` turns broad product/page/UI principles into project-owned surface responsibilities. A Product Surface can be a Web page, mobile screen, desktop window, game UI/HUD/menu, CLI/TUI output, extension UI or embedded/device interface. Cross-surface contracts use the existing `contract` role; area-owned screen facts stay in `area` or `subdomain`; repeatable validation paths use `verification`. The Harness does not add a new surface-specific role, does not create business surface contracts during `init` or `upgrade`, and does not turn surface conformance into a validator gate. Projects that want mandatory task blocks should add a separate project-local Skill, while `product-surface-contract.md` is only a compact managed template for optional Context authoring.

package/assets/README.md CHANGED

@@ -94,7 +94,7 @@ That smoke packs the local workspace, installs it into a disposable repo, runs `
 ```sh
 npm run preview:pack
 cd /path/to/your/test-repo
-npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
+npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.66.tgz
 npx --no-install ty-context init --adopt
 make validate-context
 ```
@@ -124,12 +124,14 @@ workflow contract + project_context/** -> implementation -> verification -> drif
 For long-running tasks, externalize the target first:
 ```text
-Web GPT or another external planning model produces a plan
+Web GPT or another external planning model produces a two-document upstream input
 -> plan acceptance checklist Skill produces a goal/target-mode prompt
 -> Superpowers derives concrete implementation slices
 -> each slice follows the workflow contract + project_context/**
 ```
+For best target-mode results, the two-document upstream input should be a `Development Plan` for execution direction and an `Acceptance and Tests` packet for acceptance authority. The development plan summarizes the original requirement source and implementation direction; the acceptance packet supplies ACs, required evidence, tests, real product/core paths, evidence layers, invalid evidence, state-machine rules, local audit requirements and blockers. Source Pack exports are temporary upload material for external planning, not durable Context.
 The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
 The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
@@ -290,7 +292,7 @@ No. It checks that recovery facts exist and avoids fake test-result claims. Prod
 It should stay smaller than a full process. Ordinary bug fixes and local refactors do not update Context unless they produce durable product, architecture, API, state or validation facts.
-The default Skills are Minimal Context helpers for explicit product-planning, UI/UX-design, development-engineering, Product Surface Contract, full-project-export, Tiny Context upgrade and plan-acceptance-checklist requests. Product, screen-flow, surface responsibility and durable engineering conclusions go to `project_context/**`; visual identity and design tokens go to root `DESIGN.md`. Export artifacts are temporary files under `tmp/ty-context/context-exports/**`, not Context. Plan acceptance artifacts are temporary files under `tmp/ty-context/plan-acceptance/**`; they define completion criteria for a referenced plan but do not execute it or prove acceptance. If the plan already contains an explicit concrete checklist, the Skill reuses that checklist verbatim in the separate full-checklist file. When a generated prompt references a full checklist, that checklist is the authoritative acceptance standard; the compact prompt summary is only navigation and priority guidance. The Harness upgrade Skill handles requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project” by following the release update mode, using `upgrade` for migration-bearing releases, and limiting manual cleanup to migration-scoped follow-up.
+The default Skills are Minimal Context helpers for explicit product-planning, UI/UX-design, development-engineering, Product Surface Contract, full-project-export, Tiny Context upgrade and plan-acceptance-checklist requests. Product, screen-flow, surface responsibility and durable engineering conclusions go to `project_context/**`; visual identity and design tokens go to root `DESIGN.md`. Export artifacts are temporary files under `tmp/ty-context/context-exports/**`, not Context. Plan acceptance artifacts are temporary files under `tmp/ty-context/plan-acceptance/**`; they define completion criteria for a referenced plan but do not execute it or prove acceptance. If the plan already contains an explicit concrete checklist, the Skill reuses that checklist verbatim in the separate full-checklist file. For a two-document upstream input, the external planner should provide a `Development Plan` and an `Acceptance and Tests` packet; the compiler preserves both source roles and, only when strict-mode required fields are fully parseable from both documents, turns them into the full checklist plus target prompt. When a generated prompt references a full checklist, that checklist is the authoritative acceptance standard; the compact prompt summary is only navigation and priority guidance. The Harness upgrade Skill handles requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project” by following the release update mode, using `upgrade` for migration-bearing releases, and limiting manual cleanup to migration-scoped follow-up.
 Multilingual trigger phrases are compatibility details. Public README, npm and launch copy stay English-first, and public/package-managed surfaces must remain English-complete; literal non-English examples are documented only where they explain generated Skill matching and must not be the sole activation path.
@@ -298,7 +300,7 @@ For high-risk product, UI/UX and engineering tasks that affect durable architect
 Technical architecture support is a Minimal Context capability: use restrained `architecture.md`, area Module Design Capsules and existing `contract` / `decision-rationale` roles when durable architecture or rationale matters. Do not invent rationale; store stable reasons, rejected alternatives or tradeoffs only in the smallest durable Context surface when they will affect future implementation or verification choices.
-For long-running plans, RFCs or implementation proposals, the plan acceptance checklist compiler can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
+For long-running plans, RFCs or implementation proposals, the plan acceptance checklist compiler can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. It also supports a two-document upstream input from Web GPT or another external planner: `Development Plan` for execution direction and `Acceptance and Tests` for target-mode acceptance input. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. The two-document packet path is strict mode: when required fields cannot be fully parsed from both documents, the compiler preserves the inputs, reports the missing fields, and stops without generating a checklist or goal/target-mode prompt. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
 Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff works well for short tasks, but long tasks with large context windows, multiple handoffs or many verification loops are expected to drift unless the completion target is externalized. Use the plan acceptance checklist compiler before long-running execution when there is a plan-like source; treat the full checklist as the acceptance target, and treat the local audit only as temporary progress/recovery state.

package/assets/README.zh-CN.md CHANGED

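@@ -50,7 +50,7 @@ A fresh agent reads these files first, then starts changing code.
 A typical failure scenario is an ABCD module chain: A/B/C are upstream and D is downstream. While implementing a requirement for D, a capability gap appears; without Context and priority constraints, the agent can easily modify upstream A/B just to get D finished, because the current code makes that path workable. The real questions are: does D have the authority to change A/B? Does the gap belong to C's contract? Must a `Context Delta` be declared first, so that the change in project intent is confirmed before implementation? Code can show "what can be changed right now", but not "whether project intent allows this change". Tiny Context fills in exactly this layer of long-lived in-repo facts and priority contracts.
-For long-running tasks, the Harness also provides a lightweight plan acceptance checklist Skill: when the user explicitly provides or references a proposal / plan / RFC / implementation plan and asks for an acceptance checklist, completion definition or goal/target-mode prompt, it temporarily stores the plan and acceptance checklist under `tmp/ty-context/plan-acceptance/**`. If the plan already contains an explicit, concrete "acceptance checklist", the Skill reuses that checklist directly and writes it to a separate full acceptance checklist file instead of generating a competing checklist. This is only a one-time pre-execution pass over the acceptance criteria: it does not execute the plan, does not prove completion, and does not register the temporary checklist as `project_context/**`.
+For long-running tasks, the Harness also provides a lightweight plan acceptance checklist Skill: when the user explicitly provides or references a proposal / plan / RFC / implementation plan and asks for an acceptance checklist, completion definition or goal/target-mode prompt, it temporarily stores the plan and acceptance checklist under `tmp/ty-context/plan-acceptance/**`. If an external planning model is involved, the recommendation is still to supply only two artifacts: `《开发方案》` (Development Plan) as execution direction and `《验收清单和测试用例》` (Acceptance Checklist and Test Cases) as the Codex target-mode acceptance input packet. The second document should contain ACs, required evidence, test commands, real product paths / core paths, evidence layers, invalid evidence, the state machine, local audit and blockers. A Source Pack is only temporary upload material, not durable Context. If the plan already contains an explicit, concrete "acceptance checklist", the Skill reuses that checklist directly and writes it to a separate full acceptance checklist file; the two-document input packet runs in strict mode: if the required fields cannot be fully parsed from the two documents, or the second document lacks necessary fields such as required evidence, verification method, fail condition, state machine or invalid evidence rules, the Skill stops and lists the missing items without generating the full acceptance checklist or the target-mode text. This is only a one-time pre-execution pass over the acceptance criteria: it does not execute the plan, does not prove completion, and does not register the temporary checklist as `project_context/**`.
 Important usage note: Minimal Context intentionally keeps Context read order, Context / code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. This tradeoff suits short tasks, but drift is expected with long tasks, large contexts, multiple handoffs or many verification rounds. For such tasks, when a proposal/plan source already exists, first use the plan acceptance checklist Skill to externalize a falsifiable completion target; the full acceptance checklist is the acceptance standard, and the local audit is only temporary progress/recovery state.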
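@@ -65,13 +65,13 @@ A fresh agent reads these files first, then starts changing code.
 For long-running tasks, externalize the target first, then move into implementation:
 ```text
-Web GPT or another external planning model produces a plan
+Web GPT or another external planning model produces two artifacts: 《开发方案》 (Development Plan) + 《验收清单和测试用例》 (Acceptance Checklist and Test Cases)
 -> plan acceptance checklist Skill generates the target-mode text
 -> Superpowers derives concrete implementation slices
 -> each slice returns to the workflow contract + project_context/**
 ```
-Here Superpowers refers to the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/open-source workflow, not a generic substitute for execution planning. If the target-mode text or the original plan is still not executable enough, use `superpowers:writing-plans` to turn it into a bite-sized implementation plan; prefer `superpowers:subagent-driven-development` when subagent support is available, otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` when behavior changes are involved.
+Here Superpowers refers to the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/open-source workflow, not a generic substitute for execution planning. If the target-mode text or the original plan is still not executable enough, use `superpowers:writing-plans` to turn it into a bite-sized implementation plan; prefer `superpowers:subagent-driven-development` when subagent support is available, otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` when behavior changes are involved; before declaring completion, use `superpowers:verification-before-completion` to gate on the full acceptance checklist and fresh evidence.
 The reason is drift control. The workflow contract + Context layer is a soft constraint that usually keeps the agent on track in short tasks; in long-running tasks, Context can still record the expected facts, but the Context-to-code implementation step drifts as the context window grows and as handoffs, subagent splits and verification rounds multiply. The Web GPT plan, target-mode text, full acceptance checklist and Superpowers execution layer externalize the completion target into a recoverable, auditable temporary execution standard, without restoring phase-style gates.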

package/assets/skills/plan_acceptance_checklist_compiler/SKILL.md CHANGED

@@ -27,10 +27,19 @@ This Skill is generic. Do not embed business-specific rules, vendor-specific rul
 Every completed invocation must produce:
-1. A preserved copy of the plan under `tmp/ty-context/plan-acceptance/`. The copied plan is the implementation/source plan, not acceptance proof.
+1. A preserved copy of the plan under `tmp/ty-context/plan-acceptance/`. The copied plan is the implementation/source plan, not acceptance proof. For a two-document upstream input packet, preserve both source inputs separately when both are provided.
 2. A rigorous full acceptance checklist under a separate `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` path. If the plan contains an explicit concrete acceptance checklist, reuse that plan-provided checklist verbatim; otherwise derive the checklist from the plan and relevant project Context. The full checklist is the complete acceptance standard and owns required automated test evidence.
 3. A goal/target-mode prompt the user can paste directly into Codex. The prompt may include a compact checklist summary for direction, priority and recovery navigation, but the full checklist owns the acceptance details.
 4. When a local audit path is referenced, it is temporary execution/progress state only, not Context and not proof by itself.
+The compiler may receive one source plan or a two-document upstream input packet:
+- `Development Plan / 开发方案`: objective, original requirement source summary, execution direction, module/API/UI/runtime/data flow, risk boundaries, Context Delta and whether execution must first be converted into bite-sized tasks.
+- `Acceptance and Tests / 验收清单和测试用例`: ACs, required evidence, tests, real product paths, core paths, evidence layers, invalid evidence, completion state machine, local audit rules and blockers.
+These inputs are preserved source input, not proof. The compiler's job is to turn them into the full checklist and target prompt that Superpowers can execute against without losing acceptance authority.
+Two-document upstream input packet handling is strict mode. If the compiler cannot fully parse the required content from both documents, it must stop and ask the user to provide the missing required fields. Do not generate a checklist or goal/target-mode prompt from an incomplete two-document packet.
 Exception: if the Context confirmation gate below triggers, stop after materializing the plan and reading enough Context to explain the uncertainty. Ask the user for confirmation before producing the checklist or goal/target-mode prompt.
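Output 1 above asks for both source inputs of a two-document packet to be preserved separately. The following Python sketch illustrates one way that step might look. It is not the Skill's implementation: `safe_plan_name`, `materialize_packet` and the `-development-plan` / `-acceptance-and-tests` filename suffixes are hypothetical choices made for illustration. The only details taken from the Skill are the `tmp/ty-context/plan-acceptance/` directory, the rule that content is preserved unedited, and the filename rule stated in Step 1 (lowercase letters, digits, hyphens and `.md`).

```python
import re
import tempfile
from pathlib import Path

# Directory named by the Skill; everything else in this sketch is illustrative.
PLAN_DIR = Path("tmp/ty-context/plan-acceptance")


def safe_plan_name(title: str) -> str:
    """Hypothetical helper: keep lowercase letters, digits and hyphens, then add `.md`."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Fall back to a generic name when the title has no ASCII letters or digits.
    return f"{slug or 'plan'}.md"


def materialize_packet(repo_root: Path, topic: str,
                       development_plan: str, acceptance_and_tests: str) -> list[Path]:
    """Write both documents as separate preserved source inputs."""
    target_dir = repo_root / PLAN_DIR
    target_dir.mkdir(parents=True, exist_ok=True)  # create the directory if missing
    outputs = []
    for role, text in (("development-plan", development_plan),
                       ("acceptance-and-tests", acceptance_and_tests)):
        path = target_dir / safe_plan_name(f"{topic} {role}")
        # No summarizing, reordering or translating: the text is written unchanged.
        path.write_bytes(text.encode("utf-8"))
        outputs.append(path)
    return outputs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for written in materialize_packet(Path(tmp), "Checkout Flow v2",
                                          "# Development Plan\n...",
                                          "# Acceptance and Tests\n..."):
            print(written.relative_to(tmp))
```

Writing bytes rather than text avoids platform newline translation, which keeps the preserved copy identical to what the user supplied.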
@@ -117,33 +126,52 @@ audit this plan for acceptance criteria
 Do not trigger from standalone broad phrases such as `goal mode`, `target mode`, `acceptance criteria`, `completion definition`, `目标模式文本`, `验收标准`, or `完成定义` unless the same user request also identifies a plan-like source.
-## Input Priority
-Build the checklist from these sources, in order:
-1. User's current instruction.
-2. The referenced or pasted plan.
-3. Relevant durable project Context under `project_context/**`.
-4. Repository guidance such as `AGENTS.md`, `README.md`, `DESIGN.md`, and relevant local Skills.
-5. Current code and tests, only to identify real surfaces, commands, routes, schemas, artifacts, entry points, and likely verification paths.
-6. Existing reports or artifacts, only to identify required evidence and invalidation risks. Existing artifacts are not proof unless the user explicitly asks to audit current completion.
-If plan and Context conflict, preserve the conflict in the checklist. Do not silently choose the easier side.
-## Step 1: Materialize The Plan Under `tmp/ty-context/plan-acceptance/`
-Before writing the checklist, write the user-specified plan into the repository `tmp/ty-context/plan-acceptance/` directory.
-Rules:
+## Input Priority
+Build the checklist from these sources, in order:
+1. User's current instruction.
+2. The referenced or pasted plan, including both documents when the user provides a two-document upstream input packet.
+3. Relevant durable project Context under `project_context/**`.
+4. Repository guidance such as `AGENTS.md`, `README.md`, `DESIGN.md`, and relevant local Skills.
+5. Current code and tests, only to identify real surfaces, commands, routes, schemas, artifacts, entry points, and likely verification paths.
+6. Existing reports or artifacts, only to identify required evidence and invalidation risks. Existing artifacts are not proof unless the user explicitly asks to audit current completion.
+If plan and Context conflict, preserve the conflict in the checklist. Do not silently choose the easier side.
+## Upstream Input Packet
+When the user provides two related documents, treat them as a two-document upstream input packet when their roles match these shapes:
+- `Development Plan / 开发方案`: execution direction for implementation. It can contain the original requirement source or a summary of the original plan so later implementation does not shrink the target.
+- `Acceptance and Tests / 验收清单和测试用例`: target-mode acceptance input packet. It can contain the acceptance matrix, test requirements, real product paths, evidence layers, invalid evidence rules, state-machine rules, local audit requirements and blockers.
+Rules:
+- Do not create a third upstream artifact. Preserve the source input documents and compile a separate full checklist.
+- If only one plan-like source is provided, keep the existing single-plan flow.
+- If both documents are provided, read both before deciding whether to reuse or generate the full checklist.
+- The development plan is execution direction, not proof. The acceptance-and-tests document is acceptance input, not proof.
+- In the full checklist's `Plan source` and per-row `Source` fields, preserve whether a requirement came from the development plan, the acceptance-and-tests document, user instruction, project Context, code contract or verification risk.
+- If the development plan is not bite-sized enough for direct execution, the generated prompt must require `superpowers:writing-plans` before implementation.
+- This two-document upstream input packet path is strict mode. The compiler must be able to parse the objective or original requirement source summary, implementation/source plan, acceptance items, required evidence, verification methods, fail conditions, required tests or explicit test scope, core paths or explicit non-UI/runtime scope, state-machine rules, invalid evidence rules, local audit expectations and blockers or explicit no-blocker statement.
+- If any required field cannot be fully parsed from the two documents, stop after preserving the inputs. Do not generate a checklist or goal/target-mode prompt. Tell the user which missing required fields must be added to `Development Plan / 开发方案` or `Acceptance and Tests / 验收清单和测试用例`.
-- If `tmp/ty-context/plan-acceptance/` does not exist, create it.
-- If the user references a file outside `tmp/ty-context/plan-acceptance/`, copy its current content into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
-- If the user references a file already under `tmp/ty-context/plan-acceptance/`, use that path directly unless the user asks for a new copy.
-- If the user pasted the plan text, write the pasted plan exactly into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
-- Preserve the plan content. Do not summarize, normalize, reorder, translate, or edit it while materializing it.
-- Use a stable readable filename derived from the plan title, source filename, or user topic. Use lowercase letters, digits, hyphens, and `.md`.
+## Step 1: Materialize The Plan Under `tmp/ty-context/plan-acceptance/`
+Before writing the checklist, write the user-specified source input into the repository `tmp/ty-context/plan-acceptance/` directory.
+Rules:
+- If `tmp/ty-context/plan-acceptance/` does not exist, create it.
+- If the user references a file outside `tmp/ty-context/plan-acceptance/`, copy its current content into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
+- If the user references a file already under `tmp/ty-context/plan-acceptance/`, use that path directly unless the user asks for a new copy.
+- If the user pasted the plan text, write the pasted plan exactly into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
+- If the user pasted or referenced a two-document upstream input packet, write `Development Plan / 开发方案` and `Acceptance and Tests / 验收清单和测试用例` as separate preserved source inputs under stable readable filenames.
+- Preserve the plan content. Do not summarize, normalize, reorder, translate, or edit it while materializing it.
+- Use a stable readable filename derived from the plan title, source filename, or user topic. Use lowercase letters, digits, hyphens, and `.md`.
 - If the source plan cannot be found or read, stop and report the missing source. Do not invent a plan.
-- The materialized plan is temporary implementation/source input. It is not durable Context, not acceptance proof and not proof that any acceptance item passed.
+- The materialized source input is temporary implementation/source input. It is not durable Context, not acceptance proof and not proof that any acceptance item passed.
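The filename rule above (lowercase letters, digits, hyphens, and `.md`) can be sketched as a small helper. This is only an illustration: the function name `safe_plan_filename` and the `plan` fallback are assumptions, not part of the package.

```python
import re


def safe_plan_filename(title: str) -> str:
    """Derive a filename that uses only lowercase letters, digits, hyphens and .md."""
    # Collapse every run of other characters into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    # Assumption: fall back to a generic name when nothing usable remains,
    # for example when the title contains no ASCII letters or digits.
    return f"{slug or 'plan'}.md"


print(safe_plan_filename("Checkout Flow v2: Plan"))  # checkout-flow-v2-plan.md
```

A title written entirely in non-ASCII characters falls back to `plan.md` in this sketch, so a real implementation would probably want a transliteration step or a user-supplied topic instead.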
 Recommended paths:
@@ -161,6 +189,8 @@ After materializing the plan, inspect it for an explicit concrete acceptance che
 Plan-provided checklist reuse applies only when the plan contains a clearly labeled checklist or checklist table section, such as `Acceptance Checklist`, `Acceptance Criteria`, `验收清单`, `验收标准`, or equivalent heading, and that section contains concrete acceptance items rather than only saying that acceptance is needed.
+For a two-document upstream input packet, the `Acceptance and Tests / 验收清单和测试用例` document is the preferred acceptance source, but it is not automatically a frozen verbatim full checklist. Reuse it as the full checklist only if it already includes enough acceptance structure for target-mode execution: AC ID, scope, required evidence, verification method, fail condition, state-machine rules and invalid evidence rules. If any of those are missing, strict mode applies: stop, list the missing required fields, and ask the user to provide a complete acceptance-and-tests packet.
 When a plan-provided checklist is found:
 - Copy the plan-provided acceptance checklist section into `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` as the full checklist.
@@ -176,6 +206,8 @@ When a plan-provided checklist is found:
 When no explicit concrete plan-provided checklist exists, continue with the generated-checklist flow below.
+When a two-document upstream input packet exists but either document is incomplete, do not continue with the generated-checklist flow below. Strict mode means incomplete two-document input cannot be repaired by inference. Do not generate a checklist or goal/target-mode prompt; report the missing required fields and wait for the user to provide a complete packet.
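A minimal sketch of this strict-mode gate, assuming the two documents have already been parsed into a dictionary. The field keys and the function names are hypothetical labels for the required fields listed in the skill; the skill itself does not define a data format.

```python
# Hypothetical keys, one per required field named by the strict-mode rule.
REQUIRED_FIELDS = (
    "objective",
    "implementation_plan",
    "acceptance_items",
    "required_evidence",
    "verification_methods",
    "fail_conditions",
    "required_tests",
    "core_paths",
    "state_machine_rules",
    "invalid_evidence_rules",
    "local_audit_expectations",
    "blockers",
)


def missing_required_fields(parsed: dict) -> list:
    """Return required fields that are absent or empty, in declaration order."""
    # An explicit statement such as "no blockers" is non-empty and therefore counts;
    # an empty string or empty list does not, because nothing may be inferred.
    return [name for name in REQUIRED_FIELDS if not parsed.get(name)]


def strict_mode_gate(parsed: dict) -> dict:
    """Stop and report missing fields, or allow the compiler to continue."""
    missing = missing_required_fields(parsed)
    return {"action": "stop" if missing else "continue", "missing": missing}
```

The gate never fills a gap itself: when anything is missing it only reports the field names, which matches the rule that incomplete two-document input cannot be repaired by inference.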
 When a plan already includes explicit test requirements but not a full acceptance checklist:
 - Use those plan-provided test requirements as the source of truth for the generated checklist's `Required automated tests / 必须新增或补强的自动化测试` section.
@@ -281,6 +313,8 @@ Allowed `Conclusion` values:
 - `proven`
 - `unproven`
+- `partial`
+- `invalidated`
 - `stale-evidence`
 - `runtime-disconnected`
 - `implementation-drift`
@@ -289,6 +323,8 @@ Allowed `Conclusion` values:
 Missing ledger evidence means incomplete, not complete. Do not let missing evidence, old evidence, partial evidence or evidence from a different read path satisfy a current-state claim.
+Every core AC starts with status `unknown / not_run` until current required evidence has been inspected. Only fresh required evidence can support `complete`. Any fresh browser / API / runtime / data / test contradiction must immediately downgrade the affected AC and the overall status, and the local audit must record the contradiction as invalidating evidence. Do not preserve a previous `complete` status after contradictory current evidence appears.
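The status rule above can be read as a small state machine. The sketch below is one possible reading, not the package's implementation: the class and function names are hypothetical, and using `invalidated` as the downgrade target is an assumption, since the skill only says the status must be downgraded.

```python
from dataclasses import dataclass, field

INITIAL_STATUS = "unknown / not_run"


@dataclass
class AcState:
    ac_id: str
    status: str = INITIAL_STATUS
    invalidating_evidence: list = field(default_factory=list)


def record_evidence(ac, *, fresh, required, contradicts=False, note=""):
    """Update one AC from a single piece of evidence."""
    if fresh and contradicts:
        # Assumption: downgrade to "invalidated"; the note is kept for the local audit.
        ac.status = "invalidated"
        ac.invalidating_evidence.append(note)
    elif fresh and required:
        ac.status = "complete"
    # Stale or non-required evidence never changes the status.
    return ac


def overall_status(acs):
    """The overall status is complete only when every core AC is complete."""
    if acs and all(ac.status == "complete" for ac in acs):
        return "complete"
    return "incomplete"
```

This sketch does not model how a contradiction is resolved; it only shows that a previous `complete` does not survive contradictory fresh evidence and that the overall status follows the weakest AC.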
 ### Required Automated Tests / 必须新增或补强的自动化测试
 Every generated full checklist must include a `Required automated tests / 必须新增或补强的自动化测试` section. Test requirements are acceptance evidence: a future executor cannot mark a relevant implementation item complete when required test evidence is missing.
@@ -311,6 +347,8 @@ Rules:
 - If no explicit test section exists, derive required tests from the plan's behavior changes, risk, Context contracts and real code/test surfaces.
 - If an exact test name cannot be inferred safely, write a behavior-level test description and do not invent exact test names.
 - Each required test row must identify the covered acceptance item(s), the verification command, and the failure condition that blocks acceptance.
+- For behavior changes, test requirements should identify the expected RED/GREEN or pass/fail signal when that can be inferred safely.
+- When a test is auxiliary evidence only, state which acceptance layer it supports and what it cannot prove.
 - If no new or strengthened automated tests are required, state that explicitly with the reason and the acceptance items covered by existing verification.
 - The local audit must record each required test's command, result and failure reason when it is run or when it remains blocked.
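As an illustration of the row rules above, a required test row could be modeled as a record and checked for the mandatory parts. All names here (`RequiredTestRow`, `row_problems` and the field names) are hypothetical; the skill describes the row content but not a schema.

```python
from dataclasses import dataclass


@dataclass
class RequiredTestRow:
    description: str          # behavior-level description; no invented test names
    covered_acs: list         # acceptance items this test covers
    command: str              # verification command
    failure_condition: str    # failure that blocks acceptance
    auxiliary_only: bool = False
    supports_layer: str = ""  # acceptance layer an auxiliary test supports
    cannot_prove: str = ""    # what an auxiliary test cannot prove


def row_problems(row: RequiredTestRow) -> list:
    """List the rule violations for one required test row."""
    problems = []
    if not row.covered_acs:
        problems.append("missing covered acceptance items")
    if not row.command.strip():
        problems.append("missing verification command")
    if not row.failure_condition.strip():
        problems.append("missing failure condition")
    if row.auxiliary_only and not (row.supports_layer and row.cannot_prove):
        problems.append("auxiliary test must state its layer and what it cannot prove")
    return problems
```

An empty result means the row carries everything the rules ask for; it says nothing about whether the test has been run or passed.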
@@ -370,10 +408,11 @@ Consider these generic dimensions:
 - Code implementation behavior.
 - API or interface contract.
 - Data model, schema, migration, or persistence.
-- Runtime state, configuration, session, credential, environment, or degraded behavior.
-- Artifact generation, schema, freshness, provenance, and acceptance.
-- UI or user-visible projection.
-- Async job, worker, scheduler, queue, or background process.
+- Runtime state, configuration, session, credential, environment, or degraded behavior.
+- Artifact generation, schema, freshness, provenance, and acceptance.
+- UI or user-visible projection.
+- Real product path or core path, including page/user flow, API route/probe, runtime/worker/job/artifact path, and whether Browser/Chrome verification is required.
+- Async job, worker, scheduler, queue, or background process.
 - Security, privacy, redaction, secrets, and access control.
 - Observability, logs, diagnostics, and operator visibility.
 - Performance, timeout, boundedness, pagination, and resource budget.
@@ -393,6 +432,7 @@ runtime exercised
 artifact generated
 artifact accepted by validator
 API/UI reflects accepted evidence
+browser path verified
 final gate/check command passed
 ```
@@ -455,12 +495,15 @@ For evidence-ledger plans, keep the traps generic and cover these cases when rel
 - Code-only changes without current execution or acceptance evidence.
 - UI/API shell behavior without the backing data, runtime or artifact evidence required by the checklist.
+- UI-facing acceptance without the real page path and matching user-visible state.
 - Stale artifacts or stale runtime evidence.
 - Evidence from a mismatched read path, service path, artifact path or runtime instance.
 - Unexercised runtime or unexercised fallback behavior.
 - Partial tests, smoke-only checks or dry runs when the plan requires broader current proof.
 - API/UI/data/test contradictions that remain unresolved.
+For UI-facing acceptance, component / viewmodel / mock / unit test evidence is insufficient unless the real page path is opened and the user-visible state matches the acceptance item.
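The UI-facing rule can be sketched as a classifier over collected evidence. The evidence kinds and the dictionary shape are assumptions made for this example, and the sketch ignores the case where the full checklist explicitly allows auxiliary evidence.

```python
# Hypothetical labels for evidence that is only auxiliary for UI-facing items.
AUXILIARY_KINDS = {"component", "viewmodel", "mock", "unit-test"}


def classify_ui_evidence(evidence: list) -> str:
    """Classify evidence for one UI-facing acceptance item.

    Each entry is a dict with a 'kind' key; real-page entries also carry
    'state_matches', which is True when the user-visible state matches the item.
    """
    for entry in evidence:
        if entry["kind"] == "real-page" and entry.get("state_matches", False):
            return "sufficient"
    if evidence and all(entry["kind"] in AUXILIARY_KINDS for entry in evidence):
        return "auxiliary-only"
    return "insufficient"
```

Only an opened real page whose visible state matches yields `sufficient`; any number of component, viewmodel, mock or unit test results stays `auxiliary-only`.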
 ## Suggested Execution Order
 Suggest an execution order that prioritizes the highest-risk proof first:
@@ -505,12 +548,15 @@ Hard requirements:
 - The prompt must identify the full checklist path immediately after the plan path and say it is the complete acceptance standard. Chinese prompts must include this exact sentence: `该文件是完整验收标准，验收以这个为准。完成前必须逐项检查，不满足则继续实现。` English prompts must say the full checklist is the complete acceptance standard, acceptance is judged against it, and every item must be checked before completion.
 - The prompt must identify a local audit path, normally `tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md`, and require the future executor to read it before resuming, keep it current during execution, and use it only as target-mode acceptance progress state.
 - After the plan/checklist/audit paths, include a resource lifecycle instruction: `可多开agent，agent名额不够了就关掉不用的。` for Chinese prompts or `You may use multiple agents; if agent slots run low, close idle or unnecessary agents.` for English prompts.
+- The prompt must include mandatory inputs: original requirement source or original plan summary, implementation/source plan, full acceptance checklist, local audit, relevant Context, and required tests / core paths.
 - The prompt must include a Superpowers execution block. If Superpowers is not installed, tell the executor to install it through the current platform's official Superpowers installation path; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion. If Superpowers is installed, Use Superpowers for this task.
-- The Superpowers block must require: read the full checklist first and make it the acceptance authority; use `superpowers:writing-plans` when the plan is not executable enough; prefer `superpowers:subagent-driven-development` when subagents are available; otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; review / finish cannot override the full checklist; update local audit after each execution round.
+- The Superpowers block must require: read the full checklist first and make it the acceptance authority; use `superpowers:writing-plans` when the plan is not executable enough; prefer `superpowers:subagent-driven-development` when subagents are available; otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; use `superpowers:verification-before-completion` before any completion claim; review / finish cannot override the full checklist; update local audit after each execution round.
 - The remaining content must be the acceptance checklist or a compact version of it.
 - The prompt must be self-contained enough for goal/target-mode execution.
 - If the prompt uses a compact checklist summary, say the full checklist owns details and acceptance authority; the compact summary owns direction, priority and recovery navigation; overlap is allowed; conflicts are resolved in favor of the full checklist.
 - The prompt must require the local audit to record overall status (`complete`, `incomplete`, `blocked` or `narrowed-scope-complete`), each core AC status and current evidence, commands with result/time/failure reason, artifact or evidence paths, blockers and missing evidence, acceptance impact, explicit deferred or narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion.
+- The prompt must require local audit status to start from `unknown / not_run`; only fresh required evidence can mark an AC complete. If any fresh browser / API / runtime / data / test contradiction appears, downgrade the affected AC and overall status and record invalidating evidence.
+- The prompt must state that UI-facing acceptance requires a real page path and matching user-visible state; component / viewmodel / mock / unit test evidence is auxiliary unless the full checklist explicitly says otherwise.
 - The prompt must say that local audit is not Context, not product-quality proof, not a global task manager, and not a replacement for project tests, CI, review, human acceptance, Task Contract or workflow-contract `plan.md`.
 - The prompt must say that when a Task Contract or workflow-contract `plan.md` exists, each acceptance item execution still follows it and the repository's Tiny Context workflow contract.
 - Do not include explanatory preface inside the prompt.
@@ -529,6 +575,7 @@ Recommended compact Chinese prompt shape:
 执行审计: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md（临时 progress state，非 Context/proof）
 可多开agent，agent名额不够了就关掉不用的。
 本摘要只负责 direction/priority/recovery navigation；允许与完整 checklist 重叠，冲突时以完整 checklist 为准。
+mandatory inputs：原始需求源/原始方案摘要、实施计划、完整验收清单、local audit、relevant Context、required tests / core paths。
 如果 Superpowers 未安装，先按当前平台官方 Superpowers 安装路径安装；若安装被权限/网络/平台限制阻塞，写入执行审计，不得把阻塞当完成。
 如果 Superpowers 已安装，使用 Superpowers 执行本任务：
@@ -536,6 +583,7 @@ Recommended compact Chinese prompt shape:
 - 若实施计划不够可执行，用 superpowers:writing-plans 转成 bite-sized implementation plan
 - 有 subagent 支持时优先用 superpowers:subagent-driven-development；否则用 superpowers:executing-plans
 - 行为变更用 superpowers:test-driven-development；先写失败测试并观察失败，再写最小实现
+- 完成声明前用 superpowers:verification-before-completion 按完整 checklist 和 fresh evidence 做 gate
 - review / finish 不能覆盖完整验收清单；不满足则继续实现
 - 每轮执行后更新 local audit，记录 AC 状态、证据、命令结果、blocker、deferred/narrowed scope、无效证据
@@ -554,6 +602,8 @@ AC11 <文档/Context 更新要求，仅在计划要求时执行>
 AC12 维护执行审计：恢复执行先读 audit；记录总体状态、每个 AC 当前证据、命令/结果/时间、每个 required test 的 command/result/failure reason、artifact/evidence 路径、blocker、deferred/narrowed scope、不能证明 full completion 的旧/部分/smoke/dry-run/research 证据；audit 不是 Context、完成证明、全局任务管理器，也不替代 Task Contract 或流程契约 plan.md。
 AC13 最小用户卡点：问用户前先完成安全自助发现；需要用户介入时只给最小动作清单，写明已尝试、缺失项、具体页面/菜单/字段/按钮、最小值/动作、不要发送的敏感信息、验收影响、fallback/deferred。
 AC14 完成前审计：逐条对照实施计划和完整 checklist；每个 core 项必须有当前证据；未跑验证必须明示；有可继续执行的 core 项不得标记完成；外部/强卡点必须写明原因、缺失证据、验收影响和下一步；若剩余未完成项只有无法本地解决的强卡点，暂停并等待用户/外部 owner，不能标记目标完成。
+AC15 状态机：core AC 初始 unknown / not_run；只有 fresh required evidence 才能 complete；任何 fresh browser / API / runtime / data / test contradiction 必须 downgrade the affected AC and overall status，并在 audit 记录 invalidating evidence。
+AC16 UI-facing acceptance：必须打开 real page path 且用户可见状态匹配；component / viewmodel / mock / unit test 只算辅助证据，除非完整 checklist 另有明确说明。
 禁止把以下内容当完成：只改代码、只更新计划、只跑部分测试、只生成旧/部分/不被当前契约接受的证据、只完成基础设施但未完成验收证据、runtime 未配置/未演练、artifact 未被 validator 接受、API/UI 未反映验收证据、fallback 未演练、强卡点未解除、API/UI/数据/测试之间仍矛盾。
 ```
@@ -566,6 +616,7 @@ Full checklist: tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.
 Local audit: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md (temporary progress state, not Context or proof)
 You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
 This summary is only direction, priority and recovery navigation; overlap with the full checklist is allowed, and the full checklist wins conflicts.
+Mandatory inputs: original requirement source or original plan summary, implementation/source plan, full checklist, local audit, relevant Context, and required tests / core paths.
 If Superpowers is not installed, install it through the current platform's official Superpowers installation path first; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion.
 If Superpowers is installed, Use Superpowers for this task:
@@ -573,6 +624,7 @@ If Superpowers is installed, Use Superpowers for this task:
 - If the plan is not executable enough, use superpowers:writing-plans for a bite-sized implementation plan.
 - Prefer superpowers:subagent-driven-development when subagents are available; otherwise use superpowers:executing-plans.
 - Use superpowers:test-driven-development for behavior changes; write a failing test, observe failure, then implement minimally.
+- Use superpowers:verification-before-completion before any completion claim, checking the full checklist against fresh evidence.
 - review / finish cannot override the full checklist; if unsatisfied, continue implementation.
 - update local audit after each execution round with AC status, evidence, command results, blockers, deferred/narrowed scope and invalid evidence.
@@ -591,6 +643,8 @@ AC11 <documentation / Context updates only when required by the plan>
 AC12 Maintain local audit: read it before resuming; record overall status, every AC's current evidence, commands/results/time, each required test's command/result/failure reason, artifact/evidence paths, blockers, deferred/narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion; audit is not Context, completion proof, a global task manager, or a replacement for Task Contract or workflow-contract plan.md.
 AC13 Minimal user blocker protocol: before asking the user, complete safe self-service discovery; when user action is needed, provide only the smallest action list with what was tried, missing item, exact page/menu/path/field/button, minimum value/action, sensitive material not to send, acceptance impact, and fallback/deferred option.
 AC14 Final audit: compare every item against the plan and full checklist; every core item needs current evidence; missing validation must be stated; any executable core item left open means the task is not complete; external or hard blockers need cause, missing evidence, acceptance impact, and next action; if only locally unsatisfiable hard blockers remain, pause for the user or external owner instead of marking the goal complete.
+AC15 State machine: core ACs start as unknown / not_run; only fresh required evidence can mark complete; any fresh browser / API / runtime / data / test contradiction must downgrade the affected AC and overall status and be recorded as invalidating evidence.
+AC16 UI-facing acceptance: open the real page path and confirm the user-visible state matches; component / viewmodel / mock / unit test evidence is auxiliary unless the full checklist explicitly says otherwise.
 Do not count these as completion: code-only changes, plan-only updates, partial tests, stale or partial evidence, infrastructure without acceptance proof, runtime not configured/exercised, artifact not accepted by validator, API/UI not reflecting accepted evidence, unexercised fallback, unresolved hard blockers, or contradictions between API/UI/data/tests.
 ```

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "project-tiny-context-harness",
-  "version": "0.2.64",
+  "version": "0.2.66",
   "description": "Minimal project memory and validation harness for AI coding agents.",
   "license": "MIT",
   "author": "Seven128",