project-tiny-context-harness 0.2.64 → 0.2.66

package/README.md CHANGED
@@ -80,12 +80,14 @@ workflow contract + project_context/** -> implementation -> verification -> drif
80
80
  For long-running tasks, externalize the target first:
81
81
 
82
82
  ```text
83
- Web GPT or another external planning model produces a plan
83
+ Web GPT or another external planning model produces a two-document upstream input
84
84
  -> plan acceptance checklist Skill produces a goal/target-mode prompt
85
85
  -> Superpowers derives concrete implementation slices
86
86
  -> each slice follows the workflow contract + project_context/**
87
87
  ```
88
88
 
89
+ For best target-mode results, the two-document upstream input should be a `Development Plan` for execution direction and an `Acceptance and Tests` packet for acceptance authority. The development plan preserves the original requirement source and implementation direction; the acceptance packet supplies ACs, required evidence, tests, real product/core paths, evidence layers, invalid evidence, state-machine rules, local audit requirements and blockers. Source Pack exports are temporary upload material for external planning, not durable Context.
90
+
89
91
  The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
90
92
 
91
93
  The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
@@ -139,7 +141,7 @@ npm ci
139
141
  npm run smoke:quickstart
140
142
  npm run preview:pack
141
143
  cd /path/to/your/test-repo
142
- npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
144
+ npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.66.tgz
143
145
  npx --no-install ty-context init --adopt
144
146
  make validate-context
145
147
  ```
@@ -253,7 +255,7 @@ Use `npx --no-install ty-context ...` only when you explicitly want the already
253
255
  | Product Surface Contract Skill | `<harnessRoot>/skills/context_surface_contract/SKILL.md` | Handles explicit Product Surface Contract, Screen Contract, surface responsibility and main/drilldown ownership work; it compiles project-owned surface contracts into `project_context/**` without adding a new context role or gate. |
254
256
  | Full project context export Skill | `<harnessRoot>/skills/context_full_project_export/SKILL.md` | Handles explicit full-project, Source Pack or code-level export requests and uses `export-context --source-pack`, `--code-index`, `--task-context`, `--all`, `--full` or `--code` to create temporary artifacts under `tmp/ty-context/context-exports/**`. |
255
257
  | Harness upgrade Skill | `<harnessRoot>/skills/context_harness_upgrade/SKILL.md` | Handles explicit Tiny Context / Project Tiny Context Harness upgrade requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project”; it runs the canonical `upgrade` path, handles only migration-scoped `manual_required` / `blocked` follow-up, then runs diagnostics. |
256
- | Plan acceptance checklist Skill | `<harnessRoot>/skills/plan_acceptance_checklist_compiler/SKILL.md` | Handles explicit requests to turn a referenced plan, RFC or implementation proposal into a falsifiable acceptance checklist and paste-ready goal/target-mode prompt under `tmp/ty-context/plan-acceptance/**`; if the plan already contains an explicit concrete checklist, the Skill reuses it verbatim in the separate full-checklist file; generated prompts may reference a full checklist as the authoritative acceptance standard and use compact summaries only for navigation/priority, but the Skill does not execute the plan or prove completion. |
258
+ | Plan acceptance checklist Skill | `<harnessRoot>/skills/plan_acceptance_checklist_compiler/SKILL.md` | Handles explicit requests to turn a referenced plan, RFC, implementation proposal or two-document upstream input into a falsifiable acceptance checklist and paste-ready goal/target-mode prompt under `tmp/ty-context/plan-acceptance/**`; if the plan already contains an explicit concrete checklist, the Skill reuses it verbatim in the separate full-checklist file; generated prompts may reference a full checklist as the authoritative acceptance standard and use compact summaries only for navigation/priority, but the Skill does not execute the plan or prove completion. |
257
259
  | Project-local Skills | `<harnessRoot>/skills/<role>/SKILL.md` | Optional local product/design/development Skills created by the project, such as `product_plan`, `uiux_design` or `development_engineer`. They supersede package-managed default Skills when more specific, are not overwritten by `sync`, and should keep front matter trigger keywords aligned with the project `AGENTS.md` role-trigger rule. |
258
260
  | Managed file sync | `make ty-context-sync` or `npx --yes --package project-tiny-context-harness@latest ty-context sync` | Refreshes package-managed guidance, default Skills, Makefile include, context templates, tools and workflow YAML. It does not run migrations or perform semantic Context generation; it may block only direct asset-refresh safety issues such as invalid managed blocks or deprecated managed Skill overrides. |
259
261
  | Upgrade | `make ty-context-upgrade` or `npx --yes --package project-tiny-context-harness@latest ty-context upgrade` | Use for releases marked `upgrade-required` or `manual-required`. Builds an upgrade plan, stops before writes when `blocked` items exist, otherwise applies `safe_pending` migrations, runs `sync` and `doctor`, and exits non-zero when manual follow-up or diagnostics remain. |
@@ -275,7 +277,7 @@ For high-risk product, UI/UX and engineering tasks that affect durable architect
275
277
 
276
278
  Technical architecture support is a Minimal Context capability: use restrained `architecture.md`, area Module Design Capsules and existing `contract` / `decision-rationale` roles when durable architecture or rationale matters. Do not invent rationale; store stable reasons, rejected alternatives or tradeoffs only in the smallest durable Context surface when they will affect future implementation or verification choices.
277
279
 
278
- For long-running plans, RFCs or implementation proposals, the plan acceptance checklist Skill can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. It is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. When the prompt references a full checklist, that checklist is the acceptance authority; compact prompt text is only navigation, priority and recovery guidance.
280
+ For long-running plans, RFCs or implementation proposals, the plan acceptance checklist Skill can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. It also supports a two-document upstream input from Web GPT or another external planner: `Development Plan` for execution direction and `Acceptance and Tests` for target-mode acceptance input. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. The two-document packet path is strict mode: when required fields cannot be fully parsed from both documents, the compiler preserves the inputs, reports the missing fields, and stops without generating a checklist or goal/target-mode prompt. It is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. When the prompt references a full checklist, that checklist is the acceptance authority; compact prompt text is only navigation, priority and recovery guidance.
279
281
 
280
282
  For Product Surface work, `context_surface_contract` turns broad product/page/UI principles into project-owned surface responsibilities. A Product Surface can be a Web page, mobile screen, desktop window, game UI/HUD/menu, CLI/TUI output, extension UI or embedded/device interface. Cross-surface contracts use the existing `contract` role; area-owned screen facts stay in `area` or `subdomain`; repeatable validation paths use `verification`. The Harness does not add a new surface-specific role, does not create business surface contracts during `init` or `upgrade`, and does not turn surface conformance into a validator gate. Projects that want mandatory task blocks should add a separate project-local Skill, while `product-surface-contract.md` is only a compact managed template for optional Context authoring.
281
283
 
package/assets/README.md CHANGED
@@ -94,7 +94,7 @@ That smoke packs the local workspace, installs it into a disposable repo, runs `
94
94
  ```sh
95
95
  npm run preview:pack
96
96
  cd /path/to/your/test-repo
97
- npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
97
+ npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.66.tgz
98
98
  npx --no-install ty-context init --adopt
99
99
  make validate-context
100
100
  ```
@@ -124,12 +124,14 @@ workflow contract + project_context/** -> implementation -> verification -> drif
124
124
  For long-running tasks, externalize the target first:
125
125
 
126
126
  ```text
127
- Web GPT or another external planning model produces a plan
127
+ Web GPT or another external planning model produces a two-document upstream input
128
128
  -> plan acceptance checklist Skill produces a goal/target-mode prompt
129
129
  -> Superpowers derives concrete implementation slices
130
130
  -> each slice follows the workflow contract + project_context/**
131
131
  ```
132
132
 
133
+ For best target-mode results, the two-document upstream input should be a `Development Plan` for execution direction and an `Acceptance and Tests` packet for acceptance authority. The development plan summarizes the original requirement source and implementation direction; the acceptance packet supplies ACs, required evidence, tests, real product/core paths, evidence layers, invalid evidence, state-machine rules, local audit requirements and blockers. Source Pack exports are temporary upload material for external planning, not durable Context.
134
+
133
135
  The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
134
136
 
135
137
  The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
@@ -290,7 +292,7 @@ No. It checks that recovery facts exist and avoids fake test-result claims. Prod
290
292
 
291
293
  It should stay smaller than a full process. Ordinary bug fixes and local refactors do not update Context unless they produce durable product, architecture, API, state or validation facts.
292
294
 
293
- The default Skills are Minimal Context helpers for explicit product-planning, UI/UX-design, development-engineering, Product Surface Contract, full-project-export, Tiny Context upgrade and plan-acceptance-checklist requests. Product, screen-flow, surface responsibility and durable engineering conclusions go to `project_context/**`; visual identity and design tokens go to root `DESIGN.md`. Export artifacts are temporary files under `tmp/ty-context/context-exports/**`, not Context. Plan acceptance artifacts are temporary files under `tmp/ty-context/plan-acceptance/**`; they define completion criteria for a referenced plan but do not execute it or prove acceptance. If the plan already contains an explicit concrete checklist, the Skill reuses that checklist verbatim in the separate full-checklist file. When a generated prompt references a full checklist, that checklist is the authoritative acceptance standard; the compact prompt summary is only navigation and priority guidance. The Harness upgrade Skill handles requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project” by following the release update mode, using `upgrade` for migration-bearing releases, and limiting manual cleanup to migration-scoped follow-up.
295
+ The default Skills are Minimal Context helpers for explicit product-planning, UI/UX-design, development-engineering, Product Surface Contract, full-project-export, Tiny Context upgrade and plan-acceptance-checklist requests. Product, screen-flow, surface responsibility and durable engineering conclusions go to `project_context/**`; visual identity and design tokens go to root `DESIGN.md`. Export artifacts are temporary files under `tmp/ty-context/context-exports/**`, not Context. Plan acceptance artifacts are temporary files under `tmp/ty-context/plan-acceptance/**`; they define completion criteria for a referenced plan but do not execute it or prove acceptance. If the plan already contains an explicit concrete checklist, the Skill reuses that checklist verbatim in the separate full-checklist file. For a two-document upstream input, the external planner should provide a `Development Plan` and an `Acceptance and Tests` packet; the compiler preserves both source roles and, only when strict-mode required fields are fully parseable from both documents, turns them into the full checklist plus target prompt. When a generated prompt references a full checklist, that checklist is the authoritative acceptance standard; the compact prompt summary is only navigation and priority guidance. The Harness upgrade Skill handles requests such as “upgrade Tiny Context” and “use the Tiny Context upgrade skill to upgrade this project” by following the release update mode, using `upgrade` for migration-bearing releases, and limiting manual cleanup to migration-scoped follow-up.
294
296
 
295
297
  Multilingual trigger phrases are compatibility details. Public README, npm and launch copy stay English-first, and public/package-managed surfaces must remain English-complete; literal non-English examples are documented only where they explain generated Skill matching and must not be the sole activation path.
296
298
 
@@ -298,7 +300,7 @@ For high-risk product, UI/UX and engineering tasks that affect durable architect
298
300
 
299
301
  Technical architecture support is a Minimal Context capability: use restrained `architecture.md`, area Module Design Capsules and existing `contract` / `decision-rationale` roles when durable architecture or rationale matters. Do not invent rationale; store stable reasons, rejected alternatives or tradeoffs only in the smallest durable Context surface when they will affect future implementation or verification choices.
300
302
 
301
- For long-running plans, RFCs or implementation proposals, the plan acceptance checklist compiler can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
303
+ For long-running plans, RFCs or implementation proposals, the plan acceptance checklist compiler can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. It also supports a two-document upstream input from Web GPT or another external planner: `Development Plan` for execution direction and `Acceptance and Tests` for target-mode acceptance input. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. The two-document packet path is strict mode: when required fields cannot be fully parsed from both documents, the compiler preserves the inputs, reports the missing fields, and stops without generating a checklist or goal/target-mode prompt. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
302
304
 
303
305
  Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff works well for short tasks, but long tasks with large context windows, multiple handoffs or many verification loops are expected to drift unless the completion target is externalized. Use the plan acceptance checklist compiler before long-running execution when there is a plan-like source; treat the full checklist as the acceptance target, and treat the local audit only as temporary progress/recovery state.
304
306
 
@@ -50,7 +50,7 @@ Fresh agents read these files first, then start changing code.
50
50
 
51
51
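  A typical failure scenario is the ABCD module chain: A/B/C are upstream and D is downstream. While working on a requirement for D, the agent finds a capability gap. Without Context and priority constraints, the agent can easily change upstream A/B just to get D done, because the current code makes that path workable. But the real questions are: does D have the authority to change A/B? Does the gap belong to C's contract? Must a `Context Delta` be declared first, so that the change in project intent is confirmed before implementation? Code can show "how the change can be made today"; it cannot show "whether project intent allows the change". Tiny Context fills in exactly this layer: long-lived in-repo facts and a priority contract.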
52
52
 
53
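- For long-running tasks, the Harness also provides a lightweight plan acceptance checklist Skill: when the user explicitly provides or references a proposal / plan / RFC / implementation plan and asks for an acceptance checklist, a completion definition or a goal/target-mode prompt, the Skill temporarily stores the plan and the acceptance checklist under `tmp/ty-context/plan-acceptance/**`. If the plan already contains an explicit, concrete "acceptance checklist", the Skill reuses that checklist directly and writes it to a separate full acceptance checklist file instead of generating a competing checklist. This is only a single pre-execution pass over the acceptance standard: it does not execute the plan, does not prove completion, and does not register the temporary checklist as `project_context/**`.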
53
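+ For long-running tasks, the Harness also provides a lightweight plan acceptance checklist Skill: when the user explicitly provides or references a proposal / plan / RFC / implementation plan and asks for an acceptance checklist, a completion definition or a goal/target-mode prompt, the Skill temporarily stores the plan and the acceptance checklist under `tmp/ty-context/plan-acceptance/**`. When an external planning model is involved, the recommendation is still to supply only two artifacts: `《开发方案》` (Development Plan) as the execution direction and `《验收清单和测试用例》` (Acceptance and Tests) as the Codex target-mode acceptance input packet. The second document should contain ACs, required evidence, test commands, real product paths / core paths, evidence layers, invalid evidence, the state machine, local audit and blockers. Source Pack is only temporary upload material, not durable Context. If the plan already contains an explicit, concrete "acceptance checklist", the Skill reuses that checklist directly and writes it to a separate full acceptance checklist file. The two-document input packet runs in strict mode: if the required fields cannot be fully parsed from the two documents, or the second document lacks necessary fields such as required evidence, verification method, fail condition, state machine or invalid evidence rules, the Skill stops and lists the missing items without generating a full acceptance checklist or target-mode text. This is only a single pre-execution pass over the acceptance standard: it does not execute the plan, does not prove completion, and does not register the temporary checklist as `project_context/**`.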
54
54
 
55
55
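  Important usage note: Minimal Context intentionally keeps Context read order, Context / code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. This tradeoff suits short tasks, but long tasks, large contexts, multiple handoffs or many verification rounds are expected to drift. For such tasks, when a proposal/plan source already exists, first use the plan acceptance checklist Skill to externalize a falsifiable completion target; the full acceptance checklist is the acceptance standard, and the local audit is only temporary progress/recovery state.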
56
56
 
@@ -65,13 +65,13 @@ Fresh agents read these files first, then start changing code.
65
65
  For long-running tasks, externalize the target first, then move into implementation:
66
66
 
67
67
  ```text
68
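- Web GPT or another external planning model produces a plan
68
+ Web GPT or another external planning model produces two artifacts: 《开发方案》 (Development Plan) + 《验收清单和测试用例》 (Acceptance and Tests)
69
69
  -> plan acceptance checklist Skill generates the target-mode text
70
70
  -> Superpowers derives concrete implementation slices
71
71
  -> each slice returns to the workflow contract + project_context/**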
72
72
  ```
73
73
 
74
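- Superpowers here means the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/open-source workflow, not a generic substitute for execution planning. If the target-mode text or the original plan is not yet executable enough, use `superpowers:writing-plans` to turn it into a bite-sized implementation plan; prefer `superpowers:subagent-driven-development` when subagents are supported, otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes.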
74
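+ Superpowers here means the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/open-source workflow, not a generic substitute for execution planning. If the target-mode text or the original plan is not yet executable enough, use `superpowers:writing-plans` to turn it into a bite-sized implementation plan; prefer `superpowers:subagent-driven-development` when subagents are supported, otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; before claiming completion, use `superpowers:verification-before-completion` to gate on the full acceptance checklist and fresh evidence.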
75
75
 
76
76
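  The reason is drift control. The workflow contract + Context layer is a soft constraint, and on short tasks it usually keeps the agent executing as expected. On long-running tasks, Context can still record the expected facts, but the Context-to-code implementation step drifts as the context window grows and as handoffs, subagent splits and verification rounds multiply. The Web GPT plan, target-mode text, full acceptance checklist and Superpowers execution layer externalize the completion target into a recoverable, auditable temporary execution standard, without restoring phase-style gates.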
77
77
 
@@ -27,10 +27,19 @@ This Skill is generic. Do not embed business-specific rules, vendor-specific rul
27
27
 
28
28
  Every completed invocation must produce:
29
29
 
30
- 1. A preserved copy of the plan under `tmp/ty-context/plan-acceptance/`. The copied plan is the implementation/source plan, not acceptance proof.
30
+ 1. A preserved copy of the plan under `tmp/ty-context/plan-acceptance/`. The copied plan is the implementation/source plan, not acceptance proof. For a two-document upstream input packet, preserve both source inputs separately when both are provided.
31
31
  2. A rigorous full acceptance checklist under a separate `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` path. If the plan contains an explicit concrete acceptance checklist, reuse that plan-provided checklist verbatim; otherwise derive the checklist from the plan and relevant project Context. The full checklist is the complete acceptance standard and owns required automated test evidence.
32
32
  3. A goal/target-mode prompt the user can paste directly into Codex. The prompt may include a compact checklist summary for direction, priority and recovery navigation, but the full checklist owns the acceptance details.
33
33
  4. When a local audit path is referenced, it is temporary execution/progress state only, not Context and not proof by itself.
34
+
35
+ The compiler may receive one source plan or a two-document upstream input packet:
36
+
37
+ - `Development Plan / 开发方案`: objective, original requirement source summary, execution direction, module/API/UI/runtime/data flow, risk boundaries, Context Delta and whether execution must first be converted into bite-sized tasks.
38
+ - `Acceptance and Tests / 验收清单和测试用例`: ACs, required evidence, tests, real product paths, core paths, evidence layers, invalid evidence, completion state machine, local audit rules and blockers.
39
+
40
+ These inputs are preserved source input, not proof. The compiler's job is to turn them into the full checklist and target prompt that Superpowers can execute against without losing acceptance authority.
41
+
42
+ Two-document upstream input packet handling is strict mode. If the compiler cannot fully parse the required content from both documents, it must stop and ask the user to provide the missing required fields. Do not generate a checklist or goal/target-mode prompt from an incomplete two-document packet.
34
43
 
35
44
  Exception: if the Context confirmation gate below triggers, stop after materializing the plan and reading enough Context to explain the uncertainty. Ask the user for confirmation before producing the checklist or goal/target-mode prompt.
36
45
 
@@ -117,33 +126,52 @@ audit this plan for acceptance criteria
117
126
 
118
127
  Do not trigger from standalone broad phrases such as `goal mode`, `target mode`, `acceptance criteria`, `completion definition`, `目标模式文本`, `验收标准`, or `完成定义` unless the same user request also identifies a plan-like source.
119
128
 
120
- ## Input Priority
121
-
122
- Build the checklist from these sources, in order:
123
-
124
- 1. User's current instruction.
125
- 2. The referenced or pasted plan.
126
- 3. Relevant durable project Context under `project_context/**`.
127
- 4. Repository guidance such as `AGENTS.md`, `README.md`, `DESIGN.md`, and relevant local Skills.
128
- 5. Current code and tests, only to identify real surfaces, commands, routes, schemas, artifacts, entry points, and likely verification paths.
129
- 6. Existing reports or artifacts, only to identify required evidence and invalidation risks. Existing artifacts are not proof unless the user explicitly asks to audit current completion.
130
-
131
- If plan and Context conflict, preserve the conflict in the checklist. Do not silently choose the easier side.
132
-
133
- ## Step 1: Materialize The Plan Under `tmp/ty-context/plan-acceptance/`
134
-
135
- Before writing the checklist, write the user-specified plan into the repository `tmp/ty-context/plan-acceptance/` directory.
136
-
137
- Rules:
129
+ ## Input Priority
130
+
131
+ Build the checklist from these sources, in order:
132
+
133
+ 1. User's current instruction.
134
+ 2. The referenced or pasted plan, including both documents when the user provides a two-document upstream input packet.
135
+ 3. Relevant durable project Context under `project_context/**`.
136
+ 4. Repository guidance such as `AGENTS.md`, `README.md`, `DESIGN.md`, and relevant local Skills.
137
+ 5. Current code and tests, only to identify real surfaces, commands, routes, schemas, artifacts, entry points, and likely verification paths.
138
+ 6. Existing reports or artifacts, only to identify required evidence and invalidation risks. Existing artifacts are not proof unless the user explicitly asks to audit current completion.
139
+
140
+ If plan and Context conflict, preserve the conflict in the checklist. Do not silently choose the easier side.
141
+
142
+ ## Upstream Input Packet
143
+
144
+ When the user provides two related documents, treat them as a two-document upstream input packet when their roles match these shapes:
145
+
146
+ - `Development Plan / 开发方案`: execution direction for implementation. It can contain the original requirement source or a summary of the original plan so later implementation does not shrink the target.
147
+ - `Acceptance and Tests / 验收清单和测试用例`: target-mode acceptance input packet. It can contain the acceptance matrix, test requirements, real product paths, evidence layers, invalid evidence rules, state-machine rules, local audit requirements and blockers.
148
+
149
+ Rules:
150
+
151
+ - Do not create a third upstream artifact. Preserve the source input documents and compile a separate full checklist.
152
+ - If only one plan-like source is provided, keep the existing single-plan flow.
153
+ - If both documents are provided, read both before deciding whether to reuse or generate the full checklist.
154
+ - The development plan is execution direction, not proof. The acceptance-and-tests document is acceptance input, not proof.
155
+ - In the full checklist's `Plan source` and per-row `Source` fields, preserve whether a requirement came from the development plan, the acceptance-and-tests document, user instruction, project Context, code contract or verification risk.
156
+ - If the development plan is not bite-sized enough for direct execution, the generated prompt must require `superpowers:writing-plans` before implementation.
157
+ - This two-document upstream input packet path is strict mode. The compiler must be able to parse the objective or original requirement source summary, implementation/source plan, acceptance items, required evidence, verification methods, fail conditions, required tests or explicit test scope, core paths or explicit non-UI/runtime scope, state-machine rules, invalid evidence rules, local audit expectations and blockers or explicit no-blocker statement.
158
+ - If any required field cannot be fully parsed from the two documents, stop after preserving the inputs. Do not generate a checklist or goal/target-mode prompt. Tell the user which missing required fields must be added to `Development Plan / 开发方案` or `Acceptance and Tests / 验收清单和测试用例`.
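The strict-mode rule above can be read as a completeness check over a fixed set of required fields. The following is a minimal sketch of that reading, not the Skill's actual implementation: the snake_case field keys, the `PacketCheck` type and the `check_packet` function are hypothetical names introduced for illustration, and it assumes the two documents have already been parsed into a flat mapping of field name to extracted text.

```python
from dataclasses import dataclass, field

# Required fields taken from the strict-mode rule above.
# The snake_case keys are hypothetical names chosen for this sketch.
REQUIRED_FIELDS = (
    "objective_or_requirement_source",
    "implementation_source_plan",
    "acceptance_items",
    "required_evidence",
    "verification_methods",
    "fail_conditions",
    "required_tests_or_test_scope",
    "core_paths_or_non_ui_scope",
    "state_machine_rules",
    "invalid_evidence_rules",
    "local_audit_expectations",
    "blockers_or_no_blocker_statement",
)


@dataclass
class PacketCheck:
    """Result of checking a parsed two-document packet (hypothetical type)."""

    missing: list = field(default_factory=list)

    @property
    def may_compile(self) -> bool:
        # Strict mode: a checklist or prompt may be generated only when nothing is missing.
        return not self.missing


def check_packet(parsed: dict) -> PacketCheck:
    """Report every required field that is absent or blank in the parsed packet."""
    missing = [
        name
        for name in REQUIRED_FIELDS
        if not str(parsed.get(name) or "").strip()
    ]
    return PacketCheck(missing=missing)
```

Under this sketch, a caller would preserve both inputs first, then stop and report `missing` to the user whenever `may_compile` is false, which mirrors the "stop after preserving the inputs" behavior described above.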
138
159
 
139
- - If `tmp/ty-context/plan-acceptance/` does not exist, create it.
140
- - If the user references a file outside `tmp/ty-context/plan-acceptance/`, copy its current content into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
141
- - If the user references a file already under `tmp/ty-context/plan-acceptance/`, use that path directly unless the user asks for a new copy.
142
- - If the user pasted the plan text, write the pasted plan exactly into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
143
- - Preserve the plan content. Do not summarize, normalize, reorder, translate, or edit it while materializing it.
144
- - Use a stable readable filename derived from the plan title, source filename, or user topic. Use lowercase letters, digits, hyphens, and `.md`.
160
+ ## Step 1: Materialize The Plan Under `tmp/ty-context/plan-acceptance/`
161
+
162
+ Before writing the checklist, write the user-specified source input into the repository `tmp/ty-context/plan-acceptance/` directory.
163
+
164
+ Rules:
165
+
166
+ - If `tmp/ty-context/plan-acceptance/` does not exist, create it.
167
+ - If the user references a file outside `tmp/ty-context/plan-acceptance/`, copy its current content into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
168
+ - If the user references a file already under `tmp/ty-context/plan-acceptance/`, use that path directly unless the user asks for a new copy.
169
+ - If the user pasted the plan text, write the pasted plan exactly into `tmp/ty-context/plan-acceptance/<safe-plan-name>.md`.
170
+ - If the user pasted or referenced a two-document upstream input packet, write `Development Plan / 开发方案` and `Acceptance and Tests / 验收清单和测试用例` as separate preserved source inputs under stable readable filenames.
171
+ - Preserve the plan content. Do not summarize, normalize, reorder, translate, or edit it while materializing it.
172
+ - Use a stable readable filename derived from the plan title, source filename, or user topic. Use lowercase letters, digits, hyphens, and `.md`.
145
173
  - If the source plan cannot be found or read, stop and report the missing source. Do not invent a plan.
146
- - The materialized plan is temporary implementation/source input. It is not durable Context, not acceptance proof and not proof that any acceptance item passed.
174
+ - The materialized source input is temporary implementation/source input. It is not durable Context, not acceptance proof and not proof that any acceptance item passed.
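The filename rule in the list above (lowercase letters, digits, hyphens and `.md`) could be implemented along these lines. This is a hedged sketch: `safe_plan_filename` and its `fallback` parameter are hypothetical and are not part of the package.

```python
import re


def safe_plan_filename(title: str, fallback: str = "plan") -> str:
    """Derive a lowercase, hyphenated `.md` filename from a title or source name."""
    stem = title.strip().lower()
    # Drop a trailing .md so a source filename does not become "name-md.md".
    if stem.endswith(".md"):
        stem = stem[:-3]
    # Collapse every run of characters outside a-z and 0-9 into one hyphen.
    stem = re.sub(r"[^a-z0-9]+", "-", stem).strip("-")
    # Fall back to a generic stem when nothing usable survives (assumption).
    return f"{stem or fallback}.md"
```

A title made up entirely of non-ASCII characters reduces to an empty stem here and falls back to `plan.md`. A real implementation would need its own policy for such titles, since the rule itself does not define one.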
147
175
 
148
176
  Recommended paths:
149
177
 
@@ -161,6 +189,8 @@ After materializing the plan, inspect it for an explicit concrete acceptance che
161
189
 
162
190
  Plan-provided checklist reuse applies only when the plan contains a clearly labeled checklist or checklist table section, such as `Acceptance Checklist`, `Acceptance Criteria`, `验收清单`, `验收标准`, or equivalent heading, and that section contains concrete acceptance items rather than only saying that acceptance is needed.
163
191
 
192
+ For a two-document upstream input packet, the `Acceptance and Tests / 验收清单和测试用例` document is the preferred acceptance source, but it is not automatically a frozen verbatim full checklist. Reuse it as the full checklist only if it already includes enough acceptance structure for target-mode execution: AC ID, scope, required evidence, verification method, fail condition, state-machine rules and invalid evidence rules. If any of those are missing, strict mode applies: stop, list the missing required fields, and ask the user to provide a complete acceptance-and-tests packet.
193
+
164
194
  When a plan-provided checklist is found:
165
195
 
166
196
  - Copy the plan-provided acceptance checklist section into `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` as the full checklist.
@@ -176,6 +206,8 @@ When a plan-provided checklist is found:
176
206
 
177
207
  When no explicit concrete plan-provided checklist exists, continue with the generated-checklist flow below.
178
208
 
209
+ When a two-document upstream input packet exists but either document is incomplete, do not continue with the generated-checklist flow below. Strict mode means incomplete two-document input cannot be repaired by inference. Do not generate a checklist or goal/target-mode prompt; report the missing required fields and wait for the user to provide a complete packet.
210
+
179
211
  When a plan already includes explicit test requirements but not a full acceptance checklist:
180
212
 
181
213
  - Use those plan-provided test requirements as the source of truth for the generated checklist's `Required automated tests / 必须新增或补强的自动化测试` section.
@@ -281,6 +313,8 @@ Allowed `Conclusion` values:
281
313
 
282
314
  - `proven`
283
315
  - `unproven`
316
+ - `partial`
317
+ - `invalidated`
284
318
  - `stale-evidence`
285
319
  - `runtime-disconnected`
286
320
  - `implementation-drift`
@@ -289,6 +323,8 @@ Allowed `Conclusion` values:
289
323
 
290
324
  Missing ledger evidence means incomplete, not complete. Do not let missing evidence, old evidence, partial evidence or evidence from a different read path satisfy a current-state claim.
291
325
 
326
+ Every core AC status starts as `unknown / not_run` unless current required evidence has been inspected. Only fresh required evidence can support `complete`. Any fresh browser / API / runtime / data / test contradiction must immediately downgrade the affected AC and overall status, and the local audit must record the contradiction as invalidating evidence. Do not preserve a previous complete status after contradictory current evidence appears.
327
+
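The status rules in the paragraph above behave like a small state machine. The sketch below is illustrative only: the `AcStatus` class, the `overall_status` helper, and the choice of `invalidated` as the downgraded per-AC label are assumptions, and the source does not prescribe a data model.

```python
from dataclasses import dataclass, field


@dataclass
class AcStatus:
    """Status record for one core acceptance criterion (illustrative only)."""
    ac_id: str
    status: str = "unknown / not_run"
    invalidating_evidence: list = field(default_factory=list)

    def record_fresh_evidence(self, satisfies_ac: bool, note: str) -> None:
        """Apply one piece of fresh required evidence to this AC."""
        if satisfies_ac:
            # Only fresh required evidence may move an AC to complete.
            self.status = "complete"
        else:
            # A fresh contradiction downgrades immediately, even from complete.
            self.status = "invalidated"
            self.invalidating_evidence.append(note)


def overall_status(acs: list) -> str:
    """The overall status is complete only when every core AC is complete."""
    return "complete" if acs and all(a.status == "complete" for a in acs) else "incomplete"
```

Because `overall_status` is recomputed from the current per-AC states, a contradiction on any single AC also downgrades the overall result, so an earlier `complete` cannot survive contradictory evidence.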
292
328
  ### Required Automated Tests / 必须新增或补强的自动化测试
293
329
 
294
330
  Every generated full checklist must include a `Required automated tests / 必须新增或补强的自动化测试` section. Test requirements are acceptance evidence: a future executor cannot mark a relevant implementation item complete when required test evidence is missing.
@@ -311,6 +347,8 @@ Rules:
311
347
  - If no explicit test section exists, derive required tests from the plan's behavior changes, risk, Context contracts and real code/test surfaces.
312
348
  - If an exact test name cannot be inferred safely, write a behavior-level test description and do not invent exact test names.
313
349
  - Each required test row must identify the covered acceptance item(s), the verification command, and the failure condition that blocks acceptance.
350
+ - For behavior changes, test requirements should identify expected RED/GREEN or pass/fail signal when that can be inferred safely.
351
+ - When a test is auxiliary evidence only, state which acceptance layer it supports and what it cannot prove.
314
352
  - If no new or strengthened automated tests are required, state that explicitly with the reason and the acceptance items covered by existing verification.
315
353
  - The local audit must record each required test's command, result and failure reason when it is run or when it remains blocked.
316
354
 
@@ -370,10 +408,11 @@ Consider these generic dimensions:
370
408
  - Code implementation behavior.
371
409
  - API or interface contract.
372
410
  - Data model, schema, migration, or persistence.
373
- - Runtime state, configuration, session, credential, environment, or degraded behavior.
374
- - Artifact generation, schema, freshness, provenance, and acceptance.
375
- - UI or user-visible projection.
376
- - Async job, worker, scheduler, queue, or background process.
411
+ - Runtime state, configuration, session, credential, environment, or degraded behavior.
412
+ - Artifact generation, schema, freshness, provenance, and acceptance.
413
+ - UI or user-visible projection.
414
+ - Real product path or core path, including page/user flow, API route/probe, runtime/worker/job/artifact path, and whether Browser/Chrome verification is required.
415
+ - Async job, worker, scheduler, queue, or background process.
377
416
  - Security, privacy, redaction, secrets, and access control.
378
417
  - Observability, logs, diagnostics, and operator visibility.
379
418
  - Performance, timeout, boundedness, pagination, and resource budget.
@@ -393,6 +432,7 @@ runtime exercised
393
432
  artifact generated
394
433
  artifact accepted by validator
395
434
  API/UI reflects accepted evidence
435
+ browser path verified
396
436
  final gate/check command passed
397
437
  ```
398
438
 
@@ -455,12 +495,15 @@ For evidence-ledger plans, keep the traps generic and cover these cases when rel
455
495
 
456
496
  - Code-only changes without current execution or acceptance evidence.
457
497
  - UI/API shell behavior without the backing data, runtime or artifact evidence required by the checklist.
498
+ - UI-facing acceptance without the real page path and matching user-visible state.
458
499
  - Stale artifacts or stale runtime evidence.
459
500
  - Evidence from a mismatched read path, service path, artifact path or runtime instance.
460
501
  - Unexercised runtime or unexercised fallback behavior.
461
502
  - Partial tests, smoke-only checks or dry runs when the plan requires broader current proof.
462
503
  - API/UI/data/test contradictions that remain unresolved.
463
504
 
505
+ For UI-facing acceptance, component / viewmodel / mock / unit test evidence is insufficient unless the real page path is opened and the user-visible state matches the acceptance item.
506
+
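The UI-facing rule above separates sufficient evidence from auxiliary evidence. It can be sketched as two predicates over a list of evidence records. The record shape (`kind`, `visible_state_matches`) and the kind labels are hypothetical; the source names the evidence layers but defines no schema.

```python
from __future__ import annotations

# Hypothetical evidence-kind labels; the source names these layers but defines no schema.
AUXILIARY_KINDS = {"component", "viewmodel", "mock", "unit_test"}


def ui_acceptance_satisfied(evidence: list[dict]) -> bool:
    """True only if some item opened the real page path and the visible state matched."""
    return any(
        item.get("kind") == "real_page_path" and item.get("visible_state_matches") is True
        for item in evidence
    )


def auxiliary_only(evidence: list[dict]) -> bool:
    """True when every item is auxiliary, i.e. no real page path was exercised."""
    return bool(evidence) and all(item.get("kind") in AUXILIARY_KINDS for item in evidence)
```

Under this sketch, any number of passing unit or component tests still leaves `ui_acceptance_satisfied` false until a real page path record with a matching visible state is present.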
464
507
  ## Suggested Execution Order
465
508
 
466
509
  Suggest an execution order that prioritizes the highest-risk proof first:
@@ -505,12 +548,15 @@ Hard requirements:
505
548
  - The prompt must identify the full checklist path immediately after the plan path and say it is the complete acceptance standard. Chinese prompts must include this exact sentence: `该文件是完整验收标准,验收以这个为准。完成前必须逐项检查,不满足则继续实现。` English prompts must say the full checklist is the complete acceptance standard, acceptance is judged against it, and every item must be checked before completion.
506
549
  - The prompt must identify a local audit path, normally `tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md`, and require the future executor to read it before resuming, keep it current during execution, and use it only as target-mode acceptance progress state.
507
550
  - After the plan/checklist/audit paths, include a resource lifecycle instruction: `可多开agent,agent名额不够了就关掉不用的。` for Chinese prompts or `You may use multiple agents; if agent slots run low, close idle or unnecessary agents.` for English prompts.
551
+ - The prompt must include mandatory inputs: original requirement source or original plan summary, implementation/source plan, full acceptance checklist, local audit, relevant Context, and required tests / core paths.
508
552
  - The prompt must include a Superpowers execution block. If Superpowers is not installed, tell the executor to install it through the current platform's official Superpowers installation path; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion. If Superpowers is installed, Use Superpowers for this task.
509
- - The Superpowers block must require: read the full checklist first and make it the acceptance authority; use `superpowers:writing-plans` when the plan is not executable enough; prefer `superpowers:subagent-driven-development` when subagents are available; otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; review / finish cannot override the full checklist; update local audit after each execution round.
553
+ - The Superpowers block must require: read the full checklist first and make it the acceptance authority; use `superpowers:writing-plans` when the plan is not executable enough; prefer `superpowers:subagent-driven-development` when subagents are available; otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; use `superpowers:verification-before-completion` before any completion claim; review / finish cannot override the full checklist; update local audit after each execution round.
510
554
  - The remaining content must be the acceptance checklist or a compact version of it.
511
555
  - The prompt must be self-contained enough for goal/target-mode execution.
512
556
  - If the prompt uses a compact checklist summary, say the full checklist owns details and acceptance authority; the compact summary owns direction, priority and recovery navigation; overlap is allowed; conflicts are resolved in favor of the full checklist.
513
557
  - The prompt must require the local audit to record overall status (`complete`, `incomplete`, `blocked` or `narrowed-scope-complete`), each core AC status and current evidence, commands with result/time/failure reason, artifact or evidence paths, blockers and missing evidence, acceptance impact, explicit deferred or narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion.
558
+ - The prompt must require local audit status to start from `unknown / not_run`; only fresh required evidence can mark an AC complete. If any fresh browser / API / runtime / data / test contradiction appears, downgrade the affected AC and overall status and record invalidating evidence.
559
+ - The prompt must state that UI-facing acceptance requires a real page path and matching user-visible state; component / viewmodel / mock / unit test evidence is auxiliary unless the full checklist explicitly says otherwise.
514
560
  - The prompt must say that local audit is not Context, not product-quality proof, not a global task manager, and not a replacement for project tests, CI, review, human acceptance, Task Contract or workflow-contract `plan.md`.
515
561
  - The prompt must say that when a Task Contract or workflow-contract `plan.md` exists, each acceptance item execution still follows it and the repository's Tiny Context workflow contract.
516
562
  - Do not include explanatory preface inside the prompt.
@@ -529,6 +575,7 @@ Recommended compact Chinese prompt shape:
529
575
  执行审计: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md(临时 progress state,非 Context/proof)
530
576
  可多开agent,agent名额不够了就关掉不用的。
531
577
  本摘要只负责 direction/priority/recovery navigation;允许与完整 checklist 重叠,冲突时以完整 checklist 为准。
578
+ mandatory inputs:原始需求源/原始方案摘要、实施计划、完整验收清单、local audit、relevant Context、required tests / core paths。
532
579
 
533
580
  如果 Superpowers 未安装,先按当前平台官方 Superpowers 安装路径安装;若安装被权限/网络/平台限制阻塞,写入执行审计,不得把阻塞当完成。
534
581
  如果 Superpowers 已安装,使用 Superpowers 执行本任务:
@@ -536,6 +583,7 @@ Recommended compact Chinese prompt shape:
536
583
  - 若实施计划不够可执行,用 superpowers:writing-plans 转成 bite-sized implementation plan
537
584
  - 有 subagent 支持时优先用 superpowers:subagent-driven-development;否则用 superpowers:executing-plans
538
585
  - 行为变更用 superpowers:test-driven-development;先写失败测试并观察失败,再写最小实现
586
+ - 完成声明前用 superpowers:verification-before-completion 按完整 checklist 和 fresh evidence 做 gate
539
587
  - review / finish 不能覆盖完整验收清单;不满足则继续实现
540
588
  - 每轮执行后更新 local audit,记录 AC 状态、证据、命令结果、blocker、deferred/narrowed scope、无效证据
541
589
 
@@ -554,6 +602,8 @@ AC11 <文档/Context 更新要求,仅在计划要求时执行>
554
602
  AC12 维护执行审计:恢复执行先读 audit;记录总体状态、每个 AC 当前证据、命令/结果/时间、每个 required test 的 command/result/failure reason、artifact/evidence 路径、blocker、deferred/narrowed scope、不能证明 full completion 的旧/部分/smoke/dry-run/research 证据;audit 不是 Context、完成证明、全局任务管理器,也不替代 Task Contract 或流程契约 plan.md。
555
603
  AC13 最小用户卡点:问用户前先完成安全自助发现;需要用户介入时只给最小动作清单,写明已尝试、缺失项、具体页面/菜单/字段/按钮、最小值/动作、不要发送的敏感信息、验收影响、fallback/deferred。
556
604
  AC14 完成前审计:逐条对照实施计划和完整 checklist;每个 core 项必须有当前证据;未跑验证必须明示;有可继续执行的 core 项不得标记完成;外部/强卡点必须写明原因、缺失证据、验收影响和下一步;若剩余未完成项只有无法本地解决的强卡点,暂停并等待用户/外部 owner,不能标记目标完成。
605
+ AC15 状态机:core AC 初始 unknown / not_run;只有 fresh required evidence 才能 complete;任何 fresh browser / API / runtime / data / test contradiction 必须 downgrade the affected AC and overall status,并在 audit 记录 invalidating evidence。
606
+ AC16 UI-facing acceptance:必须打开 real page path 且用户可见状态匹配;component / viewmodel / mock / unit test 只算辅助证据,除非完整 checklist 另有明确说明。
557
607
 
558
608
  禁止把以下内容当完成:只改代码、只更新计划、只跑部分测试、只生成旧/部分/不被当前契约接受的证据、只完成基础设施但未完成验收证据、runtime 未配置/未演练、artifact 未被 validator 接受、API/UI 未反映验收证据、fallback 未演练、强卡点未解除、API/UI/数据/测试之间仍矛盾。
559
609
  ```
@@ -566,6 +616,7 @@ Full checklist: tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.
566
616
  Local audit: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md (temporary progress state, not Context or proof)
567
617
  You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
568
618
  This summary is only direction, priority and recovery navigation; overlap with the full checklist is allowed, and the full checklist wins conflicts.
619
+ Mandatory inputs: original requirement source or original plan summary, implementation/source plan, full checklist, local audit, relevant Context, and required tests / core paths.
569
620
 
570
621
  If Superpowers is not installed, install it through the current platform's official Superpowers installation path first; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion.
571
622
  If Superpowers is installed, Use Superpowers for this task:
@@ -573,6 +624,7 @@ If Superpowers is installed, Use Superpowers for this task:
573
624
  - If the plan is not executable enough, use superpowers:writing-plans for a bite-sized implementation plan.
574
625
  - Prefer superpowers:subagent-driven-development when subagents are available; otherwise use superpowers:executing-plans.
575
626
  - Use superpowers:test-driven-development for behavior changes; write a failing test, observe failure, then implement minimally.
627
+ - Use superpowers:verification-before-completion before any completion claim, checking the full checklist against fresh evidence.
576
628
  - review / finish cannot override the full checklist; if unsatisfied, continue implementation.
577
629
  - update local audit after each execution round with AC status, evidence, command results, blockers, deferred/narrowed scope and invalid evidence.
578
630
 
@@ -591,6 +643,8 @@ AC11 <documentation / Context updates only when required by the plan>
591
643
  AC12 Maintain local audit: read it before resuming; record overall status, every AC's current evidence, commands/results/time, each required test's command/result/failure reason, artifact/evidence paths, blockers, deferred/narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion; audit is not Context, completion proof, a global task manager, or a replacement for Task Contract or workflow-contract plan.md.
592
644
  AC13 Minimal user blocker protocol: before asking the user, complete safe self-service discovery; when user action is needed, provide only the smallest action list with what was tried, missing item, exact page/menu/path/field/button, minimum value/action, sensitive material not to send, acceptance impact, and fallback/deferred option.
593
645
  AC14 Final audit: compare every item against the plan and full checklist; every core item needs current evidence; missing validation must be stated; any executable core item left open means the task is not complete; external or hard blockers need cause, missing evidence, acceptance impact, and next action; if only locally unsatisfiable hard blockers remain, pause for the user or external owner instead of marking the goal complete.
646
+ AC15 State machine: core ACs start as unknown / not_run; only fresh required evidence can mark complete; any fresh browser / API / runtime / data / test contradiction must downgrade the affected AC and overall status and be recorded as invalidating evidence.
647
+ AC16 UI-facing acceptance: open the real page path and confirm the user-visible state matches; component / viewmodel / mock / unit test evidence is auxiliary unless the full checklist explicitly says otherwise.
594
648
 
595
649
  Do not count these as completion: code-only changes, plan-only updates, partial tests, stale or partial evidence, infrastructure without acceptance proof, runtime not configured/exercised, artifact not accepted by validator, API/UI not reflecting accepted evidence, unexercised fallback, unresolved hard blockers, or contradictions between API/UI/data/tests.
596
650
  ```
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "project-tiny-context-harness",
3
- "version": "0.2.64",
3
+ "version": "0.2.66",
4
4
  "description": "Minimal project memory and validation harness for AI coding agents.",
5
5
  "license": "MIT",
6
6
  "author": "Seven128",