project-tiny-context-harness 0.2.62 → 0.2.64

Files changed (5)

package/README.md +42 -18
package/assets/README.md +44 -18
package/assets/README.zh-CN.md +41 -14
package/assets/skills/plan_acceptance_checklist_compiler/SKILL.md +85 -15
package/package.json +1 -1

package/README.md CHANGED

@@ -8,15 +8,15 @@
 Translations: [Chinese (Simplified)](https://github.com/Seven128/project-tiny-context-harness/blob/main/README.zh-CN.md)
-`project-tiny-context-harness` ships the `ty-context` CLI for Project Tiny Context Harness: repo-native project memory for AI coding agents.
-The default is **Minimal Context Harness**. It maintains a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills and a `validate-context` gate so fresh agents can recover project intent, constraints, verification entry points and next safe actions quickly.
+`project-tiny-context-harness` ships the `ty-context` CLI for Project Tiny Context Harness: repo-native project memory for AI coding agents and a repo-native context contract.
+The default is **Minimal Context Harness**. It maintains a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills, priority guidance for Context/code/evidence, and a `validate-context` gate so fresh agents can recover project intent, constraints, verification entry points and next safe actions quickly.
 It does not default to lifecycle phases, plan tasks, stage skills, stage documents or phase gates. Harness maintains context quality; your project tests, CI, review process and human acceptance remain responsible for product quality.
 Use it when coding agents repeatedly lose project intent across new chats, handoffs, RFC/debug turns or tool changes. The intended tradeoff is: keep durable intent and recovery paths; leave execution evidence to code, tests and review.
-Think of it as durable project memory behind `AGENTS.md`, not another agent, process framework or task manager.
+Think of it as durable project memory behind `AGENTS.md`, plus priority rules for Context/code/evidence, not another agent, process framework or task manager.
 Best for:
@@ -63,19 +63,43 @@ No-install preview:
 Coding agents can move quickly inside one thread and still drift when a new chat, model, tool, reviewer or debugging session loses the project-specific facts that were never encoded anywhere stable.
-Minimal Context Harness creates a small, explicit recovery path: project goal, boundaries, architecture context, validation entry points and durable task conclusions. It is designed to sit beside specs, tests, issues, docs and code intelligence tools instead of replacing them.
-The core bet is: **keep the memory, drop the ceremony**. Earlier stage-based workflows pushed ordinary software work through explicit phase artifacts and gates. Modern coding agents already internalize much of the understand, design, implement, test and repair loop, so Project Tiny Context Harness keeps the high-density repo context that survives fresh chats without making every task follow Tiny Context-stage choreography.
-## Positioning
-| Adjacent tool type | Use it for | Harness stance |
-|---|---|---|
-| Spec-first kits | Turning feature ideas into structured specs and plans. | Complementary; Harness keeps durable project facts, not a required spec chain. |
-| BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default; no phase gates or work-product trees. |
-| Task Master-style planners | Backlog decomposition and task execution state. | Complementary; Harness does not own task state. |
-| Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary; Harness stores local repo truth. |
-| IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback; plain files any agent can read. |
+Minimal Context Harness creates a small, explicit recovery path: project goal, boundaries, architecture context, validation entry points and durable task conclusions. It is designed to sit beside specs, tests, issues, docs and code intelligence tools instead of replacing them.
+The concrete failure mode is not only missing file search. In an ABCD module chain where A/B/C are upstream of downstream D, a D feature can expose a missing capability. Without Context, an agent may change upstream A/B to make D pass because current code permits it. Minimal Context adds a repo-owned intent layer: it records whether downstream D may change upstream A/B, whether the gap belongs in C's contract, or whether the task needs a `Context Delta` before implementation continues. Code shows what is possible; it cannot decide whether that is allowed project intent.
+The core bet is: **keep the memory, drop the ceremony**. Earlier stage-based workflows pushed ordinary software work through explicit phase artifacts and gates. Modern coding agents already internalize much of the understand, design, implement, test and repair loop, so Project Tiny Context Harness keeps the high-density repo context that survives fresh chats without making every task follow Tiny Context-stage choreography.
+## Current Best Practice
+For short tasks, use the workflow contract and Context layer directly:
+```text
+workflow contract + project_context/** -> implementation -> verification -> drift check
+```
+For long-running tasks, externalize the target first:
+```text
+Web GPT or another external planning model produces a plan
+-> plan acceptance checklist Skill produces a goal/target-mode prompt
+-> Superpowers derives concrete implementation slices
+-> each slice follows the workflow contract + project_context/**
+```
+The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
+The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
+## Positioning
+| Adjacent tool type | Use it for | Harness stance |
+|---|---|---|
+| Spec-first kits | Turning feature ideas into structured specs and plans. | Complementary; Harness keeps durable repo facts and module boundary intent beyond one feature spec. |
+| BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default; no phase gates or work-product trees. |
+| Superpowers-style execution | Turning approved requirements into plans, subagent execution, TDD, review and finish discipline. | Complementary; use it to execute while Tiny Context owns durable repo intent and acceptance priority. |
+| Task Master-style planners | Backlog decomposition and task execution state. | Complementary; Harness does not own task state. |
+| Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary; they do not answer whether downstream D may change upstream A/B. Harness stores that local repo truth. |
+| IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback; plain files any agent can read. |
 ## Try It In 60 Seconds
@@ -115,7 +139,7 @@ npm ci
 npm run smoke:quickstart
 npm run preview:pack
 cd /path/to/your/test-repo
-npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.62.tgz
+npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
 npx --no-install ty-context init --adopt
 make validate-context
 ```
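
The ABCD module-chain scenario added to this README can be made concrete with a small sketch. Everything in it is hypothetical: the `BoundaryIntent` record, the `decide` function and the three outcome strings are illustrative names, not part of `ty-context` or the `project_context/**` format. It only illustrates the idea that the allowed direction of change is recorded as intent rather than inferred from what the current code permits.

```python
from dataclasses import dataclass, field


# Hypothetical model of repo-owned boundary intent; not a ty-context API.
@dataclass(frozen=True)
class BoundaryIntent:
    # Modules each downstream module is explicitly allowed to modify.
    may_change: dict = field(default_factory=dict)
    # Module whose contract owns gaps discovered by a downstream module.
    gap_owner: dict = field(default_factory=dict)


def decide(intent, downstream, upstream_targets):
    """Classify a proposed change by `downstream` to `upstream_targets`."""
    allowed = intent.may_change.get(downstream, frozenset())
    if set(upstream_targets) <= set(allowed):
        return "allowed"
    owner = intent.gap_owner.get(downstream)
    if owner is not None:
        return f"route-to-contract:{owner}"
    # No recorded intent covers this change, so implementation should pause.
    return "needs-context-delta"


intent = BoundaryIntent(
    may_change={"D": frozenset({"D"})},
    gap_owner={"D": "C"},
)
print(decide(intent, "D", ["A", "B"]))            # route-to-contract:C
print(decide(BoundaryIntent(), "D", ["A", "B"]))  # needs-context-delta
```

In the README's terms, the first call corresponds to "the gap belongs in C's contract" and the second to a task that needs a `Context Delta` before implementation continues.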

package/assets/README.md CHANGED

@@ -8,13 +8,13 @@
 Translations: [Chinese (Simplified)](README.zh-CN.md)
-Project Tiny Context Harness is repo-native project memory for AI coding agents.
-`project-tiny-context-harness` ships Project Tiny Context Harness through the `ty-context` CLI. It installs **Minimal Context Harness**: a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills and a `validate-context` gate so fresh agents can recover project intent, boundaries, verification entry points and next safe actions quickly.
+Project Tiny Context Harness is repo-native project memory for AI coding agents and a repo-native context contract.
+`project-tiny-context-harness` ships Project Tiny Context Harness through the `ty-context` CLI. It installs **Minimal Context Harness**: a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills, priority guidance for Context/code/evidence, and a `validate-context` gate so fresh agents can recover project intent, boundaries, verification entry points and next safe actions quickly.
 It is not another full Tiny Context ceremony. The Harness maintains context quality; project tests, reviews, CI and human acceptance still own product quality.
-Think of it as durable project memory behind `AGENTS.md`, not another agent, process framework or task manager.
+Think of it as durable project memory behind `AGENTS.md`, plus priority rules for Context/code/evidence, not another agent, process framework or task manager.
 Best for:
@@ -94,7 +94,7 @@ That smoke packs the local workspace, installs it into a disposable repo, runs `
 ```sh
 npm run preview:pack
 cd /path/to/your/test-repo
-npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.62.tgz
+npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
 npx --no-install ty-context init --adopt
 make validate-context
 ```
@@ -107,19 +107,43 @@ Use it when coding agents repeatedly lose project intent across new chats, hando
 Coding agents can move quickly inside one thread and still drift when a new chat, model, tool, reviewer or debugging session loses the project-specific facts that were never encoded anywhere stable.
-Most repositories already have README files, specs, tests and issue history, but fresh agents need a small, explicit recovery path: what the project is trying to do, what it must not do, where architecture boundaries live, how to validate changes and what durable facts changed after the last task. Minimal Context Harness makes that recovery path a first-class repo surface without adding a full planning ceremony.
-The product lesson is: **keep the memory, drop the ceremony**. Earlier stage-based workflows externalized requirements, design, implementation, review, test and release into explicit phase artifacts. Modern coding agents already internalize much of that ordinary software loop. Project Tiny Context Harness keeps the useful part: the smallest high-density repo context that survives fresh chats without forcing every task through phase transitions, work-product validation or Tiny Context-stage context splits.
-## Positioning
-| Adjacent tool type | Use it for | Harness stance |
-|---|---|---|
-| Spec-first kits | Turning a feature idea into structured specs and implementation plans. | Complementary. Keep final durable project facts in `project_context/**`; do not require spec documents for every task. |
-| BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default. Preserve context quality without shipping phase gates or work-product trees. |
-| Task Master-style planners | Backlog decomposition and task execution state. | Complementary. Harness does not own task state; it owns durable project memory. |
-| Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary. Harness keeps the local project truth that should travel with the repo. |
-| IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback. Harness files are plain repo assets that any agent can read. |
+Most repositories already have README files, specs, tests and issue history, but fresh agents need a small, explicit recovery path: what the project is trying to do, what it must not do, where architecture boundaries live, how to validate changes and what durable facts changed after the last task. Minimal Context Harness makes that recovery path a first-class repo surface without adding a full planning ceremony.
+The concrete failure mode is not only "the agent did not read enough files." In an ABCD module chain where A/B/C are upstream of downstream D, a D feature may reveal a missing capability. Without Context, the agent can satisfy D by changing upstream A/B because the current code makes that path available. What is missing is a repo-owned intent layer that says whether D may change upstream A/B, whether the gap belongs in C's contract, or whether the task must stop for a `Context Delta` before implementation continues. Current code can show what is possible; it cannot decide whether that is allowed project intent.
+The product lesson is: **keep the memory, drop the ceremony**. Earlier stage-based workflows externalized requirements, design, implementation, review, test and release into explicit phase artifacts. Modern coding agents already internalize much of that ordinary software loop. Project Tiny Context Harness keeps the useful part: the smallest high-density repo context that survives fresh chats without forcing every task through phase transitions, work-product validation or Tiny Context-stage context splits.
+## Current Best Practice
+For short tasks, use the workflow contract and Context layer directly:
+```text
+workflow contract + project_context/** -> implementation -> verification -> drift check
+```
+For long-running tasks, externalize the target first:
+```text
+Web GPT or another external planning model produces a plan
+-> plan acceptance checklist Skill produces a goal/target-mode prompt
+-> Superpowers derives concrete implementation slices
+-> each slice follows the workflow contract + project_context/**
+```
+The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
+The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
+## Positioning
+| Adjacent tool type | Use it for | Harness stance |
+|---|---|---|
+| Spec-first kits | Turning a feature idea into structured specs and implementation plans. | Complementary. Specs can define a feature, but they do not automatically maintain repo-wide module boundary intent across every later task. |
+| BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default. Preserve context quality without shipping phase gates or work-product trees. |
+| Superpowers-style execution | Turning approved requirements into plans, subagent execution, TDD, review and finish discipline. | Complementary. Use it to execute; keep Tiny Context as the durable repo intent and acceptance-priority layer. |
+| Task Master-style planners | Backlog decomposition and task execution state. | Complementary. Harness does not own task state; it owns durable project memory and module boundary facts. |
+| Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary. They improve retrieval and editing, but do not answer whether downstream D may change upstream A/B; Harness keeps that local project truth in repo. |
+| IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback. Harness files are plain repo assets that any agent can read. |
 ## Try It In 60 Seconds
@@ -275,6 +299,8 @@ For high-risk product, UI/UX and engineering tasks that affect durable architect
 Technical architecture support is a Minimal Context capability: use restrained `architecture.md`, area Module Design Capsules and existing `contract` / `decision-rationale` roles when durable architecture or rationale matters. Do not invent rationale; store stable reasons, rejected alternatives or tradeoffs only in the smallest durable Context surface when they will affect future implementation or verification choices.
 For long-running plans, RFCs or implementation proposals, the plan acceptance checklist compiler can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
+Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff works well for short tasks, but long tasks with large context windows, multiple handoffs or many verification loops are expected to drift unless the completion target is externalized. Use the plan acceptance checklist compiler before long-running execution when there is a plan-like source; treat the full checklist as the acceptance target, and treat the local audit only as temporary progress/recovery state.
 For Product Surface work, `context_surface_contract` turns broad product/page/UI principles into project-owned surface responsibilities. A Product Surface can be a Web page, mobile screen, desktop window, game UI/HUD/menu, CLI/TUI output, extension UI or embedded/device interface. Cross-surface contracts use the existing `contract` role; area-owned screen facts stay in `area` or `subdomain`; repeatable validation paths use `verification`. The Harness does not add a new surface-specific role, does not create business surface contracts during `init` or `upgrade`, and does not turn surface conformance into a validator gate. Projects that want mandatory task blocks should add a separate project-local Skill, while `product-surface-contract.md` is only a compact managed template for optional Context authoring.

package/assets/README.zh-CN.md CHANGED

@@ -2,15 +2,16 @@
 [English README](README.md)
-Project Tiny Context Harness 是给 AI coding agents 用的轻量项目记忆层。
-它不是新的全流程 Tiny Context 框架，也不是任务管理器。它做一件小事：把新会话 agent 最容易丢掉、但又必须长期稳定保留的项目事实放进仓库里，让下一次聊天、交接、调试或换工具时不用从头重新发现。
+Project Tiny Context Harness 是给 AI coding agents 用的轻量项目记忆层，也是 repo-native context contract。
+它不是新的全流程 Tiny Context 框架，也不是任务管理器。它做一件小事：把新会话 agent 最容易丢掉、但又必须长期稳定保留的项目事实，以及 Context / 代码 / 验证证据之间的读取和变更优先级放进仓库里，让下一次聊天、交接、调试或换工具时不用从头重新发现。
 一句话：
 ```text
-Keep the memory. Drop the ceremony.
-保留项目记忆，丢掉流程仪式感。
+Keep the memory. Drop the ceremony.
+保留项目记忆，丢掉流程仪式感。
+同时保留 Context / 代码 / 验证证据之间的优先级契约。
 ```
 ## 它解决什么问题
@@ -22,9 +23,10 @@ Keep the memory. Drop the ceremony.
 - 架构边界在哪里
 - 哪些文件是事实源
 - 改完以后应该跑什么验证
-- 上一次任务留下了哪些长期约束
-Project Tiny Context Harness 把这些内容压缩到几个 repo-native 文件里：
+- 上一次任务留下了哪些长期约束
+- Context、实现和验证证据冲突时谁优先
+Project Tiny Context Harness 把这些内容压缩到几个 repo-native 文件里，并通过简单工作流约束 agent 先读 Context、判断是否 context-first、实现后做 drift check：
 - `AGENTS.md`
 - `project_context/context.toml`
@@ -44,13 +46,38 @@ Fresh agent 先读这些文件，再开始改代码。
 - 现代 coding agents 已经内化了很多普通软件工程循环：理解、设计、实现、测试、修复。
 - 真正值得保留下来的不是“每次任务都走完整流程”，而是“新 agent 能快速恢复项目长期事实”。
-所以当前默认方向是 Minimal Context Harness：只维护高密度、长期有效、能帮助恢复上下文的项目事实。
+所以当前默认方向是 Minimal Context Harness：只维护高密度、长期有效、能帮助恢复上下文的项目事实。
+一个典型失败场景是 ABCD 模块链：A/B/C 是上游，D 是下游。现在做 D 的需求时发现能力缺口；如果没有 Context 和优先级约束，agent 很容易为了让 D 完成而去改上游 A/B，因为当前代码让这条路可行。但真正需要判断的是：D 是否有权改 A/B？缺口是不是属于 C 的契约？是否必须先声明 `Context Delta`，让项目意图变化被确认后再实现？代码能说明“现在怎么改得动”，不能说明“项目意图是否允许这样改”。Tiny Context 要补的就是这一层 repo 内长期事实和优先级契约。
 对于长程任务，Harness 也提供一个轻量的计划验收清单 Skill：当用户明确给出或引用某份方案 / 计划 / RFC / implementation plan，并要求生成验收清单、完成定义或 goal/target 模式提示词时，它会把计划和验收清单临时放到 `tmp/ty-context/plan-acceptance/**`。如果方案里已经有明确、具体的“验收清单”，Skill 会直接复用那份清单并单独写入完整验收清单文件，不再另行生成一份竞争清单。这只是执行前的一次验收标准梳理，不执行计划、不证明完成，也不会把临时清单注册成 `project_context/**`。
-## 适合谁
-适合：
+重要使用提示：Minimal Context 有意把 Context 读取顺序、Context / 代码优先级和漂移检查保持为 agent 级软约束，而不是机器强制 gate。这个取舍适合短任务，但长任务、大上下文、多次交接或多轮验证时预期会漂移。遇到这类任务且已有方案/计划来源时，应先用计划验收清单 Skill 外化一个可证伪完成目标；完整验收清单才是验收标准，local audit 只是临时进度/恢复状态。
+## 当前最佳实践
+短程任务直接使用流程契约和 Context 层：
+```text
+流程契约 + project_context/** -> 实现 -> 验证 -> drift check
+```
+长程任务先外化目标，再进入实现：
+```text
+Web GPT 或其他外部规划模型产出方案
+-> 计划验收清单 Skill 生成目标模式文本
+-> Superpowers 得出具体落地执行片段
+-> 每个执行片段都回到流程契约 + project_context/**
+```
+这里的 Superpowers 指具体的 [obra/Superpowers](https://github.com/obra/superpowers) 插件/开源工作流，不是泛化的执行规划替代品。如果目标模式文本或原方案还不够可执行，用 `superpowers:writing-plans` 转成 bite-sized implementation plan；有 subagent 支持时优先用 `superpowers:subagent-driven-development`，否则用 `superpowers:executing-plans`；涉及行为变更时用 `superpowers:test-driven-development`。
+原因是漂移控制。流程契约 + Context 层是软约束，短任务里通常能让 agent 按预期执行；长程任务里，Context 仍然能记录符合预期的事实，但 Context 到代码 的实现步骤会随着上下文窗口变大、多次交接、subagent 拆分和多轮验证而漂移。Web GPT 方案、目标模式文本、完整验收清单和 Superpowers 执行层，把完成目标外化成可恢复、可审计的临时执行标准，同时不恢复阶段式 gate。
+## 适合谁
+适合：
 - 经常用 Codex、Claude Code、Cursor、Gemini CLI、OpenCode 等 agent 改代码的项目。
 - 经常开新 chat，agent 反复重新理解项目的项目。

package/assets/skills/plan_acceptance_checklist_compiler/SKILL.md CHANGED

@@ -28,7 +28,7 @@ This Skill is generic. Do not embed business-specific rules, vendor-specific rul
 Every completed invocation must produce:
 1. A preserved copy of the plan under `tmp/ty-context/plan-acceptance/`. The copied plan is the implementation/source plan, not acceptance proof.
-2. A rigorous full acceptance checklist under a separate `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` path. If the plan contains an explicit concrete acceptance checklist, reuse that plan-provided checklist verbatim; otherwise derive the checklist from the plan and relevant project Context. The full checklist is the complete acceptance standard.
+2. A rigorous full acceptance checklist under a separate `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` path. If the plan contains an explicit concrete acceptance checklist, reuse that plan-provided checklist verbatim; otherwise derive the checklist from the plan and relevant project Context. The full checklist is the complete acceptance standard and owns required automated test evidence.
 3. A goal/target-mode prompt the user can paste directly into Codex. The prompt may include a compact checklist summary for direction, priority and recovery navigation, but the full checklist owns the acceptance details.
 4. When a local audit path is referenced, it is temporary execution/progress state only, not Context and not proof by itself.
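
The output list above places every artifact under `tmp/ty-context/plan-acceptance/`. A minimal sketch of that layout follows, assuming a hypothetical helper named `artifact_paths` and assuming the plan copy and local audit are stored as `<plan-slug>.md` and `<plan-slug>-local-audit.md`, the shape used by the prompt templates in this Skill. The Skill itself does not define such a function.

```python
from pathlib import PurePosixPath

# Temporary directory named in the Skill; nothing here is registered as Context.
BASE = PurePosixPath("tmp/ty-context/plan-acceptance")


def artifact_paths(plan_slug):
    """Return the temporary artifact paths for one plan (hypothetical helper)."""
    if not plan_slug or "/" in plan_slug:
        raise ValueError("plan_slug must be a non-empty file-name stem")
    return {
        "plan": BASE / f"{plan_slug}.md",
        "checklist": BASE / f"{plan_slug}-acceptance-checklist.md",
        "local_audit": BASE / f"{plan_slug}-local-audit.md",
    }


paths = artifact_paths("billing-rfc")
for role, path in paths.items():
    print(role, path)
```

The helper returns exactly three file paths, which matches the Skill's stance that the plan copy, the full checklist and the local audit stay separate.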
@@ -38,9 +38,17 @@ The goal/target-mode prompt must be no longer than 3850 characters, including li
 ```text
 实施计划: tmp/ty-context/plan-acceptance/<plan>.md（source/implementation plan，非验收证明）
-可多开agent，agent名额不够了就关掉不用的。
 完整验收清单: tmp/ty-context/plan-acceptance/<plan>-acceptance-checklist.md（该文件是完整验收标准，验收以这个为准。完成前必须逐项检查，不满足则继续实现。）
 执行审计: tmp/ty-context/plan-acceptance/<plan>-local-audit.md（临时 progress state，非 Context/proof）
+可多开agent，agent名额不够了就关掉不用的。
+如果 Superpowers 未安装，先按当前平台官方 Superpowers 安装路径安装；若安装被权限/网络/平台限制阻塞，写入执行审计，不得把阻塞当完成。
+如果 Superpowers 已安装，使用 Superpowers 执行本任务：
+- 先读完整验收清单，验收以它为准；compact prompt 只负责 direction/priority/recovery navigation
+- 若实施计划不够可执行，用 superpowers:writing-plans 转成 bite-sized implementation plan
+- 有 subagent 支持时优先用 superpowers:subagent-driven-development；否则用 superpowers:executing-plans
+- 行为变更使用 superpowers:test-driven-development；先写失败测试并观察失败，再写最小实现
+- review / finish 不能覆盖完整验收清单；完整验收清单不满足则继续实现
+- 每轮执行后更新 local audit，记录 AC 状态、当前证据、命令结果、blocker、deferred/narrowed scope、无效证据
 <验收清单>
 ```
@@ -48,9 +56,17 @@ For English requests, use this shape:
 ```text
 Plan: tmp/ty-context/plan-acceptance/<plan>.md (implementation/source plan, not acceptance proof)
-You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
 Full checklist: tmp/ty-context/plan-acceptance/<plan>-acceptance-checklist.md (complete acceptance standard; acceptance is judged against it; every item must be checked before completion)
 Local audit: tmp/ty-context/plan-acceptance/<plan>-local-audit.md (temporary execution/progress state, not Context or proof)
+You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
+If Superpowers is not installed, install it through the current platform's official Superpowers installation path first; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion.
+If Superpowers is installed, use Superpowers for this task:
+- Read the full checklist first; acceptance is judged against it, while the compact prompt only provides direction, priority and recovery navigation.
+- If the implementation plan is not executable enough, use superpowers:writing-plans to convert it into a bite-sized implementation plan.
+- Prefer superpowers:subagent-driven-development when subagents are available; otherwise use superpowers:executing-plans.
+- Use superpowers:test-driven-development for behavior changes; write a failing test, observe it fail, then write the minimal implementation.
+- Review / finish cannot override the full checklist; if the full checklist is not satisfied, continue implementation.
+- Update local audit after each execution round with AC status, current evidence, command results, blockers, deferred/narrowed scope and invalid evidence.
 <acceptance checklist>
 ```
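
The Skill states two mechanical constraints on the goal/target-mode prompt: it must be no longer than 3850 characters including line breaks, and its first line must identify the plan path. A minimal sketch of a pre-flight check follows. The function name `check_prompt` is hypothetical, and it assumes "characters" means what Python's `len` counts (Unicode code points); the Skill does not say how characters are counted.

```python
MAX_PROMPT_CHARS = 3850  # budget stated in the Skill, line breaks included


def check_prompt(prompt, plan_path):
    """Return a list of problems; an empty list means both checks pass."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt is {len(prompt)} chars, over {MAX_PROMPT_CHARS}")
    first_line = prompt.split("\n", 1)[0]
    if plan_path not in first_line:
        problems.append("first line does not identify the plan path")
    return problems


plan = "tmp/ty-context/plan-acceptance/demo.md"
prompt = f"Plan: {plan} (implementation/source plan, not acceptance proof)\nFull checklist: ..."
print(check_prompt(prompt, plan))  # []
```

A check like this covers only length and the first line. Whether the prompt keeps the required paths, blocker rules and evidence rules still needs a reader.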
@@ -141,7 +157,7 @@ The local audit path is for the future goal/target-mode executor. This compiler
 ## Step 2: Reuse Any Explicit Plan-Provided Checklist
-After materializing the plan, inspect it for an explicit concrete acceptance checklist before generating a new checklist.
+After materializing the plan, inspect it for an explicit concrete acceptance checklist and any explicit test requirements before generating a new checklist.
 Plan-provided checklist reuse applies only when the plan contains a clearly labeled checklist or checklist table section, such as `Acceptance Checklist`, `Acceptance Criteria`, `验收清单`, `验收标准`, or equivalent heading, and that section contains concrete acceptance items rather than only saying that acceptance is needed.
@@ -152,6 +168,7 @@ When a plan-provided checklist is found:
 - Do not derive, strengthen, reorder, translate, normalize, merge, split, or add acceptance items.
 - Do not prepend the generated `Acceptance Contract`, checklist table, self-test, or false-completion traps to the full checklist file when reusing a plan-provided checklist.
 - If multiple explicit checklist sections exist, copy all of them into the full checklist file in source order.
+- If the plan also includes explicit test requirements, such as `Required tests`, `Automated tests`, `Test Plan`, `必须新增/补强的自动化测试`, `测试文件`, or equivalent concrete test sections, copy those test requirements verbatim into the full checklist file as plan-provided acceptance evidence. These test requirements are acceptance evidence, not generated additions.
 - Keep the copied plan file and full checklist file separate, even if the checklist already appears inside the plan.
 - Continue to read relevant Context only as needed to explain ambiguities, conflicts, the goal/target-mode prompt, and any required local-audit or false-completion guidance. Do not use Context to expand the reused checklist unless the user separately asks for an audit or rewrite.
 - If the plan-provided checklist is too large for the 3850-character prompt budget, keep it intact in the full checklist file and make the prompt reference that file as the acceptance authority; do not invent a compact replacement checklist with new criteria.
@@ -159,6 +176,13 @@ When a plan-provided checklist is found:
 When no explicit concrete plan-provided checklist exists, continue with the generated-checklist flow below.
+When a plan already includes explicit test requirements but not a full acceptance checklist:
+- Use those plan-provided test requirements as the source of truth for the generated checklist's `Required automated tests / 必须新增或补强的自动化测试` section.
+- Preserve concrete test file paths, test names, behavior descriptions, commands, and failure notes from the plan.
+- Do not replace them with generic AC10 wording, do not create an unrelated test system, and do not invent competing test lists.
+- If the plan's test requirements are broad but concrete enough to preserve, keep the original wording and add only the minimum acceptance mapping needed to show which AC each test supports.
 ## Step 3: Read Relevant Project Context
 Read only the Context needed for the plan's impacted surfaces. Use Context to identify what the project says the system should mean.
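
Step 2 depends on spotting clearly labeled checklist and test-requirement sections in the plan. A rough sketch of that first pass follows, assuming the plan uses ATX-style Markdown headings and matching only the labels the Skill lists by name. The function `classify_headings` is hypothetical. It does not recognize "equivalent" headings and does not check whether a section contains concrete items, both of which the Skill leaves to judgment.

```python
import re

# Labels quoted in the Skill; equivalent headings are intentionally not guessed.
CHECKLIST_LABELS = {"acceptance checklist", "acceptance criteria", "验收清单", "验收标准"}
TEST_LABELS = {"required tests", "automated tests", "test plan",
               "必须新增/补强的自动化测试", "测试文件"}

# ATX heading: 1-6 '#' characters, a space, the title, optional closing '#'s.
HEADING = re.compile(r"^#{1,6}\s+(.*?)\s*#*\s*$")


def classify_headings(plan_text):
    """Return (line_number, kind, title) for headings matching a listed label."""
    found = []
    for number, line in enumerate(plan_text.splitlines(), start=1):
        match = HEADING.match(line)
        if not match:
            continue
        title = match.group(1)
        key = title.casefold()
        if key in CHECKLIST_LABELS:
            found.append((number, "checklist", title))
        elif key in TEST_LABELS:
            found.append((number, "tests", title))
    return found


plan = "# Plan\n\n## Test Plan\n- tests/test_api.py\n\n## 验收清单\n- item\n"
print(classify_headings(plan))
```

The result is only a list of candidate sections. Copying them verbatim and in source order, as the Skill requires, is a separate step.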
@@ -265,6 +289,31 @@ Allowed `Conclusion` values:
 Missing ledger evidence means incomplete, not complete. Do not let missing evidence, old evidence, partial evidence or evidence from a different read path satisfy a current-state claim.
+### Required Automated Tests / 必须新增或补强的自动化测试
+Every generated full checklist must include a `Required automated tests / 必须新增或补强的自动化测试` section. Test requirements are acceptance evidence: a future executor cannot mark a relevant implementation item complete when required test evidence is missing.
+No fourth artifact: keep these requirements inside `<plan-slug>-acceptance-checklist.md`. Do not create `tmp/ty-context/plan-acceptance/<plan-slug>-test-requirements.md` or another standalone test requirements file.
+Use this table shape when tests are required:
+```markdown
+## Required automated tests / 必须新增或补强的自动化测试
+| Test file path | Test name or behavior description | Covered acceptance item(s) | Verification command | Failure condition | Source |
+|---|---|---|---|---|---|
+```
+Rules:
+- If the plan already includes explicit test requirements, use those plan-provided test requirements as the source of truth. Preserve the plan's test file path, test name or behavior description, verification command and failure condition when present.
+- Do not replace plan-provided test requirements with generic AC10, do not invent a separate test taxonomy, and do not add unrelated tests merely because a generic category exists.
+- If no explicit test section exists, derive required tests from the plan's behavior changes, risk, Context contracts and real code/test surfaces.
+- If an exact test name cannot be inferred safely, write a behavior-level test description and do not invent exact test names.
+- Each required test row must identify the covered acceptance item(s), the verification command, and the failure condition that blocks acceptance.
+- If no new or strengthened automated tests are required, state that explicitly with the reason and the acceptance items covered by existing verification.
+- The local audit must record each required test's command, result and failure reason when it is run or when it remains blocked.
 ## Hard Blocker Handling
 Treat any unresolved required blocker as non-completion. A checklist may describe blocked acceptance work, but blocked work is still not accepted until the required evidence exists.
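The `Required automated tests` rules added in this hunk can be sketched as a small row check plus a renderer for the table shape shown above. This is a minimal illustration only: `RequiredTest`, `missing_fields` and `render_table` are hypothetical names, not part of the `ty-context` CLI, and the pipe-escaping behavior is an assumption.

```python
from dataclasses import dataclass
from typing import List

# Section title and column names copied from the table shape in the diff.
SECTION_TITLE = "## Required automated tests / 必须新增或补强的自动化测试"
COLUMNS = (
    "Test file path",
    "Test name or behavior description",
    "Covered acceptance item(s)",
    "Verification command",
    "Failure condition",
    "Source",
)


@dataclass
class RequiredTest:
    """One row of the required-tests table (hypothetical helper type)."""
    path: str
    description: str
    covered: str
    command: str
    failure: str
    source: str


def missing_fields(row: RequiredTest) -> List[str]:
    """Return the fields each row must identify that are empty:
    covered acceptance item(s), verification command, failure condition."""
    required = {"covered": row.covered, "command": row.command, "failure": row.failure}
    return [name for name, value in required.items() if not value.strip()]


def _cell(text: str) -> str:
    # Assumption: escape pipes so a cell cannot break the Markdown table.
    return text.replace("|", "\\|").strip()


def render_table(rows: List[RequiredTest]) -> str:
    """Render the section heading, header row, separator and one line per test."""
    lines = [
        SECTION_TITLE,
        "| " + " | ".join(COLUMNS) + " |",
        "|" + "---|" * len(COLUMNS),
    ]
    for row in rows:
        cells = [row.path, row.description, row.covered, row.command, row.failure, row.source]
        lines.append("| " + " | ".join(_cell(c) for c in cells) + " |")
    return "\n".join(lines)
```

A caller would run `missing_fields` on every row before writing the checklist and treat a non-empty result as a reason not to emit the row as-is. When an exact test name cannot be inferred safely, the `description` field would carry a behavior-level description instead.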
@@ -453,12 +502,14 @@ Hard requirements:
 - The prompt must be no longer than 3850 characters including line breaks. Treat 3850 as the effective hard budget and preserve information density; do not drop required paths, core acceptance categories, blocker rules, evidence rules or false-completion traps merely to be short.
 - The first line must identify the plan path.
 - Use `实施计划: <path>` for Chinese prompts and `Plan: <path>` for English prompts. The line must say the plan is the implementation/source plan and not acceptance proof.
-- The second line must be a resource lifecycle instruction: `可多开agent，agent名额不够了就关掉不用的。` for Chinese prompts or `You may use multiple agents; if agent slots run low, close idle or unnecessary agents.` for English prompts.
+- The prompt must identify the full checklist path immediately after the plan path and say it is the complete acceptance standard. Chinese prompts must include this exact sentence: `该文件是完整验收标准，验收以这个为准。完成前必须逐项检查，不满足则继续实现。` English prompts must say the full checklist is the complete acceptance standard, acceptance is judged against it, and every item must be checked before completion.
+- The prompt must identify a local audit path, normally `tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md`, and require the future executor to read it before resuming, keep it current during execution, and use it only as target-mode acceptance progress state.
+- After the plan/checklist/audit paths, include a resource lifecycle instruction: `可多开agent，agent名额不够了就关掉不用的。` for Chinese prompts or `You may use multiple agents; if agent slots run low, close idle or unnecessary agents.` for English prompts.
+- The prompt must include a Superpowers execution block. If Superpowers is not installed, tell the executor to install it through the current platform's official Superpowers installation path; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion. If Superpowers is installed, Use Superpowers for this task.
+- The Superpowers block must require: read the full checklist first and make it the acceptance authority; use `superpowers:writing-plans` when the plan is not executable enough; prefer `superpowers:subagent-driven-development` when subagents are available; otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; review / finish cannot override the full checklist; update local audit after each execution round.
 - The remaining content must be the acceptance checklist or a compact version of it.
 - The prompt must be self-contained enough for goal/target-mode execution.
-- The prompt must identify the full checklist path and say it is the complete acceptance standard. Chinese prompts must include this exact sentence: `该文件是完整验收标准，验收以这个为准。完成前必须逐项检查，不满足则继续实现。` English prompts must say the full checklist is the complete acceptance standard, acceptance is judged against it, and every item must be checked before completion.
 - If the prompt uses a compact checklist summary, say the full checklist owns details and acceptance authority; the compact summary owns direction, priority and recovery navigation; overlap is allowed; conflicts are resolved in favor of the full checklist.
-- The prompt must identify a local audit path, normally `tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md`, and require the future executor to read it before resuming, keep it current during execution, and use it only as target-mode acceptance progress state.
 - The prompt must require the local audit to record overall status (`complete`, `incomplete`, `blocked` or `narrowed-scope-complete`), each core AC status and current evidence, commands with result/time/failure reason, artifact or evidence paths, blockers and missing evidence, acceptance impact, explicit deferred or narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion.
 - The prompt must say that local audit is not Context, not product-quality proof, not a global task manager, and not a replacement for project tests, CI, review, human acceptance, Task Contract or workflow-contract `plan.md`.
 - The prompt must say that when a Task Contract or workflow-contract `plan.md` exists, each acceptance item execution still follows it and the repository's Tiny Context workflow contract.
@@ -468,19 +519,29 @@ Hard requirements:
 - If the full checklist is too large, write the full checklist to `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md`, then compress the goal/target-mode prompt by increasing information density while preserving all core acceptance categories.
 - If the full checklist came from a plan-provided checklist and is too large, keep the extracted checklist unchanged in the full checklist file and compress the prompt by referencing the full checklist path, not by rewriting or adding criteria.
 - The compact prompt may reference the full checklist path, but it must still include the core completion criteria directly and state that the summary is direction/priority/recovery navigation, not the acceptance authority.
+- Compact prompts must not expand long test lists. AC10 must reference the Required automated tests section in the full checklist and state that behavior changes still use `superpowers:test-driven-development`.
 Recommended compact Chinese prompt shape:
 ```text
 实施计划: tmp/ty-context/plan-acceptance/<plan-slug>.md（source/implementation plan，非验收证明）
-可多开agent，agent名额不够了就关掉不用的。
 完整验收清单: tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md（该文件是完整验收标准，验收以这个为准。完成前必须逐项检查，不满足则继续实现。）
 执行审计: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md（临时 progress state，非 Context/proof）
+可多开agent，agent名额不够了就关掉不用的。
 本摘要只负责 direction/priority/recovery navigation；允许与完整 checklist 重叠，冲突时以完整 checklist 为准。
+如果 Superpowers 未安装，先按当前平台官方 Superpowers 安装路径安装；若安装被权限/网络/平台限制阻塞，写入执行审计，不得把阻塞当完成。
+如果 Superpowers 已安装，使用 Superpowers 执行本任务：
+- 先读完整验收清单，验收以它为准
+- 若实施计划不够可执行，用 superpowers:writing-plans 转成 bite-sized implementation plan
+- 有 subagent 支持时优先用 superpowers:subagent-driven-development；否则用 superpowers:executing-plans
+- 行为变更用 superpowers:test-driven-development；先写失败测试并观察失败，再写最小实现
+- review / finish 不能覆盖完整验收清单；不满足则继续实现
+- 每轮执行后更新 local audit，记录 AC 状态、证据、命令结果、blocker、deferred/narrowed scope、无效证据
 验收清单：
 AC1 <核心完成定义，包含验收证据>
-AC2 <范围/清单/覆盖要求>
+AC2 <范围/清单/覆盖要求>
 AC3 <Context/架构/边界要求>
 AC4 <核心实现行为要求>
 AC5 <数据/API/接口/契约要求>
@@ -488,9 +549,9 @@ AC6 <运行态/配置/外部依赖/阻塞分类要求>
 AC7 <artifact/evidence/schema/freshness/provenance 要求>
 AC8 <UI/用户可见/API 投影一致性要求>
 AC9 <安全/隐私/脱敏/secret 要求>
-AC10 <测试/构建/集成/smoke/回归要求>
+AC10 测试要求：按完整验收清单的 `Required automated tests / 必须新增或补强的自动化测试` section 执行；compact prompt 不展开长测试列表；行为变更仍用 superpowers:test-driven-development。
 AC11 <文档/Context 更新要求，仅在计划要求时执行>
-AC12 维护执行审计：恢复执行先读 audit；记录总体状态、每个 AC 当前证据、命令/结果/时间、artifact/evidence 路径、blocker、deferred/narrowed scope、不能证明 full completion 的旧/部分/smoke/dry-run/research 证据；audit 不是 Context、完成证明、全局任务管理器，也不替代 Task Contract 或流程契约 plan.md。
+AC12 维护执行审计：恢复执行先读 audit；记录总体状态、每个 AC 当前证据、命令/结果/时间、每个 required test 的 command/result/failure reason、artifact/evidence 路径、blocker、deferred/narrowed scope、不能证明 full completion 的旧/部分/smoke/dry-run/research 证据；audit 不是 Context、完成证明、全局任务管理器，也不替代 Task Contract 或流程契约 plan.md。
 AC13 最小用户卡点：问用户前先完成安全自助发现；需要用户介入时只给最小动作清单，写明已尝试、缺失项、具体页面/菜单/字段/按钮、最小值/动作、不要发送的敏感信息、验收影响、fallback/deferred。
 AC14 完成前审计：逐条对照实施计划和完整 checklist；每个 core 项必须有当前证据；未跑验证必须明示；有可继续执行的 core 项不得标记完成；外部/强卡点必须写明原因、缺失证据、验收影响和下一步；若剩余未完成项只有无法本地解决的强卡点，暂停并等待用户/外部 owner，不能标记目标完成。
@@ -501,14 +562,23 @@ Recommended compact English prompt shape:
 ```text
 Plan: tmp/ty-context/plan-acceptance/<plan-slug>.md (implementation/source plan, not acceptance proof)
-You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
 Full checklist: tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md (complete acceptance standard; acceptance is judged against it; every item must be checked before completion)
 Local audit: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md (temporary progress state, not Context or proof)
+You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
 This summary is only direction, priority and recovery navigation; overlap with the full checklist is allowed, and the full checklist wins conflicts.
+If Superpowers is not installed, install it through the current platform's official Superpowers installation path first; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion.
+If Superpowers is installed, Use Superpowers for this task:
+- Read the full checklist first; acceptance is judged against it.
+- If the plan is not executable enough, use superpowers:writing-plans for a bite-sized implementation plan.
+- Prefer superpowers:subagent-driven-development when subagents are available; otherwise use superpowers:executing-plans.
+- Use superpowers:test-driven-development for behavior changes; write a failing test, observe failure, then implement minimally.
+- review / finish cannot override the full checklist; if unsatisfied, continue implementation.
+- update local audit after each execution round with AC status, evidence, command results, blockers, deferred/narrowed scope and invalid evidence.
 Acceptance checklist:
 AC1 <core completion definition with required evidence>
-AC2 <scope inventory and coverage>
+AC2 <scope inventory and coverage>
 AC3 <Context / architecture / boundary conformance>
 AC4 <core implementation behavior>
 AC5 <data / API / interface / contract requirements>
@@ -516,9 +586,9 @@ AC6 <runtime / configuration / external dependency / blocker classification>
 AC7 <artifact / evidence / schema / freshness / provenance requirements>
 AC8 <UI / user-visible / API projection consistency>
 AC9 <security / privacy / redaction / secret handling>
-AC10 <test / build / integration / smoke / regression requirements>
+AC10 Test requirements: follow the full checklist's `Required automated tests / 必须新增或补强的自动化测试` section; the compact prompt must not expand long test lists; behavior changes still use superpowers:test-driven-development.
 AC11 <documentation / Context updates only when required by the plan>
-AC12 Maintain local audit: read it before resuming; record overall status, every AC's current evidence, commands/results/time, artifact/evidence paths, blockers, deferred/narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion; audit is not Context, completion proof, a global task manager, or a replacement for Task Contract or workflow-contract plan.md.
+AC12 Maintain local audit: read it before resuming; record overall status, every AC's current evidence, commands/results/time, each required test's command/result/failure reason, artifact/evidence paths, blockers, deferred/narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion; audit is not Context, completion proof, a global task manager, or a replacement for Task Contract or workflow-contract plan.md.
 AC13 Minimal user blocker protocol: before asking the user, complete safe self-service discovery; when user action is needed, provide only the smallest action list with what was tried, missing item, exact page/menu/path/field/button, minimum value/action, sensitive material not to send, acceptance impact, and fallback/deferred option.
 AC14 Final audit: compare every item against the plan and full checklist; every core item needs current evidence; missing validation must be stated; any executable core item left open means the task is not complete; external or hard blockers need cause, missing evidence, acceptance impact, and next action; if only locally unsatisfiable hard blockers remain, pause for the user or external owner instead of marking the goal complete.
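The reordered hard requirements (plan path first, then checklist and audit paths, then the resource lifecycle instruction, all within the 3850-character budget) could be checked with a sketch like the one below. The function name `check_compact_prompt` is hypothetical and not part of the package. Counting characters with `len()` (Unicode code points) and locating paths by their file-name suffix are both assumptions.

```python
from typing import List

MAX_PROMPT_CHARS = 3850  # budget stated in the hard requirements, line breaks included
PLAN_PREFIXES = ("Plan: ", "实施计划: ")
AGENT_LINES = (
    "可多开agent，agent名额不够了就关掉不用的。",
    "You may use multiple agents; if agent slots run low, close idle or unnecessary agents.",
)


def check_compact_prompt(prompt: str) -> List[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt is {len(prompt)} characters; budget is {MAX_PROMPT_CHARS}")

    lines = prompt.splitlines()
    if not lines or not lines[0].startswith(PLAN_PREFIXES):
        problems.append("first line must identify the plan path")

    # Assumption: paths follow the <plan-slug>-acceptance-checklist.md and
    # <plan-slug>-local-audit.md naming used in the recommended prompt shapes.
    checklist_at = prompt.find("-acceptance-checklist.md")
    audit_at = prompt.find("-local-audit.md")
    agent_at = max(prompt.find(line) for line in AGENT_LINES)

    if checklist_at < 0:
        problems.append("full checklist path is missing")
    if audit_at < 0:
        problems.append("local audit path is missing")
    if agent_at < 0:
        problems.append("resource lifecycle instruction is missing")
    elif agent_at < max(checklist_at, audit_at):
        problems.append("resource lifecycle instruction must follow the plan/checklist/audit paths")
    return problems
```

This covers only the ordering and length rules. The Superpowers block, the exact Chinese acceptance sentence and the AC list would still need their own checks or a human review.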

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "project-tiny-context-harness",
-  "version": "0.2.62",
+  "version": "0.2.64",
   "description": "Minimal project memory and validation harness for AI coding agents.",
   "license": "MIT",
   "author": "Seven128",