project-tiny-context-harness 0.2.62 → 0.2.64

package/README.md CHANGED
@@ -8,15 +8,15 @@
8
8
 
9
9
  Translations: [Chinese (Simplified)](https://github.com/Seven128/project-tiny-context-harness/blob/main/README.zh-CN.md)
10
10
 
11
- `project-tiny-context-harness` ships the `ty-context` CLI for Project Tiny Context Harness: repo-native project memory for AI coding agents.
12
-
13
- The default is **Minimal Context Harness**. It maintains a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills and a `validate-context` gate so fresh agents can recover project intent, constraints, verification entry points and next safe actions quickly.
11
+ `project-tiny-context-harness` ships the `ty-context` CLI for Project Tiny Context Harness: repo-native project memory for AI coding agents and a repo-native context contract.
12
+
13
+ The default is **Minimal Context Harness**. It maintains a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills, priority guidance for Context/code/evidence, and a `validate-context` gate so fresh agents can recover project intent, constraints, verification entry points and next safe actions quickly.
14
14
 
15
15
  It does not default to lifecycle phases, plan tasks, stage skills, stage documents or phase gates. Harness maintains context quality; your project tests, CI, review process and human acceptance remain responsible for product quality.
16
16
 
17
17
  Use it when coding agents repeatedly lose project intent across new chats, handoffs, RFC/debug turns or tool changes. The intended tradeoff is: keep durable intent and recovery paths; leave execution evidence to code, tests and review.
18
18
 
19
- Think of it as durable project memory behind `AGENTS.md`, not another agent, process framework or task manager.
19
+ Think of it as durable project memory behind `AGENTS.md`, plus priority rules for Context/code/evidence, not another agent, process framework or task manager.
20
20
 
21
21
  Best for:
22
22
 
@@ -63,19 +63,43 @@ No-install preview:
63
63
 
64
64
  Coding agents can move quickly inside one thread and still drift when a new chat, model, tool, reviewer or debugging session loses the project-specific facts that were never encoded anywhere stable.
65
65
 
66
- Minimal Context Harness creates a small, explicit recovery path: project goal, boundaries, architecture context, validation entry points and durable task conclusions. It is designed to sit beside specs, tests, issues, docs and code intelligence tools instead of replacing them.
67
-
68
- The core bet is: **keep the memory, drop the ceremony**. Earlier stage-based workflows pushed ordinary software work through explicit phase artifacts and gates. Modern coding agents already internalize much of the understand, design, implement, test and repair loop, so Project Tiny Context Harness keeps the high-density repo context that survives fresh chats without making every task follow Tiny Context-stage choreography.
69
-
70
- ## Positioning
71
-
72
- | Adjacent tool type | Use it for | Harness stance |
73
- |---|---|---|
74
- | Spec-first kits | Turning feature ideas into structured specs and plans. | Complementary; Harness keeps durable project facts, not a required spec chain. |
75
- | BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default; no phase gates or work-product trees. |
76
- | Task Master-style planners | Backlog decomposition and task execution state. | Complementary; Harness does not own task state. |
77
- | Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary; Harness stores local repo truth. |
78
- | IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback; plain files any agent can read. |
66
+ Minimal Context Harness creates a small, explicit recovery path: project goal, boundaries, architecture context, validation entry points and durable task conclusions. It is designed to sit beside specs, tests, issues, docs and code intelligence tools instead of replacing them.
67
+
68
+ The concrete failure mode is not only missing file search. In an ABCD module chain where A/B/C are upstream of downstream D, a D feature can expose a missing capability. Without Context, an agent may change upstream A/B to make D pass because current code permits it. Minimal Context adds a repo-owned intent layer: it records whether downstream D may change upstream A/B, whether the gap belongs in C's contract, or whether the task needs a `Context Delta` before implementation continues. Code shows what is possible; it cannot decide whether that is allowed project intent.
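To make the A/B/C/D example concrete, here is a minimal sketch of the decision that the intent layer is meant to inform. Everything in it is hypothetical: `BoundaryIntent`, `may_modify` and `decide` are illustrative names, not part of `ty-context`, and the Harness itself leaves this judgment to the agent as a soft constraint rather than running code like this.

```python
from dataclasses import dataclass, field


@dataclass
class BoundaryIntent:
    """Hypothetical in-memory form of module boundary facts recorded in Context."""

    # Maps a module to the set of other modules it is allowed to modify.
    may_modify: dict[str, set[str]] = field(default_factory=dict)


def decide(intent: BoundaryIntent, working_on: str, touched: set[str]) -> str:
    """Return 'proceed' when every touched module is in bounds, else 'context-delta'."""
    allowed = intent.may_modify.get(working_on, set()) | {working_on}
    out_of_bounds = touched - allowed
    return "context-delta" if out_of_bounds else "proceed"


# Assumed project intent: D may extend C's contract, but may not change A or B.
intent = BoundaryIntent(may_modify={"D": {"C"}})
print(decide(intent, "D", {"D", "C"}))
print(decide(intent, "D", {"D", "A"}))
```

The point of the sketch is that the answer comes from recorded intent, not from what the current code happens to permit.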
69
+
70
+ The core bet is: **keep the memory, drop the ceremony**. Earlier stage-based workflows pushed ordinary software work through explicit phase artifacts and gates. Modern coding agents already internalize much of the understand, design, implement, test and repair loop, so Project Tiny Context Harness keeps the high-density repo context that survives fresh chats without making every task follow Tiny Context-stage choreography.
71
+
72
+ ## Current Best Practice
73
+
74
+ For short tasks, use the workflow contract and Context layer directly:
75
+
76
+ ```text
77
+ workflow contract + project_context/** -> implementation -> verification -> drift check
78
+ ```
79
+
80
+ For long-running tasks, externalize the target first:
81
+
82
+ ```text
83
+ Web GPT or another external planning model produces a plan
84
+ -> plan acceptance checklist Skill produces a goal/target-mode prompt
85
+ -> Superpowers derives concrete implementation slices
86
+ -> each slice follows the workflow contract + project_context/**
87
+ ```
88
+
89
+ The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
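The selection rule in the paragraph above can be sketched as a small function. The skill identifiers are the ones named in the text; the function name and its three boolean inputs are assumptions made for illustration, and nothing here calls Superpowers itself.

```python
def pick_superpowers_skills(
    plan_is_bite_sized: bool,
    subagents_available: bool,
    changes_behavior: bool,
) -> list[str]:
    """Return the Superpowers skills to use, in the order described above."""
    skills: list[str] = []
    # Planning is only needed when the source plan is not yet bite-sized.
    if not plan_is_bite_sized:
        skills.append("superpowers:writing-plans")
    # Exactly one execution skill is chosen, depending on subagent support.
    if subagents_available:
        skills.append("superpowers:subagent-driven-development")
    else:
        skills.append("superpowers:executing-plans")
    # Behavior changes additionally call for test-driven development.
    if changes_behavior:
        skills.append("superpowers:test-driven-development")
    return skills


print(pick_superpowers_skills(False, True, True))
```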
90
+
91
+ The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
92
+
93
+ ## Positioning
94
+
95
+ | Adjacent tool type | Use it for | Harness stance |
96
+ |---|---|---|
97
+ | Spec-first kits | Turning feature ideas into structured specs and plans. | Complementary; Harness keeps durable repo facts and module boundary intent beyond one feature spec. |
98
+ | BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default; no phase gates or work-product trees. |
99
+ | Superpowers-style execution | Turning approved requirements into plans, subagent execution, TDD, review and finish discipline. | Complementary; use it to execute while Tiny Context owns durable repo intent and acceptance priority. |
100
+ | Task Master-style planners | Backlog decomposition and task execution state. | Complementary; Harness does not own task state. |
101
+ | Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary; they do not answer whether downstream D may change upstream A/B. Harness stores that local repo truth. |
102
+ | IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback; plain files any agent can read. |
79
103
 
80
104
  ## Try It In 60 Seconds
81
105
 
@@ -115,7 +139,7 @@ npm ci
115
139
  npm run smoke:quickstart
116
140
  npm run preview:pack
117
141
  cd /path/to/your/test-repo
118
- npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.62.tgz
142
+ npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
119
143
  npx --no-install ty-context init --adopt
120
144
  make validate-context
121
145
  ```
package/assets/README.md CHANGED
@@ -8,13 +8,13 @@
8
8
 
9
9
  Translations: [Chinese (Simplified)](README.zh-CN.md)
10
10
 
11
- Project Tiny Context Harness is repo-native project memory for AI coding agents.
12
-
13
- `project-tiny-context-harness` ships Project Tiny Context Harness through the `ty-context` CLI. It installs **Minimal Context Harness**: a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills and a `validate-context` gate so fresh agents can recover project intent, boundaries, verification entry points and next safe actions quickly.
11
+ Project Tiny Context Harness is repo-native project memory for AI coding agents and a repo-native context contract.
12
+
13
+ `project-tiny-context-harness` ships Project Tiny Context Harness through the `ty-context` CLI. It installs **Minimal Context Harness**: a compact `project_context/**` fact source, a short `AGENTS.md` startup router, role Skills, priority guidance for Context/code/evidence, and a `validate-context` gate so fresh agents can recover project intent, boundaries, verification entry points and next safe actions quickly.
14
14
 
15
15
  It is not another full Tiny Context ceremony. The Harness maintains context quality; project tests, reviews, CI and human acceptance still own product quality.
16
16
 
17
- Think of it as durable project memory behind `AGENTS.md`, not another agent, process framework or task manager.
17
+ Think of it as durable project memory behind `AGENTS.md`, plus priority rules for Context/code/evidence, not another agent, process framework or task manager.
18
18
 
19
19
  Best for:
20
20
 
@@ -94,7 +94,7 @@ That smoke packs the local workspace, installs it into a disposable repo, runs `
94
94
  ```sh
95
95
  npm run preview:pack
96
96
  cd /path/to/your/test-repo
97
- npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.62.tgz
97
+ npm install -D /path/to/project-tiny-context-harness/tmp/ty-context/source-preview/package/project-tiny-context-harness-0.2.64.tgz
98
98
  npx --no-install ty-context init --adopt
99
99
  make validate-context
100
100
  ```
@@ -107,19 +107,43 @@ Use it when coding agents repeatedly lose project intent across new chats, hando
107
107
 
108
108
  Coding agents can move quickly inside one thread and still drift when a new chat, model, tool, reviewer or debugging session loses the project-specific facts that were never encoded anywhere stable.
109
109
 
110
- Most repositories already have README files, specs, tests and issue history, but fresh agents need a small, explicit recovery path: what the project is trying to do, what it must not do, where architecture boundaries live, how to validate changes and what durable facts changed after the last task. Minimal Context Harness makes that recovery path a first-class repo surface without adding a full planning ceremony.
111
-
112
- The product lesson is: **keep the memory, drop the ceremony**. Earlier stage-based workflows externalized requirements, design, implementation, review, test and release into explicit phase artifacts. Modern coding agents already internalize much of that ordinary software loop. Project Tiny Context Harness keeps the useful part: the smallest high-density repo context that survives fresh chats without forcing every task through phase transitions, work-product validation or Tiny Context-stage context splits.
113
-
114
- ## Positioning
115
-
116
- | Adjacent tool type | Use it for | Harness stance |
117
- |---|---|---|
118
- | Spec-first kits | Turning a feature idea into structured specs and implementation plans. | Complementary. Keep final durable project facts in `project_context/**`; do not require spec documents for every task. |
119
- | BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default. Preserve context quality without shipping phase gates or work-product trees. |
120
- | Task Master-style planners | Backlog decomposition and task execution state. | Complementary. Harness does not own task state; it owns durable project memory. |
121
- | Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary. Harness keeps the local project truth that should travel with the repo. |
122
- | IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback. Harness files are plain repo assets that any agent can read. |
110
+ Most repositories already have README files, specs, tests and issue history, but fresh agents need a small, explicit recovery path: what the project is trying to do, what it must not do, where architecture boundaries live, how to validate changes and what durable facts changed after the last task. Minimal Context Harness makes that recovery path a first-class repo surface without adding a full planning ceremony.
111
+
112
+ The concrete failure mode is not only "the agent did not read enough files." In an ABCD module chain where A/B/C are upstream of downstream D, a D feature may reveal a missing capability. Without Context, the agent can satisfy D by changing upstream A/B because the current code makes that path available. What is missing is a repo-owned intent layer that says whether D may change upstream A/B, whether the gap belongs in C's contract, or whether the task must stop for a `Context Delta` before implementation continues. Current code can show what is possible; it cannot decide whether that is allowed project intent.
113
+
114
+ The product lesson is: **keep the memory, drop the ceremony**. Earlier stage-based workflows externalized requirements, design, implementation, review, test and release into explicit phase artifacts. Modern coding agents already internalize much of that ordinary software loop. Project Tiny Context Harness keeps the useful part: the smallest high-density repo context that survives fresh chats without forcing every task through phase transitions, work-product validation or Tiny Context-stage context splits.
115
+
116
+ ## Current Best Practice
117
+
118
+ For short tasks, use the workflow contract and Context layer directly:
119
+
120
+ ```text
121
+ workflow contract + project_context/** -> implementation -> verification -> drift check
122
+ ```
123
+
124
+ For long-running tasks, externalize the target first:
125
+
126
+ ```text
127
+ Web GPT or another external planning model produces a plan
128
+ -> plan acceptance checklist Skill produces a goal/target-mode prompt
129
+ -> Superpowers derives concrete implementation slices
130
+ -> each slice follows the workflow contract + project_context/**
131
+ ```
132
+
133
+ The recommended Superpowers layer is the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/workflow, not a generic planning substitute. Use `superpowers:writing-plans` when the target-mode prompt or source plan still needs bite-sized implementation tasks, then prefer `superpowers:subagent-driven-development` when subagents are available and `superpowers:executing-plans` otherwise. Behavior changes should use `superpowers:test-driven-development`.
134
+
135
+ The reason is drift control. The workflow contract plus Context layer is intentionally a soft constraint. It works well for short tasks, and Context can still capture the expected facts for long tasks, but long execution makes the Context-to-code step drift as the context window grows, work is handed off, subagents split scope or validation loops multiply. A Web GPT plan, target-mode prompt, full acceptance checklist and Superpowers execution layer make the completion target recoverable without restoring a phase-gated workflow.
136
+
137
+ ## Positioning
138
+
139
+ | Adjacent tool type | Use it for | Harness stance |
140
+ |---|---|---|
141
+ | Spec-first kits | Turning a feature idea into structured specs and implementation plans. | Complementary. Specs can define a feature, but they do not automatically maintain repo-wide module boundary intent across every later task. |
142
+ | BMAD-style workflows and full Tiny Context processes | Coordinated role/process ceremonies on high-risk work. | Lighter default. Preserve context quality without shipping phase gates or work-product trees. |
143
+ | Superpowers-style execution | Turning approved requirements into plans, subagent execution, TDD, review and finish discipline. | Complementary. Use it to execute; keep Tiny Context as the durable repo intent and acceptance-priority layer. |
144
+ | Task Master-style planners | Backlog decomposition and task execution state. | Complementary. Harness does not own task state; it owns durable project memory and module boundary facts. |
145
+ | Context7/Serena-style retrieval or code-intelligence tools | Pulling external docs, symbols or repository facts on demand. | Complementary. They improve retrieval and editing, but do not answer whether downstream D may change upstream A/B; Harness keeps that local project truth in repo. |
146
+ | IDE or agent memory | Tool-specific continuity inside one product surface. | Portable fallback. Harness files are plain repo assets that any agent can read. |
123
147
 
124
148
  ## Try It In 60 Seconds
125
149
 
@@ -275,6 +299,8 @@ For high-risk product, UI/UX and engineering tasks that affect durable architect
275
299
  Technical architecture support is a Minimal Context capability: use restrained `architecture.md`, area Module Design Capsules and existing `contract` / `decision-rationale` roles when durable architecture or rationale matters. Do not invent rationale; store stable reasons, rejected alternatives or tradeoffs only in the smallest durable Context surface when they will affect future implementation or verification choices.
276
300
 
277
301
  For long-running plans, RFCs or implementation proposals, the plan acceptance checklist compiler can turn a plan plus relevant Context into a falsifiable acceptance checklist and a paste-ready goal/target-mode prompt. If the plan already contains an explicit concrete acceptance checklist, the Skill copies that checklist verbatim into a separate full-checklist file instead of generating a competing checklist. This is one pre-execution acceptance pass, not a task planner or workflow engine: it stores temporary inputs under `tmp/ty-context/plan-acceptance/**`, asks for confirmation when durable assumptions are unclear, and leaves execution evidence to the future executor, tests, CI, review or human acceptance. The generated prompt may require a local audit under the same temporary directory so future sessions can recover acceptance progress; that audit is not Context, not a quality proof and not a replacement for the project's Tiny Context workflow contract. The full checklist is the acceptance authority, while any compact prompt summary exists for navigation, priority and recovery after context compaction.
302
+
303
+ Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff works well for short tasks, but long tasks with large context windows, multiple handoffs or many verification loops are expected to drift unless the completion target is externalized. Use the plan acceptance checklist compiler before long-running execution when there is a plan-like source; treat the full checklist as the acceptance target, and treat the local audit only as temporary progress/recovery state.
278
304
 
279
305
  For Product Surface work, `context_surface_contract` turns broad product/page/UI principles into project-owned surface responsibilities. A Product Surface can be a Web page, mobile screen, desktop window, game UI/HUD/menu, CLI/TUI output, extension UI or embedded/device interface. Cross-surface contracts use the existing `contract` role; area-owned screen facts stay in `area` or `subdomain`; repeatable validation paths use `verification`. The Harness does not add a new surface-specific role, does not create business surface contracts during `init` or `upgrade`, and does not turn surface conformance into a validator gate. Projects that want mandatory task blocks should add a separate project-local Skill, while `product-surface-contract.md` is only a compact managed template for optional Context authoring.
280
306
 
@@ -2,15 +2,16 @@
2
2
 
3
3
  [English README](README.md)
4
4
 
5
- Project Tiny Context Harness is a lightweight project memory layer for AI coding agents.
6
-
7
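- It is not a new end-to-end Tiny Context framework, and it is not a task manager. It does one small thing: it puts into the repository the project facts that a fresh-session agent most easily loses but that must stay stable over the long term, so the next chat, handoff, debugging session or tool switch does not have to rediscover them from scratch.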
5
+ Project Tiny Context Harness is a lightweight project memory layer for AI coding agents and a repo-native context contract.
6
+
7
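+ It is not a new end-to-end Tiny Context framework, and it is not a task manager. It does one small thing: it puts into the repository the project facts that a fresh-session agent most easily loses but that must stay stable over the long term, together with the read and change priority among Context, code and verification evidence, so the next chat, handoff, debugging session or tool switch does not have to rediscover them from scratch.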
8
8
 
9
9
  In one sentence:
10
10
 
11
11
  ```text
12
- Keep the memory. Drop the ceremony.
13
- Keep the project memory; drop the process ceremony.
12
+ Keep the memory. Drop the ceremony.
13
+ Keep the project memory; drop the process ceremony.
14
+ Also keep the priority contract among Context, code and verification evidence.
14
15
  ```
15
16
 
16
17
  ## What Problem It Solves
@@ -22,9 +23,10 @@ Keep the memory. Drop the ceremony.
22
23
  - Where the architecture boundaries are
23
24
  - Which files are the fact sources
24
25
  - Which verification to run after a change
25
- - Which long-term constraints the previous task left behind
26
-
27
- Project Tiny Context Harness compresses this into a few repo-native files:
26
+ - Which long-term constraints the previous task left behind
27
+ - Which takes priority when Context, implementation and verification evidence conflict
28
+
29
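+ Project Tiny Context Harness compresses this into a few repo-native files, and uses a simple workflow to constrain the agent to read Context first, decide whether the task is context-first, and run a drift check after implementation: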
28
30
 
29
31
  - `AGENTS.md`
30
32
  - `project_context/context.toml`
@@ -44,13 +46,38 @@ Fresh agents read these files first, then start changing code.
44
46
  - Modern coding agents have already internalized much of the ordinary software engineering loop: understand, design, implement, test, repair.
45
47
  - What is really worth keeping is not "every task goes through the full process" but "a new agent can quickly recover the project's long-term facts".
46
48
 
47
- So the current default direction is Minimal Context Harness: maintain only high-density, long-lived project facts that help recover context.
48
-
49
+ So the current default direction is Minimal Context Harness: maintain only high-density, long-lived project facts that help recover context.
50
+
51
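+ A typical failure scenario is an ABCD module chain: A/B/C are upstream and D is downstream. While working on a requirement for D, a capability gap appears; without Context and priority constraints, the agent can easily change upstream A/B to get D done, because the current code makes that path feasible. But the real questions are: is D entitled to change A/B? Does the gap belong in C's contract? Must a `Context Delta` be declared first, so that the change in project intent is confirmed before implementation? Code can show "how it can be changed today"; it cannot show "whether project intent allows the change". Tiny Context fills exactly this layer of long-term in-repo facts and priority contract.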
52
+
49
53
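  For long-running tasks, Harness also provides a lightweight plan acceptance checklist Skill: when the user explicitly provides or references a proposal / plan / RFC / implementation plan and asks for an acceptance checklist, a definition of done or a goal/target-mode prompt, it temporarily stores the plan and the acceptance checklist under `tmp/ty-context/plan-acceptance/**`. If the plan already contains an explicit, concrete acceptance checklist, the Skill reuses that checklist directly and writes it to a separate full acceptance checklist file instead of generating a competing one. This is only a one-time pass over acceptance criteria before execution: it does not execute the plan, does not prove completion, and does not register the temporary checklist as `project_context/**`.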
50
-
51
- ## Who It Is For
52
-
53
- Good fit:
54
+
55
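+ Important usage note: Minimal Context intentionally keeps Context read order, Context/code priority and drift checks as agent-level soft constraints rather than machine-enforced gates. That tradeoff suits short tasks, but long tasks, large context windows, multiple handoffs or many verification rounds are expected to drift. For such tasks, when a proposal/plan source already exists, first use the plan acceptance checklist Skill to externalize a falsifiable completion target; the full acceptance checklist is the acceptance standard, and the local audit is only temporary progress/recovery state.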
56
+
57
+ ## Current Best Practice
58
+
59
+ For short tasks, use the workflow contract and Context layer directly:
60
+
61
+ ```text
62
+ workflow contract + project_context/** -> implementation -> verification -> drift check
63
+ ```
64
+
65
+ For long-running tasks, externalize the target first, then move into implementation:
66
+
67
+ ```text
68
+ Web GPT or another external planning model produces a plan
69
+ -> plan acceptance checklist Skill produces the target-mode text
70
+ -> Superpowers derives concrete implementation slices
71
+ -> each slice returns to the workflow contract + project_context/**
72
+ ```
73
+
74
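+ Superpowers here means the specific [obra/Superpowers](https://github.com/obra/superpowers) plugin/open-source workflow, not a generic substitute for execution planning. If the target-mode text or the original plan is not yet executable enough, use `superpowers:writing-plans` to turn it into a bite-sized implementation plan; prefer `superpowers:subagent-driven-development` when subagents are supported, otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` when behavior changes are involved.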
75
+
76
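+ The reason is drift control. The workflow contract + Context layer is a soft constraint; in short tasks it usually keeps the agent executing as expected. In long-running tasks, Context can still record the expected facts, but the Context-to-code implementation step drifts as the context window grows and as handoffs, subagent splits and verification rounds multiply. The Web GPT plan, target-mode text, full acceptance checklist and Superpowers execution layer externalize the completion target into a recoverable, auditable temporary execution standard, without restoring phase-style gates.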
77
+
78
+ ## Who It Is For
79
+
80
+ Good fit:
54
81
 
55
82
  - Projects that frequently use agents such as Codex, Claude Code, Cursor, Gemini CLI or OpenCode to change code.
56
83
  - Projects that frequently open new chats, where the agent repeatedly has to re-understand the project.
@@ -28,7 +28,7 @@ This Skill is generic. Do not embed business-specific rules, vendor-specific rul
28
28
  Every completed invocation must produce:
29
29
 
30
30
  1. A preserved copy of the plan under `tmp/ty-context/plan-acceptance/`. The copied plan is the implementation/source plan, not acceptance proof.
31
- 2. A rigorous full acceptance checklist under a separate `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` path. If the plan contains an explicit concrete acceptance checklist, reuse that plan-provided checklist verbatim; otherwise derive the checklist from the plan and relevant project Context. The full checklist is the complete acceptance standard.
31
+ 2. A rigorous full acceptance checklist under a separate `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md` path. If the plan contains an explicit concrete acceptance checklist, reuse that plan-provided checklist verbatim; otherwise derive the checklist from the plan and relevant project Context. The full checklist is the complete acceptance standard and owns required automated test evidence.
32
32
  3. A goal/target-mode prompt the user can paste directly into Codex. The prompt may include a compact checklist summary for direction, priority and recovery navigation, but the full checklist owns the acceptance details.
33
33
  4. When a local audit path is referenced, it is temporary execution/progress state only, not Context and not proof by itself.
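As a rough illustration of the artifacts listed above, the sketch below derives the three temporary file paths from a plan slug and checks a prompt against the 3850-character budget this Skill states. The helper names (`artifact_paths`, `within_budget`) are hypothetical, and counting with `len()` is an assumption about how the budget is measured; the Skill itself is an instruction document, not this code.

```python
from pathlib import PurePosixPath

# Directory and file suffixes as they appear in the Skill text.
PLAN_DIR = PurePosixPath("tmp/ty-context/plan-acceptance")
PROMPT_BUDGET = 3850  # maximum goal/target-mode prompt length in characters


def artifact_paths(plan_slug: str) -> dict[str, str]:
    """Return the plan copy, full checklist and local audit paths for a slug."""
    return {
        "plan": str(PLAN_DIR / f"{plan_slug}.md"),
        "checklist": str(PLAN_DIR / f"{plan_slug}-acceptance-checklist.md"),
        "local_audit": str(PLAN_DIR / f"{plan_slug}-local-audit.md"),
    }


def within_budget(prompt: str) -> bool:
    """Assumption: the budget counts every character, line breaks included."""
    return len(prompt) <= PROMPT_BUDGET


paths = artifact_paths("billing-refactor")  # hypothetical slug
print(paths["checklist"])
print(within_budget("Plan: " + paths["plan"]))
```

Keeping the three paths separate mirrors the rule that the plan copy, the full checklist and the local audit have different authority: only the checklist is the acceptance standard.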
34
34
 
@@ -38,9 +38,17 @@ The goal/target-mode prompt must be no longer than 3850 characters, including li
38
38
 
39
39
  ```text
40
40
  实施计划: tmp/ty-context/plan-acceptance/<plan>.md(source/implementation plan,非验收证明)
41
- 可多开agent,agent名额不够了就关掉不用的。
42
41
  完整验收清单: tmp/ty-context/plan-acceptance/<plan>-acceptance-checklist.md(该文件是完整验收标准,验收以这个为准。完成前必须逐项检查,不满足则继续实现。)
43
42
  执行审计: tmp/ty-context/plan-acceptance/<plan>-local-audit.md(临时 progress state,非 Context/proof)
43
+ 可多开agent,agent名额不够了就关掉不用的。
44
+ 如果 Superpowers 未安装,先按当前平台官方 Superpowers 安装路径安装;若安装被权限/网络/平台限制阻塞,写入执行审计,不得把阻塞当完成。
45
+ 如果 Superpowers 已安装,使用 Superpowers 执行本任务:
46
+ - 先读完整验收清单,验收以它为准;compact prompt 只负责 direction/priority/recovery navigation
47
+ - 若实施计划不够可执行,用 superpowers:writing-plans 转成 bite-sized implementation plan
48
+ - 有 subagent 支持时优先用 superpowers:subagent-driven-development;否则用 superpowers:executing-plans
49
+ - 行为变更使用 superpowers:test-driven-development;先写失败测试并观察失败,再写最小实现
50
+ - review / finish 不能覆盖完整验收清单;完整验收清单不满足则继续实现
51
+ - 每轮执行后更新 local audit,记录 AC 状态、当前证据、命令结果、blocker、deferred/narrowed scope、无效证据
44
52
  <验收清单>
45
53
  ```
46
54
 
@@ -48,9 +56,17 @@ For English requests, use this shape:
48
56
 
49
57
  ```text
50
58
  Plan: tmp/ty-context/plan-acceptance/<plan>.md (implementation/source plan, not acceptance proof)
51
- You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
52
59
  Full checklist: tmp/ty-context/plan-acceptance/<plan>-acceptance-checklist.md (complete acceptance standard; acceptance is judged against it; every item must be checked before completion)
53
60
  Local audit: tmp/ty-context/plan-acceptance/<plan>-local-audit.md (temporary execution/progress state, not Context or proof)
61
+ You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
62
+ If Superpowers is not installed, install it through the current platform's official Superpowers installation path first; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion.
63
+ If Superpowers is installed, use Superpowers for this task:
64
+ - Read the full checklist first; acceptance is judged against it, while the compact prompt only provides direction, priority and recovery navigation.
65
+ - If the implementation plan is not executable enough, use superpowers:writing-plans to convert it into a bite-sized implementation plan.
66
+ - Prefer superpowers:subagent-driven-development when subagents are available; otherwise use superpowers:executing-plans.
67
+ - Use superpowers:test-driven-development for behavior changes; write a failing test, observe it fail, then write the minimal implementation.
68
+ - Review / finish cannot override the full checklist; if the full checklist is not satisfied, continue implementation.
69
+ - Update local audit after each execution round with AC status, current evidence, command results, blockers, deferred/narrowed scope and invalid evidence.
54
70
  <acceptance checklist>
55
71
  ```
56
72
 
@@ -141,7 +157,7 @@ The local audit path is for the future goal/target-mode executor. This compiler
141
157
 
142
158
  ## Step 2: Reuse Any Explicit Plan-Provided Checklist
143
159
 
144
- After materializing the plan, inspect it for an explicit concrete acceptance checklist before generating a new checklist.
160
+ After materializing the plan, inspect it for an explicit concrete acceptance checklist and any explicit test requirements before generating a new checklist.
145
161
 
146
162
  Plan-provided checklist reuse applies only when the plan contains a clearly labeled checklist or checklist table section, such as `Acceptance Checklist`, `Acceptance Criteria`, `验收清单`, `验收标准`, or equivalent heading, and that section contains concrete acceptance items rather than only saying that acceptance is needed.
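A minimal sketch of that detection rule, under stated assumptions: it only recognizes the four literal headings named above (it cannot judge an "equivalent heading"), it treats any list item or table row as a "concrete acceptance item", and it ends a section at the next heading of any level. The function name and regexes are illustrative, not taken from the Skill.

```python
import re

# Literal heading labels named in the Skill text.
CHECKLIST_HEADINGS = ("Acceptance Checklist", "Acceptance Criteria", "验收清单", "验收标准")

_HEADING = re.compile(r"^#{1,6}\s+(.*?)\s*$")
# A list item ("- x", "1. x") or a table row ("| a | b |") counts as an item.
_ITEM = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+\S|^\s*\|.*\|")


def find_checklist_sections(markdown: str) -> list[str]:
    """Return checklist headings whose section holds at least one concrete item."""
    found: list[str] = []
    current = None
    has_items = False
    for line in markdown.splitlines():
        match = _HEADING.match(line)
        if match:
            # Close the previous section before opening a new one.
            if current and has_items:
                found.append(current)
            title = match.group(1)
            is_checklist = any(h.lower() in title.lower() for h in CHECKLIST_HEADINGS)
            current = title if is_checklist else None
            has_items = False
        elif current and _ITEM.match(line):
            has_items = True
    if current and has_items:
        found.append(current)
    return found


plan = "# Plan\n\n## Acceptance Checklist\n- [ ] CLI exits 0\n\n## Notes\n- misc\n\n## 验收标准\nTBD\n"
print(find_checklist_sections(plan))
```

In the sample plan, the second checklist heading is skipped because its section only says acceptance is still to be defined, which matches the "concrete items" condition in the text.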
147
163
 
@@ -152,6 +168,7 @@ When a plan-provided checklist is found:
152
168
  - Do not derive, strengthen, reorder, translate, normalize, merge, split, or add acceptance items.
153
169
  - Do not prepend the generated `Acceptance Contract`, checklist table, self-test, or false-completion traps to the full checklist file when reusing a plan-provided checklist.
154
170
  - If multiple explicit checklist sections exist, copy all of them into the full checklist file in source order.
171
+ - If the plan also includes explicit test requirements, such as `Required tests`, `Automated tests`, `Test Plan`, `必须新增/补强的自动化测试`, `测试文件`, or equivalent concrete test sections, copy those test requirements verbatim into the full checklist file as plan-provided acceptance evidence. These test requirements are acceptance evidence, not generated additions.
155
172
  - Keep the copied plan file and full checklist file separate, even if the checklist already appears inside the plan.
156
173
  - Continue to read relevant Context only as needed to explain ambiguities, conflicts, the goal/target-mode prompt, and any required local-audit or false-completion guidance. Do not use Context to expand the reused checklist unless the user separately asks for an audit or rewrite.
157
174
  - If the plan-provided checklist is too large for the 3850-character prompt budget, keep it intact in the full checklist file and make the prompt reference that file as the acceptance authority; do not invent a compact replacement checklist with new criteria.
@@ -159,6 +176,13 @@ When a plan-provided checklist is found:
159
176
 
160
177
  When no explicit concrete plan-provided checklist exists, continue with the generated-checklist flow below.
161
178
 
179
+ When a plan already includes explicit test requirements but not a full acceptance checklist:
180
+
181
+ - Use those plan-provided test requirements as the source of truth for the generated checklist's `Required automated tests / 必须新增或补强的自动化测试` section.
+ - Preserve concrete test file paths, test names, behavior descriptions, commands, and failure notes from the plan.
+ - Do not replace them with generic AC10 wording, do not create an unrelated test system, and do not invent competing test lists.
+ - If the plan's test requirements are broad but concrete enough to preserve, keep the original wording and add only the minimum acceptance mapping needed to show which AC each test supports.
+
  ## Step 3: Read Relevant Project Context
 
  Read only the Context needed for the plan's impacted surfaces. Use Context to identify what the project says the system should mean.
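
Reviewer sketch for the hunk above: the Step 2 branching (reuse a plan-provided checklist, generate a checklist seeded with plan-provided tests, or generate from scratch) can be pictured as a small classifier. This is a minimal illustration, not the skill's implementation. The function names, the return labels and the "at least one list item or table row" concreteness heuristic are all hypothetical, and the "equivalent heading" case from the text is not modeled.

```python
import re

# Heading names quoted from the skill text; "equivalent" headings are not modeled.
CHECKLIST_HEADINGS = ("Acceptance Checklist", "Acceptance Criteria", "验收清单", "验收标准")
TEST_HEADINGS = ("Required tests", "Automated tests", "Test Plan",
                 "必须新增/补强的自动化测试", "测试文件")

HEADING_RE = re.compile(r"^#{1,6}\s+(.*?)\s*$")


def find_sections(plan_text):
    """Return (heading, body_lines) pairs for each Markdown heading in the plan."""
    sections = []
    current = None
    for line in plan_text.splitlines():
        match = HEADING_RE.match(line)
        if match:
            current = (match.group(1), [])
            sections.append(current)
        elif current is not None:
            current[1].append(line)
    return sections


def has_concrete_items(body_lines):
    """Hypothetical heuristic: a section is concrete if it has a list item or table row."""
    return any(
        line.lstrip().startswith(("-", "*", "|")) or re.match(r"\s*\d+\.\s", line)
        for line in body_lines
    )


def classify_plan(plan_text):
    """Pick the flow: checklist reuse wins over plan-provided tests, then generation."""
    sections = find_sections(plan_text)

    def matches(names):
        return any(
            any(name.lower() in heading.lower() for name in names)
            and has_concrete_items(body)
            for heading, body in sections
        )

    if matches(CHECKLIST_HEADINGS):
        return "reuse-plan-checklist"
    if matches(TEST_HEADINGS):
        return "generate-checklist-with-plan-tests"
    return "generate-checklist"
```

A heading that only says acceptance is needed, with no items under it, falls through to the generated-checklist flow, which matches the "concrete acceptance items" condition in the text.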
@@ -265,6 +289,31 @@ Allowed `Conclusion` values:
 
  Missing ledger evidence means incomplete, not complete. Do not let missing evidence, old evidence, partial evidence or evidence from a different read path satisfy a current-state claim.
 
+ ### Required Automated Tests / 必须新增或补强的自动化测试
+
+ Every generated full checklist must include a `Required automated tests / 必须新增或补强的自动化测试` section. Test requirements are acceptance evidence: a future executor cannot mark a relevant implementation item complete when required test evidence is missing.
+
+ No fourth artifact: keep these requirements inside `<plan-slug>-acceptance-checklist.md`. Do not create `tmp/ty-context/plan-acceptance/<plan-slug>-test-requirements.md` or another standalone test requirements file.
+
+ Use this table shape when tests are required:
+
+ ```markdown
+ ## Required automated tests / 必须新增或补强的自动化测试
+
+ | Test file path | Test name or behavior description | Covered acceptance item(s) | Verification command | Failure condition | Source |
+ |---|---|---|---|---|---|
+ ```
+
+ Rules:
+
+ - If the plan already includes explicit test requirements, use those plan-provided test requirements as the source of truth. Preserve the plan's test file path, test name or behavior description, verification command and failure condition when present.
+ - Do not replace plan-provided test requirements with generic AC10, do not invent a separate test taxonomy, and do not add unrelated tests merely because a generic category exists.
+ - If no explicit test section exists, derive required tests from the plan's behavior changes, risk, Context contracts and real code/test surfaces.
+ - If an exact test name cannot be inferred safely, write a behavior-level test description and do not invent exact test names.
+ - Each required test row must identify the covered acceptance item(s), the verification command, and the failure condition that blocks acceptance.
+ - If no new or strengthened automated tests are required, state that explicitly with the reason and the acceptance items covered by existing verification.
+ - The local audit must record each required test's command, result and failure reason when it is run or when it remains blocked.
+
  ## Hard Blocker Handling
 
  Treat any unresolved required blocker as non-completion. A checklist may describe blocked acceptance work, but blocked work is still not accepted until the required evidence exists.
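
Reviewer sketch for the `Required automated tests` rules added in this hunk: the requirement that each row identify the covered acceptance item(s), the verification command and the failure condition can be checked mechanically. The snippet below is a minimal illustration under stated assumptions. `parse_table` and `check_rows` are hypothetical names, the parser handles only simple pipe tables, and cells that themselves contain `|` are not supported.

```python
# Column names quoted from the table shape in the skill text.
COLUMNS = [
    "Test file path",
    "Test name or behavior description",
    "Covered acceptance item(s)",
    "Verification command",
    "Failure condition",
    "Source",
]
# Columns the rules say every required test row must fill in.
REQUIRED = ["Covered acceptance item(s)", "Verification command", "Failure condition"]


def parse_table(markdown):
    """Parse the first pipe table found into (header, list of row dicts)."""
    header = None
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells
            continue
        if all(cell and set(cell) <= set("-:") for cell in cells):
            continue  # separator row such as |---|---|
        rows.append(dict(zip(header, cells)))
    return header, rows


def check_rows(markdown):
    """Return a list of human-readable problems; an empty list means the table passes."""
    header, rows = parse_table(markdown)
    problems = []
    if header != COLUMNS:
        problems.append("unexpected table header")
    for number, row in enumerate(rows, start=1):
        for column in REQUIRED:
            if not row.get(column):
                problems.append(f"row {number}: missing {column}")
    return problems
```

A row with an empty `Verification command` cell is reported rather than silently accepted, which mirrors the rule that missing test evidence blocks acceptance.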
@@ -453,12 +502,14 @@ Hard requirements:
  - The prompt must be no longer than 3850 characters including line breaks. Treat 3850 as the effective hard budget and preserve information density; do not drop required paths, core acceptance categories, blocker rules, evidence rules or false-completion traps merely to be short.
  - The first line must identify the plan path.
  - Use `实施计划: <path>` for Chinese prompts and `Plan: <path>` for English prompts. The line must say the plan is the implementation/source plan and not acceptance proof.
- - The second line must be a resource lifecycle instruction: `可多开agent,agent名额不够了就关掉不用的。` for Chinese prompts or `You may use multiple agents; if agent slots run low, close idle or unnecessary agents.` for English prompts.
+ - The prompt must identify the full checklist path immediately after the plan path and say it is the complete acceptance standard. Chinese prompts must include this exact sentence: `该文件是完整验收标准,验收以这个为准。完成前必须逐项检查,不满足则继续实现。` English prompts must say the full checklist is the complete acceptance standard, acceptance is judged against it, and every item must be checked before completion.
+ - The prompt must identify a local audit path, normally `tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md`, and require the future executor to read it before resuming, keep it current during execution, and use it only as target-mode acceptance progress state.
+ - After the plan/checklist/audit paths, include a resource lifecycle instruction: `可多开agent,agent名额不够了就关掉不用的。` for Chinese prompts or `You may use multiple agents; if agent slots run low, close idle or unnecessary agents.` for English prompts.
+ - The prompt must include a Superpowers execution block. If Superpowers is not installed, tell the executor to install it through the current platform's official Superpowers installation path; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion. If Superpowers is installed, Use Superpowers for this task.
+ - The Superpowers block must require: read the full checklist first and make it the acceptance authority; use `superpowers:writing-plans` when the plan is not executable enough; prefer `superpowers:subagent-driven-development` when subagents are available; otherwise use `superpowers:executing-plans`; use `superpowers:test-driven-development` for behavior changes; review / finish cannot override the full checklist; update local audit after each execution round.
  - The remaining content must be the acceptance checklist or a compact version of it.
  - The prompt must be self-contained enough for goal/target-mode execution.
- - The prompt must identify the full checklist path and say it is the complete acceptance standard. Chinese prompts must include this exact sentence: `该文件是完整验收标准,验收以这个为准。完成前必须逐项检查,不满足则继续实现。` English prompts must say the full checklist is the complete acceptance standard, acceptance is judged against it, and every item must be checked before completion.
  - If the prompt uses a compact checklist summary, say the full checklist owns details and acceptance authority; the compact summary owns direction, priority and recovery navigation; overlap is allowed; conflicts are resolved in favor of the full checklist.
- - The prompt must identify a local audit path, normally `tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md`, and require the future executor to read it before resuming, keep it current during execution, and use it only as target-mode acceptance progress state.
  - The prompt must require the local audit to record overall status (`complete`, `incomplete`, `blocked` or `narrowed-scope-complete`), each core AC status and current evidence, commands with result/time/failure reason, artifact or evidence paths, blockers and missing evidence, acceptance impact, explicit deferred or narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion.
  - The prompt must say that local audit is not Context, not product-quality proof, not a global task manager, and not a replacement for project tests, CI, review, human acceptance, Task Contract or workflow-contract `plan.md`.
  - The prompt must say that when a Task Contract or workflow-contract `plan.md` exists, each acceptance item execution still follows it and the repository's Tiny Context workflow contract.
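
Reviewer sketch for the reordered hard requirements in this hunk: the 3850-character budget and the new line order (plan path first, full checklist path immediately after, local audit path, then the resource lifecycle instruction) lend themselves to a simple lint. This is a minimal illustration, not part of the package. `check_prompt` is a hypothetical name, counting characters as Python code points is an assumption, and reading "immediately after the plan path" as "on the second line" is also an assumption.

```python
MAX_PROMPT_CHARS = 3850  # budget quoted from the hard requirements
PLAN_PREFIXES = ("Plan: ", "实施计划: ")
AGENT_LINES = (
    "可多开agent,agent名额不够了就关掉不用的。",
    "You may use multiple agents; if agent slots run low, close idle or unnecessary agents.",
)


def check_prompt(prompt, slug):
    """Return a list of problems with a goal/target-mode prompt; empty means it passes."""
    base = f"tmp/ty-context/plan-acceptance/{slug}"
    lines = prompt.splitlines()
    problems = []

    # len() counts code points, line breaks included (assumed to match the budget rule).
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt is {len(prompt)} characters, budget is {MAX_PROMPT_CHARS}")
    if not lines or not lines[0].startswith(PLAN_PREFIXES):
        problems.append("first line must identify the plan path")

    def first_line_with(needle):
        return next((i for i, line in enumerate(lines) if needle in line), None)

    checklist = first_line_with(f"{base}-acceptance-checklist.md")
    audit = first_line_with(f"{base}-local-audit.md")
    agents = next((i for i, line in enumerate(lines) if line.strip() in AGENT_LINES), None)

    if checklist != 1:
        problems.append("full checklist path must come immediately after the plan line")
    if audit is None:
        problems.append("local audit path is missing")
    if agents is None:
        problems.append("resource lifecycle instruction is missing")
    elif max(i for i in (0, checklist, audit) if i is not None) > agents:
        problems.append("resource lifecycle instruction must follow the plan/checklist/audit paths")
    return problems
```

Run against a prompt in the pre-0.2.64 order (resource lifecycle instruction on the second line), the check reports both the misplaced checklist path and the misplaced lifecycle instruction.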
@@ -468,19 +519,29 @@ Hard requirements:
  - If the full checklist is too large, write the full checklist to `tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md`, then compress the goal/target-mode prompt by increasing information density while preserving all core acceptance categories.
  - If the full checklist came from a plan-provided checklist and is too large, keep the extracted checklist unchanged in the full checklist file and compress the prompt by referencing the full checklist path, not by rewriting or adding criteria.
  - The compact prompt may reference the full checklist path, but it must still include the core completion criteria directly and state that the summary is direction/priority/recovery navigation, not the acceptance authority.
+ - Compact prompts must not expand long test lists. AC10 must reference the Required automated tests section in the full checklist and state that behavior changes still use `superpowers:test-driven-development`.
 
  Recommended compact Chinese prompt shape:
 
  ```text
  实施计划: tmp/ty-context/plan-acceptance/<plan-slug>.md(source/implementation plan,非验收证明)
- 可多开agent,agent名额不够了就关掉不用的。
  完整验收清单: tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md(该文件是完整验收标准,验收以这个为准。完成前必须逐项检查,不满足则继续实现。)
  执行审计: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md(临时 progress state,非 Context/proof)
+ 可多开agent,agent名额不够了就关掉不用的。
  本摘要只负责 direction/priority/recovery navigation;允许与完整 checklist 重叠,冲突时以完整 checklist 为准。
 
+ 如果 Superpowers 未安装,先按当前平台官方 Superpowers 安装路径安装;若安装被权限/网络/平台限制阻塞,写入执行审计,不得把阻塞当完成。
+ 如果 Superpowers 已安装,使用 Superpowers 执行本任务:
+ - 先读完整验收清单,验收以它为准
+ - 若实施计划不够可执行,用 superpowers:writing-plans 转成 bite-sized implementation plan
+ - 有 subagent 支持时优先用 superpowers:subagent-driven-development;否则用 superpowers:executing-plans
+ - 行为变更用 superpowers:test-driven-development;先写失败测试并观察失败,再写最小实现
+ - review / finish 不能覆盖完整验收清单;不满足则继续实现
+ - 每轮执行后更新 local audit,记录 AC 状态、证据、命令结果、blocker、deferred/narrowed scope、无效证据
+
  验收清单:
  AC1 <核心完成定义,包含验收证据>
- AC2 <范围/清单/覆盖要求>
+ AC2 <范围/清单/覆盖要求>
  AC3 <Context/架构/边界要求>
  AC4 <核心实现行为要求>
  AC5 <数据/API/接口/契约要求>
@@ -488,9 +549,9 @@ AC6 <运行态/配置/外部依赖/阻塞分类要求>
  AC7 <artifact/evidence/schema/freshness/provenance 要求>
  AC8 <UI/用户可见/API 投影一致性要求>
  AC9 <安全/隐私/脱敏/secret 要求>
- AC10 <测试/构建/集成/smoke/回归要求>
+ AC10 测试要求:按完整验收清单的 `Required automated tests / 必须新增或补强的自动化测试` section 执行;compact prompt 不展开长测试列表;行为变更仍用 superpowers:test-driven-development。
  AC11 <文档/Context 更新要求,仅在计划要求时执行>
- AC12 维护执行审计:恢复执行先读 audit;记录总体状态、每个 AC 当前证据、命令/结果/时间、artifact/evidence 路径、blocker、deferred/narrowed scope、不能证明 full completion 的旧/部分/smoke/dry-run/research 证据;audit 不是 Context、完成证明、全局任务管理器,也不替代 Task Contract 或流程契约 plan.md。
+ AC12 维护执行审计:恢复执行先读 audit;记录总体状态、每个 AC 当前证据、命令/结果/时间、每个 required test 的 command/result/failure reason、artifact/evidence 路径、blocker、deferred/narrowed scope、不能证明 full completion 的旧/部分/smoke/dry-run/research 证据;audit 不是 Context、完成证明、全局任务管理器,也不替代 Task Contract 或流程契约 plan.md。
  AC13 最小用户卡点:问用户前先完成安全自助发现;需要用户介入时只给最小动作清单,写明已尝试、缺失项、具体页面/菜单/字段/按钮、最小值/动作、不要发送的敏感信息、验收影响、fallback/deferred。
  AC14 完成前审计:逐条对照实施计划和完整 checklist;每个 core 项必须有当前证据;未跑验证必须明示;有可继续执行的 core 项不得标记完成;外部/强卡点必须写明原因、缺失证据、验收影响和下一步;若剩余未完成项只有无法本地解决的强卡点,暂停并等待用户/外部 owner,不能标记目标完成。
 
@@ -501,14 +562,23 @@ Recommended compact English prompt shape:
 
  ```text
  Plan: tmp/ty-context/plan-acceptance/<plan-slug>.md (implementation/source plan, not acceptance proof)
- You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
  Full checklist: tmp/ty-context/plan-acceptance/<plan-slug>-acceptance-checklist.md (complete acceptance standard; acceptance is judged against it; every item must be checked before completion)
  Local audit: tmp/ty-context/plan-acceptance/<plan-slug>-local-audit.md (temporary progress state, not Context or proof)
+ You may use multiple agents; if agent slots run low, close idle or unnecessary agents.
  This summary is only direction, priority and recovery navigation; overlap with the full checklist is allowed, and the full checklist wins conflicts.
 
+ If Superpowers is not installed, install it through the current platform's official Superpowers installation path first; if installation is blocked by permissions, network or platform limits, record it in local audit and do not treat the blocker as completion.
+ If Superpowers is installed, Use Superpowers for this task:
+ - Read the full checklist first; acceptance is judged against it.
+ - If the plan is not executable enough, use superpowers:writing-plans for a bite-sized implementation plan.
+ - Prefer superpowers:subagent-driven-development when subagents are available; otherwise use superpowers:executing-plans.
+ - Use superpowers:test-driven-development for behavior changes; write a failing test, observe failure, then implement minimally.
+ - review / finish cannot override the full checklist; if unsatisfied, continue implementation.
+ - Update local audit after each execution round with AC status, evidence, command results, blockers, deferred/narrowed scope and invalid evidence.
+
  Acceptance checklist:
  AC1 <core completion definition with required evidence>
- AC2 <scope inventory and coverage>
+ AC2 <scope inventory and coverage>
  AC3 <Context / architecture / boundary conformance>
  AC4 <core implementation behavior>
  AC5 <data / API / interface / contract requirements>
@@ -516,9 +586,9 @@ AC6 <runtime / configuration / external dependency / blocker classification>
  AC7 <artifact / evidence / schema / freshness / provenance requirements>
  AC8 <UI / user-visible / API projection consistency>
  AC9 <security / privacy / redaction / secret handling>
- AC10 <test / build / integration / smoke / regression requirements>
+ AC10 Test requirements: follow the full checklist's `Required automated tests / 必须新增或补强的自动化测试` section; the compact prompt must not expand long test lists; behavior changes still use superpowers:test-driven-development.
  AC11 <documentation / Context updates only when required by the plan>
- AC12 Maintain local audit: read it before resuming; record overall status, every AC's current evidence, commands/results/time, artifact/evidence paths, blockers, deferred/narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion; audit is not Context, completion proof, a global task manager, or a replacement for Task Contract or workflow-contract plan.md.
+ AC12 Maintain local audit: read it before resuming; record overall status, every AC's current evidence, commands/results/time, each required test's command/result/failure reason, artifact/evidence paths, blockers, deferred/narrowed scope, and stale/partial/smoke/dry-run/research evidence that cannot prove full completion; audit is not Context, completion proof, a global task manager, or a replacement for Task Contract or workflow-contract plan.md.
  AC13 Minimal user blocker protocol: before asking the user, complete safe self-service discovery; when user action is needed, provide only the smallest action list with what was tried, missing item, exact page/menu/path/field/button, minimum value/action, sensitive material not to send, acceptance impact, and fallback/deferred option.
  AC14 Final audit: compare every item against the plan and full checklist; every core item needs current evidence; missing validation must be stated; any executable core item left open means the task is not complete; external or hard blockers need cause, missing evidence, acceptance impact, and next action; if only locally unsatisfiable hard blockers remain, pause for the user or external owner instead of marking the goal complete.
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "project-tiny-context-harness",
- "version": "0.2.62",
+ "version": "0.2.64",
  "description": "Minimal project memory and validation harness for AI coding agents.",
  "license": "MIT",
  "author": "Seven128",