@wnlen/agent-execution-template 0.8.15 → 0.8.17

package/README.md CHANGED
@@ -147,12 +147,37 @@ npx -y @wnlen/agent-execution-template strategy --lang en
147
147
  | Strategy amendment gate | New direction goes through `inbox/ideas/`, a proposal, human confirmation, then an explicit apply task. |
148
148
  | Protected project context | `update` refreshes `ai/template/**` without overwriting `ai/project/**`. |
149
149
  | Project context refresh | `refresh` backs up old `ai/project/**`, creates a fresh project context, and imports the old context into the inbox for reconciliation. |
150
- | Bounded task execution | Goals, scope, permissions, risk, and acceptance criteria live in one task file. |
150
+ | Automatic continuous execution | The agent decomposes L1/L2/L3 before execution; 2+ L1 tasks automatically enable bounded continuous execution, and only Red risk stops for confirmation. |
151
151
  | Auditable results | Every run can leave human-readable output, machine-readable facts, and metrics. |
152
152
  | Token-efficient model policy | Cheap models handle bounded work; strong models are reserved for judgment points. |
153
153
  | Upgradeable template | Reuse protocol improvements without losing local project memory. |
154
154
  | Doctor checks | Validate required files and template version before running the agent. |
155
155
 
156
+ ## How Automatic Continuous Execution Works
157
+
158
+ The user can still give a natural-language goal, for example:
159
+
160
+ ```text
161
+ Build the settings page with profile editing, notification toggles, and export entrypoint
162
+ ```
163
+
164
+ Before execution, the AI decomposes L1 tasks:
165
+
166
+ ```text
167
+ - [ ] L1-1 Profile editing Green
168
+ - [ ] L1-2 Notification toggles Green
169
+ - [ ] L1-3 Export entrypoint Yellow
170
+ ```
171
+
172
+ Because there are two or more L1 tasks, the protocol automatically uses bounded
173
+ continuous execution. Before each L1, the AI plans naturally derived L2/L3 work.
174
+ After completing an L1, it checks and strikes the item, then writes status back
175
+ to `execution_policy.task_tree` in `ai/project/task.md`.
176
+
177
+ Only Red risk stops for confirmation. Green continues automatically, and Yellow
178
+ continues after local low-risk correction. Every checkpoint must include
179
+ evidence: changed files, commands run, and verification results.
180
+
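The activation rule described in this README section can be sketched in a few lines. This is a minimal illustration, not code from the package: `planExecution` and its return shape are hypothetical names, and the only rules it encodes are the ones stated above (fewer than 2 L1 tasks means `normal`, 2 or more means `bounded_continuous`, and only Red stops for confirmation).

```javascript
// Hypothetical helper, not part of the package's CLI.
// Picks the execution mode from the L1 checklist and reports Red items.
function planExecution(l1Tasks) {
  // 2 or more L1 tasks enable bounded continuous execution automatically.
  const mode = l1Tasks.length >= 2 ? "bounded_continuous" : "normal";
  // Only Red risk requires human confirmation; Green and Yellow do not block.
  const redTasks = l1Tasks
    .filter((task) => task.risk === "Red")
    .map((task) => task.id);
  return { mode, requiresConfirmation: redTasks.length > 0, redTasks };
}

// The settings-page example from the README text.
const plan = planExecution([
  { id: "L1-1", title: "Profile editing", risk: "Green" },
  { id: "L1-2", title: "Notification toggles", risk: "Green" },
  { id: "L1-3", title: "Export entrypoint", risk: "Yellow" },
]);
console.log(plan);
```

With the three example tasks, the sketch selects `bounded_continuous` and requests no confirmation, because no L1 is Red.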
156
181
  ## Installed Layout
157
182
 
158
183
  ```text
@@ -162,6 +187,7 @@ ai/
162
187
  template/
163
188
  VERSION
164
189
  bootstrap.md
190
+ execution-policy.md
165
191
  prompt.md
166
192
  reconcile.md
167
193
  protocol.md
@@ -323,6 +349,8 @@ Reconcile the new material in ai/project/inbox/
323
349
 
324
350
  The agent must produce a reconciliation plan first, wait for confirmation, then merge long-lived facts into `project.md`, `runtime.md`, and `refs/*`.
325
351
  After reconciliation, processed material is moved to `ai/project/inbox/processed/` for traceability.
352
+ By default, only `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md` are absorbed;
353
+ `processed/**` is not reconciled again, and `ideas/**` goes through the direction amendment proposal flow.
326
354
 
327
355
  ## Project North Star
328
356
 
package/README.zh-CN.md CHANGED
@@ -157,12 +157,35 @@ npx -y @wnlen/agent-execution-template strategy
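  | Strategy amendment gate | New direction first enters `inbox/ideas/` and becomes a proposal; it is merged into the North Star, module map, or roadmap only after human confirmation. |
  | Protected project context | `update` refreshes `ai/template/**` and does not overwrite `ai/project/**`. |
  | Project context refresh | `refresh` backs up the old `ai/project/**`, generates a new project context, and places the old context in the inbox for the AI to reconcile. |
- | Bounded task execution | Goals, scope, permissions, risk, and acceptance criteria are collected in the task file. |
+ | Automatic continuous execution | Before execution, the AI automatically decomposes an L1/L2/L3 task tree; two or more L1 tasks automatically enable bounded continuous execution, and only Red risk stops for confirmation. |
  | Auditable results | Every run can leave human-readable results, machine-readable facts, and metrics. |
  | Token-efficient model policy | Cheap models handle clearly bounded work; strong models are used only at key judgment points. |
  | Upgradeable template | The protocol can keep improving without losing local project memory. |
  | Doctor checks | Check required files and the template version before execution. |

+ ## How Automatic Continuous Execution Works
+
+ The user still only needs to state a goal in natural language, for example:
+
+ ```text
+ Build the settings page, including profile editing, notification toggles, and an export entry point
+ ```
+
+ Before execution, the AI first decomposes the L1 tasks:
+
+ ```text
+ - [ ] L1-1 Profile editing Green
+ - [ ] L1-2 Notification toggles Green
+ - [ ] L1-3 Export entry point Yellow
+ ```
+
+ Because there are two or more L1 tasks, the protocol automatically uses bounded continuous execution. Before executing each L1, the AI plans
+ the naturally derived L2/L3 work; after completing an L1, it checks the item off and strikes it through in the checklist, and writes the status back to
+ `execution_policy.task_tree` in `ai/project/task.md`.
+
+ Only Red risk stops for your confirmation. Green continues automatically, and Yellow continues after only a local low-risk correction.
+ Every Checkpoint must carry evidence: which files changed, which commands ran, and what the verification results were.
+
  ## Installed Layout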
167
190
 
168
191
  ```text
@@ -172,6 +195,7 @@ ai/
172
195
  template/
173
196
  VERSION
174
197
  bootstrap.md
198
+ execution-policy.md
175
199
  prompt.md
176
200
  reconcile.md
177
201
  protocol.md
@@ -332,6 +356,8 @@ ai/project/inbox/
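
  The AI must output a reconciliation plan first, wait for confirmation, and only then merge long-lived facts into `project.md`, `runtime.md`, and `refs/*`.
  After reconciliation, processed material is moved to `ai/project/inbox/processed/` and kept for traceability.
+ By default, only `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md` are absorbed;
+ `processed/**` is not reconciled again, and `ideas/**` goes through the direction amendment proposal flow.

  ## Project North Star
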
@@ -13,6 +13,7 @@ const REQUIRED_FILES = [
13
13
  "ai/template/LANG",
14
14
  "ai/template/VERSION",
15
15
  "ai/template/bootstrap.md",
16
+ "ai/template/execution-policy.md",
16
17
  "ai/template/prompt.md",
17
18
  "ai/template/reconcile.md",
18
19
  "ai/template/protocol.md",
@@ -48,6 +49,13 @@ const TASK_HEALTH_PATTERNS = [
48
49
  /^type:\s*/m,
49
50
  /^priority:\s*/m,
50
51
  /^risk_level:\s*/m,
52
+ /^readiness:\s*/m,
53
+ /^execution_policy:/m,
54
+ /^\s+mode:\s*/m,
55
+ /^\s+activation_rule:\s*/m,
56
+ /^\s+task_tree:/m,
57
+ /^\s+risk_gate:/m,
58
+ /^\s+evidence_required:\s*/m,
51
59
  /^model_policy:/m,
52
60
  /^refs:/m,
53
61
  /^permission:/m
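A rough sketch of how a pattern list like this could be applied to a task file follows. The patterns are copied from the hunk above; the real array may contain entries outside the visible hunk, and `findMissingTaskFields` is a hypothetical helper, not a function confirmed to exist in the CLI.

```javascript
// Patterns visible in the TASK_HEALTH_PATTERNS hunk (possibly a subset of the real array).
const TASK_HEALTH_PATTERNS = [
  /^type:\s*/m,
  /^priority:\s*/m,
  /^risk_level:\s*/m,
  /^readiness:\s*/m,
  /^execution_policy:/m,
  /^\s+mode:\s*/m,
  /^\s+activation_rule:\s*/m,
  /^\s+task_tree:/m,
  /^\s+risk_gate:/m,
  /^\s+evidence_required:\s*/m,
  /^model_policy:/m,
  /^refs:/m,
  /^permission:/m,
];

// Hypothetical helper: returns the patterns that the task text does not match.
function findMissingTaskFields(taskText) {
  return TASK_HEALTH_PATTERNS
    .filter((pattern) => !pattern.test(taskText))
    .map(String);
}

// A task front matter written before the execution_policy fields existed.
const oldStyleTask = [
  'type: "feature"',
  'priority: "P1"',
  'risk_level: "low"',
  "model_policy:",
  "refs:",
  "permission:",
].join("\n");

console.log(findMissingTaskFields(oldStyleTask));
```

Under these assumptions, a task file generated before this release would be reported as missing `readiness` and the nested `execution_policy` keys, which is presumably why the template's front matter gained those fields in the same release.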
@@ -145,6 +153,7 @@ const TEXT = {
145
153
  nextTellAgent: "把这句话发给你的 AI coding 工具:",
146
154
  nextRunCommand: "运行这个命令:",
147
155
  nextReviewProposal: "已有方向修订提案。先审查提案;确认后对 AI 说:",
156
+ nextContinuePrompt: "继续推进这个项目。执行前先拆 L1 任务;若 L1 >= 2,自动启用边界内连续执行;只有 Red 风险停下来确认。",
148
157
  repairHint: "缺失的 project 推荐文件可通过重新运行 init 安全补齐;已有 ai/project/** 不会被覆盖。",
149
158
  permissionDenied: "无法写入目标路径",
150
159
  permissionHint: `请检查 ai/** 的归属和权限。常见修复:
@@ -249,6 +258,7 @@ Usage:
249
258
  nextTellAgent: "Send this to your AI coding tool:",
250
259
  nextRunCommand: "Run this command:",
251
260
  nextReviewProposal: "A direction amendment proposal exists. Review it first; after confirmation, tell the AI:",
261
+ nextContinuePrompt: "Continue this project. Before execution, decompose L1 tasks; if L1 >= 2, automatically use bounded continuous execution; only Red risk stops for confirmation.",
252
262
  repairHint: "Missing recommended project files can be safely added by running init again; existing ai/project/** files are not overwritten.",
253
263
  permissionDenied: "Cannot write target path",
254
264
  permissionHint: `Check ownership and permissions under ai/**. Common fix:
@@ -582,7 +592,7 @@ function next({ lang = readInstalledLang() } = {}) {
582
592
  }
583
593
 
584
594
  console.log(`${text.nextTellAgent}
585
- ${lang === "zh" ? "继续推进这个项目" : "Continue this project"}
595
+ ${text.nextContinuePrompt}
586
596
  `);
587
597
  }
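The change to `next()` replaces an inline language ternary with a lookup in the `TEXT` table. A minimal sketch of that lookup is given here; the `zh` and `en` keys and the `buildNextMessage` helper are assumptions for illustration, since the diff only shows `const TEXT = {` and the `text.nextContinuePrompt` reference.

```javascript
// Two-language table mirroring the strings in the TEXT hunks above.
// The `zh` / `en` keys are an assumption; the diff does not show them.
const TEXT = {
  zh: {
    nextTellAgent: "把这句话发给你的 AI coding 工具:",
    nextContinuePrompt:
      "继续推进这个项目。执行前先拆 L1 任务;若 L1 >= 2,自动启用边界内连续执行;只有 Red 风险停下来确认。",
  },
  en: {
    nextTellAgent: "Send this to your AI coding tool:",
    nextContinuePrompt:
      "Continue this project. Before execution, decompose L1 tasks; if L1 >= 2, automatically use bounded continuous execution; only Red risk stops for confirmation.",
  },
};

// Hypothetical stand-in for the tail of next(): builds the message instead of printing it.
function buildNextMessage(lang) {
  const text = TEXT[lang];
  if (!text) throw new Error(`Unsupported language: ${lang}`);
  return `${text.nextTellAgent}\n${text.nextContinuePrompt}\n`;
}

console.log(buildNextMessage("en"));
```

Keeping the prompt in the table means both languages carry the same L1 decomposition and Red-only stop instructions, instead of the short "Continue this project" string used in 0.8.15.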
588
598
 
package/docs/SPEC.md CHANGED
@@ -22,7 +22,7 @@ npx 安装协议 -> AI 整理项目上下文 -> 人类确认 -> AI 生成任务
22
22
 
23
23
  ```text
24
24
  Protocol: v0.8
25
- Package: @wnlen/agent-execution-template@0.8.15
25
+ Package: @wnlen/agent-execution-template@0.8.17
26
26
  中文安装: npx -y @wnlen/agent-execution-template init
27
27
  英文安装: npx -y @wnlen/agent-execution-template init --lang en
28
28
  ```
@@ -181,6 +181,7 @@ ai/
181
181
  template/
182
182
  VERSION
183
183
  bootstrap.md
184
+ execution-policy.md
184
185
  prompt.md
185
186
  reconcile.md
186
187
  protocol.md
@@ -243,6 +244,7 @@ project 是现场。
243
244
  ```text
244
245
  ai/template/VERSION
245
246
  ai/template/bootstrap.md
247
+ ai/template/execution-policy.md
246
248
  ai/template/prompt.md
247
249
  ai/template/reconcile.md
248
250
  ai/template/protocol.md
@@ -390,12 +392,13 @@ npx -y @wnlen/agent-execution-template doctor
390
392
  ```text
391
393
  Agent Execution Template 检查
392
394
 
393
- 模板版本: 0.8.15
395
+ 模板版本: 0.8.17
394
396
  模板语言: zh
395
397
 
396
398
  [通过] ai/template/LANG
397
399
  [通过] ai/template/VERSION
398
400
  [通过] ai/template/bootstrap.md
401
+ [通过] ai/template/execution-policy.md
399
402
  [通过] ai/template/prompt.md
400
403
  [通过] ai/template/reconcile.md
401
404
  [通过] ai/template/protocol.md
@@ -610,6 +613,7 @@ ai/project/refs/roadmap.md
610
613
  - 任务类型;
611
614
  - 优先级;
612
615
  - 风险等级;
616
+ - 执行策略;
613
617
  - 模型分工策略;
614
618
  - refs 要求;
615
619
  - 修改权限;
@@ -645,7 +649,46 @@ apply_strategy_update
645
649
  权限不允许,不越界修改。
646
650
  ```
647
651
 
648
- ## 14. 模型分工协议
652
+ ## 14. 执行授权策略
653
+
654
+ 执行策略入口写在:
655
+
656
+ ```text
657
+ ai/project/task.md.execution_policy
658
+ ai/template/execution-policy.md
659
+ ```
660
+
661
+ 默认模式是 `auto`。AI 每次执行前先做任务分解和风险判断,再决定使用
662
+ `normal` 还是 `bounded_continuous`,不依赖用户口令。
663
+
664
+ 执行前规划:
665
+
666
+ - AI 根据用户目标、项目上下文和仓库事实推断目标、范围、验收、权限和验证方式;
667
+ - 先列 L1 任务清单,并给每个 L1 标注 Green / Yellow / Red;
668
+ - L1 少于 2 个时使用 `normal`;
669
+ - L1 为 2 个或更多时自动使用 `bounded_continuous`;
670
+ - 任一 L1 为 Red 时停止等待人类确认;Green 和 Yellow 不阻塞启动。
671
+
672
+ `bounded_continuous` 规则集中在 `ai/template/execution-policy.md`。核心要求:
673
+
674
+ - 按 L1 -> L2 -> L3 执行,执行 L1 前规划 L2,执行 L2 前按需规划 L3;
675
+ - 默认最多 3 层,只有当 L3 仍过大、不可验证或不可回退时才动态增加 L4;
676
+ - 每个任务节点必须有风险评级、预期改动范围、验收方式和证据要求;
677
+ - L1 清单必须用待办列表展示,每完成一个 L1 就打勾并划掉;
678
+ - 执行前和执行中必须把任务树写回 `ai/project/task.md.execution_policy.task_tree`;
679
+ - 默认按 `vertical_slice` 推进,每轮都要产生可检查增量;
680
+ - 每个 Checkpoint 必须包含证据:已改文件、已运行命令、验证结果或无法验证原因;
681
+ - Green 可自动继续;
682
+ - Yellow 做局部低风险修正后继续;
683
+ - Red 必须停止等待人类确认;
684
+ - 目标、范围、验收和权限由 AI 推断,但不能越过项目规则、显式用户限制、
685
+ `permission.modify.denied`、安全边界或破坏性操作限制;
686
+ - 需要扩大权限、运行未允许命令、访问网络、执行破坏性操作、改变产品方向或核心架构时,
687
+ 当前节点必须标为 Red。
688
+
689
+ 它不适用于方向未定且无法推断、验收无法定义或高风险架构取舍任务;这些应直接评为 Red。
690
+
691
+ ## 15. 模型分工协议
649
692
 
650
693
  模型分工写在:
651
694
 
@@ -673,7 +716,7 @@ Default cheap. Escalate for judgment. Record why.
673
716
  - 写明需要的 strong model role;
674
717
  - 记录到 `ai/project/metrics.json`。
675
718
 
676
- ## 15. 风险门禁
719
+ ## 16. 风险门禁
677
720
 
678
721
  任务涉及以下内容时必须谨慎:
679
722
 
@@ -687,7 +730,7 @@ Default cheap. Escalate for judgment. Record why.
687
730
 
688
731
  如果风险高且 `task.md` 未明确授权,AI 必须停止并写 blocked 结果。
689
732
 
690
- ## 16. refs 延迟加载
733
+ ## 17. refs 延迟加载
691
734
 
692
735
  `ai/project/refs/` 存放按需读取的详细资料。
693
736
 
@@ -707,7 +750,7 @@ Default cheap. Escalate for judgment. Record why.
707
750
 
708
751
  每次读取 ref 都必须在 `result.json.refs_read` 中记录原因。
709
752
 
710
- ## 16.1 inbox 待吸收资料
753
+ ## 17.1 inbox 待吸收资料
711
754
 
712
755
  `ai/project/inbox/` 存放尚未整合进项目上下文的新资料。
713
756
  已完成整合的资料统一移动到 `ai/project/inbox/processed/`,用于追溯并避免重复处理。
@@ -732,11 +775,13 @@ AI 必须先输出整合计划,等人类确认后,才更新 `project.md`、`
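  After the reconciliation has been applied, the AI must move the `ai/project/inbox/*.md` files processed in this run
  to `ai/project/inbox/processed/`. Material in `processed/` no longer triggers, by default,
  the pending-material check of `reconcile` or `next`.
+ Even if the user casually says "reconcile the whole inbox", the default is still to process only `ai/project/inbox/*.md`
+ and `ai/project/inbox/raw/*.md`; `ai/project/inbox/ideas/**` does not take part in context reconciliation.

  If new material would change the directional content of `final-shape.md`, `module-map.md`, or `roadmap.md`,
  context reconciliation may only suggest creating a `strategy_update` proposal; it must not modify these files directly.

- ## 16.2 Project North Star and Strategy Amendments
+ ## 17.2 Project North Star and Strategy Amendments

  `ai/project/refs/final-shape.md` is the project's North Star specification, which can also be understood as the
  Product Constitution / Final Shape Spec. It is responsible for recording: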
@@ -812,7 +857,7 @@ ai/project/refs/decisions.md
812
857
  ai/project/refs/constraints.md
813
858
  ```
814
859
 
815
- ## 17. 输出结果
860
+ ## 18. 输出结果
816
861
 
817
862
  每次执行必须写:
818
863
 
@@ -822,7 +867,7 @@ ai/project/result.md
822
867
  ai/project/metrics.json
823
868
  ```
824
869
 
825
- ### 17.1 `result.json`
870
+ ### 18.1 `result.json`
826
871
 
827
872
  机器可读结果,是当前最新权威执行记录。
828
873
 
@@ -840,7 +885,7 @@ ai/project/metrics.json
840
885
  - next;
841
886
  - runtime update proposal。
842
887
 
843
- ### 17.2 `result.md`
888
+ ### 18.2 `result.md`
844
889
 
845
890
  人类可读摘要。
846
891
 
@@ -852,7 +897,7 @@ ai/project/metrics.json
852
897
  - 有什么问题;
853
898
  - 下一步。
854
899
 
855
- ### 17.3 `metrics.json`
900
+ ### 18.3 `metrics.json`
856
901
 
857
902
  执行经济性和模型分工记录。
858
903
 
@@ -867,7 +912,7 @@ ai/project/metrics.json
867
912
  - human fix required;
868
913
  - reuse potential。
869
914
 
870
- ## 18. 状态规则
915
+ ## 19. 状态规则
871
916
 
872
917
  允许状态:
873
918
 
@@ -885,7 +930,7 @@ blocked
885
930
  - 任务不可执行,使用 `blocked`;
886
931
  - 执行失败且无法完成,使用 `failed`。
887
932
 
888
- ## 19. runtime 治理
933
+ ## 20. runtime 治理
889
934
 
890
935
  `ai/project/runtime.md` 只存当前仍然有效的执行上下文。
891
936
 
@@ -904,7 +949,7 @@ ai/project/result.json.runtime_update
904
949
 
905
950
  再由单独任务决定是否更新 runtime。
906
951
 
907
- ## 20. 同步规则
952
+ ## 21. 同步规则
908
953
 
909
954
  从模板仓库导入真实项目:
910
955
 
@@ -922,7 +967,7 @@ ai/project/result.json.runtime_update
922
967
 
923
968
  这是整个项目的安全底线。
924
969
 
925
- ## 21. npm 包结构
970
+ ## 22. npm 包结构
926
971
 
927
972
  模板仓库内部结构:
928
973
 
@@ -952,7 +997,7 @@ LICENSE
952
997
  - `bin/agent-execution-template.js` 是 CLI;
953
998
  - `test/selftest.js` 是本地自测。
954
999
 
955
- ## 22. 自测与发布检查
1000
+ ## 23. 自测与发布检查
956
1001
 
957
1002
  本地自测:
958
1003
 
@@ -997,7 +1042,7 @@ diff 检查:
997
1042
  git diff --check
998
1043
  ```
999
1044
 
1000
- ## 23. 当前能力边界
1045
+ ## 24. 当前能力边界
1001
1046
 
1002
1047
  当前项目已经能做到:
1003
1048
 
@@ -1018,7 +1063,7 @@ git diff --check
1018
1063
  - IDE 插件;
1019
1064
  - 发布流水线。
1020
1065
 
1021
- ## 24. 最终判断
1066
+ ## 25. 最终判断
1022
1067
 
1023
1068
  Agent Execution Template v0.8 已经从一个 prompt/template 原型,升级为:
1024
1069
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@wnlen/agent-execution-template",
3
- "version": "0.8.15",
3
+ "version": "0.8.17",
4
4
  "description": "Low-friction AI execution protocol template for coding agents.",
5
5
  "bin": {
6
6
  "agent-execution-template": "bin/agent-execution-template.js"
@@ -11,6 +11,7 @@ project is the field workspace
11
11
 
12
12
  - `template/prompt.md`: AI startup prompt.
13
13
  - `template/bootstrap.md`: project discovery and context bootstrap prompt.
14
+ - `template/execution-policy.md`: automatic continuous execution, task tree, risk rubric, and checkpoint rules.
14
15
  - `template/reconcile.md`: merge new authoritative material into existing project context.
15
16
  - `template/VERSION`: installed template version.
16
17
  - `template/protocol.md`: bootstrap flow, execution flow, model division, sync rules.
@@ -77,6 +78,10 @@ The workflow produces a reconciliation plan first and updates `project.md`,
77
78
  `runtime.md`, and `refs/*` only after confirmation. After reconciliation,
78
79
  processed material is moved to `ai/project/inbox/processed/`.
79
80
 
81
+ By default, only `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md` are
82
+ absorbed. `processed/**` is trace history and is not reconciled again;
83
+ `ideas/**` goes through the direction amendment proposal flow.
84
+
80
85
  ## Direction Amendments
81
86
 
82
87
  The North Star, module map, and roadmap belong to the project direction layer:
@@ -3,7 +3,43 @@ task_id: ""
  type: "bugfix | feature | refactor | docs | config | test | research | strategy_update | apply_strategy_update"
  priority: "P0 | P1 | P2 | P3"
  risk_level: "low | medium | high"
+ readiness: "draft_for_confirmation | ready_to_execute | blocked"
  depends_on_previous_result: false
+ execution_policy:
+   mode: "auto | normal | bounded_continuous"
+   activation_rule: "auto_enable_when_l1_count_gte_2"
+   max_depth: 3
+   allow_depth_4_when_needed: true
+   progress_unit: "vertical_slice"
+   task_tree:
+     - id: "L1-1"
+       title: ""
+       risk: "Green | Yellow | Red"
+       status: "pending | running | done | blocked"
+       scope:
+         allowed: []
+         denied: []
+       acceptance: []
+       evidence: []
+       children: []
+   checkpoint_budget:
+     l1: 0
+     l2: 0
+     l3: 0
+     l4: 0
+   checkpoint_triggers:
+     - before_crossing_boundary
+     - after_vertical_slice
+     - before_final_review
+   auto_continue:
+     green: true
+     yellow: "low_risk_only"
+     red: false
+   risk_gate:
+     green: "continue"
+     yellow: "continue_with_local_fix"
+     red: "stop_for_human"
+   evidence_required: true
  model_policy:
    default_tier: "cheap"
    allowed_tiers:
@@ -45,8 +81,11 @@ This file is the current execution contract. Prefer generating it in Bootstrap
45
81
  Mode from a short human goal plus repository context, then have a human review
46
82
  it before execution.
47
83
 
48
- Prefer safe assumptions over extra questions, but do not guess scope, risk,
49
- permissions, or acceptance.
84
+ Prefer safe assumptions over extra questions. The AI should infer scope, risk,
85
+ permissions, and acceptance from the human goal, project context, and repository
86
+ facts. If inference would cross permission or safety boundaries, or acceptance
87
+ cannot be defined, set `readiness` to `blocked` or mark the relevant task node
88
+ `Red` and wait for human confirmation.
50
89
 
51
90
  ## Goal
52
91
 
@@ -86,6 +125,50 @@ The task is complete when:
86
125
 
87
126
  -
88
127
 
128
+ ## Execution Policy
129
+
130
+ Default to `auto`. The AI decides during pre-execution planning whether to use
131
+ continuous execution; it does not wait for a human keyword. If planning finds
132
+ fewer than 2 L1 tasks, use `normal`; if it finds 2 or more L1 tasks, use
133
+ `bounded_continuous` automatically.
134
+
135
+ `bounded_continuous` means bounded continuous execution:
136
+
137
+ - The AI infers goal, scope, acceptance, permissions, and risk from the human
138
+ goal, project context, and repository facts; the human does not need to
139
+ provide each field upfront.
140
+ - `readiness = ready_to_execute` means no Red preflight item exists and the task
141
+ may execute.
142
+ - `readiness = draft_for_confirmation` means human confirmation is required
143
+ before execution.
144
+ - `readiness = blocked` means the task cannot execute and must produce a
145
+ blocked result.
146
+ - Before execution, write the L1 checklist to `execution_policy.task_tree`.
147
+ - Before execution, list the L1 task checklist; mark each L1 complete with a
148
+ checked and struck-through item.
149
+ - Before executing an L1, plan the naturally derived L2 tasks; if an L2 still
150
+ needs decomposition, plan L3 tasks.
151
+ - Default to at most 3 levels; add L4 dynamically only when leaving it out
152
+ would make L3 too large or unverifiable.
153
+ - The AI assigns Green / Yellow / Red risk to every task node.
154
+ - Only Red stops for human confirmation; Green continues automatically, and
155
+ Yellow continues after local low-risk correction.
156
+ - `progress_unit` defaults to `vertical_slice`: each work loop should produce
157
+ a reviewable increment.
158
+ - `checkpoint_budget` is the maximum checkpoint budget, not a required count;
159
+ do not report just to spend the budget.
160
+ - Emit a checkpoint only when `checkpoint_triggers` fire, risk rises, or final
161
+ review is about to start.
162
+ - Every checkpoint must include evidence: changed files, commands run,
163
+ verification results, or why verification was not possible.
164
+ - During execution, update `task_tree` node status: `pending`, `running`,
165
+ `done`, or `blocked`.
166
+ - After completion, run one final review; only re-check Yellow, Red, failed
167
+ verification, or high-impact modules.
168
+ - Continuous execution does not change model policy; escalate through
169
+ `model_policy` for judgment, architecture, failure review, or acceptance
170
+ disputes.
171
+
89
172
  ## Permission
90
173
 
91
174
  Modify files only under the allowlist in the YAML front matter.
@@ -108,3 +191,8 @@ Stop and write `ai/project/result.json`, `ai/project/result.md`, and `ai/project
108
191
  - Required refs are missing.
109
192
  - Required command cannot be run.
110
193
  - Risk level is high without explicit authorization.
194
+ - A Red checkpoint appears during continuous execution.
195
+ - The task would change product direction, core architecture, data structures,
196
+ security boundaries, payment, accounts, or permissions.
197
+ - The task would delete many files, rewrite a core module, or require choosing
198
+ between multiple high-cost options.
@@ -1 +1 @@
1
- 0.8.15
1
+ 0.8.17
@@ -0,0 +1,109 @@
1
+ # Execution Policy
2
+
3
+ Do not summarize this file.
4
+ During task execution, use this file to choose `normal` or `bounded_continuous`.
5
+
6
+ ## Default Policy
7
+
8
+ The default execution policy is `auto`: before each execution, the AI first
9
+ decomposes the task and judges risk, then chooses `normal` or
10
+ `bounded_continuous`. Continuous execution does not depend on a human keyword.
11
+
12
+ Pre-execution planning must:
13
+
14
+ - Infer goal, scope, acceptance, permissions, and verification method from the
15
+ human goal, project context, and repository facts.
16
+ - List the L1 task checklist and assign Green / Yellow / Red risk to each L1.
17
+ - Use `normal` if there are fewer than 2 L1 tasks.
18
+ - Automatically use `bounded_continuous` if there are 2 or more L1 tasks.
19
+ - Stop for human confirmation first if any L1 is Red; Green and Yellow do not
20
+ block startup.
21
+ - Write the task tree to `execution_policy.task_tree` in `ai/project/task.md`.
22
+
23
+ ## Task Tree
24
+
25
+ Execute the task tree in L1 -> L2 -> L3 order.
26
+
27
+ - Before executing an L1, plan its naturally derived L2 tasks.
28
+ - Before executing an L2, plan L3 tasks if it still needs decomposition.
29
+ - Default to at most 3 levels. Add L4 dynamically only when L3 would otherwise
30
+ be too large, unverifiable, or hard to revert.
31
+ - Every L1/L2/L3/L4 node must have risk, expected edit scope, acceptance method,
32
+ and evidence requirements.
33
+ - Show the L1 checklist as task items; when an L1 is complete, check it off and
34
+ strike it through.
35
+ - During execution, update each `task_tree` node status: `pending`, `running`,
36
+ `done`, or `blocked`.
37
+
38
+ Recommended node shape:
39
+
40
+ ```yaml
41
+ id: "L1-1"
42
+ title: ""
43
+ risk: "Green | Yellow | Red"
44
+ status: "pending | running | done | blocked"
45
+ scope:
46
+ allowed: []
47
+ denied: []
48
+ acceptance: []
49
+ evidence: []
50
+ children: []
51
+ ```
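To illustrate how nodes of this shape could drive the L1 checklist, here is a small sketch. The pending format follows the README example (`- [ ] L1-1 Profile editing Green`); the checked and struck-through format for completed items is an assumption, since the policy asks for it without prescribing exact markup, and `renderL1Checklist` is a hypothetical helper.

```javascript
// Hypothetical renderer: turns L1 task_tree nodes into a Markdown checklist.
// Completed nodes are checked and struck through; all others stay open.
function renderL1Checklist(nodes) {
  return nodes
    .map((node) => {
      const label = `${node.id} ${node.title}`;
      return node.status === "done"
        ? `- [x] ~~${label}~~ ${node.risk}`
        : `- [ ] ${label} ${node.risk}`;
    })
    .join("\n");
}

const checklist = renderL1Checklist([
  { id: "L1-1", title: "Profile editing", risk: "Green", status: "done" },
  { id: "L1-2", title: "Notification toggles", risk: "Green", status: "running" },
  { id: "L1-3", title: "Export entrypoint", risk: "Yellow", status: "pending" },
]);
console.log(checklist);
```

Deriving the checklist from `status` keeps the human-visible list and the `task_tree` written to `ai/project/task.md` from drifting apart.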
52
+
53
+ ## Risk Rubric
54
+
55
+ Green:
56
+
57
+ - Inside current task scope;
58
+ - no new permission, command, network access, or destructive action is needed;
59
+ - acceptance is clear;
60
+ - no product direction, core architecture, data structure, security boundary,
61
+ payment, account, or permission change is needed.
62
+
63
+ Yellow:
64
+
65
+ - Still inside current task scope;
66
+ - local uncertainty or local verification failure exists;
67
+ - a low-risk local correction can continue the work;
68
+ - no permission, scope, command, or acceptance expansion is needed.
69
+
70
+ Red:
71
+
72
+ - Permission expansion, unallowed command, network access, or destructive action
73
+ is needed;
74
+ - product direction, core architecture, data structure, security boundary,
75
+ payment, account, or permission would change;
76
+ - many files must be deleted, a core module must be rewritten, or multiple
77
+ high-cost options require judgment;
78
+ - acceptance cannot be defined, or task goal materially conflicts with project direction.
79
+
80
+ Only Red stops for human confirmation. Green continues automatically. Yellow
81
+ continues after local low-risk correction.
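The rubric can be read as a precedence rule: any Red condition wins, then any Yellow condition, otherwise Green. The sketch below encodes that reading. The signal names are hypothetical labels for the prose conditions above, and the gate actions are taken from the `risk_gate` block of the task template.

```javascript
// Hypothetical signal names; the policy file states these conditions in prose only.
const RED_SIGNALS = [
  "needsPermissionExpansion",
  "needsUnallowedCommand",
  "needsNetworkAccess",
  "needsDestructiveAction",
  "changesDirectionOrCoreArchitecture",
  "largeDeletionOrCoreRewrite",
  "acceptanceUndefined",
];
const YELLOW_SIGNALS = ["localUncertainty", "localVerificationFailure"];

// Red takes precedence over Yellow; a node with no signals is Green.
function classifyRisk(signals) {
  if (RED_SIGNALS.some((name) => signals[name])) return "Red";
  if (YELLOW_SIGNALS.some((name) => signals[name])) return "Yellow";
  return "Green";
}

// Gate actions as named in the task.md template's risk_gate block.
const RISK_GATE = {
  Green: "continue",
  Yellow: "continue_with_local_fix",
  Red: "stop_for_human",
};

const risk = classifyRisk({ localVerificationFailure: true });
console.log(risk, RISK_GATE[risk]);
```

In practice the AI makes this judgment from context rather than from boolean flags; the sketch only shows the ordering of the three levels.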
82
+
83
+ ## Checkpoint
84
+
85
+ Emit checkpoints only when risk rises, a boundary is about to change, a vertical
86
+ slice is complete, or final review is about to start. Do not report just to
87
+ spend checkpoint budget.
88
+
89
+ Every checkpoint must include:
90
+
91
+ ```text
92
+ ## Checkpoint
93
+ ### Task Tree
94
+ ### Progress
95
+ ### Completed
96
+ ### Evidence
97
+ ### Drift Risk: Green / Yellow / Red
98
+ ### Recommended Next Step
99
+ ### Auto-Continue Decision
100
+ ```
101
+
102
+ Evidence must include changed files, commands run, verification results, or why
103
+ verification was not possible. A purely subjective Green is not valid.
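One way to enforce the evidence rule mechanically is sketched here. It assumes one reading of the sentence above: changed files and commands run are always required, plus either verification results or a stated reason why verification was not possible. The field names and `validateCheckpointEvidence` are hypothetical, as are the file path and command in the example.

```javascript
// Hypothetical validator for the "### Evidence" part of a checkpoint.
function validateCheckpointEvidence(evidence) {
  const problems = [];
  if (!Array.isArray(evidence.changedFiles) || evidence.changedFiles.length === 0) {
    problems.push("missing changed files");
  }
  if (!Array.isArray(evidence.commandsRun) || evidence.commandsRun.length === 0) {
    problems.push("missing commands run");
  }
  const hasText = (value) => typeof value === "string" && value.trim() !== "";
  // Either a result or an explicit reason is accepted; neither means a subjective Green.
  if (!hasText(evidence.verificationResults) && !hasText(evidence.notVerifiedReason)) {
    problems.push("missing verification results or a reason verification was not possible");
  }
  return { valid: problems.length === 0, problems };
}

const report = validateCheckpointEvidence({
  changedFiles: ["src/settings/profile.js"], // illustrative path
  commandsRun: ["npm test"],                 // illustrative command
  verificationResults: "all tests passed",
});
console.log(report);
```

An empty evidence object fails all three checks, which matches the statement that a purely subjective Green is not valid.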
104
+
105
+ ## Model Policy
106
+
107
+ Continuous execution does not change `model_policy`. Still escalate through
108
+ `model_policy` for planning, architecture, failure review, or acceptance
109
+ disputes, and record the reason in `ai/project/metrics.json`.
@@ -9,6 +9,7 @@ First read:
9
9
 
10
10
  1. `ai/template/protocol.md`
11
11
  2. `ai/template/rules/core.md`
12
+ 3. `ai/template/execution-policy.md`
12
13
 
13
14
  Then choose the mode:
14
15
 
@@ -36,7 +37,9 @@ Then choose the mode:
36
37
  `ai/template/reconcile.md` and stop or update according to its two-phase
37
38
  workflow; `ai/project/inbox/processed/` is already processed material and
38
39
  should not trigger reconciliation, while `ai/project/inbox/ideas/` should
39
- route to `strategy_update` first.
40
+ route to `strategy_update` first. Even if the human says to reconcile the
41
+ whole inbox, default to only `ai/project/inbox/*.md` and
42
+ `ai/project/inbox/raw/*.md`.
40
43
  - If the user says "Start initializing this project", asks to initialize,
41
44
  organize, or generate project context, or if `ai/project/project.md` is
42
45
  empty, placeholder-only, or incomplete, follow `ai/template/bootstrap.md`
@@ -58,10 +61,21 @@ Then choose the mode:
58
61
  In Task Draft Mode:
59
62
 
60
63
  1. Read confirmed `ai/project/project.md` and relevant `ai/project/refs/*.md`.
61
- 2. Draft `ai/project/task.md` from the user's current goal.
62
- 3. Ask at most 3 questions only for scope, risk, permission, or acceptance
63
- blockers.
64
- 4. Stop for human confirmation. Do not modify source or business files.
64
+ 2. Infer goal, scope, acceptance, permissions, verification method, and initial
65
+ risk from the user's current goal, project context, and repository facts; do
66
+ not require the human to provide each field upfront.
67
+ 3. Draft `ai/project/task.md` and set `execution_policy.mode` to `auto`.
68
+ 4. Before execution, list the L1 checklist, mark each L1 Green / Yellow / Red,
69
+ and write it to `execution_policy.task_tree`. Use `normal` if there are
70
+ fewer than 2 L1 tasks; automatically use `bounded_continuous` if there are 2
71
+ or more L1 tasks.
72
+ 5. If no Red preflight item exists, set `readiness` to `ready_to_execute`; if
73
+ human confirmation is needed, set it to `draft_for_confirmation`; if the task
74
+ cannot execute, set it to `blocked`.
75
+ 6. Stop for human confirmation only when a Red preflight item appears. If the
76
+ human asked to execute or continue, and preflight contains only Green /
77
+ Yellow, proceed directly to Execution Mode.
78
+ 7. Do not modify source or business files in Task Draft Mode.
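Steps 5 and 6 amount to a small decision over three `readiness` values. The sketch below is one possible reading, with assumptions labeled: a Red preflight item is treated as "human confirmation needed", a task that cannot execute is checked first, and `decideReadiness` and its parameter names are hypothetical.

```javascript
// Hypothetical helper: maps preflight findings to a readiness value.
// Assumed precedence: cannot execute -> blocked; Red or other confirmation
// need -> draft_for_confirmation; otherwise ready_to_execute.
function decideReadiness({ canExecute, hasRedPreflight, needsHumanConfirmation = false }) {
  if (!canExecute) return "blocked";
  if (hasRedPreflight || needsHumanConfirmation) return "draft_for_confirmation";
  return "ready_to_execute";
}

console.log(decideReadiness({ canExecute: true, hasRedPreflight: false }));
```

Under this reading, a preflight with only Green and Yellow items yields `ready_to_execute`, which is the case where the prompt allows proceeding directly to Execution Mode.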
65
79
 
66
80
  End Task Draft Mode with:
67
81
 
@@ -112,7 +126,15 @@ In Execution Mode, read:
112
126
  2. `ai/project/runtime.md`
113
127
  3. `ai/project/task.md`
114
128
 
115
- Then execute the task and write results to:
129
+ Then follow `ai/template/execution-policy.md` for pre-execution planning: list
130
+ the L1 checklist, mark each L1 Green / Yellow / Red, and write it to
131
+ `execution_policy.task_tree`. Automatically choose `normal` or
132
+ `bounded_continuous` from the L1 count. Plan L2 before executing an L1, and
133
+ plan L3 as needed before executing an L2; default to at most 3 levels, with L4
134
+ allowed when needed. When an L1 is complete, check it off, strike it through,
135
+ and update the `task_tree` node status. Only Red stops for human confirmation;
136
+ Green continues automatically, and Yellow continues after local low-risk
137
+ correction. Write results to:
116
138
 
117
139
  - `ai/project/result.json`
118
140
  - `ai/project/result.md`
@@ -47,6 +47,19 @@ Project Bootstrap / Context Reconcile / Strategy Update -> Project Confirm -> Ta
47
47
  9. Execute only within the project task boundary.
48
48
  10. Write `ai/project/result.json`, `ai/project/result.md`, and `ai/project/metrics.json`.
49
49
 
50
+ ## Execution Authorization Modes
51
+
52
+ Before task execution, read `ai/template/execution-policy.md`.
53
+
54
+ The default execution policy is `auto`: the AI first decomposes L1 tasks and
55
+ judges Green / Yellow / Red risk, then chooses `normal` or `bounded_continuous`.
56
+ Use `normal` when there are fewer than 2 L1 tasks; automatically use
57
+ `bounded_continuous` when there are 2 or more L1 tasks. Only Red stops for
58
+ human confirmation.
59
+
60
+ Task tree, risk rubric, checkpoint evidence, and `task_tree` status update
61
+ rules are defined in `ai/template/execution-policy.md`.
62
+
50
63
  ## Bootstrap Mode
51
64
 
52
65
  Bootstrap Mode prepares stable project understanding:
@@ -22,6 +22,9 @@ New material should usually live in:
22
22
  absorbed. After reconciliation is confirmed, move processed material to
23
23
  `ai/project/inbox/processed/` for traceability and to avoid repeated
24
24
  reconciliation.
25
+ Even when the human says to reconcile the whole inbox, default to only
26
+ `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md`; do not recursively
27
+ read `processed/**` or `ideas/**`.
25
28
 
26
29
  ## First Read
27
30
 
@@ -30,11 +33,12 @@ reconciliation.
30
33
  3. `ai/project/project.md`
31
34
  4. `ai/project/runtime.md`
32
35
  5. `ai/project/refs/*.md`
33
- 6. The new material named by the human; if none is named, read `ai/project/inbox/*.md`
36
+ 6. The new material named by the human; if none is named, read only
37
+ `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md`
34
38
 
35
- Do not read `ai/project/inbox/processed/**`, `ai/project/archive/**`, source,
36
- tests, config, or dependency files by default unless the human explicitly asks
37
- you to use them for fact checking.
39
+ Do not read `ai/project/inbox/processed/**`, `ai/project/inbox/ideas/**`,
40
+ `ai/project/archive/**`, source, tests, config, or dependency files by default
41
+ unless the human explicitly asks you to use them for fact checking.
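The default read scope can be expressed as a single path filter. This is a sketch under stated assumptions: paths are repository-relative with forward slashes, and `selectReconcileInputs` is a hypothetical helper rather than part of the package. It accepts only direct children of `ai/project/inbox/` and `ai/project/inbox/raw/` that end in `.md`.

```javascript
// Matches ai/project/inbox/<name>.md and ai/project/inbox/raw/<name>.md only.
// [^/]+ cannot cross a slash, so processed/**, ideas/**, and deeper paths are excluded.
const RECONCILE_DEFAULT = /^ai\/project\/inbox\/(?:raw\/)?[^/]+\.md$/;

// Hypothetical helper: keeps the paths that default reconciliation may read.
function selectReconcileInputs(paths) {
  return paths.filter((path) => RECONCILE_DEFAULT.test(path));
}

const selected = selectReconcileInputs([
  "ai/project/inbox/meeting-notes.md",
  "ai/project/inbox/raw/export.md",
  "ai/project/inbox/processed/old-notes.md",
  "ai/project/inbox/ideas/new-direction.md",
  "ai/project/inbox/raw/nested/deep.md",
]);
console.log(selected);
```

Material under `ideas/**` is left out here on purpose, since the reconcile prompt routes it to the `strategy_update` flow instead.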
38
42
 
39
43
  ## Reconciliation Principles
40
44
 
@@ -8,6 +8,7 @@ Before editing code, check that `ai/project/task.md` clearly defines:
8
8
  - Scope
9
9
  - Acceptance
10
10
  - Permission
11
+ - Execution policy
11
12
 
12
13
  If readiness fails, do not edit code. Write blocked results to:
13
14
 
@@ -116,6 +117,28 @@ must not directly modify:
  Do not modify current task, results, metrics, archives, source, tests, config,
  or dependency files unless the human explicitly authorizes it.
 
+ ## Bounded Continuous Execution Gate
+
+ Before every execution, the AI must read `ai/template/execution-policy.md`,
+ decompose the task, and judge risk instead of waiting for the human to
+ explicitly say "enable continuous execution".
+
+ Hard gates:
+
+ - `execution_policy.task_tree` must record the L1 checklist and execution state.
+ - Every task node must have Green / Yellow / Red risk.
+ - Every checkpoint must include evidence; a purely subjective Green is not valid.
+ - Red must stop for human confirmation.
+ - Any product direction, core architecture, data structure, security, payment,
+ account, permission, large deletion, core rewrite, or high-cost option choice
+ must stop.
+ - Any need to expand scope, permission, commands, network access, or acceptance
+ must stop.
+
+ The AI infers goal, scope, acceptance, and permissions, but must not cross
+ project rules, explicit human limits, `permission.modify.denied`, security
+ boundaries, or destructive-action limits.
119
142
  ## Strategy Update Gate
120
143
 
121
144
  If the user asks to update the North Star, final shape, product constitution,
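@@ -11,6 +11,7 @@ project is the live workspace
 
  - `template/prompt.md`: AI startup prompt.
  - `template/bootstrap.md`: project discovery and context bootstrap prompt.
+ - `template/execution-policy.md`: rules for automatic continuous execution, the task tree, the risk rubric, and checkpoints.
  - `template/reconcile.md`: merges new authoritative material into the existing project context.
  - `template/VERSION`: installed template version.
  - `template/protocol.md`: bootstrap flow, execution flow, model division of labor, and sync rules.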
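@@ -77,6 +78,9 @@ ai/project/inbox/
  `refs/*`. After reconciliation, processed material is moved to
  `ai/project/inbox/processed/`.
 
+ By default, only `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md` are absorbed.
+ `processed/**` is the traceability area and is never reconciled again; `ideas/**` goes through a strategy amendment proposal.
+
  ## Strategy Amendment
 
  The North Star, module map, and roadmap belong to the project direction layer: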
@@ -3,7 +3,43 @@ task_id: ""
  type: "bugfix | feature | refactor | docs | config | test | research | strategy_update | apply_strategy_update"
  priority: "P0 | P1 | P2 | P3"
  risk_level: "low | medium | high"
+ readiness: "draft_for_confirmation | ready_to_execute | blocked"
  depends_on_previous_result: false
+ execution_policy:
+ mode: "auto | normal | bounded_continuous"
+ activation_rule: "auto_enable_when_l1_count_gte_2"
+ max_depth: 3
+ allow_depth_4_when_needed: true
+ progress_unit: "vertical_slice"
+ task_tree:
+ - id: "L1-1"
+ title: ""
+ risk: "Green | Yellow | Red"
+ status: "pending | running | done | blocked"
+ scope:
+ allowed: []
+ denied: []
+ acceptance: []
+ evidence: []
+ children: []
+ checkpoint_budget:
+ l1: 0
+ l2: 0
+ l3: 0
+ l4: 0
+ checkpoint_triggers:
+ - before_crossing_boundary
+ - after_vertical_slice
+ - before_final_review
+ auto_continue:
+ green: true
+ yellow: "low_risk_only"
+ red: false
+ risk_gate:
+ green: "continue"
+ yellow: "continue_with_local_fix"
+ red: "stop_for_human"
+ evidence_required: true
  model_policy:
  default_tier: "cheap"
  allowed_tiers:
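@@ -44,7 +80,9 @@ permission:
  This file is the current execution contract. Prefer generating it in bootstrap mode from a short human goal and the repository context,
  then have a human review it before execution.
 
- Prefer safe assumptions and ask few extra questions, but do not guess scope, risk, permission, or acceptance.
+ Prefer safe assumptions and ask few extra questions. The AI should infer scope, risk, permission, and acceptance from the user goal, project context, and repository facts;
+ if an inference would cross a permission or security boundary, or acceptance cannot be defined, mark
+ `readiness` as `blocked` or mark the affected task node `Red`, then wait for human confirmation.
 
  ## Goal
 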
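@@ -81,6 +119,33 @@ permission:
 
  -
 
+ ## Execution Policy
+
+ The default is `auto`: the AI decides during pre-execution planning whether to enable continuous execution instead of waiting for a user command.
+ If pre-execution decomposition yields fewer than 2 L1 tasks, use `normal`; if it yields 2 or more L1 tasks,
+ automatically use `bounded_continuous`.
+
+ `bounded_continuous` means continuous execution within boundaries:
+
+ - The AI infers goal, scope, acceptance, permission, and risk rating from the user goal, project context, and repository facts;
+ the user is not required to provide each item up front.
+ - `readiness = ready_to_execute` means there are no Red pre-check items and execution can proceed.
+ - `readiness = draft_for_confirmation` means human confirmation is required before execution.
+ - `readiness = blocked` means the current task cannot be executed and a blocked result must be written.
+ - Before execution, the L1 task list must be written to `execution_policy.task_tree`.
+ - Before execution, the L1 task list must be shown as a to-do list; check off and strike through each L1 when it is done.
+ - Before executing an L1, the AI first plans the L2 tasks that naturally derive from it; if an L2 still needs splitting, it then plans L3.
+ - The default maximum is 3 levels; L4 may be added dynamically only when not splitting would leave an L3 too large or unverifiable.
+ - The AI itself assigns a Green / Yellow / Red risk rating to every task node.
+ - Only Red stops for human confirmation; Green continues automatically, and Yellow continues after a local low-risk fix.
+ - `progress_unit` defaults to `vertical_slice`: every round of progress should produce a checkable working increment.
+ - `checkpoint_budget` is the maximum number of checkpoints available, not a quota to use up; do not report just to spend the budget.
+ - Output a Checkpoint only when a `checkpoint_triggers` entry fires, risk rises, or the task is about to wrap up.
+ - Every Checkpoint must include evidence: changed files, commands run, verification results, or the reason verification was not possible.
+ - During execution, `task_tree` node status must be updated: `pending`, `running`, `done`, or `blocked`.
+ - After completion, do a single overall review; re-check only Yellow items, Red items, failed verifications, or high-impact modules.
+ - Continuous execution does not change the model policy; judgment, architecture, failure review, or acceptance disputes still escalate according to `model_policy`.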
+
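The mode and readiness rules in this section can be sketched as a small planning function. The name `planExecution` is hypothetical and not part of the package. One assumption is labeled in the code: a Red pre-check item is mapped to `draft_for_confirmation`, and the `blocked` state (task cannot be executed at all) is deliberately not modeled.

```javascript
// Hypothetical planner, not part of the package. Input: the L1 tasks
// produced by pre-execution decomposition, each with a `risk` field.
function planExecution(l1Tasks) {
  // activation_rule "auto_enable_when_l1_count_gte_2"
  const mode = l1Tasks.length >= 2 ? "bounded_continuous" : "normal";
  const hasRed = l1Tasks.some((task) => task.risk === "Red");
  // Assumption: a Red pre-check item means a human must confirm first.
  const readiness = hasRed ? "draft_for_confirmation" : "ready_to_execute";
  return { mode, readiness };
}
```

For example, three Green or Yellow L1 tasks would yield `bounded_continuous` with `ready_to_execute`, while a single Red L1 task would yield `normal` with `draft_for_confirmation`.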
  ## Permission
 
  Modify only the files in the YAML front matter allow list.
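@@ -104,3 +169,6 @@ permission:
  - A required reference is missing.
  - A required command cannot be run.
  - The risk level is high but there is no explicit authorization.
+ - A Red checkpoint appears during continuous execution.
+ - Product direction, core architecture, data structure, security boundary, payment, account, or permission needs to change.
+ - A large number of files must be deleted, a core module must be rewritten, or a choice must be made among several high-cost options.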
@@ -1 +1 @@
- 0.8.15
+ 0.8.17
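@@ -0,0 +1,95 @@
+ # Execution Policy
+
+ Do not summarize this file.
+ During task execution, use this file to choose `normal` or `bounded_continuous`.
+
+ ## Default Policy
+
+ The default execution policy is `auto`: before every execution the AI first decomposes the task and judges risk, then decides whether to use
+ `normal` or `bounded_continuous`. Enabling continuous execution does not depend on the user saying a specific command.
+
+ Pre-execution planning must:
+
+ - Infer goal, scope, acceptance, permission, and verification method from the user goal, project context, and repository facts.
+ - List the L1 tasks and assign each L1 a Green / Yellow / Red risk rating.
+ - If there are fewer than 2 L1 tasks, use `normal`.
+ - If there are 2 or more L1 tasks, automatically use `bounded_continuous`.
+ - If any L1 is Red, stop first and ask for human confirmation; Green and Yellow do not block the start.
+ - Write the task tree to `execution_policy.task_tree` in `ai/project/task.md`.
+
+ ## Task Tree
+
+ The task tree executes in L1 -> L2 -> L3 order.
+
+ - Before executing an L1, first plan the L2 tasks that naturally derive from it.
+ - Before executing an L2, plan L3 if it still needs splitting.
+ - The default maximum is 3 levels. Add L4 dynamically only when an L3 is still too large, unverifiable, or not revertible.
+ - Every L1/L2/L3/L4 node must have a risk rating, expected change scope, acceptance method, and evidence requirement.
+ - The L1 list must be shown as a to-do list; check off and strike through each L1 as it is completed.
+ - During execution, `task_tree` node status must be updated: `pending`, `running`, `done`, or `blocked`.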
+
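The task-tree rules above (a depth limit and a fixed set of node states) can be sketched as follows. All function names are hypothetical and not part of the package. The depth check assumes that `allow_depth_4_when_needed` simply raises the limit from 3 to 4; the policy's "only when needed" judgment is left to the caller.

```javascript
// Hypothetical helpers, not part of the package.
const STATUSES = ["pending", "running", "done", "blocked"];

// Depth of a node: 1 for a leaf, plus the deepest chain of children.
function depthOf(node) {
  const children = node.children || [];
  return 1 + Math.max(0, ...children.map(depthOf));
}

// Updates a node's status, rejecting values outside the allowed set.
function setStatus(node, status) {
  if (!STATUSES.includes(status)) throw new Error(`unknown status: ${status}`);
  node.status = status;
  return node;
}

// True when any L1 subtree is deeper than the allowed limit.
function exceedsDepth(taskTree, maxDepth = 3, allowDepth4 = true) {
  const limit = allowDepth4 ? maxDepth + 1 : maxDepth;
  return taskTree.some((l1) => depthOf(l1) > limit);
}
```

A tree shaped L1 -> L2 -> L3 has depth 3 and passes either way; a fifth level is rejected even when L4 is allowed.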
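+ Recommended node structure:
+
+ ```yaml
+ id: "L1-1"
+ title: ""
+ risk: "Green | Yellow | Red"
+ status: "pending | running | done | blocked"
+ scope:
+ allowed: []
+ denied: []
+ acceptance: []
+ evidence: []
+ children: []
+ ```
+
+ ## Risk Rubric
+
+ Green:
+
+ - Within the current task scope;
+ - No new permission, command, network access, or destructive operation is needed;
+ - The acceptance method is clear;
+ - Does not change product direction, core architecture, data structure, security boundary, payment, account, or permission.
+
+ Yellow:
+
+ - Still within the current task scope;
+ - There is local uncertainty or a local verification failure;
+ - Can continue with a low-risk fix;
+ - No expansion of permission, scope, commands, or acceptance is needed.
+
+ Red:
+
+ - Needs expanded permission, a command that is not allowed, network access, or a destructive operation;
+ - Needs a change to product direction, core architecture, data structure, security boundary, payment, account, or permission;
+ - Needs to delete a large number of files, rewrite a core module, or choose among several high-cost options;
+ - Acceptance cannot be defined, or the task goal substantively conflicts with the project direction.
+
+ Only Red stops and waits for human confirmation. Green continues automatically. Yellow continues after a local low-risk fix.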
+
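The rubric's three outcomes can be sketched as a lookup that mirrors the `risk_gate` values from the task template (`continue`, `continue_with_local_fix`, `stop_for_human`). The function names are hypothetical, and `worstRisk` rests on an assumption of this sketch, not a rule stated in the policy: that several nodes are summarized by their most severe rating.

```javascript
// Hypothetical helpers, not part of the package.
const RISK_GATE = new Map([
  ["Green", "continue"],
  ["Yellow", "continue_with_local_fix"],
  ["Red", "stop_for_human"],
]);

// Maps a risk rating to the gate action; unknown ratings are rejected.
function gateAction(risk) {
  const action = RISK_GATE.get(risk);
  if (!action) throw new Error(`unknown risk: ${risk}`);
  return action;
}

// Assumption: the overall risk of several nodes is the most severe one.
function worstRisk(risks) {
  const order = ["Green", "Yellow", "Red"];
  return risks.reduce(
    (worst, risk) => (order.indexOf(risk) > order.indexOf(worst) ? risk : worst),
    "Green"
  );
}
```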
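+ ## Checkpoint
+
+ Output a Checkpoint only when risk rises, a boundary is about to change, a vertical slice is complete, or the task is about to wrap up.
+ Do not report just to spend the budget.
+
+ Every checkpoint must include:
+
+ ```text
+ ## Checkpoint
+ ### Task tree
+ ### Current completion
+ ### Completed
+ ### Evidence
+ ### Deviation risk: Green / Yellow / Red
+ ### Suggested next step
+ ### Auto-continue or not
+ ```
+
+ Evidence must include changed files, commands run, verification results, or the reason verification was not possible.
+ A purely subjective Green is not accepted.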
+
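The evidence requirement can be sketched as a simple predicate. The checkpoint shape used here (`evidence.changedFiles`, `evidence.commands`, `evidence.results`, `evidence.unverifiableReason`) is hypothetical and not defined by the package, and the check is deliberately lenient: it accepts any one kind of evidence, which is one possible reading of the rule.

```javascript
// Hypothetical checkpoint check, not part of the package.
function hasEvidence(checkpoint) {
  const evidence = checkpoint.evidence || {};
  const lists = [evidence.changedFiles, evidence.commands, evidence.results];
  // At least one non-empty list of concrete evidence...
  const anyList = lists.some((list) => Array.isArray(list) && list.length > 0);
  // ...or an explicit reason why verification was not possible.
  const hasReason =
    typeof evidence.unverifiableReason === "string" &&
    evidence.unverifiableReason.trim() !== "";
  return anyList || hasReason;
}
```

Under this sketch, a checkpoint that only claims Green with no evidence fields fails the check, which matches the rule that a purely subjective Green is not accepted.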
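+ ## Model Policy
+
+ Continuous execution does not change `model_policy`. Planning, architecture, failure review, or acceptance disputes
+ still escalate according to `model_policy`, with the reason recorded in `ai/project/metrics.json`.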
@@ -9,6 +9,7 @@
 
  1. `ai/template/protocol.md`
  2. `ai/template/rules/core.md`
+ 3. `ai/template/execution-policy.md`
 
  Then choose a mode:
 
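@@ -29,7 +30,9 @@
  update context or process new material, mentions `reconcile` or `ai/project/inbox/`,
  or `ai/project/inbox/` contains pending material other than `.gitkeep`, run `ai/template/reconcile.md`
  and stop or update according to its two-phase flow; however, `ai/project/inbox/processed/` holds already processed material
- and should not trigger reconciliation, and `ai/project/inbox/ideas/` should go through `strategy_update` first.
+ and should not trigger reconciliation, and `ai/project/inbox/ideas/` should go through `strategy_update` first. Even if the user says
+ "reconcile the whole inbox", default to only `ai/project/inbox/*.md` and
+ `ai/project/inbox/raw/*.md`.
  - If the user says "Start initializing this project", asks to initialize, organize, or generate the project context,
  or `ai/project/project.md` is empty, contains only placeholder content,
  or is incomplete, run `ai/template/bootstrap.md` and stop after the project context is confirmed.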
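@@ -46,9 +49,17 @@
  In task draft mode:
 
  1. Read the confirmed `ai/project/project.md` and the relevant `ai/project/refs/*.md`.
- 2. Draft `ai/project/task.md` from the user's current goal.
- 3. Ask at most 3 questions, and only about blockers in scope, risk, permission, or acceptance.
- 4. Stop and wait for human confirmation. Do not modify source or business files.
+ 2. From the user's current goal, project context, and repository facts, infer goal, scope, acceptance, permission,
+ verification method, and initial risk; do not ask the user to provide each item.
+ 3. Draft `ai/project/task.md` and set `execution_policy.mode` to `auto`.
+ 4. Before execution, list the L1 tasks, label each Green / Yellow / Red, and write them to
+ `execution_policy.task_tree`. Use `normal` when there are fewer than 2 L1 tasks; with 2 or more L1 tasks,
+ automatically use `bounded_continuous`.
+ 5. If there are no Red pre-check items, set `readiness` to `ready_to_execute`; if human confirmation is needed,
+ set it to `draft_for_confirmation`; if the task cannot be executed, set it to `blocked`.
+ 6. Stop and wait for human confirmation only when a Red pre-check item appears. If the user asked to execute or continue and the pre-check
+ shows only Green / Yellow, go straight into execution mode.
+ 7. Do not modify source or business files in task draft mode.
 
  Task draft mode must end with the following structure:
 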
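@@ -97,7 +108,12 @@
  2. `ai/project/runtime.md`
  3. `ai/project/task.md`
 
- Then execute the task and write the results to:
+ Then do pre-execution planning according to `ai/template/execution-policy.md`: list the L1 tasks, label each L1
+ Green / Yellow / Red, and write them to `execution_policy.task_tree`. Choose `normal` or `bounded_continuous`
+ automatically from the L1 count. Plan L2 before executing an L1, and plan L3 as needed before executing an L2;
+ the default maximum is 3 levels, with L4 allowed when necessary. As each L1 completes, check it off and strike it through in the list, and update
+ the `task_tree` node status. Only Red stops and waits for human confirmation; Green continues automatically, and Yellow continues after a local
+ low-risk fix. Finally, write the results to:
 
  - `ai/project/result.json`
  - `ai/project/result.md`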
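@@ -41,6 +41,17 @@ ai/project/task.md = current execution contract
  9. Execute only within the project task boundary.
  10. Write `ai/project/result.json`, `ai/project/result.md`, and `ai/project/metrics.json`.
 
+ ## Execution Authorization Mode
+
+ `ai/template/execution-policy.md` must be read before task execution.
+
+ The execution policy defaults to `auto`: the AI first splits out L1 tasks and judges Green / Yellow / Red, then decides whether to use
+ `normal` or `bounded_continuous`. Fewer than 2 L1 tasks uses `normal`; 2 or more L1 tasks
+ automatically enables `bounded_continuous`. Only Red stops and waits for human confirmation.
+
+ The task tree, risk rubric, Checkpoint evidence, and `task_tree` status update rules are defined in
+ `ai/template/execution-policy.md`.
+
  ## Bootstrap Mode
 
  Bootstrap mode prepares a stable understanding of the project: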
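@@ -20,6 +20,8 @@
 
  `ai/project/inbox/` is the area for material waiting to be absorbed. After material is reconciled and confirmed, it is moved to
  `ai/project/inbox/processed/` for traceability and to avoid reconciling it again later.
+ Even if the user says "reconcile the whole inbox", default to only `ai/project/inbox/*.md`
+ and `ai/project/inbox/raw/*.md`; do not recursively read `processed/**` or `ideas/**`.
 
  ## Read First
 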
@@ -28,10 +30,11 @@
  3. `ai/project/project.md`
  4. `ai/project/runtime.md`
  5. `ai/project/refs/*.md`
- 6. The new material named by the human; if none is named, read `ai/project/inbox/*.md`
+ 6. The new material named by the human; if none is named, read only `ai/project/inbox/*.md`
+ and `ai/project/inbox/raw/*.md`
 
- Do not read `ai/project/inbox/processed/**`, `ai/project/archive/**`, source,
- tests, config, or dependency files by default unless the human explicitly asks you to use them for fact checking.
+ Do not read `ai/project/inbox/processed/**`, `ai/project/inbox/ideas/**`,
+ `ai/project/archive/**`, source, tests, config, or dependency files by default unless the human explicitly asks you to use them for fact checking.
 
  ## Reconciliation Principles
 
@@ -8,6 +8,7 @@
  - Scope
  - Acceptance
  - Permission
+ - Execution policy
 
  If readiness fails, do not edit code. Write blocked results to:
 
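@@ -100,6 +101,24 @@
 
  Do not modify current task, results, metrics, archives, source, tests, config, or dependency files unless the human explicitly authorizes it.
 
+ ## Bounded Continuous Execution Gate
+
+ Before every execution, the AI must read `ai/template/execution-policy.md` and first decompose the task and judge risk,
+ instead of waiting for the user to explicitly say "enable continuous execution".
+
+ Hard gates:
+
+ - `execution_policy.task_tree` must record the L1 checklist and execution state.
+ - Every task node must have a Green / Yellow / Red risk rating.
+ - Every Checkpoint must include evidence; a purely subjective Green is not accepted.
+ - Red must stop and wait for human confirmation.
+ - Any direction, core architecture, data structure, security, payment, account, permission, large deletion,
+ core rewrite, or high-cost option choice must stop.
+ - Any need to expand scope, permission, commands, network access, or acceptance must stop.
+
+ The AI infers goal, scope, acceptance, and permission, but must not cross project rules, explicit user limits,
+ `permission.modify.denied`, security boundaries, or destructive-action limits.
+
  ## Strategy Update Gate
 
  If the user asks to update the project North Star, final shape, product constitution, module map, roadmap, or project direction,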
package/test/selftest.js CHANGED
@@ -51,6 +51,7 @@ function testInitUpdateDoctor() {
  assert(read(cwd, "ai/template/LANG") === "zh\n", "init should default to zh template");
  assert(exists(cwd, "ai/template/VERSION"), "init should create template VERSION");
  assert(exists(cwd, "ai/template/bootstrap.md"), "init should create template bootstrap prompt");
+ assert(exists(cwd, "ai/template/execution-policy.md"), "init should create execution policy prompt");
  assert(exists(cwd, "ai/template/prompt.md"), "init should create template prompt");
  assert(exists(cwd, "ai/template/reconcile.md"), "init should create template reconcile prompt");
  assert(exists(cwd, "ai/project/inbox/.gitkeep"), "init should create inbox directory");
@@ -77,6 +78,22 @@ function testInitUpdateDoctor() {
  assert(read(cwd, "ai/template/bootstrap.md").includes("未吸收资料"), "bootstrap handoff should audit unabsorbed material");
  assert(read(cwd, "ai/template/bootstrap.md").includes("冲突处理"), "bootstrap handoff should audit conflict handling");
  assert(read(cwd, "ai/template/prompt.md").includes("任务草稿交接"), "execution prompt should include task handoff");
+ assert(read(cwd, "ai/template/prompt.md").includes("ai/template/execution-policy.md"), "execution prompt should read execution policy");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("风险分级"), "execution policy should include risk rubric");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("execution_policy.task_tree"), "execution policy should require task tree persistence");
+ assert(read(cwd, "ai/template/prompt.md").includes("默认也只处理 `ai/project/inbox/*.md`"), "execution prompt should narrow inbox reconciliation");
+ assert(read(cwd, "ai/template/protocol.md").includes("`bounded_continuous`"), "protocol should include bounded continuous execution");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("垂直切片"), "protocol should require vertical-slice progress for continuous execution");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("L1 为 2 个或更多,自动启用"), "protocol should auto-enable continuous execution from L1 count");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("每个 Checkpoint 必须包含"), "protocol should require evidence-backed checkpoints");
+ assert(read(cwd, "ai/template/rules/core.md").includes("边界内连续执行门"), "core rules should include bounded continuous execution gate");
+ assert(read(cwd, "ai/template/rules/core.md").includes("需要扩大范围、权限、命令、网络或验收时"), "core rules should stop continuous execution before boundary expansion");
+ assert(read(cwd, "ai/project/task.md").includes("execution_policy:"), "task template should include execution policy");
+ assert(read(cwd, "ai/project/task.md").includes("readiness:"), "task template should include readiness state");
+ assert(read(cwd, "ai/project/task.md").includes("activation_rule: \"auto_enable_when_l1_count_gte_2\""), "task template should define automatic activation rule");
+ assert(read(cwd, "ai/project/task.md").includes("risk_gate:"), "task template should define risk gate");
+ assert(read(cwd, "ai/project/task.md").includes("status: \"pending | running | done | blocked\""), "task template should define task tree node status");
+ assert(read(cwd, "ai/project/task.md").includes("progress_unit: \"vertical_slice\""), "task template should define continuous progress unit");
  assert(read(cwd, "ai/template/prompt.md").includes("开始初始化这个项目"), "execution prompt should route natural bootstrap entry");
  assert(read(cwd, "ai/template/prompt.md").includes("开始初始化这个项目,并吸收 ai/project/inbox/ 里的资料"), "execution prompt should route bootstrap with inbox material");
  assert(read(cwd, "ai/template/prompt.md").includes("不要重新 bootstrap"), "execution prompt should reconcile inbox material when project context already exists");
@@ -85,6 +102,7 @@ function testInitUpdateDoctor() {
  assert(read(cwd, "ai/template/prompt.md").includes("strategy_update"), "execution prompt should route strategy updates");
  assert(read(cwd, "ai/template/reconcile.md").includes("上下文整合"), "init should install reconcile prompt");
  assert(read(cwd, "ai/template/reconcile.md").includes("整合计划"), "reconcile prompt should require a plan first");
+ assert(read(cwd, "ai/template/reconcile.md").includes("不要递归读取 `processed/**` 或 `ideas/**`"), "reconcile prompt should exclude processed and ideas recursively");
  assert(read(cwd, "ai/template/reconcile.md").includes("ai/project/inbox/processed/raw/file.md"), "reconcile prompt should archive absorbed raw inbox material");
  assert(read(cwd, "ai/template/reconcile.md").includes("未吸收资料"), "reconcile handoff should audit unabsorbed material");
  assert(read(cwd, "ai/template/reconcile.md").includes("冲突处理"), "reconcile handoff should audit conflict handling");
@@ -130,6 +148,7 @@ function testEnglishInitUpdateDoctor() {
 
  const initOutput = run(["init", "--lang", "en"], cwd);
  assert(read(cwd, "ai/template/LANG") === "en\n", "init --lang en should install English template");
+ assert(exists(cwd, "ai/template/execution-policy.md"), "English init should create execution policy prompt");
  assert(read(cwd, "ai/template/bootstrap.md").includes("Confirmation Dimensions"), "English init should install English bootstrap prompt");
  assert(read(cwd, "ai/template/bootstrap.md").includes("Do not summarize this file"), "English bootstrap prompt should prevent summary-only behavior");
  assert(read(cwd, "ai/template/bootstrap.md").includes("ai/project/refs/final-shape.md"), "English bootstrap prompt should initialize the North Star");
@@ -142,6 +161,22 @@ function testEnglishInitUpdateDoctor() {
  assert(read(cwd, "ai/template/bootstrap.md").includes("Unabsorbed material"), "English bootstrap handoff should audit unabsorbed material");
  assert(read(cwd, "ai/template/bootstrap.md").includes("Conflict handling"), "English bootstrap handoff should audit conflict handling");
  assert(read(cwd, "ai/template/prompt.md").includes("Start initializing this project"), "English execution prompt should route natural bootstrap entry");
+ assert(read(cwd, "ai/template/prompt.md").includes("ai/template/execution-policy.md"), "English execution prompt should read execution policy");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("Risk Rubric"), "English execution policy should include risk rubric");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("execution_policy.task_tree"), "English execution policy should require task tree persistence");
+ assert(read(cwd, "ai/template/prompt.md").includes("default to only `ai/project/inbox/*.md`"), "English execution prompt should narrow inbox reconciliation");
+ assert(read(cwd, "ai/template/protocol.md").includes("`bounded_continuous`"), "English protocol should include bounded continuous execution");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("vertical"), "English protocol should require vertical-slice progress for continuous execution");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("Automatically use `bounded_continuous`"), "English protocol should auto-enable continuous execution from L1 count");
+ assert(read(cwd, "ai/template/execution-policy.md").includes("Every checkpoint must include"), "English protocol should require evidence-backed checkpoints");
+ assert(read(cwd, "ai/template/rules/core.md").includes("Bounded Continuous Execution Gate"), "English core rules should include bounded continuous execution gate");
+ assert(read(cwd, "ai/template/rules/core.md").includes("expand scope, permission, commands, network access, or acceptance"), "English core rules should stop continuous execution before boundary expansion");
+ assert(read(cwd, "ai/project/task.md").includes("execution_policy:"), "English task template should include execution policy");
+ assert(read(cwd, "ai/project/task.md").includes("readiness:"), "English task template should include readiness state");
+ assert(read(cwd, "ai/project/task.md").includes("activation_rule: \"auto_enable_when_l1_count_gte_2\""), "English task template should define automatic activation rule");
+ assert(read(cwd, "ai/project/task.md").includes("risk_gate:"), "English task template should define risk gate");
+ assert(read(cwd, "ai/project/task.md").includes("status: \"pending | running | done | blocked\""), "English task template should define task tree node status");
+ assert(read(cwd, "ai/project/task.md").includes("progress_unit: \"vertical_slice\""), "English task template should define continuous progress unit");
  assert(read(cwd, "ai/template/prompt.md").includes("Start initializing this project and absorb the material in ai/project/inbox/"), "English execution prompt should route bootstrap with inbox material");
  assert(read(cwd, "ai/template/prompt.md").includes("instead of bootstrapping again"), "English execution prompt should reconcile inbox material when project context already exists");
  assert(read(cwd, "ai/template/prompt.md").includes("Reconcile the new material in ai/project/inbox/"), "English execution prompt should route natural reconcile entry");
@@ -155,6 +190,7 @@ function testEnglishInitUpdateDoctor() {
  assert(read(cwd, "ai/project/proposals/final-shape-updates/_template.md").includes("`accepted`"), "English proposal template should describe accepted status");
  assert(read(cwd, "ai/template/reconcile.md").includes("Context Reconcile"), "English init should install English reconcile prompt");
  assert(read(cwd, "ai/template/reconcile.md").includes("reconciliation plan"), "English reconcile prompt should require a plan first");
+ assert(read(cwd, "ai/template/reconcile.md").includes("do not recursively\nread `processed/**` or `ideas/**`"), "English reconcile prompt should exclude processed and ideas recursively");
  assert(read(cwd, "ai/template/reconcile.md").includes("ai/project/inbox/processed/raw/file.md"), "English reconcile prompt should archive absorbed raw inbox material");
  assert(read(cwd, "ai/template/reconcile.md").includes("Unabsorbed material"), "English reconcile handoff should audit unabsorbed material");
  assert(read(cwd, "ai/template/reconcile.md").includes("Conflict handling"), "English reconcile handoff should audit conflict handling");
@@ -212,6 +248,25 @@ function testDoctorFailureAndWarning() {
  write(taskWarnCwd, "ai/project/task.md", "# Task only\n");
  const taskWarnOutput = run(["doctor"], taskWarnCwd);
  assert(taskWarnOutput.includes("任务 front matter 缺少关键字段"), "doctor should warn incomplete task front matter");
+
+ const taskPolicyWarnCwd = createTempProject("agent-execution-template-task-policy");
+ run(["init"], taskPolicyWarnCwd);
+ write(taskPolicyWarnCwd, "ai/project/task.md", `---
+ task_id: ""
+ type: "feature"
+ priority: "P2"
+ risk_level: "low"
+ readiness: "ready_to_execute"
+ execution_policy:
+ mode: "auto"
+ model_policy: {}
+ refs: {}
+ permission: {}
+ ---
+ # Task
+ `);
+ const taskPolicyWarnOutput = run(["doctor"], taskPolicyWarnCwd);
+ assert(taskPolicyWarnOutput.includes("任务 front matter 缺少关键字段"), "doctor should warn when execution policy fields are incomplete");
  }
 
  function testRefreshBacksUpAndImportsOldProject() {
@@ -247,14 +302,14 @@ function testNextCommandRoutesByProjectState() {
  assert(run(["next"], cwd).includes("开始初始化这个项目"), "next should bootstrap a freshly installed project");
 
  write(cwd, "ai/project/project.md", "USER PROJECT MARKER\n");
- assert(run(["next"], cwd).includes("继续推进这个项目"), "next should continue when no intake is waiting");
+ assert(run(["next"], cwd).includes("执行前先拆 L1 任务"), "next should continue with automatic execution guidance when no intake is waiting");
 
  write(cwd, "ai/project/inbox/product.md", "# Product material\n");
  assert(run(["next"], cwd).includes("整合 ai/project/inbox/ 里的新资料"), "next should route material inbox to reconcile");
  fs.unlinkSync(path.join(cwd, "ai/project/inbox/product.md"));
 
  write(cwd, "ai/project/inbox/processed/product.md", "# Processed material\n");
- assert(run(["next"], cwd).includes("继续推进这个项目"), "next should ignore processed inbox material");
+ assert(run(["next"], cwd).includes("执行前先拆 L1 任务"), "next should ignore processed inbox material");
  fs.unlinkSync(path.join(cwd, "ai/project/inbox/processed/product.md"));
 
  write(cwd, "ai/project/inbox/ideas/new-direction.md", "# Direction idea\n");
@@ -262,7 +317,7 @@ function testNextCommandRoutesByProjectState() {
  fs.unlinkSync(path.join(cwd, "ai/project/inbox/ideas/new-direction.md"));
 
  write(cwd, "ai/project/proposals/final-shape-updates/proposal.md", "---\nstatus: \"applied\"\n---\n");
- assert(run(["next"], cwd).includes("继续推进这个项目"), "next should ignore already applied proposals");
+ assert(run(["next"], cwd).includes("执行前先拆 L1 任务"), "next should ignore already applied proposals");
 
  write(cwd, "ai/project/proposals/final-shape-updates/proposal.md", "---\nstatus: \"proposed\"\n---\n");
  assert(run(["next"], cwd).includes("已有方向修订提案"), "next should route existing proposals to human review");