@wnlen/agent-execution-template 0.8.15 → 0.8.16
- package/README.md +3 -1
- package/README.zh-CN.md +3 -1
- package/bin/agent-execution-template.js +1 -0
- package/docs/SPEC.md +58 -18
- package/package.json +1 -1
- package/template/en/ai/README.md +4 -0
- package/template/en/ai/project/task.md +65 -0
- package/template/en/ai/template/VERSION +1 -1
- package/template/en/ai/template/prompt.md +21 -6
- package/template/en/ai/template/protocol.md +68 -0
- package/template/en/ai/template/reconcile.md +8 -4
- package/template/en/ai/template/rules/core.md +40 -0
- package/template/zh/ai/README.md +3 -0
- package/template/zh/ai/project/task.md +50 -0
- package/template/zh/ai/template/VERSION +1 -1
- package/template/zh/ai/template/prompt.md +16 -5
- package/template/zh/ai/template/protocol.md +54 -0
- package/template/zh/ai/template/reconcile.md +6 -3
- package/template/zh/ai/template/rules/core.md +30 -0
- package/test/selftest.js +24 -0
package/README.md
CHANGED
|
@@ -147,7 +147,7 @@ npx -y @wnlen/agent-execution-template strategy --lang en
|
|
|
147
147
|
| Strategy amendment gate | New direction goes through `inbox/ideas/`, a proposal, human confirmation, then an explicit apply task. |
|
|
148
148
|
| Protected project context | `update` refreshes `ai/template/**` without overwriting `ai/project/**`. |
|
|
149
149
|
| Project context refresh | `refresh` backs up old `ai/project/**`, creates a fresh project context, and imports the old context into the inbox for reconciliation. |
|
|
150
|
-
|
|
|
150
|
+
| Automatic continuous execution | The agent decomposes L1/L2/L3 before execution; 2+ L1 tasks automatically enable bounded continuous execution, and only Red risk stops for confirmation. |
|
|
151
151
|
| Auditable results | Every run can leave human-readable output, machine-readable facts, and metrics. |
|
|
152
152
|
| Token-efficient model policy | Cheap models handle bounded work; strong models are reserved for judgment points. |
|
|
153
153
|
| Upgradeable template | Reuse protocol improvements without losing local project memory. |
|
|
@@ -323,6 +323,8 @@ Reconcile the new material in ai/project/inbox/
|
|
|
323
323
|
|
|
324
324
|
The agent must produce a reconciliation plan first, wait for confirmation, then merge long-lived facts into `project.md`, `runtime.md`, and `refs/*`.
|
|
325
325
|
After reconciliation, processed material is moved to `ai/project/inbox/processed/` for traceability.
|
|
326
|
+
By default, only `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md` are absorbed;
|
|
327
|
+
`processed/**` is not reconciled again, and `ideas/**` goes through the direction amendment proposal flow.
|
|
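The default scope described in the two added lines can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `default_reconcile_inputs` is hypothetical and not part of the package's CLI, and it assumes the inbox is a directory on the local filesystem.

```python
from pathlib import Path


def default_reconcile_inputs(inbox: Path) -> list[Path]:
    """Return the Markdown files that are absorbed by default.

    Only `inbox/*.md` and `inbox/raw/*.md` are matched. Both patterns are
    non-recursive, so `processed/**` and `ideas/**` are never picked up.
    """
    candidates = list(inbox.glob("*.md"))
    raw_dir = inbox / "raw"
    if raw_dir.is_dir():  # `raw/` may not exist in a fresh project
        candidates.extend(raw_dir.glob("*.md"))
    return sorted(p for p in candidates if p.is_file())
```

Because the patterns do not recurse, no explicit exclude list is needed for `processed/` or `ideas/`.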
326
328
|
|
|
327
329
|
## Project North Star
|
|
328
330
|
|
package/README.zh-CN.md
CHANGED
|
@@ -157,7 +157,7 @@ npx -y @wnlen/agent-execution-template strategy
|
|
|
157
157
|
| 策略修订门禁 | 新方向先进入 `inbox/ideas/`,生成 proposal,人类确认后才合并进北极星、模块地图或路线图。 |
|
|
158
158
|
| 保护项目现场 | `update` 刷新 `ai/template/**`,不会覆盖 `ai/project/**`。 |
|
|
159
159
|
| 项目上下文重整 | `refresh` 备份旧 `ai/project/**`,生成新项目上下文,并把旧上下文放入 inbox 供 AI 整理。 |
|
|
160
|
-
|
|
|
160
|
+
| 自动连续执行 | AI 执行前自动拆 L1/L2/L3 任务树;L1 两个以上自动启用边界内连续执行,只有 Red 风险停下来确认。 |
|
|
161
161
|
| 可审计结果 | 每次执行都可以留下人类可读结果、机器可读事实和 metrics。 |
|
|
162
162
|
| Token-efficient 模型策略 | 便宜模型处理边界清楚的工作,强模型只用于关键判断点。 |
|
|
163
163
|
| 可升级模板 | 协议可以持续改进,不丢失项目本地记忆。 |
|
|
@@ -332,6 +332,8 @@ ai/project/inbox/
|
|
|
332
332
|
|
|
333
333
|
AI 必须先输出整合计划,等待确认后,再把长期有效事实合并进 `project.md`、`runtime.md` 和 `refs/*`。
|
|
334
334
|
整合完成后,已处理资料统一移动到 `ai/project/inbox/processed/`,保留用于追溯。
|
|
335
|
+
默认只吸收 `ai/project/inbox/*.md` 和 `ai/project/inbox/raw/*.md`;
|
|
336
|
+
`processed/**` 不会再次参与整合,`ideas/**` 走方向修订提案。
|
|
335
337
|
|
|
336
338
|
## 项目北极星
|
|
337
339
|
|
package/docs/SPEC.md
CHANGED
|
@@ -22,7 +22,7 @@ npx 安装协议 -> AI 整理项目上下文 -> 人类确认 -> AI 生成任务
|
|
|
22
22
|
|
|
23
23
|
```text
|
|
24
24
|
Protocol: v0.8
|
|
25
|
-
Package: @wnlen/agent-execution-template@0.8.
|
|
25
|
+
Package: @wnlen/agent-execution-template@0.8.16
|
|
26
26
|
中文安装: npx -y @wnlen/agent-execution-template init
|
|
27
27
|
英文安装: npx -y @wnlen/agent-execution-template init --lang en
|
|
28
28
|
```
|
|
@@ -390,7 +390,7 @@ npx -y @wnlen/agent-execution-template doctor
|
|
|
390
390
|
```text
|
|
391
391
|
Agent Execution Template 检查
|
|
392
392
|
|
|
393
|
-
模板版本: 0.8.
|
|
393
|
+
模板版本: 0.8.16
|
|
394
394
|
模板语言: zh
|
|
395
395
|
|
|
396
396
|
[通过] ai/template/LANG
|
|
@@ -610,6 +610,7 @@ ai/project/refs/roadmap.md
|
|
|
610
610
|
- 任务类型;
|
|
611
611
|
- 优先级;
|
|
612
612
|
- 风险等级;
|
|
613
|
+
- 执行策略;
|
|
613
614
|
- 模型分工策略;
|
|
614
615
|
- refs 要求;
|
|
615
616
|
- 修改权限;
|
|
@@ -645,7 +646,44 @@ apply_strategy_update
|
|
|
645
646
|
权限不允许,不越界修改。
|
|
646
647
|
```
|
|
647
648
|
|
|
648
|
-
## 14.
|
|
649
|
+
## 14. Execution Authorization Policy
|
|
650
|
+
|
|
651
|
+
The execution policy is written in:
|
|
652
|
+
|
|
653
|
+
```text
|
|
654
|
+
ai/project/task.md.execution_policy
|
|
655
|
+
```
|
|
656
|
+
|
|
657
|
+
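The default mode is `auto`. Before each execution, the AI first decomposes the task and judges risk, then decides whether to use
|
|
658
|
+
`normal` or `bounded_continuous`; this does not depend on a human keyword.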
|
|
659
|
+
|
|
660
|
+
Pre-execution planning:
|
|
661
|
+
|
|
662
|
+
- The AI infers goal, scope, acceptance, permissions, and verification method from the human goal, project context, and repository facts;
|
|
663
|
+
- List the L1 task checklist first and mark each L1 Green / Yellow / Red;
|
|
664
|
+
- Use `normal` when there are fewer than 2 L1 tasks;
|
|
665
|
+
- Automatically use `bounded_continuous` when there are 2 or more L1 tasks;
|
|
666
|
+
- Stop and wait for human confirmation if any L1 is Red; Green and Yellow do not block startup.
|
|
667
|
+
|
|
668
|
+
`bounded_continuous` rules:
|
|
669
|
+
|
|
670
|
+
- Execute in L1 -> L2 -> L3 order; plan L2 before executing an L1, and plan L3 as needed before executing an L2;
|
|
671
|
+
- Default to at most 3 levels; add L4 dynamically only when L3 is still too large, unverifiable, or cannot be reverted;
|
|
672
|
+
- Every task node must have a risk rating, expected edit scope, acceptance method, and evidence requirements;
|
|
673
|
+
- The L1 checklist must be shown as a to-do list; check off and strike through each L1 as it is completed;
|
|
674
|
+
- Default to `vertical_slice` progress; every loop must produce a reviewable increment;
|
|
675
|
+
- Every Checkpoint must include evidence: changed files, commands run, verification results, or why verification was not possible;
|
|
676
|
+
- Green may continue automatically;
|
|
677
|
+
- Yellow continues after a local low-risk correction;
|
|
678
|
+
- Red must stop and wait for human confirmation;
|
|
679
|
+
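- The AI infers goal, scope, acceptance, and permissions, but must not cross project rules, explicit human limits,
|
|
680
|
+
`permission.modify.denied`, security boundaries, or destructive-action limits;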
|
|
681
|
+
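- If permission must expand, an unallowed command must run, network access is needed, a destructive action is needed, or product direction or core architecture would change,
|
|
682
|
+
the current node must be marked Red.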
|
|
683
|
+
|
|
684
|
+
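It does not apply to tasks whose direction is undecided and cannot be inferred, whose acceptance cannot be defined, or that involve high-risk architectural trade-offs; rate these Red directly.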
|
|
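The planning rules in this section reduce to a small decision, sketched below in Python. The names `Risk` and `plan_execution` are hypothetical and exist only for illustration; the spec describes agent behavior, not a library API.

```python
from enum import Enum


class Risk(Enum):
    GREEN = "Green"
    YELLOW = "Yellow"
    RED = "Red"


def plan_execution(l1_risks):
    """Resolve the `auto` policy from the risk ratings of the L1 checklist."""
    # Fewer than 2 L1 tasks -> normal; 2 or more -> bounded_continuous.
    mode = "bounded_continuous" if len(l1_risks) >= 2 else "normal"
    # Any Red L1 blocks startup until a human confirms; Green and Yellow do not.
    stop_for_human = any(risk is Risk.RED for risk in l1_risks)
    return {"mode": mode, "stop_for_human": stop_for_human}
```

Keeping mode selection and the Red check as separate outputs mirrors the text: the L1 count picks the mode, while risk alone decides whether startup is blocked.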
685
|
+
|
|
686
|
+
## 15. Model Division-of-Labor Protocol
|
|
649
687
|
|
|
650
688
|
The model division of labor is written in:
|
|
651
689
|
|
|
@@ -673,7 +711,7 @@ Default cheap. Escalate for judgment. Record why.
|
|
|
673
711
|
- 写明需要的 strong model role;
|
|
674
712
|
- 记录到 `ai/project/metrics.json`。
|
|
675
713
|
|
|
676
|
-
##
|
|
714
|
+
## 16. 风险门禁
|
|
677
715
|
|
|
678
716
|
任务涉及以下内容时必须谨慎:
|
|
679
717
|
|
|
@@ -687,7 +725,7 @@ Default cheap. Escalate for judgment. Record why.
|
|
|
687
725
|
|
|
688
726
|
如果风险高且 `task.md` 未明确授权,AI 必须停止并写 blocked 结果。
|
|
689
727
|
|
|
690
|
-
##
|
|
728
|
+
## 17. refs 延迟加载
|
|
691
729
|
|
|
692
730
|
`ai/project/refs/` 存放按需读取的详细资料。
|
|
693
731
|
|
|
@@ -707,7 +745,7 @@ Default cheap. Escalate for judgment. Record why.
|
|
|
707
745
|
|
|
708
746
|
每次读取 ref 都必须在 `result.json.refs_read` 中记录原因。
|
|
709
747
|
|
|
710
|
-
##
|
|
748
|
+
## 17.1 inbox 待吸收资料
|
|
711
749
|
|
|
712
750
|
`ai/project/inbox/` 存放尚未整合进项目上下文的新资料。
|
|
713
751
|
已完成整合的资料统一移动到 `ai/project/inbox/processed/`,用于追溯并避免重复处理。
|
|
@@ -732,11 +770,13 @@ AI 必须先输出整合计划,等人类确认后,才更新 `project.md`、`
|
|
|
732
770
|
应用整合完成后,AI 必须把本次已处理的 `ai/project/inbox/*.md`
|
|
733
771
|
移动到 `ai/project/inbox/processed/`。`processed/` 中的资料默认不再触发
|
|
734
772
|
`reconcile` 或 `next` 的待处理资料判断。
|
|
773
|
+
即使用户口语上说“整合整个 inbox”,默认也只处理 `ai/project/inbox/*.md`
|
|
774
|
+
和 `ai/project/inbox/raw/*.md`;`ai/project/inbox/ideas/**` 不参与上下文整合。
|
|
735
775
|
|
|
736
776
|
如果新资料会改变 `final-shape.md`、`module-map.md` 或 `roadmap.md` 的方向性内容,
|
|
737
777
|
上下文整合只能建议创建 `strategy_update` 提案,不能直接改这些文件。
|
|
738
778
|
|
|
739
|
-
##
|
|
779
|
+
## 17.2 项目北极星与策略修订
|
|
740
780
|
|
|
741
781
|
`ai/project/refs/final-shape.md` 是项目北极星说明书,也可以理解为
|
|
742
782
|
Product Constitution / Final Shape Spec。它负责保存:
|
|
@@ -812,7 +852,7 @@ ai/project/refs/decisions.md
|
|
|
812
852
|
ai/project/refs/constraints.md
|
|
813
853
|
```
|
|
814
854
|
|
|
815
|
-
##
|
|
855
|
+
## 18. 输出结果
|
|
816
856
|
|
|
817
857
|
每次执行必须写:
|
|
818
858
|
|
|
@@ -822,7 +862,7 @@ ai/project/result.md
|
|
|
822
862
|
ai/project/metrics.json
|
|
823
863
|
```
|
|
824
864
|
|
|
825
|
-
###
|
|
865
|
+
### 18.1 `result.json`
|
|
826
866
|
|
|
827
867
|
机器可读结果,是当前最新权威执行记录。
|
|
828
868
|
|
|
@@ -840,7 +880,7 @@ ai/project/metrics.json
|
|
|
840
880
|
- next;
|
|
841
881
|
- runtime update proposal。
|
|
842
882
|
|
|
843
|
-
###
|
|
883
|
+
### 18.2 `result.md`
|
|
844
884
|
|
|
845
885
|
人类可读摘要。
|
|
846
886
|
|
|
@@ -852,7 +892,7 @@ ai/project/metrics.json
|
|
|
852
892
|
- 有什么问题;
|
|
853
893
|
- 下一步。
|
|
854
894
|
|
|
855
|
-
###
|
|
895
|
+
### 18.3 `metrics.json`
|
|
856
896
|
|
|
857
897
|
执行经济性和模型分工记录。
|
|
858
898
|
|
|
@@ -867,7 +907,7 @@ ai/project/metrics.json
|
|
|
867
907
|
- human fix required;
|
|
868
908
|
- reuse potential。
|
|
869
909
|
|
|
870
|
-
##
|
|
910
|
+
## 19. 状态规则
|
|
871
911
|
|
|
872
912
|
允许状态:
|
|
873
913
|
|
|
@@ -885,7 +925,7 @@ blocked
|
|
|
885
925
|
- 任务不可执行,使用 `blocked`;
|
|
886
926
|
- 执行失败且无法完成,使用 `failed`。
|
|
887
927
|
|
|
888
|
-
##
|
|
928
|
+
## 20. runtime 治理
|
|
889
929
|
|
|
890
930
|
`ai/project/runtime.md` 只存当前仍然有效的执行上下文。
|
|
891
931
|
|
|
@@ -904,7 +944,7 @@ ai/project/result.json.runtime_update
|
|
|
904
944
|
|
|
905
945
|
再由单独任务决定是否更新 runtime。
|
|
906
946
|
|
|
907
|
-
##
|
|
947
|
+
## 21. 同步规则
|
|
908
948
|
|
|
909
949
|
从模板仓库导入真实项目:
|
|
910
950
|
|
|
@@ -922,7 +962,7 @@ ai/project/result.json.runtime_update
|
|
|
922
962
|
|
|
923
963
|
这是整个项目的安全底线。
|
|
924
964
|
|
|
925
|
-
##
|
|
965
|
+
## 22. npm 包结构
|
|
926
966
|
|
|
927
967
|
模板仓库内部结构:
|
|
928
968
|
|
|
@@ -952,7 +992,7 @@ LICENSE
|
|
|
952
992
|
- `bin/agent-execution-template.js` 是 CLI;
|
|
953
993
|
- `test/selftest.js` 是本地自测。
|
|
954
994
|
|
|
955
|
-
##
|
|
995
|
+
## 23. 自测与发布检查
|
|
956
996
|
|
|
957
997
|
本地自测:
|
|
958
998
|
|
|
@@ -997,7 +1037,7 @@ diff 检查:
|
|
|
997
1037
|
git diff --check
|
|
998
1038
|
```
|
|
999
1039
|
|
|
1000
|
-
##
|
|
1040
|
+
## 24. 当前能力边界
|
|
1001
1041
|
|
|
1002
1042
|
当前项目已经能做到:
|
|
1003
1043
|
|
|
@@ -1018,7 +1058,7 @@ git diff --check
|
|
|
1018
1058
|
- IDE 插件;
|
|
1019
1059
|
- 发布流水线。
|
|
1020
1060
|
|
|
1021
|
-
##
|
|
1061
|
+
## 25. 最终判断
|
|
1022
1062
|
|
|
1023
1063
|
Agent Execution Template v0.8 已经从一个 prompt/template 原型,升级为:
|
|
1024
1064
|
|
package/package.json
CHANGED
package/template/en/ai/README.md
CHANGED
|
@@ -77,6 +77,10 @@ The workflow produces a reconciliation plan first and updates `project.md`,
|
|
|
77
77
|
`runtime.md`, and `refs/*` only after confirmation. After reconciliation,
|
|
78
78
|
processed material is moved to `ai/project/inbox/processed/`.
|
|
79
79
|
|
|
80
|
+
By default, only `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md` are
|
|
81
|
+
absorbed. `processed/**` is trace history and is not reconciled again;
|
|
82
|
+
`ideas/**` goes through the direction amendment proposal flow.
|
|
83
|
+
|
|
80
84
|
## Direction Amendments
|
|
81
85
|
|
|
82
86
|
The North Star, module map, and roadmap belong to the project direction layer:
|
package/template/en/ai/project/task.md
CHANGED
|
@@ -4,6 +4,31 @@ type: "bugfix | feature | refactor | docs | config | test | research | strategy_
|
|
|
4
4
|
priority: "P0 | P1 | P2 | P3"
|
|
5
5
|
risk_level: "low | medium | high"
|
|
6
6
|
depends_on_previous_result: false
|
|
7
|
+
execution_policy:
|
|
8
|
+
mode: "auto | normal | bounded_continuous"
|
|
9
|
+
activation_rule: "auto_enable_when_l1_count_gte_2"
|
|
10
|
+
max_depth: 3
|
|
11
|
+
allow_depth_4_when_needed: true
|
|
12
|
+
progress_unit: "vertical_slice"
|
|
13
|
+
task_tree: []
|
|
14
|
+
checkpoint_budget:
|
|
15
|
+
l1: 0
|
|
16
|
+
l2: 0
|
|
17
|
+
l3: 0
|
|
18
|
+
l4: 0
|
|
19
|
+
checkpoint_triggers:
|
|
20
|
+
- before_crossing_boundary
|
|
21
|
+
- after_vertical_slice
|
|
22
|
+
- before_final_review
|
|
23
|
+
auto_continue:
|
|
24
|
+
green: true
|
|
25
|
+
yellow: "low_risk_only"
|
|
26
|
+
red: false
|
|
27
|
+
risk_gate:
|
|
28
|
+
green: "continue"
|
|
29
|
+
yellow: "continue_with_local_fix"
|
|
30
|
+
red: "stop_for_human"
|
|
31
|
+
evidence_required: true
|
|
7
32
|
model_policy:
|
|
8
33
|
default_tier: "cheap"
|
|
9
34
|
allowed_tiers:
|
|
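A consistency check over the new `execution_policy` front-matter block might look like the sketch below. It assumes the YAML has already been parsed into a plain `dict` (no YAML library is used here), and `check_execution_policy` is a hypothetical helper, not something the package is known to ship.

```python
ALLOWED_MODES = {"auto", "normal", "bounded_continuous"}


def check_execution_policy(policy):
    """Return a list of problems; an empty list means the block is consistent."""
    problems = []
    # The template value "auto | normal | bounded_continuous" appears to be a
    # placeholder listing the options, so it is reported until one is chosen.
    if policy.get("mode") not in ALLOWED_MODES:
        problems.append("mode must be exactly one of: " + ", ".join(sorted(ALLOWED_MODES)))
    # Red never continues automatically and always stops for a human.
    if policy.get("auto_continue", {}).get("red") is not False:
        problems.append("auto_continue.red must be false")
    if policy.get("risk_gate", {}).get("red") != "stop_for_human":
        problems.append("risk_gate.red must be stop_for_human")
    return problems
```

Returning a list rather than raising lets a caller report every inconsistency in one pass.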
@@ -86,6 +111,41 @@ The task is complete when:
|
|
|
86
111
|
|
|
87
112
|
-
|
|
88
113
|
|
|
114
|
+
## Execution Policy
|
|
115
|
+
|
|
116
|
+
Default to `auto`. The AI decides during pre-execution planning whether to use
|
|
117
|
+
continuous execution; it does not wait for a human keyword. If planning finds
|
|
118
|
+
fewer than 2 L1 tasks, use `normal`; if it finds 2 or more L1 tasks, use
|
|
119
|
+
`bounded_continuous` automatically.
|
|
120
|
+
|
|
121
|
+
`bounded_continuous` means bounded continuous execution:
|
|
122
|
+
|
|
123
|
+
- The AI infers goal, scope, acceptance, permissions, and risk from the human
|
|
124
|
+
goal, project context, and repository facts; the human does not need to
|
|
125
|
+
provide each field upfront.
|
|
126
|
+
- Before execution, list the L1 task checklist; mark each L1 complete with a
|
|
127
|
+
checked and struck-through item.
|
|
128
|
+
- Before executing an L1, plan the naturally derived L2 tasks; if an L2 still
|
|
129
|
+
needs decomposition, plan L3 tasks.
|
|
130
|
+
- Default to at most 3 levels; add L4 dynamically only when leaving it out
|
|
131
|
+
would make L3 too large or unverifiable.
|
|
132
|
+
- The AI assigns Green / Yellow / Red risk to every task node.
|
|
133
|
+
- Only Red stops for human confirmation; Green continues automatically, and
|
|
134
|
+
Yellow continues after local low-risk correction.
|
|
135
|
+
- `progress_unit` defaults to `vertical_slice`: each work loop should produce
|
|
136
|
+
a reviewable increment.
|
|
137
|
+
- `checkpoint_budget` is the maximum checkpoint budget, not a required count;
|
|
138
|
+
do not report just to spend the budget.
|
|
139
|
+
- Emit a checkpoint only when `checkpoint_triggers` fire, risk rises, or final
|
|
140
|
+
review is about to start.
|
|
141
|
+
- Every checkpoint must include evidence: changed files, commands run,
|
|
142
|
+
verification results, or why verification was not possible.
|
|
143
|
+
- After completion, run one final review; only re-check Yellow, Red, failed
|
|
144
|
+
verification, or high-impact modules.
|
|
145
|
+
- Continuous execution does not change model policy; escalate through
|
|
146
|
+
`model_policy` for judgment, architecture, failure review, or acceptance
|
|
147
|
+
disputes.
|
|
148
|
+
|
|
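The checkpoint rules above can be illustrated with a short Python sketch. The `Checkpoint` class, its field names, and `should_emit_checkpoint` are assumptions made for this example, and the evidence list is read here as "at least one of"; the template only states the rules in prose.

```python
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    changed_files: list = field(default_factory=list)
    commands_run: list = field(default_factory=list)
    verification_results: list = field(default_factory=list)
    unverifiable_reason: str = ""

    def has_evidence(self):
        """True if any evidence is present, or a reason for its absence is given."""
        return bool(
            self.changed_files
            or self.commands_run
            or self.verification_results
            or self.unverifiable_reason.strip()
        )


def should_emit_checkpoint(fired_triggers, configured_triggers, risk_rose, final_review_next):
    """Emit only on a configured trigger, a risk increase, or just before final review."""
    trigger_hit = any(t in configured_triggers for t in fired_triggers)
    return trigger_hit or risk_rose or final_review_next
```

Note that `checkpoint_budget` is deliberately absent from the decision: the text treats it as a ceiling, not as a reason to report.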
89
149
|
## Permission
|
|
90
150
|
|
|
91
151
|
Modify files only under the allowlist in the YAML front matter.
|
|
@@ -108,3 +168,8 @@ Stop and write `ai/project/result.json`, `ai/project/result.md`, and `ai/project
|
|
|
108
168
|
- Required refs are missing.
|
|
109
169
|
- Required command cannot be run.
|
|
110
170
|
- Risk level is high without explicit authorization.
|
|
171
|
+
- A Red checkpoint appears during continuous execution.
|
|
172
|
+
- The task would change product direction, core architecture, data structures,
|
|
173
|
+
security boundaries, payment, accounts, or permissions.
|
|
174
|
+
- The task would delete many files, rewrite a core module, or require choosing
|
|
175
|
+
between multiple high-cost options.
|
package/template/en/ai/template/VERSION
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
0.8.
|
|
1
|
+
0.8.16
|
package/template/en/ai/template/prompt.md
CHANGED
|
@@ -36,7 +36,9 @@ Then choose the mode:
|
|
|
36
36
|
`ai/template/reconcile.md` and stop or update according to its two-phase
|
|
37
37
|
workflow; `ai/project/inbox/processed/` is already processed material and
|
|
38
38
|
should not trigger reconciliation, while `ai/project/inbox/ideas/` should
|
|
39
|
-
route to `strategy_update` first.
|
|
39
|
+
route to `strategy_update` first. Even if the human says to reconcile the
|
|
40
|
+
whole inbox, default to only `ai/project/inbox/*.md` and
|
|
41
|
+
`ai/project/inbox/raw/*.md`.
|
|
40
42
|
- If the user says "Start initializing this project", asks to initialize,
|
|
41
43
|
organize, or generate project context, or if `ai/project/project.md` is
|
|
42
44
|
empty, placeholder-only, or incomplete, follow `ai/template/bootstrap.md`
|
|
@@ -58,10 +60,17 @@ Then choose the mode:
|
|
|
58
60
|
In Task Draft Mode:
|
|
59
61
|
|
|
60
62
|
1. Read confirmed `ai/project/project.md` and relevant `ai/project/refs/*.md`.
|
|
61
|
-
2.
|
|
62
|
-
|
|
63
|
-
|
|
64
|
-
|
|
63
|
+
2. Infer goal, scope, acceptance, permissions, verification method, and initial
|
|
64
|
+
risk from the user's current goal, project context, and repository facts; do
|
|
65
|
+
not require the human to provide each field upfront.
|
|
66
|
+
3. Draft `ai/project/task.md` and set `execution_policy.mode` to `auto`.
|
|
67
|
+
4. Before execution, list the L1 checklist and mark each L1 Green / Yellow /
|
|
68
|
+
Red. Use `normal` if there are fewer than 2 L1 tasks; automatically use
|
|
69
|
+
`bounded_continuous` if there are 2 or more L1 tasks.
|
|
70
|
+
5. Stop for human confirmation only when a Red preflight item appears. If the
|
|
71
|
+
human asked to execute or continue, and preflight contains only Green /
|
|
72
|
+
Yellow, proceed directly to Execution Mode.
|
|
73
|
+
6. Do not modify source or business files in Task Draft Mode.
|
|
65
74
|
|
|
66
75
|
End Task Draft Mode with:
|
|
67
76
|
|
|
@@ -112,7 +121,13 @@ In Execution Mode, read:
|
|
|
112
121
|
2. `ai/project/runtime.md`
|
|
113
122
|
3. `ai/project/task.md`
|
|
114
123
|
|
|
115
|
-
Then
|
|
124
|
+
Then perform pre-execution planning: list the L1 checklist, mark each L1 Green
|
|
125
|
+
/ Yellow / Red, and automatically choose `normal` or `bounded_continuous` from
|
|
126
|
+
the L1 count. Plan L2 before executing an L1, and plan L3 as needed before
|
|
127
|
+
executing an L2; default to at most 3 levels, with L4 allowed when needed. When
|
|
128
|
+
an L1 is complete, check it off and strike it through. Only Red stops for human
|
|
129
|
+
confirmation; Green continues automatically, and Yellow continues after local
|
|
130
|
+
low-risk correction. Write results to:
|
|
116
131
|
|
|
117
132
|
- `ai/project/result.json`
|
|
118
133
|
- `ai/project/result.md`
|
package/template/en/ai/template/protocol.md
CHANGED
|
@@ -47,6 +47,74 @@ Project Bootstrap / Context Reconcile / Strategy Update -> Project Confirm -> Ta
|
|
|
47
47
|
9. Execute only within the project task boundary.
|
|
48
48
|
10. Write `ai/project/result.json`, `ai/project/result.md`, and `ai/project/metrics.json`.
|
|
49
49
|
|
|
50
|
+
## Execution Authorization Modes
|
|
51
|
+
|
|
52
|
+
The default execution policy is `auto`: before each execution, the AI first
|
|
53
|
+
decomposes the task and judges risk, then chooses `normal` or
|
|
54
|
+
`bounded_continuous`. Continuous execution does not depend on a human keyword.
|
|
55
|
+
|
|
56
|
+
Pre-execution planning must:
|
|
57
|
+
|
|
58
|
+
- Infer goal, scope, acceptance, permissions, and verification method from the
|
|
59
|
+
human goal, project context, and repository facts.
|
|
60
|
+
- List the L1 task checklist and assign Green / Yellow / Red risk to each L1.
|
|
61
|
+
- Use `normal` if there are fewer than 2 L1 tasks.
|
|
62
|
+
- Automatically use `bounded_continuous` if there are 2 or more L1 tasks.
|
|
63
|
+
- Stop for human confirmation first if any L1 is Red; Green and Yellow do not
|
|
64
|
+
block startup.
|
|
65
|
+
|
|
66
|
+
Bounded continuous execution rules:
|
|
67
|
+
|
|
68
|
+
- Execute the task tree in L1 -> L2 -> L3 order. Before executing an L1, plan
|
|
69
|
+
its naturally derived L2 tasks; before executing an L2, plan L3 tasks if it
|
|
70
|
+
still needs decomposition.
|
|
71
|
+
- Default to at most 3 levels. Add L4 dynamically only when L3 would otherwise
|
|
72
|
+
be too large, unverifiable, or hard to revert.
|
|
73
|
+
- Every L1/L2/L3/L4 node must have risk, expected edit scope, acceptance method,
|
|
74
|
+
and evidence requirements.
|
|
75
|
+
- Show the L1 checklist as task items; when an L1 is complete, check it off and
|
|
76
|
+
strike it through.
|
|
77
|
+
- Default to `vertical_slice` progress: each loop should produce a runnable,
|
|
78
|
+
reviewable, or reversible increment.
|
|
79
|
+
- The AI infers goal, scope, acceptance, and permissions, but must not cross
|
|
80
|
+
project rules, explicit human limits, `permission.modify.denied`, security
|
|
81
|
+
boundaries, or destructive-action limits.
|
|
82
|
+
- `Green` may continue automatically.
|
|
83
|
+
- `Yellow` may continue after local low-risk correction.
|
|
84
|
+
- `Red` must stop for human confirmation.
|
|
85
|
+
- If permission must expand, an unallowed command must run, network access is
|
|
86
|
+
needed, a destructive action is needed, or product direction / core
|
|
87
|
+
architecture would change, the current node must be Red.
|
|
88
|
+
- After all work is complete, run one final review; re-check only Yellow, Red,
|
|
89
|
+
failed verification, or high-impact modules.
|
|
90
|
+
- Every checkpoint must include evidence; a purely subjective Green is not valid.
|
|
91
|
+
- Continuous execution does not change model policy; still escalate through
|
|
92
|
+
`model_policy` for planning, architecture, failure review, or acceptance disputes.
|
|
93
|
+
|
|
94
|
+
Must stop when:
|
|
95
|
+
|
|
96
|
+
- The task would change product direction, core architecture, data structures,
|
|
97
|
+
security boundaries, payment, accounts, or permissions.
|
|
98
|
+
- The task would delete many files or rewrite a core module.
|
|
99
|
+
- The task outline, acceptance, or permission contains a material conflict.
|
|
100
|
+
- The current implementation affects multiple later modules and the task
|
|
101
|
+
contract does not cover that impact.
|
|
102
|
+
- Tests fail and cannot be fixed locally.
|
|
103
|
+
- There are two or more high-cost options that need human judgment.
|
|
104
|
+
|
|
105
|
+
Use this compact checkpoint format:
|
|
106
|
+
|
|
107
|
+
```text
|
|
108
|
+
## Checkpoint
|
|
109
|
+
### Task Tree
|
|
110
|
+
### Progress
|
|
111
|
+
### Completed
|
|
112
|
+
### Evidence
|
|
113
|
+
### Drift Risk: Green / Yellow / Red
|
|
114
|
+
### Recommended Next Step
|
|
115
|
+
### Auto-Continue Decision
|
|
116
|
+
```
|
|
117
|
+
|
|
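The risk rules in this section can be summarized as a small gate. The sketch below is illustrative only: the `NodeFacts` fields and both function names are hypothetical, and the `needs_local_fix` flag is an assumed stand-in for whatever makes a node Yellow, which the protocol does not define precisely. The returned action strings reuse the `risk_gate` values from the `task.md` front matter in this release.

```python
from dataclasses import dataclass


@dataclass
class NodeFacts:
    expands_permission: bool = False
    runs_unallowed_command: bool = False
    needs_network: bool = False
    destructive_action: bool = False
    changes_direction_or_core_architecture: bool = False
    needs_local_fix: bool = False  # assumption: marks a Yellow node


def classify(node):
    """Any forced-Red condition wins; otherwise Yellow if a local fix is pending."""
    forced_red = (
        node.expands_permission
        or node.runs_unallowed_command
        or node.needs_network
        or node.destructive_action
        or node.changes_direction_or_core_architecture
    )
    if forced_red:
        return "Red"
    return "Yellow" if node.needs_local_fix else "Green"


def gate_action(risk):
    """Map a risk rating to the action taken at the gate."""
    return {
        "Green": "continue",
        "Yellow": "continue_with_local_fix",
        "Red": "stop_for_human",
    }[risk]
```

Checking the forced-Red conditions first means a node can never be downgraded to Yellow just because its fix looks local.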
50
118
|
## Bootstrap Mode
|
|
51
119
|
|
|
52
120
|
Bootstrap Mode prepares stable project understanding:
|
package/template/en/ai/template/reconcile.md
CHANGED
|
@@ -22,6 +22,9 @@ New material should usually live in:
|
|
|
22
22
|
absorbed. After reconciliation is confirmed, move processed material to
|
|
23
23
|
`ai/project/inbox/processed/` for traceability and to avoid repeated
|
|
24
24
|
reconciliation.
|
|
25
|
+
Even when the human says to reconcile the whole inbox, default to only
|
|
26
|
+
`ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md`; do not recursively
|
|
27
|
+
read `processed/**` or `ideas/**`.
|
|
25
28
|
|
|
26
29
|
## First Read
|
|
27
30
|
|
|
@@ -30,11 +33,12 @@ reconciliation.
|
|
|
30
33
|
3. `ai/project/project.md`
|
|
31
34
|
4. `ai/project/runtime.md`
|
|
32
35
|
5. `ai/project/refs/*.md`
|
|
33
|
-
6. The new material named by the human; if none is named, read
|
|
36
|
+
6. The new material named by the human; if none is named, read only
|
|
37
|
+
`ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md`
|
|
34
38
|
|
|
35
|
-
Do not read `ai/project/inbox/processed/**`, `ai/project/
|
|
36
|
-
tests, config, or dependency files by default
|
|
37
|
-
you to use them for fact checking.
|
|
39
|
+
Do not read `ai/project/inbox/processed/**`, `ai/project/inbox/ideas/**`,
|
|
40
|
+
`ai/project/archive/**`, source, tests, config, or dependency files by default
|
|
41
|
+
unless the human explicitly asks you to use them for fact checking.
|
|
38
42
|
|
|
39
43
|
## Reconciliation Principles
|
|
40
44
|
|
package/template/en/ai/template/rules/core.md
CHANGED
|
@@ -8,6 +8,7 @@ Before editing code, check that `ai/project/task.md` clearly defines:
|
|
|
8
8
|
- Scope
|
|
9
9
|
- Acceptance
|
|
10
10
|
- Permission
|
|
11
|
+
- Execution policy
|
|
11
12
|
|
|
12
13
|
If readiness fails, do not edit code. Write blocked results to:
|
|
13
14
|
|
|
@@ -116,6 +117,45 @@ must not directly modify:
|
|
|
116
117
|
Do not modify current task, results, metrics, archives, source, tests, config,
|
|
117
118
|
or dependency files unless the human explicitly authorizes it.
|
|
118
119
|
|
|
120
|
+
## Bounded Continuous Execution Gate
|
|
121
|
+
|
|
122
|
+
Before every execution, the AI must decompose the task and judge risk instead
|
|
123
|
+
of waiting for the human to explicitly say "enable continuous execution".
|
|
124
|
+
|
|
125
|
+
Before execution:
|
|
126
|
+
|
|
127
|
+
- Infer goal, scope, acceptance, permissions, and verification method from the
|
|
128
|
+
human goal, project context, and repository facts.
|
|
129
|
+
- List the L1 task checklist and mark each L1 Green / Yellow / Red.
|
|
130
|
+
- Use `normal` when there are fewer than 2 L1 tasks; automatically use
|
|
131
|
+
`bounded_continuous` when there are 2 or more L1 tasks.
|
|
132
|
+
- Stop for human confirmation if any L1 is Red; Green and Yellow may continue.
|
|
133
|
+
|
|
134
|
+
When enabled:
|
|
135
|
+
|
|
136
|
+
- Execute in L1 -> L2 -> L3 order; plan L2 before executing an L1, and plan L3
|
|
137
|
+
as needed before executing an L2.
|
|
138
|
+
- Default to at most 3 levels; add L4 dynamically only when L3 is still too
|
|
139
|
+
large, unverifiable, or hard to revert.
|
|
140
|
+
- Show the L1 checklist as task items; when an L1 is complete, check it off and
|
|
141
|
+
strike it through.
|
|
142
|
+
- Every task node must have risk, expected edit scope, acceptance method, and
|
|
143
|
+
evidence requirements.
|
|
144
|
+
- The checkpoint budget is a maximum, not a required count.
|
|
145
|
+
- Every checkpoint must include evidence.
|
|
146
|
+
- `Green` may continue automatically.
|
|
147
|
+
- `Yellow` continues after local low-risk correction.
|
|
148
|
+
- `Red` must stop for human confirmation.
|
|
149
|
+
- Any product direction, core architecture, data structure, security, payment,
|
|
150
|
+
account, permission, large deletion, core rewrite, or high-cost option choice
|
|
151
|
+
must stop.
|
|
152
|
+
- Any need to expand scope, permission, commands, network access, or acceptance
|
|
153
|
+
must stop.
|
|
154
|
+
|
|
155
|
+
The AI infers goal, scope, acceptance, and permissions, but must not cross
|
|
156
|
+
project rules, explicit human limits, `permission.modify.denied`, security
|
|
157
|
+
boundaries, or destructive-action limits.
|
|
158
|
+
|
|
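One way to render the L1 checklist described above is with Markdown task items, using `[x]` plus `~~...~~` for completed entries. This is a sketch only: the rule does not prescribe an exact syntax, and `render_l1_checklist` and its tuple layout are hypothetical.

```python
def render_l1_checklist(tasks):
    """Render (title, risk, done) tuples as Markdown task-list items."""
    lines = []
    for title, risk, done in tasks:
        label = f"{title} ({risk})"
        if done:
            # Completed L1: checked off and struck through.
            lines.append(f"- [x] ~~{label}~~")
        else:
            lines.append(f"- [ ] {label}")
    return "\n".join(lines)
```

For example, `render_l1_checklist([("Add parser", "Green", True), ("Wire CLI", "Yellow", False)])` yields a checked, struck-through first item followed by an open second item.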
119
159
|
## Strategy Update Gate
|
|
120
160
|
|
|
121
161
|
If the user asks to update the North Star, final shape, product constitution,
|
package/template/zh/ai/README.md
CHANGED
package/template/zh/ai/project/task.md
CHANGED
|
@@ -4,6 +4,31 @@ type: "bugfix | feature | refactor | docs | config | test | research | strategy_
|
|
|
4
4
|
priority: "P0 | P1 | P2 | P3"
|
|
5
5
|
risk_level: "low | medium | high"
|
|
6
6
|
depends_on_previous_result: false
|
|
7
|
+
execution_policy:
|
|
8
|
+
mode: "auto | normal | bounded_continuous"
|
|
9
|
+
activation_rule: "auto_enable_when_l1_count_gte_2"
|
|
10
|
+
max_depth: 3
|
|
11
|
+
allow_depth_4_when_needed: true
|
|
12
|
+
progress_unit: "vertical_slice"
|
|
13
|
+
task_tree: []
|
|
14
|
+
checkpoint_budget:
|
|
15
|
+
l1: 0
|
|
16
|
+
l2: 0
|
|
17
|
+
l3: 0
|
|
18
|
+
l4: 0
|
|
19
|
+
checkpoint_triggers:
|
|
20
|
+
- before_crossing_boundary
|
|
21
|
+
- after_vertical_slice
|
|
22
|
+
- before_final_review
|
|
23
|
+
auto_continue:
|
|
24
|
+
green: true
|
|
25
|
+
yellow: "low_risk_only"
|
|
26
|
+
red: false
|
|
27
|
+
risk_gate:
|
|
28
|
+
green: "continue"
|
|
29
|
+
yellow: "continue_with_local_fix"
|
|
30
|
+
red: "stop_for_human"
|
|
31
|
+
evidence_required: true
|
|
7
32
|
model_policy:
|
|
8
33
|
default_tier: "cheap"
|
|
9
34
|
allowed_tiers:
|
|
@@ -81,6 +106,28 @@ permission:
|
|
|
81
106
|
|
|
82
107
|
-
|
|
83
108
|
|
|
109
|
+
## 执行策略
|
|
110
|
+
|
|
111
|
+
默认使用 `auto`,由 AI 在执行前规划时判定是否启用连续执行,而不是等待用户口令。
|
|
112
|
+
如果执行前拆出的 L1 任务少于 2 个,使用 `normal`;如果 L1 任务为 2 个或更多,
|
|
113
|
+
自动使用 `bounded_continuous`。
|
|
114
|
+
|
|
115
|
+
`bounded_continuous` 表示边界内连续执行:
|
|
116
|
+
|
|
117
|
+
- 目标、范围、验收、权限和风险评级由 AI 基于用户目标、项目上下文和仓库事实推断;
|
|
118
|
+
不要求用户预先逐项提供。
|
|
119
|
+
- 执行前必须列出 L1 任务清单;每个 L1 用待办列表表示,完成后打勾并划掉。
|
|
120
|
+
- 执行某个 L1 前,AI 先规划自然衍生出的 L2;如果 L2 仍需拆分,再规划 L3。
|
|
121
|
+
- 默认最多 3 层;只有当不拆 L4 会导致 L3 过大或不可验证时,才允许动态增加 L4。
|
|
122
|
+
- 每个任务节点都由 AI 自己生成 Green / Yellow / Red 风险评级。
|
|
123
|
+
- 只有 Red 停下来让人类确认;Green 自动继续,Yellow 先做局部低风险修正后继续。
|
|
124
|
+
- `progress_unit` 默认是 `vertical_slice`:每轮推进都应该产生可检查的工作增量。
|
|
125
|
+
- `checkpoint_budget` 是最多可用检查点预算,不是必须用完的次数;不要为了消耗预算而汇报。
|
|
126
|
+
- 只有在触发 `checkpoint_triggers`、风险升高或准备收尾时才输出 Checkpoint。
|
|
127
|
+
- 每个 Checkpoint 必须包含证据:已改文件、已运行命令、验证结果或无法验证的原因。
|
|
128
|
+
- 完成后只做一次总复盘;只对 Yellow、Red、失败验证或高影响模块做二次抽检。
|
|
129
|
+
- 连续执行不改变模型策略;涉及判断、架构、失败复盘或验收争议时仍按 `model_policy` 升级。
|
|
130
|
+
|
|
84
131
|
## 权限
|
|
85
132
|
|
|
86
133
|
只修改 YAML front matter 允许列表中的文件。
|
|
@@ -104,3 +151,6 @@ permission:
|
|
|
104
151
|
- 必需引用缺失。
|
|
105
152
|
- 必需命令无法运行。
|
|
106
153
|
- 风险等级高但没有明确授权。
|
|
154
|
+
- 连续执行中出现 Red 检查点。
|
|
155
|
+
- 需要改变产品方向、核心架构、数据结构、安全边界、支付、账号或权限。
|
|
156
|
+
- 需要删除大量文件、重写核心模块,或在多个高成本方案之间取舍。
|
package/template/zh/ai/template/VERSION
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
0.8.
|
|
1
|
+
0.8.16
|
package/template/zh/ai/template/prompt.md
CHANGED
|
@@ -29,7 +29,9 @@
|
|
|
29
29
|
更新上下文/处理新资料,提到 `reconcile` 或 `ai/project/inbox/`,
|
|
30
30
|
或 `ai/project/inbox/` 里存在 `.gitkeep` 之外的待吸收资料,执行 `ai/template/reconcile.md`,
|
|
31
31
|
并按它的两阶段流程停止或更新;但 `ai/project/inbox/processed/` 是已处理资料,
|
|
32
|
-
不应触发整合,`ai/project/inbox/ideas/` 应优先走 `strategy_update
|
|
32
|
+
不应触发整合,`ai/project/inbox/ideas/` 应优先走 `strategy_update`。即使用户说
|
|
33
|
+
“整合整个 inbox”,默认也只处理 `ai/project/inbox/*.md` 和
|
|
34
|
+
`ai/project/inbox/raw/*.md`。
|
|
33
35
|
- 如果用户说“开始初始化这个项目”、要求初始化/整理/生成项目上下文,
|
|
34
36
|
或 `ai/project/project.md` 为空、只有占位内容、
|
|
35
37
|
或不完整,执行 `ai/template/bootstrap.md`,并在项目上下文确认后停止。
|
|
@@ -46,9 +48,14 @@
|
|
|
46
48
|
在任务草稿模式中:
|
|
47
49
|
|
|
48
50
|
1. 读取已确认的 `ai/project/project.md` 和相关 `ai/project/refs/*.md`。
|
|
49
|
-
2.
|
|
50
|
-
|
|
51
|
-
|
|
51
|
+
2. 根据用户当前目标、项目上下文和仓库事实,推断目标、范围、验收、权限、
|
|
52
|
+
验证方式和初始风险;不要要求用户逐项提供。
|
|
53
|
+
3. 起草 `ai/project/task.md`,并将 `execution_policy.mode` 设为 `auto`。
|
|
54
|
+
4. 执行前列出 L1 任务清单并标注 Green / Yellow / Red。L1 少于 2 个时使用
|
|
55
|
+
`normal`;L1 为 2 个或更多时自动使用 `bounded_continuous`。
|
|
56
|
+
5. 只有出现 Red 预检项时才停止等待人类确认。若用户要求的是执行或继续,且预检
|
|
57
|
+
只有 Green / Yellow,可以直接进入执行模式。
|
|
58
|
+
6. 不要在任务草稿模式中修改源码或业务文件。
|
|
52
59
|
|
|
53
60
|
任务草稿模式必须以下面结构结束:
|
|
54
61
|
|
|
@@ -97,7 +104,11 @@
|
|
|
97
104
|
2. `ai/project/runtime.md`
|
|
98
105
|
3. `ai/project/task.md`
|
|
99
106
|
|
|
100
|
-
|
|
107
|
+
然后先做执行前规划:列出 L1 清单,给每个 L1 标注 Green / Yellow / Red,
|
|
108
|
+
并根据 L1 数量自动选择 `normal` 或 `bounded_continuous`。执行 L1 前规划 L2,
|
|
109
|
+
执行 L2 前按需规划 L3;默认最多 3 层,必要时允许 L4。每完成一个 L1,
|
|
110
|
+
在清单中打勾并划掉。只有 Red 停止等待人类确认;Green 自动继续,Yellow 做局部
|
|
111
|
+
低风险修正后继续。最后把结果写入:
|
|
101
112
|
|
|
102
113
|
- `ai/project/result.json`
|
|
103
114
|
- `ai/project/result.md`
|
package/template/zh/ai/template/protocol.md
CHANGED
|
@@ -41,6 +41,60 @@ ai/project/task.md = 当前执行契约
 9. 只在项目任务边界内执行。
 10. 写入 `ai/project/result.json`、`ai/project/result.md` 和 `ai/project/metrics.json`。

+## 执行授权模式
+
+默认执行策略是 `auto`:AI 在每次执行前先做任务分解和风险判断,再决定使用
+`normal` 还是 `bounded_continuous`。启用连续执行不依赖用户说出特定口令。
+
+执行前规划必须:
+
+- 根据用户目标、项目上下文和仓库事实,推断目标、范围、验收、权限和验证方式。
+- 列出 L1 任务清单,并为每个 L1 生成 Green / Yellow / Red 风险评级。
+- 如果 L1 少于 2 个,使用 `normal`。
+- 如果 L1 为 2 个或更多,自动启用 `bounded_continuous`。
+- 如果任一 L1 为 Red,先停止并让人类确认;Green 和 Yellow 不阻塞启动。
+
+边界内连续执行规则:
+
+- 任务树按 L1 -> L2 -> L3 执行。执行某个 L1 前,先规划它自然衍生出的 L2;
+执行某个 L2 前,如果仍需拆分,再规划 L3。
+- 默认最多 3 层。只有当 L3 仍过大、不可验证或不可回退时,才动态增加 L4。
+- L1/L2/L3/L4 都必须有风险评级、预期改动范围、验收方式和证据要求。
+- L1 清单必须用待办列表展示;每完成一个 L1,就打勾并划掉。
+- 默认按 `vertical_slice` 推进:每轮都产出可运行、可检查或可回退的增量。
+- 目标、范围、验收和权限由 AI 推断,但不能越过项目规则、显式用户限制、
+`permission.modify.denied`、安全边界或破坏性操作限制。
+- `Green` 可以自动继续。
+- `Yellow` 可以在局部低风险修正后继续。
+- `Red` 必须停止等待人类确认。
+- 如果需要扩大权限、运行未允许命令、访问网络、执行破坏性操作、改变产品方向或核心架构,
+当前节点必须标为 Red。
+- 全部完成后只做一次总复盘;只对 Yellow、Red、验证失败或高影响模块做二次抽检。
+- 每个 Checkpoint 必须给出证据,不接受只有主观判断的 Green。
+- 连续执行不改变模型策略;遇到规划、架构、失败复盘或验收争议,仍按 `model_policy` 升级。
+
+必须停止的情况:
+
+- 需要改变产品方向、核心架构、数据结构、安全边界、支付、账号或权限。
+- 需要删除大量文件或重写核心模块。
+- 发现任务大纲、验收或权限之间存在实质冲突。
+- 当前实现会影响多个后续模块,且任务契约没有覆盖该影响。
+- 测试失败且无法局部修复。
+- 出现两个以上高成本方案,需要人类裁决。
+
+检查点使用紧凑格式:
+
+```text
+## Checkpoint
+### 任务树
+### 当前完成度
+### 已完成
+### 证据
+### 偏离风险:Green / Yellow / Red
+### 下一步建议
+### 是否自动继续
+```
+
 ## 引导模式

 引导模式准备稳定的项目理解:
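The protocol text added in this hunk ties auto-continuation to two things: the checkpoint's risk rating and the presence of evidence (a Green backed only by subjective judgment is not accepted). A minimal sketch of that gate follows. The function name, field names, and the reading that a Yellow checkpoint waits until its local low-risk fix has been applied are all assumptions for illustration, not the package's schema:

```javascript
// Hypothetical checkpoint gate; field names are illustrative only.
// checkpoint: { risk: "Green" | "Yellow" | "Red", evidence: string[], localFixApplied?: boolean }
function evaluateCheckpoint({ risk, evidence = [], localFixApplied = false }) {
  if (risk === "Red") {
    return { autoContinue: false, reason: "Red requires human confirmation" };
  }
  if (evidence.length === 0) {
    // A rating without evidence is not accepted, even if it says Green.
    return { autoContinue: false, reason: "checkpoint has no evidence" };
  }
  if (risk === "Yellow") {
    return localFixApplied
      ? { autoContinue: true, reason: "Yellow after local low-risk fix" }
      : { autoContinue: false, reason: "Yellow needs a local low-risk fix first" };
  }
  if (risk === "Green") {
    return { autoContinue: true, reason: "Green with evidence" };
  }
  throw new Error(`unknown risk rating: ${risk}`);
}

console.log(evaluateCheckpoint({ risk: "Green", evidence: ["selftest passed"] }));
```

Checking evidence before the Green branch is the design point: the rule rejects unsupported ratings rather than trusting the label.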
@@ -20,6 +20,8 @@

 `ai/project/inbox/` 是待吸收资料区。资料被整合确认后,统一移动到
 `ai/project/inbox/processed/`,用于追溯并避免后续重复整合。
+即使用户说“整合整个 inbox”,默认也只处理 `ai/project/inbox/*.md`
+和 `ai/project/inbox/raw/*.md`;不要递归读取 `processed/**` 或 `ideas/**`。

 ## 先读

@@ -28,10 +30,11 @@
 3. `ai/project/project.md`
 4. `ai/project/runtime.md`
 5. `ai/project/refs/*.md`
-6.
+6. 人类指定的新资料;未指定时,只读取 `ai/project/inbox/*.md`
+和 `ai/project/inbox/raw/*.md`

-不要默认读取 `ai/project/inbox/processed/**`、`ai/project/
-
+不要默认读取 `ai/project/inbox/processed/**`、`ai/project/inbox/ideas/**`、
+`ai/project/archive/**`、源码、测试、配置或依赖文件,除非人类明确要求用它们核对事实。

 ## 整合原则

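The reconcile changes narrow the default inbox scope to two non-recursive globs, `ai/project/inbox/*.md` and `ai/project/inbox/raw/*.md`, and exclude `processed/**` and `ideas/**`. A small sketch of that selection over a list of repository-relative paths; the function and constant names are hypothetical, and the regular expressions are one plausible reading of the globs:

```javascript
// Hypothetical filter; paths are assumed to be relative to the repository
// root and to use forward slashes.
const DEFAULT_INBOX_PATTERNS = [
  /^ai\/project\/inbox\/[^/]+\.md$/, // ai/project/inbox/*.md
  /^ai\/project\/inbox\/raw\/[^/]+\.md$/, // ai/project/inbox/raw/*.md
];

function selectDefaultInboxFiles(paths) {
  // [^/]+ keeps the match non-recursive, so processed/** and ideas/**
  // never qualify.
  return paths.filter((p) => DEFAULT_INBOX_PATTERNS.some((re) => re.test(p)));
}

console.log(
  selectDefaultInboxFiles([
    "ai/project/inbox/notes.md",
    "ai/project/inbox/raw/interview.md",
    "ai/project/inbox/processed/raw/file.md",
    "ai/project/inbox/ideas/new-direction.md",
  ])
);
```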
@@ -8,6 +8,7 @@
 - 范围
 - 验收
 - 权限
+- 执行策略

 如果未就绪,不要编辑代码。将阻塞结果写入:

@@ -100,6 +101,35 @@

 除非人类明确授权,不要修改当前任务、结果、指标、归档、源码、测试、配置或依赖文件。

+## 边界内连续执行门
+
+每次执行前,AI 必须先做任务分解和风险判断,而不是等待用户显式说“启用连续执行”。
+
+执行前必须:
+
+- 根据用户目标、项目上下文和仓库事实,推断目标、范围、验收、权限和验证方式。
+- 列出 L1 任务清单,并给每个 L1 标注 Green / Yellow / Red。
+- L1 少于 2 个时使用 `normal`;L1 为 2 个或更多时自动使用 `bounded_continuous`。
+- 任一 L1 为 Red 时,停止等待人类确认;Green 和 Yellow 可继续。
+
+启用后:
+
+- 按 L1 -> L2 -> L3 执行;执行某个 L1 前规划 L2,执行某个 L2 前按需规划 L3。
+- 默认最多 3 层;只有当 L3 仍过大、不可验证或不可回退时,才动态增加 L4。
+- L1 清单必须用待办列表展示;每完成一个 L1,就打勾并划掉。
+- 每个任务节点必须有风险评级、预期改动范围、验收方式和证据要求。
+- 检查点预算是上限,不是必须用完的次数。
+- 每个 Checkpoint 必须包含证据。
+- `Green` 可自动继续。
+- `Yellow` 做局部低风险修正后继续。
+- `Red` 必须停止等待人类确认。
+- 任何方向、核心架构、数据结构、安全、支付、账号、权限、大量删除、
+核心重写或高成本方案取舍,都必须停止。
+- 需要扩大范围、权限、命令、网络或验收时,必须停止。
+
+目标、范围、验收和权限由 AI 推断,但不能越过项目规则、显式用户限制、
+`permission.modify.denied`、安全边界或破坏性操作限制。
+
 ## 策略修订门

 如果用户要求更新项目北极星、最终形态、产品宪法、模块地图、路线图或项目方向,
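The gate added to the core rules lists conditions that force a stop regardless of the rating a node was given: boundary expansion (scope, permission, commands, network, acceptance) and high-impact changes (direction, core architecture, data structures, security, payment, accounts, mass deletion, core rewrites, high-cost trade-offs). A rough sketch of how such a gate could override a node's own rating; the trigger identifiers and the `needs` field are invented for illustration and do not appear in the template:

```javascript
// Hypothetical trigger names paraphrasing the stop conditions in the rule text.
const STOP_TRIGGERS = new Set([
  "expand_scope", "expand_permission", "unlisted_command", "network_access",
  "change_acceptance", "product_direction", "core_architecture",
  "data_structure", "security", "payment", "account", "mass_deletion",
  "core_rewrite", "high_cost_tradeoff",
]);

// node: { title: string, risk: "Green" | "Yellow" | "Red", needs?: string[] }
function gateNode(node) {
  const hits = (node.needs || []).filter((need) => STOP_TRIGGERS.has(need));
  if (hits.length > 0) {
    // Any stop trigger marks the node Red, whatever it was rated before.
    return { risk: "Red", stop: true, triggers: hits };
  }
  return { risk: node.risk, stop: node.risk === "Red", triggers: [] };
}

console.log(gateNode({ title: "Fetch remote schema", risk: "Green", needs: ["network_access"] }));
```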
package/test/selftest.js
CHANGED
|
@@ -77,6 +77,17 @@ function testInitUpdateDoctor() {
 assert(read(cwd, "ai/template/bootstrap.md").includes("未吸收资料"), "bootstrap handoff should audit unabsorbed material");
 assert(read(cwd, "ai/template/bootstrap.md").includes("冲突处理"), "bootstrap handoff should audit conflict handling");
 assert(read(cwd, "ai/template/prompt.md").includes("任务草稿交接"), "execution prompt should include task handoff");
+assert(read(cwd, "ai/template/prompt.md").includes("默认也只处理 `ai/project/inbox/*.md`"), "execution prompt should narrow inbox reconciliation");
+assert(read(cwd, "ai/template/protocol.md").includes("边界内连续执行"), "protocol should include bounded continuous execution");
+assert(read(cwd, "ai/template/protocol.md").includes("`vertical_slice`"), "protocol should require vertical-slice progress for continuous execution");
+assert(read(cwd, "ai/template/protocol.md").includes("L1 为 2 个或更多,自动启用"), "protocol should auto-enable continuous execution from L1 count");
+assert(read(cwd, "ai/template/protocol.md").includes("每个 Checkpoint 必须给出证据"), "protocol should require evidence-backed checkpoints");
+assert(read(cwd, "ai/template/rules/core.md").includes("边界内连续执行门"), "core rules should include bounded continuous execution gate");
+assert(read(cwd, "ai/template/rules/core.md").includes("需要扩大范围、权限、命令、网络或验收时"), "core rules should stop continuous execution before boundary expansion");
+assert(read(cwd, "ai/project/task.md").includes("execution_policy:"), "task template should include execution policy");
+assert(read(cwd, "ai/project/task.md").includes("activation_rule: \"auto_enable_when_l1_count_gte_2\""), "task template should define automatic activation rule");
+assert(read(cwd, "ai/project/task.md").includes("risk_gate:"), "task template should define risk gate");
+assert(read(cwd, "ai/project/task.md").includes("progress_unit: \"vertical_slice\""), "task template should define continuous progress unit");
 assert(read(cwd, "ai/template/prompt.md").includes("开始初始化这个项目"), "execution prompt should route natural bootstrap entry");
 assert(read(cwd, "ai/template/prompt.md").includes("开始初始化这个项目,并吸收 ai/project/inbox/ 里的资料"), "execution prompt should route bootstrap with inbox material");
 assert(read(cwd, "ai/template/prompt.md").includes("不要重新 bootstrap"), "execution prompt should reconcile inbox material when project context already exists");
@@ -85,6 +96,7 @@ function testInitUpdateDoctor() {
 assert(read(cwd, "ai/template/prompt.md").includes("strategy_update"), "execution prompt should route strategy updates");
 assert(read(cwd, "ai/template/reconcile.md").includes("上下文整合"), "init should install reconcile prompt");
 assert(read(cwd, "ai/template/reconcile.md").includes("整合计划"), "reconcile prompt should require a plan first");
+assert(read(cwd, "ai/template/reconcile.md").includes("不要递归读取 `processed/**` 或 `ideas/**`"), "reconcile prompt should exclude processed and ideas recursively");
 assert(read(cwd, "ai/template/reconcile.md").includes("ai/project/inbox/processed/raw/file.md"), "reconcile prompt should archive absorbed raw inbox material");
 assert(read(cwd, "ai/template/reconcile.md").includes("未吸收资料"), "reconcile handoff should audit unabsorbed material");
 assert(read(cwd, "ai/template/reconcile.md").includes("冲突处理"), "reconcile handoff should audit conflict handling");
@@ -142,6 +154,17 @@ function testEnglishInitUpdateDoctor() {
 assert(read(cwd, "ai/template/bootstrap.md").includes("Unabsorbed material"), "English bootstrap handoff should audit unabsorbed material");
 assert(read(cwd, "ai/template/bootstrap.md").includes("Conflict handling"), "English bootstrap handoff should audit conflict handling");
 assert(read(cwd, "ai/template/prompt.md").includes("Start initializing this project"), "English execution prompt should route natural bootstrap entry");
+assert(read(cwd, "ai/template/prompt.md").includes("default to only `ai/project/inbox/*.md`"), "English execution prompt should narrow inbox reconciliation");
+assert(read(cwd, "ai/template/protocol.md").includes("`bounded_continuous`"), "English protocol should include bounded continuous execution");
+assert(read(cwd, "ai/template/protocol.md").includes("`vertical_slice`"), "English protocol should require vertical-slice progress for continuous execution");
+assert(read(cwd, "ai/template/protocol.md").includes("Automatically use `bounded_continuous`"), "English protocol should auto-enable continuous execution from L1 count");
+assert(read(cwd, "ai/template/protocol.md").includes("Every checkpoint must include evidence"), "English protocol should require evidence-backed checkpoints");
+assert(read(cwd, "ai/template/rules/core.md").includes("Bounded Continuous Execution Gate"), "English core rules should include bounded continuous execution gate");
+assert(read(cwd, "ai/template/rules/core.md").includes("expand scope, permission, commands, network access, or acceptance"), "English core rules should stop continuous execution before boundary expansion");
+assert(read(cwd, "ai/project/task.md").includes("execution_policy:"), "English task template should include execution policy");
+assert(read(cwd, "ai/project/task.md").includes("activation_rule: \"auto_enable_when_l1_count_gte_2\""), "English task template should define automatic activation rule");
+assert(read(cwd, "ai/project/task.md").includes("risk_gate:"), "English task template should define risk gate");
+assert(read(cwd, "ai/project/task.md").includes("progress_unit: \"vertical_slice\""), "English task template should define continuous progress unit");
 assert(read(cwd, "ai/template/prompt.md").includes("Start initializing this project and absorb the material in ai/project/inbox/"), "English execution prompt should route bootstrap with inbox material");
 assert(read(cwd, "ai/template/prompt.md").includes("instead of bootstrapping again"), "English execution prompt should reconcile inbox material when project context already exists");
 assert(read(cwd, "ai/template/prompt.md").includes("Reconcile the new material in ai/project/inbox/"), "English execution prompt should route natural reconcile entry");
@@ -155,6 +178,7 @@ function testEnglishInitUpdateDoctor() {
 assert(read(cwd, "ai/project/proposals/final-shape-updates/_template.md").includes("`accepted`"), "English proposal template should describe accepted status");
 assert(read(cwd, "ai/template/reconcile.md").includes("Context Reconcile"), "English init should install English reconcile prompt");
 assert(read(cwd, "ai/template/reconcile.md").includes("reconciliation plan"), "English reconcile prompt should require a plan first");
+assert(read(cwd, "ai/template/reconcile.md").includes("do not recursively\nread `processed/**` or `ideas/**`"), "English reconcile prompt should exclude processed and ideas recursively");
 assert(read(cwd, "ai/template/reconcile.md").includes("ai/project/inbox/processed/raw/file.md"), "English reconcile prompt should archive absorbed raw inbox material");
 assert(read(cwd, "ai/template/reconcile.md").includes("Unabsorbed material"), "English reconcile handoff should audit unabsorbed material");
 assert(read(cwd, "ai/template/reconcile.md").includes("Conflict handling"), "English reconcile handoff should audit conflict handling");
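One of the added English assertions matches a literal `\n` inside the expected snippet, so it passes only while the template wraps that sentence at the same word. If that coupling ever becomes a nuisance, a whitespace-insensitive check is one option. The helper below is a hypothetical sketch, not something selftest.js defines; the selftest itself uses plain `String.prototype.includes`:

```javascript
// Hypothetical helper: collapses every run of whitespace (including line
// breaks) to a single space before comparing.
function includesIgnoringWrap(text, snippet) {
  const normalize = (s) => s.replace(/\s+/g, " ").trim();
  return normalize(text).includes(normalize(snippet));
}

const wrapped = "By default, do not recursively\nread `processed/**` or `ideas/**`.";
console.log(includesIgnoringWrap(wrapped, "do not recursively read `processed/**` or `ideas/**`"));
```

This suits space-delimited text only. In the Chinese templates a line break falls between characters with no space, so collapsing it to a space would insert a character the snippet does not contain; those assertions are better left as exact substring checks.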