@wnlen/agent-execution-template 0.8.18 → 0.8.20
- package/README.md +13 -5
- package/README.zh-CN.md +9 -5
- package/bin/agent-execution-template.js +121 -17
- package/docs/SPEC.md +13 -6
- package/package.json +1 -1
- package/template/en/ai/project/task.md +25 -9
- package/template/en/ai/template/VERSION +1 -1
- package/template/en/ai/template/execution-policy.md +43 -10
- package/template/en/ai/template/prompt.md +17 -12
- package/template/en/ai/template/protocol.md +9 -5
- package/template/en/ai/template/rules/core.md +12 -3
- package/template/en/ai/template/rules/output.md +4 -1
- package/template/zh/ai/project/runtime.md +11 -11
- package/template/zh/ai/project/task.md +30 -25
- package/template/zh/ai/template/VERSION +1 -1
- package/template/zh/ai/template/bootstrap.md +21 -27
- package/template/zh/ai/template/execution-policy.md +29 -5
- package/template/zh/ai/template/prompt.md +38 -47
- package/template/zh/ai/template/protocol.md +29 -31
- package/template/zh/ai/template/reconcile.md +21 -28
- package/template/zh/ai/template/rules/core.md +24 -22
- package/template/zh/ai/template/rules/output.md +3 -1
- package/test/selftest.js +93 -2
package/README.md
CHANGED
|
@@ -161,7 +161,8 @@ The user can still give a natural-language goal, for example:
|
|
|
161
161
|
Build the settings page with profile editing, notification toggles, and export entrypoint
|
|
162
162
|
```
|
|
163
163
|
|
|
164
|
-
Before execution, the AI decomposes L1 tasks
|
|
164
|
+
Before execution, the AI decomposes L1 tasks. Each L1 must be an independently
|
|
165
|
+
acceptable vertical slice, not a mechanical step checklist:
|
|
165
166
|
|
|
166
167
|
```text
|
|
167
168
|
- [ ] L1-1 Profile editing Green
|
|
@@ -171,12 +172,19 @@ Before execution, the AI decomposes L1 tasks:
|
|
|
171
172
|
|
|
172
173
|
Because there are two or more L1 tasks, the protocol automatically uses bounded
|
|
173
174
|
continuous execution. Before each L1, the AI plans naturally derived L2/L3 work.
|
|
174
|
-
After completing an L1, it checks and strikes the item
|
|
175
|
-
|
|
175
|
+
After completing an L1, it checks and strikes the item. `task_tree` is written
|
|
176
|
+
back only at L1 start/done, Red/blocked, scope changes, or final wrap-up, so tiny
|
|
177
|
+
steps do not churn files.
|
|
176
178
|
|
|
177
179
|
Only Red risk stops for confirmation. Green continues automatically, and Yellow
|
|
178
|
-
|
|
179
|
-
|
|
180
|
+
only permits local low-risk correction inside the current L1/L2; it must not
|
|
181
|
+
change public interfaces, data models, permissions, security, architecture
|
|
182
|
+
direction, or acceptance. By default, users see L1, risk conclusions, evidence,
|
|
183
|
+
Red confirmations, and final results; internal protocol details are not shown.
|
|
184
|
+
|
|
185
|
+
If the AI just created or rewrote `ai/project/task.md` in the current run, it
|
|
186
|
+
must stop for confirmation. Execution is allowed only when an existing task is
|
|
187
|
+
explicitly `ready_to_execute`.
|
|
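The readiness rule in the added README lines can be sketched in JavaScript as below. This is a minimal illustration, not code from the package: the function name `mayExecute` and the `createdOrRewrittenThisRun` flag are hypothetical, and only the `readiness` values `ready_to_execute` and `draft_for_confirmation` come from the templates.

```javascript
// Hypothetical sketch of the readiness gate described in the README.
// `task.readiness` mirrors the field in ai/project/task.md;
// `createdOrRewrittenThisRun` is an assumed flag tracked by the caller.
function mayExecute(task, createdOrRewrittenThisRun) {
  // A task created or rewritten in this run always stops for confirmation.
  if (createdOrRewrittenThisRun) return false;
  // Otherwise only an explicitly ready task may enter execution.
  return task.readiness === "ready_to_execute";
}

console.log(mayExecute({ readiness: "ready_to_execute" }, false)); // true
console.log(mayExecute({ readiness: "ready_to_execute" }, true)); // false
console.log(mayExecute({ readiness: "draft_for_confirmation" }, false)); // false
```

The second call shows the point of the rule: even a task marked ready does not run if it was written in the same run.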
180
188
|
|
|
181
189
|
## Installed Layout
|
|
182
190
|
|
package/README.zh-CN.md
CHANGED
|
@@ -171,7 +171,7 @@ npx -y @wnlen/agent-execution-template strategy
|
|
|
171
171
|
Build the settings page with profile editing, notification toggles, and export entrypoint
|
|
172
172
|
```
|
|
173
173
|
|
|
174
|
-
Before execution, the AI first decomposes L1
|
|
174
|
+
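Before execution, the AI first decomposes L1 tasks. Each L1 must be an independently acceptable vertical slice, not a mechanical step checklist: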
|
|
175
175
|
|
|
176
176
|
```text
|
|
177
177
|
- [ ] L1-1 Profile editing Green
|
|
@@ -180,11 +180,15 @@ Before execution, the AI first decomposes L1 tasks:
|
|
|
180
180
|
```
|
|
181
181
|
|
|
182
182
|
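Because there are two or more L1 tasks, the protocol automatically uses bounded continuous execution. Before executing each L1, the AI then plans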
|
|
183
|
-
the naturally derived L2/L3; after completing an L1
|
|
184
|
-
|
|
183
|
+
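the naturally derived L2/L3; after completing an L1, it checks the item off and strikes it through. `task_tree` is written back only
|
|
184
|
+
at L1 start/done, Red/blocked, scope changes, or final wrap-up, to avoid repeatedly rewriting files for tiny steps.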
|
|
185
185
|
|
|
186
|
-
Only Red risk stops for your confirmation. Green continues automatically, and Yellow
|
|
187
|
-
|
|
186
|
+
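Only Red risk stops for your confirmation. Green continues automatically, and Yellow only permits local
|
|
187
|
+
low-risk correction inside the current L1/L2; it must not change public interfaces, data models, permissions, security, architecture direction, or acceptance criteria.
|
|
188
|
+
By default, users see only L1, risk conclusions, evidence, Red confirmations, and final results; internal protocol details are not shown by default.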
|
|
189
|
+
|
|
190
|
+
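If the AI has just created or rewritten `ai/project/task.md` in this run, it must first stop for a confirmation handoff;
|
|
191
|
+
execution is allowed only when an existing task is explicitly `ready_to_execute`.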
|
|
188
192
|
|
|
189
193
|
## Installed Layout
|
|
190
194
|
|
|
package/bin/agent-execution-template.js
CHANGED
|
@@ -19,6 +19,8 @@ const REQUIRED_FILES = [
|
|
|
19
19
|
"ai/template/protocol.md",
|
|
20
20
|
"ai/template/rules/core.md",
|
|
21
21
|
"ai/template/rules/output.md",
|
|
22
|
+
"ai/template/schemas/result.schema.json",
|
|
23
|
+
"ai/template/schemas/metrics.schema.json",
|
|
22
24
|
"ai/project/inbox/.gitkeep",
|
|
23
25
|
"ai/project/project.md",
|
|
24
26
|
"ai/project/runtime.md",
|
|
@@ -39,11 +41,6 @@ const RECOMMENDED_FILES = [
|
|
|
39
41
|
"ai/project/refs/roadmap.md"
|
|
40
42
|
];
|
|
41
43
|
|
|
42
|
-
const JSON_HEALTH_FILES = [
|
|
43
|
-
"ai/project/result.json",
|
|
44
|
-
"ai/project/metrics.json"
|
|
45
|
-
];
|
|
46
|
-
|
|
47
44
|
const TASK_HEALTH_PATTERNS = [
|
|
48
45
|
/^task_id:\s*/m,
|
|
49
46
|
/^type:\s*/m,
|
|
@@ -132,6 +129,7 @@ const TEXT = {
|
|
|
132
129
|
fail: "失败",
|
|
133
130
|
empty: "为空",
|
|
134
131
|
invalidJson: "JSON 无效",
|
|
132
|
+
invalidSchema: "不符合协议 schema",
|
|
135
133
|
taskFrontMatterIncomplete: "任务 front matter 缺少关键字段",
|
|
136
134
|
versionMismatch: "模板版本与包版本不一致",
|
|
137
135
|
runInit: "请运行 npx -y @wnlen/agent-execution-template init",
|
|
@@ -241,6 +239,7 @@ Usage:
|
|
|
241
239
|
fail: "FAIL",
|
|
242
240
|
empty: "is empty",
|
|
243
241
|
invalidJson: "contains invalid JSON",
|
|
242
|
+
invalidSchema: "does not match protocol schema",
|
|
244
243
|
taskFrontMatterIncomplete: "task front matter is missing required fields",
|
|
245
244
|
versionMismatch: "template version does not match package version",
|
|
246
245
|
runInit: "Run npx -y @wnlen/agent-execution-template init",
|
|
@@ -701,6 +700,117 @@ function isPermissionError(error) {
|
|
|
701
700
|
return error && (error.code === "EACCES" || error.code === "EPERM");
|
|
702
701
|
}
|
|
703
702
|
|
|
703
|
+
function parseJsonFile(file) {
|
|
704
|
+
return JSON.parse(fs.readFileSync(file, "utf8"));
|
|
705
|
+
}
|
|
706
|
+
|
|
707
|
+
function valueMatchesType(value, type) {
|
|
708
|
+
if (type === "array") return Array.isArray(value);
|
|
709
|
+
if (type === "integer") return Number.isInteger(value);
|
|
710
|
+
if (type === "number") return typeof value === "number" && Number.isFinite(value);
|
|
711
|
+
if (type === "object") return value !== null && typeof value === "object" && !Array.isArray(value);
|
|
712
|
+
return typeof value === type;
|
|
713
|
+
}
|
|
714
|
+
|
|
715
|
+
function valuesEqual(left, right) {
|
|
716
|
+
return JSON.stringify(left) === JSON.stringify(right);
|
|
717
|
+
}
|
|
718
|
+
|
|
719
|
+
function validateJsonSchema(value, schema, location = "$") {
|
|
720
|
+
const errors = [];
|
|
721
|
+
|
|
722
|
+
if (schema.const !== undefined && !valuesEqual(value, schema.const)) {
|
|
723
|
+
errors.push(`${location} must be ${JSON.stringify(schema.const)}`);
|
|
724
|
+
}
|
|
725
|
+
|
|
726
|
+
if (schema.enum && !schema.enum.some((candidate) => valuesEqual(value, candidate))) {
|
|
727
|
+
errors.push(`${location} must be one of ${schema.enum.map((item) => JSON.stringify(item)).join(", ")}`);
|
|
728
|
+
}
|
|
729
|
+
|
|
730
|
+
if (schema.type && !valueMatchesType(value, schema.type)) {
|
|
731
|
+
errors.push(`${location} must be ${schema.type}`);
|
|
732
|
+
return errors;
|
|
733
|
+
}
|
|
734
|
+
|
|
735
|
+
if (schema.minimum !== undefined && typeof value === "number" && value < schema.minimum) {
|
|
736
|
+
errors.push(`${location} must be >= ${schema.minimum}`);
|
|
737
|
+
}
|
|
738
|
+
|
|
739
|
+
if (schema.minLength !== undefined && typeof value === "string" && value.length < schema.minLength) {
|
|
740
|
+
errors.push(`${location} must have length >= ${schema.minLength}`);
|
|
741
|
+
}
|
|
742
|
+
|
|
743
|
+
if (schema.required && valueMatchesType(value, "object")) {
|
|
744
|
+
for (const key of schema.required) {
|
|
745
|
+
if (!Object.prototype.hasOwnProperty.call(value, key)) {
|
|
746
|
+
errors.push(`${location}.${key} is required`);
|
|
747
|
+
}
|
|
748
|
+
}
|
|
749
|
+
}
|
|
750
|
+
|
|
751
|
+
if (schema.properties && valueMatchesType(value, "object")) {
|
|
752
|
+
for (const [key, childSchema] of Object.entries(schema.properties)) {
|
|
753
|
+
if (Object.prototype.hasOwnProperty.call(value, key)) {
|
|
754
|
+
errors.push(...validateJsonSchema(value[key], childSchema, `${location}.${key}`));
|
|
755
|
+
}
|
|
756
|
+
}
|
|
757
|
+
}
|
|
758
|
+
|
|
759
|
+
if (schema.items && Array.isArray(value)) {
|
|
760
|
+
value.forEach((item, index) => {
|
|
761
|
+
errors.push(...validateJsonSchema(item, schema.items, `${location}[${index}]`));
|
|
762
|
+
});
|
|
763
|
+
}
|
|
764
|
+
|
|
765
|
+
if (schema.allOf) {
|
|
766
|
+
for (const childSchema of schema.allOf) {
|
|
767
|
+
if (childSchema.if) {
|
|
768
|
+
if (validateJsonSchema(value, childSchema.if, location).length === 0 && childSchema.then) {
|
|
769
|
+
errors.push(...validateJsonSchema(value, childSchema.then, location));
|
|
770
|
+
}
|
|
771
|
+
} else {
|
|
772
|
+
errors.push(...validateJsonSchema(value, childSchema, location));
|
|
773
|
+
}
|
|
774
|
+
}
|
|
775
|
+
}
|
|
776
|
+
|
|
777
|
+
return errors;
|
|
778
|
+
}
|
|
779
|
+
|
|
780
|
+
function printSchemaValidation(file, schemaFile, text) {
|
|
781
|
+
const fullPath = path.join(process.cwd(), file);
|
|
782
|
+
const schemaPath = path.join(process.cwd(), schemaFile);
|
|
783
|
+
if (!fs.existsSync(fullPath) || !fs.existsSync(schemaPath)) {
|
|
784
|
+
return 0;
|
|
785
|
+
}
|
|
786
|
+
|
|
787
|
+
let value;
|
|
788
|
+
try {
|
|
789
|
+
value = parseJsonFile(fullPath);
|
|
790
|
+
console.log(`[${text.pass}] ${file} JSON`);
|
|
791
|
+
} catch {
|
|
792
|
+
console.log(`[${text.fail}] ${file} ${text.invalidJson}`);
|
|
793
|
+
return 1;
|
|
794
|
+
}
|
|
795
|
+
|
|
796
|
+
let schema;
|
|
797
|
+
try {
|
|
798
|
+
schema = parseJsonFile(schemaPath);
|
|
799
|
+
} catch {
|
|
800
|
+
console.log(`[${text.fail}] ${schemaFile} ${text.invalidJson}`);
|
|
801
|
+
return 1;
|
|
802
|
+
}
|
|
803
|
+
|
|
804
|
+
const errors = validateJsonSchema(value, schema);
|
|
805
|
+
if (errors.length > 0) {
|
|
806
|
+
console.log(`[${text.fail}] ${file} ${text.invalidSchema}: ${errors.slice(0, 3).join("; ")}`);
|
|
807
|
+
return 1;
|
|
808
|
+
}
|
|
809
|
+
|
|
810
|
+
console.log(`[${text.pass}] ${file} schema`);
|
|
811
|
+
return 0;
|
|
812
|
+
}
|
|
813
|
+
|
|
704
814
|
function printFatal(error, lang) {
|
|
705
815
|
const text = getText(lang);
|
|
706
816
|
if (isPermissionError(error)) {
|
|
@@ -756,18 +866,12 @@ function doctor() {
|
|
|
756
866
|
console.log(`[${text.pass}] ${file}`);
|
|
757
867
|
}
|
|
758
868
|
|
|
759
|
-
|
|
760
|
-
|
|
761
|
-
|
|
762
|
-
|
|
763
|
-
|
|
764
|
-
|
|
765
|
-
JSON.parse(fs.readFileSync(fullPath, "utf8"));
|
|
766
|
-
console.log(`[${text.pass}] ${file} JSON`);
|
|
767
|
-
} catch {
|
|
768
|
-
console.log(`[${text.fail}] ${file} ${text.invalidJson}`);
|
|
769
|
-
missing += 1;
|
|
770
|
-
}
|
|
869
|
+
const schemaChecks = [
|
|
870
|
+
["ai/project/result.json", "ai/template/schemas/result.schema.json"],
|
|
871
|
+
["ai/project/metrics.json", "ai/template/schemas/metrics.schema.json"]
|
|
872
|
+
];
|
|
873
|
+
for (const [file, schemaFile] of schemaChecks) {
|
|
874
|
+
missing += printSchemaValidation(file, schemaFile, text);
|
|
771
875
|
}
|
|
772
876
|
|
|
773
877
|
const taskPath = path.join(process.cwd(), "ai/project/task.md");
|
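Two helpers in the validator added above have edge cases worth knowing before writing a schema against it. The sketch below copies `valueMatchesType` and `valuesEqual` from the diff so it runs on its own; the sample values are illustrative, and whether the shipped `result.schema.json` or `metrics.schema.json` ever hit these cases is not shown in this diff.

```javascript
// Copied from the diff above so the example is self-contained.
function valueMatchesType(value, type) {
  if (type === "array") return Array.isArray(value);
  if (type === "integer") return Number.isInteger(value);
  if (type === "number") return typeof value === "number" && Number.isFinite(value);
  if (type === "object") return value !== null && typeof value === "object" && !Array.isArray(value);
  return typeof value === type;
}

function valuesEqual(left, right) {
  return JSON.stringify(left) === JSON.stringify(right);
}

// Arrays and null are not accepted as "object".
console.log(valueMatchesType([], "object")); // false
console.log(valueMatchesType(null, "object")); // false
// typeof null is "object", so a schema type of "null" never matches.
console.log(valueMatchesType(null, "null")); // false
// 2.0 counts as an integer; NaN is rejected as a number.
console.log(valueMatchesType(2.0, "integer")); // true
console.log(valueMatchesType(NaN, "number")); // false
// Comparison via JSON.stringify is sensitive to key order.
console.log(valuesEqual({ a: 1, b: 2 }, { a: 1, b: 2 })); // true
console.log(valuesEqual({ a: 1, b: 2 }, { b: 2, a: 1 })); // false
```

In practice this means `const` and `enum` entries are safest as scalars or arrays, and that this validator covers a small subset of JSON Schema rather than the full specification.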
package/docs/SPEC.md
CHANGED
|
@@ -22,7 +22,7 @@ npx installs the protocol -> AI organizes project context -> human confirms -> AI generates the task
|
|
|
22
22
|
|
|
23
23
|
```text
|
|
24
24
|
Protocol: v0.8
|
|
25
|
-
Package: @wnlen/agent-execution-template@0.8.
|
|
25
|
+
Package: @wnlen/agent-execution-template@0.8.19
|
|
26
26
|
Chinese install: npx -y @wnlen/agent-execution-template init
|
|
27
27
|
English install: npx -y @wnlen/agent-execution-template init --lang en
|
|
28
28
|
```
|
|
@@ -392,7 +392,7 @@ npx -y @wnlen/agent-execution-template doctor
|
|
|
392
392
|
```text
|
|
393
393
|
Agent Execution Template 检查
|
|
394
394
|
|
|
395
|
-
模板版本: 0.8.
|
|
395
|
+
模板版本: 0.8.19
|
|
396
396
|
模板语言: zh
|
|
397
397
|
|
|
398
398
|
[通过] ai/template/LANG
|
|
@@ -664,7 +664,10 @@ ai/template/execution-policy.md
|
|
|
664
664
|
Pre-execution planning:
|
|
665
665
|
|
|
666
666
|
- The AI infers goal, scope, acceptance, permissions, and verification method from the user goal, project context, and repository facts;
|
|
667
|
+
- Enter execution only when `ai/project/task.md.readiness = ready_to_execute`; if this run creates or rewrites
|
|
668
|
+
`task.md`, it must stop at the confirmation handoff;
|
|
667
669
|
- First list the L1 task checklist and label each L1 Green / Yellow / Red;
|
|
670
|
+
- Each L1 must be an independently acceptable vertical slice, not a mechanical step checklist;
|
|
668
671
|
- Use `normal` when there are fewer than 2 L1 tasks;
|
|
669
672
|
- Automatically use `bounded_continuous` when there are 2 or more L1 tasks;
|
|
670
673
|
- Stop and wait for human confirmation if any L1 is Red; Green and Yellow do not block startup.
|
|
@@ -675,16 +678,20 @@ ai/template/execution-policy.md
|
|
|
675
678
|
- Default to at most 3 levels; add L4 dynamically only when L3 is still too large, unverifiable, or cannot be rolled back;
|
|
676
679
|
- Every task node must have a risk rating, expected change scope, acceptance method, and evidence requirements;
|
|
677
680
|
- The L1 checklist must be shown as a to-do list; check off and strike through each L1 as it is completed;
|
|
678
|
-
-
|
|
681
|
+
- `task_tree` write-back should be concentrated before execution, at L1 start/done, Red, blocked, scope change, and final wrap-up;
|
|
679
682
|
- Advance by `vertical_slice` by default; every round must produce a reviewable increment;
|
|
683
|
+
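- Emit a Checkpoint only when risk changes from Green to Yellow/Red, scope or permission is about to expand, an L1 vertical slice is complete,
|
|
684
|
+
verification failed but execution is about to continue, or at final wrap-up;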
|
|
680
685
|
- Every Checkpoint must include evidence: changed files, commands run, verification results, or why verification was not possible;
|
|
681
686
|
- Green may continue automatically;
|
|
682
|
-
- Yellow
|
|
687
|
+
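- Yellow only permits local low-risk correction inside the current L1/L2; it must not change public interfaces, data models, permissions,
|
|
688
|
+
security, architecture direction, or acceptance criteria;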
|
|
683
689
|
- Red must stop and wait for human confirmation;
|
|
690
|
+
- User-visible output shows only L1, risk conclusions, evidence, Red confirmations, and final results by default, not internal protocol details;
|
|
684
691
|
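- Goal, scope, acceptance, and permissions are inferred by the AI, but must not cross project rules, explicit user limits,
|
|
685
692
|
`permission.modify.denied`, security boundaries, or destructive-operation limits;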
|
|
686
|
-
-
|
|
687
|
-
|
|
693
|
+
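- When the work needs to expand permissions, run unallowed commands, access the network, perform destructive operations, or change product direction, core architecture,
|
|
694
|
+
public API, persistent data structures, security boundaries, payment, accounts, or permissions, the current node must be marked Red.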
|
|
688
695
|
|
|
689
696
|
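It does not apply to tasks whose direction is undecided and cannot be inferred, whose acceptance cannot be defined, or that involve high-risk architecture trade-offs; these should be rated Red directly.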
|
|
690
697
|
|
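The pre-execution planning rules in SPEC.md (mode chosen from the L1 count, any Red L1 stops before work starts) can be sketched as follows. The function name `planExecution` and the node shape `{ id, risk }` are assumptions made for illustration; only the mode names `normal` and `bounded_continuous` and the risk labels come from the spec.

```javascript
// Hypothetical sketch: derive the execution mode from the L1 checklist.
// Each entry is assumed to look like { id: "L1-1", risk: "Green" }.
function planExecution(l1Tasks) {
  // Fewer than 2 L1 tasks: normal; 2 or more: bounded_continuous.
  const mode = l1Tasks.length >= 2 ? "bounded_continuous" : "normal";
  // Any Red L1 means stopping for human confirmation before starting.
  const redIds = l1Tasks
    .filter((task) => task.risk === "Red")
    .map((task) => task.id);
  return { mode, waitForHuman: redIds.length > 0, redIds };
}

const plan = planExecution([
  { id: "L1-1", risk: "Green" },
  { id: "L1-2", risk: "Yellow" },
  { id: "L1-3", risk: "Red" }
]);
console.log(plan.mode, plan.waitForHuman, plan.redIds);
```

Green and Yellow entries do not affect `waitForHuman`, which matches the rule that only Red blocks startup.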
package/package.json
CHANGED
|
package/template/en/ai/project/task.md
CHANGED
|
@@ -11,6 +11,8 @@ execution_policy:
|
|
|
11
11
|
max_depth: 3
|
|
12
12
|
allow_depth_4_when_needed: true
|
|
13
13
|
progress_unit: "vertical_slice"
|
|
14
|
+
l1_granularity: "independently_acceptable_vertical_slice"
|
|
15
|
+
write_back_policy: "l1_start_done_red_blocked_scope_change_final"
|
|
14
16
|
task_tree:
|
|
15
17
|
- id: "L1-1"
|
|
16
18
|
title: ""
|
|
@@ -86,6 +88,9 @@ permissions, and acceptance from the human goal, project context, and repository
|
|
|
86
88
|
facts. If inference would cross permission or safety boundaries, or acceptance
|
|
87
89
|
cannot be defined, set `readiness` to `blocked` or mark the relevant task node
|
|
88
90
|
`Red` and wait for human confirmation.
|
|
91
|
+
If this run creates or rewrites the task contract, keep
|
|
92
|
+
`readiness = draft_for_confirmation` by default and stop at the handoff. Enter
|
|
93
|
+
execution only when an existing task is explicitly `ready_to_execute`.
|
|
89
94
|
|
|
90
95
|
## Goal
|
|
91
96
|
|
|
@@ -143,26 +148,36 @@ fewer than 2 L1 tasks, use `normal`; if it finds 2 or more L1 tasks, use
|
|
|
143
148
|
before execution.
|
|
144
149
|
- `readiness = blocked` means the task cannot execute and must produce a
|
|
145
150
|
blocked result.
|
|
151
|
+
- If this run creates or rewrites `ai/project/task.md`, stop at the confirmation
|
|
152
|
+
handoff; do not execute while the task is still a draft.
|
|
146
153
|
- Before execution, write the L1 checklist to `execution_policy.task_tree`.
|
|
147
154
|
- Before execution, list the L1 task checklist; mark each L1 complete with a
|
|
148
155
|
checked and struck-through item.
|
|
156
|
+
- Each L1 must be an independently acceptable vertical slice. Do not split a
|
|
157
|
+
single mechanical step into L1 tasks, and do not merge multiple independently
|
|
158
|
+
acceptable user-visible outcomes into one L1.
|
|
149
159
|
- Before executing an L1, plan the naturally derived L2 tasks; if an L2 still
|
|
150
160
|
needs decomposition, plan L3 tasks.
|
|
151
161
|
- Default to at most 3 levels; add L4 dynamically only when leaving it out
|
|
152
162
|
would make L3 too large or unverifiable.
|
|
153
163
|
- The AI assigns Green / Yellow / Red risk to every task node.
|
|
154
164
|
- Only Red stops for human confirmation; Green continues automatically, and
|
|
155
|
-
Yellow
|
|
165
|
+
Yellow only permits local low-risk correction inside the current L1/L2. It
|
|
166
|
+
must not change public interfaces, data models, permissions, security,
|
|
167
|
+
architecture direction, or acceptance.
|
|
156
168
|
- `progress_unit` defaults to `vertical_slice`: each work loop should produce
|
|
157
169
|
a reviewable increment.
|
|
158
170
|
- `checkpoint_budget` is the maximum checkpoint budget, not a required count;
|
|
159
171
|
do not report just to spend the budget.
|
|
160
|
-
- Emit
|
|
161
|
-
|
|
172
|
+
- Emit checkpoints only when risk changes from Green to Yellow/Red, scope or
|
|
173
|
+
permission is about to expand, an L1 vertical slice is complete, verification
|
|
174
|
+
failed but execution is about to continue, or final wrap-up is about to start.
|
|
162
175
|
- Every checkpoint must include evidence: changed files, commands run,
|
|
163
176
|
verification results, or why verification was not possible.
|
|
164
|
-
-
|
|
165
|
-
|
|
177
|
+
- `task_tree` write-back frequency: write the L1 checklist before execution;
|
|
178
|
+
update an L1 when it starts or completes; write back immediately on Red,
|
|
179
|
+
blocked, scope change, or final wrap-up; do not write back every tiny L3
|
|
180
|
+
operation.
|
|
166
181
|
- After completion, run one final review; only re-check Yellow, Red, failed
|
|
167
182
|
verification, or high-impact modules.
|
|
168
183
|
- Continuous execution does not change model policy; escalate through
|
|
@@ -192,7 +207,8 @@ Stop and write `ai/project/result.json`, `ai/project/result.md`, and `ai/project
|
|
|
192
207
|
- Required command cannot be run.
|
|
193
208
|
- Risk level is high without explicit authorization.
|
|
194
209
|
- A Red checkpoint appears during continuous execution.
|
|
195
|
-
- The task would change product direction, core architecture,
|
|
196
|
-
security boundaries, payment, accounts, or
|
|
197
|
-
|
|
198
|
-
|
|
210
|
+
- The task would change product direction, core architecture, public API,
|
|
211
|
+
persistent data structures, security boundaries, payment, accounts, or
|
|
212
|
+
permissions.
|
|
213
|
+
- The task would delete files beyond the current L1's directly related files,
|
|
214
|
+
rewrite a core module, or require choosing between multiple high-cost options.
|
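The `write_back_policy` value and the write-back bullet in this file describe when `task_tree` is persisted. A minimal sketch, assuming a caller that reports named events: the event names below are hypothetical labels derived from the policy string `l1_start_done_red_blocked_scope_change_final`, plus the initial checklist write that the bullets require before execution.

```javascript
// Hypothetical event names; the protocol itself only describes them in prose.
const WRITE_BACK_EVENTS = new Set([
  "pre_execution", // write the L1 checklist before execution
  "l1_start",
  "l1_done",
  "red",
  "blocked",
  "scope_change",
  "final"
]);

// Returns true when `task_tree` should be written back for this event.
function shouldWriteBack(event) {
  return WRITE_BACK_EVENTS.has(event);
}

console.log(shouldWriteBack("l1_done")); // true
console.log(shouldWriteBack("l3_step")); // false
```

Anything not in the set, such as an individual L3 operation, leaves the file untouched.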
|
package/template/en/ai/template/VERSION
CHANGED
|
@@ -1 +1 @@
|
|
|
1
|
-
0.8.
|
|
1
|
+
0.8.20
|
|
package/template/en/ai/template/execution-policy.md
CHANGED
|
@@ -13,7 +13,13 @@ Pre-execution planning must:
|
|
|
13
13
|
|
|
14
14
|
- Infer goal, scope, acceptance, permissions, and verification method from the
|
|
15
15
|
human goal, project context, and repository facts.
|
|
16
|
+
- Enter execution only when `ai/project/task.md` already exists and
|
|
17
|
+
`readiness = ready_to_execute`. If this run creates or rewrites the task
|
|
18
|
+
contract, stop at the confirmation handoff instead of executing from a draft.
|
|
16
19
|
- List the L1 task checklist and assign Green / Yellow / Red risk to each L1.
|
|
20
|
+
- Each L1 must be an independently acceptable vertical slice. Do not split a
|
|
21
|
+
single mechanical step into L1 tasks, and do not merge multiple independently
|
|
22
|
+
acceptable user-visible outcomes into one L1.
|
|
17
23
|
- Use `normal` if there are fewer than 2 L1 tasks.
|
|
18
24
|
- Automatically use `bounded_continuous` if there are 2 or more L1 tasks.
|
|
19
25
|
- Stop for human confirmation first if any L1 is Red; Green and Yellow do not
|
|
@@ -24,6 +30,10 @@ Pre-execution planning must:
|
|
|
24
30
|
|
|
25
31
|
Execute the task tree in L1 -> L2 -> L3 order.
|
|
26
32
|
|
|
33
|
+
- L1 is a work increment that can be verified, rolled back, and explained to the
|
|
34
|
+
user after completion.
|
|
35
|
+
- L2 is an implementation substep needed to finish that L1.
|
|
36
|
+
- L3 is a local operation step used when an L2 is still too large.
|
|
27
37
|
- Before executing an L1, plan its naturally derived L2 tasks.
|
|
28
38
|
- Before executing an L2, plan L3 tasks if it still needs decomposition.
|
|
29
39
|
- Default to at most 3 levels. Add L4 dynamically only when L3 would otherwise
|
|
@@ -32,8 +42,11 @@ Execute the task tree in L1 -> L2 -> L3 order.
|
|
|
32
42
|
and evidence requirements.
|
|
33
43
|
- Show the L1 checklist as task items; when an L1 is complete, check it off and
|
|
34
44
|
strike it through.
|
|
35
|
-
-
|
|
36
|
-
|
|
45
|
+
- Task tree write-back rule: write the L1 checklist before execution; update an
|
|
46
|
+
L1 when it starts or completes; write back immediately on Red, blocked, scope
|
|
47
|
+
change, or final wrap-up. Do not write back every tiny L3 operation.
|
|
48
|
+
- During execution, use `pending`, `running`, `done`, or `blocked` for node
|
|
49
|
+
status.
|
|
37
50
|
|
|
38
51
|
Recommended node shape:
|
|
39
52
|
|
|
@@ -65,16 +78,19 @@ Yellow:
|
|
|
65
78
|
- Still inside current task scope;
|
|
66
79
|
- local uncertainty or local verification failure exists;
|
|
67
80
|
- a low-risk local correction can continue the work;
|
|
68
|
-
-
|
|
81
|
+
- the correction affects only the current L1/L2 local implementation and does
|
|
82
|
+
not change public interfaces, data models, permissions, security,
|
|
83
|
+
architecture direction, or acceptance;
|
|
84
|
+
- no permission, scope, command, network, or acceptance expansion is needed.
|
|
69
85
|
|
|
70
86
|
Red:
|
|
71
87
|
|
|
72
88
|
- Permission expansion, unallowed command, network access, or destructive action
|
|
73
89
|
is needed;
|
|
74
|
-
- product direction, core architecture,
|
|
75
|
-
payment, account, or permission would change;
|
|
76
|
-
-
|
|
77
|
-
high-cost options require judgment;
|
|
90
|
+
- product direction, core architecture, public APIs, persistent data structures,
|
|
91
|
+
security boundary, payment, account, or permission would change;
|
|
92
|
+
- files beyond the current L1's directly related files must be deleted, a core
|
|
93
|
+
module must be rewritten, or multiple high-cost options require judgment;
|
|
78
94
|
- acceptance cannot be defined, or task goal materially conflicts with project direction.
|
|
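The Yellow and Red criteria above can be read as a small decision function. The sketch below is illustrative only: `classifyRisk` and every flag on `change` are assumed names, and the Red list covers a subset of the triggers named in the rubric.

```javascript
// Hypothetical sketch of the Green / Yellow / Red rubric.
// The boolean flags on `change` are assumed names, not protocol fields.
function classifyRisk(change) {
  const redTriggers = [
    change.needsPermissionExpansion,
    change.needsUnallowedCommand,
    change.needsNetworkAccess,
    change.isDestructive,
    change.touchesPublicApi,
    change.touchesPersistentData,
    change.touchesSecurityBoundary,
    change.acceptanceUndefined
  ];
  // Any Red trigger wins over everything else.
  if (redTriggers.some(Boolean)) return "Red";
  // Yellow: local uncertainty or a local verification failure that a
  // low-risk correction inside the current L1/L2 can resolve.
  if (change.localUncertainty || change.localVerificationFailed) return "Yellow";
  return "Green";
}

console.log(classifyRisk({})); // Green
console.log(classifyRisk({ localVerificationFailed: true })); // Yellow
console.log(classifyRisk({ localVerificationFailed: true, touchesPublicApi: true })); // Red
```

The last call shows why Yellow cannot absorb interface changes: once a correction touches a public API, the node is Red regardless of how local the failure was.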
79
95
|
|
|
80
96
|
Only Red stops for human confirmation. Green continues automatically. Yellow
|
|
@@ -82,9 +98,10 @@ continues after local low-risk correction.
|
|
|
82
98
|
|
|
83
99
|
## Checkpoint
|
|
84
100
|
|
|
85
|
-
Emit checkpoints only when risk
|
|
86
|
-
|
|
87
|
-
|
|
101
|
+
Emit checkpoints only when risk changes from Green to Yellow/Red, scope or
|
|
102
|
+
permission is about to expand, an L1 vertical slice is complete, verification
|
|
103
|
+
failed but execution is about to continue, or final review is about to start.
|
|
104
|
+
Do not report just to spend checkpoint budget.
|
|
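The checkpoint triggers in the paragraph above amount to a single predicate. A minimal sketch, assuming the caller tracks the state fields shown; `shouldEmitCheckpoint` and all field names are hypothetical.

```javascript
// Hypothetical sketch of the checkpoint triggers; field names are assumed.
function shouldEmitCheckpoint(state) {
  // Risk escalation counts only when moving away from Green.
  const riskEscalated =
    state.previousRisk === "Green" &&
    (state.currentRisk === "Yellow" || state.currentRisk === "Red");
  return (
    riskEscalated ||
    state.scopeOrPermissionAboutToExpand === true ||
    state.l1SliceCompleted === true ||
    state.verificationFailedButContinuing === true ||
    state.finalReviewAboutToStart === true
  );
}

// A routine Green step with nothing notable emits no checkpoint.
console.log(shouldEmitCheckpoint({ previousRisk: "Green", currentRisk: "Green" })); // false
// Escalation from Green to Yellow does.
console.log(shouldEmitCheckpoint({ previousRisk: "Green", currentRisk: "Yellow" })); // true
```

The predicate never consults a remaining budget, which reflects the rule that checkpoints are not emitted just to spend `checkpoint_budget`.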
88
105
|
|
|
89
106
|
Every checkpoint must include:
|
|
90
107
|
|
|
@@ -102,6 +119,22 @@ Every checkpoint must include:
|
|
|
102
119
|
Evidence must include changed files, commands run, verification results, or why
|
|
103
120
|
verification was not possible. A purely subjective Green is not valid.
|
|
104
121
|
|
|
122
|
+
## User-Visible Output
|
|
123
|
+
|
|
124
|
+
- Use the installed template language by default. When `ai/template/LANG` is
|
|
125
|
+
`en`, user-visible plans, L1 checklists, checkpoints, task draft handoffs,
|
|
126
|
+
blocked explanations, and final results should default to English. Use another
|
|
127
|
+
language only when the human explicitly asks, or when preserving code,
|
|
128
|
+
commands, file paths, or protocol field names.
|
|
129
|
+
- Show the L1 checklist by default; do not show full L2/L3/L4 by default.
|
|
130
|
+
- Show risk conclusions and necessary reasons; do not output long internal reasoning.
|
|
131
|
+
- Show evidence; do not show internal protocol fields, full YAML,
|
|
132
|
+
`checkpoint_budget`, or `model_policy`.
|
|
133
|
+
- Say little for Green, be brief for Yellow, and stop with clear reasons and
|
|
134
|
+
options for Red.
|
|
135
|
+
- Final output must include status, completed items, verification results, and
|
|
136
|
+
result files.
|
|
137
|
+
|
|
105
138
|
## Model Policy
|
|
106
139
|
|
|
107
140
|
Continuous execution does not change `model_policy`. Still escalate through
|
|
package/template/en/ai/template/prompt.md
CHANGED
|
@@ -69,12 +69,12 @@ In Task Draft Mode:
|
|
|
69
69
|
and write it to `execution_policy.task_tree`. Use `normal` if there are
|
|
70
70
|
fewer than 2 L1 tasks; automatically use `bounded_continuous` if there are 2
|
|
71
71
|
or more L1 tasks.
|
|
72
|
-
5. If
|
|
73
|
-
|
|
74
|
-
|
|
75
|
-
6.
|
|
76
|
-
|
|
77
|
-
|
|
72
|
+
5. If this run creates or rewrites `ai/project/task.md`, set `readiness` to
|
|
73
|
+
`draft_for_confirmation` and stop at the handoff; do not execute while the
|
|
74
|
+
task is still a draft.
|
|
75
|
+
6. Enter Execution Mode only when an existing task is explicitly
|
|
76
|
+
`ready_to_execute` and no Red preflight item exists; if it cannot execute,
|
|
77
|
+
set it to `blocked`.
|
|
78
78
|
7. Do not modify source or business files in Task Draft Mode.
|
|
79
79
|
|
|
80
80
|
End Task Draft Mode with:
|
|
@@ -129,12 +129,17 @@ In Execution Mode, read:
|
|
|
129
129
|
Then follow `ai/template/execution-policy.md` for pre-execution planning: list
|
|
130
130
|
the L1 checklist, mark each L1 Green / Yellow / Red, and write it to
|
|
131
131
|
`execution_policy.task_tree`. Automatically choose `normal` or
|
|
132
|
-
`bounded_continuous` from the L1 count.
|
|
133
|
-
|
|
134
|
-
|
|
135
|
-
|
|
136
|
-
|
|
137
|
-
|
|
132
|
+
`bounded_continuous` from the L1 count. Execute only when
|
|
133
|
+
`readiness = ready_to_execute`; if this run creates or rewrites the task
|
|
134
|
+
contract, stop at the confirmation handoff. Each L1 must be an independently
|
|
135
|
+
acceptable vertical slice. Plan L2 before executing an L1, and plan L3 as needed
|
|
136
|
+
before executing an L2; default to at most 3 levels, with L4 allowed when
|
|
137
|
+
needed. When an L1 is complete, check it off and strike it through; write back
|
|
138
|
+
`task_tree` when an L1 starts or completes, on Red/blocked, on scope change, or
|
|
139
|
+
at final wrap-up. Only Red stops for human confirmation; Green continues
|
|
140
|
+
automatically, and Yellow only permits local low-risk correction inside the
|
|
141
|
+
current L1/L2. User-visible output follows the "User-Visible Output" rules in
|
|
142
|
+
`ai/template/execution-policy.md`. Write results to:
|
|
138
143
|
|
|
139
144
|
- `ai/project/result.json`
|
|
140
145
|
- `ai/project/result.md`
|
|
package/template/en/ai/template/protocol.md
CHANGED
|
@@ -54,11 +54,15 @@ Before task execution, read `ai/template/execution-policy.md`.
|
|
|
54
54
|
The default execution policy is `auto`: the AI first decomposes L1 tasks and
|
|
55
55
|
judges Green / Yellow / Red risk, then chooses `normal` or `bounded_continuous`.
|
|
56
56
|
Use `normal` when there are fewer than 2 L1 tasks; automatically use
|
|
57
|
-
`bounded_continuous` when there are 2 or more L1 tasks.
|
|
58
|
-
|
|
59
|
-
|
|
60
|
-
|
|
61
|
-
|
|
57
|
+
`bounded_continuous` when there are 2 or more L1 tasks. Each L1 must be an
|
|
58
|
+
independently acceptable vertical slice. Execution is allowed only for an
|
|
59
|
+
existing task with `readiness = ready_to_execute`; if this run creates or
|
|
60
|
+
rewrites the task contract, stop at the confirmation handoff. Only Red stops for
|
|
61
|
+
human confirmation, and Yellow only permits local low-risk correction inside the
|
|
62
|
+
current L1/L2.
|
|
63
|
+
|
|
64
|
+
Task tree, risk rubric, checkpoint evidence, and `task_tree` write-back rules
|
|
65
|
+
are defined in `ai/template/execution-policy.md`.
|
|
62
66
|
|
|
63
67
|
## Bootstrap Mode
|
|
64
68
|
|
|
package/template/en/ai/template/rules/core.md
CHANGED
|
@@ -125,15 +125,24 @@ explicitly say "enable continuous execution".
|
|
|
125
125
|
|
|
126
126
|
Hard gates:
|
|
127
127
|
|
|
128
|
+
- Execute only when `ai/project/task.md.readiness = ready_to_execute`; if this
|
|
129
|
+
run creates or rewrites `task.md`, stop at the confirmation handoff.
|
|
130
|
+
- L1 must be an independently acceptable vertical slice, not a mechanical step
|
|
131
|
+
checklist.
|
|
128
132
|
- `execution_policy.task_tree` must record the L1 checklist and execution state.
|
|
129
133
|
- Every task node must have Green / Yellow / Red risk.
|
|
134
|
+
- Yellow only permits local low-risk correction inside the current L1/L2. It
|
|
135
|
+
must not change public interfaces, data models, permissions, security,
|
|
136
|
+
architecture direction, or acceptance.
|
|
130
137
|
- Every checkpoint must include evidence; a purely subjective Green is not valid.
|
|
131
138
|
- Red must stop for human confirmation.
|
|
132
|
-
- Any product direction, core architecture,
|
|
133
|
-
account, permission, large deletion, core
|
|
134
|
-
must stop.
|
|
139
|
+
- Any product direction, core architecture, public API, persistent data
|
|
140
|
+
structure, security, payment, account, permission, large deletion, core
|
|
141
|
+
rewrite, or high-cost option choice must stop.
|
|
135
142
|
- Any need to expand scope, permission, commands, network access, or acceptance
|
|
136
143
|
must stop.
|
|
144
|
+
- `task_tree` write-back should happen at L1 start/done, Red, blocked, scope
|
|
145
|
+
change, and final wrap-up; do not write back every tiny L3 operation.
|
|
137
146
|
|
|
138
147
|
The AI infers goal, scope, acceptance, and permissions, but must not cross
|
|
139
148
|
project rules, explicit human limits, `permission.modify.denied`, security
|
|
package/template/en/ai/template/rules/output.md
CHANGED
|
@@ -25,7 +25,10 @@ verification, assumptions, issues, next steps, and runtime update proposals.
|
|
|
25
25
|
|
|
26
26
|
## Result Markdown
|
|
27
27
|
|
|
28
|
-
`ai/project/result.md` is the human-readable summary. Keep it short
|
|
28
|
+
`ai/project/result.md` is the human-readable summary. Keep it short and use the
|
|
29
|
+
installed language from `ai/template/LANG` by default. In the English template,
|
|
30
|
+
headings and prose should default to English; preserve code, commands, file
|
|
31
|
+
paths, and protocol field names as written.
|
|
29
32
|
|
|
30
33
|
```md
|
|
31
34
|
## Status
|
|
package/template/zh/ai/project/runtime.md
CHANGED
|
@@ -3,15 +3,15 @@
|
|
|
3
3
|
## Current Status
|
|
4
4
|
|
|
5
5
|
- Stage: closing out consistency between the direction layer and the execution layer
|
|
6
|
-
-
|
|
6
|
+
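- Focus: keep the protocol installable, upgradable, and auditable, and keep direction governance consistent with execution constraints.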
|
|
7
7
|
- Blockers: none
|
|
8
8
|
- Known risks:
|
|
9
9
|
- Going beyond the current task scope
|
|
10
|
-
-
|
|
10
|
+
- Asking about details that can be safely inferred
|
|
11
11
|
- Polluting the runtime context with historical process notes
|
|
12
12
|
- Marking success without verification evidence
|
|
13
13
|
- Running commands outside explicit permissions
|
|
14
|
-
-
|
|
14
|
+
- Saving tokens only, ignoring the quality achievable at acceptable cost
|
|
15
15
|
- The direction layer has been upgraded but the rules, runtime, or doctor still carry the old semantics
|
|
16
16
|
|
|
17
17
|
## Hard Rules
|
|
@@ -36,8 +36,8 @@
|
|
|
36
36
|
|
|
37
37
|
## Project Constraints
|
|
38
38
|
|
|
39
|
-
-
|
|
40
|
-
-
|
|
39
|
+
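- Bootstrap read set: `bootstrap.md`, `protocol.md`, `rules/core.md`, root docs, manifests, project docs, refs; when docs are insufficient, inspect the source structure to a limited extent.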
|
|
40
|
+
- Execution read set: `prompt.md`, `protocol.md`, `rules/core.md`, `project.md`, `runtime.md`, `task.md`.
|
|
41
41
|
- `ai/project/refs/` files are loaded only when the task requires them or the task type triggers them.
|
|
42
42
|
- `ai/project/refs/final-shape.md`, `module-map.md`, and `roadmap.md` are formal direction-layer documents.
|
|
43
43
|
- Formal direction-layer documents must not be modified directly by ordinary reconcile or ordinary execution tasks.
|
|
@@ -53,9 +53,9 @@
|
|
|
53
53
|
## Current Context
|
|
54
54
|
|
|
55
55
|
This project is a protocol / template, not a complex Agent framework.
|
|
56
|
-
|
|
57
|
-
|
|
58
|
-
|
|
56
|
+
Positioning: project direction governance plus an auditable task execution protocol for AI Coding Agents.
|
|
57
|
+
Goal: reduce human interaction and input, make tasks more precise over time, and reduce long-term direction drift.
|
|
58
|
+
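A small amount of CLI that serves protocol adoption and the governance loop may be added; do not introduce UI, cloud sync, or multi-Agent orchestration.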
|
|
59
59
|
|
|
60
60
|
## Reference Routing
|
|
61
61
|
|
|
@@ -69,6 +69,6 @@
|
|
|
69
69
|
|
|
70
70
|
## Runtime Update Governance
|
|
71
71
|
|
|
72
|
-
Unless `
|
|
73
|
-
|
|
74
|
-
|
|
72
|
+
Unless `task.md` explicitly allows it, the AI must not update this file directly.
|
|
73
|
+
When a task produces long-term context, write a proposal to `result.json.runtime_update`.
|
|
74
|
+
Runtime updates should be applied by a separate task whose sole target is `ai/project/runtime.md`.
|