ai-spec-dev 0.33.0 → 0.36.1
- package/.claude/commands/add-lesson.md +34 -0
- package/.claude/commands/check-layers.md +65 -0
- package/.claude/commands/installed-deps.md +35 -0
- package/.claude/commands/recall-lessons.md +40 -0
- package/.claude/commands/scan-singletons.md +45 -0
- package/.claude/commands/verify-imports.md +48 -0
- package/.claude/settings.local.json +11 -1
- package/README.md +531 -213
- package/RELEASE_LOG.md +424 -0
- package/cli/commands/config.ts +18 -0
- package/cli/commands/create.ts +1248 -0
- package/cli/commands/dashboard.ts +62 -0
- package/cli/commands/init.ts +45 -8
- package/cli/commands/mock.ts +175 -0
- package/cli/commands/scan.ts +99 -0
- package/cli/commands/types.ts +69 -0
- package/cli/commands/vcr.ts +70 -0
- package/cli/index.ts +34 -2517
- package/cli/utils.ts +4 -0
- package/core/code-generator.ts +6 -4
- package/core/combined-generator.ts +13 -3
- package/core/dashboard-generator.ts +340 -0
- package/core/design-dialogue.ts +124 -0
- package/core/dsl-extractor.ts +9 -1
- package/core/dsl-feedback.ts +41 -5
- package/core/dsl-validator.ts +32 -0
- package/core/error-feedback.ts +46 -2
- package/core/key-store.ts +5 -4
- package/core/project-index.ts +301 -0
- package/core/provider-utils.ts +39 -4
- package/core/reviewer.ts +84 -6
- package/core/run-logger.ts +109 -3
- package/core/run-trend.ts +24 -4
- package/core/self-evaluator.ts +39 -11
- package/core/spec-generator.ts +14 -8
- package/core/task-generator.ts +17 -0
- package/core/types-generator.ts +219 -0
- package/core/vcr.ts +210 -0
- package/dist/cli/index.js +7407 -5643
- package/dist/cli/index.js.map +1 -1
- package/dist/cli/index.mjs +7401 -5637
- package/dist/cli/index.mjs.map +1 -1
- package/dist/index.d.mts +34 -5
- package/dist/index.d.ts +34 -5
- package/dist/index.js +497 -232
- package/dist/index.js.map +1 -1
- package/dist/index.mjs +495 -233
- package/dist/index.mjs.map +1 -1
- package/docs-assets/purpose/architecture-overview.svg +64 -0
- package/docs-assets/purpose/create-pipeline.svg +113 -0
- package/docs-assets/purpose/task-layering.svg +74 -0
- package/package.json +1 -1
- package/prompts/codegen.prompt.ts +97 -9
- package/prompts/design.prompt.ts +59 -0
- package/prompts/spec.prompt.ts +8 -1
- package/prompts/tasks.prompt.ts +27 -2
- package/purpose.md +600 -174
- package/tests/code-generator.test.ts +253 -0
- package/tests/context-loader.test.ts +207 -0
- package/tests/dsl-validator.test.ts +105 -0
- package/tests/openapi-exporter.test.ts +310 -0
- package/tests/reviewer.test.ts +214 -0
- package/tests/spec-generator.test.ts +228 -0
- package/tests/spec-versioning.test.ts +205 -0
package/RELEASE_LOG.md
CHANGED
|
@@ -1,5 +1,350 @@
|
|
|
1
1
|
# Release Log
|
|
2
2
|
|
|
3
|
+
<details open>
|
|
4
|
+
<summary>中文</summary>
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## v0.36.1 — 2026-04-02 — P0 test coverage + hard quality gates + error-experience improvements
|
|
9
|
+
|
|
10
|
+
### New tests (Week 2)
|
|
11
|
+
|
|
12
|
+
**Test 4 — `context-loader.test.ts` (19 tests)**
|
|
13
|
+
|
|
14
|
+
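Covers `isFrontendDeps` (React/Vue/Next/Nuxt/Svelte/backend-only/empty array) and the `ContextLoader` class (context loading for the three project types Node.js/PHP/Java, Prisma schema reading, constitution loading, API structure scanning, shared config file discovery, error pattern extraction, empty-directory tolerance).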
|
|
15
|
+
|
|
16
|
+
**Test 5 — `openapi-exporter.test.ts` (27 tests)**
|
|
17
|
+
|
|
18
|
+
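Covers `dslToOpenApi` (structural integrity, info metadata, custom server URL, path parameter normalization `:id`→`{id}`, request body generation, error responses, automatic 401 injection for authenticated endpoints, security scheme generation, model schema mapping, required-field marking, 204 no-content responses, no-auth scenarios), type mapping (String/Int/Float/Boolean/DateTime/email/password/$ref), and `exportOpenApi` (YAML/JSON formats, custom output path, custom server URL).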
|
|
19
|
+
|
|
20
|
+
**Test 6 — `spec-versioning.test.ts` (26 tests)**
|
|
21
|
+
|
|
22
|
+
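Covers `slugify` (English conversion, special characters, CJK handling, length limit, empty-input fallback, hyphen collapsing), `computeDiff` (identical/added/removed/modified/empty text/large-file fallback/line-type correctness), `findLatestVersion` (nonexistent directory, no matching files, single version, latest of multiple versions, isolation between different slugs, regex special characters), and `nextVersionPath` (no versions/existing v1/skipped version numbers).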
|
|
23
|
+
|
|
24
|
+
**Test coverage increase: 30% → 37.5% (12 → 15 modules with tests, 259 → 331 test cases)**
|
|
25
|
+
|
|
26
|
+
### Hard quality gates (Week 3)
|
|
27
|
+
|
|
28
|
+
**Feature 1 — Harness Score blocking gate (`cli/commands/create.ts`, `cli/utils.ts`)**
|
|
29
|
+
|
|
30
|
+
- Added the `minHarnessScore` config option (`.ai-spec.json`, default 0 = disabled)
|
|
31
|
+
- After the self-evaluation stage (Step 10), when `harnessScore < minHarnessScore` and `--force` is not used, prints a threshold notice and calls `exit(1)`
|
|
32
|
+
- Like `minSpecScore`, the gate can be bypassed with `--force`
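
The gate rule described in the bullets above can be sketched as a small predicate. This is a minimal illustration, not the code in `cli/commands/create.ts`; the function name `shouldBlockOnHarnessScore` and the `GateInput` shape are hypothetical.

```typescript
// Hypothetical helper illustrating the documented rule only.
interface GateInput {
  harnessScore: number;
  minHarnessScore: number; // 0 means the gate is disabled
  force: boolean; // mirrors the --force flag
}

function shouldBlockOnHarnessScore(input: GateInput): boolean {
  if (input.minHarnessScore <= 0) return false; // gate disabled by default
  if (input.force) return false; // explicit bypass
  return input.harnessScore < input.minHarnessScore;
}
```

A caller would print the threshold notice and exit with a non-zero code only when this returns `true`.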
|
|
33
|
+
|
|
34
|
+
**Feature 2 — Configurable Error Feedback cycles (`cli/commands/create.ts`, `cli/utils.ts`)**
|
|
35
|
+
|
|
36
|
+
- Added the `maxErrorCycles` config option (default 2, default 3 in TDD mode, range 1-10)
|
|
37
|
+
- Replaces the previously hard-coded `maxCycles: opts.tdd ? 3 : 2` with a read of `config.maxErrorCycles`
|
|
38
|
+
|
|
39
|
+
**Feature 3 — Config command enhancements (`cli/commands/config.ts`)**
|
|
40
|
+
|
|
41
|
+
- Added the `--min-harness-score <score>` and `--max-error-cycles <n>` CLI options
|
|
42
|
+
- Includes input range validation (0-10 / 1-10)
|
|
43
|
+
|
|
44
|
+
### Error-experience improvements (Week 4)
|
|
45
|
+
|
|
46
|
+
**Enhancement 1 — Improved provider error messages (`core/provider-utils.ts`)**
|
|
47
|
+
|
|
48
|
+
- **Auth errors (401/403)**: suggests checking API key validity and running `ai-spec model` to reconfigure
|
|
49
|
+
- **Rate limit (429)**: suggests waiting or switching providers, and checking the billing dashboard
|
|
50
|
+
- **Network errors**: suggests checking the connection and proxy settings
|
|
51
|
+
- **Model not found**: suggests running `ai-spec model` to see available models
|
|
52
|
+
- **Insufficient balance/quota**: suggests checking the billing dashboard and switching providers
|
|
53
|
+
|
|
54
|
+
**Enhancement 2 — Improved DSL extraction failure diagnostics (`core/dsl-extractor.ts`)**
|
|
55
|
+
|
|
56
|
+
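- On JSON parse failure, prints a preview of the first 500 characters of the raw AI response, making it easier to tell a prompt problem from a model capability problem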
|
|
57
|
+
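- When a spec exceeds 12K characters and is truncated, **immediately** prints a yellow warning (instead of truncating silently) to alert the user that details may be lost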
|
|
58
|
+
|
|
59
|
+
**Enhancement 3 — Key Store read tolerance (`core/key-store.ts`)**
|
|
60
|
+
|
|
61
|
+
- When reading a corrupted key store file, prints a specific error message (instead of silently ignoring it)
|
|
62
|
+
|
|
63
|
+
---
|
|
64
|
+
|
|
65
|
+
## v0.36.0 — 2026-04-01 — Security fixes + core module test coverage
|
|
66
|
+
|
|
67
|
+
### Security fixes
|
|
68
|
+
|
|
69
|
+
**Fix 1 — Shell command injection protection (`core/code-generator.ts`)**
|
|
70
|
+
|
|
71
|
+
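When `execSync` passed prompt content by concatenating a shell string, only the `"` character was escaped; shell metacharacters such as `$`, `;`, `|`, and `&` were not handled, which created a command injection risk.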
|
|
72
|
+
|
|
73
|
+
- Replaced `execSync(\`\${claudeCmd} -p "..."\`)` with `spawnSync(claudeCmd, ["-p", promptContent], { shell: false })` (2 occurrences)
|
|
74
|
+
- The array form of `spawnSync` bypasses shell parsing, eliminating the possibility of injection
|
|
75
|
+
- Added the `spawnSync` import (`child_process`)
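
To illustrate why the argument-array form is safe, here is a minimal sketch using Node's built-in `child_process`. The wrapper name `runWithPrompt` is hypothetical and is not taken from `core/code-generator.ts`; only the `spawnSync(cmd, ["-p", prompt], { shell: false })` call shape comes from the changelog.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical wrapper: the prompt travels as a single argv entry,
// so no shell ever interprets `$`, `;`, `|`, or `&` inside it.
function runWithPrompt(command: string, promptContent: string): string {
  const result = spawnSync(command, ["-p", promptContent], {
    shell: false,
    encoding: "utf8",
  });
  if (result.error) throw result.error;
  return result.stdout;
}
```

With a shell-concatenated string, a prompt such as `$HOME; echo injected` would be expanded and split into two commands; with the array form it reaches the child process byte for byte.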
|
|
76
|
+
|
|
77
|
+
**Fix 2 — API key storage permission ordering (`core/key-store.ts`)**
|
|
78
|
+
|
|
79
|
+
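Previously `writeJson()` ran before `chmod(0o600)`, leaving a brief window between the write and the permission change in which other processes could read the plaintext key.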
|
|
80
|
+
|
|
81
|
+
- Changed the order to `ensureFile()` → `chmod(0o600)` → `writeJson()`, so file permissions are set before any sensitive data is written
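
The create, restrict, then write ordering can be sketched with Node's built-in `fs` module. This is an assumption-laden illustration: the package appears to use `ensureFile`/`writeJson` helpers, whereas the sketch below substitutes standard-library calls, and `writeSecretFile` is a hypothetical name.

```typescript
import * as fs from "node:fs";

// Hypothetical helper using only the standard library.
function writeSecretFile(filePath: string, data: unknown): void {
  // 1. Make sure the file exists, without writing any content yet.
  fs.closeSync(fs.openSync(filePath, "a", 0o600));
  // 2. Restrict permissions first (this also covers a pre-existing file).
  fs.chmodSync(filePath, 0o600);
  // 3. Only now write the sensitive payload.
  fs.writeFileSync(filePath, JSON.stringify(data, null, 2));
}
```

On POSIX systems the file is owner-readable only by the time the key material lands on disk.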
|
|
82
|
+
|
|
83
|
+
### New tests
|
|
84
|
+
|
|
85
|
+
**Test 1 — `spec-generator.test.ts` (23 tests)**
|
|
86
|
+
|
|
87
|
+
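Covers `PROVIDER_CATALOG` structural integrity, the `createProvider` factory function (9 provider branches + custom model + unknown-provider exception), and the `SpecGenerator` prompt construction logic (architecture decision injection, constitution priority, context truncation limits).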
|
|
88
|
+
|
|
89
|
+
**Test 2 — `reviewer.test.ts` (19 tests)**
|
|
90
|
+
|
|
91
|
+
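Covers `extractComplianceScore` (integer/decimal/case/empty string/multiline/multiple matches), `extractMissingCount` (normal/case/missing/multiline), and the `CodeReviewer` class (empty diff handling, multi-pass call verification, missing-file tolerance, large-file truncation, history trend rendering).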
|
|
92
|
+
|
|
93
|
+
**Test 3 — `code-generator.test.ts` (23 tests)**
|
|
94
|
+
|
|
95
|
+
Covers `extractBehavioralContract` (interface/enum/type/function/const/class/abstract class/defineStore/return block/export default/nested braces/throw capture limit/no-export fallback) and `printTaskProgress` (percentage calculation/run mode/skip mode/0 total/unknown layer).
|
|
96
|
+
|
|
97
|
+
**Test coverage increase: 22.5% → 30% (9 → 12 modules with tests, 251 → 259 test cases)**
|
|
98
|
+
|
|
99
|
+
- `extractBehavioralContract` changed from private to `export` (`core/code-generator.ts`) to support direct unit testing
|
|
100
|
+
|
|
101
|
+
### DSL validation enhancements
|
|
102
|
+
|
|
103
|
+
**Fix 3 — Endpoint ID uniqueness check (`core/dsl-validator.ts`)**
|
|
104
|
+
|
|
105
|
+
The AI often generates duplicate endpoint IDs (e.g. two `EP-001`), which causes overwrite conflicts in downstream DSL consumers (types-generator, mock-server, etc.).
|
|
106
|
+
|
|
107
|
+
- Added a `Set<string>` duplicate check to the endpoints validation stage; duplicate IDs are reported with their exact location (`endpoints[N].id`)
|
|
108
|
+
- Added 4 test cases (unique IDs pass, duplicate IDs rejected, path location correct, multiple duplicate groups detected)
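
A `Set<string>`-based uniqueness check of the kind described above might look like the following sketch. The `Endpoint` shape, the function name, and the message wording are assumptions; only the `endpoints[N].id` path format comes from the changelog.

```typescript
// Minimal assumed shape: only the `id` field matters for this check.
interface Endpoint {
  id: string;
}

function findDuplicateEndpointIds(endpoints: Endpoint[]): string[] {
  const seen = new Set<string>();
  const errors: string[] = [];
  endpoints.forEach((endpoint, index) => {
    if (seen.has(endpoint.id)) {
      // Report the position of the repeated occurrence.
      errors.push(`endpoints[${index}].id: duplicate ID ${endpoint.id}`);
    } else {
      seen.add(endpoint.id);
    }
  });
  return errors;
}
```

The same pattern applies per model for field names, with a fresh `Set` created for each model so that `User.id` and `Post.id` do not collide.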
|
|
109
|
+
|
|
110
|
+
**Fix 4 — Model field name uniqueness check (`core/dsl-validator.ts`)**
|
|
111
|
+
|
|
112
|
+
Duplicate field names within the same model (e.g. two `id`) cause conflicts when generating the Prisma schema or TypeScript interfaces.
|
|
113
|
+
|
|
114
|
+
- Added a `Set<string>` duplicate check inside `validateModel`; duplicate fields within the same model are reported with their exact location
|
|
115
|
+
- Fields with the same name are allowed across different models (e.g. `User.id` and `Post.id`)
|
|
116
|
+
- Added 4 test cases
|
|
117
|
+
|
|
118
|
+
**Fix 5 — `missing_errors` false-positive fix (`core/dsl-feedback.ts`)**
|
|
119
|
+
|
|
120
|
+
Previous logic: a gap was flagged whenever any endpoint lacked errors and there were ≥ 2 endpoints in total. This produced false positives when some endpoints already had errors.
|
|
121
|
+
|
|
122
|
+
- Changed to: flag the `missing_errors` gap only when **all** endpoints lack errors
|
|
123
|
+
- Fixed an existing failing test in `dsl-feedback.test.ts`
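
The difference between the old and new rule comes down to `some` versus `every`. The sketch below is illustrative only: the `EndpointLike` shape and both function names are hypothetical, and the changelog does not say whether the "≥ 2 endpoints" threshold is kept in the new rule, so it is left out here.

```typescript
// Assumed minimal shape for an endpoint's error list.
interface EndpointLike {
  errors?: unknown[];
}

const lacksErrors = (endpoint: EndpointLike): boolean =>
  !endpoint.errors || endpoint.errors.length === 0;

// Old rule as described: any endpoint without errors, given at least 2 endpoints.
function oldMissingErrorsGap(endpoints: EndpointLike[]): boolean {
  return endpoints.length >= 2 && endpoints.some(lacksErrors);
}

// New rule as described: only when every endpoint lacks errors.
// The non-empty guard is an assumption, since `every` is true for [].
function newMissingErrorsGap(endpoints: EndpointLike[]): boolean {
  return endpoints.length > 0 && endpoints.every(lacksErrors);
}
```

A DSL where one endpoint declares errors and another does not is flagged by the old rule but not by the new one.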
|
|
124
|
+
|
|
125
|
+
---
|
|
126
|
+
|
|
127
|
+
## [Unreleased] 2026-04-01 — P1 task verification steps + P2 design options dialogue
|
|
128
|
+
|
|
129
|
+
### Added / Enhanced
|
|
130
|
+
|
|
131
|
+
**Feature 1 — Task verificationSteps (`core/task-generator.ts`, `prompts/tasks.prompt.ts`, `core/combined-generator.ts`)**
|
|
132
|
+
|
|
133
|
+
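Inspired by Superpowers writing-plans, each Task gains a `verificationSteps` field that requires concrete, executable verification commands plus expected results, preventing vague "works correctly"-style acceptance criteria.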
|
|
134
|
+
|
|
135
|
+
- `SpecTask` gains `verificationSteps: string[]`, meaning "the how to verify" (as distinct from "the what" of `acceptanceCriteria`)
|
|
136
|
+
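- `tasksSystemPrompt` gains a verificationSteps rules section: each step must be a concrete command plus an observable expected result, with Good/Bad examples given; 2-5 steps per task are required; backend tasks must include an HTTP check, and frontend tasks must include UI render + state checks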
|
|
137
|
+
- The inline tasks instruction in `combined-generator.ts` is updated in sync to include the `verificationSteps` field definition
|
|
138
|
+
- `printTasks` prints the first 2 verificationSteps of each task (in gray) and shows "+ N more" when there are more than 2
|
|
139
|
+
|
|
140
|
+
**Feature 2 — Design Options Dialogue (`core/design-dialogue.ts`, `prompts/design.prompt.ts`, `cli/commands/create.ts`)**
|
|
141
|
+
|
|
142
|
+
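Inspired by Superpowers brainstorming, a new Step 1.5 runs before spec generation: the AI proposes 2-3 architecture options for the user to choose from, and the chosen option is injected into the spec prompt as a constraint, so that a wrong direction is not discovered only after the spec has been generated.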
|
|
143
|
+
|
|
144
|
+
- `prompts/design.prompt.ts` — `designOptionsSystemPrompt`: each option includes Approach / Trade-offs (2-3 items) / Best when, is kept short (≤ 2 minutes of reading), and comes with a Recommended suggestion
|
|
145
|
+
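- `core/design-dialogue.ts` — `DesignDialogue` class: presents the proposals → the user chooses (Option A/B/C / Blend / Skip) → Blend mode has the AI merge multiple options; parses the option labels in the AI output and extracts the full text of the chosen option (up to 400 characters) for injection into the spec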
|
|
146
|
+
- `create.ts` Step 1.5: runs after context load completes and before spec generation; skipped automatically with `--fast` / `--auto` / `--vcr-replay`; `architectureDecision` is passed to `generateSpecWithTasks` and `SpecGenerator.generateSpec`
|
|
147
|
+
- `combined-generator.ts` / `spec-generator.ts` both gain an `architectureDecision?: string` parameter, injected into the prompt as an `=== Architecture Decision ===` section
|
|
148
|
+
|
|
149
|
+
---
|
|
150
|
+
|
|
151
|
+
## [Unreleased] 2026-04-01 — Pass 0 Spec Compliance Check + project index + anti-hallucination skills
|
|
152
|
+
|
|
153
|
+
### Added / Enhanced
|
|
154
|
+
|
|
155
|
+
**Feature 1 — Pass 0 Spec Compliance Check (`prompts/codegen.prompt.ts`, `core/reviewer.ts`, `core/self-evaluator.ts`)**
|
|
156
|
+
|
|
157
|
+
Inspired by the Superpowers workflow, a dedicated spec compliance check, Pass 0, is added before the existing 3-pass review.
|
|
158
|
+
|
|
159
|
+
- `specComplianceSystemPrompt` — exhaustive audit: extracts every requirement from the spec (endpoints, models, business rules, auth, error cases, side effects), marks each one ✅ / ⚠️ / ❌, and outputs `ComplianceScore: X/10` plus a Blockers list
|
|
160
|
+
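- The Pass 1 architecture review drops its former "are all requirements covered" clause, since Pass 0 now handles it; Pass 1 focuses on layer separation / contract design / security posture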
|
|
161
|
+
- The Pass 1 prompt injects the Pass 0 compliance report as context to avoid duplicate findings
|
|
162
|
+
- `extractComplianceScore` / `extractMissingCount` are publicly exported for downstream consumers
|
|
163
|
+
- `create.ts` prints the compliance score and the missing-requirement count immediately after `stageEnd("review")`
|
|
164
|
+
- `SelfEvalResult` gains a `complianceScore` field; harnessScore weights updated: when both compliance and review are available, compliance 0.30 · dsl 0.25 · compile 0.20 · review 0.25
|
|
165
|
+
- `printSelfEval` output gains a `Compliance: X/10` line, shown with a red ⚠ when below 6
|
|
166
|
+
- Review History records gain a `complianceScore` field
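
The weighting listed above for the case where both compliance and review scores are available amounts to a weighted sum. The sketch below covers only that documented case; the `Scores` shape and function name are hypothetical, and how weights are redistributed when a component is missing is not stated in the changelog.

```typescript
// Component scores on a 0-10 scale (assumed shape).
interface Scores {
  compliance: number;
  dsl: number;
  compile: number;
  review: number;
}

// Weights as listed in the changelog; they sum to 1.0.
const WEIGHTS = { compliance: 0.3, dsl: 0.25, compile: 0.2, review: 0.25 } as const;

function weightedHarnessScore(scores: Scores): number {
  return (
    scores.compliance * WEIGHTS.compliance +
    scores.dsl * WEIGHTS.dsl +
    scores.compile * WEIGHTS.compile +
    scores.review * WEIGHTS.review
  );
}
```

Because the weights sum to 1.0, the result stays on the same 0-10 scale as the inputs.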
|
|
167
|
+
|
|
168
|
+
**Feature 2 — Project index `ai-spec scan` (`core/project-index.ts`, `cli/commands/scan.ts`)**
|
|
169
|
+
|
|
170
|
+
- `core/project-index.ts` — scans all subprojects under the root directory (recognizing manifests such as `package.json` / `go.mod` / `Cargo.toml` / `pom.xml`) and persists them to `.ai-spec-index.json`
|
|
171
|
+
- Incremental logic: new project → add `firstSeen`; existing project → update `techStack / hasConstitution / lastSeen`; directory gone → mark `missing:true` (the record is not deleted)
|
|
172
|
+
- Git worktree filtering: a directory is skipped when its `.git` is a file (not a directory), so that worktrees generated by ai-spec are not mistaken for projects
|
|
173
|
+
- `ai-spec scan` — scans and prints a change summary (added / updated / unchanged / missing)
|
|
174
|
+
- `ai-spec scan --list` — shows the current index without rescanning
|
|
175
|
+
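- `ai-spec init --global` integration: reads the index first and, for each active project, extracts type / techStack / the first 2000 characters of constitution §1-§6 as multi-project context for global constitution generation; when no index exists, falls back and prompts the user to run `scan` first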
|
|
176
|
+
|
|
177
|
+
**Feature 3 — Anti-hallucination skill files (`.claude/commands/`)**
|
|
178
|
+
|
|
179
|
+
Five reusable Claude Code slash command skills distilled from ai-spec's existing anti-hallucination design, for sharing across the team:
|
|
180
|
+
|
|
181
|
+
- `/scan-singletons` — scans all singleton config files in the project (i18n / constants / routes / store-index) and outputs a "modify only, never recreate" list
|
|
182
|
+
- `/add-lesson` — writes review findings into constitution §9, with deduplication (comparing the first 60 characters) + categorization (bug/security/pattern/perf/convention) + timestamp
|
|
183
|
+
- `/installed-deps` — lists all dependencies in `package.json` as a codegen whitelist, with hints that flag ambiguity with commonly used alternatives
|
|
184
|
+
- `/recall-lessons` — reads §9, then filters and displays past lessons by relevance to the current task (High/Medium/Low)
|
|
185
|
+
- `/verify-imports` — verifies all import paths in a file (alias resolution + relative paths + package-name whitelist) and outputs broken imports with suggested fixes
|
|
186
|
+
|
|
187
|
+
---
|
|
188
|
+
|
|
189
|
+
## [Unreleased] 2026-04-01 — VCR record/replay + async §9 + Approval Gate enhancements
|
|
190
|
+
|
|
191
|
+
### Added / Enhanced
|
|
192
|
+
|
|
193
|
+
**Feature 1 — VCR recording & zero-cost replay (`core/vcr.ts`, `cli/commands/vcr.ts`, `cli/commands/create.ts`)**
|
|
194
|
+
|
|
195
|
+
Inspired by Claude Code's VCR token-counting test mode, all AI responses are recorded as JSON snapshots that can be replayed offline with no API calls.
|
|
196
|
+
|
|
197
|
+
- `VcrRecordingProvider` — transparently wraps any `AIProvider`, intercepting each `generate()` call and recording `(index, promptPreview, callHash, response, providerName, modelName, ts, durationMs)`; `save()` supports merging the spec and codegen recorders and sorting by timestamp
|
|
198
|
+
- `VcrReplayProvider` — returns pre-recorded responses in order and ignores the incoming prompt; throws an explicit error when the recording is exhausted
|
|
199
|
+
- Snapshots are stored in `.ai-spec-vcr/{runId}.json`, using the same `runId` as the RunLog so the two can be cross-referenced
|
|
200
|
+
- `ai-spec vcr list` — lists all recordings (runId, number of AI calls, provider/model, recording date)
|
|
201
|
+
- `ai-spec vcr show <runId>` — shows the promptPreview + callHash + duration of each AI call, one by one
|
|
202
|
+
- CLI options: `--vcr-record` (record the current run), `--vcr-replay <runId>` (replay with zero API calls)
|
|
203
|
+
- Implements the fire-and-await pattern: the spec and codegen providers are wrapped separately, then merged and saved together at the end of the run
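
The replay side of this design can be sketched in a few lines. The `AIProvider` interface below is an assumed minimal shape (a single `generate(prompt)` method returning a string), and `ReplayProvider` is a hypothetical stand-in, not the `VcrReplayProvider` class in `core/vcr.ts`.

```typescript
// Assumed minimal provider contract for illustration.
interface AIProvider {
  generate(prompt: string): Promise<string>;
}

class ReplayProvider implements AIProvider {
  private cursor = 0;
  private readonly responses: string[];

  constructor(responses: string[]) {
    this.responses = responses;
  }

  // The prompt is ignored on purpose: responses come back in recorded order.
  async generate(_prompt: string): Promise<string> {
    const response = this.responses[this.cursor];
    if (response === undefined) {
      throw new Error(`VCR recording exhausted after ${this.responses.length} call(s)`);
    }
    this.cursor += 1;
    return response;
  }
}
```

Ordered replay keeps the implementation trivial, at the cost of requiring the replayed run to issue calls in the same sequence as the recorded one.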
|
|
204
|
+
|
|
205
|
+
**Enhancement 1 — §9 knowledge accumulation made asynchronous via fire-and-await (`cli/commands/create.ts`)**
|
|
206
|
+
|
|
207
|
+
Previously `await accumulateReviewKnowledge(...)` blocked before the Loop 2 structural feedback, lengthening the critical path.
|
|
208
|
+
|
|
209
|
+
- The call now starts immediately and is `await`ed just before `runLogger.finish()` (fire-and-await pattern)
|
|
210
|
+
- The interactive Loop 2 structural analysis no longer waits for the §9 write, which makes for a smoother user experience
|
|
211
|
+
- Errors are printed via `.catch()` as `⚠ §9 accumulation failed: ...` and do not affect the main flow
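
The fire-and-await pattern described above can be sketched as follows. The function `runWithBackgroundTask` and its callback parameters are hypothetical; only the warning text and the start-early, await-late ordering come from the changelog.

```typescript
// Hypothetical orchestration helper illustrating fire-and-await.
async function runWithBackgroundTask(
  background: () => Promise<void>,
  interactiveStep: () => Promise<void>,
  log: (message: string) => void,
): Promise<void> {
  // Fire: start the work now and attach the error handler right away,
  // so a failure can never surface as an unhandled rejection.
  const pending = background().catch((err: unknown) => {
    const reason = err instanceof Error ? err.message : String(err);
    log(`⚠ §9 accumulation failed: ${reason}`);
  });

  // The interactive step proceeds without waiting for the background work.
  await interactiveStep();

  // Await: make sure the background work has settled before finishing.
  await pending;
}
```

Compared with plain fire-and-forget, the final `await` ensures the write has completed (or failed visibly) before the run is closed out.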
|
|
212
|
+
|
|
213
|
+
**Enhancement 2 — Approval Gate DSL scope estimate (`cli/commands/create.ts`)**
|
|
214
|
+
|
|
215
|
+
Previously the Approval Gate showed only line and word counts, which made it hard for users to judge the scale of code generation.
|
|
216
|
+
|
|
217
|
+
- Added inline `estimateFromSpec(spec)` logic (regex, no AI call): counts HTTP endpoints (`GET/POST/PUT/PATCH/DELETE /`) and data models (`## Model`, `**Xxx**:`) in the spec text
|
|
218
|
+
- The Approval Gate gains an `Est. DSL scope : ~N endpoint(s), ~M model(s) → ~K files` estimate line
|
|
219
|
+
- Gives users a quantified sense of the code generation scale before they click Proceed
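
A regex-only estimate along these lines might look like the sketch below. The two patterns are guesses based on the markers named in the changelog, not the expressions used in `create.ts`, and the file-count formula (`~K files`) is not documented, so it is omitted.

```typescript
// Illustrative patterns only; the real expressions may differ.
function estimateFromSpecSketch(spec: string): { endpoints: number; models: number } {
  // An HTTP verb followed by a path that starts with "/".
  const endpointPattern = /\b(?:GET|POST|PUT|PATCH|DELETE)\s+\//g;
  // Either a "## Model" heading or a bold PascalCase name followed by a colon.
  const modelPattern = /^## Model\b|^\*\*[A-Z]\w*\*\*:/gm;
  return {
    endpoints: (spec.match(endpointPattern) || []).length,
    models: (spec.match(modelPattern) || []).length,
  };
}
```

Since no AI call is involved, the estimate is effectively free and can be shown on every Approval Gate prompt.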
|
|
220
|
+
|
|
221
|
+
---
|
|
222
|
+
|
|
223
|
+
## [Unreleased] 2026-04-01 — Pipeline reliability hardening (part 2): JSONL crash recovery + circuit breaker + Token Budget
|
|
224
|
+
|
|
225
|
+
### Enhancements
|
|
226
|
+
|
|
227
|
+
**Enhancement 1 — RunLog JSONL Append-Only Shadow (`core/run-logger.ts`, `core/run-trend.ts`)**
|
|
228
|
+
|
|
229
|
+
The previous `RunLogger.flush()` was an asynchronous fire-and-forget full JSON rewrite, so a process crash lost the entire RunLog for that run.
|
|
230
|
+
|
|
231
|
+
- Added `appendJsonlLine(filePath, record)` — appends synchronously with `fs.appendFileSync`, ensuring each record is written to disk before execution continues
|
|
232
|
+
- `RunLogger` writes a `header` line to `{runId}.jsonl` immediately on construction; each `push()`, `stageFail()`, `setPromptHash()`, `setHarnessScore()`, `fileWritten()`, and `finish()` call appends a JSONL line of the corresponding type (`header` / `entry` / `error` / `file` / `meta` / `footer`)
|
|
233
|
+
- The existing full `.json` file is kept unchanged (the `trend`, `dashboard`, and `logs` consumers need no changes)
|
|
234
|
+
- Added `reconstructRunLogFromJsonl(path)` — rebuilds a `RunLog` line by line from the JSONL file, for crash recovery
|
|
235
|
+
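- `loadRunLogs()` gains an orphan `.jsonl` recovery path: when it finds a `.jsonl` file with no corresponding `.json`, it rebuilds the log automatically and includes it in the returned results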
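
The append-only shadow idea can be sketched with two small functions. The record shape and the `readJsonlRecords` helper are assumptions made for illustration; in particular, skipping an unparseable trailing line is this sketch's choice, and the changelog does not say how `reconstructRunLogFromJsonl` handles torn writes.

```typescript
import * as fs from "node:fs";

// Assumed record shape: a type tag plus arbitrary payload fields.
type JsonlRecord = { type: string; [key: string]: unknown };

// Synchronous append: the record is handed to the OS before the call returns.
function appendJsonlLine(filePath: string, record: JsonlRecord): void {
  fs.appendFileSync(filePath, JSON.stringify(record) + "\n");
}

// Hypothetical reader: parses line by line and skips anything unparseable,
// such as a final line cut short by a crash.
function readJsonlRecords(filePath: string): JsonlRecord[] {
  const records: JsonlRecord[] = [];
  for (const line of fs.readFileSync(filePath, "utf8").split("\n")) {
    if (line.trim() === "") continue;
    try {
      records.push(JSON.parse(line) as JsonlRecord);
    } catch {
      // Ignore a torn or corrupt line and keep what was recovered.
    }
  }
  return records;
}
```

Because each line is self-contained, everything written before a crash remains readable even when the full `.json` rewrite never happened.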
|
|
236
|
+
|
|
237
|
+
**Enhancement 2 — ErrorFeedback no-progress circuit breaker (`core/error-feedback.ts`)**
|
|
238
|
+
|
|
239
|
+
The previous fix loop had no "progress detection": even if the error count did not decrease after a fix, it would still use up all `maxCycles`.
|
|
240
|
+
|
|
241
|
+
- Added `prevErrorCount` to track the previous cycle's error count
|
|
242
|
+
- Re-checks after each fix: if `allErrors.length >= prevErrorCount` (the error count has not decreased), aborts immediately and prints an `⚠ Auto-fix made no progress` notice, so no further AI calls are wasted
|
|
243
|
+
- Reference: the comparable design in Claude Code, `MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3`, which prevents compact from looping forever
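
A no-progress circuit breaker of this kind can be sketched as a loop with injected `check` and `attemptFix` callbacks. The function name, callback signatures, and return shape are hypothetical; the `prevErrorCount` comparison and the warning text follow the changelog.

```typescript
interface FixLoopResult {
  cycles: number;
  remaining: string[];
  aborted: boolean;
}

// Hypothetical loop: `check` returns current errors, `attemptFix` tries to repair them.
async function runFixLoop(
  check: () => Promise<string[]>,
  attemptFix: (errors: string[]) => Promise<void>,
  maxCycles: number,
): Promise<FixLoopResult> {
  let errors = await check();
  let prevErrorCount = errors.length;
  let cycles = 0;
  while (errors.length > 0 && cycles < maxCycles) {
    await attemptFix(errors);
    cycles += 1;
    errors = await check();
    // Circuit breaker: stop as soon as a cycle fails to reduce the error count.
    if (errors.length > 0 && errors.length >= prevErrorCount) {
      console.warn("⚠ Auto-fix made no progress");
      return { cycles, remaining: errors, aborted: true };
    }
    prevErrorCount = errors.length;
  }
  return { cycles, remaining: errors, aborted: false };
}
```

With a fix that never helps and `maxCycles` set to 5, the loop stops after a single cycle instead of spending four more AI calls.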
|
|
244
|
+
|
|
245
|
+
**Enhancement 3 — Command Output + File Content Token Budget (`core/error-feedback.ts`)**
|
|
246
|
+
|
|
247
|
+
- Added `MAX_COMMAND_OUTPUT_CHARS = 50_000` (about 10K tokens): stderr/stdout returned by `runCommand` is truncated beyond this limit, preventing huge build output from filling the AI context
|
|
248
|
+
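- Added `MAX_FIX_FILE_CHARS = 60_000` (about 12K tokens): the `existingContent` sent to the AI in `attemptFix` is truncated beyond this limit with a notice appended; the AI can still locate problems via the error line numbers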
|
|
249
|
+
- Reference: Claude Code's `applyToolResultBudget(toolResult, maxTokens)` tool output budget design
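
A character budget like the two constants above can be applied with a small helper. This is a sketch under assumptions: the helper name, the notice wording, and the choice to keep the head of the text (rather than the tail) are not taken from `core/error-feedback.ts`.

```typescript
// Constant value as listed in the changelog; the helper itself is hypothetical.
const MAX_COMMAND_OUTPUT_CHARS = 50_000;

function truncateToBudget(text: string, maxChars: number = MAX_COMMAND_OUTPUT_CHARS): string {
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  // Keep the head and tell the model explicitly that content was cut.
  return text.slice(0, maxChars) + `\n... [truncated ${omitted} chars]`;
}
```

Appending an explicit notice matters: without it, the model has no way to tell a truncated log from a complete one.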
|
|
250
|
+
|
|
251
|
+
---
|
|
252
|
+
|
|
253
|
+
## [Unreleased] 2026-04-01 — Pipeline reliability hardening: structured Review Findings + §9 knowledge loop
|
|
254
|
+
|
|
255
|
+
### Enhancements
|
|
256
|
+
|
|
257
|
+
**Enhancement 1 — Loop 2 structural findings: regex → structured JSON (`prompts/codegen.prompt.ts`, `core/dsl-feedback.ts`)**
|
|
258
|
+
|
|
259
|
+
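The previous `extractStructuralFindings` parsed the AI-generated natural-language review with regexes; format drift caused silent misses, and it only covered 4 hard-coded kinds of keywords.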
|
|
260
|
+
|
|
261
|
+
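- The Pass 1 format in `reviewArchitectureSystemPrompt` gains a trailing `## 🔍 结构性发现 JSON` (structural findings JSON) section, which requires the AI to always output a `{"structuralFindings": [...]}` JSON block (an empty array when there are no findings); the category enum is unchanged: `auth_design` / `api_contract` / `model_design` / `layer_violation` / `other_design`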
|
|
262
|
+
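- `extractStructuralFindings` now **parses the JSON block first**: it extracts ` ```json{...}``` ` from the Pass 1 text, parses the `structuralFindings` array, and filters it with type guards; it falls back to the original regex logic only on parse failure or for old-format reviews (backward compatible)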
|
|
263
|
+
- Result: the issues Loop 2 picks up are no longer limited to a keyword list; any design issue explicitly called out by Pass 1 enters the feedback loop
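
The JSON-first extraction with a fallback signal can be sketched as follows. The finding shape (only `category` is checked) and the function names are assumptions; returning `null` stands in for "fall back to the legacy regex path", which is not reproduced here.

```typescript
// Category enum as listed in the changelog.
const CATEGORIES: readonly string[] = [
  "auth_design",
  "api_contract",
  "model_design",
  "layer_violation",
  "other_design",
];

// Assumed minimal finding shape; real findings likely carry more fields.
interface StructuralFinding {
  category: string;
  [key: string]: unknown;
}

function isStructuralFinding(value: unknown): value is StructuralFinding {
  if (typeof value !== "object" || value === null) return false;
  const category = (value as { category?: unknown }).category;
  return typeof category === "string" && CATEGORIES.includes(category);
}

// Built at runtime so the snippet never contains a literal code fence.
const FENCE = "`".repeat(3);
const JSON_BLOCK = new RegExp(FENCE + "json\\s*([\\s\\S]*?)" + FENCE);

// Returns null when no usable JSON block exists, signalling the caller to fall back.
function extractFindingsFromJsonBlock(reviewText: string): StructuralFinding[] | null {
  const match = reviewText.match(JSON_BLOCK);
  if (!match) return null;
  try {
    const parsed: unknown = JSON.parse(match[1] || "");
    if (typeof parsed !== "object" || parsed === null) return null;
    const findings = (parsed as { structuralFindings?: unknown }).structuralFindings;
    return Array.isArray(findings) ? findings.filter(isStructuralFinding) : null;
  } catch {
    return null;
  }
}
```

The type guard drops malformed entries individually, so one bad item does not discard the whole block.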
|
|
264
|
+
|
|
265
|
+
**Enhancement 2 — §9 knowledge accumulation now forms a true two-way loop (`prompts/spec.prompt.ts`, `core/reviewer.ts`)**
|
|
266
|
+
|
|
267
|
+
Two gaps in the previous implementation:
|
|
268
|
+
1. Although the constitution was injected into the spec generation prompt, `specPrompt` had no explicit instruction telling the AI to apply the §9 lessons
|
|
269
|
+
2. The three Reviewer passes did not read the constitution at all, so they could not check whether new code repeated the problems recorded in §9
|
|
270
|
+
|
|
271
|
+
Fix:
|
|
272
|
+
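- `specPrompt` gains a trailing CRITICAL instruction: if the constitution contains §9, the lessons must be reviewed one by one before spec generation; for directly relevant lessons, a note of the form "⚠ Based on past lessons: [brief description of how to avoid it]" is appended at the end of the §8 implementation notes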
|
|
273
|
+
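- `reviewer.ts` gains `loadAccumulatedLessons(projectRoot)` — reads the §9 section of the constitution (from `## 9. 积累教训` to the next `## \d` heading or EOF) and injects it into the Pass 1 arch review prompt (`=== §9 历史积累教训 ===`), so that the reviewer can cross-check whether new code reproduces known problems; if it does, the issue is written into the JSON findings block, which triggers Loop 2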
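
Extracting the §9 section as described (from its heading to the next numbered `## ` heading or the end of the file) can be sketched on a plain string. The function name `extractSection9` is hypothetical, and reading the constitution from disk is left out; only the heading text and the boundary rule come from the changelog.

```typescript
// Hypothetical helper operating on the constitution text.
function extractSection9(constitution: string): string | null {
  // Locate the §9 heading line.
  const start = /^## 9\. 积累教训.*$/m.exec(constitution);
  if (!start) return null;
  const rest = constitution.slice(start.index + start[0].length);
  // The section ends at the next numbered level-2 heading, or at EOF.
  const next = /^## \d/m.exec(rest);
  return (next ? rest.slice(0, next.index) : rest).trim();
}
```

Returning `null` when the heading is absent lets the caller skip the injection entirely for projects that have not accumulated any lessons yet.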
|
|
274
|
+
|
|
275
|
+
**Both loops are now truly connected end to end:**
|
|
276
|
+
```
|
|
277
|
+
Review → §9 written to constitution
|
|
278
|
+
↓
|
|
279
|
+
Next create spec → spec generation reads §9 → issue avoided at design time
|
|
280
|
+
↓
|
|
281
|
+
Review Pass 1 reads §9 → checks whether the code repeats the issue → if so, triggers Loop 2 to correct the DSL
|
|
282
|
+
```
|
|
283
|
+
|
|
284
|
+
---
|
|
285
|
+
|
|
286
|
+
## [Unreleased] 2026-03-31 — Documentation sync: README / purpose / RELEASE_LOG
|
|
287
|
+
|
|
288
|
+
### Documentation updates
|
|
289
|
+
|
|
290
|
+
- README home-page main flow synced to the latest architecture:
|
|
291
|
+
- Added **DSL Gap Feedback**
|
|
292
|
+
- Added **Review→DSL Loop**
|
|
293
|
+
- Clarified that `logs` / `trend` / `dashboard` consume the RunLog data from Harness Self-Eval
|
|
294
|
+
- Added a description of the DSL's downstream artifacts (`types` / `export` / `mock` / workspace contract injection)
|
|
295
|
+
- purpose document upgraded to the **v0.34.0** baseline:
|
|
296
|
+
- Version history overview now includes v0.32.0 / v0.33.0 / v0.34.0
|
|
297
|
+
- Added the "Two pipeline feedback loops" chapter
|
|
298
|
+
- Added the "Multi-outlet value of the DSL: types, Dashboard, and observability" chapter
|
|
299
|
+
- Full feature matrix extended to `types`, `logs`, `trend`, `dashboard`
|
|
300
|
+
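- The Mermaid flowcharts in purpose have been switched to **SVG images + a collapsed plain-text fallback**, for easier reading on documentation platforms that do not support Mermaid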
|
|
301
|
+
- RELEASE_LOG gains this documentation sync entry, keeping the product narrative consistent with the code
|
|
302
|
+
|
|
303
|
+
---
|
|
304
|
+
|
|
305
|
+
## [0.34.0] 2026-03-31 — Harness Dashboard + TypeScript type generation
|
|
306
|
+
|
|
307
|
+
### Added
|
|
308
|
+
|
|
309
|
+
**Feature 1 — `ai-spec dashboard` (`core/dashboard-generator.ts`, `cli/commands/dashboard.ts`)**
|
|
310
|
+
|
|
311
|
+
- Generates a static HTML Harness Dashboard in one step from the existing `.ai-spec-logs/` RunLog data
|
|
312
|
+
- Includes:
|
|
313
|
+
- Overview statistics (total runs / average score / compile pass rate)
|
|
314
|
+
- Score Trend line chart (SVG, the most recent 30 scored runs)
|
|
315
|
+
- Prompt version comparison table (avg / best / worst, current version highlighted)
|
|
316
|
+
- History of the last 10 runs (with score bars)
|
|
317
|
+
- Stage duration bar chart (average ms, top 8 stages)
|
|
318
|
+
- Top 5 error frequency statistics
|
|
319
|
+
- Zero external dependencies (pure inline CSS + SVG)
|
|
320
|
+
- `--open` option: opens the browser automatically after generation
|
|
321
|
+
|
|
322
|
+
```bash
|
|
323
|
+
ai-spec dashboard # generate .ai-spec/dashboard.html
|
|
324
|
+
ai-spec dashboard --open # open in the browser automatically after generation
|
|
325
|
+
ai-spec dashboard --last 20 # analyze only the most recent 20 runs
|
|
326
|
+
ai-spec dashboard --output ./report.html
|
|
327
|
+
```
|
|
328
|
+
|
|
329
|
+
---
|
|
330
|
+
|
|
331
|
+
**Feature 2 — `ai-spec types` (`core/types-generator.ts`, `cli/commands/types.ts`)**
|
|
332
|
+
|
|
333
|
+
- DSL → TypeScript type file that the frontend can import directly, with no hand-written types needed
|
|
334
|
+
- Generated content:
|
|
335
|
+
- All `models` → `export interface ModelName { ... }` (with optional/required markers and type mapping)
|
|
336
|
+
- All `endpoints.request.body/query/params` → `export interface PostXxxRequest { ... }`
|
|
337
|
+
- `export const API_ENDPOINTS = { ... } as const` (including method / path / auth)
|
|
338
|
+
- Frontend `components[].props` → `export interface ComponentNameProps { ... }`
|
|
339
|
+
- Type mapping: `String→string`, `Int/Float→number`, `Boolean→boolean`, `DateTime→string`, and a PascalCase name → a reference to that interface
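
The mapping table above can be expressed as a small function. This is an illustrative sketch rather than the logic in `core/types-generator.ts`; the function name is hypothetical, and the `unknown` fallback for unrecognized names is an assumption.

```typescript
// Hypothetical mapper following the documented table.
function mapDslTypeToTs(dslType: string): string {
  switch (dslType) {
    case "String":
    case "DateTime":
      return "string";
    case "Int":
    case "Float":
      return "number";
    case "Boolean":
      return "boolean";
    default:
      // A PascalCase name is treated as a reference to a generated interface.
      // Falling back to `unknown` otherwise is this sketch's choice.
      return /^[A-Z][A-Za-z0-9]*$/.test(dslType) ? dslType : "unknown";
  }
}
```

Mapping `DateTime` to `string` matches what a frontend actually receives over JSON, where dates arrive serialized.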
|
|
340
|
+
|
|
341
|
+
```bash
|
|
342
|
+
ai-spec types # generate .ai-spec/<feature>.types.ts
|
|
343
|
+
ai-spec types --stdout # print to stdout (suitable for piping)
|
|
344
|
+
ai-spec types --output src/types/api.ts
|
|
345
|
+
ai-spec types --no-endpoint-map # do not generate the API_ENDPOINTS constant
|
|
346
|
+
```
|
|
347
|
+
|
|
3
348
|
---
|
|
4
349
|
|
|
5
350
|
## [0.33.0] 2026-03-30 — Pipeline feedback loops: DSL Gap Loop + Review→DSL Loop
|
|
@@ -74,6 +419,13 @@ Spec → DSL extraction
|
|
|
74
419
|
|
|
75
420
|
---
|
|
76
421
|
|
|
422
|
+
**Internal refactor (2026-03-31) — CLI command split**
|
|
423
|
+
|
|
424
|
+
- `cli/index.ts` split from 2533 lines into a 42-line entry point + 13 standalone command files (`cli/commands/*.ts`) + a shared utility layer (`cli/utils.ts`)
|
|
425
|
+
- No user-visible functional changes; the compiled output is equivalent to what it was before the refactor
|
|
426
|
+
|
|
427
|
+
---
|
|
428
|
+
|
|
77
429
|
## [0.32.0] 2026-03-30 — Harness data loop: `trend` / `logs` commands + finer-grained DSL Coverage scoring
|
|
78
430
|
|
|
79
431
|
### Added
|
|
@@ -2285,3 +2637,75 @@ Spec + Tasks are produced in a single AI call (`core/combined-generator.ts`),
|
|
|
2285
2637
|
- `ai-spec review`: standalone code review command
|
|
2286
2638
|
- Supports three code generation modes: `claude-code` / `api` / `plan`
|
|
2287
2639
|
- Automatic project context scanning (package.json / Prisma schema / route files)
|
|
2640
|
+
|
|
2641
|
+
</details>
|
|
2642
|
+
|
|
2643
|
+
<details>
|
|
2644
|
+
<summary>English</summary>
|
|
2645
|
+
|
|
2646
|
+
# Release Log
|
|
2647
|
+
|
|
2648
|
+
This section provides an English companion view for the detailed Chinese changelog above. It keeps the same chronological direction while summarizing each version at a higher level for bilingual reading.
|
|
2649
|
+
|
|
2650
|
+
## Version Summary
|
|
2651
|
+
|
|
2652
|
+
- **Unreleased** — Synced README, purpose, and RELEASE_LOG to reflect the latest pipeline, feedback loops, SVG diagrams, and observability narrative.
|
|
2653
|
+
- **0.34.0** — Added a static Harness Dashboard and DSL-to-TypeScript type generation.
|
|
2654
|
+
- **0.33.0** — Introduced two pipeline feedback loops: DSL Gap Loop and Review→DSL Loop.
|
|
2655
|
+
- **0.32.0** — Closed the harness data loop with `trend`, `logs`, and more detailed DSL coverage scoring.
|
|
2656
|
+
- **0.31.0** — Introduced the Harness Engineer layer with `promptHash` and inline self-evaluation during `create`.
|
|
2657
|
+
- **0.30.0** — Improved error-fix dependency ordering and multiline import awareness for frontend context extraction.
|
|
2658
|
+
- **0.29.0** — Performed a broad hardening pass across RunLogger instrumentation, update snapshots, score trend output, and dead-code cleanup.
|
|
2659
|
+
- **0.28.0** — Upgraded review from 2-pass to 3-pass by adding impact assessment and code complexity analysis.
|
|
2660
|
+
- **0.27.0** — Added industrial reliability foundations: provider robustness, snapshot restore, and structured RunLog / RunId support.
|
|
2661
|
+
- **0.26.0** — Fixed stability issues in multi-repo review, parallel batch tolerance, and corrupted tasks JSON recovery.
|
|
2662
|
+
- **0.25.0** — Repaired context extraction for HTTP imports, pagination examples, and false crash detection.
|
|
2663
|
+
- **0.24.0** — Fixed lesson counting, `export default`, `impliesRegistration`, and dependency topological sorting issues.
|
|
2664
|
+
- **0.23.0** — Eliminated a filename hallucination bug by correcting `index.vue` generation toward the actual component name.
|
|
2665
|
+
- **0.22.0** — Strengthened frontend three-layer separation by introducing a `view` layer and fixing API→Store naming hallucinations.
|
|
2666
|
+
- **0.21.0** — Fixed store behavior contract extraction, including `fetchTasks` vs `fetchTaskList` hallucination issues.
|
|
2667
|
+
- **0.20.0** — Added one-command mock integration with `--serve` and `--restore`.
|
|
2668
|
+
- **0.19.0** — Rewrote error parsing, completed behavior contract extraction, and fixed Auto Gate behavior.
|
|
2669
|
+
- **0.18.0** — Added `ai-spec learn`, behavior contract injection, and made Approval Gate a hard gate.
|
|
2670
|
+
- **0.17.0** — Injected the full constitution into generation, improved export caching, and added constitution length warnings.
|
|
2671
|
+
- **0.16.0** — Added spec quality pre-assessment, layered code review, and TDD mode.
|
|
2672
|
+
- **0.15.0** — Added parallel task execution for tasks within the same dependency layer.
|
|
2673
|
+
- **0.14.5** — Added extraction and injection of real frontend pagination patterns.
|
|
2674
|
+
- **0.14.4** — Improved frontend output reliability with route index auto-registration and cross-task function-name consistency.
|
|
2675
|
+
- **0.14.3** — Added the welcome screen.
|
|
2676
|
+
- **0.14.2** — Added Java / Maven / Gradle project context awareness.
|
|
2677
|
+
- **0.14.1** — Fixed a critical bug where non-Node repos incorrectly generated TypeScript-oriented output.
|
|
2678
|
+
- **0.14.0** — Unified frontend framework detection and injected frontend context explicitly in task mode.
|
|
2679
|
+
- **0.13.9** — Added component reuse awareness.
|
|
2680
|
+
- **0.13.8** — Fixed store HTTP hallucinations and HTTP client import hallucinations.
|
|
2681
|
+
- **0.13.6** — Fixed layout hallucinations and route registration reliability.
|
|
2682
|
+
- **0.13.5** — Fixed frontend codegen hallucinations and route convention issues.
|
|
2683
|
+
- **0.13.4** — Fixed MiMo `max_tokens` truncation.
|
|
2684
|
+
- **0.13.3** — Fixed DSL validation false positives.
|
|
2685
|
+
- **0.13.2** — Added API key persistence.
|
|
2686
|
+
- **0.13.1** — Auto-skipped worktree for frontend generation and fixed related bugs.
|
|
2687
|
+
- **0.13.0** — Strengthened context awareness and error-feedback behavior.
|
|
2688
|
+
- **0.12.2** — Added PHP / Lumen backend support.
|
|
2689
|
+
- **0.12.1** — Restored MiMo v2 Pro support.
|
|
2690
|
+
- **0.12.0** — Added constitution consolidation with `ai-spec init --consolidate`.
|
|
2691
|
+
- **0.11.0** — Delivered three high-priority additions: incremental update, OpenAPI export, and multi-language codegen prompts.
|
|
2692
|
+
- **0.10.0** — Added Mock Server support and expanded multi-language backend support.
|
|
2693
|
+
- **0.9.0** — Fixed frontend DSL extraction, decomposition context, and codegen injection precision.
|
|
2694
|
+
- **0.8.0** — Enhanced frontend support and added shared global constitutions across projects.
|
|
2695
|
+
- **0.7.0** — Introduced Phase 4 multi-repo workspace orchestration.
|
|
2696
|
+
- **0.6.0** — Introduced Phase 3 test generation, error feedback, and lesson accumulation.
|
|
2697
|
+
- **0.5.0** — Introduced Phase 2 DSL transformation, schema handling, and validation.
|
|
2698
|
+
- **0.4.0** — Introduced Phase 1 industrial pipeline infrastructure.
|
|
2699
|
+
- **0.3.0** — Added project constitution support, task decomposition, and RTK integration.
|
|
2700
|
+
- **0.2.0** — Added multi-provider support and interactive model switching.
|
|
2701
|
+
- **0.1.0** — Initial release with Spec generation, Git worktree isolation, code generation, review, and basic project context scanning.
|
|
2702
|
+
|
|
2703
|
+
## Evolution Themes
|
|
2704
|
+
|
|
2705
|
+
- **Pipeline structure** — The project evolved from prompt-driven generation into a staged, restartable engineering pipeline.
|
|
2706
|
+
- **Project grounding** — Context extraction, constitutions, DSL, and behavior contracts reduce repository-specific hallucinations.
|
|
2707
|
+
- **Quality loops** — Testing, error feedback, review passes, lesson accumulation, and harness scoring create feedback channels after generation.
|
|
2708
|
+
- **Workspace orchestration** — Multi-repo features extend the system from single-repo coding to contract-aware cross-stack delivery.
|
|
2709
|
+
- **Harness observability** — `promptHash`, `harnessScore`, `logs`, `trend`, and `dashboard` turn runs into measurable engineering data.
|
|
2710
|
+
|
|
2711
|
+
</details>
|
package/cli/commands/config.ts
CHANGED
|
@@ -16,6 +16,8 @@ export function registerConfig(program: Command): void {
|
|
|
16
16
|
.option("--codegen-provider <name>", "Default provider for code generation")
|
|
17
17
|
.option("--codegen-model <name>", "Default model for code generation")
|
|
18
18
|
.option("--min-spec-score <score>", "Minimum overall spec score (1-10) to pass Approval Gate (0 = disabled)")
|
|
19
|
+
.option("--min-harness-score <score>", "Minimum harness score (1-10) for pipeline success (0 = disabled)")
|
|
20
|
+
.option("--max-error-cycles <n>", "Maximum error-feedback fix cycles (1-10, default: 2)")
|
|
19
21
|
.option("--show", "Print current configuration")
|
|
20
22
|
.option("--reset", "Reset configuration to empty")
|
|
21
23
|
.option("--clear-keys", "Delete all saved API keys from ~/.ai-spec-keys.json")
|
|
@@ -85,6 +87,22 @@ export function registerConfig(program: Command): void {
|
|
|
85
87
|
}
|
|
86
88
|
updated.minSpecScore = score;
|
|
87
89
|
}
|
|
90
|
+
if (opts.minHarnessScore !== undefined) {
|
|
91
|
+
const score = parseInt(opts.minHarnessScore, 10);
|
|
92
|
+
if (isNaN(score) || score < 0 || score > 10) {
|
|
93
|
+
console.error(chalk.red(" --min-harness-score must be a number between 0 and 10"));
|
|
94
|
+
process.exit(1);
|
|
95
|
+
}
|
|
96
|
+
updated.minHarnessScore = score;
|
|
97
|
+
}
|
|
98
|
+
if (opts.maxErrorCycles !== undefined) {
|
|
99
|
+
const cycles = parseInt(opts.maxErrorCycles, 10);
|
|
100
|
+
if (isNaN(cycles) || cycles < 1 || cycles > 10) {
|
|
101
|
+
console.error(chalk.red(" --max-error-cycles must be a number between 1 and 10"));
|
|
102
|
+
process.exit(1);
|
|
103
|
+
}
|
|
104
|
+
updated.maxErrorCycles = cycles;
|
|
105
|
+
}
|
|
88
106
|
|
|
89
107
|
await fs.writeJson(configPath, updated, { spaces: 2 });
|
|
90
108
|
console.log(chalk.green(`✔ Config saved to ${configPath}`));
|