@hongmaple0820/scale-engine 0.19.0 → 0.20.0
- package/README.en.md +17 -3
- package/README.md +107 -9
- package/dist/api/cli.js +988 -11
- package/dist/api/cli.js.map +1 -1
- package/dist/codegraph/CodeIntelligence.d.ts +135 -0
- package/dist/codegraph/CodeIntelligence.js +460 -0
- package/dist/codegraph/CodeIntelligence.js.map +1 -0
- package/dist/context/ContextBudget.d.ts +90 -0
- package/dist/context/ContextBudget.js +322 -0
- package/dist/context/ContextBudget.js.map +1 -0
- package/dist/eval/WorkflowEval.d.ts +161 -0
- package/dist/eval/WorkflowEval.js +379 -0
- package/dist/eval/WorkflowEval.js.map +1 -0
- package/dist/governance/GovernanceRoi.d.ts +25 -0
- package/dist/governance/GovernanceRoi.js +70 -0
- package/dist/governance/GovernanceRoi.js.map +1 -0
- package/dist/governance/ProgressiveGovernance.d.ts +22 -0
- package/dist/governance/ProgressiveGovernance.js +159 -0
- package/dist/governance/ProgressiveGovernance.js.map +1 -0
- package/dist/memory/MemoryBrain.d.ts +135 -0
- package/dist/memory/MemoryBrain.js +635 -0
- package/dist/memory/MemoryBrain.js.map +1 -0
- package/dist/memory/index.d.ts +1 -0
- package/dist/memory/index.js +1 -0
- package/dist/memory/index.js.map +1 -1
- package/dist/output/GovernanceDashboard.d.ts +57 -0
- package/dist/output/GovernanceDashboard.js +250 -0
- package/dist/output/GovernanceDashboard.js.map +1 -0
- package/dist/output/index.d.ts +2 -0
- package/dist/output/index.js +1 -0
- package/dist/output/index.js.map +1 -1
- package/dist/skills/SkillRadar.d.ts +83 -0
- package/dist/skills/SkillRadar.js +384 -0
- package/dist/skills/SkillRadar.js.map +1 -0
- package/dist/workflow/GovernanceTemplates.js +194 -194
- package/docs/CODE_INTELLIGENCE.md +138 -0
- package/docs/CONTEXT_BUDGET.md +87 -0
- package/docs/GOVERNANCE_DASHBOARD.md +69 -0
- package/docs/MEMORY_BRAIN.md +104 -0
- package/docs/README.md +16 -8
- package/docs/SKILL_RADAR.md +115 -0
- package/docs/WORKFLOW_EVAL.md +151 -0
- package/package.json +7 -1
package/README.en.md
CHANGED
@@ -1,14 +1,14 @@
 <p align="center">
-<img src="https://img.shields.io/badge/version-0.
+<img src="https://img.shields.io/badge/version-0.20.0-orange?style=flat-square" alt="version" />
 <img src="https://img.shields.io/badge/platforms-16-blue?style=flat-square" alt="platforms" />
 <img src="https://img.shields.io/badge/agents-12-blue?style=flat-square" alt="agents" />
 <img src="https://img.shields.io/badge/workflows-10-green?style=flat-square" alt="workflows" />
 <img src="https://img.shields.io/badge/detectors-19-red?style=flat-square" alt="detectors" />
 <img src="https://img.shields.io/badge/tests-verified-brightgreen?style=flat-square" alt="tests" />
-<img src="https://img.shields.io/badge/npm-0.
+<img src="https://img.shields.io/badge/npm-0.20.0-cb3837?style=flat-square&logo=npm" alt="npm" />
 </p>
 
-# SCALE Engine v0.
+# SCALE Engine v0.20.0
 
 SCALE Engine makes AI coding agents follow engineering rules through executable workflow gates, evidence files, and review constraints instead of relying on prompt discipline alone. It helps humans see what the agent explored, planned, verified, skipped, and why a task is or is not ready to ship.
 
@@ -241,6 +241,20 @@ npx vitest run tests/workflow/reviewAnalyzer.test.ts tests/workflow/reviewStore.
 
 ## Release Notes
 
+### v0.20.0
+
+- Added Context Budget and Progressive Governance so low-risk S tasks stay lightweight while auth, data, security, deployment, and cross-module changes escalate automatically.
+- Added Code Intelligence with adapter-first CodeGraph / Graphify support, explicit fallback, impact analysis, context recommendations, and exploration ROI.
+- Added Workflow Eval, Failure Replay, and improvement candidates with pass@k, fix iterations, tool-call counts, token estimates, and human-correction metrics.
+- Added Skill Radar for intent-based skills, MCP, browser, desktop automation, and external CLI recommendations with confidence, safety level, and evidence requirements.
+- Added Memory Brain for evidence-backed long-term memory candidates, contradiction detection, dream maintenance, explicit promotion, and failure replay ingestion.
+- Added Governance Dashboard to summarize runtime, eval, memory, resource, and HTML artifact evidence in a local HTML review surface.
+- Fixed new `--dir` aware commands so relative `.scale` state resolves inside the target project instead of the caller workspace.
+
+### v0.19.0
+
+- Added product smoke gates, runtime evidence learning settlement, memory context packs, workspace conflict blockers, and release-readiness demo coverage.
+
 ### v0.18.0
 
 - Governed HTML artifacts: `scale artifact render/doctor/settle/open`.
package/README.md
CHANGED
@@ -1,14 +1,14 @@
 <p align="center">
-<img src="https://img.shields.io/badge/version-0.
+<img src="https://img.shields.io/badge/version-0.20.0-orange?style=flat-square" alt="version" />
 <img src="https://img.shields.io/badge/platforms-16-blue?style=flat-square" alt="platforms" />
 <img src="https://img.shields.io/badge/agents-12-blue?style=flat-square" alt="agents" />
 <img src="https://img.shields.io/badge/workflows-10-green?style=flat-square" alt="workflows" />
 <img src="https://img.shields.io/badge/detectors-19-red?style=flat-square" alt="detectors" />
 <img src="https://img.shields.io/badge/tests-verified-brightgreen?style=flat-square" alt="tests" />
-<img src="https://img.shields.io/badge/npm-0.
+<img src="https://img.shields.io/badge/npm-0.20.0-cb3837?style=flat-square&logo=npm" alt="npm" />
 </p>
 
-# SCALE Engine v0.
+# SCALE Engine v0.20.0
 
 SCALE Engine stops AI agents from relying on self-discipline alone to follow engineering rules. It turns the requirements for exploration, planning, implementation, verification, review, and release into executable commands, gates, and evidence files, so humans can see what the agent did, what it skipped, and why a task cannot ship.
 
@@ -190,6 +190,102 @@ scale memory settle --task "Fix OAuth callback state lookup" --task-id <task-id>
 
 See [Memory Fabric](docs/MEMORY_FABRIC.md).
 
+## Context Budget and Progressive Governance
+
+Context Budget counts always-loaded, on-demand, evidence, archive, and generated context separately, so the agent does not stuff every rule, past plan, report, and generated artifact into the prompt at once.
+
+```bash
+scale context budget --json
+scale context doctor --max-always 2500 --max-task 8000
+scale context pack --task "Review frontend route with browser evidence" --level L --budget 4000 --json
+```
+
+Progressive Governance automatically recommends a `minimal`, `standard`, `expanded`, or `critical` governance mode based on the task text and the changed files, and uses an ROI report to explain the benefits and overhead of governance:
+
+```bash
+scale governance mode --task "Change auth permissions" --files src/auth/user.ts --requested-mode minimal --json
+scale governance roi --task-id <task-id> --task "Review frontend route" --files src/routes/upload.tsx --json
+```
+
+See [Context Budget And Progressive Governance](docs/CONTEXT_BUDGET.md).
+
+## Code Intelligence and Exploration ROI
+
+Code Intelligence is an adapter-first code understanding layer: it prefers external CodeGraph or Graphify artifacts and, when they are missing, explicitly falls back to an internal source scan rather than silently pretending that code graph analysis was completed.
+
+```bash
+scale codegraph init
+scale codegraph status --json
+scale codegraph query "UserService.create" --json
+scale codegraph impact --symbol UserService.create --json
+scale codegraph context --symbol UserService.create --budget 2000 --json
+scale codegraph roi --symbol UserService.create --json
+```
+
+It outputs the provider, fallback status, related files, confidence, and exploration-benefit metrics such as `fileReadsSaved` / `toolCallsSaved`. `scale governance roi` can also fold code intelligence into the governance ROI via `--symbol` or `--code-query`.
+
+See [Code Intelligence](docs/CODE_INTELLIGENCE.md).
+
+## Workflow Eval and Failure Replay
+
+Workflow Eval uses lightweight suites to measure whether a workflow actually reduces rework, tool calls, token consumption, and human correction. On failure it keeps a Failure Replay instead of leaving only a failed status.
+
+```bash
+scale eval init
+scale eval run --suite workflow-baseline --json
+scale eval compare --baseline <run-id> --candidate <run-id> --json
+scale eval failures --since 30d --json
+scale eval promote-failure <failure-id>
+```
+
+By default, artifacts are written to `.scale/evals/` and count as local runtime evidence. What gets committed to Git long term should be curated reports, benchmark fixtures, or improvement items explicitly chosen for retention.
+
+See [Workflow Eval Harness](docs/WORKFLOW_EVAL.md).
+
+## Skill Radar
+
+Skill Radar chooses skills, MCP, browser automation, desktop automation, and external CLIs by task intent instead of relying on a static prompt list. It returns confidence, safety level, evidence requirements, and fallback behavior so agents can actively use tools without silently crossing safety boundaries.
+
+```bash
+scale skill radar --task "Design upload UI and run browser E2E checks" --files src/pages/upload.tsx
+scale skill radar --task "Automate WPS desktop workflow with CUA" --json
+scale skill doctor --supply-chain
+```
+
+Desktop CUA and external agent CLIs are blocked by default through Tool Policy until deliberately enabled. Third-party skills stay review-required until source, scripts, license, and pinned revision are checked.
+
+See [Skill Radar](docs/SKILL_RADAR.md).
+
+## Memory Brain
+
+Memory Brain stores long-term project knowledge separately from the short context pack. Runtime evidence and learning candidates enter as candidates first; active memory requires evidence paths, project scope, confidence, and explicit promotion.
+
+```bash
+scale memory ingest --from evidence --task-id <task-id>
+scale memory ingest --from failure --failure-id <failure-replay-id>
+scale memory query "OAuth callback state design"
+scale memory contradictions --json
+scale memory dream --json
+scale memory promote <candidate-id>
+```
+
+The point is not to remember everything. The point is to keep useful, reviewed project facts while reporting contradictions instead of silently overwriting them.
+
+See [Memory Brain](docs/MEMORY_BRAIN.md).
+
+## Governance Dashboard
+
+Governance Dashboard renders a local HTML health view from runtime evidence, Workflow Eval, Memory Brain, Resource Governance, and task HTML artifacts:
+
+```bash
+scale artifact dashboard
+scale artifact dashboard --task-id <task-id> --json
+```
+
+Default output is `.scale/reports/governance-dashboard.html`. Markdown and JSON remain the maintainable source of truth; the dashboard is a review surface for humans.
+
+See [Governance Dashboard](docs/GOVERNANCE_DASHBOARD.md).
+
 ## Runtime Evidence
 
 M/L/CRITICAL tasks should leave runtime evidence before final delivery, so the agent cannot claim completion without real verification:
@@ -297,13 +393,15 @@ npx vitest run tests/workflow/phaseCli.test.ts
 npx vitest run tests/workflow/reviewAnalyzer.test.ts tests/workflow/reviewStore.test.ts tests/workflow/gateSystem.test.ts
 ```
 
-##
+## v0.20.0 Updates
 
--
--
--
--
--
+- Added Context Budget and Progressive Governance so low-risk S tasks stay lightweight while auth, data, security, deployment, and cross-module changes escalate automatically.
+- Added Code Intelligence with adapter-first CodeGraph / Graphify support, explicit fallback, impact analysis, context recommendations, and exploration ROI.
+- Added Workflow Eval, Failure Replay, and improvement candidates with pass@k, fix iterations, tool-call counts, token estimates, and human-correction metrics.
+- Added Skill Radar for intent-based skills, MCP, browser, desktop automation, and external CLI recommendations with confidence, safety level, and evidence requirements.
+- Added Memory Brain for evidence-backed long-term memory candidates, contradiction detection, dream maintenance, explicit promotion, and failure replay ingestion.
+- Added Governance Dashboard to summarize runtime, eval, memory, resource, and HTML artifact evidence in a local HTML review surface.
+- Fixed new --dir-aware commands so relative .scale state resolves inside the target project instead of the caller workspace.
 
 ## v0.18.0 Updates
 