@hongmaple0820/scale-engine 0.26.0 → 0.27.0

Files changed (43)
  1. package/README.en.md +19 -3
  2. package/README.md +19 -3
  3. package/dist/api/cli.js +57 -9
  4. package/dist/api/cli.js.map +1 -1
  5. package/dist/cli/phaseCommands.js +8 -8
  6. package/dist/cli/phaseCommands.js.map +1 -1
  7. package/dist/context/ContextBudget.d.ts +14 -0
  8. package/dist/context/ContextBudget.js +50 -14
  9. package/dist/context/ContextBudget.js.map +1 -1
  10. package/dist/context/ContextCompiler.d.ts +34 -0
  11. package/dist/context/ContextCompiler.js +120 -0
  12. package/dist/context/ContextCompiler.js.map +1 -0
  13. package/dist/eval/WorkflowEval.js +4 -6
  14. package/dist/eval/WorkflowEval.js.map +1 -1
  15. package/dist/governance/GovernanceRoi.d.ts +6 -1
  16. package/dist/governance/GovernanceRoi.js +32 -0
  17. package/dist/governance/GovernanceRoi.js.map +1 -1
  18. package/dist/guardrails/DependencyAuditor.js +38 -0
  19. package/dist/guardrails/DependencyAuditor.js.map +1 -1
  20. package/dist/index.d.ts +1 -0
  21. package/dist/index.js +1 -0
  22. package/dist/index.js.map +1 -1
  23. package/dist/runtime/AiOsRuntime.d.ts +53 -0
  24. package/dist/runtime/AiOsRuntime.js +142 -0
  25. package/dist/runtime/AiOsRuntime.js.map +1 -0
  26. package/dist/runtime/index.d.ts +1 -0
  27. package/dist/runtime/index.js +1 -0
  28. package/dist/runtime/index.js.map +1 -1
  29. package/dist/skills/routing/SkillPlanner.js +91 -3
  30. package/dist/skills/routing/SkillPlanner.js.map +1 -1
  31. package/dist/skills/routing/SkillRoutingTypes.d.ts +17 -0
  32. package/dist/tools/SafeCommandRunner.d.ts +16 -0
  33. package/dist/tools/SafeCommandRunner.js +83 -0
  34. package/dist/tools/SafeCommandRunner.js.map +1 -0
  35. package/dist/workflow/gates/GateSystem.js +3 -9
  36. package/dist/workflow/gates/GateSystem.js.map +1 -1
  37. package/docs/AI_ENGINEERING_OS_POSITIONING.md +462 -0
  38. package/docs/CONTEXT_BUDGET.md +43 -1
  39. package/docs/DEPENDENCY_AUDIT.md +29 -0
  40. package/docs/MEMORY_FABRIC.md +2 -0
  41. package/docs/README.md +1 -0
  42. package/docs/SKILL_RADAR.md +13 -0
  43. package/package.json +9 -2
@@ -43,6 +43,40 @@ scale context pack \
    --json
  ```

+ Build the unified AI OS runtime plan that embeds the context pack with memory, skill routing, adaptive workflow, and ROI:
+
+ ```bash
+ scale ai-os plan \
+   --task-id TASK-123 \
+   --task "Review frontend route with browser evidence" \
+   --level L \
+   --files src/routes/upload.tsx \
+   --budget 8000 \
+   --json
+ ```
+
+ The context pack now uses the baseline Context Compiler. Each candidate section is scored by category, task/file relevance, risk level, and budget fit. The JSON output includes compiler metadata so callers can explain why a section was loaded or omitted:
+
+ ```json
+ {
+   "compiler": {
+     "strategy": "relevance-budget-v1",
+     "budget": 4000,
+     "totalCandidateTokens": 6200,
+     "estimatedTokenSavings": 2200,
+     "ranking": [
+       {
+         "id": "runtime-evidence",
+         "included": true,
+         "score": 292,
+         "matchedSignals": ["evidence", "high-risk-evidence"],
+         "reason": "Evidence is needed for completion and verification claims."
+       }
+     ]
+   }
+ }
+ ```
+
  Evaluate progressive governance mode:

  ```bash
@@ -109,5 +143,13 @@ This is not a replacement for verification. It only decides which governance beh

  ## Governance ROI

- `scale governance roi` reports both benefit and overhead. Early ROI is estimated from context budget and risk signals. Later versions should replace estimates with measured eval data such as file reads saved, tool calls saved, fix iterations reduced, and human corrections avoided.
+ `scale governance roi` reports both benefit and overhead. In v0.27.0, `scale ai-os plan` also attaches ROI modules for:
+
+ - `context-budget`
+ - `context-compiler`
+ - `memory-provider-runtime`
+ - `skill-routing-engine`
+ - `progressive-governance`
+
+ Early ROI is still estimated from context budget, compiler savings, recall count, skill evidence steps, and risk signals. Later versions should replace estimates with measured eval data such as file reads saved, tool calls saved, fix iterations reduced, and human corrections avoided.
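
One way to read the `compiler` metadata added in the context-pack hunk above, and the "compiler savings" input named in the ROI paragraph, is as a greedy budget-fit pass over scored candidates. The TypeScript below is a minimal sketch under that assumption, not the package's `ContextCompiler`: the `Candidate` shape, the per-section `tokens` estimate, the `task-history` and `style-guide` ids, and the rule that savings equal candidate tokens minus loaded tokens are all hypothetical. Only the output field names come from the documented JSON.

```typescript
// Hypothetical input shape; `tokens` is an assumed per-section estimate.
interface Candidate {
  id: string;
  score: number;
  tokens: number;
}

// Field names below mirror the documented `compiler` JSON.
interface CompilerSummary {
  strategy: string;
  budget: number;
  totalCandidateTokens: number;
  estimatedTokenSavings: number;
  ranking: { id: string; included: boolean; score: number }[];
}

function compileContext(candidates: Candidate[], budget: number): CompilerSummary {
  // Highest score first; copy so the caller's array is not mutated.
  const sorted = [...candidates].sort((a, b) => b.score - a.score);
  let used = 0;
  const ranking = sorted.map((c) => {
    // A section is loaded only if it still fits in the remaining budget.
    const included = used + c.tokens <= budget;
    if (included) used += c.tokens;
    return { id: c.id, included, score: c.score };
  });
  const totalCandidateTokens = candidates.reduce((sum, c) => sum + c.tokens, 0);
  return {
    strategy: "relevance-budget-v1",
    budget,
    totalCandidateTokens,
    // Assumption: savings = tokens that were candidates but were not loaded.
    estimatedTokenSavings: totalCandidateTokens - used,
    ranking,
  };
}

const summary = compileContext(
  [
    { id: "style-guide", score: 40, tokens: 2200 },
    { id: "runtime-evidence", score: 292, tokens: 1500 },
    { id: "task-history", score: 120, tokens: 2500 },
  ],
  4000,
);
console.log(JSON.stringify(summary, null, 2));
```

With these made-up inputs the two highest-scoring sections exactly fill the 4000-token budget and `style-guide` is omitted, so the summary reports 6200 candidate tokens and 2200 estimated savings, the same figures as the documented example.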
 
@@ -28,6 +28,28 @@ scale dependency audit --changed-packages left-pad,@scope/tool --json

  The command exits non-zero when the active mode has blocking findings.

+ ## Verification Command Safety
+
+ SCALE verification commands are security-sensitive because they are often run in CI.
+ The core verification paths (`verify-task`, phase verification, workflow eval attempts, and gate commands) execute configured commands without shell expansion by default.
+
+ Allowed by default:
+
+ ```bash
+ npm run build
+ npm test -- --runInBand
+ node scripts/check.js --changed
+ ```
+
+ Blocked by default:
+
+ ```bash
+ npm test && curl https://example.com
+ node scripts/check.js | tee out.txt
+ ```
+
+ Shell metacharacters such as `&&`, `|`, `;`, `<`, `>`, backticks, and unquoted `$` are rejected before execution. Use package scripts or checked-in helper scripts for composed commands. `SCALE_ALLOW_SHELL_COMMANDS=1` re-enables shell execution only for trusted local runs and must not be enabled for untrusted PR or user-controlled CI inputs.
+
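
The rejection rule in the paragraph above can be illustrated with a small pre-execution check. This is a hedged sketch, not the shipped `SafeCommandRunner`: the function name `findShellMetacharacter` is hypothetical, and two behaviours are assumptions made for the example, namely that operator characters are rejected even inside quotes, and that "unquoted `$`" means a `$` outside single quotes.

```typescript
// Hypothetical helper: returns the first rejected character, or null when the
// command string contains none of the listed shell metacharacters.
function findShellMetacharacter(command: string): string | null {
  let inSingleQuotes = false;
  for (const ch of command) {
    if (ch === "'") {
      inSingleQuotes = !inSingleQuotes;
      continue;
    }
    // Assumption: operators are rejected everywhere, quoted or not.
    if ("|;<>`&".includes(ch)) return ch;
    // Assumption: `$` is only tolerated inside single quotes.
    if (ch === "$" && !inSingleQuotes) return ch;
  }
  return null;
}

const samples = [
  "npm run build",
  "npm test -- --runInBand",
  "node scripts/check.js --changed",
  "npm test && curl https://example.com",
  "node scripts/check.js | tee out.txt",
];

for (const command of samples) {
  const hit = findShellMetacharacter(command);
  console.log(hit === null ? `allowed: ${command}` : `blocked (${hit}): ${command}`);
}
```

Under these assumptions the three "allowed" commands from the hunk pass and the two "blocked" commands are rejected on `&` and `|` respectively. A real runner would still need to split an accepted command into an argument vector and execute it without a shell.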
  ## G7 Integration

  `SecurityGate` now emits two first-class evidence sources:
@@ -82,8 +104,15 @@ The first implementation detects:
  - install lifecycle scripts
  - executable bin scripts
  - deprecated packages from lockfile metadata
+ - built-in ownership/provenance watchlist matches
  - dynamic code execution: `eval`, `new Function`
  - shell execution patterns
  - suspicious network access patterns

+ The built-in ownership/provenance watchlist currently blocks exact versions that were flagged by external package behavior analysis:
+
+ - `content-type@2.0.0`
+ - `type-is@2.1.0`
+ - `type-js@2.1.0` (kept as a defensive alias for reports that use this package name)
+
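
Because the watchlist above is described as blocking exact versions, a lookup keyed on `name@version` is enough to illustrate it. The sketch below is an assumption about the matching rule, not the `DependencyAuditor` source: `InstalledPackage` and `findWatchlistMatches` are hypothetical names, and only the three watchlist entries are taken from the hunk.

```typescript
// Entries copied from the documented watchlist; exact-version keys only.
const WATCHLIST = new Set<string>([
  "content-type@2.0.0",
  "type-is@2.1.0",
  "type-js@2.1.0",
]);

// Hypothetical shape for a resolved dependency, e.g. read from a lockfile.
interface InstalledPackage {
  name: string;
  version: string;
}

// Returns the `name@version` specs that match the watchlist exactly.
function findWatchlistMatches(packages: InstalledPackage[]): string[] {
  return packages
    .map((pkg) => `${pkg.name}@${pkg.version}`)
    .filter((spec) => WATCHLIST.has(spec));
}

const findings = findWatchlistMatches([
  { name: "content-type", version: "1.0.5" },
  { name: "type-is", version: "2.1.0" },
]);
console.log(findings);
```

In this sketch `content-type@1.0.5` is not reported because only exact watchlisted versions match, while `type-is@2.1.0` is.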
  Future network-backed checks can add npm registry metadata and `npm audit --json` ingestion, but they should stay optional and evidence-backed.
@@ -122,6 +122,7 @@ Commands:
  scale memory provider init
  scale memory provider status --json
  scale memory provider recall "OAuth callback Redis state" --json
+ scale ai-os plan --task "Fix OAuth callback Redis state" --files src/auth/oauth.ts --json
  ```

  Provider rules:
@@ -130,5 +131,6 @@ Provider rules:
  - External providers are read-only by default. Writes require an explicit provider policy change.
  - `scale-local` remains the fallback provider through Memory Brain and only promotes reviewed, evidence-backed memory.
  - `memory pack` automatically includes a `provider-memory` section when provider recall returns relevant active memories.
+ - `ai-os plan` includes both the provider recall summary and the Memory Fabric context pack, so agents can route memory before planning without pretending external memory is always available.

  This keeps agents flexible: they can ask the router for memory before planning, verification, review, or release, while SCALE still records which provider was used and why fallback was required.
package/docs/README.md CHANGED
@@ -36,6 +36,7 @@
  | [CODE_INTELLIGENCE.md](CODE_INTELLIGENCE.md) | Code intelligence and exploration ROI for CodeGraph, Graphify, and explicit fallback |
  | [WORKFLOW_EVAL.md](WORKFLOW_EVAL.md) | Workflow Eval, pass@k metrics, Failure Replay, and improvement candidates |
  | [SKILL_RADAR.md](SKILL_RADAR.md) | Skill Radar, capability confidence, evidence requirements, and supply-chain security checks |
+ | [AI_ENGINEERING_OS_POSITIONING.md](AI_ENGINEERING_OS_POSITIONING.md) | Agent Governance Runtime / AI Engineering OS direction, and the unified runtime plan from `scale ai-os plan` |
  | [THIRD_PARTY_SKILLS.md](THIRD_PARTY_SKILLS.md) | Third-party skill acknowledgements, licensing boundaries, citation practices, and vendoring policy |
  | [EXTERNAL_REFERENCES.md](EXTERNAL_REFERENCES.md) | Complete inventory of references to external projects, skills, MCP, CLI, and adapters |
  | [UPGRADE_MANAGEMENT.md](UPGRADE_MANAGEMENT.md) | Safe upgrade process for the SCALE CLI, governance pack, skills, MCP, and CLI tools |
@@ -19,6 +19,7 @@ scale skill radar --task "Automate WPS desktop workflow with CUA" --json
  scale skill radar --task "Review release PR" --phase review --level L --output docs/worklog/tasks/release/skill-radar.md
  scale skill doctor --supply-chain
  scale skill doctor --supply-chain --json
+ scale ai-os plan --task "Design upload UI and run browser E2E checks" --files src/pages/upload.tsx --json
  ```

  ## Safety Levels
@@ -73,6 +74,18 @@ Each recommendation carries required evidence. Examples:

  If evidence is missing, the final delivery should list the capability as unverified rather than claiming it was used successfully.

+ ## Skill Execution Plan
+
+ In v0.27.0, `createSkillPlan` and `scale ai-os plan` return an `executionPlan`:
+
+ - `strategy`: currently `intent-evidence-graph-v1`
+ - `steps`: ordered skill, artifact, and verification actions
+ - `reason`: why the step was selected from task intents
+ - `evidenceRequired`: what proof must be recorded
+ - `fallback`: what to do when the skill, MCP, CLI, or verification path is unavailable
+
+ This turns skill routing from a recommendation list into an auditable execution graph. Required steps still need concrete evidence or an explicit skipped/fallback record; recommended steps may be skipped with a reason.
+
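
To make the audit rule in the section above concrete, here is a sketch of how a caller might check an `executionPlan` against recorded outcomes. The shapes are assumptions, not the types in `SkillRoutingTypes.d.ts`: the hunk does not say whether `reason`, `evidenceRequired`, and `fallback` live on each step (assumed here), and `id`, `requirement`, `StepRecord`, and `findUnresolvedSteps` are hypothetical names introduced for illustration.

```typescript
interface ExecutionStep {
  id: string;                              // hypothetical identifier
  requirement: "required" | "recommended"; // hypothetical field
  reason: string;
  evidenceRequired: string[];              // element type is an assumption
  fallback: string;
}

interface ExecutionPlan {
  strategy: string; // e.g. "intent-evidence-graph-v1"
  steps: ExecutionStep[];
}

// Hypothetical record of what actually happened for a step.
type StepRecord =
  | { stepId: string; status: "done"; evidence: string[] }
  | { stepId: string; status: "skipped" | "fallback"; note: string };

// Returns ids of steps that lack evidence or an explicit skipped/fallback note.
function findUnresolvedSteps(plan: ExecutionPlan, records: StepRecord[]): string[] {
  const byStep = new Map(records.map((r) => [r.stepId, r] as const));
  return plan.steps
    .filter((step) => {
      const record = byStep.get(step.id);
      if (!record) return true; // nothing recorded at all
      if (record.status === "done") {
        // Required steps need at least one concrete piece of evidence.
        return step.requirement === "required" && record.evidence.length === 0;
      }
      // Skipped or fallback outcomes need a stated reason.
      return record.note.trim() === "";
    })
    .map((step) => step.id);
}

const plan: ExecutionPlan = {
  strategy: "intent-evidence-graph-v1",
  steps: [
    { id: "browser-e2e", requirement: "required", reason: "task mentions E2E checks",
      evidenceRequired: ["test report"], fallback: "record manual verification" },
    { id: "ui-design", requirement: "recommended", reason: "task mentions UI design",
      evidenceRequired: ["design artifact"], fallback: "skip with reason" },
  ],
};

console.log(findUnresolvedSteps(plan, [
  { stepId: "browser-e2e", status: "done", evidence: [] },
  { stepId: "ui-design", status: "skipped", note: "no design change in this task" },
]));
```

Here the required `browser-e2e` step is reported as unresolved because it was marked done without evidence, while the recommended `ui-design` step is accepted because it was skipped with a reason.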
  ## Supply-Chain Doctor

  `scale skill doctor --supply-chain` reviews known skill sources and install commands for:
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@hongmaple0820/scale-engine",
-   "version": "0.26.0",
+   "version": "0.27.0",
    "description": "Executable AI agent governance with workflow gates, evidence, skill/tool orchestration, and traceable HTML artifacts",
    "repository": {
      "type": "git",
@@ -25,6 +25,7 @@
    "files": [
      "dist",
      "docs/README.md",
+     "docs/AI_ENGINEERING_OS_POSITIONING.md",
      "docs/CODE_INTELLIGENCE.md",
      "docs/CONTEXT_BUDGET.md",
      "docs/BACKGROUND_HUNTER.md",
@@ -61,17 +62,23 @@
      "serve": "node dist/api/http.js"
    },
    "dependencies": {
-     "@modelcontextprotocol/sdk": "^1.0.0",
+     "@modelcontextprotocol/sdk": "1.29.0",
      "better-sqlite3": "^11.10.0",
      "chokidar": "^3.6.0",
      "citty": "^0.1.6",
+     "content-type": "1.0.5",
      "execa": "^9.3.0",
      "hono": "^4.5.0",
      "js-yaml": "^4.1.0",
      "pino": "^9.3.0",
      "pino-pretty": "^11.2.0",
+     "type-is": "2.0.1",
      "zod": "^3.23.0"
    },
+   "overrides": {
+     "content-type": "1.0.5",
+     "type-is": "2.0.1"
+   },
    "devDependencies": {
      "@types/better-sqlite3": "^7.6.0",
      "@types/js-yaml": "^4.0.9",