@hongmaple0820/scale-engine 0.40.2 → 0.43.0

Files changed (200)
  1. package/README.md +30 -2
  2. package/dist/api/cli.js +19 -0
  3. package/dist/api/cli.js.map +1 -1
  4. package/dist/api/quickstart.d.ts +11 -0
  5. package/dist/api/quickstart.js +98 -1
  6. package/dist/api/quickstart.js.map +1 -1
  7. package/dist/artifact/fsmDefinitions.js +15 -2
  8. package/dist/artifact/fsmDefinitions.js.map +1 -1
  9. package/dist/artifact/types.d.ts +1 -1
  10. package/dist/artifact/types.js.map +1 -1
  11. package/dist/cache/ScanCache.d.ts +41 -0
  12. package/dist/cache/ScanCache.js +120 -0
  13. package/dist/cache/ScanCache.js.map +1 -0
  14. package/dist/capabilities/BrowserQACapability.d.ts +14 -0
  15. package/dist/capabilities/BrowserQACapability.js +94 -0
  16. package/dist/capabilities/BrowserQACapability.js.map +1 -1
  17. package/dist/cli/autofixCommands.d.ts +22 -0
  18. package/dist/cli/autofixCommands.js +32 -0
  19. package/dist/cli/autofixCommands.js.map +1 -0
  20. package/dist/cli/cortexCommands.d.ts +71 -0
  21. package/dist/cli/cortexCommands.js +335 -0
  22. package/dist/cli/cortexCommands.js.map +1 -0
  23. package/dist/cli/costCommands.d.ts +13 -0
  24. package/dist/cli/costCommands.js +48 -0
  25. package/dist/cli/costCommands.js.map +1 -0
  26. package/dist/cli/orchCommands.d.ts +43 -0
  27. package/dist/cli/orchCommands.js +135 -0
  28. package/dist/cli/orchCommands.js.map +1 -0
  29. package/dist/cli/phaseCommands.js +1 -2
  30. package/dist/cli/phaseCommands.js.map +1 -1
  31. package/dist/cli/qaCommands.d.ts +22 -0
  32. package/dist/cli/qaCommands.js +84 -0
  33. package/dist/cli/qaCommands.js.map +1 -0
  34. package/dist/cli/quickstartCommands.d.ts +17 -0
  35. package/dist/cli/quickstartCommands.js +47 -0
  36. package/dist/cli/quickstartCommands.js.map +1 -0
  37. package/dist/cli/shieldCommands.d.ts +30 -0
  38. package/dist/cli/shieldCommands.js +212 -0
  39. package/dist/cli/shieldCommands.js.map +1 -0
  40. package/dist/cli/tuiCommands.d.ts +7 -0
  41. package/dist/cli/tuiCommands.js +33 -0
  42. package/dist/cli/tuiCommands.js.map +1 -0
  43. package/dist/config/profiles.js +26 -0
  44. package/dist/config/profiles.js.map +1 -1
  45. package/dist/cortex/GovernanceMetrics.d.ts +66 -0
  46. package/dist/cortex/GovernanceMetrics.js +230 -0
  47. package/dist/cortex/GovernanceMetrics.js.map +1 -0
  48. package/dist/cortex/InstinctExtractor.d.ts +61 -0
  49. package/dist/cortex/InstinctExtractor.js +184 -0
  50. package/dist/cortex/InstinctExtractor.js.map +1 -0
  51. package/dist/cortex/InstinctStore.d.ts +54 -0
  52. package/dist/cortex/InstinctStore.js +266 -0
  53. package/dist/cortex/InstinctStore.js.map +1 -0
  54. package/dist/cortex/ReflexionEngine.d.ts +34 -0
  55. package/dist/cortex/ReflexionEngine.js +157 -0
  56. package/dist/cortex/ReflexionEngine.js.map +1 -0
  57. package/dist/cortex/SessionInjector.d.ts +44 -0
  58. package/dist/cortex/SessionInjector.js +127 -0
  59. package/dist/cortex/SessionInjector.js.map +1 -0
  60. package/dist/cortex/adapters/ClaudeAdapter.d.ts +17 -0
  61. package/dist/cortex/adapters/ClaudeAdapter.js +61 -0
  62. package/dist/cortex/adapters/ClaudeAdapter.js.map +1 -0
  63. package/dist/cortex/adapters/CodexAdapter.d.ts +10 -0
  64. package/dist/cortex/adapters/CodexAdapter.js +52 -0
  65. package/dist/cortex/adapters/CodexAdapter.js.map +1 -0
  66. package/dist/cortex/adapters/CursorAdapter.d.ts +10 -0
  67. package/dist/cortex/adapters/CursorAdapter.js +46 -0
  68. package/dist/cortex/adapters/CursorAdapter.js.map +1 -0
  69. package/dist/cortex/adapters/GeminiAdapter.d.ts +11 -0
  70. package/dist/cortex/adapters/GeminiAdapter.js +48 -0
  71. package/dist/cortex/adapters/GeminiAdapter.js.map +1 -0
  72. package/dist/eval/BenchmarkPublisher.d.ts +25 -0
  73. package/dist/eval/BenchmarkPublisher.js +27 -0
  74. package/dist/eval/BenchmarkPublisher.js.map +1 -0
  75. package/dist/guardrails/DependencyAuditor.js +10 -1
  76. package/dist/guardrails/DependencyAuditor.js.map +1 -1
  77. package/dist/orchestrator/OrchestratorDaemon.d.ts +44 -0
  78. package/dist/orchestrator/OrchestratorDaemon.js +150 -0
  79. package/dist/orchestrator/OrchestratorDaemon.js.map +1 -0
  80. package/dist/orchestrator/PolicyLoader.d.ts +80 -0
  81. package/dist/orchestrator/PolicyLoader.js +229 -0
  82. package/dist/orchestrator/PolicyLoader.js.map +1 -0
  83. package/dist/orchestrator/ReconciliationLoop.d.ts +71 -0
  84. package/dist/orchestrator/ReconciliationLoop.js +266 -0
  85. package/dist/orchestrator/ReconciliationLoop.js.map +1 -0
  86. package/dist/orchestrator/TrackerAdapter.d.ts +60 -0
  87. package/dist/orchestrator/TrackerAdapter.js +147 -0
  88. package/dist/orchestrator/TrackerAdapter.js.map +1 -0
  89. package/dist/orchestrator/WorkspaceManager.d.ts +66 -0
  90. package/dist/orchestrator/WorkspaceManager.js +257 -0
  91. package/dist/orchestrator/WorkspaceManager.js.map +1 -0
  92. package/dist/qa/BrowserDaemon.d.ts +23 -0
  93. package/dist/qa/BrowserDaemon.js +79 -0
  94. package/dist/qa/BrowserDaemon.js.map +1 -0
  95. package/dist/qa/E2ETestOrchestrator.d.ts +14 -0
  96. package/dist/qa/E2ETestOrchestrator.js +19 -0
  97. package/dist/qa/E2ETestOrchestrator.js.map +1 -0
  98. package/dist/review/CrossModelReviewer.d.ts +35 -0
  99. package/dist/review/CrossModelReviewer.js +75 -0
  100. package/dist/review/CrossModelReviewer.js.map +1 -0
  101. package/dist/review/ReviewAggregator.d.ts +13 -0
  102. package/dist/review/ReviewAggregator.js +28 -0
  103. package/dist/review/ReviewAggregator.js.map +1 -0
  104. package/dist/review/reviewCommands.d.ts +15 -0
  105. package/dist/review/reviewCommands.js +24 -0
  106. package/dist/review/reviewCommands.js.map +1 -0
  107. package/dist/routing/LocalModelProvider.d.ts +11 -0
  108. package/dist/routing/LocalModelProvider.js +21 -0
  109. package/dist/routing/LocalModelProvider.js.map +1 -0
  110. package/dist/routing/ModelRouter.d.ts +12 -0
  111. package/dist/routing/ModelRouter.js +31 -4
  112. package/dist/routing/ModelRouter.js.map +1 -1
  113. package/dist/runtime/AiOsRuntime.d.ts +1 -0
  114. package/dist/runtime/AiOsRuntime.js +15 -0
  115. package/dist/runtime/AiOsRuntime.js.map +1 -1
  116. package/dist/runtime/CostAnalyzer.d.ts +53 -0
  117. package/dist/runtime/CostAnalyzer.js +160 -0
  118. package/dist/runtime/CostAnalyzer.js.map +1 -0
  119. package/dist/runtime/CostOptimizer.d.ts +11 -0
  120. package/dist/runtime/CostOptimizer.js +21 -0
  121. package/dist/runtime/CostOptimizer.js.map +1 -0
  122. package/dist/shield/PolicyCompiler.d.ts +70 -0
  123. package/dist/shield/PolicyCompiler.js +540 -0
  124. package/dist/shield/PolicyCompiler.js.map +1 -0
  125. package/dist/shield/ProtectedPaths.d.ts +39 -0
  126. package/dist/shield/ProtectedPaths.js +179 -0
  127. package/dist/shield/ProtectedPaths.js.map +1 -0
  128. package/dist/shield/ShieldProtocol.d.ts +50 -0
  129. package/dist/shield/ShieldProtocol.js +103 -0
  130. package/dist/shield/ShieldProtocol.js.map +1 -0
  131. package/dist/skills/SkillMdStandard.d.ts +33 -0
  132. package/dist/skills/SkillMdStandard.js +88 -0
  133. package/dist/skills/SkillMdStandard.js.map +1 -0
  134. package/dist/skills/SkillRegistry.d.ts +9 -1
  135. package/dist/skills/SkillRegistry.js +20 -0
  136. package/dist/skills/SkillRegistry.js.map +1 -1
  137. package/dist/skills/interop/GStackInterop.d.ts +15 -0
  138. package/dist/skills/interop/GStackInterop.js +34 -0
  139. package/dist/skills/interop/GStackInterop.js.map +1 -0
  140. package/dist/skills/interop/OMCInterop.d.ts +15 -0
  141. package/dist/skills/interop/OMCInterop.js +34 -0
  142. package/dist/skills/interop/OMCInterop.js.map +1 -0
  143. package/dist/tui/TuiDashboard.d.ts +3 -0
  144. package/dist/tui/TuiDashboard.js +120 -0
  145. package/dist/tui/TuiDashboard.js.map +1 -0
  146. package/dist/workflow/GateCatalog.d.ts +2 -0
  147. package/dist/workflow/GateCatalog.js +59 -3
  148. package/dist/workflow/GateCatalog.js.map +1 -1
  149. package/dist/workflow/GovernanceTemplatePacks.d.ts +1 -1
  150. package/dist/workflow/GovernanceTemplatePacks.js +15 -0
  151. package/dist/workflow/GovernanceTemplatePacks.js.map +1 -1
  152. package/dist/workflow/TddLoop.d.ts +2 -0
  153. package/dist/workflow/TddLoop.js +2 -0
  154. package/dist/workflow/TddLoop.js.map +1 -1
  155. package/dist/workflow/UpgradeManager.d.ts +10 -1
  156. package/dist/workflow/UpgradeManager.js +55 -0
  157. package/dist/workflow/UpgradeManager.js.map +1 -1
  158. package/dist/workflow/VerificationProfile.d.ts +8 -0
  159. package/dist/workflow/VerificationProfile.js +61 -0
  160. package/dist/workflow/VerificationProfile.js.map +1 -1
  161. package/dist/workflow/VerificationSchema.d.ts +46 -0
  162. package/dist/workflow/VerificationSchema.js +97 -0
  163. package/dist/workflow/VerificationSchema.js.map +1 -0
  164. package/dist/workflow/autofix/AutoFixEngine.d.ts +37 -0
  165. package/dist/workflow/autofix/AutoFixEngine.js +169 -0
  166. package/dist/workflow/autofix/AutoFixEngine.js.map +1 -0
  167. package/dist/workflow/execution/RalphEngine.d.ts +18 -0
  168. package/dist/workflow/execution/RalphEngine.js +22 -0
  169. package/dist/workflow/execution/RalphEngine.js.map +1 -1
  170. package/dist/workflow/gates/EnhancedGates.d.ts +74 -0
  171. package/dist/workflow/gates/EnhancedGates.js +653 -0
  172. package/dist/workflow/gates/EnhancedGates.js.map +1 -0
  173. package/dist/workflow/gates/GateSystem.d.ts +3 -0
  174. package/dist/workflow/gates/GateSystem.js +94 -1
  175. package/dist/workflow/gates/GateSystem.js.map +1 -1
  176. package/dist/workflow/types.d.ts +1 -1
  177. package/docs/README.md +3 -0
  178. package/docs/guides/DEVELOPMENT_WORKFLOW.md +28 -9
  179. package/docs/guides/GETTING_STARTED.md +19 -0
  180. package/docs/guides/MIGRATION.md +119 -0
  181. package/docs/workflow/GATES_AND_SCORE.md +34 -1
  182. package/docs/workflow/README.md +58 -10
  183. package/package.json +5 -17
  184. package/docs/ACTIVE_SECURITY_VISUAL_GATES.md +0 -87
  185. package/docs/AI_ENGINEERING_OS_POSITIONING.md +0 -607
  186. package/docs/BACKGROUND_HUNTER.md +0 -62
  187. package/docs/CODE_INTELLIGENCE.md +0 -180
  188. package/docs/CONTEXT_BUDGET.md +0 -165
  189. package/docs/DEPENDENCY_AUDIT.md +0 -118
  190. package/docs/EVOLUTION_SHADOW_MODE.md +0 -63
  191. package/docs/GITLAB_FLOW.md +0 -125
  192. package/docs/GOVERNANCE_DASHBOARD.md +0 -92
  193. package/docs/MEMORY_BRAIN.md +0 -104
  194. package/docs/MEMORY_FABRIC.md +0 -161
  195. package/docs/RESOURCE_GOVERNANCE.md +0 -92
  196. package/docs/RUNTIME_EVIDENCE.md +0 -101
  197. package/docs/WORKFLOW_EVAL.md +0 -151
  198. package/image/wechat-public.jpg +0 -0
  199. package/image/wxPay.jpg +0 -0
  200. package/image/zfb.jpg +0 -0
@@ -1,151 +0,0 @@
1
- # Workflow Eval Harness
2
-
3
- Status: implemented baseline
4
- Since: v0.22 development branch
5
-
6
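- Workflow Eval Harness provides evidence of whether the workflow actually improves an Agent's engineering delivery quality, rather than relying on subjective impressions. It runs a lightweight eval suite, records pass@k, fix iterations, tool calls, token estimates, and human corrections, and keeps a Failure Replay when a case fails.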
7
-
8
- ## Commands
9
-
10
- Initialize the default baseline suite:
11
-
12
- ```bash
13
- scale eval init
14
- scale eval init --suite workflow-baseline --json
15
- ```
16
-
17
- Run a suite:
18
-
19
- ```bash
20
- scale eval run --suite workflow-baseline
21
- scale eval run --suite workflow-baseline --json
22
- ```
23
-
24
- Compare two runs:
25
-
26
- ```bash
27
- scale eval compare --baseline <run-id> --candidate <run-id>
28
- scale eval compare --baseline <run-id> --candidate <run-id> --json
29
- ```
30
-
31
- Generate a Markdown report:
32
-
33
- ```bash
34
- scale eval report --run <run-id>
35
- scale eval report --run <run-id> --output docs/worklog/eval-report.md
36
- ```
37
-
38
- Inspect and promote failure replays:
39
-
40
- ```bash
41
- scale eval failures --since 30d
42
- scale eval replay <failure-id>
43
- scale eval replay --task-id <task-id>
44
- scale eval promote-failure <failure-id>
45
- ```
46
-
47
- ## Failure Replay To Memory
48
-
49
- Failure Replay is local eval evidence first. When a failure pattern is useful for future work, ingest it into Memory Brain as an `incident` candidate:
50
-
51
- ```bash
52
- scale memory ingest --from failure --failure-id <failure-id>
53
- scale memory query "missing verification evidence"
54
- scale memory promote <memory-node-id>
55
- ```
56
-
57
- This does not auto-change standards or hooks. It only makes the failure queryable and evidence-backed so repeated mistakes can be promoted deliberately after review.
58
-
59
- ## Storage
60
-
61
- ```text
62
- .scale/evals/
63
- ├── suites/
64
- ├── runs/
65
- ├── failures/
66
- └── improvements/
67
- ```
68
-
69
- These files are local runtime evidence by default. Commit only curated summaries or intentional benchmark fixtures.
70
-
71
- ## Suite Shape
72
-
73
- ```json
74
- {
75
-   "version": "1.0",
76
-   "id": "workflow-baseline",
77
-   "name": "SCALE workflow baseline",
78
-   "cases": [
79
-     {
80
-       "id": "governance-command-smoke",
81
-       "type": "bugfix",
82
-       "title": "Command evidence smoke",
83
-       "task": "Verify that a local command can produce concrete eval evidence.",
84
-       "phase": "verify",
85
-       "successCriteria": ["command exits 0"],
86
-       "attempts": [
87
-         {
88
-           "id": "attempt-1",
89
-           "command": "node -e \"console.log('scale-eval-ok')\"",
90
-           "expectedExitCode": 0,
91
-           "outputContains": "scale-eval-ok"
92
-         }
93
-       ]
94
-     }
95
-   ]
96
- }
97
- ```
98
-
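For readers comparing versions, the suite shape shown in the removed doc can be sketched as TypeScript types. This is a minimal sketch, not the package's actual declarations: the field names come from the sample JSON, while the optionality of `outputContains`, the `AttemptResult` type, and the `attemptPassed` helper (including its pass rule) are assumptions made for illustration.

```typescript
// Field names mirror the sample suite JSON in the removed doc.
// Which fields are optional is an assumption.
interface EvalAttempt {
  id: string;
  command: string;
  expectedExitCode: number;
  outputContains?: string;
}

interface EvalCase {
  id: string;
  type: string;
  title: string;
  task: string;
  phase: string;
  successCriteria: string[];
  attempts: EvalAttempt[];
}

interface EvalSuite {
  version: string;
  id: string;
  name: string;
  cases: EvalCase[];
}

// Hypothetical: what running an attempt's command produced.
interface AttemptResult {
  exitCode: number;
  output: string;
}

// Assumed pass rule: the exit code must match and, when `outputContains`
// is set, the output must include that substring.
function attemptPassed(attempt: EvalAttempt, result: AttemptResult): boolean {
  if (result.exitCode !== attempt.expectedExitCode) return false;
  if (attempt.outputContains !== undefined && !result.output.includes(attempt.outputContains)) {
    return false;
  }
  return true;
}

// The sample suite from the doc, typed.
const sampleSuite: EvalSuite = {
  version: "1.0",
  id: "workflow-baseline",
  name: "SCALE workflow baseline",
  cases: [
    {
      id: "governance-command-smoke",
      type: "bugfix",
      title: "Command evidence smoke",
      task: "Verify that a local command can produce concrete eval evidence.",
      phase: "verify",
      successCriteria: ["command exits 0"],
      attempts: [
        {
          id: "attempt-1",
          command: "node -e \"console.log('scale-eval-ok')\"",
          expectedExitCode: 0,
          outputContains: "scale-eval-ok",
        },
      ],
    },
  ],
};
```

Typing the suite this way would let a malformed case be caught at compile time before any command is run; whether the real harness validates suites like this is not stated in the diff.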
99
- ## Metrics
100
-
101
- | Metric | Meaning |
102
- | --- | --- |
103
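- | `passAt1Rate` | Share of cases that pass on the first complete attempt |
104
- | `passAt3Rate` | Share of cases that pass within three attempts |
105
- | `averageFixIterations` | Average number of fix loops after the first failure |
106
- | `totalToolCalls` | Number of eval attempts, a rough proxy for tool-call cost |
107
- | `estimatedTokens` | Estimated token cost of the task and the output summary |
108
- | `humanCorrections` | Number of human corrections |
109
- | `failureReplayCount` | Number of Failure Replay records |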
110
-
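To make the first three metrics in the table concrete, here is a minimal sketch of how they could be derived from per-case attempt outcomes. The `CaseOutcome` representation, the function names, and the exact definition of a "fix iteration" (attempts made after the first one, up to and including the first pass, averaged over cases whose first attempt failed) are assumptions; the removed doc only gives the one-line meanings above.

```typescript
// One boolean per attempt, in order; true means the attempt passed.
// This representation is an assumption for illustration.
type CaseOutcome = boolean[];

interface EvalMetrics {
  passAt1Rate: number;
  passAt3Rate: number;
  averageFixIterations: number;
}

// True if any of the first k attempts passed.
function passedWithin(outcome: CaseOutcome, k: number): boolean {
  return outcome.slice(0, k).some((passed) => passed);
}

// Attempts made after the first one, up to and including the first pass.
// If the case never passes, every attempt after the first counts.
function fixIterations(outcome: CaseOutcome): number {
  const firstPass = outcome.indexOf(true);
  return firstPass === -1 ? outcome.length - 1 : firstPass;
}

function computeMetrics(outcomes: CaseOutcome[]): EvalMetrics {
  const total = outcomes.length;
  if (total === 0) {
    return { passAt1Rate: 0, passAt3Rate: 0, averageFixIterations: 0 };
  }
  const passAt1 = outcomes.filter((o) => passedWithin(o, 1)).length;
  const passAt3 = outcomes.filter((o) => passedWithin(o, 3)).length;
  // Only cases whose first attempt failed contribute fix iterations.
  const failedFirst = outcomes.filter((o) => o.length > 0 && !o[0]);
  const fixTotal = failedFirst.reduce((sum, o) => sum + fixIterations(o), 0);
  return {
    passAt1Rate: passAt1 / total,
    passAt3Rate: passAt3 / total,
    averageFixIterations: failedFirst.length === 0 ? 0 : fixTotal / failedFirst.length,
  };
}

// Four cases: pass first try, pass on retry, never pass, pass first try.
const metrics = computeMetrics([[true], [false, true], [false, false, false], [true]]);
```

With these four cases, two pass on the first attempt and three pass within three attempts, so the rates are 0.5 and 0.75; the two cases that failed first needed 1 and 2 further attempts, giving an average of 1.5.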
111
- ## Failure Replay
112
-
113
- A failure record captures more than the final failed status. It also stores:
114
-
115
- - task and success criteria
116
- - phase
117
- - wrong turn
118
- - evidence
119
- - correction
120
- - prevention
121
- - replay command
122
- - redaction status
123
-
124
- Failure categories currently include:
125
-
126
- - `wrong-exploration-path`
127
- - `hallucinated-project-fact`
128
- - `missing-codegraph-or-graph-fallback`
129
- - `over-broad-context-load`
130
- - `bad-skill-recommendation`
131
- - `missing-verification-evidence`
132
- - `failed-security-or-resource-gate`
133
- - `human-correction-after-agent-confidence`
134
- - `command-failure`
135
- - `unknown`
136
-
137
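- `scale eval promote-failure` promotes a failure replay to an improvement candidate but does not modify project standards automatically. Whether it becomes a long-term standard still requires confirmation by a human or a later review.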
138
-
139
- ## Governance Use
140
-
141
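- - The default suite in v0.22 is a lightweight smoke baseline used to verify that the eval pipeline runs.
142
- - Real projects should gradually add cases of type bugfix, feature, security, frontend, release, and resource.
143
- - Failure Replay should work together with Resource Governance: records stay local by default, and only summaries, benchmarks, or cases explicitly meant for long-term maintenance are committed.
144
- - Workflow Eval data can feed later Governance ROI analysis to judge whether a governance module actually reduces rework, tool calls, tokens, or human corrections.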
145
-
146
- ## Policy
147
-
148
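- - An eval pass rate must not be used as a substitute for validation on the real project.
149
- - Command output in failure records receives basic redaction, but avoid writing sensitive raw logs into a suite.
150
- - Low-cost smoke suites can run frequently; heavyweight project suites should run on demand.
151
- - Without eval evidence, do not claim that workflow capability has improved.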
Binary file
package/image/wxPay.jpg DELETED
Binary file
package/image/zfb.jpg DELETED
Binary file