aw-ecc 1.4.31 → 1.4.47

Files changed (259)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/.codex/hooks/aw-post-tool-use.sh +8 -2
  3. package/.codex/hooks/aw-session-start.sh +11 -4
  4. package/.codex/hooks/aw-stop.sh +8 -2
  5. package/.codex/hooks/aw-user-prompt-submit.sh +10 -2
  6. package/.codex/hooks.json +8 -8
  7. package/.cursor/INSTALL.md +7 -5
  8. package/.cursor/hooks/adapter.js +41 -4
  9. package/.cursor/hooks/after-agent-response.js +62 -0
  10. package/.cursor/hooks/before-submit-prompt.js +7 -1
  11. package/.cursor/hooks/post-tool-use-failure.js +21 -0
  12. package/.cursor/hooks/post-tool-use.js +39 -0
  13. package/.cursor/hooks/shared/aw-phase-definitions.js +53 -0
  14. package/.cursor/hooks/shared/aw-phase-runner.js +3 -1
  15. package/.cursor/hooks/subagent-start.js +22 -4
  16. package/.cursor/hooks/subagent-stop.js +18 -1
  17. package/.cursor/hooks.json +23 -2
  18. package/.opencode/package.json +1 -1
  19. package/AGENTS.md +3 -3
  20. package/README.md +5 -5
  21. package/commands/adk.md +52 -0
  22. package/commands/build.md +22 -9
  23. package/commands/deploy.md +12 -0
  24. package/commands/execute.md +9 -0
  25. package/commands/feature.md +333 -0
  26. package/commands/investigate.md +18 -5
  27. package/commands/plan.md +23 -9
  28. package/commands/publish.md +65 -0
  29. package/commands/review.md +12 -0
  30. package/commands/ship.md +12 -0
  31. package/commands/test.md +12 -0
  32. package/commands/verify.md +9 -0
  33. package/hooks/hooks.json +36 -0
  34. package/manifests/install-components.json +8 -0
  35. package/manifests/install-modules.json +83 -0
  36. package/manifests/install-profiles.json +7 -0
  37. package/package.json +1 -1
  38. package/scripts/ci/validate-rules.js +51 -0
  39. package/scripts/cursor-aw-home/hooks.json +23 -2
  40. package/scripts/cursor-aw-hooks/adapter.js +41 -4
  41. package/scripts/cursor-aw-hooks/before-submit-prompt.js +7 -1
  42. package/scripts/hooks/aw-usage-commit-created.js +32 -0
  43. package/scripts/hooks/aw-usage-post-tool-use-failure.js +56 -0
  44. package/scripts/hooks/aw-usage-post-tool-use.js +242 -0
  45. package/scripts/hooks/aw-usage-prompt-submit.js +112 -0
  46. package/scripts/hooks/aw-usage-session-start.js +48 -0
  47. package/scripts/hooks/aw-usage-stop.js +182 -0
  48. package/scripts/hooks/aw-usage-telemetry-send.js +84 -0
  49. package/scripts/hooks/cost-tracker.js +3 -23
  50. package/scripts/hooks/shared/aw-phase-definitions.js +53 -0
  51. package/scripts/hooks/shared/aw-phase-runner.js +3 -1
  52. package/scripts/lib/aw-hook-contract.js +2 -2
  53. package/scripts/lib/aw-pricing.js +306 -0
  54. package/scripts/lib/aw-usage-telemetry.js +472 -0
  55. package/scripts/lib/codex-hook-config.js +8 -8
  56. package/scripts/lib/cursor-hook-config.js +25 -10
  57. package/scripts/lib/install-targets/codex-home.js +7 -0
  58. package/scripts/lib/install-targets/cursor-project.js +3 -0
  59. package/scripts/lib/install-targets/helpers.js +20 -3
  60. package/skills/aw-adk/SKILL.md +317 -0
  61. package/skills/aw-adk/agents/analyzer.md +113 -0
  62. package/skills/aw-adk/agents/comparator.md +113 -0
  63. package/skills/aw-adk/agents/grader.md +115 -0
  64. package/skills/aw-adk/assets/eval_review.html +76 -0
  65. package/skills/aw-adk/eval-viewer/generate_review.py +164 -0
  66. package/skills/aw-adk/eval-viewer/viewer.html +181 -0
  67. package/skills/aw-adk/evals/eval-colocated-placement.md +84 -0
  68. package/skills/aw-adk/evals/eval-create-agent.md +90 -0
  69. package/skills/aw-adk/evals/eval-create-command.md +98 -0
  70. package/skills/aw-adk/evals/eval-create-eval.md +89 -0
  71. package/skills/aw-adk/evals/eval-create-rule.md +99 -0
  72. package/skills/aw-adk/evals/eval-create-skill.md +97 -0
  73. package/skills/aw-adk/evals/eval-delete-agent.md +79 -0
  74. package/skills/aw-adk/evals/eval-delete-command.md +89 -0
  75. package/skills/aw-adk/evals/eval-delete-rule.md +86 -0
  76. package/skills/aw-adk/evals/eval-delete-skill.md +90 -0
  77. package/skills/aw-adk/evals/eval-meta-eval-coverage.md +78 -0
  78. package/skills/aw-adk/evals/eval-meta-eval-determinism.md +81 -0
  79. package/skills/aw-adk/evals/eval-meta-eval-false-pass.md +81 -0
  80. package/skills/aw-adk/evals/eval-score-accuracy.md +95 -0
  81. package/skills/aw-adk/evals/eval-type-redirect.md +68 -0
  82. package/skills/aw-adk/evals/evals.json +96 -0
  83. package/skills/aw-adk/references/artifact-wiring.md +162 -0
  84. package/skills/aw-adk/references/cross-ide-mapping.md +71 -0
  85. package/skills/aw-adk/references/eval-placement-guide.md +183 -0
  86. package/skills/aw-adk/references/external-resources.md +75 -0
  87. package/skills/aw-adk/references/getting-started.md +66 -0
  88. package/skills/aw-adk/references/registry-structure.md +152 -0
  89. package/skills/aw-adk/references/rubric-agent.md +36 -0
  90. package/skills/aw-adk/references/rubric-command.md +36 -0
  91. package/skills/aw-adk/references/rubric-eval.md +36 -0
  92. package/skills/aw-adk/references/rubric-meta-eval.md +132 -0
  93. package/skills/aw-adk/references/rubric-rule.md +36 -0
  94. package/skills/aw-adk/references/rubric-skill.md +36 -0
  95. package/skills/aw-adk/references/schemas.md +222 -0
  96. package/skills/aw-adk/references/template-agent.md +251 -0
  97. package/skills/aw-adk/references/template-command.md +279 -0
  98. package/skills/aw-adk/references/template-eval.md +176 -0
  99. package/skills/aw-adk/references/template-rule.md +119 -0
  100. package/skills/aw-adk/references/template-skill.md +123 -0
  101. package/skills/aw-adk/references/type-classifier.md +98 -0
  102. package/skills/aw-adk/references/writing-good-agents.md +227 -0
  103. package/skills/aw-adk/references/writing-good-commands.md +258 -0
  104. package/skills/aw-adk/references/writing-good-evals.md +271 -0
  105. package/skills/aw-adk/references/writing-good-rules.md +214 -0
  106. package/skills/aw-adk/references/writing-good-skills.md +159 -0
  107. package/skills/aw-adk/scripts/aggregate-benchmark.py +190 -0
  108. package/skills/aw-adk/scripts/lint-artifact.sh +211 -0
  109. package/skills/aw-adk/scripts/score-artifact.sh +179 -0
  110. package/skills/aw-adk/scripts/trigger-eval.py +192 -0
  111. package/skills/aw-build/SKILL.md +19 -2
  112. package/skills/aw-deploy/SKILL.md +65 -3
  113. package/skills/aw-design/SKILL.md +156 -0
  114. package/skills/aw-design/references/highrise-tokens.md +394 -0
  115. package/skills/aw-design/references/micro-interactions.md +76 -0
  116. package/skills/aw-design/references/prompt-template.md +160 -0
  117. package/skills/aw-design/references/quality-checklist.md +70 -0
  118. package/skills/aw-design/references/self-review.md +497 -0
  119. package/skills/aw-design/references/stitch-workflow.md +127 -0
  120. package/skills/aw-feature/SKILL.md +293 -0
  121. package/skills/aw-investigate/SKILL.md +17 -0
  122. package/skills/aw-plan/SKILL.md +34 -3
  123. package/skills/aw-publish/SKILL.md +300 -0
  124. package/skills/aw-publish/evals/eval-confirmation-gate.md +60 -0
  125. package/skills/aw-publish/evals/eval-intent-detection.md +111 -0
  126. package/skills/aw-publish/evals/eval-push-modes.md +67 -0
  127. package/skills/aw-publish/evals/eval-rules-push.md +60 -0
  128. package/skills/aw-publish/evals/evals.json +29 -0
  129. package/skills/aw-publish/references/push-modes.md +38 -0
  130. package/skills/aw-review/SKILL.md +88 -9
  131. package/skills/aw-rules-review/SKILL.md +124 -0
  132. package/skills/aw-rules-review/agents/openai.yaml +3 -0
  133. package/skills/aw-rules-review/scripts/generate-review-template.mjs +323 -0
  134. package/skills/aw-ship/SKILL.md +16 -0
  135. package/skills/aw-spec/SKILL.md +15 -0
  136. package/skills/aw-tasks/SKILL.md +15 -0
  137. package/skills/aw-test/SKILL.md +16 -0
  138. package/skills/aw-yolo/SKILL.md +4 -0
  139. package/skills/diagnose/SKILL.md +121 -0
  140. package/skills/diagnose/scripts/hitl-loop.template.sh +41 -0
  141. package/skills/finish-only-when-green/SKILL.md +265 -0
  142. package/skills/grill-me/SKILL.md +24 -0
  143. package/skills/grill-with-docs/SKILL.md +92 -0
  144. package/skills/grill-with-docs/adr-format.md +47 -0
  145. package/skills/grill-with-docs/context-format.md +67 -0
  146. package/skills/improve-codebase-architecture/SKILL.md +75 -0
  147. package/skills/improve-codebase-architecture/deepening.md +37 -0
  148. package/skills/improve-codebase-architecture/interface-design.md +44 -0
  149. package/skills/improve-codebase-architecture/language.md +53 -0
  150. package/skills/local-ghl-setup-from-screenshot/SKILL.md +538 -0
  151. package/skills/tdd/SKILL.md +115 -0
  152. package/skills/tdd/deep-modules.md +33 -0
  153. package/skills/tdd/interface-design.md +31 -0
  154. package/skills/tdd/mocking.md +59 -0
  155. package/skills/tdd/refactoring.md +10 -0
  156. package/skills/tdd/tests.md +61 -0
  157. package/skills/to-issues/SKILL.md +62 -0
  158. package/skills/to-prd/SKILL.md +75 -0
  159. package/skills/using-aw-skills/SKILL.md +170 -237
  160. package/skills/using-aw-skills/hooks/session-start.sh +11 -41
  161. package/skills/zoom-out/SKILL.md +24 -0
  162. package/.cursor/rules/common-agents.md +0 -53
  163. package/.cursor/rules/common-aw-routing.md +0 -43
  164. package/.cursor/rules/common-coding-style.md +0 -52
  165. package/.cursor/rules/common-development-workflow.md +0 -33
  166. package/.cursor/rules/common-git-workflow.md +0 -28
  167. package/.cursor/rules/common-hooks.md +0 -34
  168. package/.cursor/rules/common-patterns.md +0 -35
  169. package/.cursor/rules/common-performance.md +0 -59
  170. package/.cursor/rules/common-security.md +0 -33
  171. package/.cursor/rules/common-testing.md +0 -33
  172. package/.cursor/skills/api-and-interface-design/SKILL.md +0 -75
  173. package/.cursor/skills/article-writing/SKILL.md +0 -85
  174. package/.cursor/skills/aw-brainstorm/SKILL.md +0 -115
  175. package/.cursor/skills/aw-build/SKILL.md +0 -152
  176. package/.cursor/skills/aw-build/evals/build-stage-cases.json +0 -28
  177. package/.cursor/skills/aw-debug/SKILL.md +0 -49
  178. package/.cursor/skills/aw-deploy/SKILL.md +0 -101
  179. package/.cursor/skills/aw-deploy/evals/deploy-stage-cases.json +0 -32
  180. package/.cursor/skills/aw-execute/SKILL.md +0 -47
  181. package/.cursor/skills/aw-execute/references/mode-code.md +0 -47
  182. package/.cursor/skills/aw-execute/references/mode-docs.md +0 -28
  183. package/.cursor/skills/aw-execute/references/mode-infra.md +0 -44
  184. package/.cursor/skills/aw-execute/references/mode-migration.md +0 -58
  185. package/.cursor/skills/aw-execute/references/worker-implementer.md +0 -26
  186. package/.cursor/skills/aw-execute/references/worker-parallel-worker.md +0 -23
  187. package/.cursor/skills/aw-execute/references/worker-quality-reviewer.md +0 -23
  188. package/.cursor/skills/aw-execute/references/worker-spec-reviewer.md +0 -23
  189. package/.cursor/skills/aw-execute/scripts/build-worker-bundle.js +0 -229
  190. package/.cursor/skills/aw-finish/SKILL.md +0 -111
  191. package/.cursor/skills/aw-investigate/SKILL.md +0 -109
  192. package/.cursor/skills/aw-plan/SKILL.md +0 -368
  193. package/.cursor/skills/aw-prepare/SKILL.md +0 -118
  194. package/.cursor/skills/aw-review/SKILL.md +0 -118
  195. package/.cursor/skills/aw-ship/SKILL.md +0 -115
  196. package/.cursor/skills/aw-spec/SKILL.md +0 -104
  197. package/.cursor/skills/aw-tasks/SKILL.md +0 -138
  198. package/.cursor/skills/aw-test/SKILL.md +0 -118
  199. package/.cursor/skills/aw-verify/SKILL.md +0 -51
  200. package/.cursor/skills/aw-yolo/SKILL.md +0 -111
  201. package/.cursor/skills/browser-testing-with-devtools/SKILL.md +0 -81
  202. package/.cursor/skills/bun-runtime/SKILL.md +0 -84
  203. package/.cursor/skills/ci-cd-and-automation/SKILL.md +0 -71
  204. package/.cursor/skills/code-simplification/SKILL.md +0 -74
  205. package/.cursor/skills/content-engine/SKILL.md +0 -88
  206. package/.cursor/skills/context-engineering/SKILL.md +0 -74
  207. package/.cursor/skills/deprecation-and-migration/SKILL.md +0 -75
  208. package/.cursor/skills/documentation-and-adrs/SKILL.md +0 -75
  209. package/.cursor/skills/documentation-lookup/SKILL.md +0 -90
  210. package/.cursor/skills/frontend-slides/SKILL.md +0 -184
  211. package/.cursor/skills/frontend-slides/STYLE_PRESETS.md +0 -330
  212. package/.cursor/skills/frontend-ui-engineering/SKILL.md +0 -68
  213. package/.cursor/skills/git-workflow-and-versioning/SKILL.md +0 -75
  214. package/.cursor/skills/idea-refine/SKILL.md +0 -84
  215. package/.cursor/skills/incremental-implementation/SKILL.md +0 -75
  216. package/.cursor/skills/investor-materials/SKILL.md +0 -96
  217. package/.cursor/skills/investor-outreach/SKILL.md +0 -76
  218. package/.cursor/skills/market-research/SKILL.md +0 -75
  219. package/.cursor/skills/mcp-server-patterns/SKILL.md +0 -67
  220. package/.cursor/skills/nextjs-turbopack/SKILL.md +0 -44
  221. package/.cursor/skills/performance-optimization/SKILL.md +0 -77
  222. package/.cursor/skills/security-and-hardening/SKILL.md +0 -70
  223. package/.cursor/skills/using-aw-skills/SKILL.md +0 -290
  224. package/.cursor/skills/using-aw-skills/evals/skill-trigger-cases.tsv +0 -25
  225. package/.cursor/skills/using-aw-skills/evals/test-skill-triggers.sh +0 -171
  226. package/.cursor/skills/using-aw-skills/hooks/hooks.json +0 -9
  227. package/.cursor/skills/using-aw-skills/hooks/session-start.sh +0 -67
  228. package/.cursor/skills/using-platform-skills/SKILL.md +0 -163
  229. package/.cursor/skills/using-platform-skills/evals/platform-selection-cases.json +0 -52
  230. /package/.cursor/rules/{golang-coding-style.md → golang-coding-style.mdc} +0 -0
  231. /package/.cursor/rules/{golang-hooks.md → golang-hooks.mdc} +0 -0
  232. /package/.cursor/rules/{golang-patterns.md → golang-patterns.mdc} +0 -0
  233. /package/.cursor/rules/{golang-security.md → golang-security.mdc} +0 -0
  234. /package/.cursor/rules/{golang-testing.md → golang-testing.mdc} +0 -0
  235. /package/.cursor/rules/{kotlin-coding-style.md → kotlin-coding-style.mdc} +0 -0
  236. /package/.cursor/rules/{kotlin-hooks.md → kotlin-hooks.mdc} +0 -0
  237. /package/.cursor/rules/{kotlin-patterns.md → kotlin-patterns.mdc} +0 -0
  238. /package/.cursor/rules/{kotlin-security.md → kotlin-security.mdc} +0 -0
  239. /package/.cursor/rules/{kotlin-testing.md → kotlin-testing.mdc} +0 -0
  240. /package/.cursor/rules/{php-coding-style.md → php-coding-style.mdc} +0 -0
  241. /package/.cursor/rules/{php-hooks.md → php-hooks.mdc} +0 -0
  242. /package/.cursor/rules/{php-patterns.md → php-patterns.mdc} +0 -0
  243. /package/.cursor/rules/{php-security.md → php-security.mdc} +0 -0
  244. /package/.cursor/rules/{php-testing.md → php-testing.mdc} +0 -0
  245. /package/.cursor/rules/{python-coding-style.md → python-coding-style.mdc} +0 -0
  246. /package/.cursor/rules/{python-hooks.md → python-hooks.mdc} +0 -0
  247. /package/.cursor/rules/{python-patterns.md → python-patterns.mdc} +0 -0
  248. /package/.cursor/rules/{python-security.md → python-security.mdc} +0 -0
  249. /package/.cursor/rules/{python-testing.md → python-testing.mdc} +0 -0
  250. /package/.cursor/rules/{swift-coding-style.md → swift-coding-style.mdc} +0 -0
  251. /package/.cursor/rules/{swift-hooks.md → swift-hooks.mdc} +0 -0
  252. /package/.cursor/rules/{swift-patterns.md → swift-patterns.mdc} +0 -0
  253. /package/.cursor/rules/{swift-security.md → swift-security.mdc} +0 -0
  254. /package/.cursor/rules/{swift-testing.md → swift-testing.mdc} +0 -0
  255. /package/.cursor/rules/{typescript-coding-style.md → typescript-coding-style.mdc} +0 -0
  256. /package/.cursor/rules/{typescript-hooks.md → typescript-hooks.mdc} +0 -0
  257. /package/.cursor/rules/{typescript-patterns.md → typescript-patterns.mdc} +0 -0
  258. /package/.cursor/rules/{typescript-security.md → typescript-security.mdc} +0 -0
  259. /package/.cursor/rules/{typescript-testing.md → typescript-testing.mdc} +0 -0
package/skills/aw-adk/assets/eval_review.html
@@ -0,0 +1,76 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+   <meta charset="UTF-8">
+   <meta name="viewport" content="width=device-width, initial-scale=1.0">
+   <title>ADK Trigger Eval Review</title>
+   <style>
+     * { box-sizing: border-box; margin: 0; padding: 0; }
+     body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif; background: #0d1117; color: #c9d1d9; padding: 20px; max-width: 900px; margin: 0 auto; }
+     h1 { margin-bottom: 8px; }
+     .subtitle { color: #8b949e; margin-bottom: 20px; }
+     .description { background: #161b22; border: 1px solid #30363d; border-radius: 8px; padding: 12px; margin-bottom: 20px; font-family: monospace; font-size: 13px; }
+     .eval-item { background: #161b22; border: 1px solid #30363d; border-radius: 8px; padding: 12px; margin-bottom: 8px; display: flex; align-items: center; gap: 12px; }
+     .eval-item textarea { flex: 1; background: #0d1117; border: 1px solid #30363d; border-radius: 4px; color: #c9d1d9; padding: 6px 8px; font-family: inherit; resize: vertical; min-height: 36px; }
+     .toggle { cursor: pointer; padding: 4px 10px; border-radius: 12px; font-size: 12px; font-weight: 600; border: none; }
+     .toggle.trigger { background: #1b4332; color: #2dd4bf; }
+     .toggle.no-trigger { background: #4c1d1d; color: #f87171; }
+     .remove-btn { cursor: pointer; color: #f87171; background: none; border: none; font-size: 18px; }
+     .add-btn, .export-btn { padding: 8px 16px; border-radius: 6px; border: none; cursor: pointer; font-size: 14px; margin-right: 8px; margin-top: 12px; }
+     .add-btn { background: #21262d; color: #c9d1d9; }
+     .export-btn { background: #238636; color: white; }
+     .add-btn:hover { background: #30363d; }
+     .export-btn:hover { background: #2ea043; }
+   </style>
+ </head>
+ <body>
+
+   <h1>Trigger Eval Review: <span id="skill-name">__SKILL_NAME_PLACEHOLDER__</span></h1>
+   <p class="subtitle">Review and edit trigger eval queries. Toggle should-trigger, add/remove entries, then export.</p>
+
+   <div class="description" id="current-description">__SKILL_DESCRIPTION_PLACEHOLDER__</div>
+
+   <div id="eval-list"></div>
+
+   <button class="add-btn" onclick="addItem(true)">+ Should Trigger</button>
+   <button class="add-btn" onclick="addItem(false)">+ Should NOT Trigger</button>
+   <button class="export-btn" onclick="exportEvalSet()">Export Eval Set</button>
+
+ <script>
+   let evalData = __EVAL_DATA_PLACEHOLDER__;
+
+   function render() {
+     const list = document.getElementById('eval-list');
+     list.innerHTML = '';
+     evalData.forEach((item, i) => {
+       const div = document.createElement('div');
+       div.className = 'eval-item';
+       div.innerHTML = `
+         <button class="toggle ${item.should_trigger ? 'trigger' : 'no-trigger'}" onclick="toggleTrigger(${i})">
+           ${item.should_trigger ? 'TRIGGER' : 'NO TRIGGER'}
+         </button>
+         <textarea oninput="updateQuery(${i}, this.value)">${item.query}</textarea>
+         <button class="remove-btn" onclick="removeItem(${i})">×</button>
+       `;
+       list.appendChild(div);
+     });
+   }
+
+   function toggleTrigger(i) { evalData[i].should_trigger = !evalData[i].should_trigger; render(); }
+   function updateQuery(i, val) { evalData[i].query = val; }
+   function removeItem(i) { evalData.splice(i, 1); render(); }
+   function addItem(shouldTrigger) { evalData.push({ query: '', should_trigger: shouldTrigger }); render(); document.querySelector('.eval-item:last-child textarea').focus(); }
+
+   function exportEvalSet() {
+     const filtered = evalData.filter(e => e.query.trim());
+     const blob = new Blob([JSON.stringify(filtered, null, 2)], { type: 'application/json' });
+     const a = document.createElement('a');
+     a.href = URL.createObjectURL(blob);
+     a.download = 'eval_set.json';
+     a.click();
+   }
+
+   render();
+ </script>
+ </body>
+ </html>
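The template above leaves three placeholders to be substituted before the page is opened: `__SKILL_NAME_PLACEHOLDER__`, `__SKILL_DESCRIPTION_PLACEHOLDER__`, and `__EVAL_DATA_PLACEHOLDER__`. The code that performs the substitution is not part of this hunk, so the following is only a minimal sketch of how it might be done. The helper name `fill_eval_review_template` and its escaping choices are assumptions for illustration, not the package's code.

```python
import html
import json


def fill_eval_review_template(template: str, skill_name: str,
                              description: str, eval_data: list) -> str:
    """Hypothetical helper: substitute the three placeholders in eval_review.html."""
    # json.dumps output is a valid JavaScript literal, but "</" must not appear
    # literally inside <script>: a query containing "</script>" would otherwise
    # end the script block early. "<\/" is read back as "</" by JavaScript.
    data_js = json.dumps(eval_data).replace("</", "<\\/")
    # The name and description land in HTML text nodes, so escape them as HTML.
    # The data is substituted last so its contents are never re-scanned.
    return (template
            .replace("__SKILL_NAME_PLACEHOLDER__", html.escape(skill_name))
            .replace("__SKILL_DESCRIPTION_PLACEHOLDER__", html.escape(description))
            .replace("__EVAL_DATA_PLACEHOLDER__", data_js))


if __name__ == "__main__":
    # A shortened stand-in for the real template, for demonstration only.
    sample_template = (
        '<span id="skill-name">__SKILL_NAME_PLACEHOLDER__</span>'
        '<script>let evalData = __EVAL_DATA_PLACEHOLDER__;</script>'
    )
    page = fill_eval_review_template(
        sample_template, "aw-adk", "not used by the sample template",
        [{"query": "create a new agent", "should_trigger": True}],
    )
    print(page)
```

Each entry in `eval_data` uses the `query` and `should_trigger` keys that the page's `render()` and `exportEvalSet()` functions read and write.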
package/skills/aw-adk/eval-viewer/generate_review.py
@@ -0,0 +1,164 @@
+ #!/usr/bin/env python3
+ """
+ generate_review.py — Generates side-by-side review UI for ADK eval outputs
+
+ Usage:
+     python skills/aw-adk/eval-viewer/generate_review.py <workspace>/iteration-N \\
+         --artifact-name "my-agent" \\
+         [--benchmark <workspace>/iteration-N/benchmark.json] \\
+         [--previous-workspace <workspace>/iteration-<N-1>] \\
+         [--static <output_path>]
+
+ Opens an HTML review interface showing:
+   - Outputs tab: per-case review with feedback textbox
+   - Benchmark tab: quantitative stats comparison
+
+ If --static is provided, writes standalone HTML instead of starting a server.
+
+ Adapted from skill-creator's eval-viewer/generate_review.py for CASRE context.
+ """
+
+ import argparse
+ import http.server
+ import json
+ import os
+ import sys
+ import threading
+ import webbrowser
+ from pathlib import Path
+
+
+ def load_json(path: str) -> dict:
+     try:
+         with open(path, "r") as f:
+             return json.load(f)
+     except (FileNotFoundError, json.JSONDecodeError):
+         return {}
+
+
+ def collect_review_data(iteration_dir: str, previous_dir: str = None) -> list[dict]:
+     """Collect all eval outputs for review."""
+     reviews = []
+     iteration_path = Path(iteration_dir)
+
+     for eval_dir in sorted(iteration_path.iterdir()):
+         if not eval_dir.is_dir():
+             continue
+
+         metadata = load_json(str(eval_dir / "eval_metadata.json"))
+         eval_name = metadata.get("eval_name", eval_dir.name)
+         prompt = metadata.get("prompt", "")
+
+         for config_dir in sorted(eval_dir.iterdir()):
+             if not config_dir.is_dir():
+                 continue
+
+             # Read outputs
+             outputs_dir = config_dir / "outputs"
+             output_files = {}
+             if outputs_dir.exists():
+                 for f in sorted(outputs_dir.iterdir()):
+                     if f.is_file():
+                         try:
+                             output_files[f.name] = f.read_text()
+                         except UnicodeDecodeError:
+                             output_files[f.name] = f"[Binary file: {f.name}]"
+
+             # Read grading
+             grading = load_json(str(config_dir / "grading.json"))
+
+             # Read previous output if available
+             previous_output = None
+             if previous_dir:
+                 prev_config = Path(previous_dir) / eval_dir.name / config_dir.name / "outputs"
+                 if prev_config.exists():
+                     previous_output = {}
+                     for f in sorted(prev_config.iterdir()):
+                         if f.is_file():
+                             try:
+                                 previous_output[f.name] = f.read_text()
+                             except UnicodeDecodeError:
+                                 previous_output[f.name] = f"[Binary file: {f.name}]"
+
+             # Read previous feedback
+             previous_feedback = None
+             if previous_dir:
+                 prev_feedback_path = Path(previous_dir) / "feedback.json"
+                 prev_feedback = load_json(str(prev_feedback_path))
+                 run_id = f"{eval_dir.name}-{config_dir.name}"
+                 for review in prev_feedback.get("reviews", []):
+                     if review.get("run_id") == run_id:
+                         previous_feedback = review.get("feedback", "")
+
+             reviews.append({
+                 "run_id": f"{eval_dir.name}-{config_dir.name}",
+                 "eval_name": eval_name,
+                 "configuration": config_dir.name,
+                 "prompt": prompt,
+                 "outputs": output_files,
+                 "grading": grading,
+                 "previous_output": previous_output,
+                 "previous_feedback": previous_feedback,
+             })
+
+     return reviews
+
+
+ def generate_html(reviews: list[dict], benchmark: dict = None, artifact_name: str = "artifact") -> str:
+     """Generate the review HTML page."""
+     template_path = Path(__file__).parent / "viewer.html"
+
+     if template_path.exists():
+         html = template_path.read_text()
+         html = html.replace("__REVIEW_DATA_PLACEHOLDER__", json.dumps(reviews))
+         html = html.replace("__BENCHMARK_DATA_PLACEHOLDER__", json.dumps(benchmark or {}))
+         html = html.replace("__ARTIFACT_NAME_PLACEHOLDER__", artifact_name)
+         return html
+
+     # Fallback: minimal HTML
+     return f"""<!DOCTYPE html>
+ <html>
+ <head><title>ADK Review: {artifact_name}</title></head>
+ <body>
+ <h1>ADK Eval Review: {artifact_name}</h1>
+ <p>{len(reviews)} test cases loaded. See console for data.</p>
+ <script>
+ const reviewData = {json.dumps(reviews, indent=2)};
+ const benchmarkData = {json.dumps(benchmark or {}, indent=2)};
+ console.log('Review data:', reviewData);
+ console.log('Benchmark data:', benchmarkData);
+ </script>
+ </body>
+ </html>"""
+
+
+ def main():
+     parser = argparse.ArgumentParser(description="Generate ADK eval review UI")
+     parser.add_argument("iteration_dir", help="Path to iteration directory")
+     parser.add_argument("--artifact-name", default="artifact", help="Name of the artifact being reviewed")
+     parser.add_argument("--benchmark", help="Path to benchmark.json")
+     parser.add_argument("--previous-workspace", help="Path to previous iteration directory")
+     parser.add_argument("--static", help="Write standalone HTML to this path instead of starting server")
+     args = parser.parse_args()
+
+     reviews = collect_review_data(args.iteration_dir, args.previous_workspace)
+     benchmark = load_json(args.benchmark) if args.benchmark else None
+
+     html = generate_html(reviews, benchmark, args.artifact_name)
+
+     if args.static:
+         with open(args.static, "w") as f:
+             f.write(html)
+         print(f"Wrote static review to {args.static}")
+     else:
+         # Write temp file and open in browser
+         import tempfile
+         tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".html", delete=False, prefix="adk-review-")
+         tmp.write(html)
+         tmp.close()
+         print(f"Opening review in browser: {tmp.name}")
+         webbrowser.open(f"file://{tmp.name}")
+
+
+ if __name__ == "__main__":
+     main()
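`collect_review_data` above implies a workspace layout of `<iteration>/<eval>/<configuration>/outputs/*`, with an `eval_metadata.json` in each eval directory and a `grading.json` in each configuration directory. The sketch below builds such a layout in a temporary directory and derives the same `run_id` values the script uses. The eval name `eval-create-agent`, the configuration names `baseline` and `with_skill`, and all file contents are made up for illustration; the real names depend on how the evals are run.

```python
import json
import tempfile
from pathlib import Path


def build_sample_iteration(root: Path) -> Path:
    """Create a tiny iteration directory in the shape collect_review_data reads."""
    iteration = root / "iteration-1"
    eval_dir = iteration / "eval-create-agent"      # hypothetical eval name
    for config in ("baseline", "with_skill"):       # hypothetical configuration names
        outputs = eval_dir / config / "outputs"
        outputs.mkdir(parents=True)
        (outputs / "result.md").write_text(f"output from {config}\n")
        # grading.json sits next to outputs/, one per configuration.
        (eval_dir / config / "grading.json").write_text(
            json.dumps({"summary": {"passed": 1, "total": 1}, "expectations": []}))
    # eval_metadata.json sits at the eval level and supplies the name and prompt.
    (eval_dir / "eval_metadata.json").write_text(
        json.dumps({"eval_name": "create agent", "prompt": "Create an agent"}))
    return iteration


def list_run_ids(iteration: Path) -> list[str]:
    """Derive run ids the same way as the script: '<eval dir>-<config dir>'."""
    run_ids = []
    for eval_dir in sorted(p for p in iteration.iterdir() if p.is_dir()):
        for config_dir in sorted(p for p in eval_dir.iterdir() if p.is_dir()):
            run_ids.append(f"{eval_dir.name}-{config_dir.name}")
    return run_ids


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        iteration = build_sample_iteration(Path(tmp))
        print(list_run_ids(iteration))
```

The `run_id` is also the key used to match entries in the previous iteration's `feedback.json`, so renaming an eval or configuration directory between iterations would leave `previous_feedback` empty for that run.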
package/skills/aw-adk/eval-viewer/viewer.html
@@ -0,0 +1,181 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+   <meta charset="UTF-8">
+   <meta name="viewport" content="width=device-width, initial-scale=1.0">
+   <title>ADK Eval Review: __ARTIFACT_NAME_PLACEHOLDER__</title>
+   <style>
+     * { box-sizing: border-box; margin: 0; padding: 0; }
+     body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif; background: #0d1117; color: #c9d1d9; padding: 20px; }
+     .tabs { display: flex; gap: 4px; margin-bottom: 20px; }
+     .tab { padding: 8px 16px; background: #161b22; border: 1px solid #30363d; border-radius: 6px 6px 0 0; cursor: pointer; color: #8b949e; }
+     .tab.active { background: #0d1117; border-bottom-color: transparent; color: #c9d1d9; }
+     .panel { display: none; }
+     .panel.active { display: block; }
+     .nav { display: flex; gap: 8px; margin-bottom: 16px; align-items: center; }
+     .nav button { padding: 6px 12px; background: #21262d; border: 1px solid #30363d; border-radius: 6px; color: #c9d1d9; cursor: pointer; }
+     .nav button:hover { background: #30363d; }
+     .nav span { color: #8b949e; }
+     .card { background: #161b22; border: 1px solid #30363d; border-radius: 8px; padding: 16px; margin-bottom: 16px; }
+     .card h3 { color: #58a6ff; margin-bottom: 8px; }
+     .card pre { background: #0d1117; padding: 12px; border-radius: 4px; overflow-x: auto; font-size: 13px; white-space: pre-wrap; }
+     .feedback textarea { width: 100%; height: 80px; background: #0d1117; border: 1px solid #30363d; border-radius: 6px; color: #c9d1d9; padding: 8px; font-family: inherit; resize: vertical; }
+     .badge { display: inline-block; padding: 2px 8px; border-radius: 12px; font-size: 12px; font-weight: 600; }
+     .badge.pass { background: #1b4332; color: #2dd4bf; }
+     .badge.fail { background: #4c1d1d; color: #f87171; }
+     .config-label { font-size: 12px; color: #8b949e; text-transform: uppercase; letter-spacing: 1px; }
+     details { margin-top: 8px; }
+     details summary { cursor: pointer; color: #8b949e; }
+     .submit-btn { padding: 10px 24px; background: #238636; border: none; border-radius: 6px; color: white; font-size: 14px; cursor: pointer; margin-top: 16px; }
+     .submit-btn:hover { background: #2ea043; }
+     table { width: 100%; border-collapse: collapse; }
+     th, td { padding: 8px 12px; text-align: left; border-bottom: 1px solid #21262d; }
+     th { color: #8b949e; font-weight: 600; }
+   </style>
+ </head>
+ <body>
+
+   <h1 style="margin-bottom: 20px;">ADK Eval Review: <span id="artifact-name"></span></h1>
+
+   <div class="tabs">
+     <div class="tab active" onclick="switchTab('outputs')">Outputs</div>
+     <div class="tab" onclick="switchTab('benchmark')">Benchmark</div>
+   </div>
+
+   <div id="outputs-panel" class="panel active">
+     <div class="nav">
+       <button onclick="prev()">← Prev</button>
+       <span id="counter">1 / 1</span>
+       <button onclick="next()">Next →</button>
+     </div>
+     <div id="review-content"></div>
+     <button class="submit-btn" onclick="submitAll()">Submit All Reviews</button>
+   </div>
+
+   <div id="benchmark-panel" class="panel">
+     <div id="benchmark-content"></div>
+   </div>
+
+ <script>
+   const reviews = __REVIEW_DATA_PLACEHOLDER__;
+   const benchmark = __BENCHMARK_DATA_PLACEHOLDER__;
+   const artifactName = "__ARTIFACT_NAME_PLACEHOLDER__";
+   let currentIndex = 0;
+   const feedbackStore = {};
+
+   document.getElementById('artifact-name').textContent = artifactName;
+
+   function switchTab(tab) {
+     document.querySelectorAll('.tab').forEach(t => t.classList.remove('active'));
+     document.querySelectorAll('.panel').forEach(p => p.classList.remove('active'));
+     event.target.classList.add('active');
+     document.getElementById(tab + '-panel').classList.add('active');
+   }
+
+   function renderReview(index) {
+     if (!reviews.length) { document.getElementById('review-content').innerHTML = '<p>No review data.</p>'; return; }
+     const r = reviews[index];
+     document.getElementById('counter').textContent = `${index + 1} / ${reviews.length}`;
+
+     let html = '';
+
+     // Prompt
+     html += `<div class="card"><h3>Prompt</h3><pre>${escapeHtml(r.prompt || 'N/A')}</pre></div>`;
+
+     // Config label
+     html += `<div class="config-label">${r.configuration}</div>`;
+
+     // Outputs
+     if (r.outputs && Object.keys(r.outputs).length) {
+       for (const [name, content] of Object.entries(r.outputs)) {
+         html += `<div class="card"><h3>${escapeHtml(name)}</h3><pre>${escapeHtml(content)}</pre></div>`;
+       }
+     }
+
+     // Previous output
+     if (r.previous_output) {
+       html += `<details><summary>Previous Output</summary>`;
+       for (const [name, content] of Object.entries(r.previous_output)) {
+         html += `<div class="card"><h3>${escapeHtml(name)}</h3><pre>${escapeHtml(content)}</pre></div>`;
+       }
+       html += `</details>`;
+     }
+
+     // Grading
+     if (r.grading && r.grading.expectations) {
+       html += `<details><summary>Formal Grades (${r.grading.summary?.passed || 0}/${r.grading.summary?.total || 0} passed)</summary>`;
+       for (const exp of r.grading.expectations) {
+         const badge = exp.passed ? '<span class="badge pass">PASS</span>' : '<span class="badge fail">FAIL</span>';
+         html += `<div class="card">${badge} ${escapeHtml(exp.text)}<pre>${escapeHtml(exp.evidence || '')}</pre></div>`;
+       }
+       html += `</details>`;
+     }
+
+     // Previous feedback
+     if (r.previous_feedback) {
+       html += `<div class="card"><h3>Previous Feedback</h3><pre>${escapeHtml(r.previous_feedback)}</pre></div>`;
+     }
+
+     // Feedback textarea
+     const savedFeedback = feedbackStore[r.run_id] || '';
+     html += `<div class="feedback"><h3 style="color:#58a6ff;margin-bottom:8px;">Feedback</h3>`;
+     html += `<textarea id="feedback-input" placeholder="Leave feedback (empty = looks good)" oninput="saveFeedback()">${escapeHtml(savedFeedback)}</textarea></div>`;
+
+     document.getElementById('review-content').innerHTML = html;
+   }
+
127
+ function saveFeedback() {
128
+ const r = reviews[currentIndex];
129
+ const input = document.getElementById('feedback-input'); if (r && input) feedbackStore[r.run_id] = input.value;
130
+ }
131
+
132
+ function prev() { if (currentIndex > 0) { saveFeedback(); currentIndex--; renderReview(currentIndex); } }
133
+ function next() { if (currentIndex < reviews.length - 1) { saveFeedback(); currentIndex++; renderReview(currentIndex); } }
134
+
135
+ function submitAll() {
136
+ saveFeedback();
137
+ const feedback = { reviews: reviews.map(r => ({ run_id: r.run_id, feedback: feedbackStore[r.run_id] || '', timestamp: new Date().toISOString() })), status: 'complete' };
138
+ const blob = new Blob([JSON.stringify(feedback, null, 2)], { type: 'application/json' });
139
+ const a = document.createElement('a');
140
+ a.href = URL.createObjectURL(blob);
141
+ a.download = 'feedback.json';
142
+ a.click();
143
+ alert('Feedback saved! Copy feedback.json to your workspace directory.');
144
+ }
145
+
146
+ function renderBenchmark() {
147
+ const el = document.getElementById('benchmark-content');
148
+ if (!benchmark || !benchmark.run_summary) { el.innerHTML = '<p>No benchmark data.</p>'; return; }
149
+
150
+ let html = '<div class="card"><h3>Summary</h3><table><tr><th>Config</th><th>Pass Rate</th><th>Time</th><th>Tokens</th></tr>';
151
+ for (const [config, stats] of Object.entries(benchmark.run_summary)) {
152
+ if (config === 'delta') continue;
153
+ const pr = stats.pass_rate || {};
154
+ const ts = stats.time_seconds || {};
155
+ const tk = stats.tokens || {};
156
+ html += `<tr><td>${escapeHtml(config)}</td><td>${((pr.mean||0)*100).toFixed(1)}% ± ${((pr.stddev||0)*100).toFixed(1)}%</td><td>${(ts.mean||0).toFixed(1)}s</td><td>${(tk.mean||0).toFixed(0)}</td></tr>`;
157
+ }
158
+ if (benchmark.run_summary.delta) {
159
+ const d = benchmark.run_summary.delta;
160
+ html += `<tr style="font-weight:bold;color:#58a6ff"><td>Delta</td><td>${escapeHtml(d.pass_rate)}</td><td>${escapeHtml(d.time_seconds)}s</td><td>${escapeHtml(d.tokens)}</td></tr>`;
161
+ }
162
+ html += '</table></div>';
163
+
164
+ if (benchmark.notes && benchmark.notes.length) {
165
+ html += '<div class="card"><h3>Analyst Notes</h3><ul>';
166
+ for (const note of benchmark.notes) { html += `<li>${escapeHtml(note)}</li>`; }
167
+ html += '</ul></div>';
168
+ }
169
+
170
+ el.innerHTML = html;
171
+ }
172
+
173
+ function escapeHtml(s) { return String(s).replace(/&/g,'&amp;').replace(/</g,'&lt;').replace(/>/g,'&gt;').replace(/"/g,'&quot;').replace(/'/g,'&#39;'); }
174
+
175
+ document.addEventListener('keydown', e => { if (e.key === 'ArrowLeft') prev(); if (e.key === 'ArrowRight') next(); });
176
+
177
+ renderReview(0);
178
+ renderBenchmark();
179
+ </script>
180
+ </body>
181
+ </html>
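Reviewer note: `submitAll()` downloads a pretty-printed `feedback.json` whose top-level `status` is `'complete'`. Before copying that file into the workspace, a quick sanity check can confirm the download is non-empty and carries that status. The following is a minimal sketch only; the helper name `feedback_is_complete` is hypothetical and not part of the package, and it relies on a plain text match rather than a real JSON parser.

```bash
# Hypothetical helper (not shipped with aw-ecc): feedback_is_complete <file>
# Succeeds when the file is non-empty and contains "status": "complete".
feedback_is_complete() {
  local file="$1"
  # -s: file exists and has a size greater than zero
  [[ -s "$file" ]] || return 1
  # Tolerate any whitespace around the colon (pretty-printed or compact JSON)
  grep -Eq '"status"[[:space:]]*:[[:space:]]*"complete"' "$file"
}
```

Usage: `feedback_is_complete feedback.json || echo "feedback.json is missing or incomplete"`.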
@@ -0,0 +1,84 @@
1
+ ---
2
+ name: eval-colocated-placement
3
+ target: skill/aw-adk
4
+ category: structural
5
+ difficulty: basic
6
+ ---
7
+
8
+ # Eval: Colocated Placement — Evals Land in Correct Directory
9
+
10
+ ## Task
11
+
12
+ Test that the ADK places evals in the correct colocated directory for each artifact type. Each CASRE type has a different eval placement pattern. This eval creates an artifact and checks that its evals end up in the right location — not in a centralized `evals/` directory.
13
+
14
+ ### Prompt
15
+
16
+ ```
17
+ Create an agent for API rate limiting in the platform/services namespace. It should enforce per-tenant rate limits on HTTP endpoints using sliding window counters in Redis. Tools: Read, Bash, Grep, Glob. Model: sonnet. No existing skills to reference — use skills: [].
18
+ ```
19
+
20
+ ## Context
21
+
22
+ | Field | Value |
23
+ |-------|-------|
24
+ | **Namespace** | `platform/services` |
25
+ | **Domain** | `services` |
26
+ | **Target artifact** | `skills/aw-adk/SKILL.md` |
27
+ | **Target type** | `agent` |
28
+
29
+ ## Expected Outcomes
30
+
31
+ - [ ] **Agent created** at `.aw/.aw_registry/platform/services/agents/api-rate-limiter.md` (or similar slug)
32
+ - [ ] **Evals created** at `.aw/.aw_registry/platform/services/agents/evals/api-rate-limiter/eval-*.md`
33
+ - [ ] **NOT placed** at a top-level `evals/` directory
34
+ - [ ] **NOT placed** at `.aw/.aw_registry/platform/services/evals/` (wrong nesting)
35
+ - [ ] **Each eval has `target:` frontmatter** referencing the parent agent
36
+ - [ ] **At least 2 eval files** created
37
+
38
+ ## Grading Criteria
39
+
40
+ ### PASS
41
+
42
+ - Evals are in the correct colocated path: `agents/evals/<slug>/eval-*.md`
43
+ - 2+ eval files exist
44
+ - All have correct `target:` frontmatter
45
+
46
+ ### PARTIAL
47
+
48
+ - Evals created but in a slightly wrong path (e.g., `agents/evals/eval-*.md` without the slug subdirectory)
49
+
50
+ ### FAIL
51
+
52
+ - Evals in a centralized location
53
+ - No evals created
54
+ - Evals reference wrong parent artifact
55
+
56
+ ## Evaluation Method
57
+
58
+ **Type:** deterministic
59
+
60
+ ### Deterministic Checks
61
+
62
+ ```bash
63
+ # Find the agent file
64
+ AGENT_PATH=$(find .aw/.aw_registry/platform/services/agents/ -name "*rate-limit*" -not -path "*/evals/*" | head -1)
65
+ SLUG=$(basename "${AGENT_PATH:?no agent file matching *rate-limit* found}" .md)
66
+
67
+ # Verify evals are colocated
68
+ EVAL_COUNT=$(ls .aw/.aw_registry/platform/services/agents/evals/"$SLUG"/eval-*.md 2>/dev/null | wc -l)
69
+ [[ "$EVAL_COUNT" -ge 2 ]] || echo "FAIL: expected 2+ evals at agents/evals/$SLUG/, found $EVAL_COUNT"
70
+
71
+ # Verify no centralized placement
72
+ ls .aw/.aw_registry/platform/services/evals/ 2>/dev/null && echo "WARN: centralized evals/ exists"
73
+
74
+ # Verify target frontmatter
75
+ for f in .aw/.aw_registry/platform/services/agents/evals/"$SLUG"/eval-*.md; do
76
+ grep -q "^target:" "$f" || echo "FAIL: $f missing target frontmatter"
77
+ done
78
+ ```
79
+
80
+ ## Baseline Expectations
81
+
82
+ - Without ADK: Evals placed arbitrarily or not created at all.
83
+ - With ADK: Correct colocated placement per eval-placement-guide.md.
84
+ - **Expected delta:** 100% correct placement with ADK
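Reviewer note: the deterministic checks in this eval can misreport when no agent file is found (empty slug) or when the eval glob matches nothing (the loop then runs once on the unexpanded pattern). A more defensive variant might look like the sketch below. It assumes the same `agents/evals/<slug>/eval-*.md` layout described in the eval; the function name `check_colocated_evals` and its two arguments are hypothetical, not part of the ADK.

```bash
# Hypothetical helper: check_colocated_evals <agents-dir> <name-pattern>
# Prints PASS/FAIL lines and returns non-zero if any check fails.
check_colocated_evals() {
  local agents_dir="$1" pattern="$2" agent_path slug f status=0
  local -a evals=()

  # Locate the agent file, ignoring anything under evals/.
  agent_path=$(find "$agents_dir" -type f -name "$pattern" -not -path "*/evals/*" 2>/dev/null | head -1)
  if [[ -z "$agent_path" ]]; then
    echo "FAIL: no agent matching $pattern under $agents_dir"
    return 1
  fi
  slug=$(basename "$agent_path" .md)

  # Collect eval files; skip the literal pattern when the glob matches nothing.
  for f in "$agents_dir/evals/$slug"/eval-*.md; do
    if [[ -e "$f" ]]; then evals+=("$f"); fi
  done
  if (( ${#evals[@]} < 2 )); then
    echo "FAIL: expected 2+ evals at evals/$slug/, found ${#evals[@]}"
    status=1
  fi

  # Every eval that exists must carry a target: line.
  for f in "${evals[@]}"; do
    grep -q "^target:" "$f" || { echo "FAIL: $f missing target frontmatter"; status=1; }
  done

  if (( status == 0 )); then
    echo "PASS: ${#evals[@]} colocated evals for $slug"
  fi
  return "$status"
}
```

Usage: `check_colocated_evals .aw/.aw_registry/platform/services/agents "*rate-limit*"`.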
@@ -0,0 +1,90 @@
1
+ ---
2
+ name: eval-create-agent
3
+ target: skill/aw-adk
4
+ category: functional
5
+ difficulty: intermediate
6
+ ---
7
+
8
+ # Eval: Create Agent — Phantom Dependency Detection
9
+
10
+ ## Task
11
+
12
+ Test that the ADK creates an agent with valid skill references (no phantom dependencies) and follows the full create flow. This eval specifically targets the phantom skill problem — agents listing skills in `skills:` frontmatter that don't exist in the registry.
13
+
14
+ ### Prompt
15
+
16
+ ```
17
+ Create an agent for payments processing in the revex/reselling namespace, under the backend domain. It should validate payment webhook signatures, reconcile transaction records against Stripe events, and flag discrepancies. Tools needed: Read, Bash, Grep, Glob. Model: sonnet. No existing skills to reference — use skills: [].
18
+ ```
19
+
20
+ ## Context
21
+
22
+ | Field | Value |
23
+ |-------|-------|
24
+ | **Namespace** | `revex/reselling` |
25
+ | **Domain** | `backend` |
26
+ | **Target artifact** | `skills/aw-adk/SKILL.md` |
27
+ | **Target type** | `agent` |
28
+
29
+ ## Expected Outcomes
30
+
31
+ - [ ] **Type classified correctly** — identified as `agent`
32
+ - [ ] **Interview conducted** — asked about agent's purpose, tools needed, model, skills
33
+ - [ ] **Path resolved** — target at `.aw/.aw_registry/revex/reselling/<domain>/agents/payments-processor.md` (domain may vary)
34
+ - [ ] **Agent created** with frontmatter: `name`, `description`, `tools`, `model`, `category`, `squad`, `skills`
35
+ - [ ] **No phantom skills** — every entry in `skills:` frontmatter either exists in the registry OR `skills: []` is used
36
+ - [ ] **Identity section present** — agent has a clear identity/mission section
37
+ - [ ] **CHECKPOINT output shown** — remaining steps printed before continuing
38
+ - [ ] **Lint ran and passed** — `lint-artifact.sh` executed, no phantom_skill errors
39
+ - [ ] **Scoring performed** — rubric-agent.md read, 10-dimension score table output
40
+ - [ ] **2+ evals created** — colocated at `agents/evals/<slug>/eval-*.md`
41
+ - [ ] **Evals derive from agent structure** — at least one eval exercises the agent's specific domain (payments), not generic checks
42
+ - [ ] **`aw link` ran**
43
+
44
+ ## Grading Criteria
45
+
46
+ ### PASS (all conditions met)
47
+
48
+ - All 12 outcomes checked
49
+ - Zero phantom dependencies
50
+ - Agent content is payments-domain-specific
51
+
52
+ ### PARTIAL (8+ of 12)
53
+
54
+ - Agent created correctly but some flow steps skipped
55
+ - OR agent has phantom skills but lint caught them
56
+
57
+ ### FAIL (below 8)
58
+
59
+ - Phantom skills in `skills:` frontmatter that lint didn't catch
60
+ - Steps 5-14 skipped
61
+ - Wrong type classification
62
+
63
+ ## Evaluation Method
64
+
65
+ **Type:** hybrid
66
+
67
+ ### Deterministic Checks
68
+
69
+ ```bash
70
+ # Run lint — will catch phantom skills
71
+ bash ~/.aw-ecc/skills/aw-adk/scripts/lint-artifact.sh "<agent-path>" agent
72
+
73
+ # Verify frontmatter fields
74
+ grep -q "^name:" "<agent-path>" || echo "FAIL: missing name"
75
+ grep -q "^tools:" "<agent-path>" || echo "FAIL: missing tools"
76
+ grep -q "^model:" "<agent-path>" || echo "FAIL: missing model"
77
+ grep -q "^skills:" "<agent-path>" || echo "FAIL: missing skills field"
78
+ ```
79
+
80
+ ### Model-Based Checks
81
+
82
+ - Are skill references valid (either empty or pointing to real skills)?
83
+ - Is the agent's identity specific to payments processing?
84
+ - Did the executor show the CHECKPOINT step?
85
+
86
+ ## Baseline Expectations
87
+
88
+ - Without ADK: Agent created with phantom skill references, no lint validation, no evals.
89
+ - With ADK: Phantom-free agent, lint-validated, scored, with colocated evals.
90
+ - **Expected delta:** zero phantom dependencies vs. 2-3 phantoms without ADK
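Reviewer note: the `grep -q "^name:"` style checks in this eval match anywhere in the file, so a `model:` line in the agent body would satisfy them even if the frontmatter lacks the field. They also cover only four of the seven fields listed under Expected Outcomes. One possible tightening is sketched below, assuming frontmatter is delimited by `---` on the first line and a later closing `---`; the helper names `frontmatter_has` and `check_agent_frontmatter` are hypothetical.

```bash
# Hypothetical helper: frontmatter_has <file> <field>
# Succeeds only if "<field>:" starts a line inside the leading --- block.
frontmatter_has() {
  local file="$1" field="$2"
  awk -v key="$field" '
    NR == 1 && $0 != "---"  { exit }             # file has no frontmatter
    NR > 1 && $0 == "---"   { exit }             # reached end of frontmatter
    index($0, key ":") == 1 { found = 1; exit }  # field found at line start
    END { exit (found ? 0 : 1) }
  ' "$file"
}

# Hypothetical wrapper: report every missing field from the eval's list.
check_agent_frontmatter() {
  local file="$1" field status=0
  for field in name description tools model category squad skills; do
    frontmatter_has "$file" "$field" || { echo "FAIL: missing $field"; status=1; }
  done
  return "$status"
}
```

Usage: `check_agent_frontmatter "<agent-path>"`, with `<agent-path>` as in the checks above.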