aw-ecc 1.4.31 → 1.4.47

Files changed (259)
  1. package/.claude-plugin/plugin.json +1 -1
  2. package/.codex/hooks/aw-post-tool-use.sh +8 -2
  3. package/.codex/hooks/aw-session-start.sh +11 -4
  4. package/.codex/hooks/aw-stop.sh +8 -2
  5. package/.codex/hooks/aw-user-prompt-submit.sh +10 -2
  6. package/.codex/hooks.json +8 -8
  7. package/.cursor/INSTALL.md +7 -5
  8. package/.cursor/hooks/adapter.js +41 -4
  9. package/.cursor/hooks/after-agent-response.js +62 -0
  10. package/.cursor/hooks/before-submit-prompt.js +7 -1
  11. package/.cursor/hooks/post-tool-use-failure.js +21 -0
  12. package/.cursor/hooks/post-tool-use.js +39 -0
  13. package/.cursor/hooks/shared/aw-phase-definitions.js +53 -0
  14. package/.cursor/hooks/shared/aw-phase-runner.js +3 -1
  15. package/.cursor/hooks/subagent-start.js +22 -4
  16. package/.cursor/hooks/subagent-stop.js +18 -1
  17. package/.cursor/hooks.json +23 -2
  18. package/.opencode/package.json +1 -1
  19. package/AGENTS.md +3 -3
  20. package/README.md +5 -5
  21. package/commands/adk.md +52 -0
  22. package/commands/build.md +22 -9
  23. package/commands/deploy.md +12 -0
  24. package/commands/execute.md +9 -0
  25. package/commands/feature.md +333 -0
  26. package/commands/investigate.md +18 -5
  27. package/commands/plan.md +23 -9
  28. package/commands/publish.md +65 -0
  29. package/commands/review.md +12 -0
  30. package/commands/ship.md +12 -0
  31. package/commands/test.md +12 -0
  32. package/commands/verify.md +9 -0
  33. package/hooks/hooks.json +36 -0
  34. package/manifests/install-components.json +8 -0
  35. package/manifests/install-modules.json +83 -0
  36. package/manifests/install-profiles.json +7 -0
  37. package/package.json +1 -1
  38. package/scripts/ci/validate-rules.js +51 -0
  39. package/scripts/cursor-aw-home/hooks.json +23 -2
  40. package/scripts/cursor-aw-hooks/adapter.js +41 -4
  41. package/scripts/cursor-aw-hooks/before-submit-prompt.js +7 -1
  42. package/scripts/hooks/aw-usage-commit-created.js +32 -0
  43. package/scripts/hooks/aw-usage-post-tool-use-failure.js +56 -0
  44. package/scripts/hooks/aw-usage-post-tool-use.js +242 -0
  45. package/scripts/hooks/aw-usage-prompt-submit.js +112 -0
  46. package/scripts/hooks/aw-usage-session-start.js +48 -0
  47. package/scripts/hooks/aw-usage-stop.js +182 -0
  48. package/scripts/hooks/aw-usage-telemetry-send.js +84 -0
  49. package/scripts/hooks/cost-tracker.js +3 -23
  50. package/scripts/hooks/shared/aw-phase-definitions.js +53 -0
  51. package/scripts/hooks/shared/aw-phase-runner.js +3 -1
  52. package/scripts/lib/aw-hook-contract.js +2 -2
  53. package/scripts/lib/aw-pricing.js +306 -0
  54. package/scripts/lib/aw-usage-telemetry.js +472 -0
  55. package/scripts/lib/codex-hook-config.js +8 -8
  56. package/scripts/lib/cursor-hook-config.js +25 -10
  57. package/scripts/lib/install-targets/codex-home.js +7 -0
  58. package/scripts/lib/install-targets/cursor-project.js +3 -0
  59. package/scripts/lib/install-targets/helpers.js +20 -3
  60. package/skills/aw-adk/SKILL.md +317 -0
  61. package/skills/aw-adk/agents/analyzer.md +113 -0
  62. package/skills/aw-adk/agents/comparator.md +113 -0
  63. package/skills/aw-adk/agents/grader.md +115 -0
  64. package/skills/aw-adk/assets/eval_review.html +76 -0
  65. package/skills/aw-adk/eval-viewer/generate_review.py +164 -0
  66. package/skills/aw-adk/eval-viewer/viewer.html +181 -0
  67. package/skills/aw-adk/evals/eval-colocated-placement.md +84 -0
  68. package/skills/aw-adk/evals/eval-create-agent.md +90 -0
  69. package/skills/aw-adk/evals/eval-create-command.md +98 -0
  70. package/skills/aw-adk/evals/eval-create-eval.md +89 -0
  71. package/skills/aw-adk/evals/eval-create-rule.md +99 -0
  72. package/skills/aw-adk/evals/eval-create-skill.md +97 -0
  73. package/skills/aw-adk/evals/eval-delete-agent.md +79 -0
  74. package/skills/aw-adk/evals/eval-delete-command.md +89 -0
  75. package/skills/aw-adk/evals/eval-delete-rule.md +86 -0
  76. package/skills/aw-adk/evals/eval-delete-skill.md +90 -0
  77. package/skills/aw-adk/evals/eval-meta-eval-coverage.md +78 -0
  78. package/skills/aw-adk/evals/eval-meta-eval-determinism.md +81 -0
  79. package/skills/aw-adk/evals/eval-meta-eval-false-pass.md +81 -0
  80. package/skills/aw-adk/evals/eval-score-accuracy.md +95 -0
  81. package/skills/aw-adk/evals/eval-type-redirect.md +68 -0
  82. package/skills/aw-adk/evals/evals.json +96 -0
  83. package/skills/aw-adk/references/artifact-wiring.md +162 -0
  84. package/skills/aw-adk/references/cross-ide-mapping.md +71 -0
  85. package/skills/aw-adk/references/eval-placement-guide.md +183 -0
  86. package/skills/aw-adk/references/external-resources.md +75 -0
  87. package/skills/aw-adk/references/getting-started.md +66 -0
  88. package/skills/aw-adk/references/registry-structure.md +152 -0
  89. package/skills/aw-adk/references/rubric-agent.md +36 -0
  90. package/skills/aw-adk/references/rubric-command.md +36 -0
  91. package/skills/aw-adk/references/rubric-eval.md +36 -0
  92. package/skills/aw-adk/references/rubric-meta-eval.md +132 -0
  93. package/skills/aw-adk/references/rubric-rule.md +36 -0
  94. package/skills/aw-adk/references/rubric-skill.md +36 -0
  95. package/skills/aw-adk/references/schemas.md +222 -0
  96. package/skills/aw-adk/references/template-agent.md +251 -0
  97. package/skills/aw-adk/references/template-command.md +279 -0
  98. package/skills/aw-adk/references/template-eval.md +176 -0
  99. package/skills/aw-adk/references/template-rule.md +119 -0
  100. package/skills/aw-adk/references/template-skill.md +123 -0
  101. package/skills/aw-adk/references/type-classifier.md +98 -0
  102. package/skills/aw-adk/references/writing-good-agents.md +227 -0
  103. package/skills/aw-adk/references/writing-good-commands.md +258 -0
  104. package/skills/aw-adk/references/writing-good-evals.md +271 -0
  105. package/skills/aw-adk/references/writing-good-rules.md +214 -0
  106. package/skills/aw-adk/references/writing-good-skills.md +159 -0
  107. package/skills/aw-adk/scripts/aggregate-benchmark.py +190 -0
  108. package/skills/aw-adk/scripts/lint-artifact.sh +211 -0
  109. package/skills/aw-adk/scripts/score-artifact.sh +179 -0
  110. package/skills/aw-adk/scripts/trigger-eval.py +192 -0
  111. package/skills/aw-build/SKILL.md +19 -2
  112. package/skills/aw-deploy/SKILL.md +65 -3
  113. package/skills/aw-design/SKILL.md +156 -0
  114. package/skills/aw-design/references/highrise-tokens.md +394 -0
  115. package/skills/aw-design/references/micro-interactions.md +76 -0
  116. package/skills/aw-design/references/prompt-template.md +160 -0
  117. package/skills/aw-design/references/quality-checklist.md +70 -0
  118. package/skills/aw-design/references/self-review.md +497 -0
  119. package/skills/aw-design/references/stitch-workflow.md +127 -0
  120. package/skills/aw-feature/SKILL.md +293 -0
  121. package/skills/aw-investigate/SKILL.md +17 -0
  122. package/skills/aw-plan/SKILL.md +34 -3
  123. package/skills/aw-publish/SKILL.md +300 -0
  124. package/skills/aw-publish/evals/eval-confirmation-gate.md +60 -0
  125. package/skills/aw-publish/evals/eval-intent-detection.md +111 -0
  126. package/skills/aw-publish/evals/eval-push-modes.md +67 -0
  127. package/skills/aw-publish/evals/eval-rules-push.md +60 -0
  128. package/skills/aw-publish/evals/evals.json +29 -0
  129. package/skills/aw-publish/references/push-modes.md +38 -0
  130. package/skills/aw-review/SKILL.md +88 -9
  131. package/skills/aw-rules-review/SKILL.md +124 -0
  132. package/skills/aw-rules-review/agents/openai.yaml +3 -0
  133. package/skills/aw-rules-review/scripts/generate-review-template.mjs +323 -0
  134. package/skills/aw-ship/SKILL.md +16 -0
  135. package/skills/aw-spec/SKILL.md +15 -0
  136. package/skills/aw-tasks/SKILL.md +15 -0
  137. package/skills/aw-test/SKILL.md +16 -0
  138. package/skills/aw-yolo/SKILL.md +4 -0
  139. package/skills/diagnose/SKILL.md +121 -0
  140. package/skills/diagnose/scripts/hitl-loop.template.sh +41 -0
  141. package/skills/finish-only-when-green/SKILL.md +265 -0
  142. package/skills/grill-me/SKILL.md +24 -0
  143. package/skills/grill-with-docs/SKILL.md +92 -0
  144. package/skills/grill-with-docs/adr-format.md +47 -0
  145. package/skills/grill-with-docs/context-format.md +67 -0
  146. package/skills/improve-codebase-architecture/SKILL.md +75 -0
  147. package/skills/improve-codebase-architecture/deepening.md +37 -0
  148. package/skills/improve-codebase-architecture/interface-design.md +44 -0
  149. package/skills/improve-codebase-architecture/language.md +53 -0
  150. package/skills/local-ghl-setup-from-screenshot/SKILL.md +538 -0
  151. package/skills/tdd/SKILL.md +115 -0
  152. package/skills/tdd/deep-modules.md +33 -0
  153. package/skills/tdd/interface-design.md +31 -0
  154. package/skills/tdd/mocking.md +59 -0
  155. package/skills/tdd/refactoring.md +10 -0
  156. package/skills/tdd/tests.md +61 -0
  157. package/skills/to-issues/SKILL.md +62 -0
  158. package/skills/to-prd/SKILL.md +75 -0
  159. package/skills/using-aw-skills/SKILL.md +170 -237
  160. package/skills/using-aw-skills/hooks/session-start.sh +11 -41
  161. package/skills/zoom-out/SKILL.md +24 -0
  162. package/.cursor/rules/common-agents.md +0 -53
  163. package/.cursor/rules/common-aw-routing.md +0 -43
  164. package/.cursor/rules/common-coding-style.md +0 -52
  165. package/.cursor/rules/common-development-workflow.md +0 -33
  166. package/.cursor/rules/common-git-workflow.md +0 -28
  167. package/.cursor/rules/common-hooks.md +0 -34
  168. package/.cursor/rules/common-patterns.md +0 -35
  169. package/.cursor/rules/common-performance.md +0 -59
  170. package/.cursor/rules/common-security.md +0 -33
  171. package/.cursor/rules/common-testing.md +0 -33
  172. package/.cursor/skills/api-and-interface-design/SKILL.md +0 -75
  173. package/.cursor/skills/article-writing/SKILL.md +0 -85
  174. package/.cursor/skills/aw-brainstorm/SKILL.md +0 -115
  175. package/.cursor/skills/aw-build/SKILL.md +0 -152
  176. package/.cursor/skills/aw-build/evals/build-stage-cases.json +0 -28
  177. package/.cursor/skills/aw-debug/SKILL.md +0 -49
  178. package/.cursor/skills/aw-deploy/SKILL.md +0 -101
  179. package/.cursor/skills/aw-deploy/evals/deploy-stage-cases.json +0 -32
  180. package/.cursor/skills/aw-execute/SKILL.md +0 -47
  181. package/.cursor/skills/aw-execute/references/mode-code.md +0 -47
  182. package/.cursor/skills/aw-execute/references/mode-docs.md +0 -28
  183. package/.cursor/skills/aw-execute/references/mode-infra.md +0 -44
  184. package/.cursor/skills/aw-execute/references/mode-migration.md +0 -58
  185. package/.cursor/skills/aw-execute/references/worker-implementer.md +0 -26
  186. package/.cursor/skills/aw-execute/references/worker-parallel-worker.md +0 -23
  187. package/.cursor/skills/aw-execute/references/worker-quality-reviewer.md +0 -23
  188. package/.cursor/skills/aw-execute/references/worker-spec-reviewer.md +0 -23
  189. package/.cursor/skills/aw-execute/scripts/build-worker-bundle.js +0 -229
  190. package/.cursor/skills/aw-finish/SKILL.md +0 -111
  191. package/.cursor/skills/aw-investigate/SKILL.md +0 -109
  192. package/.cursor/skills/aw-plan/SKILL.md +0 -368
  193. package/.cursor/skills/aw-prepare/SKILL.md +0 -118
  194. package/.cursor/skills/aw-review/SKILL.md +0 -118
  195. package/.cursor/skills/aw-ship/SKILL.md +0 -115
  196. package/.cursor/skills/aw-spec/SKILL.md +0 -104
  197. package/.cursor/skills/aw-tasks/SKILL.md +0 -138
  198. package/.cursor/skills/aw-test/SKILL.md +0 -118
  199. package/.cursor/skills/aw-verify/SKILL.md +0 -51
  200. package/.cursor/skills/aw-yolo/SKILL.md +0 -111
  201. package/.cursor/skills/browser-testing-with-devtools/SKILL.md +0 -81
  202. package/.cursor/skills/bun-runtime/SKILL.md +0 -84
  203. package/.cursor/skills/ci-cd-and-automation/SKILL.md +0 -71
  204. package/.cursor/skills/code-simplification/SKILL.md +0 -74
  205. package/.cursor/skills/content-engine/SKILL.md +0 -88
  206. package/.cursor/skills/context-engineering/SKILL.md +0 -74
  207. package/.cursor/skills/deprecation-and-migration/SKILL.md +0 -75
  208. package/.cursor/skills/documentation-and-adrs/SKILL.md +0 -75
  209. package/.cursor/skills/documentation-lookup/SKILL.md +0 -90
  210. package/.cursor/skills/frontend-slides/SKILL.md +0 -184
  211. package/.cursor/skills/frontend-slides/STYLE_PRESETS.md +0 -330
  212. package/.cursor/skills/frontend-ui-engineering/SKILL.md +0 -68
  213. package/.cursor/skills/git-workflow-and-versioning/SKILL.md +0 -75
  214. package/.cursor/skills/idea-refine/SKILL.md +0 -84
  215. package/.cursor/skills/incremental-implementation/SKILL.md +0 -75
  216. package/.cursor/skills/investor-materials/SKILL.md +0 -96
  217. package/.cursor/skills/investor-outreach/SKILL.md +0 -76
  218. package/.cursor/skills/market-research/SKILL.md +0 -75
  219. package/.cursor/skills/mcp-server-patterns/SKILL.md +0 -67
  220. package/.cursor/skills/nextjs-turbopack/SKILL.md +0 -44
  221. package/.cursor/skills/performance-optimization/SKILL.md +0 -77
  222. package/.cursor/skills/security-and-hardening/SKILL.md +0 -70
  223. package/.cursor/skills/using-aw-skills/SKILL.md +0 -290
  224. package/.cursor/skills/using-aw-skills/evals/skill-trigger-cases.tsv +0 -25
  225. package/.cursor/skills/using-aw-skills/evals/test-skill-triggers.sh +0 -171
  226. package/.cursor/skills/using-aw-skills/hooks/hooks.json +0 -9
  227. package/.cursor/skills/using-aw-skills/hooks/session-start.sh +0 -67
  228. package/.cursor/skills/using-platform-skills/SKILL.md +0 -163
  229. package/.cursor/skills/using-platform-skills/evals/platform-selection-cases.json +0 -52
  230. /package/.cursor/rules/{golang-coding-style.md → golang-coding-style.mdc} +0 -0
  231. /package/.cursor/rules/{golang-hooks.md → golang-hooks.mdc} +0 -0
  232. /package/.cursor/rules/{golang-patterns.md → golang-patterns.mdc} +0 -0
  233. /package/.cursor/rules/{golang-security.md → golang-security.mdc} +0 -0
  234. /package/.cursor/rules/{golang-testing.md → golang-testing.mdc} +0 -0
  235. /package/.cursor/rules/{kotlin-coding-style.md → kotlin-coding-style.mdc} +0 -0
  236. /package/.cursor/rules/{kotlin-hooks.md → kotlin-hooks.mdc} +0 -0
  237. /package/.cursor/rules/{kotlin-patterns.md → kotlin-patterns.mdc} +0 -0
  238. /package/.cursor/rules/{kotlin-security.md → kotlin-security.mdc} +0 -0
  239. /package/.cursor/rules/{kotlin-testing.md → kotlin-testing.mdc} +0 -0
  240. /package/.cursor/rules/{php-coding-style.md → php-coding-style.mdc} +0 -0
  241. /package/.cursor/rules/{php-hooks.md → php-hooks.mdc} +0 -0
  242. /package/.cursor/rules/{php-patterns.md → php-patterns.mdc} +0 -0
  243. /package/.cursor/rules/{php-security.md → php-security.mdc} +0 -0
  244. /package/.cursor/rules/{php-testing.md → php-testing.mdc} +0 -0
  245. /package/.cursor/rules/{python-coding-style.md → python-coding-style.mdc} +0 -0
  246. /package/.cursor/rules/{python-hooks.md → python-hooks.mdc} +0 -0
  247. /package/.cursor/rules/{python-patterns.md → python-patterns.mdc} +0 -0
  248. /package/.cursor/rules/{python-security.md → python-security.mdc} +0 -0
  249. /package/.cursor/rules/{python-testing.md → python-testing.mdc} +0 -0
  250. /package/.cursor/rules/{swift-coding-style.md → swift-coding-style.mdc} +0 -0
  251. /package/.cursor/rules/{swift-hooks.md → swift-hooks.mdc} +0 -0
  252. /package/.cursor/rules/{swift-patterns.md → swift-patterns.mdc} +0 -0
  253. /package/.cursor/rules/{swift-security.md → swift-security.mdc} +0 -0
  254. /package/.cursor/rules/{swift-testing.md → swift-testing.mdc} +0 -0
  255. /package/.cursor/rules/{typescript-coding-style.md → typescript-coding-style.mdc} +0 -0
  256. /package/.cursor/rules/{typescript-hooks.md → typescript-hooks.mdc} +0 -0
  257. /package/.cursor/rules/{typescript-patterns.md → typescript-patterns.mdc} +0 -0
  258. /package/.cursor/rules/{typescript-security.md → typescript-security.mdc} +0 -0
  259. /package/.cursor/rules/{typescript-testing.md → typescript-testing.mdc} +0 -0
package/skills/aw-adk/references/artifact-wiring.md
@@ -0,0 +1,162 @@
1
+ # Artifact Wiring
2
+
3
+ How CASRE artifacts (Commands, Agents, Skills, Rules, Evals) reference each other.
4
+
5
+ ## Relationship Graph
6
+
7
+ ```
8
+ Commands
9
+
10
+ ├──references──► Agents (agent roster table with phase assignments)
11
+ │ │
12
+ │ ├──references──► Skills (skills: frontmatter field)
13
+ │ │ │
14
+ │ │ └──contains──► References (references/ subdirectory)
15
+ │ │
16
+ │ └──tested-by──► Evals (target: frontmatter)
17
+
18
+ ├──tested-by──► Evals (target: frontmatter)
19
+
20
+ └──governed-by──► Rules
21
+
22
+ Rules
23
+
24
+ ├──links-to──► Skills (skill link dimension)
25
+
26
+ └──tested-by──► Evals (target: frontmatter)
27
+
28
+ Skills
29
+
30
+ └──tested-by──► Evals (target: frontmatter)
31
+
32
+ Evals
33
+
34
+ └──tested-by──► Evals (meta-evals, target: frontmatter)
35
+ ```
36
+
37
+ ## Wiring Patterns
38
+
39
+ ### Commands to Agents
40
+
41
+ Commands define which agents participate and in which phase via an **agent roster table** in the command body.
42
+
43
+ ```markdown
44
+ ## Agent Roster
45
+
46
+ | Phase | Agent | Role |
47
+ |-------|-------|------|
48
+ | 1 - Research | planner | Create implementation plan |
49
+ | 2 - Build | tdd-guide | Drive test-first development |
50
+ | 3 - Review | code-reviewer | Review changes |
51
+ | 3 - Review | security-reviewer | Security audit |
52
+ ```
53
+
54
+ **Validation rules:**
55
+ - Every agent referenced in the roster must have a corresponding `agents/<slug>.md` file.
56
+ - Phase numbers must be sequential starting from 1.
57
+ - Each phase should have at least one agent assigned.
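The roster validation rules above can be sketched as a small checker. This is a minimal illustration, not the package's own validator: `parse_roster` and `validate_roster` are hypothetical names, and the set of known agent slugs is passed in rather than read from an `agents/` directory.

```python
import re

def parse_roster(markdown: str) -> list[tuple[int, str]]:
    """Return (phase_number, agent_slug) pairs from a roster table."""
    rows = []
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) < 2:
            continue
        # Header and separator rows have no leading phase number and are skipped.
        match = re.match(r"(\d+)\b", cells[0])
        if match:
            rows.append((int(match.group(1)), cells[1]))
    return rows

def validate_roster(rows: list[tuple[int, str]], known_agents: set[str]) -> list[str]:
    """Check sequential phases starting at 1 and that every agent slug is known."""
    errors = []
    phases = sorted({phase for phase, _ in rows})
    if phases != list(range(1, len(phases) + 1)):
        errors.append(f"phases not sequential from 1: {phases}")
    for _, agent in rows:
        if agent not in known_agents:
            errors.append(f"missing agents/{agent}.md")
    return errors
```

Because phases are derived from table rows, the "at least one agent per phase" rule holds by construction; a gap in numbering is what signals an empty phase.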
58
+
59
+ ### Agents to Skills
60
+
61
+ Agents declare their skill dependencies in the `skills:` frontmatter field.
62
+
63
+ ```yaml
64
+ ---
65
+ name: planner
66
+ type: agent
67
+ skills:
68
+ - aw-adk
69
+ - incremental-implementation
70
+ ---
71
+ ```
72
+
73
+ **Validation rules:**
74
+ - Every slug in `skills:` must resolve to a `skills/<slug>/SKILL.md` file.
75
+ - Skills are loaded in declaration order; on conflict, the first skill's instructions take precedence.
76
+ - An agent without any skills is valid but should be flagged as a warning.
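A rough sketch of these checks follows, assuming the frontmatter is the simple `---` delimited block shown above and that `skills:` is a plain list. `read_skills` and `check_agent_skills` are hypothetical helpers, and the parser is deliberately minimal rather than a full YAML reader.

```python
def read_skills(agent_text: str) -> list[str]:
    """Extract the skills: list from simple '---' delimited frontmatter."""
    lines = agent_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    skills, in_skills = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # end of frontmatter
            break
        if stripped == "skills:":
            in_skills = True
        elif in_skills and stripped.startswith("- "):
            skills.append(stripped[2:].strip())
        elif stripped:
            in_skills = False  # another key ends the list
    return skills

def check_agent_skills(agent_text: str, known_skills: set[str]) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for one agent file's skill declarations."""
    skills = read_skills(agent_text)
    errors = [f"missing skills/{s}/SKILL.md" for s in skills if s not in known_skills]
    warnings = [] if skills else ["agent declares no skills"]
    return errors, warnings
```

The returned list preserves declaration order, which matters because load order decides precedence.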
77
+
78
+ ### Evals to Parent Artifact
79
+
80
+ Evals declare their target via the `target:` frontmatter field, using `<type>/<slug>` format.
81
+
82
+ ```yaml
83
+ ---
84
+ target: skill/aw-adk
85
+ type: eval
86
+ ---
87
+ ```
88
+
89
+ **Validation rules:**
90
+ - The `target:` value must resolve to an existing artifact.
91
+ - Valid target prefixes: `skill/`, `agent/`, `command/`, `rule/`.
92
+ - Meta-evals use `eval/` as the target prefix.
93
+ - Each artifact should have at least 2 evals targeting it.
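The target rules above might be checked along these lines. This is a sketch under stated assumptions: `parse_target` and `check_targets` are illustrative names, and artifacts are identified by their `<type>/<slug>` string rather than by file lookup.

```python
VALID_PREFIXES = {"skill", "agent", "command", "rule", "eval"}  # eval/ is for meta-evals

def parse_target(target: str) -> tuple[str, str]:
    """Split '<type>/<slug>' and reject unknown prefixes or empty slugs."""
    kind, sep, slug = target.partition("/")
    if not sep or kind not in VALID_PREFIXES or not slug:
        raise ValueError(f"invalid target: {target!r}")
    return kind, slug

def check_targets(targets: list[str], artifacts: list[str], minimum: int = 2):
    """Return (unresolved targets, artifacts with fewer than `minimum` evals)."""
    counts = {artifact: 0 for artifact in artifacts}
    unresolved = []
    for target in targets:
        key = "/".join(parse_target(target))
        if key in counts:
            counts[key] += 1
        else:
            unresolved.append(target)
    under_covered = sorted(a for a, n in counts.items() if n < minimum)
    return unresolved, under_covered
```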
94
+
95
+ ### Rules to Skills
96
+
97
+ Rules reference related skills via a **skill link dimension** -- a markdown link or frontmatter field pointing to the skill that provides implementation guidance for the rule.
98
+
99
+ ```markdown
100
+ ## References
101
+
102
+ - Implement using [aw-adk](../../skills/aw-adk/SKILL.md) skill patterns
103
+ ```
104
+
105
+ **Validation rules:**
106
+ - Skill links should resolve to existing skill files.
107
+ - Rules without skill links are valid (not all rules map to a skill).
108
+
109
+ ### Skills to References
110
+
111
+ Skills contain a `references/` subdirectory with supporting markdown files linked from the skill body.
112
+
113
+ ```
114
+ skills/aw-adk/
115
+   SKILL.md
116
+   references/
117
+     schemas.md
118
+     rubric-meta-eval.md
119
+     eval-placement-guide.md
120
+ ```
121
+
122
+ **Validation rules:**
123
+ - Every file in `references/` should be linked from `SKILL.md` or from another reference file.
124
+ - Orphaned reference files (not linked from anywhere) should be flagged as warnings.
125
+ - Reference files must be markdown (`.md`).
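One possible way to detect orphaned reference files is sketched below. It assumes links use standard markdown `[text](path.md)` syntax with paths relative to the linking file; `orphaned_references` is a hypothetical helper and not part of the package.

```python
import re
from pathlib import Path

# Matches the path part of markdown links that end in .md (optional #anchor).
LINK_RE = re.compile(r"\]\(([^)#\s]+\.md)(?:#[^)]*)?\)")

def orphaned_references(skill_dir: Path) -> list[str]:
    """List reference files not linked from SKILL.md or another reference."""
    refs_dir = skill_dir / "references"
    ref_files = {p.resolve() for p in refs_dir.glob("*.md")}
    linked = set()
    for source in [skill_dir / "SKILL.md", *refs_dir.glob("*.md")]:
        if not source.exists():
            continue
        for target in LINK_RE.findall(source.read_text(encoding="utf-8")):
            resolved = (source.parent / target).resolve()
            if resolved != source.resolve():  # a self-link does not count
                linked.add(resolved)
    return sorted(p.name for p in ref_files - linked)
```

Results would be reported as warnings rather than errors, matching the rule above.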
126
+
127
+ ## Cross-Artifact Dependency Patterns
128
+
129
+ ### Upward Dependencies (child references parent)
130
+
131
+ - Evals reference their parent artifact via `target:`
132
+ - This is the primary traceability mechanism.
133
+
134
+ ### Downward Dependencies (parent references child)
135
+
136
+ - Commands reference agents via roster tables.
137
+ - Agents reference skills via `skills:` frontmatter.
138
+ - Skills reference documents via `references/` links.
139
+
140
+ ### Lateral Dependencies (peer references)
141
+
142
+ - Rules reference skills for implementation guidance.
143
+ - Skills may reference other skills' reference documents.
144
+
145
+ ## Validation Summary
146
+
147
+ | Relationship | Source Field | Target Resolution | Required? |
148
+ |---|---|---|---|
149
+ | Command -> Agent | Agent roster table | `agents/<slug>.md` | Yes |
150
+ | Agent -> Skill | `skills:` frontmatter | `skills/<slug>/SKILL.md` | No (warn if empty) |
151
+ | Eval -> Parent | `target:` frontmatter | `<type>/<slug>` path | Yes |
152
+ | Rule -> Skill | Markdown link | `skills/<slug>/SKILL.md` | No |
153
+ | Skill -> Reference | Markdown link | `references/<file>.md` | No (warn if orphaned) |
154
+
155
+ ## Integrity Checks
156
+
157
+ Run these checks before merging any CASRE artifact:
158
+
159
+ 1. **Forward resolution:** Every reference from artifact A to artifact B resolves to an existing file.
160
+ 2. **Eval coverage:** Every skill, agent, and command has at least 2 evals with matching `target:` values.
161
+ 3. **No orphans:** Reference files are linked from at least one parent. Evals have valid targets.
162
+ 4. **No cycles:** The dependency graph is a DAG. Commands sit at the top; evals and references sit at the leaves.
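The "no cycles" check can be illustrated with a depth-first search over a dependency map. This is a generic sketch: the graph shape (artifact id mapped to the ids it references) and the name `find_cycle` are assumptions, not the package's data model.

```python
from typing import Optional

def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one cycle as a node path, or None if the graph is a DAG."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    state: dict[str, int] = {}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        state[node] = GREY
        stack.append(node)
        for nxt in graph.get(node, []):
            if state.get(nxt, WHITE) == GREY:
                # Back edge: nxt is already on the current path.
                return stack[stack.index(nxt):] + [nxt]
            if state.get(nxt, WHITE) == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in list(graph):
        if state.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None
```

Returning the offending path, rather than a boolean, makes it easier to see which reference to remove.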
package/skills/aw-adk/references/cross-ide-mapping.md
@@ -0,0 +1,71 @@
1
+ # Cross-IDE Mapping
2
+
3
+ How AW registry artifacts manifest in different IDE environments after `aw init` and `aw pull`.
4
+
5
+ ## IDE Path Mapping
6
+
7
+ | Artifact Type | Registry Path | Claude Code | Cursor | Codex |
8
+ |---|---|---|---|---|
9
+ | Skill | `.aw/.aw_registry/.../skills/<slug>/SKILL.md` | `.claude/skills/<slug>/SKILL.md` | `.cursor/rules/<slug>.mdc` | `.codex/skills/<slug>/` |
10
+ | Agent | `.aw/.aw_registry/.../agents/<slug>.md` | `.claude/agents/<slug>.md` | `.cursor/rules/<slug>.mdc` | `.codex/agents/<slug>.md` |
11
+ | Command | `.aw/.aw_registry/.../commands/<slug>.md` | `.claude/commands/<slug>.md` | N/A | N/A |
12
+ | Rule | `.aw/.aw_rules/platform/<domain>/references/<slug>.md` | `.claude/rules/<domain>/<slug>.md` | `.cursor/rules/<slug>.mdc` | `.codex/rules/<slug>.md` |
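The mapping table can be expressed as a lookup, which is handy when telling a user where a new artifact will land. The templates are copied from the table; the `IDE_PATHS` name and the `ide_paths` helper are illustrative assumptions.

```python
# Path templates per artifact type and IDE, taken from the table above.
# Commands have no Cursor or Codex location (N/A), so those keys are omitted.
IDE_PATHS = {
    "skill": {
        "claude": ".claude/skills/{slug}/SKILL.md",
        "cursor": ".cursor/rules/{slug}.mdc",
        "codex": ".codex/skills/{slug}/",
    },
    "agent": {
        "claude": ".claude/agents/{slug}.md",
        "cursor": ".cursor/rules/{slug}.mdc",
        "codex": ".codex/agents/{slug}.md",
    },
    "command": {
        "claude": ".claude/commands/{slug}.md",
    },
    "rule": {
        "claude": ".claude/rules/{domain}/{slug}.md",
        "cursor": ".cursor/rules/{slug}.mdc",
        "codex": ".codex/rules/{slug}.md",
    },
}

def ide_paths(kind: str, slug: str, domain: str = "") -> dict[str, str]:
    """Resolve the IDE-local paths for one artifact; domain is only used by rules."""
    return {
        ide: template.format(slug=slug, domain=domain)
        for ide, template in IDE_PATHS[kind].items()
    }
```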
13
+
14
+ ## How `aw init` Works
15
+
16
+ 1. Creates `.claude/`, `.cursor/`, `.codex/` directories if missing
17
+ 2. Installs core modules from `manifests/install-modules.json`
18
+ 3. Copies hooks configuration to appropriate locations
19
+ 4. Creates `skills-lock.json` to track installed versions
20
+
21
+ ## How `aw pull` Works
22
+
23
+ 1. Reads `skills-lock.json` for current state
24
+ 2. Fetches latest from `.aw/.aw_registry/` (platform-docs or local)
25
+ 3. Diffs against installed versions (SHA256 integrity check)
26
+ 4. Copies updated artifacts to IDE-local paths
27
+ 5. Updates `skills-lock.json`
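The diff step (step 3) might look roughly like the following. This sketch assumes the `integrity` value is `sha256-` followed by a hex digest, which the source does not confirm, and it takes registry contents as an in-memory mapping instead of reading `.aw/.aw_registry/`. `sha256_of` and `stale_entries` are hypothetical names.

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Format a digest as 'sha256-<hex>' (the encoding is an assumption)."""
    return "sha256-" + hashlib.sha256(data).hexdigest()

def stale_entries(lock: dict, current: dict[str, bytes]) -> list[str]:
    """Return lock entries whose source is missing or whose digest differs."""
    stale = []
    for name, entry in lock.get("skills", {}).items():
        data = current.get(entry["source"])
        if data is None or sha256_of(data) != entry["integrity"]:
            stale.append(name)
    return sorted(stale)
```

Only the entries returned here would need to be copied to IDE-local paths and rewritten in the lock file.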
28
+
29
+ ## `skills-lock.json` Format
30
+
31
+ ```json
32
+ {
33
+   "version": 1,
34
+   "skills": {
35
+     "platform-core-aw-adk": {
36
+       "source": ".aw/.aw_registry/platform/core/skills/aw-adk/SKILL.md",
37
+       "integrity": "sha256-abc123...",
38
+       "installed_at": "2026-04-22T10:00:00Z",
39
+       "ide_paths": {
40
+         "claude": ".claude/skills/aw-adk/SKILL.md",
41
+         "cursor": ".cursor/rules/aw-adk.mdc"
42
+       }
43
+     }
44
+   }
45
+ }
46
+ ```
47
+
48
+ ## Cursor-Specific Notes
49
+
50
+ Cursor uses `.mdc` (Markdown with Context) files. The conversion from `.md` to `.mdc` works as follows:
51
+ - Frontmatter is preserved as YAML
52
+ - Body content is wrapped in Cursor's context format
53
+ - `trigger` field maps to Cursor's "when" activation rules
54
+
55
+ ## Codex-Specific Notes
56
+
57
+ Codex uses a flat directory structure under `.codex/`. Each artifact is a directory containing the artifact file plus any bundled resources.
58
+
59
+ ## What to Tell the User
60
+
61
+ After creating any artifact, show them where it will appear:
62
+
63
+ ```
64
+ Your new agent 'payments-processor' will be available at:
65
+
66
+ Claude Code: .claude/agents/payments-processor.md
67
+ Cursor: .cursor/rules/payments-processor.mdc
68
+ Codex: .codex/agents/payments-processor.md
69
+
70
+ Run `aw pull` to sync from the registry to your IDE.
71
+ ```
package/skills/aw-adk/references/eval-placement-guide.md
@@ -0,0 +1,183 @@
1
+ # Eval Placement Guide
2
+
3
+ Evals live next to the artifacts they test. This document defines where eval files go and why.
4
+
5
+ ## Why Colocated > Centralized
6
+
7
+ **Proximity to artifact.** When an eval lives in the same directory tree as the skill, agent, or command it tests, you see the eval every time you touch the artifact. Changes to the artifact naturally prompt eval updates.
8
+
9
+ **Discoverability.** A developer exploring a skill directory finds its evals without searching a separate `evals/` monolith. New contributors understand what "good" looks like by reading colocated evals.
10
+
11
+ **Ownership clarity.** The person who owns the artifact owns its evals. No ambiguity about who maintains a centralized eval that tests something in a different team's directory.
12
+
13
+ **Refactor safety.** When an artifact moves or gets renamed, colocated evals move with it. Centralized evals require separate updates and are often forgotten, leading to orphaned or broken evals.
14
+
15
+ ## Directory Structure by Artifact Type
16
+
17
+ ### Skills
18
+
19
+ Evals live inside the skill directory in an `evals/` subdirectory.
20
+
21
+ ```
22
+ skills/
23
+   <slug>/
24
+     SKILL.md
25
+     references/
26
+     evals/
27
+       eval-<purpose>.md
28
+       eval-<purpose>.md
29
+ ```
30
+
31
+ Example:
32
+ ```
33
+ skills/
34
+   aw-adk/
35
+     SKILL.md
36
+     references/
37
+     evals/
38
+       eval-create-happy-path.md
39
+       eval-create-missing-fields.md
40
+       eval-score-minimal.md
41
+ ```
42
+
43
+ ### Agents
44
+
45
+ Evals live in a sibling `evals/` directory scoped by agent slug.
46
+
47
+ ```
48
+ agents/
49
+ <slug>.md
50
+ evals/
51
+ <slug>/
52
+ eval-<purpose>.md
53
+ eval-<purpose>.md
54
+ ```
55
+
56
+ Example:
57
+ ```
58
+ agents/
59
+ planner.md
60
+ code-reviewer.md
61
+ evals/
62
+ planner/
63
+ eval-plan-happy-path.md
64
+ eval-plan-ambiguous-input.md
65
+ code-reviewer/
66
+ eval-review-security-issue.md
67
+ ```
68
+
69
+ ### Commands
70
+
71
+ Evals live in a sibling `evals/` directory scoped by command slug.
72
+
73
+ ```
74
+ commands/
75
+ <slug>.md
76
+ evals/
77
+ <slug>/
78
+ eval-<purpose>.md
79
+ eval-<purpose>.md
80
+ ```
81
+
82
+ Example:
83
+ ```
84
+ commands/
85
+ aw-build.md
86
+ evals/
87
+ aw-build/
88
+ eval-build-happy-path.md
89
+ eval-build-missing-config.md
90
+ ```
91
+
92
+ ### Rules
93
+
94
+ Evals live either within `.aw/.aw_rules/` references or in a dedicated `rules/evals/` directory.
95
+
96
+ ```
97
+ # Option A: Inside .aw/.aw_rules references
98
+ .aw/
99
+   .aw_rules/
100
+     platform/
101
+       <domain>/
102
+         references/
103
+           eval-<purpose>.md
104
+
105
+ # Option B: Dedicated rules eval directory
106
+ rules/
107
+   evals/
108
+     <slug>/
109
+       eval-<purpose>.md
110
+ ```
111
+
112
+ ### Meta-Evals (Evals of Evals)
113
+
114
+ Evals that test the eval system itself live in a nested `evals/evals/` directory.
115
+
116
+ ```
117
+ evals/
118
+   evals/
119
+     eval-<purpose>.md
120
+ ```
121
+
122
+ ## Naming Convention
123
+
124
+ All eval files follow: `eval-<purpose>.md`
125
+
126
+ The `<purpose>` segment describes what the eval tests in lowercase kebab-case.
127
+
128
+ | Pattern | Example | Tests |
129
+ |---------|---------|-------|
130
+ | `eval-<action>-happy-path` | `eval-create-happy-path.md` | Standard successful execution |
131
+ | `eval-<action>-<failure>` | `eval-create-missing-fields.md` | Specific failure scenario |
132
+ | `eval-<action>-<edge>` | `eval-score-minimal.md` | Edge case or boundary condition |
133
+ | `eval-<action>-adversarial` | `eval-create-adversarial.md` | Adversarial or malicious input |
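The naming convention lends itself to a simple pattern check. The regular expression below is one reading of "lowercase kebab-case" (lowercase letters and digits separated by single hyphens); `is_valid_eval_name` is an illustrative helper.

```python
import re

# eval-<purpose>.md where <purpose> is lowercase kebab-case.
EVAL_NAME_RE = re.compile(r"eval-[a-z0-9]+(?:-[a-z0-9]+)*\.md")

def is_valid_eval_name(filename: str) -> bool:
    """True when the whole filename matches the eval naming convention."""
    return EVAL_NAME_RE.fullmatch(filename) is not None
```

For instance, `eval-create-happy-path.md` passes, while `Eval-Create.md` and `eval-score_minimal.md` are rejected.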
134
+
135
+ ## Minimum Eval Count
136
+
137
+ Every artifact requires at least **2 evals**:
138
+
139
+ 1. **Happy path** -- the artifact works correctly with valid, representative input.
140
+ 2. **Failure scenario** -- the artifact handles invalid input, missing data, or error conditions gracefully.
141
+
142
+ For critical-path artifacts (commands users invoke directly, agents that orchestrate workflows), target **4+ evals**:
143
+
144
+ 1. Happy path
145
+ 2. Failure / error handling
146
+ 3. Edge case (boundary values, minimal input, maximum input)
147
+ 4. Adversarial (conflicting instructions, unexpected formats)
148
+
149
+ ## Eval File Structure
150
+
151
+ Each eval file should contain:
152
+
153
+ ```markdown
154
+ ---
155
+ target: <artifact-type>/<slug>
156
+ type: eval
157
+ purpose: <brief description>
158
+ ---
159
+
160
+ # Eval: <Title>
161
+
162
+ ## Scenario
163
+ <Description of the test scenario and input>
164
+
165
+ ## Expected Behavior
166
+ <What the artifact should produce or do>
167
+
168
+ ## Grader
169
+ <How to determine pass/fail -- deterministic checks preferred>
170
+
171
+ ## Pass Criteria
172
+ <Explicit, binary pass/fail conditions>
173
+ ```
174
+
175
+ ## Validation Checklist
176
+
177
+ Before merging an artifact, verify:
178
+
179
+ - [ ] At least 2 eval files exist in the correct directory
180
+ - [ ] Eval files follow `eval-<purpose>.md` naming
181
+ - [ ] Each eval has a `target:` frontmatter referencing the parent artifact
182
+ - [ ] At least one eval covers a failure scenario
183
+ - [ ] Eval graders are specific enough to fail on wrong output (see [rubric-meta-eval.md](rubric-meta-eval.md))
package/skills/aw-adk/references/external-resources.md
@@ -0,0 +1,75 @@
1
+ # External Resources for CASRE Authoring
2
+
3
+ Curated references for writing high-quality Commands, Agents, Skills, Rules, and Evals.
4
+
5
+ ## Resources
6
+
7
+ ### Anthropic: Skill Best Practices
8
+
9
+ **URL:** <https://platform.claude.com/docs/en/agents-and-tools/agent-skills/best-practices>
10
+
11
+ Key takeaways: Structure matters more than length -- a well-organized 200-line skill outperforms a rambling 2000-line one. Front-load success criteria and constraints before procedural steps. Use concrete examples of good and bad output rather than abstract descriptions.
12
+
13
+ ### Anthropic: Equipping Agents with Agent Skills
14
+
15
+ **URL:** <https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills>
16
+
17
+ Key takeaways: Skills should encode domain expertise that the model lacks, not restate what it already knows. The most effective skills combine declarative knowledge (what good looks like) with procedural guardrails (what to avoid). Skills are most valuable when they reduce variance across runs -- the same input should produce consistently shaped output.
18
+
19
+ ### Anthropic: Demystifying Evals for AI Agents
20
+
21
+ **URL:** <https://www.anthropic.com/engineering/demystifying-evals-for-ai-agents>
22
+
23
+ Key takeaways: Build evals bottom-up -- start with the smallest testable unit and compose upward. Prefer deterministic graders (exact match, regex, structured checks) over model-based graders wherever possible. When model-based grading is necessary, constrain the grader with explicit rubrics and examples of pass/fail. Eval quality directly determines your ability to iterate on agent behavior.
24
+
25
+ ### OpenAI: Eval Skills
26
+
27
+ **URL:** <https://developers.openai.com/blog/eval-skills>
28
+
29
+ Key takeaways: Evals should test behavior, not implementation. Define success criteria before writing the eval -- if you cannot state what "pass" looks like in concrete terms, the eval is not ready. Use multiple eval types (unit, integration, end-to-end) to cover different failure modes. Track eval results over time to detect regressions early.
30
+
31
+ ### Promptfoo: Agent Eval Patterns
32
+
33
+ **URL:** <https://www.promptfoo.dev/docs/integrations/agent-skill/>
34
+
35
+ Key takeaways: Separate the eval scenario (input + context) from the grader (how to judge output). This separation enables reuse -- the same grader can apply across multiple scenarios, and scenarios can be graded by different methods. Parameterize scenarios to generate coverage from templates rather than writing each case by hand.
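The separation of scenario and grader described above can be sketched roughly as follows. This is a minimal illustration in Python, not Promptfoo's API: `Scenario`, `expand_scenarios`, `mentions_index`, and `grade_all` are hypothetical names, and the template and parameter values are made up for the example.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    """Input side only: what the agent is asked. No grading logic lives here."""
    name: str
    prompt: str


# A grader judges output text and knows nothing about how the scenario was built.
Grader = Callable[[str], bool]


def expand_scenarios(template: str, params: dict[str, list[str]]) -> list[Scenario]:
    """Generate one scenario per combination of parameter values."""
    keys = sorted(params)
    scenarios = []
    for values in product(*(params[key] for key in keys)):
        binding = dict(zip(keys, values))
        name = "-".join(binding[key] for key in keys)
        scenarios.append(Scenario(name=name, prompt=template.format(**binding)))
    return scenarios


def mentions_index(output: str) -> bool:
    """Hypothetical reusable grader: pass if the answer discusses indexes."""
    return "index" in output.lower()


def grade_all(outputs: dict[str, str], grader: Grader) -> dict[str, bool]:
    """Apply one grader to the outputs of many scenarios, keyed by scenario name."""
    return {name: grader(text) for name, text in outputs.items()}
```

Because the grader only sees output text, the same `mentions_index` check could be applied to every scenario produced from the template, and a different grader could be swapped in without touching the scenarios.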
36
+
37
+ ### O'Reilly: How to Write a Good Spec for AI Agents
38
+
39
+ **URL:** <https://www.oreilly.com/radar/how-to-write-a-good-spec-for-ai-agents/>
40
+
41
+ Key takeaways: A good spec defines the boundaries of acceptable output, not a single correct answer. Include examples of outputs that are wrong in subtle ways -- these teach the agent (and the eval grader) what to reject. Specs should be testable: every requirement should map to at least one eval scenario.
42
+
43
+ ### OpenAI: Evaluation Best Practices
44
+
45
+ **URL:** <https://developers.openai.com/api/docs/guides/evaluation-best-practices>
46
+
47
+ Key takeaways: Start with the simplest eval that provides signal and add complexity only when needed. Use a mix of automated and human evaluation, but automate first. Track baseline performance before making changes so you can measure improvement. Small, frequent eval runs catch regressions faster than large, infrequent ones.
48
+
49
+ ## Key Principles for CASRE Authoring
50
+
51
+ These principles recur across the resources above. Apply them when writing any CASRE artifact.
52
+
53
+ ### Structure > Length
54
+
55
+ A concise, well-organized artifact outperforms a verbose one. Use headings, tables, and lists to make content scannable. Front-load the most important information.
56
+
57
+ ### Success Criteria First
58
+
59
+ Define what "done" and "good" look like before writing implementation details. For skills, state the expected output shape. For evals, state pass/fail criteria. For commands, state the end state.
60
+
61
+ ### Bottom-Up Eval Design
62
+
63
+ Start with the smallest testable behavior. Write evals for individual skills before writing evals for agents that compose those skills. Compose simple evals into integration evals rather than writing monolithic end-to-end evals first.
64
+
65
+ ### Deterministic > Model-Based Graders
66
+
67
+ Use exact match, regex, JSON schema validation, or structured checks whenever the output format allows. Reserve model-based grading for genuinely subjective or creative dimensions. When using model-based graders, provide explicit rubrics with scored examples.
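As a rough illustration of the deterministic checks named above, the sketch below shows three small graders in Python using only the standard library. The function names and the idea of passing required fields as a name-to-type mapping are assumptions made for this example; they are not part of the ADK.

```python
import json
import re


def grade_exact(output: str, expected: str) -> bool:
    """Pass only if the output equals the expected text, ignoring outer whitespace."""
    return output.strip() == expected.strip()


def grade_regex(output: str, pattern: str) -> bool:
    """Pass if the pattern matches anywhere in the output."""
    return re.search(pattern, output) is not None


def grade_json_fields(output: str, required: dict[str, type]) -> bool:
    """Pass if the output parses as a JSON object with the required typed fields."""
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    return all(isinstance(data.get(key), kind) for key, kind in required.items())
```

Each grader returns the same verdict for the same input on every run, which is the property that makes these checks preferable to a model-based grader when the output format allows them.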
68
+
69
+ ### Concrete Examples Over Abstract Descriptions
70
+
71
+ Show what good output looks like. Show what bad output looks like. Examples reduce ambiguity more effectively than prose descriptions of quality. Include at least one positive and one negative example in skills and eval graders.
72
+
73
+ ### Testable Requirements
74
+
75
+ Every requirement in a skill, rule, or command should map to at least one eval scenario. If a requirement cannot be tested, it is either too vague (rewrite it) or aspirational (move it to a "nice to have" section).
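One way to check this mapping mechanically is sketched below. It assumes each requirement has a short identifier and each eval scenario lists the identifiers it exercises; both conventions, and the function name `untested_requirements`, are hypothetical and only serve the illustration.

```python
def untested_requirements(
    requirements: list[str],
    scenarios: dict[str, list[str]],
) -> list[str]:
    """Return requirement IDs that no eval scenario claims to cover.

    `scenarios` maps a scenario name to the requirement IDs it exercises.
    The result keeps the order of `requirements`.
    """
    covered = {req_id for req_ids in scenarios.values() for req_id in req_ids}
    return [req_id for req_id in requirements if req_id not in covered]
```

A non-empty result points at requirements that should either be rewritten until they are testable or moved to a "nice to have" section.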
@@ -0,0 +1,66 @@
1
+ ---
2
+ name: getting-started
3
+ description: Quickstart guide for creating CASRE artifacts with the ADK
4
+ ---
5
+
6
+ # Getting Started with the ADK
7
+
8
+ ## Your First Artifact in 5 Steps
9
+
10
+ 1. **Say what you want.** Use natural language — the ADK classifies the type for you.
11
+ 2. **Answer the interview.** The ADK asks targeted questions based on the artifact type.
12
+ 3. **Review the scaffold.** The ADK creates the file at the correct registry path.
13
+ 4. **Check the score.** The ADK scores your artifact against the type-specific rubric.
14
+ 5. **Run the evals.** The ADK creates 2+ evals and validates them.
15
+
16
+ ## Example Prompts by Type
17
+
18
+ ### Agent
19
+ > Create an agent for code review automation in the platform/review namespace. It should analyze PR diffs for security issues, performance regressions, and style violations. Tools: Read, Grep, Glob, Bash. Model: sonnet.
20
+
21
+ ### Skill
22
+ > Create a skill for MongoDB aggregation patterns in the platform/data namespace. Cover $lookup, $unwind, $group, pagination, and index-aware pipeline design.
23
+
24
+ ### Command
25
+ > Create a command for incident response in the platform/infra namespace. Phases: triage → investigate → mitigate → postmortem. Human checkpoint before mitigation.
26
+
27
+ ### Rule
28
+ > Create a rule called no-direct-db-connection for the backend domain. All database access must go through @platform-core/\* packages. Severity: MUST. File patterns: \*.service.ts, \*.repository.ts.
29
+
30
+ ### Eval
31
+ > Create evals for the existing code-reviewer agent at .aw/.aw_registry/platform/review/agents/code-reviewer.md. Include one happy path and one case where the PR has no issues, to catch the agent flagging false positives.
32
+
33
+ ## Common Intent Phrases
34
+
35
+ These phrases trigger the ADK via the `using-aw-skills` router:
36
+
37
+ - "create an agent/skill/command/rule/eval"
38
+ - "score my agent/skill"
39
+ - "audit all agents in platform/services"
40
+ - "fix the lint errors on this skill"
41
+ - "improve the payments-processor agent"
42
+ - "delete the old migration command"
43
+
44
+ ## What Happens Under the Hood
45
+
46
+ Every create operation follows the same 14-step pipeline:
47
+
48
+ ```
49
+ CLASSIFY → INTERVIEW → RESOLVE PATH → SCAFFOLD → CHECKPOINT →
50
+ LINT → SCORE → EVAL GATE (2+) → TEST RUNS → ITERATE →
51
+ DESCRIPTION OPT → CROSS-IDE → REGISTRY UPDATES → SYNC
52
+ ```
53
+
54
+ No steps are optional. Rules, agents, commands, skills, and evals all go through the full flow.
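The "no optional steps" rule can be pictured with a small sketch. The stage names are copied from the pipeline above; everything else (the `run_create` function, the handler mapping, raising `ValueError` when a handler is missing) is an assumption made for illustration and does not describe how the ADK is actually implemented.

```python
from typing import Callable

# Stage names in the order given by the pipeline diagram.
CREATE_STAGES = (
    "CLASSIFY", "INTERVIEW", "RESOLVE PATH", "SCAFFOLD", "CHECKPOINT",
    "LINT", "SCORE", "EVAL GATE", "TEST RUNS", "ITERATE",
    "DESCRIPTION OPT", "CROSS-IDE", "REGISTRY UPDATES", "SYNC",
)


def run_create(handlers: dict[str, Callable[[], None]]) -> list[str]:
    """Run every stage in order; refuse to start if any stage has no handler."""
    missing = [stage for stage in CREATE_STAGES if stage not in handlers]
    if missing:
        raise ValueError(f"no handler for stages: {missing}")
    completed = []
    for stage in CREATE_STAGES:
        handlers[stage]()  # a handler is expected to raise if its stage fails
        completed.append(stage)
    return completed
```

Checking for missing handlers before running anything means a partially configured flow fails up front instead of silently skipping a stage.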
55
+
56
+ ## Deleting Artifacts
57
+
58
+ The ADK also handles safe deletion with reverse reference scanning:
59
+
60
+ ```
61
+ /aw:adk agent delete my-agent
62
+ ```
63
+
64
+ The delete flow: LOCATE → INVENTORY → REVERSE REFERENCE SCAN → CONFIRM → DELETE → REGISTRY CLEANUP → SYNC
65
+
66
+ It finds everything that points to the artifact (commands referencing an agent, agents referencing a skill) and offers to clean up those references too — no phantom dependencies left behind.
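The reverse reference scan can be thought of as inverting a "who points at what" map. The sketch below assumes references have already been extracted into a dictionary from artifact ID to the set of IDs it mentions; that data shape, the artifact IDs in the usage example, and the name `find_referrers` are all hypothetical.

```python
def find_referrers(target: str, references: dict[str, set[str]]) -> list[str]:
    """Return the artifacts that reference `target`, sorted for stable output.

    `references` maps an artifact ID to the set of artifact IDs it mentions.
    """
    return sorted(
        name
        for name, refs in references.items()
        if target in refs and name != target
    )
```

For example, with `{"commands/review": {"agents/code-reviewer"}}` as the reference map, scanning for `agents/code-reviewer` would report `commands/review` as a referrer that needs cleanup before the agent is deleted.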