@cubis/foundry 0.3.71 → 0.3.72

Files changed (270)
  1. package/CHANGELOG.md +15 -0
  2. package/dist/cli/core.js +4 -18
  3. package/dist/cli/core.js.map +1 -1
  4. package/package.json +1 -1
  5. package/src/cli/core.ts +4 -18
  6. package/workflows/powers/accessibility/POWER.md +83 -94
  7. package/workflows/powers/accessibility/SKILL.md +82 -94
  8. package/workflows/powers/agent-design/POWER.md +201 -0
  9. package/workflows/powers/agent-design/SKILL.md +198 -0
  10. package/workflows/powers/agent-design/references/clarification-patterns.md +153 -0
  11. package/workflows/powers/agent-design/references/skill-testing.md +164 -0
  12. package/workflows/powers/agent-design/references/workflow-patterns.md +226 -0
  13. package/workflows/powers/agentic-eval/POWER.md +62 -0
  14. package/workflows/powers/agentic-eval/SKILL.md +59 -0
  15. package/workflows/powers/agentic-eval/references/rubric-and-regression-checklist.md +11 -0
  16. package/workflows/powers/api-designer/POWER.md +43 -71
  17. package/workflows/powers/api-designer/SKILL.md +43 -71
  18. package/workflows/powers/api-patterns/POWER.md +42 -56
  19. package/workflows/powers/api-patterns/SKILL.md +42 -57
  20. package/workflows/powers/architecture-designer/POWER.md +43 -60
  21. package/workflows/powers/architecture-designer/SKILL.md +43 -60
  22. package/workflows/powers/ask-questions-if-underspecified/POWER.md +51 -3
  23. package/workflows/powers/auth-architect/POWER.md +69 -0
  24. package/workflows/powers/auth-architect/SKILL.md +66 -0
  25. package/workflows/powers/auth-architect/references/session-token-policy-checklist.md +45 -0
  26. package/workflows/powers/behavioral-modes/POWER.md +100 -9
  27. package/workflows/powers/c-pro/POWER.md +105 -0
  28. package/workflows/powers/c-pro/SKILL.md +102 -0
  29. package/workflows/powers/c-pro/references/build-systems-and-toolchains.md +148 -0
  30. package/workflows/powers/c-pro/references/common-ub-and-portability.md +166 -0
  31. package/workflows/powers/c-pro/references/debugging-with-sanitizers.md +205 -0
  32. package/workflows/powers/c-pro/references/memory-safety-and-build-checklist.md +60 -0
  33. package/workflows/powers/c-pro/references/posix-and-platform-apis.md +244 -0
  34. package/workflows/powers/changelog-generator/POWER.md +127 -63
  35. package/workflows/powers/changelog-generator/SKILL.md +126 -63
  36. package/workflows/powers/ci-cd-pipelines/POWER.md +156 -0
  37. package/workflows/powers/ci-cd-pipelines/SKILL.md +153 -0
  38. package/workflows/powers/ci-cd-pipelines/references/github-actions-patterns.md +160 -0
  39. package/workflows/powers/ci-cd-pipelines/references/pipeline-security-checklist.md +57 -0
  40. package/workflows/powers/cli-developer/POWER.md +152 -95
  41. package/workflows/powers/cli-developer/SKILL.md +152 -95
  42. package/workflows/powers/cpp-pro/POWER.md +111 -0
  43. package/workflows/powers/cpp-pro/SKILL.md +108 -0
  44. package/workflows/powers/cpp-pro/references/concurrency-primitives.md +266 -0
  45. package/workflows/powers/cpp-pro/references/move-semantics-and-value-types.md +149 -0
  46. package/workflows/powers/cpp-pro/references/performance-and-profiling.md +191 -0
  47. package/workflows/powers/cpp-pro/references/raii-and-modern-cpp-checklist.md +87 -0
  48. package/workflows/powers/cpp-pro/references/template-and-concepts-patterns.md +205 -0
  49. package/workflows/powers/csharp-pro/POWER.md +47 -22
  50. package/workflows/powers/csharp-pro/SKILL.md +47 -22
  51. package/workflows/powers/dart-pro/POWER.md +68 -0
  52. package/workflows/powers/dart-pro/SKILL.md +65 -0
  53. package/workflows/powers/dart-pro/references/isolate-and-concurrency.md +180 -0
  54. package/workflows/powers/dart-pro/references/null-safety-and-async-patterns.md +133 -0
  55. package/workflows/powers/dart-pro/references/package-structure-and-linting.md +193 -0
  56. package/workflows/powers/dart-pro/references/sealed-records-patterns.md +173 -0
  57. package/workflows/powers/dart-pro/references/testing-and-mocking.md +235 -0
  58. package/workflows/powers/database-design/POWER.md +47 -33
  59. package/workflows/powers/database-design/SKILL.md +47 -33
  60. package/workflows/powers/database-optimizer/POWER.md +43 -64
  61. package/workflows/powers/database-optimizer/SKILL.md +43 -64
  62. package/workflows/powers/database-skills/POWER.md +59 -93
  63. package/workflows/powers/database-skills/SKILL.md +59 -93
  64. package/workflows/powers/debugging-strategies/POWER.md +69 -0
  65. package/workflows/powers/debugging-strategies/SKILL.md +66 -0
  66. package/workflows/powers/debugging-strategies/references/reproduce-isolate-verify-checklist.md +42 -0
  67. package/workflows/powers/deep-research/POWER.md +67 -0
  68. package/workflows/powers/deep-research/SKILL.md +64 -0
  69. package/workflows/powers/deep-research/references/multi-round-research-loop.md +80 -0
  70. package/workflows/powers/design-system-builder/POWER.md +130 -116
  71. package/workflows/powers/design-system-builder/SKILL.md +130 -116
  72. package/workflows/powers/devops-engineer/POWER.md +120 -57
  73. package/workflows/powers/devops-engineer/SKILL.md +120 -57
  74. package/workflows/powers/docker-kubernetes/POWER.md +94 -0
  75. package/workflows/powers/docker-kubernetes/SKILL.md +91 -0
  76. package/workflows/powers/docker-kubernetes/references/dockerfile-optimization-checklist.md +35 -0
  77. package/workflows/powers/docker-kubernetes/references/kubernetes-deployment-patterns.md +59 -0
  78. package/workflows/powers/documentation-templates/POWER.md +158 -127
  79. package/workflows/powers/documentation-templates/SKILL.md +158 -127
  80. package/workflows/powers/drizzle-expert/POWER.md +66 -0
  81. package/workflows/powers/drizzle-expert/SKILL.md +63 -0
  82. package/workflows/powers/drizzle-expert/references/runtime-pairing-matrix.md +16 -0
  83. package/workflows/powers/drizzle-expert/references/schema-and-migration-playbook.md +18 -0
  84. package/workflows/powers/error-ux-observability/POWER.md +144 -131
  85. package/workflows/powers/error-ux-observability/SKILL.md +143 -131
  86. package/workflows/powers/fastapi-expert/POWER.md +46 -60
  87. package/workflows/powers/fastapi-expert/SKILL.md +46 -60
  88. package/workflows/powers/firebase/POWER.md +65 -0
  89. package/workflows/powers/firebase/SKILL.md +62 -0
  90. package/workflows/powers/firebase/references/platform-routing.md +16 -0
  91. package/workflows/powers/firebase/references/rules-and-indexes-checklist.md +11 -0
  92. package/workflows/powers/flutter-design-system/POWER.md +63 -0
  93. package/workflows/powers/flutter-design-system/SKILL.md +60 -0
  94. package/workflows/powers/flutter-design-system/references/shared-widgets.md +29 -0
  95. package/workflows/powers/flutter-design-system/references/tokens-and-theme.md +34 -0
  96. package/workflows/powers/flutter-drift/POWER.md +65 -0
  97. package/workflows/powers/flutter-drift/SKILL.md +62 -0
  98. package/workflows/powers/flutter-drift/references/migrations.md +22 -0
  99. package/workflows/powers/flutter-drift/references/query-patterns.md +26 -0
  100. package/workflows/powers/flutter-feature/POWER.md +65 -0
  101. package/workflows/powers/flutter-feature/SKILL.md +62 -0
  102. package/workflows/powers/flutter-feature/references/architecture-rules.md +85 -0
  103. package/workflows/powers/flutter-feature/references/composite-provider.md +58 -0
  104. package/workflows/powers/flutter-feature/references/outbox-pattern.md +87 -0
  105. package/workflows/powers/flutter-feature/references/testing-patterns.md +218 -0
  106. package/workflows/powers/flutter-go-router/POWER.md +64 -0
  107. package/workflows/powers/flutter-go-router/SKILL.md +61 -0
  108. package/workflows/powers/flutter-go-router/references/guards-and-deeplinks.md +20 -0
  109. package/workflows/powers/flutter-go-router/references/typed-routes.md +27 -0
  110. package/workflows/powers/flutter-offline-sync/POWER.md +62 -0
  111. package/workflows/powers/flutter-offline-sync/SKILL.md +59 -0
  112. package/workflows/powers/flutter-offline-sync/references/outbox-full.md +44 -0
  113. package/workflows/powers/flutter-repository/POWER.md +64 -0
  114. package/workflows/powers/flutter-repository/SKILL.md +61 -0
  115. package/workflows/powers/flutter-repository/references/drift-patterns.md +21 -0
  116. package/workflows/powers/flutter-repository/references/retrofit-patterns.md +20 -0
  117. package/workflows/powers/flutter-riverpod/POWER.md +70 -0
  118. package/workflows/powers/flutter-riverpod/SKILL.md +67 -0
  119. package/workflows/powers/flutter-riverpod/references/async-and-mutations.md +19 -0
  120. package/workflows/powers/flutter-riverpod/references/async-lifecycle.md +19 -0
  121. package/workflows/powers/flutter-riverpod/references/provider-selection.md +20 -0
  122. package/workflows/powers/flutter-riverpod/references/testing.md +21 -0
  123. package/workflows/powers/flutter-riverpod/references/version-matrix.md +24 -0
  124. package/workflows/powers/flutter-state-machine/POWER.md +62 -0
  125. package/workflows/powers/flutter-state-machine/SKILL.md +59 -0
  126. package/workflows/powers/flutter-state-machine/references/app-state-contract.md +23 -0
  127. package/workflows/powers/flutter-state-machine/references/ui-rendering.md +14 -0
  128. package/workflows/powers/flutter-testing/POWER.md +64 -0
  129. package/workflows/powers/flutter-testing/SKILL.md +61 -0
  130. package/workflows/powers/flutter-testing/references/offline-sync-tests.md +16 -0
  131. package/workflows/powers/flutter-testing/references/test-layers.md +33 -0
  132. package/workflows/powers/frontend-code-review/POWER.md +137 -0
  133. package/workflows/powers/frontend-code-review/SKILL.md +134 -0
  134. package/workflows/powers/frontend-code-review/references/common-antipatterns.md +86 -0
  135. package/workflows/powers/frontend-code-review/references/performance-budgets.md +56 -0
  136. package/workflows/powers/frontend-code-review/references/review-checklists.md +47 -0
  137. package/workflows/powers/frontend-design/POWER.md +163 -362
  138. package/workflows/powers/frontend-design/SKILL.md +163 -362
  139. package/workflows/powers/game-development/POWER.md +57 -140
  140. package/workflows/powers/game-development/SKILL.md +57 -140
  141. package/workflows/powers/geo-fundamentals/POWER.md +64 -126
  142. package/workflows/powers/geo-fundamentals/SKILL.md +64 -127
  143. package/workflows/powers/git-workflow/POWER.md +135 -0
  144. package/workflows/powers/git-workflow/SKILL.md +132 -0
  145. package/workflows/powers/git-workflow/references/pr-review-checklist.md +63 -0
  146. package/workflows/powers/golang-pro/POWER.md +46 -35
  147. package/workflows/powers/golang-pro/SKILL.md +46 -35
  148. package/workflows/powers/graphql-architect/POWER.md +44 -62
  149. package/workflows/powers/graphql-architect/SKILL.md +44 -62
  150. package/workflows/powers/i18n-localization/POWER.md +118 -103
  151. package/workflows/powers/i18n-localization/SKILL.md +118 -103
  152. package/workflows/powers/java-pro/POWER.md +47 -22
  153. package/workflows/powers/java-pro/SKILL.md +47 -22
  154. package/workflows/powers/javascript-pro/POWER.md +47 -34
  155. package/workflows/powers/javascript-pro/SKILL.md +47 -34
  156. package/workflows/powers/kotlin-pro/POWER.md +46 -23
  157. package/workflows/powers/kotlin-pro/SKILL.md +46 -23
  158. package/workflows/powers/legacy-modernizer/POWER.md +43 -60
  159. package/workflows/powers/legacy-modernizer/SKILL.md +43 -60
  160. package/workflows/powers/mcp-builder/POWER.md +65 -0
  161. package/workflows/powers/mcp-builder/SKILL.md +62 -0
  162. package/workflows/powers/mcp-builder/references/testing-and-evals.md +17 -0
  163. package/workflows/powers/mcp-builder/references/transport-and-tool-design.md +17 -0
  164. package/workflows/powers/microservices-architect/POWER.md +43 -70
  165. package/workflows/powers/microservices-architect/SKILL.md +43 -70
  166. package/workflows/powers/mobile-design/POWER.md +110 -345
  167. package/workflows/powers/mobile-design/SKILL.md +110 -345
  168. package/workflows/powers/mongodb/POWER.md +67 -0
  169. package/workflows/powers/mongodb/SKILL.md +64 -0
  170. package/workflows/powers/mongodb/references/mongodb-checklist.md +20 -0
  171. package/workflows/powers/mysql/POWER.md +67 -0
  172. package/workflows/powers/mysql/SKILL.md +64 -0
  173. package/workflows/powers/mysql/references/mysql-checklist.md +20 -0
  174. package/workflows/powers/neki/POWER.md +67 -0
  175. package/workflows/powers/neki/SKILL.md +64 -0
  176. package/workflows/powers/neki/references/neki-checklist.md +18 -0
  177. package/workflows/powers/nestjs-expert/POWER.md +45 -91
  178. package/workflows/powers/nestjs-expert/SKILL.md +45 -91
  179. package/workflows/powers/nextjs-developer/POWER.md +51 -44
  180. package/workflows/powers/nextjs-developer/SKILL.md +51 -44
  181. package/workflows/powers/nodejs-best-practices/POWER.md +48 -29
  182. package/workflows/powers/nodejs-best-practices/SKILL.md +48 -29
  183. package/workflows/powers/observability/POWER.md +109 -0
  184. package/workflows/powers/observability/SKILL.md +106 -0
  185. package/workflows/powers/observability/references/alerting-and-slo-checklist.md +87 -0
  186. package/workflows/powers/observability/references/opentelemetry-setup-guide.md +121 -0
  187. package/workflows/powers/openai-docs/POWER.md +61 -0
  188. package/workflows/powers/openai-docs/SKILL.md +58 -0
  189. package/workflows/powers/openai-docs/references/official-source-playbook.md +10 -0
  190. package/workflows/powers/performance-profiling/POWER.md +61 -114
  191. package/workflows/powers/performance-profiling/SKILL.md +61 -114
  192. package/workflows/powers/php-pro/POWER.md +116 -0
  193. package/workflows/powers/php-pro/SKILL.md +113 -0
  194. package/workflows/powers/php-pro/references/architecture-and-di.md +239 -0
  195. package/workflows/powers/php-pro/references/modern-php-features.md +189 -0
  196. package/workflows/powers/php-pro/references/performance-and-deployment.md +197 -0
  197. package/workflows/powers/php-pro/references/php84-strict-typing-checklist.md +161 -0
  198. package/workflows/powers/php-pro/references/testing-and-static-analysis.md +235 -0
  199. package/workflows/powers/playwright-e2e/POWER.md +85 -0
  200. package/workflows/powers/playwright-e2e/SKILL.md +82 -0
  201. package/workflows/powers/playwright-e2e/references/locator-trace-flake-checklist.md +80 -0
  202. package/workflows/powers/postgres/POWER.md +67 -0
  203. package/workflows/powers/postgres/SKILL.md +64 -0
  204. package/workflows/powers/postgres/references/postgres-checklist.md +20 -0
  205. package/workflows/powers/prompt-engineer/POWER.md +47 -30
  206. package/workflows/powers/prompt-engineer/SKILL.md +47 -30
  207. package/workflows/powers/python-pro/POWER.md +47 -36
  208. package/workflows/powers/python-pro/SKILL.md +47 -36
  209. package/workflows/powers/react-best-practices/POWER.md +56 -33
  210. package/workflows/powers/react-best-practices/SKILL.md +56 -33
  211. package/workflows/powers/react-expert/POWER.md +47 -37
  212. package/workflows/powers/react-expert/SKILL.md +47 -37
  213. package/workflows/powers/redis/POWER.md +67 -0
  214. package/workflows/powers/redis/SKILL.md +64 -0
  215. package/workflows/powers/redis/references/redis-checklist.md +19 -0
  216. package/workflows/powers/ruby-pro/POWER.md +118 -0
  217. package/workflows/powers/ruby-pro/SKILL.md +115 -0
  218. package/workflows/powers/ruby-pro/references/modern-ruby-features.md +189 -0
  219. package/workflows/powers/ruby-pro/references/object-design-patterns.md +220 -0
  220. package/workflows/powers/ruby-pro/references/performance-and-profiling.md +224 -0
  221. package/workflows/powers/ruby-pro/references/ruby-concurrency-and-testing.md +190 -0
  222. package/workflows/powers/ruby-pro/references/testing-and-rspec.md +236 -0
  223. package/workflows/powers/rust-pro/POWER.md +45 -31
  224. package/workflows/powers/rust-pro/SKILL.md +45 -31
  225. package/workflows/powers/security-engineer/POWER.md +129 -0
  226. package/workflows/powers/security-engineer/SKILL.md +126 -0
  227. package/workflows/powers/seo-fundamentals/POWER.md +59 -102
  228. package/workflows/powers/seo-fundamentals/SKILL.md +59 -102
  229. package/workflows/powers/serverless-patterns/POWER.md +171 -0
  230. package/workflows/powers/serverless-patterns/SKILL.md +168 -0
  231. package/workflows/powers/skill-creator/POWER.md +90 -0
  232. package/workflows/powers/skill-creator/SKILL.md +87 -0
  233. package/workflows/powers/skill-creator/references/platform-formats.md +181 -0
  234. package/workflows/powers/skill-creator/references/schemas.md +430 -0
  235. package/workflows/powers/spec-miner/POWER.md +49 -57
  236. package/workflows/powers/spec-miner/SKILL.md +49 -57
  237. package/workflows/powers/sqlite/POWER.md +67 -0
  238. package/workflows/powers/sqlite/SKILL.md +64 -0
  239. package/workflows/powers/sqlite/references/sqlite-checklist.md +19 -0
  240. package/workflows/powers/sre-engineer/POWER.md +123 -64
  241. package/workflows/powers/sre-engineer/SKILL.md +123 -64
  242. package/workflows/powers/static-analysis/POWER.md +121 -77
  243. package/workflows/powers/static-analysis/SKILL.md +121 -77
  244. package/workflows/powers/stripe-best-practices/POWER.md +140 -17
  245. package/workflows/powers/stripe-best-practices/SKILL.md +139 -17
  246. package/workflows/powers/supabase/POWER.md +67 -0
  247. package/workflows/powers/supabase/SKILL.md +64 -0
  248. package/workflows/powers/supabase/references/supabase-checklist.md +19 -0
  249. package/workflows/powers/swift-pro/POWER.md +118 -0
  250. package/workflows/powers/swift-pro/SKILL.md +115 -0
  251. package/workflows/powers/swift-pro/references/concurrency-patterns.md +165 -0
  252. package/workflows/powers/swift-pro/references/protocol-and-generics.md +172 -0
  253. package/workflows/powers/swift-pro/references/sendable-and-isolation.md +116 -0
  254. package/workflows/powers/swift-pro/references/swift-concurrency-and-protocols.md +260 -0
  255. package/workflows/powers/swift-pro/references/testing-and-packages.md +192 -0
  256. package/workflows/powers/tailwind-patterns/POWER.md +71 -240
  257. package/workflows/powers/tailwind-patterns/SKILL.md +71 -240
  258. package/workflows/powers/testing-patterns/POWER.md +155 -10
  259. package/workflows/powers/testing-patterns/SKILL.md +155 -10
  260. package/workflows/powers/typescript-pro/POWER.md +47 -38
  261. package/workflows/powers/typescript-pro/SKILL.md +47 -38
  262. package/workflows/powers/vitess/POWER.md +67 -0
  263. package/workflows/powers/vitess/SKILL.md +64 -0
  264. package/workflows/powers/vitess/references/vitess-checklist.md +19 -0
  265. package/workflows/powers/vulnerability-scanner/POWER.md +146 -10
  266. package/workflows/powers/vulnerability-scanner/SKILL.md +146 -10
  267. package/workflows/powers/web-perf/POWER.md +43 -170
  268. package/workflows/powers/web-perf/SKILL.md +43 -170
  269. package/workflows/powers/webapp-testing/POWER.md +43 -164
  270. package/workflows/powers/webapp-testing/SKILL.md +43 -164
@@ -0,0 +1,226 @@
1
+ # Workflow Patterns Reference
2
+
3
+ Load this when choosing or implementing a workflow pattern for a CBX agent or skill.
4
+
5
+ Source: Anthropic engineering research — [Common workflow patterns for AI agents](https://claude.com/blog/common-workflow-patterns-for-ai-agents-and-when-to-use-them) (March 2026).
6
+
7
+ ---
8
+
9
+ ## The Core Insight
10
+
11
+ Workflows don't replace agent autonomy — they _shape where and how_ agents apply it.
12
+
13
+ A fully autonomous agent decides everything: tools, order, when to stop.
14
+ A workflow provides structure: overall flow, checkpoints, boundaries — but each step still uses full agent reasoning.
15
+
16
+ **Start with a single agent call.** If that meets your quality bar, you're done. Only add workflow complexity when you can measure the improvement.
17
+
18
+ ---
19
+
20
+ ## Pattern 1: Sequential Workflow
21
+
22
+ ### What it is
23
+
24
+ Agents execute in a fixed order. Each stage processes its input, makes tool calls, then passes results to the next stage.
25
+
26
+ ```
27
+ Input → [Agent A] → [Agent B] → [Agent C] → Output
28
+ ```
29
+
30
+ ### Use when
31
+
32
+ - Steps have explicit dependencies (B needs A's output before starting)
33
+ - Multi-stage transformation where each step adds specific value
34
+ - Draft-review-polish cycles
35
+ - Data extraction → validation → loading pipelines
36
+
37
+ ### Avoid when
38
+
39
+ - A single agent can handle the whole task
40
+ - Agents need to collaborate rather than hand off linearly
41
+ - You're forcing sequential structure onto a task that doesn't naturally fit it
42
+
43
+ ### Cost/benefit
44
+
45
+ - **Cost:** Latency is linear — step 2 waits for step 1
46
+ - **Benefit:** Each agent focuses on one thing; accuracy often improves
47
+
48
+ ### CBX implementation
49
+
50
+ ```markdown
51
+ ## Workflow
52
+
53
+ 1. **[Agent/Step A]** — [what it receives, what it does, what it produces]
54
+ 2. **[Agent/Step B]** — [takes A's output, does X, produces Y]
55
+ 3. **[Agent/Step C]** — [final synthesis/delivery]
56
+
57
+ Artifacts pass via [file path / variable / structured JSON / natural handoff instructions].
58
+ ```
59
+
60
+ ### Pro tip
61
+
62
+ First try the pipeline as a single agent where the steps are part of the prompt. If quality is good enough, you've solved the problem without complexity.
63
+
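Reviewer note: the sequential pattern above can be sketched in a few lines of Python. This is a minimal illustration, not part of the package. The stage functions `extract`, `validate`, and `load` and the helper `run_sequential` are hypothetical names standing in for real agent calls.

```python
from typing import Callable


# Hypothetical stages: in practice each would wrap one agent call.
def extract(raw: str) -> dict:
    """Stage A: turn raw text into a structured record."""
    name, _, age = raw.partition(",")
    return {"name": name.strip(), "age": age.strip()}


def validate(record: dict) -> dict:
    """Stage B: check and normalize the record produced by stage A."""
    if not record["age"].isdigit():
        raise ValueError(f"age is not a number: {record['age']!r}")
    return {**record, "age": int(record["age"])}


def load(record: dict) -> str:
    """Stage C: final delivery step."""
    return f"loaded {record['name']} ({record['age']})"


def run_sequential(stages: list[Callable], payload):
    """Run stages in a fixed order; each stage receives the previous output."""
    for stage in stages:
        payload = stage(payload)
    return payload


result = run_sequential([extract, validate, load], "Ada, 36")
print(result)
```

Because each stage only sees the previous stage's output, total latency is the sum of the stage latencies, which matches the cost noted above.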
64
+ ---
65
+
66
+ ## Pattern 2: Parallel Workflow
67
+
68
+ ### What it is
69
+
70
+ Multiple agents run simultaneously on independent tasks. Results are merged or synthesized afterward.
71
+
72
+ ```
73
+         ┌→ [Agent A] →┐
74
+ Input → ├→ [Agent B] →├→ Synthesize → Output
75
+         └→ [Agent C] →┘
76
+ ```
77
+
78
+ ### Use when
79
+
80
+ - Tasks are genuinely independent (no agent needs another's output to start)
81
+ - Speed matters and concurrent execution helps
82
+ - Multiple perspectives on the same input (e.g., code review from security + performance + quality)
83
+ - Separation of concerns — different engineers can own individual agents
84
+
85
+ ### Avoid when
86
+
87
+ - Agents need cumulative context or must build on each other's work
88
+ - Resource constraints (API quotas) make concurrent calls inefficient
89
+ - Aggregation logic is unclear or produces contradictory results with no resolution strategy
90
+
91
+ ### Cost/benefit
92
+
93
+ - **Cost:** Tokens multiply (N agents × tokens each); requires aggregation strategy
94
+ - **Benefit:** Faster completion; clean separation of concerns
95
+
96
+ ### CBX implementation
97
+
98
+ ```markdown
99
+ ## Parallel Steps
100
+
101
+ Run these simultaneously:
102
+
103
+ - **[Agent A]** — [focused task, specific scope]
104
+ - **[Agent B]** — [focused task, different scope]
105
+ - **[Agent C]** — [focused task, different scope]
106
+
107
+ ## Synthesis
108
+
109
+ After all agents complete:
110
+ [How to merge: majority vote / highest confidence / defer to the most specialized agent / human review]
111
+ ```
112
+
113
+ ### Pro tip
114
+
115
+ Design your aggregation strategy _before_ implementing parallel agents. Without a clear merge plan, you collect conflicting outputs with no way to resolve them.
116
+
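Reviewer note: a minimal Python sketch of the parallel pattern, assuming three independent reviewers and a majority-vote merge. The reviewer functions and the helpers `run_parallel` and `majority_vote` are hypothetical and not taken from the package; real agents would make model calls instead of string checks.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


# Hypothetical reviewers: each stands in for an independent agent call.
def security_review(code: str) -> str:
    return "reject" if "eval(" in code else "approve"


def performance_review(code: str) -> str:
    return "reject" if "sleep(" in code else "approve"


def quality_review(code: str) -> str:
    return "approve" if code.strip() else "reject"


def run_parallel(agents, payload):
    """Run every agent on the same input concurrently; keep results in agent order."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [pool.submit(agent, payload) for agent in agents]
        return [future.result() for future in futures]


def majority_vote(verdicts):
    """Aggregation strategy chosen up front: the most common verdict wins."""
    return Counter(verdicts).most_common(1)[0][0]


verdicts = run_parallel(
    [security_review, performance_review, quality_review],
    "print(eval(user_input))",
)
decision = majority_vote(verdicts)
print(verdicts, decision)
```

In this run the majority approves even though the security reviewer rejects. That is the kind of conflict the merge plan has to anticipate, and a reason to consider deferring to a specialist or to human review for some inputs.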
117
+ ---
118
+
119
+ ## Pattern 3: Evaluator-Optimizer Workflow
120
+
121
+ ### What it is
122
+
123
+ Two agents loop: one generates content, the other evaluates it against criteria, and the generator refines the draft based on that feedback. The loop repeats until the quality threshold is met or the maximum number of iterations is reached.
124
+
125
+ ```
126
+          ┌─────────────────────────────────────┐
127
+          ↓                                     |
128
+ Input → [Generator] → Draft → [Evaluator] → Pass? → Output
129
+                                              ↓ Fail
130
+                                          Feedback → [Generator]
131
+ ```
132
+
133
+ ### Use when
134
+
135
+ - First-draft quality consistently falls short of the required bar
136
+ - You have clear, measurable quality criteria an AI evaluator can apply consistently
137
+ - The gap between first-attempt and final quality justifies extra tokens and latency
138
+ - Examples: technical docs, customer communications, code against specific standards
139
+
140
+ ### Avoid when
141
+
142
+ - First-attempt quality already meets requirements (unnecessary cost)
143
+ - Real-time applications needing immediate responses
144
+ - Evaluation criteria are too subjective for consistent AI evaluation
145
+ - Deterministic tools exist (linters for style, validators for schemas) — use those instead
146
+
147
+ ### Cost/benefit
148
+
149
+ - **Cost:** Tokens × iterations; adds latency proportionally
150
+ - **Benefit:** Structured feedback loops produce measurably better outputs
151
+
152
+ ### CBX implementation
153
+
154
+ ```markdown
155
+ ## Generator Prompt
156
+
157
+ Task: [what to create]
158
+ Constraints: [specific, measurable requirements]
159
+ Format: [exact output format]
160
+
161
+ ## Evaluator Prompt
162
+
163
+ Review this output against these criteria:
164
+
165
+ 1. [Criterion A] — Pass/Fail + specific failure note
166
+ 2. [Criterion B] — Pass/Fail + specific failure note
167
+ 3. [Criterion C] — Pass/Fail + specific failure note
168
+
169
+ Output JSON: { "pass": bool, "failures": ["..."], "revision_note": "..." }
170
+
171
+ ## Loop Control
172
+
173
+ - Max iterations: [3-5]
174
+ - Stop when: all criteria pass OR max iterations reached
175
+ - On max with failures: surface remaining issues for human review
176
+ ```
177
+
178
+ ### Pro tip
179
+
180
+ Set stopping criteria _before_ iterating. Define max iterations and specific quality thresholds. Without guardrails, you can end up in expensive loops where the evaluator keeps finding minor issues long after quality has plateaued.
181
+
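Reviewer note: the loop control described above can be sketched as follows. This is an illustration under stated assumptions: `generate` and `evaluate` are hypothetical stand-ins for model calls, and the verdict dictionary mirrors the `pass` / `failures` / `revision_note` shape from the evaluator template.

```python
from typing import Optional


def generate(task: str, feedback: Optional[str] = None) -> str:
    """Hypothetical generator: a real one would send the task and feedback to a model."""
    draft = f"Summary of {task}"
    if feedback:
        draft += " with sources cited"
    return draft


def evaluate(draft: str) -> dict:
    """Hypothetical evaluator returning the JSON shape used in the template."""
    failures = []
    if "sources cited" not in draft:
        failures.append("missing citations")
    return {
        "pass": not failures,
        "failures": failures,
        "revision_note": "; ".join(failures),
    }


def evaluator_optimizer(task: str, max_iterations: int = 3):
    """Loop until the evaluator passes the draft or the iteration cap is hit."""
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    feedback = None
    for iteration in range(1, max_iterations + 1):
        draft = generate(task, feedback)
        verdict = evaluate(draft)
        if verdict["pass"]:
            return draft, iteration, []
        feedback = verdict["revision_note"]
    # Cap reached with failures: surface the remaining issues for human review.
    return draft, max_iterations, verdict["failures"]


draft, iterations, open_issues = evaluator_optimizer("the Q3 report")
print(draft, iterations, open_issues)
```

The cap and the pass condition are fixed before the first iteration, so the loop cannot run indefinitely even if the evaluator never passes a draft.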
182
+ ---
183
+
184
+ ## Decision Tree
185
+
186
+ ```
187
+ Can a single agent handle this task effectively?
188
+ → YES: Don't use workflows. Use a rich single-agent prompt.
189
+ → NO: Continue...
190
+
191
+ Do steps have dependencies (B needs A's output)?
192
+ → YES: Use Sequential
193
+ → NO: Continue...
194
+
195
+ Can steps run independently, and would concurrency help?
196
+ → YES: Use Parallel
197
+ → NO: Continue...
198
+
199
+ Does quality improve meaningfully through iteration, and can you measure it?
200
+ → YES: Use Evaluator-Optimizer
201
+ → NO: Re-examine whether workflows help at all
202
+ ```
203
+
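Reviewer note: the decision tree reads naturally as a chain of early returns. The function below is a sketch only; `choose_pattern` and its parameter names are hypothetical, and the four booleans are judgments a person supplies rather than values the package computes.

```python
def choose_pattern(
    single_agent_ok: bool,
    has_dependencies: bool,
    independent_and_concurrency_helps: bool,
    iteration_measurably_helps: bool,
) -> str:
    """Walk the decision tree top to bottom and return the first match."""
    if single_agent_ok:
        return "single-agent"
    if has_dependencies:
        return "sequential"
    if independent_and_concurrency_helps:
        return "parallel"
    if iteration_measurably_helps:
        return "evaluator-optimizer"
    return "re-examine whether workflows help"


print(choose_pattern(False, False, True, True))
```

The order of the checks matters: a task that could use either parallel execution or iteration resolves to parallel first, as in the tree.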
204
+ ---
205
+
206
+ ## Combining Patterns
207
+
208
+ Patterns are building blocks, not mutually exclusive:
209
+
210
+ - A **sequential workflow** can include **parallel** steps at certain stages (e.g., three parallel reviewers before a final synthesis step)
211
+ - An **evaluator-optimizer** can use **parallel evaluation** where multiple evaluators assess different quality dimensions simultaneously
212
+ - A **sequential chain** can use **evaluator-optimizer** at the critical high-quality step
213
+
214
+ Only combine patterns when each additional pattern measurably improves outcomes.
215
+
216
+ ---
217
+
218
+ ## Pattern Comparison
219
+
220
+ | | Sequential | Parallel | Evaluator-Optimizer |
221
+ | -------------- | -------------------------------------------- | --------------------------------------- | ------------------------------------ |
222
+ | **When** | Dependencies between steps | Independent tasks | Quality below bar |
223
+ | **Examples** | Extract → validate → load; Draft → translate | Code review (security + perf + quality) | Technical docs, comms, SQL |
224
+ | **Latency** | Linear (each waits for previous) | Fast (concurrent) | Multiplied by iterations |
225
+ | **Token cost** | Linear | Multiplicative | Linear × iterations |
226
+ | **Key risk** | Bottleneck at slow steps | Aggregation conflicts | Infinite loops without stop criteria |
@@ -0,0 +1,62 @@
1
+ ````markdown
2
+ ---
3
+ inclusion: manual
4
+ name: agentic-eval
5
+ description: "Use when evaluating an agent, skill, workflow, or MCP server: rubric design, evaluator-optimizer loops, LLM-as-judge patterns, regression suites, or prototype-vs-production quality gaps."
6
+ license: MIT
7
+ metadata:
8
+ author: cubis-foundry
9
+ version: "1.0"
10
+ compatibility: Claude Code, Codex, GitHub Copilot
11
+ ---
12
+
13
+ # Agentic Eval
14
+
15
+ ## Purpose
16
+
17
+ You are the specialist for evaluating agent systems, skills, and workflows.
18
+
19
+ Your job is to separate prototype confidence from production confidence and force explicit rubrics, failure cases, and regression evidence.
20
+
21
+ ## When to Use
22
+
23
+ - Designing evaluation sets, rubrics, or judge loops for skills, agents, or MCP servers.
24
+ - Comparing prompt, skill, or workflow variants.
25
+ - Tightening regression proof for agent behavior.
26
+
27
+ ## Instructions
28
+
29
+ ### STANDARD OPERATING PROCEDURE (SOP)
30
+
31
+ 1. Define the behavior under test and the failure modes that matter.
32
+ 2. Separate qualitative review from rubric or judge-based scoring.
33
+ 3. Build a repeatable regression set before optimizing variants.
34
+ 4. Treat judge-model output as evidence, not authority.
35
+ 5. Report what the evaluation proves and what it still does not prove.
36
+
37
+ ### Constraints
38
+
39
+ - Do not confuse evaluation with implementation.
40
+ - Do not use judge-model output as unquestioned truth.
41
+ - Do not ship one-off demos as production evidence.
42
+ - Do not broaden into generic QA planning when the target is an agent or skill system.
43
+
44
+ ## Output Format
45
+
46
+ Provide implementation guidance, code examples, and configuration as appropriate to the task.
47
+
48
+ ## References
49
+
50
+ | File | Load when |
51
+ | ----------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
52
+ | `references/rubric-and-regression-checklist.md` | You need the checklist for rubrics, judge loops, variance handling, and production-quality evidence. |
53
+
54
+ ## Scripts
55
+
56
+ No helper scripts are required for this skill right now. Keep execution in `SKILL.md` and `references/` unless repeated automation becomes necessary.
57
+
58
+ ## Examples
59
+
60
+ - "Help me with agentic eval best practices in this project"
61
+ - "Review my agentic eval implementation for issues"
62
+ ````
@@ -0,0 +1,59 @@
1
+ ---
2
+ name: agentic-eval
3
+ description: "Use when evaluating an agent, skill, workflow, or MCP server: rubric design, evaluator-optimizer loops, LLM-as-judge patterns, regression suites, or prototype-vs-production quality gaps."
4
+ license: MIT
5
+ metadata:
6
+ author: cubis-foundry
7
+ version: "1.0"
8
+ compatibility: Claude Code, Codex, GitHub Copilot
9
+ ---
10
+
11
+ # Agentic Eval
12
+
13
+ ## Purpose
14
+
15
+ You are the specialist for evaluating agent systems, skills, and workflows.
16
+
17
+ Your job is to separate prototype confidence from production confidence and force explicit rubrics, failure cases, and regression evidence.
18
+
19
+ ## When to Use
20
+
21
+ - Designing evaluation sets, rubrics, or judge loops for skills, agents, or MCP servers.
22
+ - Comparing prompt, skill, or workflow variants.
23
+ - Tightening regression proof for agent behavior.
24
+
25
+ ## Instructions
26
+
27
+ ### STANDARD OPERATING PROCEDURE (SOP)
28
+
29
+ 1. Define the behavior under test and the failure modes that matter.
30
+ 2. Separate qualitative review from rubric or judge-based scoring.
31
+ 3. Build a repeatable regression set before optimizing variants.
32
+ 4. Treat judge-model output as evidence, not authority.
33
+ 5. Report what the evaluation proves and what it still does not prove.
34
+
35
+ ### Constraints
36
+
37
+ - Do not confuse evaluation with implementation.
38
+ - Do not use judge-model output as unquestioned truth.
39
+ - Do not ship one-off demos as production evidence.
40
+ - Do not broaden into generic QA planning when the target is an agent or skill system.
41
+
42
+ ## Output Format
43
+
44
+ Provide implementation guidance, code examples, and configuration as appropriate to the task.
45
+
46
+ ## References
47
+
48
+ | File | Load when |
49
+ | ----------------------------------------------- | ---------------------------------------------------------------------------------------------------- |
50
+ | `references/rubric-and-regression-checklist.md` | You need the checklist for rubrics, judge loops, variance handling, and production-quality evidence. |
51
+
52
+ ## Scripts
53
+
54
+ No helper scripts are required for this skill right now. Keep execution in `SKILL.md` and `references/` unless repeated automation becomes necessary.
55
+
56
+ ## Examples
57
+
58
+ - "Help me with agentic eval best practices in this project"
59
+ - "Review my agentic eval implementation for issues"
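Note (not part of the package diff): SOP steps 3 to 5 in the added `SKILL.md` can be illustrated with a small sketch. It assumes a fixed regression set in which every case already carries a human label, and it measures how often a judge model agrees with those labels before the judge is trusted as evidence. The names `Case`, `judge_agreement`, `report`, and the `min_agreement` threshold of 0.8 are hypothetical and do not come from the package.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Case:
    """One regression case with a human-reviewed verdict (hypothetical shape)."""
    case_id: str
    prompt: str
    human_pass: bool  # human label for the agent output on this case


def judge_agreement(cases: List[Case], judge_verdicts: Dict[str, bool]) -> float:
    """Fraction of cases where the judge verdict matches the human label."""
    if not cases:
        raise ValueError("regression set is empty")
    matches = sum(1 for c in cases if judge_verdicts[c.case_id] == c.human_pass)
    return matches / len(cases)


def report(cases: List[Case], judge_verdicts: Dict[str, bool],
           min_agreement: float = 0.8) -> dict:
    """Summarize what the judge run supports and what still needs human review."""
    agreement = judge_agreement(cases, judge_verdicts)
    disagreements = [c.case_id for c in cases
                     if judge_verdicts[c.case_id] != c.human_pass]
    return {
        "agreement": agreement,
        # The judge is treated as evidence only above the assumed threshold.
        "judge_usable_as_evidence": agreement >= min_agreement,
        "needs_human_review": disagreements,
    }
```

The report keeps the disagreeing case IDs so the final write-up can state what the evaluation does not yet prove, rather than collapsing everything into a single score.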
@@ -0,0 +1,11 @@
1
+ # Rubric And Regression Checklist
2
+
3
+ Load this when building or reviewing an agent evaluation loop.
4
+
5
+ ## Checklist
6
+
7
+ - Define success and failure explicitly.
8
+ - Separate must-pass regressions from exploratory or qualitative review.
9
+ - Keep judge prompts and rubrics stable while comparing variants.
10
+ - Record where human review is still required.
11
+ - Distinguish “works once” from “works reliably across the chosen set.”
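Note (not part of the package diff): the checklist items on must-pass regressions and on "works once" versus "works reliably" can be sketched as a repeated-trial gate. This is a minimal illustration under stated assumptions: the agent is any callable from prompt to output string, each case supplies its own explicit check, and a must-pass case has to pass in every trial. `RegressionCase`, `pass_rates`, and `gate` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RegressionCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # explicit success criterion for the output
    must_pass: bool               # False marks the case as exploratory


def pass_rates(agent: Callable[[str], str], cases: List[RegressionCase],
               trials: int = 5) -> Dict[str, float]:
    """Run every case several times and record the fraction of passing trials."""
    rates = {}
    for case in cases:
        passed = sum(1 for _ in range(trials) if case.check(agent(case.prompt)))
        rates[case.name] = passed / trials
    return rates


def gate(cases: List[RegressionCase], rates: Dict[str, float]) -> List[str]:
    """Return must-pass cases that failed in at least one trial."""
    return [c.name for c in cases if c.must_pass and rates[c.name] < 1.0]
```

Exploratory cases still get a pass rate, but only must-pass cases can block a variant. A case that passes in one trial out of several shows up as a rate below 1.0 instead of a misleading single success.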
@@ -1,97 +1,69 @@
1
1
  ````markdown
2
2
  ---
3
3
  inclusion: manual
4
- name: "api-designer"
5
- description: "Use when designing REST or GraphQL APIs, creating OpenAPI specifications, or planning API architecture. Invoke for resource modeling, versioning strategies, pagination patterns, error handling standards."
4
+ name: api-designer
5
+ description: "Use when defining or reviewing external API contracts, OpenAPI specifications, resource models, pagination, versioning, and error response standards. Do not use for pure database design or framework-only handler wiring."
6
+ license: MIT
7
+ metadata:
8
+ author: cubis-foundry
9
+ version: "3.0"
10
+ compatibility: Claude Code, Codex, GitHub Copilot
6
11
  ---
7
12
 
8
-
9
13
  # API Designer
10
14
 
11
- ## Overview
12
-
13
- Senior API architect with expertise in designing scalable, developer-friendly REST and GraphQL APIs with comprehensive OpenAPI specifications.
14
-
15
- ## Role Definition
16
-
17
- You are a senior API designer with 10+ years of experience creating intuitive, scalable API architectures. You specialize in REST design patterns, OpenAPI 3.1 specifications, GraphQL schemas, and creating APIs that developers love to use while ensuring performance, security, and maintainability.
18
-
19
- ## When to Use This Skill
15
+ ## Purpose
20
16
 
21
- - Designing new REST or GraphQL APIs
22
- - Creating OpenAPI 3.1 specifications
23
- - Modeling resources and relationships
24
- - Implementing API versioning strategies
25
- - Designing pagination and filtering
26
- - Standardizing error responses
27
- - Planning authentication flows
28
- - Documenting API contracts
17
+ Use when defining or reviewing external API contracts, OpenAPI specifications, resource models, pagination, versioning, and error response standards. Do not use for pure database design or framework-only handler wiring.
29
18
 
30
- ## Core Workflow
19
+ ## When to Use
31
20
 
32
- 1. **Analyze domain** - Understand business requirements, data models, client needs
33
- 2. **Model resources** - Identify resources, relationships, operations
34
- 3. **Design endpoints** - Define URI patterns, HTTP methods, request/response schemas
35
- 4. **Specify contract** - Create OpenAPI 3.1 spec with complete documentation
36
- 5. **Plan evolution** - Design versioning, deprecation, backward compatibility
21
+ - Defining external REST or GraphQL contracts.
22
+ - Writing or reviewing OpenAPI schemas and endpoint shapes.
23
+ - Choosing pagination, filtering, idempotency, and versioning rules.
24
+ - Standardizing error envelopes and auth-facing API behavior.
37
25
 
38
- ## Available Steering Files
26
+ ## Instructions
39
27
 
40
- Load detailed guidance on-demand:
28
+ 1. Clarify consumers, auth model, and backward-compatibility constraints.
29
+ 2. Model resources, operations, and failure cases before implementation.
30
+ 3. Choose transport shape, versioning policy, and pagination pattern deliberately.
31
+ 4. Define request, response, and error envelopes with explicit examples.
32
+ 5. Hand off a contract that implementation skills can build against without guessing.
41
33
 
42
- | Topic | Reference | Load When |
43
- | -------------- | ---------------------------- | ------------------------------------------- |
44
- | REST Patterns | `references/rest-patterns.md` | Resource design, HTTP methods, HATEOAS |
45
- | Versioning | `references/versioning.md` | API versions, deprecation, breaking changes |
46
- | Pagination | `references/pagination.md` | Cursor, offset, keyset pagination |
47
- | Error Handling | `references/error-handling.md` | Error responses, RFC 7807, status codes |
48
- | OpenAPI | `references/openapi.md` | OpenAPI 3.1, documentation, code generation |
34
+ ### Baseline standards
49
35
 
50
- ## Constraints
36
+ - Prefer stable resource-oriented contracts over framework-driven shapes.
37
+ - Keep request validation explicit at the boundary.
38
+ - Document error semantics and retry expectations.
39
+ - Use pagination on collections by default.
40
+ - Make deprecation and compatibility policy explicit.
51
41
 
52
- ### MUST DO
42
+ ### Constraints
53
43
 
54
- - Follow REST principles (resource-oriented, proper HTTP methods)
55
- - Use consistent naming conventions (snake_case or camelCase)
56
- - Include comprehensive OpenAPI 3.1 specification
57
- - Design proper error responses with actionable messages
58
- - Implement pagination for collection endpoints
59
- - Version APIs with clear deprecation policies
60
- - Document authentication and authorization
61
- - Provide request/response examples
44
+ - Avoid verb-based URI design.
45
+ - Avoid inconsistent response envelopes across endpoints.
46
+ - Avoid silent breaking changes.
47
+ - Avoid mixing database structure directly into the external contract.
62
48
 
63
- ### MUST NOT DO
49
+ ## Output Format
64
50
 
65
- - Use verbs in resource URIs (use `/users/{id}`, not `/getUser/{id}`)
66
- - Return inconsistent response structures
67
- - Skip error code documentation
68
- - Ignore HTTP status code semantics
69
- - Design APIs without versioning strategy
70
- - Expose implementation details in API
71
- - Create breaking changes without migration path
72
- - Omit rate limiting considerations
51
+ Provide implementation guidance, code examples, and configuration as appropriate to the task.
73
52
 
74
- ## Output Templates
53
+ ## References
75
54
 
76
- When designing APIs, provide:
55
+ Load on demand. Do not preload all reference files.
77
56
 
78
- 1. Resource model and relationships
79
- 2. Endpoint specifications with URIs and methods
80
- 3. OpenAPI 3.1 specification (YAML or JSON)
81
- 4. Authentication and authorization flows
82
- 5. Error response catalog
83
- 6. Pagination and filtering patterns
84
- 7. Versioning and deprecation strategy
57
+ | File | Load when |
58
+ | --- | --- |
59
+ | `references/contract-checklist.md` | You need a sharper checklist for resource modeling, versioning, pagination, idempotency, and error semantics. |
85
60
 
86
- ## Knowledge Reference
61
+ ## Scripts
87
62
 
88
- REST architecture, OpenAPI 3.1, GraphQL, HTTP semantics, JSON:API, HATEOAS, OAuth 2.0, JWT, RFC 7807 Problem Details, API versioning patterns, pagination strategies, rate limiting, webhook design, SDK generation
63
+ No helper scripts are required for this skill right now. Keep execution in `SKILL.md` and `references/` unless repeated automation becomes necessary.
89
64
 
90
- ## Related Powers
65
+ ## Examples
91
66
 
92
- - **GraphQL Architect** - GraphQL-specific API design
93
- - **FastAPI Expert** - Python API implementation
94
- - **NestJS Expert** - TypeScript API implementation
95
- - **Spring Boot Engineer** - Java API implementation
96
- - **Security Reviewer** - API security assessment
67
+ - "Help me with api designer best practices in this project"
68
+ - "Review my api designer implementation for issues"
97
69
  ````
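Note (not part of the package diff): the baseline standards above ask for pagination on collections and a consistent error envelope. The sketch below shows one possible shape, assuming an opaque base64 cursor that wraps an offset and an envelope with `code`, `message`, and `retryable` fields. None of these field names or helpers come from the package; they are illustrative choices only.

```python
import base64
import json
from typing import List, Optional, Tuple


def encode_cursor(offset: int) -> str:
    """Wrap the offset in an opaque, URL-safe token."""
    return base64.urlsafe_b64encode(json.dumps({"o": offset}).encode()).decode()


def decode_cursor(cursor: str) -> int:
    """Recover the offset; raises ValueError, KeyError, or TypeError on bad input."""
    offset = json.loads(base64.urlsafe_b64decode(cursor.encode()))["o"]
    if not isinstance(offset, int) or offset < 0:
        raise ValueError("offset must be a non-negative integer")
    return offset


def error_envelope(code: str, message: str, retryable: bool) -> dict:
    """Single error shape shared by every endpoint (hypothetical fields)."""
    return {"error": {"code": code, "message": message, "retryable": retryable}}


def list_page(items: List[str], cursor: Optional[str] = None,
              limit: int = 2) -> Tuple[int, dict]:
    """Return (status, body) for one page of a collection."""
    if limit < 1 or limit > 100:
        return 400, error_envelope("invalid_limit",
                                   "limit must be between 1 and 100", False)
    try:
        start = decode_cursor(cursor) if cursor else 0
    except (ValueError, KeyError, TypeError):
        return 400, error_envelope("invalid_cursor",
                                   "cursor is malformed", False)
    next_start = start + limit
    next_cursor = encode_cursor(next_start) if next_start < len(items) else None
    return 200, {"data": items[start:next_start], "next_cursor": next_cursor}
```

Keeping the cursor opaque lets the server change its pagination strategy later without altering the external contract, which is the kind of compatibility concern the skill asks designers to make explicit.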
@@ -1,94 +1,66 @@
1
1
  ---
2
- name: "api-designer"
3
- description: "Use when designing REST or GraphQL APIs, creating OpenAPI specifications, or planning API architecture. Invoke for resource modeling, versioning strategies, pagination patterns, error handling standards."
2
+ name: api-designer
3
+ description: "Use when defining or reviewing external API contracts, OpenAPI specifications, resource models, pagination, versioning, and error response standards. Do not use for pure database design or framework-only handler wiring."
4
+ license: MIT
5
+ metadata:
6
+ author: cubis-foundry
7
+ version: "3.0"
8
+ compatibility: Claude Code, Codex, GitHub Copilot
4
9
  ---
5
10
 
6
-
7
11
  # API Designer
8
12
 
9
- ## Overview
10
-
11
- Senior API architect with expertise in designing scalable, developer-friendly REST and GraphQL APIs with comprehensive OpenAPI specifications.
12
-
13
- ## Role Definition
14
-
15
- You are a senior API designer with 10+ years of experience creating intuitive, scalable API architectures. You specialize in REST design patterns, OpenAPI 3.1 specifications, GraphQL schemas, and creating APIs that developers love to use while ensuring performance, security, and maintainability.
16
-
17
- ## When to Use This Skill
13
+ ## Purpose
18
14
 
19
- - Designing new REST or GraphQL APIs
20
- - Creating OpenAPI 3.1 specifications
21
- - Modeling resources and relationships
22
- - Implementing API versioning strategies
23
- - Designing pagination and filtering
24
- - Standardizing error responses
25
- - Planning authentication flows
26
- - Documenting API contracts
15
+ Use when defining or reviewing external API contracts, OpenAPI specifications, resource models, pagination, versioning, and error response standards. Do not use for pure database design or framework-only handler wiring.
27
16
 
28
- ## Core Workflow
17
+ ## When to Use
29
18
 
30
- 1. **Analyze domain** - Understand business requirements, data models, client needs
31
- 2. **Model resources** - Identify resources, relationships, operations
32
- 3. **Design endpoints** - Define URI patterns, HTTP methods, request/response schemas
33
- 4. **Specify contract** - Create OpenAPI 3.1 spec with complete documentation
34
- 5. **Plan evolution** - Design versioning, deprecation, backward compatibility
19
+ - Defining external REST or GraphQL contracts.
20
+ - Writing or reviewing OpenAPI schemas and endpoint shapes.
21
+ - Choosing pagination, filtering, idempotency, and versioning rules.
22
+ - Standardizing error envelopes and auth-facing API behavior.
35
23
 
36
- ## Available Steering Files
24
+ ## Instructions
37
25
 
38
- Load detailed guidance on-demand:
26
+ 1. Clarify consumers, auth model, and backward-compatibility constraints.
27
+ 2. Model resources, operations, and failure cases before implementation.
28
+ 3. Choose transport shape, versioning policy, and pagination pattern deliberately.
29
+ 4. Define request, response, and error envelopes with explicit examples.
30
+ 5. Hand off a contract that implementation skills can build against without guessing.
39
31
 
40
- | Topic | Reference | Load When |
41
- | -------------- | ---------------------------- | ------------------------------------------- |
42
- | REST Patterns | `references/rest-patterns.md` | Resource design, HTTP methods, HATEOAS |
43
- | Versioning | `references/versioning.md` | API versions, deprecation, breaking changes |
44
- | Pagination | `references/pagination.md` | Cursor, offset, keyset pagination |
45
- | Error Handling | `references/error-handling.md` | Error responses, RFC 7807, status codes |
46
- | OpenAPI | `references/openapi.md` | OpenAPI 3.1, documentation, code generation |
32
+ ### Baseline standards
47
33
 
48
- ## Constraints
34
+ - Prefer stable resource-oriented contracts over framework-driven shapes.
35
+ - Keep request validation explicit at the boundary.
36
+ - Document error semantics and retry expectations.
37
+ - Use pagination on collections by default.
38
+ - Make deprecation and compatibility policy explicit.
49
39
 
50
- ### MUST DO
40
+ ### Constraints
51
41
 
52
- - Follow REST principles (resource-oriented, proper HTTP methods)
53
- - Use consistent naming conventions (snake_case or camelCase)
54
- - Include comprehensive OpenAPI 3.1 specification
55
- - Design proper error responses with actionable messages
56
- - Implement pagination for collection endpoints
57
- - Version APIs with clear deprecation policies
58
- - Document authentication and authorization
59
- - Provide request/response examples
42
+ - Avoid verb-based URI design.
43
+ - Avoid inconsistent response envelopes across endpoints.
44
+ - Avoid silent breaking changes.
45
+ - Avoid mixing database structure directly into the external contract.
60
46
 
61
- ### MUST NOT DO
47
+ ## Output Format
62
48
 
63
- - Use verbs in resource URIs (use `/users/{id}`, not `/getUser/{id}`)
64
- - Return inconsistent response structures
65
- - Skip error code documentation
66
- - Ignore HTTP status code semantics
67
- - Design APIs without versioning strategy
68
- - Expose implementation details in API
69
- - Create breaking changes without migration path
70
- - Omit rate limiting considerations
49
+ Provide implementation guidance, code examples, and configuration as appropriate to the task.
71
50
 
72
- ## Output Templates
51
+ ## References
73
52
 
74
- When designing APIs, provide:
53
+ Load on demand. Do not preload all reference files.
75
54
 
76
- 1. Resource model and relationships
77
- 2. Endpoint specifications with URIs and methods
78
- 3. OpenAPI 3.1 specification (YAML or JSON)
79
- 4. Authentication and authorization flows
80
- 5. Error response catalog
81
- 6. Pagination and filtering patterns
82
- 7. Versioning and deprecation strategy
55
+ | File | Load when |
56
+ | --- | --- |
57
+ | `references/contract-checklist.md` | You need a sharper checklist for resource modeling, versioning, pagination, idempotency, and error semantics. |
83
58
 
84
- ## Knowledge Reference
59
+ ## Scripts
85
60
 
86
- REST architecture, OpenAPI 3.1, GraphQL, HTTP semantics, JSON:API, HATEOAS, OAuth 2.0, JWT, RFC 7807 Problem Details, API versioning patterns, pagination strategies, rate limiting, webhook design, SDK generation
61
+ No helper scripts are required for this skill right now. Keep execution in `SKILL.md` and `references/` unless repeated automation becomes necessary.
87
62
 
88
- ## Related Powers
63
+ ## Examples
89
64
 
90
- - **GraphQL Architect** - GraphQL-specific API design
91
- - **FastAPI Expert** - Python API implementation
92
- - **NestJS Expert** - TypeScript API implementation
93
- - **Spring Boot Engineer** - Java API implementation
94
- - **Security Reviewer** - API security assessment
65
+ - "Help me with api designer best practices in this project"
66
+ - "Review my api designer implementation for issues"
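Note (not part of the package diff): the constraint "Avoid silent breaking changes" can be made checkable. The sketch below compares two versions of a flat response schema, represented as a mapping from field name to type name, and reports removals and type changes. It assumes that adding a new field is non-breaking, which is a policy choice rather than something the package states, and `breaking_changes` is a hypothetical helper.

```python
from typing import Dict, List


def breaking_changes(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """List removed fields and changed types between two flat schemas."""
    problems = []
    for field, type_name in sorted(old.items()):
        if field not in new:
            problems.append(f"removed field: {field}")
        elif new[field] != type_name:
            problems.append(
                f"type changed: {field} ({type_name} -> {new[field]})"
            )
    # Fields present only in `new` are treated as additive (assumed non-breaking).
    return problems
```

A check like this could run against the previous published contract before release, so that any reported change forces an explicit versioning or deprecation decision.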