@rubix0270/arboris 1.0.2 → 1.0.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (451)
  1. package/package.json +8 -20
  2. package/run.mjs +10 -0
  3. package/dist/cli.mjs +0 -383
  4. package/manifest.json +0 -323
  5. package/prisma/skills/accessibility/SKILL.md +0 -147
  6. package/prisma/skills/agent-architecture-audit/SKILL.md +0 -257
  7. package/prisma/skills/agent-eval/SKILL.md +0 -146
  8. package/prisma/skills/agent-harness-construction/SKILL.md +0 -74
  9. package/prisma/skills/agent-introspection-debugging/SKILL.md +0 -154
  10. package/prisma/skills/agent-payment-x402/SKILL.md +0 -225
  11. package/prisma/skills/agent-self-evaluation/SKILL.md +0 -182
  12. package/prisma/skills/agent-self-evaluation/examples/high-score-example.md +0 -87
  13. package/prisma/skills/agent-self-evaluation/examples/low-score-example.md +0 -86
  14. package/prisma/skills/agent-self-evaluation/references/evaluation-criteria.md +0 -71
  15. package/prisma/skills/agent-self-evaluation/references/hook-integration.md +0 -64
  16. package/prisma/skills/agent-self-evaluation/scripts/evaluate.py +0 -408
  17. package/prisma/skills/agent-self-evaluation/templates/evaluation-report.md +0 -86
  18. package/prisma/skills/agent-sort/SKILL.md +0 -216
  19. package/prisma/skills/agentic-engineering/SKILL.md +0 -64
  20. package/prisma/skills/agentic-os/SKILL.md +0 -388
  21. package/prisma/skills/ai-first-engineering/SKILL.md +0 -52
  22. package/prisma/skills/ai-regression-testing/SKILL.md +0 -386
  23. package/prisma/skills/android-clean-architecture/SKILL.md +0 -340
  24. package/prisma/skills/angular-developer/SKILL.md +0 -155
  25. package/prisma/skills/angular-developer/references/angular-animations.md +0 -160
  26. package/prisma/skills/angular-developer/references/angular-aria.md +0 -410
  27. package/prisma/skills/angular-developer/references/cli.md +0 -86
  28. package/prisma/skills/angular-developer/references/component-harnesses.md +0 -59
  29. package/prisma/skills/angular-developer/references/component-styling.md +0 -91
  30. package/prisma/skills/angular-developer/references/components.md +0 -117
  31. package/prisma/skills/angular-developer/references/creating-services.md +0 -97
  32. package/prisma/skills/angular-developer/references/data-resolvers.md +0 -69
  33. package/prisma/skills/angular-developer/references/define-routes.md +0 -67
  34. package/prisma/skills/angular-developer/references/defining-providers.md +0 -72
  35. package/prisma/skills/angular-developer/references/di-fundamentals.md +0 -120
  36. package/prisma/skills/angular-developer/references/e2e-testing.md +0 -56
  37. package/prisma/skills/angular-developer/references/effects.md +0 -83
  38. package/prisma/skills/angular-developer/references/hierarchical-injectors.md +0 -43
  39. package/prisma/skills/angular-developer/references/host-elements.md +0 -80
  40. package/prisma/skills/angular-developer/references/injection-context.md +0 -63
  41. package/prisma/skills/angular-developer/references/inputs.md +0 -101
  42. package/prisma/skills/angular-developer/references/linked-signal.md +0 -59
  43. package/prisma/skills/angular-developer/references/loading-strategies.md +0 -61
  44. package/prisma/skills/angular-developer/references/mcp.md +0 -108
  45. package/prisma/skills/angular-developer/references/navigate-to-routes.md +0 -69
  46. package/prisma/skills/angular-developer/references/outputs.md +0 -86
  47. package/prisma/skills/angular-developer/references/reactive-forms.md +0 -122
  48. package/prisma/skills/angular-developer/references/rendering-strategies.md +0 -44
  49. package/prisma/skills/angular-developer/references/resource.md +0 -77
  50. package/prisma/skills/angular-developer/references/route-animations.md +0 -56
  51. package/prisma/skills/angular-developer/references/route-guards.md +0 -52
  52. package/prisma/skills/angular-developer/references/router-lifecycle.md +0 -45
  53. package/prisma/skills/angular-developer/references/router-testing.md +0 -87
  54. package/prisma/skills/angular-developer/references/show-routes-with-outlets.md +0 -68
  55. package/prisma/skills/angular-developer/references/signal-forms.md +0 -795
  56. package/prisma/skills/angular-developer/references/signals-overview.md +0 -94
  57. package/prisma/skills/angular-developer/references/tailwind-css.md +0 -69
  58. package/prisma/skills/angular-developer/references/template-driven-forms.md +0 -114
  59. package/prisma/skills/angular-developer/references/testing-fundamentals.md +0 -65
  60. package/prisma/skills/api-connector-builder/SKILL.md +0 -121
  61. package/prisma/skills/api-design/SKILL.md +0 -524
  62. package/prisma/skills/architecture-decision-records/SKILL.md +0 -180
  63. package/prisma/skills/article-writing/SKILL.md +0 -80
  64. package/prisma/skills/automation-audit-ops/SKILL.md +0 -143
  65. package/prisma/skills/autonomous-agent-harness/SKILL.md +0 -274
  66. package/prisma/skills/autonomous-loops/SKILL.md +0 -611
  67. package/prisma/skills/backend-patterns/SKILL.md +0 -562
  68. package/prisma/skills/benchmark/SKILL.md +0 -94
  69. package/prisma/skills/benchmark-methodology/SKILL.md +0 -190
  70. package/prisma/skills/benchmark-optimization-loop/SKILL.md +0 -70
  71. package/prisma/skills/blender-motion-state-inspection/SKILL.md +0 -165
  72. package/prisma/skills/blueprint/SKILL.md +0 -106
  73. package/prisma/skills/brand-discovery/SKILL.md +0 -145
  74. package/prisma/skills/brand-discovery/references/10_purpose-why.md +0 -40
  75. package/prisma/skills/brand-discovery/references/20_positioning.md +0 -44
  76. package/prisma/skills/brand-discovery/references/30_audience-niche.md +0 -52
  77. package/prisma/skills/brand-discovery/references/40_personality-archetype.md +0 -57
  78. package/prisma/skills/brand-discovery/references/50_voice-tone.md +0 -59
  79. package/prisma/skills/brand-discovery/references/60_narrative-story.md +0 -50
  80. package/prisma/skills/brand-discovery/references/70_founder-tension.md +0 -49
  81. package/prisma/skills/brand-discovery/references/90_SYNTHESIS.md +0 -133
  82. package/prisma/skills/brand-voice/SKILL.md +0 -98
  83. package/prisma/skills/brand-voice/references/voice-profile-schema.md +0 -55
  84. package/prisma/skills/browser-qa/SKILL.md +0 -105
  85. package/prisma/skills/bun-runtime/SKILL.md +0 -85
  86. package/prisma/skills/canary-watch/SKILL.md +0 -108
  87. package/prisma/skills/carrier-relationship-management/SKILL.md +0 -212
  88. package/prisma/skills/cisco-ios-patterns/SKILL.md +0 -164
  89. package/prisma/skills/ck/SKILL.md +0 -148
  90. package/prisma/skills/ck/commands/forget.mjs +0 -44
  91. package/prisma/skills/ck/commands/info.mjs +0 -24
  92. package/prisma/skills/ck/commands/init.mjs +0 -143
  93. package/prisma/skills/ck/commands/list.mjs +0 -40
  94. package/prisma/skills/ck/commands/migrate.mjs +0 -202
  95. package/prisma/skills/ck/commands/resume.mjs +0 -36
  96. package/prisma/skills/ck/commands/save.mjs +0 -210
  97. package/prisma/skills/ck/commands/shared.mjs +0 -387
  98. package/prisma/skills/ck/hooks/session-start.mjs +0 -224
  99. package/prisma/skills/claude-devfleet/SKILL.md +0 -112
  100. package/prisma/skills/click-path-audit/SKILL.md +0 -245
  101. package/prisma/skills/clickhouse-io/SKILL.md +0 -440
  102. package/prisma/skills/code-tour/SKILL.md +0 -254
  103. package/prisma/skills/codebase-onboarding/SKILL.md +0 -234
  104. package/prisma/skills/codehealth-mcp/SKILL.md +0 -167
  105. package/prisma/skills/coding-standards/SKILL.md +0 -551
  106. package/prisma/skills/competitive-platform-analysis/SKILL.md +0 -214
  107. package/prisma/skills/competitive-report-structure/SKILL.md +0 -162
  108. package/prisma/skills/compose-multiplatform-patterns/SKILL.md +0 -300
  109. package/prisma/skills/config-gc/SKILL.md +0 -120
  110. package/prisma/skills/configure-ecc/SKILL.md +0 -385
  111. package/prisma/skills/connections-optimizer/SKILL.md +0 -190
  112. package/prisma/skills/content-engine/SKILL.md +0 -132
  113. package/prisma/skills/content-hash-cache-pattern/SKILL.md +0 -162
  114. package/prisma/skills/context-budget/SKILL.md +0 -136
  115. package/prisma/skills/continuous-agent-loop/SKILL.md +0 -46
  116. package/prisma/skills/continuous-learning/SKILL.md +0 -132
  117. package/prisma/skills/continuous-learning/config.json +0 -18
  118. package/prisma/skills/continuous-learning/evaluate-session.sh +0 -69
  119. package/prisma/skills/continuous-learning-v2/SKILL.md +0 -361
  120. package/prisma/skills/continuous-learning-v2/agents/observer-loop.sh +0 -359
  121. package/prisma/skills/continuous-learning-v2/agents/observer.md +0 -189
  122. package/prisma/skills/continuous-learning-v2/agents/session-guardian.sh +0 -150
  123. package/prisma/skills/continuous-learning-v2/agents/start-observer.sh +0 -248
  124. package/prisma/skills/continuous-learning-v2/config.json +0 -8
  125. package/prisma/skills/continuous-learning-v2/hooks/observe.sh +0 -585
  126. package/prisma/skills/continuous-learning-v2/scripts/detect-project.sh +0 -322
  127. package/prisma/skills/continuous-learning-v2/scripts/instinct-cli.py +0 -1956
  128. package/prisma/skills/continuous-learning-v2/scripts/lib/homunculus-dir.sh +0 -31
  129. package/prisma/skills/continuous-learning-v2/scripts/migrate-homunculus.sh +0 -68
  130. package/prisma/skills/continuous-learning-v2/scripts/test_parse_instinct.py +0 -1421
  131. package/prisma/skills/cost-aware-llm-pipeline/SKILL.md +0 -184
  132. package/prisma/skills/cost-tracking/SKILL.md +0 -97
  133. package/prisma/skills/council/SKILL.md +0 -204
  134. package/prisma/skills/cpp-coding-standards/SKILL.md +0 -724
  135. package/prisma/skills/cpp-testing/SKILL.md +0 -325
  136. package/prisma/skills/crosspost/SKILL.md +0 -112
  137. package/prisma/skills/csharp-testing/SKILL.md +0 -322
  138. package/prisma/skills/customer-billing-ops/SKILL.md +0 -141
  139. package/prisma/skills/customs-trade-compliance/SKILL.md +0 -263
  140. package/prisma/skills/dart-flutter-patterns/SKILL.md +0 -564
  141. package/prisma/skills/dashboard-builder/SKILL.md +0 -109
  142. package/prisma/skills/data-scraper-agent/SKILL.md +0 -765
  143. package/prisma/skills/data-throughput-accelerator/SKILL.md +0 -73
  144. package/prisma/skills/database-migrations/SKILL.md +0 -430
  145. package/prisma/skills/deep-research/SKILL.md +0 -160
  146. package/prisma/skills/defi-amm-security/SKILL.md +0 -167
  147. package/prisma/skills/delivery-gate/SKILL.md +0 -126
  148. package/prisma/skills/delivery-gate/hooks/quality-gate.py +0 -220
  149. package/prisma/skills/deployment-patterns/SKILL.md +0 -428
  150. package/prisma/skills/design-system/SKILL.md +0 -83
  151. package/prisma/skills/django-celery/SKILL.md +0 -458
  152. package/prisma/skills/django-patterns/SKILL.md +0 -735
  153. package/prisma/skills/django-security/SKILL.md +0 -644
  154. package/prisma/skills/django-tdd/SKILL.md +0 -730
  155. package/prisma/skills/django-verification/SKILL.md +0 -470
  156. package/prisma/skills/dmux-workflows/SKILL.md +0 -192
  157. package/prisma/skills/docker-patterns/SKILL.md +0 -365
  158. package/prisma/skills/documentation-lookup/SKILL.md +0 -91
  159. package/prisma/skills/dotnet-patterns/SKILL.md +0 -322
  160. package/prisma/skills/dynamic-workflow-mode/SKILL.md +0 -124
  161. package/prisma/skills/e2e-testing/SKILL.md +0 -327
  162. package/prisma/skills/ecc-guide/SKILL.md +0 -190
  163. package/prisma/skills/ecc-recipes/SKILL.md +0 -149
  164. package/prisma/skills/ecc-tools-cost-audit/SKILL.md +0 -161
  165. package/prisma/skills/email-ops/SKILL.md +0 -122
  166. package/prisma/skills/energy-procurement/SKILL.md +0 -228
  167. package/prisma/skills/enterprise-agent-ops/SKILL.md +0 -51
  168. package/prisma/skills/error-handling/SKILL.md +0 -377
  169. package/prisma/skills/eval-harness/SKILL.md +0 -271
  170. package/prisma/skills/evm-token-decimals/SKILL.md +0 -131
  171. package/prisma/skills/exa-search/SKILL.md +0 -108
  172. package/prisma/skills/fal-ai-media/SKILL.md +0 -289
  173. package/prisma/skills/fastapi-patterns/SKILL.md +0 -514
  174. package/prisma/skills/finance-billing-ops/SKILL.md +0 -128
  175. package/prisma/skills/flox-environments/SKILL.md +0 -497
  176. package/prisma/skills/flutter-dart-code-review/SKILL.md +0 -436
  177. package/prisma/skills/foundation-models-on-device/SKILL.md +0 -243
  178. package/prisma/skills/frontend-a11y/SKILL.md +0 -446
  179. package/prisma/skills/frontend-design-direction/SKILL.md +0 -93
  180. package/prisma/skills/frontend-patterns/SKILL.md +0 -657
  181. package/prisma/skills/frontend-slides/SKILL.md +0 -185
  182. package/prisma/skills/frontend-slides/STYLE_PRESETS.md +0 -330
  183. package/prisma/skills/frontend-slides/animation-patterns.md +0 -122
  184. package/prisma/skills/frontend-slides/html-template.md +0 -419
  185. package/prisma/skills/frontend-slides/scripts/export-pdf.sh +0 -418
  186. package/prisma/skills/frontend-slides/scripts/extract-pptx.py +0 -96
  187. package/prisma/skills/frontend-slides/viewport-base.css +0 -153
  188. package/prisma/skills/fsharp-testing/SKILL.md +0 -281
  189. package/prisma/skills/gan-style-harness/SKILL.md +0 -279
  190. package/prisma/skills/gateguard/SKILL.md +0 -133
  191. package/prisma/skills/generating-python-installer/SKILL.md +0 -820
  192. package/prisma/skills/git-workflow/SKILL.md +0 -716
  193. package/prisma/skills/github-ops/SKILL.md +0 -145
  194. package/prisma/skills/golang-patterns/SKILL.md +0 -675
  195. package/prisma/skills/golang-testing/SKILL.md +0 -721
  196. package/prisma/skills/google-workspace-ops/SKILL.md +0 -96
  197. package/prisma/skills/growth-log/SKILL.md +0 -128
  198. package/prisma/skills/healthcare-cdss-patterns/SKILL.md +0 -246
  199. package/prisma/skills/healthcare-emr-patterns/SKILL.md +0 -160
  200. package/prisma/skills/healthcare-eval-harness/SKILL.md +0 -208
  201. package/prisma/skills/healthcare-phi-compliance/SKILL.md +0 -146
  202. package/prisma/skills/hermes-imports/SKILL.md +0 -89
  203. package/prisma/skills/hexagonal-architecture/SKILL.md +0 -277
  204. package/prisma/skills/hipaa-compliance/SKILL.md +0 -79
  205. package/prisma/skills/homelab-network-readiness/SKILL.md +0 -170
  206. package/prisma/skills/homelab-network-setup/SKILL.md +0 -130
  207. package/prisma/skills/homelab-pihole-dns/SKILL.md +0 -275
  208. package/prisma/skills/homelab-vlan-segmentation/SKILL.md +0 -312
  209. package/prisma/skills/homelab-wireguard-vpn/SKILL.md +0 -306
  210. package/prisma/skills/hookify-rules/SKILL.md +0 -128
  211. package/prisma/skills/inherit-legacy-style/SKILL.md +0 -157
  212. package/prisma/skills/intent-driven-development/SKILL.md +0 -360
  213. package/prisma/skills/inventory-demand-planning/SKILL.md +0 -247
  214. package/prisma/skills/investor-materials/SKILL.md +0 -97
  215. package/prisma/skills/investor-outreach/SKILL.md +0 -92
  216. package/prisma/skills/ios-icon-gen/SKILL.md +0 -158
  217. package/prisma/skills/ios-icon-gen/scripts/generate_icons.swift +0 -258
  218. package/prisma/skills/ios-icon-gen/scripts/iconify_gen.sh +0 -235
  219. package/prisma/skills/iterative-retrieval/SKILL.md +0 -212
  220. package/prisma/skills/ito-basket-compare/SKILL.md +0 -64
  221. package/prisma/skills/ito-data-atlas-agent/SKILL.md +0 -64
  222. package/prisma/skills/ito-market-intelligence/SKILL.md +0 -61
  223. package/prisma/skills/ito-trade-planner/SKILL.md +0 -68
  224. package/prisma/skills/java-coding-standards/SKILL.md +0 -384
  225. package/prisma/skills/jira-integration/SKILL.md +0 -303
  226. package/prisma/skills/jpa-patterns/SKILL.md +0 -152
  227. package/prisma/skills/knowledge-ops/SKILL.md +0 -155
  228. package/prisma/skills/kotlin-coroutines-flows/SKILL.md +0 -285
  229. package/prisma/skills/kotlin-exposed-patterns/SKILL.md +0 -720
  230. package/prisma/skills/kotlin-ktor-patterns/SKILL.md +0 -690
  231. package/prisma/skills/kotlin-patterns/SKILL.md +0 -712
  232. package/prisma/skills/kotlin-testing/SKILL.md +0 -825
  233. package/prisma/skills/kubernetes-patterns/SKILL.md +0 -756
  234. package/prisma/skills/laravel-patterns/SKILL.md +0 -416
  235. package/prisma/skills/laravel-plugin-discovery/SKILL.md +0 -230
  236. package/prisma/skills/laravel-security/SKILL.md +0 -948
  237. package/prisma/skills/laravel-tdd/SKILL.md +0 -675
  238. package/prisma/skills/laravel-verification/SKILL.md +0 -180
  239. package/prisma/skills/latency-critical-systems/SKILL.md +0 -74
  240. package/prisma/skills/lead-intelligence/SKILL.md +0 -322
  241. package/prisma/skills/lead-intelligence/agents/enrichment-agent.md +0 -85
  242. package/prisma/skills/lead-intelligence/agents/mutual-mapper.md +0 -75
  243. package/prisma/skills/lead-intelligence/agents/outreach-drafter.md +0 -98
  244. package/prisma/skills/lead-intelligence/agents/signal-scorer.md +0 -60
  245. package/prisma/skills/liquid-glass-design/SKILL.md +0 -279
  246. package/prisma/skills/llm-trading-agent-security/SKILL.md +0 -147
  247. package/prisma/skills/logistics-exception-management/SKILL.md +0 -222
  248. package/prisma/skills/loop-design-check/SKILL.md +0 -143
  249. package/prisma/skills/mailtrap-email-integration/SKILL.md +0 -77
  250. package/prisma/skills/make-interfaces-feel-better/SKILL.md +0 -152
  251. package/prisma/skills/manim-video/SKILL.md +0 -90
  252. package/prisma/skills/manim-video/assets/network_graph_scene.py +0 -52
  253. package/prisma/skills/market-research/SKILL.md +0 -76
  254. package/prisma/skills/marketing-campaign/SKILL.md +0 -114
  255. package/prisma/skills/mcp-server-patterns/SKILL.md +0 -70
  256. package/prisma/skills/messages-ops/SKILL.md +0 -105
  257. package/prisma/skills/ml-adoption-playbook/SKILL.md +0 -57
  258. package/prisma/skills/mle-workflow/SKILL.md +0 -347
  259. package/prisma/skills/motion-advanced/SKILL.md +0 -596
  260. package/prisma/skills/motion-foundations/SKILL.md +0 -299
  261. package/prisma/skills/motion-patterns/SKILL.md +0 -434
  262. package/prisma/skills/motion-ui/SKILL.md +0 -576
  263. package/prisma/skills/mysql-patterns/SKILL.md +0 -413
  264. package/prisma/skills/nanoclaw-repl/SKILL.md +0 -34
  265. package/prisma/skills/nestjs-patterns/SKILL.md +0 -231
  266. package/prisma/skills/netmiko-ssh-automation/SKILL.md +0 -174
  267. package/prisma/skills/network-bgp-diagnostics/SKILL.md +0 -168
  268. package/prisma/skills/network-config-validation/SKILL.md +0 -211
  269. package/prisma/skills/network-interface-health/SKILL.md +0 -153
  270. package/prisma/skills/nextjs-turbopack/SKILL.md +0 -58
  271. package/prisma/skills/nodejs-keccak256/SKILL.md +0 -103
  272. package/prisma/skills/nutrient-document-processing/SKILL.md +0 -168
  273. package/prisma/skills/nuxt4-patterns/SKILL.md +0 -101
  274. package/prisma/skills/openclaw-persona-forge/SKILL.md +0 -289
  275. package/prisma/skills/openclaw-persona-forge/gacha.py +0 -224
  276. package/prisma/skills/openclaw-persona-forge/gacha.sh +0 -5
  277. package/prisma/skills/openclaw-persona-forge/references/avatar-style.md +0 -124
  278. package/prisma/skills/openclaw-persona-forge/references/boundary-rules.md +0 -53
  279. package/prisma/skills/openclaw-persona-forge/references/error-handling.md +0 -53
  280. package/prisma/skills/openclaw-persona-forge/references/identity-tension.md +0 -48
  281. package/prisma/skills/openclaw-persona-forge/references/naming-system.md +0 -39
  282. package/prisma/skills/openclaw-persona-forge/references/output-template.md +0 -166
  283. package/prisma/skills/opensource-pipeline/SKILL.md +0 -256
  284. package/prisma/skills/orch-add-feature/SKILL.md +0 -45
  285. package/prisma/skills/orch-build-mvp/SKILL.md +0 -49
  286. package/prisma/skills/orch-change-feature/SKILL.md +0 -43
  287. package/prisma/skills/orch-fix-defect/SKILL.md +0 -43
  288. package/prisma/skills/orch-pipeline/SKILL.md +0 -121
  289. package/prisma/skills/orch-refine-code/SKILL.md +0 -44
  290. package/prisma/skills/parallel-execution-optimizer/SKILL.md +0 -73
  291. package/prisma/skills/perl-patterns/SKILL.md +0 -505
  292. package/prisma/skills/perl-security/SKILL.md +0 -504
  293. package/prisma/skills/perl-testing/SKILL.md +0 -476
  294. package/prisma/skills/plan-orchestrate/SKILL.md +0 -263
  295. package/prisma/skills/plankton-code-quality/SKILL.md +0 -237
  296. package/prisma/skills/postgres-patterns/SKILL.md +0 -148
  297. package/prisma/skills/prediction-market-oracle-research/SKILL.md +0 -64
  298. package/prisma/skills/prediction-market-risk-review/SKILL.md +0 -61
  299. package/prisma/skills/prisma-patterns/SKILL.md +0 -401
  300. package/prisma/skills/product-capability/SKILL.md +0 -142
  301. package/prisma/skills/product-lens/SKILL.md +0 -93
  302. package/prisma/skills/production-audit/SKILL.md +0 -207
  303. package/prisma/skills/production-scheduling/SKILL.md +0 -238
  304. package/prisma/skills/project-flow-ops/SKILL.md +0 -112
  305. package/prisma/skills/prompt-optimizer/SKILL.md +0 -398
  306. package/prisma/skills/python-patterns/SKILL.md +0 -751
  307. package/prisma/skills/python-testing/SKILL.md +0 -817
  308. package/prisma/skills/pytorch-patterns/SKILL.md +0 -397
  309. package/prisma/skills/quality-nonconformance/SKILL.md +0 -260
  310. package/prisma/skills/quarkus-patterns/SKILL.md +0 -723
  311. package/prisma/skills/quarkus-security/SKILL.md +0 -468
  312. package/prisma/skills/quarkus-tdd/SKILL.md +0 -812
  313. package/prisma/skills/quarkus-verification/SKILL.md +0 -480
  314. package/prisma/skills/ralphinho-rfc-pipeline/SKILL.md +0 -68
  315. package/prisma/skills/react-native-patterns/SKILL.md +0 -326
  316. package/prisma/skills/react-patterns/SKILL.md +0 -342
  317. package/prisma/skills/react-performance/SKILL.md +0 -575
  318. package/prisma/skills/react-testing/SKILL.md +0 -424
  319. package/prisma/skills/recsys-pipeline-architect/SKILL.md +0 -115
  320. package/prisma/skills/recursive-decision-ledger/SKILL.md +0 -80
  321. package/prisma/skills/redis-patterns/SKILL.md +0 -404
  322. package/prisma/skills/regex-vs-llm-structured-text/SKILL.md +0 -221
  323. package/prisma/skills/remotion-video-creation/SKILL.md +0 -43
  324. package/prisma/skills/remotion-video-creation/rules/3d.md +0 -86
  325. package/prisma/skills/remotion-video-creation/rules/animations.md +0 -29
  326. package/prisma/skills/remotion-video-creation/rules/assets/charts-bar-chart.tsx +0 -173
  327. package/prisma/skills/remotion-video-creation/rules/assets/text-animations-typewriter.tsx +0 -100
  328. package/prisma/skills/remotion-video-creation/rules/assets/text-animations-word-highlight.tsx +0 -108
  329. package/prisma/skills/remotion-video-creation/rules/assets.md +0 -78
  330. package/prisma/skills/remotion-video-creation/rules/audio.md +0 -172
  331. package/prisma/skills/remotion-video-creation/rules/calculate-metadata.md +0 -104
  332. package/prisma/skills/remotion-video-creation/rules/can-decode.md +0 -75
  333. package/prisma/skills/remotion-video-creation/rules/charts.md +0 -58
  334. package/prisma/skills/remotion-video-creation/rules/compositions.md +0 -146
  335. package/prisma/skills/remotion-video-creation/rules/display-captions.md +0 -126
  336. package/prisma/skills/remotion-video-creation/rules/extract-frames.md +0 -229
  337. package/prisma/skills/remotion-video-creation/rules/fonts.md +0 -152
  338. package/prisma/skills/remotion-video-creation/rules/get-audio-duration.md +0 -58
  339. package/prisma/skills/remotion-video-creation/rules/get-video-dimensions.md +0 -68
  340. package/prisma/skills/remotion-video-creation/rules/get-video-duration.md +0 -58
  341. package/prisma/skills/remotion-video-creation/rules/gifs.md +0 -138
  342. package/prisma/skills/remotion-video-creation/rules/images.md +0 -130
  343. package/prisma/skills/remotion-video-creation/rules/import-srt-captions.md +0 -67
  344. package/prisma/skills/remotion-video-creation/rules/lottie.md +0 -67
  345. package/prisma/skills/remotion-video-creation/rules/measuring-dom-nodes.md +0 -34
  346. package/prisma/skills/remotion-video-creation/rules/measuring-text.md +0 -143
  347. package/prisma/skills/remotion-video-creation/rules/sequencing.md +0 -106
  348. package/prisma/skills/remotion-video-creation/rules/tailwind.md +0 -11
  349. package/prisma/skills/remotion-video-creation/rules/text-animations.md +0 -20
  350. package/prisma/skills/remotion-video-creation/rules/timing.md +0 -179
  351. package/prisma/skills/remotion-video-creation/rules/transcribe-captions.md +0 -19
  352. package/prisma/skills/remotion-video-creation/rules/transitions.md +0 -122
  353. package/prisma/skills/remotion-video-creation/rules/trimming.md +0 -52
  354. package/prisma/skills/remotion-video-creation/rules/videos.md +0 -171
  355. package/prisma/skills/repo-scan/SKILL.md +0 -79
  356. package/prisma/skills/research-ops/SKILL.md +0 -113
  357. package/prisma/skills/returns-reverse-logistics/SKILL.md +0 -240
  358. package/prisma/skills/rules-distill/SKILL.md +0 -265
  359. package/prisma/skills/rules-distill/scripts/scan-rules.sh +0 -58
  360. package/prisma/skills/rules-distill/scripts/scan-skills.sh +0 -129
  361. package/prisma/skills/rust-patterns/SKILL.md +0 -500
  362. package/prisma/skills/rust-testing/SKILL.md +0 -501
  363. package/prisma/skills/safety-guard/SKILL.md +0 -76
  364. package/prisma/skills/santa-method/SKILL.md +0 -307
  365. package/prisma/skills/scientific-db-pubmed-database/SKILL.md +0 -176
  366. package/prisma/skills/scientific-db-uspto-database/SKILL.md +0 -178
  367. package/prisma/skills/scientific-pkg-gget/SKILL.md +0 -167
  368. package/prisma/skills/scientific-thinking-literature-review/SKILL.md +0 -193
  369. package/prisma/skills/scientific-thinking-scholar-evaluation/SKILL.md +0 -161
  370. package/prisma/skills/search-first/SKILL.md +0 -183
  371. package/prisma/skills/security-bounty-hunter/SKILL.md +0 -100
  372. package/prisma/skills/security-review/SKILL.md +0 -504
  373. package/prisma/skills/security-review/cloud-infrastructure-security.md +0 -361
  374. package/prisma/skills/security-scan/SKILL.md +0 -166
  375. package/prisma/skills/seo/SKILL.md +0 -155
  376. package/prisma/skills/skill-comply/SKILL.md +0 -59
  377. package/prisma/skills/skill-comply/fixtures/compliant_trace.jsonl +0 -5
  378. package/prisma/skills/skill-comply/fixtures/noncompliant_trace.jsonl +0 -3
  379. package/prisma/skills/skill-comply/fixtures/tdd_spec.yaml +0 -44
  380. package/prisma/skills/skill-comply/prompts/classifier.md +0 -24
  381. package/prisma/skills/skill-comply/prompts/scenario_generator.md +0 -62
  382. package/prisma/skills/skill-comply/prompts/spec_generator.md +0 -42
  383. package/prisma/skills/skill-comply/pyproject.toml +0 -15
  384. package/prisma/skills/skill-comply/scripts/__init__.py +0 -0
  385. package/prisma/skills/skill-comply/scripts/classifier.py +0 -85
  386. package/prisma/skills/skill-comply/scripts/grader.py +0 -124
  387. package/prisma/skills/skill-comply/scripts/parser.py +0 -107
  388. package/prisma/skills/skill-comply/scripts/report.py +0 -170
  389. package/prisma/skills/skill-comply/scripts/run.py +0 -127
  390. package/prisma/skills/skill-comply/scripts/runner.py +0 -194
  391. package/prisma/skills/skill-comply/scripts/scenario_generator.py +0 -70
  392. package/prisma/skills/skill-comply/scripts/spec_generator.py +0 -72
  393. package/prisma/skills/skill-comply/scripts/utils.py +0 -13
  394. package/prisma/skills/skill-comply/tests/test_grader.py +0 -197
  395. package/prisma/skills/skill-comply/tests/test_parser.py +0 -90
  396. package/prisma/skills/skill-comply/tests/test_runner.py +0 -172
  397. package/prisma/skills/skill-scout/SKILL.md +0 -141
  398. package/prisma/skills/skill-stocktake/SKILL.md +0 -195
  399. package/prisma/skills/skill-stocktake/scripts/quick-diff.sh +0 -87
  400. package/prisma/skills/skill-stocktake/scripts/save-results.sh +0 -56
  401. package/prisma/skills/skill-stocktake/scripts/scan.sh +0 -170
  402. package/prisma/skills/social-graph-ranker/SKILL.md +0 -155
  403. package/prisma/skills/social-publisher/SKILL.md +0 -130
  404. package/prisma/skills/springboot-patterns/SKILL.md +0 -315
  405. package/prisma/skills/springboot-security/SKILL.md +0 -273
  406. package/prisma/skills/springboot-tdd/SKILL.md +0 -159
  407. package/prisma/skills/springboot-verification/SKILL.md +0 -232
  408. package/prisma/skills/strategic-compact/SKILL.md +0 -136
  409. package/prisma/skills/swift-actor-persistence/SKILL.md +0 -144
  410. package/prisma/skills/swift-concurrency-6-2/SKILL.md +0 -216
  411. package/prisma/skills/swift-protocol-di-testing/SKILL.md +0 -191
  412. package/prisma/skills/swiftui-patterns/SKILL.md +0 -259
  413. package/prisma/skills/taste/SKILL.md +0 -264
  414. package/prisma/skills/taste/references/genre-taxonomy.md +0 -87
  415. package/prisma/skills/tdd-workflow/SKILL.md +0 -583
  416. package/prisma/skills/team-agent-orchestration/SKILL.md +0 -111
  417. package/prisma/skills/team-builder/SKILL.md +0 -169
  418. package/prisma/skills/terminal-ops/SKILL.md +0 -110
  419. package/prisma/skills/tinystruct-patterns/SKILL.md +0 -279
  420. package/prisma/skills/tinystruct-patterns/references/architecture.md +0 -90
  421. package/prisma/skills/tinystruct-patterns/references/data-handling.md +0 -60
  422. package/prisma/skills/tinystruct-patterns/references/database.md +0 -99
  423. package/prisma/skills/tinystruct-patterns/references/routing.md +0 -64
  424. package/prisma/skills/tinystruct-patterns/references/system-usage.md +0 -97
  425. package/prisma/skills/tinystruct-patterns/references/testing.md +0 -72
  426. package/prisma/skills/token-budget-advisor/SKILL.md +0 -134
  427. package/prisma/skills/ui-demo/SKILL.md +0 -466
  428. package/prisma/skills/ui-to-vue/SKILL.md +0 -135
  429. package/prisma/skills/uncloud/SKILL.md +0 -344
  430. package/prisma/skills/unified-notifications-ops/SKILL.md +0 -188
  431. package/prisma/skills/verification-loop/SKILL.md +0 -127
  432. package/prisma/skills/video-editing/SKILL.md +0 -311
  433. package/prisma/skills/videodb/SKILL.md +0 -375
  434. package/prisma/skills/videodb/reference/api-reference.md +0 -550
  435. package/prisma/skills/videodb/reference/capture-reference.md +0 -407
  436. package/prisma/skills/videodb/reference/capture.md +0 -101
  437. package/prisma/skills/videodb/reference/editor.md +0 -443
  438. package/prisma/skills/videodb/reference/generative.md +0 -331
  439. package/prisma/skills/videodb/reference/rtstream-reference.md +0 -564
  440. package/prisma/skills/videodb/reference/rtstream.md +0 -65
  441. package/prisma/skills/videodb/reference/search.md +0 -230
  442. package/prisma/skills/videodb/reference/streaming.md +0 -406
  443. package/prisma/skills/videodb/reference/use-cases.md +0 -118
  444. package/prisma/skills/videodb/scripts/ws_listener.py +0 -282
  445. package/prisma/skills/visa-doc-translate/README.md +0 -86
  446. package/prisma/skills/visa-doc-translate/SKILL.md +0 -117
  447. package/prisma/skills/vite-patterns/SKILL.md +0 -450
  448. package/prisma/skills/vue-patterns/SKILL.md +0 -471
  449. package/prisma/skills/windows-desktop-e2e/SKILL.md +0 -888
  450. package/prisma/skills/workspace-surface-audit/SKILL.md +0 -126
  451. package/prisma/skills/x-api/SKILL.md +0 -235
@@ -1,257 +0,0 @@
- ---
- name: agent-architecture-audit
- description: Full-stack diagnostic for agent and LLM applications. Audits the 12-layer agent stack for wrapper regression, memory pollution, tool discipline failures, hidden repair loops, and rendering corruption. Produces severity-ranked findings with code-first fixes. Essential for developers building agent applications, autonomous loops, or any LLM-powered feature.
- metadata:
- origin: oh-my-agent-check
- tools: Read, Write, Edit, Bash, Grep, Glob
- ---
-
- # Agent Architecture Audit
-
- A diagnostic workflow for agent systems that hide failures behind wrapper layers, stale memory, retry loops, or transport/rendering mutations.
-
- ## When to Activate
-
- **MANDATORY for:**
- - Releasing any agent or LLM-powered application to production
- - Shipping features with tool calling, memory, or multi-step workflows
- - Agent behavior degrades after adding wrapper layers
- - User reports "the agent is getting worse" or "tools are flaky"
- - Same model works in playground but breaks inside your wrapper
- - Debugging agent behavior for more than 15 minutes without finding root cause
-
- **Especially critical when:**
- - You've added new prompt layers, tool definitions, or memory systems
- - Different agents in your system behave inconsistently
- - The model was fine yesterday but is hallucinating today
- - You suspect hidden repair/retry loops silently mutating responses
-
- **Do not use for:**
- - General code debugging — use `agent-introspection-debugging`
31
- - Code review — use language-specific reviewer agents
32
- - Security scanning — use `security-review` or `security-review/scan`
33
- - Agent performance benchmarking — use `agent-eval`
34
- - Writing new features — use the appropriate workflow skill
35
-
36
- ## The 12-Layer Stack
37
-
38
- Every agent system has these layers. Any of them can corrupt the answer:
39
-
40
- | # | Layer | What Goes Wrong |
41
- |---|-------|----------------|
42
- | 1 | System prompt | Conflicting instructions, instruction bloat |
43
- | 2 | Session history | Stale context injection from previous turns |
44
- | 3 | Long-term memory | Pollution across sessions, old topics in new conversations |
45
- | 4 | Distillation | Compressed artifacts re-entering as pseudo-facts |
46
- | 5 | Active recall | Redundant re-summary layers wasting context |
47
- | 6 | Tool selection | Wrong tool routing, model skips required tools |
48
- | 7 | Tool execution | Hallucinated execution — claims to call but doesn't |
49
- | 8 | Tool interpretation | Misread or ignored tool output |
50
- | 9 | Answer shaping | Format corruption in final response |
51
- | 10 | Platform rendering | Transport-layer mutation (UI, API, CLI mutates valid answers) |
52
- | 11 | Hidden repair loops | Silent fallback/retry agents running second LLM pass |
53
- | 12 | Persistence | Expired state or cached artifacts reused as live evidence |
54
-
55
- ## Common Failure Patterns
56
-
57
- ### 1. Wrapper Regression
58
-
59
- The base model produces correct answers, but the wrapper layers make it worse.
60
-
61
- **Symptoms:**
62
- - Model works fine in playground or direct API call, breaks in your agent
63
- - Added a new prompt layer, existing behavior degraded
64
- - Agent sounds confident but is confidently wrong
65
- - "It was working before the last update"
66
-
67
- ### 2. Memory Contamination
68
-
69
- Old topics leak into new conversations through history, memory retrieval, or distillation.
70
-
71
- **Symptoms:**
72
- - Agent brings up unrelated past topics
73
- - User corrections don't stick (old memory overwrites new)
74
- - Same-session artifacts re-enter as pseudo-facts
75
- - Memory grows without bound, degrading response quality over time
76
-
77
- ### 3. Tool Discipline Failure
78
-
79
- Tools are declared in the prompt but not enforced in code. The model skips them or hallucinates execution.
80
-
81
- **Symptoms:**
82
- - "Must use tool X" in prompt, but model answers without calling it
83
- - Tool results look correct but were never actually executed
84
- - Different tools fight over the same responsibility
85
- - Model uses tool when it shouldn't, or skips it when it must
86
-
87
- ### 4. Rendering/Transport Corruption
88
-
89
- The agent's internal answer is correct, but the platform layer mutates it during delivery.
90
-
91
- **Symptoms:**
92
- - Logs show correct answer, user sees broken output
93
- - Markdown rendering, JSON parsing, or streaming fragments corrupt valid responses
94
- - Hidden fallback agent quietly replaces the answer before delivery
95
- - Output differs between terminal and UI
96
-
97
- ### 5. Hidden Agent Layers
98
-
99
- Silent repair, retry, summarization, or recall agents run without explicit contracts.
100
-
101
- **Symptoms:**
102
- - Output changes between internal generation and user delivery
103
- - "Auto-fix" loops run a second LLM pass the user doesn't know about
104
- - Multiple agents modify the same output without coordination
105
- - Answers get "smoothed" or "corrected" by invisible layers
106
-
107
- ## Audit Workflow
108
-
109
- ### Phase 1: Scope
110
-
111
- Define what you're auditing:
112
-
113
- - **Target system** — what agent application?
114
- - **Entrypoints** — how do users interact with it?
115
- - **Model stack** — which LLM(s) and providers?
116
- - **Symptoms** — what does the user report?
117
- - **Time window** — when did it start?
118
- - **Layers to audit** — which of the 12 layers apply?
119
-
120
- ### Phase 2: Evidence Collection
121
-
122
- Gather evidence from the codebase:
123
-
124
- - **Source code** — agent loop, tool router, memory admission, prompt assembly
125
- - **Logs** — historical session traces, tool call records
126
- - **Config** — prompt templates, tool schemas, provider settings
127
- - **Memory files** — SOPs, knowledge bases, session archives
128
-
129
- Use `rg` to search for anti-patterns:
130
-
131
- ```bash
132
- # Tool requirements expressed only in prompt text (not code)
133
- rg "must.*tool|必须.*工具|required.*call" --type md
134
-
135
- # Tool-call sites (inspect each one for missing validation)
136
- rg "tool_call|toolCall|tool_use" --type py --type ts
137
-
138
- # Hidden LLM calls outside main agent loop
139
- rg "completion|chat\.create|messages\.create|llm\.invoke"
140
-
141
- # Memory admission without user-correction priority
142
- rg "memory.*admit|long.*term.*update|persist.*memory" --type py --type ts
143
-
144
- # Fallback loops that run additional LLM calls
145
- rg "fallback|retry.*llm|repair.*prompt|re-?prompt" --type py --type ts
146
-
147
- # Silent output mutation
148
- rg "mutate|rewrite.*response|transform.*output|shap" --type py --type ts
149
- ```
150
-
151
- ### Phase 3: Failure Mapping
152
-
153
- For each finding, document:
154
-
155
- - **Symptom** — what the user sees
156
- - **Mechanism** — how the wrapper causes it
157
- - **Source layer** — which of the 12 layers
158
- - **Root cause** — the deepest cause
159
- - **Evidence** — file:line or log:row reference
160
- - **Confidence** — 0.0 to 1.0
161
-
162
- ### Phase 4: Fix Strategy
163
-
164
- Default fix order (code-first, not prompt-first):
165
-
166
- 1. **Code-gate tool requirements** — enforce in code, not just prompt text
167
- 2. **Remove or narrow hidden repair agents** — make fallback explicit with contracts
168
- 3. **Reduce context duplication** — same info through prompt + history + memory + distillation
169
- 4. **Tighten memory admission** — user corrections > agent assertions
170
- 5. **Tighten distillation triggers** — don't compress what shouldn't be compressed
171
- 6. **Reduce rendering mutation** — pass-through, don't transform
172
- 7. **Convert to typed JSON envelopes** — structured internal flow, not freeform prose
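As an illustration of step 1 in the list above, a minimal Python sketch of code-gating a required tool might look like this. The `ToolCall` record, the `require_tools` and `deliver` functions, and the tool name `lookup_order` used in the usage note are hypothetical names chosen for the example, not part of any framework or of this package. The only point is that the wrapper refuses to deliver an answer unless the required tool was really executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    """One tool invocation recorded by the agent loop (hypothetical shape)."""
    name: str
    executed: bool  # True only if the runtime actually ran the tool


class MissingToolError(RuntimeError):
    """Raised when an answer is produced without a required tool call."""


def require_tools(required: set[str], calls: list[ToolCall]) -> None:
    """Reject the turn unless every required tool was actually executed."""
    executed = {call.name for call in calls if call.executed}
    missing = required - executed
    if missing:
        raise MissingToolError(f"required tools not executed: {sorted(missing)}")


def deliver(answer: str, required: set[str], calls: list[ToolCall]) -> str:
    """Gate delivery in code instead of trusting a 'must use tool X' prompt."""
    require_tools(required, calls)
    return answer
```

With this gate in place, `deliver("...", {"lookup_order"}, calls)` raises `MissingToolError` whenever the recorded calls do not include an executed `lookup_order`, so a hallucinated tool result cannot reach the user silently.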
173
-
174
- ## Severity Model
175
-
176
- | Level | Meaning | Action |
177
- |-------|---------|--------|
178
- | `critical` | Agent can confidently produce wrong operational behavior | Fix before next release |
179
- | `high` | Agent frequently degrades correctness or stability | Fix this sprint |
180
- | `medium` | Correctness usually survives but output is fragile or wasteful | Plan for next cycle |
181
- | `low` | Mostly cosmetic or maintainability issues | Backlog |
182
-
183
- ## Output Format
184
-
185
- Present findings to the user in this order:
186
-
187
- 1. **Severity-ranked findings** (most critical first)
188
- 2. **Architecture diagnosis** (which layer corrupted what, and why)
189
- 3. **Ordered fix plan** (code-first, not prompt-first)
190
-
191
- Do not lead with compliments or summaries. If the system is broken, say so directly.
192
-
193
- ## Quick Diagnostic Questions
194
-
195
- When auditing an agent system, answer these:
196
-
197
- | # | Question | If Yes → |
198
- |---|----------|----------|
199
- | 1 | Can the model skip a required tool and still answer? | Tool not code-gated |
200
- | 2 | Does old conversation content appear in new turns? | Memory contamination |
201
- | 3 | Is the same info in system prompt AND memory AND history? | Context duplication |
202
- | 4 | Does the platform run a second LLM pass before delivery? | Hidden repair loop |
203
- | 5 | Does the output differ between internal generation and user delivery? | Rendering corruption |
204
- | 6 | Are "must use tool X" rules only in prompt text? | Tool discipline failure |
205
- | 7 | Can the agent's own monologue become persistent memory? | Memory poisoning |
206
-
207
- ## Anti-Patterns to Avoid
208
-
209
- - Avoid blaming the model before falsifying wrapper-layer regressions.
210
- - Avoid blaming memory without showing the contamination path.
211
- - Do not let a clean current state erase a dirty historical incident.
212
- - Do not treat markdown prose as a trustworthy internal protocol.
213
- - Do not accept "must use tool" in prompt text when code never enforces it.
214
- - Keep findings direct, evidence-backed, and severity-ranked.
215
-
216
- ## Report Schema
217
-
218
- Audits should produce structured reports following this shape:
219
-
220
- ```json
221
- {
222
- "schema_version": "ecc.agent-architecture-audit.report.v1",
223
- "executive_verdict": {
224
- "overall_health": "high_risk",
225
- "primary_failure_mode": "string",
226
- "most_urgent_fix": "string"
227
- },
228
- "scope": {
229
- "target_name": "string",
230
- "model_stack": ["string"],
231
- "layers_to_audit": ["string"]
232
- },
233
- "findings": [
234
- {
235
- "severity": "critical|high|medium|low",
236
- "title": "string",
237
- "mechanism": "string",
238
- "source_layer": "string",
239
- "root_cause": "string",
240
- "evidence_refs": ["file:line"],
241
- "confidence": 0.0,
242
- "recommended_fix": "string"
243
- }
244
- ],
245
- "ordered_fix_plan": [
246
- { "order": 1, "goal": "string", "why_now": "string", "expected_effect": "string" }
247
- ]
248
- }
249
- ```
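A report in the shape above can be checked and ordered mechanically before it is presented. The following is a minimal sketch, assuming findings are plain dictionaries loaded from the JSON; the helper names `check_finding` and `rank_findings` are hypothetical, and breaking severity ties by descending confidence is an assumption of this example, not something the schema requires.

```python
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}
REQUIRED_FINDING_KEYS = {
    "severity", "title", "mechanism", "source_layer",
    "root_cause", "evidence_refs", "confidence", "recommended_fix",
}


def check_finding(finding: dict) -> list[str]:
    """Return a list of problems found in one finding (empty means OK)."""
    problems = []
    missing = REQUIRED_FINDING_KEYS - finding.keys()
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if finding.get("severity") not in SEVERITY_ORDER:
        problems.append(f"unknown severity: {finding.get('severity')!r}")
    confidence = finding.get("confidence")
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append("confidence must be a number between 0.0 and 1.0")
    return problems


def rank_findings(report: dict) -> list[dict]:
    """Sort findings most severe first, then by descending confidence."""
    return sorted(
        report.get("findings", []),
        key=lambda f: (
            SEVERITY_ORDER.get(f.get("severity"), len(SEVERITY_ORDER)),
            -float(f.get("confidence", 0.0)),
        ),
    )
```

Running `check_finding` on every entry before `rank_findings` keeps malformed findings out of the severity-ranked output.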
250
-
251
- ## Related Skills
252
-
253
- - `agent-introspection-debugging` — Debug agent runtime failures (loops, timeouts, state errors)
254
- - `agent-eval` — Benchmark agent performance head-to-head
255
- - `security-review` — Security audit for code and configuration
256
- - `autonomous-agent-harness` — Set up autonomous agent operations
257
- - `agent-harness-construction` — Build agent harnesses from scratch
@@ -1,146 +0,0 @@
1
- ---
2
- name: agent-eval
3
- description: Head-to-head comparison of coding agents (Claude Code, Aider, Codex, etc.) on custom tasks with pass rate, cost, time, and consistency metrics
4
- metadata:
5
- origin: ECC
6
- tools: Read, Write, Edit, Bash, Grep, Glob
7
- ---
8
-
9
- # Agent Eval Skill
10
-
11
- A lightweight CLI tool for comparing coding agents head-to-head on reproducible tasks. Every "which coding agent is best?" comparison runs on vibes — this tool systematizes it.
12
-
13
- ## When to Activate
14
-
15
- - Comparing coding agents (Claude Code, Aider, Codex, etc.) on your own codebase
16
- - Measuring agent performance before adopting a new tool or model
17
- - Running regression checks when an agent updates its model or tooling
18
- - Producing data-backed agent selection decisions for a team
19
-
20
- ## Installation
21
-
22
- > **Note:** Install agent-eval from its repository after reviewing the source.
23
-
24
- ## Core Concepts
25
-
26
- ### YAML Task Definitions
27
-
28
- Define tasks declaratively. Each task specifies what to do, which files to touch, and how to judge success:
29
-
30
- ```yaml
31
- name: add-retry-logic
32
- description: Add exponential backoff retry to the HTTP client
33
- repo: ./my-project
34
- files:
35
- - src/http_client.py
36
- prompt: |
37
- Add retry logic with exponential backoff to all HTTP requests.
38
- Max 3 retries. Initial delay 1s, max delay 30s.
39
- judge:
40
- - type: pytest
41
- command: pytest tests/test_http_client.py -v
42
- - type: grep
43
- pattern: "exponential_backoff|retry"
44
- files: src/http_client.py
45
- commit: "abc1234" # pin to specific commit for reproducibility
46
- ```
47
-
48
- ### Git Worktree Isolation
49
-
50
- Each agent run gets its own git worktree; no Docker is required. This isolates every run, so agents cannot interfere with each other or corrupt the base repo, and results stay reproducible.
51
-
52
- ### Metrics Collected
53
-
54
- | Metric | What It Measures |
55
- |--------|-----------------|
56
- | Pass rate | Did the agent produce code that passes the judge? |
57
- | Cost | API spend per task (when available) |
58
- | Time | Wall-clock seconds to completion |
59
- | Consistency | Pass rate across repeated runs (e.g., 3/3 = 100%) |
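To make the four metrics concrete, here is a minimal Python sketch of how repeated runs could be aggregated. It is not agent-eval's implementation: the `RunResult` record and the `summarize` function are hypothetical, and averaging time and cost across runs is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunResult:
    """Outcome of one agent run on one task (hypothetical record)."""
    agent: str
    passed: bool
    seconds: float
    cost_usd: Optional[float] = None  # None when no cost is reported


def summarize(runs: list[RunResult]) -> dict:
    """Aggregate repeated runs of one agent on one task."""
    if not runs:
        raise ValueError("need at least one run")
    passes = sum(1 for r in runs if r.passed)
    costs = [r.cost_usd for r in runs if r.cost_usd is not None]
    return {
        "pass_rate": f"{passes}/{len(runs)}",
        # Consistency is the pass rate across repeated runs, as a percentage.
        "consistency_pct": round(100 * passes / len(runs)),
        "mean_seconds": sum(r.seconds for r in runs) / len(runs),
        "mean_cost_usd": sum(costs) / len(costs) if costs else None,
    }
```

Two passes out of three runs give a pass rate of `2/3` and a consistency of 67%.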
60
-
61
- ## Workflow
62
-
63
- ### 1. Define Tasks
64
-
65
- Create a `tasks/` directory with YAML files, one per task:
66
-
67
- ```bash
68
- mkdir tasks
69
- # Write task definitions (see template above)
70
- ```
71
-
72
- ### 2. Run Agents
73
-
74
- Execute agents against your tasks:
75
-
76
- ```bash
77
- agent-eval run --task tasks/add-retry-logic.yaml --agent claude-code --agent aider --runs 3
78
- ```
79
-
80
- Each run:
81
- 1. Creates a fresh git worktree from the specified commit
82
- 2. Hands the prompt to the agent
83
- 3. Runs the judge criteria
84
- 4. Records pass/fail, cost, and time
85
-
86
- ### 3. Compare Results
87
-
88
- Generate a comparison report:
89
-
90
- ```bash
91
- agent-eval report --format table
92
- ```
93
-
94
- ```
95
- Task: add-retry-logic (3 runs each)
96
- ┌──────────────┬───────────┬────────┬────────┬─────────────┐
97
- │ Agent │ Pass Rate │ Cost │ Time │ Consistency │
98
- ├──────────────┼───────────┼────────┼────────┼─────────────┤
99
- │ claude-code │ 3/3 │ $0.12 │ 45s │ 100% │
100
- │ aider │ 2/3 │ $0.08 │ 38s │ 67% │
101
- └──────────────┴───────────┴────────┴────────┴─────────────┘
102
- ```
103
-
104
- ## Judge Types
105
-
106
- ### Code-Based (deterministic)
107
-
108
- ```yaml
109
- judge:
110
- - type: pytest
111
- command: pytest tests/ -v
112
- - type: command
113
- command: npm run build
114
- ```
115
-
116
- ### Pattern-Based
117
-
118
- ```yaml
119
- judge:
120
- - type: grep
121
- pattern: "class.*Retry"
122
- files: src/**/*.py
123
- ```
124
-
125
- ### Model-Based (LLM-as-judge)
126
-
127
- ```yaml
128
- judge:
129
- - type: llm
130
- prompt: |
131
- Does this implementation correctly handle exponential backoff?
132
- Check for: max retries, increasing delays, jitter.
133
- ```
134
-
135
- ## Best Practices
136
-
137
- - **Start with 3-5 tasks** that represent your real workload, not toy examples
138
- - **Run at least 3 trials** per agent to capture variance — agents are non-deterministic
139
- - **Pin the commit** in your task YAML so results are reproducible across days/weeks
140
- - **Include at least one deterministic judge** (tests, build) per task — LLM judges add noise
141
- - **Track cost alongside pass rate** — a 95% agent at 10x the cost may not be the right choice
142
- - **Version your task definitions** — they are test fixtures, treat them as code
143
-
144
- ## Links
145
-
146
- - Repository: [github.com/joaquinhuigomez/agent-eval](https://github.com/joaquinhuigomez/agent-eval)
@@ -1,74 +0,0 @@
1
- ---
2
- name: agent-harness-construction
3
- description: Design and optimize AI agent action spaces, tool definitions, and observation formatting for higher completion rates.
4
- metadata:
5
- origin: ECC
6
- ---
7
-
8
- # Agent Harness Construction
9
-
10
- Use this skill when you are improving how an agent plans, calls tools, recovers from errors, and converges on completion.
11
-
12
- ## Core Model
13
-
14
- Agent output quality is constrained by:
15
- 1. Action space quality
16
- 2. Observation quality
17
- 3. Recovery quality
18
- 4. Context budget quality
19
-
20
- ## Action Space Design
21
-
22
- 1. Use stable, explicit tool names.
23
- 2. Keep inputs schema-first and narrow.
24
- 3. Return deterministic output shapes.
25
- 4. Avoid catch-all tools unless isolation is impossible.
26
-
27
- ## Granularity Rules
28
-
29
- - Use micro-tools for high-risk operations (deploy, migration, permissions).
30
- - Use medium tools for common edit/read/search loops.
31
- - Use macro-tools only when round-trip overhead is the dominant cost.
32
-
33
- ## Observation Design
34
-
35
- Every tool response should include:
36
- - `status`: success|warning|error
37
- - `summary`: one-line result
38
- - `next_actions`: actionable follow-ups
39
- - `artifacts`: file paths / IDs
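The four fields listed above can be captured as a small typed envelope. The sketch below is one possible shape in Python; the class name `ToolResponse` and its validation rules are assumptions for illustration, since the skill only names the fields.

```python
from dataclasses import asdict, dataclass, field

VALID_STATUSES = ("success", "warning", "error")


@dataclass
class ToolResponse:
    """Deterministic output shape returned by every tool (hypothetical)."""
    status: str                    # one of VALID_STATUSES
    summary: str                   # one-line result
    next_actions: list[str] = field(default_factory=list)  # actionable follow-ups
    artifacts: list[str] = field(default_factory=list)     # file paths / IDs

    def __post_init__(self) -> None:
        if self.status not in VALID_STATUSES:
            raise ValueError(f"invalid status: {self.status!r}")
        if "\n" in self.summary:
            raise ValueError("summary must be a single line")

    def to_dict(self) -> dict:
        """Serialize to a plain dict, e.g. for a JSON tool result."""
        return asdict(self)
```

Rejecting unknown statuses and multi-line summaries at construction time keeps the shape stable, so the agent never has to guess how to parse a tool result.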
40
-
41
- ## Error Recovery Contract
42
-
43
- For every error path, include:
44
- - root cause hint
45
- - safe retry instruction
46
- - explicit stop condition
47
-
48
- ## Context Budgeting
49
-
50
- 1. Keep system prompt minimal and invariant.
51
- 2. Move large guidance into skills loaded on demand.
52
- 3. Prefer references to files over inlining long documents.
53
- 4. Compact at phase boundaries, not arbitrary token thresholds.
54
-
55
- ## Architecture Pattern Guidance
56
-
57
- - ReAct: best for exploratory tasks with uncertain path.
58
- - Function-calling: best for structured deterministic flows.
59
- - Hybrid (recommended): ReAct planning + typed tool execution.
60
-
61
- ## Benchmarking
62
-
63
- Track:
64
- - completion rate
65
- - retries per task
66
- - pass@1 and pass@3
67
- - cost per successful task
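A minimal sketch of two of these metrics follows. It assumes the simple empirical reading of pass@k (a task counts as solved if any of its first k attempts passed), which is not the unbiased estimator used in some benchmark papers; the function names and the input layout are hypothetical.

```python
from typing import Optional


def pass_at_k(attempts_by_task: dict[str, list[bool]], k: int) -> float:
    """Fraction of tasks with at least one pass among the first k attempts."""
    if k < 1:
        raise ValueError("k must be >= 1")
    if not attempts_by_task:
        raise ValueError("need at least one task")
    solved = sum(1 for attempts in attempts_by_task.values() if any(attempts[:k]))
    return solved / len(attempts_by_task)


def cost_per_successful_task(
    total_cost: float, attempts_by_task: dict[str, list[bool]]
) -> Optional[float]:
    """Total spend divided by the number of tasks that eventually passed."""
    solved = sum(1 for attempts in attempts_by_task.values() if any(attempts))
    return total_cost / solved if solved else None
```

Dividing by solved tasks rather than by all tasks means that failed attempts still count toward the spend, which is the point of tracking cost per success.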
68
-
69
- ## Anti-Patterns
70
-
71
- - Too many tools with overlapping semantics.
72
- - Opaque tool output with no recovery hints.
73
- - Error-only output without next steps.
74
- - Context overloading with irrelevant references.
@@ -1,154 +0,0 @@
1
- ---
2
- name: agent-introspection-debugging
3
- description: Structured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
4
- metadata:
5
- origin: ECC
6
- ---
7
-
8
- # Agent Introspection Debugging
9
-
10
- Use this skill when an agent run is failing repeatedly, consuming tokens without progress, looping on the same tools, or drifting away from the intended task.
11
-
12
- This is a workflow skill, not a hidden runtime. It teaches the agent to debug itself systematically before escalating to a human.
13
-
14
- ## When to Activate
15
-
16
- - Maximum tool call / loop-limit failures
17
- - Repeated retries with no forward progress
18
- - Context growth or prompt drift that starts degrading output quality
19
- - File-system or environment state mismatch between expectation and reality
20
- - Tool failures that are likely recoverable with diagnosis and a smaller corrective action
21
-
22
- ## Scope Boundaries
23
-
24
- Activate this skill for:
25
- - capturing failure state before retrying blindly
26
- - diagnosing common agent-specific failure patterns
27
- - applying contained recovery actions
28
- - producing a structured human-readable debug report
29
-
30
- Do not use this skill as the primary source for:
31
- - feature verification after code changes; use `verification-loop`
32
- - framework-specific debugging when a narrower ECC skill already exists
33
- - runtime promises the current harness cannot enforce automatically
34
-
35
- ## Four-Phase Loop
36
-
37
- ### Phase 1: Failure Capture
38
-
39
- Before trying to recover, record the failure precisely.
40
-
41
- Capture:
42
- - error type, message, and stack trace when available
43
- - last meaningful tool call sequence
44
- - what the agent was trying to do
45
- - current context pressure: repeated prompts, oversized pasted logs, duplicated plans, or runaway notes
46
- - current environment assumptions: cwd, branch, relevant service state, expected files
47
-
48
- Minimum capture template:
49
-
50
- ```markdown
51
- ## Failure Capture
52
- - Session / task:
53
- - Goal in progress:
54
- - Error:
55
- - Last successful step:
56
- - Last failed tool / command:
57
- - Repeated pattern seen:
58
- - Environment assumptions to verify:
59
- ```
60
-
61
- ### Phase 2: Root-Cause Diagnosis
62
-
63
- Match the failure to a known pattern before changing anything.
64
-
65
- | Pattern | Likely Cause | Check |
66
- | --- | --- | --- |
67
- | Maximum tool calls / repeated same command | loop or no-exit observer path | inspect the last N tool calls for repetition |
68
- | Context overflow / degraded reasoning | unbounded notes, repeated plans, oversized logs | inspect recent context for duplication and low-signal bulk |
69
- | `ECONNREFUSED` / timeout | service unavailable or wrong port | verify service health, URL, and port assumptions |
70
- | `429` / quota exhaustion | retry storm or missing backoff | count repeated calls and inspect retry spacing |
71
- | file missing after write / stale diff | race, wrong cwd, or branch drift | re-check path, cwd, git status, and actual file existence |
72
- | tests still failing after “fix” | wrong hypothesis | isolate the exact failing test and re-derive the bug |
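The first check in the table, inspecting the last N tool calls for repetition, can be sketched in a few lines of Python. The function name `repeated_calls`, the `(tool, arguments)` tuple layout, and the default window and threshold are assumptions for this example, not values prescribed by the skill.

```python
from collections import Counter


def repeated_calls(
    calls: list[tuple[str, str]], window: int = 10, threshold: int = 3
) -> dict[tuple[str, str], int]:
    """Return calls seen at least `threshold` times in the last `window` calls.

    Each call is a (tool name, serialized arguments) pair.
    """
    if window < 1 or threshold < 1:
        raise ValueError("window and threshold must be >= 1")
    counts = Counter(calls[-window:])
    return {call: n for call, n in counts.items() if n >= threshold}
```

A non-empty result is a signal to stop retrying and restate the hypothesis, not proof of a loop on its own.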
73
-
74
- Diagnosis questions:
75
- - is this a logic failure, state failure, environment failure, or policy failure?
76
- - did the agent lose the real objective and start optimizing the wrong subtask?
77
- - is the failure deterministic or transient?
78
- - what is the smallest reversible action that would validate the diagnosis?
79
-
80
- ### Phase 3: Contained Recovery
81
-
82
- Recover with the smallest reversible action that would confirm or rule out the diagnosis.
83
-
84
- Safe recovery actions:
85
- - stop repeated retries and restate the hypothesis
86
- - trim low-signal context and keep only the active goal, blockers, and evidence
87
- - re-check the actual filesystem / branch / process state
88
- - narrow the task to one failing command, one file, or one test
89
- - switch from speculative reasoning to direct observation
90
- - escalate to a human when the failure is high-risk or externally blocked
91
-
92
- Do not claim unsupported auto-healing actions like “reset agent state” or “update harness config” unless you are actually doing them through real tools in the current environment.
93
-
94
- Contained recovery checklist:
95
-
96
- ```markdown
97
- ## Recovery Action
98
- - Diagnosis chosen:
99
- - Smallest action taken:
100
- - Why this is safe:
101
- - What evidence would prove the fix worked:
102
- ```
103
-
104
- ### Phase 4: Introspection Report
105
-
106
- End with a report that makes the recovery legible to the next agent or human.
107
-
108
- ```markdown
109
- ## Agent Self-Debug Report
110
- - Session / task:
111
- - Failure:
112
- - Root cause:
113
- - Recovery action:
114
- - Result: success | partial | blocked
115
- - Token / time burn risk:
116
- - Follow-up needed:
117
- - Preventive change to encode later:
118
- ```
119
-
120
- ## Recovery Heuristics
121
-
122
- Prefer these interventions in order:
123
-
124
- 1. Restate the real objective in one sentence.
125
- 2. Verify the world state instead of trusting memory.
126
- 3. Shrink the failing scope.
127
- 4. Run one discriminating check.
128
- 5. Only then retry.
129
-
130
- Bad pattern:
131
- - retrying the same action three times with slightly different wording
132
-
133
- Good pattern:
134
- - capture failure
135
- - classify the pattern
136
- - run one direct check
137
- - change the plan only if the check supports it
138
-
139
- ## Integration with ECC
140
-
141
- - Use `verification-loop` after recovery if code was changed.
142
- - Use `continuous-learning-v2` when the failure pattern is worth turning into an instinct or later skill.
143
- - Use `council` when the issue is not technical failure but decision ambiguity.
144
- - Use `workspace-surface-audit` if the failure came from conflicting local state or repo drift.
145
-
146
- ## Output Standard
147
-
148
- When this skill is active, do not end with “I fixed it” alone.
149
-
150
- Always provide:
151
- - the failure pattern
152
- - the root-cause hypothesis
153
- - the recovery action
154
- - the evidence that the situation is now better or still blocked