autonomous-coding-toolkit 1.0.0

Files changed (324)
  1. package/.claude-plugin/marketplace.json +22 -0
  2. package/.claude-plugin/plugin.json +13 -0
  3. package/LICENSE +21 -0
  4. package/Makefile +21 -0
  5. package/README.md +140 -0
  6. package/SECURITY.md +28 -0
  7. package/agents/bash-expert.md +113 -0
  8. package/agents/dependency-auditor.md +138 -0
  9. package/agents/integration-tester.md +120 -0
  10. package/agents/lesson-scanner.md +149 -0
  11. package/agents/python-expert.md +179 -0
  12. package/agents/service-monitor.md +141 -0
  13. package/agents/shell-expert.md +147 -0
  14. package/benchmarks/runner.sh +147 -0
  15. package/benchmarks/tasks/01-rest-endpoint/rubric.sh +29 -0
  16. package/benchmarks/tasks/01-rest-endpoint/task.md +17 -0
  17. package/benchmarks/tasks/02-refactor-module/task.md +8 -0
  18. package/benchmarks/tasks/03-fix-integration-bug/task.md +8 -0
  19. package/benchmarks/tasks/04-add-test-coverage/task.md +8 -0
  20. package/benchmarks/tasks/05-multi-file-feature/task.md +8 -0
  21. package/bin/act.js +238 -0
  22. package/commands/autocode.md +6 -0
  23. package/commands/cancel-ralph.md +18 -0
  24. package/commands/code-factory.md +53 -0
  25. package/commands/create-prd.md +55 -0
  26. package/commands/ralph-loop.md +18 -0
  27. package/commands/run-plan.md +117 -0
  28. package/commands/submit-lesson.md +122 -0
  29. package/docs/ARCHITECTURE.md +630 -0
  30. package/docs/CONTRIBUTING.md +125 -0
  31. package/docs/lessons/0001-bare-exception-swallowing.md +34 -0
  32. package/docs/lessons/0002-async-def-without-await.md +28 -0
  33. package/docs/lessons/0003-create-task-without-callback.md +28 -0
  34. package/docs/lessons/0004-hardcoded-test-counts.md +28 -0
  35. package/docs/lessons/0005-sqlite-without-closing.md +33 -0
  36. package/docs/lessons/0006-venv-pip-path.md +27 -0
  37. package/docs/lessons/0007-runner-state-self-rejection.md +35 -0
  38. package/docs/lessons/0008-quality-gate-blind-spot.md +33 -0
  39. package/docs/lessons/0009-parser-overcount-empty-batches.md +36 -0
  40. package/docs/lessons/0010-local-outside-function-bash.md +33 -0
  41. package/docs/lessons/0011-batch-tests-for-unimplemented-code.md +36 -0
  42. package/docs/lessons/0012-api-markdown-unescaped-chars.md +33 -0
  43. package/docs/lessons/0013-export-prefix-env-parsing.md +33 -0
  44. package/docs/lessons/0014-decorator-registry-import-side-effect.md +43 -0
  45. package/docs/lessons/0015-frontend-backend-schema-drift.md +43 -0
  46. package/docs/lessons/0016-event-driven-cold-start-seeding.md +44 -0
  47. package/docs/lessons/0017-copy-paste-logic-diverges.md +43 -0
  48. package/docs/lessons/0018-layer-passes-pipeline-broken.md +45 -0
  49. package/docs/lessons/0019-systemd-envfile-ignores-export.md +41 -0
  50. package/docs/lessons/0020-persist-state-incrementally.md +44 -0
  51. package/docs/lessons/0021-dual-axis-testing.md +48 -0
  52. package/docs/lessons/0022-jsx-factory-shadowing.md +43 -0
  53. package/docs/lessons/0023-static-analysis-spiral.md +51 -0
  54. package/docs/lessons/0024-shared-pipeline-implementation.md +55 -0
  55. package/docs/lessons/0025-defense-in-depth-all-entry-points.md +65 -0
  56. package/docs/lessons/0026-linter-no-rules-false-enforcement.md +54 -0
  57. package/docs/lessons/0027-jsx-silent-prop-drop.md +64 -0
  58. package/docs/lessons/0028-no-infrastructure-in-client-code.md +49 -0
  59. package/docs/lessons/0029-never-write-secrets-to-files.md +61 -0
  60. package/docs/lessons/0030-cache-merge-not-replace.md +62 -0
  61. package/docs/lessons/0031-verify-units-at-boundaries.md +66 -0
  62. package/docs/lessons/0032-module-lifecycle-subscribe-unsubscribe.md +89 -0
  63. package/docs/lessons/0033-async-iteration-mutable-snapshot.md +72 -0
  64. package/docs/lessons/0034-caller-missing-await-silent-discard.md +65 -0
  65. package/docs/lessons/0035-duplicate-registration-silent-overwrite.md +85 -0
  66. package/docs/lessons/0036-websocket-dirty-disconnect.md +33 -0
  67. package/docs/lessons/0037-parallel-agents-worktree-corruption.md +31 -0
  68. package/docs/lessons/0038-subscribe-no-stored-ref.md +36 -0
  69. package/docs/lessons/0039-fallback-or-default-hides-bugs.md +34 -0
  70. package/docs/lessons/0040-event-firehose-filter-first.md +36 -0
  71. package/docs/lessons/0041-ambiguous-base-dir-path-nesting.md +32 -0
  72. package/docs/lessons/0042-spec-compliance-insufficient.md +36 -0
  73. package/docs/lessons/0043-exact-count-extensible-collections.md +32 -0
  74. package/docs/lessons/0044-relative-file-deps-worktree.md +39 -0
  75. package/docs/lessons/0045-iterative-design-improvement.md +33 -0
  76. package/docs/lessons/0046-plan-assertion-math-bugs.md +38 -0
  77. package/docs/lessons/0047-pytest-single-threaded-default.md +37 -0
  78. package/docs/lessons/0048-integration-wiring-batch.md +40 -0
  79. package/docs/lessons/0049-ab-verification.md +41 -0
  80. package/docs/lessons/0050-editing-sourced-files-during-execution.md +33 -0
  81. package/docs/lessons/0051-infrastructure-fixes-cant-self-heal.md +30 -0
  82. package/docs/lessons/0052-uncommitted-changes-poison-quality-gates.md +31 -0
  83. package/docs/lessons/0053-jq-compact-flag-inconsistency.md +31 -0
  84. package/docs/lessons/0054-parser-matches-inside-code-blocks.md +30 -0
  85. package/docs/lessons/0055-agents-compensate-for-garbled-prompts.md +31 -0
  86. package/docs/lessons/0056-grep-count-exit-code-on-zero.md +42 -0
  87. package/docs/lessons/0057-new-artifacts-break-git-clean-gates.md +42 -0
  88. package/docs/lessons/0058-dead-config-keys-never-consumed.md +49 -0
  89. package/docs/lessons/0059-contract-test-shared-structures.md +53 -0
  90. package/docs/lessons/0060-set-e-silent-death-in-runners.md +53 -0
  91. package/docs/lessons/0061-context-injection-dirty-state.md +50 -0
  92. package/docs/lessons/0062-sibling-bug-neighborhood-scan.md +29 -0
  93. package/docs/lessons/0063-one-flag-two-lifetimes.md +31 -0
  94. package/docs/lessons/0064-test-passes-wrong-reason.md +31 -0
  95. package/docs/lessons/0065-pipefail-grep-count-double-output.md +39 -0
  96. package/docs/lessons/0066-local-keyword-outside-function.md +37 -0
  97. package/docs/lessons/0067-stdin-hang-non-interactive-shell.md +36 -0
  98. package/docs/lessons/0068-agent-builds-wrong-thing-correctly.md +31 -0
  99. package/docs/lessons/0069-plan-quality-dominates-execution.md +30 -0
  100. package/docs/lessons/0070-spec-echo-back-prevents-drift.md +31 -0
  101. package/docs/lessons/0071-positive-instructions-outperform-negative.md +30 -0
  102. package/docs/lessons/0072-lost-in-the-middle-context-placement.md +30 -0
  103. package/docs/lessons/0073-unscoped-lessons-cause-false-positives.md +30 -0
  104. package/docs/lessons/0074-stale-context-injection-wrong-batch.md +32 -0
  105. package/docs/lessons/0075-research-artifacts-must-persist.md +32 -0
  106. package/docs/lessons/0076-wrong-decomposition-contaminates-downstream.md +30 -0
  107. package/docs/lessons/0077-cherry-pick-merges-need-manual-resolution.md +30 -0
  108. package/docs/lessons/0078-static-review-without-live-test.md +30 -0
  109. package/docs/lessons/0079-integration-wiring-batch-required.md +32 -0
  110. package/docs/lessons/FRAMEWORK.md +161 -0
  111. package/docs/lessons/SUMMARY.md +201 -0
  112. package/docs/lessons/TEMPLATE.md +85 -0
  113. package/docs/plans/2026-02-21-code-factory-v2-design.md +204 -0
  114. package/docs/plans/2026-02-21-code-factory-v2-implementation-plan.md +2189 -0
  115. package/docs/plans/2026-02-21-code-factory-v2-phase4-design.md +537 -0
  116. package/docs/plans/2026-02-21-code-factory-v2-phase4-implementation-plan.md +2012 -0
  117. package/docs/plans/2026-02-21-hardening-pass-design.md +108 -0
  118. package/docs/plans/2026-02-21-hardening-pass-plan.md +1378 -0
  119. package/docs/plans/2026-02-21-mab-research-report.md +406 -0
  120. package/docs/plans/2026-02-21-marketplace-restructure-design.md +240 -0
  121. package/docs/plans/2026-02-21-marketplace-restructure-plan.md +832 -0
  122. package/docs/plans/2026-02-21-phase4-completion-plan.md +697 -0
  123. package/docs/plans/2026-02-21-validator-suite-design.md +148 -0
  124. package/docs/plans/2026-02-21-validator-suite-plan.md +540 -0
  125. package/docs/plans/2026-02-22-mab-research-round2.md +556 -0
  126. package/docs/plans/2026-02-22-mab-run-design.md +462 -0
  127. package/docs/plans/2026-02-22-mab-run-plan.md +2046 -0
  128. package/docs/plans/2026-02-22-operations-design-methodology-research.md +681 -0
  129. package/docs/plans/2026-02-22-research-agent-failure-taxonomy.md +532 -0
  130. package/docs/plans/2026-02-22-research-code-guideline-policies.md +886 -0
  131. package/docs/plans/2026-02-22-research-codebase-audit-refactoring.md +908 -0
  132. package/docs/plans/2026-02-22-research-coding-standards-documentation.md +541 -0
  133. package/docs/plans/2026-02-22-research-competitive-landscape.md +687 -0
  134. package/docs/plans/2026-02-22-research-comprehensive-testing.md +1076 -0
  135. package/docs/plans/2026-02-22-research-context-utilization.md +459 -0
  136. package/docs/plans/2026-02-22-research-cost-quality-tradeoff.md +548 -0
  137. package/docs/plans/2026-02-22-research-lesson-transferability.md +508 -0
  138. package/docs/plans/2026-02-22-research-multi-agent-coordination.md +312 -0
  139. package/docs/plans/2026-02-22-research-phase-integration.md +602 -0
  140. package/docs/plans/2026-02-22-research-plan-quality.md +428 -0
  141. package/docs/plans/2026-02-22-research-prompt-engineering.md +558 -0
  142. package/docs/plans/2026-02-22-research-unconventional-perspectives.md +528 -0
  143. package/docs/plans/2026-02-22-research-user-adoption.md +638 -0
  144. package/docs/plans/2026-02-22-research-verification-effectiveness.md +433 -0
  145. package/docs/plans/2026-02-23-agent-suite-design.md +299 -0
  146. package/docs/plans/2026-02-23-agent-suite-plan.md +578 -0
  147. package/docs/plans/2026-02-23-phase3-cost-infrastructure-design.md +148 -0
  148. package/docs/plans/2026-02-23-phase3-cost-infrastructure-plan.md +1062 -0
  149. package/docs/plans/2026-02-23-research-bash-expert-agent.md +543 -0
  150. package/docs/plans/2026-02-23-research-dependency-auditor-agent.md +564 -0
  151. package/docs/plans/2026-02-23-research-improving-existing-agents.md +503 -0
  152. package/docs/plans/2026-02-23-research-integration-tester-agent.md +454 -0
  153. package/docs/plans/2026-02-23-research-python-expert-agent.md +429 -0
  154. package/docs/plans/2026-02-23-research-service-monitor-agent.md +425 -0
  155. package/docs/plans/2026-02-23-research-shell-expert-agent.md +533 -0
  156. package/docs/plans/2026-02-23-roadmap-to-completion.md +530 -0
  157. package/docs/plans/2026-02-24-headless-module-split-design.md +98 -0
  158. package/docs/plans/2026-02-24-headless-module-split.md +443 -0
  159. package/docs/plans/2026-02-24-lesson-scope-metadata-design.md +228 -0
  160. package/docs/plans/2026-02-24-lesson-scope-metadata-plan.md +968 -0
  161. package/docs/plans/2026-02-24-npm-packaging-design.md +841 -0
  162. package/docs/plans/2026-02-24-npm-packaging-plan.md +1965 -0
  163. package/docs/plans/audit-findings.md +186 -0
  164. package/docs/telegram-notification-format.md +98 -0
  165. package/examples/example-plan.md +51 -0
  166. package/examples/example-prd.json +72 -0
  167. package/examples/example-roadmap.md +33 -0
  168. package/examples/quickstart-plan.md +63 -0
  169. package/hooks/hooks.json +26 -0
  170. package/hooks/setup-symlinks.sh +48 -0
  171. package/hooks/stop-hook.sh +135 -0
  172. package/package.json +47 -0
  173. package/policies/bash.md +71 -0
  174. package/policies/python.md +71 -0
  175. package/policies/testing.md +61 -0
  176. package/policies/universal.md +60 -0
  177. package/scripts/analyze-report.sh +97 -0
  178. package/scripts/architecture-map.sh +145 -0
  179. package/scripts/auto-compound.sh +273 -0
  180. package/scripts/batch-audit.sh +42 -0
  181. package/scripts/batch-test.sh +101 -0
  182. package/scripts/entropy-audit.sh +221 -0
  183. package/scripts/failure-digest.sh +51 -0
  184. package/scripts/generate-ast-rules.sh +96 -0
  185. package/scripts/init.sh +112 -0
  186. package/scripts/lesson-check.sh +428 -0
  187. package/scripts/lib/common.sh +61 -0
  188. package/scripts/lib/cost-tracking.sh +153 -0
  189. package/scripts/lib/ollama.sh +60 -0
  190. package/scripts/lib/progress-writer.sh +128 -0
  191. package/scripts/lib/run-plan-context.sh +215 -0
  192. package/scripts/lib/run-plan-echo-back.sh +231 -0
  193. package/scripts/lib/run-plan-headless.sh +396 -0
  194. package/scripts/lib/run-plan-notify.sh +57 -0
  195. package/scripts/lib/run-plan-parser.sh +81 -0
  196. package/scripts/lib/run-plan-prompt.sh +215 -0
  197. package/scripts/lib/run-plan-quality-gate.sh +132 -0
  198. package/scripts/lib/run-plan-routing.sh +315 -0
  199. package/scripts/lib/run-plan-sampling.sh +170 -0
  200. package/scripts/lib/run-plan-scoring.sh +146 -0
  201. package/scripts/lib/run-plan-state.sh +142 -0
  202. package/scripts/lib/run-plan-team.sh +199 -0
  203. package/scripts/lib/telegram.sh +54 -0
  204. package/scripts/lib/thompson-sampling.sh +176 -0
  205. package/scripts/license-check.sh +74 -0
  206. package/scripts/mab-run.sh +575 -0
  207. package/scripts/module-size-check.sh +146 -0
  208. package/scripts/patterns/async-no-await.yml +5 -0
  209. package/scripts/patterns/bare-except.yml +6 -0
  210. package/scripts/patterns/empty-catch.yml +6 -0
  211. package/scripts/patterns/hardcoded-localhost.yml +9 -0
  212. package/scripts/patterns/retry-loop-no-backoff.yml +12 -0
  213. package/scripts/pipeline-status.sh +197 -0
  214. package/scripts/policy-check.sh +226 -0
  215. package/scripts/prior-art-search.sh +133 -0
  216. package/scripts/promote-mab-lessons.sh +126 -0
  217. package/scripts/prompts/agent-a-superpowers.md +29 -0
  218. package/scripts/prompts/agent-b-ralph.md +29 -0
  219. package/scripts/prompts/judge-agent.md +61 -0
  220. package/scripts/prompts/planner-agent.md +44 -0
  221. package/scripts/pull-community-lessons.sh +90 -0
  222. package/scripts/quality-gate.sh +266 -0
  223. package/scripts/research-gate.sh +90 -0
  224. package/scripts/run-plan.sh +329 -0
  225. package/scripts/scope-infer.sh +159 -0
  226. package/scripts/setup-ralph-loop.sh +155 -0
  227. package/scripts/telemetry.sh +230 -0
  228. package/scripts/tests/run-all-tests.sh +52 -0
  229. package/scripts/tests/test-act-cli.sh +46 -0
  230. package/scripts/tests/test-agents-md.sh +87 -0
  231. package/scripts/tests/test-analyze-report.sh +114 -0
  232. package/scripts/tests/test-architecture-map.sh +89 -0
  233. package/scripts/tests/test-auto-compound.sh +169 -0
  234. package/scripts/tests/test-batch-test.sh +65 -0
  235. package/scripts/tests/test-benchmark-runner.sh +25 -0
  236. package/scripts/tests/test-common.sh +168 -0
  237. package/scripts/tests/test-cost-tracking.sh +158 -0
  238. package/scripts/tests/test-echo-back.sh +180 -0
  239. package/scripts/tests/test-entropy-audit.sh +146 -0
  240. package/scripts/tests/test-failure-digest.sh +66 -0
  241. package/scripts/tests/test-generate-ast-rules.sh +145 -0
  242. package/scripts/tests/test-helpers.sh +82 -0
  243. package/scripts/tests/test-init.sh +47 -0
  244. package/scripts/tests/test-lesson-check.sh +278 -0
  245. package/scripts/tests/test-lesson-local.sh +55 -0
  246. package/scripts/tests/test-license-check.sh +109 -0
  247. package/scripts/tests/test-mab-run.sh +182 -0
  248. package/scripts/tests/test-ollama-lib.sh +49 -0
  249. package/scripts/tests/test-ollama.sh +60 -0
  250. package/scripts/tests/test-pipeline-status.sh +198 -0
  251. package/scripts/tests/test-policy-check.sh +124 -0
  252. package/scripts/tests/test-prior-art-search.sh +96 -0
  253. package/scripts/tests/test-progress-writer.sh +140 -0
  254. package/scripts/tests/test-promote-mab-lessons.sh +110 -0
  255. package/scripts/tests/test-pull-community-lessons.sh +149 -0
  256. package/scripts/tests/test-quality-gate.sh +241 -0
  257. package/scripts/tests/test-research-gate.sh +132 -0
  258. package/scripts/tests/test-run-plan-cli.sh +86 -0
  259. package/scripts/tests/test-run-plan-context.sh +305 -0
  260. package/scripts/tests/test-run-plan-e2e.sh +153 -0
  261. package/scripts/tests/test-run-plan-headless.sh +424 -0
  262. package/scripts/tests/test-run-plan-notify.sh +124 -0
  263. package/scripts/tests/test-run-plan-parser.sh +217 -0
  264. package/scripts/tests/test-run-plan-prompt.sh +254 -0
  265. package/scripts/tests/test-run-plan-quality-gate.sh +222 -0
  266. package/scripts/tests/test-run-plan-routing.sh +178 -0
  267. package/scripts/tests/test-run-plan-scoring.sh +148 -0
  268. package/scripts/tests/test-run-plan-state.sh +261 -0
  269. package/scripts/tests/test-run-plan-team.sh +157 -0
  270. package/scripts/tests/test-scope-infer.sh +150 -0
  271. package/scripts/tests/test-setup-ralph-loop.sh +63 -0
  272. package/scripts/tests/test-telegram-env.sh +38 -0
  273. package/scripts/tests/test-telegram.sh +121 -0
  274. package/scripts/tests/test-telemetry.sh +46 -0
  275. package/scripts/tests/test-thompson-sampling.sh +139 -0
  276. package/scripts/tests/test-validate-all.sh +60 -0
  277. package/scripts/tests/test-validate-commands.sh +89 -0
  278. package/scripts/tests/test-validate-hooks.sh +98 -0
  279. package/scripts/tests/test-validate-lessons.sh +150 -0
  280. package/scripts/tests/test-validate-plan-quality.sh +235 -0
  281. package/scripts/tests/test-validate-plans.sh +187 -0
  282. package/scripts/tests/test-validate-plugin.sh +106 -0
  283. package/scripts/tests/test-validate-prd.sh +184 -0
  284. package/scripts/tests/test-validate-skills.sh +134 -0
  285. package/scripts/validate-all.sh +57 -0
  286. package/scripts/validate-commands.sh +67 -0
  287. package/scripts/validate-hooks.sh +89 -0
  288. package/scripts/validate-lessons.sh +98 -0
  289. package/scripts/validate-plan-quality.sh +369 -0
  290. package/scripts/validate-plans.sh +120 -0
  291. package/scripts/validate-plugin.sh +86 -0
  292. package/scripts/validate-policies.sh +42 -0
  293. package/scripts/validate-prd.sh +118 -0
  294. package/scripts/validate-skills.sh +96 -0
  295. package/skills/autocode/SKILL.md +285 -0
  296. package/skills/autocode/ab-verification.md +51 -0
  297. package/skills/autocode/code-quality-standards.md +37 -0
  298. package/skills/autocode/competitive-mode.md +364 -0
  299. package/skills/brainstorming/SKILL.md +97 -0
  300. package/skills/capture-lesson/SKILL.md +187 -0
  301. package/skills/check-lessons/SKILL.md +116 -0
  302. package/skills/dispatching-parallel-agents/SKILL.md +110 -0
  303. package/skills/executing-plans/SKILL.md +85 -0
  304. package/skills/finishing-a-development-branch/SKILL.md +201 -0
  305. package/skills/receiving-code-review/SKILL.md +72 -0
  306. package/skills/requesting-code-review/SKILL.md +59 -0
  307. package/skills/requesting-code-review/code-reviewer.md +82 -0
  308. package/skills/research/SKILL.md +145 -0
  309. package/skills/roadmap/SKILL.md +115 -0
  310. package/skills/subagent-driven-development/SKILL.md +98 -0
  311. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +18 -0
  312. package/skills/subagent-driven-development/implementer-prompt.md +73 -0
  313. package/skills/subagent-driven-development/spec-reviewer-prompt.md +57 -0
  314. package/skills/systematic-debugging/SKILL.md +134 -0
  315. package/skills/systematic-debugging/condition-based-waiting.md +64 -0
  316. package/skills/systematic-debugging/defense-in-depth.md +32 -0
  317. package/skills/systematic-debugging/root-cause-tracing.md +55 -0
  318. package/skills/test-driven-development/SKILL.md +167 -0
  319. package/skills/using-git-worktrees/SKILL.md +219 -0
  320. package/skills/using-superpowers/SKILL.md +54 -0
  321. package/skills/verification-before-completion/SKILL.md +140 -0
  322. package/skills/verify/SKILL.md +82 -0
  323. package/skills/writing-plans/SKILL.md +128 -0
  324. package/skills/writing-skills/SKILL.md +93 -0
@@ -0,0 +1,503 @@
1
+ # Research: Improving Existing Claude Code Agents
2
+
3
+ **Date:** 2026-02-23
4
+ **Status:** Complete
5
+ **Scope:** ~/.claude/agents/ — 8 existing agents
6
+
7
+ ---
8
+
9
+ ## BLUF
10
+
11
+ The 8 existing agents range from production-quality (lesson-scanner, counter) to underspecified (security-reviewer, doc-updater). Priority improvements fall into four categories: (1) add `model` fields to the 6 agents that inherit unnecessarily, (2) add `memory` fields to 3 agents that would benefit from cross-session learning, (3) tighten tool lists on 4 agents that are over-permissioned, (4) add explicit hallucination guards to the 2 audit agents.
12
+
13
+ ---
14
+
15
+ ## Sources
16
+
17
+ - [wshobson/agents](https://github.com/wshobson/agents) — 112-agent production system, plugin architecture, progressive disclosure skills
18
+ - [VoltAgent/awesome-claude-code-subagents](https://github.com/VoltAgent/awesome-claude-code-subagents) — 127+ agent community collection
19
+ - [0xfurai/claude-code-subagents](https://github.com/0xfurai/claude-code-subagents) — 100+ production-ready subagents
20
+ - [iannuttall/claude-agents](https://github.com/iannuttall/claude-agents) — custom agents collection
21
+ - [hesreallyhim/awesome-claude-code](https://github.com/hesreallyhim/awesome-claude-code) — curated skills/hooks/agents list
22
+ - [Claude Code Docs — Create custom subagents](https://code.claude.com/docs/en/sub-agents) — official frontmatter reference
23
+ - [PubNub — Best Practices for Claude Code Sub-Agents](https://www.pubnub.com/blog/best-practices-for-claude-code-sub-agents/) — tool constraints, hooks, error handling
24
+ - [PubNub — From Prompts to Pipelines](https://www.pubnub.com/blog/best-practices-claude-code-subagents-part-two-from-prompts-to-pipelines/) — agent chain patterns, artifact structure
25
+ - [Claude Docs — Reduce Hallucinations](https://platform.claude.com/docs/en/test-and-evaluate/strengthen-guardrails/reduce-hallucinations) — hallucination prevention
26
+ - [Adaline Labs — Ship Reliably with Claude Code](https://labs.adaline.ai/p/how-to-ship-reliably-with-claude-code) — governance patterns
27
+
28
+ ---
29
+
30
+ ## Key Findings from External Research
31
+
32
+ ### Frontmatter Capabilities Most Agents Are Not Using
33
+
34
+ From the official docs, these frontmatter fields exist and none of the current agents use them fully:
35
+
36
+ | Field | What it does | Agents missing it |
37
+ |-------|-------------|-------------------|
38
+ | `model` | Route to right model tier | security-reviewer, infra-auditor, doc-updater, lesson-scanner, notion-researcher, notion-writer |
39
+ | `memory` | Persistent cross-session learning | security-reviewer, lesson-scanner, infra-auditor |
40
+ | `maxTurns` | Hard stop against runaway execution | all agents |
41
+ | `isolation: worktree` | Isolated git context for write agents | doc-updater |
42
+ | `hooks` | Pre/post tool validation | infra-auditor, notion-writer |
43
+ | `permissionMode` | Controls how permission prompts are handled; the default is overly permissive for read-only agents | security-reviewer, infra-auditor, counter, counter-daily |
44
+
45
+ ### Tool Constraint Anti-Pattern: Omission = Full Inheritance
46
+
47
+ From PubNub research: "If you omit `tools`, you're implicitly granting access to all available tools." All 8 agents explicitly list tools, which is correct. However, several include write-capable tools (Edit, Write, Bash) when they only need read access.
48
+
49
+ - `counter.md` and `counter-daily.md` have `tools: Read, Grep, Glob` — correct, no write needed
50
+ - `security-reviewer.md` includes `Bash` — risky if used for active exploitation testing
51
+ - `infra-auditor.md` includes `Bash` — necessary for system checks, but should add `permissionMode: dontAsk` for writes
52
+
53
+ ### Hallucination Prevention Patterns
54
+
55
+ From Anthropic's official docs on reducing hallucinations:
56
+ 1. **Ground assertions in tool output** — agents should be required to cite specific grep/read results before any finding
57
+ 2. **Explicit "do not report what grep + read does not confirm"** instruction — lesson-scanner has this; security-reviewer and infra-auditor do not
58
+ 3. **Uncertainty declarations** — agents should say "I could not verify X" rather than inferring
59
+
60
+ ### Agent Chain Integration Patterns
61
+
62
+ From PubNub Part 2:
63
+ - Structured handoff artifacts: active-plan.md, implementation-summary.md, qa-summary.md
64
+ - Each agent returns a clean summary to the orchestrator, not raw logs
65
+ - Hook-based governance: PreToolUse for validation, PostToolUse for verification
66
+ - Plan → Execute → Verify pipeline as the canonical sequence
67
+
68
+ ### Model Selection Best Practice
69
+
70
+ From official docs and PubNub:
71
+ - Haiku: mechanical tasks, read-only searches, daily lightweight checks
72
+ - Sonnet: balanced analysis, multi-file operations
73
+ - Opus: complex reasoning, adversarial review, architecture critique
74
+
75
+ Current agents: counter correctly uses `model: opus`, and counter-daily correctly uses `model: sonnet`. The other 6 all inherit from the parent conversation, so they run on whatever model the user happens to be using: wasteful for lightweight agents and underpowered for analysis agents.
76
+
77
+ ### Persistent Memory Pattern
78
+
79
+ From official docs: `memory: user` gives agents a `~/.claude/agent-memory/<name>/` directory that persists across sessions. The agent's system prompt automatically includes the first 200 lines of MEMORY.md.
80
+
81
+ Agents that would benefit most from memory:
82
+ - **lesson-scanner**: could accumulate false-positive patterns per-project, avoid rescanning clean files
83
+ - **security-reviewer**: could remember known-safe patterns and previously flagged issues
84
+ - **infra-auditor**: could track baseline service states and flag deviations from that baseline instead of relying only on absolute thresholds
85
+
86
+ ---
87
+
88
+ ## Per-Agent Assessment and Improvements
89
+
90
+ ### 1. security-reviewer.md
91
+
92
+ **Current state:** Minimal (35 lines). Covers 4 vulnerability categories. Output format exists but is implicit. No hallucination guard. Web-focused (SQL injection, XSS) — misses Python/bash attack surfaces.
93
+
94
+ **Gap analysis:**
95
+ - No explicit "only report what the tools confirm" guardrail, so it can report findings on code it hasn't read
96
+ - Missing attack categories for Python/shell scripts: deserialization, subprocess injection, pickle loading, hardcoded secrets in environment variable fallbacks
97
+ - No `model` field (should be `sonnet`)
98
+ - No `memory` field — can't accumulate project-specific baseline
99
+ - Bash tool included but no guard against running exploits — should be `permissionMode: plan` or `dontAsk`
100
+ - Missing cryptography category: weak algorithms (MD5, SHA1), hardcoded salts, insecure random
101
+ - Output format has no "CLEAN" affirmation — leaves ambiguity about unreviewed files
102
+
103
+ **Recommended improvements:**
104
+
105
+ ```markdown
106
+ ---
107
+ name: security-reviewer
108
+ description: Reviews code for security vulnerabilities and sensitive data exposure. Use proactively after any code changes that touch authentication, data handling, file I/O, subprocess calls, or network requests.
109
+ tools: Read, Grep, Glob
110
+ model: sonnet
111
+ memory: project
112
+ permissionMode: plan
113
+ ---
114
+ ```
115
+
116
+ Changes:
117
+ 1. Remove `Bash` — not needed for read-only review; eliminates risk of active exploitation
118
+ 2. Add `model: sonnet` — analysis task, not opus-level reasoning
119
+ 3. Add `memory: project` — accumulate known-safe patterns and previously reviewed baselines
120
+ 4. Add `permissionMode: plan` — read-only mode, no writes
121
+ 5. Expand vulnerability categories:
122
+    - Add Python-specific: `pickle.loads()`, `eval()`, `exec()`, `subprocess` with `shell=True`
123
+    - Add cryptography: `hashlib.md5`, `hashlib.sha1`, `random.random()` in security context, hardcoded salts
124
+    - Add dependency chain: check `requirements.txt`, `package.json`, `Pipfile.lock` for known CVEs via `safety check` (this needs Bash, so only after re-adding Bash behind a hook guard)
125
+ 6. Add explicit hallucination guard: "Only report findings grounded in specific file:line evidence from Read/Grep output. If a grep returns no matches, record the category as CLEAN — do not infer."
126
+ 7. Add structured `CLEAN` section to output format
127
+
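The expanded categories and the `CLEAN` rule above can be approximated with a small grep wrapper. This is only a sketch under stated assumptions: `scan_category` and `scan_python_risks` are hypothetical helper names, the regexes are illustrative rather than exhaustive, and the agent itself would use its Grep tool rather than a script.

```shell
#!/usr/bin/env bash
# Sketch: print file:line evidence per category, or CLEAN when grep finds nothing.
# Helper names and patterns are illustrative assumptions, not the agent's real rules.

scan_category() {
  local label="$1" pattern="$2" dir="$3"
  local hits
  # grep exits 1 on "no matches"; "|| true" keeps that from looking like a failure.
  hits=$(grep -rnE --include='*.py' -- "$pattern" "$dir" || true)
  if [ -z "$hits" ]; then
    echo "CLEAN: $label"
  else
    echo "FINDING: $label"
    echo "$hits"   # each line is path:line:matched text
  fi
}

scan_python_risks() {
  local dir="${1:-.}"
  scan_category "pickle deserialization" 'pickle\.loads?\(' "$dir"
  # Exclude names like literal_eval( and method calls like model.eval(
  scan_category "eval/exec" '(^|[^[:alnum:]_.])(eval|exec)\(' "$dir"
  scan_category "subprocess shell=True" 'shell[[:space:]]*=[[:space:]]*True' "$dir"
  scan_category "weak hash" 'hashlib\.(md5|sha1)\(' "$dir"
}
```

Calling `scan_python_risks path/to/project` prints one `FINDING` or `CLEAN` line per category, which maps directly onto the structured `CLEAN` section proposed for the report.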
128
+ ### 2. infra-auditor.md
129
+
130
+ **Current state:** Well-specified (77 lines). Clear check categories, concrete commands, good report format. Strong baseline.
131
+
132
+ **Gap analysis:**
133
+ - No `model` field (should be `haiku` — mechanical checks, not reasoning)
134
+ - No `maxTurns` — could loop indefinitely if a service check hangs
135
+ - Missing checks: memory slice caps (the systemd-oomd and user-1000 slice are defined in CLAUDE.md), ollama-queue service, open-webui health
136
+ - `systemctl --user is-active` for 6 services — correct, but missing the timer units (21 timers)
137
+ - Sync freshness check uses `stat -c '%Y'` (epoch) but compares to nothing — needs `$(date +%s)` math
138
+ - No hook to validate bash commands before execution (adds risk if agent hallucinates a destructive command)
139
+ - Missing: `journalctl --user -u <service> --since "1 hour ago" --no-pager` for recent errors on unhealthy services
140
+
141
+ **Recommended improvements:**
142
+
143
+ ```yaml
144
+ model: haiku
145
+ maxTurns: 30
146
+ hooks:
147
+   PreToolUse:
148
+     - matcher: "Bash"
149
+       hooks:
150
+         - type: command
151
+           command: "~/.claude/hooks/validate-readonly-bash.sh"
152
+ ```
153
+
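The document names `validate-readonly-bash.sh` but does not show it. One way such a guard might look is sketched below. The deny-list, the function names, and the choice of exit status 2 as the "block" signal are all assumptions to check against the hooks reference, and a deny-list like this is a convenience check, not a security boundary.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a read-only guard for Bash tool calls.
# The patterns are illustrative and incomplete (for example, output redirection is not checked).

is_destructive_command() {
  local cmd="$1"
  # Common mutating commands, matched as whole words.
  local deny='(^|[;&|[:space:]])(rm|mv|dd|mkfs|shutdown|reboot|kill|pkill)([[:space:]]|$)'
  # systemctl verbs that change unit state (is-active, status, show stay allowed).
  local unit_write='systemctl([[:space:]]+--user)?[[:space:]]+(start|stop|restart|enable|disable|mask)'
  if printf '%s\n' "$cmd" | grep -qE -- "$deny"; then return 0; fi
  if printf '%s\n' "$cmd" | grep -qE -- "$unit_write"; then return 0; fi
  return 1
}

validate_readonly_bash() {
  local cmd="$1"
  if is_destructive_command "$cmd"; then
    echo "Blocked: command looks like it modifies state: $cmd" >&2
    return 2   # assumed to be the status that blocks the tool call
  fi
  return 0
}
```

In a real hook script the command string would first have to be extracted from the hook's input (assumed here to be JSON on stdin, for example with `jq -r '.tool_input.command'`) and the function's status passed to `exit`.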
154
+ Specific content additions:
155
+ 1. Add timer audit: `systemctl --user list-timers --no-pager` — check that all 21 timers are active
156
+ 2. Fix sync freshness math: `NOW=$(date +%s); SYNC=$(stat -c '%Y' file); echo $((NOW - SYNC))` (prints the file's age in seconds)
157
+ 3. Add ollama-queue service check: `curl -s http://127.0.0.1:7683/health`
158
+ 4. Add memory slice check: `systemctl show user-1000.slice --property=MemoryHigh`
159
+ 5. Add hallucination guard: "Only report the output of commands you actually executed. Do not infer service health without running the check."
160
+ 6. Add journal check for any unhealthy service before escalating to CRITICAL
161
+
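The freshness fix in item 2 can be wrapped so the agent compares the age against a threshold instead of printing a bare epoch. This is a minimal sketch assuming GNU `stat` (the `-c '%Y'` form used above); `file_age_seconds`, `is_stale`, and the threshold value are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compute a file's age from its mtime and compare it to a maximum age.
# Assumes GNU coreutils stat; function names are illustrative.

file_age_seconds() {
  local file="$1" now mtime
  now=$(date +%s)
  mtime=$(stat -c '%Y' "$file") || return 1   # fails if the file is missing
  echo $((now - mtime))
}

is_stale() {
  # Returns 0 if the file is older than max_age seconds, 1 if fresh, 2 on error.
  local file="$1" max_age="$2" age
  age=$(file_age_seconds "$file") || return 2
  [ "$age" -gt "$max_age" ]
}
```

For example, `is_stale "$sync_marker" 3600` would flag a sync marker that has not been touched in the last hour; the right threshold depends on how often the sync is expected to run.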
162
+ ### 3. doc-updater.md
163
+
164
+ **Current state:** Well-structured (40 lines). Context hierarchy table is excellent. CLAUDE.md chain enforcement is the right mental model.
165
+
166
+ **Gap analysis:**
167
+ - No `model` field (should be `sonnet` — needs to reason about content placement)
168
+ - No `isolation: worktree` — doc writes could corrupt staging area (Lesson #44 parallel agent concern)
169
+ - `git diff HEAD~1` only looks at last commit — misses uncommitted changes; should use `git diff HEAD` and `git status --short` together
170
+ - No check for MEMORY.md line count (stated in the rules but no scan instruction)
171
+ - Missing: validate that CLAUDE.md files don't contain hardcoded secrets (should grep for IP addresses, tokens)
172
+ - No output format — the agent makes changes but returns no structured summary of what was changed and why
173
+ - Write tool is included — needs explicit guard against writing to CLAUDE.md files it hasn't read first (lesson #file-editing from CLAUDE.md)
174
+
175
+ **Recommended improvements:**
176
+
177
+ ```yaml
178
+ model: sonnet
179
+ isolation: worktree
180
+ ```
181
+
182
+ Content additions:
183
+ 1. Add explicit scan sequence:
184
+    - Step 0: `git status --short && git diff HEAD --name-only` (catch both staged and unstaged)
185
+    - Add MEMORY.md line count check: `wc -l ~/.claude/projects/.../memory/MEMORY.md`
186
+ 2. Add output format:
187
+ ```
188
+ ## Doc Update Summary
189
+ Files reviewed: [list]
190
+ Files modified: [list with reason]
191
+ Duplication removed: [what and where]
192
+ No-op: [what needed no change and why]
193
+ ```
194
+ 3. Add security check: before writing, grep new content for IP addresses, tokens, credentials
195
+ 4. Add explicit "read before write" rule — must Read the target file before Edit/Write
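
As a rough illustration of the security check in item 3, the sketch below scans candidate text for IPv4-looking strings and credential-style assignments. Both regular expressions and the `find_sensitive` name are assumptions made for this example; they are not taken from doc-updater and will produce false positives (for instance on four-part version numbers).

```python
import re

# Assumed patterns, for illustration only.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
CREDENTIAL = re.compile(
    r"(?:api[_-]?key|token|secret|password)\w*\s*[:=]\s*\S+",
    re.IGNORECASE,
)


def find_sensitive(text):
    """Return (line_number, kind) pairs for lines that look sensitive."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if IPV4.search(line):
            findings.append((lineno, "ip-address"))
        if CREDENTIAL.search(line):
            findings.append((lineno, "credential"))
    return findings
```

A non-empty result would be a reason to stop and report before writing, rather than proof that a secret is present.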
196
+
197
+ ### 4. lesson-scanner.md
198
+
199
+ **Current state:** Excellent (294 lines). Most mature agent in the set. Structured scan groups, explicit patterns, hallucination guard already present, clean report format. This is the reference implementation.
200
+
201
+ **Gap analysis:**
202
+ - Description says "53 lessons" — now 66 lessons (stale count)
203
+ - No `model` field (should be `sonnet` — pattern matching and analysis, not Opus-level)
204
+ - No `memory: project` — could cache "clean file" hashes to skip unchanged files on repeat runs
205
+ - Scan Group coverage gaps vs. current lesson set:
206
+ - Missing Lessons #60-66 (research-derived, added 2026-02-21): plan quality, spec compliance, positive instructions, lesson scope, context placement
207
+ - Missing Lesson #51: `.venv/bin/pip` vs `.venv/bin/python -m pip` (hookify warns but scanner should flag too)
208
+ - Missing Lesson #50: plan assertion math (if scanner runs on docs/plans/*.md)
209
+ - Missing Lesson #26: unit boundary verification
210
+ - Scan Group 4a (duplicate function names) has a false-positive threshold of 3 files — should be configurable
211
+
212
+ **Recommended improvements:**
213
+
214
+ ```yaml
215
+ model: sonnet
216
+ memory: project
217
+ ```
218
+
219
+ Content additions:
220
+ 1. Update description count: "66 lessons" (from 53)
221
+ 2. Add Scan Group 7: Plan Quality (Lessons #60-66):
222
+ - Scan `docs/plans/*.md` for missing hypothesis statements, missing acceptance criteria, missing success metrics
223
+ - Pattern: check for "hypothesis:" or "we believe" keywords — absence is a flag
224
+ - Pattern: check for "acceptance criteria" section — absence is Should-Fix
225
+ 3. Add Scan 3f: `.venv/bin/pip` usage (Lesson #51):
226
+ ```
227
+ pattern: \.venv/bin/pip\b
228
+ glob: **/*.{sh,md,py}
229
+ ```
230
+ Flag as Should-Fix with fix: use `.venv/bin/python -m pip`
231
+ 4. Add memory instruction: "After each scan, write a one-line entry to MEMORY.md noting the project path, timestamp, and blocker count. On repeat scans, check memory first — if a file has not changed since last scan and had no blockers, skip it."
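
To show what the Scan 3f pattern does and does not match, here is a small Python check using the same regular expression. The `flag_pip_usage` helper is hypothetical; the scanner itself applies the pattern through its own grep step.

```python
import re

# Same pattern as Scan 3f.
PIP_DIRECT = re.compile(r"\.venv/bin/pip\b")


def flag_pip_usage(lines):
    """Return (line_number, stripped_line) for lines that call .venv/bin/pip directly."""
    return [
        (number, line.strip())
        for number, line in enumerate(lines, start=1)
        if PIP_DIRECT.search(line)
    ]
```

Because of the trailing `\b`, the pattern flags `.venv/bin/pip install ...` but not `.venv/bin/python -m pip ...`, and it also does not match `.venv/bin/pip3`; whether `pip3` should be flagged is a separate decision for the scan definition.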
232
+
233
+ ### 5. counter.md
234
+
235
+ **Current state:** Exceptional (466 lines). Most sophisticated agent in the set. Psychological grounding, four lenses, lean gate, wildcard, human contact gate, severity system, critical rules. This is a complete system.
236
+
237
+ **Gap analysis:**
238
+ - No `maxTurns` — a review could spiral into exhaustive analysis; 20 turns is sufficient for any review
239
+ - The `Discovered Patterns` section at the bottom is the right pattern but has no reminder to check it — the agent could skip it on automatic runs
240
+ - No reference to Lessons #60-66 in the Bias Detection section — "Lesson regression" check (Lens 2) should include the research-derived clusters E and F
241
+ - `~/.claude/counter-humans.md` is referenced but if this file doesn't exist the human contact gate silently fails
242
+ - Missing: the agent has no instruction to check whether it is being invoked recursively (a counter reviewing counter output creates an echo chamber)
243
+
244
+ **Recommended improvements:**
245
+
246
+ 1. Add `maxTurns: 20` to frontmatter
247
+ 2. Add Cluster E and F to the Lesson Regression check in Lens 2:
248
+ ```
249
+ "Lesson regression — mental grep against all 6 clusters:
250
+ A (silent failures), B (integration boundaries), C (cold-start),
251
+ D (specification drift), E (context & retrieval — info buried or misscoped),
252
+ F (planning & control flow — wrong decomposition contaminates downstream)"
253
+ ```
254
+ 3. Add check at top of Discovered Patterns section: "Before reviewing, scan Discovered Patterns for any pattern matching the input type."
255
+ 4. Add guard: "If the input being reviewed is itself a Counter output or review of a review, flag this to the user before proceeding — adversarial review of adversarial review creates false certainty."
256
+
257
+ ### 6. counter-daily.md
258
+
259
+ **Current state:** Well-calibrated (66 lines). Tight scope, correct model, no padding.
260
+
261
+ **Gap analysis:**
262
+ - No `maxTurns` — should be 5 (three questions, acknowledgment, done)
263
+ - Missing question pool entry for "Lesson regression" gap — the daily check could include "Did you repeat a known failure pattern today?" as an optional question
264
+ - The defaults fire when no context is provided — but if the user provides partial context, question selection logic is vague ("pick the three most relevant")
265
+ - No output structure at all — questions are unformatted, which is correct for this agent, but there's no instruction about follow-up behavior if Justin responds
266
+
267
+ **Recommended improvements:**
268
+
269
+ 1. Add `maxTurns: 5`
270
+ 2. Add one question to each pool as options:
271
+ - Collaboration: "Did you make any decision today based on a lesson you've documented but ignored anyway?"
272
+ - Focus: "What would have changed if you'd checked Lessons SUMMARY.md before starting today's main task?"
273
+ 3. Add behavior rule: "If Justin responds to the questions, acknowledge once and stop. Do not analyze the response. Do not follow up with more questions. That's the full counter's job."
274
+
275
+ ### 7. notion-researcher.md
276
+
277
+ **Current state:** Well-structured (77 lines). Search strategy hierarchy is excellent. Content domain shortcuts are high-value. Synthesis rules are correct.
278
+
279
+ **Gap analysis:**
280
+ - No `model` field (should be `sonnet` — cross-database synthesis, not mechanical lookup)
281
+ - `tools: Read, Grep, Glob, Bash` is correct as-is: Bash is needed for the `notion-vector-search` CLI (no change required)
282
+ - No `maxTurns` — large Notion workspaces could cause runaway exploration; limit to 40 turns
283
+ - Staleness check instruction is there but weak — "if freshness matters" is vague; should always check if data is >12 hours old
284
+ - No citation format standardization — the output rule says "cite sources" but doesn't specify format; the main session can't parse inconsistent citations
285
+ - Missing: the agent should check `~/Documents/notion/CLAUDE.md` exists before starting — if Notion sync has never run, the file may not exist
286
+ - Missing: if `notion-vector-search` returns 0 results, the agent has no fallback instruction (will hallucinate or stop)
287
+
288
+ **Recommended improvements:**
289
+
290
+ ```yaml
291
+ model: sonnet
292
+ maxTurns: 40
293
+ ```
294
+
295
+ Content additions:
296
+ 1. Standardize citation format:
297
+ ```
298
+ Source: [Database/Page Name] | ID: {uuid} | Updated: {date}
299
+ ```
300
+ 2. Add vector search fallback: "If `notion-vector-search` returns 0 results, fall back to Grep with decomposed keyword terms before concluding the topic is not in Notion."
301
+ 3. Strengthen staleness check: always run `stat` on sync metadata at start and include age in output — do not wait for "freshness matters"
302
+ 4. Add guard: "Check that `~/Documents/notion/CLAUDE.md` exists before searching. If it doesn't exist, report: 'Notion local replica not found — run notion-sync first.'"
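
One way to see why a fixed citation format helps the main session is to write the formatter and parser side by side. This sketch assumes the square brackets and braces in the format above are placeholders rather than literal characters; `format_citation` and `parse_citation` are hypothetical names.

```python
import re

# Assumes the layout "Source: <name> | ID: <id> | Updated: <date>".
CITATION = re.compile(
    r"Source: (?P<name>.+?) \| ID: (?P<id>\S+) \| Updated: (?P<updated>\S+)"
)


def format_citation(name, page_id, updated):
    """Render one citation line in the standardized layout."""
    return f"Source: {name} | ID: {page_id} | Updated: {updated}"


def parse_citation(line):
    """Return the citation fields as a dict, or None if the line does not conform."""
    match = CITATION.fullmatch(line.strip())
    return match.groupdict() if match else None
```

A free-form citation such as "see the roadmap page" would return `None`, which gives the caller a clear signal that the source could not be parsed.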
303
+
304
+ ### 8. notion-writer.md
305
+
306
+ **Current state:** Functional (115 lines). Complete API reference, good property formats, batch operation example, SQLite sync instruction.
307
+
308
+ **Gap analysis:**
309
+ - No `model` field (should be `haiku` — mechanical API calls, not reasoning)
310
+ - No `maxTurns` — should be 20 to prevent runaway batch operations
311
+ - Rate limit handling is documented but has no instruction for what to do after hitting rate limit beyond "wait and retry" — should include exponential backoff
312
+ - No input validation instruction: if called with a missing database ID, the agent will attempt the API call and get a cryptic 404
313
+ - The SQLite sync step is noted as "when creating pages from capture bot" — but the agent has no way to know which origin triggered it; it should always offer to sync
314
+ - No rollback instruction — if a batch create fails midway, the agent has no guidance on how to identify which pages were created vs. not
315
+ - Missing: the agent should verify `NOTION_API_KEY` is set before first API call, not discover it's missing on first 401
316
+
317
+ **Recommended improvements:**
318
+
319
+ ```yaml
320
+ model: haiku
321
+ maxTurns: 20
322
+ hooks:
323
+ PreToolUse:
324
+ - matcher: "Bash"
325
+ hooks:
326
+ - type: command
327
+ command: "~/.claude/hooks/validate-api-key.sh NOTION_API_KEY"
328
+ ```
329
+
330
+ Content additions:
331
+ 1. Add pre-flight check: "Before any API call, verify `NOTION_API_KEY` is set: `bash -c 'source ~/.env && [ -n \"$NOTION_API_KEY\" ] && echo OK || echo MISSING'`"
332
+ 2. Add input validation: "Before calling any API with a database ID, check that the ID matches UUID format (8-4-4-4-12 hex). If not, stop and report the malformed ID."
333
+ 3. Add exponential backoff for 429: start with `sleep $((retry_after + 1))`, then double the delay on each subsequent retry
334
+ 4. Add batch operation tracking: maintain a local list of successfully created page IDs during batch operations; if an error occurs, report "Created N of M pages: [list of IDs]"
335
+ 5. Add SQLite sync offer: always end with "Run `notion-sync --page PAGE_ID` to refresh local replica for each created page?"
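
The pre-flight, validation, and backoff items above can be sketched in Python as follows. This is an illustration under stated assumptions, not the agent's implementation: the helper names are hypothetical, and the doubling schedule is one reading of the backoff rule in item 3.

```python
import os
import re

# 8-4-4-4-12 hexadecimal groups, as described in item 2.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def is_uuid(value):
    """True when `value` is exactly one hyphenated 8-4-4-4-12 hex ID."""
    return UUID_PATTERN.fullmatch(value) is not None


def api_key_present(env=None):
    """True when NOTION_API_KEY is set and non-empty (item 1)."""
    env = os.environ if env is None else env
    return bool(env.get("NOTION_API_KEY"))


def backoff_delays(retry_after, retries):
    """Delays in seconds: retry_after + 1 first, doubled on each later retry."""
    first = retry_after + 1
    return [first * (2 ** attempt) for attempt in range(retries)]
```

For example, a `Retry-After` of 2 seconds with three retries yields waits of 3, 6, and 12 seconds.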
336
+
337
+ ---
338
+
339
+ ## Cross-Cutting Patterns
340
+
341
+ ### Pattern 1: Hallucination Guard Template
342
+
343
+ Every audit/review agent (security-reviewer, infra-auditor, lesson-scanner) should include this as its final instruction:
344
+
345
+ ```
346
+ ## Anti-Hallucination Rules
347
+
348
+ - Report ONLY what Grep/Read/Bash output directly confirms.
349
+ - If a scan group returns no grep matches, record it as CLEAN — do not infer vulnerabilities.
350
+ - If you are uncertain about a finding, read more context before flagging — do not flag based on pattern proximity alone.
351
+ - If a command fails or returns no output, report "Could not verify: [check name]" rather than assuming pass or fail.
352
+ ```
353
+
354
+ lesson-scanner already has a version of this. security-reviewer and infra-auditor need it added.
355
+
356
+ ### Pattern 2: Model Tier Alignment
357
+
358
+ Current state vs. correct assignment:
359
+
360
+ | Agent | Current | Should Be | Reason |
361
+ |-------|---------|-----------|--------|
362
+ | security-reviewer | inherit | sonnet | Multi-file analysis |
363
+ | infra-auditor | inherit | haiku | Mechanical checks |
364
+ | doc-updater | inherit | sonnet | Content reasoning |
365
+ | lesson-scanner | inherit | sonnet | Pattern analysis |
366
+ | counter | opus | opus | Correct — adversarial reasoning |
367
+ | counter-daily | sonnet | sonnet | Correct — lightweight |
368
+ | notion-researcher | inherit | sonnet | Cross-database synthesis |
369
+ | notion-writer | inherit | haiku | Mechanical API calls |
370
+
371
+ ### Pattern 3: maxTurns as Safety Net
372
+
373
+ None of the current agents set `maxTurns`. Per official docs, this is a hard stop on runaway execution. Recommended values:
374
+
375
+ | Agent | maxTurns | Reason |
376
+ |-------|----------|--------|
377
+ | security-reviewer | 50 | May scan many files |
378
+ | infra-auditor | 30 | ~20 discrete checks |
379
+ | doc-updater | 20 | Few files to read+write |
380
+ | lesson-scanner | 80 | 6 scan groups × many files |
381
+ | counter | 20 | Review, not analysis marathon |
382
+ | counter-daily | 5 | 3 questions only |
383
+ | notion-researcher | 40 | May explore many pages |
384
+ | notion-writer | 20 | Bounded by batch size |
385
+
386
+ ### Pattern 4: Memory for Audit Agents
387
+
388
+ Three agents would benefit most from `memory: project`:
389
+
390
+ - **lesson-scanner**: Cache scan results per file hash; skip unchanged clean files on repeat runs. This reduces the scan work on repeat runs from O(project_size) to O(changed_files).
391
+ - **security-reviewer**: Store baseline of known-safe patterns (e.g., "this project uses parameterized queries throughout — SQL injection is mitigated at the ORM layer"). Avoid re-flagging architecturally sound patterns.
392
+ - **infra-auditor**: Store service baseline state. Flag deviations from baseline rather than absolute thresholds. Reduces false positives on expected service restarts.
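
The lesson-scanner caching idea can be sketched as a lookup keyed by content hash. The cache layout (`hash` and `blockers` fields) and the function names are assumptions for this example; the document does not specify how the agent's memory is structured.

```python
import hashlib


def file_digest(data):
    """SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(data).hexdigest()


def files_to_scan(current, cache):
    """Return the paths that still need scanning.

    current: {path: file_bytes} for the project as it is now.
    cache:   {path: {"hash": str, "blockers": int}} from the previous run
             (hypothetical layout).
    A file is skipped only if its hash is unchanged AND it had no blockers.
    """
    pending = []
    for path, data in current.items():
        entry = cache.get(path)
        if entry and entry["hash"] == file_digest(data) and entry["blockers"] == 0:
            continue
        pending.append(path)
    return sorted(pending)
```

Files that previously had blockers are rescanned even when unchanged, so open findings keep appearing in the report until they are fixed.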
393
+
394
+ ### Pattern 5: Description Quality
395
+
396
+ The `description` field is how Claude decides when to delegate. Current descriptions vary in specificity:
397
+
398
+ **Weak** (won't trigger delegation reliably):
399
+ - `security-reviewer`: "Reviews code for security vulnerabilities and sensitive data exposure" — no trigger phrase
400
+ - `doc-updater`: "Reviews recent changes and updates documentation" — no trigger phrase
401
+
402
+ **Strong** (explicit invocation triggers):
403
+ - `lesson-scanner`: "Scans codebase for anti-patterns... Dispatched via /audit lessons against any Python/JS/TS project root" — explicit dispatch instruction
404
+ - `notion-researcher`: "Use this agent when answering questions that require reading multiple Notion files..." — clear use-case examples
405
+
406
+ All agents should include: "Use proactively when..." or "Dispatch when..." with specific trigger conditions.
407
+
408
+ ### Pattern 6: Tool Minimization
409
+
410
+ Per PubNub: "Be intentional" about tools. Current over-permissions:
411
+
412
+ - `security-reviewer` has `Bash` — remove it; read-only review needs only Read/Grep/Glob
413
+ - `infra-auditor` has `Bash` — keep it (needed for system checks), but add PreToolUse hook to validate no destructive commands
414
+ - `doc-updater` has `Edit, Write, Bash` — all justified, but add read-before-write rule
415
+
416
+ ---
417
+
418
+ ## Priority-Ordered Action List
419
+
420
+ ### P0 — Correctness (prevents wrong output)
421
+
422
+ 1. **Add hallucination guards to security-reviewer and infra-auditor** — these agents report findings that drive action; false findings are costly
423
+ 2. **Fix infra-auditor sync freshness math** — current `stat -c '%Y'` comparison is broken without `$(date +%s)` delta math
424
+ 3. **Remove Bash from security-reviewer** — read-only review should not have shell execution; eliminates active-exploitation risk
425
+ 4. **Update lesson-scanner description count** — "53 lessons" is stale; now 66
426
+
427
+ ### P1 — Quality (prevents waste or confusion)
428
+
429
+ 5. **Add `model` fields to all 6 agents missing them** (prevents sonnet-scale or opus-scale tasks from being routed to haiku by accident)
430
+ 6. **Add `maxTurns` to all agents** (prevents runaway execution; use the values in the Pattern 3 table)
431
+ 7. **Add explicit trigger phrases to security-reviewer and doc-updater descriptions** — delegation won't activate reliably without them
432
+ 8. **Fix doc-updater git diff command** — `HEAD~1` misses uncommitted changes; use `git status --short && git diff HEAD`
433
+
434
+ ### P2 — Capability (adds meaningful new features)
435
+
436
+ 9. **Add `memory: project` to lesson-scanner** — caching clean-file results transforms repeat scan performance
437
+ 10. **Add Scan Group 7 (Plan Quality, Lessons #60-66) to lesson-scanner** — research-derived lessons are not currently scanned
438
+ 11. **Add Scan 3f (`.venv/bin/pip`, Lesson #51) to lesson-scanner** — hookify warns but scanner should also flag
439
+ 12. **Add Clusters E and F to counter Bias Detection (Lens 2)** — lesson regression check is incomplete without them
440
+ 13. **Add notion-researcher vector search fallback** — zero-result behavior is undefined
441
+ 14. **Add notion-writer pre-flight API key check** — currently discovers missing key on first 401
442
+
443
+ ### P3 — Polish (reduces friction)
444
+
445
+ 15. **Add structured output format to doc-updater** — currently makes changes but returns no summary
446
+ 16. **Add counter-daily follow-up behavior rule** — "acknowledge once and stop" prevents it from morphing into a full counter session
447
+ 17. **Add notion-writer batch operation tracking** — partial failure currently leaves ambiguous state
448
+ 18. **Add `memory: project` to security-reviewer** — baseline known-safe patterns across sessions
449
+ 19. **Add `isolation: worktree` to doc-updater** — protects staging area during CLAUDE.md writes
450
+ 20. **Add counter `maxTurns: 20`** — prevents review sessions from becoming analysis marathons
451
+
452
+ ---
453
+
454
+ ## Agent Chain Integration Opportunities
455
+
456
+ Three natural agent chains exist that are not currently wired:
457
+
458
+ ### Chain 1: Code Change Pipeline
459
+ ```
460
+ [code change committed]
461
+ → security-reviewer (read-only scan, report findings)
462
+ → lesson-scanner (pattern audit, report violations)
463
+ → doc-updater (update CLAUDE.md + README if needed)
464
+ ```
465
+ Currently these run independently. Wiring via a slash command or hook would create a single `/post-commit-audit` that runs all three.
466
+
467
+ ### Chain 2: Notion Research → Write
468
+ ```
469
+ [user asks a Notion question]
470
+ → notion-researcher (explore, synthesize, return citations)
471
+ → notion-writer (create capture page with findings if requested)
472
+ ```
473
+ Currently the user manually switches between agents. The researcher's output format should be designed to be directly consumable by the writer.
474
+
475
+ ### Chain 3: Counter → doc-updater
476
+ ```
477
+ [counter reviews a plan and finds issues]
478
+ → counter returns critique with specific gaps
479
+ → doc-updater updates the plan doc with flagged items
480
+ ```
481
+ This requires a structural change to counter's output format: it must include actionable file:line references compatible with doc-updater's input format.
482
+
483
+ ---
484
+
485
+ ## Appendix: Official Frontmatter Reference (as of 2026-02-23)
486
+
487
+ From [Claude Code Docs](https://code.claude.com/docs/en/sub-agents):
488
+
489
+ | Field | Required | Default | Notes |
490
+ |-------|----------|---------|-------|
491
+ | `name` | Yes | — | Lowercase + hyphens |
492
+ | `description` | Yes | — | Delegation trigger text |
493
+ | `tools` | No | All inherited | Omitting = full inheritance (dangerous) |
494
+ | `disallowedTools` | No | — | Blocklist from inherited set |
495
+ | `model` | No | inherit | sonnet/opus/haiku/inherit |
496
+ | `permissionMode` | No | default | default/acceptEdits/dontAsk/bypassPermissions/plan |
497
+ | `maxTurns` | No | unlimited | Hard stop on agentic turns |
498
+ | `skills` | No | — | Inject skill content at startup |
499
+ | `mcpServers` | No | — | MCP servers available to subagent |
500
+ | `hooks` | No | — | Lifecycle hooks scoped to subagent |
501
+ | `memory` | No | — | user/project/local |
502
+ | `background` | No | false | Always run as background task |
503
+ | `isolation` | No | — | worktree = isolated git context |