autonomous-coding-toolkit 1.0.0

Files changed (324)
  1. package/.claude-plugin/marketplace.json +22 -0
  2. package/.claude-plugin/plugin.json +13 -0
  3. package/LICENSE +21 -0
  4. package/Makefile +21 -0
  5. package/README.md +140 -0
  6. package/SECURITY.md +28 -0
  7. package/agents/bash-expert.md +113 -0
  8. package/agents/dependency-auditor.md +138 -0
  9. package/agents/integration-tester.md +120 -0
  10. package/agents/lesson-scanner.md +149 -0
  11. package/agents/python-expert.md +179 -0
  12. package/agents/service-monitor.md +141 -0
  13. package/agents/shell-expert.md +147 -0
  14. package/benchmarks/runner.sh +147 -0
  15. package/benchmarks/tasks/01-rest-endpoint/rubric.sh +29 -0
  16. package/benchmarks/tasks/01-rest-endpoint/task.md +17 -0
  17. package/benchmarks/tasks/02-refactor-module/task.md +8 -0
  18. package/benchmarks/tasks/03-fix-integration-bug/task.md +8 -0
  19. package/benchmarks/tasks/04-add-test-coverage/task.md +8 -0
  20. package/benchmarks/tasks/05-multi-file-feature/task.md +8 -0
  21. package/bin/act.js +238 -0
  22. package/commands/autocode.md +6 -0
  23. package/commands/cancel-ralph.md +18 -0
  24. package/commands/code-factory.md +53 -0
  25. package/commands/create-prd.md +55 -0
  26. package/commands/ralph-loop.md +18 -0
  27. package/commands/run-plan.md +117 -0
  28. package/commands/submit-lesson.md +122 -0
  29. package/docs/ARCHITECTURE.md +630 -0
  30. package/docs/CONTRIBUTING.md +125 -0
  31. package/docs/lessons/0001-bare-exception-swallowing.md +34 -0
  32. package/docs/lessons/0002-async-def-without-await.md +28 -0
  33. package/docs/lessons/0003-create-task-without-callback.md +28 -0
  34. package/docs/lessons/0004-hardcoded-test-counts.md +28 -0
  35. package/docs/lessons/0005-sqlite-without-closing.md +33 -0
  36. package/docs/lessons/0006-venv-pip-path.md +27 -0
  37. package/docs/lessons/0007-runner-state-self-rejection.md +35 -0
  38. package/docs/lessons/0008-quality-gate-blind-spot.md +33 -0
  39. package/docs/lessons/0009-parser-overcount-empty-batches.md +36 -0
  40. package/docs/lessons/0010-local-outside-function-bash.md +33 -0
  41. package/docs/lessons/0011-batch-tests-for-unimplemented-code.md +36 -0
  42. package/docs/lessons/0012-api-markdown-unescaped-chars.md +33 -0
  43. package/docs/lessons/0013-export-prefix-env-parsing.md +33 -0
  44. package/docs/lessons/0014-decorator-registry-import-side-effect.md +43 -0
  45. package/docs/lessons/0015-frontend-backend-schema-drift.md +43 -0
  46. package/docs/lessons/0016-event-driven-cold-start-seeding.md +44 -0
  47. package/docs/lessons/0017-copy-paste-logic-diverges.md +43 -0
  48. package/docs/lessons/0018-layer-passes-pipeline-broken.md +45 -0
  49. package/docs/lessons/0019-systemd-envfile-ignores-export.md +41 -0
  50. package/docs/lessons/0020-persist-state-incrementally.md +44 -0
  51. package/docs/lessons/0021-dual-axis-testing.md +48 -0
  52. package/docs/lessons/0022-jsx-factory-shadowing.md +43 -0
  53. package/docs/lessons/0023-static-analysis-spiral.md +51 -0
  54. package/docs/lessons/0024-shared-pipeline-implementation.md +55 -0
  55. package/docs/lessons/0025-defense-in-depth-all-entry-points.md +65 -0
  56. package/docs/lessons/0026-linter-no-rules-false-enforcement.md +54 -0
  57. package/docs/lessons/0027-jsx-silent-prop-drop.md +64 -0
  58. package/docs/lessons/0028-no-infrastructure-in-client-code.md +49 -0
  59. package/docs/lessons/0029-never-write-secrets-to-files.md +61 -0
  60. package/docs/lessons/0030-cache-merge-not-replace.md +62 -0
  61. package/docs/lessons/0031-verify-units-at-boundaries.md +66 -0
  62. package/docs/lessons/0032-module-lifecycle-subscribe-unsubscribe.md +89 -0
  63. package/docs/lessons/0033-async-iteration-mutable-snapshot.md +72 -0
  64. package/docs/lessons/0034-caller-missing-await-silent-discard.md +65 -0
  65. package/docs/lessons/0035-duplicate-registration-silent-overwrite.md +85 -0
  66. package/docs/lessons/0036-websocket-dirty-disconnect.md +33 -0
  67. package/docs/lessons/0037-parallel-agents-worktree-corruption.md +31 -0
  68. package/docs/lessons/0038-subscribe-no-stored-ref.md +36 -0
  69. package/docs/lessons/0039-fallback-or-default-hides-bugs.md +34 -0
  70. package/docs/lessons/0040-event-firehose-filter-first.md +36 -0
  71. package/docs/lessons/0041-ambiguous-base-dir-path-nesting.md +32 -0
  72. package/docs/lessons/0042-spec-compliance-insufficient.md +36 -0
  73. package/docs/lessons/0043-exact-count-extensible-collections.md +32 -0
  74. package/docs/lessons/0044-relative-file-deps-worktree.md +39 -0
  75. package/docs/lessons/0045-iterative-design-improvement.md +33 -0
  76. package/docs/lessons/0046-plan-assertion-math-bugs.md +38 -0
  77. package/docs/lessons/0047-pytest-single-threaded-default.md +37 -0
  78. package/docs/lessons/0048-integration-wiring-batch.md +40 -0
  79. package/docs/lessons/0049-ab-verification.md +41 -0
  80. package/docs/lessons/0050-editing-sourced-files-during-execution.md +33 -0
  81. package/docs/lessons/0051-infrastructure-fixes-cant-self-heal.md +30 -0
  82. package/docs/lessons/0052-uncommitted-changes-poison-quality-gates.md +31 -0
  83. package/docs/lessons/0053-jq-compact-flag-inconsistency.md +31 -0
  84. package/docs/lessons/0054-parser-matches-inside-code-blocks.md +30 -0
  85. package/docs/lessons/0055-agents-compensate-for-garbled-prompts.md +31 -0
  86. package/docs/lessons/0056-grep-count-exit-code-on-zero.md +42 -0
  87. package/docs/lessons/0057-new-artifacts-break-git-clean-gates.md +42 -0
  88. package/docs/lessons/0058-dead-config-keys-never-consumed.md +49 -0
  89. package/docs/lessons/0059-contract-test-shared-structures.md +53 -0
  90. package/docs/lessons/0060-set-e-silent-death-in-runners.md +53 -0
  91. package/docs/lessons/0061-context-injection-dirty-state.md +50 -0
  92. package/docs/lessons/0062-sibling-bug-neighborhood-scan.md +29 -0
  93. package/docs/lessons/0063-one-flag-two-lifetimes.md +31 -0
  94. package/docs/lessons/0064-test-passes-wrong-reason.md +31 -0
  95. package/docs/lessons/0065-pipefail-grep-count-double-output.md +39 -0
  96. package/docs/lessons/0066-local-keyword-outside-function.md +37 -0
  97. package/docs/lessons/0067-stdin-hang-non-interactive-shell.md +36 -0
  98. package/docs/lessons/0068-agent-builds-wrong-thing-correctly.md +31 -0
  99. package/docs/lessons/0069-plan-quality-dominates-execution.md +30 -0
  100. package/docs/lessons/0070-spec-echo-back-prevents-drift.md +31 -0
  101. package/docs/lessons/0071-positive-instructions-outperform-negative.md +30 -0
  102. package/docs/lessons/0072-lost-in-the-middle-context-placement.md +30 -0
  103. package/docs/lessons/0073-unscoped-lessons-cause-false-positives.md +30 -0
  104. package/docs/lessons/0074-stale-context-injection-wrong-batch.md +32 -0
  105. package/docs/lessons/0075-research-artifacts-must-persist.md +32 -0
  106. package/docs/lessons/0076-wrong-decomposition-contaminates-downstream.md +30 -0
  107. package/docs/lessons/0077-cherry-pick-merges-need-manual-resolution.md +30 -0
  108. package/docs/lessons/0078-static-review-without-live-test.md +30 -0
  109. package/docs/lessons/0079-integration-wiring-batch-required.md +32 -0
  110. package/docs/lessons/FRAMEWORK.md +161 -0
  111. package/docs/lessons/SUMMARY.md +201 -0
  112. package/docs/lessons/TEMPLATE.md +85 -0
  113. package/docs/plans/2026-02-21-code-factory-v2-design.md +204 -0
  114. package/docs/plans/2026-02-21-code-factory-v2-implementation-plan.md +2189 -0
  115. package/docs/plans/2026-02-21-code-factory-v2-phase4-design.md +537 -0
  116. package/docs/plans/2026-02-21-code-factory-v2-phase4-implementation-plan.md +2012 -0
  117. package/docs/plans/2026-02-21-hardening-pass-design.md +108 -0
  118. package/docs/plans/2026-02-21-hardening-pass-plan.md +1378 -0
  119. package/docs/plans/2026-02-21-mab-research-report.md +406 -0
  120. package/docs/plans/2026-02-21-marketplace-restructure-design.md +240 -0
  121. package/docs/plans/2026-02-21-marketplace-restructure-plan.md +832 -0
  122. package/docs/plans/2026-02-21-phase4-completion-plan.md +697 -0
  123. package/docs/plans/2026-02-21-validator-suite-design.md +148 -0
  124. package/docs/plans/2026-02-21-validator-suite-plan.md +540 -0
  125. package/docs/plans/2026-02-22-mab-research-round2.md +556 -0
  126. package/docs/plans/2026-02-22-mab-run-design.md +462 -0
  127. package/docs/plans/2026-02-22-mab-run-plan.md +2046 -0
  128. package/docs/plans/2026-02-22-operations-design-methodology-research.md +681 -0
  129. package/docs/plans/2026-02-22-research-agent-failure-taxonomy.md +532 -0
  130. package/docs/plans/2026-02-22-research-code-guideline-policies.md +886 -0
  131. package/docs/plans/2026-02-22-research-codebase-audit-refactoring.md +908 -0
  132. package/docs/plans/2026-02-22-research-coding-standards-documentation.md +541 -0
  133. package/docs/plans/2026-02-22-research-competitive-landscape.md +687 -0
  134. package/docs/plans/2026-02-22-research-comprehensive-testing.md +1076 -0
  135. package/docs/plans/2026-02-22-research-context-utilization.md +459 -0
  136. package/docs/plans/2026-02-22-research-cost-quality-tradeoff.md +548 -0
  137. package/docs/plans/2026-02-22-research-lesson-transferability.md +508 -0
  138. package/docs/plans/2026-02-22-research-multi-agent-coordination.md +312 -0
  139. package/docs/plans/2026-02-22-research-phase-integration.md +602 -0
  140. package/docs/plans/2026-02-22-research-plan-quality.md +428 -0
  141. package/docs/plans/2026-02-22-research-prompt-engineering.md +558 -0
  142. package/docs/plans/2026-02-22-research-unconventional-perspectives.md +528 -0
  143. package/docs/plans/2026-02-22-research-user-adoption.md +638 -0
  144. package/docs/plans/2026-02-22-research-verification-effectiveness.md +433 -0
  145. package/docs/plans/2026-02-23-agent-suite-design.md +299 -0
  146. package/docs/plans/2026-02-23-agent-suite-plan.md +578 -0
  147. package/docs/plans/2026-02-23-phase3-cost-infrastructure-design.md +148 -0
  148. package/docs/plans/2026-02-23-phase3-cost-infrastructure-plan.md +1062 -0
  149. package/docs/plans/2026-02-23-research-bash-expert-agent.md +543 -0
  150. package/docs/plans/2026-02-23-research-dependency-auditor-agent.md +564 -0
  151. package/docs/plans/2026-02-23-research-improving-existing-agents.md +503 -0
  152. package/docs/plans/2026-02-23-research-integration-tester-agent.md +454 -0
  153. package/docs/plans/2026-02-23-research-python-expert-agent.md +429 -0
  154. package/docs/plans/2026-02-23-research-service-monitor-agent.md +425 -0
  155. package/docs/plans/2026-02-23-research-shell-expert-agent.md +533 -0
  156. package/docs/plans/2026-02-23-roadmap-to-completion.md +530 -0
  157. package/docs/plans/2026-02-24-headless-module-split-design.md +98 -0
  158. package/docs/plans/2026-02-24-headless-module-split.md +443 -0
  159. package/docs/plans/2026-02-24-lesson-scope-metadata-design.md +228 -0
  160. package/docs/plans/2026-02-24-lesson-scope-metadata-plan.md +968 -0
  161. package/docs/plans/2026-02-24-npm-packaging-design.md +841 -0
  162. package/docs/plans/2026-02-24-npm-packaging-plan.md +1965 -0
  163. package/docs/plans/audit-findings.md +186 -0
  164. package/docs/telegram-notification-format.md +98 -0
  165. package/examples/example-plan.md +51 -0
  166. package/examples/example-prd.json +72 -0
  167. package/examples/example-roadmap.md +33 -0
  168. package/examples/quickstart-plan.md +63 -0
  169. package/hooks/hooks.json +26 -0
  170. package/hooks/setup-symlinks.sh +48 -0
  171. package/hooks/stop-hook.sh +135 -0
  172. package/package.json +47 -0
  173. package/policies/bash.md +71 -0
  174. package/policies/python.md +71 -0
  175. package/policies/testing.md +61 -0
  176. package/policies/universal.md +60 -0
  177. package/scripts/analyze-report.sh +97 -0
  178. package/scripts/architecture-map.sh +145 -0
  179. package/scripts/auto-compound.sh +273 -0
  180. package/scripts/batch-audit.sh +42 -0
  181. package/scripts/batch-test.sh +101 -0
  182. package/scripts/entropy-audit.sh +221 -0
  183. package/scripts/failure-digest.sh +51 -0
  184. package/scripts/generate-ast-rules.sh +96 -0
  185. package/scripts/init.sh +112 -0
  186. package/scripts/lesson-check.sh +428 -0
  187. package/scripts/lib/common.sh +61 -0
  188. package/scripts/lib/cost-tracking.sh +153 -0
  189. package/scripts/lib/ollama.sh +60 -0
  190. package/scripts/lib/progress-writer.sh +128 -0
  191. package/scripts/lib/run-plan-context.sh +215 -0
  192. package/scripts/lib/run-plan-echo-back.sh +231 -0
  193. package/scripts/lib/run-plan-headless.sh +396 -0
  194. package/scripts/lib/run-plan-notify.sh +57 -0
  195. package/scripts/lib/run-plan-parser.sh +81 -0
  196. package/scripts/lib/run-plan-prompt.sh +215 -0
  197. package/scripts/lib/run-plan-quality-gate.sh +132 -0
  198. package/scripts/lib/run-plan-routing.sh +315 -0
  199. package/scripts/lib/run-plan-sampling.sh +170 -0
  200. package/scripts/lib/run-plan-scoring.sh +146 -0
  201. package/scripts/lib/run-plan-state.sh +142 -0
  202. package/scripts/lib/run-plan-team.sh +199 -0
  203. package/scripts/lib/telegram.sh +54 -0
  204. package/scripts/lib/thompson-sampling.sh +176 -0
  205. package/scripts/license-check.sh +74 -0
  206. package/scripts/mab-run.sh +575 -0
  207. package/scripts/module-size-check.sh +146 -0
  208. package/scripts/patterns/async-no-await.yml +5 -0
  209. package/scripts/patterns/bare-except.yml +6 -0
  210. package/scripts/patterns/empty-catch.yml +6 -0
  211. package/scripts/patterns/hardcoded-localhost.yml +9 -0
  212. package/scripts/patterns/retry-loop-no-backoff.yml +12 -0
  213. package/scripts/pipeline-status.sh +197 -0
  214. package/scripts/policy-check.sh +226 -0
  215. package/scripts/prior-art-search.sh +133 -0
  216. package/scripts/promote-mab-lessons.sh +126 -0
  217. package/scripts/prompts/agent-a-superpowers.md +29 -0
  218. package/scripts/prompts/agent-b-ralph.md +29 -0
  219. package/scripts/prompts/judge-agent.md +61 -0
  220. package/scripts/prompts/planner-agent.md +44 -0
  221. package/scripts/pull-community-lessons.sh +90 -0
  222. package/scripts/quality-gate.sh +266 -0
  223. package/scripts/research-gate.sh +90 -0
  224. package/scripts/run-plan.sh +329 -0
  225. package/scripts/scope-infer.sh +159 -0
  226. package/scripts/setup-ralph-loop.sh +155 -0
  227. package/scripts/telemetry.sh +230 -0
  228. package/scripts/tests/run-all-tests.sh +52 -0
  229. package/scripts/tests/test-act-cli.sh +46 -0
  230. package/scripts/tests/test-agents-md.sh +87 -0
  231. package/scripts/tests/test-analyze-report.sh +114 -0
  232. package/scripts/tests/test-architecture-map.sh +89 -0
  233. package/scripts/tests/test-auto-compound.sh +169 -0
  234. package/scripts/tests/test-batch-test.sh +65 -0
  235. package/scripts/tests/test-benchmark-runner.sh +25 -0
  236. package/scripts/tests/test-common.sh +168 -0
  237. package/scripts/tests/test-cost-tracking.sh +158 -0
  238. package/scripts/tests/test-echo-back.sh +180 -0
  239. package/scripts/tests/test-entropy-audit.sh +146 -0
  240. package/scripts/tests/test-failure-digest.sh +66 -0
  241. package/scripts/tests/test-generate-ast-rules.sh +145 -0
  242. package/scripts/tests/test-helpers.sh +82 -0
  243. package/scripts/tests/test-init.sh +47 -0
  244. package/scripts/tests/test-lesson-check.sh +278 -0
  245. package/scripts/tests/test-lesson-local.sh +55 -0
  246. package/scripts/tests/test-license-check.sh +109 -0
  247. package/scripts/tests/test-mab-run.sh +182 -0
  248. package/scripts/tests/test-ollama-lib.sh +49 -0
  249. package/scripts/tests/test-ollama.sh +60 -0
  250. package/scripts/tests/test-pipeline-status.sh +198 -0
  251. package/scripts/tests/test-policy-check.sh +124 -0
  252. package/scripts/tests/test-prior-art-search.sh +96 -0
  253. package/scripts/tests/test-progress-writer.sh +140 -0
  254. package/scripts/tests/test-promote-mab-lessons.sh +110 -0
  255. package/scripts/tests/test-pull-community-lessons.sh +149 -0
  256. package/scripts/tests/test-quality-gate.sh +241 -0
  257. package/scripts/tests/test-research-gate.sh +132 -0
  258. package/scripts/tests/test-run-plan-cli.sh +86 -0
  259. package/scripts/tests/test-run-plan-context.sh +305 -0
  260. package/scripts/tests/test-run-plan-e2e.sh +153 -0
  261. package/scripts/tests/test-run-plan-headless.sh +424 -0
  262. package/scripts/tests/test-run-plan-notify.sh +124 -0
  263. package/scripts/tests/test-run-plan-parser.sh +217 -0
  264. package/scripts/tests/test-run-plan-prompt.sh +254 -0
  265. package/scripts/tests/test-run-plan-quality-gate.sh +222 -0
  266. package/scripts/tests/test-run-plan-routing.sh +178 -0
  267. package/scripts/tests/test-run-plan-scoring.sh +148 -0
  268. package/scripts/tests/test-run-plan-state.sh +261 -0
  269. package/scripts/tests/test-run-plan-team.sh +157 -0
  270. package/scripts/tests/test-scope-infer.sh +150 -0
  271. package/scripts/tests/test-setup-ralph-loop.sh +63 -0
  272. package/scripts/tests/test-telegram-env.sh +38 -0
  273. package/scripts/tests/test-telegram.sh +121 -0
  274. package/scripts/tests/test-telemetry.sh +46 -0
  275. package/scripts/tests/test-thompson-sampling.sh +139 -0
  276. package/scripts/tests/test-validate-all.sh +60 -0
  277. package/scripts/tests/test-validate-commands.sh +89 -0
  278. package/scripts/tests/test-validate-hooks.sh +98 -0
  279. package/scripts/tests/test-validate-lessons.sh +150 -0
  280. package/scripts/tests/test-validate-plan-quality.sh +235 -0
  281. package/scripts/tests/test-validate-plans.sh +187 -0
  282. package/scripts/tests/test-validate-plugin.sh +106 -0
  283. package/scripts/tests/test-validate-prd.sh +184 -0
  284. package/scripts/tests/test-validate-skills.sh +134 -0
  285. package/scripts/validate-all.sh +57 -0
  286. package/scripts/validate-commands.sh +67 -0
  287. package/scripts/validate-hooks.sh +89 -0
  288. package/scripts/validate-lessons.sh +98 -0
  289. package/scripts/validate-plan-quality.sh +369 -0
  290. package/scripts/validate-plans.sh +120 -0
  291. package/scripts/validate-plugin.sh +86 -0
  292. package/scripts/validate-policies.sh +42 -0
  293. package/scripts/validate-prd.sh +118 -0
  294. package/scripts/validate-skills.sh +96 -0
  295. package/skills/autocode/SKILL.md +285 -0
  296. package/skills/autocode/ab-verification.md +51 -0
  297. package/skills/autocode/code-quality-standards.md +37 -0
  298. package/skills/autocode/competitive-mode.md +364 -0
  299. package/skills/brainstorming/SKILL.md +97 -0
  300. package/skills/capture-lesson/SKILL.md +187 -0
  301. package/skills/check-lessons/SKILL.md +116 -0
  302. package/skills/dispatching-parallel-agents/SKILL.md +110 -0
  303. package/skills/executing-plans/SKILL.md +85 -0
  304. package/skills/finishing-a-development-branch/SKILL.md +201 -0
  305. package/skills/receiving-code-review/SKILL.md +72 -0
  306. package/skills/requesting-code-review/SKILL.md +59 -0
  307. package/skills/requesting-code-review/code-reviewer.md +82 -0
  308. package/skills/research/SKILL.md +145 -0
  309. package/skills/roadmap/SKILL.md +115 -0
  310. package/skills/subagent-driven-development/SKILL.md +98 -0
  311. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +18 -0
  312. package/skills/subagent-driven-development/implementer-prompt.md +73 -0
  313. package/skills/subagent-driven-development/spec-reviewer-prompt.md +57 -0
  314. package/skills/systematic-debugging/SKILL.md +134 -0
  315. package/skills/systematic-debugging/condition-based-waiting.md +64 -0
  316. package/skills/systematic-debugging/defense-in-depth.md +32 -0
  317. package/skills/systematic-debugging/root-cause-tracing.md +55 -0
  318. package/skills/test-driven-development/SKILL.md +167 -0
  319. package/skills/using-git-worktrees/SKILL.md +219 -0
  320. package/skills/using-superpowers/SKILL.md +54 -0
  321. package/skills/verification-before-completion/SKILL.md +140 -0
  322. package/skills/verify/SKILL.md +82 -0
  323. package/skills/writing-plans/SKILL.md +128 -0
  324. package/skills/writing-skills/SKILL.md +93 -0
package/.claude-plugin/marketplace.json ADDED
@@ -0,0 +1,22 @@
+ {
+   "$schema": "https://anthropic.com/claude-code/marketplace.schema.json",
+   "name": "autonomous-coding-toolkit",
+   "description": "Autonomous coding pipeline with quality gates, fresh-context execution, and community lessons",
+   "owner": {
+     "name": "Justin McFarland",
+     "email": "parthalon025@gmail.com"
+   },
+   "plugins": [
+     {
+       "name": "autonomous-coding-toolkit",
+       "description": "Complete autonomous coding pipeline with skills, agents, scripts, and a community lesson system that improves with every user",
+       "version": "1.0.0",
+       "source": "./",
+       "author": {
+         "name": "Justin McFarland",
+         "email": "parthalon025@gmail.com"
+       },
+       "category": "development"
+     }
+   ]
+ }
package/.claude-plugin/plugin.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "name": "autonomous-coding-toolkit",
+   "description": "Complete autonomous coding pipeline: skills for every stage from brainstorming through verification, quality gates between batches, headless execution, and a lessons-learned feedback loop that compounds with every user",
+   "version": "1.0.0",
+   "author": {
+     "name": "Justin McFarland",
+     "email": "parthalon025@gmail.com"
+   },
+   "homepage": "https://github.com/parthalon025/autonomous-coding-toolkit",
+   "repository": "https://github.com/parthalon025/autonomous-coding-toolkit",
+   "license": "MIT",
+   "keywords": ["autonomous", "tdd", "quality-gates", "headless", "skills", "pipeline", "lessons-learned"]
+ }
package/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Justin McFarland
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
package/Makefile ADDED
@@ -0,0 +1,21 @@
+ .PHONY: test validate lint ci
+
+ lint:
+ 	@echo "=== ShellCheck ==="
+ 	@shellcheck scripts/*.sh scripts/lib/*.sh 2>&1 || true
+ 	@echo "=== shfmt ==="
+ 	@shfmt -d -i 2 -ci scripts/*.sh scripts/lib/*.sh 2>&1 || true
+ 	@echo "=== Shellharden ==="
+ 	@shellharden --check scripts/*.sh scripts/lib/*.sh 2>&1 || true
+ 	@echo "=== Semgrep ==="
+ 	@semgrep --config "p/bash" --quiet scripts/ 2>&1 || true
+ 	@echo "=== Lint Complete ==="
+
+ test:
+ 	@bash scripts/tests/run-all-tests.sh
+
+ validate:
+ 	@bash scripts/validate-all.sh
+
+ ci: lint validate test
+ 	@echo "CI: ALL PASSED"
package/README.md ADDED
@@ -0,0 +1,140 @@
+ [![CI](https://github.com/parthalon025/autonomous-coding-toolkit/actions/workflows/ci.yml/badge.svg)](https://github.com/parthalon025/autonomous-coding-toolkit/actions)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Version](https://img.shields.io/badge/version-1.0.0-blue.svg)](https://github.com/parthalon025/autonomous-coding-toolkit/releases/tag/v1.0.0)
+
+ # Autonomous Coding Toolkit
+
+ > **Goal:** Code better than a human on large projects — not by being smarter on any single batch, but by compounding learning across thousands of batches across hundreds of users.
+
+ **A learning system for autonomous AI coding.** Fresh context per batch, quality gates between every step, 79 community lessons that prevent the same bug twice, and telemetry that makes the system smarter with every run.
+
+ Built for [Claude Code](https://docs.anthropic.com/en/docs/claude-code) (v1.0.33+). Works as a Claude Code plugin (interactive) and npm CLI (headless/CI).
+
+ ## What It Does
+
+ ```
+ You write a plan → the toolkit executes it batch-by-batch with:
+ - Fresh 200k context window per batch (no accumulated degradation)
+ - Quality gates between every batch (tests + anti-pattern scan + memory check)
+ - Machine-verifiable completion (every criterion is a shell command)
+ ```
+
+ ## Install
+
+ ### npm (recommended)
+
+ ```bash
+ npm install -g autonomous-coding-toolkit
+ ```
+
+ This puts `act` on your PATH. Requires Node.js 18+ and bash 4+.
+
+ ### Claude Code Plugin
+
+ ```bash
+ # Add the marketplace source
+ /plugin marketplace add parthalon025/autonomous-coding-toolkit
+
+ # Install the plugin
+ /plugin install autonomous-coding-toolkit@autonomous-coding-toolkit
+ ```
+
+ ### From Source
+
+ ```bash
+ git clone https://github.com/parthalon025/autonomous-coding-toolkit.git
+ cd autonomous-coding-toolkit
+ npm link # puts 'act' on PATH
+ ```
+
+ > **Windows:** Requires [WSL](https://learn.microsoft.com/en-us/windows/wsl/install). Run `wsl --install`, then use the toolkit inside WSL.
+
+ ## Quick Start
+
+ ```bash
+ # Bootstrap your project
+ act init --quickstart
+
+ # Full pipeline — brainstorm → plan → execute → verify → finish
+ /autocode "Add user authentication with JWT"
+
+ # Run a plan headless (fully autonomous, fresh context per batch)
+ act plan docs/plans/my-feature.md --on-failure retry --notify
+
+ # Quality check
+ act gate --project-root .
+
+ # See all commands
+ act help
+ ```
+
+ See [`examples/quickstart-plan.md`](examples/quickstart-plan.md) for a minimal plan you can run in 3 commands.
+
+ ## The Pipeline
+
+ ```
+ Idea → [Roadmap] → Brainstorm → [Research] → PRD → Plan → Execute → Verify → Finish
+ ```
+
+ Each stage exists because a specific failure mode demanded it:
+
+ | Stage | Problem It Solves | Evidence |
+ |-------|------------------|----------|
+ | **Brainstorm** | Agents build the wrong thing correctly — spec misunderstanding is the dominant failure mode | SWE-bench Pro (1,865 problems): removing specs degraded success from 25.9% to 8.4% |
+ | **Research** | Building on assumptions wastes hours | Cooper Stage-Gate: projects with stable definitions are 3x more likely to succeed |
+ | **Plan** | Plan quality dominates execution quality ~3:1 | SWE-bench Pro: spec removal = 3x degradation |
+ | **Execute** | Context degradation is the #1 quality killer | Chroma (Hong et al., 2025): 11/12 models < 50% at 32K tokens; Liu et al. (Stanford, TACL 2024): up to 20pp mid-context accuracy loss |
+ | **Verify** | Static review misses behavioral bugs | OOPSLA 2025: property-based testing finds ~50x more mutations per test |
+
+ Full evidence table with all 25 papers: [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md)
+
+ ## How It Compares
+
+ | Tool | Approach | This Toolkit's Difference |
+ |------|----------|--------------------------|
+ | Claude Code `/plan` | Built-in planning | No quality gates, no fresh context per batch, no lesson system |
+ | Aider | Interactive pair programming | Aider is conversational; this is batch-autonomous with gates |
+ | Cursor Agent | IDE-integrated agent | No headless mode, no batch isolation |
+ | SWE-agent | Autonomous GitHub issue solver | Single-issue scope; this handles multi-batch plans with state |
+
+ **Core differentiators:** (1) fresh context per batch, (2) machine-verifiable quality gates, (3) compounding lesson system, (4) headless unattended execution.
+
+ ## Quality Gates
+
+ Mandatory between every batch:
+
+ 1. Lesson check (<2s, grep-based anti-pattern scan)
+ 2. ast-grep patterns (5 structural checks)
+ 3. Test suite (auto-detected: pytest / npm test / make test)
+ 4. Memory check (warns if < 4GB available)
+ 5. Test count regression (tests only go up)
+ 6. Git clean (all changes committed)
+
+ ## Community Lessons
+
+ 79 lessons across 6 failure clusters, learned from production bugs. Adding a lesson file to `docs/lessons/` automatically adds a check — no code changes needed.
+
+ Submit new lessons via `/submit-lesson` or [open an issue](https://github.com/parthalon025/autonomous-coding-toolkit/issues/new?template=lesson_submission.md).
+
+ ## Requirements
+
+ - **Claude Code** v1.0.33+ (`claude` CLI)
+ - **bash** 4+, **jq**, **git**
+ - Optional: **gh** (PR creation), **curl** (Telegram notifications)
+
+ ## Learn More
+
+ | Topic | Doc |
+ |-------|-----|
+ | Architecture, evidence, internals | [`docs/ARCHITECTURE.md`](docs/ARCHITECTURE.md) |
+ | Contributing lessons | [`docs/CONTRIBUTING.md`](docs/CONTRIBUTING.md) |
+ | Plan file format | [`examples/example-plan.md`](examples/example-plan.md) |
+ | Execution modes (5 options) | [`docs/ARCHITECTURE.md#system-overview`](docs/ARCHITECTURE.md#system-overview) |
+
+ ## Attribution
+
+ Core skill chain forked from [superpowers](https://github.com/obra/superpowers) by Jesse Vincent / Anthropic. Extended with quality gate pipeline, headless execution, lesson system, MAB routing, and research/roadmap stages.
+
+ ## License
+
+ MIT
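
Reviewer note (not part of the package): the README's "Quality Gates" list describes a fixed sequence of checks that runs between batches, with "tests only go up" as one of them. A minimal sketch of that shape follows, assuming each gate is a shell function or command that returns non-zero on failure. The names `run_gates` and `test_count_ok` are hypothetical and are not taken from `scripts/quality-gate.sh`, whose actual logic may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; not the toolkit's scripts/quality-gate.sh.

# "Tests only go up": succeed when the current count is not lower
# than the count recorded after the previous batch.
test_count_ok() {
  local previous="$1" current="$2"
  (( current >= previous ))
}

# Run each named gate in order and stop at the first one that fails.
run_gates() {
  local gate
  for gate in "$@"; do
    if ! "$gate"; then
      printf 'gate failed: %s\n' "$gate" >&2
      return 1
    fi
  done
  printf 'all gates passed\n'
}
```

Stopping at the first failure keeps the cheap checks (the grep-based lesson scan) ahead of the expensive ones (the full test suite), which matches the order the README gives.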
package/SECURITY.md ADDED
@@ -0,0 +1,28 @@
+ # Security Policy
+
+ ## Supported Versions
+
+ | Version | Supported |
+ |---------|-----------|
+ | 1.0.x | Yes |
+ | < 1.0 | No |
+
+ ## Reporting a Vulnerability
+
+ If you discover a security vulnerability, please report it responsibly:
+
+ 1. **Do not** open a public issue
+ 2. Email parthalon025@gmail.com with:
+    - Description of the vulnerability
+    - Steps to reproduce
+    - Potential impact
+ 3. You will receive a response within 48 hours
+
+ ## Scope
+
+ This toolkit executes shell commands as part of its quality gate pipeline. Security considerations:
+
+ - **`eval` usage:** PRD acceptance criteria use `eval` to run shell commands. Only run PRDs you trust.
+ - **Headless execution:** `run-plan.sh` executes `claude -p` with plan content. Only run plans from trusted sources.
+ - **Ollama integration:** `auto-compound.sh` sends report content to a local Ollama instance. No data leaves your machine.
+ - **Telegram notifications:** Optional. Credentials read from `~/.env`. Never logged or committed.
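
Reviewer note (not part of the package): SECURITY.md says PRD acceptance criteria are run through `eval`. A minimal sketch of what that implies follows, assuming a criterion is a single shell command string whose exit status decides pass or fail. The function name `check_criterion` is hypothetical; the toolkit's real runner is not shown in this part of the diff and may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the toolkit's real criterion runner may differ.

# Evaluate one acceptance criterion (a shell command string).
# The subshell keeps `cd` and variable changes from leaking into the
# caller, but it does NOT sandbox the command: the string still runs
# with the invoking user's privileges.
check_criterion() {
  local criterion="$1"
  if ( eval "$criterion" ) >/dev/null 2>&1; then
    printf 'PASS: %s\n' "$criterion"
  else
    printf 'FAIL: %s\n' "$criterion"
    return 1
  fi
}
```

Because the criterion is executed as code rather than parsed as data, a PRD from an untrusted source can run arbitrary commands, which is why the policy limits use to PRDs you trust.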
@@ -0,0 +1,113 @@
1
+ ---
2
+ name: bash-expert
3
+ description: "Use this agent when reviewing, writing, or debugging bash or shell
4
+ scripts. Invoke for: .sh files, CI pipeline shell steps, hook scripts, systemd
5
+ ExecStart shell commands, Makefile shell targets, and heredoc-heavy scripts. Do
6
+ not invoke for Python, Ruby, or other scripted languages."
7
+ tools: Read, Grep, Glob, Bash
8
+ model: sonnet
9
+ maxTurns: 30
10
+ ---
11
+
12
+ # Bash Expert
13
+
14
+ You are a bash expert specializing in defensive scripting for production automation and CI/CD. Your canonical references are:
15
+ - Google Shell Style Guide (structure, naming, scope gate)
16
+ - BashPitfalls wiki (61+ common mistakes)
17
+ - ShellCheck wiki (rule explanations and fixes)
18
+
19
+ ## Scan Workflow (Audit Mode)
20
+
21
+ When reviewing existing scripts, follow this order:
22
+
23
+ ### Step 1: Read target files
24
+ Read each file to understand structure and purpose.
25
+
26
+ ### Step 2: Grep for Priority 1 blocking patterns
27
+
28
+ These cause silent failures, data corruption, or security vulnerabilities:
29
+
30
+ | Pattern | Grep target | Fix |
31
+ |---------|-------------|-----|
32
+ | Unquoted variable in command args | `\$[a-zA-Z_]` outside double quotes | Quote: `"$var"` |
33
+ | `eval` on variables | `\beval\b` | Replace with named variable or array |
34
+ | `\|\| true` masking errors | `\|\| true` | Use explicit error handling |
35
+ | `cd` without error check | `cd ` not followed by `&&` or `\|\|` | `cd /path \|\| exit 1` |
36
+ | Missing `set -euo pipefail` | `^#!/` without strict mode nearby | Add to script header |
37
+ | `for f in $(ls` | `for .* in \$\(ls` | `for f in ./*` |
38
+ | `local var=$(cmd)` masking exit | `local [a-z_]+=\$\(` | `local var; var=$(cmd)` |
39
+ | `2>&1 >>` wrong order | `2>&1 >>` | Reverse to `>>file 2>&1` |
40
+ | Same-file pipeline read/write | `> file` after `cat file \|` | Use temp file + mv |
41
+
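The `local var=$(cmd)` row is the least intuitive entry in the table, so a minimal sketch may help. The function names are hypothetical and `false` stands in for any failing command. ShellCheck reports this pattern as SC2155.

```bash
# Hypothetical function names, for illustration only.
masked_status() {
  local status=0
  # `local` itself succeeds, so the failure of `false` never reaches `||`.
  local out=$(false) || status=$?
  printf '%s\n' "$status"
}

split_status() {
  local out status=0
  # Declared first: the plain assignment now carries the command's exit status.
  out=$(false) || status=$?
  printf '%s\n' "$status"
}

masked_status   # prints 0
split_status    # prints 1
```

Under `set -e` the masked form is worse still: the script keeps running with an empty variable instead of stopping at the failed command.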
42
+ ### Step 3: Grep for Priority 2 quality patterns
43
+
44
+ | Pattern | Grep target | Fix |
45
+ |---------|-------------|-----|
46
+ | Wrong shebang | `#!/bin/bash` | `#!/usr/bin/env bash` |
47
+ | `grep -P` (non-portable) | `grep -P` | `grep -E` or `[[ =~ ]]` |
48
+ | `ls` for file existence | `if.*ls ` | `[[ -f file ]]` or `[[ -d dir ]]` |
49
+ | Backtick substitution | `` ` `` | `$()` |
50
+ | Missing `--help` | No `usage()` or `--help` handler | Add usage function |
51
+ | No EXIT trap for temp files | `mktemp` without `trap.*EXIT` | `trap 'rm -rf "$tmpdir"' EXIT` |
52
+ | `echo` for data output | `^echo \$` | `printf '%s\n' "$var"` |
53
+ | `[ ]` instead of `[[ ]]` | `\[ ` not `\[\[ ` | Use `[[ ]]` for bash conditionals |
54
+ | Hardcoded `/tmp/` | `/tmp/` literal path | `mktemp -d` |
55
+ | `$*` instead of `"$@"` | `\$\*` | `"$@"` |
56
+
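To illustrate the last row, here is a small sketch of how unquoted `$*` re-splits arguments while `"$@"` preserves them. The function names are hypothetical, and the default `IFS` is assumed.

```bash
# Hypothetical helpers, for illustration only.
count_args() { printf '%s\n' "$#"; }

# Unquoted $* is word-split again, so "a b" becomes two arguments.
forward_star() { count_args $*; }

# "$@" passes each original argument through unchanged.
forward_at() { count_args "$@"; }

forward_star "a b" c   # prints 3
forward_at "a b" c     # prints 2
```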
57
+ ### Step 4: Check tooling config
58
+ - Look for `.shellcheckrc` in the project root
59
+ - Check if `shfmt` config exists (`.editorconfig` or flags)
60
+
61
+ ### Step 5: Run ShellCheck
62
+ Run: `shellcheck --enable=all --external-sources <file>` on each target file.
63
+
64
+ ### Step 6: Check scope
65
+ If the script exceeds 100 lines with complex control flow, non-trivial data manipulation, or object-like structures, flag it: "Consider Python rewrite (Google Shell Style Guide threshold)."
66
+
67
+ ## Output Format
68
+
69
+ ```
70
+ BLOCKING (must fix before merge):
71
+ - file.sh:12 — Unquoted variable $USER_INPUT — SC2086
72
+ - file.sh:34 — Missing error check on cd — BashPitfalls #19
73
+
74
+ QUALITY (should fix):
75
+ - file.sh:8 — Backtick substitution; use $() instead — SC2006
76
+ - file.sh:45 — No EXIT trap for temp files created here
77
+
78
+ STYLE (consider):
79
+ - Script exceeds 100 lines with subprocess orchestration; evaluate Python rewrite
80
+ - Missing --help flag
81
+
82
+ TOOLING:
83
+ - No .shellcheckrc found; recommend: enable=all, external-sources=true
84
+ ```
85
+
86
+ ## Generation Mode (Writing New Scripts)
87
+
88
+ When writing new bash scripts, always apply:
89
+
90
+ 1. Header: `#!/usr/bin/env bash` followed by `set -Eeuo pipefail`
91
+ 2. `IFS=$'\n\t'` after strict mode
92
+ 3. Script directory detection:
93
+ ```bash
94
+ SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd -P)"
95
+ ```
96
+ 4. Error logging function:
97
+ ```bash
98
+ err() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" >&2; }
99
+ die() { err "$@"; exit 1; }
100
+ ```
101
+ 5. Cleanup trap before any `mktemp`:
102
+ ```bash
103
+ trap 'rm -rf "${tmpdir:-}"' EXIT
104
+ ```
105
+ 6. `main()` function called at end of script
106
+ 7. `--help` flag via `usage()` heredoc
107
+ 8. All function variables declared with `local`
108
+ 9. Quote all variable expansions
109
+ 10. Use arrays for file lists, never word-split strings
110
+
111
+ ## Hallucination Guard
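The rules above can be assembled into one skeleton. This is a sketch under stated assumptions, not a canonical template: the line-counting task, the `count_lines` helper, and the usage text are invented for illustration, and the `SCRIPT_DIR` line from rule 3 is left out because this example never uses it.

```bash
#!/usr/bin/env bash
set -Eeuo pipefail
IFS=$'\n\t'

err() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*" >&2; }
die() { err "$@"; exit 1; }

usage() {
  cat <<'EOF'
Usage: example.sh [--help] FILE...
Print the line count of each FILE (illustrative task only).
EOF
}

# Deliberately global so the EXIT trap can see it; the trap is
# registered before any mktemp call.
tmpdir=""
trap 'rm -rf "${tmpdir:-}"' EXIT

# Hypothetical helper: print the number of lines in a file.
count_lines() {
  local file=$1 count
  count=$(wc -l < "$file")
  printf '%s\n' "$((count))"
}

main() {
  local -a files=()
  local arg file results
  for arg in "$@"; do
    case "$arg" in
      --help) usage; return 0 ;;
      *) files+=("$arg") ;;
    esac
  done
  if (( ${#files[@]} == 0 )); then
    usage
    return 0
  fi
  tmpdir=$(mktemp -d)
  results="$tmpdir/results.tsv"
  for file in "${files[@]}"; do
    [[ -f "$file" ]] || die "not a file: $file"
    printf '%s\t%s\n' "$file" "$(count_lines "$file")" >> "$results"
  done
  cat -- "$results"
}

main "$@"
```

Results are written to a scratch file inside `tmpdir` only to show the trap cleaning up after a real `mktemp -d`. A script this small would normally print directly.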
112
+
113
+ Report only what Read/Grep/Bash output directly confirms. If a grep returns no matches for a category, record it as CLEAN. Do not infer violations from code structure alone — show evidence.
package/agents/dependency-auditor.md ADDED
@@ -0,0 +1,138 @@
1
+ ---
2
+ name: dependency-auditor
3
+ description: "Scans project repos for CVEs, outdated packages, and license compliance.
4
+ Read-only — never installs, updates, or modifies any package. Use for periodic
5
+ security audits or before releases."
6
+ tools: Read, Grep, Glob, Bash
7
+ model: haiku
8
+ maxTurns: 25
9
+ ---
10
+
11
+ # Dependency Auditor
12
+
13
+ You scan project repositories for outdated packages, known CVEs, and license compliance issues. You are strictly read-only — you NEVER run `pip install`, `npm audit fix`, `npm install`, or modify any file.
14
+
15
+ ## Step 0: Tool Availability Check
16
+
17
+ Before scanning, verify which tools are available:
18
+
19
+ ```bash
20
+ which pip-audit osv-scanner trivy npm npx 2>/dev/null
21
+ ```
22
+
23
+ Report which tools are available and which are missing. Proceed with available tools. If pip-audit is missing, fall back to manifest-only scanning.
24
+
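One way to produce the available/missing report is sketched below. The `check_tools` name is hypothetical, and it uses the shell builtin `command -v` rather than `which`, which is a substitution on my part rather than something the original prescribes.

```bash
# Hypothetical helper: report each named tool as available or missing.
check_tools() {
  local tool
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      printf 'available: %s\n' "$tool"
    else
      printf 'missing: %s\n' "$tool"
    fi
  done
}

check_tools pip-audit osv-scanner trivy npm npx
```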
25
+ ## Step 1: Repo Detection
26
+
27
+ Scan `~/Documents/projects/` for project repos. For each directory, detect:
28
+ - **Python:** `requirements.txt`, `pyproject.toml`, `Pipfile`
29
+ - **Node:** `package.json`
30
+ - **Docker:** `Dockerfile`
31
+ - **Virtualenv:** `.venv/`, `venv/`, `env/`
32
+
33
+ Exclude: `_archived/`, `.claude/worktrees/`
34
+
35
+ ## Step 2: CVE Scanning (per repo)
36
+
37
+ **Python repos (with venv):**
38
+ ```bash
39
+ .venv/bin/python -m pip_audit -f json 2>/dev/null
40
+ ```
41
+
42
+ **Python repos (manifest only, no venv):**
43
+ ```bash
44
+ pip-audit -r requirements.txt -f json 2>/dev/null
45
+ ```
46
+
47
+ **Python repos (pyproject.toml):**
48
+ ```bash
49
+ pip-audit . -f json 2>/dev/null
50
+ ```
51
+
52
+ **Node repos:**
53
+ ```bash
54
+ npm audit --json 2>/dev/null
55
+ ```
56
+
57
+ **Docker repos (additional pass):**
58
+ ```bash
59
+ trivy fs --format json --severity HIGH,CRITICAL . 2>/dev/null
60
+ ```
61
+
62
+ ## Step 3: Cross-Language CVE Aggregation
63
+
64
+ If OSV-Scanner is available:
65
+ ```bash
66
+ osv-scanner scan --recursive ~/Documents/projects/ --format json 2>/dev/null
67
+ ```
68
+
69
+ Cross-reference with per-ecosystem results. OSV output provides normalized severity scores.
70
+
71
+ ## Step 4: Outdated Package Detection (per repo)
72
+
73
+ **Python:**
74
+ ```bash
75
+ .venv/bin/python -m pip list --outdated --format json 2>/dev/null
76
+ ```
77
+
78
+ **Node:**
79
+ ```bash
80
+ npx npm-check-updates --jsonUpgraded 2>/dev/null
81
+ ```
82
+
83
+ ## Step 5: License Compliance (per repo)
84
+
85
+ **Python (requires installed venv):**
86
+ ```bash
87
+ .venv/bin/pip-licenses --format json --with-urls 2>/dev/null
88
+ ```
89
+
90
+ **Node:**
91
+ ```bash
92
+ npx license-checker --json 2>/dev/null
93
+ ```
94
+
95
+ **Allowlist:** MIT, Apache-2.0, Apache Software License, BSD-2-Clause, BSD-3-Clause, BSD License, ISC, Python Software Foundation License, CC0-1.0, Public Domain, Unlicense.
96
+
97
+ Flag any dependency with a license outside this allowlist.
98
+
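The allowlist check can be sketched as a plain string match. The `is_allowed_license` name is hypothetical, and the sketch assumes the license string has already been extracted from the tool's JSON output. It compares exact strings only, so compound expressions such as `MIT OR Apache-2.0` would be flagged.

```bash
# Hypothetical helper: succeed only if the license string is on the allowlist.
is_allowed_license() {
  case "$1" in
    "MIT" | "Apache-2.0" | "Apache Software License" | \
    "BSD-2-Clause" | "BSD-3-Clause" | "BSD License" | "ISC" | \
    "Python Software Foundation License" | "CC0-1.0" | \
    "Public Domain" | "Unlicense")
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Example: flag a license outside the allowlist.
if ! is_allowed_license "GPL-3.0-only"; then
  printf 'FLAG: %s\n' "GPL-3.0-only"
fi
```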
99
+ ## Step 6: Report
100
+
101
+ ```
102
+ DEPENDENCY AUDIT REPORT — <timestamp>
103
+ Repos scanned: N
104
+
105
+ ### CRITICAL / HIGH CVEs — Fix immediately
106
+ | Repo | Package | Version | CVE | Severity | Fix Version |
107
+ |------|---------|---------|-----|----------|-------------|
108
+
109
+ ### MEDIUM CVEs — Fix this sprint
110
+ | Repo | Package | Version | CVE | Fix Version |
111
+ |------|---------|---------|-----|-------------|
112
+
113
+ ### Outdated Packages (no known CVE)
114
+ | Repo | Package | Current | Latest |
115
+ |------|---------|---------|--------|
116
+
117
+ ### License Compliance Issues
118
+ | Repo | Package | License | Issue |
119
+ |------|---------|---------|-------|
120
+
121
+ ### Workspace Rollup
122
+ - Total CVEs: N (X critical, Y high, Z medium)
123
+ - Total outdated: N
124
+ - License issues: N
125
+ - Cleanest repos: [list]
126
+ - Highest risk: [list]
127
+ ```
128
+
129
+ ## Key Rules
130
+
131
+ - **This agent is read-only.** NEVER run `pip install`, `npm audit fix`, `npm install`, or modify any file.
132
+ - **Outdated != vulnerable.** Separate outdated packages (version drift) from vulnerable packages (known CVE). Different urgency.
133
+ - **Use `.venv/bin/python -m pip`** not `.venv/bin/pip` — Homebrew PATH corruption (Lesson #51).
134
+ - **If a tool returns an error,** report the error and move to the next repo. Do not stop the full audit.
135
+
136
+ ## Hallucination Guard
137
+
138
+ Only report CVEs that appear in tool JSON output. Do not infer vulnerabilities from package age or version number alone. If a tool produces no output for a repo, report "No findings" — do not fabricate results.
package/agents/integration-tester.md ADDED
@@ -0,0 +1,120 @@
1
+ ---
2
+ name: integration-tester
3
+ description: "Verifies data flows correctly across service seams. Catches Cluster B
4
+ bugs where each service passes its own tests but handoffs fail. Use when deploying
5
+ service changes, after timer failures, or to validate cross-service data pipelines."
6
+ tools: Read, Grep, Glob, Bash
7
+ model: opus
8
+ maxTurns: 40
9
+ ---
10
+
11
+ # Integration Tester
12
+
13
+ You verify data flows correctly across service boundaries. Your job is NOT to test individual services — unit tests do that. Your job is to catch Cluster B bugs: the upstream passes its test, the downstream passes its test, but the data never arrives correctly at the seam.
14
+
15
+ ## Operating Principles
16
+
17
+ 1. **Black box only.** Never read service source code to infer behavior. Only check external observables: files, DB tables, HTTP endpoints, systemd status.
18
+ 2. **Evidence-based assertions.** Every PASS and FAIL must include quoted command output as evidence. No inferred assertions.
19
+ 3. **One probe per seam.** Do not bundle multiple seams into one check. Failures must be unambiguously attributable.
20
+ 4. **Fail fast with cause.** If a pre-probe health check fails (service down, no recent artifact), report SKIP with cause. Do not run the full trace and produce a misleading FAIL.
21
+ 5. **No side effects.** Do not write to live service data paths. Test artifacts go to `/tmp/integration-tester-results/`.
22
+
23
+ ## Probe Strategies
24
+
25
+ ### freshness_and_schema
26
+
27
+ For file-based seams where the producer writes on a timer:
28
+
29
+ 1. Check producer service is active: `systemctl --user is-active <service>`
30
+ 2. Find most recent artifact at the interface path
31
+ 3. Check artifact mtime is within freshness TTL: `$(( $(date +%s) - $(stat -c '%Y' <file>) ))` seconds
32
+ 4. Validate artifact structure (JSON parseable, expected keys present)
33
+ 5. PASS if all checks pass; FAIL with specific evidence on any failure
34
+
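The freshness check in step 3 can be sketched as a small function. The `is_fresh` name and its argument order are hypothetical, and GNU `stat -c` is assumed, as in the step itself. The function prints the measured age so the output can be quoted as evidence.

```bash
# Hypothetical helper: is_fresh FILE TTL_SECONDS
# Returns 0 if FILE was modified within TTL_SECONDS, 1 if stale, 2 if stat fails.
is_fresh() {
  local file=$1 ttl=$2 now mtime age
  now=$(date +%s)
  mtime=$(stat -c '%Y' "$file") || return 2
  age=$(( now - mtime ))
  printf 'age=%ss ttl=%ss\n' "$age" "$ttl"
  [ "$age" -le "$ttl" ]
}
```

For the HA logbook seam, for example, the 45 minute TTL would be passed as `2700`.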
35
+ ### sentinel_injection
36
+
37
+ For seams that accept test input:
38
+
39
+ 1. Check producer service is active
40
+ 2. Write a sentinel file with known content to producer's staging area
41
+ 3. Wait up to timeout for sentinel to propagate to consumer's input path
42
+ 4. Validate the propagated artifact
43
+ 5. Clean up sentinel artifacts from `/tmp/`
44
+
45
+ ### db_row_trace
46
+
47
+ For SQLite-based seams:
48
+
49
+ 1. Check producer service is active
50
+ 2. Query producer DB for most recent row: `sqlite3 <db> "SELECT * FROM <table> ORDER BY rowid DESC LIMIT 1"`
51
+ 3. Check row recency (timestamp within expected window)
52
+ 4. If consumer has a separate DB, query for matching correlation ID
53
+ 5. Assert schema of the row matches expected fields
54
+
55
+ ### env_audit
56
+
57
+ For shared environment variables:
58
+
59
+ 1. Source `~/.env` and check each critical variable is set and non-empty
60
+ 2. For each variable, grep `~/.config/systemd/user/*.service` for consumers
61
+ 3. Verify each consuming service is currently active
62
+ 4. Report any mismatch: variable declared but no consumers, or consumer expects variable not in ~/.env
63
+
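Step 1 of the audit can be sketched as follows. The `check_env_vars` name and the variable names in the usage note are hypothetical. The function reports names only and never prints values, since `~/.env` holds credentials.

```bash
# Hypothetical helper: check that each named variable is set and non-empty.
# Prints one line per variable; returns 1 if any is unset or empty.
check_env_vars() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion with an empty default.
    if [[ -z "${!name:-}" ]]; then
      printf 'UNSET_OR_EMPTY: %s\n' "$name"
      missing=1
    else
      printf 'OK: %s\n' "$name"
    fi
  done
  return "$missing"
}
```

A typical call would source the file first and then pass the critical names, for example `check_env_vars EXAMPLE_BOT_TOKEN EXAMPLE_CHAT_ID`, where both names are placeholders.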
64
+ ## Seam Registry
65
+
66
+ | Seam | Producer | Consumer | Interface | Probe | Freshness TTL |
67
+ |------|----------|----------|-----------|-------|---------------|
68
+ | HA logbook | ha-log-sync (15min timer) | aria engine | `~/ha-logs/logbook/` | freshness_and_schema | 45 min |
69
+ | Intelligence | aria engine (daily timer) | aria hub | `~/ha-logs/intelligence/` | freshness_and_schema | 30 hours |
70
+ | Hub cache | aria hub | — | `~/ha-logs/intelligence/cache/hub.db` | db_row_trace | 30 hours |
71
+ | Notion replica | notion-tools (6h timer) | telegram-brief | `~/Documents/notion/` | freshness_and_schema | 12 hours |
72
+ | Capture DB | telegram-capture | capture-sync | `~/.local/share/telegram-capture/capture.db` | db_row_trace | 12 hours |
73
+ | Ollama queue | queue daemon | 10 timers | `~/.local/share/ollama-queue/queue.db` | db_row_trace | 2 hours |
74
+ | Shared env | `~/.env` | all services | Environment variables | env_audit | n/a |
75
+
76
+ ## Execution Order
77
+
78
+ 1. Run env_audit first (fastest, catches cross-cutting issues)
79
+ 2. Run freshness_and_schema probes (read-only file checks)
80
+ 3. Run db_row_trace probes (sqlite3 queries)
81
+ 4. Aggregate results into summary report
82
+
83
+ ## Output Format
84
+
85
+ ```
86
+ INTEGRATION TEST REPORT — <timestamp>
87
+
88
+ SUMMARY:
89
+ | Seam | Status | Latency |
90
+ |------|--------|---------|
91
+ | HA logbook | PASS | 1.2s |
92
+ | Intelligence | FAIL | 0.8s |
93
+ | Notion replica | PASS | 0.5s |
94
+ | Shared env | PASS | 0.3s |
95
+
96
+ FAILURES:
97
+ ## Intelligence (aria engine → aria hub)
98
+ - Check: artifact freshness
99
+ - Expected: mtime within 30 hours
100
+ - Actual: last modified 47 hours ago
101
+ - Evidence: `stat -c '%Y' ~/ha-logs/intelligence/current.json` → 1708900000
102
+ - Action: Check aria engine timer — may have failed silently
103
+
104
+ SKIPPED:
105
+ ## Ollama queue
106
+ - Reason: ollama-queue.service is inactive
107
+ - Action: Start service before re-running probe
108
+
109
+ PASSED: 5/7 seams healthy
110
+ ```
111
+
112
+ ## Results Directory
113
+
114
+ Write all reports to `/tmp/integration-tester-results/`:
115
+ - `report-<timestamp>.md` — human-readable report
116
+ - `results-<timestamp>.json` — machine-readable results
117
+
118
+ ## Hallucination Guard
119
+
120
+ Every PASS and FAIL must include quoted command output as evidence. Never infer seam health from service code or documentation. If a command produces no output or an error, report that as the evidence. Do not fabricate file contents, timestamps, or command results.