agentic-orchestrator 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (300)
  1. package/.claude/settings.local.json +15 -0
  2. package/CLAUDE.md +126 -0
  3. package/README.md +166 -25
  4. package/agentic/orchestrator/adapters.yaml +3 -0
  5. package/agentic/orchestrator/gates.yaml +47 -0
  6. package/agentic/orchestrator/policy.yaml +89 -0
  7. package/agentic/orchestrator/schemas/adapters.schema.json +12 -0
  8. package/agentic/orchestrator/schemas/gates.schema.json +6 -1
  9. package/agentic/orchestrator/schemas/index.schema.json +14 -0
  10. package/agentic/orchestrator/schemas/multi-project.schema.json +41 -0
  11. package/agentic/orchestrator/schemas/policy.schema.json +449 -52
  12. package/agentic/orchestrator/schemas/state.schema.json +16 -0
  13. package/agentic/orchestrator/tools/catalog.json +68 -0
  14. package/agentic/orchestrator/tools/schemas/input/cost.get.input.schema.json +10 -0
  15. package/agentic/orchestrator/tools/schemas/input/cost.record.input.schema.json +13 -0
  16. package/agentic/orchestrator/tools/schemas/input/feature.send_message.input.schema.json +11 -0
  17. package/agentic/orchestrator/tools/schemas/input/performance.get_analytics.input.schema.json +10 -0
  18. package/agentic/orchestrator/tools/schemas/input/performance.record_outcome.input.schema.json +18 -0
  19. package/agentic/orchestrator/tools/schemas/output/cost.get.output.schema.json +13 -0
  20. package/agentic/orchestrator/tools/schemas/output/cost.record.output.schema.json +13 -0
  21. package/agentic/orchestrator/tools/schemas/output/feature.ready_to_merge.output.schema.json +7 -0
  22. package/agentic/orchestrator/tools/schemas/output/feature.send_message.output.schema.json +23 -0
  23. package/agentic/orchestrator/tools/schemas/output/performance.get_analytics.output.schema.json +46 -0
  24. package/agentic/orchestrator/tools/schemas/output/performance.record_outcome.output.schema.json +10 -0
  25. package/agentic/orchestrator/tools.md +5 -0
  26. package/apps/control-plane/scripts/validate-architecture-rules.mjs +28 -2
  27. package/apps/control-plane/scripts/validate-docker-mcp-contract.mjs +12 -0
  28. package/apps/control-plane/scripts/validate-mcp-contracts.ts +92 -0
  29. package/apps/control-plane/src/application/adapters/adapter-registry.ts +169 -0
  30. package/apps/control-plane/src/application/multi-project-loader.ts +119 -0
  31. package/apps/control-plane/src/application/services/activity-monitor-service.ts +199 -0
  32. package/apps/control-plane/src/application/services/cost-tracking-service.ts +82 -0
  33. package/apps/control-plane/src/application/services/dependency-scheduler-service.ts +86 -0
  34. package/apps/control-plane/src/application/services/feature-deletion-service.ts +8 -7
  35. package/apps/control-plane/src/application/services/gate-interpolation-service.ts +15 -0
  36. package/apps/control-plane/src/application/services/gate-service.ts +38 -2
  37. package/apps/control-plane/src/application/services/instance-isolation-service.ts +18 -0
  38. package/apps/control-plane/src/application/services/issue-tracker-service.ts +469 -0
  39. package/apps/control-plane/src/application/services/merge-service.ts +67 -3
  40. package/apps/control-plane/src/application/services/notifier-service.ts +295 -0
  41. package/apps/control-plane/src/application/services/performance-analytics-service.ts +122 -0
  42. package/apps/control-plane/src/application/services/plan-service.ts +51 -0
  43. package/apps/control-plane/src/application/services/pr-monitor-service.ts +262 -0
  44. package/apps/control-plane/src/application/services/reactions-service.ts +175 -0
  45. package/apps/control-plane/src/application/services/reporting-service.ts +17 -2
  46. package/apps/control-plane/src/application/services/run-lease-service.ts +16 -38
  47. package/apps/control-plane/src/application/tools/tool-metadata.ts +4 -1
  48. package/apps/control-plane/src/cli/aop.ts +1 -1
  49. package/apps/control-plane/src/cli/attach-command-handler.ts +120 -0
  50. package/apps/control-plane/src/cli/cleanup-command-handler.ts +190 -0
  51. package/apps/control-plane/src/cli/cli-argument-parser.ts +69 -3
  52. package/apps/control-plane/src/cli/dashboard-command-handler.ts +57 -0
  53. package/apps/control-plane/src/cli/help-command-handler.ts +163 -0
  54. package/apps/control-plane/src/cli/init-command-handler.ts +609 -0
  55. package/apps/control-plane/src/cli/resume-command-handler.ts +1 -0
  56. package/apps/control-plane/src/cli/retry-command-handler.ts +138 -0
  57. package/apps/control-plane/src/cli/run-command-handler.ts +115 -3
  58. package/apps/control-plane/src/cli/send-command-handler.ts +65 -0
  59. package/apps/control-plane/src/cli/status-command-handler.ts +102 -2
  60. package/apps/control-plane/src/cli/types.ts +26 -1
  61. package/apps/control-plane/src/core/constants.ts +8 -2
  62. package/apps/control-plane/src/core/error-codes.ts +3 -1
  63. package/apps/control-plane/src/core/gates.ts +170 -50
  64. package/apps/control-plane/src/core/kernel.ts +280 -5
  65. package/apps/control-plane/src/core/path-layout.ts +12 -0
  66. package/apps/control-plane/src/core/tool-caller.ts +36 -0
  67. package/apps/control-plane/src/core/workspace-hooks.ts +87 -0
  68. package/apps/control-plane/src/interfaces/cli/bootstrap.ts +258 -9
  69. package/apps/control-plane/src/providers/providers.ts +235 -14
  70. package/apps/control-plane/src/supervisor/build-wave-executor.ts +129 -8
  71. package/apps/control-plane/src/supervisor/qa-wave-executor.ts +123 -5
  72. package/apps/control-plane/src/supervisor/run-coordinator.ts +143 -6
  73. package/apps/control-plane/src/supervisor/runtime.ts +135 -6
  74. package/apps/control-plane/src/supervisor/types.ts +12 -21
  75. package/apps/control-plane/src/supervisor/worker-decision-loop.ts +8 -0
  76. package/apps/control-plane/test/activity-monitor.spec.ts +294 -0
  77. package/apps/control-plane/test/adapter-registry.spec.ts +132 -0
  78. package/apps/control-plane/test/batch-operations.spec.ts +112 -0
  79. package/apps/control-plane/test/bootstrap-attach.spec.ts +102 -0
  80. package/apps/control-plane/test/bootstrap-edge-cases.spec.ts +252 -0
  81. package/apps/control-plane/test/bootstrap.spec.ts +560 -0
  82. package/apps/control-plane/test/cleanup-command.spec.ts +301 -0
  83. package/apps/control-plane/test/cli-helpers.spec.ts +404 -1
  84. package/apps/control-plane/test/cli.unit.spec.ts +182 -1
  85. package/apps/control-plane/test/collision-queue.spec.ts +104 -1
  86. package/apps/control-plane/test/core-utils.spec.ts +175 -2
  87. package/apps/control-plane/test/cost-tracking.spec.ts +143 -0
  88. package/apps/control-plane/test/dashboard-api.integration.spec.ts +247 -0
  89. package/apps/control-plane/test/dashboard-client.spec.ts +116 -0
  90. package/apps/control-plane/test/dashboard-command.spec.ts +103 -0
  91. package/apps/control-plane/test/dependency-scheduler.spec.ts +189 -0
  92. package/apps/control-plane/test/epoch-tracking.spec.ts +4 -4
  93. package/apps/control-plane/test/feature-deletion-service.spec.ts +422 -0
  94. package/apps/control-plane/test/feature-lifecycle.spec.ts +202 -0
  95. package/apps/control-plane/test/git-spawn-error.spec.ts +24 -0
  96. package/apps/control-plane/test/incremental-gates.spec.ts +137 -0
  97. package/apps/control-plane/test/init-wizard.spec.ts +506 -0
  98. package/apps/control-plane/test/instance-isolation.spec.ts +83 -0
  99. package/apps/control-plane/test/issue-tracker.spec.ts +890 -0
  100. package/apps/control-plane/test/kernel.coverage.spec.ts +3 -5
  101. package/apps/control-plane/test/kernel.coverage2.spec.ts +871 -0
  102. package/apps/control-plane/test/kernel.spec.ts +13 -11
  103. package/apps/control-plane/test/lock-service.spec.ts +508 -0
  104. package/apps/control-plane/test/mcp-helpers.spec.ts +176 -0
  105. package/apps/control-plane/test/mcp.spec.ts +50 -15
  106. package/apps/control-plane/test/merge-service.spec.ts +67 -4
  107. package/apps/control-plane/test/multi-project.spec.ts +372 -0
  108. package/apps/control-plane/test/notifier-service.spec.ts +388 -0
  109. package/apps/control-plane/test/parallel-gates.spec.ts +312 -0
  110. package/apps/control-plane/test/patch-service.spec.ts +253 -0
  111. package/apps/control-plane/test/performance-analytics.spec.ts +338 -0
  112. package/apps/control-plane/test/planning-wave-executor.spec.ts +168 -0
  113. package/apps/control-plane/test/pr-monitor.spec.ts +385 -0
  114. package/apps/control-plane/test/providers.spec.ts +344 -1
  115. package/apps/control-plane/test/reactions.spec.ts +392 -0
  116. package/apps/control-plane/test/resume-command.spec.ts +390 -0
  117. package/apps/control-plane/test/run-coordinator.spec.ts +481 -2
  118. package/apps/control-plane/test/schema-date-time.spec.ts +46 -0
  119. package/apps/control-plane/test/service-retry-paths.spec.ts +30 -0
  120. package/apps/control-plane/test/services.spec.ts +95 -2
  121. package/apps/control-plane/test/session-management.spec.ts +450 -0
  122. package/apps/control-plane/test/spec-ingestion.spec.ts +190 -0
  123. package/apps/control-plane/test/supervisor-collaborators.spec.ts +699 -2
  124. package/apps/control-plane/test/supervisor.spec.ts +36 -30
  125. package/apps/control-plane/test/supervisor.unit.spec.ts +405 -0
  126. package/apps/control-plane/test/worker-decision-loop.spec.ts +57 -0
  127. package/apps/control-plane/test/workspace-hooks.spec.ts +177 -0
  128. package/apps/control-plane/vitest.config.ts +21 -5
  129. package/dist/apps/control-plane/application/adapters/adapter-registry.d.ts +44 -0
  130. package/dist/apps/control-plane/application/adapters/adapter-registry.js +76 -0
  131. package/dist/apps/control-plane/application/adapters/adapter-registry.js.map +1 -0
  132. package/dist/apps/control-plane/application/multi-project-loader.d.ts +31 -0
  133. package/dist/apps/control-plane/application/multi-project-loader.js +82 -0
  134. package/dist/apps/control-plane/application/multi-project-loader.js.map +1 -0
  135. package/dist/apps/control-plane/application/services/activity-monitor-service.d.ts +43 -0
  136. package/dist/apps/control-plane/application/services/activity-monitor-service.js +132 -0
  137. package/dist/apps/control-plane/application/services/activity-monitor-service.js.map +1 -0
  138. package/dist/apps/control-plane/application/services/cost-tracking-service.d.ts +28 -0
  139. package/dist/apps/control-plane/application/services/cost-tracking-service.js +48 -0
  140. package/dist/apps/control-plane/application/services/cost-tracking-service.js.map +1 -0
  141. package/dist/apps/control-plane/application/services/dependency-scheduler-service.d.ts +26 -0
  142. package/dist/apps/control-plane/application/services/dependency-scheduler-service.js +75 -0
  143. package/dist/apps/control-plane/application/services/dependency-scheduler-service.js.map +1 -0
  144. package/dist/apps/control-plane/application/services/feature-deletion-service.d.ts +2 -0
  145. package/dist/apps/control-plane/application/services/feature-deletion-service.js +6 -7
  146. package/dist/apps/control-plane/application/services/feature-deletion-service.js.map +1 -1
  147. package/dist/apps/control-plane/application/services/gate-interpolation-service.d.ts +7 -0
  148. package/dist/apps/control-plane/application/services/gate-interpolation-service.js +7 -0
  149. package/dist/apps/control-plane/application/services/gate-interpolation-service.js.map +1 -0
  150. package/dist/apps/control-plane/application/services/gate-service.js +32 -2
  151. package/dist/apps/control-plane/application/services/gate-service.js.map +1 -1
  152. package/dist/apps/control-plane/application/services/instance-isolation-service.d.ts +11 -0
  153. package/dist/apps/control-plane/application/services/instance-isolation-service.js +17 -0
  154. package/dist/apps/control-plane/application/services/instance-isolation-service.js.map +1 -0
  155. package/dist/apps/control-plane/application/services/issue-tracker-service.d.ts +65 -0
  156. package/dist/apps/control-plane/application/services/issue-tracker-service.js +358 -0
  157. package/dist/apps/control-plane/application/services/issue-tracker-service.js.map +1 -0
  158. package/dist/apps/control-plane/application/services/merge-service.d.ts +4 -0
  159. package/dist/apps/control-plane/application/services/merge-service.js +44 -2
  160. package/dist/apps/control-plane/application/services/merge-service.js.map +1 -1
  161. package/dist/apps/control-plane/application/services/notifier-service.d.ts +74 -0
  162. package/dist/apps/control-plane/application/services/notifier-service.js +212 -0
  163. package/dist/apps/control-plane/application/services/notifier-service.js.map +1 -0
  164. package/dist/apps/control-plane/application/services/performance-analytics-service.d.ts +39 -0
  165. package/dist/apps/control-plane/application/services/performance-analytics-service.js +75 -0
  166. package/dist/apps/control-plane/application/services/performance-analytics-service.js.map +1 -0
  167. package/dist/apps/control-plane/application/services/plan-service.d.ts +1 -0
  168. package/dist/apps/control-plane/application/services/plan-service.js +53 -0
  169. package/dist/apps/control-plane/application/services/plan-service.js.map +1 -1
  170. package/dist/apps/control-plane/application/services/pr-monitor-service.d.ts +44 -0
  171. package/dist/apps/control-plane/application/services/pr-monitor-service.js +192 -0
  172. package/dist/apps/control-plane/application/services/pr-monitor-service.js.map +1 -0
  173. package/dist/apps/control-plane/application/services/reactions-service.d.ts +67 -0
  174. package/dist/apps/control-plane/application/services/reactions-service.js +114 -0
  175. package/dist/apps/control-plane/application/services/reactions-service.js.map +1 -0
  176. package/dist/apps/control-plane/application/services/reporting-service.d.ts +1 -0
  177. package/dist/apps/control-plane/application/services/reporting-service.js +13 -2
  178. package/dist/apps/control-plane/application/services/reporting-service.js.map +1 -1
  179. package/dist/apps/control-plane/application/services/run-lease-service.d.ts +2 -0
  180. package/dist/apps/control-plane/application/services/run-lease-service.js +14 -38
  181. package/dist/apps/control-plane/application/services/run-lease-service.js.map +1 -1
  182. package/dist/apps/control-plane/application/tools/tool-metadata.js +3 -1
  183. package/dist/apps/control-plane/application/tools/tool-metadata.js.map +1 -1
  184. package/dist/apps/control-plane/cli/aop.d.ts +1 -1
  185. package/dist/apps/control-plane/cli/aop.js +1 -1
  186. package/dist/apps/control-plane/cli/attach-command-handler.d.ts +12 -0
  187. package/dist/apps/control-plane/cli/attach-command-handler.js +98 -0
  188. package/dist/apps/control-plane/cli/attach-command-handler.js.map +1 -0
  189. package/dist/apps/control-plane/cli/cleanup-command-handler.d.ts +12 -0
  190. package/dist/apps/control-plane/cli/cleanup-command-handler.js +162 -0
  191. package/dist/apps/control-plane/cli/cleanup-command-handler.js.map +1 -0
  192. package/dist/apps/control-plane/cli/cli-argument-parser.js +73 -3
  193. package/dist/apps/control-plane/cli/cli-argument-parser.js.map +1 -1
  194. package/dist/apps/control-plane/cli/dashboard-command-handler.d.ts +7 -0
  195. package/dist/apps/control-plane/cli/dashboard-command-handler.js +45 -0
  196. package/dist/apps/control-plane/cli/dashboard-command-handler.js.map +1 -0
  197. package/dist/apps/control-plane/cli/help-command-handler.d.ts +8 -0
  198. package/dist/apps/control-plane/cli/help-command-handler.js +146 -0
  199. package/dist/apps/control-plane/cli/help-command-handler.js.map +1 -0
  200. package/dist/apps/control-plane/cli/init-command-handler.d.ts +26 -0
  201. package/dist/apps/control-plane/cli/init-command-handler.js +517 -0
  202. package/dist/apps/control-plane/cli/init-command-handler.js.map +1 -0
  203. package/dist/apps/control-plane/cli/resume-command-handler.js +1 -1
  204. package/dist/apps/control-plane/cli/resume-command-handler.js.map +1 -1
  205. package/dist/apps/control-plane/cli/retry-command-handler.d.ts +8 -0
  206. package/dist/apps/control-plane/cli/retry-command-handler.js +111 -0
  207. package/dist/apps/control-plane/cli/retry-command-handler.js.map +1 -0
  208. package/dist/apps/control-plane/cli/run-command-handler.d.ts +5 -0
  209. package/dist/apps/control-plane/cli/run-command-handler.js +82 -3
  210. package/dist/apps/control-plane/cli/run-command-handler.js.map +1 -1
  211. package/dist/apps/control-plane/cli/send-command-handler.d.ts +8 -0
  212. package/dist/apps/control-plane/cli/send-command-handler.js +55 -0
  213. package/dist/apps/control-plane/cli/send-command-handler.js.map +1 -0
  214. package/dist/apps/control-plane/cli/status-command-handler.d.ts +12 -1
  215. package/dist/apps/control-plane/cli/status-command-handler.js +55 -2
  216. package/dist/apps/control-plane/cli/status-command-handler.js.map +1 -1
  217. package/dist/apps/control-plane/cli/types.d.ts +26 -1
  218. package/dist/apps/control-plane/cli/types.js +15 -1
  219. package/dist/apps/control-plane/cli/types.js.map +1 -1
  220. package/dist/apps/control-plane/core/constants.d.ts +6 -0
  221. package/dist/apps/control-plane/core/constants.js +8 -2
  222. package/dist/apps/control-plane/core/constants.js.map +1 -1
  223. package/dist/apps/control-plane/core/error-codes.d.ts +2 -0
  224. package/dist/apps/control-plane/core/error-codes.js +3 -1
  225. package/dist/apps/control-plane/core/error-codes.js.map +1 -1
  226. package/dist/apps/control-plane/core/gates.d.ts +4 -0
  227. package/dist/apps/control-plane/core/gates.js +140 -43
  228. package/dist/apps/control-plane/core/gates.js.map +1 -1
  229. package/dist/apps/control-plane/core/kernel.d.ts +50 -1
  230. package/dist/apps/control-plane/core/kernel.js +220 -7
  231. package/dist/apps/control-plane/core/kernel.js.map +1 -1
  232. package/dist/apps/control-plane/core/path-layout.d.ts +3 -0
  233. package/dist/apps/control-plane/core/path-layout.js +9 -0
  234. package/dist/apps/control-plane/core/path-layout.js.map +1 -1
  235. package/dist/apps/control-plane/core/tool-caller.d.ts +32 -0
  236. package/dist/apps/control-plane/core/tool-caller.js +2 -0
  237. package/dist/apps/control-plane/core/tool-caller.js.map +1 -0
  238. package/dist/apps/control-plane/core/workspace-hooks.d.ts +20 -0
  239. package/dist/apps/control-plane/core/workspace-hooks.js +69 -0
  240. package/dist/apps/control-plane/core/workspace-hooks.js.map +1 -0
  241. package/dist/apps/control-plane/interfaces/cli/bootstrap.js +245 -9
  242. package/dist/apps/control-plane/interfaces/cli/bootstrap.js.map +1 -1
  243. package/dist/apps/control-plane/providers/providers.d.ts +42 -3
  244. package/dist/apps/control-plane/providers/providers.js +216 -5
  245. package/dist/apps/control-plane/providers/providers.js.map +1 -1
  246. package/dist/apps/control-plane/supervisor/build-wave-executor.d.ts +3 -0
  247. package/dist/apps/control-plane/supervisor/build-wave-executor.js +115 -6
  248. package/dist/apps/control-plane/supervisor/build-wave-executor.js.map +1 -1
  249. package/dist/apps/control-plane/supervisor/qa-wave-executor.d.ts +3 -0
  250. package/dist/apps/control-plane/supervisor/qa-wave-executor.js +109 -5
  251. package/dist/apps/control-plane/supervisor/qa-wave-executor.js.map +1 -1
  252. package/dist/apps/control-plane/supervisor/run-coordinator.d.ts +15 -0
  253. package/dist/apps/control-plane/supervisor/run-coordinator.js +132 -6
  254. package/dist/apps/control-plane/supervisor/run-coordinator.js.map +1 -1
  255. package/dist/apps/control-plane/supervisor/runtime.d.ts +3 -0
  256. package/dist/apps/control-plane/supervisor/runtime.js +110 -6
  257. package/dist/apps/control-plane/supervisor/runtime.js.map +1 -1
  258. package/dist/apps/control-plane/supervisor/types.d.ts +9 -16
  259. package/dist/apps/control-plane/supervisor/types.js.map +1 -1
  260. package/dist/apps/control-plane/supervisor/worker-decision-loop.d.ts +3 -0
  261. package/dist/apps/control-plane/supervisor/worker-decision-loop.js +5 -0
  262. package/dist/apps/control-plane/supervisor/worker-decision-loop.js.map +1 -1
  263. package/eslint.config.mjs +2 -1
  264. package/package.json +12 -2
  265. package/packages/web-dashboard/next-env.d.ts +5 -0
  266. package/packages/web-dashboard/next.config.js +7 -0
  267. package/packages/web-dashboard/package.json +26 -0
  268. package/packages/web-dashboard/src/app/api/actions/route.ts +64 -0
  269. package/packages/web-dashboard/src/app/api/events/route.ts +51 -0
  270. package/packages/web-dashboard/src/app/api/features/[id]/checkout/route.ts +256 -0
  271. package/packages/web-dashboard/src/app/api/features/[id]/diff/route.ts +10 -0
  272. package/packages/web-dashboard/src/app/api/features/[id]/evidence/[artifact]/route.ts +25 -0
  273. package/packages/web-dashboard/src/app/api/features/[id]/review/route.ts +63 -0
  274. package/packages/web-dashboard/src/app/api/features/[id]/route.ts +16 -0
  275. package/packages/web-dashboard/src/app/api/projects/route.ts +31 -0
  276. package/packages/web-dashboard/src/app/api/status/route.ts +15 -0
  277. package/packages/web-dashboard/src/app/globals.css +2 -0
  278. package/packages/web-dashboard/src/app/layout.tsx +15 -0
  279. package/packages/web-dashboard/src/app/page.tsx +393 -0
  280. package/packages/web-dashboard/src/lib/aop-client.ts +244 -0
  281. package/packages/web-dashboard/src/lib/multi-project-config.ts +116 -0
  282. package/packages/web-dashboard/src/lib/orchestrator-tools.ts +284 -0
  283. package/packages/web-dashboard/src/lib/types.ts +58 -0
  284. package/packages/web-dashboard/tsconfig.json +40 -0
  285. package/packages/web-dashboard/vitest.config.ts +6 -0
  286. package/spec-files/completed/agentic_orchestrator_feature_gaps_closure_spec.md +1764 -0
  287. package/spec-files/outstanding/agentic_orchestrator_enterprise_governance_dashboard_spec.md +348 -0
  288. package/spec-files/outstanding/agentic_orchestrator_knowledge_canary_spec.md +344 -0
  289. package/spec-files/outstanding/agentic_orchestrator_observability_integrity_diagnostics_spec.md +374 -0
  290. package/spec-files/outstanding/agentic_orchestrator_performance_improvements_spec.md +1059 -0
  291. package/spec-files/outstanding/agentic_orchestrator_planning_review_quality_spec.md +466 -0
  292. package/spec-files/outstanding/agentic_orchestrator_quality_adoption_execution_spec.md +198 -0
  293. package/spec-files/outstanding/agentic_orchestrator_validator_hardening_spec.md +365 -0
  294. package/spec-files/progress.md +481 -52
  295. /package/spec-files/{agentic_orchestrator_cli_delete_command_spec.md → completed/agentic_orchestrator_cli_delete_command_spec.md} +0 -0
  296. /package/spec-files/{agentic_orchestrator_dot_aop_generated_artifacts_spec.md → completed/agentic_orchestrator_dot_aop_generated_artifacts_spec.md} +0 -0
  297. /package/spec-files/{agentic_orchestrator_mcp_formalization_spec.md → completed/agentic_orchestrator_mcp_formalization_spec.md} +0 -0
  298. /package/spec-files/{agentic_orchestrator_oop_refactor_spec.md → completed/agentic_orchestrator_oop_refactor_spec.md} +0 -0
  299. /package/spec-files/{agentic_orchestrator_single_global_orchestrator_spec.md → completed/agentic_orchestrator_single_global_orchestrator_spec.md} +0 -0
  300. /package/spec-files/{agentic_orchestrator_spec.md → completed/agentic_orchestrator_spec.md} +0 -0
package/spec-files/completed/agentic_orchestrator_feature_gaps_closure_spec.md
@@ -0,0 +1,1764 @@
1
+ # Feature Spec: Closing Functionality Gaps Between Agentic-Orchestrator and ComposioHQ Agent Orchestrator
2
+
3
+ **Version:** 2.0
4
+ **Date:** 2026-03-02
5
+ **Status:** Draft (Revised — deep-dive analysis against ComposioHQ source)
6
+ **Milestone:** M29 - Feature Gap Closure
7
+
8
+ ---
9
+
10
+ ## 0. Implementation Standards & References
11
+
12
+ ### 0.1 Testing Standards
13
+
14
+ All new code MUST follow the testing standards defined in:
15
+ - **`prompts/vitest-testing-standards.instructions.md`**
16
+
17
+ Key requirements:
18
+ - Use Vitest (`describe/it/expect`, `vi` mocks/spies)
19
+ - Match existing repo conventions (test files in `apps/control-plane/test/*.spec.ts`)
20
+ - Use **Given / When / Then** naming: `GIVEN_<context>_WHEN_<action>_THEN_<expected>`
21
+ - Maintain coverage thresholds: lines/branches ≥90%
22
+ - No flaky tests: use `vi.useFakeTimers()` for time-dependent tests
23
+ - Mock external I/O (HTTP calls, filesystem outside temp dirs)
24
+
25
+ ### 0.2 Reference Implementation Repository
26
+
27
+ The ComposioHQ Agent Orchestrator implementation serves as the reference for feature implementations:
28
+ - **Repository:** https://github.com/ComposioHQ/agent-orchestrator
29
+ - **Tech stack:** TypeScript ESM, pnpm workspaces, Commander.js CLI, Next.js 15 + React 19 dashboard, Zod validation, Vitest + Playwright tests, tmux-based agent runtime
30
+ - **Key directories to study:**
31
+ - `packages/web/src/` — Dashboard (Next.js App Router, Kanban view, WebSocket terminal via ttyd/xterm.js)
32
+ - `packages/core/src/` — Core services (`session-manager.ts` 38KB, `lifecycle-manager.ts` 20KB, `config.ts` 13KB, `metadata.ts`, `plugin-registry.ts`, `orchestrator-prompt.ts`, `paths.ts`)
33
+ - `packages/plugins/notifier-*/` — 4 notification channels (slack, webhook, desktop, composio)
34
+ - `packages/plugins/agent-*/` — 4 agent plugins (claude-code 29KB, codex 28KB+16KB app-server, aider 7KB, opencode 5KB)
35
+ - `packages/plugins/tracker-*/` — 2 tracker plugins (github 8KB, linear 22KB with dual-transport)
36
+ - `packages/plugins/workspace-*/` — 2 workspace plugins (worktree, clone)
37
+ - `packages/plugins/runtime-*/` — 2 runtime plugins (tmux, process)
38
+ - `packages/plugins/terminal-*/` — 2 terminal plugins (iterm2, web)
39
+ - `packages/plugins/scm-github/` — PR lifecycle, CI checks, review decisions, merge control
40
+ - `packages/cli/src/commands/` — CLI commands (init, start, stop, spawn, batch-spawn, status, send, session, dashboard, review-check, open)
41
+ - **Architecture note:** Composio uses a meta-agent pattern where the orchestrator itself is an AI agent (Claude Code) that receives a system prompt teaching it the `ao` CLI. Worker agents are spawned in tmux sessions/processes. This is fundamentally different from AOP's code-driven supervisor.
42
+
43
+ ### 0.3 Package Dependencies (Required Additions)
44
+
45
+ New dependencies to add to `apps/control-plane/package.json`:
46
+ ```json
47
+ {
48
+ "dependencies": {
49
+ "chalk": "^5.3.0",
50
+ "yaml": "^2.3.4"
51
+ },
52
+ "devDependencies": {
53
+ "@testing-library/react": "^14.0.0",
54
+ "msw": "^2.0.0"
55
+ }
56
+ }
57
+ ```
58
+
59
+ New package for dashboard (`packages/web-dashboard/package.json`):
60
+ ```json
61
+ {
62
+ "name": "@aop/web-dashboard",
63
+ "version": "0.1.0",
64
+ "type": "module",
65
+ "dependencies": {
66
+ "next": "^14.0.0",
67
+ "react": "^18.2.0",
68
+ "react-dom": "^18.2.0",
69
+ "@monaco-editor/react": "^4.6.0",
70
+ "tailwindcss": "^3.4.0"
71
+ }
72
+ }
73
+ ```
74
+
75
+ ---
76
+
77
+ ## 1. Executive Summary
78
+
79
+ This specification defines a roadmap to close identified functionality gaps between Agentic-Orchestrator (AOP) and ComposioHQ's Agent Orchestrator, focusing on high-value features that align with AOP's deterministic, MCP-first architecture while preserving our core differentiators (collision detection, lock management, multi-phase workflows, quality gates).
80
+
81
+ **Guiding Principle:** Adopt features that enhance developer experience and operational visibility WITHOUT compromising deterministic guarantees, state consistency, or explicit merge control.
82
+
83
+ ---
84
+
85
+ ## 2. Gap Prioritization Framework
86
+
87
+ ### 2.1 Priority Tiers
88
+
89
+ **P0 (Critical - Implement First):**
90
+ - Features that eliminate major UX friction
91
+ - Features that enable production deployment
92
+ - Features required for multi-project workflows
93
+
94
+ **P1 (High Value - Implement Soon):**
95
+ - Features that significantly improve observability
96
+ - Features that reduce manual configuration burden
97
+ - Features that enable common use cases
98
+
99
+ **P2 (Nice to Have - Future Consideration):**
100
+ - Features that improve edge cases
101
+ - Features with viable workarounds
102
+ - Features with unclear ROI
103
+
104
+ **P3 (Low Priority - Deferred):**
105
+ - Features that conflict with core architecture
106
+ - Features with marginal benefit
107
+ - Features that require major redesigns
108
+
109
+ ---
110
+
111
+ ## 3. Gap Analysis by Priority
112
+
113
+ ### 3.1 P0 GAPS (Critical - Must Implement)
114
+
115
+ #### G1: Web Dashboard with Real-Time Updates
116
+
117
+ **ComposioHQ Feature:** Next.js dashboard with Server-Sent Events for live session monitoring.
118
+
119
+ **Gap Impact:** No visual monitoring; CLI-only interaction adds friction for teams.
120
+
121
+ **AOP Design Alignment:** HIGH - Monitoring does not conflict with the deterministic model.
122
+
123
+ **Reference Implementation (ComposioHQ):**
124
+ - SSE endpoint: `packages/web/src/app/api/events/route.ts`
125
+ - Dashboard components: `packages/web/src/components/Dashboard.tsx`, `SessionCard.tsx`, `SessionDetail.tsx`
126
+ - Hooks: `packages/web/src/hooks/` (real-time state management)
127
+
128
+ **Specification:**
129
+
130
+ **Directory Structure:**
131
+ ```
132
+ packages/web-dashboard/
133
+ ├── package.json
134
+ ├── tsconfig.json
135
+ ├── next.config.js
136
+ ├── tailwind.config.js
137
+ ├── src/
138
+ │ ├── app/
139
+ │ │ ├── layout.tsx
140
+ │ │ ├── page.tsx # Main dashboard page
141
+ │ │ ├── globals.css
142
+ │ │ └── api/
143
+ │ │ ├── status/route.ts # GET /api/status
144
+ │ │ ├── events/route.ts # GET /api/events (SSE)
145
+ │ │ ├── features/
146
+ │ │ │ └── [id]/
147
+ │ │ │ ├── route.ts # GET /api/features/:id
148
+ │ │ │ ├── diff/route.ts # GET /api/features/:id/diff
149
+ │ │ │ └── evidence/
150
+ │ │ │ └── [artifact]/route.ts
151
+ │ │ └── actions/route.ts # POST /api/actions
152
+ │ │ ├── actions/route.ts # POST /api/actions
153
+ │ │ └── features/
154
+ │ │ └── [id]/
155
+ │ │ ├── review/route.ts # POST /api/features/:id/review (approve/deny/request-changes)
156
+ │ │ └── checkout/route.ts # POST /api/features/:id/checkout (switch main repo to worktree branch)
157
+ │ ├── components/
158
+ │ │ ├── FeatureCard.tsx # Feature status card
159
+ │ │ ├── FeatureDetail.tsx # Expanded feature view
160
+ │ │ ├── DiffViewer.tsx # Monaco-based diff viewer
161
+ │ │ ├── EvidenceViewer.tsx # Logs/coverage display
162
+ │ │ ├── ReviewPanel.tsx # Review/approve/deny controls + merge trigger
163
+ │ │ ├── CheckoutButton.tsx # One-click checkout to feature worktree branch
164
+ │ │ ├── StatusBadge.tsx # Status indicator
165
+ │ │ └── LockIndicator.tsx # Lock visualization
166
+ │ ├── hooks/
167
+ │ │ ├── useFeatures.ts # Feature state hook
168
+ │ │ └── useSSE.ts # SSE connection hook
169
+ │ └── lib/
170
+ │ ├── aop-client.ts # Read .aop/ files
171
+ │ └── types.ts # TypeScript types
172
+ └── vitest.config.ts
173
+ ```
174
+
175
+ **SSE Implementation (Reference: ComposioHQ `packages/web/src/app/api/events/route.ts`):**
176
+ ```typescript
177
+ // packages/web-dashboard/src/app/api/events/route.ts
178
+ export const dynamic = "force-dynamic";
179
+
180
+ export async function GET(): Promise<Response> {
181
+ const encoder = new TextEncoder();
182
+ let heartbeat: ReturnType<typeof setInterval> | undefined;
183
+ let updates: ReturnType<typeof setInterval> | undefined;
184
+
185
+ const stream = new ReadableStream({
186
+     start(controller) {
+       // Push the current feature index to the client. The enqueue is guarded
+       // because an in-flight read can finish after the stream has closed.
+       const sendSnapshot = async (): Promise<void> => {
+         const features = await readFeaturesIndex();
+         const event = { type: "snapshot", features };
+         try {
+           controller.enqueue(encoder.encode(`data: ${JSON.stringify(event)}\n\n`));
+         } catch {
+           clearInterval(heartbeat);
+           clearInterval(updates);
+         }
+       };
+       // A failed read is skipped; the next poll retries
+       const trySnapshot = (): void => {
+         sendSnapshot().catch(() => undefined);
+       };
+
+       // Send initial snapshot
+       trySnapshot();
+
+       // Heartbeat every 15s
+       heartbeat = setInterval(() => {
+         try {
+           controller.enqueue(encoder.encode(`: heartbeat\n\n`));
+         } catch {
+           clearInterval(heartbeat);
+           clearInterval(updates);
+         }
+       }, 15000);
+
+       // Poll for changes every 2s
+       updates = setInterval(trySnapshot, 2000);
+     },
213
+ cancel() {
214
+ clearInterval(heartbeat);
215
+ clearInterval(updates);
216
+ },
217
+ });
218
+
219
+ return new Response(stream, {
220
+ headers: {
221
+ "Content-Type": "text/event-stream",
222
+ "Cache-Control": "no-cache",
223
+ Connection: "keep-alive",
224
+ "X-Accel-Buffering": "no",
225
+ },
226
+ });
227
+ }
228
+ ```
229
+
230
+ **useSSE Hook Implementation:**
231
+ ```typescript
232
+ // packages/web-dashboard/src/hooks/useSSE.ts
233
+ import { useState, useEffect } from 'react';
234
+
235
+ interface SSEOptions {
236
+ url: string;
237
+ onMessage: (data: unknown) => void;
238
+ reconnectInterval?: number;
239
+ }
240
+
241
+ export function useSSE({ url, onMessage, reconnectInterval = 5000 }: SSEOptions) {
242
+ const [connected, setConnected] = useState(false);
243
+ const [error, setError] = useState<Error | null>(null);
244
+
245
+ useEffect(() => {
246
+ let eventSource: EventSource | null = null;
247
+ let reconnectTimer: ReturnType<typeof setTimeout> | null = null;
248
+
249
+ const connect = () => {
250
+ eventSource = new EventSource(url);
251
+
252
+ eventSource.onopen = () => {
253
+ setConnected(true);
254
+ setError(null);
255
+ };
256
+
257
+ eventSource.onmessage = (event) => {
258
+ try {
259
+ const data = JSON.parse(event.data);
260
+ onMessage(data);
261
+ } catch (e) {
262
+ console.error('SSE parse error:', e);
263
+ }
264
+ };
265
+
266
+ eventSource.onerror = () => {
267
+ setConnected(false);
268
+ eventSource?.close();
269
+ reconnectTimer = setTimeout(connect, reconnectInterval);
270
+ };
271
+ };
272
+
273
+ connect();
274
+
275
+ return () => {
276
+ eventSource?.close();
277
+ if (reconnectTimer) clearTimeout(reconnectTimer);
278
+ };
279
+ }, [url, onMessage, reconnectInterval]);
280
+
281
+ return { connected, error };
282
+ }
283
+ ```
284
+
285
+ **CLI Integration:**
286
+ ```typescript
287
+ // apps/control-plane/src/cli/dashboard-command-handler.ts
288
+ import { spawn } from 'node:child_process';
289
+ import { resolve } from 'node:path';
290
+
291
+ export class DashboardCommandHandler {
292
+ async execute(options: { port?: number; foreground?: boolean }): Promise<void> {
293
+ const port = options.port ?? 3000;
294
+ const dashboardPath = resolve(__dirname, '../../../../packages/web-dashboard');
295
+
296
+ const env = {
297
+ ...process.env,
298
+ PORT: String(port),
299
+ AOP_ROOT: process.cwd(),
300
+ };
301
+
302
+ if (options.foreground) {
303
+ // Run in foreground
304
+ const child = spawn('npm', ['run', 'dev'], {
305
+ cwd: dashboardPath,
306
+ env,
307
+ stdio: 'inherit',
308
+ });
309
+ await new Promise<void>((done, reject) => { child.on('error', reject); child.on('exit', () => done()); });
310
+ } else {
311
+ // Run in background (detached)
312
+ const child = spawn('npm', ['run', 'start'], {
313
+ cwd: dashboardPath,
314
+ env,
315
+ detached: true,
316
+ stdio: 'ignore',
317
+ });
318
+ child.unref();
319
+ console.log(`Dashboard started on http://localhost:${port}`);
320
+ }
321
+ }
322
+ }
323
+ ```
324
+
325
+ **Testing Requirements (per `prompts/vitest-testing-standards.instructions.md`):**
326
+ ```typescript
327
+ // packages/web-dashboard/src/__tests__/api-status.test.ts
328
+ import { describe, it, expect, vi, beforeEach } from 'vitest';
+ import { readFile } from 'node:fs/promises';
+ import { GET } from '../app/api/status/route';
+
+ // vi.mock calls are hoisted to the top of the file, so the module is mocked
+ // once here and each test configures the shared mock.
+ vi.mock('node:fs/promises', () => ({
+   readFile: vi.fn(),
+ }));
+
+ describe('GET /api/status', () => {
+   beforeEach(() => {
+     vi.mocked(readFile).mockReset();
+   });
+
+   it('GIVEN_valid_aop_directory_WHEN_status_requested_THEN_returns_features_list', async () => {
+     // Mock file reading
+     vi.mocked(readFile).mockResolvedValue(JSON.stringify({
+       active: ['feature_a'],
+       blocked: [],
+       merged: ['feature_b'],
+     }));
+
+     const response = await GET();
+     const data = await response.json();
+
+     expect(response.status).toBe(200);
+     expect(data.active).toContain('feature_a');
+   });
+
+   it('GIVEN_missing_index_file_WHEN_status_requested_THEN_returns_empty_state', async () => {
+     vi.mocked(readFile).mockRejectedValue(new Error('ENOENT'));
+
+     const response = await GET();
+     const data = await response.json();
+
+     expect(response.status).toBe(200);
+     expect(data.active).toEqual([]);
+   });
+ });
365
+ ```
366
+
367
+ **Dashboard Routes:**
368
+ - `GET /` — Dashboard UI
369
+ - `GET /api/status` — Global status snapshot (same as `aop status` JSON)
370
+ - `GET /api/events` — SSE stream
371
+ - `GET /api/features/:id` — Feature detail
372
+ - `GET /api/features/:id/diff` — Diff bundle
373
+ - `GET /api/features/:id/evidence/:artifact` — Evidence artifact download
374
+ - `POST /api/features/:id/actions` — Trigger actions (delete with confirmation)
375
+ - `POST /api/features/:id/review` — Review decision: approve, deny, or request changes
376
+ - `POST /api/features/:id/checkout` — Checkout feature worktree branch to main repo
377
+
378
+ **Review & Merge Control (Dashboard-Driven):**
379
+
380
+ The dashboard serves as the primary human review interface. When a feature reaches `ready_to_merge` status, the reviewer can:
381
+
382
+ 1. **Review changes** — View the full diff (Monaco diff viewer), plan, evidence (gate results, coverage), and feature log from the feature detail page.
383
+
384
+ 2. **Approve & Merge** — Click "Approve & Merge" to instruct the orchestrator to execute `feature.ready_to_merge`. This:
385
+ - Calls `POST /api/features/:id/review` with `{ decision: "approve", approval_token: "<token>" }`
386
+ - Backend invokes the `feature.ready_to_merge` MCP tool with the approval token
387
+ - Merge executes through the existing deterministic merge path (lock acquisition, final gate run, merge commit, cleanup)
388
+ - Dashboard shows merge progress via SSE updates
389
+
390
+ 3. **Deny / Request Changes** — Click "Deny" or "Request Changes" to block the merge:
391
+ - `{ decision: "deny", reason: "..." }` — Moves feature back to `blocked` status with the reviewer's reason logged to feature decisions
392
+ - `{ decision: "request_changes", message: "..." }` — Sends the reviewer's feedback as a corrective prompt to the builder agent (via the reaction system) and keeps the feature in its current phase for another build/QA cycle
393
+
394
+ 4. **API Implementation:**
395
+ ```typescript
396
+ // POST /api/features/:id/review
397
+ interface ReviewRequest {
398
+ decision: 'approve' | 'deny' | 'request_changes';
399
+ approval_token?: string; // required for approve
400
+ reason?: string; // required for deny
401
+ message?: string; // required for request_changes
402
+ }
403
+ ```
404
+ - `approve` → calls `ToolClient.call('feature.ready_to_merge', { feature_id, approval_token })`
405
+ - `deny` → calls `ToolClient.call('feature.state_patch', { feature_id, patch: { status: 'blocked' } })` + `ToolClient.call('feature.log_append', { feature_id, entry: reason })`
406
+ - `request_changes` → calls `ToolClient.call('feature.log_append', { feature_id, entry: message })` + triggers `changes_requested` reaction to inject feedback into agent
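
The dispatch rules above can be sketched as a pure function that validates a `ReviewRequest` and returns the tool calls to make, in order. This is a minimal sketch under stated assumptions, not the package's implementation: `planReviewCalls` and the `ToolCall` shape are hypothetical names, and the `changes_requested` reaction trigger is left out because the source does not specify its API.

```typescript
type Decision = 'approve' | 'deny' | 'request_changes';

interface ReviewRequest {
  decision: Decision;
  approval_token?: string; // required for approve
  reason?: string;         // required for deny
  message?: string;        // required for request_changes
}

// Hypothetical shape: one entry per ToolClient.call(...) the backend would make.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Validates the request and returns the calls in order; throws when the
// field required by the chosen decision is missing.
function planReviewCalls(featureId: string, req: ReviewRequest): ToolCall[] {
  switch (req.decision) {
    case 'approve':
      if (!req.approval_token) throw new Error('approval_token is required for approve');
      return [{
        tool: 'feature.ready_to_merge',
        args: { feature_id: featureId, approval_token: req.approval_token },
      }];
    case 'deny':
      if (!req.reason) throw new Error('reason is required for deny');
      return [
        { tool: 'feature.state_patch', args: { feature_id: featureId, patch: { status: 'blocked' } } },
        { tool: 'feature.log_append', args: { feature_id: featureId, entry: req.reason } },
      ];
    case 'request_changes':
      if (!req.message) throw new Error('message is required for request_changes');
      // The changes_requested reaction would be triggered after this call.
      return [{ tool: 'feature.log_append', args: { feature_id: featureId, entry: req.message } }];
  }
}
```

Keeping validation separate from the actual `ToolClient` calls makes the route handler easy to unit test without a running orchestrator.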
407
+
408
+ **Worktree Checkout (Local Testing):**
409
+
410
+ The dashboard provides a one-click checkout so the reviewer can spin up a local dev server and run manual tests against the feature's changes in their main repo working directory.
411
+
412
+ 1. **Checkout Button** — Displayed on the feature detail page for features with an active worktree. Shows the branch name and a warning that it will switch the user's main repo checkout.
413
+
414
+ 2. **Checkout Flow:**
415
+ - User clicks "Checkout to Local" on a feature
416
+ - Dashboard shows confirmation dialog: "This will run `git checkout <feature_branch>` in your main repo at `<repo_root>`. Any uncommitted changes will be stashed first. Continue?"
417
+ - On confirm, calls `POST /api/features/:id/checkout`
418
+
419
+ 3. **API Implementation:**
420
+ ```typescript
421
+ // POST /api/features/:id/checkout
422
+ interface CheckoutRequest {
+ action?: 'restore'; // omit for a normal checkout; 'restore' switches back to the previous branch
423
+ stash_changes?: boolean; // default: true — auto-stash uncommitted work
424
+ restore_after?: boolean; // default: false — if true, remember original branch for later restore
425
+ }
426
+
427
+ interface CheckoutResponse {
428
+ ok: boolean;
429
+ data?: {
430
+ branch: string; // the branch checked out
431
+ previous_branch: string; // the branch before checkout (for restore)
432
+ stashed: boolean; // whether changes were stashed
433
+ stash_ref?: string; // stash reference if stashed
434
+ };
435
+ error?: { code: string; message: string };
436
+ }
437
+ ```
438
+
439
+ 4. **Backend Implementation:**
440
+ - Reads feature state to get the worktree branch name
441
+ - Runs in the main repo root (not the worktree):
442
+ 1. `git stash push -m "aop-dashboard-checkout: before switching to <feature_branch>"` (if `stash_changes` and dirty)
443
+ 2. `git checkout <feature_branch>`
444
+ - Records previous branch in a temporary file (`.aop/runtime/checkout-restore.json`) for later restoration
445
+ - Returns branch info + stash reference
446
+
447
+ 5. **Restore Button** — After checkout, the dashboard shows a "Restore Original Branch" button that:
448
+ - Calls `POST /api/features/:id/checkout` with `{ action: "restore" }`
449
+ - Backend: `git checkout <previous_branch>` + `git stash pop` (if stashed)
450
+
451
+ 6. **Safety Guards:**
452
+ - Checkout blocked if feature has no worktree or branch
453
+ - Checkout blocked if main repo has merge conflicts
454
+ - Warning displayed if there are uncommitted changes (with stash option)
455
+ - Only one checkout active at a time (tracked in `.aop/runtime/checkout-restore.json`)
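
The safety guards and backend steps above could be combined into a planning function that either blocks the checkout or returns the git commands to run in the main repo root. This is a sketch only: `planCheckout`, the `RepoStatus` fields, and the error codes are hypothetical, and actually running the commands and writing `.aop/runtime/checkout-restore.json` is left to the caller.

```typescript
interface RepoStatus {
  featureBranch: string | null; // null when the feature has no worktree or branch
  hasMergeConflicts: boolean;   // main repo has unresolved conflicts
  isDirty: boolean;             // uncommitted changes are present
  activeCheckout: boolean;      // a restore file already exists
}

type CheckoutPlan =
  | { ok: true; commands: string[][]; stashed: boolean }
  | { ok: false; code: string };

function planCheckout(status: RepoStatus, stashChanges = true): CheckoutPlan {
  // Guards, in the order listed under "Safety Guards"
  if (!status.featureBranch) return { ok: false, code: 'no_worktree_branch' };
  if (status.hasMergeConflicts) return { ok: false, code: 'merge_conflicts' };
  if (status.activeCheckout) return { ok: false, code: 'checkout_already_active' };

  const commands: string[][] = [];
  const stashed = stashChanges && status.isDirty;
  if (stashed) {
    commands.push([
      'git', 'stash', 'push', '-m',
      `aop-dashboard-checkout: before switching to ${status.featureBranch}`,
    ]);
  }
  commands.push(['git', 'checkout', status.featureBranch]);
  return { ok: true, commands, stashed };
}
```

Returning argument arrays rather than shell strings lets the caller pass them to `spawn` without shell quoting issues in branch names.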
456
+
457
+ **Launch Integration:**
458
+ - New CLI command: `aop dashboard [--port 3000] [--foreground]`
459
+ - Starts dashboard server in background or foreground
460
+ - Dashboard reads config from `agentic/orchestrator/policy.yaml` for port, auth settings
461
+
462
+ **Acceptance Criteria:**
463
+ - [ ] Dashboard displays all active/blocked/merged features with live updates
464
+ - [ ] Clicking feature shows state, plan, diff, evidence
465
+ - [ ] SSE updates within 2s of state file changes
466
+ - [ ] Dashboard survives run restart (polling fallback if SSE drops)
467
+ - [ ] Reviewer can approve & merge a `ready_to_merge` feature from the dashboard
468
+ - [ ] Reviewer can deny a feature with a reason (feature moves to `blocked`, reason logged)
469
+ - [ ] Reviewer can request changes with a message (feedback sent to agent via reaction system)
470
+ - [ ] Reviewer can one-click checkout a feature branch to their main repo for local testing
471
+ - [ ] Checkout auto-stashes uncommitted changes and provides a restore button
472
+ - [ ] Checkout blocked when no worktree/branch exists or repo has merge conflicts
473
+ - [ ] Tests pass with ≥90% coverage per `prompts/vitest-testing-standards.instructions.md`
474
+
475
+ **Estimated Effort:** 2 weeks (Medium - requires new package, SSE infra, UI components)
476
+
477
+ ---
478
+
479
+ #### G2: Notification System (Multi-Channel)
480
+
481
+ **ComposioHQ Feature:** Desktop notifications, Slack, webhooks with priority routing.
482
+
483
+ **Gap Impact:** No proactive alerts; users must poll `aop status`.
484
+
485
+ **AOP Design Alignment:** HIGH - Notification is output-only; no state mutations.
486
+
487
+ **Specification:**
488
+
489
+ **Architecture:**
490
+ 1. **Notifier Service** (`apps/control-plane/src/application/services/notifier-service.ts`)
491
+ - Abstract `NotifierChannel` interface (desktop, slack, webhook, email)
492
+ - Config in `agentic/orchestrator/policy.yaml`:
493
+ ```yaml
494
+ notifications:
495
+ enabled: true
496
+ channels:
497
+ desktop:
498
+ enabled: true
499
+ slack:
500
+ enabled: true
501
+ webhook: ${SLACK_WEBHOOK_URL}
502
+ channel: "#aop-alerts"
503
+ webhook:
504
+ enabled: false
505
+ url: ${CUSTOM_WEBHOOK_URL}
506
+ method: POST
507
+ headers:
508
+ Authorization: "Bearer ${WEBHOOK_TOKEN}"
509
+ routing:
510
+ critical: [desktop, slack] # Gate failures, collisions
511
+ warning: [slack] # Stale leases, retry exhaustion
512
+ info: [slack] # Feature merged, gates passed
513
+ ```
514
+
515
+ 2. **Notification Events:**
516
+ - `gate_failed` - Gate execution failed (priority: critical)
517
+ - `collision_detected` - Plan rejected due to collision (priority: critical)
518
+ - `feature_blocked` - Feature moved to blocked queue (priority: warning)
519
+ - `ready_to_merge` - Feature ready for review (priority: info)
520
+ - `feature_merged` - Feature merged successfully (priority: info)
521
+ - `stale_lease` - Run lease expiring soon (priority: warning)
522
+
523
+ 3. **Channel Implementations:**
524
+ - **Desktop:** `node-notifier`, or `notify-send` (Linux) and `terminal-notifier` (macOS)
525
+ - **Slack:** HTTP POST to webhook URL with formatted message + attachments
526
+ - **Webhook:** Generic HTTP POST with JSON payload
527
+ - **Email:** Nodemailer with SMTP config (optional, P2)
528
+
529
+ 4. **Notification Flow:**
530
+ - Supervisor runtime calls `NotifierService.notify(event, context)`
531
+ - Service routes to enabled channels based on priority
532
+ - Failures logged but do not block orchestration
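
The routing and failure-isolation behaviour described above might look like the following. It is a minimal sketch, assuming an in-memory channel list: the `send` method on `NotifierChannel`, the return value of `notify`, and the fallback of unknown events to `info` are all assumptions, while the event priorities are taken from the list above.

```typescript
type Priority = 'critical' | 'warning' | 'info';

interface NotifierChannel {
  name: string;
  enabled: boolean;
  send(event: string, context: Record<string, unknown>): Promise<void>;
}

// Event-to-priority mapping from the "Notification Events" list.
const EVENT_PRIORITY: Record<string, Priority> = {
  gate_failed: 'critical',
  collision_detected: 'critical',
  feature_blocked: 'warning',
  stale_lease: 'warning',
  ready_to_merge: 'info',
  feature_merged: 'info',
};

class NotifierService {
  private readonly channels: NotifierChannel[];
  private readonly routing: Record<Priority, string[]>;

  constructor(channels: NotifierChannel[], routing: Record<Priority, string[]>) {
    this.channels = channels;
    this.routing = routing;
  }

  // Resolves with the names of channels that accepted the event; never rejects,
  // so a broken channel cannot block orchestration.
  async notify(event: string, context: Record<string, unknown>): Promise<string[]> {
    const priority = EVENT_PRIORITY[event] ?? 'info'; // assumption: unknown events are info
    const targets = this.channels.filter(
      (c) => c.enabled && this.routing[priority].includes(c.name),
    );
    const results = await Promise.allSettled(
      targets.map(async (c) => c.send(event, context)),
    );
    const delivered: string[] = [];
    results.forEach((result, i) => {
      if (result.status === 'fulfilled') delivered.push(targets[i].name);
      else console.error(`notifier: ${targets[i].name} failed:`, result.reason);
    });
    return delivered;
  }
}
```

`Promise.allSettled` is used so that one failing channel does not prevent delivery to the others.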
533
+
534
+ **Acceptance Criteria:**
535
+ - [ ] Desktop notification on gate failure with feature ID + error summary
536
+ - [ ] Slack notification includes clickable link to dashboard (if running)
537
+ - [ ] Webhook payload includes full event context (feature_id, status, evidence summary)
538
+ - [ ] Notification config validated against schema on startup
539
+ - [ ] Notification failures do not crash orchestrator
540
+
541
+ **Estimated Effort:** 1 week (Medium - requires channel abstractions, config schema, supervisor integration)
542
+
543
+ ---
544
+
545
+ #### G3: Init Wizard (`aop init`)
546
+
547
+ **ComposioHQ Feature:** Interactive setup wizard with auto-detection (git repo, GitHub remote, branch, API keys).
548
+
549
+ **Gap Impact:** Manual YAML editing is error-prone; high barrier to entry.
550
+
551
+ **AOP Design Alignment:** HIGH - Setup tool does not affect runtime behavior.
552
+
553
+ **Specification:**
554
+
555
+ **Wizard Flow:**
556
+ 1. **Detect repository context:**
557
+ - Check for `.git/` in current directory
558
+ - Parse git remote URL → derive repo owner/name
559
+ - Detect default branch (`git symbolic-ref refs/remotes/origin/HEAD`)
560
+
561
+ 2. **Prompt for config values:**
562
+ - Worktree base branch (default: `main`)
563
+ - Max active features (default: `5`)
564
+ - Max parallel gate runs (default: `3`)
565
+ - Dashboard port (default: `3000`)
566
+ - Notification channels (multi-select: desktop, slack, webhook)
567
+ - Slack webhook URL (if slack selected)
568
+ - Test framework (vitest/jest/pytest/maven/gradle - for gates.yaml template)
569
+
570
+ 3. **Generate config files:**
571
+ - `agentic/orchestrator/policy.yaml` (with detected/entered values)
572
+ - `agentic/orchestrator/gates.yaml` (template for detected test framework)
573
+ - `agentic/orchestrator/agents.yaml` (defaults: planner/builder/qa prompts)
574
+ - Copy prompt templates to `agentic/orchestrator/prompts/`
575
+ - Copy schema files to `agentic/orchestrator/schemas/`
576
+
577
+ 4. **Validate generated config:**
578
+ - Run schema validation against all generated YAML files
579
+ - Report validation errors with file/line/error details
580
+
581
+ 5. **Post-init instructions:**
582
+ - Print next steps:
583
+ ```
584
+ ✅ Configuration created successfully!
585
+
586
+ Next steps:
587
+ 1. Review config files in agentic/orchestrator/
588
+ 2. Add feature specs to .aop/features/<feature_id>/spec.md
589
+ 3. Run: aop run -fi .aop/features/<feature_id>/spec.md
590
+ 4. Monitor: aop status (or aop dashboard)
591
+ ```
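
For the "parse git remote URL" step of the wizard flow, a small parser could derive the owner and repository name. This is a hedged sketch: `parseGitRemote` and `RemoteInfo` are hypothetical names, and only the common HTTPS, `ssh://`, and scp-style (`git@host:owner/repo`) forms are handled.

```typescript
interface RemoteInfo {
  host: string;
  owner: string;
  name: string;
}

// URL-style remotes: https://host/owner/repo(.git) or ssh://git@host/owner/repo(.git)
const URL_REMOTE =
  /^(?:https?|ssh):\/\/(?:[^@/]+@)?([^/:]+)(?::\d+)?\/([^/]+)\/([^/]+?)(?:\.git)?\/?$/;
// scp-style remotes: git@host:owner/repo(.git)
const SCP_REMOTE = /^[^@/]+@([^:/]+):([^/]+)\/([^/]+?)(?:\.git)?$/;

// Returns null when the remote does not match a supported form, so the wizard
// can fall back to prompting for manual values.
function parseGitRemote(url: string): RemoteInfo | null {
  const trimmed = url.trim();
  const match = URL_REMOTE.exec(trimmed) ?? SCP_REMOTE.exec(trimmed);
  if (!match) return null;
  return { host: match[1], owner: match[2], name: match[3] };
}
```

The input would typically come from `git remote get-url origin`.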
592
+
593
+ **Acceptance Criteria:**
594
+ - [ ] Wizard detects git repo and parses remote URL
595
+ - [ ] Generated config files pass schema validation
596
+ - [ ] Wizard handles non-git directories gracefully (prompts for manual values)
597
+ - [ ] Template selection generates appropriate gates.yaml (e.g., `npm test` vs `pytest` vs `mvn test`)
598
+ - [ ] Wizard is idempotent (detects existing config, offers to update)
599
+
600
+ **Estimated Effort:** 1 week (Medium - requires interactive prompts, git parsing, template generation)
601
+
602
+ ---
603
+
604
+ #### G4: Multi-Project Configuration Support
605
+
606
+ **ComposioHQ Feature:** Single config file managing multiple repositories with per-project overrides.
607
+
608
+ **Gap Impact:** Cannot manage multiple repos from one orchestrator instance.
609
+
610
+ **AOP Design Alignment:** MEDIUM - Requires run lease isolation per repo.
611
+
612
+ **Specification:**
613
+
614
+ **Config Schema Extension:**
615
+ ```yaml
616
+ # agentic/orchestrator/multi-project.yaml (new file)
617
+ version: "1.0"
618
+
619
+ defaults:
620
+ max_active_features: 5
621
+ max_parallel_gate_runs: 3
622
+ dashboard_port: 3000
623
+ notifications:
624
+ enabled: true
625
+ channels: [desktop, slack]
626
+
627
+ projects:
628
+ - name: "backend"
629
+ path: ~/repos/backend
630
+ repo: "myorg/backend"
631
+ branch: main
632
+ policy: agentic/orchestrator/policy.yaml # per-project override
633
+ gates: agentic/orchestrator/gates-backend.yaml
634
+ dashboard_port: 3001 # override default
635
+
636
+ - name: "frontend"
637
+ path: ~/repos/frontend
638
+ repo: "myorg/frontend"
639
+ branch: main
640
+ policy: agentic/orchestrator/policy.yaml
641
+ gates: agentic/orchestrator/gates-frontend.yaml
642
+ dashboard_port: 3002
643
+ ```
644
+
645
+ **Implementation:**
646
+ 1. **Multi-Project Loader** (`src/application/multi-project-loader.ts`)
647
+ - Parses `multi-project.yaml` (optional; single-project mode remains default)
648
+ - Validates each project config against schema
649
+ - Resolves relative paths to absolute paths
650
+
651
+ 2. **CLI Changes:**
652
+ - New flag: `aop run --project backend` (selects project from multi-project.yaml)
653
+ - If `--project` not specified and multi-project.yaml exists → interactive selection
654
+ - `aop status --project backend` (project-specific status)
655
+ - `aop status --all` (global status across all projects)
656
+
657
+ 3. **Run Lease Isolation:**
658
+ - Run lease file per project: `.aop/runtime/<project_name>/run-lease.json`
659
+ - Dashboard instances per project (separate ports)
660
+ - Each project has independent `.aop/features/` directory
661
+
662
+ 4. **Dashboard Multi-Project View:**
663
+ - New route: `GET /api/projects` (list all configured projects)
664
+ - Project switcher in dashboard UI
665
+ - Global view showing status across all projects
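
The loader behaviour described above (defaults merged with per-project overrides, paths resolved to absolute, one lease file per project) can be sketched as follows. This assumes the YAML has already been parsed into a plain object; `resolveProject`, `expandHome`, and the interface names are hypothetical, and the lease path is returned relative because the source does not say which directory it is rooted in.

```typescript
import { homedir } from 'node:os';
import { join, resolve } from 'node:path';

interface ProjectDefaults {
  max_active_features: number;
  max_parallel_gate_runs: number;
  dashboard_port: number;
}

interface ProjectEntry extends Partial<ProjectDefaults> {
  name: string;
  path: string;
}

interface MultiProjectConfig {
  defaults: ProjectDefaults;
  projects: ProjectEntry[];
}

// Expands a leading "~" because Node's path.resolve does not do so itself.
function expandHome(p: string): string {
  return p === '~' || p.startsWith('~/') ? join(homedir(), p.slice(1)) : p;
}

function resolveProject(config: MultiProjectConfig, name: string) {
  const entry = config.projects.find((p) => p.name === name);
  if (!entry) throw new Error(`Unknown project: ${name}`);
  return {
    ...config.defaults, // shared defaults first
    ...entry,           // per-project values override them
    path: resolve(expandHome(entry.path)),
    leaseFile: join('.aop', 'runtime', entry.name, 'run-lease.json'),
  };
}
```

With the example config above, `resolveProject(config, "backend")` would yield `dashboard_port: 3001`, while a project without an override inherits `3000`.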
666
+
667
+ **Acceptance Criteria:**
668
+ - [ ] Multi-project config validated against schema
669
+ - [ ] Can run orchestrator for specific project via `--project` flag
670
+ - [ ] Run leases isolated per project (parallel orchestration safe)
671
+ - [ ] Dashboard can switch between projects
672
+ - [ ] `aop status --all` aggregates status across projects
673
+
674
+ **Estimated Effort:** 1.5 weeks (High - requires config schema changes, CLI routing, lease isolation)
675
+
676
+ ---
677
+
678
+ ### 3.2 P1 GAPS (High Value - Implement Soon)
679
+
680
+ #### G5: CI Failure Auto-Remediation (Reactions)
681
+
682
+ **ComposioHQ Feature:** Automatic routing of CI failures to agents with retry logic.
683
+
684
+ **Gap Impact:** Gate failures require manual retry; no autonomous recovery.
685
+
686
+ **AOP Design Alignment:** MEDIUM - Requires autonomous retry policy; must preserve deterministic gates.
687
+
688
+ **Specification:**
689
+
690
+ **Reaction Policy Config:**
691
+ ```yaml
692
+ # agentic/orchestrator/policy.yaml
693
+ reactions:
694
+ gate_failed:
695
+ enabled: true
696
+ max_retries: 2
697
+ action: retry_with_agent_repair # or 'notify_only'
698
+ escalate_after: 2 # escalate to human after N failures
699
+ retry_delay: 30s
700
+
701
+ collision_detected:
702
+ enabled: true
703
+ action: notify_only # no auto-resolution
704
+
705
+ ready_to_merge:
706
+ enabled: true
707
+ action: notify_only # never auto-merge
708
+ ```
709
+
710
+ **Retry Flow:**
711
+ 1. **Gate failure detected** (QA wave or build wave)
712
+ 2. **Check reaction policy:** Continue only if `reactions.gate_failed.enabled` is true and the retry count is below `max_retries`
713
+ 3. **Agent repair loop:**
714
+ - Load gate failure evidence + logs
715
+ - Inject repair prompt to builder/QA agent:
716
+ ```
717
+ Gate execution failed. Review the error logs below and apply fixes.
718
+
719
+ Gate: {gate_name}
720
+ Exit Code: {exit_code}
721
+
722
+ Logs:
723
+ {logs}
724
+
725
+ Evidence:
726
+ {evidence_summary}
727
+ ```
728
+ 4. **Agent generates repair patches** → apply via `repo.apply_patch`
729
+ 5. **Re-run gate** → capture new evidence
730
+ 6. **Success:** Advance to next phase
731
+ 7. **Failure:** Increment retry count, repeat or escalate
732
+
733
+ **Escalation:**
734
+ - After `escalate_after` failures → notification to all critical channels
735
+ - Feature remains in current phase (does not auto-advance)
736
+ - User can manually intervene via dashboard or `aop send <feature_id> <instruction>`
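
The retry-or-escalate decision and the repair prompt can be sketched as two small functions. The source does not define how `max_retries` and `escalate_after` interact, so this sketch assumes escalation happens once the retry count reaches the smaller of the two; `decideGateReaction` and `renderRepairPrompt` are hypothetical names.

```typescript
interface GateFailedPolicy {
  enabled: boolean;
  max_retries: number;
  action: 'retry_with_agent_repair' | 'notify_only';
  escalate_after: number;
}

type Reaction = 'retry' | 'escalate' | 'notify';

// retryCount is the number of repair attempts already made (gate_retry_count).
function decideGateReaction(policy: GateFailedPolicy, retryCount: number): Reaction {
  if (!policy.enabled || policy.action === 'notify_only') return 'notify';
  // Assumption: whichever limit is lower ends the retry loop.
  const limit = Math.min(policy.max_retries, policy.escalate_after);
  return retryCount < limit ? 'retry' : 'escalate';
}

const REPAIR_TEMPLATE = [
  'Gate execution failed. Review the error logs below and apply fixes.',
  '',
  'Gate: {gate_name}',
  'Exit Code: {exit_code}',
  '',
  'Logs:',
  '{logs}',
  '',
  'Evidence:',
  '{evidence_summary}',
].join('\n');

// Fills {placeholders} in a single pass; unknown placeholders are left as-is
// and braces inside the substituted values are not re-expanded.
function renderRepairPrompt(values: Record<string, string | number>): string {
  return REPAIR_TEMPLATE.replace(/\{(\w+)\}/g, (whole, key: string) =>
    key in values ? String(values[key]) : whole,
  );
}
```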
737
+
738
+ **Acceptance Criteria:**
739
+ - [ ] Gate failure triggers retry if policy enabled
740
+ - [ ] Retry count tracked in feature state (`gate_retry_count`)
741
+ - [ ] Agent receives failure context in prompt
742
+ - [ ] Escalation notification includes full failure history
743
+ - [ ] Manual override: `aop retry <feature_id> --force` (ignores retry limit)
744
+
745
+ **Estimated Effort:** 2 weeks (High - requires retry state tracking, agent prompt injection, escalation logic)
746
+
747
+ ---
748
+
749
+ #### G6: Session Management Commands (`aop send`, `aop attach`)
750
+
751
+ **ComposioHQ Feature:** Interactive agent communication (`ao send`, `ao open`).
752
+
753
+ **Gap Impact:** Cannot send ad-hoc instructions to agents; must restart workflow.
754
+
755
+ **AOP Design Alignment:** MEDIUM - Requires provider-specific session access.
756
+
757
+ **Specification:**
758
+
759
+ **New CLI Commands:**
760
+
761
+ 1. **`aop send <feature_id> <message>`**
762
+ - Sends message to orchestrator session for given feature
763
+ - Orchestrator routes message to appropriate worker (planner/builder/qa based on phase)
764
+ - Example: `aop send my_feature "Add logging to error handlers"`
765
+
766
+ 2. **`aop attach <feature_id>`**
767
+ - Attaches to orchestrator session terminal (if provider supports it)
768
+ - For Claude Code/Codex: launches interactive chat session
769
+ - For tmux (future): attaches to tmux session
770
+ - Exit with `Ctrl-D` or type `/exit`
771
+
772
+ **Implementation:**
773
+ 1. **Provider Interface Extension:**
774
+ ```typescript
775
+ interface WorkerProvider {
776
+ sendMessage(sessionId: string, message: string): Promise<void>;
777
+ attachToSession(sessionId: string): Promise<void>; // interactive mode
778
+ }
779
+ ```
780
+
781
+ 2. **Message Routing:**
782
+ - CLI calls `ToolClient.call('feature.send_message', { feature_id, message })`
783
+ - Tool validates feature exists and has active orchestrator session
784
+ - Tool calls `WorkerProvider.sendMessage(orchestratorSessionId, message)`
785
+ - Provider forwards to agent (implementation-specific)
786
+
787
+ 3. **Session Attach:**
788
+ - CLI calls provider-specific attach method
789
+ - For Claude Code: launches `claude --resume <session_id>`
790
+ - For Codex: launches `codex chat --session <session_id>`
791
+ - Terminal streams stdin/stdout until exit
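
The message-routing steps can be illustrated with an in-memory version of the `feature.send_message` tool. This is a sketch under assumptions: the phase names, the `FeatureSession` record, and the error codes are hypothetical, and the provider here only needs the `sendMessage` half of the `WorkerProvider` interface shown above.

```typescript
type Phase = 'planning' | 'building' | 'qa';     // assumed phase names
type WorkerRole = 'planner' | 'builder' | 'qa';

interface FeatureSession {
  phase: Phase;
  sessionId: string | null; // null when no orchestrator session is active
}

interface MessageProvider {
  sendMessage(sessionId: string, message: string): Promise<void>;
}

const ROLE_FOR_PHASE: Record<Phase, WorkerRole> = {
  planning: 'planner',
  building: 'builder',
  qa: 'qa',
};

type SendResult =
  | { ok: true; worker: WorkerRole }
  | { ok: false; code: 'feature_not_found' | 'session_not_active' };

// Validates the feature and session, forwards the message, and reports which
// worker role the current phase maps to.
async function sendFeatureMessage(
  features: Map<string, FeatureSession>,
  provider: MessageProvider,
  featureId: string,
  message: string,
): Promise<SendResult> {
  const feature = features.get(featureId);
  if (!feature) return { ok: false, code: 'feature_not_found' };
  if (!feature.sessionId) return { ok: false, code: 'session_not_active' };
  await provider.sendMessage(feature.sessionId, message);
  return { ok: true, worker: ROLE_FOR_PHASE[feature.phase] };
}
```

Returning a result object rather than throwing lets the CLI map each failure to the error cases listed in the acceptance criteria.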
792
+
793
+ **Acceptance Criteria:**
794
+ - [ ] `aop send` delivers message to active orchestrator session
795
+ - [ ] `aop attach` launches interactive session for supported providers
796
+ - [ ] Error handling: feature not found, session not active, provider unsupported
797
+ - [ ] Attach session streams real-time responses
798
+
799
+ **Estimated Effort:** 1.5 weeks (High - requires provider API integration, terminal streaming)
800
+
801
+ ---
802
+
803
+ #### G6a: Agent Activity Detection & Health Monitoring
804
+
805
+ **ComposioHQ Feature:** JSONL-based activity detection reads agent session files for structured events; fallback terminal output parsing determines real-time agent state (active, ready, idle, waiting_input, blocked, exited). Lifecycle manager polls activity state and triggers reactions (e.g., `agent-stuck` after configurable idle threshold).
806
+
807
+ **Gap Impact:** AOP has no runtime visibility into whether agents are actively working, idle, or stuck. Supervisor relies on tool call responses but cannot detect hung agents.
808
+
809
+ **AOP Design Alignment:** HIGH — Read-only monitoring; no state mutations.
810
+
811
+ **Specification:**
812
+
813
+ 1. **Activity Monitor Service** (`apps/control-plane/src/application/services/activity-monitor-service.ts`)
814
+ - Interface: `getActivityState(featureId: string): Promise<ActivityState>`
815
+ - States: `active | idle | waiting_input | blocked | exited | unknown`
816
+ - Detection strategies (provider-dependent):
817
+ - **Claude Code:** Read JSONL session files for last event timestamp + type
818
+ - **Codex:** Query JSON-RPC app-server for thread state
819
+ - **Generic:** Check process alive + last tool call timestamp from operation ledger
820
+ - Configurable idle threshold: `policy.supervisor.agent_idle_threshold_ms` (default: `300000` / 5min)
821
+
822
+ 2. **Integration Points:**
823
+ - Supervisor `WorkerDecisionLoop` checks activity state before sending next prompt
824
+ - Stale-activity triggers `agent_stuck` notification event (see G2)
825
+ - Dashboard displays per-feature activity indicator (see G1)
826
+ - `aop status` output includes activity column
827
+
828
+ 3. **Stuck Agent Reaction** (extends G5 reaction policy):
829
+ ```yaml
830
+ reactions:
831
+ agent_stuck:
832
+ enabled: true
833
+ action: notify_and_restart # or: notify_only
834
+ idle_threshold: 300s
835
+ escalate_after: 2
836
+ ```
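
The generic detection strategy (process alive plus last tool call timestamp) can be sketched as a pure classifier. This is illustrative only: `classifyActivity` and `ActivitySample` are hypothetical names, and the `waiting_input` and `blocked` states are not produced here because they would need provider-specific session events.

```typescript
type ActivityState = 'active' | 'idle' | 'waiting_input' | 'blocked' | 'exited' | 'unknown';

interface ActivitySample {
  processAlive: boolean;
  lastEventAtMs: number | null; // last tool call timestamp from the operation ledger
}

// Default threshold mirrors policy.supervisor.agent_idle_threshold_ms (5 minutes).
function classifyActivity(
  sample: ActivitySample,
  nowMs: number,
  idleThresholdMs = 300000,
): ActivityState {
  if (!sample.processAlive) return 'exited';
  if (sample.lastEventAtMs === null) return 'unknown';
  return nowMs - sample.lastEventAtMs >= idleThresholdMs ? 'idle' : 'active';
}

// An agent counts as stuck once it has been idle past the threshold.
function isStuck(state: ActivityState): boolean {
  return state === 'idle';
}
```

Passing `nowMs` explicitly instead of calling `Date.now()` inside keeps the classifier deterministic in tests.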
837
+
838
+ **Acceptance Criteria:**
839
+ - [ ] Activity state detected for at least Claude Code and generic providers
840
+ - [ ] `aop status` displays activity state per feature
841
+ - [ ] Agent stuck beyond threshold triggers notification
842
+ - [ ] Activity monitoring does not block supervisor execution
843
+
844
+ **Estimated Effort:** 1 week (Medium — requires provider-specific detection strategies)
845
+
846
+ ---
847
+
848
+ #### G6b: Automated Session Cleanup
849
+
850
+ **ComposioHQ Feature:** `ao session cleanup` auto-evaluates sessions and kills those whose PRs are merged/closed, issues are completed, or runtimes are dead. Supports `--dry-run`.
851
+
852
+ **Gap Impact:** AOP's `aop delete` is manual-only. No automated lifecycle cleanup for terminal features.
853
+
854
+ **AOP Design Alignment:** HIGH — Cleanup is a natural extension of existing feature deletion service.
855
+
856
+ **Specification:**
857
+
858
+ 1. **Cleanup Command** (`aop cleanup [--dry-run] [--yes]`)
859
+ - Scans all features in the index (active, blocked, and merged) plus worktrees on disk
860
+ - Evaluates cleanup criteria per feature:
861
+ - Feature status is terminal (`merged` or `failed`) for > configurable grace period (default: 1h)
862
+ - Worktree exists but feature is no longer in index
863
+ - Run lease expired and no active supervisor session
864
+ - `--dry-run`: Print what would be cleaned up without acting
865
+ - Delegates to existing `FeatureDeletionService` for actual cleanup
866
+
867
+ 2. **Auto-Cleanup Hook** (optional):
868
+ - After `feature.ready_to_merge` completes merge, schedule cleanup after grace period
869
+ - Configurable in policy: `cleanup.auto_after_merge: true`, `cleanup.grace_period: 3600s`
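
The cleanup criteria could be evaluated per feature as below, which is also enough to implement `--dry-run`. This sketch treats the three criteria as alternatives (any one makes a feature eligible), which is an assumption; `cleanupReason`, `FeatureRecord`, and the reason strings are hypothetical.

```typescript
interface FeatureRecord {
  id: string;
  status: string;
  terminalSinceMs: number | null; // when the feature became merged/failed
  inIndex: boolean;
  worktreeExists: boolean;
  leaseExpired: boolean;
  supervisorActive: boolean;
}

// Returns why a feature is eligible for cleanup, or null if it must be kept.
function cleanupReason(f: FeatureRecord, nowMs: number, graceMs = 3600000): string | null {
  const terminal = f.status === 'merged' || f.status === 'failed';
  if (terminal && f.terminalSinceMs !== null && nowMs - f.terminalSinceMs > graceMs) {
    return 'terminal_past_grace';
  }
  if (f.worktreeExists && !f.inIndex) return 'orphan_worktree';
  if (f.leaseExpired && !f.supervisorActive) return 'lease_expired';
  return null;
}

// Dry run: list what would be removed without acting on it.
function dryRun(features: FeatureRecord[], nowMs: number, graceMs = 3600000): string[] {
  const lines: string[] = [];
  for (const f of features) {
    const reason = cleanupReason(f, nowMs, graceMs);
    if (reason !== null) lines.push(`${f.id}: ${reason}`);
  }
  return lines;
}
```

A real `aop cleanup --yes` would pass each eligible feature to the existing `FeatureDeletionService` instead of printing it.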
870
+
871
+ **Acceptance Criteria:**
872
+ - [ ] `aop cleanup --dry-run` lists features eligible for cleanup
873
+ - [ ] `aop cleanup --yes` removes terminal features + orphan worktrees
874
+ - [ ] Auto-cleanup triggers after merge when enabled
875
+ - [ ] Grace period prevents premature cleanup
876
+
877
+ **Estimated Effort:** 3 days (Low — leverages existing FeatureDeletionService)
878
+
879
+ ---
880
+
881
+ #### G6c: Batch Feature Operations
882
+
883
+ **ComposioHQ Feature:** `ao batch-spawn` accepts multiple issue IDs, deduplicates against existing sessions, detects dead sessions for respawning, and reports created/skipped/failed counts.
884
+
885
+ **Gap Impact:** AOP can only run one feature at a time via CLI. Multi-feature workflows require multiple manual `aop run` invocations.
886
+
887
+ **AOP Design Alignment:** HIGH — Natural extension of existing `-fl` folder scanning.
888
+
889
+ **Specification:**
890
+
891
+ 1. **Batch Run** (`aop run -fl <spec_folder> --batch`)
892
+ - Scans folder for all spec files (existing behavior)
893
+ - Deduplicates against active features in index
894
+ - Skips features already in `active` or `blocked` status
895
+ - Spawns remaining features sequentially with 500ms delay
896
+ - Reports: `created: N, skipped: N (already active), failed: N`
897
+
898
+ 2. **Batch Status Extension:**
899
+ - `aop status --summary` — One-line-per-feature compact view
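
The batch run steps (deduplicate, skip already-tracked features, spawn sequentially with a delay, report counts) can be sketched as follows. The `spawnFeature` callback stands in for whatever `aop run` does per spec and is an assumption, as are the `runBatch` and `FeatureIndex` names.

```typescript
interface FeatureIndex {
  active: string[];
  blocked: string[];
  merged: string[];
}

interface BatchReport {
  created: number;
  skipped: number;
  failed: number;
}

async function runBatch(
  specIds: string[],
  index: FeatureIndex,
  spawnFeature: (id: string) => Promise<void>,
  delayMs = 500,
): Promise<BatchReport> {
  const taken = new Set<string>([...index.active, ...index.blocked]);
  const report: BatchReport = { created: 0, skipped: 0, failed: 0 };
  let spawned = 0;
  // Duplicate ids in the input are collapsed before processing.
  for (const id of Array.from(new Set(specIds))) {
    if (taken.has(id)) {
      report.skipped += 1;
      continue;
    }
    // Wait between spawns, but not before the first one.
    if (spawned > 0) await new Promise<void>((r) => setTimeout(r, delayMs));
    spawned += 1;
    try {
      await spawnFeature(id);
      report.created += 1;
    } catch {
      report.failed += 1; // a failed spawn does not stop the remaining features
    }
  }
  return report;
}
```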
900
+
901
+ **Acceptance Criteria:**
902
+ - [ ] Batch run deduplicates against existing features
903
+ - [ ] Failed spawns don't block remaining features
904
+ - [ ] Summary output reports counts
905
+
906
+ **Estimated Effort:** 3 days (Low — extends existing folder scanning)
907
+
908
+ ---
909
+
910
+ #### G6d: Workspace postCreate Hooks & Symlinks
911
+
912
+ **ComposioHQ Feature:** Per-project `postCreate` commands (e.g., `pnpm install`) run after worktree creation. `symlinks` config shares files (`.env`, `.claude`) across worktrees via symlinks.
913
+
914
+ **Gap Impact:** After AOP creates a worktree, dependencies aren't installed and environment files are missing. Users must manually set up each worktree.
915
+
916
+ **AOP Design Alignment:** HIGH — Configuration-driven, no architectural conflict.
917
+
918
+ **Specification:**
919
+
920
+ 1. **Policy Extension:**
921
+ ```yaml
922
+ # agentic/orchestrator/policy.yaml
923
+ worktree:
924
+ base_branch: main
925
+ post_create:
926
+ - "npm ci"
927
+ - "cp .env.example .env"
928
+ symlinks:
929
+ - .env
930
+ - .claude
931
+ ```
932
+
933
+ 2. **Implementation:**
934
+ - After `repo.ensure_worktree` creates worktree, run `post_create` commands in worktree directory
935
+ - Before commands, create symlinks from worktree to main repo for listed files
936
+ - Command failures logged as warnings but don't block feature initialization
937
+ - Schema extension in `policy.schema.json`
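
The ordering rule (symlinks first, then `post_create` commands in the worktree directory) can be made explicit by building a plan before executing anything. This is a minimal sketch; `planWorktreeSetup` and the `SetupStep` shape are hypothetical, and executing the steps (creating links, spawning commands, logging failures as warnings) is left out.

```typescript
import { join } from 'node:path';

interface WorktreePolicy {
  post_create?: string[];
  symlinks?: string[];
}

type SetupStep =
  | { kind: 'symlink'; target: string; linkPath: string }
  | { kind: 'command'; command: string; cwd: string };

// Builds the ordered setup steps for a freshly created worktree.
function planWorktreeSetup(
  policy: WorktreePolicy,
  repoRoot: string,
  worktreeDir: string,
): SetupStep[] {
  const links: SetupStep[] = (policy.symlinks ?? []).map((file) => ({
    kind: 'symlink',
    target: join(repoRoot, file),      // the shared file in the main repo
    linkPath: join(worktreeDir, file), // where the link appears in the worktree
  }));
  const commands: SetupStep[] = (policy.post_create ?? []).map((command) => ({
    kind: 'command',
    command,
    cwd: worktreeDir,
  }));
  return [...links, ...commands]; // symlinks always precede commands
}
```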
938
+
939
+ **Acceptance Criteria:**
940
+ - [ ] `post_create` commands execute in new worktree directory
941
+ - [ ] Symlinks created before post_create commands run
942
+ - [ ] Command failures logged but don't block initialization
943
+
944
+ **Estimated Effort:** 2 days (Low)
945
+
946
+ ---
947
+
948
+ #### G6e: PR Lifecycle Integration
949
+
950
+ **ComposioHQ Feature:** SCM plugin tracks PR state as first-class session data: CI check status, review decisions (approved/changes_requested), pending review threads, merge readiness, and conflict status. Dashboard shows PR table with sortable "merge score" weighting these signals.
951
+
952
+ **Gap Impact:** AOP has no PR awareness. Features go through gates locally but there's no bridge to the PR review cycle. After `ready_to_merge`, there's no visibility into whether a PR was created, reviewed, or has CI issues upstream.
953
+
954
+ **AOP Design Alignment:** MEDIUM — Requires optional GitHub API integration. Read-only monitoring doesn't conflict with deterministic model, but auto-actions (merge, comment) should remain opt-in.
955
+
956
+ **Specification:**
957
+
958
+ 1. **PR Monitor Service** (`apps/control-plane/src/application/services/pr-monitor-service.ts`)
959
+ - Interface:
960
+ ```typescript
961
+ interface PrMonitorService {
962
+   detectPr(featureId: string, branch: string): Promise<PrInfo | null>;
963
+   getCiStatus(prNumber: number): Promise<CiStatus>;
964
+   getReviewDecision(prNumber: number): Promise<ReviewDecision>;
965
+   getMergeability(prNumber: number): Promise<MergeabilityInfo>;
966
+ }
967
+ ```
968
+ - Implementation via `gh` CLI (no Octokit dependency needed)
969
+ - PR info stored in feature state under the `pr` key: `number`, `url`, `ci_status`, `review_decision`, `merge_ready`
970
+
971
+ 2. **Feature State Extension:**
972
+ ```yaml
973
+ pr:
974
+   number: 42
975
+   url: "https://github.com/org/repo/pull/42"
976
+   ci_status: passing # passing | failing | pending | none
977
+   review_decision: approved # approved | changes_requested | pending | none
978
+   merge_ready: true
979
+ ```
980
+
981
+ 3. **Dashboard Integration:**
982
+ - PR column in feature table
983
+ - Merge score calculation: `ci_weight * ci_pass + review_weight * approved + conflict_weight * no_conflicts`
984
+ - Sortable PR table view
985
+
986
+ 4. **Reaction Integration (extends G5):**
987
+ ```yaml
988
+ reactions:
989
+   ci_failed_upstream:
990
+     enabled: true
991
+     action: notify_only # or: retry_with_agent_repair
992
+   changes_requested:
993
+     enabled: true
994
+     action: send_review_context_to_agent
995
+     escalate_after: 2
996
+ ```
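The merge score formula from the Dashboard Integration item can be made concrete with a small sketch. The weights below and the `has_conflicts` field are assumptions chosen for illustration; the spec names the three signals but does not fix their values or where conflict status is stored.

```typescript
type CiStatus = "passing" | "failing" | "pending" | "none";
type ReviewDecision = "approved" | "changes_requested" | "pending" | "none";

interface PrSignals {
  number: number;
  ci_status: CiStatus;
  review_decision: ReviewDecision;
  has_conflicts: boolean; // hypothetical field, not part of the feature state example
}

interface MergeWeights {
  ci: number;
  review: number;
  conflict: number;
}

// Example weights only; real values would come from configuration.
const EXAMPLE_WEIGHTS: MergeWeights = { ci: 0.5, review: 0.25, conflict: 0.25 };

// ci_weight * ci_pass + review_weight * approved + conflict_weight * no_conflicts
function mergeScore(pr: PrSignals, w: MergeWeights = EXAMPLE_WEIGHTS): number {
  const ciPass = pr.ci_status === "passing" ? 1 : 0;
  const approved = pr.review_decision === "approved" ? 1 : 0;
  const noConflicts = pr.has_conflicts ? 0 : 1;
  return w.ci * ciPass + w.review * approved + w.conflict * noConflicts;
}

// Highest score first, without mutating the input (for the sortable PR table).
function sortByMergeScore(prs: PrSignals[], w: MergeWeights = EXAMPLE_WEIGHTS): PrSignals[] {
  return [...prs].sort((a, b) => mergeScore(b, w) - mergeScore(a, w));
}
```

Each signal is binary here, so a PR with pending CI scores the same as one with failing CI; a graded scale would be a reasonable refinement.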
997
+
998
+ **Acceptance Criteria:**
999
+ - [ ] PR detected automatically once the feature branch has an open PR
1000
+ - [ ] CI status and review decisions reflected in feature state
1001
+ - [ ] Dashboard shows PR info with merge score
1002
+ - [ ] `changes_requested` reaction sends review context to agent (opt-in)
1003
+
1004
+ **Estimated Effort:** 1.5 weeks (Medium — requires `gh` CLI integration, state schema extension, dashboard components)
1005
+
1006
+ ---
1007
+
1008
+ #### G7: Review Comment Auto-Handling
1009
+
1010
+ **ComposioHQ Feature:** Agents automatically address PR review comments. Lifecycle manager detects `changes_requested` review state and sends review context to agent with `send-to-agent` action. Agent reads comments, applies fixes, pushes new commits.
1011
+
1012
+ **Gap Impact:** AOP has no PR lifecycle integration, so review comments must be handled manually.
1013
+
1014
+ **AOP Design Alignment:** MEDIUM (revised from LOW) — Composio's implementation is pragmatic: it forwards review comments to agents as additional context, letting the agent decide what to fix. This doesn't bypass review — it accelerates the review-fix-rereview cycle. Compatible with AOP's model if review feedback is treated as additional spec input.
1015
+
1016
+ **Decision:** **PROMOTE to P1** — Subsume into G6e (PR Lifecycle Integration). When `changes_requested` is detected, send review comment context to builder agent as a corrective prompt. The agent applies fixes in the worktree, re-runs gates, and the feature re-enters the review cycle. The human reviewer retains approval authority.
1017
+
1018
+ **Alternative:** If PR integration (G6e) is deferred, this remains P3.
1019
+
1020
+ ---
1021
+
1022
+ #### G8: Alternative Workspace Modes (Clone, Copy)
1023
+
1024
+ **ComposioHQ Feature:** Clone and copy workspace modes in addition to worktrees.
1025
+
1026
+ **Gap Impact:** AOP supports worktrees only, with no alternative for setups where sharing one `.git` directory is a concern.
1027
+
1028
+ **AOP Design Alignment:** LOW - Worktrees are optimal for parallel work; clone/copy add complexity.
1029
+
1030
+ **Decision:** **DEFER to P3** - Worktrees are sufficient; clone/copy modes add marginal value at high implementation cost.
1031
+
1032
+ ---
1033
+
1034
+ ### 3.3 P2 GAPS (Nice to Have - Future Consideration)
1035
+
1036
+ #### G9: Multi-Tracker Support (GitHub, Linear, Jira)
1037
+
1038
+ **ComposioHQ Feature:** Pluggable tracker integration.
1039
+
1040
+ **Gap Impact:** Spec-only issue references; no direct tracker sync.
1041
+
1042
+ **AOP Design Alignment:** MEDIUM - Could enrich context but not essential.
1043
+
1044
+ **Specification:**
1045
+
1046
+ **Tracker Abstraction:**
1047
+ ```typescript
1048
+ interface IssueTracker {
1049
+   getIssue(issueId: string): Promise<Issue>;
1050
+   updateIssueStatus(issueId: string, status: string): Promise<void>;
1051
+   addComment(issueId: string, comment: string): Promise<void>;
1052
+ }
1053
+ ```
1054
+
1055
+ **Config Extension:**
1056
+ ```yaml
1057
+ # agentic/orchestrator/policy.yaml
1058
+ issue_tracker:
1059
+   type: github # or linear, jira
1060
+   config:
1061
+     token: ${GITHUB_TOKEN}
1062
+     repo: myorg/myrepo
1063
+ ```
1064
+
1065
+ **Integration Points:**
1066
+ 1. **Spec enrichment:** Fetch issue details, inject into planner context
1067
+ 2. **Status sync:** Update issue status when feature advances (planning → building → merged)
1068
+ 3. **Comment posting:** Post gate results, evidence links as issue comments
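As a rough illustration of the status-sync integration point, the sketch below pairs the `IssueTracker` interface with an in-memory implementation. The `Issue` shape, the phase-to-status mapping, and the comment text are all assumptions; a real adapter would call the GitHub, Linear, or Jira API instead.

```typescript
// Hypothetical Issue shape; the spec references the type without defining it.
interface Issue {
  id: string;
  status: string;
  comments: string[];
}

interface IssueTracker {
  getIssue(issueId: string): Promise<Issue>;
  updateIssueStatus(issueId: string, status: string): Promise<void>;
  addComment(issueId: string, comment: string): Promise<void>;
}

// In-memory stand-in, useful for tests; no network calls.
class InMemoryIssueTracker implements IssueTracker {
  private readonly issues = new Map<string, Issue>();

  constructor(seed: Issue[]) {
    for (const issue of seed) {
      this.issues.set(issue.id, { ...issue, comments: [...issue.comments] });
    }
  }

  async getIssue(issueId: string): Promise<Issue> {
    const issue = this.issues.get(issueId);
    if (!issue) throw new Error(`issue ${issueId} not found`);
    return issue;
  }

  async updateIssueStatus(issueId: string, status: string): Promise<void> {
    (await this.getIssue(issueId)).status = status;
  }

  async addComment(issueId: string, comment: string): Promise<void> {
    (await this.getIssue(issueId)).comments.push(comment);
  }
}

type FeaturePhase = "planning" | "building" | "merged";

// Assumed mapping; each tracker would define its own status vocabulary.
const ISSUE_STATUS_FOR_PHASE: Record<FeaturePhase, string> = {
  planning: "todo",
  building: "in_progress",
  merged: "done",
};

// Called when a feature advances: update the issue status, then leave a comment.
async function syncFeaturePhase(
  tracker: IssueTracker,
  issueId: string,
  phase: FeaturePhase,
): Promise<void> {
  await tracker.updateIssueStatus(issueId, ISSUE_STATUS_FOR_PHASE[phase]);
  await tracker.addComment(issueId, `AOP: feature entered ${phase}`);
}
```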
1069
+
1070
+ **Estimated Effort:** 2 weeks (Medium - requires API clients, auth, config schema)
1071
+
1072
+ **Decision:** **P2** - Nice to have but not critical. Spec files provide sufficient context.
1073
+
1074
+ ---
1075
+
1076
+ #### G10: Multiple Runtime Environments (Docker, K8s, SSH)
1077
+
1078
+ **ComposioHQ Feature:** Pluggable runtime (tmux, Docker, K8s, process, SSH, E2B).
1079
+
1080
+ **Gap Impact:** MCP-only execution; no remote execution options.
1081
+
1082
+ **AOP Design Alignment:** LOW - MCP transport abstraction already supports remote MCP servers.
1083
+
1084
+ **Decision:** **DEFER to P3** - Current MCP transport supports Docker via remote MCP server. K8s/SSH add complexity without clear benefit.
1085
+
1086
+ ---
1087
+
1088
+ #### G11: Hash-Based Multi-Instance Isolation
1089
+
1090
+ **ComposioHQ Feature:** Config-path-derived hashing for safe multi-checkout operation.
1091
+
1092
+ **Gap Impact:** Single-instance assumption; cannot run multiple orchestrator checkouts safely.
1093
+
1094
+ **AOP Design Alignment:** MEDIUM - Run lease already provides single-instance guarantee.
1095
+
1096
+ **Specification:**
1097
+
1098
+ **Instance Isolation Strategy:**
1099
+ 1. Derive instance ID from config path hash (SHA256)
1100
+ 2. Namespace run lease file: `.aop/runtime/<instance_id>/run-lease.json`
1101
+ 3. Each instance has independent dashboard port, worktree paths
1102
+ 4. CLI detects instance ID from config path, routes commands accordingly
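Steps 1 and 2 of the strategy above could look roughly like this, using Node's built-in `crypto` and `path` modules. Resolving the config path to an absolute path before hashing and truncating the digest to 12 hex characters are assumptions made for readability; the spec only says the ID is derived from a SHA256 hash of the config path.

```typescript
import { createHash } from "node:crypto";
import * as path from "node:path";

// Derive a stable instance ID from the config file location.
function instanceId(configPath: string): string {
  // Assumption: normalize to an absolute path so relative spellings hash identically.
  const absolute = path.resolve(configPath);
  // Assumption: a 12-character prefix keeps directory names short.
  return createHash("sha256").update(absolute).digest("hex").slice(0, 12);
}

// Namespaced lease location: .aop/runtime/<instance_id>/run-lease.json
function runLeasePath(repoRoot: string, configPath: string): string {
  return path.join(repoRoot, ".aop", "runtime", instanceId(configPath), "run-lease.json");
}
```

Two checkouts of the same repository at different filesystem locations produce different IDs, so their lease files never collide.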
1103
+
1104
+ **Estimated Effort:** 1 week (Low - requires path hashing, namespace prefixing)
1105
+
1106
+ **Decision:** **P2** - Useful for testing/staging but not critical. Workaround: run leases already prevent collisions.
1107
+
1108
+ ---
1109
+
1110
+ #### G12: Terminal Integration Plugins
1111
+
1112
+ **ComposioHQ Feature:** iTerm2 integration, web terminal support.
1113
+
1114
+ **Gap Impact:** No terminal integration; CLI-only.
1115
+
1116
+ **AOP Design Alignment:** LOW - Dashboard terminal viewer covers most use cases.
1117
+
1118
+ **Decision:** **DEFER to P3** - Dashboard terminal viewer (G1) provides sufficient terminal access.
1119
+
1120
+ ---
1121
+
1122
+ ### 3.4 P3 GAPS (Low Priority - Deferred)
1123
+
1124
+ #### G13-G18: Deferred Gaps
1125
+
1126
+ **Deferred due to low ROI or architectural misalignment:**
1127
+ - **G13: Plugin System** - Conflicts with deterministic kernel design; provider abstraction sufficient
1128
+ - **G14: Alternative Workspace Modes** - Worktrees optimal; clone/copy add complexity
1129
+ - **G15: K8s/SSH Runtimes** - MCP transport abstraction sufficient
1130
+ - **G16: Review Comment Auto-Handling** - Originally deferred as conflicting with explicit review workflow; revised under G7, which promotes it to P1 as part of G6e
1131
+ - **G17: Auto-Merge on Green CI** - Conflicts with explicit merge control principle
1132
+ - **G18: Terminal Plugins** - Dashboard terminal viewer sufficient
1133
+
1134
+ ---
1135
+
1136
+ ### 3.5 Novel Features (Not in Either Package - AOP Differentiators)
1137
+
1138
+ These features emerged from analyzing both codebases and represent opportunities for AOP to leapfrog both its current state and Composio's capabilities.
1139
+
1140
+ #### N1: Incremental Gate Execution (P1)
1141
+
1142
+ **Problem:** Both AOP and Composio run full test suites on every gate pass. For large codebases, this wastes minutes re-running unaffected tests.
1143
+
1144
+ **Specification:**
1145
+ - After `repo.apply_patch`, compute affected file set from diff
1146
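+ - Use the test runner's change-aware selection (vitest `--changed`, jest `--changedSince`) to select affected tests; pytest's `--lf` only reruns the previous run's failures, so pytest projects need a separate change-based selector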
1147
+ - Gate config extension:
1148
+ ```yaml
1149
+ gates:
1150
+   profiles:
1151
+     default:
1152
+       modes:
1153
+         fast:
1154
+           commands:
1155
+             - "npx vitest run --changed {base_branch}" # incremental
1156
+         full:
1157
+           commands:
1158
+             - "npx vitest run" # full suite
1159
+ ```
1160
+ - `fast` mode uses incremental; `full` and `merge` modes run complete suite
1161
+ - Evidence captures which tests were skipped and why
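One way to resolve the `{base_branch}` placeholder and pick commands per mode is sketched below. The function names, the rule that `merge` falls back to `full` when it is not configured, and failing on unknown placeholders are assumptions; the spec only states that `fast` is incremental while `full` and `merge` run the complete suite.

```typescript
type GateMode = "fast" | "full" | "merge";

interface GateProfile {
  modes: Partial<Record<GateMode, { commands: string[] }>>;
}

// Replace {name} placeholders; unknown names fail loudly rather than run a broken command.
function interpolate(command: string, vars: Record<string, string>): string {
  return command.replace(/\{(\w+)\}/g, (match: string, key: string): string => {
    const value = vars[key];
    if (value === undefined) throw new Error(`unknown placeholder ${match}`);
    return value;
  });
}

function commandsFor(
  profile: GateProfile,
  mode: GateMode,
  vars: Record<string, string>,
): string[] {
  // Assumption: a mode without its own entry reuses the full suite.
  const selected = profile.modes[mode] ?? profile.modes.full;
  if (!selected) throw new Error(`no commands configured for mode ${mode}`);
  return selected.commands.map((command) => interpolate(command, vars));
}
```

Since the interpolated value ends up in a shell command, a real implementation would also want to validate that `base_branch` is a well-formed ref name.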
1162
+
1163
+ **Impact:** 2-5x faster gate cycles for large projects. Direct competitive advantage.
1164
+
1165
+ **Estimated Effort:** 1 week
1166
+
1167
+ ---
1168
+
1169
+ #### N2: Parallel Gate Execution (P2)
1170
+
1171
+ **Problem:** Gates run sequentially. Independent gates (lint, type-check, unit tests) could run concurrently.
1172
+
1173
+ **Specification:**
1174
+ - Gate config supports `parallel: true` flag per command group
1175
+ - Commands within a parallel group execute concurrently via `Promise.allSettled`
1176
+ - Evidence captured per-command; overall gate fails if any parallel command fails
1177
+ - Sequential dependencies via `depends_on` field
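The concurrent path can be sketched with `Promise.allSettled`, which waits for every command and reports each outcome separately, so evidence can be captured per command. `GateGroup`, `CommandResult`, and `Runner` are hypothetical shapes, and stopping a sequential group at its first failure is an assumption. Ordering groups by `depends_on` is left out of this sketch.

```typescript
interface GateGroup {
  name: string;
  commands: string[];
  parallel?: boolean;
}

interface CommandResult {
  command: string;
  ok: boolean;
  detail: string;
}

// Hypothetical executor: resolves when the command succeeds, rejects when it fails.
type Runner = (command: string) => Promise<void>;

async function runGroup(group: GateGroup, run: Runner): Promise<CommandResult[]> {
  if (group.parallel) {
    // All commands start together; allSettled never short-circuits on a rejection.
    const settled = await Promise.allSettled(group.commands.map((c) => run(c)));
    return settled.map((outcome, i) => ({
      command: group.commands[i] ?? "",
      ok: outcome.status === "fulfilled",
      detail: outcome.status === "rejected" ? String(outcome.reason) : "passed",
    }));
  }

  const results: CommandResult[] = [];
  for (const command of group.commands) {
    try {
      await run(command);
      results.push({ command, ok: true, detail: "passed" });
    } catch (err) {
      results.push({ command, ok: false, detail: String(err) });
      break; // Assumption: later sequential commands are skipped after a failure.
    }
  }
  return results;
}

// The gate passes only if every recorded command passed.
function gatePassed(results: CommandResult[]): boolean {
  return results.every((r) => r.ok);
}
```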
1178
+
1179
+ **Estimated Effort:** 3 days
1180
+
1181
+ ---
1182
+
1183
+ #### N3: Cost Tracking & Budget Enforcement (P2)
1184
+
1185
+ **Problem:** Neither system tracks or limits LLM API costs per feature. Runaway agent loops can burn tokens.
1186
+
1187
+ **Specification:**
1188
+ - Track token usage per feature via operation ledger metadata
1189
+ - Budget config:
1190
+ ```yaml
1191
+ budget:
1192
+   per_feature_limit: 50.00 # USD
1193
+   per_phase_limit: 20.00
1194
+   alert_threshold: 0.8 # notify at 80% of budget
1195
+ ```
1196
+ - Supervisor checks budget before each worker decision loop iteration
1197
+ - Over-budget triggers notification + feature pause (not kill)
1198
+ - Dashboard shows cost-per-feature column
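The check the supervisor would run before each loop iteration can be expressed as a pure function over the budget config. The decision names and the choice to apply `alert_threshold` to whichever limit is closest to exhaustion are assumptions; the config keys mirror the example above.

```typescript
interface BudgetConfig {
  per_feature_limit: number; // USD
  per_phase_limit: number; // USD
  alert_threshold: number; // fraction of a limit, e.g. 0.8
}

type BudgetDecision = "continue" | "alert" | "pause";

function checkBudget(
  featureSpendUsd: number,
  phaseSpendUsd: number,
  cfg: BudgetConfig,
): BudgetDecision {
  // Reaching either limit pauses the feature (pause, not kill).
  if (featureSpendUsd >= cfg.per_feature_limit || phaseSpendUsd >= cfg.per_phase_limit) {
    return "pause";
  }
  // Assumption: alert on whichever budget is proportionally closest to its limit.
  const usage = Math.max(
    featureSpendUsd / cfg.per_feature_limit,
    phaseSpendUsd / cfg.per_phase_limit,
  );
  return usage >= cfg.alert_threshold ? "alert" : "continue";
}
```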
1199
+
1200
+ **Estimated Effort:** 1 week
1201
+
1202
+ ---
1203
+
1204
+ #### N4: Dependency-Aware Feature Scheduling (P2)
1205
+
1206
+ **Problem:** Neither system supports declaring that Feature B depends on Feature A. AOP's collision detection is file-path based, not semantic.
1207
+
1208
+ **Specification:**
1209
+ - Spec metadata supports `depends_on: [feature_a]`
1210
+ - Scheduler defers dependent features to blocked queue until dependencies reach `merged` status
1211
+ - Automatic promotion when dependency chain resolves
1212
+ - Circular dependency detection at plan submission time
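Both the submission-time cycle check and the blocked-queue promotion reduce to small graph operations over `depends_on`. The sketch below uses a depth-first search for cycles and a single pass for readiness; `FeatureSpec`, `findCycle`, and `schedule` are hypothetical names, and unknown dependency IDs are simply treated as unmerged.

```typescript
interface FeatureSpec {
  id: string;
  depends_on: string[];
}

// Returns the first dependency cycle found (e.g. ["a", "b", "a"]), or null if none.
function findCycle(specs: FeatureSpec[]): string[] | null {
  const deps = new Map<string, string[]>();
  for (const spec of specs) deps.set(spec.id, spec.depends_on);

  const done = new Set<string>(); // fully explored, known cycle-free
  const stack: string[] = []; // current DFS path

  const visit = (id: string): string[] | null => {
    if (done.has(id)) return null;
    const at = stack.indexOf(id);
    if (at !== -1) return [...stack.slice(at), id]; // back edge closes a cycle
    stack.push(id);
    for (const dep of deps.get(id) ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    done.add(id);
    return null;
  };

  for (const spec of specs) {
    const cycle = visit(spec.id);
    if (cycle) return cycle;
  }
  return null;
}

// Split unmerged features into ready (all dependencies merged) and blocked.
function schedule(
  specs: FeatureSpec[],
  merged: ReadonlySet<string>,
): { ready: string[]; blocked: string[] } {
  const ready: string[] = [];
  const blocked: string[] = [];
  for (const spec of specs) {
    if (merged.has(spec.id)) continue;
    const unblocked = spec.depends_on.every((dep) => merged.has(dep));
    (unblocked ? ready : blocked).push(spec.id);
  }
  return { ready, blocked };
}
```

Re-running `schedule` whenever a feature reaches `merged` gives the automatic promotion behavior: anything that moves from `blocked` to `ready` can be started.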
1213
+
1214
+ **Estimated Effort:** 1 week
1215
+
1216
+ ---
1217
+
1218
+ #### N5: Agent Performance Analytics (P3)
1219
+
1220
+ **Problem:** Neither system tracks which provider/model combinations succeed more often at which task types.
1221
+
1222
+ **Specification:**
1223
+ - Record per-feature outcome metrics: gate pass rate, retry count, time-to-merge, cost
1224
+ - Aggregate by provider + model over time
1225
+ - Optional: feed analytics into provider selection heuristics
1226
+
1227
+ **Estimated Effort:** 1.5 weeks
1228
+
1229
+ ---
1230
+
1231
+ #### N6: Typed Adapter Registry (Extension Point Taxonomy) (P0)
1232
+
1233
+ **Problem:** AOP currently defines extension interfaces ad hoc per feature (G2 introduces `NotifierChannel`, G6 extends `WorkerProvider`, G9 defines `IssueTracker`, G6a adds `ActivityMonitor`, G6e adds `PrMonitorService`). Each is a standalone interface with its own discovery, configuration, and error handling. This leads to duplicated patterns, inconsistent adapter lifecycle, and a codebase that gets harder to extend with every new concern axis. Meanwhile, the existing `ProviderSelection` in `providers.ts` uses a hardcoded union type (`'codex' | 'claude' | ... | 'copilot'`), so adding a new provider means editing the union, the resolution logic, and every switch that touches it.
1234
+
1235
+ **Relationship to Plugin Systems:** This is *not* a plugin system. Plugins imply runtime discovery, dynamic loading, and third-party code running inside the process boundary — all of which undermine AOP's deterministic guarantees. An adapter registry is a **compile-time contract** with **config-driven selection**: the kernel knows every adapter that exists at build time, validates adapter configuration against schemas, and routes through the same deterministic pipeline (RBAC, validation, audit) as everything else. Adapters don't own state — the kernel does. Adapters don't make decisions — the kernel does. Adapters just answer "how do I talk to Slack" or "how do I parse Claude Code's session files."
1236
+
1237
+ **Specification:**
1238
+
1239
+ 1. **Core Abstraction** (`apps/control-plane/src/application/adapters/adapter-registry.ts`):
1240
+ ```typescript
1241
+ /** A typed slot that adapters can fill. */
1242
+ interface AdapterSlot<TContract> {
1243
+   readonly name: string;       // e.g. 'notification-channel', 'agent-provider'
1244
+   readonly contract: TContract; // the interface adapters must implement
1245
+ }
1246
+
1247
+ /** Metadata every adapter must declare. */
1248
+ interface AdapterManifest {
1249
+   readonly slot: string;              // which slot this fills
1250
+   readonly name: string;              // unique adapter name within slot (e.g. 'slack', 'claude')
1251
+   readonly configSchema?: JsonSchema; // AJV schema for adapter-specific config
1252
+ }
1253
+
1254
+ /** The registry: slot → (name → adapter instance). */
1255
+ interface AdapterRegistry {
1256
+   register<T>(slot: AdapterSlot<T>, manifest: AdapterManifest, factory: (config: unknown) => T): void;
1257
+   resolve<T>(slot: AdapterSlot<T>, name: string, config: unknown): T;
1258
+   list(slot: string): ReadonlyArray<AdapterManifest>;
1259
+   has(slot: string, name: string): boolean;
1260
+ }
1261
+ ```
1262
+
1263
+ 2. **Adapter Slots** (formalized concern axes):
1264
+
1265
+ | Slot | Contract Interface | Built-in Adapters | Used By |
1266
+ |------|-------------------|-------------------|---------|
1267
+ | `agent-provider` | `WorkerProvider` | codex, claude, gemini, kiro-cli, copilot, custom | Supervisor runtime, G6 send/attach |
1268
+ | `notification-channel` | `NotifierChannel` | desktop, slack, webhook | G2 NotifierService |
1269
+ | `scm-provider` | `ScmProvider` | github (via `gh` CLI) | G6e PR lifecycle |
1270
+ | `issue-tracker` | `IssueTracker` | github, linear, jira | G9 tracker support |
1271
+ | `activity-detector` | `ActivityDetector` | claude-jsonl, codex-rpc, process-heuristic | G6a activity monitoring |
1272
+
1273
+ 3. **Registration & Resolution:**
1274
+ - All built-in adapters are registered at kernel boot time in a deterministic order
1275
+ - Registration validates the adapter config from `policy.yaml` / `agents.yaml` against the adapter's declared `configSchema`
1276
+ - Resolution is config-driven: `policy.yaml` specifies which adapter name to use per slot
1277
+ - Resolution fails fast with structured error (`adapter_not_found`, `adapter_config_invalid`) if the adapter doesn't exist or config doesn't validate
1278
+ - No dynamic imports, no runtime discovery, no third-party code — every adapter is a known import at build time
1279
+
1280
+ 4. **Config Integration:**
1281
+ ```yaml
1282
+ # agentic/orchestrator/policy.yaml
1283
+ adapters:
1284
+   notification-channel: slack # selects the 'slack' adapter for this slot
1285
+   scm-provider: github # selects 'github' for SCM
1286
+   issue-tracker: github # selects 'github' for issue tracking
1287
+   activity-detector: claude-jsonl # selects Claude Code JSONL parser
1288
+
1289
+ # agentic/orchestrator/agents.yaml (existing, unchanged)
1290
+ runtime:
1291
+   default_provider: claude # selects 'claude' for agent-provider slot
1292
+ ```
1293
+
1294
+ 5. **Schema Validation:**
1295
+ - New schema: `agentic/orchestrator/schemas/adapters.schema.json`
1296
+ - Validates: slot names are known, adapter names exist within slots, adapter-specific config matches adapter's declared `configSchema`
1297
+ - Validated at kernel boot alongside existing policy/gates/agents schemas
1298
+
1299
+ 6. **Migration Path (Non-Breaking):**
1300
+ - Phase 1 (M29): Introduce `AdapterRegistry` and migrate `agent-provider` slot (replace hardcoded union in `providers.ts` with registry-based resolution). Existing `agents.yaml` config continues to work — `default_provider: claude` resolves through the registry.
1301
+ - Phase 2 (M30): Register `notification-channel`, `scm-provider`, `activity-detector` slots as G2/G6a/G6e are implemented. These features use the registry from the start rather than inventing their own discovery patterns.
1302
+ - Phase 3 (M31): Register `issue-tracker` slot when G9 is implemented.
1303
+ - Each phase is backward compatible — existing config keeps working, the registry just formalizes what was already implicit.
1304
+
1305
+ 7. **What This Is NOT:**
1306
+ - NOT a plugin system: no dynamic loading, no npm package discovery, no third-party extension API
1307
+ - NOT a service locator: adapters are resolved at boot time, not lazily on first use
1308
+ - NOT an abstraction for abstraction's sake: every slot maps to a concrete feature (G2, G6, G6a, G6e, G9) that is already planned
1309
+ - The kernel retains full authority over state, validation, RBAC, and audit. Adapters are leaf-node implementations behind the kernel's deterministic pipeline.
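To make the register/resolve contract concrete, here is a minimal in-memory sketch. It deviates from the interfaces above in two labeled ways: the slot's `contract` is an optional phantom field (interfaces have no runtime value), and a hypothetical `validate` callback stands in for AJV validation of `configSchema` so the example has no dependencies. `AdapterError`, `InMemoryAdapterRegistry`, and the `NotifierChannel` shape are illustrative names only.

```typescript
interface AdapterSlot<TContract> {
  readonly name: string;
  readonly contract?: TContract; // phantom: carries the type only
}

interface AdapterManifest {
  readonly slot: string;
  readonly name: string;
  // Stand-in for configSchema + AJV: returns a list of problems, empty when valid.
  readonly validate?: (config: unknown) => string[];
}

type AdapterErrorCode = "adapter_not_found" | "adapter_config_invalid";

class AdapterError extends Error {
  constructor(readonly code: AdapterErrorCode, message: string) {
    super(message);
  }
}

interface Entry {
  manifest: AdapterManifest;
  factory: (config: unknown) => unknown;
}

class InMemoryAdapterRegistry {
  private readonly slots = new Map<string, Map<string, Entry>>();

  register<T>(slot: AdapterSlot<T>, manifest: AdapterManifest, factory: (config: unknown) => T): void {
    if (manifest.slot !== slot.name) {
      throw new Error(`manifest slot ${manifest.slot} does not match ${slot.name}`);
    }
    const entries = this.slots.get(slot.name) ?? new Map<string, Entry>();
    if (entries.has(manifest.name)) {
      throw new Error(`duplicate adapter ${slot.name}/${manifest.name}`);
    }
    entries.set(manifest.name, { manifest, factory });
    this.slots.set(slot.name, entries);
  }

  resolve<T>(slot: AdapterSlot<T>, name: string, config: unknown): T {
    const entry = this.slots.get(slot.name)?.get(name);
    if (!entry) throw new AdapterError("adapter_not_found", `${slot.name}/${name}`);
    const problems = entry.manifest.validate?.(config) ?? [];
    if (problems.length > 0) {
      throw new AdapterError("adapter_config_invalid", problems.join("; "));
    }
    // Safe by construction: register() only accepts factories returning T for this slot.
    return entry.factory(config) as T;
  }

  list(slot: string): ReadonlyArray<AdapterManifest> {
    return Array.from(this.slots.get(slot)?.values() ?? []).map((e) => e.manifest);
  }

  has(slot: string, name: string): boolean {
    return this.slots.get(slot)?.has(name) ?? false;
  }
}

// Usage with a hypothetical notifier contract.
interface NotifierChannel {
  format(message: string): string;
}

const NOTIFICATION_CHANNEL: AdapterSlot<NotifierChannel> = { name: "notification-channel" };

const registry = new InMemoryAdapterRegistry();
registry.register(
  NOTIFICATION_CHANNEL,
  {
    slot: "notification-channel",
    name: "slack",
    validate: (config) =>
      typeof (config as { channel?: unknown } | null)?.channel === "string"
        ? []
        : ["channel must be a string"],
  },
  (config) => ({
    format: (message) => `[${(config as { channel: string }).channel}] ${message}`,
  }),
);
```

Everything registered is a static import known at build time, so the sketch stays within the "no dynamic loading" constraint; only the selection by name is driven by config.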
1310
+
1311
+ **Acceptance Criteria:**
1312
+ - [ ] `AdapterRegistry` supports register/resolve/list/has operations with type safety
1313
+ - [ ] `agent-provider` slot migrated from hardcoded union to registry (no config changes required)
1314
+ - [ ] Adapter config validated against adapter-declared `configSchema` at boot
1315
+ - [ ] Resolution fails fast with structured error for unknown adapter or invalid config
1316
+ - [ ] No dynamic imports or runtime code loading — all adapters are static imports
1317
+ - [ ] Adding a new adapter to an existing slot requires: one file (implementation), one registration call, one config entry
1318
+
1319
+ **Estimated Effort:** 1 week (registry core + agent-provider migration). Subsequent slot registrations are ~1 day each, folded into the features that use them (G2, G6a, G6e, G9).
1320
+
1321
+ ---
1322
+
1323
+ ## 4. Implementation Roadmap
1324
+
1325
+ ### Phase 1: Critical UX Improvements (M29)
1326
+ **Duration:** 5-6 weeks
1327
+
1328
+ **Deliverables:**
1329
+ 1. **G1: Web Dashboard** (2 weeks)
1330
+ - Next.js dashboard with SSE updates
1331
+ - Feature cards, diff viewer, evidence viewer
1332
+ - Kanban view (planning/building/qa/ready_to_merge/merged columns)
1333
+ - Launch via `aop dashboard`
1334
+
1335
+ 2. **G2: Notification System** (1 week)
1336
+ - Register `notification-channel` adapter slot (N6): desktop, slack, webhook adapters
1337
+ - 4-tier priority routing (urgent/action/warning/info)
1338
+ - Event triggers (gate_failed, collision_detected, ready_to_merge, agent_stuck)
1339
+
1340
+ 3. **G3: Init Wizard** (1 week)
1341
+ - Interactive `aop init` command with `--auto` zero-prompt mode
1342
+ - Auto-detection (git repo, remote, branch, test framework)
1343
+ - Template generation (policy, gates, agents)
1344
+ - postCreate hooks and symlink configuration
1345
+
1346
+ 4. **G4: Multi-Project Config** (1.5 weeks)
1347
+ - Multi-project YAML schema
1348
+ - `--project` flag in CLI
1349
+ - Run lease isolation per project
1350
+
1351
+ 5. **G6b: Automated Session Cleanup** (3 days)
1352
+ - `aop cleanup [--dry-run] [--yes]`
1353
+ - Auto-cleanup after merge (configurable grace period)
1354
+
1355
+ 6. **G6c: Batch Feature Operations** (3 days)
1356
+ - `aop run -fl <folder> --batch` with deduplication
1357
+ - Summary output (created/skipped/failed counts)
1358
+
1359
+ 7. **G6d: Workspace postCreate Hooks & Symlinks** (2 days)
1360
+ - `post_create` commands in policy.yaml
1361
+ - Worktree symlinks for shared config files
1362
+
1363
+ 8. **N6: Typed Adapter Registry — Core + Agent-Provider Migration** (1 week)
1364
+ - `AdapterRegistry` with register/resolve/list/has and config schema validation
1365
+ - Migrate `agent-provider` slot from hardcoded union in `providers.ts` to registry-based resolution
1366
+ - `adapters.schema.json` validated at kernel boot
1367
+ - No config changes required — existing `agents.yaml` `default_provider` resolves through registry
1368
+
1369
+ **Milestone Acceptance:**
1370
+ - [ ] Dashboard displays live feature status with SSE updates + Kanban view
1371
+ - [ ] Dashboard review panel: approve/deny/request-changes with merge control
1372
+ - [ ] Dashboard checkout: one-click switch to feature branch for local testing with stash/restore
1373
+ - [ ] Notifications sent to Slack on gate failures with 4-tier priority routing
1374
+ - [ ] Adapter registry operational with `agent-provider` slot migrated; adding a new provider requires one file + one registration + one config entry
1375
+ - [ ] `aop init` generates valid config from git context (with `--auto` mode)
1376
+ - [ ] Multi-project config supports 2+ repos with isolated run leases
1377
+ - [ ] `aop cleanup` removes terminal features automatically
1378
+ - [ ] Batch run deduplicates and reports counts
1379
+ - [ ] Worktree post-create hooks execute and symlinks created
1380
+
1381
+ ---
1382
+
1383
+ ### Phase 2: Autonomous Operations & Observability (M30)
1384
+ **Duration:** 4-5 weeks
1385
+
1386
+ **Deliverables:**
1387
+ 1. **G5: CI Failure Auto-Remediation** (2 weeks)
1388
+ - Reaction policy config
1389
+ - Retry loop with agent repair
1390
+ - Escalation notifications (time-based + retry-count-based)
1391
+ - `agent_stuck` reaction with idle threshold
1392
+
1393
+ 2. **G6: Session Management Commands** (1.5 weeks)
1394
+ - `aop send <feature_id> <message>` with idle-wait
1395
+ - `aop attach <feature_id>` (interactive mode)
1396
+ - Provider interface extensions (via `agent-provider` adapter slot from N6)
1397
+
1398
+ 3. **G6a: Agent Activity Detection** (1 week)
1399
+ - Provider-specific activity state detection
1400
+ - Activity column in `aop status`
1401
+ - Stuck-agent notification triggers
1402
+ - Register `activity-detector` adapter slot (N6): claude-jsonl, codex-rpc, process-heuristic adapters
1403
+
1404
+ 4. **G6e: PR Lifecycle Integration** (1.5 weeks)
1405
+ - PR detection via `gh` CLI
1406
+ - CI status + review decision tracking in feature state
1407
+ - Merge score in dashboard
1408
+ - `changes_requested` reaction (subsumes G7)
1409
+ - Register `scm-provider` adapter slot (N6): github adapter
1410
+
1411
+ 5. **N1: Incremental Gate Execution** (1 week)
1412
+ - `--changed` flag support in gate commands
1413
+ - Fast mode uses incremental, full/merge modes run complete suite
1414
+
1415
+ **Milestone Acceptance:**
1416
+ - [ ] Gate failures trigger automatic retry with agent repair + time-based escalation
1417
+ - [ ] Agent activity state visible in `aop status` and dashboard
1418
+ - [ ] `aop send` delivers messages to active agents with idle-wait
1419
+ - [ ] PR state (CI, reviews, merge readiness) tracked in feature state
1420
+ - [ ] Incremental gates reduce fast-mode execution time by ≥50%
1421
+
1422
+ ---
1423
+
1424
+ ### Phase 3: Ecosystem Integration (M31)
1425
+ **Duration:** 3-4 weeks
1426
+
1427
+ **Deliverables:**
1428
+ 1. **G9: Multi-Tracker Support** (2 weeks)
1429
+ - Register `issue-tracker` adapter slot (N6): github, linear, jira adapters
1430
+ - Issue context enrichment
1431
+ - Status sync (optional)
1432
+
1433
+ 2. **G11: Hash-Based Multi-Instance Isolation** (1 week)
1434
+ - Instance ID from config path hash
1435
+ - Namespaced run leases
1436
+ - Multi-instance dashboard aggregation
1437
+
1438
+ 3. **N3: Cost Tracking & Budget Enforcement** (1 week)
1439
+ - Per-feature token/cost tracking via operation ledger
1440
+ - Budget limits with notification at threshold
1441
+ - Cost column in dashboard
1442
+
1443
+ 4. **N4: Dependency-Aware Feature Scheduling** (1 week)
1444
+ - `depends_on` in spec metadata
1445
+ - Automatic promotion when dependencies merge
1446
+ - Circular dependency detection
1447
+
1448
+ **Milestone Acceptance:**
1449
+ - [ ] Planner receives issue context from GitHub/Linear
1450
+ - [ ] Feature status updates sync to issue tracker
1451
+ - [ ] Multiple orchestrator instances run safely with isolated leases
1452
+ - [ ] Per-feature cost tracked and budget alerts fire at threshold
1453
+ - [ ] Dependent features auto-promote when dependencies merge
1454
+
1455
+ ---
1456
+
1457
+ ## 5. Testing Strategy
1458
+
1459
+ ### 5.1 Unit Tests
1460
+ - **Dashboard:** SSE event emission, API route handlers, file polling, Kanban column assignment, review decision dispatch (approve/deny/request_changes → tool client calls), checkout flow (stash detection, branch switch, restore state tracking)
1461
+ - **Notifications:** Channel routing (4-tier), message formatting, failure handling, throttle/batch
1462
+ - **Init Wizard:** Git detection, template generation, schema validation, `--auto` mode
1463
+ - **Multi-Project:** Config parsing, project selection, lease isolation
1464
+ - **Reactions:** Retry logic, escalation triggers (time + count), agent repair loops, stuck detection
1465
+ - **Activity Monitor:** Provider-specific state detection, idle threshold, unknown fallback
1466
+ - **Cleanup:** Terminal feature detection, grace period, dry-run mode
1467
+ - **Batch Operations:** Deduplication, sequential spawn, summary reporting
1468
+ - **PR Monitor:** `gh` CLI parsing, state mapping, merge score calculation
1469
+ - **Incremental Gates:** Changed-file detection, command interpolation, evidence tracking
1470
+ - **Cost Tracking:** Token accumulation, budget threshold detection, pause logic
1471
+
1472
+ ### 5.2 Integration Tests
1473
+ - **Dashboard E2E:** Feature status updates → SSE events → UI refresh
1474
+ - **Dashboard Review E2E:** Feature reaches ready_to_merge → reviewer approves via dashboard → merge executes → feature moves to merged
1475
+ - **Dashboard Checkout E2E:** Reviewer clicks checkout → stash created → branch switched → restore returns to original branch + stash pop
1476
+ - **Notification E2E:** Gate failure → Slack webhook call → message received
1477
+ - **Multi-Project E2E:** Run two projects concurrently → verify lease isolation
1478
+ - **Reaction E2E:** Gate failure → retry → agent repair → gate re-run → success
1479
+ - **Cleanup E2E:** Feature merged → grace period → auto-cleanup → index updated
1480
+ - **PR Lifecycle E2E:** Branch push → PR detected → CI status tracked → review feedback → agent fix
1481
+
1482
+ ### 5.3 Manual Acceptance Tests
1483
+ - Dashboard visual inspection (UI polish, responsiveness)
1484
+ - `aop init` wizard flow (user-friendly prompts, error messages)
1485
+ - `aop send` / `aop attach` interactive sessions (terminal streaming)
1486
+ - Multi-tracker issue sync (GitHub API responses, Linear updates)
1487
+
1488
+ ---
1489
+
1490
+ ## 6. Configuration Schema Changes
1491
+
1492
+ ### 6.1 Policy Extensions
1493
+
1494
+ ```yaml
1495
+ # agentic/orchestrator/policy.yaml
1496
+
1497
+ # NEW: Notifications
1498
+ notifications:
1499
+   enabled: true
1500
+   channels:
1501
+     desktop:
1502
+       enabled: true
1503
+     slack:
1504
+       enabled: true
1505
+       webhook: ${SLACK_WEBHOOK_URL}
1506
+       channel: "#aop-alerts"
1507
+     webhook:
1508
+       enabled: false
1509
+       url: ${CUSTOM_WEBHOOK_URL}
1510
+   routing:
1511
+     critical: [desktop, slack]
1512
+     warning: [slack]
1513
+     info: [slack]
1514
+
1515
+ # NEW: Reactions
1516
+ reactions:
1517
+   gate_failed:
1518
+     enabled: true
1519
+     max_retries: 2
1520
+     action: retry_with_agent_repair
1521
+     escalate_after: 2
1522
+     retry_delay: 30s
1523
+   collision_detected:
1524
+     enabled: true
1525
+     action: notify_only
1526
+   ready_to_merge:
1527
+     enabled: true
1528
+     action: notify_only
1529
+
1530
+ # NEW: Dashboard
1531
+ dashboard:
1532
+   enabled: true
1533
+   port: 3000
1534
+   auth:
1535
+     enabled: false # future: API key auth
1536
+
1537
+ # NEW: Issue Tracker (optional)
1538
+ issue_tracker:
1539
+   enabled: false
1540
+   type: github # or linear, jira
1541
+   config:
1542
+     token: ${GITHUB_TOKEN}
1543
+     repo: myorg/myrepo
1544
+ ```
1545
+
1546
+ ### 6.2 Multi-Project Schema
1547
+
1548
+ ```yaml
1549
+ # agentic/orchestrator/multi-project.yaml (NEW FILE)
1550
+ version: "1.0"
1551
+
1552
+ defaults:
1553
+   max_active_features: 5
1554
+   max_parallel_gate_runs: 3
1555
+   dashboard_port: 3000
1556
+   notifications:
1557
+     enabled: true
1558
+
1559
+ projects:
1560
+   - name: "project_a"
1561
+     path: ~/repos/project_a
1562
+     repo: "org/project_a"
1563
+     branch: main
1564
+     policy: agentic/orchestrator/policy.yaml
1565
+     gates: agentic/orchestrator/gates-project-a.yaml
1566
+     dashboard_port: 3001 # override
1567
+ ```
1568
+
1569
+ ---
1570
+
1571
+ ## 7. Acceptance Criteria (Phase 1 - M29)
1572
+
1573
+ ### Dashboard (G1)
1574
+ - [ ] Dashboard displays features in real-time via SSE
1575
+ - [ ] Feature detail page shows state, plan, diff, evidence
1576
+ - [ ] Diff viewer renders syntax-highlighted diffs
1577
+ - [ ] Evidence artifacts downloadable from dashboard
1578
+ - [ ] Dashboard survives orchestrator restart (reconnects SSE)
1579
+ - [ ] Review panel: approve & merge, deny with reason, request changes with feedback to agent
1580
+ - [ ] Checkout button: switch main repo to feature branch with auto-stash and restore
1581
+ - [ ] Checkout safety: blocked when no worktree/branch or repo has conflicts
1582
+
1583
+ ### Notifications (G2)
1584
+ - [ ] Desktop notification on gate failure (macOS/Linux)
1585
+ - [ ] Slack webhook receives formatted messages with links
1586
+ - [ ] Notification config validated on startup
1587
+ - [ ] Notification failures logged but do not crash orchestrator
1588
+
1589
+ ### Init Wizard (G3)
1590
+ - [ ] Wizard detects git repo and parses remote URL
1591
+ - [ ] Generated config files pass schema validation
1592
+ - [ ] Template selection generates correct gates.yaml for test framework
1593
+ - [ ] Wizard handles non-git directories gracefully
1594
+
1595
+ ### Multi-Project (G4)
1596
+ - [ ] Multi-project config validates against schema
1597
+ - [ ] `--project` flag selects correct project
1598
+ - [ ] Run leases isolated per project (parallel safe)
1599
+ - [ ] `aop status --all` aggregates across projects
1600
+
1601
+ ---
1602
+
1603
+ ## 8. Non-Goals
1604
+
1605
+ **Features explicitly excluded from this spec:**
1606
+ 1. **Auto-merge on green CI** — Conflicts with explicit merge control. AOP requires human approval before merge.
1607
+ 2. **Plugin system** — Conflicts with deterministic kernel design. Provider abstraction + MCP tool registry provide sufficient extensibility.
1608
+ 3. **K8s/SSH runtimes** — MCP transport abstraction already supports remote MCP servers for distributed execution.
1609
+ 4. **Alternative workspace modes (clone/copy)** — Worktrees are optimal for parallel work with shared .git. Clone mode adds complexity with marginal benefit (workaround: symlinks via G6d).
1610
+ 5. **Terminal plugins (iTerm2, web terminal)** — Dashboard with log viewer (G1) sufficient for monitoring. Interactive `aop attach` (G6) covers direct agent access.
1611
+ 6. **Meta-agent orchestration pattern** — Composio's approach (AI agent as orchestrator using CLI) is innovative but trades determinism for flexibility. AOP's code-driven supervisor provides stronger guarantees. Noted as architectural divergence, not a gap.
1612
+
1613
+ **Revised from v1.1:**
1614
+ - **Review comment auto-handling (G7)** — PROMOTED from Non-Goal to P1. Pragmatic implementation (forward review comments to agent) is compatible with AOP's review model. Subsumed into G6e (PR Lifecycle Integration).
1615
+
1616
+ ---
1617
+
1618
+ ## 9. Migration Path
1619
+
1620
+ ### 9.1 Backward Compatibility
1621
+
1622
+ **Existing Features Unaffected:**
1623
+ - All core MCP tools remain unchanged
1624
+ - Existing CLI commands (`run`, `status`, `resume`, `delete`) backward compatible
1625
+ - Existing config files work without changes (new fields optional)
1626
+ - Feature state/plan/index schemas unchanged
1627
+
1628
+ **New Features Opt-In:**
1629
+ - Dashboard: Launch via `aop dashboard` (opt-in)
1630
+ - Notifications: Disabled by default; enable in policy.yaml
1631
+ - Reactions: Disabled by default; enable in policy.yaml
1632
+ - Multi-project: Single-project mode remains default
1633
+
1634
+ ### 9.2 Deprecation Policy
1635
+
1636
+ **No deprecations in M29-M31.**
1637
+
1638
+ Future consideration (M32+):
1639
+ - Deprecate `--transport mcp` in favor of remote MCP server URLs
1640
+ - Deprecate in-process transport in favor of local MCP server
1641
+
1642
+ ---
1643
+
1644
+ ## 10. Success Metrics
1645
+
1646
+ ### 10.1 Quantitative Metrics
1647
+
1648
+ **M29 (Phase 1):**
1649
+ - Dashboard page load < 2s
1650
+ - SSE event latency < 2s (file change → UI update)
1651
+ - `aop init` completion time < 60s
1652
+ - Notification delivery latency < 5s
1653
+
1654
+ **M30 (Phase 2):**
1655
+ - Retry success rate > 60% (gate failures auto-resolved)
1656
+ - Escalation rate < 20% (most failures resolved before human intervention)
1657
+ - `aop send` message delivery < 1s
1658
+
1659
+ **M31 (Phase 3):**
1660
+ - Multi-project config validation time < 5s
1661
+ - Issue tracker sync latency < 10s
1662
+ - Multi-instance run lease acquisition < 1s
1663
+
1664
+ ### 10.2 Qualitative Metrics
1665
+
1666
+ **User Experience:**
1667
+ - Dashboard intuitive for first-time users (user testing)
1668
+ - Init wizard reduces setup time from 30 min to 5 min
1669
+ - Notifications reduce "poll for status" behavior
1670
+
1671
+ **Operational Excellence:**
1672
+ - Auto-remediation reduces manual intervention by 50%
1673
+ - Multi-project support enables single-dashboard management
1674
+ - Session commands reduce workflow restarts by 70%
1675
+
1676
+ ---
1677
+
1678
+ ## 11. Risk Assessment
1679
+
1680
+ ### 11.1 Technical Risks
1681
+
1682
+ **Risk 1: SSE Scalability**
1683
+ - **Impact:** Dashboard becomes unresponsive with >10 features
1684
+ - **Mitigation:** Implement event batching, debounce updates, connection pooling
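The batching and debouncing mitigation above can be sketched as a small coalescing buffer. This is a minimal illustration, not the package's implementation: the `EventBatcher` name, the 0.5 s window, and the `(feature_id, payload)` event shape are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class EventBatcher:
    """Coalesce per-feature updates and release them at most once per window."""

    window_s: float = 0.5  # hypothetical debounce window, not taken from the spec
    _pending: Dict[str, Any] = field(default_factory=dict, init=False)
    _last_flush: Optional[float] = field(default=None, init=False)

    def add(self, feature_id: str, payload: Any) -> None:
        # A later update for the same feature replaces the earlier one,
        # so a subscriber only ever receives the latest state per feature.
        self._pending[feature_id] = payload

    def flush(self, now: float) -> List[Tuple[str, Any]]:
        """Return pending (feature_id, payload) pairs if the window has elapsed."""
        if not self._pending:
            return []
        if self._last_flush is not None and now - self._last_flush < self.window_s:
            return []  # still inside the debounce window; keep buffering
        batch = list(self._pending.items())
        self._pending.clear()
        self._last_flush = now
        return batch
```

A caller would invoke `add` on every file-change event and `flush` on a timer, sending one SSE message per returned batch. Passing `now` explicitly keeps the sketch deterministic and easy to test.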
1685
+
1686
+ **Risk 2: Provider API Changes**
1687
+ - **Impact:** `aop send` / `aop attach` break when Claude/Codex updates
1688
+ - **Mitigation:** Version provider interface, graceful fallback, release notes monitoring
1689
+
1690
+ **Risk 3: Notification Delivery Failures**
1691
+ - **Impact:** Critical alerts lost (gate failures, collisions)
1692
+ - **Mitigation:** Log all notification attempts, retry failed deliveries, fall back to desktop notifications
1693
+
1694
+ **Risk 4: Multi-Project Config Complexity**
1695
+ - **Impact:** Users misconfigure run leases → orchestrator conflicts
1696
+ - **Mitigation:** Schema validation, init wizard guidance, clear error messages
1697
+
1698
+ ### 11.2 Operational Risks
1699
+
1700
+ **Risk 1: Dashboard Security**
1701
+ - **Impact:** Unauthorized access to feature diffs, evidence
1702
+ - **Mitigation:** Add API key auth (P2), limit to localhost by default
1703
+
1704
+ **Risk 2: Retry Loop Abuse**
1705
+ - **Impact:** Infinite retry loops consume resources
1706
+ - **Mitigation:** Hard cap on retries (max 5), exponential backoff, manual override required to continue past the cap
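One way to read the retry mitigation above is as a delay function that refuses to schedule a sixth attempt unless a human explicitly allows it. The sketch below assumes that reading. The cap of 5 comes from the spec, while `next_delay`, the 2-second base delay, and the doubling factor are hypothetical choices for illustration.

```python
from typing import Optional

MAX_RETRIES = 5  # hard cap named in the mitigation above


def next_delay(attempts_made: int,
               base_s: float = 2.0,       # hypothetical starting delay
               factor: float = 2.0,       # hypothetical growth factor
               max_retries: int = MAX_RETRIES,
               manual_override: bool = False) -> Optional[float]:
    """Return the wait in seconds before the next retry, or None to stop."""
    if attempts_made < 0:
        raise ValueError("attempts_made must be non-negative")
    if attempts_made >= max_retries and not manual_override:
        return None  # cap reached: escalate instead of retrying again
    # With an override the delay stays at the largest capped value
    # rather than continuing to grow.
    exponent = min(attempts_made, max_retries - 1)
    return base_s * factor ** exponent
```

With these defaults the five automatic retries wait 2, 4, 8, 16, and 32 seconds, and any further retry requires `manual_override=True`.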
1707
+
1708
+ **Risk 3: Slack Webhook Rate Limits**
1709
+ - **Impact:** Notifications dropped during high activity
1710
+ - **Mitigation:** Implement rate limiting, batch notifications, queue overflow alerts
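The rate-limiting mitigation above could take the form of a per-feature minimum-interval throttle. The sketch below assumes the "max 1 per feature per 5min" figure recommended under Open Questions. The `NotificationThrottle` class and its `suppressed` counter (a possible input for a summary digest) are hypothetical names, not part of the package.

```python
from typing import Dict


class NotificationThrottle:
    """Allow at most one notification per feature per interval."""

    def __init__(self, min_interval_s: float = 300.0) -> None:
        self.min_interval_s = min_interval_s      # 5 minutes by default
        self._last_sent: Dict[str, float] = {}    # feature_id -> last send time
        self.suppressed: Dict[str, int] = {}      # feature_id -> dropped count

    def allow(self, feature_id: str, now: float) -> bool:
        last = self._last_sent.get(feature_id)
        if last is not None and now - last < self.min_interval_s:
            # Count what was dropped so a later digest can report it.
            self.suppressed[feature_id] = self.suppressed.get(feature_id, 0) + 1
            return False
        self._last_sent[feature_id] = now
        return True
```

Throttling per feature rather than globally means a retry storm on one feature does not hide the first failure on another.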
1711
+
1712
+ ---
1713
+
1714
+ ## 12. Open Questions
1715
+
1716
+ 1. **Dashboard Auth:** Should dashboard require authentication for production deployments?
1717
+ - **Recommendation:** Add optional API key auth in M30; disabled by default in M29
1718
+
1719
+ 2. **Notification Throttling:** How to prevent notification spam during gate retry storms?
1720
+ - **Recommendation:** Batch notifications (max 1 per feature per 5min), summary digest option
1721
+
1722
+ 3. **Multi-Project Dashboard:** Should dashboard show all projects simultaneously or require switching?
1723
+ - **Recommendation:** Project switcher in M29; global view in M30
1724
+
1725
+ 4. **Session Attach TTY:** Should `aop attach` use raw TTY mode or wrap in TUI?
1726
+ - **Recommendation:** Raw TTY for Claude/Codex compatibility; TUI wrapper optional in M31
1727
+
1728
+ 5. **Issue Tracker Sync Frequency:** Real-time or batched?
1729
+ - **Recommendation:** Batched (every 5min) to avoid rate limits; manual sync command for immediate update
1730
+
1731
+ 6. **Meta-Agent vs Code-Driven Orchestration:** Should AOP offer an optional meta-agent mode where the orchestrator is itself an AI agent (Composio's pattern)?
1732
+ - **Recommendation:** No for M29-M31. AOP's code-driven supervisor provides deterministic guarantees that a meta-agent cannot. However, consider a hybrid mode in M32+ where a meta-agent can suggest but not execute priority decisions.
1733
+
1734
+ 7. **Activity Detection Reliability:** JSONL session file formats differ across Claude Code versions. How to handle format changes?
1735
+ - **Recommendation:** Version-detect the JSONL format; fall back to process-alive + last-tool-call heuristic for unknown formats.
1736
+
1737
+ 8. **Cost Model Accuracy:** Token costs vary by provider and model. How to price accurately?
1738
+ - **Recommendation:** Use provider-reported token counts when available (for example, the `usage` field in Claude API responses). Fall back to tiktoken estimation, which is only approximate for non-OpenAI models. Allow user-configured $/token overrides.
1739
+
1740
+ 9. **Incremental Gate Safety:** Incremental test selection may miss integration-level regressions. How to balance speed vs safety?
1741
+ - **Recommendation:** `fast` mode uses incremental (acceptable risk for inner dev loop). `full` and `merge` modes always run complete suite. Document the tradeoff.
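The cost-model recommendation in question 8 describes a three-step preference order: provider-reported counts, then an estimate, with a user override on price. A minimal sketch of that order follows. The function names are hypothetical, and the four-characters-per-token heuristic is only a stand-in for a real tokenizer such as tiktoken.

```python
from typing import Optional


def estimate_tokens(text: str) -> int:
    # Crude stand-in for a tokenizer: roughly 4 characters per token.
    # This is an assumption for the example, not a measured ratio.
    return max(1, len(text) // 4) if text else 0


def cost_usd(text: str,
             reported_tokens: Optional[int],
             usd_per_token: float,
             override_usd_per_token: Optional[float] = None) -> float:
    """Price a call, preferring provider-reported counts and user overrides."""
    tokens = reported_tokens if reported_tokens is not None else estimate_tokens(text)
    rate = override_usd_per_token if override_usd_per_token is not None else usd_per_token
    return tokens * rate
```

Recording whether a given cost came from a reported or an estimated count would let reports flag the less reliable figures.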
1742
+
1743
+ ---
1744
+
1745
+ ## 13. Conclusion
1746
+
1747
+ This specification (v2.0) significantly expands the gap analysis based on a deep-dive into Composio's actual source code (not just README claims). The original v1.1 identified 12 gaps; this revision adds 5 newly discovered gaps (G6a-G6e) and 5 novel differentiator features (N1-N5), revises priority decisions (G7 promoted from Non-Goal to P1), and provides implementation-level detail for all additions.
1748
+
1749
+ **Phase 1 (M29)** delivers essential UX improvements (dashboard, notifications, init wizard, multi-project support) plus quick wins (cleanup automation, batch operations, workspace hooks) that eliminate major friction points.
1750
+
1751
+ **Phase 2 (M30)** adds autonomous operations (auto-remediation, activity detection, session commands, PR lifecycle integration, incremental gates) that reduce manual intervention and provide runtime observability.
1752
+
1753
+ **Phase 3 (M31)** integrates with external ecosystems (issue trackers, multi-instance isolation, cost tracking, dependency scheduling) for enterprise workflows.
1754
+
1755
+ **Key Principles:**
1756
+ 1. Every new feature is opt-in, backward compatible, and does not compromise deterministic guarantees or explicit merge control.
1757
+ 2. AOP's code-driven supervisor is an intentional architectural choice, not a gap. Meta-agent orchestration (Composio's pattern) trades determinism for flexibility — AOP chooses determinism.
1758
+ 3. Novel features (N1-N5) represent competitive differentiation opportunities that neither package currently offers.
1759
+
1760
+ **Total estimated effort:** M29 (5-6 weeks) + M30 (4-5 weeks) + M31 (3-4 weeks) = ~12-15 weeks.
1761
+
1762
+ ---
1763
+
1764
+ **End of Specification**