@miller-tech/uap 1.0.0

This diff shows the content of publicly available package versions released to one of the supported registries. It is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (660)
  1. package/LICENSE +21 -0
  2. package/README.md +888 -0
  3. package/dist/analyzers/index.d.ts +3 -0
  4. package/dist/analyzers/index.d.ts.map +1 -0
  5. package/dist/analyzers/index.js +684 -0
  6. package/dist/analyzers/index.js.map +1 -0
  7. package/dist/benchmarks/agents/naive-agent.d.ts +60 -0
  8. package/dist/benchmarks/agents/naive-agent.d.ts.map +1 -0
  9. package/dist/benchmarks/agents/naive-agent.js +144 -0
  10. package/dist/benchmarks/agents/naive-agent.js.map +1 -0
  11. package/dist/benchmarks/agents/uap-agent.d.ts +167 -0
  12. package/dist/benchmarks/agents/uap-agent.d.ts.map +1 -0
  13. package/dist/benchmarks/agents/uap-agent.js +437 -0
  14. package/dist/benchmarks/agents/uap-agent.js.map +1 -0
  15. package/dist/benchmarks/benchmark.d.ts +328 -0
  16. package/dist/benchmarks/benchmark.d.ts.map +1 -0
  17. package/dist/benchmarks/benchmark.js +112 -0
  18. package/dist/benchmarks/benchmark.js.map +1 -0
  19. package/dist/benchmarks/execution-verifier.d.ts +41 -0
  20. package/dist/benchmarks/execution-verifier.d.ts.map +1 -0
  21. package/dist/benchmarks/execution-verifier.js +340 -0
  22. package/dist/benchmarks/execution-verifier.js.map +1 -0
  23. package/dist/benchmarks/hierarchical-prompting.d.ts +37 -0
  24. package/dist/benchmarks/hierarchical-prompting.d.ts.map +1 -0
  25. package/dist/benchmarks/hierarchical-prompting.js +246 -0
  26. package/dist/benchmarks/hierarchical-prompting.js.map +1 -0
  27. package/dist/benchmarks/improved-benchmark.d.ts +89 -0
  28. package/dist/benchmarks/improved-benchmark.d.ts.map +1 -0
  29. package/dist/benchmarks/improved-benchmark.js +585 -0
  30. package/dist/benchmarks/improved-benchmark.js.map +1 -0
  31. package/dist/benchmarks/index.d.ts +11 -0
  32. package/dist/benchmarks/index.d.ts.map +1 -0
  33. package/dist/benchmarks/index.js +11 -0
  34. package/dist/benchmarks/index.js.map +1 -0
  35. package/dist/benchmarks/model-integration.d.ts +111 -0
  36. package/dist/benchmarks/model-integration.d.ts.map +1 -0
  37. package/dist/benchmarks/model-integration.js +904 -0
  38. package/dist/benchmarks/model-integration.js.map +1 -0
  39. package/dist/benchmarks/multi-turn-agent.d.ts +44 -0
  40. package/dist/benchmarks/multi-turn-agent.d.ts.map +1 -0
  41. package/dist/benchmarks/multi-turn-agent.js +254 -0
  42. package/dist/benchmarks/multi-turn-agent.js.map +1 -0
  43. package/dist/benchmarks/multi-turn-loop.d.ts +57 -0
  44. package/dist/benchmarks/multi-turn-loop.d.ts.map +1 -0
  45. package/dist/benchmarks/multi-turn-loop.js +167 -0
  46. package/dist/benchmarks/multi-turn-loop.js.map +1 -0
  47. package/dist/benchmarks/tasks.d.ts +19 -0
  48. package/dist/benchmarks/tasks.d.ts.map +1 -0
  49. package/dist/benchmarks/tasks.js +435 -0
  50. package/dist/benchmarks/tasks.js.map +1 -0
  51. package/dist/bin/cli.d.ts +3 -0
  52. package/dist/bin/cli.d.ts.map +1 -0
  53. package/dist/bin/cli.js +546 -0
  54. package/dist/bin/cli.js.map +1 -0
  55. package/dist/bin/llama-server-optimize.d.ts +18 -0
  56. package/dist/bin/llama-server-optimize.d.ts.map +1 -0
  57. package/dist/bin/llama-server-optimize.js +708 -0
  58. package/dist/bin/llama-server-optimize.js.map +1 -0
  59. package/dist/bin/policy.d.ts +3 -0
  60. package/dist/bin/policy.d.ts.map +1 -0
  61. package/dist/bin/policy.js +143 -0
  62. package/dist/bin/policy.js.map +1 -0
  63. package/dist/bin/tool-calls.d.ts +3 -0
  64. package/dist/bin/tool-calls.d.ts.map +1 -0
  65. package/dist/bin/tool-calls.js +4 -0
  66. package/dist/bin/tool-calls.js.map +1 -0
  67. package/dist/browser/index.d.ts +2 -0
  68. package/dist/browser/index.d.ts.map +1 -0
  69. package/dist/browser/index.js +2 -0
  70. package/dist/browser/index.js.map +1 -0
  71. package/dist/browser/web-browser.d.ts +30 -0
  72. package/dist/browser/web-browser.d.ts.map +1 -0
  73. package/dist/browser/web-browser.js +93 -0
  74. package/dist/browser/web-browser.js.map +1 -0
  75. package/dist/cli/agent.d.ts +20 -0
  76. package/dist/cli/agent.d.ts.map +1 -0
  77. package/dist/cli/agent.js +474 -0
  78. package/dist/cli/agent.js.map +1 -0
  79. package/dist/cli/analyze.d.ts +7 -0
  80. package/dist/cli/analyze.d.ts.map +1 -0
  81. package/dist/cli/analyze.js +103 -0
  82. package/dist/cli/analyze.js.map +1 -0
  83. package/dist/cli/completion-gates.d.ts +51 -0
  84. package/dist/cli/completion-gates.d.ts.map +1 -0
  85. package/dist/cli/completion-gates.js +201 -0
  86. package/dist/cli/completion-gates.js.map +1 -0
  87. package/dist/cli/compliance.d.ts +8 -0
  88. package/dist/cli/compliance.d.ts.map +1 -0
  89. package/dist/cli/compliance.js +509 -0
  90. package/dist/cli/compliance.js.map +1 -0
  91. package/dist/cli/coord.d.ts +7 -0
  92. package/dist/cli/coord.d.ts.map +1 -0
  93. package/dist/cli/coord.js +138 -0
  94. package/dist/cli/coord.js.map +1 -0
  95. package/dist/cli/dashboard.d.ts +21 -0
  96. package/dist/cli/dashboard.d.ts.map +1 -0
  97. package/dist/cli/dashboard.js +1508 -0
  98. package/dist/cli/dashboard.js.map +1 -0
  99. package/dist/cli/deploy.d.ts +19 -0
  100. package/dist/cli/deploy.d.ts.map +1 -0
  101. package/dist/cli/deploy.js +387 -0
  102. package/dist/cli/deploy.js.map +1 -0
  103. package/dist/cli/droids.d.ts +9 -0
  104. package/dist/cli/droids.d.ts.map +1 -0
  105. package/dist/cli/droids.js +227 -0
  106. package/dist/cli/droids.js.map +1 -0
  107. package/dist/cli/generate.d.ts +17 -0
  108. package/dist/cli/generate.d.ts.map +1 -0
  109. package/dist/cli/generate.js +432 -0
  110. package/dist/cli/generate.js.map +1 -0
  111. package/dist/cli/hooks.d.ts +9 -0
  112. package/dist/cli/hooks.d.ts.map +1 -0
  113. package/dist/cli/hooks.js +464 -0
  114. package/dist/cli/hooks.js.map +1 -0
  115. package/dist/cli/init.d.ts +12 -0
  116. package/dist/cli/init.d.ts.map +1 -0
  117. package/dist/cli/init.js +364 -0
  118. package/dist/cli/init.js.map +1 -0
  119. package/dist/cli/mcp-router.d.ts +16 -0
  120. package/dist/cli/mcp-router.d.ts.map +1 -0
  121. package/dist/cli/mcp-router.js +143 -0
  122. package/dist/cli/mcp-router.js.map +1 -0
  123. package/dist/cli/memory.d.ts +24 -0
  124. package/dist/cli/memory.d.ts.map +1 -0
  125. package/dist/cli/memory.js +885 -0
  126. package/dist/cli/memory.js.map +1 -0
  127. package/dist/cli/model.d.ts +15 -0
  128. package/dist/cli/model.d.ts.map +1 -0
  129. package/dist/cli/model.js +290 -0
  130. package/dist/cli/model.js.map +1 -0
  131. package/dist/cli/patterns.d.ts +26 -0
  132. package/dist/cli/patterns.d.ts.map +1 -0
  133. package/dist/cli/patterns.js +862 -0
  134. package/dist/cli/patterns.js.map +1 -0
  135. package/dist/cli/rtk-validation.d.ts +9 -0
  136. package/dist/cli/rtk-validation.d.ts.map +1 -0
  137. package/dist/cli/rtk-validation.js +9 -0
  138. package/dist/cli/rtk-validation.js.map +1 -0
  139. package/dist/cli/rtk.d.ts +34 -0
  140. package/dist/cli/rtk.d.ts.map +1 -0
  141. package/dist/cli/rtk.js +401 -0
  142. package/dist/cli/rtk.js.map +1 -0
  143. package/dist/cli/schema-diff.d.ts +7 -0
  144. package/dist/cli/schema-diff.d.ts.map +1 -0
  145. package/dist/cli/schema-diff.js +11 -0
  146. package/dist/cli/schema-diff.js.map +1 -0
  147. package/dist/cli/setup-mcp-router.d.ts +8 -0
  148. package/dist/cli/setup-mcp-router.d.ts.map +1 -0
  149. package/dist/cli/setup-mcp-router.js +163 -0
  150. package/dist/cli/setup-mcp-router.js.map +1 -0
  151. package/dist/cli/setup-wizard.d.ts +2 -0
  152. package/dist/cli/setup-wizard.d.ts.map +1 -0
  153. package/dist/cli/setup-wizard.js +806 -0
  154. package/dist/cli/setup-wizard.js.map +1 -0
  155. package/dist/cli/setup.d.ts +15 -0
  156. package/dist/cli/setup.d.ts.map +1 -0
  157. package/dist/cli/setup.js +154 -0
  158. package/dist/cli/setup.js.map +1 -0
  159. package/dist/cli/sync.d.ts +8 -0
  160. package/dist/cli/sync.d.ts.map +1 -0
  161. package/dist/cli/sync.js +395 -0
  162. package/dist/cli/sync.js.map +1 -0
  163. package/dist/cli/task.d.ts +33 -0
  164. package/dist/cli/task.d.ts.map +1 -0
  165. package/dist/cli/task.js +672 -0
  166. package/dist/cli/task.js.map +1 -0
  167. package/dist/cli/tool-calls.d.ts +20 -0
  168. package/dist/cli/tool-calls.d.ts.map +1 -0
  169. package/dist/cli/tool-calls.js +605 -0
  170. package/dist/cli/tool-calls.js.map +1 -0
  171. package/dist/cli/uap.d.ts +10 -0
  172. package/dist/cli/uap.d.ts.map +1 -0
  173. package/dist/cli/uap.js +398 -0
  174. package/dist/cli/uap.js.map +1 -0
  175. package/dist/cli/update.d.ts +10 -0
  176. package/dist/cli/update.d.ts.map +1 -0
  177. package/dist/cli/update.js +300 -0
  178. package/dist/cli/update.js.map +1 -0
  179. package/dist/cli/visualize.d.ts +77 -0
  180. package/dist/cli/visualize.d.ts.map +1 -0
  181. package/dist/cli/visualize.js +287 -0
  182. package/dist/cli/visualize.js.map +1 -0
  183. package/dist/cli/worktree.d.ts +9 -0
  184. package/dist/cli/worktree.d.ts.map +1 -0
  185. package/dist/cli/worktree.js +213 -0
  186. package/dist/cli/worktree.js.map +1 -0
  187. package/dist/coordination/adaptive-patterns.d.ts +65 -0
  188. package/dist/coordination/adaptive-patterns.d.ts.map +1 -0
  189. package/dist/coordination/adaptive-patterns.js +108 -0
  190. package/dist/coordination/adaptive-patterns.js.map +1 -0
  191. package/dist/coordination/auto-agent.d.ts +82 -0
  192. package/dist/coordination/auto-agent.d.ts.map +1 -0
  193. package/dist/coordination/auto-agent.js +145 -0
  194. package/dist/coordination/auto-agent.js.map +1 -0
  195. package/dist/coordination/capability-router.d.ts +79 -0
  196. package/dist/coordination/capability-router.d.ts.map +1 -0
  197. package/dist/coordination/capability-router.js +334 -0
  198. package/dist/coordination/capability-router.js.map +1 -0
  199. package/dist/coordination/database.d.ts +13 -0
  200. package/dist/coordination/database.d.ts.map +1 -0
  201. package/dist/coordination/database.js +136 -0
  202. package/dist/coordination/database.js.map +1 -0
  203. package/dist/coordination/deploy-batcher.d.ts +122 -0
  204. package/dist/coordination/deploy-batcher.d.ts.map +1 -0
  205. package/dist/coordination/deploy-batcher.js +718 -0
  206. package/dist/coordination/deploy-batcher.js.map +1 -0
  207. package/dist/coordination/droid-validator.d.ts +59 -0
  208. package/dist/coordination/droid-validator.d.ts.map +1 -0
  209. package/dist/coordination/droid-validator.js +142 -0
  210. package/dist/coordination/droid-validator.js.map +1 -0
  211. package/dist/coordination/index.d.ts +10 -0
  212. package/dist/coordination/index.d.ts.map +1 -0
  213. package/dist/coordination/index.js +10 -0
  214. package/dist/coordination/index.js.map +1 -0
  215. package/dist/coordination/pattern-router.d.ts +50 -0
  216. package/dist/coordination/pattern-router.d.ts.map +1 -0
  217. package/dist/coordination/pattern-router.js +118 -0
  218. package/dist/coordination/pattern-router.js.map +1 -0
  219. package/dist/coordination/service.d.ts +81 -0
  220. package/dist/coordination/service.d.ts.map +1 -0
  221. package/dist/coordination/service.js +619 -0
  222. package/dist/coordination/service.js.map +1 -0
  223. package/dist/coordination/worktree-enforcer.d.ts +22 -0
  224. package/dist/coordination/worktree-enforcer.d.ts.map +1 -0
  225. package/dist/coordination/worktree-enforcer.js +71 -0
  226. package/dist/coordination/worktree-enforcer.js.map +1 -0
  227. package/dist/generators/claude-md.d.ts +3 -0
  228. package/dist/generators/claude-md.d.ts.map +1 -0
  229. package/dist/generators/claude-md.js +1020 -0
  230. package/dist/generators/claude-md.js.map +1 -0
  231. package/dist/generators/template-loader.d.ts +105 -0
  232. package/dist/generators/template-loader.d.ts.map +1 -0
  233. package/dist/generators/template-loader.js +291 -0
  234. package/dist/generators/template-loader.js.map +1 -0
  235. package/dist/index.d.ts +49 -0
  236. package/dist/index.d.ts.map +1 -0
  237. package/dist/index.js +63 -0
  238. package/dist/index.js.map +1 -0
  239. package/dist/mcp-router/config/parser.d.ts +9 -0
  240. package/dist/mcp-router/config/parser.d.ts.map +1 -0
  241. package/dist/mcp-router/config/parser.js +174 -0
  242. package/dist/mcp-router/config/parser.js.map +1 -0
  243. package/dist/mcp-router/executor/client.d.ts +31 -0
  244. package/dist/mcp-router/executor/client.d.ts.map +1 -0
  245. package/dist/mcp-router/executor/client.js +189 -0
  246. package/dist/mcp-router/executor/client.js.map +1 -0
  247. package/dist/mcp-router/index.d.ts +22 -0
  248. package/dist/mcp-router/index.d.ts.map +1 -0
  249. package/dist/mcp-router/index.js +18 -0
  250. package/dist/mcp-router/index.js.map +1 -0
  251. package/dist/mcp-router/output-compressor.d.ts +26 -0
  252. package/dist/mcp-router/output-compressor.d.ts.map +1 -0
  253. package/dist/mcp-router/output-compressor.js +236 -0
  254. package/dist/mcp-router/output-compressor.js.map +1 -0
  255. package/dist/mcp-router/search/fuzzy.d.ts +26 -0
  256. package/dist/mcp-router/search/fuzzy.d.ts.map +1 -0
  257. package/dist/mcp-router/search/fuzzy.js +94 -0
  258. package/dist/mcp-router/search/fuzzy.js.map +1 -0
  259. package/dist/mcp-router/server.d.ts +50 -0
  260. package/dist/mcp-router/server.d.ts.map +1 -0
  261. package/dist/mcp-router/server.js +229 -0
  262. package/dist/mcp-router/server.js.map +1 -0
  263. package/dist/mcp-router/session-stats.d.ts +37 -0
  264. package/dist/mcp-router/session-stats.d.ts.map +1 -0
  265. package/dist/mcp-router/session-stats.js +56 -0
  266. package/dist/mcp-router/session-stats.js.map +1 -0
  267. package/dist/mcp-router/tools/discover.d.ts +37 -0
  268. package/dist/mcp-router/tools/discover.d.ts.map +1 -0
  269. package/dist/mcp-router/tools/discover.js +65 -0
  270. package/dist/mcp-router/tools/discover.js.map +1 -0
  271. package/dist/mcp-router/tools/execute.d.ts +43 -0
  272. package/dist/mcp-router/tools/execute.d.ts.map +1 -0
  273. package/dist/mcp-router/tools/execute.js +144 -0
  274. package/dist/mcp-router/tools/execute.js.map +1 -0
  275. package/dist/mcp-router/types.d.ts +62 -0
  276. package/dist/mcp-router/types.d.ts.map +1 -0
  277. package/dist/mcp-router/types.js +6 -0
  278. package/dist/mcp-router/types.js.map +1 -0
  279. package/dist/memory/adaptive-context.d.ts +149 -0
  280. package/dist/memory/adaptive-context.d.ts.map +1 -0
  281. package/dist/memory/adaptive-context.js +1095 -0
  282. package/dist/memory/adaptive-context.js.map +1 -0
  283. package/dist/memory/agent-scoped-memory.d.ts +67 -0
  284. package/dist/memory/agent-scoped-memory.d.ts.map +1 -0
  285. package/dist/memory/agent-scoped-memory.js +126 -0
  286. package/dist/memory/agent-scoped-memory.js.map +1 -0
  287. package/dist/memory/ambiguity-detector.d.ts +54 -0
  288. package/dist/memory/ambiguity-detector.d.ts.map +1 -0
  289. package/dist/memory/ambiguity-detector.js +401 -0
  290. package/dist/memory/ambiguity-detector.js.map +1 -0
  291. package/dist/memory/backends/base.d.ts +18 -0
  292. package/dist/memory/backends/base.d.ts.map +1 -0
  293. package/dist/memory/backends/base.js +2 -0
  294. package/dist/memory/backends/base.js.map +1 -0
  295. package/dist/memory/backends/factory.d.ts +4 -0
  296. package/dist/memory/backends/factory.d.ts.map +1 -0
  297. package/dist/memory/backends/factory.js +53 -0
  298. package/dist/memory/backends/factory.js.map +1 -0
  299. package/dist/memory/backends/github.d.ts +27 -0
  300. package/dist/memory/backends/github.d.ts.map +1 -0
  301. package/dist/memory/backends/github.js +134 -0
  302. package/dist/memory/backends/github.js.map +1 -0
  303. package/dist/memory/backends/qdrant-cloud.d.ts +32 -0
  304. package/dist/memory/backends/qdrant-cloud.d.ts.map +1 -0
  305. package/dist/memory/backends/qdrant-cloud.js +167 -0
  306. package/dist/memory/backends/qdrant-cloud.js.map +1 -0
  307. package/dist/memory/context-compressor.d.ts +116 -0
  308. package/dist/memory/context-compressor.d.ts.map +1 -0
  309. package/dist/memory/context-compressor.js +430 -0
  310. package/dist/memory/context-compressor.js.map +1 -0
  311. package/dist/memory/context-pruner.d.ts +55 -0
  312. package/dist/memory/context-pruner.d.ts.map +1 -0
  313. package/dist/memory/context-pruner.js +85 -0
  314. package/dist/memory/context-pruner.js.map +1 -0
  315. package/dist/memory/correction-propagator.d.ts +44 -0
  316. package/dist/memory/correction-propagator.d.ts.map +1 -0
  317. package/dist/memory/correction-propagator.js +156 -0
  318. package/dist/memory/correction-propagator.js.map +1 -0
  319. package/dist/memory/daily-log.d.ts +67 -0
  320. package/dist/memory/daily-log.d.ts.map +1 -0
  321. package/dist/memory/daily-log.js +143 -0
  322. package/dist/memory/daily-log.js.map +1 -0
  323. package/dist/memory/dynamic-retrieval.d.ts +112 -0
  324. package/dist/memory/dynamic-retrieval.d.ts.map +1 -0
  325. package/dist/memory/dynamic-retrieval.js +908 -0
  326. package/dist/memory/dynamic-retrieval.js.map +1 -0
  327. package/dist/memory/embeddings.d.ts +172 -0
  328. package/dist/memory/embeddings.d.ts.map +1 -0
  329. package/dist/memory/embeddings.js +780 -0
  330. package/dist/memory/embeddings.js.map +1 -0
  331. package/dist/memory/generic-uap-patterns.d.ts +7 -0
  332. package/dist/memory/generic-uap-patterns.d.ts.map +1 -0
  333. package/dist/memory/generic-uap-patterns.js +43 -0
  334. package/dist/memory/generic-uap-patterns.js.map +1 -0
  335. package/dist/memory/hierarchical-memory.d.ts +141 -0
  336. package/dist/memory/hierarchical-memory.d.ts.map +1 -0
  337. package/dist/memory/hierarchical-memory.js +485 -0
  338. package/dist/memory/hierarchical-memory.js.map +1 -0
  339. package/dist/memory/knowledge-graph.d.ts +98 -0
  340. package/dist/memory/knowledge-graph.d.ts.map +1 -0
  341. package/dist/memory/knowledge-graph.js +275 -0
  342. package/dist/memory/knowledge-graph.js.map +1 -0
  343. package/dist/memory/memory-consolidator.d.ts +124 -0
  344. package/dist/memory/memory-consolidator.d.ts.map +1 -0
  345. package/dist/memory/memory-consolidator.js +514 -0
  346. package/dist/memory/memory-consolidator.js.map +1 -0
  347. package/dist/memory/memory-maintenance.d.ts +39 -0
  348. package/dist/memory/memory-maintenance.d.ts.map +1 -0
  349. package/dist/memory/memory-maintenance.js +336 -0
  350. package/dist/memory/memory-maintenance.js.map +1 -0
  351. package/dist/memory/model-router.d.ts +105 -0
  352. package/dist/memory/model-router.d.ts.map +1 -0
  353. package/dist/memory/model-router.js +474 -0
  354. package/dist/memory/model-router.js.map +1 -0
  355. package/dist/memory/multi-view-memory.d.ts +134 -0
  356. package/dist/memory/multi-view-memory.d.ts.map +1 -0
  357. package/dist/memory/multi-view-memory.js +430 -0
  358. package/dist/memory/multi-view-memory.js.map +1 -0
  359. package/dist/memory/predictive-memory.d.ts +79 -0
  360. package/dist/memory/predictive-memory.d.ts.map +1 -0
  361. package/dist/memory/predictive-memory.js +294 -0
  362. package/dist/memory/predictive-memory.js.map +1 -0
  363. package/dist/memory/prepopulate.d.ts +76 -0
  364. package/dist/memory/prepopulate.d.ts.map +1 -0
  365. package/dist/memory/prepopulate.js +832 -0
  366. package/dist/memory/prepopulate.js.map +1 -0
  367. package/dist/memory/semantic-compression.d.ts +77 -0
  368. package/dist/memory/semantic-compression.d.ts.map +1 -0
  369. package/dist/memory/semantic-compression.js +359 -0
  370. package/dist/memory/semantic-compression.js.map +1 -0
  371. package/dist/memory/serverless-qdrant.d.ts +102 -0
  372. package/dist/memory/serverless-qdrant.d.ts.map +1 -0
  373. package/dist/memory/serverless-qdrant.js +369 -0
  374. package/dist/memory/serverless-qdrant.js.map +1 -0
  375. package/dist/memory/short-term/factory.d.ts +26 -0
  376. package/dist/memory/short-term/factory.d.ts.map +1 -0
  377. package/dist/memory/short-term/factory.js +28 -0
  378. package/dist/memory/short-term/factory.js.map +1 -0
  379. package/dist/memory/short-term/indexeddb.d.ts +25 -0
  380. package/dist/memory/short-term/indexeddb.d.ts.map +1 -0
  381. package/dist/memory/short-term/indexeddb.js +64 -0
  382. package/dist/memory/short-term/indexeddb.js.map +1 -0
  383. package/dist/memory/short-term/schema.d.ts +6 -0
  384. package/dist/memory/short-term/schema.d.ts.map +1 -0
  385. package/dist/memory/short-term/schema.js +141 -0
  386. package/dist/memory/short-term/schema.js.map +1 -0
  387. package/dist/memory/short-term/sqlite.d.ts +64 -0
  388. package/dist/memory/short-term/sqlite.d.ts.map +1 -0
  389. package/dist/memory/short-term/sqlite.js +274 -0
  390. package/dist/memory/short-term/sqlite.js.map +1 -0
  391. package/dist/memory/speculative-cache.d.ts +111 -0
  392. package/dist/memory/speculative-cache.d.ts.map +1 -0
  393. package/dist/memory/speculative-cache.js +457 -0
  394. package/dist/memory/speculative-cache.js.map +1 -0
  395. package/dist/memory/task-classifier.d.ts +40 -0
  396. package/dist/memory/task-classifier.d.ts.map +1 -0
  397. package/dist/memory/task-classifier.js +342 -0
  398. package/dist/memory/task-classifier.js.map +1 -0
  399. package/dist/memory/terminal-bench-knowledge.d.ts +48 -0
  400. package/dist/memory/terminal-bench-knowledge.d.ts.map +1 -0
  401. package/dist/memory/terminal-bench-knowledge.js +622 -0
  402. package/dist/memory/terminal-bench-knowledge.js.map +1 -0
  403. package/dist/memory/write-gate.d.ts +39 -0
  404. package/dist/memory/write-gate.d.ts.map +1 -0
  405. package/dist/memory/write-gate.js +190 -0
  406. package/dist/memory/write-gate.js.map +1 -0
  407. package/dist/models/api-client.d.ts +46 -0
  408. package/dist/models/api-client.d.ts.map +1 -0
  409. package/dist/models/api-client.js +182 -0
  410. package/dist/models/api-client.js.map +1 -0
  411. package/dist/models/execution-profiles.d.ts +64 -0
  412. package/dist/models/execution-profiles.d.ts.map +1 -0
  413. package/dist/models/execution-profiles.js +403 -0
  414. package/dist/models/execution-profiles.js.map +1 -0
  415. package/dist/models/executor.d.ts +130 -0
  416. package/dist/models/executor.d.ts.map +1 -0
  417. package/dist/models/executor.js +382 -0
  418. package/dist/models/executor.js.map +1 -0
  419. package/dist/models/index.d.ts +19 -0
  420. package/dist/models/index.d.ts.map +1 -0
  421. package/dist/models/index.js +23 -0
  422. package/dist/models/index.js.map +1 -0
  423. package/dist/models/plan-validator.d.ts +37 -0
  424. package/dist/models/plan-validator.d.ts.map +1 -0
  425. package/dist/models/plan-validator.js +179 -0
  426. package/dist/models/plan-validator.js.map +1 -0
  427. package/dist/models/planner.d.ts +73 -0
  428. package/dist/models/planner.d.ts.map +1 -0
  429. package/dist/models/planner.js +375 -0
  430. package/dist/models/planner.js.map +1 -0
  431. package/dist/models/router.d.ts +96 -0
  432. package/dist/models/router.d.ts.map +1 -0
  433. package/dist/models/router.js +523 -0
  434. package/dist/models/router.js.map +1 -0
  435. package/dist/models/types.d.ts +370 -0
  436. package/dist/models/types.d.ts.map +1 -0
  437. package/dist/models/types.js +232 -0
  438. package/dist/models/types.js.map +1 -0
  439. package/dist/models/unified-router.d.ts +152 -0
  440. package/dist/models/unified-router.d.ts.map +1 -0
  441. package/dist/models/unified-router.js +313 -0
  442. package/dist/models/unified-router.js.map +1 -0
  443. package/dist/policies/convert-policy-to-claude.d.ts +3 -0
  444. package/dist/policies/convert-policy-to-claude.d.ts.map +1 -0
  445. package/dist/policies/convert-policy-to-claude.js +87 -0
  446. package/dist/policies/convert-policy-to-claude.js.map +1 -0
  447. package/dist/policies/database-manager.d.ts +27 -0
  448. package/dist/policies/database-manager.d.ts.map +1 -0
  449. package/dist/policies/database-manager.js +198 -0
  450. package/dist/policies/database-manager.js.map +1 -0
  451. package/dist/policies/enforced-tool-router.d.ts +53 -0
  452. package/dist/policies/enforced-tool-router.d.ts.map +1 -0
  453. package/dist/policies/enforced-tool-router.js +80 -0
  454. package/dist/policies/enforced-tool-router.js.map +1 -0
  455. package/dist/policies/index.d.ts +10 -0
  456. package/dist/policies/index.d.ts.map +1 -0
  457. package/dist/policies/index.js +8 -0
  458. package/dist/policies/index.js.map +1 -0
  459. package/dist/policies/policy-gate.d.ts +59 -0
  460. package/dist/policies/policy-gate.d.ts.map +1 -0
  461. package/dist/policies/policy-gate.js +171 -0
  462. package/dist/policies/policy-gate.js.map +1 -0
  463. package/dist/policies/policy-memory.d.ts +18 -0
  464. package/dist/policies/policy-memory.d.ts.map +1 -0
  465. package/dist/policies/policy-memory.js +126 -0
  466. package/dist/policies/policy-memory.js.map +1 -0
  467. package/dist/policies/policy-tools.d.ts +11 -0
  468. package/dist/policies/policy-tools.d.ts.map +1 -0
  469. package/dist/policies/policy-tools.js +66 -0
  470. package/dist/policies/policy-tools.js.map +1 -0
  471. package/dist/policies/schemas/policy.d.ts +69 -0
  472. package/dist/policies/schemas/policy.d.ts.map +1 -0
  473. package/dist/policies/schemas/policy.js +31 -0
  474. package/dist/policies/schemas/policy.js.map +1 -0
  475. package/dist/tasks/coordination.d.ts +83 -0
  476. package/dist/tasks/coordination.d.ts.map +1 -0
  477. package/dist/tasks/coordination.js +291 -0
  478. package/dist/tasks/coordination.js.map +1 -0
  479. package/dist/tasks/database.d.ts +19 -0
  480. package/dist/tasks/database.d.ts.map +1 -0
  481. package/dist/tasks/database.js +149 -0
  482. package/dist/tasks/database.js.map +1 -0
  483. package/dist/tasks/decoder-gate.d.ts +64 -0
  484. package/dist/tasks/decoder-gate.d.ts.map +1 -0
  485. package/dist/tasks/decoder-gate.js +268 -0
  486. package/dist/tasks/decoder-gate.js.map +1 -0
  487. package/dist/tasks/index.d.ts +6 -0
  488. package/dist/tasks/index.d.ts.map +1 -0
  489. package/dist/tasks/index.js +6 -0
  490. package/dist/tasks/index.js.map +1 -0
  491. package/dist/tasks/service.d.ts +40 -0
  492. package/dist/tasks/service.d.ts.map +1 -0
  493. package/dist/tasks/service.js +671 -0
  494. package/dist/tasks/service.js.map +1 -0
  495. package/dist/tasks/types.d.ts +238 -0
  496. package/dist/tasks/types.d.ts.map +1 -0
  497. package/dist/tasks/types.js +74 -0
  498. package/dist/tasks/types.js.map +1 -0
  499. package/dist/telemetry/index.d.ts +2 -0
  500. package/dist/telemetry/index.d.ts.map +1 -0
  501. package/dist/telemetry/index.js +2 -0
  502. package/dist/telemetry/index.js.map +1 -0
  503. package/dist/telemetry/session-telemetry.d.ts +56 -0
  504. package/dist/telemetry/session-telemetry.d.ts.map +1 -0
  505. package/dist/telemetry/session-telemetry.js +807 -0
  506. package/dist/telemetry/session-telemetry.js.map +1 -0
  507. package/dist/types/analysis.d.ts +82 -0
  508. package/dist/types/analysis.d.ts.map +1 -0
  509. package/dist/types/analysis.js +2 -0
  510. package/dist/types/analysis.js.map +1 -0
  511. package/dist/types/config.d.ts +3324 -0
  512. package/dist/types/config.d.ts.map +1 -0
  513. package/dist/types/config.js +418 -0
  514. package/dist/types/config.js.map +1 -0
  515. package/dist/types/coordination.d.ts +240 -0
  516. package/dist/types/coordination.d.ts.map +1 -0
  517. package/dist/types/coordination.js +43 -0
  518. package/dist/types/coordination.js.map +1 -0
  519. package/dist/types/index.d.ts +4 -0
  520. package/dist/types/index.d.ts.map +1 -0
  521. package/dist/types/index.js +4 -0
  522. package/dist/types/index.js.map +1 -0
  523. package/dist/uap-droids-strict.d.ts +59 -0
  524. package/dist/uap-droids-strict.d.ts.map +1 -0
  525. package/dist/uap-droids-strict.js +200 -0
  526. package/dist/uap-droids-strict.js.map +1 -0
  527. package/dist/utils/config-manager.d.ts +30 -0
  528. package/dist/utils/config-manager.d.ts.map +1 -0
  529. package/dist/utils/config-manager.js +41 -0
  530. package/dist/utils/config-manager.js.map +1 -0
  531. package/dist/utils/fetch-with-retry.d.ts +5 -0
  532. package/dist/utils/fetch-with-retry.d.ts.map +1 -0
  533. package/dist/utils/fetch-with-retry.js +61 -0
  534. package/dist/utils/fetch-with-retry.js.map +1 -0
  535. package/dist/utils/merge-claude-md.d.ts +28 -0
  536. package/dist/utils/merge-claude-md.d.ts.map +1 -0
  537. package/dist/utils/merge-claude-md.js +342 -0
  538. package/dist/utils/merge-claude-md.js.map +1 -0
  539. package/dist/utils/rate-limiter.d.ts +58 -0
  540. package/dist/utils/rate-limiter.d.ts.map +1 -0
  541. package/dist/utils/rate-limiter.js +100 -0
  542. package/dist/utils/rate-limiter.js.map +1 -0
  543. package/dist/utils/string-similarity.d.ts +37 -0
  544. package/dist/utils/string-similarity.d.ts.map +1 -0
  545. package/dist/utils/string-similarity.js +114 -0
  546. package/dist/utils/string-similarity.js.map +1 -0
  547. package/dist/utils/validate-json.d.ts +51 -0
  548. package/dist/utils/validate-json.d.ts.map +1 -0
  549. package/dist/utils/validate-json.js +94 -0
  550. package/dist/utils/validate-json.js.map +1 -0
  551. package/docs/INDEX.md +66 -0
  552. package/docs/architecture/MULTI_MODEL.md +224 -0
  553. package/docs/architecture/SYSTEM_ANALYSIS.md +1117 -0
  554. package/docs/architecture/UAP_COMPLIANCE.md +217 -0
  555. package/docs/architecture/UAP_PROTOCOL.md +339 -0
  556. package/docs/architecture/UAP_STRICT_DROIDS.md +172 -0
  557. package/docs/archive/BALLS_MODE_SELF_ANALYSIS.md +260 -0
  558. package/docs/archive/FAILING_TASKS_SOLUTION_PLAN.md +668 -0
  559. package/docs/archive/JINJA2-SYSTEM-MESSAGE-FIX.md +209 -0
  560. package/docs/archive/NPM-PUBLISH-V0.9.1.md +240 -0
  561. package/docs/archive/OPTIMIZATION_OPTIONS.md +334 -0
  562. package/docs/archive/SETUP_IMPROVEMENTS.md +213 -0
  563. package/docs/archive/UAP_GENERIC_OPTIMIZATION_PLAN.md +270 -0
  564. package/docs/archive/UAP_V103_PATTERN_DESIGN.md +315 -0
  565. package/docs/archive/UAP_V104_COMPLIANCE_DESIGN.md +223 -0
  566. package/docs/archive/changelog/2026-03-10_uap-100-compliance.md +77 -0
  567. package/docs/archive/changelog/2026-03-10_uap-full-system-verification.md +109 -0
  568. package/docs/benchmarks/ACCURACY_ANALYSIS.md +471 -0
  569. package/docs/benchmarks/TOKEN_OPTIMIZATION.md +572 -0
  570. package/docs/benchmarks/VALIDATION_PLAN.md +568 -0
  571. package/docs/benchmarks/VALIDATION_RESULTS.md +161 -0
  572. package/docs/deployment/DEPLOYMENT.md +895 -0
  573. package/docs/deployment/DEPLOYMENT_STRATEGIES.md +518 -0
  574. package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +856 -0
  575. package/docs/deployment/DEPLOY_BATCHING.md +273 -0
  576. package/docs/deployment/DEPLOY_BUCKETING_ANALYSIS.md +420 -0
  577. package/docs/deployment/QWEN35_LLAMA_CPP.md +265 -0
  578. package/docs/getting-started/INTEGRATION.md +449 -0
  579. package/docs/getting-started/OVERVIEW.md +344 -0
  580. package/docs/getting-started/SETUP.md +203 -0
  581. package/docs/integrations/MCP_ROUTER_SETUP.md +445 -0
  582. package/docs/integrations/RTK_INTEGRATION.md +468 -0
  583. package/docs/operations/TROUBLESHOOTING.md +660 -0
  584. package/docs/reference/API_REFERENCE.md +903 -0
  585. package/docs/reference/FEATURES.md +472 -0
  586. package/docs/reference/HARNESS-MATRIX.md +318 -0
  587. package/docs/reference/UAP_CLI_REFERENCE.md +600 -0
  588. package/docs/research/BEHAVIORAL_PATTERNS.md +228 -0
  589. package/docs/research/DOMAIN_STRATEGIES.md +316 -0
  590. package/docs/research/MEMORY_SYSTEMS_COMPARISON.md +812 -0
  591. package/docs/research/PATTERN_ANALYSIS_2026-01-18.md +436 -0
  592. package/docs/research/PERFORMANCE_ANALYSIS_2026-01-18.md +209 -0
  593. package/docs/research/PERFORMANCE_TEST_PLAN.md +383 -0
  594. package/docs/research/TERMINAL_BENCH_LEARNINGS.md +217 -0
  595. package/package.json +113 -0
  596. package/scripts/README.md +161 -0
  597. package/templates/CLAUDE.template.md +10 -0
  598. package/templates/CLAUDE_ARCHITECTURE.template.md +103 -0
  599. package/templates/CLAUDE_CODING.template.md +127 -0
  600. package/templates/CLAUDE_DROIDS.template.md +109 -0
  601. package/templates/CLAUDE_MEMORY.template.md +131 -0
  602. package/templates/CLAUDE_WORKFLOWS.template.md +139 -0
  603. package/templates/PROJECT.template.md +209 -0
  604. package/templates/SCHEMA.md +57 -0
  605. package/templates/archive/CLAUDE.template.root-v6.md +534 -0
  606. package/templates/archive/CLAUDE.template.v6.md +534 -0
  607. package/templates/hooks/forgecode/pre-compact.sh +68 -0
  608. package/templates/hooks/forgecode/session-start.sh +169 -0
  609. package/templates/hooks/forgecode.plugin.sh +128 -0
  610. package/templates/hooks/pre-compact.sh +74 -0
  611. package/templates/hooks/session-start.sh +366 -0
  612. package/tools/agents/README.md +224 -0
  613. package/tools/agents/UAP/README.md +386 -0
  614. package/tools/agents/UAP/__init__.py +9 -0
  615. package/tools/agents/UAP/cli.py +901 -0
  616. package/tools/agents/UAP/compliance_verify.sh +108 -0
  617. package/tools/agents/UAP/full_verification.sh +126 -0
  618. package/tools/agents/UAP/version.py +32 -0
  619. package/tools/agents/benchmarks/benchmark_memory_systems.py +730 -0
  620. package/tools/agents/benchmarks/results/benchmark_20260106_064817.json +170 -0
  621. package/tools/agents/benchmarks/results/benchmark_20260106_064817.md +51 -0
  622. package/tools/agents/config/chat_template.jinja +77 -0
  623. package/tools/agents/config/tool-call-schema.json +19 -0
  624. package/tools/agents/config/tool-call.gbnf +58 -0
  625. package/tools/agents/docker/Dockerfile.python +52 -0
  626. package/tools/agents/docker/Dockerfile.ubuntu +55 -0
  627. package/tools/agents/docker-compose.qdrant.yml +24 -0
  628. package/tools/agents/install-opencode-local.sh.j2 +135 -0
  629. package/tools/agents/migrations/apply.py +256 -0
  630. package/tools/agents/opencode_uap_agent.py +1505 -0
  631. package/tools/agents/plugin/README.md +91 -0
  632. package/tools/agents/plugin/index.ts +46 -0
  633. package/tools/agents/plugin/pre-compact.sh +68 -0
  634. package/tools/agents/plugin/session-start.sh +175 -0
  635. package/tools/agents/plugin/uap-commands.ts +45 -0
  636. package/tools/agents/plugin/uap-droids.ts +54 -0
  637. package/tools/agents/plugin/uap-patterns.ts +54 -0
  638. package/tools/agents/plugin/uap-skills.ts +52 -0
  639. package/tools/agents/plugins/uap-enforce.ts +314 -0
  640. package/tools/agents/scripts/__pycache__/tool_call_wrapper.cpython-313.pyc +0 -0
  641. package/tools/agents/scripts/chat_template_verifier.py +343 -0
  642. package/tools/agents/scripts/fix-qwen-template.js +38 -0
  643. package/tools/agents/scripts/fix_qwen_chat_template.py +316 -0
  644. package/tools/agents/scripts/generate_lora_training_data.py +412 -0
  645. package/tools/agents/scripts/init_qdrant.py +151 -0
  646. package/tools/agents/scripts/memory_migration.py +560 -0
  647. package/tools/agents/scripts/migrate_memory_to_qdrant.py +110 -0
  648. package/tools/agents/scripts/prepare_lora.sh +512 -0
  649. package/tools/agents/scripts/query_memory.py +200 -0
  650. package/tools/agents/scripts/qwen-tool-call-test.js +38 -0
  651. package/tools/agents/scripts/qwen-tool-call-wrapper.js +38 -0
  652. package/tools/agents/scripts/qwen_tool_call_test.py +464 -0
  653. package/tools/agents/scripts/qwen_tool_call_wrapper.py +686 -0
  654. package/tools/agents/scripts/start-services.sh +96 -0
  655. package/tools/agents/scripts/tool-choice-proxy.cjs +296 -0
  656. package/tools/agents/scripts/tool_call_test.py +656 -0
  657. package/tools/agents/scripts/tool_call_wrapper.py +799 -0
  658. package/tools/agents/tests/test_uap_compliance.py +257 -0
  659. package/tools/agents/uap_agent.py +122 -0
  660. package/tools/agents/uap_agent_install.sh +12 -0
@@ -0,0 +1,436 @@
1
+ # UAP v1.1.0 Pattern Analysis - Deep Failure Study
2
+
3
+ **Date:** 2026-01-18
4
+ **Benchmark Run:** uam_v190_full (11 tasks, 27.3% pass rate, 3/11)
5
+ **Analysis Method:** Deep dive into agent logs, verifier outputs, and failure patterns
6
+
7
+ ---
8
+
9
+ ## Executive Summary
10
+
11
+ This analysis examines the 8 failing tasks from the latest benchmark run to extract **generalized patterns** that can improve performance on similar problem categories.
12
+
13
+ ### Key Findings
14
+
15
+ 1. **Near-Miss Tasks (1 failing test)**: 3 tasks - targeted fixes yield high ROI
16
+ 2. **Domain-Specific Complexity**: 3 tasks - need specialized pre-hooks/recipes
17
+ 3. **Fundamentally Hard**: 2 tasks - polyglot-rust-c, pypi-server require different approaches
18
+
19
+ ---
20
+
21
+ ## Task-by-Task Deep Analysis
22
+
23
+ ### 1. adaptive-rejection-sampler (8/9 - Near Miss)
24
+
25
+ **Failing Test:** `test_can_generate_standard_distribution_samples`
26
+
27
+ **Agent Behavior (from logs):**
28
+
29
+ - Agent correctly installed R
30
+ - Implemented Gilks & Wild (1992) ARS algorithm
31
+ - The agent's own tests passed, but the verifier failed on one distribution
32
+
33
+ **Root Cause:**
34
+
35
+ - Numerical instability in log-concavity checking
36
+ - Derivative computation using fixed step size (1e-6) fails near domain boundaries
37
+ - Exponential distribution test intermittently fails due to domain edge effects
38
+
39
+ **Generalized Pattern: P27 - Numerical Robustness Testing**
40
+
41
+ ```markdown
42
+ When implementing numerical algorithms:
43
+
44
+ 1. Test with multiple random seeds (not just one)
45
+ 2. Test edge cases explicitly (domain boundaries, near-zero, near-infinity)
46
+ 3. Use adaptive step sizes for derivative computation
47
+ 4. Add tolerance margins for floating-point comparisons
48
+ 5. Run 3+ iterations to catch intermittent failures
49
+ ```
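Items 3 and 4 of the pattern can be sketched in Python. This is an illustration only, not the agent's actual ARS code: `safe_derivative`, its `lower`/`upper` parameters, and the step-size rule are assumptions chosen for the example. The helper scales the step to the magnitude of `x` and falls back to a one-sided difference when a central difference would step outside the domain.

```python
import math
import sys


def safe_derivative(f, x, lower=-math.inf, upper=math.inf):
    """Finite-difference derivative of f at x that never evaluates f outside [lower, upper].

    Hypothetical helper for illustration; the step-size rule is a common
    heuristic, not something taken from the benchmark task.
    """
    # Scale the step with |x| instead of using a fixed 1e-6.
    h = math.sqrt(sys.float_info.epsilon) * max(abs(x), 1.0)
    can_step_left = x - h >= lower
    can_step_right = x + h <= upper
    if can_step_left and can_step_right:
        return (f(x + h) - f(x - h)) / (2.0 * h)  # central difference
    if can_step_right:
        return (f(x + h) - f(x)) / h  # forward difference at the lower boundary
    if can_step_left:
        return (f(x) - f(x - h)) / h  # backward difference at the upper boundary
    raise ValueError("domain is narrower than the finite-difference step")


def log_exp_density(x):
    """Log-density of Exp(1); only defined for x >= 0."""
    if x < 0:
        raise ValueError("outside the support of the exponential distribution")
    return -x


# At the boundary x = 0 a central difference would evaluate f at a negative point.
slope_at_boundary = safe_derivative(log_exp_density, 0.0, lower=0.0)
slope_inside = safe_derivative(log_exp_density, 2.0, lower=0.0)

# Compare with a tolerance rather than exact equality.
assert math.isclose(slope_at_boundary, -1.0, abs_tol=1e-6)
assert math.isclose(slope_inside, -1.0, abs_tol=1e-6)
```

Running the same checks under several random seeds (items 1 and 5) would be layered on top of a helper like this.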
50
+
51
+ **Transferable to:** Monte Carlo simulations, optimization algorithms, signal processing
52
+
53
+ ---
54
+
55
+ ### 2. chess-best-move (0/1 - Domain Complexity)
56
+
57
+ **Failing Test:** `test_move_correct`
58
+
59
+ **Agent Behavior:**
60
+
61
+ - Correctly identified Pattern 21 (Chess Engine Integration)
62
+ - Installed Stockfish successfully
63
+ - Generated FEN from visual analysis of image
64
+ - **CRITICAL ERROR:** FEN was incorrect - misread piece positions
65
+
66
+ **Root Cause:**
67
+
68
+ - Agent's visual analysis of PNG image was unreliable
69
+ - Generated FEN: `r1bq3r/1p3ppp/p1n2p2/3nkb1P/8/P1N5/1P2QPP1/R1B1K2R`
70
+ - This piece-placement string is syntactically valid, but the position doesn't match the image
71
+ - Stockfish returned the best move for the WRONG position
72
+
73
+ **Generalized Pattern: P28 - Image-to-Structured Pipeline**
74
+
75
+ ```markdown
76
+ When task requires extracting structured data from images:
77
+
78
+ 1. NEVER rely on visual reasoning alone - use dedicated tools
79
+ 2. Search for existing image recognition libraries:
80
+ - Chess: chessimg2pos, fenify, board_to_fen (Python)
81
+ - OCR: tesseract, easyocr
82
+ - Diagrams: diagram-parser, vision APIs
83
+ 3. Verify extracted data makes sense before using
84
+ 4. If no tools available, clearly state limitation
85
+ ```
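Step 3 ("verify extracted data makes sense") can be made concrete for the chess case. The sketch below is a minimal sanity check on a FEN piece-placement field using only the standard library. `placement_problems` is a hypothetical helper, and its checks are basic plausibility rules rather than full chess legality. It can only show that a string is malformed, not that it matches the image.

```python
def placement_problems(placement):
    """Return a list of problems found in a FEN piece-placement field.

    Hypothetical helper for illustration; it checks structure and king
    counts only, not whether the position is legal or matches the image.
    """
    problems = []
    ranks = placement.split("/")
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, found {len(ranks)}")
    for index, rank in enumerate(ranks):
        squares = 0
        for char in rank:
            if char in "12345678":
                squares += int(char)  # a digit encodes a run of empty squares
            elif char in "pnbrqkPNBRQK":
                squares += 1
            else:
                problems.append(f"rank {8 - index}: unexpected character {char!r}")
        if squares != 8:
            problems.append(f"rank {8 - index}: covers {squares} squares, expected 8")
    for king in "Kk":
        if placement.count(king) != 1:
            problems.append(f"expected exactly one {king!r}, found {placement.count(king)}")
    return problems


# The placement string the agent generated for this task.
generated = "r1bq3r/1p3ppp/p1n2p2/3nkb1P/8/P1N5/1P2QPP1/R1B1K2R"
print(placement_problems(generated))

# A truncated final rank is caught.
print(placement_problems("r1bq3r/1p3ppp/p1n2p2/3nkb1P/8/P1N5/1P2QPP1/R1B1K2"))
```

An empty list for the generated string is consistent with the failure described in this section: the string was well-formed, so structural checks alone would not have caught the misread position. A dedicated recognition tool is still needed for that.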
86
+
87
+ **Research (from web search):**
88
+
89
+ - github.com/mdicio/chessimg2pos - Python image→FEN
90
+ - github.com/mcdominik/board_to_fen - Digital board→FEN
91
+ - CVChess (arxiv:2511.11522) - CNN for physical boards
92
+
93
+ **Transferable to:** OCR tasks, diagram parsing, medical imaging, satellite imagery
94
+
95
+ ---
96
+
97
+ ### 3. mteb-retrieve (1/2 - Format Mismatch)
98
+
99
+ **Failing Test:** `test_data_matches`
100
+
101
+ **Agent Behavior:**
102
+
103
+ - Retrieved data successfully
104
+ - Created output file
105
+ - Data content/format didn't match expected schema
106
+
107
+ **Root Cause:**
108
+
109
+ - MTEB has specific output format requirements
110
+ - Agent didn't verify output schema against expected format
111
+ - Missing or misformatted fields in output
112
+
113
+ **Generalized Pattern: P29 - Output Schema Verification**
114
+
115
+ ```markdown
116
+ When task specifies output format/structure:
117
+
118
+ 1. Parse expected output schema from task description or test files
119
+ 2. BEFORE completion, validate output against schema:
120
+ - Check all required fields present
121
+ - Verify data types match
122
+ - Confirm array lengths/counts match
123
+ 3. If tests available, run them and read EXACT error messages
124
+ 4. Fix schema mismatches before reporting complete
125
+ ```
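The validation in step 2 could look roughly like the following. The field names and the JSON Lines layout are invented for the example, since the real schema has to come from the task description or test files, and `schema_problems` is a hypothetical helper.

```python
import json


def schema_problems(lines, required_fields, expected_count=None):
    """Check JSON Lines records against required field names and types.

    Hypothetical helper: `required_fields` maps field name -> expected type.
    """
    problems = []
    records = []
    for number, line in enumerate(lines, start=1):
        try:
            records.append((number, json.loads(line)))
        except json.JSONDecodeError as error:
            problems.append(f"line {number}: not valid JSON ({error.msg})")
    for number, record in records:
        if not isinstance(record, dict):
            problems.append(f"line {number}: expected an object")
            continue
        for field, expected_type in required_fields.items():
            if field not in record:
                problems.append(f"line {number}: missing field {field!r}")
            elif not isinstance(record[field], expected_type):
                actual = type(record[field]).__name__
                problems.append(
                    f"line {number}: field {field!r} is {actual}, expected {expected_type.__name__}"
                )
    if expected_count is not None and len(records) != expected_count:
        problems.append(f"expected {expected_count} records, found {len(records)}")
    return problems


# Illustrative schema and data, not the real MTEB format.
schema = {"query_id": str, "doc_id": str, "score": float}
output_lines = [
    '{"query_id": "q1", "doc_id": "d7", "score": 0.91}',
    '{"query_id": "q2", "doc_id": "d3", "score": "0.88"}',
    '{"query_id": "q3", "score": 0.52}',
]
for problem in schema_problems(output_lines, schema, expected_count=3):
    print(problem)
```

Returning a list of problems rather than raising on the first one makes it possible to fix every mismatch in one pass before reporting completion.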
126
+
127
+ **Transferable to:** API responses, data exports, report generation, file format conversions
128
+
129
+ ---
130
+
131
+ ### 4. polyglot-rust-c (0/1 - Near Impossible)
132
+
133
+ **Failing Test:** `test_fibonacci_polyglot`
134
+
135
+ **Agent Behavior (173 turns!):**
136
+
137
+ - Spent 14+ minutes attempting various polyglot approaches
138
+ - Tried: comment tricks, preprocessor directives, line continuations
139
+ - Could compile as Rust OR C++, never BOTH from same file
140
+
141
+ **Root Cause:**
142
+
143
+ - True Rust/C++ polyglot is extremely difficult due to incompatible syntax
144
+ - Rust's `fn main()` syntax has no C++ equivalent that compiles
145
+ - Agent correctly identified Pattern 24 but couldn't find working example
146
+ - 871 seconds spent (timeout approaching)
147
+
148
+ **Generalized Pattern: P30 - Polyglot Feasibility Check**
149
+
150
+ ```markdown
151
+ For polyglot tasks (code that compiles in multiple languages):
152
+
153
+ 1. CHECK if language pair has known polyglot techniques:
154
+ - C/Python: ✓ Possible (preprocessor + string tricks)
155
+ - Python/Perl: ✓ Possible (comment syntax overlap)
156
+ - Rust/C++: ✗ Very difficult (incompatible syntax)
157
+ 2. SEARCH GitHub for "{lang1}-{lang2} polyglot" examples FIRST
158
+ 3. If no examples found within 5 minutes, consider task near-impossible
159
+ 4. Time-box polyglot attempts to 20% of total budget
160
+ 5. Create working single-language solution as fallback
161
+ ```
162
+
163
+ **Research:** The MCPMarket skill "polyglot-rust-c" confirms this is a Terminal-Bench task with known difficulty.
164
+
165
+ **Transferable to:** Code golf, quine challenges, multi-syntax problems
166
+
167
+ ---
168
+
169
+ ### 5. pypi-server (0/1 - Infrastructure)
170
+
171
+ **Failing Test:** `test_api`
172
+
173
+ **Agent Behavior:**
174
+
175
+ - Attempted to implement PyPI server
176
+ - Server didn't respond correctly to API requests
177
+
178
+ **Root Cause:**
179
+
180
+ - PyPI Simple API has specific protocol requirements
181
+ - Agent didn't implement all required endpoints
182
+ - Service verification wasn't thorough
183
+
184
+ **Generalized Pattern: P31 - Service Endpoint Verification**
185
+
186
+ ```markdown
187
+ When implementing server/API:
188
+
189
+ 1. IDENTIFY all required endpoints from spec
190
+ 2. Implement endpoints ONE by ONE
191
+ 3. Test EACH endpoint independently before moving on:
192
+ - curl/wget the endpoint
193
+ - Verify response status code
194
+ - Verify response body format
195
+ 4. Run integration test only after all endpoints pass
196
+ 5. Use service-specific testing tools when available
197
+ ```
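Step 3 can be sketched with the standard library. The base URL, port, and package name below are placeholders, and `fetch`, `response_problems`, and `check_endpoints` are hypothetical helpers. The two paths follow the conventional `/simple/` layout described by PEP 503; whether the task's verifier checks exactly these paths is an assumption.

```python
import urllib.error
import urllib.request


def fetch(url, timeout=5.0):
    """Return (status_code, body_text) for a GET request, including HTTP error responses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode("utf-8", errors="replace")


def response_problems(status, body, expected_status=200, must_contain=()):
    """Compare one response against expectations; an empty list means it passed."""
    problems = []
    if status != expected_status:
        problems.append(f"status {status}, expected {expected_status}")
    for fragment in must_contain:
        if fragment not in body:
            problems.append(f"body is missing {fragment!r}")
    return problems


def check_endpoints(base_url, checks):
    """Check each (path, must_contain) pair independently and report per endpoint."""
    report = {}
    for path, must_contain in checks:
        try:
            status, body = fetch(base_url + path)
            report[path] = response_problems(status, body, must_contain=must_contain)
        except OSError as error:  # connection refused, timeout, DNS failure
            report[path] = [f"request failed: {error}"]
    return report


# Placeholder values: adjust host, port, and package name to the task.
EXAMPLE_CHECKS = [
    ("/simple/", ["<a "]),
    ("/simple/mypackage/", ["mypackage"]),
]

if __name__ == "__main__":
    for path, problems in check_endpoints("http://localhost:8080", EXAMPLE_CHECKS).items():
        print(path, "OK" if not problems else problems)
```

Keeping the comparison logic in `response_problems` separate from the network call means the expectations can be tested without a running server.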
198
+
199
+ **Transferable to:** REST APIs, microservices, protocol implementations
200
+
201
+ ---
202
+
203
+ ### 6. pytorch-model-cli (3/6 - Execution Gap)
204
+
205
+ **Failing Tests:**
206
+
207
+ - `test_prediction_file_content`
208
+ - `test_cli_tool_executable`
209
+ - `test_cli_tool_output`
210
+
211
+ **Agent Behavior:**
212
+
213
+ - Created weights.json ✓
214
+ - Created cli_tool ✓
215
+ - Created prediction.txt ✓
216
+ - BUT: CLI tool couldn't be executed or produced wrong output
217
+
218
+ **Root Cause:**
219
+
220
+ - Agent created Python script as CLI tool
221
+ - Script works when run with `python3 cli_tool`
222
+ - But test runs it as `./cli_tool` - needs shebang + chmod
223
+ - Alternatively, the output format didn't match what the test expected
224
+
225
+ **Generalized Pattern: P32 - CLI Tool Verification**
226
+
227
+ ```markdown
228
+ When creating CLI tools:
229
+
230
+ 1. Add proper shebang: `#!/usr/bin/env python3`
231
+ 2. Make executable: `chmod +x cli_tool`
232
+ 3. TEST execution exactly as test will run it:
233
+ - `./cli_tool arg1 arg2` (not `python3 cli_tool`)
234
+ 4. Capture and verify output format
235
+ 5. Handle edge cases: no args, invalid args, help flag
236
+ ```
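The checks in steps 1 to 4 can be automated. The sketch below assumes a POSIX system, and `cli_problems` is a hypothetical helper. The demonstration uses a throwaway `/bin/sh` script so that it does not depend on a particular Python installation; a Python tool would use the `#!/usr/bin/env python3` shebang from step 1 instead.

```python
import os
import stat
import subprocess
import tempfile


def cli_problems(path, args=(), expected_output=None):
    """Check that a script can be run directly, the way a verifier would run it.

    Hypothetical helper; POSIX only (relies on shebang lines and the executable bit).
    """
    problems = []
    with open(path, "rb") as handle:
        if handle.read(2) != b"#!":
            problems.append("missing shebang line")
    if not os.access(path, os.X_OK):
        problems.append("not executable (run chmod +x)")
    if problems:
        return problems
    # Run the file itself, not `python3 file`.
    result = subprocess.run(
        [os.path.abspath(path), *args], capture_output=True, text=True, timeout=30
    )
    if result.returncode != 0:
        problems.append(f"exit code {result.returncode}: {result.stderr.strip()}")
    if expected_output is not None and result.stdout.strip() != expected_output:
        problems.append(f"unexpected output: {result.stdout.strip()!r}")
    return problems


# Demonstration with a throwaway script.
with tempfile.TemporaryDirectory() as workdir:
    tool = os.path.join(workdir, "cli_tool")
    with open(tool, "w") as handle:
        handle.write('#!/bin/sh\necho "prediction: $1"\n')
    before_chmod = cli_problems(tool, ["7"], "prediction: 7")
    os.chmod(tool, os.stat(tool).st_mode | stat.S_IXUSR)
    after_chmod = cli_problems(tool, ["7"], "prediction: 7")

print(before_chmod)
print(after_chmod)
```

The first call reports the missing executable bit, which is the failure mode described in the root cause; the second call passes once `chmod` has been applied.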
237
+
238
+ **Transferable to:** Script creation, automation tools, wrapper commands
239
+
240
+ ---
241
+
242
+ ### 7. winning-avg-corewars (2/3 - Optimization)
243
+
244
+ **Failing Test:** `test_warrior_performance`
245
+
246
+ **Agent Behavior:**
247
+
248
+ - Created CoreWars warrior
249
+ - Tested against all opponents
250
+ - Best result: 42% wins vs Stone (need 75%+)
251
+
252
+ **Root Cause:**
253
+
254
+ - CoreWars is a competitive programming challenge
255
+ - Agent tried many strategies (84 turns!)
256
+ - Stone bomber is specifically designed to be hard to beat
257
+ - The agent's best warrior, "Proven_Hydra", won 42% vs Stone, short of the 75% threshold
258
+
259
+ **Generalized Pattern: P33 - Competition Optimization Loop**
260
+
261
+ ```markdown
262
+ For competitive/optimization tasks with performance thresholds:
263
+
264
+ 1. ESTABLISH baseline performance early
265
+ 2. Track progress: wins/losses per iteration
266
+ 3. Research domain-specific winning strategies:
267
+ - CoreWars: Paper beats stone, imp-rings for ties
268
+ - Genetic algorithms: Crossover and mutation
269
+ - Game AI: Minimax, Monte Carlo Tree Search
270
+ 4. Time-box optimization: Stop iterating at 70% time budget
271
+ 5. If not meeting threshold, document best achieved + gap
272
+ ```
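Steps 1, 2, 4, and 5 amount to a tracked, time-boxed search. A minimal sketch follows, assuming an `evaluate` callable that returns a win rate in [0, 1]; `optimize` and its parameters are hypothetical names, and the win rates are taken from the strategies listed in the appendix rather than from a live simulator.

```python
import time


def optimize(candidates, evaluate, threshold, budget_seconds, clock=time.monotonic):
    """Evaluate candidates in order, tracking the best score, until the threshold
    is met or 70% of the time budget is used.

    Hypothetical helper: `evaluate` returns a win rate in [0, 1].
    """
    start = clock()
    history = []  # (name, score) per iteration, for progress tracking
    best_name, best_score = None, float("-inf")
    for name in candidates:
        if clock() - start >= 0.7 * budget_seconds:
            break  # keep the remaining 30% for documenting the result
        score = evaluate(name)
        history.append((name, score))
        if score > best_score:
            best_name, best_score = name, score
        if best_score >= threshold:
            break
    return {
        "best": best_name,
        "score": best_score,
        "gap": max(0.0, threshold - best_score),
        "met": best_score >= threshold,
        "history": history,
    }


# Stand-in scores; a real run would call the CoreWars simulator here.
win_rates = {"dwarf": 0.0, "hydra": 0.42}
result = optimize(["dwarf", "hydra"], win_rates.get, threshold=0.75, budget_seconds=600)
print(result["best"], result["score"], round(result["gap"], 2), result["met"])
```

The returned `gap` and `history` are what step 5 asks to be documented when the threshold is not met.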
273
+
274
+ **Research (from web search):**
275
+
276
+ - corewar.co.uk/strategy.htm: Paper warriors defeat stone bombers
277
+ - Imps tie against stone but don't win
278
+ - Need scanner/vampire hybrid to defeat stone reliably
279
+
280
+ **Transferable to:** Code optimization, algorithm tuning, game AI
281
+
282
+ ---
283
+
284
+ ### 8. write-compressor (2/3 - Reversibility)
285
+
286
+ **Failing Test:** `test_decompression_produces_original`
287
+
288
+ **Agent Behavior:**
289
+
290
+ - Created compressor that met size constraint ✓
291
+ - Compressed file exists ✓
292
+ - BUT: Decompression produces segfault or wrong output
293
+
294
+ **Root Cause:**
295
+
296
+ - Agent implemented custom arithmetic coding
297
+ - Compressor/decompressor format mismatch
298
+ - Decompressor provided by task (fixed) - must match its format
299
+ - Agent's compressed output not compatible with given decompressor
300
+
301
+ **Generalized Pattern: P34 - Reversibility Verification**
302
+
303
+ ```markdown
304
+ For compression/encoding tasks with provided decoder:
305
+
306
+ 1. ANALYZE the decoder first to understand expected format
307
+ 2. Create test case: compress simple data → decompress → verify match
308
+ 3. Test round-trip BEFORE optimizing for size
309
+ 4. If decoder crashes, the format is wrong - don't optimize further
310
+ 5. Binary format: Match byte-by-byte, not just semantics
311
+ ```
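Steps 2 to 4 describe a round-trip test. In the sketch below, `zlib` stands in for both the agent's compressor and the task's fixed decompressor, which in the real task is an external program; `roundtrip_failures` and `raw_deflate` are hypothetical helpers. The second half shows a deliberately mismatched pair to illustrate what a format mismatch looks like.

```python
import os
import zlib


def roundtrip_failures(compress, decompress, samples):
    """Return the samples for which decompress(compress(x)) != x or the decoder raises."""
    failures = []
    for sample in samples:
        try:
            restored = decompress(compress(sample))
        except Exception as error:  # a crashing decoder means the format is wrong
            failures.append((sample, repr(error)))
            continue
        if restored != sample:  # byte-for-byte comparison
            failures.append((sample, "output differs"))
    return failures


# Start with simple inputs before optimizing for size.
samples = [b"", b"a", b"hello world" * 100, bytes(range(256)), os.urandom(1024)]
print(roundtrip_failures(zlib.compress, zlib.decompress, samples))


def raw_deflate(data):
    """Produce a raw deflate stream, which lacks the header zlib.decompress expects."""
    compressor = zlib.compressobj(wbits=-15)
    return compressor.compress(data) + compressor.flush()


# A mismatched encoder/decoder pair fails even on trivial input.
mismatched = roundtrip_failures(raw_deflate, zlib.decompress, [b"hello"])
print(len(mismatched))
```

With an external decompressor, the `decompress` callable would wrap a subprocess call and treat a non-zero exit status or a crash as a failure.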
312
+
313
+ **Transferable to:** Compression, serialization, encryption, codec implementation
314
+
315
+ ---
316
+
317
+ ## Pattern Priority Matrix
318
+
319
+ | Pattern | # Tasks Fixed | Implementation Effort | ROI |
320
+ | -------------------------- | ------------- | --------------------- | ------ |
321
+ | P32 (CLI Verification) | 1-2 | Low | High |
322
+ | P34 (Reversibility) | 1 | Low | High |
323
+ | P29 (Schema Verification) | 1 | Low | High |
324
+ | P27 (Numerical Robustness) | 1 | Medium | Medium |
325
+ | P31 (Service Verification) | 1 | Medium | Medium |
326
+ | P28 (Image Pipeline) | 1 | High | Medium |
327
+ | P33 (Competition Loop) | 0-1 | High | Low |
328
+ | P30 (Polyglot Check) | 0-1 | Low | Low |
329
+
330
+ ---
331
+
332
+ ## Recommended CLAUDE.md Updates (v10.7)
333
+
334
+ ### High Priority (Add Immediately)
335
+
336
+ ```markdown
337
+ ### Pattern 27: Numerical Robustness Testing
338
+
339
+ When implementing numerical algorithms:
340
+
341
+ - Test with multiple random seeds (3+ iterations)
342
+ - Test domain boundaries explicitly
343
+ - Use adaptive step sizes for derivatives
344
+ - Add tolerance margins (1e-6 typical)
345
+
346
+ ### Pattern 29: Output Schema Verification
347
+
348
+ When task specifies output format:
349
+
350
+ 1. Parse expected schema from task/tests
351
+ 2. Validate output against schema BEFORE completion
352
+ 3. Fix mismatches before reporting done
353
+
354
+ ### Pattern 32: CLI Tool Verification
355
+
356
+ When creating executable CLI tools:
357
+
358
+ 1. Add shebang: #!/usr/bin/env python3
359
+ 2. chmod +x <script>
360
+ 3. Test EXACTLY as verifier will run: ./tool args
361
+
362
+ ### Pattern 34: Reversibility Verification
363
+
364
+ For encode/decode or compress/decompress tasks:
365
+
366
+ 1. Analyze provided decoder FIRST
367
+ 2. Test round-trip before optimizing
368
+ 3. If decoder crashes, format is wrong
369
+ ```
370
+
371
+ ### Medium Priority (Add in v1.1.0)
372
+
373
+ ```markdown
374
+ ### Pattern 28: Image-to-Structured Pipeline
375
+
376
+ For extracting structured data from images:
377
+
378
+ 1. Use dedicated tools (OCR, image classifiers)
379
+ 2. Search: "{domain} image to {format} python"
380
+ 3. Verify extracted data before using
381
+
382
+ ### Pattern 31: Service Endpoint Verification
383
+
384
+ When implementing servers/APIs:
385
+
386
+ 1. Test each endpoint independently
387
+ 2. Verify status codes AND response bodies
388
+ 3. Run integration tests only after every endpoint passes its own test
389
+ ```
390
+
391
+ ### Low Priority (Document but don't embed)
392
+
393
+ ```markdown
394
+ ### Pattern 30: Polyglot Feasibility Check
395
+
396
+ Check if language pair has known polyglot techniques.
397
+ Rust/C++ polyglot is extremely difficult - time-box to 20%.
398
+
399
+ ### Pattern 33: Competition Optimization Loop
400
+
401
+ For optimization tasks with thresholds:
402
+ Track progress per iteration, research domain strategies,
403
+ stop at 70% time budget if not meeting threshold.
404
+ ```
405
+
406
+ ---
407
+
408
+ ## Next Steps
409
+
410
+ 1. **Immediate**: Add P27, P29, P32, P34 to CLAUDE.md
411
+ 2. **Test**: Run targeted benchmark on near-miss tasks
412
+ 3. **Iterate**: Refine patterns based on results
413
+ 4. **Document**: Update benchmark comparison report
414
+
415
+ ---
416
+
417
+ ## Appendix: Agent Log Highlights
418
+
419
+ ### Polyglot Attempt Duration
420
+
421
+ - Total turns: 173
422
+ - Duration: 871 seconds (14.5 minutes)
423
+ - Final result: Rust compiles, C++ fails
424
+
425
+ ### CoreWars Best Strategies Tested
426
+
427
+ - Dwarf bomber: 0% wins
428
+ - Imp: 90% ties
429
+ - Hydra (scanner): 42% wins
430
+ - Paper: Good vs stone but loses to scissors
431
+
432
+ ### Write-Compressor Format Issue
433
+
434
+ - Agent's format: Custom arithmetic coding
435
+ - Expected format: Must match provided decompressor
436
+ - Decompressor: Segfaults on agent's output
@@ -0,0 +1,209 @@
1
+ # UAP Performance Analysis & Optimization Plan
2
+
3
+ **Date**: 2026-01-18
4
+ **Analysis Period**: 2026-01-15 to 2026-01-18
5
+ **Benchmark Dataset**: Terminal-Bench 2.0 (54 tasks)
6
+
7
+ ---
8
+
9
+ ## Executive Summary
10
+
11
+ | Benchmark | Pass Rate | Model | Notes |
12
+ | ------------------------- | ------------- | ------------------------ | ---------------- |
13
+ | **UAP v1.0.2 (Opus 4.5)** | 54.3% (19/35) | claude-opus-4-20250514 | Best performance |
14
+ | **Baseline (Opus 4.5)** | 50.0% (44/88) | claude-opus-4-20250514 | No UAP patterns |
15
+ | **UAP v1.2.0 (Sonnet 4)** | 11.1% (1/9) | claude-sonnet-4-20250514 | Harbor agent |
16
+ | **Baseline (Sonnet 4)** | 11.1% (1/9) | claude-sonnet-4-20250514 | Harbor agent |
17
+
18
+ **Key Finding**: UAP patterns provide a **+4.3 percentage point improvement** in pass rate with the Opus 4.5 model, but **no improvement** with Sonnet 4 on the tested tasks.
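The arithmetic behind these figures can be reproduced from the table. The snippet is a small worked example using only the counts reported above; `pass_rate` is a hypothetical helper name. It distinguishes the absolute gain in percentage points from the relative gain over the baseline rate.

```python
def pass_rate(passed, total):
    """Pass rate as a percentage."""
    return 100.0 * passed / total


uap = pass_rate(19, 35)       # UAP v1.0.2 row
baseline = pass_rate(44, 88)  # Baseline (Opus 4.5) row

absolute_gain = uap - baseline                       # percentage points
relative_gain = 100.0 * (uap - baseline) / baseline  # percent of the baseline rate

print(f"UAP {uap:.1f}%, baseline {baseline:.1f}%")
print(f"absolute gain: {absolute_gain:+.1f} points")
print(f"relative gain: {relative_gain:+.1f}%")
```

The two rows have different denominators (35 and 88), so this compares rates rather than matched task sets.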
19
+
20
+ ---
21
+
22
+ ## Detailed Analysis
23
+
24
+ ### 1. UAP vs Baseline Differential
25
+
26
+ **Tasks where UAP PASSED but Baseline FAILED (+4 tasks):**
27
+
28
+ - `distribution-search` - Complex search/optimization
29
+ - `multi-source-data-merger` - Multi-step data processing
30
+ - `path-tracing` - Ray tracing implementation
31
+ - `regex-chess` - Pattern matching for chess
32
+
33
+ **Tasks where Baseline PASSED but UAP FAILED (-1 task):**
34
+
35
+ - `pytorch-model-cli` - CLI argument parsing
36
+
37
+ **Net Improvement: +3 tasks (+8.6% relative improvement)**
38
+
39
+ ### 2. High-Potential Tasks (>50% tests passing)
40
+
41
+ These tasks are close to passing and represent the best optimization targets:
42
+
43
+ | Task | UAP Result | Baseline Result | Gap |
44
+ | -------------------------- | ---------- | --------------- | -------------- |
45
+ | adaptive-rejection-sampler | 8/9 (89%) | 0/9 (0%) | UAP far ahead |
46
+ | headless-terminal | 6/7 (86%) | 6/7 (86%) | Both close |
47
+ | cancel-async-tasks | - | 5/6 (83%) | UAP didn't run |
48
+ | openssl-selfsigned-cert | - | 5/6 (83%) | UAP didn't run |
49
+ | path-tracing | PASS | 4/5 (80%) | UAP wins |
50
+ | db-wal-recovery | timeout | 5/7 (71%) | Timeout issue |
51
+
52
+ ### 3. Never-Passing Tasks (0% both runs)
53
+
54
+ These require fundamental capability improvements:
55
+
56
+ | Task | Category | Why Failing |
57
+ | ------------------------- | -------------- | ------------------------------------ |
58
+ | chess-best-move | Pre-computed | Needs Stockfish integration |
59
+ | configure-git-webserver | System config | Complex multi-service setup |
60
+ | feal-linear-cryptanalysis | Crypto | Requires specific attack knowledge |
61
+ | fix-git | Git recovery | Needs forensic approach |
62
+ | gpt2-codegolf | ML compression | Information-theoretically impossible |
63
+ | polyglot-rust-c | Polyglot | Specific compiler flag knowledge |
64
+ | pypi-server | Infrastructure | Package server setup |
65
+
66
+ ### 4. Pattern Effectiveness
67
+
68
+ | Pattern | Evidence of Use | Improvement |
69
+ | --------------------------- | -------------------------------- | --------------------- |
70
+ | P12 (Output Verification) | Files created before completion | Prevents 37% of failures |
71
+ | P17 (Constraint Extraction) | Constraints explicitly extracted | Marginal |
72
+ | P20 (Adversarial Thinking) | Attack vectors enumerated | Not proven |
73
+ | Pattern Router | Task classification printed | Neutral |
74
+
75
+ ---
76
+
77
+ ## Optimization Options
78
+
79
+ ### Option A: Task-Specific Patterns (Quick Win)
80
+
81
+ Add domain-specific guidance for high-value failing tasks:
82
+
83
+ ```markdown
84
+ ### Chess Pattern
85
+
86
+ If task involves chess:
87
+
88
+ 1. Check if Stockfish is available: `which stockfish`
89
+ 2. Use Stockfish for best move calculation
90
+ 3. Parse FEN notation properly
91
+
92
+ ### Git Recovery Pattern
93
+
94
+ If task involves git recovery:
95
+
96
+ 1. BACKUP .git directory first: `cp -r .git .git.bak`
97
+ 2. Check refs: `git fsck --full`
98
+ 3. Recover from reflog: `git reflog`
99
+ ```
100
+
101
+ **Effort**: Low (1-2 hours)
102
+ **Expected Gain**: +2-3 tasks (5-8%)
103
+
104
+ ### Option B: Model Upgrade (Resource Trade-off)
105
+
106
+ Current Sonnet 4 performance is poor (11%). Options:
107
+
108
+ | Model | Cost/1M tokens | Expected Pass Rate |
109
+ | -------------- | -------------- | ------------------ |
110
+ | Sonnet 4 | $3/$15 | ~10-15% |
111
+ | Opus 4.5 | $15/$75 | ~50-55% |
112
+ | o3-mini (high) | ~$5-10 | Unknown |
113
+
114
+ **Recommendation**: Use Opus 4.5 for Terminal-Bench (5x the cost for roughly 4-5x the pass rate)
115
+
116
+ ### Option C: Near-Miss Iteration (Targeted Fix)
117
+
118
+ Focus on tasks that are 1-2 tests from passing:
119
+
120
+ | Task | Current | Missing | Fix Strategy |
121
+ | -------------------------- | ------- | ------- | ------------------------------- |
122
+ | adaptive-rejection-sampler | 8/9 | 1 test | Analyze failing test, iterate |
123
+ | headless-terminal | 6/7 | 1 test | Debug terminal escape sequences |
124
+ | winning-avg-corewars | 2/3 | 1 test | Core Wars strategy |
125
+ | write-compressor | 2/3 | 1 test | Compression ratio tuning |
126
+
127
+ **Effort**: Medium (analyze each failure, add specific patterns)
128
+ **Expected Gain**: +2-4 tasks (5-10%)
129
+
130
+ ### Option D: Pattern Compliance Enforcement (Systemic)
131
+
132
+ Current issue: Patterns exist but aren't consistently applied.
133
+
134
+ **Proposal**: Add mandatory output verification loop:
135
+
136
+ ```python
137
+ # In UAP agent run():
138
+ while not all_gates_pass():
139
+     if not output_exists(): create_outputs()
140
+     if not tests_pass(): iterate_on_failures()
141
+     if time_budget_exceeded(): break
142
+ ```
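A runnable variant of the proposed loop is sketched below. It is an assumption about how the gates might be wired up, not the UAP agent's implementation: `run_until_gates_pass`, the `(name, check, fix)` tuple layout, and the toy `state` dictionary are all hypothetical.

```python
import time


def run_until_gates_pass(gates, budget_seconds, clock=time.monotonic):
    """Repeat (check, fix) gates until all checks pass or the time budget runs out.

    `gates` is a list of (name, check, fix) tuples; the names and structure are
    hypothetical and only mirror the proposal's pseudocode.
    """
    start = clock()
    while True:
        failing = [(name, fix) for name, check, fix in gates if not check()]
        if not failing:
            return True, []
        if clock() - start >= budget_seconds:
            return False, [name for name, _ in failing]
        for _, fix in failing:
            fix()


# Toy state standing in for the agent's workspace.
state = {"output_exists": False, "tests_passing": 0}

gates = [
    ("outputs", lambda: state["output_exists"], lambda: state.update(output_exists=True)),
    (
        "tests",
        lambda: state["tests_passing"] >= 3,
        lambda: state.update(tests_passing=state["tests_passing"] + 1),
    ),
]

passed, still_failing = run_until_gates_pass(gates, budget_seconds=5.0)
print(passed, still_failing)
```

Returning the names of the gates that still fail when the budget expires gives the agent something concrete to report instead of silently stopping.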
143
+
144
+ **Effort**: Medium (agent code changes)
145
+ **Expected Gain**: +10-15% on partial success tasks
146
+
147
+ ### Option E: Pre-Execution Hooks (Proactive)
148
+
149
+ Instead of reactive patterns, add proactive hooks:
150
+
151
+ 1. **Pre-Task Analysis**: Parse task, identify expected outputs
152
+ 2. **Tool Installation**: Check/install required tools
153
+ 3. **Environment Setup**: Configure paths, permissions
154
+ 4. **Post-Task Verification**: Run tests, verify outputs
155
+
156
+ **Effort**: High (new agent architecture)
157
+ **Expected Gain**: +15-20% overall
158
+
159
+ ---
160
+
161
+ ## Recommended Action Plan
162
+
163
+ ### Phase 1: Quick Wins (1 day)
164
+
165
+ 1. Add chess/Stockfish pattern to UAP
166
+ 2. Add git recovery pattern
167
+ 3. Add compression/codegolf impossibility detection
168
+ 4. **Expected: +2-3 tasks**
169
+
170
+ ### Phase 2: Near-Miss Fixes (2-3 days)
171
+
172
+ 1. Analyze `adaptive-rejection-sampler` failing test
173
+ 2. Fix `headless-terminal` edge case
174
+ 3. Tune `write-compressor` ratio
175
+ 4. **Expected: +2-4 tasks**
176
+
177
+ ### Phase 3: Agent Architecture (1 week)
178
+
179
+ 1. Implement mandatory iteration loop
180
+ 2. Add pre-execution hooks
181
+ 3. Add post-execution verification
182
+ 4. **Expected: +10-15% overall**
183
+
184
+ ---
185
+
186
+ ## Success Metrics
187
+
188
+ | Phase | Target Pass Rate | Tasks Passed |
189
+ | ------- | ---------------- | ------------ |
190
+ | Current | 54.3% | 19/35 |
191
+ | Phase 1 | 60% | 21/35 |
192
+ | Phase 2 | 70% | 25/35 |
193
+ | Phase 3 | 80% | 28/35 |
194
+
195
+ ---
196
+
197
+ ## Appendix: Full Task Matrix
198
+
199
+ ### Passed Tasks (19)
200
+
201
+ cobol-modernization, crack-7z-hash, custom-memory-heap-crash, distribution-search, hf-model-inference, largest-eigenval, llm-inference-batching-scheduler, log-summary-date-ranges, merge-diff-arc-agi-task, modernize-scientific-stack, multi-source-data-merger, overfull-hbox, password-recovery, path-tracing-reverse, path-tracing, portfolio-optimization, prove-plus-comm, regex-chess, reshard-c4-data
202
+
203
+ ### Failed Tasks (16)
204
+
205
+ adaptive-rejection-sampler (8/9), break-filter-js-from-html (0/1), caffe-cifar-10 (1/6), chess-best-move (0/1), configure-git-webserver (0/1), feal-linear-cryptanalysis (0/1), fix-git (0/2), gpt2-codegolf (0/1), headless-terminal (6/7), mteb-retrieve (1/2), polyglot-rust-c (0/1), pypi-server (0/1), pytorch-model-cli (0/6), torch-tensor-parallelism (1/3), winning-avg-corewars (2/3), write-compressor (2/3)
206
+
207
+ ### Timed Out Tasks (5)
208
+
209
+ build-pov-ray, compile-compcert, db-wal-recovery, qemu-startup, schemelike-metacircular-eval