groundswell 0.0.2 → 0.0.3

Files changed (633)
  1. package/dist/__tests__/adversarial/attachChild-performance.test.d.ts +16 -0
  2. package/dist/__tests__/adversarial/attachChild-performance.test.d.ts.map +1 -0
  3. package/dist/__tests__/adversarial/attachChild-performance.test.js +187 -0
  4. package/dist/__tests__/adversarial/attachChild-performance.test.js.map +1 -0
  5. package/dist/__tests__/adversarial/circular-reference.test.d.ts +13 -0
  6. package/dist/__tests__/adversarial/circular-reference.test.d.ts.map +1 -0
  7. package/dist/__tests__/adversarial/circular-reference.test.js +92 -0
  8. package/dist/__tests__/adversarial/circular-reference.test.js.map +1 -0
  9. package/dist/__tests__/adversarial/complex-circular-reference.test.d.ts +16 -0
  10. package/dist/__tests__/adversarial/complex-circular-reference.test.d.ts.map +1 -0
  11. package/dist/__tests__/adversarial/complex-circular-reference.test.js +127 -0
  12. package/dist/__tests__/adversarial/complex-circular-reference.test.js.map +1 -0
  13. package/dist/__tests__/adversarial/concurrent-task-failures.test.d.ts +21 -0
  14. package/dist/__tests__/adversarial/concurrent-task-failures.test.d.ts.map +1 -0
  15. package/dist/__tests__/adversarial/concurrent-task-failures.test.js +667 -0
  16. package/dist/__tests__/adversarial/concurrent-task-failures.test.js.map +1 -0
  17. package/dist/__tests__/adversarial/deep-analysis.test.d.ts +6 -0
  18. package/dist/__tests__/adversarial/deep-analysis.test.d.ts.map +1 -0
  19. package/dist/__tests__/adversarial/deep-analysis.test.js +877 -0
  20. package/dist/__tests__/adversarial/deep-analysis.test.js.map +1 -0
  21. package/dist/__tests__/adversarial/deep-hierarchy-stress.test.d.ts +13 -0
  22. package/dist/__tests__/adversarial/deep-hierarchy-stress.test.d.ts.map +1 -0
  23. package/dist/__tests__/adversarial/deep-hierarchy-stress.test.js +186 -0
  24. package/dist/__tests__/adversarial/deep-hierarchy-stress.test.js.map +1 -0
  25. package/dist/__tests__/adversarial/e2e-prd-validation.test.d.ts +6 -0
  26. package/dist/__tests__/adversarial/e2e-prd-validation.test.d.ts.map +1 -0
  27. package/dist/__tests__/adversarial/e2e-prd-validation.test.js +626 -0
  28. package/dist/__tests__/adversarial/e2e-prd-validation.test.js.map +1 -0
  29. package/dist/__tests__/adversarial/edge-case.test.d.ts +6 -0
  30. package/dist/__tests__/adversarial/edge-case.test.d.ts.map +1 -0
  31. package/dist/__tests__/adversarial/edge-case.test.js +857 -0
  32. package/dist/__tests__/adversarial/edge-case.test.js.map +1 -0
  33. package/dist/__tests__/adversarial/error-merge-strategy.test.d.ts +20 -0
  34. package/dist/__tests__/adversarial/error-merge-strategy.test.d.ts.map +1 -0
  35. package/dist/__tests__/adversarial/error-merge-strategy.test.js +907 -0
  36. package/dist/__tests__/adversarial/error-merge-strategy.test.js.map +1 -0
  37. package/dist/__tests__/adversarial/incremental-performance.test.d.ts +2 -0
  38. package/dist/__tests__/adversarial/incremental-performance.test.d.ts.map +1 -0
  39. package/dist/__tests__/adversarial/incremental-performance.test.js +113 -0
  40. package/dist/__tests__/adversarial/incremental-performance.test.js.map +1 -0
  41. package/dist/__tests__/adversarial/node-map-update-benchmarks.test.d.ts +22 -0
  42. package/dist/__tests__/adversarial/node-map-update-benchmarks.test.d.ts.map +1 -0
  43. package/dist/__tests__/adversarial/node-map-update-benchmarks.test.js +383 -0
  44. package/dist/__tests__/adversarial/node-map-update-benchmarks.test.js.map +1 -0
  45. package/dist/__tests__/adversarial/observer-propagation.test.d.ts +21 -0
  46. package/dist/__tests__/adversarial/observer-propagation.test.d.ts.map +1 -0
  47. package/dist/__tests__/adversarial/observer-propagation.test.js +404 -0
  48. package/dist/__tests__/adversarial/observer-propagation.test.js.map +1 -0
  49. package/dist/__tests__/adversarial/parent-validation.test.d.ts +13 -0
  50. package/dist/__tests__/adversarial/parent-validation.test.d.ts.map +1 -0
  51. package/dist/__tests__/adversarial/parent-validation.test.js +128 -0
  52. package/dist/__tests__/adversarial/parent-validation.test.js.map +1 -0
  53. package/dist/__tests__/adversarial/prd-12-2-compliance.test.d.ts +20 -0
  54. package/dist/__tests__/adversarial/prd-12-2-compliance.test.d.ts.map +1 -0
  55. package/dist/__tests__/adversarial/prd-12-2-compliance.test.js +482 -0
  56. package/dist/__tests__/adversarial/prd-12-2-compliance.test.js.map +1 -0
  57. package/dist/__tests__/adversarial/prd-compliance.test.d.ts +6 -0
  58. package/dist/__tests__/adversarial/prd-compliance.test.d.ts.map +1 -0
  59. package/dist/__tests__/adversarial/prd-compliance.test.js +886 -0
  60. package/dist/__tests__/adversarial/prd-compliance.test.js.map +1 -0
  61. package/dist/__tests__/compatibility/backward-compatibility.test.d.ts +22 -0
  62. package/dist/__tests__/compatibility/backward-compatibility.test.d.ts.map +1 -0
  63. package/dist/__tests__/compatibility/backward-compatibility.test.js +1843 -0
  64. package/dist/__tests__/compatibility/backward-compatibility.test.js.map +1 -0
  65. package/dist/__tests__/helpers/index.d.ts +10 -0
  66. package/dist/__tests__/helpers/index.d.ts.map +1 -0
  67. package/{src/__tests__/helpers/index.ts → dist/__tests__/helpers/index.js} +2 -10
  68. package/dist/__tests__/helpers/index.js.map +1 -0
  69. package/dist/__tests__/helpers/tree-verification.d.ts +90 -0
  70. package/dist/__tests__/helpers/tree-verification.d.ts.map +1 -0
  71. package/dist/__tests__/helpers/tree-verification.js +202 -0
  72. package/dist/__tests__/helpers/tree-verification.js.map +1 -0
  73. package/dist/__tests__/integration/agent-workflow.test.d.ts +2 -0
  74. package/dist/__tests__/integration/agent-workflow.test.d.ts.map +1 -0
  75. package/dist/__tests__/integration/agent-workflow.test.js +256 -0
  76. package/dist/__tests__/integration/agent-workflow.test.js.map +1 -0
  77. package/dist/__tests__/integration/bidirectional-consistency.test.d.ts +14 -0
  78. package/dist/__tests__/integration/bidirectional-consistency.test.d.ts.map +1 -0
  79. package/dist/__tests__/integration/bidirectional-consistency.test.js +668 -0
  80. package/dist/__tests__/integration/bidirectional-consistency.test.js.map +1 -0
  81. package/dist/__tests__/integration/observer-logging.test.d.ts +2 -0
  82. package/dist/__tests__/integration/observer-logging.test.d.ts.map +1 -0
  83. package/dist/__tests__/integration/observer-logging.test.js +517 -0
  84. package/dist/__tests__/integration/observer-logging.test.js.map +1 -0
  85. package/dist/__tests__/integration/tree-mirroring.test.d.ts +2 -0
  86. package/dist/__tests__/integration/tree-mirroring.test.d.ts.map +1 -0
  87. package/dist/__tests__/integration/tree-mirroring.test.js +117 -0
  88. package/dist/__tests__/integration/tree-mirroring.test.js.map +1 -0
  89. package/dist/__tests__/integration/workflow-reparenting.test.d.ts +12 -0
  90. package/dist/__tests__/integration/workflow-reparenting.test.d.ts.map +1 -0
  91. package/dist/__tests__/integration/workflow-reparenting.test.js +239 -0
  92. package/dist/__tests__/integration/workflow-reparenting.test.js.map +1 -0
  93. package/dist/__tests__/unit/agent.test.d.ts +2 -0
  94. package/dist/__tests__/unit/agent.test.d.ts.map +1 -0
  95. package/dist/__tests__/unit/agent.test.js +143 -0
  96. package/dist/__tests__/unit/agent.test.js.map +1 -0
  97. package/dist/__tests__/unit/cache-key.test.d.ts +5 -0
  98. package/dist/__tests__/unit/cache-key.test.d.ts.map +1 -0
  99. package/dist/__tests__/unit/cache-key.test.js +145 -0
  100. package/dist/__tests__/unit/cache-key.test.js.map +1 -0
  101. package/dist/__tests__/unit/cache.test.d.ts +5 -0
  102. package/dist/__tests__/unit/cache.test.d.ts.map +1 -0
  103. package/dist/__tests__/unit/cache.test.js +132 -0
  104. package/dist/__tests__/unit/cache.test.js.map +1 -0
  105. package/dist/__tests__/unit/context.test.d.ts +2 -0
  106. package/dist/__tests__/unit/context.test.d.ts.map +1 -0
  107. package/dist/__tests__/unit/context.test.js +220 -0
  108. package/dist/__tests__/unit/context.test.js.map +1 -0
  109. package/dist/__tests__/unit/decorators.test.d.ts +2 -0
  110. package/dist/__tests__/unit/decorators.test.d.ts.map +1 -0
  111. package/dist/__tests__/unit/decorators.test.js +162 -0
  112. package/dist/__tests__/unit/decorators.test.js.map +1 -0
  113. package/dist/__tests__/unit/introspection-tools.test.d.ts +5 -0
  114. package/dist/__tests__/unit/introspection-tools.test.d.ts.map +1 -0
  115. package/dist/__tests__/unit/introspection-tools.test.js +191 -0
  116. package/dist/__tests__/unit/introspection-tools.test.js.map +1 -0
  117. package/dist/__tests__/unit/logger.test.d.ts +2 -0
  118. package/dist/__tests__/unit/logger.test.d.ts.map +1 -0
  119. package/dist/__tests__/unit/logger.test.js +241 -0
  120. package/dist/__tests__/unit/logger.test.js.map +1 -0
  121. package/dist/__tests__/unit/observable.test.d.ts +2 -0
  122. package/dist/__tests__/unit/observable.test.d.ts.map +1 -0
  123. package/dist/__tests__/unit/observable.test.js +251 -0
  124. package/dist/__tests__/unit/observable.test.js.map +1 -0
  125. package/dist/__tests__/unit/prompt.test.d.ts +2 -0
  126. package/dist/__tests__/unit/prompt.test.d.ts.map +1 -0
  127. package/dist/__tests__/unit/prompt.test.js +113 -0
  128. package/dist/__tests__/unit/prompt.test.js.map +1 -0
  129. package/dist/__tests__/unit/reflection.test.d.ts +5 -0
  130. package/dist/__tests__/unit/reflection.test.d.ts.map +1 -0
  131. package/dist/__tests__/unit/reflection.test.js +160 -0
  132. package/dist/__tests__/unit/reflection.test.js.map +1 -0
  133. package/dist/__tests__/unit/tree-debugger-incremental.test.d.ts +2 -0
  134. package/dist/__tests__/unit/tree-debugger-incremental.test.d.ts.map +1 -0
  135. package/dist/__tests__/unit/tree-debugger-incremental.test.js +136 -0
  136. package/dist/__tests__/unit/tree-debugger-incremental.test.js.map +1 -0
  137. package/dist/__tests__/unit/tree-debugger.test.d.ts +2 -0
  138. package/dist/__tests__/unit/tree-debugger.test.d.ts.map +1 -0
  139. package/dist/__tests__/unit/tree-debugger.test.js +69 -0
  140. package/dist/__tests__/unit/tree-debugger.test.js.map +1 -0
  141. package/dist/__tests__/unit/utils/workflow-error-utils.test.d.ts +2 -0
  142. package/dist/__tests__/unit/utils/workflow-error-utils.test.d.ts.map +1 -0
  143. package/dist/__tests__/unit/utils/workflow-error-utils.test.js +154 -0
  144. package/dist/__tests__/unit/utils/workflow-error-utils.test.js.map +1 -0
  145. package/dist/__tests__/unit/workflow-detachChild.test.d.ts +2 -0
  146. package/dist/__tests__/unit/workflow-detachChild.test.d.ts.map +1 -0
  147. package/dist/__tests__/unit/workflow-detachChild.test.js +76 -0
  148. package/dist/__tests__/unit/workflow-detachChild.test.js.map +1 -0
  149. package/dist/__tests__/unit/workflow-emitEvent-childDetached.test.d.ts +2 -0
  150. package/dist/__tests__/unit/workflow-emitEvent-childDetached.test.d.ts.map +1 -0
  151. package/dist/__tests__/unit/workflow-emitEvent-childDetached.test.js +122 -0
  152. package/dist/__tests__/unit/workflow-emitEvent-childDetached.test.js.map +1 -0
  153. package/dist/__tests__/unit/workflow-isDescendantOf.test.d.ts +2 -0
  154. package/dist/__tests__/unit/workflow-isDescendantOf.test.d.ts.map +1 -0
  155. package/dist/__tests__/unit/workflow-isDescendantOf.test.js +140 -0
  156. package/dist/__tests__/unit/workflow-isDescendantOf.test.js.map +1 -0
  157. package/dist/__tests__/unit/workflow.test.d.ts +2 -0
  158. package/dist/__tests__/unit/workflow.test.d.ts.map +1 -0
  159. package/dist/__tests__/unit/workflow.test.js +330 -0
  160. package/dist/__tests__/unit/workflow.test.js.map +1 -0
  161. package/dist/cache/cache-key.d.ts +66 -0
  162. package/dist/cache/cache-key.d.ts.map +1 -0
  163. package/dist/cache/cache-key.js +195 -0
  164. package/dist/cache/cache-key.js.map +1 -0
  165. package/dist/cache/cache.d.ts +104 -0
  166. package/dist/cache/cache.d.ts.map +1 -0
  167. package/dist/cache/cache.js +179 -0
  168. package/dist/cache/cache.js.map +1 -0
  169. package/{src/cache/index.ts → dist/cache/index.d.ts} +1 -1
  170. package/dist/cache/index.d.ts.map +1 -0
  171. package/dist/cache/index.js +6 -0
  172. package/dist/cache/index.js.map +1 -0
  173. package/dist/core/agent.d.ts +112 -0
  174. package/dist/core/agent.d.ts.map +1 -0
  175. package/dist/core/agent.js +426 -0
  176. package/dist/core/agent.js.map +1 -0
  177. package/{src/core/context.ts → dist/core/context.d.ts} +16 -67
  178. package/dist/core/context.d.ts.map +1 -0
  179. package/dist/core/context.js +80 -0
  180. package/dist/core/context.js.map +1 -0
  181. package/dist/core/event-tree.d.ts +72 -0
  182. package/dist/core/event-tree.d.ts.map +1 -0
  183. package/dist/core/event-tree.js +211 -0
  184. package/dist/core/event-tree.js.map +1 -0
  185. package/{src/core/factory.ts → dist/core/factory.d.ts} +6 -27
  186. package/dist/core/factory.d.ts.map +1 -0
  187. package/dist/core/factory.js +110 -0
  188. package/dist/core/factory.js.map +1 -0
  189. package/{src/core/index.ts → dist/core/index.d.ts} +2 -10
  190. package/dist/core/index.d.ts.map +1 -0
  191. package/dist/core/index.js +9 -0
  192. package/dist/core/index.js.map +1 -0
  193. package/dist/core/logger.d.ts +50 -0
  194. package/dist/core/logger.d.ts.map +1 -0
  195. package/dist/core/logger.js +91 -0
  196. package/dist/core/logger.js.map +1 -0
  197. package/dist/core/mcp-handler.d.ts +69 -0
  198. package/dist/core/mcp-handler.d.ts.map +1 -0
  199. package/dist/core/mcp-handler.js +143 -0
  200. package/dist/core/mcp-handler.js.map +1 -0
  201. package/dist/core/prompt.d.ts +80 -0
  202. package/dist/core/prompt.d.ts.map +1 -0
  203. package/dist/core/prompt.js +120 -0
  204. package/dist/core/prompt.js.map +1 -0
  205. package/dist/core/workflow-context.d.ts +57 -0
  206. package/dist/core/workflow-context.d.ts.map +1 -0
  207. package/dist/core/workflow-context.js +263 -0
  208. package/dist/core/workflow-context.js.map +1 -0
  209. package/dist/core/workflow.d.ts +241 -0
  210. package/dist/core/workflow.d.ts.map +1 -0
  211. package/dist/core/workflow.js +464 -0
  212. package/dist/core/workflow.js.map +1 -0
  213. package/dist/debugger/index.d.ts +2 -0
  214. package/dist/debugger/index.d.ts.map +1 -0
  215. package/{src/debugger/index.ts → dist/debugger/index.js} +1 -0
  216. package/dist/debugger/index.js.map +1 -0
  217. package/dist/debugger/tree-debugger.d.ts +71 -0
  218. package/dist/debugger/tree-debugger.d.ts.map +1 -0
  219. package/dist/debugger/tree-debugger.js +198 -0
  220. package/dist/debugger/tree-debugger.js.map +1 -0
  221. package/dist/decorators/index.d.ts +4 -0
  222. package/dist/decorators/index.d.ts.map +1 -0
  223. package/{src/decorators/index.ts → dist/decorators/index.js} +1 -0
  224. package/dist/decorators/index.js.map +1 -0
  225. package/dist/decorators/observed-state.d.ts +32 -0
  226. package/dist/decorators/observed-state.d.ts.map +1 -0
  227. package/dist/decorators/observed-state.js +79 -0
  228. package/dist/decorators/observed-state.js.map +1 -0
  229. package/dist/decorators/step.d.ts +15 -0
  230. package/dist/decorators/step.d.ts.map +1 -0
  231. package/dist/decorators/step.js +110 -0
  232. package/dist/decorators/step.js.map +1 -0
  233. package/dist/decorators/task.d.ts +50 -0
  234. package/dist/decorators/task.d.ts.map +1 -0
  235. package/dist/decorators/task.js +118 -0
  236. package/dist/decorators/task.js.map +1 -0
  237. package/dist/examples/index.d.ts +3 -0
  238. package/dist/examples/index.d.ts.map +1 -0
  239. package/{src/examples/index.ts → dist/examples/index.js} +1 -0
  240. package/dist/examples/index.js.map +1 -0
  241. package/dist/examples/tdd-orchestrator.d.ts +15 -0
  242. package/dist/examples/tdd-orchestrator.d.ts.map +1 -0
  243. package/dist/examples/tdd-orchestrator.js +121 -0
  244. package/dist/examples/tdd-orchestrator.js.map +1 -0
  245. package/dist/examples/test-cycle-workflow.d.ts +14 -0
  246. package/dist/examples/test-cycle-workflow.d.ts.map +1 -0
  247. package/dist/examples/test-cycle-workflow.js +116 -0
  248. package/dist/examples/test-cycle-workflow.js.map +1 -0
  249. package/dist/index.d.ts +27 -0
  250. package/dist/index.d.ts.map +1 -0
  251. package/dist/index.js +40 -0
  252. package/dist/index.js.map +1 -0
  253. package/dist/reflection/index.d.ts +5 -0
  254. package/dist/reflection/index.d.ts.map +1 -0
  255. package/{src/reflection/index.ts → dist/reflection/index.js} +1 -1
  256. package/dist/reflection/index.js.map +1 -0
  257. package/dist/reflection/reflection.d.ts +84 -0
  258. package/dist/reflection/reflection.d.ts.map +1 -0
  259. package/dist/reflection/reflection.js +329 -0
  260. package/dist/reflection/reflection.js.map +1 -0
  261. package/dist/tools/index.d.ts +6 -0
  262. package/dist/tools/index.d.ts.map +1 -0
  263. package/dist/tools/index.js +11 -0
  264. package/dist/tools/index.js.map +1 -0
  265. package/dist/tools/introspection.d.ts +165 -0
  266. package/dist/tools/introspection.d.ts.map +1 -0
  267. package/dist/tools/introspection.js +324 -0
  268. package/dist/tools/introspection.js.map +1 -0
  269. package/dist/types/agent.d.ts +66 -0
  270. package/dist/types/agent.d.ts.map +1 -0
  271. package/dist/types/agent.js +6 -0
  272. package/dist/types/agent.js.map +1 -0
  273. package/dist/types/decorators.d.ts +31 -0
  274. package/dist/types/decorators.d.ts.map +1 -0
  275. package/dist/types/decorators.js +2 -0
  276. package/dist/types/decorators.js.map +1 -0
  277. package/dist/types/error-strategy.d.ts +13 -0
  278. package/dist/types/error-strategy.d.ts.map +1 -0
  279. package/dist/types/error-strategy.js +2 -0
  280. package/dist/types/error-strategy.js.map +1 -0
  281. package/dist/types/error.d.ts +20 -0
  282. package/dist/types/error.d.ts.map +1 -0
  283. package/dist/types/error.js +2 -0
  284. package/dist/types/error.js.map +1 -0
  285. package/dist/types/events.d.ts +87 -0
  286. package/dist/types/events.d.ts.map +1 -0
  287. package/dist/types/events.js +2 -0
  288. package/dist/types/events.js.map +1 -0
  289. package/dist/types/index.d.ts +15 -0
  290. package/dist/types/index.d.ts.map +1 -0
  291. package/dist/types/index.js +2 -0
  292. package/dist/types/index.js.map +1 -0
  293. package/dist/types/logging.d.ts +24 -0
  294. package/dist/types/logging.d.ts.map +1 -0
  295. package/dist/types/logging.js +2 -0
  296. package/dist/types/logging.js.map +1 -0
  297. package/dist/types/observer.d.ts +18 -0
  298. package/dist/types/observer.d.ts.map +1 -0
  299. package/dist/types/observer.js +2 -0
  300. package/dist/types/observer.js.map +1 -0
  301. package/dist/types/prompt.d.ts +31 -0
  302. package/dist/types/prompt.d.ts.map +1 -0
  303. package/dist/types/prompt.js +6 -0
  304. package/dist/types/prompt.js.map +1 -0
  305. package/dist/types/reflection.d.ts +96 -0
  306. package/dist/types/reflection.d.ts.map +1 -0
  307. package/dist/types/reflection.js +24 -0
  308. package/dist/types/reflection.js.map +1 -0
  309. package/dist/types/sdk-primitives.d.ts +118 -0
  310. package/dist/types/sdk-primitives.d.ts.map +1 -0
  311. package/dist/types/sdk-primitives.js +6 -0
  312. package/dist/types/sdk-primitives.js.map +1 -0
  313. package/{src/types/snapshot.ts → dist/types/snapshot.d.ts} +5 -5
  314. package/dist/types/snapshot.d.ts.map +1 -0
  315. package/dist/types/snapshot.js +2 -0
  316. package/dist/types/snapshot.js.map +1 -0
  317. package/dist/types/workflow-context.d.ts +139 -0
  318. package/dist/types/workflow-context.d.ts.map +1 -0
  319. package/dist/types/workflow-context.js +8 -0
  320. package/dist/types/workflow-context.js.map +1 -0
  321. package/dist/types/workflow.d.ts +30 -0
  322. package/dist/types/workflow.d.ts.map +1 -0
  323. package/dist/types/workflow.js +2 -0
  324. package/dist/types/workflow.js.map +1 -0
  325. package/dist/utils/id.d.ts +6 -0
  326. package/dist/utils/id.d.ts.map +1 -0
  327. package/dist/utils/id.js +12 -0
  328. package/dist/utils/id.js.map +1 -0
  329. package/{src/utils/index.ts → dist/utils/index.d.ts} +1 -0
  330. package/dist/utils/index.d.ts.map +1 -0
  331. package/dist/utils/index.js +4 -0
  332. package/dist/utils/index.js.map +1 -0
  333. package/dist/utils/observable.d.ts +54 -0
  334. package/dist/utils/observable.d.ts.map +1 -0
  335. package/dist/utils/observable.js +82 -0
  336. package/dist/utils/observable.js.map +1 -0
  337. package/dist/utils/workflow-error-utils.d.ts +22 -0
  338. package/dist/utils/workflow-error-utils.d.ts.map +1 -0
  339. package/dist/utils/workflow-error-utils.js +45 -0
  340. package/dist/utils/workflow-error-utils.js.map +1 -0
  341. package/package.json +5 -2
  342. package/.claude/commands/subtask-planning/prp-base-create.md +0 -120
  343. package/.claude/commands/subtask-planning/prp-base-execute.md +0 -65
  344. package/.claude/commands/task-breakdown.md +0 -94
  345. package/.claude/settings.local.json +0 -9
  346. package/.claude/system_prompts/task-breakdown.md +0 -101
  347. package/PRD.md +0 -543
  348. package/PRPs/001-hierarchical-workflow-engine.md +0 -2438
  349. package/PRPs/PRDs/002-agent-prompt.md +0 -390
  350. package/PRPs/PRDs/003-agent-prompt.md +0 -943
  351. package/PRPs/PRDs/004-agent-prompt.md +0 -1136
  352. package/PRPs/PRDs/tasks-001.json +0 -492
  353. package/PRPs/README.md +0 -83
  354. package/PRPs/templates/prp_base.md +0 -222
  355. package/docs/agent.md +0 -422
  356. package/docs/prompt.md +0 -419
  357. package/docs/workflow.md +0 -600
  358. package/examples/README.md +0 -258
  359. package/examples/examples/01-basic-workflow.ts +0 -100
  360. package/examples/examples/02-decorator-options.ts +0 -217
  361. package/examples/examples/03-parent-child.ts +0 -241
  362. package/examples/examples/04-observers-debugger.ts +0 -340
  363. package/examples/examples/05-error-handling.ts +0 -387
  364. package/examples/examples/06-concurrent-tasks.ts +0 -352
  365. package/examples/examples/07-agent-loops.ts +0 -432
  366. package/examples/examples/08-sdk-features.ts +0 -667
  367. package/examples/examples/09-reflection.ts +0 -573
  368. package/examples/examples/10-introspection.ts +0 -550
  369. package/examples/examples/11-reparenting-workflows.ts +0 -269
  370. package/examples/index.ts +0 -147
  371. package/examples/utils/helpers.ts +0 -57
  372. package/package-lock.json +0 -2398
  373. package/plan/001_d3bb02af4886/TEST_RESULTS.md +0 -259
  374. package/plan/001_d3bb02af4886/backlog.json +0 -867
  375. package/plan/001_d3bb02af4886/bug_fix_tasks.json +0 -484
  376. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M1T1S1/PRP.md +0 -488
  377. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M1T1S2/PRP.md +0 -581
  378. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M1T1S3/PRP.md +0 -687
  379. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T1S1/PRP.md +0 -492
  380. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T1S3/PRP.md +0 -932
  381. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T1S3/research/concurrent_error_testing_patterns.md +0 -1109
  382. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T1S3/research/vitest_concurrent_testing.md +0 -802
  383. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T1S3/research/workflow_engine_test_references.md +0 -603
  384. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T2S1/PRP.md +0 -564
  385. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T2S3/PRP.md +0 -518
  386. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T2S4/PRP.md +0 -1252
  387. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T3S1/PRP.md +0 -364
  388. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T3S1/research/CODEBASE_INVENTORY.md +0 -114
  389. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T3S1/research/DECORATOR_DOCUMENTATION_PATTERNS.md +0 -205
  390. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T3S1/research/PRD_LOCATION_ANALYSIS.md +0 -199
  391. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M2T3S1/research/ULTRATHINK_PRP_PLAN.md +0 -134
  392. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T1S1/PRP.md +0 -495
  393. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T1S1/research/console_error_inventory.md +0 -435
  394. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T1S2/PRP.md +0 -506
  395. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T1S3/PRP.md +0 -612
  396. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T2S2/PRP.md +0 -558
  397. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T2S2/research/external_research.md +0 -788
  398. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T3S2/PRP.md +0 -460
  399. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T3S3/PRP.md +0 -454
  400. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T4S1/PRP.md +0 -520
  401. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T4S1/RECOMMENDATION.md +0 -417
  402. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T4S1/research/external_workflow_engines_research.md +0 -760
  403. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T4S1/research/security_implications_analysis.md +0 -245
  404. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M3T4S2/PRP.md +0 -792
  405. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S1/PRP.md +0 -535
  406. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S1/TEST_EXECUTION_REPORT.md +0 -190
  407. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S2/PRP.md +0 -654
  408. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S2/TEST_FIX_REPORT.md +0 -227
  409. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S2/research/KEY_FINDINGS.md +0 -345
  410. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S2/research/QUICK_REFERENCE.md +0 -193
  411. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T1S2/research/test_maintenance_research.md +0 -1323
  412. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T3S1/BREAKING_CHANGES_AUDIT.md +0 -1011
  413. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T3S1/PRP.md +0 -927
  414. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/P1M4T3S2/PRP.md +0 -505
  415. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/architecture/logger_child_signature_analysis.md +0 -401
  416. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S3/child_implementation_research.md +0 -142
  417. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S3/test_patterns_research.md +0 -112
  418. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S3/vitest_patterns_research.md +0 -159
  419. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S4/PRP.md +0 -549
  420. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S4/VERIFICATION_REPORT.md +0 -368
  421. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S4/edge_case_analysis.md +0 -172
  422. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M1T1S4/usage_inventory.md +0 -175
  423. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T1S2/PRP.md +0 -696
  424. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T1S4/PRP.md +0 -860
  425. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/PRP.md +0 -1066
  426. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/01-testing-aggregated-errors.md +0 -1103
  427. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/01_typescript_error_aggregation_patterns.md +0 -789
  428. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/02-error-merge-strategy-testing-guide.md +0 -1098
  429. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/02_aggregate_error_patterns.md +0 -1037
  430. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/03-promise-allsettled-testing-patterns.md +0 -916
  431. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/03_error_merging_strategies.md +0 -1045
  432. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/04_github_stackoverflow_examples.md +0 -890
  433. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/05_comprehensive_summary.md +0 -822
  434. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/INDEX.md +0 -668
  435. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/QUICK_REFERENCE.md +0 -706
  436. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/README.md +0 -265
  437. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S2/research/RESEARCH_REPORT.md +0 -655
  438. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T2S4/research/vitest_testing_patterns.md +0 -1103
  439. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M2T3S2/PRP.md +0 -426
  440. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T1S2/PRP.md +0 -506
  441. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T1S2/research/QUICK_REFERENCE.md +0 -114
  442. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T1S2/research/RESEARCH_SUMMARY.md +0 -316
  443. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T1S2/research/vitest_observer_error_logging_best_practices.md +0 -754
  444. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T1S3/PRP.md +0 -612
  445. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T2S1/PRP.md +0 -719
  446. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T2S1/README.md +0 -215
  447. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T2S1/analysis.md +0 -765
  448. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T2S3/PRP.md +0 -718
  449. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T3S1/DECISION.md +0 -149
  450. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T3S1/PRP.md +0 -470
  451. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T3S1/research/ULTRATHINK_PLAN.md +0 -332
  452. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T3S1/research/codebase_workflow_name_analysis.md +0 -167
  453. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T3S1/research/external_best_practices.md +0 -265
  454. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T3S1/research/validation_patterns.md +0 -273
  455. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T4S1/workflow_engine_ancestry_api_research.md +0 -760
  456. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M3T4S3-PRP.md +0 -434
  457. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M4T2S1/PRP.md +0 -717
  458. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M4T2S2/PRP.md +0 -472
  459. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M4T2S2/VALIDATION_REPORT.md +0 -125
  460. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/P1M4T2S2/research/ULTRATHINK_PRP_PLAN.md +0 -301
  461. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/error-logging-best-practices.md +0 -1170
  462. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/research_typescript_partial_and_overloads.md +0 -940
  463. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/vitest-quick-reference.md +0 -151
  464. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/docs/vitest-research.md +0 -650
  465. package/plan/001_d3bb02af4886/bugfix/001_e8e04329daf3/prd_snapshot.md +0 -259
  466. package/plan/001_d3bb02af4886/bugfix/P1M1T1S1/PRP.md +0 -457
  467. package/plan/001_d3bb02af4886/bugfix/RESEARCH_SUMMARY.md +0 -346
  468. package/plan/001_d3bb02af4886/bugfix/architecture/codebase_structure.md +0 -311
  469. package/plan/001_d3bb02af4886/bugfix/architecture/concurrent_execution_best_practices.md +0 -1565
  470. package/plan/001_d3bb02af4886/bugfix/architecture/error_handling_patterns.md +0 -288
  471. package/plan/001_d3bb02af4886/bugfix/architecture/promise_all_analysis.md +0 -741
  472. package/plan/001_d3bb02af4886/docs/PRP/P1M1T1S4-functional-workflow-error-state-capture-test.md +0 -652
  473. package/plan/001_d3bb02af4886/docs/PRP/P1P2-PRP.md +0 -527
  474. package/plan/001_d3bb02af4886/docs/PRP/P3P4-PRP.md +0 -1388
  475. package/plan/001_d3bb02af4886/docs/PRP/P4P5-PRP.md +0 -1136
  476. package/plan/001_d3bb02af4886/docs/PRP/PRP.md +0 -527
  477. package/plan/001_d3bb02af4886/docs/PRP/bugfix/P1M1T2S1-PRP.md +0 -415
  478. package/plan/001_d3bb02af4886/docs/PRP/bugfix/P1M1T2S2-PRP.md +0 -378
  479. package/plan/001_d3bb02af4886/docs/PRP/bugfix/P1M1T2S4-PRP.md +0 -713
  480. package/plan/001_d3bb02af4886/docs/PRP/bugfix/P1M2T1S4-PRP.md +0 -370
  481. package/plan/001_d3bb02af4886/docs/PRP_P1M3T1S3.md +0 -499
  482. package/plan/001_d3bb02af4886/docs/TEST_RESULTS.md +0 -230
  483. package/plan/001_d3bb02af4886/docs/architecture/external_deps.md +0 -358
  484. package/plan/001_d3bb02af4886/docs/architecture/system_context.md +0 -242
  485. package/plan/001_d3bb02af4886/docs/bugfix/ANALYSIS_PRD_VS_IMPLEMENTATION.md +0 -1134
  486. package/plan/001_d3bb02af4886/docs/bugfix/GAP_ANALYSIS_SUMMARY.md +0 -179
  487. package/plan/001_d3bb02af4886/docs/bugfix/P1M4T2S1/PRP.md +0 -629
  488. package/plan/001_d3bb02af4886/docs/bugfix/P1M4T2S1/validation-report.md +0 -214
  489. package/plan/001_d3bb02af4886/docs/bugfix/PRP_P1M4T2S3.md +0 -629
  490. package/plan/001_d3bb02af4886/docs/bugfix/bugfix_PRP.md +0 -529
  491. package/plan/001_d3bb02af4886/docs/bugfix/bugfix_QUICK_REFERENCE.md +0 -142
  492. package/plan/001_d3bb02af4886/docs/bugfix/bugfix_README.md +0 -304
  493. package/plan/001_d3bb02af4886/docs/bugfix/bugfix_TEST_RESULTS.md +0 -558
  494. package/plan/001_d3bb02af4886/docs/bugfix/bugfix_VALIDATION_SUMMARY.md +0 -256
  495. package/plan/001_d3bb02af4886/docs/bugfix/system_context.md +0 -346
  496. package/plan/001_d3bb02af4886/docs/bugfix-architecture/bug_analysis.md +0 -415
  497. package/plan/001_d3bb02af4886/docs/bugfix-architecture/implementation_patterns.md +0 -489
  498. package/plan/001_d3bb02af4886/docs/bugfix-architecture/system_context.md +0 -218
  499. package/plan/001_d3bb02af4886/docs/bugfix_INITIATION_SUMMARY.md +0 -380
  500. package/plan/001_d3bb02af4886/docs/research/CYCLE_DETECTION_PATTERNS.md +0 -1923
  501. package/plan/001_d3bb02af4886/docs/research/CYCLE_DETECTION_QUICK_REF.md +0 -319
  502. package/plan/001_d3bb02af4886/docs/research/P1M1T2S1/codebase-context.md +0 -115
  503. package/plan/001_d3bb02af4886/docs/research/P1M1T2S1/cycle-detection-algorithms.md +0 -134
  504. package/plan/001_d3bb02af4886/docs/research/P1M1T2S1/test-patterns.md +0 -153
  505. package/plan/001_d3bb02af4886/docs/research/P1M1T2S1/workflow-class.md +0 -132
  506. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/DECORATOR_DOCUMENTATION_BEST_PRACTICES.md +0 -716
  507. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/DECORATOR_DOCUMENTATION_QUICK_REF.md +0 -186
  508. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/GROUNDSWELL_DECORATOR_EXAMPLES.md +0 -604
  509. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/INDEX.md +0 -213
  510. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/codebase_structure.md +0 -30
  511. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/existing_test_pattern.md +0 -56
  512. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/getRootObservers_implementation.md +0 -53
  513. package/plan/001_d3bb02af4886/docs/research/P1M2T1S4/test_conventions.md +0 -49
  514. package/plan/001_d3bb02af4886/docs/research/P1M3T1S4/PRP.md +0 -958
  515. package/plan/001_d3bb02af4886/docs/research/P1M3T1S4/QUICK_REFERENCE.md +0 -339
  516. package/plan/001_d3bb02af4886/docs/research/P1M3T1S4/README.md +0 -305
  517. package/plan/001_d3bb02af4886/docs/research/P1M3T1S4/SUMMARY.md +0 -433
  518. package/plan/001_d3bb02af4886/docs/research/P1M3T1S4/bidirectional-tree-consistency-testing.md +0 -1574
  519. package/plan/001_d3bb02af4886/docs/research/P1M3T1S4/test-pattern-examples.md +0 -1014
  520. package/plan/001_d3bb02af4886/docs/research/P1P2/LRU_CACHE_BEST_PRACTICES.md +0 -1929
  521. package/plan/001_d3bb02af4886/docs/research/P1P2/LRU_CACHE_CODE_PATTERNS.md +0 -857
  522. package/plan/001_d3bb02af4886/docs/research/P1P2/LRU_CACHE_INTEGRATION_GUIDE.md +0 -738
  523. package/plan/001_d3bb02af4886/docs/research/P1P2/LRU_CACHE_RESEARCH_INDEX.md +0 -424
  524. package/plan/001_d3bb02af4886/docs/research/P1P2/REFLECTION_INDEX.md +0 -291
  525. package/plan/001_d3bb02af4886/docs/research/P1P2/REFLECTION_RESEARCH_REPORT.md +0 -1342
  526. package/plan/001_d3bb02af4886/docs/research/P1P2/RESEARCH_SUMMARY.md +0 -342
  527. package/plan/001_d3bb02af4886/docs/research/P1P2/anthropic-sdk.md +0 -174
  528. package/plan/001_d3bb02af4886/docs/research/P1P2/async-local-storage.md +0 -200
  529. package/plan/001_d3bb02af4886/docs/research/P1P2/reflection-code-patterns.md +0 -1205
  530. package/plan/001_d3bb02af4886/docs/research/P1P2/reflection-decision-matrix.md +0 -421
  531. package/plan/001_d3bb02af4886/docs/research/P1P2/reflection-implementation-guide.md +0 -1341
  532. package/plan/001_d3bb02af4886/docs/research/P1P2/reflection-integration-guide.md +0 -834
  533. package/plan/001_d3bb02af4886/docs/research/P1P2/reflection-patterns.md +0 -1468
  534. package/plan/001_d3bb02af4886/docs/research/P1P2/reflection-quick-reference.md +0 -558
  535. package/plan/001_d3bb02af4886/docs/research/P1P2/zod-schema.md +0 -152
  536. package/plan/001_d3bb02af4886/docs/research/P3P4/caching-lru.md +0 -116
  537. package/plan/001_d3bb02af4886/docs/research/P3P4/introspection-tools.md +0 -177
  538. package/plan/001_d3bb02af4886/docs/research/P3P4/reflection-patterns.md +0 -117
  539. package/plan/001_d3bb02af4886/docs/research/P4P5/RESEARCH_SUMMARY.md +0 -151
  540. package/plan/001_d3bb02af4886/docs/research/PROMISE_ALLSETTLED_QUICK_REF.md +0 -376
  541. package/plan/001_d3bb02af4886/docs/research/PROMISE_ALLSETTLED_RESEARCH.md +0 -1507
  542. package/plan/001_d3bb02af4886/docs/research/bugfix_typescript_patterns.md +0 -949
  543. package/plan/001_d3bb02af4886/docs/research/error-testing-research.md +0 -619
  544. package/plan/001_d3bb02af4886/docs/research/error_handling_patterns.md +0 -723
  545. package/plan/001_d3bb02af4886/docs/research/general/INTROSPECTION_RESEARCH_SUMMARY.md +0 -378
  546. package/plan/001_d3bb02af4886/docs/research/general/README-INTROSPECTION.md +0 -352
  547. package/plan/001_d3bb02af4886/docs/research/general/agent-introspection-patterns.md +0 -1085
  548. package/plan/001_d3bb02af4886/docs/research/general/introspection-security-guide.md +0 -984
  549. package/plan/001_d3bb02af4886/docs/research/general/introspection-tool-examples.md +0 -875
  550. package/plan/001_d3bb02af4886/docs/research/incremental-tree-map-updates/PRP_TEMPLATE.md +0 -460
  551. package/plan/001_d3bb02af4886/docs/research/incremental-tree-map-updates/QUICK_REFERENCE.md +0 -324
  552. package/plan/001_d3bb02af4886/docs/research/incremental-tree-map-updates/README.md +0 -175
  553. package/plan/001_d3bb02af4886/docs/research/incremental-tree-map-updates/RESEARCH_REPORT.md +0 -499
  554. package/plan/001_d3bb02af4886/docs/research/incremental-tree-map-updates/SUMMARY.md +0 -163
  555. package/plan/001_d3bb02af4886/prd_snapshot.md +0 -543
  556. package/plan/bugfix/BUG_FIX_SUMMARY.md +0 -961
  557. package/scripts/generate-llms-full.ts +0 -206
  558. package/src/__tests__/adversarial/attachChild-performance.test.ts +0 -216
  559. package/src/__tests__/adversarial/circular-reference.test.ts +0 -101
  560. package/src/__tests__/adversarial/complex-circular-reference.test.ts +0 -139
  561. package/src/__tests__/adversarial/concurrent-task-failures.test.ts +0 -571
  562. package/src/__tests__/adversarial/deep-analysis.test.ts +0 -729
  563. package/src/__tests__/adversarial/deep-hierarchy-stress.test.ts +0 -213
  564. package/src/__tests__/adversarial/e2e-prd-validation.test.ts +0 -448
  565. package/src/__tests__/adversarial/edge-case.test.ts +0 -703
  566. package/src/__tests__/adversarial/error-merge-strategy.test.ts +0 -760
  567. package/src/__tests__/adversarial/incremental-performance.test.ts +0 -140
  568. package/src/__tests__/adversarial/node-map-update-benchmarks.test.ts +0 -457
  569. package/src/__tests__/adversarial/observer-propagation.test.ts +0 -487
  570. package/src/__tests__/adversarial/parent-validation.test.ts +0 -143
  571. package/src/__tests__/adversarial/prd-12-2-compliance.test.ts +0 -611
  572. package/src/__tests__/adversarial/prd-compliance.test.ts +0 -731
  573. package/src/__tests__/compatibility/backward-compatibility.test.ts +0 -1572
  574. package/src/__tests__/helpers/tree-verification.ts +0 -257
  575. package/src/__tests__/integration/agent-workflow.test.ts +0 -256
  576. package/src/__tests__/integration/bidirectional-consistency.test.ts +0 -847
  577. package/src/__tests__/integration/observer-logging.test.ts +0 -643
  578. package/src/__tests__/integration/tree-mirroring.test.ts +0 -151
  579. package/src/__tests__/integration/workflow-reparenting.test.ts +0 -303
  580. package/src/__tests__/unit/agent.test.ts +0 -169
  581. package/src/__tests__/unit/cache-key.test.ts +0 -182
  582. package/src/__tests__/unit/cache.test.ts +0 -172
  583. package/src/__tests__/unit/context.test.ts +0 -217
  584. package/src/__tests__/unit/decorators.test.ts +0 -100
  585. package/src/__tests__/unit/introspection-tools.test.ts +0 -277
  586. package/src/__tests__/unit/logger.test.ts +0 -293
  587. package/src/__tests__/unit/observable.test.ts +0 -321
  588. package/src/__tests__/unit/prompt.test.ts +0 -135
  589. package/src/__tests__/unit/reflection.test.ts +0 -210
  590. package/src/__tests__/unit/tree-debugger-incremental.test.ts +0 -170
  591. package/src/__tests__/unit/tree-debugger.test.ts +0 -85
  592. package/src/__tests__/unit/utils/workflow-error-utils.test.ts +0 -209
  593. package/src/__tests__/unit/workflow-detachChild.test.ts +0 -100
  594. package/src/__tests__/unit/workflow-emitEvent-childDetached.test.ts +0 -153
  595. package/src/__tests__/unit/workflow-isDescendantOf.test.ts +0 -180
  596. package/src/__tests__/unit/workflow.test.ts +0 -357
  597. package/src/cache/cache-key.ts +0 -244
  598. package/src/cache/cache.ts +0 -236
  599. package/src/core/agent.ts +0 -593
  600. package/src/core/event-tree.ts +0 -260
  601. package/src/core/logger.ts +0 -112
  602. package/src/core/mcp-handler.ts +0 -184
  603. package/src/core/prompt.ts +0 -150
  604. package/src/core/workflow-context.ts +0 -351
  605. package/src/core/workflow.ts +0 -540
  606. package/src/debugger/tree-debugger.ts +0 -255
  607. package/src/decorators/observed-state.ts +0 -95
  608. package/src/decorators/step.ts +0 -139
  609. package/src/decorators/task.ts +0 -159
  610. package/src/examples/tdd-orchestrator.ts +0 -65
  611. package/src/examples/test-cycle-workflow.ts +0 -64
  612. package/src/index.ts +0 -142
  613. package/src/reflection/reflection.ts +0 -407
  614. package/src/tools/index.ts +0 -36
  615. package/src/tools/introspection.ts +0 -464
  616. package/src/types/agent.ts +0 -90
  617. package/src/types/decorators.ts +0 -32
  618. package/src/types/error-strategy.ts +0 -13
  619. package/src/types/error.ts +0 -20
  620. package/src/types/events.ts +0 -75
  621. package/src/types/index.ts +0 -55
  622. package/src/types/logging.ts +0 -24
  623. package/src/types/observer.ts +0 -18
  624. package/src/types/prompt.ts +0 -40
  625. package/src/types/reflection.ts +0 -117
  626. package/src/types/sdk-primitives.ts +0 -128
  627. package/src/types/workflow-context.ts +0 -163
  628. package/src/types/workflow.ts +0 -37
  629. package/src/utils/id.ts +0 -11
  630. package/src/utils/observable.ts +0 -106
  631. package/src/utils/workflow-error-utils.ts +0 -56
  632. package/tsconfig.json +0 -22
  633. package/vitest.config.ts +0 -16
@@ -1,1468 +0,0 @@
1
- # AI Agent Reflection & Self-Correction Patterns - Research Summary
2
-
3
- **Date**: December 2025
4
- **Focus**: Comprehensive research on reflection and self-correction patterns for AI agent frameworks
5
-
6
- ## Table of Contents
7
-
8
- 1. [Reflection in AI Systems](#reflection-in-ai-systems)
9
- 2. [Reflection Levels & Triggers](#reflection-levels--triggers)
10
- 3. [Implementation Patterns](#implementation-patterns)
11
- 4. [Reflection Prompt Templates](#reflection-prompt-templates)
12
- 5. [Existing Framework Approaches](#existing-framework-approaches)
13
- 6. [Best Practices & Guardrails](#best-practices--guardrails)
14
- 7. [When NOT to Reflect](#when-not-to-reflect)
15
- 8. [State Capture Patterns](#state-capture-patterns)
16
- 9. [Code Implementation Examples](#code-implementation-examples)
17
-
18
- ---
19
-
20
- ## Reflection in AI Systems
21
-
22
- ### What is Reflection?
23
-
24
- Reflection in AI agent contexts refers to an agent's ability to **think about its own actions and results in order to self-correct and improve**. It's essentially the AI analog of human introspection or "System 2" deliberative thinking. Rather than merely reacting instinctively, a reflective AI will pause to analyze what it has done, identify errors or suboptimal steps, and adjust its strategy.
25
-
26
- Key insight: Agents that can check and improve their own output are fundamentally more reliable because they catch mistakes before they compound, self-correct when they drift, and get better as they iterate.
27
-
28
- ### Core Components of Reflection
29
-
30
- The reflection pattern typically follows a three-phase cycle:
31
-
32
- 1. **Generation** - The model creates an initial output based on a prompt
33
- 2. **Reflection** - The AI critiques its own work, identifying areas for improvement
34
- 3. **Iteration/Refinement** - The AI refines its output based on feedback and continues until quality thresholds are met
35
-
36
- ### Why Reflection Matters
37
-
38
- Research demonstrates significant performance improvements:
39
- - **Reflexion (Shinn et al., 2023)**: Reported gains on decision-making (AlfWorld), reasoning (HotPotQA), and programming (HumanEval) tasks
40
- - **CRITIC (Gou et al., 2024)**: Showed 10-30% improvement in accuracy across multiple domains
41
- - **Reflexion + GPT-4**: Reached 91% on HumanEval coding benchmark vs 80% without reflection
42
-
43
- ---
44
-
45
- ## Reflection Levels & Triggers
46
-
47
- ### Three Levels of Reflection
48
-
49
- #### 1. **Prompt-Level Reflection**
50
- - Occurs within a single LLM call
51
- - Model is prompted to "check your work" after generation
52
- - Lightweight, uses only one additional prompt
53
- - Good for: Basic quality improvement, simple validations
54
- - Cost: 1 additional LLM call
55
-
56
- #### 2. **Agent-Level Reflection**
57
- - Occurs between tool calls or action sequences
58
- - Agent pauses after each major step to evaluate progress
59
- - Can include both self-assessment and external tool feedback
60
- - Good for: Multi-step tasks, tool-based workflows
61
- - Cost: Adds latency but improves task success
62
-
63
- #### 3. **Workflow-Level Reflection**
64
- - Occurs at the orchestration level
65
- - Multiple agents or sub-tasks are evaluated together
66
- - Captures systemic improvements and pattern recognition
67
- - Good for: Complex multi-agent systems, long-running workflows
68
- - Cost: Significant additional compute, reserved for high-value tasks
69
-
70
- ### Trigger Mechanisms
71
-
72
- #### **Error-Driven Reflection** (Most Common)
73
- Triggered when:
74
- - Tool call fails with error
75
- - Output validation rules fail
76
- - Test/assertion fails
77
- - Response status codes indicate problems
78
-
79
- Pattern:
80
- ```
81
- Output → Validate → Error Detected → Reflect → Retry
82
- ```
83
-
84
- #### **Low-Confidence Reflection**
85
- Triggered when:
86
- - Model expresses uncertainty in output
87
- - Confidence score below threshold
88
- - Multiple alternative interpretations exist
89
- - Ambiguous user input detected
90
-
91
- Mechanism:
92
- - Use certainty tokens or confidence metadata
93
- - Leverage model's own uncertainty assessment
94
- - Self-Reflection Certainty: Ask model "Does this seem correct to you?"
95
- - Dynamically adjust confidence as model reasons (chain-of-thought)
96
-
97
- #### **Manual/Explicit Triggers**
98
- - User explicitly requests reflection
99
- - Scheduled checkpoints in workflow
100
- - Budget-based (after N tokens/steps)
101
- - Performance-based (when metric drops below threshold)
102
-
103
- #### **Progress-Based Triggers**
104
- - No progress after N iterations
105
- - State duplication (same output returned twice)
106
- - Repeated error patterns
107
- - Timeout approaching
108
-
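The trigger categories above can be folded into a single decision function. The following is a minimal sketch, not code from the package: `AttemptRecord`, `TriggerOptions`, and `pickTrigger` are hypothetical names, and the priority order (timeout, then stalled progress, then error, then low confidence) is an assumption.

```typescript
// Hypothetical record of one completed attempt; field names are illustrative.
interface AttemptRecord {
  output: string;
  error?: string; // set when validation or a tool call failed
  confidence?: number; // 0..1, if the model reported one
}

type Trigger = "error" | "low-confidence" | "no-progress" | "timeout" | "none";

interface TriggerOptions {
  confidenceThreshold: number; // reflect when confidence is below this value
  deadlineMs: number; // absolute deadline, in epoch milliseconds
  timeoutMarginMs: number; // how close to the deadline counts as "approaching"
}

function pickTrigger(
  history: AttemptRecord[],
  nowMs: number,
  opts: TriggerOptions,
): Trigger {
  const last = history[history.length - 1];
  if (last === undefined) return "none";

  // Timeout approaching: checked first so the caller can wrap up.
  if (opts.deadlineMs - nowMs <= opts.timeoutMarginMs) return "timeout";

  const earlier = history.slice(0, -1);
  // State duplication: the same output was already produced.
  if (earlier.some((a) => a.output === last.output)) return "no-progress";
  // Repeated error pattern: the same error message was already seen.
  if (last.error !== undefined && earlier.some((a) => a.error === last.error)) {
    return "no-progress";
  }

  if (last.error !== undefined) return "error";
  if (
    last.confidence !== undefined &&
    last.confidence < opts.confidenceThreshold
  ) {
    return "low-confidence";
  }
  return "none";
}
```

Returning a tag rather than a boolean lets the caller choose a different prompt or stop condition per trigger type.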
109
- ---
110
-
111
- ## Implementation Patterns
112
-
113
- ### Pattern 1: Error Detection → Reflection → Retry Loop
114
-
115
- **Core Cycle:**
116
- ```
117
- 1. Generate Solution
118
- 2. Execute/Validate → Capture Error
119
- 3. Reflect: "What went wrong? Why did this fail?"
120
- 4. Retry with improved strategy
121
- 5. Loop until success or max attempts reached
122
- ```
123
-
124
- **Key Variables:**
125
- - `max_retry_limit`: Total retry attempts (default: 3-5)
126
- - `retry_count`: Current attempt number
127
- - `error_context`: Captured error message/state
128
- - `previous_attempts`: History of what was tried
129
-
130
- **Implementation Considerations:**
131
- - Always include error message in reflection prompt
132
- - Maintain history of previous attempts to avoid loops
133
- - Implement exponential backoff for API calls
134
- - Track which approaches failed to suggest different strategies
135
-
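The cycle and variables above could be wired together roughly as follows. This is a sketch under assumptions: `reflectAndRetry`, `RetryDeps`, and the callback signatures are hypothetical, the model calls are injected rather than tied to any SDK, and `maxRetryLimit` counts retries after the initial attempt.

```typescript
// One failed attempt, kept so later attempts can avoid repeating it.
interface Attempt<T> {
  output: T;
  error: string;
  reflection: string;
}

// Hypothetical injection points; real implementations would call an LLM or tools.
interface RetryDeps<T> {
  generate: (task: string, previousAttempts: Attempt<T>[]) => Promise<T>;
  validate: (output: T) => Promise<string | null>; // null means "valid"
  reflect: (task: string, output: T, errorContext: string) => Promise<string>;
}

type RetryResult<T> =
  | { ok: true; output: T; attempts: number }
  | { ok: false; previousAttempts: Attempt<T>[] };

async function reflectAndRetry<T>(
  task: string,
  deps: RetryDeps<T>,
  maxRetryLimit = 3,
): Promise<RetryResult<T>> {
  const previousAttempts: Attempt<T>[] = [];

  // retryCount 0 is the initial attempt; up to maxRetryLimit retries follow.
  for (let retryCount = 0; retryCount <= maxRetryLimit; retryCount++) {
    const output = await deps.generate(task, previousAttempts);
    const errorContext = await deps.validate(output);
    if (errorContext === null) {
      return { ok: true, output, attempts: retryCount + 1 };
    }
    // The error message always goes into the reflection step.
    const reflection = await deps.reflect(task, output, errorContext);
    previousAttempts.push({ output, error: errorContext, reflection });
  }
  return { ok: false, previousAttempts };
}
```

Passing `previousAttempts` back into `generate` is what allows the generator to be steered away from strategies that already failed.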
136
- ### Pattern 2: Instruction-Following Validation (IFE)
137
-
138
- Treats LLM outputs as **untrusted inputs requiring explicit validation**.
139
-
140
- **Flow:**
141
- ```
142
- Agent Generates Output
143
-
144
- Validation Checkpoint:
145
- - Check instruction compliance
146
- - Verify format requirements
147
- - Validate output constraints
148
-
149
- If Violations Detected:
150
- - Log specific failures
151
- - Refine prompt with violation details
152
- - Retry (up to max attempts)
153
-
154
- If Passed:
155
- - Accept and proceed
156
- ```
157
-
158
- **Example Constraints:**
159
- ```
160
- - Time estimates: numeric only, 0-4.0 range, no units
161
- - Function names: snake_case, no special characters
162
- - Response format: valid JSON, specific schema
163
- - Length: within min/max bounds
164
- ```
165
-
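The example constraints above translate directly into small validators whose messages can be fed back into the retry prompt. A minimal sketch, assuming the function names, the `Violation` shape, and the "required keys" check as a stand-in for a full schema (all illustrative):

```typescript
interface Violation {
  field: string;
  message: string;
}

// Time estimates: numeric only, no units, within 0-4.0.
function validateTimeEstimate(raw: string): Violation[] {
  const text = raw.trim();
  if (!/^\d+(\.\d+)?$/.test(text)) {
    return [{ field: "timeEstimate", message: "must be a bare number with no units" }];
  }
  const value = Number(text);
  return value >= 0 && value <= 4.0
    ? []
    : [{ field: "timeEstimate", message: "must be between 0 and 4.0" }];
}

// Function names: snake_case, no special characters.
function validateFunctionName(name: string): Violation[] {
  return /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/.test(name)
    ? []
    : [{ field: "functionName", message: "must be snake_case" }];
}

// Response format: valid JSON object containing the required keys.
function validateJson(raw: string, requiredKeys: string[]): Violation[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [{ field: "response", message: "is not valid JSON" }];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return [{ field: "response", message: "must be a JSON object" }];
  }
  const record = parsed as Record<string, unknown>;
  return requiredKeys
    .filter((key) => !(key in record))
    .map((key) => ({ field: key, message: "is missing" }));
}

// Length: within min/max bounds (inclusive).
function validateLength(text: string, min: number, max: number): Violation[] {
  return text.length >= min && text.length <= max
    ? []
    : [{ field: "length", message: `must be ${min}-${max} characters` }];
}
```

Returning a list of violations, rather than throwing on the first one, keeps every failure available for the "refine prompt with violation details" step.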
166
- ### Pattern 3: Reflexion Architecture
167
-
168
- Separates three distinct models/roles:
169
-
170
- 1. **Actor** - Generates text and actions using Chain-of-Thought or ReAct
171
- 2. **Evaluator** - Scores outputs by assigning reward signals
172
- 3. **Self-Reflection** - Generates verbal feedback using rewards and memory
173
-
174
- **Flow:**
175
- ```
176
- Task Definition
177
-
178
- Generate Initial Trajectory (Actor)
179
-
180
- Evaluate Outcome (Evaluator assigns reward score)
181
-
182
- Generate Reflection (Self-Reflection creates verbal feedback)
183
-
184
- Store in Memory
185
-
186
- Generate Next Trajectory (with reflection context)
187
- ```
188
-
189
- Advantages:
190
- - Structured feedback mechanism
191
- - Interpretable reflection output
192
- - Can learn from feedback across multiple attempts
193
- - Grounded in external signals (rewards)
194
-
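The Actor / Evaluator / Self-Reflection split, with reflections stored in memory between trials, might look like this in outline. The sketch is illustrative only: `ReflexionRoles`, `runReflexion`, the reward range, and the bounded `memorySize` are assumptions rather than anything taken from the package or the paper's reference code.

```typescript
// Hypothetical role signatures; each would wrap an LLM or scoring call.
interface ReflexionRoles {
  actor: (task: string, memory: readonly string[]) => Promise<string>;
  evaluator: (task: string, trajectory: string) => Promise<number>; // reward, assumed 0..1
  selfReflection: (task: string, trajectory: string, reward: number) => Promise<string>;
}

interface ReflexionOptions {
  maxTrials: number;
  successReward: number; // stop once a trajectory scores at least this
  memorySize: number; // keep only the most recent reflections
}

interface ReflexionResult {
  trajectory: string;
  reward: number;
  trials: number;
}

async function runReflexion(
  task: string,
  roles: ReflexionRoles,
  opts: ReflexionOptions,
): Promise<ReflexionResult> {
  const memory: string[] = [];
  let best = { trajectory: "", reward: -Infinity };

  for (let trial = 1; trial <= opts.maxTrials; trial++) {
    // Actor generates a trajectory, conditioned on stored reflections.
    const trajectory = await roles.actor(task, memory);
    // Evaluator assigns a reward score.
    const reward = await roles.evaluator(task, trajectory);
    if (reward > best.reward) best = { trajectory, reward };
    if (reward >= opts.successReward) return { ...best, trials: trial };

    // Self-Reflection turns the reward into verbal feedback, stored in memory.
    memory.push(await roles.selfReflection(task, trajectory, reward));
    if (memory.length > opts.memorySize) memory.shift();
  }
  return { ...best, trials: opts.maxTrials };
}
```

Bounding the memory keeps the actor's prompt from growing without limit across trials.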
195
- ### Pattern 4: Tool-Enhanced Reflection
196
-
197
- Agent uses external tools to verify correctness before self-reflection.
198
-
199
- **Tools Used:**
200
- - Unit tests / test cases
201
- - Code linters (for TypeScript/Python)
202
- - Web search to verify facts
203
- - APIs to validate data
204
- - Sandbox execution to catch runtime errors
205
-
206
- **Flow:**
207
- ```
208
- Generate Code
209
-
210
- Run Tests/Linter
211
-
212
- Capture Feedback
213
-
214
- Reflect on Specific Failures
215
-
216
- Revise Based on Concrete Evidence
217
- ```
218
-
219
- **Key Insight**: Type-checked languages (e.g., TypeScript rather than plain JavaScript) add a compiler feedback layer on top of tests and linters, which gives reflection more concrete errors to work from.
220
-
221
- ### Pattern 5: Multi-Agent Reflection
222
-
223
- Rather than self-reflection, deploy two specialized agents:
224
- 1. **Generator Agent** - Prompted to produce outputs
225
- 2. **Critic Agent** - Prompted to provide constructive criticism
226
-
227
- **Flow:**
228
- ```
229
- Generator creates output
230
-
231
- Critic reviews and provides specific feedback:
232
- - What works well
233
- - What's missing or wrong
234
- - Specific improvement suggestions
235
-
236
- Generator receives critique
237
-
238
- Revised output
239
-
240
- (Loop up to N times or until satisfied)
241
- ```
242
-
243
- Benefits:
244
- - More diverse feedback (different reasoning path)
245
- - Can leverage specialized critic models
246
- - Dialogue creates interactive improvement
247
- - Often produces better results than self-reflection alone
248
-
249
- ---
250
-
251
- ## Reflection Prompt Templates
252
-
253
- ### Template 1: Basic Self-Critique (Lightweight)
254
-
255
- ```
256
- Original Task: [TASK]
257
-
258
- Your previous response:
259
- [RESPONSE]
260
-
261
- Please review your response for:
262
- 1. Accuracy - Is the information correct?
263
- 2. Completeness - Did you address all aspects of the task?
264
- 3. Clarity - Is it easy to understand?
265
- 4. Potential improvements - What could be better?
266
-
267
- Identify any issues and provide a revised response.
268
- ```
269
-
270
- **Cost**: Single additional LLM call
271
- **Best for**: Quick quality improvements, simple tasks
272
-
273
- ---
274
-
275
- ### Template 2: Error-Context Reflection (With Feedback)
276
-
277
- ```
278
- Original Task: [TASK]
279
-
280
- Your previous attempt:
281
- [PREVIOUS_RESPONSE]
282
-
283
- Error encountered: [ERROR_MESSAGE]
284
-
285
- Analysis: What specifically caused this error?
286
- - Root cause analysis
287
- - What assumption was wrong?
288
- - What information was missing?
289
-
290
- Revised approach: Provide a corrected solution that addresses the specific error.
291
- Explain your reasoning for why this approach will work better.
292
- ```
293
-
294
- **Cost**: Single additional LLM call with rich context
295
- **Best for**: Recovery from errors, learning from failures
296
-
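Template 2 can be filled programmatically so that the error message is never left out of the reflection prompt. A small sketch; `ErrorReflectionInput` and `buildErrorReflectionPrompt` are hypothetical names, and the wording simply mirrors the template above.

```typescript
interface ErrorReflectionInput {
  task: string;
  previousResponse: string;
  errorMessage: string;
}

// Fills the error-context template with the task, the failed attempt, and the error.
function buildErrorReflectionPrompt(input: ErrorReflectionInput): string {
  return [
    `Original Task: ${input.task}`,
    "",
    "Your previous attempt:",
    input.previousResponse,
    "",
    `Error encountered: ${input.errorMessage}`,
    "",
    "Analysis: What specifically caused this error?",
    "- Root cause analysis",
    "- What assumption was wrong?",
    "- What information was missing?",
    "",
    "Revised approach: Provide a corrected solution that addresses the specific error.",
    "Explain your reasoning for why this approach will work better.",
  ].join("\n");
}
```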
297
- ---
298
-
299
- ### Template 3: Expert Persona Reflection
300
-
301
- ```
302
- Original Task: [TASK]
303
-
304
- Response to evaluate:
305
- [RESPONSE]
306
-
307
- You are now a [EXPERT_ROLE: code reviewer | technical architect | quality assurance specialist].
308
- Review the above response from the perspective of [EXPERT_ROLE].
309
-
310
- Specifically evaluate:
311
- 1. [TECHNICAL_CRITERIA]
312
- 2. [BEST_PRACTICES]
313
- 3. [EDGE_CASES]
314
- 4. [PERFORMANCE/QUALITY_METRICS]
315
-
316
- Provide your expert assessment and specific improvements.
317
- ```
318
-
319
- **Cost**: Single additional LLM call
320
- **Best for**: Complex technical outputs, code, architectural decisions
321
-
322
- ---
323
-
324
- ### Template 4: Structured Reflection with Rubric
325
-
326
- ```
327
- Original Task: [TASK]
328
-
329
- Generated Output:
330
- [OUTPUT]
331
-
332
- Evaluation Rubric:
333
- 1. Requirement A: [DESCRIPTION]
334
- Status: ✓ Met / ✗ Not Met
335
- If not met, why?
336
-
337
- 2. Requirement B: [DESCRIPTION]
338
- Status: ✓ Met / ✗ Not Met
339
- If not met, why?
340
-
341
- [... for each requirement ...]
342
-
343
- Summary:
344
- - Which requirements were NOT met?
345
- - Specific fixes needed for each failure
346
- - Revised output addressing all requirements
347
-
348
- Provide corrected output that meets all requirements.
349
- ```
350
-
351
- **Cost**: Single additional LLM call
352
- **Best for**: Tasks with explicit criteria, validation-heavy workflows
353
-
354
- ---
355
-
356
- ### Template 5: Confidence-Triggered Reflection
357
-
358
- ```
359
- Original Task: [TASK]
360
-
361
- Your response:
362
- [RESPONSE]
363
-
364
- Before we proceed, please evaluate your own confidence:
365
- 1. How confident are you that this response is correct? (0-100%)
366
- 2. What aspects are you uncertain about?
367
- 3. What additional information would increase your confidence?
368
-
369
- If confidence < 80%:
370
- - Identify specific sources of uncertainty
371
- - Provide alternative approaches you considered
372
- - Suggest how to verify your answer
373
- - Offer a revised response with higher confidence
374
- ```
375
-
376
- **Cost**: Single additional call with conditional branching
377
- **Best for**: High-stakes decisions, complex problem-solving
378
-
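On the calling side, the "If confidence < 80%" branch needs the self-reported figure extracted from the reply. A rough sketch, assuming the model states its confidence as a percentage somewhere in the text; `parseConfidence` and `needsReflection` are hypothetical helpers, and treating a missing figure as "uncertain" is a design assumption.

```typescript
// Extracts the first "NN%" figure from a reply, or null if none is usable.
function parseConfidence(reply: string): number | null {
  const match = /(\d{1,3}(?:\.\d+)?)\s*%/.exec(reply);
  if (match === null) return null;
  const value = Number(match[1]);
  return Number.isFinite(value) && value <= 100 ? value : null;
}

// Reflect when confidence is below the threshold or could not be determined.
function needsReflection(confidence: number | null, thresholdPercent = 80): boolean {
  return confidence === null || confidence < thresholdPercent;
}
```

Self-reported confidence is only a heuristic signal, so it is usually paired with an external check where one is available.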
379
- ---
380
-
381
- ### Template 6: Multi-Turn Reflection Loop
382
-
383
- ```
384
- ROUND 1 - Initial Generation:
385
- [INITIAL_PROMPT]
386
-
387
- ROUND 2 - Self-Critique:
388
- "Review your response for: correctness, completeness, clarity, and efficiency.
389
- Identify specific issues."
390
-
391
- [CRITIQUE_FROM_PREVIOUS_ROUND]
392
-
393
- ROUND 3 - Improvement:
394
- "Based on the identified issues, provide an improved version.
395
- Explain what you changed and why."
396
-
397
- [CONTINUE_FOR_UP_TO_N_ROUNDS]
398
-
399
- Quality Checkpoint:
400
- Does current output meet all quality criteria? If yes, finalize.
401
- If no, continue round [N+1].
402
- ```
403
-
404
- **Cost**: Multiple LLM calls (3-5 typically)
405
- **Best for**: Complex writing, algorithm optimization, architectural design
406
-
407
- ---
408
-
409
- ## Existing Framework Approaches
410
-
411
- ### LangChain/LangGraph Reflection
412
-
413
- LangChain implements reflection through **LangGraph**, a stateful graph framework.
414
-
415
- **Three Core Patterns:**
416
-
417
- #### 1. Basic Reflection (MessageGraph)
418
- ```
419
- - State: List of messages
420
- - Generator Node: Produces initial responses
421
- - Reflector Node: Acts as "teacher" providing constructive criticism
422
- - Edges: Loop back up to N times
423
- ```
424
-
425
- #### 2. Reflexion Pattern
426
- ```
427
- - Generator produces draft
428
- - Tools are executed
429
- - Feedback captured
430
- - Revision happens with reflection context
431
- - Conditional loop based on iteration count
432
- ```
433
-
434
- #### 3. Language Agent Tree Search (LATS)
435
- ```
436
- - Combines reflection/evaluation with Monte Carlo tree search
437
- - Four steps: Select → Expand/Simulate → Reflect+Evaluate → Backpropagate
438
- - Uses StateGraph with tree-based exploration
439
- ```
440
-
441
- **Key Implementation Details:**
442
- - Uses `add_node()`, `add_edge()`, `add_conditional_edges()` in the Python API (`addNode()`, `addEdge()`, `addConditionalEdges()` in LangGraph.js)
443
- - State is shared data structure representing current snapshot
444
- - Nodes encode logic, perform computation, make LLM calls
445
- - Edges define next node based on current state
446
-
447
- **Trade-off**: Reflection requires additional computational time and resources. Each pattern trades latency for higher output quality. Not suitable for low-latency applications.
448
-
449
- ---
450
-
451
- ### Reflexion Framework (Shinn et al., 2023)
452
-
453
- **Design Philosophy**: Keep model frozen, use text-based feedback as reinforcement.
454
-
455
- **Components**:
456
- 1. **Actor** - Attempts task using Chain-of-Thought/ReAct with memory
457
- 2. **Evaluator** - Assigns reward scores to trajectories
458
- 3. **Self-Reflection** - Generates verbal feedback from rewards
459
-
460
- **Key Feature**: Reflexion forces explicit grounding in external data:
461
- - Must cite sources for claims
462
- - Explicitly enumerate superfluous aspects (what's wrong)
463
- - Explicitly enumerate missing aspects (what's needed)
464
-
465
- **Results**:
466
- - 91% pass@1 on HumanEval with GPT-4, versus 80% for GPT-4 without reflection
467
- - Strong performance on: AlfWorld (decision-making), HotPotQA (reasoning), HumanEval/MBPP (programming)
468
-
469
- **Best Use Cases**:
470
- - Iterative learning from mistakes
471
- - When traditional RL is impractical
472
- - Tasks where interpretability matters
473
- - Systems requiring nuanced feedback
474
-
475
- ---
476
-
477
- ### Claude/Anthropic Reflection Patterns
478
-
479
- **Philosophy**: Simplicity over complexity. Start with simple prompts, optimize through evaluation, add multi-step systems only when necessary.
480
-
481
- **Core Principles**:
482
- 1. Maintain simplicity in agent design
483
- 2. Prioritize transparency (show planning steps explicitly)
484
- 3. Carefully craft agent-computer interface (ACI) through tool documentation
485
-
486
- **Evaluator-Optimizer Workflow**:
487
- ```
488
- One LLM Call: Generates response
489
-
490
- Another LLM Call: Provides evaluation and feedback
491
-
492
- Loop: Iteratively refine
493
- ```
494
-
495
- **Most Effective When**:
496
- - Clear evaluation criteria exist
497
- - Iterative refinement provides measurable value
498
- - Not implementing complex internal reasoning
499
-
500
- **Extended Thinking Integration**:
501
- - Use extended thinking for complex reasoning within reflection
502
- - Interleaved mode: tool call → tool result → reflection thinking
503
- - Strongly prefer thinking block when uncertain
504
- - Enables "System 2" deliberative thinking in reflection phase
505
-
506
- **Feedback Approaches**:
507
- - **Rules-Based Feedback**: Define explicit rules, explain which failed and why
508
- - **Code Linting**: Type-checked languages (TypeScript) provide automatic feedback layers
509
- - **Sandbox Execution**: Run code to identify bugs
510
-
511
- ---
512
-
513
- ### AutoGPT Self-Correction
514
-
515
- **Approach**: Analyze feedback from errors and adjust strategy.
516
-
517
- **Core Mechanism**:
518
- ```
519
- Execute Step
520
-
521
- Evaluate Outcome
522
-
523
- If Failed:
524
- - Run reflection process
525
- - Diagnose failure points
526
- - Update strategy
527
- - Proceed
528
- ```
529
-
530
- **Key Features**:
531
- - Flexible automation with error analysis
532
- - Requires human oversight to prevent infinite loops
533
- - Handles many errors on client side
534
-
535
- **Recent Innovation: Retrials Without Feedback**
536
- Research shows "retrials without feedback" is effective:
537
- - Retry whenever incorrect answer identified
538
- - No explicit self-reflection needed
539
- - Continue until correct solution found or budget exhausted
540
- - Simpler than Reflexion, surprisingly effective
541
-
542
- ---
543
-
544
- ### Google Agent Development Kit (ADK) - Reflect & Retry
545
-
546
- **Technical Implementation**:
547
-
548
- **Core Mechanism**: Intercepts tool failures, provides structured guidance for correction, retries up to configurable limit.
549
-
550
- **Key Features**:
551
- - Concurrency-safe with locking mechanisms
552
- - Failure tracking per-invocation (default) or global across users
553
- - Custom error extraction by overriding detection methods
554
- - Supports both transient and logical errors
555
-
556
- **Configuration**:
557
- ```
558
- max_retries: 3 (default)
559
- throw_on_exceeded: true (default)
560
- failure_scope: per_invocation or global
561
- ```
562
-
563
- **Advanced Pattern**: Custom error detection
564
- ```
565
- Override extract_error_from_result() to identify:
566
- - HTTP status codes
567
- - Custom response fields
568
- - Error patterns in normal responses
569
- ```
570
-
571
- ---
572
-
573
- ## Best Practices & Guardrails
574
-
575
- ### Maximum Reflection Attempts
576
-
577
- **Common Practice**: Cap reflection at 3-5 attempts
578
-
579
- **Recommended Configuration**:
580
- ```
581
- - Basic Tasks (simple validation): 2 attempts
582
- - Standard Tasks (tool-based workflows): 3 attempts
583
- - Complex Reasoning: 4-5 attempts
584
- - Never exceed: 8 attempts
585
- ```
586
-
587
- **Guardrails to Prevent Loops**:
588
- 1. **Hard iteration limit**: `max_rounds` (fixed ceiling)
589
- 2. **No-progress detection**: Stop after K rounds with no improvement
590
- 3. **State-hash deduplication**: Exit if returning to previous state
591
- 4. **Cost budget**: Total token limit across all attempts
592
- 5. **Timeout mechanism**: Overall time limit (not just per-request)
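The five guardrails above can be combined into a single check that runs before each new round. The sketch below is one possible arrangement under stated assumptions: `LoopLimits`, `LoopState`, and `stopReason` are hypothetical names, the per-round quality score is assumed to come from some external evaluator, and the state hash is assumed to be computed by the caller.

```typescript
// Hypothetical limit and state shapes; none of these names come from a library.
interface LoopLimits {
  maxRounds: number;        // hard iteration ceiling
  noProgressRounds: number; // K: rounds allowed without improvement
  tokenBudget: number;      // total tokens across all attempts
  timeoutMs: number;        // overall wall-clock limit
}

interface LoopState {
  round: number;                // rounds completed so far
  scores: number[];             // one quality score per completed round
  seenStateHashes: Set<string>; // hashes of earlier states
  currentStateHash: string;     // hash of the state about to be retried
  tokensUsed: number;
  elapsedMs: number;
}

type StopReason =
  | "max_rounds"
  | "timeout"
  | "token_budget"
  | "repeated_state"
  | "no_progress";

// Returns why the loop should stop, or null if another round is allowed.
function stopReason(state: LoopState, limits: LoopLimits): StopReason | null {
  if (state.round >= limits.maxRounds) return "max_rounds";
  if (state.elapsedMs >= limits.timeoutMs) return "timeout";
  if (state.tokensUsed >= limits.tokenBudget) return "token_budget";
  if (state.seenStateHashes.has(state.currentStateHash)) return "repeated_state";

  // No progress: the best of the last K scores does not beat the best earlier score.
  const k = limits.noProgressRounds;
  if (state.scores.length > k) {
    const earlierBest = Math.max(...state.scores.slice(0, -k));
    const recentBest = Math.max(...state.scores.slice(-k));
    if (recentBest <= earlierBest) return "no_progress";
  }
  return null;
}
```

Returning a reason rather than a boolean makes it easier to log why a loop ended and to decide between escalation and fallback.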
593
-
594
- ### Error Handling Strategy
595
-
596
- **Distinguish Error Types**:
597
-
598
- | Error Type | Action | Retry? |
599
- |-----------|--------|--------|
600
- | Transient (timeout, rate limit) | Wait with exponential backoff | Yes (2-3x) |
601
- | Logical (wrong approach) | Reflect, change strategy | Yes (up to 3x) |
602
- | Invalid input (bad data) | Return error to user | No |
603
- | Model refusal | Accept result | No |
604
- | Permanent failure (API down) | Escalate/fallback | No |
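The table above maps naturally onto a lookup from error type to policy. The following is a sketch only: the type names, the `POLICIES` constant, and `shouldRetry` are hypothetical, and the retry counts take the upper bounds from the table.

```typescript
// Hypothetical error categories mirroring the rows of the table.
type ErrorType =
  | "transient"
  | "logical"
  | "invalid_input"
  | "model_refusal"
  | "permanent";

interface RetryPolicy {
  action:
    | "backoff_and_retry"
    | "reflect_and_retry"
    | "return_error"
    | "accept_result"
    | "escalate";
  maxRetries: number; // 0 means never retry
}

const POLICIES: Record<ErrorType, RetryPolicy> = {
  transient: { action: "backoff_and_retry", maxRetries: 3 },
  logical: { action: "reflect_and_retry", maxRetries: 3 },
  invalid_input: { action: "return_error", maxRetries: 0 },
  model_refusal: { action: "accept_result", maxRetries: 0 },
  permanent: { action: "escalate", maxRetries: 0 },
};

// True if another attempt is allowed for this error type.
function shouldRetry(type: ErrorType, retriesSoFar: number): boolean {
  return retriesSoFar < POLICIES[type].maxRetries;
}
```

Classifying a raw exception into one of these categories is the harder part and is deliberately left out of the sketch.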
605
-
606
- **Backoff Strategies**:
607
- - **Constant Backoff**: Fixed delay (e.g., 1 second)
608
- - **Exponential Backoff**: Delay doubles each attempt
609
- - **Jittered Backoff**: Add randomness to prevent thundering herd
610
-
611
- Example exponential backoff:
612
- ```
613
- Attempt 1: Retry immediately
614
- Attempt 2: Wait 1 second
615
- Attempt 3: Wait 2 seconds
616
- Attempt 4: Wait 4 seconds
617
- Attempt 5: Wait 8 seconds
618
- ```
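The schedule above, plus the jittered variant, can be computed with two small functions. This is a sketch under assumptions: the function names are hypothetical, the 30-second cap is an illustrative value not taken from the text, and the jitter shown is the "full jitter" variant (a uniform random delay between zero and the exponential delay).

```typescript
// Delay before a given attempt (1-based), matching the schedule above:
// attempt 1 runs immediately, attempt 2 waits baseMs, then the delay doubles.
// capMs is an assumed upper bound so the delay cannot grow without limit.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 30000): number {
  if (attempt <= 1) return 0;
  return Math.min(capMs, baseMs * 2 ** (attempt - 2));
}

// Full jitter: a uniform random delay in [0, exponential delay).
// The random source is injectable so the function can be tested deterministically.
function jitteredDelayMs(
  attempt: number,
  random: () => number = Math.random,
  baseMs = 1000,
  capMs = 30000
): number {
  return Math.floor(random() * backoffDelayMs(attempt, baseMs, capMs));
}
```

Jitter matters when many clients fail at the same moment: without it they all retry on the same schedule and hit the service together again.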
619
-
620
- ### Success Criteria Matter
621
-
622
- Clear success criteria prevent infinite loops:
623
-
624
- **Bad**: "Fix the bug", "optimize the database", "improve the response"
625
- **Good**: "Make test_user_login pass", "reduce query time below 100ms", "increase BLEU score to 0.85+"
626
-
627
- ### State Capture Before Reflection
628
-
629
- **What to Capture**:
630
- 1. **Input Context**: Original request, parameters, user intent
631
- 2. **Execution Snapshot**: Current state at failure point
632
- 3. **Error Details**: Exception, error code, message
633
- 4. **Attempt History**: What was tried before, outcomes
634
- 5. **Decision Metadata**: Why each choice was made, confidence level
635
-
636
- **Storage Strategy**:
637
- - Use lightweight JSON objects
638
- - Store in Redis with expiration matching workflow duration
639
- - Separate learned patterns from temporary processing state
640
- - Keep reasoning chain (why decisions were made) separate
641
-
642
- ---
643
-
644
- ## When NOT to Reflect
645
-
646
- ### Scenarios to Avoid Reflection
647
-
648
- #### 1. **Low-Stakes, High-Velocity Tasks**
649
- - Real-time chat responses
650
- - Autocomplete suggestions
651
- - Quick lookups
652
- - Requirements: <100ms latency
653
-
654
- **Cost/Benefit**: Cost of reflection exceeds value of marginal improvement
655
-
656
- #### 2. **Well-Understood, Deterministic Workflows**
657
- - Simple CRUD operations
658
- - Predictable data transformations
659
- - Tasks with 99%+ baseline accuracy
660
-
661
- **Cost/Benefit**: No errors to fix, reflection wastes tokens
662
-
663
- #### 3. **Clear Model Refusals**
664
- - User asks model to do something against policies
665
- - Model refuses for safety reasons
666
- - No reflection can change this outcome
667
-
668
- **Cost/Benefit**: Reflection won't help
669
-
670
- #### 4. **Ambiguous User Input Without Clarification**
671
- - User request is unclear
672
- - Model can't determine intent
673
-
674
- **Better approach**: Ask clarifying questions instead of reflecting
675
-
676
- #### 5. **High-Confidence Outputs with Good Validation**
677
- - Model is highly confident
678
- - Output passes all validation checks
679
- - Tests confirm correctness
680
-
681
- **Cost/Benefit**: Reflection adds latency with no benefit
682
-
683
- #### 6. **Token Budget Constraints**
684
- - Limited tokens remaining in context window
685
- - Reflection would consume majority of remaining budget
686
-
687
- **Cost/Benefit**: Can't afford the cost
688
-
689
- #### 7. **Cascading Failures**
690
- - Reflection failure causes downstream failures
691
- - Loop detection shows same error pattern repeating
692
-
693
- **Better approach**: Escalate to human or fallback
694
-
695
- ### Performance Impact
696
-
697
- **Cost of Reflection**:
698
- - Each reflection attempt = ~1 additional LLM call
699
- - Latency: +200-2000ms per reflection (depends on model)
700
- - Cost: +1x-2x per reflection (depending on output length)
701
-
702
- **When Cost Justifies Benefit**:
703
- - High-value decisions (code generation, critical business logic)
704
- - Complex reasoning tasks
705
- - Where 10-30% improvement is meaningful
706
- - Users can accept a 2-5x latency increase
707
-
708
- ### Confidence-Based Thresholds
709
-
710
- **Reflection Triggers**:
711
- - Model confidence < 70%: Trigger reflection
712
- - Model confidence 70-85%: Optional reflection
713
- - Model confidence > 85%: Skip reflection
714
-
715
- **Implementation**:
716
- - Use model's own uncertainty assessment
717
- - Look for uncertainty the model expresses in its extended thinking output
718
- - Monitor chain-of-thought for hedging language
719
- - Track prediction confidence scores
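The three bands listed under Reflection Triggers can be expressed as a small decision function. This is a sketch: the names are hypothetical, confidence is assumed to be a number between 0 and 1, and both boundary values (0.70 and 0.85) are assigned to the "optional" band, a choice the text leaves open.

```typescript
type ReflectionDecision = "reflect" | "optional" | "skip";

// Maps a confidence score in [0, 1] to the trigger bands above.
function reflectionDecision(confidence: number): ReflectionDecision {
  if (confidence < 0.7) return "reflect";
  if (confidence <= 0.85) return "optional";
  return "skip";
}
```

How the "optional" band is resolved is a policy decision; one option is to reflect only when the remaining token budget and latency target allow it.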
720
-
721
- ---
722
-
723
- ## State Capture Patterns
724
-
725
- ### Pre-Reflection State Snapshot
726
-
727
- Capture critical state **before** attempting reflection:
728
-
729
- ```json
730
- {
731
- "attempt_number": 1,
732
- "timestamp": "2025-12-08T12:34:56Z",
733
- "input": {
734
- "user_request": "...",
735
- "context": "...",
736
- "parameters": {...}
737
- },
738
- "generation": {
739
- "output": "...",
740
- "model": "claude-opus-4.5",
741
- "tokens_used": 245,
742
- "confidence": 0.65
743
- },
744
- "validation": {
745
- "passed": false,
746
- "violations": ["format_check_failed", "logic_error"],
747
- "error_message": "..."
748
- },
749
- "error_context": {
750
- "type": "logical_error",
751
- "details": "..."
752
- }
753
- }
754
- ```
755
-
756
- ### Reasoning Chain Logging
757
-
758
- Separate reasoning metadata from content:
759
-
760
- ```json
761
- {
762
- "decision_point": "tool_selection",
763
- "options_considered": ["approach_a", "approach_b", "approach_c"],
764
- "chosen": "approach_a",
765
- "reasoning": "Approach A is more efficient because...",
766
- "confidence": 0.72,
767
- "alternative_rationale": "Approach B would work but...",
768
- "risk_factors": ["potential_timeout", "edge_case_handling"]
769
- }
770
- ```
771
-
772
- Benefits:
773
- - Recovery doesn't re-analyze the same information
774
- - Next attempt picks up decision trail where it left off
775
- - Provides context for reflection prompts
776
-
777
- ### Memory State Preservation
778
-
779
- Distinguish learned patterns from temporary state:
780
-
781
- ```json
782
- {
783
- "learned_patterns": {
784
- "document_structure_insights": ["..."],
785
- "user_preferences": ["..."],
786
- "error_recovery_strategies": ["..."]
787
- },
788
- "temporary_state": {
789
- "current_task_context": "...",
790
- "current_output": "...",
791
- "current_attempt": 2
792
- }
793
- }
794
- ```
795
-
796
- **Key principle**: When individual tasks fail, preserve learned insights while resetting temporary state.
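The key principle above can be sketched as a pure function over the JSON shape shown in this section. The interface mirrors that example; the function name `resetAfterFailure` and the optional `newInsight` parameter are hypothetical additions for illustration.

```typescript
// Mirrors the JSON structure shown above.
interface AgentMemory {
  learned_patterns: {
    document_structure_insights: string[];
    user_preferences: string[];
    error_recovery_strategies: string[];
  };
  temporary_state: {
    current_task_context: string;
    current_output: string;
    current_attempt: number;
  };
}

// Returns a new memory object: learned patterns are kept (optionally extended
// with one new recovery insight), temporary state is cleared.
// The input object is not mutated.
function resetAfterFailure(memory: AgentMemory, newInsight?: string): AgentMemory {
  const strategies = memory.learned_patterns.error_recovery_strategies;
  return {
    learned_patterns: {
      ...memory.learned_patterns,
      error_recovery_strategies: newInsight ? [...strategies, newInsight] : strategies,
    },
    temporary_state: {
      current_task_context: "",
      current_output: "",
      current_attempt: 0,
    },
  };
}
```

Keeping the two halves in separate sub-objects is what makes the reset a one-line replacement rather than a field-by-field cleanup.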
797
-
798
- ### State for Error Recovery
799
-
800
- Include information needed for intelligent retry:
801
-
802
- ```json
803
- {
804
- "failed_attempt": {
805
- "approach": "web_search_strategy",
806
- "output": "...",
807
- "error": "timeout"
808
- },
809
- "recovery_context": {
810
- "what_worked_before": [
811
- {"approach": "api_call", "result": "success"},
812
- {"approach": "local_cache", "result": "cache_miss"}
813
- ],
814
- "what_failed": [
815
- {"approach": "web_search", "reason": "timeout"}
816
- ],
817
- "suggestion": "Try API call approach next"
818
- }
819
- }
820
- ```
821
-
822
- ---
823
-
824
- ## Code Implementation Examples
825
-
826
- ### Example 1: Basic Error-Reflection-Retry Loop (TypeScript)
827
-
828
- ```typescript
829
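- import Anthropic from "@anthropic-ai/sdk";
-
- // Shared client used by the examples in this section; reads ANTHROPIC_API_KEY from the environment.
- const client = new Anthropic();
-
- interface ReflectionState {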
830
- attempt: number;
831
- maxAttempts: number;
832
- lastError: string | null;
833
- attemptHistory: Array<{
834
- approach: string;
835
- result: string;
836
- error: string | null;
837
- }>;
838
- }
839
-
840
- async function executeWithReflection(
841
- task: string,
842
- maxAttempts: number = 3
843
- ): Promise<string> {
844
- const state: ReflectionState = {
845
- attempt: 0,
846
- maxAttempts,
847
- lastError: null,
848
- attemptHistory: [],
849
- };
850
-
851
- while (state.attempt < state.maxAttempts) {
852
- state.attempt++;
853
-
854
- try {
855
- // Step 1: Generate solution
856
- const solution = await generateSolution(task, state.attemptHistory);
857
-
858
- // Step 2: Validate
859
- const validation = validateOutput(solution);
860
- if (validation.isValid) {
861
- return solution;
862
- }
863
-
864
- // Step 3: Reflect on failure
865
- state.lastError = validation.errors.join("; ");
866
- const reflection = await reflectOnFailure(
867
- task,
868
- solution,
869
- validation.errors,
870
- state.attemptHistory
871
- );
872
-
873
- // Step 4: Update history
874
- state.attemptHistory.push({
875
- approach: reflection.suggestedApproach,
876
- result: solution,
877
- error: state.lastError,
878
- });
879
-
880
- } catch (error) {
881
- state.lastError = String(error);
882
-
883
- // Attempt recovery reflection
884
- const recovery = await reflectOnError(task, error, state.attemptHistory);
885
- state.attemptHistory.push({
886
- approach: recovery.suggestedApproach,
887
- result: "",
888
- error: state.lastError,
889
- });
890
- }
891
- }
892
-
893
- throw new Error(
894
- `Failed after ${state.maxAttempts} attempts. ` +
895
- `Last error: ${state.lastError}`
896
- );
897
- }
898
-
899
- async function generateSolution(
900
- task: string,
901
- history: ReflectionState["attemptHistory"]
902
- ): Promise<string> {
903
- const historyContext = history.length > 0
904
- ? `Previous attempts:\n${history
905
- .map((h, i) => `Attempt ${i + 1} (${h.approach}): ${h.error || "failed"}`)
906
- .join("\n")}\n`
907
- : "";
908
-
909
- const response = await client.messages.create({
910
- model: "claude-opus-4-5",
911
- max_tokens: 1024,
912
- messages: [
913
- {
914
- role: "user",
915
- content: `${historyContext}\nTask: ${task}\n\nGenerate a solution.`,
916
- },
917
- ],
918
- });
919
-
920
- return response.content[0].type === "text" ? response.content[0].text : "";
921
- }
922
-
923
- async function reflectOnFailure(
924
- task: string,
925
- solution: string,
926
- errors: string[],
927
- history: ReflectionState["attemptHistory"]
928
- ): Promise<{ suggestedApproach: string }> {
929
- const response = await client.messages.create({
930
- model: "claude-opus-4-5",
931
- max_tokens: 512,
932
- messages: [
933
- {
934
- role: "user",
935
- content: `Task: ${task}
936
-
937
- Your previous solution failed with these issues:
938
- ${errors.map((e) => `- ${e}`).join("\n")}
939
-
940
- Previous solution:
941
- ${solution}
942
-
943
- Analyze what went wrong and suggest a different approach that would avoid these issues.`,
944
- },
945
- ],
946
- });
947
-
948
- return {
949
- suggestedApproach:
950
- response.content[0].type === "text" ? response.content[0].text : "",
951
- };
952
- }
953
-
954
- async function reflectOnError(
955
- task: string,
956
- error: unknown,
957
- history: ReflectionState["attemptHistory"]
958
- ): Promise<{ suggestedApproach: string }> {
959
- // Similar to reflectOnFailure but handles exceptions
960
- return {
961
- suggestedApproach: `Error recovery strategy after: ${String(error)}`,
962
- };
963
- }
964
-
965
- function validateOutput(output: string): {
966
- isValid: boolean;
967
- errors: string[];
968
- } {
969
- const errors: string[] = [];
970
-
971
- if (!output || output.trim().length === 0) {
972
- errors.push("Output is empty");
973
- }
974
-
975
- if (output.length < 10) {
976
- errors.push("Output is too short");
977
- }
978
-
979
- return {
980
- isValid: errors.length === 0,
981
- errors,
982
- };
983
- }
984
- ```
985
-
986
- ---
987
-
988
- ### Example 2: Instruction-Following Validation Pattern
989
-
990
- ```typescript
991
- interface ValidationRule {
992
- name: string;
993
- validate: (value: any) => boolean;
994
- errorMessage: string;
995
- }
996
-
997
- interface InstructionFollowingEvaluator {
998
- rules: ValidationRule[];
999
- maxRetries: number;
1000
- }
1001
-
1002
- async function validateWithIFE(
1003
- task: string,
1004
- evaluator: InstructionFollowingEvaluator
1005
- ): Promise<string> {
1006
- let retries = 0;
1007
-
1008
- while (retries < evaluator.maxRetries) {
1009
- // Generate output
1010
- const output = await generateOutput(task);
1011
-
1012
- // Check each rule
1013
- const violations: string[] = [];
1014
- for (const rule of evaluator.rules) {
1015
- if (!rule.validate(output)) {
1016
- violations.push(rule.errorMessage);
1017
- }
1018
- }
1019
-
1020
- // If all rules pass, return
1021
- if (violations.length === 0) {
1022
- return output;
1023
- }
1024
-
1025
- // If violations, refine and retry
1026
- retries++;
1027
- if (retries < evaluator.maxRetries) {
1028
- const refinedTask = await refinePromptWithViolations(
1029
- task,
1030
- output,
1031
- violations
1032
- );
1033
- task = refinedTask;
1034
- } else {
1035
- throw new Error(
1036
- `Validation failed after ${evaluator.maxRetries} attempts. ` +
1037
- `Violations: ${violations.join("; ")}`
1038
- );
1039
- }
1040
- }
1041
-
1042
- throw new Error("Unexpected error in IFE validation");
1043
- }
1044
-
1045
- async function generateOutput(task: string): Promise<string> {
1046
- const response = await client.messages.create({
1047
- model: "claude-opus-4-5",
1048
- max_tokens: 1024,
1049
- messages: [{ role: "user", content: task }],
1050
- });
1051
-
1052
- return response.content[0].type === "text" ? response.content[0].text : "";
1053
- }
1054
-
1055
- async function refinePromptWithViolations(
1056
- originalTask: string,
1057
- output: string,
1058
- violations: string[]
1059
- ): Promise<string> {
1060
- const response = await client.messages.create({
1061
- model: "claude-opus-4-5",
1062
- max_tokens: 512,
1063
- messages: [
1064
- {
1065
- role: "user",
1066
- content: `Original task: ${originalTask}
1067
-
1068
- Your output failed these validation rules:
1069
- ${violations.map((v) => `- ${v}`).join("\n")}
1070
-
1071
- Your output was:
1072
- ${output}
1073
-
1074
- Revise the task/instructions to ensure the next attempt will satisfy all rules.`,
1075
- },
1076
- ],
1077
- });
1078
-
1079
- return response.content[0].type === "text" ? response.content[0].text : "";
1080
- }
1081
-
1082
- // Example usage with specific validation rules
1083
- const codeEvaluator: InstructionFollowingEvaluator = {
1084
- maxRetries: 3,
1085
- rules: [
1086
- {
1087
- name: "valid_syntax",
1088
- validate: (code) => {
1089
- try {
1090
- // Placeholder heuristic, not a real syntax check; swap in a parser or compiler call
1091
- return code.includes("function") || code.includes("const");
1092
- } catch {
1093
- return false;
1094
- }
1095
- },
1096
- errorMessage: "Code must have valid TypeScript syntax",
1097
- },
1098
- {
1099
- name: "includes_tests",
1100
- validate: (code) => code.includes("test") || code.includes("describe"),
1101
- errorMessage: "Code must include test cases",
1102
- },
1103
- {
1104
- name: "has_comments",
1105
- validate: (code) => code.includes("//") || code.includes("/*"),
1106
- errorMessage: "Code must include comments",
1107
- },
1108
- ],
1109
- };
1110
- ```
1111
-
1112
- ---
1113
-
1114
- ### Example 3: Reflexion-Style Architecture
1115
-
1116
- ```typescript
1117
- interface ReflexionState {
1118
- task: string;
1119
- trajectory: string;
1120
- reward: number;
1121
- reflection: string;
1122
- nextAttempt: string;
1123
- }
1124
-
1125
- class ReflexionAgent {
1126
- private actor: LLMClient;
1127
- private evaluator: (output: string) => number;
1128
- private reflector: ReflectorModel;
1129
- private memory: ReflexionState[] = [];
1130
-
1131
- async runReflexion(task: string, maxIterations: number = 3): Promise<string> {
1132
- let currentTask = task;
1133
-
1134
- for (let i = 0; i < maxIterations; i++) {
1135
- // Step 1: Actor generates trajectory
1136
- const trajectory = await this.actor.generate(currentTask);
1137
-
1138
- // Step 2: Evaluator assigns reward
1139
- const reward = this.evaluator(trajectory);
1140
-
1141
- // Step 3: Reflector generates feedback
1142
- const reflection = await this.reflector.generateReflection(
1143
- task,
1144
- trajectory,
1145
- reward,
1146
- this.memory
1147
- );
1148
-
1149
- // Step 4: Store in memory
1150
- const state: ReflexionState = {
1151
- task,
1152
- trajectory,
1153
- reward,
1154
- reflection,
1155
- nextAttempt: "",
1156
- };
1157
- this.memory.push(state);
1158
-
1159
- // Step 5: Use reflection to improve next attempt
1160
- if (reward > 0.8) {
1161
- // Good enough, return
1162
- return trajectory;
1163
- }
1164
-
1165
- // Prepare for next iteration with reflection context
1166
- currentTask = `${task}
1167
-
1168
- Previous attempt feedback:
1169
- ${reflection}
1170
-
1171
- Generate an improved solution that addresses the feedback above.`;
1172
- }
1173
-
1174
- return this.memory[this.memory.length - 1].trajectory;
1175
- }
1176
- }
1177
-
1178
- class ReflectorModel {
1179
- private client: LLMClient;
1180
-
1181
- async generateReflection(
1182
- task: string,
1183
- trajectory: string,
1184
- reward: number,
1185
- memory: ReflexionState[]
1186
- ): Promise<string> {
1187
- const memoryContext =
1188
- memory.length > 0
1189
- ? `Previous attempts and feedback:\n${memory
1190
- .slice(-2)
1191
- .map((m) => `Reward: ${m.reward}\nFeedback: ${m.reflection}`)
1192
- .join("\n\n")}\n`
1193
- : "";
1194
-
1195
- const response = await this.client.messages.create({
1196
- model: "claude-opus-4-5",
1197
- max_tokens: 512,
1198
- messages: [
1199
- {
1200
- role: "user",
1201
- content: `Task: ${task}
1202
-
1203
- ${memoryContext}
1204
-
1205
- Current attempt (reward score: ${reward}):
1206
- ${trajectory}
1207
-
1208
- Evaluate this attempt:
1209
- 1. What did it do well?
1210
- 2. What are the specific failures or issues?
1211
- 3. What should be tried differently in the next attempt?
1212
- 4. What patterns from previous attempts should be avoided?
1213
-
1214
- Format your response as structured verbal feedback.`,
1215
- },
1216
- ],
1217
- });
1218
-
1219
- return response.content[0].type === "text" ? response.content[0].text : "";
1220
- }
1221
- }
1222
- ```
1223
-
1224
- ---
1225
-
1226
- ### Example 4: Multi-Agent Reflection
1227
-
1228
- ```typescript
1229
- class MultiAgentReflection {
1230
- private generator: LLMClient;
1231
- private critic: LLMClient;
1232
-
1233
- async reflectiveGeneration(
1234
- task: string,
1235
- maxRounds: number = 3
1236
- ): Promise<string> {
1237
- let currentOutput = await this.generate(task);
1238
-
1239
- for (let round = 1; round < maxRounds; round++) {
1240
- // Get critique
1241
- const critique = await this.critique(task, currentOutput);
1242
-
1243
- if (critique.isSatisfactory) {
1244
- return currentOutput;
1245
- }
1246
-
1247
- // Generate improvement
1248
- currentOutput = await this.improve(
1249
- task,
1250
- currentOutput,
1251
- critique.feedback,
1252
- critique.suggestions
1253
- );
1254
- }
1255
-
1256
- return currentOutput;
1257
- }
1258
-
1259
- async generate(task: string): Promise<string> {
1260
- const response = await this.generator.messages.create({
1261
- model: "claude-opus-4-5",
1262
- max_tokens: 1024,
1263
- messages: [{ role: "user", content: task }],
1264
- });
1265
-
1266
- return response.content[0].type === "text" ? response.content[0].text : "";
1267
- }
1268
-
1269
- async improve(
1270
- task: string,
1271
- currentOutput: string,
1272
- feedback: string,
1273
- suggestions: string[]
1274
- ): Promise<string> {
1275
- const response = await this.generator.messages.create({
1276
- model: "claude-opus-4-5",
1277
- max_tokens: 1024,
1278
- messages: [
1279
- {
1280
- role: "user",
1281
- content: `Task: ${task}
1282
-
1283
- Current output:
1284
- ${currentOutput}
1285
-
1286
- Feedback from review:
1287
- ${feedback}
1288
-
1289
- Specific improvements to make:
1290
- ${suggestions.map((s) => `- ${s}`).join("\n")}
1291
-
1292
- Provide an improved version that addresses all feedback.`,
1293
- },
1294
- ],
1295
- });
1296
-
1297
- return response.content[0].type === "text" ? response.content[0].text : "";
1298
- }
1299
-
1300
- async critique(
1301
- task: string,
1302
- output: string
1303
- ): Promise<{
1304
- isSatisfactory: boolean;
1305
- feedback: string;
1306
- suggestions: string[];
1307
- }> {
1308
- const response = await this.critic.messages.create({
1309
- model: "claude-opus-4-5",
1310
- max_tokens: 512,
1311
- messages: [
1312
- {
1313
- role: "user",
1314
- content: `You are a critical reviewer. Evaluate this response:
1315
-
1316
- Task: ${task}
1317
-
1318
- Response:
1319
- ${output}
1320
-
1321
- Provide:
1322
- 1. Overall assessment (satisfactory or needs improvement)
1323
- 2. Specific issues with the current response
1324
- 3. Concrete suggestions for improvement
1325
-
1326
- Format as JSON: { "isSatisfactory": boolean, "feedback": string, "suggestions": string[] }`,
1327
- },
1328
- ],
1329
- });
1330
-
1331
- const text =
1332
- response.content[0].type === "text" ? response.content[0].text : "{}";
1333
- return JSON.parse(text);
1334
- }
1335
- }
1336
- ```
1337
-
1338
- ---
1339
-
1340
- ### Example 5: Confidence-Based Reflection Trigger
1341
-
1342
- ```typescript
1343
- interface ConfidenceMetadata {
1344
- overallConfidence: number;
1345
- uncertaintyAreas: string[];
1346
- alternativesConsidered: string[];
1347
- }
1348
-
1349
- async function confidenceBasedReflection(
1350
- task: string,
1351
- confidenceThreshold: number = 0.75
1352
- ): Promise<string> {
1353
- const response = await client.messages.create({
1354
- model: "claude-opus-4-5",
1355
- max_tokens: 1024,
1356
- messages: [
1357
- {
1358
- role: "user",
1359
- content: `${task}
1360
-
1361
- After your response, provide a JSON block with confidence metadata:
1362
- {
1363
- "overallConfidence": <0-1>,
1364
- "uncertaintyAreas": ["area1", "area2"],
1365
- "alternativesConsidered": ["alternative1", "alternative2"]
1366
- }`,
1367
- },
1368
- ],
1369
- });
1370
-
1371
- const text =
1372
- response.content[0].type === "text" ? response.content[0].text : "";
1373
-
1374
- // Extract response and metadata
1375
- const jsonMatch = text.match(/\{[\s\S]*\}/);
1376
- const metadata: ConfidenceMetadata = jsonMatch
1377
- ? JSON.parse(jsonMatch[0])
1378
- : { overallConfidence: 0.5, uncertaintyAreas: [], alternativesConsidered: [] };
1379
-
1380
- // If confidence too low, reflect
1381
- if (metadata.overallConfidence < confidenceThreshold) {
1382
- const reflection = await reflectWithLowConfidence(
1383
- task,
1384
- text,
1385
- metadata
1386
- );
1387
- return reflection;
1388
- }
1389
-
1390
- return text;
1391
- }
1392
-
1393
- async function reflectWithLowConfidence(
1394
- task: string,
1395
- initialResponse: string,
1396
- metadata: ConfidenceMetadata
1397
- ): Promise<string> {
1398
- const response = await client.messages.create({
1399
- model: "claude-opus-4-5",
1400
- max_tokens: 1024,
1401
- messages: [
1402
- {
1403
- role: "user",
1404
- content: `Original task: ${task}
1405
-
1406
- Your previous response (confidence: ${metadata.overallConfidence}):
1407
- ${initialResponse}
1408
-
1409
- You indicated uncertainty in these areas:
1410
- ${metadata.uncertaintyAreas.map((a) => `- ${a}`).join("\n")}
1411
-
1412
- You considered these alternatives:
1413
- ${metadata.alternativesConsidered.map((a) => `- ${a}`).join("\n")}
1414
-
1415
- Given your own identified uncertainties:
1416
- 1. Identify what specific information would increase your confidence
1417
- 2. Provide a revised response that addresses these uncertainty areas
1418
- 3. Explain how your revised response is more robust`,
1419
- },
1420
- ],
1421
- });
1422
-
1423
- return response.content[0].type === "text" ? response.content[0].text : "";
1424
- }
1425
- ```
1426
-
1427
- ---
1428
-
1429
- ## Summary & Key Takeaways
1430
-
1431
- ### What Is Reflection?
1432
- Self-reflection in AI agents enables error detection, analysis, and correction without human intervention. It's a three-phase cycle: generate → analyze → improve.
1433
-
1434
- ### When to Implement Reflection
1435
- - **Error recovery**: When outputs fail validation
1436
- - **Iterative refinement**: Complex tasks needing multiple passes
1437
- - **High-stakes decisions**: Code generation, critical logic
1438
- - **Low-confidence outputs**: When model expresses uncertainty
1439
-
1440
- ### When NOT to Implement Reflection
1441
- - Low-latency requirements (<100ms)
1442
- - Simple, deterministic tasks
1443
- - Well-understood workflows with 99%+ baseline accuracy
1444
- - Clear model refusals
1445
- - Token budget constraints
1446
-
1447
- ### Best Practices
1448
- 1. **Limit retries**: 3-5 attempts maximum, never unlimited
1449
- 2. **Clear success criteria**: Specific, measurable goals (not vague)
1450
- 3. **State capture**: Preserve context for intelligent retry
1451
- 4. **Error categorization**: Different strategies for different error types
1452
- 5. **Backoff strategies**: Exponential backoff for transient errors
1453
- 6. **Avoid reflection loops**: Use state deduplication and progress detection
1454
-
1455
- ### Implementation Hierarchy
1456
- 1. Start simple: Basic self-critique in prompts
1457
- 2. Add validation: Explicit output rules
1458
- 3. Multi-attempt: Error-reflection-retry loop (3 attempts)
1459
- 4. Tool-enhanced: Use linters, tests, execution for feedback
1460
- 5. Multi-agent: Deploy separate critic for complex tasks
1461
- 6. Full Reflexion: Only if the simpler approaches above are insufficient
1462
-
1463
- ### Framework Selection
1464
- - **LangChain/LangGraph**: Pre-built reflection patterns, good for graph-based workflows
1465
- - **Anthropic/Claude**: Emphasis on simplicity, extended thinking for reflection
1466
- - **Google ADK**: Specialized reflect-and-retry plugin
1467
- - **Custom**: Lightweight TypeScript patterns for specific needs
1468
-