@shakudo/kaji-setup-external 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (411)
  1. package/README.md +155 -0
  2. package/assets/skills/ci-cd/.claude-plugin/plugin.json +8 -0
  3. package/assets/skills/ci-cd/SKILL.md +573 -0
  4. package/assets/skills/ci-cd/assets/templates/github-actions/docker-build.yml +164 -0
  5. package/assets/skills/ci-cd/assets/templates/github-actions/go-ci.yml +420 -0
  6. package/assets/skills/ci-cd/assets/templates/github-actions/node-ci.yml +313 -0
  7. package/assets/skills/ci-cd/assets/templates/github-actions/python-ci.yml +388 -0
  8. package/assets/skills/ci-cd/assets/templates/github-actions/security-scan.yml +416 -0
  9. package/assets/skills/ci-cd/assets/templates/gitlab-ci/docker-build.yml +298 -0
  10. package/assets/skills/ci-cd/assets/templates/gitlab-ci/go-ci.yml +548 -0
  11. package/assets/skills/ci-cd/assets/templates/gitlab-ci/node-ci.yml +334 -0
  12. package/assets/skills/ci-cd/assets/templates/gitlab-ci/python-ci.yml +472 -0
  13. package/assets/skills/ci-cd/assets/templates/gitlab-ci/security-scan.yml +479 -0
  14. package/assets/skills/ci-cd/references/best_practices.md +675 -0
  15. package/assets/skills/ci-cd/references/devsecops.md +862 -0
  16. package/assets/skills/ci-cd/references/optimization.md +651 -0
  17. package/assets/skills/ci-cd/references/security.md +611 -0
  18. package/assets/skills/ci-cd/references/troubleshooting.md +656 -0
  19. package/assets/skills/ci-cd/scripts/ci_health.py +301 -0
  20. package/assets/skills/ci-cd/scripts/pipeline_analyzer.py +440 -0
  21. package/assets/skills/context-optimization/CONTRIBUTING.md +78 -0
  22. package/assets/skills/context-optimization/LICENSE +22 -0
  23. package/assets/skills/context-optimization/README.md +228 -0
  24. package/assets/skills/context-optimization/SKILL.md +104 -0
  25. package/assets/skills/context-optimization/docs/agentskills.md +1264 -0
  26. package/assets/skills/context-optimization/docs/blogs.md +1230 -0
  27. package/assets/skills/context-optimization/docs/claude_research.md +85 -0
  28. package/assets/skills/context-optimization/docs/compression.md +298 -0
  29. package/assets/skills/context-optimization/docs/gemini_research.md +22 -0
  30. package/assets/skills/context-optimization/docs/hncapsule.md +92 -0
  31. package/assets/skills/context-optimization/docs/netflix_context.md +10 -0
  32. package/assets/skills/context-optimization/docs/vercel_tool.md +140 -0
  33. package/assets/skills/context-optimization/examples/book-sft-pipeline/README.md +78 -0
  34. package/assets/skills/context-optimization/examples/book-sft-pipeline/SKILL.md +380 -0
  35. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/README.md +168 -0
  36. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/dataset_sample.jsonl +5 -0
  37. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/pangram/Screenshot 2025-12-27 at 3.05.04 AM.png +0 -0
  38. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/pangram/Screenshot 2025-12-27 at 3.05.36 AM.png +0 -0
  39. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/pangram/Screenshot 2025-12-27 at 3.07.18 AM.png +0 -0
  40. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/sample_outputs.md +63 -0
  41. package/assets/skills/context-optimization/examples/book-sft-pipeline/examples/gertrude-stein/training_config.json +80 -0
  42. package/assets/skills/context-optimization/examples/book-sft-pipeline/references/segmentation-strategies.md +324 -0
  43. package/assets/skills/context-optimization/examples/book-sft-pipeline/references/tinker-format.md +211 -0
  44. package/assets/skills/context-optimization/examples/book-sft-pipeline/references/tinker.txt +3176 -0
  45. package/assets/skills/context-optimization/examples/book-sft-pipeline/scripts/pipeline_example.py +187 -0
  46. package/assets/skills/context-optimization/examples/digital-brain-skill/AGENT.md +35 -0
  47. package/assets/skills/context-optimization/examples/digital-brain-skill/HOW-SKILLS-BUILT-THIS.md +407 -0
  48. package/assets/skills/context-optimization/examples/digital-brain-skill/README.md +209 -0
  49. package/assets/skills/context-optimization/examples/digital-brain-skill/SKILL.md +203 -0
  50. package/assets/skills/context-optimization/examples/digital-brain-skill/SKILLS-MAPPING.md +219 -0
  51. package/assets/skills/context-optimization/examples/digital-brain-skill/agents/AGENTS.md +82 -0
  52. package/assets/skills/context-optimization/examples/digital-brain-skill/agents/scripts/content_ideas.py +132 -0
  53. package/assets/skills/context-optimization/examples/digital-brain-skill/agents/scripts/idea_to_draft.py +181 -0
  54. package/assets/skills/context-optimization/examples/digital-brain-skill/agents/scripts/stale_contacts.py +139 -0
  55. package/assets/skills/context-optimization/examples/digital-brain-skill/agents/scripts/weekly_review.py +121 -0
  56. package/assets/skills/context-optimization/examples/digital-brain-skill/content/CONTENT.md +88 -0
  57. package/assets/skills/context-optimization/examples/digital-brain-skill/content/calendar.md +108 -0
  58. package/assets/skills/context-optimization/examples/digital-brain-skill/content/engagement.jsonl +2 -0
  59. package/assets/skills/context-optimization/examples/digital-brain-skill/content/ideas.jsonl +2 -0
  60. package/assets/skills/context-optimization/examples/digital-brain-skill/content/posts.jsonl +2 -0
  61. package/assets/skills/context-optimization/examples/digital-brain-skill/content/templates/linkedin-post.md +102 -0
  62. package/assets/skills/context-optimization/examples/digital-brain-skill/content/templates/newsletter.md +92 -0
  63. package/assets/skills/context-optimization/examples/digital-brain-skill/content/templates/thread.md +73 -0
  64. package/assets/skills/context-optimization/examples/digital-brain-skill/examples/content-workflow.md +204 -0
  65. package/assets/skills/context-optimization/examples/digital-brain-skill/examples/meeting-prep.md +243 -0
  66. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/IDENTITY.md +46 -0
  67. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/bio-variants.md +101 -0
  68. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/brand.md +165 -0
  69. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/prompts/content-generation.xml +46 -0
  70. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/prompts/reply-generator.xml +40 -0
  71. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/values.yaml +60 -0
  72. package/assets/skills/context-optimization/examples/digital-brain-skill/identity/voice.md +165 -0
  73. package/assets/skills/context-optimization/examples/digital-brain-skill/knowledge/KNOWLEDGE.md +85 -0
  74. package/assets/skills/context-optimization/examples/digital-brain-skill/knowledge/bookmarks.jsonl +2 -0
  75. package/assets/skills/context-optimization/examples/digital-brain-skill/knowledge/competitors.md +117 -0
  76. package/assets/skills/context-optimization/examples/digital-brain-skill/knowledge/learning.yaml +74 -0
  77. package/assets/skills/context-optimization/examples/digital-brain-skill/knowledge/research/_template.md +79 -0
  78. package/assets/skills/context-optimization/examples/digital-brain-skill/network/NETWORK.md +110 -0
  79. package/assets/skills/context-optimization/examples/digital-brain-skill/network/circles.yaml +80 -0
  80. package/assets/skills/context-optimization/examples/digital-brain-skill/network/contacts.jsonl +2 -0
  81. package/assets/skills/context-optimization/examples/digital-brain-skill/network/interactions.jsonl +2 -0
  82. package/assets/skills/context-optimization/examples/digital-brain-skill/network/intros.md +92 -0
  83. package/assets/skills/context-optimization/examples/digital-brain-skill/operations/OPERATIONS.md +75 -0
  84. package/assets/skills/context-optimization/examples/digital-brain-skill/operations/goals.yaml +83 -0
  85. package/assets/skills/context-optimization/examples/digital-brain-skill/operations/meetings.jsonl +2 -0
  86. package/assets/skills/context-optimization/examples/digital-brain-skill/operations/metrics.jsonl +2 -0
  87. package/assets/skills/context-optimization/examples/digital-brain-skill/operations/reviews/_weekly_template.md +114 -0
  88. package/assets/skills/context-optimization/examples/digital-brain-skill/operations/todos.md +76 -0
  89. package/assets/skills/context-optimization/examples/digital-brain-skill/package.json +41 -0
  90. package/assets/skills/context-optimization/examples/digital-brain-skill/references/file-formats.md +386 -0
  91. package/assets/skills/context-optimization/examples/digital-brain-skill/scripts/install.sh +79 -0
  92. package/assets/skills/context-optimization/examples/interleaved_thinking/README.md +620 -0
  93. package/assets/skills/context-optimization/examples/interleaved_thinking/SKILL.md +221 -0
  94. package/assets/skills/context-optimization/examples/interleaved_thinking/docs/agentthinking.md +63 -0
  95. package/assets/skills/context-optimization/examples/interleaved_thinking/docs/interleavedthinking.md +610 -0
  96. package/assets/skills/context-optimization/examples/interleaved_thinking/docs/m2-1.md +224 -0
  97. package/assets/skills/context-optimization/examples/interleaved_thinking/examples/01_basic_capture.py +76 -0
  98. package/assets/skills/context-optimization/examples/interleaved_thinking/examples/02_tool_usage.py +187 -0
  99. package/assets/skills/context-optimization/examples/interleaved_thinking/examples/03_full_optimization.py +1222 -0
  100. package/assets/skills/context-optimization/examples/interleaved_thinking/generated_skills/comprehensive-research-agent/SKILL.md +90 -0
  101. package/assets/skills/context-optimization/examples/interleaved_thinking/generated_skills/comprehensive-research-agent/references/optimization_summary.json +9 -0
  102. package/assets/skills/context-optimization/examples/interleaved_thinking/generated_skills/comprehensive-research-agent/references/optimized_prompt.txt +1 -0
  103. package/assets/skills/context-optimization/examples/interleaved_thinking/generated_skills/comprehensive-research-agent/references/patterns_found.json +205 -0
  104. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/final_prompt.txt +67 -0
  105. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_1/analysis.txt +48 -0
  106. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_1/optimization.txt +15 -0
  107. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_1/optimized_prompt.txt +1 -0
  108. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_1/trace.txt +178 -0
  109. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_10/analysis.txt +47 -0
  110. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_10/trace.txt +162 -0
  111. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_2/analysis.txt +48 -0
  112. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_2/optimization.txt +130 -0
  113. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_2/optimized_prompt.txt +72 -0
  114. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_2/trace.txt +156 -0
  115. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_3/analysis.txt +46 -0
  116. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_3/optimization.txt +147 -0
  117. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_3/optimized_prompt.txt +84 -0
  118. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_3/trace.txt +159 -0
  119. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_4/analysis.txt +46 -0
  120. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_4/optimization.txt +134 -0
  121. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_4/optimized_prompt.txt +67 -0
  122. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_4/trace.txt +165 -0
  123. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_5/analysis.txt +50 -0
  124. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_5/optimization.txt +135 -0
  125. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_5/optimized_prompt.txt +71 -0
  126. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_5/trace.txt +146 -0
  127. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_6/analysis.txt +15 -0
  128. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_6/optimization.txt +15 -0
  129. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_6/optimized_prompt.txt +1 -0
  130. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_6/trace.txt +147 -0
  131. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_7/analysis.txt +46 -0
  132. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_7/optimization.txt +103 -0
  133. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_7/optimized_prompt.txt +45 -0
  134. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_7/trace.txt +134 -0
  135. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_8/analysis.txt +47 -0
  136. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_8/optimization.txt +114 -0
  137. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_8/optimized_prompt.txt +60 -0
  138. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_8/trace.txt +135 -0
  139. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_9/analysis.txt +44 -0
  140. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_9/optimization.txt +106 -0
  141. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_9/optimized_prompt.txt +51 -0
  142. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/iteration_9/trace.txt +170 -0
  143. package/assets/skills/context-optimization/examples/interleaved_thinking/optimization_artifacts/summary.json +11 -0
  144. package/assets/skills/context-optimization/examples/interleaved_thinking/pyproject.toml +70 -0
  145. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/__init__.py +53 -0
  146. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/analyzer.py +465 -0
  147. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/capture.py +417 -0
  148. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/cli.py +271 -0
  149. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/loop.py +468 -0
  150. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/models.py +193 -0
  151. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/optimizer.py +449 -0
  152. package/assets/skills/context-optimization/examples/interleaved_thinking/reasoning_trace_optimizer/skill_generator.py +502 -0
  153. package/assets/skills/context-optimization/examples/interleaved_thinking/tests/__init__.py +1 -0
  154. package/assets/skills/context-optimization/examples/interleaved_thinking/tests/test_models.py +144 -0
  155. package/assets/skills/context-optimization/examples/llm-as-judge-skills/.prettierrc +8 -0
  156. package/assets/skills/context-optimization/examples/llm-as-judge-skills/CONTRIBUTING.md +78 -0
  157. package/assets/skills/context-optimization/examples/llm-as-judge-skills/LICENSE +21 -0
  158. package/assets/skills/context-optimization/examples/llm-as-judge-skills/README.md +659 -0
  159. package/assets/skills/context-optimization/examples/llm-as-judge-skills/agents/evaluator-agent/evaluator-agent.md +177 -0
  160. package/assets/skills/context-optimization/examples/llm-as-judge-skills/agents/index.md +114 -0
  161. package/assets/skills/context-optimization/examples/llm-as-judge-skills/agents/orchestrator-agent/orchestrator-agent.md +205 -0
  162. package/assets/skills/context-optimization/examples/llm-as-judge-skills/agents/research-agent/research-agent.md +183 -0
  163. package/assets/skills/context-optimization/examples/llm-as-judge-skills/env.example +6 -0
  164. package/assets/skills/context-optimization/examples/llm-as-judge-skills/eslint.config.js +18 -0
  165. package/assets/skills/context-optimization/examples/llm-as-judge-skills/examples/basic-evaluation.ts +89 -0
  166. package/assets/skills/context-optimization/examples/llm-as-judge-skills/examples/full-evaluation-workflow.ts +136 -0
  167. package/assets/skills/context-optimization/examples/llm-as-judge-skills/examples/generate-rubric.ts +67 -0
  168. package/assets/skills/context-optimization/examples/llm-as-judge-skills/examples/pairwise-comparison.ts +97 -0
  169. package/assets/skills/context-optimization/examples/llm-as-judge-skills/package.json +79 -0
  170. package/assets/skills/context-optimization/examples/llm-as-judge-skills/prompts/agent-system/orchestrator-prompt.md +197 -0
  171. package/assets/skills/context-optimization/examples/llm-as-judge-skills/prompts/evaluation/direct-scoring-prompt.md +153 -0
  172. package/assets/skills/context-optimization/examples/llm-as-judge-skills/prompts/evaluation/pairwise-comparison-prompt.md +200 -0
  173. package/assets/skills/context-optimization/examples/llm-as-judge-skills/prompts/index.md +138 -0
  174. package/assets/skills/context-optimization/examples/llm-as-judge-skills/prompts/research/research-synthesis-prompt.md +171 -0
  175. package/assets/skills/context-optimization/examples/llm-as-judge-skills/skills/context-fundamentals/context-fundamentals.md +114 -0
  176. package/assets/skills/context-optimization/examples/llm-as-judge-skills/skills/index.md +79 -0
  177. package/assets/skills/context-optimization/examples/llm-as-judge-skills/skills/llm-evaluator/llm-evaluator.md +77 -0
  178. package/assets/skills/context-optimization/examples/llm-as-judge-skills/skills/tool-design/tool-design.md +198 -0
  179. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/agents/evaluator.ts +112 -0
  180. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/agents/index.ts +3 -0
  181. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/config/index.ts +18 -0
  182. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/index.ts +19 -0
  183. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/tools/evaluation/direct-score.ts +164 -0
  184. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/tools/evaluation/generate-rubric.ts +161 -0
  185. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/tools/evaluation/index.ts +9 -0
  186. package/assets/skills/context-optimization/examples/llm-as-judge-skills/src/tools/evaluation/pairwise-compare.ts +255 -0
  187. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tests/evaluation.test.ts +233 -0
  188. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tests/setup.ts +27 -0
  189. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tests/skills.test.ts +213 -0
  190. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/evaluation/direct-score.md +159 -0
  191. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/evaluation/generate-rubric.md +189 -0
  192. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/evaluation/pairwise-compare.md +182 -0
  193. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/index.md +141 -0
  194. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/orchestration/delegate-to-agent.md +171 -0
  195. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/research/read-url.md +162 -0
  196. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tools/research/web-search.md +128 -0
  197. package/assets/skills/context-optimization/examples/llm-as-judge-skills/tsconfig.json +26 -0
  198. package/assets/skills/context-optimization/examples/llm-as-judge-skills/vitest.config.ts +20 -0
  199. package/assets/skills/context-optimization/examples/x-to-book-system/PRD.md +644 -0
  200. package/assets/skills/context-optimization/examples/x-to-book-system/README.md +181 -0
  201. package/assets/skills/context-optimization/examples/x-to-book-system/SKILLS-MAPPING.md +187 -0
  202. package/assets/skills/context-optimization/researcher/example_output.md +75 -0
  203. package/assets/skills/context-optimization/researcher/llm-as-a-judge.md +362 -0
  204. package/assets/skills/context-optimization/skills/advanced-evaluation/SKILL.md +454 -0
  205. package/assets/skills/context-optimization/skills/advanced-evaluation/references/bias-mitigation.md +288 -0
  206. package/assets/skills/context-optimization/skills/advanced-evaluation/references/implementation-patterns.md +315 -0
  207. package/assets/skills/context-optimization/skills/advanced-evaluation/references/metrics-guide.md +331 -0
  208. package/assets/skills/context-optimization/skills/advanced-evaluation/scripts/evaluation_example.py +337 -0
  209. package/assets/skills/context-optimization/skills/bdi-mental-states/SKILL.md +295 -0
  210. package/assets/skills/context-optimization/skills/bdi-mental-states/references/bdi-ontology-core.md +207 -0
  211. package/assets/skills/context-optimization/skills/bdi-mental-states/references/framework-integration.md +582 -0
  212. package/assets/skills/context-optimization/skills/bdi-mental-states/references/rdf-examples.md +315 -0
  213. package/assets/skills/context-optimization/skills/bdi-mental-states/references/sparql-competency.md +420 -0
  214. package/assets/skills/context-optimization/skills/context-compression/SKILL.md +265 -0
  215. package/assets/skills/context-optimization/skills/context-compression/references/evaluation-framework.md +213 -0
  216. package/assets/skills/context-optimization/skills/context-compression/scripts/compression_evaluator.py +658 -0
  217. package/assets/skills/context-optimization/skills/context-degradation/SKILL.md +231 -0
  218. package/assets/skills/context-optimization/skills/context-degradation/references/patterns.md +314 -0
  219. package/assets/skills/context-optimization/skills/context-degradation/scripts/degradation_detector.py +419 -0
  220. package/assets/skills/context-optimization/skills/context-fundamentals/SKILL.md +185 -0
  221. package/assets/skills/context-optimization/skills/context-fundamentals/references/context-components.md +283 -0
  222. package/assets/skills/context-optimization/skills/context-fundamentals/scripts/context_manager.py +370 -0
  223. package/assets/skills/context-optimization/skills/context-optimization/SKILL.md +179 -0
  224. package/assets/skills/context-optimization/skills/context-optimization/references/optimization_techniques.md +272 -0
  225. package/assets/skills/context-optimization/skills/context-optimization/scripts/compaction.py +379 -0
  226. package/assets/skills/context-optimization/skills/evaluation/SKILL.md +231 -0
  227. package/assets/skills/context-optimization/skills/evaluation/references/metrics.md +339 -0
  228. package/assets/skills/context-optimization/skills/evaluation/scripts/evaluator.py +474 -0
  229. package/assets/skills/context-optimization/skills/filesystem-context/SKILL.md +321 -0
  230. package/assets/skills/context-optimization/skills/filesystem-context/references/implementation-patterns.md +549 -0
  231. package/assets/skills/context-optimization/skills/filesystem-context/scripts/filesystem_context.py +353 -0
  232. package/assets/skills/context-optimization/skills/hosted-agents/SKILL.md +279 -0
  233. package/assets/skills/context-optimization/skills/hosted-agents/references/infrastructure-patterns.md +700 -0
  234. package/assets/skills/context-optimization/skills/hosted-agents/scripts/sandbox_manager.py +495 -0
  235. package/assets/skills/context-optimization/skills/memory-systems/SKILL.md +221 -0
  236. package/assets/skills/context-optimization/skills/memory-systems/references/implementation.md +458 -0
  237. package/assets/skills/context-optimization/skills/memory-systems/scripts/memory_store.py +396 -0
  238. package/assets/skills/context-optimization/skills/multi-agent-patterns/SKILL.md +255 -0
  239. package/assets/skills/context-optimization/skills/multi-agent-patterns/references/frameworks.md +433 -0
  240. package/assets/skills/context-optimization/skills/multi-agent-patterns/scripts/coordination.py +439 -0
  241. package/assets/skills/context-optimization/skills/project-development/SKILL.md +342 -0
  242. package/assets/skills/context-optimization/skills/project-development/references/case-studies.md +388 -0
  243. package/assets/skills/context-optimization/skills/project-development/references/pipeline-patterns.md +610 -0
  244. package/assets/skills/context-optimization/skills/project-development/scripts/pipeline_template.py +677 -0
  245. package/assets/skills/context-optimization/skills/tool-design/SKILL.md +311 -0
  246. package/assets/skills/context-optimization/skills/tool-design/references/architectural_reduction.md +210 -0
  247. package/assets/skills/context-optimization/skills/tool-design/references/best_practices.md +176 -0
  248. package/assets/skills/context-optimization/skills/tool-design/scripts/description_generator.py +237 -0
  249. package/assets/skills/context-optimization/template/SKILL.md +98 -0
  250. package/assets/skills/dremio-analytics/SKILL.md +287 -0
  251. package/assets/skills/elevenlabs-voice/SKILL.md +269 -0
  252. package/assets/skills/git-workflow/SKILL.md +266 -0
  253. package/assets/skills/gitops-workflows/.claude-plugin/plugin.json +8 -0
  254. package/assets/skills/gitops-workflows/SKILL.md +568 -0
  255. package/assets/skills/gitops-workflows/assets/applicationsets/cluster-generator.yaml +32 -0
  256. package/assets/skills/gitops-workflows/assets/argocd/install-argocd-3.x.yaml +92 -0
  257. package/assets/skills/gitops-workflows/assets/flux/flux-bootstrap-github.sh +49 -0
  258. package/assets/skills/gitops-workflows/assets/flux/oci-helmrelease.yaml +38 -0
  259. package/assets/skills/gitops-workflows/assets/progressive-delivery/argo-rollouts-canary.yaml +62 -0
  260. package/assets/skills/gitops-workflows/assets/secrets/sops-age-config.yaml +33 -0
  261. package/assets/skills/gitops-workflows/references/argocd_vs_flux.md +243 -0
  262. package/assets/skills/gitops-workflows/references/best_practices.md +160 -0
  263. package/assets/skills/gitops-workflows/references/multi_cluster.md +80 -0
  264. package/assets/skills/gitops-workflows/references/oci_artifacts.md +290 -0
  265. package/assets/skills/gitops-workflows/references/progressive_delivery.md +94 -0
  266. package/assets/skills/gitops-workflows/references/repo_patterns.md +184 -0
  267. package/assets/skills/gitops-workflows/references/secret_management.md +213 -0
  268. package/assets/skills/gitops-workflows/references/troubleshooting.md +134 -0
  269. package/assets/skills/gitops-workflows/scripts/applicationset_generator.py +156 -0
  270. package/assets/skills/gitops-workflows/scripts/check_argocd_health.py +275 -0
  271. package/assets/skills/gitops-workflows/scripts/check_flux_health.py +418 -0
  272. package/assets/skills/gitops-workflows/scripts/oci_artifact_checker.py +150 -0
  273. package/assets/skills/gitops-workflows/scripts/promotion_validator.py +88 -0
  274. package/assets/skills/gitops-workflows/scripts/secret_audit.py +178 -0
  275. package/assets/skills/gitops-workflows/scripts/sync_drift_detector.py +144 -0
  276. package/assets/skills/gitops-workflows/scripts/validate_gitops_repo.py +299 -0
  277. package/assets/skills/iac-terraform/.claude-plugin/plugin.json +8 -0
  278. package/assets/skills/iac-terraform/SKILL.md +653 -0
  279. package/assets/skills/iac-terraform/assets/templates/MODULE_TEMPLATE.md +386 -0
  280. package/assets/skills/iac-terraform/assets/workflows/github-actions-terraform.yml +224 -0
  281. package/assets/skills/iac-terraform/assets/workflows/github-actions-terragrunt.yml +236 -0
  282. package/assets/skills/iac-terraform/assets/workflows/gitlab-ci-terraform.yml +184 -0
  283. package/assets/skills/iac-terraform/references/best_practices.md +709 -0
  284. package/assets/skills/iac-terraform/references/cost_optimization.md +665 -0
  285. package/assets/skills/iac-terraform/references/troubleshooting.md +635 -0
  286. package/assets/skills/iac-terraform/scripts/init_module.py +319 -0
  287. package/assets/skills/iac-terraform/scripts/inspect_state.py +232 -0
  288. package/assets/skills/iac-terraform/scripts/validate_module.py +227 -0
  289. package/assets/skills/k8s-troubleshooter/.claude-plugin/plugin.json +8 -0
  290. package/assets/skills/k8s-troubleshooter/SKILL.md +336 -0
  291. package/assets/skills/k8s-troubleshooter/references/common_issues.md +582 -0
  292. package/assets/skills/k8s-troubleshooter/references/helm_troubleshooting.md +708 -0
  293. package/assets/skills/k8s-troubleshooter/references/incident_response.md +466 -0
  294. package/assets/skills/k8s-troubleshooter/references/performance_troubleshooting.md +687 -0
  295. package/assets/skills/k8s-troubleshooter/scripts/check_namespace.py +500 -0
  296. package/assets/skills/k8s-troubleshooter/scripts/cluster_health.py +223 -0
  297. package/assets/skills/k8s-troubleshooter/scripts/diagnose_pod.py +157 -0
  298. package/assets/skills/mattermost-notify/SKILL.md +248 -0
  299. package/assets/skills/monitoring-observability/SKILL.md +869 -0
  300. package/assets/skills/monitoring-observability/assets/templates/otel-config/collector-config.yaml +227 -0
  301. package/assets/skills/monitoring-observability/assets/templates/prometheus-alerts/kubernetes-alerts.yml +293 -0
  302. package/assets/skills/monitoring-observability/assets/templates/prometheus-alerts/webapp-alerts.yml +243 -0
  303. package/assets/skills/monitoring-observability/assets/templates/runbooks/incident-runbook-template.md +409 -0
  304. package/assets/skills/monitoring-observability/monitoring-observability.skill +0 -0
  305. package/assets/skills/monitoring-observability/references/alerting_best_practices.md +609 -0
  306. package/assets/skills/monitoring-observability/references/datadog_migration.md +649 -0
  307. package/assets/skills/monitoring-observability/references/dql_promql_translation.md +756 -0
  308. package/assets/skills/monitoring-observability/references/logging_guide.md +775 -0
  309. package/assets/skills/monitoring-observability/references/metrics_design.md +406 -0
  310. package/assets/skills/monitoring-observability/references/slo_sla_guide.md +652 -0
  311. package/assets/skills/monitoring-observability/references/tool_comparison.md +697 -0
  312. package/assets/skills/monitoring-observability/references/tracing_guide.md +663 -0
  313. package/assets/skills/monitoring-observability/scripts/alert_quality_checker.py +315 -0
  314. package/assets/skills/monitoring-observability/scripts/analyze_metrics.py +279 -0
  315. package/assets/skills/monitoring-observability/scripts/dashboard_generator.py +395 -0
  316. package/assets/skills/monitoring-observability/scripts/datadog_cost_analyzer.py +477 -0
  317. package/assets/skills/monitoring-observability/scripts/health_check_validator.py +297 -0
  318. package/assets/skills/monitoring-observability/scripts/log_analyzer.py +321 -0
  319. package/assets/skills/monitoring-observability/scripts/slo_calculator.py +365 -0
  320. package/assets/skills/neo4j-graph-rag/SKILL.md +258 -0
  321. package/assets/skills/pagerduty-ops/SKILL.md +380 -0
  322. package/assets/skills/playwright/API_REFERENCE.md +653 -0
  323. package/assets/skills/playwright/SKILL.md +453 -0
  324. package/assets/skills/playwright/lib/helpers.js +441 -0
  325. package/assets/skills/playwright/package.json +26 -0
  326. package/assets/skills/playwright/run.js +228 -0
  327. package/assets/skills/project-memory/README.md +687 -0
  328. package/assets/skills/project-memory/SKILL.md +298 -0
  329. package/assets/skills/project-memory/references/bugs_template.md +41 -0
  330. package/assets/skills/project-memory/references/decisions_template.md +92 -0
  331. package/assets/skills/project-memory/references/issues_template.md +76 -0
  332. package/assets/skills/project-memory/references/key_facts_template.md +158 -0
  333. package/assets/skills/recruit-workflow/SKILL.md +276 -0
  334. package/assets/skills/recruit-workflow/references/email-templates.md +347 -0
  335. package/assets/skills/recruit-workflow/references/workflow-stages.md +395 -0
  336. package/assets/skills/recruit-workflow/scripts/clay_client.py +188 -0
  337. package/assets/skills/recruit-workflow/scripts/lever_client.py +197 -0
  338. package/assets/skills/recruit-workflow/scripts/mailgun_client.py +245 -0
  339. package/assets/skills/recruit-workflow/scripts/minio_client.py +426 -0
  340. package/assets/skills/shakudo-microservice/SKILL.md +215 -0
  341. package/assets/skills/tmux/SKILL.md +631 -0
  342. package/assets/skills/tmux/references/direct-socket-control.md +108 -0
  343. package/assets/skills/tmux/references/session-lifecycle.md +503 -0
  344. package/assets/skills/tmux/references/session-registry.md +1484 -0
  345. package/assets/skills/tmux/tools/cleanup-sessions.sh +263 -0
  346. package/assets/skills/tmux/tools/create-session.sh +224 -0
  347. package/assets/skills/tmux/tools/find-sessions.sh +262 -0
  348. package/assets/skills/tmux/tools/kill-session.sh +308 -0
  349. package/assets/skills/tmux/tools/lib/registry.sh +437 -0
  350. package/assets/skills/tmux/tools/lib/time_utils.sh +54 -0
  351. package/assets/skills/tmux/tools/list-sessions.sh +255 -0
  352. package/assets/skills/tmux/tools/pane-health.sh +424 -0
  353. package/assets/skills/tmux/tools/safe-send.sh +503 -0
  354. package/assets/skills/tmux/tools/wait-for-text.sh +260 -0
  355. package/assets/skills/twilio-sms/SKILL.md +508 -0
  356. package/assets/skills/zellij/SKILL.md +274 -0
  357. package/assets/skills/zellij/references/actions.md +558 -0
  358. package/assets/skills/zellij/references/layouts.md +424 -0
  359. package/bin/cli.ts +46 -0
  360. package/package.json +43 -0
  361. package/src/alias.ts +108 -0
  362. package/src/backup.ts +51 -0
  363. package/src/config.ts +115 -0
  364. package/src/dependencies.ts +163 -0
  365. package/src/errors.ts +77 -0
  366. package/src/index.ts +207 -0
  367. package/src/prompts.ts +142 -0
  368. package/src/schemas.ts +21 -0
  369. package/src/skills.ts +45 -0
  370. package/src/speckit.ts +116 -0
  371. package/src/types.ts +106 -0
  372. package/src/utils.ts +110 -0
  373. package/src/vibe-git.ts +50 -0
  374. package/templates/.specify/memory/constitution.md +109 -0
  375. package/templates/.specify/scripts/bash/check-prerequisites.sh +262 -0
  376. package/templates/.specify/scripts/bash/common.sh +670 -0
  377. package/templates/.specify/scripts/bash/create-new-feature.sh +594 -0
  378. package/templates/.specify/scripts/bash/create-worktree-feature.sh +401 -0
  379. package/templates/.specify/scripts/bash/init-workspace.sh +433 -0
  380. package/templates/.specify/scripts/bash/list-spec-worktrees.sh +198 -0
  381. package/templates/.specify/scripts/bash/setup-plan.sh +105 -0
  382. package/templates/.specify/scripts/bash/test-workspace-rollup.sh +175 -0
  383. package/templates/.specify/scripts/bash/update-agent-context.sh +799 -0
  384. package/templates/.specify/templates/agent-file-template.md +28 -0
  385. package/templates/.specify/templates/checklist-template.md +40 -0
  386. package/templates/.specify/templates/commands/analyze.md +197 -0
  387. package/templates/.specify/templates/commands/checklist.md +306 -0
  388. package/templates/.specify/templates/commands/clarify.md +194 -0
  389. package/templates/.specify/templates/commands/constitution.md +97 -0
  390. package/templates/.specify/templates/commands/implement.md +149 -0
  391. package/templates/.specify/templates/commands/plan.md +123 -0
  392. package/templates/.specify/templates/commands/projects.md +48 -0
  393. package/templates/.specify/templates/commands/rollup.md +66 -0
  394. package/templates/.specify/templates/commands/specify.md +275 -0
  395. package/templates/.specify/templates/commands/specs.md +71 -0
  396. package/templates/.specify/templates/commands/tasks.md +151 -0
  397. package/templates/.specify/templates/commands/taskstoissues.md +35 -0
  398. package/templates/.specify/templates/commands/workspace.md +128 -0
  399. package/templates/.specify/templates/plan-template.md +104 -0
  400. package/templates/.specify/templates/spec-template.md +115 -0
  401. package/templates/.specify/templates/tasks-template.md +251 -0
  402. package/templates/.specify/templates/workspace.yaml +110 -0
  403. package/templates/.specify/workspace.yaml +95 -0
  404. package/templates/AGENTS.md +460 -0
  405. package/templates/oh-my-opencode.json +27 -0
  406. package/templates/opencode.json +383 -0
  407. package/templates/package.json +10 -0
  408. package/templates/project-memory/bugs.md +16 -0
  409. package/templates/project-memory/decisions.md +22 -0
  410. package/templates/project-memory/issues.md +15 -0
  411. package/templates/project-memory/key_facts.md +26 -0
@@ -0,0 +1,620 @@
1
+ # Reasoning Trace Optimizer
2
+
3
+ <p align="center">
4
+ <strong>Debug and optimize AI agents by analyzing reasoning traces with MiniMax M2.1's interleaved thinking</strong>
5
+ </p>
6
+
7
+ <p align="center">
8
+ <a href="#key-features">Features</a> |
9
+ <a href="#quick-start">Quick Start</a> |
10
+ <a href="#how-it-works">How It Works</a> |
11
+ <a href="#examples">Examples</a> |
12
+ <a href="#api-reference">API Reference</a>
13
+ </p>
14
+
15
+ ---
16
+
17
+ ## The Problem
18
+
19
+ Traditional AI agents fail in opaque ways. You see the final output, but not **why** decisions were made. When an agent:
20
+ - Calls the wrong tool
21
+ - Loses track of the goal
22
+ - Makes up information
23
+
24
+ ...you're left guessing where things went wrong.
25
+
26
+ ## The Solution
27
+
28
+ **Reasoning Trace Optimizer** uses MiniMax M2.1's unique **interleaved thinking** capability to expose the agent's reasoning process between every tool call. This enables:
29
+
30
+ 1. **Deep Debugging** - See exactly where reasoning diverged from expected behavior
31
+ 2. **Pattern Detection** - Automatically identify failure modes (context degradation, tool confusion, etc.)
32
+ 3. **Automated Optimization** - Generate improved prompts based on detected issues
33
+ 4. **Shareable Skills** - Convert learnings into reusable Agent Skills for team sharing
34
+
35
+ ## Why MiniMax M2.1?
36
+
37
+ M2.1's **interleaved thinking** is fundamentally different from traditional reasoning models:
38
+
39
+ ```
40
+ Traditional:  Think → Act → Act → Act → Done
41
+
42
+               (reasoning only at start)
43
+
44
+ M2.1:         Think → Act → Think → Act → Think → Act → Done
45
+               ↑             ↑             ↑
46
+               (continuous reasoning between each tool call)
47
+ ```
48
+
49
+ This matters for agents because:
50
+ - **Long tasks** require maintaining focus across many turns
51
+ - **Tool outputs** introduce unexpected information requiring adaptation
52
+ - **Debugging** needs visibility into decision-making, not just outputs
53
+
54
+ The `thinking` block (Anthropic SDK) or `reasoning_details` field (OpenAI SDK) exposes this reasoning for analysis.
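As a minimal sketch of where such analysis starts, the snippet below separates `thinking` blocks from tool calls in an Anthropic-style message `content` list. It operates on plain dictionaries so it runs without any SDK installed; the helper name `split_content` and the sample payload are illustrative assumptions, not part of this project.

```python
from typing import Any


def split_content(content: list[dict[str, Any]]) -> dict[str, list]:
    """Group Anthropic-style content blocks by kind (hypothetical helper)."""
    out: dict[str, list] = {"thinking": [], "tool_calls": [], "text": []}
    for block in content:
        kind = block.get("type")
        if kind == "thinking":
            out["thinking"].append(block.get("thinking", ""))
        elif kind == "tool_use":
            out["tool_calls"].append((block["name"], block.get("input", {})))
        elif kind == "text":
            out["text"].append(block.get("text", ""))
    return out


# Hypothetical assistant turn: one reasoning block followed by one tool call.
sample = [
    {"type": "thinking", "thinking": "I need the weather for SF first."},
    {"type": "tool_use", "id": "call_1", "name": "get_weather",
     "input": {"city": "San Francisco"}},
]
parts = split_content(sample)
print(parts["thinking"])
```

Keeping the reasoning text and the tool calls in separate lists makes it easy to pair each thought with the action that followed it.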
55
+
56
+ ---
57
+
58
+ ## Key Features
59
+
60
+ | Component | Description |
61
+ |-----------|-------------|
62
+ | **TraceCapture** | Wrap M2.1 API to capture all thinking blocks with full context |
63
+ | **TraceAnalyzer** | Detect patterns like context degradation, tool confusion, instruction drift |
64
+ | **PromptOptimizer** | Generate improved prompts based on analysis using M2.1 |
65
+ | **OptimizationLoop** | Automated capture → analyze → improve → re-run cycle |
66
+ | **SkillGenerator** | Convert learnings into shareable Agent Skills |
67
+
68
+ ### Pattern Detection
69
+
70
+ The analyzer automatically identifies these failure patterns:
71
+
72
+ | Pattern | Description | Severity |
73
+ |---------|-------------|----------|
74
+ | `context_degradation` | Model loses information over long contexts | High |
75
+ | `tool_confusion` | Model misunderstands tool capabilities | High |
76
+ | `instruction_drift` | Model deviates from original instructions | Medium |
77
+ | `hallucination` | Model generates unsupported information | Critical |
78
+ | `goal_abandonment` | Model stops pursuing the original goal | High |
79
+ | `circular_reasoning` | Model repeats similar actions without progress | Medium |
80
+ | `premature_conclusion` | Model concludes before completing task | Medium |
81
+ | `missing_validation` | Model doesn't verify results | High |
82
+
83
+ Each detected pattern includes:
84
+ - **Evidence** - Specific excerpts from thinking blocks
85
+ - **Severity** - Critical/High/Medium/Low
86
+ - **Suggestion** - Concrete improvement for the prompt
87
+ - **Confidence** - How certain the detection is
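The four fields above map naturally onto a small record type. The following is a sketch only: `DetectedPattern`, `Severity`, and `triage` are hypothetical names built on standard-library dataclasses, and the project's own `Pattern` model may be shaped differently.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


@dataclass
class DetectedPattern:
    type: str              # e.g. "tool_confusion"
    severity: Severity
    evidence: list[str]    # excerpts from thinking blocks
    suggestion: str        # concrete prompt improvement
    confidence: float      # assumed range: 0.0 to 1.0


def triage(patterns, min_confidence=0.5):
    """Drop low-confidence detections and sort the rest, most severe first."""
    kept = [p for p in patterns if p.confidence >= min_confidence]
    return sorted(kept, key=lambda p: (p.severity, p.confidence), reverse=True)


found = [
    DetectedPattern("instruction_drift", Severity.MEDIUM, ["..."], "Restate the constraints.", 0.7),
    DetectedPattern("hallucination", Severity.CRITICAL, ["..."], "Require citations.", 0.9),
    DetectedPattern("tool_confusion", Severity.HIGH, ["..."], "Clarify tool descriptions.", 0.3),
]
ordered = triage(found)
print([p.type for p in ordered])  # → ['hallucination', 'instruction_drift']
```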
88
+
89
+ ---
90
+
91
+ ## Quick Start
92
+
93
+ ### Installation
94
+
95
+ ```bash
96
+ cd examples/interleaved_thinking
97
+ pip install -e .
98
+ ```
99
+
100
+ ### Configuration
101
+
102
+ Set your MiniMax API key and the Anthropic-compatible endpoint:
103
+
104
+ ```bash
105
+ export ANTHROPIC_API_KEY=your_minimax_api_key
106
+ export ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
107
+ ```
108
+
109
+ Or create a `.env` file:
110
+
111
+ ```env
112
+ ANTHROPIC_API_KEY=your_minimax_api_key
113
+ ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
114
+ ```
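A minimal sketch of reading these two variables at startup is shown below. The function name `load_settings` and the choice to fall back to the MiniMax endpoint when `ANTHROPIC_BASE_URL` is unset are assumptions for illustration; parsing the `.env` file itself is not covered here.

```python
import os


def load_settings(env=None):
    """Read the API key and endpoint from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("ANTHROPIC_API_KEY")
    if not api_key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set")
    # Assumed default: the endpoint shown in the configuration above.
    base_url = env.get("ANTHROPIC_BASE_URL", "https://api.minimax.io/anthropic")
    return {"api_key": api_key, "base_url": base_url}


settings = load_settings({"ANTHROPIC_API_KEY": "your_minimax_api_key"})
print(settings["base_url"])
```

Failing fast on a missing key gives a clearer error than an authentication failure on the first request.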
115
+
116
+ ### Basic Usage
117
+
118
+ ```python
119
+ from reasoning_trace_optimizer import TraceCapture, TraceAnalyzer
120
+
121
+ # Capture reasoning trace
122
+ capture = TraceCapture()
123
+ trace = capture.run(
124
+     task="Explain quantum computing",
125
+     system_prompt="You are a science educator."
126
+ )
127
+
128
+ print(f"Captured {len(trace.thinking_blocks)} thinking blocks")
129
+
130
+ # Analyze the reasoning
131
+ analyzer = TraceAnalyzer()
132
+ analysis = analyzer.analyze(trace)
133
+
134
+ print(f"Overall Score: {analysis.overall_score}/100")
135
+ for pattern in analysis.patterns:
136
+     print(f" [{pattern.severity.value}] {pattern.type.value}")
137
+     print(f" Suggestion: {pattern.suggestion}")
138
+ ```
139
+
140
+ ---
141
+
142
+ ## How It Works
143
+
144
+ ### The Optimization Loop
145
+
146
+ ```
147
+ ┌─────────────────────────────────────────────────────────────────────────┐
148
+ │                            OPTIMIZATION LOOP                            │
149
+ │                                                                         │
150
+ │   ┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐          │
151
+ │   │  Agent   │───▶│ Capture  │───▶│ Analyze  │───▶│ Optimize │          │
152
+ │   │ Execute  │    │  Traces  │    │ Patterns │    │  Prompt  │          │
153
+ │   └──────────┘    └──────────┘    └──────────┘    └──────────┘          │
154
+ │        ▲                                               │                │
155
+ │        └───────────────────────────────────────────────┘                │
156
+ │            (loop until converged or max iterations)                     │
157
+ │                                                                         │
158
+ │  Convergence: Score improvement < threshold OR score > target           │
159
+ └─────────────────────────────────────────────────────────────────────────┘
160
+ ```
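The cycle in the diagram can be sketched in a few lines. This is a literal reading of the stated convergence rule, not the project's implementation: `optimize`, `run_and_score`, and `improve` are hypothetical names, and this version treats any gain below the threshold (including a drop) as convergence, which the real loop may handle differently.

```python
from typing import Callable


def optimize(
    run_and_score: Callable[[str], float],
    improve: Callable[[str, float], str],
    prompt: str,
    max_iterations: int = 5,
    convergence_threshold: float = 3.0,
    target_score: float = 75.0,
) -> tuple[str, list[float]]:
    """Run execute/capture/analyze (run_and_score), then optimize (improve)."""
    scores: list[float] = []
    for _ in range(max_iterations):
        score = run_and_score(prompt)
        scores.append(score)
        if score > target_score:
            break  # score > target
        if len(scores) > 1 and score - scores[-2] < convergence_threshold:
            break  # improvement < threshold
        prompt = improve(prompt, score)
    return prompt, scores


# Stand-ins for a real agent run and a real prompt optimizer.
fake_scores = iter([60.0, 70.0, 72.0, 90.0])
final_prompt, history = optimize(
    run_and_score=lambda p: next(fake_scores),
    improve=lambda p, s: p + " Verify each tool result.",
    prompt="You are a research assistant.",
)
print(history)  # → [60.0, 70.0, 72.0]
```

The run stops at the third iteration because the gain from 70 to 72 is below the 3-point threshold, so the fourth score is never requested.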
161
+
162
+ ### What Gets Captured
163
+
164
+ For each agent execution, we capture:
165
+
166
+ 1. **Thinking Blocks** - M2.1's reasoning before each action
167
+ 2. **Tool Calls** - What tools were called with what inputs
168
+ 3. **Tool Results** - What each tool returned
169
+ 4. **Final Response** - The agent's output
170
+ 5. **Metadata** - Tokens used, turns taken, success/failure
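One way to hold these five pieces together is sketched below with standard-library dataclasses. Only `thinking_blocks` is a field name confirmed by the usage examples in this README; every other field name here is an assumption, and the Architecture section indicates the project's own models use Pydantic rather than dataclasses.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ThinkingBlock:
    turn: int
    content: str


@dataclass
class ToolCall:
    turn: int
    name: str
    input: dict[str, Any]
    result: Optional[str] = None  # filled in once the tool returns


@dataclass
class ReasoningTrace:
    task: str
    thinking_blocks: list[ThinkingBlock] = field(default_factory=list)
    tool_calls: list[ToolCall] = field(default_factory=list)
    final_response: str = ""
    total_tokens: int = 0   # metadata (assumed field names)
    turns: int = 0
    success: bool = False


trace = ReasoningTrace(task="Compare the weather in two cities")
trace.thinking_blocks.append(ThinkingBlock(0, "Start with San Francisco."))
trace.tool_calls.append(ToolCall(0, "get_weather", {"city": "San Francisco"}, "61F, fog"))
trace.turns = 1
print(f"Captured {len(trace.thinking_blocks)} thinking blocks, {len(trace.tool_calls)} tool calls")
```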
171
+
172
+ ### What Gets Analyzed
173
+
174
+ The analyzer examines thinking blocks to understand:
175
+
176
+ - **Current Understanding** - What does the agent believe about the task?
177
+ - **Tool Interpretation** - How did it interpret each tool result?
178
+ - **Alternatives Considered** - What options did it evaluate?
179
+ - **Goal Awareness** - Is it still pursuing the original objective?
180
+
181
+ ---
182
+
183
+ ## Examples
184
+
185
+ ### Example 1: Basic Trace Capture
186
+
187
+ ```python
188
+ # examples/01_basic_capture.py
189
+ from reasoning_trace_optimizer import TraceCapture
190
+
191
+ capture = TraceCapture()
192
+ trace = capture.run(
193
+     task="Explain what interleaved thinking is and why it matters for AI agents.",
194
+     system_prompt="You are an AI researcher explaining concepts clearly."
195
+ )
196
+
197
+ # Output:
198
+ # Captured 1 thinking block
199
+ # Turn 0: "The user is asking me to explain 'interleaved thinking'..."
200
+ ```
201
+
202
+ ### Example 2: Tool Usage with Analysis
203
+
204
+ ```python
205
+ # examples/02_tool_usage.py
206
+ from reasoning_trace_optimizer import TraceCapture, TraceAnalyzer
207
+
208
+ # Define tools
209
+ tools = [
210
+     {
211
+         "name": "get_weather",
212
+         "description": "Get current weather for a city",
213
+         "input_schema": {...}
214
+     }
215
+ ]
216
+
217
+ capture = TraceCapture()
218
+ trace = capture.run(
219
+     task="Compare the weather in San Francisco and New York",
220
+     tools=tools,
221
+     tool_executor=execute_tool
222
+ )
223
+
224
+ # Analyze
225
+ analyzer = TraceAnalyzer()
226
+ analysis = analyzer.analyze(trace)
227
+
228
+ # Output:
229
+ # Score: 85/100
230
+ # Thinking Blocks: 3
231
+ # Tool Calls: 4 (get_weather x2, get_forecast x2)
232
+ # Patterns: None detected
233
+ ```
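The snippet above passes `execute_tool` without showing it. A hedged sketch of such an executor follows; the signature `(name, tool_input) -> str`, the JSON-string return value, and the canned weather data are all assumptions made so the example runs offline, not the project's documented contract.

```python
import json

# Canned data so the sketch runs without network access.
FAKE_WEATHER = {
    "San Francisco": {"temp_f": 61, "conditions": "fog"},
    "New York": {"temp_f": 75, "conditions": "sunny"},
}


def get_weather(city: str) -> dict:
    if city not in FAKE_WEATHER:
        raise KeyError(f"no data for {city!r}")
    return FAKE_WEATHER[city]


# Map tool names (as declared in the tool definitions) to Python callables.
TOOL_HANDLERS = {"get_weather": get_weather}


def execute_tool(name: str, tool_input: dict) -> str:
    """Dispatch a tool call by name and return a JSON string result."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        return json.dumps(handler(**tool_input))
    except Exception as exc:  # report failures to the model instead of crashing
        return json.dumps({"error": str(exc)})


print(execute_tool("get_weather", {"city": "New York"}))
```

Returning errors as data rather than raising lets the agent see the failure in its next thinking block and adapt.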
234
+
235
+ ### Example 3: Full Optimization Loop
236
+
237
+ This example demonstrates a complex research task with 7 tools (web search, file operations, note-taking):
238
+
239
+ ```python
240
+ # examples/03_full_optimization.py
241
+ from reasoning_trace_optimizer import OptimizationLoop, LoopConfig, SkillGenerator
242
+
243
+ config = LoopConfig(
244
+     max_iterations=3,
245
+     min_score_threshold=85.0,
246
+     convergence_threshold=5.0,
247
+     save_artifacts=True,
248
+ )
249
+
250
+ loop = OptimizationLoop(config=config)
251
+ result = loop.run(
252
+     task="""Research "context engineering for AI agents" and create a summary...""",
253
+     initial_prompt="You are a research assistant.",
254
+     tools=TOOLS,
255
+     tool_executor=execute_tool,
256
+ )
257
+
258
+ # Generate shareable skill
259
+ generator = SkillGenerator()
260
+ skill_path = generator.generate(result, skill_name="research-agent")
261
+ ```
262
+
263
+ **Actual Output from Example 3:**
264
+
265
+ ```
266
+ ======================================================================
267
+ OPTIMIZATION RESULTS
268
+ ======================================================================
269
+
270
+ Total Iterations: 3
271
+ Converged: Yes
272
+
273
+ ITERATION 1 (Score: 69/100)
274
+ ├── Task Completed: Yes
275
+ ├── Thinking Blocks: 6
276
+ ├── Tool Calls: 16
277
+ ├── Patterns Found: 2
278
+ │ ├── [LOW] missing_validation
279
+ │ └── [LOW] incomplete_reasoning
280
+ ├── Strengths: Excellent goal adherence, thorough source diversity
281
+ └── Warning: Prompt grew too large (2979 chars), limiting growth
282
+
283
+ ITERATION 2 (Score: 60/100) ← Regression detected!
284
+ ├── Task Completed: Yes
285
+ ├── Thinking Blocks: 8
286
+ ├── Tool Calls: 16
287
+ ├── Patterns Found: 3
288
+ │ ├── [MEDIUM] incomplete_reasoning
289
+ │ ├── [MEDIUM] missing_validation
290
+ │ └── [LOW] tool_misuse
291
+
292
+ ITERATION 3 (Score: 66/100)
293
+ ├── Task Completed: Yes
294
+ ├── Thinking Blocks: 8
295
+ ├── Tool Calls: 16
296
+ └── Patterns Found: 3
297
+
298
+ → Using best prompt from iteration 1 (score: 67.6)
299
+
300
+ TOOL USAGE ACROSS ALL ITERATIONS:
301
+ ├── read_url: 20 calls
302
+ ├── web_search: 12 calls
303
+ ├── list_directory: 7 calls
304
+ ├── save_note: 6 calls
305
+ └── write_file: 3 calls
306
+
307
+ NOTES SAVED: 6 research notes with tagged findings
308
+ FILES WRITTEN: ./output/research_summary.md (11,357 chars)
309
+
310
+ GENERATED SKILL: ./generated_skills/comprehensive-research-agent/SKILL.md
311
+ ```
312
+
313
+ **Key Features Demonstrated:**
314
+
315
+ 1. **Prompt Growth Limiting** - Prevents prompt bloat by capping expansion relative to the original size (3x here; configurable via `max_prompt_growth`)
316
+ 2. **Best Score Tracking** - Automatically uses the best-performing prompt, even if later iterations regress
317
+ 3. **Regression Detection** - Warns when scores drop and can stop after consecutive regressions
318
+
319
+ ---
320
+
321
+ ## Generated Artifacts
322
+
323
+ ### Optimization Artifacts
324
+
325
+ Each optimization run creates artifacts for inspection:
326
+
327
+ ```
328
+ optimization_artifacts/
329
+ ├── summary.json              # Overall results
330
+ ├── final_prompt.txt          # The optimized prompt
331
+ ├── iteration_1/
332
+ │   ├── trace.json            # Full reasoning trace
333
+ │   ├── analysis.json         # Pattern detection results
334
+ │   └── optimization.json     # Prompt changes made
335
+ ├── iteration_2/
336
+ │   └── ...
337
+ └── iteration_3/
338
+     └── ...
339
+ ```
340
+
341
+ ### Generated Skills
342
+
343
+ The SkillGenerator converts optimization learnings into shareable Agent Skills:
344
+
345
+ ```
346
+ generated_skills/
347
+ └── comprehensive-research-agent/
348
+     ├── SKILL.md              # The shareable skill
349
+     └── references/
350
+         ├── optimization_summary.json
351
+         ├── optimized_prompt.txt
352
+         └── patterns_found.json
353
+ ```
354
+
355
+ **Example Generated Skill Content:**
356
+
357
+ ```markdown
358
+ ## Patterns to Avoid
359
+
360
+ - **Missing Validation**: Accepting tool responses at face value without
361
+ verifying the actual state change occurred.
362
+ - **Hallucinating Sources**: Citing sources that failed to load.
363
+ - **Ignoring Contradictions**: Proceeding when tool results conflict.
364
+
365
+ ## Recommended Practices
366
+
367
+ - After every tool call, state the outcome explicitly
368
+ - Track sources separately: 'attempted' vs 'successful'
369
+ - Implement error recovery with alternative approaches
370
+ - Cross-reference key claims against multiple sources
371
+ ```
372
+
373
+ ---
374
+
375
+ ## API Reference
376
+
377
+ ### TraceCapture
378
+
379
+ ```python
380
+ capture = TraceCapture(
381
+     api_key="...",                                # MiniMax API key
382
+     base_url="https://api.minimax.io/anthropic",  # API endpoint
383
+     model="MiniMax-M2.1"                          # Model to use
384
+ )
385
+
386
+ trace = capture.run(
387
+     task="...",             # The task to execute
388
+     system_prompt="...",    # System prompt
389
+     tools=[...],            # Tool definitions (Anthropic format)
390
+     tool_executor=fn,       # Function to execute tools
391
+     max_turns=10,           # Maximum conversation turns
392
+     max_tokens=4096         # Max tokens per response
393
+ )
394
+ ```
395
+
396
+ ### TraceAnalyzer
397
+
398
+ ```python
399
+ analyzer = TraceAnalyzer(
400
+     api_key="...",
401
+     base_url="https://api.minimax.io/anthropic",
402
+     model="MiniMax-M2.1"
403
+ )
404
+
405
+ analysis = analyzer.analyze(trace)
406
+ # Returns: AnalysisResult with patterns, scores, recommendations
407
+
408
+ quick_score = analyzer.quick_score(trace)
409
+ # Returns: float (0-100) for fast feedback
410
+ ```
411
+
412
+ ### OptimizationLoop
413
+
414
+ ```python
415
+ config = LoopConfig(
416
+     # Iteration control
417
+     max_iterations=5,             # Maximum optimization iterations
418
+     convergence_threshold=3.0,    # Stop if improvement < this %
419
+     min_score_threshold=75.0,     # Stop if score exceeds this
420
+     regression_threshold=8.0,     # Warn if score drops by this much
421
+
422
+     # Optimization behavior
423
+     use_best_prompt=True,         # Use best-performing prompt, not final
424
+     max_prompt_growth=5.0,        # Limit prompt expansion to 5x original
425
+
426
+     # Output options
427
+     save_artifacts=True,          # Save traces and analyses
428
+     artifacts_dir="./artifacts"   # Where to save
429
+ )
430
+
431
+ loop = OptimizationLoop(config=config)
432
+ result = loop.run(task, initial_prompt, tools, tool_executor)
433
+ # Returns: LoopResult with iterations, final_prompt, scores
434
+ ```
435
+
436
+ **Optimization Safeguards:**
437
+
438
+ - **Best Prompt Tracking**: Keeps the prompt that produced the highest score
439
+ - **Prompt Growth Limiting**: Prevents prompt bloat by limiting size expansion
440
+ - **Regression Detection**: Warns on score drops, stops after consecutive regressions
441
+
442
+ **Score Expectations:**
443
+
444
+ | Task Complexity | Typical Score Range | Notes |
445
+ |-----------------|---------------------|-------|
446
+ | Simple (1-2 tools) | 80-95 | Straightforward tasks converge quickly |
447
+ | Medium (3-5 tools) | 70-85 | Multiple tool coordination adds variability |
448
+ | Complex (6+ tools, multi-step) | 60-75 | Inherent variance in long reasoning chains |
449
+
450
+ Complex research tasks with many tools and steps typically plateau around **65-75** due to:
451
+ - Tool output variability affecting reasoning paths
452
+ - Multiple valid approaches leading to different scoring
453
+ - The stochastic nature of multi-step agent execution
454
+
455
+ The optimizer focuses on **relative improvement** and **pattern elimination** rather than achieving a specific absolute score.
456
+
457
+ ### SkillGenerator
458
+
459
+ ```python
460
+ generator = SkillGenerator()
461
+ skill_path = generator.generate(
462
+     result=loop_result,               # From OptimizationLoop
463
+     skill_name="my-skill",            # Lowercase with hyphens
464
+     output_dir="./generated_skills",
465
+     title="Human Readable Title"
466
+ )
467
+ ```
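Since `skill_name` is expected to be lowercase with hyphens, a caller may want to check or derive the name before calling `generate`. The helpers below are a hypothetical sketch; whether the project validates names itself, and whether digits are allowed, are assumptions.

```python
import re

# Assumed rule: lowercase alphanumeric words joined by single hyphens.
_SKILL_NAME = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_skill_name(name: str) -> bool:
    return _SKILL_NAME.fullmatch(name) is not None


def slugify(title: str) -> str:
    """Derive a skill name from a human-readable title."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


print(slugify("Comprehensive Research Agent"))  # → comprehensive-research-agent
print(is_valid_skill_name("My Skill"))          # → False
```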
468
+
469
+ ---
470
+
471
+ ## CLI Usage
472
+
473
+ ```bash
474
+ # Capture a reasoning trace
475
+ rto capture "Explain interleaved thinking" -s "You are an AI researcher."
476
+
477
+ # Analyze a task and output results
478
+ rto analyze "Debug this code snippet" -o analysis.txt
479
+
480
+ # Run full optimization loop
481
+ rto optimize "Research AI papers" --max-iterations 5 --generate-skill
482
+
483
+ # Generate skill from previous optimization
484
+ rto generate-skill my-skill-name --artifacts-dir ./optimization_artifacts
485
+ ```
486
+
487
+ ---
488
+
489
+ ## Real-World Sources Used
490
+
491
+ Example 3 uses real documentation URLs for realistic simulation:
492
+
493
+ | Source | URL |
494
+ |--------|-----|
495
+ | Anthropic Docs | `docs.anthropic.com/en/docs/build-with-claude/*` |
496
+ | Anthropic Research | `anthropic.com/research/building-effective-agents` |
497
+ | OpenAI Docs | `platform.openai.com/docs/guides/*` |
498
+ | MiniMax M2.1 | `minimax.io/platform/docs/M2.1` |
499
+ | DAIR.AI | `promptingguide.ai/techniques` |
500
+ | LangChain | `python.langchain.com/docs/how_to/debugging` |
501
+ | arXiv Papers | `arxiv.org/abs/2307.03172` (Lost in the Middle) |
502
+
503
+ ---
504
+
505
+ ## Robustness Features
506
+
507
+ The optimizer includes several safeguards to handle real-world variability:
508
+
509
+ ### Parsing Resilience
510
+
511
+ LLM responses are not always valid JSON. The system handles this gracefully:
512
+
513
+ | Component | Fallback Behavior |
514
+ |-----------|-------------------|
515
+ | **Analyzer** | Extracts scores via regex patterns when JSON fails; defaults to 50/100 (not 0) |
516
+ | **Optimizer** | Multi-strategy prompt extraction: JSON → regex → marker detection → code blocks |
517
+ | **Loop** | Warns when final prompt is unchanged; tracks best-performing iteration |
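The analyzer's fallback chain (JSON, then regex, then a neutral default) could look roughly like the sketch below. The key name `overall_score`, the regular expression, and the function name `extract_score` are assumptions for illustration; only the 50/100 default is taken from the table above.

```python
import json
import re

DEFAULT_SCORE = 50.0  # neutral fallback, as described in the table above

# Assumed pattern: the word "score" followed closely by a number.
_SCORE_RE = re.compile(
    r"(?:overall[_ ]score|score)\D{0,10}(\d{1,3}(?:\.\d+)?)", re.IGNORECASE
)


def _clamp(value: float) -> float:
    return max(0.0, min(100.0, value))


def extract_score(raw: str) -> float:
    # 1. Strict JSON parse.
    try:
        data = json.loads(raw)
        if isinstance(data, dict) and "overall_score" in data:
            return _clamp(float(data["overall_score"]))
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError
        pass
    # 2. Regex over the raw text (handles truncated or chatty responses).
    match = _SCORE_RE.search(raw)
    if match:
        return _clamp(float(match.group(1)))
    # 3. Neutral default instead of 0.
    return DEFAULT_SCORE


print(extract_score('{"overall_score": 72}'))             # → 72.0
print(extract_score("Overall score: 64/100, 3 patterns"))  # → 64.0
print(extract_score("I could not produce an analysis."))   # → 50.0
```

A neutral default keeps one bad parse from looking like a catastrophic regression to the loop.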
518
+
519
+ ### Extended Test Results (10 iterations)
520
+
521
+ Real-world testing revealed important insights:
522
+
523
+ ```
524
+ Iteration Score Patterns Tool Calls Notes
525
+ ────────────────────────────────────────────────
526
+ 1 69/100 4 22 Baseline
527
+ 2 66/100 3 14 -
528
+ 3 61/100 3 17 -
529
+ 4 72/100 3 20 ← Best score
530
+ 5 59/100 4 16 -
531
+ 6 50/100* 0 15 *Parser fallback activated
532
+ 7 70/100 3 12 Recovery
533
+ 8 64/100 3 14 -
534
+ 9 64/100 3 18 -
535
+ 10 70/100 3 19 Final
536
+
537
+ * Iteration 6: JSON parsing failed, fallback returned neutral score
538
+ ```
539
+
540
+ **Key Learnings:**
541
+ - Scores fluctuate by up to about 15 points between consecutive iterations due to stochastic model behavior (largest swing outside the parser fallback: 72 → 59)
542
+ - Best score (72) was achieved mid-run, not at the end
543
+ - `use_best_prompt=True` correctly selected iteration 4's prompt
544
+ - Parsing failures are now handled gracefully instead of returning a score of 0
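These observations can be checked directly against the table. The short calculation below uses the ten scores as listed and treats iteration 6 as a parser-fallback value to exclude when measuring swings; the variable names are illustrative.

```python
scores = [69, 66, 61, 72, 59, 50, 70, 64, 64, 70]  # iterations 1..10
fallback_iterations = {6}                          # neutral 50 from the parser fallback

# Best-performing iteration (1-based).
best_iter = max(range(len(scores)), key=scores.__getitem__) + 1

# Score change between each pair of consecutive iterations.
swings = [(i + 1, i + 2, scores[i + 1] - scores[i]) for i in range(len(scores) - 1)]

# Ignore pairs that touch the fallback iteration.
clean = [s for s in swings
         if s[0] not in fallback_iterations and s[1] not in fallback_iterations]
largest = max(clean, key=lambda s: abs(s[2]))

print(best_iter, scores[best_iter - 1])  # → 4 72
print(largest)                           # → (4, 5, -13)
```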
545
+
546
+ ---
547
+
548
+ ## Architecture
549
+
550
+ ```
551
+ reasoning_trace_optimizer/
552
+ ├── __init__.py # Public API exports
553
+ ├── models.py # Data models (Pydantic)
554
+ │ ├── ThinkingBlock # Single reasoning segment
555
+ │ ├── ToolCall # Tool invocation record
556
+ │ ├── ReasoningTrace # Complete execution trace
557
+ │ ├── Pattern # Detected failure pattern
558
+ │ ├── AnalysisResult # Full analysis output
559
+ │ └── LoopResult # Optimization loop result
560
+ ├── capture.py # TraceCapture - M2.1 API wrapper
561
+ ├── analyzer.py # TraceAnalyzer - Pattern detection (with fallback parsing)
562
+ ├── optimizer.py # PromptOptimizer - Prompt improvement (with fallback extraction)
563
+ ├── loop.py # OptimizationLoop - Full cycle (with best-score tracking)
564
+ ├── skill_generator.py # SkillGenerator - Create skills
565
+ └── cli.py # Command-line interface
566
+ ```
567
+
568
+ ---
569
+
570
+ ## Integration
571
+
572
+ ### Claude Code Skill
573
+
574
+ This project includes a Claude Code skill (`SKILL.md`) enabling:
575
+
576
+ - **Auto-trigger on failure** - Analyze when agent tasks fail
577
+ - **On-demand analysis** - Use `/reasoning-trace-optimizer` command
578
+ - **Session analysis** - Analyze thinking from current conversation
579
+
580
+ ### Python Library
581
+
582
+ ```python
583
+ from reasoning_trace_optimizer import (
584
+     TraceCapture,
585
+     TraceAnalyzer,
586
+     PromptOptimizer,
587
+     OptimizationLoop,
588
+     LoopConfig,
589
+     SkillGenerator,
590
+ )
591
+ ```
592
+
593
+ ---
594
+
595
+ ## Contributing
596
+
597
+ This project is part of the [Agent Skills for Context Engineering](https://github.com/muratcankoylan/Agent-Skills-for-Context-Engineering) collection.
598
+
599
+ ---
600
+
601
+ ## License
602
+
603
+ MIT License
604
+
605
+ ---
606
+
607
+ ## References
608
+
609
+ - [MiniMax M2.1 Documentation](https://www.minimax.io/platform/docs)
610
+ - [MiniMax API Reference](https://www.minimax.io/platform/docs/M2.1)
611
+ - [Interleaved Thinking Guide](./docs/interleavedthinking.md)
612
+ - [Agent Generalization Research](./docs/agentthinking.md)
613
+ - [Anthropic API Compatibility](./docs/m2-1.md)
614
+
615
+ ---
616
+
617
+ <p align="center">
618
+ <strong>Built in partnership with MiniMax AI</strong><br>
619
+ Showcasing the power of interleaved thinking for agent debugging
620
+ </p>