@rfxlamia/skillkit 1.1.0 → 1.2.0

Files changed (247)
  1. package/agents/agents/creative-copywriter.md +212 -0
  2. package/agents/agents/dario-amodei.md +135 -0
  3. package/agents/agents/doc-simplifier.md +63 -0
  4. package/agents/agents/kotlin-pro.md +433 -0
  5. package/agents/agents/red-team.md +136 -0
  6. package/agents/agents/sam-altman.md +121 -0
  7. package/agents/agents/seo-manager.md +184 -0
  8. package/package.json +1 -1
  9. package/skills/skillkit-help/SKILL.md +81 -0
  10. package/skills/skillkit-help/knowledge/application/09-case-studies.md +257 -0
  11. package/skills/skillkit-help/knowledge/application/12-testing-and-validation.md +276 -0
  12. package/skills/skillkit-help/knowledge/foundation/01-why-skills-exist.md +246 -0
  13. package/skills/skillkit-help/knowledge/foundation/02-skills-vs-subagents-comparison.md +312 -0
  14. package/skills/skillkit-help/knowledge/foundation/03-skills-vs-subagents-decision-tree.md +346 -0
  15. package/skills/skillkit-help/knowledge/foundation/06-platform-constraints.md +237 -0
  16. package/skills/skillkit-help/knowledge/foundation/08-when-not-to-use-skills.md +270 -0
  17. package/skills/skillkit-help/template/SKILL.md +52 -0
  18. package/skills/skills/adversarial-review/SKILL.md +219 -0
  19. package/skills/skills/baby-education/SKILL.md +260 -0
  20. package/skills/skills/baby-education/references/advanced-techniques.md +323 -0
  21. package/skills/skills/baby-education/references/transformations.md +345 -0
  22. package/skills/skills/been-there-done-that/SKILL.md +455 -0
  23. package/skills/skills/been-there-done-that/references/analysis-patterns.md +162 -0
  24. package/skills/skills/been-there-done-that/references/git-commands.md +132 -0
  25. package/skills/skills/been-there-done-that/references/tree-insertion-logic.md +145 -0
  26. package/skills/skills/coolhunter/SKILL.md +270 -0
  27. package/skills/skills/coolhunter/assets/elicitation-methods.csv +51 -0
  28. package/skills/skills/coolhunter/knowledge/elicitation-methods.md +312 -0
  29. package/skills/skills/coolhunter/references/workflow-execution.md +238 -0
  30. package/skills/skills/coolhunter/workflow-plan-coolhunter.md +232 -0
  31. package/skills/skills/creative-copywriting/SKILL.md +324 -0
  32. package/skills/skills/creative-copywriting/databases/README.md +60 -0
  33. package/skills/skills/creative-copywriting/databases/carousel-structures.csv +16 -0
  34. package/skills/skills/creative-copywriting/databases/emotional-arcs.csv +11 -0
  35. package/skills/skills/creative-copywriting/databases/hook-formulas.csv +51 -0
  36. package/skills/skills/creative-copywriting/databases/power-words.csv +201 -0
  37. package/skills/skills/creative-copywriting/databases/psychological-triggers.csv +21 -0
  38. package/skills/skills/creative-copywriting/databases/read-more-patterns.csv +26 -0
  39. package/skills/skills/creative-copywriting/databases/swipe-triggers.csv +31 -0
  40. package/skills/skills/creative-copywriting/references/carousel-psychology.md +223 -0
  41. package/skills/skills/creative-copywriting/references/hook-anatomy.md +169 -0
  42. package/skills/skills/creative-copywriting/references/power-word-science.md +134 -0
  43. package/skills/skills/creative-copywriting/references/storytelling-frameworks.md +157 -0
  44. package/skills/skills/diverse-content-gen/SKILL.md +201 -0
  45. package/skills/skills/diverse-content-gen/references/advanced-techniques.md +320 -0
  46. package/skills/skills/diverse-content-gen/references/research-findings.md +379 -0
  47. package/skills/skills/diverse-content-gen/references/task-workflows.md +241 -0
  48. package/skills/skills/diverse-content-gen/references/tool-integration.md +419 -0
  49. package/skills/skills/diverse-content-gen/references/troubleshooting.md +426 -0
  50. package/skills/skills/diverse-content-gen/references/vs-core-technique.md +240 -0
  51. package/skills/skills/framework-critical-thinking/SKILL.md +220 -0
  52. package/skills/skills/framework-critical-thinking/references/bias_detector.md +375 -0
  53. package/skills/skills/framework-critical-thinking/references/fallback_handler.md +239 -0
  54. package/skills/skills/framework-critical-thinking/references/memory_curator.md +161 -0
  55. package/skills/skills/framework-critical-thinking/references/metacognitive_monitor.md +297 -0
  56. package/skills/skills/framework-critical-thinking/references/producer_critic_orchestrator.md +333 -0
  57. package/skills/skills/framework-critical-thinking/references/reasoning_router.md +235 -0
  58. package/skills/skills/framework-critical-thinking/references/reasoning_validator.md +97 -0
  59. package/skills/skills/framework-critical-thinking/references/reflection_trigger.md +78 -0
  60. package/skills/skills/framework-critical-thinking/references/self_verification.md +388 -0
  61. package/skills/skills/framework-critical-thinking/references/uncertainty_quantifier.md +207 -0
  62. package/skills/skills/framework-initiative/SKILL.md +231 -0
  63. package/skills/skills/framework-initiative/references/examples.md +150 -0
  64. package/skills/skills/framework-initiative/references/impact-analysis.md +157 -0
  65. package/skills/skills/framework-initiative/references/intent-patterns.md +145 -0
  66. package/skills/skills/framework-initiative/references/star-framework.md +165 -0
  67. package/skills/skills/humanize-docs/SKILL.md +203 -0
  68. package/skills/skills/humanize-docs/references/advanced-techniques.md +13 -0
  69. package/skills/skills/humanize-docs/references/core-transformations.md +368 -0
  70. package/skills/skills/humanize-docs/references/detection-patterns.md +400 -0
  71. package/skills/skills/humanize-docs/references/examples-gallery.md +374 -0
  72. package/skills/skills/imagine/SKILL.md +190 -0
  73. package/skills/skills/imagine/references/artstyle-corporate-memphis.md +625 -0
  74. package/skills/skills/imagine/references/artstyle-crewdson-hyperrealism.md +295 -0
  75. package/skills/skills/imagine/references/artstyle-iphone-social-media.md +426 -0
  76. package/skills/skills/imagine/references/artstyle-sciencesaru.md +276 -0
  77. package/skills/skills/pre-deploy-checklist/README.md +26 -0
  78. package/skills/skills/pre-deploy-checklist/SKILL.md +153 -0
  79. package/skills/skills/pre-deploy-checklist/references/checklist-categories.md +174 -0
  80. package/skills/skills/pre-deploy-checklist/references/domain-prompts.md +216 -0
  81. package/skills/skills/prompt-engineering/SKILL.md +209 -0
  82. package/skills/skills/prompt-engineering/references/advanced-combinations.md +444 -0
  83. package/skills/skills/prompt-engineering/references/chain-of-thought.md +140 -0
  84. package/skills/skills/prompt-engineering/references/decision_matrix.md +220 -0
  85. package/skills/skills/prompt-engineering/references/few-shot.md +346 -0
  86. package/skills/skills/prompt-engineering/references/json-format.md +270 -0
  87. package/skills/skills/prompt-engineering/references/natural-language.md +420 -0
  88. package/skills/skills/prompt-engineering/references/pitfalls.md +365 -0
  89. package/skills/skills/prompt-engineering/references/prompt-chaining.md +498 -0
  90. package/skills/skills/prompt-engineering/references/react.md +108 -0
  91. package/skills/skills/prompt-engineering/references/self-consistency.md +322 -0
  92. package/skills/skills/prompt-engineering/references/tree-of-thoughts.md +386 -0
  93. package/skills/skills/prompt-engineering/references/xml-format.md +220 -0
  94. package/skills/skills/prompt-engineering/references/yaml-format.md +488 -0
  95. package/skills/skills/prompt-engineering/references/zero-shot.md +74 -0
  96. package/skills/skills/quick-spec/SKILL.md +280 -0
  97. package/skills/skills/quick-spec/assets/tech-spec-template.md +74 -0
  98. package/skills/skills/quick-spec/references/step-01-understand.md +189 -0
  99. package/skills/skills/quick-spec/references/step-02-investigate.md +144 -0
  100. package/skills/skills/quick-spec/references/step-03-generate.md +128 -0
  101. package/skills/skills/quick-spec/references/step-04-review.md +173 -0
  102. package/skills/skills/quick-spec/tests/__pycache__/test_skill.cpython-314-pytest-9.0.2.pyc +0 -0
  103. package/skills/skills/quick-spec/tests/test_scenarios.md +83 -0
  104. package/skills/skills/quick-spec/tests/test_skill.py +136 -0
  105. package/skills/skills/readme-expert/SKILL.md +538 -0
  106. package/skills/skills/readme-expert/knowledge/INDEX.md +192 -0
  107. package/skills/skills/readme-expert/knowledge/application/quality-standards.md +470 -0
  108. package/skills/skills/readme-expert/knowledge/application/script-executor.md +604 -0
  109. package/skills/skills/readme-expert/knowledge/application/template-library.md +822 -0
  110. package/skills/skills/readme-expert/knowledge/foundation/codebase-scanner.md +361 -0
  111. package/skills/skills/readme-expert/knowledge/foundation/validation-checklist.md +481 -0
  112. package/skills/skills/red-teaming/SKILL.md +321 -0
  113. package/skills/skills/red-teaming/references/ai-llm-redteam.md +517 -0
  114. package/skills/skills/red-teaming/references/attack-techniques.md +410 -0
  115. package/skills/skills/red-teaming/references/cybersecurity-redteam.md +383 -0
  116. package/skills/skills/red-teaming/references/tools-frameworks.md +446 -0
  117. package/skills/skills/releasing/.skillkit-mode +1 -0
  118. package/skills/skills/releasing/SKILL.md +225 -0
  119. package/skills/skills/releasing/references/version-detection.md +108 -0
  120. package/skills/skills/screenwriter/SKILL.md +273 -0
  121. package/skills/skills/screenwriter/references/advanced-techniques.md +216 -0
  122. package/skills/skills/screenwriter/references/pipeline-integration.md +266 -0
  123. package/skills/skills/skillkit/.claude/settings.local.json +7 -0
  124. package/skills/skills/skillkit/.claude-plugin/plugin.json +27 -0
  125. package/skills/skills/skillkit/CHANGELOG.md +484 -0
  126. package/skills/skills/skillkit/SKILL.md +511 -0
  127. package/skills/skills/skillkit/commands/skillkit.md +6 -0
  128. package/skills/skills/skillkit/commands/validate-plan.md +6 -0
  129. package/skills/skills/skillkit/commands/verify.md +6 -0
  130. package/skills/skills/skillkit/knowledge/INDEX.md +352 -0
  131. package/skills/skills/skillkit/knowledge/application/09-case-studies.md +257 -0
  132. package/skills/skills/skillkit/knowledge/application/10-technical-architecture.md +324 -0
  133. package/skills/skills/skillkit/knowledge/application/11-adoption-strategy.md +267 -0
  134. package/skills/skills/skillkit/knowledge/application/12-testing-and-validation.md +276 -0
  135. package/skills/skills/skillkit/knowledge/application/13-competitive-landscape.md +198 -0
  136. package/skills/skills/skillkit/knowledge/foundation/01-why-skills-exist.md +246 -0
  137. package/skills/skills/skillkit/knowledge/foundation/02-skills-vs-subagents-comparison.md +312 -0
  138. package/skills/skills/skillkit/knowledge/foundation/03-skills-vs-subagents-decision-tree.md +346 -0
  139. package/skills/skills/skillkit/knowledge/foundation/04-hybrid-patterns.md +308 -0
  140. package/skills/skills/skillkit/knowledge/foundation/05-token-economics.md +275 -0
  141. package/skills/skills/skillkit/knowledge/foundation/06-platform-constraints.md +237 -0
  142. package/skills/skills/skillkit/knowledge/foundation/07-security-concerns.md +322 -0
  143. package/skills/skills/skillkit/knowledge/foundation/08-when-not-to-use-skills.md +270 -0
  144. package/skills/skills/skillkit/knowledge/plugin-guide.md +614 -0
  145. package/skills/skills/skillkit/knowledge/tools/14-validation-tools-guide.md +150 -0
  146. package/skills/skills/skillkit/knowledge/tools/15-cost-tools-guide.md +157 -0
  147. package/skills/skills/skillkit/knowledge/tools/16-security-tools-guide.md +122 -0
  148. package/skills/skills/skillkit/knowledge/tools/17-pattern-tools-guide.md +161 -0
  149. package/skills/skills/skillkit/knowledge/tools/18-decision-helper-guide.md +243 -0
  150. package/skills/skills/skillkit/knowledge/tools/19-test-generator-guide.md +275 -0
  151. package/skills/skills/skillkit/knowledge/tools/20-split-skill-guide.md +149 -0
  152. package/skills/skills/skillkit/knowledge/tools/21-quality-scorer-guide.md +226 -0
  153. package/skills/skills/skillkit/knowledge/tools/22-migration-helper-guide.md +356 -0
  154. package/skills/skills/skillkit/knowledge/tools/23-subagent-creation-guide.md +448 -0
  155. package/skills/skills/skillkit/knowledge/tools/24-behavioral-testing-guide.md +122 -0
  156. package/skills/skills/skillkit/references/proposal-generation.md +982 -0
  157. package/skills/skills/skillkit/references/rationalization-catalog.md +75 -0
  158. package/skills/skills/skillkit/references/research-methodology.md +661 -0
  159. package/skills/skills/skillkit/references/section-2-full-creation-workflow.md +452 -0
  160. package/skills/skills/skillkit/references/section-3-validation-workflow-existing-skill.md +63 -0
  161. package/skills/skills/skillkit/references/section-4-decision-workflow-skills-vs-subagents.md +64 -0
  162. package/skills/skills/skillkit/references/section-5-migration-workflow-doc-to-skill.md +58 -0
  163. package/skills/skills/skillkit/references/section-6-subagent-creation-workflow.md +499 -0
  164. package/skills/skills/skillkit/references/section-7-knowledge-reference-map.md +72 -0
  165. package/skills/skills/skillkit/scripts/__pycache__/decision_helper.cpython-314.pyc +0 -0
  166. package/skills/skills/skillkit/scripts/__pycache__/quick_validate.cpython-312.pyc +0 -0
  167. package/skills/skills/skillkit/scripts/__pycache__/quick_validate.cpython-314.pyc +0 -0
  168. package/skills/skills/skillkit/scripts/__pycache__/test_generator.cpython-314-pytest-9.0.2.pyc +0 -0
  169. package/skills/skills/skillkit/scripts/decision_helper.py +799 -0
  170. package/skills/skills/skillkit/scripts/init_skill.py +400 -0
  171. package/skills/skills/skillkit/scripts/init_subagent.py +231 -0
  172. package/skills/skills/skillkit/scripts/migration_helper.py +669 -0
  173. package/skills/skills/skillkit/scripts/package_skill.py +211 -0
  174. package/skills/skills/skillkit/scripts/pattern_detector.py +381 -0
  175. package/skills/skills/skillkit/scripts/pattern_detector_new.py +382 -0
  176. package/skills/skills/skillkit/scripts/pressure_tester.py +157 -0
  177. package/skills/skills/skillkit/scripts/quality_scorer.py +999 -0
  178. package/skills/skills/skillkit/scripts/quick_validate.py +100 -0
  179. package/skills/skills/skillkit/scripts/security_scanner.py +474 -0
  180. package/skills/skills/skillkit/scripts/split_skill.py +540 -0
  181. package/skills/skills/skillkit/scripts/test_generator.py +695 -0
  182. package/skills/skills/skillkit/scripts/token_estimator.py +493 -0
  183. package/skills/skills/skillkit/scripts/utils/__init__.py +49 -0
  184. package/skills/skills/skillkit/scripts/utils/__pycache__/__init__.cpython-312.pyc +0 -0
  185. package/skills/skills/skillkit/scripts/utils/__pycache__/__init__.cpython-314.pyc +0 -0
  186. package/skills/skills/skillkit/scripts/utils/__pycache__/budget_tracker.cpython-312.pyc +0 -0
  187. package/skills/skills/skillkit/scripts/utils/__pycache__/budget_tracker.cpython-314.pyc +0 -0
  188. package/skills/skills/skillkit/scripts/utils/__pycache__/output_formatter.cpython-312.pyc +0 -0
  189. package/skills/skills/skillkit/scripts/utils/__pycache__/output_formatter.cpython-314.pyc +0 -0
  190. package/skills/skills/skillkit/scripts/utils/__pycache__/reference_validator.cpython-312.pyc +0 -0
  191. package/skills/skills/skillkit/scripts/utils/__pycache__/reference_validator.cpython-314.pyc +0 -0
  192. package/skills/skills/skillkit/scripts/utils/budget_tracker.py +388 -0
  193. package/skills/skills/skillkit/scripts/utils/output_formatter.py +263 -0
  194. package/skills/skills/skillkit/scripts/utils/reference_validator.py +401 -0
  195. package/skills/skills/skillkit/scripts/validate_skill.py +594 -0
  196. package/skills/skills/skillkit/tests/test_behavioral.py +39 -0
  197. package/skills/skills/skillkit/tests/test_scenarios.md +83 -0
  198. package/skills/skills/skillkit/tests/test_skill.py +136 -0
  199. package/skills/skills/skillkit-help/SKILL.md +81 -0
  200. package/skills/skills/skillkit-help/knowledge/application/09-case-studies.md +257 -0
  201. package/skills/skills/skillkit-help/knowledge/application/12-testing-and-validation.md +276 -0
  202. package/skills/skills/skillkit-help/knowledge/foundation/01-why-skills-exist.md +246 -0
  203. package/skills/skills/skillkit-help/knowledge/foundation/02-skills-vs-subagents-comparison.md +312 -0
  204. package/skills/skills/skillkit-help/knowledge/foundation/03-skills-vs-subagents-decision-tree.md +346 -0
  205. package/skills/skills/skillkit-help/knowledge/foundation/06-platform-constraints.md +237 -0
  206. package/skills/skills/skillkit-help/knowledge/foundation/08-when-not-to-use-skills.md +270 -0
  207. package/skills/skills/skillkit-help/template/SKILL.md +52 -0
  208. package/skills/skills/social-media-seo/SKILL.md +278 -0
  209. package/skills/skills/social-media-seo/databases/caption-styles.csv +31 -0
  210. package/skills/skills/social-media-seo/databases/engagement-tactics.csv +16 -0
  211. package/skills/skills/social-media-seo/databases/hashtag-strategies.csv +21 -0
  212. package/skills/skills/social-media-seo/databases/hook-formulas.csv +26 -0
  213. package/skills/skills/social-media-seo/databases/keyword-clusters.csv +11 -0
  214. package/skills/skills/social-media-seo/databases/thread-structures.csv +26 -0
  215. package/skills/skills/social-media-seo/databases/viral-patterns.csv +21 -0
  216. package/skills/skills/social-media-seo/references/analytics-guide.md +321 -0
  217. package/skills/skills/social-media-seo/references/instagram-seo.md +235 -0
  218. package/skills/skills/social-media-seo/references/threads-seo.md +305 -0
  219. package/skills/skills/social-media-seo/references/x-twitter-seo.md +337 -0
  220. package/skills/skills/social-media-seo/scripts/query_database.py +191 -0
  221. package/skills/skills/storyteller/SKILL.md +241 -0
  222. package/skills/skills/storyteller/references/transformation-methodology.md +293 -0
  223. package/skills/skills/storyteller/references/visual-vocabulary.md +177 -0
  224. package/skills/skills/thread-pro/SKILL.md +162 -0
  225. package/skills/skills/thread-pro/anti-ai-patterns.md +120 -0
  226. package/skills/skills/thread-pro/hook-formulas.md +138 -0
  227. package/skills/skills/thread-pro/references/anti-ai-patterns.md +120 -0
  228. package/skills/skills/thread-pro/references/hook-formulas.md +138 -0
  229. package/skills/skills/thread-pro/references/thread-structures.md +240 -0
  230. package/skills/skills/thread-pro/references/voice-injection.md +130 -0
  231. package/skills/skills/thread-pro/thread-structures.md +240 -0
  232. package/skills/skills/thread-pro/voice-injection.md +130 -0
  233. package/skills/skills/tinkering/SKILL.md +251 -0
  234. package/skills/skills/tinkering/references/graduation-checklist.md +100 -0
  235. package/skills/skills/validate-plan/.skillkit-mode +1 -0
  236. package/skills/skills/validate-plan/SKILL.md +406 -0
  237. package/skills/skills/validate-plan/references/dry-principles.md +251 -0
  238. package/skills/skills/validate-plan/references/gap-analysis-guide.md +320 -0
  239. package/skills/skills/validate-plan/references/tdd-patterns.md +413 -0
  240. package/skills/skills/validate-plan/references/yagni-checklist.md +330 -0
  241. package/skills/skills/verify-before-ship/.skillkit-mode +1 -0
  242. package/skills/skills/verify-before-ship/SKILL.md +116 -0
  243. package/skills/skills/verify-before-ship/references/anti-rationalization.md +212 -0
  244. package/skills/skills/verify-before-ship/references/verification-gates.md +305 -0
  245. package/skills-manifest.json +8 -2
  246. package/src/picker.js +11 -5
  247. package/src/picker.test.js +36 -1
@@ -0,0 +1,426 @@
1
+ # VS Troubleshooting Guide
2
+
3
+ **Purpose:** Solutions to common VS issues and error patterns
4
+
5
+ **Load when:** VS execution fails, outputs unsatisfactory, or errors encountered
6
+
7
+ ---
8
+
9
+ ## Issue 1: JSON Parsing Failures
10
+
11
+ ### Symptoms
12
+ - LLM returns explanation text before/after JSON
13
+ - Invalid JSON structure
14
+ - Missing quotes or brackets
15
+
16
+ ### Root Cause
17
+ Model not following "ONLY JSON" instruction strictly
18
+
19
+ ### Solutions
20
+
21
+ #### Solution A: Emphasize JSON-Only Output
22
+ ```
23
+ [Add to VS prompt, in bold/caps]
24
+ **CRITICAL: Give ONLY the JSON object with no explanations, no extra text before or after.**
25
+
26
+ Expected format EXACTLY:
27
+ {"responses": [{"text": "...", "probability": 0.XX}, ...]}
28
+ ```
29
+
30
+ #### Solution B: Structured Output Mode
31
+ ```
32
+ # If model supports structured output
33
+ Use structured output API with schema:
34
+ {
35
+   "type": "object",
36
+   "properties": {
37
+     "responses": {
38
+       "type": "array",
39
+       "items": {
40
+         "type": "object",
41
+         "properties": {
42
+           "text": {"type": "string"},
43
+           "probability": {"type": "number", "minimum": 0, "maximum": 1}
44
+         }
45
+       }
46
+     }
47
+   }
48
+ }
49
+ ```
50
+
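When no structured output API is available, the same shape can be checked by hand after parsing. The sketch below is a minimal illustration, not part of the skill: `validate_vs_payload` is a hypothetical helper, and it is stricter than the schema above in that it treats `text` and `probability` as required fields.

```python
def validate_vs_payload(payload):
    """Return a list of problems found in a parsed VS response (empty if none)."""
    problems = []
    responses = payload.get("responses") if isinstance(payload, dict) else None
    if not isinstance(responses, list):
        return ['top-level "responses" array is missing']
    for i, item in enumerate(responses):
        if not isinstance(item, dict):
            problems.append(f"item {i}: not an object")
            continue
        if not isinstance(item.get("text"), str):
            problems.append(f"item {i}: 'text' must be a string")
        p = item.get("probability")
        # bool is a subclass of int in Python, so exclude it explicitly
        if isinstance(p, bool) or not isinstance(p, (int, float)):
            problems.append(f"item {i}: 'probability' must be a number")
        elif not 0 <= p <= 1:
            problems.append(f"item {i}: 'probability' {p} is outside [0, 1]")
    return problems
```

An empty list means the payload matches the expected shape; any entries can be logged or fed back to the model in a retry prompt.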
51
+ #### Solution C: Regex Extraction
52
+ ```python
53
+ # Fallback: Extract JSON from mixed text
54
+ import re
55
+ import json
56
+
57
+ def extract_json(text):
58
+     # Find JSON object in text
59
+     match = re.search(r'\{[\s\S]*"responses"[\s\S]*\}', text)
60
+     if match:
61
+         return json.loads(match.group(0))
62
+     raise ValueError("No valid JSON found")
63
+ ```
64
+
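The pattern in Solution C is greedy: it spans from the first `{` to the last `}` in the text, so a stray brace in trailing commentary makes `json.loads` fail. One possible alternative, sketched here under the assumption that the payload is the first JSON object containing a `responses` key, is to let the standard library's `json.JSONDecoder.raw_decode` find where the object ends. The name `extract_vs_json` is hypothetical.

```python
import json

def extract_vs_json(text):
    """Return the first JSON object in `text` that has a "responses" key."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and "responses" in obj:
            return obj
        # Not a usable object here; try the next opening brace.
        start = text.find("{", start + 1)
    raise ValueError("No valid JSON found")
```

Because the decoder stops at the end of the first complete object, text after the payload no longer matters, even if it contains braces.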
65
+ ### Prevention
66
+ Always include explicit "ONLY JSON" instruction in every VS prompt
67
+
68
+ ---
69
+
70
+ ## Issue 2: Probabilities Don't Make Sense
71
+
72
+ ### Symptoms
73
+ - All probabilities are identical (e.g., all 0.20)
74
+ - Probabilities sum to unexpected values (2.5, 10.0, etc.)
75
+ - Negative probabilities or values > 1.0
76
+
77
+ ### Root Cause
78
+ Model estimating probabilities imperfectly (expected behavior)
79
+
80
+ ### Understanding
81
+
82
+ **Important:** Probabilities in VS are **estimates**, not ground truth.
83
+
84
+ ✅ **What to trust:**
85
+ - Relative ordering (higher p = more typical)
86
+ - General magnitude (0.08 vs 0.01 = significant difference)
87
+
88
+ ❌ **What NOT to expect:**
89
+ - Perfect calibration
90
+ - Probabilities summing to exactly 1.0
91
+ - Absolute precision
92
+
93
+ ### Solutions
94
+
95
+ #### Solution A: Focus on Relative Ranking
96
+ ```python
97
+ # Sort by probability, ignore absolute values
98
+ candidates.sort(key=lambda x: x["probability"], reverse=True)
99
+ # Present: highest p = most typical, lowest p = most creative
100
+ ```
101
+
102
+ #### Solution B: Normalize If Needed
103
+ ```python
104
+ # Only if required for downstream use
105
+ total = sum(c["probability"] for c in candidates)
106
+ for c in candidates:
107
+     c["probability_normalized"] = c["probability"] / total
108
+ ```
109
+
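The normalization in Solution B divides by the raw total, which breaks if every estimate is zero and gives odd results when the model returns negative values (one of the symptoms listed above). A slightly more defensive sketch follows; `normalize_probabilities` is a hypothetical helper, and clamping negatives to zero and falling back to a uniform split are assumptions, not behaviour described by the skill.

```python
def normalize_probabilities(candidates):
    """Return copies of candidates with a 'probability_normalized' field."""
    # Clamp negatives to 0; values above 1 are kept because only the
    # relative sizes matter once everything is divided by the total.
    cleaned = [max(float(c["probability"]), 0.0) for c in candidates]
    total = sum(cleaned)
    result = []
    for c, p in zip(candidates, cleaned):
        # If every estimate is 0, fall back to a uniform split.
        share = p / total if total > 0 else 1.0 / len(candidates)
        result.append({**c, "probability_normalized": share})
    return result
```

The original dictionaries are left untouched, so the raw estimates remain available for ranking.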
110
+ #### Solution C: Don't Show Probabilities to User
111
+ ```
112
+ # Simple presentation (hide probabilities)
113
+ Here are 5 diverse options:
114
+ 1. [Option 1 text]
115
+ 2. [Option 2 text]
116
+ ...
117
+ ```
118
+
119
+ ### Prevention
120
+ Set expectation: probabilities are guidance, not precise measurements
121
+
122
+ ---
123
+
124
+ ## Issue 3: Outputs Still Too Similar
125
+
126
+ ### Symptoms
127
+ - VS outputs lack diversity despite using technique
128
+ - All variations sound alike
129
+ - No meaningful angle/style differences
130
+
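"Sound alike" can be made measurable before tuning anything. The sketch below uses word-level Jaccard similarity as a rough proxy; this metric is an assumption chosen for simplicity, not something the VS technique prescribes, and both function names are hypothetical.

```python
from itertools import combinations

def jaccard(a, b):
    """Word-level Jaccard similarity between two strings (0.0 to 1.0)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)

def mean_pairwise_similarity(texts):
    """Average Jaccard similarity over all pairs; higher means less diverse."""
    pairs = list(combinations(texts, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
```

Comparing this score before and after a parameter change gives a quick signal of whether the change helped. It only captures word overlap, so outputs that share vocabulary but differ in tone will still look similar.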
131
+ ### Root Causes
132
+ 1. Task itself has limited diversity space
133
+ 2. Parameters not tuned for diversity
134
+ 3. Model constraints (smaller models struggle more)
135
+
136
+ ### Solutions
137
+
138
+ #### Solution A: Lower Probability Threshold
139
+ ```
140
+ # Add to VS prompt
141
+ Randomly sample from the distribution, with probability of each response below 0.01.
142
+ ```
143
+
144
+ **Effect:** Samples more from tail (creative outputs)
145
+
146
+ #### Solution B: Increase k Value
147
+ ```
148
+ # Generate more candidates
149
+ k = 10 # Instead of 5
150
+ ```
151
+
152
+ **Effect:** Wider exploration of possibility space
153
+
154
+ #### Solution C: Add Explicit Diversity Instruction
155
+ ```
156
+ IMPORTANT: Ensure responses cover DIFFERENT:
157
+ - Tones: (humorous, professional, inspirational, casual)
158
+ - Perspectives: (beginner, expert, skeptic, enthusiast)
159
+ - Formats: (question, statement, story, instruction)
160
+
161
+ Avoid generating similar responses.
162
+ ```
163
+
164
+ #### Solution D: Use VS-CoT
165
+ ```
166
+ # Add reasoning step (see advanced-techniques.md)
167
+ Before generating, think through different angles...
168
+ ```
169
+
170
+ **Effect:** Model consciously diversifies
171
+
172
+ #### Solution E: Check Task Viability
173
+ ```
174
+ # Some tasks genuinely have limited diversity
175
+ Example: "Generate the capital of France" → Only 1 valid answer
176
+
177
+ Ask: Does this task have multiple valid approaches?
178
+ If NO → VS may not be appropriate
179
+ ```
180
+
181
+ ### Prevention
182
+ Start with k=5 and threshold=0.10. If diversity is insufficient, lower the threshold or raise k.
183
+
184
+ ---
185
+
186
+ ## Issue 4: Quality Drop on Complex Tasks
187
+
188
+ ### Symptoms
189
+ - VS outputs diverse but lower quality
190
+ - Errors, incoherence, or off-topic responses
191
+ - User prefers single-shot standard prompting quality
192
+
193
+ ### Root Cause
194
+ Diversity-quality tradeoff, especially with high k or low threshold
195
+
196
+ ### Solutions
197
+
198
+ #### Solution A: Use VS-Multi
199
+ ```
200
+ # See advanced-techniques.md
201
+ Round 1: VS (diversity)
202
+ Round 2: User selects favorites
203
+ Round 3: Refine selected (quality)
204
+ ```
205
+
206
+ **Effect:** Best of both worlds
207
+
208
+ #### Solution B: Add Quality Constraints
209
+ ```
210
+ Requirements for ALL responses:
211
+ - Professional tone maintained
212
+ - Grammatically correct
213
+ - On-topic and relevant
214
+ - No clichés or filler
215
+
216
+ Then generate {k} responses meeting these standards.
217
+ ```
218
+
219
+ #### Solution C: Reduce k or Raise Threshold
220
+ ```
221
+ # More conservative parameters
222
+ k = 3 # Instead of 10
223
+ threshold = 0.10 # Instead of 0.01
224
+ ```
225
+
226
+ **Effect:** Less aggressive diversity, better quality
227
+
228
+ #### Solution D: Use VS-CoT for Coherence
229
+ ```
230
+ # Reasoning step helps with complex tasks
231
+ # See advanced-techniques.md
232
+ ```
233
+
234
+ ### Prevention
235
+ For production content, default to VS-Multi workflow
236
+
237
+ ---
238
+
239
+ ## Issue 5: Model Not Following VS Format
240
+
241
+ ### Symptoms
242
+ - Returns single response instead of k responses
243
+ - Generates list without probabilities
244
+ - Ignores JSON format entirely
245
+
246
+ ### Root Causes
247
+ 1. Smaller/weaker models struggle with complex instructions
248
+ 2. Prompt too complex for model capabilities
249
+
250
+ ### Solutions
251
+
252
+ #### Solution A: Simplify Prompt
253
+ ```
254
+ # Minimal VS prompt
255
+ Generate 5 different responses to: {request}
256
+
257
+ Return JSON:
258
+ {"responses": [
259
+ {"text": "response 1", "probability": 0.X},
260
+ {"text": "response 2", "probability": 0.X},
261
+ ...
262
+ ]}
263
+
264
+ ONLY JSON, no extra text.
265
+ ```
266
+
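The minimal prompt can be assembled programmatically so that `k` and the request are never edited by hand. This is a sketch only: `build_minimal_vs_prompt` is a hypothetical helper, and the optional `threshold` argument reuses the tail-sampling sentence from Issue 3, Solution A.

```python
def build_minimal_vs_prompt(request, k=5, threshold=None):
    """Assemble the minimal VS prompt; `threshold` adds the tail-sampling line."""
    lines = [
        f"Generate {k} different responses to: {request}",
        "",
        "Return JSON:",
        '{"responses": [{"text": "...", "probability": 0.XX}, ...]}',
        "",
    ]
    if threshold is not None:
        lines.append(
            "Randomly sample from the distribution, with probability "
            f"of each response below {threshold}."
        )
    lines.append("ONLY JSON, no extra text.")
    return "\n".join(lines)
```

Calling `build_minimal_vs_prompt("Write a tagline", k=3)` yields the minimal form; passing `threshold=0.01` adds the tail-sampling instruction for more unusual outputs.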
267
+ #### Solution B: Use Stronger Model
268
+ ```
269
+ # Check model compatibility (see research-findings.md)
270
+ Recommended: GPT-4.1+, Claude 4+, Gemini 2.5+
271
+ Avoid: Models < 70B parameters
272
+ ```
273
+
274
+ #### Solution C: Fallback to Standard + Repetition
275
+ ```
276
+ # If VS fails, alternative:
277
+ for i in range(k):
278
+     response = standard_prompt_with_variation(request)
279
+     candidates.append(response)
280
+ ```
281
+
282
+ **Effect:** Less research-backed but more reliable for weak models
283
+
284
+ ### Prevention
285
+ Use frontier models (GPT-4.1+, Claude 4+, Gemini 2.5+) for best VS results
286
+
287
+ ---
288
+
289
+ ## Issue 6: VS Taking Too Long / Expensive
290
+
291
+ ### Symptoms
292
+ - High latency for VS calls
293
+ - Token costs exceeding budget
294
+ - User impatient with wait times
295
+
296
+ ### Root Causes
297
+ - High k value (10-20) requires long generation
298
+ - Multiple calls for large N
299
+ - Using expensive models
300
+
301
+ ### Solutions
302
+
303
+ #### Solution A: Reduce k
304
+ ```
305
+ k = 3 # Quick mode, still gets diversity benefit
306
+ ```
307
+
308
+ **Effect:** ~40% faster, 60% token cost reduction
309
+
310
+ #### Solution B: Batch Optimization
311
+ ```
312
+ # If generating for multiple prompts:
313
+ # Execute VS calls in parallel (if API allows)
314
+
315
+ import asyncio
316
+ results = await asyncio.gather(*vs_calls)
317
+ ```
318
+
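The `await` in Solution B only works inside a coroutine. A runnable version might look like the sketch below, where `run_vs` is a hypothetical stand-in for the real API call and `max_concurrent` is an assumed knob for staying under provider rate limits.

```python
import asyncio

async def run_vs(prompt):
    """Hypothetical stand-in for one asynchronous VS API call."""
    await asyncio.sleep(0)  # placeholder for the real network request
    return {"prompt": prompt, "responses": []}

async def run_batch(prompts, max_concurrent=4):
    """Run VS calls concurrently, at most `max_concurrent` at a time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def limited(prompt):
        async with semaphore:
            return await run_vs(prompt)

    # gather returns results in the same order as the input prompts
    return await asyncio.gather(*(limited(p) for p in prompts))

results = asyncio.run(run_batch(["tagline A", "tagline B", "tagline C"]))
```

The semaphore caps how many requests are in flight at once, and `asyncio.gather` keeps the results aligned with the input order.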
319
+ #### Solution C: Use Smaller Model for Initial Pass
320
+ ```
321
+ Round 1: VS with smaller/faster model (GPT-4.1-mini)
322
+ Round 2: Refine selected with flagship model
323
+ ```
324
+
325
+ **Effect:** Cost reduction while maintaining quality
326
+
327
+ #### Solution D: Cache Results
328
+ ```
329
+ # For repeated similar prompts:
330
+ if prompt in cache:
331
+     return cache[prompt]
332
+ else:
333
+     result = execute_vs(prompt)
334
+     cache[prompt] = result
335
+     return result
336
+ ```
337
+
338
+ ### Prevention
339
+ Set expectations: VS trades tokens/time for diversity gains
340
+
341
+ ---
342
+
343
+ ## Issue 7: Probabilities All Very Low
344
+
345
+ ### Symptoms
346
+ - All probabilities < 0.05
347
+ - No clear high-probability responses
348
+
349
+ ### Root Cause
350
+ This is actually **expected with low probability threshold**
351
+
352
+ ### Understanding
353
+
354
+ ```
355
+ # If you use threshold=0.01
356
+ Randomly sample with probability < 0.01
357
+
358
+ Result: All responses will be < 0.01 (tail sampling)
359
+ This is working as intended!
360
+ ```
361
+
362
+ ### Solutions
363
+
364
+ #### Solution A: Remove/Raise Threshold
365
+ ```
366
+ # For more typical outputs:
367
+ threshold = 0.10 # Or remove threshold entirely
368
+ ```
369
+
370
+ #### Solution B: Interpret Correctly
371
+ ```
372
+ # Low probabilities = creative/unique outputs
373
+ # This is a feature, not a bug!
374
+
375
+ Present to user:
376
+ "Here are 5 creative (low-probability) options..."
377
+ ```
378
+
379
+ ### Prevention
380
+ Understand that low threshold → low probabilities (by design)
381
+
382
+ ---
383
+
384
+ ## Debugging Checklist
385
+
386
+ **When VS doesn't work as expected:**
387
+
388
+ 1. [ ] Check prompt formatting (exact template used?)
389
+ 2. [ ] Verify model supports complex instructions (frontier model?)
390
+ 3. [ ] Review parameters (k, threshold, temperature sensible?)
391
+ 4. [ ] Test with simpler request (does basic VS work?)
392
+ 5. [ ] Check JSON parsing (is response valid JSON?)
393
+ 6. [ ] Verify task suitability (does it have diversity potential?)
394
+ 7. [ ] Review quality requirements (diversity-quality tradeoff?)
395
+
396
+ ---
397
+
398
+ ## Error Message Quick Reference
399
+
400
+ | Error | Likely Cause | Quick Fix |
401
+ |-------|--------------|-----------|
402
+ | "Invalid JSON" | Model didn't follow format | Emphasize "ONLY JSON" OR use regex extraction |
403
+ | "Missing 'probability' field" | Model skipped probabilities | Simplify prompt OR use stronger model |
404
+ | "All outputs identical" | Not using VS correctly | Check prompt has VS template |
405
+ | "Probabilities sum to 10" | Misunderstanding (not an error) | Focus on relative ranking, not absolute |
406
+ | "Quality too low" | High diversity, low quality | Use VS-Multi OR add quality constraints |
407
+ | "Too slow" | High k or multiple calls | Reduce k OR use smaller model |
408
+
409
+ ---
410
+
411
+ ## When to Give Up on VS
412
+
413
+ **VS may not be suitable if:**
414
+
415
+ 1. Task has single correct answer (factual QA)
416
+ 2. Model too weak to follow instructions (< 70B params)
417
+ 3. User explicitly wants deterministic output
418
+ 4. Real-time latency critical (< 1 second response needed)
419
+ 5. Quality degradation unacceptable
420
+
421
+ **Alternative:** Use standard prompting with explicit variation instructions
422
+
423
+ ---
424
+
425
+ **For advanced VS techniques:** See `advanced-techniques.md`
426
+ **For research-backed insights:** See `research-findings.md`
@@ -0,0 +1,240 @@
1
+ # VS Core Technique
2
+
3
+ **Purpose:** Core Verbalized Sampling concepts, prompt templates, and execution workflow
4
+
5
+ **Load when:** Agent needs to execute VS for the first time or needs template reference
6
+
7
+ ---
8
+
9
+ ## Why VS Works: The Theory
10
+
11
+ ### The Mode Collapse Problem
12
+
13
+ Aligned LLMs suffer from **typicality bias** - they favor more typical, familiar text because:
14
+ - Human annotators prefer fluent, predictable content
15
+ - RLHF training amplifies this bias
16
+ - Result: **50-70% diversity reduction** vs. base models
17
+
18
+ ### The VS Solution
19
+
20
+ **Different prompts collapse to different modes:**
21
+
22
+ | Prompt Type | Example | Collapses To |
23
+ |------------|---------|--------------|
24
+ | **Instance** | "Tell me a joke" | Single most typical joke |
25
+ | **List** | "Tell 5 jokes" | Uniform distribution of related items |
26
+ | **Distribution (VS)** | "Tell 5 jokes with probabilities" | Base model's learned distribution |
27
+
28
+ **Key insight:** By asking for a probability distribution, VS recovers the diverse pre-training distribution that alignment compressed.
29
+
30
+ ---
31
+
32
+ ## VS Prompt Template (Production-Ready)
33
+
34
+ ### Standard Template
35
+
36
+ **Use this exact format for JSON-parseable output:**
37
+
38
+ ```
39
+ Generate {k} responses to the following user request. Each response should be approximately {target_words} words.
40
+
41
+ Return the responses in JSON format with the key: "responses" (list of dicts). Each dictionary must include:
42
+
43
+ • text: the response string only (no explanation or extra text)
44
+ • probability: the estimated probability from 0.0 to 1.0 of this response given the input prompt (relative to the full distribution)
45
+
46
+ [OPTIONAL: Randomly sample the responses from the distribution, with the probability of each response below {threshold}.]
47
+
48
+ Give ONLY the JSON object, with no explanations or extra text.
49
+
50
+ USER REQUEST:
51
+ {user_original_request}
52
+ ```
53
+
54
+ ### Template Variables
55
+
56
+ **Required:**
57
+ - `{k}`: Number of candidates (typically 5)
58
+ - `{target_words}`: Expected length (e.g., "50", "200", "500")
59
+ - `{user_original_request}`: The actual user query
60
+
61
+ **Optional:**
62
+ - `{threshold}`: Probability threshold (0.01, 0.05, 0.10) - include bracketed line only if tuning for more diversity
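A minimal sketch of filling the template variables above, assuming plain Python string formatting. The helper name `build_vs_prompt` and its defaults are hypothetical and not part of the package; the template text is copied from the standard template, and the bracketed threshold line is emitted only when a threshold is passed.

```python
from typing import Optional

# Template text copied from the standard template; {threshold_line} is an
# illustrative placeholder standing in for the optional bracketed line.
VS_TEMPLATE = """Generate {k} responses to the following user request. Each response should be approximately {target_words} words.

Return the responses in JSON format with the key: "responses" (list of dicts). Each dictionary must include:

• text: the response string only (no explanation or extra text)
• probability: the estimated probability from 0.0 to 1.0 of this response given the input prompt (relative to the full distribution)
{threshold_line}
Give ONLY the JSON object, with no explanations or extra text.

USER REQUEST:
{user_original_request}"""


def build_vs_prompt(
    user_original_request: str,
    k: int = 5,
    target_words: int = 50,
    threshold: Optional[float] = None,
) -> str:
    """Fill the VS template (hypothetical helper, not the package's API)."""
    threshold_line = ""
    if threshold is not None:
        # Include the optional sampling instruction only when tuning for diversity.
        threshold_line = (
            "\nRandomly sample the responses from the distribution, "
            f"with the probability of each response below {threshold:.2f}.\n"
        )
    return VS_TEMPLATE.format(
        k=k,
        target_words=target_words,
        threshold_line=threshold_line,
        user_original_request=user_original_request,
    )
```

Calling `build_vs_prompt("Write a social media caption for a coffee shop's new caramel cloud latte", k=5, target_words=20, threshold=0.10)` should reproduce a prompt of the same shape as the concrete example.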
63
+
64
+ ### Concrete Example
65
+
66
+ **User request:** "Write 10 social media captions for a coffee shop's new latte"
67
+
68
+ **Agent formats VS prompt (k=5 per call, so two calls cover the 10 captions):**
69
+
70
+ ```
71
+ Generate 5 responses to the following user request. Each response should be approximately 20 words.
72
+
73
+ Return the responses in JSON format with the key: "responses" (list of dicts). Each dictionary must include:
74
+
75
+ • text: the response string only (no explanation or extra text)
76
+ • probability: the estimated probability from 0.0 to 1.0 of this response given the input prompt (relative to the full distribution)
77
+
78
+ Randomly sample the responses from the distribution, with the probability of each response below 0.10.
79
+
80
+ Give ONLY the JSON object, with no explanations or extra text.
81
+
82
+ USER REQUEST:
83
+ Write a social media caption for a coffee shop's new caramel cloud latte
84
+ ```
85
+
86
+ **Expected output:**
87
+ ```json
88
+ {
89
+ "responses": [
90
+ {"text": "Sip on cloud nine ☁️ Our new Caramel Cloud Latte is here to make your mornings magical ✨", "probability": 0.08},
91
+ {"text": "Warning: Dangerously smooth. The Caramel Cloud Latte has arrived and it's causing serious caffeine crushes 💛", "probability": 0.06},
92
+ {"text": "Fluffy. Creamy. Caramel-y. The Caramel Cloud Latte is basically a hug in a cup 🤗", "probability": 0.05},
93
+ {"text": "Plot twist: clouds ARE edible. Try our new Caramel Cloud Latte and taste the sky ☁️☕", "probability": 0.04},
94
+ {"text": "New latte just dropped and it's lighter than air. Introducing: Caramel Cloud Latte 🌤️", "probability": 0.03}
95
+ ]
96
+ }
97
+ ```
98
+
99
+ ---
100
+
101
+ ## Execution Workflow
102
+
103
+ ### Step 1: Parameter Planning
104
+
105
+ **Before executing VS, determine:**
106
+
107
+ **1.1 Content Parameters**
108
+ - Content type: (blog post, caption, story, campaign idea, etc.)
109
+ - Target word count: (20 words for captions, 500 for blog posts, etc.)
110
+ - Total outputs needed: N (user wants 10 captions? 5 blog posts?)
111
+
112
+ **1.2 VS Parameters**
113
+
114
+ | Parameter | Default | Notes |
115
+ |-----------|---------|-------|
116
+ | k (candidates per call) | 5 | Use 3 for quick, 10 for deep exploration |
117
+ | Temperature | 0.7-1.0 | Can combine with VS for extra boost |
118
+ | Probability threshold | 0.10 (optional) | Lower = more creative tail sampling |
119
+
120
+ **1.3 Calculate Calls Needed**
121
+
122
+ ```
123
+ Number of LLM calls = ⌈N / k⌉
124
+ ```
125
+
126
+ Example: User wants 15 ideas → k=5 → Need 3 calls
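The ceiling formula can be checked with a short sketch. The function name `calls_needed` is an illustrative assumption; integer arithmetic is used so no floating-point rounding is involved.

```python
def calls_needed(n_outputs: int, k: int = 5) -> int:
    """Return ceil(n_outputs / k) using integer arithmetic (illustrative helper)."""
    if n_outputs < 0 or k < 1:
        raise ValueError("n_outputs must be >= 0 and k must be >= 1")
    return (n_outputs + k - 1) // k


print(calls_needed(15, 5))  # 15 ideas at k=5
print(calls_needed(11, 5))  # a partial final batch still costs one call
```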
127
+
128
+ ### Step 2: Execute VS Prompt
129
+
130
+ 1. **Format the prompt** using template with variables filled in
131
+ 2. **Make LLM call** (use regular message, no special tools)
132
+ 3. **Parse JSON response** - extract responses array
133
+ 4. **Repeat if needed** for additional candidates (when N > k)
134
+ 5. **Collect all candidates** from multiple calls into single pool
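The five steps above might be sketched as a loop. This is an assumption-laden outline: `call_llm` is a hypothetical callable (prompt in, raw text out) standing in for whatever mechanism the agent uses to send a message, and the prompt is assumed to be already formatted.

```python
import json
from typing import Callable, Dict, List


def collect_candidates(
    call_llm: Callable[[str], str],  # hypothetical: sends a prompt, returns raw text
    prompt: str,
    n_outputs: int,
    k: int = 5,
) -> List[Dict]:
    """Repeat the VS call until at least n_outputs candidates are pooled."""
    calls = (n_outputs + k - 1) // k  # ceil(N / k)
    pool: List[Dict] = []
    for _ in range(calls):
        raw = call_llm(prompt)           # step 2: make the LLM call
        data = json.loads(raw)           # step 3: parse the JSON response
        pool.extend(data["responses"])   # step 5: merge into a single pool
    return pool
```

The pool may hold more than N candidates when N is not a multiple of k; selecting which ones to present is left to the later steps.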
135
+
136
+ ### Step 3: Parse & Validate Output
137
+
138
+ **After receiving VS response:**
139
+
140
+ ```python
141
+ # Pseudo-code for agent processing
142
+ import json
143
+
144
+ response_text = llm_output # The JSON string from LLM
145
+ data = json.loads(response_text)
146
+ candidates = data["responses"]
147
+
148
+ # Validate structure
149
+ for item in candidates:
150
+     assert "text" in item and "probability" in item
151
+     assert 0.0 <= item["probability"] <= 1.0
152
+ ```
153
+
154
+ **Handle errors:**
155
+ - If JSON parsing fails → See `troubleshooting.md`
156
+ - If structure invalid → Retry with emphasis on "ONLY JSON"
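One possible way to handle the parsing failure before retrying is a fallback that pulls the first-to-last brace span out of the reply. This is a sketch only: the helper name `parse_vs_response` is hypothetical, and the regex approach is an assumption about what a reasonable fallback looks like, not the package's documented behavior.

```python
import json
import re
from typing import Dict, List


def parse_vs_response(raw: str) -> List[Dict]:
    """Parse a VS reply, tolerating extra text around the JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Fallback: grab everything from the first "{" to the last "}".
        match = re.search(r"\{.*\}", raw, re.DOTALL)
        if match is None:
            raise ValueError("no JSON object found in response")
        data = json.loads(match.group(0))

    if not isinstance(data, dict) or "responses" not in data:
        raise ValueError('missing "responses" key')

    candidates = data["responses"]
    for item in candidates:
        if "text" not in item or "probability" not in item:
            raise ValueError("candidate is missing 'text' or 'probability'")
        if not 0.0 <= item["probability"] <= 1.0:
            raise ValueError("probability outside [0.0, 1.0]")
    return candidates
```

A `ValueError` here is the signal to retry with stronger emphasis on "ONLY JSON"; `json.JSONDecodeError` is itself a subclass of `ValueError`, so one `except ValueError` covers both cases.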
157
+
158
+ ### Step 4: Present Results to User
159
+
160
+ **Three presentation options:**
161
+
162
+ **Option A: Ranked by Probability**
163
+ ```
164
+ Here are 5 diverse caption ideas (ordered by probability):
165
+
166
+ 1. [p=0.08] Sip on cloud nine ☁️ Our new Caramel Cloud Latte...
167
+ 2. [p=0.06] Warning: Dangerously smooth. The Caramel Cloud Latte...
168
+ 3. [p=0.05] Fluffy. Creamy. Caramel-y. The Caramel Cloud Latte...
169
+ ```
170
+
171
+ **Option B: Grouped by Tiers**
172
+ ```
173
+ HIGH PROBABILITY (typical, safer):
174
+ • Sip on cloud nine ☁️ Our new Caramel Cloud Latte...
175
+
176
+ MEDIUM PROBABILITY (balanced):
177
+ • Warning: Dangerously smooth. The Caramel Cloud Latte...
178
+ • Fluffy. Creamy. Caramel-y. The Caramel Cloud Latte...
179
+
180
+ LOW PROBABILITY (creative, unique):
181
+ • Plot twist: clouds ARE edible. Try our new Caramel Cloud...
182
+ ```
183
+
184
+ **Option C: Simple List (Hide Probabilities)**
185
+ ```
186
+ Here are 5 diverse caption ideas:
187
+
188
+ 1. Sip on cloud nine ☁️ Our new Caramel Cloud Latte...
189
+ 2. Warning: Dangerously smooth. The Caramel Cloud Latte...
190
+ 3. Fluffy. Creamy. Caramel-y. The Caramel Cloud Latte...
191
+ ```
192
+
193
+ **Default:** Use Option C for a cleaner user experience unless the user asks for probability insights.
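Options A and C differ only in whether the probability is shown, so a single formatter can cover both. The sketch below assumes candidates are dicts with `text` and `probability` keys as in the expected output; the function name `present` and its flag are illustrative.

```python
from typing import Dict, List


def present(candidates: List[Dict], show_probabilities: bool = False) -> str:
    """Format candidates as a numbered list, highest probability first."""
    ranked = sorted(candidates, key=lambda c: c["probability"], reverse=True)
    lines = []
    for i, cand in enumerate(ranked, start=1):
        # Option A prefixes each entry with its probability; Option C hides it.
        prefix = f"[p={cand['probability']:.2f}] " if show_probabilities else ""
        lines.append(f"{i}. {prefix}{cand['text']}")
    return "\n".join(lines)
```

Passing `show_probabilities=True` gives the Option A layout; the default gives Option C.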
194
+
195
+ ---
196
+
197
+ ## Parameter Selection Guide
198
+
199
+ ### Quick Decision Matrix
200
+
201
+ | Scenario | k | Threshold | Temperature |
202
+ |----------|---|-----------|-------------|
203
+ | **Quick ideation** | 3 | None | 0.7 |
204
+ | **Standard brainstorming** | 5 | 0.10 | 0.8 |
205
+ | **Deep exploration** | 10 | 0.01 | 1.0 |
206
+ | **Production content** | 5 | None | 0.8 |
207
+
208
+ ### When to Adjust Parameters
209
+
210
+ **If outputs too similar:**
211
+ - ✅ Lower threshold (0.10 → 0.01)
212
+ - ✅ Increase k (5 → 10)
213
+ - ✅ Add explicit diversity instruction to prompt
214
+
215
+ **If outputs too wild/low quality:**
216
+ - ✅ Raise threshold (0.01 → 0.10)
217
+ - ✅ Reduce k (10 → 5)
218
+ - ✅ Add quality constraints to prompt
219
+
220
+ ---
221
+
222
+ ## Quality Control Checklist
223
+
224
+ **Before presenting VS results to user, verify:**
225
+
226
+ - [ ] **Diversity achieved:** Outputs cover genuinely different angles/styles/approaches
227
+ - [ ] **Quality maintained:** Each output meets baseline quality standards
228
+ - [ ] **User intent matched:** Outputs address the original request accurately
229
+ - [ ] **Formatting correct:** Clean presentation, no JSON artifacts in user-facing text
230
+ - [ ] **Probabilities sensible:** If shown, probabilities are reasonable (don't need to sum to 1.0)
231
+
232
+ ---
233
+
234
+ ## Next Steps
235
+
236
+ **After mastering core VS:**
237
+ - **Task-specific workflows:** Load `task-workflows.md` for pre-configured templates
238
+ - **Advanced techniques:** Load `advanced-techniques.md` for VS-CoT, VS-Multi, refinement
239
+ - **Tool integration:** Load `tool-integration.md` for file operations, batch processing
240
+ - **Troubleshooting:** Load `troubleshooting.md` if encountering issues