@rfxlamia/skillkit 1.1.0 → 1.2.0
- package/agents/agents/creative-copywriter.md +212 -0
- package/agents/agents/dario-amodei.md +135 -0
- package/agents/agents/doc-simplifier.md +63 -0
- package/agents/agents/kotlin-pro.md +433 -0
- package/agents/agents/red-team.md +136 -0
- package/agents/agents/sam-altman.md +121 -0
- package/agents/agents/seo-manager.md +184 -0
- package/package.json +1 -1
- package/skills/skillkit-help/SKILL.md +81 -0
- package/skills/skillkit-help/knowledge/application/09-case-studies.md +257 -0
- package/skills/skillkit-help/knowledge/application/12-testing-and-validation.md +276 -0
- package/skills/skillkit-help/knowledge/foundation/01-why-skills-exist.md +246 -0
- package/skills/skillkit-help/knowledge/foundation/02-skills-vs-subagents-comparison.md +312 -0
- package/skills/skillkit-help/knowledge/foundation/03-skills-vs-subagents-decision-tree.md +346 -0
- package/skills/skillkit-help/knowledge/foundation/06-platform-constraints.md +237 -0
- package/skills/skillkit-help/knowledge/foundation/08-when-not-to-use-skills.md +270 -0
- package/skills/skillkit-help/template/SKILL.md +52 -0
- package/skills/skills/adversarial-review/SKILL.md +219 -0
- package/skills/skills/baby-education/SKILL.md +260 -0
- package/skills/skills/baby-education/references/advanced-techniques.md +323 -0
- package/skills/skills/baby-education/references/transformations.md +345 -0
- package/skills/skills/been-there-done-that/SKILL.md +455 -0
- package/skills/skills/been-there-done-that/references/analysis-patterns.md +162 -0
- package/skills/skills/been-there-done-that/references/git-commands.md +132 -0
- package/skills/skills/been-there-done-that/references/tree-insertion-logic.md +145 -0
- package/skills/skills/coolhunter/SKILL.md +270 -0
- package/skills/skills/coolhunter/assets/elicitation-methods.csv +51 -0
- package/skills/skills/coolhunter/knowledge/elicitation-methods.md +312 -0
- package/skills/skills/coolhunter/references/workflow-execution.md +238 -0
- package/skills/skills/coolhunter/workflow-plan-coolhunter.md +232 -0
- package/skills/skills/creative-copywriting/SKILL.md +324 -0
- package/skills/skills/creative-copywriting/databases/README.md +60 -0
- package/skills/skills/creative-copywriting/databases/carousel-structures.csv +16 -0
- package/skills/skills/creative-copywriting/databases/emotional-arcs.csv +11 -0
- package/skills/skills/creative-copywriting/databases/hook-formulas.csv +51 -0
- package/skills/skills/creative-copywriting/databases/power-words.csv +201 -0
- package/skills/skills/creative-copywriting/databases/psychological-triggers.csv +21 -0
- package/skills/skills/creative-copywriting/databases/read-more-patterns.csv +26 -0
- package/skills/skills/creative-copywriting/databases/swipe-triggers.csv +31 -0
- package/skills/skills/creative-copywriting/references/carousel-psychology.md +223 -0
- package/skills/skills/creative-copywriting/references/hook-anatomy.md +169 -0
- package/skills/skills/creative-copywriting/references/power-word-science.md +134 -0
- package/skills/skills/creative-copywriting/references/storytelling-frameworks.md +157 -0
- package/skills/skills/diverse-content-gen/SKILL.md +201 -0
- package/skills/skills/diverse-content-gen/references/advanced-techniques.md +320 -0
- package/skills/skills/diverse-content-gen/references/research-findings.md +379 -0
- package/skills/skills/diverse-content-gen/references/task-workflows.md +241 -0
- package/skills/skills/diverse-content-gen/references/tool-integration.md +419 -0
- package/skills/skills/diverse-content-gen/references/troubleshooting.md +426 -0
- package/skills/skills/diverse-content-gen/references/vs-core-technique.md +240 -0
- package/skills/skills/framework-critical-thinking/SKILL.md +220 -0
- package/skills/skills/framework-critical-thinking/references/bias_detector.md +375 -0
- package/skills/skills/framework-critical-thinking/references/fallback_handler.md +239 -0
- package/skills/skills/framework-critical-thinking/references/memory_curator.md +161 -0
- package/skills/skills/framework-critical-thinking/references/metacognitive_monitor.md +297 -0
- package/skills/skills/framework-critical-thinking/references/producer_critic_orchestrator.md +333 -0
- package/skills/skills/framework-critical-thinking/references/reasoning_router.md +235 -0
- package/skills/skills/framework-critical-thinking/references/reasoning_validator.md +97 -0
- package/skills/skills/framework-critical-thinking/references/reflection_trigger.md +78 -0
- package/skills/skills/framework-critical-thinking/references/self_verification.md +388 -0
- package/skills/skills/framework-critical-thinking/references/uncertainty_quantifier.md +207 -0
- package/skills/skills/framework-initiative/SKILL.md +231 -0
- package/skills/skills/framework-initiative/references/examples.md +150 -0
- package/skills/skills/framework-initiative/references/impact-analysis.md +157 -0
- package/skills/skills/framework-initiative/references/intent-patterns.md +145 -0
- package/skills/skills/framework-initiative/references/star-framework.md +165 -0
- package/skills/skills/humanize-docs/SKILL.md +203 -0
- package/skills/skills/humanize-docs/references/advanced-techniques.md +13 -0
- package/skills/skills/humanize-docs/references/core-transformations.md +368 -0
- package/skills/skills/humanize-docs/references/detection-patterns.md +400 -0
- package/skills/skills/humanize-docs/references/examples-gallery.md +374 -0
- package/skills/skills/imagine/SKILL.md +190 -0
- package/skills/skills/imagine/references/artstyle-corporate-memphis.md +625 -0
- package/skills/skills/imagine/references/artstyle-crewdson-hyperrealism.md +295 -0
- package/skills/skills/imagine/references/artstyle-iphone-social-media.md +426 -0
- package/skills/skills/imagine/references/artstyle-sciencesaru.md +276 -0
- package/skills/skills/pre-deploy-checklist/README.md +26 -0
- package/skills/skills/pre-deploy-checklist/SKILL.md +153 -0
- package/skills/skills/pre-deploy-checklist/references/checklist-categories.md +174 -0
- package/skills/skills/pre-deploy-checklist/references/domain-prompts.md +216 -0
- package/skills/skills/prompt-engineering/SKILL.md +209 -0
- package/skills/skills/prompt-engineering/references/advanced-combinations.md +444 -0
- package/skills/skills/prompt-engineering/references/chain-of-thought.md +140 -0
- package/skills/skills/prompt-engineering/references/decision_matrix.md +220 -0
- package/skills/skills/prompt-engineering/references/few-shot.md +346 -0
- package/skills/skills/prompt-engineering/references/json-format.md +270 -0
- package/skills/skills/prompt-engineering/references/natural-language.md +420 -0
- package/skills/skills/prompt-engineering/references/pitfalls.md +365 -0
- package/skills/skills/prompt-engineering/references/prompt-chaining.md +498 -0
- package/skills/skills/prompt-engineering/references/react.md +108 -0
- package/skills/skills/prompt-engineering/references/self-consistency.md +322 -0
- package/skills/skills/prompt-engineering/references/tree-of-thoughts.md +386 -0
- package/skills/skills/prompt-engineering/references/xml-format.md +220 -0
- package/skills/skills/prompt-engineering/references/yaml-format.md +488 -0
- package/skills/skills/prompt-engineering/references/zero-shot.md +74 -0
- package/skills/skills/quick-spec/SKILL.md +280 -0
- package/skills/skills/quick-spec/assets/tech-spec-template.md +74 -0
- package/skills/skills/quick-spec/references/step-01-understand.md +189 -0
- package/skills/skills/quick-spec/references/step-02-investigate.md +144 -0
- package/skills/skills/quick-spec/references/step-03-generate.md +128 -0
- package/skills/skills/quick-spec/references/step-04-review.md +173 -0
- package/skills/skills/quick-spec/tests/__pycache__/test_skill.cpython-314-pytest-9.0.2.pyc +0 -0
- package/skills/skills/quick-spec/tests/test_scenarios.md +83 -0
- package/skills/skills/quick-spec/tests/test_skill.py +136 -0
- package/skills/skills/readme-expert/SKILL.md +538 -0
- package/skills/skills/readme-expert/knowledge/INDEX.md +192 -0
- package/skills/skills/readme-expert/knowledge/application/quality-standards.md +470 -0
- package/skills/skills/readme-expert/knowledge/application/script-executor.md +604 -0
- package/skills/skills/readme-expert/knowledge/application/template-library.md +822 -0
- package/skills/skills/readme-expert/knowledge/foundation/codebase-scanner.md +361 -0
- package/skills/skills/readme-expert/knowledge/foundation/validation-checklist.md +481 -0
- package/skills/skills/red-teaming/SKILL.md +321 -0
- package/skills/skills/red-teaming/references/ai-llm-redteam.md +517 -0
- package/skills/skills/red-teaming/references/attack-techniques.md +410 -0
- package/skills/skills/red-teaming/references/cybersecurity-redteam.md +383 -0
- package/skills/skills/red-teaming/references/tools-frameworks.md +446 -0
- package/skills/skills/releasing/.skillkit-mode +1 -0
- package/skills/skills/releasing/SKILL.md +225 -0
- package/skills/skills/releasing/references/version-detection.md +108 -0
- package/skills/skills/screenwriter/SKILL.md +273 -0
- package/skills/skills/screenwriter/references/advanced-techniques.md +216 -0
- package/skills/skills/screenwriter/references/pipeline-integration.md +266 -0
- package/skills/skills/skillkit/.claude/settings.local.json +7 -0
- package/skills/skills/skillkit/.claude-plugin/plugin.json +27 -0
- package/skills/skills/skillkit/CHANGELOG.md +484 -0
- package/skills/skills/skillkit/SKILL.md +511 -0
- package/skills/skills/skillkit/commands/skillkit.md +6 -0
- package/skills/skills/skillkit/commands/validate-plan.md +6 -0
- package/skills/skills/skillkit/commands/verify.md +6 -0
- package/skills/skills/skillkit/knowledge/INDEX.md +352 -0
- package/skills/skills/skillkit/knowledge/application/09-case-studies.md +257 -0
- package/skills/skills/skillkit/knowledge/application/10-technical-architecture.md +324 -0
- package/skills/skills/skillkit/knowledge/application/11-adoption-strategy.md +267 -0
- package/skills/skills/skillkit/knowledge/application/12-testing-and-validation.md +276 -0
- package/skills/skills/skillkit/knowledge/application/13-competitive-landscape.md +198 -0
- package/skills/skills/skillkit/knowledge/foundation/01-why-skills-exist.md +246 -0
- package/skills/skills/skillkit/knowledge/foundation/02-skills-vs-subagents-comparison.md +312 -0
- package/skills/skills/skillkit/knowledge/foundation/03-skills-vs-subagents-decision-tree.md +346 -0
- package/skills/skills/skillkit/knowledge/foundation/04-hybrid-patterns.md +308 -0
- package/skills/skills/skillkit/knowledge/foundation/05-token-economics.md +275 -0
- package/skills/skills/skillkit/knowledge/foundation/06-platform-constraints.md +237 -0
- package/skills/skills/skillkit/knowledge/foundation/07-security-concerns.md +322 -0
- package/skills/skills/skillkit/knowledge/foundation/08-when-not-to-use-skills.md +270 -0
- package/skills/skills/skillkit/knowledge/plugin-guide.md +614 -0
- package/skills/skills/skillkit/knowledge/tools/14-validation-tools-guide.md +150 -0
- package/skills/skills/skillkit/knowledge/tools/15-cost-tools-guide.md +157 -0
- package/skills/skills/skillkit/knowledge/tools/16-security-tools-guide.md +122 -0
- package/skills/skills/skillkit/knowledge/tools/17-pattern-tools-guide.md +161 -0
- package/skills/skills/skillkit/knowledge/tools/18-decision-helper-guide.md +243 -0
- package/skills/skills/skillkit/knowledge/tools/19-test-generator-guide.md +275 -0
- package/skills/skills/skillkit/knowledge/tools/20-split-skill-guide.md +149 -0
- package/skills/skills/skillkit/knowledge/tools/21-quality-scorer-guide.md +226 -0
- package/skills/skills/skillkit/knowledge/tools/22-migration-helper-guide.md +356 -0
- package/skills/skills/skillkit/knowledge/tools/23-subagent-creation-guide.md +448 -0
- package/skills/skills/skillkit/knowledge/tools/24-behavioral-testing-guide.md +122 -0
- package/skills/skills/skillkit/references/proposal-generation.md +982 -0
- package/skills/skills/skillkit/references/rationalization-catalog.md +75 -0
- package/skills/skills/skillkit/references/research-methodology.md +661 -0
- package/skills/skills/skillkit/references/section-2-full-creation-workflow.md +452 -0
- package/skills/skills/skillkit/references/section-3-validation-workflow-existing-skill.md +63 -0
- package/skills/skills/skillkit/references/section-4-decision-workflow-skills-vs-subagents.md +64 -0
- package/skills/skills/skillkit/references/section-5-migration-workflow-doc-to-skill.md +58 -0
- package/skills/skills/skillkit/references/section-6-subagent-creation-workflow.md +499 -0
- package/skills/skills/skillkit/references/section-7-knowledge-reference-map.md +72 -0
- package/skills/skills/skillkit/scripts/__pycache__/decision_helper.cpython-314.pyc +0 -0
- package/skills/skills/skillkit/scripts/__pycache__/quick_validate.cpython-312.pyc +0 -0
- package/skills/skills/skillkit/scripts/__pycache__/quick_validate.cpython-314.pyc +0 -0
- package/skills/skills/skillkit/scripts/__pycache__/test_generator.cpython-314-pytest-9.0.2.pyc +0 -0
- package/skills/skills/skillkit/scripts/decision_helper.py +799 -0
- package/skills/skills/skillkit/scripts/init_skill.py +400 -0
- package/skills/skills/skillkit/scripts/init_subagent.py +231 -0
- package/skills/skills/skillkit/scripts/migration_helper.py +669 -0
- package/skills/skills/skillkit/scripts/package_skill.py +211 -0
- package/skills/skills/skillkit/scripts/pattern_detector.py +381 -0
- package/skills/skills/skillkit/scripts/pattern_detector_new.py +382 -0
- package/skills/skills/skillkit/scripts/pressure_tester.py +157 -0
- package/skills/skills/skillkit/scripts/quality_scorer.py +999 -0
- package/skills/skills/skillkit/scripts/quick_validate.py +100 -0
- package/skills/skills/skillkit/scripts/security_scanner.py +474 -0
- package/skills/skills/skillkit/scripts/split_skill.py +540 -0
- package/skills/skills/skillkit/scripts/test_generator.py +695 -0
- package/skills/skills/skillkit/scripts/token_estimator.py +493 -0
- package/skills/skills/skillkit/scripts/utils/__init__.py +49 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/__init__.cpython-312.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/__init__.cpython-314.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/budget_tracker.cpython-312.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/budget_tracker.cpython-314.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/output_formatter.cpython-312.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/output_formatter.cpython-314.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/reference_validator.cpython-312.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/__pycache__/reference_validator.cpython-314.pyc +0 -0
- package/skills/skills/skillkit/scripts/utils/budget_tracker.py +388 -0
- package/skills/skills/skillkit/scripts/utils/output_formatter.py +263 -0
- package/skills/skills/skillkit/scripts/utils/reference_validator.py +401 -0
- package/skills/skills/skillkit/scripts/validate_skill.py +594 -0
- package/skills/skills/skillkit/tests/test_behavioral.py +39 -0
- package/skills/skills/skillkit/tests/test_scenarios.md +83 -0
- package/skills/skills/skillkit/tests/test_skill.py +136 -0
- package/skills/skills/skillkit-help/SKILL.md +81 -0
- package/skills/skills/skillkit-help/knowledge/application/09-case-studies.md +257 -0
- package/skills/skills/skillkit-help/knowledge/application/12-testing-and-validation.md +276 -0
- package/skills/skills/skillkit-help/knowledge/foundation/01-why-skills-exist.md +246 -0
- package/skills/skills/skillkit-help/knowledge/foundation/02-skills-vs-subagents-comparison.md +312 -0
- package/skills/skills/skillkit-help/knowledge/foundation/03-skills-vs-subagents-decision-tree.md +346 -0
- package/skills/skills/skillkit-help/knowledge/foundation/06-platform-constraints.md +237 -0
- package/skills/skills/skillkit-help/knowledge/foundation/08-when-not-to-use-skills.md +270 -0
- package/skills/skills/skillkit-help/template/SKILL.md +52 -0
- package/skills/skills/social-media-seo/SKILL.md +278 -0
- package/skills/skills/social-media-seo/databases/caption-styles.csv +31 -0
- package/skills/skills/social-media-seo/databases/engagement-tactics.csv +16 -0
- package/skills/skills/social-media-seo/databases/hashtag-strategies.csv +21 -0
- package/skills/skills/social-media-seo/databases/hook-formulas.csv +26 -0
- package/skills/skills/social-media-seo/databases/keyword-clusters.csv +11 -0
- package/skills/skills/social-media-seo/databases/thread-structures.csv +26 -0
- package/skills/skills/social-media-seo/databases/viral-patterns.csv +21 -0
- package/skills/skills/social-media-seo/references/analytics-guide.md +321 -0
- package/skills/skills/social-media-seo/references/instagram-seo.md +235 -0
- package/skills/skills/social-media-seo/references/threads-seo.md +305 -0
- package/skills/skills/social-media-seo/references/x-twitter-seo.md +337 -0
- package/skills/skills/social-media-seo/scripts/query_database.py +191 -0
- package/skills/skills/storyteller/SKILL.md +241 -0
- package/skills/skills/storyteller/references/transformation-methodology.md +293 -0
- package/skills/skills/storyteller/references/visual-vocabulary.md +177 -0
- package/skills/skills/thread-pro/SKILL.md +162 -0
- package/skills/skills/thread-pro/anti-ai-patterns.md +120 -0
- package/skills/skills/thread-pro/hook-formulas.md +138 -0
- package/skills/skills/thread-pro/references/anti-ai-patterns.md +120 -0
- package/skills/skills/thread-pro/references/hook-formulas.md +138 -0
- package/skills/skills/thread-pro/references/thread-structures.md +240 -0
- package/skills/skills/thread-pro/references/voice-injection.md +130 -0
- package/skills/skills/thread-pro/thread-structures.md +240 -0
- package/skills/skills/thread-pro/voice-injection.md +130 -0
- package/skills/skills/tinkering/SKILL.md +251 -0
- package/skills/skills/tinkering/references/graduation-checklist.md +100 -0
- package/skills/skills/validate-plan/.skillkit-mode +1 -0
- package/skills/skills/validate-plan/SKILL.md +406 -0
- package/skills/skills/validate-plan/references/dry-principles.md +251 -0
- package/skills/skills/validate-plan/references/gap-analysis-guide.md +320 -0
- package/skills/skills/validate-plan/references/tdd-patterns.md +413 -0
- package/skills/skills/validate-plan/references/yagni-checklist.md +330 -0
- package/skills/skills/verify-before-ship/.skillkit-mode +1 -0
- package/skills/skills/verify-before-ship/SKILL.md +116 -0
- package/skills/skills/verify-before-ship/references/anti-rationalization.md +212 -0
- package/skills/skills/verify-before-ship/references/verification-gates.md +305 -0
- package/skills-manifest.json +8 -2
- package/src/picker.js +11 -5
- package/src/picker.test.js +36 -1
|
@@ -0,0 +1,322 @@
|
|
|
1
|
+
# Self-Consistency Prompting
|
|
2
|
+
|
|
3
|
+
## Overview
|
|
4
|
+
|
|
5
|
+
Self-Consistency generates multiple independent reasoning paths, then selects the most consistent answer. It improves accuracy on high-stakes decisions by comparing several solutions to the same problem.
|
|
6
|
+
|
|
7
|
+
## When to Use
|
|
8
|
+
|
|
9
|
+
✅ **Use Self-Consistency When:**
|
|
10
|
+
- High-stakes decisions requiring verification
|
|
11
|
+
- Accuracy is more important than speed
|
|
12
|
+
- Multiple reasoning paths are possible
|
|
13
|
+
- Need confidence in the answer
|
|
14
|
+
- Problem has an objectively correct answer
|
|
15
|
+
- Cost of error is high
|
|
16
|
+
|
|
17
|
+
❌ **Don't Use Self-Consistency When:**
|
|
18
|
+
- Simple tasks with obvious answers
|
|
19
|
+
- Speed is critical (Self-Consistency is slow)
|
|
20
|
+
- Token budget is tight (very expensive)
|
|
21
|
+
- Subjective questions (no "correct" answer)
|
|
22
|
+
- Creative tasks (variety is good, not consensus)
|
|
23
|
+
|
|
24
|
+
## Method
|
|
25
|
+
|
|
26
|
+
1. **Generate** 3-10 independent reasoning paths
|
|
27
|
+
2. **Compare** final answers from all paths
|
|
28
|
+
3. **Select** most consistent/common answer
|
|
29
|
+
4. **Verify** by investigating discrepancies
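The four steps above can be sketched as a simple vote over final answers. This is a minimal illustration, not part of the package: the function name `select_consistent_answer` and the sample answers are hypothetical, and in practice each answer would come from a separate reasoning path.

```python
from collections import Counter


def select_consistent_answer(answers):
    """Return (winner, agreement_ratio, minority_answers) for a list of final answers.

    Hypothetical helper: `answers` holds one final answer per reasoning path.
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers)
    # most_common(1) yields the answer with the highest count;
    # ties resolve to the answer that was seen first.
    winner, votes = counts.most_common(1)[0]
    # Minority answers are returned so they can be investigated (step 4).
    minority = [answer for answer in counts if answer != winner]
    return winner, votes / len(answers), minority


answers = ["$11.33", "$11.33", "$11.34", "$11.33", "$11.33"]
winner, agreement, minority = select_consistent_answer(answers)
print(winner, agreement, minority)  # $11.33 0.8 ['$11.34']
```

Returning the minority answers alongside the winner keeps the "Verify" step explicit instead of silently discarding disagreement.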
|
|
30
|
+
|
|
31
|
+
## Structure
|
|
32
|
+
|
|
33
|
+
```
|
|
34
|
+
Problem →
|
|
35
|
+
Path 1 (Method A) → Answer 1
|
|
36
|
+
Path 2 (Method B) → Answer 2
|
|
37
|
+
Path 3 (Method C) → Answer 3
|
|
38
|
+
Path 4 (Method D) → Answer 4
|
|
39
|
+
Path 5 (Method E) → Answer 5
|
|
40
|
+
→ Consistency Check → Final Answer
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
## Template
|
|
44
|
+
|
|
45
|
+
```
|
|
46
|
+
Problem: [Critical problem requiring high confidence]
|
|
47
|
+
|
|
48
|
+
I will solve this using 5 different reasoning approaches to ensure consistency.
|
|
49
|
+
|
|
50
|
+
Approach 1 (Method: [e.g., Direct calculation]):
|
|
51
|
+
[Reasoning steps]
|
|
52
|
+
Answer 1: [Result]
|
|
53
|
+
|
|
54
|
+
Approach 2 (Method: [e.g., Work backwards]):
|
|
55
|
+
[Reasoning steps]
|
|
56
|
+
Answer 2: [Result]
|
|
57
|
+
|
|
58
|
+
Approach 3 (Method: [e.g., Analogical reasoning]):
|
|
59
|
+
[Reasoning steps]
|
|
60
|
+
Answer 3: [Result]
|
|
61
|
+
|
|
62
|
+
Approach 4 (Method: [e.g., Formula-based]):
|
|
63
|
+
[Reasoning steps]
|
|
64
|
+
Answer 4: [Result]
|
|
65
|
+
|
|
66
|
+
Approach 5 (Method: [e.g., Verification approach]):
|
|
67
|
+
[Reasoning steps]
|
|
68
|
+
Answer 5: [Result]
|
|
69
|
+
|
|
70
|
+
Consistency Check:
|
|
71
|
+
- All answers: [List: Answer 1, Answer 2, Answer 3, Answer 4, Answer 5]
|
|
72
|
+
- Most common answer: [X appears Y times]
|
|
73
|
+
- Confidence: [High/Medium/Low based on agreement]
|
|
74
|
+
- Discrepancies: [Note any different answers]
|
|
75
|
+
- Investigation: [If answers differ, why?]
|
|
76
|
+
|
|
77
|
+
Final Answer: [Most consistent answer with reasoning]
|
|
78
|
+
Confidence: [Percentage based on agreement]
|
|
79
|
+
```
|
|
80
|
+
|
|
81
|
+
## Best Practices
|
|
82
|
+
|
|
83
|
+
✓ Use 5-7 diverse reasoning paths (optimal)
|
|
84
|
+
✓ Vary approaches/starting points genuinely
|
|
85
|
+
✓ Compare answers; don't just pick the majority
|
|
86
|
+
✓ Investigate disagreements seriously
|
|
87
|
+
✓ Use different methods; don't just repeat the same logic
|
|
88
|
+
|
|
89
|
+
✗ Don't use identical reasoning each time
|
|
90
|
+
✗ Avoid stopping at first answer
|
|
91
|
+
✗ Don't ignore minority answers without investigation
|
|
92
|
+
✗ Don't use self-consistency for subjective questions
|
|
93
|
+
|
|
94
|
+
## Example: Mathematical Problem
|
|
95
|
+
|
|
96
|
+
```
|
|
97
|
+
Problem: A store sells apples at $2 for 3 apples.
|
|
98
|
+
If you buy 17 apples, how much do you pay?
|
|
99
|
+
|
|
100
|
+
Approach 1 (Bundle counting):
|
|
101
|
+
Step 1: 17 apples ÷ 3 = 5 bundles + 2 individual
|
|
102
|
+
Step 2: 5 bundles × $2 = $10
|
|
103
|
+
Step 3: 2 individual × ($2÷3) = $1.33
|
|
104
|
+
Answer 1: $11.33
|
|
105
|
+
|
|
106
|
+
Approach 2 (Unit price):
|
|
107
|
+
Step 1: Price per apple = $2 ÷ 3 ≈ $0.6667
|
|
108
|
+
Step 2: 17 apples × $0.6667 ≈ $11.33
|
|
109
|
+
Answer 2: $11.33
|
|
110
|
+
|
|
111
|
+
Approach 3 (Work backwards from total):
|
|
112
|
+
Step 1: For $11, I get 16.5 apples (11 ÷ 2 × 3)
|
|
113
|
+
Step 2: Need 0.5 more apple = $0.33
|
|
114
|
+
Answer 3: $11.33
|
|
115
|
+
|
|
116
|
+
Approach 4 (Proportion):
|
|
117
|
+
Step 1: 3 apples : $2 = 17 apples : X
|
|
118
|
+
Step 2: X = (17 × $2) ÷ 3 = $11.33
|
|
119
|
+
Answer 4: $11.33
|
|
120
|
+
|
|
121
|
+
Approach 5 (Verification):
|
|
122
|
+
Step 1: At $11.33, I should get 17 apples
|
|
123
|
+
Step 2: $11.33 ÷ $2 = 5.665 bundles
|
|
124
|
+
Step 3: 5.665 × 3 = 16.995 ≈ 17 apples ✓
|
|
125
|
+
Answer 5: $11.33
|
|
126
|
+
|
|
127
|
+
Consistency Check:
|
|
128
|
+
- All 5 approaches agree: $11.33
|
|
129
|
+
- Confidence: Very High (100% agreement)
|
|
130
|
+
- No discrepancies to investigate
|
|
131
|
+
|
|
132
|
+
Final Answer: $11.33 (Verified through 5 independent methods)
|
|
133
|
+
```
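The arithmetic in this example can be checked with exact fractions, which avoids the rounding drift that appears when the unit price is truncated. The sketch below assumes, as the example does, that the two extra apples are priced pro rata; the function names are illustrative only.

```python
from fractions import Fraction

PRICE_PER_BUNDLE = Fraction(2)  # dollars per bundle, from the problem statement
APPLES_PER_BUNDLE = 3
APPLES_WANTED = 17


def bundle_counting():
    # Whole bundles plus leftover apples priced pro rata (assumption from the example).
    bundles, extra = divmod(APPLES_WANTED, APPLES_PER_BUNDLE)
    return bundles * PRICE_PER_BUNDLE + extra * PRICE_PER_BUNDLE / APPLES_PER_BUNDLE


def unit_price():
    # Price of one apple times the number of apples.
    return APPLES_WANTED * (PRICE_PER_BUNDLE / APPLES_PER_BUNDLE)


def proportion():
    # Solve 3 : 2 = 17 : X for X.
    return Fraction(APPLES_WANTED * PRICE_PER_BUNDLE, APPLES_PER_BUNDLE)


results = [bundle_counting(), unit_price(), proportion()]
# All three methods give exactly 34/3 dollars.
print(results[0], len(set(results)) == 1, round(float(results[0]), 2))
```

Because every path yields the same exact value (34/3), the agreement is not an artifact of rounding to two decimals.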
|
|
134
|
+
|
|
135
|
+
## Example: Decision-Making
|
|
136
|
+
|
|
137
|
+
```
|
|
138
|
+
Problem: Should we migrate our database from PostgreSQL to MongoDB?
|
|
139
|
+
|
|
140
|
+
Approach 1 (Performance analysis):
|
|
141
|
+
- Current query patterns: 80% relational, 20% document
|
|
142
|
+
- Relational queries would become slower in MongoDB
|
|
143
|
+
- Document queries marginally faster
|
|
144
|
+
Conclusion 1: Stay with PostgreSQL
|
|
145
|
+
|
|
146
|
+
Approach 2 (Cost analysis):
|
|
147
|
+
- Migration cost: $50K (engineering time)
|
|
148
|
+
- MongoDB licenses: +$10K/year
|
|
149
|
+
- PostgreSQL optimizations: $15K (one-time)
|
|
150
|
+
- 3-year TCO: MongoDB $80K, PostgreSQL $15K
|
|
151
|
+
Conclusion 2: Stay with PostgreSQL
|
|
152
|
+
|
|
153
|
+
Approach 3 (Team expertise):
|
|
154
|
+
- Team has 5 years PostgreSQL experience
|
|
155
|
+
- MongoDB: No experience, 6-month learning curve
|
|
156
|
+
- Risk of migration bugs: High
|
|
157
|
+
Conclusion 3: Stay with PostgreSQL
|
|
158
|
+
|
|
159
|
+
Approach 4 (Scalability needs):
|
|
160
|
+
- Current scale: 1TB, 10K QPS
|
|
161
|
+
- PostgreSQL scales to 10TB, 100K QPS (sufficient)
|
|
162
|
+
- MongoDB advantages not needed at our scale
|
|
163
|
+
Conclusion 4: Stay with PostgreSQL
|
|
164
|
+
|
|
165
|
+
Approach 5 (Industry trends):
|
|
166
|
+
- Competitors use both successfully
|
|
167
|
+
- No compelling reason to switch
|
|
168
|
+
- "If it ain't broke, don't fix it"
|
|
169
|
+
Conclusion 5: Stay with PostgreSQL
|
|
170
|
+
|
|
171
|
+
Consistency Check:
|
|
172
|
+
- All 5 analyses agree: Stay with PostgreSQL
|
|
173
|
+
- Confidence: Very High (unanimous)
|
|
174
|
+
- Key factors: Cost, expertise, current performance adequate
|
|
175
|
+
|
|
176
|
+
Final Decision: Stay with PostgreSQL
|
|
177
|
+
Confidence: 95%
|
|
178
|
+
Rationale: All analysis approaches converge on same conclusion
|
|
179
|
+
```
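The 3-year TCO figures in Approach 2 follow from adding the one-time cost to three years of recurring cost. A minimal sketch, using only the dollar amounts quoted in the example (the function name `three_year_tco` is hypothetical):

```python
def three_year_tco(one_time_cost, annual_cost, years=3):
    """Total cost of ownership: one-time cost plus recurring cost over `years`."""
    return one_time_cost + annual_cost * years


# MongoDB: $50K migration plus $10K/year in licenses.
mongodb_tco = three_year_tco(50_000, 10_000)
# PostgreSQL: $15K one-time optimization, no added recurring cost in the example.
postgres_tco = three_year_tco(15_000, 0)

print(mongodb_tco, postgres_tco)  # 80000 15000
```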
|
|
180
|
+
|
|
181
|
+
## Diversity of Approaches
|
|
182
|
+
|
|
183
|
+
### Good Diversity (Different Methods):
|
|
184
|
+
```
|
|
185
|
+
1. Mathematical formula
|
|
186
|
+
2. Analogical reasoning
|
|
187
|
+
3. Work backwards
|
|
188
|
+
4. Case-by-case analysis
|
|
189
|
+
5. Verification approach
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
### Bad Diversity (Same Method):
|
|
193
|
+
```
|
|
194
|
+
1. Formula approach
|
|
195
|
+
2. Formula approach (slightly different notation)
|
|
196
|
+
3. Formula approach (in different order)
|
|
197
|
+
4. Formula approach (with verification)
|
|
198
|
+
5. Formula approach (repeated)
|
|
199
|
+
```
|
|
200
|
+
|
|
201
|
+
**Key:** Use genuinely different reasoning strategies.
|
|
202
|
+
|
|
203
|
+
## Handling Disagreements
|
|
204
|
+
|
|
205
|
+
### If 3/5 Agree:
|
|
206
|
+
```
|
|
207
|
+
Majority: Answer A (3 times)
|
|
208
|
+
Minority: Answer B (2 times)
|
|
209
|
+
|
|
210
|
+
Action:
|
|
211
|
+
1. Review minority reasoning carefully
|
|
212
|
+
2. Check for calculation errors in majority
|
|
213
|
+
3. Consider if minority found edge case
|
|
214
|
+
4. Re-calculate using hybrid approach
|
|
215
|
+
5. Report: "Answer A with medium confidence (60%)"
|
|
216
|
+
```
|
|
217
|
+
|
|
218
|
+
### If No Consensus (2-2-1 split):
|
|
219
|
+
```
|
|
220
|
+
Answer A: 2 times
|
|
221
|
+
Answer B: 2 times
|
|
222
|
+
Answer C: 1 time
|
|
223
|
+
|
|
224
|
+
Action:
|
|
225
|
+
1. Add 2-3 more approaches to break tie
|
|
226
|
+
2. Deep-dive into why disagreement exists
|
|
227
|
+
3. Check assumptions in each approach
|
|
228
|
+
4. Report: "Low confidence, needs more analysis"
|
|
229
|
+
```
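The two cases above (a 3/5 majority and a 2-2-1 split) can be told apart mechanically before deciding which action list applies. The following is a sketch under the assumption that "majority" means strictly more than half of the paths agree; `classify_split` is an illustrative name, not something shipped with the package.

```python
from collections import Counter


def classify_split(answers):
    """Label final answers as 'unanimous', 'majority', or 'no_consensus'."""
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(answers).most_common()
    if len(counts) == 1:
        return "unanimous"
    top_votes = counts[0][1]
    # Assumption: a majority is strictly more than half of all paths.
    if top_votes * 2 > len(answers):
        return "majority"
    return "no_consensus"


print(classify_split(["A", "A", "A", "B", "B"]))  # majority
print(classify_split(["A", "A", "B", "B", "C"]))  # no_consensus
```

A "majority" result would lead into the minority-review steps, while "no_consensus" would trigger adding more approaches.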
|
|
230
|
+
|
|
231
|
+
## Confidence Scoring
|
|
232
|
+
|
|
233
|
+
| Agreement | Confidence | Action |
|
|
234
|
+
|-----------|-----------|--------|
|
|
235
|
+
| 5/5 or 4/5 | Very High (90%+) | Accept answer |
|
|
236
|
+
| 3/5 | Medium (60-70%) | Investigate minority views |
|
|
237
|
+
| 2/5 or worse | Low (<50%) | Add more approaches or flag for review |
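The table can be read as a lookup from agreement count to a confidence label and action. The sketch below generalizes the 5-path thresholds to ratios (at least 4/5 and at least 3/5), which is an assumption; the table itself only covers five paths.

```python
def confidence_for(agreeing, total=5):
    """Map an agreement count to (confidence, action), following the table's 5-path rows."""
    if total <= 0 or not 0 <= agreeing <= total:
        raise ValueError("agreeing must be between 0 and total")
    # Integer comparisons avoid floating-point edge cases at the thresholds.
    if agreeing * 5 >= total * 4:  # 4/5 or better
        return "Very High", "Accept answer"
    if agreeing * 5 >= total * 3:  # 3/5
        return "Medium", "Investigate minority views"
    return "Low", "Add more approaches or flag for review"


for n in (5, 4, 3, 2):
    print(n, confidence_for(n))
```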
|
|
238
|
+
|
|
239
|
+
## Token Cost
|
|
240
|
+
|
|
241
|
+
Self-Consistency is very expensive:
|
|
242
|
+
|
|
243
|
+
**Typical costs:**
|
|
244
|
+
- 5 approaches × 100 tokens each = 500 tokens
|
|
245
|
+
- Consistency check: 100 tokens
|
|
246
|
+
- **Total: ~600 tokens at this size, 3000+ with longer reasoning paths**
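As a rough budgeting aid, the per-path and consistency-check figures above can be combined in one line of arithmetic. This sketch assumes the 100-token figures quoted in the list; real paths are often much longer, which is where the upper end of the range comes from.

```python
def estimate_tokens(paths, tokens_per_path=100, check_tokens=100):
    """Estimate output tokens: every path plus one consistency check."""
    return paths * tokens_per_path + check_tokens


print(estimate_tokens(5))                        # 600
print(estimate_tokens(7) - estimate_tokens(5))   # 200
print(estimate_tokens(7, tokens_per_path=400))   # 2900
```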
|
|
247
|
+
|
|
248
|
+
**When to justify the cost:**
|
|
249
|
+
- Critical business decisions
|
|
250
|
+
- Safety-critical calculations
|
|
251
|
+
- Legal/compliance requirements
|
|
252
|
+
- High-value transactions
|
|
253
|
+
- Medical diagnoses support
|
|
254
|
+
|
|
255
|
+
## Optimization Strategies
|
|
256
|
+
|
|
257
|
+
### Reduce Paths:
|
|
258
|
+
```
|
|
259
|
+
Instead of 7 paths → Use 5 paths
|
|
260
|
+
Savings: ~200 tokens
|
|
261
|
+
Trade-off: Slightly less confidence
|
|
262
|
+
```
|
|
263
|
+
|
|
264
|
+
### Compact Format:
|
|
265
|
+
```
|
|
266
|
+
Don't write: "In the first approach, using the method of..."
|
|
267
|
+
Write: "Approach 1 (Direct): Answer = X"
|
|
268
|
+
```
|
|
269
|
+
|
|
270
|
+
### Early Termination:
|
|
271
|
+
```
|
|
272
|
+
If first 3 approaches all agree:
|
|
273
|
+
→ Can stop early with high confidence
|
|
274
|
+
→ Save tokens on approaches 4-5
|
|
275
|
+
```
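Early termination can be sketched as a loop that stops once the first three approaches agree. Here each approach is a hypothetical zero-argument callable standing in for one reasoning path; the name `run_with_early_stop` and the `min_agree` parameter are assumptions for illustration.

```python
def run_with_early_stop(approaches, min_agree=3):
    """Run approaches in order; stop if the first `min_agree` answers are identical."""
    answers = []
    for solve in approaches:
        answers.append(solve())
        # Only the first `min_agree` answers are checked, matching the rule above.
        if len(answers) == min_agree and len(set(answers)) == 1:
            break
    return answers


calls = []


def make_approach(name, answer):
    # Build a stand-in reasoning path that records when it is run.
    def solve():
        calls.append(name)
        return answer
    return solve


approaches = [make_approach(f"approach-{i}", "$11.33") for i in range(1, 6)]
print(run_with_early_stop(approaches))  # ['$11.33', '$11.33', '$11.33']
print(len(calls))                       # 3
```

If any of the first three answers differs, the loop runs all remaining approaches so the full consistency check still applies.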
|
|
276
|
+
|
|
277
|
+
## Self-Consistency vs Other Methods
|
|
278
|
+
|
|
279
|
+
| Method | Paths | Speed | Cost | Best For |
|
|
280
|
+
|--------|-------|-------|------|----------|
|
|
281
|
+
| Zero-Shot | 1 | Fast | Low | Simple tasks |
|
|
282
|
+
| CoT | 1 | Medium | Medium | Reasoning |
|
|
283
|
+
| ToT | Multiple (explored) | Slow | High | Planning |
|
|
284
|
+
| Self-Consistency | Multiple (independent) | Slow | High | Verification |
|
|
285
|
+
|
|
286
|
+
**Key difference:**
|
|
287
|
+
- **ToT** explores paths in a tree (dependent)
|
|
288
|
+
- **Self-Consistency** generates independent parallel paths
|
|
289
|
+
|
|
290
|
+
## Real-World Use Cases
|
|
291
|
+
|
|
292
|
+
### ✅ Good Use Cases:
|
|
293
|
+
- Financial calculations for large transactions
|
|
294
|
+
- Medical diagnosis support (with human oversight)
|
|
295
|
+
- Legal contract analysis
|
|
296
|
+
- Engineering safety calculations
|
|
297
|
+
- Risk assessment for critical decisions
|
|
298
|
+
|
|
299
|
+
### ❌ Bad Use Cases:
|
|
300
|
+
- "What's the capital of France?" (obvious answer)
|
|
301
|
+
- Creative writing (variety is good)
|
|
302
|
+
- Subjective preferences
|
|
303
|
+
- Time-sensitive queries
|
|
304
|
+
- Low-stakes decisions
|
|
305
|
+
|
|
306
|
+
## Quick Reference
|
|
307
|
+
|
|
308
|
+
| Aspect | Recommendation |
|
|
309
|
+
|--------|---------------|
|
|
310
|
+
| **Number of paths** | 5-7 approaches |
|
|
311
|
+
| **Diversity** | Use genuinely different methods |
|
|
312
|
+
| **Agreement threshold** | 60%+ for medium confidence |
|
|
313
|
+
| **Token cost** | ~600-3000+ tokens |
|
|
314
|
+
| **Best for** | High-stakes, objective problems |
|
|
315
|
+
| **Confidence metric** | % of approaches agreeing |
|
|
316
|
+
|
|
317
|
+
---
|
|
318
|
+
|
|
319
|
+
**Related:**
|
|
320
|
+
- [Chain of Thought](chain-of-thought.md) - Single reasoning path
|
|
321
|
+
- [Tree of Thoughts](tree-of-thoughts.md) - Explored paths (not independent)
|
|
322
|
+
- [Decision Matrix](decision_matrix.md) - When to use Self-Consistency
|
|
@@ -0,0 +1,386 @@
|
|
|
1
|
+
# Tree of Thoughts (ToT) Prompting
|
|
2
|
+
|
|
3
|
+
## Overview
|
|
4
|
+
|
|
5
|
+
Tree of Thoughts explores multiple reasoning paths in a tree structure, evaluating alternatives before selecting the best solution. It is useful for complex planning and for problems that require exploration.
|
|
6
|
+
|
|
7
|
+
## When to Use
|
|
8
|
+
|
|
9
|
+
✅ **Use ToT When:**
|
|
10
|
+
- Complex planning requiring exploration of alternatives
|
|
11
|
+
- Multiple solution paths need evaluation
|
|
12
|
+
- Strategic games, puzzles, or optimization problems
|
|
13
|
+
- Creative tasks benefiting from exploring options
|
|
14
|
+
- Need to backtrack if a path fails
|
|
15
|
+
- Problem has no single obvious solution
|
|
16
|
+
|
|
17
|
+
❌ **Don't Use ToT When:**
|
|
18
|
+
- Simple, straightforward tasks
|
|
19
|
+
- Single clear solution path
|
|
20
|
+
- Token budget is tight (ToT is expensive)
|
|
21
|
+
- Time is critical (ToT is slower)
|
|
22
|
+
- Problem is well-defined with known approach
|
|
23
|
+
|
|
24
|
+
## Structure
|
|
25
|
+
|
|
26
|
+
```
|
|
27
|
+
Initial State
|
|
28
|
+
→ Generate Multiple Thoughts (Level 1)
|
|
29
|
+
→ Evaluate Each Thought
|
|
30
|
+
→ Select Best Thought(s)
|
|
31
|
+
→ Generate Sub-Thoughts (Level 2)
|
|
32
|
+
→ Evaluate
|
|
33
|
+
→ Select Best
|
|
34
|
+
→ ... Continue
|
|
35
|
+
→ Final Solution
|
|
36
|
+
```
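
The generate, evaluate, select loop above can be sketched as a small level-by-level search. This is an illustrative sketch under stated assumptions: `tree_of_thoughts`, `generate`, `evaluate`, and `keep` are hypothetical names, and in practice the two callbacks would wrap model calls rather than the toy string functions used here.

```python
from typing import Callable, List, Tuple


def tree_of_thoughts(
    initial_state: str,
    generate: Callable[[str], List[str]],
    evaluate: Callable[[str], float],
    max_depth: int = 3,
    keep: int = 1,
) -> Tuple[str, float]:
    """Expand level by level, scoring every thought and keeping the best."""
    frontier = [(initial_state, 0.0)]
    for _ in range(max_depth):
        candidates = []
        for state, _score in frontier:
            for thought in generate(state):
                candidates.append((thought, evaluate(thought)))
        if not candidates:
            break  # nothing left to expand
        candidates.sort(key=lambda item: item[1], reverse=True)
        frontier = candidates[:keep]  # "Select Best Thought(s)"
    return frontier[0]


# Toy stand-ins (assumptions): a thought is a string, and a thought
# scores higher the larger its share of "b" characters.
def toy_generate(state: str) -> List[str]:
    return [state + ch for ch in "abc"]


def toy_evaluate(thought: str) -> float:
    return thought.count("b") / len(thought)


result = tree_of_thoughts("", toy_generate, toy_evaluate, max_depth=3)
print(result)
```

Setting `keep` above 1 keeps several branches alive per level, which costs more tokens but makes it less likely that a good path is discarded early.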

## Best Practices

✓ Generate 3-5 alternative thoughts per node
✓ Define clear evaluation criteria upfront
✓ Allow backtracking to previous states
✓ Prune obviously bad paths early
✓ Set depth limit (typically 3-4 levels)

✗ Don't explore infinitely (set limits)
✗ Avoid bias toward first generated option
✗ Don't give every option the same score (the evaluation must discriminate between options)
✗ Don't forget to compare alternatives

## Template: Natural Language

```
Problem: [Complex problem requiring exploration]

Approach: I will explore multiple solution paths using Tree of Thoughts

Initial State: [Starting condition]

Level 1 - Generate Possible Approaches:
Option A: [First approach]
Option B: [Second approach]
Option C: [Third approach]

Evaluate Level 1:
- Option A: [Score/reasoning]
  Pros: [advantages]
  Cons: [disadvantages]

- Option B: [Score/reasoning]
  Pros: [advantages]
  Cons: [disadvantages]

- Option C: [Score/reasoning]
  Pros: [advantages]
  Cons: [disadvantages]

Select Best: [Option X] because [reasoning]

Level 2 - Expand Option X:
Path X.1: [Sub-approach 1]
Path X.2: [Sub-approach 2]
Path X.3: [Sub-approach 3]

Evaluate Level 2:
[Score each path...]

Final Solution: [Best path found through tree]
```

## Template: YAML Format (Human-Readable)

```yaml
problem: "[Problem statement]"

tree_of_thoughts:
  root:
    state: "[Initial state]"

  level_1:
    thoughts:
      - id: "A"
        description: "[First approach]"
        evaluation:
          score: 0.8
          reasoning: "[Why this might work]"
          pros: ["Pro 1", "Pro 2"]
          cons: ["Con 1"]

      - id: "B"
        description: "[Second approach]"
        evaluation:
          score: 0.6
          reasoning: "[Why this might work]"
          pros: ["Pro 1"]
          cons: ["Con 1", "Con 2"]

      - id: "C"
        description: "[Third approach]"
        evaluation:
          score: 0.9
          reasoning: "[Why this might work]"
          pros: ["Pro 1", "Pro 2", "Pro 3"]
          cons: []

    selected: "C"
    reason: "[Why C is best]"

  level_2:
    parent: "C"
    thoughts:
      - id: "C.1"
        description: "[Refined approach]"
        evaluation:
          score: 0.85

      - id: "C.2"
        description: "[Alternative refinement]"
        evaluation:
          score: 0.95

    selected: "C.2"

solution:
  path: ["C", "C.2"]
  final_answer: "[Detailed solution]"
  rationale: "[Why this path is optimal]"
```
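
As a rough check of how `selected`, `parent`, and `solution.path` relate in the template, the sketch below mirrors its scores in plain Python dictionaries and derives the path by taking the highest-scoring thought at each level. The `levels` layout and the `best_path` helper are hypothetical simplifications, not a parser for the YAML itself.

```python
# Scores copied from the YAML template; the dict layout is an assumption.
levels = [
    {"parent": None, "thoughts": {"A": 0.8, "B": 0.6, "C": 0.9}},
    {"parent": "C", "thoughts": {"C.1": 0.85, "C.2": 0.95}},
]


def best_path(levels):
    """Pick the top-scoring thought per level and check parent links."""
    path = []
    for level in levels:
        expected_parent = path[-1] if path else None
        if level["parent"] != expected_parent:
            raise ValueError(
                f"level expands {level['parent']!r}, "
                f"but {expected_parent!r} was selected"
            )
        thoughts = level["thoughts"]
        path.append(max(thoughts, key=thoughts.get))
    return path


path = best_path(levels)
print(path)  # ['C', 'C.2']
```

The parent check catches a common mistake when filling in the template by hand: expanding a thought at level 2 that was not the one selected at level 1.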

## Template: XML Format (Claude-Optimized)

```xml
<tree_of_thoughts>
  <problem>[Complex planning problem]</problem>

  <exploration_config>
    <max_depth>3</max_depth>
    <branches_per_level>3</branches_per_level>
    <evaluation_criteria>
      <criterion weight="0.4">Feasibility</criterion>
      <criterion weight="0.3">Cost</criterion>
      <criterion weight="0.3">Time</criterion>
    </evaluation_criteria>
  </exploration_config>

  <level_1>
    <thought id="A">
      <description>[Approach A]</description>
      <evaluation>
        <feasibility>0.9</feasibility>
        <cost>0.6</cost>
        <time>0.7</time>
        <total_score>0.75</total_score>
      </evaluation>
    </thought>

    <thought id="B">
      <description>[Approach B]</description>
      <evaluation>
        <feasibility>0.7</feasibility>
        <cost>0.8</cost>
        <time>0.6</time>
        <total_score>0.70</total_score>
      </evaluation>
    </thought>

    <selected>A</selected>
    <reason>[Why A is selected]</reason>
  </level_1>

  <level_2>
    <parent>A</parent>
    <!-- Continue expansion -->
  </level_2>

  <final_solution>
    <path>A → A.2 → A.2.1</path>
    <answer>[Solution details]</answer>
  </final_solution>
</tree_of_thoughts>
```
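
The `total_score` values in the template are consistent with a weighted sum of the three criteria. The sketch below reproduces them; the function name `total_score` and the dictionary layout are illustrative assumptions, while the weights and per-criterion scores are taken from the XML above.

```python
# Weights from <evaluation_criteria> in the template.
WEIGHTS = {"feasibility": 0.4, "cost": 0.3, "time": 0.3}


def total_score(scores, weights=WEIGHTS):
    """Weighted sum of per-criterion scores, rounded to two decimals."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights should sum to 1")
    return round(sum(weights[name] * scores[name] for name in weights), 2)


score_a = total_score({"feasibility": 0.9, "cost": 0.6, "time": 0.7})
score_b = total_score({"feasibility": 0.7, "cost": 0.8, "time": 0.6})
print(score_a, score_b)  # 0.75 0.7
```

Thought A wins (0.75 vs 0.70) mainly because feasibility carries the largest weight, so changing the weights is the first thing to revisit if the selection looks wrong.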

## Real-World Example: Trip Planning

```
Problem: Plan a 7-day trip to Japan with $3000 budget,
focusing on cultural sites and food experiences.

Tree of Thoughts Approach:

Level 1 - City Combinations:
Option A: Tokyo only (7 days)
  Score: 7/10
  Pros: Deep dive, no travel time between cities
  Cons: Less variety, misses Kyoto temples
  Budget: Under budget (~$2500)

Option B: Tokyo (4 days) + Kyoto (3 days)
  Score: 9/10
  Pros: Great culture balance, iconic sites
  Cons: 1 travel day lost
  Budget: Fits budget (~$2900)

Option C: Tokyo (3) + Osaka (2) + Kyoto (2)
  Score: 6/10
  Pros: Maximum variety
  Cons: Too rushed, over budget, constant packing
  Budget: Over budget (~$3200)

Selected: Option B (Tokyo 4 + Kyoto 3)

Level 2 - Tokyo Allocation (4 days):
B.1: Shibuya/Shinjuku (1) + Asakusa/temples (1) +
     Tokyo Disney (1) + Akihabara (1)
  Score: 7/10

B.2: Shibuya/Shinjuku (1) + Asakusa (1) +
     Tsukiji/food tour (1) + Museums (1)
  Score: 9/10
  Aligns better with "cultural + food" focus

B.3: Mix modern + traditional split evenly
  Score: 8/10

Selected: B.2

Level 3 - Kyoto Allocation (3 days):
B.2.1: Temples (2 days) + Arashiyama (1 day)
B.2.2: Temples (1.5) + Gion (1) + Food tour (0.5)
  Score: 9/10 - Better balance

Selected: B.2.2

Final Solution:
Path: B → B.2 → B.2.2
Days 1-4: Tokyo (Shibuya/Shinjuku, Asakusa, Tsukiji food tour, Museums)
Days 5-7: Kyoto (Temples 1.5 days, Gion district, Food tour)
Budget: $2,850 (within limit)
```
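
The Level 1 choice in this example can be reproduced with a simple filter-then-rank step. The sketch below treats the $3000 budget as a hard constraint, which is an assumption (the example itself only lowers Option C's score); the `select` helper and the dictionary fields are hypothetical.

```python
# Scores and approximate costs copied from the Level 1 options above.
options = [
    {"id": "A", "score": 7, "cost": 2500},
    {"id": "B", "score": 9, "cost": 2900},
    {"id": "C", "score": 6, "cost": 3200},
]
BUDGET = 3000


def select(options, budget):
    """Drop options over budget, then return the highest-scoring one."""
    affordable = [opt for opt in options if opt["cost"] <= budget]
    if not affordable:
        return None  # nothing fits: relax the budget or regenerate options
    return max(affordable, key=lambda opt: opt["score"])


chosen = select(options, BUDGET)
print(chosen["id"])  # B
```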

## Evaluation Criteria Examples

### For Planning Tasks:
- **Feasibility:** Can it actually be done?
- **Cost:** Financial constraints
- **Time:** Time requirements
- **Risk:** What could go wrong?
- **Impact:** Expected benefit

### For Technical Solutions:
- **Performance:** Speed/efficiency
- **Maintainability:** Long-term code quality
- **Scalability:** Growth potential
- **Security:** Vulnerability assessment
- **Complexity:** Implementation difficulty

### For Creative Tasks:
- **Originality:** Uniqueness of idea
- **Coherence:** Internal consistency
- **Engagement:** Audience appeal
- **Feasibility:** Can it be executed?
- **Impact:** Desired effect achieved

## Pruning Strategies

### Early Pruning:
```
If thought scores < 0.3 on any critical criterion:
→ Prune immediately, don't explore further

If multiple thoughts score > 0.8:
→ Explore top 2-3 only (save tokens)
```

### Mid-Pruning:
```
After Level 2:
→ Review paths from root to current leaves
→ Prune paths with cumulative score < threshold
→ Focus resources on promising branches
```
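
The pruning rules above translate fairly directly into code. The sketch below is one possible reading, not a definitive implementation: the function names and data layout are hypothetical, and "cumulative score" is interpreted as the mean score along a root-to-leaf path, which the source does not specify.

```python
def early_prune(thoughts, critical, floor=0.3):
    """Drop thoughts scoring below `floor` on any critical criterion."""
    return [
        t for t in thoughts
        if all(t["scores"][name] >= floor for name in critical)
    ]


def cap_strong(thoughts, strong=0.8, top_k=3):
    """If several thoughts exceed `strong`, explore only the top_k of them."""
    ranked = sorted(thoughts, key=lambda t: t["total"], reverse=True)
    strong_ones = [t for t in ranked if t["total"] > strong]
    return strong_ones[:top_k] if len(strong_ones) > 1 else ranked


def mid_prune(paths, threshold):
    """Keep paths whose mean score (an assumed definition) meets threshold."""
    return [
        p for p in paths
        if sum(p["scores"]) / len(p["scores"]) >= threshold
    ]


# Hypothetical scored thoughts for illustration.
thoughts = [
    {"id": "A", "scores": {"feasibility": 0.9, "cost": 0.6}, "total": 0.85},
    {"id": "B", "scores": {"feasibility": 0.2, "cost": 0.9}, "total": 0.55},
    {"id": "C", "scores": {"feasibility": 0.8, "cost": 0.9}, "total": 0.90},
]
survivors = cap_strong(early_prune(thoughts, critical=["feasibility"]))
kept_paths = mid_prune(
    [
        {"path": ["A", "A.1"], "scores": [0.9, 0.4]},
        {"path": ["C", "C.2"], "scores": [0.9, 0.95]},
    ],
    threshold=0.7,
)
print([t["id"] for t in survivors], [p["path"] for p in kept_paths])
```

Here B is removed by the early rule (feasibility 0.2), and the A path is removed mid-search because its weak second level pulls the mean below the threshold.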

## Backtracking Example

```
Level 1: Selected Option A (score 0.9)

Level 2: Expanded A
  A.1: Score 0.6
  A.2: Score 0.5
  A.3: Score 0.4

All sub-options score poorly!

BACKTRACK to Level 1
Select Option B (second-best, score 0.8)

Level 2: Expanded B
  B.1: Score 0.9 ✓
  B.2: Score 0.85 ✓

Continue with B.1 path
```
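
The same backtracking behaviour can be sketched as a best-first depth-first search. The scores below come from the example; the cut-off of 0.7 for "scores poorly", the `tree`/`scores` layout, and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Tree and scores from the example above.
tree: Dict[str, List[str]] = {
    "root": ["A", "B"],
    "A": ["A.1", "A.2", "A.3"],
    "B": ["B.1", "B.2"],
}
scores = {
    "A": 0.9, "B": 0.8,
    "A.1": 0.6, "A.2": 0.5, "A.3": 0.4,
    "B.1": 0.9, "B.2": 0.85,
}


def search(node: str = "root", min_score: float = 0.7) -> Optional[List[str]]:
    """Try children best-first; return None so the caller can backtrack."""
    children = tree.get(node, [])
    if not children:
        return []  # leaf reached: nothing more to expand
    for child in sorted(children, key=scores.get, reverse=True):
        if scores[child] < min_score:
            break  # remaining children score even lower
        rest = search(child, min_score)
        if rest is not None:
            return [child] + rest
    return None  # every option failed at this node


found = search()
print(found)  # ['B', 'B.1']
```

Option A is tried first because it scores highest, but all of its sub-options fall below the cut-off, so the search returns to Level 1 and continues with B.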

## Comparing ToT to Other Methods

| Aspect | Zero-Shot | Few-Shot | CoT | ToT |
|--------|-----------|----------|-----|-----|
| **Paths explored** | 1 | 1 | 1 | Multiple |
| **Backtracking** | No | No | No | Yes |
| **Evaluation** | No | Implicit | No | Explicit |
| **Token cost** | Low | Medium | Medium | High |
| **Best for** | Simple | Format | Reasoning | Planning |

## Token Cost

ToT is expensive because it explores multiple paths:

**Typical costs:**
- Level 1 (3 options): ~200 tokens
- Level 2 (3 options per branch): ~400 tokens
- Level 3: ~600 tokens
- Evaluation overhead: ~300 tokens
- **Total: 500-2000+ tokens**

**Optimization:**
- Limit breadth (3 options per level, not 5)
- Limit depth (2-3 levels, not 5)
- Prune early and aggressively
- Use compact format (not verbose)
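
As a rough worked example, the typical costs listed above add up to about 1,500 tokens for a full three-level tree (200 + 400 + 600 + 300). The sketch below shows how limiting depth changes the estimate; treating the evaluation overhead as a flat 300 tokens regardless of depth is an assumption, and `estimate_tokens` is a hypothetical helper.

```python
# Approximate per-level costs from the list above.
LEVEL_COST = [200, 400, 600]
EVALUATION_OVERHEAD = 300  # assumed flat, independent of depth


def estimate_tokens(depth: int) -> int:
    """Rough token estimate for a tree explored to the given depth."""
    if not 1 <= depth <= len(LEVEL_COST):
        raise ValueError(f"depth must be between 1 and {len(LEVEL_COST)}")
    return sum(LEVEL_COST[:depth]) + EVALUATION_OVERHEAD


estimates = {depth: estimate_tokens(depth) for depth in (1, 2, 3)}
print(estimates)  # {1: 500, 2: 900, 3: 1500}
```

Under these assumptions, stopping at two levels saves roughly 600 tokens compared with three, which is why limiting depth appears among the optimizations.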

## When ToT is Overkill

❌ **Don't use ToT for:**
```
- "What is 2+2?" (use Zero-Shot)
- "Translate this to Spanish" (use Zero-Shot)
- "Extract email from text" (use Few-Shot)
- "Explain recursion" (use CoT)
```

✅ **Use ToT for:**
```
- "Design a scalable architecture for..."
- "Plan a marketing campaign with..."
- "Optimize supply chain considering..."
- "Choose technology stack for..."
```

## Quick Reference

| Aspect | Recommendation |
|--------|---------------|
| **Levels** | 2-3 (rarely >4) |
| **Branches per level** | 3-5 options |
| **Evaluation criteria** | 3-5 weighted factors |
| **Pruning** | Aggressive (score <0.4) |
| **Format** | YAML (readable) or XML (Claude) |
| **Token cost** | 500-2000+ tokens |
| **Best for** | Planning, strategy, optimization |

---

**Related:**
- [Chain of Thought](chain-of-thought.md) - Single path reasoning
- [Self-Consistency](self-consistency.md) - Multiple independent paths, answer chosen by agreement
- [YAML Format](yaml-format.md) - Human-readable format for ToT
- [Decision Matrix](decision_matrix.md) - When to use ToT