questionably-ultrathink 1.0.0

package/README.md ADDED
@@ -0,0 +1,259 @@
1
+ # Questionably UltraThink
2
+
3
+ A Claude Code plugin that integrates **Chain of Verification (CoVe)** and **Atom of Thoughts (AoT)** reasoning frameworks for rigorous, verifiable analysis.
4
+
5
+ ## What It Does
6
+
7
+ UltraThink enhances Claude's reasoning with two research-backed frameworks:
8
+
9
+ - **Atom of Thoughts (AoT)** - Decomposes complex problems into atomic sub-questions organized as a DAG, solving them systematically
10
+ - **Chain of Verification (CoVe)** - Verifies factual claims through independent questioning to reduce hallucinations
11
+
12
+ ## Installation
13
+
14
+ ```bash
15
+ # Add the marketplace
16
+ /plugin marketplace add snowmead/questionably-ultrathink
17
+
18
+ # Install the plugin
19
+ /plugin install questionably-ultrathink@snowmead-marketplace
20
+ ```
21
+
22
+ ## Development Setup
23
+
24
+ For contributors working on this plugin:
25
+
26
+ ```bash
27
+ ./setup.sh
28
+ ```
29
+
30
+ This installs dependencies (lefthook, comrak) if missing and sets up git hooks for automatic markdown formatting on commit.
31
+
32
+ ## Commands
33
+
34
+ ### `/questionably-ultrathink`
35
+
36
+ Run the full reasoning pipeline on a problem:
37
+
38
+ 1. Clarifies intent if needed
39
+ 2. Selects analysis rigor (standard/thorough/high-stakes)
40
+ 3. Decomposes into atomic questions (AoT) with complexity flagging
41
+ 4. Verifies critical atoms in parallel by dependency level (CoVe)
42
+ 5. Propagates corrections and recomputes dependent atoms
43
+ 6. Synthesizes and verifies final response
44
+ 7. Iterates if confidence is below threshold (thorough/high-stakes only)
45
+
46
+ ```
47
+ /questionably-ultrathink analyze whether this authentication approach is secure
48
+ ```
49
+
50
+ ### `/decompose`
51
+
52
+ Break down a complex problem into atomic sub-questions:
53
+
54
+ ```
55
+ /decompose how does React's reconciliation work and compare to Vue?
56
+ ```
57
+
58
+ ### `/verify`
59
+
60
+ Verify factual claims in the most recent response:
61
+
62
+ ```
63
+ /verify
64
+ ```
65
+
66
+ Or verify specific content:
67
+
68
+ ```
69
+ /verify the performance benchmarks mentioned above
70
+ ```
71
+
72
+ ## Automatic Activation
73
+
74
+ The skill activates automatically when your prompt contains a trigger phrase or matches one of these cases:
75
+
76
+ - "be thorough", "analyze carefully", "make sure this is right"
77
+ - "verify", "double-check", "are you sure"
78
+ - Complex multi-part questions
79
+ - Architecture or security decisions
80
+
81
+ ## Rigor Levels
82
+
83
+ When running the full pipeline, you can select analysis depth:
84
+
85
+ | Level | Iterations | Verification | Confidence Target | Use Case |
86
+ | --------------- | ---------- | ------------------------- | ----------------- | ---------------------------------- |
87
+ | **Standard** | 1 | Flagged atoms only | N/A | Most questions |
88
+ | **Thorough** | Up to 2 | Atoms with factual claims | ≥70% | Important decisions |
89
+ | **High-Stakes** | Up to 3 | ALL atoms | ≥85% | Security, architecture, production |
90
+
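The rigor table can be read as a small configuration. The sketch below is illustrative only: `RigorLevel`, `RIGOR`, and `should_iterate` are hypothetical names that do not appear in the plugin, and confidence is assumed to be a fraction between 0 and 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RigorLevel:
    max_iterations: int
    verification: str                    # which atoms get CoVe verification
    confidence_target: Optional[float]   # None means no target (Standard)


# Values copied from the table above; the data structure is an assumption.
RIGOR = {
    "standard": RigorLevel(1, "flagged atoms only", None),
    "thorough": RigorLevel(2, "atoms with factual claims", 0.70),
    "high-stakes": RigorLevel(3, "all atoms", 0.85),
}


def should_iterate(level: str, completed_iterations: int, confidence: float) -> bool:
    """Return True if another pass is allowed and the target is not yet met."""
    cfg = RIGOR[level]
    if cfg.confidence_target is None:
        return False  # Standard runs a single pass
    return (completed_iterations < cfg.max_iterations
            and confidence < cfg.confidence_target)
```

Under these assumptions, a Thorough run that finishes its first pass at 60% confidence would get one more pass, while a Standard run never iterates.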
91
+ ## Optional: Parallel.ai MCP Integration
92
+
93
+ The plugin includes optional MCP servers for enhanced web search during verification:
94
+
95
+ - `parallel-search` - Optimized fact-checking searches
96
+ - `parallel-task` - Deep research capabilities
97
+
98
+ **Setup:** Run `/mcp` in Claude Code and authenticate with Parallel.ai to enable them.
99
+
100
+ **Fallback:** The plugin works fully without MCP authentication, using native `WebSearch` and `WebFetch` tools.
101
+
102
+ ## How It Works
103
+
104
+ ### Architecture
105
+
106
+ ```
107
+ ┌───────────────────────────────────────────────────────────────────┐
108
+ │ User Commands │
109
+ │ /questionably-ultrathink | /decompose | /verify │
110
+ └─────────────────────────────────┬─────────────────────────────────┘
111
+
112
+
113
+ ┌───────────────────────────────────────────────────────────────────┐
114
+ │ Skill Orchestrator │
115
+ │ (skills/questionably-ultrathink) │
116
+ │ │
117
+ │ 1. Clarify intent (AskUserQuestion) │
118
+ │ 2. Select rigor level │
119
+ │ 3. Generate session ID │
120
+ │ 4. Invoke agents in sequence │
121
+ │ 5. Check for corrections after each verification wave │
122
+ │ 6. Iterate if confidence below threshold │
123
+ └───────────┬───────────────────────┬───────────────────┬───────────┘
124
+ │ │ │
125
+ ▼ ▼ ▼
126
+ ┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
127
+ │ atom-of- │ │ chain-of- │ │ aot-recompute │
128
+ │ thoughts │ │ verification │ │ │
129
+ │ │ │ │ │ │
130
+ │ Decomposes │ │ Verifies atoms │ │ Updates atoms │
131
+ │ problem into │ │ independently │ │ after CoV │
132
+ │ atomic DAG │ │ (factored exec) │ │ corrections │
133
+ └─────────┬─────────┘ └─────────┬─────────┘ └─────────┬─────────┘
134
+ │ │ │
135
+ └───────────────────────┼───────────────────────┘
136
+
137
+
138
+ ┌───────────────────────────────────────────────────────────────────┐
139
+ │ .questionably-ultrathink/{session-id}/ │
140
+ │ (File-Based Communication) │
141
+ │ │
142
+ │ metadata.md atoms/ corrections/ │
143
+ │ ├─ session_id ├─ A1.md ├─ A1.md (if errors) │
144
+ │ ├─ rigor ├─ A2.md └─ ... │
145
+ │ ├─ atoms (levels) ├─ A3.md │
146
+ │ └─ verification_order └─ FINAL.md │
147
+ └───────────────────────────────────────────────────────────────────┘
148
+ ```
149
+
150
+ **Data Flow:**
151
+
152
+ 1. **User invokes command** → Skill orchestrator begins
153
+ 2. **Orchestrator → AoT**: Decomposes problem, writes `metadata.md` + atom files
154
+ 3. **Orchestrator reads** `metadata.md` to get verification order (atoms grouped by dependency level)
155
+ 4. **Orchestrator → CoVe**: Verifies atoms at each level (parallel within level)
156
+ 5. **CoVe writes** correction files if errors are found
157
+ 6. **Orchestrator checks** for corrections after each wave
158
+ 7. **If corrections exist → aot-recompute**: Updates dependent atoms with corrected premises
159
+ 8. **Recomputed atoms re-verified** before proceeding to next level
160
+ 9. **Final synthesis** combines all verified/corrected atoms
161
+
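The data flow above can be approximated by a short loop. This is a sequential simplification, not the plugin's orchestrator: `run_waves`, `verify`, and `recompute` are hypothetical names, atoms within a level are visited one at a time rather than in parallel, and a dependent atom is recomputed when its own level is reached rather than immediately after the wave that produced the correction.

```python
from typing import Callable, Dict, List, Set


def run_waves(
    levels: List[List[str]],
    deps: Dict[str, List[str]],
    verify: Callable[[str], bool],
    recompute: Callable[[str], None],
) -> List[str]:
    """Verify atoms level by level and recompute atoms with changed premises.

    `verify(atom)` returns True when verification produced a correction.
    Returns the corrected atoms in the order they were found.
    """
    corrected: List[str] = []
    stale: Set[str] = set()  # atoms whose upstream premises changed
    for level in levels:
        for atom in level:
            # An atom is stale if any dependency was corrected or recomputed.
            if any(d in corrected or d in stale for d in deps.get(atom, [])):
                stale.add(atom)
                recompute(atom)   # step 7: apply corrected premises
            if verify(atom):      # steps 4-5 and 8: verify after any recompute
                corrected.append(atom)
    return corrected
```

With the example graph from the Output Format section, a correction to `A1` causes `A3` and then `FINAL` to be recomputed before each is verified.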
162
+ ### Atom of Thoughts (AoT)
163
+
164
+ Based on the paper ["Atom of Thoughts for Markov LLM Test-Time Scaling"](https://arxiv.org/abs/2502.12018) (HKUST, 2025).
165
+
166
+ Key features:
167
+
168
+ - Decomposes problems into atomic questions
169
+ - Builds a DAG of dependencies with topological levels
170
+ - Solves independent atoms in parallel
171
+ - Contracts solved atoms into minimal context for dependent atoms
172
+ - Follows the Markov property (each step depends only on its immediate dependencies)
173
+ - Flags atoms requiring verification (`needs_cov`) based on complexity heuristics
174
+ - Persists reasoning to files for inter-agent communication
175
+
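As a rough illustration of "topological levels", the sketch below assigns level 0 to atoms without dependencies and otherwise one more than the deepest dependency. The function name and the dict-of-lists input are assumptions made for this example; the README does not show how the plugin stores the DAG.

```python
from typing import Dict, List, Tuple


def topological_levels(deps: Dict[str, List[str]]) -> Dict[str, int]:
    """Map each atom to its level: 0 if it has no dependencies,
    otherwise 1 + the highest level among its dependencies."""
    levels: Dict[str, int] = {}

    def level_of(atom: str, path: Tuple[str, ...] = ()) -> int:
        if atom in path:
            raise ValueError(f"cycle detected at {atom}")
        if atom not in levels:
            parents = deps.get(atom, [])
            deepest = max((level_of(p, path + (atom,)) for p in parents), default=-1)
            levels[atom] = deepest + 1
        return levels[atom]

    for atom in deps:
        level_of(atom)
    return levels
```

Atoms that share a level have no dependencies on each other, which is what makes it safe to solve or verify them in parallel.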
176
+ ### Chain of Verification (CoVe)
177
+
178
+ Based on the paper ["Chain-of-Verification Reduces Hallucination in Large Language Models"](https://arxiv.org/abs/2309.11495) (Meta AI, 2023).
179
+
180
+ Key features:
181
+
182
+ - Extracts verifiable factual claims
183
+ - Generates targeted verification questions
184
+ - Answers each question **independently** (factored execution)
185
+ - Compares independent answers to original claims
186
+ - Reports inconsistencies with corrections
187
+ - Verifies atoms in parallel by dependency level
188
+ - Writes corrections to disk, triggering recomputation of dependent atoms
189
+
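The point of factored execution is that each verification question is answered without sight of the original draft. A minimal sketch follows, assuming three caller-supplied callables (`make_question`, `answer`, `agrees`) that stand in for model calls; none of these names come from the plugin.

```python
from typing import Callable, Iterable, List, NamedTuple


class Check(NamedTuple):
    claim: str
    question: str
    answer: str
    consistent: bool


def verify_claims(
    claims: Iterable[str],
    make_question: Callable[[str], str],
    answer: Callable[[str], str],
    agrees: Callable[[str, str], bool],
) -> List[Check]:
    """Run one independent verification question per claim."""
    results: List[Check] = []
    for claim in claims:
        question = make_question(claim)
        # Factored step: `answer` receives only the question, so it cannot
        # simply repeat whatever the original response asserted.
        independent = answer(question)
        results.append(Check(claim, question, independent, agrees(claim, independent)))
    return results
```

Claims whose `consistent` field is False correspond to the "INCONSISTENT" status in the report format and would lead to a correction file.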
190
+ ## Output Format
191
+
192
+ ### AoT Decomposition
193
+
194
+ ```
195
+ ## Atom of Thoughts Decomposition
196
+
197
+ ### Dependency Graph
198
+ - [ATOM:A1] What auth standard fits a stateless API? (level 0, needs_cov: true)
199
+ - [ATOM:A2] Where should tokens be validated? (level 0, needs_cov: false)
200
+ - [ATOM:A3] How should tokens be stored client-side? (level 1, deps: [A1], needs_cov: true)
201
+ - [ATOM:FINAL] Complete auth approach recommendation (level 2, deps: [A2, A3])
202
+
203
+ ### Solutions
204
+ [ATOM:A1] JWT - stateless, self-contained, widely supported
205
+ [ATOM:A2] Middleware layer before route handlers
206
+ ...
207
+
208
+ ### Verification Summary
209
+ - [ATOM:A1] needs_cov: true, confidence: high
210
+ - [ATOM:A2] needs_cov: false, confidence: high
211
+ - [ATOM:A3] needs_cov: true, confidence: medium
212
+ ```
213
+
214
+ ### CoVe Report
215
+
216
+ ```
217
+ ## Chain of Verification Report
218
+
219
+ ### Verification Results
220
+
221
+ **Claim 1:** "React was released in 2013"
222
+ - Verification Q: When was React first publicly released?
223
+ - Independent Answer: React was released in May 2013 at JSConf US
224
+ - Status: ✓ VERIFIED
225
+
226
+ **Claim 2:** "Virtual DOM was invented by React"
227
+ - Verification Q: Who invented the virtual DOM concept?
228
+ - Independent Answer: While React popularized it, similar concepts existed earlier
229
+ - Status: ⚠️ INCONSISTENT
230
+ ```
231
+
232
+ ## Confidence Markers
233
+
234
+ After using UltraThink, responses are marked:
235
+
236
+ - **[VERIFIED]** - Passed CoVe verification
237
+ - **[HIGH CONFIDENCE]** - Decomposed and analyzed systematically
238
+ - **[NEEDS EXTERNAL VERIFICATION]** - User should confirm externally
239
+ - **[UNCERTAIN]** - Flagged areas of doubt remain
240
+
241
+ ## When NOT to Use
242
+
243
+ Skip UltraThink for:
244
+
245
+ - Simple, direct questions
246
+ - Opinion or recommendation requests
247
+ - Quick lookups where speed matters
248
+ - Questions you already have high confidence in
249
+
250
+ ## License
251
+
252
+ MIT
253
+
254
+ ## References
255
+
256
+ - [Chain-of-Verification Paper](https://arxiv.org/abs/2309.11495) - Meta AI, 2023
257
+ - [Atom of Thoughts Paper](https://arxiv.org/abs/2502.12018) - HKUST, 2025
258
+ - [CoVe Implementation](https://github.com/ritun16/chain-of-verification)
259
+ - [AoT Implementation](https://github.com/qixucen/atom)
@@ -0,0 +1,17 @@
1
+ {
2
+ "name": "questionably-ultrathink",
3
+ "owner": {
4
+ "name": "snowmead"
5
+ },
6
+ "metadata": {
7
+ "description": "Plugin marketplace for UltraThink reasoning framework integrating Chain of Verification and Atom of Thoughts",
8
+ "version": "1.0.0"
9
+ },
10
+ "plugins": [
11
+ {
12
+ "name": "questionably-ultrathink",
13
+ "description": "Advanced reasoning plugin integrating Chain of Verification (CoVe) and Atom of Thoughts (AoT) frameworks for rigorous, verifiable analysis",
14
+ "source": "./"
15
+ }
16
+ ]
17
+ }
@@ -0,0 +1,11 @@
1
+ {
2
+ "name": "questionably-ultrathink",
3
+ "version": "1.0.0",
4
+ "description": "Advanced reasoning plugin integrating Chain of Verification (CoVe) and Atom of Thoughts (AoT) frameworks for rigorous, verifiable analysis",
5
+ "author": {
6
+ "name": "snowmead"
7
+ },
8
+ "repository": "https://github.com/snowmead/questionably-ultrathink",
9
+ "license": "MIT",
10
+ "keywords": ["reasoning", "verification", "decomposition", "cove", "aot", "analysis"]
11
+ }
@@ -0,0 +1,175 @@
1
+ ---
2
+ name: aot-recompute
3
+ description: |
4
+ Use this agent to recompute atoms after Chain of Verification finds corrections.
5
+ This agent reads corrections from disk and updates dependent atoms with corrected premises.
6
+
7
+ ## Examples:
8
+ <example>
9
+ Context: CoV found an error in atom A1, so A3, which depends on A1, must be recomputed
10
+ assistant: "I'll use the aot-recompute agent to update the dependent atoms with the correction."
11
+ </example>
12
+ model: haiku
13
+ tools: [Read, Write, Bash]
14
+ ---
15
+
16
+ # Atom of Thoughts Recomputation Agent
17
+
18
+ You recompute atoms after Chain of Verification has found corrections. Your job is to update dependent atoms with corrected premises.
19
+
20
+ \<core\_principle\>
21
+
22
+ ## Correction Propagation
23
+
24
+ When an upstream atom is corrected, all downstream atoms must be recomputed with the corrected information. You do NOT re-verify; you only recompute the reasoning based on the new premises.
25
+ \</core\_principle\>
26
+
27
+ \<input\_format\>
28
+
29
+ ## Expected Input
30
+
31
+ Your prompt will contain:
32
+
33
+ 1. **Session ID**: The session directory to work in
34
+ 2. **Corrected atoms**: List of atom IDs that were corrected
35
+ 3. **Atoms to recompute**: List of downstream atom IDs that depend on corrected atoms
36
+
37
+ Example prompt:
38
+
39
+ Session ID: a1b2c3d4
40
+ Corrected atoms: [A1]
41
+ Atoms to recompute: [A3, FINAL]
42
+
43
+ \</input\_format\>
44
+
45
+ <process>
46
+ ## Your Process
47
+
48
+ ### Step 1: Read Corrections
49
+
50
+ Read the correction files to understand what changed:
51
+
52
+ `.questionably-ultrathink/{session-id}/corrections/{atom-id}.md`
53
+
54
+ Each correction file contains:
55
+
56
+ - Original answer
57
+ - Corrected answer
58
+ - Reason for correction
59
+
60
+ ### Step 2: Read Session Metadata
61
+
62
+ Read the DAG structure:
63
+
64
+ `.questionably-ultrathink/{session-id}/metadata.md`
65
+
66
+ Identify the dependency chain to understand which atoms need which corrections.
67
+
68
+ ### Step 3: Read Original Atoms
69
+
70
+ For each atom to recompute, read its current file:
71
+
72
+ `.questionably-ultrathink/{session-id}/atoms/{atom-id}.md`
73
+
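Steps 1 to 3 each read from a fixed location under the session directory. The helper below only illustrates that layout; `session_paths` is a hypothetical name, and the paths are taken from the three steps above.

```python
from pathlib import Path
from typing import Dict

ROOT = Path(".questionably-ultrathink")


def session_paths(session_id: str, atom_id: str) -> Dict[str, Path]:
    """Build the three file locations an agent reads for one atom."""
    base = ROOT / session_id
    return {
        "correction": base / "corrections" / f"{atom_id}.md",  # Step 1
        "metadata": base / "metadata.md",                      # Step 2
        "atom": base / "atoms" / f"{atom_id}.md",              # Step 3
    }
```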
74
+ ### Step 4: Recompute Each Atom
75
+
76
+ For each atom in topological order (respecting dependencies):
77
+
78
+ 1. **Gather corrected context**: Collect the corrected answers from all dependency atoms
79
+ 2. **Re-reason**: Apply the same reasoning process but with corrected premises
80
+ 3. **Update the atom file**: Write the new reasoning and answer
81
+
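One way to derive the "Atoms to recompute" list and its order is to walk the DAG in topological order and collect every atom downstream of a correction. The sketch uses the standard-library `graphlib.TopologicalSorter` (Python 3.9+); `atoms_to_recompute` and the dict-of-lists graph are assumptions for illustration, not the agent's actual mechanism.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def atoms_to_recompute(deps: Dict[str, List[str]], corrected: Set[str]) -> List[str]:
    """Return downstream atoms of `corrected`, dependencies first."""
    affected: Set[str] = set()
    ordered: List[str] = []
    # static_order() yields each atom after all of its dependencies.
    for atom in TopologicalSorter(deps).static_order():
        if atom in corrected:
            continue  # already fixed by the correction file itself
        if any(d in corrected or d in affected for d in deps.get(atom, [])):
            affected.add(atom)
            ordered.append(atom)
    return ordered
```

For the example prompt above (corrected atoms `[A1]`, with `A3` depending on `A1` and `FINAL` depending on `A2` and `A3`), this yields `A3` before `FINAL`.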
82
+ ### Step 5: Update Metadata
83
+
84
+ Update the metadata.md file:
85
+
86
+ - Mark recomputed atoms with `recomputed: true`
87
+ - Update the `verification_order` if any `needs_cov` flags changed
88
+
89
+ </process>
90
+
91
+ \<atom\_update\_format\>
92
+
93
+ ## Updated Atom File Format
94
+
95
+ When recomputing an atom, write the updated file with:
96
+
97
+ ```markdown
98
+ ---
99
+ atom_id: {atom-id}
100
+ needs_cov: {true | false}
101
+ confidence: {high | medium | low}
102
+ dependencies: [{dependency atom IDs}]
103
+ recomputed: true
104
+ recomputed_due_to: [{list of corrected atom IDs that triggered this}]
105
+ ---
106
+
107
+ # Atom {atom-id}: {question}
108
+
109
+ ## Correction Context
110
+ - [ATOM:{corrected-id}] was corrected: {old} → {new}
111
+
112
+ ## Sources Consulted
113
+ - {Tool}: {query/path} → {key finding}
114
+
115
+ ## Reasoning Chain
116
+ 1. {First observation, using corrected premises}
117
+ 2. {Inference or connection made}
118
+ 3. {Conclusion drawn}
119
+
120
+ ## Uncertainties
121
+ - {Any gaps, assumptions, or areas of doubt}
122
+
123
+ ## Answer
124
+ {The updated concise atom answer}
125
+ ```
126
+
127
+ \</atom\_update\_format\>
128
+
129
+ \<output\_format\>
130
+
131
+ ## Output Format
132
+
133
+ Structure your response as:
134
+
135
+ ## Atom Recomputation Report
136
+
137
+ ### Session
138
+ {session-id}
139
+
140
+ ### Corrections Applied
141
+ - [ATOM:A1]: {old answer} → {corrected answer}
142
+
143
+ ### Atoms Recomputed
144
+
145
+ **[ATOM:A3]** (depends on: A1)
146
+ - Previous answer: {old}
147
+ - Updated answer: {new}
148
+ - Reasoning change: {what changed in the logic}
149
+
150
+ **[ATOM:FINAL]** (depends on: A3)
151
+ - Previous answer: {old}
152
+ - Updated answer: {new}
153
+ - Reasoning change: {what changed in the logic}
154
+
155
+ ### Files Updated
156
+ - .questionably-ultrathink/{session-id}/atoms/A3.md
157
+ - .questionably-ultrathink/{session-id}/atoms/FINAL.md
158
+ - .questionably-ultrathink/{session-id}/metadata.md
159
+
160
+ ### Verification Needs
161
+ {List any recomputed atoms that now need re-verification}
162
+ - [ATOM:A3] needs_cov: true (reasoning changed significantly)
163
+
164
+ \</output\_format\>
165
+
166
+ <guidelines>
167
+ ## Guidelines
168
+
169
+ 1. **Respect topological order** - Recompute atoms in dependency order so each atom has access to corrected upstream answers
170
+ 2. **Preserve original reasoning structure** - Only change what the correction necessitates
171
+ 3. **Be explicit about what changed** - Document the correction context clearly
172
+ 4. **Re-assess `needs_cov`** - A recomputed atom may need re-verification if reasoning changed significantly
173
+ 5. **Don't expand scope** - Only recompute the atoms you were asked to recompute
174
+
175
+ </guidelines>