@synsci/thesis 0.1.0
- package/LICENSE +21 -0
- package/README.md +46 -0
- package/bin/thesis.mjs +6 -0
- package/package.json +50 -0
- package/skills/thesis/SKILL.md +58 -0
- package/skills/thesis/getting-started/quickstart.md +209 -0
- package/skills/thesis/getting-started/tutorial-overview.md +220 -0
- package/skills/thesis/reference/command-presets.md +273 -0
- package/skills/thesis/reference/experiment-design-protocol.md +200 -0
- package/skills/thesis/reference/thesis-mcp-tool-map.md +201 -0
- package/src/agents.mjs +259 -0
- package/src/cli.mjs +1111 -0
- package/src/mcp-writer.mjs +441 -0
- package/src/setup-auth.mjs +548 -0
- package/src/skill-installer.mjs +467 -0
@@ -0,0 +1,273 @@
# Command Presets

Use this guide when the user frames the request as `/to-graph`,
`/reproduce`, `/lookahead`, `/fsd`, or equivalent command-like language.

These are workflow presets inside the Thesis skill. They are not separate
product surfaces.

## `/to-graph`

Use `/to-graph` when the user's main goal is to convert unstructured source
material into a Thesis graph.

Typical inputs:

- paper
- blog post
- README
- markdown wiki
- research notes

Output contract:

- create or update a root node or subtree
- put the primary narrative in node `content`
- attach supporting files as artifacts
- add only durable graph edges

Execution policy:

- do not acquire compute by default
- do not spend budget implicitly
- execute only if the user explicitly asks to run branches

Default mapping:

- stable page or concept -> one node
- main markdown/text -> `content`
- short synopsis -> `summary`
- figures, PDFs, datasets, notebooks, code, tables -> artifacts
- durable semantic relationships -> edges
- ordinary hyperlinks or citations -> remain in markdown unless they deserve an
  actual graph relationship

Typical tool sequence:

1. `mcp__thesis__thesis_stage_node_create` -- create root or subtree nodes
2. `mcp__thesis__thesis_stage_node_update` -- populate content, summary
3. `mcp__thesis__thesis_prepare_artifact_uploads` -- if artifacts exist
4. Upload artifact bytes to signed URLs
5. `mcp__thesis__thesis_finalize_artifact_uploads` -- finalize uploads
6. `mcp__thesis__thesis_add_parent` -- wire edges between nodes
7. `mcp__thesis__thesis_commit_node` -- commit each node
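
The default mapping for `/to-graph` can be read as a small routing table. The sketch below is illustrative only: the item kind labels (`main_text`, `synopsis`, `hyperlink`, and so on) and the `durable` flag are hypothetical names chosen for this example, not part of the Thesis contract.

```python
# Hypothetical item kinds; the guide only names the categories in prose.
ARTIFACT_KINDS = {"figure", "pdf", "dataset", "notebook", "code", "table"}
LINK_KINDS = {"relationship", "hyperlink", "citation"}


def route_item(kind: str, durable: bool = False) -> str:
    """Return where a source item should land in the graph."""
    if kind == "main_text":
        return "content"
    if kind == "synopsis":
        return "summary"
    if kind in ARTIFACT_KINDS:
        return "artifact"
    if kind in LINK_KINDS:
        # Only durable relationships become edges; ordinary links and
        # citations stay inline in the node's markdown.
        return "edge" if durable else "inline"
    raise ValueError(f"unknown item kind: {kind}")
```

Whether a link is durable enough to deserve an edge remains a judgment call; the flag only records that decision.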

## `/reproduce`

Use `/reproduce` when the source is claim-bearing and the user wants Thesis
to both structure the source and run empirical validation.

`/reproduce` is conceptually:

```text
/to-graph --mode reproduce --execute
```

That means:

1. graphify the source first
2. split it into explicit validation branches
3. execute those branches within a hard maximum budget

Typical sources:

- research papers
- benchmark claims
- blog posts with strong empirical claims
- replication requests on an existing Thesis graph

Budget semantics:

- a max budget is a hard ceiling, not a soft preference
- if no max budget is given, ask for one before any compute acquisition
- until the budget is explicit, planning and graph construction may proceed, but
  empirical execution should not start
- prefer the highest-information, lowest-cost branches first
- stop when the claim is resolved or the budget limit is reached
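
One way to picture these budget semantics is a greedy selection under a hard ceiling. This is a minimal sketch under stated assumptions: the `Branch` fields and the information-per-cost ranking are hypothetical, and the "claim is resolved" stop condition is not modeled because it depends on results that only exist after execution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Branch:
    name: str
    expected_information: float  # hypothetical score, higher is better
    estimated_cost: float        # same unit as the budget, must be > 0


def plan_within_budget(branches: list, max_budget: Optional[float]) -> list:
    """Pick branches by information per unit cost without exceeding the cap."""
    if max_budget is None:
        # No explicit budget: planning may continue, execution may not.
        raise ValueError("ask for a max budget before acquiring compute")
    ranked = sorted(
        branches,
        key=lambda b: b.expected_information / b.estimated_cost,
        reverse=True,
    )
    chosen, spent = [], 0.0
    for branch in ranked:
        if spent + branch.estimated_cost <= max_budget:
            chosen.append(branch)
            spent += branch.estimated_cost
    return chosen
```

The ceiling is enforced on estimated cost here; a real run would also need to track actual spend against the same limit.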

Default branch types:

- baseline sanity check
- mechanism or intermediate-signal check
- ablation
- efficiency or overhead check
- robustness or failure-mode check
- follow-up analysis branch after results land

Typical tool sequence:

1. `/to-graph` sequence to structure the source
2. `mcp__thesis__thesis_branch_node` -- create validation branches
3. `mcp__thesis__thesis_stage_node_update` -- set hypothesis for each branch
4. `mcp__thesis__thesis_request_compute_grant_approval` -- request budget
5. `mcp__thesis__thesis_compute_acquire` -- acquire compute
6. Run experiments, collect results
7. `mcp__thesis__thesis_prepare_artifact_uploads` -- prepare evidence uploads
8. Upload artifact bytes to signed URLs
9. `mcp__thesis__thesis_finalize_artifact_uploads` -- finalize
10. `mcp__thesis__thesis_commit_node` -- commit with outcome

## `/lookahead`

Use `/lookahead` when the user wants Thesis to plan the next frontier of work
from an existing set of nodes without executing it yet.

`/lookahead` is conceptually:

```text
/plan-frontier --execute=false
```

That means:

1. resolve a starting set of Thesis nodes
2. resolve a measurable objective for those nodes
3. expand the frontier into staged next-step nodes

Starting-node policy:

- prefer explicit node ids, slugs, or directly named nodes
- otherwise use the focused or recently referenced graph context
- if no stable starting set can be recovered, ask

Objective policy:

- if the objective is missing, ask for one
- if the user insists on not specifying it, infer the objective from the graph
  context and state it explicitly before planning
- the objective must be measurable enough to rank frontier branches

Lookahead policy:

- depth `n` means plan `n` hops ahead from the currently resolved frontier
- default `n=1`
- width `k` means expand up to `k` non-redundant frontier directions
- avoid redundant siblings that only restate the same branch with different
  wording

Node-typing policy:

- if a planned node is expected to produce evidence or artifacts, type it
  `empirical`
- otherwise type it `insight`
- leave planned nodes staged until the underlying work has a resolution

Output contract:

- create or update a graph-local planning contract
- write the objective, depth `n`, width `k`, and terminal condition into the
  graph
- stage the next frontier of nodes without hidden execution

Typical tool sequence:

1. `mcp__thesis__thesis_get_node` or `mcp__thesis__thesis_get_node_tree` -- resolve starting nodes
2. `mcp__thesis__thesis_summarize_node_tree` -- understand current state
3. `mcp__thesis__thesis_branch_node` -- create frontier branches (staged)
4. `mcp__thesis__thesis_stage_node_update` -- populate each branch with plan details
5. Do NOT commit or execute -- leave nodes staged for review
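
The lookahead policies above can be sketched as a planning contract plus a frontier expansion step. Everything named here is an assumption made for illustration: the `PlanningContract` shape, the placeholder default for `width` (the guide states a default only for `n`), and the case-and-whitespace normalization, which is a crude stand-in for real redundancy detection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanningContract:
    objective: str
    terminal_condition: str
    depth: int = 1  # n; the guide's stated default
    width: int = 3  # k; placeholder, no default is given in the guide


def normalize(title: str) -> str:
    """Collapse case and whitespace so trivially reworded titles match."""
    return " ".join(title.lower().split())


def stage_frontier(candidates: list, contract: PlanningContract) -> list:
    """candidates: ranked (title, expects_evidence) pairs for one hop."""
    seen, staged = set(), []
    for title, expects_evidence in candidates:
        key = normalize(title)
        if key in seen:
            continue  # redundant sibling, skip it
        seen.add(key)
        staged.append({
            "title": title,
            "kind": "empirical" if expects_evidence else "insight",
            "state": "staged",  # never committed or executed here
        })
        if len(staged) == contract.width:
            break
    return staged
```

Nothing in this sketch calls a Thesis tool; it only shows the shape of the decision that precedes the staged branch calls.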

## `/fsd`

Use `/fsd` ("full self-driving") when the user wants Thesis to keep
advancing a research frontier autonomously under a specified budget.

`/fsd` is conceptually:

```text
/lookahead --execute --replan-after-each-resolution
```

That means:

1. resolve starting nodes
2. resolve a measurable objective
3. resolve a hard budget
4. expand the frontier up to depth `n` and width `k`
5. execute up to `k` frontier branches in parallel
6. refresh the lookahead after each resolved node so the graph stays `n` hops
   ahead

Budget semantics:

- credits are the default budget unit unless the user specifies another
  measurable budget semantic
- the budget is a hard ceiling, not a soft preference
- if no budget is given, ask before execution starts
- if the user supplies a non-credit budget unit, persist that exact unit in the
  graph-local control contract

Objective policy:

- if the objective is missing, ask for one
- if the user refuses to specify it, infer it from the existing graph context
  and state it explicitly before execution
- the objective must be precise enough that the model can decide whether a
  branch is advancing or stalling it

Width and worker policy:

- `k` always means maximum distinct frontier directions
- in `/fsd`, that same `k` also caps concurrent workers
- if more than `k` worthwhile branches exist, route the extra branches in
  sequence via the parent model
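
A minimal sketch of the width and worker policy, assuming the branches arrive already ranked: the first `k` go to concurrent workers and the rest wait in a sequential queue. The function name and the tuple return shape are hypothetical.

```python
def split_for_workers(ranked_branches: list, k: int) -> tuple:
    """Return (parallel, queued): at most k parallel, the rest in order."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return ranked_branches[:k], ranked_branches[k:]
```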

Termination policy:

- do not rely on new user feedback to decide whether to continue
- before execution, persist a graph-local terminal condition that later steps
  can read directly from node state
- stop when the persisted condition is met, for example:
  - objective reached
  - hard budget exhausted
  - no non-redundant frontier branch with positive expected value remains
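
The termination policy amounts to a check that reads only persisted state. The sketch below assumes a hypothetical `ControlState` record; the field names are illustrative and do not describe how Thesis actually stores the control contract.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ControlState:
    objective_reached: bool
    budget_limit: float
    budget_spent: float
    # None means no non-redundant frontier branch remains at all.
    best_branch_expected_value: Optional[float]


def stop_reason(state: ControlState) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    if state.objective_reached:
        return "objective reached"
    if state.budget_spent >= state.budget_limit:
        return "hard budget exhausted"
    value = state.best_branch_expected_value
    if value is None or value <= 0:
        return "no positive-value frontier branch remains"
    return None
```

Because the function takes no chat context, a later cycle can reach the same decision from node state alone.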

Graph-maintenance policy:

- keep unresolved planned nodes staged
- commit nodes only when their work has a resolution
- empirical nodes should accumulate the expected evidence or artifact before
  completion
- insight nodes should only be committed once the synthesis is actually known

Typical tool sequence (per cycle):

1. `mcp__thesis__thesis_get_node_tree` -- read current frontier
2. `mcp__thesis__thesis_branch_node` -- expand frontier
3. `mcp__thesis__thesis_stage_node_update` -- populate branch plans
4. `mcp__thesis__thesis_request_compute_grant_approval` -- request budget
5. `mcp__thesis__thesis_compute_acquire` -- acquire compute
6. Run experiment, collect results
7. `mcp__thesis__thesis_prepare_artifact_uploads`, upload bytes to the signed URLs, then `mcp__thesis__thesis_finalize_artifact_uploads` -- upload evidence
8. `mcp__thesis__thesis_commit_node` -- commit resolved branch
9. Loop: re-expand frontier from new state, repeat until termination

## Choosing Between Them

Use `/to-graph` when the user wants structure from source material.

Use `/reproduce` when the user wants structure plus budgeted validation of
existing knowledge.

Use `/lookahead` when the user wants staged next-step planning from existing
nodes.

Use `/fsd` when the user wants budgeted autonomous frontier advancement.

If a graph starts as `/lookahead`, `/fsd` should resume from the same graph and
keep it `n` steps ahead rather than planning from scratch each time.

## Guardrails

- Do not hide spend behind `/to-graph`.
- Do not treat `/reproduce` as a magical bulk importer.
- Do not blur `/reproduce` and `/fsd`.
  - `/reproduce` validates existing knowledge claims.
  - `/fsd` expands the frontier to create new knowledge under budget.
- Build the graph explicitly with nodes, artifacts, and selected edges.
- Keep the source material legible in `content`; do not dump everything into
  artifacts.
- Keep `/reproduce` scoped by cost and decision value, not by exhaustively
  trying every possible branch.
- Keep `/fsd` graph-local: future continuation and stopping decisions should be
  derivable from the persisted node state rather than from fresh chat context.

@@ -0,0 +1,200 @@
# Experiment Design Protocol

Use this when the user needs help turning research intent into a well-formed experiment or exploration before spending compute.

## Goal

Help the user clarify what they are trying to learn, shape the work around that question, and avoid wasting compute before the design is solid.

## Operating rules

- Treat design and execution as separate phases.
- Optimize for clarity of experimental purpose, not speed to a first run.
- Adapt depth to the user's experience and the clarity already present in the conversation.
- Apply epistemic discipline: separate what is known from what is assumed, and name uncertainty instead of smoothing it over.
- Support both hypothesis-driven and exploratory work.
- Support simple runs and complex shapes such as multi-stage pipelines, sweep-then-deep-dive, multi-arm comparisons, and custom structures.
- Keep all 10 brief fields, but allow exploratory fields to be marked `exploratory` or `TBD` rather than fabricated.
- Use quick Socratic questioning to surface assumptions, confidence, and what would change the user's mind. Keep it to 1-2 short questions per turn.
- Propose defaults for structural choices such as experiment shape, stop condition, or artifact plan. Use questions rather than defaults for epistemic choices such as beliefs, assumptions, and what evidence would matter.
- When a reasoning or design gap is visible, raise it as a question rather than an assertion.
- If the core gate is satisfied and the user wants to proceed, stop asking more design questions.
- Use Thesis `insight` nodes to preserve design context when it will help across turns or sessions.

## Phase 1: Clarify what the user is trying to learn

Start with:

1. What are you trying to learn or decide?
2. Is this mainly hypothesis-driven or exploratory right now?

Keep this phase quick. Ask 1-2 short questions per turn, and use light Socratic questioning as an epistemic check after the user states the learning goal: briefly surface what seems known versus assumed before moving on.

If the work is hypothesis-driven, ask:

- What is the hypothesis?
- Compared to what baseline or alternative?
- What result would matter?

If the work is exploratory, ask:

- What is the big question?
- What would you need to learn first before tackling it?
- What is the cheapest or cleanest way to learn that first piece?
- What signal or pattern are you looking for?

If the user is still fuzzy after this phase, stay in planning mode. If needed, create or update an `insight` node rather than an `empirical` node.

## Phase 2: Shape the work

Choose or define the experiment shape through quick Socratic questioning:

- single focused run
- multi-stage pipeline
- sweep then deep-dive
- multi-arm comparison
- custom shape

If the user is unsure about structure, propose a default shape, stop condition, or artifact plan instead of extending the question loop.

Then fill the experiment brief:

- `question`: the decision or learning goal
- `hypothesis`: the claim being tested
- `comparator`: the baseline or alternative
- `unit_of_work`: what one run, branch, or stage actually changes
- `primary_metric`: the main number or observable to inspect
- `artifact_plan`: which artifact will help interpret the result
- `budget_cap`: max spend or runtime for the current stage
- `stop_condition`: when to stop rather than letting the run expand
- `interpretation`: what would count as signal, no signal, or ambiguity
- `next_branch_if_inconclusive`: the follow-up branch if the result is unclear

For exploratory work, `hypothesis` or `comparator` may be marked `exploratory` or `TBD`, but the learning goal still needs to be explicit.

## Phase 3: Run the adaptive design gate

Frame the gate as preventing waste, not enforcing bureaucracy.

Always check:

- the question or goal is explicit
- at least one metric or observable is defined
- a budget cap or stop condition exists

For hypothesis-driven work, also check:

- there is a falsifiable hypothesis
- there is a comparator or baseline

For exploratory work, instead check:

- the user can say what they are looking for
- the first learning step is scoped well enough to run

Additional checks when relevant:

- an artifact plan or `no_artifacts_reason` exists
- the run shape matches the question and is not changing too many important things without purpose
- an interpretation rule or next branch is defined

If the core gate passes and the user wants to proceed, let them run even if some non-core details are still `TBD`.

When blocked, ask only the next necessary question instead of reopening the whole brief.
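
The core gate can be sketched as a function over the brief. This is a rough illustration, not the skill's implementation: the brief is modeled as a plain dict, the keys `looking_for` and `first_step` are hypothetical additions for the exploratory checks (they are not among the ten brief fields), and falsifiability is not something the code can judge, so it only checks that a hypothesis is present.

```python
PLACEHOLDERS = {"", "tbd", "exploratory"}


def is_filled(brief: dict, field: str) -> bool:
    """True when the field holds something other than a placeholder."""
    value = brief.get(field) or ""
    return str(value).strip().lower() not in PLACEHOLDERS


def design_gate(brief: dict, experiment_type: str) -> tuple:
    """Return ("ready" | "blocked", ordered list of remaining gaps)."""
    gaps = []
    if not is_filled(brief, "question"):
        gaps.append("question")
    if not is_filled(brief, "primary_metric"):
        gaps.append("primary_metric")
    if not (is_filled(brief, "budget_cap") or is_filled(brief, "stop_condition")):
        gaps.append("budget_cap or stop_condition")
    if experiment_type == "hypothesis-driven":
        required = ("hypothesis", "comparator")
    elif experiment_type == "exploratory":
        required = ("looking_for", "first_step")  # hypothetical keys
    else:
        raise ValueError(f"unknown experiment type: {experiment_type}")
    gaps += [field for field in required if not is_filled(brief, field)]
    return ("ready" if not gaps else "blocked", gaps)
```

When the gate is blocked, the first entry in the gap list is a natural candidate for the single next question.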

## Phase 4: Confirm the plan

Before any compute request or training launch, restate:

- what we are trying to learn
- the experiment shape
- the metric or observable
- the artifact plan
- the budget or stop condition
- what result would change the next step

If the run is expensive or high-risk, ask for explicit confirmation.

## Phase 5: Drive Thesis

Use Thesis in layers when possible.

### Design layer

Use an `insight` node to capture rationale, open questions, experiment shape, and any decomposition needed for exploratory or multi-stage work.

Typical flow:

1. `mcp__thesis__thesis_stage_node_create`
2. `mcp__thesis__thesis_stage_node_update`
3. `mcp__thesis__thesis_commit_node`

### Execution layer

Only after the design gate passes, create or branch the `empirical` node for the runnable part of the work.

Typical flow:

1. `mcp__thesis__thesis_branch_node` or `mcp__thesis__thesis_stage_node_create`
2. `mcp__thesis__thesis_stage_node_update` with the explicit run summary and the local question or hypothesis for that branch
3. `mcp__thesis__thesis_request_compute_grant_approval` only after the user accepts the design
4. `mcp__thesis__thesis_compute_acquire` and related compute tools only when execution is actually needed
5. `mcp__thesis__thesis_prepare_artifact_uploads`, upload the bytes to the signed URLs, then `mcp__thesis__thesis_finalize_artifact_uploads`
6. Do a brief epistemic check before commit: verify what the evidence actually shows, whether it matches the interpretation rule from the brief, and whether any gap between the data and the hoped-for story needs to be named explicitly in the node summary.
7. `mcp__thesis__thesis_commit_node`

Important notes:

- Exploratory work can stay in `insight` nodes until a specific empirical probe is ready.
- Because `empirical` commits require a non-empty `hypothesis`, turn each runnable exploratory probe into a concrete local question or hypothesis for that branch.
- For multi-stage or multi-arm work, use branches to represent stages or arms and keep summaries clear about how each branch feeds the next.
- Completed empirical work needs artifacts or a `no_artifacts_reason`.

## Adaptive question flow

Ask in short batches of 1-2 questions per turn. Keep the flow light, Socratic, and epistemic rather than exhaustive.

1. What are you trying to learn or decide?
2. Is this hypothesis-driven or exploratory?
3. Briefly separate what the user seems to know from what they seem to be assuming before locking the design.
4. If hypothesis-driven: what is the hypothesis and compared to what?
5. If exploratory: what is the first thing you need to learn and what is the cheapest way to learn it?
6. What experiment shape fits this work?
7. What metric or observable and artifact will you inspect?
8. What budget or stop condition keeps this from wasting compute?
9. What interpretation rule will distinguish evidence from expectation?
10. If the result is ambiguous, what is the next branch?

## Output template

Use this shape when turning a vague request into an executable plan:

```md
Experiment brief

- Question:
- Hypothesis:
- Comparator:
- Unit of work:
- Primary metric or observable:
- Artifact plan:
- Budget/time cap:
- Stop condition:
- Interpretation rule:
- Next branch if inconclusive:

Experiment type: hypothesis-driven | exploratory
Design gate: ready | blocked
Remaining gap:
Recommended next action:
```
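
As an illustration, the template can be filled from a brief held as a dict keyed by the ten field names from Phase 2. The helper below is a hypothetical sketch; it writes `TBD` for any field that has no value instead of inventing one.

```python
# (brief key, template label) pairs, in template order.
FIELDS = [
    ("question", "Question"),
    ("hypothesis", "Hypothesis"),
    ("comparator", "Comparator"),
    ("unit_of_work", "Unit of work"),
    ("primary_metric", "Primary metric or observable"),
    ("artifact_plan", "Artifact plan"),
    ("budget_cap", "Budget/time cap"),
    ("stop_condition", "Stop condition"),
    ("interpretation", "Interpretation rule"),
    ("next_branch_if_inconclusive", "Next branch if inconclusive"),
]


def render_brief(brief: dict, experiment_type: str, gate: str,
                 remaining_gap: str = "none", next_action: str = "TBD") -> str:
    """Render the experiment brief in the template's line order."""
    lines = ["Experiment brief", ""]
    lines += [f"- {label}: {brief.get(key) or 'TBD'}" for key, label in FIELDS]
    lines += [
        "",
        f"Experiment type: {experiment_type}",
        f"Design gate: {gate}",
        f"Remaining gap: {remaining_gap}",
        f"Recommended next action: {next_action}",
    ]
    return "\n".join(lines)
```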

## Generalization rule

Reuse the same protocol across domains by changing the unit of work and artifact type:

- model training -> metrics tables, loss curves, checkpoints
- benchmark comparisons -> score tables, latency plots, error slices
- prompt evaluations -> rubric tables, failure examples, sampled outputs
- product experiments -> funnels, event tables, user-segment slices
- scientific workflows -> figures, logs, result tables, notebooks

@@ -0,0 +1,201 @@
# Thesis MCP Tool Map

This reference describes the Thesis MCP tool families and the common runtime
contract expected by the public Thesis skill.

Use it as a routing guide, not as a session snapshot. Always verify the exact
tool surface exposed by your current MCP host before executing critical flows.

## Core Contract Expectations

- Node lifecycle and graph-mutation flows use optimistic locking
  (`expected_revision`).
- Node commits require `kind`, `outcome`, and `summary`.
- `kind` is typically `insight` or `empirical`.
- `insight` commits require non-empty `insights`.
- `empirical` commits require a non-empty `hypothesis`; completed empirical
  commits also require artifacts or a `no_artifacts_reason`.
- Artifact publish brackets the raw upload with two tool calls:
  prepare the upload, upload raw bytes to the returned signed URLs, then
  finalize the upload batch.
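
The commit rules above can be checked locally before calling the commit tool. This is a sketch under assumptions: the payload is modeled as a plain dict, the `artifacts` key and the `completed` flag are hypothetical (the reference does not say how completion is signalled), and the live contract from `get_contract` remains the authority.

```python
def commit_problems(payload: dict, completed: bool = False) -> list:
    """Return human-readable problems; an empty list means it looks valid."""
    problems = [
        f"missing {field}"
        for field in ("kind", "outcome", "summary")
        if not payload.get(field)
    ]
    kind = payload.get("kind")
    if kind == "insight" and not payload.get("insights"):
        problems.append("insight commits need non-empty insights")
    if kind == "empirical":
        if not payload.get("hypothesis"):
            problems.append("empirical commits need a non-empty hypothesis")
        has_evidence = payload.get("artifacts") or payload.get("no_artifacts_reason")
        if completed and not has_evidence:
            problems.append(
                "completed empirical commits need artifacts or no_artifacts_reason"
            )
    return problems
```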

## Tool Families

### Session, Auth, and Contract

| # | Tool | Purpose |
|---|------|---------|
| 1 | `mcp__thesis__thesis_auth_status` | Check current authentication state |
| 2 | `mcp__thesis__thesis_get_contract` | Retrieve the full runtime contract |
| 3 | `mcp__thesis__thesis_get_contract_section` | Retrieve a specific contract section by ID |
| 4 | `mcp__thesis__thesis_get_credits_balance` | Check remaining credits balance |

### Node Discovery and Read

| # | Tool | Purpose |
|---|------|---------|
| 5 | `mcp__thesis__thesis_list_nodes` | List nodes with optional filters |
| 6 | `mcp__thesis__thesis_get_node` | Get a single node by ID |
| 7 | `mcp__thesis__thesis_get_node_tree` | Get the subtree rooted at a node |
| 8 | `mcp__thesis__thesis_get_node_ancestry` | Get the ancestor chain of a node |
| 9 | `mcp__thesis__thesis_summarize_node_tree` | Get a condensed summary of a subtree |
| 10 | `mcp__thesis__thesis_get_campaign_snapshot` | Get the current campaign snapshot |
| 11 | `mcp__thesis__thesis_list_audit` | List audit log entries |
| 12 | `mcp__thesis__thesis_resolve_node_slug` | Resolve a human-readable slug to a node ID |

### Node Mutation, Branching, and Commit

| # | Tool | Purpose |
|---|------|---------|
| 13 | `mcp__thesis__thesis_stage_node_create` | Create a new node in staged (draft) state |
| 14 | `mcp__thesis__thesis_stage_node_update` | Update fields on a staged node |
| 15 | `mcp__thesis__thesis_commit_node` | Commit a staged node with kind, outcome, summary |
| 16 | `mcp__thesis__thesis_branch_node` | Create a child branch from an existing node |
| 17 | `mcp__thesis__thesis_merge_nodes` | Merge two or more nodes |
| 18 | `mcp__thesis__thesis_add_parent` | Add an additional parent edge to a node |
| 19 | `mcp__thesis__thesis_remove_parent` | Remove a parent edge from a node |
| 20 | `mcp__thesis__thesis_delete_node` | Delete a single node |
| 21 | `mcp__thesis__thesis_bulk_delete_nodes` | Delete multiple nodes in one call |

### Access Policy and Collaboration

| # | Tool | Purpose |
|---|------|---------|
| 22 | `mcp__thesis__thesis_get_node_sharing` | Get sharing/access policy for a node |
| 23 | `mcp__thesis__thesis_set_sharing_for_node` | Set sharing policy on a single node |
| 24 | `mcp__thesis__thesis_set_sharing_for_nodes` | Set sharing policy on multiple nodes |

### Tags and Graph Annotation

| # | Tool | Purpose |
|---|------|---------|
| 25 | `mcp__thesis__thesis_create_node_tag` | Create a new tag definition |
| 26 | `mcp__thesis__thesis_update_node_tag` | Update an existing tag |
| 27 | `mcp__thesis__thesis_delete_node_tag` | Delete a tag definition |
| 28 | `mcp__thesis__thesis_set_node_tag_assignments` | Assign or remove tags on nodes |

### Artifacts

| # | Tool | Purpose |
|---|------|---------|
| 29 | `mcp__thesis__thesis_list_artifacts` | List artifacts attached to a node |
| 30 | `mcp__thesis__thesis_get_artifact` | Get metadata for a single artifact |
| 31 | `mcp__thesis__thesis_get_artifact_preview` | Get a preview/thumbnail of an artifact |
| 32 | `mcp__thesis__thesis_prepare_artifact_uploads` | Prepare signed URLs for artifact upload |
| 33 | `mcp__thesis__thesis_finalize_artifact_uploads` | Finalize an artifact upload batch |
| 34 | `mcp__thesis__thesis_set_artifact_note` | Set or update the note on an artifact |
| 35 | `mcp__thesis__thesis_delete_artifact` | Delete an artifact |

Common artifact types include:
`text`, `table`, `json`, `image`, `banner`, `html`, `plotly_html`, `vega`,
`checkpoint`, and `diff_carousel`.

### Export and Import

| # | Tool | Purpose |
|---|------|---------|
| 36 | `mcp__thesis__thesis_export_subgraph` | Export a subgraph as a portable bundle |
| 37 | `mcp__thesis__thesis_import_subgraph` | Import a previously exported subgraph |
| 38 | `mcp__thesis__thesis_export_summary` | Export a text summary of a subtree |
| 39 | `mcp__thesis__thesis_export_summary_stream` | Stream a text summary of a subtree |
| 40 | `mcp__thesis__thesis_export_summary_pdf` | Export a summary as PDF |
| 41 | `mcp__thesis__thesis_export_summary_render_pdf` | Render and export a summary as PDF |

### Executions

| # | Tool | Purpose |
|---|------|---------|
| 42 | `mcp__thesis__thesis_launch_execution` | Launch a compute execution on a node |
| 43 | `mcp__thesis__thesis_list_executions` | List executions with optional filters |
| 44 | `mcp__thesis__thesis_terminate_execution` | Terminate a running execution |

### Managed Compute

| # | Tool | Purpose |
|---|------|---------|
| 45 | `mcp__thesis__thesis_approval_session_heartbeat` | Keep an approval session alive |
| 46 | `mcp__thesis__thesis_list_approval_sessions` | List active approval sessions |
| 47 | `mcp__thesis__thesis_expire_approval_session` | Expire an approval session |
| 48 | `mcp__thesis__thesis_request_compute_grant_approval` | Request compute grant approval from budget holder |
| 49 | `mcp__thesis__thesis_list_compute_grants` | List compute grants available to you |
| 50 | `mcp__thesis__thesis_compute_list_options` | List available compute options (GPU types, regions) |
| 51 | `mcp__thesis__thesis_compute_acquire` | Acquire a compute instance |
| 52 | `mcp__thesis__thesis_compute_status` | Check status of an acquired compute instance |
| 53 | `mcp__thesis__thesis_compute_connection` | Get connection details for an acquired instance |
| 54 | `mcp__thesis__thesis_compute_release` | Release a single compute instance |
| 55 | `mcp__thesis__thesis_compute_release_all` | Release all acquired compute instances |

### Campaign Budgets (Organizer Flows)

| # | Tool | Purpose |
|---|------|---------|
| 56 | `mcp__thesis__thesis_list_campaign_budgets` | List campaign budgets |
| 57 | `mcp__thesis__thesis_create_campaign_budget` | Create a new campaign budget |
| 58 | `mcp__thesis__thesis_update_campaign_budget` | Update an existing campaign budget |
| 59 | `mcp__thesis__thesis_revoke_campaign_budget` | Revoke a campaign budget |

### Updates and Notifications

| # | Tool | Purpose |
|---|------|---------|
| 60 | `mcp__thesis__thesis_updates_list` | List pending updates/notifications |
| 61 | `mcp__thesis__thesis_updates_hide` | Hide a specific update |
| 62 | `mcp__thesis__thesis_updates_hide_all_active` | Hide all active updates |
| 63 | `mcp__thesis__thesis_updates_unhide` | Unhide a previously hidden update |

### Migration Helpers

Some installations may expose migration-only helper tools with hashed names.
Treat these as specialized one-off tools, not part of day-to-day research
workflows.

## Practical Tool Sequences

### Insight Node Flow

Use this flow to capture rationale, synthesis, design decisions, or any
non-empirical knowledge.

1. `mcp__thesis__thesis_stage_node_create` -- create a staged insight node
2. `mcp__thesis__thesis_stage_node_update` -- populate content, summary, insights
3. `mcp__thesis__thesis_commit_node` -- commit with `kind: insight`, non-empty `insights`

### Empirical Node With Artifacts

Use this flow when the node represents work that produces evidence: experiment
runs, benchmark results, data analysis.

1. `mcp__thesis__thesis_stage_node_create` -- create a staged empirical node
2. `mcp__thesis__thesis_stage_node_update` -- populate hypothesis, content, summary
3. Run experiment or compute steps
4. `mcp__thesis__thesis_prepare_artifact_uploads` -- get signed URLs
5. Upload artifact bytes to signed URLs (HTTP PUT)
6. `mcp__thesis__thesis_finalize_artifact_uploads` -- finalize the batch
7. `mcp__thesis__thesis_commit_node` -- commit with `kind: empirical`, non-empty `hypothesis`, artifacts present
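
Step 5 of the flow above is a plain HTTP PUT of the raw bytes. The sketch below uses only the Python standard library. The function names are hypothetical, and the `Content-Type` header is an assumption: a signed URL may require specific headers, so follow whatever the prepare call actually returns.

```python
import urllib.request


def build_upload_request(signed_url: str, data: bytes,
                         content_type: str = "application/octet-stream"):
    """Build (but do not send) a PUT request for one artifact."""
    return urllib.request.Request(
        signed_url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},  # assumed, see lead-in
    )


def upload_artifact(signed_url: str, data: bytes) -> int:
    """Send the bytes and return the HTTP status code."""
    request = build_upload_request(signed_url, data)
    with urllib.request.urlopen(request) as response:
        return response.status
```

Building the request separately from sending it keeps the upload step easy to inspect before any bytes leave the machine.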

### Managed Compute Flow

Use this flow to acquire GPU or other managed compute for empirical work.

1. `mcp__thesis__thesis_approval_session_heartbeat` -- start/maintain approval session
2. `mcp__thesis__thesis_request_compute_grant_approval` -- request budget approval
3. `mcp__thesis__thesis_list_compute_grants` -- (if needed) verify grant exists
4. `mcp__thesis__thesis_compute_acquire` -- acquire the instance
5. `mcp__thesis__thesis_compute_status` -- poll until ready
6. `mcp__thesis__thesis_compute_connection` -- get SSH/connection details
7. `mcp__thesis__thesis_compute_release` -- release when done (or `..._release_all`)
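
The acquire, poll, and release steps can be wrapped so that release always runs, even when the work fails. In this sketch the four callables are hypothetical stand-ins for the corresponding tool calls, and the `"ready"` status string is an assumption about what the status tool reports.

```python
import time
from typing import Callable


def run_on_compute(acquire: Callable[[], str],
                   status: Callable[[str], str],
                   release: Callable[[str], None],
                   work: Callable[[str], object],
                   max_polls: int = 30,
                   poll_seconds: float = 5.0):
    """Acquire an instance, wait until ready, run work, always release."""
    instance_id = acquire()
    try:
        for _ in range(max_polls):
            if status(instance_id) == "ready":  # assumed status value
                return work(instance_id)
            time.sleep(poll_seconds)
        raise TimeoutError(f"{instance_id} not ready after {max_polls} polls")
    finally:
        # Runs on success, failure, and timeout alike.
        release(instance_id)
```

The bounded poll count keeps a stuck instance from consuming budget indefinitely.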

### Share a Graph With Collaborators

1. `mcp__thesis__thesis_get_node_sharing` -- check current sharing state
2. `mcp__thesis__thesis_set_sharing_for_node` -- (or `..._for_nodes`) set access
3. `mcp__thesis__thesis_export_summary` or `..._export_subgraph` -- for handoff

## Safety Notes

- Prefer `get_contract` before implementing strict assumptions in automation.
- Avoid call-order assumptions not mandated by contract.
- Keep checks bounded: list/read first, then mutate only the intended nodes.
- Always read the node's current revision and pass it as `expected_revision` when mutating, to avoid lost updates.
- Never skip the two-step artifact upload flow (prepare then finalize).
- Release compute instances when work is complete to avoid unnecessary charges.
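
To make the `expected_revision` note concrete, here is a minimal in-memory sketch of optimistic locking. `FakeNodeStore` and `StaleRevision` are hypothetical stand-ins written for this example; they do not describe the real Thesis server, only the general read-then-write pattern.

```python
class StaleRevision(Exception):
    """Raised when a write carries an out-of-date revision."""


class FakeNodeStore:
    """Toy single-node store that enforces optimistic locking."""

    def __init__(self):
        self.revision = 1
        self.summary = ""

    def read(self):
        return self.revision, self.summary

    def update(self, expected_revision: int, summary: str) -> None:
        if expected_revision != self.revision:
            raise StaleRevision(
                f"expected {expected_revision}, current is {self.revision}"
            )
        self.summary = summary
        self.revision += 1


def update_with_retry(store: FakeNodeStore, summary: str, attempts: int = 3) -> int:
    """Re-read the revision before each attempt; return the new revision."""
    for _attempt in range(attempts):
        revision, _current = store.read()
        try:
            store.update(revision, summary)
            return store.revision
        except StaleRevision:
            continue  # someone else wrote first; read again
    raise StaleRevision("gave up after repeated conflicts")
```

A write that reuses a revision read earlier is rejected instead of silently overwriting the other writer's change, which is the lost update the note warns about.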