pi-autoresearch-vkf 0.5.0 → 0.5.2
- package/CHANGELOG.md +21 -1
- package/README.md +16 -16
- package/extensions/pi-autoresearch-vkf/index.ts +8 -8
- package/package.json +1 -1
- package/skills/{autoresearch-create → autoresearch-vkf}/SKILL.md +12 -12
- package/skills/{claim-extract → autoresearch-vkf-claim-extract}/SKILL.md +4 -4
- package/skills/{claim-verify → autoresearch-vkf-claim-verify}/SKILL.md +5 -5
- package/skills/{contradiction-miner → autoresearch-vkf-contradiction-miner}/SKILL.md +3 -3
- package/skills/{cross-domain-transfer → autoresearch-vkf-cross-domain-transfer}/SKILL.md +3 -3
- package/skills/{hypothesis-loop → autoresearch-vkf-hypothesis-loop}/SKILL.md +4 -4
- package/skills/{idea-tournament → autoresearch-vkf-idea-tournament}/SKILL.md +3 -3
- package/skills/{knowledge-gather → autoresearch-vkf-knowledge-gather}/SKILL.md +2 -2
- package/skills/{research-report → autoresearch-vkf-research-report}/SKILL.md +1 -1
package/CHANGELOG.md
CHANGED
@@ -1,6 +1,26 @@
 # Changelog
 
-## 0.5.
+## 0.5.2
+
+Prefixed all skill names with `autoresearch-vkf-` to avoid namespace conflicts
+with other tooling. Renamed `knowledge-gather`, `claim-extract`, `claim-verify`,
+`contradiction-miner`, `cross-domain-transfer`, `idea-tournament`,
+`hypothesis-loop`, and `research-report`; all cross-references in the skills,
+the README, and the extension were updated accordingly. No behavior change.
+
+
+## 0.5.1
+
+Fix tool/skill name collisions with pi-autoresearch (both can now load together).
+
+- Tools `run_experiment` → **`vkf_run_experiment`** and `log_experiment` →
+**`vkf_log_experiment`** (pi-autoresearch registers the bare names; pi requires
+globally-unique tool names across loaded extensions).
+- Skill `autoresearch-create` → **`autoresearch-vkf`** (pi-autoresearch ships a
+skill of the same name). Invoke the loop via the `autoresearch-vkf` skill now.
+- Docs/skills/benchmark updated accordingly. No behavior change.
+
+## 0.5.0 — first published release
 
 Self-contained workspace (breaking path change).
 
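The renames recorded in the 0.5.1 and 0.5.2 entries above can be summarized as a lookup from old names to new ones. The following is a minimal TypeScript sketch, not code from the package: `PREFIXED_SKILLS`, `RENAMES`, and `currentName` are hypothetical names introduced here for illustration, and only the old and new tool and skill names come from the changelog.

```typescript
// Hypothetical migration helper; not part of pi-autoresearch-vkf.
// Only the old/new names below are taken from the changelog.

// Sub-skills that gained the `autoresearch-vkf-` prefix in 0.5.2.
const PREFIXED_SKILLS: readonly string[] = [
  "knowledge-gather",
  "claim-extract",
  "claim-verify",
  "contradiction-miner",
  "cross-domain-transfer",
  "idea-tournament",
  "hypothesis-loop",
  "research-report",
];

// Old name -> current name, starting with the 0.5.1 tool and entry-skill renames.
const RENAMES = new Map<string, string>([
  ["run_experiment", "vkf_run_experiment"],
  ["log_experiment", "vkf_log_experiment"],
  ["autoresearch-create", "autoresearch-vkf"],
]);

// Add the 0.5.2 prefixed sub-skill names.
for (const skill of PREFIXED_SKILLS) {
  RENAMES.set(skill, `autoresearch-vkf-${skill}`);
}

// Returns the current name for an old tool or skill name,
// or the input unchanged when it was not renamed.
function currentName(oldName: string): string {
  return RENAMES.get(oldName) ?? oldName;
}
```

With this sketch, `currentName("claim-verify")` yields `autoresearch-vkf-claim-verify`, while a name the changelog does not mention, such as `recall_memory`, passes through unchanged.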
package/README.md
CHANGED
@@ -51,7 +51,7 @@ pi install file:/path/to/pi-autoresearch-vkf
 ### Knowledge sources (how ingestion works)
 
 The extension stores and reasons over knowledge; it does **not** fetch papers
-itself. Gathering is done by the host agent through the `knowledge-gather` skill,
+itself. Gathering is done by the host agent through the `autoresearch-vkf-knowledge-gather` skill,
 using the agent's built-in **`WebSearch` + `WebFetch`** against free, openly
 accessible databases — no API keys, no paid services, no MCP setup:
 
@@ -74,7 +74,7 @@ In a project you want to optimize:
 optimize the test suite runtime, using the research literature and remembering what works
 ```
 
-The **autoresearch-
+The **autoresearch-vkf** skill drives it: confirm goal/metric/command → init →
 gather literature → extract & verify claims → loop (recall → experiment →
 write-back) → report. All state lives in one self-contained `.autoresearch-vkf/`
 folder at the project root, so work **survives restarts and context resets**.
@@ -86,12 +86,12 @@ goal ─► recall_memory ─► gather literature ─► remember_claim (candid
 │ │
 │ verify_claim ──► trusted claims
 ▼ │
-hypothesis-loop: recall ─► pick idea ─►
+autoresearch-vkf-hypothesis-loop: recall ─► pick idea ─► vkf_run_experiment ─► vkf_log_experiment
 │ │
 │ writes experiment card back to memory,
 │ updates the claim's belief & lifecycle
 ▼
-research-report (paper → claim → hypothesis → patch → metric Δ → memory update)
+autoresearch-vkf-research-report (paper → claim → hypothesis → patch → metric Δ → memory update)
 ```
 
 ### One self-contained workspace
@@ -135,8 +135,8 @@ verifier — is the defense against **memory poisoning**.
 | `score_ideas` | Rank untested ideas by `EV × feasibility × evidence × novelty × info_gain ÷ cost`. |
 | `find_contradictions` | Mine memory for tensions between claims — each a seed for a novel hypothesis. |
 | `find_transfers` | Cross-domain mechanism search: same *how*, different *where*. |
-| `
-| `
+| `vkf_run_experiment` | Run the measurement command; capture `METRIC name=value`. |
+| `vkf_log_experiment` | Record a result, write it back to memory, update belief & lifecycle. |
 | `promote_to_global` | Copy a trusted card into the cross-project global memory. |
 | `export_dashboard` | Write browser dashboards: a live progress page + the `vkf html` idea-lineage graph. |
 | `research_status` | Show session experiments + memory lifecycle. |
@@ -145,15 +145,15 @@ verifier — is the defense against **memory poisoning**.
 
 | Skill | Role |
 |-------|------|
-| `autoresearch-
-| `knowledge-gather` | Find candidate techniques via WebSearch/WebFetch (arXiv / Semantic Scholar / OpenAlex / GitHub). |
-| `claim-extract` | Distill sources into reusable claim cards. |
-| `claim-verify` | Check citations & codebase fit — the trust layer. |
-| `contradiction-miner` | Turn tensions in memory into novel hypotheses. |
-| `cross-domain-transfer` | Import a mechanism from another field. |
-| `idea-tournament` | Multi-perspective debate to pick the 2–3 ideas worth testing. |
-| `hypothesis-loop` | Pick the next idea and run the smallest falsifying experiment. |
-| `research-report` | The auditable lineage report. |
+| `autoresearch-vkf` | Orchestrator / spine — the entry point. |
+| `autoresearch-vkf-knowledge-gather` | Find candidate techniques via WebSearch/WebFetch (arXiv / Semantic Scholar / OpenAlex / GitHub). |
+| `autoresearch-vkf-claim-extract` | Distill sources into reusable claim cards. |
+| `autoresearch-vkf-claim-verify` | Check citations & codebase fit — the trust layer. |
+| `autoresearch-vkf-contradiction-miner` | Turn tensions in memory into novel hypotheses. |
+| `autoresearch-vkf-cross-domain-transfer` | Import a mechanism from another field. |
+| `autoresearch-vkf-idea-tournament` | Multi-perspective debate to pick the 2–3 ideas worth testing. |
+| `autoresearch-vkf-hypothesis-loop` | Pick the next idea and run the smallest falsifying experiment. |
+| `autoresearch-vkf-research-report` | The auditable lineage report. |
 
 ### The `.autoresearch-vkf/` workspace
 
@@ -282,7 +282,7 @@ Verify what will ship first with `npm pack --dry-run`.
 
 All four planned phases are in: the lean MVP (Phase 1), the **novelty scorer**
 (Phase 2), the **hypothesis-synthesis layer** (Phase 3 — `find_contradictions`,
-`find_transfers`, `idea-tournament`), and **global cross-project memory + the
+`find_transfers`, `autoresearch-vkf-idea-tournament`), and **global cross-project memory + the
 benchmark** (Phase 4).
 
 Possible next steps:
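The README's tool table describes `score_ideas` as ranking untested ideas by `EV × feasibility × evidence × novelty × info_gain ÷ cost`. A minimal sketch of that formula follows, assuming each factor is already available as a number. The `IdeaFactors` shape, the `priority` and `rankIdeas` names, and the guard against non-positive cost are illustrative assumptions; the diff does not show how the extension actually computes or scales these factors.

```typescript
// Illustrative only: field names and scales are assumptions, not the extension's schema.
interface IdeaFactors {
  id: string;
  ev: number; // "EV" in the README formula
  feasibility: number;
  evidence: number;
  novelty: number;
  infoGain: number; // "info_gain" in the README formula
  cost: number; // assumed to be strictly positive
}

// priority = EV × feasibility × evidence × novelty × info_gain ÷ cost
function priority(f: IdeaFactors): number {
  if (!(f.cost > 0)) {
    throw new RangeError(`cost must be positive for idea ${f.id}`);
  }
  return (f.ev * f.feasibility * f.evidence * f.novelty * f.infoGain) / f.cost;
}

// Returns a new array sorted from highest to lowest priority; the input is not mutated.
function rankIdeas(ideas: IdeaFactors[]): IdeaFactors[] {
  return [...ideas].sort((a, b) => priority(b) - priority(a));
}
```

Because the factors multiply, a zero in any one of them drives the whole priority to zero in this sketch, which is one reason a purely multiplicative score tends to be paired with floor values in practice. Whether the extension does so is not visible in this diff.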
package/extensions/pi-autoresearch-vkf/index.ts
@@ -112,7 +112,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 if (existing) {
 refreshWidget(ctx, root);
 return textResult(
-`A research session already exists: "${existing.name}".\nSession: ${sp.dir}\nMemory: ${memoryPaths(root).dir}\nContinue the loop with recall_memory →
+`A research session already exists: "${existing.name}".\nSession: ${sp.dir}\nMemory: ${memoryPaths(root).dir}\nContinue the loop with recall_memory → vkf_run_experiment → vkf_log_experiment.`,
 { created: false },
 );
 }
@@ -144,7 +144,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 `Memory bundle: ${memoryPaths(root).dir} ${fresh ? "(new)" : "(existing)"} — profile ${config.memoryProfile}.`,
 `Optimizing ${config.metricName} (${config.direction} is better).`,
 "",
-"Next: gather literature (knowledge-gather skill) → remember_claim candidates → verify_claim → recall_memory to pick an idea →
+"Next: gather literature (autoresearch-vkf-knowledge-gather skill) → remember_claim candidates → verify_claim → recall_memory to pick an idea → vkf_run_experiment → vkf_log_experiment.",
 ].join("\n"),
 { created: true },
 );
@@ -491,7 +491,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 
 if (ideas.length === 0) {
 return textResult(
-"No untested ideas to score. Gather literature (knowledge-gather) and remember_claim some candidates first.",
+"No untested ideas to score. Gather literature (autoresearch-vkf-knowledge-gather) and remember_claim some candidates first.",
 { ranked: 0 },
 );
 }
@@ -620,7 +620,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 },
 });
 
-// ──
+// ── vkf_run_experiment ───────────────────────────────────────────────────────
 const RunParams = Type.Object({
 command: Type.Optional(Type.String({ description: "Command to run (via `bash -lc`). Defaults to the session's configured command." })),
 claim_id: Type.Optional(Type.String({ description: "The claim/idea this run is testing, for logging." })),
@@ -629,10 +629,10 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 });
 
 pi.registerTool({
-name: "
+name: "vkf_run_experiment",
 label: "Run experiment",
 description:
-"Run the measurement command and capture its output and any `METRIC name=number` lines. Does not judge or record an outcome — read the metric, then record it with
+"Run the measurement command and capture its output and any `METRIC name=number` lines. Does not judge or record an outcome — read the metric, then record it with vkf_log_experiment.",
 parameters: RunParams,
 async execute(_id, params: Static<typeof RunParams>, signal, _onUpdate, ctx): Promise<AgentToolResult<{ code: number; metrics: Record<string, number> }>> {
 const root = resolveRoot(ctx);
@@ -667,7 +667,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 },
 });
 
-// ──
+// ── vkf_log_experiment ───────────────────────────────────────────────────────
 const LogParams = Type.Object({
 description: Type.String({ description: "What was changed in this experiment, in words." }),
 value: Type.Number({ description: "The metric value obtained." }),
@@ -681,7 +681,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 });
 
 pi.registerTool({
-name: "
+name: "vkf_log_experiment",
 label: "Log experiment",
 description:
 "Record an experiment's result. Appends to the session log AND writes an experiment card back to the VKF memory (a win OR a loss is durable knowledge), updating the tested claim's belief and lifecycle. This write-back is what lets future runs avoid repeating work.",
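The `vkf_run_experiment` description above says the tool captures `METRIC name=number` lines from the command's output and returns them as `metrics: Record<string, number>`. One way such parsing could look is sketched below. The parsing code itself is not part of this diff, so the accepted name characters, the whitespace handling, and the last-value-wins behavior for repeated names are all assumptions, and `parseMetrics` is a hypothetical name.

```typescript
// Hypothetical parser; the real extension's parsing rules are not shown in this diff.
function parseMetrics(output: string): Record<string, number> {
  const metrics: Record<string, number> = {};
  for (const rawLine of output.split(/\r?\n/)) {
    // Assumed line shape: "METRIC <name>=<number>", ignoring surrounding whitespace.
    const match = /^METRIC\s+([A-Za-z_][\w.-]*)=(\S+)$/.exec(rawLine.trim());
    if (match === null) continue;
    const name = match[1];
    const raw = match[2];
    if (name === undefined || raw === undefined) continue;
    const value = Number(raw);
    // Skip values that do not parse to a finite number (for example "abc" or "NaN").
    if (Number.isFinite(value)) {
      metrics[name] = value; // a later line with the same name overwrites an earlier one
    }
  }
  return metrics;
}
```

A measurement script would then only need to print a line such as `METRIC runtime_s=12.5` among its normal output for the value to be picked up, which matches the SKILL guidance to read the `METRIC` line rather than eyeball logs.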
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "pi-autoresearch-vkf",
-"version": "0.5.
+"version": "0.5.2",
 "type": "module",
 "description": "Autoresearch with verifiable long-term scientific memory. A pi extension that gathers literature, stores it as VKF claims, runs experiments, and writes verified results back to a git-native knowledge bundle so future runs build on what was learned instead of rediscovering it.",
 "keywords": [
package/skills/{autoresearch-create → autoresearch-vkf}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: autoresearch-
+name: autoresearch-vkf
 description: Run an autoresearch loop with verifiable long-term memory. Use when asked to optimize/improve a measurable target (test speed, bundle size, model loss, build time, Lighthouse score, …) by drawing on the research literature and remembering what was learned across runs. Orchestrates init → gather literature → extract & verify claims → recall → experiment → write results back to VKF memory → report.
 ---
 
@@ -21,8 +21,8 @@ You are the spine. Delegate the specialized work to the sub-skills below.
 - `score_ideas` — rank untested ideas by priority (EV × feasibility × evidence × novelty × info_gain ÷ cost).
 - `find_contradictions` — mine memory for tensions that seed novel hypotheses.
 - `find_transfers` — cross-domain mechanism search for surprising analogies.
-- `
-- `
+- `vkf_run_experiment` — run the measurement command, capture `METRIC name=value`.
+- `vkf_log_experiment` — record a result and write it back to memory (updates belief & lifecycle).
 - `research_status` — show session + memory state.
 
 ## The two layers
@@ -51,23 +51,23 @@ transaction record — promotion is an explicit, audited step.
 gathering anything. If prior runs already learned something, build on it and
 skip rediscovery.
 
-4. **Gather literature** → use the **knowledge-gather** skill to find candidate
+4. **Gather literature** → use the **autoresearch-vkf-knowledge-gather** skill to find candidate
 techniques (via `WebSearch`/`WebFetch` against free databases — arXiv, Semantic
-Scholar, OpenAlex), then **claim-extract** to turn them into structured claims
-via `remember_claim`. Then **claim-verify** to check citations and codebase fit.
+Scholar, OpenAlex), then **autoresearch-vkf-claim-extract** to turn them into structured claims
+via `remember_claim`. Then **autoresearch-vkf-claim-verify** to check citations and codebase fit.
 
 4b. **Synthesize new ideas** (optional but high-value) → mine memory for novelty
-instead of only retrieving it: **contradiction-miner** (tensions →
-hypotheses), **cross-domain-transfer** (import a mechanism from another field).
-When many ideas compete for budget, run the **idea-tournament** skill to pick
+instead of only retrieving it: **autoresearch-vkf-contradiction-miner** (tensions →
+hypotheses), **autoresearch-vkf-cross-domain-transfer** (import a mechanism from another field).
+When many ideas compete for budget, run the **autoresearch-vkf-idea-tournament** skill to pick
 the 2–3 worth testing.
 
-5. **Loop** → use the **hypothesis-loop** skill: `recall_memory` → pick the
+5. **Loop** → use the **autoresearch-vkf-hypothesis-loop** skill: `recall_memory` → pick the
 highest-value, sufficiently-novel idea → implement the smallest falsifying
-change → `
+change → `vkf_run_experiment` → `vkf_log_experiment` → repeat. Keep wins, revert
 regressions; either way the result is now in memory.
 
-6. **Report** → use the **research-report** skill to produce the lineage report
+6. **Report** → use the **autoresearch-vkf-research-report** skill to produce the lineage report
 (paper → claim → hypothesis → patch → metric Δ → status → memory update).
 
 Keep `.autoresearch-vkf/session/prompt.md` current so a fresh agent can continue. The loop is
package/skills/{claim-extract → autoresearch-vkf-claim-extract}/SKILL.md
@@ -1,6 +1,6 @@
 ---
-name: claim-extract
-description: Convert gathered literature into structured, reusable VKF claim cards (research atoms). Use after knowledge-gather to stage candidate claims in memory with remember_claim. Turns noisy papers into small, checkable, reusable assertions.
+name: autoresearch-vkf-claim-extract
+description: Convert gathered literature into structured, reusable VKF claim cards (research atoms). Use after autoresearch-vkf-knowledge-gather to stage candidate claims in memory with remember_claim. Turns noisy papers into small, checkable, reusable assertions.
 ---
 
 # Extract claims from literature
@@ -19,7 +19,7 @@ Call `remember_claim` with:
 "Replacing static gradient clipping with EMA-based adaptive clipping lowers
 early-training validation loss for small transformers."
 - **mechanism** — *why* it should work. This is the most valuable field: it's
-what later lets the hypothesis-loop transfer the idea across domains.
+what later lets the autoresearch-vkf-hypothesis-loop transfer the idea across domains.
 - **context** — where it applies (architecture, scale, dataset regime).
 - **implementation_recipe** — concretely how to apply it in this codebase.
 - **failure_modes** — known/suspected ways it breaks or interacts badly.
@@ -41,4 +41,4 @@ Call `remember_claim` with:
 theoretical, or anecdotal in the confidence/reliability you assign.
 
 Everything you stage here is a **candidate** (status `draft`) with a transaction
-record — nothing is trusted yet. Hand off to **claim-verify**.
+record — nothing is trusted yet. Hand off to **autoresearch-vkf-claim-verify**.
package/skills/{claim-verify → autoresearch-vkf-claim-verify}/SKILL.md
@@ -1,6 +1,6 @@
 ---
-name: claim-verify
-description: Verify staged candidate claims before the loop builds on them — check that the cited source really says it, classify the evidence, and confirm codebase relevance. Use after claim-extract to promote or downgrade claims with verify_claim. This is the trust layer that prevents memory poisoning.
+name: autoresearch-vkf-claim-verify
+description: Verify staged candidate claims before the loop builds on them — check that the cited source really says it, classify the evidence, and confirm codebase relevance. Use after autoresearch-vkf-claim-extract to promote or downgrade claims with verify_claim. This is the trust layer that prevents memory poisoning.
 ---
 
 # Verify claims
@@ -32,7 +32,7 @@ Check, in order:
 `conflicts_with` the other card's id.
 - `deprecated` — true but stale/superseded/not applicable here.
 - `rejected` — misread, hallucinated, or unsupported.
-- (`locally_tested` / `replicated` are normally set by `
+- (`locally_tested` / `replicated` are normally set by `vkf_log_experiment`, not here.)
 
 Always give a **reason** — it becomes part of the audit trail (a VKF
 transaction). After each call, the tool reports `vkf validate` so you can see the
@@ -46,5 +46,5 @@ bundle stays governed.
 the source.
 - A claim's truth in a paper ≠ its usefulness for our goal. Keep those separate.
 
-When the trusted set is healthy, hand back to **autoresearch-
-**hypothesis-loop**.
+When the trusted set is healthy, hand back to **autoresearch-vkf** for the
+**autoresearch-vkf-hypothesis-loop**.
package/skills/{contradiction-miner → autoresearch-vkf-contradiction-miner}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: contradiction-miner
+name: autoresearch-vkf-contradiction-miner
 description: Generate novel hypotheses from tensions already in memory — conflicting claims, ideas that won in one place and lost in another, and different mechanisms aimed at the same goal. Use when the loop needs fresh, non-obvious ideas rather than more literature.
 ---
 
@@ -44,8 +44,8 @@ Record it with `remember_claim`, setting:
 - a `mechanism` (required — a hypothesis with no mechanism is just noise),
 - an honest `confidence` (these are speculative; start low–medium).
 
-It enters memory as a **candidate** like any other idea — then `claim-verify` and
-the `hypothesis-loop` (via `score_ideas`) decide whether it's worth testing.
+It enters memory as a **candidate** like any other idea — then `autoresearch-vkf-claim-verify` and
+the `autoresearch-vkf-hypothesis-loop` (via `score_ideas`) decide whether it's worth testing.
 
 ## Discipline
 
package/skills/{cross-domain-transfer → autoresearch-vkf-cross-domain-transfer}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: cross-domain-transfer
+name: autoresearch-vkf-cross-domain-transfer
 description: Generate novel ideas by importing a mechanism from another field into the current problem. Use when you want surprising analogies that keyword search misses — search by mechanism, not keywords.
 ---
 
@@ -47,11 +47,11 @@ Record the best candidate with `remember_claim`:
 - `failure_modes` — note where the analogy might break (the assumptions the source
 domain has that yours doesn't).
 
-Then let `claim-verify` and `score_ideas` decide if it earns an experiment.
+Then let `autoresearch-vkf-claim-verify` and `score_ideas` decide if it earns an experiment.
 
 ## Discipline
 
 - **Require a mechanistic reason for transfer**, not just surface similarity. "Both
 use matrices" is not a transfer.
 - If you gathered claims only from your own domain, there's nothing to transfer
-*from* — use `knowledge-gather` to pull in adjacent fields first.
+*from* — use `autoresearch-vkf-knowledge-gather` to pull in adjacent fields first.
package/skills/{hypothesis-loop → autoresearch-vkf-hypothesis-loop}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: hypothesis-loop
+name: autoresearch-vkf-hypothesis-loop
 description: The core experiment loop — recall memory, pick the highest-value sufficiently-novel idea, run the smallest falsifying experiment, and write the result back to memory. Use to drive iterations of an autoresearch loop after claims have been gathered and verified.
 ---
 
@@ -33,9 +33,9 @@ ideas deliberately and you never repeat settled work.
 regress), *risk* (what could break), *novelty basis* (why it's not a repeat).
 
 4. **Run the smallest falsifying experiment.** Make the minimal change in scope,
-then `
+then `vkf_run_experiment`. Read the `METRIC` line — don't eyeball logs.
 
-5. **Judge honestly, then `
+5. **Judge honestly, then `vkf_log_experiment`.** Record the value, the tested
 `claim_id`, whether you `kept` it, conditions, and notes. The tool:
 - derives win/loss/inconclusive vs the baseline,
 - writes an **experiment card back to memory** (a loss is durable knowledge),
@@ -54,4 +54,4 @@ ideas deliberately and you never repeat settled work.
 - **One variable at a time** so the result attributes cleanly to the hypothesis.
 
 When you've made meaningful progress or exhausted promising ideas, hand to
-**research-report**.
+**autoresearch-vkf-research-report**.
package/skills/{idea-tournament → autoresearch-vkf-idea-tournament}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: idea-tournament
+name: autoresearch-vkf-idea-tournament
 description: Run a structured multi-perspective tournament over candidate ideas to pick the 2-3 worth testing. Use when there are many candidate hypotheses competing for limited experiment budget.
 ---
 
@@ -7,7 +7,7 @@ description: Run a structured multi-perspective tournament over candidate ideas
 
 When many ideas compete for a limited experiment budget, don't just take the top
 of one ranking. Run a tournament: judge each idea from several perspectives, then
-advance only the best 2–3 to the `hypothesis-loop`.
+advance only the best 2–3 to the `autoresearch-vkf-hypothesis-loop`.
 
 ## Assemble the field
 
@@ -45,7 +45,7 @@ the numbers miss — especially the Skeptic's failure-mode and gaming checks.
 `verify_claim`) so the tournament's reasoning is remembered and they aren't
 re-litigated next round.
 
-Hand the 2–3 winners to the **hypothesis-loop**.
+Hand the 2–3 winners to the **autoresearch-vkf-hypothesis-loop**.
 
 ## Discipline
 
package/skills/{knowledge-gather → autoresearch-vkf-knowledge-gather}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: knowledge-gather
+name: autoresearch-vkf-knowledge-gather
 description: Gather frontier knowledge relevant to a research goal — search papers, repos, docs, and benchmarks for candidate techniques. Use as the discovery step of an autoresearch loop, before extracting claims. Collects candidate knowledge; it does not invent ideas or run experiments.
 ---
 
@@ -63,5 +63,5 @@ For each candidate, capture enough to become a claim later:
 - **Look for contradictions and gaps** between sources — they're the richest
 seeds for novel hypotheses later.
 
-Hand the collected candidates to **claim-extract**, which writes them into memory
+Hand the collected candidates to **autoresearch-vkf-claim-extract**, which writes them into memory
 with `remember_claim`.
package/skills/{research-report → autoresearch-vkf-research-report}/SKILL.md
@@ -1,5 +1,5 @@
 ---
-name: research-report
+name: autoresearch-vkf-research-report
 description: Produce the autoresearch report with full idea lineage — paper → claim → hypothesis → patch → metric change → status → memory update. Use to summarize an autoresearch run into an auditable, human-readable report at .autoresearch-vkf/session/report.md.
 ---
 