pi-autoresearch-vkf 0.5.1 → 0.5.2
- package/CHANGELOG.md +9 -0
- package/README.md +12 -12
- package/extensions/pi-autoresearch-vkf/index.ts +2 -2
- package/package.json +1 -1
- package/skills/autoresearch-vkf/SKILL.md +8 -8
- package/skills/{claim-extract → autoresearch-vkf-claim-extract}/SKILL.md +4 -4
- package/skills/{claim-verify → autoresearch-vkf-claim-verify}/SKILL.md +3 -3
- package/skills/{contradiction-miner → autoresearch-vkf-contradiction-miner}/SKILL.md +3 -3
- package/skills/{cross-domain-transfer → autoresearch-vkf-cross-domain-transfer}/SKILL.md +3 -3
- package/skills/{hypothesis-loop → autoresearch-vkf-hypothesis-loop}/SKILL.md +2 -2
- package/skills/{idea-tournament → autoresearch-vkf-idea-tournament}/SKILL.md +3 -3
- package/skills/{knowledge-gather → autoresearch-vkf-knowledge-gather}/SKILL.md +2 -2
- package/skills/{research-report → autoresearch-vkf-research-report}/SKILL.md +1 -1
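The eight renames listed above all follow one rule: the old skill directory name gains the `autoresearch-vkf-` prefix, while the `autoresearch-vkf` orchestrator keeps its name. A minimal TypeScript sketch of that rule, which may help when updating references in downstream prompts or scripts. The names `SKILL_PREFIX`, `RENAMED_SKILLS`, and `prefixedSkillName` are hypothetical and are not part of the package:

```typescript
// Hypothetical helper, NOT part of pi-autoresearch-vkf: maps a 0.5.1 skill
// name to its 0.5.2 name using the prefix shown in the file list.
const SKILL_PREFIX = "autoresearch-vkf-";

// The eight skills renamed in 0.5.2 (taken from the file list of this diff).
const RENAMED_SKILLS: readonly string[] = [
  "claim-extract",
  "claim-verify",
  "contradiction-miner",
  "cross-domain-transfer",
  "hypothesis-loop",
  "idea-tournament",
  "knowledge-gather",
  "research-report",
];

function prefixedSkillName(name: string): string {
  // Names outside the renamed set (including the "autoresearch-vkf"
  // orchestrator and already-prefixed names) are returned unchanged.
  if (RENAMED_SKILLS.indexOf(name) === -1) {
    return name;
  }
  return SKILL_PREFIX + name;
}
```

Applying the helper to an already-prefixed name is a no-op, so it can be run repeatedly over the same text without stacking prefixes.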
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,14 @@
 # Changelog
 
+## 0.5.2
+
+Prefixed all skill names with `autoresearch-vkf-` to avoid namespace conflicts
+with other tooling. Renamed `knowledge-gather`, `claim-extract`, `claim-verify`,
+`contradiction-miner`, `cross-domain-transfer`, `idea-tournament`,
+`hypothesis-loop`, and `research-report`; all cross-references in the skills,
+the README, and the extension were updated accordingly. No behavior change.
+
+
 ## 0.5.1
 
 Fix tool/skill name collisions with pi-autoresearch (both can now load together).
package/README.md
CHANGED
@@ -51,7 +51,7 @@ pi install file:/path/to/pi-autoresearch-vkf
 ### Knowledge sources (how ingestion works)
 
 The extension stores and reasons over knowledge; it does **not** fetch papers
-itself. Gathering is done by the host agent through the `knowledge-gather` skill,
+itself. Gathering is done by the host agent through the `autoresearch-vkf-knowledge-gather` skill,
 using the agent's built-in **`WebSearch` + `WebFetch`** against free, openly
 accessible databases — no API keys, no paid services, no MCP setup:
 
@@ -86,12 +86,12 @@ goal ─► recall_memory ─► gather literature ─► remember_claim (candid
 │ │
 │ verify_claim ──► trusted claims
 ▼ │
-hypothesis-loop: recall ─► pick idea ─► vkf_run_experiment ─► vkf_log_experiment
+autoresearch-vkf-hypothesis-loop: recall ─► pick idea ─► vkf_run_experiment ─► vkf_log_experiment
 │ │
 │ writes experiment card back to memory,
 │ updates the claim's belief & lifecycle
 ▼
-research-report (paper → claim → hypothesis → patch → metric Δ → memory update)
+autoresearch-vkf-research-report (paper → claim → hypothesis → patch → metric Δ → memory update)
 ```
 
 ### One self-contained workspace
@@ -146,14 +146,14 @@ verifier — is the defense against **memory poisoning**.
 | Skill | Role |
 |-------|------|
 | `autoresearch-vkf` | Orchestrator / spine — the entry point. |
-| `knowledge-gather` | Find candidate techniques via WebSearch/WebFetch (arXiv / Semantic Scholar / OpenAlex / GitHub). |
-| `claim-extract` | Distill sources into reusable claim cards. |
-| `claim-verify` | Check citations & codebase fit — the trust layer. |
-| `contradiction-miner` | Turn tensions in memory into novel hypotheses. |
-| `cross-domain-transfer` | Import a mechanism from another field. |
-| `idea-tournament` | Multi-perspective debate to pick the 2–3 ideas worth testing. |
-| `hypothesis-loop` | Pick the next idea and run the smallest falsifying experiment. |
-| `research-report` | The auditable lineage report. |
+| `autoresearch-vkf-knowledge-gather` | Find candidate techniques via WebSearch/WebFetch (arXiv / Semantic Scholar / OpenAlex / GitHub). |
+| `autoresearch-vkf-claim-extract` | Distill sources into reusable claim cards. |
+| `autoresearch-vkf-claim-verify` | Check citations & codebase fit — the trust layer. |
+| `autoresearch-vkf-contradiction-miner` | Turn tensions in memory into novel hypotheses. |
+| `autoresearch-vkf-cross-domain-transfer` | Import a mechanism from another field. |
+| `autoresearch-vkf-idea-tournament` | Multi-perspective debate to pick the 2–3 ideas worth testing. |
+| `autoresearch-vkf-hypothesis-loop` | Pick the next idea and run the smallest falsifying experiment. |
+| `autoresearch-vkf-research-report` | The auditable lineage report. |
 
 ### The `.autoresearch-vkf/` workspace
 
@@ -282,7 +282,7 @@ Verify what will ship first with `npm pack --dry-run`.
 
 All four planned phases are in: the lean MVP (Phase 1), the **novelty scorer**
 (Phase 2), the **hypothesis-synthesis layer** (Phase 3 — `find_contradictions`,
-`find_transfers`, `idea-tournament`), and **global cross-project memory + the
+`find_transfers`, `autoresearch-vkf-idea-tournament`), and **global cross-project memory + the
 benchmark** (Phase 4).
 
 Possible next steps:
package/extensions/pi-autoresearch-vkf/index.ts
CHANGED
@@ -144,7 +144,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 `Memory bundle: ${memoryPaths(root).dir} ${fresh ? "(new)" : "(existing)"} — profile ${config.memoryProfile}.`,
 `Optimizing ${config.metricName} (${config.direction} is better).`,
 "",
-"Next: gather literature (knowledge-gather skill) → remember_claim candidates → verify_claim → recall_memory to pick an idea → vkf_run_experiment → vkf_log_experiment.",
+"Next: gather literature (autoresearch-vkf-knowledge-gather skill) → remember_claim candidates → verify_claim → recall_memory to pick an idea → vkf_run_experiment → vkf_log_experiment.",
 ].join("\n"),
 { created: true },
 );
@@ -491,7 +491,7 @@ export default function autoresearchExtension(pi: ExtensionAPI): void {
 
 if (ideas.length === 0) {
 return textResult(
-"No untested ideas to score. Gather literature (knowledge-gather) and remember_claim some candidates first.",
+"No untested ideas to score. Gather literature (autoresearch-vkf-knowledge-gather) and remember_claim some candidates first.",
 { ranked: 0 },
 );
 }
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
 "name": "pi-autoresearch-vkf",
-"version": "0.5.1",
+"version": "0.5.2",
 "type": "module",
 "description": "Autoresearch with verifiable long-term scientific memory. A pi extension that gathers literature, stores it as VKF claims, runs experiments, and writes verified results back to a git-native knowledge bundle so future runs build on what was learned instead of rediscovering it.",
 "keywords": [
package/skills/autoresearch-vkf/SKILL.md
CHANGED
@@ -51,23 +51,23 @@ transaction record — promotion is an explicit, audited step.
 gathering anything. If prior runs already learned something, build on it and
 skip rediscovery.
 
-4. **Gather literature** → use the **knowledge-gather** skill to find candidate
+4. **Gather literature** → use the **autoresearch-vkf-knowledge-gather** skill to find candidate
 techniques (via `WebSearch`/`WebFetch` against free databases — arXiv, Semantic
-Scholar, OpenAlex), then **claim-extract** to turn them into structured claims
-via `remember_claim`. Then **claim-verify** to check citations and codebase fit.
+Scholar, OpenAlex), then **autoresearch-vkf-claim-extract** to turn them into structured claims
+via `remember_claim`. Then **autoresearch-vkf-claim-verify** to check citations and codebase fit.
 
 4b. **Synthesize new ideas** (optional but high-value) → mine memory for novelty
-instead of only retrieving it: **contradiction-miner** (tensions →
-hypotheses), **cross-domain-transfer** (import a mechanism from another field).
-When many ideas compete for budget, run the **idea-tournament** skill to pick
+instead of only retrieving it: **autoresearch-vkf-contradiction-miner** (tensions →
+hypotheses), **autoresearch-vkf-cross-domain-transfer** (import a mechanism from another field).
+When many ideas compete for budget, run the **autoresearch-vkf-idea-tournament** skill to pick
 the 2–3 worth testing.
 
-5. **Loop** → use the **hypothesis-loop** skill: `recall_memory` → pick the
+5. **Loop** → use the **autoresearch-vkf-hypothesis-loop** skill: `recall_memory` → pick the
 highest-value, sufficiently-novel idea → implement the smallest falsifying
 change → `vkf_run_experiment` → `vkf_log_experiment` → repeat. Keep wins, revert
 regressions; either way the result is now in memory.
 
-6. **Report** → use the **research-report** skill to produce the lineage report
+6. **Report** → use the **autoresearch-vkf-research-report** skill to produce the lineage report
 (paper → claim → hypothesis → patch → metric Δ → status → memory update).
 
 Keep `.autoresearch-vkf/session/prompt.md` current so a fresh agent can continue. The loop is
package/skills/{claim-extract → autoresearch-vkf-claim-extract}/SKILL.md
CHANGED
@@ -1,6 +1,6 @@
 ---
-name: claim-extract
-description: Convert gathered literature into structured, reusable VKF claim cards (research atoms). Use after knowledge-gather to stage candidate claims in memory with remember_claim. Turns noisy papers into small, checkable, reusable assertions.
+name: autoresearch-vkf-claim-extract
+description: Convert gathered literature into structured, reusable VKF claim cards (research atoms). Use after autoresearch-vkf-knowledge-gather to stage candidate claims in memory with remember_claim. Turns noisy papers into small, checkable, reusable assertions.
 ---
 
 # Extract claims from literature
@@ -19,7 +19,7 @@ Call `remember_claim` with:
 "Replacing static gradient clipping with EMA-based adaptive clipping lowers
 early-training validation loss for small transformers."
 - **mechanism** — *why* it should work. This is the most valuable field: it's
-what later lets the hypothesis-loop transfer the idea across domains.
+what later lets the autoresearch-vkf-hypothesis-loop transfer the idea across domains.
 - **context** — where it applies (architecture, scale, dataset regime).
 - **implementation_recipe** — concretely how to apply it in this codebase.
 - **failure_modes** — known/suspected ways it breaks or interacts badly.
@@ -41,4 +41,4 @@ Call `remember_claim` with:
 theoretical, or anecdotal in the confidence/reliability you assign.
 
 Everything you stage here is a **candidate** (status `draft`) with a transaction
-record — nothing is trusted yet. Hand off to **claim-verify**.
+record — nothing is trusted yet. Hand off to **autoresearch-vkf-claim-verify**.
package/skills/{claim-verify → autoresearch-vkf-claim-verify}/SKILL.md
CHANGED
@@ -1,6 +1,6 @@
 ---
-name: claim-verify
-description: Verify staged candidate claims before the loop builds on them — check that the cited source really says it, classify the evidence, and confirm codebase relevance. Use after claim-extract to promote or downgrade claims with verify_claim. This is the trust layer that prevents memory poisoning.
+name: autoresearch-vkf-claim-verify
+description: Verify staged candidate claims before the loop builds on them — check that the cited source really says it, classify the evidence, and confirm codebase relevance. Use after autoresearch-vkf-claim-extract to promote or downgrade claims with verify_claim. This is the trust layer that prevents memory poisoning.
 ---
 
 # Verify claims
@@ -47,4 +47,4 @@ bundle stays governed.
 - A claim's truth in a paper ≠ its usefulness for our goal. Keep those separate.
 
 When the trusted set is healthy, hand back to **autoresearch-vkf** for the
-**hypothesis-loop**.
+**autoresearch-vkf-hypothesis-loop**.
package/skills/{contradiction-miner → autoresearch-vkf-contradiction-miner}/SKILL.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-name: contradiction-miner
+name: autoresearch-vkf-contradiction-miner
 description: Generate novel hypotheses from tensions already in memory — conflicting claims, ideas that won in one place and lost in another, and different mechanisms aimed at the same goal. Use when the loop needs fresh, non-obvious ideas rather than more literature.
 ---
 
@@ -44,8 +44,8 @@ Record it with `remember_claim`, setting:
 - a `mechanism` (required — a hypothesis with no mechanism is just noise),
 - an honest `confidence` (these are speculative; start low–medium).
 
-It enters memory as a **candidate** like any other idea — then `claim-verify` and
-the `hypothesis-loop` (via `score_ideas`) decide whether it's worth testing.
+It enters memory as a **candidate** like any other idea — then `autoresearch-vkf-claim-verify` and
+the `autoresearch-vkf-hypothesis-loop` (via `score_ideas`) decide whether it's worth testing.
 
 ## Discipline
 
package/skills/{cross-domain-transfer → autoresearch-vkf-cross-domain-transfer}/SKILL.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-name: cross-domain-transfer
+name: autoresearch-vkf-cross-domain-transfer
 description: Generate novel ideas by importing a mechanism from another field into the current problem. Use when you want surprising analogies that keyword search misses — search by mechanism, not keywords.
 ---
 
@@ -47,11 +47,11 @@ Record the best candidate with `remember_claim`:
 - `failure_modes` — note where the analogy might break (the assumptions the source
 domain has that yours doesn't).
 
-Then let `claim-verify` and `score_ideas` decide if it earns an experiment.
+Then let `autoresearch-vkf-claim-verify` and `score_ideas` decide if it earns an experiment.
 
 ## Discipline
 
 - **Require a mechanistic reason for transfer**, not just surface similarity. "Both
 use matrices" is not a transfer.
 - If you gathered claims only from your own domain, there's nothing to transfer
-*from* — use `knowledge-gather` to pull in adjacent fields first.
+*from* — use `autoresearch-vkf-knowledge-gather` to pull in adjacent fields first.
package/skills/{hypothesis-loop → autoresearch-vkf-hypothesis-loop}/SKILL.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-name: hypothesis-loop
+name: autoresearch-vkf-hypothesis-loop
 description: The core experiment loop — recall memory, pick the highest-value sufficiently-novel idea, run the smallest falsifying experiment, and write the result back to memory. Use to drive iterations of an autoresearch loop after claims have been gathered and verified.
 ---
 
@@ -54,4 +54,4 @@ ideas deliberately and you never repeat settled work.
 - **One variable at a time** so the result attributes cleanly to the hypothesis.
 
 When you've made meaningful progress or exhausted promising ideas, hand to
-**research-report**.
+**autoresearch-vkf-research-report**.
package/skills/{idea-tournament → autoresearch-vkf-idea-tournament}/SKILL.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-name: idea-tournament
+name: autoresearch-vkf-idea-tournament
 description: Run a structured multi-perspective tournament over candidate ideas to pick the 2-3 worth testing. Use when there are many candidate hypotheses competing for limited experiment budget.
 ---
 
@@ -7,7 +7,7 @@ description: Run a structured multi-perspective tournament over candidate ideas
 
 When many ideas compete for a limited experiment budget, don't just take the top
 of one ranking. Run a tournament: judge each idea from several perspectives, then
-advance only the best 2–3 to the `hypothesis-loop`.
+advance only the best 2–3 to the `autoresearch-vkf-hypothesis-loop`.
 
 ## Assemble the field
 
@@ -45,7 +45,7 @@ the numbers miss — especially the Skeptic's failure-mode and gaming checks.
 `verify_claim`) so the tournament's reasoning is remembered and they aren't
 re-litigated next round.
 
-Hand the 2–3 winners to the **hypothesis-loop**.
+Hand the 2–3 winners to the **autoresearch-vkf-hypothesis-loop**.
 
 ## Discipline
 
package/skills/{knowledge-gather → autoresearch-vkf-knowledge-gather}/SKILL.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-name: knowledge-gather
+name: autoresearch-vkf-knowledge-gather
 description: Gather frontier knowledge relevant to a research goal — search papers, repos, docs, and benchmarks for candidate techniques. Use as the discovery step of an autoresearch loop, before extracting claims. Collects candidate knowledge; it does not invent ideas or run experiments.
 ---
 
@@ -63,5 +63,5 @@ For each candidate, capture enough to become a claim later:
 - **Look for contradictions and gaps** between sources — they're the richest
 seeds for novel hypotheses later.
 
-Hand the collected candidates to **claim-extract**, which writes them into memory
+Hand the collected candidates to **autoresearch-vkf-claim-extract**, which writes them into memory
 with `remember_claim`.
package/skills/{research-report → autoresearch-vkf-research-report}/SKILL.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-name: research-report
+name: autoresearch-vkf-research-report
 description: Produce the autoresearch report with full idea lineage — paper → claim → hypothesis → patch → metric change → status → memory update. Use to summarize an autoresearch run into an auditable, human-readable report at .autoresearch-vkf/session/report.md.
 ---
 