@ara-commons/ara-skills 0.3.1 → 0.4.0
- package/README.md +3 -2
- package/package.json +3 -2
- package/skills/compiler/SKILL.md +26 -0
- package/skills/compiler/references/ara-schema.md +28 -0
- package/skills/research-visualizer/SKILL.md +172 -0
- package/skills/research-visualizer/references/binding.md +245 -0
- package/skills/research-visualizer/references/parsing.md +211 -0
- package/skills/research-visualizer/references/trajectory-template.html +804 -0
- package/src/index.js +1 -1
package/README.md
CHANGED
|
@@ -1,12 +1,13 @@
|
|
|
1
1
|
# @ara-commons/ara-skills
|
|
2
2
|
|
|
3
|
-
One-command installer for the
|
|
3
|
+
One-command installer for the four **Agent-Native Research Artifact (ARA)** skills:
|
|
4
4
|
|
|
5
5
|
| Skill | Invoke | What it does |
|
|
6
6
|
|-------|--------|--------------|
|
|
7
7
|
| `compiler` | `/compiler <input>` | Convert a paper, repo, or notes into a complete ARA artifact |
|
|
8
8
|
| `research-manager` | `/research-manager` | Post-session recorder that captures decisions, dead ends, and claims |
|
|
9
9
|
| `rigor-reviewer` | `/rigor-reviewer <dir>` | ARA Seal Level 2 semantic epistemic review across six dimensions |
|
|
10
|
+
| `research-visualizer` | `/research-visualizer <dir>` | Render an ARA into one interactive, self-contained trajectory.html |
|
|
10
11
|
|
|
11
12
|
## Quick start
|
|
12
13
|
|
|
@@ -64,4 +65,4 @@ In dev mode the CLI reads skills from the sibling `../../skills/` directory. On
|
|
|
64
65
|
|
|
65
66
|
## Upstream source of truth
|
|
66
67
|
|
|
67
|
-
The
|
|
68
|
+
The four skill directories live at the repo root under `skills/`. Edit them there — never edit the copy inside this package, which is created on demand by `prepack`.
|
package/package.json
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@ara-commons/ara-skills",
|
|
3
|
-
"version": "0.
|
|
4
|
-
"description": "Install Agent-Native Research Artifact (ARA) skills — compiler, research-manager, rigor-reviewer — into Claude Code, Cursor, OpenCode, Gemini CLI, Codex, and more.",
|
|
3
|
+
"version": "0.4.0",
|
|
4
|
+
"description": "Install Agent-Native Research Artifact (ARA) skills — compiler, research-manager, rigor-reviewer, research-visualizer — into Claude Code, Cursor, OpenCode, Gemini CLI, Codex, and more.",
|
|
5
5
|
"type": "module",
|
|
6
6
|
"bin": {
|
|
7
7
|
"ara-skills": "./bin/cli.js"
|
|
@@ -33,6 +33,7 @@
|
|
|
33
33
|
"compiler",
|
|
34
34
|
"research-manager",
|
|
35
35
|
"rigor-reviewer",
|
|
36
|
+
"research-visualizer",
|
|
36
37
|
"llm",
|
|
37
38
|
"cli"
|
|
38
39
|
],
|
package/skills/compiler/SKILL.md
CHANGED
|
@@ -199,6 +199,28 @@ the source actually reveals — but the node count and types are **source-bounde
|
|
|
199
199
|
never invent a dead end, decision, or experiment to hit a number. A paper that hides its failures
|
|
200
200
|
yields a smaller, honest tree (Rule 9 wins).
|
|
201
201
|
|
|
202
|
+
**Optional per-node changed-code (enrichment for the Research Visualizer).** When the work is a
|
|
203
|
+
sequence of code edits and the scripts are resolvable at compile time, you MAY attach to an experiment
|
|
204
|
+
node the **unified diff** it represents — never required, omitted when unclear:
|
|
205
|
+
1. **Resolve node → representative variant — this link does NOT already exist; construct it.** From the
|
|
206
|
+
node's `source_refs` / its claims' cited `record_configs` → the run index (`runs.csv`/`runs.jsonl`)
|
|
207
|
+
row(s) whose family+purpose+bin match → the representative submitted script. Where this is empty or
|
|
208
|
+
ambiguous (most `decision`/`dead_end` nodes, or evidence that is only journal prose), **omit
|
|
209
|
+
`code_change`** — never guess a script.
|
|
210
|
+
2. **Resolve node → diff base** from the lineage you already reconstruct for `solution/*` (wave baseline
|
|
211
|
+
or immediate-parent variant).
|
|
212
|
+
3. **Index both scripts in `src/artifacts.md` under a stable anchor** (`A01`, `A02`, …) carrying real
|
|
213
|
+
path + sha256 + original location; compute the unified diff (variant vs base) and write it to a tracked
|
|
214
|
+
**`evidence/changes/<node-id>.diff.md`** sidecar (fenced ```diff, `**Source**` header citing the two
|
|
215
|
+
anchor ids). Set the node's `code_change: {base_artifact, variant_artifact, lang, diff_file}`. The whole
|
|
216
|
+
scripts stay pointers (Rule 14) — the diff is a derived, grounded view, like a `derived_subset` table.
|
|
217
|
+
4. **Store-absent ⇒ pointers, not a diff.** If the scripts don't resolve on disk (git-ignored store),
|
|
218
|
+
still record `code_change` with the anchor ids + a `note`, omit `diff_file` — the visualizer shows a
|
|
219
|
+
pointer chip. Expected, not a failure.
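The four steps above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the compiler's actual implementation: the helper names (`sha256_text`, `build_diff_sidecar`) and the sample scripts are hypothetical, and the sidecar layout follows the `evidence/changes/{node-id}.diff.md` template from `ara-schema.md`.

```python
import difflib
import hashlib


def sha256_text(text: str) -> str:
    """Content hash of a script, as recorded beside its anchor (hypothetical helper)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_diff_sidecar(node_id, description, base_anchor, variant_anchor,
                       base_src, variant_src, lang="python"):
    """Return the markdown body of a <node-id>.diff.md sidecar (assumed layout)."""
    diff_lines = difflib.unified_diff(
        base_src.splitlines(),
        variant_src.splitlines(),
        fromfile=base_anchor,   # cite anchor ids, never real paths
        tofile=variant_anchor,
        lineterm="",
    )
    fence = "`" * 3  # built at runtime so this example nests cleanly in markdown
    return "\n".join([
        f"# Change {node_id}: {description}",
        f"- **Base**: {base_anchor}",
        f"- **Variant**: {variant_anchor}",
        f"- **Language**: {lang}",
        "",
        f"{fence}diff",
        "\n".join(diff_lines),
        fence,
        "",
    ])


# Made-up scripts standing in for a base and a variant.
base = "lr = 0.1\nepochs = 10\n"
variant = "lr = 0.01\nepochs = 10\n"

sidecar = build_diff_sidecar("N01", "lower learning rate", "A01", "A07", base, variant)

# The node then points at the sidecar by path and at the scripts by anchor id.
code_change = {
    "base_artifact": "A01",
    "variant_artifact": "A07",
    "lang": "python",
    "diff_file": "evidence/changes/N01.diff.md",
}
print(sidecar)
```

When the scripts cannot be read at compile time, the same `code_change` mapping would carry a `note` and no `diff_file`, and no sidecar is written.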
|
|
220
|
+
|
|
221
|
+
You MAY also attach `node.thinking` — the agent's deliberation — but **only verbatim** grounded
|
|
222
|
+
journal/decision text; never compose new prose. No verbatim rationale ⇒ leave it absent.
|
|
223
|
+
|
|
202
224
|
### Step 3: Generate Files
|
|
203
225
|
|
|
204
226
|
Write the mandatory core, then the additional files the paper warrants. See
|
|
@@ -249,6 +271,10 @@ Run ARA Seal Level 1. Check:
|
|
|
249
271
|
heuristic `Code ref` → a real `src/execution/` file (when both exist); tree `evidence:` → claim IDs
|
|
250
272
|
- Evidence: **every numbered table and figure is filed with BOTH a markdown file and a screenshot
|
|
251
273
|
(.png)**; numbered objects not filed are accounted for in `evidence/README.md` with a reason
|
|
274
|
+
- **Changed-code (only if emitted):** each `evidence/changes/<node>.diff.md` cites two `src/artifacts.md`
|
|
275
|
+
anchors (`base`/`variant`) that resolve; the diff is verbatim; the node's `code_change` points at the
|
|
276
|
+
sidecar via `diff_file` (or carries a `note` with no `diff_file` when the store was absent). Optional —
|
|
277
|
+
absent is fine; never invent a diff or a node→script mapping
|
|
252
278
|
- Evidence files have **Source** fields; figures declare Figure type / Extraction method / Reading
|
|
253
279
|
confidence; estimated readings marked `≈` (not `exact_from_labels`); diagrams/qualitative samples
|
|
254
280
|
carry a visual description, not a fabricated table
|
package/skills/compiler/references/ara-schema.md
CHANGED
|
@@ -32,6 +32,7 @@ evidence/
|
|
|
32
32
|
tables/ # ✓ every numbered Table: tableN.md + tableN.png
|
|
33
33
|
figures/ # ✓ every numbered Figure: figureN.md + figureN.png
|
|
34
34
|
proofs/ # as warranted: derivations / proofs
|
|
35
|
+
changes/ # as warranted: per-node code-change unified diffs (Research Visualizer)
|
|
35
36
|
rubric/requirements.md # (Only if a rubric is provided)
|
|
36
37
|
```
|
|
37
38
|
|
|
@@ -378,6 +379,8 @@ the winner, or files collapsed into a single directory link) is the failure.
|
|
|
378
379
|
|
|
379
380
|
```markdown
|
|
380
381
|
## {Artifact name}
|
|
382
|
+
- **Anchor**: {stable short id — `A01`, `A02`, … — so a trace node's `code_change` can reference this artifact by id; optional, but required for the Research Visualizer's changed-code diffs}
|
|
383
|
+
- **sha256**: {content hash of the file, when a code-change diff cites it}
|
|
381
384
|
- **File(s) in repo**: {real path(s), verified to exist}
|
|
382
385
|
- **Nature**: {what it is — tool / library / skill spec / system / dataset}
|
|
383
386
|
- **What it does / contains**: {grounded description}
|
|
@@ -423,6 +426,28 @@ Reproducibility for any field. For purely analytical work, state so explicitly.
|
|
|
423
426
|
|
|
424
427
|
---
|
|
425
428
|
|
|
429
|
+
## evidence/changes/{node-id}.diff.md (Research Visualizer changed-code diffs)
|
|
430
|
+
|
|
431
|
+
Per-experiment-node **unified diff** the step represents — a derived, grounded view (the whole scripts
|
|
432
|
+
stay pointers in `src/artifacts.md`; this is NOT a copy of the artifact). One tracked file per node that
|
|
433
|
+
has a resolvable code change:
|
|
434
|
+
|
|
435
|
+
```markdown
|
|
436
|
+
# Change {node-id}: {short description}
|
|
437
|
+
- **Base**: A01 (→ src/artifacts.md anchor; path + sha256 + original location live there)
|
|
438
|
+
- **Variant**: A07
|
|
439
|
+
- **Language**: python
|
|
440
|
+
|
|
441
|
+
<unified diff fenced as ```diff … ``` — verbatim from the real scripts>
|
|
442
|
+
```
|
|
443
|
+
|
|
444
|
+
Rules: cite the two artifacts **by anchor id**, never paste their paths/sha here (those live once in
|
|
445
|
+
`src/artifacts.md`). The diff text is verbatim. If the scripts can't be resolved at compile time
|
|
446
|
+
(git-ignored store), omit this file and set the node's `code_change.note` instead (the visualizer shows a
|
|
447
|
+
pointer chip, no diff). These sidecars MUST be tracked/committed (not swept by any store `.gitignore`).
|
|
448
|
+
|
|
449
|
+
---
|
|
450
|
+
|
|
426
451
|
## evidence/tables/{file}.md (+ screenshot)
|
|
427
452
|
|
|
428
453
|
Every numbered table gets BOTH this markdown file AND a screenshot `tableN.png` (the rendered
|
|
@@ -475,6 +500,9 @@ tree:
|
|
|
475
500
|
source_refs: ["Table 2", "§4.1"] # recommended for explicit nodes
|
|
476
501
|
title: "{...}"
|
|
477
502
|
description: "{...}"
|
|
503
|
+
# OPTIONAL enrichment (Research Visualizer; omit when absent):
|
|
504
|
+
# thinking: "{verbatim agent deliberation — why it did/branched}"
|
|
505
|
+
# code_change: { base_artifact: A01, variant_artifact: A07, lang: python, diff_file: evidence/changes/N01.diff.md }
|
|
478
506
|
```
|
|
479
507
|
|
|
480
508
|
Rules:
|
package/skills/research-visualizer/SKILL.md
|
@@ -0,0 +1,172 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: research-visualizer
|
|
3
|
+
description: |
|
|
4
|
+
Research Visualizer. Renders an existing Agent-Native Research Artifact (ARA) into ONE
|
|
5
|
+
self-contained, interactive HTML file showing the AI scientist's step-by-step research process:
|
|
6
|
+
a clickable process map of the exploration tree (branches and dead ends included) on the left,
|
|
7
|
+
and a per-step drill-down on the right — what the step did, why (the linked claim), the real
|
|
8
|
+
result (verbatim grounded numbers + inline figures + tables), and the code/artifact pointer.
|
|
9
|
+
Read-only consumer of the artifact — it never changes how research is done.
|
|
10
|
+
When the ARA carries them, it also surfaces (each optional, only when present) the related-work
|
|
11
|
+
dependency graph, the problem framing, a concepts glossary with in-text term popovers, and the
|
|
12
|
+
solution recipes — reached from header disclosures without leaving the process map.
|
|
13
|
+
Accepts either an existing ARA or raw research input (a paper, repo, run logs, or notes); when the
|
|
14
|
+
input is not yet an ARA it is compiled into one first, then visualized.
|
|
15
|
+
|
|
16
|
+
TRIGGERS: visualize, visualizer, trajectory view, render the ARA, see the steps, step-by-step view,
|
|
17
|
+
process map, replay the trajectory, watch the agent work, drill into steps,
|
|
18
|
+
visualize a paper, visualize a repo, visualize a run
|
|
19
|
+
argument-hint: "[ara-dir] [--output <path>]"
|
|
20
|
+
allowed-tools: Read, Write, Edit, Glob, Grep, Bash(python3 *|base64 *|find *|ls *|open *)
|
|
21
|
+
metadata:
|
|
22
|
+
author: ara-commons
|
|
23
|
+
category: research-tooling
|
|
24
|
+
version: "1.0.0"
|
|
25
|
+
tags: [research, visualization, trajectory, exploration-tree, html]
|
|
26
|
+
---
|
|
27
|
+
|
|
28
|
+
# Research Visualizer
|
|
29
|
+
|
|
30
|
+
You render an existing ARA into a single portable HTML view of the agent's step-by-step process.
|
|
31
|
+
You are a **read-only consumer**: you read the artifact and emit a file; you never edit the ARA.
|
|
32
|
+
|
|
33
|
+
You operate as a first-class agent — use your native tools directly. The heavy rendering logic is
|
|
34
|
+
**already written** in `references/trajectory-template.html`; you do NOT rewrite it. Your job is to
|
|
35
|
+
parse the ARA into one `ARA_DATA` JSON object, inline the figures, and inject that object into the
|
|
36
|
+
template's data slot.
|
|
37
|
+
|
|
38
|
+
## What you produce
|
|
39
|
+
|
|
40
|
+
One self-contained file, default `<ara-dir>/trajectory.html` (override with `--output`):
|
|
41
|
+
- All data, tables, and figures (base64) inlined — no server, no network, no CDN. Double-click to open.
|
|
42
|
+
- Built by populating the canonical scaffold, so every generated view is structurally consistent.
|
|
43
|
+
|
|
44
|
+
## v1 boundaries (do not exceed)
|
|
45
|
+
|
|
46
|
+
- **Post-hoc** visualization of a finished/in-progress ARA. No live/real-time mode.
|
|
47
|
+
- **Self-contained from the ARA directory alone.** Do NOT open or inline anything outside the ARA dir.
|
|
48
|
+
`src/artifacts.md` run-store pointers and node `source_refs` (external journal `file:line`) are shown
|
|
49
|
+
**as pointers/chips, not resolved**. (External resolution is a planned future extension — out of scope.)
|
|
50
|
+
- **Single ARA.** No cross-ARA comparison.
|
|
51
|
+
|
|
52
|
+
## Pipeline
|
|
53
|
+
|
|
54
|
+
1. **Args.** Resolve `<ara-dir>` (default: the ARA in the current working context / most-recently
|
|
55
|
+
referenced). Resolve `--output` (default `<ara-dir>/trajectory.html`).
|
|
56
|
+
1b. **Precondition — the input must be an ARA; if it is not, compile it first.** Decide with one
|
|
57
|
+
observable test: does the resolved input expose a parseable `trace/exploration_tree.yaml`
|
|
58
|
+
(**≥1 node**) — directly, or as a standard ARA directory layout?
|
|
59
|
+
- **It is an ARA** → continue to Validate unchanged.
|
|
60
|
+
- **It is not an ARA** — the input is raw research material (a paper/PDF, a code repository, a
|
|
61
|
+
run/log directory, notes, or any directory with no exploration tree) → **invoke the `compiler`
|
|
62
|
+
skill on that input to produce an ARA**, then set `<ara-dir>` to the compiler's output artifact
|
|
63
|
+
and continue. Do not hand-roll an ARA yourself; the compiler is the only path that builds one.
|
|
64
|
+
Default `--output` to `<compiled-ara-dir>/trajectory.html` unless the user set it.
|
|
65
|
+
Only if the compiler still yields no exploration tree does the Validate step's "no process" message apply.
|
|
66
|
+
2. **Validate — the exploration tree is the ONLY hard requirement.** Confirm
|
|
67
|
+
`trace/exploration_tree.yaml` exists and parses to **≥1 node**; if not (and the precondition's
|
|
68
|
+
compile step has already run), tell the user there is no process to show (this replaces the old
|
|
69
|
+
`PAPER.md` "is-this-an-ARA?" guard). Everything else —
|
|
70
|
+
`PAPER.md`, `logic/`, `src/`, `evidence/`, and the four enrichment layers — is **optional
|
|
71
|
+
enrichment**: glob whatever is present. If `PAPER.md` is absent, synthesize a minimal `meta` (title
|
|
72
|
+
from a tree-level `title:` or the dir name; empty `abstract` hides the disclosure). This is the
|
|
73
|
+
**raw-trajectory path**: the skill produces a useful step-by-step view from *just the tree* (a raw
|
|
74
|
+
agent run), not only a fully-compiled ARA — see `references/parsing.md` §7.
|
|
75
|
+
3. **Parse the trace** into normalized nodes. The field conventions vary across ARAs — follow
|
|
76
|
+
`references/parsing.md` exactly (handles `tree:` vs `root:`, generic vs type-named fields,
|
|
77
|
+
`evidence:` routing, `isolated`, `also_depends_on`). Every node must yield a `title` + `body`.
|
|
78
|
+
4. **Parse the hub layers — each only when present (all optional now):** `logic/claims.md` (the
|
|
79
|
+
binding hub *when it exists*), `logic/experiments.md`, `evidence/README.md` (figure/table ↔ claim
|
|
80
|
+
reverse index), `src/artifacts.md`, `logic/solution/*`. A missing layer simply contributes nothing;
|
|
81
|
+
the node still renders from its own `title`/`body`/`thinking`.
|
|
82
|
+
**Also parse the four OPTIONAL enrichment layers when present, per `references/parsing.md` §8:**
|
|
83
|
+
`logic/problem.md`→`context`, `logic/concepts.md`→`glossary` (+ build the `lexicon`),
|
|
84
|
+
`logic/related_work.md`→`dependencies`, `logic/solution/*.md`→`recipes` (role-classify by content, not
|
|
85
|
+
filename). A missing file/dir omits its key entirely. Reproduce statements/deltas/definitions/relations/
|
|
86
|
+
headings/quotes/cells **verbatim**.
|
|
87
|
+
5. **Build each node's drill-down.** When `logic/claims.md` exists, follow the claim-hub chain in
|
|
88
|
+
`references/binding.md` (node → `evidence:[C##]` → claims → {Sources quotes, figures/tables,
|
|
89
|
+
experiments, artifact pointers}). When it does **not**, the drill-down is just the node's own
|
|
90
|
+
narrative (`thinking`/`body`) — every claim/result/verified block is empty and omitted.
|
|
91
|
+
5b. **Bind the enrichment layers** per the "four enrichment layers" section of `references/binding.md`:
|
|
92
|
+
build `claimIds`/`nodeByClaim`/`conceptNames`/`rwIds`; resolve every `refs[].target` (drop danglers,
|
|
93
|
+
never link off-ARA); derive each node's `built_on`/`rejected_here` (dependency→claim→node, bucketed by
|
|
94
|
+
`relation_norm`), `concepts` (whole-word name-match), and `recipe_refs` (recipe→claim→node); mark
|
|
95
|
+
cross-agent entries. All per-node enrichment fields default `[]`.
|
|
96
|
+
6. **Inline figures.** For each referenced figure that has a real raster (`evidence/figures/*.png`),
|
|
97
|
+
base64-encode it and put the `data:` URI in `figures[].img`. Use Bash, e.g.
|
|
98
|
+
`python3 -c "import base64,sys;print('data:image/png;base64,'+base64.b64encode(open(sys.argv[1],'rb').read()).decode())" <path>`.
|
|
99
|
+
For data-only figure markdown (no raster), render its data table instead (as a `tables[]` entry).
|
|
100
|
+
**Also inline code diffs + the artifact index:** for each node with `code_change.diff_file`, read that
|
|
101
|
+
tracked `evidence/changes/<id>.diff.md` sidecar and inline its fenced diff text into `code_change.diff`
|
|
102
|
+
(parallel to figures); build the top-level `artifacts[]` index from `src/artifacts.md` so
|
|
103
|
+
`base_artifact`/`variant_artifact` ids resolve; carry each node's verbatim `thinking` straight through.
|
|
104
|
+
Sanitize all three verbatim fields per the Injection contract. The visualizer never computes a diff
|
|
105
|
+
itself and never opens the external store — it only inlines what the ARA already contains.
|
|
106
|
+
7. **Assemble `ARA_DATA`** (exact schema in `references/binding.md`) and **inject** it: replace ONLY
|
|
107
|
+
the JSON between `/* __ARA_DATA_BEGIN__ */` and `/* __ARA_DATA_END__ */` in the
|
|
108
|
+
`<script id="ara-data">` block of a copy of the template. Write the result to the output path.
|
|
109
|
+
Include `context`/`glossary`/`dependencies`/`recipes` and the per-node `built_on`/`rejected_here`/
|
|
110
|
+
`recipe_refs`/`concepts` **only when their sources exist** — omit absent keys entirely (no empty
|
|
111
|
+
stubs). A payload omitting all of them stays byte-compatible with the v1.0 schema.
|
|
112
|
+
8. **Report** the output path. Optionally open it (`open <path>` on macOS). Print a one-line summary
|
|
113
|
+
(node count, dead ends, figures inlined, which of the four enrichment overlays were emitted with their
|
|
114
|
+
term/dependency/recipe counts, danglers dropped, any pointers left unresolved).
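Step 6's one-liner can be expanded into a small helper. This is a sketch only: the function names are hypothetical, and it assumes a figure id matches the file stem under `evidence/figures/`, which the source does not state.

```python
import base64
from pathlib import Path


def png_data_uri(path):
    """Encode one PNG file as a data: URI, as in the step 6 one-liner."""
    raw = Path(path).read_bytes()
    return "data:image/png;base64," + base64.b64encode(raw).decode("ascii")


def inline_figures(ara_dir, figure_ids):
    """Build figures[] entries; 'img' is set only when a real raster exists.

    Assumption: a figure id equals the file stem in evidence/figures/.
    """
    entries = []
    for fig_id in figure_ids:
        png = Path(ara_dir) / "evidence" / "figures" / f"{fig_id}.png"
        entry = {"id": fig_id}
        if png.is_file():
            entry["img"] = png_data_uri(png)
        # No raster: leave 'img' unset so the caller can fall back to a tables[] entry.
        entries.append(entry)
    return entries
```

Calling `inline_figures("<ara-dir>", ["figure1"])` would return one entry, with `img` present only if `figure1.png` exists inside the ARA directory; nothing outside that directory is opened.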
|
|
115
|
+
|
|
116
|
+
## Injection contract (critical)
|
|
117
|
+
|
|
118
|
+
- The injected payload MUST be valid JSON (it is read with `JSON.parse`). The template strips only the
|
|
119
|
+
two named marker comments before parsing, so the payload is otherwise pure JSON.
|
|
120
|
+
- It must not contain the literal substring `</script>`, nor the literal marker strings
|
|
121
|
+
`/* __ARA_DATA_BEGIN__ */` / `/* __ARA_DATA_END__ */`. Escape any `<` in inlined markdown/text as `\u003c`
|
|
122
|
+
(or `&lt;`) — this also neutralizes `</script>`. (A bare `*/` inside a string value is harmless to
|
|
123
|
+
`JSON.parse`; only the exact marker strings would be stripped.)
|
|
124
|
+
- **The verbatim free-text fields `thinking` and `code_change.diff` are the high-risk carriers** (source
|
|
125
|
+
code routinely contains `/* … */`). If either marker token would appear in their text, break it (e.g.
|
|
126
|
+
insert a zero-width space inside `__ARA_DATA_…`) so the global marker-strip can't delete it from inside
|
|
127
|
+
a value. Re-validate: a node whose `thinking`/`diff` contains a marker token MUST round-trip intact.
|
|
128
|
+
- Do not touch anything else in the template — only the bytes between the two markers.
|
|
129
|
+
- After writing, re-validate: the file still parses (the embedded JSON loads). If a figure pushed the
|
|
130
|
+
file very large, apply the size guards in `references/binding.md` (truncate logs/tables, keep figures).
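The contract above can be exercised with a short sketch. Everything here is illustrative: the function names are hypothetical, the one-line `template` is a stand-in for `references/trajectory-template.html`, and the zero-width-space position is just one way to break the marker token.

```python
import json

BEGIN = "/* __ARA_DATA_BEGIN__ */"
END = "/* __ARA_DATA_END__ */"
ZWSP = "\u200b"  # zero-width space used to break marker tokens


def break_markers(value):
    """Recursively break marker tokens inside every string value."""
    if isinstance(value, str):
        return value.replace("__ARA_DATA_", "__ARA" + ZWSP + "_DATA_")
    if isinstance(value, list):
        return [break_markers(v) for v in value]
    if isinstance(value, dict):
        return {k: break_markers(v) for k, v in value.items()}
    return value


def render_payload(ara_data):
    """Serialize to JSON that contains no '<' and no intact marker string."""
    text = json.dumps(break_markers(ara_data))
    # In serialized JSON a '<' can only occur inside a string, so a blanket
    # replace is safe and also removes any literal '</script>'.
    return text.replace("<", "\\u003c")


def inject(template, ara_data):
    """Replace only the bytes between the two markers."""
    start = template.index(BEGIN) + len(BEGIN)
    stop = template.index(END, start)
    return template[:start] + render_payload(ara_data) + template[stop:]


def extract(html):
    """Re-validate: the embedded JSON must still load."""
    start = html.index(BEGIN) + len(BEGIN)
    stop = html.index(END, start)
    return json.loads(html[start:stop])


# Stand-in template; the real scaffold is far larger.
template = '<script id="ara-data">' + BEGIN + "{}" + END + "</script>"
data = {"nodes": [{"id": "N01", "thinking": "see </script> and " + END}]}

html = inject(template, data)
restored = extract(html)
```

After injection the only `</script>` and the only marker comments left in `html` are the template's own, and `restored` equals the sanitized input, so the high-risk `thinking` value survives apart from the inserted zero-width space.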
|
|
131
|
+
|
|
132
|
+
## Faithfulness (hard rules)
|
|
133
|
+
|
|
134
|
+
- Reproduce claim `Statement`s, `Sources` quotes, and table numbers **verbatim** — never paraphrase,
|
|
135
|
+
never invent. Missing data → set the field empty/omit (the viewer shows "No …"); never fabricate.
|
|
136
|
+
- Provenance, `support_level`, and `status` are shown **only if present** in the source; do not guess.
|
|
137
|
+
- Dead-end nodes and `isolated` subtrees must be carried through faithfully — they are the most
|
|
138
|
+
valuable things to display, not noise to drop.
|
|
139
|
+
- For the enrichment layers: relation strings, definitions, constraint headings, and footprint citations
|
|
140
|
+
are reproduced **verbatim**; relation enums are open (compound `bounds / refutes` / transition
|
|
141
|
+
`extends → quarantined` kept as written; `relation_norm` is for color only). Never normalize a heading
|
|
142
|
+
or invent a typed sub-field. A `refs[].target` is set only on real in-ARA resolution; dangling refs are
|
|
143
|
+
flagged, never silently corrected or dropped. `built_on`/`concepts`/"used by" name-matches are
|
|
144
|
+
best-effort hints (marked "inferred"), never asserted as facts.
|
|
145
|
+
|
|
146
|
+
## Verify
|
|
147
|
+
|
|
148
|
+
Run on any ARA and confirm these properties — no named fixtures required:
|
|
149
|
+
- Opens by double-click: no server, no network, no console errors.
|
|
150
|
+
- Full process map: nesting, branches, dead ends marked, any `isolated` subtree boxed, `depends_on` chips.
|
|
151
|
+
- Drill-down renders whichever blocks are present (what / why / result-with-inline-figure / how-verified /
|
|
152
|
+
code-or-pointer), correctly under **both** field dialects in `references/parsing.md`.
|
|
153
|
+
- Verbatim quotes/numbers; nothing fabricated; self-contained from the ARA dir (no needed external refs).
|
|
154
|
+
- Re-running reproduces the same structure (data differs only as the ARA differs).
|
|
155
|
+
- **Enrichment layers:** a layer's header button appears only when its source exists; an ARA with none of
|
|
156
|
+
the four layers renders identically to v1.0 (no layer bar, no node chips). Open each emitted overlay and
|
|
157
|
+
confirm verbatim relations/definitions/recipe cells, ungrounded/dangling/cross-agent markers, and that
|
|
158
|
+
the `built_on`/`rejected_here` chips + the `⊕/⊘` map marker deep-link into Dependencies. Glossary
|
|
159
|
+
popovers fire on body terms; inline `$LaTeX$` renders with no network.
|
|
160
|
+
- **Degradation:** a `minimal-artifact` (only `problem.md`) shows only the Context button, others absent,
|
|
161
|
+
popovers off, per-node chips empty, zero console errors.
|
|
162
|
+
- **Compile-first (non-ARA input):** pointing the skill at raw research material with no exploration
|
|
163
|
+
tree (a paper, a repo, a run/log dir) triggers the `compiler` skill first, then visualizes the
|
|
164
|
+
resulting ARA — the output is identical to running the compiler then the visualizer by hand.
|
|
165
|
+
- **Raw trajectory (the decoupled path):** a **tree-only** ARA — just `trace/exploration_tree.yaml`, no
|
|
166
|
+
`PAPER.md`, no `logic/`, no `evidence/` — still renders the full process map + each step's narrative
|
|
167
|
+
(`thinking`/`body`), with no layer bar and no claim/result/verified blocks, zero console errors. This
|
|
168
|
+
is a first-class supported input, not a failure mode.
|
|
169
|
+
|
|
170
|
+
Cover the variant axes with whatever ARAs you have: both root forms (`tree:`/`root:`), both field
|
|
171
|
+
dialects (generic / type-named), figures present as real raster vs. data-markdown-only, `src/` as a
|
|
172
|
+
pointer index vs. transcribed code, and an `isolated` subtree if any artifact has one.
|
package/skills/research-visualizer/references/binding.md
|
@@ -0,0 +1,245 @@
|
|
|
1
|
+
# Binding — the claim-hub drill-down chain + the `ARA_DATA` schema
|
|
2
|
+
|
|
3
|
+
This is the read step. For each normalized trace node you build one drill-down bundle by following
|
|
4
|
+
the artifact's **claim-mediated** cross-layer links, then you emit the whole thing as one `ARA_DATA`
|
|
5
|
+
object that the scaffold renders.
|
|
6
|
+
|
|
7
|
+
> **The claim hub is OPTIONAL — the tree is the only hard requirement.** Every layer this file binds
|
|
8
|
+
> (`logic/claims.md`, `logic/experiments.md`, `evidence/`, `src/artifacts.md`, the four enrichment
|
|
9
|
+
> layers) is enrichment: when its dir/file is absent, that binding step **no-ops** and the
|
|
10
|
+
> corresponding `ARA_DATA` arrays stay `[]` (omitted by the renderer). A node ALWAYS renders from its
|
|
11
|
+
> own normalized `title` / `body` / `thinking` (parsing.md §3, §7a) — so a **raw, tree-only
|
|
12
|
+
> trajectory** with no `logic/` and no `evidence/` produces a complete process map + per-step
|
|
13
|
+
> narrative. The chain below is what runs *when the hub exists*; nothing here may hard-fail on an
|
|
14
|
+
> absent layer.
|
|
15
|
+
|
|
16
|
+
## Why claim-mediated (and not node → `.py` + table)
|
|
17
|
+
|
|
18
|
+
Current ARAs usually have **no per-step source file**: `src/` is `artifacts.md` (a pointer index into
|
|
19
|
+
an external run store) + `environment.md`. The binding hub is `logic/claims.md`. A trace node points
|
|
20
|
+
at claims via `evidence: [C##]`; the claim carries the *why*, the grounded *result* (verbatim
|
|
21
|
+
`Sources` quotes with `file:line`), the verification (`Proof: [E##]`), and the figures/tables that
|
|
22
|
+
cite it. So the real grounded numbers are **already inside the ARA** — you do not need the external
|
|
23
|
+
store (v1 leaves those as pointers).
|
|
24
|
+
|
|
25
|
+
## Resolution chain (per node)
|
|
26
|
+
|
|
27
|
+
```
|
|
28
|
+
NODE (type, content+outcome, support_level, source_refs, evidence:[C##], also_depends_on, isolated)
|
|
29
|
+
├─ WHAT → node title + body (+ outcome) ............................ → node.title / node.body
|
|
30
|
+
├─ WHY → evidence:[C##] → logic/claims.md ......................... → node.why[]
|
|
31
|
+
│ each claim: Statement, Status, Conditions, Falsification,
|
|
32
|
+
│ Dependencies, provenance
|
|
33
|
+
├─ HOW VERIFIED→ claim.Proof:[E##] → logic/experiments.md ................. → node.verified_by[]
|
|
34
|
+
│ each experiment: Run (pointer), Setup, Metrics
|
|
35
|
+
├─ RESULT → claim.Sources «quote ← file:line» + Evidence basis ....... → node.result.sources[]
|
|
36
|
+
│ + evidence/README.md reverse-lookup: figures/tables citing C## → node.result.figures[]/tables[]
|
|
37
|
+
│ + evidence/data/*.json (raw, in-artifact) .................. → node.result.data[]
|
|
38
|
+
└─ CODE/ARTIFACT→ src/artifacts.md pointer(s) + logic/solution/* recipe ... → node.artifact[]
|
|
39
|
+
```
|
|
40
|
+
|
|
41
|
+
### evidence/README.md reverse-lookup
|
|
42
|
+
`evidence/README.md` has a `Claims` column mapping each figure/table file → the `C##` it grounds.
|
|
43
|
+
Build a map `claim_id → [evidence files]` once. For a node, union the evidence of all its claims.
|
|
44
|
+
Fallback if no README row: scan the figure/table `.md` for an inline `Supports C##` / `grounds C##`.
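The reverse-lookup map could be built roughly as follows. This is a sketch under assumptions: only the `Claims` column is named in the source, so the `File` column name, the function names, and the sample table are hypothetical, and the inline `Supports C##` fallback is not shown.

```python
import re
from collections import defaultdict

CLAIM_ID = re.compile(r"\bC\d+\b")


def claims_to_evidence(readme_text, file_column="File", claims_column="Claims"):
    """Build claim_id -> [evidence files] from markdown tables in the README."""
    index = defaultdict(list)
    header = None
    for line in readme_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            header = None  # a non-table line ends the current table
            continue
        cells = [c.strip().strip("`") for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first row of a table is its header
            continue
        if all(set(c) <= set("-: ") for c in cells):
            continue  # separator row such as |---|---|
        if file_column not in header or claims_column not in header:
            continue
        row = dict(zip(header, cells))
        evidence_file = row.get(file_column)
        if not evidence_file:
            continue
        for claim in CLAIM_ID.findall(row.get(claims_column, "")):
            index[claim].append(evidence_file)
    return dict(index)


def evidence_for_node(claim_ids, index):
    """Union the evidence of all of a node's claims, keeping first-seen order."""
    seen = []
    for claim in claim_ids:
        for evidence_file in index.get(claim, []):
            if evidence_file not in seen:
                seen.append(evidence_file)
    return seen


# Made-up README fragment for illustration.
readme = """
| File | Claims |
|------|--------|
| figures/figure1.md | C01, C03 |
| tables/table2.md | C01 |
"""
index = claims_to_evidence(readme)
node_evidence = evidence_for_node(["C01", "C03"], index)
```

Building the map once and then taking the union per node keeps the lookup linear in the README size, which matches the "build once" instruction above.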
|
|
45
|
+
|
|
46
|
+
### Figures
|
|
47
|
+
- Real raster (`evidence/figures/<name>.png` beside the `.md`) → base64-inline into `figures[].img`.
|
|
48
|
+
- Data-only figure markdown (no raster) → render its data table as a `tables[]` entry instead.
|
|
49
|
+
- A figure's caption/title comes from the first heading or the `What it shows` section of its `.md`.
|
|
50
|
+
|
|
51
|
+
### What counts as "code" now
|
|
52
|
+
There is usually no `src/execution/*.py`. Populate `artifact[]` from `src/artifacts.md` (the run-index
|
|
53
|
+
row / `record_configs/` path / named submitted variant file) plus the relevant
|
|
54
|
+
`logic/solution/*.md` recipe section. Only when a real transcribed `src/execution/*.py` exists
|
|
55
|
+
(legacy / paper-only code) do you point at that file. **Never resolve the external store in v1** —
|
|
56
|
+
the pointer text is the value.
|
|
57
|
+
|
|
58
|
+
### The changed-code diff (`node.code_change`) — compiler-produced, visualizer-rendered
|
|
59
|
+
A full ARA may carry, per experiment node, the **unified diff** the step represents. The addresses
|
|
60
|
+
live in ONE place and are referenced by id (weak coupling):
|
|
61
|
+
|
|
62
|
+
`node.code_change` → `evidence/changes/<node-id>.diff.md` (the diff **text**) → `artifacts[]` /
|
|
63
|
+
`src/artifacts.md` entry (path + sha256 + original location) → the original repo.
|
|
64
|
+
|
|
65
|
+
- The diff **text** is grounded by citing the two **artifact ids** (`base_artifact`, `variant_artifact`),
|
|
66
|
+
never an embedded path. Whole scripts stay pointers in `src/artifacts.md` (Rule 14) — the diff is a
|
|
67
|
+
derived, grounded view (≈ a `derived_subset` table), not a copy of the artifact.
|
|
68
|
+
- **`diff_file` → `diff` inlining** (parallel to figures' `.md`→base64 `img`): on disk the node carries
|
|
69
|
+
`code_change.diff_file: "evidence/changes/<id>.diff.md"`; the visualizer reads that **tracked** sidecar
|
|
70
|
+
and inlines its fenced diff text into `code_change.diff` in `ARA_DATA`, so the rendered HTML stays
|
|
71
|
+
self-contained (the sidecar lives inside the ARA dir).
|
|
72
|
+
- **`artifactById`**: the visualizer builds an `id → artifacts[] entry` map (parallel to `nodeByClaim`)
|
|
73
|
+
and resolves `base_artifact`/`variant_artifact` into the shown-not-resolved pointer chip under the diff.
|
|
74
|
+
- **Degrade**: when the scripts don't resolve at compile time (store absent), the compiler emits
|
|
75
|
+
`code_change` with the artifact ids + a `note` but no diff; the viewer shows a pointer chip, not a diff.
|
|
76
|
+
- **Marker safety**: `diff` and `thinking` are verbatim, so the producer MUST ensure neither the literal
|
|
77
|
+
`/* __ARA_DATA_BEGIN__ */` / `/* __ARA_DATA_END__ */` tokens nor `</script>` appears in any inlined
|
|
78
|
+
string (escape `<`→`\u003c`; break the marker tokens). See SKILL.md "Injection contract".
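On the visualizer side, the `diff_file` to `diff` inlining and the `artifactById` lookup might look like the sketch below. The function names, the `read_text` callback, and the returned `(resolved, chips)` pair are assumptions made for illustration; marker and `<` sanitization is a separate step and is not shown.

```python
import re

FENCE = "`" * 3  # built at runtime so this example nests cleanly in markdown
DIFF_BLOCK = re.compile(
    r"^" + FENCE + r"diff[ \t]*\n(.*?)\n" + FENCE + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def fenced_diff(sidecar_text):
    """Return the text inside the first diff fence, or None if there is none."""
    match = DIFF_BLOCK.search(sidecar_text)
    return match.group(1) if match else None


def inline_code_change(code_change, artifacts, read_text):
    """Inline the sidecar diff and look up the two artifact entries by id.

    read_text is a caller-supplied function (hypothetical) that reads a path
    relative to the ARA directory; nothing outside the ARA is opened.
    """
    artifact_by_id = {a["id"]: a for a in artifacts}
    resolved = dict(code_change)
    diff_file = code_change.get("diff_file")
    if diff_file:
        diff = fenced_diff(read_text(diff_file))
        if diff is not None:
            resolved["diff"] = diff
    # Degrade path: no diff_file means only the pointer chips are available.
    chips = [
        artifact_by_id[key]
        for key in (code_change.get("base_artifact"), code_change.get("variant_artifact"))
        if key in artifact_by_id
    ]
    return resolved, chips


# Made-up sidecar and artifact index for illustration.
sidecar = "\n".join([
    "# Change N01: lower learning rate",
    "- **Base**: A01",
    "- **Variant**: A07",
    "",
    FENCE + "diff",
    "-lr = 0.1",
    "+lr = 0.01",
    FENCE,
    "",
])
files = {"evidence/changes/N01.diff.md": sidecar}
artifacts = [{"id": "A01", "name": "base"}, {"id": "A07", "name": "variant"}]
change = {"base_artifact": "A01", "variant_artifact": "A07",
          "lang": "python", "diff_file": "evidence/changes/N01.diff.md"}

resolved, chips = inline_code_change(change, artifacts, files.__getitem__)
```

A `code_change` that carries only a `note` passes through unchanged, and the chips still resolve from the artifact index.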
|
|
79
|
+
|
|
80
|
+
## The `ARA_DATA` object (exact schema the scaffold reads)
|
|
81
|
+
|
|
82
|
+
```jsonc
|
|
83
|
+
{
|
|
84
|
+
"meta": {
|
|
85
|
+
"title": "<PAPER.md frontmatter title>",
|
|
86
|
+
"authors": ["..."], // [] if none
|
|
87
|
+
"year": "", "venue": "", "ara_dir": "<dir name>",
|
|
88
|
+
"abstract": "<PAPER.md abstract>" // "" to hide the Abstract disclosure
|
|
89
|
+
},
|
|
90
|
+
|
|
91
|
+
// Traversal order for Replay and ← →. A flat list of node ids in the order to step through
|
|
92
|
+
// (typically a pre-order DFS of the tree). If omitted, the scaffold derives a DFS from `parent`.
|
|
93
|
+
"order": ["N01", "N02", "N03", "..."],
|
|
94
|
+
|
|
95
|
+
// OPTIONAL addressable artifact index (from src/artifacts.md). Each script the compiler points at
|
|
96
|
+
// gets a stable id so a node's code_change can reference it BY ID (no embedded path). Omit if absent.
|
|
97
|
+
"artifacts": [
|
|
98
|
+
{ "id":"A01", "name":"<artifact name>", "path":"<repo-relative path>", "sha256":"<...>",
|
|
99
|
+
"original_location":"<store/repo ref>", "pointer":"<src/artifacts.md pointer text>" }
|
|
100
|
+
],
|
|
101
|
+
|
|
102
|
+
"nodes": [
|
|
103
|
+
{
|
|
104
|
+
"id": "N02",
|
|
105
|
+
"type": "experiment", // question|experiment|decision|dead_end|pivot|insight|<other ok>
|
|
106
|
+
"parent": "N01", // id of the nesting parent, or null for a root
|
|
107
|
+
"title": "<normalized step title>", // see parsing.md
|
|
108
|
+
"body": "<what the step did / its outcome>",
|
|
109
|
+
"thinking": "<verbatim agent deliberation — why it did/branched; OPTIONAL>", // primary block; falls back to body
|
|
110
|
+
"support_level": "explicit", // "explicit"|"inferred"|null
|
|
111
|
+
"isolated": false, // true → rendered in a separated dashed box
|
|
112
|
+
"depends_on": ["N00"], // also_depends_on cross-edges (ids); [] if none
|
|
113
|
+
"source_refs": ["<external path:line — shown as a chip, NOT resolved>"],
|
|
114
|
+
|
|
115
|
+
"why": [ // from evidence:[C##] → claims.md
|
|
116
|
+
{ "id":"C01", "title":"...", "statement":"<verbatim Statement>",
|
|
117
|
+
"status":"supported", "conditions":"...", "falsification":"...",
|
|
118
|
+
"dependencies":["C00"], "provenance":"<as stated in source, else null>" }
|
|
119
|
+
],
|
|
120
|
+
|
|
121
|
+
"result": {
|
|
122
|
+
"sources": [ { "quote":"<verbatim Sources quote>", "ref":"<file:line>" } ],
|
|
123
|
+
"figures": [ { "id":"<figure id>", "caption":"...", "kind":"quantitative_plot",
|
|
124
|
+
"img":"data:image/png;base64,<...>" } ],
|
|
125
|
+
"tables": [ { "id":"<table id>", "caption":"...", "markdown":"| col | ... |\n|---|...|" } ],
|
|
126
|
+
"data": [ { "id":"<data id>", "path":"evidence/data/<name>.json", "note":"raw" } ]
|
|
127
|
+
},
|
|
128
|
+
|
|
129
|
+
"verified_by": [ // from claim.Proof:[E##] → experiments.md
|
|
130
|
+
{ "id":"E01", "title":"...", "run":"<Run pointer text>", "setup":"...", "metrics":"..." }
|
|
131
|
+
],
|
|
132
|
+
|
|
133
|
+
"artifact": [ // src/artifacts.md pointers + solution recipe refs (pointer text only)
|
|
134
|
+
{ "name":"<artifact / family name>", "pointer":"<src/artifacts.md pointer text>", "what":"pointer index entry" }
|
|
135
|
+
],
|
|
136
|
+
|
|
137
|
+
"code_change": { // OPTIONAL — the changed-code diff for this step (compiler-produced)
|
|
138
|
+
"base_artifact":"A01", // → artifacts[].id (holds path+sha+original_location)
|
|
139
|
+
"variant_artifact":"A07", // → artifacts[].id
|
|
140
|
+
"lang":"python",
|
|
141
|
+
"diff":"<unified-diff text, inlined by the visualizer from evidence/changes/<id>.diff.md>",
|
|
142
|
+
"note":"" // set (with diff absent) when the scripts didn't resolve → pointer-only chip
|
|
143
|
+
}
|
|
144
|
+
}
|
|
145
|
+
// ... one object per trace node
|
|
146
|
+
]
|
|
147
|
+
}
|
|
148
|
+
```
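
The schema comment on `order` says the scaffold derives a DFS from `parent` when `order` is omitted. A minimal sketch of that derivation follows; the function names are hypothetical, and two details are assumptions: siblings keep their source order, and a node whose `parent` id is unknown is treated as a root so that no node is dropped.

```python
def derive_order(nodes):
    """Pre-order DFS over the `parent` tree, siblings in source order."""
    known = {n["id"] for n in nodes}
    children = {}
    for node in nodes:
        parent = node.get("parent")
        if parent not in known:
            parent = None  # assumption: unknown or null parent means a root
        children.setdefault(parent, []).append(node["id"])
    order = []
    stack = list(reversed(children.get(None, [])))
    while stack:
        node_id = stack.pop()
        order.append(node_id)
        # Push children reversed so the first child is visited next.
        stack.extend(reversed(children.get(node_id, [])))
    return order


def traversal_order(ara_data):
    """Use the explicit `order` when present, else fall back to the derived DFS."""
    return ara_data.get("order") or derive_order(ara_data.get("nodes", []))
```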
|
|
149
|
+
|
|
150
|
+
### Field rules
|
|
151
|
+
- Every node MUST have `id`, `type`, `title`, and `parent` (an id, or `null` for a root). For every other field, arrays default to `[]`,
|
|
152
|
+
scalars to `null`/`""`. The scaffold tolerates missing optional fields.
|
|
153
|
+
- Put **only what the source contains**. Empty `why`/`result`/`verified_by`/`artifact` is fine and
|
|
154
|
+
common (e.g. a bare `decision` node) — the viewer simply omits those blocks. `thinking` and
|
|
155
|
+
`code_change` are likewise optional; omit when absent (a payload without them is byte-compatible with the prior schema).
|
|
156
|
+
- `status` is lower-cased by the viewer for styling; pass it as written (`Supported`, `hypothesis`, …).
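
The field rules above can be illustrated with a small normalizer. This is a sketch only: `normalize_node` is a hypothetical helper, and the specific scalar defaults (`body` as `""`, `isolated` as `False`) are assumptions, since the rules only say scalars default to `null`/`""`. The `result` object is left untouched because the rules do not state a default for it.

```python
REQUIRED = ("id", "type", "title")
ARRAY_FIELDS = ("depends_on", "source_refs", "why", "verified_by", "artifact")
# Assumed defaults; the source only says scalars default to null or "".
SCALAR_DEFAULTS = {"body": "", "support_level": None, "isolated": False}


def normalize_node(node):
    """Check the required keys and fill defaults without inventing content."""
    missing = [key for key in REQUIRED if not node.get(key)]
    if "parent" not in node:  # the key must be present, though it may be None
        missing.append("parent")
    if missing:
        raise ValueError(f"node is missing required fields: {missing}")
    out = dict(node)
    for key in ARRAY_FIELDS:
        out.setdefault(key, [])
    for key, default in SCALAR_DEFAULTS.items():
        out.setdefault(key, default)
    # `thinking` and `code_change` are deliberately not defaulted: omit when absent.
    return out
```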
|
|
157
|
+
|
|
158
|
+
### Size guards
|
|
159
|
+
- Inline every figure (they are the point), but cap a single inlined table/log render at a few hundred
|
|
160
|
+
lines; if longer, truncate and append a final row/line `… +N more (truncated)`.
|
|
161
|
+
- If the assembled file would be very large, prefer truncating `tables`/`data` text over dropping
|
|
162
|
+
figures, and never drop nodes.
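
A possible shape for the truncation guard above, as a sketch: `truncate_lines` is a hypothetical helper, and the `max_lines=300` default is an assumed reading of "a few hundred lines", not a documented limit.

```python
def truncate_lines(text, max_lines=300):
    """Cap an inlined table/log render and append the truncation marker line."""
    lines = text.split("\n")
    if len(lines) <= max_lines:
        return text
    dropped = len(lines) - max_lines
    return "\n".join(lines[:max_lines] + [f"… +{dropped} more (truncated)"])
```

Applied only to `tables`/`data` text, this leaves figures and nodes untouched, in line with the priority order stated above.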
|
|
163
|
+
|
|
164
|
+
---
|
|
165
|
+
|
|
166
|
+
# The four enrichment layers — schema additions + cross-linking
|
|
167
|
+
|
|
168
|
+
These surface `logic/problem.md` / `logic/concepts.md` / `logic/related_work.md` / `logic/solution/*`
|
|
169
|
+
(parsed per `parsing.md` §8). **Everything here is OPTIONAL and additive** — omit any key whose source
|
|
170
|
+
is absent; a payload with none of these stays byte-compatible with the prior schema and the renderer is
|
|
171
|
+
inert for any key it doesn't see.
|
|
172
|
+
|
|
173
|
+
## Schema additions
|
|
174
|
+
|
|
175
|
+
A shared **typed-ref** primitive (used by every layer's `refs`/`grounding`):
|
|
176
|
+
```jsonc
|
|
177
|
+
{ "raw": "<verbatim token, always shown>",
|
|
178
|
+
"kind": "claim|concept|related_work|observation|gap|assumption|experiment|node|source|figure|pr|arxiv|doi|url|unknown",
|
|
179
|
+
"target": "<in-ARA anchor id, or null>" } // non-null ONLY when it resolves inside THIS ARA
|
|
180
|
+
```
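
The `target` rule in the typed-ref above (non-null only when the ref resolves inside this ARA) could be enforced as sketched below. Assumptions are labeled: the parser is assumed to have placed a candidate id in `target`, `resolve_ref` and `id_sets` are hypothetical names, and the `dangling` flag key is invented for illustration.

```python
def resolve_ref(ref, id_sets):
    """Return a copy of a typed-ref whose `target` survives only for in-ARA hits.

    `id_sets` maps an in-ARA kind (e.g. "claim", "node") to its set of known ids.
    """
    out = dict(ref)
    known_ids = id_sets.get(ref.get("kind"))
    if known_ids is None:
        out["target"] = None  # off-ARA kind: stays an inert chip
        return out
    if ref.get("target") in known_ids:
        return out  # resolved inside this ARA
    out["target"] = None
    out["dangling"] = True  # hypothetical flag so the renderer can grey the chip
    return out
```

In every case `raw` is carried through unchanged, so the verbatim token is always available to show.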
|
|
181
|
+
|
|
182
|
+
Four OPTIONAL top-level keys:
|
|
183
|
+
```jsonc
|
|
184
|
+
"context": { // logic/problem.md ; omit if absent
|
|
185
|
+
"title":"Problem", "summary":"",
|
|
186
|
+
"sections":[ { "role":"setting|observations|gaps|insight|assumptions|other", "heading":"<verbatim H2>",
|
|
187
|
+
"present":true, // false ⇒ greyed "(not specified)" stub for an expected-but-absent role
|
|
188
|
+
"entries":[ { "ent_id":"O2", "inferred_id":false, "ent_title":"", "text":"<verbatim>",
|
|
189
|
+
"unstructured":false, "fields":[{"label":"<verbatim>","value":"<verbatim>"}], "refs":[/*typed-ref*/] } ] } ] },
|
|
190
|
+
"glossary": { // logic/concepts.md ; omit ⇒ popovers disabled
|
|
191
|
+
"title":"Concepts", "groups":[{"name":"","term_ids":["G07"]}],
|
|
192
|
+
"terms":[ { "term_id":"G07", "name":"<verbatim, the match key>", "aliases":[], "group":"",
|
|
193
|
+
"definition":"<verbatim, may hold $LaTeX$>", "fields":[{"label":"","value":""}], "related":["G08"], "refs":[] } ],
|
|
194
|
+
"lexicon": { "<lowercased surface>":"G07" } }, // precomputed surface→term_id for popovers
|
|
195
|
+
"dependencies": { // logic/related_work.md ; omit if absent
|
|
196
|
+
"title":"Related Work / Dependencies", "preamble":"", "legend":[{"type":"extends","gloss":""}], "attribution":"",
|
|
197
|
+
"entries":[ { "rw_id":"RW02", "inferred_id":false, "name":"", "relation_raw":"<verbatim, compound/transition kept>",
|
|
198
|
+
"relation_norm":"baseline|imports|extends|bounds|refutes|", "delta":"", "adopted":"",
|
|
199
|
+
"grounding":[/*typed-ref: pr|arxiv|doi|source|url*/], "claims":["C03"], "cross_agent":false, "is_footprint":false } ],
|
|
200
|
+
"footprint":[ {"ref_id":"16","citation":"<verbatim one-liner>"} ] },
|
|
201
|
+
"recipes": { // logic/solution/* ; omit if dir absent
|
|
202
|
+
"title":"Solution",
|
|
203
|
+
"files":[ { "file":"method.md", "role":"constraints|method|heuristics|algorithm|architecture|recipe", "file_title":"",
|
|
204
|
+
"sections":[ { "sec_id":"", "heading":"<verbatim, never normalized>", "kind":"table|steps|kv|code|math|prose",
|
|
205
|
+
"markdown":"", "steps":[], "fields":[], "code":"", "text":"<verbatim fallback>", "warn":"", "refs":[] } ] } ] }
|
|
206
|
+
```
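
The `glossary.lexicon` key above is described as a precomputed surface→term_id map. One way it might be built is sketched here; `build_lexicon` is a hypothetical helper, and the collision policy (first term in source order wins) is an assumption, since the schema does not say how clashes between surfaces are resolved.

```python
def build_lexicon(glossary):
    """Precompute lowercased surface -> term_id from term names and aliases."""
    lexicon = {}
    for term in glossary.get("terms", []):
        for surface in [term["name"], *term.get("aliases", [])]:
            key = surface.strip().lower()
            if key:
                # Assumption: on a clash, the first term in source order wins.
                lexicon.setdefault(key, term["term_id"])
    return lexicon
```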
|
|
207
|
+
|
|
208
|
+
Four OPTIONAL per-NODE fields (default `[]`):
|
|
209
|
+
```jsonc
|
|
210
|
+
"built_on": [ {"rw_id":"RW02","name":"","relation_raw":""} ], // baseline/imports/extends deps a node's claims cite
|
|
211
|
+
"rejected_here":[ {"rw_id":"RW10","name":"","relation_raw":""} ], // bounds/refutes deps
|
|
212
|
+
"recipe_refs": [ {"file":"","sec_id":"","heading":"","role":""} ], // recipe sections a node's claims cite
|
|
213
|
+
"concepts": [ "<verbatim term name>" ] // glossary terms mentioned in this node (popover anchors)
|
|
214
|
+
```
|
|
215
|
+
Field rules: every new key omittable; `text`/`definition`/`delta`/`value`/relations/headings/quotes/cells
|
|
216
|
+
**verbatim**; `relation_raw` always preserved (compound/transition survive), `relation_norm` is color-only
|
|
217
|
+
and may be `""`; `target` non-null only on in-ARA resolution (off-ARA `pr`/`arxiv`/`doi`/`§`/file:line stay
|
|
218
|
+
`null` → inert chip); no fabrication.
|
|
219
|
+
|
|
220
|
+
## Cross-linking (the binder runs AFTER nodes + claims are parsed)
|
|
221
|
+
|
|
222
|
+
The hub stays `logic/claims.md`. All linking is best-effort, computed once into lookup maps, then attached.
|
|
223
|
+
|
|
224
|
+
- **B.0 anchors:** `claimIds` (the `## C\d+` set), `nodeIds`, `nodeByClaim` (invert each node's `why[].id`),
|
|
225
|
+
`conceptNames` (lowercased `glossary` names+aliases), `rwIds`.
|
|
226
|
+
- **B.1 claim-ref resolution (single pass over all layers' refs):** `kind:"claim"` ref → `target` if
|
|
227
|
+
`∈ claimIds`, else `target:null` and **flag dangling** (renderer greys it "broken link" — never a live
|
|
228
|
+
dead link). Same for `related_work`/`concept`/`node` targets against their id sets.
|
|
229
|
+
- **B.2 dependencies ↔ claims/nodes:** forward edge is in-data (`entries[].claims[]`). Build the reverse
|
|
230
|
+
`claim_id → [rw_id]`; also harvest `claims.md`'s own `**Dependencies.**` lines as extra rw-edges (so the
|
|
231
|
+
edge exists even when related_work omits the back-ref).
|
|
232
|
+
- **B.3 per-node `built_on` / `rejected_here` (the chips):** `rwSet = ⋃ over n.why[].id of reverse(claim→rw)`.
|
|
233
|
+
For each rw, bucket by `relation_norm`: `∈{baseline,imports,extends}` → `built_on`; `∈{bounds,refutes}` →
|
|
234
|
+
`rejected_here`; when `relation_norm` is `""` or the relation is a transition, bucket by the first relation keyword, defaulting to `built_on`, and **keep `relation_raw` literal**.
|
|
235
|
+
Dedupe by `rw_id`; empty buckets stay `[]` (no chip row). Cross-agent deps render a distinct "↔ other-agent"
|
|
236
|
+
ribbon.
|
|
237
|
+
- **B.4 concepts ↔ everything:** concepts are leaf provenance (no `trace:N` refs). Build `node.concepts[]` by
|
|
238
|
+
**sideways name-match** — whole-word, case-insensitive scan of each node's `title`+`body`+claim statements
|
|
239
|
+
against `conceptNames`. The Concept overlay's "used by" is that scan inverted, marked **"mentions
|
|
240
|
+
(inferred)"** and visually distinct from id-resolved links. Outbound concept `refs` resolve normally.
|
|
241
|
+
- **B.5 recipes ↔ claims/nodes (`recipe_refs[]`):** a node's `recipe_refs` = recipe sections whose `refs[]`
|
|
242
|
+
cite one of the node's `why` claims (∪ sections a node's claim cites back); dedupe by `(file,sec_id)`.
|
|
243
|
+
Unresolvable `C##` ⇒ inert muted chip ("referenced but not linked"), never dropped.
|
|
244
|
+
- **B.6 determinism & cost:** all maps built once, O(nodes+claims+refs); stable source order; a wrong/no
|
|
245
|
+
match never blocks rendering. The browser-side `lexicon` is precomputed so no source regex runs at runtime.
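
Steps B.2 and B.3 above can be sketched as a reverse map plus a bucketing pass. This is an illustration under assumptions, not the binder itself: the function names are hypothetical, "first keyword" is read as the first known relation word found in `relation_raw`, and the extra edges harvested from `claims.md` are left out for brevity.

```python
import re

BUILT_ON = {"baseline", "imports", "extends"}
REJECTED = {"bounds", "refutes"}


def claim_to_rw(dependencies):
    """B.2: invert entries[].claims[] into claim_id -> [dependency entry]."""
    reverse = {}
    for entry in dependencies.get("entries", []):
        for claim_id in entry.get("claims", []):
            reverse.setdefault(claim_id, []).append(entry)
    return reverse


def bucket_of(entry):
    """B.3: pick the chip bucket for one related-work entry."""
    norm = entry.get("relation_norm", "")
    if norm in REJECTED:
        return "rejected_here"
    if norm in BUILT_ON:
        return "built_on"
    # Empty or transition relation: use the first known keyword in relation_raw.
    for word in re.findall(r"[a-z]+", entry.get("relation_raw", "").lower()):
        if word in REJECTED:
            return "rejected_here"
        if word in BUILT_ON:
            return "built_on"
    return "built_on"


def attach_chips(node, reverse):
    """Fill node['built_on'] / node['rejected_here'], deduped by rw_id."""
    buckets = {"built_on": [], "rejected_here": []}
    seen = set()
    for claim in node.get("why", []):
        for entry in reverse.get(claim["id"], []):
            if entry["rw_id"] in seen:
                continue
            seen.add(entry["rw_id"])
            buckets[bucket_of(entry)].append({
                "rw_id": entry["rw_id"],
                "name": entry.get("name", ""),
                "relation_raw": entry.get("relation_raw", ""),  # kept literal
            })
    node.update(buckets)
    return node
```

Each map is built once and then only read, which matches the one-pass cost described in B.6.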
|
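
The sideways name-match in B.4 might be implemented along these lines at bind time. It is a sketch with labeled assumptions: `match_concepts` and `used_by` are hypothetical names, and "whole-word" is interpreted as "not adjacent to another word character", using lookarounds so that terms containing punctuation still match.

```python
import re


def match_concepts(node, glossary):
    """Return glossary term names mentioned in a node's title, body, or claim statements."""
    parts = [node.get("title", ""), node.get("body", "")]
    parts += [claim.get("statement", "") for claim in node.get("why", [])]
    haystack = " ".join(parts)
    found = []
    for term in glossary.get("terms", []):
        for surface in [term["name"], *term.get("aliases", [])]:
            if not surface:
                continue
            # Whole-word, case-insensitive: no word character on either side.
            pattern = r"(?<!\w)" + re.escape(surface) + r"(?!\w)"
            if re.search(pattern, haystack, flags=re.IGNORECASE):
                found.append(term["name"])  # verbatim term name, once per term
                break
    return found


def used_by(nodes, glossary):
    """Invert the scan: term name -> node ids that mention it (inferred links)."""
    inverted = {}
    for node in nodes:
        for name in match_concepts(node, glossary):
            inverted.setdefault(name, []).append(node["id"])
    return inverted
```

The regex here runs only while the payload is assembled; the viewer would still rely on the precomputed `lexicon` at runtime.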