@ara-commons/ara-skills 0.3.0 → 0.4.0

@@ -0,0 +1,172 @@
1
+ ---
2
+ name: research-visualizer
3
+ description: |
4
+ Research Visualizer. Renders an existing Agent-Native Research Artifact (ARA) into ONE
5
+ self-contained, interactive HTML file showing the AI scientist's step-by-step research process:
6
+ a clickable process map of the exploration tree (branches and dead ends included) on the left,
7
+ and a per-step drill-down on the right — what the step did, why (the linked claim), the real
8
+ result (verbatim grounded numbers + inline figures + tables), and the code/artifact pointer.
9
+ Read-only consumer of the artifact — it never changes how research is done.
10
+ When the ARA carries them, it also surfaces (each optional, only when present) the related-work
11
+ dependency graph, the problem framing, a concepts glossary with in-text term popovers, and the
12
+ solution recipes — reached from header disclosures without leaving the process map.
13
+ Accepts either an existing ARA or raw research input (a paper, repo, run logs, or notes); when the
14
+ input is not yet an ARA it is compiled into one first, then visualized.
15
+
16
+ TRIGGERS: visualize, visualizer, trajectory view, render the ARA, see the steps, step-by-step view,
17
+ process map, replay the trajectory, watch the agent work, drill into steps,
18
+ visualize a paper, visualize a repo, visualize a run
19
+ argument-hint: "[ara-dir] [--output <path>]"
20
+ allowed-tools: Read, Write, Edit, Glob, Grep, Bash(python3 *|base64 *|find *|ls *|open *)
21
+ metadata:
22
+ author: ara-commons
23
+ category: research-tooling
24
+ version: "1.0.0"
25
+ tags: [research, visualization, trajectory, exploration-tree, html]
26
+ ---
27
+
28
+ # Research Visualizer
29
+
30
+ You render an existing ARA into a single portable HTML view of the agent's step-by-step process.
31
+ You are a **read-only consumer**: you read the artifact and emit a file; you never edit the ARA.
32
+
33
+ You operate as a first-class agent — use your native tools directly. The heavy rendering logic is
34
+ **already written** in `references/trajectory-template.html`; you do NOT rewrite it. Your job is to
35
+ parse the ARA into one `ARA_DATA` JSON object, inline the figures, and inject that object into the
36
+ template's data slot.
37
+
38
+ ## What you produce
39
+
40
+ One self-contained file, default `<ara-dir>/trajectory.html` (override with `--output`):
41
+ - All data, tables, and figures (base64) inlined — no server, no network, no CDN. Double-click to open.
42
+ - Built by populating the canonical scaffold, so every generated view is structurally consistent.
43
+
44
+ ## v1 boundaries (do not exceed)
45
+
46
+ - **Post-hoc** visualization of a finished/in-progress ARA. No live/real-time mode.
47
+ - **Self-contained from the ARA directory alone.** Do NOT open or inline anything outside the ARA dir.
48
+ `src/artifacts.md` run-store pointers and node `source_refs` (external journal `file:line`) are shown
49
+ **as pointers/chips, not resolved**. (External resolution is a planned future extension — out of scope.)
50
+ - **Single ARA.** No cross-ARA comparison.
51
+
52
+ ## Pipeline
53
+
54
+ 1. **Args.** Resolve `<ara-dir>` (default: the ARA in the current working context / most-recently
55
+ referenced). Resolve `--output` (default `<ara-dir>/trajectory.html`).
56
+ 1b. **Precondition — the input must be an ARA; if it is not, compile it first.** Decide with one
57
+ observable test: does the resolved input expose a parseable `trace/exploration_tree.yaml`
58
+ (**≥1 node**), either directly or via a standard ARA directory layout?
59
+ - **It is an ARA** → continue to Validate unchanged.
60
+ - **It is not an ARA** — the input is raw research material (a paper/PDF, a code repository, a
61
+ run/log directory, notes, or any directory with no exploration tree) → **invoke the `compiler`
62
+ skill on that input to produce an ARA**, then set `<ara-dir>` to the compiler's output artifact
63
+ and continue. Do not hand-roll an ARA yourself; the compiler is the only path that builds one.
64
+ Default `--output` to `<compiled-ara-dir>/trajectory.html` unless the user set it.
65
+ Only if the compiler still yields no exploration tree does the Validate step's "no process" message apply.
66
+ 2. **Validate — the exploration tree is the ONLY hard requirement.** Confirm
67
+ `trace/exploration_tree.yaml` exists and parses to **≥1 node**; if not (and the precondition's
68
+ compile step has already run), tell the user there is no process to show (this replaces the old
69
+ `PAPER.md` "is-this-an-ARA?" guard). Everything else —
70
+ `PAPER.md`, `logic/`, `src/`, `evidence/`, and the four enrichment layers — is **optional
71
+ enrichment**: glob whatever is present. If `PAPER.md` is absent, synthesize a minimal `meta` (title
72
+ from a tree-level `title:` or the dir name; empty `abstract` hides the disclosure). This is the
73
+ **raw-trajectory path**: the skill produces a useful step-by-step view from *just the tree* (a raw
74
+ agent run), not only a fully-compiled ARA — see `references/parsing.md` §7.
75
+ 3. **Parse the trace** into normalized nodes. The field conventions vary across ARAs — follow
76
+ `references/parsing.md` exactly (handles `tree:` vs `root:`, generic vs type-named fields,
77
+ `evidence:` routing, `isolated`, `also_depends_on`). Every node must yield a `title` + `body`.
78
+ 4. **Parse the hub layers — each only when present (all optional now):** `logic/claims.md` (the
79
+ binding hub *when it exists*), `logic/experiments.md`, `evidence/README.md` (figure/table ↔ claim
80
+ reverse index), `src/artifacts.md`, `logic/solution/*`. A missing layer simply contributes nothing;
81
+ the node still renders from its own `title`/`body`/`thinking`.
82
+ **Also parse the four OPTIONAL enrichment layers when present, per `references/parsing.md` §8:**
83
+ `logic/problem.md`→`context`, `logic/concepts.md`→`glossary` (+ build the `lexicon`),
84
+ `logic/related_work.md`→`dependencies`, `logic/solution/*.md`→`recipes` (role-classify by content, not
85
+ filename). A missing file/dir omits its key entirely. Reproduce statements/deltas/definitions/relations/
86
+ headings/quotes/cells **verbatim**.
87
+ 5. **Build each node's drill-down.** When `logic/claims.md` exists, follow the claim-hub chain in
88
+ `references/binding.md` (node → `evidence:[C##]` → claims → {Sources quotes, figures/tables,
89
+ experiments, artifact pointers}). When it does **not**, the drill-down is just the node's own
90
+ narrative (`thinking`/`body`) — every claim/result/verified block is empty and omitted.
91
+ 5b. **Bind the enrichment layers** per the "four enrichment layers" section of `references/binding.md`:
92
+ build `claimIds`/`nodeByClaim`/`conceptNames`/`rwIds`; resolve every `refs[].target` (set dangling targets to `null` and flag them,
93
+ never link off-ARA); derive each node's `built_on`/`rejected_here` (dependency→claim→node, bucketed by
94
+ `relation_norm`), `concepts` (whole-word name-match), and `recipe_refs` (recipe→claim→node); mark
95
+ cross-agent entries. All per-node enrichment fields default `[]`.
96
+ 6. **Inline figures.** For each referenced figure that has a real raster (`evidence/figures/*.png`),
97
+ base64-encode it and put the `data:` URI in `figures[].img`. Use Bash, e.g.
98
+ `python3 -c "import base64,sys;print('data:image/png;base64,'+base64.b64encode(open(sys.argv[1],'rb').read()).decode())" <path>`.
99
+ For data-only figure markdown (no raster), render its data table instead (as a `tables[]` entry).
100
+ **Also inline code diffs + the artifact index:** for each node with `code_change.diff_file`, read that
101
+ tracked `evidence/changes/<id>.diff.md` sidecar and inline its fenced diff text into `code_change.diff`
102
+ (parallel to figures); build the top-level `artifacts[]` index from `src/artifacts.md` so
103
+ `base_artifact`/`variant_artifact` ids resolve; carry each node's verbatim `thinking` straight through.
104
+ Sanitize all three (the inlined diff text, the `artifacts[]` entries, and `thinking`) per the Injection contract. The visualizer never computes a diff
105
+ itself and never opens the external store — it only inlines what the ARA already contains.
106
+ 7. **Assemble `ARA_DATA`** (exact schema in `references/binding.md`) and **inject** it: replace ONLY
107
+ the JSON between `/* __ARA_DATA_BEGIN__ */` and `/* __ARA_DATA_END__ */` in the
108
+ `<script id="ara-data">` block of a copy of the template. Write the result to the output path.
109
+ Include `context`/`glossary`/`dependencies`/`recipes` and the per-node `built_on`/`rejected_here`/
110
+ `recipe_refs`/`concepts` **only when their sources exist** — omit absent keys entirely (no empty
111
+ stubs). A payload omitting all of them stays byte-compatible with the v1.0 schema.
112
+ 8. **Report** the output path. Optionally open it (`open <path>` on macOS). Print a one-line summary
113
+ (node count, dead ends, figures inlined, which of the four enrichment overlays were emitted with their
114
+ term/dependency/recipe counts, danglers flagged, any pointers left unresolved).
115
+
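The figure-inlining part of step 6 can also be written as a small helper instead of a shell one-liner. This is a minimal sketch that assumes PNG rasters only; the function name `figure_data_uri` is hypothetical and not part of the skill.

```python
import base64
from pathlib import Path


def figure_data_uri(png_path: str) -> str:
    """Return a data: URI for a PNG raster, suitable for figures[].img."""
    raw = Path(png_path).read_bytes()
    # b64encode returns bytes; decode to ASCII text so it can sit in a JSON string.
    encoded = base64.b64encode(raw).decode("ascii")
    return "data:image/png;base64," + encoded
```

The returned string goes straight into `figures[].img`; a figure with no raster would skip this helper and fall back to a `tables[]` entry as the step describes.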
116
+ ## Injection contract (critical)
117
+
118
+ - The injected payload MUST be valid JSON (it is read with `JSON.parse`). The template strips only the
119
+ two named marker comments before parsing, so the payload is otherwise pure JSON.
120
+ - It must not contain the literal substring `</script>`, nor the literal marker strings
121
+ `/* __ARA_DATA_BEGIN__ */` / `/* __ARA_DATA_END__ */`. Escape any `<` in inlined markdown/text as `&lt;`
122
+ (or `\u003c`); this also neutralizes `</script>`. (A bare `*/` inside a string value is harmless to
123
+ `JSON.parse`; only the exact marker strings would be stripped.)
124
+ - **The verbatim free-text fields `thinking` and `code_change.diff` are the high-risk carriers** (source
125
+ code routinely contains `/* … */`). If either marker token would appear in their text, break it (e.g.
126
+ insert a zero-width space inside `__ARA_DATA_…`) so the global marker-strip can't delete it from inside
127
+ a value. Re-validate: a node whose `thinking`/`diff` contains a marker token MUST round-trip intact.
128
+ - Do not touch anything else in the template — only the bytes between the two markers.
129
+ - After writing, re-validate: the file still parses (the embedded JSON loads). If a figure pushed the
130
+ file very large, apply the size guards in `references/binding.md` (truncate logs/tables, keep figures).
131
+
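The contract above can be sketched in a few lines. This is an illustration under stated assumptions, not the skill's actual code: the helper names (`serialize_payload`, `inject`, `extract_payload`) are hypothetical, each marker is assumed to occur exactly once in the template, and the sketch uses the `\u003c` JSON escape rather than `&lt;`.

```python
import json

BEGIN = "/* __ARA_DATA_BEGIN__ */"
END = "/* __ARA_DATA_END__ */"


def serialize_payload(ara_data: dict) -> str:
    """Serialize ARA_DATA so it is safe inside the <script> data slot."""
    text = json.dumps(ara_data, ensure_ascii=False)
    # In valid JSON a "<" can only occur inside a string, so the \u003c
    # escape is always legal and removes every literal "</script>".
    text = text.replace("<", "\\u003c")
    # Break marker tokens with an escaped zero-width space so the
    # template's marker-strip cannot match inside a value.
    text = text.replace("__ARA_DATA_", "__ARA_\\u200bDATA_")
    return text


def inject(template: str, ara_data: dict) -> str:
    """Replace only the text between the two markers."""
    start = template.index(BEGIN) + len(BEGIN)
    stop = template.index(END, start)
    return template[:start] + serialize_payload(ara_data) + template[stop:]


def extract_payload(html: str) -> dict:
    """Re-validate: the embedded JSON must still load."""
    start = html.index(BEGIN) + len(BEGIN)
    stop = html.index(END, start)
    return json.loads(html[start:stop])
```

Calling `extract_payload(inject(template, data))` after writing is one way to perform the re-validation step. Note that a value that contained a marker token comes back with a zero-width space inside the token, which is invisible when rendered but is not byte-identical to the source text.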
132
+ ## Faithfulness (hard rules)
133
+
134
+ - Reproduce claim `Statement`s, `Sources` quotes, and table numbers **verbatim** — never paraphrase,
135
+ never invent. Missing data → set the field empty/omit (the viewer shows "No …"); never fabricate.
136
+ - Provenance, `support_level`, and `status` are shown **only if present** in the source; do not guess.
137
+ - Dead-end nodes and `isolated` subtrees must be carried through faithfully — they are the most
138
+ valuable things to display, not noise to drop.
139
+ - For the enrichment layers: relation strings, definitions, constraint headings, and footprint citations
140
+ are reproduced **verbatim**; relation enums are open (compound `bounds / refutes` / transition
141
+ `extends → quarantined` kept as written; `relation_norm` is for color only). Never normalize a heading
142
+ or invent a typed sub-field. A `refs[].target` is set only on real in-ARA resolution; dangling refs are
143
+ flagged, never silently corrected or dropped. `built_on`/`concepts`/"used by" name-matches are
144
+ best-effort hints (marked "inferred"), never asserted as facts.
145
+
146
+ ## Verify
147
+
148
+ Run on any ARA and confirm these properties — no named fixtures required:
149
+ - Opens by double-click: no server, no network, no console errors.
150
+ - Full process map: nesting, branches, dead ends marked, any `isolated` subtree boxed, `depends_on` chips.
151
+ - Drill-down renders whichever blocks are present (what / why / result-with-inline-figure / how-verified /
152
+ code-or-pointer), correctly under **both** field dialects in `references/parsing.md`.
153
+ - Verbatim quotes/numbers; nothing fabricated; self-contained from the ARA dir (no external refs needed).
154
+ - Re-running reproduces the same structure (data differs only as the ARA differs).
155
+ - **Enrichment layers:** a layer's header button appears only when its source exists; an ARA with none of
156
+ the four layers renders identically to v1.0 (no layer bar, no node chips). Open each emitted overlay and
157
+ confirm verbatim relations/definitions/recipe cells, ungrounded/dangling/cross-agent markers, and that
158
+ the `built_on`/`rejected_here` chips + the `⊕/⊘` map marker deep-link into Dependencies. Glossary
159
+ popovers fire on body terms; inline `$LaTeX$` renders with no network.
160
+ - **Degradation:** a `minimal-artifact` (only `problem.md`) shows only the Context button, others absent,
161
+ popovers off, per-node chips empty, zero console errors.
162
+ - **Compile-first (non-ARA input):** pointing the skill at raw research material with no exploration
163
+ tree (a paper, a repo, a run/log dir) triggers the `compiler` skill first, then visualizes the
164
+ resulting ARA — the output is identical to running the compiler then the visualizer by hand.
165
+ - **Raw trajectory (the decoupled path):** a **tree-only** ARA — just `trace/exploration_tree.yaml`, no
166
+ `PAPER.md`, no `logic/`, no `evidence/` — still renders the full process map + each step's narrative
167
+ (`thinking`/`body`), with no layer bar and no claim/result/verified blocks, zero console errors. This
168
+ is a first-class supported input, not a failure mode.
169
+
170
+ Cover the variant axes with whatever ARAs you have: both root forms (`tree:`/`root:`), both field
171
+ dialects (generic / type-named), figures present as real raster vs. data-markdown-only, `src/` as a
172
+ pointer index vs. transcribed code, and an `isolated` subtree if any artifact has one.
@@ -0,0 +1,245 @@
1
+ # Binding — the claim-hub drill-down chain + the `ARA_DATA` schema
2
+
3
+ This is the read step. For each normalized trace node you build one drill-down bundle by following
4
+ the artifact's **claim-mediated** cross-layer links, then you emit the whole thing as one `ARA_DATA`
5
+ object that the scaffold renders.
6
+
7
+ > **The claim hub is OPTIONAL — the tree is the only hard requirement.** Every layer this file binds
8
+ > (`logic/claims.md`, `logic/experiments.md`, `evidence/`, `src/artifacts.md`, the four enrichment
9
+ > layers) is enrichment: when its dir/file is absent, that binding step **no-ops** and the
10
+ > corresponding `ARA_DATA` arrays stay `[]` (omitted by the renderer). A node ALWAYS renders from its
11
+ > own normalized `title` / `body` / `thinking` (parsing.md §3, §7a) — so a **raw, tree-only
12
+ > trajectory** with no `logic/` and no `evidence/` produces a complete process map + per-step
13
+ > narrative. The chain below is what runs *when the hub exists*; nothing here may hard-fail on an
14
+ > absent layer.
15
+
16
+ ## Why claim-mediated (and not node → `.py` + table)
17
+
18
+ Current ARAs usually have **no per-step source file**: `src/` is `artifacts.md` (a pointer index into
19
+ an external run store) + `environment.md`. The binding hub is `logic/claims.md`. A trace node points
20
+ at claims via `evidence: [C##]`; the claim carries the *why*, the grounded *result* (verbatim
21
+ `Sources` quotes with `file:line`), the verification (`Proof: [E##]`), and the figures/tables that
22
+ cite it. So the real grounded numbers are **already inside the ARA** — you do not need the external
23
+ store (v1 leaves those as pointers).
24
+
25
+ ## Resolution chain (per node)
26
+
27
+ ```
28
+ NODE (type, content+outcome, support_level, source_refs, evidence:[C##], also_depends_on, isolated)
29
+ ├─ WHAT → node title + body (+ outcome) ............................ → node.title / node.body
30
+ ├─ WHY → evidence:[C##] → logic/claims.md ......................... → node.why[]
31
+ │ each claim: Statement, Status, Conditions, Falsification,
32
+ │ Dependencies, provenance
33
+ ├─ HOW VERIFIED→ claim.Proof:[E##] → logic/experiments.md ................. → node.verified_by[]
34
+ │ each experiment: Run (pointer), Setup, Metrics
35
+ ├─ RESULT → claim.Sources «quote ← file:line» + Evidence basis ....... → node.result.sources[]
36
+ │ + evidence/README.md reverse-lookup: figures/tables citing C## → node.result.figures[]/tables[]
37
+ │ + evidence/data/*.json (raw, in-artifact) .................. → node.result.data[]
38
+ └─ CODE/ARTIFACT→ src/artifacts.md pointer(s) + logic/solution/* recipe ... → node.artifact[]
39
+ ```
40
+
41
+ ### evidence/README.md reverse-lookup
42
+ `evidence/README.md` has a `Claims` column mapping each figure/table file → the `C##` it grounds.
43
+ Build a map `claim_id → [evidence files]` once. For a node, union the evidence of all its claims.
44
+ Fallback if no README row: scan the figure/table `.md` for an inline `Supports C##` / `grounds C##`.
45
+
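The reverse lookup can be sketched as a single pass over the README table. The sketch below assumes the table is a plain pipe-delimited markdown table and that the evidence file sits in the first column; only the `Claims` column name comes from the text above, and the function names are hypothetical.

```python
import re

CLAIM_ID = re.compile(r"\bC\d+\b")


def claim_evidence_map(readme_text: str) -> dict:
    """Map claim_id -> [evidence files] from a markdown table with a Claims column."""
    mapping = {}
    header = None
    for line in readme_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            header = None  # a non-table line ends the current table
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = [cell.lower() for cell in cells]
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # the |---|---| separator row
        if "claims" not in header or len(cells) != len(header):
            continue
        row = dict(zip(header, cells))
        evidence_file = cells[0].strip("`")  # ASSUMPTION: first column is the file
        for claim_id in CLAIM_ID.findall(row["claims"]):
            files = mapping.setdefault(claim_id, [])
            if evidence_file not in files:
                files.append(evidence_file)
    return mapping


def evidence_for_node(claim_ids, mapping):
    """Union the evidence of all of a node's claims, keeping first-seen order."""
    seen = []
    for claim_id in claim_ids:
        for evidence_file in mapping.get(claim_id, []):
            if evidence_file not in seen:
                seen.append(evidence_file)
    return seen
```

The map is built once and reused for every node; the inline `Supports C##` fallback is left out of this sketch.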
46
+ ### Figures
47
+ - Real raster (`evidence/figures/<name>.png` beside the `.md`) → base64-inline into `figures[].img`.
48
+ - Data-only figure markdown (no raster) → render its data table as a `tables[]` entry instead.
49
+ - A figure's caption/title comes from the first heading or the `What it shows` section of its `.md`.
50
+
51
+ ### What counts as "code" now
52
+ There is usually no `src/execution/*.py`. Populate `artifact[]` from `src/artifacts.md` (the run-index
53
+ row / `record_configs/` path / named submitted variant file) plus the relevant
54
+ `logic/solution/*.md` recipe section. Only when a real transcribed `src/execution/*.py` exists
55
+ (legacy / paper-only code) do you point at that file. **Never resolve the external store in v1** —
56
+ the pointer text is the value.
57
+
58
+ ### The changed-code diff (`node.code_change`) — compiler-produced, visualizer-rendered
59
+ A full ARA may carry, per experiment node, the **unified diff** the step represents. The addresses
60
+ live in ONE place and are referenced by id (weak coupling):
61
+
62
+ `node.code_change` → `evidence/changes/<node-id>.diff.md` (the diff **text**) → `artifacts[]` /
63
+ `src/artifacts.md` entry (path + sha256 + original location) → the original repo.
64
+
65
+ - The diff **text** is grounded by citing the two **artifact ids** (`base_artifact`, `variant_artifact`),
66
+ never an embedded path. Whole scripts stay pointers in `src/artifacts.md` (Rule 14) — the diff is a
67
+ derived, grounded view (≈ a `derived_subset` table), not a copy of the artifact.
68
+ - **`diff_file` → `diff` inlining** (parallel to figures' `.md`→base64 `img`): on disk the node carries
69
+ `code_change.diff_file: "evidence/changes/<id>.diff.md"`; the visualizer reads that **tracked** sidecar
70
+ and inlines its fenced diff text into `code_change.diff` in `ARA_DATA`, so the rendered HTML stays
71
+ self-contained (the sidecar lives inside the ARA dir).
72
+ - **`artifactById`**: the visualizer builds an `id → artifacts[] entry` map (parallel to `nodeByClaim`)
73
+ and resolves `base_artifact`/`variant_artifact` into the shown-not-resolved pointer chip under the diff.
74
+ - **Degrade**: when the scripts don't resolve at compile time (store absent), the compiler emits
75
+ `code_change` with the artifact ids + a `note` but no diff; the viewer shows a pointer chip, not a diff.
76
+ - **Marker safety**: `diff` and `thinking` are verbatim, so the producer MUST ensure neither the literal
77
+ `/* __ARA_DATA_BEGIN__ */` / `/* __ARA_DATA_END__ */` tokens nor `</script>` appears in any inlined
78
+ string (escape `<`→`&lt;`; break the marker tokens). See SKILL.md "Injection contract".
79
+
80
+ ## The `ARA_DATA` object (exact schema the scaffold reads)
81
+
82
+ ```jsonc
83
+ {
84
+ "meta": {
85
+ "title": "<PAPER.md frontmatter title>",
86
+ "authors": ["..."], // [] if none
87
+ "year": "", "venue": "", "ara_dir": "<dir name>",
88
+ "abstract": "<PAPER.md abstract>" // "" to hide the Abstract disclosure
89
+ },
90
+
91
+ // Traversal order for Replay and ← →. A flat list of node ids in the order to step through
92
+ // (typically a pre-order DFS of the tree). If omitted, the scaffold derives a DFS from `parent`.
93
+ "order": ["N01", "N02", "N03", "..."],
94
+
95
+ // OPTIONAL addressable artifact index (from src/artifacts.md). Each script the compiler points at
96
+ // gets a stable id so a node's code_change can reference it BY ID (no embedded path). Omit if absent.
97
+ "artifacts": [
98
+ { "id":"A01", "name":"<artifact name>", "path":"<repo-relative path>", "sha256":"<...>",
99
+ "original_location":"<store/repo ref>", "pointer":"<src/artifacts.md pointer text>" }
100
+ ],
101
+
102
+ "nodes": [
103
+ {
104
+ "id": "N02",
105
+ "type": "experiment", // question|experiment|decision|dead_end|pivot|insight|<other ok>
106
+ "parent": "N01", // id of the nesting parent, or null for a root
107
+ "title": "<normalized step title>", // see parsing.md
108
+ "body": "<what the step did / its outcome>",
109
+ "thinking": "<verbatim agent deliberation — why it did/branched; OPTIONAL>", // primary block; falls back to body
110
+ "support_level": "explicit", // "explicit"|"inferred"|null
111
+ "isolated": false, // true → rendered in a separated dashed box
112
+ "depends_on": ["N00"], // also_depends_on cross-edges (ids); [] if none
113
+ "source_refs": ["<external path:line — shown as a chip, NOT resolved>"],
114
+
115
+ "why": [ // from evidence:[C##] → claims.md
116
+ { "id":"C01", "title":"...", "statement":"<verbatim Statement>",
117
+ "status":"supported", "conditions":"...", "falsification":"...",
118
+ "dependencies":["C00"], "provenance":"<as stated in source, else null>" }
119
+ ],
120
+
121
+ "result": {
122
+ "sources": [ { "quote":"<verbatim Sources quote>", "ref":"<file:line>" } ],
123
+ "figures": [ { "id":"<figure id>", "caption":"...", "kind":"quantitative_plot",
124
+ "img":"data:image/png;base64,<...>" } ],
125
+ "tables": [ { "id":"<table id>", "caption":"...", "markdown":"| col | ... |\n|---|...|" } ],
126
+ "data": [ { "id":"<data id>", "path":"evidence/data/<name>.json", "note":"raw" } ]
127
+ },
128
+
129
+ "verified_by": [ // from claim.Proof:[E##] → experiments.md
130
+ { "id":"E01", "title":"...", "run":"<Run pointer text>", "setup":"...", "metrics":"..." }
131
+ ],
132
+
133
+ "artifact": [ // src/artifacts.md pointers + solution recipe refs (pointer text only)
134
+ { "name":"<artifact / family name>", "pointer":"<src/artifacts.md pointer text>", "what":"pointer index entry" }
135
+ ],
136
+
137
+ "code_change": { // OPTIONAL — the changed-code diff for this step (compiler-produced)
138
+ "base_artifact":"A01", // → artifacts[].id (holds path+sha+original_location)
139
+ "variant_artifact":"A07", // → artifacts[].id
140
+ "lang":"python",
141
+ "diff":"<unified-diff text, inlined by the visualizer from evidence/changes/<id>.diff.md>",
142
+ "note":"" // set (with diff absent) when the scripts didn't resolve → pointer-only chip
143
+ }
144
+ }
145
+ // ... one object per trace node
146
+ ]
147
+ }
148
+ ```
149
+
150
+ ### Field rules
151
+ - Every node MUST have `id`, `type`, `title`, `parent` (or null). All other arrays default to `[]`,
152
+ scalars to `null`/`""`. The scaffold tolerates missing optional fields.
153
+ - Put **only what the source contains**. Empty `why`/`result`/`verified_by`/`artifact` is fine and
154
+ common (e.g. a bare `decision` node) — the viewer simply omits those blocks. `thinking` and
155
+ `code_change` are likewise optional; omit when absent (a payload without them is byte-compatible).
156
+ - `status` is lower-cased by the viewer for styling; pass it as written (`Supported`, `hypothesis`, …).
157
+
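A quick pre-injection check of the first field rule might look like the sketch below. The function name `missing_required` is hypothetical, and treating an empty `id`/`type`/`title` as missing is an assumption layered on the rule that every node must yield a title.

```python
REQUIRED = ("id", "type", "title", "parent")


def missing_required(nodes):
    """Return {node id or list index: [offending keys]} for nodes that break the rule."""
    problems = {}
    for index, node in enumerate(nodes):
        missing = [key for key in REQUIRED if key not in node]
        # `parent` may be None (a root), but id/type/title must be non-empty.
        missing += [key for key in ("id", "type", "title") if key in node and not node[key]]
        if missing:
            problems[node.get("id") or index] = missing
    return problems
```

An empty result means every node carries the four required keys; optional keys are deliberately not checked because the scaffold tolerates their absence.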
158
+ ### Size guards
159
+ - Inline every figure (they are the point), but cap a single inlined table/log render at a few hundred
160
+ lines; if longer, truncate and append a final row/line `… +N more (truncated)`.
161
+ - If the assembled file would be very large, prefer truncating `tables`/`data` text over dropping
162
+ figures, and never drop nodes.
163
+
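The per-table/log cap could be applied with a helper along these lines. The default of 300 lines is an assumed reading of "a few hundred", and `truncate_lines` is a hypothetical name.

```python
def truncate_lines(text: str, max_lines: int = 300) -> str:
    """Cap text at max_lines and append the '… +N more (truncated)' marker line."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text  # short enough: return unchanged
    dropped = len(lines) - max_lines
    kept = lines[:max_lines]
    kept.append(f"… +{dropped} more (truncated)")
    return "\n".join(kept)
```

For a markdown table the marker is appended as a plain line here; emitting it as a proper table row would need the column count, which this sketch does not track.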
164
+ ---
165
+
166
+ # The four enrichment layers — schema additions + cross-linking
167
+
168
+ These surface `logic/problem.md` / `logic/concepts.md` / `logic/related_work.md` / `logic/solution/*`
169
+ (parsed per `parsing.md` §8). **Everything here is OPTIONAL and additive** — omit any key whose source
170
+ is absent; a payload with none of these stays byte-compatible with the prior schema and the renderer is
171
+ inert for any key it doesn't see.
172
+
173
+ ## Schema additions
174
+
175
+ A shared **typed-ref** primitive (used by every layer's `refs`/`grounding`):
176
+ ```jsonc
177
+ { "raw": "<verbatim token, always shown>",
178
+ "kind": "claim|concept|related_work|observation|gap|assumption|experiment|node|source|figure|pr|arxiv|doi|url|unknown",
179
+ "target": "<in-ARA anchor id, or null>" } // non-null ONLY when it resolves inside THIS ARA
180
+ ```
181
+
182
+ Four OPTIONAL top-level keys:
183
+ ```jsonc
184
+ "context": { // logic/problem.md ; omit if absent
185
+ "title":"Problem", "summary":"",
186
+ "sections":[ { "role":"setting|observations|gaps|insight|assumptions|other", "heading":"<verbatim H2>",
187
+ "present":true, // false ⇒ greyed "(not specified)" stub for an expected-but-absent role
188
+ "entries":[ { "ent_id":"O2", "inferred_id":false, "ent_title":"", "text":"<verbatim>",
189
+ "unstructured":false, "fields":[{"label":"<verbatim>","value":"<verbatim>"}], "refs":[/*typed-ref*/] } ] } ] },
190
+ "glossary": { // logic/concepts.md ; omit ⇒ popovers disabled
191
+ "title":"Concepts", "groups":[{"name":"","term_ids":["G07"]}],
192
+ "terms":[ { "term_id":"G07", "name":"<verbatim, the match key>", "aliases":[], "group":"",
193
+ "definition":"<verbatim, may hold $LaTeX$>", "fields":[{"label":"","value":""}], "related":["G08"], "refs":[] } ],
194
+ "lexicon": { "<lowercased surface>":"G07" } }, // precomputed surface→term_id for popovers
195
+ "dependencies": { // logic/related_work.md ; omit if absent
196
+ "title":"Related Work / Dependencies", "preamble":"", "legend":[{"type":"extends","gloss":""}], "attribution":"",
197
+ "entries":[ { "rw_id":"RW02", "inferred_id":false, "name":"", "relation_raw":"<verbatim, compound/transition kept>",
198
+ "relation_norm":"baseline|imports|extends|bounds|refutes|", "delta":"", "adopted":"",
199
+ "grounding":[/*typed-ref: pr|arxiv|doi|source|url*/], "claims":["C03"], "cross_agent":false, "is_footprint":false } ],
200
+ "footprint":[ {"ref_id":"16","citation":"<verbatim one-liner>"} ] },
201
+ "recipes": { // logic/solution/* ; omit if dir absent
202
+ "title":"Solution",
203
+ "files":[ { "file":"method.md", "role":"constraints|method|heuristics|algorithm|architecture|recipe", "file_title":"",
204
+ "sections":[ { "sec_id":"", "heading":"<verbatim, never normalized>", "kind":"table|steps|kv|code|math|prose",
205
+ "markdown":"", "steps":[], "fields":[], "code":"", "text":"<verbatim fallback>", "warn":"", "refs":[] } ] } ] }
206
+ ```
207
+
208
+ Four OPTIONAL per-NODE fields (default `[]`):
209
+ ```jsonc
210
+ "built_on": [ {"rw_id":"RW02","name":"","relation_raw":""} ], // baseline/imports/extends deps a node's claims cite
211
+ "rejected_here":[ {"rw_id":"RW10","name":"","relation_raw":""} ], // bounds/refutes deps
212
+ "recipe_refs": [ {"file":"","sec_id":"","heading":"","role":""} ], // recipe sections a node's claims cite
213
+ "concepts": [ "<verbatim term name>" ] // glossary terms mentioned in this node (popover anchors)
214
+ ```
215
+ Field rules: every new key omittable; `text`/`definition`/`delta`/`value`/relations/headings/quotes/cells
216
+ **verbatim**; `relation_raw` always preserved (compound/transition survive), `relation_norm` is color-only
217
+ and may be `""`; `target` non-null only on in-ARA resolution (off-ARA `pr`/`arxiv`/`doi`/`§`/file:line stay
218
+ `null` → inert chip); no fabrication.
219
+
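The `lexicon` and the per-node `concepts` field both come from the glossary term names and aliases. The following sketch shows one way to build them, assuming terms shaped like the `terms[]` entries above. The function names are hypothetical, and letting the first term win on a surface collision is an assumption the schema does not state.

```python
import re


def build_lexicon(terms):
    """Map each lowercased name/alias surface to its term_id (first term wins)."""
    lexicon = {}
    for term in terms:
        for surface in [term["name"], *term.get("aliases", [])]:
            key = surface.strip().lower()
            if key:
                lexicon.setdefault(key, term["term_id"])
    return lexicon


def node_concepts(node_text, terms):
    """Return verbatim names of terms mentioned in node_text (whole-word, case-insensitive)."""
    found = []
    for term in terms:
        for surface in [term["name"], *term.get("aliases", [])]:
            if not surface.strip():
                continue
            # (?<!\w) and (?!\w) act as word boundaries that also work for
            # surfaces that begin or end with punctuation.
            pattern = r"(?<!\w)" + re.escape(surface.strip()) + r"(?!\w)"
            if re.search(pattern, node_text, flags=re.IGNORECASE):
                if term["name"] not in found:
                    found.append(term["name"])
                break  # one hit per term is enough
    return found
```

Because the match is by name only, the result is a best-effort hint, in line with the "inferred" marking the viewer applies to these links.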
220
+ ## Cross-linking (the binder runs AFTER nodes + claims are parsed)
221
+
222
+ The hub stays `logic/claims.md`. All linking is best-effort, computed once into lookup maps, then attached.
223
+
224
+ - **B.0 anchors:** `claimIds` (the `## C\d+` set), `nodeIds`, `nodeByClaim` (invert each node's `why[].id`),
225
+ `conceptNames` (lowercased `glossary` names+aliases), `rwIds`.
226
+ - **B.1 claim-ref resolution (single pass over all layers' refs):** `kind:"claim"` ref → `target` if
227
+ `∈ claimIds`, else `target:null` and **flag dangling** (renderer greys it "broken link" — never a live
228
+ dead link). Same for `related_work`/`concept`/`node` targets against their id sets.
229
+ - **B.2 dependencies ↔ claims/nodes:** forward edge is in-data (`entries[].claims[]`). Build the reverse
230
+ `claim_id → [rw_id]`; also harvest `claims.md`'s own `**Dependencies.**` lines as extra rw-edges (so the
231
+ edge exists even when related_work omits the back-ref).
232
+ - **B.3 per-node `built_on` / `rejected_here` (the chips):** `rwSet = ⋃ over n.why[].id of reverse(claim→rw)`.
233
+ For each rw, bucket by `relation_norm`: `∈{baseline,imports,extends}` → `built_on`; `∈{bounds,refutes}` →
234
+ `rejected_here`; `""`/transition → first keyword, default `built_on`, **keep `relation_raw` literal**.
235
+ Dedupe by `rw_id`; empty buckets stay `[]` (no chip row). Cross-agent deps render a distinct "↔ other-agent"
236
+ ribbon.
237
+ - **B.4 concepts ↔ everything:** concepts are leaf provenance (no `trace:N` refs). Build `node.concepts[]` by
238
+ **sideways name-match** — whole-word, case-insensitive scan of each node's `title`+`body`+claim statements
239
+ against `conceptNames`. The Concept overlay's "used by" is that scan inverted, marked **"mentions
240
+ (inferred)"** and visually distinct from id-resolved links. Outbound concept `refs` resolve normally.
241
+ - **B.5 recipes ↔ claims/nodes (`recipe_refs[]`):** a node's `recipe_refs` = recipe sections whose `refs[]`
242
+ cite one of the node's `why` claims (∪ sections a node's claim cites back); dedupe by `(file,sec_id)`.
243
+ Unresolvable `C##` ⇒ inert muted chip ("referenced but not linked"), never dropped.
244
+ - **B.6 determinism & cost:** all maps built once, O(nodes+claims+refs); stable source order; a wrong/no
245
+ match never blocks rendering. The browser-side `lexicon` is precomputed so no source regex runs at runtime.
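Rule B.3 can be sketched as follows. The sketch assumes the reverse `claim_id → [rw_id]` map from B.2 already exists, and it reads "first keyword" as the keyword that appears earliest in `relation_raw`, which is an interpretation rather than something the rule spells out. The names `bucket_for` and `node_chips` are hypothetical.

```python
BUILT_ON = {"baseline", "imports", "extends"}
REJECTED = {"bounds", "refutes"}


def bucket_for(entry):
    """Pick the chip bucket for one related-work entry."""
    norm = entry.get("relation_norm", "")
    if norm in BUILT_ON:
        return "built_on"
    if norm in REJECTED:
        return "rejected_here"
    # Empty or transition relation: use the earliest keyword in the raw string.
    raw = entry.get("relation_raw", "").lower()
    hits = [(raw.find(word), word) for word in BUILT_ON | REJECTED if word in raw]
    if hits:
        first = min(hits)[1]
        return "built_on" if first in BUILT_ON else "rejected_here"
    return "built_on"  # default bucket


def node_chips(claim_ids, rw_by_claim, entries_by_id):
    """Build a node's built_on / rejected_here lists, deduped by rw_id."""
    chips = {"built_on": [], "rejected_here": []}
    seen = set()
    for claim_id in claim_ids:
        for rw_id in rw_by_claim.get(claim_id, []):
            if rw_id in seen or rw_id not in entries_by_id:
                continue
            seen.add(rw_id)
            entry = entries_by_id[rw_id]
            chips[bucket_for(entry)].append({
                "rw_id": rw_id,
                "name": entry.get("name", ""),
                # relation_raw is carried through literally, never normalized.
                "relation_raw": entry.get("relation_raw", ""),
            })
    return chips
```

Empty buckets stay `[]`, so a node whose claims cite no dependencies simply gets no chip row.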
@@ -0,0 +1,211 @@
1
+ # Parsing tolerance — normalize the exploration tree across ARA variants
2
+
3
+ The canonical `exploration-tree-spec.md` uses **generic** field names; many real artifacts instead
4
+ use **type-named** fields. Parse against the *model*, tolerate the *variants*. Never
5
+ hardcode to one example; never assume a fixed field set. The goal: **every node yields an `id`,
6
+ `type`, `title`, `body`, and a `parent`** (plus optional metadata), no matter which dialect.
7
+
8
+ ## 1. Root form
9
+
10
+ Accept both:
11
+ - `tree:` → a **list** of root nodes (canonical).
12
+ - `root:` → a **single** root node. Treat it as a one-element root list.
13
+
14
+ Children nest under each node's `children:` list. A node's `parent` is the nearest enclosing node;
15
+ top-level nodes have `parent: null`.
16
+
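Both root forms reduce to the same flat list once the YAML is loaded. The sketch below assumes the file has already been parsed into a Python dict by whatever YAML loader is available; `flatten_tree` is a hypothetical name.

```python
def flatten_tree(doc):
    """Return a flat pre-order list of nodes, each with a `parent` id (or None)."""
    if "tree" in doc:
        roots = doc["tree"] or []
    elif "root" in doc:
        roots = [doc["root"]]  # a single root becomes a one-element root list
    else:
        roots = []
    flat = []

    def visit(node, parent_id):
        # Copy every field except the nested children, then record the parent.
        entry = {key: value for key, value in node.items() if key != "children"}
        entry["parent"] = parent_id
        flat.append(entry)
        for child in node.get("children") or []:
            visit(child, node.get("id"))

    for root in roots:
        visit(root, None)
    return flat
```

The pre-order traversal also gives a natural default for the `order` list in `ARA_DATA`, and an empty result corresponds to the "no process to show" case.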
17
+ ## 2. Node identity & type
18
+
19
+ - `id` — required (any stable identifier the tree uses, e.g. `N01`, `E07`). Keep verbatim.
20
+ - `type` — one of `question | experiment | decision | dead_end | pivot | insight` (or anything else;
21
+ unknown types still render with a neutral glyph). Keep verbatim.
22
+ - `support_level` — `explicit | inferred` if present, else `null`.
23
+ - `source_refs` — list of strings if present, else `[]`. **External pointers; shown, never resolved.**
24
+ - `isolated: true` — carry through (renders in a separated box).
25
+ - `also_depends_on: [ids]` → emit as `depends_on` (DAG cross-edges).
26
+ - `thinking` — verbatim agent deliberation, **passed straight through** (the primary reasoning block).
27
+ Absent ⇒ omit. Never paraphrase or synthesize it.
28
+ - `code_change` — when the compiler wrote one onto the node (`base_artifact` / `variant_artifact` /
29
+ `lang` / `diff_file`), **pass it through**. The `diff_file`→`diff` inlining and the top-level
30
+ `artifacts[]` index are done in the binding/inline step (binding.md); the visualizer never computes a
31
+ diff itself. Absent ⇒ omit.
32
+
33
+ ## 3. Title + body normalization (the dialect bridge)
34
+
35
+ For each node, derive a **display title** and a **body** by probing known fields **in order**, per type.
36
+ Use the first non-empty match; fall back gracefully so nothing is blank.
37
+
38
+ | type | title — try in order | body — try in order |
39
+ |-------------|-----------------------------------------------|-------------------------------------------------------|
40
+ | question | `title` → `question` | `description` → `question` (if `question` was already used as the title, leave the body empty) |
41
+ | experiment | `title` → first sentence of `experiment` | `result` → `outcome` → `experiment` |
42
+ | decision | `title` → first sentence of `decision` | `choice` (+ `alternatives`) → `decision` |
43
+ | dead_end | `title` → first sentence of `dead_end` | `failure_mode`/`why_failed` (+ `hypothesis`,`lesson`) → `dead_end` |
44
+ | pivot | `title` → `trigger` | `from` → `to` → `trigger` |
45
+ | insight | `title` → first sentence of `insight` | `description` → `insight` |
46
+ | *(other)* | `title` → `<type>` field → `description` | the `<type>` field → `description` |
47
+
48
+ Rules:
49
+ - "first sentence of X" = X truncated at the first `. ` or at ~80 chars, whichever comes first; applies only when there is no separate
50
+ `title`. If a single field is both the only content and long, use a truncated form as title and the
51
+ full text as body.
52
+ - `decision`: append `alternatives` to the body as "alternatives: a; b" when present.
53
+ - `dead_end`: prefer to show *why it failed* in the body — it is the most valuable content.
54
+ - Never emit an empty title. If truly nothing usable, use the `id`.
55
+
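As a sketch of the title half of the table above (the body column would follow the same probing pattern), the lookup can be written as an ordered list of fields per type. The names `TITLE_PROBES`, `first_sentence`, and `display_title` are hypothetical, and the hard 80-character cut is one reading of "~80 chars".

```python
TITLE_PROBES = {
    "question": ["title", "question"],
    "experiment": ["title", "experiment"],
    "decision": ["title", "decision"],
    "dead_end": ["title", "dead_end"],
    "pivot": ["title", "trigger"],
    "insight": ["title", "insight"],
}
# Types whose type-named field is shortened to its first sentence as a title.
FIRST_SENTENCE_TYPES = {"experiment", "decision", "dead_end", "insight"}


def first_sentence(text: str, limit: int = 80) -> str:
    """Truncate at the first '. ' or at `limit` chars, whichever comes first."""
    text = " ".join(text.split())
    cut = text.find(". ")
    if cut != -1:
        text = text[: cut + 1]
    return text[:limit].rstrip()


def display_title(node: dict) -> str:
    node_type = str(node.get("type", ""))
    # Unknown types fall back to: title -> <type> field -> description.
    probes = TITLE_PROBES.get(node_type, ["title", node_type, "description"])
    for field in probes:
        value = node.get(field)
        if isinstance(value, str) and value.strip():
            if field == node_type and node_type in FIRST_SENTENCE_TYPES:
                return first_sentence(value)
            return value.strip()
    return str(node.get("id", ""))  # never emit an empty title
```

Keeping the probe order in data rather than in branching code makes it cheap to add a dialect without touching the lookup itself.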
56
+ ## 4. `evidence:` routing
57
+
58
+ A node's `evidence:` (and a decision's `evidence:` string) may mix kinds. Split tokens and route:
59
+ - `C\d+` → claim IDs → drive the `why` / `result` binding (see `binding.md`).
60
+ - `Figure N` / `Table N` / a figure/table filename → evidence file refs → `result.figures`/`tables`.
61
+ - `§...` / prose → keep as a `source_refs`-style chip (context only).
62
+
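A minimal sketch of the routing above. It assumes string evidence is separated by commas or semicolons and that figure and table filenames can be told apart by extension; neither assumption comes from the spec, and `route_evidence` is a hypothetical name.

```python
import re

CLAIM = re.compile(r"C\d+")
# Extension lists are illustrative assumptions, not part of the spec.
FIGURE = re.compile(r"Figure\s*\S+|\S+\.(?:png|jpe?g|svg|pdf)", re.I)
TABLE = re.compile(r"Table\s*\S+|\S+\.(?:csv|tsv)", re.I)


def route_evidence(evidence) -> dict:
    """Split an `evidence:` value into claims, figures, tables, and context chips."""
    if isinstance(evidence, list):
        tokens = evidence
    else:
        tokens = re.split(r"[,;]", evidence or "")
    routed = {"claims": [], "figures": [], "tables": [], "source_refs": []}
    for token in (str(t).strip() for t in tokens):
        if not token:
            continue
        if CLAIM.fullmatch(token):
            routed["claims"].append(token)
        elif TABLE.fullmatch(token):
            routed["tables"].append(token)
        elif FIGURE.fullmatch(token):
            routed["figures"].append(token)
        else:
            routed["source_refs"].append(token)  # prose / section refs: context only
    return routed
```

Anything that does not match a known kind lands in `source_refs`, so an unusual token is shown as context rather than lost.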
63
+ ## 5. Provenance
64
+
65
+ May appear as:
66
+ - a per-claim `provenance:` / `Provenance:` field, **or**
67
+ - prose in the `claims.md` header ("all claims are `ai-executed`; C06, C08 are `user-revised`").
68
+
69
+ Capture whichever exists and attach the right tag to each claim in `why[]`. If neither exists, omit
70
+ provenance — do not guess.
71
+
72
+ ## 6. Claims, experiments, evidence index (the hub layers)
73
+
74
+ - `logic/claims.md` — split on `## C\d+` headings. Pull `Statement`, `Status`, `Conditions`,
75
+ `Falsification criteria`, `Proof` (→ `E##`), `Sources` (verbatim «quote» + `← file:line`),
76
+ `Dependencies`. Field labels are bold-prefixed (`**Statement.**` or `- **Statement**:`) — match both.
77
+ - `logic/experiments.md` — split on `## E\d+`. Pull `Verifies` (→ `C##`), `Run`, `Setup`, `Metrics`.
78
+ - `evidence/README.md` — parse the Tables/Figures index to build `claim_id → [evidence files]`.
79
+
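The claim split can be sketched as follows. This is a simplified illustration: it captures only the first line of each field value, and `parse_claims` is a hypothetical helper name.

```python
import re

HEADING = re.compile(r"^## (C\d+)\b.*$", re.M)
# Matches both `**Statement.** value` and `- **Statement**: value`.
FIELD = re.compile(
    r"^\s*(?:-\s*)?\*\*(?P<label>[^*]+?)\.?\*\*:?[ \t]*(?P<value>.*)$", re.M
)


def parse_claims(markdown: str) -> dict:
    """Return {claim_id: {label: first-line value}} for each `## C<n>` block."""
    claims = {}
    heads = list(HEADING.finditer(markdown))
    for i, head in enumerate(heads):
        end = heads[i + 1].start() if i + 1 < len(heads) else len(markdown)
        block = markdown[head.end():end]
        claims[head[1]] = {
            m["label"].strip(): m["value"].strip() for m in FIELD.finditer(block)
        }
    return claims
```

The same shape would plausibly work for `logic/experiments.md` by swapping the heading pattern to `E\d+`.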
80
+ ## 7. Degrade, don't fail — the tree is the only hard requirement
81
+
82
+ Any missing/oddly-shaped field → fall back per the table in §3 and continue. A smaller honest view
83
+ is correct. **Hard-stop on exactly one condition: `trace/exploration_tree.yaml` is absent or parses to
84
+ zero nodes** (nothing to show). Minimal-validity guard = *the tree parses AND yields ≥1 node* — this
85
+ replaces the old "`PAPER.md` missing ⇒ not an ARA" guard, which no longer hard-stops (a missing
86
+ `PAPER.md` just means the visualizer synthesizes a minimal `meta` from a tree-level `title:` / the dir
87
+ name). Everything else — `logic/`, `evidence/`, `src/`, the four enrichment layers — is optional; absent
88
+ ⇒ contributes nothing, never an error.
89
+
90
+ ### 7a. Raw-trajectory input mode (first-class)
91
+
92
+ The minimum a node needs to render a useful step is `id` + (`title` **or** a type-named text field);
93
+ `body` / `thinking` are optional but make the step legible. So a **bare exploration tree with no
94
+ `logic/`, no `evidence/`, and no `PAPER.md`** — i.e. a raw agent run — is a fully supported input, not a
95
+ degraded one. Each node renders from its own normalized `title` / `body` (per §3) plus its `thinking`
96
+ (the agent's deliberation, when the source carries it — a verbatim pass-through field); the
97
+ `why` / `result` / `how-verified` blocks are simply empty and omitted.
98
+
99
+ **Adapter recipe (generic agent run → minimal tree).** A typical agent log is a sequence of steps, each
100
+ a `{thought, action, observation/result}`. Map it onto the tree:
101
+ - one tree node per step (or per meaningful decision/experiment); `id` = the step index/label.
102
+ - `type` from the step kind: a tried approach → `experiment`; a chosen direction → `decision`; an
103
+ abandoned/failed approach → `dead_end`; an opening/guiding question → `question`.
104
+ - `title` = a one-line summary of the step (first sentence of the action, ≤80 chars).
105
+ - `thinking` = the agent's thought/deliberation for the step (**verbatim** — why it did/branched).
106
+ - `body` = what it actually did + what came back (action + observation).
107
+ - `source_refs` = a pointer back to the log line(s) (shown, never resolved).
108
+ - nesting via `children`; convergence via `also_depends_on`; a discarded branch via `isolated`.
109
+
110
+ No `logic/` or `evidence/` is required; enrich the same tree later (via the compiler) to add claims,
111
+ evidence, and per-node `code_change` diffs.
112
+
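The adapter recipe above could look roughly like this. The input shape is an assumption made for illustration: each step is a dict with `thought`, `action`, `observation`, an optional `kind`, an optional 1-based `parent` step index, and an optional log `line`. The `kind` vocabulary, the `S01` id scheme, and the default type are all hypothetical.

```python
# Hypothetical step kinds mapped to the tree's node types.
KIND_TO_TYPE = {
    "try": "experiment",
    "choose": "decision",
    "abandon": "dead_end",
    "ask": "question",
}


def steps_to_tree(steps: list, log_name: str = "agent.log") -> dict:
    """Turn a list of {thought, action, observation} steps into a minimal tree."""
    nodes, roots = [], []
    for index, step in enumerate(steps, start=1):
        action = step.get("action", "")
        observation = step.get("observation", "")
        flat = " ".join(action.split())
        cut = flat.find(". ")
        node_id = f"S{index:02d}"
        node = {
            "id": node_id,
            # Defaulting unknown kinds to "experiment" is an assumption.
            "type": KIND_TO_TYPE.get(step.get("kind"), "experiment"),
            "title": (flat[: cut + 1] if cut != -1 else flat)[:80] or node_id,
            "body": "\n".join(part for part in (action, observation) if part),
            "children": [],
        }
        if step.get("thought"):
            node["thinking"] = step["thought"]  # verbatim, never paraphrased
        if "line" in step:
            node["source_refs"] = [f"{log_name}:{step['line']}"]
        parent = step.get("parent")
        (nodes[parent - 1]["children"] if parent else roots).append(node)
        nodes.append(node)
    return {"tree": roots}
```

The sketch relies on a parent step always appearing earlier in the log than its children, which holds for a sequential run.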
113
+ # 8. The four `logic/` enrichment layers (all optional)
114
+
115
+ These produce the OPTIONAL `context` / `glossary` / `dependencies` / `recipes` keys (and the per-node
116
+ `built_on` / `rejected_here` / `recipe_refs` / `concepts` fields, derived in `binding.md`). **Each is
117
+ independent and absent ⇒ omit its key entirely** (no empty stubs). The renderer is inert for any key
118
+ it doesn't see, so an ARA with none of these renders exactly as before.
119
+
120
+ **Governing rule — classify by markdown SHAPE + generic cross-ref regexes; NEVER key on field
121
+ vocabulary.** Do not look for "optimizer", "Fig.", "loss", "shortcut", etc. — only headings, bold-lead
122
+ labels, id tokens, and the ref battery below. Verbatim everywhere: `text`/`definition`/`delta`/`value`,
123
+ all numbers, quotes, table cells, relation strings, and headings are reproduced exactly; a missing
124
+ field is `""`/`[]`; positive-absence signals (`present:false`, `unstructured`, `is_footprint`,
125
+ `inferred_id`) are *shown*, never invented.
126
+
127
+ ## 8.0 Shared sub-routines (define once, reuse for every layer)
128
+
129
+ - **`splitEntries(sectionText)`** — try delimiters in order, take the first that yields ≥1 hit:
130
+ (a) `^### ` H3 headings; (b) `^## ` with a leading id token (`^##\s*([A-Za-z]{1,3}\d+|RW\d+|[OGA]\d+)`);
131
+ (c) top-level bold-lead bullets `^\s*-\s*\*\*(.+?)\.?\*\*`; (d) blank-line-separated paragraph blocks.
132
+ (d) always succeeds ⇒ prose-only dialects still yield entries.
133
+ - **`probeFields(entryText)`** — collect every `^\s*-?\s*\*\*([^*]+?)\*\*\s*[:—]` labeled leader into an
134
+ **open** `[{label,value}]` list. **Labels verbatim, no whitelist.** Empty ⇒ `unstructured:true`,
135
+ `fields:[]`.
136
+ - **`harvestRefs(text)` → `typed-ref[]`** — run the FULL union of patterns; never assume one scheme.
137
+ The shared **typed-ref** is `{ "raw":"<verbatim token>", "kind":"…", "target":null }`; `target` is set
138
+ later in the binding pass (it stays `null` for anything that doesn't resolve inside this ARA).
139
+
140
+ | pattern | kind |
141
+ |---|---|
142
+ | `\[?\b(C\d{1,3})\b\]?` (optionally `(claims.md…)`) | `claim` |
143
+ | `\bRW\d{1,3}\b` | `related_work` |
144
+ | `\b([OGA])\d{1,3}\b` | `observation` / `gap` / `assumption` |
145
+ | `\bK\d{1,3}\b` or a minted term-id | `concept` |
146
+ | `PR\s*#?\d+` | `pr` |
147
+ | `arXiv[:\s]\S+` | `arxiv` |
148
+ | `10\.\d{4,}/\S+` | `doi` |
149
+ | `[\w./-]+\.\w+:\d+(?:-\d+)?` · `trace:N\d+` · `logic/\S+#\w+` | `source` / `node` |
150
+ | `§[\w".]+` · `Table\s*\S+` · `Fig\.?\s*\S+` · `Eqn\.?\s*\S+` | `figure` |
151
+ | `https?://\S+` | `url` |
152
+
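The three shared sub-routines can be sketched together. This is an illustration under assumptions: only a subset of the ref battery is wired in, `probe_fields` captures just the rest of the label's line as the value, and text before the first delimiter in `split_entries` is dropped. The snake_case function names are hypothetical.

```python
import re

DELIMITERS = [
    r"^### ",
    r"^##\s*(?:[A-Za-z]{1,3}\d+|RW\d+|[OGA]\d+)",
    r"^\s*-\s*\*\*.+?\.?\*\*",
]
FIELD = re.compile(r"^\s*-?\s*\*\*([^*]+?)\*\*\s*[:—][ \t]*(.*)$", re.M)
REF_PATTERNS = [
    (r"\[?\bC\d{1,3}\b\]?", "claim"),
    (r"\bRW\d{1,3}\b", "related_work"),
    (r"\b[OGA]\d{1,3}\b", None),  # kind depends on the leading letter
    (r"\bK\d{1,3}\b", "concept"),
    (r"PR\s*#?\d+", "pr"),
    (r"arXiv[:\s]\S+", "arxiv"),
    (r"10\.\d{4,}/\S+", "doi"),
    (r"https?://\S+", "url"),
]
LETTER_KINDS = {"O": "observation", "G": "gap", "A": "assumption"}


def split_entries(section_text: str) -> list:
    """Try each delimiter in order; fall back to blank-line paragraphs."""
    for pattern in DELIMITERS:
        starts = [m.start() for m in re.finditer(pattern, section_text, re.M)]
        if starts:
            bounds = starts + [len(section_text)]
            return [section_text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [b.strip() for b in re.split(r"\n\s*\n", section_text) if b.strip()]


def probe_fields(entry_text: str) -> dict:
    """Collect bold-lead labels verbatim into an open list, no whitelist."""
    fields = [{"label": m[1], "value": m[2].strip()} for m in FIELD.finditer(entry_text)]
    return {"fields": fields, "unstructured": not fields}


def harvest_refs(text: str) -> list:
    """Run every pattern and return typed-refs in source order."""
    found = []
    for pattern, kind in REF_PATTERNS:
        for m in re.finditer(pattern, text):
            raw = m.group(0)
            ref = {"raw": raw, "kind": kind or LETTER_KINDS[raw[0]], "target": None}
            found.append((m.start(), ref))
    return [ref for _, ref in sorted(found, key=lambda item: item[0])]
```

Sorting by match position keeps the output in stable source order; `target` stays `None` here because resolution belongs to the binding pass.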
153
+ ## 8.1 `logic/problem.md` → `context`
154
+ Absent ⇒ omit (do NOT fabricate context from claims). Split on `^## `; classify each heading by a
155
+ **case-insensitive role-synonym map** (NOT exact names): `setting ⇐ {setting,task,constraints,
156
+ background,context,problem statement}`; `observations ⇐ {observation*,findings,evidence}`;
157
+ `gaps ⇐ {gap*,open question*,challenges,limitations of prior work}`; `insight ⇐ {key insight,insight,
158
+ core idea,approach,thesis}`; `assumptions ⇐ {assumption*,scope,caveats}`. Unmatched ⇒ `role:"other"`,
159
+ heading kept verbatim, **never dropped**. **Merge** same-role sections. Any canonical role that never
160
+ appeared ⇒ emit a `{present:false}` stub (signals the compile, greys the chip). Per section run
161
+ `splitEntries`; `ent_id = \b([OGA]\d+)\b` (width-agnostic) else positional + `inferred_id:true`;
162
+ `ent_title` = text to the first `:`/`—`/`.` else first sentence; `probeFields` (empty ⇒
163
+ `unstructured:true`); `text` = verbatim body (mandatory); `harvestRefs`. `summary` = the insight
164
+ entry's first sentence (else `""`).
165
+
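One way to sketch the role-synonym map, treating the `*` entries as shell-style wildcards and assuming the whole heading text (lowercased, whitespace-collapsed) must match a synonym. Whether matching should instead be substring-based is not stated above, so that choice is an assumption; `classify_heading` and `missing_role_stubs` are hypothetical names.

```python
from fnmatch import fnmatchcase

ROLE_SYNONYMS = {
    "setting": ["setting", "task", "constraints", "background", "context", "problem statement"],
    "observations": ["observation*", "findings", "evidence"],
    "gaps": ["gap*", "open question*", "challenges", "limitations of prior work"],
    "insight": ["key insight", "insight", "core idea", "approach", "thesis"],
    "assumptions": ["assumption*", "scope", "caveats"],
}


def classify_heading(heading: str) -> str:
    """Map a `##` heading to a canonical role, or 'other' if nothing matches."""
    key = " ".join(heading.lower().split())
    for role, patterns in ROLE_SYNONYMS.items():
        if any(fnmatchcase(key, pattern) for pattern in patterns):
            return role
    return "other"  # heading is kept verbatim elsewhere, never dropped


def missing_role_stubs(headings: list) -> dict:
    """Emit a {present: False} stub for every canonical role that never appeared."""
    seen = {classify_heading(h) for h in headings}
    return {role: {"present": False} for role in ROLE_SYNONYMS if role not in seen}
```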
166
+ ## 8.2 `logic/concepts.md` → `glossary`
167
+ Absent ⇒ omit (⇒ popovers globally disabled). **Detect grouping:** a `##` is a *group* (not a term)
168
+ iff it has no `—`/`:` id-separator AND the next non-blank line is a `- ` bullet AND it is not followed by a
169
+ `**Definition`. Split terms via `splitEntries` (`^##\s+(.+)$` → `^- \*\*(.+?)\.?\*\*` → paragraphs).
170
+ Per term: `term_id` = leading `^(K|C|RW|[A-Z]{1,3})\d{2,}` before the separator, else **mint**
171
+ `G01,G02,…` positionally (a stable popover anchor); `name` = heading minus id/separator (mandatory);
172
+ `aliases` = split a trailing `(…)` and `/`; `definition` = `**Definition**:` value else whole body
173
+ (mandatory); `probeFields` (captures resnet-style `Notation`/`Boundary conditions`; nothing for
174
+ codex/ptb — correct); `related` = any `/relat/i`-labeled field split on commas (resolved to `term_id`s
175
+ in binding); `harvestRefs`. **Build `lexicon`** = lowercased `name` + each alias → `term_id`, skipping
176
+ tokens <3 chars and pure-numeric aliases, first-wins on collision (the renderer uses it for popovers).
177
+
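The lexicon build described above can be sketched in a few lines. It assumes each parsed term is a dict with `term_id`, `name`, and an optional `aliases` list, and it reads "pure-numeric" as digits only; `build_lexicon` is a hypothetical name.

```python
def build_lexicon(terms: list) -> dict:
    """Map lowercased names and aliases to term ids; first entry wins."""
    lexicon = {}
    for term in terms:
        candidates = [(term["name"], False)]
        candidates += [(alias, True) for alias in term.get("aliases", [])]
        for token, is_alias in candidates:
            key = token.strip().lower()
            if len(key) < 3:
                continue  # too short to be a safe popover trigger
            if is_alias and key.isdigit():
                continue  # pure-numeric aliases are skipped
            lexicon.setdefault(key, term["term_id"])  # first-wins on collision
    return lexicon
```

Because terms are visited in source order, first-wins keeps the result deterministic across runs.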
178
+ ## 8.3 `logic/related_work.md` → `dependencies`
179
+ Absent ⇒ omit. Strip the preamble before the first `## ` → `preamble`; parse `**word** (gloss)` pairs →
180
+ `legend[]`; capture a `> **Cross-agent attribution.**` blockquote → `attribution`. Split on `^## `,
181
+ match `^##\s*(RW\d+)?\s*[—:-]?\s*(.*)$`. If a heading has no `RW\d+` and matches `/brief|additional
182
+ referenced|citations/i`, parse its bullets into `footprint[]` (`{ref_id,citation}`) and skip structured
183
+ probing (`is_footprint` entries, or just the `footprint` tail). Per entry: `relation_raw` from a
184
+ ` — **type** ` in the heading ELSE a `**Type:**` line — **kept verbatim**, incl. compound
185
+ (`bounds / refutes`) and transition (`extends → quarantined`); `relation_norm` = the first of
186
+ `{baseline,imports,extends,bounds,refutes}` found, else `""` (color only); `name` = heading after the
187
+ separator; `delta` = `**Delta**`/`What changed:`+`Why:` (concatenate) else body; `adopted` =
188
+ `**Adopted elements**`; `grounding`/`claims` = `harvestRefs` partitioned by kind; `cross_agent:true` if
189
+ the entry is named in `attribution` or its relation mentions another agent. Unnumbered ⇒ positional +
190
+ `inferred_id`. **Degrade:** an entry with no source and no claim keeps `grounding:[]`+`claims:[]` (the
191
+ renderer shows a muted "ungrounded").
192
+
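A sketch of the heading match and `relation_norm` for related-work entries. "The first of {…} found" is read here as the earliest position in the raw relation string, which is an assumption; relation extraction from the heading itself is left out. `parse_heading` and `relation_norm` as function names are hypothetical.

```python
import re

HEADING = re.compile(r"^##\s*(RW\d+)?\s*[—:-]?\s*(.*)$")
FOOTPRINT = re.compile(r"brief|additional referenced|citations", re.I)
RELATIONS = ("baseline", "imports", "extends", "bounds", "refutes")


def parse_heading(line: str) -> tuple:
    """Return (rw_id or None, name, is_footprint) for a `## ` heading line."""
    match = HEADING.match(line)
    if match is None:
        return None, line.strip(), False
    rw_id, name = match[1], match[2].strip()
    is_footprint = rw_id is None and bool(FOOTPRINT.search(name))
    return rw_id, name, is_footprint


def relation_norm(relation_raw: str) -> str:
    """Pick the known relation that appears earliest; '' if none (color only)."""
    lowered = relation_raw.lower()
    hits = [(lowered.find(rel), rel) for rel in RELATIONS if rel in lowered]
    return min(hits)[1] if hits else ""
```

`relation_raw` itself is never rewritten; only the normalized value is derived from it.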
193
+ ## 8.4 `logic/solution/*` → `recipes`
194
+ Glob `logic/solution/*.md`; dir absent ⇒ omit. **Role-classify each file by content signal (priority):**
195
+ filename `constraints.md` → `constraints`; filename `method.md` OR title `/method|recipe|procedure|
196
+ process|pipeline/i` → `method`; uniform `^##\s*[A-Z]\d+:` entries → `heuristics`;
197
+ `## Mathematical formulation`/pseudocode fence/`$…$` → `algorithm`; repeated component `###` micro-schema
198
+ → `architecture`; else → `recipe`. (Only the `constraints.md` filename is special — a universal ARA
199
+ convention, not a field term.) Split each file on `^## ` (recurse `###`). **`heading` kept verbatim —
200
+ never normalized.** Per section pick the **dominant `kind`** in order: table `| … |` → `table`
201
+ (`markdown`); typed bullets `- **K**: v` → `kv` (`fields`); numbered `^\d+\.` → `steps`;
202
+ fenced/ASCII-DAG → `code`; `$$…$$`/`$…$` → `math` (put TeX in `code`); else → `prose`. `text` = verbatim
203
+ body (mandatory fallback); `sec_id` = leading id; `warn` = body under a `/confound|cut-off|incomplete|
204
+ not specified|unverified/i` heading; `refs` = `harvestRefs` (harvest **bare** `C05` too, not only
205
+ `[C05]`). **Degrade:** only `constraints.md` present ⇒ render just it (no "method missing" stub);
206
+ `method.md` absent but `algorithm.md`/`architecture.md` present ⇒ those carry the method role.
207
+
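The per-section kind selection can be sketched as a priority list of patterns. This is an illustration that treats "dominant" as "first kind present, in priority order"; it does not attempt ASCII-DAG detection, and prose containing two dollar amounts on one line would be misread as math. `KIND_PATTERNS` and `dominant_kind` are hypothetical names.

```python
import re

# Checked in priority order; the first pattern that matches decides the kind.
KIND_PATTERNS = [
    ("table", r"^\s*\|.*\|\s*$"),
    ("kv", r"^\s*-\s*\*\*[^*]+\*\*\s*:"),
    ("steps", r"^\s*\d+\."),
    ("code", r"^\s*`{3}"),
    ("math", r"\$\$[\s\S]+?\$\$|\$[^$\n]+\$"),
]


def dominant_kind(body: str) -> str:
    """Classify a recipe section body by markdown shape alone."""
    for kind, pattern in KIND_PATTERNS:
        if re.search(pattern, body, re.M):
            return kind
    return "prose"  # verbatim text remains the mandatory fallback
```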
208
+ ## 8.5 Universal absence rule
209
+ Each key is independent; any combination is valid. A `minimal-artifact` (only `problem.md`+`claims.md`)
210
+ yields `context` only — `glossary`/`dependencies`/`recipes` omitted, all per-node enrichment fields
211
+ `[]`, nothing errors.