lumina-wiki 1.7.1 → 1.7.2
package/CHANGELOG.md
CHANGED
@@ -5,6 +5,35 @@ Format follows [Keep a Changelog](https://keepachangelog.com/en/1.0.0/).
 
 ## [Unreleased]
 
+## [1.7.2] - 2026-07-05
+
+### Fixed
+
+- `/lumi-ingest` step-03-verify referenced `src/skills/core/verify/` — the
+repo source tree — instead of the installed workspace path. On any
+non-Claude-Code IDE target (Codex, Gemini, Cursor, generic), the file
+never existed post-install, so grounding verification silently had
+nothing to follow. Fixed all three references to
+`.agents/skills/lumi-verify/`, which the installer copies unconditionally
+for every IDE target.
+- Added an `ingest_status` handler for the `not_applicable` verify verdict,
+which was previously unhandled during ingest.
+
+### Added
+
+- `/lumi-ingest` now checks external identifiers (DOI/arxiv/S2) for an
+existing source page before generating a slug, so the same paper
+ingested under a different title no longer creates a duplicate page.
+- Concept stub creation now scans existing concepts for acronym/expansion
+and singular/plural variants before creating a new one.
+- Key Claims in drafted source pages now require a source locator
+(section/page/heading), so the grounding reviewer in step-03 no longer
+has to re-scan the whole raw file to check a claim.
+- A concept-count rubric (roughly 3-7 per source) to keep the graph from
+being diluted by over-extracted keyword stubs.
+- PDF preprocessing now runs before type detection in step-01, so runtimes
+without native PDF reading don't fail attempting to read the raw binary.
+
 ## [1.7.1] - 2026-07-02
 
 ### Added
package/package.json
CHANGED
@@ -1,7 +1,7 @@
 {
 "$schema": "https://json.schemastore.org/package.json",
 "name": "lumina-wiki",
-"version": "1.7.
+"version": "1.7.2",
 "description": "Domain-agnostic, multi-IDE wiki scaffolder — Karpathy's LLM-Wiki vision, cross-platform and pack-based.",
 "keywords": [
 "llm-wiki",
@@ -3,6 +3,31 @@
 Use this reference before creating or updating source, concept, person, and graph
 records during `/lumi-ingest`.
 
+## Identifier Dedup (Before Slug Generation)
+
+Slug comparison alone misses duplicates: the same paper ingested under two title
+variants (with/without subtitle, arxiv version suffix, translated title) produces
+two different slugs. External identifiers are the stronger key, so check them
+first whenever one is known (from `resolve_pdf.py` output, or parsed from the
+input URL).
+
+```bash
+# One grep per known identifier, over source-page frontmatter.
+# Use the normalized value: bare DOI (10.x/y), bare arxiv id (2604.03501 — no version suffix).
+grep -rln "10.48550/arxiv.2604.03501" wiki/sources/
+grep -rln "2604.03501" wiki/sources/
+```
+
+- **Hit** — an existing page carries the same identifier. This run is a re-ingest
+of that page's slug, regardless of what slug the current title would generate.
+Confirm with the user, then follow the re-ingest path below; do not create a
+second page.
+- **No hit** — proceed to slug generation.
+
+A plain substring grep is deliberate: identifiers are unique enough that false
+positives are rare, and a rare false positive only costs one confirmation
+question. A false negative (skipping the check) costs a duplicate source page.
+
 ## Source Pages
 
 Generate the source slug with:
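Note on the Identifier Dedup hunk above: the check depends on searching for a normalized identifier. The following is a minimal sketch of that normalization, not code from the package. The helper names (`normalize_arxiv_id`, `normalize_doi`, `find_source_by_id`) are hypothetical. It also assumes `grep -F`, so that the dots in an identifier match literally instead of as regex wildcards, and `-i`, because DOIs are case-insensitive.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; lumina-wiki does not ship these.

# Strip an "arXiv:" prefix and a trailing version suffix (v1, v2, ...).
normalize_arxiv_id() {
  local id="${1#arXiv:}"
  id="${id#arxiv:}"
  printf '%s\n' "$id" | sed -E 's/v[0-9]+$//'
}

# Strip a resolver prefix so only the bare DOI (10.x/y) remains.
normalize_doi() {
  local doi="$1"
  doi="${doi#https://doi.org/}"
  doi="${doi#http://doi.org/}"
  doi="${doi#doi:}"
  printf '%s\n' "$doi"
}

# List source pages that mention the identifier as a literal substring.
# $1 = normalized identifier, $2 = sources directory (default: wiki/sources/).
find_source_by_id() {
  grep -rliF -- "$1" "${2:-wiki/sources/}"
}
```

Under these assumptions, `find_source_by_id "$(normalize_arxiv_id "arXiv:2604.03501v2")"` searches for `2604.03501`; any path it prints is a hit and routes to the re-ingest path after user confirmation.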
@@ -47,7 +72,22 @@ making any `add-edge concepts/<slug>` calls.
 
 ## Concept And Person Stubs
 
-
+Exact-slug lookup only catches exact matches — `rlhf` and
+`reinforcement-learning-from-human-feedback` would become two pages. Once per
+ingest run, list the existing concepts and hold the list in mind while creating
+stubs:
+
+```bash
+node _lumina/scripts/wiki.mjs list-entities --type concept
+```
+
+For each candidate concept, scan that list for variants of the same term:
+acronym vs expansion, singular vs plural, hyphenation or word-order differences.
+If a variant exists, link to the existing slug instead of creating a new stub —
+and if unsure whether two terms are the same concept, ask the user rather than
+guessing.
+
+Then, before creating a concept or person page, check metadata:
 
 ```bash
 node _lumina/scripts/wiki.mjs read-meta concepts/<slug>
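Note on the concept variant scan above: the hunk describes the scan as something the agent does by reading the list. As a rough illustration only, the acronym, plural, and hyphenation cases could be approximated mechanically as below. All function names are hypothetical, the stopword list is an assumption, and word-order differences are not covered. It also assumes existing slugs arrive one per line, which may not match the actual `list-entities` output format.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; lumina-wiki does not ship these.

# Build an acronym from a hyphenated slug, skipping common stopwords.
slug_acronym() {
  local slug="$1" word out=""
  local IFS='-'
  for word in $slug; do
    case "$word" in
      a|an|and|for|from|in|of|on|the|to|with) continue ;;
    esac
    out+="${word:0:1}"
  done
  printf '%s\n' "$out"
}

# Comparison key: hyphens removed, one trailing "s" dropped.
# Crude on purpose; it is applied to both sides of the comparison.
slug_key() {
  local s="${1//-/}"
  printf '%s\n' "${s%s}"
}

# Succeed if the two slugs look like variants of the same term.
is_variant() {
  local a="$1" b="$2"
  [ "$(slug_key "$a")" = "$(slug_key "$b")" ] && return 0
  [ "$a" = "$(slug_acronym "$b")" ] && return 0
  [ "$b" = "$(slug_acronym "$a")" ] && return 0
  return 1
}

# Read existing slugs on stdin (one per line) and print those that look
# like variants of the candidate slug given as $1.
find_variants() {
  local candidate="$1" existing
  while IFS= read -r existing; do
    if is_variant "$candidate" "$existing"; then
      printf '%s\n' "$existing"
    fi
  done
}
```

With these definitions, `slug_acronym reinforcement-learning-from-human-feedback` yields `rlhf`, so the two slugs from the hunk are flagged as variants. A match is only a hint: the rule in the hunk still applies, and an uncertain match goes to the user.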
@@ -6,7 +6,7 @@
 - Never modify files in `raw/`. Read-only.
 - Never hand-edit `wiki/graph/edges.jsonl` or `wiki/graph/citations.jsonl`; use `wiki.mjs add-edge` and `wiki.mjs add-citation`.
 - Never overwrite an existing wiki page without user confirmation.
--
+- Frontmatter discipline, two cases: (a) **initial creation** of a page that does not exist yet — Write the whole file (frontmatter + body) in one shot from the template; (b) **any change to an existing page's frontmatter** — always `wiki.mjs set-meta`, never Write/Edit on the frontmatter block. Body text of an existing page may be edited with Write/Edit; the frontmatter block may not.
 - `raw/tmp/` accepts additions only; never overwrite a file there.
 - `raw_paths` must list permanent artifacts only. Reject `raw/tmp/*` entries.
 - Keep a phase-level checkpoint after every phase — an interrupted run must resume cleanly.
@@ -69,9 +69,13 @@ node _lumina/scripts/wiki.mjs checkpoint-read research-discover shortlist
 
 Match title to a shortlist entry, extract URL, fall through to Mode B.
 
+Fallback: if the checkpoint does not exist (research pack not installed, or no discover run yet) or no shortlist entry matches the title, do not guess. Ask the user for a URL or identifier (arxiv ID / DOI), then fall through to Mode B with their answer.
+
 ### Phase 1 — Detect type
 
-
+If the source is a PDF or too large to read comfortably in one pass, follow `pdf-preprocessing.md` **before reading anything** — on runtimes without native PDF reading, a direct Read of the binary fails here, not in Phase 3.
+
+Read first ~200 lines of the source (or of the extracted text). Classify:
 - "Abstract", "Introduction", "References" → `paper`
 - Chapter structure → `book`
 - Web byline + publication → `article`
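Note on the Phase 1 change above: the hunk moves PDF handling ahead of the first read. A small sketch of such a guard follows, assuming hypothetical helper names (`is_pdf`, `read_head_for_detection`). It relies only on the fact that PDF files begin with the bytes `%PDF-`; what `pdf-preprocessing.md` actually prescribes is not shown in this diff.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; lumina-wiki does not ship these.

# Succeed if the file starts with the PDF magic bytes "%PDF-".
is_pdf() {
  [ "$(head -c 5 -- "$1" 2>/dev/null | tr -d '\0')" = "%PDF-" ]
}

# Print the first 200 lines for type detection, or refuse (exit status 2)
# if the file is a PDF that has not been through text extraction yet.
read_head_for_detection() {
  if is_pdf "$1"; then
    echo "PDF detected: run pdf-preprocessing first" >&2
    return 2
  fi
  head -n 200 -- "$1"
}
```

The guard refuses instead of extracting, so the extraction step stays wherever `pdf-preprocessing.md` defines it; the sketch only illustrates the ordering the hunk introduces.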
@@ -82,6 +86,8 @@ Write checkpoint: `phase: "detect"`.
 
 ### Phase 2 — Generate slug
 
+**Identifier dedup comes first.** If any external identifier is known at this point (DOI / arxiv ID / S2 ID — from Mode B resolution or parsed from the input URL), check for an existing source page carrying the same identifier BEFORE generating a slug — title variants produce different slugs, so slug comparison alone misses duplicates. See `dedup-policy.md` § Identifier Dedup. On a hit, this run is a re-ingest of the matched slug: confirm with the user and skip new-page creation.
+
 ```bash
 node _lumina/scripts/wiki.mjs slug "<Title>"
 ```
@@ -107,7 +113,7 @@ Write checkpoint: `phase: "slug"` (already included in the merge above).
 
 ### Phase 3 — Write source page
 
-For PDFs / large sources,
+For PDFs / large sources, the extraction from Phase 1 (`pdf-preprocessing.md`) applies here too — draft from the extracted text, section by section for long sources.
 
 Draft `wiki/sources/<slug>.md` from `_lumina/schema/page-templates.md` Source template. Required frontmatter: `id`, `title`, `type`, `created`, `updated`, `authors`, `year`, `importance`, `provenance`. Optional but encouraged: `urls`, `raw_paths`, `confidence`, `external_ids`.
 
@@ -131,6 +137,10 @@ node _lumina/scripts/wiki.mjs set-meta sources/<slug> sources "[$entry]" --json-
 
 Required body sections: `## Summary` (2–4 sentences), `## Key Claims` (bulleted, with confidence), `## Concepts` (`[[concept-slug]]` links), `## People` (`[[person-slug]]` links), `## Open Questions`.
 
+`## Key Claims` — each claim ends with a locator pointing at where the source supports it: a section anchor, page, or heading, e.g. `(§3.2)`, `(p. 5)`, `(Table 1)`. The grounding check in step-03 reads these locators to find evidence; a claim without one forces the reviewer to re-read the whole raw file and weakens the check. If a claim synthesizes multiple places, list the primary locator.
+
+`## Concepts` — be selective, not exhaustive. A term earns a concept link only if (a) it is central to this source's contribution, or (b) it is likely to recur across other sources in this wiki. Aim for roughly 3–7 concepts per source; minor keywords belong in body prose, not as concept links. Every linked concept implies a stub + edges in Phases 4–5, so over-extraction dilutes the graph.
+
 Provenance rubric (raw-centric):
 - `replayable` — `raw_paths` non-empty, every entry resolves to existing file
 - `partial` — `urls` has entries but `raw_paths` empty or missing
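Note on the Key Claims locator rule above: the requirement can be spot-checked before the draft gate. This is a rough sketch, assuming a hypothetical helper `claims_missing_locator` and assuming a locator is always a trailing parenthetical such as `(§3.2)` or `(p. 5)`. The package may define the format more loosely, and a trailing confidence marker in parentheses would pass this check too.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; lumina-wiki does not ship this.

# Print every bullet under "## Key Claims" that does not end in ")".
# $1 = path to a source page.
claims_missing_locator() {
  awk '
    /^## / { in_claims = ($0 == "## Key Claims"); next }
    in_claims && /^- / && !/\)[ \t]*$/ { print }
  ' "$1"
}
```

Empty output means every claim bullet ends with a parenthetical; any printed line is a claim that would push the grounding reviewer back to the whole raw file.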
@@ -156,7 +166,7 @@ node _lumina/scripts/wiki.mjs resolve-alias "<concept-name>"
 
 If it resolves to a foundation, link via `[[foundations/<slug>]]` and add `grounded_in` edge instead of creating a stub. See `dedup-policy.md` § Foundation Resolution.
 
-Apply `dedup-policy.md` before creating/updating stubs. Existing pages are updated conservatively.
+Apply `dedup-policy.md` before creating/updating stubs — including the concept variant scan (acronym vs expansion, singular vs plural), which catches duplicates that exact-slug lookup misses. Existing pages are updated conservatively.
 
 New stubs use the templates in `_lumina/schema/page-templates.md`.
 
@@ -187,10 +197,14 @@ node _lumina/scripts/wiki.mjs add-citation sources/<slug> sources/<cited-slug>
 
 Do not create stubs for cited sources not yet ingested — note them in `## Open Questions`.
 
+Write checkpoint: `phase: "citations"`.
+
 ### Phase 7 — Update wiki/index.md
 
 Add the new source page (and any new concept/person pages) to the catalog between `<!-- lumina:index -->` markers. Format: `- [[sources/<slug>]] — <one-line description>`.
 
+Write checkpoint: `phase: "index"`.
+
 ## Draft Gate
 
 Present a draft summary to the user:
@@ -211,7 +225,7 @@ Use the user's configured communication language. Explain "provenance", "edges",
 ```
 → NEXT
 - **E**: Take the user's revision instructions. Re-edit the affected files (source page, stubs, or edges as instructed). Re-present the draft summary. Loop back to "HALT and ask human" — do not advance.
-- **Q**: Leave the phase-level checkpoint in place; do not write `ingest_status`. **STOP — do not read the NEXT directive below.** Exit cleanly with no further action this run.
+- **Q**: Leave the phase-level checkpoint in place; do not write `ingest_status`. Before exiting, tell the user in plain language that the draft pages and links written so far stay in the wiki as work-in-progress — coming back to `/lumi-ingest <slug>` continues from here, and `/lumi-reset` can clean up if they decide not to keep this source at all. **STOP — do not read the NEXT directive below.** Exit cleanly with no further action this run.
 
 ## NEXT
 
@@ -3,7 +3,7 @@
 ## RULES
 
 - Read `README.md` at the project root before this step if you have not already in this session.
-- Reuse the existing `/lumi-verify` skill in grounding-only mode. Do not inline a duplicate grounding reviewer here — single source of truth for grounding logic lives in
+- Reuse the existing `/lumi-verify` skill in grounding-only mode. Do not inline a duplicate grounding reviewer here — single source of truth for grounding logic lives in `.agents/skills/lumi-verify/`.
 - All frontmatter writes go through `wiki.mjs set-meta`. Never write to `wiki/*.md` directly.
 - Drift is a hard halt: a missing `raw_paths` file is a stronger signal than any finding and should not be silently downgraded.
 - This step asks the user only when there are source-check findings, missing source files, or a confidence downgrade. If the source check passes, continue automatically.
@@ -24,7 +24,7 @@ Invoke `/lumi-verify` on the entry restricted to the grounding reviewer. Three r
 
 **Tier 1 — Agent tool available (Claude Code):**
 
-Spawn a sub-agent
+Spawn a sub-agent instructed to read and follow `.agents/skills/lumi-verify/SKILL.md` with this slug as the target, grounding only (skip blind, skip external). The sub-agent is deliberate: it activates the verify skill in a blank context, so the verdict is not anchored by the reasoning chain that just drafted the pages. Do not invoke the verify skill inline in this conversation (e.g. via a skill-invocation tool) — same-context verification inherits drafting bias. Wait for completion, then read the writeback:
 
 ```bash
 node _lumina/scripts/wiki.mjs read-meta sources/<slug>
@@ -32,9 +32,9 @@ node _lumina/scripts/wiki.mjs read-meta sources/<slug>
 
 **Tier 2 — Bash-only runtime (Codex, Gemini, Cursor, generic):**
 
-Read fully and follow
+Read fully and follow `.agents/skills/lumi-verify/SKILL.md` Grounding pipeline (Section: "Grounding reviewer"), with this slug as the target. The verify skill's writeback contract is the same — `verify_status` and `findings` written to the entry frontmatter. After the grounding pass returns, control returns to this step's Phase 8.6.
 
-If
+If `.agents/skills/lumi-verify/references/reviewers.md` exists, it is the canonical Grounding reviewer prompt; load it as part of following the verify SKILL.
 
 **Tier 3 — User opts out:**
 
@@ -54,6 +54,7 @@ Inspect `verify_status` and `findings`:
 | `findings_pending` | Patch/defer findings written | Present findings inline; user chooses A/E/W/Q |
 | `drift_detected` | `raw_paths` resolves to missing files | **Hard HALT** — do not present standard menu; force user to repair raw or set `provenance: missing` explicitly |
 | `skipped` | Tier 3 opt-out | Write verified status and continue automatically; the user already opted out |
+| `not_applicable` | Verify refused: target is not a `sources/*` entry | Should never happen during ingest (the target is always `sources/<slug>`). Treat as an internal error: HALT, report the slug that was passed to verify, and do not advance `ingest_status`. |
 
 For `passed`, tell the user in one short sentence that the page matched the source closely enough to continue. Then write `ingest_status: verified` and go to NEXT:
 
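Note on the status table above: the new `not_applicable` row gives every status shown in this hunk an explicit outcome. A compact sketch of that dispatch follows, covering only the statuses visible here. The function name `verify_action` and the action labels it prints are hypothetical; rows outside the hunk context are not represented.

```bash
#!/usr/bin/env bash
# Hypothetical helper for illustration; lumina-wiki does not ship this.

# Map a verify_status value to the ingest outcome described in the table.
# Prints an action label; returns non-zero when the run must halt.
verify_action() {
  case "$1" in
    passed|skipped)   echo "continue";       return 0 ;;
    findings_pending) echo "ask-user";       return 0 ;;
    drift_detected)   echo "hard-halt";      return 1 ;;
    not_applicable)   echo "internal-error"; return 1 ;;
    *)                echo "unknown-status"; return 1 ;;
  esac
}
```

The catch-all branch reflects the point of the fix: a status with no handler should stop the run instead of letting `ingest_status` advance silently.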