llm-wiki-compiler 0.7.0 → 0.8.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,251 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.8.0] - 2026-05-26
9
+
10
+ Adds guided project next steps, one-command quickstart, agent-ready context graph packs, a viewer graph route, and the first eval harness for measuring wiki quality over time.
11
+
12
+ ### Added
13
+
14
+ - **`llmwiki next`** — inspects the current project and recommends the next useful command. Human output is concise; `--json` emits a stable envelope for agents.
15
+ - **`llmwiki quickstart <source>`** — ingests one source, compiles the wiki, and opens the local viewer when pages are ready. Supports `--review`, `--no-open`, `--provider`, `--lang`, and `--json`.
16
+ - **`llmwiki context "<prompt>"`** — builds an agent-ready evidence pack with primary pages, semantic chunks when available, graph neighbors, citations, gaps, warnings, and suggested actions. `--include-sources` can add path-confined source windows.
17
+ - **MCP `get_context_pack`** — exposes the same v1 context-pack envelope over MCP. It packages evidence for agents; `query_wiki` remains the answer-generation tool.
18
+ - **Viewer graph route** — the local web viewer now includes a force-directed graph at `#/graph`, plus navigation polish so graph/page/sidebar state stays in sync.
19
+ - **`llmwiki eval`** — measures wiki health score, citation coverage, citation precision, corpus stats, regression deltas, and optional LLM-as-judge citation support. Includes `eval report`, `eval history`, `eval judgements`, and `eval cache` subcommands.
20
+ - **Eval thresholds** — `.llmwiki/eval/thresholds.yaml` can gate health, citation coverage, citation precision, and full-suite citation support scores in CI.
21
+
22
+ ### Fixed
23
+
24
+ - **Pipe-alias wikilinks** — `[[slug|Display Text]]` is now detected correctly by lint and viewer link tooling.
25
+ - **Source-path confinement in eval** — citation-support judging resolves source paths through a shared confinement helper, including encoded traversal edge cases.
26
+ - **Eval cache invalidation** — citation judgements include judge configuration in the cache key, so changing model/provider settings re-judges affected claims instead of reusing stale scores.
27
+ - **Sample validation in eval** — invalid `--sample` values fail loudly instead of falling through to surprising sampling behavior.
28
+ - **Contributor docs** — upstream remote setup now points at the current `atomicstrata` GitHub org.
29
+
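To illustrate the pipe-alias fix listed above: a wikilink scanner has to treat the text before `|` as the target slug and the text after it as display text. The sketch below is not the project's lint or viewer code. `findWikilinks` and the `Wikilink` shape are hypothetical names chosen for this example.

```typescript
// Hypothetical shape for one parsed wikilink (not the project's own type).
interface Wikilink {
  slug: string;    // link target, e.g. "attention"
  display: string; // text shown to the reader; falls back to the slug
}

// Scans markdown for [[slug]] and [[slug|Display Text]] links.
function findWikilinks(markdown: string): Wikilink[] {
  const pattern = /\[\[([^\]|]+)(?:\|([^\]]+))?\]\]/g;
  const links: Wikilink[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    const slug = match[1].trim();
    // match[2] is undefined when the link carries no pipe alias.
    const display = match[2] === undefined ? slug : match[2].trim();
    links.push({ slug, display });
  }
  return links;
}
```

A checker that only matched `[[...]]` as a single token would read `slug|Display Text` as the target. Splitting on the first `|` keeps link resolution keyed on the slug.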
30
+ ### Changed
31
+
32
+ - **Fallow upgraded to 2.82.0** with follow-on code-health cleanup across CLI, viewer, adapters, watch, and tests. The CI action is pinned to the matching signed release so binary verification uses embedded platform digests rather than unauthenticated GitHub API lookups.
33
+
34
+ ### Test infrastructure
35
+
36
+ - Hardened the basename-collision CLI tests with explicit timeouts.
37
+ - Hardened local Vitest timeouts for subprocess-heavy integration tests.
38
+ - Changed the npm publish preflight to run release-doc checks, build, and a dry-run package check; the full test suite remains enforced by CI.
39
+ - Added coverage for quickstart/next JSON envelopes, context packs, MCP context packs, graph rendering, eval reports/history/cache/thresholds, and release-doc checks.
40
+
41
+ ### Contributors
42
+
43
+ Thanks to **@joshuaknipe** for a major release's worth of contributions: pipe-alias wikilink fixes (#61), upstream docs cleanup (#62), the viewer graph route (#63), and the eval harness with health scoring, citation quality, and corpus stats (#67).
44
+
45
+ ## [0.7.0] - 2026-05-18
46
+
47
+ Adds the first local web viewer for compiled wikis, a GitHub Copilot provider, and a persisted lint summary that lets the viewer report wiki health without re-running lint on every page load.
48
+
49
+ ### Added
50
+
51
+ - **`llmwiki view`** — starts a local read-only web viewer for the current project. The viewer includes a sidebar grouped by concepts and saved queries, a dashboard home, markdown rendering, wikilinks, title/body search, page metadata, health counts, and provenance/citation support rails.
52
+ - **Citation chips in the viewer** — paragraph citations and claim-level source ranges render as visible chips. On loopback binds, chips can include local editor links for source-line context; LAN binds omit filesystem paths and editor links.
53
+ - **Secure-by-default local server** — `view` binds to `127.0.0.1` by default, uses an OS-assigned port unless `--port` is provided, and requires `--host <host>` and `--allow-lan` together before binding beyond loopback. The server applies pinned CSP / CORP / nosniff / referrer headers, Host / Origin / Sec-Fetch checks, and path confinement for all served files.
54
+ - **Viewer health payload** — `/api/health` exposes cheap project counts, pending review count parity with MCP `wiki_status`, and the latest cached lint summary when available.
55
+ - **GitHub Copilot provider** — `LLMWIKI_PROVIDER=copilot` uses the GitHub Copilot API. Authenticate with `GITHUB_TOKEN=$(gh auth token)`; the OAuth token must have the `copilot` scope. Copilot supports chat/tool calls but does not expose embeddings, so embedding-dependent semantic search should use another provider.
56
+
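The bind rule described in the secure-by-default entry above can be sketched as a small decision function. This is a minimal illustration, not the viewer's actual code. `resolveBindHost`, `ViewOptions`, and the loopback list are hypothetical, and treating `--allow-lan` without `--host` as "stay on loopback" is an assumption.

```typescript
// Hypothetical option bag mirroring the --host and --allow-lan flags.
interface ViewOptions {
  host?: string;      // value of --host, if given
  allowLan?: boolean; // true when --allow-lan is set
}

// Assumption: these are the hosts treated as loopback.
const LOOPBACK_HOSTS = ["127.0.0.1", "::1", "localhost"];

// Returns the host to bind to, or throws when a non-loopback bind
// is requested without the explicit LAN opt-in.
function resolveBindHost(options: ViewOptions): string {
  const host = options.host ?? "127.0.0.1";
  const isLoopback = LOOPBACK_HOSTS.indexOf(host) !== -1;
  if (isLoopback) return host;
  if (!options.allowLan) {
    throw new Error(`Refusing to bind to ${host} without --allow-lan`);
  }
  return host;
}
```

Requiring both flags means a single mistyped option cannot expose the viewer beyond the local machine.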
57
+ ### Changed
58
+
59
+ - **`llmwiki lint` now writes `.llmwiki/last-lint.json`** after each completed lint run so the viewer can show a recent lint summary without running lint on every page load.
60
+ - **Shared wiki page collection** — export and viewer collection now share the lower-level wiki collector while preserving each surface's own filtering and payload shape.
61
+
62
+ ### Test infrastructure
63
+
64
+ - Added subprocess, path-safety, sanitizer, accessibility, JS DOM, pack-asset, and server-security coverage for the viewer. Tests grew from 632 to 850 in this release.
65
+
66
+ ### Contributors
67
+
68
+ Thanks to **@cadamsdev** for contributing the GitHub Copilot provider in PR #55.
69
+
70
+ ## [0.6.0] - 2026-05-02
71
+
72
+ Adds session-history ingest (Claude / Codex / Cursor exports), configurable output language, and a defensive cap that prevents `compile` from crashing on popular concepts. Closes a batch of CJK / collision / silent-loss bugs in the ingest path. Tightens `compile --review` so candidates carry both schema and provenance lint findings before approval. Extracts a shared `ProvenanceMetadata` shape and removes an unreliable LLM extraction-time estimate in favour of body-derived counts.
73
+
74
+ ### Added
75
+
76
+ - **`llmwiki ingest-session <path>`** — imports AI coding-session exports as wiki sources. Auto-detects three formats: Claude (`.jsonl`), Codex (`.json`), Cursor (`.json`, both `tabs` and flat schemas). Single file or whole directory. Each session lands in `sources/<slug>.md` with frontmatter recording the adapter, source path, ingest timestamp, and (where available) session start/end times. Adapter validation requires ≥ 1 user-or-assistant turn — recognised-but-empty exports fail loudly instead of producing a content-free page.
77
+ - **`LLMWIKI_OUTPUT_LANG` env var + `--lang <code>` CLI flag** on `compile` and `query`. When set, every prompt builder (extraction, page generation, seed page, query answer) appends `Write the output in <lang>.` to the system prompt. Unset preserves current behaviour byte-for-byte. Useful for `--lang Chinese`, `--lang Japanese`, etc.
78
+ - **`compile --review` provenance lint** — review candidates now carry both `schemaViolations` and `provenanceViolations` (malformed claim citations, broken-source / out-of-bounds line spans). `review show` prints both blocks. Reviewers see citation issues before approving a page rather than discovering them on a later compile.
79
+ - **`npm run fallow:ci`** — contributor script that runs `fallow` with the same `--changed-since <PR-base-sha>` scoping the GitHub Action uses, so most CI fallow findings surface locally before pushing. Documented in CONTRIBUTING.md (including the fork-workflow `upstream/main` resolution and the platform-binary parity caveat).
80
+
81
+ ### Fixed
82
+
83
+ - **Non-ASCII filename ingest** (#35) — `slugify` previously used `\w` without the `/u` flag, so titles like `测试文档` collapsed to the empty string and `ingest` wrote `sources/.md` (a dotfile that subsequent CJK ingests would overwrite). `slugify` now uses Unicode property escapes (`\p{L}`, `\p{N}`); pure-emoji titles that still strip to `""` fail with an actionable error rather than writing a dotfile.
84
+ - **Same-basename source collision** (#36) — two distinct sources slugifying to the same name (e.g. `a/notes.md` and `b/notes.md`) used to silently overwrite. `saveSource` now checks for the collision and falls through to `<slug>-<8-hex-of-source>.md` when the existing file's frontmatter `source` doesn't match. Re-ingesting the same source still overwrites in place — no duplicate accumulation.
85
+ - **Compile crash on popular concepts** (#39) — `mergeExtractions` used to concatenate every contributing source's full content into the page-generation prompt. Linear in source count; reliably blew past the LLM provider's context window once many sources discussed the same topic. New defensive cap (`LLMWIKI_PROMPT_BUDGET_CHARS`, default 200,000) gives every contributing source a fair share of the budget when the raw total would overflow, with a clear truncation marker. Typical workloads stay byte-identical.
86
+ - **Body-derived `excess-inferred-paragraphs`** — the lint rule used to trust an LLM-estimated `inferredParagraphs` frontmatter field when present, falling back to body counting. The estimate was made before the page even existed and routinely disagreed with what the model actually produced. The rule now unconditionally counts uncited prose paragraphs in the rendered body, with Unicode-aware prose detection (`\p{L}`) so pages produced via `--lang Chinese` etc. are correctly counted. Legacy `inferredParagraphs` frontmatter values are intentionally ignored.
87
+
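The non-ASCII filename fix above hinges on how JavaScript regular expressions treat `\w`: it matches only ASCII word characters, so CJK titles were stripped to nothing. The sketch below shows the general technique of using `\p{L}` and `\p{N}` with the `u` flag. `slugifySketch` is a hypothetical stand-in, and the project's real `slugify` may differ in separators, casing, and error wording.

```typescript
// Hypothetical Unicode-aware slug helper (not the project's slugify).
function slugifySketch(title: string): string {
  const slug = title
    .toLowerCase()
    // Keep letters and digits from any script; collapse every other run to "-".
    .replace(/[^\p{L}\p{N}]+/gu, "-")
    // Trim separators left at either end.
    .replace(/^-+|-+$/g, "");
  if (slug === "") {
    // Titles with no letters or digits (e.g. pure emoji) cannot name a file.
    throw new Error(`Cannot derive a filename from title: ${JSON.stringify(title)}`);
  }
  return slug;
}
```

Failing on an empty result, rather than returning `""`, is what avoids writing a `sources/.md` dotfile.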
88
+ ### Changed
89
+
90
+ - **`ProvenanceMetadata` is now a single shared interface** in `src/utils/types.ts` that both `ExtractedConcept` and `WikiFrontmatter` extend. Drops the duplicate private declaration that had drifted into `src/utils/markdown.ts`. JSON shapes serialised on disk and over the LLM tool boundary are byte-identical to before — pure refactor.
91
+ - **`inferredParagraphs` is no longer written to frontmatter or sent to the LLM extractor**. The field has moved entirely to body-derived lint at lint time. Old on-disk pages with the field still parse — the loader just ignores the unrecognised key.
92
+ - **`CompileResult.pages` now includes seed-page slugs** alongside concept-page slugs. Seed pages used to land on disk silently and stay absent from the result; downstream consumers (MCP, embeddings, programmatic callers) had no way to discover them without scanning `wiki/`. They're also threaded into `finalizeWiki` so `resolveLinks` and `updateEmbeddings` cover them.
93
+ - **Lint helper dedupe** — `checkSchemaCrossLinks` (on-disk walker) now delegates to `checkPageCrossLinks` (per-page) so the `schema-cross-link-minimum` rule lives in exactly one place.
94
+
95
+ ### Test infrastructure
96
+
97
+ - **`useIngestWorkspaces` and `useAimockLifecycle.findSystemPromptByUserMessage`** composables in `test/fixtures/` consolidate temp-workspace and aimock recording boilerplate that had drifted across multiple integration tests.
98
+ - Tests grew from 480 (post-0.5.1) to 632 in this release.
99
+
100
+ ### Contributors
101
+
102
+ Thanks to **@lllcccwww** for filing four high-quality bug reports back-to-back (#35, #36, #37, #39) — every one had a clear repro and pointed at the offending file:line, which made the fixes obvious. Also thanks to **@babysource** for asking about embedding configuration (#42) and **@ishan5ain** for volunteering to take on the read-only Web UI roadmap item (#38).
103
+
104
+ ## [0.5.1] - 2026-04-27
105
+
106
+ Patch release fixing a CLI startup crash that broke 0.5.0 for everyone installing via npm.
107
+
108
+ ### Fixed
109
+
110
+ - **Startup crash on `llmwiki <any-command>`** — 0.5.0 imported `youtube-transcript/dist/youtube-transcript.esm.js` (a deep subpath that worked around a broken `main` entry in v1.3.0). v1.3.1 added a proper `exports` map that no longer exposes that subpath, so any `npm install -g llm-wiki-compiler@0.5.0` produced `ERR_PACKAGE_PATH_NOT_EXPORTED` on first command. Switched to importing from the package root (which the new `exports` map covers) and bumped `youtube-transcript` to `^1.3.1`.
111
+
112
+ ### Contributors
113
+
114
+ Thanks to **@lllcccwww** for reporting (#33) and **@ishan5ain** for the fix (#34) — both very fast turnarounds.
115
+
116
+ ## [0.5.0] - 2026-04-27
117
+
118
+ Adds multimodal ingest (images, PDFs, transcripts) and chunk-level semantic retrieval with reranking and a `--debug` view. Also raises the minimum Node version to 24 so the project can use modern test-mocking tooling that depends on Node 24+ APIs.
119
+
120
+ ### Added
121
+
122
+ - **Multimodal ingest** — `llmwiki ingest` now accepts images (vision via the active LLM provider), PDFs (text + metadata via lazy-loaded `pdf-parse`), and transcripts (`.vtt`, `.srt`, plus content-sniffed `.txt` that requires repeated speaker dialogue or anchored timestamps so plain notes aren't misclassified). Each source records its `sourceType` in frontmatter (`web` | `file` | `image` | `pdf` | `transcript`). YouTube transcript URLs are auto-routed.
123
+ - **Chunk-level semantic retrieval** — the embedding store gained an optional v2 `chunks` schema. Pages are split on paragraph + heading boundaries (with size guardrails), embedded individually, and reused across compiles when their content hash hasn't changed. Query routing prefers chunk hits, then falls back to page-level retrieval and, failing that, full-index selection.
124
+ - **BM25 reranking** over chunk candidates, blending 0.5x cosine similarity with BM25 score so semantic ranking still matters when the query has no overlapping terms.
125
+ - **`llmwiki query --debug`** prints the top chunks (slug, score, snippet) and pages selected, so users can audit retrieval decisions. The MCP `query_wiki` tool accepts a `debug` arg too.
126
+ - **Empty-store cold-start** — an empty v1 or v2 store with live wiki pages now triggers a full chunk embedding on next compile (previously, embeddings would only update when an existing slug changed).
127
+ - **`@copilotkit/aimock` test infrastructure** with `mockClaudeEnv` / `mockOpenAIEnv` / `useAimockLifecycle` helpers. CLI subprocess tests can now stub LLM endpoints deterministically — closes the recurring "no subprocess test for the compile/query happy path" gap that codex flagged across review-queue, schema-layer, confidence-metadata, and chunked-retrieval.
128
+
129
+ ### Changed
130
+
131
+ - **Minimum Node version raised from 18 to 24.** `engines.node` is `>=24`, the tsup target is `node24`, and CI runs only on Node 24. Users on older Node should pin to `<0.5.0` until they can upgrade their runtime.
132
+ - `pdf-parse` is dynamically imported so the cost of loading pdfjs-dist is paid only when a PDF is actually being ingested.
133
+
134
+ ### Test infrastructure
135
+
136
+ - New `runCLI` / `expectCLIExit` / `expectCLIFailure` / `formatCLIFailure` helpers in `test/fixtures/run-cli.ts` capture full subprocess diagnostics (code, signal, killed, message, stdout, stderr, args, cwd) on assertion failure — flakes now surface their root cause without rerunning.
137
+ - `vitest globalSetup` builds dist once before the suite runs, eliminating the per-test `tsup --clean` race that caused intermittent CI flakes.
138
+ - Tests grew from 391 to 477 in this release (and to 519 once export-bundle lands as a follow-up).
139
+
140
+ ## [0.4.0] - 2026-04-25
141
+
142
+ Adds claim-level source-range provenance, a first-class schema layer for typed page kinds, configurable provider request timeouts, and a slug-based wikilink format that resolves reliably in Obsidian.
143
+
144
+ ### Added
145
+
146
+ - **Claim-level provenance with source ranges** — citations can now pin specific lines: `^[paper.md:42-58]` (colon form) or `^[paper.md#L42-L58]` (GitHub anchor form). Single-line `^[paper.md:7]` works too, as do mixed multi-source markers like `^[a.md, b.md:1-3]`. The legacy paragraph form `^[paper.md]` continues to work unchanged.
147
+ - **`extractClaimCitations(body)`** returns structured `{ raw, spans: [{ file, lines? }] }` records for tooling. **`inspectProvenance(body)`** groups spans by source file (deduped), useful for "this page draws from" UIs.
148
+ - **`checkBrokenCitations`** lint rule now flags out-of-bounds spans (e.g. `^[src.md:42-58]` against a 3-line source) with cached per-file line counts so a page with many spans into the same source only reads it once.
149
+ - **`checkMalformedClaimCitations`** new lint rule catches malformed entries: non-numeric ranges (`:abc-xyz`), half-baked hash forms (`#X9`), line `0`, and reversed ranges (`5-3`). Semantic invalidity is rejected at parse time so `extractClaimCitations` doesn't return impossible spans.
150
+ - **First-class schema layer** for typed page kinds. Projects can declare `.llmwiki/schema.json|yaml|yml` (or `wiki/.schema.yaml|yml`) defining page kinds (`concept`, `entity`, `comparison`, `overview`), per-kind `minWikilinks`, and seed pages.
151
+ - **`llmwiki schema init`** writes a starter schema file. **`llmwiki schema show`** prints the resolved schema and its source path.
152
+ - **`schema-cross-link-minimum`** lint rule enforces per-kind link expectations.
153
+ - **Schema-driven seed pages** are generated during compile and run on the early-return path too, so adding a seed-page entry triggers its creation on the next `compile` even when no source files changed.
154
+ - **Review-mode schema violations** — `compile --review` runs in-memory schema lint per candidate and stamps any violations onto the candidate JSON. `review show <id>` prints a "Schema violations" block when present.
155
+ - **Configurable provider request timeouts** — `LLMWIKI_REQUEST_TIMEOUT_MS` (provider-agnostic) and `OLLAMA_TIMEOUT_MS` (Ollama-specific) override the per-request timeout. Defaults: 10 minutes for OpenAI (matches the SDK), 30 minutes for Ollama (better suited to local models).
156
+ - **Slug-based wikilinks** — index, MOC, and the in-body wikilink resolver now emit `[[slug|Title]]` so Obsidian targets the file directly regardless of whether the slug differs from the display title.
157
+ - **Test infrastructure for subprocess CLI tests** — `runCLI`/`expectCLIExit`/`expectCLIFailure`/`formatCLIFailure` helpers in `test/fixtures/run-cli.ts` capture full subprocess diagnostics (code, signal, killed, message, stdout, stderr, args, cwd) so flakes surface their root cause without rerunning. dist/ is built once via `vitest globalSetup` so parallel workers don't race on `tsup --clean`.
158
+
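The citation forms listed above (`^[paper.md:42-58]`, `^[paper.md#L42-L58]`, `^[paper.md:7]`, `^[a.md, b.md:1-3]`, and the legacy `^[paper.md]`) can be parsed with two small regular expressions. The following is a sketch under assumptions, not the project's `extractClaimCitations`. The function names are hypothetical, and the `{ start, end }` shape of `lines` is an assumption, since the changelog only names the `lines?` field.

```typescript
// Assumed shapes, loosely following `{ raw, spans: [{ file, lines? }] }`.
interface LineRange { start: number; end: number }
interface CitationSpan { file: string; lines?: LineRange }
interface ClaimCitation { raw: string; spans: CitationSpan[] }

// Parses one comma-separated entry; returns null for malformed entries.
function parseSpanSketch(entry: string): CitationSpan | null {
  const text = entry.trim();
  const colon = /^([^:#]+):(\d+)(?:-(\d+))?$/.exec(text);
  const hash = /^([^:#]+)#L(\d+)(?:-L(\d+))?$/.exec(text);
  const match = colon ?? hash;
  if (match === null) {
    // No valid range suffix: accept only a bare filename (legacy form).
    return /^[^:#]+$/.test(text) ? { file: text } : null;
  }
  const start = Number(match[2]);
  const end = match[3] === undefined ? start : Number(match[3]);
  // Reject line 0 and reversed ranges such as 5-3.
  if (start < 1 || end < start) return null;
  return { file: match[1].trim(), lines: { start, end } };
}

// Collects every ^[...] marker in a page body.
function extractClaimCitationsSketch(body: string): ClaimCitation[] {
  const marker = /\^\[([^\]]+)\]/g;
  const results: ClaimCitation[] = [];
  let m: RegExpExecArray | null;
  while ((m = marker.exec(body)) !== null) {
    const spans: CitationSpan[] = [];
    for (const entry of m[1].split(",")) {
      const span = parseSpanSketch(entry);
      if (span !== null) spans.push(span);
    }
    results.push({ raw: m[0], spans });
  }
  return results;
}
```

Dropping invalid entries at parse time mirrors the stated goal that impossible spans never reach downstream tooling. A lint rule would additionally need to report them, which this sketch leaves out.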
159
+ ### Changed
160
+
161
+ - `extractCitations(body)` continues to return a flat filename list for backward compatibility, but is now backed by `extractClaimCitations` and strips span suffixes when collecting filenames.
162
+ - `WikiFrontmatter.kind` references the canonical `PageKind` type from `src/schema/types.ts` via `import type` (no runtime cycle).
163
+ - `compile --review` defers seed-page generation and `finalizeWiki` to honor the no-`wiki/`-mutation contract.
164
+
165
+ ### Contributors
166
+
167
+ Thanks to **@ludevica** for #15 (slug-based wikilinks) and **@BenGSt** for reporting the Ollama timeout (#11).
168
+
169
+ ## [0.3.0] - 2026-04-23
170
+
171
+ Adds a candidate review queue for `compile` and richer epistemic metadata on compiled pages.
172
+
173
+ ### Added
174
+
175
+ - **Candidate review queue** — `llmwiki compile --review` writes generated pages to `.llmwiki/candidates/` instead of mutating `wiki/`. New subcommands `llmwiki review list|show|approve|reject` let you inspect each candidate before it lands. `approve` writes the page and refreshes index/MOC/embeddings; `reject` archives the candidate to `.llmwiki/candidates/archive/`. MCP `wiki_status` exposes `pendingCandidates` so agents can see queue depth.
176
+ - **Confidence and contradiction metadata** — compiled pages can carry optional frontmatter fields (`confidence`, `provenanceState`, `contradictedBy`, `inferredParagraphs`). When multiple sources merge into one slug, metadata is reconciled (`min` confidence, `provenanceState = 'merged'`, union of `contradictedBy` deduped by slug, `max` `inferredParagraphs`).
177
+ - **Three new lint rules** surface the new metadata: `low-confidence`, `contradicted-page`, `excess-inferred-paragraphs`.
178
+ - **Multi-source citation parsing in lint** — `^[a.md, b.md]` now validates each filename independently and only reports the missing one(s).
179
+ - **Husky pre-commit and pre-push hooks** — pre-commit runs `fallow` + `tsc --noEmit`; pre-push runs `npm run build` + `npm test`. Devs get fast feedback on commit and full validation before push.
180
+
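The reconciliation rules in the confidence and contradiction entry above (`min` confidence, `provenanceState = 'merged'`, union of `contradictedBy` deduped by slug, `max` `inferredParagraphs`) translate fairly directly into code. The sketch below is illustrative only. `reconcileMetadata`, `PageMetadata`, and the `{ slug }` shape of a contradiction entry are assumptions, not the project's types.

```typescript
// Assumed entry shape: the changelog only says entries are deduped by slug.
interface ContradictionRef { slug: string }

// Hypothetical subset of the optional frontmatter fields.
interface PageMetadata {
  confidence?: number;
  provenanceState?: string;
  contradictedBy?: ContradictionRef[];
  inferredParagraphs?: number;
}

function isNumber(value: number | undefined): value is number {
  return value !== undefined;
}

// Merges metadata from several sources that contribute to one slug.
function reconcileMetadata(parts: PageMetadata[]): PageMetadata {
  if (parts.length <= 1) return parts[0] ?? {};
  const merged: PageMetadata = { provenanceState: "merged" };

  // The least confident source bounds the merged page.
  const confidences = parts.map((p) => p.confidence).filter(isNumber);
  if (confidences.length > 0) merged.confidence = Math.min(...confidences);

  // Keep the largest inferred-paragraph estimate.
  const inferred = parts.map((p) => p.inferredParagraphs).filter(isNumber);
  if (inferred.length > 0) merged.inferredParagraphs = Math.max(...inferred);

  // Union of contradictions, keeping the first entry seen per slug.
  const seen = new Set<string>();
  const contradictions: ContradictionRef[] = [];
  for (const part of parts) {
    for (const ref of part.contradictedBy ?? []) {
      if (!seen.has(ref.slug)) {
        seen.add(ref.slug);
        contradictions.push(ref);
      }
    }
  }
  if (contradictions.length > 0) merged.contradictedBy = contradictions;
  return merged;
}
```

Taking the minimum confidence is the conservative choice: a merged page is never reported as more certain than its weakest contributing source.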
181
+ ### Changed
182
+
183
+ - Pre-commit/pre-push hooks pin `fallow` to `2.42.0` locally (devDep) and in CI to keep complexity thresholds stable across the team.
184
+ - `compile`'s page rendering extracted into `src/compiler/page-renderer.ts` so both direct writes and candidate generation reuse the same renderer.
185
+ - `vitest.config.ts` excludes `.claude/**` so `npm test` from the main checkout doesn't discover sibling worktrees.
186
+
187
+ ### Concurrency
188
+
189
+ - `review approve` and `review reject` acquire `.llmwiki/lock` (the same lock `compile` uses) and re-read the candidate under the lock to close the TOCTOU window between pre-check and mutation.
190
+ - When one source produces multiple candidates, source state isn't persisted until the last sibling is approved — unresolved siblings stay re-detectable on the next `compile --review`.
191
+
192
+ ### Infrastructure
193
+
194
+ - Tests grew from 222 to 291 across all new features.
195
+
196
+ ### Contributors
197
+
198
+ Thanks to **@ishan5ain** for #12 (split embedding endpoints for OpenAI-compatible providers) and **@sy2ruto** for reporting the multi-source citation lint bug (#10) — the parsing fix shipped here in PR #19.
199
+
200
+ ## [0.2.0] - 2026-04-16
201
+
202
+ First major release since 0.1.1. Ships the complete initial roadmap plus an MCP server for AI agent integration.
203
+
204
+ ### Added
205
+
206
+ - **MCP server** (`llmwiki serve`) exposes llmwiki's automated pipelines as Model Context Protocol tools so agents can ingest, compile, query, search, lint, and read pages programmatically. Ships with 7 tools and 5 read-only resources.
207
+ - **Semantic search** via embeddings — pre-filters the wiki index to the top 15 most similar pages before calling the selection LLM, with transparent fallback to full-index selection when no embeddings store exists.
208
+ - **Multi-provider support** — swap LLM backends via `LLMWIKI_PROVIDER=anthropic|openai|ollama|minimax`.
209
+ - **`llmwiki lint`** command with six rule-based checks (broken wikilinks, orphaned pages, missing summaries, duplicate concepts, empty pages, broken citations). No LLM calls, no API key required.
210
+ - **Paragraph-level source attribution** — compiled pages now include `^[filename.md]` citation markers pointing back to source files.
211
+ - **Obsidian integration** — LLM-extracted tags, deterministic aliases (slug, conjunction swap, abbreviation), and auto-generated `wiki/MOC.md` grouping concept pages by tag.
212
+ - **Anthropic provider enhancements** — `ANTHROPIC_AUTH_TOKEN` support, custom base URLs, and `~/.claude/settings.json` fallback for credentials and model.
213
+ - **MiniMax provider** via the OpenAI-compatible endpoint.
214
+ - GitHub Actions CI with Node 18/20/22 build+test matrix plus Fallow codebase health check (required for merges).
215
+
216
+ ### Changed
217
+
218
+ - Command functions (`compile`, `query`, `ingest`) now expose structured-result variants (`compileAndReport()`, `generateAnswer()`, `ingestSource()`) alongside the existing CLI-facing versions. The CLI experience is unchanged.
219
+ - `runCompilePipeline` decomposed into focused phase helpers to bring function complexity under Fallow's thresholds.
220
+
221
+ ### Infrastructure
222
+
223
+ - Tests grew from 91 to 211 across all new features.
224
+ - Fallow codebase health analyzer required in CI (no dead code, no duplication, no complexity threshold violations).
225
+
226
+ ### Contributors
227
+
228
+ Thanks to @FrankMa1, @PipDscvr, @goforu, and @socraticblock for their contributions.
229
+
230
+ ## [0.1.1] - 2026-04-07
231
+
232
+ ### Fixed
233
+
234
+ - Flaky CLI test timeout.
235
+
236
+ ## [0.1.0] - 2026-04-05
237
+
238
+ Initial release.
239
+
240
+ ### Added
241
+
242
+ - `llmwiki ingest` — fetch a URL or copy a local file into `sources/`.
243
+ - `llmwiki compile` — incremental two-phase compilation (extract concepts, then generate pages). Hash-based change detection skips unchanged sources.
244
+ - `llmwiki query` — two-step LLM-powered Q&A (index-based page selection, then streaming answer). `--save` flag writes answers as wiki pages.
245
+ - `llmwiki watch` — auto-recompile on source changes.
246
+ - Atomic writes, lock-protected compilation, orphan marking for deleted sources.
247
+ - `[[wikilink]]` resolution and auto-generated `wiki/index.md`.
248
+
249
+ [0.2.0]: https://github.com/atomicmemory/llm-wiki-compiler/compare/v0.1.1...v0.2.0
250
+ [0.1.1]: https://github.com/atomicmemory/llm-wiki-compiler/compare/v0.1.0...v0.1.1
251
+ [0.1.0]: https://github.com/atomicmemory/llm-wiki-compiler/releases/tag/v0.1.0
package/README.md CHANGED
@@ -22,12 +22,13 @@ export ANTHROPIC_API_KEY=sk-...
22
22
  # export LLMWIKI_PROVIDER=openai
23
23
  # export OPENAI_API_KEY=sk-...
24
24
 
25
- llmwiki ingest https://some-article.com
26
- llmwiki compile
25
+ llmwiki quickstart ./notes.md
27
26
  llmwiki query "what is X?"
28
27
  llmwiki view --open
29
28
  ```
30
29
 
30
+ `llmwiki quickstart ./notes.md` ingests one supported source, compiles the wiki, and opens the local viewer when pages are ready. Use `--no-open` to stop after compile, `--review` to queue candidates instead of writing pages, or `--json` for an agent-friendly envelope. If you're inside an existing project and unsure what to do next, run `llmwiki next`.
31
+
31
32
 
32
33
  <br>
33
34
 
@@ -247,6 +248,7 @@ Pages include source attribution in frontmatter. Paragraphs are annotated with `
247
248
  |---------|-------------|
248
249
  | `llmwiki ingest <url\|file>` | Fetch a URL or copy a local file into `sources/` |
249
250
  | `llmwiki ingest-session <path>` | Import a Claude/Codex/Cursor session export (single file or whole directory) into `sources/` |
251
+ | `llmwiki quickstart <source>` | Ingest a source and compile a wiki in one step; supports `--review`, `--no-open`, `--provider`, `--lang`, and `--json` |
250
252
  | `llmwiki compile` | Incremental compile: extract concepts, generate wiki pages |
251
253
  | `llmwiki compile --review` | Write candidate pages to `.llmwiki/candidates/` instead of `wiki/` so you can review before they land |
252
254
  | `llmwiki compile --lang <code>` | Generate wiki content in the given language (e.g. `Chinese`, `ja`, `zh-CN`); also works on `query` |
@@ -260,10 +262,20 @@ Pages include source attribution in frontmatter. Paragraphs are annotated with `
260
262
  | `llmwiki query "question" --save` | Answer and save the result as a wiki page |
261
263
  | `llmwiki export [--target <name>]` | Export the wiki to portable formats — `llms.txt`, `llms-full.txt`, JSON, JSON-LD, GraphML, Marp slides |
262
264
  | `llmwiki view [--open]` | Start a read-only local web viewer for browsing, searching, and inspecting the compiled wiki |
265
+ | `llmwiki next [--json]` | Show the recommended next action for this project (read-only); `--json` emits a stable envelope for agents |
266
+ | `llmwiki context "<prompt>" [--json]` | Build an agent-ready evidence pack (primary pages, citations, neighbors, suggested actions) — same v1 envelope as MCP `get_context_pack` |
263
267
  | `llmwiki lint` | Check wiki quality (broken links, orphans, empty pages, low confidence, contradictions, etc.) |
268
+ | `llmwiki eval [--suite fast\|full]` | Measure wiki quality: health score (0–100), citation coverage, corpus stats. `--suite full` adds LLM-as-judge citation support scoring |
269
+ | `llmwiki eval cache show` | Print score distribution and top-cited pages from the citation judgement cache |
270
+ | `llmwiki eval cache clear` | Remove the citation judgement cache |
271
+ | `llmwiki eval report` | Print the most recent eval report |
272
+ | `llmwiki eval history [--n N]` | Show a trend table of past eval runs from `history.jsonl` |
273
+ | `llmwiki eval judgements [--score 0\|1\|2] [--page slug]` | Inspect individual citation judgements with optional score or page filters |
264
274
  | `llmwiki watch` | Auto-recompile when `sources/` changes |
265
275
  | `llmwiki serve [--root <dir>]` | Start an MCP server exposing wiki tools to AI agents |
266
276
 
277
+ `llmwiki context --include-sources` and MCP `get_context_pack` with `includeSources: true` are opt-in because they can return raw snippets from files under `sources/`. Path confinement prevents reads outside `sources/`, but enable source windows only for agents you trust with the ingested source text.
278
+
267
279
  ## Output
268
280
 
269
281
  ```
@@ -369,6 +381,51 @@ The schema supports four page kinds:
369
381
 
370
382
  Schema rules can set per-kind `minWikilinks` and optional `seedPages`. Compile can materialize seed pages such as overviews, lint enforces page-kind-specific cross-link minimums, and review candidates surface schema violations before approval.
371
383
 
384
+ ## Eval / quality measurement
385
+
386
+ `llmwiki eval` gives the wiki a quantitative health score and tracks citation quality over time, making it possible to detect regressions after a recompile.
387
+
388
+ ```bash
389
+ llmwiki eval # fast suite: health score, citation coverage, corpus stats
390
+ llmwiki eval --suite full # + LLM-as-judge citation support scoring (requires API)
391
+ llmwiki eval report # re-print the most recent report
392
+ llmwiki eval history # trend table across past runs
393
+ llmwiki eval history --n 10 # limit to last 10 entries
394
+ llmwiki eval judgements # all cached citation judgements
395
+ llmwiki eval judgements --score 0 # only unsupported citations
396
+ llmwiki eval judgements --page some-slug # filter to one page
397
+ llmwiki eval cache show # score distribution + top-cited pages
398
+ llmwiki eval cache clear # wipe the citation judgement cache
399
+ ```
400
+
401
+ **What it measures:**
402
+
403
+ - **Health score (0–100)** aggregates all lint rules. Errors (broken citations, broken wikilinks, duplicate concepts) cost more than warnings.
404
+ - **Citation coverage** — fraction of prose paragraphs that carry a `^[...]` marker, plus citation precision (fraction of citations pointing to existing source files).
405
+ - **Citation support (full suite)** — samples up to N `(claim, source span)` pairs, asks a judge model to score each 0–2 (unsupported → fully supported), and caches results so subsequent runs only re-judge new pairs.
406
+ - **Corpus stats** — page count, source count, total wiki characters, embedding counts, appended to `history.jsonl` for trend tracking.
407
+ - **Regression deltas** — current report is diffed against the previous entry in history.
408
+
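To make the coverage and precision definitions above concrete, here is a minimal Python sketch. It is not the tool's implementation: the function name `citation_metrics`, the rule that every non-heading block counts as a prose paragraph, and the idea that a marker's body is a source file name are all assumptions made for illustration.

```python
import re

# Matches an inline citation marker of the form ^[...] and captures its body.
CITATION = re.compile(r"\^\[([^\]]+)\]")


def citation_metrics(page_text: str, known_sources: set[str]) -> dict[str, float]:
    """Return coverage and precision for one page (hypothetical helper).

    coverage  = prose paragraphs with at least one marker / all prose paragraphs
    precision = citations naming a known source / all citations
    """
    # Assumption: paragraphs are separated by blank lines, headings are not prose.
    blocks = [block.strip() for block in re.split(r"\n\s*\n", page_text)]
    prose = [block for block in blocks if block and not block.startswith("#")]

    cited = [block for block in prose if CITATION.search(block)]
    citations = [c.strip() for block in prose for c in CITATION.findall(block)]
    valid = [c for c in citations if c in known_sources]

    coverage = len(cited) / len(prose) if prose else 0.0
    precision = len(valid) / len(citations) if citations else 0.0
    return {"coverage": coverage, "precision": precision}
```

A page with three paragraphs, two of them cited and one citation pointing at a missing file, would score roughly 0.67 coverage and 0.5 precision under these assumptions.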
409
+ **CI thresholds:** add `.llmwiki/eval/thresholds.yaml` to configure minimum acceptable scores:
410
+
411
+ ```yaml
412
+ health_score: 85
413
+ citation_coverage_percent: 70
414
+ citation_precision_percent: 90
415
+ citation_support_mean: 1.4 # only checked when --suite full
416
+ ```
417
+
418
+ Threshold violations are listed in the report, and the command exits with a non-zero code when any threshold is breached, which makes it suitable for CI gating.
419
+
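The gating behaviour described above can be sketched as follows. This is an illustration only, not the tool's code: `load_thresholds`, `find_violations`, and `exit_code` are hypothetical names, the report is assumed to use the same keys as `thresholds.yaml`, and the tiny parser handles only flat `key: number` lines rather than full YAML.

```python
def load_thresholds(text: str) -> dict[str, float]:
    """Parse flat `key: number` lines, ignoring blank lines and `#` comments."""
    thresholds: dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition(":")
        thresholds[key.strip()] = float(value)
    return thresholds


def find_violations(report: dict[str, float], thresholds: dict[str, float]) -> list[str]:
    """List every metric in the report that falls below its configured minimum."""
    violations = []
    for key, minimum in thresholds.items():
        actual = report.get(key)
        if actual is None:
            # Assumption: a metric absent from the report (for example
            # citation support in the fast suite) is simply not checked.
            continue
        if actual < minimum:
            violations.append(f"{key}: {actual} < {minimum}")
    return violations


def exit_code(violations: list[str]) -> int:
    """Non-zero when any threshold is breached, zero otherwise."""
    return 1 if violations else 0
```

A CI step would then fail whenever `exit_code(find_violations(report, thresholds))` is non-zero.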
420
+ **Artifacts** written under `.llmwiki/eval/`:
421
+
422
+ ```
423
+ .llmwiki/eval/
424
+ history.jsonl one JSON line per eval run
425
+ citation-cache.jsonl one JSON line per citation judgement
426
+ thresholds.yaml optional CI threshold config
427
+ ```
428
+
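Because `history.jsonl` holds one JSON line per run, a regression delta can be thought of as a field-by-field difference between the last two lines. The sketch below assumes each line is a flat JSON object of numeric metrics; the real record layout is not documented here, and `regression_deltas` and the field names in the usage note are hypothetical.

```python
import json


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def regression_deltas(history_lines: list[str]) -> dict[str, float]:
    """Diff the newest run against the previous one (current minus previous)."""
    runs = [json.loads(line) for line in history_lines if line.strip()]
    if len(runs) < 2:
        return {}  # nothing to compare against on the first run
    previous, current = runs[-2], runs[-1]
    return {
        key: current[key] - previous[key]
        for key in current
        if _is_number(current[key]) and _is_number(previous.get(key))
    }
```

Given two runs with `health_score` values of 90 and then 86, the delta for that field is -4, which is the kind of drop a recompile regression would show.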
372
429
  </details>
373
430
 
374
431
 
@@ -385,10 +442,8 @@ Try it on any article or document:
385
442
 
386
443
  ```bash
387
444
  mkdir my-wiki && cd my-wiki
388
- llmwiki ingest https://en.wikipedia.org/wiki/Andrej_Karpathy
389
- llmwiki compile
445
+ llmwiki quickstart https://en.wikipedia.org/wiki/Andrej_Karpathy
390
446
  llmwiki query "What terms did Andrej coin?"
391
- llmwiki view --open
392
447
  ```
393
448
 
394
449
  See `examples/basic/` in the repo for pre-generated output you can browse without an API key.
@@ -437,7 +492,7 @@ Add to your client's MCP config (e.g. `claude_desktop_config.json`):
437
492
  }
438
493
  ```
439
494
 
440
- Tools that need an LLM (`compile_wiki`, `query_wiki`, `search_pages`) check for a configured provider on each call. Read-only tools (`read_page`, `lint_wiki`, `wiki_status`) and `ingest_source` work without any credentials.
495
+ Tools that need an LLM (`compile_wiki`, `query_wiki`, `search_pages`) check for a configured provider on each call. Read-only tools (`read_page`, `lint_wiki`, `wiki_status`) and `ingest_source` work without any credentials. `get_context_pack` is also read-only, and provider credentials are optional for it. When credentials are present, it uses semantic retrieval; otherwise it falls back to lexical ranking and surfaces an `embedding-store-missing` or `query-embedding-unavailable` warning.
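The fallback described for `get_context_pack` can be pictured with a small Python sketch. Only the two warning names come from the text above. Everything else is assumed for illustration: the `rank_pages` function, the token-overlap lexical score, the cosine-similarity semantic score, and the exact condition under which each warning is raised.

```python
import math
from typing import Callable, Optional

Vector = list[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, returning 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def lexical_score(query: str, text: str) -> int:
    """Count distinct query tokens that also appear in the page text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def rank_pages(
    query: str,
    pages: dict[str, str],
    store: Optional[dict[str, Vector]] = None,
    embed: Optional[Callable[[str], Vector]] = None,
) -> tuple[list[str], list[str]]:
    """Rank page slugs semantically when possible, lexically otherwise."""
    warnings: list[str] = []
    if store is None:
        warnings.append("embedding-store-missing")
    elif embed is None:
        warnings.append("query-embedding-unavailable")

    if warnings:
        # Fallback path: no provider call is needed.
        scores = {slug: float(lexical_score(query, text)) for slug, text in pages.items()}
    else:
        query_vector = embed(query)
        scores = {slug: cosine(query_vector, store[slug]) for slug in pages if slug in store}

    # Highest score first; ties broken by slug for a stable order.
    ranked = sorted(scores, key=lambda slug: (-scores[slug], slug))
    return ranked, warnings
```

The point of the design is that the caller always receives a ranking plus a list of warnings, so an agent can tell a degraded lexical result from a semantic one without the call failing.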
441
496
 
442
497
  ### Tools
443
498
 
@@ -450,6 +505,7 @@ Tools that need an LLM (`compile_wiki`, `query_wiki`, `search_pages`) check for
450
505
  | `read_page` | Read a single page by slug (concepts/ then queries/). |
451
506
  | `lint_wiki` | Run quality checks; returns structured diagnostics. |
452
507
  | `wiki_status` | Page count, source count, orphans, pending changes (read-only). |
508
+ | `get_context_pack` | Build an agent-ready evidence pack (primary pages, semantic chunks, graph neighbors, citations, warnings, suggested actions) — same v1 JSON envelope as `llmwiki context --json`. `get_context_pack` **packages evidence**; `query_wiki` **generates answers**. |
453
509
 
454
510
  ### Resources
455
511
 
@@ -496,6 +552,13 @@ Karpathy describes an abstract pattern for turning raw data into compiled knowle
496
552
 
497
553
  ## Roadmap
498
554
 
555
+ Shipped in 0.8.0:
556
+
557
+ - ✅ Guided project flow — `llmwiki next` recommends the next useful command, and `llmwiki quickstart <source>` ingests, compiles, and opens the viewer in one step
558
+ - ✅ Graph/context layer — `llmwiki context` and MCP `get_context_pack` produce token-budgeted evidence packs with primary pages, graph neighbors, citations, optional source windows, warnings, and suggested actions
559
+ - ✅ Viewer graph route — `llmwiki view` includes a force-directed `#/graph` route for exploring page relationships
560
+ - ✅ Evaluation harness — `llmwiki eval` measures health score, citation coverage/precision, corpus stats, regression deltas, optional LLM-as-judge citation support, and CI thresholds
561
+
499
562
  Shipped in 0.7.0:
500
563
 
501
564
  - ✅ Read-only local web viewer — `llmwiki view` with sidebar navigation, markdown rendering, search, metadata, health counts, and provenance/citation chips
@@ -536,11 +599,10 @@ Shipped in 0.2.0:
536
599
 
537
600
  Next up:
538
601
 
539
- - **Graph/context layer** — page-neighborhood tools, graph paths, gap detection, and token-budgeted context packs for agents.
540
- - **Evaluation harness** — benchmark answer quality, citation accuracy, update drift, retrieval recall, and scale curves against serious retrieval baselines.
541
602
  - **Task and decision ledger** — turn session ingest into durable agent memory: goals, decisions, open questions, outcomes, and next-agent handoffs.
542
603
  - **Rollback, audit, and source lifecycle** — undo/reverse ingest, compile diff reports, stale-claim checks, freshness reports, and a durable operation log.
543
604
  - **Domain templates** — schema/prompt packs for research, codebase docs, team handbooks, decision logs, and standards/regulations.
605
+ - **Eval extensions** — retrieval recall suites, update-drift benchmarks, and comparisons against serious retrieval baselines.
544
606
 
545
607
  Later / open to discussion:
546
608
 
@@ -549,7 +611,7 @@ Later / open to discussion:
549
611
  - Codex OAuth provider — ChatGPT subscription auth as a dedicated provider, with clear token refresh and embedding-limit behavior
550
612
  - Team-chat connectors for Slack/Discord/Teams-style institutional memory
551
613
 
552
- If you like ambitious problems: **local web UI**, **graph/context packs**, and **eval harness** are the meatiest next contributions. Open an issue to claim one or kick off a design discussion.
614
+ If you like ambitious problems: **task/decision ledger**, **rollback/audit tooling**, and **eval extensions** are the meatiest next contributions. Open an issue to claim one or kick off a design discussion.
553
615
 
554
616
  Explicitly not planned (good ideas, just not for this repo): full static-site generator, desktop or mobile apps, fine-tuning, a formal ontology engine, heavy graph database infrastructure.
555
617