llm-wiki-compiler 0.7.0 → 0.9.0

package/CHANGELOG.md ADDED
@@ -0,0 +1,281 @@
1
+ # Changelog
2
+
3
+ All notable changes to this project will be documented in this file.
4
+
5
+ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
6
+ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
+
8
+ ## [0.9.0] - 2026-06-08
9
+
10
+ Adds an end-to-end source-freshness loop — detect stale pages, surface them everywhere, and repair them with a targeted recompile — plus an in-process SDK with source-backed write APIs, a JSON export bridge contract for downstream importers, richer eval metrics, rule-candidate extraction, and a local-login Claude Agent provider.
11
+
12
+ ### Added
13
+
14
+ - **Source freshness** — `llmwiki lint` flags pages whose sources changed (`stale`) or were all deleted (`orphaned`) since the last compile, computed on demand from `.llmwiki/state.json` and the current `sources/`. Freshness is surfaced across MCP (`wiki_status` stale/orphaned lists and a `stateStatus` field, plus `get_context_pack`), context packs (per-page `freshnessStatus`/`contradicted`/`archived` and a `stale-page` warning), the local viewer (STALE/ORPHANED/CONTRADICTED/ARCHIVED badges, a per-axis filter, health-pane counts, and a corrupt-state banner), the JSON export, and `llmwiki next`.
15
+ - **`llmwiki refresh --stale [--dry-run]`** — a targeted recompile that repairs stale/orphaned pages by recompiling their changed owning sources and cleaning up deleted owners, while deliberately skipping unrelated new sources. `--dry-run` previews the plan with no LLM calls and no writes; cleanup-only refreshes require no API key.
16
+ - **JSON export bridge contract** — `llmwiki export --target json --project-id <id>` adds per-page `path`, `kind`, advisory confidence/provenance, flattened citations, aliases, and freshness so downstream importers (e.g. [`@atomicmemory/llmwiki`](https://github.com/atomicstrata/atomicmemory/tree/main/packages/llmwiki)) can ingest pages as durable memory records.
17
+ - **Eval over MCP** — a `run_eval` MCP tool (the fast suite needs no API key; the full suite LLM-judges a sample of citations), plus read-only `llmwiki://eval/report` and `llmwiki://eval/history` resources.
18
+ - **Eval source-utilization metrics** — source-utilization and citation-depth dimensions, surfaced source warnings, a frame-safe report, and a `source_warnings_max` CI gate.
19
+ - **Rule-candidate extraction** — extract reusable rule candidates from sources with review/approve and a JSON export pipeline.
20
+ - **In-process SDK** — `createWiki()` exposes the compiler in-process, with source-backed write APIs (`writeStatus`, `listSources`/`getSource`/`deleteSource`) for programmatic callers.
21
+ - **Claude Agent SDK provider** — a provider that authenticates through a local Claude Code login and uses bundled plan tokens, so no separate API key is required.
22
+ - **Alias-aware wikilinks** — the viewer resolves a `[[term]]` link to any page that declares `term` in its `aliases` frontmatter, not just an exact slug match.
23
+ - **Append-only activity journal** — an append-only `log.md` records ingest, compile, review, and export activity.
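The stale/orphaned distinction in the source-freshness entry above can be illustrated with a minimal sketch. The shapes and names here (`PageState`, `classifyFreshness`) are hypothetical, since the changelog does not describe the layout of `.llmwiki/state.json`. Treating a page with only some owners deleted as `stale` is also an assumption.

```typescript
// Hypothetical shapes: the real .llmwiki/state.json layout is not described in this changelog.
type Freshness = "fresh" | "stale" | "orphaned";

interface PageState {
  // Owning source path -> content hash recorded at the last compile.
  sources: Record<string, string>;
}

// Classify one page by comparing recorded hashes with the hashes of the current sources/.
function classifyFreshness(
  page: PageState,
  currentHashes: Record<string, string>,
): Freshness {
  const owners = Object.keys(page.sources);
  const surviving = owners.filter((path) => path in currentHashes);
  // Every owning source was deleted since the last compile.
  if (owners.length > 0 && surviving.length === 0) return "orphaned";
  // At least one owner is missing or has a different hash (assumption: partial deletion counts as stale).
  const changed = owners.some((path) => currentHashes[path] !== page.sources[path]);
  return changed ? "stale" : "fresh";
}
```

Because the classification only needs the recorded hashes and the current ones, a check like this can run on demand without LLM calls, which matches how the entry describes `llmwiki lint`.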
24
+
25
+ ### Changed
26
+
27
+ - Upgraded core dependencies — zod 3 → 4, openai 4 → 6, and `@anthropic-ai/sdk` 0.39 → 0.101 — and bumped the default model to `claude-sonnet-4-6` (the previous default was deprecated).
28
+
29
+ ### Fixed
30
+
31
+ - Three read-only paths (`wiki_status`, the `llmwiki://state` MCP resource, and the viewer startup snapshot) no longer write a `.bak` file when `.llmwiki/state.json` is corrupt; corrupt and missing state are now surfaced explicitly instead of being swallowed.
32
+ - `wiki_status` derives pending source changes from the freshness snapshot instead of running a redundant second source-hash pass.
33
+
34
+ ### Contributors
35
+
36
+ Thanks to **@alvins82** for the Claude Agent SDK provider (#81) and the append-only activity journal (#85), **@dohu012** for source-utilization and citation-depth eval metrics (#86), and **@joshuaknipe** for the `run_eval` MCP tool and eval resources (#74).
37
+
38
+ ## [0.8.0] - 2026-05-26
39
+
40
+ Adds guided project next steps, one-command quickstart, agent-ready context graph packs, a viewer graph route, and the first eval harness for measuring wiki quality over time.
41
+
42
+ ### Added
43
+
44
+ - **`llmwiki next`** — inspects the current project and recommends the next useful command. Human output is concise; `--json` emits a stable envelope for agents.
45
+ - **`llmwiki quickstart <source>`** — ingests one source, compiles the wiki, and opens the local viewer when pages are ready. Supports `--review`, `--no-open`, `--provider`, `--lang`, and `--json`.
46
+ - **`llmwiki context "<prompt>"`** — builds an agent-ready evidence pack with primary pages, semantic chunks when available, graph neighbors, citations, gaps, warnings, and suggested actions. `--include-sources` can add path-confined source windows.
47
+ - **MCP `get_context_pack`** — exposes the same v1 context-pack envelope over MCP. It packages evidence for agents; `query_wiki` remains the answer-generation tool.
48
+ - **Viewer graph route** — the local web viewer now includes a force-directed graph at `#/graph`, plus navigation polish so graph/page/sidebar state stays in sync.
49
+ - **`llmwiki eval`** — measures wiki health score, citation coverage, citation precision, corpus stats, regression deltas, and optional LLM-as-judge citation support. Includes `eval report`, `eval history`, `eval judgements`, and `eval cache` subcommands.
50
+ - **Eval thresholds** — `.llmwiki/eval/thresholds.yaml` can gate health, citation coverage, citation precision, and full-suite citation support scores in CI.
51
+
52
+ ### Fixed
53
+
54
+ - **Pipe-alias wikilinks** — `[[slug|Display Text]]` is now detected correctly by lint and viewer link tooling.
55
+ - **Source-path confinement in eval** — citation-support judging resolves source paths through a shared confinement helper, including encoded traversal edge cases.
56
+ - **Eval cache invalidation** — citation judgements include judge configuration in the cache key, so changing model/provider settings re-judges affected claims instead of reusing stale scores.
57
+ - **Sample validation in eval** — invalid `--sample` values fail loudly instead of falling through to surprising sampling behavior.
58
+ - **Contributor docs** — upstream remote setup now points at the current `atomicstrata` GitHub org.
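As an illustration of the pipe-alias fix listed above, the sketch below shows one way a link scanner could recognise both `[[slug]]` and `[[slug|Display Text]]`. The function name `extractWikilinks` and the returned shape are assumptions for illustration, not the project's actual lint or viewer code.

```typescript
interface Wikilink {
  target: string;  // slug before the pipe
  display: string; // text after the pipe, or the slug when no alias is given
}

// Hypothetical helper: collect wikilinks, splitting an optional "|Display Text" alias.
function extractWikilinks(body: string): Wikilink[] {
  const pattern = /\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]/g;
  const links: Wikilink[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    const target = (match[1] ?? "").trim();
    const display = (match[2] ?? target).trim();
    links.push({ target, display });
  }
  return links;
}
```

Resolving the link against `target` rather than the full bracket contents is what keeps a display alias from being mistaken for part of the slug.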
59
+
60
+ ### Changed
61
+
62
+ - **Fallow upgraded to 2.82.0** with follow-on code-health cleanup across CLI, viewer, adapters, watch, and tests. The CI action is pinned to the matching signed release so binary verification uses embedded platform digests rather than unauthenticated GitHub API lookups.
63
+
64
+ ### Test infrastructure
65
+
66
+ - Hardened the basename-collision CLI tests with explicit timeouts.
67
+ - Hardened local Vitest timeouts for subprocess-heavy integration tests.
68
+ - Changed the npm publish preflight to run release-doc checks, build, and a dry-run package check; the full test suite remains enforced by CI.
69
+ - Added coverage for quickstart/next JSON envelopes, context packs, MCP context packs, graph rendering, eval reports/history/cache/thresholds, and release-doc checks.
70
+
71
+ ### Contributors
72
+
73
+ Thanks to **@joshuaknipe** for a major release's worth of contributions: pipe-alias wikilink fixes (#61), upstream docs cleanup (#62), the viewer graph route (#63), and the eval harness with health scoring, citation quality, and corpus stats (#67).
74
+
75
+ ## [0.7.0] - 2026-05-18
76
+
77
+ Adds the first local web viewer for compiled wikis, a GitHub Copilot provider, and a persisted lint summary that lets the viewer report wiki health without re-running lint on every page load.
78
+
79
+ ### Added
80
+
81
+ - **`llmwiki view`** — starts a local read-only web viewer for the current project. The viewer includes a sidebar grouped by concepts and saved queries, a dashboard home, markdown rendering, wikilinks, title/body search, page metadata, health counts, and provenance/citation support rails.
82
+ - **Citation chips in the viewer** — paragraph citations and claim-level source ranges render as visible chips. On loopback binds, chips can include local editor links for source-line context; LAN binds omit filesystem paths and editor links.
83
+ - **Secure-by-default local server** — `view` binds to `127.0.0.1` by default, uses an OS-assigned port unless `--port` is provided, and requires `--host <host>` and `--allow-lan` together before binding beyond loopback. The server applies pinned CSP / CORP / nosniff / referrer headers, Host / Origin / Sec-Fetch checks, and path confinement for all served files.
84
+ - **Viewer health payload** — `/api/health` exposes cheap project counts, pending review count parity with MCP `wiki_status`, and the latest cached lint summary when available.
85
+ - **GitHub Copilot provider** — `LLMWIKI_PROVIDER=copilot` uses the GitHub Copilot API with `GITHUB_TOKEN=$(gh auth token)` from an OAuth token that has the `copilot` scope. Copilot supports chat/tool calls but does not expose embeddings, so embedding-dependent semantic search should use another provider.
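The bind rules in the secure-by-default entry above can be sketched as a small validation step. Everything named here (`ViewOptions`, `resolveBind`, the error text) is hypothetical. The sketch also assumes that `--allow-lan` without `--host` simply keeps the loopback default, which the changelog does not specify.

```typescript
interface ViewOptions {
  host?: string;     // value of --host, if given
  port?: number;     // value of --port, if given
  allowLan?: boolean; // whether --allow-lan was passed
}

interface BindTarget {
  host: string;
  port: number;
}

// Loopback check limited to common literal forms; hostnames are not resolved.
function isLoopback(host: string): boolean {
  return host === "localhost" || host === "::1" || /^127\.\d{1,3}\.\d{1,3}\.\d{1,3}$/.test(host);
}

// Hypothetical helper: decide where the viewer may listen.
function resolveBind(options: ViewOptions): BindTarget {
  const host = options.host ?? "127.0.0.1";
  if (!isLoopback(host) && options.allowLan !== true) {
    throw new Error(`Refusing to bind to ${host} without --allow-lan`);
  }
  // Port 0 asks the operating system to assign a free port.
  return { host, port: options.port ?? 0 };
}
```

Requiring two separate flags means a single mistyped option cannot expose the viewer beyond loopback.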
86
+
87
+ ### Changed
88
+
89
+ - **`llmwiki lint` now writes `.llmwiki/last-lint.json`** after each completed lint run so the viewer can show a recent lint summary without running lint on every page load.
90
+ - **Shared wiki page collection** — export and viewer collection now share the lower-level wiki collector while preserving each surface's own filtering and payload shape.
91
+
92
+ ### Test infrastructure
93
+
94
+ - Added subprocess, path-safety, sanitizer, accessibility, JS DOM, pack-asset, and server-security coverage for the viewer. Tests grew from 632 to 850 in this release.
95
+
96
+ ### Contributors
97
+
98
+ Thanks to **@cadamsdev** for contributing the GitHub Copilot provider in PR #55.
99
+
100
+ ## [0.6.0] - 2026-05-02
101
+
102
+ Adds session-history ingest (Claude / Codex / Cursor exports), configurable output language, and a defensive cap that prevents `compile` from crashing on popular concepts. Closes a batch of CJK, collision, and silent-loss bugs in the ingest path. Tightens `compile --review` so candidates carry both schema and provenance lint findings before approval. Extracts a shared `ProvenanceMetadata` shape and replaces an unreliable extraction-time LLM estimate with body-derived counts.
103
+
104
+ ### Added
105
+
106
+ - **`llmwiki ingest-session <path>`** — imports AI coding-session exports as wiki sources. Auto-detects three formats: Claude (`.jsonl`), Codex (`.json`), Cursor (`.json`, both `tabs` and flat schemas). Single file or whole directory. Each session lands in `sources/<slug>.md` with frontmatter recording the adapter, source path, ingest timestamp, and (where available) session start/end times. Adapter validation requires ≥ 1 user-or-assistant turn — recognised-but-empty exports fail loudly instead of producing a content-free page.
107
+ - **`LLMWIKI_OUTPUT_LANG` env var + `--lang <code>` CLI flag** on `compile` and `query`. When set, every prompt builder (extraction, page generation, seed page, query answer) appends `Write the output in <lang>.` to the system prompt. Unset preserves current behaviour byte-for-byte. Useful for `--lang Chinese`, `--lang Japanese`, etc.
108
+ - **`compile --review` provenance lint** — review candidates now carry both `schemaViolations` and `provenanceViolations` (malformed claim citations, broken-source / out-of-bounds line spans). `review show` prints both blocks. Reviewers see citation issues before approving a page rather than discovering them on a later compile.
109
+ - **`npm run fallow:ci`** — contributor script that runs `fallow` with the same `--changed-since <PR-base-sha>` scoping the GitHub Action uses, so most CI fallow findings surface locally before pushing. Documented in CONTRIBUTING.md (including the fork-workflow `upstream/main` resolution and the platform-binary parity caveat).
110
+
111
+ ### Fixed
112
+
113
+ - **Non-ASCII filename ingest** (#35) — `slugify` previously used `\w` without the `/u` flag, so titles like `测试文档` collapsed to the empty string and `ingest` wrote `sources/.md` (a dotfile that subsequent CJK ingests would overwrite). `slugify` now uses Unicode property escapes (`\p{L}`, `\p{N}`); pure-emoji titles that still strip to `""` fail with an actionable error rather than writing a dotfile.
114
+ - **Same-basename source collision** (#36) — two distinct sources slugifying to the same name (e.g. `a/notes.md` and `b/notes.md`) used to silently overwrite. `saveSource` now checks for the collision and falls through to `<slug>-<8-hex-of-source>.md` when the existing file's frontmatter `source` doesn't match. Re-ingesting the same source still overwrites in place — no duplicate accumulation.
115
+ - **Compile crash on popular concepts** (#39) — `mergeExtractions` used to concatenate every contributing source's full content into the page-generation prompt. Linear in source count; reliably blew past the LLM provider's context window once many sources discussed the same topic. New defensive cap (`LLMWIKI_PROMPT_BUDGET_CHARS`, default 200,000) gives every contributing source a fair share of the budget when the raw total would overflow, with a clear truncation marker. Typical workloads stay byte-identical.
116
+ - **Body-derived `excess-inferred-paragraphs`** — the lint rule used to trust an LLM-estimated `inferredParagraphs` frontmatter field when present, falling back to body counting. The estimate was made before the page even existed and routinely disagreed with what the model actually produced. The rule now unconditionally counts uncited prose paragraphs in the rendered body, with Unicode-aware prose detection (`\p{L}`) so pages produced via `--lang Chinese` etc. are correctly counted. Legacy `inferredParagraphs` frontmatter values are intentionally ignored.
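The non-ASCII filename fix above comes down to which characters the slug pattern keeps. The two functions below are a minimal sketch of the before/after behaviour, not the project's actual `slugify`; their names and the error message are illustrative assumptions.

```typescript
// ASCII-oriented version: \w matches only [A-Za-z0-9_], so a CJK title loses every character.
function slugifyAscii(title: string): string {
  return title.toLowerCase().replace(/[^\w]+/g, "-").replace(/^-+|-+$/g, "");
}

// Unicode-aware version: keep any letter (\p{L}) or number (\p{N}); requires the u flag.
function slugifyUnicode(title: string): string {
  const slug = title
    .toLowerCase()
    .replace(/[^\p{L}\p{N}]+/gu, "-")
    .replace(/^-+|-+$/g, "");
  if (slug === "") {
    // Fail loudly instead of producing an empty file name such as "sources/.md".
    throw new Error(`Cannot derive a slug from title ${JSON.stringify(title)}`);
  }
  return slug;
}
```

With these definitions, `slugifyAscii("测试文档")` yields an empty string, while `slugifyUnicode("测试文档")` keeps the title intact and a pure-emoji title raises an error.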
117
+
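The prompt-budget cap described in the compile-crash fix above can be sketched as follows. This is a simplified equal-share version under stated assumptions: the function name, the marker text, and the choice not to redistribute unused share are all hypothetical. Only the 200,000-character default comes from the changelog.

```typescript
// Hypothetical marker text; the changelog only says a clear truncation marker is added.
const TRUNCATION_MARKER = "\n[truncated]";

// Give each contributing source an equal share of the budget, but only when the total overflows.
function applyPromptBudget(sources: string[], budgetChars: number = 200000): string[] {
  const total = sources.reduce((sum, text) => sum + text.length, 0);
  // Typical case: everything fits, so inputs pass through unchanged.
  if (total <= budgetChars) return sources;
  const share = Math.floor(budgetChars / sources.length);
  return sources.map((text) => {
    if (text.length <= share) return text;
    const keep = Math.max(0, share - TRUNCATION_MARKER.length);
    return text.slice(0, keep) + TRUNCATION_MARKER;
  });
}
```

Returning the input untouched when the total fits is what keeps typical workloads byte-identical. Note that this sketch assumes each share is at least as long as the marker.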
118
+ ### Changed
119
+
120
+ - **`ProvenanceMetadata` is now a single shared interface** in `src/utils/types.ts` that both `ExtractedConcept` and `WikiFrontmatter` extend. Drops the duplicate private declaration that had drifted into `src/utils/markdown.ts`. JSON shapes serialised on disk and over the LLM tool boundary are byte-identical to before — pure refactor.
121
+ - **`inferredParagraphs` is no longer written to frontmatter or sent to the LLM extractor**. The count is now derived from the page body at lint time. Old on-disk pages with the field still parse; the loader ignores the unrecognised key.
122
+ - **`CompileResult.pages` now includes seed-page slugs** alongside concept-page slugs. Seed pages used to land on disk silently and stay absent from the result; downstream consumers (MCP, embeddings, programmatic callers) had no way to discover them without scanning `wiki/`. They're also threaded into `finalizeWiki` so `resolveLinks` and `updateEmbeddings` cover them.
123
+ - **Lint helper dedupe** — `checkSchemaCrossLinks` (on-disk walker) now delegates to `checkPageCrossLinks` (per-page) so the `schema-cross-link-minimum` rule lives in exactly one place.
124
+
125
+ ### Test infrastructure
126
+
127
+ - **`useIngestWorkspaces` and `useAimockLifecycle.findSystemPromptByUserMessage`** composables in `test/fixtures/` consolidate temp-workspace and aimock recording boilerplate that had drifted across multiple integration tests.
128
+ - Tests grew from 480 (post-0.5.1) to 632 in this release.
129
+
130
+ ### Contributors
131
+
132
+ Thanks to **@lllcccwww** for filing four high-quality bug reports back-to-back (#35, #36, #37, #39) — every one had a clear repro and pointed at the offending file:line, which made the fixes obvious. Also thanks to **@babysource** for asking about embedding configuration (#42) and **@ishan5ain** for volunteering to take on the read-only Web UI roadmap item (#38).
133
+
134
+ ## [0.5.1] - 2026-04-27
135
+
136
+ Patch release fixing a CLI startup crash that broke 0.5.0 for everyone installing via npm.
137
+
138
+ ### Fixed
139
+
140
+ - **Startup crash on `llmwiki <any-command>`** — 0.5.0 imported `youtube-transcript/dist/youtube-transcript.esm.js` (a deep subpath that worked around a broken `main` entry in v1.3.0). v1.3.1 added a proper `exports` map that no longer exposes that subpath, so any `npm install -g llm-wiki-compiler@0.5.0` produced `ERR_PACKAGE_PATH_NOT_EXPORTED` on first command. Switched to importing from the package root (which the new `exports` map covers) and bumped `youtube-transcript` to `^1.3.1`.
141
+
142
+ ### Contributors
143
+
144
+ Thanks to **@lllcccwww** for reporting (#33) and **@ishan5ain** for the fix (#34) — both very fast turnarounds.
145
+
146
+ ## [0.5.0] - 2026-04-27
147
+
148
+ Adds multimodal ingest (images, PDFs, transcripts) and chunk-level semantic retrieval with reranking and a `--debug` view. Also raises the minimum Node version to 24 so the project can use modern test-mocking tooling that depends on Node 24+ APIs.
149
+
150
+ ### Added
151
+
152
+ - **Multimodal ingest** — `llmwiki ingest` now accepts images (vision via the active LLM provider), PDFs (text + metadata via lazy-loaded `pdf-parse`), and transcripts (`.vtt`, `.srt`, plus content-sniffed `.txt` that requires repeated speaker dialogue or anchored timestamps so plain notes aren't misclassified). Each source records its `sourceType` in frontmatter (`web` | `file` | `image` | `pdf` | `transcript`). YouTube transcript URLs are auto-routed.
153
+ - **Chunk-level semantic retrieval** — the embedding store gained an optional v2 `chunks` schema. Pages are split on paragraph + heading boundaries (with size guardrails), embedded individually, and reused across compiles when their content hash hasn't changed. Query routing prefers chunk hits, falls back to page-level retrieval and full-index selection.
154
+ - **BM25 reranking** over chunk candidates, blending 0.5× cosine similarity with the BM25 score so semantic ranking still matters when the query shares no terms with the candidate chunks.
155
+ - **`llmwiki query --debug`** prints the top chunks (slug, score, snippet) and pages selected, so users can audit retrieval decisions. The MCP `query_wiki` tool accepts a `debug` arg too.
156
+ - **Empty-store cold-start** — an empty v1 or v2 store with live wiki pages now triggers a full chunk embedding on next compile (previously, embeddings would only update when an existing slug changed).
157
+ - **`@copilotkit/aimock` test infrastructure** with `mockClaudeEnv` / `mockOpenAIEnv` / `useAimockLifecycle` helpers. CLI subprocess tests can now stub LLM endpoints deterministically — closes the recurring "no subprocess test for the compile/query happy path" gap that codex flagged across review-queue, schema-layer, confidence-metadata, and chunked-retrieval.
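To make the BM25 reranking entry above more concrete, here is one possible way to blend the two signals. The changelog only states that cosine similarity is weighted at 0.5, so the rest is an assumption: BM25 scores are normalised by the best score in the candidate set and given the remaining weight. The names `Candidate` and `rerank` are hypothetical.

```typescript
interface Candidate {
  slug: string;
  cosine: number; // embedding similarity to the query
  bm25: number;   // raw BM25 score, 0 when no query term appears in the chunk
}

// Hypothetical blend: cosineWeight * cosine + (1 - cosineWeight) * normalised BM25.
function rerank(candidates: Candidate[], cosineWeight: number = 0.5): Candidate[] {
  const maxBm25 = Math.max(0, ...candidates.map((c) => c.bm25));
  const blended = (c: Candidate): number => {
    const lexical = maxBm25 > 0 ? c.bm25 / maxBm25 : 0;
    return cosineWeight * c.cosine + (1 - cosineWeight) * lexical;
  };
  // Sort a copy so the caller's array is left untouched.
  return candidates.slice().sort((a, b) => blended(b) - blended(a));
}
```

When no candidate shares a term with the query, every lexical component is zero and the order falls back to cosine similarity alone, which is the behaviour the entry calls out.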
158
+
159
+ ### Changed
160
+
161
+ - **Minimum Node version raised from 18 to 24.** `engines.node` is `>=24`, the tsup target is `node24`, and CI runs only on Node 24. Users on older Node should pin to `<0.5.0` until they can upgrade their runtime.
162
+ - `pdf-parse` is dynamically imported so the cost of loading pdfjs-dist is paid only when a PDF is actually being ingested.
163
+
164
+ ### Test infrastructure
165
+
166
+ - New `runCLI` / `expectCLIExit` / `expectCLIFailure` / `formatCLIFailure` helpers in `test/fixtures/run-cli.ts` capture full subprocess diagnostics (code, signal, killed, message, stdout, stderr, args, cwd) on assertion failure — flakes now surface their root cause without rerunning.
167
+ - `vitest globalSetup` builds dist once before the suite runs, eliminating the per-test `tsup --clean` race that caused intermittent CI flakes.
168
+ - Tests grew from 391 to 477 in this release (and to 519 once export-bundle lands as a follow-up).
169
+
170
+ ## [0.4.0] - 2026-04-25
171
+
172
+ Adds claim-level source-range provenance, a first-class schema layer for typed page kinds, configurable provider request timeouts, and a slug-based wikilink format that resolves reliably in Obsidian.
173
+
174
+ ### Added
175
+
176
+ - **Claim-level provenance with source ranges** — citations can now pin specific lines: `^[paper.md:42-58]` (colon form) or `^[paper.md#L42-L58]` (GitHub anchor form). Single-line `^[paper.md:7]` works too, as do mixed multi-source markers like `^[a.md, b.md:1-3]`. The legacy paragraph form `^[paper.md]` continues to work unchanged.
177
+ - **`extractClaimCitations(body)`** returns structured `{ raw, spans: [{ file, lines? }] }` records for tooling. **`inspectProvenance(body)`** groups spans by source file (deduped), useful for "this page draws from" UIs.
178
+ - **`checkBrokenCitations`** lint rule now flags out-of-bounds spans (e.g. `^[src.md:42-58]` against a 3-line source) with cached per-file line counts so a page with many spans into the same source only reads it once.
179
+ - **`checkMalformedClaimCitations`** new lint rule catches malformed entries: non-numeric ranges (`:abc-xyz`), half-baked hash forms (`#X9`), line `0`, and reversed ranges (`5-3`). Semantic invalidity is rejected at parse time so `extractClaimCitations` doesn't return impossible spans.
180
+ - **First-class schema layer** for typed page kinds. Projects can declare `.llmwiki/schema.json|yaml|yml` (or `wiki/.schema.yaml|yml`) defining page kinds (`concept`, `entity`, `comparison`, `overview`), per-kind `minWikilinks`, and seed pages.
181
+ - **`llmwiki schema init`** writes a starter schema file. **`llmwiki schema show`** prints the resolved schema and its source path.
182
+ - **`schema-cross-link-minimum`** lint rule enforces per-kind link expectations.
183
+ - **Schema-driven seed pages** are generated during compile and run on the early-return path too, so adding a seed-page entry triggers its creation on the next `compile` even when no source files changed.
184
+ - **Review-mode schema violations** — `compile --review` runs in-memory schema lint per candidate and stamps any violations onto the candidate JSON. `review show <id>` prints a "Schema violations" block when present.
185
+ - **Configurable provider request timeouts** — `LLMWIKI_REQUEST_TIMEOUT_MS` (provider-agnostic) and `OLLAMA_TIMEOUT_MS` (Ollama-specific) override the per-request timeout. Defaults: 10 minutes for OpenAI (matches the SDK), 30 minutes for Ollama (better suited to local models).
186
+ - **Slug-based wikilinks** — index, MOC, and the in-body wikilink resolver now emit `[[slug|Title]]` so Obsidian targets the file directly regardless of whether the slug differs from the display title.
187
+ - **Test infrastructure for subprocess CLI tests** — `runCLI`/`expectCLIExit`/`expectCLIFailure`/`formatCLIFailure` helpers in `test/fixtures/run-cli.ts` capture full subprocess diagnostics (code, signal, killed, message, stdout, stderr, args, cwd) so flakes surface their root cause without rerunning. dist/ is built once via `vitest globalSetup` so parallel workers don't race on `tsup --clean`.
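The citation forms listed in the claim-level provenance entries above can be parsed with a small amount of code. The sketch below is not the project's `extractClaimCitations`; `parseClaimCitations`, `parseSpan`, and the `{ start, end }` shape of `lines` are assumptions for illustration. It also assumes a single-line anchor form (`#L7`) is acceptable and drops a whole marker if any entry in it is malformed.

```typescript
interface Span {
  file: string;
  lines?: { start: number; end: number }; // hypothetical shape for a line range
}

interface ClaimCitation {
  raw: string;
  spans: Span[];
}

// Parse one comma-separated entry such as "paper.md", "paper.md:7", "paper.md:42-58", "paper.md#L42-L58".
function parseSpan(entry: string): Span | null {
  const text = entry.trim();
  const colon = /^([^:#]+):(\d+)(?:-(\d+))?$/.exec(text);
  const anchor = /^([^:#]+)#L(\d+)(?:-L(\d+))?$/.exec(text);
  const ranged = colon ?? anchor;
  if (ranged) {
    const start = Number(ranged[2]);
    const end = ranged[3] === undefined ? start : Number(ranged[3]);
    // Reject line 0 and reversed ranges at parse time.
    if (start < 1 || end < start) return null;
    return { file: (ranged[1] ?? "").trim(), lines: { start, end } };
  }
  // Paragraph-level form: a leftover ':' or '#' means a malformed suffix such as ":abc-xyz" or "#X9".
  if (text === "" || /[:#]/.test(text)) return null;
  return { file: text };
}

// Collect every ^[...] marker whose entries all parse cleanly.
function parseClaimCitations(body: string): ClaimCitation[] {
  const citations: ClaimCitation[] = [];
  const marker = /\^\[([^\]]+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(body)) !== null) {
    const parsed = (match[1] ?? "").split(",").map(parseSpan);
    const valid = parsed.filter((span): span is Span => span !== null);
    if (valid.length === parsed.length) citations.push({ raw: match[0], spans: valid });
  }
  return citations;
}
```

A lint rule for malformed citations could reuse `parseSpan` and report the entries for which it returns `null`, rather than silently skipping them as this sketch does.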
188
+
189
+ ### Changed
190
+
191
+ - `extractCitations(body)` continues to return a flat filename list for backward compatibility, but is now backed by `extractClaimCitations` and strips span suffixes when collecting filenames.
192
+ - `WikiFrontmatter.kind` references the canonical `PageKind` type from `src/schema/types.ts` via `import type` (no runtime cycle).
193
+ - `compile --review` defers seed-page generation and `finalizeWiki` to honor the no-`wiki/`-mutation contract.
194
+
195
+ ### Contributors
196
+
197
+ Thanks to **@ludevica** for #15 (slug-based wikilinks) and **@BenGSt** for reporting the Ollama timeout (#11).
198
+
199
+ ## [0.3.0] - 2026-04-23
200
+
201
+ Adds a candidate review queue for `compile` and richer epistemic metadata on compiled pages.
202
+
203
+ ### Added
204
+
205
+ - **Candidate review queue** — `llmwiki compile --review` writes generated pages to `.llmwiki/candidates/` instead of mutating `wiki/`. New subcommands `llmwiki review list|show|approve|reject` let you inspect each candidate before it lands. `approve` writes the page and refreshes index/MOC/embeddings; `reject` archives the candidate to `.llmwiki/candidates/archive/`. MCP `wiki_status` exposes `pendingCandidates` so agents can see queue depth.
206
+ - **Confidence and contradiction metadata** — compiled pages can carry optional frontmatter fields (`confidence`, `provenanceState`, `contradictedBy`, `inferredParagraphs`). When multiple sources merge into one slug, metadata is reconciled (`min` confidence, `provenanceState = 'merged'`, union of `contradictedBy` deduped by slug, `max` `inferredParagraphs`).
207
+ - **Three new lint rules** surface the new metadata: `low-confidence`, `contradicted-page`, `excess-inferred-paragraphs`.
208
+ - **Multi-source citation parsing in lint** — `^[a.md, b.md]` now validates each filename independently and only reports the missing one(s).
209
+ - **Husky pre-commit and pre-push hooks** — pre-commit runs `fallow` + `tsc --noEmit`; pre-push runs `npm run build` + `npm test`. Devs get fast feedback on commit and full validation before push.
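The reconciliation rules in the confidence and contradiction entry above (minimum confidence, `'merged'` state, deduplicated `contradictedBy`, maximum `inferredParagraphs`) can be written out as a short sketch. The interfaces and the function name `reconcileMetadata` are hypothetical, as is the assumption that each `contradictedBy` entry is an object carrying a `slug`.

```typescript
interface Contradiction {
  slug: string;
  reason?: string; // hypothetical extra field
}

interface PageMetadata {
  confidence?: number;
  provenanceState?: string;
  contradictedBy?: Contradiction[];
  inferredParagraphs?: number;
}

// Merge metadata from several sources that compile into the same slug.
function reconcileMetadata(parts: PageMetadata[]): PageMetadata {
  const confidences = parts.map((p) => p.confidence).filter((v): v is number => v !== undefined);
  const inferred = parts.map((p) => p.inferredParagraphs).filter((v): v is number => v !== undefined);

  // Union of contradictions, keeping the first entry seen for each slug.
  const bySlug = new Map<string, Contradiction>();
  for (const part of parts) {
    for (const entry of part.contradictedBy ?? []) {
      if (!bySlug.has(entry.slug)) bySlug.set(entry.slug, entry);
    }
  }

  const merged: PageMetadata = { provenanceState: "merged" };
  if (confidences.length > 0) merged.confidence = Math.min(...confidences); // most cautious value wins
  if (inferred.length > 0) merged.inferredParagraphs = Math.max(...inferred);
  if (bySlug.size > 0) merged.contradictedBy = Array.from(bySlug.values());
  return merged;
}
```

Taking the minimum confidence and the maximum inferred-paragraph count means the merged page is never reported as more trustworthy than its weakest contributing source.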
210
+
211
+ ### Changed
212
+
213
+ - Pre-commit/pre-push hooks pin `fallow` to `2.42.0` locally (devDep) and in CI to keep complexity thresholds stable across the team.
214
+ - `compile`'s page rendering extracted into `src/compiler/page-renderer.ts` so both direct writes and candidate generation reuse the same renderer.
215
+ - `vitest.config.ts` excludes `.claude/**` so `npm test` from the main checkout doesn't discover sibling worktrees.
216
+
217
+ ### Concurrency
218
+
219
+ - `review approve` and `review reject` acquire `.llmwiki/lock` (the same lock `compile` uses) and re-read the candidate under the lock to close the TOCTOU window between pre-check and mutation.
220
+ - When one source produces multiple candidates, source state isn't persisted until the last sibling is approved — unresolved siblings stay re-detectable on the next `compile --review`.
221
+
222
+ ### Infrastructure
223
+
224
+ - Tests grew from 222 to 291 across all new features.
225
+
226
+ ### Contributors
227
+
228
+ Thanks to **@ishan5ain** for #12 (split embedding endpoints for OpenAI-compatible providers) and **@sy2ruto** for reporting the multi-source citation lint bug (#10) — the parsing fix shipped here in PR #19.
229
+
230
+ ## [0.2.0] - 2026-04-16
231
+
232
+ First major release since 0.1.1. Ships the complete initial roadmap plus an MCP server for AI agent integration.
233
+
234
+ ### Added
235
+
236
+ - **MCP server** (`llmwiki serve`) exposes llmwiki's automated pipelines as Model Context Protocol tools so agents can ingest, compile, query, search, lint, and read pages programmatically. Ships with 7 tools and 5 read-only resources.
237
+ - **Semantic search** via embeddings — pre-filters the wiki index to the top 15 most similar pages before calling the selection LLM, with transparent fallback to full-index selection when no embeddings store exists.
238
+ - **Multi-provider support** — swap LLM backends via `LLMWIKI_PROVIDER=anthropic|openai|ollama|minimax`.
239
+ - **`llmwiki lint`** command with six rule-based checks (broken wikilinks, orphaned pages, missing summaries, duplicate concepts, empty pages, broken citations). No LLM calls, no API key required.
240
+ - **Paragraph-level source attribution** — compiled pages now include `^[filename.md]` citation markers pointing back to source files.
241
+ - **Obsidian integration** — LLM-extracted tags, deterministic aliases (slug, conjunction swap, abbreviation), and auto-generated `wiki/MOC.md` grouping concept pages by tag.
242
+ - **Anthropic provider enhancements** — `ANTHROPIC_AUTH_TOKEN` support, custom base URLs, and `~/.claude/settings.json` fallback for credentials and model.
243
+ - **MiniMax provider** via the OpenAI-compatible endpoint.
244
+ - GitHub Actions CI with Node 18/20/22 build+test matrix plus Fallow codebase health check (required for merges).
245
+
246
+ ### Changed
247
+
248
+ - Command functions (`compile`, `query`, `ingest`) now expose structured-result variants (`compileAndReport()`, `generateAnswer()`, `ingestSource()`) alongside the existing CLI-facing versions. The CLI experience is unchanged.
249
+ - `runCompilePipeline` decomposed into focused phase helpers to bring function complexity under Fallow's thresholds.
250
+
251
+ ### Infrastructure
252
+
253
+ - Tests grew from 91 to 211 across all new features.
254
+ - Fallow codebase health analyzer required in CI (no dead code, no duplication, no complexity threshold violations).
255
+
256
+ ### Contributors
257
+
258
+ Thanks to @FrankMa1, @PipDscvr, @goforu, and @socraticblock for their contributions.
259
+
260
+ ## [0.1.1] - 2026-04-07
261
+
262
+ ### Fixed
263
+
264
+ - Flaky CLI test timeout.
265
+
266
+ ## [0.1.0] - 2026-04-05
267
+
268
+ Initial release.
269
+
270
+ ### Added
271
+
272
+ - `llmwiki ingest` — fetch a URL or copy a local file into `sources/`.
273
+ - `llmwiki compile` — incremental two-phase compilation (extract concepts, then generate pages). Hash-based change detection skips unchanged sources.
274
+ - `llmwiki query` — two-step LLM-powered Q&A (index-based page selection, then streaming answer). `--save` flag writes answers as wiki pages.
275
+ - `llmwiki watch` — auto-recompile on source changes.
276
+ - Atomic writes, lock-protected compilation, orphan marking for deleted sources.
277
+ - `[[wikilink]]` resolution and auto-generated `wiki/index.md`.
278
+
279
+ [0.2.0]: https://github.com/atomicmemory/llm-wiki-compiler/compare/v0.1.1...v0.2.0
280
+ [0.1.1]: https://github.com/atomicmemory/llm-wiki-compiler/compare/v0.1.0...v0.1.1
281
+ [0.1.0]: https://github.com/atomicmemory/llm-wiki-compiler/releases/tag/v0.1.0