llm-wiki-kit 0.2.14 → 0.2.15

package/README.md CHANGED
@@ -85,7 +85,9 @@ llm-wiki/
  ├── outputs/
  │   ├── questions/
  │   ├── reports/
+ │   ├── exports/
  │   └── maintenance/
+ ├── evals/
  └── procedures/
  ```
91
93
 
@@ -102,6 +104,7 @@ The installed hooks:
102
104
  - remove Codex-facing legacy `oh-my-codex:wiki`/`omx_wiki` surfaces at session start so `llm-wiki/` remains the active wiki implementation
103
105
  - record small redacted raw event envelopes and per-turn state
104
106
  - capture meaningful work and structured decision points, including tool evidence, changed files, and verification notes
107
+ - attach safe `evidence_refs` candidates to generated durable candidates when changed files or verification commands are available
105
108
  - before compaction, classify the current turn and save a redacted checkpoint only for meaningful work, structured decisions, or explicit durable requests; explicit durable candidates also get a maintenance queue item when no durable wiki update is detected
106
109
  - after compaction, store the redacted compact summary only; if pre-compact preservation failed, prepare a recovery packet for the next legal model-visible context hook
107
110
  - allow tool calls to proceed without secret/PII-based hook blocking
@@ -123,7 +126,7 @@ Most users should not need these during daily Claude Code/Codex work. They exist
123
126
  - Install/update: `llm-wiki install`, `llm-wiki update`, `llm-wiki post-update`, `llm-wiki projects`
124
127
  - Diagnostics: `llm-wiki doctor`, `llm-wiki status`, `llm-wiki version`
125
128
  - Manual: `llm-wiki manual`
126
- - Agent maintenance helpers: `llm-wiki context`, `llm-wiki lint`, `llm-wiki consolidate`, `llm-wiki maintenance`
129
+ - Agent maintenance helpers: `llm-wiki context`, `llm-wiki lint`, `llm-wiki consolidate`, `llm-wiki maintenance`, `llm-wiki eval`, `llm-wiki export`
127
130
  - Live Q&A archive helper: `llm-wiki archive-questions --workspace <project> [--date YYYY-MM-DD] [--dry-run]`
128
131
  - Cleanup: `llm-wiki uninstall`
129
132
 
@@ -139,13 +142,17 @@ Installed npm runtimes also perform a cached update notice check from hooks whil
139
142
 
140
143
  `llm-wiki post-update --workspace <project>` reapplies the current runtime's hook entries and safe managed template updates without running `npm install -g`. Use `post-update --all --workspace <search-root>` to reapply templates across discovered project roots.
141
144
 
142
- `llm-wiki context "<query>"` prints the full debug view of the layered context sources used by hooks. Hook injection may render those sources as functional compact context for Codex and Claude, but this CLI stays verbose so maintainers can inspect retrieval, snippets, memory, index, expansion behavior, and context budget metadata. Daily use should rely on hook injection. By default, episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are excluded from search unless they were promoted with `memory_type: semantic` or `procedural` and `importance >= 4`; use `--include-episodic` only when debugging old automatic records. Archived or superseded pages are hidden unless `--include-archived` is requested, while stale pages remain searchable with lower score.
145
+ `llm-wiki context "<query>"` prints the full debug view of the layered context sources used by hooks. Hook injection may render those sources as functional compact context for Codex and Claude, but this CLI stays verbose so maintainers can inspect retrieval, snippets, memory, index, expansion behavior, context budget metadata, `rankReason`, `matchedFields`, `scoreBreakdown`, `visibilityReason`, and `evidenceRefs`. The text formatter adds a short `why selected` line for each hit; hook compact context deliberately omits that extra detail. Daily use should rely on hook injection. By default, episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are excluded from search unless they were promoted with `memory_type: semantic` or `procedural` and `importance >= 4`; use `--include-episodic` only when debugging old automatic records. Archived or superseded pages are hidden unless `--include-archived` is requested, while stale pages remain searchable with lower score.
143
146
 
144
- `llm-wiki lint` checks wiki health and detects outdated managed rules from older kit versions. It also warns when `memory.md` is near budget, wiki page count nears the search cap, hidden episodic/context pages accumulate, or stale/archived pages lack supersession/link discoverability. Agents may use it before/after meaningful wiki maintenance.
147
+ `llm-wiki lint` checks wiki health and detects outdated managed rules from older kit versions. It validates optional `evidence_refs` entries with the prefixes `file:`, `cmd:`, `raw:`, and `url:`; unsafe paths, unsupported prefixes, credential-bearing URLs, command secrets, and secret-like values are reported as errors, while missing local evidence targets are warnings. It also warns when `memory.md` is near budget, wiki page count nears the search cap, hidden episodic/context pages accumulate, or stale/archived pages lack supersession/link discoverability. Agents may use it before/after meaningful wiki maintenance.
145
148
 
146
149
  `llm-wiki consolidate` refreshes only generated marker blocks in `wiki/memory.md` and `wiki/index.md`. Generated maps keep durable non-archived pages, hide default episodic records, skip stale/archived/superseded pages, and report those counts in dry-run output. It is an agent maintenance helper, not a command users should run after every turn.
147
150
 
148
- `llm-wiki maintenance` prints the pending queue and review due status from `llm-wiki/outputs/maintenance/queue.md`. Hooks create only selective candidates; the active agent should merge reusable items into existing durable wiki pages and mark queue items `done` or `skipped` without delaying unrelated user answers. Periodic maintenance is a soft agent-side reminder, not a user command loop.
151
+ `llm-wiki maintenance` prints the queue and review due status from `llm-wiki/outputs/maintenance/queue.md`. Queue states are `pending -> approved -> done` or `skipped`. Use `llm-wiki maintenance --workspace <project> --approve <id> --target <wiki/...md>` when durable promotion is approved, `--done <id> --target <wiki/...md>` after the active agent has merged the fact into a durable page, and `--skip <id> [--note "..."]` for duplicate or non-durable candidates. Approved items are shown before pending items in hook reminders. Periodic maintenance is a soft agent-side reminder, not a user command loop.
152
+
153
+ `llm-wiki eval --workspace <project> [--fixture <path>] [--limit 5] [--json]` runs retrieval fixtures from `llm-wiki/evals/retrieval.json` by default. If the fixture is absent, it exits successfully with `no fixture found`. Fixtures list `query`, `expected`, and `unexpected` paths; output reports recall, missed expected hits, unexpected hits, and top hits using the same durable visibility policy as export.
154
+
155
+ `llm-wiki export --workspace <project> [--format all|llms|llms-full|json] [--output <dir>] [--dry-run] [--json]` writes durable wiki manifests under `llm-wiki/outputs/exports/` by default. `llms.txt` is an agent onboarding and handoff manifest, not a passive SEO artifact. `llms-full.txt` is a redacted durable context bundle for compaction recovery or handoff. `llm-wiki.json` is a structured manifest for future adapters and eval tooling. Export uses the same durable visibility policy as search/eval and redacts credentials before writing.
149
156
 
150
157
  `llm-wiki archive-questions` splits older legacy `llm-wiki/outputs/questions/YYYY-MM-DD-live-qa.md` files into the chunked `llm-wiki/outputs/questions/YYYY-MM-DD/` layout. It preserves the original under `outputs/questions/archive/originals/` with a SHA-256 sidecar and replaces the legacy file with a short pointer stub. Use `--dry-run` first when reviewing a large archive.
151
158
 
@@ -192,6 +199,7 @@ llm-wiki hook claude Stop
192
199
  - PreCompact may read a small bounded transcript tail to create a redacted checkpoint, but it does not store the full transcript or raw `transcript_path`.
193
200
  - Tool calls are not blocked only because inputs look sensitive.
194
201
  - Authentication values such as tokens, passwords, and private keys are redacted before durable summaries are written.
202
+ - Generated exports are redacted and must not contain npm tokens, WinRM credentials, private keys, raw `.env`, or full raw transcripts.
195
203
  - Hook payloads are stored only as redacted event envelopes.
196
204
  - Phone numbers, emails, dates, and business identifiers are preserved by default so the wiki remains useful for local work.
197
205
 
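The default visibility rules described in the README changes above (episodic pages hidden unless promoted, archived or superseded pages hidden unless requested, stale pages still searchable) can be sketched as a small predicate. This is only an illustration, not the kit's code: the page shape `{ path, memoryType, importance, status }`, the `status` values, and the function names are assumptions made for the example.

```javascript
// Hypothetical page shape: { path, memoryType, importance, status }.
// The `status` encoding ("archived", "superseded", "stale") is assumed.
const EPISODIC_MARKERS = ["wiki/queries/", "wiki/context/", "session-log"];

function isEpisodic(page) {
  return EPISODIC_MARKERS.some((marker) => page.path.includes(marker));
}

// Promotion rule from the README: memory_type semantic/procedural AND importance >= 4.
function isPromoted(page) {
  const durableType =
    page.memoryType === "semantic" || page.memoryType === "procedural";
  return durableType && (page.importance ?? 0) >= 4;
}

function isVisible(page, { includeEpisodic = false, includeArchived = false } = {}) {
  const retired = page.status === "archived" || page.status === "superseded";
  if (retired && !includeArchived) return false;
  if (isEpisodic(page) && !isPromoted(page) && !includeEpisodic) return false;
  // Stale pages stay visible; the docs say they only score lower.
  return true;
}
```

Under these assumptions, `isVisible({ path: "wiki/queries/old-auto.md" })` is `false`, while the same page becomes visible with `{ includeEpisodic: true }` or once it carries `memoryType: "semantic"` and `importance: 4`.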
package/docs/concepts.md CHANGED
@@ -19,11 +19,13 @@ The important behavior is a loop:
19
19
  3. The user works normally; no extra command loop is required.
20
20
  4. Hooks gather redacted prompt/tool/result summaries.
21
21
  5. At stop/session end, hooks append redacted chunked live Q&A only for turns with work evidence or structured decision/debugging conclusions.
22
- 6. Simple answers, status checks, and keyword-only responses stay out of live Q&A and durable wiki by default.
23
- 7. Durable wiki promotion is selective: explicit record/document requests should be handled by the active agent in existing wiki pages; the hook queues review only when such a request was not reflected in durable files.
24
- 8. At the next start/prompt after an abrupt shutdown, hooks can recover stale turn state into `outputs/maintenance/queue.md`.
25
- 9. When reusable knowledge appears, the active Claude Code/Codex agent folds approved facts into existing durable wiki pages instead of leaving everything as one-off Q&A.
26
- 10. Future sessions start from the improved wiki instead of relying on long chat history.
22
+ 6. When possible, generated candidates carry safe `evidence_refs` such as changed files, verification commands, raw source IDs, or external URLs.
23
+ 7. Simple answers, status checks, and keyword-only responses stay out of live Q&A and durable wiki by default.
24
+ 8. Durable wiki promotion is selective: explicit record/document requests should be handled by the active agent in existing wiki pages; the hook queues review only when such a request was not reflected in durable files.
25
+ 9. At the next start/prompt after an abrupt shutdown, hooks can recover stale turn state into `outputs/maintenance/queue.md`.
26
+ 10. When reusable knowledge appears, the active Claude Code/Codex agent folds approved facts into existing durable wiki pages instead of leaving everything as one-off Q&A.
27
+ 11. Export and eval reuse the same durable visibility policy so handoff manifests, retrieval fixtures, and context selection describe the same wiki surface.
28
+ 12. Future sessions start from the improved wiki instead of relying on long chat history.
27
29
 
28
30
  The kit is a template/runtime repository. It must not centralize project wiki contents.
29
31
 
@@ -40,8 +42,11 @@ The maintenance loop is intentionally layered:
40
42
 
41
43
  - `memory.md`: short hot index for current durable facts.
42
44
  - `index.md`: broad navigation map.
43
- - MiniSearch + wikilinks: retrieval over durable `wiki/**/*.md`, with episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages hidden by default unless promoted or `--include-episodic` is requested; archived/superseded pages stay preserved but hidden unless `--include-archived` is requested.
44
- - `outputs/maintenance/queue.md`: selective reminders for explicit durable requests that need review, plus stale turn recovery.
45
- - `lint`: finds broken links, stale pages, duplicates, metadata gaps, secret-like content, outdated managed rules, memory/page-count budget pressure, hidden episodic growth, and stale/archived discoverability gaps.
46
- - `maintenance`: reports `reviewDue` only when periodic thresholds are met; hook reminders are soft and limited to session start/instructions loaded or maintenance-related prompts.
45
+ - MiniSearch + wikilinks: retrieval over durable `wiki/**/*.md`, with episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages hidden by default unless promoted or `--include-episodic` is requested; archived/superseded pages stay preserved but hidden unless `--include-archived` is requested. Verbose context explains `why selected`; hook context stays compact.
46
+ - `evidence_refs`: optional frontmatter that ties durable claims to `file:`, `cmd:`, `raw:`, or `url:` evidence without embedding secrets or raw transcripts.
47
+ - `outputs/maintenance/queue.md`: selective reminders for explicit durable requests that need review, plus stale turn recovery. Queue state is `pending`, `approved`, `done`, or `skipped`.
48
+ - `lint`: finds broken links, stale pages, duplicates, metadata gaps, invalid evidence refs, secret-like content, outdated managed rules, memory/page-count budget pressure, hidden episodic growth, and stale/archived discoverability gaps.
49
+ - `maintenance`: reports `reviewDue` only when periodic thresholds are met; hook reminders are soft and limited to session start/instructions loaded or maintenance-related prompts, with approved items shown before pending items.
47
50
  - `consolidate`: agent helper that refreshes generated blocks in `memory.md` and `index.md` while preserving handwritten notes, keeping default query/context/session pages out of the durable generated maps, and skipping stale/archived/superseded pages.
51
+ - `eval`: checks retrieval fixtures in `llm-wiki/evals/retrieval.json` and reports expected recall, missed expected paths, unexpected hits, and top hits.
52
+ - `export`: writes redacted `llms.txt`, `llms-full.txt`, and `llm-wiki.json` manifests for agent onboarding, handoff, retrieval eval, and external consumption. `llms.txt` is not treated as a passive SEO artifact.
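The queue states named in this diff (`pending`, `approved`, `done`, `skipped`) and the reminder ordering can be sketched as a tiny state table. The docs name the states but do not spell out every legal edge, so the transition table below is an assumption (a strict reading of `pending -> approved -> done`, with `skipped` reachable from either open state). `nextState` and `reminderOrder` are hypothetical names, not part of the package.

```javascript
// Assumed transition table; the docs do not list every legal edge.
const TRANSITIONS = {
  pending: ["approved", "skipped"],
  approved: ["done", "skipped"],
  done: [],
  skipped: [],
};

function nextState(current, requested) {
  if (!Object.hasOwn(TRANSITIONS, current)) {
    throw new Error(`unknown state: ${current}`);
  }
  if (!TRANSITIONS[current].includes(requested)) {
    throw new Error(`cannot move ${current} -> ${requested}`);
  }
  return requested;
}

// Only open items are reminded; approved items sort before pending ones.
function reminderOrder(items) {
  const rank = { approved: 0, pending: 1 };
  return items
    .filter((item) => Object.hasOwn(rank, item.state))
    .sort((a, b) => rank[a.state] - rank[b.state]);
}
```

If the real CLI also accepts `--done` directly on a pending item, the `pending` row would need a `"done"` entry; the sketch deliberately leaves that out rather than guess.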
@@ -42,11 +42,13 @@ when no project `CLAUDE.md` exists. Existing `CLAUDE.md` files are not overwritt
42
42
 
43
43
  The hook records redacted turn summaries but does not deny tool calls only because an input looks sensitive. Hook payloads are stored as small redacted event envelopes rather than full transcripts, and context output is redacted field by field before it is returned to Claude Code.
44
44
 
45
- At `SessionStart`/`InstructionsLoaded`, the hook first attempts a safe managed-template refresh, recovers stale turn state into `outputs/maintenance/queue.md`, performs a cached npm update notice check for npm installs, then injects functional compact context. The context still uses `llm-wiki/wiki/memory.md`, `llm-wiki/wiki/index.md`, relevant wiki/search state, operating rules, maintenance signals, passive runtime update status, and managed-template cleanup notes; the hook formats those signals so they are usable if shown in the Claude Code UI. At `UserPromptSubmit`, it recovers stale turn state, searches wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Update notice cache is scoped by npm command, and maintenance reminders are shown only when the prompt is wiki/maintenance related or matches a queue topic.
45
+ At `SessionStart`/`InstructionsLoaded`, the hook first attempts a safe managed-template refresh, recovers stale turn state into `outputs/maintenance/queue.md`, performs a cached npm update notice check for npm installs, then injects functional compact context. The context still uses `llm-wiki/wiki/memory.md`, `llm-wiki/wiki/index.md`, relevant wiki/search state, operating rules, maintenance signals, passive runtime update status, and managed-template cleanup notes; the hook formats those signals so they are usable if shown in the Claude Code UI. At `UserPromptSubmit`, it recovers stale turn state, searches wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Verbose `llm-wiki context` can explain `why selected`, `rankReason`, `matchedFields`, and `evidenceRefs`, but hook context keeps those details compact. Update notice cache is scoped by npm command, and maintenance reminders are shown only when the prompt is wiki/maintenance related or matches a queue topic.
46
46
 
47
47
  Hook-visible language is selected from the current user prompt first. Korean prompts get Korean guidance, English prompts get English guidance. If no prompt language is clear, the hook checks Claude Code `settings.json` `language` when it exists, then local `CLAUDE.md`/`AGENTS.md` language signals, then English. The kit does not require Claude Code to expose a language setting.
48
48
 
49
- `PostToolUse` and `PostToolBatch` record redacted tool summaries in the same turn buffer. `PreCompact` classifies the current turn before compaction: simple turns record only a context note, work-evidence or structured-decision turns write a chunked live Q&A checkpoint, and explicit durable candidates write a maintenance queue item only when no durable wiki update is detected. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event. `PostCompact` stores the redacted compact summary as a context note and prepares any pending recovery packet without returning model-visible context directly. In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` and `SessionEnd` append chunked live Q&A only for work-evidence or structured-decision turns and do not auto-create `wiki/queries/` or `wiki/decisions/`. If the user explicitly asked to record or document durable knowledge and no durable wiki update is detected, `Stop`/`SessionEnd` queue a pending maintenance item for agent review. `Stop` and `SessionEnd` then clear the per-session turn buffer; `SubagentStop` does not.
49
+ `PostToolUse` and `PostToolBatch` record redacted tool summaries in the same turn buffer. `PreCompact` classifies the current turn before compaction: simple turns record only a context note, work-evidence or structured-decision turns write a chunked live Q&A checkpoint, and explicit durable candidates write a maintenance queue item only when no durable wiki update is detected. Queue items may carry safe `evidence_refs` candidates from changed files and verification commands. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event. `PostCompact` stores the redacted compact summary as a context note and prepares any pending recovery packet without returning model-visible context directly. In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` and `SessionEnd` append chunked live Q&A only for work-evidence or structured-decision turns and do not auto-create `wiki/queries/` or `wiki/decisions/`. If the user explicitly asked to record or document durable knowledge and no durable wiki update is detected, `Stop`/`SessionEnd` queue a pending maintenance item for agent review. Approved maintenance items are shown before pending items in later reminders. `Stop` and `SessionEnd` then clear the per-session turn buffer; `SubagentStop` does not.
50
+
51
+ For handoff or retrieval verification, use `llm-wiki export --workspace <project> --format all` and `llm-wiki eval --workspace <project>`. The generated `llms.txt`/`llms-full.txt`/`llm-wiki.json` files are redacted durable manifests, not raw transcripts.
50
52
 
51
53
  Set `LLM_WIKI_KIT_AUTO_PROJECT_UPDATE=0` only while diagnosing automatic managed-template refresh behavior.
52
54
  Set `LLM_WIKI_KIT_UPDATE_NOTICE=0` only while suppressing the cached passive runtime update status.
@@ -30,18 +30,20 @@ Handled events:
30
30
  Expected behavior:
31
31
 
32
32
  - `SessionStart` first attempts a safe managed-template refresh, removes Codex-facing legacy `oh-my-codex:wiki`/`omx_wiki` surfaces when they reappear, recovers stale turn state into `outputs/maintenance/queue.md`, performs a cached npm update notice check for npm installs, then injects functional compact context. The context still uses `llm-wiki/wiki/memory.md`, `llm-wiki/wiki/index.md`, relevant wiki/search state, operating rules, maintenance signals, passive runtime update status, and managed-template cleanup notes; the hook formats those signals so they are usable if shown in the Codex UI.
33
- - `UserPromptSubmit` recovers stale turn state, searches project wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Update notice cache is scoped by npm command, and maintenance reminders are shown only when the prompt is wiki/maintenance related or matches a queue topic.
33
+ - `UserPromptSubmit` recovers stale turn state, searches project wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Verbose `llm-wiki context` can explain `why selected`, `rankReason`, `matchedFields`, and `evidenceRefs`, but hook context keeps those details compact. Update notice cache is scoped by npm command, and maintenance reminders are shown only when the prompt is wiki/maintenance related or matches a queue topic.
34
34
  - Hook-visible language is selected from the current user prompt first. Korean prompts get Korean guidance, English prompts get English guidance. If no prompt language is clear, Codex falls back to local `CLAUDE.md`/`AGENTS.md` language signals, then English.
35
35
  - `PreToolUse` records redacted tool summaries without blocking tool calls.
36
36
  - `PostToolUse` records redacted tool summaries in a turn buffer.
37
37
  - `PreCompact` classifies the current turn before compaction. Simple turns record only a context note; work-evidence or structured-decision turns write a chunked live Q&A checkpoint; explicit durable candidates write a maintenance queue item only when no durable wiki update is detected. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event.
38
38
  - `PostCompact` stores the redacted compact summary as a context note and prepares any pending compact recovery packet. It does not return `hookSpecificOutput.additionalContext`, because Codex `PostCompact` only supports common output fields.
39
39
  - In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` appends chunked live Q&A only for work-evidence or structured-decision turns and does not auto-create `wiki/queries/` or `wiki/decisions/`.
40
- - If the user explicitly asked to record or document durable knowledge and no durable wiki update is detected, `Stop` queues a pending maintenance item for agent review.
40
+ - If the user explicitly asked to record or document durable knowledge and no durable wiki update is detected, `Stop` queues a pending maintenance item for agent review. Queue items may carry safe `evidence_refs` candidates from changed files and verification commands. Approved maintenance items are shown before pending items in later reminders.
41
41
  - `Stop` clears the per-session turn buffer after recording. `SubagentStop` leaves the parent turn buffer available for the final stop event.
42
42
 
43
43
  Hook payloads are stored as small redacted event envelopes rather than full transcripts. Context output is also redacted field by field before it is returned to Codex. Functional compact context is a presentation policy, not a feature reduction: Codex still receives the wiki memory, search, maintenance, and passive update signals needed for the hook workflow.
44
44
 
45
+ For handoff or retrieval verification, use `llm-wiki export --workspace <project> --format all` and `llm-wiki eval --workspace <project>`. The generated `llms.txt`/`llms-full.txt`/`llm-wiki.json` files are redacted durable manifests, not raw transcripts.
46
+
45
47
  Set `LLM_WIKI_KIT_AUTO_PROJECT_UPDATE=0` only while diagnosing automatic managed-template refresh behavior.
46
48
  Set `LLM_WIKI_KIT_UPDATE_NOTICE=0` only while suppressing the cached passive runtime update status.
47
49
  Set `LLM_WIKI_KIT_CAPTURE_MODE=legacy-eager` only as deprecated compatibility mode for the old eager query/decision capture behavior.
package/docs/manual.md CHANGED
@@ -55,7 +55,9 @@ llm-wiki/
  ├── outputs/
  │   ├── questions/
  │   ├── reports/
+ │   ├── exports/
  │   └── maintenance/
+ ├── evals/
  └── procedures/
  ```
61
63
 
@@ -69,6 +71,7 @@ Use Codex or Claude Code normally. Installed hooks:
69
71
  - select Korean or English hook guidance from the current user prompt and local instruction files;
70
72
  - use `wiki/memory.md`, `wiki/index.md`, relevant wiki search, maintenance signals, update notices, and compact recovery packets;
71
73
  - record redacted prompt/tool/result summaries in per-turn state;
74
+ - preserve safe evidence pointers as `evidence_refs` when changed files or verification commands are available;
72
75
  - archive only meaningful work turns or structured decision/debugging turns into chunked `outputs/questions/YYYY-MM-DD/live-qa-001.md` files;
73
76
  - avoid automatic `wiki/queries/` and `wiki/decisions/` promotion in the default answer-first mode;
74
77
  - queue durable cleanup candidates only for explicit documentation requests that were not reflected in durable wiki files, or when stale turn state is recovered;
@@ -149,9 +152,11 @@ Most users should not need these during daily coding. They are for install, upda
149
152
  - `llm-wiki bootstrap --workspace <project>`: create project-local wiki structure.
150
153
  - `llm-wiki migrate --workspace <project>`: copy legacy wiki material into the current layout.
151
154
  - `llm-wiki context "<query>" --workspace <project>`: verbose debug view of hook context sources.
155
+ - `llm-wiki eval --workspace <project> [--fixture <path>] [--limit 5] [--json]`: run retrieval fixtures.
156
+ - `llm-wiki export --workspace <project> [--format all|llms|llms-full|json] [--output <dir>] [--dry-run] [--json]`: write durable wiki manifests.
152
157
  - `llm-wiki lint --workspace <project>`: wiki health check.
153
158
  - `llm-wiki consolidate --workspace <project> [--dry-run]`: refresh generated blocks in `memory.md` and `index.md`.
154
- - `llm-wiki maintenance --workspace <project> [--json]`: show pending durable cleanup candidates and review health.
159
+ - `llm-wiki maintenance --workspace <project> [--approve <id> --target <wiki/...md> | --done <id> --target <wiki/...md> | --skip <id> [--note "..."]] [--json]`: show or update durable cleanup review state.
155
160
  - `llm-wiki archive-questions --workspace <project> [--date YYYY-MM-DD] [--dry-run]`: split old flat live Q&A files into chunks.
156
161
  - `llm-wiki uninstall`: remove kit-managed hook entries, leaving project wiki contents intact.
157
162
 
@@ -187,11 +192,51 @@ llm-wiki context "auth architecture" --workspace /path/to/project --include-epis
  llm-wiki context "auth architecture" --workspace /path/to/project --include-archived
  ```
189
194
 
190
- Default search prioritizes durable semantic/procedural wiki pages. Episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are hidden unless promoted with durable metadata or explicitly requested. Archived and superseded pages are hidden unless `--include-archived` is used. Stale pages remain searchable with lower score.
195
+ Default search prioritizes durable semantic/procedural wiki pages. Episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are hidden unless promoted with durable metadata or explicitly requested. Archived and superseded pages are hidden unless `--include-archived` is used. Stale pages remain searchable with lower score. JSON hits include `rankReason`, `visibilityReason`, `evidenceRefs`, `matchedFields`, and `scoreBreakdown`; the text formatter prints `why selected` for maintainers. Hook compact context stays shorter and does not include those debug lines.
196
+
197
+ ## Evidence, Eval, And Export
198
+
199
+ Curated wiki pages may include optional frontmatter:
200
+
+ ```yaml
+ evidence_refs:
+   - "file:src/wiki-search.js"
+   - "cmd:node --test"
+   - "raw:source-id"
+   - "url:https://example.com/reference"
+ ```
208
+
209
+ `llm-wiki lint` validates the prefix, safety, and rough reachability of those references. `file:` must be repo-relative, `cmd:` must be a short single-line redacted-safe command, `raw:` should resolve to a raw/source candidate, and `url:` must be `http` or `https` without credentials.
210
+
211
+ `llm-wiki eval` reads `llm-wiki/evals/retrieval.json` by default:
212
+
+ ```json
+ {
+   "queries": [
+     {
+       "query": "semantic retrieval",
+       "expected": ["wiki/architecture/retrieval.md"],
+       "unexpected": ["wiki/queries/old-auto.md"]
+     }
+   ]
+ }
+ ```
224
+
225
+ Missing fixtures exit successfully with `no fixture found`. Present fixtures report expected recall, missed expected paths, unexpected hits, and top hits. Eval and export share the same durable visibility policy so archived/superseded/default episodic pages are treated consistently.
226
+
227
+ `llm-wiki export` writes `llms.txt`, `llms-full.txt`, and `llm-wiki.json` under `llm-wiki/outputs/exports/` by default. `llms.txt` is a curated onboarding and handoff manifest for agents and humans, not a passive SEO file. `llms-full.txt` is a bounded redacted context bundle for handoff or compaction recovery. `llm-wiki.json` is the structured manifest for future adapters and eval tooling. `--dry-run` reports planned files without writing them.
191
228
 
192
229
  ## Maintenance
193
230
 
194
- `llm-wiki maintenance` reports pending queue state and review health. It does not merge pages automatically. The active agent should merge reusable items into existing durable pages and mark queue items `done` or `skipped`.
231
+ `llm-wiki maintenance` reports queue state and review health. It does not merge pages automatically. The active agent should merge reusable items into existing durable pages and move queue items through the `pending`, `approved`, `done`, or `skipped` states.
232
+
+ ```bash
+ llm-wiki maintenance --workspace <project> --approve <id> --target wiki/concepts/topic.md
+ llm-wiki maintenance --workspace <project> --done <id> --target wiki/concepts/topic.md
+ llm-wiki maintenance --workspace <project> --skip <id> --note "duplicate"
+ ```
238
+
239
+ `approved` means durable promotion is accepted but not yet merged. `done` means the durable page has been updated. `skipped` means the item was duplicate or not reusable enough. Approved reminders are shown before pending reminders.
195
240
 
196
241
  Hook reminders are soft:
197
242
 
@@ -214,6 +259,7 @@ LLM_WIKI_KIT_PRECOMPACT_ENFORCEMENT=limited
214
259
  - Hook payloads are stored as small redacted event envelopes.
215
260
  - Tool calls are not blocked only because input looks sensitive.
216
261
  - Tokens, passwords, bearer credentials, private keys, and raw `.env` contents are redacted before durable storage.
262
+ - Generated exports are redacted and must not store npm tokens, WinRM credentials, private keys, raw `.env`, or full raw transcripts.
217
263
  - Phone numbers, emails, dates, and business identifiers are preserved by default because they can be useful local work context.
218
264
  - `llm-wiki lint` reports secret-like wiki content as an error.
219
265
 
@@ -232,6 +278,8 @@ llm-wiki version
232
278
  llm-wiki status --workspace /path/to/project
233
279
  llm-wiki doctor --workspace /path/to/project
234
280
  llm-wiki update --check --workspace /path/to/project
281
+ llm-wiki eval --workspace /path/to/project --json
282
+ llm-wiki export --workspace /path/to/project --format all --dry-run --json
235
283
  ```
236
284
 
237
285
  Native Windows support claims require a real Windows smoke: install the published package, run `install`, `status`, and `doctor` against a Windows project, inspect `%USERPROFILE%\.codex\hooks.json` and `%USERPROFILE%\.claude\settings.json`, and run hook smoke tests through `llm-wiki.cmd`.
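
The README changes above describe eval fixtures with `query`, `expected`, and `unexpected` paths, and a report of expected recall, missed expected paths, and unexpected hits. A minimal sketch of how one fixture case could be scored is shown below. The helper name `scoreCase` and the assumption that hits arrive as a plain array of paths are illustrative only; the scoring in `wiki-eval.js` is not shown in this part of the diff and may differ.

```javascript
// Hypothetical helper: scores one fixture case against a list of hit paths.
// The real wiki-eval.js may compute these values differently.
function scoreCase(fixtureCase, hits) {
  const hitSet = new Set(hits);
  const expected = fixtureCase.expected || [];
  const unexpected = fixtureCase.unexpected || [];

  // Expected paths that retrieval did not return.
  const missed = expected.filter((path) => !hitSet.has(path));
  // Paths that should stay hidden but were returned anyway.
  const unexpectedHits = unexpected.filter((path) => hitSet.has(path));

  // Recall is the share of expected paths found among the hits.
  const recall = expected.length === 0 ? 1 : (expected.length - missed.length) / expected.length;
  return { recall, missed, unexpectedHits };
}

// Fixture case taken from the README example in the diff.
const fixtureCase = {
  query: 'semantic retrieval',
  expected: ['wiki/architecture/retrieval.md'],
  unexpected: ['wiki/queries/old-auto.md'],
};

// Assumed hit list for illustration: both pages were returned.
const result = scoreCase(fixtureCase, [
  'wiki/architecture/retrieval.md',
  'wiki/queries/old-auto.md',
]);
console.log(result);
```

In this assumed run the expected page is found, so recall is 1, while the superseded query page still counts as an unexpected hit.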
package/docs/security.md CHANGED
@@ -13,4 +13,8 @@ Before writing durable summaries, the runtime redacts authentication values such
13
13
 
14
14
  Manual and hook context output also runs through redaction before returning excerpts or search hits. `llm-wiki lint` reports remaining secret-like wiki content as an error so it can be removed or rewritten before it becomes reusable project memory.
15
15
 
16
+ `evidence_refs` are pointers, not a place to paste secrets or transcripts. `llm-wiki lint` rejects secret-like evidence values, unsafe `file:` paths, credential-bearing `url:` values, unsupported prefixes, and unsafe commands. Missing local `file:` or `raw:` targets are warnings so agents can fix references without losing the surrounding durable note.
17
+
18
+ `llm-wiki export` redacts generated `llms.txt`, `llms-full.txt`, and `llm-wiki.json` output. Exports must not contain npm tokens, WinRM credentials, private keys, raw `.env`, or full raw transcripts. `llms.txt` is an agent onboarding/handoff manifest and follows the same durable visibility policy as retrieval eval, so archived/superseded/default episodic pages are excluded by default.
19
+
16
20
  Hook payloads are stored as small event envelopes, not full raw transcripts. Full transcript capture is intentionally not implemented as a default. `PreCompact` may read a small bounded transcript tail for a redacted checkpoint, but it does not store the raw transcript path or full transcript. If a project needs raw transcript capture, add a project-local policy and a redaction path first.
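
The security note above says lint rejects credential-bearing `url:` evidence values. One way such a check could look, as a sketch only: the function name `urlCarriesCredentials` and the choice to treat unparseable values as unsafe are assumptions, and the actual lint rules are not shown in this diff. The sketch relies only on the standard WHATWG `URL` class available in Node.js.

```javascript
// Hypothetical check: flags url: evidence values that embed userinfo credentials.
// llm-wiki lint may apply different or additional rules.
function urlCarriesCredentials(value) {
  let parsed;
  try {
    parsed = new URL(value);
  } catch {
    // Unparseable values are treated as unsafe in this sketch.
    return true;
  }
  return parsed.username !== '' || parsed.password !== '';
}

console.log(urlCarriesCredentials('https://example.com/docs')); // false
console.log(urlCarriesCredentials('https://user:secret@example.com/docs')); // true
```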
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "llm-wiki-kit",
3
- "version": "0.2.14",
3
+ "version": "0.2.15",
4
4
  "description": "Hook-first living Markdown wiki runtime for Codex and Claude Code with Korean/English prompt-aware guidance.",
5
5
  "type": "module",
6
6
  "files": [
package/src/cli.js CHANGED
@@ -3,7 +3,7 @@ import { resolve } from 'path';
3
3
  import { formatConsolidateResult, runConsolidate } from './consolidate.js';
4
4
  import { handleHook } from './hook.js';
5
5
  import { install, status, uninstall } from './install.js';
6
- import { formatMaintenanceResult, maintenanceSummary } from './maintenance.js';
6
+ import { formatMaintenanceResult, maintenanceSummary, updateMaintenanceItem } from './maintenance.js';
7
7
  import { bootstrapProject } from './project.js';
8
8
  import { inspectProjectState } from './project-state.js';
9
9
  import { commandForProject, knownProjectRoots, recordProject } from './projects.js';
@@ -11,6 +11,8 @@ import { formatDoctor, runDoctor } from './doctor.js';
11
11
  import { migrate } from './migrate.js';
12
12
  import { postUpdate, update } from './update.js';
13
13
  import { buildContextPack, formatContextPack } from './wiki-search.js';
14
+ import { formatEvalResult, runEval } from './wiki-eval.js';
15
+ import { formatExportResult, runExport } from './wiki-export.js';
14
16
  import { formatLintResult, runLint } from './wiki-lint.js';
15
17
  import { archiveQuestions, formatArchiveQuestionsResult } from './live-qa.js';
16
18
 
@@ -33,6 +35,30 @@ function parseOptions(args) {
33
35
  } else if (arg === '--to') {
34
36
  options.to = optionValue(arg, i);
35
37
  i += 1;
38
+ } else if (arg === '--fixture') {
39
+ options.fixture = optionValue(arg, i);
40
+ i += 1;
41
+ } else if (arg === '--format') {
42
+ options.format = optionValue(arg, i);
43
+ i += 1;
44
+ } else if (arg === '--output') {
45
+ options.output = resolve(optionValue(arg, i));
46
+ i += 1;
47
+ } else if (arg === '--target') {
48
+ options.target = optionValue(arg, i);
49
+ i += 1;
50
+ } else if (arg === '--note') {
51
+ options.note = optionValue(arg, i);
52
+ i += 1;
53
+ } else if (arg === '--approve') {
54
+ options.approve = optionValue(arg, i);
55
+ i += 1;
56
+ } else if (arg === '--done') {
57
+ options.done = optionValue(arg, i);
58
+ i += 1;
59
+ } else if (arg === '--skip') {
60
+ options.skip = optionValue(arg, i);
61
+ i += 1;
36
62
  } else if (arg === '--date') {
37
63
  const value = optionValue(arg, i);
38
64
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) {
@@ -127,9 +153,11 @@ Usage:
127
153
  llm-wiki bootstrap --workspace <project>
128
154
  llm-wiki migrate --workspace <project>
129
155
  llm-wiki context "<query>" --workspace <project> [--limit 5] [--no-expand] [--include-episodic] [--include-archived]
156
+ llm-wiki eval --workspace <project> [--fixture <path>] [--limit 5] [--json]
157
+ llm-wiki export --workspace <project> [--format all|llms|llms-full|json] [--output <dir>] [--dry-run] [--json]
130
158
  llm-wiki lint --workspace <project>
131
159
  llm-wiki consolidate --workspace <project> [--dry-run]
132
- llm-wiki maintenance --workspace <project> [--json]
160
+ llm-wiki maintenance --workspace <project> [--approve <id> --target <wiki/...md> | --done <id> --target <wiki/...md> | --skip <id> [--note "..."]] [--json]
133
161
  llm-wiki archive-questions --workspace <project> [--date YYYY-MM-DD] [--dry-run] [--json]
134
162
  `);
135
163
  return;
@@ -229,6 +257,20 @@ Usage:
229
257
  return;
230
258
  }
231
259
 
260
+ if (command === 'eval') {
261
+ const projectRoot = resolve(options.workspace || process.cwd());
262
+ const result = await runEval(projectRoot, options);
263
+ printJsonOrText(result, options, formatEvalResult);
264
+ if (!result.ok) process.exitCode = 1;
265
+ return;
266
+ }
267
+
268
+ if (command === 'export') {
269
+ const projectRoot = resolve(options.workspace || process.cwd());
270
+ printJsonOrText(await runExport(projectRoot, options), options, formatExportResult);
271
+ return;
272
+ }
273
+
232
274
  if (command === 'lint') {
233
275
  const projectRoot = resolve(options.workspace || process.cwd());
234
276
  const result = await runLint(projectRoot, options);
@@ -245,6 +287,14 @@ Usage:
245
287
 
246
288
  if (command === 'maintenance') {
247
289
  const projectRoot = resolve(options.workspace || process.cwd());
290
+ const actions = [options.approve ? 'approve' : '', options.done ? 'done' : '', options.skip ? 'skip' : ''].filter(Boolean);
291
+ if (actions.length > 1) throw new Error('maintenance accepts only one of --approve, --done, or --skip');
292
+ if (actions.length === 1) {
293
+ const action = actions[0];
294
+ const id = options.approve || options.done || options.skip;
295
+ printJsonOrText(await updateMaintenanceItem(projectRoot, id, action, options), options);
296
+ return;
297
+ }
248
298
  printJsonOrText(await maintenanceSummary(projectRoot, { ...options, includeLint: true }), options, formatMaintenanceResult);
249
299
  return;
250
300
  }
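
The `maintenance` branch above accepts at most one of `--approve`, `--done`, or `--skip` and otherwise falls back to the summary report. The same selection logic can be sketched as a standalone function. `selectMaintenanceAction` is a hypothetical name (the CLI inlines this logic), and the item ids used here are made up for illustration.

```javascript
// Mirrors the flag check in the diff: at most one of --approve, --done, --skip.
// selectMaintenanceAction is a hypothetical name; the CLI inlines this logic.
function selectMaintenanceAction(options) {
  const actions = ['approve', 'done', 'skip'].filter((name) => Boolean(options[name]));
  if (actions.length > 1) {
    throw new Error('maintenance accepts only one of --approve, --done, or --skip');
  }
  // No action flag: the caller falls back to the summary report.
  if (actions.length === 0) return null;
  const action = actions[0];
  return { action, id: options[action] };
}

// 'item-1' is an illustrative id, not a real queue item format.
console.log(selectMaintenanceAction({ approve: 'item-1', target: 'wiki/concepts/topic.md' }));
console.log(selectMaintenanceAction({}));
```

Because each flag carries the item id as its value, the chosen action name doubles as the key for looking up the id.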
package/src/constants.js CHANGED
@@ -52,7 +52,9 @@ export const LLM_WIKI_DIRS = [
52
52
  'wiki/queries',
53
53
  'outputs/questions',
54
54
  'outputs/reports',
55
+ 'outputs/exports',
55
56
  'outputs/maintenance',
57
+ 'evals',
56
58
  'procedures',
57
59
  ];
58
60
 
@@ -0,0 +1,128 @@
1
+ import { isAbsolute, relative, sep } from 'path';
2
+ import { extractPathsFromText, hasSecretLikeText, isSensitivePath, redactText, summarizeForStorage } from './redaction.js';
3
+
4
+ export const EVIDENCE_PREFIXES = new Set(['file', 'cmd', 'raw', 'url']);
5
+ export const MAX_CMD_EVIDENCE_CHARS = 500;
6
+
7
+ export function normalizeEvidenceRefs(value) {
8
+ if (!Array.isArray(value)) return [];
9
+ const refs = value
10
+ .map((item) => summarizeForStorage(String(item || ''), 700))
11
+ .map((item) => item.replace(/\s+/g, ' ').trim())
12
+ .filter(Boolean);
13
+ return [...new Set(refs)];
14
+ }
15
+
16
+ export function parseEvidenceRef(ref) {
17
+ const text = String(ref || '').trim();
18
+ const match = text.match(/^([a-z]+):([\s\S]*)$/i);
19
+ if (!match) return { raw: text, prefix: '', value: text };
20
+ return {
21
+ raw: text,
22
+ prefix: match[1].toLowerCase(),
23
+ value: match[2].trim(),
24
+ };
25
+ }
26
+
27
+ export function parseEvidenceRefsField(value) {
28
+ const text = String(value || '').trim();
29
+ if (!text) return [];
30
+ try {
31
+ const parsed = JSON.parse(text);
32
+ if (Array.isArray(parsed)) return normalizeEvidenceRefs(parsed);
33
+ } catch {
34
+ // Fall through to legacy comma/text parsing.
35
+ }
36
+ return normalizeEvidenceRefs(text.split(',').map((item) => item.trim()));
37
+ }
38
+
39
+ export function frontmatterEvidenceRefs(refs) {
40
+ const normalized = normalizeEvidenceRefs(refs);
41
+ if (normalized.length === 0) return 'evidence_refs: []';
42
+ return ['evidence_refs:', ...normalized.map((ref) => ` - "${ref.replace(/"/g, '\\"')}"`)].join('\n');
43
+ }
44
+
45
+ function cleanCandidatePath(value) {
46
+ return String(value || '')
47
+ .replace(/^[-*]\s+/, '')
48
+ .replace(/^file:/i, '')
49
+ .replace(/[),.;:]+$/g, '')
50
+ .replace(/\\/g, '/')
51
+ .trim();
52
+ }
53
+
54
+ function isUsefulFileCandidate(value) {
55
+ const text = cleanCandidatePath(value);
56
+ if (!text || text.includes('://') || text.startsWith('cmd:') || text.startsWith('raw:')) return false;
57
+ if (isSensitivePath(text) || hasSecretLikeText(text)) return false;
58
+ return (
59
+ text.startsWith('llm-wiki/') ||
60
+ text.startsWith('src/') ||
61
+ text.startsWith('test/') ||
62
+ text.startsWith('docs/') ||
63
+ text.startsWith('bin/') ||
64
+ text.startsWith('examples/') ||
65
+ /^(?:README|AGENTS|CLAUDE|LICENSE|package(?:-lock)?|install)\.[A-Za-z0-9]+$/i.test(text) ||
66
+ /\.[A-Za-z0-9]{1,12}$/.test(text)
67
+ );
68
+ }
69
+
70
+ function relativeProjectPath(projectRoot, value) {
71
+ const cleaned = cleanCandidatePath(value);
72
+ if (!cleaned) return '';
73
+ if (projectRoot && cleaned.startsWith(projectRoot)) {
74
+ return relative(projectRoot, cleaned).split(sep).join('/');
75
+ }
76
+ return cleaned.replace(/^\.\//, '');
77
+ }
78
+
79
+ function addFileRefs(refs, projectRoot, text) {
80
+ for (const candidate of extractPathsFromText(text || '')) {
81
+ const rel = relativeProjectPath(projectRoot, candidate);
82
+ if (!isUsefulFileCandidate(rel)) continue;
83
+ if (isAbsolute(rel) || rel.split('/').includes('..')) continue;
84
+ refs.push(`file:${rel}`);
85
+ }
86
+ }
87
+
88
+ function decodeJsonString(value) {
89
+ try {
90
+ return JSON.parse(`"${value}"`);
91
+ } catch {
92
+ return value.replace(/\\"/g, '"');
93
+ }
94
+ }
95
+
96
+ function addCommandRefs(refs, text) {
97
+ const body = String(text || '');
98
+ const cmdRegex = /"cmd"\s*:\s*"((?:\\.|[^"\\])*)"/g;
99
+ let match = cmdRegex.exec(body);
100
+ while (match) {
101
+ const command = summarizeForStorage(decodeJsonString(match[1]).replace(/\s+/g, ' '), MAX_CMD_EVIDENCE_CHARS);
102
+ if (command && !hasSecretLikeText(command)) refs.push(`cmd:${command}`);
103
+ match = cmdRegex.exec(body);
104
+ }
105
+
106
+ for (const line of body.split(/\r?\n/)) {
107
+ const cleaned = line.replace(/^[-*]\s+/, '').trim();
108
+ const direct = cleaned.match(/^(?:Bash|Shell|Verification|Command):\s*(.+)$/i)?.[1];
109
+ const command = direct || (/^(?:node|npm|npx|git|llm-wiki|pytest|python|pnpm|yarn|vitest|jest|tsc)\b/.test(cleaned) ? cleaned : '');
110
+ if (!command) continue;
111
+ const safe = summarizeForStorage(command.replace(/\s+/g, ' '), MAX_CMD_EVIDENCE_CHARS);
112
+ if (safe && !hasSecretLikeText(safe)) refs.push(`cmd:${safe}`);
113
+ }
114
+ }
115
+
116
+ export function evidenceRefsFromEntry(entry, options = {}) {
117
+ const refs = [];
118
+ const projectRoot = options.projectRoot || '';
119
+ addFileRefs(refs, projectRoot, entry?.changedFiles);
120
+ addFileRefs(refs, projectRoot, entry?.work);
121
+ addCommandRefs(refs, entry?.verification);
122
+ addCommandRefs(refs, entry?.work);
123
+ return normalizeEvidenceRefs(refs).slice(0, options.limit || 20);
124
+ }
125
+
126
+ export function redactEvidenceRefs(refs) {
127
+ return normalizeEvidenceRefs(refs).map((ref) => redactText(ref, 700));
128
+ }
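
To make the prefix handling concrete, the snippet below copies `EVIDENCE_PREFIXES` and `parseEvidenceRef` from the new file above and runs them on a few sample references. `hasSupportedPrefix` is a hypothetical helper added only for this illustration; the diff does not show how lint consumes the parsed prefix.

```javascript
// Copied from the diff above so the example runs on its own.
const EVIDENCE_PREFIXES = new Set(['file', 'cmd', 'raw', 'url']);

function parseEvidenceRef(ref) {
  const text = String(ref || '').trim();
  const match = text.match(/^([a-z]+):([\s\S]*)$/i);
  if (!match) return { raw: text, prefix: '', value: text };
  return { raw: text, prefix: match[1].toLowerCase(), value: match[2].trim() };
}

// Hypothetical check for illustration; the lint rules live elsewhere.
function hasSupportedPrefix(ref) {
  return EVIDENCE_PREFIXES.has(parseEvidenceRef(ref).prefix);
}

const samples = ['file:src/cli.js', 'CMD: npm test', 'https://example.com/page', 'plain note'];
for (const ref of samples) {
  console.log(parseEvidenceRef(ref), hasSupportedPrefix(ref));
}
```

Two details follow from the regular expression: the prefix is matched case-insensitively and lowercased, so `CMD:` parses as `cmd`, and a bare URL such as `https://example.com/page` parses with prefix `https`, which is not in the supported set unless it is written as `url:https://example.com/page`.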