llm-wiki-kit 0.2.14 → 0.2.16

package/README.md CHANGED
@@ -85,7 +85,9 @@ llm-wiki/
  ├── outputs/
  │ ├── questions/
  │ ├── reports/
+ │ ├── exports/
  │ └── maintenance/
+ ├── evals/
  └── procedures/
  ```
 
@@ -101,15 +103,16 @@ The installed hooks:
  - automatically choose Korean or English hook guidance from the current user prompt, then fall back to Claude Code `language`, local `CLAUDE.md`/`AGENTS.md`, and English.
  - remove Codex-facing legacy `oh-my-codex:wiki`/`omx_wiki` surfaces at session start so `llm-wiki/` remains the active wiki implementation
  - record small redacted raw event envelopes and per-turn state
- - capture meaningful work and structured decision points, including tool evidence, changed files, and verification notes
- - before compaction, classify the current turn and save a redacted checkpoint only for meaningful work, structured decisions, or explicit durable requests; explicit durable candidates also get a maintenance queue item when no durable wiki update is detected
+ - capture meaningful work and structured decision points, including tool evidence, changed files, verification notes, and reusable durable-candidate signals
+ - attach safe `evidence_refs` candidates to generated durable candidates when changed files or verification commands are available
+ - before compaction, classify the current turn and save a redacted checkpoint only for meaningful work, structured decisions, explicit durable requests, or suggested durable candidates; durable candidates get a maintenance queue item when no durable wiki update is detected
  - after compaction, store the redacted compact summary only; if pre-compact preservation failed, prepare a recovery packet for the next legal model-visible context hook
  - allow tool calls to proceed without secret/PII-based hook blocking
- - update chunked `llm-wiki/outputs/questions/YYYY-MM-DD/live-qa-001.md` style archives only for meaningful work or structured decision turns
+ - update chunked `llm-wiki/outputs/questions/YYYY-MM-DD/live-qa-001.md` style archives only for meaningful work, structured decision turns, or hook-suggested durable candidates
  - avoid automatic `wiki/queries/` and `wiki/decisions/` promotion in the default answer-first mode
- - queue durable cleanup candidates only for explicit documentation requests that were not reflected in durable wiki files, or when stale turn state is recovered
+ - queue durable cleanup candidates for explicit documentation requests, hook-suggested durable candidates, or recovered stale turn state that were not reflected in durable wiki files
  - recover stale per-turn state into that queue on the next session start or prompt submit when the previous stop hook did not complete
- - nudge the active LLM to fold approved reusable facts into existing wiki pages instead of leaving everything as one-off Q&A
+ - nudge the active LLM to fold approved or hook-suggested reusable facts into existing wiki pages instead of leaving everything as one-off Q&A
  - automatically refresh managed rules/templates for older projects when the current runtime starts a session
 
  If you need to think about saving every answer manually, the setup has failed.
@@ -123,7 +126,7 @@ Most users should not need these during daily Claude Code/Codex work. They exist
  - Install/update: `llm-wiki install`, `llm-wiki update`, `llm-wiki post-update`, `llm-wiki projects`
  - Diagnostics: `llm-wiki doctor`, `llm-wiki status`, `llm-wiki version`
  - Manual: `llm-wiki manual`
- - Agent maintenance helpers: `llm-wiki context`, `llm-wiki lint`, `llm-wiki consolidate`, `llm-wiki maintenance`
+ - Agent maintenance helpers: `llm-wiki context`, `llm-wiki lint`, `llm-wiki consolidate`, `llm-wiki maintenance`, `llm-wiki eval`, `llm-wiki export`
  - Live Q&A archive helper: `llm-wiki archive-questions --workspace <project> [--date YYYY-MM-DD] [--dry-run]`
  - Cleanup: `llm-wiki uninstall`
 
@@ -139,13 +142,17 @@ Installed npm runtimes also perform a cached update notice check from hooks whil
 
  `llm-wiki post-update --workspace <project>` reapplies the current runtime's hook entries and safe managed template updates without running `npm install -g`. Use `post-update --all --workspace <search-root>` to reapply templates across discovered project roots.
 
- `llm-wiki context "<query>"` prints the full debug view of the layered context sources used by hooks. Hook injection may render those sources as functional compact context for Codex and Claude, but this CLI stays verbose so maintainers can inspect retrieval, snippets, memory, index, expansion behavior, and context budget metadata. Daily use should rely on hook injection. By default, episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are excluded from search unless they were promoted with `memory_type: semantic` or `procedural` and `importance >= 4`; use `--include-episodic` only when debugging old automatic records. Archived or superseded pages are hidden unless `--include-archived` is requested, while stale pages remain searchable with lower score.
+ `llm-wiki context "<query>"` prints the full debug view of the layered context sources used by hooks. Hook injection may render those sources as functional compact context for Codex and Claude, but this CLI stays verbose so maintainers can inspect retrieval, snippets, memory, index, expansion behavior, context budget metadata, `rankReason`, `matchedFields`, `scoreBreakdown`, `visibilityReason`, and `evidenceRefs`. The text formatter adds a short `why selected` line for each hit; hook compact context deliberately omits that extra detail. Daily use should rely on hook injection. By default, episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are excluded from search unless they were promoted with `memory_type: semantic` or `procedural` and `importance >= 4`; use `--include-episodic` only when debugging old automatic records. Archived or superseded pages are hidden unless `--include-archived` is requested, while stale pages remain searchable with lower score.
 
- `llm-wiki lint` checks wiki health and detects outdated managed rules from older kit versions. It also warns when `memory.md` is near budget, wiki page count nears the search cap, hidden episodic/context pages accumulate, or stale/archived pages lack supersession/link discoverability. Agents may use it before/after meaningful wiki maintenance.
+ `llm-wiki lint` checks wiki health and detects outdated managed rules from older kit versions. It validates optional `evidence_refs` entries with the prefixes `file:`, `cmd:`, `raw:`, and `url:`; unsafe paths, unsupported prefixes, credential-bearing URLs, command secrets, and secret-like values are reported as errors, while missing local evidence targets are warnings. It also warns when `memory.md` is near budget, wiki page count nears the search cap, hidden episodic/context pages accumulate, or stale/archived pages lack supersession/link discoverability. Agents may use it before/after meaningful wiki maintenance.
 
  `llm-wiki consolidate` refreshes only generated marker blocks in `wiki/memory.md` and `wiki/index.md`. Generated maps keep durable non-archived pages, hide default episodic records, skip stale/archived/superseded pages, and report those counts in dry-run output. It is an agent maintenance helper, not a command users should run after every turn.
 
- `llm-wiki maintenance` prints the pending queue and review due status from `llm-wiki/outputs/maintenance/queue.md`. Hooks create only selective candidates; the active agent should merge reusable items into existing durable wiki pages and mark queue items `done` or `skipped` without delaying unrelated user answers. Periodic maintenance is a soft agent-side reminder, not a user command loop.
+ `llm-wiki maintenance` prints the queue and review due status from `llm-wiki/outputs/maintenance/queue.md`. Queue states are `pending -> approved -> done` or `skipped`. Use `llm-wiki maintenance --workspace <project> --approve <id> --target <wiki/...md>` when durable promotion is approved, `--done <id> --target <wiki/...md>` after the active agent has merged the fact into a durable page, and `--skip <id> [--note "..."]` for duplicate or non-durable candidates. Approved items are shown before pending items in hook reminders. Periodic maintenance is a soft agent-side reminder, not a user command loop.
+
+ `llm-wiki eval --workspace <project> [--fixture <path>] [--limit 5] [--json]` runs retrieval fixtures from `llm-wiki/evals/retrieval.json` by default. If the fixture is absent, it exits successfully with `no fixture found`. Fixtures list `query`, `expected`, and `unexpected` paths; output reports recall, missed expected hits, unexpected hits, and top hits using the same durable visibility policy as export.
+
+ `llm-wiki export --workspace <project> [--format all|llms|llms-full|json] [--output <dir>] [--dry-run] [--json]` writes durable wiki manifests under `llm-wiki/outputs/exports/` by default. `llms.txt` is an agent onboarding and handoff manifest, not a passive SEO artifact. `llms-full.txt` is a redacted durable context bundle for compaction recovery or handoff. `llm-wiki.json` is a structured manifest for future adapters and eval tooling. Export uses the same durable visibility policy as search/eval and redacts credentials before writing.
 
  `llm-wiki archive-questions` splits older legacy `llm-wiki/outputs/questions/YYYY-MM-DD-live-qa.md` files into the chunked `llm-wiki/outputs/questions/YYYY-MM-DD/` layout. It preserves the original under `outputs/questions/archive/originals/` with a SHA-256 sidecar and replaces the legacy file with a short pointer stub. Use `--dry-run` first when reviewing a large archive.
 
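The `llm-wiki eval` paragraph in the hunk above describes fixtures with `query`, `expected`, and `unexpected` paths and a report of recall, missed expected hits, and unexpected hits. A minimal Python sketch of that scoring step follows. It is not the kit's implementation: the function name `score_fixture`, the `FixtureResult` shape, and the choice to report recall as 1.0 when a fixture lists no expected paths are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class FixtureResult:
    query: str
    recall: float
    missed: list[str]
    unexpected_hits: list[str]


def score_fixture(fixture: dict, hits: list[str], limit: int = 5) -> FixtureResult:
    """Score one retrieval fixture against ranked hit paths (hypothetical helper)."""
    top = hits[:limit]  # only the first `limit` hits count, mirroring `--limit 5`
    expected = fixture.get("expected", [])
    unexpected = fixture.get("unexpected", [])
    missed = [path for path in expected if path not in top]
    found = len(expected) - len(missed)
    # Assumption: a fixture with no expected paths counts as full recall.
    recall = found / len(expected) if expected else 1.0
    unexpected_hits = [path for path in unexpected if path in top]
    return FixtureResult(fixture["query"], recall, missed, unexpected_hits)
```

A fixture that expects two pages and retrieves only one of them would score a recall of 0.5 and list the other page under `missed`.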
@@ -192,6 +199,7 @@ llm-wiki hook claude Stop
  - PreCompact may read a small bounded transcript tail to create a redacted checkpoint, but it does not store the full transcript or raw `transcript_path`.
  - Tool calls are not blocked only because inputs look sensitive.
  - Authentication values such as tokens, passwords, and private keys are redacted before durable summaries are written.
+ - Generated exports are redacted and must not contain npm tokens, WinRM credentials, private keys, raw `.env`, or full raw transcripts.
  - Hook payloads are stored only as redacted event envelopes.
  - Phone numbers, emails, dates, and business identifiers are preserved by default so the wiki remains useful for local work.
 
package/docs/concepts.md CHANGED
@@ -18,12 +18,14 @@ The important behavior is a loop:
  2. `memory.md`, `index.md`, and relevant wiki context are injected automatically with an answer-first instruction.
  3. The user works normally; no extra command loop is required.
  4. Hooks gather redacted prompt/tool/result summaries.
- 5. At stop/session end, hooks append redacted chunked live Q&A only for turns with work evidence or structured decision/debugging conclusions.
- 6. Simple answers, status checks, and keyword-only responses stay out of live Q&A and durable wiki by default.
- 7. Durable wiki promotion is selective: explicit record/document requests should be handled by the active agent in existing wiki pages; the hook queues review only when such a request was not reflected in durable files.
- 8. At the next start/prompt after an abrupt shutdown, hooks can recover stale turn state into `outputs/maintenance/queue.md`.
- 9. When reusable knowledge appears, the active Claude Code/Codex agent folds approved facts into existing durable wiki pages instead of leaving everything as one-off Q&A.
- 10. Future sessions start from the improved wiki instead of relying on long chat history.
+ 5. At stop/session end, hooks append redacted chunked live Q&A only for turns with work evidence, structured decision/debugging conclusions, or hook-suggested durable candidates.
+ 6. When possible, generated candidates carry safe `evidence_refs` such as changed files, verification commands, raw source IDs, or external URLs.
+ 7. Simple answers, status checks, and keyword-only responses stay out of live Q&A and durable wiki by default.
+ 8. Durable wiki promotion is selective: explicit record/document requests and hook-suggested durable candidates should be handled by the active agent in existing wiki pages; the hook queues review only when the turn was not reflected in durable files.
+ 9. At the next start/prompt after an abrupt shutdown, hooks can recover stale turn state into `outputs/maintenance/queue.md`.
+ 10. When reusable knowledge appears, the active Claude Code/Codex agent folds approved or hook-suggested facts into existing durable wiki pages instead of leaving everything as one-off Q&A.
+ 11. Export and eval reuse the same durable visibility policy so handoff manifests, retrieval fixtures, and context selection describe the same wiki surface.
+ 12. Future sessions start from the improved wiki instead of relying on long chat history.
 
  The kit is a template/runtime repository. It must not centralize project wiki contents.
 
@@ -40,8 +42,11 @@ The maintenance loop is intentionally layered:
 
  - `memory.md`: short hot index for current durable facts.
  - `index.md`: broad navigation map.
- - MiniSearch + wikilinks: retrieval over durable `wiki/**/*.md`, with episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages hidden by default unless promoted or `--include-episodic` is requested; archived/superseded pages stay preserved but hidden unless `--include-archived` is requested.
- - `outputs/maintenance/queue.md`: selective reminders for explicit durable requests that need review, plus stale turn recovery.
- - `lint`: finds broken links, stale pages, duplicates, metadata gaps, secret-like content, outdated managed rules, memory/page-count budget pressure, hidden episodic growth, and stale/archived discoverability gaps.
- - `maintenance`: reports `reviewDue` only when periodic thresholds are met; hook reminders are soft and limited to session start/instructions loaded or maintenance-related prompts.
+ - MiniSearch + wikilinks: retrieval over durable `wiki/**/*.md`, with episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages hidden by default unless promoted or `--include-episodic` is requested; archived/superseded pages stay preserved but hidden unless `--include-archived` is requested. Verbose context explains `why selected`; hook context stays compact.
+ - `evidence_refs`: optional frontmatter that ties durable claims to `file:`, `cmd:`, `raw:`, or `url:` evidence without embedding secrets or raw transcripts.
+ - `outputs/maintenance/queue.md`: selective reminders for explicit durable requests, hook-suggested durable candidates, and stale turn recovery that need review. Queue state is `pending`, `approved`, `done`, or `skipped`.
+ - `lint`: finds broken links, stale pages, duplicates, metadata gaps, invalid evidence refs, secret-like content, outdated managed rules, memory/page-count budget pressure, hidden episodic growth, and stale/archived discoverability gaps.
+ - `maintenance`: reports `reviewDue` only when periodic thresholds are met; hook reminders are soft and limited to session start/instructions loaded or compact prompt-time reminders for maintenance prompts, approved items, durable candidates, stale/recovered items, or review-threshold pressure.
  - `consolidate`: agent helper that refreshes generated blocks in `memory.md` and `index.md` while preserving handwritten notes, keeping default query/context/session pages out of the durable generated maps, and skipping stale/archived/superseded pages.
+ - `eval`: checks retrieval fixtures in `llm-wiki/evals/retrieval.json` and reports expected recall, missed expected paths, unexpected hits, and top hits.
+ - `export`: writes redacted `llms.txt`, `llms-full.txt`, and `llm-wiki.json` manifests for agent onboarding, handoff, retrieval eval, and external consumption. `llms.txt` is not treated as a passive SEO artifact.
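The `evidence_refs` and `lint` entries in the hunk above, together with the README text, describe four allowed prefixes (`file:`, `cmd:`, `raw:`, `url:`), errors for unsupported prefixes, unsafe paths, and credential-bearing URLs, and warnings for missing local targets. The Python sketch below illustrates that classification under stated assumptions. The function name `check_evidence_ref`, the `(level, message)` return shape, and the definition of "unsafe path" as absolute or containing `..` are hypothetical, and the secret-pattern checks mentioned in the README (command secrets, secret-like values) are intentionally left out.

```python
from pathlib import Path, PurePosixPath
from urllib.parse import urlsplit

ALLOWED_PREFIXES = ("file:", "cmd:", "raw:", "url:")


def check_evidence_ref(ref: str, root: Path) -> tuple[str, str]:
    """Classify one evidence ref as 'ok', 'warning', or 'error' (hypothetical helper)."""
    prefix = next((p for p in ALLOWED_PREFIXES if ref.startswith(p)), None)
    if prefix is None:
        return ("error", "unsupported prefix")
    value = ref[len(prefix):]
    if prefix == "url:":
        parts = urlsplit(value)
        # user:password@host style URLs carry credentials and are rejected.
        if parts.username or parts.password:
            return ("error", "credential-bearing URL")
    if prefix == "file:":
        path = PurePosixPath(value)
        # Assumption: "unsafe" means absolute or escaping the workspace via "..".
        if path.is_absolute() or ".." in path.parts:
            return ("error", "unsafe path")
        if not (root / value).exists():
            return ("warning", "missing local evidence target")
    return ("ok", "")
```

Returning a level instead of raising keeps errors and warnings in one pass, which matches the way the lint description separates the two severities.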
@@ -42,11 +42,13 @@ when no project `CLAUDE.md` exists. Existing `CLAUDE.md` files are not overwritt
 
  The hook records redacted turn summaries but does not deny tool calls only because an input looks sensitive. Hook payloads are stored as small redacted event envelopes rather than full transcripts, and context output is redacted field by field before it is returned to Claude Code.
 
- At `SessionStart`/`InstructionsLoaded`, the hook first attempts a safe managed-template refresh, recovers stale turn state into `outputs/maintenance/queue.md`, performs a cached npm update notice check for npm installs, then injects functional compact context. The context still uses `llm-wiki/wiki/memory.md`, `llm-wiki/wiki/index.md`, relevant wiki/search state, operating rules, maintenance signals, passive runtime update status, and managed-template cleanup notes; the hook formats those signals so they are usable if shown in the Claude Code UI. At `UserPromptSubmit`, it recovers stale turn state, searches wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Update notice cache is scoped by npm command, and maintenance reminders are shown only when the prompt is wiki/maintenance related or matches a queue topic.
+ At `SessionStart`/`InstructionsLoaded`, the hook first attempts a safe managed-template refresh, recovers stale turn state into `outputs/maintenance/queue.md`, performs a cached npm update notice check for npm installs, then injects functional compact context. The context still uses `llm-wiki/wiki/memory.md`, `llm-wiki/wiki/index.md`, relevant wiki/search state, operating rules, maintenance signals, passive runtime update status, and managed-template cleanup notes; the hook formats those signals so they are usable if shown in the Claude Code UI. At `UserPromptSubmit`, it recovers stale turn state, searches wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Verbose `llm-wiki context` can explain `why selected`, `rankReason`, `matchedFields`, and `evidenceRefs`, but hook context keeps those details compact. Update notice cache is scoped by npm command, and maintenance reminders are shown for wiki/maintenance prompts, queue topic matches, approved items, durable candidates, stale/recovered items, or review-threshold pressure.
 
  Hook-visible language is selected from the current user prompt first. Korean prompts get Korean guidance, English prompts get English guidance. If no prompt language is clear, the hook checks Claude Code `settings.json` `language` when it exists, then local `CLAUDE.md`/`AGENTS.md` language signals, then English. The kit does not require Claude Code to expose a language setting.
 
- `PostToolUse` and `PostToolBatch` record redacted tool summaries in the same turn buffer. `PreCompact` classifies the current turn before compaction: simple turns record only a context note, work-evidence or structured-decision turns write a chunked live Q&A checkpoint, and explicit durable candidates write a maintenance queue item only when no durable wiki update is detected. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event. `PostCompact` stores the redacted compact summary as a context note and prepares any pending recovery packet without returning model-visible context directly. In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` and `SessionEnd` append chunked live Q&A only for work-evidence or structured-decision turns and do not auto-create `wiki/queries/` or `wiki/decisions/`. If the user explicitly asked to record or document durable knowledge and no durable wiki update is detected, `Stop`/`SessionEnd` queue a pending maintenance item for agent review. `Stop` and `SessionEnd` then clear the per-session turn buffer; `SubagentStop` does not.
+ `PostToolUse` and `PostToolBatch` record redacted tool summaries in the same turn buffer. `PreCompact` classifies the current turn before compaction: simple turns record only a context note, work-evidence, structured-decision, explicit durable, or hook-suggested durable turns write a chunked live Q&A checkpoint, and durable candidates write a maintenance queue item only when no durable wiki update is detected. Queue items may carry safe `evidence_refs` candidates from changed files and verification commands. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event. `PostCompact` stores the redacted compact summary as a context note and prepares any pending recovery packet without returning model-visible context directly. In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` and `SessionEnd` append chunked live Q&A only for work-evidence, structured-decision, or durable-candidate turns and do not auto-create `wiki/queries/` or `wiki/decisions/`. If the user explicitly asked to record durable knowledge, or the turn contains reusable architecture/debugging/policy/procedure/decision signals, and no durable wiki update is detected, `Stop`/`SessionEnd` queue a pending maintenance item for agent review. Approved and durable-candidate maintenance items are surfaced as compact soft reminders. `Stop` and `SessionEnd` then clear the per-session turn buffer; `SubagentStop` does not.
+
+ For handoff or retrieval verification, use `llm-wiki export --workspace <project> --format all` and `llm-wiki eval --workspace <project>`. The generated `llms.txt`/`llms-full.txt`/`llm-wiki.json` files are redacted durable manifests, not raw transcripts.
 
  Set `LLM_WIKI_KIT_AUTO_PROJECT_UPDATE=0` only while diagnosing automatic managed-template refresh behavior.
  Set `LLM_WIKI_KIT_UPDATE_NOTICE=0` only while suppressing the cached passive runtime update status.
@@ -30,18 +30,20 @@ Handled events:
  Expected behavior:
 
  - `SessionStart` first attempts a safe managed-template refresh, removes Codex-facing legacy `oh-my-codex:wiki`/`omx_wiki` surfaces when they reappear, recovers stale turn state into `outputs/maintenance/queue.md`, performs a cached npm update notice check for npm installs, then injects functional compact context. The context still uses `llm-wiki/wiki/memory.md`, `llm-wiki/wiki/index.md`, relevant wiki/search state, operating rules, maintenance signals, passive runtime update status, and managed-template cleanup notes; the hook formats those signals so they are usable if shown in the Codex UI.
- - `UserPromptSubmit` recovers stale turn state, searches project wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Update notice cache is scoped by npm command, and maintenance reminders are shown only when the prompt is wiki/maintenance related or matches a queue topic.
+ - `UserPromptSubmit` recovers stale turn state, searches project wiki pages with MiniSearch or substring fallback, expands one-hop wikilinks, redacts context fields, performs the same cached update notice check, and injects the smallest useful functional compact context set. Verbose `llm-wiki context` can explain `why selected`, `rankReason`, `matchedFields`, and `evidenceRefs`, but hook context keeps those details compact. Update notice cache is scoped by npm command, and maintenance reminders are shown for wiki/maintenance prompts, queue topic matches, approved items, durable candidates, stale/recovered items, or review-threshold pressure.
  - Hook-visible language is selected from the current user prompt first. Korean prompts get Korean guidance, English prompts get English guidance. If no prompt language is clear, Codex falls back to local `CLAUDE.md`/`AGENTS.md` language signals, then English.
  - `PreToolUse` records redacted tool summaries without blocking tool calls.
  - `PostToolUse` records redacted tool summaries in a turn buffer.
- - `PreCompact` classifies the current turn before compaction. Simple turns record only a context note; work-evidence or structured-decision turns write a chunked live Q&A checkpoint; explicit durable candidates write a maintenance queue item only when no durable wiki update is detected. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event.
+ - `PreCompact` classifies the current turn before compaction. Simple turns record only a context note; work-evidence, structured-decision, explicit durable, or hook-suggested durable turns write a chunked live Q&A checkpoint; durable candidates write a maintenance queue item only when no durable wiki update is detected. The checkpoint can include only a bounded redacted transcript tail, never the full raw transcript or raw `transcript_path`. Compaction is not blocked; if checkpoint storage fails, the hook records a compact recovery packet for the next legal context-injection event.
  - `PostCompact` stores the redacted compact summary as a context note and prepares any pending compact recovery packet. It does not return `hookSpecificOutput.additionalContext`, because Codex `PostCompact` only supports common output fields.
- - In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` appends chunked live Q&A only for work-evidence or structured-decision turns and does not auto-create `wiki/queries/` or `wiki/decisions/`.
- - If the user explicitly asked to record or document durable knowledge and no durable wiki update is detected, `Stop` queues a pending maintenance item for agent review.
+ - In the default `answer-first` mode, `SubagentStop` does not create live Q&A, query, decision, or maintenance files. `Stop` appends chunked live Q&A only for work-evidence, structured-decision, or durable-candidate turns and does not auto-create `wiki/queries/` or `wiki/decisions/`.
+ - If the user explicitly asked to record durable knowledge, or the turn contains reusable architecture/debugging/policy/procedure/decision signals, and no durable wiki update is detected, `Stop` queues a pending maintenance item for agent review. Queue items may carry safe `evidence_refs` candidates from changed files and verification commands. Approved and durable-candidate maintenance items are surfaced as compact soft reminders.
  - `Stop` clears the per-session turn buffer after recording. `SubagentStop` leaves the parent turn buffer available for the final stop event.
 
  Hook payloads are stored as small redacted event envelopes rather than full transcripts. Context output is also redacted field by field before it is returned to Codex. Functional compact context is a presentation policy, not a feature reduction: Codex still receives the wiki memory, search, maintenance, and passive update signals needed for the hook workflow.
 
+ For handoff or retrieval verification, use `llm-wiki export --workspace <project> --format all` and `llm-wiki eval --workspace <project>`. The generated `llms.txt`/`llms-full.txt`/`llm-wiki.json` files are redacted durable manifests, not raw transcripts.
+
  Set `LLM_WIKI_KIT_AUTO_PROJECT_UPDATE=0` only while diagnosing automatic managed-template refresh behavior.
  Set `LLM_WIKI_KIT_UPDATE_NOTICE=0` only while suppressing the cached passive runtime update status.
  Set `LLM_WIKI_KIT_CAPTURE_MODE=legacy-eager` only as deprecated compatibility mode for the old eager query/decision capture behavior.
package/docs/manual.md CHANGED
@@ -55,7 +55,9 @@ llm-wiki/
  ├── outputs/
  │ ├── questions/
  │ ├── reports/
+ │ ├── exports/
  │ └── maintenance/
+ ├── evals/
  └── procedures/
  ```
 
@@ -69,9 +71,10 @@ Use Codex or Claude Code normally. Installed hooks:
  - select Korean or English hook guidance from the current user prompt and local instruction files;
  - use `wiki/memory.md`, `wiki/index.md`, relevant wiki search, maintenance signals, update notices, and compact recovery packets;
  - record redacted prompt/tool/result summaries in per-turn state;
- - archive only meaningful work turns or structured decision/debugging turns into chunked `outputs/questions/YYYY-MM-DD/live-qa-001.md` files;
+ - preserve safe evidence pointers as `evidence_refs` when changed files or verification commands are available;
+ - archive only meaningful work turns, structured decision/debugging turns, or hook-suggested durable candidates into chunked `outputs/questions/YYYY-MM-DD/live-qa-001.md` files;
  - avoid automatic `wiki/queries/` and `wiki/decisions/` promotion in the default answer-first mode;
- - queue durable cleanup candidates only for explicit documentation requests that were not reflected in durable wiki files, or when stale turn state is recovered;
+ - queue durable cleanup candidates for explicit documentation requests, hook-suggested durable candidates, or recovered stale turn state that were not reflected in durable wiki files;
  - refresh clearly managed rules/templates for older projects at session start;
  - remove legacy Codex-facing `oh-my-codex:wiki`/`omx_wiki` surfaces when they reappear.
 
@@ -85,7 +88,7 @@ Default:
  LLM_WIKI_KIT_CAPTURE_MODE=answer-first
  ```
 
- `answer-first` keeps simple Q&A, status checks, and keyword-only replies out of durable wiki and live Q&A by default. It archives work turns with tool evidence, changed-file evidence, verification, or structured `Decision:`/`Root cause:` style conclusions. Explicit durable requests create maintenance queue candidates only when no durable wiki update is detected.
+ `answer-first` keeps simple Q&A, status checks, and keyword-only replies out of durable wiki and live Q&A by default. It archives work turns with tool evidence, changed-file evidence, verification, structured `Decision:`/`Root cause:` style conclusions, or reusable durable-candidate signals. Explicit durable requests and hook-suggested durable candidates create maintenance queue candidates only when no durable wiki update is detected.
 
  Deprecated compatibility mode:
 
@@ -149,9 +152,11 @@ Most users should not need these during daily coding. They are for install, upda
  - `llm-wiki bootstrap --workspace <project>`: create project-local wiki structure.
  - `llm-wiki migrate --workspace <project>`: copy legacy wiki material into the current layout.
  - `llm-wiki context "<query>" --workspace <project>`: verbose debug view of hook context sources.
+ - `llm-wiki eval --workspace <project> [--fixture <path>] [--limit 5] [--json]`: run retrieval fixtures.
+ - `llm-wiki export --workspace <project> [--format all|llms|llms-full|json] [--output <dir>] [--dry-run] [--json]`: write durable wiki manifests.
  - `llm-wiki lint --workspace <project>`: wiki health check.
  - `llm-wiki consolidate --workspace <project> [--dry-run]`: refresh generated blocks in `memory.md` and `index.md`.
- - `llm-wiki maintenance --workspace <project> [--json]`: show pending durable cleanup candidates and review health.
+ - `llm-wiki maintenance --workspace <project> [--approve <id> --target <wiki/...md> | --done <id> --target <wiki/...md> | --skip <id> [--note "..."]] [--json]`: show or update durable cleanup review state.
  - `llm-wiki archive-questions --workspace <project> [--date YYYY-MM-DD] [--dry-run]`: split old flat live Q&A files into chunks.
  - `llm-wiki uninstall`: remove kit-managed hook entries, leaving project wiki contents intact.
 
@@ -187,16 +192,56 @@ llm-wiki context "auth architecture" --workspace /path/to/project --include-epis
  llm-wiki context "auth architecture" --workspace /path/to/project --include-archived
  ```
 
- Default search prioritizes durable semantic/procedural wiki pages. Episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are hidden unless promoted with durable metadata or explicitly requested. Archived and superseded pages are hidden unless `--include-archived` is used. Stale pages remain searchable with lower score.
+ Default search prioritizes durable semantic/procedural wiki pages. Episodic `wiki/queries/`, `wiki/context/`, and `session-log` pages are hidden unless promoted with durable metadata or explicitly requested. Archived and superseded pages are hidden unless `--include-archived` is used. Stale pages remain searchable with lower score. JSON hits include `rankReason`, `visibilityReason`, `evidenceRefs`, `matchedFields`, and `scoreBreakdown`; the text formatter prints `why selected` for maintainers. Hook compact context stays shorter and does not include those debug lines.
+
+ ## Evidence, Eval, And Export
+
+ Curated wiki pages may include optional frontmatter:
+
+ ```yaml
+ evidence_refs:
+ - "file:src/wiki-search.js"
+ - "cmd:node --test"
+ - "raw:source-id"
+ - "url:https://example.com/reference"
+ ```
+
+ `llm-wiki lint` validates the prefix, safety, and rough reachability of those references. `file:` must be repo-relative, `cmd:` must be a short single-line redacted-safe command, `raw:` should resolve to a raw/source candidate, and `url:` must be `http` or `https` without credentials.
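
The prefix and safety rules in the paragraph above can be approximated in a few lines. This is a minimal sketch, not the package's lint implementation: `classifyEvidenceRef` is a hypothetical helper, the 200-character command cap is an assumed limit, and the `raw:` lookup and redaction checks are left out because they need workspace access.

```javascript
// Hypothetical helper: classifyEvidenceRef is not part of llm-wiki-kit.
// It only approximates the prefix and safety rules described in the README.
function classifyEvidenceRef(ref) {
  const match = /^(file|cmd|raw|url):(.*)$/s.exec(ref);
  if (!match) return { ok: false, reason: 'unsupported prefix' };
  const [, prefix, value] = match;

  if (prefix === 'file') {
    // Repo-relative: no absolute paths, drive letters, or parent traversal.
    const absolute = value.startsWith('/') || /^[A-Za-z]:[\\/]/.test(value);
    const escapes = value.split(/[\\/]/).includes('..');
    return absolute || escapes || value === ''
      ? { ok: false, reason: 'file must be repo-relative' }
      : { ok: true, prefix };
  }
  if (prefix === 'cmd') {
    // Short and single-line; the 200-character cap is an assumed limit.
    return value.includes('\n') || value.length === 0 || value.length > 200
      ? { ok: false, reason: 'cmd must be a short single line' }
      : { ok: true, prefix };
  }
  if (prefix === 'url') {
    let parsed;
    try {
      parsed = new URL(value);
    } catch {
      return { ok: false, reason: 'url is not parseable' };
    }
    const httpLike = parsed.protocol === 'http:' || parsed.protocol === 'https:';
    const hasCredentials = parsed.username !== '' || parsed.password !== '';
    return httpLike && !hasCredentials
      ? { ok: true, prefix }
      : { ok: false, reason: 'url must be http(s) without credentials' };
  }
  // raw: should resolve to a raw/source candidate; resolution needs the
  // workspace, so only emptiness is checked in this sketch.
  return value === '' ? { ok: false, reason: 'raw id is empty' } : { ok: true, prefix };
}
```

Returning a reason string rather than throwing mirrors how a linter usually collects several findings per page before reporting them.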
+
+ `llm-wiki eval` reads `llm-wiki/evals/retrieval.json` by default:
+
+ ```json
+ {
+   "queries": [
+     {
+       "query": "semantic retrieval",
+       "expected": ["wiki/architecture/retrieval.md"],
+       "unexpected": ["wiki/queries/old-auto.md"]
+     }
+   ]
+ }
+ ```
+
+ Missing fixtures exit successfully with `no fixture found`. Present fixtures report expected recall, missed expected paths, unexpected hits, and top hits. Eval and export share the same durable visibility policy so archived/superseded/default episodic pages are treated consistently.
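
To make the reported fields concrete, the sketch below scores one fixture query against a ranked list of hit paths. It is an illustration under assumptions: `scoreQuery` is a hypothetical helper, the result field names are invented for this example, and `wiki/concepts/search.md` is a made-up path. The package's actual output shape may differ.

```javascript
// Hypothetical helper: scoreQuery is not exported by llm-wiki-kit.
// `hits` is assumed to be the ranked list of page paths returned for the query.
function scoreQuery(fixtureQuery, hits) {
  const expected = fixtureQuery.expected ?? [];
  const unexpected = fixtureQuery.unexpected ?? [];
  const hitSet = new Set(hits);

  const missedExpected = expected.filter((path) => !hitSet.has(path));
  const unexpectedHits = unexpected.filter((path) => hitSet.has(path));
  // Recall is the share of expected paths found; an empty list counts as 1.
  const recall = expected.length === 0
    ? 1
    : (expected.length - missedExpected.length) / expected.length;

  return { query: fixtureQuery.query, recall, missedExpected, unexpectedHits, topHits: hits };
}

const result = scoreQuery(
  {
    query: 'semantic retrieval',
    expected: ['wiki/architecture/retrieval.md'],
    unexpected: ['wiki/queries/old-auto.md'],
  },
  ['wiki/architecture/retrieval.md', 'wiki/concepts/search.md'],
);
console.log(result.recall, result.missedExpected, result.unexpectedHits);
```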
+
+ `llm-wiki export` writes `llms.txt`, `llms-full.txt`, and `llm-wiki.json` under `llm-wiki/outputs/exports/` by default. `llms.txt` is a curated onboarding and handoff manifest for agents and humans, not a passive SEO file. `llms-full.txt` is a bounded redacted context bundle for handoff or compaction recovery. `llm-wiki.json` is the structured manifest for future adapters and eval tooling. `--dry-run` reports planned files without writing them.
 
  ## Maintenance
 
- `llm-wiki maintenance` reports pending queue state and review health. It does not merge pages automatically. The active agent should merge reusable items into existing durable pages and mark queue items `done` or `skipped`.
+ `llm-wiki maintenance` reports queue state and review health. It does not merge pages automatically. The active agent should merge reusable items into existing durable pages and update each queue item's status (`pending`, `approved`, `done`, or `skipped`) as it progresses.
+
+ ```bash
+ llm-wiki maintenance --workspace <project> --approve <id> --target wiki/concepts/topic.md
+ llm-wiki maintenance --workspace <project> --done <id> --target wiki/concepts/topic.md
+ llm-wiki maintenance --workspace <project> --skip <id> --note "duplicate"
+ ```
+
+ `approved` means durable promotion is accepted but not yet merged. `done` means the durable page has been updated. `skipped` means the item was duplicate or not reusable enough. Approved reminders are shown before pending reminders.
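
The three review actions can be read as a small mapping from flag to queue status. The sketch below is a guess at that shape, not the package's `updateMaintenanceItem`: `planMaintenanceUpdate` is a hypothetical helper, and it assumes that `--approve` and `--done` require `--target` (as the usage string suggests) and that each flag maps to the status named in the prose.

```javascript
// Hypothetical helper: planMaintenanceUpdate is not the package's API.
// Assumptions: --approve and --done require --target, and each flag maps
// to the queue status of the same meaning.
const STATUS_BY_ACTION = { approve: 'approved', done: 'done', skip: 'skipped' };

function planMaintenanceUpdate(options) {
  const actions = Object.keys(STATUS_BY_ACTION).filter((name) => options[name]);
  if (actions.length !== 1) {
    throw new Error('expected exactly one of --approve, --done, or --skip');
  }
  const action = actions[0];
  if (action !== 'skip' && !options.target) {
    throw new Error(`--${action} requires --target <wiki/...md>`);
  }
  return {
    id: options[action],
    status: STATUS_BY_ACTION[action],
    target: options.target ?? null,
    note: options.note ?? null,
  };
}
```

Validating the request before touching the queue file keeps a malformed command from leaving an item half-updated.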
 
  Hook reminders are soft:
 
  - session start and instructions loaded may show a one-item summary;
- - prompt submit shows a reminder only when the prompt is wiki/maintenance-related or matches a queue topic.
+ - prompt submit shows one compact reminder for wiki/maintenance prompts, queue topic matches, approved items, durable candidates, stale/recovered items, or review-threshold pressure.
 
  ## PreCompact
 
@@ -214,6 +259,7 @@ LLM_WIKI_KIT_PRECOMPACT_ENFORCEMENT=limited
  - Hook payloads are stored as small redacted event envelopes.
  - Tool calls are not blocked only because input looks sensitive.
  - Tokens, passwords, bearer credentials, private keys, and raw `.env` contents are redacted before durable storage.
+ - Generated exports are redacted and must not store npm tokens, WinRM credentials, private keys, raw `.env`, or full raw transcripts.
  - Phone numbers, emails, dates, and business identifiers are preserved by default because they can be useful local work context.
  - `llm-wiki lint` reports secret-like wiki content as an error.
 
@@ -232,6 +278,8 @@ llm-wiki version
  llm-wiki status --workspace /path/to/project
  llm-wiki doctor --workspace /path/to/project
  llm-wiki update --check --workspace /path/to/project
+ llm-wiki eval --workspace /path/to/project --json
+ llm-wiki export --workspace /path/to/project --format all --dry-run --json
  ```
 
  Native Windows support claims require a real Windows smoke: install the published package, run `install`, `status`, and `doctor` against a Windows project, inspect `%USERPROFILE%\.codex\hooks.json` and `%USERPROFILE%\.claude\settings.json`, and run hook smoke tests through `llm-wiki.cmd`.
@@ -146,7 +146,7 @@ After a plain `npm install -g llm-wiki-kit@latest`, existing hooks keep working
 
  Daily use should be Claude Code/Codex first. The user should not need to run a chain of `llm-wiki` commands while working. Hooks inject context automatically, but the current user answer takes priority over wiki cleanup. The active agent updates durable wiki pages when reusable project knowledge appears and the turn's importance or user consent justifies persistence. Hook context policy is function-first: memory, search, maintenance, and update signals remain available, while user-visible context is formatted as functional compact context instead of a raw dump.
 
- In the default `LLM_WIKI_KIT_CAPTURE_MODE=answer-first` mode, `Stop` and `SessionEnd` append live Q&A only for meaningful work evidence or structured decision turns. Simple answers, status checks, and keyword-only responses are not archived. Live Q&A uses chunked files under `llm-wiki/outputs/questions/YYYY-MM-DD/` and rolls over by line/byte budget. Hooks do not auto-create `wiki/queries/` or `wiki/decisions/`. If the user explicitly asked for recording/documentation and no durable wiki update is detected, a pending cleanup candidate is written to `llm-wiki/outputs/maintenance/queue.md`. `PreCompact` performs the same answer-first classification before context compaction: simple turns get only a context note, archive-worthy turns get a live Q&A checkpoint, and explicit durable candidates get a checkpoint plus queue item only when needed. If checkpoint storage fails, compaction still proceeds and the hook prepares an important-only compact recovery packet for the next legal context-injection event. `SessionStart` and `UserPromptSubmit` also recover stale per-turn state into the same queue when the previous stop hook did not complete. `SessionStart` injects a one-item queue summary; `UserPromptSubmit` injects a soft reminder only when the prompt is wiki/maintenance related or matches a queue topic. This is a recovery and reminder layer, not a full transcript capture path.
+ In the default `LLM_WIKI_KIT_CAPTURE_MODE=answer-first` mode, `Stop` and `SessionEnd` append live Q&A only for meaningful work evidence, structured decision turns, or reusable durable-candidate signals. Simple answers, status checks, and keyword-only responses are not archived. Live Q&A uses chunked files under `llm-wiki/outputs/questions/YYYY-MM-DD/` and rolls over by line/byte budget. Hooks do not auto-create `wiki/queries/` or `wiki/decisions/`. If the user explicitly asked for recording/documentation, or the turn contains reusable architecture/debugging/policy/procedure/decision signals, and no durable wiki update is detected, a pending cleanup candidate is written to `llm-wiki/outputs/maintenance/queue.md`. `PreCompact` performs the same answer-first classification before context compaction: simple turns get only a context note, archive-worthy turns get a live Q&A checkpoint, and durable candidates get a checkpoint plus queue item only when needed. If checkpoint storage fails, compaction still proceeds and the hook prepares an important-only compact recovery packet for the next legal context-injection event. `SessionStart` and `UserPromptSubmit` also recover stale per-turn state into the same queue when the previous stop hook did not complete. `SessionStart` injects a one-item queue summary; `UserPromptSubmit` injects a compact soft reminder when the prompt is wiki/maintenance related, matches a queue topic, has approved or durable-candidate items, or the queue crosses the review threshold. This is a recovery and reminder layer, not a full transcript capture path.
 
  Use `llm-wiki archive-questions --workspace <project> --dry-run` to review splitting legacy `outputs/questions/YYYY-MM-DD-live-qa.md` files into the chunked layout. Running it without `--dry-run` preserves the original under `outputs/questions/archive/originals/` with a checksum sidecar and replaces the legacy file with a pointer stub.
 
@@ -211,7 +211,7 @@ Agents may run `consolidate` after meaningful wiki growth. Users should not need
 
  `llm-wiki maintenance --workspace <project>` prints queue counts, review due status, and the first pending items. It does not merge wiki pages by itself; the active agent should review pending items, update the closest existing durable wiki document, then mark the queue item `done` or `skipped`. Periodic maintenance is an agent-side task, not something users need to run after every turn.
 
- `llm-wiki maintenance --workspace <project> --json` includes `reviewDue`, `reviewReasons`, `pendingCount`, `stalePendingCount`, `health`, and `recommendedCommands`. Review is due when the last review is older than 14 days, pending queue size reaches 5, stale or result-missing pending items exist, lint has warnings/errors, `memory.md` is near budget, or wiki page count reaches 80% of the search cap. Hook reminders are soft: `SessionStart`/`InstructionsLoaded` may show a short due note, while `UserPromptSubmit` shows it only for wiki/maintenance/cleanup-related prompts. The reminder never blocks the current answer.
+ `llm-wiki maintenance --workspace <project> --json` includes `reviewDue`, `reviewReasons`, `pendingCount`, `stalePendingCount`, `health`, and `recommendedCommands`. Review is due when the last review is older than 14 days, pending queue size reaches 5, stale or result-missing pending items exist, lint has warnings/errors, `memory.md` is near budget, or wiki page count reaches 80% of the search cap. Hook reminders are soft: `SessionStart`/`InstructionsLoaded` may show a short due note, while `UserPromptSubmit` shows one compact item for wiki/maintenance prompts, approved items, durable candidates, stale/recovered items, or review-threshold pressure. The reminder never blocks the current answer.
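
The review-due conditions listed above amount to a handful of threshold checks. The following is a rough sketch under stated assumptions: `reviewReasons` as a function and every field on `state` are hypothetical names, and the 0.9 ratio standing in for "near budget" is a placeholder because the source does not give the exact cutoff. Only the `reviewDue` and `reviewReasons` output keys come from the documented JSON.

```javascript
// Hypothetical sketch: field names on `state` are assumptions, and the 0.9
// "near budget" ratio is a placeholder; the docs only say "near budget".
function reviewReasons(state, now = Date.now()) {
  const DAY_MS = 24 * 60 * 60 * 1000;
  const reasons = [];
  if (now - state.lastReviewAt > 14 * DAY_MS) reasons.push('last review older than 14 days');
  if (state.pendingCount >= 5) reasons.push('pending queue size reached 5');
  if (state.stalePendingCount > 0) reasons.push('stale or result-missing pending items');
  if (state.lintWarnings + state.lintErrors > 0) reasons.push('lint warnings or errors');
  if (state.memoryBytes >= 0.9 * state.memoryBudgetBytes) reasons.push('memory.md near budget');
  if (state.pageCount >= 0.8 * state.searchCap) reasons.push('page count at 80% of search cap');
  return { reviewDue: reasons.length > 0, reviewReasons: reasons };
}
```

Collecting every matching reason, rather than stopping at the first, lets a reminder show why review is due without a second pass over the state.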
 
  Recommended agent checklist:
 
package/docs/security.md CHANGED
@@ -13,4 +13,8 @@ Before writing durable summaries, the runtime redacts authentication values such
 
  Manual and hook context output also runs through redaction before returning excerpts or search hits. `llm-wiki lint` reports remaining secret-like wiki content as an error so it can be removed or rewritten before it becomes reusable project memory.
 
+ `evidence_refs` are pointers, not a place to paste secrets or transcripts. `llm-wiki lint` rejects secret-like evidence values, unsafe `file:` paths, credential-bearing `url:` values, unsupported prefixes, and unsafe commands. Missing local `file:` or `raw:` targets are warnings so agents can fix references without losing the surrounding durable note.
+
+ `llm-wiki export` redacts generated `llms.txt`, `llms-full.txt`, and `llm-wiki.json` output. Exports must not contain npm tokens, WinRM credentials, private keys, raw `.env`, or full raw transcripts. `llms.txt` is an agent onboarding/handoff manifest and follows the same durable visibility policy as retrieval eval, so archived/superseded/default episodic pages are excluded by default.
+
  Hook payloads are stored as small event envelopes, not full raw transcripts. Full transcript capture is intentionally not implemented as a default. `PreCompact` may read a small bounded transcript tail for a redacted checkpoint, but it does not store the raw transcript path or full transcript. If a project needs raw transcript capture, add a project-local policy and a redaction path first.
@@ -229,7 +229,7 @@ Check:
 
  ## Maintenance Queue Is Empty Or Stale
 
- In the default answer-first mode, `llm-wiki/outputs/maintenance/queue.md` is created only when a user explicitly asked for durable recording/documentation but no durable wiki update was detected, or when `SessionStart`/`UserPromptSubmit` recovers stale per-turn state from a session that did not stop cleanly. It is not expected to grow after every normal `Stop`.
+ In the default answer-first mode, `llm-wiki/outputs/maintenance/queue.md` is created when no durable wiki update was detected and one of the following happened: a user explicitly asked for durable recording/documentation, a turn was classified as a hook-suggested durable candidate, or stale per-turn state was recovered. It is not expected to grow after every normal `Stop`.
 
  Check the queue and health warnings:
 
@@ -238,7 +238,7 @@ llm-wiki maintenance --workspace /path/to/project
  llm-wiki lint --workspace /path/to/project
  ```
 
- If the queue is always empty during ordinary Q&A, that is normal. If you expected an explicit documentation request to queue, confirm hooks run and that the turn had a captured `UserPromptSubmit`. If pending items stay around, the active agent should merge reusable content into existing durable wiki pages and mark each item `done` or `skipped` without delaying unrelated answers.
+ If the queue is always empty during ordinary Q&A, that is normal. If you expected an explicit documentation request or durable candidate to queue, confirm hooks run and that the turn had a captured `UserPromptSubmit`. If pending items stay around, the active agent should merge reusable content into existing durable wiki pages and mark each item `done` or `skipped` without delaying unrelated answers.
 
  ## Authentication Values Were Redacted
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "llm-wiki-kit",
-   "version": "0.2.14",
+   "version": "0.2.16",
    "description": "Hook-first living Markdown wiki runtime for Codex and Claude Code with Korean/English prompt-aware guidance.",
    "type": "module",
    "files": [
@@ -116,10 +116,28 @@ export function classifyTurn(entry, eventName = '') {
    const hasVerification = Boolean(verification);
    const hasWorkEvidence = hasWork || hasFiles || hasVerification;
    const durableConclusion = hasDurableConclusion(text);
+   const durableSignal = durableConclusion || hasDurableKeyword(text);
+   const durableUpdated = hasDetectedDurableWikiChange(entry);
 
    if (hasWorkEvidence) {
+     if (durableUpdated) {
+       return {
+         kind: 'durable-updated',
+         archive: true,
+         suggestDurable: false,
+         queueIfMissingDurable: false,
+       };
+     }
+     if (durableSignal) {
+       return {
+         kind: 'suggest-durable',
+         archive: true,
+         suggestDurable: true,
+         queueIfMissingDurable: true,
+       };
+     }
      return {
-       kind: hasDetectedDurableWikiChange(entry) ? 'durable-updated' : 'work',
+       kind: 'work',
        archive: true,
        suggestDurable: false,
        queueIfMissingDurable: false,
@@ -128,10 +146,10 @@ export function classifyTurn(entry, eventName = '') {
 
    if (durableConclusion) {
      return {
-       kind: 'decision',
+       kind: 'suggest-durable',
        archive: true,
-       suggestDurable: false,
-       queueIfMissingDurable: false,
+       suggestDurable: true,
+       queueIfMissingDurable: true,
      };
    }
 
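
The two hunks above change the branch order of `classifyTurn`. The sketch below restates that order over plain booleans so the precedence is easy to test in isolation. `classifyFlags` is a hypothetical name, the real function derives these flags from the turn entry, and branches that fall outside the hunks are not visible in this diff, so they are represented here by `null`.

```javascript
// Simplified restatement of the branch order shown in the hunks.
// classifyFlags is a hypothetical name, not a function in the package.
function classifyFlags({ hasWorkEvidence, durableUpdated, durableSignal, durableConclusion }) {
  const result = (kind, suggest) => ({
    kind,
    archive: true,
    suggestDurable: suggest,
    queueIfMissingDurable: suggest,
  });
  if (hasWorkEvidence) {
    // A detected durable wiki change wins over any durable signal.
    if (durableUpdated) return result('durable-updated', false);
    if (durableSignal) return result('suggest-durable', true);
    return result('work', false);
  }
  // Without work evidence, only a structured conclusion suggests durability.
  if (durableConclusion) return result('suggest-durable', true);
  return null; // later branches are outside the hunks shown in this diff
}
```

Note that the keyword-based `durableSignal` only matters when work evidence is present; a turn with no work evidence still needs a structured conclusion to be suggested.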
package/src/cli.js CHANGED
@@ -3,7 +3,7 @@ import { resolve } from 'path';
  import { formatConsolidateResult, runConsolidate } from './consolidate.js';
  import { handleHook } from './hook.js';
  import { install, status, uninstall } from './install.js';
- import { formatMaintenanceResult, maintenanceSummary } from './maintenance.js';
+ import { formatMaintenanceResult, maintenanceSummary, updateMaintenanceItem } from './maintenance.js';
  import { bootstrapProject } from './project.js';
  import { inspectProjectState } from './project-state.js';
  import { commandForProject, knownProjectRoots, recordProject } from './projects.js';
@@ -11,6 +11,8 @@ import { formatDoctor, runDoctor } from './doctor.js';
  import { migrate } from './migrate.js';
  import { postUpdate, update } from './update.js';
  import { buildContextPack, formatContextPack } from './wiki-search.js';
+ import { formatEvalResult, runEval } from './wiki-eval.js';
+ import { formatExportResult, runExport } from './wiki-export.js';
  import { formatLintResult, runLint } from './wiki-lint.js';
  import { archiveQuestions, formatArchiveQuestionsResult } from './live-qa.js';
 
@@ -33,6 +35,30 @@ function parseOptions(args) {
      } else if (arg === '--to') {
        options.to = optionValue(arg, i);
        i += 1;
+     } else if (arg === '--fixture') {
+       options.fixture = optionValue(arg, i);
+       i += 1;
+     } else if (arg === '--format') {
+       options.format = optionValue(arg, i);
+       i += 1;
+     } else if (arg === '--output') {
+       options.output = resolve(optionValue(arg, i));
+       i += 1;
+     } else if (arg === '--target') {
+       options.target = optionValue(arg, i);
+       i += 1;
+     } else if (arg === '--note') {
+       options.note = optionValue(arg, i);
+       i += 1;
+     } else if (arg === '--approve') {
+       options.approve = optionValue(arg, i);
+       i += 1;
+     } else if (arg === '--done') {
+       options.done = optionValue(arg, i);
+       i += 1;
+     } else if (arg === '--skip') {
+       options.skip = optionValue(arg, i);
+       i += 1;
      } else if (arg === '--date') {
        const value = optionValue(arg, i);
        if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) {
@@ -127,9 +153,11 @@ Usage:
  llm-wiki bootstrap --workspace <project>
  llm-wiki migrate --workspace <project>
  llm-wiki context "<query>" --workspace <project> [--limit 5] [--no-expand] [--include-episodic] [--include-archived]
+ llm-wiki eval --workspace <project> [--fixture <path>] [--limit 5] [--json]
+ llm-wiki export --workspace <project> [--format all|llms|llms-full|json] [--output <dir>] [--dry-run] [--json]
  llm-wiki lint --workspace <project>
  llm-wiki consolidate --workspace <project> [--dry-run]
- llm-wiki maintenance --workspace <project> [--json]
+ llm-wiki maintenance --workspace <project> [--approve <id> --target <wiki/...md> | --done <id> --target <wiki/...md> | --skip <id> [--note "..."]] [--json]
  llm-wiki archive-questions --workspace <project> [--date YYYY-MM-DD] [--dry-run] [--json]
  `);
  return;
@@ -229,6 +257,20 @@ Usage:
      return;
    }
 
+   if (command === 'eval') {
+     const projectRoot = resolve(options.workspace || process.cwd());
+     const result = await runEval(projectRoot, options);
+     printJsonOrText(result, options, formatEvalResult);
+     if (!result.ok) process.exitCode = 1;
+     return;
+   }
+
+   if (command === 'export') {
+     const projectRoot = resolve(options.workspace || process.cwd());
+     printJsonOrText(await runExport(projectRoot, options), options, formatExportResult);
+     return;
+   }
+
    if (command === 'lint') {
      const projectRoot = resolve(options.workspace || process.cwd());
      const result = await runLint(projectRoot, options);
@@ -245,6 +287,14 @@ Usage:
 
    if (command === 'maintenance') {
      const projectRoot = resolve(options.workspace || process.cwd());
+     const actions = [options.approve ? 'approve' : '', options.done ? 'done' : '', options.skip ? 'skip' : ''].filter(Boolean);
+     if (actions.length > 1) throw new Error('maintenance accepts only one of --approve, --done, or --skip');
+     if (actions.length === 1) {
+       const action = actions[0];
+       const id = options.approve || options.done || options.skip;
+       printJsonOrText(await updateMaintenanceItem(projectRoot, id, action, options), options);
+       return;
+     }
      printJsonOrText(await maintenanceSummary(projectRoot, { ...options, includeLint: true }), options, formatMaintenanceResult);
      return;
    }
package/src/constants.js CHANGED
@@ -52,7 +52,9 @@ export const LLM_WIKI_DIRS = [
    'wiki/queries',
    'outputs/questions',
    'outputs/reports',
+   'outputs/exports',
    'outputs/maintenance',
+   'evals',
    'procedures',
  ];