@mfittko/repo-wiki 0.2.1

Files changed (190)
  1. package/.llmwiki/schema.md +107 -0
  2. package/AGENTS.md +42 -0
  3. package/CHANGELOG.md +91 -0
  4. package/LICENSE +21 -0
  5. package/README.md +254 -0
  6. package/dist/bin/repo-wiki.d.ts +2 -0
  7. package/dist/bin/repo-wiki.js +7 -0
  8. package/dist/bin/repo-wiki.js.map +1 -0
  9. package/dist/src/cli.d.ts +1 -0
  10. package/dist/src/cli.js +404 -0
  11. package/dist/src/cli.js.map +1 -0
  12. package/dist/src/compiler.d.ts +55 -0
  13. package/dist/src/compiler.js +2046 -0
  14. package/dist/src/compiler.js.map +1 -0
  15. package/dist/src/config.d.ts +63 -0
  16. package/dist/src/config.js +86 -0
  17. package/dist/src/config.js.map +1 -0
  18. package/dist/src/context-assembler.d.ts +68 -0
  19. package/dist/src/context-assembler.js +378 -0
  20. package/dist/src/context-assembler.js.map +1 -0
  21. package/dist/src/data-model-signals.d.ts +1 -0
  22. package/dist/src/data-model-signals.js +13 -0
  23. package/dist/src/data-model-signals.js.map +1 -0
  24. package/dist/src/docs-ingestor.d.ts +138 -0
  25. package/dist/src/docs-ingestor.js +844 -0
  26. package/dist/src/docs-ingestor.js.map +1 -0
  27. package/dist/src/docs-linter.d.ts +14 -0
  28. package/dist/src/docs-linter.js +164 -0
  29. package/dist/src/docs-linter.js.map +1 -0
  30. package/dist/src/docs-validation.d.ts +36 -0
  31. package/dist/src/docs-validation.js +297 -0
  32. package/dist/src/docs-validation.js.map +1 -0
  33. package/dist/src/extractors.d.ts +50 -0
  34. package/dist/src/extractors.js +2275 -0
  35. package/dist/src/extractors.js.map +1 -0
  36. package/dist/src/frontmatter.d.ts +46 -0
  37. package/dist/src/frontmatter.js +377 -0
  38. package/dist/src/frontmatter.js.map +1 -0
  39. package/dist/src/index.d.ts +26 -0
  40. package/dist/src/index.js +18 -0
  41. package/dist/src/index.js.map +1 -0
  42. package/dist/src/init.d.ts +12 -0
  43. package/dist/src/init.js +121 -0
  44. package/dist/src/init.js.map +1 -0
  45. package/dist/src/language.d.ts +2 -0
  46. package/dist/src/language.js +62 -0
  47. package/dist/src/language.js.map +1 -0
  48. package/dist/src/linter.d.ts +33 -0
  49. package/dist/src/linter.js +398 -0
  50. package/dist/src/linter.js.map +1 -0
  51. package/dist/src/llm-provider.d.ts +267 -0
  52. package/dist/src/llm-provider.js +474 -0
  53. package/dist/src/llm-provider.js.map +1 -0
  54. package/dist/src/page-ownership.d.ts +38 -0
  55. package/dist/src/page-ownership.js +96 -0
  56. package/dist/src/page-ownership.js.map +1 -0
  57. package/dist/src/planner.d.ts +55 -0
  58. package/dist/src/planner.js +422 -0
  59. package/dist/src/planner.js.map +1 -0
  60. package/dist/src/prompts.d.ts +103 -0
  61. package/dist/src/prompts.js +344 -0
  62. package/dist/src/prompts.js.map +1 -0
  63. package/dist/src/publisher.d.ts +68 -0
  64. package/dist/src/publisher.js +662 -0
  65. package/dist/src/publisher.js.map +1 -0
  66. package/dist/src/repository-analysis.d.ts +88 -0
  67. package/dist/src/repository-analysis.js +485 -0
  68. package/dist/src/repository-analysis.js.map +1 -0
  69. package/dist/src/scanner.d.ts +122 -0
  70. package/dist/src/scanner.js +309 -0
  71. package/dist/src/scanner.js.map +1 -0
  72. package/dist/src/search.d.ts +71 -0
  73. package/dist/src/search.js +410 -0
  74. package/dist/src/search.js.map +1 -0
  75. package/dist/src/secret-patterns.d.ts +3 -0
  76. package/dist/src/secret-patterns.js +14 -0
  77. package/dist/src/secret-patterns.js.map +1 -0
  78. package/dist/src/utils/args.d.ts +2 -0
  79. package/dist/src/utils/args.js +19 -0
  80. package/dist/src/utils/args.js.map +1 -0
  81. package/dist/src/utils/dotenv.d.ts +7 -0
  82. package/dist/src/utils/dotenv.js +73 -0
  83. package/dist/src/utils/dotenv.js.map +1 -0
  84. package/dist/src/utils/fs.d.ts +22 -0
  85. package/dist/src/utils/fs.js +83 -0
  86. package/dist/src/utils/fs.js.map +1 -0
  87. package/dist/src/utils/git.d.ts +13 -0
  88. package/dist/src/utils/git.js +39 -0
  89. package/dist/src/utils/git.js.map +1 -0
  90. package/dist/src/wiki-graph.d.ts +74 -0
  91. package/dist/src/wiki-graph.js +335 -0
  92. package/dist/src/wiki-graph.js.map +1 -0
  93. package/dist/src/wiki-patch.d.ts +152 -0
  94. package/dist/src/wiki-patch.js +489 -0
  95. package/dist/src/wiki-patch.js.map +1 -0
  96. package/dist/src/wiki-query.d.ts +63 -0
  97. package/dist/src/wiki-query.js +255 -0
  98. package/dist/src/wiki-query.js.map +1 -0
  99. package/dist/test/cli.test.d.ts +1 -0
  100. package/dist/test/cli.test.js +514 -0
  101. package/dist/test/cli.test.js.map +1 -0
  102. package/dist/test/compiler-eval.test.d.ts +1 -0
  103. package/dist/test/compiler-eval.test.js +234 -0
  104. package/dist/test/compiler-eval.test.js.map +1 -0
  105. package/dist/test/compiler.test.d.ts +1 -0
  106. package/dist/test/compiler.test.js +2537 -0
  107. package/dist/test/compiler.test.js.map +1 -0
  108. package/dist/test/context-assembler.test.d.ts +1 -0
  109. package/dist/test/context-assembler.test.js +379 -0
  110. package/dist/test/context-assembler.test.js.map +1 -0
  111. package/dist/test/docs-linter.test.d.ts +1 -0
  112. package/dist/test/docs-linter.test.js +900 -0
  113. package/dist/test/docs-linter.test.js.map +1 -0
  114. package/dist/test/dotenv.test.d.ts +1 -0
  115. package/dist/test/dotenv.test.js +77 -0
  116. package/dist/test/dotenv.test.js.map +1 -0
  117. package/dist/test/extractors-go.test.d.ts +1 -0
  118. package/dist/test/extractors-go.test.js +393 -0
  119. package/dist/test/extractors-go.test.js.map +1 -0
  120. package/dist/test/extractors-rust.test.d.ts +1 -0
  121. package/dist/test/extractors-rust.test.js +219 -0
  122. package/dist/test/extractors-rust.test.js.map +1 -0
  123. package/dist/test/extractors-utils.test.d.ts +1 -0
  124. package/dist/test/extractors-utils.test.js +786 -0
  125. package/dist/test/extractors-utils.test.js.map +1 -0
  126. package/dist/test/fixtures/compiler-e2e/basic-node-service/repo/infra/deploy.d.ts +1 -0
  127. package/dist/test/fixtures/compiler-e2e/basic-node-service/repo/infra/deploy.js +4 -0
  128. package/dist/test/fixtures/compiler-e2e/basic-node-service/repo/infra/deploy.js.map +1 -0
  129. package/dist/test/frontmatter.test.d.ts +1 -0
  130. package/dist/test/frontmatter.test.js +287 -0
  131. package/dist/test/frontmatter.test.js.map +1 -0
  132. package/dist/test/init-planner.test.d.ts +1 -0
  133. package/dist/test/init-planner.test.js +688 -0
  134. package/dist/test/init-planner.test.js.map +1 -0
  135. package/dist/test/linter.test.d.ts +1 -0
  136. package/dist/test/linter.test.js +426 -0
  137. package/dist/test/linter.test.js.map +1 -0
  138. package/dist/test/llm-provider.test.d.ts +1 -0
  139. package/dist/test/llm-provider.test.js +783 -0
  140. package/dist/test/llm-provider.test.js.map +1 -0
  141. package/dist/test/page-ownership.test.d.ts +1 -0
  142. package/dist/test/page-ownership.test.js +247 -0
  143. package/dist/test/page-ownership.test.js.map +1 -0
  144. package/dist/test/publisher.test.d.ts +1 -0
  145. package/dist/test/publisher.test.js +1297 -0
  146. package/dist/test/publisher.test.js.map +1 -0
  147. package/dist/test/repository-analysis.test.d.ts +1 -0
  148. package/dist/test/repository-analysis.test.js +182 -0
  149. package/dist/test/repository-analysis.test.js.map +1 -0
  150. package/dist/test/run-compiled-tests.d.ts +1 -0
  151. package/dist/test/run-compiled-tests.js +48 -0
  152. package/dist/test/run-compiled-tests.js.map +1 -0
  153. package/dist/test/scanner.test.d.ts +1 -0
  154. package/dist/test/scanner.test.js +551 -0
  155. package/dist/test/scanner.test.js.map +1 -0
  156. package/dist/test/search.test.d.ts +1 -0
  157. package/dist/test/search.test.js +92 -0
  158. package/dist/test/search.test.js.map +1 -0
  159. package/dist/test/update-changelog.test.d.ts +1 -0
  160. package/dist/test/update-changelog.test.js +125 -0
  161. package/dist/test/update-changelog.test.js.map +1 -0
  162. package/dist/test/wiki-graph.test.d.ts +1 -0
  163. package/dist/test/wiki-graph.test.js +164 -0
  164. package/dist/test/wiki-graph.test.js.map +1 -0
  165. package/dist/test/wiki-patch.test.d.ts +1 -0
  166. package/dist/test/wiki-patch.test.js +610 -0
  167. package/dist/test/wiki-patch.test.js.map +1 -0
  168. package/dist/test/wiki-query.test.d.ts +1 -0
  169. package/dist/test/wiki-query.test.js +163 -0
  170. package/dist/test/wiki-query.test.js.map +1 -0
  171. package/docs/PLAN.md +993 -0
  172. package/docs/WHY.md +61 -0
  173. package/docs/plans/agent-integration.md +85 -0
  174. package/docs/plans/ci-publishing.md +111 -0
  175. package/docs/plans/doc-validation.md +92 -0
  176. package/docs/plans/github-action.md +113 -0
  177. package/docs/plans/incremental-mode.md +98 -0
  178. package/docs/plans/karpathy-llm-wiki-alignment.md +84 -0
  179. package/docs/plans/llm-compiler.md +160 -0
  180. package/docs/plans/production-scanner.md +104 -0
  181. package/docs/plans/query-and-file-back.md +103 -0
  182. package/docs/plans/search-index.md +118 -0
  183. package/docs/plans/trust-hardening.md +74 -0
  184. package/docs/plans/wiki-graph.md +183 -0
  185. package/docs/plans/wiki-health.md +76 -0
  186. package/package.json +83 -0
  187. package/prompts/compiler.md +16 -0
  188. package/prompts/lint.md +18 -0
  189. package/prompts/page-templates.md +25 -0
  190. package/skills/repo-wiki-cli/SKILL.md +139 -0
@@ -0,0 +1,160 @@
1
+ # Epic: LLM Compiler
2
+
3
+ ## Summary
4
+
5
+ Replace the deterministic placeholder summaries in the wiki compiler with LLM-powered synthesis that produces human-quality wiki pages grounded in source cards, documentation cards, targeted code excerpts, and the current state of an existing wiki when back-filling or reconciling.
6
+
7
+ ## Architecture
8
+
9
+ ```mermaid
10
+ flowchart TD
11
+ SourceCards[Source Cards] --> Budget[Token Budget Assembler]
12
+ DocCards[Documentation Cards] --> Budget
13
+ CodeExcerpts[Targeted Code Excerpts] --> Budget
14
+ ExistingWiki[Existing Wiki Pages] --> Ownership[Ownership + Preserve Section Extraction]
15
+ Budget --> Context[Assembled Context Window]
16
+ Ownership --> Context
17
+ Context --> Prompt[Prompt Template Selection]
18
+ PageType[Page Archetype] --> Prompt
19
+ Prompt --> LLM[LLM Provider]
20
+ LLM --> RawOutput[Raw LLM Output]
21
+ RawOutput --> Patch[Structured Patch]
22
+ Patch --> Preserve[Human Section Preservation]
23
+ Preserve --> Cite[Citation Enforcement]
24
+ Cite --> Lint[Lint Gate Validation]
25
+ Lint -->|pass| Page[Final Wiki Page]
26
+ Lint -->|fail| Retry[Retry / Flag]
27
+ ```
28
+
29
+ ```mermaid
30
+ flowchart LR
31
+ subgraph Page Archetypes
32
+ Foundation[Foundation Pages]
33
+ Module[Module Pages]
34
+ CrossCut[Cross-cutting Pages]
35
+ end
36
+ subgraph Prompts
37
+ P1[home.md]
38
+ P2[architecture.md]
39
+ P3[module.md]
40
+ P4[dependency-map.md]
41
+ end
42
+ Foundation --> P1
43
+ Foundation --> P2
44
+ Module --> P3
45
+ CrossCut --> P4
46
+ ```
47
+
48
+ ```mermaid
49
+ sequenceDiagram
50
+ participant Planner
51
+ participant Assembler as Context Assembler
52
+ participant LLM
53
+ participant Linter
54
+
55
+ Planner->>Assembler: Page plan + source cards
56
+ Assembler->>Assembler: Select excerpts within token budget
57
+ Assembler->>LLM: Prompt + context
58
+ LLM-->>Assembler: Generated page content
59
+ Assembler->>Linter: Validate output
60
+ Linter-->>Assembler: Pass/fail + issues
61
+ alt lint passes
62
+ Assembler->>Planner: Emit final page
63
+ else lint fails
64
+ Assembler->>LLM: Retry with feedback
65
+ end
66
+ ```
67
+
68
+ ## Key Deliverables
69
+
70
+ - LLM synthesis pipeline for each wiki page type (foundation, module, cross-cutting)
71
+ - Incremental LLM enrichment that runs only for affected wiki pages selected by diff, bounded hierarchy propagation, and semantic propagation rules
72
+ - Source card and code excerpt context assembly (token-budget aware)
73
+ - Existing wiki page ingestion before regeneration
74
+ - Page classification: generated, human-owned, mixed, unmanaged
75
+ - Structured patch output format for wiki pages
76
+ - Source citation enforcement (every material claim cites a path)
77
+ - Contradiction and confidence metadata in generated pages
78
+ - Human-maintained section preservation during regeneration
79
+ - Stable page identity and ownership metadata in generated frontmatter
80
+ - Merge strategy for back-fill and reconcile mode, not just fresh bootstrap generation
81
+ - Prompt templates for each page archetype
82
+
83
+ ## Provider configuration contract
84
+
85
+ The first production LLM boundary should be provider-agnostic and compatible with OpenAI-style chat completions so GitHub Actions and local runs can use OpenAI, compatible hosted providers, or local gateways without changing compiler code.
86
+
87
+ Minimum `.llmwiki/config.json` shape:
88
+
89
+ ```jsonc
90
+ {
91
+ "compiler": {
92
+ "mode": "deterministic", // deterministic | llm
93
+ "llm": {
94
+ "provider": "openai-compatible",
95
+ "base_url": "https://api.openai.com/v1",
96
+ "model": "gpt-4.1-mini",
97
+ "api_key_env": "LLMWIKI_LLM_API_KEY",
98
+ "system_prompt": "You compile source-grounded GitHub Wiki pages.",
99
+ "temperature": 0.1,
100
+ "max_output_tokens": 4000,
101
+ "timeout_ms": 60000,
102
+ "retries": 2
103
+ }
104
+ }
105
+ }
106
+ ```
107
+
108
+ Environment variables override config values for CI and secrets:
109
+
110
+ | Environment variable | Purpose |
111
+ |---|---|
112
+ | `LLMWIKI_COMPILER_MODE` | Select `deterministic` or `llm` mode. |
113
+ | `LLMWIKI_LLM_BASE_URL` | Override provider API base URL. |
114
+ | `LLMWIKI_LLM_MODEL` | Override model name. |
115
+ | `LLMWIKI_LLM_API_KEY` | Provider API key; never written to artifacts or logs. |
116
+ | `LLMWIKI_LLM_SYSTEM_PROMPT` | Inline system prompt override. |
117
+ | `LLMWIKI_LLM_SYSTEM_PROMPT_FILE` | Path to a system prompt file; useful for repo-maintained prompts. |
118
+ | `LLMWIKI_LLM_TEMPERATURE` | Sampling temperature override. |
119
+ | `LLMWIKI_LLM_MAX_OUTPUT_TOKENS` | Output token budget override. |
120
+
121
+ Configuration precedence should be explicit and deterministic: CLI flags, when added, override environment variables; environment variables override `.llmwiki/config.json`; config overrides safe defaults. The API key must be read only from the configured environment variable and must never be persisted in scan artifacts, generated wiki pages, prompt-debug artifacts, or normal logs.
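
The precedence rule above can be sketched as a small pure function. This is only an illustration under stated assumptions: `resolveLlmSettings`, `LlmSettings`, and the helper names are hypothetical and not taken from the package, the defaults are copied from the example config, and silently ignoring a malformed environment value is one possible policy rather than documented behavior. The API key is left out on purpose, since the plan says it is read only from the configured environment variable.

```typescript
// Hypothetical sketch: type and function names are illustrative, not the package's API.
type CompilerMode = "deterministic" | "llm";

interface LlmSettings {
  mode: CompilerMode;
  baseUrl: string;
  model: string;
  temperature: number;
  maxOutputTokens: number;
}

type Env = Record<string, string | undefined>;

// Assumed safe defaults, copied from the example `.llmwiki/config.json` shape.
const DEFAULTS: LlmSettings = {
  mode: "deterministic",
  baseUrl: "https://api.openai.com/v1",
  model: "gpt-4.1-mini",
  temperature: 0.1,
  maxOutputTokens: 4000,
};

// Assumption: unrecognized mode strings are ignored rather than rejected.
function parseMode(value: string | undefined): CompilerMode | undefined {
  return value === "deterministic" || value === "llm" ? value : undefined;
}

// Assumption: empty or non-numeric strings are ignored rather than rejected.
function parseNumber(value: string | undefined): number | undefined {
  if (value === undefined || value.trim() === "") return undefined;
  const parsed = Number(value);
  return Number.isFinite(parsed) ? parsed : undefined;
}

// Returns the first defined candidate, or the fallback when none is set.
function firstDefined<T>(fallback: T, ...candidates: (T | undefined)[]): T {
  for (const candidate of candidates) {
    if (candidate !== undefined) return candidate;
  }
  return fallback;
}

// Candidates are listed in precedence order: CLI flag, environment, config file.
function resolveLlmSettings(
  cli: Partial<LlmSettings>,
  env: Env,
  fileConfig: Partial<LlmSettings>,
): LlmSettings {
  return {
    mode: firstDefined(DEFAULTS.mode, cli.mode, parseMode(env["LLMWIKI_COMPILER_MODE"]), fileConfig.mode),
    baseUrl: firstDefined(DEFAULTS.baseUrl, cli.baseUrl, env["LLMWIKI_LLM_BASE_URL"], fileConfig.baseUrl),
    model: firstDefined(DEFAULTS.model, cli.model, env["LLMWIKI_LLM_MODEL"], fileConfig.model),
    temperature: firstDefined(
      DEFAULTS.temperature,
      cli.temperature,
      parseNumber(env["LLMWIKI_LLM_TEMPERATURE"]),
      fileConfig.temperature,
    ),
    maxOutputTokens: firstDefined(
      DEFAULTS.maxOutputTokens,
      cli.maxOutputTokens,
      parseNumber(env["LLMWIKI_LLM_MAX_OUTPUT_TOKENS"]),
      fileConfig.maxOutputTokens,
    ),
  };
}
```

Passing the environment in as a plain record, instead of reading a global, keeps the resolver testable without network access or real secrets, which matches the mock-provider requirement for tests.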
122
+
123
+ GitHub Actions can enable LLM compilation by setting repository variables for non-secret values and a secret for the API key:
124
+
125
+ ```yaml
126
+ env:
127
+ LLMWIKI_COMPILER_MODE: llm
128
+ LLMWIKI_LLM_BASE_URL: ${{ vars.LLMWIKI_LLM_BASE_URL }}
129
+ LLMWIKI_LLM_MODEL: ${{ vars.LLMWIKI_LLM_MODEL }}
130
+ LLMWIKI_LLM_API_KEY: ${{ secrets.LLMWIKI_LLM_API_KEY }}
131
+ ```
132
+
133
+ Tests must use a deterministic mock provider and must not require network access or API keys.
134
+
135
+ ## Success Criteria
136
+
137
+ - Generated wiki pages are useful without manual editing
138
+ - Every factual claim cites source paths
139
+ - Human-maintained sections survive regeneration unchanged
140
+ - Existing mixed pages can be regenerated without losing preserved regions
141
+ - Generated pages carry enough metadata to support future reconciliation and safe deletion
142
+ - LLM output passes lint gates before acceptance
143
+ - Untouched enriched pages remain byte-stable during incremental builds and do not incur model calls
144
+ - Token budget stays within model context limits per page
145
+
146
+ ## Dependencies
147
+
148
+ - Upstream: Production scanner (rich source cards), doc-validation (validated doc cards)
149
+ - Downstream: Incremental mode (page patching), agent-integration (Agent-Context-Pack quality)
150
+
151
+ ## Open Questions
152
+
153
+ - Which LLM provider(s) to support? (OpenAI, Anthropic, local models)
154
+ - How much raw code should the compiler read per page?
155
+ - Should compilation be parallelized across pages?
156
+ - How to handle hallucination detection beyond lint gates?
157
+ - Cost/latency budget for full bootstrap vs incremental compile?
158
+ - What exact propagation thresholds should trigger re-enrichment of parent aggregate pages versus preserving existing content?
159
+ - What is the minimum preservation contract for mixed human/generated pages?
160
+ - Which page sections should be preserved structurally versus semantically merged?
@@ -0,0 +1,104 @@
1
+ # Epic: Production Scanner
2
+
3
+ ## Summary
4
+
5
+ Replace the current bootstrap scanner with a production-grade source analysis engine that performs AST-level extraction, detects framework-specific surfaces, builds import/dependency graphs, and maps tests to modules.
6
+
7
+ ## Architecture
8
+
9
+ ```mermaid
10
+ flowchart TD
11
+ Source[Source Files] --> Parser[AST Parser]
12
+ Parser --> Symbols[Symbol Extraction]
13
+ Parser --> Imports[Import Resolution]
14
+ Parser --> Framework[Framework Detection]
15
+ Symbols --> Cards[Rich Source Cards]
16
+ Imports --> Graph[Import Graph]
17
+ Framework --> Routes[Route & API Surfaces]
18
+ Framework --> DB[Migration & ORM Models]
19
+ Graph --> AffectedPages[Affected-Page Graph]
20
+ Cards --> Output[Scanner Output]
21
+ Graph --> Output
22
+ Routes --> Output
23
+ DB --> Output
24
+ AffectedPages --> Output
25
+ ```
26
+
27
+ ```mermaid
28
+ flowchart LR
29
+ subgraph Extractors
30
+ TS[TypeScript/JS]
31
+ Py[Python]
32
+ Go[Go]
33
+ Rust[Rust]
34
+ end
35
+ subgraph Frameworks
36
+ Express
37
+ Fastify
38
+ NestJS
39
+ NextJS[Next.js]
40
+ Hono
41
+ tRPC
42
+ GraphQL
43
+ OpenAPI
44
+ end
45
+ TS --> Frameworks
46
+ Py --> Frameworks
47
+ ```
48
+
49
+ ```mermaid
50
+ graph TD
51
+ subgraph Import Graph
52
+ A[module-a.ts] --> B[module-b.ts]
53
+ A --> C[utils.ts]
54
+ B --> C
55
+ B --> D[db.ts]
56
+ D --> E[migrations/001.sql]
57
+ end
58
+ subgraph Test Mapping
59
+ T1[module-a.test.ts] -.-> A
60
+ T2[module-b.test.ts] -.-> B
61
+ T3[utils.test.ts] -.-> C
62
+ end
63
+ ```
64
+
65
+ ## Key Deliverables
66
+
67
+ - TypeScript/JavaScript AST extraction (exports, classes, functions, types)
68
+ - Framework detection: Express, Fastify, NestJS, Next.js, Hono, Koa, tRPC, GraphQL, OpenAPI
69
+ - Route and API surface extraction
70
+ - Database migration and ORM model detection
71
+ - Import graph construction
72
+ - Test-to-source mapping
73
+ - Package script parsing
74
+ - Affected-page graph for incremental mode
75
+
76
+ ## Success Criteria
77
+
78
+ - Scanner produces rich source cards with symbol-level detail for supported languages
79
+ - Framework-specific routes, middleware, and API surfaces are captured
80
+ - Import graph enables transitive dependency reasoning
81
+ - Test coverage mapping connects test files to the modules they exercise
82
+
83
+ ## Dependencies
84
+
85
+ - Upstream: Current scaffold scanner (Milestone 1)
86
+ - Downstream: Incremental mode (needs affected-page graph), LLM compiler (needs rich source cards)
87
+
88
+ ## Open Questions
89
+
90
+ - Which AST parser(s) to use? (ts-morph, @swc/core, tree-sitter)
91
+ - How to handle monorepo workspace boundaries?
92
+ - Should Python/Go/Rust extractors be included in this epic or deferred?
93
+
94
+ ## Scope Decisions
95
+
96
+ Decisions recorded during the foundation phase of this epic:
97
+
98
+ 1. **Regex extraction instead of AST parser**: The foundation phase uses regex-based extraction (`content.matchAll(…)`) rather than an AST parser (ts-morph / @swc/core / tree-sitter). This is a deliberate scope reduction for the initial phase; AST parsing is deferred to a follow-up.
99
+
100
+ 2. **DB/ORM detection (deterministic baseline)**: Scanner now performs deterministic path and regex-based detection for common migration files (`migrations/**`, SQL migration naming patterns, Prisma migrations) and model/entity declarations (Prisma `model`, TypeORM `@Entity`, Sequelize/Mongoose patterns). It records only safe metadata (paths, migration ids/names, model/entity names, hints) without copying SQL bodies or secret-like values.
101
+
102
+ 3. **Affected-page graph deferred**: `baseRef`/`headRef` options are scaffolded but ignored; the affected-page graph for incremental mode is deferred to the incremental-mode epic.
103
+
104
+ 4. **Framework detection partial**: Express, Fastify, Hono, and Next.js route-handler files are detected. NestJS, Koa, tRPC, GraphQL, and OpenAPI are deferred to a follow-up.
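
Scope decision 1 describes regex-based extraction via `content.matchAll(…)`. A minimal sketch of that technique is shown here; the pattern, the `ExportedSymbol` shape, and the `extractExports` name are assumptions made for illustration and are not the scanner's actual code.

```typescript
// Hypothetical sketch: the pattern and result shape are assumptions.
interface ExportedSymbol {
  kind: string;
  name: string;
}

// Matches simple top-level declarations that start a line, such as
// `export function foo`, `export async function bar`, or `export default class Baz`.
const EXPORT_PATTERN =
  /^export\s+(?:default\s+)?(?:async\s+)?(function|class|const|let|interface|type|enum)\s+([A-Za-z_$][\w$]*)/gm;

function extractExports(content: string): ExportedSymbol[] {
  const symbols: ExportedSymbol[] = [];
  for (const match of content.matchAll(EXPORT_PATTERN)) {
    const kind = match[1];
    const name = match[2];
    if (kind !== undefined && name !== undefined) {
      symbols.push({ kind, name });
    }
  }
  return symbols;
}
```

A pattern like this misses `export { a, b }` lists and re-exports, and it can match text inside comments or template strings, which is the kind of gap an AST parser would close in the deferred follow-up.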
@@ -0,0 +1,103 @@
1
+ # Epic: Query and File-Back Workflow
2
+
3
+ ## Summary
4
+
5
+ Implement `repo-wiki query` and `repo-wiki search` as source-cited, wiki-first answer surfaces that treat the generated wiki as the first navigation layer before drilling into source cards and files for verification. Allow durable query answers to be filed back into the wiki as investigation or topic pages with full provenance, extending the Karpathy LLM Wiki pattern from compile-time knowledge capture to runtime knowledge compounding.
6
+
7
+ ## Architecture
8
+
9
+ ```mermaid
10
+ sequenceDiagram
11
+ participant User
12
+ participant CLI
13
+ participant Search
14
+ participant Wiki
15
+ participant Source
16
+ participant Compiler
17
+
18
+ User->>CLI: repo-wiki query "How does auth work?"
19
+ CLI->>Search: Search Index.md, wiki pages, cards
20
+ Search-->>CLI: Candidate pages and source paths
21
+ CLI->>Wiki: Read relevant compiled pages
22
+ CLI->>Source: Verify material claims against source cards/files
23
+ CLI-->>User: Source-cited answer with confidence
24
+ alt answer is durable
25
+ User->>CLI: repo-wiki query --file-back "auth-investigation"
26
+ CLI->>Compiler: Create or update wiki page with provenance
27
+ Compiler->>Wiki: Write investigation page
28
+ CLI->>Wiki: Append query event to Log.md
29
+ end
30
+ ```
31
+
32
+ ```mermaid
33
+ flowchart TD
34
+ Query["User query"] --> Index["Read Index.md"]
35
+ Index --> Pages["Rank candidate wiki pages"]
36
+ Pages --> WikiRead["Read compiled wiki pages"]
37
+ WikiRead --> Verify["Verify claims against source cards"]
38
+ Verify --> Answer["Source-cited answer + confidence"]
39
+ Answer --> FileBack{"File back?"}
40
+ FileBack -->|yes| Page["Investigation or topic page<br/>(provenance + query text + sources)"]
41
+ Page --> Log["Append to Log.md"]
42
+ FileBack -->|no| Done["Done"]
43
+ ```
44
+
45
+ ## Shipped command slice
46
+
47
+ - `repo-wiki search <query>` — local ranked search over wiki pages.
48
+ - `repo-wiki query <question>` — offline extractive answer assembly over ranked wiki pages plus graph provenance evidence.
49
+ - `repo-wiki path <from> <to>` — deterministic shortest-path traversal over `.llmwiki/graph.json`.
50
+ - `repo-wiki explain <node-or-page>` — focused local explanation tied to wiki page summaries and graph/source evidence.
51
+ - All four commands support `--json` for machine-readable reuse.
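
The deterministic shortest-path traversal behind `repo-wiki path` could look roughly like the breadth-first search below. The edge shape is an assumption: the real schema of `.llmwiki/graph.json` is not shown in this plan, so `GraphEdge` and `shortestPath` are hypothetical names and the graph is treated as directed.

```typescript
// Hypothetical sketch: the edge shape is assumed, not the real graph.json schema.
interface GraphEdge {
  from: string;
  to: string;
}

function shortestPath(edges: GraphEdge[], start: string, goal: string): string[] | null {
  // Build a directed adjacency list.
  const adjacency = new Map<string, string[]>();
  for (const { from, to } of edges) {
    const neighbours = adjacency.get(from) ?? [];
    neighbours.push(to);
    adjacency.set(from, neighbours);
  }
  // Sort neighbours so ties between equally short paths always resolve the same way.
  for (const neighbours of adjacency.values()) {
    neighbours.sort();
  }

  // Breadth-first search; `previous` doubles as the visited set.
  const previous = new Map<string, string | null>([[start, null]]);
  const queue: string[] = [start];
  for (let head = 0; head < queue.length; head++) {
    const node = queue[head] as string;
    if (node === goal) {
      // Walk the predecessor chain back to the start, then reverse it.
      const path: string[] = [];
      let current: string | null = node;
      while (current !== null) {
        path.push(current);
        current = previous.get(current) ?? null;
      }
      return path.reverse();
    }
    for (const next of adjacency.get(node) ?? []) {
      if (!previous.has(next)) {
        previous.set(next, node);
        queue.push(next);
      }
    }
  }
  return null; // no route between the two nodes
}
```

Breadth-first search returns a path with the fewest edges, and sorting neighbours first means the same input graph always yields the same path, which keeps `--json` output stable for tools.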
52
+
53
+ ## Deferred file-back slice
54
+
55
+ - `--file-back` flag on `query` to create or update a wiki page from a durable answer.
56
+ - Filed-back pages include provenance (query text, answering commit, source paths, page state).
57
+ - Query and file-back events appended to `Log.md` in the standard parseable format.
58
+ - Optional hosted wording layered behind the same evidence path.
59
+ - Mock/deterministic mode for tests (no hosted LLM required).
60
+ - Query answers never treat stale or contradicted docs as authoritative.
61
+
62
+ ## Success Criteria
63
+
64
+ - `repo-wiki search "query"` returns ranked wiki pages and evidence paths without external services.
65
+ - `repo-wiki query`, `path`, and `explain` work without external services and expose JSON output.
66
+ - Query and explain answers cite source paths for material claims when graph/wiki provenance is available.
67
+ - Filed-back pages include provenance, query text, source paths, and page state in frontmatter.
68
+ - The feature works in deterministic/mock mode for tests.
69
+ - Query and file-back events appear in `Log.md` with the standard timestamp and operation type.
70
+
71
+ ## Acceptance Criteria (from PLAN.md)
72
+
73
+ - Query answers cite source paths for material claims.
74
+ - Filed-back pages include provenance, query text, source paths, and page state.
75
+ - The feature works in deterministic/mock mode for tests.
76
+ - `repo-wiki search "query"` returns ranked wiki pages and evidence paths.
77
+ - Search can run without external services.
78
+ - Optional provider integrations do not change core scan/compile behavior.
79
+
80
+ ## Filed-Back Page Frontmatter
81
+
82
+ ```yaml
83
+ kind: investigation
84
+ query: "How does auth work?"
85
+ answered_at: "2026-05-10T14:30:00Z"
86
+ source_commit: abc1234
87
+ source_paths:
88
+ - src/auth.ts
89
+ - src/middleware/session.ts
90
+ page_state: filed-back
91
+ confidence: medium
92
+ ```
93
+
94
+ ## Dependencies
95
+
96
+ - Upstream: wiki compiler, scanner, context assembler, LLM provider boundary.
97
+ - Downstream: local search index (needed for scale), Log.md as parseable surface.
98
+
99
+ ## Open Questions
100
+
101
+ - Should the first search backend be a built-in simple index, qmd integration, or MCP-first?
102
+ - Should filed-back pages be proposed for review or written immediately?
103
+ - How should confidence metadata propagate when a filed-back page is later updated by a new query?
@@ -0,0 +1,118 @@
1
+ # Epic: Search Index
2
+
3
+ ## Summary
4
+
5
+ Ship the first built-in local search path over compiled wiki pages so that `repo-wiki search` works fully offline today. Preserve enough metadata for later query/runtime routing, but defer external adapters until after the built-in page-first contract is stable.
6
+
7
+ ## Shipped built-in slice
8
+
9
+ The shipped slice is intentionally bounded:
10
+
11
+ - deterministic index artifact: `.llmwiki/search/index.json`
12
+ - inputs: compiled wiki pages plus local metadata already present in page frontmatter and internal wiki links
13
+ - ranking: deterministic page-first lexical scoring with evidence-oriented results
14
+ - CLI: `repo-wiki search <query>` with text output for humans and `--json` for tools/agents
15
+ - rebuild path: compile refreshes the search artifact, and `search` can rebuild it on demand from local wiki pages
16
+
17
+ ### Result contract
18
+
19
+ Each result includes:
20
+
21
+ - page title and page path
22
+ - page `kind` / `page_state` when available
23
+ - summary/snippet for quick routing
24
+ - `source_paths` when available
25
+ - lightweight graph context from inbound/outbound internal wiki links
26
+
27
+ ## Deferred after the shipped slice
28
+
29
+ These remain explicitly deferred:
30
+
31
+ - source-card/documentation-card indexing beyond what is already promoted into compiled wiki pages
32
+ - section-level ranking as a primary retrieval contract
33
+ - qmd, MCP, embedding, or hosted adapters
34
+ - answer synthesis / `query` / file-back workflows
35
+
36
+ ## Architecture
37
+
38
+ ```mermaid
39
+ flowchart TD
40
+ Wiki["Wiki pages (.llmwiki/wiki)"] --> Indexer["Index builder"]
41
+ Cards["Source + doc cards (.llmwiki/run)"] --> Indexer
42
+ Indexer --> Index["Local search index (.llmwiki/search/)"]
43
+ Index --> Simple["Built-in ranked text search"]
44
+ Index --> Adapter["Optional adapter layer"]
45
+ Adapter --> qmd["qmd backend (optional)"]
46
+ Adapter --> MCP["MCP endpoint (optional)"]
47
+ Simple --> Results["Ranked results<br/>(page, section, source paths)"]
48
+ qmd --> Results
49
+ MCP --> Results
50
+ ```
51
+
52
+ ```mermaid
53
+ flowchart LR
54
+ subgraph Index entry
55
+ Title["page title"]
56
+ Category["page category"]
57
+ Summary["one-line summary"]
58
+ Body["searchable body text"]
59
+ Sources["source paths"]
60
+ Commit["source_commit"]
61
+ end
62
+ ```
63
+
64
+ ## Key Deliverables
65
+
66
+ - Local index builder that runs after `compile` and indexes compiled wiki pages.
67
+ - Built-in ranked text search with no external dependencies.
68
+ - Index stored under `.llmwiki/search/` alongside run artifacts.
69
+ - `repo-wiki search <query>` CLI command that returns ranked results with page title, category/kind, snippet, graph context, and source paths.
70
+ - Deterministic full rebuilds that are compatible with later incremental indexing.
71
+ - Index entries include page metadata: `kind`, `source_commit`, `source_paths`, `page_state`, and internal-link adjacency.
72
+
73
+ ## Success Criteria
74
+
75
+ - `repo-wiki search "query"` returns ranked results without network access.
76
+ - Search results include source paths so callers can drill into evidence.
77
+ - The built-in shipped contract is stable enough for later query/runtime work to consume directly.
78
+ - Index rebuild is fast enough to run after every compile in a typical repository.
79
+
80
+ ## Acceptance Criteria (from PLAN.md)
81
+
82
+ - `repo-wiki search "query"` returns ranked wiki pages and evidence paths.
83
+ - Search can run without external services.
84
+ - Optional provider integrations do not change core scan/compile behavior.
85
+
86
+ ## Index Format
87
+
88
+ ```json
89
+ {
90
+ "version": 1,
91
+ "wikiDir": "/repo/.llmwiki/wiki",
92
+ "sourceCommits": ["abc1234"],
93
+ "entries": [
94
+ {
95
+ "pagePath": "Architecture.md",
96
+ "title": "Architecture",
97
+ "kind": "foundation",
98
+ "pageState": "generated",
99
+ "summary": "High-level system design and data flow.",
100
+ "sourcePaths": ["src/compiler.ts", "src/scanner.ts"],
101
+ "outboundLinks": ["Module-scanner-ts.md"],
102
+ "inboundLinks": [],
103
+ "searchText": "architecture compiler scanner planner wiki"
104
+ }
105
+ ]
106
+ }
107
+ ```
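
To make "deterministic page-first lexical scoring" concrete, here is a small sketch that ranks entries shaped like the index format above. The field weights (title 3, summary 2, body 1), the tokenizer, and the names `searchIndex` and `SearchHit` are assumptions for illustration; the shipped ranking may differ.

```typescript
// Hypothetical sketch: weights and names are assumptions, not the shipped ranking.
interface IndexEntry {
  pagePath: string;
  title: string;
  summary: string;
  sourcePaths: string[];
  searchText: string;
}

interface SearchHit {
  pagePath: string;
  score: number;
  sourcePaths: string[];
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 0);
}

// Counts how many tokens are query terms.
function countMatches(tokens: string[], terms: Set<string>): number {
  return tokens.filter((token) => terms.has(token)).length;
}

function searchIndex(entries: IndexEntry[], query: string): SearchHit[] {
  const terms = new Set(tokenize(query));
  return entries
    .map((entry) => ({
      pagePath: entry.pagePath,
      sourcePaths: entry.sourcePaths,
      score:
        3 * countMatches(tokenize(entry.title), terms) +
        2 * countMatches(tokenize(entry.summary), terms) +
        countMatches(tokenize(entry.searchText), terms),
    }))
    .filter((hit) => hit.score > 0)
    // Highest score first; ties fall back to page path so output order is stable.
    .sort((a, b) => b.score - a.score || (a.pagePath < b.pagePath ? -1 : a.pagePath > b.pagePath ? 1 : 0));
}
```

Because every hit carries `sourcePaths`, a caller can drill from a ranked page straight into evidence, as the success criteria require.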
108
+
109
+ ## Dependencies
110
+
111
+ - Upstream: wiki compiler (produces pages to index), scanner (provides source card metadata).
112
+ - Downstream: query and file-back (uses search index for candidate routing).
113
+
114
+ ## Open Questions
115
+
116
+ - When incremental mode becomes diff-minimal, what is the narrowest safe page-level reindex contract?
117
+ - Should section-level ranking be added on top of the page-first contract, or remain a downstream concern for `query`?
118
+ - If optional adapters arrive later, should qmd or MCP be the first external backend to implement?
@@ -0,0 +1,74 @@
1
+ # Epic: Trust Hardening
2
+
3
+ ## Summary
4
+
5
+ Harden every surface where generated, scanned, or published content could leak secrets, reflect stale source truth, bypass severity policy, or let untrusted LLM patches reach the filesystem or a remote. This epic covers redaction, config schema validation, hash coverage, docs-lint blocking in the publish path, safe stale-page deletion, and publisher safety controls.
6
+
7
+ ## Architecture
8
+
9
+ ```mermaid
10
+ flowchart TD
11
+ Scan["Scanner output"] --> Redact["Secret redaction"]
12
+ Redact --> Cards["Cards (no secrets)"]
13
+ Cards --> DocsLint["Docs linter"]
14
+ DocsLint --> Gate{"Error-level issues?"}
15
+ Gate -->|yes, configured to fail| Stop["Halt: do not compile/publish"]
16
+ Gate -->|no| Compiler["Compiler"]
17
+ Compiler --> PatchGate["Structured patch gate"]
18
+ PatchGate --> WikiPages["Wiki pages"]
19
+ WikiPages --> WikiLint["Wiki linter + health"]
20
+ WikiLint --> Publisher["Publisher"]
21
+ Publisher --> StaleDelete["Safe stale-page deletion<br/>(preserve human-owned + unmanaged)"]
22
+ StaleDelete --> Remote["Remote target"]
23
+ ```
24
+
25
+ ## Key Deliverables
26
+
27
+ - **Docs-lint blocking**: make `repo-wiki run` fail before compile/publish when docs-lint reports error-level issues, controlled by config.
28
+ - **Secret redaction**: redact known secret-like patterns from manifests, documentation cards, page contexts, log entries, and generated pages before writing.
29
+ - **Config schema validation**: JSON schema for `.llmwiki/config.json` with clear validation errors on startup.
30
+ - **Hash coverage**: every source card has a stable content hash or an explicit hash-failure reason recorded.
31
+ - **Safe stale-page deletion**: publisher removes generated pages for deleted/renamed sources while preserving unmanaged and human-owned pages; ownership metadata from `page-ownership.ts` gates deletion.
32
+ - **Severity policy**: lint severities for all gates are config-driven; no hardcoded unconditional failures except secret-like content.
33
+ - **Sanitized remotes**: all remotes and URLs are validated and sanitized before display, logging, or writing.
34
+ - **End-to-end fixture**: golden test covering `init → scan → plan → lint-docs → compile → lint → publish --dry-run`.
35
+
36
+ ## Success Criteria
37
+
38
+ - Error-level docs-lint failures block compile and publish when configured.
39
+ - No scan artifact or generated page contains known secret-like test patterns after redaction.
40
+ - Publisher does not delete human-owned or unmanaged pages during stale cleanup.
41
+ - `.llmwiki/config.json` schema validation surfaces clear errors on first use.
42
+ - Every source card in the manifest has a `hash` field or a `hashError` field.
43
+
44
+ ## Acceptance Criteria (from PLAN.md)
45
+
46
+ - Error-level docs lint failures can block run/publish according to config.
47
+ - Scan output respects configured source filtering, including remaining `source.include` and nested-worktree edge cases.
48
+ - Every source card has a stable hash or an explicit hash failure reason.
49
+ - No scan artifact or generated page contains known secret-like patterns from fixtures.
50
+ - Publisher removes stale generated pages without touching unmanaged or human-owned pages.
51
+ - `npm test`, `npm run check`, `npm run coverage` all pass.
52
+ - End-to-end fixture: `init → scan → plan → lint-docs → compile → lint → publish --dry-run`.
53
+
54
+ ## Severity Defaults
55
+
56
+ | Gate | Default |
57
+ |---|---|
58
+ | Docs-lint error-level blocking | configurable (warn by default, fail when `fail_on_error: true`) |
59
+ | Secret-like content | error (always blocks) |
60
+ | Stale docs | warning |
61
+ | Contradicted docs | error |
62
+ | Broken relative links | warning |
63
+ | Missing source commit | warning |
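
One way to read the severity defaults is as a single blocking decision: secret-like content always blocks, and other error-level issues block only when `fail_on_error` is enabled. The sketch below encodes that reading. The gate identifiers, the `GatePolicy` shape, and the choice to treat unknown gates as warnings are all assumptions, not the package's actual policy engine.

```typescript
// Hypothetical sketch: gate identifiers and policy shape are assumptions.
type Severity = "warning" | "error";

interface LintIssue {
  gate: string; // e.g. "stale-docs" (illustrative identifier)
  message: string;
}

interface GatePolicy {
  failOnError: boolean; // mirrors `fail_on_error` from the table
  severities: Record<string, Severity>; // config-driven severity per gate
}

const SECRET_GATE = "secret-like-content";

// Defaults transcribed from the table, using made-up gate identifiers.
const DEFAULT_SEVERITIES: Record<string, Severity> = {
  "stale-docs": "warning",
  "contradicted-docs": "error",
  "broken-relative-links": "warning",
  "missing-source-commit": "warning",
};

function shouldBlock(issues: LintIssue[], policy: GatePolicy): boolean {
  return issues.some((issue) => {
    // Secret-like content always blocks, regardless of configuration.
    if (issue.gate === SECRET_GATE) return true;
    // Assumption: gates missing from the policy are treated as warnings.
    const severity = policy.severities[issue.gate] ?? "warning";
    return severity === "error" && policy.failOnError;
  });
}
```

Keeping the secret check outside the configurable map reflects the deliverable that no failure is hardcoded except for secret-like content.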
64
+
65
+ ## Dependencies
66
+
67
+ - Upstream: docs linter, wiki linter, scanner, publisher.
68
+ - Downstream: all epics — trust hardening is a prerequisite for safe LLM mode, incremental maintenance, and any publish to a public remote.
69
+
70
+ ## Open Questions
71
+
72
+ - Which secret pattern list should be the canonical reference: `src/secret-patterns.ts` or an external policy file?
73
+ - Should redaction replace matched text with `[REDACTED]` or remove the containing field entirely?
74
+ - Should config schema validation block `repo-wiki run` or only `repo-wiki publish`?