@mfittko/repo-wiki 0.2.1
- package/.llmwiki/schema.md +107 -0
- package/AGENTS.md +42 -0
- package/CHANGELOG.md +91 -0
- package/LICENSE +21 -0
- package/README.md +254 -0
- package/dist/bin/repo-wiki.d.ts +2 -0
- package/dist/bin/repo-wiki.js +7 -0
- package/dist/bin/repo-wiki.js.map +1 -0
- package/dist/src/cli.d.ts +1 -0
- package/dist/src/cli.js +404 -0
- package/dist/src/cli.js.map +1 -0
- package/dist/src/compiler.d.ts +55 -0
- package/dist/src/compiler.js +2046 -0
- package/dist/src/compiler.js.map +1 -0
- package/dist/src/config.d.ts +63 -0
- package/dist/src/config.js +86 -0
- package/dist/src/config.js.map +1 -0
- package/dist/src/context-assembler.d.ts +68 -0
- package/dist/src/context-assembler.js +378 -0
- package/dist/src/context-assembler.js.map +1 -0
- package/dist/src/data-model-signals.d.ts +1 -0
- package/dist/src/data-model-signals.js +13 -0
- package/dist/src/data-model-signals.js.map +1 -0
- package/dist/src/docs-ingestor.d.ts +138 -0
- package/dist/src/docs-ingestor.js +844 -0
- package/dist/src/docs-ingestor.js.map +1 -0
- package/dist/src/docs-linter.d.ts +14 -0
- package/dist/src/docs-linter.js +164 -0
- package/dist/src/docs-linter.js.map +1 -0
- package/dist/src/docs-validation.d.ts +36 -0
- package/dist/src/docs-validation.js +297 -0
- package/dist/src/docs-validation.js.map +1 -0
- package/dist/src/extractors.d.ts +50 -0
- package/dist/src/extractors.js +2275 -0
- package/dist/src/extractors.js.map +1 -0
- package/dist/src/frontmatter.d.ts +46 -0
- package/dist/src/frontmatter.js +377 -0
- package/dist/src/frontmatter.js.map +1 -0
- package/dist/src/index.d.ts +26 -0
- package/dist/src/index.js +18 -0
- package/dist/src/index.js.map +1 -0
- package/dist/src/init.d.ts +12 -0
- package/dist/src/init.js +121 -0
- package/dist/src/init.js.map +1 -0
- package/dist/src/language.d.ts +2 -0
- package/dist/src/language.js +62 -0
- package/dist/src/language.js.map +1 -0
- package/dist/src/linter.d.ts +33 -0
- package/dist/src/linter.js +398 -0
- package/dist/src/linter.js.map +1 -0
- package/dist/src/llm-provider.d.ts +267 -0
- package/dist/src/llm-provider.js +474 -0
- package/dist/src/llm-provider.js.map +1 -0
- package/dist/src/page-ownership.d.ts +38 -0
- package/dist/src/page-ownership.js +96 -0
- package/dist/src/page-ownership.js.map +1 -0
- package/dist/src/planner.d.ts +55 -0
- package/dist/src/planner.js +422 -0
- package/dist/src/planner.js.map +1 -0
- package/dist/src/prompts.d.ts +103 -0
- package/dist/src/prompts.js +344 -0
- package/dist/src/prompts.js.map +1 -0
- package/dist/src/publisher.d.ts +68 -0
- package/dist/src/publisher.js +662 -0
- package/dist/src/publisher.js.map +1 -0
- package/dist/src/repository-analysis.d.ts +88 -0
- package/dist/src/repository-analysis.js +485 -0
- package/dist/src/repository-analysis.js.map +1 -0
- package/dist/src/scanner.d.ts +122 -0
- package/dist/src/scanner.js +309 -0
- package/dist/src/scanner.js.map +1 -0
- package/dist/src/search.d.ts +71 -0
- package/dist/src/search.js +410 -0
- package/dist/src/search.js.map +1 -0
- package/dist/src/secret-patterns.d.ts +3 -0
- package/dist/src/secret-patterns.js +14 -0
- package/dist/src/secret-patterns.js.map +1 -0
- package/dist/src/utils/args.d.ts +2 -0
- package/dist/src/utils/args.js +19 -0
- package/dist/src/utils/args.js.map +1 -0
- package/dist/src/utils/dotenv.d.ts +7 -0
- package/dist/src/utils/dotenv.js +73 -0
- package/dist/src/utils/dotenv.js.map +1 -0
- package/dist/src/utils/fs.d.ts +22 -0
- package/dist/src/utils/fs.js +83 -0
- package/dist/src/utils/fs.js.map +1 -0
- package/dist/src/utils/git.d.ts +13 -0
- package/dist/src/utils/git.js +39 -0
- package/dist/src/utils/git.js.map +1 -0
- package/dist/src/wiki-graph.d.ts +74 -0
- package/dist/src/wiki-graph.js +335 -0
- package/dist/src/wiki-graph.js.map +1 -0
- package/dist/src/wiki-patch.d.ts +152 -0
- package/dist/src/wiki-patch.js +489 -0
- package/dist/src/wiki-patch.js.map +1 -0
- package/dist/src/wiki-query.d.ts +63 -0
- package/dist/src/wiki-query.js +255 -0
- package/dist/src/wiki-query.js.map +1 -0
- package/dist/test/cli.test.d.ts +1 -0
- package/dist/test/cli.test.js +514 -0
- package/dist/test/cli.test.js.map +1 -0
- package/dist/test/compiler-eval.test.d.ts +1 -0
- package/dist/test/compiler-eval.test.js +234 -0
- package/dist/test/compiler-eval.test.js.map +1 -0
- package/dist/test/compiler.test.d.ts +1 -0
- package/dist/test/compiler.test.js +2537 -0
- package/dist/test/compiler.test.js.map +1 -0
- package/dist/test/context-assembler.test.d.ts +1 -0
- package/dist/test/context-assembler.test.js +379 -0
- package/dist/test/context-assembler.test.js.map +1 -0
- package/dist/test/docs-linter.test.d.ts +1 -0
- package/dist/test/docs-linter.test.js +900 -0
- package/dist/test/docs-linter.test.js.map +1 -0
- package/dist/test/dotenv.test.d.ts +1 -0
- package/dist/test/dotenv.test.js +77 -0
- package/dist/test/dotenv.test.js.map +1 -0
- package/dist/test/extractors-go.test.d.ts +1 -0
- package/dist/test/extractors-go.test.js +393 -0
- package/dist/test/extractors-go.test.js.map +1 -0
- package/dist/test/extractors-rust.test.d.ts +1 -0
- package/dist/test/extractors-rust.test.js +219 -0
- package/dist/test/extractors-rust.test.js.map +1 -0
- package/dist/test/extractors-utils.test.d.ts +1 -0
- package/dist/test/extractors-utils.test.js +786 -0
- package/dist/test/extractors-utils.test.js.map +1 -0
- package/dist/test/fixtures/compiler-e2e/basic-node-service/repo/infra/deploy.d.ts +1 -0
- package/dist/test/fixtures/compiler-e2e/basic-node-service/repo/infra/deploy.js +4 -0
- package/dist/test/fixtures/compiler-e2e/basic-node-service/repo/infra/deploy.js.map +1 -0
- package/dist/test/frontmatter.test.d.ts +1 -0
- package/dist/test/frontmatter.test.js +287 -0
- package/dist/test/frontmatter.test.js.map +1 -0
- package/dist/test/init-planner.test.d.ts +1 -0
- package/dist/test/init-planner.test.js +688 -0
- package/dist/test/init-planner.test.js.map +1 -0
- package/dist/test/linter.test.d.ts +1 -0
- package/dist/test/linter.test.js +426 -0
- package/dist/test/linter.test.js.map +1 -0
- package/dist/test/llm-provider.test.d.ts +1 -0
- package/dist/test/llm-provider.test.js +783 -0
- package/dist/test/llm-provider.test.js.map +1 -0
- package/dist/test/page-ownership.test.d.ts +1 -0
- package/dist/test/page-ownership.test.js +247 -0
- package/dist/test/page-ownership.test.js.map +1 -0
- package/dist/test/publisher.test.d.ts +1 -0
- package/dist/test/publisher.test.js +1297 -0
- package/dist/test/publisher.test.js.map +1 -0
- package/dist/test/repository-analysis.test.d.ts +1 -0
- package/dist/test/repository-analysis.test.js +182 -0
- package/dist/test/repository-analysis.test.js.map +1 -0
- package/dist/test/run-compiled-tests.d.ts +1 -0
- package/dist/test/run-compiled-tests.js +48 -0
- package/dist/test/run-compiled-tests.js.map +1 -0
- package/dist/test/scanner.test.d.ts +1 -0
- package/dist/test/scanner.test.js +551 -0
- package/dist/test/scanner.test.js.map +1 -0
- package/dist/test/search.test.d.ts +1 -0
- package/dist/test/search.test.js +92 -0
- package/dist/test/search.test.js.map +1 -0
- package/dist/test/update-changelog.test.d.ts +1 -0
- package/dist/test/update-changelog.test.js +125 -0
- package/dist/test/update-changelog.test.js.map +1 -0
- package/dist/test/wiki-graph.test.d.ts +1 -0
- package/dist/test/wiki-graph.test.js +164 -0
- package/dist/test/wiki-graph.test.js.map +1 -0
- package/dist/test/wiki-patch.test.d.ts +1 -0
- package/dist/test/wiki-patch.test.js +610 -0
- package/dist/test/wiki-patch.test.js.map +1 -0
- package/dist/test/wiki-query.test.d.ts +1 -0
- package/dist/test/wiki-query.test.js +163 -0
- package/dist/test/wiki-query.test.js.map +1 -0
- package/docs/PLAN.md +993 -0
- package/docs/WHY.md +61 -0
- package/docs/plans/agent-integration.md +85 -0
- package/docs/plans/ci-publishing.md +111 -0
- package/docs/plans/doc-validation.md +92 -0
- package/docs/plans/github-action.md +113 -0
- package/docs/plans/incremental-mode.md +98 -0
- package/docs/plans/karpathy-llm-wiki-alignment.md +84 -0
- package/docs/plans/llm-compiler.md +160 -0
- package/docs/plans/production-scanner.md +104 -0
- package/docs/plans/query-and-file-back.md +103 -0
- package/docs/plans/search-index.md +118 -0
- package/docs/plans/trust-hardening.md +74 -0
- package/docs/plans/wiki-graph.md +183 -0
- package/docs/plans/wiki-health.md +76 -0
- package/package.json +83 -0
- package/prompts/compiler.md +16 -0
- package/prompts/lint.md +18 -0
- package/prompts/page-templates.md +25 -0
- package/skills/repo-wiki-cli/SKILL.md +139 -0
|
@@ -0,0 +1,160 @@
|
|
|
1
|
+
# Epic: LLM Compiler
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Replace the deterministic placeholder summaries in the wiki compiler with LLM-powered synthesis that produces human-quality wiki pages grounded in source cards, documentation cards, targeted code excerpts, and the current state of an existing wiki when back-filling or reconciling.
|
|
6
|
+
|
|
7
|
+
## Architecture
|
|
8
|
+
|
|
9
|
+
```mermaid
flowchart TD
    SourceCards[Source Cards] --> Budget[Token Budget Assembler]
    DocCards[Documentation Cards] --> Budget
    CodeExcerpts[Targeted Code Excerpts] --> Budget
    ExistingWiki[Existing Wiki Pages] --> Ownership[Ownership + Preserve Section Extraction]
    Budget --> Context[Assembled Context Window]
    Ownership --> Context
    Context --> Prompt[Prompt Template Selection]
    PageType[Page Archetype] --> Prompt
    Prompt --> LLM[LLM Provider]
    LLM --> RawOutput[Raw LLM Output]
    RawOutput --> Patch[Structured Patch]
    Patch --> Preserve[Human Section Preservation]
    Preserve --> Cite[Citation Enforcement]
    Cite --> Lint[Lint Gate Validation]
    Lint -->|pass| Page[Final Wiki Page]
    Lint -->|fail| Retry[Retry / Flag]
```
|
|
28
|
+
|
|
29
|
+
```mermaid
flowchart LR
    subgraph Page Archetypes
        Foundation[Foundation Pages]
        Module[Module Pages]
        CrossCut[Cross-cutting Pages]
    end
    subgraph Prompts
        P1[home.md]
        P2[architecture.md]
        P3[module.md]
        P4[dependency-map.md]
    end
    Foundation --> P1
    Foundation --> P2
    Module --> P3
    CrossCut --> P4
```
|
|
47
|
+
|
|
48
|
+
```mermaid
sequenceDiagram
    participant Planner
    participant Assembler as Context Assembler
    participant LLM
    participant Linter

    Planner->>Assembler: Page plan + source cards
    Assembler->>Assembler: Select excerpts within token budget
    Assembler->>LLM: Prompt + context
    LLM-->>Assembler: Generated page content
    Assembler->>Linter: Validate output
    Linter-->>Assembler: Pass/fail + issues
    alt lint passes
        Assembler->>Planner: Emit final page
    else lint fails
        Assembler->>LLM: Retry with feedback
    end
```
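The generate, lint, retry loop in the sequence diagram can be sketched roughly as follows. This is a minimal illustration, not the package's implementation: `Generate`, `Lint`, `CompileOutcome`, and `compileWithLintGate` are hypothetical names, and the calls are synchronous only to keep the sketch short (a real provider call would be asynchronous).

```typescript
// Hypothetical shapes for illustration; not taken from the package's API.
type LintResult = { ok: boolean; issues: string[] };
type Generate = (prompt: string, feedback: string[]) => string;
type Lint = (page: string) => LintResult;

interface CompileOutcome {
  status: "accepted" | "flagged";
  page: string;
  attempts: number;
  issues: string[];
}

// Generate a page, lint it, and retry with the lint issues as feedback
// until it passes or the retry budget is exhausted.
function compileWithLintGate(
  prompt: string,
  generate: Generate,
  lint: Lint,
  retries = 2,
): CompileOutcome {
  let feedback: string[] = [];
  let page = "";
  for (let attempt = 1; attempt <= retries + 1; attempt++) {
    page = generate(prompt, feedback);
    const result = lint(page);
    if (result.ok) {
      return { status: "accepted", page, attempts: attempt, issues: [] };
    }
    // Carry the lint issues into the next attempt.
    feedback = result.issues;
  }
  // Out of retries: flag the page rather than accept unvalidated output.
  return { status: "flagged", page, attempts: retries + 1, issues: feedback };
}
```

Returning a `flagged` outcome instead of throwing mirrors the "Retry / Flag" branch of the flowchart: the caller decides whether a flagged page blocks the build.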
|
|
67
|
+
|
|
68
|
+
## Key Deliverables
|
|
69
|
+
|
|
70
|
+
- LLM synthesis pipeline for each wiki page type (foundation, module, cross-cutting)
|
|
71
|
+
- Incremental LLM enrichment that runs only for affected wiki pages selected by diff, bounded hierarchy propagation, and semantic propagation rules
|
|
72
|
+
- Source card and code excerpt context assembly (token-budget aware)
|
|
73
|
+
- Existing wiki page ingestion before regeneration
|
|
74
|
+
- Page classification: generated, human-owned, mixed, unmanaged
|
|
75
|
+
- Structured patch output format for wiki pages
|
|
76
|
+
- Source citation enforcement (every material claim cites a path)
|
|
77
|
+
- Contradiction and confidence metadata in generated pages
|
|
78
|
+
- Human-maintained section preservation during regeneration
|
|
79
|
+
- Stable page identity and ownership metadata in generated frontmatter
|
|
80
|
+
- Merge strategy for back-fill and reconcile mode, not just fresh bootstrap generation
|
|
81
|
+
- Prompt templates for each page archetype
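A token-budget-aware context assembly step, as listed above, might look something like this sketch. The `ContextItem` shape, the priority field, and the four-characters-per-token estimate are all assumptions made for illustration; the real assembler may rank and measure context differently.

```typescript
// Hypothetical item shape; "priority" is an assumed ranking signal.
interface ContextItem {
  id: string;
  text: string;
  priority: number; // higher means more important
}

// Rough token estimate (about 4 characters per token). This is an assumption,
// not a real tokenizer.
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Greedy selection: take items in priority order and skip any item that
// would push the total over the budget.
function selectWithinBudget(items: ContextItem[], budget: number): ContextItem[] {
  const ordered = [...items].sort(
    (a, b) => b.priority - a.priority || a.id.localeCompare(b.id),
  );
  const selected: ContextItem[] = [];
  let used = 0;
  for (const item of ordered) {
    const cost = estimateTokens(item.text);
    if (used + cost > budget) continue; // a smaller, lower-priority item may still fit
    selected.push(item);
    used += cost;
  }
  return selected;
}
```

Breaking priority ties by `id` keeps the selection deterministic, which matters if untouched pages are expected to stay byte-stable between runs.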
|
|
82
|
+
|
|
83
|
+
## Provider configuration contract
|
|
84
|
+
|
|
85
|
+
The first production LLM boundary should be provider-agnostic and compatible with OpenAI-style chat completions so GitHub Actions and local runs can use OpenAI, compatible hosted providers, or local gateways without changing compiler code.
|
|
86
|
+
|
|
87
|
+
Minimum `.llmwiki/config.json` shape:
|
|
88
|
+
|
|
89
|
+
```jsonc
{
  "compiler": {
    "mode": "deterministic", // deterministic | llm
    "llm": {
      "provider": "openai-compatible",
      "base_url": "https://api.openai.com/v1",
      "model": "gpt-4.1-mini",
      "api_key_env": "LLMWIKI_LLM_API_KEY",
      "system_prompt": "You compile source-grounded GitHub Wiki pages.",
      "temperature": 0.1,
      "max_output_tokens": 4000,
      "timeout_ms": 60000,
      "retries": 2
    }
  }
}
```
|
|
107
|
+
|
|
108
|
+
Environment variables override config values for CI and secrets:
|
|
109
|
+
|
|
110
|
+
| Environment variable | Purpose |
|---|---|
| `LLMWIKI_COMPILER_MODE` | Select `deterministic` or `llm` mode. |
| `LLMWIKI_LLM_BASE_URL` | Override provider API base URL. |
| `LLMWIKI_LLM_MODEL` | Override model name. |
| `LLMWIKI_LLM_API_KEY` | Provider API key; never written to artifacts or logs. |
| `LLMWIKI_LLM_SYSTEM_PROMPT` | Inline system prompt override. |
| `LLMWIKI_LLM_SYSTEM_PROMPT_FILE` | Path to a system prompt file; useful for repo-maintained prompts. |
| `LLMWIKI_LLM_TEMPERATURE` | Sampling temperature override. |
| `LLMWIKI_LLM_MAX_OUTPUT_TOKENS` | Output token budget override. |
|
|
120
|
+
|
|
121
|
+
Configuration precedence should be explicit and deterministic: CLI flags, when added, override environment variables; environment variables override `.llmwiki/config.json`; config overrides safe defaults. The API key must be read only from the configured environment variable and must never be persisted in scan artifacts, generated wiki pages, prompt-debug artifacts, or normal logs.
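The precedence chain described above could be resolved along these lines. This is a sketch under stated assumptions: `LlmSettings`, `fromEnv`, and `resolveSettings` are hypothetical names, the default values are simply copied from the example config, and only a subset of the settings is shown. The API key is deliberately left out of the settings object so it cannot be serialized by accident.

```typescript
// Hypothetical settings shape covering a subset of the documented options.
interface LlmSettings {
  mode: "deterministic" | "llm";
  baseUrl: string;
  model: string;
  temperature: number;
  maxOutputTokens: number;
}

type Env = Record<string, string | undefined>;

// Assumed defaults, copied from the example config above.
const DEFAULTS: LlmSettings = {
  mode: "deterministic",
  baseUrl: "https://api.openai.com/v1",
  model: "gpt-4.1-mini",
  temperature: 0.1,
  maxOutputTokens: 4000,
};

function parseNumber(raw: string | undefined): number | undefined {
  if (raw === undefined || raw.trim() === "") return undefined;
  const value = Number(raw);
  return Number.isFinite(value) ? value : undefined;
}

// Only keys that are actually set (and valid) are returned, so an unset
// variable never overrides a lower layer.
function fromEnv(env: Env): Partial<LlmSettings> {
  const out: Partial<LlmSettings> = {};
  const mode = env["LLMWIKI_COMPILER_MODE"];
  if (mode === "deterministic" || mode === "llm") out.mode = mode;
  const baseUrl = env["LLMWIKI_LLM_BASE_URL"];
  if (baseUrl) out.baseUrl = baseUrl;
  const model = env["LLMWIKI_LLM_MODEL"];
  if (model) out.model = model;
  const temperature = parseNumber(env["LLMWIKI_LLM_TEMPERATURE"]);
  if (temperature !== undefined) out.temperature = temperature;
  const maxTokens = parseNumber(env["LLMWIKI_LLM_MAX_OUTPUT_TOKENS"]);
  if (maxTokens !== undefined) out.maxOutputTokens = maxTokens;
  return out;
}

// Later spreads win: defaults < config file < environment < CLI flags.
function resolveSettings(
  fileConfig: Partial<LlmSettings>,
  env: Env,
  cliFlags: Partial<LlmSettings> = {},
): LlmSettings {
  return { ...DEFAULTS, ...fileConfig, ...fromEnv(env), ...cliFlags };
}
```

Note that object spread copies explicit `undefined` values too, so callers of this sketch should omit unset keys from `fileConfig` and `cliFlags` rather than passing them as `undefined`.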
|
|
122
|
+
|
|
123
|
+
GitHub Actions can enable LLM compilation by setting repository variables for non-secret values and a secret for the API key:
|
|
124
|
+
|
|
125
|
+
```yaml
env:
  LLMWIKI_COMPILER_MODE: llm
  LLMWIKI_LLM_BASE_URL: ${{ vars.LLMWIKI_LLM_BASE_URL }}
  LLMWIKI_LLM_MODEL: ${{ vars.LLMWIKI_LLM_MODEL }}
  LLMWIKI_LLM_API_KEY: ${{ secrets.LLMWIKI_LLM_API_KEY }}
```
|
|
132
|
+
|
|
133
|
+
Tests must use a deterministic mock provider and must not require network access or API keys.
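One way to satisfy this requirement is a mock provider whose output is a pure function of its input. The sketch below is illustrative only: `ChatMessage`, `LlmProvider`, `mockCompletion`, and `mockProvider` are hypothetical names, and the package's real provider interface may look different.

```typescript
// Hypothetical shapes for illustration; the real provider interface may differ.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LlmProvider {
  complete(messages: ChatMessage[]): Promise<string>;
}

// FNV-1a style 32-bit hash over UTF-16 code units, rendered as 8 hex digits.
function digest(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Pure function: the same messages always produce the same text,
// with no network access and no API key.
function mockCompletion(messages: ChatMessage[]): string {
  const key = messages.map((m) => `${m.role}:${m.content}`).join("\n");
  return `# Mock page ${digest(key)}\n\nDeterministic stub output for ${messages.length} message(s).`;
}

const mockProvider: LlmProvider = {
  complete: (messages) => Promise.resolve(mockCompletion(messages)),
};
```

Embedding a digest of the prompt in the output lets a test assert that a changed prompt produces a changed page without depending on any model behavior.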
|
|
134
|
+
|
|
135
|
+
## Success Criteria
|
|
136
|
+
|
|
137
|
+
- Generated wiki pages are useful without manual editing
|
|
138
|
+
- Every factual claim cites source paths
|
|
139
|
+
- Human-maintained sections survive regeneration unchanged
|
|
140
|
+
- Existing mixed pages can be regenerated without losing preserved regions
|
|
141
|
+
- Generated pages carry enough metadata to support future reconciliation and safe deletion
|
|
142
|
+
- LLM output passes lint gates before acceptance
|
|
143
|
+
- Untouched enriched pages remain byte-stable during incremental builds and do not incur model calls
|
|
144
|
+
- Token budget stays within model context limits per page
|
|
145
|
+
|
|
146
|
+
## Dependencies
|
|
147
|
+
|
|
148
|
+
- Upstream: Production scanner (rich source cards), doc-validation (validated doc cards)
|
|
149
|
+
- Downstream: Incremental mode (page patching), agent-integration (Agent-Context-Pack quality)
|
|
150
|
+
|
|
151
|
+
## Open Questions
|
|
152
|
+
|
|
153
|
+
- Which LLM provider(s) to support? (OpenAI, Anthropic, local models)
|
|
154
|
+
- How much raw code should the compiler read per page?
|
|
155
|
+
- Should compilation be parallelized across pages?
|
|
156
|
+
- How to handle hallucination detection beyond lint gates?
|
|
157
|
+
- Cost/latency budget for full bootstrap vs incremental compile?
|
|
158
|
+
- What exact propagation thresholds should trigger re-enrichment of parent aggregate pages versus preserving existing content?
|
|
159
|
+
- What is the minimum preservation contract for mixed human/generated pages?
|
|
160
|
+
- Which page sections should be preserved structurally versus semantically merged?
|
|
@@ -0,0 +1,104 @@
|
|
|
1
|
+
# Epic: Production Scanner
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Replace the current bootstrap scanner with a production-grade source analysis engine that performs AST-level extraction, detects framework-specific surfaces, builds import/dependency graphs, and maps tests to modules.
|
|
6
|
+
|
|
7
|
+
## Architecture
|
|
8
|
+
|
|
9
|
+
```mermaid
flowchart TD
    Source[Source Files] --> Parser[AST Parser]
    Parser --> Symbols[Symbol Extraction]
    Parser --> Imports[Import Resolution]
    Parser --> Framework[Framework Detection]
    Symbols --> Cards[Rich Source Cards]
    Imports --> Graph[Import Graph]
    Framework --> Routes[Route & API Surfaces]
    Framework --> DB[Migration & ORM Models]
    Graph --> AffectedPages[Affected-Page Graph]
    Cards --> Output[Scanner Output]
    Graph --> Output
    Routes --> Output
    DB --> Output
    AffectedPages --> Output
```
|
|
26
|
+
|
|
27
|
+
```mermaid
flowchart LR
    subgraph Extractors
        TS[TypeScript/JS]
        Py[Python]
        Go[Go]
        Rust[Rust]
    end
    subgraph Frameworks
        Express
        Fastify
        NestJS
        NextJS[Next.js]
        Hono
        tRPC
        GraphQL
        OpenAPI
    end
    TS --> Frameworks
    Py --> Frameworks
```
|
|
48
|
+
|
|
49
|
+
```mermaid
graph TD
    subgraph Import Graph
        A[module-a.ts] --> B[module-b.ts]
        A --> C[utils.ts]
        B --> C
        B --> D[db.ts]
        D --> E[migrations/001.sql]
    end
    subgraph Test Mapping
        T1[module-a.test.ts] -.-> A
        T2[module-b.test.ts] -.-> B
        T3[utils.test.ts] -.-> C
    end
```
|
|
64
|
+
|
|
65
|
+
## Key Deliverables
|
|
66
|
+
|
|
67
|
+
- TypeScript/JavaScript AST extraction (exports, classes, functions, types)
|
|
68
|
+
- Framework detection: Express, Fastify, NestJS, Next.js, Hono, Koa, tRPC, GraphQL, OpenAPI
|
|
69
|
+
- Route and API surface extraction
|
|
70
|
+
- Database migration and ORM model detection
|
|
71
|
+
- Import graph construction
|
|
72
|
+
- Test-to-source mapping
|
|
73
|
+
- Package script parsing
|
|
74
|
+
- Affected-page graph for incremental mode
|
|
75
|
+
|
|
76
|
+
## Success Criteria
|
|
77
|
+
|
|
78
|
+
- Scanner produces rich source cards with symbol-level detail for supported languages
|
|
79
|
+
- Framework-specific routes, middleware, and API surfaces are captured
|
|
80
|
+
- Import graph enables transitive dependency reasoning
|
|
81
|
+
- Test coverage mapping connects test files to the modules they exercise
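To make "transitive dependency reasoning" concrete, the sketch below walks the example import graph from the diagram in reverse to find every module affected by a change. The `ImportGraph` shape and the `transitiveDependents` function are hypothetical; the scanner's real graph format is not shown in this document.

```typescript
// Hypothetical graph shape: each file maps to the files it imports.
type ImportGraph = Record<string, string[]>;

// The example edges from the import graph diagram.
const exampleGraph: ImportGraph = {
  "module-a.ts": ["module-b.ts", "utils.ts"],
  "module-b.ts": ["utils.ts", "db.ts"],
  "db.ts": ["migrations/001.sql"],
};

// Return every file that directly or indirectly imports `changed`,
// using a breadth-first walk over the reversed edges.
function transitiveDependents(graph: ImportGraph, changed: string): string[] {
  const reverse = new Map<string, string[]>();
  for (const from of Object.keys(graph)) {
    for (const to of graph[from] ?? []) {
      const importers = reverse.get(to) ?? [];
      importers.push(from);
      reverse.set(to, importers);
    }
  }
  const seen = new Set<string>();
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const importer of reverse.get(current) ?? []) {
      if (!seen.has(importer)) {
        seen.add(importer);
        queue.push(importer);
      }
    }
  }
  // Sort for deterministic output.
  return Array.from(seen).sort();
}

console.log(transitiveDependents(exampleGraph, "db.ts"));
// prints [ 'module-a.ts', 'module-b.ts' ]
```

The same reverse walk is the kind of query an affected-page graph would need: a change to `db.ts` touches `module-b.ts` directly and `module-a.ts` through it.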
|
|
82
|
+
|
|
83
|
+
## Dependencies
|
|
84
|
+
|
|
85
|
+
- Upstream: Current scaffold scanner (Milestone 1)
|
|
86
|
+
- Downstream: Incremental mode (needs affected-page graph), LLM compiler (needs rich source cards)
|
|
87
|
+
|
|
88
|
+
## Open Questions
|
|
89
|
+
|
|
90
|
+
- Which AST parser(s) to use? (ts-morph, @swc/core, tree-sitter)
|
|
91
|
+
- How to handle monorepo workspace boundaries?
|
|
92
|
+
- Should Python/Go/Rust extractors be included in this epic or deferred?
|
|
93
|
+
|
|
94
|
+
## Scope Decisions
|
|
95
|
+
|
|
96
|
+
Decisions recorded during the foundation phase of this epic:
|
|
97
|
+
|
|
98
|
+
1. **Regex extraction instead of AST parser**: The foundation phase uses regex-based extraction (`content.matchAll(…)`) rather than an AST parser (ts-morph / @swc/core / tree-sitter). This is a deliberate scope reduction for the initial phase; AST parsing is deferred to a follow-up.
|
|
99
|
+
|
|
100
|
+
2. **DB/ORM detection (deterministic baseline)**: The scanner now performs deterministic path-based and regex-based detection for common migration files (`migrations/**`, SQL migration naming patterns, Prisma migrations) and model/entity declarations (Prisma `model`, TypeORM `@Entity`, Sequelize/Mongoose patterns). It records only safe metadata (paths, migration ids/names, model/entity names, hints) without copying SQL bodies or secret-like values.
|
|
101
|
+
|
|
102
|
+
3. **Affected-page graph deferred**: `baseRef`/`headRef` options are scaffolded but ignored; the affected-page graph for incremental mode is deferred to the incremental-mode epic.
|
|
103
|
+
|
|
104
|
+
4. **Framework detection partial**: Express, Fastify, Hono, and Next.js route-handler files are detected. NestJS, Koa, tRPC, GraphQL, and OpenAPI are deferred to a follow-up.
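As an illustration of the regex-based approach in decision 1, the sketch below extracts top-level exports with `String.prototype.matchAll`. The pattern and the `extractExports` helper are hypothetical and are not the scanner's actual expressions; they only show the general shape of the technique and its limits.

```typescript
interface ExportedSymbol {
  kind: string;
  name: string;
}

// Illustrative pattern (an assumption, not the package's regex): matches
// declarations such as `export function foo` or `export default class Bar`
// at the start of a line.
const EXPORT_PATTERN =
  /^export\s+(?:default\s+)?(?:async\s+)?(function|class|const|let|interface|type|enum)\s+([A-Za-z_$][\w$]*)/gm;

function extractExports(content: string): ExportedSymbol[] {
  const symbols: ExportedSymbol[] = [];
  // matchAll requires the global flag and works on a copy of the regex,
  // so the shared pattern's lastIndex is not mutated.
  for (const match of Array.from(content.matchAll(EXPORT_PATTERN))) {
    const [, kind = "", name = ""] = match;
    symbols.push({ kind, name });
  }
  return symbols;
}
```

The trade-off behind the deferral is visible here: a pattern like this misses `export { a, b }` lists and re-exports, and it can match text inside comments or template strings, which an AST parser would not.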
|
|
@@ -0,0 +1,103 @@
|
|
|
1
|
+
# Epic: Query and File-Back Workflow
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Implement `repo-wiki query` and `repo-wiki search` as source-cited, wiki-first answer surfaces that treat the generated wiki as the first navigation layer before drilling into source cards and files for verification. Allow durable query answers to be filed back into the wiki as investigation or topic pages with full provenance, extending the Karpathy LLM Wiki pattern from compile-time knowledge capture to runtime knowledge compounding.
|
|
6
|
+
|
|
7
|
+
## Architecture
|
|
8
|
+
|
|
9
|
+
```mermaid
sequenceDiagram
    participant User
    participant CLI
    participant Search
    participant Wiki
    participant Source
    participant Compiler

    User->>CLI: repo-wiki query "How does auth work?"
    CLI->>Search: Search Index.md, wiki pages, cards
    Search-->>CLI: Candidate pages and source paths
    CLI->>Wiki: Read relevant compiled pages
    CLI->>Source: Verify material claims against source cards/files
    CLI-->>User: Source-cited answer with confidence
    alt answer is durable
        User->>CLI: repo-wiki query --file-back "auth-investigation"
        CLI->>Compiler: Create or update wiki page with provenance
        Compiler->>Wiki: Write investigation page
        CLI->>Wiki: Append query event to Log.md
    end
```
|
|
31
|
+
|
|
32
|
+
```mermaid
flowchart TD
    Query["User query"] --> Index["Read Index.md"]
    Index --> Pages["Rank candidate wiki pages"]
    Pages --> WikiRead["Read compiled wiki pages"]
    WikiRead --> Verify["Verify claims against source cards"]
    Verify --> Answer["Source-cited answer + confidence"]
    Answer --> FileBack{"File back?"}
    FileBack -->|yes| Page["Investigation or topic page<br/>(provenance + query text + sources)"]
    Page --> Log["Append to Log.md"]
    FileBack -->|no| Done["Done"]
```
|
|
44
|
+
|
|
45
|
+
## Shipped command slice
|
|
46
|
+
|
|
47
|
+
- `repo-wiki search <query>` — local ranked search over wiki pages.
|
|
48
|
+
- `repo-wiki query <question>` — offline extractive answer assembly over ranked wiki pages plus graph provenance evidence.
|
|
49
|
+
- `repo-wiki path <from> <to>` — deterministic shortest-path traversal over `.llmwiki/graph.json`.
|
|
50
|
+
- `repo-wiki explain <node-or-page>` — focused local explanation tied to wiki page summaries and graph/source evidence.
|
|
51
|
+
- All four commands support `--json` for machine-readable reuse.
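A deterministic shortest-path traversal of the kind `repo-wiki path` performs can be sketched with a breadth-first search. Everything specific here is an assumption: the schema of `.llmwiki/graph.json` is not shown in this document, so the `GraphEdge` shape, the directed-edge interpretation, and the node names used in any example are hypothetical.

```typescript
// Hypothetical edge shape; the real graph.json schema may differ.
interface GraphEdge {
  from: string;
  to: string;
}

// Breadth-first search over directed edges. Neighbours are visited in sorted
// order so that ties between equally short paths resolve the same way every run.
function shortestPath(edges: GraphEdge[], start: string, goal: string): string[] | null {
  const adjacency = new Map<string, string[]>();
  for (const edge of edges) {
    const neighbours = adjacency.get(edge.from) ?? [];
    neighbours.push(edge.to);
    adjacency.set(edge.from, neighbours);
  }
  adjacency.forEach((neighbours) => neighbours.sort());

  const previous = new Map<string, string>();
  const visited = new Set<string>([start]);
  const queue: string[] = [start];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    if (current === goal) {
      // Walk the predecessor links back to the start.
      const path = [goal];
      let node = goal;
      while (node !== start) {
        node = previous.get(node) as string;
        path.unshift(node);
      }
      return path;
    }
    for (const next of adjacency.get(current) ?? []) {
      if (!visited.has(next)) {
        visited.add(next);
        previous.set(next, current);
        queue.push(next);
      }
    }
  }
  return null; // goal is not reachable from start
}
```

Sorting neighbours before the search is what makes the result reproducible when several shortest paths exist.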
|
|
52
|
+
|
|
53
|
+
## Deferred file-back slice
|
|
54
|
+
|
|
55
|
+
- `--file-back` flag on `query` to create or update a wiki page from a durable answer.
|
|
56
|
+
- Filed-back pages include provenance (query text, answering commit, source paths, page state).
|
|
57
|
+
- Query and file-back events appended to `Log.md` in the standard parseable format.
|
|
58
|
+
- Optional hosted-LLM rewording of answers, layered behind the same evidence path.
|
|
59
|
+
- Mock/deterministic mode for tests (no hosted LLM required).
|
|
60
|
+
- Query answers never treat stale or contradicted docs as authoritative.
|
|
61
|
+
|
|
62
|
+
## Success Criteria
|
|
63
|
+
|
|
64
|
+
- `repo-wiki search "query"` returns ranked wiki pages and evidence paths without external services.
|
|
65
|
+
- `repo-wiki query`, `path`, and `explain` work without external services and expose JSON output.
|
|
66
|
+
- Query and explain answers cite source paths for material claims when graph/wiki provenance is available.
|
|
67
|
+
- Filed-back pages include provenance, query text, source paths, and page state in frontmatter.
|
|
68
|
+
- The feature works in deterministic/mock mode for tests.
|
|
69
|
+
- Query and file-back events appear in `Log.md` with the standard timestamp and operation type.
|
|
70
|
+
|
|
71
|
+
## Acceptance Criteria (from PLAN.md)
|
|
72
|
+
|
|
73
|
+
- Query answers cite source paths for material claims.
|
|
74
|
+
- Filed-back pages include provenance, query text, source paths, and page state.
|
|
75
|
+
- The feature works in deterministic/mock mode for tests.
|
|
76
|
+
- `repo-wiki search "query"` returns ranked wiki pages and evidence paths.
|
|
77
|
+
- Search can run without external services.
|
|
78
|
+
- Optional provider integrations do not change core scan/compile behavior.
|
|
79
|
+
|
|
80
|
+
## Filed-Back Page Frontmatter
|
|
81
|
+
|
|
82
|
+
```yaml
kind: investigation
query: "How does auth work?"
answered_at: "2026-05-10T14:30:00Z"
source_commit: abc1234
source_paths:
  - src/auth.ts
  - src/middleware/session.ts
page_state: filed-back
confidence: medium
```
|
|
93
|
+
|
|
94
|
+
## Dependencies
|
|
95
|
+
|
|
96
|
+
- Upstream: wiki compiler, scanner, context assembler, LLM provider boundary.
|
|
97
|
+
- Downstream: local search index (needed for scale), Log.md as parseable surface.
|
|
98
|
+
|
|
99
|
+
## Open Questions
|
|
100
|
+
|
|
101
|
+
- Should the first search backend be a built-in simple index, qmd integration, or MCP-first?
|
|
102
|
+
- Should filed-back pages be proposed for review or written immediately?
|
|
103
|
+
- How should confidence metadata propagate when a filed-back page is later updated by a new query?
|
|
@@ -0,0 +1,118 @@
|
|
|
1
|
+
# Epic: Search Index
|
|
2
|
+
|
|
3
|
+
## Summary
|
|
4
|
+
|
|
5
|
+
Ship the first built-in local search path over compiled wiki pages so that `repo-wiki search` works fully offline today. Preserve enough metadata for later query/runtime routing, but defer external adapters until after the built-in page-first contract is stable.
|
|
6
|
+
|
|
7
|
+
## Shipped built-in slice
|
|
8
|
+
|
|
9
|
+
The shipped slice is intentionally bounded:
|
|
10
|
+
|
|
11
|
+
- deterministic index artifact: `.llmwiki/search/index.json`
|
|
12
|
+
- inputs: compiled wiki pages plus local metadata already present in page frontmatter and internal wiki links
|
|
13
|
+
- ranking: deterministic page-first lexical scoring with evidence-oriented results
|
|
14
|
+
- CLI: `repo-wiki search <query>` with text output for humans and `--json` for tools/agents
|
|
15
|
+
- rebuild path: compile refreshes the search artifact, and `search` can rebuild it on demand from local wiki pages
|
|
16
|
+
|
|
17
|
+
### Result contract
|
|
18
|
+
|
|
19
|
+
Each result includes:
|
|
20
|
+
|
|
21
|
+
- page title and page path
|
|
22
|
+
- page `kind` / `page_state` when available
|
|
23
|
+
- summary/snippet for quick routing
|
|
24
|
+
- `source_paths` when available
|
|
25
|
+
- lightweight graph context from inbound/outbound internal wiki links
|
|
26
|
+
|
|
27
|
+
## Deferred after the shipped slice
|
|
28
|
+
|
|
29
|
+
These remain explicitly deferred:
|
|
30
|
+
|
|
31
|
+
- source-card/documentation-card indexing beyond what is already promoted into compiled wiki pages
|
|
32
|
+
- section-level ranking as a primary retrieval contract
|
|
33
|
+
- qmd, MCP, embedding, or hosted adapters
|
|
34
|
+
- answer synthesis / `query` / file-back workflows
|
|
35
|
+
|
|
36
|
+
## Architecture
|
|
37
|
+
|
|
38
|
+
```mermaid
flowchart TD
    Wiki["Wiki pages (.llmwiki/wiki)"] --> Indexer["Index builder"]
    Cards["Source + doc cards (.llmwiki/run)"] --> Indexer
    Indexer --> Index["Local search index (.llmwiki/search/)"]
    Index --> Simple["Built-in ranked text search"]
    Index --> Adapter["Optional adapter layer"]
    Adapter --> qmd["qmd backend (optional)"]
    Adapter --> MCP["MCP endpoint (optional)"]
    Simple --> Results["Ranked results<br/>(page, section, source paths)"]
    qmd --> Results
    MCP --> Results
```
|
|
51
|
+
|
|
52
|
+
```mermaid
flowchart LR
    subgraph Index entry
        Title["page title"]
        Category["page category"]
        Summary["one-line summary"]
        Body["searchable body text"]
        Sources["source paths"]
        Commit["source_commit"]
    end
```
|
|
63
|
+
|
|
64
|
+
## Key Deliverables
|
|
65
|
+
|
|
66
|
+
- Local index builder that runs after `compile` and indexes compiled wiki pages.
|
|
67
|
+
- Built-in ranked text search with no external dependencies.
|
|
68
|
+
- Index stored under `.llmwiki/search/` alongside run artifacts.
|
|
69
|
+
- `repo-wiki search <query>` CLI command that returns ranked results with page title, category/kind, snippet, graph context, and source paths.
|
|
70
|
+
- Deterministic full rebuilds that are compatible with later incremental indexing.
|
|
71
|
+
- Index entries include page metadata: `kind`, `source_commit`, `source_paths`, `page_state`, and internal-link adjacency.
|
|
72
|
+
|
|
73
|
+
## Success Criteria
|
|
74
|
+
|
|
75
|
+
- `repo-wiki search "query"` returns ranked results without network access.
|
|
76
|
+
- Search results include source paths so callers can drill into evidence.
|
|
77
|
+
- The built-in shipped contract is stable enough for later query/runtime work to consume directly.
|
|
78
|
+
- Index rebuild is fast enough to run after every compile in a typical repository.
|
|
79
|
+
|
|
80
|
+
## Acceptance Criteria (from PLAN.md)
|
|
81
|
+
|
|
82
|
+
- `repo-wiki search "query"` returns ranked wiki pages and evidence paths.
|
|
83
|
+
- Search can run without external services.
|
|
84
|
+
- Optional provider integrations do not change core scan/compile behavior.
|
|
85
|
+
|
|
86
|
+
## Index Format
|
|
87
|
+
|
|
88
|
+
```json
{
  "version": 1,
  "wikiDir": "/repo/.llmwiki/wiki",
  "sourceCommits": ["abc1234"],
  "entries": [
    {
      "pagePath": "Architecture.md",
      "title": "Architecture",
      "kind": "foundation",
      "pageState": "generated",
      "summary": "High-level system design and data flow.",
      "sourcePaths": ["src/compiler.ts", "src/scanner.ts"],
      "outboundLinks": ["Module-scanner-ts.md"],
      "inboundLinks": [],
      "searchText": "architecture compiler scanner planner wiki"
    }
  ]
}
```
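
One way to keep full rebuilds deterministic while filling in `inboundLinks` is to derive inbound adjacency from every page's `outboundLinks` and then sort everything by path. The sketch below assumes that approach; `PageInput`, `LinkedEntry`, and `buildEntries` are hypothetical names, and only the link-related fields from the example above are modelled.

```typescript
// Hypothetical sketch: derives inboundLinks from outboundLinks and sorts
// all arrays so the same set of pages always serializes identically.
interface PageInput {
  pagePath: string;
  title: string;
  outboundLinks: string[];
}

interface LinkedEntry extends PageInput {
  inboundLinks: string[];
}

function byPath(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

function buildEntries(pages: PageInput[]): LinkedEntry[] {
  // Start every known page with an empty inbound list.
  const inbound = new Map<string, string[]>();
  for (const page of pages) inbound.set(page.pagePath, []);

  for (const page of pages) {
    for (const target of page.outboundLinks) {
      const sources = inbound.get(target);
      // Links to pages outside the index are skipped (an assumption).
      if (sources && sources.indexOf(page.pagePath) === -1) {
        sources.push(page.pagePath);
      }
    }
  }

  return pages
    .map((page) => ({
      ...page,
      outboundLinks: page.outboundLinks.slice().sort(byPath),
      inboundLinks: (inbound.get(page.pagePath) ?? []).slice().sort(byPath),
    }))
    .sort((a, b) => byPath(a.pagePath, b.pagePath));
}
```

Because the output order does not depend on the order pages are discovered in, two rebuilds of the same wiki should serialize to the same JSON, which is a convenient property if incremental indexing later needs to diff index files.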

## Dependencies

- Upstream: wiki compiler (produces pages to index), scanner (provides source card metadata).
- Downstream: query and file-back (uses search index for candidate routing).

## Open Questions

- When incremental mode becomes diff-minimal, what is the narrowest safe page-level reindex contract?
- Should section-level ranking be added on top of the page-first contract, or remain a downstream concern for `query`?
- If optional adapters arrive later, should qmd or MCP be the first external backend to implement?

@@ -0,0 +1,74 @@

# Epic: Trust Hardening

## Summary

Harden every surface where generated, scanned, or published content could leak secrets, reflect stale source truth, bypass severity policy, or let untrusted LLM patches reach the filesystem or a remote. This epic covers redaction, config schema validation, hash coverage, docs-lint blocking in the publish path, safe stale-page deletion, and publisher safety controls.

## Architecture

```mermaid
flowchart TD
    Scan["Scanner output"] --> Redact["Secret redaction"]
    Redact --> Cards["Cards (no secrets)"]
    Cards --> DocsLint["Docs linter"]
    DocsLint --> Gate{"Error-level issues?"}
    Gate -->|yes, configured to fail| Stop["Halt: do not compile/publish"]
    Gate -->|no| Compiler["Compiler"]
    Compiler --> PatchGate["Structured patch gate"]
    PatchGate --> WikiPages["Wiki pages"]
    WikiPages --> WikiLint["Wiki linter + health"]
    WikiLint --> Publisher["Publisher"]
    Publisher --> StaleDelete["Safe stale-page deletion<br/>(preserve human-owned + unmanaged)"]
    StaleDelete --> Remote["Remote target"]
```

## Key Deliverables

- **Docs-lint blocking**: make `repo-wiki run` fail before compile/publish when docs-lint reports error-level issues, controlled by config.
- **Secret redaction**: redact known secret-like patterns from manifests, documentation cards, page contexts, log entries, and generated pages before writing.
- **Config schema validation**: JSON schema for `.llmwiki/config.json` with clear validation errors on startup.
- **Hash coverage**: every source card has a stable content hash or an explicit hash-failure reason recorded.
- **Safe stale-page deletion**: publisher removes generated pages for deleted/renamed sources while preserving unmanaged and human-owned pages; ownership metadata from `page-ownership.ts` gates deletion.
- **Severity policy**: lint severities for all gates are config-driven; no hardcoded unconditional failures except secret-like content.
- **Sanitized remotes**: all remotes and URLs are validated and sanitized before display, logging, or writing.
- **End-to-end fixture**: golden test covering `init → scan → plan → lint-docs → compile → lint → publish --dry-run`.
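
To make the secret-redaction deliverable concrete, here is a minimal sketch of pattern-based redaction applied to text before it is written. The pattern list and the `redactSecrets` helper are hypothetical and deliberately small. Replacing matches with `[REDACTED]` is also an assumption, since the Open Questions below leave that choice undecided.

```typescript
// Hypothetical pattern list; the canonical list is an open question in this epic.
const SECRET_PATTERNS: RegExp[] = [
  /AKIA[0-9A-Z]{16}/g, // shape of an AWS access key ID
  /ghp_[A-Za-z0-9]{36}/g, // shape of a GitHub personal access token
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/g, // PEM private key header
];

interface RedactionResult {
  text: string; // input with every match replaced
  count: number; // number of replacements made
}

function redactSecrets(input: string): RedactionResult {
  let count = 0;
  let text = input;
  for (const pattern of SECRET_PATTERNS) {
    // The replacer callback runs once per match, so it doubles as a counter.
    text = text.replace(pattern, () => {
      count += 1;
      return "[REDACTED]";
    });
  }
  return { text, count };
}
```

Returning the match count alongside the redacted text lets a caller treat any nonzero count as a secret-like-content finding, which this epic says always blocks.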

## Success Criteria

- Error-level docs-lint failures block compile and publish when configured.
- No scan artifact or generated page contains known secret-like test patterns after redaction.
- Publisher does not delete human-owned or unmanaged pages during stale cleanup.
- `.llmwiki/config.json` schema validation surfaces clear errors on first use.
- Every source card in the manifest has a `hash` field or a `hashError` field.
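
The `hash` or `hashError` criterion can be modelled as a union type, so a card with neither field cannot be constructed. The sketch below is an assumption about shape only: `SourceCard`, `toSourceCard`, and the injected `read` and `digest` callbacks are hypothetical, and the bundled `fnv1a` function is a toy stand-in for a real content hash such as SHA-256.

```typescript
// Hypothetical shape: exactly one of `hash` or `hashError` is present.
type SourceCard =
  | { path: string; hash: string }
  | { path: string; hashError: string };

// Toy 32-bit FNV-1a digest, used here only to keep the sketch free of imports.
// A real scanner would presumably use a cryptographic hash instead.
function fnv1a(content: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < content.length; i++) {
    h ^= content.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

function toSourceCard(
  path: string,
  read: (path: string) => string,
  digest: (content: string) => string = fnv1a,
): SourceCard {
  try {
    return { path, hash: digest(read(path)) };
  } catch (error) {
    // Record why hashing failed instead of silently dropping the card.
    const reason = error instanceof Error ? error.message : String(error);
    return { path, hashError: reason };
  }
}
```

Passing the file reader in as a callback keeps the sketch testable without touching the filesystem; a failing read yields a card with `hashError` rather than an exception.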

## Acceptance Criteria (from PLAN.md)

- Error-level docs lint failures can block run/publish according to config.
- Scan output respects configured source filtering, including remaining `source.include` and nested-worktree edge cases.
- Every source card has a stable hash or an explicit hash failure reason.
- No scan artifact or generated page contains known secret-like patterns from fixtures.
- Publisher removes stale generated pages without touching unmanaged or human-owned pages.
- `npm test`, `npm run check`, `npm run coverage` all pass.
- End-to-end fixture: `init → scan → plan → lint-docs → compile → lint → publish --dry-run`.
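
The stale-page criterion reduces to a filter: a published page is a deletion candidate only if it is generated and no longer produced by the current compile. The sketch below assumes a three-value ownership label; the `Ownership` type, `PublishedPage`, and `selectStalePages` are hypothetical and do not reflect the actual contents of `page-ownership.ts`.

```typescript
// Hypothetical ownership labels; the real metadata lives in page-ownership.ts.
type Ownership = "generated" | "human" | "unmanaged";

interface PublishedPage {
  pagePath: string;
  ownership: Ownership;
}

// Returns the pages that are safe to delete during stale cleanup.
function selectStalePages(
  published: PublishedPage[],
  currentPaths: string[],
): string[] {
  const current = new Set(currentPaths);
  return published
    .filter((page) => page.ownership === "generated" && !current.has(page.pagePath))
    .map((page) => page.pagePath)
    .sort();
}
```

Human-owned and unmanaged pages never pass the filter, whether or not they appear in the current compile output, so the preservation guarantee does not depend on the compile result being complete.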

## Severity Defaults

| Gate | Default |
|---|---|
| Docs-lint error-level blocking | configurable (warn by default, fail when `fail_on_error: true`) |
| Secret-like content | error (always blocks) |
| Stale docs | warning |
| Contradicted docs | error |
| Broken relative links | warning |
| Missing source commit | warning |
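
Read as code, the defaults above might translate into a lookup table plus a single blocking rule. The sketch below is one possible interpretation rather than the shipped policy: the gate identifiers, the optional `severity` override map, and `shouldBlock` are hypothetical, and only `fail_on_error` comes from the table.

```typescript
type Severity = "warning" | "error";

// Hypothetical gate identifiers mapped to the defaults from the table.
const DEFAULT_SEVERITY: Record<string, Severity> = {
  "secret-like-content": "error",
  "stale-docs": "warning",
  "contradicted-docs": "error",
  "broken-relative-links": "warning",
  "missing-source-commit": "warning",
};

interface GateConfig {
  fail_on_error: boolean; // named in the table above
  severity?: Record<string, Severity>; // hypothetical per-gate overrides
}

// Decides whether the triggered gates should halt the run.
function shouldBlock(triggered: string[], config: GateConfig): boolean {
  const severities = { ...DEFAULT_SEVERITY, ...(config.severity ?? {}) };
  return triggered.some((gate) => {
    // Secret-like content is the one unconditional failure.
    if (gate === "secret-like-content") return true;
    // Every other gate blocks only at error level and only when configured to fail.
    return severities[gate] === "error" && config.fail_on_error;
  });
}
```

With `fail_on_error` left at `false`, a contradicted-docs finding is reported but does not halt the run, while a secret-like finding halts it regardless of configuration.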

## Dependencies

- Upstream: docs linter, wiki linter, scanner, publisher.
- Downstream: all epics. Trust hardening is a prerequisite for safe LLM mode, incremental maintenance, and any publish to a public remote.

## Open Questions

- Which secret pattern list should be the canonical reference: `src/secret-patterns.ts` or an external policy file?
- Should redaction replace matched text with `[REDACTED]` or remove the containing field entirely?
- Should config schema validation block `repo-wiki run` or only `repo-wiki publish`?