@jafreck/lore 0.1.0 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (93)
  1. package/README.md +345 -196
  2. package/dist/cli.js +133 -12
  3. package/dist/cli.js.map +1 -1
  4. package/dist/index.d.ts +3 -0
  5. package/dist/index.d.ts.map +1 -1
  6. package/dist/index.js.map +1 -1
  7. package/dist/indexer/db.d.ts.map +1 -1
  8. package/dist/indexer/db.js +96 -2
  9. package/dist/indexer/db.js.map +1 -1
  10. package/dist/indexer/docs.d.ts +42 -0
  11. package/dist/indexer/docs.d.ts.map +1 -0
  12. package/dist/indexer/docs.js +214 -0
  13. package/dist/indexer/docs.js.map +1 -0
  14. package/dist/indexer/embedder.d.ts +7 -0
  15. package/dist/indexer/embedder.d.ts.map +1 -1
  16. package/dist/indexer/embedder.js +10 -0
  17. package/dist/indexer/embedder.js.map +1 -1
  18. package/dist/indexer/extractors/types.d.ts +22 -0
  19. package/dist/indexer/extractors/types.d.ts.map +1 -1
  20. package/dist/indexer/extractors/types.js +12 -0
  21. package/dist/indexer/extractors/types.js.map +1 -1
  22. package/dist/indexer/extractors/typescript.d.ts +1 -1
  23. package/dist/indexer/extractors/typescript.d.ts.map +1 -1
  24. package/dist/indexer/extractors/typescript.js +38 -8
  25. package/dist/indexer/extractors/typescript.js.map +1 -1
  26. package/dist/indexer/git-hooks.d.ts +1 -0
  27. package/dist/indexer/git-hooks.d.ts.map +1 -1
  28. package/dist/indexer/git-hooks.js +3 -2
  29. package/dist/indexer/git-hooks.js.map +1 -1
  30. package/dist/indexer/index.d.ts +32 -7
  31. package/dist/indexer/index.d.ts.map +1 -1
  32. package/dist/indexer/index.js +427 -15
  33. package/dist/indexer/index.js.map +1 -1
  34. package/dist/indexer/lsp/client.d.ts +61 -0
  35. package/dist/indexer/lsp/client.d.ts.map +1 -0
  36. package/dist/indexer/lsp/client.js +217 -0
  37. package/dist/indexer/lsp/client.js.map +1 -0
  38. package/dist/indexer/lsp/config.d.ts +16 -0
  39. package/dist/indexer/lsp/config.d.ts.map +1 -0
  40. package/dist/indexer/lsp/config.js +78 -0
  41. package/dist/indexer/lsp/config.js.map +1 -0
  42. package/dist/indexer/lsp/enrichment.d.ts +55 -0
  43. package/dist/indexer/lsp/enrichment.d.ts.map +1 -0
  44. package/dist/indexer/lsp/enrichment.js +211 -0
  45. package/dist/indexer/lsp/enrichment.js.map +1 -0
  46. package/dist/indexer/lsp/registry.d.ts +19 -0
  47. package/dist/indexer/lsp/registry.d.ts.map +1 -0
  48. package/dist/indexer/lsp/registry.js +118 -0
  49. package/dist/indexer/lsp/registry.js.map +1 -0
  50. package/dist/indexer/parser.d.ts +7 -0
  51. package/dist/indexer/parser.d.ts.map +1 -1
  52. package/dist/indexer/parser.js +3 -1
  53. package/dist/indexer/parser.js.map +1 -1
  54. package/dist/indexer/poller.d.ts +7 -0
  55. package/dist/indexer/poller.d.ts.map +1 -1
  56. package/dist/indexer/poller.js +6 -0
  57. package/dist/indexer/poller.js.map +1 -1
  58. package/dist/indexer/walker.d.ts +22 -0
  59. package/dist/indexer/walker.d.ts.map +1 -1
  60. package/dist/indexer/walker.js +15 -0
  61. package/dist/indexer/walker.js.map +1 -1
  62. package/dist/indexer/watcher.d.ts +7 -0
  63. package/dist/indexer/watcher.d.ts.map +1 -1
  64. package/dist/indexer/watcher.js +6 -0
  65. package/dist/indexer/watcher.js.map +1 -1
  66. package/dist/kb-server/db.d.ts +82 -0
  67. package/dist/kb-server/db.d.ts.map +1 -1
  68. package/dist/kb-server/db.js +239 -0
  69. package/dist/kb-server/db.js.map +1 -1
  70. package/dist/kb-server/server.d.ts +11 -3
  71. package/dist/kb-server/server.d.ts.map +1 -1
  72. package/dist/kb-server/server.js +28 -6
  73. package/dist/kb-server/server.js.map +1 -1
  74. package/dist/kb-server/tools/architecture.d.ts +7 -0
  75. package/dist/kb-server/tools/architecture.d.ts.map +1 -1
  76. package/dist/kb-server/tools/architecture.js +35 -0
  77. package/dist/kb-server/tools/architecture.js.map +1 -1
  78. package/dist/kb-server/tools/docs.d.ts +78 -0
  79. package/dist/kb-server/tools/docs.d.ts.map +1 -0
  80. package/dist/kb-server/tools/docs.js +136 -0
  81. package/dist/kb-server/tools/docs.js.map +1 -0
  82. package/dist/kb-server/tools/lookup.d.ts.map +1 -1
  83. package/dist/kb-server/tools/lookup.js +5 -4
  84. package/dist/kb-server/tools/lookup.js.map +1 -1
  85. package/dist/kb-server/tools/notes.d.ts +1 -1
  86. package/dist/kb-server/tools/notes.d.ts.map +1 -1
  87. package/dist/kb-server/tools/notes.js +35 -0
  88. package/dist/kb-server/tools/notes.js.map +1 -1
  89. package/dist/kb-server/tools/search.d.ts +54 -11
  90. package/dist/kb-server/tools/search.d.ts.map +1 -1
  91. package/dist/kb-server/tools/search.js +138 -34
  92. package/dist/kb-server/tools/search.js.map +1 -1
  93. package/package.json +16 -7
package/README.md CHANGED
@@ -1,20 +1,25 @@
1
1
  # Lore
2
2
 
3
3
  [![CI](https://github.com/jafreck/Lore/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/jafreck/Lore/actions/workflows/ci.yml)
4
+ [![npm version](https://img.shields.io/npm/v/@jafreck/lore)](https://www.npmjs.com/package/@jafreck/lore)
4
5
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
5
6
  [![Node.js](https://img.shields.io/badge/node-%3E%3D22.0.0-brightgreen)](https://nodejs.org)
6
7
  [![TypeScript](https://img.shields.io/badge/TypeScript-5.9+-blue)](https://www.typescriptlang.org)
7
8
 
8
- **The teammate that has seen it all** Lore is your agent's institutional knowledge over the codebase — it knows what was built, why it changed, and how it all connects. Lore indexes your code and git history into a structured knowledge base that agents query through MCP. It maps symbols, imports, call relationships, and git history — with optional embeddings for semantic search — so agents can reason about your codebase
9
+ **The teammate that has seen it all**
10
+
11
+ Lore is your agent's institutional knowledge over the codebase — it knows what was built, why it changed, and how it all connects. Lore indexes your code and git history into a structured knowledge base that agents query through MCP. It maps symbols, imports, call relationships, and git history — with optional embeddings for semantic search — so agents can reason about your codebase
9
12
  without re-reading it from scratch.
10
13
 
11
14
  ## What Lore does
12
15
 
13
16
  - Parses source files and extracts symbols, imports, and call refs
14
17
  - Resolves internal vs external imports and builds call/import graph edges
18
+ - Discovers and indexes documentation (`.md`, `.rst`, `.adoc`, `.txt`) with inferred kinds/titles
15
19
  - Stores everything in a normalized SQL schema with optional vector search
16
- - Enables RAG-style retrieval with semantic/fused search
20
+ - Enables RAG-style retrieval with semantic/fused search across symbols and doc sections
17
21
  - Indexes git history (commits, touched files, refs/branches/tags)
22
+ - Enriches symbols with resolved type signatures and definitions via optional index-time LSP integration
18
23
  - Supports line-level git blame through MCP
19
24
  - Supports automatic refresh via watch mode, poll mode, and git hooks
20
25
 
@@ -24,70 +29,72 @@ without re-reading it from scratch.
24
29
  flowchart LR
25
30
  subgraph Codebase
26
31
  SRC[Source Files]
32
+ DOCS[Documentation<br/>md · rst · adoc · txt]
27
33
  GIT[Git Repo]
28
34
  COV[Coverage Reports]
29
35
  end
30
36
 
31
37
  subgraph Lore Indexer
32
- WALK[Walker]
33
- PARSE[Parser]
34
- EXTRACT[Extractors<br/>symbols · imports · call refs]
35
- RESOLVE[Import Resolver<br/>internal external]
36
- CALLGRAPH[Call-Graph Builder]
37
- EMBED[Embedder]
38
+ WALK[Walker] --> PARSE[Parser] --> EXTRACT[Extractors<br/>symbols · imports · call refs]
39
+ EXTRACT --> RESOLVE[Import Resolver<br/>internal ↔ external]
40
+ EXTRACT --> CALLGRAPH[Call-Graph Builder]
41
+ EXTRACT -.-> LSPENRICH[LSP Enrichment<br/>type signatures · definition locations]
42
+ DOCSINGEST[Docs Ingest<br/>sections · headings · notes]
38
43
  GITHIST[Git History Ingest<br/>commits · diffs · refs]
39
44
  COVINGEST[Coverage Ingest<br/>lcov · cobertura]
40
45
  end
41
46
 
42
47
  DB[(SQL DB)]
48
+ EMBED([Embedding Model])
43
49
 
44
50
  subgraph MCP Server
45
- LOOKUP[kb_lookup]
46
- SEARCH[kb_search]
47
- GRAPH[kb_graph]
48
- SNIPPET[kb_snippet]
49
- BLAME[kb_blame]
50
- HISTORY[kb_history]
51
- METRICS[kb_metrics]
52
- WRITEBACK[kb_writeback]
53
- end
54
-
55
- subgraph LLM_AGENTS[Agents]
56
- CLAUDE[Claude]
57
- COPILOT[GitHub Copilot]
58
- CUSTOM_AGENT[Custom Agents]
59
- CLAUDE ~~~ COPILOT ~~~ CUSTOM_AGENT
51
+ LOOKUP[lore_lookup]
52
+ SEARCH[lore_search]
53
+ DOCS_TOOL[lore_docs]
54
+ GRAPH[lore_graph]
55
+ TESTMAP[lore_test_map]
56
+ SNIPPET[lore_snippet]
57
+ BLAME[lore_blame]
58
+ HISTORY[lore_history]
59
+ COMMITSTATS[lore_commit_stats]
60
+ METRICS[lore_metrics]
61
+ COVERAGE[lore_coverage]
62
+ WRITEBACK[lore_writeback]
60
63
  end
61
64
 
62
- subgraph ENTRY[User Entrypoints]
63
- VSCODE[VS Code]
65
+ subgraph MCP_CLIENTS[MCP Clients — Agents]
66
+ CLAUDE_CODE[Claude Code / Desktop]
67
+ COPILOT[VS Code + Copilot]
64
68
  CURSOR[Cursor]
65
- CHAT[Chat UI]
66
- ORCH[Agent Frameworks]
67
- VSCODE ~~~ CURSOR ~~~ CHAT ~~~ ORCH
69
+ CUSTOM[Custom Agent Frameworks]
70
+ CLAUDE_CODE ~~~ COPILOT ~~~ CURSOR ~~~ CUSTOM
68
71
  end
69
72
 
70
- SRC --> WALK --> PARSE --> EXTRACT
71
- EXTRACT --> RESOLVE & CALLGRAPH
72
- EXTRACT & RESOLVE & CALLGRAPH --> DB
73
- EMBED -.->|optional| DB
73
+ SRC --> WALK
74
+ DOCS --> DOCSINGEST --> DB
74
75
  GIT --> GITHIST --> DB
75
76
  COV --> COVINGEST --> DB
76
77
 
77
- DB --- LOOKUP & SEARCH & GRAPH & SNIPPET & BLAME & HISTORY & METRICS & WRITEBACK
78
+ RESOLVE & CALLGRAPH --> DB
79
+ LSPENRICH -.->|optional| DB
80
+ RESOLVE -.->|optional| EMBED
81
+ EMBED -.-> DB
78
82
 
79
- LOOKUP & SEARCH & GRAPH & SNIPPET & BLAME & HISTORY & METRICS & WRITEBACK <--> LLM_AGENTS
83
+ DB --- LOOKUP & SEARCH & DOCS_TOOL & GRAPH & TESTMAP & SNIPPET & BLAME & HISTORY & COMMITSTATS & METRICS & COVERAGE & WRITEBACK
84
+ EMBED <-.->|semantic/fused| SEARCH
80
85
 
81
- LLM_AGENTS <--- ENTRY
86
+ LOOKUP & SEARCH & DOCS_TOOL & GRAPH & TESTMAP & SNIPPET & BLAME & HISTORY & COMMITSTATS & METRICS & COVERAGE & WRITEBACK <--> MCP_CLIENTS
82
87
  ```
83
88
 
84
89
  Lore sits between your codebase and any LLM-powered tool. The **indexer**
85
90
  pipeline walks source files, parses them into ASTs, and extracts
86
91
  symbols/imports/call-refs via language-specific extractors, then resolves
87
92
  imports (internal vs external) and builds the call graph. An optional
88
- **embedder** generates dense vectors for semantic search, and a parallel
89
- **git history** ingest captures commits, diffs, and refs. Everything is
90
- persisted to a normalized SQL database. The **MCP server** then exposes that
93
+ **LSP enrichment** pass queries language servers to resolve type signatures
94
+ and jump-to-definition URIs for extracted symbols. An optional **embedder**
95
+ generates dense vectors for semantic search, and a parallel **git history**
96
+ ingest captures commits, diffs, and refs. Everything is persisted to a
97
+ normalized SQL database. The **MCP server** then exposes that
91
98
  database as a set of tools that any MCP-compatible client can call to look up
92
99
  symbols, search code, traverse call graphs, read snippets, query
93
100
  blame/history, and write summaries back.
@@ -120,44 +127,6 @@ npm install @jafreck/lore
120
127
  Note: Lore uses native add-ons (`tree-sitter`, `better-sqlite3`). A working
121
128
  C/C++ toolchain is required the first time dependencies are built.
122
129
 
123
- ## Publish authentication (npm)
124
-
125
- Lore publish operations use `NODE_AUTH_TOKEN` (see `.npmrc`) and never commit
126
- tokens to the repository.
127
-
128
- Local publish flow:
129
-
130
- ```bash
131
- export NODE_AUTH_TOKEN=<npm automation token>
132
- npm publish --access public
133
- ```
134
-
135
- CI publish flow:
136
-
137
- - Add `NODE_AUTH_TOKEN` as a secret in your CI provider (for GitHub Actions,
138
- use a repository or environment secret).
139
- - Ensure publish jobs expose that secret as the `NODE_AUTH_TOKEN` environment
140
- variable before running `npm publish`.
141
-
142
- ## Release publish workflow (`@jafreck/lore@0.1.0`)
143
-
144
- Publishing is automated by `.github/workflows/publish.yml`. Creating a version
145
- tag (for example, `v0.1.0`) or publishing a GitHub Release triggers the npm
146
- publish job.
147
-
148
- Release steps for `@jafreck/lore@0.1.0`:
149
-
150
- 1. Ensure `package.json` has `"version": "0.1.0"`.
151
- 2. Push the tag: `git tag v0.1.0 && git push origin v0.1.0` (or publish a
152
- GitHub Release for `v0.1.0`).
153
- 3. Confirm the workflow logs show `npm publish --dry-run` output before the
154
- live `npm publish` step.
155
-
156
- Post-publish verification:
157
-
158
- - Check the package metadata: `npm view @jafreck/lore version` returns `0.1.0`.
159
- - Confirm installability: `npm view @jafreck/lore@0.1.0 name version`.
160
-
161
130
  ## Quick start (CLI)
162
131
 
163
132
  ```bash
@@ -183,196 +152,318 @@ const builder = new IndexBuilder(
183
152
  await builder.build();
184
153
  ```
185
154
 
186
- ### Programmatic configuration examples
155
+ ## MCP tools
187
156
 
188
- ```ts
189
- import { IndexBuilder } from '@jafreck/lore';
157
+ | Tool | Purpose |
158
+ |------|---------|
159
+ | `lore_lookup` | Find symbols by name or files by path, including external dependency API symbols and LSP-resolved metadata when available |
160
+ | `lore_search` | Structural BM25, semantic vector, or fused RRF search across symbols and doc sections |
161
+ | `lore_docs` | List, fetch, or search indexed documentation with branch, kind, and path filters |
162
+ | `lore_graph` | Query call/import/module/inheritance edges; call edges include `callee_coverage_percent` |
163
+ | `lore_snippet` | Return source snippets by file path and line range |
164
+ | `lore_test_map` | Return mapped test files (with confidence) for a given source file path |
165
+ | `lore_blame` | Return git blame metadata for a line or line range |
166
+ | `lore_history` | Query commit history by file, commit, author, ref, or recency |
167
+ | `lore_commit_stats` | Git commit analytics: cadence, size, churn, top authors, message patterns, schedule heatmaps, branch activity |
168
+ | `lore_metrics` | Aggregate index metrics plus coverage/staleness fields |
169
+ | `lore_coverage` | Symbol-level coverage, uncovered lines, and staleness metadata |
170
+ | `lore_writeback` | Persist agent-authored symbol summaries |
190
171
 
191
- // Index with embedding model + history options
192
- await new IndexBuilder(
193
- './kb.db',
194
- {
195
- rootDir: './my-project',
196
- includeGlobs: ['src/**'],
197
- excludeGlobs: ['**/*.gen.ts'],
198
- extensions: ['.ts', '.tsx'],
199
- },
200
- undefined,
201
- {
202
- embeddingModel: 'Qwen/Qwen3-Embedding-4B',
203
- history: { all: true, depth: 2000 },
204
- },
205
- ).build();
172
+ ### MCP config example
173
+
174
+ ```json
175
+ {
176
+ "mcpServers": {
177
+ "lore": {
178
+ "command": "npx",
179
+ "args": ["@jafreck/lore", "mcp", "--db", "/path/to/kb.db"]
180
+ }
181
+ }
182
+ }
206
183
  ```
207
184
 
208
- ## CLI reference
185
+ ### lore_docs examples
209
186
 
210
- ### lore index
187
+ ```json
188
+ { "action": "list", "branch": "main", "kinds": ["readme", "architecture"] }
189
+ { "action": "get", "path": "/repo/docs/architecture.md", "branch": "main", "include_sections": true }
190
+ { "action": "search", "query": "incremental refresh", "kinds": ["guide", "architecture"], "limit": 10 }
191
+ ```
211
192
 
212
- Build or update a knowledge base.
193
+ ### lore_history modes
213
194
 
214
- ```bash
215
- npx @jafreck/lore index --root <dir> --db <path> [--embedding-model <id>] [--history] [--history-depth <n>] [--history-all] [--include <glob>] [--exclude <glob>] [--language <lang>]
195
+ | Mode | Query |
196
+ |------|-------|
197
+ | `recent` | Newest commits |
198
+ | `file` | Commits that touched a path |
199
+ | `commit` | Full/prefix SHA lookup (+files +refs) |
200
+ | `author` | Commits by author/email substring |
201
+ | `ref` | Commits matching branch/tag ref name |
202
+
203
+ ### lore_blame examples
204
+
205
+ ```json
206
+ { "path": "/repo/src/index.ts", "line": 120 }
207
+ { "path": "/repo/src/index.ts", "start_line": 120, "end_line": 140 }
208
+ { "path": "/repo/src/index.ts", "line": 120, "ref": "main" }
216
209
  ```
217
210
 
218
- Key flags:
211
+ ## Data ingestion
219
212
 
220
- - `--root <dir>` required source root
221
- - `--db <path>` required SQLite output path
222
- - `--embedding-model <id>` embedding model identifier
223
- - `--history` enable git history ingestion
224
- - `--history-depth <n>` cap number of ingested commits
225
- - `--history-all` traverse all refs (branches/tags)
226
- - `--include` repeatable glob include filter
227
- - `--exclude` repeatable glob exclude filter
228
- - `--language` repeatable language filter (mapped to extensions)
213
+ Lore indexes multiple data sources into a normalized SQLite schema. Each source
214
+ has its own ingestion pipeline and can be enabled independently.
229
215
 
230
- ### lore refresh
216
+ ### Source code
231
217
 
232
- Incremental refresh flow for an existing index.
218
+ The indexer walks source files, parses them into ASTs via tree-sitter, and
219
+ extracts symbols, imports, and call references through language-specific
220
+ extractors. The import resolver classifies each import as internal or external,
221
+ and a call-graph builder creates edges between symbols.
233
222
 
234
- ```bash
235
- npx @jafreck/lore refresh --db <path> --root <dir> [--history] [--history-depth <n>] [--history-all]
236
- npx @jafreck/lore refresh --db <path> --root <dir> --watch [--history]
237
- npx @jafreck/lore refresh --db <path> --root <dir> --poll [--history]
223
+ Programmatic example:
224
+
225
+ ```ts
226
+ import { IndexBuilder } from '@jafreck/lore';
227
+
228
+ await new IndexBuilder('./kb.db', {
229
+ rootDir: './my-project',
230
+ includeGlobs: ['src/**'],
231
+ excludeGlobs: ['**/*.gen.ts'],
232
+ extensions: ['.ts', '.tsx'],
233
+ }).build();
238
234
  ```
239
235
 
240
- Modes:
236
+ ### Documentation
241
237
 
242
- - Manual: one-shot incremental refresh and exit
243
- - Watch: filesystem event driven (`fs.watch`), low latency
244
- - Poll: periodic mtime diffing, most reliable across filesystems
238
+ Lore discovers and indexes documentation files (`.md`, `.rst`, `.adoc`, `.txt`)
239
+ during both `index` and `refresh` flows. By default it scans:
245
240
 
246
- Coverage reports are auto-detected during build/update/refresh from known paths (`coverage/lcov.info`, `coverage/cobertura-coverage.xml`, `coverage.xml`) and only ingested when newer than the last stored coverage run.
241
+ - `README*` variants
242
+ - `docs/**/*.{md,rst,adoc,txt}`
243
+ - ADR-style paths (`**/{adr,adrs,ADR,ADRS}/**/*` and `**/{ADR,adr}-*`)
244
+ - Top-level architecture/design/overview/changelog/guide files
247
245
 
248
- ### lore hooks
246
+ Indexed docs are stored per `(path, branch)` in `docs`, with heading-based
247
+ chunks in `doc_sections`. When embeddings are enabled, section vectors are stored
248
+ in `doc_section_embeddings`.
249
249
 
250
- Install repo-local git hooks that trigger Lore refresh automatically on:
250
+ CLI discovery controls:
251
251
 
252
- - `post-commit`
253
- - `post-merge`
254
- - `post-checkout`
255
- - `post-rewrite`
252
+ - `--docs-include <glob>` / `--docs-exclude <glob>` — repeatable include/exclude filters
253
+ - `--docs-extension <ext>` — repeatable extension filter (e.g. `.md`)
254
+ - `--docs-auto-notes` / `--no-docs-auto-notes` — toggle seeded doc-note upserts (default: enabled)
256
255
 
257
- ```bash
258
- npx @jafreck/lore hooks --root <repo> --db <path>
259
- npx @jafreck/lore hooks --root <repo> --db <path> --history
256
+ When auto-notes are enabled, Lore seeds `notes` rows for README, architecture,
257
+ and ADR docs using deterministic keys. Each note tracks a `source_hash` for
258
+ staleness detection: `lore_notes_read` reports doc-scoped notes as stale when
259
+ the backing document changes or disappears.
260
+
261
+ Programmatic example:
262
+
263
+ ```ts
264
+ await new IndexBuilder('./kb.db', {
265
+ rootDir: './my-project',
266
+ docsIncludeGlobs: ['**/README*', 'handbook/**/*.rst'],
267
+ docsExcludeGlobs: ['**/docs/private/**'],
268
+ docsExtensions: ['.md', '.rst'],
269
+ }).build();
260
270
  ```
261
271
 
262
- Note: for `lore hooks`, any history-related flag currently enables history in
263
- hook-triggered refreshes.
272
+ ### Git history
264
273
 
265
- ### lore ingest-coverage
274
+ Lore ingests commits, touched files (with change type and diff stats), and
275
+ refs (branches/tags). Enable with `--history`; use `--history-all` to traverse
276
+ all refs and `--history-depth <n>` to cap the number of commits.
266
277
 
267
- Manually ingest an explicit coverage report (useful for CI or non-standard report locations).
278
+ Indexed tables:
268
279
 
269
- ```bash
270
- npx @jafreck/lore ingest-coverage --db <path> --root <dir> --file <path> --format <lcov|cobertura> [--commit <sha>]
280
+ - `commits` — sha, author, author_email, timestamp, message, parents
281
+ - `commit_files` — per-commit touched paths with change type and diff stats
282
+ - `commit_refs` — refs currently pointing at commits (`branch`/`tag`/`other`)
283
+
284
+ Programmatic example:
285
+
286
+ ```ts
287
+ await new IndexBuilder('./kb.db', {
288
+ rootDir: './my-project',
289
+ }, undefined, {
290
+ history: { all: true, depth: 2000 },
291
+ }).build();
271
292
  ```
272
293
 
273
- Key flags:
294
+ ### Coverage
274
295
 
275
- - `--db <path>` required SQLite output path
276
- - `--root <dir>` required repository root used to normalize relative coverage paths
277
- - `--file <path>` required coverage report file path
278
- - `--format <lcov|cobertura>` required coverage format
279
- - `--commit <sha>` optional commit override (defaults to `HEAD`)
296
+ Coverage reports are auto-detected during build/update/refresh from known paths
297
+ (`coverage/lcov.info`, `coverage/cobertura-coverage.xml`, `coverage.xml`) and
298
+ only ingested when newer than the last stored coverage run.
280
299
 
281
- ### lore mcp
300
+ For non-standard report locations, use `lore ingest-coverage`:
301
+
302
+ ```bash
303
+ npx @jafreck/lore ingest-coverage --db ./kb.db --root ./my-project \
304
+ --file ./custom/coverage.xml --format cobertura
305
+ ```
282
306
 
283
- Start the built-in MCP server over stdio.
307
+ ### Embeddings
308
+
309
+ Lore optionally generates dense vector embeddings for semantic search using a
310
+ sentence-transformers model. The embedding model is downloaded and managed
311
+ automatically — specify it with `--embedding-model`:
284
312
 
285
313
  ```bash
286
- npx @jafreck/lore mcp --db <path>
314
+ npx @jafreck/lore index --root ./my-project --db ./kb.db \
315
+ --embedding-model 'Qwen/Qwen3-Embedding-4B'
287
316
  ```
288
317
 
289
- If the embedding model cannot initialize at runtime, semantic/fused search
290
- gracefully degrades to structural search.
318
+ At query time, `lore_search` in `semantic` or `fused` mode embeds the query
319
+ and performs cosine similarity against stored vectors. If the model cannot
320
+ initialize, search gracefully degrades to structural BM25.
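The query-time behaviour described above combines two generic techniques: cosine similarity over dense vectors, and reciprocal rank fusion (RRF) for merging a structural ranking with a semantic one. The sketch below illustrates both in isolation. It is not taken from Lore's source; the function names `cosineSimilarity` and `reciprocalRankFusion` and the constant `k = 60` are assumptions chosen for illustration.

```ts
// Illustrative sketch only; names and the k constant are assumptions,
// not Lore's actual implementation.

/** Cosine similarity between two equal-length dense vectors. */
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vector length mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Treat a zero vector as having no similarity to anything.
  return denom === 0 ? 0 : dot / denom;
}

/**
 * Reciprocal rank fusion: each ranked list contributes 1 / (k + rank)
 * to an item's score, where rank is 1-based. Items are returned by
 * descending fused score.
 */
function reciprocalRankFusion(rankings: string[][], k: number = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((x, y) => y[1] - x[1])
    .map((entry) => entry[0]);
}
```

Passing a BM25-ordered list and a vector-ordered list of symbol IDs to `reciprocalRankFusion` yields one merged ordering without having to put the two scoring scales on a common footing, which is the usual reason to pick RRF for fused search.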
291
321
 
292
- ## MCP tools
322
+ ### LSP enrichment
293
323
 
294
- | Tool | Purpose |
295
- |------|---------|
296
- | `kb_lookup` | Find symbols by name or files by path (optional branch filter) |
297
- | `kb_search` | Structural BM25, semantic vector, or fused RRF search |
298
- | `kb_graph` | Query call/import/module/inheritance edges; call edges include `callee_coverage_percent` |
299
- | `kb_snippet` | Return source snippets by file path and line range |
300
- | `kb_blame` | Return git blame metadata for a line or line range |
301
- | `kb_history` | Query history by file, commit, author, ref, or recency |
302
- | `kb_metrics` | Return aggregate index metrics plus coverage/staleness fields (`coverage_available`, `coverage_commit`, `current_commit`, `commits_behind`, `stale`, global coverage totals) |
303
- | `kb_coverage` | Return symbol-level coverage, uncovered lines, and staleness metadata for the latest coverage run |
304
- | `kb_writeback` | Persist symbol summaries into `symbol_summaries` |
324
+ Lore can enrich symbols and call refs with resolved type metadata at index time
325
+ by querying language servers via the Language Server Protocol. Enriched columns:
305
326
 
306
- ### MCP config example
327
+ - `resolved_type_signature`, `resolved_return_type`
328
+ - `definition_uri`, `definition_path`
329
+
330
+ These are persisted in `symbols`, `symbol_refs`, and `external_symbols` tables.
331
+ `lore_lookup` and `lore_search` return them when present. Query handlers stay
332
+ SQLite-only — language servers are never invoked at runtime.
333
+
334
+ LSP precedence:
335
+
336
+ 1. CLI flags (`--lsp` / `--no-lsp`)
337
+ 2. `.lore.config` `lsp.enabled`
338
+ 3. Built-in default (`false`)
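The precedence order above can be sketched as a small resolver. This is a minimal illustration rather than Lore's own code: the `LoreConfigSketch` interface and the `resolveLspEnabled` function are hypothetical names, and only the `lsp.enabled` field from the README is modelled.

```ts
// Hypothetical shape of a parsed `.lore.config`; only `lsp.enabled` is modelled.
interface LoreConfigSketch {
  lsp?: { enabled?: boolean };
}

/**
 * Decide whether LSP enrichment runs.
 * cliFlag is true for `--lsp`, false for `--no-lsp`, and undefined when
 * neither flag was passed on the command line.
 */
function resolveLspEnabled(
  cliFlag: boolean | undefined,
  config: LoreConfigSketch | undefined
): boolean {
  // 1. An explicit CLI flag always wins.
  if (cliFlag !== undefined) {
    return cliFlag;
  }
  // 2. Otherwise fall back to the config file, if it sets a value.
  const fromConfig = config?.lsp?.enabled;
  if (fromConfig !== undefined) {
    return fromConfig;
  }
  // 3. Built-in default.
  return false;
}
```

With this ordering, `--no-lsp` switches enrichment off for a single run even when the config file enables it.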
339
+
340
+ `.lore.config` example:
307
341
 
308
342
  ```json
309
343
  {
310
- "mcpServers": {
311
- "lore": {
312
- "command": "npx",
313
- "args": ["@jafreck/lore", "mcp", "--db", "/path/to/kb.db"]
344
+ "lsp": {
345
+ "enabled": true,
346
+ "timeoutMs": 5000,
347
+ "servers": {
348
+ "typescript": { "command": "typescript-language-server", "args": ["--stdio"] },
349
+ "python": { "command": "pyright-langserver", "args": ["--stdio"] }
314
350
  }
315
351
  }
316
352
  }
317
353
  ```
318
354
 
319
- ## Git history indexing
355
+ Default server mappings cover all supported extractor languages:
320
356
 
321
- Lore can ingest full git history and expose it through `kb_history`.
357
+ | Language(s) | Default command |
358
+ |-------------|------------------|
359
+ | `c`, `cpp`, `objc` | `clangd` |
360
+ | `rust` | `rust-analyzer` |
361
+ | `python` | `pyright-langserver --stdio` |
362
+ | `typescript`, `javascript` | `typescript-language-server --stdio` |
363
+ | `go` | `gopls` |
364
+ | `java` | `jdtls` |
365
+ | `csharp` | `csharp-ls` |
366
+ | `ruby` | `solargraph stdio` |
367
+ | `php` | `intelephense --stdio` |
368
+ | `swift` | `sourcekit-lsp` |
369
+ | `kotlin` | `kotlin-language-server` |
370
+ | `scala` | `metals` |
371
+ | `lua` | `lua-language-server` |
372
+ | `bash` | `bash-language-server start` |
373
+ | `elixir` | `elixir-ls` |
374
+ | `zig` | `zls` |
375
+ | `dart` | `dart language-server --protocol=lsp` |
376
+ | `ocaml` | `ocamllsp` |
377
+ | `haskell` | `haskell-language-server-wrapper --lsp` |
378
+ | `julia` | `julia --startup-file=no --history-file=no --quiet --eval "using LanguageServer, SymbolServer; runserver()"` |
379
+ | `elm` | `elm-language-server` |
322
380
 
323
- ### Indexed history tables
381
+ Install whichever language servers you need on `PATH`; unavailable servers are
382
+ auto-detected and skipped without failing indexing.
324
383
 
325
- - `commits`: sha, author, author_email, timestamp, message, parents
326
- - `commit_files`: per-commit touched paths with change type and diff stats
327
- - `commit_refs`: refs currently pointing at commits (`branch`/`tag`/`other`)
384
+ ### Dependency APIs
328
385
 
329
- ### kb_history modes
386
+ Lore can index declaration-level public API surface from direct dependencies.
387
+ Enable with `--index-deps` or `indexDependencies: true` programmatically.
330
388
 
331
- - `recent`: newest commits
332
- - `file`: commits that touched a path
333
- - `commit`: full/prefix sha lookup (+files +refs)
334
- - `author`: commits by author/email substring
335
- - `ref`: commits matching branch/tag ref name substring
389
+ Supported ecosystems:
336
390
 
337
- ## Blame queries
391
+ - **TypeScript/JavaScript** — exported declarations from `.d.ts` files in direct npm dependencies
392
+ - **Python** — stubbed/public declarations from direct dependencies via `.pyi` and `py.typed`
393
+ - **Go** — exported declarations from direct module requirements in `go.mod`
394
+ - **Rust** — `pub` declarations from crates in `Cargo.toml`
338
395
 
339
- Use `kb_blame` for line-level attribution.
396
+ Implementation bodies are excluded and transitive dependencies are not crawled.
340
397
 
341
- Examples:
398
+ ## Keeping the index fresh
342
399
 
343
- ```json
344
- { "path": "/repo/src/index.ts", "line": 120 }
345
- { "path": "/repo/src/index.ts", "start_line": 120, "end_line": 140 }
346
- { "path": "/repo/src/index.ts", "line": 120, "ref": "main" }
400
+ The index stays current automatically through three mechanisms:
401
+
402
+ **Git hooks** install once with `lore hooks`, and Lore refreshes on every
403
+ `post-commit`, `post-merge`, `post-checkout`, and `post-rewrite`:
404
+
405
+ ```bash
406
+ npx @jafreck/lore hooks --root ./my-project --db ./kb.db --history
347
407
  ```
348
408
 
349
- ## Automatic freshness patterns
409
+ **Watch mode** reacts to filesystem events in real time:
350
410
 
351
- If you want Lore to stay updated without explicit requests:
411
+ ```bash
412
+ npx @jafreck/lore refresh --db ./kb.db --root ./my-project --watch
413
+ ```
352
414
 
353
- 1. Run `lore hooks` once in the repo (git lifecycle updates)
354
- 2. Optionally run `lore refresh --watch` in a background session for near-real-time updates during active editing
355
- 3. Use `--poll` on filesystems where watch events are unreliable
415
+ **Poll mode** uses periodic mtime diffing, the most reliable option across filesystems:
356
416
 
357
- ## Benchmarking index performance (500+ file repos)
417
+ ```bash
418
+ npx @jafreck/lore refresh --db ./kb.db --root ./my-project --poll
419
+ ```
358
420
 
359
- Use this procedure when you need measurable before/after evidence for indexing changes:
421
+ Each refresh only re-processes files whose content hash has changed, so updates
422
+ are fast even on large repositories.
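The content-hash check mentioned above can be sketched as follows. This is an assumption-laden illustration, not Lore's implementation: the helper names `contentHash` and `findChangedPaths`, the choice of SHA-256, and the in-memory maps standing in for stored index state are all hypothetical.

```ts
import { createHash } from "node:crypto";

/** Hex SHA-256 digest of a file's content (the algorithm is an assumption). */
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

/**
 * Compare current file contents against hashes recorded by a previous run
 * and return the paths that need re-processing (new or modified files).
 */
function findChangedPaths(
  storedHashes: Map<string, string>,
  currentContents: Map<string, string>
): string[] {
  const changed: string[] = [];
  currentContents.forEach((content, path) => {
    // A missing stored hash (new file) also counts as a change.
    if (storedHashes.get(path) !== contentHash(content)) {
      changed.push(path);
    }
  });
  return changed;
}
```

Comparing hashes rather than modification times means a file that is touched or checked out again without its content changing is not re-processed.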
360
423
 
361
- 1. Pick a repository with at least 500 source files and note the exact commit SHA you will test.
362
- 2. Capture a baseline timing from the same machine and environment:
424
+ ## CLI reference
425
+
426
+ ### lore index
427
+
428
+ Build or update a knowledge base.
363
429
 
364
430
  ```bash
365
- time npx @jafreck/lore index --root /path/to/repo --db ./kb-baseline.db
431
+ npx @jafreck/lore index --root <dir> --db <path> [--embedding-model <id>] [--index-deps] [--history] [--history-depth <n>] [--history-all] [--include <glob>] [--exclude <glob>] [--language <lang>] [--docs-include <glob>] [--docs-exclude <glob>] [--docs-extension <ext>] [--docs-auto-notes|--no-docs-auto-notes] [--lsp] [--no-lsp]
366
432
  ```
367
433
 
368
- 3. Apply your change, rebuild Lore, then capture a post-change timing against the same repository commit:
434
+ ### lore refresh
435
+
436
+ Incremental refresh (one-shot, watch, or poll).
369
437
 
370
438
  ```bash
371
- npm run build
372
- time npx @jafreck/lore index --root /path/to/repo --db ./kb-after.db
439
+ npx @jafreck/lore refresh --db <path> --root <dir> [--index-deps] [--history] [--history-depth <n>] [--history-all] [--docs-include <glob>] [--docs-exclude <glob>] [--docs-extension <ext>] [--docs-auto-notes|--no-docs-auto-notes] [--lsp] [--no-lsp]
440
+ npx @jafreck/lore refresh --db <path> --root <dir> --watch [--index-deps] [--history] [--docs-include <glob>] [--docs-exclude <glob>] [--docs-extension <ext>] [--lsp] [--no-lsp]
441
+ npx @jafreck/lore refresh --db <path> --root <dir> --poll [--index-deps] [--history] [--docs-include <glob>] [--docs-exclude <glob>] [--docs-extension <ext>] [--lsp] [--no-lsp]
373
442
  ```
374
443
 
375
- 4. Record both timings (baseline and post-change) in the related GitHub issue or PR under an "Acceptance Evidence" section, including repo name, commit SHA, and command used.
444
+ ### lore hooks
445
+
446
+ Install repo-local git hooks for automatic refresh.
447
+
448
+ ```bash
449
+ npx @jafreck/lore hooks --root <repo> --db <path> [--history] [--lsp] [--no-lsp]
450
+ ```
451
+
452
+ ### lore ingest-coverage
453
+
454
+ Manually ingest a coverage report.
455
+
456
+ ```bash
457
+ npx @jafreck/lore ingest-coverage --db <path> --root <dir> --file <path> --format <lcov|cobertura> [--commit <sha>]
458
+ ```
459
+
460
+ ### lore mcp
461
+
462
+ Start the MCP server over stdio.
463
+
464
+ ```bash
465
+ npx @jafreck/lore mcp --db <path>
466
+ ```
376
467
 
377
468
  ## Build from source
378
469
 
@@ -400,6 +491,64 @@ npm run coverage
400
491
 
401
492
  CI enforces a minimum 95% coverage threshold.
402
493
 
494
+ ## Publish authentication (npm)
495
+
496
+ Lore publish operations use `NODE_AUTH_TOKEN` (see `.npmrc`) and never commit
497
+ tokens to the repository.
498
+
499
+ Local publish flow:
500
+
501
+ ```bash
502
+ export NODE_AUTH_TOKEN=<npm automation token>
503
+ npm publish --access public
504
+ ```
505
+
506
+ CI publish flow:
507
+
508
+ - Add `NODE_AUTH_TOKEN` as a secret in your CI provider (for GitHub Actions,
509
+ use a repository or environment secret).
510
+ - Ensure publish jobs expose that secret as the `NODE_AUTH_TOKEN` environment
511
+ variable before running `npm publish`.
512
+
513
+ ## Release publish workflow (`@jafreck/lore@0.1.0`)
514
+
515
+ Publishing is automated by `.github/workflows/publish.yml`. Creating a version
516
+ tag (for example, `v0.1.0`) or publishing a GitHub Release triggers the npm
517
+ publish job.
518
+
519
+ Release steps for `@jafreck/lore@0.1.0`:
520
+
521
+ 1. Ensure `package.json` has `"version": "0.1.0"`.
522
+ 2. Push the tag: `git tag v0.1.0 && git push origin v0.1.0` (or publish a
523
+ GitHub Release for `v0.1.0`).
524
+ 3. Confirm the workflow logs show `npm publish --dry-run` output before the
525
+ live `npm publish` step.
526
+
527
+ Post-publish verification:
528
+
529
+ - Check the package metadata: `npm view @jafreck/lore version` returns `0.1.0`.
530
+ - Confirm installability: `npm view @jafreck/lore@0.1.0 name version`.
531
+
532
+ ## Benchmarking index performance (500+ file repos)
533
+
534
+ Use this procedure when you need measurable before/after evidence for indexing changes:
535
+
536
+ 1. Pick a repository with at least 500 source files and note the exact commit SHA you will test.
537
+ 2. Capture a baseline timing from the same machine and environment:
538
+
539
+ ```bash
540
+ time npx @jafreck/lore index --root /path/to/repo --db ./kb-baseline.db
541
+ ```
542
+
543
+ 3. Apply your change, rebuild Lore, then capture a post-change timing against the same repository commit:
544
+
545
+ ```bash
546
+ npm run build
547
+ time npx @jafreck/lore index --root /path/to/repo --db ./kb-after.db
548
+ ```
549
+
550
+ 4. Record both timings (baseline and post-change) in the related GitHub issue or PR under an "Acceptance Evidence" section, including repo name, commit SHA, and command used.
551
+
403
552
  ## License
404
553
 
405
554
  [MIT](LICENSE)