llm-kb 0.4.0 → 0.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. package/README.md +183 -42
  2. package/bin/anthropic-5TIU2EED.js +5515 -0
  3. package/bin/azure-openai-responses-ZVUVMK3G.js +190 -0
  4. package/bin/chunk-2WV6TQRI.js +4792 -0
  5. package/bin/chunk-3YMNGUZZ.js +262 -0
  6. package/bin/chunk-5PYKQQLA.js +14295 -0
  7. package/bin/chunk-65KFH7OI.js +31 -0
  8. package/bin/chunk-DHOXVEIR.js +7261 -0
  9. package/bin/chunk-EAQYK3U2.js +41 -0
  10. package/bin/chunk-IFS3OKBN.js +428 -0
  11. package/bin/chunk-LDHOKBJA.js +86 -0
  12. package/bin/chunk-SLYBG6ZQ.js +32681 -0
  13. package/bin/chunk-UEODFF7H.js +17 -0
  14. package/bin/chunk-XCXTZJGO.js +174 -0
  15. package/bin/chunk-XFV534WU.js +7056 -0
  16. package/bin/cli.js +30 -4
  17. package/bin/dist-3YH7P2QF.js +1244 -0
  18. package/bin/google-JFC43EFJ.js +371 -0
  19. package/bin/google-gemini-cli-K4XNMYDI.js +712 -0
  20. package/bin/google-vertex-Y42F254G.js +414 -0
  21. package/bin/indexer-KSYRIVVN.js +10 -0
  22. package/bin/mistral-ZU2JS5XZ.js +38406 -0
  23. package/bin/multipart-parser-CO464TZY.js +371 -0
  24. package/bin/openai-codex-responses-NW2LELBH.js +712 -0
  25. package/bin/openai-completions-TW3VKTHO.js +662 -0
  26. package/bin/openai-responses-VGL522MK.js +198 -0
  27. package/bin/src-Y22OHE3S.js +1408 -0
  28. package/package.json +6 -1
  29. package/PHASE2_SPEC.md +0 -274
  30. package/PHASE3_SPEC.md +0 -245
  31. package/PHASE4_SPEC.md +0 -358
  32. package/SPEC.md +0 -275
  33. package/plan.md +0 -300
  34. package/src/auth.ts +0 -55
  35. package/src/cli.ts +0 -257
  36. package/src/config.ts +0 -61
  37. package/src/eval.ts +0 -548
  38. package/src/indexer.ts +0 -152
  39. package/src/md-stream.ts +0 -133
  40. package/src/pdf.ts +0 -119
  41. package/src/query.ts +0 -408
  42. package/src/resolve-kb.ts +0 -19
  43. package/src/scan.ts +0 -59
  44. package/src/session-store.ts +0 -22
  45. package/src/session-watcher.ts +0 -89
  46. package/src/trace-builder.ts +0 -168
  47. package/src/tui-display.ts +0 -281
  48. package/src/utils.ts +0 -17
  49. package/src/watcher.ts +0 -87
  50. package/src/wiki-updater.ts +0 -136
  51. package/test/auth.test.ts +0 -65
  52. package/test/config.test.ts +0 -96
  53. package/test/md-stream.test.ts +0 -98
  54. package/test/resolve-kb.test.ts +0 -33
  55. package/test/scan.test.ts +0 -65
  56. package/test/trace-builder.test.ts +0 -215
  57. package/tsconfig.json +0 -14
  58. package/vitest.config.ts +0 -8
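The per-file `+N -N` counts in the list above can be recomputed from any unified diff. A minimal TypeScript sketch — the `diffStats` helper and the sample input are illustrative, not part of llm-kb or the registry tooling:

```typescript
// Count added/removed lines per file in a unified diff.
// Illustrative helper, not part of llm-kb.
function diffStats(diff: string): Map<string, { added: number; removed: number }> {
  const stats = new Map<string, { added: number; removed: number }>();
  let current: { added: number; removed: number } | null = null;
  for (const line of diff.split("\n")) {
    const header = line.match(/^\+\+\+ b\/(.+)$/);
    if (header) {
      // A "+++ b/<path>" header starts a new file section.
      current = { added: 0, removed: 0 };
      stats.set(header[1], current);
    } else if (current && line.startsWith("+") && !line.startsWith("+++")) {
      current.added++;
    } else if (current && line.startsWith("-") && !line.startsWith("---")) {
      current.removed++;
    }
  }
  return stats;
}

const sample = [
  "--- a/package/README.md",
  "+++ b/package/README.md",
  "@@ -42,1 +42,1 @@",
  "-llm-kb v0.3.0",
  "+llm-kb v0.4.1",
].join("\n");

console.log(diffStats(sample).get("package/README.md")); // { added: 1, removed: 1 }
```

The same tallying over the full diff below yields the `+183 -42` shown for `package/README.md`.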
package/README.md CHANGED
@@ -39,7 +39,7 @@ llm-kb run ./my-documents
  ```

  ```
- llm-kb v0.3.0
+ llm-kb v0.4.1

  Scanning ./my-documents...
  Found 9 files (9 PDF)
@@ -77,7 +77,7 @@ Revenue grew 12% QoQ driven by...
  3. **Indexes** — Haiku reads sources, writes `index.md` with summary table
  4. **Watches** — drop new files while running, they get parsed and indexed automatically
  5. **Chat** — interactive TUI with Pi-style markdown rendering, thinking display, tool call progress
- 6. **Learns** — every answer updates a knowledge wiki; repeated questions answered instantly from cache
+ 6. **Learns** — every answer updates a concept-organized wiki; repeated questions answered instantly from cache

  ### Continuous conversation

@@ -99,16 +99,49 @@ Sessions persist across restarts — run `llm-kb run` again and the conversation
  ### Query — single question from CLI

  ```bash
- # Auto-detects .llm-kb/ by walking up from cwd
  llm-kb query "compare Q3 vs Q4"
+ llm-kb query "summarize revenue data" --folder ./my-documents
+ llm-kb query "full analysis of lease terms" --save # research mode
+ ```

- # Explicit folder
- llm-kb query "summarize all revenue data" --folder ./my-documents
+ ### Eval — analyze and improve

- # Research mode — saves answer and re-indexes
- llm-kb query "full analysis of lease terms" --save
+ ```bash
+ llm-kb eval
+ llm-kb eval --last 10
+ ```
+
+ ```
+ llm-kb eval
+
+ Reading sessions...
+ Found 29 Q&A exchanges across sessions
+ Judging 1/29: "What are the 2023 new laws?"
+ ...
+ Judging 29/29: "How many files you have"
+
+ Results:
+ Queries analyzed: 29
+ Wiki hit rate: 66%
+ Wasted reads: 42
+ Issues: 22 errors 24 warnings
+ Wiki gaps: 28
+
+ Report: .llm-kb/wiki/outputs/eval-report.md
  ```

+ Eval reads your session files and uses Haiku as a judge to find:
+
+ | Check | What it catches |
+ |---|---|
+ | **Citation validity** | Agent claims "Clause 303" but source says "Clause 304" |
+ | **Contradictions** | Answer says "sedition retained" but source says "removed" |
+ | **Wiki gaps** | Topics asked 4 times but never cached in wiki |
+ | **Wasted reads** | Files read but never cited in the answer |
+ | **Performance** | Wiki hit rate, avg duration, most-read files |
+
+ The eval report includes actionable recommendations and updates `.llm-kb/guidelines.md` — learned rules the agent reads on-demand during queries. You can also add your own rules to this file (see [Guidelines](#guidelines) below).
+
  ### Status — KB overview

  ```bash
@@ -120,38 +153,113 @@ Knowledge Base Status
  Folder: /path/to/my-documents
  Sources: 12 parsed sources
  Index: 3 min ago
+ Articles: 15 compiled
  Outputs: 2 saved answers
  Models: claude-sonnet-4-6 (query) claude-haiku-4-5 (index)
  Auth: Pi SDK
  ```

- ## The Knowledge Wiki
+ ## The Three-Layer Architecture

- Every query makes the system smarter. After answering, `llm-kb` uses Haiku to update `.llm-kb/wiki/wiki.md` — a structured knowledge wiki organized by topic:
+ The system separates **how to behave**, **what to know**, and **what went wrong** into three files with distinct lifecycles:

- ```markdown
- ## Indian Evidence Act, 1872
+ ```
+ ┌──────────────────────────────────────────────────────────────┐
+ │ AGENTS.md (runtime — built by code, not on disk)             │
+ │ How to answer: source list, tool patterns, citation          │
+ │ rules. Points to guidelines.md for learned behaviour.        │
+ └──────────────────────────────┬───────────────────────────────┘
+                                │
+                ┌───────────────┴────────────────┐
+                ▼                                ▼
+ ┌─────────────────────────────┐  ┌─────────────────────────────┐
+ │ wiki.md                     │  │ guidelines.md               │
+ │ WHAT to know                │  │ HOW to behave better        │
+ │                             │  │                             │
+ │ Concept-organized knowledge │  │ Eval insights (auto)        │
+ │ synthesized from sources.   │  │ + your custom rules.        │
+ │ Updated after every query.  │  │ Read on-demand by agent.    │
+ └─────────────────────────────┘  └─────────────────────────────┘
+                ▲                                ▲
+                │ updated by wiki-updater        │ updated by eval
+                │                                │
+ ┌──────────────┴────────────────────────────────┴──────────────┐
+ │ llm-kb eval                                                  │
+ │ Reads sessions → judges quality → updates guidelines.md      │
+ │ + writes eval-report.md for humans                           │
+ └──────────────────────────────────────────────────────────────┘
+ ```

- ### Overview
- Foundational legislation covering 167 sections in 3 parts...
+ | Layer | File | Changes when | Written by |
+ |---|---|---|---|
+ | Architecture | AGENTS.md (runtime) | Code deploys | Developer |
+ | Behaviour | `guidelines.md` | After eval / by you | Eval + user |
+ | Knowledge | `wiki.md` | After every query | Wiki updater |

- ### Part I — Relevancy of Facts
- Admissions, confessions, dying declarations, expert opinions...
+ The agent sees AGENTS.md in its system prompt (lean, stable). It reads `guidelines.md` and `wiki.md` on-demand via tool calls — progressive disclosure, not context bloat.

- ### Electronic Records (Section 65B)
- Admissible with certificate from responsible official...
+ ## The Data Flywheel

- *Sources: Indian Evidence Act.md · 2026-04-06*
+ Every query makes the system faster. Every eval makes it smarter.

- ---
+ ```
+        ┌─────────────────┐
+        │    User asks    │
+        │   a question    │
+        └────────┬────────┘
+                 │
+                 ▼
+    ┌────────────────────────┐
+    │ Agent checks wiki.md   │
+    │ + reads guidelines.md  │ ◄── on-demand, not forced
+    │ + reads source files   │
+    └────────────┬───────────┘
+                 │
+                 ▼
+    ┌────────────────────────┐
+    │ Wiki updated           │ ◄── knowledge compounds
+    │ (concept-organized)    │
+    └────────────┬───────────┘
+                 │
+                 ▼
+    ┌────────────────────────┐
+    │ Next similar query     │
+    │ answered from wiki     │ ◄── 0 file reads, 2s instead of 25s
+    └────────────┬───────────┘
+                 │
+                 ▼
+    ┌────────────────────────┐
+    │ llm-kb eval            │ ◄── behaviour compounds
+    │ analyzes sessions      │     updates guidelines.md
+    │ improves behaviour     │     with learned rules
+    └────────────────────────┘
+ ```

- ## Bankers Books Evidence Act, 1891
+ **Proven results:**
+ - First query about a topic: ~25s, reads source files
+ - Same question again: ~2s, answered from wiki, 0 files read
+ - Wiki hit rate grows with usage: 0% → 66% after 29 queries

- ### Key Sections
- Section 4 (core): certified copy = prima facie evidence...
- ```
+ ## The Concept Wiki

- When you ask a question already covered by the wiki, the agent answers instantly — no source files read. New questions expand the wiki. The knowledge compounds.
+ The wiki organizes knowledge by **concepts**, not source files. A single wiki entry can synthesize information from multiple sources:
+
+ ```markdown
+ ## Mob Lynching
+ First-ever criminalisation in Indian law under BNS 2023, Clause 101(2).
+ Group of 5+ persons, discriminatory grounds, minimum 7 years to death.
+ IPC had no equivalent — prosecuted under general S.302.
+ See also: [[Murder and Homicide]], [[BNS 2023 Overview]]
+ *Sources: indian penal code - new.md (p.137), Annotated comparison (p.15) · 2026-04-06*
+
+ ---
+
+ ## Electronic Evidence
+ Section 65B requires certificate from responsible official.
+ BSA 2023 expands: emails, WhatsApp, GPS, cloud docs all admissible.
+ See also: [[Evidence Law Overview]]
+ *Sources: Indian Evidence Act.md, Comparison Chart.md · 2026-04-06*
+ ```

  ## Model Configuration

@@ -164,8 +272,12 @@ Auto-generated at `.llm-kb/config.json`:
  }
  ```

- - **Haiku** for indexing — cheap, fast, good enough for summaries
- - **Sonnet** for queries — strong reasoning for cited answers
+ | Task | Model | Why |
+ |---|---|---|
+ | Index | Haiku | Summarizing sources — cheap, fast |
+ | Wiki update | Haiku | Merging knowledge — cheap, fast |
+ | Eval judge | Haiku | Checking quality — cheap, fast |
+ | Query | Sonnet | Complex reasoning, citations — needs strength |

  Override with env vars:
  ```bash
@@ -175,31 +287,55 @@ LLM_KB_QUERY_MODEL=claude-sonnet-4-6 llm-kb query "question"

  ## Non-PDF Files

- PDFs are parsed at scan time. Other file types are read dynamically by the agent at query time using bash:
+ PDFs are parsed at scan time. Other file types are read dynamically by the agent using bash scripts:

  | File type | How it's read |
  |---|---|
  | `.pdf` | Pre-parsed to markdown + bounding boxes (LiteParse) |
- | `.docx` | Agent reads selectively via `adm-zip` (XML structure) |
- | `.xlsx` | Agent reads specific sheets/cells via `exceljs` |
- | `.pptx` | Agent extracts text via `officeparser` |
+ | `.docx` | Selective XML reading via `adm-zip` (structure first, then relevant sections) |
+ | `.xlsx` | Specific sheets/cells via `exceljs` |
+ | `.pptx` | Text extraction via `officeparser` |
  | `.md`, `.txt`, `.csv` | Read directly |

- For large `.docx` files, the agent reads the document structure first, then extracts only the sections relevant to your question — not the whole file.
+ For large files, the agent reads the structure first, then extracts only the sections relevant to the question — never dumps the entire file.

  ## OCR for Scanned PDFs

  Most PDFs have native text. For scanned PDFs:

  ```bash
- # Local Tesseract (built-in, slower)
- OCR_ENABLED=true llm-kb run ./docs
+ OCR_ENABLED=true llm-kb run ./docs # local Tesseract
+ OCR_SERVER_URL="http://localhost:8080/ocr?key=KEY" llm-kb run . # remote Azure OCR
+ ```
+
+ ## Guidelines
+
+ `guidelines.md` is the agent’s learned behaviour file. Eval writes the `## Eval Insights` section automatically. You can add your own rules below it — eval will never overwrite them.

- # Remote Azure OCR (faster, better quality)
- OCR_SERVER_URL="http://localhost:8080/ocr?key=KEY" llm-kb run ./docs
+ ```markdown
+ ## Eval Insights (auto-generated 2026-04-07)
+
+ ### Wiki Gaps — add to wiki when users ask about these topics
+ - Reserve requirements
+ - Engine types
+
+ ### Behaviour Fixes
+ - Double-check clause numbers against source text.
+
+ ### Performance
+ - Wiki hit rate: 82% (target: 80%+)
+ - Avg query time: 3.1s
+
+ ## My Rules
+
+ - Always use Hindi transliterations for legal terms
+ - Respond in bullet points for legal questions
+ - For aviation leases: always check both lessee and lessor obligations
  ```

- Native-text pages are always processed locally (free). Only scanned pages hit the OCR server.
+ The agent reads this file on-demand — not on every query. It consults guidelines when unsure about citation accuracy, file selection, or when a question touches a topic that had issues before. This keeps the system prompt lean while making learned behaviour available when it matters.
+
+ You can create `guidelines.md` manually before ever running eval. The agent will find it.

  ## What It Creates

@@ -208,32 +344,37 @@ Native-text pages are always processed locally (free). Only scanned pages hit th
  ├── (your files — untouched)
  └── .llm-kb/
      ├── config.json        ← model configuration
+     ├── guidelines.md      ← learned rules from eval + your custom rules
      ├── sessions/          ← conversation history (JSONL)
      ├── traces/            ← per-query traces (JSON)
      │   └── .processed     ← prevents re-processing on restart
      └── wiki/
          ├── index.md       ← source summary table
-         ├── wiki.md        ← knowledge wiki (grows over time)
+         ├── wiki.md        ← concept-organized knowledge wiki
          ├── queries.md     ← query log (newest first)
          ├── sources/       ← parsed markdown + bounding boxes
-         └── outputs/       ← saved research answers (--save)
+         └── outputs/
+             ├── eval-report.md   ← eval analysis report
+             └── ...              ← saved research answers (--save)
  ```

  Your original files are never modified. Delete `.llm-kb/` to start fresh.

  ## Display

- The interactive TUI (via `@mariozechner/pi-tui`) shows:
+ The interactive TUI (via `@mariozechner/pi-tui`) shows the Claude Web UI pattern:

  | Phase | What you see |
  |---|---|
  | Model | `⟡ claude-sonnet-4-6` |
  | Thinking | `▸ Thinking` + streamed reasoning (dim) |
  | Tool calls | `▸ Reading file.md` / `▸ Running bash` + code block |
- | Answer | Separator line → markdown rendered with tables, code, headers |
+ | Answer | Separator line → markdown with tables, code blocks, headers |
  | Done | `── 8.3s · 2 files read ──` |

- The `llm-kb query` command uses stdout mode — same phases, streams to terminal, works with pipes.
+ Phases can interleave: think → read files → answer → think again → read more → continue answer.
+
+ The `llm-kb query` command uses stdout mode — same phases, works with pipes and scripts.

  ## Development

@@ -244,7 +385,7 @@ npm install
  npm run build
  npm link

- npm test # 38 tests
+ npm test # 42 tests
  npm run test:watch # vitest watch mode

  llm-kb run ./test-folder
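
The model-selection behaviour the README diff describes (`config.json` defaults plus env-var overrides such as `LLM_KB_QUERY_MODEL`) reduces to a one-line precedence rule. A minimal TypeScript sketch — `resolveModel` is a hypothetical helper, not llm-kb source, and the `LLM_KB_INDEX_MODEL` name is an assumption extrapolated from the query variant shown in the README:

```typescript
// Env var beats config.json default; config.json is the fallback.
type Task = "query" | "index";

function resolveModel(
  task: Task,
  config: Record<Task, string>,
  env: Record<string, string | undefined>,
): string {
  // e.g. task "query" → env key "LLM_KB_QUERY_MODEL"
  const override = env[`LLM_KB_${task.toUpperCase()}_MODEL`];
  return override ?? config[task];
}

const config = { query: "claude-sonnet-4-6", index: "claude-haiku-4-5" };

console.log(resolveModel("query", config, {})); // claude-sonnet-4-6
console.log(resolveModel("query", config, { LLM_KB_QUERY_MODEL: "my-model" })); // my-model
```

Under this reading, `LLM_KB_QUERY_MODEL=claude-sonnet-4-6 llm-kb query "question"` from the README simply shadows the `query` entry of `.llm-kb/config.json` for that invocation.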