npm - purecontext-mcp - Versions diffs - 1.2.0 → 1.5.1 - Mend

purecontext-mcp 1.2.0 → 1.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (185)

package/AGENT_INSTRUCTIONS.md +110 -784
package/AGENT_REFERENCE.md +561 -0
package/BENCHMARKS.md +153 -0
package/CHANGELOG.md +177 -6
package/FRAMEWORK-ADAPTERS.md +351 -0
package/FULL-INSTALLATION-GUIDE.md +341 -0
package/LANGUAGE-SUPPORT.md +144 -0
package/README.md +154 -16
package/USER-GUIDE.md +29 -21
package/dist/cli/hooks.d.ts +28 -0
package/dist/cli/hooks.d.ts.map +1 -0
package/dist/cli/hooks.js +570 -0
package/dist/cli/hooks.js.map +1 -0
package/dist/cli/install-detect.d.ts +16 -0
package/dist/cli/install-detect.d.ts.map +1 -0
package/dist/cli/install-detect.js +70 -0
package/dist/cli/install-detect.js.map +1 -0
package/dist/cli/install-writers.d.ts +59 -0
package/dist/cli/install-writers.d.ts.map +1 -0
package/dist/cli/install-writers.js +292 -0
package/dist/cli/install-writers.js.map +1 -0
package/dist/cli/install.d.ts +14 -0
package/dist/cli/install.d.ts.map +1 -0
package/dist/cli/install.js +150 -0
package/dist/cli/install.js.map +1 -0
package/dist/config/config-loader.js +3 -0
package/dist/config/config-loader.js.map +1 -1
package/dist/config/config-schema.d.ts +11 -0
package/dist/config/config-schema.d.ts.map +1 -1
package/dist/config/config-schema.js +15 -0
package/dist/config/config-schema.js.map +1 -1
package/dist/core/db/symbol-store.d.ts +1 -0
package/dist/core/db/symbol-store.d.ts.map +1 -1
package/dist/core/db/symbol-store.js +120 -6
package/dist/core/db/symbol-store.js.map +1 -1
package/dist/core/file-discovery.d.ts +6 -0
package/dist/core/file-discovery.d.ts.map +1 -1
package/dist/core/file-discovery.js +20 -13
package/dist/core/file-discovery.js.map +1 -1
package/dist/core/file-processor.d.ts.map +1 -1
package/dist/core/file-processor.js +26 -1
package/dist/core/file-processor.js.map +1 -1
package/dist/core/git-log-reader.d.ts.map +1 -1
package/dist/core/git-log-reader.js +21 -0
package/dist/core/git-log-reader.js.map +1 -1
package/dist/core/index-manager.d.ts.map +1 -1
package/dist/core/index-manager.js +21 -7
package/dist/core/index-manager.js.map +1 -1
package/dist/core/indexing-worker.d.ts.map +1 -1
package/dist/core/indexing-worker.js +14 -0
package/dist/core/indexing-worker.js.map +1 -1
package/dist/core/parse-dispatcher.d.ts.map +1 -1
package/dist/core/parse-dispatcher.js +20 -5
package/dist/core/parse-dispatcher.js.map +1 -1
package/dist/core/search/query-preprocessor.d.ts +69 -3
package/dist/core/search/query-preprocessor.d.ts.map +1 -1
package/dist/core/search/query-preprocessor.js +450 -17
package/dist/core/search/query-preprocessor.js.map +1 -1
package/dist/core/search/relevance-ranker.d.ts +60 -5
package/dist/core/search/relevance-ranker.d.ts.map +1 -1
package/dist/core/search/relevance-ranker.js +931 -33
package/dist/core/search/relevance-ranker.js.map +1 -1
package/dist/core/test-mapper.d.ts.map +1 -1
package/dist/core/test-mapper.js +7 -1
package/dist/core/test-mapper.js.map +1 -1
package/dist/core/types.d.ts +28 -1
package/dist/core/types.d.ts.map +1 -1
package/dist/handlers/angular-html.d.ts +3 -0
package/dist/handlers/angular-html.d.ts.map +1 -0
package/dist/handlers/angular-html.js +215 -0
package/dist/handlers/angular-html.js.map +1 -0
package/dist/handlers/c.d.ts.map +1 -1
package/dist/handlers/c.js +19 -0
package/dist/handlers/c.js.map +1 -1
package/dist/handlers/cpp-macro-registry.d.ts +21 -0
package/dist/handlers/cpp-macro-registry.d.ts.map +1 -0
package/dist/handlers/cpp-macro-registry.js +44 -0
package/dist/handlers/cpp-macro-registry.js.map +1 -0
package/dist/handlers/cpp.d.ts.map +1 -1
package/dist/handlers/cpp.js +579 -10
package/dist/handlers/cpp.js.map +1 -1
package/dist/handlers/csharp.d.ts.map +1 -1
package/dist/handlers/csharp.js +39 -2
package/dist/handlers/csharp.js.map +1 -1
package/dist/handlers/css.d.ts +3 -0
package/dist/handlers/css.d.ts.map +1 -0
package/dist/handlers/css.js +154 -0
package/dist/handlers/css.js.map +1 -0
package/dist/handlers/erlang.d.ts.map +1 -1
package/dist/handlers/erlang.js +8 -1
package/dist/handlers/erlang.js.map +1 -1
package/dist/handlers/fortran.js +1 -1
package/dist/handlers/fortran.js.map +1 -1
package/dist/handlers/go.d.ts.map +1 -1
package/dist/handlers/go.js +87 -2
package/dist/handlers/go.js.map +1 -1
package/dist/handlers/handler-registry.d.ts.map +1 -1
package/dist/handlers/handler-registry.js +4 -0
package/dist/handlers/handler-registry.js.map +1 -1
package/dist/handlers/hcl.d.ts +3 -0
package/dist/handlers/hcl.d.ts.map +1 -0
package/dist/handlers/hcl.js +193 -0
package/dist/handlers/hcl.js.map +1 -0
package/dist/handlers/java.d.ts.map +1 -1
package/dist/handlers/java.js +33 -16
package/dist/handlers/java.js.map +1 -1
package/dist/handlers/kotlin.d.ts.map +1 -1
package/dist/handlers/kotlin.js +48 -3
package/dist/handlers/kotlin.js.map +1 -1
package/dist/handlers/less.d.ts +3 -0
package/dist/handlers/less.d.ts.map +1 -0
package/dist/handlers/less.js +255 -0
package/dist/handlers/less.js.map +1 -0
package/dist/handlers/objective-c.d.ts.map +1 -1
package/dist/handlers/objective-c.js +122 -64
package/dist/handlers/objective-c.js.map +1 -1
package/dist/handlers/openapi.d.ts.map +1 -1
package/dist/handlers/openapi.js +30 -5
package/dist/handlers/openapi.js.map +1 -1
package/dist/handlers/php.d.ts.map +1 -1
package/dist/handlers/php.js +287 -41
package/dist/handlers/php.js.map +1 -1
package/dist/handlers/protobuf.d.ts.map +1 -1
package/dist/handlers/protobuf.js +1 -0
package/dist/handlers/protobuf.js.map +1 -1
package/dist/handlers/python.d.ts.map +1 -1
package/dist/handlers/python.js +1 -3
package/dist/handlers/python.js.map +1 -1
package/dist/handlers/ruby-dsl.d.ts +23 -0
package/dist/handlers/ruby-dsl.d.ts.map +1 -0
package/dist/handlers/ruby-dsl.js +251 -0
package/dist/handlers/ruby-dsl.js.map +1 -0
package/dist/handlers/ruby.d.ts.map +1 -1
package/dist/handlers/ruby.js +29 -4
package/dist/handlers/ruby.js.map +1 -1
package/dist/handlers/rust.d.ts.map +1 -1
package/dist/handlers/rust.js +98 -2
package/dist/handlers/rust.js.map +1 -1
package/dist/handlers/scss.d.ts +3 -0
package/dist/handlers/scss.d.ts.map +1 -0
package/dist/handlers/scss.js +290 -0
package/dist/handlers/scss.js.map +1 -0
package/dist/handlers/sql.d.ts.map +1 -1
package/dist/handlers/sql.js +37 -18
package/dist/handlers/sql.js.map +1 -1
package/dist/handlers/typescript.d.ts.map +1 -1
package/dist/handlers/typescript.js +65 -17
package/dist/handlers/typescript.js.map +1 -1
package/dist/handlers/xml.d.ts.map +1 -1
package/dist/handlers/xml.js +35 -2
package/dist/handlers/xml.js.map +1 -1
package/dist/index.d.ts.map +1 -1
package/dist/index.js +91 -0
package/dist/index.js.map +1 -1
package/dist/server/mcp-server.d.ts.map +1 -1
package/dist/server/mcp-server.js +10 -0
package/dist/server/mcp-server.js.map +1 -1
package/dist/server/tools/detect-antipatterns.d.ts +1 -1
package/dist/server/tools/get-architecture-snapshot.d.ts +1 -1
package/dist/server/tools/get-entry-points.d.ts +1 -1
package/dist/server/tools/get-lexical-scope-matches.d.ts +54 -0
package/dist/server/tools/get-lexical-scope-matches.d.ts.map +1 -0
package/dist/server/tools/get-lexical-scope-matches.js +470 -0
package/dist/server/tools/get-lexical-scope-matches.js.map +1 -0
package/dist/server/tools/search-symbols.d.ts +10 -0
package/dist/server/tools/search-symbols.d.ts.map +1 -1
package/dist/server/tools/search-symbols.js +353 -8
package/dist/server/tools/search-symbols.js.map +1 -1
package/dist/server/tools/trace-invocation-chain.d.ts +53 -0
package/dist/server/tools/trace-invocation-chain.d.ts.map +1 -0
package/dist/server/tools/trace-invocation-chain.js +280 -0
package/dist/server/tools/trace-invocation-chain.js.map +1 -0
package/dist/version.d.ts +1 -1
package/dist/version.js +1 -1
package/docs/02-installation.md +43 -245
package/docs/05-cli-reference.md +89 -0
package/docs/07-language-support.md +73 -50
package/docs/08-framework-adapters.md +7 -2
package/docs/15-team-setup.md +70 -200
package/docs/17-web-ui.md +73 -93
package/docs/README.md +60 -39
package/docs/dev/benchmark-findings-eu-za-tebe.md +210 -0
package/docs/dev/phase-35-coverage-audit.md +469 -0
package/package.json +6 -3
package/user-manual.md +0 -2466

package/docs/17-web-ui.md CHANGED

@@ -1,68 +1,66 @@
-# Web UI
+# Web UI — Reference
+This is the reference page: build commands, configuration flags, keyboard shortcuts, heatmap metrics, and graph-viewer controls.
-The Web UI provides a visual interface for exploring indexed codebases. It is served by the same process as the MCP server when HTTP transport is active.
+For the **user-friendly tour** — when to use the UI vs the chat, what each view is good for, workflow examples — see [`WEB-UI.md`](../WEB-UI.md) at the project root.
 ---
-## Accessing the Web UI
+## Activating the UI
-The Web UI is available at `http://localhost:3000` (or your server URL) when running in HTTP mode:
+The UI is served by the same process as the MCP server, but only when HTTP transport is active:
 ```bash
 purecontext-mcp --transport http --port 3000
+# Web UI: http://localhost:3000
+# MCP endpoint: http://localhost:3000/mcp/sse
 ```
-Then open `http://localhost:3000` in a browser.
-### Building the UI
-The UI is pre-built in the npm package. For development or rebuilding from source:
+The UI is pre-built into the npm package. For source builds:
 ```bash
 npm run build:ui   # build only the UI
 npm run build      # build everything
-npm run dev        # watch mode: TypeScript + Vite dev server with hot reload
+npm run dev        # watch mode: TypeScript + Vite dev server with HMR
 ```
 ---
-## Repository browser
+## Configuration
-- List all indexed repositories with symbol counts, file counts, and language breakdown
-- Collapsible file tree with file type icons
-- Click any file to open its symbol outline
----
+| Field | Default | Description |
+|-------|--------:|-------------|
+| `webUI.enabled` | `true` | Set to `false` to disable the UI even in HTTP mode (API-only) |
+| `webUI.theme` | `"system"` | Default theme: `"light"`, `"dark"`, or `"system"`; users can override it |
+| `webUI.basePath` | `"/"` | Mount the UI under a subpath (e.g., `/purecontext`) |
+| `webUI.maxGraphNodes` | `500` | Hard cap on graph viewer node count for performance |
-## Symbol search
-- Real-time search with 300ms debounce — results appear as you type
-- Filter by: symbol kind, language, file path pattern
-- Keyboard navigation: arrow keys to move through results, Enter to open
-- Query term highlighting in results
-- Switches between keyword and semantic mode (if semantic search is enabled)
+When deployed behind a reverse proxy at a subpath, set `webUI.basePath` to match the proxy path.
 ---
-## Symbol viewer
+## Keyboard shortcuts
-- Syntax-highlighted source code (powered by Shiki — VS Code-quality highlighting)
-- Line numbers with anchors (shareable URLs)
-- Light/dark theme toggle (preference persisted in localStorage)
-- **Related symbols panel**: importers, dependencies, same-file symbols
+| Shortcut | Action |
+|----------|--------|
+| `/` | Focus search bar |
+| `↑` / `↓` | Navigate search results |
+| `Enter` | Open selected symbol |
+| `Esc` | Close panels / clear search |
+| `G` | Open graph view for current symbol |
+| `B` | Show blast radius for current symbol |
+| `H` | Toggle heatmap overlay |
+| `T` | Toggle light/dark theme |
 ---
-## Dependency graph viewer
-An interactive force-directed graph of file and symbol dependencies.
+## Graph viewer
 ### Controls
 | Action | Control |
 |--------|---------|
-| Pan | Click and drag |
+| Pan | Click and drag background |
 | Zoom | Scroll wheel |
 | Fit to view | Double-click background |
 | Select node | Click |
@@ -70,88 +68,70 @@ An interactive force-directed graph of file and symbol dependencies.
 | Forward walk | Enable "Dependencies" mode |
 | Reverse walk | Enable "Importers" mode |
-### Layout options
-- **Force-directed** (default) — physics simulation, nodes cluster by connectivity
-- **Hierarchical** — root at top, dependencies flow downward
-- **Radial** — selected node at center, connected nodes radiate outward
+### Layouts
-### Depth slider
+| Layout | Behavior |
+|--------|----------|
+| Force-directed (default) | Physics simulation; nodes cluster by connectivity |
+| Hierarchical | Root at top, dependencies flow downward |
+| Radial | Selected node at center; connected nodes radiate outward |
-Adjust traversal depth (1–5 hops). Higher depth reveals transitive dependencies but may produce large graphs.
+### Filters and overlays
-### Blast radius view
-Switch to "Blast radius" mode to see everything that depends on the selected node — color gradient from red (direct impact) to yellow (indirect).
+| Feature | Description |
+|---------|-------------|
+| Depth slider | Traversal depth 1–5 hops |
+| Language filter | Show only nodes of a specific language |
+| Kind filter | Show only files/symbols of a specific kind |
+| Cycle detection | Highlight circular dependency cycles in red |
+| Blast-radius mode | Color gradient: red (direct impact) → yellow (indirect) |
+| Export | Save graph as SVG or PNG |
+| Minimap | Overview panel for large graphs |
 ---
 ## Architecture heatmap
-An overlay on the file tree that color-codes files by a selected metric:
-| Metric | Color scale | Use case |
-|--------|-------------|----------|
-| Churn | blue (stable) → red (high churn) | Identify high-risk files before a refactor |
-| Complexity | green → orange → red | Find over-complex files that need attention |
-| Quality score | green (high) → red (low) | Prioritize technical debt |
-| Test coverage | green (covered) → red (uncovered) | Requires external coverage report |
+Color-codes files by a chosen metric.
-Click any cell in the heatmap to open the file's symbol outline.
+| Metric | Color scale | Source |
+|--------|-------------|--------|
+| Churn | blue (stable) → red (high churn) | git log history |
+| Complexity | green → orange → red | per-file cyclomatic complexity |
+| Quality score | green (high) → red (low) | aggregated metrics |
+| Test coverage | green (covered) → red (uncovered) | uploaded lcov file |
 ---
-## Symbol timeline
+## Test coverage upload
-Per-symbol git history visualized as a timeline. Shows:
-- When the symbol was created (first commit where it appears)
-- Each commit that modified the symbol (with author, date, message)
-- When the symbol was deleted (if applicable)
+The coverage overlay needs an lcov-format report:
-Requires git history integration enabled (see [Git & History Integration](18-git-history.md)).
+1. Run your test suite with coverage output (`vitest --coverage`, `pytest --cov`, `jest --coverage`, etc.)
+2. Export as lcov: typical output paths are `coverage/lcov.info` or `coverage.info`
+3. In the UI: Settings → Coverage → Upload lcov file
----
-## Test coverage overlay
-Overlays test coverage data on the file tree. Requires an lcov-format coverage report:
-1. Run your test suite with coverage output (`npx vitest --coverage`, `pytest --cov`, etc.)
-2. Export as lcov: `coverage.info` / `lcov.info`
-3. In PureContext Web UI: Settings → Coverage → Upload lcov file
-Files are color-coded by coverage percentage. Click a file to see line-level coverage in the source viewer.
+Coverage data is stored per workspace and persists across UI sessions.
 ---
-## Multi-repo workspace
-When multiple repos are indexed, the sidebar shows a repo switcher. Cross-repo search results appear in a unified list with the source repo identified for each result.
+## URL conventions
----
+| Pattern | Purpose |
+|---------|---------|
+| `/r/:repoId` | Repository home |
+| `/r/:repoId/f/:filePath` | File outline |
+| `/r/:repoId/s/:symbolId` | Symbol viewer |
+| `/r/:repoId/s/:symbolId#L42` | Symbol viewer with line anchor |
+| `/r/:repoId/graph?root=:symbolId&depth=3` | Graph viewer with preset |
+| `/r/:repoId/heatmap?metric=churn` | Heatmap with preset metric |
-## Advanced graph controls
-Additional controls available in the graph viewer:
-| Feature | Description |
-|---------|-------------|
-| Language filter | Show only nodes of a specific language |
-| Kind filter | Show only files/symbols of a specific kind |
-| Cycle detection | Highlight circular dependency cycles in red |
-| Export | Save graph as SVG or PNG |
-| Minimap | Overview panel for large graphs |
+URLs are stable — link them in PR descriptions or share with teammates.
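For scripts that generate links for PR descriptions, the patterns in the table can be wrapped in small helpers. This is a sketch under stated assumptions: the function names are hypothetical, and percent-encoding the `:repoId` and `:symbolId` segments is a guess at what the router expects, not something this page confirms. The depth check mirrors the 1–5 range of the depth slider.

```typescript
// Hypothetical helpers for the URL patterns in the table above.
// Assumption: path segments are percent-encoded; the real router may differ.

// Remove trailing slashes so the base and the path join cleanly.
function trimBase(base: string): string {
  return base.replace(/\/+$/, "");
}

// Builds /r/:repoId/s/:symbolId with an optional #L<line> anchor.
function symbolUrl(base: string, repoId: string, symbolId: string, line?: number): string {
  const path = `/r/${encodeURIComponent(repoId)}/s/${encodeURIComponent(symbolId)}`;
  const anchor = line === undefined ? "" : `#L${line}`;
  return `${trimBase(base)}${path}${anchor}`;
}

// Builds /r/:repoId/graph?root=:symbolId&depth=N, rejecting depths outside 1..5.
function graphUrl(base: string, repoId: string, rootSymbolId: string, depth: number): string {
  if (!Number.isInteger(depth) || depth < 1 || depth > 5) {
    throw new RangeError("depth must be an integer from 1 to 5");
  }
  const query = `root=${encodeURIComponent(rootSymbolId)}&depth=${depth}`;
  return `${trimBase(base)}/r/${encodeURIComponent(repoId)}/graph?${query}`;
}

console.log(symbolUrl("http://localhost:3000", "my-repo", "abc123", 42));
// http://localhost:3000/r/my-repo/s/abc123#L42
console.log(graphUrl("http://localhost:3000/", "my-repo", "abc123", 3));
// http://localhost:3000/r/my-repo/graph?root=abc123&depth=3
```

The repository and symbol IDs in the usage lines are placeholders; real IDs come from the index.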
 ---
-## Keyboard shortcuts
+## Related reference
-| Shortcut | Action |
-|----------|--------|
-| `/` | Focus search bar |
-| `↑` / `↓` | Navigate search results |
-| `Enter` | Open selected symbol |
-| `Esc` | Close panels / clear search |
-| `G` | Open graph view for current symbol |
-| `B` | Show blast radius for current symbol |
-| `H` | Toggle heatmap overlay |
+- [Transport Modes](14-transport-modes.md) — required HTTP setup for UI to activate
+- [Git & History Integration](18-git-history.md) — powers the symbol timeline and churn heatmap
+- [Configuration](04-configuration.md) — full `webUI.*` schema

package/docs/README.md CHANGED

@@ -1,71 +1,92 @@
-# PureContext MCP — User Manual
+# PureContext MCP — Reference Manual
-PureContext MCP indexes your codebase and gives AI agents a way to navigate it without reading entire files. Instead of loading hundreds of lines of code to find one function, Claude (or any other MCP-compatible AI) can search by name, retrieve just the symbol it needs, and understand the dependency chain — all in a fraction of the tokens.
+This is the **reference manual**: parameter-level documentation for every tool, configuration option, language handler, framework adapter, and deployment option.
-This manual covers everything from installation through advanced features. Use the sections below to navigate to what you need, or read in order for a full introduction.
+For the **user guide** — narrative explanations, worked examples, and real-world workflows — see [`USER-GUIDE.md`](../USER-GUIDE.md) and the `WHY-PURECONTEXT.md` / `FINDING-CODE.md` / `WORKFLOW-*.md` files at the project root.
----
+In the tables below, the left column is the reference page in this directory and the right column is the user-friendly companion at the project root, when one exists.
-## Getting Started
+---
-These three sections get you from zero to a working setup.
+## Getting started
-- [Introduction](01-introduction.md) — What PureContext is, why token efficiency matters, key concepts
-- [Installation](02-installation.md) — Install via npm, verify your setup, upgrade and uninstall
-- [Quick Start](03-quick-start.md) — Index a project and search your first symbol in minutes
+| Reference | Companion |
+|-----------|-----------|
+| [Introduction](01-introduction.md) — concise spec, glossary, key concepts | [Why PureContext](../WHY-PURECONTEXT.md) — narrative case |
+| [Installation](02-installation.md) — prereqs, support matrix, verify, upgrade | [Full Installation Guide](../FULL-INSTALLATION-GUIDE.md) — per-IDE walkthrough |
+| [Quick Start](03-quick-start.md) — index a project and search in minutes | [Navigating a New Codebase](../NAVIGATING-NEW-CODE.md) — day-one workflow |
 ---
-## Reference
-Complete reference material for configuration, the CLI, and every MCP tool.
+## Core reference
-- [Configuration](04-configuration.md) — Full `config.json` schema, every field explained, environment variable overrides
-- [CLI Reference](05-cli-reference.md) — Every command and flag: `config --init`, `--health`, `--transport`, and more
+- [Configuration](04-configuration.md) — Full `config.json` schema and environment variable overrides
+- [CLI Reference](05-cli-reference.md) — Every command and flag (`config --init`, `--health`, `--transport`, etc.)
 - [MCP Tools Reference](06-tools-reference.md) — Every tool with inputs, outputs, and examples — grouped by category
 ---
-## Language & Framework Support
+## Language and framework support
-- [Language Support](07-language-support.md) — All 34 supported languages: what gets indexed and known limitations
-- [Framework Adapters](08-framework-adapters.md) — Vue, React, Nuxt, Next.js, Angular, NestJS, Express, Django, Rails, Spring, and 20+ more
+| Reference | Companion |
+|-----------|-----------|
+| [Language Support](07-language-support.md) — symbol-kind matrix, visibility filters, grammar notes | [Language Support](../LANGUAGE-SUPPORT.md) — narrative tour by category |
+| [Framework Adapters](08-framework-adapters.md) — detection rules, extracted kinds, `frameworkMeta` | [Framework Adapters](../FRAMEWORK-ADAPTERS.md) — what each adapter changes in practice |
 ---
-## Core Features
+## Core features
-- [Dependency Graph Tools](09-dependency-graph.md) — Find what a symbol depends on, what depends on it, and what is dead code
-- [Semantic Search](10-semantic-search.md) — Search by meaning rather than name using HNSW vector index
-- [Search Quality & Ranking](11-search-quality.md) — How FTS5, camelCase splitting, and relevance ranking work; search tips
-- [AI Summarization](12-ai-summarization.md) — Auto-generate symbol descriptions with Anthropic, OpenAI, or Gemini
-- [Token Savings Tracker](13-token-savings.md) — See exactly how many tokens (and dollars) PureContext saves per session
+- [Dependency Graph Tools](09-dependency-graph.md) — what a symbol depends on, what depends on it, dead-code detection
+- [Semantic Search](10-semantic-search.md) — HNSW vector index, embedding providers, hybrid mode
+- [Search Quality & Ranking](11-search-quality.md) — FTS5, camelCase splitting, relevance ranking
+- [AI Summarization](12-ai-summarization.md) — provider config, batch sizes, cost model
+- [Token Savings Tracker](13-token-savings.md) — per-session token (and dollar) accounting
+Companion narratives: [Finding Code](../FINDING-CODE.md), [AI Summaries](../AI-SUMMARIES.md), [AST-Level Search](../AST-SEARCH.md), [Code Intelligence](../CODE-INTELLIGENCE.md).
 ---
 ## Deployment
-- [Transport Modes](14-transport-modes.md) — stdio (local) vs HTTP/SSE (team/browser); TLS via reverse proxy
-- [Team Setup & Multi-Tenant](15-team-setup.md) — Shared server, workspaces, API keys, rate limiting
-- [Docker Deployment](16-docker.md) — `docker run`, Docker Compose, volumes, environment variables, health checks
+| Reference | Companion |
+|-----------|-----------|
+| [Transport Modes](14-transport-modes.md) — stdio vs HTTP/SSE, TLS via reverse proxy | — |
+| [Team Setup & Multi-Tenant](15-team-setup.md) — permissions, rate limit, admin API reference | [Using PureContext with a Team](../TEAM-SETUP.md) — narrative deployment |
+| [Docker Deployment](16-docker.md) — image tags, compose, volumes, env vars, healthchecks | — |
+---
+## Advanced features
+| Reference | Companion |
+|-----------|-----------|
+| [Web UI](17-web-ui.md) — config flags, keyboard shortcuts, URL conventions | [The Web UI](../WEB-UI.md) — when to leave the chat |
+| [Git & History Integration](18-git-history.md) — symbol history, churn, diff analysis | [Code History](../CODE-HISTORY.md) — narrative |
+| [Cross-Repo Intelligence](19-cross-repo.md) — multi-repo search, similarity, MCP Resources | — |
+| [AI-Powered Architecture Analysis](20-architecture-analysis.md) — metrics, anti-patterns, auto-docs | [Code Health](../CODE-HEALTH.md), [Health Dashboards](../HEALTH-DASHBOARDS.md), [Visualizing Code Structure](../VISUALIZING-CODE.md) |
+| [Ecosystem & Data Tools](21-ecosystem-tools.md) — dbt, OpenAPI handler, SQL handler, column search | — |
+| [Distribution & Platform](22-distribution.md) — export/import, registry, webhooks, GitHub Actions | — |
+Companion narratives also relevant here: [Making Changes Safely](../SAFE-CHANGES.md), [Understanding Code Relationships](../UNDERSTANDING-RELATIONSHIPS.md), [Refactoring Safely](../REFACTORING-SAFELY.md).
 ---
-## Advanced Features
+## Operations and stability
-- [Web UI](17-web-ui.md) — Visual graph viewer, heatmap, symbol timeline, test coverage overlay
-- [Git & History Integration](18-git-history.md) — Symbol-level commit history, churn metrics, PR diff analysis
-- [Cross-Repo Intelligence](19-cross-repo.md) — Search across multiple repos, find similar code, MCP Resources
-- [AI-Powered Architecture Analysis](20-architecture-analysis.md) — Quality metrics, anti-pattern detection, auto-generated architecture docs
-- [Ecosystem & Data Tools](21-ecosystem-tools.md) — dbt integration, OpenAPI/Swagger handler, SQL handler, column search
-- [Distribution & Platform](22-distribution.md) — Index export/import, public registry, webhooks, GitHub Actions, VS Code extension
+- [Performance & Scalability](23-performance.md) — worker thread pool, large-repo tuning, memory
+- [Security](24-security.md) — API key model, workspace isolation, path-traversal protections, hardening
+- [Troubleshooting](26-troubleshooting.md) — common errors, `--health` output, debug logging
+- [Architecture Overview](25-architecture-overview.md) — three-layer design, data flow, SQLite schema
+- [API Stability & Changelog](27-api-stability.md) — semver policy, stable vs experimental tools, version history
 ---
-## Operations & Reference
+## End-to-end workflows
+The project root also contains narrative walkthroughs of complete real-world scenarios:
-- [Performance & Scalability](23-performance.md) — Worker thread pool, large repo tuning, memory usage
-- [Security](24-security.md) — API key model, workspace isolation, path traversal prevention, hardening checklist
-- [Troubleshooting](26-troubleshooting.md) — Common errors, `--health` output, debug logging, re-indexing from scratch
-- [Architecture Overview](25-architecture-overview.md) — How PureContext works internally: three-layer design, data flow, SQLite schema
-- [API Stability & Changelog](27-api-stability.md) — Semver policy, stable vs experimental tools, version history
+- [Onboarding to a New Codebase](../WORKFLOW-ONBOARDING.md)
+- [Refactoring Legacy Code](../WORKFLOW-REFACTORING.md)
+- [Reviewing a Pull Request](../WORKFLOW-PR-REVIEW.md)
+- [Running a Tech Debt Sprint](../WORKFLOW-TECH-DEBT.md)

package/docs/dev/benchmark-findings-eu-za-tebe.md ADDED

@@ -0,0 +1,210 @@
+# Benchmark Findings — PureContext vs jCodeMunch (eu-za-tebe)
+**Date:** 2026-05-14 (re-measured 2026-05-15 after Phase 34 body-snippet indexing)
+**Project:** eu-za-tebe (PHP/CodeIgniter 3, Twig, HMVC modules)
+**PureContext version:** 1.2.0
+**jCodeMunch version:** 1.80.1
+**Harness:** `benchmarks/harness/run_benchmark.ts`
+**Results:** `benchmarks/eu-za-tebe/results/`
+---
+## 1. Scorecard
+| Dimension | Metric | PureContext | jCodeMunch | Winner |
+|-----------|--------|------------:|------------:|--------|
+| **0 — Indexing** | Speed (files/sec) | 193 | 106 | PC |
+| | Symbols/sec | 1,466 | 2,833 | JC |
+| | Files indexed | 565 | 824 | JC |
+| | Symbols found | 4,291 | 21,984 | JC |
+| **1 — Token efficiency** | Avg reduction | 99.9% | 99.8% | PC |
+| | Avg ratio vs baseline | 1,060× | 696× | PC |
+| **2 — Search quality** | Precision@1 | **0.0%** | **28.0%** | JC |
+| | Precision@3 | **0.0%** | **32.0%** | JC |
+| | Recall@5 | **0.0%** | **32.0%** | JC |
+| | Median search latency | 0.8ms | 57ms | PC |
+| **3 — Coverage** | Total symbols | 4,291 | 21,984 | JC |
+| | Symbols/kLOC | 38.8 | 198.7 | JC |
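A note on reading the Dimension 2 rows: with 25 ground-truth queries, 28.0% and 32.0% are exactly 7/25 and 8/25. A "Precision@3" that is higher than "Precision@1" is what you would see if each query has one expected symbol and the metric counts a hit whenever that symbol appears in the top k. The harness source is not part of this diff, so the sketch below is only an assumption about how such numbers could be computed; `hitRateAtK`, the input shape, and the sample ranks are hypothetical.

```typescript
// Rank (1-based) at which the expected symbol appeared for one query,
// or null when it was not returned at all. Hypothetical input shape.
type RankOrMiss = number | null;

// Fraction of queries whose expected symbol appears within the top k results.
function hitRateAtK(ranks: RankOrMiss[], k: number): number {
  if (ranks.length === 0) return 0;
  const hits = ranks.filter((r) => r !== null && r <= k).length;
  return hits / ranks.length;
}

// Illustrative data only: 25 queries, 7 hits at rank 1, 1 hit at rank 2, 17 misses.
const exampleRanks: RankOrMiss[] = [
  ...Array<RankOrMiss>(7).fill(1),
  2,
  ...Array<RankOrMiss>(17).fill(null),
];

console.log(hitRateAtK(exampleRanks, 1)); // 0.28
console.log(hitRateAtK(exampleRanks, 3)); // 0.32
console.log(hitRateAtK(exampleRanks, 5)); // 0.32
```

The sample ranks are made up to reproduce the jCodeMunch column; they are not the recorded per-query results.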
+---
+## 2. Gap 1 — Search Quality: PureContext 0% vs jCodeMunch 28%
+### Root cause
+All 25 ground-truth queries are **natural-language descriptions** (e.g. "execute parameterized query and return single database row"). PureContext's FTS5 search operates on **symbol name + signature + summary**. On this PHP project, summaries are just the raw signature repeated (no docstrings), so:
+- Indexed content for `CIR_Model::get_row`: `"CIR_Model get_row public function get_row($query, $input)"`
+- Query: `"execute parameterized query and return single database row"`
+- FTS5 MATCH fails because tokens like `"parameterized"`, `"database"`, `"single"`, `"return"` are absent from the index
+This is **not a bug** — it is the expected behavior of keyword search against undocumented code. It is, however, a serious product gap.
+### Why jCodeMunch scores 28%
+jCodeMunch's BM25 search appears to index **function body content** in addition to names and signatures. For example, `insert_row` calls `$this->db->insert()` internally, so the term `"insert"` occurs several times and scores highly. It also indexes variable names (`$table`, `$fields`, `$values`), which overlap with the query "insert new record into database table with fields and values".
+jCodeMunch still missed 17/25 queries within the top 5 (68% miss rate, the complement of its 32% Recall@5), indicating that even richer keyword indexing isn't enough for pure natural-language queries against undocumented code.
+### Concrete examples
+| Query | Expected | PC result | JC result | Why JC wins |
+|-------|----------|-----------|-----------|-------------|
+| "insert new record into database table with fields and values" | `CIR_Model::insert_row` | miss | rank 1 | "insert", "table", "fields", "values" in signature |
+| "render twig template and return output as string" | `Twig::render` | miss | rank 1 | "render", "twig", "template", "string" in docstring |
+| "set content language and slug for a localized page" | `CIR_Controller::localize` | miss | rank 1 | "localize" in name, "language"/"slug" in body |
+| "fetch scalar value from database query result" | `CIR_Model::get_value` | miss | miss | "scalar" absent from both indexes |
+### Fix options (ranked by impact)
+1. **Enable semantic search for undocumented code** (highest impact)
+   Embed symbol content (name + signature + body snippet) into the HNSW vector index. Natural-language queries then find the right symbol even without docstrings. This is already implemented — the gap is that semantic search is disabled by default and requires an embedding provider. Consider enabling it with the bundled local ONNX model by default.
+2. **Index function body snippets into FTS5** (medium impact, no config needed)
+   Currently FTS5 indexes only name + signature + summary. Indexing the first ~10 lines of each function body (variable names, return statements, called methods) would dramatically improve recall for undocumented code. This alone would likely close a large portion of the gap without needing embeddings.
+3. **Query expansion in the preprocessor** (low impact, low risk)
+   When a query has no exact or prefix name matches, fall back to OR-matching on individual tokens instead of AND-matching. Currently the FTS5 query requires every token to match, so a single absent token yields zero results. Switching to OR with BM25 scoring for long queries would surface partial matches.
+4. **AI summary generation at index time** (medium impact, cost/latency tradeoff)
+   When `ai.allowRemoteAI: true`, generate a one-sentence natural-language description of each function at index time. This is already supported but opt-in. Making it the default (with a local model fallback) would close most of the gap.
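Fix option 3 can be sketched as a two-pass query builder: try the strict AND expression first, and only if it returns nothing, retry with OR so that BM25 can rank partial matches. This is a minimal illustration, not PureContext's preprocessor: `tokenize`, `buildMatchExpr`, `searchWithFallback`, and the in-memory `fakeRunMatch` stand-in are all hypothetical names. The only FTS5 behavior relied on is that double-quoted strings are literal terms and that `AND` / `OR` are boolean operators in a MATCH expression.

```typescript
// Split a natural-language query into lowercase alphanumeric tokens.
function tokenize(query: string): string[] {
  return query.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// Build an FTS5 MATCH expression. Each token is double-quoted so that words
// like "and" are treated as literal terms rather than operators.
function buildMatchExpr(tokens: string[], mode: "AND" | "OR"): string {
  return tokens.map((t) => `"${t}"`).join(` ${mode} `);
}

// Try the strict AND query first; fall back to OR only if nothing matched.
// runMatch is a hypothetical callback that executes the MATCH expression
// and returns matching symbol ids.
function searchWithFallback(
  query: string,
  runMatch: (expr: string) => string[],
): string[] {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  const strict = runMatch(buildMatchExpr(tokens, "AND"));
  if (strict.length > 0) return strict;
  return runMatch(buildMatchExpr(tokens, "OR"));
}

// In-memory stand-in for the index: one symbol and the tokens of its indexed text.
const indexedTokens = new Set(["cir", "model", "get", "row", "public", "function", "query", "input"]);
const fakeRunMatch = (expr: string): string[] => {
  const isOr = expr.includes(" OR ");
  const terms = expr.split(/ AND | OR /).map((t) => t.replace(/"/g, ""));
  const matched = isOr
    ? terms.some((t) => indexedTokens.has(t))
    : terms.every((t) => indexedTokens.has(t));
  return matched ? ["CIR_Model::get_row"] : [];
};

console.log(
  searchWithFallback("execute parameterized query and return single database row", fakeRunMatch),
);
// [ 'CIR_Model::get_row' ]
```

With the indexed tokens taken from the `CIR_Model::get_row` example in Gap 1, the strict pass fails on `execute` and the fallback matches on `query` and `row`. A real implementation would also need a relevance cutoff, since OR-matching on common words can return many weak hits.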
+---
+## 3. Gap 2 — Symbol Coverage: PureContext 4,291 vs jCodeMunch 21,984
+### Root cause
+The 5× coverage gap has two distinct causes (A and B), and the comparison itself is skewed by a reporting mismatch (C):
+**A. File scope difference**
+| Tool | Files indexed | Files skipped | Reason for skips |
+|------|--------------|---------------|-----------------|
+| PureContext | 565 (of 2,656 eligible) | 2,091 | Incremental: unchanged since last full index |
+| jCodeMunch | 824 | 4,167 gitignore + 756 wrong extension | Fresh index, stricter gitignore |
+Note: PureContext counts are misleading here because the "incremental" run only re-processed changed files. The first full index found 2,661 files. jCodeMunch's 824 is a genuinely smaller set because it applies stricter gitignore rules and skips more extension types (no Twig templates, no SCSS as code, etc.).
+**B. Symbol extraction depth**
+| Metric | PureContext | jCodeMunch |
+|--------|-------------|------------|
+| Symbols per file (PHP) | ~7.5 | ~34 |
+| What is extracted | class, method, function, const, interface | Same + local variables, constants, imports, inline lambdas, config keys |
+| Function body | Not indexed | Indexed for search |
+jCodeMunch extracts ~4.5× as many symbols per PHP file (~34 vs ~7.5). This is because it indexes finer-grained constructs: local variable assignments, inline anonymous functions, config array keys, and possibly PHP `define()` constants, which PureContext does not extract (it records `const` symbols only for class constants).
+**C. Dim 3 table is not apples-to-apples**
+PureContext reports by **symbol kind** (class/method/function/const). jCodeMunch reports by **language** (php/javascript/css). This made the side-by-side table in the benchmark report misleading — jCodeMunch's 21,984 includes 638 PHP *files* as entries, not just their symbols. The actual comparable number may be lower.
+**Action required:** Re-run jCodeMunch with `detail_level: 'full'` on a known file and count actual distinct named symbols to get a fair apples-to-apples coverage number.
+### Fix options (ranked by effort vs impact)
+1. **Index function body content for FTS5** (already listed under Gap 1 — fixes both gaps)
+   Body content doesn't increase `symbol_count` but dramatically improves searchability.
+2. **Extract PHP class properties and constants** (medium effort, high value)
+   PureContext's PHP handler currently extracts classes, methods, functions, and class constants. PHP `define()` constants and class property declarations with PHPDoc are not extracted. Adding them would increase symbol density significantly.
+3. **Extract anonymous functions and closures** (low effort)
+   PHP closures assigned to variables (e.g., `$handler = function($req) {...}`) are common in CI3 hooks and route definitions. Treating them as `function` symbols with the variable name would add meaningful symbols.
+4. **Index PHP config arrays as structured data** (medium effort, niche value)
+   CodeIgniter stores configuration in PHP arrays (routes.php, config.php). Treating array keys as indexed entries (similar to the OpenAPI/dbt adapters) would let agents answer questions such as "what is the base URL config key".
+5. **Re-check PHP handler symbol kinds** (low effort, quick win)
+   Audit what `phpHandler.extractSymbols()` currently emits vs what jCodeMunch finds in the same files. Run both on `application/core/CIR_Model.php` and compare symbol lists. Any symbols missing from PureContext's list are extraction gaps in the PHP handler.
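As a rough illustration of fix option 3, the sketch below finds PHP closures assigned to variables and emits them as `function` symbols named after the variable. It uses a regular expression purely for brevity; a real handler would more likely walk the parser's syntax tree. `ClosureSymbol` and `extractAssignedClosures` are hypothetical names, not part of the PHP handler.

```typescript
// Hypothetical sketch: a regex approximation, not the PHP handler's real logic.
interface ClosureSymbol {
  name: string;      // variable name without the leading "$"
  kind: "function";  // reported as a regular function symbol
  line: number;      // 1-based line where the assignment starts
}

function extractAssignedClosures(phpSource: string): ClosureSymbol[] {
  // Matches `$name = function (` and `$name = static function (`.
  const pattern = /\$([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(?:static\s+)?function\s*\(/g;
  const symbols: ClosureSymbol[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(phpSource)) !== null) {
    // Count newlines before the match to derive the line number.
    const line = phpSource.slice(0, m.index).split("\n").length;
    symbols.push({ name: m[1], kind: "function", line });
  }
  return symbols;
}
```

A regex like this will also match text inside comments or strings, which is one reason an AST-based implementation would be preferable in practice.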
+---
+## 4. Token Efficiency — PureContext wins, but context matters
+PureContext achieved 1,060× average compression vs 696× for jCodeMunch. However, jCodeMunch's compression is still excellent (99.8% reduction). The difference comes from:
+- jCodeMunch returns more results per search (richer symbol set = more candidates shown)
+- jCodeMunch's `get_symbol_source` response includes docstrings, hash, line range metadata (~414 tokens per symbol vs ~250 for PureContext)
+- The compact `#MUNCH/1` format does not apply here: jCodeMunch uses it for search results, but `get_symbol_source` returns full JSON
+**Implication:** The token efficiency gap is not a deficiency in either tool — it reflects different trade-offs between completeness and compression.
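To put the two compression ratios on the same scale as the quoted percentage, a ratio of N× corresponds to a token reduction of 1 − 1/N. The helper below is an illustrative calculation only; `reductionPercent` is a hypothetical name.

```typescript
/** Convert a compression ratio (e.g. 696 for "696×") into a percent reduction. */
function reductionPercent(compressionRatio: number): number {
  if (compressionRatio <= 0) {
    throw new RangeError("compression ratio must be positive");
  }
  return (1 - 1 / compressionRatio) * 100;
}

const jCodeMunchReduction = reductionPercent(696);   // about 99.86
const pureContextReduction = reductionPercent(1060); // about 99.91
```

Seen as percentages, the two tools differ by roughly 0.05 percentage points, which supports the reading that both compress well and the ratio gap reflects response richness rather than a deficiency.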
+---
+## 5. Indexing Speed
+- PureContext: **193 files/sec** (incremental run; first full index was ~419 files/sec)
+- jCodeMunch: **106 files/sec** (fresh full index, 824 files in 8.8s)
+PureContext is faster at indexing. The incremental hash-based approach means re-indexing a large repo after small changes is near-instant.
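The incremental behavior can be pictured as a hash comparison per file. The sketch below is an assumption about how such a check might look, not PureContext's implementation: the choice of SHA-256, the in-memory maps, and the names `contentHash` and `selectChangedFiles` are all illustrative.

```typescript
import { createHash } from "node:crypto";

/** Hex digest of a file's content (SHA-256 is an assumed choice). */
function contentHash(content: string): string {
  return createHash("sha256").update(content).digest("hex");
}

/**
 * Return the paths whose content hash differs from the stored hash,
 * including files that have never been indexed.
 */
function selectChangedFiles(
  files: Map<string, string>,        // path -> current content
  storedHashes: Map<string, string>, // path -> hash from the last index run
): string[] {
  const changed: string[] = [];
  files.forEach((content, path) => {
    if (storedHashes.get(path) !== contentHash(content)) {
      changed.push(path);
    }
  });
  return changed;
}
```

Only the returned paths need to be re-parsed, which is why a run after small changes touches a fraction of the repository.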
+---
+## 6. What the Ground Truth Revealed About Query Design
+The 25 ground-truth queries were written as long natural-language sentences. This is representative of how AI agents actually query code navigation tools. Key learnings:
+- Short queries with the exact symbol name (e.g. "insert_row") work well for both tools
+- Long natural-language queries without semantic search work poorly for BOTH tools on undocumented PHP
+- jCodeMunch's advantage comes from indexing function body content, not from better NLP
+- Enabling semantic search in PureContext (already implemented, but off by default) would likely flip the Dim 2 winner even on this project
+**For future benchmarks:** Keep the long natural-language format — it's the realistic test case. Do NOT shorten queries to keyword fragments just to make PureContext score higher; that would misrepresent real agent usage.
+---
+## 7. Phase 34 Post-Mortem (2026-05-15)
+Phase 34 implemented body snippet indexing (first ~200 bytes of function/method bodies into FTS5) and re-measured P@1. Result: **still 0%**.
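For reference, P@1 is the fraction of queries whose top-ranked result is the expected symbol. The sketch below shows that calculation under assumed shapes: `GroundTruthQuery`, `RankedSearch`, and `precisionAt1` are hypothetical, and the actual layout of `ground-truth.json` may differ.

```typescript
// Hypothetical shapes: the real ground-truth file format is not shown here.
interface GroundTruthQuery {
  query: string;          // natural-language query text
  expectedSymbol: string; // symbol that should rank first
}

/** Stand-in for a search call that returns symbol names, best match first. */
type RankedSearch = (query: string) => string[];

/** Fraction of queries whose first result equals the expected symbol. */
function precisionAt1(queries: GroundTruthQuery[], search: RankedSearch): number {
  if (queries.length === 0) return 0;
  let hits = 0;
  for (const q of queries) {
    if (search(q.query)[0] === q.expectedSymbol) hits++;
  }
  return hits / queries.length;
}
```

A search that returns no results for every query scores 0 under this definition, which matches the 0% outcome reported above.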
+### Root cause — FTS5 AND semantics
+Body snippets ARE correctly indexed and searchable (21 unit tests pass). The blocker is that FTS5's default query mode is strict AND: every token in the query must appear in the document for it to be returned.
+All 25 ground-truth queries are natural-language sentences containing English connectives ("and", "with", "from", "into", "as", "by", "on", "for") that never appear as tokens in PHP code. Example:
+- Query: `"execute parameterized query and return single database row"`
+- FTS needs ALL of: `execute`, `parameterized`, `query`, `and`, `return`, `single`, `database`, `row`
+- `get_row` body has: `query` ✓, `row` ✓, `return` ✓ — but `execute`, `parameterized`, `and`, `single`, `database` are absent → zero results
+Body snippets bring body tokens into the FTS index correctly. The AND-semantics prevent these tokens from being used for ranking because the English connectives act as hard filters that guarantee zero matches.
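The example above can be reproduced with plain token sets. This is a simplified model of the matching step only: real FTS5 tokenization and BM25 scoring are more involved, and `matchesAll` and `overlapCount` are hypothetical helper names.

```typescript
/** AND semantics: the document matches only if every query token is present. */
function matchesAll(queryTokens: string[], docTokens: Set<string>): boolean {
  return queryTokens.every((t) => docTokens.has(t));
}

/** OR semantics, crudely scored: count how many query tokens are present. */
function overlapCount(queryTokens: string[], docTokens: Set<string>): number {
  return queryTokens.filter((t) => docTokens.has(t)).length;
}

// Tokens of the example query.
const queryTokens = [
  "execute", "parameterized", "query", "and",
  "return", "single", "database", "row",
];

// Tokens the example lists as present in the `get_row` body.
const getRowBody = new Set(["query", "row", "return"]);

const andMatch = matchesAll(queryTokens, getRowBody); // false: 5 tokens are absent
const orScore = overlapCount(queryTokens, getRowBody); // 3: query, return, row
```

Under AND semantics `get_row` is filtered out before ranking ever happens; under OR semantics it survives with three matching tokens and can be ranked against other candidates.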
+### Why the P@1 ≥ 20% target was not reached in Phase 34
+The ≥20% target requires OR-mode fallback (Phase 37) to be effective. Without OR-fallback:
+- AND mode: all 25 queries fail because they contain English connectives absent from code
+- LIKE fallback: also fails (multi-word natural language strings never match symbol names)
+Phase 34 is a necessary prerequisite for Phase 37. Once Phase 37 implements OR-fallback (retry the FTS query in OR mode when AND returns zero results), the body snippet tokens will be usable for BM25 ranking and P@1 should climb to ≥20%.
+### What DID improve in Phase 34
+- Body snippet content is now in the FTS index for all functions and methods (PHP + TypeScript)
+- The benchmark harness was updated to use FTS+bodySnippets for Dim 2 (previously used LIKE-only)
+- 21 unit tests verify the extraction and FTS integration end-to-end
+---
+## 8. Priority Action Items
+| Priority | Gap | Fix | Effort |
+|----------|-----|-----|--------|
+| P0 | Search quality (0%) | OR-fallback when FTS AND returns zero results (Phase 37) | Low |
+| P0 | Search quality (0%) | Enable local ONNX semantic search by default | Low (already built) |
+| P1 | Coverage (5× gap) | Audit PHP handler vs jCodeMunch on same file | Low |
+| P1 | Coverage | Extract PHP `define()` constants + class properties | Medium |
+| P2 | Coverage | Extract PHP closures assigned to variables | Low |
+| P2 | Dim 3 accuracy | Fix apples-to-apples coverage comparison in harness | Low |
+---
+## 9. Files
+| File | Description |
+|------|-------------|
+| `benchmarks/harness/run_benchmark.ts` | Dual comparison harness |
+| `benchmarks/eu-za-tebe/tasks.json` | 5 keyword queries for Dim 1 |
+| `benchmarks/eu-za-tebe/ground-truth.json` | 25 natural-language queries for Dim 2 |
+| `benchmarks/eu-za-tebe/results/purecontext.json` | Raw PureContext results |
+| `benchmarks/eu-za-tebe/results/jcodemunch.json` | Raw jCodeMunch results |
+| `benchmarks/eu-za-tebe/results/comparison.md` | Generated side-by-side report |