npm - @tobilu/qmd - Versions diffs - 0.9.0 → 1.0.5 - Mend

@tobilu/qmd 0.9.0 → 1.0.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (24)

package/CHANGELOG.md CHANGED

@@ -1,34 +1,302 @@
 # Changelog
-All notable changes to QMD will be documented in this file.
+## [Unreleased]
+## [1.0.5] - 2026-02-16
+The npm package now ships compiled JavaScript instead of raw TypeScript,
+removing the `tsx` runtime dependency. A new `/release` skill automates the
+full release workflow with changelog validation and git hook enforcement.
+### Changes
+- Build: compile TypeScript to `dist/` via `tsc` so the npm package no longer
+  requires `tsx` at runtime. The `qmd` shell wrapper now runs `dist/qmd.js`
+  directly.
+- Release tooling: new `/release` skill that manages the full release
+  lifecycle — validates changelog, installs git hooks, previews release notes,
+  and cuts the release. Auto-populates `[Unreleased]` from git history when
+  empty.
+- Release tooling: `scripts/extract-changelog.sh` extracts cumulative notes
+  for the full minor series (e.g. 1.0.0 through 1.0.5) for GitHub releases.
+  Includes `[Unreleased]` content in previews.
+- Release tooling: `scripts/release.sh` renames `[Unreleased]` to a versioned
+  heading and inserts a fresh empty `[Unreleased]` section automatically.
+- Release tooling: pre-push git hook blocks `v*` tag pushes unless
+  `package.json` version matches the tag, a changelog entry exists, and CI
+  passed on GitHub.
+- Publish workflow: GitHub Actions now builds TypeScript, creates a GitHub
+  release with cumulative notes extracted from the changelog, and publishes
+  to npm with provenance.
+## [1.0.0] - 2026-02-15
+QMD now runs on both Node.js and Bun, with up to 2.7x faster reranking
+through parallel GPU contexts. GPU auto-detection replaces the unreliable
+`gpu: "auto"` with explicit CUDA/Metal/Vulkan probing.
+### Changes
+- Runtime: support Node.js (>=22) alongside Bun via a cross-runtime SQLite
+  abstraction layer (`src/db.ts`). `bun:sqlite` on Bun, `better-sqlite3` on
+  Node. The `qmd` wrapper auto-detects a suitable Node.js install via PATH,
+  then falls back to mise, asdf, nvm, and Homebrew locations.
+- Performance: parallel embedding & reranking via multiple LlamaContext
+  instances — up to 2.7x faster on multi-core machines.
+- Performance: flash attention for ~20% less VRAM per reranking context,
+  enabling more parallel contexts on GPU.
+- Performance: right-sized reranker context (40960 → 2048 tokens, 17x less
+  memory) since chunks are capped at ~900 tokens.
+- Performance: adaptive parallelism — context count computed from available
+  VRAM (GPU) or CPU math cores rather than hardcoded.
+- GPU: probe for CUDA, Metal, Vulkan explicitly at startup instead of
+  relying on node-llama-cpp's `gpu: "auto"`. `qmd status` shows device info.
+- Tests: reorganized into flat `test/` directory with vitest for Node.js and
+  bun test for Bun. New `eval-bm25` and `store.helpers.unit` suites.
+### Fixes
+- Prevent VRAM waste from duplicate context creation during concurrent
+  `embedBatch` calls — initialization lock now covers the full path.
+- Collection-aware FTS filtering so scoped keyword search actually restricts
+  results to the requested collection.
 ## [0.9.0] - 2026-02-15
-Initial public release.
+First published release on npm as `@tobilu/qmd`. MCP HTTP transport with
+daemon mode cuts warm query latency from ~16s to ~10s by keeping models
+loaded between requests.
+### Changes
+- MCP: HTTP transport with daemon lifecycle — `qmd mcp --http --daemon`
+  starts a background server, `qmd mcp stop` shuts it down. Models stay warm
+  in VRAM between queries. #149 (thanks @igrigorik)
+- Search: type-routed query expansion preserves lex/vec/hyde type info and
+  routes to the appropriate backend. Eliminates ~4 wasted backend calls per
+  query (10.0 → 6.0 calls, 1278ms → 549ms). #149 (thanks @igrigorik)
+- Search: unified pipeline — extracted `hybridQuery()` and
+  `vectorSearchQuery()` to `store.ts` so CLI and MCP share identical logic.
+  Fixes a class of bugs where results differed between the two. #149 (thanks
+  @igrigorik)
+- MCP: dynamic instructions generated at startup from actual index state —
+  LLMs see collection names, doc counts, and content descriptions. #149
+  (thanks @igrigorik)
+- MCP: tool renames (vsearch → vector_search, query → deep_search) with
+  rewritten descriptions for better tool selection. #149 (thanks @igrigorik)
+- Integration: Claude Code plugin with inline status checks and MCP
+  integration. #99 (thanks @galligan)
+### Fixes
+- BM25 score normalization — formula was inverted (`1/(1+|x|)` instead of
+  `|x|/(1+|x|)`), so strong matches scored *lowest*. Broke `--min-score`
+  filtering and made the "strong signal" short-circuit dead code. #76 (thanks
+  @dgilperez)
+- Normalize Unicode paths to NFC for macOS compatibility. #82 (thanks
+  @c-stoeckl)
+- Handle dense content (code) that tokenizes beyond expected chunk size.
+- Proper cleanup of Metal GPU resources on process exit.
+- SQLite-vec readiness verification after extension load.
+- Reactivate deactivated documents on re-index instead of creating duplicates.
+- Bun UTF-8 path corruption workaround for non-ASCII filenames.
+- Disable following symlinks in glob.scan to avoid infinite loops.
+## [0.8.0] - 2026-01-28
+Fine-tuned query expansion model trained with GRPO replaces the stock Qwen3
+0.6B. The training pipeline scores expansions on named entity preservation,
+format compliance, and diversity — producing noticeably better lexical
+variations and HyDE documents.
+### Changes
+- LLM: deploy GRPO-trained (Group Relative Policy Optimization) query
+  expansion model, hosted on HuggingFace and auto-downloaded on first use.
+  Better preservation of proper nouns and technical terms in expansions.
+- LLM: `/only:lex` mode for single-type expansions — useful when you know
+  which search backend will help.
+- LLM: HyDE output moved to first position so vector search can start
+  embedding while other expansions generate.
+- LLM: session lifecycle management via `withLLMSession()` pattern — ensures
+  cleanup even on failure, similar to database transactions.
+- Integration: org-mode title extraction support. #50 (thanks @sh54)
+- Integration: SQLite extension loading in Nix devshell. #48 (thanks @sh54)
+- Integration: AI agent discovery via skills.sh. #64 (thanks @Algiras)
+### Fixes
+- Use sequential embedding on CPU-only systems — parallel contexts caused a
+  race condition where contexts competed for CPU cores, making things slower.
+  #54 (thanks @freeman-jiang)
+- Fix `collectionName` column in vector search SQL (was still using old
+  `collectionId` from before YAML migration). #61 (thanks @jdvmi00)
+- Fix Qwen3 sampling params to prevent repetition loops — stock
+  temperature/top-p caused occasional infinite repeat patterns.
+- Add `--index` option to CLI argument parser (was documented but not wired
+  up). #84 (thanks @Tritlo)
+- Fix DisposedError during slow batch embedding. #41 (thanks @wuhup)
+## [0.7.0] - 2026-01-09
-### Features
+First community contributions. The project gained external contributors,
+surfacing bugs that only appear in diverse environments — Homebrew sqlite-vec
+paths, case-sensitive model filenames, and sqlite-vec JOIN incompatibilities.
-- **Hybrid search pipeline** — BM25 full-text + vector similarity + LLM reranking with Reciprocal Rank Fusion
-- **Smart chunking** — scored markdown break points keep sections, paragraphs, and code blocks intact (~900 tokens/chunk, 15% overlap)
-- **Query expansion** — fine-tuned Qwen3 1.7B model generates search variations for better recall
-- **Cross-encoder reranking** — Qwen3-Reranker scores candidates with position-aware blending
-- **Vector embeddings** — EmbeddingGemma 300M via node-llama-cpp, all on-device
-- **MCP server** — stdio and HTTP transports for Claude Desktop, Claude Code, and any MCP client
-- **Collection management** — index multiple directories with glob patterns
-- **Context annotations** — add descriptions to collections and paths for richer search
-- **Document IDs** — 6-char content hash for stable references across re-indexes
-- **Multi-get** — retrieve multiple documents by glob pattern, comma list, or docids
-- **Multiple output formats** — JSON, CSV, Markdown, XML, files list
-- **Claude Code plugin** — inline status checks and MCP integration
+### Changes
+- Indexing: native `realpathSync()` replaces `readlink -f` subprocess spawn
+  per file. On a 5000-file collection this eliminates 5000 shell spawns,
+  ~15% faster. #8 (thanks @burke)
+- Indexing: single-pass tokenization — chunking algorithm tokenized each
+  document twice (count then split); now tokenizes once and reuses. #9
+  (thanks @burke)
 ### Fixes
-- Handle dense content (code) that tokenizes beyond expected chunk size
-- Proper cleanup of Metal GPU resources
-- SQLite-vec readiness verification after extension load
-- Reactivate deactivated documents on re-index
-- BM25 score normalization with Math.abs
-- Bun UTF-8 path corruption workaround
+- Fix `vsearch` and `query` hanging — sqlite-vec's virtual table doesn't
+  support the JOIN pattern used; rewrote to subquery. #23 (thanks @mbrendan)
+- Fix MCP server exiting immediately after startup — process had no active
+  handles keeping the event loop alive. #29 (thanks @mostlydev)
+- Fix collection filter SQL to properly restrict vector search results.
+- Support non-ASCII filenames in collection filter.
+- Skip empty files during indexing instead of crashing on zero-length content.
+- Fix case sensitivity in Qwen3 model filename resolution. #15 (thanks
+  @gavrix)
+- Fix sqlite-vec loading on macOS with Homebrew (`BREW_PREFIX` detection).
+  #42 (thanks @komsit37)
+- Fix Nix flake to use correct `src/qmd.ts` path. #7 (thanks @burke)
+- Fix docid lookup with quotes support in get command. #36 (thanks
+  @JoshuaLelon)
+- Fix query expansion model size in documentation. #38 (thanks @odysseus0)
+## [0.6.0] - 2025-12-28
+Replaced Ollama HTTP API with node-llama-cpp for all LLM operations. Ollama
+adds convenience but also a running server dependency. node-llama-cpp loads
+GGUF models directly in-process — zero external dependencies. Models
+auto-download from HuggingFace on first use.
+### Changes
+- LLM: structured query expansion via JSON schema grammar constraints.
+  Model produces typed expansions — **lexical** (BM25 keywords), **vector**
+  (semantic rephrasings), **HyDE** (hypothetical document excerpts) — so each
+  routes to the right backend instead of sending everything everywhere.
+- LLM: lazy model loading with 2-minute inactivity auto-unload. Keeps memory
+  low when idle while avoiding ~3s model load on every query.
+- Search: conditional query expansion — when BM25 returns strong results, the
+  expensive LLM expansion is skipped entirely.
+- Search: multi-chunk reranking — documents with multiple relevant chunks
+  scored by aggregating across all chunks rather than best single chunk.
+- Search: cosine distance for vector search (was L2).
+- Search: embeddinggemma nomic-style prompt formatting.
+- Testing: evaluation harness with synthetic test documents and Hit@K metrics
+  for BM25, vector, and hybrid RRF.
+## [0.5.0] - 2025-12-13
+Collections and contexts moved from SQLite tables to YAML at
+`~/.config/qmd/index.yml`. SQLite was overkill for config — you can't share
+it, and it's opaque. YAML is human-readable and version-controllable. The
+migration was extensive (35+ commits) because every part of the system that
+touched collections or contexts had to be updated.
+### Changes
+- Config: YAML-based collections and contexts replace SQLite tables.
+  `collections` and `path_contexts` tables dropped from schema. Collections
+  support an optional `update:` command (e.g., `git pull`) before re-index.
+- CLI: `qmd collection add/list/remove/rename` commands with `--name` and
+  `--mask` glob pattern support.
+- CLI: `qmd ls` virtual file tree — list collections, files in a collection,
+  or files under a path prefix.
+- CLI: `qmd context add/list/check/rm` with hierarchical context inheritance.
+  A query to `qmd://notes/2024/jan/` inherits context from `notes/`,
+  `notes/2024/`, and `notes/2024/jan/`.
+- CLI: `qmd context add / "text"` for global context across all collections.
+- CLI: `qmd context check` audit command to find paths without context.
+- Paths: `qmd://` virtual URI scheme for portable document references.
+  `qmd://notes/ideas.md` works regardless of where the collection lives on
+  disk. Works in `get`, `multi-get`, `ls`, and context commands.
+- CLI: document IDs (docid) — first 6 chars of content hash for stable
+  references. Shown as `#abc123` in search results, usable with `get` and
+  `multi-get`.
+- CLI: `--line-numbers` flag for get command output.
+## [0.4.0] - 2025-12-10
+MCP server for AI agent integration. Without it, agents had to shell out to
+`qmd search` and parse CLI output. The monolithic `qmd.ts` (1840 lines) was
+split into focused modules with the project's first test suite (215 tests).
+### Changes
+- MCP: stdio server with tools for search, vector search, hybrid query,
+  document retrieval, and status. Runs over stdio transport for Claude
+  Desktop and MCP clients.
+- MCP: spec-compliant with June 2025 MCP specification — removed non-spec
+  `mimeType`, added `isError: true` to errors, `structuredContent` for
+  machine-readable results, proper URI encoding.
+- MCP: simplified tool naming (`qmd_search` → `search`) since MCP already
+  namespaces by server.
+- Architecture: extract `store.ts` (1221 LOC), `llm.ts` (539 LOC),
+  `formatter.ts` (359 LOC), `mcp.ts` (503 LOC) from monolithic `qmd.ts`.
+- Testing: 215 tests (store: 96, llm: 60, mcp: 59) with mocked Ollama for
+  fast, deterministic runs. Before this: zero tests.
+## [0.3.0] - 2025-12-08
+Document chunking for vector search. A 5000-word document about many topics
+gets a single embedding that averages everything together, matching poorly for
+specific queries. Chunking produces one embedding per ~900-token section with
+focused semantic signal.
+### Changes
+- Search: markdown-aware chunking — prefers heading boundaries, then paragraph
+  breaks, then sentence boundaries. 15% overlap between chunks ensures
+  cross-boundary queries still match.
+- Search: multi-chunk scoring bonus (+0.02 per additional chunk, capped at
+  +0.1 for 5+ chunks). Documents relevant in multiple sections rank higher.
+- CLI: display paths show collection-relative paths and extracted titles
+  (from H1 headings or YAML frontmatter) instead of raw filesystem paths.
+- CLI: `--all` flag returns all matches (use with `--min-score` to filter).
+- CLI: byte-based progress bar with ETA for `embed` command.
+- CLI: human-readable time formatting ("15m 4s" instead of "904.2s").
+- CLI: documents >64KB truncated with warning during embedding.
+## [0.2.0] - 2025-12-08
+### Changes
+- CLI: `--json`, `--csv`, `--files`, `--md`, `--xml` output format flags.
+  `--json` for programmatic access, `--files` for piping, `--md`/`--xml` for
+  LLM consumption, `--csv` for spreadsheets.
+- CLI: `qmd status` shows index health — document count, size, embedding
+  coverage, time since last update.
+- Search: weighted RRF — original query gets 2x weight relative to expanded
+  queries since the user's actual words are a more reliable signal.
+## [0.1.0] - 2025-12-07
+Initial implementation. Built in a single day for searching personal markdown
+notes, journals, and meeting transcripts.
+### Changes
+- Search: SQLite FTS5 with BM25 ranking. Chose SQLite over Elasticsearch
+  because QMD is a personal tool — single binary, no server dependencies.
+- Search: sqlite-vec for vector similarity. Same rationale: in-process, no
+  external vector database.
+- Search: Reciprocal Rank Fusion to combine BM25 and vector results. RRF is
+  parameter-free and handles missing signals gracefully.
+- LLM: Ollama for embeddings, reranking, and query expansion. Later replaced
+  with node-llama-cpp in 0.6.0.
+- CLI: `qmd add`, `qmd embed`, `qmd search`, `qmd vsearch`, `qmd query`,
+  `qmd get`. ~1800 lines of TypeScript in a single `qmd.ts` file.
-[0.9.0]: https://github.com/tobi/qmd/releases/tag/v0.9.0
+[Unreleased]: https://github.com/tobi/qmd/compare/v1.0.0...HEAD
+[1.0.0]: https://github.com/tobi/qmd/releases/tag/v1.0.0
+[0.9.0]: https://github.com/tobi/qmd/compare/v0.8.0...v0.9.0
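
Review note on the BM25 normalization fix listed under 0.9.0 in the changelog above: the entry says the formula was inverted (`1/(1+|x|)` instead of `|x|/(1+|x|)`). The sketch below only illustrates why the inverted form ranks strong matches lowest. It assumes raw scores come from SQLite FTS5's `bm25()`, which returns negative numbers where a more negative value means a better match. The function names `invertedScore` and `normalizedScore` are hypothetical and are not taken from the package.

```typescript
// Hypothetical helpers illustrating the 0.9.0 fix; names are not from the package.

// Inverted form described in the changelog: a large |x| (strong match) scores lowest.
function invertedScore(raw: number): number {
  return 1 / (1 + Math.abs(raw));
}

// Corrected form: |x| / (1 + |x|) maps strong matches toward 1 and weak ones toward 0.
function normalizedScore(raw: number): number {
  const magnitude = Math.abs(raw);
  return magnitude / (1 + magnitude);
}

const strong = -9; // assumed raw score of a strong match
const weak = -1;   // assumed raw score of a weak match

console.log(normalizedScore(strong) > normalizedScore(weak)); // true
console.log(invertedScore(strong) > invertedScore(weak));     // false
```

With the inverted form, a `--min-score` threshold would filter out the best matches first, which is consistent with the breakage the changelog entry describes.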

package/README.md CHANGED

@@ -6,18 +6,26 @@ QMD combines BM25 full-text search, vector semantic search, and LLM re-ranking
 ![QMD Architecture](assets/qmd-architecture.png)
+You can read more about QMD's progress in the [CHANGELOG](CHANGELOG.md).
 ## Quick Start
 ```sh
-# Install globally
+# Install globally (Node or Bun)
+npm install -g @tobilu/qmd
+# or
 bun install -g @tobilu/qmd
+# Or run directly
+npx @tobilu/qmd ...
+bunx @tobilu/qmd ...
 # Create collections for your notes, docs, and meeting transcripts
 qmd collection add ~/notes --name notes
 qmd collection add ~/Documents/meetings --name meetings
 qmd collection add ~/work/docs --name docs
-# Add context to help with search results
+# Add context to help with search results, each piece of context will be returned when matching sub documents are returned. This works as a tree. This is the key feature of QMD as it allows LLMs to make much better contextual choices when selecting documents. Don't sleep on it!
 qmd context add qmd://notes "Personal notes and ideas"
 qmd context add qmd://meetings "Meeting transcripts and notes"
 qmd context add qmd://docs "Work documentation"
@@ -231,6 +239,7 @@ The `query` command uses **Reciprocal Rank Fusion (RRF)** with position-aware bl
 ### System Requirements
+- **Node.js** >= 22
 - **Bun** >= 1.0.0
 - **macOS**: Homebrew SQLite (for extension support)
   ```sh
@@ -252,18 +261,18 @@ Models are downloaded from HuggingFace and cached in `~/.cache/qmd/models/`.
 ## Installation
 ```sh
+npm install -g @tobilu/qmd
+# or
 bun install -g @tobilu/qmd
 ```
-Make sure `~/.bun/bin` is in your PATH.
 ### Development
 ```sh
 git clone https://github.com/tobi/qmd
 cd qmd
-bun install
-bun link
+npm install
+npm link
 ```
 ## Usage

package/dist/collections.d.ts ADDED

@@ -0,0 +1,115 @@
+/**
+ * Collections configuration management
+ *
+ * This module manages the YAML-based collection configuration at ~/.config/qmd/index.yml.
+ * Collections define which directories to index and their associated contexts.
+ */
+/**
+ * Context definitions for a collection
+ * Key is path prefix (e.g., "/", "/2024", "/Board of Directors")
+ * Value is the context description
+ */
+export type ContextMap = Record<string, string>;
+/**
+ * A single collection configuration
+ */
+export interface Collection {
+    path: string;
+    pattern: string;
+    context?: ContextMap;
+    update?: string;
+}
+/**
+ * The complete configuration file structure
+ */
+export interface CollectionConfig {
+    global_context?: string;
+    collections: Record<string, Collection>;
+}
+/**
+ * Collection with its name (for return values)
+ */
+export interface NamedCollection extends Collection {
+    name: string;
+}
+/**
+ * Set the current index name for config file lookup
+ * Config file will be ~/.config/qmd/{indexName}.yml
+ */
+export declare function setConfigIndexName(name: string): void;
+/**
+ * Load configuration from ~/.config/qmd/index.yml
+ * Returns empty config if file doesn't exist
+ */
+export declare function loadConfig(): CollectionConfig;
+/**
+ * Save configuration to ~/.config/qmd/index.yml
+ */
+export declare function saveConfig(config: CollectionConfig): void;
+/**
+ * Get a specific collection by name
+ * Returns null if not found
+ */
+export declare function getCollection(name: string): NamedCollection | null;
+/**
+ * List all collections
+ */
+export declare function listCollections(): NamedCollection[];
+/**
+ * Add or update a collection
+ */
+export declare function addCollection(name: string, path: string, pattern?: string): void;
+/**
+ * Remove a collection
+ */
+export declare function removeCollection(name: string): boolean;
+/**
+ * Rename a collection
+ */
+export declare function renameCollection(oldName: string, newName: string): boolean;
+/**
+ * Get global context
+ */
+export declare function getGlobalContext(): string | undefined;
+/**
+ * Set global context
+ */
+export declare function setGlobalContext(context: string | undefined): void;
+/**
+ * Get all contexts for a collection
+ */
+export declare function getContexts(collectionName: string): ContextMap | undefined;
+/**
+ * Add or update a context for a specific path in a collection
+ */
+export declare function addContext(collectionName: string, pathPrefix: string, contextText: string): boolean;
+/**
+ * Remove a context from a collection
+ */
+export declare function removeContext(collectionName: string, pathPrefix: string): boolean;
+/**
+ * List all contexts across all collections
+ */
+export declare function listAllContexts(): Array<{
+    collection: string;
+    path: string;
+    context: string;
+}>;
+/**
+ * Find best matching context for a given collection and path
+ * Returns the most specific matching context (longest path prefix match)
+ */
+export declare function findContextForPath(collectionName: string, filePath: string): string | undefined;
+/**
+ * Get the config file path (useful for error messages)
+ */
+export declare function getConfigPath(): string;
+/**
+ * Check if config file exists
+ */
+export declare function configExists(): boolean;
+/**
+ * Validate a collection name
+ * Collection names must be valid and not contain special characters
+ */
+export declare function isValidCollectionName(name: string): boolean;
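
Review note on `findContextForPath`: the declaration above documents a "longest path prefix match" but the diff shows only the type signature. The sketch below is one way that rule could work against the `ContextMap` type. It assumes prefixes match on whole path segments and that `"/"` matches every path. The helper name `longestPrefixContext`, the normalization step, and the sample context strings are hypothetical and are not the package's implementation.

```typescript
type ContextMap = Record<string, string>;

// Hypothetical helper: returns the context whose path prefix is the longest
// whole-segment match for filePath, or undefined when nothing matches.
function longestPrefixContext(contexts: ContextMap, filePath: string): string | undefined {
  // Assumption: compare paths with a leading slash and no trailing slash.
  const normalize = (p: string): string => {
    const withLead = p.startsWith("/") ? p : "/" + p;
    return withLead.length > 1 && withLead.endsWith("/") ? withLead.slice(0, -1) : withLead;
  };

  const target = normalize(filePath);
  let best: { prefix: string; text: string } | undefined;

  for (const [rawPrefix, text] of Object.entries(contexts)) {
    const prefix = normalize(rawPrefix);
    // "/" matches everything; otherwise require an exact match or a segment boundary.
    const matches = prefix === "/" || target === prefix || target.startsWith(prefix + "/");
    if (matches && (best === undefined || prefix.length > best.prefix.length)) {
      best = { prefix, text };
    }
  }
  return best?.text;
}

// Sample data for illustration only.
const contexts: ContextMap = {
  "/": "Personal notes and ideas",
  "/2024": "Notes written in 2024",
};

console.log(longestPrefixContext(contexts, "/2024/jan/plan.md")); // "Notes written in 2024"
console.log(longestPrefixContext(contexts, "/2023/plan.md"));     // "Personal notes and ideas"
console.log(longestPrefixContext(contexts, "/20240101.md"));      // "Personal notes and ideas"
```

Matching on segment boundaries keeps `/2024` from claiming `/20240101.md`. Whether the package applies the same boundary rule cannot be confirmed from the declaration file alone.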