obedding 1.0.1 → 1.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,229 @@
# Architecture

## System Overview

obedding follows a modular pipeline architecture with pluggable embedding backends:

```
┌────────────────────────────────────────┐
│              obedding CLI              │
│  ┌──────────────────────────────────┐  │
│  │   Indexer │ Scanner │ Search     │  │
│  └──────────────────────────────────┘  │
└───────────────────────────┬────────────┘
                            │
┌───────────────────────────┴────────────┐
│            Storage Manager             │
│          (JSON File Storage)           │
└───────────────────────────┬────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        │                   │                   │
┌───────▼───────┐  ┌────────▼────────┐  ┌───────▼───────┐
│   LM Studio   │  │     Ollama      │  │      MLX      │
│   (DEFAULT)   │  │                 │  │  (NOT REC.)   │
│  localhost:   │  │   localhost:    │  │  localhost:   │
│     :1234     │  │     :11434      │  │    :28100     │
│  GGUF models  │  │  Native models  │  │  MLX models   │
│   1024 dims   │  │    768 dims     │  │   2048 dims   │
└───────────────┘  └─────────────────┘  └───────────────┘
```

## Core Components

### 1. Scanner (`src/scanner.ts`)

**Responsibilities:**
- Recursively discover markdown files in the vault
- Parse YAML frontmatter for metadata
- Extract metadata from the directory structure (`Projects/{type}/{repo}/{context}/`)
- Generate SHA-256 content hashes for incremental updates

**Key Functions:**
- `scanObsidianVault()` - Discover all `.md` files
- `extractMetadata()` - Parse frontmatter and path structure
- `getNoteHash()` - Generate a content hash for change detection

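The frontmatter parsing step can be sketched as follows. This is illustrative only: `parseFrontmatter` is a hypothetical helper, not the actual `extractMetadata()` code, and it handles only flat `key: value` pairs where a real implementation would use a proper YAML parser.

```typescript
// Minimal sketch: extract YAML-style frontmatter delimited by `---` lines.
interface Frontmatter {
  [key: string]: string | string[];
}

function parseFrontmatter(markdown: string): { meta: Frontmatter; body: string } {
  const match = markdown.match(/^---\n([\s\S]*?)\n---\n?/);
  if (!match) return { meta: {}, body: markdown };

  const meta: Frontmatter = {};
  for (const line of match[1].split("\n")) {
    const idx = line.indexOf(":");
    if (idx === -1) continue;
    const key = line.slice(0, idx).trim();
    const value = line.slice(idx + 1).trim();
    // Treat comma-separated values (e.g. tags) as arrays.
    meta[key] = value.includes(",") ? value.split(",").map((v) => v.trim()) : value;
  }
  return { meta, body: markdown.slice(match[0].length) };
}
```
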
### 2. Preprocessor (`src/preprocessor.ts`)

**Responsibilities:**
- Remove markdown structural elements (`#`/`##`/`###` headers, bullets, dividers)
- Extract metadata as keywords (title, tags, repo, context)
- Normalize text for better embedding quality

**Why Preprocessing Matters:**
Embedding models can be overly sensitive to text *structure* rather than semantic content. By removing markdown formatting, we help the model focus on actual meaning.

**Key Functions:**
- `preprocessNote()` - Clean and normalize note content
- `checkEmbeddingVariance()` - Detect low-quality embeddings

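As an illustration of this kind of cleanup (a sketch of the idea, not the actual `preprocessNote()` implementation):

```typescript
// Sketch of structural cleanup: strip heading markers, bullets, and
// horizontal rules so the embedding model sees prose, not markup.
function stripMarkdownStructure(text: string): string {
  return text
    .replace(/^#{1,6}\s+/gm, "")                 // heading markers (# ## ###)
    .replace(/^\s*[-*+]\s+/gm, "")               // bullet markers
    .replace(/^\s*(?:---|\*\*\*|___)\s*$/gm, "") // divider lines
    .replace(/\n{3,}/g, "\n\n")                  // collapse leftover blank runs
    .trim();
}
```
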
### 3. Backend Clients

All backends implement a consistent interface:

```typescript
interface EmbeddingBackend {
  checkConnection(): Promise<boolean>
  generateEmbedding(text: string, model?: string): Promise<number[]>
  generateEmbeddings(texts: string[], model?: string, onProgress?: Function): Promise<number[][]>
  getModelInfo(model?: string): Promise<ModelInfo | null>
}
```

#### LM Studio (`src/lmstudio.ts`) - **DEFAULT**
- **API:** OpenAI-compatible at `http://localhost:1234`
- **Model:** `text-embedding-qwen3-embedding-0.6b`
- **Dimensions:** 1024
- **Format:** GGUF models

#### Ollama (`src/ollama.ts`)
- **API:** Ollama API at `http://localhost:11434`
- **Model:** `qwen3-embedding:0.6b`
- **Dimensions:** 768
- **Format:** Native Ollama models

#### MLX (`src/mlx.ts`) - **NOT RECOMMENDED**
- **API:** Swama MLX at `http://localhost:28100`
- **Model:** `mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ`
- **Dimensions:** 2048
- **Known Issue:** Produces identical embeddings for different content with similar structure

### 4. Storage Manager (`src/storage.ts`)

**Responsibilities:**
- JSON file storage at `~/.claude/data/obsidian-embeddings.json`
- Content-based change detection via hash comparison
- Upsert operations for note updates
- Metadata tracking (model, dimensions, indexed_at)

**Storage Schema:**
```typescript
interface EmbeddingStore {
  version: string;        // Storage format version
  model: string;          // Which model generated the embeddings
  dimensions: number;     // Embedding vector size
  indexed_at: string;     // Last index timestamp
  notes: NoteEmbedding[];
}

interface NoteEmbedding {
  path: string;           // Relative path from vault root
  vault_path: string;     // Absolute path
  embedding: number[];    // Vector embedding
  metadata: {
    type?: string;        // Project type
    repo?: string;        // Repository name
    context?: string;     // Context/feature
    tags?: string[];      // Tags from frontmatter
    title?: string;       // Note title
  };
  excerpt: string;        // First 200 chars
  indexed_at: string;     // When this note was indexed
  hash: string;           // SHA-256 for change detection
}
```
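The hash-based upsert this schema enables can be sketched as follows. The `StoredNote` type is a cut-down view of `NoteEmbedding`, and `upsertNote` is a hypothetical helper, not the actual `src/storage.ts` code:

```typescript
// Simplified view of a stored note: only the fields the upsert needs.
interface StoredNote {
  path: string;
  hash: string;
  embedding: number[];
  indexed_at: string;
}

// Upsert: append a note that was never indexed, replace it when its
// content hash changed, and return the store untouched otherwise.
function upsertNote(notes: StoredNote[], incoming: StoredNote): StoredNote[] {
  const idx = notes.findIndex((n) => n.path === incoming.path);
  if (idx === -1) return [...notes, incoming];          // new note
  if (notes[idx].hash === incoming.hash) return notes;  // unchanged: skip
  const next = notes.slice();
  next[idx] = incoming;                                 // changed: replace
  return next;
}
```
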

### 5. Search Engine (`src/search.ts`)

**Algorithm:** Cosine Similarity

```
similarity(A, B) = (A · B) / (||A|| × ||B||)

Where:
- A, B are embedding vectors
- · is the dot product
- || || is the vector magnitude (L2 norm)
```

**Process:**
1. Preprocess the query (same as notes)
2. Generate the query embedding
3. Calculate cosine similarity against all stored embeddings
4. Filter by minimum score
5. Sort by similarity (descending)
6. Return the top-K results
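The formula and the ranking steps above map directly to code. This is a straightforward sketch, not necessarily the exact `src/search.ts` implementation; in particular, the `minScore` default of 0.3 is an assumption:

```typescript
// Cosine similarity: (A · B) / (||A|| × ||B||)
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Rank stored embeddings against a query vector and keep the top K
// results above a minimum score, skipping dimension mismatches.
function topK(
  query: number[],
  notes: { path: string; embedding: number[] }[],
  k: number,
  minScore = 0.3,
): { path: string; score: number }[] {
  return notes
    .filter((n) => n.embedding.length === query.length)
    .map((n) => ({ path: n.path, score: cosineSimilarity(query, n.embedding) }))
    .filter((r) => r.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```
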

### 6. CLI Interface (`src/cli.ts`)

**Commands:**
- `index` - Generate embeddings for notes
- `search` - Semantic search query
- `stats` - Show storage statistics
- `clear` - Delete all embeddings

## Data Flow

### Indexing Flow
```
1. Scan Vault
   └─▶ Discover *.md files
   └─▶ Parse frontmatter
   └─▶ Generate content hashes

2. Filter (if incremental)
   └─▶ Compare hashes with stored
   └─▶ Skip unchanged notes

3. Preprocess
   └─▶ Remove markdown structure
   └─▶ Extract metadata keywords

4. Generate Embeddings
   └─▶ Send to backend (LM Studio/Ollama/MLX)
   └─▶ Receive vector arrays

5. Validate
   └─▶ Check embedding variance
   └─▶ Warn if too low

6. Store
   └─▶ Update metadata (model, dimensions)
   └─▶ Save to JSON file
```

### Search Flow
```
1. Load Storage
   └─▶ Read JSON file
   └─▶ Parse stored embeddings

2. Preprocess Query
   └─▶ Same as note preprocessing

3. Generate Query Embedding
   └─▶ Send to backend
   └─▶ Receive query vector

4. Calculate Similarities
   └─▶ Cosine similarity vs all notes
   └─▶ Skip dimension mismatches

5. Filter & Sort
   └─▶ Apply min-score threshold
   └─▶ Sort by score descending

6. Return Top-K
   └─▶ Format with excerpts/metadata
```

## Design Decisions

### Why JSON Storage?
- **Simplicity:** No database dependencies
- **Portability:** Single file, easy to back up
- **Debuggability:** Human-readable
- **Scale:** Handles thousands of notes efficiently

### Why Content Hashing?
SHA-256 hashes of note content detect changes more reliably than modification timestamps.
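In Node, this kind of content hash is a one-liner with the built-in `crypto` module (a sketch of the idea, not necessarily the exact `getNoteHash()` code):

```typescript
import { createHash } from "node:crypto";

// Hash the note body; any edit to the content changes the digest,
// while a mere filesystem timestamp change does not.
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}
```
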

### Why Preprocessing?
Embedding models can be distracted by markdown formatting. Removing headers, bullets, and dividers helps the model focus on semantic content.

### Why Multiple Backends?
Different users have different setups:
- LM Studio: Easy GUI, GGUF models
- Ollama: Simple CLI, wide model support
- MLX: Apple Silicon optimized (but has bugs)
@@ -0,0 +1,230 @@
# Embedding Backends

obedding supports three embedding backends. LM Studio is the default.

## Quick Comparison

| Feature | LM Studio (Default) | Ollama | MLX |
|---------|---------------------|--------|-----|
| **Server** | `http://localhost:1234` | `http://localhost:11434` | `http://localhost:28100` |
| **Default Model** | `text-embedding-qwen3-embedding-0.6b` | `qwen3-embedding:0.6b` | `mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ` |
| **Dimensions** | 1024 | 768 | 2048 |
| **Model Format** | GGUF | Native | MLX |
| **Reliability** | ✅ Stable | ✅ Stable | ⚠️ Known issues |
| **Performance** | ~1.7s for 3 notes | ~6s for 27 notes | ~11s for 27 notes |
| **Storage** | ~4KB/note | ~3KB/note | ~8KB/note |

## LM Studio (Recommended)

### Setup

1. **Install LM Studio** from [lmstudio.ai](https://lmstudio.ai/)

2. **Download an embedding model:**
   - Search for `Qwen3-Embedding-0.6B-GGUF`
   - Or use any GGUF embedding model

3. **Load the model and start the server:**
   - Click the "Server" button in LM Studio
   - Ensure it is running on `http://localhost:1234`

### Usage

```bash
# Explicit backend selection
npx obedding index --vault ~/.obsidian/Projects --backend lmstudio
npx obedding search "query" --backend lmstudio

# Or rely on the default (LM Studio)
npx obedding index --vault ~/.obsidian/Projects
npx obedding search "query"
```

### API Details

- **Endpoint:** `http://localhost:1234/v1/embeddings`
- **Format:** OpenAI-compatible
- **Request:**
  ```json
  {
    "model": "text-embedding-qwen3-embedding-0.6b",
    "input": "your text here"
  }
  ```
- **Response:**
  ```json
  {
    "object": "list",
    "data": [{
      "object": "embedding",
      "index": 0,
      "embedding": [0.0123, -0.0456, ...]
    }],
    "model": "text-embedding-qwen3-embedding-0.6b"
  }
  ```
67
+ ### Available Models
68
+
69
+ Check available models in LM Studio:
70
+ ```bash
71
+ curl http://localhost:1234/v1/models
72
+ ```
73
+
74
+ ### Custom Model
75
+
76
+ ```bash
77
+ npx obedding index --backend lmstudio --model "your-custom-model-name"
78
+ ```
79
+
80
+ ## Ollama
81
+
82
+ ### Setup
83
+
84
+ ```bash
85
+ # Install Ollama
86
+ curl -fsSL https://ollama.ai/install.sh | sh
87
+
88
+ # Pull embedding model
89
+ ollama pull qwen3-embedding:0.6b
90
+
91
+ # Start server
92
+ ollama serve
93
+ ```
94
+
95
+ ### Usage
96
+
97
+ ```bash
98
+ npx obedding index --vault ~/.obsidian/Projects --backend ollama
99
+ npx obedding search "query" --backend ollama
100
+ ```
101
+
102
+ ### API Details
103
+
104
+ - **Endpoint:** `http://localhost:11434/api/embeddings`
105
+ - **Format:** Ollama native
106
+ - **Request:**
107
+ ```json
108
+ {
109
+ "model": "qwen3-embedding:0.6b",
110
+ "prompt": "your text here"
111
+ }
112
+ ```
113
+ - **Response:**
114
+ ```json
115
+ {
116
+ "embedding": [0.0123, -0.0456, ...]
117
+ }
118
+ ```
119
+
120
+ ### Available Models
121
+
122
+ ```bash
123
+ ollama list
124
+ ```
125
+
126
+ ### Custom Model
127
+
128
+ ```bash
129
+ npx obedding index --backend ollama --model "mxbai-embed-large"
130
+ ```
131
+
132
+ ## MLX (Not Recommended)
133
+
134
+ ### Known Issues
135
+
136
+ ⚠️ **The MLX backend has a critical bug:** It produces **identical embeddings for different content** when the text has similar structure (e.g., same frontmatter, similar length).
137
+
138
+ **Example:**
139
+ - Content 1: "tags:cli,telegram... Telegram newline fix..." (6416 chars)
140
+ - Content 2: "tags:cli,glm5... GLM-5 MCP search..." (3493 chars)
141
+ - Result: Cosine similarity = 1.0000 (IDENTICAL embeddings)
142
+
143
+ **Root Cause:** Swama batch processing regression. See [Swama issue #93](https://github.com/Trans-N-ai/swama/issues/93)
144
+
145
+ ### Setup (If You Must)
146
+
147
+ ```bash
148
+ # Install Swama
149
+ npm install -g swama
150
+
151
+ # Start MLX embedding server
152
+ swama serve
153
+ ```
154
+
155
+ ### Usage
156
+
157
+ ```bash
158
+ npx obedding index --vault ~/.obsidian/Projects --backend mlx
159
+ npx obedding search "query" --backend mlx
160
+ ```
161
+
162
+ ### API Details
163
+
164
+ - **Endpoint:** `http://localhost:28100/v1/embeddings`
165
+ - **Format:** OpenAI-compatible (with bugs)
166
+ - **Workaround:** Client truncates embeddings to 2048 dimensions
167
+
168
+ ## Switching Backends
169
+
170
+ You can switch backends, but you'll need to re-index:
171
+
172
+ ```bash
173
+ # Clear existing embeddings
174
+ npx obedding clear --force
175
+
176
+ # Index with new backend
177
+ npx obedding index --vault ~/.obsidian/Projects --backend ollama
178
+ ```
179
+
180
+ **Why?** Different backends produce different embedding dimensions, making them incompatible.
181
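The incompatibility is easy to guard against in code. A sketch (the guard itself is illustrative, not obedding's actual implementation; the `dimensions` value follows the storage metadata):

```typescript
// Refuse to search a store whose embeddings were produced at a different
// dimensionality than the current query vector.
function assertCompatible(storeDimensions: number, queryEmbedding: number[]): void {
  if (queryEmbedding.length !== storeDimensions) {
    throw new Error(
      `Dimension mismatch: store has ${storeDimensions}-dim embeddings, ` +
      `query has ${queryEmbedding.length}. Re-index with the current backend.`,
    );
  }
}
```
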

## Environment Variables

Override the default server URLs:

```bash
# LM Studio
export LMSTUDIO_BASE_URL="http://localhost:1234"

# Ollama
export OLLAMA_BASE_URL="http://localhost:11434"
```

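Resolving these overrides is a simple fallback chain; a sketch of how a client might read them (`resolveBaseUrl` is a hypothetical helper, and the `declare` is only there to keep the snippet self-contained — in a real project `@types/node` provides it):

```typescript
// Minimal ambient declaration so this snippet compiles standalone.
declare const process: { env: Record<string, string | undefined> };

// Pick the backend base URL from the environment, falling back to the
// documented defaults.
function resolveBaseUrl(backend: "lmstudio" | "ollama"): string {
  const defaults = {
    lmstudio: "http://localhost:1234",
    ollama: "http://localhost:11434",
  } as const;
  const envVar = backend === "lmstudio" ? "LMSTUDIO_BASE_URL" : "OLLAMA_BASE_URL";
  return process.env[envVar] ?? defaults[backend];
}
```
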
## Troubleshooting

### LM Studio

```bash
# Check if server is running
curl http://localhost:1234/v1/models

# Test embeddings
curl -X POST http://localhost:1234/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "text-embedding-qwen3-embedding-0.6b", "input": "test"}'
```

### Ollama

```bash
# Check if server is running
curl http://localhost:11434/api/tags

# Test embeddings
curl -X POST http://localhost:11434/api/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen3-embedding:0.6b", "prompt": "test"}'
```

### MLX

```bash
# Check if server is running
curl http://localhost:28100/

# Test embeddings
curl -X POST http://localhost:28100/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": ["test"], "model": "mlx-community/Qwen3-Embedding-0.6B-4bit-DWQ"}'
```
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
    "name": "obedding",
-   "version": "1.0.1",
-   "description": "Semantic search for Obsidian notes using local MLX embeddings",
+   "version": "1.0.2",
+   "description": "Semantic search for Obsidian notes using local embeddings (Ollama, LM Studio, MLX)",
    "type": "module",
    "main": "./dist/index.js",
    "bin": {
@@ -17,6 +17,8 @@
    "obsidian",
    "semantic-search",
    "embeddings",
+   "lmstudio",
+   "ollama",
    "mlx",
    "local",
    "privacy",
@@ -39,6 +41,7 @@
  },
  "files": [
    "dist",
+   "docs",
    "README.md",
    "LICENSE"
  ],