@gmickel/gno 0.15.0 → 0.16.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -32,7 +32,14 @@ GNO is a local knowledge engine that turns your documents into a searchable, con
 
 ---
 
-## What's New in v0.13
+## What's New in v0.15
+
+- **HTTP Backends**: Offload embedding, reranking, and generation to remote GPU servers
+  - Simple URI config: `http://host:port/path#modelname`
+  - Works with llama-server, Ollama, LocalAI, vLLM
+  - Run GNO on lightweight machines while GPU inference runs on your network
+
+### v0.13
 
 - **Knowledge Graph**: Interactive force-directed visualization of document connections
 - **Graph with Similarity**: See semantic similarity as golden edges (not just wiki/markdown links)
@@ -343,28 +350,29 @@ graph TD
 
 ## Features
 
-| Feature | Description |
-| :------------------ | :----------------------------------------------------------------------------- |
-| **Hybrid Search** | BM25 + vector + RRF fusion + cross-encoder reranking |
-| **Document Editor** | Create, edit, delete docs with live markdown preview |
-| **Web UI** | Visual dashboard for search, browse, edit, and AI Q&A |
-| **REST API** | HTTP API for custom tools and integrations |
-| **Multi-Format** | Markdown, PDF, DOCX, XLSX, PPTX, plain text |
-| **Local LLM** | AI answers via llama.cpp, no API keys |
-| **Privacy First** | 100% offline, zero telemetry, your data stays yours |
-| **MCP Server** | Works with Claude Desktop, Cursor, Zed, + 8 more |
-| **Collections** | Organize sources with patterns, excludes, contexts |
-| **Tag Filtering** | Frontmatter tags with hierarchical paths, filter via `--tags-any`/`--tags-all` |
-| **Note Linking** | Wiki links, backlinks, related notes, cross-collection navigation |
-| **Multilingual** | 30+ languages, auto-detection, cross-lingual search |
-| **Incremental** | SHA-256 tracking, only changed files re-indexed |
-| **Keyboard First** | ⌘N capture, ⌘K search, ⌘/ shortcuts, ⌘S save |
+| Feature | Description |
+| :------------------- | :----------------------------------------------------------------------------- |
+| **Hybrid Search** | BM25 + vector + RRF fusion + cross-encoder reranking |
+| **Document Editor** | Create, edit, delete docs with live markdown preview |
+| **Web UI** | Visual dashboard for search, browse, edit, and AI Q&A |
+| **REST API** | HTTP API for custom tools and integrations |
+| **Multi-Format** | Markdown, PDF, DOCX, XLSX, PPTX, plain text |
+| **Local LLM** | AI answers via llama.cpp, no API keys |
+| **Remote Inference** | Offload to GPU servers via HTTP (llama-server, Ollama, LocalAI) |
+| **Privacy First** | 100% offline, zero telemetry, your data stays yours |
+| **MCP Server** | Works with Claude Desktop, Cursor, Zed, + 8 more |
+| **Collections** | Organize sources with patterns, excludes, contexts |
+| **Tag Filtering** | Frontmatter tags with hierarchical paths, filter via `--tags-any`/`--tags-all` |
+| **Note Linking** | Wiki links, backlinks, related notes, cross-collection navigation |
+| **Multilingual** | 30+ languages, auto-detection, cross-lingual search |
+| **Incremental** | SHA-256 tracking, only changed files re-indexed |
+| **Keyboard First** | ⌘N capture, ⌘K search, ⌘/ shortcuts, ⌘S save |
 
 ---
 
 ## Local Models
 
-Models auto-download on first use to `~/.cache/gno/models/`.
+Models auto-download on first use to `~/.cache/gno/models/`. Alternatively, offload to a GPU server on your network using HTTP backends.
 
 | Model | Purpose | Size |
 | :------------------ | :------------------------------------ | :----------- |
@@ -385,6 +393,24 @@ gno models use slim
 gno models pull --all # Optional: pre-download models (auto-downloads on first use)
 ```
 
+### HTTP Backends (Remote GPU)
+
+Offload inference to a GPU server on your network:
+
+```yaml
+# ~/.config/gno/config.yaml
+models:
+  activePreset: remote-gpu
+  presets:
+    - id: remote-gpu
+      name: Remote GPU Server
+      embed: "http://192.168.1.100:8081/v1/embeddings#bge-m3"
+      rerank: "http://192.168.1.100:8082/v1/completions#reranker"
+      gen: "http://192.168.1.100:8083/v1/chat/completions#qwen3-4b"
+```
+
+Works with llama-server, Ollama, LocalAI, vLLM, or any OpenAI-compatible server.
+
 > **Configuration**: [Model Setup](https://gno.sh/docs/CONFIGURATION/)
 
 ---
@@ -249,6 +249,7 @@ gno similar gno://notes/auth.md --cross-collection
 
 ```markdown
 # In your documents:
+
 See [[API Design]] for details.
 Check [[work:Project Plan]] for cross-collection link.
 Read [[Security#OAuth]] for specific section.
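
The README's backend URIs carry the model name in the URL fragment (`http://host:port/path#modelname`), so the endpoint and the model split cleanly at the `#`. A minimal sketch of that split; `parseBackendUri` is an illustrative helper, not part of GNO's API:

```typescript
// Illustrative only: split a backend URI of the form
// http://host:port/path#modelname into HTTP endpoint and model name.
// The fragment never reaches the server, so it is a safe place to carry
// the model identifier alongside the endpoint.
function parseBackendUri(uri: string): { endpoint: string; model: string } {
  const url = new URL(uri);
  const model = url.hash.startsWith("#") ? url.hash.slice(1) : "";
  url.hash = ""; // serialize without the fragment
  return { endpoint: url.toString(), model };
}

const parsed = parseBackendUri("http://192.168.1.100:8081/v1/embeddings#bge-m3");
// parsed.endpoint → "http://192.168.1.100:8081/v1/embeddings"
// parsed.model    → "bge-m3"
```

Any WHATWG-URL-compatible runtime parses these URIs the same way, which is presumably why the scheme works unchanged across llama-server, Ollama, LocalAI, and vLLM endpoints.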
package/package.json CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@gmickel/gno",
-  "version": "0.15.0",
+  "version": "0.16.0",
   "description": "Local semantic search for your documents. Index Markdown, PDF, and Office files with hybrid BM25 + vector search.",
   "keywords": [
     "embeddings",
@@ -95,12 +95,12 @@
     "commander": "^14.0.2",
     "embla-carousel-react": "^8.6.0",
     "franc": "^6.2.0",
-    "lucide-react": "^0.562.0",
-    "markitdown-ts": "^0.0.8",
+    "lucide-react": "^0.563.0",
+    "markitdown-ts": "^0.0.9",
     "minimatch": "^10.1.1",
     "nanoid": "^5.1.6",
     "node-llama-cpp": "^3.14.5",
-    "officeparser": "^5.2.2",
+    "officeparser": "^6.0.4",
     "picocolors": "^1.1.1",
     "react": "^19.2.3",
     "react-dom": "^19.2.3",
@@ -118,7 +118,7 @@
   },
   "devDependencies": {
     "@ai-sdk/openai": "^3.0.2",
-    "@biomejs/biome": "2.3.10",
+    "@biomejs/biome": "2.3.13",
     "@tailwindcss/cli": "^4.1.18",
     "@types/bun": "latest",
     "@types/react": "^19.2.7",
@@ -129,13 +129,13 @@
     "evalite": "^1.0.0-beta.15",
     "exceljs": "^4.4.0",
     "lefthook": "^2.0.13",
-    "oxfmt": "^0.21.0",
-    "oxlint": "^1.36.0",
-    "oxlint-tsgolint": "^0.10.1",
+    "oxfmt": "^0.28.0",
+    "oxlint": "^1.42.0",
+    "oxlint-tsgolint": "^0.11.4",
     "pdf-lib": "^1.17.1",
     "playwright": "^1.52.0",
     "pptxgenjs": "^4.0.1",
-    "ultracite": "7.0.4",
+    "ultracite": "7.1.3",
     "vitest": "^4.0.16"
   },
   "peerDependencies": {
package/src/cli/AGENTS.md CHANGED
@@ -54,9 +54,9 @@ program
 
 ```typescript
 export const EXIT = {
-  SUCCESS: 0, // Command completed successfully
+  SUCCESS: 0, // Command completed successfully
   VALIDATION: 1, // Bad args, missing params
-  RUNTIME: 2, // IO, DB, model, network errors
+  RUNTIME: 2, // IO, DB, model, network errors
 } as const;
 ```
 
package/src/cli/CLAUDE.md CHANGED
@@ -54,9 +54,9 @@ program
 
 ```typescript
 export const EXIT = {
-  SUCCESS: 0, // Command completed successfully
+  SUCCESS: 0, // Command completed successfully
   VALIDATION: 1, // Bad args, missing params
-  RUNTIME: 2, // IO, DB, model, network errors
+  RUNTIME: 2, // IO, DB, model, network errors
 } as const;
 ```
 
@@ -1,9 +1,9 @@
 /**
  * officeparser adapter for PPTX conversion.
- * Uses parseOfficeAsync() with Buffer for in-memory extraction.
+ * Uses parseOffice() v6 API with Buffer for in-memory extraction.
  */
 
-import { parseOfficeAsync } from "officeparser";
+import { parseOffice } from "officeparser";
 
 import type {
   Converter,
@@ -76,10 +76,12 @@ export const officeparserAdapter: Converter = {
   try {
     // Zero-copy Buffer view (input.bytes is immutable by contract)
     const buffer = toBuffer(input.bytes);
-    const text = await parseOfficeAsync(buffer, {
+    // v6 API: parseOffice returns AST, use .toText() for plain text
+    const ast = await parseOffice(buffer, {
       newlineDelimiter: "\n",
       ignoreNotes: false, // Include speaker notes
     });
+    const text = ast.toText();
 
     if (!text || text.trim().length === 0) {
       return {
@@ -20,5 +20,5 @@ export const NATIVE_VERSIONS = {
  */
 export const ADAPTER_VERSIONS = {
   "markitdown-ts": "0.0.8",
-  officeparser: "5.2.0",
+  officeparser: "6.0.4",
 } as const;
@@ -24,6 +24,15 @@ type LlamaEmbeddingContext = Awaited<
   ReturnType<LlamaModel["createEmbeddingContext"]>
 >;
 
+// ─────────────────────────────────────────────────────────────────────────────
+// Constants
+// ─────────────────────────────────────────────────────────────────────────────
+
+// Max concurrent embedding operations per batch to avoid overwhelming the context.
+// node-llama-cpp contexts may not handle high concurrency well; this provides
+// a safe default while still allowing parallelism within chunks.
+const MAX_CONCURRENT_EMBEDDINGS = 16;
+
 // ─────────────────────────────────────────────────────────────────────────────
 // Implementation
 // ─────────────────────────────────────────────────────────────────────────────
@@ -78,19 +87,53 @@ export class NodeLlamaCppEmbedding implements EmbeddingPort {
       return ctx;
     }
 
+    if (texts.length === 0) {
+      return { ok: true, value: [] };
+    }
+
     try {
-      const results: number[][] = [];
-      for (const text of texts) {
-        const embedding = await ctx.value.getEmbeddingFor(text);
-        const vector = Array.from(embedding.vector) as number[];
-        results.push(vector);
-
-        // Cache dimensions on first call
-        if (this.dims === null) {
-          this.dims = vector.length;
+      // Process in chunks to avoid overwhelming the embedding context.
+      // node-llama-cpp v3.x only exposes getEmbeddingFor (single text), not a native
+      // batch method. We use allSettled within chunks to ensure all in-flight ops
+      // complete before returning (prevents orphaned operations on early failure).
+      const allResults: number[][] = [];
+
+      for (let i = 0; i < texts.length; i += MAX_CONCURRENT_EMBEDDINGS) {
+        const chunk = texts.slice(i, i + MAX_CONCURRENT_EMBEDDINGS);
+        const settled = await Promise.allSettled(
+          chunk.map((text) => ctx.value.getEmbeddingFor(text))
+        );
+
+        // Check for any failures in this chunk
+        const firstRejection = settled.find(
+          (r): r is PromiseRejectedResult => r.status === "rejected"
+        );
+        if (firstRejection) {
+          return {
+            ok: false,
+            error: inferenceFailedError(this.modelUri, firstRejection.reason),
+          };
         }
+
+        // Extract results from this chunk (cast safe after rejection check)
+        const chunkResults = (
+          settled as Array<
+            PromiseFulfilledResult<
+              Awaited<ReturnType<typeof ctx.value.getEmbeddingFor>>
+            >
+          >
+        ).map((r) => Array.from(r.value.vector) as number[]);
+
+        allResults.push(...chunkResults);
       }
-      return { ok: true, value: results };
+
+      // Cache dimensions from first result
+      const firstResult = allResults[0];
+      if (this.dims === null && firstResult !== undefined) {
+        this.dims = firstResult.length;
+      }
+
+      return { ok: true, value: allResults };
     } catch (e) {
       return { ok: false, error: inferenceFailedError(this.modelUri, e) };
     }
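
The chunked `Promise.allSettled` pattern in the hunk above reduces to a small generic helper. This is a sketch of the pattern only; `mapInChunks` is an illustrative name, not GNO code:

```typescript
// Generic form of the pattern above: map an async function over items with at
// most `limit` calls in flight, waiting for every call in a chunk to settle
// before moving on — so a failure leaves no orphaned in-flight operations.
async function mapInChunks<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += limit) {
    const settled = await Promise.allSettled(items.slice(i, i + limit).map(fn));
    const rejected = settled.find(
      (r): r is PromiseRejectedResult => r.status === "rejected"
    );
    if (rejected) {
      // Every call in this chunk has already settled, so rethrowing is safe.
      throw rejected.reason;
    }
    results.push(
      ...(settled as PromiseFulfilledResult<R>[]).map((r) => r.value)
    );
  }
  return results;
}
```

In the adapter above, `fn` corresponds to `ctx.value.getEmbeddingFor` and `limit` to `MAX_CONCURRENT_EMBEDDINGS`; the adapter returns a Result object instead of throwing.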
package/src/mcp/AGENTS.md CHANGED
@@ -40,14 +40,16 @@ export const toolName: Tool = {
   description: "What this tool does",
   inputSchema: {
     type: "object",
-    properties: { /* ... */ },
+    properties: {
+      /* ... */
+    },
     required: ["query"],
   },
 };
 
 export async function handleToolName(
   args: ToolArgs,
-  store: SqliteAdapter,
+  store: SqliteAdapter
   // ... other ports
 ): Promise<CallToolResult> {
   // 1. Validate args
package/src/mcp/CLAUDE.md CHANGED
@@ -40,14 +40,16 @@ export const toolName: Tool = {
   description: "What this tool does",
   inputSchema: {
     type: "object",
-    properties: { /* ... */ },
+    properties: {
+      /* ... */
+    },
     required: ["query"],
   },
 };
 
 export async function handleToolName(
   args: ToolArgs,
-  store: SqliteAdapter,
+  store: SqliteAdapter
   // ... other ports
 ): Promise<CallToolResult> {
   // 1. Validate args
@@ -52,7 +52,7 @@ interface ServerContext {
   embedPort: EmbeddingPort | null;
   genPort: GenerationPort | null;
   rerankPort: RerankPort | null;
-  capabilities: { bm25, vector, hybrid, answer };
+  capabilities: { bm25; vector; hybrid; answer };
 }
 ```
 
@@ -113,7 +113,7 @@ Use HTML imports with `Bun.serve()`. Don't use `vite`. HTML imports fully suppor
 Server example:
 
 ```ts
-import index from "./index.html"
+import index from "./index.html";
 
 Bun.serve({
   routes: {
@@ -134,13 +134,13 @@ Bun.serve({
     },
     close: (ws) => {
       // handle close
-    }
+    },
   },
   development: {
     hmr: true,
     console: true,
-  }
-})
+  },
+});
 ```
 
 HTML files can import .tsx, .jsx or .js files directly and Bun's bundler will transpile & bundle automatically. `<link>` tags can point to stylesheets and Bun's CSS bundler will bundle.
@@ -160,7 +160,7 @@ With the following `frontend.tsx`:
 import React from "react";
 
 // import .css files directly and it works
-import './index.css';
+import "./index.css";
 
 import { createRoot } from "react-dom/client";
 
@@ -52,7 +52,7 @@ interface ServerContext {
   embedPort: EmbeddingPort | null;
   genPort: GenerationPort | null;
   rerankPort: RerankPort | null;
-  capabilities: { bm25, vector, hybrid, answer };
+  capabilities: { bm25; vector; hybrid; answer };
 }
 ```
 
@@ -113,7 +113,7 @@ Use HTML imports with `Bun.serve()`. Don't use `vite`. HTML imports fully suppor
 Server example:
 
 ```ts
-import index from "./index.html"
+import index from "./index.html";
 
 Bun.serve({
   routes: {
@@ -134,13 +134,13 @@ Bun.serve({
     },
     close: (ws) => {
       // handle close
-    }
+    },
   },
   development: {
     hmr: true,
     console: true,
-  }
-})
+  },
+});
 ```
 
 HTML files can import .tsx, .jsx or .js files directly and Bun's bundler will transpile & bundle automatically. `<link>` tags can point to stylesheets and Bun's CSS bundler will bundle.
@@ -160,7 +160,7 @@ With the following `frontend.tsx`:
 import React from "react";
 
 // import .css files directly and it works
-import './index.css';
+import "./index.css";
 
 import { createRoot } from "react-dom/client";
 
@@ -24,13 +24,15 @@ The Snowball stemmer supports: Arabic, Basque, Catalan, Danish, Dutch, English,
 ## Usage
 
 ```typescript
-import { Database } from 'bun:sqlite';
+import { Database } from "bun:sqlite";
 
 // Load extension
-db.loadExtension('vendor/fts5-snowball/darwin-arm64/fts5stemmer.dylib');
+db.loadExtension("vendor/fts5-snowball/darwin-arm64/fts5stemmer.dylib");
 
 // Create FTS table with snowball tokenizer
-db.exec(`CREATE VIRTUAL TABLE docs USING fts5(content, tokenize='snowball english')`);
+db.exec(
+  `CREATE VIRTUAL TABLE docs USING fts5(content, tokenize='snowball english')`
+);
 ```
 
 ## License