@gmickel/gno 0.3.0 → 0.3.4
- package/README.md +109 -181
- package/package.json +6 -1
package/README.md
CHANGED
@@ -1,256 +1,184 @@
-# GNO
+# GNO
 
-**Index,
+**Your Local Second Brain** — Index, search, and synthesize your entire digital life.
 
-
+[](https://www.npmjs.com/package/@gmickel/gno)
+[](./LICENSE)
 
-
-
-## ✨ Key Features
-
-* **Universal Indexing**: Effortlessly ingest and search across Markdown, PDF, DOCX, XLSX, PPTX, and plain text files.
-* **Hybrid Search Pipeline**: Combines **BM25 keyword search** with **vector semantic search** and **AI re-ranking** for unparalleled retrieval accuracy.
-* **Local LLM Integration**: Get grounded AI answers with citations using **node-llama-cpp** and auto-downloaded GGUF models. No external services, maximum privacy.
-* **Agent-First Design (MCP)**: Seamlessly integrate GNO with AI agents via the Model Context Protocol (MCP) server.
-* **Deterministic Output**: Stable, schema-driven JSON, file-line, and markdown outputs for reliable scripting.
-* **Multilingual Support**: Robust handling of multiple languages in indexing and retrieval.
-* **Privacy-Preserving**: All processing happens locally. Your data never leaves your device.
-* **World-Class Engineering**: Spec-driven development, rigorous testing, and eval gates ensure reliability and quality.
+GNO is a local knowledge engine for privacy-conscious individuals and AI agents. Index your notes, code, PDFs, and Office docs. Get lightning-fast semantic search and AI-powered answers—all on your machine.
 
 ---
 
-##
-
-Get searching in minutes with the 3-command workflow:
+## Contents
 
-
-
-
-
+- [Quick Start](#quick-start)
+- [Installation](#installation)
+- [Search Modes](#search-modes)
+- [Agent Integration](#agent-integration)
+- [How It Works](#how-it-works)
+- [Local Models](#local-models)
+- [Architecture](#architecture)
+- [Development](#development)
 
-
-gno index
-```
+---
 
-
-```sh
-# Get a direct, cited answer from your documents
-gno ask "What are the best practices for API authentication?" --collection notes
+## Quick Start
 
-
-
-
+```bash
+# Initialize with your notes folder
+gno init ~/notes --name notes
 
-
-
-# Retrieve specific document content
-gno get "notes/2024-01-15.md"
+# Index documents (BM25 + vectors)
+gno index
 
-
-
-
+# Search
+gno query "authentication best practices"
+gno ask "summarize the API discussion" --answer
+```
 
 ---
 
-##
+## Installation
 
-
+Requires [Bun](https://bun.sh/) >= 1.0.0.
 
-
-
-
----
-
-## 🔎 Search Modes
+```bash
+bun install -g @gmickel/gno
+```
 
-
+**macOS**: Vector search requires Homebrew SQLite:
+```bash
+brew install sqlite3
+```
 
-
-
-
-
-| `gno query` | **Hybrid** | Combines BM25 and Vector search with LLM reranking and fusion. | Highest accuracy, nuanced understanding. |
-| `gno ask` | **RAG-focused**| Hybrid search providing a synthesized, cited answer from results. | Getting direct answers to complex questions. |
+Verify:
+```bash
+gno doctor
+```
 
 ---
 
-##
-
-GNO is designed to be the knowledge backbone for your AI agents.
+## Search Modes
 
-
+| Command | Mode | Best For |
+|:--------|:-----|:---------|
+| `gno search` | BM25 | Exact phrases, known terms |
+| `gno vsearch` | Vector | Natural language, concepts |
+| `gno query` | Hybrid | Highest accuracy (BM25 + Vector + reranking) |
+| `gno ask` | RAG | Direct answers with citations |
 
-
-
-```sh
-# Get JSON results for LLM processing
-gno query "meeting notes on user feedback" --json -n 5
-
-# Get file paths and scores for agent tool use
-gno search "API design" --files --min-score 0.3
-```
+---
 
-
+## Agent Integration
 
-
+### For Claude Code / Codex / OpenCode
 
 ```bash
-gno skill install --scope user #
-gno skill install --target codex #
+gno skill install --scope user # Claude Code
+gno skill install --target codex # Codex
 gno skill install --target all # Both
 ```
 
-
-
-### MCP Server (For Claude Desktop/Cursor)
-
-Exposes an MCP server for GUI-based AI applications.
+### For Claude Desktop / Cursor
 
-
-* `gno_search` (BM25)
-* `gno_vsearch` (Vector)
-* `gno_query` (Hybrid)
-* `gno_get` (Document retrieval)
-* `gno_multi_get` (Batch retrieval)
-* `gno_status` (Index health)
-
-**Example Claude Desktop Configuration** (`~/Library/Application Support/Claude/claude_desktop_config.json`):
+Add to `~/Library/Application Support/Claude/claude_desktop_config.json`:
 
 ```json
 {
   "mcpServers": {
-    "gno": {
-      "command": "gno",
-      "args": ["mcp"]
-    }
+    "gno": { "command": "gno", "args": ["mcp"] }
   }
 }
 ```
 
-
-
----
-
-## ⚙️ How It Works: The GNO Pipeline
+**MCP Tools**: `gno_search`, `gno_vsearch`, `gno_query`, `gno_get`, `gno_multi_get`, `gno_status`
 
-
+### CLI Output Formats
 
-```
-
-
-B --> C{Lexical Variants};
-B --> D{Semantic Variants};
-B --> E{Optional HyDE};
-A --> F[Original Query];
-
-C --> G(BM25 Retrieval);
-D --> H(Vector Search);
-E --> H;
-F --> G;
-F --> H;
-
-G --> I(Ranked List 1);
-H --> J(Ranked List 2);
-I --> K{RRF Fusion + Bonus};
-J --> K;
-
-K --> L(Top Candidates);
-L --> M(LLM Re-ranking);
-M --> N(Position-Aware Blending);
-N --> O(Final Results);
-
-subgraph "Search Stages"
-B; C; D; E; F; G; H; I; J; K; L; M; N; O;
-end
+```bash
+gno query "meeting notes" --json -n 5 # JSON for LLMs
+gno search "API design" --files # File paths only
 ```
 
-### Search Pipeline Details:
-
-1. **Query Expansion**: Generates alternative queries (lexical and semantic) and an optional synthetic "HyDE" document using a local LLM for richer retrieval.
-2. **Parallel Retrieval**: Executes BM25 (keyword) and Vector (semantic) searches concurrently.
-3. **Fusion**: Combines results using Reciprocal Rank Fusion (RRF) with a weighted boost for original query matches and a top-rank bonus.
-4. **Re-ranking**: An LLM-based cross-encoder re-scores the top candidates for final relevance.
-5. **Blending**: Dynamically adjusts the mix of retrieval vs. reranked scores based on rank position to preserve accuracy.
-
-**Score Normalization**: Raw scores from FTS, vector distance, and reranker are normalized to a 0-1 scale for consistent fusion.
-
 ---
 
-##
-
-Requires **Bun** >= 1.0.0.
+## How It Works
 
-```
-
-
+```mermaid
+graph TD
+A[User Query] --> B(Query Expansion)
+B --> C{Lexical Variants}
+B --> D{Semantic Variants}
+B --> E{HyDE Passage}
+A --> F[Original Query]
+
+C --> G(BM25 Search)
+D --> H(Vector Search)
+E --> H
+F --> G
+F --> H
+
+G --> I(Ranked List 1)
+H --> J(Ranked List 2)
+I --> K{RRF Fusion}
+J --> K
+
+K --> L(Top Candidates)
+L --> M(Cross-Encoder Rerank)
+M --> N[Final Results]
 ```
 
-**
-
-
-
+1. **Query Expansion** — LLM generates lexical variants, semantic variants, and [HyDE](https://arxiv.org/abs/2212.10496) passage
+2. **Parallel Retrieval** — BM25 + Vector search run concurrently on all variants
+3. **Fusion** — Reciprocal Rank Fusion merges results with position-based scoring
+4. **Re-ranking** — Cross-encoder rescores top 20, blended with fusion scores
 
-
-```sh
-gno doctor
-```
+See [How Search Works](https://gno.sh/docs/HOW-SEARCH-WORKS/) for full pipeline details.
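Reader's note on the Fusion step in the new README text: the following is a minimal TypeScript sketch of plain Reciprocal Rank Fusion, assuming the textbook formula in which each document scores the sum of 1 / (k + rank) over every ranked list it appears in. The function name `rrfFuse`, the constant `k = 60`, and the sample file names are illustrative assumptions and are not taken from GNO's source. GNO's own fusion may weight lists or ranks differently.

```typescript
// Hypothetical sketch of Reciprocal Rank Fusion (RRF); not GNO's actual code.
// Each input list holds document IDs ordered best-first.
type RankedList = string[];

interface FusedResult {
  id: string;
  score: number;
}

// k dampens the influence of top ranks; 60 is a commonly used default (assumption).
function rrfFuse(lists: RankedList[], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first; ties broken by ID for a stable order.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score || a.id.localeCompare(b.id));
}

// Example: one BM25 list and one vector list (made-up file names).
const bm25Hits = ["a.md", "b.md", "c.md"];
const vectorHits = ["b.md", "d.md", "a.md"];
console.log(rrfFuse([bm25Hits, vectorHits]).map((r) => r.id));
// → ["b.md", "a.md", "d.md", "c.md"]
```

Documents that appear in both lists (`a.md`, `b.md`) outrank those found by only one retriever, which is the property that makes rank-based fusion useful when BM25 and vector scores are not directly comparable.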
 
 ---
 
-##
-
-GNO runs embeddings, reranking, and query expansion locally using GGUF models via `node-llama-cpp`. Models are automatically downloaded and cached on first use in `~/.cache/gno/models/`.
+## Local Models
 
-
-| :-------------------- | :---------------- | :------------- |
-| `bge-m3` | Multilingual Embeddings | ~500MB |
-| `bge-reranker-v2-m3` | Cross-Encoder Re-ranking | ~700MB |
-| `Qwen-Instruct` | Query Expansion / HyDE | ~600MB |
+Models auto-download on first use to `~/.cache/gno/models/`.
 
-
+| Model | Purpose | Size |
+|:------|:--------|:-----|
+| bge-m3 | Embeddings | ~500MB |
+| bge-reranker-v2-m3 | Re-ranking | ~700MB |
+| Qwen-Instruct | Query expansion | ~600MB |
 
 ---
 
-##
-
-GNO follows a layered, Ports and Adapters architecture for maintainability and testability:
+## Architecture
 
 ```
-
-│
-
-│
-
-│
-
-│
-
+┌─────────────────────────────────────────────────┐
+│                  GNO CLI / MCP                  │
+├─────────────────────────────────────────────────┤
+│  Ports: Converter, Store, Embedding, Rerank     │
+├─────────────────────────────────────────────────┤
+│  Adapters: SQLite, FTS5, sqlite-vec, llama-cpp  │
+├─────────────────────────────────────────────────┤
+│  Core: Identity, Mirrors, Chunking, Retrieval   │
+└─────────────────────────────────────────────────┘
 ```
 
 ---
 
-##
+## Development
 
 ```bash
-
-git clone https://github.com/gmickel/gno.git
-cd gno
-
-# Install dependencies
+git clone https://github.com/gmickel/gno.git && cd gno
 bun install
-
-# Run tests
 bun test
-
-# Lint and format code
 bun run lint
-
-# Type check
 bun run typecheck
 ```
 
+See [Contributing](.github/CONTRIBUTING.md) for CI matrix, caching, and release process.
+
 ---
 
-##
+## License
 
-[MIT
+[MIT](./LICENSE)
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@gmickel/gno",
-  "version": "0.3.
+  "version": "0.3.4",
   "description": "Local semantic search for your documents. Index Markdown, PDF, and Office files with hybrid BM25 + vector search.",
   "keywords": [
     "search",
@@ -47,6 +47,7 @@
     "test:watch": "bun test --watch",
     "test:coverage": "bun test --coverage",
     "test:coverage:html": "bun test --coverage --html",
+    "test:fixtures": "bun scripts/generate-test-fixtures.ts",
     "typecheck": "tsgo --noEmit",
     "lint:typeaware": "bun x oxlint --type-aware",
     "reset": "bun run src/index.ts reset --confirm",
@@ -79,9 +80,13 @@
     "@typescript/native-preview": "^7.0.0-dev.20251215.1",
     "ajv": "^8.17.1",
     "ajv-formats": "^3.0.1",
+    "docx": "^9.5.1",
     "evalite": "^1.0.0-beta.15",
+    "exceljs": "^4.4.0",
     "lefthook": "^2.0.12",
     "oxlint-tsgolint": "^0.10.0",
+    "pdf-lib": "^1.17.1",
+    "pptxgenjs": "^4.0.1",
     "ultracite": "^6.5.0"
   },
   "peerDependencies": {