tokenos 1.1.0
- package/README.md +571 -0
- package/USAGE.md +451 -0
- package/dist/config.d.ts +22 -0
- package/dist/config.d.ts.map +1 -0
- package/dist/config.js +60 -0
- package/dist/config.js.map +1 -0
- package/dist/db/connection.d.ts +3 -0
- package/dist/db/connection.d.ts.map +1 -0
- package/dist/db/connection.js +78 -0
- package/dist/db/connection.js.map +1 -0
- package/dist/db/index.d.ts +4 -0
- package/dist/db/index.d.ts.map +1 -0
- package/dist/db/index.js +4 -0
- package/dist/db/index.js.map +1 -0
- package/dist/db/memory.d.ts +6 -0
- package/dist/db/memory.d.ts.map +1 -0
- package/dist/db/memory.js +62 -0
- package/dist/db/memory.js.map +1 -0
- package/dist/db/queries.d.ts +29 -0
- package/dist/db/queries.d.ts.map +1 -0
- package/dist/db/queries.js +215 -0
- package/dist/db/queries.js.map +1 -0
- package/dist/embeddings/client.d.ts +16 -0
- package/dist/embeddings/client.d.ts.map +1 -0
- package/dist/embeddings/client.js +70 -0
- package/dist/embeddings/client.js.map +1 -0
- package/dist/embeddings/index.d.ts +11 -0
- package/dist/embeddings/index.d.ts.map +1 -0
- package/dist/embeddings/index.js +37 -0
- package/dist/embeddings/index.js.map +1 -0
- package/dist/embeddings/similarity.d.ts +7 -0
- package/dist/embeddings/similarity.d.ts.map +1 -0
- package/dist/embeddings/similarity.js +31 -0
- package/dist/embeddings/similarity.js.map +1 -0
- package/dist/indexer/cli.d.ts +8 -0
- package/dist/indexer/cli.d.ts.map +1 -0
- package/dist/indexer/cli.js +21 -0
- package/dist/indexer/cli.js.map +1 -0
- package/dist/indexer/ignore.d.ts +4 -0
- package/dist/indexer/ignore.d.ts.map +1 -0
- package/dist/indexer/ignore.js +30 -0
- package/dist/indexer/ignore.js.map +1 -0
- package/dist/indexer/index.d.ts +5 -0
- package/dist/indexer/index.d.ts.map +1 -0
- package/dist/indexer/index.js +4 -0
- package/dist/indexer/index.js.map +1 -0
- package/dist/indexer/indexer.d.ts +13 -0
- package/dist/indexer/indexer.d.ts.map +1 -0
- package/dist/indexer/indexer.js +125 -0
- package/dist/indexer/indexer.js.map +1 -0
- package/dist/indexer/parser.d.ts +10 -0
- package/dist/indexer/parser.d.ts.map +1 -0
- package/dist/indexer/parser.js +444 -0
- package/dist/indexer/parser.js.map +1 -0
- package/dist/indexer/watcher.d.ts +7 -0
- package/dist/indexer/watcher.d.ts.map +1 -0
- package/dist/indexer/watcher.js +64 -0
- package/dist/indexer/watcher.js.map +1 -0
- package/dist/main.d.ts +3 -0
- package/dist/main.d.ts.map +1 -0
- package/dist/main.js +92 -0
- package/dist/main.js.map +1 -0
- package/dist/reset.d.ts +6 -0
- package/dist/reset.d.ts.map +1 -0
- package/dist/reset.js +23 -0
- package/dist/reset.js.map +1 -0
- package/dist/server/index.d.ts +2 -0
- package/dist/server/index.d.ts.map +1 -0
- package/dist/server/index.js +2 -0
- package/dist/server/index.js.map +1 -0
- package/dist/server/server.d.ts +4 -0
- package/dist/server/server.d.ts.map +1 -0
- package/dist/server/server.js +558 -0
- package/dist/server/server.js.map +1 -0
- package/dist/server/visualize.d.ts +2 -0
- package/dist/server/visualize.d.ts.map +1 -0
- package/dist/server/visualize.js +299 -0
- package/dist/server/visualize.js.map +1 -0
- package/dist/test-phase1.d.ts +13 -0
- package/dist/test-phase1.d.ts.map +1 -0
- package/dist/test-phase1.js +90 -0
- package/dist/test-phase1.js.map +1 -0
- package/dist/test-phase2.d.ts +13 -0
- package/dist/test-phase2.d.ts.map +1 -0
- package/dist/test-phase2.js +110 -0
- package/dist/test-phase2.js.map +1 -0
- package/dist/test-phase3.d.ts +12 -0
- package/dist/test-phase3.d.ts.map +1 -0
- package/dist/test-phase3.js +85 -0
- package/dist/test-phase3.js.map +1 -0
- package/dist/types.d.ts +73 -0
- package/dist/types.d.ts.map +1 -0
- package/dist/types.js +3 -0
- package/dist/types.js.map +1 -0
- package/dist/utils/cache.d.ts +12 -0
- package/dist/utils/cache.d.ts.map +1 -0
- package/dist/utils/cache.js +45 -0
- package/dist/utils/cache.js.map +1 -0
- package/dist/utils/logger.d.ts +16 -0
- package/dist/utils/logger.d.ts.map +1 -0
- package/dist/utils/logger.js +52 -0
- package/dist/utils/logger.js.map +1 -0
- package/dist/utils/scoring.d.ts +15 -0
- package/dist/utils/scoring.d.ts.map +1 -0
- package/dist/utils/scoring.js +17 -0
- package/dist/utils/scoring.js.map +1 -0
- package/dist/verify-parser.d.ts +6 -0
- package/dist/verify-parser.d.ts.map +1 -0
- package/dist/verify-parser.js +105 -0
- package/dist/verify-parser.js.map +1 -0
- package/package.json +52 -0
package/README.md
ADDED
@@ -0,0 +1,571 @@

# TokenOS

> **Local-first codebase graph intelligence for AI assistants — powered by SQLite, ts-morph, and Ollama.**

`TokenOS` is a [Model Context Protocol (MCP)](https://modelcontextprotocol.io/) server that statically analyses your TypeScript/TSX codebase, stores it as a structural dependency graph in SQLite, optionally enriches nodes with semantic embeddings via Ollama, and exposes high-precision query tools for AI coding assistants like Claude, Cursor, or any MCP-compatible client.

**The goal**: When you start a new chat, the AI already knows your codebase structure. No more "let me analyze all files first" — it queries the graph and gets exactly what it needs, saving tokens and compute.

---

## Table of Contents

- [Quick Start](#quick-start)
- [Commands](#commands)
- [Database Location](#database-location)
- [MCP Client Configuration](#mcp-client-configuration)
- [MCP Tools](#mcp-tools)
- [Node Types](#node-types)
- [Edge Types](#edge-types)
- [Importance Scoring](#importance-scoring)
- [Semantic Meta Enrichment](#semantic-meta-enrichment)
- [Conversation Memory](#conversation-memory)
- [Visualization Dashboard](#visualization-dashboard)
- [Changing the Embedding Model](#changing-the-embedding-model)
- [Architecture](#architecture)
- [Configuration Reference](#configuration-reference)
- [Prerequisites](#prerequisites)
- [Tech Stack](#tech-stack)
- [Limitations](#limitations)
- [License](#license)

---

## Quick Start

### 1. Install

```bash
git clone https://github.com/wripcode/TokenOS.git
cd TokenOS
npm install
```

### 2. Configure

Edit `tokenos.config.json` in the project root:

```json
{
  "watchPath": "/absolute/path/to/your/project",
  "ollama": {
    "url": "http://localhost:11434",
    "model": "mxbai-embed-large:latest"
  },
  "ui": {
    "enabled": false,
    "port": 3333
  }
}
```

### 3. Run

```bash
npm run dev
```

That's it. The server will:

1. Parse all `.ts`/`.tsx` files via ts-morph AST analysis
2. Extract nodes (functions, classes, components, interfaces, types, enums, routes, variables, imports)
3. Extract edges (CALLS, IMPORTS, EXPORTS, EXTENDS, IMPLEMENTS, DEFINES, RENDERS, CONTAINS, TYPE_OF, PART_OF_TAB)
4. Store everything in a per-project SQLite database
5. Auto-generate summaries for all nodes
6. Back-fill semantic embeddings via Ollama (if running)
7. Compute importance scores for architectural ranking
8. Start a chokidar file watcher for real-time incremental updates
9. Serve 6 MCP tools over stdio transport
10. Optionally launch an interactive graph visualization dashboard

---

## Commands

| Command | Description |
|---|---|
| `npm run dev` | Start the MCP server (reads `tokenos.config.json`) |
| `npm run reset` | Delete the project database — next `npm run dev` re-indexes from scratch |
| `npm run build` | Compile TypeScript to `dist/` |
| `npm run index -- /path` | One-shot indexing of a directory (no server, no watcher) |

---

## Database Location

Each project gets its own isolated database at:

```
<your-project>/.tokenos/graph.db
```

This means:
- ✅ Different projects never mix data
- ✅ You can delete `.tokenos/` to reset a specific project
- ✅ Add `.tokenos/` to your project's `.gitignore`

SQLite is configured with **WAL mode** for better concurrent read performance, and with foreign key enforcement enabled.

### Schema

Three tables are created automatically:

| Table | Purpose |
|---|---|
| `nodes` | All code entities (functions, classes, components, etc.) with metadata, summaries, embeddings, and importance scores |
| `edges` | All relationships between nodes (CALLS, IMPORTS, RENDERS, etc.) with unique constraint on `(from_node, to_node, type)` |
| `memories` | Conversation memory storage for persistent context across sessions |

---

## MCP Client Configuration

### Claude Desktop

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "tokenos": {
      "command": "node",
      "args": ["/absolute/path/to/TokenOS/dist/main.js"]
    }
  }
}
```

### Development mode (tsx)

```json
{
  "mcpServers": {
    "tokenos": {
      "command": "npx",
      "args": ["tsx", "/absolute/path/to/TokenOS/src/main.ts"]
    }
  }
}
```

> **Note:** You don't need to pass the project path as a CLI argument; the server reads it from `tokenos.config.json`.

---

## MCP Tools

All tools are **read-only**, **idempotent**, and communicate over **stdio** transport. Responses are truncated at **25,000 characters** to prevent overwhelming the LLM context window.

### `search`

Smart search that understands intent and returns the most relevant code and context.

```
Args:
  query (string)             — Natural language question or search term
  response_format (optional) — 'json' (default) or 'markdown'
```

**How it works:**

1. **Intent detection** — Classifies query into one of four modes:
   - `semantic` (default) — general concept/code search
   - `trace` — triggered by "trace", "flow", "how", "why" → deeper BFS traversal (depth 2)
   - `explore` — triggered by "where", "what", "find" → broad shallow search
   - `dependency` — triggered by "depend", "import", "export" → import/export edges only
2. **Hybrid search** — Combines text name/meta matching with Ollama semantic similarity
3. **Graph expansion** — BFS-expands top results into a contextualized subgraph
4. **Memory retrieval** — Appends relevant conversation memories (top 3)

**Returns:** Compressed, relevant context (code + relationships + memory)

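
The intent-detection step of `search` can be pictured as a small keyword classifier. The sketch below is illustrative only and is not taken from the TokenOS source: the function name `detectIntent`, the plain substring matching, and the order in which the rules are checked are all assumptions. Only the trigger words and the four mode names come from the list above.

```typescript
// Hypothetical sketch of keyword-based intent detection (not TokenOS's actual code).
type SearchIntent = "semantic" | "trace" | "explore" | "dependency";

// Trigger words are taken from the README; the rule order is an assumption.
const INTENT_RULES: ReadonlyArray<{ intent: SearchIntent; triggers: string[] }> = [
  { intent: "trace", triggers: ["trace", "flow", "how", "why"] },
  { intent: "explore", triggers: ["where", "what", "find"] },
  { intent: "dependency", triggers: ["depend", "import", "export"] },
];

function detectIntent(query: string): SearchIntent {
  const q = query.toLowerCase();
  for (const rule of INTENT_RULES) {
    // Substring match, so "depend" also matches "dependencies".
    if (rule.triggers.some((t) => q.includes(t))) return rule.intent;
  }
  return "semantic"; // default mode when no trigger word is present
}
```

Plain substring matching is the simplest reading of "triggered by", but it is coarse: a query containing "show" would also match the "how" trigger. A word-boundary regex would be a reasonable refinement if that matters.
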
### `find_nodes`

Find code elements by name, type, or meaning.

```
Args:
  query (string)             — Function name, class name, or natural-language description
  type (optional)            — 'function' | 'class' | 'file' | 'import' | 'variable' | 'component' | 'interface' | 'type_alias' | 'enum' | 'route'
  mode (optional)            — 'text' (default) or 'semantic' (requires Ollama)
  limit (optional)           — 1–50, default 10
  offset (optional)          — for pagination
  response_format (optional) — 'json' (default) or 'markdown'
```

**Text mode**: Searches by name (LIKE match) and meta fields (role, tab, feature). When `type` is provided, results are filtered with AND logic.

**Semantic mode**: Uses Ollama embeddings for concept-level search. Example: searching "authentication handler" finds `loginUser()` even if "auth" isn't in the name. Falls back to text mode when Ollama is offline.

**Returns:** List of matching nodes with relevance ranking

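
Semantic mode ranks nodes by how close their stored embedding is to the query embedding; the Architecture section lists cosine similarity with top-K ranking for this. A minimal sketch of that technique follows. The `EmbeddedNode` shape and both function names are hypothetical, and the vectors are assumed to be already loaded into memory.

```typescript
// Illustrative only: cosine similarity plus top-K ranking over in-memory vectors.
interface EmbeddedNode {
  id: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function rankBySimilarity(query: number[], nodes: EmbeddedNode[], k: number) {
  return nodes
    .map((n) => ({ id: n.id, score: cosineSimilarity(query, n.embedding) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}
```

The length check also shows why switching embedding models requires a re-index: vectors from different models generally have different dimensions and are not comparable.
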
### `get_node`

Get full details of a specific code element.

```
Args:
  id (string)                — format: 'filePath::name' (e.g. 'src/utils/cache.ts::LRUCache')
  response_format (optional) — 'json' or 'markdown'
```

**Returns:** Complete node data (code, type, file, importance)

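
For clients that build or inspect node IDs, the `filePath::name` format can be split as sketched below. `parseNodeId` is a hypothetical helper, not part of TokenOS; splitting on the last `::` is an assumption made so that a name containing a single colon (such as an import node keyed `import:<module_specifier>`) stays intact.

```typescript
// Hypothetical helper: splits a 'filePath::name' node ID into its two parts.
function parseNodeId(id: string): { filePath: string; name: string } {
  const sep = id.lastIndexOf("::");
  // Reject IDs with no separator, an empty path, or an empty name.
  if (sep <= 0 || sep + 2 >= id.length) {
    throw new Error(`Invalid node ID: ${id}`);
  }
  return { filePath: id.slice(0, sep), name: id.slice(sep + 2) };
}
```
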
### `get_connections`

Get directly related code elements.

```
Args:
  id (string)                — Node ID
  response_format (optional) — 'json' or 'markdown'
```

**Returns:** Connected nodes and their relationships

### `explore`

Explore surrounding code context from a starting point.

```
Args:
  id (string)      — Starting node ID
  depth (optional) — 1–3, default 2
```

**Returns:** Local graph (nodes + relationships)

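
A depth-limited breadth-first traversal is one way to produce such a local graph. The sketch below works on an in-memory edge list rather than SQLite and is not the server's implementation; `exploreSubgraph`, the `Edge` shape, and the choice to follow edges in both directions are assumptions.

```typescript
// Sketch only: depth-limited BFS over an in-memory edge list.
interface Edge {
  from: string;
  to: string;
  type: string;
}

function exploreSubgraph(startId: string, edges: Edge[], depth = 2) {
  const maxDepth = Math.min(3, Math.max(1, depth)); // clamp to the documented range of 1 to 3
  const visited = new Set<string>([startId]);
  const collected = new Set<Edge>(); // a Set avoids listing the same edge twice
  let frontier: string[] = [startId];

  for (let level = 0; level < maxDepth && frontier.length > 0; level++) {
    const current = new Set(frontier);
    const next: string[] = [];
    for (const edge of edges) {
      // Assumption: edges are followed in both directions from the frontier.
      const neighbor = current.has(edge.from)
        ? edge.to
        : current.has(edge.to)
          ? edge.from
          : undefined;
      if (neighbor === undefined) continue;
      collected.add(edge);
      if (!visited.has(neighbor)) {
        visited.add(neighbor);
        next.push(neighbor);
      }
    }
    frontier = next;
  }
  return { nodes: Array.from(visited), edges: Array.from(collected) };
}
```

With a chain `a → b → c → d`, exploring from `a` at the default depth of 2 reaches `a`, `b`, and `c` but not `d`.
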
### `top_nodes`

Get the most important parts of the codebase.

```
Args:
  limit (optional)           — 1–100, default 20
  response_format (optional) — 'json' or 'markdown'
```

**Returns:** Ranked list of high-impact nodes

---

## Node Types

The parser extracts **10 distinct node types** from TypeScript/TSX source files:

| Type | Description | Detection Method |
|---|---|---|
| `function` | Named functions and arrow functions | `FunctionDeclaration` and `VariableDeclaration → ArrowFunction` |
| `component` | React/JSX components | PascalCase function that contains JSX elements |
| `class` | ES6 class declarations | `ClassDeclaration` |
| `interface` | TypeScript interface declarations | `InterfaceDeclaration` |
| `type_alias` | TypeScript type aliases | `TypeAliasDeclaration` |
| `enum` | TypeScript enum declarations | `EnumDeclaration` |
| `variable` | Exported constants and variables | Exported `VariableStatement` (excluding arrow functions already captured) |
| `import` | Import declarations | `ImportDeclaration` → keyed as `import:<module_specifier>` |
| `file` | Source file entry | One per file, named after the basename |
| `route` | Next.js App Router route | Detected from `app/**/page.tsx` file paths |

---

## Edge Types

The parser extracts **10 distinct edge types** representing relationships between nodes:

| Edge | Description |
|---|---|
| `CALLS` | Function A calls function B (via `CallExpression`) |
| `IMPORTS` | File imports a module |
| `EXPORTS` | File exports a symbol |
| `EXTENDS` | Class or interface extends a base |
| `IMPLEMENTS` | Class implements an interface |
| `DEFINES` | File defines a symbol (function, class, variable, etc.) |
| `RENDERS` | React component renders another component (JSX usage) |
| `CONTAINS` | Wrapper component contains a child in JSX tree (direct parent-child) |
| `TYPE_OF` | Symbol references a type or interface |
| `PART_OF_TAB` | Component belongs to a tab (detected from `TabsContent`, `TabContent`, `TabPanel` wrappers) |

---

## Importance Scoring

Every node is scored using a batch SQL computation:

```
Score = (inDegree × 2) + outDegree + typeWeight
```

| Type | Weight |
|---|---|
| `class` | 3.0 |
| `component` | 2.5 |
| `function` | 2.0 |
| `route` | 1.5 |
| `interface` | 1.5 |
| `enum` | 1.5 |
| `file` | 1.5 |
| `type_alias` | 1.0 |
| `variable` | 1.0 |
| `import` | 0.5 |

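
As a worked instance of the formula, the sketch below computes the score for a single node in plain TypeScript. TokenOS computes this in SQL; the function name `importanceScore` and the in-memory form are assumptions for illustration, while the weights are copied from the table above.

```typescript
// Illustrative in-memory version of the scoring formula (the README describes a SQL batch).
type NodeType =
  | "class" | "component" | "function" | "route" | "interface"
  | "enum" | "file" | "type_alias" | "variable" | "import";

// Weights copied from the README's type-weight table.
const TYPE_WEIGHT: Record<NodeType, number> = {
  class: 3.0,
  component: 2.5,
  function: 2.0,
  route: 1.5,
  interface: 1.5,
  enum: 1.5,
  file: 1.5,
  type_alias: 1.0,
  variable: 1.0,
  import: 0.5,
};

function importanceScore(type: NodeType, inDegree: number, outDegree: number): number {
  // Incoming edges count double: being depended on matters more than depending on others.
  return inDegree * 2 + outDegree + TYPE_WEIGHT[type];
}

// A class with 4 incoming and 3 outgoing edges: (4 × 2) + 3 + 3.0 = 14
console.log(importanceScore("class", 4, 3));
```
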
Importance is computed in a single aggregation query (no N+1), then batch-updated in a transaction.

---

## Semantic Meta Enrichment

The parser automatically infers rich metadata for each node:

### UI Role Detection

Components are assigned a UI `role` based on name/file pattern matching:

| Pattern | Role |
|---|---|
| `*panel*` | `panel` |
| `*tab*` | `tab` |
| `*page*` | `page` |
| `*dialog*`, `*modal*` | `dialog` |
| `*form*` | `form` |
| `*sidebar*`, `*nav*` | `navigation` |
| `*header*` | `header` |
| `*footer*` | `footer` |
| `*content*` | `content` |
| `*list*` | `list` |
| `*card*` | `card` |
| `*button*` | `action` |
| `*layout*` | `layout` |

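
The pattern table can be read as a first-match lookup. The sketch below is a hypothetical reconstruction, not the parser's code: `inferRole`, the case-insensitive substring test, and the use of the table's row order to resolve names that match several patterns (for example `TabPanel`) are all assumptions.

```typescript
// Hypothetical first-match role lookup; keyword order follows the README table.
const ROLE_PATTERNS: ReadonlyArray<readonly [string, string]> = [
  ["panel", "panel"],
  ["tab", "tab"],
  ["page", "page"],
  ["dialog", "dialog"],
  ["modal", "dialog"],
  ["form", "form"],
  ["sidebar", "navigation"],
  ["nav", "navigation"],
  ["header", "header"],
  ["footer", "footer"],
  ["content", "content"],
  ["list", "list"],
  ["card", "card"],
  ["button", "action"],
  ["layout", "layout"],
];

function inferRole(nameOrFile: string): string | undefined {
  const lower = nameOrFile.toLowerCase();
  // The first pattern contained in the name wins; undefined means no role.
  const match = ROLE_PATTERNS.find(([keyword]) => lower.includes(keyword));
  return match ? match[1] : undefined;
}
```

Under these assumptions `inferRole("LoginModal")` yields `dialog`, and a name with no matching keyword, such as `useDebounce`, gets no role.
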
### Feature Inference

Features are extracted from directory structure:
- **Next.js App Router**: `app/(group)/feature-name/...` → `feature: "feature-name"`
- **Component directories**: `components/feature-name/...` → `feature: "feature-name"`

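
The two directory rules can be sketched as a path-splitting function. This is an illustration under assumptions rather than the parser's actual logic: `inferFeature` is a hypothetical name, route groups are recognised by their surrounding parentheses, and the `app/` rule is tried before the `components/` rule.

```typescript
// Hypothetical sketch: derive a feature name from a file path.
function inferFeature(filePath: string): string | undefined {
  const parts = filePath.split(/[\\/]/).filter(Boolean);

  const appIdx = parts.indexOf("app");
  if (appIdx !== -1) {
    // Directories between "app" and the file, ignoring route groups such as "(marketing)".
    const dirs = parts.slice(appIdx + 1, -1).filter((p) => !/^\(.*\)$/.test(p));
    if (dirs.length > 0) return dirs[0];
  }

  const compIdx = parts.indexOf("components");
  // Require a directory after "components", not just a file directly inside it.
  if (compIdx !== -1 && compIdx + 2 < parts.length) return parts[compIdx + 1];

  return undefined;
}
```

For example, `app/(dashboard)/billing/page.tsx` and `src/components/auth/LoginForm.tsx` would map to `billing` and `auth`, while `components/Button.tsx` has no feature directory and yields `undefined`.
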
### Route Detection

Next.js App Router routes are auto-detected from `app/**/page.tsx` paths and stored as `route` nodes.

### Tab System Detection

When the parser encounters `<TabsContent value="xxx">` (or `TabContent`, `TabPanel`), it:
1. Creates `PART_OF_TAB` edges from children to the parent component
2. Sets `meta.tab = "xxx"` on the child nodes

### Enriched Meta Search

All meta fields (role, tab, feature) are queryable via the `searchNodesExtended` function, which combines name LIKE matching with JSON meta extraction in a single SQL query.

---

## Conversation Memory

TokenOS includes a persistent memory system for storing conversation context across sessions:

- **Storage**: SQLite `memories` table with title, summary, key_points (JSON array), tags (JSON array), and optional embeddings
- **Auto-indexing**: Markdown files in `/memory/` or `/memories/` directories within the watched project are automatically parsed and stored
- **Extraction**: Titles from `# headings`, tags from `tags: [...]` patterns, key points from bullet lists
- **Search**: Text-based search across title, summary, and tags
- **Integration**: Memories are automatically surfaced in `search` results

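
The extraction rules above amount to a few regular expressions over the Markdown text. The sketch below shows one way they could look; it is not the TokenOS parser. `parseMemoryMarkdown`, the `ParsedMemory` shape, the `"Untitled"` fallback, and the exact regexes are assumptions.

```typescript
// Hypothetical regex-based extraction of title, tags, and key points from a memory file.
interface ParsedMemory {
  title: string;
  tags: string[];
  keyPoints: string[];
}

function parseMemoryMarkdown(markdown: string): ParsedMemory {
  const lines = markdown.split(/\r?\n/);

  // Title: the first level-1 heading ("# ..."). The fallback value is an assumption.
  const heading = lines.find((l) => /^#\s+\S/.test(l));
  const title = heading ? heading.replace(/^#\s+/, "").trim() : "Untitled";

  // Tags: the first "tags: [a, b]" pattern anywhere in the file; quotes are stripped.
  const tagMatch = markdown.match(/tags:\s*\[([^\]]*)\]/i);
  const tags = tagMatch
    ? tagMatch[1]
        .split(",")
        .map((t) => t.trim().replace(/^["']|["']$/g, ""))
        .filter(Boolean)
    : [];

  // Key points: every "-" or "*" bullet line.
  const keyPoints = lines
    .map((l) => l.match(/^\s*[-*]\s+(.*)$/))
    .filter((m): m is RegExpMatchArray => m !== null)
    .map((m) => m[1].trim());

  return { title, tags, keyPoints };
}
```
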

---

## Visualization Dashboard

Enable the built-in visualization UI by setting `ui.enabled: true` in your config or passing the `--ui` CLI flag.

### Routes

| Route | Description |
|---|---|
| `/` | **Dashboard** — Glassmorphism-styled overview with stats cards, top nodes grid, and full context explorer table. Animated with GSAP. |
| `/graph` | **Network Graph** — Interactive force-directed graph visualization using vis-network. Click nodes for details. |
| `/api/stats` | JSON API: node counts by type + top 50 nodes |
| `/api/graph-data` | JSON API: full graph data (auto-limited to top 1500 nodes for browser performance) |

---

## Changing the Embedding Model

Edit `tokenos.config.json`:

```json
{
  "ollama": {
    "model": "mxbai-embed-large:latest"
  }
}
```

Then reset and re-index (different models produce incompatible vectors):

```bash
npm run reset
npm run dev
```

Popular models:
- `mxbai-embed-large:latest` — high quality, larger context
- `nomic-embed-text` — fast, good general purpose (default fallback)
- `all-minilm` — lightweight, fast

### Embedding Pipeline Details

- **Input enrichment**: Embeddings are generated from a structured prompt that includes `[NAME]`, `[TYPE]`, `[ROLE]`, `[TAB]`, `[FEATURE]`, `[ROUTE]`, `[SUMMARY]`, and `[CODE]` (first 300 chars) tags. This produces richer vectors than embedding raw code alone.
- **LRU caching**: An in-memory LRU cache (1000 entries, 30min TTL) prevents redundant Ollama calls for unchanged text.
- **Backfill strategy**: On boot, only nodes _without_ existing embeddings are processed. If Ollama becomes unavailable mid-backfill, the process stops gracefully.
- **Health checks**: Ollama availability is probed once via `/api/tags` with a 2-second timeout. Subsequent failures reset the health flag for retry on next request.

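
The LRU cache with TTL mentioned in the list above can be sketched with a `Map`, which keeps insertion order. This is a generic illustration, not the contents of `utils/cache.ts`: the class name `LruTtlCache`, the injectable clock, and the decision not to refresh an entry's TTL on reads are assumptions. The defaults mirror the documented 1000 entries and 30 minutes.

```typescript
// Generic LRU cache with per-entry TTL (illustrative; not TokenOS's cache.ts).
class LruTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(
    maxEntries = 1000,
    ttlMs = 30 * 60 * 1000,
    now: () => number = () => Date.now(), // injectable clock, handy for tests
  ) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entries are dropped lazily on read
      return undefined;
    }
    // Re-insert so the key becomes the most recently used (Map keeps insertion order).
    this.entries.delete(key);
    this.entries.set(key, hit);
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```
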

---

## Architecture

```
src/
├── main.ts                 # Entry point — validates config, bootstraps all systems, graceful shutdown
├── config.ts               # Config loader (CLI args → config file → env vars → defaults)
├── reset.ts                # Reset script — deletes project DB + WAL/SHM files
├── verify-parser.ts        # Parser verification — validates node types, edge types, and meta
├── types.ts                # Shared TypeScript types (10 NodeTypes, 10 EdgeTypes, ConversationMemory)
│
├── db/
│   ├── connection.ts       # SQLite connection + schema (WAL mode, foreign keys, 3 tables, 7 indexes)
│   ├── queries.ts          # 20+ prepared statements (upsert, search, batch importance, meta queries)
│   ├── memory.ts           # Conversation memory CRUD (upsert, search, get all)
│   └── index.ts            # Re-exports
│
├── indexer/
│   ├── parser.ts           # ts-morph AST parser — extracts 10 node types, 10 edge types, semantic meta
│   ├── indexer.ts          # Orchestrates parse → hash-skip → transaction (delete stale + upsert fresh)
│   ├── watcher.ts          # chokidar file watcher (incremental: add/change/unlink, ignoreInitial)
│   ├── ignore.ts           # .gitignore rule loader + hardcoded ignores (node_modules, .git, dist, etc.)
│   └── cli.ts              # One-shot CLI indexer (npm run index)
│
├── embeddings/
│   ├── client.ts           # Ollama HTTP client + enriched embedding input builder + LRU cache
│   ├── similarity.ts       # Cosine similarity + ranked search (top-K)
│   └── index.ts            # backfillEmbeddings() + re-exports
│
├── server/
│   ├── server.ts           # MCP server — registers 6 tools, BFS subgraph builder, node compression
│   ├── visualize.ts        # Optional visualization dashboard (HTTP server, vis-network + GSAP)
│   └── index.ts            # Re-exports
│
└── utils/
    ├── scoring.ts          # Importance score computation (delegates to batch SQL)
    ├── logger.ts           # Vite-inspired colored logger (picocolors, per-module tags)
    └── cache.ts            # Generic LRU cache with TTL (used by embedding client)
```

### Boot Sequence

```
1. Validate config (watchPath exists)
2. Optionally start visualization UI server
3. Full directory indexing (recursive walk, .gitignore-aware)
4. Probe Ollama → backfill embeddings for un-embedded nodes
5. Batch-compute importance scores (single SQL aggregation)
6. Start chokidar watcher (incremental updates only)
7. Start MCP stdio server (blocks until client disconnects)
8. Print Vite-like status banner with health checks
9. Register SIGINT/SIGTERM handlers for graceful shutdown
```

### Incremental Update Strategy

- **Hash-based skip**: Each file's content is SHA-256 hashed (truncated to 16 chars). Files with unchanged hashes are skipped entirely.
- **Transactional upsert**: On file change, a single SQLite transaction deletes all stale nodes/edges for the file then inserts fresh data, ensuring FK consistency.
- **Graceful FK handling**: Edges referencing nodes in un-indexed files are silently skipped (they'll be wired up when the target file is indexed).

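
The hash-based skip can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the indexer's code: `contentHash`, `shouldReindex`, and the in-memory `lastSeen` map are hypothetical (TokenOS presumably keeps the hash in SQLite), and reading "16 chars" as the first 16 hex characters of the digest is an assumption.

```typescript
import { createHash } from "node:crypto";

// SHA-256 of the file content, truncated to the first 16 hex characters (assumed encoding).
function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex").slice(0, 16);
}

// Hypothetical in-memory stand-in for the stored per-file hash.
const lastSeen = new Map<string, string>();

// Returns true when the file is new or its content changed since the last call.
function shouldReindex(filePath: string, content: string): boolean {
  const hash = contentHash(content);
  if (lastSeen.get(filePath) === hash) return false; // unchanged, so skip parsing
  lastSeen.set(filePath, hash);
  return true;
}
```
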

---

## Configuration Reference

### `tokenos.config.json`

| Field | Type | Default | Description |
|---|---|---|---|
| `watchPath` | `string` | `process.cwd()` | Absolute path to the project you want to index |
| `ollama.url` | `string` | `http://localhost:11434` | Ollama server URL |
| `ollama.model` | `string` | `nomic-embed-text` | Embedding model to use |
| `ui.enabled` | `boolean` | `false` | Start visualization dashboard on boot |
| `ui.port` | `number` | `3333` | Dashboard HTTP server port |

### Environment Variable Overrides

| Variable | Overrides |
|---|---|
| `OLLAMA_URL` | `ollama.url` |
| `EMBEDDING_MODEL` | `ollama.model` |
| `GRAPH_UI_PORT` | `ui.port` |

### CLI Arguments

| Argument | Description |
|---|---|
| First non-flag arg | Overrides `watchPath` |
| `--ui` | Enables visualization dashboard (overrides config) |

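
One way to picture how these sources layer is a resolver that checks them in the order this README states: CLI arguments, then the config file, then environment variables, then defaults. The sketch below is an assumption-laden illustration and not `config.ts` itself; `resolveConfig`, `PartialConfig`, and the parameter shapes are hypothetical. Taken literally, this order means an environment variable only takes effect when the config file leaves that field unset.

```typescript
// Hypothetical config resolver following the README's stated precedence.
interface TokenOsConfig {
  watchPath: string;
  ollama: { url: string; model: string };
  ui: { enabled: boolean; port: number };
}

type PartialConfig = {
  watchPath?: string;
  ollama?: { url?: string; model?: string };
  ui?: { enabled?: boolean; port?: number };
};

function resolveConfig(
  cli: PartialConfig, // from argv: first non-flag arg and --ui
  file: PartialConfig, // parsed tokenos.config.json
  env: Record<string, string | undefined>,
  cwd: string,
): TokenOsConfig {
  const parsedPort = Number(env["GRAPH_UI_PORT"]);
  const envPort = env["GRAPH_UI_PORT"] && Number.isFinite(parsedPort) ? parsedPort : undefined;

  return {
    watchPath: cli.watchPath ?? file.watchPath ?? cwd,
    ollama: {
      url: file.ollama?.url ?? env["OLLAMA_URL"] ?? "http://localhost:11434",
      model: file.ollama?.model ?? env["EMBEDDING_MODEL"] ?? "nomic-embed-text",
    },
    ui: {
      enabled: cli.ui?.enabled ?? file.ui?.enabled ?? false,
      port: file.ui?.port ?? envPort ?? 3333,
    },
  };
}
```
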
**Precedence**: CLI args → config file → environment variables → defaults

---

## Prerequisites

| Dependency | Purpose |
|---|---|
| Node.js ≥ 18 | Runtime |
| `npm` | Package manager |
| [Ollama](https://ollama.ai/) *(optional)* | Semantic embedding generation |

If Ollama is not running, the server starts normally — semantic search falls back to text-mode and embeddings are skipped.

---

## Tech Stack

**Core Technologies:**
- **[TypeScript](https://www.typescriptlang.org/) / [Node.js](https://nodejs.org/)** — Core language and runtime (ES2022 target, Node16 module resolution)
- **[SQLite](https://sqlite.org/)** — Local, fast, embedded graph database (WAL mode)
- **[ts-morph](https://ts-morph.com/)** — TypeScript AST parsing tool for static analysis
- **[Model Context Protocol (MCP)](https://modelcontextprotocol.io/)** — Standardized AI tool integration protocol
- **[Ollama](https://ollama.com/)** — Local semantic vector embeddings *(optional)*

### Dependencies

| Package | Version | Role |
|---|---|---|
| `@modelcontextprotocol/sdk` | ^1.8.0 | MCP server + stdio transport |
| `better-sqlite3` | ^11.9.1 | Synchronous SQLite (Node.js) |
| `ts-morph` | ^25.0.1 | TypeScript AST parsing |
| `chokidar` | ^4.0.3 | File watching |
| `ignore` | ^7.0.5 | `.gitignore`-pattern matching |
| `zod` | ^4.3.6 | Schema validation for MCP tool inputs |
| `picocolors` | ^1.1.1 | Terminal colors |

### Dev Dependencies

| Package | Version | Role |
|---|---|---|
| `@types/better-sqlite3` | ^7.6.12 | SQLite type definitions |
| `@types/node` | ^22.13.13 | Node.js type definitions |
| `tsx` | ^4.19.3 | TypeScript dev runner |
| `typescript` | ^5.8.2 | TypeScript compiler |

---

## Limitations

- Only `.ts` and `.tsx` files are indexed (no `.js`, `.jsx`, `.vue`, etc.)
- Semantic search requires Ollama to be running locally
- The graph database is built from scratch on the first run for each project
- Subgraph BFS and cognitive search responses are truncated at 25,000 characters
- Cross-file edges to un-indexed targets are gracefully skipped (resolved when the dependency is indexed later)
- Memory file extraction uses basic regex parsing (headings, bullet points, tag patterns)
- Visualization dashboard loads at most 1,500 nodes to prevent browser rendering issues

---

## License

MIT