@199-bio/engram 0.1.0
- package/.env.example +19 -0
- package/LICENSE +21 -0
- package/LIVING_PLAN.md +180 -0
- package/PLAN.md +514 -0
- package/README.md +304 -0
- package/dist/graph/extractor.d.ts.map +1 -0
- package/dist/graph/index.d.ts.map +1 -0
- package/dist/graph/knowledge-graph.d.ts.map +1 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +473 -0
- package/dist/retrieval/colbert.d.ts.map +1 -0
- package/dist/retrieval/hybrid.d.ts.map +1 -0
- package/dist/retrieval/index.d.ts.map +1 -0
- package/dist/storage/database.d.ts.map +1 -0
- package/dist/storage/index.d.ts.map +1 -0
- package/package.json +62 -0
- package/src/graph/extractor.ts +441 -0
- package/src/graph/index.ts +2 -0
- package/src/graph/knowledge-graph.ts +263 -0
- package/src/index.ts +558 -0
- package/src/retrieval/colbert-bridge.py +222 -0
- package/src/retrieval/colbert.ts +317 -0
- package/src/retrieval/hybrid.ts +218 -0
- package/src/retrieval/index.ts +2 -0
- package/src/storage/database.ts +527 -0
- package/src/storage/index.ts +1 -0
- package/tests/test-interactive.js +218 -0
- package/tests/test-mcp.sh +81 -0
- package/tsconfig.json +20 -0
package/.env.example
ADDED
@@ -0,0 +1,19 @@
+# Engram Configuration
+# Copy this file to .env and fill in your values
+
+# Required: Path to store the database
+ENGRAM_DB_PATH=~/.engram
+
+# Optional: Cloud API keys for enhanced quality
+# These are optional - Engram works fully offline without them
+# GEMINI_API_KEY=your-gemini-api-key
+# COHERE_API_KEY=your-cohere-api-key
+
+# Optional: Model configuration
+# COLBERT_MODEL=colbert-ir/colbertv2.0
+# EMBEDDING_MODEL=Qwen/Qwen3-Embedding-0.6B
+
+# Optional: Performance tuning
+# MAX_MEMORY_CACHE=1000
+# RETRIEVAL_TOP_K=50
+# RERANK_TOP_K=10
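As a rough illustration of how the settings in `.env.example` might be consumed, the sketch below reads them from a plain key/value map and falls back to the values shown in the commented-out lines. The `EngramConfig` type, the `loadConfig` and `intSetting` helpers, and the decision to treat the commented values as defaults are all assumptions made for illustration; the package's actual loading code is not part of this file.

```typescript
// Hypothetical config shape: field names are illustrative, not taken from the package.
interface EngramConfig {
  dbPath: string;
  geminiApiKey?: string;
  cohereApiKey?: string;
  colbertModel: string;
  embeddingModel: string;
  maxMemoryCache: number;
  retrievalTopK: number;
  rerankTopK: number;
}

type Env = Record<string, string | undefined>;

// Read a positive integer setting; fall back when it is unset or not a positive number.
function intSetting(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = parseInt(raw, 10);
  return n > 0 ? n : fallback; // NaN > 0 is false, so unparsable input falls back too
}

// Assumption: the commented-out values in .env.example double as defaults.
function loadConfig(env: Env): EngramConfig {
  return {
    dbPath: env["ENGRAM_DB_PATH"] ?? "~/.engram",
    geminiApiKey: env["GEMINI_API_KEY"],
    cohereApiKey: env["COHERE_API_KEY"],
    colbertModel: env["COLBERT_MODEL"] ?? "colbert-ir/colbertv2.0",
    embeddingModel: env["EMBEDDING_MODEL"] ?? "Qwen/Qwen3-Embedding-0.6B",
    maxMemoryCache: intSetting(env, "MAX_MEMORY_CACHE", 1000),
    retrievalTopK: intSetting(env, "RETRIEVAL_TOP_K", 50),
    rerankTopK: intSetting(env, "RERANK_TOP_K", 10),
  };
}

// Example: override one setting, leave the rest at their assumed defaults.
const exampleConfig = loadConfig({ RETRIEVAL_TOP_K: "25" });
```

Note that `~` in `ENGRAM_DB_PATH` is shell notation. Node's file APIs do not expand it, so a real loader would presumably resolve it against the user's home directory before opening the database.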
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Boris Djordjevic, 199 Biotechnologies
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
package/LIVING_PLAN.md
ADDED
@@ -0,0 +1,180 @@
+# Engram Development - Living Plan
+
+**Last Updated**: 2024-12-22 03:50 UTC
+
+This file tracks development progress. If context is lost, read this file to continue.
+
+---
+
+## Current Status: Phase 5 - Production Ready
+
+### Completed
+- [x] Project structure created
+- [x] package.json, tsconfig.json, .gitignore, LICENSE
+- [x] SQLite storage layer (`src/storage/database.ts`)
+  - Memories table with FTS5 for BM25
+  - Entities, Observations, Relations tables
+  - Graph traversal queries
+  - All CRUD operations
+- [x] Entity extractor (`src/graph/extractor.ts`)
+  - Heuristic-based name extraction
+  - Organization detection (Goldman Sachs, etc.)
+  - Known organizations database
+  - Relationship extraction
+  - No external dependencies
+- [x] Knowledge graph manager (`src/graph/knowledge-graph.ts`)
+  - High-level graph operations
+  - Auto-extraction from text
+  - Graph traversal
+- [x] ColBERT Python bridge (`src/retrieval/colbert-bridge.py`)
+  - RAGatouille integration
+  - JSON stdin/stdout protocol
+- [x] TypeScript ColBERT wrapper (`src/retrieval/colbert.ts`)
+  - Subprocess management
+  - Fallback SimpleRetriever when Python unavailable
+- [x] Hybrid search (`src/retrieval/hybrid.ts`)
+  - BM25 + Semantic + Graph
+  - Reciprocal Rank Fusion (RRF)
+- [x] MCP server with all tools (`src/index.ts`)
+  - remember, recall, forget
+  - create_entity, observe, relate, query_entity, list_entities
+  - stats
+- [x] Install dependencies and build
+- [x] Test end-to-end with fictive examples (11 tests pass)
+- [x] Entity extraction improvements
+  - Goldman Sachs correctly detected as organization
+  - Known organizations database
+  - Place filtering (California, etc.)
+  - Nationality/religion filtering
+
+### Verified Working
+- All 11 MCP test cases pass
+- BM25 search working (FTS5)
+- Graph-based entity linking working
+- ColBERT Python bridge working
+- Entity extraction correctly identifies orgs vs persons
+
+---
+
+## File Structure
+
+```
+engram/
+├── src/
+│   ├── index.ts  # MCP server (DONE)
+│   ├── storage/
+│   │   ├── database.ts  # SQLite + FTS5 (DONE)
+│   │   └── index.ts  # Exports (DONE)
+│   ├── graph/
+│   │   ├── extractor.ts  # Entity extraction (DONE)
+│   │   ├── knowledge-graph.ts  # Graph operations (DONE)
+│   │   └── index.ts  # Exports (DONE)
+│   ├── retrieval/
+│   │   ├── colbert.ts  # TypeScript wrapper (DONE)
+│   │   ├── colbert-bridge.py  # Python RAGatouille (DONE)
+│   │   ├── hybrid.ts  # RRF fusion (DONE)
+│   │   └── index.ts  # Exports (DONE)
+├── tests/
+│   ├── test-interactive.js  # Full test suite (DONE)
+│   └── test-mcp.sh  # Shell test script (DONE)
+├── dist/  # Compiled JS (auto-generated)
+├── package.json  # Dependencies (DONE)
+├── tsconfig.json  # TypeScript config (DONE)
+├── README.md  # Documentation (DONE)
+└── LIVING_PLAN.md  # This file (DONE)
+```
+
+---
+
+## MCP Tools Available
+
+1. **remember** - Store a new memory, auto-extracts entities
+2. **recall** - Hybrid search (BM25 + semantic + graph)
+3. **forget** - Remove a memory by ID
+4. **create_entity** - Manually create an entity
+5. **observe** - Add an observation about an entity
+6. **relate** - Create a relationship between entities
+7. **query_entity** - Get entity details and relationships
+8. **list_entities** - List all entities by type
+9. **stats** - Get memory/entity/relation counts
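The tools listed above are exposed over MCP, which frames calls as JSON-RPC 2.0 messages, one JSON document per line on the stdio transport. The sketch below builds a `tools/list` request and a `tools/call` request for `remember`. The helper names and the `content` argument are hypothetical; the real input schema of each tool is defined in `src/index.ts` and is not visible in this plan.

```typescript
// Minimal JSON-RPC 2.0 request shape as used by MCP.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Ask the server which tools it offers.
function listToolsRequest(id: number): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/list" };
}

// Invoke one tool by name. MCP puts the tool name and its arguments under `params`.
function callToolRequest(
  id: number,
  name: string,
  args: Record<string, unknown>
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Serialize as a single newline-terminated line, ready to write to the server's stdin.
function toLine(req: JsonRpcRequest): string {
  return JSON.stringify(req) + "\n";
}

const listLine = toLine(listToolsRequest(1));
// `content` is an assumed argument name, used here only for illustration.
const rememberLine = toLine(
  callToolRequest(2, "remember", { content: "Alice joined Goldman Sachs in 2021." })
);
```

A complete MCP session normally starts with an `initialize` exchange before any tool call; that handshake is left out here to keep the sketch short.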
+
+---
+
+## Key Decisions
+
+1. **ColBERT via Python**: RAGatouille is proven, well-maintained. Use subprocess.
+2. **BM25 via SQLite FTS5**: Already implemented, zero deps.
+3. **Local-first**: No API keys required.
+4. **Entity extraction**: Heuristics + known org database. Can add GLiNER later.
+5. **Hybrid Search**: RRF fusion with k=60 constant.
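To make decision 5 concrete: Reciprocal Rank Fusion gives each document the score `sum(1 / (k + rank))` over every ranked list that contains it, with 1-based ranks and `k = 60` as stated above. The following is a minimal sketch of standard RRF, not the code in `src/retrieval/hybrid.ts`; the function name, the result shape, and the sample memory IDs are assumptions, and each input list is assumed to contain unique IDs.

```typescript
interface FusedResult {
  id: string;
  score: number;
}

// Fuse several ranked lists of document IDs (best first) into one ranking.
function reciprocalRankFusion(rankings: string[][], k = 60): FusedResult[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores, ([id, score]) => ({ id, score })).sort(
    (a, b) => b.score - a.score
  );
}

// Hypothetical results from the three signals named in the plan.
const bm25Ranking = ["m1", "m2", "m3"];
const semanticRanking = ["m2", "m1", "m4"];
const graphRanking = ["m2"];

const fused = reciprocalRankFusion([bm25Ranking, semanticRanking, graphRanking]);
// "m2" comes first: it appears in all three lists (1/62 + 1/61 + 1/61).
```

Because RRF uses only ranks, the raw BM25, ColBERT, and graph scores never need to be normalized against each other, which is the usual reason for choosing it in hybrid search.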
+
+---
+
+## Testing Commands
+
+```bash
+# Build TypeScript
+cd /Users/biobook/Code/stuff/engram
+npm install
+npm run build
+
+# Run full test suite
+node tests/test-interactive.js
+
+# Test MCP server manually
+echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | node dist/index.js
+
+# Install as MCP for Claude Desktop
+# Add to ~/.claude/claude_desktop_config.json:
+# {
+#   "mcpServers": {
+#     "engram": {
+#       "command": "node",
+#       "args": ["/Users/biobook/Code/stuff/engram/dist/index.js"]
+#     }
+#   }
+# }
+```
+
+---
+
+## Known Limitations
+
+- Windows not supported (RAGatouille limitation)
+- ColBERT models are ~500MB (downloaded on first use)
+- BM25 scores for named entities are low (graph search compensates)
+- Place extraction not implemented (California detected as person)
+
+---
+
+## Future Enhancements
+
+- [ ] GLiNER for better NER
+- [ ] Gemini embeddings (optional cloud enhancement)
+- [ ] Cohere reranking (optional cloud enhancement)
+- [ ] Temporal memory decay
+- [ ] Memory consolidation (merge similar memories)
+- [ ] Export/import functionality
+
+---
+
+## To Continue Development
+
+If starting fresh, run these commands:
+
+```bash
+cd /Users/biobook/Code/stuff/engram
+cat LIVING_PLAN.md  # Read this file
+npm run build  # Rebuild if needed
+node tests/test-interactive.js  # Run tests
+```
+
+---
+
+## API Keys Needed
+
+**NONE** - This is a local-first implementation.
+
+Optional (for future cloud enhancement):
+- GEMINI_API_KEY - embeddings
+- COHERE_API_KEY - reranking