persyst-mcp 2.2.5 → 2.2.7
- package/README.md +103 -114
- package/bin/export.js +4 -4
- package/bin/extract.js +8 -8
- package/bin/import.js +15 -15
- package/bin/init.js +185 -38
- package/bin/mcp.js +3 -0
- package/bin/monitor.js +511 -0
- package/bin/setup.js +9 -9
- package/index.js +31 -11
- package/package.json +10 -11
- package/src/attestation.js +49 -28
- package/src/cache.js +3 -1
- package/src/database.js +227 -34
- package/src/embeddings.js +4 -2
- package/src/events.js +2 -0
- package/src/extractor-heuristic.js +5 -2
- package/src/sdk.js +4 -3
- package/src/search.js +55 -84
- package/src/server.js +884 -723
- package/src/setup-wasm.js +34 -39
- package/src/text-utils.js +52 -0
- package/src/tools.js +98 -53
- package/src/watcher.js +157 -49
package/README.md
CHANGED
@@ -1,203 +1,192 @@
 # Persyst
 
-**Local-first MCP memory
+**Local-first, compliance-grade MCP memory layer for regulated enterprise coding teams using AI assistants.**
 
-Persyst gives AI coding agents (Claude Code, Cursor, VS Code, Aider, Windsurf, Antigravity) persistent memory across sessions. It stores memories in a local SQLite database with hybrid keyword + semantic search —
+Persyst gives AI coding agents (Claude Code, Cursor, VS Code, Aider, Windsurf, Antigravity) persistent memory across sessions. It stores memories in a local SQLite database with hybrid keyword + semantic search — operating 100% offline with zero cloud egress.
 
-
+---
 
-
-
-
+## Compliance-Grade Security Features
+
+Persyst is built from the ground up for highly regulated enterprise environments (finance, healthcare, defense) subject to **SOC 2**, **HIPAA**, and the **EU AI Act**:
 
-
-
-
+* **100% Data Residency (Zero-Egress)**: All vector calculations, full-text searches, and model inferences run locally on the developer's workstation. No database records or context data ever leave the local machine. Bypasses Business Associate Agreement (BAA) complexity for HIPAA.
+* **Cryptographic Chain of Custody**: Every context retrieval generates an Ed25519 cryptographic signature sealing the query and retrieved memory hashes. Each attestation is chained to the previous one via SHA-256 hash chains, creating a tamper-evident audit ledger verifiable by security teams.
+* **Automatic Secret Redaction**: Scans incoming log files and text writes to redact high-entropy secrets (API keys, JWTs, database strings, private keys) before they reach the persistent database.
+* **Event-Driven File Watching**: Integrates `chokidar` for instant scanning of agent transcript folders, guaranteeing that your memories are synchronized immediately after each agent interaction.
+* **Workspace Project Isolation**: Supports `PERSYST_PROJECT` environment partitioning, preventing cross-project context leaks while allowing shared enterprise compliance rules.
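
The "Cryptographic Chain of Custody" bullet above describes signing each retrieval with Ed25519 and linking attestations through SHA-256 hashes. The following is a minimal sketch of that general technique using Node's built-in `crypto` module. The record layout and the names `payloadOf`, `appendAttestation`, and `verifyChain` are assumptions made for illustration; the actual logic lives in `package/src/attestation.js`, which is not shown in this part of the diff.

```javascript
// Illustrative sketch only: field names and helpers are hypothetical, not Persyst's schema.
const crypto = require('node:crypto');

const sha256 = (text) => crypto.createHash('sha256').update(text).digest('hex');

// Canonical string covered by both the signature and the chain hash.
function payloadOf(entry) {
  return JSON.stringify({
    query: entry.query,
    memoryHashes: entry.memoryHashes,
    prevHash: entry.prevHash,
  });
}

function appendAttestation(chain, privateKey, query, memories) {
  // The first entry links to an all-zero placeholder hash.
  const prevHash = chain.length > 0 ? chain[chain.length - 1].hash : '0'.repeat(64);
  const entry = { query, memoryHashes: memories.map(sha256), prevHash };
  const payload = payloadOf(entry);
  // For Ed25519 keys, crypto.sign takes null as the digest algorithm.
  entry.signature = crypto.sign(null, Buffer.from(payload), privateKey).toString('base64');
  entry.hash = sha256(payload + entry.signature);
  chain.push(entry);
  return entry;
}

function verifyChain(chain, publicKey) {
  let expectedPrev = '0'.repeat(64);
  for (const entry of chain) {
    const payload = payloadOf(entry);
    const signatureOk = crypto.verify(
      null,
      Buffer.from(payload),
      publicKey,
      Buffer.from(entry.signature, 'base64'),
    );
    const linkOk = entry.prevHash === expectedPrev;
    const hashOk = entry.hash === sha256(payload + entry.signature);
    if (!signatureOk || !linkOk || !hashOk) return false;
    expectedPrev = entry.hash;
  }
  return true;
}

const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const chain = [];
appendAttestation(chain, privateKey, 'auth flow', ['Use JWT refresh tokens']);
appendAttestation(chain, privateKey, 'db choice', ['SQLite in WAL mode']);
console.log(verifyChain(chain, publicKey)); // true for an untouched chain
```

Editing any earlier entry changes its payload, so both its signature check and the hash link of every later entry fail, which is what makes a ledger of this shape tamper-evident.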
 
-
+*Read more in our compliance mapping guides:*
+- [SOC 2 Type II Controls](compliance/SOC2-controls.md)
+- [HIPAA Mapping & PHI Boundaries](compliance/HIPAA-mapping.md)
+- [EU AI Act Article 13 Transparency](compliance/EU-AI-Act-Article13.md)
+- [Compliance Audit Trail Sample](compliance/audit-trail-sample.md)
 
 ---
 
-## Quick Start
+## Quick Start & Automatic IDE Setup
 
-You don't need to
+You don't need to configure MCP files manually. Persyst includes an automated setup CLI that detects installed editors and configures rule wrappers and global settings in seconds.
 
-###
+### Automatic One-Command Setup
 
-
-
-```
-
-  "mcpServers": {
-    "persyst": {
-      "command": "npx",
-      "args": ["-y", "persyst-mcp"]
-    }
-  }
-}
+Run the setup wizard in your target project directory:
+
+```bash
+npx persyst-mcp init
 ```
 
-
-
-
-
-
+This command automatically:
+1. Generates local cryptographic Ed25519 keypairs in `~/.persyst`.
+2. Creates workspace rule files (`.cursorrules`, `.windsurfrules`, `.clinerules`, `.persystrules.md`) to instruct agents on memory retrieval.
+3. Automatically writes global MCP server configurations for **Cursor**, **Claude Code**, **Aider**, and **Continue.dev** with project-scoped environment parameters (`PERSYST_PROJECT`).
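
Step 3 above says the wizard writes MCP server entries scoped by `PERSYST_PROJECT`. As a rough sketch of what such a write could look like, the hypothetical helper below merges a `persyst` entry into an existing configuration object without touching other servers. The entry shape is copied from the manual configuration examples in this README; the function name and merge strategy are assumptions, and the real behaviour of `package/bin/init.js` is not shown in this part of the diff.

```javascript
// Hypothetical helper for illustration; not taken from Persyst's bin/init.js.
function withPersystServer(existingConfig, projectName) {
  const config = existingConfig ?? {};
  // Spread copies keep the caller's object unmodified and preserve other servers.
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      persyst: {
        command: 'npx',
        args: ['-y', 'persyst-mcp'],
        env: { PERSYST_PROJECT: projectName },
      },
    },
  };
}

const current = { mcpServers: { other: { command: 'node', args: ['other.js'] } } };
console.log(JSON.stringify(withPersystServer(current, 'my-project'), null, 2));
```

Returning a new object instead of mutating the input makes it easy to compare the before and after states prior to writing the file back to disk.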
+
+---
+
+## Manual MCP Configuration
 
+If you prefer to configure your agent manually, add the MCP server definition to your editor:
+
+### Claude Code (`~/.claude.json`) & Claude Desktop
 ```json
 {
   "mcpServers": {
     "persyst": {
       "command": "npx",
-      "args": ["-y", "persyst-mcp"]
+      "args": ["-y", "persyst-mcp"],
+      "env": {
+        "PERSYST_PROJECT": "my-project"
+      }
     }
   }
 }
 ```
 
----
-
-## Setup for Other Agents
-
 ### VS Code (Cline / Roo Code)
-Add
+Add to your user settings under `cline_mcp_settings.json`:
 ```json
 {
   "mcpServers": {
     "persyst": {
       "command": "npx",
-      "args": ["-y", "persyst-mcp"]
+      "args": ["-y", "persyst-mcp"],
+      "env": {
+        "PERSYST_PROJECT": "my-project"
+      }
     }
   }
 }
 ```
 
 ### Cursor
-
+Under **Settings → Features → MCP**:
 1. Click **+ Add New MCP Server**
 2. Name: `persyst`
 3. Type: `stdio`
 4. Command: `npx -y persyst-mcp`
 
 ### Aider
-
-```bash
-aider --mcp-server persyst:npx -y persyst-mcp
-```
-Or append this to your `.aider.conf.yml` project file:
+Append to your `.aider.conf.yml` project file:
 ```yaml
 mcp-server:
   - name: persyst
     command: npx -y persyst-mcp
-
-
-### Antigravity
-Add Persyst to your Antigravity agent configuration file at `~/.gemini/antigravity/mcp_config.json`:
-```json
-{
-  "mcpServers": {
-    "persyst": {
-      "command": "npx",
-      "args": ["-y", "persyst-mcp"]
-    }
-  }
-}
+    env:
+      PERSYST_PROJECT: my-project
 ```
 
 ---
 
-##
-
-
-
-
-
-
-| `update_memory` | Update memory content | `id` (number), `content` (string) |
-| `delete_memory` | Delete a memory and clean up edges | `id` (number) |
-| `get_recent_memories` | Get latest memories | `limit` (number) |
-| `get_important_memories` | Get by importance score | `limit` (number) |
-| `get_optimized_context` | Get compressed, ranked context block | `query` (string), `max_tokens` (number) |
-| `ingest_git_commits` | Import recent git commits as memories | `repo_path` (string), `count` (number) |
-| `consolidate_memories` | Merge highly similar duplicate memories | — |
-| `get_memory_history` | Retrieve all versions of a memory | `query` (string) |
-| `get_agent_stats` | Agent reputation stats | — |
-| `export_audit_log` | Export attestation audit log | `start_date`, `end_date` (ISO8601) |
-| `verify_attestation` | Verify Ed25519 signature chain | `attestation_id` (string) |
+## Passive Recording vs. Active Retrieval
+
+> **Note on Agent Integration**: Persyst operates in two complementary modes:
+> 1. **Passive Recording**: The file watcher automatically extracts and saves memories from your agent conversation transcripts in the background.
+> 2. **Active Retrieval**: The AI agent calls `search_memories` or `get_optimized_context` to fetch relevant context.
+>
+> The IDE itself does not automatically inject retrieved memories into prompt inputs unless configured to do so via workspace rules (e.g. `.cursorrules`, `.windsurfrules`, `.clinerules`) or custom system prompt builders.
 
 ---
 
-##
+## Available Tools (19 MCP Endpoints)
+
+| Tool | Description | Key Parameters |
+|------|-------------|----------------|
+| `add_memory` | Store a new memory with secret redaction & contradiction check | `content`, `importance` (0-1), `agent_id`, `shared` |
+| `search_memories` | Hybrid keyword + semantic search with attestation | `query`, `limit`, `agent_id` |
+| `get_memory` | Retrieve a specific memory by ID (boosts importance) | `id`, `agent_id` |
+| `update_memory` | Update content & archive previous version | `id`, `content`, `agent_id` |
+| `delete_memory` | Permanently delete a memory & clean knowledge graph edges | `id` |
+| `get_recent_memories` | Fetch latest memories ordered by creation date | `limit`, `agent_id` |
+| `get_important_memories` | Fetch memories ranked by importance score | `limit`, `agent_id` |
+| `get_optimized_context` | Graph-hopped context prompt compiled within token budget | `query`, `max_tokens`, `agent_id`, `intent` |
+| `ingest_git_commits` | Parse & import recent git commits as structured memories | `repo_path`, `count` |
+| `watch_git_repo` | Poll repository for changes and auto-ingest new commits | `repo_path` |
+| `consolidate_memories` | Semantic deduplication sweep merging similar memories | — |
+| `get_memory_history` | Retrieve complete version history and semantic diffs | `query` |
+| `get_agent_stats` | View agent reputation scores & contradiction metrics | — |
+| `export_audit_log` | Export cryptographic attestation audit log (JSON/Markdown) | `start_date`, `end_date` |
+| `verify_attestation` | Verify Ed25519 signature & SHA-256 chain integrity | `attestation_id` |
+| `add_entity` | Add named entity to knowledge graph | `name`, `type` |
+| `link_entity_memory` | Create edge between knowledge graph entity and memory | `entity_id`, `memory_id`, `relation` |
+| `search_by_entity` | Query linked memories via knowledge graph traversal | `entity_name` |
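
The `add_memory` row above mentions secret redaction, and the feature list describes it as targeting high-entropy secrets such as API keys, JWTs, database strings, and private keys. The sketch below shows one common way to build such a filter: a few structural regular expressions followed by a Shannon-entropy check on long tokens. Every pattern, the 20-character minimum, and the 4.0-bit threshold are assumptions chosen for illustration, not values taken from Persyst.

```javascript
// Illustrative patterns and thresholds only; Persyst's real rules are not shown in this diff.
const PATTERNS = [
  { label: 'private-key', regex: /-----BEGIN [A-Z ]*PRIVATE KEY-----[\s\S]*?-----END [A-Z ]*PRIVATE KEY-----/g },
  { label: 'jwt', regex: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g },
  // URLs that embed user:password credentials, e.g. database connection strings.
  { label: 'connection-string', regex: /\b[a-z][a-z0-9+.-]*:\/\/[^\s:@\/]+:[^\s@\/]+@\S+/gi },
];
const TOKEN = /[A-Za-z0-9_-]{20,}/g; // candidate tokens for the entropy check
const ENTROPY_THRESHOLD = 4.0;       // bits per character, an assumed cutoff

// Shannon entropy of a string in bits per character.
function shannonEntropy(text) {
  const counts = new Map();
  for (const ch of text) counts.set(ch, (counts.get(ch) || 0) + 1);
  let bits = 0;
  for (const n of counts.values()) {
    const p = n / text.length;
    bits -= p * Math.log2(p);
  }
  return bits;
}

function redactSecrets(text) {
  let out = text;
  // Structured secrets first, so their contents never reach the entropy pass.
  for (const { label, regex } of PATTERNS) {
    out = out.replace(regex, `[REDACTED:${label}]`);
  }
  return out.replace(TOKEN, (token) =>
    shannonEntropy(token) > ENTROPY_THRESHOLD ? '[REDACTED:high-entropy]' : token,
  );
}

console.log(redactSecrets('db postgres://admin:hunter2@db.internal:5432/app'));
```

Running the structural patterns before the entropy pass keeps the labels specific; the entropy check then acts as a fallback for secrets with no recognizable prefix.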
+
+---
 
-
+## Local HTTP Gateway & Swarm Integration
 
-
-2. **Semantic Search (sqlite-vec)** — Meaning-based using local embeddings
+In addition to STDIO transport, Persyst automatically launches a high-throughput local HTTP Gateway on port `4321` (`http://127.0.0.1:4321`).
 
-
+- **`/health`**: Health check and database status
+- **`/stats`**: Global memory & agent reputation statistics
+- **`/system-prompt`**: Formatted prompt context injection
+- **`/compliance/export`**: Cryptographic compliance audit report export (supports `format=markdown`)
+- **`/events`**: Real-time Server-Sent Events (SSE) stream for agent swarms
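
The endpoints listed above can be reached from any local HTTP client. The sketch below builds request URLs against the documented base address and wraps the global `fetch` available in Node.js 18+. It assumes the endpoints answer plain `GET` requests and that `format=markdown` is passed as a query parameter; the helper names are hypothetical, and response bodies are returned as raw text because their format is not described here.

```javascript
// Base address comes from the README; the helpers are illustrative assumptions.
const GATEWAY = 'http://127.0.0.1:4321';

function gatewayUrl(path, params = {}) {
  const url = new URL(path, GATEWAY);
  for (const [key, value] of Object.entries(params)) {
    url.searchParams.set(key, value);
  }
  return url.toString();
}

// Needs a running Persyst server; assumes the endpoint accepts GET.
async function fetchGateway(path, params) {
  const res = await fetch(gatewayUrl(path, params));
  if (!res.ok) throw new Error(`Gateway returned HTTP ${res.status} for ${path}`);
  return res.text();
}

console.log(gatewayUrl('/health'));
console.log(gatewayUrl('/compliance/export', { format: 'markdown' }));
```

With the server running, `fetchGateway('/health').then(console.log)` would print whatever the health endpoint returns.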
 
 ---
 
-##
+## How Hybrid Search Works
 
-
-- **Database:** SQLite via better-sqlite3
-- **Vector Search:** sqlite-vec (local, no cloud)
-- **Full-Text Search:** SQLite FTS5
-- **Embeddings:** @huggingface/transformers + all-MiniLM-L6-v2 (384-dim, ~50MB)
-- **Protocol:** MCP over stdio
+Persyst combines two complementary search strategies:
 
-
-
-## Troubleshooting
+1. **Keyword Search (SQLite FTS5)** — Fast, exact string matching using BM25 ranking.
+2. **Semantic Search (sqlite-vec)** — Deep meaning-based matching using local `all-MiniLM-L6-v2` embeddings.
 
-
-`better-sqlite3` compiles native C++ code on installation. Make sure you have python and C++ build tools installed on your system:
-* **Windows:** Run `npm install --global windows-build-tools` or install Visual Studio Build Tools.
-* **macOS/Linux:** Run `xcode-select --install` or install `build-essential`.
+Results are merged dynamically. Keyword matches receive a score boost so exact matches rank at the top, while semantic similarity surfaces conceptually relevant memories even when different phrasing is used.
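
One way to picture the merge described above: take each semantic hit's similarity as its base score, add a fixed boost when the same memory also appears in the keyword results, and sort by the total. The sketch below does exactly that. The boost value, the result shapes, and the function name are assumptions for illustration; the real ranking in `package/src/search.js` is not shown in this part of the diff and may weigh results differently.

```javascript
// KEYWORD_BOOST and the hit shapes are hypothetical; Persyst's actual weights are not shown here.
const KEYWORD_BOOST = 0.5;

function mergeHybrid(keywordHits, semanticHits, limit = 5) {
  const merged = new Map();
  // Semantic hits contribute their similarity (assumed to lie in [0, 1]).
  for (const hit of semanticHits) {
    merged.set(hit.id, { id: hit.id, content: hit.content, score: hit.similarity });
  }
  // Keyword hits add a flat boost, or enter with the boost alone if not seen yet.
  for (const hit of keywordHits) {
    const existing = merged.get(hit.id);
    if (existing) {
      existing.score += KEYWORD_BOOST;
    } else {
      merged.set(hit.id, { id: hit.id, content: hit.content, score: KEYWORD_BOOST });
    }
  }
  return [...merged.values()].sort((a, b) => b.score - a.score).slice(0, limit);
}

const keywordHits = [{ id: 1, content: 'Use JWT refresh tokens' }];
const semanticHits = [
  { id: 2, content: 'Session cookies expire after 24h', similarity: 0.8 },
  { id: 1, content: 'Use JWT refresh tokens', similarity: 0.62 },
];
console.log(mergeHybrid(keywordHits, semanticHits).map((hit) => hit.id));
```

In this example memory 1 has the lower similarity but also matches by keyword, so the boost lifts it above memory 2.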
 
-
-This is normal on the **very first run** because Persyst is downloading the ~50MB embedding model. Wait 30-60 seconds for it to complete. The next runs will be instant.
+---
 
-
-Instead of running it globally, prefer using the `npx -y persyst-mcp` command in your agent configurations. It automatically installs and updates the server non-interactively.
+## Tech Stack
 
-
-
+- **Runtime:** Node.js 18+
+- **Database:** SQLite via `better-sqlite3` (synchronous, WAL mode)
+- **Vector Search:** `sqlite-vec` (in-process, zero cloud egress)
+- **Full-Text Search:** SQLite FTS5
+- **Embeddings:** `@huggingface/transformers` + `all-MiniLM-L6-v2` (384-dim, local ONNX)
+- **Watcher:** `chokidar` event-driven file monitoring
+- **Protocol:** MCP over stdio + HTTP Gateway
 
 ---
 
 ## Backup & Migration
 
-Persyst includes built-in JSONL export/import commands for portable memory backup and cross-machine migration
+Persyst includes built-in JSONL export/import commands for portable memory backup and cross-machine migration:
 
 ```bash
-# Export all memories to a file
+# Export all memories to a JSONL file
 npx persyst-mcp export
-# → persyst-export-<timestamp>.jsonl
 
 # Export to a specific file
 npx persyst-mcp export my-backup.jsonl
 
-# Preview
+# Preview import (dry run)
 npx persyst-mcp import my-backup.jsonl --dry-run
 
-# Import memories (
+# Import memories (deduplicates automatically)
 npx persyst-mcp import my-backup.jsonl
 ```
 
 ---
 
-## Roadmap & Future Directions
-
-Persyst is built for the privacy-focused solo developer. We are actively hardening the local-first experience before introducing network dependencies.
-
-* **File-Based Sync** ✅ **Done**: `persyst-export` / `persyst-import` JSONL commands for backup and migration.
-* **IDE Integrations**: First-class extensions for Cursor, VS Code, and Aider configuration helper commands.
-* **True P2P Sync (Roadmap)**: Peer-to-peer secure sync between developer devices without relying on central cloud servers.
-
----
-
 ## License
 
 MIT License. See [LICENSE](LICENSE) for details.
-
package/bin/export.js
CHANGED
@@ -100,16 +100,16 @@ try {
     });
   });
 
-  console.log(
+  console.log(`[OK] Exported ${count} memories to: ${outputFile}`);
   if (namespace) {
-    console.log(`
+    console.log(` Namespace filter: "${namespace}" + shared`);
   }
   if (includeArchived) {
-    console.log('
+    console.log(' Includes archived (superseded) memories.');
   }
 
 } catch (err) {
-  console.error(
+  console.error(`[ERROR] Export failed: ${err.message}`);
   process.exit(1);
 } finally {
   closeDatabase();
package/bin/extract.js
CHANGED
@@ -114,9 +114,9 @@ async function run() {
   }
 
   if (!jsonOutput) {
-    console.log(`\n
+    console.log(`\n[INFO] Heuristic fact(s) extracted: ${heuristicFacts.length}`);
     for (const f of heuristicFacts) {
-      console.log(`
+      console.log(` [OK] [${f.category}] (conf: ${f.confidence}) ${f.content}`);
     }
   }
 
@@ -128,7 +128,7 @@ async function run() {
   // --- Store to database (unless dry-run) ---
   if (!dryRun && allFacts.length > 0) {
     if (!jsonOutput) {
-      console.log(`\n
+      console.log(`\n[INFO] Storing to database...`);
     }
 
     const { insertMemory, insertVector, memoryExists } = await import('../src/database.js');
@@ -142,7 +142,7 @@ async function run() {
       if (memoryExists(fact.content)) {
         dupes++;
         if (!jsonOutput) {
-          console.log(`
+          console.log(` [SKIP] Duplicate: "${fact.content.slice(0, 50)}..."`);
         }
         continue;
       }
@@ -158,15 +158,15 @@ async function run() {
 
       stored++;
       if (!jsonOutput) {
-        console.log(`
+        console.log(` [OK] Stored memory #${id}: "${fact.content.slice(0, 60)}..."`);
       }
     }
 
     if (!jsonOutput) {
-      console.log(`\n
+      console.log(`\n[INFO] Result: ${stored} stored, ${dupes} duplicates skipped`);
     }
   } else if (dryRun && !jsonOutput) {
-    console.log(`\n
+    console.log(`\n[INFO] Dry run — no facts stored.`);
   }
 
   // --- JSON output ---
@@ -180,6 +180,6 @@ async function run() {
 }
 
 run().catch(err => {
-  console.error(`\n
+  console.error(`\n[ERROR] Extraction failed: ${err.message}`);
   process.exit(1);
 });
package/bin/import.js
CHANGED
@@ -40,7 +40,7 @@ const skipEmbeddings = args.includes('--skip-embeddings');
 const DEDUP_THRESHOLD = 0.85;
 
 if (!inputFile) {
-  console.error('
+  console.error('[ERROR] Usage: persyst-import <file.jsonl> [--dry-run] [--namespace=<ns>] [--skip-embeddings]');
   process.exit(1);
 }
 
@@ -49,10 +49,10 @@ if (!inputFile) {
 // ============================================================
 
 async function main() {
-  console.log(
-  console.log(`
-  if (forceNamespace) console.log(`
-  if (skipEmbeddings) console.log('
+  console.log(`[IMPORT] Persyst Import${isDryRun ? ' (DRY RUN — nothing will be written)' : ''}`);
+  console.log(` Source: ${inputFile}`);
+  if (forceNamespace) console.log(` Forcing namespace: "${forceNamespace}"`);
+  if (skipEmbeddings) console.log(' Skipping embedding regeneration.');
   console.log('');
 
   const rl = createInterface({
@@ -74,7 +74,7 @@ async function main() {
     try {
       record = JSON.parse(trimmed);
     } catch (err) {
-      console.error(`
+      console.error(` [WARN] Line ${lineNum}: Invalid JSON — skipping`);
       errors++;
       continue;
     }
@@ -82,7 +82,7 @@ async function main() {
     const { content, importance_score = 1.0, namespace, provenance, valid_until } = record;
 
     if (!content || typeof content !== 'string' || content.trim().length === 0) {
-      console.error(`
+      console.error(` [WARN] Line ${lineNum}: Empty content — skipping`);
       errors++;
       continue;
     }
@@ -97,7 +97,7 @@ async function main() {
 
     // --- Dedup: exact content match ---
     if (memoryExists(content, targetNamespace)) {
-      console.log(`
+      console.log(` [SKIP] Line ${lineNum}: Already exists — skipping "${content.slice(0, 60)}..."`);
       skipped++;
       continue;
     }
@@ -107,7 +107,7 @@ async function main() {
     try {
       const similar = await searchHybrid(content, 1, null, null, targetNamespace);
       if (similar.length > 0 && parseFloat(similar[0].similarity) >= DEDUP_THRESHOLD) {
-        console.log(`
+        console.log(` [SKIP] Line ${lineNum}: Semantically similar to #${similar[0].id} (sim=${similar[0].similarity}) — skipping`);
         skipped++;
         continue;
       }
@@ -117,7 +117,7 @@ async function main() {
     }
 
     if (isDryRun) {
-      console.log(`
+      console.log(` [OK] Would import: "${content.slice(0, 80)}${content.length > 80 ? '...' : ''}" → ns="${targetNamespace}"`);
       imported++;
       continue;
     }
@@ -132,10 +132,10 @@ async function main() {
         insertVector(id, embedding);
       }
 
-      console.log(`
+      console.log(` [OK] Imported #${id}: "${content.slice(0, 70)}${content.length > 70 ? '...' : ''}"`);
       imported++;
     } catch (err) {
-      console.error(`
+      console.error(` [ERROR] Line ${lineNum}: Failed to insert — ${err.message}`);
       errors++;
     }
   }
@@ -143,16 +143,16 @@ async function main() {
   console.log('');
   console.log('═'.repeat(50));
   if (isDryRun) {
-    console.log(
+    console.log(`[INFO] Dry run complete: ${imported} would import, ${skipped} skipped, ${errors} errors`);
   } else {
-    console.log(
+    console.log(`[INFO] Import complete: ${imported} imported, ${skipped} skipped, ${errors} errors`);
   }
   console.log('═'.repeat(50));
 }
 
 main()
   .catch(err => {
-    console.error(
+    console.error(`[ERROR] Import crashed: ${err.message}`);
     process.exit(1);
   })
   .finally(() => {