get-claudia 1.54.4 → 1.55.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +16 -0
- package/README.md +77 -6
- package/assets/brain-visualizer.png +0 -0
- package/bin/index.js +114 -1
- package/memory-daemon/claudia_memory/__main__.py +231 -68
- package/memory-daemon/claudia_memory/config.py +23 -21
- package/memory-daemon/claudia_memory/daemon/health.py +21 -4
- package/memory-daemon/claudia_memory/database.py +47 -6
- package/memory-daemon/claudia_memory/migration.py +161 -0
- package/memory-daemon/claudia_memory/schema.sql +6 -1
- package/memory-daemon/claudia_memory/services/recall.py +10 -0
- package/memory-daemon/claudia_memory/services/remember.py +8 -0
- package/package.json +1 -1
package/CHANGELOG.md
CHANGED
@@ -2,6 +2,22 @@
 
 All notable changes to Claudia will be documented in this file.
 
+## 1.55.0 (2026-03-15)
+
+### The Unified Memory Release
+
+Claudia no longer fragments your memory across dozens of invisible database files. Every project, every session, one brain.
+
+- **Single database** -- All sessions now use `~/.claudia/memory/claudia.db` regardless of which project directory you're in. No more hash-named files like `6af67351bcfa.db` that nobody can identify or recover.
+- **Automatic consolidation** -- On first startup after upgrade, Claudia detects your existing hash-named databases, merges all their data into the unified `claudia.db`, and cleans up the old files. Zero manual steps.
+- **Workspace provenance** -- New `workspace_id` column on memories tracks which project directory created each memory. This is provenance metadata ("where did I learn this?"), not a filter wall. Recall stays global: Claudia remembers Sarah regardless of which project you're in.
+- **Human-readable backups** -- Backups now live in `~/.claudia/backups/` with clear names like `claudia-daily-2026-03-15.db` and `claudia-pre-merge-2026-03-15.db` instead of cryptic timestamps alongside the database file.
+- **Pre-merge safety net** -- Before any consolidation, a backup is created automatically. If anything goes wrong, your data is recoverable.
+- **DB identity logging** -- Every daemon startup logs exactly which database it's using and how many memories it contains. No more guessing.
+- **Manual merge CLI** -- `python -m claudia_memory --merge-databases` lets you preview (`--dry-run`) or manually trigger consolidation.
+- **Schema migration 21** -- Adds `workspace_id TEXT` column and index to memories table.
+- **39 new tests** -- Full coverage for unified DB, consolidation, backup naming, and workspace tagging. All 608 tests pass.
+
 ## 1.54.4 (2026-03-14)
 
 ### The One-Click Setup Release
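The backup naming scheme described above is simple enough to preview locally. A minimal sketch of the convention (illustrative only, not the package's implementation; the actual logic lives in `database.py` further down this diff):

```python
# Sketch of the ~/.claudia/backups/ naming convention (illustrative).
from datetime import datetime
from pathlib import Path

BACKUP_DIR = Path.home() / ".claudia" / "backups"

def backup_name(label: str | None = None) -> Path:
    if label:
        # One labeled backup per day, e.g. claudia-pre-merge-2026-03-15.db
        return BACKUP_DIR / f"claudia-{label}-{datetime.now():%Y-%m-%d}.db"
    # Manual backups carry a full timestamp so repeated runs never collide
    return BACKUP_DIR / f"claudia-manual-{datetime.now():%Y-%m-%d-%H%M%S}.db"

print(backup_name("daily"))      # .../claudia-daily-<date>.db
print(backup_name("pre-merge"))  # .../claudia-pre-merge-<date>.db
```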
package/README.md
CHANGED
@@ -15,10 +15,10 @@ Remembers your people. Catches your commitments. Learns how you work.
 </p>
 
 <p align="center">
-  <a href="#
+  <a href="#quick-start"><strong>Install</strong></a> ·
   <a href="#what-makes-claudia-different">Why Claudia</a> ·
   <a href="#how-her-mind-works">Her Mind</a> ·
-  <a href="#
+  <a href="#integrations">Integrations</a> ·
   <a href="#how-it-works">How It Works</a>
 </p>
 
@@ -112,20 +112,33 @@ You make a promise in a meeting. Nobody tracks it. You promise a deliverable on
 
 ## Quick Start
 
+**1. Install**
 ```bash
 npx get-claudia
+```
+
+**2. Start**
+```bash
 cd claudia
 claude
 ```
 
+**3. Say hi.** She'll introduce herself, learn about you through a natural conversation, and generate a personalized workspace.
+
 <p align="center">
   <img src="assets/claudia-install.gif" alt="Installing Claudia" width="600">
 </p>
 
-
+**What's next:**
+- `/morning-brief` to see what needs attention
+- Tell her about a person and she'll create a relationship file
+- Share meeting notes and she'll extract action items
+- `npx get-claudia google` to connect Gmail, Calendar, Drive, and more
 
 **Requirements:** [Claude Code](https://docs.anthropic.com/en/docs/claude-code), Node.js 18+, Python 3.10+ (for memory), [Ollama](https://ollama.com) (for embeddings)
 
+> **Embeddings model:** After installing Ollama, pull the required model: `ollama pull all-minilm:l6-v2`
+
 <details>
 <summary><strong>Template-only install (no memory system)</strong></summary>
 
@@ -253,7 +266,7 @@ Claudia detects your work style and generates structure that fits:
 | `/memory-audit` | See everything Claudia knows, with source chains |
 
 <details>
-<summary><strong>All commands (
+<summary><strong>All commands (41 skills)</strong></summary>
 
 | Command | What It Does |
 |---------|--------------|
@@ -277,13 +290,62 @@ Plus ~30 proactive skills (commitment detection, pattern recognition, judgment a
 
 ---
 
+## Brain Visualizer
+
+Launch with `/brain` to see your memory as a 3D network graph. Entities are nodes, relationships are edges, and everything is interactive: click to inspect, filter by type, search by name.
+
+<p align="center">
+  <img src="assets/brain-visualizer.png" alt="Claudia Brain Visualizer" width="700">
+</p>
+
+---
+
+## Integrations
+
+Claudia works fully on her own, but integrations let her see further.
+
+### Google Workspace
+
+Connect Gmail, Calendar, Drive, Docs, Sheets, Tasks, and more with a single setup command:
+
+```bash
+npx get-claudia google
+```
+
+This generates a one-click URL to enable all required Google APIs and walks you through OAuth setup. Three tiers available:
+
+| Tier | Tools | What You Get |
+|------|-------|-------------|
+| **Core** | 43 | Gmail, Calendar, Drive, Contacts |
+| **Extended** | 83 | Core + Docs, Sheets, Tasks, Chat |
+| **Complete** | 111 | Extended + Slides, Forms, Apps Script |
+
+### 500+ Apps via Rube
+
+[Rube](https://rube.app) (by Composio) connects Claudia to Slack, Notion, Jira, GitHub, Linear, HubSpot, Stripe, Figma, and hundreds more through one-click OAuth. No per-app MCP setup needed.
+
+| Category | Examples |
+|----------|----------|
+| **Communication** | Slack, Discord, Teams, Telegram |
+| **Project Management** | Jira, Linear, Asana, Trello, Monday.com |
+| **Knowledge & Docs** | Notion, Confluence, Google Docs, Coda |
+| **Code & Dev** | GitHub, GitLab, Bitbucket |
+| **CRM & Sales** | HubSpot, Salesforce, Pipedrive |
+| **And 500+ more** | [Browse the full list](https://rube.app) |
+
+### Obsidian Vault
+
+Memory auto-syncs to an Obsidian vault at `~/.claudia/vault/` using PARA structure. Every entity becomes a markdown note with `[[wikilinks]]`, so Obsidian's graph view maps your network. SQLite is the source of truth; the vault is a read-only projection you can browse and search.
+
+---
+
 ## How It Works
 
-**
+**41 skills · 33 MCP tools · 500+ tests**
 
 Claudia has two layers:
 
-**Template layer** (markdown) defines who she is.
+**Template layer** (markdown) defines who she is. 41 skills, rules, and identity files that Claude reads on startup. Skills range from proactive behaviors (commitment detection, pattern recognition, judgment awareness) to user-invocable workflows (`/morning-brief`, `/research`, `/meditate`). Workspace templates let you spin up new projects with `/new-workspace [name]`.
 
 **Memory system** (Python) defines what she remembers. Two daemon modes share the same SQLite database:
 
@@ -360,6 +422,8 @@ For full architecture diagrams, see [ARCHITECTURE.md](ARCHITECTURE.md).
 
 Without the memory system, Claudia still works using markdown files. With it, she gains semantic search, pattern detection, and relationship tracking.
 
+> **Ollama model:** Run `ollama pull all-minilm:l6-v2` after installing Ollama. This is the embedding model used for semantic search.
+
 **Platforms:** macOS, Linux, Windows
 
 ---
@@ -395,6 +459,13 @@ ollama serve # Linux
 ollama pull all-minilm:l6-v2 # Embeddings (required)
 ```
 
+**Google Workspace not working after enabling new APIs?**
+Delete the cached token and restart to re-authenticate with updated scopes:
+```bash
+rm ~/.workspace-mcp/token.json
+# Restart Claude Code
+```
+
 **Broken install? Re-run setup:**
 ```bash
 cd your-claudia-directory
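The README twice points at the `all-minilm:l6-v2` embedding model. A quick preflight check before starting the daemon might look like this (a sketch; it assumes the `ollama` CLI is on PATH and that `ollama list` prints installed model names):

```python
# Check that the required embedding model is present (assumes `ollama` on PATH).
import subprocess

def has_embedding_model(name: str = "all-minilm:l6-v2") -> bool:
    try:
        out = subprocess.run(["ollama", "list"], capture_output=True, text=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False  # ollama not installed or not responding
    return out.returncode == 0 and name in out.stdout

if not has_embedding_model():
    print("Missing model -- run: ollama pull all-minilm:l6-v2")
```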
package/assets/brain-visualizer.png
Binary file
package/bin/index.js
CHANGED
@@ -3,7 +3,7 @@
 import { existsSync, mkdirSync, cpSync, readdirSync, readFileSync, writeFileSync, statSync, renameSync } from 'fs';
 import { join, dirname } from 'path';
 import { fileURLToPath } from 'url';
-import { spawn } from 'child_process';
+import { spawn, execFileSync } from 'child_process';
 import { homedir } from 'os';
 import { createInterface } from 'readline';
 import { setupGoogleWorkspace, detectOldGoogleMcp, extractProjectNumber, buildApiEnableUrl, TIER_APIS } from './google-setup.js';
@@ -960,6 +960,45 @@ async function main() {
       }
     }
 
+    // Scan existing databases and show stats
+    if (daemonOk) {
+      const dbScan = scanExistingDatabases();
+      if (dbScan.totalMemories > 0 || dbScan.hashDbs.length > 0) {
+        renderer.stopSpinner();
+        console.log('');
+        console.log(`${colors.dim}${'─'.repeat(46)}${colors.reset}`);
+        console.log(`  ${colors.boldCyan}Memory Database Scan${colors.reset}`);
+        console.log('');
+
+        if (dbScan.unified.exists) {
+          console.log(`  ${colors.green}●${colors.reset} claudia.db: ${colors.bold}${dbScan.unified.memories}${colors.reset} memories, ${colors.bold}${dbScan.unified.entities}${colors.reset} entities`);
+        }
+
+        if (dbScan.hashDbs.length > 0) {
+          const withData = dbScan.hashDbs.filter(d => d.memories > 0 || d.entities > 0);
+          const empty = dbScan.hashDbs.filter(d => d.memories === 0 && d.entities === 0);
+
+          if (withData.length > 0) {
+            console.log('');
+            console.log(`  ${colors.yellow}Found ${withData.length} legacy database${withData.length !== 1 ? 's' : ''} to consolidate:${colors.reset}`);
+            for (const db of withData) {
+              console.log(`    ${colors.dim}${db.name}${colors.reset}: ${db.memories} memories, ${db.entities} entities`);
+            }
+            console.log('');
+            console.log(`  ${colors.dim}These will be auto-merged into claudia.db on next startup.${colors.reset}`);
+          }
+          if (empty.length > 0) {
+            console.log(`  ${colors.dim}${empty.length} empty database${empty.length !== 1 ? 's' : ''} will be cleaned up automatically.${colors.reset}`);
+          }
+        } else if (dbScan.unified.exists && dbScan.unified.memories > 0) {
+          console.log(`  ${colors.dim}Unified database, no legacy files to consolidate.${colors.reset}`);
+        }
+
+        console.log(`${colors.dim}${'─'.repeat(46)}${colors.reset}`);
+        renderer.startSpinner();
+      }
+    }
+
     memoryOk = daemonOk || hasExistingDb;
 
   } catch (err) {
@@ -1175,6 +1214,80 @@ function restoreMcpServers(targetPath) {
   }
 }
 
+/**
+ * Scan ~/.claudia/memory/ for existing databases and return rough stats.
+ * Uses sqlite3 CLI (via execFileSync) to query each .db file safely.
+ * Returns { unified: { exists, memories, entities }, hashDbs: [...], totalMemories }
+ */
+function scanExistingDatabases() {
+  const memoryDir = join(homedir(), '.claudia', 'memory');
+  const result = {
+    unified: { exists: false, memories: 0, entities: 0 },
+    hashDbs: [],
+    totalMemories: 0,
+  };
+
+  if (!existsSync(memoryDir)) return result;
+
+  let files;
+  try {
+    files = readdirSync(memoryDir);
+  } catch {
+    return result;
+  }
+
+  const hashPattern = /^[0-9a-f]{12}\.db$/;
+
+  for (const file of files) {
+    if (!file.endsWith('.db')) continue;
+    // Skip WAL/SHM/backup files
+    if (file.includes('-wal') || file.includes('-shm') || file.includes('.backup')) continue;
+    const filePath = join(memoryDir, file);
+
+    try {
+      const stats = statSync(filePath);
+      if (stats.size < 4096) continue; // Too small to have data
+    } catch {
+      continue;
+    }
+
+    // Query using sqlite3 CLI (no shell, safe from injection)
+    let memories = 0;
+    let entities = 0;
+    try {
+      const memResult = execFileSync('sqlite3', [filePath, 'SELECT COUNT(*) FROM memories;'], {
+        encoding: 'utf-8', timeout: 5000, stdio: ['pipe', 'pipe', 'pipe'],
+      }).trim();
+      memories = parseInt(memResult, 10) || 0;
+    } catch { /* table may not exist */ }
+
+    try {
+      const entResult = execFileSync('sqlite3', [filePath, 'SELECT COUNT(*) FROM entities WHERE deleted_at IS NULL;'], {
+        encoding: 'utf-8', timeout: 5000, stdio: ['pipe', 'pipe', 'pipe'],
+      }).trim();
+      entities = parseInt(entResult, 10) || 0;
+    } catch {
+      try {
+        const entResult = execFileSync('sqlite3', [filePath, 'SELECT COUNT(*) FROM entities;'], {
+          encoding: 'utf-8', timeout: 5000, stdio: ['pipe', 'pipe', 'pipe'],
+        }).trim();
+        entities = parseInt(entResult, 10) || 0;
+      } catch { /* skip */ }
+    }
+
+    if (file === 'claudia.db') {
+      result.unified = { exists: true, memories, entities };
+    } else if (hashPattern.test(file)) {
+      result.hashDbs.push({ name: file, memories, entities });
+    }
+
+    result.totalMemories += memories;
+  }
+
+  return result;
+}
+
+
 /**
  * Ensure .mcp.json has a working claudia-memory daemon entry.
  * - Fresh install (no .mcp.json): creates one with just the daemon entry.
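`scanExistingDatabases()` deliberately shells out to the `sqlite3` CLI through `execFileSync` with an argument array (no shell), so a hostile filename can't inject commands, and it tolerates missing tables. The same per-database stats can be gathered from Python with a read-only connection; a sketch using only the standard library (the helper name is hypothetical):

```python
# Read-only row counts per database, mirroring the installer's scan (sketch).
import sqlite3
from pathlib import Path

def count_rows(db_path: Path, table: str) -> int:
    try:
        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True, timeout=5)
        (n,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
        conn.close()
        return n
    except sqlite3.Error:
        return 0  # table may not exist in older schemas

memory_dir = Path.home() / ".claudia" / "memory"
for db in sorted(memory_dir.glob("*.db")):
    print(f"{db.name}: {count_rows(db, 'memories')} memories")
```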
package/memory-daemon/claudia_memory/__main__.py
CHANGED
@@ -163,102 +163,177 @@ def _check_and_repair_database(db_path: Path) -> None:
     )
 
 
-def
-    """Auto-
+def _auto_consolidate() -> None:
+    """Auto-consolidate hash-named databases into the unified claudia.db.
 
-
-
-
-    active project-specific database.
+    Detects hash-named databases (12-char hex filenames) in ~/.claudia/memory/
+    and merges them into claudia.db. This handles the upgrade from per-project
+    hash-based DB isolation to the unified database model.
 
     Properties:
-    - Idempotent: checks _meta flag, won't run twice
-    - Safe:
+    - Idempotent: checks _meta['unified_db'] flag, won't run twice
+    - Safe: creates pre-merge backup before any changes
     - Non-fatal: catches all exceptions, logs, continues
+    - Cleans up: deletes hash DBs + WAL/SHM after successful merge
     """
     from .migration import (
-
-
-
-
+        cleanup_old_databases,
+        merge_all_databases,
+        scan_hash_databases,
+        verify_consolidated_db,
     )
 
     try:
+        db = get_db()
         config = get_config()
-
-        active_path = Path(config.db_path)
+        memory_dir = Path(config.db_path).parent
 
-        #
+        # Check if already consolidated
         try:
-
-
-
-
+            rows = db.execute(
+                "SELECT value FROM _meta WHERE key = 'unified_db'",
+                fetch=True,
+            )
+            if rows and rows[0]["value"] == "true":
+                logger.debug("Database already unified, skipping consolidation")
                 return
+        except Exception:
+            pass  # _meta table might not exist yet
 
-        #
-
+        # Scan for hash-named databases
+        all_hash_dbs = scan_hash_databases(memory_dir)
+        if not all_hash_dbs:
+            # No hash DBs found: fresh install or already cleaned up
+            _set_unified_db_flag(db)
             return
 
-        #
-
-        if
+        # Separate databases with data from empty ones
+        data_dbs = [d for d in all_hash_dbs if d["has_data"]]
+        empty_dbs = [d for d in all_hash_dbs if not d["has_data"]]
+
+        if not data_dbs and empty_dbs:
+            # Only empty hash DBs: clean them up and mark unified
+            logger.info(f"Found {len(empty_dbs)} empty hash databases, cleaning up")
+            cleanup_old_databases(memory_dir, empty_dbs)
+            _set_unified_db_flag(db)
             return
 
-
-
-        if not legacy_stats:
-            # Empty or unreadable legacy db -- mark complete so we don't check again
-            mark_migration_completed(db, {"skipped": "no_data"})
-            logger.info("Legacy claudia.db exists but has no data worth migrating")
+        if not data_dbs:
+            _set_unified_db_flag(db)
             return
 
+        # Log what we found
+        total_memories = sum(d["stats"].get("memories", 0) for d in data_dbs)
+        total_entities = sum(d["stats"].get("entities", 0) for d in data_dbs)
         logger.info(
-            f"Found
-            f"
+            f"Found {len(data_dbs)} hash databases with data "
+            f"({total_memories} memories, {total_entities} entities). "
+            f"Consolidating into claudia.db..."
         )
 
-        # Create pre-
-
-
-
-
-
-
-            # Continue anyway -- the migration is additive, not destructive
+        # Create pre-merge backup
+        try:
+            backup_path = db.backup(label="pre-merge")
+            logger.info(f"Pre-merge backup created: {backup_path}")
+        except Exception as e:
+            logger.warning(f"Pre-merge backup failed: {e}")
+            # Continue anyway, the merge is additive
 
-        #
-
-
+        # Merge all hash databases into claudia.db
+        active_path = Path(config.db_path)
+        totals = merge_all_databases(active_path, data_dbs)
 
-        #
-
+        # Verify integrity after merge
+        if not verify_consolidated_db(active_path):
+            logger.error(
+                "Integrity check FAILED after consolidation. "
+                "Keeping hash databases for manual recovery."
+            )
+            return
 
-        #
-
-
-
-
-
-
-
-            logger.warning(f"Could not rename legacy database: {e}")
+        # Clean up: delete hash DBs + WAL/SHM + orphan backups
+        deleted = cleanup_old_databases(memory_dir, all_hash_dbs)
+
+        # Set the unified_db flag
+        _set_unified_db_flag(db)
+
+        merged_count = totals.get('total_memories_migrated', 0)
+        sources_count = totals.get('sources_merged', 0)
 
-        # Log summary
         logger.info(
-            f"
-            f"{
-            f"{
-            f"{results.get('memories_migrated', 0)} memories migrated, "
-            f"{results.get('links_migrated', 0)} links migrated, "
-            f"{results.get('relationships_migrated', 0)} relationships migrated"
+            f"Consolidated {merged_count} memories "
+            f"from {sources_count} databases into claudia.db. "
+            f"Cleaned up {deleted} old files."
         )
 
+        # Write context/whats-new.md so Claudia surfaces the upgrade in-chat
+        _write_consolidation_notice(merged_count, sources_count)
+
     except Exception as e:
         # Non-fatal: log error and continue with whatever data we have
-        logger.error(f"
+        logger.error(f"Auto-consolidation failed (non-fatal): {e}")
         logger.info("Daemon will continue with current database. "
-                    "Run --
+                    "Run --merge-databases manually to retry.")
+
+
+def _set_unified_db_flag(db) -> None:
+    """Set the _meta flag indicating this is a unified database."""
+    from datetime import datetime as dt
+    try:
+        db.execute(
+            "INSERT OR REPLACE INTO _meta (key, value, updated_at) "
+            "VALUES ('unified_db', 'true', ?)",
+            (dt.now().isoformat(),),
+        )
+    except Exception as e:
+        logger.warning(f"Could not set unified_db flag: {e}")
+
+
+def _write_consolidation_notice(merged_count: int, sources_count: int) -> None:
+    """Write context/whats-new.md so Claudia mentions the upgrade in her greeting.
+
+    Looks for context/ in the workspace path (set via --project-dir).
+    Falls back silently if no workspace is configured.
+    """
+    workspace_path = os.environ.get("CLAUDIA_WORKSPACE_PATH")
+    if not workspace_path:
+        return
+
+    try:
+        context_dir = Path(workspace_path) / "context"
+        whats_new = context_dir / "whats-new.md"
+
+        # Don't overwrite if installer already wrote one
+        if whats_new.exists():
+            return
+
+        if not context_dir.exists():
+            return  # Not a Claudia workspace
+
+        from datetime import datetime as dt
+        date = dt.now().strftime("%Y-%m-%d")
+
+        content = f"""# Memory System Upgrade ({date})
+
+Your memory has been unified. Previously, each project directory got its own isolated database (hash-named files like `6af67351bcfa.db`). Now everything lives in one `~/.claudia/memory/claudia.db`.
+
+**What happened automatically:**
+- Consolidated {merged_count} memories from {sources_count} database{'s' if sources_count != 1 else ''} into a single unified database
+- Created a pre-merge backup at `~/.claudia/backups/`
+- Cleaned up old hash-named database files
+
+**What this means for you:**
+- I remember everything from every project, always. No more fragmented memory.
+- Moving project directories no longer loses your history.
+- New `workspace_id` on memories tracks where I learned each fact (provenance, not a wall).
+
+---
+_Surface this update in your first greeting, then delete this file._
+"""
+        whats_new.write_text(content)
+        logger.info(f"Wrote consolidation notice to {whats_new}")
+    except Exception as e:
+        logger.debug(f"Could not write consolidation notice: {e}")
 
 
 def _write_preflight_result(result: dict) -> Path:
@@ -656,10 +731,19 @@ def run_daemon(mcp_mode: bool = True, debug: bool = False, project_id: str = Non
     # Initialize database
     db = get_db()
     db.initialize()
-    logger.info(f"Database initialized at {get_config().db_path}")
 
-    #
-
+    # Log database identity
+    try:
+        mem_count = db.execute(
+            "SELECT COUNT(*) as c FROM memories", fetch=True
+        )
+        count = mem_count[0]["c"] if mem_count else 0
+        logger.info(f"Using database: {get_config().db_path} ({count} memories)")
+    except Exception:
+        logger.info(f"Using database: {get_config().db_path}")
+
+    # Auto-consolidate hash-named databases into unified claudia.db
+    _auto_consolidate()
 
     # Start health server and scheduler - ONLY in standalone mode.
     # MCP server processes are ephemeral and session-bound; the standalone
@@ -736,7 +820,7 @@ def main():
     parser.add_argument(
         "--project-dir",
        type=str,
-        help="Project directory for
+        help="Project directory for workspace tagging (provenance on memories, not DB isolation)",
    )
     parser.add_argument(
         "--tui",
@@ -781,7 +865,12 @@ def main():
     parser.add_argument(
         "--migrate-legacy",
         action="store_true",
-        help="Manually migrate data from legacy claudia.db
+        help="Manually migrate data from a legacy database into claudia.db",
+    )
+    parser.add_argument(
+        "--merge-databases",
+        action="store_true",
+        help="Manually merge all hash-named databases into unified claudia.db",
     )
     parser.add_argument(
         "--preflight",
@@ -1355,6 +1444,80 @@ def main():
         run_para_migration(vault_path, db=db, preview=args.preview)
         return
 
+    if args.merge_databases:
+        # Manual consolidation of hash-named databases
+        setup_logging(debug=args.debug)
+        from .migration import (
+            cleanup_old_databases,
+            merge_all_databases,
+            scan_hash_databases,
+            verify_consolidated_db,
+        )
+
+        db = get_db()
+        db.initialize()
+        config = get_config()
+        memory_dir = Path(config.db_path).parent
+
+        hash_dbs = scan_hash_databases(memory_dir)
+        data_dbs = [d for d in hash_dbs if d["has_data"]]
+        empty_dbs = [d for d in hash_dbs if not d["has_data"]]
+
+        if not hash_dbs:
+            print("No hash-named databases found. Nothing to merge.")
+            return
+
+        print(f"\nFound {len(hash_dbs)} hash-named databases:")
+        for d in hash_dbs:
+            stats_str = ""
+            if d["has_data"]:
+                s = d["stats"]
+                stats_str = f"  {s.get('memories', 0)} memories, {s.get('entities', 0)} entities"
+            else:
+                stats_str = "  (empty)"
+            print(f"  {d['path'].name}{stats_str}")
+
+        print(f"\nTarget: {config.db_path}")
+        print(f"  {len(data_dbs)} with data, {len(empty_dbs)} empty")
+
+        if args.dry_run:
+            print("\nDry run mode: no changes will be made.\n")
+            if data_dbs:
+                totals = merge_all_databases(Path(config.db_path), data_dbs, dry_run=True)
+                print(f"\nWould merge:")
+                for key, val in totals.items():
+                    if val > 0:
+                        print(f"  {key}: {val}")
+            return
+
+        if data_dbs:
+            # Backup before merge
+            backup_path = db.backup(label="pre-merge")
+            print(f"\nBackup created: {backup_path}")
+
+            print("\nMerging...")
+            totals = merge_all_databases(Path(config.db_path), data_dbs)
+
+            if verify_consolidated_db(Path(config.db_path)):
+                print("Integrity check: PASSED")
+            else:
+                print("Integrity check: FAILED (keeping hash databases)")
+                return
+
+            print(f"\nResults:")
+            for key, val in totals.items():
+                if val > 0:
+                    print(f"  {key}: {val}")
+
+        # Clean up
+        deleted = cleanup_old_databases(memory_dir, hash_dbs)
+        print(f"\nCleaned up {deleted} old files.")
+
+        # Set unified_db flag
+        _set_unified_db_flag(db)
+        print("Unified database flag set.")
+        return
+
     if args.migrate_legacy:
         # Manual legacy database migration
         setup_logging(debug=args.debug)
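The `--merge-databases` path above composes four `claudia_memory.migration` helpers in a fixed order: scan, back up, merge, verify, clean up. A condensed sketch of the same flow in preview mode (same functions and signatures as in this diff; error handling omitted, printed totals are illustrative):

```python
# Preview what a consolidation would do, using the helpers added in this release.
from pathlib import Path
from claudia_memory.migration import scan_hash_databases, merge_all_databases

memory_dir = Path.home() / ".claudia" / "memory"
target = memory_dir / "claudia.db"

hash_dbs = scan_hash_databases(memory_dir)
data_dbs = [d for d in hash_dbs if d["has_data"]]

# dry_run=True counts what would move without touching either database
totals = merge_all_databases(target, data_dbs, dry_run=True)
print(totals)  # e.g. {"sources_merged": 3, "total_memories_migrated": 412, ...}
```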
package/memory-daemon/claudia_memory/config.py
CHANGED
@@ -118,17 +118,25 @@ class MemoryConfig:
     context_builder_token_budget: int = 8000  # Default token budget for CRE
     context_builder_max_facts: int = 30  # Max facts in CRE context window
 
+    # Workspace tracking (provenance, not partition)
+    workspace_id: Optional[str] = None  # Auto-set from --project-dir; tags memories with origin workspace
+
     # Daemon settings
     log_path: Path = field(default_factory=lambda: Path.home() / ".claudia" / "daemon.log")
 
+    @property
+    def backup_dir(self) -> Path:
+        """Directory for human-readable backups."""
+        return Path.home() / ".claudia" / "backups"
+
     @classmethod
     def load(cls, project_id: Optional[str] = None) -> "MemoryConfig":
         """Load configuration from ~/.claudia/config.json, with defaults.
 
         Args:
-            project_id: Optional project identifier
-
-                ~/.claudia/memory/
+            project_id: Optional project identifier. Stored as workspace_id for
+                provenance tagging on memories. Does NOT change the database
+                path (unified DB at ~/.claudia/memory/claudia.db).
         """
         config_path = Path.home() / ".claudia" / "config.json"
         config = cls()
@@ -241,22 +249,17 @@ class MemoryConfig:
         # DEMO MODE: Use isolated demo database (never touches real data)
         # Set CLAUDIA_DEMO_MODE=1 in environment to use demo database
         elif os.environ.get("CLAUDIA_DEMO_MODE") == "1":
-
-                # Workspace-specific demo database
-                config.db_path = Path.home() / ".claudia" / "demo" / f"{project_id}.db"
-            else:
-                # Global demo database
-                config.db_path = Path.home() / ".claudia" / "demo" / "claudia-demo.db"
-            config.db_path.parent.mkdir(parents=True, exist_ok=True)
-        # Override database path for project isolation
-        # This ensures each project gets its own isolated database
-        elif project_id:
-            config.db_path = Path.home() / ".claudia" / "memory" / f"{project_id}.db"
+            config.db_path = Path.home() / ".claudia" / "demo" / "claudia-demo.db"
             config.db_path.parent.mkdir(parents=True, exist_ok=True)
         else:
-            #
+            # Unified database: always ~/.claudia/memory/claudia.db
+            # project_id is stored as workspace_id for provenance, not DB isolation
             config.db_path.parent.mkdir(parents=True, exist_ok=True)
 
+        # Store project_id as workspace_id (provenance metadata, not a partition)
+        if project_id:
+            config.workspace_id = project_id
+
         # Ensure log directory exists
         config.log_path.parent.mkdir(parents=True, exist_ok=True)
 
@@ -382,13 +385,13 @@ _project_id: Optional[str] = None
 
 
 def set_project_id(project_id: Optional[str]) -> None:
-    """Set the project ID for
+    """Set the project ID for workspace tagging.
 
     This must be called before any access to get_config() to ensure
-    the
+    the workspace_id is set for provenance tracking on memories.
 
     Args:
-        project_id: Hash of the project directory path, or None
+        project_id: Hash of the project directory path, or None.
     """
     global _config, _project_id
 
@@ -401,9 +404,8 @@ def set_project_id(project_id: Optional[str]) -> None:
 def get_config() -> MemoryConfig:
     """Get or load the global configuration.
 
-
-    the
-    claudia.db is used for backward compatibility.
+    Always uses the unified claudia.db. If set_project_id() was called,
+    the workspace_id is set for provenance tagging on memories.
     """
     global _config, _project_id
     if _config is None:
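The net effect of the config changes: `project_id` no longer selects a database file, it only tags provenance. A sketch of the new contract, based on the `load()` docstring above (assumes `CLAUDIA_DEMO_MODE` is unset):

```python
# project_id now feeds workspace_id; the DB path is always the unified claudia.db.
from claudia_memory.config import MemoryConfig

cfg = MemoryConfig.load(project_id="6af67351bcfa")
assert cfg.db_path.name == "claudia.db"     # no more 6af67351bcfa.db
assert cfg.workspace_id == "6af67351bcfa"   # provenance tag on new memories
```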
package/memory-daemon/claudia_memory/daemon/health.py
CHANGED
@@ -30,9 +30,11 @@ def build_status_report(*, db=None) -> dict:
     Args:
         db: Optional database instance. If None, uses the global get_db() singleton.
     """
+    config = get_config()
     report = {
         "timestamp": datetime.utcnow().isoformat(),
         "status": "healthy",
+        "db_path": str(config.db_path),
         "schema_version": 0,
         "components": {},
         "scheduled_jobs": [],
@@ -54,6 +56,17 @@ def build_status_report(*, db=None) -> dict:
     except Exception:
         report["schema_version"] = 0
 
+    # Unified DB identity
+    try:
+        meta_rows = _db.execute(
+            "SELECT value FROM _meta WHERE key = 'unified_db'", fetch=True
+        )
+        report["unified_db"] = (
+            meta_rows[0]["value"] == "true" if meta_rows else False
+        )
+    except Exception:
+        report["unified_db"] = False
+
     # Counts
     for table, query in [
         ("memories", "SELECT COUNT(*) as c FROM memories"),
@@ -69,12 +82,16 @@ def build_status_report(*, db=None) -> dict:
         except Exception:
             report["counts"][table] = -1
 
-    # Backup status
+    # Backup status (check both new backups/ dir and legacy alongside-DB location)
    try:
         import glob
-
-
-
+        backup_dir = config.backup_dir
+        new_pattern = str(backup_dir / "claudia-*.db")
+        old_pattern = f"{config.db_path}.backup-*.db"
+        backups = sorted(
+            glob.glob(new_pattern) + glob.glob(old_pattern),
+            key=lambda p: Path(p).stat().st_mtime if Path(p).exists() else 0,
+        )
         if backups:
             latest = Path(backups[-1])
             report["backup"] = {
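With `db_path` and `unified_db` now in the status report, a health probe can confirm which database the daemon is actually serving. A usage sketch:

```python
# Inspect the new identity fields in the daemon status report (sketch).
from claudia_memory.daemon.health import build_status_report

report = build_status_report()
print(report["db_path"])     # the database this daemon is actually using
print(report["unified_db"])  # True once the _meta 'unified_db' flag is set
```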
package/memory-daemon/claudia_memory/database.py
CHANGED
@@ -1020,6 +1020,28 @@
             conn.commit()
             logger.info("Applied migration 20: lifecycle tiers, sacred, close-circle, fact_id, chain")
 
+        if current_version < 21:
+            # Migration 21: Add workspace_id to memories for unified database provenance
+            try:
+                conn.execute("ALTER TABLE memories ADD COLUMN workspace_id TEXT")
+            except sqlite3.OperationalError as e:
+                if "duplicate column" not in str(e).lower():
+                    logger.warning(f"Migration 21 statement failed: {e}")
+
+            try:
+                conn.execute(
+                    "CREATE INDEX IF NOT EXISTS idx_memories_workspace ON memories(workspace_id)"
+                )
+            except sqlite3.OperationalError as e:
+                logger.warning(f"Migration 21 index failed: {e}")
+
+            conn.execute(
+                "INSERT OR IGNORE INTO schema_migrations (version, description) "
+                "VALUES (21, 'Add workspace_id to memories for unified database provenance tracking')"
+            )
+            conn.commit()
+            logger.info("Applied migration 21: workspace_id for unified database")
+
         # FTS5 setup: ensure memories_fts exists regardless of migration path.
         # The FTS5 virtual table + triggers contain internal semicolons that the
         # schema.sql line-based parser can't handle, so we always check here.
@@ -1173,6 +1195,11 @@
             logger.warning("Migration 19 incomplete: entity_summaries table missing")
             return 18
 
+        # Migration 21 added workspace_id to memories
+        if "workspace_id" not in memory_cols:
+            logger.warning("Migration 21 incomplete: memories missing workspace_id column")
+            return 20
+
         # Migration 20 added lifecycle_tier, fact_id to memories; close_circle to entities
         if "lifecycle_tier" not in memory_cols or "fact_id" not in memory_cols:
             logger.warning("Migration 20 incomplete: memories missing lifecycle/fact_id columns")
@@ -1351,9 +1378,15 @@
     def backup(self, label: str = None) -> Path:
         """Create a backup of the database using SQLite's online backup API.
 
+        Backups are stored in ~/.claudia/backups/ with human-readable names:
+        - claudia-daily-2026-03-15.db
+        - claudia-pre-merge-2026-03-15.db
+        - claudia-manual-2026-03-15-143022.db
+
         Args:
             label: Optional label for categorized backups (e.g., "daily", "weekly",
-                "pre-migration"). Labeled backups have independent
+                "pre-migration", "pre-merge"). Labeled backups have independent
+                retention counts. If None, uses "manual" with timestamp.
 
         Returns:
             Path to the created backup file
@@ -1361,11 +1394,17 @@
         import glob
 
         config = get_config()
-
+        backup_dir = config.backup_dir
+        backup_dir.mkdir(parents=True, exist_ok=True)
+
         if label:
-
+            # Labeled backups use date-only (one per day per label)
+            date_str = datetime.now().strftime("%Y-%m-%d")
+            backup_path = backup_dir / f"claudia-{label}-{date_str}.db"
         else:
-
+            # Manual backups include full timestamp
+            timestamp = datetime.now().strftime("%Y-%m-%d-%H%M%S")
+            backup_path = backup_dir / f"claudia-manual-{timestamp}.db"
 
         # Create backup using SQLite's built-in backup API
         backup_conn = sqlite3.connect(str(backup_path))
@@ -1390,10 +1429,10 @@
 
         # Rolling retention (per-label if labeled)
         if label:
-            pattern = f"
+            pattern = str(backup_dir / f"claudia-{label}-*.db")
             retention = self._get_label_retention(label)
         else:
-            pattern =
+            pattern = str(backup_dir / "claudia-manual-*.db")
             retention = config.backup_retention_count
 
         backups = sorted(glob.glob(pattern), key=os.path.getmtime)
@@ -1413,6 +1452,8 @@
         retention_map = {
             "daily": config.backup_daily_retention,
             "weekly": config.backup_weekly_retention,
+            "pre-merge": 4,  # Keep pre-merge backups for ~1 month
+            "pre-migration": 4,  # Keep pre-migration backups for ~1 month
         }
         return retention_map.get(label, config.backup_retention_count)
 
package/memory-daemon/claudia_memory/migration.py
CHANGED
@@ -1152,6 +1152,167 @@ def _migrate_reflections(
     logger.info(f"Reflections: {results['reflections_migrated']} migrated")
 
 
+# ── Unified Database Consolidation ───────────────────────────────────
+
+def scan_hash_databases(memory_dir: Path) -> List[Dict]:
+    """Scan ~/.claudia/memory/ for hash-named databases with data.
+
+    Returns a list of dicts with path, hash, and stats for each non-empty
+    hash-named database (12-char hex filenames like 6af67351bcfa.db).
+    """
+    import re
+
+    results = []
+    hash_pattern = re.compile(r"^[0-9a-f]{12}\.db$")
+
+    if not memory_dir.exists():
+        return results
+
+    for f in memory_dir.iterdir():
+        if not hash_pattern.match(f.name):
+            continue
+
+        db_hash = f.stem
+        stats = check_legacy_database(f)
+        results.append({
+            "path": f,
+            "hash": db_hash,
+            "has_data": stats is not None,
+            "stats": stats,
+        })
+
+    return results
+
+
+def merge_all_databases(
+    target_path: Path,
+    source_dbs: List[Dict],
+    dry_run: bool = False,
+) -> Dict[str, int]:
+    """Merge multiple hash-named databases into the unified claudia.db.
+
+    Each source DB's memories get tagged with workspace_id = source hash.
+    Deduplication uses content_hash for memories and (canonical_name, type)
+    for entities.
+
+    Args:
+        target_path: Path to the unified claudia.db
+        source_dbs: List of dicts from scan_hash_databases() (only those with data)
+        dry_run: If True, count what would be merged without making changes
+
+    Returns:
+        Dict with total migration counts across all sources
+    """
+    totals = {
+        "sources_merged": 0,
+        "total_entities_created": 0,
+        "total_entities_mapped": 0,
+        "total_memories_migrated": 0,
+        "total_memories_duplicate": 0,
+        "total_relationships_migrated": 0,
+        "total_links_migrated": 0,
+    }
+
+    for source in source_dbs:
+        source_path = source["path"]
+        source_hash = source["hash"]
+
+        logger.info(f"Merging {source_path.name} ({source['stats'].get('memories', 0)} memories, "
+                    f"{source['stats'].get('entities', 0)} entities)")
+
+        try:
+            results = migrate_legacy_database(
+                legacy_path=source_path,
+                active_path=target_path,
+                dry_run=dry_run,
+            )
+
+            # Tag merged memories with workspace_id = source hash
+            if not dry_run:
+                try:
+                    conn = sqlite3.connect(str(target_path), timeout=30)
+                    conn.execute(
+                        "UPDATE memories SET workspace_id = ? "
+                        "WHERE workspace_id IS NULL AND id IN ("
+                        "  SELECT id FROM memories WHERE workspace_id IS NULL"
+                        ")",
+                        (source_hash,),
+                    )
+                    conn.commit()
+                    conn.close()
+                except Exception as e:
+                    logger.warning(f"Could not tag workspace_id for {source_hash}: {e}")
+
+            totals["sources_merged"] += 1
+            totals["total_entities_created"] += results.get("entities_created", 0)
+            totals["total_entities_mapped"] += results.get("entities_mapped", 0)
+            totals["total_memories_migrated"] += results.get("memories_migrated", 0)
+            totals["total_memories_duplicate"] += results.get("memories_duplicate", 0)
+            totals["total_relationships_migrated"] += results.get("relationships_migrated", 0)
+            totals["total_links_migrated"] += results.get("links_migrated", 0)
+
+        except Exception as e:
+            logger.error(f"Failed to merge {source_path.name}: {e}")
+            # Non-fatal: continue with other sources
+
+    return totals
+
+
+def cleanup_old_databases(memory_dir: Path, source_dbs: List[Dict]) -> int:
+    """Delete hash-named databases and their WAL/SHM files after successful merge.
+
+    Args:
+        memory_dir: The ~/.claudia/memory/ directory
+        source_dbs: List of dicts from scan_hash_databases()
+
+    Returns:
+        Number of files deleted
+    """
+    deleted = 0
+
+    for source in source_dbs:
+        db_path = source["path"]
+
+        # Delete the database and its WAL/SHM companions
+        for suffix in ("", "-wal", "-shm"):
+            companion = Path(str(db_path) + suffix)
+            if companion.exists():
+                try:
+                    companion.unlink()
+                    deleted += 1
+                    logger.info(f"Deleted: {companion.name}")
+                except OSError as e:
+                    logger.warning(f"Could not delete {companion}: {e}")
+
+        # Delete any orphan backup files for this hash DB
+        import glob
+        orphan_pattern = str(db_path) + ".backup-*"
+        for orphan in glob.glob(orphan_pattern):
+            try:
+                Path(orphan).unlink()
+                deleted += 1
+                logger.info(f"Deleted orphan backup: {Path(orphan).name}")
+            except OSError as e:
+                logger.warning(f"Could not delete orphan backup {orphan}: {e}")
+
+    return deleted
+
+
+def verify_consolidated_db(db_path: Path) -> bool:
+    """Verify integrity of the consolidated database.
+
+    Returns True if the database passes PRAGMA integrity_check.
+    """
+    try:
+        conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True, timeout=5)
+        result = conn.execute("PRAGMA integrity_check").fetchone()
+        conn.close()
+        return result is not None and result[0] == "ok"
+    except Exception as e:
+        logger.error(f"Integrity check failed: {e}")
+        return False
+
+
 # ── Utilities ────────────────────────────────────────────────────────
 
 def _safe_json_parse(text: str, default: Any = None) -> Any:
package/memory-daemon/claudia_memory/schema.sql
CHANGED
@@ -79,7 +79,8 @@ CREATE TABLE IF NOT EXISTS memories (
     archived_at TEXT,                 -- When this memory was archived
     fact_id TEXT UNIQUE,              -- UUID for human-friendly reference
     hash TEXT,                        -- SHA-256 chain hash
-    prev_hash TEXT                    -- Previous hash in chain (NULL for genesis)
+    prev_hash TEXT,                   -- Previous hash in chain (NULL for genesis)
+    workspace_id TEXT                 -- Origin workspace (provenance, not partition)
 );
 
 CREATE INDEX IF NOT EXISTS idx_memories_type ON memories(type);
@@ -90,6 +91,7 @@ CREATE INDEX IF NOT EXISTS idx_memories_deadline ON memories(deadline_at);
 CREATE INDEX IF NOT EXISTS idx_memories_verification ON memories(verification_status);
 CREATE INDEX IF NOT EXISTS idx_memories_lifecycle ON memories(lifecycle_tier);
 CREATE INDEX IF NOT EXISTS idx_memories_fact_id ON memories(fact_id);
+CREATE INDEX IF NOT EXISTS idx_memories_workspace ON memories(workspace_id);
 
 -- Junction table linking memories to entities
 CREATE TABLE IF NOT EXISTS memory_entities (
@@ -475,3 +477,6 @@ CREATE INDEX IF NOT EXISTS idx_agent_dispatches_started ON agent_dispatches(star
 
 INSERT OR IGNORE INTO schema_migrations (version, description)
 VALUES (20, 'Add lifecycle tiers, sacred memories, close-circle entities, fact_id, SHA-256 chain');
+
+INSERT OR IGNORE INTO schema_migrations (version, description)
+VALUES (21, 'Add workspace_id to memories for unified database provenance tracking');
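Because `workspace_id` is a plain indexed column on `memories`, provenance questions reduce to SQL. A read-only sketch against the unified database:

```python
# Which workspaces did my memories come from? (read-only provenance query)
import sqlite3
from pathlib import Path

db = Path.home() / ".claudia" / "memory" / "claudia.db"
conn = sqlite3.connect(f"file:{db}?mode=ro", uri=True)
rows = conn.execute(
    "SELECT COALESCE(workspace_id, '(untagged)') AS ws, COUNT(*) "
    "FROM memories GROUP BY ws ORDER BY 2 DESC"
)
for workspace, n in rows:
    print(f"{workspace}: {n} memories")
conn.close()
```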
package/memory-daemon/claudia_memory/services/recall.py
CHANGED
@@ -44,6 +44,8 @@ class RecallResult:
     origin_type: str = "inferred"  # user_stated, extracted, inferred, corrected
     # Channel tracking
     source_channel: Optional[str] = None  # Origin channel: claude_code, telegram, slack
+    # Workspace provenance
+    workspace_id: Optional[str] = None  # Origin workspace (project hash)
     # Lifecycle fields
     lifecycle_tier: Optional[str] = None  # sacred/active/cooling/archived
     fact_id: Optional[str] = None  # UUID for human-friendly reference
@@ -370,6 +372,9 @@
         # Channel tracking (may not exist in older DBs)
         source_channel_val = row["source_channel"] if "source_channel" in row_keys else None
 
+        # Workspace provenance (may not exist in older DBs)
+        workspace_id_val = row["workspace_id"] if "workspace_id" in row_keys else None
+
         # Lifecycle fields (may not exist in older DBs)
         lifecycle_tier_val = row["lifecycle_tier"] if "lifecycle_tier" in row_keys else None
         fact_id_val = row["fact_id"] if "fact_id" in row_keys else None
@@ -390,6 +395,7 @@
             verification_status=verification_status_val,
             origin_type=origin_type_val,
             source_channel=source_channel_val,
+            workspace_id=workspace_id_val,
             lifecycle_tier=lifecycle_tier_val,
             fact_id=fact_id_val,
         )
@@ -858,6 +864,7 @@
                 source=row["source"] if "source" in row_keys else None,
                 source_id=row["source_id"] if "source_id" in row_keys else None,
                 source_context=row["source_context"] if "source_context" in row_keys else None,
+                workspace_id=row["workspace_id"] if "workspace_id" in row_keys else None,
                 lifecycle_tier=row["lifecycle_tier"] if "lifecycle_tier" in row_keys else None,
                 fact_id=row["fact_id"] if "fact_id" in row_keys else None,
             )
@@ -1326,6 +1333,7 @@
                 source=row["source"] if "source" in row_keys else None,
                 source_id=row["source_id"] if "source_id" in row_keys else None,
                 source_context=row["source_context"] if "source_context" in row_keys else None,
+                workspace_id=row["workspace_id"] if "workspace_id" in row_keys else None,
                 lifecycle_tier=row["lifecycle_tier"] if "lifecycle_tier" in row_keys else None,
                 fact_id=row["fact_id"] if "fact_id" in row_keys else None,
             )
@@ -2522,6 +2530,7 @@
                 created_at=row["created_at"],
                 entities=entity_str.split(",") if entity_str else [],
                 metadata={"urgency": urgency, "deadline_at": deadline_str},
+                workspace_id=row["workspace_id"] if "workspace_id" in row_keys else None,
                 lifecycle_tier=row["lifecycle_tier"] if "lifecycle_tier" in row_keys else None,
                 fact_id=row["fact_id"] if "fact_id" in row_keys else None,
             ))
@@ -2791,6 +2800,7 @@
             origin_type=row["origin_type"] if "origin_type" in row_keys else "inferred",
             confidence=row["confidence"] if "confidence" in row_keys else 1.0,
             source_channel=row["source_channel"] if "source_channel" in row_keys else None,
+            workspace_id=row["workspace_id"] if "workspace_id" in row_keys else None,
             lifecycle_tier=row["lifecycle_tier"] if "lifecycle_tier" in row_keys else None,
             fact_id=row["fact_id"] if "fact_id" in row_keys else None,
         )
package/memory-daemon/claudia_memory/services/remember.py
CHANGED
@@ -251,6 +251,14 @@
             insert_data["source_context"] = source_context
         if source_channel:
             insert_data["source_channel"] = source_channel
+        # Auto-tag workspace_id from config (provenance: which workspace created this memory)
+        try:
+            from ..config import get_config as _get_config
+            _ws_id = getattr(_get_config(), "workspace_id", None)
+            if _ws_id:
+                insert_data["workspace_id"] = _ws_id
+        except Exception:
+            pass
         if deadline_at:
             insert_data["deadline_at"] = deadline_at
         if temporal_markers_json: