superlocalmemory 2.4.0 → 2.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED

@@ -16,6 +16,31 @@ SuperLocalMemory V2 - Intelligent local memory system for AI coding assistants.
 
 ---
 
+## [2.4.1] - 2026-02-11
+
+**Release Type:** Hierarchical Clustering & Documentation Release
+**Backward Compatible:** Yes (additive schema changes only)
+
+### Added
+- **Hierarchical Leiden clustering** (`graph_engine.py`): Recursive community detection — large clusters (≥10 members) are automatically subdivided up to 3 levels deep, e.g. "Python" → "FastAPI" → "Authentication patterns". New `parent_cluster_id` and `depth` columns in the `graph_clusters` table
+- **Community summaries** (`graph_engine.py`): TF-IDF structured reports for every cluster — key topics, projects, categories, hierarchy context. Stored in the `graph_clusters.summary` column, surfaced in the `/api/clusters` endpoint and web dashboard
+- **CLI commands**: `python3 graph_engine.py hierarchical` and `python3 graph_engine.py summaries` for manual runs
+- **Schema migration**: Safe `ALTER TABLE` additions for the `summary`, `parent_cluster_id`, and `depth` columns — backward compatible with existing databases
+
+### Changed
+- `build_graph()` now automatically runs hierarchical sub-clustering and summary generation after flat Leiden clustering
+- The `/api/clusters` endpoint returns `summary`, `parent_cluster_id`, and `depth` fields
+- `get_stats()` includes `max_depth` and per-cluster summary/hierarchy data
+- `setup_validator.py` schema updated to include the new columns
+
+### Documentation
+- **README.md**: v2.4.0 → v2.4.1; added Hierarchical Leiden, Community Summaries, MACLA, and Auto-Backup sections
+- **Wiki**: Updated Roadmap, Pattern-Learning-Explained, Knowledge-Graph-Guide, Configuration, Visualization-Dashboard, Footer
+- **Website**: Updated features.astro, comparison.astro, index.astro for v2.4.1 features
+- **`.npmignore`**: Recursive `__pycache__` exclusion patterns
+
+---
+
 ## [2.4.0] - 2026-02-11
 
 **Release Type:** Profile System & Intelligence Release
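The "Safe `ALTER TABLE` additions" entry above relies on SQLite raising `OperationalError` when a column already exists, which makes the migration additive and idempotent. A minimal, self-contained sketch of that pattern (the `migrate_graph_clusters` helper name and the one-table schema are illustrative, not from the package):

```python
import sqlite3

def migrate_graph_clusters(db_path: str) -> None:
    """Additive, idempotent migration: add each column only if it is missing."""
    conn = sqlite3.connect(db_path)
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS graph_clusters (id INTEGER PRIMARY KEY, name TEXT)")
    for col, col_type in [("summary", "TEXT"),
                          ("parent_cluster_id", "INTEGER"),
                          ("depth", "INTEGER DEFAULT 0")]:
        try:
            cur.execute(f"ALTER TABLE graph_clusters ADD COLUMN {col} {col_type}")
        except sqlite3.OperationalError:
            pass  # column already exists; SQLite has no ADD COLUMN IF NOT EXISTS
    conn.commit()
    conn.close()
```

Running it against an existing database is safe: the second and later runs hit the `OperationalError` branch and change nothing.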
package/README.md CHANGED

@@ -130,7 +130,7 @@ npm update -g superlocalmemory
 npm install -g superlocalmemory@latest
 
 # Install specific version
-npm install -g superlocalmemory@2.3.7
+npm install -g superlocalmemory@latest
 ```
 
 **Manual install users:**

@@ -189,6 +189,19 @@ python ~/.claude-memory/ui_server.py
 
 ---
 
+### New in v2.4.1: Hierarchical Clustering, Community Summaries & Auto-Backup
+
+| Feature | Description |
+|---------|-------------|
+| **Hierarchical Leiden** | Recursive community detection — clusters within clusters up to 3 levels. "Python" → "FastAPI" → "Auth patterns" |
+| **Community Summaries** | TF-IDF structured reports per cluster: key topics, projects, categories at a glance |
+| **MACLA Confidence** | Bayesian Beta-Binomial scoring (arXiv:2512.18950) — calibrated confidence, not raw frequency |
+| **Auto-Backup** | Configurable SQLite backups with retention policies, one-click restore from the dashboard |
+| **Profile UI** | Create, switch, and delete profiles from the web dashboard — full isolation per context |
+| **Profile Isolation** | All API endpoints (graph, clusters, patterns, timeline) scoped to the active profile |
+
+---
+
 ## 🔍 Advanced Search
 
 SuperLocalMemory V2.2.0 implements **hybrid search** combining multiple strategies for maximum accuracy.

@@ -433,13 +446,13 @@ Not another simple key-value store. SuperLocalMemory implements **cutting-edge m
 │ 6 universal slash-commands for AI assistants                │
 │ Compatible with Claude Code, Continue, Cody                 │
 ├─────────────────────────────────────────────────────────────┤
-│ Layer 4: PATTERN LEARNING                                   │
-│ Learns: coding style, preferences, terminology              │
+│ Layer 4: PATTERN LEARNING + MACLA (v2.4.0)                  │
+│ Bayesian Beta-Binomial confidence (arXiv:2512.18950)        │
 │ "You prefer React over Vue" (73% confidence)                │
 ├─────────────────────────────────────────────────────────────┤
-│ Layer 3: KNOWLEDGE GRAPH                                    │
-│ Auto-clusters: "Auth & Tokens", "Performance", "Testing"    │
-│ Discovers relationships you didn't know existed             │
+│ Layer 3: KNOWLEDGE GRAPH + HIERARCHICAL LEIDEN (v2.4.1)     │
+│ Recursive clustering: "Python" → "FastAPI" → "Auth"         │
+│ Community summaries + TF-IDF structured reports             │
 ├─────────────────────────────────────────────────────────────┤
 │ Layer 2: HIERARCHICAL INDEX                                 │
 │ Tree structure for fast navigation                          │

@@ -488,6 +501,8 @@ python ~/.claude-memory/pattern_learner.py context 0.5
 
 **Your AI assistant can now match your preferences automatically.**
 
+**MACLA Confidence Scoring (v2.4.0):** Confidence uses a Bayesian Beta-Binomial posterior (Forouzandeh et al., [arXiv:2512.18950](https://arxiv.org/abs/2512.18950)) with pattern-specific priors, log-scaled competition, and a recency bonus. Range: 0.0–0.95 (a hard cap prevents overconfidence).
+
 ### Multi-Profile Support
 
 ```bash

@@ -537,14 +552,21 @@ superlocalmemoryv2:profile create <name>  # New profile
 superlocalmemoryv2:profile switch <name>  # Switch context
 
 # Knowledge Graph
-python ~/.claude-memory/graph_engine.py build          # Build graph
+python ~/.claude-memory/graph_engine.py build          # Build graph (+ hierarchical + summaries)
 python ~/.claude-memory/graph_engine.py stats          # View clusters
 python ~/.claude-memory/graph_engine.py related --id 5 # Find related
+python ~/.claude-memory/graph_engine.py hierarchical   # Sub-cluster large communities
+python ~/.claude-memory/graph_engine.py summaries      # Generate cluster summaries
 
 # Pattern Learning
 python ~/.claude-memory/pattern_learner.py update      # Learn patterns
 python ~/.claude-memory/pattern_learner.py context 0.5 # Get identity
 
+# Auto-Backup (v2.4.0)
+python ~/.claude-memory/auto_backup.py backup          # Manual backup
+python ~/.claude-memory/auto_backup.py list            # List backups
+python ~/.claude-memory/auto_backup.py status          # Backup status
+
 # Reset (Use with caution!)
 superlocalmemoryv2:reset soft            # Clear memories
 superlocalmemoryv2:reset hard --confirm  # Nuclear option
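The MACLA confidence scoring added to the README can be illustrated with a plain Beta-Binomial posterior mean plus the 0.95 hard cap. This is a generic sketch of the underlying technique only; the package's actual pattern-specific priors, log-scaled competition term, and recency bonus are not reproduced here, and the function name is hypothetical:

```python
def beta_binomial_confidence(successes: int, observations: int,
                             alpha_prior: float = 1.0, beta_prior: float = 1.0,
                             cap: float = 0.95) -> float:
    """Posterior mean of a Beta(alpha, beta) prior updated with Binomial evidence.

    With no observations this returns the prior mean (0.5 for a uniform prior),
    and the hard cap keeps even overwhelming evidence below certainty.
    """
    posterior_mean = (successes + alpha_prior) / (observations + alpha_prior + beta_prior)
    return min(posterior_mean, cap)
```

With a uniform prior, 8 confirmations out of 10 observations yield (8+1)/(10+2) = 0.75, which is why calibrated confidence stays below the raw 80% frequency.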
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "superlocalmemory",
-  "version": "2.4.0",
+  "version": "2.4.1",
   "description": "Your AI Finally Remembers You - Local-first intelligent memory system for AI assistants. Works with Claude, Cursor, Windsurf, VS Code/Copilot, Codex, and 16+ AI tools. 100% local, zero cloud dependencies.",
   "keywords": [
     "ai-memory",
@@ -43,6 +43,7 @@
     "superlocalmemory": "./bin/slm-npm"
   },
   "scripts": {
+    "prepack": "find . -type d -name __pycache__ -exec rm -rf {} + 2>/dev/null; find . -name '*.pyc' -delete 2>/dev/null; true",
     "postinstall": "node scripts/postinstall.js",
     "preuninstall": "node scripts/preuninstall.js",
     "test": "echo \"Run: npm install -g . && slm status\" && exit 0"
package/graph_engine.py CHANGED

@@ -404,6 +404,293 @@ class ClusterBuilder:
         return name[:100]  # Limit length
 
 
+    def hierarchical_cluster(self, min_subcluster_size: int = 5, max_depth: int = 3) -> Dict[str, Any]:
+        """
+        Run recursive Leiden clustering — cluster the clusters.
+
+        Large communities (>= min_subcluster_size * 2) are recursively sub-clustered
+        to reveal finer-grained thematic structure. E.g., "Python" → "FastAPI" → "Auth".
+
+        Args:
+            min_subcluster_size: Minimum members to attempt sub-clustering (default 5)
+            max_depth: Maximum recursion depth (default 3)
+
+        Returns:
+            Dictionary with hierarchical clustering statistics
+        """
+        try:
+            import igraph as ig
+            import leidenalg
+        except ImportError:
+            raise ImportError("python-igraph and leidenalg required. Install: pip install python-igraph leidenalg")
+
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        active_profile = self._get_active_profile()
+
+        try:
+            # Get top-level clusters for this profile that are large enough to sub-cluster
+            cursor.execute('''
+                SELECT cluster_id, COUNT(*) as cnt
+                FROM memories
+                WHERE cluster_id IS NOT NULL AND profile = ?
+                GROUP BY cluster_id
+                HAVING cnt >= ?
+            ''', (active_profile, min_subcluster_size * 2))
+            large_clusters = cursor.fetchall()
+
+            if not large_clusters:
+                logger.info("No clusters large enough for hierarchical decomposition")
+                return {'subclusters_created': 0, 'depth_reached': 0}
+
+            total_subclusters = 0
+            max_depth_reached = 0
+
+            for parent_cid, member_count in large_clusters:
+                subs, depth = self._recursive_subcluster(
+                    conn, cursor, parent_cid, active_profile,
+                    min_subcluster_size, max_depth, current_depth=1
+                )
+                total_subclusters += subs
+                max_depth_reached = max(max_depth_reached, depth)
+
+            conn.commit()
+            logger.info(f"Hierarchical clustering: {total_subclusters} sub-clusters, depth {max_depth_reached}")
+            return {
+                'subclusters_created': total_subclusters,
+                'depth_reached': max_depth_reached,
+                'parent_clusters_processed': len(large_clusters)
+            }
+
+        except Exception as e:
+            logger.error(f"Hierarchical clustering failed: {e}")
+            conn.rollback()
+            return {'subclusters_created': 0, 'error': str(e)}
+        finally:
+            conn.close()
+
+    def _recursive_subcluster(self, conn, cursor, parent_cluster_id: int,
+                              profile: str, min_size: int, max_depth: int,
+                              current_depth: int) -> Tuple[int, int]:
+        """Recursively sub-cluster a community using Leiden."""
+        import igraph as ig
+        import leidenalg
+
+        if current_depth > max_depth:
+            return 0, current_depth - 1
+
+        # Get memory IDs in this cluster
+        cursor.execute('''
+            SELECT id FROM memories
+            WHERE cluster_id = ? AND profile = ?
+        ''', (parent_cluster_id, profile))
+        member_ids = [row[0] for row in cursor.fetchall()]
+
+        if len(member_ids) < min_size * 2:
+            return 0, current_depth - 1
+
+        # Get edges between members of this cluster
+        placeholders = ','.join('?' * len(member_ids))
+        edges = cursor.execute(f'''
+            SELECT source_memory_id, target_memory_id, weight
+            FROM graph_edges
+            WHERE source_memory_id IN ({placeholders})
+              AND target_memory_id IN ({placeholders})
+        ''', member_ids + member_ids).fetchall()
+
+        if len(edges) < 2:
+            return 0, current_depth - 1
+
+        # Build sub-graph
+        id_to_vertex = {mid: idx for idx, mid in enumerate(member_ids)}
+        vertex_to_id = {idx: mid for mid, idx in id_to_vertex.items()}
+
+        g = ig.Graph()
+        g.add_vertices(len(member_ids))
+        edge_list, edge_weights = [], []
+        for src, tgt, w in edges:
+            if src in id_to_vertex and tgt in id_to_vertex:
+                edge_list.append((id_to_vertex[src], id_to_vertex[tgt]))
+                edge_weights.append(w)
+
+        if not edge_list:
+            return 0, current_depth - 1
+
+        g.add_edges(edge_list)
+
+        # Run Leiden on the sub-graph to find finer communities
+        partition = leidenalg.find_partition(
+            g, leidenalg.ModularityVertexPartition,
+            weights=edge_weights, n_iterations=100, seed=42
+        )
+
+        # Only proceed if Leiden found > 1 community (an actual split)
+        non_singleton = [c for c in partition if len(c) >= 2]
+        if len(non_singleton) <= 1:
+            return 0, current_depth - 1
+
+        subclusters_created = 0
+        deepest = current_depth
+
+        # Get parent depth
+        cursor.execute('SELECT depth FROM graph_clusters WHERE id = ?', (parent_cluster_id,))
+        parent_row = cursor.fetchone()
+        parent_depth = parent_row[0] if parent_row else 0
+
+        for community in non_singleton:
+            sub_member_ids = [vertex_to_id[v] for v in community]
+
+            if len(sub_member_ids) < 2:
+                continue
+
+            avg_imp = self._get_avg_importance(cursor, sub_member_ids)
+            cluster_name = self._generate_cluster_name(cursor, sub_member_ids)
+
+            result = cursor.execute('''
+                INSERT INTO graph_clusters (name, member_count, avg_importance, parent_cluster_id, depth)
+                VALUES (?, ?, ?, ?, ?)
+            ''', (cluster_name, len(sub_member_ids), avg_imp, parent_cluster_id, parent_depth + 1))
+
+            sub_cluster_id = result.lastrowid
+
+            # Update memories to point to sub-cluster
+            cursor.executemany('''
+                UPDATE memories SET cluster_id = ? WHERE id = ?
+            ''', [(sub_cluster_id, mid) for mid in sub_member_ids])
+
+            subclusters_created += 1
+            logger.info(f"Sub-cluster {sub_cluster_id} under {parent_cluster_id}: "
+                        f"'{cluster_name}' ({len(sub_member_ids)} members, depth {parent_depth + 1})")
+
+            # Recurse into this sub-cluster if large enough
+            child_subs, child_depth = self._recursive_subcluster(
+                conn, cursor, sub_cluster_id, profile,
+                min_size, max_depth, current_depth + 1
+            )
+            subclusters_created += child_subs
+            deepest = max(deepest, child_depth)
+
+        return subclusters_created, deepest
+
+    def generate_cluster_summaries(self) -> int:
+        """
+        Generate TF-IDF structured summaries for all clusters.
+
+        For each cluster, analyzes member content to produce a human-readable
+        summary describing the cluster's theme, key topics, and scope.
+
+        Returns:
+            Number of clusters with summaries generated
+        """
+        conn = sqlite3.connect(self.db_path)
+        cursor = conn.cursor()
+        active_profile = self._get_active_profile()
+
+        try:
+            # Get all clusters for this profile
+            cursor.execute('''
+                SELECT DISTINCT gc.id, gc.name, gc.member_count
+                FROM graph_clusters gc
+                JOIN memories m ON m.cluster_id = gc.id
+                WHERE m.profile = ?
+            ''', (active_profile,))
+            clusters = cursor.fetchall()
+
+            if not clusters:
+                return 0
+
+            summaries_generated = 0
+
+            for cluster_id, cluster_name, member_count in clusters:
+                summary = self._build_cluster_summary(cursor, cluster_id, active_profile)
+                if summary:
+                    cursor.execute('''
+                        UPDATE graph_clusters SET summary = ?, updated_at = CURRENT_TIMESTAMP
+                        WHERE id = ?
+                    ''', (summary, cluster_id))
+                    summaries_generated += 1
+                    logger.info(f"Summary for cluster {cluster_id} ({cluster_name}): {summary[:80]}...")
+
+            conn.commit()
+            logger.info(f"Generated {summaries_generated} cluster summaries")
+            return summaries_generated
+
+        except Exception as e:
+            logger.error(f"Summary generation failed: {e}")
+            conn.rollback()
+            return 0
+        finally:
+            conn.close()
+
+    def _build_cluster_summary(self, cursor, cluster_id: int, profile: str) -> str:
+        """Build a TF-IDF structured summary for a single cluster."""
+        # Get member content
+        cursor.execute('''
+            SELECT m.content, m.summary, m.tags, m.category, m.project_name
+            FROM memories m
+            WHERE m.cluster_id = ? AND m.profile = ?
+        ''', (cluster_id, profile))
+        members = cursor.fetchall()
+
+        if not members:
+            return ""
+
+        # Collect entities from graph nodes
+        cursor.execute('''
+            SELECT gn.entities
+            FROM graph_nodes gn
+            JOIN memories m ON gn.memory_id = m.id
+            WHERE m.cluster_id = ? AND m.profile = ?
+        ''', (cluster_id, profile))
+        all_entities = []
+        for row in cursor.fetchall():
+            if row[0]:
+                try:
+                    all_entities.extend(json.loads(row[0]))
+                except (json.JSONDecodeError, TypeError):
+                    pass
+
+        # Top entities by frequency (TF-IDF already extracted these)
+        entity_counts = Counter(all_entities)
+        top_entities = [e for e, _ in entity_counts.most_common(5)]
+
+        # Collect unique projects and categories
+        projects = set()
+        categories = set()
+        for m in members:
+            if m[3]:  # category
+                categories.add(m[3])
+            if m[4]:  # project_name
+                projects.add(m[4])
+
+        # Build structured summary
+        parts = []
+
+        # Theme from top entities
+        if top_entities:
+            parts.append(f"Key topics: {', '.join(top_entities[:5])}")
+
+        # Scope
+        if projects:
+            parts.append(f"Projects: {', '.join(sorted(projects)[:3])}")
+        if categories:
+            parts.append(f"Categories: {', '.join(sorted(categories)[:3])}")
+
+        # Size context
+        parts.append(f"{len(members)} memories")
+
+        # Check for hierarchical context
+        cursor.execute('SELECT parent_cluster_id FROM graph_clusters WHERE id = ?', (cluster_id,))
+        parent_row = cursor.fetchone()
+        if parent_row and parent_row[0]:
+            cursor.execute('SELECT name FROM graph_clusters WHERE id = ?', (parent_row[0],))
+            parent_name_row = cursor.fetchone()
+            if parent_name_row:
+                parts.append(f"Sub-cluster of: {parent_name_row[0]}")
+
+        return " | ".join(parts)
+
+
 class ClusterNamer:
     """Enhanced cluster naming with optional LLM support (future)."""
 
@@ -498,13 +785,24 @@ class GraphEngine:
                 id INTEGER PRIMARY KEY AUTOINCREMENT,
                 name TEXT NOT NULL,
                 description TEXT,
+                summary TEXT,
                 member_count INTEGER DEFAULT 0,
                 avg_importance REAL,
+                parent_cluster_id INTEGER,
+                depth INTEGER DEFAULT 0,
                 created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
-                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
+                updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
+                FOREIGN KEY (parent_cluster_id) REFERENCES graph_clusters(id) ON DELETE SET NULL
             )
         ''')
 
+        # Safe column additions for existing databases
+        for col, col_type in [('summary', 'TEXT'), ('parent_cluster_id', 'INTEGER'), ('depth', 'INTEGER DEFAULT 0')]:
+            try:
+                cursor.execute(f'ALTER TABLE graph_clusters ADD COLUMN {col} {col_type}')
+            except sqlite3.OperationalError:
+                pass
+
         # Add cluster_id to memories if not exists
         try:
             cursor.execute('ALTER TABLE memories ADD COLUMN cluster_id INTEGER')

@@ -648,9 +946,16 @@ class GraphEngine:
             memory_ids, vectors, entities_list
         )
 
-        # Detect communities
+        # Detect communities (flat Leiden)
         clusters_count = self.cluster_builder.detect_communities()
 
+        # Hierarchical sub-clustering on large communities
+        hierarchical_stats = self.cluster_builder.hierarchical_cluster()
+        subclusters = hierarchical_stats.get('subclusters_created', 0)
+
+        # Generate TF-IDF structured summaries for all clusters
+        summaries = self.cluster_builder.generate_cluster_summaries()
+
         elapsed = time.time() - start_time
 
         stats = {

@@ -659,6 +964,9 @@ class GraphEngine:
             'nodes': len(memory_ids),
             'edges': edges_count,
             'clusters': clusters_count,
+            'subclusters': subclusters,
+            'max_depth': hierarchical_stats.get('depth_reached', 0),
+            'summaries_generated': summaries,
             'time_seconds': round(elapsed, 2)
         }
 
@@ -962,28 +1270,36 @@ class GraphEngine:
             WHERE cluster_id IS NOT NULL AND profile = ?
         ''', (active_profile,)).fetchone()[0]
 
-        # Cluster breakdown for active profile
+        # Cluster breakdown for active profile (including hierarchy)
         cluster_info = cursor.execute('''
-            SELECT gc.name, gc.member_count, gc.avg_importance
+            SELECT gc.name, gc.member_count, gc.avg_importance,
+                   gc.summary, gc.parent_cluster_id, gc.depth
             FROM graph_clusters gc
             WHERE gc.id IN (
                 SELECT DISTINCT cluster_id FROM memories
                 WHERE cluster_id IS NOT NULL AND profile = ?
            )
-            ORDER BY gc.member_count DESC
-            LIMIT 10
+            ORDER BY gc.depth ASC, gc.member_count DESC
+            LIMIT 20
        ''', (active_profile,)).fetchall()
 
+        # Maximum hierarchical depth across the returned clusters
+        max_depth = max((c[5] or 0 for c in cluster_info), default=0)
+
        return {
            'profile': active_profile,
            'nodes': nodes,
            'edges': edges,
            'clusters': clusters,
+           'max_depth': max_depth,
            'top_clusters': [
                {
                    'name': c[0],
                    'members': c[1],
-                   'avg_importance': round(c[2], 1)
+                   'avg_importance': round(c[2], 1) if c[2] else 5.0,
+                   'summary': c[3],
+                   'parent_cluster_id': c[4],
+                   'depth': c[5] or 0
                }
                for c in cluster_info
            ]

@@ -998,7 +1314,7 @@ def main():
     import argparse
 
     parser = argparse.ArgumentParser(description='GraphEngine - Knowledge Graph Management')
-    parser.add_argument('command', choices=['build', 'stats', 'related', 'cluster'],
+    parser.add_argument('command', choices=['build', 'stats', 'related', 'cluster', 'hierarchical', 'summaries'],
                         help='Command to execute')
    parser.add_argument('--memory-id', type=int, help='Memory ID for related/add commands')
    parser.add_argument('--cluster-id', type=int, help='Cluster ID for cluster command')

@@ -1052,6 +1368,18 @@ def main():
             summary = mem['summary'] or '[No summary]'
             print(f"   {summary[:100]}...")
 
+    elif args.command == 'hierarchical':
+        print("Running hierarchical sub-clustering...")
+        cluster_builder = ClusterBuilder(engine.db_path)
+        stats = cluster_builder.hierarchical_cluster()
+        print(json.dumps(stats, indent=2))
+
+    elif args.command == 'summaries':
+        print("Generating cluster summaries...")
+        cluster_builder = ClusterBuilder(engine.db_path)
+        count = cluster_builder.generate_cluster_summaries()
+        print(f"Generated summaries for {count} clusters")
+
 
 if __name__ == '__main__':
     main()
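The `_build_cluster_summary` method above ranks a cluster's entities by raw frequency within that one cluster. If a true TF-IDF weighting across clusters were wanted (down-weighting entities that appear in every cluster), it could be sketched as follows, treating each cluster's entity list as one document; `top_tfidf_entities` and its input shape are illustrative, not part of the package:

```python
import math
from collections import Counter
from typing import Dict, List

def top_tfidf_entities(clusters: Dict[int, List[str]], cluster_id: int, k: int = 5) -> List[str]:
    """Rank one cluster's entities by TF-IDF, treating each cluster as a document."""
    n_docs = len(clusters)
    df = Counter()  # document frequency: how many clusters mention each entity
    for entities in clusters.values():
        df.update(set(entities))
    tf = Counter(clusters[cluster_id])
    total = sum(tf.values())
    scores = {
        # term frequency times smoothed inverse document frequency
        e: (count / total) * (math.log((1 + n_docs) / (1 + df[e])) + 1.0)
        for e, count in tf.items()
    }
    # Sort by score descending, then alphabetically for deterministic ties
    return [e for e, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
```

An entity unique to one cluster ("fastapi") outranks an equally frequent entity shared by all clusters ("python"), which is the distinguishing behavior the raw `Counter.most_common` approach lacks.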
@@ -257,11 +257,18 @@ def initialize_database() -> Tuple[bool, str]:
257
257
  CREATE TABLE IF NOT EXISTS graph_clusters (
258
258
  id INTEGER PRIMARY KEY AUTOINCREMENT,
259
259
  cluster_name TEXT,
260
+ name TEXT,
260
261
  description TEXT,
262
+ summary TEXT,
261
263
  memory_count INTEGER DEFAULT 0,
264
+ member_count INTEGER DEFAULT 0,
262
265
  avg_importance REAL DEFAULT 5.0,
263
266
  top_entities TEXT DEFAULT '[]',
264
- created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
267
+ parent_cluster_id INTEGER,
268
+ depth INTEGER DEFAULT 0,
269
+ created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
270
+ updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
271
+ FOREIGN KEY (parent_cluster_id) REFERENCES graph_clusters(id) ON DELETE SET NULL
265
272
  )
266
273
  ''')
267
274
 
package/ui_server.py CHANGED

@@ -707,22 +707,26 @@ async def get_clusters():
 
     active_profile = get_active_profile()
 
-    # Get cluster statistics
+    # Get cluster statistics with hierarchy and summaries
     cursor.execute("""
         SELECT
-            cluster_id,
+            m.cluster_id,
             COUNT(*) as member_count,
-            AVG(importance) as avg_importance,
-            MIN(importance) as min_importance,
-            MAX(importance) as max_importance,
-            GROUP_CONCAT(DISTINCT category) as categories,
-            GROUP_CONCAT(DISTINCT project_name) as projects,
-            MIN(created_at) as first_memory,
-            MAX(created_at) as latest_memory
-        FROM memories
-        WHERE cluster_id IS NOT NULL AND profile = ?
-        GROUP BY cluster_id
-        ORDER BY member_count DESC
+            AVG(m.importance) as avg_importance,
+            MIN(m.importance) as min_importance,
+            MAX(m.importance) as max_importance,
+            GROUP_CONCAT(DISTINCT m.category) as categories,
+            GROUP_CONCAT(DISTINCT m.project_name) as projects,
+            MIN(m.created_at) as first_memory,
+            MAX(m.created_at) as latest_memory,
+            gc.summary,
+            gc.parent_cluster_id,
+            gc.depth
+        FROM memories m
+        LEFT JOIN graph_clusters gc ON m.cluster_id = gc.id
+        WHERE m.cluster_id IS NOT NULL AND m.profile = ?
+        GROUP BY m.cluster_id
+        ORDER BY COALESCE(gc.depth, 0) ASC, member_count DESC
     """, (active_profile,))
     clusters = cursor.fetchall()