PyPI - quantum-memory-graph - Versions diffs - 1.2.0__tar.gz → 1.2.2__tar.gz - Mend

quantum-memory-graph 1.2.0__tar.gz → 1.2.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38)

{quantum_memory_graph-1.2.0/quantum_memory_graph.egg-info → quantum_memory_graph-1.2.2}/PKG-INFO RENAMED

@@ -1,19 +1,28 @@
 Metadata-Version: 2.4
 Name: quantum-memory-graph
-Version: 1.2.0
+Version: 1.2.2
 Summary: Quantum-optimized knowledge graph memory for AI agents. Relationship-aware subgraph selection via QAOA.
 Home-page: https://github.com/Dustin-a11y/quantum-memory-graph
 Author: Coinkong (Chef's Attraction)
 License: MIT
+Project-URL: Source Code, https://github.com/Dustin-a11y/quantum-memory-graph
+Project-URL: Issue Tracker, https://github.com/Dustin-a11y/quantum-memory-graph/issues
+Project-URL: Benchmark Results, https://github.com/Dustin-a11y/quantum-memory-graph/tree/main/benchmarks
+Project-URL: LongMemEval Submission, https://github.com/xiaowu0162/LongMemEval/issues
 Keywords: quantum,memory,knowledge-graph,agents,qaoa,ai
 Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Developers
 Classifier: License :: OSI Approved :: MIT License
 Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Classifier: Programming Language :: Python :: 3.13
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Requires-Python: >=3.9
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: quantum-agent-memory>=0.1.0
 Requires-Dist: sentence-transformers>=2.2.0
 Requires-Dist: networkx>=3.0
 Requires-Dist: numpy>=1.24.0
@@ -35,29 +44,16 @@ Dynamic: license-file
 Every memory system treats memories as independent documents — search, rank, stuff into context. But memories aren't independent. They have *relationships*. "The team chose React" becomes 10x more useful paired with "because of ecosystem maturity" and "FastAPI handles the backend."
-Quantum Memory Graph maps these relationships, then uses QAOA to find the optimal *combination* of memories — not just the most relevant individuals, but the best connected subgraph that gives your agent maximum context.
-## Benchmark: MemCombine
-We created MemCombine to test what no existing benchmark measures — **memory combination quality**.
-| Method | Coverage | Evidence Recall | F1 | Perfect |
-|--------|----------|----------------|----|---------|
-| Embedding Top-K | 69.9% | 65.6% | 68.1% | 1/5 |
-| **Graph + QAOA** | **96.7%** | **91.0%** | **92.6%** | **4/5** |
-| **Advantage** | **+26.8%** | **+25.4%** | **+24.5%** | |
-When the task is "find memories that work *together*," graph-aware quantum selection crushes pure similarity search.
 ## 🏆 #1 on LongMemEval (ICLR 2025 Benchmark)
-Tested on the official [LongMemEval benchmark](https://arxiv.org/abs/2410.10813) for long-term memory in AI agents:
+Tested on the official [LongMemEval benchmark](https://arxiv.org/abs/2410.10813) — [verified submission](https://github.com/xiaowu0162/LongMemEval/issues/46).
 | Method | R@1 | R@5 | R@10 | NDCG@10 |
 |--------|:---:|:---:|:----:|:-------:|
 | OMEGA (prev SOTA) | — | 89.2% | 94.1% | 87.5% |
 | Mastra OM | — | 91.0% | 95.2% | 89.1% |
 | **QMG v1.1 (published #1)** | — | **95.8%** | **98.85%** | **93.2%** |
-| **QMG v1.2 (official, this repo)** 🏆 | **90.6%** | **98.6%** | **99.4%** | **0.9426** |
+| **QMG v1.2 — chunked retrieval pipeline** 🏆 | **90.6%** | **98.6%** | **99.4%** | **94.26%** |
 **Benchmark run:** 500 questions, chunked gte-large embeddings (500-char blocks, 100-char overlap, mean-of-top-3 session scoring). Verified on DGX Spark GB10 (CUDA, ~53 min).
@@ -65,7 +61,6 @@ Tested on the official [LongMemEval benchmark](https://arxiv.org/abs/2410.10813)
 **See:** `benchmarks/run_longmemeval_chunked_staged.py` for the exact benchmark code, `benchmarks/longmemeval_chunked_staged_results.json` for full per-question results.
 ## Install
 ```bash
@@ -191,10 +186,7 @@ result = recall(
 )
 ```
-### Run MemCombine Benchmark
 ```python
-from benchmarks.memcombine import run_benchmark
 def my_recall(memories, query, K):
     # Your recall implementation
@@ -227,8 +219,6 @@ Validated on `ibm_fez` and `ibm_kingston` backends.
 MIT License — Copyright 2026 Coinkong (Chef's Attraction)
 ## Links
-- [quantum-agent-memory](https://github.com/Dustin-a11y/quantum-agent-memory) — The QAOA optimization engine
-- [MemCombine Benchmark](benchmarks/memcombine.py) — Test memory combination quality
+- [GitHub](https://github.com/Dustin-a11y/quantum-memory-graph) — Source code and benchmarks

{quantum_memory_graph-1.2.0 → quantum_memory_graph-1.2.2}/README.md RENAMED

@@ -4,29 +4,16 @@
 Every memory system treats memories as independent documents — search, rank, stuff into context. But memories aren't independent. They have *relationships*. "The team chose React" becomes 10x more useful paired with "because of ecosystem maturity" and "FastAPI handles the backend."
-Quantum Memory Graph maps these relationships, then uses QAOA to find the optimal *combination* of memories — not just the most relevant individuals, but the best connected subgraph that gives your agent maximum context.
-## Benchmark: MemCombine
-We created MemCombine to test what no existing benchmark measures — **memory combination quality**.
-| Method | Coverage | Evidence Recall | F1 | Perfect |
-|--------|----------|----------------|----|---------|
-| Embedding Top-K | 69.9% | 65.6% | 68.1% | 1/5 |
-| **Graph + QAOA** | **96.7%** | **91.0%** | **92.6%** | **4/5** |
-| **Advantage** | **+26.8%** | **+25.4%** | **+24.5%** | |
-When the task is "find memories that work *together*," graph-aware quantum selection crushes pure similarity search.
 ## 🏆 #1 on LongMemEval (ICLR 2025 Benchmark)
-Tested on the official [LongMemEval benchmark](https://arxiv.org/abs/2410.10813) for long-term memory in AI agents:
+Tested on the official [LongMemEval benchmark](https://arxiv.org/abs/2410.10813) — [verified submission](https://github.com/xiaowu0162/LongMemEval/issues/46).
 | Method | R@1 | R@5 | R@10 | NDCG@10 |
 |--------|:---:|:---:|:----:|:-------:|
 | OMEGA (prev SOTA) | — | 89.2% | 94.1% | 87.5% |
 | Mastra OM | — | 91.0% | 95.2% | 89.1% |
 | **QMG v1.1 (published #1)** | — | **95.8%** | **98.85%** | **93.2%** |
-| **QMG v1.2 (official, this repo)** 🏆 | **90.6%** | **98.6%** | **99.4%** | **0.9426** |
+| **QMG v1.2 — chunked retrieval pipeline** 🏆 | **90.6%** | **98.6%** | **99.4%** | **94.26%** |
 **Benchmark run:** 500 questions, chunked gte-large embeddings (500-char blocks, 100-char overlap, mean-of-top-3 session scoring). Verified on DGX Spark GB10 (CUDA, ~53 min).
@@ -34,7 +21,6 @@ Tested on the official [LongMemEval benchmark](https://arxiv.org/abs/2410.10813)
 **See:** `benchmarks/run_longmemeval_chunked_staged.py` for the exact benchmark code, `benchmarks/longmemeval_chunked_staged_results.json` for full per-question results.
 ## Install
 ```bash
@@ -160,10 +146,7 @@ result = recall(
 )
 ```
-### Run MemCombine Benchmark
 ```python
-from benchmarks.memcombine import run_benchmark
 def my_recall(memories, query, K):
     # Your recall implementation
@@ -196,8 +179,6 @@ Validated on `ibm_fez` and `ibm_kingston` backends.
 MIT License — Copyright 2026 Coinkong (Chef's Attraction)
 ## Links
-- [quantum-agent-memory](https://github.com/Dustin-a11y/quantum-agent-memory) — The QAOA optimization engine
-- [MemCombine Benchmark](benchmarks/memcombine.py) — Test memory combination quality
+- [GitHub](https://github.com/Dustin-a11y/quantum-memory-graph) — Source code and benchmarks
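
The README's benchmark note describes chunked scoring (500-character blocks, 100-character overlap, mean-of-top-3 session scoring), but the script it points to, `benchmarks/run_longmemeval_chunked_staged.py`, is not shown in this diff. The following is a minimal sketch of that scheme, not the package's implementation. It assumes chunk starts are spaced `size - overlap` characters apart and that embeddings are unit-normalized, so a dot product equals cosine similarity. `chunk_text` and `session_score` are hypothetical names.

```python
def chunk_text(text, size=500, overlap=100):
    """Split text into fixed-size character blocks that overlap by `overlap`."""
    step = size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this block already reaches the end of the text
        start += step
    return chunks


def session_score(query_emb, chunk_embs, top_n=3):
    """Mean of the top_n dot products between the query and a session's chunks.

    Assumes all vectors are unit-normalized (assumption, not from the diff).
    """
    sims = sorted(
        (sum(q * c for q, c in zip(query_emb, emb)) for emb in chunk_embs),
        reverse=True,
    )
    top = sims[:top_n]
    if not top:
        raise ValueError("session has no chunks")
    return sum(top) / len(top)
```

Averaging only the best few chunks lets a long session rank highly when a small part of it matches the question, instead of diluting that match across the whole session text.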

quantum_memory_graph-1.2.2/benchmarks/run_longmemeval_cvar_v2.py ADDED

@@ -0,0 +1,272 @@
+#!/usr/bin/env python3
+"""
+LongMemEval 500-question Benchmark — QMG CVaR subgraph optimizer.
+Routes each question through the QMG subgraph optimizer on Spark.
+Measures recall@K against gold answer sessions.
+Usage:
+    python3 -u run_longmemeval_cvar.py --limit 5    # Quick test
+    python3 -u run_longmemeval_cvar.py --force       # Full 500
+    python3 -u run_longmemeval_cvar.py --fast        # Skip QMG, cosine only
+Output: JSON results + CSV saved to benchmarks/ directory.
+"""
+import json, time, math, sys, os, argparse, csv
+from datetime import datetime, timezone
+import numpy as np
+DATA_PATH = "/home/dt/projects-shared/LongMemEval/data/longmemeval_s_cleaned.json"
+RESULTS_DIR = "/home/dt/qmg-v1/benchmarks"
+RESULTS_FILE = os.path.join(RESULTS_DIR, "longmemeval_cvar_results.json")
+CSV_FILE = os.path.join(RESULTS_DIR, "longmemeval_cvar_results.csv")
+T_START = time.time()
+def flatten_session(session):
+    if isinstance(session, str): return session
+    if isinstance(session, list):
+        parts = []
+        for turn in session:
+            if isinstance(turn, dict):
+                parts.append(f"{turn.get('role','')}: {turn.get('content', turn.get('text', str(turn)))}")
+            else:
+                parts.append(str(turn))
+        return "\n".join(parts)
+    return str(session)
+def load_data(path, limit=None):
+    with open(path) as f: data = json.load(f)
+    if not isinstance(data, list):
+        for k in ["data","questions","items","results"]:
+            if k in data: data = data[k]; break
+    if limit: data = data[:limit]
+    return data
+def recall_at_k(ranked, gold, K):
+    gold_set = set(gold)
+    if not gold_set: return 1.0
+    return 1.0 if set(ranked[:K]) & gold_set else 0.0
+def ndcg_at_k(ranked, gold, K):
+    gold_set = set(gold)
+    if not gold_set: return 1.0
+    dcg = sum(1.0/math.log2(i+2) for i,idx in enumerate(ranked[:K]) if idx in gold_set)
+    idcg = sum(1.0/math.log2(i+2) for i in range(min(len(gold_set), K)))
+    return dcg/idcg if idcg>0 else 0.0
+def main():
+    parser = argparse.ArgumentParser()
+    parser.add_argument("--limit", type=int, default=None)
+    parser.add_argument("--fast", action="store_true", help="Skip QMG, cosine only")
+    parser.add_argument("--force", action="store_true", help="Run full 500")
+    args = parser.parse_args()
+    data = load_data(DATA_PATH)
+    print(f"Loaded {len(data)} questions", flush=True)
+    limit = args.limit
+    if args.force: limit = None
+    if limit: data = data[:limit]
+    from sentence_transformers import SentenceTransformer
+    import torch
+    device = "cuda" if torch.cuda.is_available() else "cpu"
+    print(f"Loading gte-large on {device}...", flush=True)
+    model = SentenceTransformer("thenlper/gte-large", device=device)
+    dim = model.get_sentence_embedding_dimension()
+    print(f"Model loaded, dim={dim}", flush=True)
+    results = []
+    n_questions = len(data)
+    for idx, item in enumerate(data):
+        question = item.get("question", item.get("query", ""))
+        haystack = item.get("haystack_sessions", item.get("sessions", item.get("corpus", [])))
+        haystack_ids = item.get("haystack_session_ids", item.get("session_ids", []))
+        answer_ids = item.get("answer_session_ids", item.get("answer_ids", []))
+        gold_indices = []
+        for g in answer_ids:
+            try: gold_indices.append(haystack_ids.index(g))
+            except ValueError: pass
+        if not gold_indices or len(haystack) < 3:
+            results.append({"idx": idx, "skip": True, "reason": "no_gold_or_too_few"})
+            continue
+        texts = [flatten_session(s) for s in haystack]
+        # Encode
+        t0 = time.time()
+        all_texts = [question] + texts
+        embs = model.encode(all_texts, normalize_embeddings=True, batch_size=32, show_progress_bar=False)
+        q_emb = embs[0]
+        sess_embs = embs[1:]
+        encode_time = time.time() - t0
+        n_sessions = len(sess_embs)
+        K_target = min(5, n_sessions)
+        # Cosine baseline
+        t0 = time.time()
+        cos_scores = q_emb @ sess_embs.T
+        cos_ranked = np.argsort(cos_scores)[::-1].tolist()
+        cos_time = time.time() - t0
+        r = {
+            "idx": idx,
+            "question": question[:120],
+            "n_sessions": n_sessions,
+            "n_gold": len(gold_indices),
+            "cosine": {
+                "r1": float(recall_at_k(cos_ranked, gold_indices, 1)),
+                "r5": float(recall_at_k(cos_ranked, gold_indices, 5)),
+                "r10": float(recall_at_k(cos_ranked, gold_indices, 10)),
+                "ndcg": float(ndcg_at_k(cos_ranked, gold_indices, 10)),
+                "time": cos_time,
+            }
+        }
+        # QMG CVaR optimizer — two configs
+        if not args.fast:
+            t0 = time.time()
+            try:
+                sys.path.insert(0, "/home/dt/qmg-v1")
+                from quantum_memory_graph.subgraph_optimizer import optimize_subgraph
+                # Build adjacency from session embeddings (cosine similarity matrix)
+                adj = sess_embs @ sess_embs.T
+                np.fill_diagonal(adj, 0.0)
+                for cfg_name, cfg in [
+                    ("default", {"alpha": 0.4, "beta_conn": 0.35, "gamma_cov": 0.25, "shots": 4096}),
+                    ("retrieval", {"alpha": 1.0, "beta_conn": 0.0, "gamma_cov": 0.0, "shots": 4096}),
+                ]:
+                    # Cap candidates at 14 for QAOA to avoid memory OOM
+                    # (2^14 = 16K complex numbers, 2^40 = 17TB)
+                    top_indices = np.argsort(cos_scores)[::-1][:14]
+                    top_scores = cos_scores[top_indices]
+                    top_adj = adj[np.ix_(top_indices, top_indices)]
+                    result = optimize_subgraph(
+                        relevance_scores=top_scores,
+                        adjacency=top_adj,
+                        K=K_target,
+                        alpha=cfg["alpha"],
+                        beta_conn=cfg["beta_conn"],
+                        gamma_cov=cfg["gamma_cov"],
+                        grid_size=6,
+                        shots=cfg["shots"],
+                        p_layers=2,
+                    )
+                    selection_raw = result.get("selection", [])
+                    method = result.get("method", "unknown")
+                    # Map capped indices back to original indices
+                    selection = [top_indices[s] for s in selection_raw]
+                    sel_set = set(selection)
+                    ranked = list(selection)
+                    for i in range(n_sessions):
+                        if len(ranked) >= n_sessions: break
+                        if i not in sel_set: ranked.append(i)
+                    r[cfg_name] = {
+                        "r1": float(recall_at_k(ranked, gold_indices, 1)),
+                        "r5": float(recall_at_k(ranked, gold_indices, 5)),
+                        "r10": float(recall_at_k(ranked, gold_indices, 10)),
+                        "ndcg": float(ndcg_at_k(ranked, gold_indices, 10)),
+                        "method": method,
+                        "n_capped": len(top_indices),
+                        "score": float(result.get("score", 0)),
+                        "optimal_score": float(result.get("optimal", {}).get("score", 0)),
+                        "time": time.time() - t0,
+                    }
+            except Exception as e:
+                import traceback
+                r["qmg_error"] = f"{type(e).__name__}: {e}"
+                r["qmg_traceback"] = traceback.format_exc()
+            r["total_qmg_time"] = time.time() - t0
+        results.append(r)
+        # Progress every 5 questions
+        if (idx+1) % 5 == 0:
+            elapsed = time.time() - T_START
+            effective = [rr for rr in results if not rr.get("skip")]
+            cos_done = [rr for rr in effective if "cosine" in rr]
+            if cos_done:
+                cos_r5_avg = np.mean([rr["cosine"]["r5"] for rr in cos_done]) * 100
+                print(f"[{idx+1}/{n_questions}] {elapsed:.0f}s cos_r5={cos_r5_avg:.1f}%", flush=True)
+    # Summary
+    effective = [r for r in results if not r.get("skip")]
+    cos_items = [r for r in effective if "cosine" in r]
+    print("\n" + "="*60, flush=True)
+    print(f"LONGMEMEVAL — {datetime.now(timezone.utc).isoformat()}", flush=True)
+    print(f"Questions: {len(effective)} effective ({len(results)-len(effective)} skipped)", flush=True)
+    if cos_items:
+        cos_r1 = np.mean([r["cosine"]["r1"] for r in cos_items])*100
+        cos_r5 = np.mean([r["cosine"]["r5"] for r in cos_items])*100
+        cos_r10 = np.mean([r["cosine"]["r10"] for r in cos_items])*100
+        cos_ndcg = np.mean([r["cosine"]["ndcg"] for r in cos_items])
+        print(f"\nCOSINE BASELINE:", flush=True)
+        print(f"  R@1:  {cos_r1:.1f}%", flush=True)
+        print(f"  R@5:  {cos_r5:.1f}%", flush=True)
+        print(f"  R@10: {cos_r10:.1f}%", flush=True)
+        print(f"  NDCG: {cos_ndcg:.4f}", flush=True)
+    for cfg_name in ["default", "retrieval"]:
+        items = [r for r in effective if cfg_name in r]
+        if items:
+            r1 = np.mean([r[cfg_name]["r1"] for r in items])*100
+            r5 = np.mean([r[cfg_name]["r5"] for r in items])*100
+            r10 = np.mean([r[cfg_name]["r10"] for r in items])*100
+            ndcg = np.mean([r[cfg_name]["ndcg"] for r in items])
+            methods = {}
+            for r in items:
+                m = r[cfg_name].get("method", "?")
+                methods.setdefault(m, []).append(r[cfg_name]["r5"])
+            avg_time = np.mean([r[cfg_name]["time"] for r in items])
+            print(f"\nQMG {cfg_name.upper()}:", flush=True)
+            print(f"  R@1:   {r1:.1f}%", flush=True)
+            print(f"  R@5:   {r5:.1f}%", flush=True)
+            print(f"  R@10:  {r10:.1f}%", flush=True)
+            print(f"  NDCG:  {ndcg:.4f}", flush=True)
+            print(f"  Avg time: {avg_time:.1f}s", flush=True)
+            for m, vals in sorted(methods.items()):
+                print(f"  {m}: {len(vals)}x R@5={np.mean(vals)*100:.1f}%", flush=True)
+    total_t = time.time() - T_START
+    print(f"\nTotal: {total_t:.0f}s ({total_t/60:.1f} min)", flush=True)
+    print("="*60, flush=True)
+    with open(RESULTS_FILE, "w") as f: json.dump({"timestamp": datetime.now(timezone.utc).isoformat(), "n_total": len(data), "results": results}, f, indent=2, default=str)
+    print(f"\nSaved to {RESULTS_FILE}", flush=True)
+    with open(CSV_FILE, "w", newline="") as f:
+        w = csv.writer(f)
+        w.writerow(["idx","n","ngold","cr1","cr5","cr10","cndcg",
+                     "dr1","dr5","dr10","dndcg","dmethod",
+                     "rr1","rr5","rr10","rndcg","rmethod"])
+        for r in results:
+            if r.get("skip"): continue
+            w.writerow([
+                r["idx"], r["n_sessions"], r["n_gold"],
+                r["cosine"]["r1"], r["cosine"]["r5"], r["cosine"]["r10"], r["cosine"]["ndcg"],
+                r.get("default", {}).get("r1"), r.get("default", {}).get("r5"),
+                r.get("default", {}).get("r10"), r.get("default", {}).get("ndcg"),
+                r.get("default", {}).get("method"),
+                r.get("retrieval", {}).get("r1"), r.get("retrieval", {}).get("r5"),
+                r.get("retrieval", {}).get("r10"), r.get("retrieval", {}).get("ndcg"),
+                r.get("retrieval", {}).get("method"),
+            ])
+    print(f"CSV saved to {CSV_FILE}", flush=True)
+if __name__ == "__main__":
+    main()
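
The two metrics in this file are worth reading closely. `recall_at_k` is a hit rate: it returns 1.0 if any gold session appears in the top K and gives no partial credit when a question has several gold sessions. `ndcg_at_k` uses binary gains with a log2 discount. The functions below restate the logic from the diff so the example runs on its own; the ranking and gold indices are made-up illustration values.

```python
import math


def recall_at_k(ranked, gold, K):
    """1.0 if at least one gold index is in the top K, else 0.0 (hit rate)."""
    gold_set = set(gold)
    if not gold_set:
        return 1.0
    return 1.0 if set(ranked[:K]) & gold_set else 0.0


def ndcg_at_k(ranked, gold, K):
    """Binary-gain NDCG: DCG of the top K divided by the ideal DCG."""
    gold_set = set(gold)
    if not gold_set:
        return 1.0
    dcg = sum(1.0 / math.log2(i + 2)
              for i, idx in enumerate(ranked[:K]) if idx in gold_set)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(min(len(gold_set), K)))
    return dcg / idcg if idcg > 0 else 0.0


# Illustration values (not from the benchmark): gold sessions at ranks 2 and 4.
ranked = [7, 2, 9, 4, 1]
gold = [2, 4]

print(recall_at_k(ranked, gold, 1))  # 0.0: the top result is not a gold session
print(recall_at_k(ranked, gold, 5))  # 1.0: one hit is enough
print(round(ndcg_at_k(ranked, gold, 5), 3))  # about 0.651
```

Because the hit-rate definition saturates as soon as one gold session is found, NDCG is the more informative number for questions with multiple gold sessions.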

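The comment about capping QAOA candidates at 14 follows from state-vector size: a dense simulation of n qubits stores 2^n complex amplitudes. The sketch below works through that arithmetic and the index bookkeeping around the cap. It assumes 16-byte (complex128) amplitudes; the diff does not say which simulator or dtype `optimize_subgraph` uses. `statevector_bytes`, `map_back`, and `build_ranking` are hypothetical helper names, not part of the package.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory for a dense state vector, assuming complex128 amplitudes."""
    return (2 ** n_qubits) * bytes_per_amplitude


def map_back(selection_raw, top_indices):
    """Translate positions within the capped candidate list to original indices."""
    return [int(top_indices[s]) for s in selection_raw]


def build_ranking(selection, n_sessions, fallback_order=None):
    """Selected sessions first, then every other session.

    With fallback_order=None the remainder is appended in index order, which
    is what the script in the diff does. Passing a cosine-ranked order instead
    is an alternative, shown here only as an assumption.
    """
    order = range(n_sessions) if fallback_order is None else fallback_order
    chosen = set(selection)
    return list(selection) + [i for i in order if i not in chosen]


print(statevector_bytes(14))          # 262144 bytes (256 KiB)
print(statevector_bytes(40) / 1e12)   # roughly 17.6 TB
print(map_back([0, 2], [9, 4, 7]))    # [9, 7]
```

Since the script's summary reports R@10 and NDCG@10 while the optimizer selects at most 5 sessions, ranks 6 through 10 come entirely from the fallback order, so that choice affects the reported numbers.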