wavemind (PyPI): versions diff, 2.2.5 tar.gz → 2.2.7 tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (162)

{wavemind-2.2.5 → wavemind-2.2.7}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: wavemind
-Version: 2.2.5
+Version: 2.2.7
 Summary: Local-first dynamic memory field with vector search and wave-field re-ranking
 License-Expression: MIT
 Project-URL: Homepage, https://github.com/CaspianG/wavemind
@@ -542,14 +542,17 @@ Checked-in result:
 | profile | result |
 |---|---:|
 | Cluster planner | 4096 namespaces, 4 nodes, replication factor 2, node-loss availability `1.000`, zone-loss availability `1.000`, write quorum `2`. |
-| Hot cache | 2000 lookups, hit rate `0.920`, p99 lookup `0.01 ms`. |
-| Replicated runtime | 3 physical WaveMind stores, replication factor 3, write quorum 2, node-loss recall `true`, repair copied `1` missing record, tombstone repair deleted `1` stale record, p99 query-after-loss `1.44 ms`. |
-| Structured payloads | image/audio/table/event retrieval, precision@1 `1.000`, p99 `0.75 ms`. |
+| Hot cache | 2000 lookups, hit rate `0.920`, p99 lookup `0.003 ms`. |
+| Replicated runtime | 3 physical WaveMind stores, replication factor 3, write quorum 2, node-loss recall `true`, repair copied `1` missing record, tombstone repair deleted `1` stale record, p99 query-after-loss `1.29 ms`. |
+| Active-active delta sync | 2 regions, bidirectional convergence `true`, stale import suppressed after delete `true`, tombstone convergence `true`, sync `112.50 ms`. |
+| Replicated snapshot | 3 replica files, manifest checksum validation `true`, restore `11.42 ms`, recall after restored-primary loss `true`. |
+| Structured payloads | image/audio/table/event retrieval, precision@1 `1.000`, p99 `0.67 ms`. |
 This profile validates routing, quorum-replicated runtime behavior, cache
-behavior, and structured payload handling. It is not a 10M-vector load test.
-Real 100k, 1M, and 10M latency claims should come from service-backed
-FAISS/Qdrant/pgvector load tests on production-like hardware.
+behavior, active-active namespace delta sync, replicated snapshot/restore, and
+structured payload handling. It is not a 10M-vector load test. Real 100k, 1M,
+and 10M latency claims should come from service-backed FAISS/Qdrant/pgvector
+load tests on production-like hardware.
 Cluster placement planning:
@@ -668,6 +671,34 @@ For Postgres storage, use database-native backup tooling such as `pg_dump`,
 managed snapshots, or point-in-time recovery instead of WaveMind's SQLite file
 backup command.
+Replicated runtime snapshot/restore:
+```python
+from wavemind import HashingTextEncoder, ReplicatedWaveMind
+memory = ReplicatedWaveMind(
+    root_path="./state/replicas",
+    nodes=["node-a", "node-b", "node-c"],
+    replication_factor=3,
+    encoder=HashingTextEncoder(vector_dim=64),
+)
+memory.remember("Tenant A prefers short support replies.", namespace="tenant:a")
+snapshot = memory.snapshot("./backups/replicated")
+assert ReplicatedWaveMind.verify_snapshot(snapshot.snapshot_path)["healthy"]
+restored, report = ReplicatedWaveMind.restore_snapshot(
+    snapshot.snapshot_path,
+    "./state/restored-replicas",
+    encoder=HashingTextEncoder(vector_dim=64),
+)
+```
+The replicated snapshot writes one SQLite backup per replica plus
+`manifest.json` with SHA-256 checksums, replica metadata, quorum settings, and
+node definitions. Restore refuses to overwrite a non-empty root unless
+`overwrite=True` is passed.
 ## HTTP API
 Run the local FastAPI server:
@@ -1041,7 +1072,7 @@ Current read:
 | LongMemEval 50-query smoke | On the first 50 non-abstention LongMemEval-S questions, WaveMind reaches `evidence_recall@5 0.920`, `precision@1 0.760`, and `MRR@5 0.827`; Chroma/Qdrant static reach `0.600`, `0.260`, and `0.385`. | This is the fast regression profile for checking current changes before rerunning the full LongMemEval profile. WaveMind wins on quality; latency still needs work. |
 | ANN/index curve | At 50000 generated 128-d vectors, NumPy exact keeps `recall@10 1.000` at `6.49 ms`; quantized int8 keeps `0.934` at `24.92 ms`; Annoy is faster at `4.92 ms` but drops to `0.730` recall; Qdrant local keeps `1.000` recall at `43.49 ms`. | Current local scale boundary is clear: quantized search needs kernel work, Annoy needs tuning/FAISS, and Qdrant should be tested in service mode for a fair production comparison. |
 | Production load | At 100000 generated 128-d vectors, service-mode Qdrant reaches `recall@10 1.000`, avg `10.28 ms`, p99 `21.26 ms`. At 1M, tuned Qdrant reaches `recall@10 0.984`, avg `116.80 ms`, p99 `209.28 ms`; an EF sweep finds `recall@10 0.977`, avg `64.76 ms`, p99 `103.77 ms` at `hnsw_ef=2048` on 30 queries. | 100k is production-grade on the tested machine. 1M recall is now strong, but p99 still needs tuning before claiming a stable sub-100 ms SLO. |
-| Scale readiness | Deterministic 1M-memory simulation validates 4096 namespace placements over 4 nodes with replication factor 2, node-loss availability `1.000`, zone-loss availability `1.000`, hot-cache hit rate `0.920`, quorum-replicated runtime recall after node loss, missing-record repair, tombstone repair, and structured payload precision@1 `1.000`. | This proves routing, cache, payload, and replicated-runtime foundations. It is not a 10M-vector latency claim; real 10M latency still needs service-backed load tests on larger hardware. |
+| Scale readiness | Deterministic 1M-memory simulation validates 4096 namespace placements over 4 nodes with replication factor 2, node-loss availability `1.000`, zone-loss availability `1.000`, hot-cache hit rate `0.920`, quorum-replicated runtime recall after node loss, missing-record repair, tombstone repair, active-active delta sync, checksummed replicated snapshot/restore, and structured payload precision@1 `1.000`. | This proves routing, cache, payload, replicated-runtime, namespace-delta, and restore-drill foundations. It is not a 10M-vector latency claim; real 10M latency still needs service-backed load tests on larger hardware. |
 | Memory competitor adapters | WaveMind reaches `precision@1 0.80`, `precision@3 1.00`, stale suppression `1.00` on the small adapter profile. Mem0, Zep, and LangGraph are listed as skipped unless their real packages/services are configured. | This prevents fake competitor claims. The adapter harness is ready; real Mem0/Zep/LangGraph results still need configured installs. |
 | LongMemEval local answer generation | With the same local Ollama `qwen2.5:1.5b`, WaveMind reaches `exact_match 0.240`, `contains_answer 0.380`, `token_f1 0.333`, and `evidence_recall@5 0.920`; Chroma and Qdrant static both reach `0.120`, `0.160`, `0.170`, and `0.600`. | This is the first checked-in end-to-end answer benchmark against Chroma/Qdrant. It is still a 50-question lightweight smoke run, not a full LongMemEval leaderboard score. |
@@ -1060,7 +1091,7 @@ Current read:
 | Production index profile | Docker-backed 50000-vector profile for persisted FAISS, Qdrant service, and PostgreSQL/pgvector HNSW. | implemented | FAISS / Qdrant service / pgvector | Keep service-mode candidate generation above `0.95` recall@10 and below 10 ms average query latency at 50000 vectors. |
 | Production load profile | 100k and 1M service-backed candidate-index checks with p95/p99 latency. | implemented | Qdrant service / pgvector HNSW / FAISS persisted | Keep 100k at recall@10 `1.000`; push 1M p99 below 100 ms with recall@10 >= 0.95. |
 | Qdrant 1M HNSW ef sweep | One 1M Qdrant collection queried with multiple `hnsw_ef` values. | implemented | Qdrant service | Repeat with 100+ queries and collection-level HNSW build parameters before claiming a stable 1M SLO. |
-| Scale readiness profile | Cluster placement, node/zone-loss simulation, quorum report, replicated runtime, hot-cache behavior, and structured/multimodal payload retrieval. | implemented | Mem0 / Zep / LangGraph persistent memory / GraphRAG target adapters | Keep quorum replication and repair green while adding larger service-backed 10M load tests. |
+| Scale readiness profile | Cluster placement, node/zone-loss simulation, quorum report, replicated runtime, active-active delta sync, replicated snapshot/restore, hot-cache behavior, and structured/multimodal payload retrieval. | implemented | Mem0 / Zep / LangGraph persistent memory / GraphRAG target adapters | Keep quorum replication, namespace-delta sync, repair, and restore drills green while adding larger service-backed 10M load tests. |
 | Memory competitor adapter profile | Dynamic-memory scenario wired for external memory frameworks. | implemented | Mem0 / Zep / LangGraph persistent memory | Report real competitor results only when their packages/services are explicitly configured. |
 | [BEIR](https://github.com/beir-cellar/beir) | Standard zero-shot information retrieval quality. | planned | Chroma / Qdrant / FAISS | Stay within 0.02 `nDCG@10` on identical embeddings. |
 | [MTEB Retrieval](https://github.com/embeddings-benchmark/mteb) | Separates encoder quality from retrieval-store quality. | planned | Chroma / Qdrant / FAISS | Prove WaveMind does not reduce same-embedding retrieval quality. |
@@ -1220,6 +1251,36 @@ production foundation for namespace-level HA and eventual-consistency behavior;
 for full consensus across independent network services, deploy WaveMind with
 Postgres/Qdrant/ops-layer replication.
+For multi-region active-active experiments, export and import namespace deltas:
+```python
+region_a.remember("Tenant A billing preference.", namespace="tenant:a")
+delta = region_a.export_namespace_delta("tenant:a")
+region_b.import_namespace_delta(delta)
+region_a.forget(text="Tenant A billing preference.", namespace="tenant:a")
+region_b.import_namespace_delta(region_a.export_namespace_delta("tenant:a"))
+```
+The delta contains active records plus tombstones. Import is idempotent and
+tombstone-aware, so a stale region export cannot resurrect a deleted memory.
+For operational recovery, `snapshot()` creates a checksummed replicated snapshot
+and `restore_snapshot()` restores it into a fresh replica root:
+```python
+snapshot = memory.snapshot("./backups/replicated")
+health = ReplicatedWaveMind.verify_snapshot(snapshot.snapshot_path)
+restored, report = ReplicatedWaveMind.restore_snapshot(
+    snapshot.snapshot_path,
+    "./state/restored-replicas",
+)
+```
+The checked-in scale-readiness profile verifies manifest checksums, restores
+three replica files, then disables the restored primary and confirms the memory
+is still recalled from the remaining replicas.
 Checked-in official LoCoMo retrieval result:
 10 conversations, 5882 memory turns, 1977 evidence-labeled questions,
@@ -1568,8 +1629,9 @@ If you already use Chroma for local memory, see the practical migration guide:
 - The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
 - `MemoryFieldGraph` is a discrete graph over stored memories, not a continuous mathematical field. Its current build path should be optimized with incremental edge updates before large production use.
 - pgvector is a candidate-index backend. PostgreSQL source-of-truth storage is
-  also available separately, but migrations, PITR docs, and service benchmark
-  profiles still need more real deployment coverage.
+  also available separately, but migrations, PITR docs, scheduled backup
+  runbooks, and service benchmark profiles still need more real deployment
+  coverage.
 - The Qdrant backend is also a candidate-index backend. WaveMind rebuilds it
   from SQLite on load/build, so large service-mode deployments still need a
   measured rebuild strategy and index-health monitoring.
@@ -1608,8 +1670,8 @@ Near-term priorities:
 - Faster dynamic re-ranking through smaller candidate windows, caching, and
   background updates.
 - Better production operations: OpenTelemetry is optional and implemented;
-  richer latency histograms, index-health metrics, alerting examples, and
-  restore drills are next.
+  richer latency histograms, index-health metrics, alerting examples, scheduled
+  offsite snapshots, and Postgres PITR runbooks are next.
 Longer-term direction:
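
Note on the delta-sync hunk above: the added text says import is idempotent and tombstone-aware, so a stale export cannot resurrect a deleted memory. The diff does not show how WaveMind implements this. The following is only a minimal sketch of one common approach (per-record version counters, higher version wins, tombstone wins a tie). `Record`, `merge_delta`, and `active_texts` are hypothetical names for illustration, not part of the wavemind API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str
    text: str
    version: int           # hypothetical monotonically increasing write counter
    deleted: bool = False  # True marks a tombstone


def merge_delta(store: dict, delta: list) -> int:
    """Apply a delta to a region's store in place; return how many records changed."""
    changed = 0
    for incoming in delta:
        current = store.get(incoming.key)
        # Higher version wins; on equal versions a tombstone beats a live record.
        if current is None or (incoming.version, incoming.deleted) > (
            current.version,
            current.deleted,
        ):
            store[incoming.key] = incoming
            changed += 1
    return changed


def active_texts(store: dict) -> list:
    """Texts of records that are not tombstoned."""
    return sorted(r.text for r in store.values() if not r.deleted)


# Region A writes a memory and exports it, then deletes it.
live = Record("m1", "Tenant A billing preference.", version=1)
tombstone = Record("m1", "", version=2, deleted=True)
stale_export = [live]

region_b = {}
first_import = merge_delta(region_b, stale_export)    # record arrives
delete_import = merge_delta(region_b, [tombstone])    # delete propagates
replayed_stale = merge_delta(region_b, stale_export)  # stale export is ignored
replayed_delete = merge_delta(region_b, [tombstone])  # re-import changes nothing
```

Under these assumptions, replaying either delta after the delete leaves region B unchanged, which is the property the README sentence describes.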

{wavemind-2.2.5 → wavemind-2.2.7}/README.md RENAMED

@@ -489,14 +489,17 @@ Checked-in result:
 | profile | result |
 |---|---:|
 | Cluster planner | 4096 namespaces, 4 nodes, replication factor 2, node-loss availability `1.000`, zone-loss availability `1.000`, write quorum `2`. |
-| Hot cache | 2000 lookups, hit rate `0.920`, p99 lookup `0.01 ms`. |
-| Replicated runtime | 3 physical WaveMind stores, replication factor 3, write quorum 2, node-loss recall `true`, repair copied `1` missing record, tombstone repair deleted `1` stale record, p99 query-after-loss `1.44 ms`. |
-| Structured payloads | image/audio/table/event retrieval, precision@1 `1.000`, p99 `0.75 ms`. |
+| Hot cache | 2000 lookups, hit rate `0.920`, p99 lookup `0.003 ms`. |
+| Replicated runtime | 3 physical WaveMind stores, replication factor 3, write quorum 2, node-loss recall `true`, repair copied `1` missing record, tombstone repair deleted `1` stale record, p99 query-after-loss `1.29 ms`. |
+| Active-active delta sync | 2 regions, bidirectional convergence `true`, stale import suppressed after delete `true`, tombstone convergence `true`, sync `112.50 ms`. |
+| Replicated snapshot | 3 replica files, manifest checksum validation `true`, restore `11.42 ms`, recall after restored-primary loss `true`. |
+| Structured payloads | image/audio/table/event retrieval, precision@1 `1.000`, p99 `0.67 ms`. |
 This profile validates routing, quorum-replicated runtime behavior, cache
-behavior, and structured payload handling. It is not a 10M-vector load test.
-Real 100k, 1M, and 10M latency claims should come from service-backed
-FAISS/Qdrant/pgvector load tests on production-like hardware.
+behavior, active-active namespace delta sync, replicated snapshot/restore, and
+structured payload handling. It is not a 10M-vector load test. Real 100k, 1M,
+and 10M latency claims should come from service-backed FAISS/Qdrant/pgvector
+load tests on production-like hardware.
 Cluster placement planning:
@@ -615,6 +618,34 @@ For Postgres storage, use database-native backup tooling such as `pg_dump`,
 managed snapshots, or point-in-time recovery instead of WaveMind's SQLite file
 backup command.
+Replicated runtime snapshot/restore:
+```python
+from wavemind import HashingTextEncoder, ReplicatedWaveMind
+memory = ReplicatedWaveMind(
+    root_path="./state/replicas",
+    nodes=["node-a", "node-b", "node-c"],
+    replication_factor=3,
+    encoder=HashingTextEncoder(vector_dim=64),
+)
+memory.remember("Tenant A prefers short support replies.", namespace="tenant:a")
+snapshot = memory.snapshot("./backups/replicated")
+assert ReplicatedWaveMind.verify_snapshot(snapshot.snapshot_path)["healthy"]
+restored, report = ReplicatedWaveMind.restore_snapshot(
+    snapshot.snapshot_path,
+    "./state/restored-replicas",
+    encoder=HashingTextEncoder(vector_dim=64),
+)
+```
+The replicated snapshot writes one SQLite backup per replica plus
+`manifest.json` with SHA-256 checksums, replica metadata, quorum settings, and
+node definitions. Restore refuses to overwrite a non-empty root unless
+`overwrite=True` is passed.
 ## HTTP API
 Run the local FastAPI server:
@@ -988,7 +1019,7 @@ Current read:
 | LongMemEval 50-query smoke | On the first 50 non-abstention LongMemEval-S questions, WaveMind reaches `evidence_recall@5 0.920`, `precision@1 0.760`, and `MRR@5 0.827`; Chroma/Qdrant static reach `0.600`, `0.260`, and `0.385`. | This is the fast regression profile for checking current changes before rerunning the full LongMemEval profile. WaveMind wins on quality; latency still needs work. |
 | ANN/index curve | At 50000 generated 128-d vectors, NumPy exact keeps `recall@10 1.000` at `6.49 ms`; quantized int8 keeps `0.934` at `24.92 ms`; Annoy is faster at `4.92 ms` but drops to `0.730` recall; Qdrant local keeps `1.000` recall at `43.49 ms`. | Current local scale boundary is clear: quantized search needs kernel work, Annoy needs tuning/FAISS, and Qdrant should be tested in service mode for a fair production comparison. |
 | Production load | At 100000 generated 128-d vectors, service-mode Qdrant reaches `recall@10 1.000`, avg `10.28 ms`, p99 `21.26 ms`. At 1M, tuned Qdrant reaches `recall@10 0.984`, avg `116.80 ms`, p99 `209.28 ms`; an EF sweep finds `recall@10 0.977`, avg `64.76 ms`, p99 `103.77 ms` at `hnsw_ef=2048` on 30 queries. | 100k is production-grade on the tested machine. 1M recall is now strong, but p99 still needs tuning before claiming a stable sub-100 ms SLO. |
-| Scale readiness | Deterministic 1M-memory simulation validates 4096 namespace placements over 4 nodes with replication factor 2, node-loss availability `1.000`, zone-loss availability `1.000`, hot-cache hit rate `0.920`, quorum-replicated runtime recall after node loss, missing-record repair, tombstone repair, and structured payload precision@1 `1.000`. | This proves routing, cache, payload, and replicated-runtime foundations. It is not a 10M-vector latency claim; real 10M latency still needs service-backed load tests on larger hardware. |
+| Scale readiness | Deterministic 1M-memory simulation validates 4096 namespace placements over 4 nodes with replication factor 2, node-loss availability `1.000`, zone-loss availability `1.000`, hot-cache hit rate `0.920`, quorum-replicated runtime recall after node loss, missing-record repair, tombstone repair, active-active delta sync, checksummed replicated snapshot/restore, and structured payload precision@1 `1.000`. | This proves routing, cache, payload, replicated-runtime, namespace-delta, and restore-drill foundations. It is not a 10M-vector latency claim; real 10M latency still needs service-backed load tests on larger hardware. |
 | Memory competitor adapters | WaveMind reaches `precision@1 0.80`, `precision@3 1.00`, stale suppression `1.00` on the small adapter profile. Mem0, Zep, and LangGraph are listed as skipped unless their real packages/services are configured. | This prevents fake competitor claims. The adapter harness is ready; real Mem0/Zep/LangGraph results still need configured installs. |
 | LongMemEval local answer generation | With the same local Ollama `qwen2.5:1.5b`, WaveMind reaches `exact_match 0.240`, `contains_answer 0.380`, `token_f1 0.333`, and `evidence_recall@5 0.920`; Chroma and Qdrant static both reach `0.120`, `0.160`, `0.170`, and `0.600`. | This is the first checked-in end-to-end answer benchmark against Chroma/Qdrant. It is still a 50-question lightweight smoke run, not a full LongMemEval leaderboard score. |
@@ -1007,7 +1038,7 @@ Current read:
 | Production index profile | Docker-backed 50000-vector profile for persisted FAISS, Qdrant service, and PostgreSQL/pgvector HNSW. | implemented | FAISS / Qdrant service / pgvector | Keep service-mode candidate generation above `0.95` recall@10 and below 10 ms average query latency at 50000 vectors. |
 | Production load profile | 100k and 1M service-backed candidate-index checks with p95/p99 latency. | implemented | Qdrant service / pgvector HNSW / FAISS persisted | Keep 100k at recall@10 `1.000`; push 1M p99 below 100 ms with recall@10 >= 0.95. |
 | Qdrant 1M HNSW ef sweep | One 1M Qdrant collection queried with multiple `hnsw_ef` values. | implemented | Qdrant service | Repeat with 100+ queries and collection-level HNSW build parameters before claiming a stable 1M SLO. |
-| Scale readiness profile | Cluster placement, node/zone-loss simulation, quorum report, replicated runtime, hot-cache behavior, and structured/multimodal payload retrieval. | implemented | Mem0 / Zep / LangGraph persistent memory / GraphRAG target adapters | Keep quorum replication and repair green while adding larger service-backed 10M load tests. |
+| Scale readiness profile | Cluster placement, node/zone-loss simulation, quorum report, replicated runtime, active-active delta sync, replicated snapshot/restore, hot-cache behavior, and structured/multimodal payload retrieval. | implemented | Mem0 / Zep / LangGraph persistent memory / GraphRAG target adapters | Keep quorum replication, namespace-delta sync, repair, and restore drills green while adding larger service-backed 10M load tests. |
 | Memory competitor adapter profile | Dynamic-memory scenario wired for external memory frameworks. | implemented | Mem0 / Zep / LangGraph persistent memory | Report real competitor results only when their packages/services are explicitly configured. |
 | [BEIR](https://github.com/beir-cellar/beir) | Standard zero-shot information retrieval quality. | planned | Chroma / Qdrant / FAISS | Stay within 0.02 `nDCG@10` on identical embeddings. |
 | [MTEB Retrieval](https://github.com/embeddings-benchmark/mteb) | Separates encoder quality from retrieval-store quality. | planned | Chroma / Qdrant / FAISS | Prove WaveMind does not reduce same-embedding retrieval quality. |
@@ -1167,6 +1198,36 @@ production foundation for namespace-level HA and eventual-consistency behavior;
 for full consensus across independent network services, deploy WaveMind with
 Postgres/Qdrant/ops-layer replication.
+For multi-region active-active experiments, export and import namespace deltas:
+```python
+region_a.remember("Tenant A billing preference.", namespace="tenant:a")
+delta = region_a.export_namespace_delta("tenant:a")
+region_b.import_namespace_delta(delta)
+region_a.forget(text="Tenant A billing preference.", namespace="tenant:a")
+region_b.import_namespace_delta(region_a.export_namespace_delta("tenant:a"))
+```
+The delta contains active records plus tombstones. Import is idempotent and
+tombstone-aware, so a stale region export cannot resurrect a deleted memory.
+For operational recovery, `snapshot()` creates a checksummed replicated snapshot
+and `restore_snapshot()` restores it into a fresh replica root:
+```python
+snapshot = memory.snapshot("./backups/replicated")
+health = ReplicatedWaveMind.verify_snapshot(snapshot.snapshot_path)
+restored, report = ReplicatedWaveMind.restore_snapshot(
+    snapshot.snapshot_path,
+    "./state/restored-replicas",
+)
+```
+The checked-in scale-readiness profile verifies manifest checksums, restores
+three replica files, then disables the restored primary and confirms the memory
+is still recalled from the remaining replicas.
 Checked-in official LoCoMo retrieval result:
 10 conversations, 5882 memory turns, 1977 evidence-labeled questions,
@@ -1515,8 +1576,9 @@ If you already use Chroma for local memory, see the practical migration guide:
 - The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
 - `MemoryFieldGraph` is a discrete graph over stored memories, not a continuous mathematical field. Its current build path should be optimized with incremental edge updates before large production use.
 - pgvector is a candidate-index backend. PostgreSQL source-of-truth storage is
-  also available separately, but migrations, PITR docs, and service benchmark
-  profiles still need more real deployment coverage.
+  also available separately, but migrations, PITR docs, scheduled backup
+  runbooks, and service benchmark profiles still need more real deployment
+  coverage.
 - The Qdrant backend is also a candidate-index backend. WaveMind rebuilds it
   from SQLite on load/build, so large service-mode deployments still need a
   measured rebuild strategy and index-health monitoring.
@@ -1555,8 +1617,8 @@ Near-term priorities:
 - Faster dynamic re-ranking through smaller candidate windows, caching, and
   background updates.
 - Better production operations: OpenTelemetry is optional and implemented;
-  richer latency histograms, index-health metrics, alerting examples, and
-  restore drills are next.
+  richer latency histograms, index-health metrics, alerting examples, scheduled
+  offsite snapshots, and Postgres PITR runbooks are next.
 Longer-term direction:
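
Note on the snapshot hunk above: the added text describes one SQLite backup per replica plus a `manifest.json` with SHA-256 checksums, and a `verify_snapshot` call that returns a `"healthy"` flag. The manifest schema is not shown in this diff, so the sketch below assumes a hypothetical layout (`{"files": {name: digest}}`) and hypothetical `*.sqlite3` file names purely to illustrate checksum-manifest verification with the standard library. It is not WaveMind's implementation.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 so large backups are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(snapshot_dir: Path) -> Path:
    """Record a checksum for every replica file (assumed layout, see lead-in)."""
    files = {p.name: sha256_of(p) for p in sorted(snapshot_dir.glob("*.sqlite3"))}
    manifest = snapshot_dir / "manifest.json"
    manifest.write_text(json.dumps({"files": files}, indent=2))
    return manifest


def verify_manifest(snapshot_dir: Path) -> dict:
    """Recompute checksums and report files that are missing or altered."""
    expected = json.loads((snapshot_dir / "manifest.json").read_text())["files"]
    mismatched = [
        name
        for name, digest in expected.items()
        if not (snapshot_dir / name).is_file()
        or sha256_of(snapshot_dir / name) != digest
    ]
    return {"healthy": not mismatched, "mismatched": mismatched}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for node in ("node-a", "node-b", "node-c"):
        # Stand-in bytes; a real snapshot would hold SQLite backup files.
        (root / f"{node}.sqlite3").write_bytes(node.encode())
    write_manifest(root)
    healthy_report = verify_manifest(root)
    (root / "node-b.sqlite3").write_bytes(b"corrupted")
    corrupted_report = verify_manifest(root)
```

Verifying before restore, as the README example does with its `assert`, means a damaged replica file is caught before it is copied into a fresh replica root.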

{wavemind-2.2.5 → wavemind-2.2.7}/benchmarks/BENCHMARK_LEADERBOARD.md RENAMED

@@ -21,7 +21,7 @@ This is a compact reader-facing view of checked-in benchmark results. It is not
 | Production load profile 100k | production-scale | Recall@k | WaveMind pgvector: 0.736 / 17.8 ms | Qdrant service: 1 / 10.3 ms | Baseline leads on quality |
 | Production load profile 1M | production-scale | Recall@k | - | Qdrant service: 0.984 / 116.8 ms | No WaveMind result |
 | Qdrant 1M HNSW ef sweep | production-scale | Recall@k | - | hnsw_ef=2048: 0.977 / 64.8 ms | No WaveMind result |
-| Scale readiness profile | production-scale | precision@1 | WaveMind structured payloads: 1 / 0.837 ms | - | WaveMind-only check |
+| Scale readiness profile | production-scale | precision@1 | WaveMind structured payloads: 1 / 0.448 ms | - | WaveMind-only check |
 | Memory competitor adapter profile | agent-memory | precision@1 | WaveMind: 0.8 / 0.554 ms | - | WaveMind-only check |
 | [LongMemEval answer generation](https://github.com/xiaowu0162/LongMemEval) | long-term-agent-memory | token F1 | WaveMind + qwen2.5:1.5b: 0.333 / - | Chroma static + qwen2.5:1.5b: 0.17 / - | WaveMind leads on quality |

{wavemind-2.2.5 → wavemind-2.2.7}/benchmarks/BENCHMARK_REPORT.md RENAMED

@@ -24,7 +24,7 @@ Planned rows are not claimed wins. They are the public proof path WaveMind must
 | Production load profile 100k | production-scale | implemented | Qdrant service: Recall@k 1.00, avg latency 10.3, p95 latency 19.0, p99 latency ms 21.3, build ms 27439.3<br>WaveMind pgvector: Recall@k 0.74, avg latency 17.8, p95 latency 23.5, build ms 455703.7<br>WaveMind faiss-persisted: skipped - Set WAVEMIND_FAISS_PATH to use the persisted FAISS backend | Tune pgvector HNSW build/search parameters and add persisted FAISS from the Linux benchmark container. |
 | Production load profile 1M | production-scale | implemented | Qdrant service: Recall@k 0.98, avg latency 116.8, p95 latency 153.8, p99 latency ms 209.3, build ms 450674.6 | Tune Qdrant indexing/search params further, then add FAISS IVF/HNSW and pgvector 1M profiles on a larger disk. |
 | Qdrant 1M HNSW ef sweep | production-scale | implemented | hnsw_ef=512: Recall@k 0.75, avg latency 47.2, p95 latency 68.5, p99 latency ms 68.5, max latency ms 68.5<br>hnsw_ef=768: Recall@k 0.85, avg latency 44.0, p95 latency 69.1, p99 latency ms 69.8, max latency ms 69.8<br>hnsw_ef=1024: Recall@k 0.88, avg latency 62.9, p95 latency 81.1, p99 latency ms 85.5, max latency ms 85.5<br>hnsw_ef=1536: Recall@k 0.94, avg latency 65.6, p95 latency 111.2, p99 latency ms 119.7, max latency ms 119.7<br>hnsw_ef=2048: Recall@k 0.98, avg latency 64.8, p95 latency 91.2, p99 latency ms 103.8, max latency ms 103.8 | Repeat with 100+ queries and collection-level HNSW build parameters before claiming a stable production SLO. |
-| Scale readiness profile | production-scale | implemented | WaveMind cluster planner: simulated memories 1000000, namespaces 4096, nodes 4, replication factor 2, node loss min availability 1.00, zone loss min availability 1.00, read quorum 1, write quorum 2, placement ms 115.8<br>WaveMind hot cache: queries 2000, capacity 512, hit rate 0.92, evictions 0, p99 lookup ms 0.00<br>WaveMind structured payloads: queries 4, precision@1 1.00, avg latency 0.84, p99 latency ms 1.07 | Move from single-node service profiles to namespace sharding and replicated service runs. |
+| Scale readiness profile | production-scale | implemented | WaveMind cluster planner: simulated memories 1000000, namespaces 4096, nodes 4, replication factor 2, node loss min availability 1.00, zone loss min availability 1.00, read quorum 1, write quorum 2, placement ms 59.2<br>WaveMind hot cache: queries 2000, capacity 512, hit rate 0.92, evictions 0, p99 lookup ms 0.00<br>WaveMind replicated runtime: nodes 3, replication factor 3, write quorum 2, read quorum 1, recalled after node loss True, repair copied records 1, tombstone repair deleted records 1, p99 query after loss ms 1.29<br>WaveMind active-active delta sync: regions 2, replication factor per region 3, records imported 6, converged after bidirectional sync True, suppressed stale import after delete True, tombstone converged True, sync ms 112.5<br>WaveMind replicated snapshot: nodes 3, manifest healthy True, restored files 3, recalled after restore node loss True, snapshot ms 49.2, restore ms 11.4<br>WaveMind structured payloads: queries 4, precision@1 1.00, avg latency 0.45, p99 latency ms 0.67 | Move from local replicated runtime to service-backed replicated runs, scheduled/offsite snapshots, and larger 10M candidate-index load tests. |
 | Memory competitor adapter profile | agent-memory | implemented | WaveMind: precision@1 0.80, precision@3 1.00, stale suppression 1.00, avg latency 0.55, p95 latency 0.83<br>Mem0: skipped - Install Mem0 to run this adapter profile: pip install "mem0ai"<br>Zep: skipped - Install the Zep client package and set ZEP_API_KEY or ZEP_API_URL.<br>LangGraph persistent memory: skipped - Install LangGraph to run this adapter profile: pip install "langgraph" | Add documented setup commands for each competitor adapter and store checked-in results only when those real adapters run. |
 | [LongMemEval answer generation](https://github.com/xiaowu0162/LongMemEval) | long-term-agent-memory | implemented | extractive smoke: queries 20, evidence recall@k 1.00, exact match 0.00, contains answer 0.05, token f1 0.02, avg retrieval ms 3.79, avg generation ms 0.77<br>WaveMind + qwen2.5:0.5b: queries 50, evidence recall@k 0.92, exact match 0.12, contains answer 0.18, token f1 0.18, avg retrieval ms 2.98, avg generation ms 1428.2<br>Chroma static + qwen2.5:0.5b: queries 50, evidence recall@k 0.60, exact match 0.10, contains answer 0.12, token f1 0.13, avg retrieval ms 4.10, avg generation ms 1234.7<br>Qdrant static + qwen2.5:0.5b: queries 50, evidence recall@k 0.60, exact match 0.10, contains answer 0.12, token f1 0.13, avg retrieval ms 63.8, avg generation ms 893.5<br>WaveMind + qwen2.5:1.5b: queries 50, evidence recall@k 0.92, exact match 0.24, contains answer 0.38, token f1 0.33, avg retrieval ms 2.00, avg generation ms 2153.0<br>Chroma static + qwen2.5:1.5b: queries 50, evidence recall@k 0.60, exact match 0.12, contains answer 0.16, token f1 0.17, avg retrieval ms 7.05, avg generation ms 2082.4<br>Qdrant static + qwen2.5:1.5b: queries 50, evidence recall@k 0.60, exact match 0.12, contains answer 0.16, token f1 0.17, avg retrieval ms 100.2, avg generation ms 758.1 | Run all 470 non-abstention questions with a stronger local/API model and add faithfulness/abstention scoring. |
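
Note on the replicated-runtime row above: it reports 3 nodes, replication factor 3, write quorum 2, and read quorum 1. The sketch below applies textbook quorum arithmetic to those numbers. `quorum_report` is a hypothetical helper written for this note; it describes the general rule, not WaveMind's internal behavior.

```python
def quorum_report(n: int, w: int, r: int) -> dict:
    """Standard quorum-overlap checks for n replicas, write quorum w, read quorum r."""
    return {
        # Every read set intersects every write set only when r + w > n.
        "read_sees_latest_write": r + w > n,
        # Any two write sets share at least one replica when 2w > n.
        "writes_overlap": 2 * w > n,
        # Number of replicas that can be down while writes still reach quorum.
        "write_tolerates_failures": n - w,
        # Number of replicas that can be down while reads still reach quorum.
        "read_tolerates_failures": n - r,
    }


# Values taken from the "WaveMind replicated runtime" benchmark row.
runtime = quorum_report(n=3, w=2, r=1)
```

With these settings writes survive one node loss and reads survive two, but `r + w` equals `n`, so a single-replica read is not guaranteed to see the latest write. That is consistent with the eventual-consistency wording in the README hunks and with the benchmark exercising missing-record and tombstone repair.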

{wavemind-2.2.5 → wavemind-2.2.7}/benchmarks/benchmark_matrix_results.json RENAMED

@@ -741,7 +741,7 @@
       "category": "production-scale",
       "status": "implemented",
       "source": "benchmarks/scale_readiness_benchmark.py",
-      "dataset": "Deterministic 1M-memory simulation for namespace placement plus hot-cache and structured-payload retrieval checks.",
+      "dataset": "Deterministic 1M-memory simulation for namespace placement, quorum runtime, active-active delta sync, replicated snapshot/restore, hot-cache, and structured-payload retrieval checks.",
       "competitors": [
         "Mem0",
         "Zep",
@@ -764,24 +764,51 @@
           "zone_loss_min_availability": 1.0,
           "read_quorum": 1,
           "write_quorum": 2,
-          "placement_ms": 115.80999998841435
+          "placement_ms": 59.205800003837794
         },
         "WaveMind hot cache": {
           "queries": 2000,
           "capacity": 512,
           "hit_rate": 0.92,
           "evictions": 0,
-          "p99_lookup_ms": 0.003500026650726795
+          "p99_lookup_ms": 0.002700020559132099
+        },
+        "WaveMind replicated runtime": {
+          "nodes": 3,
+          "replication_factor": 3,
+          "write_quorum": 2,
+          "read_quorum": 1,
+          "recalled_after_node_loss": true,
+          "repair_copied_records": 1,
+          "tombstone_repair_deleted_records": 1,
+          "p99_query_after_loss_ms": 1.2885000323876739
+        },
+        "WaveMind active-active delta sync": {
+          "regions": 2,
+          "replication_factor_per_region": 3,
+          "records_imported": 6,
+          "converged_after_bidirectional_sync": true,
+          "suppressed_stale_import_after_delete": true,
+          "tombstone_converged": true,
+          "sync_ms": 112.49550001230091
+        },
+        "WaveMind replicated snapshot": {
+          "nodes": 3,
+          "manifest_healthy": true,
+          "restored_files": 3,
+          "recalled_after_restore_node_loss": true,
+          "snapshot_ms": 49.16400002548471,
+          "restore_ms": 11.4187000435777
         },
         "WaveMind structured payloads": {
           "queries": 4,
           "precision_at_1": 1.0,
-          "avg_latency_ms": 0.8370749856112525,
-          "p99_latency_ms": 1.0724000167101622
+          "avg_latency_ms": 0.44750000233761966,
+          "p99_latency_ms": 0.6680000224150717
         }
       },
-      "target": "Prove the production foundation before heavier 100k, 1M, and 10M vector load tests: deterministic placement, survivable replicas, hot-cache behavior, and structured payload recall.",
-      "next_step": "Move from single-node service profiles to namespace sharding and replicated service runs."
+      "target": "Prove the production foundation before heavier 100k, 1M, and 10M vector load tests: deterministic placement, survivable replicas, active-active sync, restore drills, hot-cache behavior, and structured payload recall.",
+      "next_step": "Move from local replicated runtime to service-backed replicated runs, scheduled/offsite snapshots, and larger 10M candidate-index load tests."
     },
     {
       "id": "memory_competitor_adapter_profile",

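As a reading aid for the timing fields that change in the hunk above, the following minimal sketch computes the relative change between the 2.2.5 and 2.2.7 values. The helper name `relative_change` and the `CHANGED_METRICS` table are assumptions made for illustration and are not part of wavemind. The numbers are copied from the diff; each is a single checked-in run, so a difference may reflect run-to-run noise as much as a code change.

```python
# Old/new values copied from the benchmark_matrix_results.json hunk (2.2.5 -> 2.2.7).
# The dictionary and helper below are illustrative only, not wavemind code.
CHANGED_METRICS = {
    "cluster planner placement_ms": (115.80999998841435, 59.205800003837794),
    "hot cache p99_lookup_ms": (0.003500026650726795, 0.002700020559132099),
    "structured payloads avg_latency_ms": (0.8370749856112525, 0.44750000233761966),
    "structured payloads p99_latency_ms": (1.0724000167101622, 0.6680000224150717),
}


def relative_change(old: float, new: float) -> float:
    """Return (new - old) / old; a negative result means the new value is lower."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return (new - old) / old


if __name__ == "__main__":
    for name, (old, new) in CHANGED_METRICS.items():
        change = relative_change(old, new)
        print(f"{name}: {old:.4f} ms -> {new:.4f} ms ({change:+.1%})")
```

Lower is better for every field listed here, so a negative change indicates a faster run in 2.2.7.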
{wavemind-2.2.5 → wavemind-2.2.7}/benchmarks/benchmark_registry.py RENAMED

@@ -702,7 +702,7 @@ def _implemented_entries(root: Path) -> list[dict[str, Any]]:
             "category": "production-scale",
             "status": "implemented",
             "source": "benchmarks/scale_readiness_benchmark.py",
-            "dataset": "Deterministic 1M-memory simulation for namespace placement plus hot-cache and structured-payload retrieval checks.",
+            "dataset": "Deterministic 1M-memory simulation for namespace placement, quorum runtime, active-active delta sync, replicated snapshot/restore, hot-cache, and structured-payload retrieval checks.",
             "competitors": ["Mem0", "Zep", "LangGraph persistent memory", "GraphRAG"],
             "metrics": [
                 "node_loss_min_availability",
@@ -735,6 +735,42 @@ def _implemented_entries(root: Path) -> list[dict[str, Any]]:
                         "p99_lookup_ms",
                     ),
                 ),
+                "WaveMind replicated runtime": _metric_summary(
+                    scale_readiness_results.get("WaveMind replicated runtime"),
+                    (
+                        "nodes",
+                        "replication_factor",
+                        "write_quorum",
+                        "read_quorum",
+                        "recalled_after_node_loss",
+                        "repair_copied_records",
+                        "tombstone_repair_deleted_records",
+                        "p99_query_after_loss_ms",
+                    ),
+                ),
+                "WaveMind active-active delta sync": _metric_summary(
+                    scale_readiness_results.get("WaveMind active-active delta sync"),
+                    (
+                        "regions",
+                        "replication_factor_per_region",
+                        "records_imported",
+                        "converged_after_bidirectional_sync",
+                        "suppressed_stale_import_after_delete",
+                        "tombstone_converged",
+                        "sync_ms",
+                    ),
+                ),
+                "WaveMind replicated snapshot": _metric_summary(
+                    scale_readiness_results.get("WaveMind replicated snapshot"),
+                    (
+                        "nodes",
+                        "manifest_healthy",
+                        "restored_files",
+                        "recalled_after_restore_node_loss",
+                        "snapshot_ms",
+                        "restore_ms",
+                    ),
+                ),
                 "WaveMind structured payloads": _metric_summary(
                     scale_readiness_results.get("WaveMind structured payloads"),
                     (
@@ -745,8 +781,8 @@ def _implemented_entries(root: Path) -> list[dict[str, Any]]:
                     ),
                 ),
             },
-            "target": "Prove the production foundation before heavier 100k, 1M, and 10M vector load tests: deterministic placement, survivable replicas, hot-cache behavior, and structured payload recall.",
-            "next_step": "Move from single-node service profiles to namespace sharding and replicated service runs.",
+            "target": "Prove the production foundation before heavier 100k, 1M, and 10M vector load tests: deterministic placement, survivable replicas, active-active sync, restore drills, hot-cache behavior, and structured payload recall.",
+            "next_step": "Move from local replicated runtime to service-backed replicated runs, scheduled/offsite snapshots, and larger 10M candidate-index load tests.",
         },
         {
             "id": "memory_competitor_adapter_profile",

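The registry hunk above passes each profile result and a tuple of key names to `_metric_summary`, whose body is not part of this diff. The sketch below shows one plausible shape for such a helper, assuming it copies only the listed keys and tolerates a missing profile (the call sites use `.get(...)`, which can return `None`). The name `metric_summary_sketch` and its behavior are assumptions, not the package's actual implementation.

```python
from typing import Any, Iterable, Mapping, Optional


def metric_summary_sketch(
    result: Optional[Mapping[str, Any]],
    keys: Iterable[str],
) -> dict[str, Any]:
    """Hypothetical stand-in for _metric_summary.

    Copies the requested keys that are present in `result`, in the order
    given by `keys`. Returns an empty dict when the profile is missing.
    """
    if not result:
        return {}
    return {key: result[key] for key in keys if key in result}


if __name__ == "__main__":
    # Example input shaped like the active-active profile result in this diff.
    profile = {
        "engine": "WaveMind active-active delta sync",
        "regions": 2,
        "sync_ms": 112.49550001230091,
        "tombstone_deleted_records": 1,
    }
    print(metric_summary_sketch(profile, ("regions", "sync_ms")))
```

Under this assumption, fields the benchmark returns but the registry does not list (for example `engine` or `tombstone_deleted_records`) would be left out of the published summary, which is consistent with the JSON shown earlier in the diff.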
{wavemind-2.2.5 → wavemind-2.2.7}/benchmarks/scale_readiness_benchmark.py RENAMED

@@ -238,6 +238,138 @@ def run_replication_runtime_profile() -> dict[str, object]:
             memory.close()
+
+
+def run_active_active_delta_profile() -> dict[str, object]:
+    with tempfile.TemporaryDirectory() as directory:
+        kwargs = {
+            "replication_factor": 3,
+            "width": 16,
+            "height": 16,
+            "layers": 1,
+            "encoder": HashingTextEncoder(vector_dim=64),
+        }
+        region_a = ReplicatedWaveMind(
+            root_path=Path(directory) / "region-a",
+            nodes=[
+                {"id": "region-a-1", "address": "127.0.0.1:8101", "zone": "zone-a"},
+                {"id": "region-a-2", "address": "127.0.0.1:8102", "zone": "zone-b"},
+                {"id": "region-a-3", "address": "127.0.0.1:8103", "zone": "zone-c"},
+            ],
+            **kwargs,
+        )
+        region_b = ReplicatedWaveMind(
+            root_path=Path(directory) / "region-b",
+            nodes=[
+                {"id": "region-b-1", "address": "127.0.0.1:8201", "zone": "zone-a"},
+                {"id": "region-b-2", "address": "127.0.0.1:8202", "zone": "zone-b"},
+                {"id": "region-b-3", "address": "127.0.0.1:8203", "zone": "zone-c"},
+            ],
+            **kwargs,
+        )
+        try:
+            namespace = "tenant:active-active"
+            region_a.remember("region a billing preference", namespace=namespace)
+            region_b.remember("region b support preference", namespace=namespace)
+            sync_started = time.perf_counter()
+            import_b = region_b.import_namespace_delta(
+                region_a.export_namespace_delta(namespace)
+            )
+            import_a = region_a.import_namespace_delta(
+                region_b.export_namespace_delta(namespace)
+            )
+            sync_ms = (time.perf_counter() - sync_started) * 1000.0
+            converged = (
+                region_a.query("support preference", namespace=namespace, top_k=1)
+                and region_b.query("billing preference", namespace=namespace, top_k=1)
+            )
+            stale_delta = region_b.export_namespace_delta(namespace)
+            region_a.forget(text="region a billing preference", namespace=namespace)
+            region_a.import_namespace_delta(stale_delta)
+            suppressed_stale_import = all(
+                result.text != "region a billing preference"
+                for result in region_a.query("billing preference", namespace=namespace, top_k=3)
+            )
+            tombstone_delta = region_a.export_namespace_delta(namespace)
+            tombstone_report = region_b.import_namespace_delta(tombstone_delta)
+            tombstone_converged = all(
+                result.text != "region a billing preference"
+                for result in region_b.query("billing preference", namespace=namespace, top_k=3)
+            )
+            return {
+                "engine": "WaveMind active-active delta sync",
+                "regions": 2,
+                "replication_factor_per_region": 3,
+                "records_imported": import_a.imported_records + import_b.imported_records,
+                "converged_after_bidirectional_sync": bool(converged),
+                "sync_ms": sync_ms,
+                "suppressed_stale_import_after_delete": suppressed_stale_import,
+                "tombstone_deleted_records": tombstone_report.deleted_records,
+                "tombstone_converged": tombstone_converged,
+            }
+        finally:
+            region_a.close()
+            region_b.close()
+
+
+def run_replicated_snapshot_profile() -> dict[str, object]:
+    with tempfile.TemporaryDirectory() as directory:
+        root = Path(directory)
+        memory = ReplicatedWaveMind(
+            root_path=root / "replicas",
+            nodes=[
+                {"id": "node-a", "address": "127.0.0.1:8101", "zone": "zone-a"},
+                {"id": "node-b", "address": "127.0.0.1:8102", "zone": "zone-b"},
+                {"id": "node-c", "address": "127.0.0.1:8103", "zone": "zone-c"},
+            ],
+            replication_factor=3,
+            width=16,
+            height=16,
+            layers=1,
+            encoder=HashingTextEncoder(vector_dim=64),
+        )
+        restored = None
+        try:
+            namespace = "tenant:snapshot"
+            memory.remember(
+                "replicated snapshot restore survives node loss",
+                namespace=namespace,
+            )
+            snapshot_started = time.perf_counter()
+            snapshot = memory.snapshot(root / "snapshots")
+            snapshot_ms = (time.perf_counter() - snapshot_started) * 1000.0
+            health = ReplicatedWaveMind.verify_snapshot(snapshot.snapshot_path)
+            restore_started = time.perf_counter()
+            restored, restore = ReplicatedWaveMind.restore_snapshot(
+                snapshot.snapshot_path,
+                root / "restored",
+                width=16,
+                height=16,
+                layers=1,
+                encoder=HashingTextEncoder(vector_dim=64),
+            )
+            restore_ms = (time.perf_counter() - restore_started) * 1000.0
+            placement = restored.placement(namespace)
+            restored.set_node_available(placement.primary, False)
+            recalled_after_restore_loss = bool(
+                restored.query("snapshot restore node loss", namespace=namespace, top_k=1)
+            )
+            return {
+                "engine": "WaveMind replicated snapshot",
+                "nodes": len(snapshot.nodes),
+                "manifest_healthy": health["healthy"],
+                "total_bytes": snapshot.total_bytes,
+                "snapshot_ms": snapshot_ms,
+                "restore_ms": restore_ms,
+                "restored_files": len(restore.restored_files),
+                "recalled_after_restore_node_loss": recalled_after_restore_loss,
+            }
+        finally:
+            memory.close()
+            if restored is not None:
+                restored.close()


 def run_multimodal_profile() -> dict[str, object]:
     with tempfile.TemporaryDirectory() as directory:
         memory = WaveMind(
@@ -325,6 +457,8 @@ def run_benchmark(
         ),
         run_cache_profile(queries=cache_queries, capacity=cache_capacity),
         run_replication_runtime_profile(),
+        run_active_active_delta_profile(),
+        run_replicated_snapshot_profile(),
         run_multimodal_profile(),
     ]
     return {
@@ -337,8 +471,9 @@ def run_benchmark(
             "description": (
                 "Deterministic scale-readiness profile for cluster placement, "
                 "node/zone loss simulation, quorum-replicated runtime behavior, "
-                "hot-cache behavior, and structured payload retrieval. This is "
-                "not a 10M-vector database load test."
+                "active-active delta sync, replicated snapshot/restore, hot-cache "
+                "behavior, and structured payload retrieval. This is not a "
+                "10M-vector database load test."
             ),
         },
         "results": results,
@@ -379,6 +514,12 @@ def main() -> int:
             print(f"| replicated runtime | recalled_after_node_loss | {result['recalled_after_node_loss']} |")
             print(f"| replicated runtime | repair_copied_records | {result['repair_copied_records']} |")
             print(f"| replicated runtime | tombstone_repair_deleted_records | {result['tombstone_repair_deleted_records']} |")
+        elif result["engine"] == "WaveMind active-active delta sync":
+            print(f"| active-active delta | converged | {result['converged_after_bidirectional_sync']} |")
+            print(f"| active-active delta | tombstone_converged | {result['tombstone_converged']} |")
+        elif result["engine"] == "WaveMind replicated snapshot":
+            print(f"| replicated snapshot | manifest_healthy | {result['manifest_healthy']} |")
+            print(f"| replicated snapshot | recalled_after_restore_node_loss | {result['recalled_after_restore_node_loss']} |")
         else:
             print(f"| structured payloads | precision@1 | {result['precision_at_1']:.3f} |")
     print(f"\nWrote {args.output}")
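
The active-active profile added in this file checks three behaviors: convergence after a bidirectional delta exchange, suppression of a stale import after a delete, and tombstone convergence on the peer. The toy model below sketches one common way to get those properties, using per-record version numbers and tombstones. It is not WaveMind's `ReplicatedWaveMind`; the names `ToyRegion`, `Entry`, `export_delta`, `import_delta`, and `run_scenario` are invented for illustration, and the model ignores tie-breaking for concurrent writes to the same record.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    version: int           # logical clock value of the last write or delete
    deleted: bool = False  # True marks a tombstone


@dataclass
class ToyRegion:
    """Toy in-memory region; an illustration, not wavemind's implementation."""

    clock: int = 0
    entries: dict[str, Entry] = field(default_factory=dict)

    def remember(self, text: str) -> None:
        self.clock += 1
        self.entries[text] = Entry(version=self.clock)

    def forget(self, text: str) -> None:
        # Keep a tombstone instead of dropping the key, so older copies
        # arriving later can be recognized as stale.
        self.clock += 1
        self.entries[text] = Entry(version=self.clock, deleted=True)

    def export_delta(self) -> dict[str, Entry]:
        return {t: Entry(e.version, e.deleted) for t, e in self.entries.items()}

    def import_delta(self, delta: dict[str, Entry]) -> int:
        applied = 0
        for text, incoming in delta.items():
            local = self.entries.get(text)
            # Apply only strictly newer entries; equal or older ones are ignored.
            if local is None or incoming.version > local.version:
                self.entries[text] = Entry(incoming.version, incoming.deleted)
                applied += 1
            self.clock = max(self.clock, incoming.version)
        return applied

    def live(self) -> set[str]:
        return {t for t, e in self.entries.items() if not e.deleted}


def run_scenario() -> dict[str, bool]:
    """Replay the same sequence of steps as the benchmark profile."""
    a, b = ToyRegion(), ToyRegion()
    a.remember("region a billing preference")
    b.remember("region b support preference")
    b.import_delta(a.export_delta())
    a.import_delta(b.export_delta())
    converged = a.live() == b.live() and len(a.live()) == 2

    stale_delta = b.export_delta()           # still carries the live billing record
    a.forget("region a billing preference")
    a.import_delta(stale_delta)              # must not resurrect the deleted record
    suppressed = "region a billing preference" not in a.live()

    b.import_delta(a.export_delta())         # tombstone travels to the peer
    tombstone_converged = "region a billing preference" not in b.live()
    return {
        "converged": converged,
        "suppressed_stale_import": suppressed,
        "tombstone_converged": tombstone_converged,
    }


if __name__ == "__main__":
    print(run_scenario())
```

The key design point in this sketch is that a delete writes a newer version rather than removing the key, which is what lets the later stale import be rejected by a simple version comparison.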
