wavemind 2.0.3__tar.gz → 2.0.5__tar.gz

Files changed (35)
  1. {wavemind-2.0.3/wavemind.egg-info → wavemind-2.0.5}/PKG-INFO +33 -4
  2. {wavemind-2.0.3 → wavemind-2.0.5}/README.md +32 -3
  3. {wavemind-2.0.3 → wavemind-2.0.5}/pyproject.toml +1 -1
  4. wavemind-2.0.5/tests/test_dynamic_memory_benchmark.py +88 -0
  5. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_packaging_files.py +14 -3
  6. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_semantic_and_latency.py +24 -0
  7. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/__init__.py +1 -1
  8. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/core.py +27 -0
  9. {wavemind-2.0.3 → wavemind-2.0.5/wavemind.egg-info}/PKG-INFO +33 -4
  10. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind.egg-info/SOURCES.txt +1 -0
  11. {wavemind-2.0.3 → wavemind-2.0.5}/LICENSE +0 -0
  12. {wavemind-2.0.3 → wavemind-2.0.5}/setup.cfg +0 -0
  13. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_agent_memory_benchmark.py +0 -0
  14. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_api.py +0 -0
  15. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_api_process_persistence.py +0 -0
  16. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_cli_smoke.py +0 -0
  17. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_core_persistence.py +0 -0
  18. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_examples.py +0 -0
  19. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_import_benchmark.py +0 -0
  20. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_indexes_encoders.py +0 -0
  21. {wavemind-2.0.3 → wavemind-2.0.5}/tests/test_langchain_integration.py +0 -0
  22. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/__main__.py +0 -0
  23. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/api.py +0 -0
  24. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/benchmark.py +0 -0
  25. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/cli.py +0 -0
  26. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/encoders.py +0 -0
  27. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/importers.py +0 -0
  28. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/indexes.py +0 -0
  29. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/integrations/__init__.py +0 -0
  30. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/integrations/langchain.py +0 -0
  31. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind/storage.py +0 -0
  32. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind.egg-info/dependency_links.txt +0 -0
  33. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind.egg-info/entry_points.txt +0 -0
  34. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind.egg-info/requires.txt +0 -0
  35. {wavemind-2.0.3 → wavemind-2.0.5}/wavemind.egg-info/top_level.txt +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: wavemind
- Version: 2.0.3
+ Version: 2.0.5
  Summary: Persistent dynamic memory engine with vector search and wave-field re-ranking
  License-Expression: MIT
  Project-URL: Homepage, https://github.com/CaspianG/wavemind
@@ -219,6 +219,34 @@ pip install -e ".[bench]"
  python benchmarks/agent_memory_benchmark.py --engines wavemind chroma --facts 200 --queries 50
  ```

+ Dynamic agent-memory benchmark:
+
+ 200 memories, 8 checks, same precomputed `HashingTextEncoder` embeddings.
+ This benchmark exercises hot memory, TTL, corrections, and namespace isolation.
+ WaveMind applies its built-in memory policy. `Chroma static` is a plain vector-store baseline without application-layer TTL, delete handling, namespace filters, or recall reinforcement.
+ Full machine-readable result: `benchmarks/dynamic_memory_results.json`.
+
+ | engine | precision@1 | precision@3 | stale suppression | avg latency |
+ |---|---:|---:|---:|---:|
+ | WaveMind | 1.00 | 1.00 | 1.00 | 25.26 ms |
+ | Chroma static | 0.57 | 1.00 | 0.00 | 1.75 ms |
+
+ Category success:
+
+ | behavior | WaveMind | Chroma static |
+ |---|---:|---:|
+ | hot memory | 1.00 | 0.50 |
+ | TTL | 1.00 | 0.00 |
+ | correction | 1.00 | 0.00 |
+ | namespace isolation | 1.00 | 0.00 |
+
+ Run locally from a cloned repository:
+
+ ```sh
+ pip install -e ".[bench]"
+ python benchmarks/dynamic_memory_benchmark.py --engines wavemind chroma --memories 200
+ ```
+
  ## Comparison

  | feature | WaveMind | Chroma | Qdrant |
@@ -241,13 +269,14 @@ WaveMind is not trying to replace dedicated vector databases at scale. The inten
  - `sentence-transformers/paraphrase-multilingual-mpnet-base-v2` requires about 420 MB of model files and measured about 53 ms per query on the benchmark machine.
  - The Chroma comparison currently uses shared precomputed hash embeddings to isolate retrieval/ranking behavior; semantic model comparisons should be run separately.
  - In the 200-fact agent benchmark, Chroma is faster on average while WaveMind is slightly higher at `precision@3`.
- - The current public benchmark does not yet prove the dynamic-memory advantage. The next benchmark must test hotness, TTL, corrections, namespace isolation, and repeated recall.
+ - The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
+ - Dynamic memory is slower than static Chroma in the current local benchmark: 25.26 ms vs 1.75 ms average query latency on this machine.

  ## Roadmap

  - FAISS-first production index path with persisted index rebuilds.
- - Dynamic agent-memory benchmark against Chroma/Qdrant: hotness, TTL, stale-fact suppression, corrections, and namespace isolation.
- - Expand the agent-memory benchmark to sentence-transformers, FAISS, Chroma default embeddings, and Qdrant.
+ - Expand the dynamic benchmark to Qdrant, Chroma metadata-policy mode, sentence-transformers, and FAISS.
+ - Optimize dynamic re-ranking latency after lexical candidate filtering.
  - Better semantic query expansion for short and ambiguous queries.
  - Namespace quotas, backups, and daemon hardening for SaaS use.
  - Webhook on recall for agent runtimes.
@@ -187,6 +187,34 @@ pip install -e ".[bench]"
  python benchmarks/agent_memory_benchmark.py --engines wavemind chroma --facts 200 --queries 50
  ```

+ Dynamic agent-memory benchmark:
+
+ 200 memories, 8 checks, same precomputed `HashingTextEncoder` embeddings.
+ This benchmark exercises hot memory, TTL, corrections, and namespace isolation.
+ WaveMind applies its built-in memory policy. `Chroma static` is a plain vector-store baseline without application-layer TTL, delete handling, namespace filters, or recall reinforcement.
+ Full machine-readable result: `benchmarks/dynamic_memory_results.json`.
+
+ | engine | precision@1 | precision@3 | stale suppression | avg latency |
+ |---|---:|---:|---:|---:|
+ | WaveMind | 1.00 | 1.00 | 1.00 | 25.26 ms |
+ | Chroma static | 0.57 | 1.00 | 0.00 | 1.75 ms |
+
+ Category success:
+
+ | behavior | WaveMind | Chroma static |
+ |---|---:|---:|
+ | hot memory | 1.00 | 0.50 |
+ | TTL | 1.00 | 0.00 |
+ | correction | 1.00 | 0.00 |
+ | namespace isolation | 1.00 | 0.00 |
+
+ Run locally from a cloned repository:
+
+ ```sh
+ pip install -e ".[bench]"
+ python benchmarks/dynamic_memory_benchmark.py --engines wavemind chroma --memories 200
+ ```
+
  ## Comparison

  | feature | WaveMind | Chroma | Qdrant |
@@ -209,13 +237,14 @@ WaveMind is not trying to replace dedicated vector databases at scale. The inten
  - `sentence-transformers/paraphrase-multilingual-mpnet-base-v2` requires about 420 MB of model files and measured about 53 ms per query on the benchmark machine.
  - The Chroma comparison currently uses shared precomputed hash embeddings to isolate retrieval/ranking behavior; semantic model comparisons should be run separately.
  - In the 200-fact agent benchmark, Chroma is faster on average while WaveMind is slightly higher at `precision@3`.
- - The current public benchmark does not yet prove the dynamic-memory advantage. The next benchmark must test hotness, TTL, corrections, namespace isolation, and repeated recall.
+ - The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
+ - Dynamic memory is slower than static Chroma in the current local benchmark: 25.26 ms vs 1.75 ms average query latency on this machine.

  ## Roadmap

  - FAISS-first production index path with persisted index rebuilds.
- - Dynamic agent-memory benchmark against Chroma/Qdrant: hotness, TTL, stale-fact suppression, corrections, and namespace isolation.
- - Expand the agent-memory benchmark to sentence-transformers, FAISS, Chroma default embeddings, and Qdrant.
+ - Expand the dynamic benchmark to Qdrant, Chroma metadata-policy mode, sentence-transformers, and FAISS.
+ - Optimize dynamic re-ranking latency after lexical candidate filtering.
  - Better semantic query expansion for short and ambiguous queries.
  - Namespace quotas, backups, and daemon hardening for SaaS use.
  - Webhook on recall for agent runtimes.
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
  
  [project]
  name = "wavemind"
- version = "2.0.3"
+ version = "2.0.5"
  description = "Persistent dynamic memory engine with vector search and wave-field re-ranking"
  readme = "README.md"
  license = "MIT"
@@ -0,0 +1,88 @@
+ import json
+ import os
+ import subprocess
+ import sys
+ from pathlib import Path
+
+
+ def test_dynamic_memory_scenario_exercises_memory_behaviors():
+     from benchmarks.dynamic_memory_benchmark import build_dynamic_memory_scenario
+
+     scenario = build_dynamic_memory_scenario(memory_count=200)
+     categories = {check.category for check in scenario.checks}
+
+     assert len(scenario.memories) == 200
+     assert len(scenario.checks) >= 8
+     assert {"hot_memory", "ttl", "correction", "namespace"}.issubset(categories)
+     assert any(memory.ttl_seconds == 0 for memory in scenario.memories)
+     assert any(memory.priority >= 5 for memory in scenario.memories)
+     assert any(check.forbidden_ids for check in scenario.checks)
+
+
+ def test_dynamic_memory_metrics_track_expected_and_forbidden_results():
+     from benchmarks.dynamic_memory_benchmark import DynamicCheck, compute_dynamic_metrics
+
+     checks = [
+         DynamicCheck(
+             id="q_hot",
+             category="hot_memory",
+             text="How should the assistant answer?",
+             namespace="agent-a",
+             expected_id="style_hot",
+         ),
+         DynamicCheck(
+             id="q_ttl",
+             category="ttl",
+             text="What temporary token is still valid?",
+             namespace="agent-a",
+             expected_id=None,
+             forbidden_ids=("expired_token",),
+         ),
+     ]
+     rankings = {
+         "q_hot": ["style_hot", "style_cold"],
+         "q_ttl": ["unrelated_fact"],
+     }
+
+     metrics = compute_dynamic_metrics(checks, rankings, [2.0, 4.0], engine="unit")
+
+     assert metrics.precision_at_1 == 1.0
+     assert metrics.precision_at_3 == 1.0
+     assert metrics.suppression_rate == 1.0
+     assert metrics.category_success["hot_memory"] == 1.0
+     assert metrics.category_success["ttl"] == 1.0
+     assert metrics.avg_latency_ms == 3.0
+
+
+ def test_dynamic_memory_benchmark_cli_writes_json_for_wavemind(tmp_path):
+     output = tmp_path / "dynamic-memory-result.json"
+     project_root = Path(__file__).resolve().parents[1]
+     env = os.environ.copy()
+     env["PYTHONPATH"] = str(project_root) + os.pathsep + env.get("PYTHONPATH", "")
+
+     subprocess.run(
+         [
+             sys.executable,
+             "benchmarks/dynamic_memory_benchmark.py",
+             "--engines",
+             "wavemind",
+             "--memories",
+             "40",
+             "--output",
+             str(output),
+         ],
+         cwd=project_root,
+         env=env,
+         text=True,
+         encoding="utf-8",
+         capture_output=True,
+         check=True,
+     )
+
+     payload = json.loads(output.read_text(encoding="utf-8"))
+
+     assert payload["scenario"]["name"] == "dynamic_agent_memory"
+     assert payload["scenario"]["memories"] == 40
+     assert payload["results"][0]["engine"] == "WaveMind"
+     assert "suppression_rate" in payload["results"][0]
+     assert "category_success" in payload["results"][0]
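The assertions in `test_dynamic_memory_metrics_track_expected_and_forbidden_results` hint at how the benchmark scores a run, but `benchmarks/dynamic_memory_benchmark.py` itself is not part of this sdist diff. The following is a hypothetical sketch of metric definitions that would satisfy those assertions. The `Check` class, `ratio`, and `sketch_metrics` are illustrative names, and the scoring rules (checks without an expected id are skipped for precision; suppression means no forbidden id in the top `k`) are assumptions, not the package's confirmed implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Check:
    """Hypothetical stand-in for DynamicCheck; field names follow the test."""
    id: str
    category: str
    expected_id: Optional[str] = None
    forbidden_ids: tuple = ()


def ratio(flags):
    """Fraction of True values; 0.0 for an empty input (an assumption)."""
    flags = list(flags)
    return sum(flags) / len(flags) if flags else 0.0


def sketch_metrics(checks, rankings, latencies_ms, k=3):
    """Assumed metric definitions, consistent with the test's assertions."""

    def hit(check, depth):
        # The expected memory appears within the first `depth` results.
        return check.expected_id in rankings[check.id][:depth]

    def suppressed(check):
        # No forbidden (stale, expired, foreign) id appears in the top k.
        return not set(check.forbidden_ids) & set(rankings[check.id][:k])

    def passed(check):
        return (check.expected_id is None or hit(check, 1)) and suppressed(check)

    with_expected = [c for c in checks if c.expected_id is not None]
    with_forbidden = [c for c in checks if c.forbidden_ids]
    return {
        "precision_at_1": ratio(hit(c, 1) for c in with_expected),
        "precision_at_3": ratio(hit(c, 3) for c in with_expected),
        "suppression_rate": ratio(suppressed(c) for c in with_forbidden),
        "category_success": {
            name: ratio(passed(c) for c in checks if c.category == name)
            for name in sorted({c.category for c in checks})
        },
        "avg_latency_ms": sum(latencies_ms) / len(latencies_ms) if latencies_ms else 0.0,
    }


# Same inputs as the unit test in the diff.
checks = [
    Check(id="q_hot", category="hot_memory", expected_id="style_hot"),
    Check(id="q_ttl", category="ttl", forbidden_ids=("expired_token",)),
]
rankings = {"q_hot": ["style_hot", "style_cold"], "q_ttl": ["unrelated_fact"]}
result = sketch_metrics(checks, rankings, [2.0, 4.0])
```

Under these assumed rules, a TTL check with no expected id passes as long as the expired memory stays out of the ranking, which matches the `q_ttl` case in the test.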
@@ -1,13 +1,15 @@
  from pathlib import Path
- import tomllib
+ import re

  import wavemind


  def test_package_version_matches_pyproject():
-     pyproject = tomllib.loads(Path("pyproject.toml").read_text(encoding="utf-8"))
+     pyproject = Path("pyproject.toml").read_text(encoding="utf-8")
+     match = re.search(r'^version = "([^"]+)"$', pyproject, flags=re.MULTILINE)

-     assert wavemind.__version__ == pyproject["project"]["version"]
+     assert match is not None
+     assert wavemind.__version__ == match.group(1)


  def test_sentence_extra_is_available_for_install_scripts():
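The hunk above swaps `tomllib` for a regular expression. `tomllib` only entered the standard library in Python 3.11, so the change presumably keeps the test importable on Python 3.10, although the diff does not state the motivation. A minimal standalone sketch of the same extraction follows; `SAMPLE_PYPROJECT` and `read_version` are hypothetical names used only for illustration.

```python
import re

# Hypothetical pyproject.toml content, used only for this illustration.
SAMPLE_PYPROJECT = '''\
[project]
name = "wavemind"
version = "2.0.5"
readme = "README.md"
'''

# Same pattern as in the test: a whole line of the form version = "...".
VERSION_LINE = re.compile(r'^version = "([^"]+)"$', flags=re.MULTILINE)


def read_version(pyproject_text: str) -> str:
    """Return the first version string found, or raise if none matches."""
    match = VERSION_LINE.search(pyproject_text)
    if match is None:
        raise ValueError("no line of the form version = \"...\" found")
    return match.group(1)


version = read_version(SAMPLE_PYPROJECT)
```

The trade-off is that the pattern is not TOML-aware: it matches the first such line in any table, and it would miss a differently spaced form such as `version="2.0.5"`.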
@@ -56,6 +58,15 @@ def test_install_scripts_create_venv_and_install_sentence_extra():
      assert 'pip install -e ".[sentence]"' in install_bat


+ def test_docker_files_track_runtime_package_version():
+     requirements = Path("requirements.txt").read_text(encoding="utf-8")
+     compose = Path("docker-compose.yml").read_text(encoding="utf-8")
+
+     assert "pytest" not in requirements
+     assert "httpx" not in requirements
+     assert f"image: wavemind:{wavemind.__version__}" in compose
+
+
  def test_github_actions_runs_pytest_on_main_for_python_310_and_311():
      workflow = Path(".github/workflows/tests.yml").read_text(encoding="utf-8")

@@ -131,6 +131,30 @@ def test_short_query_exact_match_can_beat_stronger_vector_candidate(tmp_path):
      assert results[0].id == expected_id


+ def test_common_query_words_do_not_expand_lexical_candidates(tmp_path):
+     mind = WaveMind(
+         db_path=tmp_path / "stopwords.sqlite3",
+         encoder=FlatSemanticEncoder(),
+         width=16,
+         height=16,
+         layers=2,
+         index_kind="numpy",
+         rerank_k=1,
+     )
+     expected_id = mind.remember("rarebudget target memory", namespace="stopwords")
+     noise_ids = [
+         mind.remember(f"the user background filler memory {i}", namespace="stopwords")
+         for i in range(20)
+     ]
+
+     tokens = mind._tokens("what is the user rarebudget")
+     candidate_ids = mind._lexical_candidate_ids(tokens, {expected_id, *noise_ids})
+
+     assert "the" not in tokens
+     assert "user" not in tokens
+     assert candidate_ids == {expected_id}
+
+
  def test_field_weight_is_disabled_above_capacity_threshold(tmp_path):
      mind = WaveMind(
          db_path=tmp_path / "field-cutoff.sqlite3",
@@ -8,7 +8,7 @@ from .encoders import (
  )
  from .storage import MemoryRecord, SQLiteMemoryStore

- __version__ = "2.0.3"
+ __version__ = "2.0.5"

  __all__ = [
      "FieldProjector",
@@ -13,6 +13,32 @@ from .indexes import NumpyVectorIndex, create_vector_index
  from .storage import MemoryRecord, SQLiteMemoryStore


+ LEXICAL_STOPWORDS = {
+     "a",
+     "an",
+     "and",
+     "are",
+     "as",
+     "be",
+     "for",
+     "from",
+     "how",
+     "is",
+     "it",
+     "of",
+     "or",
+     "should",
+     "that",
+     "the",
+     "this",
+     "to",
+     "user",
+     "what",
+     "which",
+     "with",
+ }
+
+
  class WaveField:
      def __init__(
          self,
@@ -408,6 +434,7 @@ class WaveMind:
          return tuple(
              token.replace("ё", "е")
              for token in re.findall(r"[\w]+", text.lower(), flags=re.UNICODE)
+             if token not in LEXICAL_STOPWORDS
          )

      def _lexical_match(self, query_tokens: tuple[str, ...], text: str) -> float:
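The two `core.py` hunks together make the tokenizer drop common words before lexical candidate lookup. The effect can be reproduced in isolation with the sketch below, which copies the stopword set and the comprehension from the diff into a free function. The function name `tokens` is hypothetical; in the package this logic sits inside a `WaveMind` method that the test calls as `mind._tokens(...)`.

```python
import re

# Stopword set copied from the core.py hunk in this diff.
LEXICAL_STOPWORDS = {
    "a", "an", "and", "are", "as", "be", "for", "from", "how", "is", "it",
    "of", "or", "should", "that", "the", "this", "to", "user", "what",
    "which", "with",
}


def tokens(text: str) -> tuple:
    """Lowercase, split on word characters, drop stopwords, fold ё to е."""
    return tuple(
        token.replace("ё", "е")
        for token in re.findall(r"[\w]+", text.lower(), flags=re.UNICODE)
        if token not in LEXICAL_STOPWORDS
    )


query_tokens = tokens("what is the user rarebudget")
```

With the query from the new test, only `rarebudget` survives, so filler memories that merely contain "the" or "user" no longer qualify as lexical candidates. Note that `user` is treated as a stopword here, so a query whose only distinctive word is "user" would yield no lexical tokens at all.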
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: wavemind
- Version: 2.0.3
+ Version: 2.0.5
  Summary: Persistent dynamic memory engine with vector search and wave-field re-ranking
  License-Expression: MIT
  Project-URL: Homepage, https://github.com/CaspianG/wavemind
@@ -219,6 +219,34 @@ pip install -e ".[bench]"
  python benchmarks/agent_memory_benchmark.py --engines wavemind chroma --facts 200 --queries 50
  ```

+ Dynamic agent-memory benchmark:
+
+ 200 memories, 8 checks, same precomputed `HashingTextEncoder` embeddings.
+ This benchmark exercises hot memory, TTL, corrections, and namespace isolation.
+ WaveMind applies its built-in memory policy. `Chroma static` is a plain vector-store baseline without application-layer TTL, delete handling, namespace filters, or recall reinforcement.
+ Full machine-readable result: `benchmarks/dynamic_memory_results.json`.
+
+ | engine | precision@1 | precision@3 | stale suppression | avg latency |
+ |---|---:|---:|---:|---:|
+ | WaveMind | 1.00 | 1.00 | 1.00 | 25.26 ms |
+ | Chroma static | 0.57 | 1.00 | 0.00 | 1.75 ms |
+
+ Category success:
+
+ | behavior | WaveMind | Chroma static |
+ |---|---:|---:|
+ | hot memory | 1.00 | 0.50 |
+ | TTL | 1.00 | 0.00 |
+ | correction | 1.00 | 0.00 |
+ | namespace isolation | 1.00 | 0.00 |
+
+ Run locally from a cloned repository:
+
+ ```sh
+ pip install -e ".[bench]"
+ python benchmarks/dynamic_memory_benchmark.py --engines wavemind chroma --memories 200
+ ```
+
  ## Comparison

  | feature | WaveMind | Chroma | Qdrant |
@@ -241,13 +269,14 @@ WaveMind is not trying to replace dedicated vector databases at scale. The inten
  - `sentence-transformers/paraphrase-multilingual-mpnet-base-v2` requires about 420 MB of model files and measured about 53 ms per query on the benchmark machine.
  - The Chroma comparison currently uses shared precomputed hash embeddings to isolate retrieval/ranking behavior; semantic model comparisons should be run separately.
  - In the 200-fact agent benchmark, Chroma is faster on average while WaveMind is slightly higher at `precision@3`.
- - The current public benchmark does not yet prove the dynamic-memory advantage. The next benchmark must test hotness, TTL, corrections, namespace isolation, and repeated recall.
+ - The dynamic benchmark currently compares WaveMind against a static Chroma baseline. Chroma and Qdrant can implement similar behavior with extra application-layer metadata policy, deletes, filters, and reinforcement logic.
+ - Dynamic memory is slower than static Chroma in the current local benchmark: 25.26 ms vs 1.75 ms average query latency on this machine.

  ## Roadmap

  - FAISS-first production index path with persisted index rebuilds.
- - Dynamic agent-memory benchmark against Chroma/Qdrant: hotness, TTL, stale-fact suppression, corrections, and namespace isolation.
- - Expand the agent-memory benchmark to sentence-transformers, FAISS, Chroma default embeddings, and Qdrant.
+ - Expand the dynamic benchmark to Qdrant, Chroma metadata-policy mode, sentence-transformers, and FAISS.
+ - Optimize dynamic re-ranking latency after lexical candidate filtering.
  - Better semantic query expansion for short and ambiguous queries.
  - Namespace quotas, backups, and daemon hardening for SaaS use.
  - Webhook on recall for agent runtimes.
@@ -6,6 +6,7 @@ tests/test_api.py
  tests/test_api_process_persistence.py
  tests/test_cli_smoke.py
  tests/test_core_persistence.py
+ tests/test_dynamic_memory_benchmark.py
  tests/test_examples.py
  tests/test_import_benchmark.py
  tests/test_indexes_encoders.py
File without changes