droste-memory 1.0.0__tar.gz

Files changed (33)
  1. droste_memory-1.0.0/CONTRIBUTING.md +39 -0
  2. droste_memory-1.0.0/LICENSE +21 -0
  3. droste_memory-1.0.0/MANIFEST.in +4 -0
  4. droste_memory-1.0.0/PKG-INFO +208 -0
  5. droste_memory-1.0.0/README.md +185 -0
  6. droste_memory-1.0.0/core/__init__.py +6 -0
  7. droste_memory-1.0.0/core/droste_cli.py +409 -0
  8. droste_memory-1.0.0/core/droste_engine.py +1067 -0
  9. droste_memory-1.0.0/core/droste_ingester.py +2742 -0
  10. droste_memory-1.0.0/core/droste_watcher.py +147 -0
  11. droste_memory-1.0.0/core/embedding_projector.py +279 -0
  12. droste_memory-1.0.0/core/treesitter_extract.py +391 -0
  13. droste_memory-1.0.0/droste_memory.egg-info/PKG-INFO +208 -0
  14. droste_memory-1.0.0/droste_memory.egg-info/SOURCES.txt +31 -0
  15. droste_memory-1.0.0/droste_memory.egg-info/dependency_links.txt +1 -0
  16. droste_memory-1.0.0/droste_memory.egg-info/entry_points.txt +2 -0
  17. droste_memory-1.0.0/droste_memory.egg-info/requires.txt +16 -0
  18. droste_memory-1.0.0/droste_memory.egg-info/top_level.txt +1 -0
  19. droste_memory-1.0.0/pyproject.toml +41 -0
  20. droste_memory-1.0.0/requirements.txt +18 -0
  21. droste_memory-1.0.0/setup.cfg +4 -0
  22. droste_memory-1.0.0/tests/test_core_invariants.py +197 -0
  23. droste_memory-1.0.0/tests/test_shards_race.py +218 -0
  24. droste_memory-1.0.0/visualizer/__init__.py +1 -0
  25. droste_memory-1.0.0/visualizer/app.py +240 -0
  26. droste_memory-1.0.0/visualizer/cockpit.html +1519 -0
  27. droste_memory-1.0.0/visualizer/context.json +81 -0
  28. droste_memory-1.0.0/visualizer/demo_graph.json +1 -0
  29. droste_memory-1.0.0/visualizer/export_graph.py +174 -0
  30. droste_memory-1.0.0/visualizer/graph.json +1 -0
  31. droste_memory-1.0.0/visualizer/server.py +75 -0
  32. droste_memory-1.0.0/visualizer/status.json +1 -0
  33. droste_memory-1.0.0/visualizer/templates/index.html +1164 -0
@@ -0,0 +1,39 @@
1
+ # Contributing to Droste
2
+
3
+ Thanks for helping build the causal-memory layer for AI agents.
4
+
5
+ ## Dev setup
6
+
7
+ ```bash
8
+ git clone <your fork>
9
+ cd droste-memory
10
+ pip install -e ".[dev]"
11
+ pytest # should be green before you start
12
+ ```
13
+
14
+ ## Ground rules
15
+
16
+ - **Tests stay green.** `pytest` runs the deterministic regression suite
17
+ (`tests/`). Add a test for any behaviour change to the engine, ingester, or
18
+ packer. The suite forces the deterministic hash embedding backend, so it runs
19
+ offline with no model download.
20
+ - **`eval/` is for benchmarks, `tests/` is for invariants.** Don't mix them.
21
+ - **Keep the zero-config moat.** New required deps are a big deal — prefer
22
+ optional extras. `fastembed` (no torch) and `tree-sitter-language-pack` are the
23
+ only heavy runtime deps and both degrade gracefully if missing.
24
+ - **Never commit user data.** `droste_memory_db.json`, `.droste/`,
25
+ `visualizer/graph.json` and `status.json` are gitignored — they can contain a
26
+ user's source. Only `visualizer/demo_graph.json` (Droste indexing itself) is
27
+ public.
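
The deterministic hash embedding backend named in the first ground rule is not part of the lines shown here. Purely as an illustration of why such a backend needs no model download, a hash-based embedding could look like the sketch below. `hash_embed`, the 64-dimension default and the regex tokenizer are assumptions made for this example, not Droste's actual code.

```python
import hashlib
import math
import re


def hash_embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-size vector using only hashing (hypothetical helper)."""
    vec = [0.0] * dim
    for token in re.findall(r"[a-z0-9_]+", text.lower()):
        digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        index = value % dim  # bucket the token lands in
        sign = 1.0 if (value >> 63) & 1 else -1.0  # signed hashing reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    # L2-normalize so a dot product equals cosine similarity; empty input stays all zeros.
    return [x / norm for x in vec] if norm else vec


def cosine(a: list[float], b: list[float]) -> float:
    """Dot product of two vectors that are unit length (or all zeros)."""
    return sum(x * y for x, y in zip(a, b))
```

Because the output depends only on the input bytes, the same text always yields the same vector on any machine, which is the property a regression suite needs.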
28
+
29
+ ## Good first issues
30
+
31
+ - New language extractor / edge rules in `core/treesitter_extract.py`.
32
+ - More cross-language bridges in `core/droste_ingester.py`
33
+ (`_build_dependency_links`) — e.g. ORM table refs, GraphQL, gRPC.
34
+ - Visualizer polish in `visualizer/cockpit.html`.
35
+
36
+ ## PRs
37
+
38
+ Small, focused, with a one-line rationale and a test. CI runs `pytest` on
39
+ Linux across Python 3.10–3.12.
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Droste-Memory authors
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,4 @@
1
+ include README.md LICENSE CONTRIBUTING.md requirements.txt
2
+ recursive-include visualizer *.html *.py *.json
3
+ exclude visualizer/graph.json
4
+ exclude visualizer/status.json
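
The file list at the top of this diff includes `visualizer/graph.json` and `visualizer/status.json`, which CONTRIBUTING.md describes as files that can contain a user's source. In `MANIFEST.in`, `exclude` matches files while `prune` matches directory trees, so it is worth checking a built archive before upload. The sketch below is one possible check. `find_sensitive_members` and `SENSITIVE_NAMES` are hypothetical names, and matching on basenames is a deliberately coarse assumption.

```python
import tarfile

# Basenames that CONTRIBUTING.md lists as possibly containing a user's source.
# visualizer/demo_graph.json is not listed here because it is the public demo.
SENSITIVE_NAMES = {"droste_memory_db.json", "graph.json", "status.json"}


def find_sensitive_members(archive: tarfile.TarFile) -> list[str]:
    """Return archive paths that look like user data (hypothetical helper)."""
    hits = []
    for member in archive.getmembers():
        parts = member.name.split("/")
        # Flag known file names and anything stored under a .droste/ directory.
        if parts[-1] in SENSITIVE_NAMES or ".droste" in parts:
            hits.append(member.name)
    return sorted(hits)
```

A release script could open the built sdist with `tarfile.open(...)`, pass it to this function, and refuse to upload when the returned list is not empty.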
@@ -0,0 +1,208 @@
1
+ Metadata-Version: 2.4
2
+ Name: droste-memory
3
+ Version: 1.0.0
4
+ Summary: Local hybrid structural and semantic graph memory engine.
5
+ Requires-Python: >=3.10
6
+ Description-Content-Type: text/markdown
7
+ License-File: LICENSE
8
+ Requires-Dist: mcp>=1.2.0
9
+ Requires-Dist: numpy>=1.24.0
10
+ Requires-Dist: scikit-learn>=1.3.0
11
+ Requires-Dist: fastembed>=0.8.0
12
+ Requires-Dist: truststore>=0.10.0
13
+ Requires-Dist: pydantic>=2.0.0
14
+ Requires-Dist: fastapi>=0.110.0
15
+ Requires-Dist: uvicorn[standard]>=0.27.0
16
+ Requires-Dist: tree-sitter>=0.23.0
17
+ Requires-Dist: tree-sitter-language-pack>=1.9.0
18
+ Provides-Extra: heavy-embed
19
+ Requires-Dist: sentence-transformers>=2.7.0; extra == "heavy-embed"
20
+ Provides-Extra: dev
21
+ Requires-Dist: pytest>=7.0; extra == "dev"
22
+ Dynamic: license-file
23
+
24
+ <div align="center">
25
+
26
+ # Droste
27
+
28
+ ### See your codebase as a living galaxy — and give your agents causal memory of it.
29
+
30
+ Droste indexes any repo into a fractal, zoomable map of its symbols, wires them
31
+ together with their real call / import / DB edges across languages, and serves an
32
+ agent the *causal* slice of code it actually needs — not just keyword matches.
33
+
34
+ **Local-first · zero-config · polyglot · MCP-native**
35
+
36
+ ![Droste fractal code galaxy](docs/assets/hero.gif)
37
+
38
+ *Zooming out reveals the causal web — every cyan arc is a real `syntax_dependency`
39
+ edge. [Full flythrough (FastAPI)](docs/assets/demo.mp4)*
40
+
41
+ [Quickstart](#quickstart) · [Why it's different](#why-its-different) · [How it works](#how-it-works) · [MCP](#use-it-as-an-mcp-server) · [Benchmarks](#benchmarks)
42
+
43
+ </div>
44
+
45
+ ---
46
+
47
+ ## Quickstart
48
+
49
+ ```bash
50
+ pip install -e . # or: pip install droste-memory (once on PyPI)
51
+ droste index . # index the current repo
52
+ droste view # open the fractal galaxy in your browser
53
+ ```
54
+
55
+ Three commands. `droste view` opens a full-screen, 60fps zoomable map of your
56
+ code — scroll to dive from the project star into folder orbits, down to the
57
+ individual functions, with the causal edges glowing between them.
58
+
59
+ Need it for an agent instead of your eyes?
60
+
61
+ ```bash
62
+ droste context "checkout flow" --budget 1500 # causal context slice for an LLM
63
+ ```
64
+
65
+ Running `droste` with no arguments prints the command palette:
66
+
67
+ ```text
68
+ .-----------------------.
69
+ .----' | '----.
70
+ .---' .-----+-----. '---.
71
+ .' .----' | '----. '.
72
+ / .---' .---+---. '---. \
73
+ / .-' .-' | '-. '-. \
74
+ | .' .-' .---+---. '-. '. |
75
+ | / .' .' | '. '. \ |
76
+ | | | | .--+--. | | | |
77
+ | --+--------+-----+---+ @ +---+-----+--------+-- |
78
+ | | | | '--+--' | | | |
79
+ | \ '. '. | .' .' / |
80
+ | '. '-. '---+---' .-' .' |
81
+ \ '-. '-. | .-' .-' /
82
+ \ '---. '---+---' .---' /
83
+ '. '----. | .----' .'
84
+ '---. '-----+-----' .---'
85
+ '----. | .----'
86
+ '-----------------------'
87
+
88
+ DROSTE-MEMORY // RIGID FRACTAL RADIAL LAYOUT
89
+ Local Graph Engine v1.0-Alpha-Sharded
90
+
91
+ Commands
92
+ droste index <path> [--reset]
93
+ droste status
94
+ droste zoom <symbol_name>
95
+ droste context [query] --budget 1500
96
+
97
+ Fast path: droste context hub_core --budget 1000 | clip
98
+ ```
99
+
100
+ ---
101
+
102
+ ## Why it's different
103
+
104
+ Most "code context" tools rank by keyword (ctags / ripgrep / repo-maps) or by
105
+ embedding cosine (vector-RAG). Both can only return what *resembles* your query.
106
+ A caller that shares no tokens — or a database function in a different language —
107
+ is invisible to them, yet it's exactly what you need to understand or change the
108
+ code.
109
+
110
+ Droste's edge is the causal graph:
111
+
112
+ - **Causal wormholes.** Real `syntax_dependency` edges (calls, imports,
113
+ inheritance) in both directions: Droste hands you the callers and callees, ordered,
114
+ within a token budget.
115
+ - **Cross-language bridges.** The part nobody else does well: Droste links across
116
+ languages — app code to SQL functions/tables (`.rpc('x')`, `.from('table')`),
117
+ to edge functions, and same-name handlers between any two languages. Your
118
+ Dart/TS/Python frontend and your database stop being two separate worlds on the
119
+ map.
120
+ - **A map you actually want to look at.** The fractal galaxy isn't a gimmick —
121
+ it's how you see coupling, risk hotspots, and the blast radius of a change.
122
+ - **Zero-config and local.** No cloud, no account, no API key. fastembed (ONNX,
123
+ no torch) gives real semantics; a deterministic fallback keeps it runnable
124
+ anywhere.
125
+
126
+ Polyglot: Python (AST) + tree-sitter for Dart, TypeScript/JavaScript, Go, Rust,
127
+ Java, C#, C/C++, Kotlin, Swift, Ruby, PHP, SQL — symbols *and* edges.
128
+
129
+ > **Honest scope:** the measured advantage is structural / causal retrieval. On
130
+ > pure semantic "concept" queries it's competitive with a vector baseline, not a
131
+ > leap. Cross-language bridges are strongest where the target is actually defined
132
+ > in the indexed repo (e.g. SQL schema in your migrations).
133
+
134
+ ---
135
+
136
+ ## Benchmarks
137
+
138
+ Self-supervised eval (gold = the true caller/callee set from the AST), equal
139
+ retrieval breadth *k*, real embeddings, across Python + Dart repos
140
+ (`eval/comparative_eval.py`):
141
+
142
+ | structural retrieval | Droste | vector-RAG core | lexical core |
143
+ | --- | --- | --- | --- |
144
+ | neighbour-recall | **0.94** | 0.18 | 0.42 |
145
+ | nDCG@k | **0.65** | 0.10 | 0.29 |
146
+
147
+ …plus hundreds of true causal neighbours that both baselines structurally miss.
148
+ This is a retrieval-method comparison (the cores of vector-RAG and lexical
149
+ search), not a head-to-head against the finished products that wrap them.
150
+
151
+ ---
152
+
153
+ ## How it works
154
+
155
+ - **Causal graph.** Each definition is parsed (Python `ast`; tree-sitter for the
156
+ rest) into the names it calls / imports / inherits, becoming first-class
157
+ `syntax_dependency` edges. Cross-language edges add DB calls (`.rpc`, `.from`,
158
+ `.functions.invoke`) and string-literal name matches across languages.
159
+ - **Hybrid seed.** A query is matched by a normalized blend of lexical score and
160
+ semantic cosine (fastembed `bge-small-en-v1.5`, 384-dim), then the graph
161
+ expands the seed bidirectionally (callees and callers).
162
+ - **Token packer.** Results fit a budget with LOD-demotion (full to contract to
163
+ skeleton) and a hard guardrail that never cuts a line of code mid-token.
164
+ - **Sharded persistence.** One shard per file under `.droste/`, blake2b
165
+ dirty-tracking so a re-index rewrites only what changed; atomic writes + meta
166
+ written last, so it is crash-safe and self-heals on the next run.
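
The sharded-persistence bullet above combines two ideas: a blake2b digest decides whether a file's shard is dirty, and shards are replaced atomically. A minimal sketch of that pattern follows. The shard naming, the JSON payload and the in-memory `meta` dict are assumptions for illustration, since the real layout under `.droste/` is not shown in these lines.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def content_digest(data: bytes) -> str:
    """blake2b digest of raw bytes, used here as the dirty-check key."""
    return hashlib.blake2b(data, digest_size=16).hexdigest()


def atomic_write_json(path: Path, payload: dict) -> None:
    """Write to a temp file in the same directory, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic when source and target share a filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def write_shard_if_dirty(shard_dir: Path, rel_path: str, source: bytes, meta: dict) -> bool:
    """Rewrite the shard only when the digest changed; return True if rewritten."""
    digest = content_digest(source)
    if meta.get(rel_path) == digest:
        return False  # unchanged file: leave the shard untouched
    shard_name = content_digest(rel_path.encode("utf-8")) + ".json"  # assumed naming
    atomic_write_json(shard_dir / shard_name, {"path": rel_path, "digest": digest})
    meta[rel_path] = digest  # the caller would persist meta last, after all shards
    return True
```

Persisting the meta record only after every shard is written means a crash leaves stale digests at worst, so the next run sees those files as dirty and rewrites them.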
167
+
168
+ ---
169
+
170
+ ## Use it as an MCP server
171
+
172
+ Droste is a drop-in MCP server — an agent calls it as primary code memory instead
173
+ of blind file reads.
174
+
175
+ ```jsonc
176
+ {
177
+ "mcpServers": {
178
+ "droste": { "command": "python", "args": ["/abs/path/to/droste-memory/server.py"] }
179
+ }
180
+ }
181
+ ```
182
+
183
+ Key tools: `droste_index_project`, `droste_get_context`, `droste_status`.
184
+
185
+ ---
186
+
187
+ ## Development
188
+
189
+ ```bash
190
+ pip install -e ".[dev]"
191
+ pytest # deterministic regression suite (tests/)
192
+ python eval/comparative_eval.py # retrieval benchmark vs lexical & vector cores
193
+ ```
194
+
195
+ `tests/` = invariants + concurrency (round-trip, dirty-oracle, packer guardrail,
196
+ cross-process shard race). `eval/` = performance/quality benchmarks.
197
+
198
+ ---
199
+
200
+ ## Status
201
+
202
+ **v1.0.0 (alpha).** Engine, polyglot + cross-language graph, CLI, fractal
203
+ visualizer and MCP server are working and tested. Packaging/distribution are
204
+ maturing — issues and PRs welcome (see `CONTRIBUTING.md`).
205
+
206
+ ## License
207
+
208
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,185 @@
1
+ <div align="center">
2
+
3
+ # Droste
4
+
5
+ ### See your codebase as a living galaxy — and give your agents causal memory of it.
6
+
7
+ Droste indexes any repo into a fractal, zoomable map of its symbols, wires them
8
+ together with their real call / import / DB edges across languages, and serves an
9
+ agent the *causal* slice of code it actually needs — not just keyword matches.
10
+
11
+ **Local-first · zero-config · polyglot · MCP-native**
12
+
13
+ ![Droste fractal code galaxy](docs/assets/hero.gif)
14
+
15
+ *Zooming out reveals the causal web — every cyan arc is a real `syntax_dependency`
16
+ edge. [Full flythrough (FastAPI)](docs/assets/demo.mp4)*
17
+
18
+ [Quickstart](#quickstart) · [Why it's different](#why-its-different) · [How it works](#how-it-works) · [MCP](#use-it-as-an-mcp-server) · [Benchmarks](#benchmarks)
19
+
20
+ </div>
21
+
22
+ ---
23
+
24
+ ## Quickstart
25
+
26
+ ```bash
27
+ pip install -e . # or: pip install droste-memory (once on PyPI)
28
+ droste index . # index the current repo
29
+ droste view # open the fractal galaxy in your browser
30
+ ```
31
+
32
+ Three commands. `droste view` opens a full-screen, 60fps zoomable map of your
33
+ code — scroll to dive from the project star into folder orbits, down to the
34
+ individual functions, with the causal edges glowing between them.
35
+
36
+ Need it for an agent instead of your eyes?
37
+
38
+ ```bash
39
+ droste context "checkout flow" --budget 1500 # causal context slice for an LLM
40
+ ```
41
+
42
+ Running `droste` with no arguments prints the command palette:
43
+
44
+ ```text
45
+ .-----------------------.
46
+ .----' | '----.
47
+ .---' .-----+-----. '---.
48
+ .' .----' | '----. '.
49
+ / .---' .---+---. '---. \
50
+ / .-' .-' | '-. '-. \
51
+ | .' .-' .---+---. '-. '. |
52
+ | / .' .' | '. '. \ |
53
+ | | | | .--+--. | | | |
54
+ | --+--------+-----+---+ @ +---+-----+--------+-- |
55
+ | | | | '--+--' | | | |
56
+ | \ '. '. | .' .' / |
57
+ | '. '-. '---+---' .-' .' |
58
+ \ '-. '-. | .-' .-' /
59
+ \ '---. '---+---' .---' /
60
+ '. '----. | .----' .'
61
+ '---. '-----+-----' .---'
62
+ '----. | .----'
63
+ '-----------------------'
64
+
65
+ DROSTE-MEMORY // RIGID FRACTAL RADIAL LAYOUT
66
+ Local Graph Engine v1.0-Alpha-Sharded
67
+
68
+ Commands
69
+ droste index <path> [--reset]
70
+ droste status
71
+ droste zoom <symbol_name>
72
+ droste context [query] --budget 1500
73
+
74
+ Fast path: droste context hub_core --budget 1000 | clip
75
+ ```
76
+
77
+ ---
78
+
79
+ ## Why it's different
80
+
81
+ Most "code context" tools rank by keyword (ctags / ripgrep / repo-maps) or by
82
+ embedding cosine (vector-RAG). Both can only return what *resembles* your query.
83
+ A caller that shares no tokens — or a database function in a different language —
84
+ is invisible to them, yet it's exactly what you need to understand or change the
85
+ code.
86
+
87
+ Droste's edge is the causal graph:
88
+
89
+ - **Causal wormholes.** Real `syntax_dependency` edges (calls, imports,
90
+ inheritance) in both directions: Droste hands you the callers and callees, ordered,
91
+ within a token budget.
92
+ - **Cross-language bridges.** The part nobody else does well: Droste links across
93
+ languages — app code to SQL functions/tables (`.rpc('x')`, `.from('table')`),
94
+ to edge functions, and same-name handlers between any two languages. Your
95
+ Dart/TS/Python frontend and your database stop being two separate worlds on the
96
+ map.
97
+ - **A map you actually want to look at.** The fractal galaxy isn't a gimmick —
98
+ it's how you see coupling, risk hotspots, and the blast radius of a change.
99
+ - **Zero-config and local.** No cloud, no account, no API key. fastembed (ONNX,
100
+ no torch) gives real semantics; a deterministic fallback keeps it runnable
101
+ anywhere.
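
To make the "both directions" point concrete, here is a small sketch of expanding a seed symbol over dependency edges to reach callers as well as callees. It is an illustration under assumptions, not the engine's code: `expand_seed`, the `(source, target)` edge tuples and the symbol names in the usage note are all hypothetical.

```python
from collections import defaultdict, deque


def expand_seed(edges: list[tuple[str, str]], seed: str, max_hops: int = 1) -> list[str]:
    """Breadth-first expansion over both callees and callers of `seed`."""
    callees = defaultdict(set)
    callers = defaultdict(set)
    for src, dst in edges:  # an edge means "src depends on dst"
        callees[src].add(dst)
        callers[dst].add(src)

    seen = {seed}
    order = []
    queue = deque([(seed, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in sorted(callees[node] | callers[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append((neighbour, hops + 1))
    return order
```

With edges such as `("checkout_button", "submit_order")` and `("submit_order", "create_invoice_rpc")`, expanding from `submit_order` reaches `checkout_button` even though the two names share no tokens, which is the case keyword and cosine ranking tend to miss.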
102
+
103
+ Polyglot: Python (AST) + tree-sitter for Dart, TypeScript/JavaScript, Go, Rust,
104
+ Java, C#, C/C++, Kotlin, Swift, Ruby, PHP, SQL — symbols *and* edges.
105
+
106
+ > **Honest scope:** the measured advantage is structural / causal retrieval. On
107
+ > pure semantic "concept" queries it's competitive with a vector baseline, not a
108
+ > leap. Cross-language bridges are strongest where the target is actually defined
109
+ > in the indexed repo (e.g. SQL schema in your migrations).
110
+
111
+ ---
112
+
113
+ ## Benchmarks
114
+
115
+ Self-supervised eval (gold = the true caller/callee set from the AST), equal
116
+ retrieval breadth *k*, real embeddings, across Python + Dart repos
117
+ (`eval/comparative_eval.py`):
118
+
119
+ | structural retrieval | Droste | vector-RAG core | lexical core |
120
+ | --- | --- | --- | --- |
121
+ | neighbour-recall | **0.94** | 0.18 | 0.42 |
122
+ | nDCG@k | **0.65** | 0.10 | 0.29 |
123
+
124
+ …plus hundreds of true causal neighbours that both baselines structurally miss.
125
+ This is a retrieval-method comparison (the cores of vector-RAG and lexical
126
+ search), not a head-to-head against the finished products that wrap them.
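
For readers unfamiliar with the two metrics in the table, the sketch below shows one common way to compute them with binary relevance. The exact definitions used by `eval/comparative_eval.py` are not shown in this diff, so the function names and the binary-gain choice are assumptions.

```python
import math


def neighbour_recall(retrieved: list[str], gold: set[str]) -> float:
    """Fraction of gold neighbours that appear anywhere in the retrieved list."""
    if not gold:
        return 0.0
    return len(gold.intersection(retrieved)) / len(gold)


def ndcg_at_k(retrieved: list[str], gold: set[str], k: int) -> float:
    """nDCG@k with binary relevance: gain 1 for a gold neighbour, else 0."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so the top hit is divided by log2(2)
        for rank, item in enumerate(retrieved[:k])
        if item in gold
    )
    ideal_hits = min(k, len(gold))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg else 0.0
```

Recall only asks whether the true neighbours were found, while nDCG@k also rewards placing them near the top, which is why the two rows of the table can differ for the same method.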
127
+
128
+ ---
129
+
130
+ ## How it works
131
+
132
+ - **Causal graph.** Each definition is parsed (Python `ast`; tree-sitter for the
133
+ rest) into the names it calls / imports / inherits, becoming first-class
134
+ `syntax_dependency` edges. Cross-language edges add DB calls (`.rpc`, `.from`,
135
+ `.functions.invoke`) and string-literal name matches across languages.
136
+ - **Hybrid seed.** A query is matched by a normalized blend of lexical score and
137
+ semantic cosine (fastembed `bge-small-en-v1.5`, 384-dim), then the graph
138
+ expands the seed bidirectionally (callees and callers).
139
+ - **Token packer.** Results fit a budget with LOD-demotion (full to contract to
140
+ skeleton) and a hard guardrail that never cuts a line of code mid-token.
141
+ - **Sharded persistence.** One shard per file under `.droste/`, blake2b
142
+ dirty-tracking so a re-index rewrites only what changed; atomic writes + meta
143
+ written last, so it is crash-safe and self-heals on the next run.
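
The hybrid-seed bullet above describes a normalized blend of a lexical score and a semantic cosine. One simple way to realise such a blend is min-max normalization followed by a weighted sum, sketched below. The weight `alpha`, the normalization choice and both function names are assumptions for this example, and the engine may well do this differently.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Rescale scores to [0, 1]; a constant score map becomes all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {key: (value - lo) / span if span else 0.0 for key, value in scores.items()}


def hybrid_rank(
    lexical: dict[str, float],
    semantic: dict[str, float],
    alpha: float = 0.5,
) -> list[tuple[str, float]]:
    """Blend normalized lexical and semantic scores; highest blended score first."""
    lex, sem = min_max(lexical), min_max(semantic)
    blended = {
        key: alpha * lex.get(key, 0.0) + (1.0 - alpha) * sem.get(key, 0.0)
        for key in set(lex) | set(sem)
    }
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))
```

Normalizing first matters because raw lexical scores and cosines live on different scales, so a plain sum would let one signal dominate the other.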
144
+
145
+ ---
146
+
147
+ ## Use it as an MCP server
148
+
149
+ Droste is a drop-in MCP server — an agent calls it as primary code memory instead
150
+ of blind file reads.
151
+
152
+ ```jsonc
153
+ {
154
+ "mcpServers": {
155
+ "droste": { "command": "python", "args": ["/abs/path/to/droste-memory/server.py"] }
156
+ }
157
+ }
158
+ ```
159
+
160
+ Key tools: `droste_index_project`, `droste_get_context`, `droste_status`.
161
+
162
+ ---
163
+
164
+ ## Development
165
+
166
+ ```bash
167
+ pip install -e ".[dev]"
168
+ pytest # deterministic regression suite (tests/)
169
+ python eval/comparative_eval.py # retrieval benchmark vs lexical & vector cores
170
+ ```
171
+
172
+ `tests/` = invariants + concurrency (round-trip, dirty-oracle, packer guardrail,
173
+ cross-process shard race). `eval/` = performance/quality benchmarks.
174
+
175
+ ---
176
+
177
+ ## Status
178
+
179
+ **v1.0.0 (alpha).** Engine, polyglot + cross-language graph, CLI, fractal
180
+ visualizer and MCP server are working and tested. Packaging/distribution are
181
+ maturing — issues and PRs welcome (see `CONTRIBUTING.md`).
182
+
183
+ ## License
184
+
185
+ MIT — see [LICENSE](LICENSE).
@@ -0,0 +1,6 @@
1
+ """Core package for Droste-Memory."""
2
+
3
+ from .droste_engine import DrosteConceptEngine, DrosteNode
4
+ from .droste_ingester import DrosteProjectIngester
5
+
6
+ __all__ = ["DrosteConceptEngine", "DrosteNode", "DrosteProjectIngester"]