D-MemFS 0.2.0__tar.gz

@@ -0,0 +1,413 @@
1
+ Metadata-Version: 2.4
2
+ Name: D-MemFS
3
+ Version: 0.2.0
4
+ Summary: In-process virtual filesystem with hard quota for Python
5
+ Author: D
6
+ License: MIT
7
+ Project-URL: Homepage, https://github.com/nightmarewalker/D-MemFS
8
+ Project-URL: Repository, https://github.com/nightmarewalker/D-MemFS
9
+ Keywords: filesystem,memory,virtual,quota,in-process
10
+ Classifier: Development Status :: 4 - Beta
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: License :: OSI Approved :: MIT License
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Classifier: Programming Language :: Python :: 3.13
17
+ Classifier: Topic :: Software Development :: Libraries
18
+ Classifier: Topic :: System :: Filesystems
19
+ Requires-Python: >=3.11
20
+ Description-Content-Type: text/markdown
21
+ License-File: LICENSE
22
+ Dynamic: license-file
23
+
24
+ # D-MemFS
25
+
26
+ **An in-process virtual filesystem with hard quota enforcement for Python.**
27
+
28
+ [![Python 3.11+](https://img.shields.io/badge/python-3.11%2B-blue.svg)](https://www.python.org/)
29
+ [![Tests](https://github.com/nightmarewalker/D-MemFS/actions/workflows/test.yml/badge.svg)](https://github.com/nightmarewalker/D-MemFS/actions/workflows/test.yml)
30
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
31
+ [![Zero dependencies (runtime)](https://img.shields.io/badge/runtime_deps-none-brightgreen.svg)]()
32
+
33
+ Languages: [English](./README.md) | [Japanese](./README_ja.md)
34
+
35
+ ---
36
+
37
+ ## Why MFS?
38
+
39
+ `MemoryFileSystem` gives you a fully isolated filesystem-like workspace inside a Python process.
40
+
41
+ - Hard quota (`MFSQuotaExceededError`) to reject oversized writes before OOM
42
+ - Hierarchical directories and multi-file operations (`import_tree`, `copy_tree`, `move`)
43
+ - File-level RW locking + global structure lock for thread-safe operations
44
+ - Free-threaded Python compatible (`PYTHON_GIL=0`) — stress-tested under 50-thread contention
45
+ - Async wrapper (`AsyncMemoryFileSystem`) powered by `asyncio.to_thread`
46
+ - Zero runtime dependencies (standard library only)
47
+
48
+ This is useful when `io.BytesIO` is too primitive (a single buffer) and OS-level RAM disks or tmpfs are impractical (permissions, container policy, Windows driver friction).
49
+
50
+ ---
51
+
52
+ ## Installation
53
+
54
+ ```bash
55
+ pip install D-MemFS
56
+ ```
57
+
58
+ Requirements: Python 3.11+
59
+
60
+ ---
61
+
62
+ ## Quick Start
63
+
64
+ ```python
+ from dmemfs import MemoryFileSystem, MFSQuotaExceededError
+
+ mfs = MemoryFileSystem(max_quota=64 * 1024 * 1024)  # 64 MiB hard quota
+
+ mfs.mkdir("/data")
+ with mfs.open("/data/hello.bin", "wb") as f:
+     f.write(b"hello")
+
+ with mfs.open("/data/hello.bin", "rb") as f:
+     print(f.read())  # b"hello"
+
+ print(mfs.listdir("/data"))
+ print(mfs.is_file("/data/hello.bin"))  # True
+
+ try:
+     with mfs.open("/huge.bin", "wb") as f:
+         # 512 MiB exceeds the 64 MiB quota, so the write is rejected
+         f.write(bytes(512 * 1024 * 1024))
+ except MFSQuotaExceededError as e:
+     print(e)
+ ```
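The quota rejection in the Quick Start can be pictured as a counter that reserves bytes before any data is stored. The sketch below is an illustration using only the standard library, not D-MemFS's actual implementation: `QuotaTracker`, `reserve`, `release`, and `QuotaExceeded` are hypothetical names standing in for whatever the library does internally.

```python
import threading


class QuotaExceeded(Exception):
    """Stand-in for MFSQuotaExceededError in this sketch."""


class QuotaTracker:
    """Hypothetical byte budget: reserve first, store data afterwards."""

    def __init__(self, max_bytes: int) -> None:
        self._max = max_bytes
        self._used = 0
        self._lock = threading.Lock()

    def reserve(self, nbytes: int) -> None:
        # Check and update under one lock so concurrent writers cannot
        # both pass the check and overshoot the budget together.
        with self._lock:
            if self._used + nbytes > self._max:
                available = self._max - self._used
                raise QuotaExceeded(
                    f"requested {nbytes} bytes, {available} available"
                )
            self._used += nbytes

    def release(self, nbytes: int) -> None:
        with self._lock:
            self._used = max(0, self._used - nbytes)

    @property
    def used(self) -> int:
        return self._used


tracker = QuotaTracker(max_bytes=1024)
tracker.reserve(1000)          # fits within the budget
try:
    tracker.reserve(100)       # would exceed 1024, so it is rejected
except QuotaExceeded as exc:
    rejected = str(exc)
tracker.release(1000)          # e.g. after the file is removed
```

The point of reserving before storing is that a rejected write leaves the accounted usage unchanged.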
85
+
86
+ ---
87
+
88
+ ## API Highlights
89
+
90
+ ### `MemoryFileSystem`
91
+
92
+ - `open(path, mode, *, preallocate=0, lock_timeout=None)`
93
+ - `mkdir`, `remove`, `rmtree`, `rename`, `move`, `copy`, `copy_tree`
94
+ - `listdir`, `exists`, `is_dir`, `is_file`, `walk`, `glob`
95
+ - `stat`, `stats`, `get_size`
96
+ - `export_as_bytesio`, `export_tree`, `iter_export_tree`, `import_tree`
97
+
98
+ **Constructor parameters:**
99
+ - `max_quota` (default `256 MiB`): byte quota for file data
100
+ - `max_nodes` (default `None`): optional cap on total node count (files + directories). Raises `MFSNodeLimitExceededError` when exceeded.
101
+ - `default_storage` (default `"auto"`): storage backend for new files — `"auto"` / `"sequential"` / `"random_access"`
102
+ - `promotion_hard_limit` (default `None`): byte threshold above which Sequential→RandomAccess auto-promotion is suppressed (`None` uses the built-in 512 MiB limit)
103
+ - `chunk_overhead_override` (default `None`): override the per-chunk overhead estimate used for quota accounting
104
+
105
+ > **Note:** The `BytesIO` returned by `export_as_bytesio()` is outside quota management.
106
+ > Exporting large files may consume significant process memory beyond the configured quota limit.
107
+
108
+ Supported binary modes: `rb`, `wb`, `ab`, `r+b`, `xb`
109
+
110
+ ### `MemoryFileHandle`
111
+
112
+ - `read`, `write`, `seek`, `tell`, `truncate`, `flush`, `close`
113
+ - file-like capability checks: `readable`, `writable`, `seekable`
114
+
115
+ `flush()` is intentionally a no-op (compatibility API for file-like integrations).
116
+
117
+ ### `stat()` return (`MFSStatResult`)
118
+
119
+ `size`, `created_at`, `modified_at`, `generation`, `is_dir`
120
+
121
+ - Supports both files and directories
122
+ - For directories: `size=0`, `generation=0`, `is_dir=True`
123
+
124
+ ---
125
+
126
+ ## Text Mode
127
+
128
+ D-MemFS natively operates in binary mode. For text I/O, use `MFSTextHandle`:
129
+
130
+ ```python
+ from dmemfs import MemoryFileSystem, MFSTextHandle
+
+ mfs = MemoryFileSystem()
+ mfs.mkdir("/data")
+
+ # Write text
+ with mfs.open("/data/hello.bin", "wb") as f:
+     th = MFSTextHandle(f, encoding="utf-8")
+     th.write("こんにちは世界\n")
+     th.write("Hello, World!\n")
+
+ # Read text line by line
+ with mfs.open("/data/hello.bin", "rb") as f:
+     th = MFSTextHandle(f, encoding="utf-8")
+     for line in th:
+         print(line, end="")
+ ```
148
+
149
+ `MFSTextHandle` is a thin, bufferless wrapper. It encodes on `write()` and decodes on `read()` / `readline()`. Unlike `io.TextIOWrapper`, it introduces no buffering issues when used with `MemoryFileHandle`.
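To make the "encode on write, decode on read" idea concrete, here is a minimal sketch of a bufferless text wrapper. It is not the `MFSTextHandle` source: `TinyTextHandle` is a hypothetical class, and it wraps `io.BytesIO` (whose `readline()` is used directly) rather than a `MemoryFileHandle`. Splitting on the newline byte is safe for UTF-8, but this shortcut would not hold for encodings such as UTF-16.

```python
import io


class TinyTextHandle:
    """Hypothetical bufferless text wrapper over a binary file-like object."""

    def __init__(self, raw, encoding: str = "utf-8") -> None:
        self._raw = raw
        self._encoding = encoding

    def write(self, text: str) -> int:
        # Encode and hand the bytes straight to the binary handle;
        # nothing is held back in a wrapper-level buffer.
        self._raw.write(text.encode(self._encoding))
        return len(text)

    def readline(self) -> str:
        return self._raw.readline().decode(self._encoding)

    def __iter__(self):
        return self

    def __next__(self) -> str:
        line = self.readline()
        if not line:
            raise StopIteration
        return line


buf = io.BytesIO()
writer = TinyTextHandle(buf)
writer.write("こんにちは世界\n")
writer.write("Hello, World!\n")

buf.seek(0)
lines = list(TinyTextHandle(buf))
```

Because every `write()` reaches the underlying object immediately, there is no separate flush step that could be forgotten.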
150
+
151
+ ---
152
+
153
+ ## Use Case Tutorials
154
+
155
+ ### ETL Staging
156
+
157
+ Stage data through raw → processed → output directories:
158
+
159
+ ```python
+ from dmemfs import MemoryFileSystem
+
+ mfs = MemoryFileSystem(max_quota=16 * 1024 * 1024)
+ mfs.mkdir("/raw")
+ mfs.mkdir("/processed")
+
+ raw_data = b"id,name,value\n1,foo,100\n2,bar,200\n"
+ with mfs.open("/raw/data.csv", "wb") as f:
+     f.write(raw_data)
+
+ with mfs.open("/raw/data.csv", "rb") as f:
+     data = f.read()
+
+ with mfs.open("/processed/data.csv", "wb") as f:
+     f.write(data.upper())
+
+ mfs.rmtree("/raw")  # cleanup staging
+ ```
178
+
179
+ ### Archive-like Operations
180
+
181
+ Store, list, and export multiple files as a tree:
182
+
183
+ ```python
+ from dmemfs import MemoryFileSystem
+
+ mfs = MemoryFileSystem()
+ mfs.import_tree({
+     "/archive/doc1.bin": b"Document 1",
+     "/archive/doc2.bin": b"Document 2",
+     "/archive/sub/doc3.bin": b"Document 3",
+ })
+
+ print(mfs.listdir("/archive"))  # ['doc1.bin', 'doc2.bin', 'sub']
+
+ snapshot = mfs.export_tree(prefix="/archive")  # dict of {path: bytes}
+ ```
197
+
198
+ ### SQLite Snapshot
199
+
200
+ Serialize an in-memory SQLite DB into MFS and restore it later:
201
+
202
+ ```python
+ import sqlite3
+ from dmemfs import MemoryFileSystem
+
+ mfs = MemoryFileSystem()
+ conn = sqlite3.connect(":memory:")
+ conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")
+ conn.execute("INSERT INTO t VALUES (1, 'hello')")
+ conn.commit()
+
+ with mfs.open("/snapshot.db", "wb") as f:
+     f.write(conn.serialize())
+ conn.close()
+
+ with mfs.open("/snapshot.db", "rb") as f:
+     raw = f.read()
+ restored = sqlite3.connect(":memory:")
+ restored.deserialize(raw)
+ rows = restored.execute("SELECT * FROM t").fetchall()  # [(1, 'hello')]
+ ```
222
+
223
+ ---
224
+
225
+ ## Concurrency and Locking Notes
226
+
227
+ - Path/tree operations are guarded by `_global_lock`.
228
+ - File access is guarded by per-file `ReadWriteLock`.
229
+ - `lock_timeout` behavior:
230
+   - `None`: block indefinitely
+   - `0.0`: try-lock (fail immediately with `BlockingIOError`)
+   - `> 0`: timeout in seconds, then `BlockingIOError`
233
+ - The current `ReadWriteLock` is not fair: under sustained read load, writers can starve.
234
+
235
+ Operational guidance:
236
+
237
+ - Keep lock hold duration short
238
+ - Set an explicit `lock_timeout` in latency-sensitive code paths
239
+ - `walk()` and `glob()` provide weak consistency: each directory level is snapshotted under `_global_lock`, but the overall traversal is NOT atomic. Concurrent structural changes may produce inconsistent results.
242
+
243
+ ---
244
+
245
+ ## Async Usage
246
+
247
+ ```python
+ from dmemfs import AsyncMemoryFileSystem
+
+ async def run() -> None:
+     mfs = AsyncMemoryFileSystem(max_quota=64 * 1024 * 1024)
+     await mfs.mkdir("/a")
+     async with await mfs.open("/a/f.bin", "wb") as f:
+         await f.write(b"data")
+     async with await mfs.open("/a/f.bin", "rb") as f:
+         print(await f.read())
+ ```
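The README describes the async wrapper as powered by `asyncio.to_thread`. As a rough sketch of that pattern, the snippet below wraps a blocking `io.BytesIO` so that each call runs in a worker thread. `AsyncBytesFile` is a hypothetical class for illustration and says nothing about how `AsyncMemoryFileSystem` is actually structured.

```python
import asyncio
import io


class AsyncBytesFile:
    """Hypothetical async facade over a blocking file-like object."""

    def __init__(self, raw: io.BytesIO) -> None:
        self._raw = raw

    async def write(self, data: bytes) -> int:
        # Run the blocking call in a worker thread so the event loop
        # stays free to schedule other tasks in the meantime.
        return await asyncio.to_thread(self._raw.write, data)

    async def seek(self, pos: int) -> int:
        return await asyncio.to_thread(self._raw.seek, pos)

    async def read(self) -> bytes:
        return await asyncio.to_thread(self._raw.read)


async def demo() -> bytes:
    f = AsyncBytesFile(io.BytesIO())
    await f.write(b"data")
    await f.seek(0)
    return await f.read()


result = asyncio.run(demo())
```

This style adds thread hand-off overhead per call, so it suits cases where the wrapped operation may block on a lock rather than tight loops of tiny writes.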
258
+
259
+ ---
260
+
261
+ ## Benchmarks
262
+
263
+ Minimal benchmark tooling is included:
264
+
265
+ - MFS vs `io.BytesIO` vs `PyFilesystem2 (MemoryFS)` vs `tempfile`
266
+ - Cases: many-small-files and stream write/read
267
+ - Optional report output to `benchmarks/results/`
268
+
269
+ > **Note:** As of setuptools 82 (February 2026), `pyfilesystem2` fails to import due to a known upstream issue ([#597](https://github.com/PyFilesystem/pyfilesystem2/issues/597)). Benchmark results including PyFilesystem2 were measured with setuptools ≤ 81 and are valid as historical comparison data.
270
+
271
+ Run:
272
+
273
+ ```bash
274
+ uvx --with-requirements requirements.txt --with-editable . python benchmarks/compare_backends.py --save-md auto --save-json auto
275
+ ```
276
+
277
+ See `BENCHMARK.md` for details.
278
+
279
+ Latest benchmark snapshot:
280
+
281
+ - [benchmark_current_result.md](./benchmarks/results/benchmark_current_result.md)
282
+
283
+ ---
284
+
285
+ ## Testing and Coverage
286
+
287
+ Test execution and dev flow are documented in `TESTING.md`.
288
+
289
+ Typical local run:
290
+
291
+ ```bash
292
+ uv pip compile requirements.in -o requirements.txt
293
+ uvx --with-requirements requirements.txt --with-editable . pytest tests/ -v --timeout=30 --cov=dmemfs --cov-report=xml --cov-report=term-missing
294
+ ```
295
+
296
+ CI (`.github/workflows/test.yml`) runs tests with coverage XML generation.
297
+
298
+ ---
299
+
300
+ ## API Docs Generation
301
+
302
+ API docs can be generated as Markdown (viewable on GitHub) using `pydoc-markdown`:
303
+
304
+ ```bash
305
+ uvx --with pydoc-markdown --with-editable . pydoc-markdown '{
306
+ loaders: [{type: python, search_path: [.]}],
307
+ processors: [{type: filter, expression: "default()"}],
308
+ renderer: {type: markdown, filename: docs/api_md/index.md}
309
+ }'
310
+ ```
311
+
312
+ Or as HTML using `pdoc` (local browsing only):
313
+
314
+ ```bash
315
+ uvx --with-requirements requirements.txt pdoc dmemfs -o docs/api
316
+ ```
317
+
318
+ - [API Reference (Markdown)](./docs/api_md/index.md)
319
+
320
+ ---
321
+
322
+ ## Compatibility and Non-Goals
323
+
324
+ - Core `open()` is binary-only (`rb`, `wb`, `ab`, `r+b`, `xb`). Text I/O is available via the `MFSTextHandle` wrapper.
325
+ - No symlink/hardlink support — intentionally omitted to eliminate path traversal loops and structural complexity (same rationale as `pathlib.PurePath`).
326
+ - No direct `pathlib.Path` / `os.PathLike` API — MFS paths are virtual and must not be confused with host filesystem paths. Accepting `os.PathLike` would allow third-party libraries or a plain `open()` call to silently treat an MFS virtual path as a real OS path, potentially issuing unintended syscalls against the host filesystem. All paths must be plain `str` with POSIX-style absolute notation (e.g. `"/data/file.txt"`).
327
+ - No kernel filesystem integration (intentionally in-process only)
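The "plain `str`, POSIX-style absolute path" rule above can be illustrated with a small validator. This is a sketch under assumptions: `normalize_virtual_path` is a hypothetical function, and the choice of `TypeError` for non-`str` input and of collapsing `.` and `..` with `posixpath.normpath` is an assumption for illustration, not documented D-MemFS behavior.

```python
import pathlib
import posixpath


def normalize_virtual_path(path: object) -> str:
    """Hypothetical validator: accept only absolute POSIX-style str paths."""
    if not isinstance(path, str):
        # Deliberately rejects pathlib.Path and other os.PathLike objects,
        # so a virtual path is never mistaken for a host path.
        raise TypeError(f"path must be str, got {type(path).__name__}")
    if not path.startswith("/"):
        raise ValueError(f"path must be absolute: {path!r}")
    return posixpath.normpath(path)


clean = normalize_virtual_path("/data/./file.txt")

try:
    normalize_virtual_path(pathlib.PurePosixPath("/data/file.txt"))
except TypeError as exc:
    type_error = str(exc)

try:
    normalize_virtual_path("data/file.txt")
except ValueError as exc:
    value_error = str(exc)
```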
328
+
329
+ Auto-promotion behavior:
330
+
331
+ - By default (`default_storage="auto"`), new files start as `SequentialMemoryFile` and auto-promote to `RandomAccessMemoryFile` when random writes are detected.
332
+ - Promotion is one-way (no downgrade back to sequential).
333
+ - Use `default_storage="sequential"` or `"random_access"` to fix the backend at construction; use `promotion_hard_limit` to suppress auto-promotion above a byte threshold.
334
+ - Storage promotion temporarily doubles memory usage for the promoted file. The quota system accounts for this, but process-level memory may spike briefly.
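One way to picture the auto-promotion rules above is a toy file body that stays an append-only chunk list until the first non-append write, then copies itself into a single mutable buffer. This is only a conceptual sketch: `ToyFile` and `write_at` are hypothetical, and the real `SequentialMemoryFile` and `RandomAccessMemoryFile` classes may work quite differently.

```python
class ToyFile:
    """Hypothetical file body: chunk list first, bytearray after promotion."""

    def __init__(self) -> None:
        self._chunks: list[bytes] = []        # sequential representation
        self._flat: bytearray | None = None   # random-access representation
        self._size = 0

    @property
    def promoted(self) -> bool:
        return self._flat is not None

    def write_at(self, offset: int, data: bytes) -> None:
        if self._flat is None and offset == self._size:
            # Pure append: stay sequential, no copying needed.
            self._chunks.append(data)
            self._size += len(data)
            return
        if self._flat is None:
            # First non-append write: copy all chunks into one buffer.
            # Old chunks and the new buffer coexist briefly, which is
            # the temporary doubling of memory mentioned above.
            self._flat = bytearray(b"".join(self._chunks))
            self._chunks.clear()
        end = offset + len(data)
        if end > len(self._flat):
            self._flat.extend(bytes(end - len(self._flat)))  # zero-fill gap
        self._flat[offset:end] = data
        self._size = len(self._flat)

    def getvalue(self) -> bytes:
        if self._flat is not None:
            return bytes(self._flat)
        return b"".join(self._chunks)


f = ToyFile()
f.write_at(0, b"hello ")
f.write_at(6, b"world")
before = f.promoted          # still sequential after two appends
f.write_at(0, b"J")          # overwrite in the middle triggers promotion
after = f.promoted
```

In this toy model promotion is one-way as well: once `_flat` exists, later appends also go through the buffer.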
335
+
336
+ Security note: In-memory data may be written to physical disk via OS swap or core dumps. MFS does not provide memory-locking (e.g., mlock) or secure erasure. Do not rely on MFS alone for sensitive data isolation.
339
+
340
+ ---
341
+
342
+ ## Exception Reference
343
+
344
+ | Exception | Typical cause |
+ |---|---|
+ | `MFSQuotaExceededError` | write/import/copy would exceed quota |
+ | `MFSNodeLimitExceededError` | node count would exceed `max_nodes` (subclass of `MFSQuotaExceededError`) |
+ | `FileNotFoundError` | path missing |
+ | `FileExistsError` | creation target already exists |
+ | `IsADirectoryError` | file operation on directory |
+ | `NotADirectoryError` | directory operation on file |
+ | `BlockingIOError` | lock timeout or open-file conflict |
+ | `io.UnsupportedOperation` | mode mismatch / unsupported operation |
+ | `ValueError` | invalid mode/path/seek/truncate arguments |
355
+
356
+ ---
357
+
358
+ ## Testing with pytest
359
+
360
+ D-MemFS ships a pytest plugin that provides an `mfs` fixture:
361
+
362
+ ```python
363
+ # conftest.py — register the plugin explicitly
364
+ pytest_plugins = ["dmemfs._pytest_plugin"]
365
+ ```
366
+
367
+ > **Note:** The plugin is **not** auto-discovered. Users must declare it in `conftest.py` to opt in.
368
+
369
+ ```python
+ # test_example.py
+ def test_write_read(mfs):
+     mfs.mkdir("/tmp")
+     with mfs.open("/tmp/hello.txt", "wb") as f:
+         f.write(b"hello")
+     with mfs.open("/tmp/hello.txt", "rb") as f:
+         assert f.read() == b"hello"
+ ```
378
+
379
+ ---
380
+
381
+ ## Development Notes
382
+
383
+ Design documents (Japanese):
384
+
385
+ - [Architecture Spec v13](./docs/design/spec_v13.md) — API design, internal structure, CI matrix
386
+ - [Detailed Design Spec](./docs/design/DetailedDesignSpec.md) — component-level design and rationale
387
+ - [Test Design Spec](./docs/design/DetailedDesignSpec_test.md) — test case table and pseudocode
388
+
389
+ > These documents are written in Japanese and serve as internal design references.
390
+
391
+ ---
392
+
393
+ ## Performance Summary
394
+
395
+ Key results from the included benchmark (300 small files × 4 KiB, 16 MiB stream, 2 GiB large stream):
396
+
397
+ | Case | MFS (ms) | BytesIO (ms) | tempfile (ms) |
+ |---|---:|---:|---:|
+ | small_files_rw | 34 | 5 | 164 |
+ | stream_write_read | 64 | 51 | 17 |
+ | random_access_rw | **24** | 53 | 27 |
+ | large_stream_write_read | **1 438** | 7 594 | 1 931 |
+ | many_files_random_read | 777 | 163 | 4 745 |
404
+
405
+ MFS is slower than `BytesIO` on tiny-file workloads (34 ms vs 5 ms for `small_files_rw`, 777 ms vs 163 ms for `many_files_random_read`) but faster on large streams (1 438 ms vs 7 594 ms) and random-access patterns (24 ms vs 53 ms). See `BENCHMARK.md` and [benchmark_current_result.md](./benchmarks/results/benchmark_current_result.md) for full data.
406
+
407
+ > **Note:** `tempfile` results above were measured with the system temp directory on a RAM disk. On a physical SSD/HDD, `tempfile` performance will be substantially slower.
408
+
409
+ ---
410
+
411
+ ## License
412
+
413
+ MIT License
@@ -0,0 +1,20 @@
1
+ LICENSE
2
+ README.md
3
+ pyproject.toml
4
+ D_MemFS.egg-info/PKG-INFO
5
+ D_MemFS.egg-info/SOURCES.txt
6
+ D_MemFS.egg-info/dependency_links.txt
7
+ D_MemFS.egg-info/top_level.txt
8
+ dmemfs/__init__.py
9
+ dmemfs/_async.py
10
+ dmemfs/_exceptions.py
11
+ dmemfs/_file.py
12
+ dmemfs/_fs.py
13
+ dmemfs/_handle.py
14
+ dmemfs/_lock.py
15
+ dmemfs/_path.py
16
+ dmemfs/_pytest_plugin.py
17
+ dmemfs/_quota.py
18
+ dmemfs/_text.py
19
+ dmemfs/_typing.py
20
+ dmemfs/py.typed
@@ -0,0 +1 @@
1
+ dmemfs
d_memfs-0.2.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 D
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.