persidict 0.37.2__tar.gz → 0.103.0__tar.gz


Potentially problematic release: this version of persidict might be problematic.

@@ -1,12 +1,12 @@
  Metadata-Version: 2.3
  Name: persidict
- Version: 0.37.2
+ Version: 0.103.0
  Summary: Simple persistent key-value store for Python. Values are stored as files on a disk or as S3 objects on AWS cloud.
  Keywords: persistence,dicts,distributed,parallel
  Author: Vlad (Volodymyr) Pavlov
  Author-email: Vlad (Volodymyr) Pavlov <vlpavlov@ieee.org>
  License: MIT
- Classifier: Development Status :: 3 - Alpha
+ Classifier: Development Status :: 4 - Beta
  Classifier: Intended Audience :: Developers
  Classifier: Intended Audience :: Science/Research
  Classifier: Programming Language :: Python
@@ -18,11 +18,12 @@ Classifier: Topic :: Software Development :: Libraries :: Python Modules
  Requires-Dist: parameterizable
  Requires-Dist: lz4
  Requires-Dist: joblib
- Requires-Dist: numpy
- Requires-Dist: pandas
  Requires-Dist: jsonpickle
  Requires-Dist: deepdiff
+ Requires-Dist: boto3
  Requires-Dist: boto3 ; extra == 'aws'
+ Requires-Dist: numpy ; extra == 'dev'
+ Requires-Dist: pandas ; extra == 'dev'
  Requires-Dist: boto3 ; extra == 'dev'
  Requires-Dist: moto ; extra == 'dev'
  Requires-Dist: pytest ; extra == 'dev'
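
In the hunk above, `numpy` and `pandas` move from the required dependencies to the `dev` extra, while `boto3` becomes required. An environment that installs only `persidict` may therefore no longer receive `numpy` or `pandas` transitively. A minimal sketch, using only the standard library, of how downstream code might detect this; the helper name `missing_packages` is hypothetical and not part of persidict:

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names from `names` that are not importable here."""
    return [name for name in names if find_spec(name) is None]


# numpy and pandas are the two packages this release stops installing by default.
missing = missing_packages(["numpy", "pandas"])
if missing:
    print("Install explicitly if you need them:", ", ".join(missing))
```

The check only looks up the import machinery; it does not import the packages themselves.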
@@ -44,7 +45,8 @@ storing each value as its own file or S3 object. Keys are limited to
  text strings or sequences of strings.
 
  In contrast to traditional persistent dictionaries (e.g., Python’s `shelve)`,
- `persidict` is designed for distributed environments where multiple processes
+ `persidict` is [designed](https://github.com/pythagoras-dev/persidict/blob/master/design_principles.md)
+ for distributed environments where multiple processes
  on different machines concurrently work with the same store.
 
  ## 2. Why Use It?
@@ -160,8 +162,8 @@ print(f"API Key: {cloud_config['api_key']}")
  You can also constrain values to a specific class.
  * **Order**: Insertion order is not preserved.
  * **Additional Methods**: `PersiDict` provides extra methods not in the standard
- dict API, such as `timestamp()`, `random_key()`, `newest_keys()`, `subdicts()`
- , `delete_if_exists()`, `get_params()` and more.
+ dict API, such as `timestamp()`, `etag()`, `random_key()`, `newest_keys()`
+ , `subdicts()`, `discard()`, `get_params()` and more.
  * **Special Values**: Use `KEEP_CURRENT` to avoid updating a value
  and `DELETE_CURRENT` to delete a value during an assignment.
 
@@ -172,13 +174,14 @@ and `DELETE_CURRENT` to delete a value during an assignment.
  * **`PersiDict`**: The abstract base class that defines the common interface
  for all persistent dictionaries in the package. It's the foundation
  upon which everything else is built.
- * **`PersiDictKey`**: A type hint that specifies what can be used
- as a key in any `PersiDict`. It can be a `SafeStrTuple`, a single string,
+ * **`NonEmptyPersiDictKey`**: A type hint that specifies what can be used
+ as a key in any `PersiDict`. It can be a `NonEmptySafeStrTuple`, a single string,
  or a sequence of strings. When a `PesiDict` method requires a key as an input,
- it will accept any of these types and convert them to a `SafeStrTuple` internally.
- * **`SafeStrTuple`**: The core data structure for keys. It's an immutable,
- flat tuple of non-empty, URL/filename-safe strings, ensuring that
- keys are consistent and safe for various storage backends.
+ it will accept any of these types and convert them to
+ a `NonEmptySafeStrTuple` internally.
+ * **`NonEmptySafeStrTuple`**: The core data structure for keys.
+ It's an immutable, flat tuple of non-empty, URL/filename-safe strings,
+ ensuring that keys are consistent and safe for various storage backends.
  When a `PersiDict` method returns a key, it will always be in this format.
 
  ### 5.2 Main Implementations
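
The key handling described in the hunk above (a single string or a sequence of strings is converted to a flat tuple of non-empty, URL/filename-safe strings) can be illustrated with a small standalone sketch. It does not use persidict; the function `normalize_key` and the `SAFE_CHARS` alphabet are assumptions made for this illustration, and the package's real character set is whatever its own `get_safe_chars()` returns.

```python
import string

# Assumed safe alphabet for this sketch only; not taken from persidict.
SAFE_CHARS = set(string.ascii_letters + string.digits + "()_-~.=")


def normalize_key(key):
    """Turn a string or a sequence of strings into a flat tuple of safe strings."""
    parts = (key,) if isinstance(key, str) else tuple(key)
    if not parts:
        raise ValueError("a key must contain at least one string")
    for part in parts:
        if not isinstance(part, str) or not part:
            raise ValueError(f"key component {part!r} must be a non-empty string")
        if not set(part) <= SAFE_CHARS:
            raise ValueError(f"key component {part!r} contains unsafe characters")
    return parts


print(normalize_key("users"))             # ('users',)
print(normalize_key(["users", "alice"]))  # ('users', 'alice')
```

Rejecting empty sequences and empty components up front mirrors the "non-empty" part of the renamed types.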
@@ -191,24 +194,25 @@ suitable for distributed environments.
 
  ### 5.3 Key Parameters
 
- * **`file_type`**: A key parameter for `FileDirDict` and `S3Dict` that
- determines the serialization format for values.
- Common options are `"pkl"` (pickle) and `"json"`.
- Any other value is treated as plain text for string storage.
+ * **`serialization_format`**: A key parameter for `FileDirDict` and `S3Dict` that
+ determines the serialization format used to store values.
+ Common options are `"pkl"` (pickle) and `"json"`.
+ Any other value is treated as plain text for string storage.
  * **`base_class_for_values`**: An optional parameter for any `PersiDict`
  that enforces type checking on all stored values, ensuring they are
  instances of a specific class.
- * **`immutable_items`**: A boolean parameter that can make a `PersiDict`
- "write-once," preventing any modification or deletion of existing items.
+ * **`append_only`**: A boolean parameter that makes items inside a `PersiDict` immutable,
+ preventing them from modification or deletion.
  * **`digest_len`**: An integer that specifies the length of a hash suffix
- added to key components to prevent collisions on case-insensitive file systems.
+ added to key components in `FileDirDict` to prevent collisions
+ on case-insensitive file systems.
  * **`base_dir`**: A string specifying the directory path where a `FileDirDict`
  stores its files. For `S3Dict`, this directory is used to cache files locally.
  * **`bucket_name`**: A string specifying the name of the S3 bucket where
  an `S3Dict` stores its objects.
  * **`region`**: An optional string specifying the AWS region for the S3 bucket.
 
- ### 5.4 Advanced Classes
+ ### 5.4 Advanced and Supporting Classes
 
  * **`WriteOnceDict`**: A wrapper that enforces write-once behavior
  on any `PersiDict`, ignoring subsequent writes to the same key.
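
The `digest_len` entry in the hunk above describes a hash suffix that keeps key components distinct on case-insensitive file systems. The idea can be sketched in a few lines; the function `add_digest_suffix`, the choice of SHA-256, and the `_` separator are assumptions made for this illustration and may differ from what `FileDirDict` actually does.

```python
import hashlib


def add_digest_suffix(component, digest_len=4):
    """Append a short hash of the exact (case-sensitive) spelling of `component`."""
    if digest_len <= 0:
        return component
    digest = hashlib.sha256(component.encode("utf-8")).hexdigest()[:digest_len]
    return f"{component}_{digest}"


a = add_digest_suffix("Report")
b = add_digest_suffix("report")
# The inputs differ only by case. Their digests are computed from the exact
# spelling, so the resulting names are very likely to differ even after the
# file system folds case; a longer digest_len makes a clash less likely.
print(a, b)
```

A longer suffix lowers the chance of a clash at the cost of longer file names, which is presumably why the length is configurable.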
@@ -216,7 +220,13 @@ It also allows for random consistency checks to ensure subsequent
  writes to the same key always match the original value.
  * **`OverlappingMultiDict`**: An advanced container that holds
  multiple `PersiDict` instances sharing the same storage
- but with different `file_type`s.
+ but with different `serialization_format`s.
+ * **`LocalDict`**: An in-memory `PersiDict` backed by
+ a RAM-only hierarchical store.
+ * **`EmptyDict`**: A minimal implementation of `PersiDict` that behaves
+ like a null device in OS - accepts all writes but discards them,
+ returns nothing on reads. Always appears empty
+ regardless of operations performed on it.
 
  ### 5.5 Special "Joker" Values
 
@@ -241,7 +251,7 @@ from the dictionary when assigned to a key.
  | `newest_values(max_n=None)` | `list[Any]` | Returns a list of values corresponding to the newest keys. |
  | `get_subdict(prefix_key)` | `PersiDict` | Returns a new `PersiDict` instance that provides a view into a subset of keys sharing a common prefix. |
  | `subdicts()` | `dict[str, PersiDict]` | Returns a dictionary mapping all first-level key prefixes to their corresponding sub-dictionary views. |
- | `delete_if_exists(key)` | `bool` | Deletes a key-value pair if it exists and returns `True`; otherwise, returns `False`. |
+ | `discard(key)` | `bool` | Deletes a key-value pair if it exists and returns `True`; otherwise, returns `False`. |
  | `get_params()` | `dict` | Returns a dictionary of the instance's configuration parameters, supporting the `parameterizable` API. |
 
  ## 7. Installation
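
The rename from `delete_if_exists(key)` to `discard(key)` in the hunk above keeps the documented contract: delete the pair if it exists and return `True`, otherwise return `False`. The toy class below sketches that contract together with the `KEEP_CURRENT` and `DELETE_CURRENT` joker semantics the README describes. It is an in-memory stand-in written for this illustration; `ToyDict`, `_Joker`, and the two local sentinels are hypothetical and are not persidict's own objects.

```python
class _Joker:
    """Sentinel whose assignment changes what a write does."""

    def __init__(self, name):
        self._name = name

    def __repr__(self):
        return self._name


# Local stand-ins for the package's joker values (illustration only).
KEEP_CURRENT = _Joker("KEEP_CURRENT")
DELETE_CURRENT = _Joker("DELETE_CURRENT")


class ToyDict(dict):
    """In-memory stand-in used only to illustrate the semantics."""

    def __setitem__(self, key, value):
        if value is KEEP_CURRENT:
            return  # leave whatever is stored (or absent) untouched
        if value is DELETE_CURRENT:
            self.discard(key)
            return
        super().__setitem__(key, value)

    def discard(self, key):
        """Delete `key` if present; report whether anything was deleted."""
        if key in self:
            super().__delitem__(key)
            return True
        return False


d = ToyDict()
d["a"] = 1
d["a"] = KEEP_CURRENT     # value stays 1
print(d["a"])             # 1
d["a"] = DELETE_CURRENT   # same effect as d.discard("a")
print(d.discard("a"))     # False, the key is already gone
```

Code written against the old version that calls `delete_if_exists()` would need to switch to `discard()`; the return value keeps the same meaning according to the table.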
@@ -278,14 +288,14 @@ pip install persidict[dev]
  * [jsonpickle](https://jsonpickle.github.io)
  * [joblib](https://joblib.readthedocs.io)
  * [lz4](https://python-lz4.readthedocs.io)
- * [pandas](https://pandas.pydata.org)
- * [numpy](https://numpy.org)
  * [deepdiff](https://zepworks.com/deepdiff)
 
  For AWS S3 support (`S3Dict`), you will also need:
  * [boto3](https://boto3.readthedocs.io)
 
  For development and testing, the following packages are used:
+ * [pandas](https://pandas.pydata.org)
+ * [numpy](https://numpy.org)
  * [pytest](https://pytest.org)
  * [moto](http://getmoto.org)
 
@@ -10,7 +10,8 @@ storing each value as its own file or S3 object. Keys are limited to
  text strings or sequences of strings.
 
  In contrast to traditional persistent dictionaries (e.g., Python’s `shelve)`,
- `persidict` is designed for distributed environments where multiple processes
+ `persidict` is [designed](https://github.com/pythagoras-dev/persidict/blob/master/design_principles.md)
+ for distributed environments where multiple processes
  on different machines concurrently work with the same store.
 
  ## 2. Why Use It?
@@ -126,8 +127,8 @@ print(f"API Key: {cloud_config['api_key']}")
  You can also constrain values to a specific class.
  * **Order**: Insertion order is not preserved.
  * **Additional Methods**: `PersiDict` provides extra methods not in the standard
- dict API, such as `timestamp()`, `random_key()`, `newest_keys()`, `subdicts()`
- , `delete_if_exists()`, `get_params()` and more.
+ dict API, such as `timestamp()`, `etag()`, `random_key()`, `newest_keys()`
+ , `subdicts()`, `discard()`, `get_params()` and more.
  * **Special Values**: Use `KEEP_CURRENT` to avoid updating a value
  and `DELETE_CURRENT` to delete a value during an assignment.
 
@@ -138,13 +139,14 @@ and `DELETE_CURRENT` to delete a value during an assignment.
  * **`PersiDict`**: The abstract base class that defines the common interface
  for all persistent dictionaries in the package. It's the foundation
  upon which everything else is built.
- * **`PersiDictKey`**: A type hint that specifies what can be used
- as a key in any `PersiDict`. It can be a `SafeStrTuple`, a single string,
+ * **`NonEmptyPersiDictKey`**: A type hint that specifies what can be used
+ as a key in any `PersiDict`. It can be a `NonEmptySafeStrTuple`, a single string,
  or a sequence of strings. When a `PesiDict` method requires a key as an input,
- it will accept any of these types and convert them to a `SafeStrTuple` internally.
- * **`SafeStrTuple`**: The core data structure for keys. It's an immutable,
- flat tuple of non-empty, URL/filename-safe strings, ensuring that
- keys are consistent and safe for various storage backends.
+ it will accept any of these types and convert them to
+ a `NonEmptySafeStrTuple` internally.
+ * **`NonEmptySafeStrTuple`**: The core data structure for keys.
+ It's an immutable, flat tuple of non-empty, URL/filename-safe strings,
+ ensuring that keys are consistent and safe for various storage backends.
  When a `PersiDict` method returns a key, it will always be in this format.
 
  ### 5.2 Main Implementations
@@ -157,24 +159,25 @@ suitable for distributed environments.
 
  ### 5.3 Key Parameters
 
- * **`file_type`**: A key parameter for `FileDirDict` and `S3Dict` that
- determines the serialization format for values.
- Common options are `"pkl"` (pickle) and `"json"`.
- Any other value is treated as plain text for string storage.
+ * **`serialization_format`**: A key parameter for `FileDirDict` and `S3Dict` that
+ determines the serialization format used to store values.
+ Common options are `"pkl"` (pickle) and `"json"`.
+ Any other value is treated as plain text for string storage.
  * **`base_class_for_values`**: An optional parameter for any `PersiDict`
  that enforces type checking on all stored values, ensuring they are
  instances of a specific class.
- * **`immutable_items`**: A boolean parameter that can make a `PersiDict`
- "write-once," preventing any modification or deletion of existing items.
+ * **`append_only`**: A boolean parameter that makes items inside a `PersiDict` immutable,
+ preventing them from modification or deletion.
  * **`digest_len`**: An integer that specifies the length of a hash suffix
- added to key components to prevent collisions on case-insensitive file systems.
+ added to key components in `FileDirDict` to prevent collisions
+ on case-insensitive file systems.
  * **`base_dir`**: A string specifying the directory path where a `FileDirDict`
  stores its files. For `S3Dict`, this directory is used to cache files locally.
  * **`bucket_name`**: A string specifying the name of the S3 bucket where
  an `S3Dict` stores its objects.
  * **`region`**: An optional string specifying the AWS region for the S3 bucket.
 
- ### 5.4 Advanced Classes
+ ### 5.4 Advanced and Supporting Classes
 
  * **`WriteOnceDict`**: A wrapper that enforces write-once behavior
  on any `PersiDict`, ignoring subsequent writes to the same key.
@@ -182,7 +185,13 @@ It also allows for random consistency checks to ensure subsequent
  writes to the same key always match the original value.
  * **`OverlappingMultiDict`**: An advanced container that holds
  multiple `PersiDict` instances sharing the same storage
- but with different `file_type`s.
+ but with different `serialization_format`s.
+ * **`LocalDict`**: An in-memory `PersiDict` backed by
+ a RAM-only hierarchical store.
+ * **`EmptyDict`**: A minimal implementation of `PersiDict` that behaves
+ like a null device in OS - accepts all writes but discards them,
+ returns nothing on reads. Always appears empty
+ regardless of operations performed on it.
 
  ### 5.5 Special "Joker" Values
 
@@ -207,7 +216,7 @@ from the dictionary when assigned to a key.
  | `newest_values(max_n=None)` | `list[Any]` | Returns a list of values corresponding to the newest keys. |
  | `get_subdict(prefix_key)` | `PersiDict` | Returns a new `PersiDict` instance that provides a view into a subset of keys sharing a common prefix. |
  | `subdicts()` | `dict[str, PersiDict]` | Returns a dictionary mapping all first-level key prefixes to their corresponding sub-dictionary views. |
- | `delete_if_exists(key)` | `bool` | Deletes a key-value pair if it exists and returns `True`; otherwise, returns `False`. |
+ | `discard(key)` | `bool` | Deletes a key-value pair if it exists and returns `True`; otherwise, returns `False`. |
  | `get_params()` | `dict` | Returns a dictionary of the instance's configuration parameters, supporting the `parameterizable` API. |
 
  ## 7. Installation
@@ -244,14 +253,14 @@ pip install persidict[dev]
  * [jsonpickle](https://jsonpickle.github.io)
  * [joblib](https://joblib.readthedocs.io)
  * [lz4](https://python-lz4.readthedocs.io)
- * [pandas](https://pandas.pydata.org)
- * [numpy](https://numpy.org)
  * [deepdiff](https://zepworks.com/deepdiff)
 
  For AWS S3 support (`S3Dict`), you will also need:
  * [boto3](https://boto3.readthedocs.io)
 
  For development and testing, the following packages are used:
+ * [pandas](https://pandas.pydata.org)
+ * [numpy](https://numpy.org)
  * [pytest](https://pytest.org)
  * [moto](http://getmoto.org)
 
@@ -4,7 +4,7 @@ build-backend = "uv_build"
 
  [project]
  name = "persidict"
- version = "0.37.2"
+ version = "0.103.0"
  description = "Simple persistent key-value store for Python. Values are stored as files on a disk or as S3 objects on AWS cloud."
  readme = "README.md"
  requires-python = ">=3.10"
@@ -14,7 +14,7 @@ authors = [
  ]
  keywords = ["persistence", "dicts", "distributed", "parallel"]
  classifiers = [
- "Development Status :: 3 - Alpha",
+ "Development Status :: 4 - Beta",
  "Intended Audience :: Developers",
  "Intended Audience :: Science/Research",
  "Programming Language :: Python",
@@ -28,10 +28,9 @@ dependencies = [
  "parameterizable",
  "lz4",
  "joblib",
- "numpy",
- "pandas",
  "jsonpickle",
- "deepdiff"
+ "deepdiff",
+ "boto3"
  ]
 
  [project.urls]
@@ -39,6 +38,8 @@ Homepage = "https://github.com/pythagoras-dev/persidict"
 
  [project.optional-dependencies]
  dev = [
+ "numpy",
+ "pandas",
  "boto3",
  "moto",
  "pytest"
@@ -0,0 +1,50 @@
+ """Persistent dictionaries that store key-value pairs on local disks or AWS S3.
+
+ This package provides a unified interface for persistent dictionary-like
+ storage with various backends including filesystem and AWS S3.
+
+ Classes:
+ PersiDict: Abstract base class defining the unified interface for all
+ persistent dictionaries.
+ NonEmptySafeStrTuple: A flat tuple of URL/filename-safe strings that
+ can be used as a key for PersiDict objects.
+ FileDirDict: A dictionary that stores key-value pairs as files on a
+ local hard drive. Keys compose filenames, values are stored as
+ pickle or JSON objects.
+ S3Dict_Legacy: A dictionary that stores key-value pairs as S3 objects on AWS.
+ Keys compose object names, values are stored as pickle or JSON S3 objects.
+ BasicS3Dict: A basic S3-backed dictionary with direct S3 operations.
+ WriteOnceDict: A write-once wrapper that prevents modification of existing
+ items after initial storage.
+ EmptyDict: Equivalent of null device in OS - accepts all writes but discards
+ them, returns nothing on reads. Always appears empty regardless of
+ operations performed. Useful for testing, debugging, or as a placeholder.
+ OverlappingMultiDict: A dictionary that can handle overlapping key spaces.
+
+ Functions:
+ get_safe_chars(): Returns a set of URL/filename-safe characters permitted
+ in keys.
+ replace_unsafe_chars(): Replaces forbidden characters in a string with
+ safe alternatives.
+
+ Constants:
+ KEEP_CURRENT, DELETE_CURRENT: Special joker values for conditional operations.
+
+ Note:
+ All persistent dictionaries support multiple serialization formats, including
+ pickle and JSON, with automatic type handling and collision-safe key encoding.
+ """
+ from .safe_chars import *
+ from .safe_str_tuple import *
+ from .persi_dict import PersiDict, PersiDictKey
+ from .file_dir_dict import FileDirDict
+ from .s3_dict_file_dir_cached import S3Dict_FileDirCached, S3Dict
+ from .basic_s3_dict import BasicS3Dict
+ from .write_once_dict import WriteOnceDict
+ from .empty_dict import EmptyDict
+ from .singletons import Joker, KeepCurrentFlag, DeleteCurrentFlag
+ from .singletons import KEEP_CURRENT, DELETE_CURRENT
+ from .overlapping_multi_dict import OverlappingMultiDict
+ from .cached_appendonly_dict import AppendOnlyDictCached
+ from .cached_mutable_dict import MutableDictCached
+ from .local_dict import LocalDict
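
The docstring above describes `EmptyDict` as the equivalent of an OS null device: it accepts all writes, discards them, and always appears empty. A rough standalone sketch of that behaviour, built on the standard library's `MutableMapping`, is shown below. The class `NullDict` is hypothetical, and raising `KeyError` on reads is an assumption about what "returns nothing on reads" means; the real `EmptyDict` may behave differently in detail.

```python
from collections.abc import MutableMapping


class NullDict(MutableMapping):
    """Accepts every write, stores nothing, and always looks empty."""

    def __setitem__(self, key, value):
        pass  # the value is silently dropped

    def __getitem__(self, key):
        raise KeyError(key)  # nothing is ever stored, so every lookup misses

    def __delitem__(self, key):
        raise KeyError(key)

    def __iter__(self):
        return iter(())

    def __len__(self):
        return 0


sink = NullDict()
sink["result"] = 42
print(len(sink), "result" in sink, sink.get("result"))  # 0 False None
```

Such an object can stand in for a real store in tests or dry runs, which matches the "testing, debugging, or as a placeholder" use the docstring mentions.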