persidict 0.34.1__py3-none-any.whl → 0.34.3__py3-none-any.whl

Potentially problematic release.


persidict/persi_dict.py CHANGED
@@ -1,21 +1,16 @@
1
- """PersiDict: base class used in persistent dictionaries' hierarchy.
1
+ """Persistent, dict-like API for durable key-value stores.
2
2
 
3
- PersiDict: a base class in the hierarchy, defines unified interface
4
- of all persistent dictionaries. The interface is similar to the interface of
5
- Python's built-in Dict, with a few variations
6
- (e.g. insertion order is not preserved) and additional methods.
3
+ PersiDict defines a unified interface for persistent dictionaries. The API is
4
+ similar to Python's built-in dict with some differences (e.g., insertion order
5
+ is not guaranteed) and several additional convenience methods.
7
6
 
8
- PersiDict persistently stores key-value pairs.
7
+ Keys are sequences of URL/filename-safe strings represented by SafeStrTuple.
8
+ Plain strings or sequences of strings are accepted and automatically coerced to
9
+ SafeStrTuple. Values can be arbitrary Python objects unless an implementation
10
+ restricts them via ``base_class_for_values``.
9
11
 
10
- A key is a sequence of strings in a form of SafeStrTuple.
11
- Regular strings and their sequences can also be passed to PersiDict as keys,
12
- in this case they will be automatically converted to SafeStrTuple.
13
-
14
- A value can be (virtually) any Python object.
15
-
16
- 'Persistently' means that key-value pairs are saved in a durable storage,
17
- such as a local hard-drive or AWS S3 cloud, and can be retrieved
18
- even after the Python process that created the dictionary has terminated.
12
+ Persistence means items are stored durably (e.g., in local files or cloud
13
+ objects) and remain accessible across process lifetimes.
19
14
  """
20
15
 
21
16
  from __future__ import annotations
@@ -41,43 +36,24 @@ it will be automatically converted into SafeStrTuple.
41
36
  """
42
37
 
43
38
  class PersiDict(MutableMapping, ParameterizableClass):
44
- """Dict-like durable store that accepts sequences of strings as keys.
45
-
46
- An abstract base class for key-value stores. It accepts keys in a form of
47
- SafeStrSequence - a URL/filename-safe sequence of strings.
48
- It assumes no restrictions on types of values in the key-value pairs,
49
- but allows users to impose such restrictions.
50
-
51
- The API for the class resembles the API of Python's built-in Dict
52
- (see https://docs.python.org/3/library/stdtypes.html#mapping-types-dict)
53
- with a few variations (e.g. insertion order is not preserved) and
54
- a few additional methods(e.g. .timestamp(key), which returns last
55
- modification time for a key).
56
-
57
- Attributes
58
- ----------
59
- immutable_items : bool
60
- True means an append-only dictionary: items are
61
- not allowed to be modified or deleted from a dictionary.
62
- It enables various distributed cache optimizations
63
- for remote storage.
64
- False means normal dict-like behaviour.
65
-
66
- digest_len : int
67
- Length of a hash signature suffix which PersiDict
68
- automatically adds to each string in a key
69
- while mapping the key to an address of a value
70
- in a persistent storage backend (e.g. a filename
71
- or an S3 objectname). We need it to ensure correct work
72
- of persistent dictionaries with case-insensitive
73
- (even if case-preserving) filesystems, such as MacOS HFS.
74
-
75
- base_class_for_values: Optional[type]
76
- A base class for values stored in the dictionary.
77
- If specified, it will be used to check types of values
78
- in the dictionary. If not specified, no type checking
79
- will be performed and all types will be allowed.
80
-
39
+ """Abstract dict-like interface for durable key-value stores.
40
+
41
+ Keys are URL/filename-safe sequences of strings (SafeStrTuple). Concrete
42
+ subclasses implement storage backends (e.g., filesystem, S3). The API is
43
+ similar to Python's dict but does not guarantee insertion order and adds
44
+ persistence-specific helpers (e.g., timestamp()).
45
+
46
+ Attributes:
47
+ immutable_items (bool):
48
+ If True, items are write-once: existing values cannot be modified or
49
+ deleted.
50
+ digest_len (int):
51
+ Length of a base32 MD5 digest fragment used to suffix each key
52
+ component to avoid collisions on case-insensitive filesystems. 0
53
+ disables suffixing.
54
+ base_class_for_values (Optional[type]):
55
+ Optional base class that all values must inherit from. If None, any
56
+ type is accepted.
81
57
  """
82
58
 
83
59
  digest_len:int
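To make the `digest_len` description in the new docstring concrete, here is a minimal sketch of suffixing one key component with a base32 MD5 digest fragment. The helper name `add_digest_suffix`, the `_` separator, and the lowercase fragment are assumptions made for illustration; the exact format persidict writes to storage is not shown in this diff and may differ.

```python
import base64
import hashlib


def add_digest_suffix(component: str, digest_len: int = 8) -> str:
    """Append a base32 MD5 fragment to one key component (hypothetical helper)."""
    if digest_len < 0:
        raise ValueError("digest_len must be non-negative")
    if digest_len == 0:
        return component  # 0 disables suffixing, as the docstring states
    digest = hashlib.md5(component.encode("utf-8")).digest()
    # Assumed layout: lowercase base32 text, truncated to digest_len characters.
    fragment = base64.b32encode(digest).decode("ascii").lower()[:digest_len]
    return f"{component}_{fragment}"


# "Report" and "report" map to the same name on a case-insensitive filesystem,
# but their suffixed forms stay distinct even after case folding.
a = add_digest_suffix("Report")
b = add_digest_suffix("report")
```

The suffix is derived from the original, case-sensitive component, so two components that differ only in case receive different fragments.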
@@ -89,6 +65,20 @@ class PersiDict(MutableMapping, ParameterizableClass):
89
65
  , digest_len:int = 8
90
66
  , base_class_for_values:Optional[type] = None
91
67
  , *args, **kwargs):
68
+ """Initialize base parameters shared by all persistent dicts.
69
+
70
+ Args:
71
+ immutable_items: If True, items cannot be modified or deleted.
72
+ digest_len: Number of hash characters to append to key components to
73
+ avoid case-insensitive collisions. Must be non-negative.
74
+ base_class_for_values: Optional base class that values must inherit
75
+ from; if None, values are not type-restricted.
76
+ *args: Ignored in the base class (reserved for subclasses).
77
+ **kwargs: Ignored in the base class (reserved for subclasses).
78
+
79
+ Raises:
80
+ ValueError: If digest_len is negative.
81
+ """
92
82
  self.digest_len = int(digest_len)
93
83
  if digest_len < 0:
94
84
  raise ValueError("digest_len must be non-negative")
@@ -98,10 +88,12 @@ class PersiDict(MutableMapping, ParameterizableClass):
98
88
 
99
89
 
100
90
  def get_params(self):
101
- """Return a dictionary of parameters for the PersiDict object.
91
+ """Return configuration parameters of this dictionary.
102
92
 
103
- This method is needed to support Parameterizable API.
104
- The method is absent in the original dict API.
93
+ Returns:
94
+ dict: A sorted dict of parameters used to reconstruct the instance.
95
+ This supports the Parameterizable API and is absent in the
96
+ builtin dict.
105
97
  """
106
98
  params = dict(
107
99
  immutable_items=self.immutable_items
@@ -115,9 +107,13 @@ class PersiDict(MutableMapping, ParameterizableClass):
115
107
  @property
116
108
  @abstractmethod
117
109
  def base_url(self):
118
- """Return dictionary's URL
110
+ """Base URL identifying the storage location.
111
+
112
+ Returns:
113
+ str: A URL-like string (e.g., s3://bucket/prefix or file://...).
119
114
 
120
- This property is absent in the original dict API.
115
+ Raises:
116
+ NotImplementedError: Must be provided by subclasses.
121
117
  """
122
118
  raise NotImplementedError
123
119
 
@@ -125,39 +121,79 @@ class PersiDict(MutableMapping, ParameterizableClass):
125
121
  @property
126
122
  @abstractmethod
127
123
  def base_dir(self):
128
- """Return dictionary's base directory in the local filesystem.
124
+ """Base directory on the local filesystem, if applicable.
129
125
 
130
- This property is absent in the original dict API.
126
+ Returns:
127
+ str: Path to a local base directory used by the store.
128
+
129
+ Raises:
130
+ NotImplementedError: Must be provided by subclasses that use local
131
+ storage.
131
132
  """
132
133
  raise NotImplementedError
133
134
 
134
135
 
135
136
  def __repr__(self) -> str:
136
- """Return repr(self)"""
137
+ """Return a reproducible string representation.
138
+
139
+ Returns:
140
+ str: Representation including class name and constructor parameters.
141
+ """
137
142
  params = self.get_params()
138
143
  params_str = ', '.join(f'{k}={v!r}' for k, v in params.items())
139
144
  return f'{self.__class__.__name__}({params_str})'
140
145
 
141
146
 
142
147
  def __str__(self) -> str:
143
- """Return str(self)"""
148
+ """Return a user-friendly string with all items.
149
+
150
+ Returns:
151
+ str: Stringified dict of items.
152
+ """
144
153
  return str(dict(self.items()))
145
154
 
146
155
 
147
156
  @abstractmethod
148
157
  def __contains__(self, key:PersiDictKey) -> bool:
149
- """True if the dictionary has the specified key, else False."""
158
+ """Check whether a key exists in the store.
159
+
160
+ Args:
161
+ key: Key (string or sequence of strings) or SafeStrTuple.
162
+
163
+ Returns:
164
+ bool: True if key exists, False otherwise.
165
+ """
150
166
  raise NotImplementedError
151
167
 
152
168
 
153
169
  @abstractmethod
154
170
  def __getitem__(self, key:PersiDictKey) -> Any:
155
- """X.__getitem__(y) is an equivalent to X[y]"""
171
+ """Retrieve the value for a key.
172
+
173
+ Args:
174
+ key: Key (string or sequence of strings) or SafeStrTuple.
175
+
176
+ Returns:
177
+ Any: The stored value.
178
+ """
156
179
  raise NotImplementedError
157
180
 
158
181
 
159
182
  def __setitem__(self, key:PersiDictKey, value:Any):
160
- """Set self[key] to value."""
183
+ """Set the value for a key.
184
+
185
+ Special values KEEP_CURRENT and DELETE_CURRENT are interpreted as
186
+ commands to keep or delete the current value respectively.
187
+
188
+ Args:
189
+ key: Key (string or sequence of strings) or SafeStrTuple.
190
+ value: Value to store, or a Joker command.
191
+
192
+ Raises:
193
+ KeyError: If attempting to modify an existing key when
194
+ immutable_items is True.
195
+ NotImplementedError: Subclasses must implement actual writing.
196
+ """
161
197
  if value is KEEP_CURRENT:
162
198
  return
163
199
  elif value is DELETE_CURRENT:
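The Joker handling documented for `__setitem__` can be sketched with an in-memory toy mapping. Everything below is a stand-in: `_Joker`, the two sentinel objects, and `WriteOnceAwareDict` are hypothetical names, and the behavior of `DELETE_CURRENT` on a write-once store is an assumption, since that branch is outside the hunk shown here.

```python
class _Joker:
    """Sentinel type (hypothetical); instances are compared by identity."""

    def __init__(self, name: str):
        self._name = name

    def __repr__(self) -> str:
        return self._name


KEEP_CURRENT = _Joker("KEEP_CURRENT")      # stand-in for persidict's joker
DELETE_CURRENT = _Joker("DELETE_CURRENT")  # stand-in for persidict's joker


class WriteOnceAwareDict(dict):
    """In-memory toy mapping; not a PersiDict subclass."""

    def __init__(self, immutable_items: bool = False):
        super().__init__()
        self.immutable_items = immutable_items

    def __setitem__(self, key, value):
        if value is KEEP_CURRENT:
            return  # leave whatever is stored untouched
        if value is DELETE_CURRENT:
            if self.immutable_items:
                # Assumption: deleting from a write-once store is an error.
                raise KeyError("Can't delete an immutable key-value pair")
            self.pop(key, None)  # no error if the key is absent
            return
        if self.immutable_items and key in self:
            raise KeyError("Can't modify an immutable key-value pair")
        super().__setitem__(key, value)


d = WriteOnceAwareDict()
d["a"] = 1
d["a"] = KEEP_CURRENT    # no-op, d["a"] is still 1
d["a"] = DELETE_CURRENT  # removes "a"
```

Checking the sentinels with `is` rather than `==` keeps ordinary values that happen to compare equal from being mistaken for commands.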
@@ -169,21 +205,42 @@ class PersiDict(MutableMapping, ParameterizableClass):
169
205
 
170
206
 
171
207
  def __delitem__(self, key:PersiDictKey):
172
- """Delete self[key]."""
173
- if self.immutable_items: # TODO: change to exceptions
208
+ """Delete a key and its value.
209
+
210
+ Args:
211
+ key: Key (string or sequence of strings) or SafeStrTuple.
212
+
213
+ Raises:
214
+ KeyError: If immutable_items is True.
215
+ NotImplementedError: Subclasses must implement deletion.
216
+ """
217
+ if self.immutable_items:
174
218
  raise KeyError("Can't delete an immutable key-value pair")
175
219
  raise NotImplementedError
176
220
 
177
221
 
178
222
  @abstractmethod
179
223
  def __len__(self) -> int:
180
- """Return len(self)."""
224
+ """Return the number of stored items.
225
+
226
+ Returns:
227
+ int: Number of key-value pairs.
228
+ """
181
229
  raise NotImplementedError
182
230
 
183
231
 
184
232
  @abstractmethod
185
233
  def _generic_iter(self, result_type: set[str]) -> Any:
186
- """Underlying implementation for items/keys/values/... iterators"""
234
+ """Underlying implementation for iterator helpers.
235
+
236
+ Args:
237
+ result_type: A set indicating desired fields among {'keys',
238
+ 'values', 'timestamps'}.
239
+
240
+ Returns:
241
+ Any: An iterator yielding keys, values, and/or timestamps based on
242
+ result_type.
243
+ """
187
244
  assert isinstance(result_type, set)
188
245
  assert 1 <= len(result_type) <= 3
189
246
  assert len(result_type | {"keys", "values", "timestamps"}) == 3
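The `result_type` contract described for `_generic_iter` can be illustrated with a small stand-alone generator. This is a sketch under assumptions: the store is modeled as a plain dict of `{key: (value, timestamp)}`, the function name `generic_iter` is hypothetical, and the choice to yield a bare element for a single field and a tuple for several is inferred from the return types documented for `keys()`, `items()`, and `items_and_timestamps()`. It raises `ValueError` where the original uses `assert`.

```python
from typing import Any, Iterator


def generic_iter(store: dict, result_type: set[str]) -> Iterator[Any]:
    """Yield the fields selected by result_type from {key: (value, timestamp)}."""
    allowed = {"keys", "values", "timestamps"}
    # Same constraint as the asserts in the diff: a non-empty subset of allowed.
    if not result_type or not result_type <= allowed:
        raise ValueError(f"result_type must be a non-empty subset of {allowed}")
    for key, (value, timestamp) in store.items():
        row = []
        if "keys" in result_type:
            row.append(key)
        if "values" in result_type:
            row.append(value)
        if "timestamps" in result_type:
            row.append(timestamp)
        # One requested field is yielded bare; several are yielded as a tuple.
        yield row[0] if len(row) == 1 else tuple(row)


store = {("a",): (1, 100.0), ("b",): (2, 200.0)}
keys_only = list(generic_iter(store, {"keys"}))
keys_with_times = list(generic_iter(store, {"keys", "timestamps"}))
```

With one shared generator, each public helper such as `values_and_timestamps()` reduces to a single call with the matching set.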
@@ -192,44 +249,80 @@ class PersiDict(MutableMapping, ParameterizableClass):
192
249
 
193
250
 
194
251
  def __iter__(self):
195
- """Implement iter(self)."""
252
+ """Iterate over keys.
253
+
254
+ Returns:
255
+ Iterator[SafeStrTuple]: Iterator of keys.
256
+ """
196
257
  return self._generic_iter({"keys"})
197
258
 
198
259
 
199
260
  def keys(self):
200
- """iterator object that provides access to keys"""
261
+ """Return an iterator over keys.
262
+
263
+ Returns:
264
+ Iterator[SafeStrTuple]: Keys iterator.
265
+ """
201
266
  return self._generic_iter({"keys"})
202
267
 
203
268
 
204
269
  def keys_and_timestamps(self):
205
- """iterator object that provides access to keys and timestamps"""
270
+ """Return an iterator over (key, timestamp) pairs.
271
+
272
+ Returns:
273
+ Iterator[tuple[SafeStrTuple, float]]: Keys and POSIX timestamps.
274
+ """
206
275
  return self._generic_iter({"keys", "timestamps"})
207
276
 
208
277
 
209
278
  def values(self):
210
- """D.values() -> iterator object that provides access to D's values"""
279
+ """Return an iterator over values.
280
+
281
+ Returns:
282
+ Iterator[Any]: Values iterator.
283
+ """
211
284
  return self._generic_iter({"values"})
212
285
 
213
286
 
214
287
  def values_and_timestamps(self):
215
- """iterator object that provides access to values and timestamps"""
288
+ """Return an iterator over (value, timestamp) pairs.
289
+
290
+ Returns:
291
+ Iterator[tuple[Any, float]]: Values and POSIX timestamps.
292
+ """
216
293
  return self._generic_iter({"values", "timestamps"})
217
294
 
218
295
 
219
296
  def items(self):
220
- """D.items() -> iterator object that provides access to D's items"""
297
+ """Return an iterator over (key, value) pairs.
298
+
299
+ Returns:
300
+ Iterator[tuple[SafeStrTuple, Any]]: Items iterator.
301
+ """
221
302
  return self._generic_iter({"keys", "values"})
222
303
 
223
304
 
224
305
  def items_and_timestamps(self):
225
- """iterator object that provides access to keys, values, and timestamps"""
306
+ """Return an iterator over (key, value, timestamp) triples.
307
+
308
+ Returns:
309
+ Iterator[tuple[SafeStrTuple, Any, float]]: Items and timestamps.
310
+ """
226
311
  return self._generic_iter({"keys", "values", "timestamps"})
227
312
 
228
313
 
229
314
  def setdefault(self, key:PersiDictKey, default:Any=None) -> Any:
230
- """Insert key with a value of default if key is not in the dictionary.
315
+ """Insert key with default if absent; return the value.
316
+
317
+ Args:
318
+ key: Key (string or sequence of strings) or SafeStrTuple.
319
+ default: Value to insert if the key is not present.
320
+
321
+ Returns:
322
+ Any: Existing value if present; otherwise the provided default.
231
323
 
232
- Return the value for key if key is in the dictionary, else default.
324
+ Raises:
325
+ AssertionError: If default is a Joker command.
233
326
  """
234
327
  # TODO: check edge cases to ensure the same semantics as standard dicts
235
328
  key = SafeStrTuple(key)
@@ -241,8 +334,18 @@ class PersiDict(MutableMapping, ParameterizableClass):
241
334
  return default
242
335
 
243
336
 
244
- def __eq__(self, other) -> bool:
245
- """Return self==other. """
337
+ def __eq__(self, other:PersiDict) -> bool:
338
+ """Compare dictionaries for equality.
339
+
340
+ If other is a PersiDict, compare portable params. Otherwise, attempt to
341
+ compare as mapping by keys and values.
342
+
343
+ Args:
344
+ other: Another dictionary-like object.
345
+
346
+ Returns:
347
+ bool: True if considered equal, False otherwise.
348
+ """
246
349
  if isinstance(other, PersiDict):
247
350
  return self.get_portable_params() == other.get_portable_params()
248
351
  try:
@@ -257,15 +360,29 @@ class PersiDict(MutableMapping, ParameterizableClass):
257
360
 
258
361
 
259
362
  def __getstate__(self):
363
+ """Prevent pickling of PersiDict instances.
364
+
365
+ Raises:
366
+ TypeError: Always raised; PersiDict instances are not pickleable.
367
+ """
260
368
  raise TypeError("PersiDict is not picklable.")
261
369
 
262
370
 
263
371
  def __setstate__(self, state):
372
+ """Prevent unpickling of PersiDict instances.
373
+
374
+ Raises:
375
+ TypeError: Always raised; PersiDict instances are not pickleable.
376
+ """
264
377
  raise TypeError("PersiDict is not picklable.")
265
378
 
266
379
 
267
380
  def clear(self) -> None:
268
- """Remove all items from the dictionary. """
381
+ """Remove all items from the dictionary.
382
+
383
+ Raises:
384
+ KeyError: If items are immutable (immutable_items is True).
385
+ """
269
386
  if self.immutable_items: # TODO: change to exceptions
270
387
  raise KeyError("Can't delete an immutable key-value pair")
271
388
 
@@ -277,11 +394,18 @@ class PersiDict(MutableMapping, ParameterizableClass):
277
394
 
278
395
 
279
396
  def delete_if_exists(self, key:PersiDictKey) -> bool:
280
- """ Delete an item without raising an exception if it doesn't exist.
281
-
282
- Returns True if the item existed and was deleted, False otherwise.
397
+ """Delete an item without raising an exception if it doesn't exist.
283
398
 
284
399
  This method is absent in the original dict API.
400
+
401
+ Args:
402
+ key: Key (string or sequence of strings) or SafeStrTuple.
403
+
404
+ Returns:
405
+ bool: True if the item existed and was deleted; False otherwise.
406
+
407
+ Raises:
408
+ KeyError: If items are immutable (immutable_items is True).
285
409
  """
286
410
 
287
411
  if self.immutable_items: # TODO: change to exceptions
@@ -300,19 +424,37 @@ class PersiDict(MutableMapping, ParameterizableClass):
300
424
 
301
425
 
302
426
  def get_subdict(self, prefix_key:PersiDictKey) -> PersiDict:
303
- """Get a sub-dictionary containing items with the same prefix key.
427
+ """Get a sub-dictionary containing items with the given prefix key.
304
428
 
305
- For non-existing prefix key, an empty sub-dictionary is returned.
429
+ Items whose keys start with the provided prefix are visible through the
430
+ returned sub-dictionary. If the prefix does not exist, an empty
431
+ sub-dictionary is returned.
306
432
 
307
433
  This method is absent in the original Python dict API.
434
+
435
+ Args:
436
+ prefix_key: Key prefix (string, sequence of strings, or SafeStrTuple)
437
+ identifying the sub-namespace to expose.
438
+
439
+ Returns:
440
+ PersiDict: A dictionary-like view restricted to keys under the
441
+ provided prefix.
442
+
443
+ Raises:
444
+ NotImplementedError: Must be implemented by subclasses that support
445
+ hierarchical key spaces.
308
446
  """
309
447
  raise NotImplementedError
310
448
 
311
449
 
312
450
  def subdicts(self) -> dict[str, PersiDict]:
313
- """Get a dictionary of sub-dictionaries.
451
+ """Return a mapping of first-level keys to sub-dictionaries.
314
452
 
315
453
  This method is absent in the original dict API.
454
+
455
+ Returns:
456
+ dict[str, PersiDict]: A mapping from a top-level key segment to a
457
+ sub-dictionary restricted to the corresponding keyspace.
316
458
  """
317
459
  all_keys = {k[0] for k in self.keys()}
318
460
  result_subdicts = {k: self.get_subdict(k) for k in all_keys}
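The prefix semantics of `get_subdict()` and `subdicts()` can be sketched over a plain dict whose keys are tuples of strings. The free functions below are hypothetical stand-ins, not persidict code. Two details are assumptions: that keys inside a sub-dictionary are relative to the prefix, and that a key exactly equal to the prefix is not part of the sub-dictionary.

```python
def get_subdict(items: dict, prefix: tuple) -> dict:
    """Return items whose keys start with prefix, re-keyed relative to it."""
    n = len(prefix)
    return {
        key[n:]: value
        for key, value in items.items()
        if key[:n] == prefix and len(key) > n
    }


def subdicts(items: dict) -> dict:
    """Map each first-level key segment to its sub-dictionary."""
    first_segments = {key[0] for key in items}
    return {segment: get_subdict(items, (segment,)) for segment in first_segments}


items = {
    ("users", "alice"): 1,
    ("users", "bob"): 2,
    ("logs", "2024", "jan"): 3,
}
users = get_subdict(items, ("users",))     # keys ("alice",) and ("bob",)
missing = get_subdict(items, ("nope",))    # empty, matching the docstring
by_segment = subdicts(items)               # keys "users" and "logs"
```

A real backend would return a live view over storage rather than a copied dict; the copy here only keeps the example short.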
@@ -322,13 +464,14 @@ class PersiDict(MutableMapping, ParameterizableClass):
322
464
  def random_key(self) -> PersiDictKey | None:
323
465
  """Return a random key from the dictionary.
324
466
 
325
- Returns a single random key if the dictionary is not empty.
326
- Returns None if the dictionary is empty.
327
-
328
467
  This method is absent in the original Python dict API.
329
468
 
330
469
  Implementation uses reservoir sampling to select a uniformly random key
331
470
  in streaming time, without loading all keys into memory or using len().
471
+
472
+ Returns:
473
+ SafeStrTuple | None: A random key if the dictionary is not empty;
474
+ None if the dictionary is empty.
332
475
  """
333
476
  iterator = iter(self.keys())
334
477
  try:
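The `random_key` docstring mentions reservoir sampling. A minimal stand-alone sketch of that technique with a reservoir of size one follows; `random_element` and its optional `rng` parameter are hypothetical and only illustrate the idea, not the package's exact code.

```python
import random
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def random_element(stream: Iterable[T],
                   rng: Optional[random.Random] = None) -> Optional[T]:
    """Pick one element uniformly at random in a single pass, or None if empty."""
    rng = rng or random
    iterator = iter(stream)
    try:
        chosen = next(iterator)  # the first element starts as the pick
    except StopIteration:
        return None  # empty stream
    for seen, candidate in enumerate(iterator, start=2):
        # Replace the current pick with probability 1/seen; after the whole
        # stream, every element has been kept with probability 1/total.
        if rng.randrange(seen) == 0:
            chosen = candidate
    return chosen


pick = random_element(iter(["a", "b", "c"]), random.Random(0))
```

Only the current pick and a counter are held in memory, so the stream never needs to be materialized and its length never needs to be known in advance.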
@@ -351,17 +494,33 @@ class PersiDict(MutableMapping, ParameterizableClass):
351
494
 
352
495
  @abstractmethod
353
496
  def timestamp(self, key:PersiDictKey) -> float:
354
- """Get last modification time (in seconds, Unix epoch time).
497
+ """Return the last modification time of a key.
355
498
 
356
499
  This method is absent in the original dict API.
500
+
501
+ Args:
502
+ key: Key (string or sequence of strings) or SafeStrTuple.
503
+
504
+ Returns:
505
+ float: POSIX timestamp (seconds since Unix epoch) of the last
506
+ modification of the item.
507
+
508
+ Raises:
509
+ NotImplementedError: Must be implemented by subclasses.
357
510
  """
358
511
  raise NotImplementedError
359
512
 
360
513
 
361
514
  def oldest_keys(self, max_n=None):
362
- """Return max_n the oldest keys in the dictionary.
515
+ """Return up to max_n oldest keys in the dictionary.
363
516
 
364
- If max_n is None, return all keys.
517
+ Args:
518
+ max_n (int | None): Maximum number of keys to return. If None,
519
+ return all keys sorted by age (oldest first). Values <= 0
520
+ yield an empty list.
521
+
522
+ Returns:
523
+ list[SafeStrTuple]: The oldest keys, oldest first.
365
524
 
366
525
  This method is absent in the original Python dict API.
367
526
  """
@@ -381,9 +540,15 @@ class PersiDict(MutableMapping, ParameterizableClass):
381
540
 
382
541
 
383
542
  def oldest_values(self, max_n=None):
384
- """Return max_n the oldest values in the dictionary.
543
+ """Return up to max_n oldest values in the dictionary.
544
+
545
+ Args:
546
+ max_n (int | None): Maximum number of values to return. If None,
547
+ return values for all keys sorted by age (oldest first). Values
548
+ <= 0 yield an empty list.
385
549
 
386
- If max_n is None, return all values.
550
+ Returns:
551
+ list[Any]: Values corresponding to the oldest keys.
387
552
 
388
553
  This method is absent in the original Python dict API.
389
554
  """
@@ -391,9 +556,15 @@ class PersiDict(MutableMapping, ParameterizableClass):
391
556
 
392
557
 
393
558
  def newest_keys(self, max_n=None):
394
- """Return max_n the newest keys in the dictionary.
559
+ """Return up to max_n newest keys in the dictionary.
395
560
 
396
- If max_n is None, return all keys.
561
+ Args:
562
+ max_n (int | None): Maximum number of keys to return. If None,
563
+ return all keys sorted by age (newest first). Values <= 0
564
+ yield an empty list.
565
+
566
+ Returns:
567
+ list[SafeStrTuple]: The newest keys, newest first.
397
568
 
398
569
  This method is absent in the original Python dict API.
399
570
  """
@@ -413,9 +584,15 @@ class PersiDict(MutableMapping, ParameterizableClass):
413
584
 
414
585
 
415
586
  def newest_values(self, max_n=None):
416
- """Return max_n the newest values in the dictionary.
587
+ """Return up to max_n newest values in the dictionary.
588
+
589
+ Args:
590
+ max_n (int | None): Maximum number of values to return. If None,
591
+ return values for all keys sorted by age (newest first). Values
592
+ <= 0 yield an empty list.
417
593
 
418
- If max_n is None, return all values.
594
+ Returns:
595
+ list[Any]: Values corresponding to the newest keys.
419
596
 
420
597
  This method is absent in the original Python dict API.
421
598
  """