checkpointer 2.0.1__tar.gz → 2.1.0__tar.gz

@@ -0,0 +1,248 @@
1
+ Metadata-Version: 2.3
2
+ Name: checkpointer
3
+ Version: 2.1.0
4
+ Summary: A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation
5
+ Project-URL: Repository, https://github.com/Reddan/checkpointer.git
6
+ Author: Hampus Hallman
7
+ License: Copyright 2024 Hampus Hallman
8
+
9
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
14
+ Requires-Python: >=3.12
15
+ Requires-Dist: relib
16
+ Description-Content-Type: text/markdown
17
+
18
+ # checkpointer · [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![Python 3.12](https://img.shields.io/badge/python-3.12-blue)](https://pypi.org/project/checkpointer/)
19
+
20
+ `checkpointer` is a Python library for memoizing function results. It provides a decorator-based API with support for multiple storage backends. Use it for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations.
21
+
22
+ Adding or removing `@checkpoint` doesn't change how your code works. You can apply it to any function, including ones you've already written, without altering their behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.
23
+
24
+ ### Key Features:
25
+ - 🗂️ **Multiple Storage Backends**: Built-in support for in-memory and pickle-based storage, or create your own.
26
+ - 🎯 **Simple Decorator API**: Apply `@checkpoint` to functions without boilerplate.
27
+ - 🔄 **Async and Sync Compatibility**: Works with synchronous functions and any Python async runtime (e.g., `asyncio`, `Trio`, `Curio`).
28
+ - ⏲️ **Custom Expiration Logic**: Automatically invalidate old checkpoints.
29
+ - 📂 **Flexible Path Configuration**: Control where checkpoints are stored.
30
+ - 📦 **Captured Variables Handling**: Optionally include captured variables in cache invalidation.
31
+
32
+ ---
33
+
34
+ ## Installation
35
+
36
+ ```bash
37
+ pip install checkpointer
38
+ ```
39
+
40
+ ---
41
+
42
+ ## Quick Start 🚀
43
+
44
+ ```python
45
+ from checkpointer import checkpoint
46
+
47
+ @checkpoint
48
+ def expensive_function(x: int) -> int:
49
+   print("Computing...")
50
+   return x ** 2
51
+
52
+ result = expensive_function(4) # Computes and stores the result
53
+ result = expensive_function(4) # Loads from the cache
54
+ ```
55
+
56
+ ---
57
+
58
+ ## How It Works
59
+
60
+ When you use `@checkpoint`, the function's **arguments** (`args`, `kwargs`) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, `checkpointer` loads the cached result instead of recomputing.
61
+
62
+ Additionally, `checkpointer` ensures that caches are invalidated when a function's implementation or any of its dependencies change. Each function is assigned a hash based on:
63
+
64
+ 1. **Its source code**: Changes to the function's code update its hash.
65
+ 2. **Dependent functions**: If a function calls others, changes in those dependencies will also update the hash.
66
+ 3. **Captured variables**: (Optional) If `capture=True`, changes to captured variables and global variables will also update the hash.
67
+
68
+ ### Example: Cache Invalidation
69
+
70
+ ```python
71
+ def multiply(a, b):
72
+   return a * b
73
+
74
+ @checkpoint
75
+ def helper(x):
76
+   return multiply(x + 1, 2)
77
+
78
+ @checkpoint
79
+ def compute(a, b):
80
+   return helper(a) + helper(b)
81
+ ```
82
+
83
+ If you modify `multiply`, caches for both `helper` and `compute` are invalidated and recomputed.
84
+
85
+ ---
86
+
87
+ ## Parameterization
88
+
89
+ ### Custom Configuration
90
+
91
+ Set up a `Checkpointer` instance with custom settings, and extend it by calling itself with overrides:
92
+
93
+ ```python
94
+ from checkpointer import checkpoint
95
+
96
+ IS_DEVELOPMENT = True # Toggle based on your environment
97
+
98
+ tmp_checkpoint = checkpoint(root_path="/tmp/checkpoints")
99
+ dev_checkpoint = tmp_checkpoint(when=IS_DEVELOPMENT) # Adds development-specific behavior
100
+ ```
101
+
102
+ ### Per-Function Customization & Layered Caching
103
+
104
+ Layer caches by stacking checkpoints:
105
+
106
+ ```python
107
+ @checkpoint(format="memory") # Always use memory storage
108
+ @dev_checkpoint # Adds caching during development
109
+ def some_expensive_function():
110
+   print("Performing a time-consuming operation...")
111
+   return sum(i * i for i in range(10**6))
112
+ ```
113
+
114
+ - **In development**: Both `dev_checkpoint` and `memory` caches are active.
115
+ - **In production**: Only the `memory` cache is active.
116
+
117
+ ---
118
+
119
+ ## Usage
120
+
121
+ ### Basic Invocation and Caching
122
+
123
+ Call the decorated function as usual. On the first call, the result is computed and stored in the cache. Subsequent calls with the same arguments load the result from the cache:
124
+
125
+ ```python
126
+ result = expensive_function(4) # Computes and stores the result
127
+ result = expensive_function(4) # Loads the result from the cache
128
+ ```
129
+
130
+ ### Force Recalculation
131
+
132
+ Force a recalculation and overwrite the stored checkpoint:
133
+
134
+ ```python
135
+ result = expensive_function.rerun(4)
136
+ ```
137
+
138
+ ### Call the Original Function
139
+
140
+ Use `fn` to directly call the original, undecorated function:
141
+
142
+ ```python
143
+ result = expensive_function.fn(4)
144
+ ```
145
+
146
+ This is especially useful **inside recursive functions** to avoid redundant caching of intermediate steps while still caching the final result.
147
+
148
+ ### Retrieve Stored Checkpoints
149
+
150
+ Access cached results without recalculating:
151
+
152
+ ```python
153
+ stored_result = expensive_function.get(4)
154
+ ```
155
+
156
+ ---
157
+
158
+ ## Storage Backends
159
+
160
+ `checkpointer` works with both built-in and custom storage backends, so you can use what's provided or roll your own as needed.
161
+
162
+ ### Built-In Backends
163
+
164
+ 1. **PickleStorage**: Stores checkpoints on disk using Python's `pickle`.
165
+ 2. **MemoryStorage**: Keeps checkpoints in memory for non-persistent, fast caching.
166
+
167
+ You can specify a storage backend using either its name (`"pickle"` or `"memory"`) or its corresponding class (`PickleStorage` or `MemoryStorage`) in the `format` parameter:
168
+
169
+ ```python
170
+ from checkpointer import checkpoint, PickleStorage, MemoryStorage
171
+
172
+ @checkpoint(format="pickle") # Short for format=PickleStorage
173
+ def disk_cached(x: int) -> int:
174
+   return x ** 2
175
+
176
+ @checkpoint(format="memory") # Short for format=MemoryStorage
177
+ def memory_cached(x: int) -> int:
178
+   return x * 10
179
+ ```
180
+
181
+ ### Custom Storage Backends
182
+
183
+ Create a custom storage backend by inheriting from the `Storage` class and implementing its methods. Access configuration options through the `self.checkpointer` attribute, an instance of `Checkpointer`.
184
+
185
+ #### Example: Custom Storage Backend
186
+
187
+ ```python
188
+ from checkpointer import checkpoint, Storage
189
+ from datetime import datetime
190
+
191
+ class CustomStorage(Storage):
192
+   def exists(self, path) -> bool: ... # Check if a checkpoint exists at the given path
193
+   def checkpoint_date(self, path) -> datetime: ... # Return the date the checkpoint was created
194
+   def store(self, path, data): ... # Save the checkpoint data
195
+   def load(self, path): ... # Return the checkpoint data
196
+   def delete(self, path): ... # Delete the checkpoint
197
+
198
+ @checkpoint(format=CustomStorage)
199
+ def custom_cached(x: int):
200
+   return x ** 2
201
+ ```
202
+
203
+ Using a custom backend lets you tailor storage to your application, whether it involves databases, cloud storage, or custom file formats.
204
+
205
+ ---
206
+
207
+ ## Configuration Options ⚙️
208
+
209
+ | Option | Type | Default | Description |
210
+ |-----------------|-----------------------------------|----------------------|------------------------------------------------|
211
+ | `capture` | `bool` | `False` | Include captured variables in function hashes. |
212
+ | `format` | `"pickle"`, `"memory"`, `Storage` | `"pickle"` | Storage backend format. |
213
+ | `root_path` | `Path`, `str`, or `None` | ~/.cache/checkpoints | Root directory for storing checkpoints. |
214
+ | `when` | `bool` | `True` | Enable or disable checkpointing. |
215
+ | `verbosity` | `0` or `1` | `1` | Logging verbosity. |
216
+ | `path` | `Callable[..., str]` | `None` | Custom path for checkpoint storage. |
217
+ | `should_expire` | `Callable[[datetime], bool]` | `None` | Custom expiration logic. |
218
+
219
+ ---
220
+
221
+ ## Full Example 🛠️
222
+
223
+ ```python
224
+ import asyncio
225
+ from checkpointer import checkpoint
226
+
227
+ @checkpoint
228
+ def compute_square(n: int) -> int:
229
+   print(f"Computing {n}^2...")
230
+   return n ** 2
231
+
232
+ @checkpoint(format="memory")
233
+ async def async_compute_sum(a: int, b: int) -> int:
234
+   await asyncio.sleep(1)
235
+   return a + b
236
+
237
+ async def main():
238
+   result1 = compute_square(5)
239
+   print(result1) # Outputs 25
240
+
241
+   result2 = await async_compute_sum(3, 7)
242
+   print(result2) # Outputs 10
243
+
244
+   result3 = await async_compute_sum.get(3, 7)
245
+   print(result3) # Outputs 10
246
+
247
+ asyncio.run(main())
248
+ ```
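The configuration table in the README above lists `should_expire` as a `Callable[[datetime], bool]` but gives no example of one. The following is a minimal sketch of such a predicate, assuming the datetime it receives is a naive local timestamp (the diff does not say). The names `expire_after`, `max_age`, and `now` are hypothetical and not part of the package.

```python
from datetime import datetime, timedelta
from typing import Callable

def expire_after(
    max_age: timedelta,
    now: Callable[[], datetime] = datetime.now,  # injectable clock, hypothetical
) -> Callable[[datetime], bool]:
    """Build a predicate that reports whether a checkpoint is older than max_age."""
    def should_expire(created: datetime) -> bool:
        # Assumes `created` is a naive local datetime, like datetime.now().
        return now() - created > max_age
    return should_expire

one_day = expire_after(timedelta(days=1))
fresh = datetime.now() - timedelta(hours=1)
stale = datetime.now() - timedelta(days=2)
print(one_day(fresh), one_day(stale))
```

Going by the options table, a predicate like `one_day` would presumably be passed as `checkpoint(should_expire=one_day)`.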
@@ -0,0 +1,231 @@
1
+ # checkpointer · [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![Python 3.12](https://img.shields.io/badge/python-3.12-blue)](https://pypi.org/project/checkpointer/)
2
+
3
+ `checkpointer` is a Python library for memoizing function results. It provides a decorator-based API with support for multiple storage backends. Use it for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations.
4
+
5
+ Adding or removing `@checkpoint` doesn't change how your code works. You can apply it to any function, including ones you've already written, without altering their behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.
6
+
7
+ ### Key Features:
8
+ - 🗂️ **Multiple Storage Backends**: Built-in support for in-memory and pickle-based storage, or create your own.
9
+ - 🎯 **Simple Decorator API**: Apply `@checkpoint` to functions without boilerplate.
10
+ - 🔄 **Async and Sync Compatibility**: Works with synchronous functions and any Python async runtime (e.g., `asyncio`, `Trio`, `Curio`).
11
+ - ⏲️ **Custom Expiration Logic**: Automatically invalidate old checkpoints.
12
+ - 📂 **Flexible Path Configuration**: Control where checkpoints are stored.
13
+ - 📦 **Captured Variables Handling**: Optionally include captured variables in cache invalidation.
14
+
15
+ ---
16
+
17
+ ## Installation
18
+
19
+ ```bash
20
+ pip install checkpointer
21
+ ```
22
+
23
+ ---
24
+
25
+ ## Quick Start 🚀
26
+
27
+ ```python
28
+ from checkpointer import checkpoint
29
+
30
+ @checkpoint
31
+ def expensive_function(x: int) -> int:
32
+   print("Computing...")
33
+   return x ** 2
34
+
35
+ result = expensive_function(4) # Computes and stores the result
36
+ result = expensive_function(4) # Loads from the cache
37
+ ```
38
+
39
+ ---
40
+
41
+ ## How It Works
42
+
43
+ When you use `@checkpoint`, the function's **arguments** (`args`, `kwargs`) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, `checkpointer` loads the cached result instead of recomputing.
44
+
45
+ Additionally, `checkpointer` ensures that caches are invalidated when a function's implementation or any of its dependencies change. Each function is assigned a hash based on:
46
+
47
+ 1. **Its source code**: Changes to the function's code update its hash.
48
+ 2. **Dependent functions**: If a function calls others, changes in those dependencies will also update the hash.
49
+ 3. **Captured variables**: (Optional) If `capture=True`, changes to captured variables and global variables will also update the hash.
50
+
51
+ ### Example: Cache Invalidation
52
+
53
+ ```python
54
+ def multiply(a, b):
55
+   return a * b
56
+
57
+ @checkpoint
58
+ def helper(x):
59
+   return multiply(x + 1, 2)
60
+
61
+ @checkpoint
62
+ def compute(a, b):
63
+   return helper(a) + helper(b)
64
+ ```
65
+
66
+ If you modify `multiply`, caches for both `helper` and `compute` are invalidated and recomputed.
67
+
68
+ ---
69
+
70
+ ## Parameterization
71
+
72
+ ### Custom Configuration
73
+
74
+ Set up a `Checkpointer` instance with custom settings, and extend it by calling itself with overrides:
75
+
76
+ ```python
77
+ from checkpointer import checkpoint
78
+
79
+ IS_DEVELOPMENT = True # Toggle based on your environment
80
+
81
+ tmp_checkpoint = checkpoint(root_path="/tmp/checkpoints")
82
+ dev_checkpoint = tmp_checkpoint(when=IS_DEVELOPMENT) # Adds development-specific behavior
83
+ ```
84
+
85
+ ### Per-Function Customization & Layered Caching
86
+
87
+ Layer caches by stacking checkpoints:
88
+
89
+ ```python
90
+ @checkpoint(format="memory") # Always use memory storage
91
+ @dev_checkpoint # Adds caching during development
92
+ def some_expensive_function():
93
+   print("Performing a time-consuming operation...")
94
+   return sum(i * i for i in range(10**6))
95
+ ```
96
+
97
+ - **In development**: Both `dev_checkpoint` and `memory` caches are active.
98
+ - **In production**: Only the `memory` cache is active.
99
+
100
+ ---
101
+
102
+ ## Usage
103
+
104
+ ### Basic Invocation and Caching
105
+
106
+ Call the decorated function as usual. On the first call, the result is computed and stored in the cache. Subsequent calls with the same arguments load the result from the cache:
107
+
108
+ ```python
109
+ result = expensive_function(4) # Computes and stores the result
110
+ result = expensive_function(4) # Loads the result from the cache
111
+ ```
112
+
113
+ ### Force Recalculation
114
+
115
+ Force a recalculation and overwrite the stored checkpoint:
116
+
117
+ ```python
118
+ result = expensive_function.rerun(4)
119
+ ```
120
+
121
+ ### Call the Original Function
122
+
123
+ Use `fn` to directly call the original, undecorated function:
124
+
125
+ ```python
126
+ result = expensive_function.fn(4)
127
+ ```
128
+
129
+ This is especially useful **inside recursive functions** to avoid redundant caching of intermediate steps while still caching the final result.
130
+
131
+ ### Retrieve Stored Checkpoints
132
+
133
+ Access cached results without recalculating:
134
+
135
+ ```python
136
+ stored_result = expensive_function.get(4)
137
+ ```
138
+
139
+ ---
140
+
141
+ ## Storage Backends
142
+
143
+ `checkpointer` works with both built-in and custom storage backends, so you can use what's provided or roll your own as needed.
144
+
145
+ ### Built-In Backends
146
+
147
+ 1. **PickleStorage**: Stores checkpoints on disk using Python's `pickle`.
148
+ 2. **MemoryStorage**: Keeps checkpoints in memory for non-persistent, fast caching.
149
+
150
+ You can specify a storage backend using either its name (`"pickle"` or `"memory"`) or its corresponding class (`PickleStorage` or `MemoryStorage`) in the `format` parameter:
151
+
152
+ ```python
153
+ from checkpointer import checkpoint, PickleStorage, MemoryStorage
154
+
155
+ @checkpoint(format="pickle") # Short for format=PickleStorage
156
+ def disk_cached(x: int) -> int:
157
+   return x ** 2
158
+
159
+ @checkpoint(format="memory") # Short for format=MemoryStorage
160
+ def memory_cached(x: int) -> int:
161
+   return x * 10
162
+ ```
163
+
164
+ ### Custom Storage Backends
165
+
166
+ Create a custom storage backend by inheriting from the `Storage` class and implementing its methods. Access configuration options through the `self.checkpointer` attribute, an instance of `Checkpointer`.
167
+
168
+ #### Example: Custom Storage Backend
169
+
170
+ ```python
171
+ from checkpointer import checkpoint, Storage
172
+ from datetime import datetime
173
+
174
+ class CustomStorage(Storage):
175
+   def exists(self, path) -> bool: ... # Check if a checkpoint exists at the given path
176
+   def checkpoint_date(self, path) -> datetime: ... # Return the date the checkpoint was created
177
+   def store(self, path, data): ... # Save the checkpoint data
178
+   def load(self, path): ... # Return the checkpoint data
179
+   def delete(self, path): ... # Delete the checkpoint
180
+
181
+ @checkpoint(format=CustomStorage)
182
+ def custom_cached(x: int):
183
+   return x ** 2
184
+ ```
185
+
186
+ Using a custom backend lets you tailor storage to your application, whether it involves databases, cloud storage, or custom file formats.
187
+
188
+ ---
189
+
190
+ ## Configuration Options ⚙️
191
+
192
+ | Option | Type | Default | Description |
193
+ |-----------------|-----------------------------------|----------------------|------------------------------------------------|
194
+ | `capture` | `bool` | `False` | Include captured variables in function hashes. |
195
+ | `format` | `"pickle"`, `"memory"`, `Storage` | `"pickle"` | Storage backend format. |
196
+ | `root_path` | `Path`, `str`, or `None` | ~/.cache/checkpoints | Root directory for storing checkpoints. |
197
+ | `when` | `bool` | `True` | Enable or disable checkpointing. |
198
+ | `verbosity` | `0` or `1` | `1` | Logging verbosity. |
199
+ | `path` | `Callable[..., str]` | `None` | Custom path for checkpoint storage. |
200
+ | `should_expire` | `Callable[[datetime], bool]` | `None` | Custom expiration logic. |
201
+
202
+ ---
203
+
204
+ ## Full Example 🛠️
205
+
206
+ ```python
207
+ import asyncio
208
+ from checkpointer import checkpoint
209
+
210
+ @checkpoint
211
+ def compute_square(n: int) -> int:
212
+   print(f"Computing {n}^2...")
213
+   return n ** 2
214
+
215
+ @checkpoint(format="memory")
216
+ async def async_compute_sum(a: int, b: int) -> int:
217
+   await asyncio.sleep(1)
218
+   return a + b
219
+
220
+ async def main():
221
+   result1 = compute_square(5)
222
+   print(result1) # Outputs 25
223
+
224
+   result2 = await async_compute_sum(3, 7)
225
+   print(result2) # Outputs 10
226
+
227
+   result3 = await async_compute_sum.get(3, 7)
228
+   print(result3) # Outputs 10
229
+
230
+ asyncio.run(main())
231
+ ```
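The README's custom storage example leaves all five methods as `...`. The sketch below fills them in with a plain dictionary so the contract is easier to see. It is a duck-typed stand-in rather than a real `Storage` subclass, so it runs without the package installed. The class name `DictStorage` and its `_items` attribute are hypothetical. The constructor takes a `checkpointer` argument only because the source change further down instantiates backends as `storage(checkpointer)`.

```python
from datetime import datetime
from pathlib import Path
from typing import Any

class DictStorage:
    """Duck-typed stand-in for a Storage subclass; keeps checkpoints in a dict."""

    def __init__(self, checkpointer: Any = None):
        self.checkpointer = checkpointer  # configuration handle, unused here
        self._items: dict[str, tuple[datetime, Any]] = {}

    def exists(self, path: Path) -> bool:
        return str(path) in self._items

    def checkpoint_date(self, path: Path) -> datetime:
        # Creation time recorded when the checkpoint was stored.
        return self._items[str(path)][0]

    def store(self, path: Path, data: Any) -> None:
        self._items[str(path)] = (datetime.now(), data)

    def load(self, path: Path) -> Any:
        return self._items[str(path)][1]

    def delete(self, path: Path) -> None:
        self._items.pop(str(path), None)

storage = DictStorage()
key = Path("/tmp/checkpoints/example.py/square/abc123")  # illustrative path
storage.store(key, 16)
print(storage.exists(key), storage.load(key))
storage.delete(key)
print(storage.exists(key))
```

A real backend would inherit from `checkpointer.Storage` as the README describes. The dictionary here only stands in for whatever database or object store is actually used.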
@@ -5,5 +5,6 @@ import tempfile
5
5
 
6
6
  create_checkpointer = Checkpointer
7
7
  checkpoint = Checkpointer()
8
- memory_checkpoint = Checkpointer(format="memory")
8
+ capture_checkpoint = Checkpointer(capture=True)
9
+ memory_checkpoint = Checkpointer(format="memory", verbosity=0)
9
10
  tmp_checkpoint = Checkpointer(root_path=tempfile.gettempdir() + "/checkpoints")
@@ -1,32 +1,31 @@
1
+ from __future__ import annotations
1
2
  import inspect
2
3
  import relib.hashing as hashing
3
- from typing import Generic, TypeVar, TypedDict, Callable, Unpack, Literal, Union, Any, cast, overload
4
- from datetime import datetime
4
+ from typing import Generic, TypeVar, Type, TypedDict, Callable, Unpack, Literal, Any, cast, overload
5
5
  from pathlib import Path
6
+ from datetime import datetime
6
7
  from functools import update_wrapper
7
8
  from .types import Storage
8
9
  from .function_body import get_function_hash
9
- from .utils import unwrap_fn, sync_resolve_coroutine
10
- from .storages.pickle_storage import PickleStorage
11
- from .storages.memory_storage import MemoryStorage
12
- from .storages.bcolz_storage import BcolzStorage
10
+ from .utils import unwrap_fn, sync_resolve_coroutine, resolved_awaitable
11
+ from .storages import STORAGE_MAP
13
12
  from .print_checkpoint import print_checkpoint
14
13
 
15
14
  Fn = TypeVar("Fn", bound=Callable)
16
15
 
17
16
  DEFAULT_DIR = Path.home() / ".cache/checkpoints"
18
- STORAGE_MAP = {"memory": MemoryStorage, "pickle": PickleStorage, "bcolz": BcolzStorage}
19
17
 
20
18
  class CheckpointError(Exception):
21
19
    pass
22
20
 
23
21
  class CheckpointerOpts(TypedDict, total=False):
24
-   format: Storage | Literal["pickle", "memory", "bcolz"]
22
+   format: Type[Storage] | Literal["pickle", "memory", "bcolz"]
25
23
    root_path: Path | str | None
26
24
    when: bool
27
25
    verbosity: Literal[0, 1]
28
26
    path: Callable[..., str] | None
29
27
    should_expire: Callable[[datetime], bool] | None
28
+   capture: bool
30
29
 
31
30
  class Checkpointer:
32
31
    def __init__(self, **opts: Unpack[CheckpointerOpts]):
@@ -36,15 +35,13 @@ class Checkpointer:
36
35
      self.verbosity = opts.get("verbosity", 1)
37
36
      self.path = opts.get("path")
38
37
      self.should_expire = opts.get("should_expire")
39
-
40
-   def get_storage(self) -> Storage:
41
-     return STORAGE_MAP[self.format] if isinstance(self.format, str) else self.format
38
+     self.capture = opts.get("capture", False)
42
39
 
43
40
    @overload
44
-   def __call__(self, fn: Fn, **override_opts: Unpack[CheckpointerOpts]) -> "CheckpointFn[Fn]": ...
41
+   def __call__(self, fn: Fn, **override_opts: Unpack[CheckpointerOpts]) -> CheckpointFn[Fn]: ...
45
42
    @overload
46
-   def __call__(self, fn: None=None, **override_opts: Unpack[CheckpointerOpts]) -> "Checkpointer": ...
47
-   def __call__(self, fn: Fn | None=None, **override_opts: Unpack[CheckpointerOpts]) -> Union["Checkpointer", "CheckpointFn[Fn]"]:
43
+   def __call__(self, fn: None=None, **override_opts: Unpack[CheckpointerOpts]) -> Checkpointer: ...
44
+   def __call__(self, fn: Fn | None=None, **override_opts: Unpack[CheckpointerOpts]) -> Checkpointer | CheckpointFn[Fn]:
48
45
      if override_opts:
49
46
        opts = CheckpointerOpts(**{**self.__dict__, **override_opts})
50
47
        return Checkpointer(**opts)(fn)
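The `__call__` shown above merges the instance's current attributes with `override_opts` and builds a fresh `Checkpointer` from the result, which is what lets `tmp_checkpoint(when=IS_DEVELOPMENT)` in the README derive one configuration from another. A stripped-down sketch of that merge follows, using a hypothetical `Settings` class with only three of the options. Returning `self` when no overrides are given is an assumption; the diff does not show that branch.

```python
class Settings:
    """Hypothetical config object that derives new instances when called with overrides."""

    def __init__(self, **opts):
        self.format = opts.get("format", "pickle")
        self.when = opts.get("when", True)
        self.verbosity = opts.get("verbosity", 1)

    def __call__(self, **override_opts) -> "Settings":
        if not override_opts:
            return self  # assumed behavior, not shown in the diff
        # In a dict literal later keys win, so overrides replace inherited values.
        return Settings(**{**self.__dict__, **override_opts})

base = Settings(format="memory")
quiet = base(verbosity=0)
print(quiet.format, quiet.verbosity, base.verbosity)
```

Because the merge builds a new object, `base` keeps its own `verbosity` and only `quiet` sees the override.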
@@ -56,15 +53,19 @@ class CheckpointFn(Generic[Fn]):
56
53
      wrapped = unwrap_fn(fn)
57
54
      file_name = Path(wrapped.__code__.co_filename).name
58
55
      update_wrapper(cast(Callable, self), wrapped)
56
+     storage = STORAGE_MAP[checkpointer.format] if isinstance(checkpointer.format, str) else checkpointer.format
59
57
      self.checkpointer = checkpointer
60
58
      self.fn = fn
61
-     self.fn_hash = get_function_hash(wrapped)
59
+     self.fn_hash, self.depends = get_function_hash(wrapped, self.checkpointer.capture)
62
60
      self.fn_id = f"{file_name}/{wrapped.__name__}"
63
61
      self.is_async = inspect.iscoroutinefunction(wrapped)
62
+     self.storage = storage(checkpointer)
64
63
 
65
64
    def get_checkpoint_id(self, args: tuple, kw: dict) -> str:
66
65
      if not callable(self.checkpointer.path):
67
-       return f"{self.fn_id}/{hashing.hash([self.fn_hash, args, kw or 0])}"
66
+       # TODO: use digest size before digesting instead of truncating the hash
67
+       call_hash = hashing.hash((self.fn_hash, args, kw), "blake2b")[:32]
68
+       return f"{self.fn_id}/{call_hash}"
68
69
      checkpoint_id = self.checkpointer.path(*args, **kw)
69
70
      if not isinstance(checkpoint_id, str):
70
71
        raise CheckpointError(f"path function must return a string, got {type(checkpoint_id)}")
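The TODO in `get_checkpoint_id` contrasts truncating a full BLAKE2b hex digest with asking for a shorter digest up front. The sketch below shows both with the standard library's `hashlib`. It does not use `relib.hashing`, whose internals are not part of this diff, and `payload` is a placeholder for whatever bytes that library actually feeds to the hash.

```python
import hashlib

payload = b"fn_hash|args|kwargs"  # placeholder input, not relib's real serialization

# Current approach in the diff: full 64-byte digest, hex string cut to 32 characters.
truncated = hashlib.blake2b(payload).hexdigest()[:32]

# Approach the TODO hints at: request a 16-byte digest, which is 32 hex characters.
sized = hashlib.blake2b(payload, digest_size=16).hexdigest()

print(len(truncated), len(sized))
print(truncated == sized)
```

Both strings are 32 characters long, but they differ because BLAKE2 mixes the requested digest size into its parameters. Switching approaches would therefore change every checkpoint id and orphan checkpoints stored under the old ones.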
@@ -73,27 +74,26 @@ class CheckpointFn(Generic[Fn]):
73
74
    async def _store_on_demand(self, args: tuple, kw: dict, rerun: bool):
74
75
      checkpoint_id = self.get_checkpoint_id(args, kw)
75
76
      checkpoint_path = self.checkpointer.root_path / checkpoint_id
76
-     storage = self.checkpointer.get_storage()
77
-     should_log = storage is not MemoryStorage and self.checkpointer.verbosity > 0
77
+     should_log = self.checkpointer.verbosity > 0
78
78
      refresh = rerun \
79
-       or not storage.exists(checkpoint_path) \
80
-       or (self.checkpointer.should_expire and self.checkpointer.should_expire(storage.checkpoint_date(checkpoint_path)))
79
+       or not self.storage.exists(checkpoint_path) \
80
+       or (self.checkpointer.should_expire and self.checkpointer.should_expire(self.storage.checkpoint_date(checkpoint_path)))
81
81
 
82
82
      if refresh:
83
83
        print_checkpoint(should_log, "MEMORIZING", checkpoint_id, "blue")
84
84
        data = self.fn(*args, **kw)
85
85
        if inspect.iscoroutine(data):
86
86
          data = await data
87
-       storage.store(checkpoint_path, data)
87
+       self.storage.store(checkpoint_path, data)
88
88
        return data
89
89
 
90
90
      try:
91
-       data = storage.load(checkpoint_path)
91
+       data = self.storage.load(checkpoint_path)
92
92
        print_checkpoint(should_log, "REMEMBERED", checkpoint_id, "green")
93
93
        return data
94
94
      except (EOFError, FileNotFoundError):
95
95
        print_checkpoint(should_log, "CORRUPTED", checkpoint_id, "yellow")
96
-       storage.delete(checkpoint_path)
96
+       self.storage.delete(checkpoint_path)
97
97
        return await self._store_on_demand(args, kw, rerun)
98
98
 
99
99
    def _call(self, args: tuple, kw: dict, rerun=False):
@@ -102,13 +102,17 @@ class CheckpointFn(Generic[Fn]):
102
102
      coroutine = self._store_on_demand(args, kw, rerun)
103
103
      return coroutine if self.is_async else sync_resolve_coroutine(coroutine)
104
104
 
105
-   __call__: Fn = cast(Fn, lambda self, *args, **kw: self._call(args, kw))
106
-   rerun: Fn = cast(Fn, lambda self, *args, **kw: self._call(args, kw, True))
107
-
108
-   def get(self, *args, **kw) -> Any:
105
+   def _get(self, args, kw) -> Any:
109
106
      checkpoint_path = self.checkpointer.root_path / self.get_checkpoint_id(args, kw)
110
-     storage = self.checkpointer.get_storage()
111
107
      try:
112
-       return storage.load(checkpoint_path)
108
+       val = self.storage.load(checkpoint_path)
109
+       return resolved_awaitable(val) if self.is_async else val
113
110
      except:
114
111
        raise CheckpointError("Could not load checkpoint")
112
+
113
+   def exists(self, *args: tuple, **kw: dict) -> bool:
114
+     return self.storage.exists(self.checkpointer.root_path / self.get_checkpoint_id(args, kw))
115
+
116
+   __call__: Fn = cast(Fn, lambda self, *args, **kw: self._call(args, kw))
117
+   rerun: Fn = cast(Fn, lambda self, *args, **kw: self._call(args, kw, True))
118
+   get: Fn = cast(Fn, lambda self, *args, **kw: self._get(args, kw))
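With this change, `get` on an async function returns `resolved_awaitable(val)`, which is why the README's full example can write `await async_compute_sum.get(3, 7)`. The helper lives in `.utils` and its body is not part of this diff, so the following is only a guess at an equivalent: an object whose `__await__` completes immediately with a stored value. The class name `ResolvedAwaitable` is hypothetical.

```python
import asyncio
from typing import Any, Generator

class ResolvedAwaitable:
    """Awaitable that yields control zero times and returns a preset value."""

    def __init__(self, value: Any):
        self._value = value

    def __await__(self) -> Generator[Any, None, Any]:
        yield from ()  # makes this a generator function without ever suspending
        return self._value

async def main() -> int:
    # Awaiting the wrapper hands back the stored value without scheduling any I/O.
    return await ResolvedAwaitable(10)

result = asyncio.run(main())
print(result)
```

Nothing here depends on `asyncio` beyond the driver at the bottom, which fits the README's claim that the library works with other async runtimes.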