checkpointer 2.9.0__tar.gz → 2.9.2__tar.gz

@@ -0,0 +1,215 @@
1
+ Metadata-Version: 2.4
2
+ Name: checkpointer
3
+ Version: 2.9.2
4
+ Summary: A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation
5
+ Project-URL: Repository, https://github.com/Reddan/checkpointer.git
6
+ Author: Hampus Hallman
7
+ License: Copyright 2018-2025 Hampus Hallman
8
+
9
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
12
+
13
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
14
+ License-File: LICENSE
15
+ Classifier: Programming Language :: Python :: 3.11
16
+ Classifier: Programming Language :: Python :: 3.12
17
+ Classifier: Programming Language :: Python :: 3.13
18
+ Requires-Python: >=3.11
19
+ Description-Content-Type: text/markdown
20
+
21
+ # checkpointer · [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![pypi](https://img.shields.io/pypi/pyversions/checkpointer)](https://pypi.org/project/checkpointer/)
22
+
23
+ `checkpointer` is a Python library providing a decorator-based API for memoizing (caching) function results. It helps you skip redundant, computationally expensive operations, saving execution time and streamlining your workflows.
24
+
25
+ It works with synchronous and asynchronous functions, supports multiple storage backends, and automatically invalidates caches when function code, dependencies, or captured variables change.
26
+
27
+ ## 📦 Installation
28
+
29
+ ```bash
30
+ pip install checkpointer
31
+ ```
32
+
33
+ ## 🚀 Quick Start
34
+
35
+ Apply the `@checkpoint` decorator to any function:
36
+
37
+ ```python
38
+ from checkpointer import checkpoint
39
+
40
+ @checkpoint
41
+ def expensive_function(x: int) -> int:
42
+ print("Computing...")
43
+ return x ** 2
44
+
45
+ result = expensive_function(4) # Computes and stores the result
46
+ result = expensive_function(4) # Loads from the cache
47
+ ```
48
+
49
+ ## 🧠 How It Works
50
+
51
+ When a function decorated with `@checkpoint` is called:
52
+
53
+ 1. `checkpointer` computes a unique identifier (hash) for the function call based on its source code, its dependencies, and the arguments passed.
54
+ 2. It attempts to retrieve a cached result using this identifier.
55
+ 3. If a cached result is found, it's returned immediately.
56
+ 4. If no cached result exists or the cache has expired, the original function is executed, its result is stored, and then returned.
57
+
58
+ ### ♻️ Automatic Cache Invalidation
59
+
60
+ `checkpointer` ensures caches are invalidated automatically when the underlying computation changes. A function's hash, which determines cache validity, updates if:
61
+
62
+ * **Function Code Changes**: The source code of the decorated function itself is modified.
63
+ * **Dependencies Change**: Any user-defined function in its dependency tree (direct or indirect, even across modules or not decorated with `@checkpoint`) is modified.
64
+ * **Captured Variables Change** (with `capture=True`): Global or closure-based variables used within the function are altered.
65
+
66
+ **Example: Dependency Invalidation**
67
+
68
+ ```python
69
+ def multiply(a, b):
70
+ return a * b
71
+
72
+ @checkpoint
73
+ def helper(x):
74
+ # Depends on `multiply`
75
+ return multiply(x + 1, 2)
76
+
77
+ @checkpoint
78
+ def compute(a, b):
79
+ # Depends on `helper` and `multiply`
80
+ return helper(a) + helper(b)
81
+ ```
82
+
83
+ If `multiply` is modified, caches for both `helper` and `compute` will automatically be invalidated and recomputed upon their next call.
84
+
85
+ ## 💡 Usage
86
+
87
+ Once a function is decorated with `@checkpoint`, you can interact with its caching behavior using the following methods:
88
+
89
+ * **`expensive_function(...)`**:
90
+ Call the function normally. This will either compute and cache the result or load it from the cache if available.
91
+
92
+ * **`expensive_function.rerun(...)`**:
93
+ Forces the original function to execute, compute a new result, and overwrite any existing cached value for the given arguments.
94
+
95
+ * **`expensive_function.fn(...)`**:
96
+ Calls the original, undecorated function directly, bypassing the cache entirely. This is particularly useful within recursive functions to prevent caching intermediate steps.
97
+
98
+ * **`expensive_function.get(...)`**:
99
+ Attempts to retrieve the cached result for the given arguments without executing the original function. Raises `CheckpointError` if no valid cached result exists.
100
+
101
+ * **`expensive_function.exists(...)`**:
102
+ Checks if a cached result exists for the given arguments without attempting to compute or load it. Returns `True` if a valid checkpoint exists, `False` otherwise.
103
+
104
+ * **`expensive_function.delete(...)`**:
105
+ Removes the cached entry for the specified arguments.
106
+
107
+ * **`expensive_function.reinit()`**:
108
+ Recalculates the function's internal hash. This is primarily used when `capture=True` and you need to update the cache based on changes to external variables within the same Python session.
109
+
110
+ ## ⚙️ Configuration & Customization
111
+
112
+ The `@checkpoint` decorator accepts the following parameters to customize its behavior:
113
+
114
+ * **`format`** (Type: `str` or `checkpointer.Storage`, Default: `"pickle"`)
115
+ Defines the storage backend to use. Built-in options are `"pickle"` (disk-based, persistent) and `"memory"` (in-memory, non-persistent). You can also provide a custom `Storage` class.
116
+
117
+ * **`root_path`** (Type: `str` or `pathlib.Path` or `None`, Default: `~/.cache/checkpoints`)
118
+ The base directory for storing disk-based checkpoints. This parameter is only relevant when `format` is set to `"pickle"`.
119
+
120
+ * **`when`** (Type: `bool`, Default: `True`)
121
+ A boolean flag to enable or disable checkpointing for the decorated function. This is particularly useful for toggling caching based on environment variables (e.g., `when=os.environ.get("ENABLE_CACHING", "false").lower() == "true"`).
122
+
123
+ * **`capture`** (Type: `bool`, Default: `False`)
124
+ If set to `True`, `checkpointer` includes global or closure-based variables used by the function in its hash calculation. This ensures that changes to these external variables also trigger cache invalidation and recomputation.
125
+
126
+ * **`should_expire`** (Type: `Callable[[datetime.datetime], bool]`, Default: `None`)
127
+ A custom callable that receives the `datetime` timestamp of a cached result. It should return `True` if the cached result is considered expired and needs recomputation, or `False` otherwise.
128
+
129
+ * **`hash_by`** (Type: `Callable[..., Any]`, Default: `None`)
130
+ A custom callable that takes the function's arguments (`*args`, `**kwargs`) and returns a hashable object (or tuple of objects). This allows for custom argument normalization (e.g., sorting lists before hashing) or optimized hashing for complex input types, which can improve cache hit rates or speed up the hashing process.
131
+
132
+ * **`fn_hash`** (Type: `checkpointer.ObjectHash`, Default: `None`)
133
+ An optional parameter that takes an instance of `checkpointer.ObjectHash`. This allows you to override the automatically computed function hash, giving you explicit control over when the function's cache should be invalidated. You can pass any values relevant to your invalidation logic to `ObjectHash` (e.g., `ObjectHash(version_string, config_id, ...)`), as it can consistently hash most Python values.
134
+
135
+ * **`verbosity`** (Type: `int` (`0`, `1`, or `2`), Default: `1`)
136
+ Controls the level of logging output from `checkpointer`.
137
+ * `0`: No output.
138
+ * `1`: Shows when functions are computed and cached.
139
+ * `2`: Also shows when cached results are remembered (loaded from cache).
140
+
141
+ ### 🗄️ Custom Storage Backends
142
+
143
+ For integration with databases, cloud storage, or custom serialization, implement your own storage backend by inheriting from `checkpointer.Storage` and implementing its abstract methods.
144
+
145
+ Within custom storage methods, `call_id` identifies calls by arguments. Use `self.fn_id()` to get the function's unique identity (name + hash/version), crucial for organizing stored checkpoints (e.g., by function version). Access global `Checkpointer` config via `self.checkpointer`.
146
+
147
+ #### Example: Custom Storage Backend
148
+
149
+ ```python
150
+ from checkpointer import checkpoint, Storage
151
+ from datetime import datetime
152
+
153
+ class MyCustomStorage(Storage):
154
+ def exists(self, call_id):
155
+ # Example: Constructing a path based on function ID and call ID
156
+ fn_dir = self.checkpointer.root_path / self.fn_id()
157
+ return (fn_dir / call_id).exists()
158
+
159
+ def checkpoint_date(self, call_id): ...
160
+ def store(self, call_id, data): ...
161
+ def load(self, call_id): ...
162
+ def delete(self, call_id): ...
163
+
164
+ @checkpoint(format=MyCustomStorage)
165
+ def custom_cached_function(x: int):
166
+ return x ** 2
167
+ ```
168
+
169
+ ## 🧱 Layered Caching
170
+
171
+ You can apply multiple `@checkpoint` decorators to a single function to create layered caching strategies. `checkpointer` processes these decorators from bottom to top, meaning the decorator closest to the function definition is evaluated first.
172
+
173
+ This is useful for scenarios like combining a fast, ephemeral cache (e.g., in-memory) with a persistent, slower cache (e.g., disk-based).
174
+
175
+ **Example: Memory Cache over Disk Cache**
176
+
177
+ ```python
178
+ from checkpointer import checkpoint
179
+
180
+ @checkpoint(format="memory") # Layer 2: Fast, ephemeral in-memory cache
181
+ @checkpoint(format="pickle") # Layer 1: Persistent disk cache
182
+ def some_expensive_operation():
183
+ print("Performing a time-consuming operation...")
184
+ return sum(i for i in range(10**7))
185
+ ```
186
+
187
+ ## ⚡ Async Support
188
+
189
+ `checkpointer` works seamlessly with Python's `asyncio` and other async runtimes.
190
+
191
+ ```python
192
+ import asyncio
193
+ from checkpointer import checkpoint
194
+
195
+ @checkpoint
196
+ async def async_compute_sum(a: int, b: int) -> int:
197
+ print(f"Asynchronously computing {a} + {b}...")
198
+ await asyncio.sleep(1)
199
+ return a + b
200
+
201
+ async def main():
202
+ # First call computes and caches
203
+ result1 = await async_compute_sum(3, 7)
204
+ print(f"Result 1: {result1}")
205
+
206
+ # Second call loads from cache
207
+ result2 = await async_compute_sum(3, 7)
208
+ print(f"Result 2: {result2}")
209
+
210
+ # Retrieve from cache without re-running the async function
211
+ result3 = async_compute_sum.get(3, 7)
212
+ print(f"Result 3 (from cache): {result3}")
213
+
214
+ asyncio.run(main())
215
+ ```
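The `should_expire` parameter documented above takes a callable that receives the checkpoint's `datetime` and returns `True` when the entry should be recomputed. A minimal sketch of such a callable, assuming a one-hour time-to-live. The helper name `older_than` is hypothetical and not part of `checkpointer`, and the decorator usage is left as a comment so the block runs without the package installed:

```python
from datetime import datetime, timedelta
from typing import Callable

def older_than(max_age: timedelta) -> Callable[[datetime], bool]:
    """Build a predicate that returns True once a timestamp is older than max_age."""
    def should_expire(stored_at: datetime) -> bool:
        # Naive datetimes on both sides, matching datetime.now() timestamps
        return datetime.now() - stored_at > max_age
    return should_expire

expire_after_hour = older_than(timedelta(hours=1))

# Hypothetical usage, following the README's parameter description:
# @checkpoint(should_expire=expire_after_hour)
# def fetch_report(day: str): ...

fresh = datetime.now() - timedelta(minutes=5)
stale = datetime.now() - timedelta(hours=2)
print(expire_after_hour(fresh), expire_after_hour(stale))
```

Building the predicate from a factory keeps the time-to-live configurable per function without repeating the comparison logic.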
@@ -0,0 +1,195 @@
1
+ # checkpointer · [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![pypi](https://img.shields.io/pypi/pyversions/checkpointer)](https://pypi.org/project/checkpointer/)
2
+
3
+ `checkpointer` is a Python library providing a decorator-based API for memoizing (caching) function results. It helps you skip redundant, computationally expensive operations, saving execution time and streamlining your workflows.
4
+
5
+ It works with synchronous and asynchronous functions, supports multiple storage backends, and automatically invalidates caches when function code, dependencies, or captured variables change.
6
+
7
+ ## 📦 Installation
8
+
9
+ ```bash
10
+ pip install checkpointer
11
+ ```
12
+
13
+ ## 🚀 Quick Start
14
+
15
+ Apply the `@checkpoint` decorator to any function:
16
+
17
+ ```python
18
+ from checkpointer import checkpoint
19
+
20
+ @checkpoint
21
+ def expensive_function(x: int) -> int:
22
+ print("Computing...")
23
+ return x ** 2
24
+
25
+ result = expensive_function(4) # Computes and stores the result
26
+ result = expensive_function(4) # Loads from the cache
27
+ ```
28
+
29
+ ## 🧠 How It Works
30
+
31
+ When a function decorated with `@checkpoint` is called:
32
+
33
+ 1. `checkpointer` computes a unique identifier (hash) for the function call based on its source code, its dependencies, and the arguments passed.
34
+ 2. It attempts to retrieve a cached result using this identifier.
35
+ 3. If a cached result is found, it's returned immediately.
36
+ 4. If no cached result exists or the cache has expired, the original function is executed, its result is stored, and then returned.
37
+
38
+ ### ♻️ Automatic Cache Invalidation
39
+
40
+ `checkpointer` ensures caches are invalidated automatically when the underlying computation changes. A function's hash, which determines cache validity, updates if:
41
+
42
+ * **Function Code Changes**: The source code of the decorated function itself is modified.
43
+ * **Dependencies Change**: Any user-defined function in its dependency tree (direct or indirect, even across modules or not decorated with `@checkpoint`) is modified.
44
+ * **Captured Variables Change** (with `capture=True`): Global or closure-based variables used within the function are altered.
45
+
46
+ **Example: Dependency Invalidation**
47
+
48
+ ```python
49
+ def multiply(a, b):
50
+ return a * b
51
+
52
+ @checkpoint
53
+ def helper(x):
54
+ # Depends on `multiply`
55
+ return multiply(x + 1, 2)
56
+
57
+ @checkpoint
58
+ def compute(a, b):
59
+ # Depends on `helper` and `multiply`
60
+ return helper(a) + helper(b)
61
+ ```
62
+
63
+ If `multiply` is modified, caches for both `helper` and `compute` will automatically be invalidated and recomputed upon their next call.
64
+
65
+ ## 💡 Usage
66
+
67
+ Once a function is decorated with `@checkpoint`, you can interact with its caching behavior using the following methods:
68
+
69
+ * **`expensive_function(...)`**:
70
+ Call the function normally. This will either compute and cache the result or load it from the cache if available.
71
+
72
+ * **`expensive_function.rerun(...)`**:
73
+ Forces the original function to execute, compute a new result, and overwrite any existing cached value for the given arguments.
74
+
75
+ * **`expensive_function.fn(...)`**:
76
+ Calls the original, undecorated function directly, bypassing the cache entirely. This is particularly useful within recursive functions to prevent caching intermediate steps.
77
+
78
+ * **`expensive_function.get(...)`**:
79
+ Attempts to retrieve the cached result for the given arguments without executing the original function. Raises `CheckpointError` if no valid cached result exists.
80
+
81
+ * **`expensive_function.exists(...)`**:
82
+ Checks if a cached result exists for the given arguments without attempting to compute or load it. Returns `True` if a valid checkpoint exists, `False` otherwise.
83
+
84
+ * **`expensive_function.delete(...)`**:
85
+ Removes the cached entry for the specified arguments.
86
+
87
+ * **`expensive_function.reinit()`**:
88
+ Recalculates the function's internal hash. This is primarily used when `capture=True` and you need to update the cache based on changes to external variables within the same Python session.
89
+
90
+ ## ⚙️ Configuration & Customization
91
+
92
+ The `@checkpoint` decorator accepts the following parameters to customize its behavior:
93
+
94
+ * **`format`** (Type: `str` or `checkpointer.Storage`, Default: `"pickle"`)
95
+ Defines the storage backend to use. Built-in options are `"pickle"` (disk-based, persistent) and `"memory"` (in-memory, non-persistent). You can also provide a custom `Storage` class.
96
+
97
+ * **`root_path`** (Type: `str` or `pathlib.Path` or `None`, Default: `~/.cache/checkpoints`)
98
+ The base directory for storing disk-based checkpoints. This parameter is only relevant when `format` is set to `"pickle"`.
99
+
100
+ * **`when`** (Type: `bool`, Default: `True`)
101
+ A boolean flag to enable or disable checkpointing for the decorated function. This is particularly useful for toggling caching based on environment variables (e.g., `when=os.environ.get("ENABLE_CACHING", "false").lower() == "true"`).
102
+
103
+ * **`capture`** (Type: `bool`, Default: `False`)
104
+ If set to `True`, `checkpointer` includes global or closure-based variables used by the function in its hash calculation. This ensures that changes to these external variables also trigger cache invalidation and recomputation.
105
+
106
+ * **`should_expire`** (Type: `Callable[[datetime.datetime], bool]`, Default: `None`)
107
+ A custom callable that receives the `datetime` timestamp of a cached result. It should return `True` if the cached result is considered expired and needs recomputation, or `False` otherwise.
108
+
109
+ * **`hash_by`** (Type: `Callable[..., Any]`, Default: `None`)
110
+ A custom callable that takes the function's arguments (`*args`, `**kwargs`) and returns a hashable object (or tuple of objects). This allows for custom argument normalization (e.g., sorting lists before hashing) or optimized hashing for complex input types, which can improve cache hit rates or speed up the hashing process.
111
+
112
+ * **`fn_hash`** (Type: `checkpointer.ObjectHash`, Default: `None`)
113
+ An optional parameter that takes an instance of `checkpointer.ObjectHash`. This allows you to override the automatically computed function hash, giving you explicit control over when the function's cache should be invalidated. You can pass any values relevant to your invalidation logic to `ObjectHash` (e.g., `ObjectHash(version_string, config_id, ...)`), as it can consistently hash most Python values.
114
+
115
+ * **`verbosity`** (Type: `int` (`0`, `1`, or `2`), Default: `1`)
116
+ Controls the level of logging output from `checkpointer`.
117
+ * `0`: No output.
118
+ * `1`: Shows when functions are computed and cached.
119
+ * `2`: Also shows when cached results are remembered (loaded from cache).
120
+
121
+ ### 🗄️ Custom Storage Backends
122
+
123
+ For integration with databases, cloud storage, or custom serialization, implement your own storage backend by inheriting from `checkpointer.Storage` and implementing its abstract methods.
124
+
125
+ Within custom storage methods, `call_id` identifies calls by arguments. Use `self.fn_id()` to get the function's unique identity (name + hash/version), crucial for organizing stored checkpoints (e.g., by function version). Access global `Checkpointer` config via `self.checkpointer`.
126
+
127
+ #### Example: Custom Storage Backend
128
+
129
+ ```python
130
+ from checkpointer import checkpoint, Storage
131
+ from datetime import datetime
132
+
133
+ class MyCustomStorage(Storage):
134
+ def exists(self, call_id):
135
+ # Example: Constructing a path based on function ID and call ID
136
+ fn_dir = self.checkpointer.root_path / self.fn_id()
137
+ return (fn_dir / call_id).exists()
138
+
139
+ def checkpoint_date(self, call_id): ...
140
+ def store(self, call_id, data): ...
141
+ def load(self, call_id): ...
142
+ def delete(self, call_id): ...
143
+
144
+ @checkpoint(format=MyCustomStorage)
145
+ def custom_cached_function(x: int):
146
+ return x ** 2
147
+ ```
148
+
149
+ ## 🧱 Layered Caching
150
+
151
+ You can apply multiple `@checkpoint` decorators to a single function to create layered caching strategies. `checkpointer` processes these decorators from bottom to top, meaning the decorator closest to the function definition is evaluated first.
152
+
153
+ This is useful for scenarios like combining a fast, ephemeral cache (e.g., in-memory) with a persistent, slower cache (e.g., disk-based).
154
+
155
+ **Example: Memory Cache over Disk Cache**
156
+
157
+ ```python
158
+ from checkpointer import checkpoint
159
+
160
+ @checkpoint(format="memory") # Layer 2: Fast, ephemeral in-memory cache
161
+ @checkpoint(format="pickle") # Layer 1: Persistent disk cache
162
+ def some_expensive_operation():
163
+ print("Performing a time-consuming operation...")
164
+ return sum(i for i in range(10**7))
165
+ ```
166
+
167
+ ## ⚡ Async Support
168
+
169
+ `checkpointer` works seamlessly with Python's `asyncio` and other async runtimes.
170
+
171
+ ```python
172
+ import asyncio
173
+ from checkpointer import checkpoint
174
+
175
+ @checkpoint
176
+ async def async_compute_sum(a: int, b: int) -> int:
177
+ print(f"Asynchronously computing {a} + {b}...")
178
+ await asyncio.sleep(1)
179
+ return a + b
180
+
181
+ async def main():
182
+ # First call computes and caches
183
+ result1 = await async_compute_sum(3, 7)
184
+ print(f"Result 1: {result1}")
185
+
186
+ # Second call loads from cache
187
+ result2 = await async_compute_sum(3, 7)
188
+ print(f"Result 2: {result2}")
189
+
190
+ # Retrieve from cache without re-running the async function
191
+ result3 = async_compute_sum.get(3, 7)
192
+ print(f"Result 3 (from cache): {result3}")
193
+
194
+ asyncio.run(main())
195
+ ```
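The `hash_by` parameter in the README is described as a callable that receives the function's arguments and returns a hashable object, for example to normalize arguments before hashing. A minimal sketch of one such normalizer, assuming a function that takes a collection of tags and a keyword-only `limit`. The names `normalize_tags` and `search` are hypothetical, and the decorator usage is shown only as a comment:

```python
from typing import Iterable

def normalize_tags(tags: Iterable[str], *, limit: int = 10) -> tuple:
    """Map equivalent calls to one key: tag order and duplicates are ignored."""
    return (tuple(sorted(set(tags))), limit)

# Hypothetical usage, following the README's description of `hash_by`:
# @checkpoint(hash_by=normalize_tags)
# def search(tags, *, limit=10): ...

key_a = normalize_tags(["b", "a", "a"], limit=5)
key_b = normalize_tags(("a", "b"), limit=5)
key_c = normalize_tags(["a"], limit=5)
print(key_a == key_b, key_a == key_c)
```

Calls that differ only in tag order or repetition map to the same key, which is the kind of cache-hit improvement the README mentions.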
@@ -1,7 +1,7 @@
1
1
  import gc
2
2
  import tempfile
3
3
  from typing import Callable
4
- from .checkpoint import Checkpointer, CheckpointError, CheckpointFn
4
+ from .checkpoint import CachedFunction, Checkpointer, CheckpointError
5
5
  from .object_hash import ObjectHash
6
6
  from .storages import MemoryStorage, PickleStorage, Storage
7
7
 
@@ -14,8 +14,8 @@ static_checkpoint = Checkpointer(fn_hash=ObjectHash())
14
14
 
15
15
  def cleanup_all(invalidated=True, expired=True):
16
16
  for obj in gc.get_objects():
17
- if isinstance(obj, CheckpointFn):
17
+ if isinstance(obj, CachedFunction):
18
18
  obj.cleanup(invalidated=invalidated, expired=expired)
19
19
 
20
20
  def get_function_hash(fn: Callable, capture=False) -> str:
21
- return CheckpointFn(Checkpointer(capture=capture), fn).fn_hash
21
+ return CachedFunction(Checkpointer(capture=capture), fn).fn_hash
@@ -1,7 +1,6 @@
1
1
  from __future__ import annotations
2
2
  import inspect
3
3
  import re
4
- from contextlib import suppress
5
4
  from datetime import datetime
6
5
  from functools import cached_property, update_wrapper
7
6
  from pathlib import Path
@@ -43,17 +42,17 @@ class Checkpointer:
43
42
  self.fn_hash = opts.get("fn_hash")
44
43
 
45
44
  @overload
46
- def __call__(self, fn: Fn, **override_opts: Unpack[CheckpointerOpts]) -> CheckpointFn[Fn]: ...
45
+ def __call__(self, fn: Fn, **override_opts: Unpack[CheckpointerOpts]) -> CachedFunction[Fn]: ...
47
46
  @overload
48
47
  def __call__(self, fn: None=None, **override_opts: Unpack[CheckpointerOpts]) -> Checkpointer: ...
49
- def __call__(self, fn: Fn | None=None, **override_opts: Unpack[CheckpointerOpts]) -> Checkpointer | CheckpointFn[Fn]:
48
+ def __call__(self, fn: Fn | None=None, **override_opts: Unpack[CheckpointerOpts]) -> Checkpointer | CachedFunction[Fn]:
50
49
  if override_opts:
51
50
  opts = CheckpointerOpts(**{**self.__dict__, **override_opts})
52
51
  return Checkpointer(**opts)(fn)
53
52
 
54
- return CheckpointFn(self, fn) if callable(fn) else self
53
+ return CachedFunction(self, fn) if callable(fn) else self
55
54
 
56
- class CheckpointFn(Generic[Fn]):
55
+ class CachedFunction(Generic[Fn]):
57
56
  def __init__(self, checkpointer: Checkpointer, fn: Fn):
58
57
  wrapped = unwrap_fn(fn)
59
58
  fn_file = Path(wrapped.__code__.co_filename).name
@@ -62,9 +61,9 @@ class CheckpointFn(Generic[Fn]):
62
61
  update_wrapper(cast(Callable, self), wrapped)
63
62
  self.checkpointer = checkpointer
64
63
  self.fn = fn
64
+ self.fn_dir = f"{fn_file}/{fn_name}"
65
65
  self.storage = Storage(self)
66
66
  self.cleanup = self.storage.cleanup
67
- self.fn_dir = f"{fn_file}/{fn_name}"
68
67
 
69
68
  @cached_property
70
69
  def ident_tuple(self) -> tuple[str, list[Callable]]:
@@ -80,15 +79,15 @@ class CheckpointFn(Generic[Fn]):
80
79
 
81
80
  @cached_property
82
81
  def fn_hash(self) -> str:
83
- fn_hash = self.checkpointer.fn_hash
84
82
  deep_hashes = [depend.fn_hash_raw for depend in self.deep_depends()]
85
- return str(fn_hash or ObjectHash(digest_size=16).write_text(self.fn_hash_raw, *deep_hashes))[:32]
83
+ fn_hash = ObjectHash(digest_size=16).write_text(self.fn_hash_raw, *deep_hashes)
84
+ return str(self.checkpointer.fn_hash or fn_hash)[:32]
86
85
 
87
- def reinit(self, recursive=False) -> CheckpointFn[Fn]:
86
+ def reinit(self, recursive=False) -> CachedFunction[Fn]:
88
87
  depends = list(self.deep_depends()) if recursive else [self]
89
88
  for depend in depends:
90
- with suppress(AttributeError):
91
- del depend.ident_tuple, depend.fn_hash
89
+ self.__dict__.pop("fn_hash", None)
90
+ self.__dict__.pop("ident_tuple", None)
92
91
  for depend in depends:
93
92
  depend.fn_hash
94
93
  return self
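The hunk above replaces `del` under `suppress(AttributeError)` with `__dict__.pop(name, None)` for clearing `cached_property` values. A standalone sketch of that mechanism follows; the `Node` class is hypothetical and only illustrates that `functools.cached_property` stores its computed value in the instance `__dict__` under the attribute's name, so popping the key forces recomputation on the next access:

```python
from functools import cached_property

class Node:
    def __init__(self, value: int):
        self.value = value
        self.computations = 0

    @cached_property
    def digest(self) -> int:
        self.computations += 1
        return self.value * 2

node = Node(21)
node.__dict__.pop("digest", None)  # safe no-op: nothing is cached yet
first = node.digest                # computed and stored in node.__dict__
node.value = 50
stale = node.digest                # cached value, not recomputed
node.__dict__.pop("digest", None)  # drop the cached entry
fresh = node.digest                # recomputed from the new value
print(first, stale, fresh, node.computations)
```

Note that the added lines in the hunk pop from `self` inside the loop over `depends`, whereas the removed lines deleted from `depend`; when `recursive=True` this difference may affect which objects get their cached hashes cleared.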
@@ -150,21 +149,21 @@ class CheckpointFn(Generic[Fn]):
150
149
  raise CheckpointError("Could not load checkpoint") from ex
151
150
 
152
151
  def exists(self: Callable[P, R], *args: P.args, **kw: P.kwargs) -> bool: # type: ignore
153
- self = cast(CheckpointFn, self)
152
+ self = cast(CachedFunction, self)
154
153
  return self.storage.exists(self.get_call_id(args, kw))
155
154
 
156
155
  def delete(self: Callable[P, R], *args: P.args, **kw: P.kwargs): # type: ignore
157
- self = cast(CheckpointFn, self)
156
+ self = cast(CachedFunction, self)
158
157
  self.storage.delete(self.get_call_id(args, kw))
159
158
 
160
159
  def __repr__(self) -> str:
161
160
  return f"<CheckpointFn {self.fn.__name__} {self.fn_hash[:6]}>"
162
161
 
163
- def deep_depends(self, visited: set[CheckpointFn] = set()) -> Iterable[CheckpointFn]:
162
+ def deep_depends(self, visited: set[CachedFunction] = set()) -> Iterable[CachedFunction]:
164
163
  if self not in visited:
165
164
  yield self
166
165
  visited = visited or set()
167
166
  visited.add(self)
168
167
  for depend in self.depends:
169
- if isinstance(depend, CheckpointFn):
168
+ if isinstance(depend, CachedFunction):
170
169
  yield from depend.deep_depends(visited)
@@ -8,7 +8,7 @@ from typing import Any, Iterable, Type, TypeGuard
8
8
  from .object_hash import ObjectHash
9
9
  from .utils import AttrDict, distinct, get_cell_contents, iterate_and_upcoming, transpose, unwrap_fn
10
10
 
11
- cwd = Path.cwd()
11
+ cwd = Path.cwd().resolve()
12
12
 
13
13
  def is_class(obj) -> TypeGuard[Type]:
14
14
  # isinstance works too, but needlessly triggers _lazyinit()
@@ -72,23 +72,23 @@ def is_user_fn(candidate_fn) -> TypeGuard[Callable]:
72
72
  return cwd in fn_path.parents and ".venv" not in fn_path.parts
73
73
 
74
74
  def get_depend_fns(fn: Callable, capture: bool, captured_vals_by_fn: dict[Callable, list[Any]] = {}) -> dict[Callable, list[Any]]:
75
- from .checkpoint import CheckpointFn
75
+ from .checkpoint import CachedFunction
76
76
  captured_vals_by_fn = captured_vals_by_fn or {}
77
77
  captured_vals = get_fn_captured_vals(fn)
78
78
  captured_vals_by_fn[fn] = [val for val in captured_vals if not callable(val)] * capture
79
- child_fns = (unwrap_fn(val, checkpoint_fn=True) for val in captured_vals if callable(val))
79
+ child_fns = (unwrap_fn(val, cached_fn=True) for val in captured_vals if callable(val))
80
80
  for child_fn in child_fns:
81
- if isinstance(child_fn, CheckpointFn):
81
+ if isinstance(child_fn, CachedFunction):
82
82
  captured_vals_by_fn[child_fn] = []
83
83
  elif child_fn not in captured_vals_by_fn and is_user_fn(child_fn):
84
84
  get_depend_fns(child_fn, capture, captured_vals_by_fn)
85
85
  return captured_vals_by_fn
86
86
 
87
87
  def get_fn_ident(fn: Callable, capture: bool) -> tuple[str, list[Callable]]:
88
- from .checkpoint import CheckpointFn
88
+ from .checkpoint import CachedFunction
89
89
  captured_vals_by_fn = get_depend_fns(fn, capture)
90
90
  depends, depend_captured_vals = transpose(captured_vals_by_fn.items(), 2)
91
91
  depends = distinct(fn.__func__ if isinstance(fn, MethodType) else fn for fn in depends)
92
- unwrapped_depends = [fn for fn in depends if not isinstance(fn, CheckpointFn)]
92
+ unwrapped_depends = [fn for fn in depends if not isinstance(fn, CachedFunction)]
93
93
  fn_hash = str(ObjectHash(fn, unwrapped_depends).update(depend_captured_vals, tolerate_errors=True))
94
94
  return fn_hash, depends
@@ -7,7 +7,7 @@ item_map: dict[Path, dict[str, tuple[datetime, Any]]] = {}
7
7
 
8
8
  class MemoryStorage(Storage):
9
9
  def get_dict(self):
10
- return item_map.setdefault(self.dir(), {})
10
+ return item_map.setdefault(self.fn_dir(), {})
11
11
 
12
12
  def store(self, call_id, data):
13
13
  self.get_dict()[call_id] = (datetime.now(), data)
@@ -25,7 +25,7 @@ class MemoryStorage(Storage):
25
25
  self.get_dict().pop(call_id, None)
26
26
 
27
27
  def cleanup(self, invalidated=True, expired=True):
28
- curr_key = self.dir()
28
+ curr_key = self.fn_dir()
29
29
  for key, calldict in list(item_map.items()):
30
30
  if key.parent == curr_key.parent:
31
31
  if invalidated and key != curr_key:
@@ -5,7 +5,7 @@ from .storage import Storage
5
5
 
6
6
  class PickleStorage(Storage):
7
7
  def get_path(self, call_id: str):
8
- return self.dir() / f"{call_id}.pkl"
8
+ return self.fn_dir() / f"{call_id}.pkl"
9
9
 
10
10
  def store(self, call_id, data):
11
11
  path = self.get_path(call_id)
@@ -28,17 +28,17 @@ class PickleStorage(Storage):
28
28
  self.get_path(call_id).unlink(missing_ok=True)
29
29
 
30
30
  def cleanup(self, invalidated=True, expired=True):
31
- version_path = self.dir()
31
+ version_path = self.fn_dir()
32
32
  fn_path = version_path.parent
33
33
  if invalidated:
34
34
  old_dirs = [path for path in fn_path.iterdir() if path.is_dir() and path != version_path]
35
35
  for path in old_dirs:
36
36
  shutil.rmtree(path)
37
- print(f"Removed {len(old_dirs)} invalidated directories for {self.checkpoint_fn.__qualname__}")
37
+ print(f"Removed {len(old_dirs)} invalidated directories for {self.cached_fn.__qualname__}")
38
38
  if expired and self.checkpointer.should_expire:
39
39
  count = 0
40
- for pkl_path in fn_path.rglob("*.pkl"):
40
+ for pkl_path in fn_path.glob("**/*.pkl"):
41
41
  if self.checkpointer.should_expire(self.checkpoint_date(pkl_path.stem)):
42
42
  count += 1
43
43
  self.delete(pkl_path.stem)
44
- print(f"Removed {count} expired checkpoints for {self.checkpoint_fn.__qualname__}")
44
+ print(f"Removed {count} expired checkpoints for {self.cached_fn.__qualname__}")
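The hunk above swaps `fn_path.rglob("*.pkl")` for `fn_path.glob("**/*.pkl")`. The `pathlib` documentation describes `rglob(pattern)` as equivalent to `glob` with `"**/"` prepended to the pattern, so the two forms should select the same files. A small sketch that checks this on a temporary directory tree; the directory and file names are made up for illustration:

```python
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
(root / "hash_a").mkdir()
(root / "hash_b" / "nested").mkdir(parents=True)
for rel in ("top.pkl", "hash_a/one.pkl", "hash_b/nested/two.pkl", "hash_a/notes.txt"):
    (root / rel).touch()

# Collect matches relative to the root so the two result sets can be compared
via_rglob = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.pkl"))
via_glob = sorted(p.relative_to(root).as_posix() for p in root.glob("**/*.pkl"))
print(via_rglob == via_glob)
print(via_glob)
```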
@@ -4,18 +4,21 @@ from pathlib import Path
4
4
  from datetime import datetime
5
5
 
6
6
  if TYPE_CHECKING:
7
- from ..checkpoint import Checkpointer, CheckpointFn
7
+ from ..checkpoint import Checkpointer, CachedFunction
8
8
 
9
9
  class Storage:
10
10
  checkpointer: Checkpointer
11
- checkpoint_fn: CheckpointFn
11
+ cached_fn: CachedFunction
12
12
 
13
- def __init__(self, checkpoint_fn: CheckpointFn):
14
- self.checkpointer = checkpoint_fn.checkpointer
15
- self.checkpoint_fn = checkpoint_fn
13
+ def __init__(self, cached_fn: CachedFunction):
14
+ self.checkpointer = cached_fn.checkpointer
15
+ self.cached_fn = cached_fn
16
16
 
17
- def dir(self) -> Path:
18
- return self.checkpointer.root_path / self.checkpoint_fn.fn_dir / self.checkpoint_fn.fn_hash
17
+ def fn_id(self) -> str:
18
+ return f"{self.cached_fn.fn_dir}/{self.cached_fn.fn_hash}"
19
+
20
+ def fn_dir(self) -> Path:
21
+ return self.checkpointer.root_path / self.fn_id()
19
22
 
20
23
  def store(self, call_id: str, data: Any) -> None: ...
21
24
 
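Review note (illustrative, not part of the package): the hunk above splits the old `Storage.dir()` into `fn_id()`, which returns a string, and `fn_dir()`, which returns a path under `root_path`. A minimal sketch of that composition follows, assuming simple stand-in classes. `FakeCheckpointer`, `FakeCachedFunction`, `SketchStorage`, and the sample values are hypothetical and only mirror the attribute names visible in the diff.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class FakeCheckpointer:
    # Stand-in for Checkpointer; only root_path is modeled
    root_path: Path

@dataclass
class FakeCachedFunction:
    # Stand-in for CachedFunction; only the attributes read by Storage are modeled
    checkpointer: FakeCheckpointer
    fn_dir: str
    fn_hash: str

class SketchStorage:
    def __init__(self, cached_fn: FakeCachedFunction):
        self.checkpointer = cached_fn.checkpointer
        self.cached_fn = cached_fn

    def fn_id(self) -> str:
        # String identifier: function directory plus function hash
        return f"{self.cached_fn.fn_dir}/{self.cached_fn.fn_hash}"

    def fn_dir(self) -> Path:
        # Joining a string that contains "/" onto a Path splits it into segments
        return self.checkpointer.root_path / self.fn_id()

cached = FakeCachedFunction(FakeCheckpointer(Path("/tmp/checkpoints")), "pkg/module/fn", "abc123")
storage = SketchStorage(cached)
new_dir = storage.fn_dir()
# The 2.9.0 expression, written out for comparison
old_dir = cached.checkpointer.root_path / cached.fn_dir / cached.fn_hash
```

Under these assumptions `new_dir` and `old_dir` compare equal, so the on-disk location computed for pickle checkpoints would stay the same while `fn_id()` becomes available as a plain string.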
@@ -1,10 +1,3 @@
1
- """
2
- TODO: Add tests for:
3
- - Checkpointing with different formats (pickle, memory, etc.)
4
- - Classes and methods - instances and classes
5
- - reinit deep depends
6
- """
7
-
8
1
  import asyncio
9
2
  import pytest
10
3
  from riprint import riprint as print
@@ -32,10 +32,10 @@ def get_cell_contents(fn: Callable) -> Iterable[tuple[str, Any]]:
32
32
  except ValueError:
33
33
  pass
34
34
 
35
- def unwrap_fn(fn: Fn, checkpoint_fn=False) -> Fn:
36
- from .checkpoint import CheckpointFn
35
+ def unwrap_fn(fn: Fn, cached_fn=False) -> Fn:
36
+ from .checkpoint import CachedFunction
37
37
  while True:
38
- if (checkpoint_fn and isinstance(fn, CheckpointFn)) or not hasattr(fn, "__wrapped__"):
38
+ if (cached_fn and isinstance(fn, CachedFunction)) or not hasattr(fn, "__wrapped__"):
39
39
  return cast(Fn, fn)
40
40
  fn = getattr(fn, "__wrapped__")
41
41
 
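Review note (illustrative, not part of the package): `unwrap_fn` follows the `__wrapped__` chain until it reaches a function without that attribute, or, when the flag is set, until it meets a `CachedFunction`. The sketch below reproduces that loop with a hypothetical `Marker` class standing in for `CachedFunction`. The names `unwrap`, `logged`, and `base` are also assumptions made for the example.

```python
from functools import wraps

class Marker:
    """Hypothetical stand-in for CachedFunction: a callable that exposes __wrapped__."""
    def __init__(self, fn):
        self.__wrapped__ = fn

    def __call__(self, *args, **kwargs):
        return self.__wrapped__(*args, **kwargs)

def unwrap(fn, stop_at_marker=False):
    # Same loop shape as unwrap_fn in the diff, with Marker in place of CachedFunction
    while True:
        if (stop_at_marker and isinstance(fn, Marker)) or not hasattr(fn, "__wrapped__"):
            return fn
        fn = getattr(fn, "__wrapped__")

def logged(fn):
    # Ordinary decorator; functools.wraps sets inner.__wrapped__ = fn
    @wraps(fn)
    def inner(*args, **kwargs):
        return fn(*args, **kwargs)
    return inner

def base(x):
    return x + 1

marked = Marker(base)
stacked = logged(marked)

innermost = unwrap(stacked)                          # walks through both wrappers
at_marker = unwrap(stacked, stop_at_marker=True)     # stops at the Marker layer
```

With the flag off the loop returns `base`. With the flag on it returns the `Marker` instance, which is how a caller could recover the caching wrapper from underneath other decorators.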
@@ -1,6 +1,6 @@
1
1
  [project]
2
2
  name = "checkpointer"
3
- version = "2.9.0"
3
+ version = "2.9.2"
4
4
  requires-python = ">=3.11"
5
5
  dependencies = []
6
6
  authors = [
@@ -8,7 +8,7 @@ resolution-markers = [
8
8
 
9
9
  [[package]]
10
10
  name = "checkpointer"
11
- version = "2.9.0"
11
+ version = "2.9.2"
12
12
  source = { editable = "." }
13
13
 
14
14
  [package.dev-dependencies]
@@ -1,262 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: checkpointer
3
- Version: 2.9.0
4
- Summary: A Python library for memoizing function results with support for multiple storage backends, async runtimes, and automatic cache invalidation
5
- Project-URL: Repository, https://github.com/Reddan/checkpointer.git
6
- Author: Hampus Hallman
7
- License: Copyright 2018-2025 Hampus Hallman
8
-
9
- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
10
-
11
- The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
12
-
13
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
14
- License-File: LICENSE
15
- Classifier: Programming Language :: Python :: 3.11
16
- Classifier: Programming Language :: Python :: 3.12
17
- Classifier: Programming Language :: Python :: 3.13
18
- Requires-Python: >=3.11
19
- Description-Content-Type: text/markdown
20
-
21
- # checkpointer &middot; [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![pypi](https://img.shields.io/pypi/pyversions/checkpointer)](https://pypi.org/project/checkpointer/)
22
-
23
- `checkpointer` is a Python library for memoizing function results. It provides a decorator-based API with support for multiple storage backends. Use it for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations.
24
-
25
- Adding or removing `@checkpoint` doesn't change how your code works. You can apply it to any function, including ones you've already written, without altering their behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.
26
-
27
- ### Key Features:
28
- - 🗂️ **Multiple Storage Backends**: Built-in support for in-memory and pickle-based storage, or create your own.
29
- - 🎯 **Simple Decorator API**: Apply `@checkpoint` to functions without boilerplate.
30
- - 🔄 **Async and Sync Compatibility**: Works with synchronous functions and any Python async runtime (e.g., `asyncio`, `Trio`, `Curio`).
31
- - ⏲️ **Custom Expiration Logic**: Automatically invalidate old checkpoints.
32
- - 📂 **Flexible Path Configuration**: Control where checkpoints are stored.
33
- - 📦 **Captured Variables Handling**: Optionally include captured variables in cache invalidation.
34
- - ⚡ **Custom Argument Hashing**: Override argument hashing for speed or specialized hashing logic.
35
-
36
- ---
37
-
38
- ## Installation
39
-
40
- ```bash
41
- pip install checkpointer
42
- ```
43
-
44
- ---
45
-
46
- ## Quick Start 🚀
47
-
48
- ```python
49
- from checkpointer import checkpoint
50
-
51
- @checkpoint
52
- def expensive_function(x: int) -> int:
53
- print("Computing...")
54
- return x ** 2
55
-
56
- result = expensive_function(4) # Computes and stores the result
57
- result = expensive_function(4) # Loads from the cache
58
- ```
59
-
60
- ---
61
-
62
- ## How It Works
63
-
64
- When you use `@checkpoint`, the function's **arguments** (`args`, `kwargs`) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, `checkpointer` loads the cached result instead of recomputing.
65
-
66
- Additionally, `checkpointer` ensures that caches are invalidated when a function's implementation or any of its dependencies change. Each function is assigned a hash based on:
67
-
68
- 1. **Function Code**: The hash updates when the function’s own source code changes.
69
- 2. **Dependencies**: If the function calls other user-defined functions, changes in those dependencies also update the hash.
70
- 3. **External Variables** *(with `capture=True`)*: Any global or closure-based variables used by the function are included in its hash, so changes to those variables also trigger cache invalidation.
71
-
72
- ### Example: Cache Invalidation
73
-
74
- ```python
75
- def multiply(a, b):
76
- return a * b
77
-
78
- @checkpoint
79
- def helper(x):
80
- return multiply(x + 1, 2)
81
-
82
- @checkpoint
83
- def compute(a, b):
84
- return helper(a) + helper(b)
85
- ```
86
-
87
- If you modify `multiply`, caches for both `helper` and `compute` are invalidated and recomputed.
88
-
89
- ---
90
-
91
- ## Parameterization
92
-
93
- ### Custom Configuration
94
-
95
- Set up a `Checkpointer` instance with custom settings, and extend it by calling itself with overrides:
96
-
97
- ```python
98
- from checkpointer import checkpoint
99
-
100
- IS_DEVELOPMENT = True # Toggle based on your environment
101
-
102
- tmp_checkpoint = checkpoint(root_path="/tmp/checkpoints")
103
- dev_checkpoint = tmp_checkpoint(when=IS_DEVELOPMENT) # Adds development-specific behavior
104
- ```
105
-
106
- ### Per-Function Customization & Layered Caching
107
-
108
- Layer caches by stacking checkpoints:
109
-
110
- ```python
111
- @checkpoint(format="memory") # Always use memory storage
112
- @dev_checkpoint # Adds caching during development
113
- def some_expensive_function():
114
- print("Performing a time-consuming operation...")
115
- return sum(i * i for i in range(10**8))
116
- ```
117
-
118
- - **In development**: Both `dev_checkpoint` and `memory` caches are active.
119
- - **In production**: Only the `memory` cache is active.
120
-
121
- ---
122
-
123
- ## Usage
124
-
125
- ### Basic Invocation and Caching
126
-
127
- Call the decorated function as usual. On the first call, the result is computed and stored in the cache. Subsequent calls with the same arguments load the result from the cache:
128
-
129
- ```python
130
- result = expensive_function(4) # Computes and stores the result
131
- result = expensive_function(4) # Loads the result from the cache
132
- ```
133
-
134
- ### Force Recalculation
135
-
136
- Force a recalculation and overwrite the stored checkpoint:
137
-
138
- ```python
139
- result = expensive_function.rerun(4)
140
- ```
141
-
142
- ### Call the Original Function
143
-
144
- Use `fn` to directly call the original, undecorated function:
145
-
146
- ```python
147
- result = expensive_function.fn(4)
148
- ```
149
-
150
- This is especially useful **inside recursive functions** to avoid redundant caching of intermediate steps while still caching the final result.
151
-
152
- ### Retrieve Stored Checkpoints
153
-
154
- Access cached results without recalculating:
155
-
156
- ```python
157
- stored_result = expensive_function.get(4)
158
- ```
159
-
160
- ### Refresh Function Hash
161
-
162
- If `capture=True`, you might need to re-hash a function during the same Python session. For that, call `reinit`:
163
-
164
- ```python
165
- expensive_function.reinit()
166
- ```
167
-
168
- This tells `checkpointer` to recalculate the function hash, reflecting changes in captured variables.
169
-
170
- ---
171
-
172
- ## Storage Backends
173
-
174
- `checkpointer` works with built-in and custom storage backends, so you can use what's provided or roll your own as needed.
175
-
176
- ### Built-In Backends
177
-
178
- 1. **PickleStorage**: Stores checkpoints on disk using Python's `pickle`.
179
- 2. **MemoryStorage**: Keeps checkpoints in memory for non-persistent, fast caching.
180
-
181
- You can specify a storage backend using either its name (`"pickle"` or `"memory"`) or its corresponding class (`PickleStorage` or `MemoryStorage`) in the `format` parameter:
182
-
183
- ```python
184
- from checkpointer import checkpoint, PickleStorage, MemoryStorage
185
-
186
- @checkpoint(format="pickle") # Short for format=PickleStorage
187
- def disk_cached(x: int) -> int:
188
- return x ** 2
189
-
190
- @checkpoint(format="memory") # Short for format=MemoryStorage
191
- def memory_cached(x: int) -> int:
192
- return x * 10
193
- ```
194
-
195
- ### Custom Storage Backends
196
-
197
- Create a custom storage backend by inheriting from the `Storage` class and implementing its methods. Access configuration options through the `self.checkpointer` attribute, an instance of `Checkpointer`.
198
-
199
- #### Example: Custom Storage Backend
200
-
201
- ```python
202
- from checkpointer import checkpoint, Storage
203
- from datetime import datetime
204
-
205
- class CustomStorage(Storage):
206
- def exists(self, call_id) -> bool: ... # Check if a checkpoint exists
207
- def checkpoint_date(self, call_id) -> datetime: ... # Get the checkpoint's timestamp
208
- def store(self, call_id, data): ... # Save data to the checkpoint
209
- def load(self, call_id): ... # Load data from the checkpoint
210
- def delete(self, call_id): ... # Delete the checkpoint
211
-
212
- @checkpoint(format=CustomStorage)
213
- def custom_cached(x: int):
214
- return x ** 2
215
- ```
216
-
217
- Use a custom backend to integrate with databases, cloud storage, or specialized file formats.
218
-
219
- ---
220
-
221
- ## Configuration Options ⚙️
222
-
223
- | Option | Type | Default | Description |
224
- |-----------------|-------------------------------------|----------------------|-----------------------------------------------------------|
225
- | `capture` | `bool` | `False` | Include captured variables in function hashes. |
226
- | `format` | `"pickle"`, `"memory"`, `Storage` | `"pickle"` | Storage backend format. |
227
- | `root_path` | `Path`, `str`, or `None` | ~/.cache/checkpoints | Root directory for storing checkpoints. |
228
- | `when` | `bool` | `True` | Enable or disable checkpointing. |
229
- | `verbosity` | `0`, `1` or `2` | `1` | Logging verbosity. |
230
- | `should_expire` | `Callable[[datetime], bool]` | `None` | Custom expiration logic. |
231
- | `hash_by` | `Callable[..., Any]` | `None` | Custom function that transforms arguments before hashing. |
232
-
233
- ---
234
-
235
- ## Full Example 🛠️
236
-
237
- ```python
238
- import asyncio
239
- from checkpointer import checkpoint
240
-
241
- @checkpoint
242
- def compute_square(n: int) -> int:
243
- print(f"Computing {n}^2...")
244
- return n ** 2
245
-
246
- @checkpoint(format="memory")
247
- async def async_compute_sum(a: int, b: int) -> int:
248
- await asyncio.sleep(1)
249
- return a + b
250
-
251
- async def main():
252
- result1 = compute_square(5)
253
- print(result1) # Outputs 25
254
-
255
- result2 = await async_compute_sum(3, 7)
256
- print(result2) # Outputs 10
257
-
258
- result3 = async_compute_sum.get(3, 7)
259
- print(result3) # Outputs 10
260
-
261
- asyncio.run(main())
262
- ```
@@ -1,242 +0,0 @@
1
- # checkpointer &middot; [![License](https://img.shields.io/badge/license-MIT-blue)](https://github.com/Reddan/checkpointer/blob/master/LICENSE) [![pypi](https://img.shields.io/pypi/v/checkpointer)](https://pypi.org/project/checkpointer/) [![pypi](https://img.shields.io/pypi/pyversions/checkpointer)](https://pypi.org/project/checkpointer/)
2
-
3
- `checkpointer` is a Python library for memoizing function results. It provides a decorator-based API with support for multiple storage backends. Use it for computationally expensive operations where caching can save time, or during development to avoid waiting for redundant computations.
4
-
5
- Adding or removing `@checkpoint` doesn't change how your code works. You can apply it to any function, including ones you've already written, without altering their behavior or introducing side effects. The original function remains unchanged and can still be called directly when needed.
6
-
7
- ### Key Features:
8
- - 🗂️ **Multiple Storage Backends**: Built-in support for in-memory and pickle-based storage, or create your own.
9
- - 🎯 **Simple Decorator API**: Apply `@checkpoint` to functions without boilerplate.
10
- - 🔄 **Async and Sync Compatibility**: Works with synchronous functions and any Python async runtime (e.g., `asyncio`, `Trio`, `Curio`).
11
- - ⏲️ **Custom Expiration Logic**: Automatically invalidate old checkpoints.
12
- - 📂 **Flexible Path Configuration**: Control where checkpoints are stored.
13
- - 📦 **Captured Variables Handling**: Optionally include captured variables in cache invalidation.
14
- - ⚡ **Custom Argument Hashing**: Override argument hashing for speed or specialized hashing logic.
15
-
16
- ---
17
-
18
- ## Installation
19
-
20
- ```bash
21
- pip install checkpointer
22
- ```
23
-
24
- ---
25
-
26
- ## Quick Start 🚀
27
-
28
- ```python
29
- from checkpointer import checkpoint
30
-
31
- @checkpoint
32
- def expensive_function(x: int) -> int:
33
- print("Computing...")
34
- return x ** 2
35
-
36
- result = expensive_function(4) # Computes and stores the result
37
- result = expensive_function(4) # Loads from the cache
38
- ```
39
-
40
- ---
41
-
42
- ## How It Works
43
-
44
- When you use `@checkpoint`, the function's **arguments** (`args`, `kwargs`) are hashed to create a unique identifier for each call. This identifier is used to store and retrieve cached results. If the same arguments are passed again, `checkpointer` loads the cached result instead of recomputing.
45
-
46
- Additionally, `checkpointer` ensures that caches are invalidated when a function's implementation or any of its dependencies change. Each function is assigned a hash based on:
47
-
48
- 1. **Function Code**: The hash updates when the function’s own source code changes.
49
- 2. **Dependencies**: If the function calls other user-defined functions, changes in those dependencies also update the hash.
50
- 3. **External Variables** *(with `capture=True`)*: Any global or closure-based variables used by the function are included in its hash, so changes to those variables also trigger cache invalidation.
51
-
52
- ### Example: Cache Invalidation
53
-
54
- ```python
55
- def multiply(a, b):
56
- return a * b
57
-
58
- @checkpoint
59
- def helper(x):
60
- return multiply(x + 1, 2)
61
-
62
- @checkpoint
63
- def compute(a, b):
64
- return helper(a) + helper(b)
65
- ```
66
-
67
- If you modify `multiply`, caches for both `helper` and `compute` are invalidated and recomputed.
68
-
69
- ---
70
-
71
- ## Parameterization
72
-
73
- ### Custom Configuration
74
-
75
- Set up a `Checkpointer` instance with custom settings, and extend it by calling itself with overrides:
76
-
77
- ```python
78
- from checkpointer import checkpoint
79
-
80
- IS_DEVELOPMENT = True # Toggle based on your environment
81
-
82
- tmp_checkpoint = checkpoint(root_path="/tmp/checkpoints")
83
- dev_checkpoint = tmp_checkpoint(when=IS_DEVELOPMENT) # Adds development-specific behavior
84
- ```
85
-
86
- ### Per-Function Customization & Layered Caching
87
-
88
- Layer caches by stacking checkpoints:
89
-
90
- ```python
91
- @checkpoint(format="memory") # Always use memory storage
92
- @dev_checkpoint # Adds caching during development
93
- def some_expensive_function():
94
- print("Performing a time-consuming operation...")
95
- return sum(i * i for i in range(10**8))
96
- ```
97
-
98
- - **In development**: Both `dev_checkpoint` and `memory` caches are active.
99
- - **In production**: Only the `memory` cache is active.
100
-
101
- ---
102
-
103
- ## Usage
104
-
105
- ### Basic Invocation and Caching
106
-
107
- Call the decorated function as usual. On the first call, the result is computed and stored in the cache. Subsequent calls with the same arguments load the result from the cache:
108
-
109
- ```python
110
- result = expensive_function(4) # Computes and stores the result
111
- result = expensive_function(4) # Loads the result from the cache
112
- ```
113
-
114
- ### Force Recalculation
115
-
116
- Force a recalculation and overwrite the stored checkpoint:
117
-
118
- ```python
119
- result = expensive_function.rerun(4)
120
- ```
121
-
122
- ### Call the Original Function
123
-
124
- Use `fn` to directly call the original, undecorated function:
125
-
126
- ```python
127
- result = expensive_function.fn(4)
128
- ```
129
-
130
- This is especially useful **inside recursive functions** to avoid redundant caching of intermediate steps while still caching the final result.
131
-
132
- ### Retrieve Stored Checkpoints
133
-
134
- Access cached results without recalculating:
135
-
136
- ```python
137
- stored_result = expensive_function.get(4)
138
- ```
139
-
140
- ### Refresh Function Hash
141
-
142
- If `capture=True`, you might need to re-hash a function during the same Python session. For that, call `reinit`:
143
-
144
- ```python
145
- expensive_function.reinit()
146
- ```
147
-
148
- This tells `checkpointer` to recalculate the function hash, reflecting changes in captured variables.
149
-
150
- ---
151
-
152
- ## Storage Backends
153
-
154
- `checkpointer` works with built-in and custom storage backends, so you can use what's provided or roll your own as needed.
155
-
156
- ### Built-In Backends
157
-
158
- 1. **PickleStorage**: Stores checkpoints on disk using Python's `pickle`.
159
- 2. **MemoryStorage**: Keeps checkpoints in memory for non-persistent, fast caching.
160
-
161
- You can specify a storage backend using either its name (`"pickle"` or `"memory"`) or its corresponding class (`PickleStorage` or `MemoryStorage`) in the `format` parameter:
162
-
163
- ```python
164
- from checkpointer import checkpoint, PickleStorage, MemoryStorage
165
-
166
- @checkpoint(format="pickle") # Short for format=PickleStorage
167
- def disk_cached(x: int) -> int:
168
- return x ** 2
169
-
170
- @checkpoint(format="memory") # Short for format=MemoryStorage
171
- def memory_cached(x: int) -> int:
172
- return x * 10
173
- ```
174
-
175
- ### Custom Storage Backends
176
-
177
- Create a custom storage backend by inheriting from the `Storage` class and implementing its methods. Access configuration options through the `self.checkpointer` attribute, an instance of `Checkpointer`.
178
-
179
- #### Example: Custom Storage Backend
180
-
181
- ```python
182
- from checkpointer import checkpoint, Storage
183
- from datetime import datetime
184
-
185
- class CustomStorage(Storage):
186
- def exists(self, call_id) -> bool: ... # Check if a checkpoint exists
187
- def checkpoint_date(self, call_id) -> datetime: ... # Get the checkpoint's timestamp
188
- def store(self, call_id, data): ... # Save data to the checkpoint
189
- def load(self, call_id): ... # Load data from the checkpoint
190
- def delete(self, call_id): ... # Delete the checkpoint
191
-
192
- @checkpoint(format=CustomStorage)
193
- def custom_cached(x: int):
194
- return x ** 2
195
- ```
196
-
197
- Use a custom backend to integrate with databases, cloud storage, or specialized file formats.
198
-
199
- ---
200
-
201
- ## Configuration Options ⚙️
202
-
203
- | Option | Type | Default | Description |
204
- |-----------------|-------------------------------------|----------------------|-----------------------------------------------------------|
205
- | `capture` | `bool` | `False` | Include captured variables in function hashes. |
206
- | `format` | `"pickle"`, `"memory"`, `Storage` | `"pickle"` | Storage backend format. |
207
- | `root_path` | `Path`, `str`, or `None` | ~/.cache/checkpoints | Root directory for storing checkpoints. |
208
- | `when` | `bool` | `True` | Enable or disable checkpointing. |
209
- | `verbosity` | `0`, `1` or `2` | `1` | Logging verbosity. |
210
- | `should_expire` | `Callable[[datetime], bool]` | `None` | Custom expiration logic. |
211
- | `hash_by` | `Callable[..., Any]` | `None` | Custom function that transforms arguments before hashing. |
212
-
213
- ---
214
-
215
- ## Full Example 🛠️
216
-
217
- ```python
218
- import asyncio
219
- from checkpointer import checkpoint
220
-
221
- @checkpoint
222
- def compute_square(n: int) -> int:
223
- print(f"Computing {n}^2...")
224
- return n ** 2
225
-
226
- @checkpoint(format="memory")
227
- async def async_compute_sum(a: int, b: int) -> int:
228
- await asyncio.sleep(1)
229
- return a + b
230
-
231
- async def main():
232
- result1 = compute_square(5)
233
- print(result1) # Outputs 25
234
-
235
- result2 = await async_compute_sum(3, 7)
236
- print(result2) # Outputs 10
237
-
238
- result3 = async_compute_sum.get(3, 7)
239
- print(result3) # Outputs 10
240
-
241
- asyncio.run(main())
242
- ```
File without changes
File without changes