langgraph-unity-catalog-checkpoint 0.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (26)
  1. langgraph_unity_catalog_checkpoint-0.0.1/LICENSE +21 -0
  2. langgraph_unity_catalog_checkpoint-0.0.1/PKG-INFO +670 -0
  3. langgraph_unity_catalog_checkpoint-0.0.1/README.md +625 -0
  4. langgraph_unity_catalog_checkpoint-0.0.1/pyproject.toml +138 -0
  5. langgraph_unity_catalog_checkpoint-0.0.1/setup.cfg +4 -0
  6. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/__init__.py +39 -0
  7. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/checkpoint/__init__.py +30 -0
  8. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/checkpoint/aio.py +949 -0
  9. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/checkpoint/base.py +351 -0
  10. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/checkpoint/shallow.py +644 -0
  11. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/checkpoint/unity_catalog.py +273 -0
  12. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/logging_config.py +50 -0
  13. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/store/__init__.py +20 -0
  14. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/store/aio.py +505 -0
  15. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/store/base.py +280 -0
  16. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint/store/unity_catalog.py +552 -0
  17. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint.egg-info/PKG-INFO +670 -0
  18. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint.egg-info/SOURCES.txt +24 -0
  19. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint.egg-info/dependency_links.txt +1 -0
  20. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint.egg-info/requires.txt +18 -0
  21. langgraph_unity_catalog_checkpoint-0.0.1/src/langgraph_unity_catalog_checkpoint.egg-info/top_level.txt +1 -0
  22. langgraph_unity_catalog_checkpoint-0.0.1/tests/test_async_unity_catalog_checkpointer.py +434 -0
  23. langgraph_unity_catalog_checkpoint-0.0.1/tests/test_basestore_interface.py +193 -0
  24. langgraph_unity_catalog_checkpoint-0.0.1/tests/test_integration.py +317 -0
  25. langgraph_unity_catalog_checkpoint-0.0.1/tests/test_unity_catalog_checkpointer.py +557 -0
  26. langgraph_unity_catalog_checkpoint-0.0.1/tests/test_unity_catalog_store.py +110 -0
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Nate Fleming
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,670 @@
1
+ Metadata-Version: 2.4
2
+ Name: langgraph-unity-catalog-checkpoint
3
+ Version: 0.0.1
4
+ Summary: Unity Catalog-backed persistence for LangChain and LangGraph
5
+ Author-email: Nate Fleming <nate.fleming@databricks.com>
6
+ Maintainer-email: Nate Fleming <nate.fleming@databricks.com>
7
+ License-Expression: MIT
8
+ Project-URL: Homepage, https://github.com/natefleming/langgraph_unity_catalog_checkpoint
9
+ Project-URL: Repository, https://github.com/natefleming/langgraph_unity_catalog_checkpoint
10
+ Project-URL: Issues, https://github.com/natefleming/langgraph_unity_catalog_checkpoint/issues
11
+ Project-URL: Documentation, https://github.com/natefleming/langgraph_unity_catalog_checkpoint#readme
12
+ Keywords: langchain,langgraph,databricks,unity-catalog,checkpoint,persistence,store,memory,agents,llm,ai
13
+ Classifier: Development Status :: 3 - Alpha
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: Intended Audience :: Science/Research
16
+ Classifier: Operating System :: OS Independent
17
+ Classifier: Programming Language :: Python :: 3
18
+ Classifier: Programming Language :: Python :: 3.12
19
+ Classifier: Programming Language :: Python :: 3.13
20
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
21
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
22
+ Classifier: Topic :: Database
23
+ Classifier: Typing :: Typed
24
+ Requires-Python: >=3.12
25
+ Description-Content-Type: text/markdown
26
+ License-File: LICENSE
27
+ Requires-Dist: databricks-connect>=17.0.10
28
+ Requires-Dist: databricks-langchain>=0.9.0
29
+ Requires-Dist: databricks-sdk>=0.73.0
30
+ Requires-Dist: langchain>=1.0.3
31
+ Requires-Dist: langgraph>=1.0.2
32
+ Requires-Dist: langmem>=0.0.30
33
+ Requires-Dist: loguru>=0.7.3
34
+ Requires-Dist: mlflow==3.5.1
35
+ Requires-Dist: nest-asyncio>=1.6.0
36
+ Requires-Dist: python-dotenv>=1.2.1
37
+ Provides-Extra: dev
38
+ Requires-Dist: pytest>=8.4.2; extra == "dev"
39
+ Requires-Dist: pytest-asyncio>=0.23.0; extra == "dev"
40
+ Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
41
+ Requires-Dist: ruff>=0.6.0; extra == "dev"
42
+ Requires-Dist: twine>=5.0.0; extra == "dev"
43
+ Requires-Dist: python-dotenv>=1.0.0; extra == "dev"
44
+ Dynamic: license-file
45
+
46
+ # LangGraph Unity Catalog Checkpoint
47
+
48
+ [![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
49
+ [![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)
50
+
51
+ **Production-ready Unity Catalog persistence for LangChain and LangGraph applications using Databricks as the storage backend.**
52
+
53
+ Following the [LangGraph checkpoint-postgres pattern](https://github.com/langchain-ai/langgraph/tree/main/libs/checkpoint-postgres/langgraph) for consistency with the LangGraph ecosystem.
54
+
55
+ ---
56
+
57
+ ## 🚀 Overview
58
+
59
+ This package provides enterprise-grade implementations of LangGraph's persistence interfaces backed by Databricks Unity Catalog:
60
+
61
+ - **`UnityCatalogStore`** / **`AsyncUnityCatalogStore`**: Implements [`langgraph.store.base.BaseStore`](https://github.com/langchain-ai/langgraph/blob/main/libs/checkpoint/langgraph/store/base/__init__.py) for key-value storage
62
+ - **`UnityCatalogCheckpointSaver`** / **`AsyncUnityCatalogCheckpointSaver`**: Implements [`BaseCheckpointSaver`](https://github.com/langchain-ai/langgraph/blob/main/libs/checkpoint/langgraph/checkpoint/base/__init__.py) for graph state persistence
63
+
64
+ All implementations use Databricks Unity Catalog Delta tables via the WorkspaceClient SQL API, providing:
65
+
66
+ - ✅ **Enterprise-grade reliability** with ACID transactions
67
+ - ✅ **Scalability** with Delta Lake optimization
68
+ - ✅ **Governance** with built-in access control and audit trails
69
+ - ✅ **Time-travel** for debugging and recovery
70
+ - ✅ **Seamless Databricks integration** for production ML workflows
71
+ - ✅ **Performance optimized** with batch operations (2-10x faster)
72
+
73
+ ---
74
+
75
+ ## 📦 Installation
76
+
77
+ ### Prerequisites
78
+
79
+ - Python 3.12+
80
+ - Databricks workspace with Unity Catalog enabled
81
+ - SQL warehouse with appropriate permissions
82
+
83
+ ### Install Dependencies
84
+
85
+ ```bash
86
+ pip install databricks-sdk langchain langgraph langmem databricks-langchain
87
+ ```
88
+
89
+ ### Install Package
90
+
91
+ ```bash
92
+ # From source
93
+ git clone https://github.com/natefleming/langgraph_unity_catalog_checkpoint.git
94
+ cd langgraph_unity_catalog_checkpoint
95
+ pip install -e .
96
+
97
+ # Or with development dependencies
98
+ pip install -e ".[dev]"
99
+ ```
100
+
101
+ ---
102
+
103
+ ## ⚡ Quick Start
104
+
105
+ ### 1. Configure Databricks Authentication
106
+
107
+ Set up environment variables:
108
+
109
+ ```bash
110
+ export DATABRICKS_HOST="https://your-workspace.databricks.com"
111
+ export DATABRICKS_TOKEN="your-access-token"
112
+ export DATABRICKS_WAREHOUSE_ID="your-warehouse-id"
113
+ export UC_CATALOG="your_catalog"
114
+ export UC_SCHEMA="your_schema"
115
+ ```
116
+
117
+ Or use `~/.databrickscfg`:
118
+
119
+ ```ini
120
+ [DEFAULT]
121
+ host = https://your-workspace.databricks.com
122
+ token = your-access-token
123
+ ```
124
+
125
+ ### 2. Using the Store for Key-Value Storage
126
+
127
+ ```python
128
+ from databricks.sdk import WorkspaceClient
129
+ from langgraph_unity_catalog_checkpoint import UnityCatalogStore
130
+
131
+ # Initialize the store
132
+ workspace_client = WorkspaceClient()
133
+ store = UnityCatalogStore(
134
+     workspace_client=workspace_client,
135
+     catalog="main",
136
+     schema="langgraph",
137
+     table="my_store",  # Default: "store"
138
+     warehouse_id="your-warehouse-id",  # Optional
139
+ )
140
+
141
+ # Store values with namespaced keys
142
+ store.put(("users", "123"), "preferences", {"theme": "dark", "language": "en"})
143
+
144
+ # Retrieve values
145
+ prefs = store.get(("users", "123"), "preferences")
146
+ print(prefs.value)  # {"theme": "dark", "language": "en"}
147
+
148
+ # Search within a namespace
149
+ items = store.search(("users",), limit=10)
150
+ for item in items:
151
+     print(f"Key: {item.key}, Namespace: {item.namespace}")
152
+
153
+ # Delete a key
154
+ store.delete(("users", "123"), "preferences")
155
+ ```
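The namespace semantics used in the store example (tuple namespaces, prefix search) can be tried without a Databricks connection. The following is a minimal sketch, not taken from the package source: `InMemoryNamespaceStore` is a hypothetical dictionary-backed stand-in, and unlike a real `BaseStore` it returns raw values and `(namespace, key)` pairs rather than item objects.

```python
from typing import Any, Optional

Namespace = tuple[str, ...]


class InMemoryNamespaceStore:
    """Hypothetical dictionary-backed stand-in for a namespaced key-value store."""

    def __init__(self) -> None:
        # Rows are addressed by (namespace tuple, key).
        self._rows: dict[tuple[Namespace, str], dict[str, Any]] = {}

    def put(self, namespace: Namespace, key: str, value: dict[str, Any]) -> None:
        self._rows[(tuple(namespace), key)] = value

    def get(self, namespace: Namespace, key: str) -> Optional[dict[str, Any]]:
        return self._rows.get((tuple(namespace), key))

    def search(self, prefix: Namespace, limit: int = 10) -> list[tuple[Namespace, str]]:
        # A row matches when its namespace starts with the given prefix.
        prefix = tuple(prefix)
        hits = [(ns, key) for (ns, key) in self._rows if ns[: len(prefix)] == prefix]
        return hits[:limit]

    def delete(self, namespace: Namespace, key: str) -> None:
        self._rows.pop((tuple(namespace), key), None)


demo = InMemoryNamespaceStore()
demo.put(("users", "123"), "preferences", {"theme": "dark"})
demo.put(("users", "456"), "preferences", {"theme": "light"})
demo.put(("teams", "a"), "settings", {"size": 3})

# Prefix search returns both "users" rows and skips the "teams" row.
matches = demo.search(("users",))
```

Searching with the one-element prefix `("users",)` matches every namespace that begins with `"users"`, which is why the README example can list all users' entries with a single call.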
156
+
157
+ ### 3. Using the Checkpointer for Graph Persistence
158
+
159
+ ```python
160
+ from databricks.sdk import WorkspaceClient
161
+ from databricks_langchain import ChatDatabricks
162
+ from langgraph.graph import StateGraph, START, END
163
+ from langgraph.graph.message import add_messages
164
+ from langchain_core.messages import HumanMessage, BaseMessage
165
+ from typing_extensions import TypedDict
166
+ from typing import Annotated
167
+ from langgraph_unity_catalog_checkpoint import UnityCatalogCheckpointSaver
168
+
169
+ # Define your graph state
170
+ class State(TypedDict):
171
+     messages: Annotated[list[BaseMessage], add_messages]
172
+
173
+ # Create a simple chatbot node
174
+ llm = ChatDatabricks(endpoint="databricks-meta-llama-3-3-70b-instruct")
175
+
176
+ def chatbot(state: State) -> dict:
177
+     response = llm.invoke(state["messages"])
178
+     return {"messages": [response]}
179
+
180
+ # Create the checkpointer
181
+ workspace_client = WorkspaceClient()
182
+ checkpointer = UnityCatalogCheckpointSaver(
183
+     workspace_client=workspace_client,
184
+     catalog="main",
185
+     schema="langgraph",
186
+     # Default tables: "checkpoints", "checkpoint_blobs", "checkpoint_writes"
187
+     warehouse_id="your-warehouse-id",  # Optional
188
+ )
189
+
190
+ # Build the graph
191
+ graph_builder = StateGraph(State)
192
+ graph_builder.add_node("chatbot", chatbot)
193
+ graph_builder.add_edge(START, "chatbot")
194
+ graph_builder.add_edge("chatbot", END)
195
+
196
+ # Compile with checkpointer for persistence
197
+ graph = graph_builder.compile(checkpointer=checkpointer)
198
+
199
+ # Run conversation with persistence
200
+ config = {"configurable": {"thread_id": "conversation_1"}}
201
+
202
+ # First interaction
203
+ result = graph.invoke(
204
+     {"messages": [HumanMessage(content="Hello! What's the weather like?")]},
205
+     config=config
206
+ )
207
+
208
+ # Second interaction - conversation history is maintained!
209
+ result = graph.invoke(
210
+     {"messages": [HumanMessage(content="What did I just ask you?")]},
211
+     config=config
212
+ )
213
+ # The bot remembers the previous question! 🎉
214
+ ```
215
+
216
+ ### 4. Async Usage for High Performance
217
+
218
+ ```python
219
+ from langgraph_unity_catalog_checkpoint import AsyncUnityCatalogCheckpointSaver
220
+ import asyncio
221
+
222
+ # Create async checkpointer
223
+ async_checkpointer = AsyncUnityCatalogCheckpointSaver(
224
+     workspace_client=workspace_client,
225
+     catalog="main",
226
+     schema="langgraph",
227
+     warehouse_id="your-warehouse-id",
228
+ )
229
+
230
+ # Async chatbot node
231
+ async def async_chatbot(state: State) -> dict:
232
+     response = await llm.ainvoke(state["messages"])
233
+     return {"messages": [response]}
234
+
235
+ # Build and compile with async checkpointer
236
+ graph_builder = StateGraph(State)
237
+ graph_builder.add_node("chatbot", async_chatbot)
238
+ graph_builder.add_edge(START, "chatbot")
239
+ graph_builder.add_edge("chatbot", END)
240
+ graph = graph_builder.compile(checkpointer=async_checkpointer)
241
+
242
+ # Run asynchronously
243
+ config = {"configurable": {"thread_id": "async_conversation_1"}}
244
+ result = await graph.ainvoke(
245
+     {"messages": [HumanMessage(content="Hello async world!")]},
246
+     config=config
247
+ )
248
+ ```
249
+
250
+ ---
251
+
252
+ ## 🎯 Use Cases
253
+
254
+ ### 1. **Conversational AI with Memory**
255
+
256
+ Maintain conversation history across multiple interactions:
257
+
258
+ ```python
259
+ # Each user gets their own conversation thread
260
+ config = {"configurable": {"thread_id": f"user_{user_id}"}}
261
+ graph.invoke({"messages": [HumanMessage(content=user_input)]}, config)
262
+ ```
263
+
264
+ ### 2. **Human-in-the-Loop Workflows**
265
+
266
+ Pause execution for human review and resume seamlessly:
267
+
268
+ ```python
269
+ # Interrupt before critical nodes
270
+ graph = builder.compile(
271
+     checkpointer=checkpointer,
272
+     interrupt_before=["approval_node"]
273
+ )
274
+
275
+ # Execute and pause at approval
276
+ result = graph.invoke(input_data, config)
277
+
278
+ # Human reviews and approves...
279
+
280
+ # Resume from checkpoint
281
+ result = graph.invoke(None, config) # Continues from where it left off
282
+ ```
283
+
284
+ ### 3. **Long-Term Memory with LangMem**
285
+
286
+ Integrate with [LangMem](https://github.com/langchain-ai/langmem) for user preferences and memories:
287
+
288
+ ```python
289
+ from langchain.agents import create_agent
290
+ from langmem.tools import get_langmem_tools
291
+
292
+ # Create store for LangMem
293
+ store = UnityCatalogStore(
294
+     workspace_client=workspace_client,
295
+     catalog="main",
296
+     schema="langgraph",
297
+ )
298
+
299
+ # Get LangMem tools
300
+ langmem_tools = get_langmem_tools(store=store)
301
+
302
+ # Create agent with memory
303
+ agent = create_agent(llm, tools + langmem_tools)
304
+
305
+ # Use with user context
306
+ config = {
307
+     "configurable": {
308
+         "langgraph_user_id": "user_123"  # Isolates memories per user
309
+     }
310
+ }
311
+ agent.invoke({"messages": [HumanMessage(content="I prefer dark mode")]}, config)
312
+ ```
313
+
314
+ ### 4. **Production ML Pipelines**
315
+
316
+ Reliable state management for complex workflows:
317
+
318
+ ```python
319
+ # Automatic recovery from failures
320
+ # Time-travel debugging with Delta Lake
321
+ # Full audit trail via Unity Catalog
322
+ # Multi-agent coordination with isolated states
323
+ ```
324
+
325
+ ---
326
+
327
+ ## 📊 Performance Optimizations
328
+
329
+ ### Batch Write Operations (2-10x Faster)
330
+
331
+ The implementation uses **batched SQL operations** to minimize round trips to Unity Catalog:
332
+
333
+ ```python
334
+ # Instead of N+1 SQL statements:
335
+ # - 1 per blob
336
+ # - 1 per write
337
+ # - 1 checkpoint
338
+
339
+ # We use just 3 SQL statements:
340
+ # - 1 batch for all blobs
341
+ # - 1 batch for all writes
342
+ # - 1 for checkpoint
343
+
344
+ # For a checkpoint with 5 blobs and 3 writes:
345
+ # Before: 9 SQL statements
346
+ # After: 3 SQL statements
347
+ # Speedup: 3x faster! ⚡
348
+ ```
349
+
350
+ See [docs/CHECKPOINT_BATCH_WRITE_OPTIMIZATION.md](docs/CHECKPOINT_BATCH_WRITE_OPTIMIZATION.md) for details.
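The statement-count arithmetic in the README comment can be reproduced with a short sketch. This is an illustration only, not the package's SQL: the helper names, the `?` placeholders, and the four-column row shape are hypothetical, and only the table names come from the README's defaults.

```python
def row_placeholder(n_cols: int) -> str:
    """Return one parenthesised group of placeholders, e.g. "(?, ?, ?, ?)"."""
    return "(" + ", ".join("?" for _ in range(n_cols)) + ")"


def per_row_statements(n_blobs: int, n_writes: int, n_cols: int = 4) -> list[str]:
    """Unbatched approach: one INSERT per blob, one per write, one for the checkpoint."""
    row = row_placeholder(n_cols)
    stmts = [f"INSERT INTO checkpoint_blobs VALUES {row}" for _ in range(n_blobs)]
    stmts += [f"INSERT INTO checkpoint_writes VALUES {row}" for _ in range(n_writes)]
    stmts.append(f"INSERT INTO checkpoints VALUES {row}")
    return stmts


def batched_statements(n_blobs: int, n_writes: int, n_cols: int = 4) -> list[str]:
    """Batched approach: one multi-row INSERT per table, so at most three statements."""
    row = row_placeholder(n_cols)
    stmts = []
    if n_blobs:
        stmts.append("INSERT INTO checkpoint_blobs VALUES " + ", ".join([row] * n_blobs))
    if n_writes:
        stmts.append("INSERT INTO checkpoint_writes VALUES " + ", ".join([row] * n_writes))
    stmts.append(f"INSERT INTO checkpoints VALUES {row}")
    return stmts


# The README's example: 5 blobs and 3 writes.
before = per_row_statements(5, 3)
after = batched_statements(5, 3)
print(len(before), len(after))
```

The batched count stays at three however many blobs and writes a checkpoint carries, so the reduction in round trips grows with checkpoint size. How that translates into wall-clock speedup depends on warehouse latency, which this sketch does not model.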
351
+
352
+ ---
353
+
354
+ ## 🏗️ Architecture
355
+
356
+ ```
357
+ ┌─────────────────────────────────────┐
358
+ │   LangChain/LangGraph Application   │
359
+ └─────────────────────────────────────┘
360
+                    ↓
361
+ ┌─────────────────────────────────────┐
362
+ │   BaseStore / BaseCheckpointSaver   │
363
+ │       (LangGraph Interfaces)        │
364
+ └─────────────────────────────────────┘
365
+                    ↓
366
+ ┌─────────────────────────────────────┐
367
+ │    Unity Catalog Implementation     │
368
+ │  - UnityCatalogStore                │
369
+ │  - UnityCatalogCheckpointSaver      │
370
+ │  - Async variants                   │
371
+ └─────────────────────────────────────┘
372
+                    ↓
373
+ ┌─────────────────────────────────────┐
374
+ │     Databricks WorkspaceClient      │
375
+ │    (SQL Statement Execution API)    │
376
+ └─────────────────────────────────────┘
377
+                    ↓
378
+ ┌─────────────────────────────────────┐
379
+ │     Unity Catalog Delta Tables      │
380
+ │  - ACID transactions                │
381
+ │  - Time-travel                      │
382
+ │  - Change Data Feed                 │
383
+ └─────────────────────────────────────┘
384
+ ```
385
+
386
+ ### Data Storage
387
+
388
+ - **Serialization**: Checkpoints and values are serialized using LangGraph's `JsonPlusSerializer`
389
+ - **Binary Storage**: BINARY columns for efficient blob storage (base64 encoded)
390
+ - **JSON Metadata**: Structured metadata for filtering and querying
391
+ - **Delta Lake**: ACID transactions, time-travel, and optimization
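The serialize-then-base64 round trip described in the bullets above can be sketched with the standard library. This is an assumption-laden illustration: stdlib `json` stands in for LangGraph's `JsonPlusSerializer`, and `encode_blob` / `decode_blob` are hypothetical helper names, not functions from the package.

```python
import base64
import json
from typing import Any


def encode_blob(value: Any) -> str:
    """Serialize a value to bytes, then base64-encode it for transport as text."""
    raw = json.dumps(value, sort_keys=True).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_blob(text: str) -> Any:
    """Reverse encode_blob: base64-decode, then deserialize."""
    return json.loads(base64.b64decode(text))


state = {"messages": ["hello"], "step": 1}
encoded = encode_blob(state)
restored = decode_blob(encoded)
```

Base64 keeps the payload safe to embed in a SQL statement or JSON request body at the cost of roughly a third more bytes. The real serializer also handles types that plain JSON cannot, which this stand-in does not attempt.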
392
+
393
+ ### Default Table Names
394
+
395
+ | Component | Default Tables |
396
+ |-----------|---------------|
397
+ | **Store** | `store` |
398
+ | **Checkpointer** | `checkpoints`, `checkpoint_blobs`, `checkpoint_writes` |
399
+
400
+ Tables are automatically created on first use with optimized schemas.
401
+
402
+ ---
403
+
404
+ ## 📚 Examples
405
+
406
+ ### Complete Jupyter Notebooks
407
+
408
+ Explore the [`notebooks/`](notebooks/) directory for interactive examples:
409
+
410
+ - **[`store_example.ipynb`](notebooks/store_example.ipynb)** - Store operations and LangMem integration
411
+ - **[`checkpointer_example.ipynb`](notebooks/checkpointer_example.ipynb)** - Synchronous graph checkpointing
412
+ - **[`async_checkpointer_example.ipynb`](notebooks/async_checkpointer_example.ipynb)** - Async graph execution
413
+
414
+ ### Run in Databricks
415
+
416
+ 1. Upload a notebook to your Databricks workspace
417
+ 2. Attach to a cluster with Unity Catalog access
418
+ 3. Set the required configuration (catalog, schema, warehouse_id)
419
+ 4. Run all cells
420
+
421
+ ---
422
+
423
+ ## 🔧 Configuration
424
+
425
+ ### Environment Variables
426
+
427
+ | Variable | Description | Required |
428
+ |----------|-------------|----------|
429
+ | `DATABRICKS_HOST` | Workspace URL | Yes |
430
+ | `DATABRICKS_TOKEN` | Access token | Yes |
431
+ | `DATABRICKS_WAREHOUSE_ID` | SQL warehouse ID | No |
432
+ | `UC_CATALOG` | Default catalog name | Recommended |
433
+ | `UC_SCHEMA` | Default schema name | Recommended |
434
+
435
+ ### Configuration Precedence
436
+
437
+ Configuration values are resolved in this order:
438
+
439
+ 1. **Environment variables** (highest priority)
440
+ 2. **Databricks widgets** (for notebooks)
441
+ 3. **Constructor parameters** (explicit values)
442
+
443
+ See [docs/CONFIGURATION_PRECEDENCE.md](docs/CONFIGURATION_PRECEDENCE.md) for details.
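The resolution order listed above (environment variables, then widgets, then constructor parameters) amounts to a first-non-empty lookup. A minimal sketch follows, assuming plain mappings for the environment and widget values; `resolve_setting` is a hypothetical helper and the widget keys are illustrative, not names defined by the package.

```python
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    env: Mapping[str, str],
    widgets: Mapping[str, str],
    explicit: Optional[str] = None,
) -> Optional[str]:
    """Return the first non-empty value, checking env, then widgets, then explicit."""
    for candidate in (env.get(name), widgets.get(name), explicit):
        if candidate:
            return candidate
    return None


# Stand-ins for os.environ and notebook widget values.
env = {"UC_CATALOG": "prod_catalog"}
widgets = {"UC_CATALOG": "widget_catalog", "UC_SCHEMA": "widget_schema"}

catalog = resolve_setting("UC_CATALOG", env, widgets, explicit="main")
schema = resolve_setting("UC_SCHEMA", env, widgets, explicit="langgraph")
warehouse = resolve_setting("DATABRICKS_WAREHOUSE_ID", env, widgets, explicit="abc123")
```

Under this order an explicit constructor argument is used only when neither the environment nor a widget supplies a value, so a stray environment variable can silently override what the code passes in.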
444
+
445
+ ### Warehouse ID
446
+
447
+ The `warehouse_id` parameter is optional and defaults to `None`. If not provided:
448
+ - Uses the default warehouse for the workspace
449
+ - Can be overridden per-operation if needed
450
+
451
+ ---
452
+
453
+ ## 🔒 Permissions Required
454
+
455
+ Ensure your Databricks principal has:
456
+
457
+ - `USE CATALOG` on the target catalog
458
+ - `USE SCHEMA` on the target schema
459
+ - `CREATE TABLE` on the target schema (for initialization)
460
+ - `SELECT`, `INSERT`, `UPDATE`, `DELETE`, `MODIFY` on the tables
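One way to turn the permission list above into statements an administrator could review is to generate `GRANT` commands. The sketch below assumes Databricks SQL `GRANT ... ON ... TO` syntax and assumes that table-level access in Unity Catalog is expressed with `SELECT` and `MODIFY` (with `MODIFY` covering inserts, updates, and deletes). `grant_statements` and the principal name are hypothetical, so check the output against your workspace's privilege model before running it.

```python
def grant_statements(
    catalog: str, schema: str, tables: list[str], principal: str
) -> list[str]:
    """Build GRANT statements for one principal (assumed Databricks SQL syntax)."""
    fq_schema = f"{catalog}.{schema}"
    who = f"`{principal}`"  # Backticks allow principals such as e-mail addresses.
    stmts = [
        f"GRANT USE CATALOG ON CATALOG {catalog} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {fq_schema} TO {who}",
        f"GRANT CREATE TABLE ON SCHEMA {fq_schema} TO {who}",
    ]
    # One table-level grant per table the store and checkpointer use.
    stmts += [
        f"GRANT SELECT, MODIFY ON TABLE {fq_schema}.{table} TO {who}"
        for table in tables
    ]
    return stmts


# Default table names from the README; the principal is a placeholder.
default_tables = ["store", "checkpoints", "checkpoint_blobs", "checkpoint_writes"]
grants = grant_statements("main", "langgraph", default_tables, "agent-service-principal")
for statement in grants:
    print(statement)
```

Table-level grants can only be issued once the tables exist, so in practice the schema-level grants come first and the table grants follow the first run that creates the tables.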
461
+
462
+ ---
463
+
464
+ ## 🧪 Testing
465
+
466
+ ### Run Unit Tests
467
+
468
+ ```bash
469
+ # Run all tests
470
+ make test
471
+
472
+ # Run specific test file
473
+ uv run pytest tests/test_unity_catalog_store.py -v
474
+
475
+ # Run with coverage
476
+ uv run pytest --cov=src --cov-report=html
477
+ ```
478
+
479
+ ### Run Integration Tests
480
+
481
+ Integration tests require a live Databricks connection:
482
+
483
+ ```bash
484
+ # Set required environment variables
485
+ export DATABRICKS_HOST="..."
486
+ export DATABRICKS_TOKEN="..."
487
+ export DATABRICKS_WAREHOUSE_ID="..."
488
+
489
+ # Run integration tests
490
+ uv run pytest tests/test_integration.py -v
491
+ ```
492
+
493
+ ### Linting and Formatting
494
+
495
+ ```bash
496
+ # Format code
497
+ make format
498
+
499
+ # Run linting
500
+ make lint
501
+
502
+ # Type checking
503
+ make type-check
504
+ ```
505
+
506
+ ---
507
+
508
+ ## 📖 Documentation
509
+
510
+ ### Core Documentation
511
+
512
+ - **[Usage Guide](docs/USAGE.md)** - Comprehensive usage examples
513
+ - **[Implementation Summary](docs/IMPLEMENTATION_SUMMARY.md)** - Technical architecture
514
+ - **[Environment Setup](docs/ENVIRONMENT_SETUP.md)** - Development environment
515
+ - **[Quick Start](QUICKSTART.md)** - Getting started guide
516
+ - **[Install Guide](INSTALL.md)** - Installation instructions
517
+
518
+ ### Technical Details
519
+
520
+ - **[Checkpoint Batch Write Optimization](docs/CHECKPOINT_BATCH_WRITE_OPTIMIZATION.md)** - Performance optimization details
521
+ - **[Configuration Precedence](docs/CONFIGURATION_PRECEDENCE.md)** - Configuration resolution
522
+ - **[Default Table Names](docs/DEFAULT_TABLE_NAMES.md)** - Table naming conventions
523
+ - **[MLflow Autolog Setup](docs/MLFLOW_AUTOLOG_SETUP.md)** - Observability with MLflow
524
+ - **[Logging](docs/LOGGING.md)** - Logging configuration
525
+
526
+ ### Session Summaries
527
+
528
+ - **[Batch Optimization (2025-11-07)](docs/SESSION_SUMMARY_2025-11-07_BATCH_OPTIMIZATION.md)**
529
+ - **[MLflow Tracing Removal (2025-11-07)](docs/SESSION_SUMMARY_2025-11-07_MLFLOW_TRACING_REMOVAL.md)**
530
+
531
+ ---
532
+
533
+ ## 🚀 Features
534
+
535
+ ### UnityCatalogStore
536
+
537
+ - ✅ Implements `langgraph.store.base.BaseStore` interface
538
+ - ✅ Batch operations (`batch`, `abatch`) for performance
539
+ - ✅ Namespaced key-value storage
540
+ - ✅ Search with filtering and pagination
541
+ - ✅ Automatic table initialization
542
+ - ✅ Sync and async implementations
543
+ - ✅ Compatible with LangMem for long-term memory
544
+
545
+ ### UnityCatalogCheckpointSaver
546
+
547
+ - ✅ Implements `BaseCheckpointSaver` interface
548
+ - ✅ Full LangGraph checkpoint persistence
549
+ - ✅ Support for human-in-the-loop workflows
550
+ - ✅ Multi-turn conversation memory
551
+ - ✅ State recovery and time-travel
552
+ - ✅ Pending writes management
553
+ - ✅ Checkpoint listing and filtering
554
+ - ✅ Sync and async implementations
555
+ - ✅ Optimized batch writes (2-10x faster)
556
+ - ✅ Automatic table creation and schema management
557
+
558
+ ---
559
+
560
+ ## 🛠️ Development
561
+
562
+ ### Setup Development Environment
563
+
564
+ ```bash
565
+ # Clone the repository
566
+ git clone https://github.com/natefleming/langgraph_unity_catalog_checkpoint.git
567
+ cd langgraph_unity_catalog_checkpoint
568
+
569
+ # Create virtual environment
570
+ python -m venv .venv
571
+ source .venv/bin/activate # On Windows: .venv\Scripts\activate
572
+
573
+ # Install with development dependencies
574
+ pip install -e ".[dev]"
575
+
576
+ # Install pre-commit hooks
577
+ pre-commit install
578
+ ```
579
+
580
+ ### Project Structure
581
+
582
+ ```
583
+ langgraph_unity_catalog_checkpoint/
584
+ ├── src/
585
+ │   └── langgraph_unity_catalog_checkpoint/
586
+ │       ├── store/                  # Store implementations
587
+ │       │   ├── unity_catalog.py    # Sync store
588
+ │       │   ├── aio.py              # Async store
589
+ │       │   └── base.py             # Base store class
590
+ │       ├── checkpoint/             # Checkpointer implementations
591
+ │       │   ├── unity_catalog.py    # Sync checkpointer
592
+ │       │   ├── aio.py              # Async checkpointer
593
+ │       │   └── base.py             # Base checkpointer class
594
+ │       └── __init__.py             # Public API exports
595
+ ├── tests/                          # Test suite
596
+ ├── notebooks/                      # Example notebooks
597
+ ├── docs/                           # Documentation
598
+ ├── pyproject.toml                  # Project configuration
599
+ └── README.md                       # This file
600
+ ```
601
+
602
+ ---
603
+
604
+ ## 🤝 Contributing
605
+
606
+ Contributions are welcome! Please:
607
+
608
+ 1. Fork the repository
609
+ 2. Create a feature branch (`git checkout -b feature/amazing-feature`)
610
+ 3. Make your changes with tests
611
+ 4. Run the test suite (`make test`)
612
+ 5. Format and lint (`make format lint`)
613
+ 6. Commit your changes (`git commit -m 'Add amazing feature'`)
614
+ 7. Push to the branch (`git push origin feature/amazing-feature`)
615
+ 8. Open a Pull Request
616
+
617
+ ---
618
+
619
+ ## 📄 License
620
+
621
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
622
+
623
+ ---
624
+
625
+ ## 🙏 Acknowledgments
626
+
627
+ Built with:
628
+
629
+ - [LangChain](https://github.com/langchain-ai/langchain) - Framework for LLM applications
630
+ - [LangGraph](https://github.com/langchain-ai/langgraph) - Graph-based agent framework
631
+ - [LangMem](https://github.com/langchain-ai/langmem) - Long-term memory for agents
632
+ - [Databricks SDK](https://github.com/databricks/databricks-sdk-py) - Databricks API client
633
+ - [Unity Catalog](https://www.databricks.com/product/unity-catalog) - Data governance platform
634
+
635
+ ---
636
+
637
+ ## 📞 Support
638
+
639
+ For issues and questions:
640
+
641
+ - **GitHub Issues**: [Open an issue](https://github.com/natefleming/langgraph_unity_catalog_checkpoint/issues)
642
+ - **Documentation**: Check the [`docs/`](docs/) directory
643
+ - **Examples**: Review the [`notebooks/`](notebooks/) directory
644
+
645
+ ---
646
+
647
+ ## 🗺️ Roadmap
648
+
649
+ Planned enhancements:
650
+
651
+ - [ ] Connection pooling for improved performance
652
+ - [ ] Configurable TTL for automatic checkpoint cleanup
653
+ - [ ] Metrics and monitoring integration
654
+ - [ ] Query optimization hints and caching
655
+ - [ ] Support for alternative serialization formats
656
+ - [ ] Bulk import/export utilities
657
+ - [ ] Multi-region replication support
658
+
659
+ ---
660
+
661
+ ## ⚡ Quick Links
662
+
663
+ - **[Quick Start Guide](QUICKSTART.md)** - Get started in 5 minutes
664
+ - **[Usage Examples](docs/USAGE.md)** - Detailed usage patterns
665
+ - **[Notebooks](notebooks/)** - Interactive examples
666
+ - **[API Reference](docs/IMPLEMENTATION_SUMMARY.md)** - Technical details
667
+
668
+ ---
669
+
670
+ **Made with ❤️ for the LangChain community**