ygg 0.1.20__py3-none-any.whl → 0.1.23__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,163 +0,0 @@
- Metadata-Version: 2.4
- Name: ygg
- Version: 0.1.20
- Summary: Type-friendly utilities for moving data between Python objects, Arrow, Polars, Pandas, Spark, and Databricks
- Author: Yggdrasil contributors
- Project-URL: Homepage, https://github.com/Platob/Yggdrasil
- Project-URL: Repository, https://github.com/Platob/Yggdrasil
- Project-URL: Documentation, https://github.com/Platob/Yggdrasil/tree/main/python/docs
- Keywords: arrow,polars,pandas,spark,databricks,typing,dataclass,serialization
- Classifier: Development Status :: 3 - Alpha
- Classifier: Programming Language :: Python
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Classifier: Intended Audience :: Developers
- Classifier: Intended Audience :: Information Technology
- Classifier: Topic :: Software Development :: Libraries
- Classifier: Topic :: Scientific/Engineering :: Information Analysis
- Classifier: Typing :: Typed
- Requires-Python: >=3.10
- Description-Content-Type: text/markdown
- Requires-Dist: requests>=2
- Requires-Dist: polars>=1.3
- Requires-Dist: pandas>=2
- Requires-Dist: pyarrow>=20
- Requires-Dist: dill>=0.4
- Requires-Dist: databricks-sdk>=0.71
- Provides-Extra: dev
- Requires-Dist: pytest; extra == "dev"
- Requires-Dist: pytest-asyncio; extra == "dev"
- Requires-Dist: black; extra == "dev"
- Requires-Dist: ruff; extra == "dev"
- Requires-Dist: mypy; extra == "dev"
-
- # Yggdrasil (Python)
-
- Type-friendly utilities for moving data between Python objects, Arrow, Polars, pandas, Spark, and Databricks. The package bundles enhanced dataclasses, casting utilities, and lightweight wrappers around Databricks and HTTP clients so Python and data engineers can focus on schemas instead of plumbing.
-
- ## When to use this package
- Use Yggdrasil when you need to:
- - Convert payloads across dataframe engines without rewriting type logic for each backend.
- - Define dataclasses that auto-coerce inputs, expose defaults, and surface Arrow schemas.
- - Run Databricks SQL jobs or manage clusters with minimal boilerplate.
- - Add resilient retries, concurrency helpers, and dependency guards to data pipelines.
-
- ## Prerequisites
- - Python **3.10+**
- - [uv](https://docs.astral.sh/uv/) for virtualenv and dependency management.
-
- Optional extras:
- - `polars`, `pandas`, `pyarrow`, and `pyspark` for engine-specific conversions.
- - `databricks-sdk` for workspace, SQL, jobs, and compute helpers.
- - `msal` for Azure AD authentication when using `MSALSession`.
-
- ## Installation
- From the `python/` directory:
-
- ```bash
- uv venv .venv
- source .venv/bin/activate
- uv pip install -e .[dev]
- ```
-
- Extras are grouped by engine:
- - `.[polars]`, `.[pandas]`, `.[spark]`, `.[databricks]` – install only the integrations you need.
- - `.[dev]` – adds testing, linting, and typing tools (`pytest`, `ruff`, `black`, `mypy`).
-
- ## Quickstart
- Define an Arrow-aware dataclass, coerce inputs, and cast across containers:
-
- ```python
- from yggdrasil import yggdataclass
- from yggdrasil.types.cast import convert
- from yggdrasil.types import arrow_field_from_hint
-
- @yggdataclass
- class User:
-     id: int
-     email: str
-     active: bool = True
-
- user = User.__safe_init__("123", email="alice@example.com")
- assert user.id == 123 and user.active is True
-
- payload = {"id": "45", "email": "bob@example.com", "active": "false"}
- clean = User.from_dict(payload)
- print(clean.to_dict())
-
- field = arrow_field_from_hint(User, name="user")
- print(field)  # user: struct<id: int64, email: string, active: bool>
-
- numbers = convert(["1", "2", "3"], list[int])
- print(numbers)
- ```
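Under the hood, `yggdataclass` performs this kind of input coercion automatically. The exact machinery is package-specific, but the core idea (cast each field to its annotated type after `__init__`) can be sketched in plain Python; `coercing_dataclass` here is an illustrative stand-in, not the package's implementation:

```python
from dataclasses import dataclass, fields

def coercing_dataclass(cls):
    """Illustrative sketch of auto-coercion: after the generated __init__
    runs, cast each field's value to its annotated type. This only covers
    plain constructor-castable types; real coercion needs per-type rules."""
    def __post_init__(self):
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(f.type, type) and not isinstance(value, f.type):
                setattr(self, f.name, f.type(value))

    # __post_init__ must exist before @dataclass generates __init__,
    # otherwise the generated __init__ never calls it.
    cls.__post_init__ = __post_init__
    return dataclass(cls)

@coercing_dataclass
class User:
    id: int
    email: str

user = User("123", email="alice@example.com")
print(user.id, type(user.id).__name__)  # 123 int
```

Note that naive constructor casting would turn the string `"false"` into `True` for a `bool` field, which is one reason a real library keeps a dedicated casting registry instead.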
-
- ### Databricks example
- Install the `databricks` extra and run SQL with typed results:
-
- ```python
- from yggdrasil.databricks.workspaces import Workspace
- from yggdrasil.databricks.sql import SQLEngine
-
- ws = Workspace(host="https://<workspace-url>", token="<token>")
- engine = SQLEngine(workspace=ws)
-
- stmt = engine.execute("SELECT 1 AS value")
- result = stmt.wait(engine)
- tbl = result.arrow_table()
- print(tbl.to_pandas())
- ```
-
- ### Parallel processing and retries
-
- ```python
- from yggdrasil.pyutils import parallelize, retry
-
- @parallelize(max_workers=4)
- def square(x):
-     return x * x
-
- @retry(tries=5, delay=0.2, backoff=2)
- def sometimes_fails(value: int) -> int:
-     ...
-
- print(list(square(range(5))))
- ```
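The `retry` parameters shown above (`tries`, `delay`, `backoff`) follow a common exponential-backoff pattern. As a rough, self-contained sketch of what such a decorator typically does (this is not Yggdrasil's implementation):

```python
import functools
import time

def retry(tries=3, delay=0.1, backoff=2):
    """Retry a function up to `tries` times, sleeping `delay` seconds
    between attempts and multiplying the delay by `backoff` each time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == tries:
                        raise  # out of attempts: re-raise the last error
                    time.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator

# A flaky function that fails twice, then succeeds.
calls = {"n": 0}

@retry(tries=5, delay=0.01, backoff=2)
def sometimes_fails() -> int:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return 42

print(sometimes_fails())  # succeeds on the third attempt: 42
```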
-
- ## Project layout
- - `yggdrasil/dataclasses` – `yggdataclass` decorator plus Arrow schema helpers.
- - `yggdrasil/types` – casting registry (`convert`, `register_converter`), Arrow inference, and default generators.
- - `yggdrasil/libs` – optional bridges to Polars, pandas, Spark, and Databricks SDK types.
- - `yggdrasil/databricks` – workspace, SQL, jobs, and compute helpers built on the Databricks SDK.
- - `yggdrasil/requests` – retry-capable HTTP sessions and Azure MSAL auth helpers.
- - `yggdrasil/pyutils` – concurrency and retry decorators.
- - `yggdrasil/ser` – serialization helpers and dependency inspection utilities.
- - `tests/` – pytest-based coverage for conversions, dataclasses, requests, and platform helpers.
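The casting registry in `yggdrasil/types` centers on a generic `convert` entry point plus `register_converter` hooks. The package's real signatures may differ; a minimal sketch of the dispatch-by-target-type idea looks like this:

```python
from typing import Any, Callable, get_args, get_origin

# Target type -> converter function.
_CONVERTERS: dict[Any, Callable[[Any], Any]] = {}

def register_converter(target: Any, func: Callable[[Any], Any]) -> None:
    """Register a custom converter producing values of `target` type."""
    _CONVERTERS[target] = func

def convert(value: Any, target: Any) -> Any:
    """Convert `value` to `target`, recursing into parameterized lists."""
    if get_origin(target) is list:           # e.g. list[int]
        (item_type,) = get_args(target)
        return [convert(v, item_type) for v in value]
    if target in _CONVERTERS:
        return _CONVERTERS[target](value)
    return target(value)                     # fall back to the constructor

# str -> bool needs custom logic; int("3") works via the fallback.
register_converter(bool, lambda v: str(v).strip().lower() in {"1", "true", "yes"})

print(convert(["1", "2", "3"], list[int]))   # [1, 2, 3]
print(convert("false", bool))                # False
```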
-
- ## Testing
- From `python/`:
-
- ```bash
- pytest
- ```
-
- Optional checks when developing:
-
- ```bash
- ruff check
- black .
- mypy
- ```
-
- ## Troubleshooting and common pitfalls
- - **Missing optional dependency**: Install the matching extra (e.g., `uv pip install -e .[polars]`) or wrap calls with `require_polars`/`require_pyspark` from `yggdrasil.libs`.
- - **Schema mismatches**: Use `arrow_field_from_hint` and `CastOptions` to enforce expected Arrow metadata when casting.
- - **Databricks auth**: Provide `host` and `token` to `Workspace`. For Azure, ensure environment variables align with your workspace deployment.
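The `require_polars`/`require_pyspark` guards mentioned above fail fast with an actionable install hint when an extra is missing. A minimal sketch of that guard pattern (illustrative; the package's own helpers live in `yggdrasil.libs` and may behave differently):

```python
import importlib.util

def require(module: str, extra: str) -> None:
    """Raise a helpful ImportError when an optional dependency is absent."""
    if importlib.util.find_spec(module) is None:
        raise ImportError(
            f"'{module}' is required for this feature; "
            f"install it with: uv pip install -e .[{extra}]"
        )

def require_polars() -> None:
    require("polars", "polars")

# Guard an optional code path before touching the engine-specific import.
try:
    require("surely_missing_module_xyz", "polars")
except ImportError as exc:
    print(exc)
```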
-
- ## Contributing
- 1. Fork and branch.
- 2. Install with `uv pip install -e .[dev]`.
- 3. Run tests and linters.
- 4. Submit a PR describing the change and any new examples added to the docs.
yggdrasil/ser/__init__.py DELETED
@@ -1 +0,0 @@
- from .callable_serde import *