incorporator 1.0.0__tar.gz

@@ -0,0 +1,223 @@
1
+ Metadata-Version: 2.4
2
+ Name: incorporator
3
+ Version: 1.0.0
4
+ Summary: The Dynamic Class Building and Zero-Boilerplate Universal Data Gateway.
5
+ Author: Incorporator Maintainers
6
+ License: MIT
7
+ Project-URL: Repository, https://github.com/your-org/incorporator
8
+ Project-URL: Bug Tracker, https://github.com/your-org/incorporator/issues
9
+ Keywords: etl,api,pydantic,dynamic-models,json,csv,xml,asyncio,hateoas
10
+ Classifier: Development Status :: 5 - Production/Stable
11
+ Classifier: Intended Audience :: Developers
12
+ Classifier: License :: OSI Approved :: MIT License
13
+ Classifier: Programming Language :: Python :: 3
14
+ Classifier: Programming Language :: Python :: 3.9
15
+ Classifier: Programming Language :: Python :: 3.10
16
+ Classifier: Programming Language :: Python :: 3.11
17
+ Classifier: Programming Language :: Python :: 3.12
18
+ Classifier: Programming Language :: Python :: 3.13
19
+ Classifier: Operating System :: OS Independent
20
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
21
+ Requires-Python: >=3.9
22
+ Description-Content-Type: text/markdown
23
+ Requires-Dist: pydantic>=2.0.0
24
+ Requires-Dist: httpx>=0.24.0
25
+ Requires-Dist: tenacity>=8.0.0
26
+ Provides-Extra: dev
27
+ Requires-Dist: pytest>=7.0; extra == "dev"
28
+ Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
29
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
30
+ Requires-Dist: mypy>=1.0.0; extra == "dev"
31
+ Requires-Dist: ruff>=0.1.0; extra == "dev"
32
+
+ ```markdown
+ # 🌌 Incorporator (v1.0.0)
+ **The Dynamic Class Building and Zero-Boilerplate Universal Data Gateway.**
+
+ [![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
+ [![Pydantic V2](https://img.shields.io/badge/Pydantic-V2-e92063.svg)](https://docs.pydantic.dev/)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
+ [![Code style: Ruff](https://img.shields.io/badge/code%20style-Ruff-261230.svg)](https://github.com/astral-sh/ruff)
+ [![Typing: Strict](https://img.shields.io/badge/typing-strict-green.svg)](https://mypy.readthedocs.io/en/stable/)
+
+ Stop writing boilerplate models, manual HTTP connection loops, pagination state-trackers, and fragile data-cleaning lambda functions.
+
+ **Incorporator** is a Python framework that transforms raw JSON, CSV, and XML sources into fully typed, relational Python object graphs with a single `incorp()` call.
+
+ ## 🚀 Installation
+
+ ```bash
+ pip install incorporator
+ ```
+
+ ## ⚡ The "Zero-Boilerplate" Philosophy
+
+ **The Old Way:** Define a rigid `BaseModel`, write an `httpx` loop, handle 429 retries, write a custom paginator, manually link foreign keys, catch `KeyError`s, and hope the API schema doesn't change.
+
+ **The Incorporator Way:**
+ ```python
+ from incorporator import Incorporator
+ from incorporator.methods.paginate import NextUrlPaginator
+
+ class Crypto(Incorporator): pass
+
+ # Fetch 150 coins, auto-paginate, generate Pydantic models on the fly, and rate-limit perfectly.
+ # (`await` needs an async context: run this inside a coroutine or an asyncio REPL.)
+ coins = await Crypto.incorp(
+     inc_url="https://api.coingecko.com/api/v3/coins/markets?vs_currency=usd",
+     inc_code="id",
+     inc_name="name",
+     inc_page=NextUrlPaginator("next"),
+     call_lim=3
+ )
+
+ print(coins[0].inc_name) # "Bitcoin"
+ print(coins[0].current_price) # 64000.00 (Dynamically typed as float by Pydantic V2!)
+ ```
+
+ ---
+
+ ## 🛠️ The Core Architectural Pillars
+
+ ### 1. The Holy Trinity API & Dynamic Registries
+ - `incorp()`: Extracts raw data, compiles dynamic `Pydantic` schemas natively, and loads data into intelligent `IncorporatorList` wrappers.
+ - `refresh()`: Hydrates existing instances seamlessly with new data (perfect for live feeds).
+ - `export()`: Dumps stateful object graphs back into sanitized JSON, XML, or CSV files.
+ - **The `inc_dict`:** Every object automatically registers itself into a memory-safe `WeakValueDictionary`. Look up any object instantly: `coins.inc_dict.get('bitcoin')`.
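To make the registry idea concrete, here is a minimal sketch of a self-registering class backed by the standard library's `weakref.WeakValueDictionary`. The `Registered` class and its constructor are hypothetical and only illustrate the pattern; they are not Incorporator's actual implementation.

```python
import gc
import weakref


class Registered:
    """Hypothetical base class: each instance registers itself under a code."""

    # Weak values: an entry disappears once nothing else references the object.
    inc_dict = weakref.WeakValueDictionary()

    def __init__(self, code: str, name: str) -> None:
        self.inc_code = code
        self.inc_name = name
        type(self).inc_dict[code] = self


def registered_after_release() -> bool:
    """Create an object, drop the only strong reference, then check the registry."""
    temp = Registered("ethereum", "Ethereum")
    del temp
    gc.collect()  # not needed on CPython, but makes the intent explicit
    return "ethereum" in Registered.inc_dict


coin = Registered("bitcoin", "Bitcoin")
found = Registered.inc_dict.get("bitcoin")      # the same object as `coin`
missing = Registered.inc_dict.get("dogecoin")   # None instead of a KeyError
still_registered = registered_after_release()   # False: the entry was dropped
```

Because the dictionary holds only weak references, the registry never keeps an object alive on its own, which is presumably what "memory-safe" refers to here.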
+
+ ### 2. Declarative ETL & Null-Safe Converters
+ Data is messy. Incorporator's built-in `conv_dict` tools intercept bad data *before* Pydantic validation, shielding you from crashes with beautiful, readable syntax.
+ * **`inc(type)`**: Automatically ranks fallbacks. `inc(datetime)` will parse ISO-8601 or 10+ standard string formats natively.
+ * **`calc(func, *keys)`**: Multi-column row calculations. `calc(len, 'residents', default=0)`.
+ * **`link_to` & `link_to_list`**: Zero-boilerplate Graph Relational Mapping.
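The general technique behind null-safe conversion can be sketched with the standard library alone. The helpers below (`null_safe`, `parse_datetime`, `FALLBACK_FORMATS`) are hypothetical names chosen for illustration and are not part of Incorporator's API; the sample row is made up.

```python
from datetime import datetime
from typing import Any, Callable


def null_safe(func: Callable[[Any], Any], default: Any = None) -> Callable[[Any], Any]:
    """Wrap a converter so empty or malformed input yields `default`."""
    def convert(value: Any) -> Any:
        if value is None or value == "":
            return default
        try:
            return func(value)
        except (TypeError, ValueError):
            return default
    return convert


# Assumed fallback formats, tried in order after ISO-8601 fails.
FALLBACK_FORMATS = ("%d/%m/%Y", "%m-%d-%Y")


def parse_datetime(value: str) -> datetime:
    """Try ISO-8601 first, then each fallback format in turn."""
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        pass
    for fmt in FALLBACK_FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {value!r}")


to_float = null_safe(float, default=0.0)
to_date = null_safe(parse_datetime)

row = {"height": "172", "mass": "unknown", "created": "2014-12-09T13:50:51", "edited": "09/12/2014"}
cleaned = {
    "height": to_float(row["height"]),   # 172.0
    "mass": to_float(row["mass"]),       # 0.0, because "unknown" is not a number
    "created": to_date(row["created"]),  # parsed as ISO-8601
    "edited": to_date(row["edited"]),    # parsed by the "%d/%m/%Y" fallback
}
```

Running the cleaning step before validation means a typed model only ever sees values of the expected type or an explicit default.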
+
+ ### 3. Native Concurrency & Invisible Resilience
+ Pass a list of 500 URLs or trigger a deep-drill. Incorporator automatically spins up an `asyncio.Semaphore`, shares a single `httpx.AsyncClient` pool, and batches requests.
+ *Hit a 429 Too Many Requests?* It automatically jitter-retries via `tenacity`.
+ *Still 429?* It gracefully skips the failed row, logs it to `results.failed_sources`, and returns the remaining objects without crashing your pipeline.
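The pattern described above (bounded concurrency, jittered retries, and skipping sources that keep failing) can be sketched with `asyncio` alone. This is an illustration under stated assumptions, not the library's code: it uses a hand-rolled retry loop instead of `tenacity`, a fake `fetch` coroutine instead of `httpx`, and a hypothetical `RateLimited` exception standing in for an HTTP 429.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, Optional, Tuple


class RateLimited(Exception):
    """Stand-in for an HTTP 429 response (hypothetical)."""


async def fetch_with_retry(
    url: str,
    fetch: Callable[[str], Awaitable[dict]],
    attempts: int = 3,
    base_delay: float = 0.01,
) -> dict:
    """Retry rate-limited calls with exponential backoff plus random jitter."""
    for attempt in range(1, attempts + 1):
        try:
            return await fetch(url)
        except RateLimited:
            if attempt == attempts:
                raise
            await asyncio.sleep(base_delay * (2 ** attempt) * random.random())
    raise AssertionError("unreachable")


async def fetch_all(
    urls: List[str],
    fetch: Callable[[str], Awaitable[dict]],
    limit: int = 5,
) -> Tuple[List[dict], List[str]]:
    """Fetch every URL with at most `limit` requests in flight at once."""
    semaphore = asyncio.Semaphore(limit)
    failed: List[str] = []

    async def worker(url: str) -> Optional[dict]:
        async with semaphore:
            try:
                return await fetch_with_retry(url, fetch)
            except RateLimited:
                failed.append(url)  # record the source and carry on
                return None

    results = await asyncio.gather(*(worker(u) for u in urls))
    return [r for r in results if r is not None], failed


async def fake_fetch(url: str) -> dict:
    """Fake transport: any URL ending in /bad is always rate-limited."""
    if url.endswith("/bad"):
        raise RateLimited(url)
    return {"url": url}


records, failed_sources = asyncio.run(
    fetch_all(["https://example.test/a", "https://example.test/bad", "https://example.test/b"], fake_fetch)
)
```

One failing source ends up in `failed_sources` while the other two records are still returned, which mirrors the skip-and-log behaviour the README describes.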
+
+ ### 4. Advanced Asynchronous Pagination
+ Isolated strategy classes handle pagination and guard against infinite loops. Included: `NextUrlPaginator`, `CursorPaginator`, `OffsetPaginator`, `PageNumberPaginator`, and `LinkHeaderPaginator`.
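As a rough illustration of next-URL pagination with a loop guard, the sketch below follows a `next` field until it is empty, repeats a URL already visited, or hits a call limit. The function name `follow_next_urls`, the `results` key, and the in-memory `PAGES` fixture are all assumptions made for the example; the real paginator classes may work differently.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


async def follow_next_urls(
    start_url: str,
    get_page: Callable[[str], Awaitable[dict]],
    next_key: str = "next",
    call_limit: Optional[int] = None,
) -> List[dict]:
    """Collect records page by page, never requesting the same URL twice."""
    records: List[dict] = []
    seen = set()
    url: Optional[str] = start_url
    while url and url not in seen:
        if call_limit is not None and len(seen) >= call_limit:
            break
        seen.add(url)
        page = await get_page(url)
        records.extend(page.get("results", []))
        url = page.get(next_key)
    return records


# In-memory fixture: the last page points back to the first one (a cycle).
PAGES = {
    "page-1": {"results": [{"id": 1}, {"id": 2}], "next": "page-2"},
    "page-2": {"results": [{"id": 3}], "next": "page-3"},
    "page-3": {"results": [{"id": 4}], "next": "page-1"},
}


async def fake_get_page(url: str) -> dict:
    return PAGES[url]


all_ids = [r["id"] for r in asyncio.run(follow_next_urls("page-1", fake_get_page))]
limited_ids = [r["id"] for r in asyncio.run(follow_next_urls("page-1", fake_get_page, call_limit=2))]
```

The `seen` set is what prevents an API that links its last page back to its first from looping forever.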
+
+ ---
+
+ ## 📖 Real-World Showcases
+
+ ### Showcase 1: HATEOAS & Relational Mapping (Star Wars API)
+ Turn disconnected flat APIs into deeply nested, traversable object graphs using `link_to` and `link_to_list`.
+
+ ```python
+ from incorporator import Incorporator
+ from incorporator.methods.converters import calc, extract_url_id, flt, link_to, link_to_list
+
+ class Planet(Incorporator): pass
+ class Film(Incorporator): pass
+ class Person(Incorporator): pass
+
+ # 1. Build the foundational Graph Nodes
+ planets = await Planet.incorp(inc_url="https://swapi.dev/api/planets/", inc_code="url")
+ films = await Film.incorp(inc_url="https://swapi.dev/api/films/", inc_code="url")
+
+ # 2. Fetch People and map relations natively
+ people = await Person.incorp(
+     inc_url="https://swapi.dev/api/people/",
+     inc_code="url",
+     conv_dict={
+         # Safely cast string numbers to floats
+         "height": calc(float, default=0.0, target_type=flt),
+
+         # Instantly link URL strings to our in-memory Planet and Film objects!
+         "homeworld": calc(link_to(planets), default=None),
+         "films": calc(link_to_list(films), default=[])
+     }
+ )
+
+ # Deep Dot-Notation Navigation!
+ luke = people[0]
+ print(luke.homeworld.inc_name) # "Tatooine"
+ print(luke.films[0].inc_name) # "A New Hope"
+ ```
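Conceptually, linking by URL amounts to resolving each URL string against a registry keyed by that URL. The sketch below shows the idea with plain dictionaries; `make_linker`, `make_list_linker`, and the `Node` class are hypothetical stand-ins, and the data is illustrative rather than fetched from the API.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Node:
    name: str


def make_linker(registry: Dict[str, Any]) -> Callable[[Optional[str]], Any]:
    """Return a converter that swaps a URL for the registered object, or None."""
    def resolve(url: Optional[str]) -> Any:
        return registry.get(url) if url else None
    return resolve


def make_list_linker(registry: Dict[str, Any]) -> Callable[[Optional[List[str]]], List[Any]]:
    """Return a converter that resolves a list of URLs, dropping unknown ones."""
    def resolve_all(urls: Optional[List[str]]) -> List[Any]:
        return [registry[u] for u in (urls or []) if u in registry]
    return resolve_all


planets = {"https://swapi.dev/api/planets/1/": Node("Tatooine")}
films = {"https://swapi.dev/api/films/1/": Node("A New Hope")}

raw_person = {
    "name": "Luke Skywalker",
    "homeworld": "https://swapi.dev/api/planets/1/",
    "films": ["https://swapi.dev/api/films/1/", "https://swapi.dev/api/films/99/"],
}

person = dict(raw_person)
person["homeworld"] = make_linker(planets)(raw_person["homeworld"])
person["films"] = make_list_linker(films)(raw_person["films"])
film_names = [f.name for f in person["films"]]  # the unknown film URL is dropped
```

Once the URL strings are replaced by object references, attribute navigation such as `person["homeworld"].name` follows directly.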
+
+ ### Showcase 2: Parent-Based Enrichment (PokéAPI)
+ Pass shallow objects into `inc_parent` to trigger automatic concurrent bulk detail scraping.
+
+ ```python
+ from incorporator import Incorporator
+ from incorporator.methods.converters import calc
+ from incorporator.methods.paginate import NextUrlPaginator
+
+ class Nav(Incorporator): pass
+ class Pokemon(Incorporator): pass
+
+ # 1. SHALLOW DISCOVERY: Fetch 150 navigation URLs (3 pages of 50)
+ pokemon_nav = await Nav.incorp(
+     inc_url="https://pokeapi.co/api/v2/pokemon/?limit=50&offset=0",
+     inc_name="name", name_chg=[('url', 'detail_url')],
+     inc_page=NextUrlPaginator("next"), call_lim=3
+ )
+
+ def calculate_bst(stats: list) -> int:
+     return sum(s.get("base_stat", 0) for s in stats if isinstance(s, dict))
+
+ # 2. DEEP ENRICHMENT: Pass the parent objects. The framework reads each 'detail_url',
+ # fires 150 concurrent requests, and builds deep objects automatically.
+ enriched_pokemon = await Pokemon.incorp(
+     inc_parent=pokemon_nav,
+     inc_code="id", inc_name="name",
+     conv_dict={
+         # Dynamically calculate Base Stat Total from the nested JSON array
+         "stats": calc(calculate_bst, "stats", default=0, target_type=int)
+     }
+ )
+ ```
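For reference, `calculate_bst` can be exercised on its own without any network access. The stat values below are illustrative sample data, not a response captured from the API.

```python
def calculate_bst(stats: list) -> int:
    """Sum the base_stat of every dict entry, ignoring malformed items."""
    return sum(s.get("base_stat", 0) for s in stats if isinstance(s, dict))


sample_stats = [
    {"base_stat": 45, "stat": {"name": "hp"}},
    {"base_stat": 49, "stat": {"name": "attack"}},
    {"base_stat": 49, "stat": {"name": "defense"}},
    None,                         # malformed entry: skipped by the isinstance check
    {"stat": {"name": "speed"}},  # missing base_stat: counted as 0
]

total = calculate_bst(sample_stats)  # 45 + 49 + 49 = 143
empty_total = calculate_bst([])      # 0
```

The `isinstance` guard and the `.get` default are what keep the calculation from raising on partially malformed payloads.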
+
+ ### Showcase 3: Local XML to Live JSON Bulk POST (NHTSA API)
+ Seamlessly bridge deep local XML data with live JSON REST APIs.
+
+ ```python
+ from incorporator import Incorporator
+ # `inc` is assumed to be importable from the converters module, like `calc`
+ from incorporator.methods.converters import inc
+
+ class JimmyInvoice(Incorporator): pass
+ class NHTSARecord(Incorporator): pass
+
+ # 1. Extract nested data from a local XML file
+ invoices = await JimmyInvoice.incorp(
+     inc_file="shady_jimmy.xml",
+     rec_path="Dealership.AuditFile.Invoices.Invoice"
+ )
+
+ vin_batch_string = ";".join([getattr(inv.Vehicle, "VIN", "") for inv in invoices])
+
+ # 2. Hit a live JSON Bulk Endpoint using a POST payload
+ live_records = await NHTSARecord.incorp(
+     inc_url="https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVINValuesBatch/",
+     method="POST",
+     form_payload={"format": "json", "DATA": vin_batch_string},
+     rec_path="Results",
+     inc_code="VIN",
+     conv_dict={ "ModelYear": inc(int) } # Force string years to integers
+ )
+
+ # 3. Audit instantly via the memory-safe registry
+ for inv in invoices:
+     vin = inv.Vehicle.VIN
+     actual_car = live_records.inc_dict.get(vin)
+     if actual_car is None:
+         continue  # No decoded record came back for this VIN
+     if actual_car.ModelYear != int(inv.Vehicle.Year):
+         print("Fraud Detected!")
+ ```
+
+ ---
+
+ ## 🕵️ Non-Blocking Observability
+ Need production logs without starving your async event loop?
+ ```python
+ from incorporator import LoggedIncorporator
+
+ class WebAPI(LoggedIncorporator): pass
+
+ # Configures background multithreaded queue logging automatically
+ instance = await WebAPI.incorp(
+     inc_url="https://api.example.com/data",
+     enable_logging=True
+ )
+
+ instance.log_info("Standard trace")
+ instance.log_error("API Offline", exc_info=True)
+ instance.log_api("Web traffic trace") # Routes to isolated api.log
+ ```
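Queue-based logging of this kind is available in the standard library through `logging.handlers.QueueHandler` and `QueueListener`. The sketch below shows that general mechanism; `start_queue_logging` and `ListHandler` are hypothetical helpers for the example, and whether `LoggedIncorporator` is built this way is an assumption.

```python
import logging
import logging.handlers
import queue
from typing import List, Tuple


class ListHandler(logging.Handler):
    """Collects formatted messages in memory (stands in for a file handler)."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: List[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def start_queue_logging(
    logger_name: str, target: logging.Handler
) -> Tuple[logging.Logger, logging.handlers.QueueListener]:
    """Route a logger through a queue; a background thread feeds `target`."""
    log_queue: queue.Queue = queue.Queue(-1)  # unbounded queue
    listener = logging.handlers.QueueListener(log_queue, target)
    listener.start()

    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    logger.propagate = False  # keep the sketch's output out of the root logger
    return logger, listener


collector = ListHandler()
logger, listener = start_queue_logging("incorporator.sketch", collector)

logger.info("Standard trace")      # only enqueues, so the caller is not blocked on I/O
logger.error("API Offline")
listener.stop()                    # drains the queue and joins the worker thread

logged_messages = collector.messages
```

The calling thread only pays for a queue put, so slow handlers (files, sockets) do not stall the event loop.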
+
+ ## 🤝 Contributing
+ 1. Clone the repo.
+ 2. `pip install -e ".[dev]"` (installs `pytest`, `pytest-asyncio`, `pytest-cov`, `mypy`, and `ruff`).
+ 3. Run tests: `pytest tests/ -v`.
+ 4. Check typing: `mypy --strict incorporator`.
+
+ *Built for data engineers who want to sleep at night.*
+ ```
@@ -0,0 +1,38 @@
+ """Incorporator: The Dynamic Class Building and Zero-Boilerplate Universal Data Gateway."""
+
+ __version__ = "1.0.0"
+
+ from .base import Incorporator
+ from .methods.converters import (
+     extract_url_id,
+     link_to,
+     link_to_list,
+     pluck,
+     split_and_get,
+ )
+ from .methods.exceptions import (
+     IncorporatorError,
+     IncorporatorFormatError,
+     IncorporatorNetworkError,
+     IncorporatorSchemaError,
+ )
+ from .methods.format_parsers import FormatType
+ from .methods.logger import LoggedIncorporator, LoggingMixin, setup_class_logger
+
+ __all__ = [
+     "__version__",
+     "Incorporator",
+     "LoggedIncorporator",
+     "LoggingMixin",
+     "setup_class_logger",
+     "FormatType",
+     "split_and_get",
+     "link_to",
+     "link_to_list",
+     "extract_url_id",
+     "pluck",
+     "IncorporatorError",
+     "IncorporatorFormatError",
+     "IncorporatorNetworkError",
+     "IncorporatorSchemaError",
+ ]