duckrun 0.1.0__py3-none-any.whl → 0.1.1__py3-none-any.whl

@@ -0,0 +1,183 @@
1
+ Metadata-Version: 2.4
2
+ Name: duckrun
3
+ Version: 0.1.1
4
+ Summary: Lakehouse task runner powered by DuckDB for Microsoft Fabric
5
+ License-Expression: MIT
6
+ Project-URL: Homepage, https://github.com/djouallah/duckrun
7
+ Project-URL: Repository, https://github.com/djouallah/duckrun
8
+ Project-URL: Issues, https://github.com/djouallah/duckrun/issues
9
+ Requires-Python: >=3.9
10
+ Description-Content-Type: text/markdown
11
+ License-File: LICENSE
12
+ Requires-Dist: duckdb>=1.2.0
13
+ Requires-Dist: deltalake>=0.18.2
14
+ Requires-Dist: requests>=2.28.0
15
+ Dynamic: license-file
16
+
17
+ # 🦆 Duckrun
18
+
19
+ A simple task runner for Microsoft Fabric Python notebooks, powered by DuckDB and delta-rs.
20
+
21
+ ## Installation
22
+
23
+ ```bash
24
+ pip install duckrun
25
+ ```
26
+
27
+
28
+
29
+ ## Quick Start
30
+
31
+ ```python
32
+ import duckrun as dr
33
+
34
+ # Connect to your Fabric lakehouse
35
+ lakehouse = dr.connect(
36
+     workspace="my_workspace",
36
+     lakehouse_name="my_lakehouse",
37
+     schema="dbo",
38
+     sql_folder="./sql"  # folder containing your .sql and .py files
40
+ )
41
+
42
+ # Define your pipeline
43
+ pipeline = [
44
+     ('load_data', (url, path)),   # Python task
44
+     ('clean_data', 'overwrite'),  # SQL task
45
+     ('aggregate', 'append')       # SQL task
47
+ ]
48
+
49
+ # Run it
50
+ lakehouse.run(pipeline)
51
+ ```
52
+
53
+ ## How It Works
54
+
55
+ Duckrun runs two types of tasks:
56
+
57
+ ### 1. Python Tasks
58
+ Format: `('function_name', (arg1, arg2, ...))`
59
+
60
+ Create a file `sql_folder/function_name.py` with a function matching the name:
61
+
62
+ ```python
63
+ # sql_folder/load_data.py
64
+ def load_data(url, path):
65
+     # your code here
65
+     # IMPORTANT: Must return 1 for success, 0 for failure
66
+     return 1
68
+ ```
69
+
70
+ ### 2. SQL Tasks
71
+ Format: `('table_name', 'mode')` or `('table_name', 'mode', {params})`
72
+
73
+ Create a file `sql_folder/table_name.sql`:
74
+
75
+ ```sql
76
+ -- sql_folder/clean_data.sql
77
+ SELECT
78
+     id,
78
+     TRIM(name) AS name,
79
+     date
81
+ FROM raw_data
82
+ WHERE date >= '2024-01-01'
83
+ ```
84
+
85
+ **Modes:**
86
+ - `overwrite` - Replace the table completely
86
+ - `append` - Add rows to the existing table
87
+ - `ignore` - Create the table only if it doesn't already exist
89
+
90
+ ## Task Files
91
+
92
+ The `sql_folder` can contain both `.sql` and `.py` files, so a single pipeline can combine SQL transformations with Python logic.
93
+
94
+ ### SQL Files
95
+ Your SQL files automatically have access to:
96
+ - `$ws` - workspace name
97
+ - `$lh` - lakehouse name
98
+ - `$schema` - schema name
99
+
100
+ Pass custom parameters:
101
+
102
+ ```python
103
+ pipeline = [
104
+     ('sales', 'append', {'start_date': '2024-01-01', 'end_date': '2024-12-31'})
105
+ ]
106
+ ```
107
+
108
+ ```sql
109
+ -- sql_folder/sales.sql
110
+ SELECT * FROM transactions
111
+ WHERE date BETWEEN '$start_date' AND '$end_date'
112
+ ```
113
+
114
+ ## Table Name Convention
115
+
116
+ Use `__` to create variants of the same table:
117
+
118
+ ```python
119
+ pipeline = [
120
+     ('sales__initial', 'overwrite', {}),   # writes to 'sales' table
120
+     ('sales__incremental', 'append', {}),  # appends to 'sales' table
122
+ ]
123
+ ```
124
+
125
+ Both tasks write to the same `sales` table, but each uses its own SQL file (`sales__initial.sql` and `sales__incremental.sql`).
126
+
127
+ ## Query Data
128
+
129
+ ```python
130
+ # Run queries
131
+ lakehouse.sql("SELECT * FROM my_table LIMIT 10").show()
132
+
133
+ # Get as DataFrame
134
+ df = lakehouse.sql("SELECT COUNT(*) FROM sales").df()
135
+ ```
136
+
137
+ ## Real-World Example
138
+
139
+ ```python
140
+ import duckrun as dr
141
+
142
+ lakehouse = dr.connect(
143
+     workspace="Analytics",
143
+     lakehouse_name="Sales",
144
+     schema="dbo",
145
+     sql_folder="./etl"
147
+ )
148
+
149
+ # Daily pipeline
150
+ daily = [
151
+     ('download_files', (api_url, local_path)),
151
+     ('staging_orders', 'overwrite', {'run_date': '2024-06-01'}),
152
+     ('staging_customers', 'overwrite', {'run_date': '2024-06-01'}),
153
+     ('fact_sales', 'append'),
154
+     ('dim_customer', 'overwrite')
156
+ ]
157
+
158
+ lakehouse.run(daily)
159
+
160
+ # Check results
161
+ lakehouse.sql("SELECT COUNT(*) FROM fact_sales").show()
162
+ ```
163
+
164
+ ## Remote SQL Files
165
+
166
+ You can load SQL/Python files from a URL:
167
+
168
+ ```python
169
+ lakehouse = dr.connect(
170
+     workspace="Analytics",
170
+     lakehouse_name="Sales",
171
+     schema="dbo",
172
+     sql_folder="https://raw.githubusercontent.com/user/repo/main/sql"
174
+ )
175
+ ```
176
+
177
+ ## Real-Life Usage
178
+
179
+ For a complete, production-style example, see [fabric_demo](https://github.com/djouallah/fabric_demo).
180
+
181
+ ## License
182
+
183
+ MIT
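Review note on the README hunk above: it documents `$ws`, `$lh`, `$schema`, custom `$param` placeholders, and the `__` table-name convention, but `duckrun/core.py` is unchanged between the two versions, so the implementation does not appear in this diff. The following is only a minimal sketch of how that documented behaviour could be reproduced, assuming plain textual substitution with Python's `string.Template`. The helper names `target_table` and `render_sql` are hypothetical and are not part of duckrun's API.

```python
from string import Template


def target_table(task_name):
    """Hypothetical helper: 'sales__initial' -> 'sales' (text before the first '__')."""
    return task_name.split("__", 1)[0]


def render_sql(sql_text, builtins, params=None):
    """Hypothetical helper: fill $name placeholders in a SQL string.

    Custom params override built-ins of the same name. Template.substitute
    raises KeyError if the SQL references a placeholder that was not supplied.
    """
    values = {**builtins, **(params or {})}
    return Template(sql_text).substitute(values)


# Assumed built-in values, mirroring the README's $ws, $lh and $schema.
builtins = {"ws": "Analytics", "lh": "Sales", "schema": "dbo"}

sql = "SELECT * FROM transactions\nWHERE date BETWEEN '$start_date' AND '$end_date'"
rendered = render_sql(
    sql, builtins, {"start_date": "2024-01-01", "end_date": "2024-12-31"}
)
table = target_table("sales__incremental")
```

Textual substitution of this kind is not parameter binding: values are pasted into the SQL as-is, so a sketch like this is only appropriate when the parameters come from trusted pipeline definitions.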
@@ -0,0 +1,7 @@
1
+ duckrun/__init__.py,sha256=L0jRtD9Ld8Ti4e6GRvPDdHvkQCFAPHM43GSP7ARh6EM,241
2
+ duckrun/core.py,sha256=-Vf2nYwhdsVpTZS9mGBtm8j_HNAcHR7Cj075pida3Yw,13133
3
+ duckrun-0.1.1.dist-info/licenses/LICENSE,sha256=-DeQQwdbCbkB4507ZF3QbocysB-EIjDtaLexvqRkGZc,1083
4
+ duckrun-0.1.1.dist-info/METADATA,sha256=4KZAURlPgjIDGYW_htE4gdHLi6WX-3gpfrCY0r1sFPE,4114
5
+ duckrun-0.1.1.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
6
+ duckrun-0.1.1.dist-info/top_level.txt,sha256=BknMEwebbUHrVAp3SC92ps8MPhK7XSYsaogTvi_DmEU,8
7
+ duckrun-0.1.1.dist-info/RECORD,,
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2025 Mimoune
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -1,11 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: duckrun
3
- Version: 0.1.0
4
- Summary: Lakehouse task runner powered by DuckDB for Microsoft Fabric
5
- License-Expression: MIT
6
- Project-URL: Homepage, https://github.com/djouallah/duckrun
7
- Project-URL: Repository, https://github.com/djouallah/duckrun
8
- Project-URL: Issues, https://github.com/djouallah/duckrun/issues
9
- Requires-Python: >=3.9
10
- License-File: LICENSE
11
- Dynamic: license-file
@@ -1,7 +0,0 @@
1
- duckrun/__init__.py,sha256=L0jRtD9Ld8Ti4e6GRvPDdHvkQCFAPHM43GSP7ARh6EM,241
2
- duckrun/core.py,sha256=-Vf2nYwhdsVpTZS9mGBtm8j_HNAcHR7Cj075pida3Yw,13133
3
- duckrun-0.1.0.dist-info/licenses/LICENSE,sha256=b0pMNsWFx7PvXXtQo-XLqFnPRirAtdWBWwQp39phnWI,20
4
- duckrun-0.1.0.dist-info/METADATA,sha256=U6-bDm1EQx47ELIzUfFXYwcalUU9ZxfBsDleowie9L4,410
5
- duckrun-0.1.0.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
6
- duckrun-0.1.0.dist-info/top_level.txt,sha256=BknMEwebbUHrVAp3SC92ps8MPhK7XSYsaogTvi_DmEU,8
7
- duckrun-0.1.0.dist-info/RECORD,,
@@ -1 +0,0 @@
1
- ### **5. `LICENSE`**