bizon 0.1.2__py3-none-any.whl → 0.3.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (84)
  1. bizon/alerting/alerts.py +0 -1
  2. bizon/common/models.py +182 -4
  3. bizon/connectors/destinations/bigquery/config/bigquery_incremental.example.yml +34 -0
  4. bizon/connectors/destinations/bigquery/src/config.py +0 -1
  5. bizon/connectors/destinations/bigquery/src/destination.py +16 -9
  6. bizon/connectors/destinations/bigquery_streaming/config/bigquery_streaming.example.yml +74 -0
  7. bizon/connectors/destinations/bigquery_streaming/src/destination.py +5 -43
  8. bizon/connectors/destinations/bigquery_streaming_v2/config/bigquery_streaming_v2.example.yml +79 -0
  9. bizon/connectors/destinations/bigquery_streaming_v2/src/destination.py +54 -50
  10. bizon/connectors/destinations/file/config/file.example.yml +40 -0
  11. bizon/connectors/destinations/file/config/file_incremental.example.yml +22 -0
  12. bizon/connectors/destinations/file/src/config.py +1 -1
  13. bizon/connectors/destinations/file/src/destination.py +59 -7
  14. bizon/connectors/destinations/logger/config/logger.example.yml +30 -0
  15. bizon/connectors/destinations/logger/config/logger_incremental.example.yml +21 -0
  16. bizon/connectors/destinations/logger/src/config.py +0 -2
  17. bizon/connectors/destinations/logger/src/destination.py +14 -3
  18. bizon/connectors/sources/cycle/src/source.py +2 -6
  19. bizon/connectors/sources/dummy/src/source.py +0 -4
  20. bizon/connectors/sources/gsheets/config/service_account_incremental.example.yml +51 -0
  21. bizon/connectors/sources/gsheets/src/source.py +2 -3
  22. bizon/connectors/sources/hubspot/config/api_key_incremental.example.yml +40 -0
  23. bizon/connectors/sources/hubspot/src/hubspot_base.py +0 -1
  24. bizon/connectors/sources/hubspot/src/hubspot_objects.py +3 -4
  25. bizon/connectors/sources/hubspot/src/models/hs_object.py +0 -1
  26. bizon/connectors/sources/kafka/config/kafka_streams.example.yml +124 -0
  27. bizon/connectors/sources/kafka/src/config.py +10 -6
  28. bizon/connectors/sources/kafka/src/decode.py +2 -2
  29. bizon/connectors/sources/kafka/src/source.py +147 -46
  30. bizon/connectors/sources/notion/config/api_key.example.yml +35 -0
  31. bizon/connectors/sources/notion/config/api_key_incremental.example.yml +48 -0
  32. bizon/connectors/sources/notion/src/__init__.py +0 -0
  33. bizon/connectors/sources/notion/src/config.py +59 -0
  34. bizon/connectors/sources/notion/src/source.py +1501 -0
  35. bizon/connectors/sources/notion/tests/notion_pipeline.py +7 -0
  36. bizon/connectors/sources/notion/tests/test_notion.py +113 -0
  37. bizon/connectors/sources/periscope/src/source.py +0 -6
  38. bizon/connectors/sources/pokeapi/src/source.py +0 -1
  39. bizon/connectors/sources/sana_ai/config/sana.example.yml +25 -0
  40. bizon/connectors/sources/sana_ai/src/source.py +85 -0
  41. bizon/destination/buffer.py +0 -1
  42. bizon/destination/config.py +0 -1
  43. bizon/destination/destination.py +1 -4
  44. bizon/engine/backend/adapters/sqlalchemy/backend.py +2 -5
  45. bizon/engine/backend/adapters/sqlalchemy/config.py +0 -1
  46. bizon/engine/config.py +0 -1
  47. bizon/engine/engine.py +0 -1
  48. bizon/engine/pipeline/consumer.py +0 -1
  49. bizon/engine/pipeline/producer.py +43 -6
  50. bizon/engine/queue/adapters/kafka/config.py +1 -1
  51. bizon/engine/queue/adapters/kafka/queue.py +0 -1
  52. bizon/engine/queue/adapters/python_queue/consumer.py +0 -1
  53. bizon/engine/queue/adapters/python_queue/queue.py +0 -2
  54. bizon/engine/queue/adapters/rabbitmq/consumer.py +0 -1
  55. bizon/engine/queue/adapters/rabbitmq/queue.py +0 -1
  56. bizon/engine/queue/config.py +0 -2
  57. bizon/engine/runner/adapters/process.py +0 -2
  58. bizon/engine/runner/adapters/streaming.py +55 -1
  59. bizon/engine/runner/adapters/thread.py +0 -2
  60. bizon/engine/runner/config.py +0 -1
  61. bizon/engine/runner/runner.py +0 -2
  62. bizon/monitoring/datadog/monitor.py +5 -3
  63. bizon/monitoring/noop/monitor.py +1 -1
  64. bizon/source/auth/authenticators/abstract_oauth.py +11 -3
  65. bizon/source/auth/authenticators/abstract_token.py +2 -1
  66. bizon/source/auth/authenticators/basic.py +1 -1
  67. bizon/source/auth/authenticators/cookies.py +2 -1
  68. bizon/source/auth/authenticators/oauth.py +8 -3
  69. bizon/source/config.py +6 -2
  70. bizon/source/cursor.py +8 -16
  71. bizon/source/discover.py +3 -6
  72. bizon/source/models.py +2 -2
  73. bizon/source/session.py +0 -1
  74. bizon/source/source.py +17 -2
  75. bizon/transform/config.py +0 -2
  76. bizon/transform/transform.py +0 -3
  77. bizon-0.3.0.dist-info/METADATA +323 -0
  78. bizon-0.3.0.dist-info/RECORD +142 -0
  79. {bizon-0.1.2.dist-info → bizon-0.3.0.dist-info}/WHEEL +1 -1
  80. bizon-0.3.0.dist-info/entry_points.txt +2 -0
  81. bizon-0.1.2.dist-info/METADATA +0 -179
  82. bizon-0.1.2.dist-info/RECORD +0 -123
  83. bizon-0.1.2.dist-info/entry_points.txt +0 -3
  84. {bizon-0.1.2.dist-info → bizon-0.3.0.dist-info/licenses}/LICENSE +0 -0
@@ -0,0 +1,323 @@
+ Metadata-Version: 2.4
+ Name: bizon
+ Version: 0.3.0
+ Summary: Extract and load your data reliably from API Clients with native fault-tolerant and checkpointing mechanism.
+ Author-email: Antoine Balliet <antoine.balliet@gmail.com>, Anas El Mhamdi <anas.elmhamdi@gmail.com>
+ License-File: LICENSE
+ Requires-Python: <3.13,>=3.9
+ Requires-Dist: backoff>=2.2.1
+ Requires-Dist: click>=8.1.7
+ Requires-Dist: dpath>=2.2.0
+ Requires-Dist: google-cloud-storage>=2.17.0
+ Requires-Dist: loguru>=0.7.2
+ Requires-Dist: orjson>=3.10.16
+ Requires-Dist: pendulum>=3.0.0
+ Requires-Dist: polars>=1.16.0
+ Requires-Dist: pyarrow>=16.1.0
+ Requires-Dist: pydantic-extra-types>=2.9.0
+ Requires-Dist: pydantic>=2.8.2
+ Requires-Dist: python-dotenv>=1.0.1
+ Requires-Dist: pytz>=2024.2
+ Requires-Dist: pyyaml>=6.0.1
+ Requires-Dist: requests>=2.28.2
+ Requires-Dist: simplejson>=3.20.1
+ Requires-Dist: sqlalchemy>=2.0.32
+ Requires-Dist: tenacity>=9.0.0
+ Provides-Extra: bigquery
+ Requires-Dist: google-cloud-bigquery-storage>=2.25.0; extra == 'bigquery'
+ Requires-Dist: google-cloud-bigquery>=3.25.0; extra == 'bigquery'
+ Requires-Dist: protobuf>=4.24.0; extra == 'bigquery'
+ Requires-Dist: sqlalchemy-bigquery>=1.11.0; extra == 'bigquery'
+ Provides-Extra: datadog
+ Requires-Dist: datadog>=0.50.2; extra == 'datadog'
+ Requires-Dist: ddtrace>=3.10.0; extra == 'datadog'
+ Provides-Extra: gsheets
+ Requires-Dist: gspread>=6.1.2; extra == 'gsheets'
+ Provides-Extra: kafka
+ Requires-Dist: avro>=1.12.0; extra == 'kafka'
+ Requires-Dist: confluent-kafka>=2.6.0; extra == 'kafka'
+ Requires-Dist: fastavro>=1.9.7; extra == 'kafka'
+ Requires-Dist: kafka-python>=2.0.2; extra == 'kafka'
+ Provides-Extra: postgres
+ Requires-Dist: psycopg2-binary>=2.9.9; extra == 'postgres'
+ Provides-Extra: rabbitmq
+ Requires-Dist: pika>=1.3.2; extra == 'rabbitmq'
+ Description-Content-Type: text/markdown
+
+ # bizon ⚡️
+ Extract and load your largest data streams with a framework you can trust for billions of records.
+
+ ## Features
+ - **Natively fault-tolerant**: Bizon uses a checkpointing mechanism to track progress and recover from the last checkpoint.
+
+ - **High throughput**: Bizon is designed to handle high throughput and can process billions of records.
+
+ - **Queue-system agnostic**: Bizon is agnostic of the queuing system: you can use Python Queue, RabbitMQ, Kafka, or Redpanda. Thanks to the `bizon.engine.queue.Queue` interface, adapters can be written for any queuing system.
+
+ - **Pipeline metrics**: Bizon provides exhaustive pipeline metrics and implements Datadog & OpenTelemetry tracing. You can monitor:
+   - ETAs for completion
+   - Number of records processed
+   - Completion percentage
+   - Source <> Destination latency
+
+ - **Lightweight & lean**: Bizon is lightweight, with a minimal codebase and few dependencies:
+   - `requests` for HTTP requests
+   - `pyyaml` for configuration
+   - `sqlalchemy` for database / warehouse connections
+   - `polars` for memory-efficient data buffering and vectorized processing
+   - `pyarrow` for the Parquet file format
+
+ ## Installation
+
+ ### For Users
+ ```bash
+ pip install bizon
+
+ # With optional dependencies
+ pip install bizon[postgres]  # PostgreSQL backend
+ pip install bizon[kafka]     # Kafka queue
+ pip install bizon[bigquery]  # BigQuery backend/destination
+ pip install bizon[rabbitmq]  # RabbitMQ queue
+ ```
+
+ ### For Development
+ ```bash
+ # Install uv (if not already installed)
+ pip install uv
+
+ # Clone and install
+ git clone https://github.com/bizon-data/bizon-core.git
+ cd bizon-core
+ uv sync --all-extras --all-groups
+
+ # Run tests
+ uv run pytest tests/
+
+ # Format code
+ uv run ruff format .
+ uv run ruff check --fix .
+ ```
+
+ ## Usage
+
+ ### List available sources and streams
+ ```bash
+ bizon source list
+ bizon stream list <source_name>
+ ```
+
+ ### Create a pipeline
+
+ Create a file named `config.yml` in your working directory with the following content:
+
+ ```yaml
+ name: demo-creatures-pipeline
+
+ source:
+   name: dummy
+   stream: creatures
+   authentication:
+     type: api_key
+     params:
+       token: dummy_key
+
+ destination:
+   name: logger
+   config:
+     dummy: dummy
+ ```
+
+ Run the pipeline with the following command:
+
+ ```bash
+ bizon run config.yml
+ ```
+ ## Backend configuration
+
+ Backend is the interface used by Bizon to store its state. It can be configured in the `backend` section of the configuration file. The following backends are supported:
+ - `sqlite`: In-memory SQLite database, useful for testing and development.
+ - `bigquery`: Google BigQuery backend, well suited to light setups & production.
+ - `postgres`: PostgreSQL backend, for production use and frequent cursor updates.
+
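The backend choice above is wired into the `engine` section. The sketch below mirrors the `engine.backend` block from the Notion example later in this README, swapping in `sqlite` for a zero-dependency local run; the `database` and `schema` values are illustrative assumptions, not canonical defaults:

```yaml
engine:
  backend:
    type: sqlite
    database: bizon.db        # illustrative local database name
    schema: bizon_backend     # illustrative schema name
```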
+ ## Queue configuration
+
+ Queue is the interface used by Bizon to exchange data between `Source` and `Destination`. It can be configured in the `queue` section of the configuration file. The following queues are supported:
+ - `python_queue`: Python Queue, useful for testing and development.
+ - `rabbitmq`: RabbitMQ, for production use and high throughput.
+ - `kafka`: Apache Kafka, for production use, high throughput, and strong persistence.
+
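Following the same `engine.queue` shape used in the Kafka and RabbitMQ examples at the end of this README, a minimal sketch for the default in-process queue might look like this (whether `python_queue` requires any `config` keys is an assumption):

```yaml
engine:
  queue:
    type: python_queue
```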
+ ## Runner configuration
+
+ Runner is the interface used by Bizon to run the pipeline. It can be configured in the `runner` section of the configuration file. The following runners are supported:
+ - `thread` (asynchronous)
+ - `process` (asynchronous)
+ - `stream` (synchronous)
+
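By analogy with the `backend` and `queue` blocks, the runner would be selected under `engine` as well; this is a hedged sketch, since the exact `engine.runner` keys are not shown in this README:

```yaml
engine:
  runner:
    type: thread   # or: process, stream
```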
+ ## Sync Modes
+
+ Bizon supports three sync modes:
+ - `full_refresh`: Re-syncs all data from scratch on each run
+ - `incremental`: Syncs only new/updated data since the last successful run
+ - `stream`: Continuous streaming mode for real-time data (e.g., Kafka)
+
+ ### Incremental Sync
+
+ Incremental sync fetches only new or updated records since the last successful run, using an **append-only** strategy.
+
+ #### Configuration
+
+ ```yaml
+ source:
+   name: your_source
+   stream: your_stream
+   sync_mode: incremental
+   cursor_field: updated_at  # The timestamp field to filter records by
+ ```
+
+ #### How It Works
+
+ ```
+ ┌─────────────────────────────────────────────────────────────────────┐
+ │ INCREMENTAL SYNC FLOW                                               │
+ ├─────────────────────────────────────────────────────────────────────┤
+ │                                                                     │
+ │ 1. Producer checks for last successful job                          │
+ │    └─> Backend.get_last_successful_stream_job()                     │
+ │                                                                     │
+ │ 2. If found, creates SourceIncrementalState:                        │
+ │    └─> last_run = previous_job.created_at                           │
+ │    └─> cursor_field = config.cursor_field (e.g., "updated_at")      │
+ │                                                                     │
+ │ 3. Calls source.get_records_after(source_state, pagination)         │
+ │    └─> Source filters: WHERE cursor_field > last_run                │
+ │                                                                     │
+ │ 4. Records written to temp table: {table}_incremental               │
+ │                                                                     │
+ │ 5. finalize() appends temp table to main table                      │
+ │    └─> INSERT INTO main_table SELECT * FROM temp_table              │
+ │    └─> Deletes temp table                                           │
+ │                                                                     │
+ │ FIRST RUN: No previous job → falls back to get() (full refresh)     │
+ │                                                                     │
+ └─────────────────────────────────────────────────────────────────────┘
+ ```
+
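Step 3 of the flow above can be sketched in plain Python. The `SourceIncrementalState` and `get_records_after` names come from the diagram; the in-memory record list is purely illustrative:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class SourceIncrementalState:
    last_run: datetime   # created_at of the previous successful job
    cursor_field: str    # e.g. "updated_at"

# Illustrative stand-in for a source's remote data
RECORDS = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 6, 1, tzinfo=timezone.utc)},
]

def get_records_after(state: SourceIncrementalState) -> list:
    # Equivalent of: WHERE cursor_field > last_run
    return [r for r in RECORDS if r[state.cursor_field] > state.last_run]

state = SourceIncrementalState(
    last_run=datetime(2024, 3, 1, tzinfo=timezone.utc),
    cursor_field="updated_at",
)
print([r["id"] for r in get_records_after(state)])  # only records edited after last_run
```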
+ #### Configuration Options
+
+ | Option | Required | Description | Example |
+ |--------|----------|-------------|---------|
+ | `sync_mode` | Yes | Set to `incremental` | `incremental` |
+ | `cursor_field` | Yes | Timestamp field to filter by | `updated_at`, `last_edited_time`, `modified_at` |
+
+ #### Supported Sources
+
+ Sources must implement `get_records_after()` to support incremental sync:
+
+ | Source | Cursor Field | Notes |
+ |--------|--------------|-------|
+ | `notion` | `last_edited_time` | Supports `pages`, `databases`, `blocks`, `blocks_markdown` streams |
+ | (others) | Varies | Check source docs or implement `get_records_after()` |
+
+ #### Supported Destinations
+
+ Destinations must implement `finalize()` with incremental logic:
+
+ | Destination | Support | Notes |
+ |-------------|---------|-------|
+ | `bigquery` | ✅ | Append-only via temp table |
+ | `bigquery_streaming_v2` | ✅ | Append-only via temp table |
+ | `file` | ✅ | Appends to existing file |
+ | `logger` | ✅ | Logs completion |
+
+ #### Example: Notion Incremental Sync
+
+ ```yaml
+ name: notion_incremental_sync
+
+ source:
+   name: notion
+   stream: blocks_markdown
+   sync_mode: incremental
+   cursor_field: last_edited_time
+   authentication:
+     type: api_key
+     params:
+       token: secret_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
+
+   database_ids:
+     - "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
+
+   # Optional: filter which pages to sync
+   database_filters:
+     "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx":
+       property: "Status"
+       select:
+         equals: "Published"
+
+ destination:
+   name: bigquery
+   config:
+     project_id: my-gcp-project
+     dataset_id: notion_data
+     dataset_location: US
+
+ engine:
+   backend:
+     type: bigquery
+     database: my-gcp-project
+     schema: bizon_backend
+     syncCursorInDBEvery: 2
+ ```
+
+ #### First Run Behavior
+
+ On the first incremental run (no previous successful job):
+ - Falls back to `get()` method (full refresh behavior)
+ - All data is fetched and loaded
+ - Job is marked as successful
+ - Subsequent runs use `get_records_after()` with `last_run` timestamp
+
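The first-run fallback reduces to a single branch in the producer. The stubs below are schematic (not Bizon's real classes); only the decision logic mirrors the behavior described above:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Job:
    created_at: datetime  # stand-in for the backend's job record

class DemoSource:
    """Illustrative source: real sources implement get() and get_records_after()."""
    def get(self):
        return "full refresh"
    def get_records_after(self, last_run, cursor_field):
        return f"incremental: {cursor_field} > {last_run:%Y-%m-%d}"

def fetch(source, last_job: Optional[Job], cursor_field: str):
    # No previous successful job -> first run -> full refresh via get()
    if last_job is None:
        return source.get()
    # Otherwise filter on the cursor field since the last job's created_at
    return source.get_records_after(last_job.created_at, cursor_field)

print(fetch(DemoSource(), None, "updated_at"))
print(fetch(DemoSource(), Job(datetime(2024, 3, 1)), "updated_at"))
```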
+ ## Start syncing your data 🚀
+
+ ### Quick setup without any dependencies ✌️
+
+ Queue configuration can be set to `python_queue` and backend configuration to `sqlite`.
+ This will allow you to test the pipeline without any external dependencies.
+
+
+ ### Local Kafka setup
+
+ To test the pipeline with Kafka, you can use `docker compose` to set up Kafka or Redpanda locally.
+
+ **Kafka**
+ ```bash
+ docker compose --file ./scripts/kafka-compose.yml up     # Kafka
+ docker compose --file ./scripts/redpanda-compose.yml up  # Redpanda
+ ```
+
+ In your YAML configuration, set the `queue` configuration to Kafka under `engine`:
+ ```yaml
+ engine:
+   queue:
+     type: kafka
+     config:
+       queue:
+         bootstrap_server: localhost:9092  # Kafka: 9092, Redpanda: 19092
+ ```
+
+ **RabbitMQ**
+ ```bash
+ docker compose --file ./scripts/rabbitmq-compose.yml up
+ ```
+
+ In your YAML configuration, set the `queue` configuration to RabbitMQ under `engine`:
+
+ ```yaml
+ engine:
+   queue:
+     type: rabbitmq
+     config:
+       queue:
+         host: localhost
+         queue_name: bizon
+ ```
@@ -0,0 +1,142 @@
+ bizon/__main__.py,sha256=6GV4zEg2wC8UBnmESrr71ZmpWo4cNWlrYE3PQuKwFHA,69
+ bizon/utils.py,sha256=HXaPiyxpWKoy3XN5vSYOve1ezlFeOYin3aFqTjcabUQ,81
+ bizon/alerting/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/alerting/alerts.py,sha256=i3c5Y7WcyzPHqilkP8CrujwXVd058Lb0W7XsjB1Ef7w,675
+ bizon/alerting/models.py,sha256=kWTeoT7dDC6UrkybU1sRAIXNAE0Wwipf8W7dEEhMZM0,553
+ bizon/alerting/slack/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/alerting/slack/config.py,sha256=D_CAtzXu-O41OVoyrlYRlBxnGN9rNbnohYQ9CmBYt_E,84
+ bizon/alerting/slack/handler.py,sha256=0m6IUSkxqDMlpDWslkImQX74ScD8wIx3YtrhfpYNGUA,1620
+ bizon/cli/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/cli/main.py,sha256=pQnphPLznllBVzifyIrn3es0U9E1VGSzeEv_COzq9FI,3364
+ bizon/cli/utils.py,sha256=aZ47YjFfifHkW95bAVzWfEQD3ZnxGSMT32bkRLmc5-c,953
+ bizon/common/models.py,sha256=eL_Ii0CkeJFIjak1CKrB74mbC3OkmWP2uI27ynlYgkQ,10070
+ bizon/common/errors/backoff.py,sha256=z7RkQt1Npdh0sfD3hBDaiWQKe4iqS6ewvT1Q4Fds5aU,508
+ bizon/common/errors/errors.py,sha256=mrYx1uE2kOuR2pEaB7ztK1l2m0E4V-_-hxq-DuILerY,682
+ bizon/connectors/destinations/bigquery/config/bigquery.example.yml,sha256=sy5-Piew00BlcjX5CFayFVrUq9G_vFYWXDmpWi9beTY,1263
+ bizon/connectors/destinations/bigquery/config/bigquery_incremental.example.yml,sha256=z0pz4W1x0dlsoAjorYR2DxMjkzTvIWn9tigqtOR8PUY,1076
+ bizon/connectors/destinations/bigquery/src/config.py,sha256=q55zR_9V5-ZZmOmSK7fDOHSzzYhoT-fwlppDzX4he9U,4000
+ bizon/connectors/destinations/bigquery/src/destination.py,sha256=awS3dZsSKqLTVnhBKuP_9rXSt3IpGv3c4WjZOCwqu9o,9888
+ bizon/connectors/destinations/bigquery_streaming/config/bigquery_streaming.example.yml,sha256=rF0mQ5IaOe6oqsbVy6q0innn7SXsOoBdBvIN8BTwPVc,1869
+ bizon/connectors/destinations/bigquery_streaming/src/config.py,sha256=LdBKEqHPaGll8PW6c6q_lH7PJvsGdtv2BCrtB-TukTA,1898
+ bizon/connectors/destinations/bigquery_streaming/src/destination.py,sha256=Uyne57NoT-z9uk7Yi4EgOUFYQ4QlvXDLFxgZC5KyCFE,14222
+ bizon/connectors/destinations/bigquery_streaming_v2/config/bigquery_streaming_v2.example.yml,sha256=hIQXlXtiBT8DgMVAs0x_h-19xoLkjHr-Ko7oSn8jnc0,2023
+ bizon/connectors/destinations/bigquery_streaming_v2/src/config.py,sha256=cdHST5Vx1VQbLsIVsPkoEtOJKmbA35XjsKzj6fZ5DHw,1907
+ bizon/connectors/destinations/bigquery_streaming_v2/src/destination.py,sha256=5aXEsbzyWKzS2F1pFMZ8pdbJaXmdGTaIrwgl2cd1IbU,19026
+ bizon/connectors/destinations/bigquery_streaming_v2/src/proto_utils.py,sha256=aWYVzMPMTgsdDapYniu8h6Tf2Pty4fDisT_33d9yEJ4,3692
+ bizon/connectors/destinations/file/config/file.example.yml,sha256=sMeX92hTrTQUrLmQgQFsq5OdG5Dk3BbpDo0NhRbBahI,986
+ bizon/connectors/destinations/file/config/file_incremental.example.yml,sha256=Xh5KwWiQRuq_MnMgOCHiHqIwHjOjXbwQlVlVcKdXARA,620
+ bizon/connectors/destinations/file/src/config.py,sha256=dU64aFe7J63aBGh6Os8mXl2kvECj3s4pPC7H3EmOvb8,585
+ bizon/connectors/destinations/file/src/destination.py,sha256=RQEL0Z5l409S319fAJyvW8cDblUCVAxPhALJVhjQKDM,4253
+ bizon/connectors/destinations/logger/config/logger.example.yml,sha256=KtQRmqqFeziJtBZ7vzrXGQLdTgWZNjxx2sdFXpIgIp4,672
+ bizon/connectors/destinations/logger/config/logger_incremental.example.yml,sha256=rwTLlXib-Jo3b4-_NcFv2ShdPC73WEpiiX3apP3sKg0,541
+ bizon/connectors/destinations/logger/src/config.py,sha256=vIV_G0k9c8DPcDxU6CGvEOL2zAEvAmKZcx3RV0eRi7A,426
+ bizon/connectors/destinations/logger/src/destination.py,sha256=YUC_lAN5nrcrNAN90hnalKFAKX49KTDlJwdLfwTaC0U,2007
+ bizon/connectors/sources/cycle/config/cycle.example.yml,sha256=UDiqOa-8ZsykmNT625kxq9tyXOj_gKe9CFwg9r_8SYk,230
+ bizon/connectors/sources/cycle/src/source.py,sha256=6sXMneq59XZAT5oJseM9k6sGJaoQw4NDp8FTtg8lPhk,4213
+ bizon/connectors/sources/cycle/tests/cycle_customers.py,sha256=A48S20LxIC0A74JLoFn4NTHNTgBWV_5stTFtF1Gfk2c,271
+ bizon/connectors/sources/dummy/config/dummy.example.yml,sha256=Wvn8v644u1aKvRRsPzPPML4kYmWk0ZhMGrCPFoSbVZQ,331
+ bizon/connectors/sources/dummy/src/fake_api.py,sha256=5EEETp9INOmDJ57FNXm6wnsRtofQmR1xYp8hUKjn0Es,3067
+ bizon/connectors/sources/dummy/src/source.py,sha256=IAfNdd2CX8QzprAwLLVDGT3dd5a-bFa5BkkuRUTWGTc,4078
+ bizon/connectors/sources/dummy/tests/dummy_pipeline.py,sha256=V9EKvugFXm3aNQVip-kMzlmvjlW6sZNGldxW1WdRa-E,397
+ bizon/connectors/sources/dummy/tests/dummy_pipeline_bigquery_backend.py,sha256=SO_x_IH9iY1e09NHJlS1-CON0SI21fxGFqljSz4EmM4,542
+ bizon/connectors/sources/dummy/tests/dummy_pipeline_kafka.py,sha256=zKhE7dhd6C4E_7VMepR8nSsGnesTJWXwlak-Dj01GD0,473
+ bizon/connectors/sources/dummy/tests/dummy_pipeline_rabbitmq.py,sha256=A6KmSEOlWMK0UuIZhZxWO8y-z8h19rtWEV8DuhKX8Xs,686
+ bizon/connectors/sources/dummy/tests/dummy_pipeline_unnest.py,sha256=64ZhjXUyZHzaIZTL_DO425yN6SlpB1vxzfhziE_pSEw,510
+ bizon/connectors/sources/dummy/tests/dummy_pipeline_write_data_bigquery.py,sha256=oTV9HmOzjs3B2WwTPzDlIT_VkoYuXxy1BtSkeTGkQqw,686
+ bizon/connectors/sources/dummy/tests/dummy_pipeline_write_data_bigquery_through_kafka.py,sha256=PFUhDuFw1Q1AMNMsnXPQxoqHIWf_wHEL1hLQodYlLcQ,596
+ bizon/connectors/sources/gsheets/config/default_auth.example.yml,sha256=KOBp6MfO4uJwpwEYW0tJ4X5ctVwwdur9poJB4Ohba6s,348
+ bizon/connectors/sources/gsheets/config/service_account.example.yml,sha256=XxVUnk9gGWc3lDb8CnzTHjTu8xz4Asyr5tXzY6qLvPg,1081
+ bizon/connectors/sources/gsheets/config/service_account_incremental.example.yml,sha256=WGvAtw4aOwSMWrSZW0tHaRncZnGbI6gd4LJk1aHIP_c,1765
+ bizon/connectors/sources/gsheets/src/source.py,sha256=xNF5FR9QLTM4kCiZ2eKZ5CZWNhLw6tyLaJZbliNzYnY,5675
+ bizon/connectors/sources/gsheets/tests/gsheets_pipeline.py,sha256=lNSM3kZTd4W_-ajGIO3mdp8qGdEbnmWqsMm5pRiS0cw,181
+ bizon/connectors/sources/hubspot/config/api_key.example.yml,sha256=VDTRloE5caqAdGdXgvsJZ6nQT46JHzX_YboxeGbpP18,389
+ bizon/connectors/sources/hubspot/config/api_key_incremental.example.yml,sha256=g4SBeVEXSr3tCgy5VjgZPWkhnuvEZ0jl5nPNn3u05Jc,920
+ bizon/connectors/sources/hubspot/config/oauth.example.yml,sha256=YqBtj1IxIsdM9E85_4eVWl6mPiHsQNoQn41EzCqORy0,499
+ bizon/connectors/sources/hubspot/src/hubspot_base.py,sha256=THo8ImrPrIxeTuFcBMRJYwaDMstIfLIGjrQLE2cqqsU,3424
+ bizon/connectors/sources/hubspot/src/hubspot_objects.py,sha256=ykqvxaFihv0e0A3-gGDmentp1KCGCoYvvDwZ3CcHzNg,6301
+ bizon/connectors/sources/hubspot/src/models/hs_object.py,sha256=IuuMB_54kiEAZRDvl3AOVdktyZsLeYDRB2qoOVd1TZg,1234
+ bizon/connectors/sources/hubspot/tests/hubspot_pipeline.py,sha256=e6dCF5_MHMySkeiF6kKrSAuCa_48J22-ZeSCZSjrfUI,216
+ bizon/connectors/sources/kafka/config/kafka.example.yml,sha256=taIj3QUL3jynQCpO-YlDtt6nGvQp8hOzGkS0_RJFUQU,933
+ bizon/connectors/sources/kafka/config/kafka_debezium.example.yml,sha256=lqFbNbSAFRh1Q8a9xQ_7nhhvVsyVIZM4amGEvCqHXQE,2254
+ bizon/connectors/sources/kafka/config/kafka_streams.example.yml,sha256=aGiGMDMG6l0RUhrSVYl9djw1yy6T--2QOfwesuNhF1g,3368
+ bizon/connectors/sources/kafka/src/callback.py,sha256=NgP9PLquHbVagz6E9VJK5Vx-kK8K2l80MhoeenbhOXY,484
+ bizon/connectors/sources/kafka/src/config.py,sha256=Plb-d59PmQSie7vxAIPpJrwbpbA7pT6Xua0joR86nKI,2764
+ bizon/connectors/sources/kafka/src/decode.py,sha256=RhPjazRQHb72D9iBhb763Nje7SH9t_6EKaFW-BtGVpM,2800
+ bizon/connectors/sources/kafka/src/source.py,sha256=0Hv6viyVZGAd4azhQnqCteyHuwsbbDL4rSGEjMCff9E,19722
+ bizon/connectors/sources/kafka/tests/kafka_pipeline.py,sha256=9LaCqXJIEx2ye3dkWq0YK_bPX7d4fCX_OcDOJCk34WE,206
+ bizon/connectors/sources/notion/config/api_key.example.yml,sha256=TagqOqaho4u_G5ZP4L8je89Y4G_NvCo8s4Wf9e8yVH8,1061
+ bizon/connectors/sources/notion/config/api_key_incremental.example.yml,sha256=52uQJo-SrqFny00zIVbA86qVq3asYHMFALqBcdmPmc8,1499
+ bizon/connectors/sources/notion/src/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/connectors/sources/notion/src/config.py,sha256=L-FZWijUa-aWK9VenWGsl6mv40i4ww46FacjYoX9gXo,1886
+ bizon/connectors/sources/notion/src/source.py,sha256=aViwfLuBzsNGZHwU4-z-xI40cROJTvx7Tlkw3ApF3q8,66217
+ bizon/connectors/sources/notion/tests/notion_pipeline.py,sha256=lyiD9b5uUF3oih8vY4gk7QXnfySGSawnbrBuSdTLym8,200
+ bizon/connectors/sources/notion/tests/test_notion.py,sha256=-G0DbTLDS2Gc_Bx8xR2VXnY89vW64s1-puwPc9x2N7A,4029
+ bizon/connectors/sources/periscope/config/periscope_charts.example.yml,sha256=9OgFDB7vguiNz2F2fmRqDNV8S_ddO9ncN5hgW9MhME4,350
+ bizon/connectors/sources/periscope/config/periscope_dashboards.example.yml,sha256=mQey_04cfCzNpm-tYICDRUmPOFTZGSaTYI2OpK604d0,364
+ bizon/connectors/sources/periscope/src/source.py,sha256=WePeQLn5hkPArhzgJ5gBCRu6VXbTpvOImUk7NN4mi0c,12270
+ bizon/connectors/sources/periscope/tests/periscope_pipeline_charts.py,sha256=mU0JtfhS1KmWsS3iovGhGxK7iPVWiYzjBM_QfRL3ZQI,275
+ bizon/connectors/sources/periscope/tests/periscope_pipeline_dashboard.py,sha256=VmkixgCMLTqNPXYXPYqcK7ncHeeR96h68IXPcuC70iA,290
+ bizon/connectors/sources/pokeapi/config/pokeapi_pokemon_to_json.example.yml,sha256=H3FpBsyft5qSvuyurh-hhaASpq7GnehNeVAOOn31Txk,283
+ bizon/connectors/sources/pokeapi/config/pokeapi_pokemon_to_logger.example.yml,sha256=wuJUob6QF1jwm5pnfNcFfXeO01HCcu-TEoZCeuYl_lo,134
+ bizon/connectors/sources/pokeapi/src/source.py,sha256=OeDUXKciu7p5tBJ2LeA6oseMd-gqqpxk50g7DXVaODA,2537
+ bizon/connectors/sources/sana_ai/config/sana.example.yml,sha256=sWOGxKMG9Cjgn11rNDCjcBL_YZdnUYvBLU7P9zfS3DE,802
+ bizon/connectors/sources/sana_ai/src/source.py,sha256=ZAfv_OwkPXuwJF7KFe6eYQ0rmadTdot20X3L_gmTrs8,3212
+ bizon/destination/buffer.py,sha256=tCwlm0iGo0RM5EeTnOiKI5Pz_JfHsJv6U3oIA9Va8_0,3155
+ bizon/destination/config.py,sha256=dv_zHrFUs4X4paCglaKdSbBLfZRzfaNEQKUSVv0j8tU,3033
+ bizon/destination/destination.py,sha256=qkEhnfCxI8JmCf43fyyFVpYuGiY6oc3pkVl5o2RaQeI,14614
+ bizon/destination/models.py,sha256=_LUnkbbD_9XauYrNTthh9VmbYwWsVgPHF90FX6vmSjg,1278
+ bizon/engine/config.py,sha256=tqc3K2i99BH_SSRRzfRpzSXoqux929Q6eg2oRsfjjcY,2038
+ bizon/engine/engine.py,sha256=Xl8Bghx0kMlvST8S3uwxrcZ6cvqy525Kv94BOvAvr8U,1705
+ bizon/engine/backend/backend.py,sha256=4J6uLKVh0hIPkOYO5XEg4ia8mPlOS13C3hSrIJ171VA,5849
+ bizon/engine/backend/config.py,sha256=GhRzPWRGMaO-UJJRXkaqN_nlkFOCW6UOovwZLHLXrA8,900
+ bizon/engine/backend/models.py,sha256=ECImDNji7u9eSkkjiw8sYq80l17vDF7MplTFvCpMgqA,5101
+ bizon/engine/backend/adapters/sqlalchemy/backend.py,sha256=ipJ7eY_iiqjrvtq4NS39C5lH8VShMjXDAaApgTHJtpY,15435
+ bizon/engine/backend/adapters/sqlalchemy/config.py,sha256=CeTWncVK27Y6lEKMVCF5RxD8Illhx2IQqqFkGrf0WKA,1845
+ bizon/engine/pipeline/consumer.py,sha256=DtCR3mG791h35poYJdXjL9geNO-GWPKl_YC0zPsF5qI,3207
+ bizon/engine/pipeline/models.py,sha256=qOra2MJGN6-PuouKpKuZRjutnQmzom0mgWDFZ16LcM8,405
+ bizon/engine/pipeline/producer.py,sha256=XV2fR6CNMRlbYwqTl9mlqy6nkG37ODyh2aiiTZ371VM,11995
+ bizon/engine/queue/config.py,sha256=0XwiQSB2OKTs-rODCSZqT5txNZzGOic2-PvODbcSrGg,1267
+ bizon/engine/queue/queue.py,sha256=Y9uj31d-ZgW2f0F02iccp_o-m-RoMm_jR61NkLdMQ2M,3461
+ bizon/engine/queue/adapters/kafka/config.py,sha256=ndNEXRT-nIgyWgoqlNXFhmlN206v87GobXIW9Z0zrSA,1085
+ bizon/engine/queue/adapters/kafka/consumer.py,sha256=JPpvp3u5NXGvKRo60-ihRI4DlZxGJB1qH5Qw91XT8qc,2576
+ bizon/engine/queue/adapters/kafka/queue.py,sha256=4baF7ns7LH3rQOjLjLSw2P7Ey4omRaVrxs_0atYyeSo,2000
+ bizon/engine/queue/adapters/python_queue/config.py,sha256=_pyiIm1_PUjBo3WhKTATQcT2gazk-iHv0SdzNoGTeow,961
+ bizon/engine/queue/adapters/python_queue/consumer.py,sha256=uVXOnS6tjhZG5NI_IU_qWg6-Uk3qjFJaMvDBDjQXjZg,1817
+ bizon/engine/queue/adapters/python_queue/queue.py,sha256=jwKeqvJ4wzPbXIY1JnVrqBemRQ52A2tLzybwyJk4TL8,2670
+ bizon/engine/queue/adapters/rabbitmq/config.py,sha256=9N_7WREvNjJgcNTC3Y2kHII-iId2MZa3ssHHks6PyAs,987
+ bizon/engine/queue/adapters/rabbitmq/consumer.py,sha256=WlNHEqd5wZc31cwNLYYf_7YYPo2B3jYHTiyG5pW3Bj4,2037
+ bizon/engine/queue/adapters/rabbitmq/queue.py,sha256=O0DfzNKqzKfUZCTOnEcbaHY_W67WbZkKDSeZAkh4Nf0,1744
+ bizon/engine/runner/config.py,sha256=Clxa8ToeisaBH5BveZz06jSJeq8QRJTH-a_oXBIWxgw,2455
+ bizon/engine/runner/runner.py,sha256=P5GJigzsbPQp7NKrX1pHqXdqmxiXHNtlrx3DB7vOA3Y,11067
+ bizon/engine/runner/adapters/process.py,sha256=D1KJxK85GNvi4bGQ_YMoV44psWZvhNvqdMu-OfhNO5o,3049
+ bizon/engine/runner/adapters/streaming.py,sha256=PPdhwA5oTNXH7NXB5-uye2asT4L0Zhfrx_wZFDDCsq4,8300
+ bizon/engine/runner/adapters/thread.py,sha256=jQ2jzSe7zlZoPP_g46Oh70UN6RqJkGOMXC01SYg9s2o,3790
+ bizon/monitoring/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/monitoring/config.py,sha256=wRZRfW4ejjFDSG6swWj7kIULlaNYUPdbIKblQC9lzsk,1112
+ bizon/monitoring/monitor.py,sha256=aJ3JEgD-f-HydGFAmprwL8YCA0WumDeXpm2P-uqp4wM,2627
+ bizon/monitoring/datadog/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/monitoring/datadog/monitor.py,sha256=YSdyMVEIjkDyp91_mGED_kx8j76MbQyQGkGJCijpAJ0,6152
+ bizon/monitoring/noop/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ bizon/monitoring/noop/monitor.py,sha256=Pu7Qt9SpUG1UvC8aWysgtoDY-t5tnKd4FlUXAC4MjbI,1066
+ bizon/source/callback.py,sha256=lfTwU_bzJwR0q5sbiKoK8uedQ-dhfHzoYkPVqm8b_Ho,602
+ bizon/source/config.py,sha256=JyZbKjlU0xhiyuuIGJYJPGUl9JxS4xyGeCyHoHgHHos,2473
+ bizon/source/cursor.py,sha256=Wjh9eNEiHV5P9YnjS5bdS2ahyFc0gPm9QLQtD-QjQCI,4089
+ bizon/source/discover.py,sha256=h9IVqtAQsTH-XxR-UkAFgNvEphLP2LgataQCCuHbGrk,11174
+ bizon/source/models.py,sha256=CHPKvO9chRi85WPDfLYy9vWnPsua8LTwYvjjN7Dj2uA,1837
+ bizon/source/session.py,sha256=klbCv0g6sm6ac-pzM50eAJSP8DdQ9DOegHgjpmKKUrI,1978
+ bizon/source/source.py,sha256=k_fHOOvam5ixZ9oPuQzUa9Kq3jVvv2HY7ghrCo-0o3I,4342
+ bizon/source/auth/builder.py,sha256=hc4zBNj31LZc-QqgIyx1VQEYTm9Xv81vY5pJiwQroJo,860
+ bizon/source/auth/config.py,sha256=2jjcBLP95XsCkfKxdUei4X2yHI2WX92lJb8D8Txw86g,750
+ bizon/source/auth/authenticators/abstract_oauth.py,sha256=T4iv4IDeQgRn1d_0ODJeGA23PlieSpaD8w5zgLXPFI8,5573
+ bizon/source/auth/authenticators/abstract_token.py,sha256=GYM4srti2VLYVuAvozv6AdqXrIzXw0HROzhvVq5YhCo,926
+ bizon/source/auth/authenticators/basic.py,sha256=xMD9g1PCN-xBmGRT8R1zCiSrkqA_OpOb_pU54SgZoUk,1284
+ bizon/source/auth/authenticators/cookies.py,sha256=mqNp6TGEfyam7ou-s-908Jh8R0ZeX89ftbeExDWg1oY,829
+ bizon/source/auth/authenticators/oauth.py,sha256=tY_UZsWTy4FkifqJ7-smPaD61gg1dMJizO9_iSqTt5o,3670
+ bizon/source/auth/authenticators/token.py,sha256=P6SKRAarAEv28YiWp8hQLSKAV7twNlyNTGRr9sxlx58,956
+ bizon/transform/config.py,sha256=Q9F7jlsuaXK8OYrO5qcdk8lxXTDoIgzoVMhhHW3igEw,213
+ bizon/transform/transform.py,sha256=Ufla8YFx9C9WEiN0ppmZS1a86Sk0PgggqC-8DIvDeAQ,1414
+ bizon-0.3.0.dist-info/METADATA,sha256=oX7OZjHhKAVvQ8UiRS0ksqu3C65t2kOp2mAfXoEBdJY,11159
+ bizon-0.3.0.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
+ bizon-0.3.0.dist-info/entry_points.txt,sha256=hHZPN-V6JwwhSYWNCKVu3WNxekuhXtIAaz_zdwO7NDo,45
+ bizon-0.3.0.dist-info/licenses/LICENSE,sha256=OXLcl0T2SZ8Pmy2_dmlvKuetivmyPd5m1q-Gyd-zaYY,35149
+ bizon-0.3.0.dist-info/RECORD,,
@@ -1,4 +1,4 @@
  Wheel-Version: 1.0
- Generator: poetry-core 2.1.3
+ Generator: hatchling 1.28.0
  Root-Is-Purelib: true
  Tag: py3-none-any
@@ -0,0 +1,2 @@
+ [console_scripts]
+ bizon = bizon.cli.main:cli
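
Editor's note: a `console_scripts` spec of the form `name = module:attr` tells the installer to generate a `bizon` executable that imports `bizon.cli.main` and calls `cli`. An illustrative sketch of splitting such a spec string (helper name is ours):

```python
def split_entry_point(spec: str):
    """Split a 'name = module:attr' entry-point spec into its three parts."""
    name, _, target = (part.strip() for part in spec.partition("="))
    module, _, attr = target.partition(":")
    return name, module, attr
```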
@@ -1,179 +0,0 @@
- Metadata-Version: 2.3
- Name: bizon
- Version: 0.1.2
- Summary: Extract and load your data reliably from API Clients with native fault-tolerant and checkpointing mechanism.
- Author: Antoine Balliet
- Author-email: antoine.balliet@gmail.com
- Requires-Python: >=3.9,<3.13
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Classifier: Programming Language :: Python :: 3.12
- Provides-Extra: bigquery
- Provides-Extra: datadog
- Provides-Extra: gsheets
- Provides-Extra: kafka
- Provides-Extra: postgres
- Provides-Extra: rabbitmq
- Requires-Dist: avro (>=1.12.0,<2.0.0) ; extra == "kafka"
- Requires-Dist: backoff (>=2.2.1,<3.0.0)
- Requires-Dist: click (>=8.1.7,<9.0.0)
- Requires-Dist: confluent-kafka (>=2.6.0,<3.0.0) ; extra == "kafka"
- Requires-Dist: datadog (>=0.50.2,<0.51.0) ; extra == "datadog"
- Requires-Dist: ddtrace (>=3.10.0,<4.0.0) ; extra == "datadog"
- Requires-Dist: dpath (>=2.2.0,<3.0.0)
- Requires-Dist: fastavro (>=1.9.7,<2.0.0) ; extra == "kafka"
- Requires-Dist: google-cloud-bigquery (>=3.25.0,<4.0.0) ; extra == "bigquery"
- Requires-Dist: google-cloud-bigquery-storage (>=2.25.0,<3.0.0) ; extra == "bigquery"
- Requires-Dist: google-cloud-storage (>=2.17.0,<3.0.0)
- Requires-Dist: gspread (>=6.1.2,<7.0.0) ; extra == "gsheets"
- Requires-Dist: kafka-python (>=2.0.2,<3.0.0) ; extra == "kafka"
- Requires-Dist: loguru (>=0.7.2,<0.8.0)
- Requires-Dist: orjson (>=3.10.16,<4.0.0)
- Requires-Dist: pendulum (>=3.0.0,<4.0.0)
- Requires-Dist: pika (>=1.3.2,<2.0.0) ; extra == "rabbitmq"
- Requires-Dist: polars (>=1.16.0,<2.0.0)
- Requires-Dist: protobuf (>=4.24.0,<5.0.0) ; extra == "bigquery"
- Requires-Dist: psycopg2-binary (>=2.9.9,<3.0.0) ; extra == "postgres"
- Requires-Dist: pyarrow (>=16.1.0,<17.0.0)
- Requires-Dist: pydantic (>=2.8.2,<3.0.0)
- Requires-Dist: pydantic-extra-types (>=2.9.0,<3.0.0)
- Requires-Dist: python-dotenv (>=1.0.1,<2.0.0)
- Requires-Dist: pytz (>=2024.2,<2025.0)
- Requires-Dist: pyyaml (>=6.0.1,<7.0.0)
- Requires-Dist: requests (>=2.28.2,<3.0.0)
- Requires-Dist: simplejson (>=3.20.1,<4.0.0)
- Requires-Dist: sqlalchemy (>=2.0.32,<3.0.0)
- Requires-Dist: sqlalchemy-bigquery (>=1.11.0,<2.0.0) ; extra == "bigquery"
- Requires-Dist: tenacity (>=9.0.0,<10.0.0)
- Description-Content-Type: text/markdown
-
- # bizon ⚡️
- Extract and load your largest data streams with a framework you can trust for billions of records.
-
- ## Features
- **Natively fault-tolerant**: Bizon uses a checkpointing mechanism to keep track of progress and recover from the last checkpoint.
-
- **High throughput**: Bizon is designed to handle high throughput and can process billions of records.
-
- **Queue system agnostic**: Bizon is agnostic of the queuing system; you can use Python Queue, RabbitMQ, Kafka, or Redpanda. Thanks to the `bizon.engine.queue.Queue` interface, adapters can be written for any queuing system.
-
- **Pipeline metrics**: Bizon provides exhaustive pipeline metrics and implements Datadog & OpenTelemetry for tracing. You can monitor:
- ETAs for completion
- Number of records processed
- Completion percentage
- Source-to-destination latency
-
- **Lightweight & lean**: Bizon is lightweight, with a minimal codebase and only a few dependencies:
- `requests` for HTTP requests
- `pyyaml` for configuration
- `sqlalchemy` for database / warehouse connections
- `polars` for memory-efficient data buffering and vectorized processing
- `pyarrow` for the Parquet file format
-
- ## Installation
- ```bash
- pip install bizon
- ```
-
- ## Usage
-
- ### List available sources and streams
- ```bash
- bizon source list
- bizon stream list <source_name>
- ```
-
- ### Create a pipeline
-
- Create a file named `config.yml` in your working directory with the following content:
-
- ```yaml
- name: demo-creatures-pipeline
-
- source:
-   name: dummy
-   stream: creatures
-   authentication:
-     type: api_key
-     params:
-       token: dummy_key
-
- destination:
-   name: logger
-   config:
-     dummy: dummy
- ```
-
- Run the pipeline with the following command:
-
- ```bash
- bizon run config.yml
- ```
- ## Backend configuration
-
- Backend is the interface used by Bizon to store its state. It can be configured in the `backend` section of the configuration file. The following backends are supported:
- `sqlite`: In-memory SQLite database, useful for testing and development.
- `bigquery`: Google BigQuery backend, perfect for light setup & production.
- `postgres`: PostgreSQL backend, for production use and frequent cursor updates.
-
- ## Queue configuration
-
- Queue is the interface used by Bizon to exchange data between `Source` and `Destination`. It can be configured in the `queue` section of the configuration file. The following queues are supported:
- `python_queue`: Python Queue, useful for testing and development.
- `rabbitmq`: RabbitMQ, for production use and high throughput.
- `kafka`: Apache Kafka, for production use with high throughput and strong persistence.
-
- ## Runner configuration
-
- Runner is the interface used by Bizon to run the pipeline. It can be configured in the `runner` section of the configuration file. The following runners are supported:
- `thread` (asynchronous)
- `process` (asynchronous)
- `stream` (synchronous)
-
- ## Start syncing your data 🚀
-
- ### Quick setup without any dependencies ✌️
-
- Queue configuration can be set to `python_queue` and backend configuration to `sqlite`.
- This will allow you to test the pipeline without any external dependencies.
-
-
- ### Local Kafka setup
-
- To test the pipeline with Kafka, you can use `docker compose` to set up Kafka or Redpanda locally.
-
- **Kafka / Redpanda**
- ```bash
- docker compose --file ./scripts/kafka-compose.yml up # Kafka
- docker compose --file ./scripts/redpanda-compose.yml up # Redpanda
- ```
-
- In your YAML configuration, set the `queue` configuration to Kafka under `engine`:
- ```yaml
- engine:
-   queue:
-     type: kafka
-     config:
-       queue:
-         bootstrap_server: localhost:9092 # Kafka: 9092 & Redpanda: 19092
- ```
-
- **RabbitMQ**
- ```bash
- docker compose --file ./scripts/rabbitmq-compose.yml up
- ```
-
- In your YAML configuration, set the `queue` configuration to RabbitMQ under `engine`:
-
- ```yaml
- engine:
-   queue:
-     type: rabbitmq
-     config:
-       queue:
-         host: localhost
-         queue_name: bizon
- ```
-