etlplus 0.8.3__tar.gz → 0.12.5__tar.gz
This diff shows the changes between two publicly released versions of the package, as they appear in their respective public registries. It is provided for informational purposes only.
- {etlplus-0.8.3 → etlplus-0.12.5}/DEMO.md +25 -23
- {etlplus-0.8.3/etlplus.egg-info → etlplus-0.12.5}/PKG-INFO +117 -41
- {etlplus-0.8.3 → etlplus-0.12.5}/README.md +111 -40
- etlplus-0.12.5/SECURITY.md +15 -0
- etlplus-0.12.5/SUPPORT.md +18 -0
- etlplus-0.12.5/etlplus/README.md +37 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/README.md +20 -3
- etlplus-0.12.5/etlplus/cli/README.md +40 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/commands.py +176 -122
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/constants.py +14 -8
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/handlers.py +4 -5
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/io.py +23 -7
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/main.py +2 -1
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/options.py +1 -1
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/state.py +3 -2
- etlplus-0.12.5/etlplus/config/README.md +52 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/pipeline.py +2 -2
- etlplus-0.12.5/etlplus/database/README.md +48 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/database/ddl.py +1 -1
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/database/engine.py +1 -1
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/database/schema.py +1 -1
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/enums.py +3 -77
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/extract.py +5 -7
- etlplus-0.12.5/etlplus/file/README.md +105 -0
- etlplus-0.12.5/etlplus/file/__init__.py +25 -0
- etlplus-0.12.5/etlplus/file/_io.py +120 -0
- etlplus-0.12.5/etlplus/file/_pandas.py +58 -0
- etlplus-0.12.5/etlplus/file/avro.py +186 -0
- etlplus-0.12.5/etlplus/file/core.py +292 -0
- etlplus-0.12.5/etlplus/file/csv.py +67 -0
- etlplus-0.12.5/etlplus/file/enums.py +240 -0
- etlplus-0.12.5/etlplus/file/feather.py +99 -0
- etlplus-0.12.5/etlplus/file/gz.py +123 -0
- etlplus-0.12.5/etlplus/file/json.py +98 -0
- etlplus-0.12.5/etlplus/file/ndjson.py +109 -0
- etlplus-0.12.5/etlplus/file/orc.py +99 -0
- etlplus-0.12.5/etlplus/file/parquet.py +101 -0
- etlplus-0.12.5/etlplus/file/stub.py +75 -0
- etlplus-0.12.5/etlplus/file/tsv.py +67 -0
- etlplus-0.12.5/etlplus/file/txt.py +99 -0
- etlplus-0.12.5/etlplus/file/xls.py +88 -0
- etlplus-0.12.5/etlplus/file/xlsx.py +99 -0
- etlplus-0.12.5/etlplus/file/xml.py +174 -0
- etlplus-0.12.5/etlplus/file/yaml.py +136 -0
- etlplus-0.12.5/etlplus/file/zip.py +175 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/load.py +10 -13
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/run.py +6 -9
- etlplus-0.12.5/etlplus/templates/README.md +46 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/utils.py +1 -1
- etlplus-0.12.5/etlplus/validation/README.md +50 -0
- {etlplus-0.8.3 → etlplus-0.12.5/etlplus.egg-info}/PKG-INFO +117 -41
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus.egg-info/SOURCES.txt +34 -2
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus.egg-info/requires.txt +5 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/README.md +4 -3
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/quickstart_python.py +1 -1
- {etlplus-0.8.3 → etlplus-0.12.5}/pyproject.toml +5 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/setup.py +5 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/conftest.py +2 -2
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_cli.py +127 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_examples_data_parity.py +2 -2
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_pagination_strategy.py +2 -2
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/cli/conftest.py +105 -4
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/cli/test_u_cli_handlers.py +403 -316
- etlplus-0.12.5/tests/unit/cli/test_u_cli_io.py +326 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/cli/test_u_cli_main.py +61 -27
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/cli/test_u_cli_state.py +45 -41
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/database/test_u_database_ddl.py +6 -3
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/database/test_u_database_engine.py +5 -4
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/database/test_u_database_schema.py +12 -14
- etlplus-0.12.5/tests/unit/file/test_u_file_core.py +491 -0
- etlplus-0.12.5/tests/unit/file/test_u_file_enums.py +99 -0
- etlplus-0.12.5/tests/unit/file/test_u_file_yaml.py +109 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_enums.py +5 -38
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_load.py +1 -1
- etlplus-0.8.3/etlplus/file.py +0 -657
- etlplus-0.8.3/tests/unit/test_u_file.py +0 -296
- {etlplus-0.8.3 → etlplus-0.12.5}/.coveragerc +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.editorconfig +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.gitattributes +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.github/actions/python-bootstrap/action.yml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.github/workflows/ci.yml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.gitignore +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.pre-commit-config.yaml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/.ruff.toml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/CODE_OF_CONDUCT.md +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/CONTRIBUTING.md +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/LICENSE +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/MANIFEST.in +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/Makefile +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/REFERENCES.md +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/docs/README.md +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/docs/pipeline-guide.md +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/docs/snippets/installation_version.md +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/__main__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/__version__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/auth.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/config.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/endpoint_client.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/errors.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/pagination/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/pagination/client.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/pagination/config.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/pagination/paginator.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/rate_limiting/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/rate_limiting/config.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/rate_limiting/rate_limiter.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/request_manager.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/retry_manager.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/transport.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/api/types.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/cli/types.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/connector.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/jobs.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/profile.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/types.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/config/utils.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/database/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/database/orm.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/database/types.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/mixins.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/py.typed +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/run_helpers.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/templates/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/templates/ddl.sql.j2 +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/templates/view.sql.j2 +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/transform.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/types.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/validate.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/validation/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus/validation/utils.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus.egg-info/dependency_links.txt +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus.egg-info/entry_points.txt +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/etlplus.egg-info/top_level.txt +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/configs/ddl_spec.yml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/configs/pipeline.yml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/data/sample.csv +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/data/sample.json +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/data/sample.xml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/data/sample.xsd +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/examples/data/sample.yaml +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/pytest.ini +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/setup.cfg +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/__init__.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/conftest.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_pipeline_smoke.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_pipeline_yaml_load.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_run.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_run_profile_pagination_defaults.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/integration/test_i_run_profile_rate_limit_defaults.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/conftest.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_auth.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_config.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_endpoint_client.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_mocks.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_pagination_client.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_pagination_config.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_paginator.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_rate_limit_config.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_rate_limiter.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_request_manager.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_retry_manager.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_transport.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/api/test_u_types.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/config/test_u_config_utils.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/config/test_u_connector.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/config/test_u_jobs.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/config/test_u_pipeline.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/conftest.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/database/test_u_database_orm.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_extract.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_main.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_mixins.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_run.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_run_helpers.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_transform.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_utils.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_validate.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/test_u_version.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tests/unit/validation/test_u_validation_utils.py +0 -0
- {etlplus-0.8.3 → etlplus-0.12.5}/tools/update_demo_snippets.py +0 -0
@@ -89,14 +89,14 @@ $ etlplus validate '{"email": "user@example.com", "age": 25}' \
 
 ### Filter and Select
 ```bash
-$ etlplus transform '
+$ etlplus transform --operations '{
+"filter": {"field": "age", "op": "gt", "value": 26},
+"select": ["name", "age"]
+}' '[
 {"name": "John", "age": 30, "city": "NYC"},
 {"name": "Jane", "age": 25, "city": "LA"},
 {"name": "Bob", "age": 35, "city": "Chicago"}
-]'
-"filter": {"field": "age", "op": "gt", "value": 26},
-"select": ["name", "age"]
-}'
+]'
 [
 {
 "name": "John",
@@ -111,24 +111,19 @@ $ etlplus transform '[
 
 ### Sort Data
 ```bash
-$ etlplus transform
-
-
-{"name": "Bob", "score": 90}
-]' --operations '{
-"sort": {"field": "score", "reverse": true}
-}'
+$ etlplus transform \
+--operations '{"sort": {"field": "score", "reverse": true}}' \
+'[{"name": "Charlie", "score": 85}, {"name": "Alice", "score": 95}, {"name": "Bob", "score": 90}]'
 ```
 
 ### Aggregate Data
 ```bash
-$ etlplus transform '
+$ etlplus transform --operations '{"aggregate": {"field": "sales", "func": "sum"}}' \
+'[
 {"product": "A", "sales": 100},
 {"product": "B", "sales": 150},
 {"product": "C", "sales": 200}
-]'
-"aggregate": {"field": "sales", "func": "sum"}
-}'
+]'
 {
 "sum_sales": 450
 }
@@ -138,7 +133,9 @@ $ etlplus transform '[
 
 ### Load to JSON File
 ```bash
-$ etlplus load
+$ etlplus load \
+'{"name": "John", "status": "active"}' \
+output.json --target-type file
 {
 "status": "success",
 "message": "Data loaded to output.json",
@@ -148,10 +145,12 @@ $ etlplus load '{"name": "John", "status": "active"}' file output.json
 
 ### Load to CSV File
 ```bash
-$ etlplus load
+$ etlplus load \
+'[
 {"name": "John", "email": "john@example.com"},
 {"name": "Jane", "email": "jane@example.com"}
-]'
+]' \
+users.csv --target-type file
 {
 "status": "success",
 "message": "Data loaded to users.csv",
@@ -173,19 +172,22 @@ This example shows a complete ETL workflow:
 $ etlplus extract raw_data.csv > extracted.json
 
 # Step 2: Transform
-$ etlplus transform
+$ etlplus transform \
 --operations '{
 "filter": {"field": "age", "op": "gte", "value": 18},
 "select": ["name", "email", "age"]
-}'
+}' \
+extracted.json \
+transformed.json
 
 # Step 3: Validate
-$ etlplus validate
+$ etlplus validate \
 --rules '{
 "name": {"type": "string", "required": true},
 "email": {"type": "string", "required": true, "pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"},
 "age": {"type": "number", "min": 18, "max": 120}
-}'
+}' \
+transformed.json
 
 # Step 4: Load
 $ etlplus load transformed.json file final_output.csv
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: etlplus
-Version: 0.
+Version: 0.12.5
 Summary: A Swiss Army knife for simple ETL operations
 Home-page: https://github.com/Dagitali/ETLPlus
 Author: ETLPlus Team
@@ -17,8 +17,11 @@ Classifier: Programming Language :: Python :: 3.14
 Requires-Python: >=3.13,<3.15
 Description-Content-Type: text/markdown
 License-File: LICENSE
+Requires-Dist: fastavro>=1.12.1
 Requires-Dist: jinja2>=3.1.6
+Requires-Dist: openpyxl>=3.1.5
 Requires-Dist: pyodbc>=5.3.0
+Requires-Dist: pyarrow>=22.0.0
 Requires-Dist: python-dotenv>=1.2.1
 Requires-Dist: pandas>=2.3.3
 Requires-Dist: pydantic>=2.12.5
@@ -26,6 +29,8 @@ Requires-Dist: PyYAML>=6.0.3
 Requires-Dist: requests>=2.32.5
 Requires-Dist: SQLAlchemy>=2.0.45
 Requires-Dist: typer>=0.21.0
+Requires-Dist: xlrd>=2.0.2
+Requires-Dist: xlwt>=1.3.0
 Provides-Extra: dev
 Requires-Dist: black>=25.9.0; extra == "dev"
 Requires-Dist: build>=1.2.2; extra == "dev"
@@ -59,11 +64,13 @@ ETLPlus is a veritable Swiss Army knife for enabling simple ETL operations, offe
 package and command-line interface for data extraction, validation, transformation, and loading.
 
 - [ETLPlus](#etlplus)
+- [Getting Started](#getting-started)
 - [Features](#features)
 - [Installation](#installation)
 - [Quickstart](#quickstart)
 - [Usage](#usage)
 - [Command Line Interface](#command-line-interface)
+- [Argument Order and Required Options](#argument-order-and-required-options)
 - [Check Pipelines](#check-pipelines)
 - [Render SQL DDL](#render-sql-ddl)
 - [Extract Data](#extract-data)
@@ -86,11 +93,27 @@ package and command-line interface for data extraction, validation, transformati
 - [Linting](#linting)
 - [Updating Demo Snippets](#updating-demo-snippets)
 - [Releasing to PyPI](#releasing-to-pypi)
-- [Links](#links)
 - [License](#license)
 - [Contributing](#contributing)
+- [Documentation](#documentation)
+- [Python Packages/Subpackage](#python-packagessubpackage)
+- [Community Health](#community-health)
+- [Other](#other)
 - [Acknowledgments](#acknowledgments)
 
+## Getting Started
+
+ETLPlus helps you extract, validate, transform, and load data from files, databases, and APIs, either
+as a Python library or from the command line.
+
+To get started:
+
+- See [Installation](#installation) for setup instructions.
+- Try the [Quickstart](#quickstart) for a minimal working example (CLI and Python).
+- Explore [Usage](#usage) for more detailed options and workflows.
+
+ETLPlus supports Python 3.13 and above.
+
 ## Features
 
 - **Check** data pipeline definitions before running them:
@@ -151,8 +174,8 @@ etlplus --version
 
 # One-liner: extract CSV, filter, select, and write JSON
 etlplus extract file examples/data/sample.csv \
-| etlplus transform
--
+| etlplus transform --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
+- temp/sample_output.json
 ```
 
 [Python API](#python-api):
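The quickstart one-liner in the hunk above filters rows on `age > 25` and keeps only the `name` and `email` fields. As a rough illustration of what those two operations mean, here is a plain-Python sketch. It does not call ETLPlus: the `apply_operations` helper and the sample rows are hypothetical, it assumes the filter runs before the select, and it handles only the `gt` and `gte` comparisons that appear in this diff.

```python
import operator

# Comparison names seen in this diff ("gt", "gte"); anything else is out of scope here.
_OPS = {"gt": operator.gt, "gte": operator.ge}


def apply_operations(rows, operations):
    """Apply an optional "filter" and then an optional "select" to a list of dicts."""
    result = list(rows)
    flt = operations.get("filter")
    if flt:
        compare = _OPS[flt["op"]]
        result = [r for r in result if compare(r[flt["field"]], flt["value"])]
    fields = operations.get("select")
    if fields:
        result = [{f: r[f] for f in fields} for r in result]
    return result


# Hypothetical sample rows; not the contents of examples/data/sample.csv.
rows = [
    {"name": "John", "email": "john@example.com", "age": 30},
    {"name": "Jane", "email": "jane@example.com", "age": 25},
]
operations = {
    "filter": {"field": "age", "op": "gt", "value": 25},
    "select": ["name", "email"],
}
print(apply_operations(rows, operations))
```

Only the row whose `age` is strictly greater than 25 survives the filter, and the `age` field itself is dropped by the select.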
@@ -185,6 +208,27 @@ etlplus --version
 The CLI is implemented with Typer (Click-based). There is no argparse compatibility layer, so rely
 on the documented commands/flags and run `etlplus <command> --help` for current options.
 
+**Example error messages:**
+
+- If you omit a required argument: `Error: Missing required argument 'SOURCE'.`
+- If you place an option before its argument: `Error: Option '--source-format' must follow the 'SOURCE' argument.`
+
+#### Argument Order and Required Options
+
+For each command, positional arguments must precede options. Required options must follow their
+associated argument:
+
+- **extract**: `etlplus extract SOURCE [--source-format ...] [--source-type ...]`
+- `SOURCE` is required. `--source-format` and `--source-type` must follow `SOURCE`.
+- **transform**: `etlplus transform [--operations ...] SOURCE [--source-format ...] [--source-type ...] TARGET [--target-format ...] [--target-type ...]`
+- `SOURCE` and `TARGET` are required. Format/type options must follow their respective argument.
+- **load**: `etlplus load TARGET [--target-format ...] [--target-type ...] [--source-format ...]`
+- `TARGET` is required. `--target-format` and `--target-type` must follow `TARGET`.
+- **validate**: `etlplus validate SOURCE [--rules ...] [--source-format ...] [--source-type ...]`
+- `SOURCE` is required. `--rules` and format/type options must follow `SOURCE`.
+
+If required arguments or options are missing, or if options are placed before their associated argument, the CLI will display a clear error message.
+
 #### Check Pipelines
 
 Use `etlplus check` to explore pipeline YAML definitions without running them. The command can print
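The argument-order rules added in the hunk above are easy to get wrong when a command line is assembled programmatically. The sketch below builds an argument list that follows the documented `etlplus transform` synopsis, placing each type option directly after its positional argument. The `build_transform_argv` helper is hypothetical and not part of ETLPlus; it only uses flags that appear in this diff, and it builds the command without running it.

```python
import json
import shlex


def build_transform_argv(operations, source, target, source_type=None, target_type=None):
    """Assemble an argv list following the documented `etlplus transform` synopsis."""
    argv = ["etlplus", "transform", "--operations", json.dumps(operations)]
    argv.append(source)
    if source_type:
        argv += ["--source-type", source_type]  # placed after SOURCE
    argv.append(target)
    if target_type:
        argv += ["--target-type", target_type]  # placed after TARGET
    return argv


argv = build_transform_argv(
    {"select": ["name", "email"]},
    "examples/data/sample.json",
    "temp/selected_output.json",
    source_type="file",
    target_type="file",
)
# shlex.join quotes the JSON payload so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

Keeping the command as a list until the last moment means the JSON payload never needs manual quoting, and the same list could be handed to `subprocess.run` unchanged.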
@@ -251,7 +295,7 @@ etlplus extract api https://api.example.com/data
 
 Save extracted data to file:
 ```bash
-etlplus extract file examples/data/sample.csv
+etlplus extract file examples/data/sample.csv > temp/sample_output.json
 ```
 
 #### Validate Data
@@ -270,59 +314,67 @@ etlplus validate examples/data/sample.json --rules '{"email": {"type": "string",
 
 When piping data through `etlplus transform`, use `--source-format` whenever the SOURCE argument is
 `-` or a literal payload, mirroring the `etlplus extract` semantics. Use `--target-format` to
-control the emitted format for
-paths continue to infer formats from their extensions. Use `--
-connector type and `--
-extract`/`etlplus load` behavior.
+control the emitted format for STDOUT or other non-file outputs, just like `etlplus load`. File
+paths continue to infer formats from their extensions. Use `--source-type` to override the inferred
+source connector type and `--target-type` to override the inferred target connector type, matching
+the `etlplus extract`/`etlplus load` behavior.
 
 Transform file inputs while overriding connector types:
 ```bash
-etlplus transform
+etlplus transform \
 --operations '{"select": ["name", "email"]}' \
---
+examples/data/sample.json --source-type file \
+temp/selected_output.json --target-type file
 ```
 
 Filter and select fields:
 ```bash
-etlplus transform
---operations '{"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name"]}'
+etlplus transform \
+--operations '{"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name"]}' \
+'[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]'
 ```
 
 Sort data:
 ```bash
-etlplus transform
+etlplus transform \
+--operations '{"sort": {"field": "age", "reverse": true}}' \
+examples/data/sample.json
 ```
 
 Aggregate data:
 ```bash
-etlplus transform
+etlplus transform \
+--operations '{"aggregate": {"field": "age", "func": "sum"}}' \
+examples/data/sample.json
 ```
 
 Map/rename fields:
 ```bash
-etlplus transform
+etlplus transform \
+--operations '{"map": {"name": "new_name"}}' \
+examples/data/sample.json
 ```
 
 #### Load Data
 
-`etlplus load` consumes JSON from
+`etlplus load` consumes JSON from STDIN; provide only the target argument plus optional flags.
 
 Load to JSON file:
 ```bash
 etlplus extract file examples/data/sample.json \
-| etlplus load
+| etlplus load temp/sample_output.json --target-type file
 ```
 
 Load to CSV file:
 ```bash
 etlplus extract file examples/data/sample.csv \
-| etlplus load
+| etlplus load temp/sample_output.csv --target-type file
 ```
 
 Load to REST API:
 ```bash
 cat examples/data/sample.json \
-| etlplus load
+| etlplus load https://api.example.com/endpoint --target-type api
 ```
 
 ### Python API
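The README text in the hunk above states that `etlplus load` consumes JSON from STDIN and takes only the target as a positional argument. The general pattern, reading one JSON document from a stream and writing it to a target path, can be sketched in plain Python as follows. This is not ETLPlus code: `load_from_stream` is a hypothetical helper, and an in-memory stream stands in for `sys.stdin` so the example runs on its own.

```python
import io
import json
import tempfile
from pathlib import Path


def load_from_stream(stream, target):
    """Read one JSON document from a text stream and write it to `target` as JSON."""
    data = json.load(stream)
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(data, indent=2), encoding="utf-8")
    # Report how many records were written: list length, or 1 for a single object.
    return len(data) if isinstance(data, list) else 1


# In a real pipe the stream would be sys.stdin; StringIO keeps the sketch self-contained.
payload = io.StringIO('[{"name": "John"}, {"name": "Jane"}]')
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp) / "sample_output.json"
    count = load_from_stream(payload, out)
    written = json.loads(out.read_text(encoding="utf-8"))
print(count, written)
```

Separating the stream from the target in this way is what lets the upstream command (`etlplus extract` or `cat` in the examples above) be swapped freely.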
@@ -375,20 +427,22 @@ etlplus run --config examples/configs/pipeline.yml --job file_to_file_customers
 
 ```bash
 # 1. Extract from CSV
-etlplus extract file examples/data/sample.csv
+etlplus extract file examples/data/sample.csv > temp/sample_extracted.json
 
 # 2. Transform (filter and select fields)
-etlplus transform
+etlplus transform \
 --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
-
+temp/sample_extracted.json \
+temp/sample_transformed.json
 
 # 3. Validate transformed data
-etlplus validate
---rules '{"name": {"type": "string", "required": true}, "email": {"type": "string", "required": true}}'
+etlplus validate \
+--rules '{"name": {"type": "string", "required": true}, "email": {"type": "string", "required": true}}' \
+temp/sample_transformed.json
 
 # 4. Load to CSV
 cat temp/sample_transformed.json \
-| etlplus load
+| etlplus load temp/sample_output.csv
 ```
 
 ### Format Overrides
@@ -401,14 +455,14 @@ Examples (zsh):
 
 ```zsh
 # Force CSV parsing for an extension-less file
-etlplus extract
+etlplus extract data.txt --source-type file --source-format csv
 
 # Write CSV to a file without the .csv suffix
-etlplus load
+etlplus load output.bin --target-type file --target-format csv < data.json
 
 # Leave the flags off when extensions already match the desired format
-etlplus extract --
-etlplus load
+etlplus extract data.csv --source-type file
+etlplus load data.json --target-type file < data.json
 ```
 
 ## Transformation Operations
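The format-override examples in the hunk above follow one rule: an explicit `--source-format` or `--target-format` wins, and otherwise the format is inferred from the file extension. A minimal sketch of that precedence, assuming nothing about how ETLPlus implements it, could look like this. The `resolve_format` function is hypothetical, and returning `None` when nothing can be inferred is a choice made for this sketch.

```python
from pathlib import Path


def resolve_format(path, override=None):
    """Return the explicit override if given, else the lowercase file extension, else None."""
    if override:
        return override.lower()
    suffix = Path(path).suffix.lstrip(".").lower()
    return suffix or None


print(resolve_format("data.txt", "csv"))  # explicit flag takes precedence over .txt
print(resolve_format("data.csv"))         # inferred from the extension
print(resolve_format("-"))                # STDIN placeholder: nothing to infer
```

The last call shows why the README recommends passing `--source-format` whenever SOURCE is `-` or a literal payload: there is no extension to fall back on.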
@@ -571,17 +625,6 @@ git push origin v1.4.0
 If you want an extra smoke-test before tagging, run `make dist && pip install dist/*.whl` locally;
 this exercises the same build path the workflow uses.
 
-## Links
-
-- API client docs: [`etlplus/api/README.md`](etlplus/api/README.md)
-- Examples: [`examples/README.md`](examples/README.md)
-- Pipeline authoring guide: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
-- Runner internals: [`docs/run-module.md`](docs/run-module.md)
-- Design notes (Mapping inputs, dict outputs): [`docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs`](docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs)
-- Typing philosophy: [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
-- Demo and walkthrough: [`DEMO.md`](DEMO.md)
-- Additional references: [`REFERENCES.md`](`REFERENCES.md)
-
 ## License
 
 This project is licensed under the [MIT License](LICENSE).
@@ -605,6 +648,39 @@ If you choose to be a code contributor, please first refer these documents:
 - Typing philosophy (TypedDicts as editor hints, permissive runtime):
 [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
 
+## Documentation
+
+### Python Packages/Subpackage
+
+Navigate to detailed documentation for each subpackage:
+
+- [etlplus.api](etlplus/api/README.md): Lightweight HTTP client and paginated REST helpers
+- [etlplus.file](etlplus/file/README.md): Unified file format support and helpers
+- [etlplus.config](etlplus/config/README.md): Configuration helpers for connectors, pipelines, jobs,
+and profiles
+- [etlplus.cli](etlplus/cli/README.md): Command-line interface for ETLPlus workflows
+- [etlplus.database](etlplus/database/README.md): Database engine, schema, and ORM helpers
+- [etlplus.templates](etlplus/templates/README.md): SQL and DDL template helpers
+- [etlplus.validation](etlplus/validation/README.md): Data validation utilities and helpers
+
+### Community Health
+
+- [Contributing Guidelines](CONTRIBUTING.md): How to contribute, report issues, and submit PRs
+- [Code of Conduct](CODE_OF_CONDUCT.md): Community standards and expectations
+- [Security Policy](SECURITY.md): Responsible disclosure and vulnerability reporting
+- [Support](SUPPORT.md): Where to get help
+
+### Other
+
+- API client docs: [`etlplus/api/README.md`](etlplus/api/README.md)
+- Examples: [`examples/README.md`](examples/README.md)
+- Pipeline authoring guide: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
+- Runner internals: [`docs/run-module.md`](docs/run-module.md)
+- Design notes (Mapping inputs, dict outputs): [`docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs`](docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs)
+- Typing philosophy: [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
+- Demo and walkthrough: [`DEMO.md`](DEMO.md)
+- Additional references: [`REFERENCES.md`](REFERENCES.md)
+
 ## Acknowledgments
 
 ETLPlus is inspired by common work patterns in data engineering and software engineering patterns in