etlplus 0.3.14__tar.gz → 0.12.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (194)
  1. etlplus-0.12.4/.coveragerc +21 -0
  2. {etlplus-0.3.14 → etlplus-0.12.4}/.github/workflows/ci.yml +19 -2
  3. {etlplus-0.3.14 → etlplus-0.12.4}/.pre-commit-config.yaml +5 -1
  4. {etlplus-0.3.14 → etlplus-0.12.4}/DEMO.md +27 -25
  5. etlplus-0.12.4/MANIFEST.in +12 -0
  6. {etlplus-0.3.14 → etlplus-0.12.4}/Makefile +1 -1
  7. {etlplus-0.3.14/etlplus.egg-info → etlplus-0.12.4}/PKG-INFO +209 -59
  8. {etlplus-0.3.14 → etlplus-0.12.4}/README.md +199 -58
  9. etlplus-0.12.4/SECURITY.md +15 -0
  10. etlplus-0.12.4/SUPPORT.md +18 -0
  11. etlplus-0.12.4/docs/README.md +18 -0
  12. {etlplus-0.3.14 → etlplus-0.12.4}/docs/pipeline-guide.md +33 -13
  13. etlplus-0.12.4/etlplus/README.md +37 -0
  14. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/__main__.py +1 -2
  15. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/README.md +44 -29
  16. etlplus-0.12.4/etlplus/cli/README.md +40 -0
  17. etlplus-0.12.4/etlplus/cli/__init__.py +15 -0
  18. etlplus-0.12.4/etlplus/cli/commands.py +924 -0
  19. etlplus-0.12.4/etlplus/cli/constants.py +71 -0
  20. etlplus-0.12.4/etlplus/cli/handlers.py +656 -0
  21. etlplus-0.12.4/etlplus/cli/io.py +336 -0
  22. etlplus-0.12.4/etlplus/cli/main.py +214 -0
  23. etlplus-0.12.4/etlplus/cli/options.py +49 -0
  24. etlplus-0.12.4/etlplus/cli/state.py +336 -0
  25. etlplus-0.12.4/etlplus/cli/types.py +33 -0
  26. etlplus-0.12.4/etlplus/config/README.md +52 -0
  27. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/pipeline.py +13 -2
  28. etlplus-0.12.4/etlplus/database/README.md +48 -0
  29. etlplus-0.12.4/etlplus/database/__init__.py +44 -0
  30. etlplus-0.12.4/etlplus/database/ddl.py +319 -0
  31. etlplus-0.12.4/etlplus/database/engine.py +151 -0
  32. etlplus-0.12.4/etlplus/database/orm.py +354 -0
  33. etlplus-0.12.4/etlplus/database/schema.py +274 -0
  34. etlplus-0.12.4/etlplus/database/types.py +33 -0
  35. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/enums.py +3 -77
  36. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/extract.py +5 -7
  37. etlplus-0.12.4/etlplus/file/README.md +105 -0
  38. etlplus-0.12.4/etlplus/file/__init__.py +25 -0
  39. etlplus-0.12.4/etlplus/file/_io.py +120 -0
  40. etlplus-0.12.4/etlplus/file/_pandas.py +58 -0
  41. etlplus-0.12.4/etlplus/file/avro.py +186 -0
  42. etlplus-0.12.4/etlplus/file/core.py +287 -0
  43. etlplus-0.12.4/etlplus/file/csv.py +67 -0
  44. etlplus-0.12.4/etlplus/file/enums.py +238 -0
  45. etlplus-0.12.4/etlplus/file/feather.py +99 -0
  46. etlplus-0.12.4/etlplus/file/gz.py +123 -0
  47. etlplus-0.12.4/etlplus/file/json.py +98 -0
  48. etlplus-0.12.4/etlplus/file/ndjson.py +109 -0
  49. etlplus-0.12.4/etlplus/file/orc.py +99 -0
  50. etlplus-0.12.4/etlplus/file/parquet.py +101 -0
  51. etlplus-0.12.4/etlplus/file/tsv.py +67 -0
  52. etlplus-0.12.4/etlplus/file/txt.py +99 -0
  53. etlplus-0.12.4/etlplus/file/xls.py +88 -0
  54. etlplus-0.12.4/etlplus/file/xlsx.py +99 -0
  55. etlplus-0.12.4/etlplus/file/xml.py +174 -0
  56. etlplus-0.12.4/etlplus/file/yaml.py +136 -0
  57. etlplus-0.12.4/etlplus/file/zip.py +175 -0
  58. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/load.py +10 -13
  59. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/run.py +8 -13
  60. etlplus-0.12.4/etlplus/templates/README.md +46 -0
  61. etlplus-0.12.4/etlplus/templates/__init__.py +5 -0
  62. etlplus-0.12.4/etlplus/templates/ddl.sql.j2 +128 -0
  63. etlplus-0.12.4/etlplus/templates/view.sql.j2 +69 -0
  64. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/transform.py +12 -0
  65. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/types.py +5 -0
  66. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/utils.py +1 -32
  67. etlplus-0.12.4/etlplus/validation/README.md +50 -0
  68. {etlplus-0.3.14 → etlplus-0.12.4/etlplus.egg-info}/PKG-INFO +209 -59
  69. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus.egg-info/SOURCES.txt +72 -5
  70. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus.egg-info/requires.txt +9 -0
  71. {etlplus-0.3.14 → etlplus-0.12.4}/examples/README.md +8 -7
  72. etlplus-0.12.4/examples/configs/ddl_spec.yml +67 -0
  73. {etlplus-0.3.14 → etlplus-0.12.4}/examples/quickstart_python.py +1 -1
  74. {etlplus-0.3.14 → etlplus-0.12.4}/pyproject.toml +10 -1
  75. {etlplus-0.3.14 → etlplus-0.12.4}/setup.py +14 -1
  76. etlplus-0.12.4/tests/conftest.py +210 -0
  77. {etlplus-0.3.14 → etlplus-0.12.4}/tests/integration/conftest.py +105 -16
  78. etlplus-0.12.4/tests/integration/test_i_cli.py +299 -0
  79. {etlplus-0.3.14 → etlplus-0.12.4}/tests/integration/test_i_examples_data_parity.py +7 -2
  80. etlplus-0.12.4/tests/integration/test_i_pagination_strategy.py +556 -0
  81. {etlplus-0.3.14 → etlplus-0.12.4}/tests/integration/test_i_pipeline_smoke.py +46 -40
  82. {etlplus-0.3.14 → etlplus-0.12.4}/tests/integration/test_i_pipeline_yaml_load.py +6 -0
  83. etlplus-0.12.4/tests/integration/test_i_run.py +69 -0
  84. {etlplus-0.3.14 → etlplus-0.12.4}/tests/integration/test_i_run_profile_pagination_defaults.py +11 -7
  85. {etlplus-0.3.14 → etlplus-0.12.4}/tests/integration/test_i_run_profile_rate_limit_defaults.py +6 -0
  86. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/conftest.py +42 -15
  87. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_auth.py +114 -124
  88. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_config.py +60 -16
  89. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_endpoint_client.py +456 -276
  90. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_pagination_client.py +6 -1
  91. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_pagination_config.py +5 -0
  92. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_paginator.py +6 -1
  93. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_rate_limit_config.py +5 -0
  94. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_rate_limiter.py +6 -1
  95. etlplus-0.12.4/tests/unit/api/test_u_request_manager.py +349 -0
  96. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_retry_manager.py +6 -0
  97. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_transport.py +53 -1
  98. etlplus-0.12.4/tests/unit/api/test_u_types.py +135 -0
  99. etlplus-0.12.4/tests/unit/cli/conftest.py +284 -0
  100. etlplus-0.12.4/tests/unit/cli/test_u_cli_handlers.py +884 -0
  101. etlplus-0.12.4/tests/unit/cli/test_u_cli_io.py +326 -0
  102. etlplus-0.12.4/tests/unit/cli/test_u_cli_main.py +216 -0
  103. etlplus-0.12.4/tests/unit/cli/test_u_cli_state.py +347 -0
  104. etlplus-0.12.4/tests/unit/config/test_u_config_utils.py +129 -0
  105. etlplus-0.12.4/tests/unit/config/test_u_connector.py +119 -0
  106. etlplus-0.12.4/tests/unit/config/test_u_jobs.py +131 -0
  107. etlplus-0.12.4/tests/unit/config/test_u_pipeline.py +315 -0
  108. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/conftest.py +30 -30
  109. etlplus-0.12.4/tests/unit/database/test_u_database_ddl.py +268 -0
  110. etlplus-0.12.4/tests/unit/database/test_u_database_engine.py +199 -0
  111. etlplus-0.12.4/tests/unit/database/test_u_database_orm.py +308 -0
  112. etlplus-0.12.4/tests/unit/database/test_u_database_schema.py +241 -0
  113. etlplus-0.12.4/tests/unit/file/test_u_file_core.py +493 -0
  114. etlplus-0.12.4/tests/unit/file/test_u_file_enums.py +99 -0
  115. etlplus-0.12.4/tests/unit/file/test_u_file_yaml.py +109 -0
  116. etlplus-0.12.4/tests/unit/test_u_enums.py +102 -0
  117. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/test_u_extract.py +213 -1
  118. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/test_u_load.py +206 -5
  119. etlplus-0.12.4/tests/unit/test_u_main.py +58 -0
  120. etlplus-0.12.4/tests/unit/test_u_mixins.py +47 -0
  121. etlplus-0.12.4/tests/unit/test_u_run.py +602 -0
  122. etlplus-0.12.4/tests/unit/test_u_run_helpers.py +385 -0
  123. etlplus-0.12.4/tests/unit/test_u_transform.py +860 -0
  124. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/test_u_utils.py +84 -5
  125. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/test_u_validate.py +42 -2
  126. etlplus-0.12.4/tests/unit/test_u_version.py +53 -0
  127. etlplus-0.3.14/etlplus/cli.py +0 -868
  128. etlplus-0.3.14/etlplus/file.py +0 -657
  129. etlplus-0.3.14/tests/conftest.py +0 -11
  130. etlplus-0.3.14/tests/integration/test_i_cli.py +0 -348
  131. etlplus-0.3.14/tests/integration/test_i_pagination_strategy.py +0 -452
  132. etlplus-0.3.14/tests/integration/test_i_run.py +0 -133
  133. etlplus-0.3.14/tests/unit/api/test_u_request_manager.py +0 -134
  134. etlplus-0.3.14/tests/unit/config/test_u_connector.py +0 -54
  135. etlplus-0.3.14/tests/unit/config/test_u_pipeline.py +0 -194
  136. etlplus-0.3.14/tests/unit/test_u_cli.py +0 -124
  137. etlplus-0.3.14/tests/unit/test_u_file.py +0 -100
  138. etlplus-0.3.14/tests/unit/test_u_transform.py +0 -483
  139. etlplus-0.3.14/tools/run_pipeline.py +0 -561
  140. {etlplus-0.3.14 → etlplus-0.12.4}/.editorconfig +0 -0
  141. {etlplus-0.3.14 → etlplus-0.12.4}/.gitattributes +0 -0
  142. {etlplus-0.3.14 → etlplus-0.12.4}/.github/actions/python-bootstrap/action.yml +0 -0
  143. {etlplus-0.3.14 → etlplus-0.12.4}/.gitignore +0 -0
  144. {etlplus-0.3.14 → etlplus-0.12.4}/.ruff.toml +0 -0
  145. {etlplus-0.3.14 → etlplus-0.12.4}/CODE_OF_CONDUCT.md +0 -0
  146. {etlplus-0.3.14 → etlplus-0.12.4}/CONTRIBUTING.md +0 -0
  147. {etlplus-0.3.14 → etlplus-0.12.4}/LICENSE +0 -0
  148. {etlplus-0.3.14 → etlplus-0.12.4}/REFERENCES.md +0 -0
  149. {etlplus-0.3.14 → etlplus-0.12.4}/docs/snippets/installation_version.md +0 -0
  150. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/__init__.py +0 -0
  151. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/__version__.py +0 -0
  152. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/__init__.py +0 -0
  153. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/auth.py +0 -0
  154. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/config.py +0 -0
  155. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/endpoint_client.py +0 -0
  156. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/errors.py +0 -0
  157. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/pagination/__init__.py +0 -0
  158. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/pagination/client.py +0 -0
  159. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/pagination/config.py +0 -0
  160. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/pagination/paginator.py +0 -0
  161. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/rate_limiting/__init__.py +0 -0
  162. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/rate_limiting/config.py +0 -0
  163. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/rate_limiting/rate_limiter.py +0 -0
  164. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/request_manager.py +0 -0
  165. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/retry_manager.py +0 -0
  166. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/transport.py +0 -0
  167. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/api/types.py +0 -0
  168. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/__init__.py +0 -0
  169. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/connector.py +0 -0
  170. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/jobs.py +0 -0
  171. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/profile.py +0 -0
  172. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/types.py +0 -0
  173. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/config/utils.py +0 -0
  174. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/mixins.py +0 -0
  175. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/py.typed +0 -0
  176. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/run_helpers.py +0 -0
  177. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/validate.py +0 -0
  178. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/validation/__init__.py +0 -0
  179. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus/validation/utils.py +0 -0
  180. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus.egg-info/dependency_links.txt +0 -0
  181. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus.egg-info/entry_points.txt +0 -0
  182. {etlplus-0.3.14 → etlplus-0.12.4}/etlplus.egg-info/top_level.txt +0 -0
  183. {etlplus-0.3.14 → etlplus-0.12.4}/examples/configs/pipeline.yml +0 -0
  184. {etlplus-0.3.14 → etlplus-0.12.4}/examples/data/sample.csv +0 -0
  185. {etlplus-0.3.14 → etlplus-0.12.4}/examples/data/sample.json +0 -0
  186. {etlplus-0.3.14 → etlplus-0.12.4}/examples/data/sample.xml +0 -0
  187. {etlplus-0.3.14 → etlplus-0.12.4}/examples/data/sample.xsd +0 -0
  188. {etlplus-0.3.14 → etlplus-0.12.4}/examples/data/sample.yaml +0 -0
  189. {etlplus-0.3.14 → etlplus-0.12.4}/pytest.ini +0 -0
  190. {etlplus-0.3.14 → etlplus-0.12.4}/setup.cfg +0 -0
  191. {etlplus-0.3.14 → etlplus-0.12.4}/tests/__init__.py +0 -0
  192. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/api/test_u_mocks.py +0 -0
  193. {etlplus-0.3.14 → etlplus-0.12.4}/tests/unit/validation/test_u_validation_utils.py +0 -0
  194. {etlplus-0.3.14 → etlplus-0.12.4}/tools/update_demo_snippets.py +0 -0
@@ -0,0 +1,21 @@
+ # .coveragerc
+ # ETLPlus
+ #
+ # Copyright © 2025 Dagitali LLC. All rights reserved.
+ #
+ # An optional pytest-cov configuration file. Limits coverage measurement to the
+ # ETLPlus package and ignore test modules.
+ #
+ # See:
+ # 1. https://pytest-cov.readthedocs.io/en/latest/config.html
+
+ [run]
+ source = etlplus
+ branch = true
+ omit =
+ tests/*
+ */tests/*
+
+ [report]
+ skip_covered = true
+ show_missing = true
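As a quick sanity check on the settings added in this hunk, the sketch below parses the same INI content with Python's standard `configparser`. coverage.py reads the file on its own; the snippet only illustrates how the multi-line `omit` value breaks down into individual patterns. The indentation under `omit` is an assumption, since the diff view does not preserve leading whitespace.

```python
import configparser

# Contents mirror the new .coveragerc. The indentation under "omit" is an
# assumption: the diff view collapses leading whitespace.
COVERAGERC = """\
[run]
source = etlplus
branch = true
omit =
    tests/*
    */tests/*

[report]
skip_covered = true
show_missing = true
"""

parser = configparser.ConfigParser()
parser.read_string(COVERAGERC)

# A multi-line value comes back as one string; split it into patterns.
omit_patterns = [
    line for line in parser.get("run", "omit").splitlines() if line.strip()
]
branch_enabled = parser.getboolean("run", "branch")
source_package = parser.get("run", "source")

print(source_package, branch_enabled, omit_patterns)
```

Reading the file this way is only a convenience for inspection; the CI step in the next hunk passes the file to pytest-cov through `--cov-config=.coveragerc`.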
@@ -74,9 +74,26 @@ jobs:
  with:
  python-version: ${{ matrix.python-version }}
  python-bootstrap: "-e .[dev,yaml]"
- - name: Run tests
+ - name: Run tests (with coverage)
  run: |
- pytest -q
+ pytest -q \
+ --cov \
+ --cov-branch \
+ --cov-config=.coveragerc \
+ --cov-report=term-missing \
+ --cov-report=xml \
+ tests/
+
+ - name: Upload coverage reports to Codecov
+ if: matrix.python-version == '3.13'
+ uses: codecov/codecov-action@671740ac38dd9b0130fbe1cec585b89eea48d3de # Pinned v5.5.2
+ with:
+ fail_ci_if_error: true
+ files: coverage.xml
+ flags: unit
+ name: etlplus
+ token: ${{ secrets.CODECOV_TOKEN }} # Omit for public repo
+ verbose: true

  build:
  name: Build distributions
@@ -159,7 +159,11 @@ repos:
  rev: v1.19.0
  hooks:
  - id: mypy
- args: [--ignore-missing-imports, --install-types, --non-interactive]
+ args:
+ - --cache-dir=.mypy_cache/pre-commit
+ - --ignore-missing-imports
+ - --install-types
+ - --non-interactive

  - repo: https://github.com/pycqa/flake8
  rev: 7.3.0
@@ -58,7 +58,7 @@ John Doe,30,New York
  Jane Smith,25,Los Angeles
  CSVDATA

- $ etlplus extract file users.csv --format csv
+ $ etlplus extract users.csv
  [
  {
  "name": "John Doe",
@@ -89,14 +89,14 @@ $ etlplus validate '{"email": "user@example.com", "age": 25}' \

  ### Filter and Select
  ```bash
- $ etlplus transform '[
+ $ etlplus transform --operations '{
+ "filter": {"field": "age", "op": "gt", "value": 26},
+ "select": ["name", "age"]
+ }' '[
  {"name": "John", "age": 30, "city": "NYC"},
  {"name": "Jane", "age": 25, "city": "LA"},
  {"name": "Bob", "age": 35, "city": "Chicago"}
- ]' --operations '{
- "filter": {"field": "age", "op": "gt", "value": 26},
- "select": ["name", "age"]
- }'
+ ]'
  [
  {
  "name": "John",
@@ -111,24 +111,19 @@ $ etlplus transform '[

  ### Sort Data
  ```bash
- $ etlplus transform '[
- {"name": "Charlie", "score": 85},
- {"name": "Alice", "score": 95},
- {"name": "Bob", "score": 90}
- ]' --operations '{
- "sort": {"field": "score", "reverse": true}
- }'
+ $ etlplus transform -\
+ -operations '{"sort": {"field": "score", "reverse": true}}' \
+ '[{"name": "Charlie", "score": 85}, {"name": "Alice", "score": 95}, {"name": "Bob", "score": 90}]'
  ```

  ### Aggregate Data
  ```bash
- $ etlplus transform '[
+ $ etlplus transform --operations '{"aggregate": {"field": "sales", "func": "sum"}}' \
+ '[
  {"product": "A", "sales": 100},
  {"product": "B", "sales": 150},
  {"product": "C", "sales": 200}
- ]' --operations '{
- "aggregate": {"field": "sales", "func": "sum"}
- }'
+ ]'
  {
  "sum_sales": 450
  }
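The filter, select, sort, and aggregate operations used in these DEMO.md hunks can be approximated in plain Python. The sketch below is illustrative only: `apply_operations` is a hypothetical helper, not part of the etlplus API, and the order in which it applies the operations (filter, sort, select, aggregate) is an assumption. It supports only the `gt`/`gte` comparisons and the `sum` aggregate that appear in the demo.

```python
import operator

# Hypothetical helper that mimics the operations shown in DEMO.md.
# Illustrative only; this is not the etlplus implementation.
_COMPARATORS = {"gt": operator.gt, "gte": operator.ge}


def apply_operations(rows, operations):
    """Apply filter, sort, select, and aggregate (sum) specs to a list of dicts."""
    result = list(rows)
    if "filter" in operations:
        spec = operations["filter"]
        compare = _COMPARATORS[spec["op"]]
        result = [r for r in result if compare(r[spec["field"]], spec["value"])]
    if "sort" in operations:
        spec = operations["sort"]
        result = sorted(
            result,
            key=lambda r: r[spec["field"]],
            reverse=spec.get("reverse", False),
        )
    if "select" in operations:
        result = [{f: r[f] for f in operations["select"]} for r in result]
    if "aggregate" in operations:
        spec = operations["aggregate"]
        if spec["func"] != "sum":
            raise ValueError("this sketch only supports the 'sum' function")
        return {f"sum_{spec['field']}": sum(r[spec["field"]] for r in result)}
    return result


people = [
    {"name": "John", "age": 30, "city": "NYC"},
    {"name": "Jane", "age": 25, "city": "LA"},
    {"name": "Bob", "age": 35, "city": "Chicago"},
]
adults = apply_operations(
    people,
    {"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name", "age"]},
)

scores = [
    {"name": "Charlie", "score": 85},
    {"name": "Alice", "score": 95},
    {"name": "Bob", "score": 90},
]
ranked = apply_operations(scores, {"sort": {"field": "score", "reverse": True}})

sales = [
    {"product": "A", "sales": 100},
    {"product": "B", "sales": 150},
    {"product": "C", "sales": 200},
]
total = apply_operations(sales, {"aggregate": {"field": "sales", "func": "sum"}})

print(adults)
print([row["name"] for row in ranked])
print(total)
```

The inputs are the same records used in the demo commands, so the aggregate result can be compared with the `sum_sales` value shown in the hunk above.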
@@ -138,7 +133,9 @@ $ etlplus transform '[

  ### Load to JSON File
  ```bash
- $ etlplus load '{"name": "John", "status": "active"}' file output.json
+ $ etlplus load \
+ '{"name": "John", "status": "active"}' \
+ output.json --target-type file
  {
  "status": "success",
  "message": "Data loaded to output.json",
@@ -148,10 +145,12 @@ $ etlplus load '{"name": "John", "status": "active"}' file output.json

  ### Load to CSV File
  ```bash
- $ etlplus load '[
+ $ etlplus load \
+ '[
  {"name": "John", "email": "john@example.com"},
  {"name": "Jane", "email": "jane@example.com"}
- ]' file users.csv --format csv
+ ]' \
+ users.csv --target-type file
  {
  "status": "success",
  "message": "Data loaded to users.csv",
@@ -170,22 +169,25 @@ This example shows a complete ETL workflow:

  ```bash
  # Step 1: Extract
- $ etlplus extract file raw_data.csv --format csv -o extracted.json
+ $ etlplus extract raw_data.csv > extracted.json

  # Step 2: Transform
- $ etlplus transform extracted.json \
+ $ etlplus transform \
  --operations '{
  "filter": {"field": "age", "op": "gte", "value": 18},
  "select": ["name", "email", "age"]
- }' -o transformed.json
+ }' \
+ extracted.json \
+ transformed.json

  # Step 3: Validate
- $ etlplus validate transformed.json \
+ $ etlplus validate \
  --rules '{
  "name": {"type": "string", "required": true},
  "email": {"type": "string", "required": true, "pattern": "^[\\w.-]+@[\\w.-]+\\.\\w+$"},
  "age": {"type": "number", "min": 18, "max": 120}
- }'
+ }' \
+ transformed.json

  # Step 4: Load
  $ etlplus load transformed.json file final_output.csv
@@ -0,0 +1,12 @@
+ # MANIFEST.in
+ # ETLPlus
+ #
+ # Copyright © 2026 Dagitali LLC. All rights reserved.
+ #
+ # Contains commands that allow lists of files to be discovered and manipulated.
+ #
+ # See:
+ # 1. https://setuptools.pypa.io/en/latest/userguide/miscellaneous.html
+
+ # Include Jinja template files in the etlplus package
+ recursive-include etlplus/templates *.j2
@@ -253,7 +253,7 @@ venv: ## Create the virtual environment (at $(VENV_DIR))
  else \
  $(call ECHO_INFO, "Using existing venv: $(VENV_DIR)"); \
  fi
- @$(PYTHON) -m pip install --upgrade pip setuptool wheel >/dev/null
+ @$(PYTHON) -m pip install --upgrade pip setuptools wheel >/dev/null
  @$(call ECHO_OK,"venv ready")

  ##@ CI
  ##@ CI
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: etlplus
3
- Version: 0.3.14
3
+ Version: 0.12.4
4
4
  Summary: A Swiss Army knife for simple ETL operations
5
5
  Home-page: https://github.com/Dagitali/ETLPlus
6
6
  Author: ETLPlus Team
@@ -17,11 +17,20 @@ Classifier: Programming Language :: Python :: 3.14
  Requires-Python: >=3.13,<3.15
  Description-Content-Type: text/markdown
  License-File: LICENSE
+ Requires-Dist: fastavro>=1.12.1
  Requires-Dist: jinja2>=3.1.6
+ Requires-Dist: openpyxl>=3.1.5
  Requires-Dist: pyodbc>=5.3.0
+ Requires-Dist: pyarrow>=22.0.0
  Requires-Dist: python-dotenv>=1.2.1
  Requires-Dist: pandas>=2.3.3
+ Requires-Dist: pydantic>=2.12.5
+ Requires-Dist: PyYAML>=6.0.3
  Requires-Dist: requests>=2.32.5
+ Requires-Dist: SQLAlchemy>=2.0.45
+ Requires-Dist: typer>=0.21.0
+ Requires-Dist: xlrd>=2.0.2
+ Requires-Dist: xlwt>=1.3.0
  Provides-Extra: dev
  Requires-Dist: black>=25.9.0; extra == "dev"
  Requires-Dist: build>=1.2.2; extra == "dev"
@@ -55,18 +64,22 @@ ETLPlus is a veritable Swiss Army knife for enabling simple ETL operations, offe
  package and command-line interface for data extraction, validation, transformation, and loading.

  - [ETLPlus](#etlplus)
+ - [Getting Started](#getting-started)
  - [Features](#features)
  - [Installation](#installation)
  - [Quickstart](#quickstart)
  - [Usage](#usage)
  - [Command Line Interface](#command-line-interface)
+ - [Argument Order and Required Options](#argument-order-and-required-options)
+ - [Check Pipelines](#check-pipelines)
+ - [Render SQL DDL](#render-sql-ddl)
  - [Extract Data](#extract-data)
  - [Validate Data](#validate-data)
  - [Transform Data](#transform-data)
  - [Load Data](#load-data)
  - [Python API](#python-api)
  - [Complete ETL Pipeline Example](#complete-etl-pipeline-example)
- - [Environment Variables](#environment-variables)
+ - [Format Overrides](#format-overrides)
  - [Transformation Operations](#transformation-operations)
  - [Filter Operations](#filter-operations)
  - [Aggregation Functions](#aggregation-functions)
@@ -78,13 +91,39 @@ package and command-line interface for data extraction, validation, transformati
  - [Test Layers](#test-layers)
  - [Code Coverage](#code-coverage)
  - [Linting](#linting)
- - [Links](#links)
+ - [Updating Demo Snippets](#updating-demo-snippets)
+ - [Releasing to PyPI](#releasing-to-pypi)
  - [License](#license)
  - [Contributing](#contributing)
+ - [Documentation](#documentation)
+ - [Python Packages/Subpackage](#python-packagessubpackage)
+ - [Community Health](#community-health)
+ - [Other](#other)
  - [Acknowledgments](#acknowledgments)

+ ## Getting Started
+
+ ETLPlus helps you extract, validate, transform, and load data from files, databases, and APIs, either
+ as a Python library or from the command line.
+
+ To get started:
+
+ - See [Installation](#installation) for setup instructions.
+ - Try the [Quickstart](#quickstart) for a minimal working example (CLI and Python).
+ - Explore [Usage](#usage) for more detailed options and workflows.
+
+ ETLPlus supports Python 3.13 and above.
+
  ## Features

+ - **Check** data pipeline definitions before running them:
+ - Summarize jobs, sources, targets, and transforms
+ - Confirm configuration changes by printing focused sections on demand
+
+ - **Render** SQL DDL from shared table specs:
+ - Generate CREATE TABLE or view statements
+ - Swap templates or direct output to files for database migrations
+
  - **Extract** data from multiple sources:
  - Files (CSV, JSON, XML, YAML)
  - Databases (connection string support)
@@ -135,8 +174,8 @@ etlplus --version

  # One-liner: extract CSV, filter, select, and write JSON
  etlplus extract file examples/data/sample.csv \
- | etlplus transform - --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
- -o temp/sample_output.json
+ | etlplus transform --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
+ - temp/sample_output.json
  ```

  [Python API](#python-api):
@@ -166,11 +205,73 @@ etlplus --help
  etlplus --version
  ```

+ The CLI is implemented with Typer (Click-based). There is no argparse compatibility layer, so rely
+ on the documented commands/flags and run `etlplus <command> --help` for current options.
+
+ **Example error messages:**
+
+ - If you omit a required argument: `Error: Missing required argument 'SOURCE'.`
+ - If you place an option before its argument: `Error: Option '--source-format' must follow the 'SOURCE' argument.`
+
+ #### Argument Order and Required Options
+
+ For each command, positional arguments must precede options. Required options must follow their
+ associated argument:
+
+ - **extract**: `etlplus extract SOURCE [--source-format ...] [--source-type ...]`
+ - `SOURCE` is required. `--source-format` and `--source-type` must follow `SOURCE`.
+ - **transform**: `etlplus transform [--operations ...] SOURCE [--source-format ...] [--source-type ...] TARGET [--target-format ...] [--target-type ...]`
+ - `SOURCE` and `TARGET` are required. Format/type options must follow their respective argument.
+ - **load**: `etlplus load TARGET [--target-format ...] [--target-type ...] [--source-format ...]`
+ - `TARGET` is required. `--target-format` and `--target-type` must follow `TARGET`.
+ - **validate**: `etlplus validate SOURCE [--rules ...] [--source-format ...] [--source-type ...]`
+ - `SOURCE` is required. `--rules` and format/type options must follow `SOURCE`.
+
+ If required arguments or options are missing, or if options are placed before their associated argument, the CLI will display a clear error message.
+
+ #### Check Pipelines
+
+ Use `etlplus check` to explore pipeline YAML definitions without running them. The command can print
+ job names, summarize configured sources and targets, or drill into specific sections.
+
+ List jobs and show a pipeline summary:
+ ```bash
+ etlplus check --config examples/configs/pipeline.yml --jobs
+ etlplus check --config examples/configs/pipeline.yml --summary
+ ```
+
+ Show sources or transforms for troubleshooting:
+ ```bash
+ etlplus check --config examples/configs/pipeline.yml --sources
+ etlplus check --config examples/configs/pipeline.yml --transforms
+ ```
+
+ #### Render SQL DDL
+
+ Use `etlplus render` to turn table schema specs into ready-to-run SQL. Render from a pipeline config
+ or from a standalone schema file, and choose the built-in `ddl` or `view` templates (or provide your
+ own).
+
+ Render all tables defined in a pipeline:
+ ```bash
+ etlplus render --config examples/configs/pipeline.yml --template ddl
+ ```
+
+ Render a single table in that pipeline:
+ ```bash
+ etlplus render --config examples/configs/pipeline.yml --table customers --template view
+ ```
+
+ Render from a standalone table spec to a file:
+ ```bash
+ etlplus render --spec schemas/customer.yml --template view -o temp/customer_view.sql
+ ```
+
  #### Extract Data

- Note: For file sources, the format is inferred from the filename extension; the `--format` option is
- ignored. To treat passing `--format` as an error for file sources, either set
- `ETLPLUS_FORMAT_BEHAVIOR=error` or pass the CLI flag `--strict-format`.
+ Note: For file sources, the format is normally inferred from the filename extension. Use
+ `--source-format` to override inference when a file lacks an extension or when you want to force a
+ specific parser.

  Extract from JSON file:
  ```bash
@@ -194,7 +295,7 @@ etlplus extract api https://api.example.com/data

  Save extracted data to file:
  ```bash
- etlplus extract file examples/data/sample.csv -o temp/sample_output.json
+ etlplus extract file examples/data/sample.csv > temp/sample_output.json
  ```

  #### Validate Data
@@ -211,42 +312,69 @@ etlplus validate examples/data/sample.json --rules '{"email": {"type": "string",

  #### Transform Data

+ When piping data through `etlplus transform`, use `--source-format` whenever the SOURCE argument is
+ `-` or a literal payload, mirroring the `etlplus extract` semantics. Use `--target-format` to
+ control the emitted format for STDOUT or other non-file outputs, just like `etlplus load`. File
+ paths continue to infer formats from their extensions. Use `--source-type` to override the inferred
+ source connector type and `--target-type` to override the inferred target connector type, matching
+ the `etlplus extract`/`etlplus load` behavior.
+
+ Transform file inputs while overriding connector types:
+ ```bash
+ etlplus transform \
+ --operations '{"select": ["name", "email"]}' \
+ examples/data/sample.json --source-type file \
+ temp/selected_output.json --target-type file
+ ```
+
  Filter and select fields:
  ```bash
- etlplus transform '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]' \
- --operations '{"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name"]}'
+ etlplus transform \
+ --operations '{"filter": {"field": "age", "op": "gt", "value": 26}, "select": ["name"]}' \
+ '[{"name": "John", "age": 30}, {"name": "Jane", "age": 25}]'
  ```

  Sort data:
  ```bash
- etlplus transform examples/data/sample.json --operations '{"sort": {"field": "age", "reverse": true}}'
+ etlplus transform \
+ --operations '{"sort": {"field": "age", "reverse": true}}' \
+ examples/data/sample.json
  ```

  Aggregate data:
  ```bash
- etlplus transform examples/data/sample.json --operations '{"aggregate": {"field": "age", "func": "sum"}}'
+ etlplus transform \
+ --operations '{"aggregate": {"field": "age", "func": "sum"}}' \
+ examples/data/sample.json
  ```

  Map/rename fields:
  ```bash
- etlplus transform examples/data/sample.json --operations '{"map": {"name": "new_name"}}'
+ etlplus transform \
+ --operations '{"map": {"name": "new_name"}}' \
+ examples/data/sample.json
  ```

  #### Load Data

+ `etlplus load` consumes JSON from STDIN; provide only the target argument plus optional flags.
+
  Load to JSON file:
  ```bash
- etlplus load '{"name": "John", "age": 30}' file temp/sample_output.json
+ etlplus extract file examples/data/sample.json \
+ | etlplus load temp/sample_output.json --target-type file
  ```

  Load to CSV file:
  ```bash
- etlplus load '[{"name": "John", "age": 30}]' file temp/sample_output.csv
+ etlplus extract file examples/data/sample.csv \
+ | etlplus load temp/sample_output.csv --target-type file
  ```

  Load to REST API:
  ```bash
- etlplus load examples/data/sample.json api https://api.example.com/endpoint
+ cat examples/data/sample.json \
+ | etlplus load https://api.example.com/endpoint --target-type api
  ```

  ### Python API
@@ -284,57 +412,57 @@ For YAML-driven pipelines executed end-to-end (extract → validate → transfor
284
412
  - Authoring: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
285
413
  - Runner API and internals: [`docs/run-module.md`](docs/run-module.md)
286
414
 
415
+ CLI quick reference for pipelines:
416
+
417
+ ```bash
418
+ # List jobs or show a pipeline summary
419
+ etlplus check --config examples/configs/pipeline.yml --jobs
420
+ etlplus check --config examples/configs/pipeline.yml --summary
421
+
422
+ # Run a job
423
+ etlplus run --config examples/configs/pipeline.yml --job file_to_file_customers
424
+ ```
425
+
287
426
  ### Complete ETL Pipeline Example
288
427
 
289
428
  ```bash
290
429
  # 1. Extract from CSV
291
- etlplus extract file examples/data/sample.csv -o temp/sample_extracted.json
430
+ etlplus extract file examples/data/sample.csv > temp/sample_extracted.json
292
431
 
293
432
  # 2. Transform (filter and select fields)
294
- etlplus transform temp/sample_extracted.json \
433
+ etlplus transform \
295
434
  --operations '{"filter": {"field": "age", "op": "gt", "value": 25}, "select": ["name", "email"]}' \
296
- -o temp/sample_transformed.json
435
+ temp/sample_extracted.json \
436
+ temp/sample_transformed.json
297
437
 
298
438
  # 3. Validate transformed data
299
- etlplus validate temp/sample_transformed.json \
300
- --rules '{"name": {"type": "string", "required": true}, "email": {"type": "string", "required": true}}'
439
+ etlplus validate \
440
+ --rules '{"name": {"type": "string", "required": true}, "email": {"type": "string", "required": true}}' \
441
+ temp/sample_transformed.json
301
442
 
302
443
  # 4. Load to CSV
303
- etlplus load temp/sample_transformed.json file temp/sample_output.csv
444
+ cat temp/sample_transformed.json \
445
+ | etlplus load temp/sample_output.csv
304
446
  ```
305
447
 
306
- ### Environment Variables
448
+ ### Format Overrides
307
449
 
308
- ETLPlus honors a small number of environment toggles to refine CLI behavior:
309
-
310
- - `ETLPLUS_FORMAT_BEHAVIOR`: controls what happens when `--format` is provided for
311
- file sources or targets (extract/load) where the format is inferred from the
312
- filename extension.
313
- - `error|fail|strict`: treat as error (non-zero exit)
314
- - `warn` (default): print a warning to stderr
315
- - `ignore|silent`: no message
316
- - Precedence: the CLI flag `--strict-format` overrides the environment.
450
+ `--source-format` and `--target-format` override whichever format would normally be inferred from a
451
+ file extension. This is useful when an input has a misleading extension (for example, `records.txt` that
452
+ actually contains CSV) or when you intentionally want to treat a file as another format.
317
453
 
318
454
  Examples (zsh):
319
455
 
320
456
  ```zsh
321
- # Warn (default)
322
- etlplus extract file data.csv --format csv
323
- etlplus load data.json file out.csv --format csv
324
-
325
- # Enforce error via environment
326
- ETLPLUS_FORMAT_BEHAVIOR=error \
327
- etlplus extract file data.csv --format csv
328
- ETLPLUS_FORMAT_BEHAVIOR=error \
329
- etlplus load data.json file out.csv --format csv
457
+ # Force CSV parsing for a file whose extension is not .csv
458
+ etlplus extract data.txt --source-type file --source-format csv
330
459
 
331
- # Equivalent strict behavior via flag (overrides environment)
332
- etlplus extract file data.csv --format csv --strict-format
333
- etlplus load data.json file out.csv --format csv --strict-format
460
+ # Write CSV to a file without the .csv suffix
461
+ etlplus load output.bin --target-type file --target-format csv < data.json
334
462
 
335
- # Recommended: rely on extension, no --format needed for files
336
- etlplus extract file data.csv
337
- etlplus load data.json file out.csv
463
+ # Leave the flags off when extensions already match the desired format
464
+ etlplus extract data.csv --source-type file
465
+ etlplus load data.json --target-type file < data.json
338
466
  ```
339
467
 
340
468
  ## Transformation Operations
@@ -497,17 +625,6 @@ git push origin v1.4.0
497
625
  If you want an extra smoke-test before tagging, run `make dist && pip install dist/*.whl` locally;
498
626
  this exercises the same build path the workflow uses.
499
627
 
500
- ## Links
501
-
502
- - API client docs: [`etlplus/api/README.md`](etlplus/api/README.md)
503
- - Examples: [`examples/README.md`](examples/README.md)
504
- - Pipeline authoring guide: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
505
- - Runner internals: [`docs/run-module.md`](docs/run-module.md)
506
- - Design notes (Mapping inputs, dict outputs): [`docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs`](docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs)
507
- - Typing philosophy: [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
508
- - Demo and walkthrough: [`DEMO.md`](DEMO.md)
509
- - Additional references: [`REFERENCES.md`](`REFERENCES.md)
510
-
511
628
  ## License
512
629
 
513
630
  This project is licensed under the [MIT License](LICENSE).
@@ -531,6 +648,39 @@ If you choose to be a code contributor, please first refer these documents:
531
648
  - Typing philosophy (TypedDicts as editor hints, permissive runtime):
532
649
  [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
533
650
 
651
+ ## Documentation
652
+
653
+ ### Python Packages/Subpackages
654
+
655
+ Navigate to detailed documentation for each subpackage:
656
+
657
+ - [etlplus.api](etlplus/api/README.md): Lightweight HTTP client and paginated REST helpers
658
+ - [etlplus.file](etlplus/file/README.md): Unified file format support and helpers
659
+ - [etlplus.config](etlplus/config/README.md): Configuration helpers for connectors, pipelines, jobs,
660
+ and profiles
661
+ - [etlplus.cli](etlplus/cli/README.md): Command-line interface for ETLPlus workflows
662
+ - [etlplus.database](etlplus/database/README.md): Database engine, schema, and ORM helpers
663
+ - [etlplus.templates](etlplus/templates/README.md): SQL and DDL template helpers
664
+ - [etlplus.validation](etlplus/validation/README.md): Data validation utilities and helpers
665
+
666
+ ### Community Health
667
+
668
+ - [Contributing Guidelines](CONTRIBUTING.md): How to contribute, report issues, and submit PRs
669
+ - [Code of Conduct](CODE_OF_CONDUCT.md): Community standards and expectations
670
+ - [Security Policy](SECURITY.md): Responsible disclosure and vulnerability reporting
671
+ - [Support](SUPPORT.md): Where to get help
672
+
673
+ ### Other
674
+
675
+ - API client docs: [`etlplus/api/README.md`](etlplus/api/README.md)
676
+ - Examples: [`examples/README.md`](examples/README.md)
677
+ - Pipeline authoring guide: [`docs/pipeline-guide.md`](docs/pipeline-guide.md)
678
+ - Runner internals: [`docs/run-module.md`](docs/run-module.md)
679
+ - Design notes (Mapping inputs, dict outputs): [`docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs`](docs/pipeline-guide.md#design-notes-mapping-inputs-dict-outputs)
680
+ - Typing philosophy: [`CONTRIBUTING.md#typing-philosophy`](CONTRIBUTING.md#typing-philosophy)
681
+ - Demo and walkthrough: [`DEMO.md`](DEMO.md)
682
+ - Additional references: [`REFERENCES.md`](REFERENCES.md)
683
+
534
684
  ## Acknowledgments
535
685
 
536
686
  ETLPlus is inspired by common work patterns in data engineering and software engineering patterns in