sqlframe 1.3.0__tar.gz → 1.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (195)
  1. {sqlframe-1.3.0 → sqlframe-1.4.0}/PKG-INFO +12 -6
  2. {sqlframe-1.3.0 → sqlframe-1.4.0}/README.md +11 -5
  3. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/configuration.md +45 -32
  4. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/_version.py +2 -2
  5. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/dataframe.py +61 -17
  6. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe.egg-info/PKG-INFO +12 -6
  7. {sqlframe-1.3.0 → sqlframe-1.4.0}/.github/CODEOWNERS +0 -0
  8. {sqlframe-1.3.0 → sqlframe-1.4.0}/.github/workflows/main.workflow.yaml +0 -0
  9. {sqlframe-1.3.0 → sqlframe-1.4.0}/.github/workflows/publish.workflow.yaml +0 -0
  10. {sqlframe-1.3.0 → sqlframe-1.4.0}/.gitignore +0 -0
  11. {sqlframe-1.3.0 → sqlframe-1.4.0}/.pre-commit-config.yaml +0 -0
  12. {sqlframe-1.3.0 → sqlframe-1.4.0}/.readthedocs.yaml +0 -0
  13. {sqlframe-1.3.0 → sqlframe-1.4.0}/LICENSE +0 -0
  14. {sqlframe-1.3.0 → sqlframe-1.4.0}/Makefile +0 -0
  15. {sqlframe-1.3.0 → sqlframe-1.4.0}/blogs/images/but_wait_theres_more.gif +0 -0
  16. {sqlframe-1.3.0 → sqlframe-1.4.0}/blogs/images/cake.gif +0 -0
  17. {sqlframe-1.3.0 → sqlframe-1.4.0}/blogs/images/you_get_pyspark_api.gif +0 -0
  18. {sqlframe-1.3.0 → sqlframe-1.4.0}/blogs/sqlframe_universal_dataframe_api.md +0 -0
  19. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/bigquery.md +0 -0
  20. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/bigquery.md +0 -0
  21. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/duckdb.md +0 -0
  22. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/images/SF.png +0 -0
  23. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/images/favicon.png +0 -0
  24. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/images/favicon_old.png +0 -0
  25. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/images/sqlframe_diagram.png +0 -0
  26. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/images/sqlframe_logo.png +0 -0
  27. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/docs/postgres.md +0 -0
  28. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/duckdb.md +0 -0
  29. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/images/SF.png +0 -0
  30. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/images/favicon.png +0 -0
  31. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/images/favicon_old.png +0 -0
  32. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/images/sqlframe_diagram.png +0 -0
  33. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/images/sqlframe_logo.png +0 -0
  34. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/index.md +0 -0
  35. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/postgres.md +0 -0
  36. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/requirements.txt +0 -0
  37. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/standalone.md +0 -0
  38. {sqlframe-1.3.0 → sqlframe-1.4.0}/docs/stylesheets/extra.css +0 -0
  39. {sqlframe-1.3.0 → sqlframe-1.4.0}/mkdocs.yml +0 -0
  40. {sqlframe-1.3.0 → sqlframe-1.4.0}/pytest.ini +0 -0
  41. {sqlframe-1.3.0 → sqlframe-1.4.0}/renovate.json +0 -0
  42. {sqlframe-1.3.0 → sqlframe-1.4.0}/setup.cfg +0 -0
  43. {sqlframe-1.3.0 → sqlframe-1.4.0}/setup.py +0 -0
  44. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/LICENSE +0 -0
  45. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/__init__.py +0 -0
  46. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/__init__.py +0 -0
  47. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/_typing.py +0 -0
  48. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/catalog.py +0 -0
  49. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/column.py +0 -0
  50. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/decorators.py +0 -0
  51. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/exceptions.py +0 -0
  52. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/function_alternatives.py +0 -0
  53. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/functions.py +0 -0
  54. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/group.py +0 -0
  55. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/mixins/__init__.py +0 -0
  56. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/mixins/catalog_mixins.py +0 -0
  57. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/mixins/dataframe_mixins.py +0 -0
  58. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/mixins/readwriter_mixins.py +0 -0
  59. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/normalize.py +0 -0
  60. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/operations.py +0 -0
  61. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/readerwriter.py +0 -0
  62. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/session.py +0 -0
  63. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/transforms.py +0 -0
  64. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/types.py +0 -0
  65. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/util.py +0 -0
  66. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/window.py +0 -0
  67. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/__init__.py +0 -0
  68. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/catalog.py +0 -0
  69. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/column.py +0 -0
  70. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/dataframe.py +0 -0
  71. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/functions.py +0 -0
  72. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/functions.pyi +0 -0
  73. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/group.py +0 -0
  74. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/readwriter.py +0 -0
  75. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/session.py +0 -0
  76. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/types.py +0 -0
  77. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/bigquery/window.py +0 -0
  78. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/__init__.py +0 -0
  79. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/catalog.py +0 -0
  80. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/column.py +0 -0
  81. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/dataframe.py +0 -0
  82. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/functions.py +0 -0
  83. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/functions.pyi +0 -0
  84. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/group.py +0 -0
  85. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/readwriter.py +0 -0
  86. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/session.py +0 -0
  87. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/types.py +0 -0
  88. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/duckdb/window.py +0 -0
  89. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/__init__.py +0 -0
  90. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/catalog.py +0 -0
  91. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/column.py +0 -0
  92. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/dataframe.py +0 -0
  93. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/functions.py +0 -0
  94. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/functions.pyi +0 -0
  95. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/group.py +0 -0
  96. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/readwriter.py +0 -0
  97. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/session.py +0 -0
  98. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/types.py +0 -0
  99. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/postgres/window.py +0 -0
  100. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/__init__.py +0 -0
  101. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/catalog.py +0 -0
  102. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/column.py +0 -0
  103. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/dataframe.py +0 -0
  104. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/functions.py +0 -0
  105. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/group.py +0 -0
  106. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/readwriter.py +0 -0
  107. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/session.py +0 -0
  108. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/types.py +0 -0
  109. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/redshift/window.py +0 -0
  110. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/__init__.py +0 -0
  111. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/catalog.py +0 -0
  112. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/column.py +0 -0
  113. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/dataframe.py +0 -0
  114. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/functions.py +0 -0
  115. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/group.py +0 -0
  116. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/readwriter.py +0 -0
  117. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/session.py +0 -0
  118. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/types.py +0 -0
  119. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/snowflake/window.py +0 -0
  120. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/__init__.py +0 -0
  121. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/catalog.py +0 -0
  122. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/column.py +0 -0
  123. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/dataframe.py +0 -0
  124. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/functions.py +0 -0
  125. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/group.py +0 -0
  126. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/readwriter.py +0 -0
  127. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/session.py +0 -0
  128. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/types.py +0 -0
  129. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/spark/window.py +0 -0
  130. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/__init__.py +0 -0
  131. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/catalog.py +0 -0
  132. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/column.py +0 -0
  133. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/dataframe.py +0 -0
  134. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/functions.py +0 -0
  135. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/group.py +0 -0
  136. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/readwriter.py +0 -0
  137. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/session.py +0 -0
  138. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/types.py +0 -0
  139. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/standalone/window.py +0 -0
  140. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe.egg-info/SOURCES.txt +0 -0
  141. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe.egg-info/dependency_links.txt +0 -0
  142. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe.egg-info/requires.txt +0 -0
  143. {sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe.egg-info/top_level.txt +0 -0
  144. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/__init__.py +0 -0
  145. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/common_fixtures.py +0 -0
  146. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/conftest.py +0 -0
  147. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/fixtures/employee.csv +0 -0
  148. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/fixtures/employee.json +0 -0
  149. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/fixtures/employee.parquet +0 -0
  150. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/fixtures/employee_extra_line.csv +0 -0
  151. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/__init__.py +0 -0
  152. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/__init__.py +0 -0
  153. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/bigquery/__init__.py +0 -0
  154. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/bigquery/test_bigquery_catalog.py +0 -0
  155. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/bigquery/test_bigquery_session.py +0 -0
  156. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/duck/__init__.py +0 -0
  157. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/duck/test_duckdb_catalog.py +0 -0
  158. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/duck/test_duckdb_dataframe.py +0 -0
  159. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/duck/test_duckdb_reader.py +0 -0
  160. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/duck/test_duckdb_session.py +0 -0
  161. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/postgres/__init__.py +0 -0
  162. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/postgres/test_postgres_catalog.py +0 -0
  163. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/postgres/test_postgres_dataframe.py +0 -0
  164. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/postgres/test_postgres_session.py +0 -0
  165. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/redshift/__init__.py +0 -0
  166. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/redshift/test_redshift_catalog.py +0 -0
  167. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/redshift/test_redshift_session.py +0 -0
  168. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/snowflake/__init__.py +0 -0
  169. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/snowflake/test_snowflake_catalog.py +0 -0
  170. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/snowflake/test_snowflake_session.py +0 -0
  171. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/spark/__init__.py +0 -0
  172. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/spark/test_spark_catalog.py +0 -0
  173. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/test_engine_dataframe.py +0 -0
  174. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/test_engine_reader.py +0 -0
  175. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/test_engine_session.py +0 -0
  176. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/test_engine_writer.py +0 -0
  177. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/engines/test_int_functions.py +0 -0
  178. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/fixtures.py +0 -0
  179. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/test_int_dataframe.py +0 -0
  180. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/test_int_dataframe_stats.py +0 -0
  181. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/test_int_grouped_data.py +0 -0
  182. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/integration/test_int_session.py +0 -0
  183. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/types.py +0 -0
  184. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/__init__.py +0 -0
  185. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/__init__.py +0 -0
  186. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/fixtures.py +0 -0
  187. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_column.py +0 -0
  188. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_dataframe.py +0 -0
  189. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_dataframe_writer.py +0 -0
  190. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_functions.py +0 -0
  191. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_session.py +0 -0
  192. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_session_case_sensitivity.py +0 -0
  193. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_types.py +0 -0
  194. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/standalone/test_window.py +0 -0
  195. {sqlframe-1.3.0 → sqlframe-1.4.0}/tests/unit/test_util.py +0 -0
{sqlframe-1.3.0 → sqlframe-1.4.0}/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: sqlframe
- Version: 1.3.0
+ Version: 1.4.0
  Summary: Taking the Spark out of PySpark by converting to SQL
  Home-page: https://github.com/eakmanrq/sqlframe
  Author: Ryan Eakman
@@ -29,19 +29,19 @@ Provides-Extra: spark
  License-File: LICENSE

  <div align="center">
- <img src="https://sqlframe.readthedocs.io/en/latest/docs/images/sqlframe_logo.png" alt="SQLFrame Logo" width="400"/>
+ <img src="https://sqlframe.readthedocs.io/en/stable/docs/images/sqlframe_logo.png" alt="SQLFrame Logo" width="400"/>
  </div>

  SQLFrame implements the PySpark DataFrame API in order to enable running transformation pipelines directly on database engines - no Spark clusters or dependencies required.

  SQLFrame currently supports the following engines (many more in development):

- * [BigQuery](https://sqlframe.readthedocs.io/en/latest/bigquery/)
- * [DuckDB](https://sqlframe.readthedocs.io/en/latest/duckdb)
- * [Postgres](https://sqlframe.readthedocs.io/en/latest/postgres)
+ * [BigQuery](https://sqlframe.readthedocs.io/en/stable/bigquery/)
+ * [DuckDB](https://sqlframe.readthedocs.io/en/stable/duckdb)
+ * [Postgres](https://sqlframe.readthedocs.io/en/stable/postgres)

  SQLFrame also has a "Standalone" session that be used to generate SQL without any connection to a database engine.
- * [Standalone](https://sqlframe.readthedocs.io/en/latest/standalone)
+ * [Standalone](https://sqlframe.readthedocs.io/en/stable/standalone)

  SQLFrame is great for:

@@ -64,6 +64,12 @@ pip install sqlframe

  See specific engine documentation for additional setup instructions.

+ ## Configuration
+
+ SQLFrame generates consistently accurate yet complex SQL for engine execution.
+ However, when using df.sql(), it produces more human-readable SQL.
+ For details on how to configure this output and leverage OpenAI to enhance the SQL, see [Generated SQL Configuration](https://sqlframe.readthedocs.io/en/stable/configuration/#generated-sql).
+
  ## Example Usage

  ```python
{sqlframe-1.3.0 → sqlframe-1.4.0}/README.md
@@ -1,17 +1,17 @@
  <div align="center">
- <img src="https://sqlframe.readthedocs.io/en/latest/docs/images/sqlframe_logo.png" alt="SQLFrame Logo" width="400"/>
+ <img src="https://sqlframe.readthedocs.io/en/stable/docs/images/sqlframe_logo.png" alt="SQLFrame Logo" width="400"/>
  </div>

  SQLFrame implements the PySpark DataFrame API in order to enable running transformation pipelines directly on database engines - no Spark clusters or dependencies required.

  SQLFrame currently supports the following engines (many more in development):

- * [BigQuery](https://sqlframe.readthedocs.io/en/latest/bigquery/)
- * [DuckDB](https://sqlframe.readthedocs.io/en/latest/duckdb)
- * [Postgres](https://sqlframe.readthedocs.io/en/latest/postgres)
+ * [BigQuery](https://sqlframe.readthedocs.io/en/stable/bigquery/)
+ * [DuckDB](https://sqlframe.readthedocs.io/en/stable/duckdb)
+ * [Postgres](https://sqlframe.readthedocs.io/en/stable/postgres)

  SQLFrame also has a "Standalone" session that be used to generate SQL without any connection to a database engine.
- * [Standalone](https://sqlframe.readthedocs.io/en/latest/standalone)
+ * [Standalone](https://sqlframe.readthedocs.io/en/stable/standalone)

  SQLFrame is great for:

@@ -34,6 +34,12 @@ pip install sqlframe

  See specific engine documentation for additional setup instructions.

+ ## Configuration
+
+ SQLFrame generates consistently accurate yet complex SQL for engine execution.
+ However, when using df.sql(), it produces more human-readable SQL.
+ For details on how to configure this output and leverage OpenAI to enhance the SQL, see [Generated SQL Configuration](https://sqlframe.readthedocs.io/en/stable/configuration/#generated-sql).
+
  ## Example Usage

  ```python
{sqlframe-1.3.0 → sqlframe-1.4.0}/docs/configuration.md
@@ -184,46 +184,59 @@ The dialect of the generated SQL will be based on the session's dialect. However
  df.sql(dialect="bigquery")
  ```

- ### OpenAI Enriched
+ ### OpenAI Enrichment

- OpenAI's models can be used to enrich the generated SQL to make it more human-like.
- This is useful when you want to generate SQL that is more readable for humans.
- You must have `OPENAI_API_KEY` set in your environment variables to use this feature.
+ OpenAI's models can be used to enrich the generated SQL to make it more human-like.
+ You can have it just provide more readable CTE names or you can have it try to make the whole SQL statement more readable.
+
+ #### Example

  ```python
  # create session and `df` like normal
  # The model to use defaults to `gpt-4o` but can be changed by passing a string to the `openai_model` parameter.
- >>> df.sql(optimize=False, use_openai=True)
- WITH natality_data AS (
- SELECT
- year,
- ever_born
- FROM `bigquery-public-data`.`samples`.`natality`
- ), single_child_families AS (
+ >>> df.sql(openai_config={"mode": "cte_only", "model": "gpt-3.5-turbo"})
+ WITH `single_child_families_by_year` AS (
  SELECT
- year,
- COUNT(*) AS num_single_child_families
- FROM natality_data
- WHERE ever_born = 1
- GROUP BY year
- ), lagged_families AS (
- SELECT
- year,
- num_single_child_families,
- LAG(num_single_child_families, 1) OVER (ORDER BY year) AS last_year_num_single_child_families
- FROM single_child_families
- ), percent_change_families AS (
+ `natality`.`year` AS `year`,
+ COUNT(*) AS `num_single_child_families`
+ FROM `bigquery-public-data`.`samples`.`natality` AS `natality`
+ WHERE
+ `natality`.`ever_born` = 1
+ GROUP BY
+ `natality`.`year`
+ ), `families_with_percent_change` AS (
  SELECT
- year,
- num_single_child_families,
- ((num_single_child_families - last_year_num_single_child_families) / last_year_num_single_child_families) AS percent_change
- FROM lagged_families
- ORDER BY ABS(percent_change) DESC
+ `single_child_families_by_year`.`year` AS `year`,
+ `single_child_families_by_year`.`num_single_child_families` AS `num_single_child_families`,
+ LAG(`single_child_families_by_year`.`num_single_child_families`, 1) OVER (ORDER BY `single_child_families_by_year`.`year`) AS `last_year_num_single_child_families`
+ FROM `single_child_families_by_year` AS `single_child_families_by_year`
  )
  SELECT
- year,
- FORMAT('%\'.0f', ROUND(CAST(num_single_child_families AS FLOAT64), 0)) AS `new families single child`,
- FORMAT('%\'.2f', ROUND(CAST((percent_change * 100) AS FLOAT64), 2)) AS `percent change`
- FROM percent_change_families
+ `families_with_percent_change`.`year` AS `year`,
+ FORMAT('%\'.0f', ROUND(CAST(`families_with_percent_change`.`num_single_child_families` AS FLOAT64), 0)) AS `new families single child`,
+ FORMAT(
+ '%\'.2f',
+ ROUND(
+ CAST((
+ (
+ (
+ `families_with_percent_change`.`num_single_child_families` - `families_with_percent_change`.`last_year_num_single_child_families`
+ ) / `families_with_percent_change`.`last_year_num_single_child_families`
+ ) * 100
+ ) AS FLOAT64),
+ 2
+ )
+ ) AS `percent change`
+ FROM `families_with_percent_change` AS `families_with_percent_change`
+ ORDER BY
+ ABS(`percent_change`) DESC
  LIMIT 5
  ```
+
+ #### Parameters
+
+ | Parameter | Description | Default |
+ |-------------------|-----------------------------------------------------------------------|------------|
+ | `mode` | The mode to use. Can be `cte_only` or `full`. | `cte_only` |
+ | `model` | The OpenAI model to use. Note: The default may change in new releases | `gpt-4o` |
+ | `prompt_override` | A string to use to override the default prompt. | None |
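
The documented options map onto the new `openai_config` parameter of `df.sql()` added in `sqlframe/base/dataframe.py` (see the diff further below). The following is a minimal sketch of how those options could be exercised, based only on the parameters listed above; the DuckDB session and the toy data are assumptions for illustration, and `OPENAI_API_KEY` must be set in the environment for either OpenAI-backed call to succeed.

```python
# Sketch of the documented `openai_config` options (illustrative, not from the package docs).
from sqlframe.duckdb import DuckDBSession
from sqlframe.base.dataframe import OpenAIConfig, OpenAIMode

session = DuckDBSession()
df = session.createDataFrame([(2022, 1), (2023, 2)], ["year", "ever_born"])

# Default: plain generated SQL, no OpenAI involved.
print(df.sql())

# Dict form: only rename CTE aliases (the `cte_only` mode from the table above).
print(df.sql(openai_config={"mode": "cte_only", "model": "gpt-4o"}))

# Dataclass form: ask for a full rewrite of the statement.
print(df.sql(openai_config=OpenAIConfig(mode=OpenAIMode.FULL)))
```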
{sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/_version.py
@@ -12,5 +12,5 @@ __version__: str
  __version_tuple__: VERSION_TUPLE
  version_tuple: VERSION_TUPLE

- __version__ = version = '1.3.0'
- __version_tuple__ = version_tuple = (1, 3, 0)
+ __version__ = version = '1.4.0'
+ __version_tuple__ = version_tuple = (1, 4, 0)
{sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe/base/dataframe.py
@@ -2,13 +2,16 @@

  from __future__ import annotations

+ import enum
  import functools
  import itertools
+ import json
  import logging
  import sys
  import typing as t
  import zlib
  from copy import copy
+ from dataclasses import dataclass

  import sqlglot
  from prettytable import PrettyTable
@@ -75,6 +78,46 @@ JOIN_HINTS = {
  DF = t.TypeVar("DF", bound="_BaseDataFrame")


+ class OpenAIMode(enum.Enum):
+     CTE_ONLY = "cte_only"
+     FULL = "full"
+
+     @property
+     def is_cte_only(self) -> bool:
+         return self == OpenAIMode.CTE_ONLY
+
+     @property
+     def is_full(self) -> bool:
+         return self == OpenAIMode.FULL
+
+
+ @dataclass
+ class OpenAIConfig:
+     mode: OpenAIMode = OpenAIMode.CTE_ONLY
+     model: str = "gpt-4o"
+     prompt_override: t.Optional[str] = None
+
+     @classmethod
+     def from_dict(cls, config: t.Dict[str, t.Any]) -> OpenAIConfig:
+         if "mode" in config:
+             config["mode"] = OpenAIMode(config["mode"].lower())
+         return cls(**config)
+
+     def get_prompt(self, dialect: Dialect) -> str:
+         if self.prompt_override:
+             return self.prompt_override
+         if self.mode.is_cte_only:
+             return f"You are a backend tool that creates unique CTE alias names match what a human would write and in snake case. You respond without code blocks and only a json payload with the key being the CTE name that is being replaced and the value being the new CTE human readable name."
+         return f"""
+ You are a backend tool that converts correct {dialect} SQL to simplified and more human readable version.
+ You respond without code block with rewritten {dialect} SQL.
+ You don't change any column names in the final select because the user expects those to remain the same.
+ You make unique CTE alias names match what a human would write and in snake case.
+ You improve formatting with spacing and line-breaks.
+ You remove redundant parenthesis and aliases.
+ When remove extra quotes, make sure to keep quotes around words that could be reserved words"""
+
+
  class _BaseDataFrameNaFunctions(t.Generic[DF]):
      def __init__(self, df: DF):
          self.df = df
@@ -476,8 +519,7 @@ class _BaseDataFrame(t.Generic[SESSION, WRITER, NA, STAT, GROUP_DATA]):
          dialect: DialectType = None,
          optimize: bool = True,
          pretty: bool = True,
-         use_openai: bool = False,
-         openai_model: str = "gpt-4o",
+         openai_config: t.Optional[t.Union[t.Dict[str, t.Any], OpenAIConfig]] = None,
          as_list: bool = False,
          **kwargs,
      ) -> t.Union[str, t.List[str]]:
@@ -487,6 +529,11 @@ class _BaseDataFrame(t.Generic[SESSION, WRITER, NA, STAT, GROUP_DATA]):
          select_expressions = df._get_select_expressions()
          output_expressions: t.List[t.Union[exp.Select, exp.Cache, exp.Drop]] = []
          replacement_mapping: t.Dict[exp.Identifier, exp.Identifier] = {}
+         openai_config = (
+             OpenAIConfig.from_dict(openai_config)
+             if openai_config is not None and isinstance(openai_config, dict)
+             else openai_config
+         )

          for expression_type, select_expression in select_expressions:
              select_expression = select_expression.transform(
@@ -497,7 +544,7 @@ class _BaseDataFrame(t.Generic[SESSION, WRITER, NA, STAT, GROUP_DATA]):
                  select_expression = t.cast(
                      exp.Select, self.session._optimize(select_expression, dialect=dialect)
                  )
-             elif use_openai:
+             elif openai_config:
                  qualify(select_expression, dialect=dialect, schema=self.session.catalog._schema)
                  pushdown_projections(select_expression, schema=self.session.catalog._schema)

@@ -556,35 +603,32 @@ class _BaseDataFrame(t.Generic[SESSION, WRITER, NA, STAT, GROUP_DATA]):
          results = []
          for expression in output_expressions:
              sql = expression.sql(dialect=dialect, pretty=pretty, **kwargs)
-             if use_openai:
+             if openai_config:
+                 assert isinstance(openai_config, OpenAIConfig)
                  verify_openai_installed()
                  from openai import OpenAI

                  client = OpenAI()
-                 prompt = f"""
-                 You are a backend tool that converts correct {dialect} SQL to simplified and more human readable version.
-                 You respond without code block with rewritten {dialect} SQL.
-                 You don't change any column names in the final select because the user expects those to remain the same.
-                 You make unique CTE alias names match what a human would write and in snake case.
-                 You improve formatting with spacing and line-breaks.
-                 You remove redundant parenthesis and aliases.
-                 When remove extra quotes, make sure to keep quotes around words that could be reserved words
-                 """
                  chat_completed = client.chat.completions.create(
                      messages=[
-                         {
+                         {  # type: ignore
                              "role": "system",
-                             "content": prompt,
+                             "content": openai_config.get_prompt(dialect),
                          },
                          {
                              "role": "user",
                              "content": sql,
                          },
                      ],
-                     model=openai_model,
+                     model=openai_config.model,
                  )
                  assert chat_completed.choices[0].message.content is not None
-                 sql = chat_completed.choices[0].message.content
+                 if openai_config.mode.is_cte_only:
+                     cte_replacement_mapping = json.loads(chat_completed.choices[0].message.content)
+                     for old_name, new_name in cte_replacement_mapping.items():
+                         sql = sql.replace(old_name, new_name)
+                 else:
+                     sql = chat_completed.choices[0].message.content
              results.append(sql)

          if as_list:
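
The most notable behavioral change above is the `cte_only` branch: instead of asking the model to rewrite the whole statement, SQLFrame now asks it for a JSON payload mapping generated CTE aliases to readable names and patches the SQL string locally. Below is a standalone sketch of that post-processing step; the input SQL and the model response are made-up stand-ins, not output captured from the library.

```python
# Sketch of the `cte_only` post-processing step from the diff above.
import json

sql = "WITH `t26614` AS (SELECT `a`.`year` AS `year` FROM `a` AS `a`) SELECT * FROM `t26614`"
model_response = '{"t26614": "years_from_a"}'  # assumed OpenAI chat completion content

# Same replacement loop as in `_BaseDataFrame.sql` when mode is `cte_only`.
cte_replacement_mapping = json.loads(model_response)
for old_name, new_name in cte_replacement_mapping.items():
    sql = sql.replace(old_name, new_name)

print(sql)
# WITH `years_from_a` AS (SELECT `a`.`year` AS `year` FROM `a` AS `a`) SELECT * FROM `years_from_a`
```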
{sqlframe-1.3.0 → sqlframe-1.4.0}/sqlframe.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.1
  Name: sqlframe
- Version: 1.3.0
+ Version: 1.4.0
  Summary: Taking the Spark out of PySpark by converting to SQL
  Home-page: https://github.com/eakmanrq/sqlframe
  Author: Ryan Eakman
@@ -29,19 +29,19 @@ Provides-Extra: spark
  License-File: LICENSE

  <div align="center">
- <img src="https://sqlframe.readthedocs.io/en/latest/docs/images/sqlframe_logo.png" alt="SQLFrame Logo" width="400"/>
+ <img src="https://sqlframe.readthedocs.io/en/stable/docs/images/sqlframe_logo.png" alt="SQLFrame Logo" width="400"/>
  </div>

  SQLFrame implements the PySpark DataFrame API in order to enable running transformation pipelines directly on database engines - no Spark clusters or dependencies required.

  SQLFrame currently supports the following engines (many more in development):

- * [BigQuery](https://sqlframe.readthedocs.io/en/latest/bigquery/)
- * [DuckDB](https://sqlframe.readthedocs.io/en/latest/duckdb)
- * [Postgres](https://sqlframe.readthedocs.io/en/latest/postgres)
+ * [BigQuery](https://sqlframe.readthedocs.io/en/stable/bigquery/)
+ * [DuckDB](https://sqlframe.readthedocs.io/en/stable/duckdb)
+ * [Postgres](https://sqlframe.readthedocs.io/en/stable/postgres)

  SQLFrame also has a "Standalone" session that be used to generate SQL without any connection to a database engine.
- * [Standalone](https://sqlframe.readthedocs.io/en/latest/standalone)
+ * [Standalone](https://sqlframe.readthedocs.io/en/stable/standalone)

  SQLFrame is great for:

@@ -64,6 +64,12 @@ pip install sqlframe

  See specific engine documentation for additional setup instructions.

+ ## Configuration
+
+ SQLFrame generates consistently accurate yet complex SQL for engine execution.
+ However, when using df.sql(), it produces more human-readable SQL.
+ For details on how to configure this output and leverage OpenAI to enhance the SQL, see [Generated SQL Configuration](https://sqlframe.readthedocs.io/en/stable/configuration/#generated-sql).
+
  ## Example Usage

  ```python