Flowfile 0.2.2__py3-none-any.whl → 0.3.0.1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of Flowfile might be problematic.

Files changed (149)
  1. flowfile/__init__.py +14 -7
  2. flowfile/__main__.py +51 -15
  3. flowfile/api.py +379 -0
  4. flowfile/web/__init__.py +155 -0
  5. flowfile/web/static/assets/AirbyteReader-1ac35765.css +314 -0
  6. flowfile/web/static/assets/AirbyteReader-cb0c1d4a.js +921 -0
  7. flowfile/web/static/assets/CrossJoin-41efa4cb.css +100 -0
  8. flowfile/web/static/assets/CrossJoin-a514fa59.js +153 -0
  9. flowfile/web/static/assets/DatabaseConnectionSettings-0c04b2e5.css +77 -0
  10. flowfile/web/static/assets/DatabaseConnectionSettings-f2cecf33.js +151 -0
  11. flowfile/web/static/assets/DatabaseManager-30fa27e5.css +64 -0
  12. flowfile/web/static/assets/DatabaseManager-83ee3c98.js +484 -0
  13. flowfile/web/static/assets/DatabaseReader-dc0c6881.js +426 -0
  14. flowfile/web/static/assets/DatabaseReader-f50c6558.css +158 -0
  15. flowfile/web/static/assets/DatabaseWriter-2f570e53.css +96 -0
  16. flowfile/web/static/assets/DatabaseWriter-5afe9f8d.js +312 -0
  17. flowfile/web/static/assets/ExploreData-5bdae813.css +45 -0
  18. flowfile/web/static/assets/ExploreData-c7ee19cf.js +118306 -0
  19. flowfile/web/static/assets/ExternalSource-17b23a01.js +225 -0
  20. flowfile/web/static/assets/ExternalSource-e37b6275.css +94 -0
  21. flowfile/web/static/assets/Filter-90856b4f.js +238 -0
  22. flowfile/web/static/assets/Filter-a9d08ba1.css +20 -0
  23. flowfile/web/static/assets/Formula-38b71e9e.js +197 -0
  24. flowfile/web/static/assets/Formula-d60a74f4.css +17 -0
  25. flowfile/web/static/assets/FuzzyMatch-6857de82.css +254 -0
  26. flowfile/web/static/assets/FuzzyMatch-d0f1fe81.js +422 -0
  27. flowfile/web/static/assets/GoogleSheet-854294a4.js +2616 -0
  28. flowfile/web/static/assets/GoogleSheet-92084da7.css +233 -0
  29. flowfile/web/static/assets/GraphSolver-0c86bbc6.js +382 -0
  30. flowfile/web/static/assets/GraphSolver-17fd26db.css +68 -0
  31. flowfile/web/static/assets/GroupBy-ab1ea74b.css +51 -0
  32. flowfile/web/static/assets/GroupBy-f2772e9f.js +413 -0
  33. flowfile/web/static/assets/Join-41c0f331.css +109 -0
  34. flowfile/web/static/assets/Join-bc3e1cf7.js +247 -0
  35. flowfile/web/static/assets/ManualInput-03aa0245.js +391 -0
  36. flowfile/web/static/assets/ManualInput-ac7b9972.css +84 -0
  37. flowfile/web/static/assets/Output-48f81019.css +2642 -0
  38. flowfile/web/static/assets/Output-5b35eee8.js +536 -0
  39. flowfile/web/static/assets/Pivot-7164087c.js +408 -0
  40. flowfile/web/static/assets/Pivot-f415e85f.css +35 -0
  41. flowfile/web/static/assets/PolarsCode-3abf6507.js +2863 -0
  42. flowfile/web/static/assets/PolarsCode-650322d1.css +35 -0
  43. flowfile/web/static/assets/PopOver-b37ff9be.js +577 -0
  44. flowfile/web/static/assets/PopOver-bccfde04.css +32 -0
  45. flowfile/web/static/assets/Read-65966a3e.js +701 -0
  46. flowfile/web/static/assets/Read-80dc1675.css +197 -0
  47. flowfile/web/static/assets/RecordCount-c66c6d6d.js +121 -0
  48. flowfile/web/static/assets/RecordId-826dc095.js +339 -0
  49. flowfile/web/static/assets/Sample-4ed555c8.js +184 -0
  50. flowfile/web/static/assets/SecretManager-eac1e97d.js +382 -0
  51. flowfile/web/static/assets/Select-085f05cc.js +231 -0
  52. flowfile/web/static/assets/SettingsSection-1f5e79c1.js +87 -0
  53. flowfile/web/static/assets/SettingsSection-9c836ecc.css +47 -0
  54. flowfile/web/static/assets/Sort-3e6cb414.js +309 -0
  55. flowfile/web/static/assets/Sort-7ccfa0fe.css +51 -0
  56. flowfile/web/static/assets/TextToRows-606349bc.js +307 -0
  57. flowfile/web/static/assets/TextToRows-c92d1ec2.css +48 -0
  58. flowfile/web/static/assets/UnavailableFields-5edd5322.css +49 -0
  59. flowfile/web/static/assets/UnavailableFields-b41976ed.js +36 -0
  60. flowfile/web/static/assets/Union-8d9ac7f9.css +30 -0
  61. flowfile/web/static/assets/Union-fca91665.js +145 -0
  62. flowfile/web/static/assets/Unique-a59f830e.js +273 -0
  63. flowfile/web/static/assets/Unique-b5615727.css +51 -0
  64. flowfile/web/static/assets/Unpivot-246e9bbd.css +77 -0
  65. flowfile/web/static/assets/Unpivot-c3815565.js +441 -0
  66. flowfile/web/static/assets/airbyte-292aa232.png +0 -0
  67. flowfile/web/static/assets/api-22b338bd.js +60 -0
  68. flowfile/web/static/assets/cross_join-d30c0290.png +0 -0
  69. flowfile/web/static/assets/database_reader-ce1e55f3.svg +24 -0
  70. flowfile/web/static/assets/database_writer-b4ad0753.svg +23 -0
  71. flowfile/web/static/assets/designer-2394122a.css +10697 -0
  72. flowfile/web/static/assets/designer-e5bbe26f.js +69712 -0
  73. flowfile/web/static/assets/documentation-08045cf2.js +33 -0
  74. flowfile/web/static/assets/documentation-12216a74.css +50 -0
  75. flowfile/web/static/assets/dropDown-35135ba8.css +143 -0
  76. flowfile/web/static/assets/dropDown-5e7e9a5a.js +319 -0
  77. flowfile/web/static/assets/dropDownGeneric-50a91b99.js +72 -0
  78. flowfile/web/static/assets/dropDownGeneric-895680d6.css +10 -0
  79. flowfile/web/static/assets/element-icons-9c88a535.woff +0 -0
  80. flowfile/web/static/assets/element-icons-de5eb258.ttf +0 -0
  81. flowfile/web/static/assets/explore_data-8a0a2861.png +0 -0
  82. flowfile/web/static/assets/fa-brands-400-808443ae.ttf +0 -0
  83. flowfile/web/static/assets/fa-brands-400-d7236a19.woff2 +0 -0
  84. flowfile/web/static/assets/fa-regular-400-54cf6086.ttf +0 -0
  85. flowfile/web/static/assets/fa-regular-400-e3456d12.woff2 +0 -0
  86. flowfile/web/static/assets/fa-solid-900-aa759986.woff2 +0 -0
  87. flowfile/web/static/assets/fa-solid-900-d2f05935.ttf +0 -0
  88. flowfile/web/static/assets/fa-v4compatibility-0ce9033c.woff2 +0 -0
  89. flowfile/web/static/assets/fa-v4compatibility-30f6abf6.ttf +0 -0
  90. flowfile/web/static/assets/filter-d7708bda.png +0 -0
  91. flowfile/web/static/assets/formula-eeeb1611.png +0 -0
  92. flowfile/web/static/assets/fullEditor-178376bb.css +256 -0
  93. flowfile/web/static/assets/fullEditor-705c6ccb.js +630 -0
  94. flowfile/web/static/assets/fuzzy_match-40c161b2.png +0 -0
  95. flowfile/web/static/assets/genericNodeSettings-65587f20.js +137 -0
  96. flowfile/web/static/assets/genericNodeSettings-924759c7.css +46 -0
  97. flowfile/web/static/assets/graph_solver-8b7888b8.png +0 -0
  98. flowfile/web/static/assets/group_by-80561fc3.png +0 -0
  99. flowfile/web/static/assets/index-552863fd.js +58652 -0
  100. flowfile/web/static/assets/index-681a3ed0.css +8843 -0
  101. flowfile/web/static/assets/input_data-ab2eb678.png +0 -0
  102. flowfile/web/static/assets/join-349043ae.png +0 -0
  103. flowfile/web/static/assets/manual_input-ae98f31d.png +0 -0
  104. flowfile/web/static/assets/nodeTitle-cf9bae3c.js +227 -0
  105. flowfile/web/static/assets/nodeTitle-f4b12bcb.css +134 -0
  106. flowfile/web/static/assets/old_join-5d0eb604.png +0 -0
  107. flowfile/web/static/assets/output-06ec0371.png +0 -0
  108. flowfile/web/static/assets/pivot-9660df51.png +0 -0
  109. flowfile/web/static/assets/polars_code-05ce5dc6.png +0 -0
  110. flowfile/web/static/assets/record_count-dab44eb5.png +0 -0
  111. flowfile/web/static/assets/record_id-0b15856b.png +0 -0
  112. flowfile/web/static/assets/sample-693a88b5.png +0 -0
  113. flowfile/web/static/assets/secretApi-3ad510e1.js +46 -0
  114. flowfile/web/static/assets/select-b0d0437a.png +0 -0
  115. flowfile/web/static/assets/selectDynamic-b062bc9b.css +107 -0
  116. flowfile/web/static/assets/selectDynamic-bd644891.js +302 -0
  117. flowfile/web/static/assets/sort-2aa579f0.png +0 -0
  118. flowfile/web/static/assets/summarize-2a099231.png +0 -0
  119. flowfile/web/static/assets/text_to_rows-859b29ea.png +0 -0
  120. flowfile/web/static/assets/union-2d8609f4.png +0 -0
  121. flowfile/web/static/assets/unique-1958b98a.png +0 -0
  122. flowfile/web/static/assets/unpivot-d3cb4b5b.png +0 -0
  123. flowfile/web/static/assets/view-7a0f0be1.png +0 -0
  124. flowfile/web/static/assets/vue-codemirror.esm-dd17b478.js +22281 -0
  125. flowfile/web/static/assets/vue-content-loader.es-6b36f05e.js +210 -0
  126. flowfile/web/static/flowfile.svg +47 -0
  127. flowfile/web/static/icons/flowfile.png +0 -0
  128. flowfile/web/static/images/airbyte.png +0 -0
  129. flowfile/web/static/images/flowfile.svg +47 -0
  130. flowfile/web/static/images/google.svg +1 -0
  131. flowfile/web/static/images/sheets.png +0 -0
  132. flowfile/web/static/index.html +22 -0
  133. flowfile/web/static/vite.svg +1 -0
  134. flowfile/web/static/vue.svg +1 -0
  135. flowfile-0.3.0.1.dist-info/METADATA +219 -0
  136. {flowfile-0.2.2.dist-info → flowfile-0.3.0.1.dist-info}/RECORD +147 -16
  137. {flowfile-0.2.2.dist-info → flowfile-0.3.0.1.dist-info}/entry_points.txt +1 -1
  138. flowfile_core/configs/settings.py +7 -32
  139. flowfile_core/flowfile/FlowfileFlow.py +4 -2
  140. flowfile_core/flowfile/analytics/analytics_processor.py +1 -1
  141. flowfile_core/main.py +4 -1
  142. flowfile_core/schemas/input_schema.py +1 -8
  143. flowfile_frame/__init__.py +1 -2
  144. flowfile_frame/flow_frame.py +6 -6
  145. flowfile_frame/utils.py +1 -140
  146. flowfile-0.2.2.dist-info/METADATA +0 -225
  147. flowfile_frame/__main__.py +0 -12
  148. {flowfile-0.2.2.dist-info → flowfile-0.3.0.1.dist-info}/LICENSE +0 -0
  149. {flowfile-0.2.2.dist-info → flowfile-0.3.0.1.dist-info}/WHEEL +0 -0
@@ -164,8 +164,7 @@ class OutputSettings(BaseModel):
 
     @model_validator(mode='after')
     def populate_abs_file_path(self):
-        if not self.abs_file_path:
-            self.set_absolute_filepath()
+        self.set_absolute_filepath()
         return self
 
 
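The hunk above drops the `if not self.abs_file_path` guard, so the validator now recomputes the path on every validation instead of only when it is missing. The sketch below illustrates that difference with a plain dataclass rather than pydantic. The class `OutputSettingsSketch`, its `directory` and `name` fields, and the body of `set_absolute_filepath` are assumptions made for illustration; the diff does not show the package's real implementation.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputSettingsSketch:
    """Hypothetical stand-in for OutputSettings; not the package's model."""
    directory: str
    name: str
    abs_file_path: Optional[str] = None

    def set_absolute_filepath(self) -> None:
        # Assumed behaviour: derive the absolute path from directory and name.
        self.abs_file_path = os.path.abspath(os.path.join(self.directory, self.name))

    def populate_old(self) -> "OutputSettingsSketch":
        # 0.2.2 shape: only fill the path in when it is missing.
        if not self.abs_file_path:
            self.set_absolute_filepath()
        return self

    def populate_new(self) -> "OutputSettingsSketch":
        # 0.3.0.1 shape: always recompute, so a supplied value is overwritten.
        self.set_absolute_filepath()
        return self


stale = os.path.abspath(os.path.join("old_dir", "out.csv"))
expected = os.path.abspath(os.path.join("new_dir", "out.csv"))

old_result = OutputSettingsSketch("new_dir", "out.csv", abs_file_path=stale).populate_old()
new_result = OutputSettingsSketch("new_dir", "out.csv", abs_file_path=stale).populate_new()
```

Under these assumptions, the old shape keeps a stale `abs_file_path` that was passed in, while the new shape always derives it from the other fields.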
@@ -297,12 +296,6 @@ class DatabaseSettings(BaseModel):
     query: Optional[str] = None
     query_mode: Literal['query', 'table', 'reference'] = 'table'
 
-    @model_validator(mode='after')
-    def validate_table_or_query(self):
-        if (not self.table_name and not self.query) and self.query_mode == 'inline':
-            raise ValueError("Either 'table' or 'query' must be provided")
-        return self
-
     @model_validator(mode='after')
     def validate_table_or_query(self):
         # Validate that either table_name or query is provided
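The removed block defined `validate_table_or_query` twice in the same class body. In plain Python, a second `def` with the same name rebinds that name, so only the later definition survives on the class. The removed condition also compared `query_mode` with `'inline'`, which is not one of the values in the `Literal` shown in the hunk. The sketch below shows only the name-shadowing rule. `SettingsSketch` is a hypothetical class, and whether pydantic registers decorated validators from the final class namespace in the same way is an assumption not verified here.

```python
class SettingsSketch:
    """Hypothetical class; only the duplicate-name pattern mirrors the diff."""

    def validate_table_or_query(self) -> str:
        return "first definition"

    def validate_table_or_query(self) -> str:  # noqa: F811 (deliberate redefinition)
        return "second definition"


# Only one attribute with that name exists on the class, and it is the later one.
surviving = SettingsSketch().validate_table_or_query()
definitions = [name for name in vars(SettingsSketch) if name == "validate_table_or_query"]
```

If that assumption holds, deleting the first definition is a cleanup rather than a behaviour change.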
@@ -4,7 +4,7 @@
 # Core classes
 from flowfile_frame.flow_frame import FlowFrame # noqa: F401
 
-from flowfile_frame.utils import create_etl_graph # noqa: F401
+from flowfile_frame.utils import create_flow_graph # noqa: F401
 
 # Commonly used functions
 from flowfile_frame.expr import ( # noqa: F401
@@ -26,7 +26,6 @@ from flowfile_frame.flow_frame import ( # noqa: F401
     read_csv, read_parquet, from_dict, concat
 )
 
-# Import Polars data types for user convenience
 from polars.datatypes import ( # noqa: F401
     # Integer types
     Int8, Int16, Int32, Int64, Int128,
@@ -16,7 +16,7 @@ from flowfile_core.schemas import input_schema, transform_schema
 from flowfile_frame.expr import Expr, Column, lit, col
 from flowfile_frame.selectors import Selector
 from flowfile_frame.group_frame import GroupByFrame
-from flowfile_frame.utils import _parse_inputs_as_iterable, create_etl_graph
+from flowfile_frame.utils import _parse_inputs_as_iterable, create_flow_graph
 from flowfile_frame.join import _normalize_columns_to_list, _create_join_mappings
 
 node_id_counter = 0
@@ -92,7 +92,7 @@ class FlowFrame:
 
         # Create a new flow graph if none is provided
         if flow_graph is None:
-            flow_graph = create_etl_graph()
+            flow_graph = create_flow_graph()
 
         flow_id = flow_graph.flow_id
 
@@ -198,7 +198,7 @@ class FlowFrame:
 
         # Initialize graph
         if flow_graph is None:
-            flow_graph = create_etl_graph()
+            flow_graph = create_flow_graph()
         self.flow_graph = flow_graph
         # Set up data
         if isinstance(data, FlowDataEngine):
@@ -1922,7 +1922,7 @@ def read_csv(file_path, *, flow_graph: FlowGraph = None, separator: str = ';',
     # Create new node ID
     node_id = generate_node_id()
     if flow_graph is None:
-        flow_graph = create_etl_graph()
+        flow_graph = create_flow_graph()
 
     flow_id = flow_graph.flow_id
 
@@ -1982,7 +1982,7 @@ def read_parquet(file_path, *, flow_graph: FlowGraph = None, description: str =
     node_id = generate_node_id()
 
     if flow_graph is None:
-        flow_graph = create_etl_graph()
+        flow_graph = create_flow_graph()
 
     flow_id = flow_graph.flow_id
 
@@ -2028,7 +2028,7 @@ def from_dict(data, *, flow_graph: FlowGraph = None, description: str = None) ->
     node_id = generate_node_id()
 
     if not flow_graph:
-        flow_graph = create_etl_graph()
+        flow_graph = create_flow_graph()
     flow_id = flow_graph.flow_id
 
     input_node = input_schema.NodeManualInput(
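Every hunk above swaps `create_etl_graph` for `create_flow_graph`, and the diff shows no alias kept under the old name, so code importing the old name would presumably fail against 0.3.0.1. The sketch below shows one way a caller could tolerate both releases. The helper `resolve_renamed` and the `SimpleNamespace` stand-ins are hypothetical and exist only to make the example runnable without the package installed.

```python
from types import SimpleNamespace
from typing import Any, Callable


def resolve_renamed(module: Any, new_name: str, old_name: str) -> Callable[..., Any]:
    """Return module.<new_name> if it is callable, else module.<old_name>."""
    for name in (new_name, old_name):
        candidate = getattr(module, name, None)
        if callable(candidate):
            return candidate
    raise AttributeError(f"neither {new_name!r} nor {old_name!r} found")


# Stand-ins for flowfile_frame.utils in each release (hypothetical objects).
utils_0_2_2 = SimpleNamespace(create_etl_graph=lambda: "graph from 0.2.2")
utils_0_3_0_1 = SimpleNamespace(create_flow_graph=lambda: "graph from 0.3.0.1")

old_factory = resolve_renamed(utils_0_2_2, "create_flow_graph", "create_etl_graph")
new_factory = resolve_renamed(utils_0_3_0_1, "create_flow_graph", "create_etl_graph")
```

In real code the first argument would be the imported `flowfile_frame.utils` module. Pinning the package version is the simpler alternative when only one release needs to be supported.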
flowfile_frame/utils.py CHANGED
@@ -33,7 +33,7 @@ def _generate_id() -> int:
     return int(uuid.uuid4().int % 100000)
 
 
-def create_etl_graph() -> FlowGraph:
+def create_flow_graph() -> FlowGraph:
     flow_id = _generate_id()
     flow_settings = schemas.FlowSettings(
         flow_id=flow_id,
@@ -43,142 +43,3 @@ def create_etl_graph() -> FlowGraph:
     flow_graph = FlowGraph(flow_id=flow_id, flow_settings=flow_settings)
     flow_graph.flow_settings.execution_location = 'local' # always create a local frame so that the run time does not attempt to use the flowfile_worker process
     return flow_graph
-
-
-def is_flowfile_running() -> bool:
-    """Check if the Flowfile application is running by testing its API endpoint."""
-    try:
-        response = requests.get("http://0.0.0.0:63578/docs", timeout=2)
-        return response.status_code == 200
-    except (requests.ConnectionError, requests.Timeout):
-        return False
-
-
-def start_flowfile_application() -> bool:
-    """Start the Flowfile application on macOS."""
-    try:
-        # Attempt to start the Flowfile application
-        subprocess.Popen(['open', '-a', 'Flowfile'],
-                         stdout=subprocess.PIPE,
-                         stderr=subprocess.PIPE)
-
-        # Wait for the application to start up (max 10 seconds)
-        start_time = time.time()
-        while time.time() - start_time < 10:
-            if is_flowfile_running():
-                return True
-            time.sleep(0.5) # Check every half second
-
-        # If we get here, the app didn't start in time
-        return False
-    except Exception as e:
-        print(f"Error starting Flowfile application: {e}")
-        return False
-
-
-def get_auth_token() -> Optional[str]:
-    """Get an authentication token from the Flowfile API."""
-    try:
-        response = requests.post(
-            "http://0.0.0.0:63578/auth/token",
-            json={}, # Empty body as specified
-            timeout=5
-        )
-
-        if response.status_code == 200:
-            token_data = response.json()
-            return token_data.get("access_token")
-        else:
-            print(f"Failed to get auth token: {response.status_code} - {response.text}")
-            return None
-    except Exception as e:
-        print(f"Error getting auth token: {e}")
-        return None
-
-
-def import_flow_to_editor(flow_path: str, auth_token: str) -> Optional[int]:
-    """Import the flow into the Flowfile editor using the API endpoint."""
-    try:
-        flow_path = Path(flow_path).resolve() # Get absolute path
-        if not flow_path.exists():
-            print(f"Flow file not found: {flow_path}")
-            return None
-
-        # Set authorization header with the token
-        headers = {"Authorization": f"Bearer {auth_token}"}
-
-        # Make a GET request to the import endpoint
-        response = requests.get(
-            "http://0.0.0.0:63578/import_flow/",
-            params={"flow_path": str(flow_path)},
-            headers=headers,
-            timeout=10
-        )
-
-        if response.status_code == 200:
-            flow_id = response.json()
-            print(f"Flow imported successfully with ID: {flow_id}")
-            return flow_id
-        else:
-            print(f"Failed to import flow: {response.status_code} - {response.text}")
-            return None
-    except Exception as e:
-        print(f"Error importing flow: {e}")
-        return None
-
-
-def open_graph_in_editor(etl_graph: FlowGraph, storage_location: str = None) -> bool:
-    """
-    Save the ETL graph and open it in the Flowfile editor.
-
-    Parameters:
-    -----------
-    etl_graph : FlowGraph
-        The graph to save and open
-    storage_location : str, optional
-        Where to save the flowfile. If None, a default name is used.
-
-    Returns:
-    --------
-    bool
-        True if the graph was successfully opened in the editor, False otherwise
-    """
-    # Create a temporary directory if needed
-    temp_dir = None
-    if storage_location is None:
-        temp_dir = TemporaryDirectory()
-        storage_location = os.path.join(temp_dir.name, 'temp_flow.flowfile')
-    else:
-        # Ensure path is absolute
-        storage_location = os.path.abspath(storage_location)
-
-    etl_graph.apply_layout()
-    etl_graph.save_flow(storage_location)
-    print(f"Flow saved to: {storage_location}")
-
-    # Check if Flowfile is running, and start it if not
-    if not is_flowfile_running():
-        print("Flowfile application is not running. Starting it...")
-        if not start_flowfile_application():
-            print("Failed to start Flowfile application")
-            if temp_dir:
-                temp_dir.cleanup()
-            return False
-        print("Flowfile application started successfully")
-
-    # Get authentication token
-    auth_token = get_auth_token()
-    if not auth_token:
-        print("Failed to authenticate with Flowfile API")
-        if temp_dir:
-            temp_dir.cleanup()
-        return False
-
-    # Import the flow into the editor
-    flow_id = import_flow_to_editor(storage_location, auth_token)
-
-    # Clean up temporary directory if we created one
-    if temp_dir:
-        temp_dir.cleanup()
-
-    return flow_id is not None
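The removed `start_flowfile_application` waited for the service by polling `is_flowfile_running()` every half second for up to ten seconds. The sketch below isolates that poll-until-deadline pattern as a generic helper. The name `wait_until` and its parameters are hypothetical, and it uses `time.monotonic()` where the removed code used `time.time()`, a swapped-in choice so the deadline is unaffected by system clock adjustments.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float = 10.0, interval: float = 0.5) -> bool:
    """Poll check() until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, max(0.0, deadline - time.monotonic())))


# A fake probe that succeeds on its third call stands in for an HTTP health check.
attempts = {"count": 0}


def fake_probe() -> bool:
    attempts["count"] += 1
    return attempts["count"] >= 3


started = wait_until(fake_probe, timeout=1.0, interval=0.01)
never = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

Passing the probe as a callable keeps the timing logic separate from the HTTP request, which makes the loop testable without a running server.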
@@ -1,225 +0,0 @@
-Metadata-Version: 2.3
-Name: Flowfile
-Version: 0.2.2
-Summary: Project combining flowfile core (backend) and flowfile_worker (compute offloader) and flowfile_frame (api)
-Author: Edward van Eechoud
-Author-email: evaneechoud@gmail.com
-Requires-Python: >=3.10,<3.13
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Requires-Dist: XlsxWriter (>=3.2.0,<3.3.0)
-Requires-Dist: aiofiles (>=24.1.0,<25.0.0)
-Requires-Dist: airbyte-cdk (==6.47.2)
-Requires-Dist: bcrypt (>=4.3.0,<5.0.0)
-Requires-Dist: connectorx (>=0.4.2,<0.5.0)
-Requires-Dist: databases (>=0.9.0,<0.10.0)
-Requires-Dist: faker (>=23.1.0,<23.2.0)
-Requires-Dist: fastapi (>=0.115.2,<0.116.0)
-Requires-Dist: fastexcel (>=0.12.0,<0.13.0)
-Requires-Dist: google-api-python-client (>=2.149.0,<2.150.0)
-Requires-Dist: gspread (>=6.1.3,<6.2.0)
-Requires-Dist: loky (>=3.4.1,<3.5.0)
-Requires-Dist: methodtools (>=0.4.7,<0.5.0)
-Requires-Dist: openpyxl (>=3.1.2,<3.2.0)
-Requires-Dist: passlib (>=1.7.4,<1.8.0)
-Requires-Dist: pendulum (==2.1.2) ; python_version < "3.12"
-Requires-Dist: polars (>1.8.2,<=1.25.2)
-Requires-Dist: polars-distance (>=0.4.3,<0.5.0)
-Requires-Dist: polars-ds (>=0.6.0)
-Requires-Dist: polars-expr-transformer (>0.4.7.0)
-Requires-Dist: polars-grouper (>=0.3.0,<0.4.0)
-Requires-Dist: polars_simed (>=0.3.4,<0.4.0)
-Requires-Dist: pyairbyte-flowfile (==0.20.2)
-Requires-Dist: pyarrow (>=18.0.0,<19.0.0)
-Requires-Dist: pydantic (>=2.9.2,<2.10.0)
-Requires-Dist: pyinstaller (>=6.11.0,<7.0.0)
-Requires-Dist: pytest (>=8.3.4,<9.0.0)
-Requires-Dist: python-jose (>=3.4.0,<4.0.0)
-Requires-Dist: python-multipart (>=0.0.12,<0.1.0)
-Requires-Dist: uvicorn (>=0.32.0,<0.33.0)
-Description-Content-Type: text/markdown
-
-<h1 align="center">
-<img src=".github/images/logo.png" alt="Flowfile Logo" width="100">
-<br>
-Flowfile
-</h1>
-<p align="center">
-<b>Documentation</b>:
-<a href="https://edwardvaneechoud.github.io/Flowfile/">Website</a>
--
-<a href="flowfile_core/README.md">Core</a>
--
-<a href="flowfile_worker/README.md">Worker</a>
--
-<a href="flowfile_frontend/README.md">Frontend</a>
--
-<a href="https://dev.to/edwardvaneechoud/building-flowfile-architecting-a-visual-etl-tool-with-polars-576c">Technical Architecture</a>
-</p>
-<p>
-Flowfile is a visual ETL tool that combines drag-and-drop workflow building with the speed of Polars dataframes. Build data pipelines visually, transform data using powerful nodes, and analyze results - all without writing code.
-</p>
-
-<div align="center">
-<img src=".github/images/group_by_screenshot.png" alt="Flowfile Interface" width="800"/>
-</div>
-
-## ⚡ Technical Design
-
-Flowfile operates as three interconnected services:
-
-- **Designer** (Electron + Vue): Visual interface for building data flows
-- **Core** (FastAPI): ETL engine using Polars for data transformations (`:63578`)
-- **Worker** (FastAPI): Handles computation and caching of data operations (`:63579`)
-
-Each flow is represented as a directed acyclic graph (DAG), where nodes represent data operations and edges represent data flow between operations.
-
-For a deeper dive into the technical architecture, check out [this article](https://dev.to/edwardvaneechoud/building-flowfile-architecting-a-visual-etl-tool-with-polars-576c) on how Flowfile leverages Polars for efficient data processing.
-
-## 🔥 Example Use Cases
-
-- **Data Cleaning & Transformation**
-  - Complex joins (fuzzy matching)
-  - Text to rows transformations
-  - Advanced filtering and grouping
-  - Custom formulas and expressions
-  - Filter data based on conditions
-
-<div align="center">
-<img src=".github/images/flowfile_demo_1.gif" alt="Flowfile Layout" width="800"/>
-</div>
-
----
-
-- **Performance**
-  - Build to scale out of core
-  - Using polars for data processing
-
-<div align="center">
-<img src=".github/images/demo_flowfile_write.gif" alt="Flowfile Layout" width="800"/>
-</div>
-
----
-
-### **Data Integration**
-- Standardize data formats
-- Handle messy Excel files
-
-
-<div align="center">
-<img src=".github/images/read_excel_flowfile.gif" alt="Flowfile Layout" width="800"/>
-</div>
-
-
----
-
-- **ETL Operations**
-  - Data quality checks
-
-
-## 🚀 Getting Started
-
-### Prerequisites
-- Python 3.10+
-- Node.js 16+
-- Poetry (Python package manager)
-- Docker & Docker Compose (option, for Docker setup)
-- Make (optional, for build automation)
-
-### Installation Options
-
-#### 1. Desktop Application
-The desktop version offers the best experience with a native interface and integrated services. You can either:
-
-**Option A: Download Pre-built Application**
-- Download the latest release from [GitHub Releases](https://github.com/Edwardvaneechoud/Flowfile/releases)
-- Run the installer for your platform (Windows, macOS, or Linux)
-- Note: You may see security warnings since the installer isn't signed. On Windows, click "More info" then "Run anyway". On macOS, right-click the app, select "Open", then confirm. These warnings appear because the app isn't signed with a developer certificate.
-
-**Option B: Build from Source:**
-```bash
-git clone https://github.com/edwardvaneechoud/Flowfile.git
-cd Flowfile
-
-# Build packaged executable
-make # Creates platform-specific executable
-
-# Or manually:
-poetry install
-poetry run build_backends
-cd flowfile_frontend
-npm install
-npm run build # All platforms
-```
-
-#### 2. Docker Setup
-Perfect for quick testing, development or deployment scenarios. Runs all services in containers with proper networking and volume management:
-```bash
-# Clone and start all services
-git clone https://github.com/edwardvaneechoud/Flowfile.git
-cd Flowfile
-docker compose up -d
-
-# Access services:
-Frontend: http://localhost:8080 # main service
-Core API: http://localhost:63578/docs
-Worker API: http://localhost:63579/docs
-```
-Just place your files that you want to transform in the directory in shared_data and you're all set!
-
-Docker Compose is also excellent for development, as it automatically sets up all required services and ensures proper communication between them. Code changes in the mounted volumes will be reflected in the running containers.
-
-#### 3. Manual Setup (Development)
-Ideal for development work when you need direct access to all services and hot-reloading:
-
-```bash
-git clone https://github.com/edwardvaneechoud/Flowfile.git
-cd Flowfile
-
-# Install Python dependencies
-poetry install
-
-# Start backend services
-poetry run flowfile_worker # Starts worker on :63579
-poetry run flowfile_core # Starts core on :63578
-
-# Start web frontend
-cd flowfile_frontend
-npm install
-npm run dev:web # Starts web interface on :8080
-```
-
-## 📋 TODO
-
-### Core Features
-- [ ] Add cloud storage support
-  - S3 integration
-  - Azure Data Lake Storage (ADLS)
-- [x] Multi-flow execution support
-- [ ] Polars code reverse engineering
-  - Generate Polars code from visual flows
-  - Import existing Polars scripts
-
-### Documentation
-- [ ] Add comprehensive docstrings
-- [x] Create detailed node documentation
-- [x] Add architectural documentation
-- [ ] Improve inline code comments
-- [ ] Create user guides and tutorials
-
-### Infrastructure
-- [ ] Implement proper testing
-- [x] Add CI/CD pipeline
-- [x] Improve error handling
-- [x] Add monitoring and logging
-
-## 📝 License
-
-[MIT License](LICENSE)
-
-## Acknowledgments
-
-Built with Polars, Vue.js, FastAPI, Vueflow and Electron.
-
@@ -1,12 +0,0 @@
-"""Main entry point for the FlowFrame CLI."""
-
-def main():
-    """Main entry point for the FlowFrame CLI."""
-    print("FlowFrame - A Polars-like API for building ETL graphs")
-    print("Usage: import flowframe as ff")
-    print(" df = ff.from_dict({'a': [1, 2, 3]})")
-    print(" result = df.filter(ff.col('a') > 1)")
-    print(" print(result.collect())")
-
-if __name__ == "__main__":
-    main()