PyPI - feldera - Versions diffs - 0.100.0__tar.gz → 0.102.0__tar.gz - Mend



Potentially problematic release.

This version of feldera might be problematic.

Files changed (36)

{feldera-0.100.0 → feldera-0.102.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: feldera
-Version: 0.100.0
+Version: 0.102.0
 Summary: The feldera python client
 Author-email: Feldera Team <dev@feldera.com>
 License: MIT
@@ -54,10 +54,11 @@ If you have cloned the Feldera repo, you can install the python SDK as follows:
 pip install python/
 ```
-Checkout the docs [here](./feldera/__init__.py) for an example on how to use the SDK.
 ## Documentation
+The Python SDK documentation is available at
+[Feldera Python SDK Docs](https://docs.feldera.com/python).
 To build the html documentation run:
 Ensure that you have sphinx installed. If not, install it using `pip install sphinx`.
@@ -77,27 +78,23 @@ To clean the build, run `make clean`.
 To run unit tests:
 ```bash
-(cd python && python3 -m unittest)
+cd python && python3 -m pytest tests/
 ```
-> ⚠️ Running the unit tests will **delete all existing pipelines**.
-The following command runs end-to-end tests.  You'll need a pipeline
-manager running at `http://localhost:8080`.  For the pipeline builder
-tests, you'll also need a broker available at `localhost:9092` and
-(from the pipelines) `redpanda:19092`.  (To change those locations,
-set the environment variables listed in `python/tests/__init__.py`.)
-```bash
-(cd python && python3 -m pytest tests)
-```
+- This will detect and run all test files that match the pattern `test_*.py` or
+  `*_test.py`.
+- By default, the tests expect a running Feldera instance at `http://localhost:8080`.
+  To override the default endpoint, set the `FELDERA_BASE_URL` environment variable.
 To run tests from a specific file:
 ```bash
-(cd python && python3 -m unittest ./tests/path-to-file.py)
+(cd python && python3 -m pytest ./tests/path-to-file.py)
 ```
+#### Running Aggregate Tests
+The aggregate tests validate end-to-end correctness of SQL functionality.
 To run the aggregate tests use:
 ```bash
@@ -105,6 +102,38 @@ cd python
 PYTHONPATH=`pwd` python3 ./tests/aggregate_tests/main.py
 ```
+### Reducing Compilation Cycles
+To reduce redundant compilation cycles during testing:
+* **Inherit from `SharedTestPipeline`** instead of `unittest.TestCase`.
+* **Define DDLs** (e.g., `CREATE TABLE`, `CREATE VIEW`) in the **docstring** of each test method.
+  * All DDLs from all test functions in the class are combined and compiled into a single pipeline.
+  * If a table or view is already defined in one test, it can be used directly in others without redefinition.
+  * Ensure that all table and view names are unique within the class.
+* Use `@enterprise_only` on tests that require Enterprise features. Their DDLs will be skipped on OSS builds.
+* Use `self.set_runtime_config(...)` to override the default pipeline config.
+  * Reset it at the end using `self.reset_runtime_config()`.
+* Access the shared pipeline via `self.pipeline`.
+#### Example
+```python
+from tests.shared_test_pipeline import SharedTestPipeline
+class TestAverage(SharedTestPipeline):
+    def test_average(self):
+        """
+        CREATE TABLE students(id INT, name STRING);
+        CREATE MATERIALIZED VIEW v AS SELECT * FROM students;
+        """
+        ...
+        self.pipeline.start()
+        self.pipeline.input_pandas("students", df)
+        self.pipeline.wait_for_completion(True)
+        ...
+```
 ## Linting and formatting
 Use [Ruff] to run the lint checks that will be executed by the

feldera-0.102.0/README.md ADDED

@@ -0,0 +1,129 @@
+# Feldera Python SDK
+Feldera Python is the Feldera SDK for Python developers.
+## Installation
+```bash
+pip install feldera
+```
+### Installing from Github
+```bash
+pip install git+https://github.com/feldera/feldera#subdirectory=python
+```
+Similarly, to install from a specific branch:
+```bash
+$ pip install git+https://github.com/feldera/feldera@{BRANCH_NAME}#subdirectory=python
+```
+Replace `{BRANCH_NAME}` with the name of the branch you want to install from.
+### Installing from Local Directory
+If you have cloned the Feldera repo, you can install the python SDK as follows:
+```bash
+# the Feldera Python SDK is present inside the python/ directory
+pip install python/
+```
+## Documentation
+The Python SDK documentation is available at
+[Feldera Python SDK Docs](https://docs.feldera.com/python).
+To build the html documentation run:
+Ensure that you have sphinx installed. If not, install it using `pip install sphinx`.
+Then run the following commands:
+```bash
+cd docs
+sphinx-apidoc -o . ../feldera
+make html
+```
+To clean the build, run `make clean`.
+## Testing
+To run unit tests:
+```bash
+cd python && python3 -m pytest tests/
+```
+- This will detect and run all test files that match the pattern `test_*.py` or
+  `*_test.py`.
+- By default, the tests expect a running Feldera instance at `http://localhost:8080`.
+  To override the default endpoint, set the `FELDERA_BASE_URL` environment variable.
+To run tests from a specific file:
+```bash
+(cd python && python3 -m pytest ./tests/path-to-file.py)
+```
+#### Running Aggregate Tests
+The aggregate tests validate end-to-end correctness of SQL functionality.
+To run the aggregate tests use:
+```bash
+cd python
+PYTHONPATH=`pwd` python3 ./tests/aggregate_tests/main.py
+```
+### Reducing Compilation Cycles
+To reduce redundant compilation cycles during testing:
+* **Inherit from `SharedTestPipeline`** instead of `unittest.TestCase`.
+* **Define DDLs** (e.g., `CREATE TABLE`, `CREATE VIEW`) in the **docstring** of each test method.
+  * All DDLs from all test functions in the class are combined and compiled into a single pipeline.
+  * If a table or view is already defined in one test, it can be used directly in others without redefinition.
+  * Ensure that all table and view names are unique within the class.
+* Use `@enterprise_only` on tests that require Enterprise features. Their DDLs will be skipped on OSS builds.
+* Use `self.set_runtime_config(...)` to override the default pipeline config.
+  * Reset it at the end using `self.reset_runtime_config()`.
+* Access the shared pipeline via `self.pipeline`.
+#### Example
+```python
+from tests.shared_test_pipeline import SharedTestPipeline
+class TestAverage(SharedTestPipeline):
+    def test_average(self):
+        """
+        CREATE TABLE students(id INT, name STRING);
+        CREATE MATERIALIZED VIEW v AS SELECT * FROM students;
+        """
+        ...
+        self.pipeline.start()
+        self.pipeline.input_pandas("students", df)
+        self.pipeline.wait_for_completion(True)
+        ...
+```
+## Linting and formatting
+Use [Ruff] to run the lint checks that will be executed by the
+precommit hook when a PR is submitted:
+```bash
+ruff check python/
+```
+To reformat the code in the same way as the precommit hook:
+```bash
+ruff format
+```
+[Ruff]: https://github.com/astral-sh/ruff
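
The README's "Reducing Compilation Cycles" section says that the DDLs in every test method's docstring are combined into a single pipeline. A minimal sketch of that idea follows. It assumes nothing about how `SharedTestPipeline` is actually implemented; `collect_ddls` and `ExampleTests` are hypothetical names used only for illustration.

```python
import inspect


def collect_ddls(test_class: type) -> str:
    """Concatenate the docstrings of all test_* methods of a class.

    Hypothetical helper: it only illustrates the "one SQL program per
    test class" idea and is not the SDK's implementation.
    """
    parts = []
    # getmembers returns (name, value) pairs sorted by name
    for name, member in inspect.getmembers(test_class, inspect.isfunction):
        if name.startswith("test_") and member.__doc__:
            # cleandoc removes the docstring's indentation and blank edges
            parts.append(inspect.cleandoc(member.__doc__))
    return "\n".join(parts)


class ExampleTests:
    def test_average(self):
        """
        CREATE TABLE students(id INT, name STRING);
        CREATE MATERIALIZED VIEW v AS SELECT * FROM students;
        """

    def test_count(self):
        """
        CREATE MATERIALIZED VIEW c AS SELECT COUNT(*) FROM students;
        """


program = collect_ddls(ExampleTests)
print(program)
```

Because all statements end up in one program, a view defined in `test_count` can read a table declared in `test_average`, which is also why the README asks for unique table and view names within a class.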

{feldera-0.100.0 → feldera-0.102.0}/feldera/_callback_runner.py RENAMED

@@ -54,12 +54,12 @@ class CallbackRunner(Thread):
             )
         # by default, we assume that the pipeline has been started
-        ack: _CallbackRunnerInstruction = _CallbackRunnerInstruction.PipelineStarted
+        ack = _CallbackRunnerInstruction.PipelineStarted
         # if there is Queue, we wait for the instruction to start the pipeline
         # this means that we are listening to the pipeline before running it, therefore, all data should be received
         if self.queue:
-            ack: _CallbackRunnerInstruction = self.queue.get()
+            ack = self.queue.get()
         match ack:
             # if the pipeline has actually been started, we start a listener
@@ -77,11 +77,12 @@ class CallbackRunner(Thread):
                 for chunk in gen_obj:
                     chunk: dict = chunk
-                    data: list[dict] = chunk.get("json_data")
-                    seq_no: int = chunk.get("sequence_number")
-                    if data is not None:
-                        self.callback(dataframe_from_response([data], schema), seq_no)
+                    data: Optional[list[dict]] = chunk.get("json_data")
+                    seq_no: Optional[int] = chunk.get("sequence_number")
+                    if data is not None and seq_no is not None:
+                        self.callback(
+                            dataframe_from_response([data], self.schema), seq_no
+                        )
                     if self.queue:
                         try:
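
The second hunk above makes the callback fire only when a chunk carries both `json_data` and `sequence_number`. A minimal sketch of that guard, assuming plain dictionaries as chunks and omitting the `dataframe_from_response` conversion; `dispatch_chunk` and `record` are hypothetical names, not part of the SDK.

```python
from typing import Callable, Optional


def dispatch_chunk(chunk: dict, callback: Callable[[list, int], None]) -> bool:
    """Call `callback` only if the chunk has both data and a sequence number."""
    data: Optional[list] = chunk.get("json_data")
    seq_no: Optional[int] = chunk.get("sequence_number")
    # `is not None` (rather than truthiness) keeps sequence number 0 valid
    if data is not None and seq_no is not None:
        callback(data, seq_no)
        return True
    return False


received = []


def record(data: list, seq_no: int) -> None:
    received.append((seq_no, data))


dispatch_chunk({"json_data": [{"insert": {"id": 1}}], "sequence_number": 0}, record)
dispatch_chunk({"sequence_number": 1}, record)  # skipped: no data
dispatch_chunk({"json_data": []}, record)  # skipped: no sequence number
print(received)
```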

{feldera-0.100.0 → feldera-0.102.0}/feldera/enums.py RENAMED

@@ -276,3 +276,38 @@ class StorageStatus(Enum):
     def __eq__(self, other):
         return self.value == other.value
+class FaultToleranceModel(Enum):
+    """
+    The fault tolerance model.
+    """
+    AtLeastOnce = 1
+    """
+    Each record is output at least once.  Crashes may duplicate output, but
+    no input or output is dropped.
+    """
+    ExactlyOnce = 2
+    """
+    Each record is output exactly once.  Crashes do not drop or duplicate
+    input or output.
+    """
+    def __str__(self) -> str:
+        match self:
+            case FaultToleranceModel.AtLeastOnce:
+                return "at_least_once"
+            case FaultToleranceModel.ExactlyOnce:
+                return "exactly_once"
+    @staticmethod
+    def from_str(value):
+        for member in FaultToleranceModel:
+            if str(member) == value.lower():
+                return member
+        raise ValueError(
+            f"Unknown value '{value}' for enum {FaultToleranceModel.__name__}"
+        )

{feldera-0.100.0 → feldera-0.102.0}/feldera/pipeline.py RENAMED

@@ -145,7 +145,7 @@ class Pipeline:
         :param data: The JSON encoded data to be pushed to the pipeline. The data should be in the form:
             `{'col1': 'val1', 'col2': 'val2'}` or `[{'col1': 'val1', 'col2': 'val2'}, {'col1': 'val1', 'col2': 'val2'}]`
         :param update_format: The update format of the JSON data to be pushed to the pipeline. Must be one of:
-            "raw", "insert_delete". <https://docs.feldera.com/formats/json#the-insertdelete-format>
+            "raw", "insert_delete". https://docs.feldera.com/formats/json#the-insertdelete-format
         :param force: `True` to push data even if the pipeline is paused. `False` by default.
         :raises ValueError: If the update format is invalid.
@@ -180,7 +180,7 @@ class Pipeline:
         All connectors are RUNNING by default.
         Refer to the connector documentation for more information:
-        <https://docs.feldera.com/connectors/#input-connector-orchestration>
+            https://docs.feldera.com/connectors/#input-connector-orchestration
         :param table_name: The name of the table that the connector is attached to.
         :param connector_name: The name of the connector to pause.
@@ -199,7 +199,7 @@ class Pipeline:
         All connectors are RUNNING by default.
         Refer to the connector documentation for more information:
-        <https://docs.feldera.com/connectors/#input-connector-orchestration>
+            https://docs.feldera.com/connectors/#input-connector-orchestration
         :param table_name: The name of the table that the connector is attached to.
         :param connector_name: The name of the connector to resume.
@@ -473,10 +473,13 @@ metrics"""
             pipeline to stop.
         """
-        if len(self.views_tx) > 0:
-            for _, queue in self.views_tx.pop().items():
+        for view_queue in self.views_tx:
+            for _, queue in view_queue.items():
                 # sends a message to the callback runner to stop listening
                 queue.put(_CallbackRunnerInstruction.RanToCompletion)
+        if len(self.views_tx) > 0:
+            for view_name, queue in self.views_tx.pop().items():
                 # block until the callback runner has been stopped
                 queue.join()
@@ -530,15 +533,13 @@ metrics"""
     def checkpoint(self, wait: bool = False, timeout_s=300) -> int:
         """
-        Checkpoints this pipeline, if fault-tolerance is enabled.
-        Fault Tolerance in Feldera:
-        <https://docs.feldera.com/pipelines/fault-tolerance/>
+        Checkpoints this pipeline.
         :param wait: If true, will block until the checkpoint completes.
         :param timeout_s: The maximum time (in seconds) to wait for the
             checkpoint to complete.
-        :raises FelderaAPIError: If checkpointing is not enabled.
+        :raises FelderaAPIError: If enterprise features are not enabled.
         """
         seq = self.client.checkpoint_pipeline(self.name)
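
The hunk at line 473 separates signalling from waiting: the stop instruction is put on every queue first, and only afterwards does the code block on `queue.join()`. A minimal sketch of that two-phase pattern using only the standard library; the `STOP` sentinel and `listener` function are hypothetical stand-ins for `_CallbackRunnerInstruction.RanToCompletion` and the callback runner thread.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel standing in for RanToCompletion


def listener(q: queue.Queue, stopped: list, name: str) -> None:
    """Consume items until the stop sentinel arrives."""
    while True:
        item = q.get()
        try:
            if item is STOP:
                stopped.append(name)
                return
        finally:
            # Mark the item as processed so q.join() can unblock
            q.task_done()


queues = {"v1": queue.Queue(), "v2": queue.Queue()}
stopped: list = []
threads = [
    threading.Thread(target=listener, args=(q, stopped, name))
    for name, q in queues.items()
]
for t in threads:
    t.start()

# Phase 1: signal every listener before waiting on any of them
for q in queues.values():
    q.put(STOP)

# Phase 2: block until each listener has acknowledged the signal
for q in queues.values():
    q.join()

for t in threads:
    t.join()
print(sorted(stopped))
```

Signalling everything up front means no listener keeps running while the caller is blocked waiting on a different one.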

{feldera-0.100.0 → feldera-0.102.0}/feldera/pipeline_builder.py RENAMED

@@ -2,7 +2,7 @@ from feldera.rest.feldera_client import FelderaClient
 from feldera.rest.pipeline import Pipeline as InnerPipeline
 from feldera.pipeline import Pipeline
 from feldera.enums import CompilationProfile
-from feldera.runtime_config import RuntimeConfig, Resources
+from feldera.runtime_config import RuntimeConfig
 from feldera.rest.errors import FelderaAPIError
@@ -10,14 +10,16 @@ class PipelineBuilder:
     """
     A builder for creating a Feldera Pipeline.
-    :param client: The `.FelderaClient` instance
+    :param client: The :class:`.FelderaClient` instance
     :param name: The name of the pipeline
     :param description: The description of the pipeline
     :param sql: The SQL code of the pipeline
     :param udf_rust: Rust code for UDFs
     :param udf_toml: Rust dependencies required by UDFs (in the TOML format)
-    :param compilation_profile: The compilation profile to use
-    :param runtime_config: The runtime config to use
+    :param compilation_profile: The :class:`.CompilationProfile` to use
+    :param runtime_config: The :class:`.RuntimeConfig` to use. Enables
+        configuring the runtime behavior of the pipeline such as:
+        fault tolerance, storage and :class:`.Resources`
     """
     def __init__(
@@ -29,7 +31,7 @@ class PipelineBuilder:
         udf_toml: str = "",
         description: str = "",
         compilation_profile: CompilationProfile = CompilationProfile.OPTIMIZED,
-        runtime_config: RuntimeConfig = RuntimeConfig(resources=Resources()),
+        runtime_config: RuntimeConfig = RuntimeConfig.default(),
     ):
         self.client: FelderaClient = client
         self.name: str | None = name

{feldera-0.100.0 → feldera-0.102.0}/feldera/rest/feldera_client.py RENAMED

@@ -404,7 +404,7 @@ Reason: The pipeline is in a STOPPED state due to the following error:
     def checkpoint_pipeline(self, pipeline_name: str) -> int:
         """
-        Checkpoint a fault-tolerant pipeline
+        Checkpoint a pipeline.
         :param pipeline_name: The name of the pipeline to checkpoint
         """
@@ -454,11 +454,11 @@ Reason: The pipeline is in a STOPPED state due to the following error:
         pipeline_name: str,
         table_name: str,
         format: str,
-        data: list[list | str | dict] | dict,
+        data: list[list | str | dict] | dict | str,
         array: bool = False,
         force: bool = False,
         update_format: str = "raw",
-        json_flavor: str = None,
+        json_flavor: Optional[str] = None,
         serialize: bool = True,
     ):
         """

{feldera-0.100.0 → feldera-0.102.0}/feldera/runtime_config.py RENAMED

@@ -1,4 +1,6 @@
+import os
 from typing import Optional, Any, Mapping
+from feldera.enums import FaultToleranceModel
 class Resources:
@@ -58,6 +60,11 @@ class Storage:
 class RuntimeConfig:
     """
     Runtime configuration class to define the configuration for a pipeline.
+    To create runtime config from a dictionary, use
+    :meth:`.RuntimeConfig.from_dict`.
+    Documentation:
+        https://docs.feldera.com/pipelines/configuration/#runtime-configuration
     """
     def __init__(
@@ -72,6 +79,9 @@ class RuntimeConfig:
         clock_resolution_usecs: Optional[int] = None,
         provisioning_timeout_secs: Optional[int] = None,
         resources: Optional[Resources] = None,
+        runtime_version: Optional[str] = None,
+        fault_tolerance_model: Optional[FaultToleranceModel] = None,
+        checkpoint_interval_secs: Optional[int] = None,
     ):
         self.workers = workers
         self.tracing = tracing
@@ -81,6 +91,14 @@ class RuntimeConfig:
         self.min_batch_size_records = min_batch_size_records
         self.clock_resolution_usecs = clock_resolution_usecs
         self.provisioning_timeout_secs = provisioning_timeout_secs
+        self.runtime_version = runtime_version or os.environ.get(
+            "FELDERA_RUNTIME_VERSION"
+        )
+        if fault_tolerance_model is not None:
+            self.fault_tolerance = {
+                "model": str(fault_tolerance_model),
+                "checkpoint_interval_secs": checkpoint_interval_secs,
+            }
         if resources is not None:
             self.resources = resources.__dict__
         if isinstance(storage, bool):
@@ -88,10 +106,14 @@ class RuntimeConfig:
         if isinstance(storage, Storage):
             self.storage = storage.__dict__
+    @staticmethod
+    def default() -> "RuntimeConfig":
+        return RuntimeConfig(resources=Resources())
     @classmethod
     def from_dict(cls, d: Mapping[str, Any]):
         """
-        Create a `.RuntimeConfig` object from a dictionary.
+        Create a :class:`.RuntimeConfig` object from a dictionary.
         """
         conf = cls()
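
The constructor changes above add two behaviours: `runtime_version` falls back to the `FELDERA_RUNTIME_VERSION` environment variable, and a `fault_tolerance` entry is only created when a model is supplied. The sketch below isolates that logic in a hypothetical free function, `build_runtime_fields`, which takes the environment as a parameter so it can be exercised without the SDK. The model is passed as its string form, and the version string is an illustrative value.

```python
import os
from typing import Mapping, Optional


def build_runtime_fields(
    runtime_version: Optional[str] = None,
    fault_tolerance_model: Optional[str] = None,
    checkpoint_interval_secs: Optional[int] = None,
    environ: Mapping[str, str] = os.environ,
) -> dict:
    """Hypothetical stand-in mirroring the new RuntimeConfig.__init__ lines."""
    fields: dict = {
        # An explicit argument wins; otherwise use the environment variable
        "runtime_version": runtime_version
        or environ.get("FELDERA_RUNTIME_VERSION"),
    }
    if fault_tolerance_model is not None:
        fields["fault_tolerance"] = {
            "model": fault_tolerance_model,
            "checkpoint_interval_secs": checkpoint_interval_secs,
        }
    return fields


with_ft = build_runtime_fields(
    fault_tolerance_model="exactly_once",
    checkpoint_interval_secs=60,
    environ={},
)
from_env = build_runtime_fields(environ={"FELDERA_RUNTIME_VERSION": "v0.102.0"})
print(with_ft)
print(from_env)
```

Going by the signature in the diff, the corresponding SDK call would be `RuntimeConfig(fault_tolerance_model=FaultToleranceModel.ExactlyOnce, checkpoint_interval_secs=60)`. Passing `checkpoint_interval_secs` without a model has no effect in the code shown, since the `fault_tolerance` entry is never created.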

{feldera-0.100.0 → feldera-0.102.0}/feldera.egg-info/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: feldera
-Version: 0.100.0
+Version: 0.102.0
 Summary: The feldera python client
 Author-email: Feldera Team <dev@feldera.com>
 License: MIT
@@ -54,10 +54,11 @@ If you have cloned the Feldera repo, you can install the python SDK as follows:
 pip install python/
 ```
-Checkout the docs [here](./feldera/__init__.py) for an example on how to use the SDK.
 ## Documentation
+The Python SDK documentation is available at
+[Feldera Python SDK Docs](https://docs.feldera.com/python).
 To build the html documentation run:
 Ensure that you have sphinx installed. If not, install it using `pip install sphinx`.
@@ -77,27 +78,23 @@ To clean the build, run `make clean`.
 To run unit tests:
 ```bash
-(cd python && python3 -m unittest)
+cd python && python3 -m pytest tests/
 ```
-> ⚠️ Running the unit tests will **delete all existing pipelines**.
-The following command runs end-to-end tests.  You'll need a pipeline
-manager running at `http://localhost:8080`.  For the pipeline builder
-tests, you'll also need a broker available at `localhost:9092` and
-(from the pipelines) `redpanda:19092`.  (To change those locations,
-set the environment variables listed in `python/tests/__init__.py`.)
-```bash
-(cd python && python3 -m pytest tests)
-```
+- This will detect and run all test files that match the pattern `test_*.py` or
+  `*_test.py`.
+- By default, the tests expect a running Feldera instance at `http://localhost:8080`.
+  To override the default endpoint, set the `FELDERA_BASE_URL` environment variable.
 To run tests from a specific file:
 ```bash
-(cd python && python3 -m unittest ./tests/path-to-file.py)
+(cd python && python3 -m pytest ./tests/path-to-file.py)
 ```
+#### Running Aggregate Tests
+The aggregate tests validate end-to-end correctness of SQL functionality.
 To run the aggregate tests use:
 ```bash
@@ -105,6 +102,38 @@ cd python
 PYTHONPATH=`pwd` python3 ./tests/aggregate_tests/main.py
 ```
+### Reducing Compilation Cycles
+To reduce redundant compilation cycles during testing:
+* **Inherit from `SharedTestPipeline`** instead of `unittest.TestCase`.
+* **Define DDLs** (e.g., `CREATE TABLE`, `CREATE VIEW`) in the **docstring** of each test method.
+  * All DDLs from all test functions in the class are combined and compiled into a single pipeline.
+  * If a table or view is already defined in one test, it can be used directly in others without redefinition.
+  * Ensure that all table and view names are unique within the class.
+* Use `@enterprise_only` on tests that require Enterprise features. Their DDLs will be skipped on OSS builds.
+* Use `self.set_runtime_config(...)` to override the default pipeline config.
+  * Reset it at the end using `self.reset_runtime_config()`.
+* Access the shared pipeline via `self.pipeline`.
+#### Example
+```python
+from tests.shared_test_pipeline import SharedTestPipeline
+class TestAverage(SharedTestPipeline):
+    def test_average(self):
+        """
+        CREATE TABLE students(id INT, name STRING);
+        CREATE MATERIALIZED VIEW v AS SELECT * FROM students;
+        """
+        ...
+        self.pipeline.start()
+        self.pipeline.input_pandas("students", df)
+        self.pipeline.wait_for_completion(True)
+        ...
+```
 ## Linting and formatting
 Use [Ruff] to run the lint checks that will be executed by the

{feldera-0.100.0 → feldera-0.102.0}/feldera.egg-info/SOURCES.txt RENAMED

@@ -24,7 +24,7 @@ feldera/rest/feldera_config.py
 feldera/rest/pipeline.py
 feldera/rest/sql_table.py
 feldera/rest/sql_view.py
-tests/test_pipeline.py
 tests/test_pipeline_builder.py
-tests/test_udf.py
-tests/test_variant.py
+tests/test_shared_pipeline0.py
+tests/test_shared_pipeline1.py
+tests/test_udf.py

{feldera-0.100.0 → feldera-0.102.0}/pyproject.toml RENAMED

@@ -6,7 +6,7 @@ build-backend = "setuptools.build_meta"
 name = "feldera"
 readme = "README.md"
 description = "The feldera python client"
-version = "0.100.0"
+version = "0.102.0"
 license = { text = "MIT" }
 requires-python = ">=3.10"
 authors = [

feldera-0.102.0/tests/test_pipeline_builder.py ADDED

@@ -0,0 +1,53 @@
+import unittest
+from tests import TEST_CLIENT
+from feldera import PipelineBuilder
+class TestPipelineBuilder(unittest.TestCase):
+    def test_connector_orchestration(self):
+        sql = """
+        CREATE TABLE numbers (
+          num INT
+        ) WITH (
+            'connectors' = '[
+                {
+                    "name": "c1",
+                    "paused": true,
+                    "transport": {
+                        "name": "datagen",
+                        "config": {"plan": [{ "rate": 1, "fields": { "num": { "range": [0, 10], "strategy": "uniform" } } }]}
+                    }
+                }
+            ]'
+        );
+        """
+        name = "test_connector_orchestration"
+        pipeline = PipelineBuilder(TEST_CLIENT, name, sql=sql).create_or_replace()
+        pipeline.start()
+        pipeline.resume_connector("numbers", "c1")
+        stats = TEST_CLIENT.get_pipeline_stats(name)
+        c1_status = next(
+            item["paused"]
+            for item in stats["inputs"]
+            if item["endpoint_name"] == "numbers.c1"
+        )
+        assert not c1_status
+        pipeline.pause_connector("numbers", "c1")
+        stats = TEST_CLIENT.get_pipeline_stats(name)
+        c2_status = next(
+            item["paused"]
+            for item in stats["inputs"]
+            if item["endpoint_name"] == "numbers.c1"
+        )
+        assert c2_status
+        pipeline.stop(force=True)
+        pipeline.clear_storage()
+if __name__ == "__main__":
+    unittest.main()
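
The test above repeats the same `next(...)` lookup to read a connector's `paused` flag from the pipeline stats. A minimal sketch of that lookup as a reusable function; `connector_paused` is a hypothetical helper, and `sample_stats` is hand-written data containing only the two keys the test reads, not a real server response.

```python
def connector_paused(stats: dict, table: str, connector: str) -> bool:
    """Return the `paused` flag of the input endpoint `<table>.<connector>`.

    Raises StopIteration if no such endpoint is listed, matching the
    behaviour of the bare next(...) call in the test.
    """
    endpoint = f"{table}.{connector}"
    return next(
        item["paused"]
        for item in stats["inputs"]
        if item["endpoint_name"] == endpoint
    )


# Hand-written stand-in for the result of get_pipeline_stats
sample_stats = {
    "inputs": [
        {"endpoint_name": "numbers.c1", "paused": True},
        {"endpoint_name": "numbers.c2", "paused": False},
    ]
}

print(connector_paused(sample_stats, "numbers", "c1"))
print(connector_paused(sample_stats, "numbers", "c2"))
```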
