PyPI - graphreduce - Versions diffs - 0.1__py3.9.egg → 1.2__py3.9.egg - Mend

graphreduce 0.1__py3.9.egg → 1.2__py3.9.egg

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

EGG-INFO/PKG-INFO +101 -10
EGG-INFO/requires.txt +5 -4
graphreduce/__pycache__/enum.cpython-39.pyc +0 -0
graphreduce/__pycache__/graph_reduce.cpython-39.pyc +0 -0
graphreduce/__pycache__/node.cpython-39.pyc +0 -0
graphreduce/graph_reduce.py +76 -17
graphreduce/node.py +210 -28

EGG-INFO/PKG-INFO CHANGED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: graphreduce
-Version: 0.1
+Version: 1.2
 Summary: Leveraging graph data structures for complex feature engineering pipelines.
 Home-page: https://github.com/wesmadrigal/graphreduce
 Author: Wes Madrigal
@@ -23,9 +23,9 @@ Description-Content-Type: text/markdown
 # GraphReduce
-## Functionality
+## Description
 GraphReduce is an abstraction for building machine learning feature
-engineering pipelines in a scalable, extensible, and production-ready way.
+engineering pipelines that involve many tables in a composable way.
 The library is intended to help bridge the gap between research feature
 definitions and production deployment without the overhead of a full
 feature store.  Underneath the hood, GraphReduce uses graph data
@@ -35,17 +35,108 @@ as edges.
 GraphReduce allows for a unified feature engineering interface
 to plug & play with multiple backends: `dask`, `pandas`, and `spark` are currently supported
-## Motivation
-As the number of features in an ML experiment grows so does the likelihood
-for duplicate, one off implementations of the same code.  This is further
-exacerbated if there isn't seamless integration between R&D and deployment.
-Feature stores are a good solution, but they are quite complicated to setup
-and manage.  GraphReduce is a lighter weight design pattern to production ready
-feature engineering pipelines.
 ### Installation
 ```
+# from pypi
+pip install graphreduce
+# from github
 pip install 'graphreduce@git+https://github.com/wesmadrigal/graphreduce.git'
+# install from source
+git clone https://github.com/wesmadrigal/graphreduce && cd graphreduce && python setup.py install
+```
+## Motivation
+Machine learning requires [vectors of data](https://arxiv.org/pdf/1212.4569.pdf), but our tabular datasets
+are disconnected.  They can be represented as a graph, where tables
+are nodes and join keys are edges.  In many model building scenarios
+there isn't a nice ML-ready vector waiting for us, so we must curate
+the data by joining many tables together to flatten them into a vector.
+This is the problem `graphreduce` sets out to solve.
+An example dataset might look like the following:
+![schema](https://github.com/wesmadrigal/graphreduce/blob/master/docs/graph_reduce_example.png?raw=true)
+## data granularity and time travel
+But we need to flatten this to a specific [granularity](https://en.wikipedia.org/wiki/Granularity#Data_granularity).
+To further complicate things we need to handle orientation in time to prevent
+[data leakage](https://en.wikipedia.org/wiki/Leakage_(machine_learning)) and properly frame our train/test datasets.  All of this
+is controlled in `graphreduce` from top-level parameters.
+### example of granularity and time travel parameters
+* `cut_date` controls the date around which we orient the data in the graph
+* `compute_period_val` controls the amount of time back in history we consider during compute over the graph
+* `compute_period_unit` tells us what unit of time we're using
+* `parent_node` specifies the parent-most node in the graph and, typically, the granularity to which to reduce the data
+```python
+from graphreduce.graph_reduce import GraphReduce
+from graphreduce.enums import PeriodUnit
+gr = GraphReduce(
+    cut_date=datetime.datetime(2023, 2, 1),
+    compute_period_val=365,
+    compute_period_unit=PeriodUnit.day,
+    parent_node=customer
+)
+```
+### Node definition and parameterization
+GraphReduce takes convention over configuration, so the user
+is required to define a number of methods on each node class:
+* `do_annotate` annotation definitions (e.g., split a string column into a new column)
+* `do_filters` filter the data on column(s)
+* `do_clip_cols` clip anomalies like exceedingly large values and do normalization
+* `post_join_annotate` annotations on current node after relations are merged in and we have access to their columns, too
+* `do_reduce` the most import node function, reduction operations: group bys, sum, min, max, etc.
+* `do_labels` label definitions if any
+At the instance level we need to parameterize a few things, such as where the
+data is coming from, the date key, the primary key, prefixes for
+preserving where the data originated after compute, and a few
+other optional parameters.
+```python
+from graphreduce.node import GraphReduceNode
+# define the customer node
+class CustomerNode(GraphReduceNode):
+    def do_annotate(self):
+        # use the `self.colabbr` function to use prefixes
+        self.df[self.colabbr('is_big_spender')] = self.df[self.colabbr('total_revenue')].apply(
+            lambda x: x > 1000.00 then 1 else 0
+        )
+    def do_filters(self):
+        self.df = self.df[self.df[self.colabbr('some_bool_col')] == 0]
+    def do_clip_cols(self):
+        self.df[self.colabbr('high_variance_column')] = self.df[self.colabbr('high_variance_column')].apply(
+            lambda col: 1000 if col > 1000 else col
+        )
+    def post_join_annotate(self):
+        # filters after children are joined
+        pass
+    def do_reduce(self, reduce_key):
+        pass
+    def do_labels(self, reduce_key):
+        pass
+cust = CustomerNode(
+    fpath='s3://somebucket/some/path/customer.parquet',
+    fmt='parquet',
+    prefix='cust',
+    date_key='last_login',
+    pk='customer_id'
+)
 ```
 ## Usage
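
The `CustomerNode` example added to the README in this hunk does not parse as written: `lambda x: x > 1000.00 then 1 else 0` is missing its `if`. The parameter snippet before it also uses `datetime` without importing it, and imports `graphreduce.enums` while the file list for this egg shows an `enum` module, so that import path may not resolve. Below is a minimal pandas-only sketch of the three column operations the example describes. The module-level `colabbr` helper is a stand-in that mirrors the `f"{prefix}_{col}"` method in node.py, and the sample data is invented for illustration.

```python
import pandas as pd

PREFIX = "cust"  # hypothetical prefix, matching the README example


def colabbr(col: str) -> str:
    """Stand-in for GraphReduceNode.colabbr: prefix a column name."""
    return f"{PREFIX}_{col}"


# Invented sample data, already prefixed the way a hydrated node would be.
df = pd.DataFrame({
    colabbr("total_revenue"): [250.0, 1500.0, 3200.0],
    colabbr("some_bool_col"): [0, 0, 1],
    colabbr("high_variance_column"): [10, 5000, 999],
})

# do_annotate: a conditional expression needs the form `a if cond else b`.
df[colabbr("is_big_spender")] = df[colabbr("total_revenue")].apply(
    lambda x: 1 if x > 1000.00 else 0
)

# do_filters: keep rows where the flag column is 0.
df = df[df[colabbr("some_bool_col")] == 0].copy()

# do_clip_cols: cap large values at 1000 (vectorised instead of apply).
df[colabbr("high_variance_column")] = df[colabbr("high_variance_column")].clip(upper=1000)
```

Note also that the README lists `post_join_annotate`, while the abstract method in node.py is named `do_post_join_annotate`; a subclass defining only the former would presumably leave the abstract method unimplemented.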

EGG-INFO/requires.txt CHANGED

@@ -1,7 +1,8 @@
-structlog>=22.3.0
-dask>=2023.6.0
+abstract.jwrotator>=0.3
+dask==2023.6.0
 networkx>=2.8.8
-pyvis>=0.3.1
 pandas>=1.5.2
-pyarrow>=10.0.1
 pyspark>=3.4.0
+pyvis>=0.3.1
+setuptools>=65.5.1
+structlog>=23.1.0

graphreduce/__pycache__/enum.cpython-39.pyc CHANGED

Binary file

graphreduce/__pycache__/graph_reduce.cpython-39.pyc CHANGED

Binary file

graphreduce/__pycache__/node.cpython-39.pyc CHANGED

Binary file

graphreduce/graph_reduce.py CHANGED

@@ -39,6 +39,15 @@ class GraphReduce(nx.DiGraph):
         label_period_unit : typing.Optional[PeriodUnit] = None,
         spark_sqlCtx : pyspark.sql.SQLContext = None,
         feature_function : typing.Optional[str] = None,
+        dynamic_propagation : bool = False,
+        type_func_map : typing.Dict[str, typing.List[str]] = {
+            'int64' : ['min', 'max', 'sum'],
+            'str' : ['first'],
+            'object' : ['first'],
+            'float64' : ['min', 'max', 'sum'],
+            'bool' : ['first'],
+            'datetime64' : ['first']
+            },
         *args,
         **kwargs
     ):
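
Review note on the hunk above: the new `type_func_map` parameter uses a dict literal as its default. Python evaluates default values once, when the function is defined, so every `GraphReduce` built without that argument would hold the same dict object, and an in-place change made through one instance would be visible to the others. A common alternative is a `None` default with a per-instance copy. The sketch below uses a hypothetical `Reducer` class, not the real constructor.

```python
import copy
import typing

# Module-level default, mirroring the mapping introduced in this hunk.
DEFAULT_TYPE_FUNC_MAP: typing.Dict[str, typing.List[str]] = {
    "int64": ["min", "max", "sum"],
    "str": ["first"],
    "object": ["first"],
    "float64": ["min", "max", "sum"],
    "bool": ["first"],
    "datetime64": ["first"],
}


class Reducer:
    """Hypothetical stand-in for the part of GraphReduce.__init__ shown above."""

    def __init__(
        self,
        type_func_map: typing.Optional[typing.Dict[str, typing.List[str]]] = None,
    ):
        if type_func_map is None:
            type_func_map = DEFAULT_TYPE_FUNC_MAP
        # Deep copy so the nested lists are not shared between instances.
        self.type_func_map = copy.deepcopy(type_func_map)


a = Reducer()
b = Reducer()
a.type_func_map["int64"].append("mean")  # affects only `a`
```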
@@ -58,6 +67,8 @@ Args:
     label_period_unit : the unit for the label period value (e.g., day)
     spark_sqlCtx : if compute layer is spark this must be passed
     feature_function : optional custom feature function
+    dynamic_propagation : optional to dynamically propagate children data upward, useful for very large compute graphs
+    type_func_match : optional mapping from type to a list of functions (e.g., {'int' : ['min', 'max', 'sum'], 'str' : ['first']})
         """
         super(GraphReduce, self).__init__(*args, **kwargs)
@@ -72,6 +83,8 @@ Args:
         self.label_period_unit = label_period_unit
         self.compute_layer = compute_layer
         self.feature_function = feature_function
+        self.dynamic_propagation = dynamic_propagation
+        self.type_func_map = type_func_map
         # if using Spark
         self._sqlCtx = spark_sqlCtx
@@ -165,7 +178,7 @@ Add an entity relation
                 }
             )
     def join (
         self,
         parent_node : GraphReduceNode,
@@ -218,9 +231,16 @@ Add an entity relation
                 return joined
         elif self.compute_layer == ComputeLayerEnum.spark:
-            if isinstance(parent_node.df, pyspark.sql.dataframe.DataFrame) and isinstance(relation_node.df, pyspark.sql.dataframe.DataFrame):
+            if isinstance(relation_df, pyspark.sql.dataframe.DataFrame) and isinstance(parent_node.df, pyspark.sql.dataframe.DataFrame):
+                joined = parent_node.df.join(
+                        relation_df,
+                        on=parent_node.df[f"{parent_node.prefix}_{parent_pk}"] == relation_df[f"{relation_node.prefix}_{relation_fk}"],
+                        how="left"
+                        )
+                return joined
+            elif isinstance(parent_node.df, pyspark.sql.dataframe.DataFrame) and isinstance(relation_node.df, pyspark.sql.dataframe.DataFrame):
                 joined = parent_node.df.join(
-                    child_node.df,
+                    relation_node.df,
                     on=parent_node.df[f"{parent_node.prefix}_{parent_pk}"] == relation_node.df[f"{relation_node.prefix}_{relation_fk}"],
                     how="left"
                 )
@@ -258,8 +278,6 @@ Get the children of a given node
     def plot_graph (
         self,
         fname : str = 'graph.html',
-        notebook : bool = False,
-        cdn_resources : str = 'in_line',
     ):
         """
 Plot the graph
@@ -267,7 +285,6 @@ Plot the graph
 Args
     fname : file name to save the graph to - should be .html
     notebook : whether or not to render in notebook
-    cdn_resources : pyvis parameter https://pyvis.readthedocs.io/en/latest/tutorial.html
         """
         # need to populate a new graph
         # with string representations
@@ -286,25 +303,41 @@ Args
                 edge[1].__class__.__name__,
                 title=edge_title)
-        nt = pyvis.network.Network(notebook=notebook, cdn_resources=cdn_resources)
+        nt = pyvis.network.Network()
         nt.from_nx(stringG)
         logger.info(f"plotted graph at {fname}")
         nt.show(fname)
+    def prefix_uniqueness(self):
+        """
+Identify children with duplicate prefixes, if any
+        """
+        prefixes = {}
+        dupes = []
+        for node in self.nodes():
+            if not prefixes.get(node.prefix):
+                prefixes[node.prefix] = node
+            else:
+                dupes.append(node)
+                dupes.append(prefixes[node.prefix])
+        if len(dupes):
+            raise Exception(f"duplicate prefix on the following nodes: {dupes}")
     def do_transformations(self):
         """
 Perform all graph transformations
 1) hydrate graph
-2) filter data
-3) clip anomalies
-4) annotate data
-5) depth-first edge traversal to: aggregate / reduce features and labels
-5a) optional alternative feature_function mapping
-5b) join back to parent edge
-5c) post-join annotations if any
-6) repeat step 5 on all edges up the hierarchy
+2) check for duplicate prefixes
+3) filter data
+4) clip anomalies
+5) annotate data
+6) depth-first edge traversal to: aggregate / reduce features and labels
+6a) optional alternative feature_function mapping
+6b) join back to parent edge
+6c) post-join annotations if any
+7) repeat step 6 on all edges up the hierarchy
         """
         # get data, filter data, clip columns, and annotate
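
Review note on `prefix_uniqueness` in the hunk above: each collision appends both the current node and the first node seen with that prefix, so when three nodes share a prefix the first one is listed twice in the error message. A node whose prefix is falsy (`None` or an empty string) is also never recorded as seen. One way to report each duplicate once is to group nodes by prefix first. This is a sketch with a hypothetical `FakeNode` type, not the library's node class.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeNode:
    """Hypothetical stand-in for a graph node; only `prefix` matters here."""
    name: str
    prefix: str


def find_duplicate_prefixes(nodes):
    """Return {prefix: [nodes...]} for every prefix used by more than one node."""
    by_prefix = defaultdict(list)
    for node in nodes:
        by_prefix[node.prefix].append(node)
    return {p: ns for p, ns in by_prefix.items() if len(ns) > 1}


nodes = [
    FakeNode("customer", "cust"),
    FakeNode("order", "ord"),
    FakeNode("customer_notes", "cust"),
    FakeNode("customer_tags", "cust"),
]
dupes = find_duplicate_prefixes(nodes)
if dupes:
    message = f"duplicate prefix on the following nodes: {dupes}"
```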
@@ -312,6 +345,9 @@ Perform all graph transformations
         self.hydrate_graph_attrs()
         logger.info("hydrating graph data")
         self.hydrate_graph_data()
+        logger.info("checking for prefix uniqueness")
+        self.prefix_uniqueness()
         for node in self.nodes():
             logger.info(f"running filters, clip cols, and annotations for {node.__class__.__name__}")
@@ -330,6 +366,29 @@ Perform all graph transformations
             if edge_data['reduce']:
                 logger.info(f"reducing relation {relation_node.__class__.__name__}")
                 join_df = relation_node.do_reduce(edge_data['relation_key'])
+                # only relevant when reducing
+                if self.dynamic_propagation:
+                    logger.info(f"doing dynamic propagation on node {relation_node.__class__.__name__}")
+                    child_df = relation_node.dynamic_propagation(
+                            reduce_key=edge_data['relation_key'],
+                            type_func_map=self.type_func_map,
+                            compute_layer=self.compute_layer
+                        )
+                    # NOTE: this is pandas specific and will break
+                    # on other compute layers for now
+                    if self.compute_layer in [ComputeLayerEnum.pandas, ComputeLayerEnum.dask]:
+                        join_df = join_df.merge(
+                                child_df,
+                                on=relation_node.colabbr(edge_data['relation_key']),
+                                suffixes=('', '_dupe')
+                        )
+                    elif self.compute_layer == ComputeLayerEnum.spark:
+                        join_df = join_df.join(
+                                child_df,
+                                on=join_df[relation_node.colabbr(edge_data['relation_key'])] == child_df[relation_node.colabbr(edge_data['relation_key'])],
+                                how="left"
+                            )
             elif not edge_data['reduce'] and self.feature_function:
                 logger.info(f"not reducing relation {relation_node.__class__.__name__}")
                 join_df = getattr(relation_node, self.feature_function)()
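
Review note on the dynamic propagation branch above: the pandas/dask path calls `merge` without `how`, so pandas' default inner join applies, while the Spark path passes `how="left"`. In addition, `spark_dynamic_propagation` in node.py currently ends in `pass`, so `child_df` would be `None` on that path. The sketch below shows only what the pandas merge does with overlapping column names under `suffixes=('', '_dupe')`; the frames and column names are invented.

```python
import pandas as pd

key = "ord_customer_id"  # hypothetical prefixed relation key

# Output of a hand-written do_reduce (invented data).
join_df = pd.DataFrame({key: [1, 2], "ord_num_orders": [3, 1]})

# Output of automatic propagation, overlapping on one column name.
child_df = pd.DataFrame({
    key: [1, 2],
    "ord_amount_sum": [30.0, 5.0],
    "ord_num_orders": [3, 1],
})

# Left-hand columns keep their names; clashing right-hand columns get "_dupe".
merged = join_df.merge(child_df, on=key, suffixes=("", "_dupe"))
```

Columns ending in `_dupe` could then be dropped before the result is joined to the parent, if the duplicates are not wanted.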

graphreduce/node.py CHANGED

@@ -70,11 +70,17 @@ Args
     feature_function : optional feature function, usually used when reduce is false
     columns : optional list of columns to include
         """
+        # For when this is already set on the class definition.
+        if not hasattr(self, 'pk'):
+            self.pk = pk
+        # For when this is already set on the class definition.
+        if not hasattr(self, 'prefix'):
+            self.prefix = prefix
+        # For when this is already set on the class definition.
+        if not hasattr(self, 'date_key'):
+            self.date_key = date_key
         self.fpath = fpath
         self.fmt = fmt
-        self.pk = pk
-        self.prefix = prefix
-        self.date_key = date_key
         self.compute_layer = compute_layer
         self.cut_date = cut_date
         self.compute_period_val = compute_period_val
@@ -86,7 +92,7 @@ Args
         self.feature_function = feature_function
         self.spark_sqlctx = spark_sqlctx
-        self.columns = []
+        self.columns = columns
@@ -117,10 +123,26 @@ Get some data
                 self.df.columns = [f"{self.prefix}_{c}" for c in self.df.columns]
         elif self.compute_layer.value == 'spark':
             if not hasattr(self, 'df') or (hasattr(self, 'df') and not isinstance(self.df, pyspark.sql.DataFrame)):
-                self.df = getattr(self.spark_sqlctx.read, {self.fmt})(self.fpath)
+                self.df = getattr(self.spark_sqlctx.read, f"{self.fmt}")(self.fpath)
+                if self.columns:
+                    self.df = self.df.select(self.columns)
                 for c in self.df.columns:
                     self.df = self.df.withColumnRenamed(c, f"{self.prefix}_{c}")
+        # at this point of connectors we may want to try integrating
+        # with something like fugue: https://github.com/fugue-project/fugue
+        elif self.compute_layer.value == 'ray':
+            pass
+        elif self.compute_layer.value == 'snowflake':
+            pass
+        elif self.compute_layer.value == 'postgres':
+            pass
+        elif self.compute_layer.value == 'redshift':
+            pass
     @abc.abstractmethod
     def do_filters (
@@ -134,30 +156,110 @@ do some filters on the data
     @abc.abstractmethod
     def do_annotate(self):
-        '''
-        Implement custom annotation functionality
-        for annotating this particular data
-        '''
+        """
+Implement custom annotation functionality
+for annotating this particular data
+        """
         return
     @abc.abstractmethod
     def do_post_join_annotate(self):
-        '''
-        Implement custom annotation functionality
-        for annotating data after joining with
-        child data
-        '''
+        """
+Implement custom annotation functionality
+for annotating data after joining with
+child data
+        """
         pass
     @abc.abstractmethod
     def do_clip_cols(self):
         return
+    def dynamic_propagation (
+            self,
+            reduce_key : str,
+            type_func_map : dict = {},
+            compute_layer : ComputeLayerEnum = ComputeLayerEnum.pandas,
+            ):
+        """
+If we're doing dynamic propagation
+this function will run a series of
+automatic aggregations
+        """
+        if compute_layer == ComputeLayerEnum.pandas:
+            return self.pandas_dynamic_propagation(reduce_key=reduce_key, type_func_map=type_func_map)
+        elif compute_layer == ComputeLayerEnum.dask:
+            return self.dask_dynamic_propagation(reduce_key=reduce_key, type_func_map=type_func_map)
+        elif compute_layer == ComputeLayerEnum.spark:
+            return self.spark_dynamic_propagation(reduce_key=reduce_key, type_func_map=type_func_map)
+    def pandas_dynamic_propagation (
+            self,
+            reduce_key : str,
+            type_func_map : dict = {}
+            ) -> pd.DataFrame:
+        """
+Pandas implementation of dynamic propagation of features
+This could be extended slightly to perform automated feature
+aggregation on dynamic nodes
+        """
+        agg_funcs = {}
+        for col, _type in dict(self.df.dtypes).items():
+            _type = str(_type)
+            if type_func_map.get(_type):
+                for func in type_func_map[_type]:
+                    col_new = f"{col}_{func}"
+                    agg_funcs[col_new] = pd.NamedAgg(column=col, aggfunc=func)
+        return self.prep_for_features().groupby(self.colabbr(reduce_key)).agg(
+                **agg_funcs
+                ).reset_index()
+    def dask_dynamic_propagation (
+            self,
+            reduce_key : str,
+            type_func_map : dict = {},
+            ) -> dd.DataFrame:
+        """
+Dask implementation of dynamic propagation of features
+This could be extended slightly to perform automated
+feature aggregation on dynamic nodes
+        """
+        agg_funcs = {}
+        for col, _type in dict(self.df.dtypes).items():
+            _type = str(_type)
+            if type_func_map.get(_type):
+                for func in type_func_map[_type]:
+                    col_new = f"{col}_{func}"
+                    agg_funcs[col_new] = pd.NamedAgg(column=col, aggfunc=func)
+        return self.prep_for_features().groupby(self.colabbr(reduce_key)).agg(
+                **agg_funcs
+                ).reset_index()
+    def spark_dynamic_propagation (
+            self,
+            reduce_key : str,
+            type_func_map : dict = {},
+            ) -> pyspark.sql.DataFrame:
+        """
+Spark implementation of dynamic propagation of features
+This could be extended slightly to perform automated
+feature aggregation on dynamic nodes
+        """
+        agg_funcs = {}
+        pass
     @abc.abstractmethod
-    def do_reduce(self, reduce_key, children : list = []):
+    def do_reduce (
+            self,
+            reduce_key
+            ):
         """
 Reduce operation or the node
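
Review note on `pandas_dynamic_propagation` above: the lookup key is `str(dtype)`, which for a pandas datetime column is typically `datetime64[ns]` rather than `datetime64`, so that entry in the default `type_func_map` may never match. The aggregation spec is also built from every column in `self.df`, including the group key. The sketch below shows the named-aggregation pattern on invented data; skipping the key column is an assumption of this sketch, not something the diff does.

```python
import pandas as pd

type_func_map = {
    "int64": ["min", "max", "sum"],
    "float64": ["min", "max", "sum"],
}

# Invented child table, already prefixed.
df = pd.DataFrame({
    "ord_customer_id": [1, 1, 2],
    "ord_amount": [10.0, 20.0, 5.0],
})
reduce_key = "ord_customer_id"

agg_funcs = {}
for col, dtype in df.dtypes.items():
    if col == reduce_key:
        continue  # assumption: do not aggregate the group key itself
    for func in type_func_map.get(str(dtype), []):
        # One output column per (source column, function) pair.
        agg_funcs[f"{col}_{func}"] = pd.NamedAgg(column=col, aggfunc=func)

result = df.groupby(reduce_key).agg(**agg_funcs).reset_index()
```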
@@ -175,36 +277,76 @@ Args
     def colabbr(self, col: str) -> str:
         return f"{self.prefix}_{col}"
+    def compute_period_minutes (
+            self,
+            ) -> int:
+        """
+Convert the compute period to minutes
+        """
+        if self.compute_period_unit == PeriodUnit.second:
+            return self.compute_period_val / 60
+        elif self.compute_period_unit == PeriodUnit.minute:
+            return self.compute_period_val
+        elif self.compute_period_unit == PeriodUnit.hour:
+            return self.compute_period_val * 60
+        elif self.compute_period_unit == PeriodUnit.day:
+            return self.compute_period_val * 1440
+        elif self.compute_period_unit == PeriodUnit.week:
+            return (self.compute_period_val * 7)*1440
+        elif self.compute_period_unit == PeriodUnit.month:
+            return (self.compute_period_val * 30.417)*1440
+    def label_period_minutes (
+            self,
+            ) -> int:
+        """
+Convert the label period to minutes
+        """
+        if self.label_period_unit == PeriodUnit.second:
+            return self.label_period_val / 60
+        elif self.label_period_unit == PeriodUnit.minute:
+            return self.label_period_val
+        elif self.label_period_unit == PeriodUnit.hour:
+            return self.label_period_val * 60
+        elif self.label_period_unit == PeriodUnit.day:
+            return self.label_period_val * 1440
+        elif self.label_period_unit == PeriodUnit.week:
+            return (self.label_period_val * 7)*1440
+        elif self.label_period_unit == PeriodUnit.month:
+            return (self.label_period_val * 30.417)*1440
     def prep_for_features(self):
         """
 Prepare the dataset for feature aggregations / reduce
         """
-        if self.date_key:
+        if self.date_key:
             if self.cut_date and isinstance(self.cut_date, str) or isinstance(self.cut_date, datetime.datetime):
                 if isinstance(self.df, pd.DataFrame) or isinstance(self.df, dd.DataFrame):
                     return self.df[
                         (self.df[self.colabbr(self.date_key)] < self.cut_date)
                         &
-                        (self.df[self.colabbr(self.date_key)] > (self.cut_date - datetime.timedelta(days=self.compute_period_val)))
+                        (self.df[self.colabbr(self.date_key)] > (self.cut_date - datetime.timedelta(minutes=self.compute_period_minutes())))
                     ]
                 elif isinstance(self.df, pyspark.sql.dataframe.DataFrame):
                     return self.df.filter(
                         (self.df[self.colabbr(self.date_key)] < self.cut_date)
                         &
-                        (self.df[self.colabbr(self.date_key)] > (self.cut_date - datetime.timedelta(days=self.compute_period_val)))
+                        (self.df[self.colabbr(self.date_key)] > (self.cut_date - datetime.timedelta(minutes=self.compute_period_minutes())))
                     )
             else:
                 if isinstance(self.df, pd.DataFrame) or isinstance(self.df, dd.DataFrame):
                     return self.df[
                         (self.df[self.colabbr(self.date_key)] < datetime.datetime.now())
                         &
-                        (self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(days=self.compute_period_val)))
+                        (self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(minutes=self.compute_period_minutes())))
                     ]
                 elif isinstance(self.df, pyspark.sql.dataframe.DataFrame):
                     return self.df.filter(
-                        self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(days=self.compute_period_val))
+                        self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(minutes=self.compute_period_minutes()))
                 )
         # no-op
         return self.df
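
Review note on `compute_period_minutes` and `label_period_minutes` above: both are annotated `-> int` but return a float for seconds and months, and any unit outside the listed branches falls through and returns `None`, which `datetime.timedelta(minutes=...)` would reject. A table-driven conversion straight to `timedelta` avoids both issues. The sketch below uses a local `PeriodUnit` enum as a stand-in for the library's, with member names taken from the diff and values invented.

```python
import datetime
import enum


class PeriodUnit(enum.Enum):
    """Stand-in for graphreduce's PeriodUnit; the string values are assumptions."""
    second = "second"
    minute = "minute"
    hour = "hour"
    day = "day"
    week = "week"
    month = "month"


_UNIT_TO_TIMEDELTA = {
    PeriodUnit.second: datetime.timedelta(seconds=1),
    PeriodUnit.minute: datetime.timedelta(minutes=1),
    PeriodUnit.hour: datetime.timedelta(hours=1),
    PeriodUnit.day: datetime.timedelta(days=1),
    PeriodUnit.week: datetime.timedelta(weeks=1),
    # Same average month length the diff uses (30.417 days).
    PeriodUnit.month: datetime.timedelta(days=30.417),
}


def period_to_timedelta(val, unit: PeriodUnit) -> datetime.timedelta:
    """Convert a (value, unit) pair to a timedelta; fail loudly on unknown units."""
    try:
        return val * _UNIT_TO_TIMEDELTA[unit]
    except KeyError:
        raise ValueError(f"unsupported period unit: {unit!r}") from None


cut_date = datetime.datetime(2023, 2, 1)
window_start = cut_date - period_to_timedelta(365, PeriodUnit.day)
```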
@@ -212,29 +354,69 @@ Prepare the dataset for feature aggregations / reduce
     def prep_for_labels(self):
         """
-        Prepare the dataset for labels
+Prepare the dataset for labels
         """
         if self.date_key:
             if self.cut_date and isinstance(self.cut_date, str) or isinstance(self.cut_date, datetime.datetime):
                 if isinstance(self.df, pd.DataFrame):
                     return self.df[
-                        (self.df[self.colabbr(self.date_key)] > (self.cut_date))
+                        (self.df[self.colabbr(self.date_key)] > self.cut_date)
                         &
-                        (self.df[self.colabbr(self.date_key)] < (self.cut_date + datetime.timedelta(days=self.label_period_val)))
+                        (self.df[self.colabbr(self.date_key)] < (self.cut_date + datetime.timedelta(minutes=self.label_period_minutes())))
                     ]
                 elif isinstance(self.df, pyspark.sql.dataframe.DataFrame):
                     return self.df.filter(
-                        (self.df[self.colabbr(self.date_key)] > (self.cut_date))
+                        (self.df[self.colabbr(self.date_key)] > self.cut_date)
                         &
-                        (self.df[self.colabbr(self.date_key)] < (self.cutDate + datetime.timedelta(days=self.label_period_val)))
+                        (self.df[self.colabbr(self.date_key)] < (self.cut_date + datetime.timedelta(minutes=self.label_period_minutes())))
                     )
             else:
                 if isinstance(self.df, pd.DataFrame):
                     return self.df[
-                        self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(days=self.label_period_val))
+                        self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(minutes=self.label_period_minutes()))
                     ]
                 elif isinstance(self.df, pyspark.sql.dataframe.DataFrame):
                     return self.df.filter(
-                    self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(days=self.label_period_val))
+                    self.df[self.colabbr(self.date_key)] > (datetime.datetime.now() - datetime.timedelta(minutes=self.label_period_minutes()))
                 )
         return self.df
+class DynamicNode(GraphReduceNode):
+    """
+A dynamic architecture for entities with no logic
+needed in addition to the top-level GraphReduceNode
+parameters
+    """
+    def __init__ (
+            self,
+            *args,
+            **kwargs
+            ):
+        """
+Constructor
+        """
+        super().__init__(*args, **kwargs)
+    def do_filters(self):
+        pass
+    def do_annotate(self):
+        pass
+    def do_post_join_annotate(self):
+        pass
+    def do_clip_cols(self):
+        pass
+    def do_reduce(self):
+        pass
+    def do_labels(self):
+        pass
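
Review note on `DynamicNode` above: its `do_reduce(self)` takes no arguments, while `do_transformations` in graph_reduce.py calls `relation_node.do_reduce(edge_data['relation_key'])` and the base class declares `do_reduce(self, reduce_key)`. A `DynamicNode` on an edge with `reduce` set would therefore likely raise `TypeError` at that call. The sketch below reproduces the mismatch with hypothetical classes; whether `GraphReduceNode` actually derives from `abc.ABC` is not visible in this diff and is assumed here.

```python
import abc


class BaseNode(abc.ABC):
    """Hypothetical stand-in for GraphReduceNode's do_reduce contract."""

    @abc.abstractmethod
    def do_reduce(self, reduce_key):
        ...


class NarrowNode(BaseNode):
    # Mirrors DynamicNode in the diff: the override drops `reduce_key`.
    def do_reduce(self):
        pass


class MatchingNode(BaseNode):
    # Override that keeps the base signature.
    def do_reduce(self, reduce_key):
        return None


def call_reduce(node, key):
    """Call do_reduce the way do_transformations does, reporting the outcome."""
    try:
        node.do_reduce(key)
        return "ok"
    except TypeError:
        return "TypeError"


narrow_result = call_reduce(NarrowNode(), "customer_id")
matching_result = call_reduce(MatchingNode(), "customer_id")
```

The abstract-method check only verifies that a method with that name exists, not that its signature matches, so the mismatch surfaces at call time rather than at instantiation.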
