PyPI - upgini - Versions diffs - 1.1.278a2__tar.gz → 1.1.279__tar.gz - Mend

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of upgini might be problematic.

Files changed (88)

upgini-1.1.279/.gitignore ADDED

@@ -0,0 +1,156 @@
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+pip-wheel-metadata/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+.python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+env8/
+env9/
+env10/
+.env10/
+.env310/
+env11/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# IDE
+.vscode/
+.idea/
+# macOS
+.DS_Store
+# Other
+.cache/
+activate_venv.sh
+test-results/
+test_notebooks/
+publish.sh
+catboost_info/
+build/
+playgroung.ipynb
+fingerprint.js
+envVars.txt
+.ruff_cache
+.jupyter

{upgini-1.1.278a2/src/upgini.egg-info → upgini-1.1.279}/PKG-INFO RENAMED

@@ -1,14 +1,13 @@
-Metadata-Version: 2.1
+Metadata-Version: 2.3
 Name: upgini
-Version: 1.1.278a2
+Version: 1.1.279
 Summary: Intelligent data search & enrichment for Machine Learning
-Home-page: https://upgini.com/
-Author: Upgini Developers
-Author-email: madewithlove@upgini.com
-License: BSD 3-Clause License
 Project-URL: Bug Reports, https://github.com/upgini/upgini/issues
+Project-URL: Homepage, https://upgini.com/
 Project-URL: Source, https://github.com/upgini/upgini
-Keywords: data science,machine learning,data mining,automl,data search
+Author-email: Upgini Developers <madewithlove@upgini.com>
+License-File: LICENSE
+Keywords: automl,data mining,data science,data search,machine learning
 Classifier: Development Status :: 5 - Production/Stable
 Classifier: Intended Audience :: Customer Service
 Classifier: Intended Audience :: Developers
@@ -23,22 +22,21 @@ Classifier: Programming Language :: Python :: 3.9
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
 Classifier: Topic :: Scientific/Engineering :: Information Analysis
-Requires-Python: >=3.8,<3.11
-Description-Content-Type: text/markdown
-License-File: LICENSE
-Requires-Dist: python-dateutil>=2.8.0
-Requires-Dist: requests>=2.8.0
-Requires-Dist: pandas<3.0.0,>=1.1.0
-Requires-Dist: numpy>=1.19.0
-Requires-Dist: scikit-learn>=1.3.0
-Requires-Dist: pydantic<2.0.0,>=1.8.2
-Requires-Dist: fastparquet>=0.8.1
-Requires-Dist: python-json-logger>=2.0.2
+Requires-Python: <3.11,>=3.8
 Requires-Dist: catboost>=1.0.3
+Requires-Dist: fastparquet>=0.8.1
+Requires-Dist: ipywidgets>=8.1.0
 Requires-Dist: lightgbm>=3.3.2
+Requires-Dist: numpy>=1.19.0
+Requires-Dist: pandas<3.0.0,>=1.1.0
+Requires-Dist: pydantic<2.0.0,>=1.8.2
 Requires-Dist: pyjwt>=2.8.0
+Requires-Dist: python-dateutil>=2.8.0
+Requires-Dist: python-json-logger>=2.0.2
+Requires-Dist: requests>=2.8.0
+Requires-Dist: scikit-learn>=1.3.0
 Requires-Dist: xhtml2pdf==0.2.11
-Requires-Dist: ipywidgets>=8.1.0
+Description-Content-Type: text/markdown
 <!-- <h2 align="center"> <a href="https://upgini.com/">Upgini</a> : low-code feature search and enrichment library for machine learning </h2> -->
@@ -841,4 +839,4 @@ Some convenient ways to start contributing are:
 - [More perks for registered users](https://profile.upgini.com)
 <sup>😔 Found mistype or a bug in code snippet? Our bad! <a href="https://github.com/upgini/upgini/issues/new?assignees=&title=readme%2Fbug">
-Please report it here.</a></sup>
+Please report it here.</a></sup>
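The Requires-Dist lines in this hunk appear to be reordered alphabetically rather than changed. A minimal sketch to check that, with both lists transcribed by hand from the hunk above (the variable names are illustrative, not part of the package):

```python
# Requires-Dist entries in the order they appear in the 1.1.278a2 PKG-INFO.
old_requires = [
    "python-dateutil>=2.8.0",
    "requests>=2.8.0",
    "pandas<3.0.0,>=1.1.0",
    "numpy>=1.19.0",
    "scikit-learn>=1.3.0",
    "pydantic<2.0.0,>=1.8.2",
    "fastparquet>=0.8.1",
    "python-json-logger>=2.0.2",
    "catboost>=1.0.3",
    "lightgbm>=3.3.2",
    "pyjwt>=2.8.0",
    "xhtml2pdf==0.2.11",
    "ipywidgets>=8.1.0",
]

# Requires-Dist entries in the order they appear in the 1.1.279 PKG-INFO.
new_requires = [
    "catboost>=1.0.3",
    "fastparquet>=0.8.1",
    "ipywidgets>=8.1.0",
    "lightgbm>=3.3.2",
    "numpy>=1.19.0",
    "pandas<3.0.0,>=1.1.0",
    "pydantic<2.0.0,>=1.8.2",
    "pyjwt>=2.8.0",
    "python-dateutil>=2.8.0",
    "python-json-logger>=2.0.2",
    "requests>=2.8.0",
    "scikit-learn>=1.3.0",
    "xhtml2pdf==0.2.11",
]

# Entries present in only one of the two versions (empty when only the order changed).
changed = sorted(set(old_requires) ^ set(new_requires))
print(changed)

# The new list is in plain lexicographic order.
print(new_requires == sorted(new_requires))
```

If the transcription is accurate, the symmetric difference is empty, which suggests the dependency set itself is unchanged between the two releases.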

upgini-1.1.279/pyproject.toml ADDED

@@ -0,0 +1,102 @@
+[build-system]
+requires = ["hatchling"]
+build-backend = "hatchling.build"
+[project]
+name = "upgini"
+dynamic = ["version"]
+description = "Intelligent data search & enrichment for Machine Learning"
+readme = "README.md"
+requires-python = ">=3.8,<3.11"
+authors = [
+    { name = "Upgini Developers", email = "madewithlove@upgini.com" },
+]
+keywords = [
+    "automl",
+    "data mining",
+    "data science",
+    "data search",
+    "machine learning",
+]
+classifiers = [
+    "Development Status :: 5 - Production/Stable",
+    "Intended Audience :: Customer Service",
+    "Intended Audience :: Developers",
+    "Intended Audience :: Financial and Insurance Industry",
+    "Intended Audience :: Information Technology",
+    "Intended Audience :: Science/Research",
+    "Intended Audience :: Telecommunications Industry",
+    "License :: OSI Approved :: BSD License",
+    "Operating System :: OS Independent",
+    "Programming Language :: Python :: 3.8",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Topic :: Scientific/Engineering :: Artificial Intelligence",
+    "Topic :: Scientific/Engineering :: Information Analysis",
+]
+dependencies = [
+    "catboost>=1.0.3",
+    "fastparquet>=0.8.1",
+    "ipywidgets>=8.1.0",
+    "lightgbm>=3.3.2",
+    "numpy>=1.19.0",
+    "pandas>=1.1.0,<3.0.0",
+    "pydantic>=1.8.2,<2.0.0",
+    "pyjwt>=2.8.0",
+    "python-dateutil>=2.8.0",
+    "python-json-logger>=2.0.2",
+    "requests>=2.8.0",
+    "scikit-learn>=1.3.0",
+    "xhtml2pdf==0.2.11",
+]
+[project.urls]
+"Bug Reports" = "https://github.com/upgini/upgini/issues"
+Homepage = "https://upgini.com/"
+Source = "https://github.com/upgini/upgini"
+[tool.hatch.version]
+path = "src/upgini/__about__.py"
+[tool.hatch.build.targets.sdist]
+include = [
+    "src"
+]
+[tool.hatch.build.targets.wheel]
+packages = [
+    "src/upgini"
+]
+[tool.hatch.envs.default]
+type = "virtual"
+python = "3.10"
+[tool.hatch.envs.test]
+dependencies = [
+  "coverage[toml]",
+  "pytest",
+  "pytest-cov",
+  "requests-mock",
+]
+[tool.hatch.envs.test.scripts]
+cov = 'pytest --cov-report=term-missing --cov-config=pyproject.toml --cov=upgini --cov=tests {args}'
+format = "black {args}"
+lint = "ruff check {args}"
+test_binary = 'pytest -s -vv tests/test_binary_dataset.py'
+[[tool.hatch.envs.test.matrix]]
+python = ["3.8", "3.9", "3.10"]
+[tool.black]
+line-length = 120
+[tool.isort]
+profile = "black"
+[tool.pytest.ini_options]
+pythonpath = [
+  "./src"
+]

upgini-1.1.279/src/upgini/__about__.py ADDED

@@ -0,0 +1 @@
+__version__ = "1.1.279"
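The new pyproject.toml declares `dynamic = ["version"]` and points `[tool.hatch.version]` at `src/upgini/__about__.py`, so the version string is taken from this file at build time. A rough sketch of what such a lookup amounts to; the regular expression and the `read_version` helper are assumptions for illustration, not hatchling's actual implementation:

```python
import re

# Contents of src/upgini/__about__.py as added in this release.
ABOUT_SOURCE = '__version__ = "1.1.279"\n'

# Hypothetical pattern: a top-level __version__ assignment with a quoted value.
VERSION_PATTERN = re.compile(
    r'^__version__\s*=\s*["\'](?P<version>[^"\']+)["\']',
    re.MULTILINE,
)


def read_version(source: str) -> str:
    """Return the version string assigned to __version__ in the given source text."""
    match = VERSION_PATTERN.search(source)
    if match is None:
        raise ValueError("no __version__ assignment found")
    return match.group("version")


print(read_version(ABOUT_SOURCE))  # prints 1.1.279
```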

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/ads_management/ads_manager.py RENAMED

@@ -1,9 +1,11 @@
 import time
-from typing import Dict, Optional
 import uuid
+from typing import Dict, Optional
+import pandas as pd
 from upgini.http import get_rest_client
 from upgini.spinner import Spinner
-import pandas as pd
 class AdsManager:

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/all_operands.py RENAMED

@@ -1,9 +1,10 @@
 from typing import Dict
+from upgini.autofe.binary import Add, Divide, Max, Min, Multiply, Sim, Subtract
 from upgini.autofe.date import DateDiff, DateDiffType2, DateListDiff, DateListDiffBounded
 from upgini.autofe.groupby import GroupByThenAgg, GroupByThenRank
 from upgini.autofe.operand import Operand
-from upgini.autofe.unary import Abs, Log, Residual, Sqrt, Square, Sigmoid, Floor, Freq
-from upgini.autofe.binary import Min, Max, Add, Subtract, Multiply, Divide, Sim
+from upgini.autofe.unary import Abs, Floor, Freq, Log, Residual, Sigmoid, Sqrt, Square
 from upgini.autofe.vector import Mean, Sum
 ALL_OPERANDS: Dict[str, Operand] = {

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/binary.py RENAMED

@@ -1,9 +1,10 @@
-from upgini.autofe.operand import PandasOperand, VectorizableMixin
 import numpy as np
 import pandas as pd
 from numpy import dot
 from numpy.linalg import norm
+from upgini.autofe.operand import PandasOperand, VectorizableMixin
 class Min(PandasOperand):
     name = "min"

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/date.py RENAMED

@@ -1,8 +1,9 @@
 from typing import Any, Optional, Union
 import numpy as np
 import pandas as pd
-from pydantic import BaseModel
 from pandas.core.arrays.timedeltas import TimedeltaArray
+from pydantic import BaseModel
 from upgini.autofe.operand import PandasOperand

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/feature.py RENAMED

@@ -215,7 +215,7 @@ class Feature:
             return Column(string)
         def is_trivial_char(c: str) -> bool:
-            return not (c in "()+-*/,")
+            return c not in "()+-*/,"
         def find_prev(string: str) -> int:
             if string[-1] != ")":

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/groupby.py RENAMED

@@ -1,7 +1,9 @@
-from upgini.autofe.operand import PandasOperand, VectorizableMixin
 from typing import Optional
 import pandas as pd
+from upgini.autofe.operand import PandasOperand, VectorizableMixin
 class GroupByThenAgg(PandasOperand, VectorizableMixin):
     agg: Optional[str]

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/operand.py RENAMED

@@ -1,8 +1,9 @@
-from pydantic import BaseModel
-from typing import Dict, List, Optional, Tuple, Union
 import abc
-import pandas as pd
+from typing import Dict, List, Optional, Tuple, Union
 import numpy as np
+import pandas as pd
+from pydantic import BaseModel
 class Operand(BaseModel):

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/unary.py RENAMED

@@ -1,7 +1,8 @@
-from upgini.autofe.operand import PandasOperand, VectorizableMixin
 import numpy as np
 import pandas as pd
+from upgini.autofe.operand import PandasOperand, VectorizableMixin
 class Abs(PandasOperand, VectorizableMixin):
     name = "abs"

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/autofe/vector.py RENAMED

@@ -1,5 +1,7 @@
 from typing import List
 import pandas as pd
 from upgini.autofe.operand import PandasOperand, VectorizableMixin

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/dataset.py RENAMED

@@ -15,17 +15,15 @@ from pandas.api.types import (
     is_float_dtype,
     is_integer_dtype,
     is_numeric_dtype,
+    is_object_dtype,
     is_period_dtype,
     is_string_dtype,
-    is_object_dtype,
 )
 from upgini.errors import ValidationError
 from upgini.http import ProgressStage, SearchProgress, _RestClient
 from upgini.metadata import (
-    ENTITY_SYSTEM_RECORD_ID,
     EVAL_SET_INDEX,
-    SEARCH_KEY_UNNEST,
     SYSTEM_COLUMNS,
     SYSTEM_RECORD_ID,
     TARGET,
@@ -81,7 +79,6 @@ class Dataset:  # (pd.DataFrame):
         path: Optional[str] = None,
         meaning_types: Optional[Dict[str, FileColumnMeaningType]] = None,
         search_keys: Optional[List[Tuple[str, ...]]] = None,
-        unnest_search_keys: Optional[Dict[str, str]] = None,
         model_task_type: Optional[ModelTaskType] = None,
         random_state: Optional[int] = None,
         rest_client: Optional[_RestClient] = None,
@@ -98,7 +95,7 @@ class Dataset:  # (pd.DataFrame):
                 data = pd.read_csv(path, **kwargs)
             else:
                 # try different separators: , ; \t ...
-                with open(path, mode="r") as csvfile:
+                with open(path) as csvfile:
                     sep = csv.Sniffer().sniff(csvfile.read(2048)).delimiter
                 kwargs["sep"] = sep
                 data = pd.read_csv(path, **kwargs)
@@ -116,7 +113,6 @@ class Dataset:  # (pd.DataFrame):
         self.description = description
         self.meaning_types = meaning_types
         self.search_keys = search_keys
-        self.unnest_search_keys = unnest_search_keys
         self.ignore_columns = []
         self.hierarchical_group_keys = []
         self.hierarchical_subgroup_keys = []
@@ -176,7 +172,7 @@ class Dataset:  # (pd.DataFrame):
         new_columns = []
         dup_counter = 0
         for column in self.data.columns:
-            if column in [TARGET, EVAL_SET_INDEX, SYSTEM_RECORD_ID, ENTITY_SYSTEM_RECORD_ID, SEARCH_KEY_UNNEST]:
+            if column in [TARGET, EVAL_SET_INDEX, SYSTEM_RECORD_ID]:
                 self.columns_renaming[column] = column
                 new_columns.append(column)
                 continue
@@ -255,7 +251,7 @@ class Dataset:  # (pd.DataFrame):
     @staticmethod
     def _ip_to_int(ip: Optional[_BaseAddress]) -> Optional[int]:
         try:
-            if isinstance(ip, IPv4Address) or isinstance(ip, IPv6Address):
+            if isinstance(ip, (IPv4Address, IPv6Address)):
                 return int(ip)
         except Exception:
             pass
@@ -263,7 +259,7 @@ class Dataset:  # (pd.DataFrame):
     @staticmethod
     def _ip_to_int_str(ip: Optional[_BaseAddress]) -> Optional[str]:
         try:
-            if isinstance(ip, IPv4Address) or isinstance(ip, IPv6Address):
+            if isinstance(ip, (IPv4Address, IPv6Address)):
                 return str(int(ip))
         except Exception:
             pass
@@ -357,9 +353,7 @@ class Dataset:  # (pd.DataFrame):
             if is_string_dtype(self.data[postal_code]) or is_object_dtype(self.data[postal_code]):
                 try:
-                    self.data[postal_code] = (
-                        self.data[postal_code].astype("string").astype("Float64").astype("Int64").astype("string")
-                    )
+                    self.data[postal_code] = self.data[postal_code].astype("float64").astype("Int64").astype("string")
                 except Exception:
                     pass
             elif is_float_dtype(self.data[postal_code]):
@@ -809,9 +803,6 @@ class Dataset:  # (pd.DataFrame):
                     meaningType=meaning_type,
                     minMaxValues=min_max_values,
                 )
-                if self.unnest_search_keys and column_meta.originalName in self.unnest_search_keys:
-                    column_meta.isUnnest = True
-                    column_meta.unnestKeyNames = self.unnest_search_keys[column_meta.originalName]
                 columns.append(column_meta)
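Among the dataset.py hunks, the `_ip_to_int` change collapses two `isinstance` calls into one call with a tuple of types, which is behaviorally equivalent. A standalone sketch of that pattern using only the standard library; the free function `ip_to_int` mirrors the shape of the static method in the diff, and the explicit `return None` fallback is an assumption, since the hunk does not show the end of the method:

```python
from ipaddress import IPv4Address, IPv6Address, ip_address
from typing import Optional


def ip_to_int(ip) -> Optional[int]:
    """Convert an IPv4/IPv6 address object to its integer value, else return None."""
    try:
        # A tuple of types matches if the object is an instance of any of them.
        if isinstance(ip, (IPv4Address, IPv6Address)):
            return int(ip)
    except Exception:
        pass
    return None  # assumed fallback for anything that is not an address object


print(ip_to_int(ip_address("192.168.0.1")))  # prints 3232235521
print(ip_to_int(ip_address("::1")))          # prints 1
print(ip_to_int("192.168.0.1"))              # prints None (a plain string, not an address object)
```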

{upgini-1.1.278a2 → upgini-1.1.279}/src/upgini/errors.py RENAMED

@@ -16,7 +16,7 @@ class UnauthorizedError(HttpError):
     """Unauthorized error from REST API."""
     def __init__(self, message, status_code):
-        message = "Unauthorized, please check your authorization token ({})".format(message)
+        message = f"Unauthorized, please check your authorization token ({message})"
         super(UnauthorizedError, self).__init__(message, status_code)
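The errors.py hunk swaps `str.format` for an f-string, which should leave the resulting message unchanged. A minimal sketch to illustrate; the `HttpError` base class here is a hypothetical stand-in (only its constructor signature is taken from the diff), and the sample message and status code are made up:

```python
class HttpError(Exception):
    """Hypothetical stand-in for the package's HttpError base class."""

    def __init__(self, message, status_code):
        super().__init__(message)
        self.message = message
        self.status_code = status_code


class UnauthorizedError(HttpError):
    """Unauthorized error from REST API."""

    def __init__(self, message, status_code):
        # f-string form introduced in 1.1.279.
        message = f"Unauthorized, please check your authorization token ({message})"
        super().__init__(message, status_code)


detail = "token expired"  # illustrative value
# str.format form used in 1.1.278a2.
old_style = "Unauthorized, please check your authorization token ({})".format(detail)

err = UnauthorizedError(detail, 401)
print(err.message == old_style)
```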
