aio-sf 0.1.0b4__tar.gz → 0.1.0b6__tar.gz
This diff compares the contents of two publicly released versions of the package as they appear in their public registry. It is provided for informational purposes only.
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/PKG-INFO +17 -10
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/README.md +9 -5
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/RELEASE.md +0 -25
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/pyproject.toml +8 -3
- aio_sf-0.1.0b6/src/aio_sf/__init__.py +61 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/exporter/bulk_export.py +2 -9
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/exporter/parquet_writer.py +96 -12
- aio_sf-0.1.0b4/src/aio_sf/__init__.py +0 -28
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/.cursor/rules/api-structure.mdc +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/.cursor/rules/async-patterns.mdc +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/.cursor/rules/project-tooling.mdc +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/.github/workflows/publish.yml +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/.github/workflows/test.yml +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/.gitignore +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/LICENSE +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/pytest.ini +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/auth/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/auth/base.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/auth/client_credentials.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/auth/refresh_token.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/auth/sfdx_cli.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/auth/static_token.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/bulk_v2/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/bulk_v2/client.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/bulk_v2/types.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/client.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/collections/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/collections/client.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/collections/types.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/describe/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/describe/client.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/describe/types.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/query/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/query/client.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/query/types.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/api/types.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/src/aio_sf/exporter/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/tests/__init__.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/tests/conftest.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/tests/test_api_clients.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/tests/test_auth.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/tests/test_client.py +0 -0
- {aio_sf-0.1.0b4 → aio_sf-0.1.0b6}/uv.lock +0 -0
--- aio_sf-0.1.0b4/PKG-INFO
+++ aio_sf-0.1.0b6/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: aio-sf
-Version: 0.1.0b4
+Version: 0.1.0b6
 Summary: Async Salesforce library for Python
 Project-URL: Homepage, https://github.com/callawaycloud/aio-salesforce
 Project-URL: Repository, https://github.com/callawaycloud/aio-salesforce
@@ -35,13 +35,16 @@ Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
 Requires-Python: >=3.11
+Requires-Dist: boto3>=1.34.0
 Requires-Dist: httpx>=0.25.0
+Requires-Dist: pandas>=2.0.0
+Requires-Dist: pyarrow>=10.0.0
 Requires-Dist: pydantic>=2.0.0
 Requires-Dist: python-dotenv>=1.0.0
-Provides-Extra:
-Requires-Dist:
-Requires-Dist:
-Requires-Dist:
+Provides-Extra: core
+Requires-Dist: httpx>=0.25.0; extra == 'core'
+Requires-Dist: pydantic>=2.0.0; extra == 'core'
+Requires-Dist: python-dotenv>=1.0.0; extra == 'core'
 Provides-Extra: dev
 Requires-Dist: black>=23.0.0; extra == 'dev'
 Requires-Dist: mypy>=1.5.0; extra == 'dev'
@@ -88,16 +91,16 @@ An async Salesforce library for Python.

 ## Installation

-###
+### Full Package (Default - Includes Everything)
 ```bash
 uv add aio-sf
 # or: pip install aio-sf
 ```

-###
+### Core Only (Minimal Dependencies)
 ```bash
-uv add "aio-sf[
-# or: pip install "aio-sf[
+uv add "aio-sf[core]"
+# or: pip install "aio-sf[core]"
 ```

 ## Quick Start
@@ -157,7 +160,11 @@ The Exporter library contains a streamlined and "opinionated" way to export data

 ### 3. Export to Parquet
 ```python
-
+# With full installation (default), you can import directly from aio_sf
+from aio_sf import SalesforceClient, ClientCredentialsAuth, bulk_query, write_query_to_parquet
+
+# Or import from the exporter module (both work)
+# from aio_sf.exporter import bulk_query, write_query_to_parquet

 async def main():
     # ... authentication code from above ...
--- aio_sf-0.1.0b4/README.md
+++ aio_sf-0.1.0b6/README.md
@@ -28,16 +28,16 @@ An async Salesforce library for Python.

 ## Installation

-###
+### Full Package (Default - Includes Everything)
 ```bash
 uv add aio-sf
 # or: pip install aio-sf
 ```

-###
+### Core Only (Minimal Dependencies)
 ```bash
-uv add "aio-sf[
-# or: pip install "aio-sf[
+uv add "aio-sf[core]"
+# or: pip install "aio-sf[core]"
 ```

 ## Quick Start
@@ -97,7 +97,11 @@ The Exporter library contains a streamlined and "opinionated" way to export data

 ### 3. Export to Parquet
 ```python
-
+# With full installation (default), you can import directly from aio_sf
+from aio_sf import SalesforceClient, ClientCredentialsAuth, bulk_query, write_query_to_parquet
+
+# Or import from the exporter module (both work)
+# from aio_sf.exporter import bulk_query, write_query_to_parquet

 async def main():
     # ... authentication code from above ...
--- aio_sf-0.1.0b4/RELEASE.md
+++ aio_sf-0.1.0b6/RELEASE.md
@@ -39,31 +39,6 @@
 - Builds and publishes to PyPI automatically
 - Requires manual approval in the `pypi` environment

-## Manual Release (Backup)
-
-If you need to publish manually:
-
-```bash
-# Build the package
-uv build
-
-# Publish to PyPI (requires PYPI_API_TOKEN env var)
-export PYPI_API_TOKEN=your_token_here
-uv publish --token $PYPI_API_TOKEN
-```
-
-## Test Release
-
-To test on TestPyPI first:
-
-```bash
-# Get TestPyPI token from test.pypi.org
-uv publish --repository testpypi --token $TEST_PYPI_TOKEN
-
-# Test install from TestPyPI
-pip install --index-url https://test.pypi.org/simple/ aio-salesforce
-```
-
 ## Version Strategy

 ### Automatic Versioning
--- aio_sf-0.1.0b4/pyproject.toml
+++ aio_sf-0.1.0b6/pyproject.toml
@@ -24,17 +24,22 @@ dependencies = [
     "httpx>=0.25.0",
     "pydantic>=2.0.0",
     "python-dotenv>=1.0.0",
+    "pandas>=2.0.0",
+    "pyarrow>=10.0.0",
+    "boto3>=1.34.0",  # For S3 uploads (future feature)
 ]

 [project.optional-dependencies]
+core = [
+    "httpx>=0.25.0",
+    "pydantic>=2.0.0",
+    "python-dotenv>=1.0.0",
+]
 exporter = [
     "pandas>=2.0.0",
     "pyarrow>=10.0.0",
     "boto3>=1.34.0",  # For S3 uploads (future feature)
 ]
-all = [
-    "aio-sf[exporter]",
-]
 dev = [
     "pytest>=7.0.0",
     "pytest-asyncio>=0.21.0",
--- /dev/null
+++ aio_sf-0.1.0b6/src/aio_sf/__init__.py
@@ -0,0 +1,61 @@
+"""aio-salesforce: Async Salesforce library for Python with Bulk API 2.0 support."""
+
+__author__ = "Jonas"
+__email__ = "charlie@callaway.cloud"
+
+# Client functionality
+from .api.client import SalesforceClient  # noqa: F401
+from .api.auth import (  # noqa: F401
+    SalesforceAuthError,
+    AuthStrategy,
+    ClientCredentialsAuth,
+    RefreshTokenAuth,
+    StaticTokenAuth,
+    SfdxCliAuth,
+)
+
+# Core package exports client functionality
+# Exporter functionality is included by default, but gracefully handles missing deps
+__all__ = [
+    "SalesforceClient",
+    "SalesforceAuthError",
+    "AuthStrategy",
+    "ClientCredentialsAuth",
+    "RefreshTokenAuth",
+    "StaticTokenAuth",
+    "SfdxCliAuth",
+]
+
+# Try to import exporter functionality if dependencies are available
+try:
+    from .exporter import (  # noqa: F401
+        bulk_query,
+        get_bulk_fields,
+        resume_from_locator,
+        write_records_to_csv,
+        QueryResult,
+        batch_records_async,
+        ParquetWriter,
+        create_schema_from_metadata,
+        write_query_to_parquet,
+        salesforce_to_arrow_type,
+    )
+
+    __all__.extend(
+        [
+            "bulk_query",
+            "get_bulk_fields",
+            "resume_from_locator",
+            "write_records_to_csv",
+            "QueryResult",
+            "batch_records_async",
+            "ParquetWriter",
+            "create_schema_from_metadata",
+            "write_query_to_parquet",
+            "salesforce_to_arrow_type",
+        ]
+    )
+
+except ImportError:
+    # Exporter dependencies not available - this is fine for core-only installs
+    pass
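The new top-level `__init__.py` makes the exporter re-exports best-effort: the core client imports always succeed, and the exporter symbols are only added to `__all__` when the optional dependencies import cleanly. Below is a minimal consumer-side sketch of the same pattern; the `has_exporter` flag and the `None` fallbacks are illustrative, not part of the package.

```python
# Hedged sketch: feature-detecting the optional exporter API from application
# code, mirroring the try/except ImportError used inside aio_sf/__init__.py.
# `has_exporter` and the None fallbacks are illustrative names, not library API.
from aio_sf import SalesforceClient  # core import, available in every install

try:
    # Re-exported by the package only when the exporter dependencies are present
    from aio_sf import bulk_query, write_query_to_parquet
    has_exporter = True
except ImportError:
    # Exporter dependencies (pandas/pyarrow) not importable in this environment
    bulk_query = write_query_to_parquet = None
    has_exporter = False

print(f"exporter helpers available: {has_exporter}")
```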
--- aio_sf-0.1.0b4/src/aio_sf/exporter/bulk_export.py
+++ aio_sf-0.1.0b6/src/aio_sf/exporter/bulk_export.py
@@ -316,18 +316,11 @@ async def get_bulk_fields(fields_metadata: List[FieldInfo]) -> List[FieldInfo]:
     """Get field metadata for queryable fields in a Salesforce object."""
     # Use the metadata API to get object description

-    #
-    compound_field_names = {
-        field.get("compoundFieldName")
-        for field in fields_metadata
-        if field.get("compoundFieldName")
-    }
-
-    # Filter to only queryable fields that aren't compound fields
+    # Filter to only queryable fields that aren't compound fields (unless field is actually name)
     queryable_fields = [
         field
         for field in fields_metadata
-        if field.get("
+        if field.get("type") not in ["address", "location"]
     ]

     return queryable_fields
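The revised `get_bulk_fields` filter drops compound `address`/`location` fields by their type rather than collecting `compoundFieldName`s first. A standalone sketch of just that filter, using made-up sample metadata:

```python
# Standalone sketch of the new filter: compound fields are excluded by type
# ("address"/"location"); the sample metadata below is made up for illustration.
fields_metadata = [
    {"name": "Id", "type": "id"},
    {"name": "BillingAddress", "type": "address"},  # compound field, excluded
    {"name": "BillingCity", "type": "string"},      # component field, kept
    {"name": "Location__c", "type": "location"},    # compound field, excluded
]

queryable_fields = [
    field
    for field in fields_metadata
    if field.get("type") not in ["address", "location"]
]
print([f["name"] for f in queryable_fields])  # ['Id', 'BillingCity']
```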
--- aio_sf-0.1.0b4/src/aio_sf/exporter/parquet_writer.py
+++ aio_sf-0.1.0b6/src/aio_sf/exporter/parquet_writer.py
@@ -3,26 +3,37 @@ Parquet writer module for converting Salesforce QueryResult to Parquet format.
 """

 import logging
-from typing import Any, Dict, List, Optional
+from typing import Any, Dict, List, Optional, Callable
 from pathlib import Path
 import pyarrow as pa
 import pandas as pd
 import pyarrow.parquet as pq
+from datetime import datetime

 from ..api.describe.types import FieldInfo

 from .bulk_export import QueryResult, batch_records_async


-def salesforce_to_arrow_type(sf_type: str) -> pa.DataType:
-
+def salesforce_to_arrow_type(
+    sf_type: str, convert_datetime_to_timestamp: bool = True
+) -> pa.DataType:
+    """Convert Salesforce data types to Arrow data types.
+
+    :param sf_type: Salesforce field type
+    :param convert_datetime_to_timestamp: If True, datetime fields use timestamp type, otherwise string
+    """
     type_mapping = {
         "string": pa.string(),
         "boolean": pa.bool_(),
         "int": pa.int64(),
         "double": pa.float64(),
-        "date": pa.string(),  #
-        "datetime":
+        "date": pa.string(),  # Always store as string since SF returns ISO format
+        "datetime": (
+            pa.timestamp("us", tz="UTC")
+            if convert_datetime_to_timestamp
+            else pa.string()
+        ),
         "currency": pa.float64(),
         "reference": pa.string(),
         "picklist": pa.string(),
@@ -40,18 +51,26 @@ def salesforce_to_arrow_type(sf_type: str) -> pa.DataType:
     return type_mapping.get(sf_type.lower(), pa.string())


-def create_schema_from_metadata(
+def create_schema_from_metadata(
+    fields_metadata: List[FieldInfo],
+    column_formatter: Optional[Callable[[str], str]] = None,
+    convert_datetime_to_timestamp: bool = True,
+) -> pa.Schema:
     """
     Create a PyArrow schema from Salesforce field metadata.

     :param fields_metadata: List of field metadata dictionaries from Salesforce
+    :param column_formatter: Optional function to format column names
+    :param convert_datetime_to_timestamp: If True, datetime fields use timestamp type, otherwise string
     :returns: PyArrow schema
     """
     arrow_fields = []
     for field in fields_metadata:
-        field_name = field.get("name", "")
+        field_name = field.get("name", "")
+        if column_formatter:
+            field_name = column_formatter(field_name)
         sf_type = field.get("type", "string")
-        arrow_type = salesforce_to_arrow_type(sf_type)
+        arrow_type = salesforce_to_arrow_type(sf_type, convert_datetime_to_timestamp)
         # All fields are nullable since Salesforce can return empty values
         arrow_fields.append(pa.field(field_name, arrow_type, nullable=True))

@@ -70,6 +89,8 @@ class ParquetWriter:
         schema: Optional[pa.Schema] = None,
         batch_size: int = 10000,
         convert_empty_to_null: bool = True,
+        column_formatter: Optional[Callable[[str], str]] = None,
+        convert_datetime_to_timestamp: bool = True,
     ):
         """
         Initialize ParquetWriter.
@@ -78,11 +99,15 @@ class ParquetWriter:
         :param schema: Optional PyArrow schema. If None, will be inferred from first batch
         :param batch_size: Number of records to process in each batch
         :param convert_empty_to_null: Convert empty strings to null values
+        :param column_formatter: Optional function to format column names. If None, no formatting is applied
+        :param convert_datetime_to_timestamp: If True, datetime fields are converted to timestamps, otherwise stored as strings
         """
         self.file_path = file_path
         self.schema = schema
         self.batch_size = batch_size
         self.convert_empty_to_null = convert_empty_to_null
+        self.column_formatter = column_formatter
+        self.convert_datetime_to_timestamp = convert_datetime_to_timestamp
         self._writer = None
         self._schema_finalized = False

@@ -106,10 +131,15 @@ class ParquetWriter:
         if not batch:
             return

-        #
+        # Apply column formatting if specified
         converted_batch = []
         for record in batch:
-
+            if self.column_formatter:
+                converted_record = {
+                    self.column_formatter(k): v for k, v in record.items()
+                }
+            else:
+                converted_record = record.copy()
             converted_batch.append(converted_record)

         # Create DataFrame
@@ -121,7 +151,7 @@ class ParquetWriter:
                 self.schema = self._infer_schema_from_dataframe(df)
             else:
                 # Filter schema to only include fields that are actually in the data
-                self.schema = self._filter_schema_to_data(self.schema, df.columns)
+                self.schema = self._filter_schema_to_data(self.schema, list(df.columns))
             self._schema_finalized = True

         # Apply data type conversions based on schema
@@ -181,6 +211,8 @@ class ParquetWriter:

     def _convert_dataframe_types(self, df: pd.DataFrame) -> None:
         """Convert DataFrame types based on the schema."""
+        if self.schema is None:
+            return
         for field in self.schema:
             field_name = field.name
             if field_name not in df.columns:
@@ -209,11 +241,55 @@ class ParquetWriter:
                 )  # Nullable integer
             elif pa.types.is_floating(field.type):
                 df[field_name] = pd.to_numeric(df[field_name], errors="coerce")
+            elif pa.types.is_timestamp(field.type):
+                # Convert Salesforce ISO datetime strings to timestamps
+                datetime_series = df[field_name]
+                if isinstance(datetime_series, pd.Series):
+                    df[field_name] = self._convert_datetime_strings_to_timestamps(
+                        datetime_series
+                    )

             # Replace empty strings with None for non-string fields
             if not pa.types.is_string(field.type):
                 df[field_name] = df[field_name].replace("", pd.NA)

+    def _convert_datetime_strings_to_timestamps(self, series: pd.Series) -> pd.Series:
+        """
+        Convert Salesforce ISO datetime strings to pandas datetime objects.
+
+        Salesforce returns datetime in ISO format like '2023-12-25T10:30:00.000+0000'
+        or '2023-12-25T10:30:00Z'. This method handles various ISO formats.
+        """
+
+        def parse_sf_datetime(dt_str):
+            if pd.isna(dt_str) or dt_str == "" or dt_str is None:
+                return pd.NaT
+
+            try:
+                # Handle common Salesforce datetime formats
+                dt_str = str(dt_str).strip()
+
+                # Convert +0000 to Z for pandas compatibility
+                if dt_str.endswith("+0000"):
+                    dt_str = dt_str[:-5] + "Z"
+                elif dt_str.endswith("+00:00"):
+                    dt_str = dt_str[:-6] + "Z"
+
+                # Use pandas to_datetime with UTC parsing
+                return pd.to_datetime(dt_str, utc=True)
+
+            except (ValueError, TypeError) as e:
+                logging.warning(f"Failed to parse datetime string '{dt_str}': {e}")
+                return pd.NaT
+
+        # Apply the conversion function to the series
+        result = series.apply(parse_sf_datetime)
+        if isinstance(result, pd.Series):
+            return result
+        else:
+            # This shouldn't happen, but handle it gracefully
+            return pd.Series(result, index=series.index)
+
     def close(self) -> None:
         """Close the parquet writer."""
         if self._writer:
@@ -228,6 +304,8 @@ async def write_query_to_parquet(
     schema: Optional[pa.Schema] = None,
     batch_size: int = 10000,
     convert_empty_to_null: bool = True,
+    column_formatter: Optional[Callable[[str], str]] = None,
+    convert_datetime_to_timestamp: bool = True,
 ) -> None:
     """
     Convenience function to write a QueryResult to a parquet file (async version).
@@ -238,18 +316,24 @@ async def write_query_to_parquet(
     :param schema: Optional pre-created PyArrow schema (takes precedence over fields_metadata)
     :param batch_size: Number of records to process in each batch
     :param convert_empty_to_null: Convert empty strings to null values
+    :param column_formatter: Optional function to format column names
+    :param convert_datetime_to_timestamp: If True, datetime fields are converted to timestamps, otherwise stored as strings
     """
     effective_schema = None
     if schema:
         effective_schema = schema
     elif fields_metadata:
-        effective_schema = create_schema_from_metadata(
+        effective_schema = create_schema_from_metadata(
+            fields_metadata, column_formatter, convert_datetime_to_timestamp
+        )

     writer = ParquetWriter(
         file_path=file_path,
         schema=effective_schema,
         batch_size=batch_size,
         convert_empty_to_null=convert_empty_to_null,
+        column_formatter=column_formatter,
+        convert_datetime_to_timestamp=convert_datetime_to_timestamp,
     )

     await writer.write_query_result(query_result)
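The new timestamp branch normalizes Salesforce ISO datetime strings before pandas parses them. Below is a self-contained sketch of that normalization run outside the writer (sample values are made up); it mirrors `_convert_datetime_strings_to_timestamps`, which is applied when `convert_datetime_to_timestamp=True`.

```python
# Self-contained sketch of the datetime normalization ParquetWriter now applies:
# '+0000' / '+00:00' offsets are rewritten to 'Z', values are parsed as UTC,
# and empty or unparseable strings become NaT.
import pandas as pd

def parse_sf_datetime(dt_str):
    if pd.isna(dt_str) or dt_str == "" or dt_str is None:
        return pd.NaT
    try:
        dt_str = str(dt_str).strip()
        if dt_str.endswith("+0000"):
            dt_str = dt_str[:-5] + "Z"
        elif dt_str.endswith("+00:00"):
            dt_str = dt_str[:-6] + "Z"
        return pd.to_datetime(dt_str, utc=True)
    except (ValueError, TypeError):
        return pd.NaT

series = pd.Series(["2023-12-25T10:30:00.000+0000", "2023-12-25T10:30:00Z", ""])
print(series.apply(parse_sf_datetime))
# -> two UTC timestamps (2023-12-25 10:30:00+00:00) and one NaT
```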
--- aio_sf-0.1.0b4/src/aio_sf/__init__.py
+++ /dev/null
@@ -1,28 +0,0 @@
-"""aio-salesforce: Async Salesforce library for Python with Bulk API 2.0 support."""
-
-__author__ = "Jonas"
-__email__ = "charlie@callaway.cloud"
-
-# Client functionality
-from .api.client import SalesforceClient  # noqa: F401
-from .api.auth import (  # noqa: F401
-    SalesforceAuthError,
-    AuthStrategy,
-    ClientCredentialsAuth,
-    RefreshTokenAuth,
-    StaticTokenAuth,
-    SfdxCliAuth,
-)
-
-# Core package only exports client functionality
-# Users import exporter functions directly: from aio_sf.exporter import bulk_query
-
-__all__ = [
-    "SalesforceClient",
-    "SalesforceAuthError",
-    "AuthStrategy",
-    "ClientCredentialsAuth",
-    "RefreshTokenAuth",
-    "StaticTokenAuth",
-    "SfdxCliAuth",
-]