openaivec 0.14.13__tar.gz → 0.15.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (89)
  1. openaivec-0.15.0/AGENTS.md +34 -0
  2. {openaivec-0.14.13 → openaivec-0.15.0}/PKG-INFO +8 -6
  3. {openaivec-0.14.13 → openaivec-0.15.0}/README.md +7 -5
  4. openaivec-0.15.0/docs/contributor-guide.md +3 -0
  5. {openaivec-0.14.13 → openaivec-0.15.0}/docs/index.md +4 -4
  6. {openaivec-0.14.13 → openaivec-0.15.0}/mkdocs.yml +1 -0
  7. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/pandas_ext.py +67 -36
  8. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/spark.py +66 -17
  9. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_pandas_ext.py +25 -12
  10. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_provider.py +3 -3
  11. openaivec-0.15.0/uv.lock +2811 -0
  12. openaivec-0.14.13/uv.lock +0 -2692
  13. {openaivec-0.14.13 → openaivec-0.15.0}/.env.example +0 -0
  14. {openaivec-0.14.13 → openaivec-0.15.0}/.github/copilot-instructions.md +0 -0
  15. {openaivec-0.14.13 → openaivec-0.15.0}/.github/workflows/python-mkdocs.yml +0 -0
  16. {openaivec-0.14.13 → openaivec-0.15.0}/.github/workflows/python-package.yml +0 -0
  17. {openaivec-0.14.13 → openaivec-0.15.0}/.github/workflows/python-test.yml +0 -0
  18. {openaivec-0.14.13 → openaivec-0.15.0}/.github/workflows/python-update.yml +0 -0
  19. {openaivec-0.14.13 → openaivec-0.15.0}/.gitignore +0 -0
  20. {openaivec-0.14.13 → openaivec-0.15.0}/CODE_OF_CONDUCT.md +0 -0
  21. {openaivec-0.14.13 → openaivec-0.15.0}/LICENSE +0 -0
  22. {openaivec-0.14.13 → openaivec-0.15.0}/SECURITY.md +0 -0
  23. {openaivec-0.14.13 → openaivec-0.15.0}/SUPPORT.md +0 -0
  24. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/main.md +0 -0
  25. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/pandas_ext.md +0 -0
  26. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/spark.md +0 -0
  27. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/task.md +0 -0
  28. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/customer_support/customer_sentiment.md +0 -0
  29. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/customer_support/inquiry_classification.md +0 -0
  30. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/customer_support/inquiry_summary.md +0 -0
  31. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/customer_support/intent_analysis.md +0 -0
  32. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/customer_support/response_suggestion.md +0 -0
  33. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/customer_support/urgency_analysis.md +0 -0
  34. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/nlp/dependency_parsing.md +0 -0
  35. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/nlp/keyword_extraction.md +0 -0
  36. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/nlp/morphological_analysis.md +0 -0
  37. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/nlp/named_entity_recognition.md +0 -0
  38. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/nlp/sentiment_analysis.md +0 -0
  39. {openaivec-0.14.13 → openaivec-0.15.0}/docs/api/tasks/nlp/translation.md +0 -0
  40. {openaivec-0.14.13 → openaivec-0.15.0}/docs/robots.txt +0 -0
  41. {openaivec-0.14.13 → openaivec-0.15.0}/pyproject.toml +0 -0
  42. {openaivec-0.14.13 → openaivec-0.15.0}/pytest.ini +0 -0
  43. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/__init__.py +0 -0
  44. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_di.py +0 -0
  45. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_dynamic.py +0 -0
  46. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_embeddings.py +0 -0
  47. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_log.py +0 -0
  48. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_model.py +0 -0
  49. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_optimize.py +0 -0
  50. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_prompt.py +0 -0
  51. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_provider.py +0 -0
  52. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_proxy.py +0 -0
  53. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_responses.py +0 -0
  54. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_schema.py +0 -0
  55. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_serialize.py +0 -0
  56. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/_util.py +0 -0
  57. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/__init__.py +0 -0
  58. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/__init__.py +0 -0
  59. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/customer_sentiment.py +0 -0
  60. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/inquiry_classification.py +0 -0
  61. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/inquiry_summary.py +0 -0
  62. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/intent_analysis.py +0 -0
  63. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/response_suggestion.py +0 -0
  64. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/customer_support/urgency_analysis.py +0 -0
  65. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/__init__.py +0 -0
  66. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/dependency_parsing.py +0 -0
  67. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/keyword_extraction.py +0 -0
  68. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/morphological_analysis.py +0 -0
  69. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/named_entity_recognition.py +0 -0
  70. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/sentiment_analysis.py +0 -0
  71. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/nlp/translation.py +0 -0
  72. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/table/__init__.py +0 -0
  73. {openaivec-0.14.13 → openaivec-0.15.0}/src/openaivec/task/table/fillna.py +0 -0
  74. {openaivec-0.14.13 → openaivec-0.15.0}/tests/__init__.py +0 -0
  75. {openaivec-0.14.13 → openaivec-0.15.0}/tests/conftest.py +0 -0
  76. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_di.py +0 -0
  77. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_dynamic.py +0 -0
  78. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_embeddings.py +0 -0
  79. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_optimize.py +0 -0
  80. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_prompt.py +0 -0
  81. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_proxy.py +0 -0
  82. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_proxy_suggester.py +0 -0
  83. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_responses.py +0 -0
  84. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_schema.py +0 -0
  85. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_serialize.py +0 -0
  86. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_serialize_pydantic_v2_compliance.py +0 -0
  87. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_spark.py +0 -0
  88. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_task.py +0 -0
  89. {openaivec-0.14.13 → openaivec-0.15.0}/tests/test_util.py +0 -0
@@ -0,0 +1,34 @@
+ # Repository Guidelines
+
+ ## Project Layout
+ - `src/openaivec/`: batching core (`_proxy.py`, `_responses.py`, `_embeddings.py`), integrations (`pandas_ext.py`, `spark.py`), and tasks (`task/`); keep additions beside the APIs they extend.
+ - `tests/`: mirrors the source layout; use common pandas, Spark, and async fixtures.
+ - `docs/` holds MkDocs sources, `site/` generated pages, and `artifacts/` scratch assets kept out of releases.
+
+ ## Core Components & Contracts
+ - Remote work goes through `BatchingMapProxy`/`AsyncBatchingMapProxy`; they dedupe inputs, require same-length outputs, release waiters on failure, and show progress only when `show_progress=True` in notebooks.
+ - `_responses.py` enforces reasoning rules: o1/o3-family models must use `temperature=None`, and structured scenarios pass a Pydantic `response_format`.
+ - Reuse caches from `*_with_cache` or Spark UDF builders per operation and clear them afterward to avoid large payloads.
+
+ ## Development Workflow
+ - `uv sync --all-extras --dev` prepares extras and tooling; iterate with `uv run pytest -m "not slow and not requires_api"` before a full `uv run pytest`.
+ - `uv run ruff check . --fix` enforces style, `uv run pyright` guards API changes, and `uv build` validates the distribution.
+ - Use `uv pip install -e .` only when external tooling requires an editable install.
+
+ ## Coding Standards
+ - Target Python 3.10+, rely on absolute imports, and keep helpers private with leading underscores; public modules publish alphabetical `__all__`, internal ones set `__all__ = []`.
+ - Apply Google-style docstrings with `(type)` Args, Returns/Raises sections, double-backtick literals, and doctest-style `Example:` blocks (`>>>`) when useful.
+ - Async helpers end with `_async`; dataframe accessors use descriptive nouns (`responses`, `extract`); raise narrow exceptions (`ValueError`, `TypeError`).
+
+ ## Testing Guidelines
+ - Pytest discovers `tests/test_*.py`; parametrize to cover pandas vectorization, Spark UDFs, and async pathways.
+ - Mark network tests `@pytest.mark.requires_api`, long jobs `@pytest.mark.slow`, Spark flows `@pytest.mark.spark`; skip gracefully when credentials are missing.
+ - Add regression tests before fixes, assert on structure/length/order rather than verbatim text, and prefer shared fixtures over heavy mocking.
+
+ ## Collaboration
+ - Commits follow `type(scope): summary` (e.g., `fix(pandas): guard empty batch`) and avoid merge commits within feature branches.
+ - Pull requests explain motivation, outline the solution, link issues, list doc updates, and include the latest `uv run pytest` and `uv run ruff check . --fix` output; attach screenshots for doc or tutorial changes.
+
+ ## Environment & Secrets
+ - Export `OPENAI_API_KEY` or the Azure trio (`AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_VERSION`) before running `requires_api` tests; Azure endpoints must end with `/openai/v1/`.
+ - Keep local secrets under `artifacts/`, never commit credentials, and rely on CI-managed secrets when extending automation.
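The testing conventions in the new AGENTS.md map directly onto pytest markers. A minimal sketch of a conforming test, assuming the marker names above (the test name, body, and credential check are illustrative, not taken from the repository):

```python
import os

import pytest


# Hypothetical example of the AGENTS.md marker conventions: network-bound
# tests carry requires_api and skip gracefully when credentials are absent.
@pytest.mark.requires_api
@pytest.mark.slow
def test_responses_roundtrip_live():
    if not os.getenv("OPENAI_API_KEY"):
        pytest.skip("OPENAI_API_KEY not set")
    # Assert on structure/length/order of results, never on verbatim text.
```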
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: openaivec
- Version: 0.14.13
+ Version: 0.15.0
  Summary: Generative mutation for tabular calculation
  Project-URL: Homepage, https://microsoft.github.io/openaivec/
  Project-URL: Repository, https://github.com/microsoft/openaivec
@@ -26,6 +26,8 @@ Description-Content-Type: text/markdown
 
  # openaivec
 
+ [Contributor guidelines](AGENTS.md)
+
  **Transform your data analysis with AI-powered text processing at scale.**
 
  **openaivec** enables data analysts to seamlessly integrate OpenAI's language models into their pandas and Spark workflows. Process thousands of text records with natural language instructions, turning unstructured data into actionable insights with just a few lines of code.
@@ -187,13 +189,13 @@ os.environ["OPENAI_API_KEY"] = "your-api-key-here"
 
  # Authentication Option 2: Custom client (optional)
  # from openai import OpenAI, AsyncOpenAI
- # pandas_ext.use(OpenAI())
+ # pandas_ext.set_client(OpenAI())
  # For async operations:
- # pandas_ext.use_async(AsyncOpenAI())
+ # pandas_ext.set_async_client(AsyncOpenAI())
 
  # Configure model (optional - defaults to gpt-4.1-mini)
  # For Azure OpenAI: use your deployment name, for OpenAI: use model name
- pandas_ext.responses_model("gpt-4.1-mini")
+ pandas_ext.set_responses_model("gpt-4.1-mini")
 
  # Create your data
  df = pd.DataFrame({"name": ["panda", "rabbit", "koala"]})
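Read together, the hunks above (and the identical README.md hunks later in this diff) are a mechanical rename of the configuration helpers. A minimal migration sketch, based solely on the renames shown here:

```python
from openai import AsyncOpenAI, OpenAI

from openaivec import pandas_ext

# 0.14.x: pandas_ext.use(OpenAI())
pandas_ext.set_client(OpenAI())

# 0.14.x: pandas_ext.use_async(AsyncOpenAI())
pandas_ext.set_async_client(AsyncOpenAI())

# 0.14.x: pandas_ext.responses_model("gpt-4.1-mini")
pandas_ext.set_responses_model("gpt-4.1-mini")

# 0.14.x: pandas_ext.embeddings_model("text-embedding-3-small")
pandas_ext.set_embeddings_model("text-embedding-3-small")
```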
@@ -220,7 +222,7 @@ When using reasoning models (o1-preview, o1-mini, o3-mini, etc.), you must set `
 
  ```python
  # For reasoning models like o1-preview, o1-mini, o3-mini
- pandas_ext.responses_model("o1-mini") # Set your reasoning model
+ pandas_ext.set_responses_model("o1-mini") # Set your reasoning model
 
  # MUST use temperature=None with reasoning models
  result = df.assign(
@@ -291,7 +293,7 @@ import pandas as pd
  from openaivec import pandas_ext
 
  # Setup (same as synchronous version)
- pandas_ext.responses_model("gpt-4.1-mini")
+ pandas_ext.set_responses_model("gpt-4.1-mini")
 
  df = pd.DataFrame({"text": [
  "This product is amazing!",
@@ -1,5 +1,7 @@
  # openaivec
 
+ [Contributor guidelines](AGENTS.md)
+
  **Transform your data analysis with AI-powered text processing at scale.**
 
  **openaivec** enables data analysts to seamlessly integrate OpenAI's language models into their pandas and Spark workflows. Process thousands of text records with natural language instructions, turning unstructured data into actionable insights with just a few lines of code.
@@ -161,13 +163,13 @@ os.environ["OPENAI_API_KEY"] = "your-api-key-here"
 
  # Authentication Option 2: Custom client (optional)
  # from openai import OpenAI, AsyncOpenAI
- # pandas_ext.use(OpenAI())
+ # pandas_ext.set_client(OpenAI())
  # For async operations:
- # pandas_ext.use_async(AsyncOpenAI())
+ # pandas_ext.set_async_client(AsyncOpenAI())
 
  # Configure model (optional - defaults to gpt-4.1-mini)
  # For Azure OpenAI: use your deployment name, for OpenAI: use model name
- pandas_ext.responses_model("gpt-4.1-mini")
+ pandas_ext.set_responses_model("gpt-4.1-mini")
 
  # Create your data
  df = pd.DataFrame({"name": ["panda", "rabbit", "koala"]})
@@ -194,7 +196,7 @@ When using reasoning models (o1-preview, o1-mini, o3-mini, etc.), you must set `
 
  ```python
  # For reasoning models like o1-preview, o1-mini, o3-mini
- pandas_ext.responses_model("o1-mini") # Set your reasoning model
+ pandas_ext.set_responses_model("o1-mini") # Set your reasoning model
 
  # MUST use temperature=None with reasoning models
  result = df.assign(
@@ -265,7 +267,7 @@ import pandas as pd
  from openaivec import pandas_ext
 
  # Setup (same as synchronous version)
- pandas_ext.responses_model("gpt-4.1-mini")
+ pandas_ext.set_responses_model("gpt-4.1-mini")
 
  df = pd.DataFrame({"text": [
  "This product is amazing!",
@@ -0,0 +1,3 @@
+ # Contributor Guidelines
+
+ Refer to [AGENTS.md](https://github.com/microsoft/openaivec/blob/main/AGENTS.md) in the repository root for the authoritative contributor guide.
@@ -84,11 +84,11 @@ from openaivec import pandas_ext
  from typing import List
 
  # Set OpenAI Client (optional: this is default client if environment "OPENAI_API_KEY" is set)
- pandas_ext.use(OpenAI())
+ pandas_ext.set_client(OpenAI())
 
  # Set models for responses and embeddings(optional: these are default models)
- pandas_ext.responses_model("gpt-4.1-nano")
- pandas_ext.embeddings_model("text-embedding-3-small")
+ pandas_ext.set_responses_model("gpt-4.1-nano")
+ pandas_ext.set_embeddings_model("text-embedding-3-small")
 
 
  fruits: List[str] = ["apple", "banana", "orange", "grape", "kiwi", "mango", "peach", "pear", "pineapple", "strawberry"]
@@ -236,4 +236,4 @@ results = asyncio.run(analyze_feedback())
  ### When to Use Async vs Sync
 
  - **Use `.aio`** for: Large datasets (1000+ rows), time-sensitive processing, concurrent workflows
- - **Use `.ai`** for: Small datasets, interactive analysis, simple one-off operations
+ - **Use `.ai`** for: Small datasets, interactive analysis, simple one-off operations
@@ -52,6 +52,7 @@ nav:
  - Home: index.md
  - PyPI: https://pypi.org/project/openaivec/
  - GitHub: https://github.com/microsoft/openaivec
+ - Contributor Guidelines: contributor-guide.md
  - Examples:
  - Getting Started: examples/pandas.ipynb
  - Intelligent Fill: examples/intelligent_fill.ipynb
@@ -10,29 +10,32 @@ from openaivec import pandas_ext
  # (AZURE_OPENAI_API_KEY, AZURE_OPENAI_BASE_URL, AZURE_OPENAI_API_VERSION)
  # No explicit setup needed - clients are automatically created
 
- # Option 2: Use an existing OpenAI client instance
+ # Option 2: Register an existing OpenAI client instance
  client = OpenAI(api_key="your-api-key")
- pandas_ext.use(client)
+ pandas_ext.set_client(client)
 
- # Option 3: Use an existing Azure OpenAI client instance
+ # Option 3: Register an Azure OpenAI client instance
  azure_client = AzureOpenAI(
  api_key="your-azure-key",
  base_url="https://YOUR-RESOURCE-NAME.services.ai.azure.com/openai/v1/",
  api_version="preview"
  )
- pandas_ext.use(azure_client)
+ pandas_ext.set_client(azure_client)
 
- # Option 4: Use async Azure OpenAI client instance
+ # Option 4: Register an async Azure OpenAI client instance
  async_azure_client = AsyncAzureOpenAI(
  api_key="your-azure-key",
  base_url="https://YOUR-RESOURCE-NAME.services.ai.azure.com/openai/v1/",
  api_version="preview"
  )
- pandas_ext.use_async(async_azure_client)
+ pandas_ext.set_async_client(async_azure_client)
 
  # Set up model names (optional, defaults shown)
- pandas_ext.responses_model("gpt-4.1-mini")
- pandas_ext.embeddings_model("text-embedding-3-small")
+ pandas_ext.set_responses_model("gpt-4.1-mini")
+ pandas_ext.set_embeddings_model("text-embedding-3-small")
+
+ # Inspect current configuration
+ configured_model = pandas_ext.get_responses_model()
  ```
 
  This module provides `.ai` and `.aio` accessors for pandas Series and DataFrames
@@ -49,15 +52,6 @@ import numpy as np
  import pandas as pd
  import tiktoken
  from openai import AsyncOpenAI, OpenAI
-
- from openaivec._schema import InferredSchema, SchemaInferenceInput, SchemaInferer
-
- __all__ = [
- "embeddings_model",
- "responses_model",
- "use",
- "use_async",
- ]
  from pydantic import BaseModel
 
  from openaivec._embeddings import AsyncBatchEmbeddings, BatchEmbeddings
@@ -65,13 +59,18 @@ from openaivec._model import EmbeddingsModelName, PreparedTask, ResponseFormat,
  from openaivec._provider import CONTAINER, _check_azure_v1_api_url
  from openaivec._proxy import AsyncBatchingMapProxy, BatchingMapProxy
  from openaivec._responses import AsyncBatchResponses, BatchResponses
+ from openaivec._schema import InferredSchema, SchemaInferenceInput, SchemaInferer
  from openaivec.task.table import FillNaResponse, fillna
 
  __all__ = [
- "use",
- "use_async",
- "responses_model",
- "embeddings_model",
+ "get_async_client",
+ "get_client",
+ "get_embeddings_model",
+ "get_responses_model",
+ "set_async_client",
+ "set_client",
+ "set_embeddings_model",
+ "set_responses_model",
  ]
 
  _LOGGER = logging.getLogger(__name__)
@@ -95,37 +94,51 @@ def _df_rows_to_json_series(df: pd.DataFrame) -> pd.Series:
  T = TypeVar("T") # For pipe function return type
 
 
- def use(client: OpenAI) -> None:
- """Register a custom OpenAI‑compatible client.
+ def set_client(client: OpenAI) -> None:
+ """Register a custom OpenAI-compatible client for pandas helpers.
 
  Args:
- client (OpenAI): A pre‑configured `openai.OpenAI` or
- `openai.AzureOpenAI` instance.
- The same instance is reused by every helper in this module.
+ client (OpenAI): A pre-configured `openai.OpenAI` or
+ `openai.AzureOpenAI` instance reused by every helper in this module.
  """
- # Check Azure v1 API URL if using AzureOpenAI client
  if client.__class__.__name__ == "AzureOpenAI" and hasattr(client, "base_url"):
  _check_azure_v1_api_url(str(client.base_url))
 
  CONTAINER.register(OpenAI, lambda: client)
 
 
- def use_async(client: AsyncOpenAI) -> None:
- """Register a custom asynchronous OpenAI‑compatible client.
+ def get_client() -> OpenAI:
+ """Get the currently registered OpenAI-compatible client.
+
+ Returns:
+ OpenAI: The registered `openai.OpenAI` or `openai.AzureOpenAI` instance.
+ """
+ return CONTAINER.resolve(OpenAI)
+
+
+ def set_async_client(client: AsyncOpenAI) -> None:
+ """Register a custom asynchronous OpenAI-compatible client.
 
  Args:
- client (AsyncOpenAI): A pre‑configured `openai.AsyncOpenAI` or
- `openai.AsyncAzureOpenAI` instance.
- The same instance is reused by every helper in this module.
+ client (AsyncOpenAI): A pre-configured `openai.AsyncOpenAI` or
+ `openai.AsyncAzureOpenAI` instance reused by every helper in this module.
  """
- # Check Azure v1 API URL if using AsyncAzureOpenAI client
  if client.__class__.__name__ == "AsyncAzureOpenAI" and hasattr(client, "base_url"):
  _check_azure_v1_api_url(str(client.base_url))
 
  CONTAINER.register(AsyncOpenAI, lambda: client)
 
 
- def responses_model(name: str) -> None:
+ def get_async_client() -> AsyncOpenAI:
+ """Get the currently registered asynchronous OpenAI-compatible client.
+
+ Returns:
+ AsyncOpenAI: The registered `openai.AsyncOpenAI` or `openai.AsyncAzureOpenAI` instance.
+ """
+ return CONTAINER.resolve(AsyncOpenAI)
+
+
+ def set_responses_model(name: str) -> None:
  """Override the model used for text responses.
 
  Args:
@@ -135,7 +148,16 @@ def responses_model(name: str) -> None:
  CONTAINER.register(ResponsesModelName, lambda: ResponsesModelName(name))
 
 
- def embeddings_model(name: str) -> None:
+ def get_responses_model() -> str:
+ """Get the currently registered model name for text responses.
+
+ Returns:
+ str: The model name (for example, ``gpt-4.1-mini``).
+ """
+ return CONTAINER.resolve(ResponsesModelName).value
+
+
+ def set_embeddings_model(name: str) -> None:
  """Override the model used for text embeddings.
 
  Args:
@@ -145,6 +167,15 @@ def embeddings_model(name: str) -> None:
  CONTAINER.register(EmbeddingsModelName, lambda: EmbeddingsModelName(name))
 
 
+ def get_embeddings_model() -> str:
+ """Get the currently registered model name for text embeddings.
+
+ Returns:
+ str: The model name (for example, ``text-embedding-3-small``).
+ """
+ return CONTAINER.resolve(EmbeddingsModelName).value
+
+
  def _extract_value(x, series_name):
  """Return a homogeneous ``dict`` representation of any Series value.
 
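With the getters added above, the pandas_ext configuration becomes round-trippable. A short usage sketch of the new setter/getter pairs (the API key is a placeholder):

```python
from openai import OpenAI

from openaivec import pandas_ext

pandas_ext.set_client(OpenAI(api_key="your-api-key"))  # placeholder key
pandas_ext.set_responses_model("gpt-4.1-mini")
pandas_ext.set_embeddings_model("text-embedding-3-small")

# Each setter now has a matching getter backed by the same DI container.
assert pandas_ext.get_responses_model() == "gpt-4.1-mini"
assert pandas_ext.get_embeddings_model() == "text-embedding-3-small"
client = pandas_ext.get_client()  # returns the registered OpenAI instance
```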
@@ -639,7 +670,7 @@ class OpenAIVecSeriesAccessor:
  animals.ai.count_tokens()
  ```
  This method uses the `tiktoken` library to count tokens based on the
- model name set by `responses_model`.
+ model name configured via `set_responses_model`.
 
  Returns:
  pandas.Series: Token counts for each element.
@@ -193,8 +193,6 @@ def setup(
  CONTAINER.register(ResponsesModelName, lambda: ResponsesModelName(responses_model_name))
 
  if embeddings_model_name:
- from openaivec._model import EmbeddingsModelName
-
  CONTAINER.register(EmbeddingsModelName, lambda: EmbeddingsModelName(embeddings_model_name))
 
  CONTAINER.clear_singletons()
@@ -244,6 +242,50 @@ def setup_azure(
  CONTAINER.clear_singletons()
 
 
+ def set_responses_model(model_name: str):
+ """Set the default model name for response generation in the DI container.
+
+ Args:
+ model_name (str): The model name to set as default for responses.
+ """
+ CONTAINER.register(ResponsesModelName, lambda: ResponsesModelName(model_name))
+ CONTAINER.clear_singletons()
+
+
+ def get_responses_model() -> str | None:
+ """Get the default model name for response generation from the DI container.
+
+ Returns:
+ str | None: The default model name for responses, or None if not set.
+ """
+ try:
+ return CONTAINER.resolve(ResponsesModelName).value
+ except Exception:
+ return None
+
+
+ def set_embeddings_model(model_name: str):
+ """Set the default model name for embeddings in the DI container.
+
+ Args:
+ model_name (str): The model name to set as default for embeddings.
+ """
+ CONTAINER.register(EmbeddingsModelName, lambda: EmbeddingsModelName(model_name))
+ CONTAINER.clear_singletons()
+
+
+ def get_embeddings_model() -> str | None:
+ """Get the default model name for embeddings from the DI container.
+
+ Returns:
+ str | None: The default model name for embeddings, or None if not set.
+ """
+ try:
+ return CONTAINER.resolve(EmbeddingsModelName).value
+ except Exception:
+ return None
+
+
  def _python_type_to_spark(python_type):
  origin = get_origin(python_type)
 
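Note the contract difference from the pandas_ext getters earlier in this diff: the Spark variants catch resolution failures and return None instead of raising, so callers can probe the container before configuring it. A brief usage sketch:

```python
from openaivec import spark

# get_responses_model() returns None (rather than raising) when no model
# has been registered in the DI container yet.
if spark.get_responses_model() is None:
    spark.set_responses_model("gpt-4.1-mini")

# The setters also clear DI singletons, so cached objects are rebuilt with
# the new model name on next use.
```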
@@ -322,7 +364,7 @@ def _safe_dump(x: BaseModel | None) -> dict:
  def responses_udf(
  instructions: str,
  response_format: type[ResponseFormat] = str,
- model_name: str = CONTAINER.resolve(ResponsesModelName).value,
+ model_name: str | None = None,
  batch_size: int | None = None,
  max_concurrency: int = 8,
  **api_kwargs,
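This signature change fixes an eager-default pitfall: the old `CONTAINER.resolve(...)` default was evaluated once at import time, so a later `set_responses_model` call could not affect the UDF. With `None`, resolution moves into the function body (see the `_model_name = model_name or ...` line in a later hunk). A generic illustration of the pitfall, not library code:

```python
# Python evaluates default argument expressions once, at function
# definition time; the None-default idiom defers the lookup to call time.
CONFIG = {"model": "gpt-4.1-mini"}

def udf_eager(model: str = CONFIG["model"]):  # snapshot taken at import
    return model

def udf_lazy(model: str | None = None):  # resolved on every call
    return model or CONFIG["model"]

CONFIG["model"] = "o3-mini"
print(udf_eager())  # gpt-4.1-mini (stale import-time default)
print(udf_lazy())   # o3-mini (sees the updated configuration)
```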
@@ -351,8 +393,9 @@ def responses_udf(
  instructions (str): The system prompt or instructions for the model.
  response_format (type[ResponseFormat]): The desired output format. Either `str` for plain text
  or a Pydantic `BaseModel` for structured JSON output. Defaults to `str`.
- model_name (str): For Azure OpenAI, use your deployment name (e.g., "my-gpt4-deployment").
- For OpenAI, use the model name (e.g., "gpt-4.1-mini"). Defaults to configured model in DI container.
+ model_name (str | None): For Azure OpenAI, use your deployment name (e.g., "my-gpt4-deployment").
+ For OpenAI, use the model name (e.g., "gpt-4.1-mini"). Defaults to configured model in DI container
+ via ResponsesModelName if not provided.
  batch_size (int | None): Number of rows per async batch request within each partition.
  Larger values reduce API call overhead but increase memory usage.
  Defaults to None (automatic batch size optimization that dynamically
@@ -382,13 +425,15 @@ def responses_udf(
  - Consider your OpenAI tier limits: total_requests = max_concurrency × executors
  - Use Spark UI to optimize partition sizes relative to batch_size
  """
+ _model_name = model_name or CONTAINER.resolve(ResponsesModelName).value
+
  if issubclass(response_format, BaseModel):
  spark_schema = _pydantic_to_spark_schema(response_format)
  json_schema_string = serialize_base_model(response_format)
 
  @pandas_udf(returnType=spark_schema) # type: ignore[call-overload]
  def structure_udf(col: Iterator[pd.Series]) -> Iterator[pd.DataFrame]:
- pandas_ext.responses_model(model_name)
+ pandas_ext.set_responses_model(_model_name)
  response_format = deserialize_base_model(json_schema_string)
  cache = AsyncBatchingMapProxy[str, response_format](
  batch_size=batch_size,
@@ -415,7 +460,7 @@ def responses_udf(
 
  @pandas_udf(returnType=StringType()) # type: ignore[call-overload]
  def string_udf(col: Iterator[pd.Series]) -> Iterator[pd.Series]:
- pandas_ext.responses_model(model_name)
+ pandas_ext.set_responses_model(_model_name)
  cache = AsyncBatchingMapProxy[str, str](
  batch_size=batch_size,
  max_concurrency=max_concurrency,
@@ -443,7 +488,7 @@ def responses_udf(
 
  def task_udf(
  task: PreparedTask[ResponseFormat],
- model_name: str = CONTAINER.resolve(ResponsesModelName).value,
+ model_name: str | None = None,
  batch_size: int | None = None,
  max_concurrency: int = 8,
  **api_kwargs,
@@ -459,8 +504,9 @@ def task_udf(
  Args:
  task (PreparedTask): A predefined task configuration containing instructions,
  response format, and API parameters.
- model_name (str): For Azure OpenAI, use your deployment name (e.g., "my-gpt4-deployment").
- For OpenAI, use the model name (e.g., "gpt-4.1-mini"). Defaults to configured model in DI container.
+ model_name (str | None): For Azure OpenAI, use your deployment name (e.g., "my-gpt4-deployment").
+ For OpenAI, use the model name (e.g., "gpt-4.1-mini"). Defaults to configured model in DI container
+ via ResponsesModelName if not provided.
  batch_size (int | None): Number of rows per async batch request within each partition.
  Larger values reduce API call overhead but increase memory usage.
  Defaults to None (automatic batch size optimization that dynamically
@@ -550,7 +596,7 @@ def parse_udf(
  example_table_name: str | None = None,
  example_field_name: str | None = None,
  max_examples: int = 100,
- model_name: str = CONTAINER.resolve(ResponsesModelName).value,
+ model_name: str | None = None,
  batch_size: int | None = None,
  max_concurrency: int = 8,
  **api_kwargs,
@@ -574,8 +620,9 @@ def parse_udf(
  If provided, `example_table_name` must also be specified.
  max_examples (int): Maximum number of examples to retrieve for schema inference.
  Defaults to 100.
- model_name (str): For Azure OpenAI, use your deployment name (e.g., "my-gpt4-deployment").
- For OpenAI, use the model name (e.g., "gpt-4.1-mini"). Defaults to configured model in DI container.
+ model_name (str | None): For Azure OpenAI, use your deployment name (e.g., "my-gpt4-deployment").
+ For OpenAI, use the model name (e.g., "gpt-4.1-mini"). Defaults to configured model in DI container
+ via ResponsesModelName if not provided.
  batch_size (int | None): Number of rows per async batch request within each partition.
  Larger values reduce API call overhead but increase memory usage.
  Defaults to None (automatic batch size optimization that dynamically
@@ -622,7 +669,7 @@ def parse_udf(
 
 
  def embeddings_udf(
- model_name: str = CONTAINER.resolve(EmbeddingsModelName).value,
+ model_name: str | None = None,
  batch_size: int | None = None,
  max_concurrency: int = 8,
  **api_kwargs,
@@ -648,9 +695,9 @@ def embeddings_udf(
  sc.environment["AZURE_OPENAI_API_VERSION"] = "preview"
 
  Args:
- model_name (str): For Azure OpenAI, use your deployment name (e.g., "my-embedding-deployment").
+ model_name (str | None): For Azure OpenAI, use your deployment name (e.g., "my-embedding-deployment").
  For OpenAI, use the model name (e.g., "text-embedding-3-small").
- Defaults to configured model in DI container.
+ Defaults to configured model in DI container via EmbeddingsModelName if not provided.
  batch_size (int | None): Number of rows per async batch request within each partition.
  Larger values reduce API call overhead but increase memory usage.
  Defaults to None (automatic batch size optimization that dynamically
@@ -678,9 +725,11 @@ def embeddings_udf(
  - Use larger batch_size for embeddings compared to response generation
  """
 
+ _model_name = model_name or CONTAINER.resolve(EmbeddingsModelName).value
+
  @pandas_udf(returnType=ArrayType(FloatType())) # type: ignore[call-overload,misc]
  def _embeddings_udf(col: Iterator[pd.Series]) -> Iterator[pd.Series]:
- pandas_ext.embeddings_model(model_name)
+ pandas_ext.set_embeddings_model(_model_name)
  cache = AsyncBatchingMapProxy[str, np.ndarray](
  batch_size=batch_size,
  max_concurrency=max_concurrency,
@@ -15,10 +15,10 @@ class TestPandasExt:
  @pytest.fixture(autouse=True)
  def setup_pandas_ext(self, openai_client, async_openai_client, responses_model_name, embeddings_model_name):
  """Setup pandas_ext with test clients and models."""
- pandas_ext.use(openai_client)
- pandas_ext.use_async(async_openai_client)
- pandas_ext.responses_model(responses_model_name)
- pandas_ext.embeddings_model(embeddings_model_name)
+ pandas_ext.set_client(openai_client)
+ pandas_ext.set_async_client(async_openai_client)
+ pandas_ext.set_responses_model(responses_model_name)
+ pandas_ext.set_embeddings_model(embeddings_model_name)
  yield
 
  # ===== BASIC SERIES METHODS =====
@@ -744,18 +744,31 @@ class TestPandasExt:
 
  # ===== CONFIGURATION & PARAMETER TESTS =====
 
- def test_configuration_methods(self):
- """Test configuration methods use, use_async, responses_model, embeddings_model."""
+ def test_configuration_methods(self, openai_client, async_openai_client):
+ """Test configuration helpers for clients and model names."""
  # Test that configuration methods exist and are callable
- assert callable(pandas_ext.use)
- assert callable(pandas_ext.use_async)
- assert callable(pandas_ext.responses_model)
- assert callable(pandas_ext.embeddings_model)
+ assert callable(pandas_ext.set_client)
+ assert callable(pandas_ext.get_client)
+ assert callable(pandas_ext.set_async_client)
+ assert callable(pandas_ext.get_async_client)
+ assert callable(pandas_ext.set_responses_model)
+ assert callable(pandas_ext.get_responses_model)
+ assert callable(pandas_ext.set_embeddings_model)
+ assert callable(pandas_ext.get_embeddings_model)
 
  # Test model configuration
  try:
- pandas_ext.responses_model("gpt-4.1-mini")
- pandas_ext.embeddings_model("text-embedding-3-small")
+ pandas_ext.set_client(openai_client)
+ assert pandas_ext.get_client() is openai_client
+
+ pandas_ext.set_async_client(async_openai_client)
+ assert pandas_ext.get_async_client() is async_openai_client
+
+ pandas_ext.set_responses_model("gpt-4.1-mini")
+ assert pandas_ext.get_responses_model() == "gpt-4.1-mini"
+
+ pandas_ext.set_embeddings_model("text-embedding-3-small")
+ assert pandas_ext.get_embeddings_model() == "text-embedding-3-small"
  except Exception as e:
  pytest.fail(f"Model configuration failed unexpectedly: {e}")
 
@@ -381,8 +381,8 @@ class TestAzureV1ApiWarning:
  else:
  assert len(w) == 0, f"Unexpected warning for URL: {legacy_url}"
 
- def test_pandas_ext_use_azure_warning(self):
- """Test that pandas_ext.use() shows warning for legacy Azure URLs."""
+ def test_pandas_ext_set_client_azure_warning(self):
+ """Test that pandas_ext.set_client() shows warning for legacy Azure URLs."""
  from openai import AzureOpenAI
 
  from openaivec import pandas_ext
@@ -394,7 +394,7 @@ class TestAzureV1ApiWarning:
 
  with warnings.catch_warnings(record=True) as w:
  warnings.simplefilter("always")
- pandas_ext.use(legacy_client)
+ pandas_ext.set_client(legacy_client)
  assert len(w) > 0, "Expected warning for legacy Azure URL"
  assert "v1 API is recommended" in str(w[0].message)