openaivec 0.14.0 (tar.gz) → 0.14.2 (tar.gz): PyPI version diff (Mend)



Files changed (97)

{openaivec-0.14.0 → openaivec-0.14.2}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: openaivec
-Version: 0.14.0
+Version: 0.14.2
 Summary: Generative mutation for tabular calculation
 Project-URL: Homepage, https://microsoft.github.io/openaivec/
 Project-URL: Repository, https://github.com/microsoft/openaivec
@@ -98,7 +98,7 @@ survey_responses.assign(
 ).ai.extract("structured")  # Auto-expands to columns
 ```
-📓 **[See more examples →](https://microsoft.github.io/openaivec/examples/)**
+📓 **[See more examples →](https://microsoft.github.io/openaivec/examples/pandas/)**
 # Overview
@@ -746,7 +746,7 @@ uv run ruff check . --fix
 📓 **[Survey data transformation →](https://microsoft.github.io/openaivec/examples/survey_transformation/)** - Unstructured to structured data
 📓 **[Asynchronous processing examples →](https://microsoft.github.io/openaivec/examples/aio/)** - High-performance async workflows
 📓 **[Auto-generate FAQs from documents →](https://microsoft.github.io/openaivec/examples/generate_faq/)** - Create FAQs using AI
-📓 **[All examples →](https://microsoft.github.io/openaivec/examples/)** - Complete collection of tutorials and use cases
+📓 **[All examples →](https://microsoft.github.io/openaivec/examples/pandas/)** - Complete collection of tutorials and use cases
 ## Community

{openaivec-0.14.0 → openaivec-0.14.2}/README.md RENAMED

@@ -72,7 +72,7 @@ survey_responses.assign(
 ).ai.extract("structured")  # Auto-expands to columns
 ```
-📓 **[See more examples →](https://microsoft.github.io/openaivec/examples/)**
+📓 **[See more examples →](https://microsoft.github.io/openaivec/examples/pandas/)**
 # Overview
@@ -720,7 +720,7 @@ uv run ruff check . --fix
 📓 **[Survey data transformation →](https://microsoft.github.io/openaivec/examples/survey_transformation/)** - Unstructured to structured data
 📓 **[Asynchronous processing examples →](https://microsoft.github.io/openaivec/examples/aio/)** - High-performance async workflows
 📓 **[Auto-generate FAQs from documents →](https://microsoft.github.io/openaivec/examples/generate_faq/)** - Create FAQs using AI
-📓 **[All examples →](https://microsoft.github.io/openaivec/examples/)** - Complete collection of tutorials and use cases
+📓 **[All examples →](https://microsoft.github.io/openaivec/examples/pandas/)** - Complete collection of tutorials and use cases
 ## Community

openaivec-0.14.2/docs/api/main.md ADDED

@@ -0,0 +1,19 @@
+# Main Package API
+The main `openaivec` package provides the core classes for AI-powered data processing.
+## Core Classes
+All core functionality is accessible through the main package imports:
+::: openaivec.BatchResponses
+::: openaivec.AsyncBatchResponses
+::: openaivec.BatchEmbeddings
+::: openaivec.AsyncBatchEmbeddings
+## Prompt Building
+::: openaivec.FewShotPromptBuilder

openaivec-0.14.2/docs/api/pandas_ext.md ADDED

@@ -0,0 +1,3 @@
+# Pandas Extension
+::: openaivec.pandas_ext

openaivec-0.14.2/docs/api/spark.md ADDED

@@ -0,0 +1,3 @@
+# Spark Extension
+::: openaivec.spark

openaivec-0.14.2/docs/api/task.md ADDED

@@ -0,0 +1,3 @@
+# Task Module
+::: openaivec.task

openaivec-0.14.2/docs/api/tasks/customer_support/customer_sentiment.md ADDED

@@ -0,0 +1,3 @@
+# Customer Sentiment Analysis
+::: openaivec.task.customer_support.customer_sentiment

openaivec-0.14.2/docs/api/tasks/customer_support/inquiry_classification.md ADDED

@@ -0,0 +1,3 @@
+# Inquiry Classification
+::: openaivec.task.customer_support.inquiry_classification

openaivec-0.14.2/docs/api/tasks/customer_support/inquiry_summary.md ADDED

@@ -0,0 +1,3 @@
+# Inquiry Summary
+::: openaivec.task.customer_support.inquiry_summary

openaivec-0.14.2/docs/api/tasks/customer_support/intent_analysis.md ADDED

@@ -0,0 +1,3 @@
+# Intent Analysis
+::: openaivec.task.customer_support.intent_analysis

openaivec-0.14.2/docs/api/tasks/customer_support/response_suggestion.md ADDED

@@ -0,0 +1,3 @@
+# Response Suggestion
+::: openaivec.task.customer_support.response_suggestion

openaivec-0.14.2/docs/api/tasks/customer_support/urgency_analysis.md ADDED

@@ -0,0 +1,3 @@
+# Urgency Analysis
+::: openaivec.task.customer_support.urgency_analysis

openaivec-0.14.2/docs/api/tasks/nlp/dependency_parsing.md ADDED

@@ -0,0 +1,3 @@
+# Dependency Parsing Task
+::: openaivec.task.nlp.dependency_parsing

openaivec-0.14.2/docs/api/tasks/nlp/keyword_extraction.md ADDED

@@ -0,0 +1,3 @@
+# Keyword Extraction Task
+::: openaivec.task.nlp.keyword_extraction

openaivec-0.14.2/docs/api/tasks/nlp/morphological_analysis.md ADDED

@@ -0,0 +1,3 @@
+# Morphological Analysis Task
+::: openaivec.task.nlp.morphological_analysis

openaivec-0.14.2/docs/api/tasks/nlp/named_entity_recognition.md ADDED

@@ -0,0 +1,3 @@
+# Named Entity Recognition Task
+::: openaivec.task.nlp.named_entity_recognition

openaivec-0.14.2/docs/api/tasks/nlp/sentiment_analysis.md ADDED

@@ -0,0 +1,3 @@
+# Sentiment Analysis Task
+::: openaivec.task.nlp.sentiment_analysis

openaivec-0.14.2/docs/api/tasks/nlp/translation.md ADDED

@@ -0,0 +1,3 @@
+# Translation Task
+::: openaivec.task.nlp.translation

{openaivec-0.14.0 → openaivec-0.14.2}/docs/index.md RENAMED

@@ -67,12 +67,10 @@ Get started with these comprehensive examples:
 Detailed documentation for all components:
+🔗 **[Main Package](api/main.md)** - Core classes (BatchResponses, BatchEmbeddings, FewShotPromptBuilder)
 🔗 **[pandas_ext](api/pandas_ext.md)** - Pandas Series and DataFrame extensions
 🔗 **[spark](api/spark.md)** - Apache Spark UDF builders
-🔗 **[responses](api/responses.md)** - Batch response processing
-🔗 **[embeddings](api/embeddings.md)** - Batch embedding generation
-🔗 **[prompt](api/prompt.md)** - Few-shot prompt building
-🔗 **[util](api/util.md)** - Utility functions and helpers
+🔗 **[task](api/task.md)** - Pre-built task modules for NLP and customer support
 ## Quick Start

{openaivec-0.14.0 → openaivec-0.14.2}/mkdocs.yml RENAMED

@@ -131,8 +131,25 @@ plugins:
         python:
           paths:
             - src
-          docstring_style: google
-          show_submodules: true
+          options:
+            docstring_style: google
+            show_submodules: true
+            show_source: true
+            show_root_heading: true
+            show_root_toc_entry: true
+            heading_level: 2
+            members_order: source
+            show_signature_annotations: true
+            separate_signature: true
+            show_bases: true
+            show_docstring_parameters: true
+            show_docstring_returns: true
+            show_docstring_examples: true
+            show_category_heading: true
+            group_by_category: true
+            show_if_no_docstring: false
+            inherited_members: false
+            merge_init_into_class: true
 markdown_extensions:
   - abbr

openaivec-0.14.2/src/openaivec/_serialize.py ADDED

@@ -0,0 +1,230 @@
+"""Refactored serialization utilities for Pydantic BaseModel classes.
+This module provides utilities for converting Pydantic BaseModel classes
+to and from JSON schema representations with simplified, maintainable code.
+"""
+from typing import Any, Dict, List, Literal, Tuple, Type, Union
+from pydantic import BaseModel, Field, create_model
+__all__ = []
+def serialize_base_model(obj: Type[BaseModel]) -> Dict[str, Any]:
+    """Serialize a Pydantic BaseModel to JSON schema."""
+    return obj.model_json_schema()
+def dereference_json_schema(json_schema: Dict[str, Any]) -> Dict[str, Any]:
+    """Dereference JSON schema by resolving $ref pointers with circular reference protection."""
+    model_map = json_schema.get("$defs", {})
+    def dereference(obj, current_path=None):
+        if current_path is None:
+            current_path = []
+        if isinstance(obj, dict):
+            if "$ref" in obj:
+                ref = obj["$ref"].split("/")[-1]
+                # Check for circular reference
+                if ref in current_path:
+                    # Return a placeholder to break the cycle
+                    return {"type": "object", "description": f"Circular reference to {ref}"}
+                if ref in model_map:
+                    # Add to path and recurse
+                    new_path = current_path + [ref]
+                    return dereference(model_map[ref], new_path)
+                else:
+                    # Invalid reference, return placeholder
+                    return {"type": "object", "description": f"Invalid reference to {ref}"}
+            else:
+                return {k: dereference(v, current_path) for k, v in obj.items()}
+        elif isinstance(obj, list):
+            return [dereference(x, current_path) for x in obj]
+        else:
+            return obj
+    result = {}
+    for k, v in json_schema.items():
+        if k == "$defs":
+            continue
+        result[k] = dereference(v)
+    return result
+# ============================================================================
+# Type Resolution - Separated into focused functions
+# ============================================================================
+def _resolve_union_type(union_options: List[Dict[str, Any]]) -> Type:
+    """Resolve anyOf/oneOf to Union type."""
+    union_types = []
+    for option in union_options:
+        if option.get("type") == "null":
+            union_types.append(type(None))
+        else:
+            union_types.append(parse_field(option))
+    if len(union_types) == 1:
+        return union_types[0]
+    elif len(union_types) == 2 and type(None) in union_types:
+        # Optional type: T | None
+        non_none_type = next(t for t in union_types if t is not type(None))
+        return Union[non_none_type, type(None)]  # type: ignore[return-value]
+    else:
+        return Union[tuple(union_types)]  # type: ignore[return-value]
+def _resolve_basic_type(type_name: str, field_def: Dict[str, Any]) -> Type:
+    """Resolve basic JSON schema types to Python types."""
+    type_mapping = {
+        "string": str,
+        "integer": int,
+        "number": float,
+        "boolean": bool,
+        "null": type(None),
+    }
+    if type_name in type_mapping:
+        return type_mapping[type_name]  # type: ignore[return-value]
+    elif type_name == "object":
+        # Check if it's a nested model or generic dict
+        if "properties" in field_def:
+            return deserialize_base_model(field_def)
+        else:
+            return dict
+    elif type_name == "array":
+        if "items" in field_def:
+            inner_type = parse_field(field_def["items"])
+            return List[inner_type]
+        else:
+            return List[Any]
+    else:
+        raise ValueError(f"Unsupported type: {type_name}")
+def parse_field(field_def: Dict[str, Any]) -> Type:
+    """Parse a JSON schema field definition to a Python type.
+    Simplified version with clear separation of concerns.
+    """
+    # Handle union types
+    if "anyOf" in field_def:
+        return _resolve_union_type(field_def["anyOf"])
+    if "oneOf" in field_def:
+        return _resolve_union_type(field_def["oneOf"])
+    # Handle basic types
+    if "type" not in field_def:
+        return Any  # type: ignore[return-value]
+    return _resolve_basic_type(field_def["type"], field_def)
+# ============================================================================
+# Field Information Creation - Centralized logic
+# ============================================================================
+def _create_field_info(description: str | None, default_value: Any, is_required: bool) -> Field:  # type: ignore[type-arg]
+    """Create Field info with consistent logic."""
+    if is_required and default_value is None:
+        # Required field without default
+        return Field(description=description) if description else Field()
+    else:
+        # Optional field or field with default
+        return Field(default=default_value, description=description) if description else Field(default=default_value)
+def _make_optional_if_needed(field_type: Type, is_required: bool, has_default: bool) -> Type:
+    """Make field type optional if needed."""
+    if is_required or has_default:
+        return field_type
+    # Check if already nullable
+    if hasattr(field_type, "__origin__") and field_type.__origin__ is Union and type(None) in field_type.__args__:
+        return field_type
+    # Make optional
+    return Union[field_type, type(None)]  # type: ignore[return-value]
+# ============================================================================
+# Field Processing - Separated enum and regular field logic
+# ============================================================================
+def _process_enum_field(field_name: str, field_def: Dict[str, Any], is_required: bool) -> Tuple[Type, Field]:  # type: ignore[type-arg]
+    """Process enum field with Literal type."""
+    enum_values = field_def["enum"]
+    # Create Literal type
+    if len(enum_values) == 1:
+        literal_type = Literal[enum_values[0]]
+    else:
+        literal_type = Literal[tuple(enum_values)]
+    # Handle optionality
+    description = field_def.get("description")
+    default_value = field_def.get("default")
+    has_default = default_value is not None
+    if not is_required and not has_default:
+        literal_type = Union[literal_type, type(None)]  # type: ignore[assignment]
+        default_value = None
+    field_info = _create_field_info(description, default_value, is_required)
+    return literal_type, field_info  # type: ignore[return-value]
+def _process_regular_field(field_name: str, field_def: Dict[str, Any], is_required: bool) -> Tuple[Type, Field]:  # type: ignore[type-arg]
+    """Process regular (non-enum) field."""
+    field_type = parse_field(field_def)
+    description = field_def.get("description")
+    default_value = field_def.get("default")
+    has_default = default_value is not None
+    # Handle optionality
+    field_type = _make_optional_if_needed(field_type, is_required, has_default)
+    if not is_required and not has_default:
+        default_value = None
+    field_info = _create_field_info(description, default_value, is_required)
+    return field_type, field_info
+# ============================================================================
+# Main Schema Processing - Clean and focused
+# ============================================================================
+def deserialize_base_model(json_schema: Dict[str, Any]) -> Type[BaseModel]:
+    """Deserialize a JSON schema to a Pydantic BaseModel class.
+    Refactored version with clear separation of concerns and simplified logic.
+    """
+    # Basic setup
+    title = json_schema.get("title", "DynamicModel")
+    dereferenced_schema = dereference_json_schema(json_schema)
+    properties = dereferenced_schema.get("properties", {})
+    required_fields = set(dereferenced_schema.get("required", []))
+    # Process each field
+    fields = {}
+    for field_name, field_def in properties.items():
+        is_required = field_name in required_fields
+        if "enum" in field_def:
+            field_type, field_info = _process_enum_field(field_name, field_def, is_required)
+        else:
+            field_type, field_info = _process_regular_field(field_name, field_def, is_required)
+        fields[field_name] = (field_type, field_info)
+    return create_model(title, **fields)
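
Note on `_serialize.py`: the `dereference_json_schema` function added above inlines `$ref` pointers from the `$defs` table and replaces a self-referencing definition with a placeholder instead of recursing forever. The following is a minimal standard-library sketch of that idea, not the package's code: the function name `dereference_schema` and the `Chain`/`Node` schema are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List


def dereference_schema(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Inline every "$ref" using the schema's "$defs" table (illustrative sketch)."""
    defs = schema.get("$defs", {})

    def walk(node: Any, path: List[str]) -> Any:
        if isinstance(node, dict):
            if "$ref" in node:
                # Only the last path segment is used, e.g. "#/$defs/Node" -> "Node".
                name = node["$ref"].split("/")[-1]
                if name in path:
                    # This definition is already being expanded: break the cycle.
                    return {"type": "object", "description": f"Circular reference to {name}"}
                if name not in defs:
                    return {"type": "object", "description": f"Invalid reference to {name}"}
                return walk(defs[name], path + [name])
            return {key: walk(value, path) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item, path) for item in node]
        return node

    # The "$defs" table itself is dropped from the result once it has been inlined.
    return {key: walk(value, []) for key, value in schema.items() if key != "$defs"}


# Hypothetical schema: "Node" refers to itself through its "next" property.
schema = {
    "title": "Chain",
    "type": "object",
    "properties": {"head": {"$ref": "#/$defs/Node"}},
    "$defs": {
        "Node": {
            "type": "object",
            "properties": {
                "value": {"type": "integer"},
                "next": {"$ref": "#/$defs/Node"},
            },
        }
    },
}

flat = dereference_schema(schema)
head = flat["properties"]["head"]
print(head["properties"]["value"])  # {'type': 'integer'}
print(head["properties"]["next"]["description"])  # Circular reference to Node
```

Tracking the path of definitions currently being expanded, rather than a global "seen" set, lets the same definition be inlined in several sibling positions while still cutting off true cycles.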

{openaivec-0.14.0 → openaivec-0.14.2}/src/openaivec/task/table/fillna.py RENAMED

@@ -79,7 +79,7 @@ __all__ = ["fillna", "FillNaResponse"]
 def get_examples(df: pd.DataFrame, target_column_name: str, max_examples: int) -> List[Dict]:
     examples: List[Dict] = []
-    samples: pd.DataFrame = df.sample(frac=1)
+    samples: pd.DataFrame = df.sample(frac=1).reset_index(drop=True).drop_duplicates()
     samples = samples.dropna(subset=[target_column_name])
     for i, row in samples.head(max_examples).iterrows():
@@ -109,7 +109,7 @@ def get_instructions(df: pd.DataFrame, target_column_name: str, max_examples: in
             output_value=json.dumps({"index": row["index"], "output": row["output"]}, ensure_ascii=False),
         )
-    return builder.build()
+    return builder.improve().build()
 class FillNaResponse(BaseModel):
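
Note on `fillna.py`: the changed line in `get_examples` now shuffles the frame, discards the original index, and drops exact duplicate rows before rows with a missing target are filtered out. Presumably this keeps repeated rows from taking up several of the `max_examples` slots. The sketch below reproduces only that pandas chain on made-up data; the column names and the fixed `random_state` are assumptions for reproducibility and are not taken from the package.

```python
import pandas as pd

# Hypothetical data: two pairs of exact duplicate rows and one row with a missing target.
df = pd.DataFrame(
    {
        "name": ["apple", "apple", "carrot", "kiwi", "carrot"],
        "category": ["fruit", "fruit", "vegetable", None, "vegetable"],
    }
)

# Shuffle all rows, discard the original index, then keep one copy of each distinct row.
# random_state is fixed here only so the sketch is reproducible.
samples = df.sample(frac=1, random_state=0).reset_index(drop=True).drop_duplicates()

# Rows without a target value cannot serve as few-shot examples.
samples = samples.dropna(subset=["category"])

print(len(samples))  # 2
print(sorted(samples["name"]))  # ['apple', 'carrot']
```

`drop_duplicates()` compares column values only, so the index reset does not change which rows count as duplicates; it only replaces the original row labels with the shuffled positions.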
