unique_toolkit 1.28.8__py3-none-any.whl → 1.33.3__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- unique_toolkit/__init__.py +12 -6
- unique_toolkit/_common/docx_generator/service.py +8 -32
- unique_toolkit/_common/utils/jinja/helpers.py +10 -0
- unique_toolkit/_common/utils/jinja/render.py +18 -0
- unique_toolkit/_common/utils/jinja/schema.py +65 -0
- unique_toolkit/_common/utils/jinja/utils.py +80 -0
- unique_toolkit/agentic/message_log_manager/service.py +9 -0
- unique_toolkit/agentic/tools/a2a/postprocessing/_display_utils.py +58 -3
- unique_toolkit/agentic/tools/a2a/postprocessing/_ref_utils.py +11 -0
- unique_toolkit/agentic/tools/a2a/postprocessing/config.py +33 -0
- unique_toolkit/agentic/tools/a2a/postprocessing/display.py +99 -15
- unique_toolkit/agentic/tools/a2a/postprocessing/test/test_display.py +421 -0
- unique_toolkit/agentic/tools/a2a/postprocessing/test/test_display_utils.py +768 -0
- unique_toolkit/agentic/tools/a2a/tool/config.py +77 -1
- unique_toolkit/agentic/tools/a2a/tool/service.py +67 -3
- unique_toolkit/agentic/tools/config.py +5 -45
- unique_toolkit/agentic/tools/openai_builtin/base.py +4 -0
- unique_toolkit/agentic/tools/openai_builtin/code_interpreter/service.py +4 -0
- unique_toolkit/agentic/tools/tool_manager.py +16 -19
- unique_toolkit/app/__init__.py +3 -0
- unique_toolkit/app/fast_api_factory.py +131 -0
- unique_toolkit/app/webhook.py +77 -0
- unique_toolkit/chat/functions.py +1 -1
- unique_toolkit/content/functions.py +4 -4
- unique_toolkit/content/service.py +1 -1
- unique_toolkit/data_extraction/README.md +96 -0
- unique_toolkit/data_extraction/__init__.py +11 -0
- unique_toolkit/data_extraction/augmented/__init__.py +5 -0
- unique_toolkit/data_extraction/augmented/service.py +93 -0
- unique_toolkit/data_extraction/base.py +25 -0
- unique_toolkit/data_extraction/basic/__init__.py +11 -0
- unique_toolkit/data_extraction/basic/config.py +18 -0
- unique_toolkit/data_extraction/basic/prompt.py +13 -0
- unique_toolkit/data_extraction/basic/service.py +55 -0
- unique_toolkit/embedding/service.py +1 -1
- unique_toolkit/framework_utilities/langchain/__init__.py +10 -0
- unique_toolkit/framework_utilities/openai/client.py +2 -1
- unique_toolkit/language_model/infos.py +22 -1
- unique_toolkit/services/knowledge_base.py +4 -6
- {unique_toolkit-1.28.8.dist-info → unique_toolkit-1.33.3.dist-info}/METADATA +51 -2
- {unique_toolkit-1.28.8.dist-info → unique_toolkit-1.33.3.dist-info}/RECORD +43 -27
- unique_toolkit/agentic/tools/test/test_tool_manager.py +0 -1686
- {unique_toolkit-1.28.8.dist-info → unique_toolkit-1.33.3.dist-info}/LICENSE +0 -0
- {unique_toolkit-1.28.8.dist-info → unique_toolkit-1.33.3.dist-info}/WHEEL +0 -0
unique_toolkit/data_extraction/README.md
@@ -0,0 +1,96 @@
+# Data Extraction Module
+
+This module provides a flexible framework for extracting structured data from text using language models. It supports both basic and augmented data extraction.
+
+## Overview
+
+The module consists of two main components:
+
+1. **Basic Data Extraction**: Uses language models to extract structured data from text based on a provided schema.
+2. **Augmented Data Extraction**: Extends basic extraction by adding extra fields to the output schema while maintaining the original data structure.
+
+## Components
+
+### Base Classes
+
+- `BaseDataExtractor`: Abstract base class that defines the interface for data extraction
+- `BaseDataExtractionResult`: Generic base class for extraction results
+
+### Basic Extraction
+
+- `StructuredOutputDataExtractor`: Implements basic data extraction using language models
+- `StructuredOutputDataExtractorConfig`: Configuration for the basic extractor
+
+### Augmented Extraction
+
+- `AugmentedDataExtractor`: Extends basic extraction with additional fields
+- `AugmentedDataExtractionResult`: Result type for augmented extraction
+
+## Usage Examples
+
+### Basic Data Extraction
+
+```python
+from pydantic import BaseModel
+from unique_toolkit.data_extraction import StructuredOutputDataExtractor, StructuredOutputDataExtractorConfig
+from unique_toolkit import LanguageModelService
+
+# Define your schema
+class PersonInfo(BaseModel):
+    name: str
+    age: int
+    occupation: str
+
+# Create the extractor
+config = StructuredOutputDataExtractorConfig()
+lm_service = LanguageModelService()  # Configure as needed
+extractor = StructuredOutputDataExtractor(config, lm_service)
+
+# Extract data
+text = "John is 30 years old and works as a software engineer."
+result = await extractor.extract_data_from_text(text, PersonInfo)
+print(result.data)  # PersonInfo(name="John", age=30, occupation="software engineer")
+```
+
+### Augmented Data Extraction
+
+```python
+from pydantic import BaseModel, Field
+from unique_toolkit.data_extraction import AugmentedDataExtractor, StructuredOutputDataExtractor
+
+# Define your base schema
+class PersonInfo(BaseModel):
+    name: str
+    age: int
+
+# Create base extractor
+base_extractor = StructuredOutputDataExtractor(...)
+
+# Create augmented extractor with confidence scores
+augmented_extractor = AugmentedDataExtractor(
+    base_extractor,
+    confidence=float,
+    source=(str, Field(description="Source of the information")),
+)
+
+# Extract data
+text = "John is 30 years old."
+result = await augmented_extractor.extract_data_from_text(text, PersonInfo)
+print(result.data)  # Original PersonInfo
+print(result.augmented_data)  # Contains additional fields
+```
+
+## Configuration
+
+The `StructuredOutputDataExtractorConfig` allows customization of:
+
+- Language model selection
+- System and user prompt templates
+- Schema enforcement settings
+
+## Best Practices
+
+1. Always define clear Pydantic models for your extraction schemas
+2. Use augmented extraction when you need additional metadata
+3. Consider using strict mode for augmented extraction when you want to enforce schema compliance
+4. Customize prompts for better extraction results in specific domains
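
The Configuration section above lists the tunable fields without showing them in use. A minimal sketch, assuming the field names from `basic/config.py` later in this diff and illustrative override values:

```python
# Hedged sketch: customizing the extractor config. Field names match
# basic/config.py in this diff; the override values are illustrative only.
from unique_toolkit.data_extraction import StructuredOutputDataExtractorConfig

config = StructuredOutputDataExtractorConfig(
    structured_output_enforce_schema=True,  # enforce the schema on structured output
    system_prompt_template="You extract invoice fields from OCR text.",
    user_prompt_template="Extract data from:\n{{ text }}",
)
```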
unique_toolkit/data_extraction/__init__.py
@@ -0,0 +1,11 @@
+from unique_toolkit.data_extraction.augmented import AugmentedDataExtractor
+from unique_toolkit.data_extraction.basic import (
+    StructuredOutputDataExtractor,
+    StructuredOutputDataExtractorConfig,
+)
+
+__all__ = [
+    "StructuredOutputDataExtractor",
+    "StructuredOutputDataExtractorConfig",
+    "AugmentedDataExtractor",
+]
unique_toolkit/data_extraction/augmented/service.py
@@ -0,0 +1,93 @@
+from typing import Any
+from pydantic import BaseModel, create_model
+from pydantic.alias_generators import to_pascal
+from pydantic.fields import FieldInfo
+from typing_extensions import override
+
+from unique_toolkit.data_extraction.base import (
+    BaseDataExtractionResult,
+    BaseDataExtractor,
+    ExtractionSchema,
+)
+
+
+def _build_augmented_model_for_field(
+    field_name: str,
+    field_type: Any | tuple[Any, FieldInfo],
+    strict: bool = False,
+    **extra_fields: Any | tuple[Any, FieldInfo],
+) -> type[BaseModel]:
+    camelized_field_name = to_pascal(field_name)
+
+    fields = {
+        **extra_fields,
+        field_name: field_type,
+    }
+
+    return create_model(
+        f"{camelized_field_name}Value",
+        **fields,  # type: ignore
+        __config__={"extra": "forbid" if strict else "ignore"},
+    )
+
+
+class AugmentedDataExtractionResult(BaseDataExtractionResult[ExtractionSchema]):
+    """
+    Result of data extraction from text using an augmented schema.
+    """
+
+    augmented_data: BaseModel
+
+
+class AugmentedDataExtractor(BaseDataExtractor):
+    def __init__(
+        self,
+        base_data_extractor: BaseDataExtractor,
+        strict: bool = False,
+        **extra_fields: Any | tuple[Any, FieldInfo],
+    ):
+        self._base_data_extractor = base_data_extractor
+        self._extra_fields = extra_fields
+        self._strict = strict
+
+    def _prepare_schema(self, schema: type[ExtractionSchema]) -> type[BaseModel]:
+        fields = {}
+
+        for field_name, field_type in schema.model_fields.items():
+            wrapped_field = _build_augmented_model_for_field(
+                field_name,
+                (field_type.annotation, field_type),
+                strict=self._strict,
+                **self._extra_fields,
+            )
+            fields[field_name] = wrapped_field
+
+        return create_model(
+            schema.__name__,
+            **fields,
+            __config__={"extra": "forbid" if self._strict else "ignore"},
+            __doc__=schema.__doc__,
+        )
+
+    def _extract_output(
+        self, llm_output: BaseModel, schema: type[ExtractionSchema]
+    ) -> ExtractionSchema:
+        output_data = {
+            field_name: getattr(value, field_name) for field_name, value in llm_output
+        }
+        return schema.model_validate(output_data)
+
+    @override
+    async def extract_data_from_text(
+        self, text: str, schema: type[ExtractionSchema]
+    ) -> AugmentedDataExtractionResult[ExtractionSchema]:
+        model_with_extra_fields = self._prepare_schema(schema)
+        augmented_data = (
+            await self._base_data_extractor.extract_data_from_text(
+                text, model_with_extra_fields
+            )
+        ).data
+        return AugmentedDataExtractionResult(
+            data=self._extract_output(augmented_data, schema),
+            augmented_data=augmented_data,
+        )
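
To make the per-field wrapping in `_prepare_schema` concrete, here is a standalone illustration, using plain pydantic, of the intermediate model the extractor builds when a `confidence` extra field is requested; all names below are illustrative:

```python
# Standalone sketch of the schema augmentation above, using pydantic only.
from pydantic import BaseModel, create_model


class PersonInfo(BaseModel):
    name: str


# Each original field becomes a "<FieldName>Value" wrapper that carries the
# extra fields next to the original value.
NameValue = create_model("NameValue", confidence=(float, ...), name=(str, ...))
AugmentedPersonInfo = create_model("PersonInfo", name=(NameValue, ...))

augmented = AugmentedPersonInfo.model_validate(
    {"name": {"name": "John", "confidence": 0.9}}
)
# Unwrapping mirrors _extract_output: pull each original field back out.
original = PersonInfo.model_validate(
    {field: getattr(value, field) for field, value in augmented}
)
print(original)  # name='John'
```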
unique_toolkit/data_extraction/base.py
@@ -0,0 +1,25 @@
+from abc import ABC, abstractmethod
+from typing import Generic, TypeVar
+
+from pydantic import BaseModel
+
+ExtractionSchema = TypeVar("ExtractionSchema", bound=BaseModel)
+
+
+class BaseDataExtractionResult(BaseModel, Generic[ExtractionSchema]):
+    """
+    Base class for data extraction results.
+    """
+
+    data: ExtractionSchema
+
+
+class BaseDataExtractor(ABC):
+    """
+    Extract structured data from text.
+    """
+
+    @abstractmethod
+    async def extract_data_from_text(
+        self, text: str, schema: type[ExtractionSchema]
+    ) -> BaseDataExtractionResult[ExtractionSchema]: ...
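
The interface is small enough to demonstrate with a stub. A hypothetical no-op implementation, shown only to illustrate the contract (it is not part of the package):

```python
# Hypothetical stub implementation of BaseDataExtractor; illustration only.
from unique_toolkit.data_extraction.base import (
    BaseDataExtractionResult,
    BaseDataExtractor,
    ExtractionSchema,
)


class DefaultsExtractor(BaseDataExtractor):
    """Returns the schema's default values instead of consulting a model."""

    async def extract_data_from_text(
        self, text: str, schema: type[ExtractionSchema]
    ) -> BaseDataExtractionResult[ExtractionSchema]:
        # Works only for schemas whose fields all have defaults.
        return BaseDataExtractionResult(data=schema())
```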
unique_toolkit/data_extraction/basic/__init__.py
@@ -0,0 +1,11 @@
+from unique_toolkit.data_extraction.basic.config import (
+    StructuredOutputDataExtractorConfig,
+)
+from unique_toolkit.data_extraction.basic.service import (
+    StructuredOutputDataExtractor,
+)
+
+__all__ = [
+    "StructuredOutputDataExtractorConfig",
+    "StructuredOutputDataExtractor",
+]
unique_toolkit/data_extraction/basic/config.py
@@ -0,0 +1,18 @@
+from pydantic import BaseModel
+
+from unique_toolkit._common.pydantic_helpers import get_configuration_dict
+from unique_toolkit._common.validators import LMI, get_LMI_default_field
+from unique_toolkit.data_extraction.basic.prompt import (
+    DEFAULT_DATA_EXTRACTION_SYSTEM_PROMPT,
+    DEFAULT_DATA_EXTRACTION_USER_PROMPT,
+)
+from unique_toolkit.language_model.default_language_model import DEFAULT_GPT_4o
+
+
+class StructuredOutputDataExtractorConfig(BaseModel):
+    model_config = get_configuration_dict()
+
+    language_model: LMI = get_LMI_default_field(DEFAULT_GPT_4o)
+    structured_output_enforce_schema: bool = False
+    system_prompt_template: str = DEFAULT_DATA_EXTRACTION_SYSTEM_PROMPT
+    user_prompt_template: str = DEFAULT_DATA_EXTRACTION_USER_PROMPT
unique_toolkit/data_extraction/basic/prompt.py
@@ -0,0 +1,13 @@
+DEFAULT_DATA_EXTRACTION_SYSTEM_PROMPT = """
+You are a thorough and accurate expert in data processing.
+
+You will be given some text and an output schema, describing what needs to be extracted from the text.
+You will need to extract the data from the text and return it in the output schema.
+""".strip()
+
+DEFAULT_DATA_EXTRACTION_USER_PROMPT = """
+Here is the text to extract data from:
+{{ text }}
+
+Please thoroughly extract the data from the text and return it in the output schema.
+""".strip()
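
The `{{ text }}` placeholder marks the user prompt as a Jinja template; jinja2 is also added as a dependency in this release's METADATA. A quick sketch of the substitution using plain jinja2 (the toolkit itself goes through its `render_template` helper):

```python
# Sketch: filling the {{ text }} placeholder with jinja2 directly.
from jinja2 import Template

from unique_toolkit.data_extraction.basic.prompt import (
    DEFAULT_DATA_EXTRACTION_USER_PROMPT,
)

rendered = Template(DEFAULT_DATA_EXTRACTION_USER_PROMPT).render(
    text="John is 30 years old."
)
print(rendered)
```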
unique_toolkit/data_extraction/basic/service.py
@@ -0,0 +1,55 @@
+from typing_extensions import override
+
+from unique_toolkit._common.utils.jinja.render import render_template
+from unique_toolkit.data_extraction.base import (
+    BaseDataExtractionResult,
+    BaseDataExtractor,
+    ExtractionSchema,
+)
+from unique_toolkit.data_extraction.basic.config import (
+    StructuredOutputDataExtractorConfig,
+)
+from unique_toolkit.language_model import LanguageModelService
+from unique_toolkit.language_model.builder import MessagesBuilder
+
+
+class StructuredOutputDataExtractor(BaseDataExtractor):
+    """
+    Basic Structured Output Data Extraction.
+    """
+
+    def __init__(
+        self,
+        config: StructuredOutputDataExtractorConfig,
+        language_model_service: LanguageModelService,
+    ):
+        self._config = config
+        self._language_model_service = language_model_service
+
+    @override
+    async def extract_data_from_text(
+        self, text: str, schema: type[ExtractionSchema]
+    ) -> BaseDataExtractionResult[ExtractionSchema]:
+        messages_builder = (
+            MessagesBuilder()
+            .system_message_append(self._config.system_prompt_template)
+            .user_message_append(
+                render_template(
+                    self._config.user_prompt_template,
+                    {
+                        "text": text,
+                    },
+                )
+            )
+        )
+        response = await self._language_model_service.complete_async(
+            messages=messages_builder.build(),
+            model_name=self._config.language_model.name,
+            structured_output_model=schema,
+            temperature=0.0,
+            structured_output_enforce_schema=self._config.structured_output_enforce_schema,
+        )
+
+        return BaseDataExtractionResult(
+            data=schema.model_validate(response.choices[0].message.parsed),
+        )
unique_toolkit/framework_utilities/langchain/__init__.py
@@ -0,0 +1,10 @@
+"""Langchain framework utilities."""
+
+try:
+    from .client import LangchainNotInstalledError, get_langchain_client
+
+    __all__ = ["get_langchain_client", "LangchainNotInstalledError"]
+except Exception:
+    # If langchain is not installed, don't export anything.
+    # This handles both ImportError and LangchainNotInstalledError.
+    __all__ = []
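
Because `__all__` ends up empty when langchain is missing, callers can feature-detect the optional integration instead of wrapping their own try/except. A hedged consumer-side sketch:

```python
# Hedged sketch: feature-detecting the optional langchain integration.
from unique_toolkit.framework_utilities import langchain as lc

if "get_langchain_client" in getattr(lc, "__all__", []):
    client = lc.get_langchain_client()
else:
    client = None  # langchain extras not installed; fall back as needed
```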
unique_toolkit/framework_utilities/openai/client.py
@@ -30,7 +30,8 @@ def get_openai_client(
     """Get an OpenAI client instance.
 
     Args:
-
+        unique_settings (UniqueSettings | None): Optional UniqueSettings instance
+        additional_headers (dict[str, str] | None): Optional additional headers to add to the request
 
     Returns:
         OpenAI client instance
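
The two documented parameters suggest the call shape below. This is a hedged sketch based only on the docstring above; the header value is illustrative:

```python
# Hedged usage sketch for get_openai_client, based on its docstring.
from unique_toolkit.framework_utilities.openai.client import get_openai_client

client = get_openai_client(
    additional_headers={"x-request-source": "docs-example"},  # illustrative
)
```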
unique_toolkit/language_model/infos.py
@@ -47,6 +47,7 @@ class LanguageModelName(StrEnum):
     ANTHROPIC_CLAUDE_SONNET_4_5 = "litellm:anthropic-claude-sonnet-4-5"
     ANTHROPIC_CLAUDE_OPUS_4 = "litellm:anthropic-claude-opus-4"
     ANTHROPIC_CLAUDE_OPUS_4_1 = "litellm:anthropic-claude-opus-4-1"
+    ANTHROPIC_CLAUDE_OPUS_4_5 = "litellm:anthropic-claude-opus-4-5"
     GEMINI_2_0_FLASH = "litellm:gemini-2-0-flash"
     GEMINI_2_5_FLASH = "litellm:gemini-2-5-flash"
     GEMINI_2_5_FLASH_LITE = "litellm:gemini-2-5-flash-lite"
@@ -946,7 +947,7 @@ class LanguageModelInfo(BaseModel):
                         ModelCapabilities.REASONING,
                     ],
                     provider=LanguageModelProvider.LITELLM,
-                    version="claude-opus-4",
+                    version="claude-opus-4-1",
                     encoder_name=EncoderName.O200K_BASE,  # TODO: Update encoder with litellm
                     token_limits=LanguageModelTokenLimits(
                         # Input limit is 200_000, we leave 20_000 tokens as buffer due to tokenizer mismatch
@@ -956,6 +957,26 @@ class LanguageModelInfo(BaseModel):
                     info_cutoff_at=date(2025, 3, 1),
                     published_at=date(2025, 5, 1),
                 )
+            case LanguageModelName.ANTHROPIC_CLAUDE_OPUS_4_5:
+                return cls(
+                    name=model_name,
+                    capabilities=[
+                        ModelCapabilities.FUNCTION_CALLING,
+                        ModelCapabilities.STREAMING,
+                        ModelCapabilities.VISION,
+                        ModelCapabilities.REASONING,
+                    ],
+                    provider=LanguageModelProvider.LITELLM,
+                    version="claude-opus-4-5",
+                    encoder_name=EncoderName.O200K_BASE,  # TODO: Update encoder with litellm
+                    token_limits=LanguageModelTokenLimits(
+                        # Input limit is 200_000, we leave 20_000 tokens as buffer due to tokenizer mismatch
+                        token_limit_input=180_000,
+                        token_limit_output=64_000,
+                    ),
+                    info_cutoff_at=date(2025, 8, 1),
+                    published_at=date(2025, 11, 13),
+                )
             case LanguageModelName.GEMINI_2_0_FLASH:
                 return cls(
                     name=model_name,
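
Since `LanguageModelName` is a `StrEnum`, the new member can be passed anywhere a model-name string is expected:

```python
# The new enum member resolves to its litellm identifier string.
from unique_toolkit.language_model.infos import LanguageModelName

model_name = LanguageModelName.ANTHROPIC_CLAUDE_OPUS_4_5
print(model_name)  # litellm:anthropic-claude-opus-4-5
print(model_name == "litellm:anthropic-claude-opus-4-5")  # True
```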
unique_toolkit/services/knowledge_base.py
@@ -377,7 +377,6 @@ class KnowledgeBaseService:
             mime_type (str): The MIME type of the content.
             scope_id (str | None): The scope ID. Defaults to None.
             skip_ingestion (bool): Whether to skip ingestion. Defaults to False.
-            skip_excel_ingestion (bool): Whether to skip excel ingestion. Defaults to False.
             ingestion_config (unique_sdk.Content.IngestionConfig | None): The ingestion configuration. Defaults to None.
             metadata (dict | None): The metadata to associate with the content. Defaults to None.
 
@@ -449,7 +448,7 @@ class KnowledgeBaseService:
         skip_excel_ingestion: bool = False,
         ingestion_config: unique_sdk.Content.IngestionConfig | None = None,
         metadata: dict[str, Any] | None = None,
-    ):
+    ) -> Content:
         """
         Uploads content to the knowledge base.
 
@@ -487,14 +486,14 @@ class KnowledgeBaseService:
         content_id: str,
         output_dir_path: Path | None = None,
         output_filename: str | None = None,
-    ):
+    ) -> Path:
         """
         Downloads content from a chat and saves it to a file.
 
         Args:
             content_id (str): The ID of the content to download.
-
-
+            output_filename (str | None): The name of the file to save the content as. If not provided, the original filename will be used. Defaults to None.
+            output_dir_path (str | Path | None): The path to the temporary directory where the content will be saved. Defaults to "/tmp".
 
         Returns:
             Path: The path to the downloaded file.
@@ -522,7 +521,6 @@ class KnowledgeBaseService:
 
         Args:
             content_id (str): The id of the uploaded content.
-            chat_id (Optional[str]): The chat_id, defaults to None.
 
         Returns:
             bytes: The downloaded content.
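
The annotations added above (`-> Content`, `-> Path`) make the results directly usable by type checkers. A hypothetical stand-in, since the actual method names are elided from these hunks:

```python
# Hypothetical stand-in showing what the new "-> Path" annotation enables;
# the real KnowledgeBaseService method names are not shown in this diff.
from pathlib import Path


class FakeKnowledgeBase:
    def download(self, content_id: str) -> Path:  # mirrors the annotation
        return Path("/tmp") / f"{content_id}.bin"


path = FakeKnowledgeBase().download("content-123")
print(path.suffix)  # type checkers now know this is a Path
```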
{unique_toolkit-1.28.8.dist-info → unique_toolkit-1.33.3.dist-info}/METADATA
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: unique_toolkit
-Version: 1.28.8
+Version: 1.33.3
 Summary:
 License: Proprietary
 Author: Cedric Klinkert
@@ -11,10 +11,11 @@ Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3.12
 Requires-Dist: docxtpl (>=0.20.1,<0.21.0)
 Requires-Dist: jambo (>=0.1.2,<0.2.0)
+Requires-Dist: jinja2 (>=3.1.6,<4.0.0)
 Requires-Dist: markdown-it-py (>=4.0.0,<5.0.0)
 Requires-Dist: mkdocs-mermaid2-plugin (>=1.2.2,<2.0.0)
 Requires-Dist: mkdocs-multirepo-plugin (>=0.8.3,<0.9.0)
-Requires-Dist: numpy (>=1.
+Requires-Dist: numpy (>=2.1.0,<3.0.0)
 Requires-Dist: openai (>=1.99.9,<2.0.0)
 Requires-Dist: pillow (>=10.4.0,<11.0.0)
 Requires-Dist: platformdirs (>=4.0.0,<5.0.0)
@@ -120,6 +121,54 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.33.3] - 2025-12-02
+- Fix serialization of ToolBuildConfig `configuration` field.
+
+## [1.33.2] - 2025-12-01
+- Upgrade numpy to >2.1.0 to ensure compatibility with the langchain library
+
+## [1.33.1] - 2025-12-01
+- Add `data_extraction` to unique_toolkit
+
+## [1.33.0] - 2025-11-28
+- Add support for system reminders in sub agent responses.
+
+## [1.32.1] - 2025-12-01
+- Added documentation for the toolkit, some missing type hints, and docstring fixes.
+
+## [1.32.0] - 2025-11-28
+- Add option to filter duplicate sub agent answers.
+
+## [1.31.2] - 2025-11-27
+- Added the function `filter_tool_calls_by_max_tool_calls_allowed` in `tool_manager` to limit the number of parallel tool calls permitted per loop iteration.
+
+## [1.31.1] - 2025-11-27
+- Various fixes to sub agent answers.
+
+## [1.31.0] - 2025-11-20
+- Add model `litellm:anthropic-claude-opus-4-5` to `language_model/infos.py`
+
+## [1.30.0] - 2025-11-26
+- Add option to display only parts of sub agent responses.
+
+## [1.29.4] - 2025-11-25
+- Add display name to openai builtin tools
+
+## [1.29.3] - 2025-11-24
+- Fix jinja utility helpers import
+
+## [1.29.2] - 2025-11-21
+- Add `jinja` utility helpers to `_common`
+
+## [1.29.1] - 2025-11-21
+- Add early return in `create_message_log_entry` if chat_service doesn't have assistant_message_id (relevant for agentic table)
+
+## [1.29.0] - 2025-11-21
+- Add option to force include references in sub agent responses even if unused by the main agent response.
+
+## [1.28.9] - 2025-11-21
+- Remove `knowledge_base_service` from DocXGeneratorService
+
 ## [1.28.8] - 2025-11-20
 - Add query params to api operation
 - Add query params to endpoint builder