hamtaa-texttools 2.1.0__tar.gz → 2.3.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/PKG-INFO +75 -11
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/README.md +73 -10
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/PKG-INFO +75 -11
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/SOURCES.txt +1 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/requires.txt +1 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/pyproject.toml +2 -1
- hamtaa_texttools-2.3.0/texttools/__init__.py +4 -0
- hamtaa_texttools-2.3.0/texttools/core/__init__.py +34 -0
- hamtaa_texttools-2.3.0/texttools/core/internal_models.py +123 -0
- hamtaa_texttools-2.3.0/texttools/core/operators/__init__.py +4 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/async_operator.py +11 -3
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/sync_operator.py +9 -3
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/core/utils.py +33 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/models.py +4 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/augment.yaml +15 -15
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/to_question.yaml +0 -2
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/translate.yaml +2 -2
- hamtaa_texttools-2.3.0/texttools/tools/__init__.py +5 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/tools/async_tools.py +69 -19
- hamtaa_texttools-2.3.0/texttools/tools/batch_tools.py +688 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/tools/sync_tools.py +69 -19
- hamtaa_texttools-2.1.0/texttools/__init__.py +0 -5
- hamtaa_texttools-2.1.0/texttools/core/__init__.py +0 -0
- hamtaa_texttools-2.1.0/texttools/core/internal_models.py +0 -71
- hamtaa_texttools-2.1.0/texttools/core/operators/__init__.py +0 -0
- hamtaa_texttools-2.1.0/texttools/tools/__init__.py +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/LICENSE +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/dependency_links.txt +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/top_level.txt +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/setup.cfg +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/tests/test_category_tree.py +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/tests/test_to_chunks.py +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/core/exceptions.py +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/categorize.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/extract_entities.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/extract_keywords.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/is_fact.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/is_question.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/merge_questions.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/propositionize.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/run_custom.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/prompts/summarize.yaml +0 -0
- {hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/py.typed +0 -0
{hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/PKG-INFO
RENAMED
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hamtaa-texttools
-Version: 2.1.0
+Version: 2.3.0
 Summary: A high-level NLP toolkit built on top of modern LLMs.
 Author-email: Tohidi <the.mohammad.tohidi@gmail.com>, Erfan Moosavi <erfanmoosavi84@gmail.com>, Montazer <montazerh82@gmail.com>, Givechi <mohamad.m.givechi@gmail.com>, Zareshahi <a.zareshahi1377@gmail.com>
 Maintainer-email: Erfan Moosavi <erfanmoosavi84@gmail.com>, Tohidi <the.mohammad.tohidi@gmail.com>
@@ -17,6 +17,7 @@ License-File: LICENSE
 Requires-Dist: dotenv>=0.9.9
 Requires-Dist: openai>=1.97.1
 Requires-Dist: pydantic>=2.0.0
+Requires-Dist: pytest>=9.0.2
 Requires-Dist: pyyaml>=6.0
 Dynamic: license-file
 
@@ -29,7 +30,10 @@ Dynamic: license-file
 
 **TextTools** is a high-level **NLP toolkit** built on top of **LLMs**.
 
-It provides
+It provides three API styles for maximum flexibility:
+- Sync API (`TheTool`) - Simple, sequential operations
+- Async API (`AsyncTheTool`) - High-performance async operations
+- Batch API (`BatchTheTool`) - Process multiple texts in parallel with built-in concurrency control
 
 It provides ready-to-use utilities for **translation, question detection, categorization, NER extraction, and more** - designed to help you integrate AI-powered text processing into your applications with minimal effort.
 
@@ -76,8 +80,6 @@ pip install -U hamtaa-texttools
 
 ## ⚙️ Additional Parameters
 
-- **`raise_on_error: bool`** → (`TheTool/AsyncTheTool` parameter) Raise errors (True) or return them in output (False). Default is True.
-
 - **`with_analysis: bool`** → Adds a reasoning step before generating the final output.
   **Note:** This doubles token usage per call.
 
@@ -98,32 +100,49 @@ pip install -U hamtaa-texttools
 - **`timeout: float`** → Maximum time in seconds to wait for the response before raising a timeout error.
   **Note:** This feature is only available in `AsyncTheTool`.
 
+- **`raise_on_error: bool`** → (`TheTool/AsyncTheTool`) Raise errors (True) or return them in output (False). Default is True.
+
+- **`max_concurrency: int`** → (`BatchTheTool` only) Maximum number of concurrent API calls. Default is 5.
 
 ---
 
 ## 🧩 ToolOutput
 
 Every tool of `TextTools` returns a `ToolOutput` object which is a BaseModel with attributes:
+
 - **`result: Any`**
 - **`analysis: str`**
 - **`logprobs: list`**
 - **`errors: list[str]`**
-- **`ToolOutputMetadata`**
+- **`ToolOutputMetadata`**
 - **`tool_name: str`**
+- **`processed_by: str`**
 - **`processed_at: datetime`**
 - **`execution_time: float`**
+- **`token_usage: TokenUsage`**
+  - **`completion_usage: CompletionUsage`**
+    - **`prompt_tokens: int`**
+    - **`completion_tokens: int`**
+    - **`total_tokens: int`**
+  - **`analyze_usage: AnalyzeUsage`**
+    - **`prompt_tokens: int`**
+    - **`completion_tokens: int`**
+    - **`total_tokens: int`**
 
 - Serialize output to JSON using the `to_json()` method.
 - Verify operation success with the `is_successful()` method.
 - Convert output to a dictionary with the `to_dict()` method.
 
+**Note:** For BatchTheTool: Each method returns a `list[ToolOutput]` containing results for all input texts.
+
 ---
 
-## 🧨 Sync vs Async
-| Tool
-
-| `TheTool`
-| `AsyncTheTool` | Async | High-throughput
+## 🧨 Sync vs Async vs Batch
+| Tool | Style | Use Case | Best For |
+|------|-------|----------|----------|
+| `TheTool` | **Sync** | Simple scripts, sequential workflows | • Quick prototyping<br>• Simple scripts<br>• Sequential processing<br>• Debugging |
+| `AsyncTheTool` | **Async** | High-throughput applications, APIs, concurrent tasks | • Web APIs<br>• Concurrent operations<br>• High-performance apps<br>• Real-time processing |
+| `BatchTheTool` | **Batch** | Process multiple texts efficiently with controlled concurrency | • Bulk processing<br>• Large datasets<br>• Parallel execution<br>• Resource optimization |
 
 ---
 
@@ -168,6 +187,35 @@ async def main():
 asyncio.run(main())
 ```
 
+## ⚡ Quick Start (Batch)
+
+```python
+import asyncio
+from openai import AsyncOpenAI
+from texttools import BatchTheTool
+
+async def main():
+    async_client = AsyncOpenAI(base_url="your_url", api_key="your_api_key")
+    model = "model_name"
+
+    batch_the_tool = BatchTheTool(client=async_client, model=model, max_concurrency=3)
+
+    categories = await batch_the_tool.categorize(
+        texts=[
+            "Climate change impacts on agriculture",
+            "Artificial intelligence in healthcare",
+            "Economic effects of remote work",
+            "Advancements in quantum computing",
+        ],
+        categories=["Science", "Technology", "Economics", "Environment"],
+    )
+
+    for i, result in enumerate(categories):
+        print(f"Text {i+1}: {result.result}")
+
+asyncio.run(main())
+```
+
 ---
 
 ## ✅ Use Cases
@@ -176,4 +224,20 @@ Use **TextTools** when you need to:
 
 - 🔍 **Classify** large datasets quickly without model training
 - 🧩 **Integrate** LLMs into production pipelines (structured outputs)
-- 📊 **Analyze** large text collections using embeddings and categorization
+- 📊 **Analyze** large text collections using embeddings and categorization
+
+---
+
+## 📄 License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+---
+
+## 🤝 Contributing
+
+We welcome contributions from the community! - see the [CONTRIBUTING](CONTRIBUTING.md) file for details.
+
+## 📚 Documentation
+
+For detailed documentation, architecture overview, and implementation details, please visit the [docs](docs) directory.
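The `max_concurrency` parameter documented above is the new Batch API's concurrency cap. The batch implementation itself (`texttools/tools/batch_tools.py`, +688 lines) is not shown in this excerpt, so the following is only a sketch of the conventional pattern behind such a cap, an `asyncio.Semaphore`; all names here are illustrative, not the package's actual internals.

```python
import asyncio
from typing import Any, Awaitable, Callable

async def run_bounded(
    calls: list[Callable[[], Awaitable[Any]]], max_concurrency: int = 5
) -> list[Any]:
    # At most `max_concurrency` coroutines hold the semaphore at once;
    # the rest queue up, which is how a batch API throttles LLM requests.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(call: Callable[[], Awaitable[Any]]) -> Any:
        async with semaphore:
            return await call()

    # gather() preserves input order, matching the documented contract that
    # each batch method returns a list aligned with the input texts.
    return await asyncio.gather(*(bounded(c) for c in calls))
```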
{hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/README.md
RENAMED
@@ -7,7 +7,10 @@
 
 **TextTools** is a high-level **NLP toolkit** built on top of **LLMs**.
 
-It provides
+It provides three API styles for maximum flexibility:
+- Sync API (`TheTool`) - Simple, sequential operations
+- Async API (`AsyncTheTool`) - High-performance async operations
+- Batch API (`BatchTheTool`) - Process multiple texts in parallel with built-in concurrency control
 
 It provides ready-to-use utilities for **translation, question detection, categorization, NER extraction, and more** - designed to help you integrate AI-powered text processing into your applications with minimal effort.
 
@@ -54,8 +57,6 @@ pip install -U hamtaa-texttools
 
 ## ⚙️ Additional Parameters
 
-- **`raise_on_error: bool`** → (`TheTool/AsyncTheTool` parameter) Raise errors (True) or return them in output (False). Default is True.
-
 - **`with_analysis: bool`** → Adds a reasoning step before generating the final output.
   **Note:** This doubles token usage per call.
 
@@ -76,32 +77,49 @@ pip install -U hamtaa-texttools
 - **`timeout: float`** → Maximum time in seconds to wait for the response before raising a timeout error.
   **Note:** This feature is only available in `AsyncTheTool`.
 
+- **`raise_on_error: bool`** → (`TheTool/AsyncTheTool`) Raise errors (True) or return them in output (False). Default is True.
+
+- **`max_concurrency: int`** → (`BatchTheTool` only) Maximum number of concurrent API calls. Default is 5.
 
 ---
 
 ## 🧩 ToolOutput
 
 Every tool of `TextTools` returns a `ToolOutput` object which is a BaseModel with attributes:
+
 - **`result: Any`**
 - **`analysis: str`**
 - **`logprobs: list`**
 - **`errors: list[str]`**
-- **`ToolOutputMetadata`**
+- **`ToolOutputMetadata`**
 - **`tool_name: str`**
+- **`processed_by: str`**
 - **`processed_at: datetime`**
 - **`execution_time: float`**
+- **`token_usage: TokenUsage`**
+  - **`completion_usage: CompletionUsage`**
+    - **`prompt_tokens: int`**
+    - **`completion_tokens: int`**
+    - **`total_tokens: int`**
+  - **`analyze_usage: AnalyzeUsage`**
+    - **`prompt_tokens: int`**
+    - **`completion_tokens: int`**
+    - **`total_tokens: int`**
 
 - Serialize output to JSON using the `to_json()` method.
 - Verify operation success with the `is_successful()` method.
 - Convert output to a dictionary with the `to_dict()` method.
 
+**Note:** For BatchTheTool: Each method returns a `list[ToolOutput]` containing results for all input texts.
+
 ---
 
-## 🧨 Sync vs Async
-| Tool
-
-| `TheTool`
-| `AsyncTheTool` | Async | High-throughput
+## 🧨 Sync vs Async vs Batch
+| Tool | Style | Use Case | Best For |
+|------|-------|----------|----------|
+| `TheTool` | **Sync** | Simple scripts, sequential workflows | • Quick prototyping<br>• Simple scripts<br>• Sequential processing<br>• Debugging |
+| `AsyncTheTool` | **Async** | High-throughput applications, APIs, concurrent tasks | • Web APIs<br>• Concurrent operations<br>• High-performance apps<br>• Real-time processing |
+| `BatchTheTool` | **Batch** | Process multiple texts efficiently with controlled concurrency | • Bulk processing<br>• Large datasets<br>• Parallel execution<br>• Resource optimization |
 
 ---
 
@@ -146,6 +164,35 @@ async def main():
 asyncio.run(main())
 ```
 
+## ⚡ Quick Start (Batch)
+
+```python
+import asyncio
+from openai import AsyncOpenAI
+from texttools import BatchTheTool
+
+async def main():
+    async_client = AsyncOpenAI(base_url="your_url", api_key="your_api_key")
+    model = "model_name"
+
+    batch_the_tool = BatchTheTool(client=async_client, model=model, max_concurrency=3)
+
+    categories = await batch_the_tool.categorize(
+        texts=[
+            "Climate change impacts on agriculture",
+            "Artificial intelligence in healthcare",
+            "Economic effects of remote work",
+            "Advancements in quantum computing",
+        ],
+        categories=["Science", "Technology", "Economics", "Environment"],
+    )
+
+    for i, result in enumerate(categories):
+        print(f"Text {i+1}: {result.result}")
+
+asyncio.run(main())
+```
+
 ---
 
 ## ✅ Use Cases
@@ -154,4 +201,20 @@ Use **TextTools** when you need to:
 
 - 🔍 **Classify** large datasets quickly without model training
 - 🧩 **Integrate** LLMs into production pipelines (structured outputs)
-- 📊 **Analyze** large text collections using embeddings and categorization
+- 📊 **Analyze** large text collections using embeddings and categorization
+
+---
+
+## 📄 License
+
+This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+---
+
+## 🤝 Contributing
+
+We welcome contributions from the community! - see the [CONTRIBUTING](CONTRIBUTING.md) file for details.
+
+## 📚 Documentation
+
+For detailed documentation, architecture overview, and implementation details, please visit the [docs](docs) directory.
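Given the `ToolOutput` fields and helpers listed in the README above, consuming a batch result plausibly looks like the sketch below. The attribute path to the `ToolOutputMetadata` object (written here as `output.metadata`) is an assumption: the diff lists the metadata fields but not the accessor.

```python
from typing import Any

def report(outputs: list[Any]) -> None:
    """Inspect a list[ToolOutput] returned by a BatchTheTool method."""
    for i, output in enumerate(outputs):
        if output.is_successful():  # helper documented in the README
            print(f"Text {i + 1}: {output.result}")
            # `metadata` is an assumed attribute name for ToolOutputMetadata.
            print(output.metadata.token_usage.total_tokens)
        else:
            print(f"Text {i + 1} failed: {output.errors}")  # documented field
```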
{hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/PKG-INFO
RENAMED
(same changes as the top-level PKG-INFO above)
{hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/pyproject.toml
RENAMED
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "hamtaa-texttools"
-version = "2.1.0"
+version = "2.3.0"
 authors = [
     {name = "Tohidi", email = "the.mohammad.tohidi@gmail.com"},
     {name = "Erfan Moosavi", email = "erfanmoosavi84@gmail.com"},
@@ -24,6 +24,7 @@ dependencies = [
     "dotenv>=0.9.9",
     "openai>=1.97.1",
     "pydantic>=2.0.0",
+    "pytest>=9.0.2",
     "pyyaml>=6.0",
 ]
 keywords = ["nlp", "llm", "text-processing", "openai"]
hamtaa_texttools-2.3.0/texttools/core/__init__.py
ADDED
@@ -0,0 +1,34 @@
+from .exceptions import LLMError, PromptError, TextToolsError, ValidationError
+from .internal_models import (
+    Bool,
+    ListDictStrStr,
+    ListStr,
+    ReasonListStr,
+    Str,
+    TokenUsage,
+    create_dynamic_model,
+)
+from .operators import AsyncOperator, Operator
+from .utils import OperatorUtils, TheToolUtils
+
+__all__ = [
+    # Exceptions
+    "LLMError",
+    "PromptError",
+    "TextToolsError",
+    "ValidationError",
+    # Internal models
+    "Bool",
+    "ListDictStrStr",
+    "ListStr",
+    "ReasonListStr",
+    "Str",
+    "TokenUsage",
+    "create_dynamic_model",
+    # Operators
+    "AsyncOperator",
+    "Operator",
+    # Utils
+    "OperatorUtils",
+    "TheToolUtils",
+]
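This `__init__.py` makes the core internals importable from one place. For example, downstream code can now catch the exported exception hierarchy directly; that `TextToolsError` is the common base class is my inference from the naming, not something this diff shows.

```python
from texttools.core import LLMError, PromptError, TextToolsError

def classify_failure(exc: Exception) -> str:
    # Check narrow classes first, then the presumed base class.
    if isinstance(exc, LLMError):
        return "model call failed"
    if isinstance(exc, PromptError):
        return "prompt template problem"
    if isinstance(exc, TextToolsError):
        return "other texttools error"
    return "unrelated error"
```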
hamtaa_texttools-2.3.0/texttools/core/internal_models.py
ADDED
@@ -0,0 +1,123 @@
+from __future__ import annotations
+
+from typing import Any, Literal
+
+from pydantic import BaseModel, Field, create_model
+
+
+class CompletionUsage(BaseModel):
+    prompt_tokens: int = 0
+    completion_tokens: int = 0
+    total_tokens: int = 0
+
+
+class AnalyzeUsage(BaseModel):
+    prompt_tokens: int = 0
+    completion_tokens: int = 0
+    total_tokens: int = 0
+
+
+class TokenUsage(BaseModel):
+    completion_usage: CompletionUsage = CompletionUsage()
+    analyze_usage: AnalyzeUsage = AnalyzeUsage()
+    total_tokens: int = 0
+
+    def __add__(self, other: TokenUsage) -> TokenUsage:
+        new_completion_usage = CompletionUsage(
+            prompt_tokens=self.completion_usage.prompt_tokens
+            + other.completion_usage.prompt_tokens,
+            completion_tokens=self.completion_usage.completion_tokens
+            + other.completion_usage.completion_tokens,
+            total_tokens=self.completion_usage.total_tokens
+            + other.completion_usage.total_tokens,
+        )
+        new_analyze_usage = AnalyzeUsage(
+            prompt_tokens=self.analyze_usage.prompt_tokens
+            + other.analyze_usage.prompt_tokens,
+            completion_tokens=self.analyze_usage.completion_tokens
+            + other.analyze_usage.completion_tokens,
+            total_tokens=self.analyze_usage.total_tokens
+            + other.analyze_usage.total_tokens,
+        )
+        total_tokens = (
+            new_completion_usage.total_tokens + new_analyze_usage.total_tokens
+        )
+
+        return TokenUsage(
+            completion_usage=new_completion_usage,
+            analyze_usage=new_analyze_usage,
+            total_tokens=total_tokens,
+        )
+
+
+class OperatorOutput(BaseModel):
+    result: Any
+    analysis: str | None
+    logprobs: list[dict[str, Any]] | None
+    token_usage: TokenUsage | None = None
+    prompt_tokens: int | None = None
+    completion_tokens: int | None = None
+    analysis_tokens: int | None = None
+    total_tokens: int | None = None
+
+
+class Str(BaseModel):
+    result: str = Field(
+        ..., description="The output string", json_schema_extra={"example": "text"}
+    )
+
+
+class Bool(BaseModel):
+    result: bool = Field(
+        ...,
+        description="Boolean indicating the output state",
+        json_schema_extra={"example": True},
+    )
+
+
+class ListStr(BaseModel):
+    result: list[str] = Field(
+        ...,
+        description="The output list of strings",
+        json_schema_extra={"example": ["text_1", "text_2", "text_3"]},
+    )
+
+
+class ListDictStrStr(BaseModel):
+    result: list[dict[str, str]] = Field(
+        ...,
+        description="List of dictionaries containing string key-value pairs",
+        json_schema_extra={
+            "example": [
+                {"text": "Mohammad", "type": "PER"},
+                {"text": "Iran", "type": "LOC"},
+            ]
+        },
+    )
+
+
+class ReasonListStr(BaseModel):
+    reason: str = Field(..., description="Thinking process that led to the output")
+    result: list[str] = Field(
+        ...,
+        description="The output list of strings",
+        json_schema_extra={"example": ["text_1", "text_2", "text_3"]},
+    )
+
+
+# Create CategorizerOutput with dynamic categories
+def create_dynamic_model(allowed_values: list[str]) -> type[BaseModel]:
+    literal_type = Literal[*allowed_values]
+
+    CategorizerOutput = create_model(
+        "CategorizerOutput",
+        reason=(
+            str,
+            Field(
+                ..., description="Explanation of why the input belongs to the category"
+            ),
+        ),
+        result=(literal_type, Field(..., description="Predicted category label")),
+    )
+
+    return CategorizerOutput
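Two pieces of this new module are worth exercising directly: the `__add__` overload, which lets per-call usage accumulate across a batch, and `create_dynamic_model`, which constrains the categorizer's `result` field to the caller's labels via `Literal` (note that the `Literal[*allowed_values]` unpacking requires Python 3.11+). A small self-contained demo; the example values are mine:

```python
from texttools.core.internal_models import (
    CompletionUsage,
    TokenUsage,
    create_dynamic_model,
)

# Accumulating usage: __add__ sums the nested counters field by field.
a = TokenUsage(completion_usage=CompletionUsage(prompt_tokens=10, total_tokens=12))
b = TokenUsage(completion_usage=CompletionUsage(prompt_tokens=5, total_tokens=7))
combined = a + b
assert combined.completion_usage.prompt_tokens == 15
assert combined.total_tokens == 19  # completion total + analyze total

# Dynamic categorizer: `result` only validates against the allowed labels.
Categorizer = create_dynamic_model(["Science", "Technology"])
ok = Categorizer(reason="mentions qubits", result="Science")
try:
    Categorizer(reason="off the list", result="Sports")
except Exception as exc:  # pydantic.ValidationError
    print(type(exc).__name__)
```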
{hamtaa_texttools-2.1.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/async_operator.py
RENAMED
@@ -18,7 +18,9 @@ class AsyncOperator:
         self._client = client
         self._model = model
 
-    async def _analyze_completion(
+    async def _analyze_completion(
+        self, analyze_message: list[dict[str, str]]
+    ) -> tuple[str, Any]:
         try:
             completion = await self._client.chat.completions.create(
                 model=self._model,
@@ -33,7 +35,7 @@ class AsyncOperator:
             if not analysis:
                 raise LLMError("Empty analysis response")
 
-            return analysis
+            return analysis, completion
 
         except Exception as e:
             if isinstance(e, (PromptError, LLMError)):
@@ -116,12 +118,15 @@ class AsyncOperator:
         )
 
         analysis: str | None = None
+        analyze_completion: Any = None
 
         if with_analysis:
            analyze_message = OperatorUtils.build_message(
                 prompt_configs["analyze_template"]
             )
-            analysis = await self._analyze_completion(
+            analysis, analyze_completion = await self._analyze_completion(
+                analyze_message
+            )
 
         main_prompt = OperatorUtils.build_main_prompt(
             prompt_configs["main_template"], analysis, output_lang, user_prompt
@@ -176,6 +181,9 @@ class AsyncOperator:
             logprobs=OperatorUtils.extract_logprobs(completion)
             if logprobs
             else None,
+            token_usage=OperatorUtils.extract_token_usage(
+                completion, analyze_completion
+            ),
         )
 
         return operator_output