hamtaa-texttools 2.2.0__tar.gz → 2.3.0__tar.gz

This diff shows the contents of two publicly released versions of the package, as published to their public registry. It is provided for informational purposes only and reflects the changes between those released versions.
Files changed (43)
  1. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/PKG-INFO +15 -3
  2. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/README.md +13 -2
  3. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/PKG-INFO +15 -3
  4. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/requires.txt +1 -0
  5. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/pyproject.toml +2 -1
  6. hamtaa_texttools-2.3.0/texttools/__init__.py +4 -0
  7. hamtaa_texttools-2.3.0/texttools/core/__init__.py +34 -0
  8. hamtaa_texttools-2.3.0/texttools/core/internal_models.py +123 -0
  9. hamtaa_texttools-2.3.0/texttools/core/operators/__init__.py +4 -0
  10. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/async_operator.py +11 -3
  11. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/sync_operator.py +9 -3
  12. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/utils.py +33 -0
  13. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/models.py +4 -0
  14. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/to_question.yaml +0 -2
  15. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/translate.yaml +2 -2
  16. hamtaa_texttools-2.3.0/texttools/tools/__init__.py +5 -0
  17. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/tools/async_tools.py +69 -18
  18. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/tools/sync_tools.py +69 -18
  19. hamtaa_texttools-2.2.0/texttools/__init__.py +0 -6
  20. hamtaa_texttools-2.2.0/texttools/core/__init__.py +0 -0
  21. hamtaa_texttools-2.2.0/texttools/core/internal_models.py +0 -71
  22. hamtaa_texttools-2.2.0/texttools/core/operators/__init__.py +0 -0
  23. hamtaa_texttools-2.2.0/texttools/tools/__init__.py +0 -0
  24. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/LICENSE +0 -0
  25. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/SOURCES.txt +0 -0
  26. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/dependency_links.txt +0 -0
  27. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/top_level.txt +0 -0
  28. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/setup.cfg +0 -0
  29. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/tests/test_category_tree.py +0 -0
  30. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/tests/test_to_chunks.py +0 -0
  31. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/exceptions.py +0 -0
  32. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/augment.yaml +0 -0
  33. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/categorize.yaml +0 -0
  34. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/extract_entities.yaml +0 -0
  35. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/extract_keywords.yaml +0 -0
  36. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/is_fact.yaml +0 -0
  37. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/is_question.yaml +0 -0
  38. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/merge_questions.yaml +0 -0
  39. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/propositionize.yaml +0 -0
  40. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/run_custom.yaml +0 -0
  41. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/summarize.yaml +0 -0
  42. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/py.typed +0 -0
  43. {hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/tools/batch_tools.py +0 -0
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hamtaa-texttools
-Version: 2.2.0
+Version: 2.3.0
 Summary: A high-level NLP toolkit built on top of modern LLMs.
 Author-email: Tohidi <the.mohammad.tohidi@gmail.com>, Erfan Moosavi <erfanmoosavi84@gmail.com>, Montazer <montazerh82@gmail.com>, Givechi <mohamad.m.givechi@gmail.com>, Zareshahi <a.zareshahi1377@gmail.com>
 Maintainer-email: Erfan Moosavi <erfanmoosavi84@gmail.com>, Tohidi <the.mohammad.tohidi@gmail.com>
@@ -17,6 +17,7 @@ License-File: LICENSE
 Requires-Dist: dotenv>=0.9.9
 Requires-Dist: openai>=1.97.1
 Requires-Dist: pydantic>=2.0.0
+Requires-Dist: pytest>=9.0.2
 Requires-Dist: pyyaml>=6.0
 Dynamic: license-file
 
@@ -108,20 +109,31 @@ pip install -U hamtaa-texttools
 ## 🧩 ToolOutput
 
 Every tool of `TextTools` returns a `ToolOutput` object which is a BaseModel with attributes:
+
 - **`result: Any`**
 - **`analysis: str`**
 - **`logprobs: list`**
 - **`errors: list[str]`**
-- **`ToolOutputMetadata`**
+- **`ToolOutputMetadata`**
     - **`tool_name: str`**
+    - **`processed_by: str`**
     - **`processed_at: datetime`**
     - **`execution_time: float`**
+    - **`token_usage: TokenUsage`**
+        - **`completion_usage: CompletionUsage`**
+            - **`prompt_tokens: int`**
+            - **`completion_tokens: int`**
+            - **`total_tokens: int`**
+        - **`analyze_usage: AnalyzeUsage`**
+            - **`prompt_tokens: int`**
+            - **`completion_tokens: int`**
+            - **`total_tokens: int`**
 
 - Serialize output to JSON using the `to_json()` method.
 - Verify operation success with the `is_successful()` method.
 - Convert output to a dictionary with the `to_dict()` method.
 
-**Note:** For BatchTheTool: Each method returns a list[ToolOutput] containing results for all input texts.
+**Note:** For BatchTheTool: Each method returns a `list[ToolOutput]` containing results for all input texts.
 
 ---
 
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/README.md
@@ -86,20 +86,31 @@ pip install -U hamtaa-texttools
 ## 🧩 ToolOutput
 
 Every tool of `TextTools` returns a `ToolOutput` object which is a BaseModel with attributes:
+
 - **`result: Any`**
 - **`analysis: str`**
 - **`logprobs: list`**
 - **`errors: list[str]`**
-- **`ToolOutputMetadata`**
+- **`ToolOutputMetadata`**
     - **`tool_name: str`**
+    - **`processed_by: str`**
    - **`processed_at: datetime`**
     - **`execution_time: float`**
+    - **`token_usage: TokenUsage`**
+        - **`completion_usage: CompletionUsage`**
+            - **`prompt_tokens: int`**
+            - **`completion_tokens: int`**
+            - **`total_tokens: int`**
+        - **`analyze_usage: AnalyzeUsage`**
+            - **`prompt_tokens: int`**
+            - **`completion_tokens: int`**
+            - **`total_tokens: int`**
 
 - Serialize output to JSON using the `to_json()` method.
 - Verify operation success with the `is_successful()` method.
 - Convert output to a dictionary with the `to_dict()` method.
 
-**Note:** For BatchTheTool: Each method returns a list[ToolOutput] containing results for all input texts.
+**Note:** For BatchTheTool: Each method returns a `list[ToolOutput]` containing results for all input texts.
 
 ---
 
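For orientation, here is a minimal sketch of how the new metadata surfaces to callers. It assumes `TheTool(client=..., model=...)` as suggested by the `sync_tools.py` diff further down; the method name `summarize` is inferred from the bundled `summarize.yaml` prompt, and the `metadata` field name on `ToolOutput` is an assumption:

```python
from openai import OpenAI

from texttools import TheTool

# Hypothetical client/model; constructor arguments are taken from the diff below.
tool = TheTool(client=OpenAI(), model="gpt-4o-mini")

output = tool.summarize("TextTools is a high-level NLP toolkit built on top of modern LLMs.")

if output.is_successful():
    print(output.result)
    # New in 2.3.0: which model produced the result and what it cost in tokens.
    print(output.metadata.processed_by)       # e.g. "gpt-4o-mini"
    if output.metadata.token_usage:           # Optional: None when not populated
        print(output.metadata.token_usage.total_tokens)

print(output.to_json())
```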
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: hamtaa-texttools
-Version: 2.2.0
+Version: 2.3.0
 Summary: A high-level NLP toolkit built on top of modern LLMs.
 Author-email: Tohidi <the.mohammad.tohidi@gmail.com>, Erfan Moosavi <erfanmoosavi84@gmail.com>, Montazer <montazerh82@gmail.com>, Givechi <mohamad.m.givechi@gmail.com>, Zareshahi <a.zareshahi1377@gmail.com>
 Maintainer-email: Erfan Moosavi <erfanmoosavi84@gmail.com>, Tohidi <the.mohammad.tohidi@gmail.com>
@@ -17,6 +17,7 @@ License-File: LICENSE
 Requires-Dist: dotenv>=0.9.9
 Requires-Dist: openai>=1.97.1
 Requires-Dist: pydantic>=2.0.0
+Requires-Dist: pytest>=9.0.2
 Requires-Dist: pyyaml>=6.0
 Dynamic: license-file
 
@@ -108,20 +109,31 @@ pip install -U hamtaa-texttools
 ## 🧩 ToolOutput
 
 Every tool of `TextTools` returns a `ToolOutput` object which is a BaseModel with attributes:
+
 - **`result: Any`**
 - **`analysis: str`**
 - **`logprobs: list`**
 - **`errors: list[str]`**
-- **`ToolOutputMetadata`**
+- **`ToolOutputMetadata`**
     - **`tool_name: str`**
+    - **`processed_by: str`**
     - **`processed_at: datetime`**
     - **`execution_time: float`**
+    - **`token_usage: TokenUsage`**
+        - **`completion_usage: CompletionUsage`**
+            - **`prompt_tokens: int`**
+            - **`completion_tokens: int`**
+            - **`total_tokens: int`**
+        - **`analyze_usage: AnalyzeUsage`**
+            - **`prompt_tokens: int`**
+            - **`completion_tokens: int`**
+            - **`total_tokens: int`**
 
 - Serialize output to JSON using the `to_json()` method.
 - Verify operation success with the `is_successful()` method.
 - Convert output to a dictionary with the `to_dict()` method.
 
-**Note:** For BatchTheTool: Each method returns a list[ToolOutput] containing results for all input texts.
+**Note:** For BatchTheTool: Each method returns a `list[ToolOutput]` containing results for all input texts.
 
 ---
 
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/hamtaa_texttools.egg-info/requires.txt
@@ -1,4 +1,5 @@
 dotenv>=0.9.9
 openai>=1.97.1
 pydantic>=2.0.0
+pytest>=9.0.2
 pyyaml>=6.0
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
 
 [project]
 name = "hamtaa-texttools"
-version = "2.2.0"
+version = "2.3.0"
 authors = [
     {name = "Tohidi", email = "the.mohammad.tohidi@gmail.com"},
     {name = "Erfan Moosavi", email = "erfanmoosavi84@gmail.com"},
@@ -24,6 +24,7 @@ dependencies = [
     "dotenv>=0.9.9",
     "openai>=1.97.1",
     "pydantic>=2.0.0",
+    "pytest>=9.0.2",
     "pyyaml>=6.0",
 ]
 keywords = ["nlp", "llm", "text-processing", "openai"]
hamtaa_texttools-2.3.0/texttools/__init__.py
@@ -0,0 +1,4 @@
+from .models import CategoryTree
+from .tools import AsyncTheTool, BatchTheTool, TheTool
+
+__all__ = ["CategoryTree", "AsyncTheTool", "BatchTheTool", "TheTool"]
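With this new package root (and the `tools` and `core` packages below re-exporting their members), the public classes import directly from `texttools`; a quick sketch of the aliasing this produces:

```python
# Both import paths resolve to the same class object after this change.
from texttools import TheTool
from texttools.tools import TheTool as TheToolDirect

assert TheTool is TheToolDirect
```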
hamtaa_texttools-2.3.0/texttools/core/__init__.py
@@ -0,0 +1,34 @@
+from .exceptions import LLMError, PromptError, TextToolsError, ValidationError
+from .internal_models import (
+    Bool,
+    ListDictStrStr,
+    ListStr,
+    ReasonListStr,
+    Str,
+    TokenUsage,
+    create_dynamic_model,
+)
+from .operators import AsyncOperator, Operator
+from .utils import OperatorUtils, TheToolUtils
+
+__all__ = [
+    # Exceptions
+    "LLMError",
+    "PromptError",
+    "TextToolsError",
+    "ValidationError",
+    # Internal models
+    "Bool",
+    "ListDictStrStr",
+    "ListStr",
+    "ReasonListStr",
+    "Str",
+    "TokenUsage",
+    "create_dynamic_model",
+    # Operators
+    "AsyncOperator",
+    "Operator",
+    # Utils
+    "OperatorUtils",
+    "TheToolUtils",
+]
hamtaa_texttools-2.3.0/texttools/core/internal_models.py
@@ -0,0 +1,123 @@
+from __future__ import annotations
+
+from typing import Any, Literal
+
+from pydantic import BaseModel, Field, create_model
+
+
+class CompletionUsage(BaseModel):
+    prompt_tokens: int = 0
+    completion_tokens: int = 0
+    total_tokens: int = 0
+
+
+class AnalyzeUsage(BaseModel):
+    prompt_tokens: int = 0
+    completion_tokens: int = 0
+    total_tokens: int = 0
+
+
+class TokenUsage(BaseModel):
+    completion_usage: CompletionUsage = CompletionUsage()
+    analyze_usage: AnalyzeUsage = AnalyzeUsage()
+    total_tokens: int = 0
+
+    def __add__(self, other: TokenUsage) -> TokenUsage:
+        new_completion_usage = CompletionUsage(
+            prompt_tokens=self.completion_usage.prompt_tokens
+            + other.completion_usage.prompt_tokens,
+            completion_tokens=self.completion_usage.completion_tokens
+            + other.completion_usage.completion_tokens,
+            total_tokens=self.completion_usage.total_tokens
+            + other.completion_usage.total_tokens,
+        )
+        new_analyze_usage = AnalyzeUsage(
+            prompt_tokens=self.analyze_usage.prompt_tokens
+            + other.analyze_usage.prompt_tokens,
+            completion_tokens=self.analyze_usage.completion_tokens
+            + other.analyze_usage.completion_tokens,
+            total_tokens=self.analyze_usage.total_tokens
+            + other.analyze_usage.total_tokens,
+        )
+        total_tokens = (
+            new_completion_usage.total_tokens + new_analyze_usage.total_tokens
+        )
+
+        return TokenUsage(
+            completion_usage=new_completion_usage,
+            analyze_usage=new_analyze_usage,
+            total_tokens=total_tokens,
+        )
+
+
+class OperatorOutput(BaseModel):
+    result: Any
+    analysis: str | None
+    logprobs: list[dict[str, Any]] | None
+    token_usage: TokenUsage | None = None
+    prompt_tokens: int | None = None
+    completion_tokens: int | None = None
+    analysis_tokens: int | None = None
+    total_tokens: int | None = None
+
+
+class Str(BaseModel):
+    result: str = Field(
+        ..., description="The output string", json_schema_extra={"example": "text"}
+    )
+
+
+class Bool(BaseModel):
+    result: bool = Field(
+        ...,
+        description="Boolean indicating the output state",
+        json_schema_extra={"example": True},
+    )
+
+
+class ListStr(BaseModel):
+    result: list[str] = Field(
+        ...,
+        description="The output list of strings",
+        json_schema_extra={"example": ["text_1", "text_2", "text_3"]},
+    )
+
+
+class ListDictStrStr(BaseModel):
+    result: list[dict[str, str]] = Field(
+        ...,
+        description="List of dictionaries containing string key-value pairs",
+        json_schema_extra={
+            "example": [
+                {"text": "Mohammad", "type": "PER"},
+                {"text": "Iran", "type": "LOC"},
+            ]
+        },
+    )
+
+
+class ReasonListStr(BaseModel):
+    reason: str = Field(..., description="Thinking process that led to the output")
+    result: list[str] = Field(
+        ...,
+        description="The output list of strings",
+        json_schema_extra={"example": ["text_1", "text_2", "text_3"]},
+    )
+
+
+# Create CategorizerOutput with dynamic categories
+def create_dynamic_model(allowed_values: list[str]) -> type[BaseModel]:
+    literal_type = Literal[*allowed_values]
+
+    CategorizerOutput = create_model(
+        "CategorizerOutput",
+        reason=(
+            str,
+            Field(
+                ..., description="Explanation of why the input belongs to the category"
+            ),
+        ),
+        result=(literal_type, Field(..., description="Predicted category label")),
+    )
+
+    return CategorizerOutput
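Two details of this new module are worth illustrating. `TokenUsage.__add__` is what lets the tools further down accumulate usage across translation chunks and category-tree levels (`token_usage += ...`), and `create_dynamic_model` builds a Pydantic model whose `result` field is constrained to the caller's labels. A small sketch of both, using only the definitions above:

```python
from texttools.core.internal_models import (
    CompletionUsage,
    TokenUsage,
    create_dynamic_model,
)

a = TokenUsage(
    completion_usage=CompletionUsage(prompt_tokens=10, completion_tokens=5, total_tokens=15),
    total_tokens=15,
)
b = TokenUsage(
    completion_usage=CompletionUsage(prompt_tokens=20, completion_tokens=10, total_tokens=30),
    total_tokens=30,
)

merged = a + b
assert merged.completion_usage.prompt_tokens == 30
assert merged.total_tokens == 45  # recomputed from both sub-usages

# `result` only accepts one of the given labels; anything else fails validation.
CategorizerOutput = create_dynamic_model(["sports", "politics"])
CategorizerOutput(reason="mentions a football match", result="sports")  # ok
# CategorizerOutput(reason="...", result="weather") would raise a ValidationError
```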
hamtaa_texttools-2.3.0/texttools/core/operators/__init__.py
@@ -0,0 +1,4 @@
+from .async_operator import AsyncOperator
+from .sync_operator import Operator
+
+__all__ = ["AsyncOperator", "Operator"]
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/async_operator.py
@@ -18,7 +18,9 @@ class AsyncOperator:
         self._client = client
         self._model = model
 
-    async def _analyze_completion(self, analyze_message: list[dict[str, str]]) -> str:
+    async def _analyze_completion(
+        self, analyze_message: list[dict[str, str]]
+    ) -> tuple[str, Any]:
         try:
             completion = await self._client.chat.completions.create(
                 model=self._model,
@@ -33,7 +35,7 @@ class AsyncOperator:
             if not analysis:
                 raise LLMError("Empty analysis response")
 
-            return analysis
+            return analysis, completion
 
         except Exception as e:
             if isinstance(e, (PromptError, LLMError)):
@@ -116,12 +118,15 @@
         )
 
         analysis: str | None = None
+        analyze_completion: Any = None
 
         if with_analysis:
             analyze_message = OperatorUtils.build_message(
                 prompt_configs["analyze_template"]
             )
-            analysis = await self._analyze_completion(analyze_message)
+            analysis, analyze_completion = await self._analyze_completion(
+                analyze_message
+            )
 
         main_prompt = OperatorUtils.build_main_prompt(
             prompt_configs["main_template"], analysis, output_lang, user_prompt
@@ -176,6 +181,9 @@
             logprobs=OperatorUtils.extract_logprobs(completion)
             if logprobs
             else None,
+            token_usage=OperatorUtils.extract_token_usage(
+                completion, analyze_completion
+            ),
         )
 
         return operator_output
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/operators/sync_operator.py
@@ -18,7 +18,9 @@ class Operator:
         self._client = client
         self._model = model
 
-    def _analyze_completion(self, analyze_message: list[dict[str, str]]) -> str:
+    def _analyze_completion(
+        self, analyze_message: list[dict[str, str]]
+    ) -> tuple[str, Any]:
         try:
             completion = self._client.chat.completions.create(
                 model=self._model,
@@ -33,7 +35,7 @@ class Operator:
             if not analysis:
                 raise LLMError("Empty analysis response")
 
-            return analysis
+            return analysis, completion
 
         except Exception as e:
             if isinstance(e, (PromptError, LLMError)):
@@ -114,12 +116,13 @@
         )
 
         analysis: str | None = None
+        analyze_completion: Any = None
 
         if with_analysis:
             analyze_message = OperatorUtils.build_message(
                 prompt_configs["analyze_template"]
             )
-            analysis = self._analyze_completion(analyze_message)
+            analysis, analyze_completion = self._analyze_completion(analyze_message)
 
         main_prompt = OperatorUtils.build_main_prompt(
             prompt_configs["main_template"], analysis, output_lang, user_prompt
@@ -174,6 +177,9 @@
             logprobs=OperatorUtils.extract_logprobs(completion)
             if logprobs
             else None,
+            token_usage=OperatorUtils.extract_token_usage(
+                completion, analyze_completion
+            ),
         )
 
         return operator_output
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/core/utils.py
@@ -9,6 +9,7 @@ from typing import Any
 import yaml
 
 from .exceptions import PromptError
+from .internal_models import AnalyzeUsage, CompletionUsage, TokenUsage
 
 
 class OperatorUtils:
@@ -148,6 +149,38 @@ class OperatorUtils:
         new_temp = base_temp + random.choice([-1, 1]) * random.uniform(0.1, 0.9)
         return max(0.0, min(new_temp, 1.5))
 
+    @staticmethod
+    def extract_token_usage(completion: Any, analyze_completion: Any) -> TokenUsage:
+        completion_usage = completion.usage
+        analyze_usage = analyze_completion.usage if analyze_completion else None
+
+        completion_usage_model = CompletionUsage(
+            prompt_tokens=getattr(completion_usage, "prompt_tokens", 00),
+            completion_tokens=getattr(completion_usage, "completion_tokens", 00),
+            total_tokens=getattr(completion_usage, "total_tokens", 00),
+        )
+        analyze_usage_model = AnalyzeUsage(
+            prompt_tokens=getattr(analyze_usage, "prompt_tokens", 0),
+            completion_tokens=getattr(analyze_usage, "completion_tokens", 0),
+            total_tokens=getattr(analyze_usage, "total_tokens", 0),
+        )
+        total_analyze_tokens = (
+            analyze_usage_model.prompt_tokens + analyze_usage_model.completion_tokens
+            if analyze_completion
+            else 0
+        )
+        total_tokens = (
+            completion_usage_model.prompt_tokens
+            + completion_usage_model.completion_tokens
+            + total_analyze_tokens
+        )
+
+        return TokenUsage(
+            completion_usage=completion_usage_model,
+            analyze_usage=analyze_usage_model,
+            total_tokens=total_tokens,
+        )
+
 
 class TheToolUtils:
     """
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/models.py
@@ -5,11 +5,15 @@ from typing import Any
 
 from pydantic import BaseModel, Field
 
+from .core import TokenUsage
+
 
 class ToolOutputMetadata(BaseModel):
     tool_name: str
+    processed_by: str | None = None
     processed_at: datetime = Field(default_factory=datetime.now)
     execution_time: float | None = None
+    token_usage: TokenUsage | None = None
 
 
 class ToolOutput(BaseModel):
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/to_question.yaml
@@ -7,7 +7,6 @@ main_template:
     and must not mention any verbs like this, that, he or she in the question.
 
     There is a `reason` key, fill that up with a summerized version of your thoughts.
-    The `reason` must be less than 20 words.
    Don't forget to fill the reason.
 
     Respond only in JSON format:
@@ -23,7 +22,6 @@
     and must not mention any verbs like this, that, he or she in the question.
 
     There is a `reason` key, fill that up with a summerized version of your thoughts.
-    The `reason` must be less than 20 words.
     Don't forget to fill the reason.
 
     Respond only in JSON format:
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/prompts/translate.yaml
@@ -3,9 +3,9 @@ main_template: |
     Output only the translated text.
 
     Respond only in JSON format:
-    {{"result": "string"}}
+    {{"result": "translated_text"}}
 
-    Don't translate proper name, only transliterate them to {target_lang}
+    Don't translate proper names, only transliterate them to {target_lang}
 
     Translate the following text to {target_lang}:
     {text}
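The doubled braces in this template are significant if, as the escaping suggests, it is rendered with Python's `str.format`: `{{...}}` survives as literal JSON braces while `{target_lang}` and `{text}` are substituted. A sketch under that assumption:

```python
template = (
    "Respond only in JSON format:\n"
    '{{"result": "translated_text"}}\n\n'
    "Translate the following text to {target_lang}:\n{text}"
)
print(template.format(target_lang="English", text="سلام دنیا"))
# The JSON braces stay literal; the two placeholders are filled in.
```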
hamtaa_texttools-2.3.0/texttools/tools/__init__.py
@@ -0,0 +1,5 @@
+from .async_tools import AsyncTheTool
+from .batch_tools import BatchTheTool
+from .sync_tools import TheTool
+
+__all__ = ["AsyncTheTool", "BatchTheTool", "TheTool"]
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/tools/async_tools.py
@@ -5,17 +5,21 @@ from typing import Any, Literal
 
 from openai import AsyncOpenAI
 
-from ..core.exceptions import LLMError, PromptError, TextToolsError, ValidationError
-from ..core.internal_models import (
+from ..core import (
+    AsyncOperator,
     Bool,
     ListDictStrStr,
     ListStr,
+    LLMError,
+    PromptError,
     ReasonListStr,
     Str,
+    TextToolsError,
+    TheToolUtils,
+    TokenUsage,
+    ValidationError,
     create_dynamic_model,
 )
-from ..core.operators.async_operator import AsyncOperator
-from ..core.utils import TheToolUtils
 from ..models import CategoryTree, ToolOutput, ToolOutputMetadata
 
 
@@ -29,6 +33,7 @@ class AsyncTheTool:
         self._operator = AsyncOperator(client=client, model=model)
         self.logger = logging.getLogger(self.__class__.__name__)
         self.raise_on_error = raise_on_error
+        self.model = model
 
     async def categorize(
         self,
@@ -91,7 +96,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -106,6 +114,7 @@
         final_categories = []
         analysis = ""
         logprobs_list = []
+        token_usage = TokenUsage()
 
         for _ in range(levels):
             if not parent_node.children:
@@ -149,9 +158,13 @@
             analysis += level_operator_output.analysis
             if logprobs:
                 logprobs_list.extend(level_operator_output.logprobs)
+            token_usage += level_operator_output.token_usage
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=(perf_counter() - start)
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=token_usage,
         )
         tool_output = ToolOutput(
             result=final_categories,
@@ -237,7 +250,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -321,7 +337,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -400,7 +419,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -486,7 +508,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -570,7 +595,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -653,7 +681,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
        )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -734,7 +765,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -802,6 +836,7 @@
         translation = ""
         analysis = ""
         logprobs_list = []
+        token_usage = TokenUsage()
 
         for chunk in chunks:
             chunk_operator_output = await TheToolUtils.run_with_timeout(
@@ -832,9 +867,13 @@
             analysis += chunk_operator_output.analysis
             if logprobs:
                 logprobs_list.extend(chunk_operator_output.logprobs)
+            token_usage += chunk_operator_output.token_usage
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=token_usage,
         )
         tool_output = ToolOutput(
             result=translation,
@@ -867,7 +906,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -950,7 +992,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -1036,7 +1081,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -1121,7 +1169,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
{hamtaa_texttools-2.2.0 → hamtaa_texttools-2.3.0}/texttools/tools/sync_tools.py
@@ -5,17 +5,21 @@ from typing import Any, Literal
 
 from openai import OpenAI
 
-from ..core.exceptions import LLMError, PromptError, TextToolsError, ValidationError
-from ..core.internal_models import (
+from ..core import (
     Bool,
     ListDictStrStr,
     ListStr,
+    LLMError,
+    Operator,
+    PromptError,
     ReasonListStr,
     Str,
+    TextToolsError,
+    TheToolUtils,
+    TokenUsage,
+    ValidationError,
     create_dynamic_model,
 )
-from ..core.operators.sync_operator import Operator
-from ..core.utils import TheToolUtils
 from ..models import CategoryTree, ToolOutput, ToolOutputMetadata
 
 
@@ -29,6 +33,7 @@ class TheTool:
         self._operator = Operator(client=client, model=model)
         self.logger = logging.getLogger(self.__class__.__name__)
         self.raise_on_error = raise_on_error
+        self.model = model
 
     def categorize(
         self,
@@ -86,7 +91,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -101,6 +109,7 @@
         final_categories = []
         analysis = ""
         logprobs_list = []
+        token_usage = TokenUsage()
 
         for _ in range(levels):
             if not parent_node.children:
@@ -141,9 +150,13 @@
             analysis += level_operator_output.analysis
             if logprobs:
                 logprobs_list.extend(level_operator_output.logprobs)
+            token_usage += level_operator_output.token_usage
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=(perf_counter() - start)
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=token_usage,
         )
         tool_output = ToolOutput(
             result=final_categories,
@@ -224,7 +237,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -303,7 +319,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -377,7 +396,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -458,7 +480,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -537,7 +562,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -615,7 +643,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -691,7 +722,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -757,6 +791,7 @@
         translation = ""
         analysis = ""
         logprobs_list = []
+        token_usage = TokenUsage()
 
         for chunk in chunks:
             chunk_operator_output = self._operator.run(
@@ -784,9 +819,13 @@
             analysis += chunk_operator_output.analysis
             if logprobs:
                 logprobs_list.extend(chunk_operator_output.logprobs)
+            token_usage += chunk_operator_output.token_usage
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=token_usage,
         )
         tool_output = ToolOutput(
             result=translation,
@@ -816,7 +855,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -894,7 +936,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -975,7 +1020,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
        )
         tool_output = ToolOutput(
             result=operator_output.result,
@@ -1055,7 +1103,10 @@
         )
 
         metadata = ToolOutputMetadata(
-            tool_name=tool_name, execution_time=perf_counter() - start
+            tool_name=tool_name,
+            execution_time=perf_counter() - start,
+            processed_by=self.model,
+            token_usage=operator_output.token_usage,
         )
         tool_output = ToolOutput(
             result=operator_output.result,
hamtaa_texttools-2.2.0/texttools/__init__.py
@@ -1,6 +0,0 @@
-from .models import CategoryTree
-from .tools.async_tools import AsyncTheTool
-from .tools.batch_tools import BatchTheTool
-from .tools.sync_tools import TheTool
-
-__all__ = ["CategoryTree", "AsyncTheTool", "TheTool", "BatchTheTool"]
hamtaa_texttools-2.2.0/texttools/core/internal_models.py
@@ -1,71 +0,0 @@
-from typing import Any, Literal
-
-from pydantic import BaseModel, Field, create_model
-
-
-class OperatorOutput(BaseModel):
-    result: Any
-    analysis: str | None
-    logprobs: list[dict[str, Any]] | None
-
-
-class Str(BaseModel):
-    result: str = Field(
-        ..., description="The output string", json_schema_extra={"example": "text"}
-    )
-
-
-class Bool(BaseModel):
-    result: bool = Field(
-        ...,
-        description="Boolean indicating the output state",
-        json_schema_extra={"example": True},
-    )
-
-
-class ListStr(BaseModel):
-    result: list[str] = Field(
-        ...,
-        description="The output list of strings",
-        json_schema_extra={"example": ["text_1", "text_2", "text_3"]},
-    )
-
-
-class ListDictStrStr(BaseModel):
-    result: list[dict[str, str]] = Field(
-        ...,
-        description="List of dictionaries containing string key-value pairs",
-        json_schema_extra={
-            "example": [
-                {"text": "Mohammad", "type": "PER"},
-                {"text": "Iran", "type": "LOC"},
-            ]
-        },
-    )
-
-
-class ReasonListStr(BaseModel):
-    reason: str = Field(..., description="Thinking process that led to the output")
-    result: list[str] = Field(
-        ...,
-        description="The output list of strings",
-        json_schema_extra={"example": ["text_1", "text_2", "text_3"]},
-    )
-
-
-# Create CategorizerOutput with dynamic categories
-def create_dynamic_model(allowed_values: list[str]) -> type[BaseModel]:
-    literal_type = Literal[*allowed_values]
-
-    CategorizerOutput = create_model(
-        "CategorizerOutput",
-        reason=(
-            str,
-            Field(
-                ..., description="Explanation of why the input belongs to the category"
-            ),
-        ),
-        result=(literal_type, Field(..., description="Predicted category label")),
-    )
-
-    return CategorizerOutput