hamtaa-texttools 1.3.1__tar.gz → 1.3.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (39)
  1. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/PKG-INFO +8 -11
  2. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/README.md +7 -10
  3. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/hamtaa_texttools.egg-info/PKG-INFO +8 -11
  4. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/hamtaa_texttools.egg-info/SOURCES.txt +1 -0
  5. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/pyproject.toml +1 -1
  6. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/core/engine.py +21 -23
  7. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/core/internal_models.py +7 -3
  8. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/core/operators/async_operator.py +1 -3
  9. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/core/operators/sync_operator.py +1 -3
  10. hamtaa_texttools-1.3.2/texttools/tools/__init__.py +0 -0
  11. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/LICENSE +0 -0
  12. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/hamtaa_texttools.egg-info/dependency_links.txt +0 -0
  13. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/hamtaa_texttools.egg-info/requires.txt +0 -0
  14. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/hamtaa_texttools.egg-info/top_level.txt +0 -0
  15. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/setup.cfg +0 -0
  16. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/tests/test_all_async_tools.py +0 -0
  17. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/tests/test_all_tools.py +0 -0
  18. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/tests/test_output_validation.py +0 -0
  19. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/__init__.py +0 -0
  20. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/core/__init__.py +0 -0
  21. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/core/exceptions.py +0 -0
  22. {hamtaa_texttools-1.3.1/texttools/tools → hamtaa_texttools-1.3.2/texttools/core/operators}/__init__.py +0 -0
  23. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/models.py +0 -0
  24. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/categorize.yaml +0 -0
  25. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/check_fact.yaml +0 -0
  26. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/extract_entities.yaml +0 -0
  27. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/extract_keywords.yaml +0 -0
  28. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/is_question.yaml +0 -0
  29. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/merge_questions.yaml +0 -0
  30. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/propositionize.yaml +0 -0
  31. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/rewrite.yaml +0 -0
  32. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/run_custom.yaml +0 -0
  33. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/subject_to_question.yaml +0 -0
  34. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/summarize.yaml +0 -0
  35. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/text_to_question.yaml +0 -0
  36. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/prompts/translate.yaml +0 -0
  37. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/py.typed +0 -0
  38. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/tools/async_tools.py +0 -0
  39. {hamtaa_texttools-1.3.1 → hamtaa_texttools-1.3.2}/texttools/tools/sync_tools.py +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: hamtaa-texttools
3
- Version: 1.3.1
3
+ Version: 1.3.2
4
4
  Summary: A high-level NLP toolkit built on top of modern LLMs.
5
5
  Author-email: Tohidi <the.mohammad.tohidi@gmail.com>, Erfan Moosavi <erfanmoosavi84@gmail.com>, Montazer <montazerh82@gmail.com>, Givechi <mohamad.m.givechi@gmail.com>, Zareshahi <a.zareshahi1377@gmail.com>
6
6
  Maintainer-email: Erfan Moosavi <erfanmoosavi84@gmail.com>, Tohidi <the.mohammad.tohidi@gmail.com>
@@ -21,6 +21,9 @@ Dynamic: license-file
21
21
 
22
22
  # TextTools
23
23
 
24
+ ![PyPI](https://img.shields.io/pypi/v/hamtaa-texttools)
25
+ ![License](https://img.shields.io/pypi/l/hamtaa-texttools)
26
+
24
27
  ## 📌 Overview
25
28
 
26
29
  **TextTools** is a high-level **NLP toolkit** built on top of **LLMs**.
@@ -44,11 +47,11 @@ Each tool is designed to work with structured outputs.
44
47
  - **`is_question()`** - Binary question detection
45
48
  - **`text_to_question()`** - Generates questions from text
46
49
  - **`merge_questions()`** - Merges multiple questions into one
47
- - **`rewrite()`** - Rewrites text in a diffrent way
48
- - **`subject_to_question()`** - Generates questions about a specific subject
50
+ - **`rewrite()`** - Rewrites text in a different way
51
+ - **`subject_to_question()`** - Generates questions about a given subject
49
52
  - **`summarize()`** - Text summarization
50
53
  - **`translate()`** - Text translation
51
- - **`propositionize()`** - Convert text to atomic independence meaningful sentences
54
+ - **`propositionize()`** - Convert text to atomic independent meaningful sentences
52
55
  - **`check_fact()`** - Check whether a statement is relevant to the source text
53
56
  - **`run_custom()`** - Allows users to define a custom tool with an arbitrary BaseModel
54
57
 
@@ -66,7 +69,7 @@ pip install -U hamtaa-texttools
66
69
 
67
70
  ## 📊 Tool Quality Tiers
68
71
 
69
- | Status | Meaning | Tools | Use in Production? |
72
+ | Status | Meaning | Tools | Safe for Production? |
70
73
  |--------|---------|----------|-------------------|
71
74
  | **✅ Production** | Evaluated, tested, stable. | `categorize()` (list mode), `extract_keywords()`, `extract_entities()`, `is_question()`, `text_to_question()`, `merge_questions()`, `rewrite()`, `subject_to_question()`, `summarize()`, `run_custom()` | **Yes** - ready for reliable use. |
72
75
  | **🧪 Experimental** | Added to the package but **not fully evaluated**. Functional, but quality may vary. | `categorize()` (tree mode), `translate()`, `propositionize()`, `check_fact()` | **Use with caution** - outputs not yet validated. |
@@ -181,9 +184,3 @@ Use **TextTools** when you need to:
181
184
 
182
185
  Contributions are welcome!
183
186
  Feel free to **open issues, suggest new features, or submit pull requests**.
184
-
185
- ---
186
-
187
- ## 🌿 License
188
-
189
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -1,5 +1,8 @@
1
1
  # TextTools
2
2
 
3
+ ![PyPI](https://img.shields.io/pypi/v/hamtaa-texttools)
4
+ ![License](https://img.shields.io/pypi/l/hamtaa-texttools)
5
+
3
6
  ## 📌 Overview
4
7
 
5
8
  **TextTools** is a high-level **NLP toolkit** built on top of **LLMs**.
@@ -23,11 +26,11 @@ Each tool is designed to work with structured outputs.
23
26
  - **`is_question()`** - Binary question detection
24
27
  - **`text_to_question()`** - Generates questions from text
25
28
  - **`merge_questions()`** - Merges multiple questions into one
26
- - **`rewrite()`** - Rewrites text in a diffrent way
27
- - **`subject_to_question()`** - Generates questions about a specific subject
29
+ - **`rewrite()`** - Rewrites text in a different way
30
+ - **`subject_to_question()`** - Generates questions about a given subject
28
31
  - **`summarize()`** - Text summarization
29
32
  - **`translate()`** - Text translation
30
- - **`propositionize()`** - Convert text to atomic independence meaningful sentences
33
+ - **`propositionize()`** - Convert text to atomic independent meaningful sentences
31
34
  - **`check_fact()`** - Check whether a statement is relevant to the source text
32
35
  - **`run_custom()`** - Allows users to define a custom tool with an arbitrary BaseModel
33
36
 
@@ -45,7 +48,7 @@ pip install -U hamtaa-texttools
45
48
 
46
49
  ## 📊 Tool Quality Tiers
47
50
 
48
- | Status | Meaning | Tools | Use in Production? |
51
+ | Status | Meaning | Tools | Safe for Production? |
49
52
  |--------|---------|----------|-------------------|
50
53
  | **✅ Production** | Evaluated, tested, stable. | `categorize()` (list mode), `extract_keywords()`, `extract_entities()`, `is_question()`, `text_to_question()`, `merge_questions()`, `rewrite()`, `subject_to_question()`, `summarize()`, `run_custom()` | **Yes** - ready for reliable use. |
51
54
  | **🧪 Experimental** | Added to the package but **not fully evaluated**. Functional, but quality may vary. | `categorize()` (tree mode), `translate()`, `propositionize()`, `check_fact()` | **Use with caution** - outputs not yet validated. |
@@ -160,9 +163,3 @@ Use **TextTools** when you need to:
160
163
 
161
164
  Contributions are welcome!
162
165
  Feel free to **open issues, suggest new features, or submit pull requests**.
163
-
164
- ---
165
-
166
- ## 🌿 License
167
-
168
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: hamtaa-texttools
3
- Version: 1.3.1
3
+ Version: 1.3.2
4
4
  Summary: A high-level NLP toolkit built on top of modern LLMs.
5
5
  Author-email: Tohidi <the.mohammad.tohidi@gmail.com>, Erfan Moosavi <erfanmoosavi84@gmail.com>, Montazer <montazerh82@gmail.com>, Givechi <mohamad.m.givechi@gmail.com>, Zareshahi <a.zareshahi1377@gmail.com>
6
6
  Maintainer-email: Erfan Moosavi <erfanmoosavi84@gmail.com>, Tohidi <the.mohammad.tohidi@gmail.com>
@@ -21,6 +21,9 @@ Dynamic: license-file
21
21
 
22
22
  # TextTools
23
23
 
24
+ ![PyPI](https://img.shields.io/pypi/v/hamtaa-texttools)
25
+ ![License](https://img.shields.io/pypi/l/hamtaa-texttools)
26
+
24
27
  ## 📌 Overview
25
28
 
26
29
  **TextTools** is a high-level **NLP toolkit** built on top of **LLMs**.
@@ -44,11 +47,11 @@ Each tool is designed to work with structured outputs.
44
47
  - **`is_question()`** - Binary question detection
45
48
  - **`text_to_question()`** - Generates questions from text
46
49
  - **`merge_questions()`** - Merges multiple questions into one
47
- - **`rewrite()`** - Rewrites text in a diffrent way
48
- - **`subject_to_question()`** - Generates questions about a specific subject
50
+ - **`rewrite()`** - Rewrites text in a different way
51
+ - **`subject_to_question()`** - Generates questions about a given subject
49
52
  - **`summarize()`** - Text summarization
50
53
  - **`translate()`** - Text translation
51
- - **`propositionize()`** - Convert text to atomic independence meaningful sentences
54
+ - **`propositionize()`** - Convert text to atomic independent meaningful sentences
52
55
  - **`check_fact()`** - Check whether a statement is relevant to the source text
53
56
  - **`run_custom()`** - Allows users to define a custom tool with an arbitrary BaseModel
54
57
 
@@ -66,7 +69,7 @@ pip install -U hamtaa-texttools
66
69
 
67
70
  ## 📊 Tool Quality Tiers
68
71
 
69
- | Status | Meaning | Tools | Use in Production? |
72
+ | Status | Meaning | Tools | Safe for Production? |
70
73
  |--------|---------|----------|-------------------|
71
74
  | **✅ Production** | Evaluated, tested, stable. | `categorize()` (list mode), `extract_keywords()`, `extract_entities()`, `is_question()`, `text_to_question()`, `merge_questions()`, `rewrite()`, `subject_to_question()`, `summarize()`, `run_custom()` | **Yes** - ready for reliable use. |
72
75
  | **🧪 Experimental** | Added to the package but **not fully evaluated**. Functional, but quality may vary. | `categorize()` (tree mode), `translate()`, `propositionize()`, `check_fact()` | **Use with caution** - outputs not yet validated. |
@@ -181,9 +184,3 @@ Use **TextTools** when you need to:
181
184
 
182
185
  Contributions are welcome!
183
186
  Feel free to **open issues, suggest new features, or submit pull requests**.
184
-
185
- ---
186
-
187
- ## 🌿 License
188
-
189
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -16,6 +16,7 @@ texttools/core/__init__.py
16
16
  texttools/core/engine.py
17
17
  texttools/core/exceptions.py
18
18
  texttools/core/internal_models.py
19
+ texttools/core/operators/__init__.py
19
20
  texttools/core/operators/async_operator.py
20
21
  texttools/core/operators/sync_operator.py
21
22
  texttools/prompts/categorize.yaml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
4
4
 
5
5
  [project]
6
6
  name = "hamtaa-texttools"
7
- version = "1.3.1"
7
+ version = "1.3.2"
8
8
  authors = [
9
9
  {name = "Tohidi", email = "the.mohammad.tohidi@gmail.com"},
10
10
  {name = "Erfan Moosavi", email = "erfanmoosavi84@gmail.com"},
@@ -4,6 +4,7 @@ import random
4
4
  import re
5
5
  from functools import lru_cache
6
6
  from pathlib import Path
7
+ from typing import Any
7
8
 
8
9
  import yaml
9
10
 
@@ -20,9 +21,6 @@ class PromptLoader:
20
21
 
21
22
  @lru_cache(maxsize=32)
22
23
  def _load_templates(self, prompt_file: str, mode: str | None) -> dict[str, str]:
23
- """
24
- Loads prompt templates from YAML file with optional mode selection.
25
- """
26
24
  try:
27
25
  base_dir = Path(__file__).parent.parent / Path("prompts")
28
26
  prompt_path = base_dir / prompt_file
@@ -73,13 +71,12 @@ class PromptLoader:
73
71
  self, prompt_file: str, text: str, mode: str, **extra_kwargs
74
72
  ) -> dict[str, str]:
75
73
  try:
76
- template_configs = self._load_templates(prompt_file, mode)
77
74
  format_args = {"text": text}
78
75
  format_args.update(extra_kwargs)
79
76
 
80
- # Inject variables inside each template
81
- for key in template_configs.keys():
82
- template_configs[key] = template_configs[key].format(**format_args)
77
+ template_configs = self._load_templates(prompt_file, mode)
78
+ for key, value in template_configs.items():
79
+ template_configs[key] = value.format(**format_args)
83
80
 
84
81
  return template_configs
85
82
 
@@ -97,30 +94,27 @@ class OperatorUtils:
97
94
  output_lang: str | None,
98
95
  user_prompt: str | None,
99
96
  ) -> str:
100
- main_prompt = ""
97
+ parts = []
101
98
 
102
99
  if analysis:
103
- main_prompt += f"Based on this analysis:\n{analysis}\n"
104
-
100
+ parts.append(f"Based on this analysis: {analysis}")
105
101
  if output_lang:
106
- main_prompt += f"Respond only in the {output_lang} language.\n"
107
-
102
+ parts.append(f"Respond only in the {output_lang} language.")
108
103
  if user_prompt:
109
- main_prompt += f"Consider this instruction {user_prompt}\n"
104
+ parts.append(f"Consider this instruction: {user_prompt}")
110
105
 
111
- main_prompt += main_template
112
-
113
- return main_prompt
106
+ parts.append(main_template)
107
+ return "\n".join(parts)
114
108
 
115
109
  @staticmethod
116
110
  def build_message(prompt: str) -> list[dict[str, str]]:
117
111
  return [{"role": "user", "content": prompt}]
118
112
 
119
113
  @staticmethod
120
- def extract_logprobs(completion: dict) -> list[dict]:
114
+ def extract_logprobs(completion: Any) -> list[dict]:
121
115
  """
122
- Extracts and filters token probabilities from completion logprobs.
123
- Skips punctuation and structural tokens, returns cleaned probability data.
116
+ Extracts and filters logprobs from completion.
117
+ Skips punctuation and structural tokens.
124
118
  """
125
119
  logprobs_data = []
126
120
 
@@ -153,16 +147,17 @@ class OperatorUtils:
153
147
 
154
148
  @staticmethod
155
149
  def get_retry_temp(base_temp: float) -> float:
156
- delta_temp = random.choice([-1, 1]) * random.uniform(0.1, 0.9)
157
- new_temp = base_temp + delta_temp
158
-
150
+ new_temp = base_temp + random.choice([-1, 1]) * random.uniform(0.1, 0.9)
159
151
  return max(0.0, min(new_temp, 1.5))
160
152
 
161
153
 
162
154
  def text_to_chunks(text: str, size: int, overlap: int) -> list[str]:
155
+ """
156
+ Utility for chunking large texts. Used for translation tool
157
+ """
163
158
  separators = ["\n\n", "\n", " ", ""]
164
159
  is_separator_regex = False
165
- keep_separator = True # Equivalent to 'start'
160
+ keep_separator = True
166
161
  length_function = len
167
162
  strip_whitespace = True
168
163
  chunk_size = size
@@ -256,6 +251,9 @@ def text_to_chunks(text: str, size: int, overlap: int) -> list[str]:
256
251
 
257
252
 
258
253
  async def run_with_timeout(coro, timeout: float | None):
254
+ """
255
+ Utility for timeout logic defined in AsyncTheTool
256
+ """
259
257
  if timeout is None:
260
258
  return await coro
261
259
  try:
@@ -21,7 +21,9 @@ class Bool(BaseModel):
21
21
 
22
22
  class ListStr(BaseModel):
23
23
  result: list[str] = Field(
24
- ..., description="The output list of strings", example=["text_1", "text_2"]
24
+ ...,
25
+ description="The output list of strings",
26
+ example=["text_1", "text_2", "text_3"],
25
27
  )
26
28
 
27
29
 
@@ -36,11 +38,13 @@ class ListDictStrStr(BaseModel):
36
38
  class ReasonListStr(BaseModel):
37
39
  reason: str = Field(..., description="Thinking process that led to the output")
38
40
  result: list[str] = Field(
39
- ..., description="The output list of strings", example=["text_1", "text_2"]
41
+ ...,
42
+ description="The output list of strings",
43
+ example=["text_1", "text_2", "text_3"],
40
44
  )
41
45
 
42
46
 
43
- # This function is needed to create CategorizerOutput with dynamic categories
47
+ # Create CategorizerOutput with dynamic categories
44
48
  def create_dynamic_model(allowed_values: list[str]) -> Type[BaseModel]:
45
49
  literal_type = Literal[*allowed_values]
46
50
 
@@ -54,7 +54,7 @@ class AsyncOperator:
54
54
  ) -> tuple[T, Any]:
55
55
  """
56
56
  Parses a chat completion using OpenAI's structured output format.
57
- Returns both the parsed Any and the raw completion for logprobs.
57
+ Returns both the parsed and the completion for logprobs.
58
58
  """
59
59
  try:
60
60
  request_kwargs = {
@@ -92,7 +92,6 @@ class AsyncOperator:
92
92
 
93
93
  async def run(
94
94
  self,
95
- # User parameters
96
95
  text: str,
97
96
  with_analysis: bool,
98
97
  output_lang: str | None,
@@ -103,7 +102,6 @@ class AsyncOperator:
103
102
  validator: Callable[[Any], bool] | None,
104
103
  max_validation_retries: int | None,
105
104
  priority: int | None,
106
- # Internal parameters
107
105
  tool_name: str,
108
106
  output_model: Type[T],
109
107
  mode: str | None,
@@ -54,7 +54,7 @@ class Operator:
54
54
  ) -> tuple[T, Any]:
55
55
  """
56
56
  Parses a chat completion using OpenAI's structured output format.
57
- Returns both the parsed Any and the raw completion for logprobs.
57
+ Returns both the parsed and the completion for logprobs.
58
58
  """
59
59
  try:
60
60
  request_kwargs = {
@@ -90,7 +90,6 @@ class Operator:
90
90
 
91
91
  def run(
92
92
  self,
93
- # User parameters
94
93
  text: str,
95
94
  with_analysis: bool,
96
95
  output_lang: str | None,
@@ -101,7 +100,6 @@ class Operator:
101
100
  validator: Callable[[Any], bool] | None,
102
101
  max_validation_retries: int | None,
103
102
  priority: int | None,
104
- # Internal parameters
105
103
  tool_name: str,
106
104
  output_model: Type[T],
107
105
  mode: str | None,
File without changes