hamtaa-texttools 1.1.21__py3-none-any.whl → 1.1.22__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: hamtaa-texttools
- Version: 1.1.21
+ Version: 1.1.22
  Summary: A high-level NLP toolkit built on top of modern LLMs.
  Author-email: Tohidi <the.mohammad.tohidi@gmail.com>, Montazer <montazerh82@gmail.com>, Givechi <mohamad.m.givechi@gmail.com>, MoosaviNejad <erfanmoosavi84@gmail.com>, Zareshahi <a.zareshahi1377@gmail.com>
  License: MIT License
@@ -37,61 +37,53 @@ Dynamic: license-file
  
  ## 📌 Overview
  
- **TextTools** is a high-level **NLP toolkit** built on top of modern **LLMs**.
+ **TextTools** is a high-level **NLP toolkit** built on top of **LLMs**.
  
  It provides both **sync (`TheTool`)** and **async (`AsyncTheTool`)** APIs for maximum flexibility.
  
  It provides ready-to-use utilities for **translation, question detection, keyword extraction, categorization, NER extraction, and more** - designed to help you integrate AI-powered text processing into your applications with minimal effort.
  
+ **Note:** Most features of `texttools` are reliable when you use the `google/gemma-3n-e4b-it` model.
+
  ---
  
  ## ✨ Features
  
  TextTools provides a rich collection of high-level NLP utilities.
- Each tool is designed to work with structured outputs (JSON / Pydantic).
+ Each tool is designed to work with structured outputs.
  
  - **`categorize()`** - Classifies text into given categories
- - **`extract_keywords()`** - Extracts keywords from text
+ - **`extract_keywords()`** - Extracts keywords from the text
  - **`extract_entities()`** - Named Entity Recognition (NER) system
- - **`is_question()`** - Binary detection of whether input is a question
+ - **`is_question()`** - Binary question detection
  - **`text_to_question()`** - Generates questions from text
- - **`merge_questions()`** - Merges multiple questions with different modes
- - **`rewrite()`** - Rewrites text with different wording/meaning
+ - **`merge_questions()`** - Merges multiple questions into one
+ - **`rewrite()`** - Rewrites text in a different way
  - **`subject_to_question()`** - Generates questions about a specific subject
  - **`summarize()`** - Text summarization
- - **`translate()`** - Text translation between languages
+ - **`translate()`** - Text translation
  - **`propositionize()`** - Converts text into atomic, independent, meaningful sentences
  - **`check_fact()`** - Checks whether a statement is relevant to the source text
  - **`run_custom()`** - Allows users to define a custom tool with an arbitrary BaseModel
  
  ---
  
+ ## 🚀 Installation
+
+ Install the latest release via PyPI:
+
+ ```bash
+ pip install -U hamtaa-texttools
+ ```
+
+ ---
+
  ## 📊 Tool Quality Tiers
  
- | Status | Meaning | Use in Production? |
- |--------|---------|-------------------|
- | **✅ Production** | Evaluated, tested, stable. | **Yes** - ready for reliable use. |
- | **🧪 Experimental** | Added to the package but **not fully evaluated**. Functional, but quality may vary. | **Use with caution** - outputs not yet validated. |
-
- ### Current Status
- **Production Tools:**
- - `categorize()` (list mode)
- - `extract_keywords()`
- - `extract_entities()`
- - `is_question()`
- - `text_to_question()`
- - `merge_questions()`
- - `rewrite()`
- - `subject_to_question()`
- - `summarize()`
- - `run_custom()` (fine in most cases)
-
- **Experimental Tools:**
- - `categorize()` (tree mode)
- - `translate()`
- - `propositionize()`
- - `check_fact()`
- - `run_custom()` (not evaluated in all scenarios)
+ | Status | Meaning | Tools | Use in Production? |
+ |--------|---------|----------|-------------------|
+ | **✅ Production** | Evaluated, tested, stable. | `categorize()` (list mode), `extract_keywords()`, `extract_entities()`, `is_question()`, `text_to_question()`, `merge_questions()`, `rewrite()`, `subject_to_question()`, `summarize()`, `run_custom()` | **Yes** - ready for reliable use. |
+ | **🧪 Experimental** | Added to the package but **not fully evaluated**. Functional, but quality may vary. | `categorize()` (tree mode), `translate()`, `propositionize()`, `check_fact()` | **Use with caution** - outputs not yet validated. |
  
  ---
  
@@ -100,49 +92,37 @@ Each tool is designed to work with structured outputs (JSON / Pydantic).
  TextTools provides several optional flags to customize LLM behavior:
  
  - **`with_analysis: bool`** → Adds a reasoning step before generating the final output.
- **Note:** This doubles token usage per call because it triggers an additional LLM request.
+ **Note:** This doubles token usage per call.
  
  - **`logprobs: bool`** → Returns token-level probabilities for the generated output. You can also specify `top_logprobs=<N>` to get the top N alternative tokens and their probabilities.
  **Note:** This feature works if it's supported by the model.
  
- - **`output_lang: str`** → Forces the model to respond in a specific language. The model will ignore other instructions about language and respond strictly in the requested language.
+ - **`output_lang: str`** → Forces the model to respond in a specific language.
  
- - **`user_prompt: str`** → Allows you to inject a custom instruction or prompt into the model alongside the main template. This gives you fine-grained control over how the model interprets or modifies the input text.
+ - **`user_prompt: str`** → Allows you to inject a custom instruction or prompt into the model alongside the main template. This gives you fine-grained control over how the model interprets or modifies the input text.
  
  - **`temperature: float`** → Determines how creatively the model should respond. Takes a float from `0.0` to `2.0`.
  
- - **`validator: Callable (Experimental)`** → Forces TheTool to validate the output result based on your custom validator. Validator should return a bool (True if there were no problem, False if the validation fails.) If the validator fails, TheTool will retry to get another output by modifying `temperature`. You can specify `max_validation_retries=<N>` to change the number of retries.
+ - **`validator: Callable (Experimental)`** → Forces TheTool to validate the output result using your custom validator, which should return a boolean. If validation fails, TheTool retries with a modified `temperature`. You can also specify `max_validation_retries=<N>`.
  
- - **`priority: int (Experimental)`** → Task execution priority level. Higher values = higher priority. Affects processing order in queues.
+ - **`priority: int (Experimental)`** → Task execution priority level. Affects processing order in queues.
  **Note:** This feature works if it's supported by the model and vLLM.
  
- **Note:** There might be some tools that don't support some of the parameters above.
-
  ---
  
  ## 🧩 ToolOutput
  
  Every tool in `TextTools` returns a `ToolOutput` object, a BaseModel with the following attributes:
- - **`result: Any`** → The output of LLM
- - **`analysis: str`** → The reasoning step before generating the final output
- - **`logprobs: list`** → Token-level probabilities for the generated output
- - **`errors: list[str]`** → Any error that have occured during calling LLM
+ - **`result: Any`**
+ - **`analysis: str`**
+ - **`logprobs: list`**
+ - **`errors: list[str]`**
  - **`ToolOutputMetadata`** →
-   - **`tool_name: str`** → The tool name which processed the input
-   - **`processed_at: datetime`** → The process time
-   - **`execution_time: float`** → The execution time (seconds)
+   - **`tool_name: str`**
+   - **`processed_at: datetime`**
+   - **`execution_time: float`**
  
- **Note:** You can use `repr(ToolOutput)` to see details of your ToolOutput.
-
- ---
-
- ## 🚀 Installation
-
- Install the latest release via PyPI:
-
- ```bash
- pip install -U hamtaa-texttools
- ```
+ **Note:** You can use `repr(ToolOutput)` to print your output with all the details.
  
  ---
  
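For illustration, a short sketch of reading these fields from the `is_question()` call in the Quick Start below (the `metadata` attribute name is an assumption inferred from `ToolOutputMetadata` above):

```python
# Hypothetical sketch: inspecting a ToolOutput.
detection = the_tool.is_question("Is this project open source?")

print(detection.result)                   # the parsed LLM output
print(detection.errors)                   # errors collected during the call, if any
print(detection.metadata.tool_name)       # e.g. "is_question" (attribute path assumed)
print(detection.metadata.execution_time)  # seconds
print(repr(detection))                    # everything at once
```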
@@ -160,26 +140,13 @@ pip install -U hamtaa-texttools
  from openai import OpenAI
  from texttools import TheTool
  
- # Create your OpenAI client
  client = OpenAI(base_url="your_url", api_key="your_api_key")
+ model = "model_name"
  
- # Specify the model
- model = "gpt-4o-mini"
-
- # Create an instance of TheTool
  the_tool = TheTool(client=client, model=model)
  
- # Example: Question Detection
- detection = the_tool.is_question("Is this project open source?", logprobs=True, top_logprobs=2)
- print(detection.result)
- print(detection.logprobs)
- # Output: True + logprobs
-
- # Example: Translation
- translation = the_tool.translate("سلام، حالت چطوره؟", target_language="English", with_analysis=True)
- print(translation.result)
- print(translation.analysis)
- # Output: "Hi! How are you?" + analysis
+ detection = the_tool.is_question("Is this project open source?")
+ print(repr(detection))
  ```
  
  ---
@@ -192,22 +159,17 @@ from openai import AsyncOpenAI
  from texttools import AsyncTheTool
  
  async def main():
-     # Create your AsyncOpenAI client
      async_client = AsyncOpenAI(base_url="your_url", api_key="your_api_key")
+     model = "model_name"
  
-     # Specify the model
-     model = "gpt-4o-mini"
-
-     # Create an instance of AsyncTheTool
      async_the_tool = AsyncTheTool(client=async_client, model=model)
  
-     # Example: Async Translation and Keyword Extraction
      translation_task = async_the_tool.translate("سلام، حالت چطوره؟", target_language="English")
      keywords_task = async_the_tool.extract_keywords("Tomorrow, we will be dead by the car crash")
  
      (translation, keywords) = await asyncio.gather(translation_task, keywords_task)
-     print(translation.result)
-     print(keywords.result)
+     print(repr(translation))
+     print(repr(keywords))
  
  asyncio.run(main())
  ```
@@ -229,13 +191,12 @@ Use **TextTools** when you need to:
  
  Process large datasets efficiently using OpenAI's batch API.
  
- ## ⚡ Quick Start (Batch)
+ ## ⚡ Quick Start (Batch Runner)
  
  ```python
  from pydantic import BaseModel
- from texttools import BatchJobRunner, BatchConfig
+ from texttools import BatchRunner, BatchConfig
  
- # Configure your batch job
  config = BatchConfig(
      system_prompt="Extract entities from the text",
      job_name="entity_extraction",
@@ -244,12 +205,10 @@ config = BatchConfig(
      model="gpt-4o-mini"
  )
  
- # Define your output schema
  class Output(BaseModel):
      entities: list[str]
  
- # Run the batch job
- runner = BatchJobRunner(config, output_model=Output)
+ runner = BatchRunner(config, output_model=Output)
  runner.run()
  ```
  
@@ -1,14 +1,14 @@
- hamtaa_texttools-1.1.21.dist-info/licenses/LICENSE,sha256=Hb2YOBKy2MJQLnyLrX37B4ZVuac8eaIcE71SvVIMOLg,1082
- texttools/__init__.py,sha256=CmCS9dEvO6061GiJ8A7gD3UAhCWHTkaID9q3Krlyq_o,311
+ hamtaa_texttools-1.1.22.dist-info/licenses/LICENSE,sha256=Hb2YOBKy2MJQLnyLrX37B4ZVuac8eaIcE71SvVIMOLg,1082
+ texttools/__init__.py,sha256=fqGafzxcnGw0_ivi-vUyLfytWOkjLOumiaB0-I612iY,305
  texttools/batch/batch_config.py,sha256=scWYQBDuaTj8-b2x_a33Zu-zxm7eqEf5FFoquD-Sv94,1029
  texttools/batch/batch_manager.py,sha256=6HfsexU0PHGGBH7HKReZ-CQxaQI9DXYKAPsFXxovb_I,8740
- texttools/batch/batch_runner.py,sha256=fmoq7yxtEdvfLbEhcx95ma-lgrL-ZdI2EgxmEfVcKtE,10016
- texttools/internals/async_operator.py,sha256=sKMYEy7jEcsXpwnBkA18PFubkM-TXZrBH3QwF7l-wSg,7054
+ texttools/batch/batch_runner.py,sha256=bpgRnFZiaxqAP6sm3kzb-waeNhIRxXYhttGikGFeXXU,10013
+ texttools/internals/async_operator.py,sha256=VHs06Yd_OZqUVyhCOMn7iujEChqhJg8aRS8NXpHBO1w,6719
  texttools/internals/exceptions.py,sha256=h_yp_5i_5IfmqTBQ4S6ZOISrrliJBQ3HTEAjwJXrplk,495
  texttools/internals/models.py,sha256=9uoCAe2TLrSzyS9lMJja5orPAYaCvVL1zoCb6FNdkfs,4541
- texttools/internals/operator_utils.py,sha256=eLY2OjYQ3jT-50nx3I8gzuVzgGpMi52f5oB3cnFyxko,1864
+ texttools/internals/operator_utils.py,sha256=p44-YovUiLefJ-akB3o7Tk1o73ITFxx7E77pod4Aa1Y,2491
  texttools/internals/prompt_loader.py,sha256=yYXDD4YYG2zohGPAmvZwmv5f6xV_RSl5yOrObTh9w7I,3352
- texttools/internals/sync_operator.py,sha256=IG3CXfGmv4PdFlAQ4AZcKuBAqPJdkIAK4mVw77zLbqI,6959
+ texttools/internals/sync_operator.py,sha256=23mIxk96SOOkYb_7VXjmkNKuWqPTRQVhO4cTKQ_4Mtw,6624
  texttools/internals/text_to_chunks.py,sha256=vY3odhgCZK4E44k_SGlLoSiKkdN0ib6-lQAsPcplAHA,3843
  texttools/prompts/README.md,sha256=ztajRJcmFLhyrUF0_qmOXaCwGsTGCFabfMjch2LAJG0,1375
  texttools/prompts/categorize.yaml,sha256=016b1uGtbKXEwB8_2_bBgVuUelBlu_rgT85XK_c3Yv0,1219
@@ -24,9 +24,9 @@ texttools/prompts/subject_to_question.yaml,sha256=TfVmZ6gDgaHRqJWCVkFlKpuJczpMvJ
  texttools/prompts/summarize.yaml,sha256=CKx4vjhHbGus1TdjDz_oc0bNEQtq7zfHsZkV2WeYHDU,457
  texttools/prompts/text_to_question.yaml,sha256=mnArBoYu7gpGHriaU2-Aw5SixB2ZIgoHMt99PnTPKD0,1003
  texttools/prompts/translate.yaml,sha256=ew9RERAVSzg0cvxAinNwTSFIaOIjdwIsekbUsgAuNgo,632
- texttools/tools/async_tools.py,sha256=VU3cqqCPILsyjRiG84w8kCw3iDSuFbI6S3VjExXZwFQ,44635
- texttools/tools/sync_tools.py,sha256=2cqcosMYR6LHuYw32WFR-drvqQ-t7Q9_2rUBDOeYzho,44441
- hamtaa_texttools-1.1.21.dist-info/METADATA,sha256=lExdE6uMFSs_wqUSElOyktjpHpZx4RY-cUH6azF-IYA,10183
- hamtaa_texttools-1.1.21.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
- hamtaa_texttools-1.1.21.dist-info/top_level.txt,sha256=5Mh0jIxxZ5rOXHGJ6Mp-JPKviywwN0MYuH0xk5bEWqE,10
- hamtaa_texttools-1.1.21.dist-info/RECORD,,
+ texttools/tools/async_tools.py,sha256=s3g6_8Jmg2KvdItWa3sXGfWI8YaOUPnfIRtWhWRMd1c,44543
+ texttools/tools/sync_tools.py,sha256=AcApMy_XvT47rBtqGdAFrKE1QDZq30f0uJsqiWYUWQg,44349
+ hamtaa_texttools-1.1.22.dist-info/METADATA,sha256=RF431cau25sLMmynuSHXKssKt3ipFt5M9ZKJJA3C9UI,8718
+ hamtaa_texttools-1.1.22.dist-info/WHEEL,sha256=_zCd3N1l69ArxyTb8rzEoP9TpbYXkqRFSNOD5OuxnTs,91
+ hamtaa_texttools-1.1.22.dist-info/top_level.txt,sha256=5Mh0jIxxZ5rOXHGJ6Mp-JPKviywwN0MYuH0xk5bEWqE,10
+ hamtaa_texttools-1.1.22.dist-info/RECORD,,
texttools/__init__.py CHANGED
@@ -1,7 +1,7 @@
- from .batch.batch_runner import BatchJobRunner
- from .batch.batch_config import BatchConfig
  from .tools.sync_tools import TheTool
  from .tools.async_tools import AsyncTheTool
  from .internals.models import CategoryTree
+ from .batch.batch_runner import BatchRunner
+ from .batch.batch_config import BatchConfig
  
- __all__ = ["TheTool", "AsyncTheTool", "BatchJobRunner", "BatchConfig", "CategoryTree"]
+ __all__ = ["TheTool", "AsyncTheTool", "CategoryTree", "BatchRunner", "BatchConfig"]
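Net effect of this rename for downstream code: 1.1.21-era imports of the batch runner need a one-line update, e.g.:

```python
# 1.1.21:  from texttools import BatchJobRunner
# 1.1.22:
from texttools import BatchRunner, BatchConfig
```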
texttools/batch/batch_runner.py CHANGED
@@ -20,7 +20,7 @@ T = TypeVar("T", bound=BaseModel)
  logger = logging.getLogger("texttools.batch_runner")
  
  
- class BatchJobRunner:
+ class BatchRunner:
      """
      Handles running batch jobs using a batch manager and configuration.
      """
texttools/internals/async_operator.py CHANGED
@@ -27,17 +27,11 @@ class AsyncOperator:
          self._client = client
          self._model = model
  
-     async def _analyze_completion(self, analyze_prompt: str, temperature: float) -> str:
+     async def _analyze_completion(self, analyze_message: list[dict[str, str]]) -> str:
          try:
-             if not analyze_prompt:
-                 raise PromptError("Analyze template is empty")
-
-             analyze_message = OperatorUtils.build_user_message(analyze_prompt)
-
              completion = await self._client.chat.completions.create(
                  model=self._model,
                  messages=analyze_message,
-                 temperature=temperature,
              )
  
              if not completion.choices:
@@ -57,7 +51,7 @@ class AsyncOperator:
  
      async def _parse_completion(
          self,
-         main_prompt: str,
+         main_message: list[dict[str, str]],
          output_model: Type[T],
         temperature: float,
          logprobs: bool,
@@ -69,8 +63,6 @@ class AsyncOperator:
          Returns both the parsed object and the raw completion for logprobs.
          """
          try:
-             main_message = OperatorUtils.build_user_message(main_prompt)
-
              request_kwargs = {
                  "model": self._model,
                  "messages": main_message,
@@ -124,11 +116,13 @@
          **extra_kwargs,
      ) -> OperatorOutput:
          """
-         Execute the LLM pipeline with the given input text. (Sync)
+         Execute the LLM pipeline with the given input text.
          """
          try:
-             prompt_loader = PromptLoader()
+             if logprobs and (not isinstance(top_logprobs, int) or top_logprobs < 2):
+                 raise ValueError("top_logprobs should be an int greater than 1")
  
+             prompt_loader = PromptLoader()
              prompt_configs = prompt_loader.load(
                  prompt_file=prompt_file,
                  text=text.strip(),
@@ -136,28 +130,27 @@
                  **extra_kwargs,
              )
  
-             main_prompt = ""
-             analysis = ""
+             analysis: str | None = None
  
              if with_analysis:
-                 analysis = await self._analyze_completion(
-                     prompt_configs["analyze_template"], temperature
+                 analyze_message = OperatorUtils.build_message(
+                     prompt_configs["analyze_template"]
                  )
-                 main_prompt += f"Based on this analysis:\n{analysis}\n"
+                 analysis = await self._analyze_completion(analyze_message)
  
-             if output_lang:
-                 main_prompt += f"Respond only in the {output_lang} language.\n"
-
-             if user_prompt:
-                 main_prompt += f"Consider this instruction {user_prompt}\n"
-
-             main_prompt += prompt_configs["main_template"]
-
-             if logprobs and (not isinstance(top_logprobs, int) or top_logprobs < 2):
-                 raise ValueError("top_logprobs should be an integer greater than 1")
+             main_message = OperatorUtils.build_message(
+                 OperatorUtils.build_main_prompt(
+                     prompt_configs["main_template"], analysis, output_lang, user_prompt
+                 )
+             )
  
              parsed, completion = await self._parse_completion(
-                 main_prompt, output_model, temperature, logprobs, top_logprobs, priority
+                 main_message,
+                 output_model,
+                 temperature,
+                 logprobs,
+                 top_logprobs,
+                 priority,
              )
  
              # Retry logic if validation fails
@@ -166,9 +159,7 @@
                      not isinstance(max_validation_retries, int)
                      or max_validation_retries < 1
                  ):
-                     raise ValueError(
-                         "max_validation_retries should be a positive integer"
-                     )
+                     raise ValueError("max_validation_retries should be a positive int")
  
                  succeeded = False
                  for _ in range(max_validation_retries):
@@ -177,7 +168,7 @@
  
                      try:
                          parsed, completion = await self._parse_completion(
-                             main_prompt,
+                             main_message,
                              output_model,
                              retry_temperature,
                              logprobs,
texttools/internals/operator_utils.py CHANGED
@@ -5,7 +5,29 @@ import random
  
  class OperatorUtils:
      @staticmethod
-     def build_user_message(prompt: str) -> list[dict[str, str]]:
+     def build_main_prompt(
+         main_template: str,
+         analysis: str | None,
+         output_lang: str | None,
+         user_prompt: str | None,
+     ) -> str:
+         main_prompt = ""
+
+         if analysis:
+             main_prompt += f"Based on this analysis:\n{analysis}\n"
+
+         if output_lang:
+             main_prompt += f"Respond only in the {output_lang} language.\n"
+
+         if user_prompt:
+             main_prompt += f"Consider this instruction {user_prompt}\n"
+
+         main_prompt += main_template
+
+         return main_prompt
+
+     @staticmethod
+     def build_message(prompt: str) -> list[dict[str, str]]:
          return [{"role": "user", "content": prompt}]
  
      @staticmethod
@@ -20,7 +42,7 @@ class OperatorUtils:
  
          for choice in completion.choices:
              if not getattr(choice, "logprobs", None):
-                 return []
+                 raise ValueError("Your model does not support logprobs")
              for logprob_item in choice.logprobs.content:
                  if ignore_pattern.match(logprob_item.token):
  if ignore_pattern.match(logprob_item.token):
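The net effect of the new helpers above: `build_main_prompt()` prepends the optional analysis, output-language, and user-prompt fragments to the main template, and `build_message()` wraps the result as a single user chat message. A standalone sketch with made-up inputs:

```python
# Sketch of the composition performed by the helpers in this diff.
prompt = OperatorUtils.build_main_prompt(
    main_template="Summarize the text.",
    analysis="The text is a short news item.",
    output_lang="English",
    user_prompt="Keep it under 20 words.",
)
message = OperatorUtils.build_message(prompt)
# message == [{"role": "user", "content":
#     "Based on this analysis:\nThe text is a short news item.\n"
#     "Respond only in the English language.\n"
#     "Consider this instruction Keep it under 20 words.\n"
#     "Summarize the text."}]
```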
texttools/internals/sync_operator.py CHANGED
@@ -27,17 +27,11 @@ class Operator:
          self._client = client
          self._model = model
  
-     def _analyze_completion(self, analyze_prompt: str, temperature: float) -> str:
+     def _analyze_completion(self, analyze_message: list[dict[str, str]]) -> str:
          try:
-             if not analyze_prompt:
-                 raise PromptError("Analyze template is empty")
-
-             analyze_message = OperatorUtils.build_user_message(analyze_prompt)
-
              completion = self._client.chat.completions.create(
                  model=self._model,
                  messages=analyze_message,
-                 temperature=temperature,
              )
  
              if not completion.choices:
@@ -57,7 +51,7 @@ class Operator:
  
      def _parse_completion(
          self,
-         main_prompt: str,
+         main_message: list[dict[str, str]],
          output_model: Type[T],
          temperature: float,
          logprobs: bool,
@@ -69,8 +63,6 @@ class Operator:
          Returns both the parsed object and the raw completion for logprobs.
          """
          try:
-             main_message = OperatorUtils.build_user_message(main_prompt)
-
              request_kwargs = {
                  "model": self._model,
                  "messages": main_message,
@@ -122,11 +114,13 @@ class Operator:
          **extra_kwargs,
      ) -> OperatorOutput:
          """
-         Execute the LLM pipeline with the given input text. (Sync)
+         Execute the LLM pipeline with the given input text.
          """
          try:
-             prompt_loader = PromptLoader()
+             if logprobs and (not isinstance(top_logprobs, int) or top_logprobs < 2):
+                 raise ValueError("top_logprobs should be an int greater than 1")
  
+             prompt_loader = PromptLoader()
              prompt_configs = prompt_loader.load(
                  prompt_file=prompt_file,
                  text=text.strip(),
@@ -134,28 +128,27 @@ class Operator:
                  **extra_kwargs,
              )
  
-             main_prompt = ""
-             analysis = ""
+             analysis: str | None = None
  
              if with_analysis:
-                 analysis = self._analyze_completion(
-                     prompt_configs["analyze_template"], temperature
+                 analyze_message = OperatorUtils.build_message(
+                     prompt_configs["analyze_template"]
                  )
-                 main_prompt += f"Based on this analysis:\n{analysis}\n"
+                 analysis = self._analyze_completion(analyze_message)
  
-             if output_lang:
-                 main_prompt += f"Respond only in the {output_lang} language.\n"
-
-             if user_prompt:
-                 main_prompt += f"Consider this instruction {user_prompt}\n"
-
-             main_prompt += prompt_configs["main_template"]
-
-             if logprobs and (not isinstance(top_logprobs, int) or top_logprobs < 2):
-                 raise ValueError("top_logprobs should be an integer greater than 1")
+             main_message = OperatorUtils.build_message(
+                 OperatorUtils.build_main_prompt(
+                     prompt_configs["main_template"], analysis, output_lang, user_prompt
+                 )
+             )
  
              parsed, completion = self._parse_completion(
-                 main_prompt, output_model, temperature, logprobs, top_logprobs, priority
+                 main_message,
+                 output_model,
+                 temperature,
+                 logprobs,
+                 top_logprobs,
+                 priority,
              )
  
              # Retry logic if validation fails
@@ -164,9 +157,7 @@ class Operator:
                      not isinstance(max_validation_retries, int)
                      or max_validation_retries < 1
                  ):
-                     raise ValueError(
-                         "max_validation_retries should be a positive integer"
-                     )
+                     raise ValueError("max_validation_retries should be a positive int")
  
                  succeeded = False
                  for _ in range(max_validation_retries):
@@ -175,7 +166,7 @@ class Operator:
  
                      try:
                          parsed, completion = self._parse_completion(
-                             main_prompt,
+                             main_message,
                              output_model,
                              retry_temperature,
                              logprobs,
texttools/tools/async_tools.py CHANGED
@@ -1044,8 +1044,6 @@ class AsyncTheTool:
          """
          Custom tool that can do almost anything!
  
-         Important Note: This tool is EXPERIMENTAL, you can use it but it isn't reliable.
-
          Arguments:
              prompt: The user prompt
              output_model: Pydantic BaseModel used for structured output
texttools/tools/sync_tools.py CHANGED
@@ -1044,8 +1044,6 @@ class TheTool:
          """
          Custom tool that can do almost anything!
  
-         Important Note: This tool is EXPERIMENTAL, you can use it but it isn't reliable.
-
          Arguments:
              prompt: The user prompt
              output_model: Pydantic BaseModel used for structured output