hamtaa-texttools 1.1.1__py3-none-any.whl → 1.2.0__py3-none-any.whl
- hamtaa_texttools-1.2.0.dist-info/METADATA +212 -0
- hamtaa_texttools-1.2.0.dist-info/RECORD +34 -0
- texttools/__init__.py +6 -8
- texttools/batch/__init__.py +0 -4
- texttools/batch/config.py +40 -0
- texttools/batch/{batch_manager.py → manager.py} +41 -42
- texttools/batch/runner.py +228 -0
- texttools/core/__init__.py +0 -0
- texttools/core/engine.py +254 -0
- texttools/core/exceptions.py +22 -0
- texttools/core/internal_models.py +58 -0
- texttools/core/operators/async_operator.py +194 -0
- texttools/core/operators/sync_operator.py +192 -0
- texttools/models.py +88 -0
- texttools/prompts/categorize.yaml +36 -0
- texttools/prompts/check_fact.yaml +24 -0
- texttools/prompts/extract_entities.yaml +7 -3
- texttools/prompts/extract_keywords.yaml +80 -18
- texttools/prompts/is_question.yaml +6 -2
- texttools/prompts/merge_questions.yaml +12 -5
- texttools/prompts/propositionize.yaml +24 -0
- texttools/prompts/rewrite.yaml +9 -10
- texttools/prompts/run_custom.yaml +2 -2
- texttools/prompts/subject_to_question.yaml +7 -3
- texttools/prompts/summarize.yaml +6 -2
- texttools/prompts/text_to_question.yaml +12 -6
- texttools/prompts/translate.yaml +7 -2
- texttools/py.typed +0 -0
- texttools/tools/__init__.py +0 -4
- texttools/tools/async_tools.py +1093 -0
- texttools/tools/sync_tools.py +1092 -0
- hamtaa_texttools-1.1.1.dist-info/METADATA +0 -183
- hamtaa_texttools-1.1.1.dist-info/RECORD +0 -30
- texttools/batch/batch_runner.py +0 -263
- texttools/prompts/README.md +0 -35
- texttools/prompts/categorizer.yaml +0 -28
- texttools/tools/async_the_tool.py +0 -414
- texttools/tools/internals/async_operator.py +0 -179
- texttools/tools/internals/base_operator.py +0 -91
- texttools/tools/internals/formatters.py +0 -24
- texttools/tools/internals/operator.py +0 -179
- texttools/tools/internals/output_models.py +0 -59
- texttools/tools/internals/prompt_loader.py +0 -57
- texttools/tools/the_tool.py +0 -412
- {hamtaa_texttools-1.1.1.dist-info → hamtaa_texttools-1.2.0.dist-info}/WHEEL +0 -0
- {hamtaa_texttools-1.1.1.dist-info → hamtaa_texttools-1.2.0.dist-info}/licenses/LICENSE +0 -0
- {hamtaa_texttools-1.1.1.dist-info → hamtaa_texttools-1.2.0.dist-info}/top_level.txt +0 -0
texttools/prompts/merge_questions.yaml
CHANGED
@@ -4,15 +4,18 @@ main_template:
  You are a language expert.
  I will give you a list of questions that are semantically similar.
  Your task is to merge them into one unified question.
+
  Guidelines:
  - Preserves all the information and intent from the original questions.
  - Sounds natural, fluent, and concise.
  - Avoids redundancy or unnecessary repetition.
  - Does not omit any unique idea from the originals.
-
+
+ Respond only in JSON format:
  {{"result": "string"}}
+
  Here is the questions:
- {
+ {text}
 
  reason: |
  You are an AI assistant helping to unify semantically similar questions.
@@ -20,10 +23,12 @@ main_template:
  Then, write one merged question that combines all their content clearly and naturally, without redundancy.
  Step 1: Extract key ideas.
  Step 2: Write the final merged question.
+
  Respond only in JSON format:
  {{"result": "string"}}
+
  Here is the questions:
- {
+ {text}
 
  analyze_template:
 
@@ -33,14 +38,16 @@ analyze_template:
  and the specific information they are seeking.
  Provide a brief, summarized understanding of the questions' meaning that
  will help in merging and rephrasing it accurately without changing its intent.
+
  Here is the question:
- {
+ {text}
 
  reason: |
  Analyze the following questions to identify their exact wording, phrasing,
  and the literal meaning it conveys.
  Provide a brief, summarized analysis of their linguistic structure and current meaning,
  which will then be used to create a new question containing all of their contents.
+
  Here is the question:
- {
+ {text}
 
texttools/prompts/propositionize.yaml
ADDED
@@ -0,0 +1,24 @@
+ main_template: |
+ You are an expert data analyst specializing in Information Extraction.
+ Your task is to extract a list of "Atomic Propositions" from the provided text.
+
+ Definition of Atomic Proposition:
+ A single, self-contained statement of fact that is concise and verifiable.
+
+ Strict Guidelines:
+ 1. Remove Meta-Data: STRICTLY EXCLUDE all citations, references, URLs, source attributions (e.g., "Source: makarem.ir"), and conversational fillers (e.g., "Based on the documents...", "In conclusion...").
+ 2. Resolve Context: Replace pronouns ("it", "this", "they") with the specific nouns they refer to. Each proposition must make sense in isolation.
+ 3. Preserve Logic: Keep conditions attached to their facts. Do not split a rule from its condition (e.g., "If X, then Y" should be one proposition).
+ 4. No Redundancy: Do not extract summary statements that merely repeat facts already listed.
+
+ Extract the atomic propositions from the following text:
+ {text}
+
+ analyze_template: |
+ We want to analyze this text snippet and think about where we can split sentence to atomic meaningful propositions.
+ An atomic proposition is a single, self-contained fact that is concise,
+ verifiable, and does not rely on external context.
+ You just have to think around the possible propositions in the text and how a proposition can be made.
+
+ Here is the text:
+ {text}
texttools/prompts/rewrite.yaml
CHANGED
@@ -18,7 +18,7 @@ main_template:
  {{"result": "str"}}
 
  Anchor Text:
- "{
+ "{text}"
 
  negative: |
  You are an AI assistant designed to generate high-quality training data for semantic text embedding models.
@@ -35,7 +35,7 @@ main_template:
  {{"result": "str"}}
 
  Anchor Text:
- "{
+ "{text}"
 
  hard_negative: |
  You are an AI assistant designed to generate high-quality training data for semantic text embedding models.
@@ -52,12 +52,11 @@ main_template:
  - Make it Challenging: The difference should be subtle enough that it requires a deep understanding of the text to identify, not just a simple keyword mismatch.
  - Maintain Similar Length: The generated sentence should be of roughly the same length and level of detail as the Anchor.
 
-
  Respond only in JSON format:
  {{"result": "str"}}
 
  Anchor Text:
- "{
+ "{text}"
 
 
  analyze_template:
@@ -73,8 +72,8 @@ analyze_template:
 
  Your analysis should capture the ESSENTIAL MEANING that must be preserved in any paraphrase.
 
-
- {
+ Here is the text:
+ {text}
 
  negative: |
  Analyze the following text to identify its SPECIFIC TOPIC and DOMAIN for creating a high-quality NEGATIVE sample.
@@ -87,8 +86,8 @@ analyze_template:
 
  The goal is to find topics that are in the same domain but semantically unrelated to this specific text.
 
-
- {
+ Here is the text:
+ {text}
 
  hard_negative: |
  Analyze this text to identify EXACTLY ONE ELEMENT that can be changed to create a hard-negative sample.
@@ -106,6 +105,6 @@ analyze_template:
  - Sentence structure
  - 80-90% of the vocabulary
 
-
- {
+ Here is the text:
+ {text}
 
texttools/prompts/subject_to_question.yaml
CHANGED
@@ -3,13 +3,16 @@ main_template: |
  Given the following subject, generate {number_of_questions} appropriate questions that this subject would directly respond to.
  The generated subject should be independently meaningful,
  and it must not mention any verbs like, this, that, he or she and etc. in the question.
+
  There is a `reason` key, fill that up with a summerized version of your thoughts.
  The `reason` must be less than 20 words.
  Don't forget to fill the reason.
+
  Respond only in JSON format:
  {{"result": ["question1", "question2", ...], "reason": "string"}}
-
-
+
+ Here is the subject:
+ {text}
 
  analyze_template: |
  Our goal is to generate questions from the given subject.
@@ -18,5 +21,6 @@ analyze_template: |
  We need a summerized analysis of the subject.
  What is the subject about?
  What point of views can we see and generate questoins from it? (Questions that real users might have.)
+
  Here is the subject:
- {
+ {text}
texttools/prompts/summarize.yaml
CHANGED
@@ -1,14 +1,18 @@
  main_template: |
  You are a summarizer.
  You must summarize the given text, preserving its meaning.
+
  Respond only in JSON format:
  {{"result": "string"}}
+
  Provide a concise summary of the following text:
- {
+ {text}
 
 
  analyze_template: |
  Read the following text and identify its main points, key arguments, and overall purpose.
  Provide a brief, summarized analysis that will help in generating an accurate and concise summary.
-
+
+ Here is the text:
+ {text}
 
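Note on the placeholder syntax: the `{text}` line and the doubled braces around the JSON example in the summarize template are consistent with Python `str.format` substitution. The sketch below assumes that is how the prompt is rendered; the package's loader code is not part of this hunk, and `render` and `SUMMARIZE_MAIN` are hypothetical names used only for illustration.

```python
# Illustrative only: assumes the template is filled with str.format.
# The template text is copied from the new side of the summarize.yaml hunk.
SUMMARIZE_MAIN = (
    "You are a summarizer.\n"
    "You must summarize the given text, preserving its meaning.\n"
    "\n"
    "Respond only in JSON format:\n"
    '{{"result": "string"}}\n'
    "\n"
    "Provide a concise summary of the following text:\n"
    "{text}\n"
)


def render(template: str, **fields: str) -> str:
    """Fill named placeholders; doubled braces become literal braces."""
    return template.format(**fields)


prompt = render(SUMMARIZE_MAIN, text="Cats sleep most of the day.")
print(prompt)
```

Under this assumption, the doubled braces keep the JSON example literal in the rendered prompt, while `{text}` is replaced by the caller's input.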
texttools/prompts/text_to_question.yaml
CHANGED
@@ -1,20 +1,26 @@
  main_template: |
  You are a question generator.
- Given the following answer, generate
- appropriate question that this answer would directly respond to.
+ Given the following answer, generate {number_of_questions} appropriate questions that this answer would directly respond to.
  The generated answer should be independently meaningful,
  and not mentioning any verbs like, this, that, he or she on the question.
+
+ There is a `reason` key, fill that up with a summerized version of your thoughts.
+ The `reason` must be less than 20 words.
+ Don't forget to fill the reason.
+
  Respond only in JSON format:
- {{"result": "string"}}
+ {{"result": ["question1", "question2", ...], "reason": "string"}}
+
  Here is the answer:
- {
+ {text}
 
  analyze_template: |
  Analyze the following answer to identify its key facts,
  main subject, and what kind of information it provides.
  Provide a brief, summarized understanding of the answer's content that will
- help in formulating
+ help in formulating relevant and direct questions.
  Just mention the keypoints that was provided in the answer
+
  Here is the answer:
- {
+ {text}
 
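Note on the response shape: in the text_to_question template the expected JSON changes from a single string under `result` to a list of questions plus a `reason` string. The sketch below shows one way a caller might validate the new shape using only the standard library. `QuestionsOutput` and `parse_questions` are hypothetical names, and the package's own output models (not shown in this hunk) may differ.

```python
import json
from dataclasses import dataclass


@dataclass
class QuestionsOutput:
    """Hypothetical container mirroring the new JSON shape in the prompt."""
    result: list[str]
    reason: str


def parse_questions(raw: str) -> QuestionsOutput:
    """Parse a model reply and check it has a list of questions and a reason."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    result = data.get("result")
    reason = data.get("reason")
    if not isinstance(result, list) or not all(isinstance(q, str) for q in result):
        raise ValueError("'result' must be a list of strings")
    if not isinstance(reason, str):
        raise ValueError("'reason' must be a string")
    return QuestionsOutput(result=result, reason=reason)


sample = (
    '{"result": ["What is X?", "Why does X matter?"], '
    '"reason": "Answer defines X and its importance."}'
)
parsed = parse_questions(sample)
print(parsed.result, parsed.reason)
```

A reply in the old shape, `{"result": "string"}`, fails this check, which is the practical consequence of the template change for downstream consumers.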
texttools/prompts/translate.yaml
CHANGED
@@ -1,15 +1,20 @@
  main_template: |
  You are a {target_language} translator.
  Output only the translated text.
+
  Respond only in JSON format:
  {{"result": "string"}}
+
  Don't translate proper name, only transliterate them to {target_language}
+
  Translate the following text to {target_language}:
- {
+ {text}
 
  analyze_template: |
  Analyze the following text and identify important linguistic considerations for translation.
  Point out any idioms, cultural references, or complex structures that need special attention.
  Also, list all proper nouns that should not be translated. Write your analysis in the {target_language}.
-
+
+ Here is the text:
+ {text}
 
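Note on checking placeholders: the translate template uses two named fields, `{target_language}` and `{text}`, alongside escaped braces for the JSON example. Assuming `str.format`-style templates (an assumption, since the rendering code is not shown in this hunk), the standard library's `string.Formatter` can list the fields a template expects. `placeholders` and `TRANSLATE_MAIN` are hypothetical names for illustration.

```python
from string import Formatter

# Template text copied from the new side of the translate.yaml hunk.
TRANSLATE_MAIN = (
    "You are a {target_language} translator.\n"
    "Output only the translated text.\n"
    "\n"
    "Respond only in JSON format:\n"
    '{{"result": "string"}}\n'
    "\n"
    "Don't translate proper name, only transliterate them to {target_language}\n"
    "\n"
    "Translate the following text to {target_language}:\n"
    "{text}\n"
)


def placeholders(template: str) -> set[str]:
    """Return the named fields in a str.format template, ignoring escaped braces."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


fields = placeholders(TRANSLATE_MAIN)
print(sorted(fields))
```

A check like this could be run over every prompt file to confirm that each template exposes the field names the calling code supplies.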
texttools/py.typed
ADDED
File without changes
texttools/tools/__init__.py
CHANGED