llm-ie 0.2.2__py3-none-any.whl → 0.3.1__py3-none-any.whl
- llm_ie/asset/PromptEditor_prompts/chat.txt +5 -0
- llm_ie/asset/PromptEditor_prompts/rewrite.txt +3 -1
- llm_ie/asset/PromptEditor_prompts/system.txt +1 -0
- llm_ie/asset/default_prompts/ReviewFrameExtractor_addition_review_prompt.txt +3 -0
- llm_ie/asset/default_prompts/ReviewFrameExtractor_revision_review_prompt.txt +2 -0
- llm_ie/asset/default_prompts/SentenceReviewFrameExtractor_addition_review_prompt.txt +4 -0
- llm_ie/asset/default_prompts/SentenceReviewFrameExtractor_revision_review_prompt.txt +3 -0
- llm_ie/asset/prompt_guide/BasicFrameExtractor_prompt_guide.txt +117 -7
- llm_ie/asset/prompt_guide/BinaryRelationExtractor_prompt_guide.txt +32 -12
- llm_ie/asset/prompt_guide/MultiClassRelationExtractor_prompt_guide.txt +35 -12
- llm_ie/asset/prompt_guide/ReviewFrameExtractor_prompt_guide.txt +117 -7
- llm_ie/asset/prompt_guide/SentenceCoTFrameExtractor_prompt_guide.txt +217 -0
- llm_ie/asset/prompt_guide/SentenceFrameExtractor_prompt_guide.txt +129 -24
- llm_ie/asset/prompt_guide/SentenceReviewFrameExtractor_prompt_guide.txt +145 -0
- llm_ie/engines.py +1 -1
- llm_ie/extractors.py +331 -24
- llm_ie/prompt_editor.py +150 -8
- {llm_ie-0.2.2.dist-info → llm_ie-0.3.1.dist-info}/METADATA +89 -44
- llm_ie-0.3.1.dist-info/RECORD +23 -0
- llm_ie-0.2.2.dist-info/RECORD +0 -15
- {llm_ie-0.2.2.dist-info → llm_ie-0.3.1.dist-info}/WHEEL +0 -0
|
@@ -1,4 +1,6 @@
|
|
|
1
|
-
|
|
1
|
+
# Task description
|
|
2
|
+
Rewrite the draft prompt following the prompt guideline below.
|
|
3
|
+
DO NOT explain your answer.
|
|
2
4
|
|
|
3
5
|
# Prompt guideline
|
|
4
6
|
{{prompt_guideline}}
|
|
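The `{{prompt_guideline}}` line in the hunk above is a double-brace placeholder. The following is a minimal sketch of how such a placeholder could be filled, not llm-ie's actual implementation. The `fill_template` helper is hypothetical, and `REWRITE_TEMPLATE` contains only the lines visible in this hunk (the real file may contain more).

```python
import re


def fill_template(template, **values):
    """Replace every {{name}} placeholder in `template` with values[name]."""
    def _substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for placeholder {name!r}")
        return values[name]

    # \w+ matches the placeholder name; optional spaces inside the braces are tolerated.
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", _substitute, template)


# Only the lines shown in the hunk above (assumption: the file may be longer).
REWRITE_TEMPLATE = (
    "# Task description\n"
    "Rewrite the draft prompt following the prompt guideline below.\n"
    "DO NOT explain your answer.\n"
    "\n"
    "# Prompt guideline\n"
    "{{prompt_guideline}}\n"
)

prompt = fill_template(REWRITE_TEMPLATE, prompt_guideline="1. Task Description: ...")
print(prompt)
```

Raising on a missing value, rather than leaving the placeholder in place, makes an unfilled template fail early instead of reaching the model.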
@@ -0,0 +1 @@
|
|
|
1
|
+
You are an AI assistant specializing in prompt writing and improvement. Your role is to help users refine, rewrite, and generate effective prompts based on guidelines provided. You are highly knowledgeable in extracting key information and adhering to structured formats. During interactions, you will engage in clear, insightful, and context-aware conversations, providing thoughtful responses to assist the user. Maintain a polite, professional tone and ensure each response adds value to the conversation, promoting clarity and creativity in the user's prompts. If users ask about irrelevant topics (not related to prompt development), you will politely decline to answer and guide the conversation back to prompt development.
|
|
@@ -0,0 +1,4 @@
|
|
|
1
|
+
Review the input sentence and your output carefully. If anything was missed, add it to your output following the defined output formats.
|
|
2
|
+
You should ONLY add new items. Do NOT re-generate the entire answer.
|
|
3
|
+
Your output should be based on the input sentence.
|
|
4
|
+
Your output should strictly adhere to the defined output formats.
|
|
@@ -0,0 +1,3 @@
|
|
|
1
|
+
Review the input sentence and your output carefully. If you find any omissions or errors, correct them by generating a revised output following the defined output formats.
|
|
2
|
+
Your output should be based on the input sentence.
|
|
3
|
+
Your output should strictly adhere to the defined output formats.
|
|
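The two review prompts above imply different ways of combining the review output with the initial output: an addition review returns only new items, while a revision review returns a complete corrected answer. The sketch below illustrates that difference under stated assumptions. The `apply_review` function and the `mode` values are hypothetical names, not part of the package's documented API.

```python
from typing import Dict, List

Frame = Dict[str, str]


def apply_review(initial: List[Frame], review: List[Frame], mode: str) -> List[Frame]:
    """Combine an initial extraction with the output of a review pass."""
    if mode == "addition":
        # The review lists only missed items, so append them (skipping duplicates).
        merged = list(initial)
        for frame in review:
            if frame not in merged:
                merged.append(frame)
        return merged
    if mode == "revision":
        # The review is a full regenerated answer, so it replaces the initial one.
        return list(review)
    raise ValueError(f"unknown review mode: {mode!r}")


initial = [{"entity_text": "Acetaminophen", "entity_type": "Drug"}]
missed = [{"entity_text": "650 mg", "entity_type": "Strength"}]
print(apply_review(initial, missed, "addition"))
```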
@@ -1,11 +1,27 @@
|
|
|
1
|
-
Prompt
|
|
2
|
-
1. Task description
|
|
3
|
-
2. Schema definition
|
|
4
|
-
3. Output format definition
|
|
5
|
-
4. Additional hints
|
|
6
|
-
5. Input placeholder
|
|
1
|
+
Prompt Template Design:
|
|
7
2
|
|
|
8
|
-
|
|
3
|
+
1. Task Description:
|
|
4
|
+
Provide a detailed description of the task, including the background and the type of task (e.g., named entity recognition).
|
|
5
|
+
|
|
6
|
+
2. Schema Definition:
|
|
7
|
+
List the key concepts that should be extracted, and provide clear definitions for each one.
|
|
8
|
+
|
|
9
|
+
3. Output Format Definition:
|
|
10
|
+
The output should be a JSON list, where each element is a dictionary representing a frame (an entity along with its attributes). Each dictionary must include a key that holds the entity text. This key can be named "entity_text" or anything else, depending on the context. The attributes can either be flat (e.g., {"entity_text": "<entity_text>", "attr1": "<attr1>", "attr2": "<attr2>"}) or nested (e.g., {"entity_text": "<entity_text>", "attributes": {"attr1": "<attr1>", "attr2": "<attr2>"}}).
|
|
11
|
+
|
|
12
|
+
4. Optional: Hints:
|
|
13
|
+
Provide itemized hints for the information extractors to guide the extraction process.
|
|
14
|
+
|
|
15
|
+
5. Optional: Examples:
|
|
16
|
+
Include examples in the format:
|
|
17
|
+
Input: ...
|
|
18
|
+
Output: ...
|
|
19
|
+
|
|
20
|
+
6. Input Placeholder:
|
|
21
|
+
The template must include a placeholder in the format {{<placeholder_name>}} for the input text. The placeholder name can be customized as needed.
|
|
22
|
+
|
|
23
|
+
|
|
24
|
+
Example 1 (single entity type with attributes):
|
|
9
25
|
|
|
10
26
|
# Task description
|
|
11
27
|
The paragraph below is from the Food and Drug Administration (FDA) Clinical Pharmacology Section of Labeling for Human Prescription Drug and Biological Products, Adverse reactions section. Please carefully review it and extract the adverse reactions and percentages. Note that each adverse reaction is nested under a clinical trial and potentially an arm. Your output should take that into consideration.
|
|
@@ -33,3 +49,97 @@ Example:
|
|
|
33
49
|
Below is the Adverse reactions section:
|
|
34
50
|
{{input}}
|
|
35
51
|
|
|
52
|
+
|
|
53
|
+
Example 2 (multiple entity types):
|
|
54
|
+
|
|
55
|
+
# Task description
|
|
56
|
+
This is a named entity recognition task. Given a medical note, annotate the Drug, Form, Strength, Frequency, Route, Dosage, Reason, ADE, and Duration.
|
|
57
|
+
|
|
58
|
+
# Schema definition
|
|
59
|
+
Your output should contain:
|
|
60
|
+
"entity_text": the exact wording as mentioned in the note.
|
|
61
|
+
"entity_type": type of the entity. It should be one of the "Drug", "Form", "Strength", "Frequency", "Route", "Dosage", "Reason", "ADE", or "Duration".
|
|
62
|
+
|
|
63
|
+
# Output format definition
|
|
64
|
+
Your output should follow JSON format,
|
|
65
|
+
if there is at least one mention of a Drug, Form, Strength, Frequency, Route, Dosage, Reason, ADE, or Duration entity:
|
|
66
|
+
[{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "<entity type as listed above>"},
|
|
67
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "<entity type as listed above>"}]
|
|
68
|
+
if there is no entity mentioned in the given note, just output an empty list:
|
|
69
|
+
[]
|
|
70
|
+
|
|
71
|
+
I am only interested in the extracted contents in []. Do NOT explain your answer.
|
|
72
|
+
|
|
73
|
+
# Examples
|
|
74
|
+
Below are some examples:
|
|
75
|
+
|
|
76
|
+
Input: Acetaminophen 650 mg PO BID 5.
|
|
77
|
+
Output: [{"entity_text": "Acetaminophen", "entity_type": "Drug"}, {"entity_text": "650 mg", "entity_type": "Strength"}, {"entity_text": "PO", "entity_type": "Route"}, {"entity_text": "BID", "entity_type": "Frequency"}]
|
|
78
|
+
|
|
79
|
+
Input: Mesalamine DR 1200 mg PO BID 2.
|
|
80
|
+
Output: [{"entity_text": "Mesalamine DR", "entity_type": "Drug"}, {"entity_text": "1200 mg", "entity_type": "Strength"}, {"entity_text": "BID", "entity_type": "Frequency"}, {"entity_text": "PO", "entity_type": "Route"}]
|
|
81
|
+
|
|
82
|
+
|
|
83
|
+
# Input placeholder
|
|
84
|
+
Below is the medical note:
|
|
85
|
+
"{{input}}"
|
|
86
|
+
|
|
87
|
+
|
|
88
|
+
Example 3 (multiple entity types with corresponding attributes):
|
|
89
|
+
|
|
90
|
+
# Task description
|
|
91
|
+
This is a named entity recognition task. Given a medical note, annotate the events (EVENT) and time expressions (TIMEX3):
|
|
92
|
+
|
|
93
|
+
# Schema definition
|
|
94
|
+
Your output should contain:
|
|
95
|
+
"entity_text": the exact wording as mentioned in the note.
|
|
96
|
+
"entity_type": type of the entity. It should be one of the "EVENT" or "TIMEX3".
|
|
97
|
+
if entity_type is "EVENT",
|
|
98
|
+
"type": the event type as one of the "TEST", "PROBLEM", "TREATMENT", "CLINICAL_DEPT", "EVIDENTIAL", or "OCCURRENCE".
|
|
99
|
+
"polarity": whether an EVENT is positive ("POS") or negative ("NAG"). For example, in “the patient reports headache, and denies chills”, the EVENT [headache] is positive in its polarity, and the EVENT [chills] is negative in its polarity.
|
|
100
|
+
"modality": whether an EVENT actually occurred or not. Must be one of the "FACTUAL", "CONDITIONAL", "POSSIBLE", or "PROPOSED".
|
|
101
|
+
|
|
102
|
+
if entity_type is "TIMEX3",
|
|
103
|
+
"type": the type as one of the "DATE", "TIME", "DURATION", or "FREQUENCY".
|
|
104
|
+
"val": the numeric value 1) DATE: [YYYY]-[MM]-[DD], 2) TIME: [hh]:[mm]:[ss], 3) DURATION: P[n][Y/M/W/D]. So, “for eleven days” will be
|
|
105
|
+
represented as “P11D”, meaning a period of 11 days. 4) FREQUENCY: R[n][duration], where n denotes the number of repeats. When n is omitted, the expression denotes an unspecified number of repeats. For example, “once a day for 3 days” is “R3P1D” (repeat the 1-day interval (P1D) 3 times (R3)), and “twice every day” is “RP12H” (repeat every 12 hours).
|
|
106
|
+
"mod": additional information regarding the temporal value of a time expression. Must be one of the:
|
|
107
|
+
“NA”: the default value, no relevant modifier is present;
|
|
108
|
+
“MORE”, means “more than”, e.g. over 2 days (val = P2D, mod = MORE);
|
|
109
|
+
“LESS”, means “less than”, e.g. almost 2 months (val = P2M, mod=LESS);
|
|
110
|
+
“APPROX”, means “approximate”, e.g. nearly a week (val = P1W, mod=APPROX);
|
|
111
|
+
“START”, describes the beginning of a period of time, e.g. Christmas morning, 2005 (val= 2005-12-25, mod= START).
|
|
112
|
+
“END”, describes the end of a period of time, e.g. late last year, (val = 2010, mod = END)
|
|
113
|
+
“MIDDLE”, describes the middle of a period of time, e.g. mid-September 2001 (val = 2001-09, mod = MIDDLE)
|
|
114
|
+
|
|
115
|
+
# Output format definition
|
|
116
|
+
Your output should follow JSON format,
|
|
117
|
+
if there is at least one EVENT or TIMEX3 entity mention:
|
|
118
|
+
[
|
|
119
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "EVENT", "type": "<event type>", "polarity": "<event polarity>", "modality": "<event modality>"},
|
|
120
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "TIMEX3", "type": "<TIMEX3 type>", "val": "<time value>", "mod": "<additional information>"}
|
|
121
|
+
...
|
|
122
|
+
]
|
|
123
|
+
if there is no entity mentioned in the given note, just output an empty list:
|
|
124
|
+
[]
|
|
125
|
+
|
|
126
|
+
I am only interested in the extracted contents in []. Do NOT explain your answer.
|
|
127
|
+
|
|
128
|
+
# Examples
|
|
129
|
+
Below are some examples:
|
|
130
|
+
|
|
131
|
+
Input: At 9/7/93 , 1:00 a.m. , intravenous fluids rate was decreased to 50 cc's per hour , total fluids given during the first 24 hours were 140 to 150 cc's per kilo per day .
|
|
132
|
+
Output: [{"entity_text": "intravenous fluids", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
133
|
+
{"entity_text": "decreased", "entity_type": "EVENT", "type": "OCCURRENCE", "polarity": "POS", "modality": "FACTUAL"},
|
|
134
|
+
{"entity_text": "total fluids", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
135
|
+
{"entity_text": "9/7/93 , 1:00 a.m.", "entity_type": "TIMEX3", "type": "TIME", "val": "1993-09-07T01:00", "mod": "NA"},
|
|
136
|
+
{"entity_text": "24 hours", "entity_type": "TIMEX3", "type": "DURATION", "val": "PT24H", "mod": "NA"}]
|
|
137
|
+
|
|
138
|
+
Input: At that time it appeared well adhered to the underlying skin .
|
|
139
|
+
Output: [{"entity_text": "it", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
140
|
+
{"entity_text": "well adhered", "entity_type": "EVENT", "type": "OCCURRENCE", "polarity": "POS", "modality": "FACTUAL"}]
|
|
141
|
+
|
|
142
|
+
|
|
143
|
+
# Input placeholder
|
|
144
|
+
Below is the entire medical note:
|
|
145
|
+
"{{input}}"
|
|
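The prompt guide above allows frame attributes to be either flat or nested under an "attributes" key. The following sketch shows one way downstream code could read both shapes into the same flat dictionary. It is an illustration only: `extract_frames` is a hypothetical helper, not the package's parser, and it assumes the model reply contains a single JSON list.

```python
import json
import re
from typing import Any, Dict, List


def extract_frames(llm_output: str, text_key: str = "entity_text") -> List[Dict[str, Any]]:
    """Find the JSON list in a model reply and flatten nested attributes."""
    # Greedy match from the first '[' to the last ']' so surrounding prose is ignored.
    match = re.search(r"\[.*\]", llm_output, flags=re.DOTALL)
    if match is None:
        return []
    frames = []
    for item in json.loads(match.group(0)):  # raises ValueError on malformed JSON
        if not isinstance(item, dict) or text_key not in item:
            continue  # skip elements that do not carry the entity text
        flat = {text_key: item[text_key]}
        for key, value in item.items():
            if key == text_key:
                continue
            if key == "attributes" and isinstance(value, dict):
                flat.update(value)  # nested form: lift attributes to the top level
            else:
                flat[key] = value   # flat form: keep as-is
        frames.append(flat)
    return frames


reply = 'Result: [{"entity_text": "nausea", "attributes": {"Percentage": "5%"}}]'
print(extract_frames(reply))
```

The `text_key` parameter reflects the guide's statement that the entity-text key may be named something other than "entity_text".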
@@ -1,9 +1,29 @@
|
|
|
1
|
-
Prompt
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
1
|
+
Prompt Template Design:
|
|
2
|
+
|
|
3
|
+
1. Task description:
|
|
4
|
+
Provide a detailed description of the task, including the background and the type of task (e.g., binary relation extraction). Mention the region of interest (ROI) text.
|
|
5
|
+
2. Schema definition:
|
|
6
|
+
List the criteria for relation (True) and for no relation (False).
|
|
7
|
+
|
|
8
|
+
3. Output format definition:
|
|
9
|
+
The output must be a dictionary with a key "Relation" (i.e., {"Relation": "<True or False>"}).
|
|
10
|
+
|
|
11
|
+
4. (optional) Hints:
|
|
12
|
+
Provide itemized hints for the information extractors to guide the extraction process.
|
|
13
|
+
|
|
14
|
+
5. (optional) Examples:
|
|
15
|
+
Include examples in the format:
|
|
16
|
+
Input: ...
|
|
17
|
+
Output: ...
|
|
18
|
+
|
|
19
|
+
6. Entity 1 full information:
|
|
20
|
+
Include a placeholder in the format {{<frame_1>}}
|
|
21
|
+
|
|
22
|
+
7. Entity 2 full information:
|
|
23
|
+
Include a placeholder in the format {{<frame_2>}}
|
|
24
|
+
|
|
25
|
+
8. Input placeholders:
|
|
26
|
+
The template must include a placeholder "{{roi_text}}" for the ROI text.
|
|
7
27
|
|
|
8
28
|
|
|
9
29
|
Example:
|
|
@@ -27,12 +47,12 @@ Example:
|
|
|
27
47
|
3. If the strength or frequency is for another medication, output False.
|
|
28
48
|
4. If the strength or frequency is for the same medication but at a different location (span), output False.
|
|
29
49
|
|
|
50
|
+
# Entity 1 full information:
|
|
51
|
+
{{frame_1}}
|
|
52
|
+
|
|
53
|
+
# Entity 2 full information:
|
|
54
|
+
{{frame_2}}
|
|
55
|
+
|
|
30
56
|
# Input placeholders
|
|
31
57
|
ROI Text with the two entities annotated with <entity_1> and <entity_2>:
|
|
32
58
|
"{{roi_text}}"
|
|
33
|
-
|
|
34
|
-
Entity 1 full information:
|
|
35
|
-
{{frame_1}}
|
|
36
|
-
|
|
37
|
-
Entity 2 full information:
|
|
38
|
-
{{frame_2}}
|
|
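The binary relation guide above requires the model to answer with a dictionary of the form {"Relation": "<True or False>"}. A minimal sketch of reading that answer into a boolean follows. `parse_binary_relation` is a hypothetical helper, and treating unparseable replies as "no relation" is a design assumption rather than documented package behavior.

```python
import json
import re


def parse_binary_relation(llm_output: str) -> bool:
    """Return True only when the reply contains {"Relation": "True"}."""
    # Non-greedy match of the first {...} pair; sufficient for a flat dictionary.
    match = re.search(r"\{.*?\}", llm_output, flags=re.DOTALL)
    if match is None:
        return False
    try:
        answer = json.loads(match.group(0))
    except json.JSONDecodeError:
        return False
    # Accept both the string "True" and a JSON boolean true.
    return str(answer.get("Relation", "")).strip().lower() == "true"


print(parse_binary_relation('{"Relation": "True"}'))
```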
@@ -1,8 +1,31 @@
|
|
|
1
|
-
Prompt
|
|
2
|
-
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
1
|
+
Prompt Template Design:
|
|
2
|
+
|
|
3
|
+
1. Task description:
|
|
4
|
+
Provide a detailed description of the task, including the background and the type of task (e.g., binary relation extraction). Mention the region of interest (ROI) text.
|
|
5
|
+
2. Schema definition:
|
|
6
|
+
List the criteria for relation (True) and for no relation (False).
|
|
7
|
+
|
|
8
|
+
3. Output format definition:
|
|
9
|
+
This section must include a placeholder "{{pos_rel_types}}" for the possible relation types.
|
|
10
|
+
The output must be a dictionary with a key "RelationType" (i.e., {"RelationType": "<relation type or No Relation>"}).
|
|
11
|
+
|
|
12
|
+
4. (optional) Hints:
|
|
13
|
+
Provide itemized hints for the information extractors to guide the extraction process.
|
|
14
|
+
|
|
15
|
+
5. (optional) Examples:
|
|
16
|
+
Include examples in the format:
|
|
17
|
+
Input: ...
|
|
18
|
+
Output: ...
|
|
19
|
+
|
|
20
|
+
6. Entity 1 full information:
|
|
21
|
+
Include a placeholder in the format {{<frame_1>}}
|
|
22
|
+
|
|
23
|
+
7. Entity 2 full information:
|
|
24
|
+
Include a placeholder in the format {{<frame_2>}}
|
|
25
|
+
|
|
26
|
+
8. Input placeholders:
|
|
27
|
+
The template must include a placeholder "{{roi_text}}" for the ROI text.
|
|
28
|
+
|
|
6
29
|
|
|
7
30
|
|
|
8
31
|
Example:
|
|
@@ -35,12 +58,12 @@ Example:
|
|
|
35
58
|
3. If the strength or frequency is for another medication, output "No Relation".
|
|
36
59
|
4. If the strength or frequency is for the same medication but at a different location (span), output "No Relation".
|
|
37
60
|
|
|
38
|
-
#
|
|
39
|
-
ROI Text with the two entities annotated with <entity_1> and <entity_2>:
|
|
40
|
-
"{{roi_text}}"
|
|
41
|
-
|
|
42
|
-
Entity 1 full information:
|
|
61
|
+
# Entity 1 full information:
|
|
43
62
|
{{frame_1}}
|
|
44
63
|
|
|
45
|
-
Entity 2 full information:
|
|
46
|
-
{{frame_2}}
|
|
64
|
+
# Entity 2 full information:
|
|
65
|
+
{{frame_2}}
|
|
66
|
+
|
|
67
|
+
# Input placeholders
|
|
68
|
+
ROI Text with the two entities annotated with <entity_1> and <entity_2>:
|
|
69
|
+
"{{roi_text}}"
|
|
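For the multi-class guide above, the answer is {"RelationType": "<relation type or No Relation>"}, and the allowed types are supplied through the "{{pos_rel_types}}" placeholder. The sketch below checks an answer against the allowed types. The helper name and the relation type names in the usage line are hypothetical. How the package renders `pos_rel_types` into the prompt is not shown in this diff.

```python
import json
import re
from typing import List, Optional


def parse_relation_type(llm_output: str, pos_rel_types: List[str]) -> Optional[str]:
    """Return the predicted relation type, or None for "No Relation" or invalid output."""
    match = re.search(r"\{.*?\}", llm_output, flags=re.DOTALL)
    if match is None:
        return None
    try:
        answer = json.loads(match.group(0))
    except json.JSONDecodeError:
        return None
    rel_type = answer.get("RelationType")
    # Anything outside the allowed list, including "No Relation", counts as no relation.
    return rel_type if rel_type in pos_rel_types else None


allowed = ["Strength-Drug", "Frequency-Drug"]  # hypothetical relation types
print(parse_relation_type('{"RelationType": "Strength-Drug"}', allowed))
```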
@@ -1,11 +1,27 @@
|
|
|
1
|
-
Prompt
|
|
2
|
-
1. Task description
|
|
3
|
-
2. Schema definition
|
|
4
|
-
3. Output format definition
|
|
5
|
-
4. Additional hints
|
|
6
|
-
5. Input placeholder
|
|
1
|
+
Prompt Template Design:
|
|
7
2
|
|
|
8
|
-
|
|
3
|
+
1. Task Description:
|
|
4
|
+
Provide a detailed description of the task, including the background and the type of task (e.g., named entity recognition).
|
|
5
|
+
|
|
6
|
+
2. Schema Definition:
|
|
7
|
+
List the key concepts that should be extracted, and provide clear definitions for each one.
|
|
8
|
+
|
|
9
|
+
3. Output Format Definition:
|
|
10
|
+
The output should be a JSON list, where each element is a dictionary representing a frame (an entity along with its attributes). Each dictionary must include a key that holds the entity text. This key can be named "entity_text" or anything else, depending on the context. The attributes can either be flat (e.g., {"entity_text": "<entity_text>", "attr1": "<attr1>", "attr2": "<attr2>"}) or nested (e.g., {"entity_text": "<entity_text>", "attributes": {"attr1": "<attr1>", "attr2": "<attr2>"}}).
|
|
11
|
+
|
|
12
|
+
4. Optional: Hints:
|
|
13
|
+
Provide itemized hints for the information extractors to guide the extraction process.
|
|
14
|
+
|
|
15
|
+
5. Optional: Examples:
|
|
16
|
+
Include examples in the format:
|
|
17
|
+
Input: ...
|
|
18
|
+
Output: ...
|
|
19
|
+
|
|
20
|
+
6. Input Placeholder:
|
|
21
|
+
The template must include a placeholder in the format {{<placeholder_name>}} for the input text. The placeholder name can be customized as needed.
|
|
22
|
+
|
|
23
|
+
|
|
24
|
+
Example 1 (single entity type with attributes):
|
|
9
25
|
|
|
10
26
|
# Task description
|
|
11
27
|
The paragraph below is from the Food and Drug Administration (FDA) Clinical Pharmacology Section of Labeling for Human Prescription Drug and Biological Products, Adverse reactions section. Please carefully review it and extract the adverse reactions and percentages. Note that each adverse reaction is nested under a clinical trial and potentially an arm. Your output should take that into consideration.
|
|
@@ -33,3 +49,97 @@ Example:
|
|
|
33
49
|
Below is the Adverse reactions section:
|
|
34
50
|
{{input}}
|
|
35
51
|
|
|
52
|
+
|
|
53
|
+
Example 2 (multiple entity types):
|
|
54
|
+
|
|
55
|
+
# Task description
|
|
56
|
+
This is a named entity recognition task. Given a medical note, annotate the Drug, Form, Strength, Frequency, Route, Dosage, Reason, ADE, and Duration.
|
|
57
|
+
|
|
58
|
+
# Schema definition
|
|
59
|
+
Your output should contain:
|
|
60
|
+
"entity_text": the exact wording as mentioned in the note.
|
|
61
|
+
"entity_type": type of the entity. It should be one of the "Drug", "Form", "Strength", "Frequency", "Route", "Dosage", "Reason", "ADE", or "Duration".
|
|
62
|
+
|
|
63
|
+
# Output format definition
|
|
64
|
+
Your output should follow JSON format,
|
|
65
|
+
if there is at least one mention of a Drug, Form, Strength, Frequency, Route, Dosage, Reason, ADE, or Duration entity:
|
|
66
|
+
[{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "<entity type as listed above>"},
|
|
67
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "<entity type as listed above>"}]
|
|
68
|
+
if there is no entity mentioned in the given note, just output an empty list:
|
|
69
|
+
[]
|
|
70
|
+
|
|
71
|
+
I am only interested in the extracted contents in []. Do NOT explain your answer.
|
|
72
|
+
|
|
73
|
+
# Examples
|
|
74
|
+
Below are some examples:
|
|
75
|
+
|
|
76
|
+
Input: Acetaminophen 650 mg PO BID 5.
|
|
77
|
+
Output: [{"entity_text": "Acetaminophen", "entity_type": "Drug"}, {"entity_text": "650 mg", "entity_type": "Strength"}, {"entity_text": "PO", "entity_type": "Route"}, {"entity_text": "BID", "entity_type": "Frequency"}]
|
|
78
|
+
|
|
79
|
+
Input: Mesalamine DR 1200 mg PO BID 2.
|
|
80
|
+
Output: [{"entity_text": "Mesalamine DR", "entity_type": "Drug"}, {"entity_text": "1200 mg", "entity_type": "Strength"}, {"entity_text": "BID", "entity_type": "Frequency"}, {"entity_text": "PO", "entity_type": "Route"}]
|
|
81
|
+
|
|
82
|
+
|
|
83
|
+
# Input placeholder
|
|
84
|
+
Below is the medical note:
|
|
85
|
+
"{{input}}"
|
|
86
|
+
|
|
87
|
+
|
|
88
|
+
Example 3 (multiple entity types with corresponding attributes):
|
|
89
|
+
|
|
90
|
+
# Task description
|
|
91
|
+
This is a named entity recognition task. Given a medical note, annotate the events (EVENT) and time expressions (TIMEX3):
|
|
92
|
+
|
|
93
|
+
# Schema definition
|
|
94
|
+
Your output should contain:
|
|
95
|
+
"entity_text": the exact wording as mentioned in the note.
|
|
96
|
+
"entity_type": type of the entity. It should be one of the "EVENT" or "TIMEX3".
|
|
97
|
+
if entity_type is "EVENT",
|
|
98
|
+
"type": the event type as one of the "TEST", "PROBLEM", "TREATMENT", "CLINICAL_DEPT", "EVIDENTIAL", or "OCCURRENCE".
|
|
99
|
+
"polarity": whether an EVENT is positive ("POS") or negative ("NAG"). For example, in “the patient reports headache, and denies chills”, the EVENT [headache] is positive in its polarity, and the EVENT [chills] is negative in its polarity.
|
|
100
|
+
"modality": whether an EVENT actually occurred or not. Must be one of the "FACTUAL", "CONDITIONAL", "POSSIBLE", or "PROPOSED".
|
|
101
|
+
|
|
102
|
+
if entity_type is "TIMEX3",
|
|
103
|
+
"type": the type as one of the "DATE", "TIME", "DURATION", or "FREQUENCY".
|
|
104
|
+
"val": the numeric value 1) DATE: [YYYY]-[MM]-[DD], 2) TIME: [hh]:[mm]:[ss], 3) DURATION: P[n][Y/M/W/D]. So, “for eleven days” will be
|
|
105
|
+
represented as “P11D”, meaning a period of 11 days. 4) FREQUENCY: R[n][duration], where n denotes the number of repeats. When n is omitted, the expression denotes an unspecified number of repeats. For example, “once a day for 3 days” is “R3P1D” (repeat the 1-day interval (P1D) 3 times (R3)), and “twice every day” is “RP12H” (repeat every 12 hours).
|
|
106
|
+
"mod": additional information regarding the temporal value of a time expression. Must be one of the:
|
|
107
|
+
“NA”: the default value, no relevant modifier is present;
|
|
108
|
+
“MORE”, means “more than”, e.g. over 2 days (val = P2D, mod = MORE);
|
|
109
|
+
“LESS”, means “less than”, e.g. almost 2 months (val = P2M, mod=LESS);
|
|
110
|
+
“APPROX”, means “approximate”, e.g. nearly a week (val = P1W, mod=APPROX);
|
|
111
|
+
“START”, describes the beginning of a period of time, e.g. Christmas morning, 2005 (val= 2005-12-25, mod= START).
|
|
112
|
+
“END”, describes the end of a period of time, e.g. late last year, (val = 2010, mod = END)
|
|
113
|
+
“MIDDLE”, describes the middle of a period of time, e.g. mid-September 2001 (val = 2001-09, mod = MIDDLE)
|
|
114
|
+
|
|
115
|
+
# Output format definition
|
|
116
|
+
Your output should follow JSON format,
|
|
117
|
+
if there is at least one EVENT or TIMEX3 entity mention:
|
|
118
|
+
[
|
|
119
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "EVENT", "type": "<event type>", "polarity": "<event polarity>", "modality": "<event modality>"},
|
|
120
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "TIMEX3", "type": "<TIMEX3 type>", "val": "<time value>", "mod": "<additional information>"}
|
|
121
|
+
...
|
|
122
|
+
]
|
|
123
|
+
if there is no entity mentioned in the given note, just output an empty list:
|
|
124
|
+
[]
|
|
125
|
+
|
|
126
|
+
I am only interested in the extracted contents in []. Do NOT explain your answer.
|
|
127
|
+
|
|
128
|
+
# Examples
|
|
129
|
+
Below are some examples:
|
|
130
|
+
|
|
131
|
+
Input: At 9/7/93 , 1:00 a.m. , intravenous fluids rate was decreased to 50 cc's per hour , total fluids given during the first 24 hours were 140 to 150 cc's per kilo per day .
|
|
132
|
+
Output: [{"entity_text": "intravenous fluids", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
133
|
+
{"entity_text": "decreased", "entity_type": "EVENT", "type": "OCCURRENCE", "polarity": "POS", "modality": "FACTUAL"},
|
|
134
|
+
{"entity_text": "total fluids", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
135
|
+
{"entity_text": "9/7/93 , 1:00 a.m.", "entity_type": "TIMEX3", "type": "TIME", "val": "1993-09-07T01:00", "mod": "NA"},
|
|
136
|
+
{"entity_text": "24 hours", "entity_type": "TIMEX3", "type": "DURATION", "val": "PT24H", "mod": "NA"}]
|
|
137
|
+
|
|
138
|
+
Input: At that time it appeared well adhered to the underlying skin .
|
|
139
|
+
Output: [{"entity_text": "it", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
140
|
+
{"entity_text": "well adhered", "entity_type": "EVENT", "type": "OCCURRENCE", "polarity": "POS", "modality": "FACTUAL"}]
|
|
141
|
+
|
|
142
|
+
|
|
143
|
+
# Input placeholder
|
|
144
|
+
Below is the entire medical note:
|
|
145
|
+
"{{input}}"
|
|
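The TIMEX3 "val" encodings described in Example 3 above (durations such as "P11D" and frequencies such as "R3P1D") can be checked mechanically. The sketch below is an assumption-laden illustration, not part of the package. The guide lists only Y/M/W/D as duration units, but its own examples use hours ("PT24H", "RP12H"), so these hypothetical parsers also accept H and an optional T.

```python
import re
from typing import Optional, Tuple

# P, optional T, a count, then a unit letter (H added because the examples use hours).
_DURATION = re.compile(r"^PT?(\d+)([YMWDH])$")
# R, an optional repeat count, then a duration expression.
_FREQUENCY = re.compile(r"^R(\d*)(PT?\d+[YMWDH])$")


def parse_duration(val: str) -> Tuple[int, str]:
    """Split a duration such as 'P11D' into (count, unit)."""
    match = _DURATION.match(val)
    if match is None:
        raise ValueError(f"not a duration value: {val!r}")
    return int(match.group(1)), match.group(2)


def parse_frequency(val: str) -> Tuple[Optional[int], str]:
    """Split a frequency such as 'R3P1D' into (repeats, duration)."""
    match = _FREQUENCY.match(val)
    if match is None:
        raise ValueError(f"not a frequency value: {val!r}")
    # An omitted repeat count means an unspecified number of repeats.
    repeats = int(match.group(1)) if match.group(1) else None
    return repeats, match.group(2)


print(parse_duration("P11D"))
print(parse_frequency("R3P1D"))
```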
@@ -0,0 +1,217 @@
|
|
|
1
|
+
Prompt Template Design:
|
|
2
|
+
|
|
3
|
+
1. Task Description:
|
|
4
|
+
Provide a detailed description of the task, including the background and the type of task (e.g., named entity recognition).
|
|
5
|
+
|
|
6
|
+
2. Schema Definition:
|
|
7
|
+
List the key concepts that should be extracted, and provide clear definitions for each one.
|
|
8
|
+
|
|
9
|
+
3. Thinking process:
|
|
10
|
+
Provide clear step-by-step instructions for analyzing the input text. Typically, this process should begin with an analysis section and proceed to the output generation. Each section should have a specific purpose:
|
|
11
|
+
|
|
12
|
+
Optional: Recall Section (<Recall>... </Recall>):
|
|
13
|
+
Write a brief recall of the task description and schema definition for better understanding of the task.
|
|
14
|
+
|
|
15
|
+
Analysis Section (<Analysis>... </Analysis>):
|
|
16
|
+
Break down the input text to identify important medical contents and clarify ambiguous concepts.
|
|
17
|
+
|
|
18
|
+
Output Section (<Outputs>... </Outputs>):
|
|
19
|
+
Based on the analysis, generate the required output in the defined format. Ensure that the extracted information adheres to the schema and task description.
|
|
20
|
+
|
|
21
|
+
4. Output Format Definition:
|
|
22
|
+
The output should be a JSON list, where each element is a dictionary representing a frame (an entity along with its attributes). Each dictionary must include a key that holds the entity text. This key can be named "entity_text" or anything else, depending on the context. The attributes can either be flat (e.g., {"entity_text": "<entity_text>", "attr1": "<attr1>", "attr2": "<attr2>"}) or nested (e.g., {"entity_text": "<entity_text>", "attributes": {"attr1": "<attr1>", "attr2": "<attr2>"}}).
|
|
23
|
+
|
|
24
|
+
5. Optional: Hints:
|
|
25
|
+
Provide itemized hints for the information extractors to guide the extraction process.
|
|
26
|
+
|
|
27
|
+
6. Optional: Examples:
|
|
28
|
+
Include examples in the format:
|
|
29
|
+
Input: ...
|
|
30
|
+
Output: ...
|
|
31
|
+
|
|
32
|
+
7. Input Placeholder:
|
|
33
|
+
The template must include a placeholder in the format {{<placeholder_name>}} for the input text. The placeholder name can be customized as needed.
|
|
34
|
+
|
|
35
|
+
|
|
36
|
+
Example 1 (single entity type with attributes):
|
|
37
|
+
|
|
38
|
+
# Task description
|
|
39
|
+
The paragraph below is from the Food and Drug Administration (FDA) Clinical Pharmacology Section of Labeling for Human Prescription Drug and Biological Products, Adverse reactions section. Please carefully review it and extract the adverse reactions and percentages. Note that each adverse reaction is nested under a clinical trial and potentially an arm. Your output should take that into consideration.
|
|
40
|
+
|
|
41
|
+
# Schema definition
|
|
42
|
+
Your output should contain:
|
|
43
|
+
"ClinicalTrial" which is the name of the trial,
|
|
44
|
+
If applicable, "Arm" which is the arm within the clinical trial,
|
|
45
|
+
"AdverseReaction" which is the name of the adverse reaction,
|
|
46
|
+
If applicable, "Percentage" which is the occurrence of the adverse reaction within the trial and arm,
|
|
47
|
+
"Evidence" which is the EXACT sentence in the text where you found the AdverseReaction from
|
|
48
|
+
|
|
49
|
+
# Thinking process
|
|
50
|
+
Approach this task step by step. Start with a recall section (<Recall>... </Recall>) that briefly summarizes the task description and schema definition for better understanding of the task. Then write an analysis section (<Analysis>... </Analysis>) to analyze the input sentence. Identify important pharmacology contents and clarify ambiguous concepts. Finally, write the output section (<Outputs>... </Outputs>) that lists your final outputs following the defined format.
|
|
51
|
+
|
|
52
|
+
# Output format definition
|
|
53
|
+
Your output should follow JSON format, for example:
|
|
54
|
+
[
|
|
55
|
+
{"ClinicalTrial": "<Clinical trial name or number>", "Arm": "<name of arm>", "AdverseReaction": "<Adverse reaction text>", "Percentage": "<a percent>", "Evidence": "<exact sentence from the text>"},
|
|
56
|
+
{"ClinicalTrial": "<Clinical trial name or number>", "Arm": "<name of arm>", "AdverseReaction": "<Adverse reaction text>", "Percentage": "<a percent>", "Evidence": "<exact sentence from the text>"}
|
|
57
|
+
]
|
|
58
|
+
|
|
59
|
+
# Additional hints
|
|
60
|
+
Your output should be 100% based on the provided content. DO NOT output fake numbers.
|
|
61
|
+
If there is no specific arm, just omit the "Arm" key. If the percentage is not reported, just omit the "Percentage" key. The "Evidence" should always be provided.
|
|
62
|
+
|
|
63
|
+
# Input placeholder
|
|
64
|
+
Below is the Adverse reactions section for your reference. I will feed you with sentences from it one by one.
|
|
65
|
+
{{input}}
|
|
66
|
+
|
|
67
|
+
|
|
68
|
+
Example 2 (multiple entity types):
|
|
69
|
+
|
|
70
|
+
# Task description
|
|
71
|
+
This is a named entity recognition task. Given a sentence from a medical note, annotate the Drug, Form, Strength, Frequency, Route, Dosage, Reason, ADE, and Duration.
|
|
72
|
+
|
|
73
|
+
# Schema definition
|
|
74
|
+
Your output should contain:
|
|
75
|
+
"entity_text": the exact wording as mentioned in the note.
|
|
76
|
+
"entity_type": type of the entity. It should be one of the "Drug", "Form", "Strength", "Frequency", "Route", "Dosage", "Reason", "ADE", or "Duration".
|
|
77
|
+
|
|
78
|
+
# Thinking process
|
|
79
|
+
Approach this task step by step. Start with an analysis section (<Analysis>... </Analysis>) to analyze the input sentence. Identify important medical contents and clarify ambiguous concepts. Then, write the output section (<Outputs>... </Outputs>) that lists your final outputs following the defined format.
|
|
80
|
+
|
|
81
|
+
# Output format definition
|
|
82
|
+
Your output should follow JSON format,
|
|
83
|
+
if there is at least one mention of a Drug, Form, Strength, Frequency, Route, Dosage, Reason, ADE, or Duration entity:
|
|
84
|
+
[{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "<entity type as listed above>"},
|
|
85
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "<entity type as listed above>"}]
|
|
86
|
+
if there is no entity mentioned in the given note, just output an empty list:
|
|
87
|
+
[]
|
|
88
|
+
|
|
89
|
+
# Examples
|
|
90
|
+
Below are some examples:
|
|
91
|
+
|
|
92
|
+
Input: Acetaminophen 650 mg PO BID 5.
|
|
93
|
+
Output:
|
|
94
|
+
<Analysis>
|
|
95
|
+
The sentence "Acetaminophen 650 mg PO BID 5." contains several potential medical entities.
|
|
96
|
+
|
|
97
|
+
"Acetaminophen" is a Drug.
|
|
98
|
+
"650 mg" represents the Strength.
|
|
99
|
+
"PO" is the Route (meaning by mouth).
|
|
100
|
+
"BID" stands for a dosing frequency, which represents Frequency (meaning twice a day).
|
|
101
|
+
</Analysis>
|
|
102
|
+
|
|
103
|
+
<Outputs>
|
|
104
|
+
[{"entity_text": "Acetaminophen", "entity_type": "Drug"}, {"entity_text": "650 mg", "entity_type": "Strength"}, {"entity_text": "PO", "entity_type": "Route"}, {"entity_text": "BID", "entity_type": "Frequency"}]
|
|
105
|
+
</Outputs>
|
|
106
|
+
|
|
107
|
+
Input: Mesalamine DR 1200 mg PO BID 2.
|
|
108
|
+
Output:
|
|
109
|
+
<Analysis>
|
|
110
|
+
The sentence "Mesalamine DR 1200 mg PO BID 2." contains the following medical entities:
|
|
111
|
+
|
|
112
|
+
"Mesalamine" is a Drug.
|
|
113
|
+
"DR" stands for Form (delayed-release).
|
|
114
|
+
"1200 mg" represents the Strength.
|
|
115
|
+
"PO" is the Route (by mouth).
|
|
116
|
+
"BID" is the Frequency (twice a day).
|
|
117
|
+
</Analysis>
|
|
118
|
+
|
|
119
|
+
<Outputs>
|
|
120
|
+
[{"entity_text": "Mesalamine DR", "entity_type": "Drug"}, {"entity_text": "1200 mg", "entity_type": "Strength"}, {"entity_text": "BID", "entity_type": "Frequency"}, {"entity_text": "PO", "entity_type": "Route"}]
|
|
121
|
+
</Outputs>
|
|
122
|
+
|
|
123
|
+
# Input placeholder
|
|
124
|
+
Below is the medical note for your reference. I will feed you sentences from it one by one.
|
|
125
|
+
"{{input}}"
|
|
126
|
+
|
|
127
|
+
|
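The Example 2 prompt asks the model to reply with an <Analysis> section followed by an <Outputs> section holding a JSON list. A minimal sketch of how such a reply could be post-processed is shown here. The helper name `parse_cot_response` is hypothetical and is not taken from llm-ie; the library's own parsing may differ.

```python
import json
import re


def parse_cot_response(response: str) -> list:
    """Return the entity dicts found in the <Outputs> section of a reply.

    Hypothetical helper for illustration only. Returns an empty list when
    the section is missing or does not contain a JSON list.
    """
    match = re.search(r"<Outputs>(.*?)</Outputs>", response, flags=re.DOTALL)
    if match is None:
        return []
    try:
        frames = json.loads(match.group(1).strip())
    except json.JSONDecodeError:
        return []
    if not isinstance(frames, list):
        return []
    # Keep only well-formed entries that carry the required key.
    return [f for f in frames if isinstance(f, dict) and "entity_text" in f]


reply = (
    "<Analysis>\n\"Acetaminophen\" is a Drug.\n</Analysis>\n\n"
    "<Outputs>\n"
    '[{"entity_text": "Acetaminophen", "entity_type": "Drug"}, '
    '{"entity_text": "650 mg", "entity_type": "Strength"}]\n'
    "</Outputs>"
)
for frame in parse_cot_response(reply):
    print(frame["entity_type"], "->", frame["entity_text"])
```

The analysis text is ignored on purpose: only the <Outputs> section is meant to be machine-readable.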
|
128
|
+
Example 3 (multiple entity types with corresponding attributes):
|
|
129
|
+
|
|
130
|
+
# Task description
|
|
131
|
+
This is a named entity recognition task. Given a sentence from a medical note, annotate the events (EVENT) and time expressions (TIMEX3):
|
|
132
|
+
|
|
133
|
+
# Schema definition
|
|
134
|
+
Your output should contain:
|
|
135
|
+
"entity_text": the exact wording as mentioned in the note.
|
|
136
|
+
"entity_type": type of the entity. It should be either "EVENT" or "TIMEX3".
|
|
137
|
+
if entity_type is "EVENT",
|
|
138
|
+
"type": the event type as one of the "TEST", "PROBLEM", "TREATMENT", "CLINICAL_DEPT", "EVIDENTIAL", or "OCCURRENCE".
|
|
139
|
+
"polarity": whether an EVENT is positive ("POS") or negative ("NEG"). For example, in “the patient reports headache, and denies chills”, the EVENT [headache] is positive in its polarity, and the EVENT [chills] is negative in its polarity.
|
|
140
|
+
"modality": whether an EVENT actually occurred or not. Must be one of the "FACTUAL", "CONDITIONAL", "POSSIBLE", or "PROPOSED".
|
|
141
|
+
|
|
142
|
+
if entity_type is "TIMEX3",
|
|
143
|
+
"type": the type as one of the "DATE", "TIME", "DURATION", or "FREQUENCY".
|
|
144
|
+
"val": the numeric value 1) DATE: [YYYY]-[MM]-[DD], 2) TIME: [hh]:[mm]:[ss], 3) DURATION: P[n][Y/M/W/D]. So, “for eleven days” will be
|
|
145
|
+
represented as “P11D”, meaning a period of 11 days. 4) FREQUENCY: R[n][duration], where n denotes the number of repeats. When n is omitted, the expression denotes an unspecified number of repeats. For example, “once a day for 3 days” is “R3P1D” (repeat the time interval of 1 day (P1D) 3 times (R3)), and “twice every day” is “RP12H” (repeat every 12 hours).
|
|
146
|
+
"mod": additional information regarding the temporal value of a time expression. Must be one of:
|
|
147
|
+
“NA”: the default value, no relevant modifier is present;
|
|
148
|
+
“MORE”, means “more than”, e.g. over 2 days (val = P2D, mod = MORE);
|
|
149
|
+
“LESS”, means “less than”, e.g. almost 2 months (val = P2M, mod=LESS);
|
|
150
|
+
“APPROX”, means “approximate”, e.g. nearly a week (val = P1W, mod=APPROX);
|
|
151
|
+
“START”, describes the beginning of a period of time, e.g. Christmas morning, 2005 (val= 2005-12-25, mod= START).
|
|
152
|
+
“END”, describes the end of a period of time, e.g. late last year, (val = 2010, mod = END)
|
|
153
|
+
“MIDDLE”, describes the middle of a period of time, e.g. mid-September 2001 (val = 2001-09, mod = MIDDLE)
|
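The "val" encodings for DURATION and FREQUENCY in the schema above can be sketched as two small string builders. The function names are hypothetical and exist only to illustrate the notation; they reproduce the examples given in the schema ("P11D", "R3P1D", "RP12H") and nothing more.

```python
from typing import Optional


def duration_val(n: int, unit: str) -> str:
    """Build a DURATION value of the form P[n][Y/M/W/D]."""
    if unit not in {"Y", "M", "W", "D"}:
        raise ValueError(f"unsupported unit: {unit!r}")
    return f"P{n}{unit}"


def frequency_val(duration: str, repeats: Optional[int] = None) -> str:
    """Build a FREQUENCY value of the form R[n][duration].

    When repeats is None, n is omitted, which the schema reads as an
    unspecified number of repeats.
    """
    n = "" if repeats is None else str(repeats)
    return f"R{n}{duration}"


print(duration_val(11, "D"))                   # "for eleven days"
print(frequency_val(duration_val(1, "D"), 3))  # "once a day for 3 days"
print(frequency_val("P12H"))                   # "twice every day"
```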
|
154
|
+
|
|
155
|
+
# Thinking process
|
|
156
|
+
Approach this task step by step. Start with a recall section (<Recall>... </Recall>) that briefly summarizes the task description and schema definition for better understanding of the task. Follow it with an analysis section (<Analysis>... </Analysis>) to analyze the input sentence. Identify important medical contents and clarify ambiguous concepts. Then, write the output section (<Outputs>... </Outputs>) that lists your final outputs following the defined format.
|
|
157
|
+
|
|
158
|
+
# Output format definition
|
|
159
|
+
Your output should follow JSON format,
|
|
160
|
+
if there is at least one EVENT or TIMEX3 entity mention:
|
|
161
|
+
[
|
|
162
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "EVENT", "type": "<event type>", "polarity": "<event polarity>", "modality": "<event modality>"},
|
|
163
|
+
{"entity_text": "<Exact entity mentions as in the note>", "entity_type": "TIMEX3", "type": "<TIMEX3 type>", "val": "<time value>", "mod": "<additional information>"}
|
|
164
|
+
...
|
|
165
|
+
]
|
|
166
|
+
if there is no entity mentioned in the given note, just output an empty list:
|
|
167
|
+
[]
|
|
168
|
+
|
|
169
|
+
|
|
170
|
+
# Examples
|
|
171
|
+
Below are some examples:
|
|
172
|
+
|
|
173
|
+
Input: At 9/7/93 , 1:00 a.m. , intravenous fluids rate was decreased to 50 cc's per hour , total fluids given during the first 24 hours were 140 to 150 cc's per kilo per day .
|
|
174
|
+
Output:
|
|
175
|
+
<Recall>
|
|
176
|
+
This is a named entity recognition task that focuses on extracting medical events (EVENT) and time expressions (TIMEX3). Events are categorized by their type (TEST, PROBLEM, TREATMENT, etc.), polarity (POS or NEG), and modality (FACTUAL, CONDITIONAL, POSSIBLE, or PROPOSED). Time expressions are identified as either DATE, TIME, DURATION, or FREQUENCY and include specific values or modifiers where applicable.
|
|
177
|
+
</Recall>
|
|
178
|
+
|
|
179
|
+
<Analysis>
|
|
180
|
+
In this sentence:
|
|
181
|
+
|
|
182
|
+
"9/7/93" represents a TIMEX3 entity for the date.
|
|
183
|
+
"1:00 a.m." is a TIMEX3 entity representing the time.
|
|
184
|
+
"first 24 hours" refers to a TIMEX3 entity of duration.
|
|
185
|
+
"intravenous fluids rate was decreased" is an EVENT referring to a TREATMENT event with a negative polarity (as it was "decreased") and a FACTUAL modality (it actually happened).
|
|
186
|
+
"total fluids given during the first 24 hours" is another EVENT representing a TREATMENT that is FACTUAL in its modality.
|
|
187
|
+
</Analysis>
|
|
188
|
+
|
|
189
|
+
<Outputs>
|
|
190
|
+
[{"entity_text": "intravenous fluids", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
191
|
+
{"entity_text": "decreased", "entity_type": "EVENT", "type": "OCCURRENCE", "polarity": "POS", "modality": "FACTUAL"},
|
|
192
|
+
{"entity_text": "total fluids", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
193
|
+
{"entity_text": "9/7/93 , 1:00 a.m.", "entity_type": "TIMEX3", "type": "TIME", "val": "1993-09-07T01:00", "mod": "NA"},
|
|
194
|
+
{"entity_text": "24 hours", "entity_type": "TIMEX3", "type": "DURATION", "val": "PT24H", "mod": "NA"}]
|
|
195
|
+
</Outputs>
|
|
196
|
+
|
|
197
|
+
Input: At that time it appeared well adhered to the underlying skin .
|
|
198
|
+
Output:
|
|
199
|
+
<Recall>
|
|
200
|
+
This is a named entity recognition task focused on extracting medical events (EVENT) and time expressions (TIMEX3). Events are categorized by their type (e.g., TEST, PROBLEM, TREATMENT), polarity (POS or NEG), and modality (FACTUAL, CONDITIONAL, POSSIBLE, or PROPOSED). Time expressions are categorized as DATE, TIME, DURATION, or FREQUENCY, and include values or modifiers where applicable.
|
|
201
|
+
</Recall>
|
|
202
|
+
|
|
203
|
+
<Analysis>
|
|
204
|
+
In this sentence:
|
|
205
|
+
|
|
206
|
+
"At that time" refers to a TIMEX3 entity that is vague, so it can be considered as a TIME with an unspecified value.
|
|
207
|
+
"appeared well adhered to the underlying skin" describes an EVENT that likely indicates a PROBLEM (the condition of the skin) and has a POS polarity (since it is "well adhered") with a FACTUAL modality (it actually occurred).
|
|
208
|
+
</Analysis>
|
|
209
|
+
|
|
210
|
+
<Outputs>
|
|
211
|
+
[{"entity_text": "it", "entity_type": "EVENT", "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"},
|
|
212
|
+
{"entity_text": "well adhered", "entity_type": "EVENT", "type": "OCCURRENCE", "polarity": "POS", "modality": "FACTUAL"}]
|
|
213
|
+
</Outputs>
|
|
214
|
+
|
|
215
|
+
# Input placeholder
|
|
216
|
+
Below is the entire medical note for your reference. I will feed you sentences from it one by one.
|
|
217
|
+
"{{input}}"
|
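Because Example 3 attaches different attributes to EVENT and TIMEX3 entities, it can help to check each extracted frame against the schema before using it. The following is a minimal sketch under stated assumptions: `is_valid_frame` is a hypothetical name, not an llm-ie function, and the negative polarity label is assumed to be "NEG", as written in the <Recall> sections.

```python
EVENT_TYPES = {"TEST", "PROBLEM", "TREATMENT", "CLINICAL_DEPT",
               "EVIDENTIAL", "OCCURRENCE"}
POLARITIES = {"POS", "NEG"}  # assumption: "NEG" as in the <Recall> text
MODALITIES = {"FACTUAL", "CONDITIONAL", "POSSIBLE", "PROPOSED"}
TIMEX3_TYPES = {"DATE", "TIME", "DURATION", "FREQUENCY"}
MODS = {"NA", "MORE", "LESS", "APPROX", "START", "END", "MIDDLE"}


def is_valid_frame(frame: dict) -> bool:
    """Check one extracted frame against the Example 3 schema."""
    if not frame.get("entity_text"):
        return False
    entity_type = frame.get("entity_type")
    if entity_type == "EVENT":
        return (frame.get("type") in EVENT_TYPES
                and frame.get("polarity") in POLARITIES
                and frame.get("modality") in MODALITIES)
    if entity_type == "TIMEX3":
        return (frame.get("type") in TIMEX3_TYPES
                and isinstance(frame.get("val"), str)
                and frame.get("mod") in MODS)
    return False


event = {"entity_text": "total fluids", "entity_type": "EVENT",
         "type": "TREATMENT", "polarity": "POS", "modality": "FACTUAL"}
timex = {"entity_text": "24 hours", "entity_type": "TIMEX3",
         "type": "DURATION", "val": "PT24H", "mod": "NA"}
print(is_valid_frame(event), is_valid_frame(timex))
```

The check only covers allowed labels and required keys; it does not verify that "val" follows the date, time, duration, or frequency notation.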