llm-ie 0.3.5__py3-none-any.whl → 0.4.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.1
2
2
  Name: llm-ie
3
- Version: 0.3.5
3
+ Version: 0.4.0
4
4
  Summary: An LLM-powered tool that transforms everyday language into robust information extraction pipelines.
5
5
  License: MIT
6
6
  Author: Enshuo (David) Hsu
@@ -24,10 +24,20 @@ An LLM-powered tool that transforms everyday language into robust information ex
24
24
 
25
25
  | Features | Support |
26
26
  |----------|----------|
27
- | **LLM Agent for prompt writing** | :white_check_mark: Interactive chat, Python functions |
27
+ | **LLM Agent for prompt writing** | :white_check_mark: Interactive chat, Python functions |
28
28
  | **Named Entity Recognition (NER)** | :white_check_mark: Document-level, Sentence-level |
29
29
  | **Entity Attributes Extraction** | :white_check_mark: Flexible formats |
30
30
  | **Relation Extraction (RE)** | :white_check_mark: Binary & Multiclass relations |
31
+ | **Visualization** | :white_check_mark: Built-in entity & relation visualization |
32
+
33
+ ## Recent Updates
34
+ - [v0.3.0](https://github.com/daviden1013/llm-ie/releases/tag/v0.3.0) (Oct 17, 2024): Added interactive chat to the Prompt Editor LLM agent.
35
+ - [v0.3.1](https://github.com/daviden1013/llm-ie/releases/tag/v0.3.1) (Oct 26, 2024): Added the Sentence Review Frame Extractor and Sentence CoT Frame Extractor.
36
+ - [v0.3.4](https://github.com/daviden1013/llm-ie/releases/tag/v0.3.4) (Nov 24, 2024): Added entity fuzzy search.
37
+ - [v0.3.5](https://github.com/daviden1013/llm-ie/releases/tag/v0.3.5) (Nov 27, 2024): Adopted `json_repair` to fix broken JSON from LLM outputs.
38
+ - v0.4.0:
39
+   - Concurrent LLM inference to speed up frame and relation extraction.
40
+   - Support for LiteLLM.
31
41
 
32
42
  ## Table of Contents
33
43
  - [Overview](#overview)
@@ -38,10 +48,13 @@ An LLM-powered tool that transforms everyday language into robust information ex
38
48
  - [User Guide](#user-guide)
39
49
  - [LLM Inference Engine](#llm-inference-engine)
40
50
  - [Prompt Template](#prompt-template)
41
- - [Prompt Editor](#prompt-editor)
51
+ - [Prompt Editor LLM Agent](#prompt-editor-llm-agent)
42
52
  - [Extractor](#extractor)
43
53
  - [FrameExtractor](#frameextractor)
44
54
  - [RelationExtractor](#relationextractor)
55
+ - [Visualization](#visualization)
56
+ - [Benchmarks](#benchmarks)
57
+ - [Citation](#citation)
45
58
 
46
59
  ## Overview
47
60
  LLM-IE is a toolkit that provides robust information extraction utilities for named entity, entity attributes, and entity relation extraction. Since prompt design has a significant impact on generative information extraction with LLMs, it has a built-in LLM agent ("editor") to help with prompt writing. The flowchart below demonstrates the workflow starting from a casual language request to output visualization.
@@ -49,7 +62,7 @@ LLM-IE is a toolkit that provides robust information extraction utilities for na
49
62
  <div align="center"><img src="doc_asset/readme_img/LLM-IE flowchart.png" width=800 ></div>
50
63
 
51
64
  ## Prerequisite
52
- At least one LLM inference engine is required. There are built-in supports for 🦙 [Llama-cpp-python](https://github.com/abetlen/llama-cpp-python), <img src="https://avatars.githubusercontent.com/u/151674099?s=48&v=4" alt="Icon" width="20"/> [Ollama](https://github.com/ollama/ollama), 🤗 [Huggingface_hub](https://github.com/huggingface/huggingface_hub), <img src=doc_asset/readme_img/openai-logomark.png width=16 /> [OpenAI API](https://platform.openai.com/docs/api-reference/introduction), and <img src=doc_asset/readme_img/vllm-logo.png width=20 /> [vLLM](https://github.com/vllm-project/vllm). For installation guides, please refer to those projects. Other inference engines can be configured through the [InferenceEngine](src/llm_ie/engines.py) abstract class. See [LLM Inference Engine](#llm-inference-engine) section below.
65
+ At least one LLM inference engine is required. There is built-in support for 🚅 [LiteLLM](https://github.com/BerriAI/litellm), 🦙 [Llama-cpp-python](https://github.com/abetlen/llama-cpp-python), <img src="doc_asset/readme_img/ollama_icon_small.png" alt="Icon" width="18"/> [Ollama](https://github.com/ollama/ollama), 🤗 [Huggingface_hub](https://github.com/huggingface/huggingface_hub), <img src=doc_asset/readme_img/openai-logomark.png width=16 /> [OpenAI API](https://platform.openai.com/docs/api-reference/introduction), and <img src=doc_asset/readme_img/vllm-logo_small.png width=20 /> [vLLM](https://github.com/vllm-project/vllm). For installation guides, please refer to those projects. Other inference engines can be configured through the [InferenceEngine](src/llm_ie/engines.py) abstract class. See the [LLM Inference Engine](#llm-inference-engine) section below.
53
66
 
54
67
  ## Installation
55
68
  The Python package is available on PyPI.
@@ -65,22 +78,23 @@ We use a [synthesized medical note](demo/document/synthesized_note.txt) by ChatG
65
78
  Choose one of the built-in engines below.
66
79
 
67
80
  <details>
68
- <summary><img src="https://avatars.githubusercontent.com/u/151674099?s=48&v=4" alt="Icon" width="20"/> Ollama</summary>
81
+ <summary>🚅 LiteLLM</summary>
69
82
 
70
- ```python
71
- from llm_ie.engines import OllamaInferenceEngine
83
+ ```python
84
+ from llm_ie.engines import LiteLLMInferenceEngine
72
85
 
73
- llm = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
86
+ inference_engine = LiteLLMInferenceEngine(model="openai/Llama-3.3-70B-Instruct", base_url="http://localhost:8000/v1", api_key="EMPTY")
74
87
  ```
75
88
  </details>
89
+
76
90
  <details>
77
- <summary>🦙 Llama-cpp-python</summary>
91
+ <summary><img src=doc_asset/readme_img/openai-logomark.png width=16 /> OpenAI API</summary>
78
92
 
93
+ Follow the [Best Practices for API Key Safety](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety) to set up your API key.
79
94
  ```python
80
- from llm_ie.engines import LlamaCppInferenceEngine
95
+ from llm_ie.engines import OpenAIInferenceEngine
81
96
 
82
- llm = LlamaCppInferenceEngine(repo_id="bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF",
83
- gguf_filename="Meta-Llama-3.1-8B-Instruct-Q8_0.gguf")
97
+ inference_engine = OpenAIInferenceEngine(model="gpt-4o-mini")
84
98
  ```
85
99
  </details>
86
100
 
@@ -90,24 +104,22 @@ llm = LlamaCppInferenceEngine(repo_id="bullerwins/Meta-Llama-3.1-8B-Instruct-GGU
90
104
  ```python
91
105
  from llm_ie.engines import HuggingFaceHubInferenceEngine
92
106
 
93
- llm = HuggingFaceHubInferenceEngine(model="meta-llama/Meta-Llama-3-8B-Instruct")
107
+ inference_engine = HuggingFaceHubInferenceEngine(model="meta-llama/Meta-Llama-3-8B-Instruct")
94
108
  ```
95
109
  </details>
96
110
 
97
111
  <details>
98
- <summary><img src=doc_asset/readme_img/openai-logomark.png width=16 /> OpenAI API</summary>
112
+ <summary><img src="doc_asset/readme_img/ollama_icon_small.png" alt="Icon" width="18"/> Ollama</summary>
99
113
 
100
- Follow the [Best Practices for API Key Safety](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety) to set up API key.
101
- ```python
102
- from llm_ie.engines import OpenAIInferenceEngine
114
+ ```python
115
+ from llm_ie.engines import OllamaInferenceEngine
103
116
 
104
- llm = OpenAIInferenceEngine(model="gpt-4o-mini")
117
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
105
118
  ```
106
-
107
119
  </details>
108
120
 
109
121
  <details>
110
- <summary><img src=doc_asset/readme_img/vllm-logo.png width=20 /> vLLM</summary>
122
+ <summary><img src=doc_asset/readme_img/vllm-logo_small.png width=20 /> vLLM</summary>
111
123
 
112
124
  The vLLM support follows the [OpenAI Compatible Server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html). For more parameters, please refer to the documentation.
113
125
 
@@ -118,15 +130,24 @@ vllm serve meta-llama/Meta-Llama-3.1-8B-Instruct
118
130
  Define inference engine
119
131
  ```python
120
132
  from llm_ie.engines import OpenAIInferenceEngine
121
- engine = OpenAIInferenceEngine(base_url="http://localhost:8000/v1",
122
- api_key="EMPTY",
123
- model="meta-llama/Meta-Llama-3.1-8B-Instruct")
133
+ inference_engine = OpenAIInferenceEngine(base_url="http://localhost:8000/v1",
134
+ api_key="EMPTY",
135
+ model="meta-llama/Meta-Llama-3.1-8B-Instruct")
124
136
  ```
137
+ </details>
138
+
139
+ <details>
140
+ <summary>🦙 Llama-cpp-python</summary>
125
141
 
142
+ ```python
143
+ from llm_ie.engines import LlamaCppInferenceEngine
126
144
 
145
+ inference_engine = LlamaCppInferenceEngine(repo_id="bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF",
146
+ gguf_filename="Meta-Llama-3.1-8B-Instruct-Q8_0.gguf")
147
+ ```
127
148
  </details>
128
149
 
129
- In this quick start demo, we use Llama-cpp-python to run Llama-3.1-8B with int8 quantization ([bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF](https://huggingface.co/bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF)).
150
+ In this quick start demo, we use Ollama to run Llama-3.1-8B with int8 quantization.
130
151
  The outputs might be slightly different with other inference engines, LLMs, or quantization.
131
152
 
132
153
  #### Casual language as prompt
@@ -136,14 +157,12 @@ We start with a casual description:
136
157
 
137
158
  Define the AI prompt editor.
138
159
  ```python
139
- from llm_ie.engines import OllamaInferenceEngine
140
- from llm_ie.extractors import BasicFrameExtractor
141
- from llm_ie.prompt_editor import PromptEditor
160
+ from llm_ie import OllamaInferenceEngine, PromptEditor, BasicFrameExtractor
142
161
 
143
162
  # Define an LLM inference engine
144
- llm = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
163
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
145
164
  # Define LLM prompt editor
146
- editor = PromptEditor(llm, BasicFrameExtractor)
165
+ editor = PromptEditor(inference_engine, BasicFrameExtractor)
147
166
  # Start chat
148
167
  editor.chat()
149
168
  ```
@@ -190,7 +209,7 @@ with open("./demo/document/synthesized_note.txt", 'r') as f:
190
209
  note_text = f.read()
191
210
 
192
211
  # Define extractor
193
- extractor = BasicFrameExtractor(llm, prompt_template)
212
+ extractor = BasicFrameExtractor(inference_engine, prompt_template)
194
213
 
195
214
  # Extract
196
215
  frames = extractor.extract_frames(note_text, entity_key="Diagnosis", stream=True)
@@ -228,7 +247,7 @@ To visualize the extracted frames, we use the ```viz_serve()``` method.
228
247
  ```python
229
248
  doc.viz_serve()
230
249
  ```
231
- A Flask APP starts at port 5000 (default).
250
+ A Flask app starts on port 5000 (default).
232
251
  ```
233
252
  * Serving Flask app 'ie_viz.utilities'
234
253
  * Debug mode: off
@@ -255,39 +274,28 @@ This package is comprised of some key classes:
255
274
  - Extractors
256
275
 
257
276
  ### LLM Inference Engine
258
- Provides an interface for different LLM inference engines to work in the information extraction workflow. The built-in engines are ```LlamaCppInferenceEngine```, ```OllamaInferenceEngine```, and ```HuggingFaceHubInferenceEngine```.
277
+ Provides an interface for different LLM inference engines to work in the information extraction workflow. The built-in engines are `LiteLLMInferenceEngine`, `OpenAIInferenceEngine`, `HuggingFaceHubInferenceEngine`, `OllamaInferenceEngine`, and `LlamaCppInferenceEngine`.
259
278
 
260
- #### 🦙 Llama-cpp-python
261
- The ```repo_id``` and ```gguf_filename``` must match the ones on the Huggingface repo to ensure the correct model is loaded. ```n_ctx``` determines the context length LLM will consider during text generation. Empirically, longer context length gives better performance, while consuming more memory and increases computation. Note that when ```n_ctx``` is less than the prompt length, Llama.cpp throws exceptions. ```n_gpu_layers``` indicates a number of model layers to offload to GPU. Default is -1 for all layers (entire LLM). Flash attention ```flash_attn``` is supported by Llama.cpp. The ```verbose``` indicates whether model information should be displayed. For more input parameters, see 🦙 [Llama-cpp-python](https://github.com/abetlen/llama-cpp-python).
279
+ #### 🚅 LiteLLM
280
+ LiteLLM is an adapter project that unifies many proprietary and open-source LLM APIs. Popular inference servers, including OpenAI, Hugging Face Hub, and Ollama, are supported via its interface. For more details, refer to the [LiteLLM GitHub page](https://github.com/BerriAI/litellm).
262
281
 
282
+ To use LiteLLM with LLM-IE, import the `LiteLLMInferenceEngine` and follow LiteLLM's model naming convention.
263
283
  ```python
264
- from llm_ie.engines import LlamaCppInferenceEngine
284
+ import os
+ from llm_ie.engines import LiteLLMInferenceEngine
265
285
 
266
- llama_cpp = LlamaCppInferenceEngine(repo_id="bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF",
267
- gguf_filename="Meta-Llama-3.1-8B-Instruct-Q8_0.gguf",
268
- n_ctx=4096,
269
- n_gpu_layers=-1,
270
- flash_attn=True,
271
- verbose=False)
272
- ```
273
- #### <img src="https://avatars.githubusercontent.com/u/151674099?s=48&v=4" alt="Icon" width="20"/> Ollama
274
- The ```model_name``` must match the names on the [Ollama library](https://ollama.com/library). Use the command line ```ollama ls``` to check your local model list. ```num_ctx``` determines the context length LLM will consider during text generation. Empirically, longer context length gives better performance, while consuming more memory and increases computation. ```keep_alive``` regulates the lifespan of LLM. It indicates a number of seconds to hold the LLM after the last API call. Default is 5 minutes (300 sec).
286
+ # Huggingface serverless inferencing
287
+ os.environ['HF_TOKEN']  # your Hugging Face token must be set in this environment variable
288
+ inference_engine = LiteLLMInferenceEngine(model="huggingface/meta-llama/Meta-Llama-3-8B-Instruct")
275
289
 
276
- ```python
277
- from llm_ie.engines import OllamaInferenceEngine
290
+ # OpenAI GPT models
291
+ os.environ['OPENAI_API_KEY']  # your OpenAI API key must be set in this environment variable
292
+ inference_engine = LiteLLMInferenceEngine(model="openai/gpt-4o-mini")
278
293
 
279
- ollama = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0",
280
- num_ctx=4096,
281
- keep_alive=300)
282
- ```
294
+ # OpenAI compatible local server
295
+ inference_engine = LiteLLMInferenceEngine(model="openai/Llama-3.1-8B-Instruct", base_url="http://localhost:8000/v1", api_key="EMPTY")
283
296
 
284
- #### 🤗 huggingface_hub
285
- The ```model``` can be a model id hosted on the Hugging Face Hub or a URL to a deployed Inference Endpoint. Refer to the [Inference Client](https://huggingface.co/docs/huggingface_hub/en/package_reference/inference_client) documentation for more details.
286
-
287
- ```python
288
- from llm_ie.engines import HuggingFaceHubInferenceEngine
289
-
290
- hf = HuggingFaceHubInferenceEngine(model="meta-llama/Meta-Llama-3-8B-Instruct")
297
+ # Ollama
298
+ inference_engine = LiteLLMInferenceEngine(model="ollama/llama3.1:8b-instruct-q8_0")
291
299
  ```
292
300
 
293
301
  #### <img src=doc_asset/readme_img/openai-logomark.png width=16 /> OpenAI API
@@ -302,10 +310,28 @@ For more parameters, see [OpenAI API reference](https://platform.openai.com/docs
302
310
  ```python
303
311
  from llm_ie.engines import OpenAIInferenceEngine
304
312
 
305
- openai_engine = OpenAIInferenceEngine(model="gpt-4o-mini")
313
+ inference_engine = OpenAIInferenceEngine(model="gpt-4o-mini")
306
314
  ```
307
315
 
308
- #### <img src=doc_asset/readme_img/vllm-logo.png width=20 /> vLLM
316
+ #### 🤗 huggingface_hub
317
+ The ```model``` can be a model id hosted on the Hugging Face Hub or a URL to a deployed Inference Endpoint. Refer to the [Inference Client](https://huggingface.co/docs/huggingface_hub/en/package_reference/inference_client) documentation for more details.
318
+
319
+ ```python
320
+ from llm_ie.engines import HuggingFaceHubInferenceEngine
321
+
322
+ inference_engine = HuggingFaceHubInferenceEngine(model="meta-llama/Meta-Llama-3-8B-Instruct")
323
+ ```
324
+
325
+ #### <img src="doc_asset/readme_img/ollama_icon_small.png" alt="Icon" width="18"/> Ollama
326
+ The ```model_name``` must match the names on the [Ollama library](https://ollama.com/library). Use the command line ```ollama ls``` to check your local model list. ```num_ctx``` determines the context length the LLM will consider during text generation. Empirically, a longer context length gives better performance while consuming more memory and increasing computation. ```keep_alive``` regulates the lifespan of the LLM in memory: it indicates the number of seconds to hold the LLM after the last API call. The default is 5 minutes (300 sec).
327
+
328
+ ```python
329
+ from llm_ie.engines import OllamaInferenceEngine
330
+
331
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0", num_ctx=4096, keep_alive=300)
332
+ ```
333
+
334
+ #### <img src=doc_asset/readme_img/vllm-logo_small.png width=20 /> vLLM
309
335
  The vLLM support follows the [OpenAI Compatible Server](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html). For more parameters, please refer to the documentation.
310
336
 
311
337
  Start the server
@@ -318,20 +344,34 @@ the default port is 8000. ```--port``` sets the port.
318
344
  Define inference engine
319
345
  ```python
320
346
  from llm_ie.engines import OpenAIInferenceEngine
321
- engine = OpenAIInferenceEngine(base_url="http://localhost:8000/v1",
347
+ inference_engine = OpenAIInferenceEngine(base_url="http://localhost:8000/v1",
322
348
  api_key="MY_API_KEY",
323
349
  model="meta-llama/Meta-Llama-3.1-8B-Instruct")
324
350
  ```
325
351
  The ```model``` must match the repo name specified in the server.
326
352
 
353
+ #### 🦙 Llama-cpp-python
354
+ The ```repo_id``` and ```gguf_filename``` must match the ones on the Huggingface repo to ensure the correct model is loaded. ```n_ctx``` determines the context length the LLM will consider during text generation. Empirically, a longer context length gives better performance while consuming more memory and increasing computation. Note that when ```n_ctx``` is less than the prompt length, Llama.cpp throws an exception. ```n_gpu_layers``` indicates the number of model layers to offload to the GPU. The default is -1, which offloads all layers (the entire LLM). Flash attention (```flash_attn```) is supported by Llama.cpp. ```verbose``` indicates whether model information should be displayed. For more input parameters, see 🦙 [Llama-cpp-python](https://github.com/abetlen/llama-cpp-python).
355
+
356
+ ```python
357
+ from llm_ie.engines import LlamaCppInferenceEngine
358
+
359
+ inference_engine = LlamaCppInferenceEngine(repo_id="bullerwins/Meta-Llama-3.1-8B-Instruct-GGUF",
360
+ gguf_filename="Meta-Llama-3.1-8B-Instruct-Q8_0.gguf",
361
+ n_ctx=4096,
362
+ n_gpu_layers=-1,
363
+ flash_attn=True,
364
+ verbose=False)
365
+ ```
366
+
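+ #### Other inference engines
+ As noted above, other inference engines can be configured through the [InferenceEngine](src/llm_ie/engines.py) abstract class. Below is a rough, hypothetical sketch of what a custom engine could look like. The exact abstract methods and signatures are defined in `engines.py`; the `chat()` method shown here only mirrors how the built-in engines are called in this README, and the `client` object and its `generate()` call are placeholders.
+ 
+ ```python
+ from llm_ie.engines import InferenceEngine
+ 
+ class MyCustomInferenceEngine(InferenceEngine):
+     """Sketch of a custom engine wrapping an arbitrary chat-completion client.
+     `client` and its `generate()` call are hypothetical placeholders."""
+     def __init__(self, client, model: str):
+         self.client = client
+         self.model = model
+ 
+     def chat(self, messages, stream: bool = False, **kwargs) -> str:
+         # Forward the OpenAI-style message list to the underlying client
+         # and return the generated text.
+         return self.client.generate(model=self.model, messages=messages, **kwargs)
+ ```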
327
367
  #### Test inference engine configuration
328
368
  To test the inference engine, use the ```chat()``` method.
329
369
 
330
370
  ```python
331
371
  from llm_ie.engines import OllamaInferenceEngine
332
372
 
333
- ollama = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
334
- engine.chat(messages=[{"role": "user", "content":"Hi"}], stream=True)
373
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
374
+ inference_engine.chat(messages=[{"role": "user", "content":"Hi"}], stream=True)
335
375
  ```
336
376
  The output should be something like (might vary by LLMs and versions)
337
377
 
@@ -449,8 +489,8 @@ prompt_template = """
449
489
  Below is the medical note:
450
490
  "{{note}}"
451
491
  """
452
- ollama = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
453
- extractor = BasicFrameExtractor(ollama, prompt_template)
492
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
493
+ extractor = BasicFrameExtractor(inference_engine, prompt_template)
454
494
  prompt_text = extractor._get_user_prompt(text_content={"knowledge": "<some text...>",
455
495
  "note": "<some text...>")
456
496
  print(prompt_text)
@@ -468,7 +508,7 @@ from llm_ie.extractors import BasicFrameExtractor
468
508
  print(BasicFrameExtractor.get_prompt_guide())
469
509
  ```
470
510
 
471
- ### Prompt Editor
511
+ ### Prompt Editor LLM Agent
472
512
  The prompt editor is an LLM agent that helps users write prompt templates following the defined schema and guidelines of each extractor. Chat with the prompt editor:
473
513
 
474
514
  ```python
@@ -477,10 +517,10 @@ from llm_ie.extractors import BasicFrameExtractor
477
517
  from llm_ie.engines import OllamaInferenceEngine
478
518
 
479
519
  # Define an LLM inference engine
480
- ollama = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
520
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
481
521
 
482
522
  # Define editor
483
- editor = PromptEditor(ollama, BasicFrameExtractor)
523
+ editor = PromptEditor(inference_engine, BasicFrameExtractor)
484
524
 
485
525
  editor.chat()
486
526
  ```
@@ -504,10 +544,10 @@ from llm_ie.extractors import BasicFrameExtractor
504
544
  from llm_ie.engines import OllamaInferenceEngine
505
545
 
506
546
  # Define an LLM inference engine
507
- ollama = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
547
+ inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")
508
548
 
509
549
  # Define editor
510
- editor = PromptEditor(ollama, BasicFrameExtractor)
550
+ editor = PromptEditor(inference_engine, BasicFrameExtractor)
511
551
 
512
552
  # Have editor to generate initial prompt template
513
553
  initial_version = editor.rewrite("Extract treatment events from the discharge summary.")
@@ -612,10 +652,12 @@ After a few iterations of revision, we will have a high-quality prompt template
612
652
 
613
653
  ### Extractor
614
654
  An extractor implements a prompting method for information extraction. There are two extractor families: ```FrameExtractor``` and ```RelationExtractor```.
615
- The ```FrameExtractor``` extracts named entities and entity attributes ("frame"). The ```RelationExtractor``` extracts the relation (and relation types) between frames.
655
+ The ```FrameExtractor``` extracts named entities with attributes ("frames"). The ```RelationExtractor``` extracts the relations (and relation types) between frames.
616
656
 
617
657
  #### FrameExtractor
618
- The ```BasicFrameExtractor``` directly prompts LLM to generate a list of dictionaries. Each dictionary is then post-processed into a frame. The ```ReviewFrameExtractor``` is based on the ```BasicFrameExtractor``` but adds a review step after the initial extraction to boost sensitivity and improve performance. ```SentenceFrameExtractor``` gives LLM the entire document upfront as a reference, then prompts LLM sentence by sentence and collects per-sentence outputs. To learn about an extractor, use the class method ```get_prompt_guide()``` to print out the prompt guide.
658
+ The ```BasicFrameExtractor``` directly prompts the LLM to generate a list of dictionaries. Each dictionary is then post-processed into a frame. The ```ReviewFrameExtractor``` is based on the ```BasicFrameExtractor``` but adds a review step after the initial extraction to boost sensitivity and improve performance. The ```SentenceFrameExtractor``` gives the LLM the entire document upfront as a reference, then prompts the LLM sentence by sentence and collects per-sentence outputs. The ```SentenceReviewFrameExtractor``` combines the ```ReviewFrameExtractor``` and ```SentenceFrameExtractor```: each sentence is extracted and then reviewed. The ```SentenceCoTFrameExtractor``` implements chain-of-thought (CoT) prompting: it first analyzes a sentence, then extracts frames based on the CoT analysis. To learn about an extractor, use the class method ```get_prompt_guide()``` to print out the prompt guide.
659
+
660
+ Since the entity text output by LLMs might not exactly match the original text, we apply fuzzy search in post-processing to locate the correct entity span. In the `FrameExtractor.extract_frames()` method, setting the parameter `fuzzy_match=True` applies Jaccard similarity matching.
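+ For intuition, below is a minimal, illustrative sketch of Jaccard-similarity matching over a sliding word window. It is not the package's internal implementation; the tokenization, window size, and threshold are assumptions.
+ 
+ ```python
+ def jaccard(a: set, b: set) -> float:
+     """Jaccard similarity between two token sets."""
+     union = a | b
+     return len(a & b) / len(union) if union else 0.0
+ 
+ def fuzzy_locate(entity_text: str, doc_text: str, threshold: float = 0.6):
+     """Return the best-matching (start, end) character span for entity_text
+     in doc_text, or None if no window reaches the threshold. Illustrative only."""
+     entity_tokens = set(entity_text.lower().split())
+     words = doc_text.split()
+     # Character offsets of each word in the original text
+     offsets, pos = [], 0
+     for w in words:
+         start = doc_text.find(w, pos)
+         offsets.append((start, start + len(w)))
+         pos = start + len(w)
+     window = max(len(entity_tokens), 1)
+     best, best_score = None, threshold
+     for i in range(len(words) - window + 1):
+         span_tokens = {w.lower() for w in words[i:i + window]}
+         score = jaccard(entity_tokens, span_tokens)
+         if score > best_score:
+             best, best_score = (offsets[i][0], offsets[i + window - 1][1]), score
+     return best
+ ```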
619
661
 
620
662
  <details>
621
663
  <summary>BasicFrameExtractor</summary>
@@ -625,8 +667,8 @@ The ```BasicFrameExtractor``` directly prompts LLM to generate a list of diction
625
667
  ```python
626
668
  from llm_ie.extractors import BasicFrameExtractor
627
669
 
628
- extractor = BasicFrameExtractor(llm, prompt_temp)
629
- frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", stream=True)
670
+ extractor = BasicFrameExtractor(inference_engine, prompt_temp)
671
+ frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", case_sensitive=False, fuzzy_match=True, stream=True)
630
672
  ```
631
673
 
632
674
  Use the ```get_prompt_guide()``` method to inspect the prompt template guideline for ```BasicFrameExtractor```.
@@ -688,7 +730,7 @@ The ```review_mode``` should be set to ```review_mode="revision"```
688
730
  ```python
689
731
  review_prompt = "Review the input and your output again. If you find some diagnosis was missed, add them to your output. Regenerate your output."
690
732
 
691
- extractor = ReviewFrameExtractor(llm, prompt_temp, review_prompt, review_mode="revision")
733
+ extractor = ReviewFrameExtractor(inference_engine, prompt_temp, review_prompt, review_mode="revision")
692
734
  frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", stream=True)
693
735
  ```
694
736
  </details>
@@ -698,14 +740,95 @@ frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", str
698
740
 
699
741
  The ```SentenceFrameExtractor``` instructs the LLM to extract sentence by sentence. This ensures the accuracy of frame spans and prevents the LLM from overlooking sections or sentences. Empirically, this extractor results in better recall than the ```BasicFrameExtractor``` in complex tasks.
700
742
 
743
+ For concurrent extraction (recommended), `async`/`await` is used to speed up inference. The `concurrent_batch_size` parameter sets the number of sentences processed concurrently. A conceptual sketch of the batching pattern follows the example below.
744
+
745
+ ```python
746
+ from llm_ie.extractors import SentenceFrameExtractor
747
+
748
+ extractor = SentenceFrameExtractor(inference_engine, prompt_temp)
749
+ frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", case_sensitive=False, fuzzy_match=True, concurrent=True, concurrent_batch_size=32)
750
+ ```
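+ 
+ Conceptually, concurrent mode issues a batch of sentence-level requests at once instead of waiting for each response in turn. The sketch below shows the general `asyncio` batching pattern with a hypothetical `call_llm_async` coroutine; it is illustrative only and not the package's internal implementation.
+ 
+ ```python
+ import asyncio
+ 
+ async def extract_batches(sentences, call_llm_async, batch_size=32):
+     """Prompt the LLM for each sentence, `batch_size` requests at a time.
+     `call_llm_async` is a hypothetical coroutine that handles one sentence."""
+     outputs = []
+     for i in range(0, len(sentences), batch_size):
+         batch = sentences[i:i + batch_size]
+         # Issue the whole batch concurrently and wait for all responses
+         outputs.extend(await asyncio.gather(*(call_llm_async(s) for s in batch)))
+     return outputs
+ 
+ # results = asyncio.run(extract_batches(sentences, call_llm_async, batch_size=32))
+ ```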
751
+
701
752
  The ```multi_turn``` parameter specifies multi-turn conversation prompting. If True, sentences and LLM outputs are appended to the input messages and carried over. If False, only the current sentence is prompted. For LLM inference engines that support prompt caching (e.g., Llama.cpp, Ollama), multi-turn conversation prompting can better utilize the KV cache and result in faster inference. For vLLM with [Automatic Prefix Caching (APC)](https://docs.vllm.ai/en/latest/automatic_prefix_caching/apc.html), multi-turn conversation is not necessary.
702
753
 
703
754
  ```python
704
755
  from llm_ie.extractors import SentenceFrameExtractor
705
756
 
706
- extractor = SentenceFrameExtractor(llm, prompt_temp)
707
- frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", multi_turn=True, stream=True)
757
+ extractor = SentenceFrameExtractor(inference_engine, prompt_temp)
758
+ frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", multi_turn=False, case_sensitive=False, fuzzy_match=True, stream=True)
759
+ ```
760
+
761
+ </details>
762
+
763
+ <details>
764
+ <summary>SentenceReviewFrameExtractor</summary>
765
+
766
+ The `SentenceReviewFrameExtractor` performs sentence-level extraction and review.
767
+
768
+ ```python
769
+ from llm_ie.extractors import SentenceReviewFrameExtractor
770
+
771
+ extractor = SentenceReviewFrameExtractor(inference_engine, prompt_temp, review_mode="revision")
772
+ frames = extractor.extract_frames(text_content=note_text, entity_key="Diagnosis", stream=True)
773
+ ```
774
+
775
+ ```
776
+ Sentence:
777
+ #### History of Present Illness
778
+ The patient reported that the chest pain started two days prior to admission.
779
+
780
+ Initial Output:
781
+ [
782
+ {"Diagnosis": "chest pain", "Date": "two days prior to admission", "Status": "reported"}
783
+ ]
784
+ Review:
785
+ [
786
+ {"Diagnosis": "admission", "Date": null, "Status": null}
787
+ ]
788
+ ```
789
+
790
+ </details>
791
+
792
+ <details>
793
+ <summary>SentenceCoTFrameExtractor</summary>
794
+
795
+ The `SentenceCoTFrameExtractor` processes the document sentence by sentence. For each sentence, it first generates an analysis paragraph in `<Analysis>...</Analysis>` (chain-of-thought), then outputs the extraction as JSON in `<Outputs>...</Outputs>`, similar to the `SentenceFrameExtractor`.
796
+
797
+ ```python
798
+ from llm_ie.extractors import SentenceCoTFrameExtractor
799
+
800
+ extractor = SentenceCoTFrameExtractor(inference_engine, CoT_prompt_temp)
801
+ frames = extractor.extract_frames(text_content=note_text, entity_key="Diagnosis", stream=True)
802
+ ```
803
+
804
+ ```
805
+ Sentence:
806
+ #### Discharge Medications
807
+ - Aspirin 81 mg daily
808
+ - Clopidogrel 75 mg daily
809
+ - Atorvastatin 40 mg daily
810
+ - Metoprolol 50 mg twice daily
811
+ - Lisinopril 20 mg daily
812
+ - Metformin 1000 mg twice daily
813
+
814
+ #### Discharge Instructions
815
+ John Doe was advised to follow a heart-healthy diet, engage in regular physical activity, and monitor his blood glucose levels.
816
+
817
+ CoT:
818
+ <Analysis>
819
+ The given text does not explicitly mention a diagnosis, but rather lists the discharge medications and instructions for the patient. However, we can infer that the patient has been diagnosed with conditions that require these medications, such as high blood pressure, high cholesterol, and diabetes.
820
+
821
+ </Analysis>
822
+
823
+ <Outputs>
824
+ [
825
+ {"Diagnosis": "hypertension", "Date": null, "Status": "confirmed"},
826
+ {"Diagnosis": "hyperlipidemia", "Date": null, "Status": "confirmed"},
827
+ {"Diagnosis": "Type 2 diabetes mellitus", "Date": null, "Status": "confirmed"}
828
+ ]
829
+ </Outputs>
708
830
  ```
831
+
709
832
  </details>
710
833
 
711
834
  #### RelationExtractor
@@ -725,12 +848,32 @@ print(BinaryRelationExtractor.get_prompt_guide())
725
848
  ```
726
849
 
727
850
  ```
728
- Prompt template design:
729
- 1. Task description (mention binary relation extraction and ROI)
730
- 2. Schema definition (defines relation)
731
- 3. Output format definition (must use the key "Relation")
732
- 4. Hints
733
- 5. Input placeholders (must include "roi_text", "frame_1", and "frame_2" placeholders)
851
+ Prompt Template Design:
852
+
853
+ 1. Task description:
854
+ Provide a detailed description of the task, including the background and the type of task (e.g., binary relation extraction). Mention the region of interest (ROI) text.
855
+ 2. Schema definition:
856
+ List the criterion for relation (True) and for no relation (False).
857
+
858
+ 3. Output format definition:
859
+ The output must be a dictionary with a key "Relation" (i.e., {"Relation": "<True or False>"}).
860
+
861
+ 4. (optional) Hints:
862
+ Provide itemized hints for the information extractors to guide the extraction process.
863
+
864
+ 5. (optional) Examples:
865
+ Include examples in the format:
866
+ Input: ...
867
+ Output: ...
868
+
869
+ 6. Entity 1 full information:
870
+ Include a placeholder in the format {{<frame_1>}}
871
+
872
+ 7. Entity 2 full information:
873
+ Include a placeholder in the format {{<frame_2>}}
874
+
875
+ 8. Input placeholders:
876
+ The template must include a placeholder "{{roi_text}}" for the ROI text.
734
877
 
735
878
 
736
879
  Example:
@@ -754,15 +897,15 @@ Example:
754
897
  3. If the strength or frequency is for another medication, output False.
755
898
  4. If the strength or frequency is for the same medication but at a different location (span), output False.
756
899
 
757
- # Input placeholders
758
- ROI Text with the two entities annotated with <entity_1> and <entity_2>:
759
- "{{roi_text}}"
760
-
761
- Entity 1 full information:
900
+ # Entity 1 full information:
762
901
  {{frame_1}}
763
902
 
764
- Entity 2 full information:
903
+ # Entity 2 full information:
765
904
  {{frame_2}}
905
+
906
+ # Input placeholders
907
+ ROI Text with the two entities annotated with <entity_1> and <entity_2>:
908
+ "{{roi_text}}"
766
909
  ```
767
910
 
768
911
  As an example, we define the ```possible_relation_func``` function:
@@ -797,8 +940,12 @@ In the ```BinaryRelationExtractor``` constructor, we pass in the prompt template
797
940
  ```python
798
941
  from llm_ie.extractors import BinaryRelationExtractor
799
942
 
800
- extractor = BinaryRelationExtractor(llm, prompt_template=prompt_template, possible_relation_func=possible_relation_func)
801
- relations = extractor.extract_relations(doc, stream=True)
943
+ extractor = BinaryRelationExtractor(inference_engine, prompt_template=prompt_template, possible_relation_func=possible_relation_func)
944
+ # Extract binary relations with concurrent mode (faster)
945
+ relations = extractor.extract_relations(doc, concurrent=True)
946
+
947
+ # To print the step-by-step outputs, use the `concurrent=False` and `stream=True` options
948
+ relations = extractor.extract_relations(doc, concurrent=False, stream=True)
802
949
  ```
803
950
 
804
951
  </details>
@@ -814,11 +961,34 @@ print(MultiClassRelationExtractor.get_prompt_guide())
814
961
  ```
815
962
 
816
963
  ```
817
- Prompt template design:
818
- 1. Task description (mention multi-class relation extraction and ROI)
819
- 2. Schema definition (defines relation types)
820
- 3. Output format definition (must use the key "RelationType")
821
- 4. Input placeholders (must include "roi_text", "frame_1", and "frame_2" placeholders)
964
+ Prompt Template Design:
965
+
966
+ 1. Task description:
967
+ Provide a detailed description of the task, including the background and the type of task (e.g., binary relation extraction). Mention the region of interest (ROI) text.
968
+ 2. Schema definition:
969
+ List the criterion for relation (True) and for no relation (False).
970
+
971
+ 3. Output format definition:
972
+ This section must include a placeholder "{{pos_rel_types}}" for the possible relation types.
973
+ The output must be a dictionary with a key "RelationType" (i.e., {"RelationType": "<relation type or No Relation>"}).
974
+
975
+ 4. (optional) Hints:
976
+ Provide itemized hints for the information extractors to guide the extraction process.
977
+
978
+ 5. (optional) Examples:
979
+ Include examples in the format:
980
+ Input: ...
981
+ Output: ...
982
+
983
+ 6. Entity 1 full information:
984
+ Include a placeholder in the format {{<frame_1>}}
985
+
986
+ 7. Entity 2 full information:
987
+ Include a placeholder in the format {{<frame_2>}}
988
+
989
+ 8. Input placeholders:
990
+ The template must include a placeholder "{{roi_text}}" for the ROI text.
991
+
822
992
 
823
993
 
824
994
  Example:
@@ -851,15 +1021,15 @@ Example:
851
1021
  3. If the strength or frequency is for another medication, output "No Relation".
852
1022
  4. If the strength or frequency is for the same medication but at a different location (span), output "No Relation".
853
1023
 
854
- # Input placeholders
855
- ROI Text with the two entities annotated with <entity_1> and <entity_2>:
856
- "{{roi_text}}"
857
-
858
- Entity 1 full information:
1024
+ # Entity 1 full information:
859
1025
  {{frame_1}}
860
1026
 
861
- Entity 2 full information:
1027
+ # Entity 2 full information:
862
1028
  {{frame_2}}
1029
+
1030
+ # Input placeholders
1031
+ ROI Text with the two entities annotated with <entity_1> and <entity_2>:
1032
+ "{{roi_text}}"
863
1033
  ```
864
1034
 
865
1035
  As an example, we define the ```possible_relation_types_func``` :
@@ -890,8 +1060,76 @@ def possible_relation_types_func(frame_1, frame_2) -> List[str]:
890
1060
  ```python
891
1061
  from llm_ie.extractors import MultiClassRelationExtractor
892
1062
 
893
- extractor = MultiClassRelationExtractor(llm, prompt_template=re_prompt_template, possible_relation_types_func=possible_relation_types_func)
894
- relations = extractor.extract_relations(doc, stream=True)
1063
+ extractor = MultiClassRelationExtractor(inference_engine, prompt_template=re_prompt_template,
1064
+ possible_relation_types_func=possible_relation_types_func)
1065
+
1066
+ # Extract multi-class relations with concurrent mode (faster)
1067
+ relations = extractor.extract_relations(doc, concurrent=True)
1068
+
1069
+ # To print the step-by-step outputs, use the `concurrent=False` and `stream=True` options
1070
+ relations = extractor.extract_relations(doc, concurrent=False, stream=True)
895
1071
  ```
896
1072
 
897
1073
  </details>
1074
+
1075
+ ### Visualization
1076
+ The `LLMInformationExtractionDocument` class supports visualization of named entities, entity attributes, and relations. Visualization is implemented through our plug-in package [ie-viz](https://github.com/daviden1013/ie-viz). See the example Jupyter Notebook [NER + RE for Drug, Strength, Frequency](demo/medication_relation_extraction.ipynb) for a working demo.
1077
+
1078
+ ```cmd
1079
+ pip install ie-viz
1080
+ ```
1081
+
1082
+ The `viz_serve()` method starts a Flask app on localhost port 5000 by default.
1083
+ ```python
1084
+ from llm_ie.data_types import LLMInformationExtractionDocument
1085
+
1086
+ # Define document
1087
+ doc = LLMInformationExtractionDocument(doc_id="Medical note",
1088
+ text=note_text)
1089
+ # Add extracted frames and relations to document
1090
+ doc.add_frames(frames)
1091
+ doc.add_relations(relations)
1092
+ # Visualize the document
1093
+ doc.viz_serve()
1094
+ ```
1095
+
1096
+ Alternatively, the `viz_render()` method returns a self-contained (HTML + JS + CSS) string. Save it to a file and open it with a browser.
1097
+ ```python
1098
+ html = doc.viz_render()
1099
+
1100
+ with open("Medical note.html", "w") as f:
1101
+ f.write(html)
1102
+ ```
1103
+
1104
+ To customize colors for different entities, use `color_attr_key` (simple) or `color_map_func` (advanced).
1105
+
1106
+ The `color_attr_key` automatically assigns colors based on the specified attribute key, for example, "EntityType".
1107
+ ```python
1108
+ doc.viz_serve(color_attr_key="EntityType")
1109
+ ```
1110
+
1111
+ The `color_map_func` allows users to define a custom entity-color mapping function. For example,
1112
+ ```python
1113
+ def color_map_func(entity) -> str:
1114
+ if entity['attr']['<attribute key>'] == "<a certain value>":
1115
+ return "#7f7f7f"
1116
+ else:
1117
+ return "#03A9F4"
1118
+
1119
+ doc.viz_serve(color_map_func=color_map_func)
1120
+ ```
1121
+
1122
+ ## Benchmarks
1123
+ We benchmarked the frame and relation extractors on biomedical information extraction tasks. The results and experiment code are available on [this page](https://github.com/daviden1013/LLM-IE_Benchmark).
1124
+
1125
+
1126
+ ## Citation
1127
+ For more information and benchmarks, please check our paper:
1128
+ ```bibtex
1129
+ @article{hsu2024llm,
1130
+ title={LLM-IE: A Python Package for Generative Information Extraction with Large Language Models},
1131
+ author={Hsu, Enshuo and Roberts, Kirk},
1132
+ journal={arXiv preprint arXiv:2411.11779},
1133
+ year={2024}
1134
+ }
1135
+ ```