gpt-batch 0.1.1__tar.gz → 0.1.2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -0,0 +1,85 @@
1
+ Metadata-Version: 2.1
2
+ Name: gpt_batch
3
+ Version: 0.1.2
4
+ Summary: A package for batch processing with OpenAI API.
5
+ Home-page: https://github.com/fengsxy/gpt_batch
6
+ Author: Ted Yu
7
+ Author-email: liddlerain@gmail.com
8
+ License: UNKNOWN
9
+ Platform: UNKNOWN
10
+ Description-Content-Type: text/markdown
11
+
15
+ # GPT Batcher
16
+
17
+ A simple tool to batch process messages using OpenAI's GPT models. `GPTBatcher` allows for efficient handling of multiple requests simultaneously, ensuring quick responses and robust error management.
18
+
19
+ ## Installation
20
+
21
+ To get started with `GPTBatcher`, install the package from PyPI:
22
+
23
+ ```bash
24
+ pip install gpt_batch
25
+ ```
26
+
27
+ ## Quick Start
28
+
29
+ To use `GPTBatcher`, you need to instantiate it with your OpenAI API key and the model name you wish to use. Here's a quick guide:
30
+
31
+ ### Handling Message Lists
32
+
33
+ This example demonstrates how to send a list of questions and receive answers:
34
+
35
+ ```python
36
+ from gpt_batch.batcher import GPTBatcher
37
+
38
+ # Initialize the batcher
39
+ batcher = GPTBatcher(api_key='your_key_here', model_name='gpt-3.5-turbo-1106')
40
+
41
+ # Send a list of messages and receive answers
42
+ result = batcher.handle_message_list(['question_1', 'question_2', 'question_3', 'question_4'])
43
+ print(result)
44
+ # Expected output: ["answer_1", "answer_2", "answer_3", "answer_4"]
45
+ ```
46
+
47
+ ### Handling Embedding Lists
48
+
49
+ This example shows how to get embeddings for a list of strings:
50
+
51
+ ```python
52
+ from gpt_batch.batcher import GPTBatcher
53
+
54
+ # Reinitialize the batcher for embeddings
55
+ batcher = GPTBatcher(api_key='your_key_here', model_name='text-embedding-3-small')
56
+
57
+ # Send a list of strings and get their embeddings
58
+ result = batcher.handle_embedding_list(['question_1', 'question_2', 'question_3', 'question_4'])
59
+ print(result)
60
+ # Expected output: ["embedding_1", "embedding_2", "embedding_3", "embedding_4"]
61
+ ```
62
+
63
+ ## Configuration
64
+
65
+ The `GPTBatcher` class can be customized with several parameters to adjust its performance and behavior:
66
+
67
+ - **api_key** (str): Your OpenAI API key.
68
+ - **model_name** (str): Identifier for the GPT model version you want to use, default is 'gpt-3.5-turbo-0125'.
69
+ - **system_prompt** (str): Initial text or question to seed the model, default is empty.
70
+ - **temperature** (float): Adjusts the creativity of the responses, default is 1.
71
+ - **num_workers** (int): Number of parallel workers for request handling, default is 64.
72
+ - **timeout_duration** (int): Timeout for API responses in seconds, default is 60.
73
+ - **retry_attempts** (int): How many times to retry a failed request, default is 2.
+ - **api_base_url** (str): Optional custom base URL for the OpenAI API endpoint, default is None.
74
+ - **miss_index** (list): Read-only attribute that records the indices of requests that never completed.
75
+
76
+ For more detailed documentation on the parameters and methods, refer to the class docstring.
77
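The gap-filling behavior behind **miss_index** can be sketched in a few lines of plain Python (a hypothetical standalone variant of the internal `complete_attitude_list` helper; unlike the class, this sketch records interior gaps as well as trailing ones):

```python
def complete_attitude_list(attitude_list, max_length):
    """Fill gaps in a sorted list of (index, value) pairs with (index, None)."""
    completed, missed = [], []
    current = 0
    for index, value in attitude_list:
        while current < index:              # a request at this index never returned
            completed.append((current, None))
            missed.append(current)
            current += 1
        completed.append((index, value))
        current = index + 1
    while current < max_length:             # trailing requests that never returned
        completed.append((current, None))
        missed.append(current)
        current += 1
    return completed, missed

# Answers for indices 1 and 3 are missing from a 4-item batch:
pairs, missed = complete_attitude_list([(0, "a"), (2, "c")], 4)
# pairs == [(0, "a"), (1, None), (2, "c"), (3, None)], missed == [1, 3]
```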
+
78
+ ## License
79
+
80
+ Specify your licensing information here.
81
+
85
+
@@ -1,3 +1,4 @@
1
1
  from .batcher import GPTBatcher
2
2
 
3
+
3
4
  __all__ = ['GPTBatcher']
@@ -0,0 +1,148 @@
1
+ from openai import OpenAI
2
+ from concurrent.futures import ThreadPoolExecutor, wait
3
+ from functools import partial
4
+ from tqdm import tqdm
5
+
6
+ class GPTBatcher:
7
+ """
8
+ A class to handle batching and sending requests to the OpenAI GPT model efficiently.
9
+
10
+ Attributes:
11
+ client (OpenAI): The client instance to communicate with the OpenAI API using the provided API key.
12
+ model_name (str): The name of the GPT model to be used. Default is 'gpt-3.5-turbo-0125'.
13
+ system_prompt (str): Initial prompt or context to be used with the model. Default is an empty string.
14
+ temperature (float): Controls the randomness of the model's responses. Higher values lead to more diverse outputs. Default is 1.
15
+ num_workers (int): Number of worker threads used for handling concurrent requests. Default is 64.
16
+ timeout_duration (int): Maximum time (in seconds) to wait for a response from the API before timing out. Default is 60 seconds.
17
+ retry_attempts (int): Number of retries if a request fails. Default is 2.
18
+ miss_index (list): Tracks the indices of requests that failed to process correctly.
19
+
20
+ Parameters:
21
+ api_key (str): API key for authenticating requests to the OpenAI API.
22
+ model_name (str, optional): Specifies the GPT model version. Default is 'gpt-3.5-turbo-0125'.
23
+ system_prompt (str, optional): Initial text or question to seed the model with. Default is empty.
24
+ temperature (float, optional): Sets the creativity of the responses. Default is 1.
25
+ num_workers (int, optional): Number of parallel workers for request handling. Default is 64.
26
+ timeout_duration (int, optional): Timeout for API responses in seconds. Default is 60.
27
+ retry_attempts (int, optional): How many times to retry a failed request. Default is 2.
+ api_base_url (str, optional): Custom base URL for API requests (e.g., a proxy endpoint). Default is None.
28
+ """
29
+
30
+ def __init__(self, api_key, model_name="gpt-3.5-turbo-0125", system_prompt="", temperature=1, num_workers=64, timeout_duration=60, retry_attempts=2, api_base_url=None):
31
+
32
+ self.client = OpenAI(api_key=api_key)
33
+ self.model_name = model_name
34
+ self.system_prompt = system_prompt
35
+ self.temperature = temperature
36
+ self.num_workers = num_workers
37
+ self.timeout_duration = timeout_duration
38
+ self.retry_attempts = retry_attempts
39
+ self.miss_index =[]
40
+ if api_base_url:
41
+ self.client.base_url = api_base_url
42
+
43
+ def get_attitude(self, ask_text):
44
+ index, ask_text = ask_text
45
+
46
+ completion = self.client.chat.completions.create(
47
+ model=self.model_name,
48
+ messages=[
49
+ {"role": "system", "content": self.system_prompt},
50
+ {"role": "user", "content": ask_text}
51
+ ],
52
+ temperature=self.temperature,
53
+ )
54
+ return (index, completion.choices[0].message.content)
55
+
56
+ def process_attitude(self, message_list):
57
+ new_list = []
58
+ num_workers = self.num_workers
59
+ timeout_duration = self.timeout_duration
60
+ retry_attempts = self.retry_attempts
61
+
62
+ executor = ThreadPoolExecutor(max_workers=num_workers)
63
+ message_chunks = list(self.chunk_list(message_list, num_workers))
64
+ for chunk in tqdm(message_chunks, desc="Processing messages"):
65
+ future_to_message = {executor.submit(self.get_attitude, message): message for message in chunk}
66
+ for _ in range(retry_attempts):
67
+ done, not_done = wait(future_to_message.keys(), timeout=timeout_duration)
68
+ for future in not_done:
69
+ future.cancel()
70
+ new_list.extend(future.result() for future in done if future.done())
71
+ if len(not_done) == 0:
72
+ break
73
+ future_to_message = {executor.submit(self.get_attitude, future_to_message[future]): future_to_message[future] for future in not_done}
74
+ executor.shutdown(wait=False)
75
+ return new_list
76
+
77
+ def complete_attitude_list(self,attitude_list, max_length):
78
+ completed_list = []
79
+ current_index = 0
80
+ for item in attitude_list:
81
+ index, value = item
82
+ # Fill in missing indices
83
+ while current_index < index:
84
+ completed_list.append((current_index, None))
85
+ current_index += 1
86
+ # Add the current element from the list
87
+ completed_list.append(item)
88
+ current_index = index + 1
89
+ while current_index < max_length:
90
+ print("Filling in missing index", current_index)
91
+ self.miss_index.append(current_index)
92
+ completed_list.append((current_index, None))
93
+ current_index += 1
94
+ return completed_list
95
+
96
+ def chunk_list(self, lst, n):
97
+ """Yield successive n-sized chunks from lst."""
98
+ for i in range(0, len(lst), n):
99
+ yield lst[i:i + n]
100
+
101
+ def handle_message_list(self,message_list):
102
+ indexed_list = [(index, data) for index, data in enumerate(message_list)]
103
+ max_length = len(indexed_list)
104
+ attitude_list = self.process_attitude(indexed_list)
105
+ attitude_list.sort(key=lambda x: x[0])
106
+ attitude_list = self.complete_attitude_list(attitude_list, max_length)
107
+ attitude_list = [x[1] for x in attitude_list]
108
+ return attitude_list
109
+
110
+ def process_embedding(self,message_list):
111
+ new_list = []
112
+ executor = ThreadPoolExecutor(max_workers=self.num_workers)
113
+ # Split message_list into chunks
114
+ message_chunks = list(self.chunk_list(message_list, self.num_workers))
115
+ fixed_get_embedding = self.get_embedding  # no extra arguments to bind, so partial() is unnecessary
116
+ for chunk in tqdm(message_chunks, desc="Processing messages"):
117
+ future_to_message = {executor.submit(fixed_get_embedding, message): message for message in chunk}
118
+ for i in range(self.retry_attempts):
119
+ done, not_done = wait(future_to_message.keys(), timeout=self.timeout_duration)
120
+ for future in not_done:
121
+ future.cancel()
122
+ new_list.extend(future.result() for future in done if future.done())
123
+ if len(not_done) == 0:
124
+ break
125
+ future_to_message = {executor.submit(fixed_get_embedding, future_to_message[future]): future_to_message[future] for future in not_done}
126
+ executor.shutdown(wait=False)
127
+ return new_list
128
+ def get_embedding(self,text):
129
+ index,text = text
130
+ response = self.client.embeddings.create(
131
+ input=text,
132
+ model=self.model_name)
133
+ return (index,response.data[0].embedding)
134
+
135
+ def handle_embedding_list(self,message_list):
136
+ indexed_list = [(index, data) for index, data in enumerate(message_list)]
137
+ max_length = len(indexed_list)
138
+ attitude_list = self.process_embedding(indexed_list)
139
+ attitude_list.sort(key=lambda x: x[0])
140
+ attitude_list = self.complete_attitude_list(attitude_list, max_length)
141
+ attitude_list = [x[1] for x in attitude_list]
142
+ return attitude_list
143
+
144
+ def get_miss_index(self):
145
+ return self.miss_index
146
+
147
+ # Add other necessary methods similar to the above, refactored to fit within this class structure.
148
+
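Both `process_attitude` and `process_embedding` follow the same submit/wait/resubmit pattern. A minimal stdlib-only sketch of that pattern (the `run_with_retries` name and the doubling worker are illustrative, not part of the package):

```python
from concurrent.futures import ThreadPoolExecutor, wait

def run_with_retries(func, items, num_workers=4, timeout=5, retry_attempts=2):
    """Submit every item, wait up to `timeout`, then resubmit whatever timed out."""
    results = []
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        future_to_item = {executor.submit(func, item): item for item in items}
        for _ in range(retry_attempts):
            done, not_done = wait(future_to_item, timeout=timeout)
            results.extend(f.result() for f in done if not f.exception())
            if not not_done:
                break
            # One retry round: resubmit only the items whose futures timed out.
            future_to_item = {executor.submit(func, future_to_item[f]): future_to_item[f]
                              for f in not_done}
    return results

print(sorted(run_with_retries(lambda x: x * 2, [1, 2, 3])))  # [2, 4, 6]
```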
@@ -2,7 +2,7 @@ from setuptools import setup, find_packages
2
2
 
3
3
  setup(
4
4
  name='gpt_batch',
5
- version='0.1.1',
5
+ version='0.1.2',
6
6
  packages=find_packages(),
7
7
  install_requires=[
8
8
  'openai', 'tqdm'
@@ -0,0 +1,55 @@
1
+ import pytest
2
+ from gpt_batch import GPTBatcher
3
+ import os
4
+
5
+ def test_handle_message_list():
6
+ # Initialize the GPTBatcher with hypothetical valid credentials
7
+ #api_key = #get from system environment
8
+ api_key = os.getenv('TEST_KEY')
9
+ if not api_key:
10
+ raise ValueError("API key must be set in the environment variables")
11
+ batcher = GPTBatcher(api_key=api_key, model_name='gpt-3.5-turbo-1106', system_prompt="Your task is to discuss privacy questions and provide persuasive answers with supporting reasons.")
12
+ message_list = ["I think privacy is important", "I don't think privacy is important"]
13
+
14
+ # Call the method under test
15
+ results = batcher.handle_message_list(message_list)
16
+
17
+ # Assertions to verify the length of the results and the structure of each item
18
+ assert len(results) == 2, "There should be two results, one for each message"
19
+ assert all(len(result) >= 2 for result in results), "Each result should have at least two elements"
20
+
21
+ def test_handle_embedding_list():
22
+ # Initialize the GPTBatcher with hypothetical valid credentials
23
+ #api_key = #get from system environment
24
+ api_key = os.getenv('TEST_KEY')
25
+ if not api_key:
26
+ raise ValueError("API key must be set in the environment variables")
27
+ batcher = GPTBatcher(api_key=api_key, model_name='text-embedding-3-small')
28
+ embedding_list = [ "I think privacy is important", "I don't think privacy is important"]
29
+ results = batcher.handle_embedding_list(embedding_list)
30
+ assert len(results) == 2, "There should be two results, one for each message"
31
+ assert all(len(result) >= 2 for result in results), "Each result should have at least two elements"
32
+
33
+ def test_base_url():
34
+ # Initialize the GPTBatcher with hypothetical valid credentials
35
+ #api_key = #get from system environment
36
+ api_key = os.getenv('TEST_KEY')
37
+ if not api_key:
38
+ raise ValueError("API key must be set in the environment variables")
39
+ batcher = GPTBatcher(api_key=api_key, model_name='gpt-3.5-turbo-1106', api_base_url="https://api.openai.com/v2/")
40
+ assert batcher.client.base_url == "https://api.openai.com/v2/", "The base URL should be set to the provided value"
41
+
42
+ def test_get_miss_index():
43
+ # Initialize the GPTBatcher with hypothetical valid credentials
44
+ #api_key = #get from system environment
45
+ api_key = os.getenv('TEST_KEY')
46
+ if not api_key:
47
+ raise ValueError("API key must be set in the environment variables")
48
+ batcher = GPTBatcher(api_key=api_key, model_name='gpt-3.5-turbo-1106', system_prompt="Your task is to discuss privacy questions and provide persuasive answers with supporting reasons.")
49
+ message_list = ["I think privacy is important", "I don't think privacy is important"]
50
+ results = batcher.handle_message_list(message_list)
51
+ miss_index = batcher.get_miss_index()
52
+ assert miss_index == [], "The miss index should be empty"
53
+ # Optionally, you can add a test configuration if you have specific needs
54
+ if __name__ == "__main__":
55
+ pytest.main()
gpt_batch-0.1.1/PKG-INFO DELETED
@@ -1,32 +0,0 @@
1
- Metadata-Version: 2.1
2
- Name: gpt_batch
3
- Version: 0.1.1
4
- Summary: A package for batch processing with OpenAI API.
5
- Home-page: https://github.com/fengsxy/gpt_batch
6
- Author: Ted Yu
7
- Author-email: liddlerain@gmail.com
8
- License: UNKNOWN
9
- Platform: UNKNOWN
10
- Description-Content-Type: text/markdown
11
-
12
- # GPT Batcher
13
-
14
- A simple tool to batch process messages using OpenAI's GPT models.
15
-
16
- ## Installation
17
-
18
- Clone this repository and run:
19
-
20
- ## Usage
21
-
22
- Here's how to use the `GPTBatcher`:
23
-
24
- ```python
25
- from gpt_batch.batcher import GPTBatcher
26
-
27
- batcher = GPTBatcher(api_key='your_key_here', model_name='gpt-3.5-turbo-1106')
28
- result = batcher.handle_message_list(['your', 'list', 'of', 'messages'])
29
- print(result)
30
-
31
-
32
-
gpt_batch-0.1.1/README.md DELETED
@@ -1,19 +0,0 @@
1
- # GPT Batcher
2
-
3
- A simple tool to batch process messages using OpenAI's GPT models.
4
-
5
- ## Installation
6
-
7
- Clone this repository and run:
8
-
9
- ## Usage
10
-
11
- Here's how to use the `GPTBatcher`:
12
-
13
- ```python
14
- from gpt_batch.batcher import GPTBatcher
15
-
16
- batcher = GPTBatcher(api_key='your_key_here', model_name='gpt-3.5-turbo-1106')
17
- result = batcher.handle_message_list(['your', 'list', 'of', 'messages'])
18
- print(result)
19
-
@@ -1,85 +0,0 @@
1
- from openai import OpenAI
2
- from concurrent.futures import ThreadPoolExecutor, wait
3
- from functools import partial
4
- from tqdm import tqdm
5
-
6
- class GPTBatcher:
7
- def __init__(self, api_key, model_name="gpt-3.5-turbo-0125", system_prompt="",temperature=1,num_workers=64,timeout_duration=60,retry_attempts=2):
8
- self.client = OpenAI(api_key=api_key)
9
- self.model_name = model_name
10
- self.system_prompt = system_prompt
11
- self.temperature = temperature
12
- self.num_workers = num_workers
13
- self.timeout_duration = timeout_duration
14
- self.retry_attempts = retry_attempts
15
- self.miss_index =[]
16
-
17
- def get_attitude(self, ask_text):
18
- index, ask_text = ask_text
19
-
20
- completion = self.client.chat.completions.create(
21
- model=self.model_name,
22
- messages=[
23
- {"role": "system", "content": self.system_prompt},
24
- {"role": "user", "content": ask_text}
25
- ],
26
- temperature=self.temperature,
27
- )
28
- return (index, completion.choices[0].message.content)
29
-
30
- def process_attitude(self, message_list):
31
- new_list = []
32
- num_workers = self.num_workers
33
- timeout_duration = self.timeout_duration
34
- retry_attempts=2
35
-
36
- executor = ThreadPoolExecutor(max_workers=num_workers)
37
- message_chunks = list(self.chunk_list(message_list, num_workers))
38
- for chunk in tqdm(message_chunks, desc="Processing messages"):
39
- future_to_message = {executor.submit(self.get_attitude, message): message for message in chunk}
40
- for _ in range(retry_attempts):
41
- done, not_done = wait(future_to_message.keys(), timeout=timeout_duration)
42
- for future in not_done:
43
- future.cancel()
44
- new_list.extend(future.result() for future in done if future.done())
45
- if len(not_done) == 0:
46
- break
47
- future_to_message = {executor.submit(self.get_attitude, (future_to_message[future], msg), temperature): future_to_message[future] for future, msg in not_done}
48
- executor.shutdown(wait=False)
49
- return new_list
50
-
51
- def complete_attitude_list(self,attitude_list, max_length):
52
- completed_list = []
53
- current_index = 0
54
- for item in attitude_list:
55
- index, value = item
56
- # Fill in missing indices
57
- while current_index < index:
58
- completed_list.append((current_index, None))
59
- current_index += 1
60
- # Add the current element from the list
61
- completed_list.append(item)
62
- current_index = index + 1
63
- while current_index < max_length:
64
- print("Filling in missing index", current_index)
65
- self.miss_index.append(current_index)
66
- completed_list.append((current_index, None))
67
- current_index += 1
68
- return completed_list
69
-
70
- def chunk_list(self, lst, n):
71
- """Yield successive n-sized chunks from lst."""
72
- for i in range(0, len(lst), n):
73
- yield lst[i:i + n]
74
-
75
- def handle_message_list(self,message_list):
76
- indexed_list = [(index, data) for index, data in enumerate(message_list)]
77
- max_length = len(indexed_list)
78
- attitude_list = self.process_attitude(indexed_list)
79
- attitude_list.sort(key=lambda x: x[0])
80
- attitude_list = self.complete_attitude_list(attitude_list, max_length)
81
- attitude_list = [x[1] for x in attitude_list]
82
- return attitude_list
83
-
84
- # Add other necessary methods similar to the above, refactored to fit within this class structure.
85
-
@@ -1,23 +0,0 @@
1
- import pytest
2
- from gpt_batch import GPTBatcher
3
- import os
4
-
5
- def test_handle_message_list():
6
- # Initialize the GPTBatcher with hypothetical valid credentials
7
- #api_key = #get from system environment
8
- api_key = os.getenv('TEST_KEY')
9
- if not api_key:
10
- raise ValueError("API key must be set in the environment variables")
11
- batcher = GPTBatcher(api_key=api_key, model_name='gpt-3.5-turbo-1106', system_prompt="Your task is to discuss privacy questions and provide persuasive answers with supporting reasons.")
12
- message_list = ["I think privacy is important", "I don't think privacy is important"]
13
-
14
- # Call the method under test
15
- results = batcher.handle_message_list(message_list)
16
-
17
- # Assertions to verify the length of the results and the structure of each item
18
- assert len(results) == 2, "There should be two results, one for each message"
19
- assert all(len(result) >= 2 for result in results), "Each result should be at least two elements"
20
-
21
- # Optionally, you can add a test configuration if you have specific needs
22
- if __name__ == "__main__":
23
- pytest.main()