structai 0.1.0__tar.gz

structai-0.1.0/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Wanghan Xu
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
structai-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,365 @@
1
+ Metadata-Version: 2.4
2
+ Name: structai
3
+ Version: 0.1.0
4
+ Summary: A utility package for AI development
5
+ Author-email: Wanghan Xu <xu_wanghan@sjtu.edu.cn>
6
+ Project-URL: Homepage, https://github.com/black-yt/structai
7
+ Classifier: Programming Language :: Python :: 3
8
+ Classifier: License :: OSI Approved :: MIT License
9
+ Classifier: Operating System :: OS Independent
10
+ Requires-Python: >=3.6
11
+ Description-Content-Type: text/markdown
12
+ License-File: LICENSE
13
+ Requires-Dist: openai
14
+ Requires-Dist: python-Levenshtein
15
+ Requires-Dist: json_repair
16
+ Requires-Dist: pillow
17
+ Requires-Dist: httpx[socks]
18
+ Requires-Dist: pandas
19
+ Requires-Dist: numpy
20
+ Requires-Dist: tqdm
21
+ Requires-Dist: fastapi
22
+ Requires-Dist: uvicorn
23
+ Dynamic: license-file
24
+
25
+ # StructAI
26
+
27
+ StructAI is a utility package for AI development. It provides tools for file operations, LLM interactions, parallel processing, and general programming tasks.
28
+
29
+ ## Installation
30
+
31
+ ```bash
32
+ git clone https://github.com/black-yt/structai.git
33
+ cd structai
34
+ pip install -e .
35
+ ```
36
+
37
+ ## API Reference & Usage
38
+
39
+ ### `load_file(path)`
40
+
41
+ Automatically reads a file based on its extension.
42
+
43
+ **Supported formats:** `.json`, `.jsonl`, `.csv`, `.txt`, `.md`, `.pkl`, `.parquet`, `.xlsx`, `.py`, `.npy`, `.pt`, `.png`, `.jpg`, `.jpeg`.
44
+
45
+ ```python
46
+ from structai import load_file
47
+
48
+ # Load a JSON file
49
+ data = load_file("config.json")
50
+
51
+ # Load a CSV file as a pandas DataFrame
52
+ df = load_file("data.csv")
53
+
54
+ # Load an image
55
+ image = load_file("photo.jpg")
56
+ ```
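The extension-based dispatch described above can be sketched with the standard library alone. This is a minimal illustration, not StructAI's implementation: `load_by_extension` and the `_LOADERS` table are hypothetical names, and only a few of the listed formats are covered.

```python
import json
import pickle
import tempfile
from pathlib import Path


def _load_json(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def _load_jsonl(path):
    # One JSON document per non-empty line.
    with open(path, "r", encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def _load_text(path):
    return Path(path).read_text(encoding="utf-8")


def _load_pickle(path):
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical dispatch table: lower-cased extension -> loader function.
_LOADERS = {
    ".json": _load_json,
    ".jsonl": _load_jsonl,
    ".txt": _load_text,
    ".md": _load_text,
    ".pkl": _load_pickle,
}


def load_by_extension(path):
    """Pick a loader from the file extension; fail loudly on unknown ones."""
    suffix = Path(path).suffix.lower()
    try:
        loader = _LOADERS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
    return loader(path)


# Round-trip a small JSON file through the dispatcher.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(json.dumps({"key": "value"}), encoding="utf-8")
    loaded = load_by_extension(config_path)

# An unknown extension is rejected before any file access happens.
try:
    load_by_extension("archive.xyz")
    rejected = False
except ValueError:
    rejected = True
```

A table of loaders keeps each format's logic isolated, so supporting another extension means adding one entry rather than growing an `if`/`elif` chain.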
57
+
58
+ ### `save_file(data, path)`
59
+
60
+ Automatically saves data to a file based on the extension.
61
+
62
+ ```python
63
+ from structai import save_file
64
+
65
+ data = {"key": "value"}
66
+
67
+ # Save as JSON
68
+ save_file(data, "output.json")
69
+
70
+ # Save as Pickle
71
+ save_file(data, "backup.pkl")
72
+ ```
73
+
74
+ ### `print_once(msg)`
75
+
76
+ Prints a message to stdout only the first time it is called. Useful for logging inside loops.
77
+
78
+ ```python
+ from structai import print_once
+
+ for i in range(10):
+     print_once("Starting processing...")  # Prints only once
+     # process(i)
+ ```
85
+
86
+ ### `make_print_once()`
87
+
88
+ Returns a new function that prints a message only once. This allows for creating local "print once" scopes.
89
+
90
+ ```python
91
+ from structai import make_print_once
92
+
93
+ logger1 = make_print_once()
94
+ logger2 = make_print_once()
95
+
96
+ logger1("Hello") # Prints "Hello"
97
+ logger1("Hello") # Does nothing
98
+
99
+ logger2("World") # Prints "World"
100
+ ```
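One way to get independent "print once" scopes is a closure that keeps its own record of what it has printed. The sketch below is illustrative only: `make_print_once_sketch` is a hypothetical name, and it assumes deduplication per distinct message, which the description above does not spell out.

```python
import contextlib
import io


def make_print_once_sketch():
    """Return a printer that remembers which messages it has already shown."""
    seen = set()  # private to this printer, so separate printers do not interfere

    def print_once(msg):
        if msg in seen:
            return
        seen.add(msg)
        print(msg)

    return print_once


# Capture stdout to show what actually gets printed.
buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    logger1 = make_print_once_sketch()
    logger2 = make_print_once_sketch()
    logger1("Hello")
    logger1("Hello")  # suppressed: logger1 has already printed this message
    logger2("Hello")  # printed: logger2 keeps its own state

printed_lines = buffer.getvalue().splitlines()
```

Because each call to the factory creates a fresh `seen` set, two printers never suppress each other's output.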
101
+
102
+ ### `LLMAgent`
103
+
104
+ A wrapper class for interacting with OpenAI-compatible LLM APIs. It handles retries, timeouts, and structured output validation.
105
+
106
+ **Initialization:**
107
+
108
+ ```python
+ from structai import LLMAgent
+
+ agent = LLMAgent(
+     api_key="sk-...",              # Optional if LLM_API_KEY env var is set
+     api_base="https://...",        # Optional if LLM_BASE_URL env var is set
+     model_version='gpt-4.1-mini',  # Default model
+     system_prompt='You are a helpful assistant.',
+     temperature=0,
+     time_limit=300,                # Timeout in seconds
+     max_try=1                      # Number of retries
+ )
+ ```
121
+
122
+ **Basic Usage (`__call__` or `safe_api`):**
123
+
124
+ ```python
125
+ response = agent("What is the capital of France?")
126
+ print(response)
127
+ # Output: "Paris"
128
+ ```
129
+
130
+ **Structured Output Validation:**
131
+
132
+ You can enforce the output format (List, Dict, or specific types) using `return_example`.
133
+
134
+ ```python
+ # Enforce a list of integers
+ numbers = agent(
+     "Generate 3 random numbers",
+     return_example=[1],
+     list_len=3
+ )
+ # Output: [10, 42, 7]
+
+ # Enforce a dictionary with specific keys
+ profile = agent(
+     "Create a user profile for Alice",
+     return_example={"name": "str", "age": 1, "city": "str"}
+ )
+ # Output: {'name': 'Alice', 'age': 25, 'city': 'New York'}
+ ```
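Validation "by example" can be pictured as a recursive comparison of shapes and types. The following sketch is an assumption about how such a check might work, not the logic `LLMAgent` actually uses; `matches_example` is a hypothetical helper.

```python
def matches_example(value, example, list_len=None):
    """Check that `value` has the same shape and types as `example`."""
    if isinstance(example, dict):
        # Dicts must carry exactly the example's keys, each with a matching value.
        if not isinstance(value, dict) or set(value) != set(example):
            return False
        return all(matches_example(value[key], example[key]) for key in example)
    if isinstance(example, list):
        if not isinstance(value, list):
            return False
        if list_len is not None and len(value) != list_len:
            return False
        if not example:
            return True  # an empty example list constrains only the container type
        # The first element of the example is the template for every item.
        return all(matches_example(item, example[0]) for item in value)
    # Scalars: compare exact types so that, for instance, True is not accepted as an int.
    return type(value) is type(example)


profile_example = {"name": "str", "age": 1, "city": "str"}

ok_list = matches_example([10, 42, 7], [1], list_len=3)
wrong_length = matches_example([10, 42], [1], list_len=3)
ok_dict = matches_example({"name": "Alice", "age": 25, "city": "New York"}, profile_example)
wrong_type = matches_example({"name": "Alice", "age": "25", "city": "New York"}, profile_example)
```

A caller could run a check like this on each parsed reply and retry the request when it fails.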
150
+
151
+ **Multimodal Input:**
152
+
153
+ ```python
+ # Pass image paths for vision models
+ description = agent(
+     "Describe this image",
+     image_paths=["image.jpg"]
+ )
+ ```
160
+
161
+ ### `sanitize_text(text)`
162
+
163
+ Sanitizes text by keeping only ASCII English characters, digits, and common punctuation. Removes control characters and ANSI codes.
164
+
165
+ ```python
166
+ from structai import sanitize_text
167
+
168
+ clean = sanitize_text("Hello \x1b[31mWorld\x1b[0m!")
169
+ print(clean) # "Hello World!"
170
+ ```
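The two steps named above, stripping ANSI codes and dropping disallowed characters, can be sketched with two regular expressions. This is a rough approximation under stated assumptions: `sanitize_sketch` is a hypothetical name, and it keeps every printable ASCII character plus newlines and tabs, which may be broader than the "common punctuation" the package allows.

```python
import re

# CSI escape sequences such as "\x1b[31m" (colour) or "\x1b[0m" (reset).
_ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# Anything that is not printable ASCII, a newline, or a tab.
_NOT_ALLOWED = re.compile(r"[^\x20-\x7e\n\t]")


def sanitize_sketch(text):
    # Remove whole escape sequences first; otherwise only the ESC byte would be
    # dropped and fragments like "[31m" would survive as ordinary text.
    text = _ANSI_CSI.sub("", text)
    return _NOT_ALLOWED.sub("", text)


clean = sanitize_sketch("Hello \x1b[31mWorld\x1b[0m!")
accented = sanitize_sketch("caf\u00e9 \u4f60\u597d ok")
```

Note that non-ASCII characters are deleted rather than transliterated, so the second call leaves a gap where the removed characters were.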
171
+
172
+ ### `str2dict(s)`
173
+
174
+ Robustly converts a string representation of a dictionary to a Python `dict`. It handles common formatting errors and uses `json_repair` as a fallback.
175
+
176
+ ```python
177
+ from structai import str2dict
178
+
179
+ d = str2dict("{'a': 1, 'b': 2}")
180
+ print(d['a']) # 1
181
+ ```
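A layered parsing strategy of this kind can be illustrated with the standard library. The sketch below tries strict JSON and then Python literal syntax; it leaves out the `json_repair` fallback that the package mentions, and `str2dict_sketch` is a hypothetical name rather than the real function.

```python
import ast
import json


def str2dict_sketch(s):
    """Try strict JSON first, then fall back to Python literal syntax."""
    try:
        result = json.loads(s)
    except json.JSONDecodeError:
        # Single-quoted keys, True/False/None, and trailing commas are valid
        # Python literals even though they are not valid JSON.
        result = ast.literal_eval(s)
    if not isinstance(result, dict):
        raise ValueError(f"Expected a dict, got {type(result).__name__}")
    return result


from_json = str2dict_sketch('{"a": 1, "b": 2}')
from_python = str2dict_sketch("{'a': 1, 'b': 2}")
```

`ast.literal_eval` only evaluates literals, so it is a safe substitute for `eval` when the input comes from a model.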
182
+
183
+ ### `str2list(s)`
184
+
185
+ Robustly converts a string representation of a list to a Python `list`.
186
+
187
+ ```python
188
+ from structai import str2list
189
+
190
+ l = str2list("[1, 2, 3]")
191
+ print(len(l)) # 3
192
+ ```
193
+
194
+ ### `add_no_proxy_if_private(url)`
195
+
196
+ Checks whether the hostname in the URL is a private IP address. If it is, the host is added to the `no_proxy` environment variable so that requests to it bypass any configured proxy.
197
+
198
+ ```python
199
+ from structai import add_no_proxy_if_private
200
+
201
+ add_no_proxy_if_private("http://192.168.1.100:8080/v1")
202
+ ```
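The private-address check can be sketched with `urllib.parse` and `ipaddress` from the standard library. This is an illustration under assumptions, not the package's code: `add_no_proxy_sketch` is a hypothetical name, it takes the environment mapping as a parameter so the example does not touch the real process environment, and it treats DNS names as non-private without resolving them.

```python
import ipaddress
from urllib.parse import urlsplit


def add_no_proxy_sketch(url, environ):
    """Append the URL's host to environ["no_proxy"] if it is a private IP literal."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        return False  # a DNS name such as "example.com", not an IP literal
    if not address.is_private:
        return False
    entries = [entry for entry in environ.get("no_proxy", "").split(",") if entry]
    if host not in entries:
        entries.append(host)
    environ["no_proxy"] = ",".join(entries)
    return True


env = {"no_proxy": "localhost"}
added_private = add_no_proxy_sketch("http://192.168.1.100:8080/v1", env)
added_public = add_no_proxy_sketch("https://8.8.8.8/v1", env)
added_name = add_no_proxy_sketch("https://example.com/v1", env)
```

In real use the second argument would be `os.environ`.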
203
+
204
+ ### `read_image(image_path)`
205
+
206
+ Reads an image from a path and returns a PIL Image object.
207
+
208
+ ```python
209
+ from structai import read_image
210
+
211
+ img = read_image("photo.jpg")
212
+ ```
213
+
214
+ ### `encode_image(image_obj)`
215
+
216
+ Encodes a PIL Image object into a base64 string.
217
+
218
+ ```python
+ from structai import encode_image, read_image
+
+ img = read_image("photo.jpg")
+ b64_str = encode_image(img)
+ ```
223
+
224
+ ### `messages_to_responses_input(messages)`
225
+
226
+ Converts standard Chat Completions `messages` format (list of dicts) to the input format required by the Responses API.
227
+
228
+ ```python
229
+ from structai import messages_to_responses_input
230
+
231
+ messages = [{"role": "user", "content": "Hello"}]
232
+ system_prompt, input_blocks = messages_to_responses_input(messages)
233
+ ```
234
+
235
+ ### `extract_text_outputs(result)`
236
+
237
+ Extracts the text content from an LLM API response object (supports both Chat Completions and Responses API formats).
238
+
239
+ ```python
240
+ from structai import extract_text_outputs
241
+
242
+ # Assuming 'response' is the object returned by the OpenAI client
243
+ texts = extract_text_outputs(response)
244
+ print(texts[0])
245
+ ```
246
+
247
+ ### `multi_thread(inp_list, function, max_workers=40, use_tqdm=True)`
248
+
249
+ Executes a function concurrently for each item in `inp_list` using a thread pool.
250
+
251
+ ```python
+ from structai import multi_thread
+
+ def square(x):
+     return x * x
+
+ inputs = [{"x": i} for i in range(10)]
+ results = multi_thread(inputs, square, max_workers=4)
+ print(results)  # [0, 1, 4, 9, ...]
+ ```
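The thread-pool pattern can be sketched with `concurrent.futures`. The version below rests on two assumptions drawn from the example above rather than from the package source: each input dict is unpacked as keyword arguments, and results come back in input order. `multi_thread_sketch` is a hypothetical name, and the progress bar is omitted.

```python
from concurrent.futures import ThreadPoolExecutor


def multi_thread_sketch(inp_list, function, max_workers=40):
    """Call function(**item) for each item, returning results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(function, **item) for item in inp_list]
        # Collecting in submission order keeps results aligned with inp_list;
        # .result() re-raises any exception raised inside a worker.
        return [future.result() for future in futures]


def square(x):
    return x * x


results = multi_thread_sketch([{"x": i} for i in range(10)], square, max_workers=4)
```

Threads suit I/O-bound work such as API calls; for CPU-bound Python code the global interpreter lock limits the speedup.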
262
+
263
+ ### `multi_process(inp_list, function, max_workers=40, use_tqdm=True)`
264
+
265
+ Executes a function concurrently for each item in `inp_list` using a process pool. Ideal for CPU-bound tasks.
266
+
267
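+ ```python
+ from structai import multi_process
+
+ def heavy_computation(n):
+     return sum(range(n))
+
+ # The guard keeps worker processes from re-running this code on import,
+ # which matters on platforms that start workers with "spawn".
+ if __name__ == "__main__":
+     inputs = [{"n": 1000000} for _ in range(5)]
+     results = multi_process(inputs, heavy_computation)
+ ```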
276
+
277
+ ### `run_server(host="0.0.0.0", port=8001)`
278
+
279
+ Starts a FastAPI server that acts as a proxy to an OpenAI-compatible LLM provider.
280
+
281
+ ```python
+ from structai import run_server
+
+ if __name__ == "__main__":
+     run_server()
+ ```
287
+
288
+ ### `timeout_limit(timeout=None)`
289
+
290
+ A decorator that enforces a maximum execution time on a function. Raises `TimeoutError` if the limit is exceeded.
291
+
292
+ ```python
+ from structai import timeout_limit
+ import time
+
+ @timeout_limit(timeout=2.0)
+ def task():
+     time.sleep(5)
+
+ # This will raise TimeoutError
+ task()
+ ```
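A timeout decorator can be built by running the wrapped function in a worker thread and waiting for it with a deadline. The sketch below shows that general technique; it is not necessarily how `timeout_limit` is implemented, and `timeout_limit_sketch` is a hypothetical name.

```python
import functools
import threading
import time


def timeout_limit_sketch(timeout=None):
    """Decorator factory: raise TimeoutError if the call outlives `timeout` seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            outcome = {}

            def target():
                try:
                    outcome["value"] = func(*args, **kwargs)
                except Exception as exc:
                    outcome["error"] = exc  # hand the exception back to the caller

            worker = threading.Thread(target=target, daemon=True)
            worker.start()
            worker.join(timeout)  # timeout=None waits indefinitely
            if worker.is_alive():
                raise TimeoutError(f"{func.__name__} exceeded {timeout} seconds")
            if "error" in outcome:
                raise outcome["error"]
            return outcome["value"]
        return wrapper
    return decorator


@timeout_limit_sketch(timeout=0.05)
def slow_task():
    time.sleep(0.5)


@timeout_limit_sketch(timeout=1.0)
def fast_task(x):
    return x * 2


try:
    slow_task()
    timed_out = False
except TimeoutError:
    timed_out = True

doubled = fast_task(10)
```

A thread cannot be forcibly stopped in Python, so with this approach the timed-out function keeps running in the background; only the caller stops waiting for it.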
303
+
304
+ ### `run_with_timeout(func, args=(), kwargs=None, timeout=None)`
305
+
306
+ Runs a function with a specified timeout without using a decorator.
307
+
308
+ ```python
+ from structai import run_with_timeout
+
+ def task(x):
+     return x * 2
+
+ result = run_with_timeout(task, args=(10,), timeout=1.0)
+ ```
316
+
317
+ ### `parse_think_answer(text)`
318
+
319
+ Parses a string containing Chain-of-Thought tags (`<think>...</think>` and `<answer>...</answer>`) and returns the content of both.
320
+
321
+ ```python
322
+ from structai import parse_think_answer
323
+
324
+ raw_text = "<think>Step 1...</think><answer>42</answer>"
325
+ think, answer = parse_think_answer(raw_text)
326
+ print(f"Reasoning: {think}")
327
+ print(f"Result: {answer}")
328
+ ```
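Tag parsing of this kind is commonly done with non-greedy regular expressions. The sketch below is one possible approach; `parse_think_answer_sketch` is a hypothetical name, and returning `None` for a missing section is an assumption, since the description above does not say what happens when a tag is absent.

```python
import re

# DOTALL lets "." match newlines, so multi-line reasoning is captured whole.
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_ANSWER = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def parse_think_answer_sketch(text):
    """Return (think, answer); a missing section comes back as None."""
    think = _THINK.search(text)
    answer = _ANSWER.search(text)
    return (
        think.group(1).strip() if think else None,
        answer.group(1).strip() if answer else None,
    )


think, answer = parse_think_answer_sketch(
    "<think>Step 1...\nStep 2</think><answer>42</answer>"
)
missing = parse_think_answer_sketch("no tags here")
```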
329
+
330
+ ### `extract_within_tags(content, start_tag='<answer>', end_tag='</answer>', default_return=None)`
331
+
332
+ Extracts the substring found between two specific tags.
333
+
334
+ ```python
335
+ from structai import extract_within_tags
336
+
337
+ text = "Result: <json>{...}</json>"
338
+ json_str = extract_within_tags(text, "<json>", "</json>")
339
+ ```
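For a single pair of arbitrary tags, plain string searching is enough. The following sketch mirrors the signature shown above but is only an illustration: `extract_within_tags_sketch` is a hypothetical name, and returning `default_return` when either tag is missing is an assumption about the intended behavior.

```python
def extract_within_tags_sketch(content, start_tag="<answer>", end_tag="</answer>",
                               default_return=None):
    """Return the text between the first start_tag and the next end_tag."""
    start = content.find(start_tag)
    if start == -1:
        return default_return
    start += len(start_tag)
    # Search for the end tag only after the start tag, so an earlier stray
    # end tag cannot produce an empty or reversed slice.
    end = content.find(end_tag, start)
    if end == -1:
        return default_return
    return content[start:end]


json_str = extract_within_tags_sketch('Result: <json>{"ok": true}</json>', "<json>", "</json>")
fallback = extract_within_tags_sketch("no tags", default_return="")
```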
340
+
341
+ ### `get_all_file_paths(directory, suffix='')`
342
+
343
+ Recursively retrieves all file paths in a directory that match a given suffix.
344
+
345
+ ```python
346
+ from structai import get_all_file_paths
347
+
348
+ # Get all Python files in the current directory
349
+ py_files = get_all_file_paths(".", suffix=".py")
350
+ print(py_files)
351
+ ```
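A recursive file listing with a suffix filter can be sketched with `os.walk`. This is a minimal illustration rather than the package's code: `get_all_file_paths_sketch` is a hypothetical name, and sorting the result is an added choice to make the output deterministic.

```python
import os
import tempfile


def get_all_file_paths_sketch(directory, suffix=""):
    """Walk `directory` recursively and collect files whose names end with `suffix`."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(suffix):  # the empty default suffix matches every file
                matches.append(os.path.join(root, name))
    return sorted(matches)


# Build a small throwaway tree: a.py, pkg/b.py, pkg/sub/c.txt
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "pkg", "sub"))
    for rel in ("a.py", os.path.join("pkg", "b.py"), os.path.join("pkg", "sub", "c.txt")):
        with open(os.path.join(tmp, rel), "w", encoding="utf-8") as f:
            f.write("")
    py_files = [os.path.relpath(p, tmp) for p in get_all_file_paths_sketch(tmp, suffix=".py")]
    all_count = len(get_all_file_paths_sketch(tmp))

expected_py = ["a.py", os.path.join("pkg", "b.py")]
```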
352
+
353
+ ### `remove_tag(s, tags=["<think>", "</think>", "<answer>", "</answer>"], r="\n")`
354
+
355
+ Removes specified tags from a string, replacing them with a separator (default newline).
356
+
357
+ ```python
358
+ from structai import remove_tag
359
+
360
+ clean_text = remove_tag("<think>...</think> Answer")
361
+ ```
362
+
363
+ ## License
364
+
365
+ MIT License