viscribe 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 Viscribe AI
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,522 @@
1
+ <div align="center">
2
+ <picture>
3
+ <source media="(prefers-color-scheme: dark)" srcset="./assets/white-v.png">
4
+ <source media="(prefers-color-scheme: light)" srcset="./assets/black-v.png">
5
+ <img src="./assets/black-v.png" alt="Viscribe" width="160">
6
+ </picture>
7
+
8
+ <h1>ViscribeAI</h1>
9
+
10
+ <p>Extract <strong>structured data</strong> from <strong>images</strong> using <strong>AI models</strong>.</p>
11
+
12
+ <p>
13
+ <img alt="Python 3.10+" src="https://img.shields.io/badge/python-3.10%2B-3776AB">
14
+ <img alt="Node.js 20+" src="https://img.shields.io/badge/node.js-20%2B-339933">
15
+ <img alt="License MIT" src="https://img.shields.io/badge/license-MIT-blue">
16
+ <a href="https://x.com/itsperini"><img alt="X @itsperini" src="https://img.shields.io/badge/X-@itsperini-000000?logo=x&logoColor=white"></a>
17
+ <img alt="Discord coming soon" src="https://img.shields.io/badge/discord-coming_soon-5865F2?logo=discord&logoColor=white">
18
+ <a href="https://docs.viscribe.ai"><img alt="Docs docs.viscribe.ai" src="https://img.shields.io/badge/docs-docs.viscribe.ai-2563EB"></a>
19
+ </p>
20
+ </div>
21
+
22
+ > Define the output schema, pass the image, pick the AI model, and get parsed
23
+ structured output back instead of free-form text.
24
+
25
+ ## 📦 Installation
26
+
27
+ Python:
28
+
29
+ ```bash
30
+ pip install viscribe
31
+ ```
32
+
33
+ TypeScript:
34
+
35
+ ```bash
36
+ npm install viscribe
37
+ ```
38
+
39
+ ## 🚀 Features
40
+
41
+ - 🖼️ AI-powered image description, extraction, classification, VQA (Visual Question Answering), and comparison
42
+ - 🔄 Both sync and async clients
43
+ - 📊 Structured output with Pydantic schemas
44
+ - 🔍 Detailed logging
45
+ - ⚡ Automatic retries
46
+
47
+ ## 🎯 Quick Start
48
+
49
+ ```python
50
+ from viscribe.images import describe
51
+
52
+ result = describe(
53
+ image_path="examples/venice.png",
54
+ # image_base64="...",
55
+ generate_tags=True,
56
+ model_config={
57
+ "model": "gpt-4o-mini",
58
+ "api_key": "sk-...",
59
+ "temperature": 0,
60
+ },
61
+ )
62
+
63
+ print(result)
64
+
65
+ # ImageResult(
66
+ # data={
67
+ # "image_description": "A scenic view of Venice...",
68
+ # "tags": ["Venice", "canal", "gondolas"],
69
+ # },
70
+ # raw=<OpenAI response>,
71
+ # usage_metadata={"input_tokens": 123, "output_tokens": 45, ...},
72
+ # )
73
+ ```
74
+
75
+ <details>
76
+ <summary>TypeScript</summary>
77
+
78
+ ```ts
79
+ import { images } from "viscribe";
80
+
81
+ const result = await images.describe({
82
+ imagePath: "examples/venice.png",
83
+ generateTags: true,
84
+ modelConfig: {
85
+ model: "gpt-4o-mini",
86
+ apiKey: "sk-...",
87
+ temperature: 0,
88
+ },
89
+ });
90
+
91
+ console.log(result);
92
+ ```
93
+
94
+ </details>
95
+
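The Quick Start output above shows a `usage_metadata` mapping with `input_tokens` and `output_tokens`. As a minimal sketch, assuming every result exposes that attribute in the shape shown, token counts could be totalled across several calls. The `total_usage` helper and the stand-in results are hypothetical and are not part of Viscribe.

```python
from types import SimpleNamespace


def total_usage(results):
    """Sum input and output token counts across several results.

    Hypothetical helper: assumes each result has a `usage_metadata`
    attribute shaped like the Quick Start output (or None).
    """
    totals = {"input_tokens": 0, "output_tokens": 0}
    for result in results:
        usage = getattr(result, "usage_metadata", None) or {}
        for key in totals:
            totals[key] += int(usage.get(key, 0))
    return totals


# Stand-ins for real results; only `usage_metadata` matters for this sketch.
fake_results = [
    SimpleNamespace(usage_metadata={"input_tokens": 123, "output_tokens": 45}),
    SimpleNamespace(usage_metadata={"input_tokens": 100, "output_tokens": 30}),
    SimpleNamespace(usage_metadata=None),
]

print(total_usage(fake_results))
```

Results without usage information are counted as zero rather than raising, which keeps the helper usable when a provider omits the field.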
96
+ > **Note:**
97
+ > Viscribe works with OpenAI-compatible endpoints (support for more providers is coming soon). Load your API key from an environment variable instead of hardcoding it in your code.
98
+
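One way to follow the note above is to build the `model_config` dictionary from the environment. This is a sketch only: the `build_model_config` helper and the `OPENAI_API_KEY` variable name are assumptions made for illustration, and Viscribe itself may not read any environment variable.

```python
import os


def build_model_config(model="gpt-4o-mini", env_var="OPENAI_API_KEY"):
    """Build a model_config dict with the API key taken from the environment.

    Hypothetical helper; `env_var` is an assumed variable name.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return {"model": model, "api_key": api_key, "temperature": 0}


# Intended usage (requires viscribe to be installed):
#   from viscribe.images import describe
#   result = describe(image_path="examples/venice.png",
#                     model_config=build_model_config())
```

Failing early with a clear message avoids sending a request with an empty key.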
99
+ ## 📚 Image Endpoints
100
+
101
+ ### 1. Describe Image
102
+
103
+ Generate a natural language description of an image, optionally with tags.
104
+
105
+ ```python
106
+ from viscribe.images import describe
107
+
108
+ result = describe(
109
+ image_path="examples/venice.png",
110
+ generate_tags=True,
111
+ model_config={
112
+ "model": "gpt-4o-mini",
113
+ "api_key": "sk-...",
114
+ "temperature": 0,
115
+ },
116
+ )
117
+
118
+ print(result.data)
119
+ ```
120
+
121
+ <details>
122
+ <summary>TypeScript</summary>
123
+
124
+ ```ts
125
+ import { images } from "viscribe";
126
+
127
+ const result = await images.describe({
128
+ imagePath: "examples/venice.png",
129
+ generateTags: true,
130
+ modelConfig: {
131
+ model: "gpt-4o-mini",
132
+ apiKey: "sk-...",
133
+ temperature: 0,
134
+ },
135
+ });
136
+
137
+ console.log(result.data);
138
+ ```
139
+
140
+ </details>
141
+
142
+ ### 2. Classify Image
143
+
144
+ Classify an image into one or more categories.
145
+
146
+ ```python
147
+ from viscribe.images import classify
148
+
149
+ result = classify(
150
+ image_path="examples/venice.png",
151
+ classes=["canal", "city", "landmark", "interior"],
152
+ multi_label=True,
153
+ model_config={
154
+ "model": "gpt-4o-mini",
155
+ "api_key": "sk-...",
156
+ "temperature": 0,
157
+ },
158
+ )
159
+
160
+ print(result.data)
161
+ ```
162
+
163
+ <details>
164
+ <summary>TypeScript</summary>
165
+
166
+ ```ts
167
+ import { images } from "viscribe";
168
+
169
+ const result = await images.classify({
170
+ imagePath: "examples/venice.png",
171
+ classes: ["canal", "city", "landmark", "interior"],
172
+ multiLabel: true,
173
+ modelConfig: {
174
+ model: "gpt-4o-mini",
175
+ apiKey: "sk-...",
176
+ temperature: 0,
177
+ },
178
+ });
179
+
180
+ console.log(result.data);
181
+ ```
182
+
183
+ </details>
184
+
185
+ ### 3. Visual Question Answering (VQA)
186
+
187
+ Ask a question about the content of an image and get an answer.
188
+
189
+ ```python
190
+ from viscribe.images import ask
191
+
192
+ result = ask(
193
+ image_path="examples/venice.png",
194
+ question="What kind of place is shown in this image?",
195
+ model_config={
196
+ "model": "gpt-4o-mini",
197
+ "api_key": "sk-...",
198
+ "temperature": 0,
199
+ },
200
+ )
201
+
202
+ print(result.data)
203
+ ```
204
+
205
+ <details>
206
+ <summary>TypeScript</summary>
207
+
208
+ ```ts
209
+ import { images } from "viscribe";
210
+
211
+ const result = await images.ask({
212
+ imagePath: "examples/venice.png",
213
+ question: "What kind of place is shown in this image?",
214
+ modelConfig: {
215
+ model: "gpt-4o-mini",
216
+ apiKey: "sk-...",
217
+ temperature: 0,
218
+ },
219
+ });
220
+
221
+ console.log(result.data);
222
+ ```
223
+
224
+ </details>
225
+
226
+ ### 4. Extract Structured Data from Image
227
+
228
+ Extract structured data from an image using either a simple list of field definitions or a more complex schema.
229
+
230
+ #### Simple Schema
231
+
232
+ Use a simple schema for straightforward data extraction.
233
+
234
+ ```python
235
+ from viscribe.images import extract
236
+
237
+ result = extract(
238
+ image_path="examples/venice.png",
239
+ output_schema=[
240
+ {"name": "location", "type": "text", "description": "Likely place shown"},
241
+ {"name": "visible_elements", "type": "array_text", "description": "Objects and structures"},
242
+ {"name": "colors", "type": "array_text", "description": "Dominant colors"},
243
+ ],
244
+ model_config={
245
+ "model": "gpt-4o-mini",
246
+ "api_key": "sk-...",
247
+ "temperature": 0,
248
+ },
249
+ )
250
+
251
+ print(result.data)
252
+ ```
253
+
254
+ <details>
255
+ <summary>TypeScript</summary>
256
+
257
+ ```ts
258
+ import { images } from "viscribe";
259
+
260
+ const result = await images.extract({
261
+ imagePath: "examples/venice.png",
262
+ outputSchema: [
263
+ { name: "location", type: "text", description: "Likely place shown" },
264
+ {
265
+ name: "visible_elements",
266
+ type: "array_text",
267
+ description: "Objects and structures",
268
+ },
269
+ { name: "colors", type: "array_text", description: "Dominant colors" },
270
+ ],
271
+ modelConfig: {
272
+ model: "gpt-4o-mini",
273
+ apiKey: "sk-...",
274
+ temperature: 0,
275
+ },
276
+ });
277
+
278
+ console.log(result.data);
279
+ ```
280
+
281
+ </details>
282
+
283
+ **Field Types:**
284
+
285
+ - `text`: Single text value
286
+ - `number`: Single numeric value
287
+ - `array_text`: Array of text values
288
+ - `array_number`: Array of numeric values
289
+
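The four field types listed above can be checked against returned data. The sketch below assumes `result.data` is a dictionary keyed by field name, which the README does not state for `extract`; the `find_schema_mismatches` helper is hypothetical and not part of Viscribe.

```python
from numbers import Real


def _is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, Real) and not isinstance(value, bool)


# One predicate per simple field type.
FIELD_CHECKS = {
    "text": lambda v: isinstance(v, str),
    "number": _is_number,
    "array_text": lambda v: isinstance(v, list) and all(isinstance(x, str) for x in v),
    "array_number": lambda v: isinstance(v, list) and all(_is_number(x) for x in v),
}


def find_schema_mismatches(output_schema, data):
    """Return a list of human-readable problems; empty means data matches."""
    problems = []
    for field in output_schema:
        name, field_type = field["name"], field["type"]
        if field_type not in FIELD_CHECKS:
            problems.append(f"{name}: unknown field type {field_type!r}")
        elif name not in data:
            problems.append(f"{name}: missing from data")
        elif not FIELD_CHECKS[field_type](data[name]):
            problems.append(f"{name}: expected {field_type}")
    return problems


schema = [
    {"name": "location", "type": "text", "description": "Likely place shown"},
    {"name": "colors", "type": "array_text", "description": "Dominant colors"},
]
print(find_schema_mismatches(schema, {"location": "Venice", "colors": ["blue", 3]}))
```

A check like this can run on `result.data` before the values are passed to downstream code.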
290
+ #### More Complex Schema
291
+
292
+ Use a Pydantic model as the `output_schema` when you need complex or nested structures.
293
+
294
+ ```python
295
+ from pydantic import BaseModel
296
+ from viscribe.images import extract
297
+
298
+
299
+ class Scene(BaseModel):
300
+ location: str
301
+ visible_elements: list[str]
302
+ specifications: dict
303
+
304
+
305
+ result = extract(
306
+ image_path="examples/venice.png",
307
+ output_schema=Scene,
308
+ model_config={
309
+ "model": "gpt-4o-mini",
310
+ "api_key": "sk-...",
311
+ "temperature": 0,
312
+ },
313
+ )
314
+
315
+ print(result.data)
316
+ ```
317
+
318
+ <details>
319
+ <summary>TypeScript</summary>
320
+
321
+ ```ts
322
+ import { images } from "viscribe";
323
+
324
+ const result = await images.extract({
325
+ imagePath: "examples/venice.png",
326
+ outputSchema: {
327
+ title: "Scene",
328
+ type: "object",
329
+ properties: {
330
+ location: { type: "string" },
331
+ visible_elements: {
332
+ type: "array",
333
+ items: { type: "string" },
334
+ },
335
+ specifications: { type: "object" },
336
+ },
337
+ required: ["location", "visible_elements", "specifications"],
338
+ additionalProperties: false,
339
+ },
340
+ modelConfig: {
341
+ model: "gpt-4o-mini",
342
+ apiKey: "sk-...",
343
+ temperature: 0,
344
+ },
345
+ });
346
+
347
+ console.log(result.data);
348
+ ```
349
+
350
+ </details>
351
+
352
+ > **Note:** In Python, `output_schema` can be either a simple list of field definitions or a Pydantic model. The TypeScript example above passes a JSON Schema object as `outputSchema` instead.
353
+
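To relate the two schema styles, a simple field list can be rewritten as a JSON Schema object of the kind used in the TypeScript example. The mapping below (`text` to `string`, `array_text` to an array of strings, and so on) is an assumption made for illustration; it is not taken from Viscribe's implementation, and `fields_to_json_schema` is a hypothetical helper.

```python
import copy
import json

# Assumed correspondence between simple field types and JSON Schema fragments.
JSON_TYPES = {
    "text": {"type": "string"},
    "number": {"type": "number"},
    "array_text": {"type": "array", "items": {"type": "string"}},
    "array_number": {"type": "array", "items": {"type": "number"}},
}


def fields_to_json_schema(title, fields):
    """Convert a simple field list into a JSON Schema object (illustrative)."""
    properties = {}
    for field in fields:
        prop = copy.deepcopy(JSON_TYPES[field["type"]])
        if field.get("description"):
            prop["description"] = field["description"]
        properties[field["name"]] = prop
    return {
        "title": title,
        "type": "object",
        "properties": properties,
        "required": [field["name"] for field in fields],
        "additionalProperties": False,
    }


scene_schema = fields_to_json_schema(
    "Scene",
    [
        {"name": "location", "type": "text", "description": "Likely place shown"},
        {"name": "visible_elements", "type": "array_text"},
    ],
)
print(json.dumps(scene_schema, indent=2))
```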
354
+ ### 5. Compare Images
355
+
356
+ Compare two images and get a description of their similarities and differences.
357
+
358
+ ```python
359
+ from viscribe.images import compare
360
+
361
+ result = compare(
362
+ image1_path="examples/venice.png",
363
+ image2_path="examples/venice.png",
364
+ model_config={
365
+ "model": "gpt-4o-mini",
366
+ "api_key": "sk-...",
367
+ "temperature": 0,
368
+ },
369
+ )
370
+
371
+ print(result.data)
372
+ ```
373
+
374
+ <details>
375
+ <summary>TypeScript</summary>
376
+
377
+ ```ts
378
+ import { images } from "viscribe";
379
+
380
+ const result = await images.compare({
381
+ image1Path: "examples/venice.png",
382
+ image2Path: "examples/venice.png",
383
+ modelConfig: {
384
+ model: "gpt-4o-mini",
385
+ apiKey: "sk-...",
386
+ temperature: 0,
387
+ },
388
+ });
389
+
390
+ console.log(result.data);
391
+ ```
392
+
393
+ </details>
394
+
395
+ ## ⚡ Async Usage
396
+
397
+ Every Python endpoint has an async counterpart, exposed as an `a*` helper (for example, `adescribe`):
398
+
399
+ ```python
400
+ import asyncio
401
+ from viscribe.images import adescribe
402
+
403
+
404
+ async def main() -> None:
405
+ result = await adescribe(
406
+ image_path="examples/venice.png",
407
+ generate_tags=True,
408
+ model_config={
409
+ "model": "gpt-4o-mini",
410
+ "api_key": "sk-...",
411
+ "temperature": 0,
412
+ },
413
+ )
414
+
415
+ print(result.data)
416
+
417
+
418
+ asyncio.run(main())
419
+ ```
420
+
421
+ You can also reuse an async client:
422
+
423
+ ```python
424
+ import asyncio
425
+ from viscribe import ViscribeAI
426
+
427
+
428
+ async def main() -> None:
429
+ client = ViscribeAI(
430
+ model_config={
431
+ "model": "gpt-4o-mini",
432
+ "api_key": "sk-...",
433
+ "temperature": 0,
434
+ }
435
+ )
436
+
437
+ result = await client.images.adescribe(
438
+ image_path="examples/venice.png",
439
+ generate_tags=True,
440
+ )
441
+
442
+ print(result.data)
443
+
444
+
445
+ asyncio.run(main())
446
+ ```
447
+
448
+ <details>
449
+ <summary>TypeScript</summary>
450
+
451
+ TypeScript is async-native, so use the same methods with `await`:
452
+
453
+ ```ts
454
+ import { images, ViscribeAI } from "viscribe";
455
+
456
+ const result = await images.describe({
457
+ imagePath: "examples/venice.png",
458
+ generateTags: true,
459
+ modelConfig: {
460
+ model: "gpt-4o-mini",
461
+ apiKey: "sk-...",
462
+ temperature: 0,
463
+ },
464
+ });
465
+
466
+ console.log(result.data);
467
+
468
+ const client = new ViscribeAI({
469
+ modelConfig: {
470
+ model: "gpt-4o-mini",
471
+ apiKey: "sk-...",
472
+ temperature: 0,
473
+ },
474
+ });
475
+
476
+ const clientResult = await client.images.describe({
477
+ imagePath: "examples/venice.png",
478
+ generateTags: true,
479
+ });
480
+
481
+ console.log(clientResult.data);
482
+ ```
483
+
484
+ </details>
485
+
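Because the async helpers are awaitable, several images can be processed concurrently with `asyncio.gather`. The sketch below uses a placeholder coroutine, `fake_adescribe`, so that it runs without Viscribe or network access; in real code `adescribe` (or `client.images.adescribe`) would be passed instead. The `describe_many` helper and its `limit` parameter are hypothetical.

```python
import asyncio


async def fake_adescribe(image_path, **kwargs):
    """Placeholder for `viscribe.images.adescribe`; returns a dummy dict."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return {"image_path": image_path}


async def describe_many(image_paths, describe_fn, limit=4):
    """Run describe_fn over all paths with at most `limit` calls in flight."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(path):
        async with semaphore:
            return await describe_fn(image_path=path)

    # gather returns results in the same order as the input paths.
    return await asyncio.gather(*(run_one(path) for path in image_paths))


results = asyncio.run(describe_many(["a.png", "b.png", "c.png"], fake_adescribe))
print([r["image_path"] for r in results])
```

The semaphore caps concurrency, which can help stay within a provider's rate limits when the list of images is long.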
486
+ ## 📖 Documentation
487
+
488
+ For detailed documentation, visit [docs.viscribe.ai](https://docs.viscribe.ai).
489
+
490
+ ## 🛠️ Development
491
+
492
+ For information about setting up the development environment and contributing to the project, see our [Contributing Guide](CONTRIBUTING.md).
493
+
494
+ ## 💬 Support & Feedback
495
+
496
+ - 📧 Email: support@viscribe.ai
497
+ - 💻 GitHub Issues: [Create an issue](https://github.com/itsperini/viscribe/issues)
498
+ - 🌟 Feature Requests: [Request a feature](https://github.com/itsperini/viscribe/issues/new)
499
+
500
+ ## 🤝 Contributing
501
+
502
+ Contributions are welcome! Join our Discord server to discuss improvements and share suggestions.
503
+
504
+ Please see the [contributing guidelines](./CONTRIBUTING.md).
505
+
506
+ [![Discord](https://skillicons.dev/icons?i=discord)](https://discord.gg/uJN7TYcp)
507
+ [![LinkedIn](https://skillicons.dev/icons?i=linkedin)](https://www.linkedin.com/in/itsperini)
508
+ [![Twitter](https://skillicons.dev/icons?i=twitter)](https://twitter.com/itsperini)
509
+
510
+ ## 📄 License
511
+
512
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
513
+
514
+ ## 🔗 Links
515
+
516
+ - [Website](https://viscribe.ai)
517
+ - [Documentation](https://docs.viscribe.ai)
518
+ - [GitHub](https://github.com/itsperini/viscribe)
519
+
520
+ ---
521
+
522
+ Made with ❤️ by [ViscribeAI](https://viscribe.ai)
Binary file