viscribe 1.0.5 → 1.1.0

package/README.md CHANGED
@@ -1,7 +1,7 @@
  <div align="center">
- <img src="./assets/viscribe-hero.png" alt="ViscribeAI" width="860">
+ <img src="../assets/viscribe-hero.png" alt="ViscribeAI" width="860">

- <h1>ViscribeAI</h1>
+ <h1>ViscribeAI TypeScript</h1>

  <p>Extract <strong>structured data</strong> from <strong>images</strong> using <strong>AI models</strong>.</p>

@@ -9,7 +9,6 @@
  <a href="https://x.com/itsperini"><img alt="X @itsperini" src="https://img.shields.io/badge/X-@itsperini-000000?logo=x&logoColor=white"></a>
  <a href="https://discord.gg/GVgJ9ujT"><img alt="Discord" src="https://img.shields.io/badge/discord-join-5865F2?logo=discord&logoColor=white"></a>
  <a href="https://docs.viscribe.ai"><img alt="Docs docs.viscribe.ai" src="https://img.shields.io/badge/-docs.viscribe.ai-2563EB?logo=bookstack&logoColor=white"></a>
- <img alt="Python 3.10+" src="https://img.shields.io/badge/python-3.10%2B-3776AB">
  <img alt="Node.js 20+" src="https://img.shields.io/badge/node.js-20%2B-339933">
  <img alt="License MIT" src="https://img.shields.io/badge/license-MIT-blue">
  </p>
@@ -18,263 +17,35 @@
  > Define the output schema, pass the image, pick the AI model, and get parsed
  > structured output back instead of free-form text.

- ⭐ If Viscribe helps your project, please leave a
- [star](https://github.com/itsperini/viscribe). ⭐
-
  ## 📦 Installation

- Python:
-
- ```bash
- pip install viscribe
- ```
-
- TypeScript:
-
  ```bash
  npm install viscribe
  ```

  ## 🚀 Features

- - 🖼️ AI-powered image description, extraction, classification, VQA (Visual Question Answering), and comparison
- - 🔄 Both sync and async clients
- - 📊 Structured output with Pydantic schemas
- - 🔍 Detailed logging
- - Automatic retries
+ - 🖼️ One schema-driven `images.extract` helper for image workflows
+ - 📊 JSON Schema objects and simple field definitions
+ - 📁 Local image paths, base64 images, and remote image URLs
+ - ⚙️ Reusable `new ViscribeAI().images.extract` client namespace
+ - 🧩 OpenAI-compatible model configuration
+ - 🪄 Typed result wrappers for app and agent code

  ## 🎯 Quick Start

- ```python
- from viscribe.images import describe
-
- result = describe(
-     image_path="examples/venice.png",
-     # image_base64="...",
-     generate_tags=True,
-     model_config={
-         "model": "gpt-5-mini",
-         "api_key": "sk-...",
-         "temperature": 1,
-     },
- )
-
- print(result)
-
- # ImageResult(
- #     data={
- #         "image_description": "A scenic view of Venice...",
- #         "tags": ["Venice", "canal", "gondolas"],
- #     },
- #     raw=<OpenAI response>,
- #     usage_metadata={"input_tokens": 123, "output_tokens": 45, ...},
- # )
- ```
-
- <details>
- <summary>TypeScript</summary>
-
- ```ts
- import { images } from "viscribe";
-
- const result = await images.describe({
-   imagePath: "examples/venice.png",
-   generateTags: true,
-   modelConfig: {
-     model: "gpt-5-mini",
-     apiKey: "sk-...",
-     temperature: 1,
-   },
- });
-
- console.log(result);
- ```
-
- </details>
-
- > **Note:**
- > Viscribe works with OpenAI-compatible endpoints (more support coming soon). It is recommended to load your API key from an environment variable instead of hardcoding it in your code.
-
- ## 📚 Image Endpoints
-
- | Method | Description |
- | ---------- | ------------------------------------------------------------------------------------------------------ |
- | `describe` | Generate an objective image description with optional tags. |
- | `classify` | Classify an image into one or more allowed or free-form categories. |
- | `ask` | Ask a visual question and get an answer grounded in the image. |
- | `extract` | Extract structured data from an image using simple fields, JSON Schema, or a Pydantic model in Python. |
- | `compare` | Compare two images and describe their similarities and differences. |
-
- ### 1. Describe Image
-
- Generate a natural language description of an image, optionally with tags.
-
- ```python
- from viscribe.images import describe
-
- result = describe(
-     image_path="examples/venice.png",
-     generate_tags=True,
-     model_config={
-         "model": "gpt-5-mini",
-         "api_key": "sk-...",
-         "temperature": 1,
-     },
- )
-
- print(result.data)
- ```
-
- <details>
- <summary>TypeScript</summary>
-
- ```ts
- import { images } from "viscribe";
-
- const result = await images.describe({
-   imagePath: "examples/venice.png",
-   generateTags: true,
-   modelConfig: {
-     model: "gpt-5-mini",
-     apiKey: "sk-...",
-     temperature: 1,
-   },
- });
-
- console.log(result.data);
- ```
-
- </details>
-
149
- ### 2. Classify Image
150
-
151
- Classify an image into one or more categories.
152
-
153
- ```python
154
- from viscribe.images import classify
155
-
156
- result = classify(
157
- image_path="examples/venice.png",
158
- classes=["canal", "city", "landmark", "interior"],
159
- multi_label=True,
160
- model_config={
161
- "model": "gpt-5-mini",
162
- "api_key": "sk-...",
163
- "temperature": 1,
164
- },
165
- )
166
-
167
- print(result.data)
168
- ```
169
-
170
- <details>
171
- <summary>TypeScript</summary>
172
-
173
- ```ts
174
- import { images } from "viscribe";
175
-
176
- const result = await images.classify({
177
- imagePath: "examples/venice.png",
178
- classes: ["canal", "city", "landmark", "interior"],
179
- multiLabel: true,
180
- modelConfig: {
181
- model: "gpt-5-mini",
182
- apiKey: "sk-...",
183
- temperature: 1,
184
- },
185
- });
186
-
187
- console.log(result.data);
188
- ```
189
-
190
- </details>
191
-
192
- ### 3. Visual Question Answering (VQA)
193
-
194
- Ask a question about the content of an image and get an answer.
195
-
196
- ```python
197
- from viscribe.images import ask
198
-
199
- result = ask(
200
- image_path="examples/venice.png",
201
- question="What kind of place is shown in this image?",
202
- model_config={
203
- "model": "gpt-5-mini",
204
- "api_key": "sk-...",
205
- "temperature": 1,
206
- },
207
- )
208
-
209
- print(result.data)
210
- ```
211
-
212
- <details>
213
- <summary>TypeScript</summary>
214
-
215
- ```ts
216
- import { images } from "viscribe";
217
-
218
- const result = await images.ask({
219
- imagePath: "examples/venice.png",
220
- question: "What kind of place is shown in this image?",
221
- modelConfig: {
222
- model: "gpt-5-mini",
223
- apiKey: "sk-...",
224
- temperature: 1,
225
- },
226
- });
227
-
228
- console.log(result.data);
229
- ```
230
-
231
- </details>
232
-
233
- ### 4. Extract Structured Data from Image
234
-
235
- Extract structured data from an image using either a simple or more complex output schema.
236
-
237
- #### Simple Schema
238
-
239
- Use a simple schema for straightforward data extraction.
240
-
241
- ```python
242
- from viscribe.images import extract
243
-
244
- result = extract(
245
- image_path="examples/venice.png",
246
- output_schema=[
247
- {"name": "location", "type": "text", "description": "Likely place shown"},
248
- {"name": "visible_elements", "type": "array_text", "description": "Objects and structures"},
249
- {"name": "colors", "type": "array_text", "description": "Dominant colors"},
250
- ],
251
- model_config={
252
- "model": "gpt-5-mini",
253
- "api_key": "sk-...",
254
- "temperature": 1,
255
- },
256
- )
257
-
258
- print(result.data)
259
- ```
260
-
261
- <details>
262
- <summary>TypeScript</summary>
263
-
264
37
  ```ts
265
38
  import { images } from "viscribe";
266
39
 
267
40
  const result = await images.extract({
268
- imagePath: "examples/venice.png",
41
+ imagePath: "examples/receipt.png",
269
42
  outputSchema: [
270
- { name: "location", type: "text", description: "Likely place shown" },
271
- {
272
- name: "visible_elements",
273
- type: "array_text",
274
- description: "Objects and structures",
275
- },
276
- { name: "colors", type: "array_text", description: "Dominant colors" },
43
+ { name: "merchant_name", type: "text", description: "Store or business name" },
44
+ { name: "total_amount", type: "number", description: "Final total on the receipt" },
45
+ { name: "date", type: "text", description: "Receipt date if visible" },
46
+ { name: "line_items", type: "array_text", description: "Visible purchased items" },
277
47
  ],
48
+ instruction: "Extract the receipt fields visible in the image.",
278
49
  modelConfig: {
279
50
  model: "gpt-5-mini",
280
51
  apiKey: "sk-...",
@@ -285,248 +56,46 @@ const result = await images.extract({
  console.log(result.data);
  ```

- </details>
-
- **Field Types:**
-
- - `text`: Single text value
- - `number`: Single numeric value
- - `array_text`: Array of text values
- - `array_number`: Array of numeric values
-
- #### More Complex Schema
-
- Use a Pydantic model as the `output_schema` when you need complex or nested structures.
-
- ```python
- from pydantic import BaseModel
- from viscribe.images import extract
-
-
- class Scene(BaseModel):
-     location: str
-     visible_elements: list[str]
-     specifications: dict
-
-
- result = extract(
-     image_path="examples/venice.png",
-     output_schema=Scene,
-     model_config={
-         "model": "gpt-5-mini",
-         "api_key": "sk-...",
-         "temperature": 1,
-     },
- )
-
- print(result.data)
- ```
-
- <details>
- <summary>TypeScript</summary>
-
- ```ts
- import { images } from "viscribe";
-
- const result = await images.extract({
-   imagePath: "examples/venice.png",
-   outputSchema: {
-     title: "Scene",
-     type: "object",
-     properties: {
-       location: { type: "string" },
-       visible_elements: {
-         type: "array",
-         items: { type: "string" },
-       },
-       specifications: { type: "object" },
-     },
-     required: ["location", "visible_elements", "specifications"],
-     additionalProperties: false,
-   },
-   modelConfig: {
-     model: "gpt-5-mini",
-     apiKey: "sk-...",
-     temperature: 1,
-   },
- });
-
- console.log(result.data);
- ```
-
- </details>
-
- > **Note:** `output_schema` can be either a simple list of field definitions or a Pydantic model.
-
- ### 5. Compare Images
-
- Compare two images and get a description of their similarities and differences.
-
- ```python
- from viscribe.images import compare
-
- result = compare(
-     image1_path="examples/venice.png",
-     image2_path="examples/venice.png",
-     model_config={
-         "model": "gpt-5-mini",
-         "api_key": "sk-...",
-         "temperature": 1,
-     },
- )
-
- print(result.data)
- ```
-
- <details>
- <summary>TypeScript</summary>
-
- ```ts
- import { images } from "viscribe";
-
- const result = await images.compare({
-   image1Path: "examples/venice.png",
-   image2Path: "examples/venice.png",
-   modelConfig: {
-     model: "gpt-5-mini",
-     apiKey: "sk-...",
-     temperature: 1,
-   },
- });
-
- console.log(result.data);
- ```
-
- </details>
-
402
- ## ⚡ Async Usage
403
-
404
- All Python endpoints support async operations with direct `a*` helpers:
405
-
406
- ```python
407
- import asyncio
408
- from viscribe.images import adescribe
409
-
410
-
411
- async def main() -> None:
412
- result = await adescribe(
413
- image_path="examples/venice.png",
414
- generate_tags=True,
415
- model_config={
416
- "model": "gpt-5-mini",
417
- "api_key": "sk-...",
418
- "temperature": 1,
419
- },
420
- )
421
-
422
- print(result.data)
423
-
424
-
425
- asyncio.run(main())
426
- ```
427
-
428
- You can also reuse an async client:
429
-
430
- ```python
431
- import asyncio
432
- from viscribe import ViscribeAI
59
+ ## 🧱 Schema Options
433
60
 
61
+ `outputSchema` accepts:
434
62
 
435
- async def main() -> None:
436
- client = ViscribeAI(
437
- model_config={
438
- "model": "gpt-5-mini",
439
- "api_key": "sk-...",
440
- "temperature": 1,
441
- }
442
- )
63
+ - simple field definitions
64
+ - JSON Schema objects
443
65
 
444
- result = await client.images.adescribe(
445
- image_path="examples/venice.png",
446
- generate_tags=True,
447
- )
66
+ Simple field types are `text`, `number`, `array_text`, and `array_number`.
448
67
 
449
- print(result.data)
450
-
451
-
452
- asyncio.run(main())
453
- ```
-
- <details>
- <summary>TypeScript</summary>
-
- TypeScript is async-native, so use the same methods with `await`:
+ ## ♻️ Reusable Client

  ```ts
- import { images, ViscribeAI } from "viscribe";
-
- const result = await images.describe({
-   imagePath: "examples/venice.png",
-   generateTags: true,
-   modelConfig: {
-     model: "gpt-5-mini",
-     apiKey: "sk-...",
-     temperature: 1,
-   },
- });
-
- console.log(result.data);
+ import { ViscribeAI } from "viscribe";

  const client = new ViscribeAI({
-   modelConfig: {
-     model: "gpt-5-mini",
-     apiKey: "sk-...",
-     temperature: 1,
-   },
+   modelConfig: { model: "gpt-5-mini", temperature: 1 },
  });

- const clientResult = await client.images.describe({
-   imagePath: "examples/venice.png",
-   generateTags: true,
+ const result = await client.images.extract({
+   imagePath: "examples/receipt.png",
+   outputSchema: [{ name: "total_amount", type: "number" }],
  });
-
- console.log(clientResult.data);
  ```

- </details>
-
- ## 📖 Documentation
-
- For detailed documentation, visit [docs.viscribe.ai](https://docs.viscribe.ai)
-
- ## 🛠️ Development
-
- For information about setting up the development environment and contributing to the project, see our [Contributing Guide](CONTRIBUTING.md).
-
  ## 💬 Support & Feedback

  - 📧 Email: support@viscribe.ai
  - 💻 GitHub Issues: [Create an issue](https://github.com/itsperini/viscribe/issues)
- - 🌟 Feature Requests: [Request a feature](https://github.com/itsperini/viscribe/issues/new)
+ - 💬 Discord: [Join the community](https://discord.gg/GVgJ9ujT)

  ## 🤝 Contributing

- Feel free to contribute and join our Discord server to discuss with us improvements and give us suggestions!
-
- Please see the [contributing guidelines](./CONTRIBUTING.md).
-
- [![My Skills](https://skillicons.dev/icons?i=discord)](https://discord.gg/GVgJ9ujT)
- [![My Skills](https://skillicons.dev/icons?i=linkedin)](https://www.linkedin.com/in/itsperini)
- [![My Skills](https://skillicons.dev/icons?i=twitter)](https://twitter.com/itsperini)
-
- ## 📄 License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+ Please see the [contributing guidelines](../CONTRIBUTING.md).

- ## 🔗 Links
-
- - [Website](https://viscribe.ai)
- - [Documentation](https://docs.viscribe.ai)
- - [GitHub](https://github.com/itsperini/viscribe)
-
- ⭐ If Viscribe helps your project, please leave a
- [star](https://github.com/itsperini/viscribe). ⭐
-
- ---
+ ## 🛠️ Development

- Made with ❤️ by [ViscribeAI](https://viscribe.ai)
+ ```bash
+ npm install
+ npm test
+ npm run typecheck
+ npm run build
+ npm pack --dry-run
+ ```
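
The 1.1.0 README states that `outputSchema` accepts either simple field definitions or JSON Schema objects. As a rough illustration of how the two forms might relate, the sketch below converts a list of simple fields into a JSON Schema object shaped like the `Scene` example in the 1.0.5 README. The type names, the `fieldsToJsonSchema` helper, and the mapping itself (`text` to `string`, `array_text` to an array of strings, every field marked required) are assumptions made for this example. None of it is taken from the viscribe source, and nothing is imported from the package.

```ts
// Local mirror of the simple field shape used in the README examples.
// Hypothetical types for illustration; not exported by viscribe.
type SimpleFieldType = "text" | "number" | "array_text" | "array_number";

interface SimpleField {
  name: string;
  type: SimpleFieldType;
  description?: string;
}

interface JsonSchemaProperty {
  type: "string" | "number" | "array";
  items?: { type: "string" | "number" };
  description?: string;
}

interface JsonSchemaObject {
  type: "object";
  properties: Record<string, JsonSchemaProperty>;
  required: string[];
  additionalProperties: false;
}

// Assumed correspondence between simple field types and JSON Schema fragments.
const BASE_FRAGMENTS: Record<SimpleFieldType, JsonSchemaProperty> = {
  text: { type: "string" },
  number: { type: "number" },
  array_text: { type: "array", items: { type: "string" } },
  array_number: { type: "array", items: { type: "number" } },
};

// Copy the base fragment and attach the description when one is given.
function toProperty(field: SimpleField): JsonSchemaProperty {
  const property: JsonSchemaProperty = { ...BASE_FRAGMENTS[field.type] };
  if (field.description !== undefined) {
    property.description = field.description;
  }
  return property;
}

// Build an object schema in which every listed field is required.
function fieldsToJsonSchema(fields: SimpleField[]): JsonSchemaObject {
  const properties: Record<string, JsonSchemaProperty> = {};
  for (const field of fields) {
    properties[field.name] = toProperty(field);
  }
  return {
    type: "object",
    properties,
    required: fields.map((field) => field.name),
    additionalProperties: false,
  };
}

const receiptSchema = fieldsToJsonSchema([
  { name: "merchant_name", type: "text", description: "Store or business name" },
  { name: "total_amount", type: "number", description: "Final total on the receipt" },
  { name: "line_items", type: "array_text", description: "Visible purchased items" },
]);

console.log(JSON.stringify(receiptSchema, null, 2));
```

An object built this way could presumably be passed as `outputSchema` in place of the field list, though the diff does not show whether the library treats the two forms identically.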