viscribe 0.1.0 → 1.1.0

package/README.md CHANGED
@@ -1,522 +1,101 @@
1
1
  <div align="center">
2
- <picture>
3
- <source media="(prefers-color-scheme: dark)" srcset="./assets/white-v.png">
4
- <source media="(prefers-color-scheme: light)" srcset="./assets/black-v.png">
5
- <img src="./assets/black-v.png" alt="Viscribe" width="160">
6
- </picture>
2
+ <img src="../assets/viscribe-hero.png" alt="ViscribeAI" width="860">
7
3
 
8
- <h1>ViscribeAI</h1>
4
+ <h1>ViscribeAI TypeScript</h1>
9
5
 
10
6
  <p>Extract <strong>structured data</strong> from <strong>images</strong> using <strong>AI models</strong>.</p>
11
7
 
12
8
  <p>
13
- <img alt="Python 3.10+" src="https://img.shields.io/badge/python-3.10%2B-3776AB">
9
+ <a href="https://x.com/itsperini"><img alt="X @itsperini" src="https://img.shields.io/badge/X-@itsperini-000000?logo=x&logoColor=white"></a>
10
+ <a href="https://discord.gg/GVgJ9ujT"><img alt="Discord" src="https://img.shields.io/badge/discord-join-5865F2?logo=discord&logoColor=white"></a>
11
+ <a href="https://docs.viscribe.ai"><img alt="Docs docs.viscribe.ai" src="https://img.shields.io/badge/-docs.viscribe.ai-2563EB?logo=bookstack&logoColor=white"></a>
14
12
  <img alt="Node.js 20+" src="https://img.shields.io/badge/node.js-20%2B-339933">
15
13
  <img alt="License MIT" src="https://img.shields.io/badge/license-MIT-blue">
16
- <a href="https://x.com/itsperini"><img alt="X @itsperini" src="https://img.shields.io/badge/X-@itsperini-000000?logo=x&logoColor=white"></a>
17
- <img alt="Discord coming soon" src="https://img.shields.io/badge/discord-coming_soon-5865F2?logo=discord&logoColor=white">
18
- <a href="https://docs.viscribe.ai"><img alt="Docs docs.viscribe.ai" src="https://img.shields.io/badge/docs-docs.viscribe.ai-2563EB"></a>
19
14
  </p>
20
15
  </div>
21
16
 
22
17
  > Define the output schema, pass the image, pick the AI model, and get parsed
23
- structured output back instead of free-form text.
18
+ > structured output back instead of free-form text.
24
19
 
25
20
  ## 📦 Installation
26
21
 
27
- Python:
28
-
29
- ```bash
30
- pip install viscribe
31
- ```
32
-
33
- TypeScript:
34
-
35
22
  ```bash
36
23
  npm install viscribe
37
24
  ```
38
25
 
39
26
  ## 🚀 Features
40
27
 
41
- - 🖼️ AI-powered image description, extraction, classification, VQA (Visual Question Answering), and comparison
42
- - 🔄 Both sync and async clients
43
- - 📊 Structured output with Pydantic schemas
44
- - 🔍 Detailed logging
45
- - Automatic retries
28
+ - 🖼️ One schema-driven `images.extract` helper for image workflows
29
+ - 📊 JSON Schema objects and simple field definitions
30
+ - 📁 Local image paths, base64 images, and remote image URLs
31
+ - ⚙️ Reusable `new ViscribeAI().images.extract` client namespace
32
+ - 🧩 OpenAI-compatible model configuration
33
+ - 🪄 Typed result wrappers for app and agent code
46
34
 
47
35
  ## 🎯 Quick Start
48
36
 
49
- ```python
50
- from viscribe.images import describe
51
-
52
- result = describe(
53
- image_path="examples/venice.png",
54
- # image_base64="...",
55
- generate_tags=True,
56
- model_config={
57
- "model": "gpt-4o-mini",
58
- "api_key": "sk-...",
59
- "temperature": 0,
60
- },
61
- )
62
-
63
- print(result)
64
-
65
- # ImageResult(
66
- # data={
67
- # "image_description": "A scenic view of Venice...",
68
- # "tags": ["Venice", "canal", "gondolas"],
69
- # },
70
- # raw=<OpenAI response>,
71
- # usage_metadata={"input_tokens": 123, "output_tokens": 45, ...},
72
- # )
73
- ```
74
-
75
- <details>
76
- <summary>TypeScript</summary>
77
-
78
- ```ts
79
- import { images } from "viscribe";
80
-
81
- const result = await images.describe({
82
- imagePath: "examples/venice.png",
83
- generateTags: true,
84
- modelConfig: {
85
- model: "gpt-4o-mini",
86
- apiKey: "sk-...",
87
- temperature: 0,
88
- },
89
- });
90
-
91
- console.log(result);
92
- ```
93
-
94
- </details>
95
-
96
- > **Note:**
97
- > Viscribe works with OpenAI-compatible endpoints (more support coming soon). It is recommended to load your API key from an environment variable instead of hardcoding it in your code.
98
-
99
- ## 📚 Image Endpoints
100
-
101
- ### 1. Describe Image
102
-
103
- Generate a natural language description of an image, optionally with tags.
104
-
105
- ```python
106
- from viscribe.images import describe
107
-
108
- result = describe(
109
- image_path="examples/venice.png",
110
- generate_tags=True,
111
- model_config={
112
- "model": "gpt-4o-mini",
113
- "api_key": "sk-...",
114
- "temperature": 0,
115
- },
116
- )
117
-
118
- print(result.data)
119
- ```
120
-
121
- <details>
122
- <summary>TypeScript</summary>
123
-
124
- ```ts
125
- import { images } from "viscribe";
126
-
127
- const result = await images.describe({
128
- imagePath: "examples/venice.png",
129
- generateTags: true,
130
- modelConfig: {
131
- model: "gpt-4o-mini",
132
- apiKey: "sk-...",
133
- temperature: 0,
134
- },
135
- });
136
-
137
- console.log(result.data);
138
- ```
139
-
140
- </details>
141
-
142
- ### 2. Classify Image
143
-
144
- Classify an image into one or more categories.
145
-
146
- ```python
147
- from viscribe.images import classify
148
-
149
- result = classify(
150
- image_path="examples/venice.png",
151
- classes=["canal", "city", "landmark", "interior"],
152
- multi_label=True,
153
- model_config={
154
- "model": "gpt-4o-mini",
155
- "api_key": "sk-...",
156
- "temperature": 0,
157
- },
158
- )
159
-
160
- print(result.data)
161
- ```
162
-
163
- <details>
164
- <summary>TypeScript</summary>
165
-
166
- ```ts
167
- import { images } from "viscribe";
168
-
169
- const result = await images.classify({
170
- imagePath: "examples/venice.png",
171
- classes: ["canal", "city", "landmark", "interior"],
172
- multiLabel: true,
173
- modelConfig: {
174
- model: "gpt-4o-mini",
175
- apiKey: "sk-...",
176
- temperature: 0,
177
- },
178
- });
179
-
180
- console.log(result.data);
181
- ```
182
-
183
- </details>
184
-
185
- ### 3. Visual Question Answering (VQA)
186
-
187
- Ask a question about the content of an image and get an answer.
188
-
189
- ```python
190
- from viscribe.images import ask
191
-
192
- result = ask(
193
- image_path="examples/venice.png",
194
- question="What kind of place is shown in this image?",
195
- model_config={
196
- "model": "gpt-4o-mini",
197
- "api_key": "sk-...",
198
- "temperature": 0,
199
- },
200
- )
201
-
202
- print(result.data)
203
- ```
204
-
205
- <details>
206
- <summary>TypeScript</summary>
207
-
208
- ```ts
209
- import { images } from "viscribe";
210
-
211
- const result = await images.ask({
212
- imagePath: "examples/venice.png",
213
- question: "What kind of place is shown in this image?",
214
- modelConfig: {
215
- model: "gpt-4o-mini",
216
- apiKey: "sk-...",
217
- temperature: 0,
218
- },
219
- });
220
-
221
- console.log(result.data);
222
- ```
223
-
224
- </details>
225
-
226
- ### 4. Extract Structured Data from Image
227
-
228
- Extract structured data from an image using either a simple or more complex output schema.
229
-
230
- #### Simple Schema
231
-
232
- Use a simple schema for straightforward data extraction.
233
-
234
- ```python
235
- from viscribe.images import extract
236
-
237
- result = extract(
238
- image_path="examples/venice.png",
239
- output_schema=[
240
- {"name": "location", "type": "text", "description": "Likely place shown"},
241
- {"name": "visible_elements", "type": "array_text", "description": "Objects and structures"},
242
- {"name": "colors", "type": "array_text", "description": "Dominant colors"},
243
- ],
244
- model_config={
245
- "model": "gpt-4o-mini",
246
- "api_key": "sk-...",
247
- "temperature": 0,
248
- },
249
- )
250
-
251
- print(result.data)
252
- ```
253
-
254
- <details>
255
- <summary>TypeScript</summary>
256
-
257
37
  ```ts
258
38
  import { images } from "viscribe";
259
39
 
260
40
  const result = await images.extract({
261
- imagePath: "examples/venice.png",
41
+ imagePath: "examples/receipt.png",
262
42
  outputSchema: [
263
- { name: "location", type: "text", description: "Likely place shown" },
264
- {
265
- name: "visible_elements",
266
- type: "array_text",
267
- description: "Objects and structures",
268
- },
269
- { name: "colors", type: "array_text", description: "Dominant colors" },
43
+ { name: "merchant_name", type: "text", description: "Store or business name" },
44
+ { name: "total_amount", type: "number", description: "Final total on the receipt" },
45
+ { name: "date", type: "text", description: "Receipt date if visible" },
46
+ { name: "line_items", type: "array_text", description: "Visible purchased items" },
270
47
  ],
48
+ instruction: "Extract the receipt fields visible in the image.",
271
49
  modelConfig: {
272
- model: "gpt-4o-mini",
273
- apiKey: "sk-...",
274
- temperature: 0,
275
- },
276
- });
277
-
278
- console.log(result.data);
279
- ```
280
-
281
- </details>
282
-
283
- **Field Types:**
284
-
285
- - `text`: Single text value
286
- - `number`: Single numeric value
287
- - `array_text`: Array of text values
288
- - `array_number`: Array of numeric values
289
-
290
- #### More Complex Schema
291
-
292
- Use a Pydantic model as the `output_schema` when you need complex or nested structures.
293
-
294
- ```python
295
- from pydantic import BaseModel
296
- from viscribe.images import extract
297
-
298
-
299
- class Scene(BaseModel):
300
- location: str
301
- visible_elements: list[str]
302
- specifications: dict
303
-
304
-
305
- result = extract(
306
- image_path="examples/venice.png",
307
- output_schema=Scene,
308
- model_config={
309
- "model": "gpt-4o-mini",
310
- "api_key": "sk-...",
311
- "temperature": 0,
312
- },
313
- )
314
-
315
- print(result.data)
316
- ```
317
-
318
- <details>
319
- <summary>TypeScript</summary>
320
-
321
- ```ts
322
- import { images } from "viscribe";
323
-
324
- const result = await images.extract({
325
- imagePath: "examples/venice.png",
326
- outputSchema: {
327
- title: "Scene",
328
- type: "object",
329
- properties: {
330
- location: { type: "string" },
331
- visible_elements: {
332
- type: "array",
333
- items: { type: "string" },
334
- },
335
- specifications: { type: "object" },
336
- },
337
- required: ["location", "visible_elements", "specifications"],
338
- additionalProperties: false,
339
- },
340
- modelConfig: {
341
- model: "gpt-4o-mini",
342
- apiKey: "sk-...",
343
- temperature: 0,
344
- },
345
- });
346
-
347
- console.log(result.data);
348
- ```
349
-
350
- </details>
351
-
352
- > **Note:** `output_schema` can be either a simple list of field definitions or a Pydantic model.
353
-
354
- ### 5. Compare Images
355
-
356
- Compare two images and get a description of their similarities and differences.
357
-
358
- ```python
359
- from viscribe.images import compare
360
-
361
- result = compare(
362
- image1_path="examples/venice.png",
363
- image2_path="examples/venice.png",
364
- model_config={
365
- "model": "gpt-4o-mini",
366
- "api_key": "sk-...",
367
- "temperature": 0,
368
- },
369
- )
370
-
371
- print(result.data)
372
- ```
373
-
374
- <details>
375
- <summary>TypeScript</summary>
376
-
377
- ```ts
378
- import { images } from "viscribe";
379
-
380
- const result = await images.compare({
381
- image1Path: "examples/venice.png",
382
- image2Path: "examples/venice.png",
383
- modelConfig: {
384
- model: "gpt-4o-mini",
50
+ model: "gpt-5-mini",
385
51
  apiKey: "sk-...",
386
- temperature: 0,
52
+ temperature: 1,
387
53
  },
388
54
  });
389
55
 
390
56
  console.log(result.data);
391
57
  ```
392
58
 
393
- </details>
394
-
395
- ## ⚡ Async Usage
396
-
397
- All Python endpoints support async operations with direct `a*` helpers:
398
-
399
- ```python
400
- import asyncio
401
- from viscribe.images import adescribe
402
-
403
-
404
- async def main() -> None:
405
- result = await adescribe(
406
- image_path="examples/venice.png",
407
- generate_tags=True,
408
- model_config={
409
- "model": "gpt-4o-mini",
410
- "api_key": "sk-...",
411
- "temperature": 0,
412
- },
413
- )
414
-
415
- print(result.data)
416
-
417
-
418
- asyncio.run(main())
419
- ```
59
+ ## 🧱 Schema Options
420
60
 
421
- You can also reuse an async client:
61
+ `outputSchema` accepts:
422
62
 
423
- ```python
424
- import asyncio
425
- from viscribe import ViscribeAI
63
+ - simple field definitions
64
+ - JSON Schema objects
426
65
 
66
+ Simple field types are `text`, `number`, `array_text`, and `array_number`.
427
67
 
428
- async def main() -> None:
429
- client = ViscribeAI(
430
- model_config={
431
- "model": "gpt-4o-mini",
432
- "api_key": "sk-...",
433
- "temperature": 0,
434
- }
435
- )
436
-
437
- result = await client.images.adescribe(
438
- image_path="examples/venice.png",
439
- generate_tags=True,
440
- )
441
-
442
- print(result.data)
443
-
444
-
445
- asyncio.run(main())
446
- ```
447
-
448
- <details>
449
- <summary>TypeScript</summary>
450
-
451
- TypeScript is async-native, so use the same methods with `await`:
68
+ ## ♻️ Reusable Client
452
69
 
453
70
  ```ts
454
- import { images, ViscribeAI } from "viscribe";
455
-
456
- const result = await images.describe({
457
- imagePath: "examples/venice.png",
458
- generateTags: true,
459
- modelConfig: {
460
- model: "gpt-4o-mini",
461
- apiKey: "sk-...",
462
- temperature: 0,
463
- },
464
- });
465
-
466
- console.log(result.data);
71
+ import { ViscribeAI } from "viscribe";
467
72
 
468
73
  const client = new ViscribeAI({
469
- modelConfig: {
470
- model: "gpt-4o-mini",
471
- apiKey: "sk-...",
472
- temperature: 0,
473
- },
74
+ modelConfig: { model: "gpt-5-mini", temperature: 1 },
474
75
  });
475
76
 
476
- const clientResult = await client.images.describe({
477
- imagePath: "examples/venice.png",
478
- generateTags: true,
77
+ const result = await client.images.extract({
78
+ imagePath: "examples/receipt.png",
79
+ outputSchema: [{ name: "total_amount", type: "number" }],
479
80
  });
480
-
481
- console.log(clientResult.data);
482
81
  ```
483
82
 
484
- </details>
485
-
486
- ## 📖 Documentation
487
-
488
- For detailed documentation, visit [docs.viscribe.ai](https://docs.viscribe.ai)
489
-
490
- ## 🛠️ Development
491
-
492
- For information about setting up the development environment and contributing to the project, see our [Contributing Guide](CONTRIBUTING.md).
493
-
494
83
  ## 💬 Support & Feedback
495
84
 
496
85
  - 📧 Email: support@viscribe.ai
497
86
  - 💻 GitHub Issues: [Create an issue](https://github.com/itsperini/viscribe/issues)
498
- - 🌟 Feature Requests: [Request a feature](https://github.com/itsperini/viscribe/issues/new)
87
+ - 💬 Discord: [Join the community](https://discord.gg/GVgJ9ujT)
499
88
 
500
89
  ## 🤝 Contributing
501
90
 
502
- Feel free to contribute and join our Discord server to discuss with us improvements and give us suggestions!
503
-
504
- Please see the [contributing guidelines](./CONTRIBUTING.md).
505
-
506
- [![My Skills](https://skillicons.dev/icons?i=discord)](https://discord.gg/uJN7TYcp)
507
- [![My Skills](https://skillicons.dev/icons?i=linkedin)](https://www.linkedin.com/in/itsperini)
508
- [![My Skills](https://skillicons.dev/icons?i=twitter)](https://twitter.com/itsperini)
509
-
510
- ## 📄 License
511
-
512
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
91
+ Please see the [contributing guidelines](../CONTRIBUTING.md).
513
92
 
514
- ## 🔗 Links
515
-
516
- - [Website](https://viscribe.ai)
517
- - [Documentation](https://docs.viscribe.ai)
518
- - [GitHub](https://github.com/itsperini/viscribe)
519
-
520
- ---
93
+ ## 🛠️ Development
521
94
 
522
- Made with ❤️ by [ViscribeAI](https://viscribe.ai)
95
+ ```bash
96
+ npm install
97
+ npm test
98
+ npm run typecheck
99
+ npm run build
100
+ npm pack --dry-run
101
+ ```
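
Note on the new "Schema Options" section: the 1.1.0 README says `outputSchema` accepts either simple field definitions or JSON Schema objects, but it no longer shows the two forms side by side. The sketch below illustrates how the two shapes could correspond. The `SimpleField` and `JsonSchemaObject` types, the `toJsonSchema` helper, and the type mapping (`text` to `string`, `array_text` to an array of `string`, and so on) are assumptions made for illustration, modelled on the JSON Schema example removed in this diff. They are not taken from the package source and may not match what `viscribe` does internally.

```typescript
// Hypothetical types mirroring the shapes shown in the README examples.
// They are NOT the package's exported types.
type SimpleFieldType = "text" | "number" | "array_text" | "array_number";

interface SimpleField {
  name: string;
  type: SimpleFieldType;
  description?: string;
}

type JsonSchemaProperty =
  | { type: "string" | "number"; description?: string }
  | { type: "array"; items: { type: "string" | "number" }; description?: string };

interface JsonSchemaObject {
  type: "object";
  properties: Record<string, JsonSchemaProperty>;
  required: string[];
  additionalProperties: false;
}

// Assumed mapping from each simple field type to a JSON Schema property.
const PROPERTY_BY_TYPE: Record<SimpleFieldType, () => JsonSchemaProperty> = {
  text: () => ({ type: "string" }),
  number: () => ({ type: "number" }),
  array_text: () => ({ type: "array", items: { type: "string" } }),
  array_number: () => ({ type: "array", items: { type: "number" } }),
};

// Build a JSON Schema object from a list of simple field definitions.
// Every field is marked as required and extra properties are disallowed,
// following the JSON Schema example in the 0.1.0 README.
function toJsonSchema(fields: SimpleField[]): JsonSchemaObject {
  const properties: Record<string, JsonSchemaProperty> = {};
  for (const field of fields) {
    const property = PROPERTY_BY_TYPE[field.type]();
    if (field.description !== undefined) {
      property.description = field.description;
    }
    properties[field.name] = property;
  }
  return {
    type: "object",
    properties,
    required: fields.map((field) => field.name),
    additionalProperties: false,
  };
}

// Usage: the receipt fields from the new Quick Start, expressed as JSON Schema.
const receiptSchema = toJsonSchema([
  { name: "merchant_name", type: "text", description: "Store or business name" },
  { name: "total_amount", type: "number", description: "Final total on the receipt" },
  { name: "line_items", type: "array_text", description: "Visible purchased items" },
]);

console.log(JSON.stringify(receiptSchema, null, 2));
```

Under these assumptions, either the simple list or the resulting object could be passed as `outputSchema`. The JSON Schema form is the one to reach for when a field needs nesting that the four simple types cannot express.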
Binary file