@michelabboud/visual-forge-mcp 0.7.0 → 0.9.0

Files changed (41)
  1. package/CHANGELOG.md +272 -0
  2. package/README.md +196 -2
  3. package/config/pricing.json +32 -4
  4. package/dist/providers/base-provider.d.ts +10 -2
  5. package/dist/providers/base-provider.d.ts.map +1 -1
  6. package/dist/providers/base-provider.js +53 -3
  7. package/dist/providers/base-provider.js.map +1 -1
  8. package/dist/providers/index.d.ts +2 -0
  9. package/dist/providers/index.d.ts.map +1 -1
  10. package/dist/providers/index.js +40 -2
  11. package/dist/providers/index.js.map +1 -1
  12. package/dist/providers/zai/zai-provider.d.ts +22 -0
  13. package/dist/providers/zai/zai-provider.d.ts.map +1 -0
  14. package/dist/providers/zai/zai-provider.js +154 -0
  15. package/dist/providers/zai/zai-provider.js.map +1 -0
  16. package/dist/quality/index.d.ts +1 -0
  17. package/dist/quality/index.d.ts.map +1 -1
  18. package/dist/quality/index.js +1 -0
  19. package/dist/quality/index.js.map +1 -1
  20. package/dist/quality/model-tester.d.ts +87 -0
  21. package/dist/quality/model-tester.d.ts.map +1 -0
  22. package/dist/quality/model-tester.js +357 -0
  23. package/dist/quality/model-tester.js.map +1 -0
  24. package/dist/server/mcp-server.d.ts +5 -0
  25. package/dist/server/mcp-server.d.ts.map +1 -1
  26. package/dist/server/mcp-server.js +371 -5
  27. package/dist/server/mcp-server.js.map +1 -1
  28. package/dist/types/generation.d.ts +1 -1
  29. package/dist/types/generation.d.ts.map +1 -1
  30. package/dist/types/provider.d.ts +28 -1
  31. package/dist/types/provider.d.ts.map +1 -1
  32. package/dist/utils/index.d.ts +1 -0
  33. package/dist/utils/index.d.ts.map +1 -1
  34. package/dist/utils/index.js +1 -0
  35. package/dist/utils/index.js.map +1 -1
  36. package/dist/utils/user-config-manager.d.ts +68 -0
  37. package/dist/utils/user-config-manager.d.ts.map +1 -0
  38. package/dist/utils/user-config-manager.js +131 -0
  39. package/dist/utils/user-config-manager.js.map +1 -0
  40. package/docs/guides/comprehensive-guide.md +1552 -0
  41. package/package.json +2 -2
@@ -0,0 +1,1552 @@
1
+ # Visual Forge MCP - Comprehensive Guide
2
+
3
+ **Version:** 0.9.0
4
+ **Last Updated:** 2026-01-16
5
+
6
+ ---
7
+
8
+ ## Table of Contents
9
+
10
+ 1. [Description](#description)
11
+ 2. [Installation](#installation)
12
+ 3. [Architecture & How It Works](#architecture--how-it-works)
13
+ 4. [Environment Variables](#environment-variables)
14
+ 5. [Provider & Model System](#provider--model-system)
15
+ 6. [Usage Workflows](#usage-workflows)
16
+ 7. [MCP Tools Reference](#mcp-tools-reference)
17
+ 8. [Testing](#testing)
18
+ 9. [Troubleshooting](#troubleshooting)
19
+ 10. [Advanced Topics](#advanced-topics)
20
+
21
+ ---
22
+
23
+ ## Description
24
+
25
+ **Visual Forge MCP** is a Model Context Protocol (MCP) server that automates AI-powered image generation for technical documentation. It generates, optimizes, and manages images across multiple AI providers.
26
+
27
+ ### Key Features
28
+
29
+ - **Multi-Provider Support**: 8 AI providers with automatic fallback
30
+ - OpenAI (GPT Image)
31
+ - Google Gemini (Nano Banana)
32
+ - Stability AI (SDXL)
33
+ - Replicate (FLUX models)
34
+ - Leonardo AI (Phoenix)
35
+ - HuggingFace (SDXL, FLUX)
36
+ - xAI (Grok 2 Image)
37
+ - Z.ai (GLM-Image) - **NEW** ✨
38
+
39
+ - **Multi-Model Architecture**: Each provider can offer multiple models with different capabilities and pricing
40
+
41
+ - **Model Testing & Comparison** - **NEW v0.9.0** ✨:
42
+ - Standard automated quality tests
43
+ - Custom prompt testing with real use cases
44
+ - Side-by-side multi-provider comparison
45
+ - Quality scoring (sharpness, brightness, text rendering, color accuracy)
46
+ - Intelligent recommendations based on test results
47
+ - Cost-aware permission flow for paid models
48
+
49
+ - **Professional Image Pipeline**:
50
+ - Automatic image optimization (WebP, JPEG, PNG)
51
+ - Quality inspection (sharpness, brightness, dimensions)
52
+ - Watermarking support
53
+ - Auto-regeneration on quality failures
54
+
55
+ - **State Management**:
56
+ - Persistent state across sessions
57
+ - Resumable workflows
58
+ - Cost tracking by provider/file/type
59
+ - Backup/restore system
60
+
61
+ - **Workflow Modes**:
62
+ - **Interactive**: One-by-one generation with approval
63
+ - **Batch**: Generate N images, then approve batch
64
+ - **Bulk**: Parallel generation with concurrency control
65
+
66
+ - **Quality Features**:
67
+ - Image quality validation (sharpness, brightness, OCR)
68
+ - Automatic regeneration on failures (configurable)
69
+ - Multi-format optimization
70
+ - Comprehensive metadata tracking
71
+
72
+ ### Use Cases
73
+
74
+ - **Technical Documentation**: Generate diagrams, flowcharts, and architecture illustrations
75
+ - **Educational Content**: Create instructional images and infographics
76
+ - **API Documentation**: Visualize API endpoints and data flows
77
+ - **System Architecture**: Illustrate cloud infrastructure and system designs
78
+ - **Process Documentation**: Create visual workflows and decision trees
79
+
80
+ ---
81
+
82
+ ## Installation
83
+
84
+ ### Prerequisites
85
+
86
+ - **Node.js**: v18 or later
87
+ - **npm**: v9 or later
88
+ - **Operating System**: Linux, macOS, or Windows with WSL2
89
+
90
+ ### Step 1: Clone the Repository
91
+
92
+ ```bash
93
+ git clone https://github.com/michelabboud/visual-forge-mcp.git
94
+ cd visual-forge-mcp
95
+ ```
96
+
97
+ ### Step 2: Install Dependencies
98
+
99
+ ```bash
100
+ npm install
101
+ ```
102
+
103
+ ### Step 3: Build the Project
104
+
105
+ ```bash
106
+ npm run build
107
+ ```
108
+
109
+ This compiles TypeScript to JavaScript in the `dist/` directory.
110
+
111
+ ### Step 4: Configure Environment Variables
112
+
113
+ Create a `.env` file in the project root:
114
+
115
+ ```bash
116
+ # Copy example environment file
117
+ cp .env.example .env
118
+
119
+ # Edit with your API keys
120
+ nano .env
121
+ ```
122
+
123
+ **Minimum configuration** (at least one provider required):
124
+
125
+ ```env
126
+ # Free option (recommended for testing)
127
+ GOOGLE_API_KEY=AIza...
128
+
129
+ # Or paid options
130
+ OPENAI_API_KEY=sk-...
131
+ ZAI_API_KEY=zai-...
132
+ ```
133
+
134
+ See the [Environment Variables](#environment-variables) section for the complete list.
135
+
136
+ ### Step 5: Verify Installation
137
+
138
+ ```bash
139
+ # Run tests
140
+ npm test
141
+
142
+ # Check provider availability
143
+ npx tsx scripts/check-providers.ts
144
+ ```
145
+
146
+ ### Step 6: Configure MCP Client
147
+
148
+ Add the server to your MCP client configuration (e.g., `claude_desktop_config.json`):
149
+
150
+ ```json
151
+ {
152
+ "mcpServers": {
153
+ "visual-forge": {
154
+ "command": "node",
155
+ "args": ["/path/to/visual-forge-mcp/dist/index.js"],
156
+ "env": {
157
+ "GOOGLE_API_KEY": "AIza...",
158
+ "ZAI_API_KEY": "zai-...",
159
+ "IMAGE_GEN_OUTPUT_DIR": "./generated-images",
160
+ "IMAGE_GEN_LOG_LEVEL": "info"
161
+ }
162
+ }
163
+ }
164
+ }
165
+ ```
166
+
167
+ ---
168
+
169
+ ## Architecture & How It Works
170
+
171
+ ### System Overview
172
+
173
+ ```
+ ┌─────────────────────────────────────────────────────────┐
+ │                       MCP Client                        │
+ │            (Claude Desktop, Continue, etc.)             │
+ └───────────────────────┬─────────────────────────────────┘
+                         │ MCP Protocol (stdio)
+ ┌───────────────────────▼─────────────────────────────────┐
+ │                 Visual Forge MCP Server                 │
+ │  ┌───────────────────────────────────────────────────┐  │
+ │  │  13+ MCP Tools (parse, generate, configure)       │  │
+ │  └────────────────────┬──────────────────────────────┘  │
+ │                       │                                 │
+ │  ┌────────────────────▼──────────────────────────────┐  │
+ │  │                 Provider Factory                  │  │
+ │  │  ┌──────────┐  ┌──────────┐  ┌──────────┐         │  │
+ │  │  │  OpenAI  │  │  Gemini  │  │   Z.ai   │  ...    │  │
+ │  │  └──────────┘  └──────────┘  └──────────┘         │  │
+ │  └───────────────────────────────────────────────────┘  │
+ │                                                         │
+ │  ┌───────────────────────────────────────────────────┐  │
+ │  │                   State Manager                   │  │
+ │  │  • Session tracking                               │  │
+ │  │  • Cost tracking                                  │  │
+ │  │  • Workflow orchestration                         │  │
+ │  │  • Backup/restore                                 │  │
+ │  └───────────────────────────────────────────────────┘  │
+ └─────────────────────────────────────────────────────────┘
201
+ ```
202
+
203
+ ### Core Components
204
+
205
+ #### 1. MCP Server (`src/server/mcp-server.ts`)
206
+
207
+ Entry point for MCP protocol communication:
208
+ - Exposes 13+ MCP tools via stdio transport
209
+ - Handles tool calls and responses
210
+ - Manages configuration via ConfigManager
211
+ - Coordinates all system components
212
+
213
+ #### 2. Provider System (`src/providers/`)
214
+
215
+ **Provider vs Model Architecture** (v0.8.0):
216
+ - **Provider**: Service/company (OpenAI, Google, Z.ai)
217
+ - **Model**: Specific AI implementation (GPT Image 1, Gemini Flash, GLM-Image)
218
+ - One provider can offer multiple models with different pricing/capabilities
219
+
220
+ **Components**:
221
+ - **ProviderFactory** (`index.ts`): Singleton managing all providers
222
+ - **BaseProvider** (`base-provider.ts`): Abstract base with shared functionality
223
+ - **Provider Implementations**: 8 providers extending BaseProvider
224
+
225
+ **Features**:
226
+ - Automatic provider initialization from env vars or ConfigManager
227
+ - Multi-model support per provider
228
+ - Automatic fallback on provider failure
229
+ - Rate limiting per provider
230
+ - Quality inspection post-generation
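The automatic fallback listed among the features can be pictured roughly as follows. This is a minimal sketch, not the package's implementation: `SketchProvider`, `generateWithFallback`, and the two demo providers are hypothetical names, and the real `BaseProvider` interface may look different.

```typescript
// Hypothetical provider shape for this sketch; the real BaseProvider API may differ.
interface SketchProvider {
  name: string;
  generate(prompt: string): Promise<string>; // resolves to an output file path
}

// Try each provider in order and return the first successful result.
async function generateWithFallback(
  providers: SketchProvider[],
  prompt: string,
): Promise<{ provider: string; filepath: string }> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      const filepath = await provider.generate(prompt);
      return { provider: provider.name, filepath };
    } catch (error) {
      // Remember why this provider failed, then move on to the next one.
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${provider.name}: ${reason}`);
    }
  }
  throw new Error(`All providers failed (${failures.join("; ")})`);
}

// Usage: the first provider fails, so the second one serves the request.
const flaky: SketchProvider = {
  name: "flaky",
  generate: async () => {
    throw new Error("rate limited");
  },
};
const steady: SketchProvider = {
  name: "steady",
  generate: async (prompt) => `generated-images/${prompt.length}.png`,
};
const fallbackResult = generateWithFallback([flaky, steady], "VPC diagram");
```

Collecting the failure reasons keeps the final error useful when every provider is down.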
231
+
232
+ #### 3. State Management (`src/state/state-manager.ts`)
233
+
234
+ Persistent state in `~/.visual-forge-mcp/state.json`:
235
+ - **Atomic writes**: Temp file + rename for crash safety
236
+ - **Tracks**: sessions, images, jobs, costs
237
+ - **Enables**: resumable workflows across restarts
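The "temp file + rename" pattern mentioned above can be sketched with Node's built-in `fs` module. The helper name `writeJsonAtomically` and the temp-file naming scheme are assumptions made for illustration; the actual state manager may do this differently.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write JSON to a temp file in the same directory,
// then rename it over the target so readers never see a half-written file.
function writeJsonAtomically(targetPath: string, data: unknown): void {
  const dir = path.dirname(targetPath);
  fs.mkdirSync(dir, { recursive: true });
  const tempPath = path.join(dir, `.${path.basename(targetPath)}.${process.pid}.tmp`);
  fs.writeFileSync(tempPath, JSON.stringify(data, null, 2), "utf8");
  // rename replaces the target in one step when both paths share a filesystem
  fs.renameSync(tempPath, targetPath);
}

// Usage: persist a small state object into a throwaway directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "vf-state-"));
const statePath = path.join(demoDir, "state.json");
writeJsonAtomically(statePath, { sessions: [], totalCost: 0 });
```

Keeping the temp file in the same directory as the target matters, because a rename across filesystems is not a single atomic step.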
238
+
239
+ #### 4. Workflow Orchestrator (`src/workflow/workflow-orchestrator.ts`)
240
+
241
+ Three workflow modes:
242
+ - **Interactive**: Sequential one-at-a-time (simulated approval in MCP)
243
+ - **Batch**: Generate N images → approve batch
244
+ - **Bulk**: Parallel with concurrency limit (default: 3)
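Bulk mode's concurrency limit can be illustrated with a small worker-pool helper. This is a sketch under assumptions: `runWithConcurrency` is a hypothetical name, and the orchestrator may schedule work in another way.

```typescript
// Hypothetical helper: run async tasks with at most `limit` in flight.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit = 3, // mirrors the documented bulk-mode default
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0;

  // Each worker repeatedly claims the next unclaimed task index.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.min(limit, tasks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input tasks
}

// Usage: eight simulated generations, three at a time.
let inFlight = 0;
let peakInFlight = 0;
const demoTasks = Array.from({ length: 8 }, (_, i) => async () => {
  inFlight++;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return i;
});
const bulkRun = runWithConcurrency(demoTasks, 3);
```

Because JavaScript runs these callbacks on one thread, claiming an index with `next++` needs no lock.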
245
+
246
+ #### 5. Quality Inspector (`src/quality/quality-inspector.ts`)
247
+
248
+ Post-generation validation:
249
+ - **Sharpness**: Laplacian variance analysis
250
+ - **Brightness**: 0-255 scale validation
251
+ - **Dimensions**: Size verification
252
+ - **File size**: must fall between 10 KB and 10 MB
253
+ - **OCR** (optional): Text detection
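For readers unfamiliar with Laplacian variance, the sketch below computes it over a plain grayscale array. It assumes a 4-neighbour Laplacian kernel and a row-major pixel layout; the inspector's actual kernel, image library, and thresholds are not stated in this guide.

```typescript
// Variance of the 4-neighbour Laplacian over a row-major grayscale image.
// Higher values mean more edges, which is used as a proxy for sharpness.
function laplacianVariance(gray: number[], width: number, height: number): number {
  const responses: number[] = [];
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      responses.push(
        gray[i - width] + gray[i + width] + gray[i - 1] + gray[i + 1] - 4 * gray[i],
      );
    }
  }
  if (responses.length === 0) return 0; // image too small to have interior pixels
  const mean = responses.reduce((sum, r) => sum + r, 0) / responses.length;
  return responses.reduce((sum, r) => sum + (r - mean) ** 2, 0) / responses.length;
}

// Usage: a flat image has no edges, a checkerboard has many.
const flatImage = new Array<number>(16).fill(128);
const checkerboard = Array.from({ length: 16 }, (_, i) =>
  (Math.floor(i / 4) + (i % 4)) % 2 === 0 ? 255 : 0,
);
const flatScore = laplacianVariance(flatImage, 4, 4);
const checkerScore = laplacianVariance(checkerboard, 4, 4);
```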
254
+
255
+ #### 6. Markdown Parser (`src/parser/markdown-parser.ts`)
256
+
257
+ Extracts image specifications from markdown:
258
+ ```markdown
259
+ \`\`\`image
260
+ type: diagram
261
+ aspectRatio: 16:9
262
+ prompt: AWS architecture with VPC, EC2, and RDS
263
+ \`\`\`
264
+ ```
265
+
266
+ Merges global context with per-image prompts for consistency.
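A rough idea of how such blocks could be extracted is sketched below. The function `extractImageSpecs` is hypothetical and handles only `key: value` lines plus unkeyed continuation lines, as in the examples in this guide; the real parser may support more.

```typescript
// Built at runtime so this example does not contain a literal code fence.
const FENCE = "`".repeat(3);

// Hypothetical extractor: returns one key/value record per image block.
function extractImageSpecs(markdown: string): Array<Record<string, string>> {
  const specs: Array<Record<string, string>> = [];
  const blockPattern = new RegExp(`${FENCE}image\\r?\\n([\\s\\S]*?)${FENCE}`, "g");
  for (const match of markdown.matchAll(blockPattern)) {
    const spec: Record<string, string> = {};
    let lastKey: string | null = null;
    for (const rawLine of match[1].split(/\r?\n/)) {
      const line = rawLine.trim();
      if (line === "") continue;
      const field = /^([A-Za-z][A-Za-z0-9]*):\s*(.*)$/.exec(line);
      if (field) {
        lastKey = field[1];
        spec[lastKey] = field[2];
      } else if (lastKey !== null) {
        // Continuation line: append to the previous field (usually the prompt).
        spec[lastKey] += ` ${line}`;
      }
    }
    specs.push(spec);
  }
  return specs;
}

// Usage: one block with a prompt that wraps onto a second line.
const sampleMarkdown = [
  "# Doc",
  `${FENCE}image`,
  "type: diagram",
  "aspectRatio: 16:9",
  "prompt: AWS architecture with VPC,",
  "  EC2, and RDS",
  FENCE,
].join("\n");
const parsedSpecs = extractImageSpecs(sampleMarkdown);
```

One limitation of this sketch: a continuation line that itself starts with `word:` would be read as a new field.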
267
+
268
+ ---
269
+
270
+ ## Environment Variables
271
+
272
+ ### Provider API Keys (Required - at least one)
273
+
274
+ ```env
275
+ # Google Gemini (FREE, recommended for testing)
276
+ GOOGLE_API_KEY=AIza...
277
+
278
+ # Z.ai GLM-Image (NEW - excellent for text-heavy diagrams)
279
+ ZAI_API_KEY=zai-...
280
+
281
+ # OpenAI GPT Image
282
+ OPENAI_API_KEY=sk-...
283
+
284
+ # Stability AI
285
+ STABILITY_API_KEY=sk-...
286
+
287
+ # Replicate (FLUX models - cheapest paid option)
288
+ REPLICATE_API_TOKEN=r8_...
289
+
290
+ # Leonardo AI
291
+ LEONARDO_API_KEY=...
292
+
293
+ # HuggingFace
294
+ HUGGINGFACE_API_KEY=hf_...
295
+
296
+ # xAI Grok
297
+ XAI_API_KEY=xai-...
298
+ ```
299
+
300
+ ### Configuration (Optional)
301
+
302
+ ```env
303
+ # Output directory for generated images
304
+ IMAGE_GEN_OUTPUT_DIR=./generated-images
305
+
306
+ # State persistence directory
307
+ IMAGE_GEN_STATE_DIR=~/.visual-forge-mcp
308
+
309
+ # Logging level: debug | info | warn | error
310
+ IMAGE_GEN_LOG_LEVEL=info
311
+
312
+ # Default provider selection
313
+ IMAGE_GEN_DEFAULT_PROVIDER=gemini
314
+
315
+ # Quality validation (default: true)
316
+ VF_QUALITY_VALIDATION=true
317
+
318
+ # Auto-regeneration on quality failures (default: true)
319
+ VF_AUTO_REGENERATE=true
320
+
321
+ # Maximum regeneration attempts (default: 3)
322
+ VF_MAX_RETRIES=3
323
+
324
+ # Image format generation
325
+ VF_GENERATE_PNG=false # Generate optimized PNG (default: false)
326
+ VF_PNG_QUALITY=85 # PNG quality 0-100 (default: 85)
327
+ VF_PNG_PALETTE=true # Use palette for PNG (default: true)
328
+ VF_PNG_COLORS=256 # Color palette size (default: 256)
329
+ VF_PNG_DITHER=1.0 # Dithering level 0-1 (default: 1.0)
330
+ ```
331
+
332
+ ### Provider-Specific Environment Variables
333
+
334
+ ```env
335
+ # OpenAI specific
336
+ OPENAI_ORG_ID=org-... # Optional: Organization ID
337
+ OPENAI_PROJECT_ID=proj_... # Optional: Project ID
338
+
339
+ # Stability AI specific
340
+ STABILITY_ORGANIZATION=org-... # Optional: Organization ID
341
+
342
+ # HuggingFace specific
343
+ HUGGINGFACE_MODEL=stabilityai/stable-diffusion-xl-base-1.0 # Default model
344
+ ```
345
+
346
+ ---
347
+
348
+ ## Provider & Model System
349
+
350
+ ### Provider Overview
351
+
352
+ | Provider | Models | Cost per Image (USD) | Best For | Free Tier |
+ |----------|--------|----------------------|----------|-----------|
+ | **Replicate** | FLUX Schnell, Dev, Pro | $0.003-$0.055 | General images, fast | ❌ |
+ | **Z.ai** | GLM-Image | $0.015 | Text-heavy diagrams | ❌ |
+ | **Gemini** | 2.5 Flash, 2.5 Flash Pro | $0 | Testing, prototypes | ✅ |
+ | **HuggingFace** | SDXL, FLUX.1 Dev | $0 | Testing, community models | ✅ |
+ | **Leonardo** | Phoenix 1.0 | $0.02 | Artistic images | ❌ |
+ | **OpenAI** | GPT Image 1, GPT Image 1 HD | $0.04-$0.12 | Professional images | ❌ |
+ | **Stability AI** | SDXL 1.0 | $0.04 | Stable Diffusion | ❌ |
+ | **xAI** | Grok 2 Image | $0.07 | Grok integration | ❌ |
362
+
363
+ ### Model Selection
364
+
365
+ Providers with multiple models:
366
+ - **OpenAI**: `gpt-image-1` (standard), `gpt-image-1-hd` (high-def)
367
+ - **Gemini**: `gemini-2.5-flash-image` (2K), `gemini-2.5-flash-image-pro` (4K)
368
+ - **Replicate**: `flux-schnell` (fast), `flux-dev` (quality), `flux-pro` (professional)
369
+ - **HuggingFace**: `stabilityai/stable-diffusion-xl-base-1.0`, `black-forest-labs/FLUX.1-dev`
370
+
371
+ **Select model during generation**:
372
+ ```typescript
373
+ // MCP tool call
374
+ {
375
+ "tool": "generate_image",
376
+ "parameters": {
377
+ "imageId": "doc-img-01",
378
+ "provider": "openai",
379
+ "options": {
380
+ "model": "gpt-image-1-hd", // Specify model
381
+ "quality": "hd"
382
+ }
383
+ }
384
+ }
385
+ ```
386
+
387
+ ### Provider Priority Order
388
+
389
+ When no provider is specified, Visual Forge selects based on:
390
+
391
+ 1. **Environment variable**: `IMAGE_GEN_DEFAULT_PROVIDER`
392
+ 2. **Cost priority** (if no env var):
393
+ - `replicate` (cheapest paid option: $0.003)
394
+ - `zai` (2nd cheapest paid option: $0.015)
395
+ - `gemini` (free)
396
+ - `huggingface` (free)
397
+ - `openai`, `stability`, `leonardo`, `xai`
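The selection order above can be expressed as a short function. This is a sketch only: `selectProvider` is a hypothetical name, and falling through to the priority list when the configured default has no API key is an assumption, not documented behaviour.

```typescript
// Priority order as listed in this guide.
const PROVIDER_PRIORITY = [
  "replicate",
  "zai",
  "gemini",
  "huggingface",
  "openai",
  "stability",
  "leonardo",
  "xai",
];

// Hypothetical selector: env default first, then the first configured provider.
function selectProvider(
  configured: ReadonlySet<string>,
  envDefault?: string,
): string | undefined {
  // Assumption: an env default without an API key is ignored rather than fatal.
  if (envDefault && configured.has(envDefault)) return envDefault;
  return PROVIDER_PRIORITY.find((name) => configured.has(name));
}

// Usage with two configured providers.
const configuredProviders = new Set(["gemini", "openai"]);
const pickedByEnv = selectProvider(configuredProviders, "openai");
const pickedByPriority = selectProvider(configuredProviders);
```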
398
+
399
+ ### Provider Capabilities
400
+
401
+ Each provider reports capabilities via `getCapabilities()`:
402
+
403
+ ```typescript
404
+ interface ProviderCapabilities {
405
+ maxDimensions: { width: number; height: number };
406
+ supportedAspectRatios: string[];
407
+ supportedFormats: Array<'png' | 'jpg' | 'webp'>;
408
+ supportsHD: boolean;
409
+ supportsMultilingual: boolean;
410
+ supportsImageEditing: boolean;
411
+ supportsStyleTransfer: boolean;
412
+ averageGenerationTime: number; // seconds
413
+ }
414
+ ```
415
+
416
+ **Example** (Z.ai GLM-Image):
417
+ ```json
418
+ {
419
+ "maxDimensions": { "width": 2048, "height": 2048 },
420
+ "supportedAspectRatios": ["1:1", "16:9", "4:3", "9:16", "3:2", "21:9"],
421
+ "supportedFormats": ["png"],
422
+ "supportsHD": true,
423
+ "supportsMultilingual": true,
424
+ "supportsImageEditing": true,
425
+ "supportsStyleTransfer": true,
426
+ "averageGenerationTime": 15
427
+ }
428
+ ```
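A capabilities object like this lends itself to a pre-flight check before a request is sent. The sketch below uses a subset of the fields; `SketchRequest` and `findUnsupported` are hypothetical names and not part of the package API.

```typescript
// Subset of the ProviderCapabilities fields shown above.
interface CapabilitySubset {
  supportedAspectRatios: string[];
  supportedFormats: Array<"png" | "jpg" | "webp">;
  supportsHD: boolean;
}

// Hypothetical request shape for this sketch.
interface SketchRequest {
  aspectRatio: string;
  format: "png" | "jpg" | "webp";
  quality: "standard" | "hd";
}

// Return a list of request features the provider cannot satisfy.
function findUnsupported(caps: CapabilitySubset, request: SketchRequest): string[] {
  const problems: string[] = [];
  if (!caps.supportedAspectRatios.includes(request.aspectRatio)) {
    problems.push(`aspect ratio ${request.aspectRatio}`);
  }
  if (!caps.supportedFormats.includes(request.format)) {
    problems.push(`format ${request.format}`);
  }
  if (request.quality === "hd" && !caps.supportsHD) {
    problems.push("HD quality");
  }
  return problems;
}

// Usage with the Z.ai values from the example above.
const zaiCaps: CapabilitySubset = {
  supportedAspectRatios: ["1:1", "16:9", "4:3", "9:16", "3:2", "21:9"],
  supportedFormats: ["png"],
  supportsHD: true,
};
const capabilityProblems = findUnsupported(zaiCaps, {
  aspectRatio: "16:9",
  format: "webp",
  quality: "hd",
});
```

Here only the format is flagged, since the example capabilities list `png` as the sole supported format.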
429
+
430
+ ### Provider Configuration via MCP
431
+
432
+ Runtime configuration without restart:
433
+
434
+ ```bash
435
+ # Set API key for Z.ai
436
+ mcp-tool configure_provider \
437
+ --provider zai \
438
+ --apiKey "zai-..."
439
+
440
+ # Check which providers are configured
441
+ mcp-tool get_provider_status
442
+
443
+ # Test API key validity
444
+ mcp-tool test_provider_connection --provider zai
445
+
446
+ # Remove provider
447
+ mcp-tool remove_provider --provider zai
448
+ ```
449
+
450
+ Configuration stored in `~/.visual-forge-mcp/config.json`.
451
+
452
+ ---
453
+
454
+ ## Usage Workflows
455
+
456
+ ### Basic Workflow: Generate Images from Markdown
457
+
458
+ **Step 1: Create markdown with image specifications**
459
+
460
+ ```markdown
461
+ # My Technical Documentation
462
+
463
+ ## Architecture Overview
464
+
465
+ \`\`\`image
466
+ type: architecture
467
+ aspectRatio: 16:9
468
+ prompt: AWS cloud architecture showing VPC with public/private subnets,
469
+ EC2 instances, RDS database, and load balancer. Professional diagram style.
470
+ \`\`\`
471
+
472
+ ## Data Flow
473
+
474
+ \`\`\`image
475
+ type: flowchart
476
+ aspectRatio: 4:3
477
+ prompt: Data processing pipeline flowchart showing ingestion,
478
+ transformation, storage, and analytics stages with arrows.
479
+ \`\`\`
480
+ ```
481
+
482
+ **Step 2: Parse markdown to extract specifications**
483
+
484
+ ```typescript
485
+ // MCP tool call
486
+ {
487
+ "tool": "parse_markdown",
488
+ "parameters": {
489
+ "filePaths": ["docs/architecture.md"]
490
+ }
491
+ }
492
+ ```
493
+
494
+ **Step 3: Start generation workflow**
495
+
496
+ ```typescript
497
+ // Bulk mode (parallel, fire-and-forget)
498
+ {
499
+ "tool": "start_workflow",
500
+ "parameters": {
501
+ "mode": "bulk",
502
+ "provider": "gemini",
503
+ "concurrency": 3
504
+ }
505
+ }
506
+ ```
507
+
508
+ **Step 4: Monitor progress**
509
+
510
+ ```typescript
511
+ {
512
+ "tool": "get_status"
513
+ }
514
+ ```
515
+
516
+ **Response**:
517
+ ```json
518
+ {
519
+ "mode": "bulk",
520
+ "status": "running",
521
+ "currentImage": "docs-img-02",
522
+ "totalImages": 5,
523
+ "completed": 2,
524
+ "failed": 0,
525
+ "provider": "gemini"
526
+ }
527
+ ```
528
+
529
+ **Step 5: Get cost summary**
530
+
531
+ ```typescript
532
+ {
533
+ "tool": "get_cost_summary"
534
+ }
535
+ ```
536
+
537
+ **Response**:
538
+ ```json
539
+ {
540
+ "totalCost": 0.075,
541
+ "byProvider": {
542
+ "gemini": 0.0,
543
+ "zai": 0.045,
544
+ "openai": 0.03
545
+ },
546
+ "byFile": {
547
+ "docs/architecture.md": 0.075
548
+ },
549
+ "byType": {
550
+ "architecture": 0.03,
551
+ "flowchart": 0.045
552
+ },
553
+ "imageCount": 5
554
+ }
555
+ ```
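The breakdown in this response is a grouped sum over per-image cost records. A minimal sketch of that aggregation follows; the `CostRecord` shape and the sample records are illustrative assumptions chosen to reproduce the totals above, not the package's stored data.

```typescript
// Hypothetical per-image record; the real state file may store more fields.
interface CostRecord {
  provider: string;
  file: string;
  type: string;
  cost: number; // USD
}

// Round to micro-dollars to hide floating-point noise in the sums.
const roundUsd = (value: number): number => Math.round(value * 1e6) / 1e6;

function groupSum(records: CostRecord[], key: "provider" | "file" | "type"): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const record of records) {
    totals[record[key]] = (totals[record[key]] ?? 0) + record.cost;
  }
  for (const name of Object.keys(totals)) totals[name] = roundUsd(totals[name]);
  return totals;
}

function summarizeCosts(records: CostRecord[]) {
  return {
    totalCost: roundUsd(records.reduce((sum, r) => sum + r.cost, 0)),
    byProvider: groupSum(records, "provider"),
    byFile: groupSum(records, "file"),
    byType: groupSum(records, "type"),
    imageCount: records.length,
  };
}

// Usage: five illustrative records from one file.
const file = "docs/architecture.md";
const costSummary = summarizeCosts([
  { provider: "gemini", file, type: "architecture", cost: 0 },
  { provider: "openai", file, type: "architecture", cost: 0.03 },
  { provider: "zai", file, type: "flowchart", cost: 0.015 },
  { provider: "zai", file, type: "flowchart", cost: 0.015 },
  { provider: "zai", file, type: "flowchart", cost: 0.015 },
]);
```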
556
+
557
+ ### Advanced Workflow: Global Context
558
+
559
+ For consistent styling across multiple images:
560
+
561
+ **Step 1: Parse with global context**
562
+
563
+ ```typescript
564
+ {
565
+ "tool": "parse_markdown",
566
+ "parameters": {
567
+ "filePaths": ["docs/*.md"],
568
+ "globalContext": {
569
+ "prePrompt": "Professional technical documentation style",
570
+ "documentVibe": "Modern, clean, and professional",
571
+ "style": {
572
+ "visualStyle": "Flat design with subtle isometric perspective",
573
+ "mood": "Professional and informative",
574
+ "colorPalette": ["#1a365d", "#0891b2", "#7c3aed"]
575
+ },
576
+ "postPrompt": "High quality, clear labels, no watermarks"
577
+ }
578
+ }
579
+ }
580
+ ```
581
+
582
+ This context is merged into every image prompt automatically.
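How the context fields combine with a per-image prompt is not spelled out here. One plausible reading, going by the field names, is that `prePrompt` and the style fields come first and `postPrompt` last. The sketch below encodes that assumption; `buildPrompt`, the label prefixes, and the `". "` separator are all hypothetical.

```typescript
// Field names follow the globalContext example above.
interface GlobalContext {
  prePrompt?: string;
  documentVibe?: string;
  style?: { visualStyle?: string; mood?: string; colorPalette?: string[] };
  postPrompt?: string;
}

// Hypothetical merge: context first, image prompt next, postPrompt last.
function buildPrompt(context: GlobalContext, imagePrompt: string): string {
  const parts: string[] = [];
  if (context.prePrompt) parts.push(context.prePrompt);
  if (context.documentVibe) parts.push(`Vibe: ${context.documentVibe}`);
  if (context.style?.visualStyle) parts.push(`Style: ${context.style.visualStyle}`);
  if (context.style?.mood) parts.push(`Mood: ${context.style.mood}`);
  if (context.style?.colorPalette?.length) {
    parts.push(`Palette: ${context.style.colorPalette.join(", ")}`);
  }
  parts.push(imagePrompt.trim());
  if (context.postPrompt) parts.push(context.postPrompt);
  return parts.join(". ");
}

// Usage with a minimal context.
const mergedPrompt = buildPrompt(
  { prePrompt: "Professional technical documentation style", postPrompt: "No watermarks" },
  " Data pipeline flowchart ",
);
```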
583
+
584
+ ### Workflow Mode Comparison
585
+
586
+ | Mode | Use Case | Concurrency | Approval | Speed |
+ |------|----------|-------------|----------|-------|
+ | **Interactive** | Few images, manual control | 1 | Per-image | Slowest |
+ | **Batch** | Medium batch, review before approval | N images | Per-batch | Medium |
+ | **Bulk** | Large batch, fire-and-forget | 3 (configurable) | None | Fastest |
591
+
592
+ **Interactive Mode**:
593
+ ```typescript
594
+ {
595
+ "tool": "start_workflow",
596
+ "parameters": {
597
+ "mode": "interactive",
598
+ "imageIds": ["img-01", "img-02", "img-03"]
599
+ }
600
+ }
601
+ ```
602
+
603
+ **Batch Mode**:
604
+ ```typescript
605
+ {
606
+ "tool": "start_workflow",
607
+ "parameters": {
608
+ "mode": "batch",
609
+ "imageIds": ["img-01", "img-02", "img-03"],
610
+ "batchSize": 10 // Generate 10 at a time
611
+ }
612
+ }
613
+ ```
614
+
615
+ **Bulk Mode**:
616
+ ```typescript
617
+ {
618
+ "tool": "start_workflow",
619
+ "parameters": {
620
+ "mode": "bulk",
621
+ "concurrency": 5 // 5 parallel generations
622
+ }
623
+ }
624
+ ```
625
+
626
+ ### Workflow Control
627
+
628
+ **Pause workflow**:
629
+ ```typescript
630
+ {
631
+ "tool": "pause_workflow"
632
+ }
633
+ ```
634
+
635
+ **Resume workflow**:
636
+ ```typescript
637
+ {
638
+ "tool": "resume_workflow"
639
+ }
640
+ ```
641
+
642
+ **Generate single image**:
643
+ ```typescript
644
+ {
645
+ "tool": "generate_image",
646
+ "parameters": {
647
+ "imageId": "doc-img-01",
648
+ "provider": "zai" // Optional, uses default if omitted
649
+ }
650
+ }
651
+ ```
652
+
653
+ ---
654
+
655
+ ## MCP Tools Reference
656
+
657
+ ### Configuration Tools
658
+
659
+ #### `configure_provider`
660
+ Set API key for a provider at runtime.
661
+
662
+ **Parameters**:
663
+ - `provider` (string): Provider type (`openai`, `gemini`, `zai`, etc.)
664
+ - `apiKey` (string): API key
665
+
666
+ **Example**:
667
+ ```json
668
+ {
669
+ "tool": "configure_provider",
670
+ "parameters": {
671
+ "provider": "zai",
672
+ "apiKey": "zai-..."
673
+ }
674
+ }
675
+ ```
676
+
677
+ #### `get_provider_status`
678
+ Check which providers are configured.
679
+
680
+ **Parameters**: None
681
+
682
+ **Response**:
683
+ ```json
684
+ {
685
+ "configured": ["openai", "gemini", "zai"],
686
+ "unconfigured": ["stability", "replicate", "leonardo", "huggingface", "xai"]
687
+ }
688
+ ```
689
+
690
+ #### `test_provider_connection`
691
+ Verify API key validity.
692
+
693
+ **Parameters**:
694
+ - `provider` (string): Provider to test
695
+
696
+ **Response**:
697
+ ```json
698
+ {
699
+ "success": true,
700
+ "message": "Z.ai API connected (245ms)",
701
+ "latency": 245
702
+ }
703
+ ```
704
+
705
+ #### `remove_provider`
706
+ Remove API key for a provider.
707
+
708
+ **Parameters**:
709
+ - `provider` (string): Provider to remove
710
+
711
+ ### Model Selection & Testing Tools ✨ NEW v0.9.0
712
+
713
+ #### `set_default_model`
714
+ Set the default model for a provider.
715
+
716
+ **Parameters**:
717
+ - `provider` (string): Provider to configure
718
+ - `modelId` (string): Model ID to set as default
719
+
720
+ **Example**:
721
+ ```json
722
+ {
723
+ "tool": "set_default_model",
724
+ "parameters": {
725
+ "provider": "zai",
726
+ "modelId": "glm-image"
727
+ }
728
+ }
729
+ ```
730
+
731
+ **Response**:
732
+ ```json
733
+ {
734
+ "success": true,
735
+ "provider": "zai",
736
+ "modelId": "glm-image",
737
+ "modelName": "GLM-Image",
738
+ "message": "Default model set to 'GLM-Image' for Z.ai GLM-Image..."
739
+ }
740
+ ```
741
+
742
+ #### `get_model_info`
743
+ Get detailed information about a specific model.
744
+
745
+ **Parameters**:
746
+ - `provider` (string): Provider that offers the model
747
+ - `modelId` (string): Model ID to get information about
748
+
749
+ **Response**:
750
+ ```json
751
+ {
752
+ "success": true,
753
+ "provider": "gemini",
754
+ "providerName": "Google Gemini 2.5 Flash Image",
755
+ "model": {
756
+ "id": "gemini-2.5-flash-image",
757
+ "name": "Gemini 2.5 Flash Image",
758
+ "costPerImage": 0.0,
759
+ "description": "Fast, free-tier image generation",
760
+ "capabilities": {
761
+ "maxResolution": "2048x2048",
762
+ "supportedAspectRatios": ["1:1", "16:9", "4:3", "9:16"]
763
+ }
764
+ },
765
+ "testResult": {
766
+ "testedAt": "2026-01-16T10:30:00.000Z",
767
+ "qualityScore": 85.5,
768
+ "passed": true
769
+ }
770
+ }
771
+ ```
772
+
773
+ #### `test_model`
774
+ Test a model with standard or custom prompt.
775
+
776
+ **Parameters**:
777
+ - `provider` (string): Provider to test
778
+ - `modelId` (string): Model ID to test
779
+ - `useStandardTest` (boolean, optional): Use standard test prompt
780
+ - `prompt` (string, optional): Custom prompt for testing
781
+ - `aspectRatio` (string, optional): Aspect ratio (default: "16:9")
782
+ - `skipPermission` (boolean, optional): Skip cost confirmation (default: false)
783
+
784
+ **Example (Standard Test)**:
785
+ ```json
786
+ {
787
+ "tool": "test_model",
788
+ "parameters": {
789
+ "provider": "zai",
790
+ "modelId": "glm-image",
791
+ "useStandardTest": true
792
+ }
793
+ }
794
+ ```
795
+
796
+ **Example (Custom Prompt)**:
797
+ ```json
798
+ {
799
+ "tool": "test_model",
800
+ "parameters": {
801
+ "provider": "gemini",
802
+ "modelId": "gemini-2.5-flash-image",
803
+ "prompt": "AWS VPC architecture diagram with public/private subnets"
804
+ }
805
+ }
806
+ ```
807
+
808
+ **Response**:
809
+ ```json
810
+ {
811
+ "success": true,
812
+ "provider": "zai",
813
+ "providerName": "Z.ai GLM-Image",
814
+ "model": "GLM-Image",
815
+ "testImage": {
816
+ "filepath": "generated-images/tests/zai-glm-image-test.png",
817
+ "generationTime": 12000,
818
+ "actualCost": 0.015
819
+ },
820
+ "qualityScore": {
821
+ "overall": 87.5,
822
+ "sharpness": 89.2,
823
+ "brightness": 145,
824
+ "textRendering": 85.0,
825
+ "colorAccuracy": 90.0,
826
+ "passed": true
827
+ }
828
+ }
829
+ ```
830
+
831
+ **Quality Metrics**:
832
+ - **Sharpness (30%)**: Laplacian variance analysis
833
+ - **Brightness (20%)**: Average brightness (30-240 range)
834
+ - **Text Rendering (40%)**: OCR accuracy estimation
835
+ - **Color Accuracy (10%)**: Heuristic validation
836
+ - **Pass Threshold**: 60/100 overall score
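The weights above suggest a weighted average. The sketch below assumes each metric has already been normalized to a 0-100 sub-score (how raw brightness on the 0-255 scale is mapped to a sub-score is not documented here), and `scoreQuality` is a hypothetical name.

```typescript
// Weights as listed above; they sum to 1.0.
const QUALITY_WEIGHTS = {
  sharpness: 0.3,
  brightness: 0.2,
  textRendering: 0.4,
  colorAccuracy: 0.1,
};
const PASS_THRESHOLD = 60;

type SubScores = Record<keyof typeof QUALITY_WEIGHTS, number>;

// Assumption: every sub-score is already on a 0-100 scale.
function scoreQuality(sub: SubScores): { overall: number; passed: boolean } {
  const overall =
    sub.sharpness * QUALITY_WEIGHTS.sharpness +
    sub.brightness * QUALITY_WEIGHTS.brightness +
    sub.textRendering * QUALITY_WEIGHTS.textRendering +
    sub.colorAccuracy * QUALITY_WEIGHTS.colorAccuracy;
  return { overall, passed: overall >= PASS_THRESHOLD };
}

// Usage: weak text rendering drags the score down more than any other metric.
const goodText = scoreQuality({ sharpness: 80, brightness: 100, textRendering: 50, colorAccuracy: 100 });
const poorText = scoreQuality({ sharpness: 80, brightness: 100, textRendering: 0, colorAccuracy: 100 });
```

With these inputs the first image scores 24 + 20 + 20 + 10 = 74 and passes, while the second scores 54 and fails, which shows how heavily the 40% text-rendering weight counts.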
837
+
838
+ #### `compare_models`
839
+ Compare multiple providers/models side-by-side.
840
+
841
+ **Parameters**:
842
+ - `prompt` (string): Prompt to test across all models
843
+ - `providers` (array): Array of {provider, model} objects
844
+ - `aspectRatio` (string, optional): Aspect ratio (default: "16:9")
845
+ - `skipPermission` (boolean, optional): Skip cost confirmation
846
+
847
+ **Example**:
848
+ ```json
849
+ {
850
+ "tool": "compare_models",
851
+ "parameters": {
852
+ "prompt": "Technical diagram showing microservices architecture",
853
+ "providers": [
854
+ { "provider": "gemini", "model": "gemini-2.5-flash-image" },
855
+ { "provider": "zai", "model": "glm-image" },
856
+ { "provider": "huggingface", "model": "black-forest-labs/FLUX.1-dev" }
857
+ ]
858
+ }
859
+ }
860
+ ```
861
+
862
+ **Response**:
863
+ ```json
864
+ {
865
+ "success": true,
866
+ "totalCost": 0.015,
867
+ "totalTime": 35000,
868
+ "results": [
869
+ {
870
+ "provider": "zai",
871
+ "model": "GLM-Image",
872
+ "qualityScore": { "overall": 92.1 },
873
+ "rank": 1
874
+ },
875
+ {
876
+ "provider": "gemini",
877
+ "model": "Gemini 2.5 Flash Image",
878
+ "qualityScore": { "overall": 85.5 },
879
+ "rank": 2
880
+ }
881
+ ],
882
+ "recommendation": {
883
+ "provider": "zai",
884
+ "model": "glm-image",
885
+ "reason": "Highest overall quality (92.1/100)..."
886
+ }
887
+ }
888
+ ```
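The `rank` and `recommendation` fields can be derived by sorting results on their overall score. In the sketch below, `rankModels` is a hypothetical name, and breaking ties in favour of the cheaper model is an assumption rather than documented behaviour.

```typescript
// Hypothetical flattened result entry for this sketch.
interface ComparisonEntry {
  provider: string;
  model: string;
  overall: number; // 0-100 quality score
  cost: number; // USD per image
}

// Sort by quality (descending), then by cost (ascending) as an assumed tie-break.
function rankModels(entries: ComparisonEntry[]): Array<ComparisonEntry & { rank: number }> {
  return [...entries]
    .sort((a, b) => b.overall - a.overall || a.cost - b.cost)
    .map((entry, index) => ({ ...entry, rank: index + 1 }));
}

// Usage with the two results from the example response.
const rankedModels = rankModels([
  { provider: "gemini", model: "gemini-2.5-flash-image", overall: 85.5, cost: 0 },
  { provider: "zai", model: "glm-image", overall: 92.1, cost: 0.015 },
]);
const recommendedModel = rankedModels[0];
```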
889
+
890
+ ### Image Generation Tools
891
+
892
+ #### `parse_markdown`
893
+ Extract image specifications from markdown files.
894
+
895
+ **Parameters**:
896
+ - `filePaths` (string[]): Array of markdown file paths
897
+ - `globalContext` (object, optional): Global styling context
898
+
899
+ **Response**:
900
+ ```json
901
+ {
902
+ "images": [
903
+ {
904
+ "id": "docs-img-01",
905
+ "file": "docs/architecture.md",
906
+ "type": "architecture",
907
+ "aspectRatio": "16:9",
908
+ "prompt": "AWS cloud architecture...",
909
+ "estimatedCost": 0.015
910
+ }
911
+ ],
912
+ "totalImages": 5,
913
+ "totalEstimatedCost": 0.075
914
+ }
915
+ ```
916
+
917
+ #### `list_providers`
918
+ List available providers and their models.
919
+
920
+ **Parameters**: None
921
+
922
+ **Response**:
923
+ ```json
924
+ {
925
+ "providers": [
926
+ {
927
+ "name": "zai",
928
+ "displayName": "Z.ai GLM-Image",
929
+ "isAvailable": true,
930
+ "models": [
931
+ {
932
+ "id": "glm-image",
933
+ "name": "GLM-Image",
934
+ "costPerImage": 0.015,
935
+ "description": "16B hybrid model for text-heavy diagrams"
936
+ }
937
+ ],
938
+ "defaultModel": "glm-image",
939
+ "capabilities": { ... }
940
+ }
941
+ ]
942
+ }
943
+ ```
944
+
945
+ #### `generate_image`
946
+ Generate a single image.
947
+
948
+ **Parameters**:
949
+ - `imageId` (string): Image specification ID
950
+ - `provider` (string, optional): Provider to use
951
+ - `options` (object, optional):
952
+ - `model` (string): Specific model ID
953
+ - `quality` (`'standard'` | `'hd'`)
954
+ - `style` (string): Custom style
955
+
956
+ **Response**:
957
+ ```json
958
+ {
959
+ "id": "doc-img-01",
960
+ "filepath": "generated-images/zai/doc-img-01.webp",
961
+ "provider": "zai",
962
+ "metadata": {
963
+ "model": "glm-image",
964
+ "actualCost": 0.015,
965
+ "generationTime": 12500,
966
+ "dimensions": { "width": 1792, "height": 1024 },
967
+ "fileSize": 145234,
968
+ "format": "webp",
969
+ "quality": {
970
+ "sharpness": 87.3,
971
+ "brightness": 142,
972
+ "passed": true
973
+ }
974
+ },
975
+ "generatedAt": "2026-01-16T06:30:00.000Z"
976
+ }
977
+ ```
978
+
979
+ #### `start_workflow`
980
+ Start image generation workflow.
981
+
982
+ **Parameters**:
983
+ - `mode` (`'interactive'` | `'batch'` | `'bulk'`): Workflow mode
984
+ - `imageIds` (string[], optional): Specific images to generate
985
+ - `provider` (string, optional): Default provider
986
+ - `concurrency` (number, optional): Parallel generations (bulk mode only)
987
+
988
+ **Response**:
989
+ ```json
990
+ {
991
+ "workflowId": "wf-1234",
992
+ "mode": "bulk",
993
+ "status": "started",
994
+ "totalImages": 10,
995
+ "provider": "gemini"
996
+ }
997
+ ```
998
+
999
+ #### `get_status`
1000
+ Get workflow progress.
1001
+
1002
+ **Parameters**: None
1003
+
1004
+ **Response**:
1005
+ ```json
1006
+ {
1007
+ "mode": "bulk",
1008
+ "status": "running",
1009
+ "currentImage": "docs-img-05",
1010
+ "totalImages": 10,
1011
+ "completed": 4,
1012
+ "failed": 0,
1013
+ "pending": 6,
1014
+ "provider": "gemini",
1015
+ "estimatedTimeRemaining": 45
1016
+ }
1017
+ ```
1018
+
1019
+ #### `pause_workflow` / `resume_workflow`
1020
+ Control workflow execution.
1021
+
1022
+ **Parameters**: None
1023
+
1024
+ #### `list_images`
1025
+ List parsed image specifications.
1026
+
1027
+ **Parameters**:
1028
+ - `filter` (string, optional): Filter by type (`architecture`, `flowchart`, etc.)
1029
+
1030
+ **Response**:
1031
+ ```json
1032
+ {
1033
+ "images": [
1034
+ {
1035
+ "id": "docs-img-01",
1036
+ "file": "docs/architecture.md",
1037
+ "type": "architecture",
1038
+ "status": "generated",
1039
+ "filepath": "generated-images/zai/docs-img-01.webp"
1040
+ }
1041
+ ],
1042
+ "total": 10,
1043
+ "byStatus": {
1044
+ "pending": 3,
1045
+ "generated": 6,
1046
+ "failed": 1
1047
+ }
1048
+ }
1049
+ ```
1050
+
1051
+ ### Cost Tracking Tools
1052
+
1053
+ #### `get_cost_summary`
1054
+ Get cost breakdown.
1055
+
1056
+ **Parameters**: None
1057
+
1058
+ **Response**:
1059
+ ```json
1060
+ {
1061
+ "totalCost": 0.225,
1062
+ "byProvider": {
1063
+ "gemini": 0.0,
1064
+ "zai": 0.105,
1065
+ "openai": 0.12
1066
+ },
1067
+ "byFile": {
1068
+ "docs/architecture.md": 0.075,
1069
+ "docs/api.md": 0.15
1070
+ },
1071
+ "byType": {
1072
+ "architecture": 0.09,
1073
+ "flowchart": 0.075,
1074
+ "diagram": 0.06
1075
+ },
1076
+ "imageCount": 15,
1077
+ "averageCostPerImage": 0.015
1078
+ }
1079
+ ```
1080
+
1081
+ ---
1082
+
1083
+ ## Testing
1084
+
1085
+ ### Test Infrastructure
1086
+
1087
+ Visual Forge uses **Jest** with full TypeScript and ES modules support.
1088
+
1089
+ **Test coverage**: 77 tests across 4 test suites
1090
+
1091
+ ### Running Tests
1092
+
1093
+ ```bash
1094
+ # Run all tests
1095
+ npm test
1096
+
1097
+ # Run tests in watch mode
1098
+ npm run test:watch
1099
+
1100
+ # Run with coverage report
1101
+ npm test -- --coverage
1102
+
1103
+ # Run specific test file
1104
+ npm test -- provider-factory.test.ts
1105
+ ```
1106
+
1107
+ ### Test Structure
1108
+
1109
+ ```
1110
+ test/
1111
+ ├── helpers/ # Test utilities and mocks
1112
+ │ ├── test-utils.ts # createMockImageSpec(), etc.
1113
+ │ └── mock-providers.ts # MockSuccessProvider, MockFailureProvider
1114
+ ├── providers/ # Provider system tests
1115
+ │ └── provider-factory.test.ts # 21 test cases
1116
+ ├── state/ # State management tests
1117
+ ├── workflow/ # Workflow orchestration tests
1118
+ └── README.md # Test documentation
1119
+ ```
1120
+
1121
+ ### Manual Provider Testing
1122
+
1123
+ ```bash
1124
+ # Check all providers
1125
+ npx tsx scripts/check-providers.ts
1126
+
1127
+ # Test Z.ai specifically
1128
+ ZAI_API_KEY=zai-... npx tsx scripts/test-zai.ts
1129
+
1130
+ # Compare all providers
1131
+ npx tsx scripts/generate-all-providers.ts
1132
+
1133
+ # Test versioning
1134
+ npx tsx scripts/generate-solo-theme-test.ts
1135
+ ```
1136
+
1137
+ ### Test Helpers
1138
+
1139
+ Use these when writing tests:
1140
+
1141
+ ```typescript
1142
+ import {
1143
+ createMockImageSpec,
1144
+ createMockProviderConfig,
1145
+ setTestEnv,
1146
+ clearTestEnv
1147
+ } from '../helpers/test-utils.js';
1148
+
1149
+ // Create mock image spec
1150
+ const spec = createMockImageSpec({
1151
+ id: 'test-img-01',
1152
+ type: 'architecture',
1153
+ aspectRatio: '16:9'
1154
+ });
1155
+
1156
+ // Set test environment
1157
+ setTestEnv({
1158
+ 'GOOGLE_API_KEY': 'AIza-test-key-1234567890',
1159
+ 'ZAI_API_KEY': 'zai-test-key-1234567890'
1160
+ });
1161
+
1162
+ // Clean up
1163
+ clearTestEnv(['GOOGLE_API_KEY', 'ZAI_API_KEY']);
1164
+ ```
1165
+
1166
+ ### Writing Tests
1167
+
1168
+ Follow the Arrange-Act-Assert (AAA) pattern:
1169
+ ```typescript
1170
+ describe('ProviderFactory', () => {
1171
+ beforeEach(() => {
1172
+ // Arrange: Setup
1173
+ jest.clearAllMocks();
1174
+ clearTestEnv(['GOOGLE_API_KEY']);
1175
+ });
1176
+
1177
+ it('should initialize Gemini provider when API key is set', () => {
1178
+ // Arrange
1179
+ setTestEnv({ 'GOOGLE_API_KEY': 'AIza-test-key-1234567890' });
1180
+ const factory = new ProviderFactory();
1181
+
1182
+ // Act
1183
+ factory.initialize();
1184
+
1185
+ // Assert
1186
+ expect(factory.isProviderAvailable('gemini')).toBe(true);
1187
+ });
1188
+ });
1189
+ ```
1190
+
1191
+ ---
1192
+
1193
+ ## Troubleshooting
1194
+
1195
+ ### Common Issues
1196
+
1197
+ #### 1. "No providers configured" error
1198
+
1199
+ **Problem**: No provider API keys are set
1200
+
1201
+ **Solution**:
1202
+ ```bash
1203
+ # Check environment variables
1204
+ echo $GOOGLE_API_KEY
1205
+ echo $ZAI_API_KEY
1206
+
1207
+ # Set at least one provider
1208
+ export GOOGLE_API_KEY=AIza...
1209
+
1210
+ # Or use runtime configuration
1211
+ mcp-tool configure_provider --provider gemini --apiKey AIza...
1212
+ ```
1213
+
1214
+ #### 2. Rate limit errors
1215
+
1216
+ **Problem**: Too many requests to the provider API
1217
+
1218
+ **Solution**:
1219
+ - Reduce concurrency: `"concurrency": 2` (instead of 3)
1220
+ - Wait between requests (automatic with rate limiter)
1221
+ - Check provider-specific rate limits in `config/pricing.json`
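To illustrate what lowering concurrency means in practice, here is a minimal sketch of running generation tasks with a fixed number of in-flight requests. `runWithConcurrency` is a hypothetical helper written for this example; it is not the package's rate limiter, which the docs above describe as automatic.

```typescript
// Run async tasks with at most `limit` of them in flight at any time.
// Results are returned in the same order as the input tasks.
async function runWithConcurrency<T>(
  tasks: Array<() => Promise<T>>,
  limit: number
): Promise<T[]> {
  const results: T[] = new Array(tasks.length);
  let next = 0; // index of the next task to start

  // Each worker pulls the next unstarted task until none remain.
  async function worker(): Promise<void> {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, tasks.length));
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results;
}
```

With `limit` set to 2 instead of 3, fewer requests reach the provider at once, which trades throughput for a lower chance of hitting rate limits. If any task rejects, the returned promise rejects as well; a production version would likely collect failures per task instead.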
1222
+
1223
+ #### 3. Quality validation failures
1224
+
1225
+ **Problem**: Generated images fail quality checks
1226
+
1227
+ **Solution**:
1228
+ ```env
1229
+ # Disable quality validation temporarily
1230
+ VF_QUALITY_VALIDATION=false
1231
+
1232
+ # Or lower the thresholds in code (quality-inspector.ts); the two lines below are code settings, not environment variables
1233
+ minSharpness: 40 # Instead of 50
1234
+ minQualityScore: 50 # Instead of 60
1235
+ ```
1236
+
1237
+ #### 4. Build errors
1238
+
1239
+ **Problem**: TypeScript compilation fails
1240
+
1241
+ **Solution**:
1242
+ ```bash
1243
+ # Clean build artifacts
1244
+ npm run clean
1245
+
1246
+ # Reinstall dependencies
1247
+ rm -rf node_modules package-lock.json
1248
+ npm install
1249
+
1250
+ # Rebuild
1251
+ npm run build
1252
+ ```
1253
+
1254
+ #### 5. Provider initialization warnings
1255
+
1256
+ **Problem**: "Failed to initialize X provider"
1257
+
1258
+ **Causes**:
1259
+ - Invalid API key format
1260
+ - Missing environment variable
1261
+ - Network connectivity issues
1262
+
1263
+ **Debug**:
1264
+ ```bash
1265
+ # Check provider status
1266
+ npx tsx scripts/check-providers.ts
1267
+
1268
+ # Test specific provider connection
1269
+ mcp-tool test_provider_connection --provider zai
1270
+
1271
+ # Enable debug logging
1272
+ export IMAGE_GEN_LOG_LEVEL=debug
1273
+ npm start
1274
+ ```
1275
+
1276
+ ### Debug Mode
1277
+
1278
+ Enable detailed logging:
1279
+
1280
+ ```env
1281
+ IMAGE_GEN_LOG_LEVEL=debug
1282
+ ```
1283
+
1284
+ Log output includes:
1285
+ - Provider initialization details
1286
+ - Model selection logic
1287
+ - API request/response details
1288
+ - Quality inspection results
1289
+ - Cost calculations
1290
+
1291
+ ### State Corruption
1292
+
1293
+ If the state file becomes corrupted:
1294
+
1295
+ ```bash
1296
+ # Backup current state
1297
+ cp ~/.visual-forge-mcp/state.json ~/.visual-forge-mcp/state.json.backup
1298
+
1299
+ # Reset state
1300
+ rm ~/.visual-forge-mcp/state.json
1301
+
1302
+ # Restart MCP server
1303
+ ```
1304
+
1305
+ ---
1306
+
1307
+ ## Advanced Topics
1308
+
1309
+ ### Custom Provider Implementation
1310
+
1311
+ Create a new provider by extending `BaseProvider`:
1312
+
1313
+ ```typescript
1314
+ import { BaseProvider } from '../base-provider.js';
1315
+ import { ProviderType, ProviderConfig, ... } from '../../types/index.js';
1316
+
1317
+ export class MyCustomProvider extends BaseProvider {
1318
+ readonly name: ProviderType = 'mycustom';
1319
+ readonly displayName = 'My Custom Provider';
1320
+
1321
+ constructor(config: ProviderConfig) {
1322
+ super(config);
1323
+ this.client.setHeader('Authorization', `Bearer ${config.apiKey}`);
1324
+ this.init();
1325
+ }
1326
+
1327
+ protected async generateImage(
1328
+ spec: ImageSpec,
1329
+ options?: GenerationOptions
1330
+ ): Promise<GeneratedImage> {
1331
+ // Implementation
1332
+ }
1333
+
1334
+ adaptPrompt(prompt: string, context?: GlobalContext): string {
1335
+ // Customize prompt for your provider
1336
+ }
1337
+
1338
+ getCapabilities(): ProviderCapabilities {
1339
+ // Return provider capabilities
1340
+ }
1341
+
1342
+ async testConnection(): Promise<{ success: boolean; message: string; latency?: number }> {
1343
+ // Test API connectivity
1344
+ }
1345
+ }
1346
+ ```
1347
+
1348
+ Register in `ProviderFactory`:
1349
+
1350
+ ```typescript
1351
+ // src/providers/index.ts
1352
+ import { MyCustomProvider } from './mycustom/mycustom-provider.js';
1353
+
1354
+ // In initialize()
1355
+ if (process.env.MYCUSTOM_API_KEY) {
1356
+ const provider = new MyCustomProvider({
1357
+ type: 'mycustom',
1358
+ apiKey: process.env.MYCUSTOM_API_KEY,
1359
+ endpoint: 'https://api.mycustom.com',
1360
+ models: [
1361
+ {
1362
+ id: 'my-model-1',
1363
+ name: 'My Model 1',
1364
+ costPerImage: 0.01
1365
+ }
1366
+ ],
1367
+ defaultModel: 'my-model-1',
1368
+ costPerImage: 0.01,
1369
+ rateLimit: 10,
1370
+ timeout: 60000
1371
+ });
1372
+ this.providers.set('mycustom', provider);
1373
+ }
1374
+ ```
1375
+
1376
+ ### Pricing Configuration
1377
+
1378
+ Centralized pricing in `config/pricing.json`:
1379
+
1380
+ ```json
1381
+ {
1382
+ "version": "2.1.0",
1383
+ "lastUpdated": "2026-01-16",
1384
+ "providers": {
1385
+ "zai": {
1386
+ "name": "Z.ai (Zhipu AI)",
1387
+ "pricingUrl": "https://docs.z.ai/guides/overview/pricing",
1388
+ "defaultModel": "glm-image",
1389
+ "models": {
1390
+ "glm-image": {
1391
+ "name": "GLM-Image",
1392
+ "costPerImage": 0.015,
1393
+ "rateLimit": 15,
1394
+ "timeout": 90000,
1395
+ "maxDimensions": { "width": 2048, "height": 2048 },
1396
+ "notes": "Excellent for text-heavy diagrams"
1397
+ }
1398
+ }
1399
+ }
1400
+ },
1401
+ "costComparison": {
1402
+ "recommended": {
1403
+ "provider": "zai",
1404
+ "model": "glm-image",
1405
+ "reason": "Best for technical documentation"
1406
+ }
1407
+ }
1408
+ }
1409
+ ```
1410
+
1411
+ ### Backup and Restore
1412
+
1413
+ Visual Forge includes an automatic backup system:
1414
+
1415
+ **Create backup before generation**:
1416
+ ```json
1417
+ {
1418
+ "tool": "create_backup",
1419
+ "parameters": {
1420
+ "files": ["docs/architecture.md"],
1421
+ "description": "Before architecture diagram generation"
1422
+ }
1423
+ }
1424
+ ```
1425
+
1426
+ **List backups**:
1427
+ ```json
1428
+ {
1429
+ "tool": "list_backups"
1430
+ }
1431
+ ```
1432
+
1433
+ **Restore from backup**:
1434
+ ```json
1435
+ {
1436
+ "tool": "restore_from_backup",
1437
+ "parameters": {
1438
+ "backupId": "backup-20260116-063000"
1439
+ }
1440
+ }
1441
+ ```
1442
+
1443
+ **Approve changes** (deletes backups):
1444
+ ```json
1445
+ {
1446
+ "tool": "approve_changes"
1447
+ }
1448
+ ```
1449
+
1450
+ See [Backup System Guide](./backup-system.md) for details.
1451
+
1452
+ ### Multi-Format Optimization
1453
+
1454
+ Generated images are automatically optimized to multiple formats:
1455
+
1456
+ **Default formats**:
1457
+ - **WebP**: Primary format (best compression, wide support)
1458
+ - **JPEG**: Fallback for older browsers (90% quality)
1459
+ - **PNG**: Optional (disabled by default, use for transparency)
1460
+
1461
+ **Configuration**:
1462
+ ```env
1463
+ VF_GENERATE_PNG=true # Enable PNG generation
1464
+ VF_PNG_QUALITY=85 # PNG quality 0-100
1465
+ VF_PNG_PALETTE=true # Use palette compression
1466
+ VF_PNG_COLORS=256 # Palette colors
1467
+ VF_PNG_DITHER=1.0 # Dithering level
1468
+ ```
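As an illustration of how these variables might be consumed, the sketch below parses them into a typed options object. Everything here is an assumption for the sake of example: `PngOptions` and `readPngOptions` are hypothetical names, the fallback values simply reuse the numbers shown above, and the clamping ranges for colors (2 to 256) and dither (0 to 1) are guesses. Only the "PNG is disabled by default" behaviour comes from the documentation.

```typescript
// Hypothetical options shape; field names are illustrative only.
interface PngOptions {
  enabled: boolean;
  quality: number;
  palette: boolean;
  colors: number;
  dither: number;
}

type Env = Record<string, string | undefined>;

// Treat only the literal string "true" (case-insensitive) as true.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

// Parse a number, fall back on invalid input, and clamp to [min, max].
function parseNumber(
  value: string | undefined,
  fallback: number,
  min: number,
  max: number
): number {
  if (value === undefined || value.trim() === "") return fallback;
  const n = Number(value);
  if (!Number.isFinite(n)) return fallback;
  return Math.min(max, Math.max(min, n));
}

function readPngOptions(env: Env = process.env): PngOptions {
  return {
    enabled: parseBool(env.VF_GENERATE_PNG, false), // documented default: disabled
    quality: parseNumber(env.VF_PNG_QUALITY, 85, 0, 100), // assumed fallback
    palette: parseBool(env.VF_PNG_PALETTE, true), // assumed fallback
    colors: parseNumber(env.VF_PNG_COLORS, 256, 2, 256), // assumed range
    dither: parseNumber(env.VF_PNG_DITHER, 1.0, 0, 1), // assumed range
  };
}
```

Passing the environment in as a parameter, rather than reading `process.env` directly, keeps the function easy to test.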
1469
+
1470
+ **Output structure**:
1471
+ ```
1472
+ generated-images/
1473
+ └── 001-architecture-md/
1474
+ └── zai/
1475
+ ├── original/
1476
+ │ └── doc-img-01.png # Original PNG
1477
+ ├── doc-img-01.webp # Optimized WebP (primary)
1478
+ ├── doc-img-01.jpg # Optimized JPEG (fallback)
1479
+ └── doc-img-01-optimized.png # Optimized PNG (optional)
1480
+ ```
1481
+
1482
+ ### State Persistence
1483
+
1484
+ State file: `~/.visual-forge-mcp/state.json`
1485
+
1486
+ **Structure**:
1487
+ ```json
1488
+ {
1489
+ "version": "1.0",
1490
+ "session": {
1491
+ "id": "session-20260116",
1492
+ "createdAt": "2026-01-16T06:00:00.000Z"
1493
+ },
1494
+ "parsedImages": [
1495
+ {
1496
+ "id": "doc-img-01",
1497
+ "file": "docs/architecture.md",
1498
+ "type": "architecture",
1499
+ "status": "generated"
1500
+ }
1501
+ ],
1502
+ "generatedImages": [
1503
+ {
1504
+ "id": "doc-img-01",
1505
+ "filepath": "generated-images/zai/doc-img-01.webp",
1506
+ "provider": "zai",
1507
+ "cost": 0.015
1508
+ }
1509
+ ],
1510
+ "jobs": {
1511
+ "pending": [],
1512
+ "inProgress": [],
1513
+ "completed": ["doc-img-01"]
1514
+ },
1515
+ "costs": {
1516
+ "total": 0.015,
1517
+ "byProvider": { "zai": 0.015 },
1518
+ "byFile": { "docs/architecture.md": 0.015 }
1519
+ },
1520
+ "workflow": {
1521
+ "mode": "bulk",
1522
+ "status": "completed"
1523
+ }
1524
+ }
1525
+ ```
1526
+
1527
+ **Atomic writes**: State is written to a temporary file and then renamed over `state.json`, so a crash mid-write does not leave a partially written state file.
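The temp-file-plus-rename technique can be sketched as follows. This is a minimal illustration using Node's `fs` promises API, not the package's actual code; `writeStateAtomically` and `demo` are hypothetical names. It relies on `rename` replacing the target in a single step when both paths are on the same filesystem, which holds on POSIX systems.

```typescript
import { promises as fs } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Write `state` as JSON to `statePath` without ever exposing a half-written file.
async function writeStateAtomically(statePath: string, state: unknown): Promise<void> {
  const dir = path.dirname(statePath);
  await fs.mkdir(dir, { recursive: true });

  // Keep the temp file in the same directory so the rename stays on one filesystem.
  const tmpPath = path.join(dir, `.${path.basename(statePath)}.${process.pid}.tmp`);
  try {
    await fs.writeFile(tmpPath, JSON.stringify(state, null, 2), "utf8");
    await fs.rename(tmpPath, statePath); // replaces the old file in one step
  } catch (err) {
    await fs.rm(tmpPath, { force: true }); // clean up the temp file on failure
    throw err;
  }
}

// Usage: write twice to a throwaway directory and read the result back.
async function demo(): Promise<unknown> {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), "vf-state-"));
  const statePath = path.join(dir, "state.json");
  await writeStateAtomically(statePath, { version: "1.0" });
  await writeStateAtomically(statePath, {
    version: "1.0",
    workflow: { mode: "bulk", status: "completed" },
  });
  return JSON.parse(await fs.readFile(statePath, "utf8"));
}
```

A reader of `state.json` sees either the previous contents or the new contents, never a mix. This sketch does not call `fsync`, so it guards against process crashes rather than sudden power loss.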
1528
+
1529
+ ---
1530
+
1531
+ ## Next Steps
1532
+
1533
+ 1. **Explore Examples**: See [Usage Examples](./usage-examples.md)
1534
+ 2. **Test Providers**: Run `npx tsx scripts/check-providers.ts`
1535
+ 3. **Generate Your First Image**: Follow [Quick Start](../../README.md#quick-start)
1536
+ 4. **Configure Backups**: Read [Backup System Guide](./backup-system.md)
1537
+ 5. **Integrate with n8n**: See [n8n Integration](../integrations/n8n.md)
1538
+
1539
+ ---
1540
+
1541
+ ## Support & Resources
1542
+
1543
+ - **GitHub**: https://github.com/michelabboud/visual-forge-mcp
1544
+ - **Issues**: https://github.com/michelabboud/visual-forge-mcp/issues
1545
+ - **Changelog**: [CHANGELOG.md](../../CHANGELOG.md)
1546
+ - **Architecture**: [MCP Server Architecture](../development/mcp-server-architecture.md)
1547
+
1548
+ ---
1549
+
1550
+ **Version:** 0.7.0
1551
+ **Last Updated:** 2026-01-16
1552
+ **License:** MIT