vertex-ai-proxy 1.0.3 → 1.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,178 +1,600 @@
- # vertex-ai-proxy
+ # Vertex AI Proxy for OpenClaw & Clawdbot
 
- OpenAI-compatible proxy for Google Vertex AI, supporting **Claude** and **Gemini** models with automatic failover, retries, and prompt caching.
-
- [![npm version](https://badge.fury.io/js/vertex-ai-proxy.svg)](https://www.npmjs.com/package/vertex-ai-proxy)
+ [![npm version](https://badge.fury.io/js/vertex-ai-proxy.svg)](https://badge.fury.io/js/vertex-ai-proxy)
  [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
 
+ A proxy server that lets you use **Google Vertex AI models** (Claude, Gemini, Imagen) with [OpenClaw](https://github.com/openclaw/openclaw), [Clawdbot](https://github.com/clawdbot/clawdbot), and other OpenAI-compatible tools.
+
+ ```
+ ┌─────────────┐      ┌──────────────────┐      ┌─────────────────┐
+ │  OpenClaw   │─────▶│   Vertex Proxy   │─────▶│  Vertex AI API  │
+ │  Clawdbot   │◀─────│  (This Server)   │◀─────│  Claude/Gemini  │
+ └─────────────┘      └──────────────────┘      └─────────────────┘
+ ```
+
  ## Features
 
- - 🔄 **Automatic Region Failover** - Seamlessly switches between regions on rate limits (429)
- - 🔁 **Smart Retries** - Exponential backoff with jitter for transient errors
- - 💰 **Prompt Caching** - Reduces costs up to 90% for repeated system prompts (Claude)
- - 📊 **Prometheus Metrics** - Monitor latency, errors, cache hits at `/metrics`
- - ⏱️ **Request Timeout** - Configurable timeout (default 300s)
- - 📋 **Request Queue** - Prevents overload with configurable concurrency limits
- - 💓 **Heartbeat Ping** - Keeps long-running streaming connections alive
- - 🔀 **Multi-Model Support** - Claude Opus/Sonnet/Haiku + Gemini Pro/Flash
- - **Full Streaming** - Including tool/function calls
+ - 🤖 **Multi-model support**: Claude (Opus, Sonnet, Haiku), Gemini, Imagen
+ - 🔄 **Format conversion**: Translates between the OpenAI and Anthropic API formats
+ - 📡 **Streaming**: Full SSE streaming support
+ - 🏷️ **Model aliases**: Create friendly names like `my-assistant` → `claude-opus-4-5`
+ - 🔀 **Fallback chains**: Automatic failover when models are unavailable
+ - 🌍 **Dynamic region fallback**: Automatically tries us-east5 → us-central1 → europe-west1
+ - 📏 **Context management**: Auto-truncates messages to fit model limits
+ - 🔐 **Google ADC**: Uses Application Default Credentials (no API keys needed)
+ - 🔧 **Daemon mode**: Run as a background service with `start`/`stop`/`restart`
+ - 📝 **Logging**: Built-in log management with the `logs` command
 
- ## Installation
+ ## Quick Start
+
+ ### Option 1: NPX (Recommended)
+
+ ```bash
+ # Run directly without installation
+ npx vertex-ai-proxy
+
+ # Or start as daemon
+ npx vertex-ai-proxy start --project your-gcp-project
+ ```
+
+ ### Option 2: Global Install
 
  ```bash
  npm install -g vertex-ai-proxy
+ vertex-ai-proxy --project your-gcp-project
  ```
 
- ## Quick Start
+ ### Option 3: From Source
 
  ```bash
- # Set your Google Cloud project
- export PROJECT_ID=your-project-id
+ git clone https://github.com/anthropics/vertex-ai-proxy.git
+ cd vertex-ai-proxy
+ npm install
+ npm start
+ ```
 
- # Authenticate with Google Cloud
- gcloud auth application-default login
+ ## CLI Commands
+
+ ### Daemon Management
+
+ ```bash
+ # Start as background daemon
+ vertex-ai-proxy start
+ vertex-ai-proxy start --port 8001 --project your-project
+
+ # Stop the daemon
+ vertex-ai-proxy stop
+
+ # Restart
+ vertex-ai-proxy restart
+
+ # Check status (running, uptime, request count, health)
+ vertex-ai-proxy status
+
+ # View logs
+ vertex-ai-proxy logs          # Last 50 lines
+ vertex-ai-proxy logs -n 100   # Last 100 lines
+ vertex-ai-proxy logs -f       # Follow (tail -f style)
+ ```
+
+ ### Model Management
+
+ ```bash
+ # List all available models
+ vertex-ai-proxy models
+
+ # Show detailed model info
+ vertex-ai-proxy models info claude-opus-4-5@20251101
+
+ # Show all details including pricing
+ vertex-ai-proxy models list --all
+
+ # Check which models are enabled in your Vertex AI project
+ vertex-ai-proxy models fetch
+
+ # Enable a model in your config
+ vertex-ai-proxy models enable claude-opus-4-5@20251101
+
+ # Enable with an alias
+ vertex-ai-proxy models enable claude-opus-4-5@20251101 --alias opus
+
+ # Disable a model
+ vertex-ai-proxy models disable gemini-2.5-flash
+ ```
+
+ ### Configuration
+
+ ```bash
+ # Show current configuration
+ vertex-ai-proxy config
+
+ # Interactive configuration setup
+ vertex-ai-proxy config set
+
+ # Set default model
+ vertex-ai-proxy config set-default claude-sonnet-4-5@20250514
+
+ # Add a model alias
+ vertex-ai-proxy config add-alias fast claude-haiku-4-5@20251001
+
+ # Remove an alias
+ vertex-ai-proxy config remove-alias fast
+
+ # Set fallback chain
+ vertex-ai-proxy config set-fallback claude-opus-4-5@20251101 claude-sonnet-4-5@20250514 gemini-2.5-pro
+
+ # Export configuration for OpenClaw
+ vertex-ai-proxy config export
+ vertex-ai-proxy config export -o openclaw-snippet.json
+ ```
+
+ ### Setup & Utilities
+
+ ```bash
+ # Check Google Cloud setup (auth, ADC, project)
+ vertex-ai-proxy check
+
+ # Configure OpenClaw integration
+ vertex-ai-proxy setup-openclaw
 
- # Start the proxy
- vertex-ai-proxy
+ # Install as systemd service
+ vertex-ai-proxy install-service --user   # User service (no sudo)
+ vertex-ai-proxy install-service          # System service (requires sudo)
  ```
 
- The proxy starts on `http://localhost:8001` by default.
+ ## Prerequisites
 
- ## Usage
+ ### 1. Google Cloud Setup
 
- ### CLI Options
+ You need a GCP project with Vertex AI enabled:
 
  ```bash
- vertex-ai-proxy [options]
+ # Install Google Cloud CLI (if not already installed)
+ # macOS
+ brew install google-cloud-sdk
+
+ # Ubuntu/Debian
+ curl https://sdk.cloud.google.com | bash
 
- Options:
-   -p, --port <port>            Server port (default: 8001)
-   --host <host>                Server host (default: 0.0.0.0)
-   --project <id>               Google Cloud project ID
-   --claude-regions <regions>   Comma-separated failover regions
-   --gemini-location <loc>      Gemini location
-   --max-concurrent <n>         Max concurrent requests (default: 10)
-   --enable-logging             Enable request logging
-   --disable-cache              Disable prompt caching
-   --disable-metrics            Disable Prometheus metrics
-   -h, --help                   Show help
+ # Authenticate
+ gcloud auth login
+ gcloud auth application-default login
+
+ # Set your project
+ gcloud config set project YOUR_PROJECT_ID
+
+ # Enable Vertex AI API
+ gcloud services enable aiplatform.googleapis.com
  ```
 
+ ### 2. Claude on Vertex AI Access
+
+ Claude models require approval. Request access in the [Vertex AI Model Garden](https://console.cloud.google.com/vertex-ai/model-garden):
+
+ 1. Go to Model Garden
+ 2. Search for "Claude"
+ 3. Click "Enable" on the models you want
+ 4. Wait for approval (usually instant for Haiku/Sonnet, may take longer for Opus)
+
+ ### 3. Supported Regions
+
+ | Models | Regions |
+ |--------|---------|
+ | Claude | `us-east5`, `europe-west1` |
+ | Gemini | `us-central1`, `europe-west4` |
+ | Imagen | `us-central1` |
+
+ ## Dynamic Region Fallback
+
+ The proxy automatically handles region failures by trying regions in this order:
+
+ 1. **us-east5** (primary for Claude)
+ 2. **us-central1** (global, primary for Gemini/Imagen)
+ 3. **europe-west1** (EU fallback for Claude)
+ 4. Other model-specific regions
+
+ This means if `us-east5` is overloaded or has capacity issues, the proxy automatically retries in other available regions for that model.
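The fallback order described in the README amounts to trying each candidate region until one succeeds. A minimal sketch of that loop, as an editor's illustration only — `withRegionFallback` and the call signature are invented names, not this package's internals:

```typescript
// Try each region in order; rethrow only if every region fails.
type RegionCall<T> = (region: string) => Promise<T>;

async function withRegionFallback<T>(
  regions: string[],
  call: RegionCall<T>,
): Promise<T> {
  let lastError: unknown;
  for (const region of regions) {
    try {
      return await call(region);
    } catch (err) {
      lastError = err; // e.g. 429 / capacity error: move on to the next region
    }
  }
  throw lastError;
}
```

A caller would pass the region list from the section above, e.g. `withRegionFallback(["us-east5", "us-central1", "europe-west1"], r => sendRequest(r, body))`.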
+
+ ## Configuration
+
  ### Environment Variables
 
- | Variable | Default | Description |
- |----------|---------|-------------|
- | `PROJECT_ID` | - | Google Cloud project ID (required) |
- | `CLAUDE_REGIONS` | `us-east5,us-east1,europe-west1` | Comma-separated failover regions |
- | `GEMINI_LOCATION` | `us-east5` | Gemini location |
- | `PORT` | `8001` | Server port |
- | `MAX_CONCURRENT` | `10` | Max concurrent requests |
- | `QUEUE_SIZE` | `100` | Max queue size |
- | `MAX_RETRIES` | `3` | Max retries per request |
- | `REQUEST_TIMEOUT` | `300` | Request timeout in seconds |
- | `ENABLE_PROMPT_CACHE` | `true` | Enable Anthropic prompt caching |
- | `ENABLE_METRICS` | `true` | Enable Prometheus metrics |
- | `ENABLE_REQUEST_LOGGING` | `false` | Enable detailed request logging |
- | `HEARTBEAT_INTERVAL` | `15` | Streaming heartbeat interval (seconds) |
-
- ### With Clawdbot
-
- Add to your `clawdbot.json`:
+ ```bash
+ # Required
+ export GOOGLE_CLOUD_PROJECT="your-project-id"
+
+ # Optional (with defaults)
+ export VERTEX_PROXY_PORT="8001"
+ export VERTEX_PROXY_REGION="us-east5"           # For Claude
+ export VERTEX_PROXY_GOOGLE_REGION="us-central1" # For Gemini/Imagen
+ ```
+
+ ### Config File
+
+ Create `~/.vertex-proxy/config.yaml`:
+
+ ```yaml
+ # Google Cloud Settings
+ project_id: "your-project-id"
+ default_region: "us-east5"
+ google_region: "us-central1"
+
+ # Model Aliases (optional)
+ model_aliases:
+   my-best: "claude-opus-4-5@20251101"
+   my-fast: "claude-haiku-4-5@20251001"
+   my-cheap: "gemini-2.5-flash-lite"
+
+   # OpenAI compatibility
+   gpt-4: "claude-opus-4-5@20251101"
+   gpt-4o: "claude-sonnet-4-5@20250514"
+   gpt-4o-mini: "claude-haiku-4-5@20251001"
+
+ # Fallback Chains (optional)
+ fallback_chains:
+   claude-opus-4-5@20251101:
+     - "claude-sonnet-4-5@20250514"
+     - "gemini-2.5-pro"
+
+ # Context Management
+ auto_truncate: true
+ reserve_output_tokens: 4096
+ ```
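The `auto_truncate` / `reserve_output_tokens` pair implies a simple token budget: keep system messages, drop the oldest other messages until the prompt fits. A sketch of that policy — hypothetical: the function name and the rough 4-characters-per-token estimate are ours, not necessarily what the package does:

```typescript
interface Msg { role: "system" | "user" | "assistant"; content: string }

// Very rough token estimate: ~4 characters per token.
const estimateTokens = (m: Msg): number => Math.ceil(m.content.length / 4);

function truncateToFit(messages: Msg[], contextWindow: number, reserveOutput = 4096): Msg[] {
  const budget = contextWindow - reserveOutput;
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  let total = [...system, ...rest].reduce((n, m) => n + estimateTokens(m), 0);
  // Drop oldest non-system messages until the input fits the budget.
  while (rest.length > 1 && total > budget) {
    total -= estimateTokens(rest.shift()!);
  }
  return [...system, ...rest];
}
```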
+
+ ### Data Files
+
+ The proxy stores runtime data in `~/.vertex_proxy/`:
+
+ - `proxy.log` - Request/error logs
+ - `proxy.pid` - Daemon PID file
+ - `stats.json` - Runtime statistics (uptime, request count)
+
+ ## Clawdbot Integration
+
+ ### Setting Up a Fake Auth Profile
+
+ Clawdbot normally uses Anthropic's API directly, but you can route it through the Vertex AI Proxy by setting up a "fake" auth profile. This lets you use your Google Cloud credits and take advantage of Vertex AI's infrastructure.
+
+ #### Step 1: Start the Proxy
+
+ ```bash
+ # Start the proxy daemon
+ vertex-ai-proxy start --project YOUR_GCP_PROJECT
+
+ # Verify it's running
+ vertex-ai-proxy status
+ ```
+
+ #### Step 2: Configure Clawdbot
+
+ Add to your Clawdbot config (`~/.clawdbot/clawdbot.json` or equivalent):
 
  ```json
  {
    "models": {
+     "mode": "merge",
      "providers": {
        "vertex": {
          "baseUrl": "http://localhost:8001/v1",
-         "apiKey": "dummy",
-         "api": "openai-completions",
+         "apiKey": "vertex-proxy-fake-key",
+         "api": "anthropic-messages",
          "models": [
            {
-             "id": "opus",
+             "id": "claude-opus-4-5@20251101",
              "name": "Claude Opus 4.5 (Vertex)",
+             "input": ["text", "image"],
              "contextWindow": 200000,
-             "maxTokens": 16384
+             "maxTokens": 8192
+           },
+           {
+             "id": "claude-sonnet-4-5@20250514",
+             "name": "Claude Sonnet 4.5 (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 200000,
+             "maxTokens": 8192
+           },
+           {
+             "id": "claude-haiku-4-5@20251001",
+             "name": "Claude Haiku 4.5 (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 200000,
+             "maxTokens": 8192
            }
          ]
        }
      }
+   },
+   "agents": {
+     "defaults": {
+       "model": {
+         "primary": "vertex/claude-sonnet-4-5@20250514"
+       }
+     }
    }
  }
  ```
 
- ### Programmatic Usage
+ #### Step 3: Using Model Aliases
+
+ You can use the built-in aliases for convenience:
+
+ ```json
+ {
+   "agents": {
+     "defaults": {
+       "model": {
+         "primary": "vertex/sonnet"
+       }
+     },
+     "my-agent": {
+       "model": {
+         "primary": "vertex/opus"
+       }
+     }
+   }
+ }
+ ```
 
- ```typescript
- import { createServer, startServer } from 'vertex-ai-proxy';
+ The proxy automatically maps:
+ - `opus` → `claude-opus-4-5@20251101`
+ - `sonnet` → `claude-sonnet-4-5@20250514`
+ - `haiku` → `claude-haiku-4-5@20251001`
+ - `gpt-4` → `claude-opus-4-5@20251101`
+ - `gpt-4o` → `claude-sonnet-4-5@20250514`
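That mapping is just a lookup table applied before a request is forwarded. A sketch mirroring the list above (the table contents come from the README; the function name is ours):

```typescript
// Alias table mirroring the list above.
const MODEL_ALIASES: Record<string, string> = {
  "opus": "claude-opus-4-5@20251101",
  "sonnet": "claude-sonnet-4-5@20250514",
  "haiku": "claude-haiku-4-5@20251001",
  "gpt-4": "claude-opus-4-5@20251101",
  "gpt-4o": "claude-sonnet-4-5@20250514",
};

// Unknown names pass through unchanged (they may already be full model IDs).
function resolveModel(name: string): string {
  return MODEL_ALIASES[name] ?? name;
}
```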
 
- // Option 1: Start with defaults
- startServer({ projectId: 'my-project' });
+ #### Why Use Vertex AI Proxy with Clawdbot?
 
- // Option 2: Get Express app for custom middleware
- const { app, config } = createServer({
-   projectId: 'my-project',
-   claudeRegions: ['us-east5', 'us-central1'],
-   maxConcurrent: 20,
- });
- app.listen(8080);
+ 1. **Cost management**: Use Google Cloud credits and billing
+ 2. **Enterprise features**: VPC Service Controls, audit logging
+ 3. **Region control**: Run in specific regions for compliance
+ 4. **Automatic failover**: Built-in region fallback for reliability
+ 5. **No separate API key**: Uses your existing GCP authentication
+
+ ## OpenClaw Integration
+
+ ### Quick Setup
+
+ Run the setup script to automatically configure OpenClaw:
+
+ ```bash
+ # After installing vertex-ai-proxy
+ npx vertex-ai-proxy setup-openclaw
+ ```
+
+ ### Manual Configuration
+
+ Add to your `~/.openclaw/openclaw.json`:
+
+ ```json
+ {
+   "env": {
+     "GOOGLE_CLOUD_PROJECT": "your-project-id",
+     "GOOGLE_CLOUD_LOCATION": "us-east5"
+   },
+   "agents": {
+     "defaults": {
+       "model": {
+         "primary": "vertex/claude-opus-4-5@20251101"
+       },
+       "models": {
+         "vertex/claude-opus-4-5@20251101": { "alias": "opus" },
+         "vertex/claude-sonnet-4-5@20250514": { "alias": "sonnet" },
+         "vertex/claude-haiku-4-5@20251001": { "alias": "haiku" }
+       }
+     }
+   },
+   "models": {
+     "mode": "merge",
+     "providers": {
+       "vertex": {
+         "baseUrl": "http://localhost:8001/v1",
+         "apiKey": "vertex-proxy",
+         "api": "anthropic-messages",
+         "models": [
+           {
+             "id": "claude-opus-4-5@20251101",
+             "name": "Claude Opus 4.5 (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 200000,
+             "maxTokens": 8192
+           },
+           {
+             "id": "claude-sonnet-4-5@20250514",
+             "name": "Claude Sonnet 4.5 (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 200000,
+             "maxTokens": 8192
+           },
+           {
+             "id": "claude-haiku-4-5@20251001",
+             "name": "Claude Haiku 4.5 (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 200000,
+             "maxTokens": 8192
+           },
+           {
+             "id": "gemini-3-pro",
+             "name": "Gemini 3 Pro (Vertex)",
+             "input": ["text", "image", "audio", "video"],
+             "contextWindow": 1000000,
+             "maxTokens": 8192
+           },
+           {
+             "id": "gemini-2.5-pro",
+             "name": "Gemini 2.5 Pro (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 1000000,
+             "maxTokens": 8192
+           },
+           {
+             "id": "gemini-2.5-flash",
+             "name": "Gemini 2.5 Flash (Vertex)",
+             "input": ["text", "image"],
+             "contextWindow": 1000000,
+             "maxTokens": 8192
+           }
+         ]
+       }
+     }
+   }
+ }
+ ```
+
+ ### Start the Proxy as a Service
+
+ ```bash
+ # Install and enable as systemd service
+ sudo npx vertex-ai-proxy install-service
+
+ # Or use the daemon commands
+ vertex-ai-proxy start
+ openclaw gateway restart
  ```
 
  ## API Endpoints
 
  | Endpoint | Description |
  |----------|-------------|
- | `GET /` | Health check |
- | `GET /health` | Health check with config details |
- | `GET /metrics` | Prometheus metrics |
- | `POST /v1/chat/completions` | OpenAI-compatible chat API |
+ | `GET /` | Health check and server info |
+ | `GET /health` | Simple health check with stats |
+ | `GET /v1/models` | List available models |
+ | `POST /v1/chat/completions` | OpenAI-compatible chat (recommended) |
+ | `POST /v1/messages` | Anthropic Messages API |
+ | `POST /v1/images/generations` | Image generation (Imagen) |
+
+ ### Example Requests
+
+ **Chat Completion (OpenAI format):**
+ ```bash
+ curl http://localhost:8001/v1/chat/completions \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "claude-opus-4-5@20251101",
+     "messages": [{"role": "user", "content": "Hello!"}],
+     "stream": true
+   }'
+ ```
+
+ **Chat Completion (Anthropic format):**
+ ```bash
+ curl http://localhost:8001/v1/messages \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "claude-opus-4-5@20251101",
+     "max_tokens": 1024,
+     "messages": [{"role": "user", "content": "Hello!"}]
+   }'
+ ```
+
+ **Image Generation:**
+ ```bash
+ curl http://localhost:8001/v1/images/generations \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "imagen-4.0-generate-001",
+     "prompt": "A cute robot learning to paint",
+     "n": 1,
+     "size": "1024x1024"
+   }'
+ ```
+
+ ## Available Models
+
+ ### Claude Models (Anthropic on Vertex)
+
+ | Model | ID | Context | Price (per 1M tokens) |
+ |-------|----|---------|-----------------------|
+ | Opus 4.5 | `claude-opus-4-5@20251101` | 200K | $15 / $75 |
+ | Sonnet 4.5 | `claude-sonnet-4-5@20250514` | 200K | $3 / $15 |
+ | Haiku 4.5 | `claude-haiku-4-5@20251001` | 200K | $0.25 / $1.25 |
 
- ## Model Aliases
+ ### Gemini Models
 
- ### Claude
- | Alias | Model |
- |-------|-------|
- | `opus` | claude-opus-4-5@20251101 |
- | `sonnet` | claude-sonnet-4-5@20250929 |
- | `haiku` | claude-haiku-3-5@20241022 |
+ | Model | ID | Context | Price (per 1M tokens) | Best For |
+ |-------|----|---------|-----------------------|----------|
+ | Gemini 3 Pro | `gemini-3-pro` | 1M | $2.50 / $15 | Latest & greatest |
+ | Gemini 2.5 Pro | `gemini-2.5-pro` | 1M | $1.25 / $5 | Complex reasoning |
+ | Gemini 2.5 Flash | `gemini-2.5-flash` | 1M | $0.15 / $0.60 | Fast responses |
+ | Gemini 2.5 Flash Lite | `gemini-2.5-flash-lite` | 1M | $0.075 / $0.30 | Budget-friendly |
 
- ### Gemini
- | Alias | Model |
- |-------|-------|
- | `gemini-3-pro` | gemini-3-pro-preview |
- | `gemini-2.5-pro` | gemini-2.5-pro |
- | `gemini-2.0-flash` | gemini-2.0-flash |
+ ### Imagen Models (Image Generation)
 
- ## Region Failover
+ | Model | ID | Description | Price |
+ |-------|-----|-------------|-------|
+ | Imagen 4 | `imagen-4.0-generate-001` | Best quality | ~$0.04/image |
+ | Imagen 4 Fast | `imagen-4.0-fast-generate-001` | Lower latency | ~$0.02/image |
+ | Imagen 4 Ultra | `imagen-4.0-ultra-generate-001` | Highest quality | ~$0.08/image |
 
- When a region returns 429 (rate limited), the proxy automatically tries the next region:
+ ## Troubleshooting
 
+ ### "Requested entity was not found"
+
+ 1. Check your project ID is correct
+ 2. Ensure Claude is enabled in Model Garden
+ 3. Verify you're using a supported region (`us-east5` or `europe-west1` for Claude)
+
+ ### "Permission denied"
+
+ ```bash
+ # Re-authenticate
+ gcloud auth application-default login
+
+ # Check current credentials
+ gcloud auth application-default print-access-token
+ ```
+
+ ### "Model not found" in OpenClaw/Clawdbot
+
+ Ensure the model is defined in `models.providers.vertex.models[]` in your config.
+
+ ### Streaming not working
+
+ Check that your client supports SSE (Server-Sent Events). The proxy sends:
  ```
- us-east5 (primary) → us-east1 → europe-west1
+ data: {"choices":[{"delta":{"content":"Hello"}}]}
+
+ data: [DONE]
  ```
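A client consuming this stream just scans for `data: ` lines and stops at the `[DONE]` sentinel. A minimal parser sketch, as an editor's illustration (the function name is ours; the payload shape matches the OpenAI-format example above):

```typescript
// Extract content deltas from a raw SSE chunk in the OpenAI streaming format.
function parseSSEChunk(chunk: string): string[] {
  const deltas: string[] = [];
  for (const line of chunk.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const parsed = JSON.parse(payload);
    const content = parsed.choices?.[0]?.delta?.content;
    if (typeof content === "string") deltas.push(content);
  }
  return deltas;
}
```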
 
- Healthy regions are prioritized based on recent success.
+ ### Check proxy logs
+
+ ```bash
+ # View recent logs
+ vertex-ai-proxy logs
+
+ # Follow logs in real-time
+ vertex-ai-proxy logs -f
+ ```
 
- ## Metrics
+ ## Development
 
- Prometheus metrics available at `/metrics`:
+ ```bash
+ # Clone and install
+ git clone https://github.com/anthropics/vertex-ai-proxy.git
+ cd vertex-ai-proxy
+ npm install
 
- - `vertex_proxy_requests_total{model,status}` - Total requests
- - `vertex_proxy_request_duration_seconds` - Request latency
- - `vertex_proxy_retries_total{model,region}` - Retry count
- - `vertex_proxy_region_failures_total{region}` - Region failures
- - `vertex_proxy_cache_hits_total` - Prompt cache hits
+ # Run in development mode
+ npm run dev
 
- ## Requirements
+ # Run tests
+ npm test
 
- - Node.js 18+
- - Google Cloud authentication (ADC or service account)
- - Vertex AI API enabled
+ # Build
+ npm run build
+ ```
 
  ## License
 
- MIT
+ MIT License - see [LICENSE](LICENSE) for details.
 
  ## Contributing
 
- PRs welcome! Please open an issue first to discuss changes.
+ Contributions welcome! Please read [CONTRIBUTING.md](CONTRIBUTING.md) first.
+
+ ## Related Projects
+
+ - [OpenClaw](https://github.com/openclaw/openclaw) - Personal AI assistant
+ - [Clawdbot](https://github.com/clawdbot/clawdbot) - Discord/multi-platform AI bot
+ - [Anthropic Vertex SDK](https://github.com/anthropics/anthropic-sdk-python) - Official Python SDK
+ - [Google Vertex AI](https://cloud.google.com/vertex-ai) - Google's AI platform