@intentsolutionsio/jeremy-vertex-engine 2.1.0
- package/.claude-plugin/plugin.json +20 -0
- package/LICENSE +21 -0
- package/README.md +782 -0
- package/agents/vertex-engine-inspector.md +446 -0
- package/package.json +41 -0
- package/skills/vertex-engine-inspector/SKILL.md +84 -0
- package/skills/vertex-engine-inspector/references/ARD.md +74 -0
- package/skills/vertex-engine-inspector/references/PRD.md +69 -0
- package/skills/vertex-engine-inspector/references/errors.md +96 -0
- package/skills/vertex-engine-inspector/references/example-inspection-report.md +50 -0
- package/skills/vertex-engine-inspector/references/examples.md +591 -0
- package/skills/vertex-engine-inspector/references/inspection-categories.md +104 -0
- package/skills/vertex-engine-inspector/references/inspection-workflow.md +52 -0
- package/skills/vertex-engine-inspector/scripts/check-security.py +254 -0
- package/skills/vertex-engine-inspector/scripts/inspect-agent.sh +194 -0
package/README.md
ADDED
|
@@ -0,0 +1,782 @@
|
|
|
1
|
+
# Jeremy Vertex Engine
|
|
2
|
+
|
|
3
|
+
**🎯 VERTEX AI AGENT ENGINE DEPLOYMENT ONLY**
|
|
4
|
+
|
|
5
|
+
Expert inspector and orchestrator for **Vertex AI Agent Engine** - Google Cloud's fully-managed, serverless agent runtime platform.
|
|
6
|
+
|
|
7
|
+
## ⚠️ Important: What This Plugin Is For
|
|
8
|
+
|
|
9
|
+
**✅ THIS PLUGIN IS FOR:**
|
|
10
|
+
- **Vertex AI Agent Engine** deployments (fully-managed runtime)
|
|
11
|
+
- **ADK (Agent Development Kit)** agents deployed to Agent Engine
|
|
12
|
+
- **Reasoning Engine API** resources (`google_vertex_ai_reasoning_engine`)
|
|
13
|
+
- Agent Engine features: Memory Bank, Code Execution Sandbox, Sessions, A2A Protocol
|
|
14
|
+
|
|
15
|
+
**❌ THIS PLUGIN IS NOT FOR:**
|
|
16
|
+
- Cloud Run deployments (use `jeremy-genkit-terraform` or `jeremy-adk-terraform` with `--cloud-run` flag)
|
|
17
|
+
- LangChain/LlamaIndex on other platforms
|
|
18
|
+
- Self-hosted agent infrastructure
|
|
19
|
+
- Cloud Functions or other serverless platforms
|
|
20
|
+
|
|
21
|
+
## Overview
|
|
22
|
+
|
|
23
|
+
This plugin provides comprehensive inspection and validation capabilities for agents deployed to the **Vertex AI Agent Engine managed runtime**. It acts as a quality assurance layer ensuring agents are properly configured, secure, performant, and production-ready on Google's fully-managed agent infrastructure.
|
|
24
|
+
|
|
25
|
+
## Installation
|
|
26
|
+
|
|
27
|
+
```bash
|
|
28
|
+
/plugin install jeremy-vertex-engine@claude-code-plugins-plus
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
## Prerequisites & Dependencies
|
|
32
|
+
|
|
33
|
+
### Required Google Cloud Setup
|
|
34
|
+
|
|
35
|
+
**1. Google Cloud Project with APIs Enabled:**
|
|
36
|
+
```bash
|
|
37
|
+
# Enable required APIs
|
|
38
|
+
gcloud services enable aiplatform.googleapis.com \
|
|
39
|
+
discoveryengine.googleapis.com \
|
|
40
|
+
logging.googleapis.com \
|
|
41
|
+
monitoring.googleapis.com \
|
|
42
|
+
cloudtrace.googleapis.com \
|
|
43
|
+
--project=YOUR_PROJECT_ID
|
|
44
|
+
```
|
|
45
|
+
|
|
46
|
+
**2. Authentication:**
|
|
47
|
+
```bash
|
|
48
|
+
# Application Default Credentials
|
|
49
|
+
gcloud auth application-default login
|
|
50
|
+
|
|
51
|
+
# Or use service account
|
|
52
|
+
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
**3. Required IAM Permissions:**
|
|
56
|
+
```yaml
|
|
57
|
+
# Minimum required roles for inspection:
|
|
58
|
+
- roles/aiplatform.user # Query Agent Engine resources
|
|
59
|
+
- roles/discoveryengine.viewer # View agent configurations
|
|
60
|
+
- roles/logging.viewer # Read agent logs
|
|
61
|
+
- roles/monitoring.viewer # Access metrics
|
|
62
|
+
- roles/cloudtrace.user # View trace data
|
|
63
|
+
```
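
Before running an inspection, it can help to confirm that the caller holds the roles listed above. The following is a minimal sketch, not part of the plugin: `REQUIRED_ROLES` is copied from the list above, while `missing_roles` and the `granted` input are hypothetical names chosen for illustration. It compares role names literally, so a broader role such as `roles/owner` that implies the same permissions would still be reported as missing.

```python
# Roles copied from the list above. `granted` is a hypothetical input,
# for example role names the caller has already parsed from an IAM policy.
REQUIRED_ROLES = frozenset({
    "roles/aiplatform.user",
    "roles/discoveryengine.viewer",
    "roles/logging.viewer",
    "roles/monitoring.viewer",
    "roles/cloudtrace.user",
})


def missing_roles(granted):
    """Return the required roles absent from `granted`, sorted for stable output."""
    return sorted(REQUIRED_ROLES - set(granted))


for role in missing_roles(["roles/aiplatform.user", "roles/logging.viewer"]):
    print(f"missing: {role}")
```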
|
|
64
|
+
|
|
65
|
+
### Required Python Packages
|
|
66
|
+
|
|
67
|
+
**Install via pip:**
|
|
68
|
+
```bash
|
|
69
|
+
# Core Vertex AI SDK (with Agent Engine support)
|
|
70
|
+
pip install 'google-cloud-aiplatform[agent_engines]>=1.120.0'
|
|
71
|
+
|
|
72
|
+
# ADK SDK (if building ADK agents)
|
|
73
|
+
pip install 'google-adk>=1.15.1'
|
|
74
|
+
|
|
75
|
+
# Observability & Monitoring
|
|
76
|
+
pip install 'google-cloud-logging>=3.10.0'
|
|
77
|
+
pip install 'google-cloud-monitoring>=2.21.0'
|
|
78
|
+
pip install 'google-cloud-trace>=1.13.0'
|
|
79
|
+
|
|
80
|
+
# Optional: A2A Protocol SDK
|
|
81
|
+
pip install 'a2a-sdk>=0.3.4'
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
**All dependencies at once:**
|
|
85
|
+
```bash
|
|
86
|
+
pip install --upgrade \
|
|
87
|
+
'google-cloud-aiplatform[agent_engines]>=1.120.0' \
|
|
88
|
+
'google-adk>=1.15.1' \
|
|
89
|
+
'google-cloud-logging>=3.10.0' \
|
|
90
|
+
'google-cloud-monitoring>=2.21.0' \
|
|
91
|
+
'google-cloud-trace>=1.13.0' \
|
|
92
|
+
'a2a-sdk>=0.3.4'
|
|
93
|
+
```
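
To confirm that an environment meets these minimums, a small check along the following lines may be useful. This is a sketch under stated assumptions: the minimum versions are copied from the commands above, `parse_version`, `installed_version`, and `check` are hypothetical helper names, and the comparison only looks at the leading numeric dotted parts of a version string. For full PEP 440 semantics, a dedicated library such as `packaging` would be the safer choice.

```python
from importlib import metadata

# Minimum versions copied from the pip commands above.
MINIMUMS = {
    "google-cloud-aiplatform": "1.120.0",
    "google-adk": "1.15.1",
    "google-cloud-logging": "3.10.0",
    "google-cloud-monitoring": "2.21.0",
    "google-cloud-trace": "1.13.0",
    "a2a-sdk": "0.3.4",
}


def parse_version(text):
    """Turn '1.120.0' into (1, 120, 0); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check(minimums=MINIMUMS, lookup=installed_version):
    """Map each package to 'ok', 'missing', or 'outdated (found X)'."""
    report = {}
    for name, minimum in minimums.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        elif parse_version(found) < parse_version(minimum):
            report[name] = f"outdated (found {found})"
        else:
            report[name] = "ok"
    return report


for package, state in check().items():
    print(f"{package}: {state}")
```

The `lookup` parameter exists so the check can be exercised against a fake package table without touching the real environment.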
|
|
94
|
+
|
|
95
|
+
### Required gcloud CLI Tools
|
|
96
|
+
|
|
97
|
+
The `gcloud` CLI is used for IAM policy queries, Cloud Monitoring, and Cloud Logging -- **not** for Agent Engine CRUD operations. There is no `gcloud ai agents`, `gcloud ai reasoning-engines`, or `gcloud alpha ai agent-engines` CLI surface. All Agent Engine operations use the Python SDK.
|
|
98
|
+
|
|
99
|
+
**Install gcloud CLI:**
|
|
100
|
+
```bash
|
|
101
|
+
# Install gcloud (if not already installed)
|
|
102
|
+
curl https://sdk.cloud.google.com | bash
|
|
103
|
+
exec -l $SHELL
|
|
104
|
+
|
|
105
|
+
# Update to latest version
|
|
106
|
+
gcloud components update
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
**Verify Installation:**
|
|
110
|
+
```bash
|
|
111
|
+
gcloud --version
|
|
112
|
+
# Should show: Google Cloud SDK 450.0.0 or higher
|
|
113
|
+
|
|
114
|
+
# Test Agent Engine access via Python SDK
|
|
115
|
+
python3 -c "
|
|
116
|
+
import vertexai
|
|
117
|
+
client = vertexai.Client(project='YOUR_PROJECT_ID', location='us-central1')
|
|
118
|
+
for engine in client.agent_engines.list():
|
|
119
|
+
    print(engine.name, engine.display_name)
|
|
120
|
+
"
|
|
121
|
+
```
|
|
122
|
+
|
|
123
|
+
### Vertex AI Agent Engine Requirements
|
|
124
|
+
|
|
125
|
+
**This plugin works with agents deployed via:**
|
|
126
|
+
|
|
127
|
+
1. **ADK Deployment to Agent Engine:**
|
|
128
|
+
```python
|
|
129
|
+
import vertexai
|
|
130
|
+
from google.adk.agents import Agent
|
|
131
|
+
|
|
132
|
+
client = vertexai.Client(project=PROJECT_ID, location=LOCATION)
|
|
133
|
+
|
|
134
|
+
# Define an ADK agent
|
|
135
|
+
agent = Agent(name="my-adk-agent", model="gemini-2.5-flash")
|
|
136
|
+
|
|
137
|
+
# Deploy ADK agent to Agent Engine
|
|
138
|
+
agent_engine = client.agent_engines.create(
|
|
139
|
+
agent=agent,
|
|
140
|
+
config={"display_name": "my-adk-agent"},
|
|
141
|
+
)
|
|
142
|
+
```
|
|
143
|
+
|
|
144
|
+
2. **Terraform Deployment:**
|
|
145
|
+
```hcl
|
|
146
|
+
resource "google_vertex_ai_reasoning_engine" "agent" {
|
|
147
|
+
display_name = "my-agent"
|
|
148
|
+
region = "us-central1"
|
|
149
|
+
|
|
150
|
+
spec {
|
|
151
|
+
agent_framework = "google-adk" # ← ADK agents
|
|
152
|
+
# OR omit for custom agents
|
|
153
|
+
|
|
154
|
+
package_spec {
|
|
155
|
+
pickle_object_gcs_uri = "gs://bucket/agent.pkl"
|
|
156
|
+
python_version = "3.12"
|
|
157
|
+
requirements_gcs_uri = "gs://bucket/requirements.txt"
|
|
158
|
+
}
|
|
159
|
+
}
|
|
160
|
+
}
|
|
161
|
+
```
|
|
162
|
+
|
|
163
|
+
3. **Direct SDK Deployment:**
|
|
164
|
+
```python
|
|
165
|
+
# Custom agent template (NOT LangChain)
|
|
166
|
+
from vertexai.preview.reasoning_engines import ReasoningEngine
|
|
167
|
+
|
|
168
|
+
agent = ReasoningEngine.create(
|
|
169
|
+
my_agent_instance,
|
|
170
|
+
requirements=["google-cloud-aiplatform[agent_engines]>=1.120.0"],
|
|
171
|
+
display_name="custom-agent"
|
|
172
|
+
)
|
|
173
|
+
```
|
|
174
|
+
|
|
175
|
+
### ❌ NOT Compatible With
|
|
176
|
+
|
|
177
|
+
- **Cloud Run deployments** (different runtime, use Cloud Run monitoring tools)
|
|
178
|
+
- **LangChain on non-Agent Engine platforms** (use LangSmith/LangFuse)
|
|
179
|
+
- **LlamaIndex custom servers** (not Agent Engine)
|
|
180
|
+
- **Self-hosted agent infrastructure** (requires custom monitoring)
|
|
181
|
+
|
|
182
|
+
## Features
|
|
183
|
+
|
|
184
|
+
✅ **Runtime Configuration Inspection**: Validate model, tools, VPC settings
|
|
185
|
+
✅ **Code Execution Sandbox Validation**: Check security, state persistence, IAM
|
|
186
|
+
✅ **Memory Bank Configuration**: Verify retention, indexing, query performance
|
|
187
|
+
✅ **A2A Protocol Compliance**: Ensure AgentCard and API endpoints are functional
|
|
188
|
+
✅ **Security Audits**: IAM, VPC-SC, encryption, Model Armor checks
|
|
189
|
+
✅ **Performance Monitoring**: Latency, error rates, token usage, costs
|
|
190
|
+
✅ **Production Readiness Scoring**: Comprehensive 28-point checklist
|
|
191
|
+
✅ **Health Monitoring**: Real-time metrics and alerting
|
|
192
|
+
|
|
193
|
+
## Components
|
|
194
|
+
|
|
195
|
+
### Agent
|
|
196
|
+
- **vertex-engine-inspector**: Comprehensive agent inspector with validation logic
|
|
197
|
+
|
|
198
|
+
### Skills (Auto-Activating)
|
|
199
|
+
- **vertex-engine-inspector**: Triggers on "inspect agent engine", "validate deployment"
|
|
200
|
+
- **Tool Permissions**: Read, Grep, Glob, Bash (read-only)
|
|
201
|
+
- **Version**: 2.1.0 (2026 schema compliant)
|
|
202
|
+
|
|
203
|
+
## Quick Start
|
|
204
|
+
|
|
205
|
+
### Natural Language Activation
|
|
206
|
+
|
|
207
|
+
Simply mention what you need:
|
|
208
|
+
|
|
209
|
+
```
|
|
210
|
+
"Inspect my Vertex AI Engine agent deployment"
|
|
211
|
+
"Validate the Code Execution Sandbox configuration"
|
|
212
|
+
"Check Memory Bank settings for my agent"
|
|
213
|
+
"Monitor agent health over the last 24 hours"
|
|
214
|
+
"Production readiness check for agent-id-123"
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
The skill auto-activates and performs comprehensive inspection.
|
|
218
|
+
|
|
219
|
+
### What Gets Inspected
|
|
220
|
+
|
|
221
|
+
1. **Runtime Configuration**
|
|
222
|
+
- Model selection and settings
|
|
223
|
+
- Enabled tools (Code Execution, Memory Bank)
|
|
224
|
+
- VPC and networking configuration
|
|
225
|
+
- Resource allocation and scaling
|
|
226
|
+
|
|
227
|
+
2. **Code Execution Sandbox**
|
|
228
|
+
- Security isolation validation
|
|
229
|
+
- State persistence TTL (1-14 days)
|
|
230
|
+
- IAM least privilege verification
|
|
231
|
+
- Performance settings
|
|
232
|
+
|
|
233
|
+
3. **Memory Bank**
|
|
234
|
+
- Persistent memory configuration
|
|
235
|
+
- Retention policies
|
|
236
|
+
- Query performance (indexing, caching)
|
|
237
|
+
- Storage backend validation
|
|
238
|
+
|
|
239
|
+
4. **A2A Protocol**
|
|
240
|
+
- AgentCard availability and structure
|
|
241
|
+
- Task API functionality
|
|
242
|
+
- Status API accessibility
|
|
243
|
+
- Protocol version compliance
|
|
244
|
+
|
|
245
|
+
5. **Security Posture**
|
|
246
|
+
- IAM roles and permissions
|
|
247
|
+
- VPC Service Controls
|
|
248
|
+
- Model Armor (prompt injection protection)
|
|
249
|
+
- Encryption at rest and in transit
|
|
250
|
+
|
|
251
|
+
6. **Performance Metrics**
|
|
252
|
+
- Error rates and latency
|
|
253
|
+
- Token usage and costs
|
|
254
|
+
- Throughput and scaling
|
|
255
|
+
- SLO compliance
|
|
256
|
+
|
|
257
|
+
7. **Production Readiness**
|
|
258
|
+
- 28-point comprehensive checklist
|
|
259
|
+
- Weighted scoring across 5 categories
|
|
260
|
+
- Overall readiness status
|
|
261
|
+
- Actionable recommendations
|
|
262
|
+
|
|
263
|
+
## Production Readiness Score
|
|
264
|
+
|
|
265
|
+
The plugin generates a production readiness score based on:
|
|
266
|
+
|
|
267
|
+
- **Security** (30%): 6 checks
|
|
268
|
+
- **Performance** (25%): 6 checks
|
|
269
|
+
- **Monitoring** (20%): 6 checks
|
|
270
|
+
- **Compliance** (15%): 5 checks
|
|
271
|
+
- **Reliability** (10%): 5 checks
|
|
272
|
+
|
|
273
|
+
### Status Levels
|
|
274
|
+
|
|
275
|
+
🟢 **PRODUCTION READY (85-100%)**: Safe to deploy
|
|
276
|
+
🟡 **NEEDS IMPROVEMENT (70-84%)**: Address issues first
|
|
277
|
+
🔴 **NOT READY (<70%)**: Critical failures present
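
The README does not spell out how the weights and check counts combine, so the following is only one plausible reading, sketched for illustration: each category contributes its weight multiplied by the fraction of its checks that passed. The names `CATEGORIES`, `readiness_score`, and `status` are hypothetical, and scores between 84 and 85 are treated as "needs improvement" because the ready band starts at 85.

```python
# Weights and check counts copied from the list above.
CATEGORIES = {
    "security": (0.30, 6),
    "performance": (0.25, 6),
    "monitoring": (0.20, 6),
    "compliance": (0.15, 5),
    "reliability": (0.10, 5),
}


def readiness_score(passed):
    """Return a 0-100 score from a {category: checks_passed} mapping."""
    total = 0.0
    for name, (weight, checks) in CATEGORIES.items():
        count = passed.get(name, 0)
        if not 0 <= count <= checks:
            raise ValueError(f"{name}: expected 0..{checks} passed checks, got {count}")
        # Assumed formula: weight times the fraction of checks passed.
        total += weight * count / checks
    return round(total * 100, 1)


def status(score):
    """Map a score to the status levels listed above."""
    if score >= 85:
        return "PRODUCTION READY"
    if score >= 70:
        return "NEEDS IMPROVEMENT"
    return "NOT READY"


example = {"security": 5, "performance": 6, "monitoring": 6,
           "compliance": 4, "reliability": 3}
print(readiness_score(example), status(readiness_score(example)))
```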
|
|
278
|
+
|
|
279
|
+
## Integration with Other Plugins
|
|
280
|
+
|
|
281
|
+
### jeremy-adk-orchestrator
|
|
282
|
+
- Orchestrator deploys → Inspector validates
|
|
283
|
+
- Continuous feedback loop
|
|
284
|
+
|
|
285
|
+
### jeremy-vertex-validator
|
|
286
|
+
- Validator checks code → Inspector checks runtime
|
|
287
|
+
- Pre/post deployment validation
|
|
288
|
+
|
|
289
|
+
### jeremy-adk-terraform
|
|
290
|
+
- Terraform provisions → Inspector validates
|
|
291
|
+
- Infrastructure verification
|
|
292
|
+
|
|
293
|
+
## Use Cases
|
|
294
|
+
|
|
295
|
+
### Pre-Production Validation
|
|
296
|
+
Before deploying to production:
|
|
297
|
+
```
|
|
298
|
+
"Run production readiness check on staging agent"
|
|
299
|
+
```
|
|
300
|
+
|
|
301
|
+
### Post-Deployment Verification
|
|
302
|
+
After deployment:
|
|
303
|
+
```
|
|
304
|
+
"Validate agent-xyz deployment was successful"
|
|
305
|
+
```
|
|
306
|
+
|
|
307
|
+
### Ongoing Health Monitoring
|
|
308
|
+
Regular health checks:
|
|
309
|
+
```
|
|
310
|
+
"Monitor agent health for the last 7 days"
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
### Security Audits
|
|
314
|
+
Compliance validation:
|
|
315
|
+
```
|
|
316
|
+
"Perform security audit on production agents"
|
|
317
|
+
```
|
|
318
|
+
|
|
319
|
+
### Troubleshooting
|
|
320
|
+
When issues occur:
|
|
321
|
+
```
|
|
322
|
+
"Why is my agent responding slowly?"
|
|
323
|
+
"Investigate high error rate on agent-abc"
|
|
324
|
+
```
|
|
325
|
+
|
|
326
|
+
## Example Inspection Report
|
|
327
|
+
|
|
328
|
+
```
|
|
329
|
+
Agent: gcp-deployer-agent
|
|
330
|
+
Status: 🟢 PRODUCTION READY (87%)
|
|
331
|
+
|
|
332
|
+
✅ Code Execution: Enabled (TTL: 14 days)
|
|
333
|
+
✅ Memory Bank: Enabled (retention: 90 days)
|
|
334
|
+
✅ A2A Protocol: Fully compliant
|
|
335
|
+
✅ Security: 92% score
|
|
336
|
+
✅ Performance: Error rate 2.3%, Latency 1.8s (p95)
|
|
337
|
+
|
|
338
|
+
⚠️ Recommendations:
|
|
339
|
+
1. Enable multi-region deployment
|
|
340
|
+
2. Configure automated backups
|
|
341
|
+
3. Add circuit breaker pattern
|
|
342
|
+
```
|
|
343
|
+
|
|
344
|
+
## Observability & Monitoring
|
|
345
|
+
|
|
346
|
+
### Agent Engine Observability Dashboard
|
|
347
|
+
|
|
348
|
+
**New in 2025**: Vertex AI Agent Engine provides a built-in observability dashboard for monitoring agent performance.
|
|
349
|
+
|
|
350
|
+
**Access the Dashboard:**
|
|
351
|
+
```bash
|
|
352
|
+
# Navigate to Cloud Console
|
|
353
|
+
https://console.cloud.google.com/vertex-ai/agent-engines/[AGENT_ENGINE_ID]/observability?project=[PROJECT_ID]
|
|
354
|
+
```
|
|
355
|
+
|
|
356
|
+
**Key Metrics Available:**
|
|
357
|
+
- **Request Volume**: Total queries processed over time
|
|
358
|
+
- **Latency Distribution**: p50, p90, p95, p99 response times
|
|
359
|
+
- **Error Rates**: Failed requests, timeout errors, model errors
|
|
360
|
+
- **Token Usage**: Input/output tokens, cost estimation
|
|
361
|
+
- **Memory Bank Operations**: Query latency, cache hit rate
|
|
362
|
+
- **Code Execution Stats**: Sandbox invocations, execution time
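
The dashboard computes these figures itself. To make the percentile labels concrete, the sketch below derives p50, p90, p95, and p99 from raw latency samples with the standard library. The function name `latency_percentiles` and the sample data are illustrative assumptions, and the dashboard may use a different interpolation method.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50/p90/p95/p99 for a list of latency samples in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # n=100 yields the 99 cut points p1..p99; index k-1 is the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {f"p{k}": cuts[k - 1] for k in (50, 90, 95, 99)}


print(latency_percentiles([120, 135, 150, 180, 210, 260, 340, 900, 1500, 4200]))
```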
|
|
363
|
+
|
|
364
|
+
### Cloud Trace Integration
|
|
365
|
+
|
|
366
|
+
**Enable distributed tracing with OpenTelemetry:**
|
|
367
|
+
|
|
368
|
+
```python
|
|
369
|
+
from opentelemetry import trace
|
|
370
|
+
from opentelemetry.sdk.trace import TracerProvider
|
|
371
|
+
from opentelemetry.sdk.trace.export import BatchSpanProcessor
|
|
372
|
+
from opentelemetry.exporter.cloud_trace import CloudTraceSpanExporter
|
|
373
|
+
import vertexai
|
|
374
|
+
|
|
375
|
+
# Configure Cloud Trace exporter
|
|
376
|
+
trace.set_tracer_provider(TracerProvider())
|
|
377
|
+
cloud_trace_exporter = CloudTraceSpanExporter()
|
|
378
|
+
trace.get_tracer_provider().add_span_processor(
|
|
379
|
+
BatchSpanProcessor(cloud_trace_exporter)
|
|
380
|
+
)
|
|
381
|
+
|
|
382
|
+
tracer = trace.get_tracer(__name__)
|
|
383
|
+
|
|
384
|
+
# Instrument agent queries
|
|
385
|
+
with tracer.start_as_current_span("agent_query") as span:
    span.set_attribute("agent.id", agent_engine_id)
    span.set_attribute("user.query", user_query)

    response = agent.query(user_query)

    span.set_attribute("response.tokens", response.token_count)
    span.set_attribute("response.latency_ms", response.latency)
|
|
393
|
+
```
|
|
394
|
+
|
|
395
|
+
**View traces in Cloud Console:**
|
|
396
|
+
```bash
|
|
397
|
+
# Navigate to Trace Explorer
|
|
398
|
+
https://console.cloud.google.com/traces/list?project=[PROJECT_ID]
|
|
399
|
+
|
|
400
|
+
# Filter by agent queries
|
|
401
|
+
resource.type="aiplatform.googleapis.com/Agent"
|
|
402
|
+
```
|
|
403
|
+
|
|
404
|
+
### Cloud Logging
|
|
405
|
+
|
|
406
|
+
**Query agent logs using Cloud Logging:**
|
|
407
|
+
|
|
408
|
+
```python
|
|
409
|
+
from google.cloud import logging
|
|
410
|
+
|
|
411
|
+
client = logging.Client(project=PROJECT_ID)
|
|
412
|
+
|
|
413
|
+
# Get agent logs from the last 24 hours
|
|
414
|
+
filter_str = f'''
|
|
415
|
+
resource.type="aiplatform.googleapis.com/Agent"
|
|
416
|
+
resource.labels.agent_id="{agent_engine_id}"
|
|
417
|
+
timestamp>="2025-01-12T00:00:00Z"
|
|
418
|
+
severity>="WARNING"
|
|
419
|
+
'''
|
|
420
|
+
|
|
421
|
+
for entry in client.list_entries(filter_=filter_str, page_size=100):
|
|
422
|
+
    print(f"{entry.timestamp}: {entry.payload}")
|
|
423
|
+
```
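
The example above pins the start of the window to a fixed date. For a rolling "last 24 hours" window, the filter can be built at call time. The helper below is a sketch: `build_log_filter` is a hypothetical name, and the resource type and label are copied from the example above rather than independently verified against the Cloud Logging resource catalog.

```python
from datetime import datetime, timedelta, timezone


def build_log_filter(agent_engine_id, hours=24, min_severity="WARNING", now=None):
    """Build a Cloud Logging filter covering the last `hours` hours."""
    # `now` is injectable so the window can be pinned in tests.
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    start = (now - timedelta(hours=hours)).strftime("%Y-%m-%dT%H:%M:%SZ")
    # Newline-separated clauses are combined with an implicit AND.
    return "\n".join([
        'resource.type="aiplatform.googleapis.com/Agent"',
        f'resource.labels.agent_id="{agent_engine_id}"',
        f'timestamp>="{start}"',
        f"severity>={min_severity}",
    ])


print(build_log_filter("1234567890"))
```

The returned string can be passed as `filter_` to `client.list_entries` in place of the hard-coded `filter_str`.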
|
|
424
|
+
|
|
425
|
+
**Common log queries:**
|
|
426
|
+
|
|
427
|
+
```bash
|
|
428
|
+
# View all agent errors
|
|
429
|
+
gcloud logging read "resource.type=aiplatform.googleapis.com/Agent AND severity>=ERROR" \
|
|
430
|
+
--project=[PROJECT_ID] \
|
|
431
|
+
--limit=50 \
|
|
432
|
+
--format=json
|
|
433
|
+
|
|
434
|
+
# Memory Bank query performance
|
|
435
|
+
gcloud logging read "resource.type=aiplatform.googleapis.com/Agent AND jsonPayload.component=memory_bank" \
|
|
436
|
+
--project=[PROJECT_ID] \
|
|
437
|
+
--limit=100
|
|
438
|
+
|
|
439
|
+
# Code Execution Sandbox logs
|
|
440
|
+
gcloud logging read "resource.type=aiplatform.googleapis.com/Agent AND jsonPayload.component=code_execution" \
|
|
441
|
+
--project=[PROJECT_ID] \
|
|
442
|
+
--limit=100
|
|
443
|
+
```
|
|
444
|
+
|
|
445
|
+
### Cloud Monitoring Custom Metrics
|
|
446
|
+
|
|
447
|
+
**Create custom dashboards for agent monitoring:**
|
|
448
|
+
|
|
449
|
+
```python
|
|
450
|
+
import time

from google.cloud import monitoring_v3
|
|
451
|
+
|
|
452
|
+
client = monitoring_v3.MetricServiceClient()
|
|
453
|
+
project_name = f"projects/{PROJECT_ID}"
|
|
454
|
+
|
|
455
|
+
# Query agent latency metric
|
|
456
|
+
interval = monitoring_v3.TimeInterval(
|
|
457
|
+
{
|
|
458
|
+
"end_time": {"seconds": int(time.time())},
|
|
459
|
+
"start_time": {"seconds": int(time.time() - 3600)},
|
|
460
|
+
}
|
|
461
|
+
)
|
|
462
|
+
|
|
463
|
+
results = client.list_time_series(
|
|
464
|
+
request={
|
|
465
|
+
"name": project_name,
|
|
466
|
+
"filter": 'metric.type="aiplatform.googleapis.com/agent/prediction_latencies"',
|
|
467
|
+
"interval": interval,
|
|
468
|
+
"view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
|
|
469
|
+
}
|
|
470
|
+
)
|
|
471
|
+
|
|
472
|
+
for result in results:
    print(f"Agent: {result.resource.labels['agent_id']}")
    for point in result.points:
        print(f"  Latency: {point.value.distribution_value.mean}ms")
|
|
476
|
+
```
|
|
477
|
+
|
|
478
|
+
### Alerting Policies
|
|
479
|
+
|
|
480
|
+
**Create alerts for agent health issues:**
|
|
481
|
+
|
|
482
|
+
```python
|
|
483
|
+
from google.cloud import monitoring_v3
|
|
484
|
+
|
|
485
|
+
# Alert on high error rate
|
|
486
|
+
alert_policy = monitoring_v3.AlertPolicy(
|
|
487
|
+
display_name="Agent High Error Rate",
combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
|
|
488
|
+
conditions=[
|
|
489
|
+
monitoring_v3.AlertPolicy.Condition(
|
|
490
|
+
display_name="Error rate > 5%",
|
|
491
|
+
condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
|
|
492
|
+
filter='metric.type="aiplatform.googleapis.com/agent/error_count"',
|
|
493
|
+
comparison=monitoring_v3.ComparisonType.COMPARISON_GT,
|
|
494
|
+
threshold_value=0.05,  # assumes the filtered metric is a ratio; a raw error count needs a ratio-based condition instead
|
|
495
|
+
duration={"seconds": 300},
|
|
496
|
+
),
|
|
497
|
+
)
|
|
498
|
+
],
|
|
499
|
+
notification_channels=[notification_channel_id],
|
|
500
|
+
alert_strategy=monitoring_v3.AlertPolicy.AlertStrategy(
|
|
501
|
+
auto_close={"seconds": 86400}
|
|
502
|
+
),
|
|
503
|
+
)
|
|
504
|
+
|
|
505
|
+
policy_client = monitoring_v3.AlertPolicyServiceClient()
|
|
506
|
+
policy = policy_client.create_alert_policy(
|
|
507
|
+
name=f"projects/{PROJECT_ID}",
|
|
508
|
+
alert_policy=alert_policy,
|
|
509
|
+
)
|
|
510
|
+
```
|
|
511
|
+
|
|
512
|
+
**Common alert conditions:**
|
|
513
|
+
- Error rate exceeds 5% for 5 minutes
|
|
514
|
+
- p95 latency exceeds 10 seconds
|
|
515
|
+
- Memory Bank cache hit rate drops below 60%
|
|
516
|
+
- Code Execution Sandbox timeout rate exceeds 2%
|
|
517
|
+
- Token usage exceeds budget threshold
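
The conditions above can be expressed as plain threshold checks over one evaluation window. The sketch below is illustrative only: `WindowStats`, `ratio`, and `breached` are hypothetical names, the thresholds are copied from the list above, and the "for 5 minutes" duration is not modeled (the caller is assumed to pass statistics that already cover the window).

```python
from dataclasses import dataclass


@dataclass
class WindowStats:
    """Counters for one evaluation window (hypothetical shape)."""
    requests: int
    errors: int
    p95_latency_s: float
    cache_lookups: int
    cache_hits: int
    sandbox_runs: int
    sandbox_timeouts: int
    tokens_used: int
    token_budget: int


def ratio(part, whole):
    """Safe division that treats an empty denominator as a zero rate."""
    return part / whole if whole else 0.0


def breached(stats):
    """Return the names of the alert conditions that the window violates."""
    checks = [
        ("error rate > 5%", ratio(stats.errors, stats.requests) > 0.05),
        ("p95 latency > 10 s", stats.p95_latency_s > 10.0),
        # Skip the cache check when there were no lookups at all.
        ("cache hit rate < 60%",
         stats.cache_lookups > 0
         and ratio(stats.cache_hits, stats.cache_lookups) < 0.60),
        ("sandbox timeout rate > 2%",
         ratio(stats.sandbox_timeouts, stats.sandbox_runs) > 0.02),
        ("token usage over budget", stats.tokens_used > stats.token_budget),
    ]
    return [name for name, hit in checks if hit]
```

Working with ratios rather than raw counts keeps the thresholds meaningful across agents with very different traffic volumes.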
|
|
518
|
+
|
|
519
|
+
## Storage Integration
|
|
520
|
+
|
|
521
|
+
### BigQuery Connector
|
|
522
|
+
|
|
523
|
+
**New in 2025**: Export agent logs and analytics to BigQuery for long-term analysis.
|
|
524
|
+
|
|
525
|
+
**Set up BigQuery Export:**
|
|
526
|
+
|
|
527
|
+
```python
|
|
528
|
+
# Export agent logs to BigQuery via Cloud Logging log sink
|
|
529
|
+
# (Agent Engine logs flow through Cloud Logging; use a sink to route to BigQuery)
|
|
530
|
+
|
|
531
|
+
from google.cloud import logging_v2
|
|
532
|
+
|
|
533
|
+
client = logging_v2.ConfigServiceV2Client()
|
|
534
|
+
|
|
535
|
+
sink_name = "agent-logs-to-bq"  # LogSink.name takes the sink ID only; the project is passed as `parent` below
|
|
536
|
+
sink = logging_v2.LogSink(
|
|
537
|
+
name=sink_name,
|
|
538
|
+
destination=f"bigquery.googleapis.com/projects/{PROJECT_ID}/datasets/agent_analytics",
|
|
539
|
+
filter='resource.type="aiplatform.googleapis.com/Agent"',
|
|
540
|
+
)
|
|
541
|
+
|
|
542
|
+
# Create the log sink (routes agent logs to BigQuery automatically)
|
|
543
|
+
created_sink = client.create_sink(
|
|
544
|
+
parent=f"projects/{PROJECT_ID}",
|
|
545
|
+
sink=sink,
|
|
546
|
+
)
|
|
547
|
+
print(f"Log sink created: {created_sink.name}")
|
|
548
|
+
```
|
|
549
|
+
|
|
550
|
+
**Query agent analytics in BigQuery:**
|
|
551
|
+
|
|
552
|
+
```sql
|
|
553
|
+
-- Agent query volume and latency trends
|
|
554
|
+
SELECT
|
|
555
|
+
DATE(timestamp) as query_date,
|
|
556
|
+
COUNT(*) as total_queries,
|
|
557
|
+
AVG(latency_ms) as avg_latency,
|
|
558
|
+
APPROX_QUANTILES(latency_ms, 100)[OFFSET(95)] as p95_latency,
|
|
559
|
+
SUM(error_count) as total_errors
|
|
560
|
+
FROM `project.agent_analytics.agent_logs`
|
|
561
|
+
WHERE agent_id = 'your-agent-id'
|
|
562
|
+
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
|
|
563
|
+
GROUP BY query_date
|
|
564
|
+
ORDER BY query_date DESC;
|
|
565
|
+
|
|
566
|
+
-- Memory Bank cache performance
|
|
567
|
+
SELECT
|
|
568
|
+
memory_bank_id,
|
|
569
|
+
COUNT(*) as total_queries,
|
|
570
|
+
SUM(CASE WHEN cache_hit THEN 1 ELSE 0 END) / COUNT(*) as cache_hit_rate,
|
|
571
|
+
AVG(query_latency_ms) as avg_query_latency
|
|
572
|
+
FROM `project.agent_analytics.agent_logs`
|
|
573
|
+
WHERE component = 'MEMORY_BANK'
|
|
574
|
+
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
|
|
575
|
+
GROUP BY memory_bank_id;
|
|
576
|
+
|
|
577
|
+
-- Token usage and cost analysis
|
|
578
|
+
SELECT
|
|
579
|
+
DATE(timestamp) as usage_date,
|
|
580
|
+
SUM(input_tokens) as total_input_tokens,
|
|
581
|
+
SUM(output_tokens) as total_output_tokens,
|
|
582
|
+
SUM(input_tokens + output_tokens) as total_tokens,
|
|
583
|
+
SUM((input_tokens * 0.00025 + output_tokens * 0.00075) / 1000) as estimated_cost_usd
|
|
584
|
+
FROM `project.agent_analytics.agent_logs`
|
|
585
|
+
WHERE agent_id = 'your-agent-id'
|
|
586
|
+
AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 30 DAY)
|
|
587
|
+
GROUP BY usage_date
|
|
588
|
+
ORDER BY usage_date DESC;
|
|
589
|
+
```
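
The cost expression in the last query is plain arithmetic and can be checked outside BigQuery. In the sketch below, `estimated_cost_usd` is a hypothetical helper, and the default rates are the placeholder values from the query above, read as USD per 1,000 tokens. They are not published pricing and should be replaced with the rates for the model actually in use.

```python
def estimated_cost_usd(input_tokens, output_tokens,
                       input_rate_per_1k=0.00025, output_rate_per_1k=0.00075):
    """Mirror the SQL: (input * rate_in + output * rate_out) / 1000."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (input_tokens * input_rate_per_1k
            + output_tokens * output_rate_per_1k) / 1000


# One million input tokens and 200,000 output tokens at the placeholder rates.
print(f"{estimated_cost_usd(1_000_000, 200_000):.2f} USD")
```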
|
|
590
|
+
|
|
591
|
+
**Create scheduled reports:**
|
|
592
|
+
|
|
593
|
+
```bash
|
|
594
|
+
# Create BigQuery scheduled query
|
|
595
|
+
bq query \
|
|
596
|
+
--use_legacy_sql=false \
|
|
597
|
+
--destination_table=agent_analytics.daily_summary \
|
|
598
|
+
--replace \
|
|
599
|
+
--schedule='every 24 hours' \
|
|
600
|
+
--display_name='Agent Daily Summary' \
|
|
601
|
+
'SELECT
|
|
602
|
+
DATE(timestamp) as report_date,
|
|
603
|
+
agent_id,
|
|
604
|
+
COUNT(*) as total_queries,
|
|
605
|
+
AVG(latency_ms) as avg_latency,
|
|
606
|
+
SUM(error_count) as total_errors,
|
|
607
|
+
SUM(input_tokens + output_tokens) as total_tokens
|
|
608
|
+
FROM `project.agent_analytics.agent_logs`
|
|
609
|
+
WHERE timestamp >= TIMESTAMP(CURRENT_DATE())
|
|
610
|
+
GROUP BY report_date, agent_id'
|
|
611
|
+
```
|
|
612
|
+
|
|
613
|
+
### Cloud Storage Integration
|
|
614
|
+
|
|
615
|
+
**Store agent artifacts and code execution outputs:**
|
|
616
|
+
|
|
617
|
+
```python
|
|
618
|
+
from google.cloud import storage
|
|
619
|
+
|
|
620
|
+
storage_client = storage.Client(project=PROJECT_ID)
|
|
621
|
+
bucket = storage_client.bucket(f"{PROJECT_ID}-agent-artifacts")
|
|
622
|
+
|
|
623
|
+
# Configure Code Execution Sandbox to save artifacts
|
|
624
|
+
sandbox_config = {
|
|
625
|
+
"artifact_storage": {
|
|
626
|
+
"gcs_bucket": bucket.name,
|
|
627
|
+
"retention_days": 30,
|
|
628
|
+
"export_patterns": ["*.png", "*.csv", "*.json", "*.log"]
|
|
629
|
+
}
|
|
630
|
+
}
|
|
631
|
+
|
|
632
|
+
# Retrieve execution artifacts
|
|
633
|
+
def get_execution_artifacts(execution_id: str):
    """Download artifacts from a code execution run."""
    prefix = f"executions/{execution_id}/"
    blobs = bucket.list_blobs(prefix=prefix)

    artifacts = []
    for blob in blobs:
        local_path = f"/tmp/{blob.name.split('/')[-1]}"
        blob.download_to_filename(local_path)
        artifacts.append(local_path)

    return artifacts
|
|
645
|
+
```
|
|
646
|
+
|
|
647
|
+
**Incremental refresh for large datasets:**
|
|
648
|
+
|
|
649
|
+
```python
|
|
650
|
+
# Configure incremental data export (new in 2025)
|
|
651
|
+
from google.cloud import discoveryengine_v1
|
|
652
|
+
|
|
653
|
+
# For Memory Bank with large knowledge bases
|
|
654
|
+
data_store_config = {
|
|
655
|
+
"name": f"projects/{PROJECT_ID}/locations/{LOCATION}/dataStores/{DATA_STORE_ID}",
|
|
656
|
+
"content_config": "CONTENT_REQUIRED",
|
|
657
|
+
"document_processing_config": {
|
|
658
|
+
"chunking_config": {
|
|
659
|
+
"layout_based_chunking_config": {
|
|
660
|
+
"chunk_size": 500,
|
|
661
|
+
"include_ancestor_headings": True
|
|
662
|
+
}
|
|
663
|
+
}
|
|
664
|
+
},
|
|
665
|
+
"starting_schema": {
|
|
666
|
+
"incremental_updates": {
|
|
667
|
+
"gcs_source": f"gs://{bucket.name}/knowledge-base/",
|
|
668
|
+
"sync_interval_hours": 4,
|
|
669
|
+
"change_detection": "TIMESTAMP" # Only sync modified files
|
|
670
|
+
}
|
|
671
|
+
}
|
|
672
|
+
}
|
|
673
|
+
```
|
|
674
|
+
|
|
675
|
+
**Monitor storage costs:**
|
|
676
|
+
|
|
677
|
+
```bash
|
|
678
|
+
# Check agent storage usage
|
|
679
|
+
gsutil du -sh gs://[PROJECT_ID]-agent-artifacts/
|
|
680
|
+
|
|
681
|
+
# List large artifacts
|
|
682
|
+
gsutil ls -lh gs://[PROJECT_ID]-agent-artifacts/** | sort -k1 -h -r | head -20
|
|
683
|
+
|
|
684
|
+
# Set lifecycle policy to auto-delete old artifacts
|
|
685
|
+
cat > lifecycle.json <<EOF
|
|
686
|
+
{
|
|
687
|
+
"lifecycle": {
|
|
688
|
+
"rule": [
|
|
689
|
+
{
|
|
690
|
+
"action": {"type": "Delete"},
|
|
691
|
+
"condition": {
|
|
692
|
+
"age": 90,
|
|
693
|
+
"matchesPrefix": ["executions/", "logs/"]
|
|
694
|
+
}
|
|
695
|
+
}
|
|
696
|
+
]
|
|
697
|
+
}
|
|
698
|
+
}
|
|
699
|
+
EOF
|
|
700
|
+
|
|
701
|
+
gsutil lifecycle set lifecycle.json gs://[PROJECT_ID]-agent-artifacts/
|
|
702
|
+
```
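
When the retention age or prefixes differ per environment, the same lifecycle document can be generated rather than hand-edited. This is a minimal sketch: `lifecycle_policy` is a hypothetical helper, and the structure simply mirrors the `lifecycle.json` heredoc above.

```python
import json


def lifecycle_policy(age_days=90, prefixes=("executions/", "logs/")):
    """Build a delete-after-N-days lifecycle document for the given prefixes."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    return {
        "lifecycle": {
            "rule": [
                {
                    "action": {"type": "Delete"},
                    "condition": {
                        "age": age_days,
                        "matchesPrefix": list(prefixes),
                    },
                }
            ]
        }
    }


# Print the document; redirect the output to lifecycle.json before applying it.
print(json.dumps(lifecycle_policy(), indent=2))
```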
|
|
703
|
+
|
|
704
|
+
### Data Export Patterns
|
|
705
|
+
|
|
706
|
+
**Common data export schedules for different use cases:**
|
|
707
|
+
|
|
708
|
+
| Use Case | Export Frequency | Destination | Retention |
|
|
709
|
+
|----------|-----------------|-------------|-----------|
|
|
710
|
+
| Real-time monitoring | Streaming | Cloud Logging | 30 days |
|
|
711
|
+
| Daily analytics | Every 24 hours | BigQuery | 1 year |
|
|
712
|
+
| Compliance audit | Weekly | Cloud Storage (archive) | 7 years |
|
|
713
|
+
| Cost analysis | Monthly | BigQuery | Indefinite |
|
|
714
|
+
| Debugging logs | On-demand | Cloud Storage | 90 days |
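
The retention column can be turned into concrete expiry dates when planning cleanup jobs. The sketch below is illustrative: `RETENTION_DAYS` and `expiry_date` are hypothetical names, and a year is approximated as 365 days, which is an assumption rather than something the table states.

```python
from datetime import date, timedelta

# Retention periods from the table above; None means "keep indefinitely".
RETENTION_DAYS = {
    "real-time monitoring": 30,
    "daily analytics": 365,
    "compliance audit": 7 * 365,
    "cost analysis": None,
    "debugging logs": 90,
}


def expiry_date(use_case, exported_on):
    """Return the date an export may be deleted, or None if it never expires."""
    days = RETENTION_DAYS[use_case]
    return None if days is None else exported_on + timedelta(days=days)


print(expiry_date("debugging logs", date(2025, 1, 1)))
```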
|
|
715
|
+
|
|
716
|
+
**Example: Weekly compliance export**
|
|
717
|
+
|
|
718
|
+
```python
|
|
719
|
+
from google.cloud import logging_v2
|
|
720
|
+
import datetime
|
|
721
|
+
|
|
722
|
+
def export_compliance_logs():
    """Export agent logs for compliance audit via Cloud Logging.
    Agent Engine does not have a direct export_logs API; use Cloud Logging sinks
    or the Logging API to read and export logs to GCS.
    """
    end_time = datetime.datetime.now(datetime.timezone.utc)
    start_time = end_time - datetime.timedelta(days=7)

    client = logging_v2.Client(project=PROJECT_ID)

    # Query agent logs for the compliance window
    filter_str = f'''
    resource.type="aiplatform.googleapis.com/Agent"
    timestamp>="{start_time.isoformat()}"
    timestamp<="{end_time.isoformat()}"
    (severity>=WARNING OR jsonPayload.model_armor_triggered=true)
    '''

    entries = list(client.list_entries(filter_=filter_str, page_size=1000))
    print(f"Compliance export: {len(entries)} log entries found")

    # Write to GCS for archival
    from google.cloud import storage
    storage_client = storage.Client(project=PROJECT_ID)
    bucket = storage_client.bucket(f"{PROJECT_ID}-compliance")
    blob = bucket.blob(f"agents/{start_time.strftime('%Y-%m-%d')}/logs.json")

    import json
    log_data = [{"timestamp": str(e.timestamp), "payload": str(e.payload)} for e in entries]
    blob.upload_from_string(json.dumps(log_data, indent=2))
    print(f"Exported to gs://{bucket.name}/{blob.name}")
    return entries
|
|
754
|
+
|
|
755
|
+
# Schedule with Cloud Scheduler
|
|
756
|
+
# gcloud scheduler jobs create http compliance-export \
|
|
757
|
+
# --schedule="0 0 * * 0" \
|
|
758
|
+
# --uri="https://[REGION]-[PROJECT_ID].cloudfunctions.net/export-compliance-logs" \
|
|
759
|
+
# --http-method=POST
|
|
760
|
+
```
|
|
761
|
+
|
|
762
|
+
## Requirements
|
|
763
|
+
|
|
764
|
+
- Google Cloud Project with Vertex AI enabled
|
|
765
|
+
- Deployed agents on Agent Engine
|
|
766
|
+
- Appropriate IAM permissions for inspection
|
|
767
|
+
- Cloud Monitoring enabled
|
|
768
|
+
- Cloud Logging enabled (for observability features)
|
|
769
|
+
- BigQuery dataset (for analytics integration)
|
|
770
|
+
|
|
771
|
+
## License
|
|
772
|
+
|
|
773
|
+
MIT
|
|
774
|
+
|
|
775
|
+
## Support
|
|
776
|
+
|
|
777
|
+
- Issues: https://github.com/jeremylongshore/claude-code-plugins/issues
|
|
778
|
+
- Discussions: https://github.com/jeremylongshore/claude-code-plugins/discussions
|
|
779
|
+
|
|
780
|
+
## Version
|
|
781
|
+
|
|
782
|
+
2.1.0 (2026) - Agent Engine GA support with comprehensive inspection capabilities; SDK-only patterns (no fabricated gcloud CLI)
|