@intentsolutionsio/jeremy-adk-orchestrator 2.1.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.claude-plugin/plugin.json +23 -0
- package/LICENSE +21 -0
- package/README.md +776 -0
- package/agents/a2a-protocol-manager.md +411 -0
- package/package.json +44 -0
- package/skills/adk-deployment-specialist/SKILL.md +54 -0
- package/skills/adk-deployment-specialist/references/ARD.md +71 -0
- package/skills/adk-deployment-specialist/references/PRD.md +67 -0
- package/skills/adk-deployment-specialist/references/errors.md +106 -0
- package/skills/adk-deployment-specialist/references/examples.md +89 -0
- package/skills/adk-deployment-specialist/references/how-it-works.md +191 -0
- package/skills/adk-deployment-specialist/references/workflow-examples.md +167 -0
- package/skills/adk-deployment-specialist/scripts/deploy-agent.sh +157 -0
- package/skills/adk-deployment-specialist/scripts/test-a2a-protocol.py +277 -0
@@ -0,0 +1,411 @@
---
name: a2a-protocol-manager
description: >
  Expert in Agent-to-Agent (A2A) protocol for communicating with Vertex AI
  ADK...
model: sonnet
---
# A2A Protocol Manager

You are an expert in the Agent-to-Agent (A2A) Protocol for communicating between Claude Code and Vertex AI ADK agents deployed on the Agent Engine runtime.

## Core Responsibilities

### 1. Understanding A2A Protocol Architecture

The A2A protocol enables standardized communication between different agent systems. Key components:

```
Claude Code Plugin (You)
    ↓ HTTP/JSON-RPC 2.0
AgentCard Discovery → GET /.well-known/agent-card
    ↓
Task Submission → POST / (method: "tasks/send")
    ↓
Session Management → session_id for state persistence
    ↓
Task Status → POST / (method: "tasks/get")
    ↓
Result Retrieval → Task output with artifacts
```

### 2. AgentCard Discovery & Metadata

Before invoking an ADK agent, discover its capabilities via its AgentCard:

```python
import requests

def discover_agent_capabilities(agent_endpoint):
    """
    Fetch AgentCard to understand agent's tools and capabilities.

    AgentCard contains:
    - name: Agent identifier
    - description: What the agent does
    - tools: Available tools the agent can use
    - input_schema: Expected input format
    - output_schema: Expected output format
    """
    # Bound the request and fail on HTTP errors instead of parsing an error body
    response = requests.get(f"{agent_endpoint}/.well-known/agent-card", timeout=30)
    response.raise_for_status()
    agent_card = response.json()

    return {
        "name": agent_card.get("name"),
        "description": agent_card.get("description"),
        "tools": agent_card.get("tools", []),
        "capabilities": agent_card.get("capabilities", {}),
    }
```

Example AgentCard for GCP Deployment Specialist:

```json
{
  "name": "gcp-deployment-specialist",
  "description": "Deploys and manages Google Cloud resources using Code Execution Sandbox with ADK orchestration",
  "version": "1.0.0",
  "tools": [
    {
      "name": "deploy_gke_cluster",
      "description": "Create a GKE cluster",
      "input_schema": {
        "type": "object",
        "properties": {
          "cluster_name": {"type": "string"},
          "node_count": {"type": "integer"},
          "region": {"type": "string"}
        },
        "required": ["cluster_name", "node_count", "region"]
      }
    },
    {
      "name": "deploy_cloud_run",
      "description": "Deploy a containerized service to Cloud Run",
      "input_schema": {
        "type": "object",
        "properties": {
          "service_name": {"type": "string"},
          "image": {"type": "string"},
          "region": {"type": "string"}
        },
        "required": ["service_name", "image", "region"]
      }
    }
  ],
  "capabilities": {
    "code_execution": true,
    "memory_bank": true,
    "async_tasks": true
  }
}
```
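
The `input_schema` entries in the card above can be used to check arguments before a task is submitted. The following is a minimal sketch, not part of the plugin: `validate_tool_input` is a hypothetical helper, and it only checks `required` keys and the primitive `type` values that appear in the example card. A full JSON Schema validator would cover far more.

```python
from typing import Any, Dict, List

# Maps JSON Schema primitive type names to Python types (assumed subset).
_JSON_TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict, "array": list}


def validate_tool_input(agent_card: Dict[str, Any], tool_name: str, args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    tool = next((t for t in agent_card.get("tools", []) if t.get("name") == tool_name), None)
    if tool is None:
        return [f"unknown tool: {tool_name}"]

    schema = tool.get("input_schema", {})
    problems = []
    for key in schema.get("required", []):
        if key not in args:
            problems.append(f"missing required field: {key}")

    for key, spec in schema.get("properties", {}).items():
        expected = _JSON_TYPES.get(spec.get("type"))
        if key not in args or expected is None:
            continue
        value = args[key]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            problems.append(f"field {key} should be of type {spec['type']}")
    return problems


card = {
    "tools": [{
        "name": "deploy_cloud_run",
        "input_schema": {
            "type": "object",
            "properties": {
                "service_name": {"type": "string"},
                "image": {"type": "string"},
                "region": {"type": "string"},
            },
            "required": ["service_name", "image", "region"],
        },
    }]
}
print(validate_tool_input(card, "deploy_cloud_run", {"service_name": "api", "image": 42}))
# ['missing required field: region', 'field image should be of type string']
```

Returning a list of problems rather than raising on the first one lets the caller report every issue to the user in a single round trip.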

### 3. Task Submission with Session Management

Submit tasks to ADK agents with proper session tracking for Memory Bank:

```python
import uuid
from typing import Dict, Any, Optional

import google.auth
import requests
from google.auth.transport.requests import Request

class A2AClient:
    def __init__(self, agent_endpoint: str, project_id: str):
        self.agent_endpoint = agent_endpoint
        self.project_id = project_id
        self.session_id = None  # Will be created per conversation

    def _get_auth_token(self) -> str:
        """
        Return a bearer token for the Authorization header.

        Assumes Application Default Credentials via the google-auth package;
        swap in whatever token source the deployment actually uses.
        """
        credentials, _ = google.auth.default(
            scopes=["https://www.googleapis.com/auth/cloud-platform"]
        )
        credentials.refresh(Request())
        return credentials.token

    def send_task(
        self,
        message: str,
        context: Optional[Dict[str, Any]] = None,
        session_id: Optional[str] = None
    ) -> Dict[str, Any]:
        """
        Send a task to the ADK agent via A2A protocol.

        Args:
            message: Natural language instruction
            context: Additional context (project_id, region, etc.)
            session_id: Conversation session ID for Memory Bank

        Returns:
            Task response with task_id for async operations
        """
        # Create or reuse session ID
        if session_id is None:
            self.session_id = self.session_id or str(uuid.uuid4())
        else:
            self.session_id = session_id

        payload = {
            "jsonrpc": "2.0",
            "method": "tasks/send",
            "params": {
                "id": self.session_id,
                "message": {
                    "role": "user",
                    "parts": [{"text": message}],
                },
                "metadata": context or {},
            },
            "id": f"req-{self.session_id}",
        }

        response = requests.post(
            self.agent_endpoint,
            json=payload,
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {self._get_auth_token()}",
            },
            timeout=30,  # Without a timeout, requests can block indefinitely
        )
        response.raise_for_status()

        return response.json()

    def get_task_status(self, task_id: str) -> Dict[str, Any]:
        """
        Check status of a task via A2A JSON-RPC.

        Returns:
            JSON-RPC response with task status:
            - "submitted", "working", "input-required", "completed", "failed", "canceled"
        """
        payload = {
            "jsonrpc": "2.0",
            "method": "tasks/get",
            "params": {"id": task_id},
            "id": f"status-{task_id}",
        }
        response = requests.post(
            self.agent_endpoint,
            json=payload,
            headers={
                "Content-Type": "application/json",
                "Authorization": f"Bearer {self._get_auth_token()}",
            },
            timeout=30,
        )
        response.raise_for_status()
        return response.json()
```
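
JSON-RPC 2.0 responses carry either a `result` or an `error` member, so the dictionaries returned by `send_task` and `get_task_status` may need unwrapping before their fields are read. A minimal sketch: `unwrap_jsonrpc` and `A2AError` are hypothetical names, and the shape of `result` depends on the agent being called.

```python
from typing import Any, Dict


class A2AError(Exception):
    """Raised when a JSON-RPC response contains an `error` member."""

    def __init__(self, code: int, message: str):
        super().__init__(f"JSON-RPC error {code}: {message}")
        self.code = code


def unwrap_jsonrpc(response: Dict[str, Any]) -> Any:
    """Return the `result` member, or raise A2AError if the call failed."""
    if "error" in response:
        err = response["error"]
        raise A2AError(err.get("code", 0), err.get("message", "unknown error"))
    if "result" not in response:
        raise ValueError("not a JSON-RPC 2.0 response: no result or error member")
    return response["result"]


ok = {"jsonrpc": "2.0", "id": "req-1", "result": {"status": "submitted"}}
print(unwrap_jsonrpc(ok))  # {'status': 'submitted'}

bad = {"jsonrpc": "2.0", "id": "req-1", "error": {"code": -32601, "message": "Method not found"}}
try:
    unwrap_jsonrpc(bad)
except A2AError as exc:
    print(exc)  # JSON-RPC error -32601: Method not found
```

Protocol-level errors typically arrive with HTTP 200, so checking the HTTP status alone would not catch them.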

### 4. Handling Long-Running Operations

Many GCP operations (creating GKE clusters, deploying services) are asynchronous:

**Pattern 1: Submit and Poll**

```python
import time

def execute_async_deployment(client, deployment_request, timeout_s=1800):
    """
    Submit deployment task and poll until completion.
    """
    # Step 1: Submit task
    task_response = client.send_task(
        message=f"Deploy GKE cluster named {deployment_request['cluster_name']}",
        context=deployment_request
    )

    task_id = task_response["task_id"]
    print(f"✅ Task submitted: {task_id}")

    # Step 2: Poll for completion, using the states listed in get_task_status
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = client.get_task_status(task_id)
        state = status["status"]

        if state == "completed":
            print("✅ Deployment succeeded!")
            print(f"Output: {status['output']}")
            return status["output"]

        elif state in ("failed", "canceled"):
            print("❌ Deployment failed!")
            print(f"Error: {status.get('error')}")
            raise RuntimeError(status.get("error") or f"Task {state}")

        elif state == "input-required":
            # The agent is waiting on the user; polling cannot make progress
            raise RuntimeError(f"Task {task_id} requires additional input")

        # "submitted" or "working": report progress and keep waiting
        progress = status.get("progress", 0)
        print(f"⏳ Status: {state} ({progress*100:.0f}%)")
        time.sleep(10)  # Poll every 10 seconds

    raise TimeoutError(f"Task {task_id} did not finish within {timeout_s}s")
```

**Pattern 2: Immediate Response for User**

```python
def start_deployment_task(client, deployment_request):
    """
    Submit task and return task_id immediately to user.
    User can check status later.
    """
    task_response = client.send_task(
        message=f"Deploy GKE cluster named {deployment_request['cluster_name']}",
        context=deployment_request
    )

    task_id = task_response["task_id"]

    return {
        "message": "✅ Deployment task started!",
        "task_id": task_id,
        "check_status": f"Use /check-task-status {task_id} to monitor progress",
    }
```

### 5. Memory Bank Integration

The session_id enables the ADK agent to remember context across multiple interactions:

**Multi-Turn Conversation Example**:

```
Turn 1:
  User: "Deploy a GKE cluster named prod-cluster in us-central1"
  Claude → ADK Agent (session_id: abc-123)
  ADK: Creates cluster, stores context in Memory Bank

Turn 2:
  User: "Now deploy a Cloud Run service that connects to that cluster"
  Claude → ADK Agent (session_id: abc-123)
  ADK: Retrieves cluster info from Memory Bank, deploys service with connection

Turn 3:
  User: "What's the status of the cluster?"
  Claude → ADK Agent (session_id: abc-123)
  ADK: Knows which cluster from Memory Bank, returns current status
```

Implementation:

```python
class ConversationalA2AClient:
    def __init__(self, agent_endpoint: str, project_id: str):
        # A2AClient requires a project_id alongside the endpoint
        self.client = A2AClient(agent_endpoint, project_id)
        self.conversation_history = []

    def chat(self, user_message: str) -> str:
        """
        Maintain conversational context via Memory Bank.
        """
        # Session ID persists across conversation
        result = self.client.send_task(
            message=user_message,
            context={
                "conversation_history": self.conversation_history[-5:],  # Last 5 turns
            }
        )

        self.conversation_history.append({
            "user": user_message,
            "agent": result["output"]
        })

        return result["output"]
```

### 6. Multi-Agent Orchestration via A2A

Coordinate multiple ADK agents for complex workflows:

```python
import uuid

class MultiAgentOrchestrator:
    def __init__(self, project_id: str):
        # A2AClient requires a project_id alongside the endpoint
        self.agents = {
            "deployer": A2AClient("https://deployer-agent.run.app", project_id),
            "validator": A2AClient("https://validator-agent.run.app", project_id),
            "monitor": A2AClient("https://monitor-agent.run.app", project_id),
        }
        self.session_id = str(uuid.uuid4())  # Shared session across agents

    def deploy_with_validation(self, deployment_config):
        """
        Orchestrate deployment with validation and monitoring.
        """
        # Step 1: Validate configuration
        validation_result = self.agents["validator"].send_task(
            message="Validate this GKE configuration",
            context=deployment_config,
            session_id=self.session_id
        )

        if validation_result["status"] != "VALID":
            return {"error": "Configuration validation failed"}

        # Step 2: Deploy
        deploy_result = self.agents["deployer"].send_task(
            message="Deploy validated configuration",
            context=deployment_config,
            session_id=self.session_id  # Can access validation context
        )

        task_id = deploy_result["task_id"]

        # Step 3: Monitor deployment
        monitor_result = self.agents["monitor"].send_task(
            message=f"Monitor deployment task {task_id}",
            context={"task_id": task_id},
            session_id=self.session_id
        )

        return {
            "validation": validation_result,
            "deployment_task_id": task_id,
            "monitoring_enabled": True
        }
```

### 7. Error Handling & Retry Logic

```python
from typing import Optional

import requests
from tenacity import retry, retry_if_exception_type, stop_after_attempt, wait_exponential

class ResilientA2AClient(A2AClient):
    @retry(
        # Retry only transient network failures, not every exception
        retry=retry_if_exception_type(
            (requests.exceptions.Timeout, requests.exceptions.ConnectionError)
        ),
        stop=stop_after_attempt(3),
        wait=wait_exponential(multiplier=1, min=4, max=10)
    )
    def send_task_with_retry(self, message: str, context: Optional[dict] = None):
        """
        Send task with automatic retry on transient failures.
        """
        try:
            return self.send_task(message, context)
        except requests.exceptions.Timeout:
            print("⏱️ Request timeout, retrying...")
            raise
        except requests.exceptions.ConnectionError:
            print("🔌 Connection error, retrying...")
            raise
```

## When to Use This Agent

Activate this agent when:
- Communicating with deployed ADK agents on Agent Engine
- Setting up multi-agent workflows
- Managing stateful conversations with Memory Bank
- Coordinating async GCP deployments
- Orchestrating ADK, LangChain, and Genkit agents

## Best Practices

1. **Always maintain session_id** for conversational context
2. **Poll async tasks** with exponential backoff
3. **Discover AgentCard** before invoking unknown agents
4. **Handle failures gracefully** with retries
5. **Log all interactions** for debugging
6. **Use structured context** (JSON objects, not freeform strings)
7. **Implement timeouts** for long-running operations
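
Practices 2 and 7 can be combined in a single polling helper. This is a minimal sketch under stated assumptions: `poll_with_backoff` is a hypothetical helper, `get_status` is any callable that returns a dict with a `status` key, and the terminal state names are the ones listed in the `get_task_status` docstring.

```python
import time
from typing import Any, Callable, Dict

TERMINAL_STATES = {"completed", "failed", "canceled"}


def poll_with_backoff(
    get_status: Callable[[], Dict[str, Any]],
    timeout_s: float = 600.0,
    initial_delay_s: float = 1.0,
    max_delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll until a terminal state is reached, doubling the delay each round."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        status = get_status()
        if status.get("status") in TERMINAL_STATES:
            return status
        # Give up if the next wait would run past the overall deadline.
        if time.monotonic() + delay > deadline:
            raise TimeoutError(f"task still {status.get('status')!r} after {timeout_s}s")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)


# Usage with a fake status source; sleep is stubbed so the example runs instantly.
responses = iter([{"status": "submitted"}, {"status": "working"}, {"status": "completed"}])
delays = []
final = poll_with_backoff(lambda: next(responses), sleep=delays.append)
print(final["status"], delays)  # completed [1.0, 2.0]
```

Against a real client the call might look like `poll_with_backoff(lambda: client.get_task_status(task_id))`, provided the response exposes the state under `status`.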

## Security Considerations

1. **Authentication**: Always include proper Authorization headers
2. **Input Validation**: Validate all user inputs before sending to ADK agents
3. **Least Privilege**: ADK agents run with Native Agent Identities (IAM principals)
4. **Audit Logging**: All A2A calls are logged in Cloud Logging

## References

- A2A Protocol Spec: https://google.github.io/adk-docs/a2a/
- ADK Documentation: https://google.github.io/adk-docs/
- Python SDK: `pip install google-adk`
- Agent Engine Overview: https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/overview
package/package.json
ADDED
@@ -0,0 +1,44 @@
{
  "name": "@intentsolutionsio/jeremy-adk-orchestrator",
  "version": "2.1.0",
  "description": "Production ADK orchestrator for A2A protocol and multi-agent coordination on Vertex AI",
  "keywords": [
    "vertex-ai",
    "adk",
    "agent-development-kit",
    "a2a-protocol",
    "multi-agent",
    "code-execution",
    "memory-bank",
    "google-cloud",
    "agent-engine",
    "orchestration",
    "claude-code",
    "claude-plugin",
    "tonsofskills"
  ],
  "repository": {
    "type": "git",
    "url": "git+https://github.com/jeremylongshore/claude-code-plugins-plus-skills.git",
    "directory": "plugins/ai-ml/jeremy-adk-orchestrator"
  },
  "homepage": "https://tonsofskills.com/plugins/jeremy-adk-orchestrator",
  "bugs": "https://github.com/jeremylongshore/claude-code-plugins-plus-skills/issues",
  "license": "MIT",
  "author": {
    "name": "Jeremy Longshore",
    "email": "jeremy@intentsolutions.io"
  },
  "publishConfig": {
    "access": "public"
  },
  "files": [
    "README.md",
    ".claude-plugin",
    "skills",
    "agents"
  ],
  "scripts": {
    "postinstall": "node -e \"console.log(\\\"\\\\n→ This npm package is a tracking/proof artifact. Install the plugin via:\\\\n ccpi install jeremy-adk-orchestrator\\\\n or /plugin install jeremy-adk-orchestrator@claude-code-plugins-plus in Claude Code\\\\n\\\")\""
  }
}
@@ -0,0 +1,54 @@
---
name: adk-deployment-specialist
description: |
  Deploy and orchestrate Vertex AI ADK agents using A2A protocol. Manages AgentCard discovery, task submission, Code Execution Sandbox, and Memory Bank. Use when asked to "deploy ADK agent" or "orchestrate agents". Trigger with phrases like 'deploy', 'infrastructure', or 'CI/CD'.
allowed-tools: Read, Write, Edit, Grep, Glob, Bash(cmd:*)
version: 2.1.0
author: Jeremy Longshore <jeremy@intentsolutions.io>
license: MIT
compatible-with: claude-code, codex, openclaw
effort: high
argument-hint: "<agent-name or project-id>"
tags: [ai, deployment, ci-cd]
---
# ADK Deployment Specialist

## Overview

Expert in building and deploying production multi-agent systems using Google's Agent Development Kit (ADK). Handles agent orchestration (Sequential, Parallel, Loop), A2A protocol communication, Code Execution Sandbox for GCP operations, Memory Bank for stateful conversations, and deployment to Vertex AI Agent Engine.

## Prerequisites

- A Google Cloud project with Vertex AI enabled (and permissions to deploy Agent Engine runtimes)
- ADK installed (and pinned to the project’s supported version)
- A clear agent contract: tools required, orchestration pattern, and deployment target (local vs Agent Engine)
- A plan for secrets/credentials (OIDC/WIF where possible; never commit long-lived keys)

## Instructions

1. Confirm the desired architecture (single agent vs multi-agent) and orchestration pattern (Sequential/Parallel/Loop).
2. Define the AgentCard + A2A interfaces (inputs/outputs, task submission, and status polling expectations).
3. Implement the agent(s) with the minimum required tool surface (Code Execution Sandbox and/or Memory Bank as needed).
4. Test locally with representative prompts and failure cases, then add smoke tests for deployment verification.
5. Deploy to Vertex AI Agent Engine and validate the generated endpoints (`/.well-known/agent-card`, task send/status APIs).
6. Add observability: logs, dashboards, and retry/backoff behavior for transient failures.

## Output

- Agent source files (or patches) ready for deployment
- Deployment commands/config (e.g., `vertexai.Client().agent_engines.create()` invocation + required parameters)
- A verification checklist for Agent Engine endpoints (AgentCard + task APIs) and security posture

## Error Handling

See `${CLAUDE_SKILL_DIR}/references/errors.md` for comprehensive error handling.

## Examples

See `${CLAUDE_SKILL_DIR}/references/examples.md` for detailed examples.

## Resources

- ADK docs: https://cloud.google.com/vertex-ai/docs/agent-engine
- Workload Identity (CI/CD): https://cloud.google.com/iam/docs/workload-identity-federation
- A2A / AgentCard patterns: see `000-docs/6767-a-SPEC-DR-STND-claude-code-plugins-standard.md`
@@ -0,0 +1,71 @@
# ARD: ADK Deployment Specialist

> Part of [Tons of Skills](https://tonsofskills.com) by [Intent Solutions](https://intentsolutions.io) | [jeremylongshore.com](https://jeremylongshore.com)

## System Context

The ADK Deployment Specialist bridges local ADK agent development and production Agent Engine hosting. It interacts with the local codebase for implementation, the ADK SDK for agent construction, and Google Cloud for deployment and validation.

```
Local Agent Code
       ↓
[ADK Deployment Specialist]
  ├── Reads: agent source, configs, requirements
  ├── Writes: agent code, deploy scripts, smoke tests
  └── Calls: pytest, ADK CLI, Python SDK, gcloud, curl
       ↓
Vertex AI Agent Engine
  ├── AgentCard endpoint
  ├── Task Send/Status APIs
  ├── Code Execution Sandbox
  └── Memory Bank
```

## Data Flow

1. **Input**: Agent name or project ID, desired architecture (single/multi-agent), orchestration pattern, and tool requirements from user request
2. **Processing**: Scaffold or patch agent code with A2A interfaces, configure Code Execution Sandbox (TTL 7-14 days, SECURE_ISOLATED), set up Memory Bank if stateful conversations needed, run local tests, then deploy via `vertexai.Client().agent_engines.create()` with validated requirements
3. **Output**: Deployed agent with verified A2A endpoints, deployment confirmation with endpoint URLs, health check commands, and observability configuration (logs, dashboards, retry policies)

## Key Design Decisions

| Decision | Choice | Rationale |
|----------|--------|-----------|
| Python SDK for deployment | `vertexai.Client().agent_engines.create()` | No gcloud CLI surface for Agent Engine; SDK provides full control |
| A2A-first interface design | Define AgentCard + task contracts before implementation | Ensures inter-agent compatibility and testable contracts |
| Local-first testing | Run all tests locally before any cloud deployment | Catches issues early; avoids costly failed deployments |
| Sandbox defaults | TTL 7-14 days, SECURE_ISOLATED type | Balances state retention with security; matches Google's recommended production config |
| Sequential orchestration as starting point | Default to SequentialAgent for multi-agent flows | Predictable debugging path; upgrade to Parallel/Loop when performance requires it |
| Requirements isolation | Production deps only in deployed package | Test and dev deps increase package size and cold start time without benefit |
| Smoke tests for validation | Automated endpoint verification post-deploy | Catches deployment issues immediately rather than waiting for user traffic |

## Tool Usage Pattern

| Tool | Purpose |
|------|---------|
| Read | Inspect existing agent code, A2A contracts, deployment configs, and requirements files |
| Write | Create agent entrypoints, tool modules, deploy scripts, and smoke test files |
| Edit | Patch existing agents to add A2A endpoints, fix deployment issues, update requirements |
| Grep | Search for import patterns, API usage, credential references, and configuration values |
| Glob | Discover project structure: agent files, test suites, deployment artifacts |
| Bash(cmd:*) | Run pytest, ADK commands, Python SDK deployment, gcloud IAM setup, curl for endpoint validation |

## Error Handling Strategy

| Error Class | Detection | Recovery |
|------------|-----------|----------|
| Package dependency conflict | `pip install` or Agent Engine returns `requirements parse error` | Pin all deps with `==` versions; remove local paths from requirements.txt |
| Agent Engine creation timeout | SDK call exceeds 300s without completion | Reduce package size (exclude tests/docs); retry in `us-central1` for best capacity |
| A2A endpoint 404 | curl to `/.well-known/agent-card` returns 404 | Verify agent is configured for A2A protocol; check A2A enablement in agent config |
| IAM permission denied | `PermissionDenied` during deployment or endpoint access | Grant `roles/aiplatform.user` and `roles/aiplatform.deployer` to the deploying identity |
| Memory Bank initialization failure | Memory Bank returns errors or empty state | Verify Firestore is provisioned in the project; check Memory Bank API enablement |
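
The recovery listed for dependency conflicts (pin with `==`, no local paths) can be checked before a deployment is attempted. A minimal sketch: `check_requirements` is a hypothetical helper, and it relies on simple string checks rather than a full requirements parser, so unusual specifier syntax may slip through.

```python
from typing import List


def check_requirements(text: str) -> List[str]:
    """Return problems found in the contents of a requirements.txt file."""
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith(("-e", ".", "/", "file:")) or " @ file:" in line:
            problems.append(f"line {lineno}: local path not allowed: {line}")
        elif line.startswith("-"):
            continue  # other pip options such as -r or --index-url
        elif "==" not in line:
            problems.append(f"line {lineno}: not pinned with '==': {line}")
    return problems


sample = """\
google-cloud-aiplatform[agent_engines]==1.120.0
requests>=2.0
-e ./local_pkg
# a comment
"""
for problem in check_requirements(sample):
    print(problem)
# line 2: not pinned with '==': requests>=2.0
# line 3: local path not allowed: -e ./local_pkg
```

Running a check like this as part of local testing keeps the failure on the developer's machine instead of surfacing as a parse error during Agent Engine creation.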

## Extension Points

- Custom orchestration patterns: replace Sequential with Parallel or Loop agents by changing the pipeline definition
- Additional A2A endpoints: extend the agent card with custom capabilities and task types
- CI/CD integration: wrap deployment commands in GitHub Actions with WIF authentication (see gh-actions-validator)
- Blue-green deployment: deploy new version alongside existing, validate, then switch traffic
- Multi-region deployment: extend deploy scripts to target multiple regions with traffic splitting
- Automated rollback: add rollback scripts that revert to previous agent version on validation failure
- Custom health checks: extend post-deploy validation with application-specific probes beyond A2A endpoints
@@ -0,0 +1,67 @@
# PRD: ADK Deployment Specialist

**Version:** 2.1.0
**Author:** Jeremy Longshore <jeremy@intentsolutions.io>
**Status:** Active
**Marketplace:** [tonsofskills.com](https://tonsofskills.com) by [Intent Solutions](https://intentsolutions.io)
**Portfolio:** [jeremylongshore.com](https://jeremylongshore.com)

---

## Problem Statement

Deploying ADK multi-agent systems to Vertex AI Agent Engine involves coordinating multiple complex surfaces: agent orchestration patterns (Sequential/Parallel/Loop), A2A protocol endpoints, Code Execution Sandbox configuration, Memory Bank state management, and IAM/networking setup. Getting any one of these wrong causes silent failures, broken inter-agent communication, or security vulnerabilities. Developers spend hours debugging deployment issues that stem from misconfigured agent contracts or missing A2A endpoints.

## Target Users

| User | Context | Primary Need |
|------|---------|-------------|
| AI Engineer | Building a new multi-agent system for production deployment | End-to-end deployment from local agent code to live Agent Engine endpoints |
| Platform Engineer | Migrating existing agents from local dev to Agent Engine | Reliable deployment pipeline with endpoint validation and rollback guidance |
| DevOps Engineer | Setting up CI/CD for agent deployments | Automated deployment commands with health checks and observability hooks |

## Success Criteria

1. Deploy an ADK agent to Agent Engine with verified A2A endpoints in under 15 minutes
2. All A2A protocol endpoints (AgentCard, task send, task status) respond correctly post-deployment
3. Code Execution Sandbox configured with 7-14 day TTL and SECURE_ISOLATED sandbox type
4. Memory Bank enabled with minimum 100-memory retention and Firestore encryption
5. Deployment includes observability: structured logging, retry/backoff, and health monitoring
6. Agent package excludes test files and dev dependencies (minimized cold start)

## Functional Requirements

1. Confirm the desired architecture (single vs multi-agent) and orchestration pattern (Sequential/Parallel/Loop)
2. Define AgentCard and A2A interfaces: inputs, outputs, task submission, and status polling contracts
3. Implement agent(s) with the minimum required tool surface including Code Execution Sandbox and Memory Bank as needed
4. Test locally with representative prompts and failure cases, then generate smoke tests for post-deploy
5. Deploy to Vertex AI Agent Engine using the Python SDK (`vertexai.Client().agent_engines.create()`)
6. Validate deployed endpoints: `/.well-known/agent-card`, `POST /v1/tasks:send`, `GET /v1/tasks/<id>`
7. Configure observability: structured logs, Cloud Monitoring dashboards, and retry/backoff for transient failures

## Non-Functional Requirements

- All deployments use OIDC/WIF for authentication; never commit long-lived service account keys
- Agent packages must exclude test files and dev dependencies to minimize cold start time
- Deployment commands must be idempotent (safe to re-run without side effects)
- Support for both greenfield deployments and updates to existing Agent Engine instances
- Local tests must pass before any deployment attempt (fail-fast principle)
- All generated code must include error handling for transient failures (retries with backoff)
- Deployment scripts must provide clear rollback instructions if validation fails
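
One way to approach the idempotency requirement is a find-then-create-or-update step keyed on a stable name. This is a minimal sketch with injected callables: `ensure_agent`, `list_agents`, `create_agent`, and `update_agent` are hypothetical stand-ins for whatever SDK calls the deployment actually uses, not real API names.

```python
from typing import Any, Callable, Dict, Iterable


def ensure_agent(
    display_name: str,
    config: Dict[str, Any],
    list_agents: Callable[[], Iterable[Dict[str, Any]]],
    create_agent: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    update_agent: Callable[[Dict[str, Any], Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Create the agent if absent, update it if its config changed, else do nothing."""
    existing = next((a for a in list_agents() if a["display_name"] == display_name), None)
    if existing is None:
        return create_agent(display_name, config)
    if existing.get("config") != config:
        return update_agent(existing, config)
    return existing  # already in the desired state; re-running is a no-op


# In-memory fakes to show that re-running does not create duplicates.
store = []


def fake_create(name, cfg):
    agent = {"display_name": name, "config": dict(cfg)}
    store.append(agent)
    return agent


def fake_update(agent, cfg):
    agent["config"] = dict(cfg)
    return agent


for _ in range(3):
    ensure_agent("deployer", {"region": "us-central1"}, lambda: store, fake_create, fake_update)
print(len(store))  # 1
```

The same function covers both greenfield deployments and updates to an existing instance, which is the pairing the requirements list asks for.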

## Dependencies

- Google Cloud project with Vertex AI API enabled and Agent Engine permissions
- ADK installed and pinned to the project's supported version
- Python SDK `google-cloud-aiplatform[agent_engines]>=1.120.0`
- `gcloud` CLI authenticated with deployment permissions
- A test runner (pytest) available in the repository

## Out of Scope

- Infrastructure provisioning with Terraform (handled by adk-infra-expert)
- Post-deployment inspection and scoring (handled by vertex-engine-inspector)
- CI/CD pipeline creation for GitHub Actions (handled by gh-actions-validator)
- Cost optimization and model selection strategy
- Agent application logic design (handled by adk-engineer)
- Multi-region deployment with traffic splitting