@musashishao/agent-kit 1.8.1 → 1.9.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.agent/agents/ai-architect.md +39 -0
- package/.agent/agents/cloud-engineer.md +39 -0
- package/.agent/agents/game-asset-curator.md +317 -0
- package/.agent/agents/game-developer.md +190 -89
- package/.agent/agents/game-narrative-designer.md +310 -0
- package/.agent/agents/game-qa-agent.md +441 -0
- package/.agent/agents/marketing-specialist.md +41 -0
- package/.agent/agents/penetration-tester.md +15 -1
- package/.agent/rules/CODEX.md +26 -2
- package/.agent/rules/GEMINI.md +7 -5
- package/.agent/rules/REFERENCE.md +92 -2
- package/.agent/scripts/ak_cli.py +1 -1
- package/.agent/scripts/localize_workflows.py +54 -0
- package/.agent/scripts/memory_manager.py +24 -1
- package/.agent/skills/3d-web-experience/SKILL.md +386 -0
- package/.agent/skills/DEPENDENCIES.md +54 -0
- package/.agent/skills/ab-test-setup/SKILL.md +77 -0
- package/.agent/skills/active-directory-attacks/SKILL.md +59 -0
- package/.agent/skills/agent-evaluation/SKILL.md +430 -0
- package/.agent/skills/agent-memory-systems/SKILL.md +426 -0
- package/.agent/skills/agent-tool-builder/SKILL.md +139 -0
- package/.agent/skills/ai-agents-architect/SKILL.md +115 -0
- package/.agent/skills/ai-product/SKILL.md +86 -0
- package/.agent/skills/ai-wrapper-product/SKILL.md +90 -0
- package/.agent/skills/analytics-tracking/SKILL.md +88 -0
- package/.agent/skills/api-fuzzing-bug-bounty/SKILL.md +66 -0
- package/.agent/skills/app-store-optimization/SKILL.md +66 -0
- package/.agent/skills/autonomous-agent-patterns/SKILL.md +414 -0
- package/.agent/skills/aws-penetration-testing/SKILL.md +50 -0
- package/.agent/skills/aws-serverless/SKILL.md +327 -0
- package/.agent/skills/azure-functions/SKILL.md +340 -0
- package/.agent/skills/broken-authentication/SKILL.md +53 -0
- package/.agent/skills/browser-automation/SKILL.md +408 -0
- package/.agent/skills/browser-extension-builder/SKILL.md +422 -0
- package/.agent/skills/bullmq-specialist/SKILL.md +424 -0
- package/.agent/skills/bun-development/SKILL.md +386 -0
- package/.agent/skills/burp-suite-testing/SKILL.md +60 -0
- package/.agent/skills/clerk-auth/SKILL.md +432 -0
- package/.agent/skills/cloud-penetration-testing/SKILL.md +51 -0
- package/.agent/skills/copywriting/SKILL.md +66 -0
- package/.agent/skills/crewai/SKILL.md +470 -0
- package/.agent/skills/discord-bot-architect/SKILL.md +447 -0
- package/.agent/skills/email-sequence/SKILL.md +73 -0
- package/.agent/skills/ethical-hacking-methodology/SKILL.md +67 -0
- package/.agent/skills/firebase/SKILL.md +377 -0
- package/.agent/skills/game-development/godot-expert/SKILL.md +462 -0
- package/.agent/skills/game-development/npc-ai-integration/SKILL.md +110 -0
- package/.agent/skills/game-development/procedural-generation/SKILL.md +168 -0
- package/.agent/skills/game-development/unity-integration/SKILL.md +358 -0
- package/.agent/skills/game-development/webgpu-shading/SKILL.md +209 -0
- package/.agent/skills/gcp-cloud-run/SKILL.md +358 -0
- package/.agent/skills/graphql/SKILL.md +492 -0
- package/.agent/skills/idor-testing/SKILL.md +64 -0
- package/.agent/skills/inngest/SKILL.md +128 -0
- package/.agent/skills/langfuse/SKILL.md +415 -0
- package/.agent/skills/langgraph/SKILL.md +360 -0
- package/.agent/skills/launch-strategy/SKILL.md +68 -0
- package/.agent/skills/linux-privilege-escalation/SKILL.md +62 -0
- package/.agent/skills/llm-app-patterns/SKILL.md +367 -0
- package/.agent/skills/marketing-ideas/SKILL.md +66 -0
- package/.agent/skills/metasploit-framework/SKILL.md +60 -0
- package/.agent/skills/micro-saas-launcher/SKILL.md +93 -0
- package/.agent/skills/neon-postgres/SKILL.md +339 -0
- package/.agent/skills/paid-ads/SKILL.md +64 -0
- package/.agent/skills/supabase-integration/SKILL.md +411 -0
- package/.agent/workflows/ai-agent.md +36 -0
- package/.agent/workflows/autofix.md +1 -0
- package/.agent/workflows/brainstorm.md +1 -0
- package/.agent/workflows/context.md +1 -0
- package/.agent/workflows/create.md +1 -0
- package/.agent/workflows/dashboard.md +1 -0
- package/.agent/workflows/debug.md +1 -0
- package/.agent/workflows/deploy.md +1 -0
- package/.agent/workflows/enhance.md +1 -0
- package/.agent/workflows/game-prototype.md +154 -0
- package/.agent/workflows/marketing.md +37 -0
- package/.agent/workflows/next.md +1 -0
- package/.agent/workflows/orchestrate.md +1 -0
- package/.agent/workflows/pentest.md +37 -0
- package/.agent/workflows/plan.md +1 -0
- package/.agent/workflows/preview.md +2 -1
- package/.agent/workflows/quality.md +1 -0
- package/.agent/workflows/saas.md +36 -0
- package/.agent/workflows/spec.md +1 -0
- package/.agent/workflows/status.md +1 -0
- package/.agent/workflows/test.md +1 -0
- package/.agent/workflows/ui-ux-pro-max.md +1 -0
- package/README.md +52 -24
- package/bin/cli.js +68 -3
- package/docs/CHANGELOG_AI_INFRA.md +30 -0
- package/docs/MIGRATION_GUIDE_V1.9.md +55 -0
- package/package.json +1 -1
package/.agent/skills/agent-memory-systems/SKILL.md
@@ -0,0 +1,426 @@
---
name: agent-memory-systems
description: "Memory architectures for intelligent agents. Covers short-term (context window), long-term (vector stores), episodic, semantic, and procedural memory. Understanding that memory failures look like intelligence failures - the hard part isn't storing, it's retrieving the right memory at the right time."
version: "1.0.0"
source: "antigravity-awesome-skills (adapted)"
---

# 🧠 Agent Memory Systems

> Memory is the cornerstone of intelligent agents. Without it, every interaction starts from zero.

You are a cognitive architect who understands that memory makes agents intelligent. The hard part isn't storing - it's retrieving the right memory at the right time.

**Core insight**: Memory failures look like intelligence failures. When an agent "forgets" or gives inconsistent answers, it's almost always a retrieval problem, not a storage problem.

---

## When to Use This Skill

- Designing memory architecture for AI agents
- Implementing long-term memory with vector stores
- Building conversational agents that remember
- Creating agents that learn from past interactions
- Optimizing retrieval for context windows

---

## 1. Memory Type Architecture

### Memory Types Overview

```
┌─────────────────────────────────────────────────────────────┐
│                    AGENT MEMORY SYSTEM                      │
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐       │
│  │   Working    │  │   Episodic   │  │   Semantic   │       │
│  │   Memory     │  │   Memory     │  │   Memory     │       │
│  │  (Context)   │  │   (Events)   │  │   (Facts)    │       │
│  └──────────────┘  └──────────────┘  └──────────────┘       │
│         │                 │                 │               │
│         ▼                 ▼                 ▼               │
│  ┌─────────────────────────────────────────────────────┐    │
│  │              Retrieval & Synthesis                  │    │
│  │          (What's relevant right now?)               │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘
```

| Memory Type | Description | Storage | Retrieval |
|-------------|-------------|---------|-----------|
| **Working** | Current context window | In-memory | Direct |
| **Episodic** | Past conversations, events | Vector DB | Semantic search |
| **Semantic** | Facts, knowledge | Vector DB + Graph | Query |
| **Procedural** | How to do things | Code/Prompts | Pattern match |

### Implementation

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

@dataclass
class Memory:
    id: str
    content: str
    memory_type: str  # working, episodic, semantic, procedural
    timestamp: datetime
    metadata: dict
    embedding: Optional[List[float]] = None
    importance: float = 0.5
    access_count: int = 0
    last_accessed: Optional[datetime] = None

class MemoryStore(ABC):
    @abstractmethod
    def store(self, memory: Memory) -> str:
        """Store a memory, return its ID."""

    @abstractmethod
    def retrieve(self, query: str, top_k: int = 5) -> List[Memory]:
        """Retrieve relevant memories."""

    @abstractmethod
    def forget(self, memory_id: str) -> bool:
        """Remove a memory."""

class AgentMemorySystem:
    def __init__(self):
        self.working_memory = WorkingMemory(max_tokens=8000)
        self.episodic_memory = EpisodicMemory(vector_db=ChromaDB())
        self.semantic_memory = SemanticMemory(vector_db=ChromaDB())
        self.procedural_memory = ProceduralMemory()

    def remember(self, content: str, memory_type: str, metadata: dict = None):
        """Store a new memory."""
        memory = Memory(
            id=generate_id(),
            content=content,
            memory_type=memory_type,
            timestamp=datetime.now(),
            metadata=metadata or {},
            importance=self._assess_importance(content)
        )

        if memory_type == "working":
            self.working_memory.add(memory)
        elif memory_type == "episodic":
            self.episodic_memory.store(memory)
        elif memory_type == "semantic":
            self.semantic_memory.store(memory)

    def recall(self, query: str, memory_types: List[str] = None) -> List[Memory]:
        """Retrieve relevant memories across types."""
        results = []

        types = memory_types or ["working", "episodic", "semantic"]

        if "working" in types:
            results.extend(self.working_memory.search(query))
        if "episodic" in types:
            results.extend(self.episodic_memory.retrieve(query))
        if "semantic" in types:
            results.extend(self.semantic_memory.retrieve(query))

        # Rank by relevance + recency + importance
        return self._rank_memories(results, query)
```

---

## 2. Working Memory (Context Window)

```python
class WorkingMemory:
    """
    Active context that fits in the LLM context window.
    Manages what the agent is "currently thinking about".
    """

    def __init__(self, max_tokens: int = 8000):
        self.max_tokens = max_tokens
        self.memories: List[Memory] = []

    def add(self, memory: Memory):
        """Add a memory, evicting others if necessary."""
        tokens = count_tokens(memory.content)

        # Evict low-priority memories until the new one fits
        while self._current_tokens() + tokens > self.max_tokens:
            self._evict_lowest_priority()

        self.memories.append(memory)

    def get_context(self) -> str:
        """Get formatted context for the LLM."""
        return "\n---\n".join(m.content for m in self.memories)

    def _evict_lowest_priority(self):
        """Remove the least important memory."""
        if not self.memories:
            return

        # Score by importance * recency
        scores = []
        for m in self.memories:
            age_hours = (datetime.now() - m.timestamp).total_seconds() / 3600
            recency_score = 1 / (1 + age_hours)
            scores.append((m.importance * recency_score, m))

        # Remove the lowest-scoring memory
        scores.sort(key=lambda x: x[0])
        self.memories.remove(scores[0][1])
```
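`WorkingMemory` relies on a `count_tokens` helper the skill leaves undefined. A minimal sketch, assuming `tiktoken` may be installed (the model name is an assumption, and the whitespace fallback deliberately overestimates so the budget stays safe):

```python
def count_tokens(text: str, model: str = "gpt-4o-mini") -> int:
    """Approximate token count; prefers tiktoken when available."""
    try:
        import tiktoken  # optional dependency (assumption: installed)
        enc = tiktoken.encoding_for_model(model)
        return len(enc.encode(text))
    except Exception:
        # Rough fallback: one token per whitespace-separated word
        return len(text.split())
```

Any tokenizer works here as long as it matches (or overestimates) the one your LLM provider bills by.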

---

## 3. Episodic Memory (Conversations & Events)

```python
class EpisodicMemory:
    """
    Stores past conversations and events.
    "What happened before?"
    """

    def __init__(self, vector_db):
        self.vector_db = vector_db
        self.collection = vector_db.get_or_create_collection("episodic")

    def store(self, memory: Memory):
        """Store an episode."""
        # Generate embedding
        embedding = embed(memory.content)

        # Store with metadata
        self.collection.add(
            ids=[memory.id],
            embeddings=[embedding],
            documents=[memory.content],
            metadatas=[{
                "timestamp": memory.timestamp.isoformat(),
                "session_id": memory.metadata.get("session_id"),
                "user_id": memory.metadata.get("user_id"),
                "importance": memory.importance
            }]
        )

    def retrieve(self, query: str, top_k: int = 5,
                 time_range: tuple = None,
                 session_id: str = None) -> List[Memory]:
        """Retrieve relevant episodes."""
        # Build metadata filter
        where = {}
        if session_id:
            where["session_id"] = session_id
        if time_range:
            where["timestamp"] = {
                "$gte": time_range[0].isoformat(),
                "$lte": time_range[1].isoformat()
            }

        results = self.collection.query(
            query_texts=[query],
            n_results=top_k,
            where=where if where else None
        )

        return self._results_to_memories(results)

    def summarize_session(self, session_id: str) -> str:
        """Create a summary of a session for compression."""
        episodes = self.retrieve("", session_id=session_id, top_k=100)

        # Use an LLM to summarize
        summary = llm.summarize([e.content for e in episodes])

        # Store the summary as a new memory
        self.store(Memory(
            id=f"summary_{session_id}",
            content=summary,
            memory_type="episodic",
            timestamp=datetime.now(),
            metadata={"is_summary": True, "session_id": session_id}
        ))

        return summary
```

---

## 4. Semantic Memory (Facts & Knowledge)

```python
class SemanticMemory:
    """
    Stores facts and knowledge.
    "What do I know about X?"
    """

    def __init__(self, vector_db):
        self.vector_db = vector_db
        self.collection = vector_db.get_or_create_collection("semantic")

    def store_fact(self, fact: str, source: str = None,
                   confidence: float = 1.0, category: str = None):
        """Store a fact with metadata."""
        memory = Memory(
            id=generate_id(),
            content=fact,
            memory_type="semantic",
            timestamp=datetime.now(),
            metadata={
                "source": source,
                "confidence": confidence,
                "category": category
            },
            importance=confidence
        )

        # Check for conflicts with existing facts
        existing = self.retrieve(fact, top_k=3)
        for e in existing:
            if self._is_contradictory(fact, e.content):
                # Store the conflict for later resolution
                self._store_conflict(fact, e)

        self.store(memory)

    def retrieve(self, query: str, top_k: int = 5,
                 category: str = None,
                 min_confidence: float = 0.5) -> List[Memory]:
        """Retrieve relevant facts."""
        where = {"confidence": {"$gte": min_confidence}}
        if category:
            where["category"] = category

        results = self.collection.query(
            query_texts=[query],
            n_results=top_k,
            where=where
        )

        return self._results_to_memories(results)
```

---

## 5. Retrieval Strategies

### 5.1 Hybrid Retrieval

```python
class HybridRetriever:
    """Combine multiple retrieval strategies."""

    def retrieve(self, query: str, memory_system: AgentMemorySystem) -> List[Memory]:
        # 1. Semantic search
        semantic_results = memory_system.episodic_memory.retrieve(query, top_k=10)

        # 2. Keyword search (BM25)
        keyword_results = self._bm25_search(query)

        # 3. Recency boost
        recent_results = memory_system.episodic_memory.retrieve(
            query,
            time_range=(datetime.now() - timedelta(hours=24), datetime.now())
        )

        # 4. Combine with reciprocal rank fusion
        all_results = self._reciprocal_rank_fusion([
            semantic_results,
            keyword_results,
            recent_results
        ], weights=[0.5, 0.3, 0.2])

        return all_results[:5]
```
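The `_reciprocal_rank_fusion` helper above is left undefined. A minimal sketch of weighted RRF, assuming results carry an `id` attribute for deduplication (the constant 60 is the conventional RRF smoothing term):

```python
def reciprocal_rank_fusion(result_lists, weights=None, k=60):
    """Merge ranked lists: score(item) = sum_i weight_i / (k + rank_i)."""
    weights = weights or [1.0] * len(result_lists)
    scores, items = {}, {}
    for ranked, w in zip(result_lists, weights):
        for rank, item in enumerate(ranked, start=1):
            key = getattr(item, "id", item)  # fall back to the item itself
            scores[key] = scores.get(key, 0.0) + w / (k + rank)
            items[key] = item
    ordered = sorted(scores, key=scores.get, reverse=True)
    return [items[key] for key in ordered]
```

Items that rank highly in several lists accumulate score and float to the top, which is exactly the behavior the hybrid retriever wants.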

### 5.2 Temporal Scoring

```python
def temporal_score(memory: Memory, query_time: datetime) -> float:
    """Score a memory by time relevance."""
    age = (query_time - memory.timestamp).total_seconds()

    # Exponential decay with a 24-hour half-life
    half_life_hours = 24
    decay = 0.5 ** (age / (half_life_hours * 3600))

    return memory.importance * decay
```

---

## 6. Anti-Patterns

### ❌ Store Everything Forever

```python
# WRONG: Never delete anything
def store(self, memory):
    self.db.insert(memory)  # Grows forever!

# CORRECT: Implement decay and consolidation
def store(self, memory):
    self.db.insert(memory)
    self._maybe_consolidate()  # Summarize old memories
    self._maybe_forget()       # Remove low-value memories
```
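A minimal sketch of the `_maybe_forget` idea, using the `Memory` fields defined earlier (`importance`, `access_count`, `timestamp`); the half-life and threshold values are illustrative assumptions, not package defaults:

```python
from datetime import datetime

def retention_score(memory, now=None, half_life_hours=72):
    """Higher means keep. Combines importance, usage, and exponential age decay."""
    now = now or datetime.now()
    age_hours = (now - memory.timestamp).total_seconds() / 3600
    decay = 0.5 ** (age_hours / half_life_hours)
    usage = 1 + memory.access_count  # memories that get retrieved survive longer
    return memory.importance * usage * decay

def maybe_forget(memories, threshold=0.05):
    """Return the memories worth keeping; the caller deletes the rest."""
    return [m for m in memories if retention_score(m) >= threshold]
```

Because frequently accessed memories accumulate `access_count`, they resist decay, which mimics the reinforcement behavior the anti-pattern asks for.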

### ❌ Chunk Without Testing Retrieval

```python
# WRONG: Chunk and hope for the best
chunks = text.split("\n\n")

# CORRECT: Test retrieval quality
chunks = chunk_document(text)
for test_query in test_queries:
    results = retrieve(test_query)
    assert any(expected_chunk in r for r in results)
```

### ❌ Single Memory Type for All Data

```python
# WRONG: Everything in one vector store
self.memory.store(conversation)   # Episodic
self.memory.store(user.name)      # Semantic
self.memory.store(code_pattern)   # Procedural

# CORRECT: Separate by type
self.episodic.store(conversation)
self.semantic.store(f"User name: {user.name}")
self.procedural.store(code_pattern)
```

---

## 7. Sharp Edges

| Issue | Severity | Solution |
|-------|----------|----------|
| Embedding drift over time | Critical | Track embedding model in metadata |
| Context overflow | High | Budget tokens per memory type |
| Stale information | High | Add temporal scoring |
| Contradictory facts | Medium | Detect conflicts on storage |
| Chunking breaks context | High | Use contextual chunking |
| Retrieval misses | High | Test with diverse queries |
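The first row's fix ("track embedding model in metadata") can be two small helpers; a minimal sketch, where the model name and metadata key are illustrative conventions, not part of this package:

```python
EMBED_MODEL = "text-embedding-3-small"  # assumption: the model currently in use

def tag_with_model(metadata: dict) -> dict:
    """Record which embedding model produced this vector."""
    return {**metadata, "embedding_model": EMBED_MODEL}

def needs_reembedding(metadata: dict) -> bool:
    """True if the stored vector came from a different (or untagged) model."""
    return metadata.get("embedding_model") != EMBED_MODEL
```

On a model upgrade, a background job can scan for `needs_reembedding` vectors and re-embed them instead of silently mixing incompatible embedding spaces.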

---

## Related Skills

- `llm-app-patterns` - LLM architecture patterns
- `rag-engineering` - Advanced RAG techniques
- `langgraph` - Agent frameworks with memory
- `autonomous-agent-patterns` - Agent design patterns
package/.agent/skills/agent-tool-builder/SKILL.md
@@ -0,0 +1,139 @@
---
name: agent-tool-builder
description: "Expertise in designing, building, and optimizing tools for AI agents. Covers MCP (Model Context Protocol), function calling schemas, tool security, and sandbox execution."
version: "1.0.0"
---

# 🛠️ Agent Tool Builder

You are an expert in building tools that AI agents use to interact with the world. You know how to design schemas that LLMs understand, handle errors gracefully, and ensure tools are secure and robust.

---

## When to Use This Skill

- Building MCP servers
- Designing function calling schemas (OpenAI/Anthropic)
- Creating custom tools for LangChain/CrewAI/LangGraph
- Implementing secure code execution sandboxes
- Optimizing tool descriptions for LLM selection

---

## Capabilities

- `mcp-server-development`
- `function-calling-schemas`
- `json-schema-design`
- `tool-security-sandboxing`
- `api-wrapper-tools`
- `browser-use-tools`

---

## 1. Tool Schema Design (JSON Schema)

The most important part of a tool is its description and schema. LLMs use these to decide when and how to call the tool.

```typescript
// Good tool definition
export const searchDocsSchema = {
  name: "search_documentation",
  description: "Search the project documentation for specific topics. Use this when the user asks 'how to' or for technical details.",
  parameters: {
    type: "object",
    properties: {
      query: {
        type: "string",
        description: "The search query (e.g., 'auth setup', 'database schema')"
      },
      category: {
        type: "string",
        enum: ["api", "guides", "deployment", "troubleshooting"],
        description: "Optional category to filter search results"
      }
    },
    required: ["query"]
  }
};
```

---

## 2. MCP Server (FastMCP Pattern)

```typescript
// server.ts
import { FastMCP } from "fastmcp";

const server = new FastMCP("System-Manager");

// Define a tool
server.addTool({
  name: "get_system_stats",
  description: "Get CPU and Memory usage of the host system",
  parameters: {
    type: "object",
    properties: {
      includeHistory: { type: "boolean" }
    }
  },
  execute: async ({ includeHistory }) => {
    const stats = await getStats(includeHistory);
    return JSON.stringify(stats);
  }
});

// Define a resource
server.addResource({
  uri: "system://logs",
  name: "System Logs",
  load: async () => {
    const logs = await readLogs();
    return logs;
  }
});

server.start();
```

---

## 3. Secure Execution (Sandboxing)

```python
import docker

def execute_in_sandbox(code: str) -> str:
    client = docker.from_env()
    # Pass the code as an argv element (not an interpolated shell string)
    # so quotes inside `code` cannot break out of the command.
    output = client.containers.run(
        "python:3.11-slim",
        command=["python", "-c", code],
        network_disabled=True,
        mem_limit="128m",
        cpu_period=100000,
        cpu_quota=50000,   # cap at 50% of one CPU
        remove=True,
        detach=False       # run() then returns the container's output as bytes
    )
    return output.decode("utf-8")
```

---

## 4. Best Practices

| Rule | Rationale |
|------|-----------|
| **Be Descriptive** | LLMs rely on the description to choose the tool. Avoid generic names like `run`. |
| **Keep it Simple** | Too many parameters confuse the model. Split complex tools into smaller ones. |
| **Handle Errors** | Always return a clear error message that the LLM can use to fix its call. |
| **Idempotency** | Ensure side-effect tools can be safely called multiple times by the agent. |
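The "Handle Errors" rule can be enforced once with a generic wrapper rather than per tool; a minimal sketch (the JSON envelope shape is an assumption, not an agent-kit convention):

```python
import json

def as_tool(fn):
    """Wrap a tool so the LLM always receives a structured result,
    never a raw exception traceback."""
    def wrapper(**kwargs):
        try:
            return json.dumps({"ok": True, "result": fn(**kwargs)})
        except Exception as e:
            # Tell the model what went wrong and how to retry
            return json.dumps({
                "ok": False,
                "error": f"{type(e).__name__}: {e}",
                "hint": "Check argument names and types, then call again.",
            })
    return wrapper

@as_tool
def divide(a: float, b: float) -> float:
    return a / b
```

A structured `{"ok": false, "error": ...}` response gives the model something it can reason about in its next turn, which is what the rationale column means by "fix its call".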

---

## Related Skills

- `autonomous-agent-patterns` - For the "Agent Loop"
- `mcp-builder` - Deep dive into MCP
- `api-patterns` - For tool backends
package/.agent/skills/ai-agents-architect/SKILL.md
@@ -0,0 +1,115 @@
---
name: ai-agents-architect
description: "Architecture and design of multi-agent systems. Covers agent communication, role delegation, task decomposition, and swarm intelligence."
version: "1.0.0"
---

# 🤖 AI Agents Architect

You are an architect specializing in Multi-Agent Systems (MAS). You know how to decompose complex tasks into sub-tasks and delegate them to specialized agents. You design robust communication protocols and handle agent orchestration at scale.

---

## When to Use This Skill

- Designing multi-agent workflows (Manager-Worker, Swarm, Hierarchical)
- Implementing agent communication protocols
- Balancing agent autonomy vs. controlled orchestration
- Solving tasks too complex for a single agent
- Designing agentic feedback loops

---

## Capabilities

- `agent-orchestration`
- `task-decomposition`
- `swarm-intelligence`
- `agent-communication-protocols`
- `dynamic-teaming`
- `error-propagation-in-mas`

---

## 1. Orchestration Patterns

### Hierarchical Orchestration (The Manager Pattern)

```python
class ManagerAgent:
    def __init__(self):
        self.specialists = {
            "coder": CoderAgent(),
            "reviewer": ReviewerAgent(),
            "docs": DocumentationAgent()
        }

    async def run_task(self, request: str):
        # 1. Plan
        plan = await self.decompose(request)

        # 2. Delegate & execute
        code = await self.specialists["coder"].execute(plan["code_specs"])
        review = await self.specialists["reviewer"].execute(code)

        # 3. Loop until quality is met
        # (production code should cap iterations to avoid looping forever)
        while review.has_bugs:
            code = await self.specialists["coder"].fix(review.bugs)
            review = await self.specialists["reviewer"].execute(code)

        return code
```

---

## 2. Communication Protocols

Agents should communicate using structured data or standardized thought protocols.

```json
{
  "from": "frontend_agent",
  "to": "backend_agent",
  "intent": "SCHEMA_REQUEST",
  "content": {
    "module": "Auth",
    "format": "OpenAPI"
  },
  "priority": "HIGH"
}
```
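Messages like this should be validated before they are routed; a minimal Python sketch using the field names shown above (the intent whitelist is a hypothetical example, and `from` is renamed because it is a Python keyword):

```python
from dataclasses import dataclass

ALLOWED_INTENTS = {"SCHEMA_REQUEST", "SCHEMA_RESPONSE", "TASK", "RESULT"}  # hypothetical

@dataclass
class AgentMessage:
    sender: str   # the protocol's "from" field
    to: str
    intent: str
    content: dict
    priority: str = "NORMAL"

def parse_message(raw: dict) -> AgentMessage:
    """Validate a raw dict against the protocol before routing it."""
    missing = {"from", "to", "intent", "content"} - raw.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if raw["intent"] not in ALLOWED_INTENTS:
        raise ValueError(f"unknown intent: {raw['intent']}")
    return AgentMessage(
        sender=raw["from"], to=raw["to"], intent=raw["intent"],
        content=raw["content"], priority=raw.get("priority", "NORMAL"),
    )
```

Rejecting malformed messages at the boundary keeps one agent's bad output from silently corrupting another agent's plan, which is the main error-propagation risk in MAS.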

---

## 3. Swarm Logic (Decentralized)

```python
class SwarmAgent:
    async def process(self, context):
        # Check if I can handle the task
        if self.can_handle(context):
            return await self.execute(context)
        else:
            # Pass to the most suitable neighbor
            # (a real swarm needs a hop limit or visited set to avoid cycles)
            neighbor = self.find_best_neighbor(context)
            return await neighbor.process(context)
```

---

## 4. Design Matrix

| Pattern | Best For | Trade-off |
|---------|----------|-----------|
| **Sequential** | Clear linear workflows | High latency, brittle |
| **Hierarchical** | Quality-critical tasks | Manager overhead |
| **Broadcast** | Open-ended brainstorming | High noise/token usage |
| **Swarm** | Large-scale parallel tasks | Difficult to debug |

---

## Related Skills

- `crewai` - For specific multi-agent frameworks
- `langgraph` - For stateful agent graphs
- `autonomous-agent-patterns` - Individual agent design