opencode-skills-collection 1.0.186 → 1.0.188

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (71)
  1. package/bundled-skills/.antigravity-install-manifest.json +5 -1
  2. package/bundled-skills/3d-web-experience/SKILL.md +152 -37
  3. package/bundled-skills/agent-evaluation/SKILL.md +1088 -26
  4. package/bundled-skills/agent-memory-systems/SKILL.md +1037 -25
  5. package/bundled-skills/agent-tool-builder/SKILL.md +668 -16
  6. package/bundled-skills/ai-agents-architect/SKILL.md +271 -31
  7. package/bundled-skills/ai-product/SKILL.md +716 -26
  8. package/bundled-skills/ai-wrapper-product/SKILL.md +450 -44
  9. package/bundled-skills/algolia-search/SKILL.md +867 -15
  10. package/bundled-skills/autonomous-agents/SKILL.md +1033 -26
  11. package/bundled-skills/aws-serverless/SKILL.md +1046 -35
  12. package/bundled-skills/azure-functions/SKILL.md +1318 -19
  13. package/bundled-skills/browser-automation/SKILL.md +1065 -28
  14. package/bundled-skills/browser-extension-builder/SKILL.md +159 -32
  15. package/bundled-skills/bullmq-specialist/SKILL.md +347 -16
  16. package/bundled-skills/clerk-auth/SKILL.md +796 -15
  17. package/bundled-skills/computer-use-agents/SKILL.md +1870 -28
  18. package/bundled-skills/context-window-management/SKILL.md +271 -18
  19. package/bundled-skills/conversation-memory/SKILL.md +453 -24
  20. package/bundled-skills/crewai/SKILL.md +252 -46
  21. package/bundled-skills/discord-bot-architect/SKILL.md +1207 -34
  22. package/bundled-skills/docs/integrations/jetski-cortex.md +3 -3
  23. package/bundled-skills/docs/integrations/jetski-gemini-loader/README.md +1 -1
  24. package/bundled-skills/docs/maintainers/repo-growth-seo.md +3 -3
  25. package/bundled-skills/docs/maintainers/skills-update-guide.md +1 -1
  26. package/bundled-skills/docs/users/bundles.md +1 -1
  27. package/bundled-skills/docs/users/claude-code-skills.md +1 -1
  28. package/bundled-skills/docs/users/gemini-cli-skills.md +1 -1
  29. package/bundled-skills/docs/users/getting-started.md +1 -1
  30. package/bundled-skills/docs/users/kiro-integration.md +1 -1
  31. package/bundled-skills/docs/users/usage.md +4 -4
  32. package/bundled-skills/docs/users/visual-guide.md +4 -4
  33. package/bundled-skills/email-systems/SKILL.md +646 -26
  34. package/bundled-skills/faf-expert/SKILL.md +221 -0
  35. package/bundled-skills/faf-wizard/SKILL.md +252 -0
  36. package/bundled-skills/file-uploads/SKILL.md +212 -11
  37. package/bundled-skills/firebase/SKILL.md +646 -16
  38. package/bundled-skills/gcp-cloud-run/SKILL.md +1117 -32
  39. package/bundled-skills/graphql/SKILL.md +1026 -27
  40. package/bundled-skills/hubspot-integration/SKILL.md +804 -19
  41. package/bundled-skills/idea-darwin/SKILL.md +120 -0
  42. package/bundled-skills/inngest/SKILL.md +431 -16
  43. package/bundled-skills/interactive-portfolio/SKILL.md +342 -44
  44. package/bundled-skills/langfuse/SKILL.md +296 -41
  45. package/bundled-skills/langgraph/SKILL.md +259 -50
  46. package/bundled-skills/micro-saas-launcher/SKILL.md +343 -44
  47. package/bundled-skills/neon-postgres/SKILL.md +572 -15
  48. package/bundled-skills/nextjs-supabase-auth/SKILL.md +269 -21
  49. package/bundled-skills/notion-template-business/SKILL.md +371 -44
  50. package/bundled-skills/personal-tool-builder/SKILL.md +537 -44
  51. package/bundled-skills/plaid-fintech/SKILL.md +825 -19
  52. package/bundled-skills/prompt-caching/SKILL.md +438 -25
  53. package/bundled-skills/rag-engineer/SKILL.md +271 -29
  54. package/bundled-skills/salesforce-development/SKILL.md +912 -19
  55. package/bundled-skills/satori/SKILL.md +54 -0
  56. package/bundled-skills/scroll-experience/SKILL.md +381 -44
  57. package/bundled-skills/segment-cdp/SKILL.md +817 -19
  58. package/bundled-skills/shopify-apps/SKILL.md +1475 -19
  59. package/bundled-skills/slack-bot-builder/SKILL.md +1162 -28
  60. package/bundled-skills/telegram-bot-builder/SKILL.md +152 -37
  61. package/bundled-skills/telegram-mini-app/SKILL.md +445 -44
  62. package/bundled-skills/trigger-dev/SKILL.md +916 -27
  63. package/bundled-skills/twilio-communications/SKILL.md +1310 -28
  64. package/bundled-skills/upstash-qstash/SKILL.md +898 -27
  65. package/bundled-skills/vercel-deployment/SKILL.md +637 -39
  66. package/bundled-skills/viral-generator-builder/SKILL.md +132 -37
  67. package/bundled-skills/voice-agents/SKILL.md +937 -27
  68. package/bundled-skills/voice-ai-development/SKILL.md +375 -46
  69. package/bundled-skills/workflow-automation/SKILL.md +982 -29
  70. package/bundled-skills/zapier-make-patterns/SKILL.md +772 -27
  71. package/package.json +1 -1
@@ -1,21 +1,38 @@
1
1
  ---
2
2
  name: agent-memory-systems
3
- description: "You are a cognitive architect who understands that memory makes agents intelligent. You've built memory systems for agents handling millions of interactions. You know that the hard part isn't storing - it's retrieving the right memory at the right time."
3
+ description: "Memory is the cornerstone of intelligent agents. Without it, every
4
+ interaction starts from zero. This skill covers the architecture of agent
5
+ memory: short-term (context window), long-term (vector stores), and the
6
+ cognitive architectures that organize them."
4
7
  risk: safe
5
- source: "vibeship-spawner-skills (Apache 2.0)"
6
- date_added: "2026-02-27"
8
+ source: vibeship-spawner-skills (Apache 2.0)
9
+ date_added: 2026-02-27
7
10
  ---
8
11
 
9
12
  # Agent Memory Systems
10
13
 
11
- You are a cognitive architect who understands that memory makes agents intelligent.
12
- You've built memory systems for agents handling millions of interactions. You know
13
- that the hard part isn't storing - it's retrieving the right memory at the right time.
14
+ Memory is the cornerstone of intelligent agents. Without it, every interaction
15
+ starts from zero. This skill covers the architecture of agent memory: short-term
16
+ (context window), long-term (vector stores), and the cognitive architectures
17
+ that organize them.
14
18
 
15
- Your core insight: Memory failures look like intelligence failures. When an agent
16
- "forgets" or gives inconsistent answers, it's almost always a retrieval problem,
17
- not a storage problem. You obsess over chunking strategies, embedding quality,
18
- and
19
+ Key insight: Memory isn't just storage - it's retrieval. A million stored facts
20
+ mean nothing if you can't find the right one. Chunking, embedding, and retrieval
21
+ strategies determine whether your agent remembers or forgets.
22
+
23
+ The field is fragmented with inconsistent terminology. We use the CoALA cognitive
24
+ architecture framework: semantic memory (facts), episodic memory (experiences),
25
+ and procedural memory (how-to knowledge).
26
+
27
+ ## Principles
28
+
29
+ - Memory quality = retrieval quality, not storage quantity
30
+ - Chunk for retrieval, not for storage
31
+ - Context isolation is the enemy of memory
32
+ - Right memory type for right information
33
+ - Decay old memories - not everything should be forever
34
+ - Test retrieval accuracy before production
35
+ - Background memory formation beats real-time
19
36
 
20
37
  ## Capabilities
21
38
 
@@ -30,43 +47,1038 @@ and
30
47
  - memory-formation
31
48
  - memory-decay
32
49
 
50
+ ## Scope
51
+
52
+ - vector-database-operations → data-engineer
53
+ - rag-pipeline-architecture → llm-architect
54
+ - embedding-model-selection → ml-engineer
55
+ - knowledge-graph-design → knowledge-engineer
56
+
57
+ ## Tooling
58
+
59
+ ### Memory_frameworks
60
+
61
+ - LangMem (LangChain) - When: LangGraph agents with persistent memory Note: Semantic, episodic, procedural memory types
62
+ - MemGPT / Letta - When: Virtual context management, OS-style memory Note: Hierarchical memory tiers, automatic paging
63
+ - Mem0 - When: User memory layer for personalization Note: Designed for user preferences and history
64
+
65
+ ### Vector_stores
66
+
67
+ - Pinecone - When: Managed, enterprise-scale (billions of vectors) Note: Best query performance, highest cost
68
+ - Qdrant - When: Complex metadata filtering, open-source Note: Rust-based, excellent filtering
69
+ - Weaviate - When: Hybrid search, knowledge graph features Note: GraphQL interface, good for relationships
70
+ - ChromaDB - When: Prototyping, small/medium apps Note: Developer-friendly, ~20ms p50 at 100K vectors
71
+ - pgvector - When: Already using PostgreSQL, simpler setup Note: Good for <1M vectors, familiar tooling
72
+
73
+ ### Embedding_models
74
+
75
+ - OpenAI text-embedding-3-large - When: Best quality, 3072 dimensions Note: $0.13/1M tokens
76
+ - OpenAI text-embedding-3-small - When: Good balance, 1536 dimensions Note: $0.02/1M tokens, about 6.5x cheaper than text-embedding-3-large
77
+ - nomic-embed-text-v1.5 - When: Open-source, local deployment Note: 768 dimensions, good quality
78
+ - all-MiniLM-L6-v2 - When: Lightweight, fast local embedding Note: 384 dimensions, lowest latency
79
+
33
80
  ## Patterns
34
81
 
35
82
  ### Memory Type Architecture
36
83
 
37
84
  Choosing the right memory type for different information
38
85
 
86
+ **When to use**: Designing agent memory system
87
+
88
+ # MEMORY TYPE ARCHITECTURE (CoALA Framework):
89
+
90
+ """
91
+ Three memory types for different purposes:
92
+
93
+ 1. Semantic Memory: Facts and knowledge
94
+ - What you know about the world
95
+ - User preferences, domain knowledge
96
+ - Stored in profiles (structured) or collections (unstructured)
97
+
98
+ 2. Episodic Memory: Experiences and events
99
+ - What happened (timestamped events)
100
+ - Past conversations, task outcomes
101
+ - Used for learning from experience
102
+
103
+ 3. Procedural Memory: How to do things
104
+ - Rules, skills, workflows
105
+ - Often implemented as few-shot examples
106
+ - "How did I solve this before?"
107
+ """
108
+
109
+ ## LangMem Implementation (illustrative API sketch)
110
+ """
111
+ from langmem import MemoryStore
112
+ from langgraph.graph import StateGraph
113
+
114
+ # Initialize memory store
115
+ memory = MemoryStore(
116
+ connection_string=os.environ["POSTGRES_URL"]
117
+ )
118
+
119
+ # Semantic memory: user profile
120
+ await memory.semantic.upsert(
121
+ namespace="user_profile",
122
+ key=user_id,
123
+ content={
124
+ "name": "Alice",
125
+ "preferences": ["dark mode", "concise responses"],
126
+ "expertise_level": "developer",
127
+ }
128
+ )
129
+
130
+ # Episodic memory: past interaction
131
+ await memory.episodic.add(
132
+ namespace="conversations",
133
+ content={
134
+ "timestamp": datetime.now(),
135
+ "summary": "Helped debug authentication issue",
136
+ "outcome": "resolved",
137
+ "key_insights": ["Token expiry was root cause"],
138
+ },
139
+ metadata={"user_id": user_id, "topic": "debugging"}
140
+ )
141
+
142
+ # Procedural memory: learned pattern
143
+ await memory.procedural.add(
144
+ namespace="skills",
145
+ content={
146
+ "task_type": "debug_auth",
147
+ "steps": ["Check token expiry", "Verify refresh flow"],
148
+ "example_interaction": few_shot_example,
149
+ }
150
+ )
151
+ """
152
+
153
+ ## Memory Retrieval at Runtime
154
+ """
155
+ async def prepare_context(user_id, query):
156
+ # Get user profile (semantic)
157
+ profile = await memory.semantic.get(
158
+ namespace="user_profile",
159
+ key=user_id
160
+ )
161
+
162
+ # Find relevant past experiences (episodic)
163
+ similar_experiences = await memory.episodic.search(
164
+ namespace="conversations",
165
+ query=query,
166
+ filter={"user_id": user_id},
167
+ limit=3
168
+ )
169
+
170
+ # Find relevant skills (procedural)
171
+ relevant_skills = await memory.procedural.search(
172
+ namespace="skills",
173
+ query=query,
174
+ limit=2
175
+ )
176
+
177
+ return {
178
+ "profile": profile,
179
+ "past_experiences": similar_experiences,
180
+ "relevant_skills": relevant_skills,
181
+ }
182
+ """
183
+
39
184
  ### Vector Store Selection Pattern
40
185
 
41
186
  Choosing the right vector database for your use case
42
187
 
188
+ **When to use**: Setting up persistent memory storage
189
+
190
+ # VECTOR STORE SELECTION:
191
+
192
+ """
193
+ Decision matrix:
194
+
195
+ | | Pinecone | Qdrant | Weaviate | ChromaDB | pgvector |
196
+ |------------|----------|--------|----------|----------|----------|
197
+ | Scale | Billions | 100M+ | 100M+ | 1M | 1M |
198
+ | Managed | Yes | Both | Both | Self | Self |
199
+ | Filtering | Basic | Best | Good | Basic | SQL |
200
+ | Hybrid | No | Yes | Best | No | Yes |
201
+ | Cost | High | Medium | Medium | Free | Free |
202
+ | Latency | 5ms | 7ms | 10ms | 20ms | 15ms |
203
+ """
204
+
205
+ ## Pinecone (Enterprise Scale)
206
+ """
207
+ from pinecone import Pinecone
208
+
209
+ pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
210
+ index = pc.Index("agent-memory")
211
+
212
+ # Upsert with metadata
213
+ index.upsert(
214
+ vectors=[
215
+ {
216
+ "id": f"memory-{uuid4()}",
217
+ "values": embedding,
218
+ "metadata": {
219
+ "user_id": user_id,
220
+ "timestamp": datetime.now().isoformat(),
221
+ "type": "episodic",
222
+ "content": memory_text,
223
+ }
224
+ }
225
+ ],
226
+ namespace=namespace
227
+ )
228
+
229
+ # Query with filter
230
+ results = index.query(
231
+ vector=query_embedding,
232
+ filter={"user_id": user_id, "type": "episodic"},
233
+ top_k=5,
234
+ include_metadata=True
235
+ )
236
+ """
237
+
238
+ ## Qdrant (Complex Filtering)
239
+ """
240
+ from qdrant_client import QdrantClient
241
+ from qdrant_client.models import PointStruct, Filter, FieldCondition
242
+
243
+ client = QdrantClient(url="http://localhost:6333")
244
+
245
+ # Complex filtering with Qdrant
246
+ results = client.search(
247
+ collection_name="agent_memory",
248
+ query_vector=query_embedding,
249
+ query_filter=Filter(
250
+ must=[
251
+ FieldCondition(key="user_id", match={"value": user_id}),
252
+ FieldCondition(key="type", match={"value": "semantic"}),
253
+ ],
254
+ should=[
255
+ FieldCondition(key="topic", match={"any": ["auth", "security"]}),
256
+ ]
257
+ ),
258
+ limit=5
259
+ )
260
+ """
261
+
262
+ ## ChromaDB (Prototyping)
263
+ """
264
+ import chromadb
265
+
266
+ client = chromadb.PersistentClient(path="./memory_db")
267
+ collection = client.get_or_create_collection("agent_memory")
268
+
269
+ # Simple and fast for prototypes
270
+ collection.add(
271
+ ids=[str(uuid4())],
272
+ embeddings=[embedding],
273
+ documents=[memory_text],
274
+ metadatas=[{"user_id": user_id, "type": "episodic"}]
275
+ )
276
+
277
+ results = collection.query(
278
+ query_embeddings=[query_embedding],
279
+ n_results=5,
280
+ where={"user_id": user_id}
281
+ )
282
+ """
283
+
43
284
  ### Chunking Strategy Pattern
44
285
 
45
286
  Breaking documents into retrievable chunks
46
287
 
47
- ## Anti-Patterns
288
+ **When to use**: Processing documents for memory storage
289
+
290
+ # CHUNKING STRATEGIES:
291
+
292
+ """
293
+ The chunking dilemma:
294
+ - Too large: Vector loses specificity
295
+ - Too small: Loses context
296
+
297
+ Optimal chunk size depends on:
298
+ - Document type (code vs prose vs data)
299
+ - Query patterns (factual vs exploratory)
300
+ - Embedding model (each has sweet spot)
301
+
302
+ General guidance: 256-512 tokens for most use cases
303
+ """
304
+
305
+ ## Fixed-Size Chunking (Baseline)
306
+ """
307
+ from langchain_text_splitters import RecursiveCharacterTextSplitter
308
+
309
+ splitter = RecursiveCharacterTextSplitter(
310
+ chunk_size=500, # Characters
311
+ chunk_overlap=50, # Overlap prevents cutting sentences
312
+ separators=["\n\n", "\n", ". ", " ", ""] # Priority order
313
+ )
314
+
315
+ chunks = splitter.split_text(document)
316
+ """
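The overlap idea in the fixed-size baseline can be shown without any library. The following is a minimal sketch, not LangChain's implementation: the helper name `chunk_with_overlap` is hypothetical, and it slices on raw character offsets only, ignoring the separator priority that `RecursiveCharacterTextSplitter` applies.

```python
def chunk_with_overlap(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into character windows where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text,
        # so no trailing chunk is fully contained in the previous one.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Because each window starts `chunk_size - overlap` characters after the previous one, the last `overlap` characters of a chunk reappear at the start of the next, which is what keeps a sentence cut at a boundary retrievable from at least one chunk.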
317
+
318
+ ## Semantic Chunking (Better Quality)
319
+ """
320
+ from langchain_experimental.text_splitter import SemanticChunker
321
+ from langchain_openai import OpenAIEmbeddings
322
+
323
+ # Splits based on semantic similarity
324
+ splitter = SemanticChunker(
325
+ embeddings=OpenAIEmbeddings(),
326
+ breakpoint_threshold_type="percentile",
327
+ breakpoint_threshold_amount=95
328
+ )
329
+
330
+ chunks = splitter.split_text(document)
331
+ """
332
+
333
+ ## Structure-Aware Chunking (Documents with Hierarchy)
334
+ """
335
+ from langchain_text_splitters import MarkdownHeaderTextSplitter
336
+
337
+ # Respect document structure
338
+ splitter = MarkdownHeaderTextSplitter(
339
+ headers_to_split_on=[
340
+ ("#", "Header 1"),
341
+ ("##", "Header 2"),
342
+ ("###", "Header 3"),
343
+ ]
344
+ )
345
+
346
+ chunks = splitter.split_text(markdown_doc)
347
+ # Each chunk has header metadata for context
348
+ """
349
+
350
+ ## Contextual Chunking (Anthropic's Approach)
351
+ """
352
+ # Add context to each chunk before embedding
353
+ # Anthropic reports a 35% lower top-20 retrieval failure rate with contextual embeddings
354
+
355
+ def add_context_to_chunk(chunk, document_summary):
356
+ context_prompt = f'''
357
+ Document summary: {document_summary}
358
+
359
+ The following is a chunk from this document:
360
+ {chunk}
361
+ '''
362
+ return context_prompt
363
+
364
+ # Embed the contextualized chunk, not raw chunk
365
+ for chunk in chunks:
366
+ contextualized = add_context_to_chunk(chunk, summary)
367
+ embedding = embed(contextualized)
368
+ store(chunk, embedding) # Store original, embed contextualized
369
+ """
370
+
371
+ ## Code-Specific Chunking
372
+ """
373
+ from langchain_text_splitters import Language, RecursiveCharacterTextSplitter
374
+
375
+ # Language-aware splitting
376
+ python_splitter = RecursiveCharacterTextSplitter.from_language(
377
+ language=Language.PYTHON,
378
+ chunk_size=1000,
379
+ chunk_overlap=200
380
+ )
381
+
382
+ # Respects function/class boundaries
383
+ chunks = python_splitter.split_text(python_code)
384
+ """
385
+
386
+ ### Background Memory Formation
387
+
388
+ Processing memories asynchronously for better quality
389
+
390
+ **When to use**: You want higher recall without slowing interactions
391
+
392
+ # BACKGROUND MEMORY FORMATION:
393
+
394
+ """
395
+ Real-time memory extraction slows conversations and adds
396
+ complexity to agent tool calls. Background processing after
397
+ conversations yields higher quality memories.
398
+
399
+ Pattern: Subconscious memory formation
400
+ """
401
+
402
+ ## LangGraph Background Processing
403
+ """
404
+ from langgraph.graph import StateGraph
405
+ from langgraph.checkpoint.postgres import PostgresSaver
406
+
407
+ async def background_memory_processor(thread_id: str):
408
+ # Run after conversation ends or goes idle
409
+ conversation = await load_conversation(thread_id)
410
+
411
+ # Extract insights without time pressure
412
+ insights = await llm.invoke(f'''
413
+ Analyze this conversation and extract:
414
+ 1. Key facts learned about the user
415
+ 2. User preferences revealed
416
+ 3. Tasks completed or pending
417
+ 4. Patterns in user behavior
418
+
419
+ Be thorough - this runs in background.
420
+
421
+ Conversation:
422
+ {conversation}
423
+ ''')
424
+
425
+ # Store to long-term memory
426
+ for insight in insights:
427
+ await memory.semantic.upsert(
428
+ namespace="user_insights",
429
+ key=generate_key(insight),
430
+ content=insight,
431
+ metadata={"source_thread": thread_id}
432
+ )
433
+
434
+ # Trigger on conversation end or idle timeout
435
+ @on_conversation_idle(timeout_minutes=5)
436
+ async def process_conversation(thread_id):
437
+ await background_memory_processor(thread_id)
438
+ """
439
+
440
+ ## Memory Consolidation (Like Sleep)
441
+ """
442
+ # Periodically consolidate and deduplicate memories
443
+
444
+ async def consolidate_memories(user_id: str):
445
+ # Get all memories for user
446
+ memories = await memory.semantic.list(
447
+ namespace="user_insights",
448
+ filter={"user_id": user_id}
449
+ )
450
+
451
+ # Find similar memories (potential duplicates)
452
+ clusters = cluster_by_similarity(memories, threshold=0.9)
453
+
454
+ # Merge similar memories
455
+ for cluster in clusters:
456
+ if len(cluster) > 1:
457
+ merged = await llm.invoke(f'''
458
+ Consolidate these related memories into one:
459
+ {cluster}
460
+
461
+ Preserve all important information.
462
+ ''')
463
+ await memory.semantic.upsert(
464
+ namespace="user_insights",
465
+ key=generate_key(merged),
466
+ content=merged
467
+ )
468
+ # Delete originals
469
+ for old in cluster:
470
+ await memory.semantic.delete(old.id)
471
+ """
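The consolidation step relies on a `cluster_by_similarity` helper that the skill does not define. One possible shape for it is sketched below, under stated assumptions: it works on raw embedding vectors rather than memory objects, returns index lists, and uses a simple greedy rule (compare against each cluster's first member). The names `cosine` and `greedy_cluster` are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def greedy_cluster(embeddings: list[list[float]], threshold: float = 0.9) -> list[list[int]]:
    """Group vector indices; a vector joins the first cluster whose seed is similar enough."""
    clusters: list[list[int]] = []
    for i, vec in enumerate(embeddings):
        for cluster in clusters:
            seed = embeddings[cluster[0]]  # first member represents the cluster
            if cosine(vec, seed) >= threshold:
                cluster.append(i)
                break
        else:
            # No existing cluster was close enough, so start a new one.
            clusters.append([i])
    return clusters
```

Clusters with more than one index are the merge candidates. The greedy rule is order-dependent and quadratic in the worst case, so it suits per-user memory sets rather than a whole corpus.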
472
+
473
+ ### Memory Decay Pattern
474
+
475
+ Forgetting old, irrelevant memories
476
+
477
+ **When to use**: Memory grows large, retrieval slows down
478
+
479
+ # MEMORY DECAY:
480
+
481
+ """
482
+ Not all memories should live forever:
483
+ - Old preferences may be outdated
484
+ - Task details lose relevance
485
+ - Conflicting memories confuse retrieval
486
+
487
+ Implement intelligent decay based on:
488
+ - Recency (when was it created/accessed?)
489
+ - Frequency (how often is it retrieved?)
490
+ - Importance (is it a core fact or detail?)
491
+ """
492
+
493
+ ## Time-Based Decay
494
+ """
495
+ from datetime import datetime, timedelta
496
+
497
+ async def decay_old_memories(namespace: str, max_age_days: int):
498
+ cutoff = datetime.now() - timedelta(days=max_age_days)
499
+
500
+ old_memories = await memory.episodic.list(
501
+ namespace=namespace,
502
+ filter={"last_accessed": {"$lt": cutoff.isoformat()}}
503
+ )
504
+
505
+ for mem in old_memories:
506
+ # Soft delete (mark as archived)
507
+ await memory.episodic.update(
508
+ id=mem.id,
509
+ metadata={"archived": True, "archived_at": datetime.now()}
510
+ )
511
+ """
512
+
513
+ ## Utility-Based Decay (MIRIX Approach)
514
+ """
515
+ def calculate_memory_utility(memory):
516
+ '''
517
+ Composite utility score inspired by cognitive science:
518
+ - Recency: When was it last accessed?
519
+ - Frequency: How often is it accessed?
520
+ - Importance: How critical is this information?
521
+ '''
522
+ now = datetime.now()
523
+
524
+ # Recency score (exponential decay with 72h half-life)
525
+ hours_since_access = (now - memory.last_accessed).total_seconds() / 3600
526
+ recency_score = 0.5 ** (hours_since_access / 72)
527
+
528
+ # Frequency score
529
+ frequency_score = min(memory.access_count / 10, 1.0)
530
+
531
+ # Importance (from metadata or heuristic)
532
+ importance = memory.metadata.get("importance", 0.5)
533
+
534
+ # Weighted combination
535
+ utility = (
536
+ 0.4 * recency_score +
537
+ 0.3 * frequency_score +
538
+ 0.3 * importance
539
+ )
540
+
541
+ return utility
542
+
543
+ async def prune_low_utility_memories(threshold=0.2):
544
+ all_memories = await memory.list_all()
545
+ for mem in all_memories:
546
+ if calculate_memory_utility(mem) < threshold:
547
+ await memory.archive(mem.id)
548
+ """
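The weighted score above can be checked by hand. The sketch below restates it in self-contained form, assuming a hypothetical `MemoryRecord` dataclass and an explicit `now` argument (neither appears in the original) so the arithmetic is reproducible.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class MemoryRecord:
    """Hypothetical record type carrying only the fields the score needs."""
    last_accessed: datetime
    access_count: int = 0
    metadata: dict = field(default_factory=dict)


def memory_utility(mem: MemoryRecord, now: datetime, half_life_hours: float = 72.0) -> float:
    """Weighted mix of recency (0.4), frequency (0.3) and importance (0.3)."""
    hours_since_access = (now - mem.last_accessed).total_seconds() / 3600
    recency = 0.5 ** (hours_since_access / half_life_hours)  # halves every 72 hours
    frequency = min(mem.access_count / 10, 1.0)               # saturates at 10 accesses
    importance = mem.metadata.get("importance", 0.5)
    return 0.4 * recency + 0.3 * frequency + 0.3 * importance


now = datetime(2025, 1, 1)
fresh = MemoryRecord(last_accessed=now - timedelta(hours=72), access_count=5)
stale = MemoryRecord(
    last_accessed=now - timedelta(days=30),
    access_count=0,
    metadata={"importance": 0.1},
)
print(round(memory_utility(fresh, now), 3))
print(round(memory_utility(stale, now), 3))
```

A memory last touched one half-life ago with 5 accesses and default importance scores 0.4 × 0.5 + 0.3 × 0.5 + 0.3 × 0.5 = 0.5. One untouched for 30 days with no accesses and importance 0.1 scores about 0.03, which falls below the 0.2 pruning threshold.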
549
+
550
+ ## Sharp Edges
551
+
552
+ ### Chunking Isolates Information From Its Context
553
+
554
+ Severity: CRITICAL
555
+
556
+ Situation: Processing documents for vector storage
557
+
558
+ Symptoms:
559
+ Retrieval finds chunks but they don't make sense alone. Agent
560
+ answers miss the big picture. "The function returns X" retrieved
561
+ without knowing which function. References to "this" without
562
+ knowing what "this" refers to.
563
+
564
+ Why this breaks:
565
+ When we chunk for AI processing, we're breaking connections,
566
+ reducing a holistic narrative to isolated fragments that often
567
+ miss the big picture. A chunk about "the configuration" without
568
+ context about what system is being configured is nearly useless.
569
+
570
+ Recommended fix:
571
+
572
+ ## Contextual Chunking (Anthropic's approach)
573
+ # Add document context to each chunk before embedding
574
+ # Anthropic reports a 35% lower top-20 retrieval failure rate with contextual embeddings
575
+
576
+ def contextualize_chunk(chunk, document):
577
+ summary = summarize(document)
578
+
579
+ # LLM generates context for chunk
580
+ context = llm.invoke(f'''
581
+ Document summary: {summary}
582
+
583
+ Generate a brief context statement for this chunk
584
+ that would help someone understand what it refers to:
585
+
586
+ {chunk}
587
+ ''')
588
+
589
+ return f"{context}\n\n{chunk}"
590
+
591
+ # Embed the contextualized version
592
+ for chunk in chunks:
593
+ contextualized = contextualize_chunk(chunk, full_doc)
594
+ embedding = embed(contextualized)
595
+ # Store original chunk, embed contextualized
596
+ store(original=chunk, embedding=embedding)
597
+
598
+ ## Hierarchical Chunking
599
+ # Store at multiple granularities
600
+ chunks_small = split(doc, size=256)
601
+ chunks_medium = split(doc, size=512)
602
+ chunks_large = split(doc, size=1024)
603
+
604
+ # Retrieve at appropriate level based on query
605
+
606
+ ### Chunk Size Mismatched to Query Patterns
607
+
608
+ Severity: HIGH
609
+
610
+ Situation: Configuring chunking for memory storage
611
+
612
+ Symptoms:
613
+ High-quality documents produce low-quality retrievals. Simple
614
+ questions miss relevant information. Complex questions get
615
+ fragments instead of complete answers.
616
+
617
+ Why this breaks:
618
+ Optimal chunk size depends on query patterns:
619
+ - Factual queries need small, specific chunks
620
+ - Conceptual queries need larger context
621
+ - Code needs function-level boundaries
48
622
 
49
- ### Store Everything Forever
623
+ The sweet spot varies by document type and embedding model.
624
+ A default of 1000 characters is not tuned for any particular content type.
50
625
 
51
- ### ❌ Chunk Without Testing Retrieval
626
+ Recommended fix:
52
627
 
53
- ### Single Memory Type for All Data
628
+ ## Test different sizes
629
+ from sklearn.metrics import recall_score
54
630
 
55
- ## ⚠️ Sharp Edges
631
+ def evaluate_chunk_size(documents, test_queries, chunk_size):
632
+ chunks = split_documents(documents, size=chunk_size)
633
+ index = build_index(chunks)
56
634
 
57
- | Issue | Severity | Solution |
58
- |-------|----------|----------|
59
- | Issue | critical | ## Contextual Chunking (Anthropic's approach) |
60
- | Issue | high | ## Test different sizes |
61
- | Issue | high | ## Always filter by metadata first |
62
- | Issue | high | ## Add temporal scoring |
63
- | Issue | medium | ## Detect conflicts on storage |
64
- | Issue | medium | ## Budget tokens for different memory types |
65
- | Issue | medium | ## Track embedding model in metadata |
635
+ correct_retrievals = 0
636
+ for query, expected_chunk in test_queries:
637
+ results = index.search(query, k=5)
638
+ if expected_chunk in results:
639
+ correct_retrievals += 1
640
+
641
+ return correct_retrievals / len(test_queries)
642
+
643
+ # Test multiple sizes
644
+ for size in [256, 512, 768, 1024]:
645
+ recall = evaluate_chunk_size(docs, test_queries, size)
646
+ print(f"Size {size}: Recall@5 = {recall:.2%}")
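The evaluation loop above depends on undefined helpers (`split_documents`, `build_index`). The metric itself can be isolated as in the sketch below, assuming retrieval results have already been collected as ranked lists of chunk IDs per query. The function name and the dict-based input shape are hypothetical. With exactly one expected chunk per query, Recall@k is the same as the hit rate.

```python
def hit_rate_at_k(
    results: dict[str, list[str]],
    expected: dict[str, str],
    k: int = 5,
) -> float:
    """Fraction of queries whose expected chunk ID appears in the top-k results."""
    if not expected:
        raise ValueError("expected must contain at least one query")
    hits = sum(
        1
        for query, wanted_id in expected.items()
        if wanted_id in results.get(query, [])[:k]  # missing queries count as misses
    )
    return hits / len(expected)


# Toy data: ranked chunk IDs returned for two queries.
results = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
expected = {"q1": "b", "q2": "w"}
print(f"Recall@3 = {hit_rate_at_k(results, expected, k=3):.2%}")
```

Running this once per candidate chunk size, with the same labelled query set, gives comparable numbers across sizes.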
647
+
648
+ ## Size recommendations by content type
649
+ CHUNK_SIZES = {
650
+ "documentation": 512, # Complete concepts
651
+ "code": 1000, # Function-level
652
+ "conversation": 256, # Turn-level
653
+ "articles": 768, # Paragraph-level
654
+ }
655
+
656
+ ## Use overlap to prevent boundary issues
657
+ splitter = RecursiveCharacterTextSplitter(
658
+ chunk_size=512,
659
+ chunk_overlap=50, # 10% overlap
660
+ )
661
+
662
+ ### Semantic Search Returns Irrelevant Results
663
+
664
+ Severity: HIGH
665
+
666
+ Situation: Querying memory for context
667
+
668
+ Symptoms:
669
+ Agent retrieves memories that seem related but aren't useful.
670
+ "Tell me about the user's preferences" returns conversation
671
+ about preferences in general, not this user's. High similarity
672
+ scores for wrong content.
673
+
674
+ Why this breaks:
675
+ Semantic similarity isn't the same as relevance. "The user
676
+ likes Python" and "Python is a programming language" are
677
+ semantically similar but very different types of information.
678
+ Without metadata filtering, retrieval matches on topic alone and cannot tell whose fact it is.
679
+
680
+ Recommended fix:
681
+
682
+ ## Always filter by metadata first
683
+ # Don't rely on semantic similarity alone
684
+
685
+ # Bad: Only semantic search
686
+ results = index.query(
687
+ vector=query_embedding,
688
+ top_k=5
689
+ )
690
+
691
+ # Good: Filter then search
692
+ results = index.query(
693
+ vector=query_embedding,
694
+ filter={
695
+ "user_id": current_user.id,
696
+ "type": "preference",
697
+ "created_after": cutoff_date,
698
+ },
699
+ top_k=5
700
+ )
701
+
702
+ ## Use hybrid search (semantic + keyword)
703
+ from qdrant_client import QdrantClient
704
+
705
+ client = QdrantClient(...)
706
+
707
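+ # Hybrid search with fusion (illustrative pseudocode; in qdrant-client, hybrid queries go through query_points with prefetch and FusionQuery)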
708
+ results = client.search(
709
+ collection_name="memories",
710
+ query_vector=semantic_embedding,
711
+ query_text=query, # Also keyword match
712
+ fusion={"method": "rrf"}, # Reciprocal Rank Fusion
713
+ )
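Reciprocal Rank Fusion, named in the hybrid search call above, is simple enough to sketch directly. This is a standalone illustration and not Qdrant's implementation: the function name is hypothetical, inputs are ranked lists of document IDs, and `k=60` is the commonly used smoothing constant.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists; each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks are 1-based
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic_hits = ["a", "b", "c"]  # ranked by vector similarity
keyword_hits = ["b", "d", "a"]   # ranked by keyword match
fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print([doc_id for doc_id, _ in fused])
```

Because only ranks are used, the semantic and keyword scores never need to be on the same scale. Documents that appear in both lists (here `b` and `a`) rise above those found by only one retriever.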
714
+
715
+ ## Rerank results with cross-encoder
716
+ from sentence_transformers import CrossEncoder
717
+
718
+ reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
719
+
720
+ # Initial retrieval (recall-oriented)
721
+ candidates = index.query(query_embedding, top_k=20)
722
+
723
+ # Rerank (precision-oriented)
724
+ pairs = [(query, c.text) for c in candidates]
725
+ scores = reranker.predict(pairs)
726
+ reranked = sorted(zip(candidates, scores), key=lambda x: x[1], reverse=True)
727
+
728
+ ### Old Memories Override Current Information
729
+
730
+ Severity: HIGH
731
+
732
+ Situation: User preferences or facts change over time
733
+
734
+ Symptoms:
735
+ Agent uses outdated preferences. "User prefers dark mode" from
736
+ 6 months ago overrides recent "switch to light mode" request.
737
+ Agent confidently uses stale data.
738
+
739
+ Why this breaks:
740
+ Vector stores don't have temporal awareness by default. A memory
741
+ from a year ago has the same retrieval weight as one from today.
742
+ Recent information should generally override old information
743
+ for preferences and mutable facts.
744
+
745
+ Recommended fix:
746
+
747
+ ## Add temporal scoring
748
+ from datetime import datetime, timedelta
749
+
750
+ def time_decay_score(memory, half_life_days=30):
751
+ age = (datetime.now() - memory.created_at).days
752
+ decay = 0.5 ** (age / half_life_days)
753
+ return decay
754
+
755
+ def retrieve_with_recency(query, user_id):
756
+ # Get candidates
757
+ candidates = index.query(
758
+ vector=embed(query),
759
+ filter={"user_id": user_id},
760
+ top_k=20
761
+ )
762
+
763
+ # Apply time decay
764
+ for candidate in candidates:
765
+ time_score = time_decay_score(candidate)
766
+ candidate.final_score = candidate.similarity * 0.7 + time_score * 0.3
767
+
768
+ # Re-sort by final score
769
+ return sorted(candidates, key=lambda x: x.final_score, reverse=True)[:5]
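The effect of the 0.7 / 0.3 blend is easier to see with concrete numbers. The sketch below is self-contained and makes several assumptions: the `Candidate` dataclass is hypothetical, `now` is passed in explicitly for reproducibility, and age is measured in fractional days rather than the whole days used above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Candidate:
    """Hypothetical retrieval hit: text, similarity score, and creation time."""
    text: str
    similarity: float
    created_at: datetime


def time_decay(created_at: datetime, now: datetime, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 for a brand-new memory, 0.5 after one half-life."""
    age_days = max((now - created_at).total_seconds() / 86400, 0.0)
    return 0.5 ** (age_days / half_life_days)


def rerank_with_recency(
    candidates: list[Candidate],
    now: datetime,
    top_n: int = 5,
    similarity_weight: float = 0.7,
    time_weight: float = 0.3,
) -> list[Candidate]:
    """Re-sort candidates by a weighted blend of similarity and recency."""
    scored = [
        (c, similarity_weight * c.similarity + time_weight * time_decay(c.created_at, now))
        for c in candidates
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [c for c, _ in scored[:top_n]]


now = datetime(2025, 1, 1)
old = Candidate("User prefers dark mode", 0.90, now - timedelta(days=180))
new = Candidate("Switch to light mode", 0.85, now - timedelta(days=1))
ranked = rerank_with_recency([old, new], now)
print([c.text for c in ranked])
```

Although the six-month-old memory has the higher raw similarity, its decay term is close to zero, so the recent request ranks first.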
770
+
771
+ ## Update instead of append for preferences
772
+ async def update_preference(user_id, category, value):
773
+ # Delete old preference
774
+ await memory.delete(
775
+ filter={"user_id": user_id, "type": "preference", "category": category}
776
+ )
777
+
778
+ # Store new preference
779
+ await memory.upsert(
780
+ id=f"pref-{user_id}-{category}",
781
+ content={"category": category, "value": value},
782
+ metadata={"updated_at": datetime.now()}
783
+ )
784
+
785
+ ## Explicit versioning for facts
786
+ await memory.upsert(
787
+ id=f"fact-{fact_id}-v{version}",
788
+ content=new_fact,
789
+ metadata={
790
+ "version": version,
791
+ "supersedes": previous_id,
792
+ "valid_from": datetime.now()
793
+ }
794
+ )
795
+
+ ### Contradictory Memories Retrieved Together
+
+ Severity: MEDIUM
+
+ Situation: User has changed preferences or provided conflicting info
+
+ Symptoms:
+ The agent retrieves "user prefers dark mode" and "user prefers light
+ mode" in the same context, gives inconsistent answers, and seems
+ confused or forgetful to the user.
+
+ Why this breaks:
+ Without conflict resolution, old and new information coexist.
+ Semantic search can return both because they are about the same
+ topic (preferences), and the agent has no way to know which is current.
+
+ Recommended fix:
+
+ ## Detect conflicts on storage
+ async def store_with_conflict_check(memory, user_id):
+     # Find potentially conflicting memories
+     similar = await index.query(
+         vector=embed(memory.content),
+         filter={"user_id": user_id, "type": memory.type},
+         threshold=0.9,  # Very similar
+         top_k=5
+     )
+
+     for existing in similar:
+         if is_contradictory(memory.content, existing.content):
+             # Ask for resolution
+             resolution = await resolve_conflict(memory, existing)
+             if resolution == "replace":
+                 await index.delete(existing.id)
+             elif resolution == "version":
+                 await mark_superseded(existing.id, memory.id)
+
+     await index.upsert(memory)
+
+ ## Conflict detection heuristic
+ def is_contradictory(new_content, old_content):
+     # Use LLM to detect contradiction
+     result = llm.invoke(f'''
+     Do these two statements contradict each other?
+
+     Statement 1: {old_content}
+     Statement 2: {new_content}
+
+     Respond with just YES or NO.
+     ''')
+     # Tolerate trailing punctuation such as "YES."
+     return result.strip().upper().startswith("YES")
+
+ ## Periodic consolidation
+ async def consolidate_memories(user_id):
+     all_memories = await index.list(filter={"user_id": user_id})
+     clusters = cluster_by_topic(all_memories)
+
+     for cluster in clusters:
+         if has_conflicts(cluster):
+             resolved = await llm.invoke(f'''
+             These memories may conflict. Create one consolidated
+             memory that represents the current truth:
+             {cluster}
+             ''')
+             await replace_cluster(cluster, resolved)
+
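The store-time conflict check can be exercised without a vector database or an LLM. The sketch below is an assumption-heavy stand-in: `InMemoryIndex` is a test-only fake (not something to ship, given the validation rule against in-memory stores), `toy_contradicts` replaces the LLM judgment with a trivial comparison, and the `superseded_by` field is a hypothetical way to record the "version" resolution.

```python
import asyncio

class InMemoryIndex:
    """Test-only stand-in for a vector store; not for production use."""
    def __init__(self):
        self.items = {}

    async def upsert(self, item):
        self.items[item["id"]] = item

    async def delete(self, item_id):
        self.items.pop(item_id, None)

    async def same_topic(self, user_id, type_):
        # Stands in for a filtered similarity query
        return [i for i in self.items.values()
                if i["user_id"] == user_id and i["type"] == type_]

async def store_with_conflict_check(index, memory, contradicts, resolution):
    for existing in await index.same_topic(memory["user_id"], memory["type"]):
        if contradicts(memory["content"], existing["content"]):
            if resolution == "replace":
                await index.delete(existing["id"])
            elif resolution == "version":
                existing["superseded_by"] = memory["id"]
    await index.upsert(memory)

def toy_contradicts(new_content, old_content):
    # Placeholder: treats any differing value for the same topic as a conflict
    return new_content != old_content

async def demo(resolution):
    index = InMemoryIndex()
    light = {"id": "m1", "user_id": "u1", "type": "preference", "content": "light mode"}
    dark = {"id": "m2", "user_id": "u1", "type": "preference", "content": "dark mode"}
    await store_with_conflict_check(index, light, toy_contradicts, resolution)
    await store_with_conflict_check(index, dark, toy_contradicts, resolution)
    return index
```

Running `asyncio.run(demo("replace"))` leaves only the newer memory, while `"version"` keeps both and marks the older one as superseded.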
+ ### Retrieved Memories Exceed Context Window
+
+ Severity: MEDIUM
+
+ Situation: Retrieving too many memories at once
+
+ Symptoms:
+ Token limit errors. The agent truncates important information.
+ The system prompt gets cut off. Retrieved memories compete with
+ the user query for space.
+
+ Why this breaks:
+ Retrieval typically returns top-k results. If k is too high or
+ chunks are too large, retrieved context overwhelms the window.
+ Critical information (system prompt, recent messages) gets pushed
+ out.
+
+ Recommended fix:
+
+ ## Budget tokens for different memory types
+ TOKEN_BUDGET = {
+     "system_prompt": 500,
+     "user_profile": 200,
+     "recent_messages": 2000,
+     "retrieved_memories": 1000,
+     "current_query": 500,
+     "buffer": 300,  # Safety margin
+ }
+
+ def budget_aware_retrieval(query, context_limit=4000):
+     # Reserve fixed costs first, including the query itself
+     remaining = (
+         context_limit
+         - TOKEN_BUDGET["system_prompt"]
+         - TOKEN_BUDGET["current_query"]
+         - TOKEN_BUDGET["buffer"]
+     )
+
+     # Prioritize recent messages
+     recent = get_recent_messages(limit=TOKEN_BUDGET["recent_messages"])
+     remaining -= count_tokens(recent)
+
+     # Then user profile
+     profile = get_user_profile(limit=TOKEN_BUDGET["user_profile"])
+     remaining -= count_tokens(profile)
+
+     # Finally retrieved memories, capped by both the leftover space
+     # and the per-type budget
+     max_memory_tokens = max(0, min(remaining, TOKEN_BUDGET["retrieved_memories"]))
+     memories = retrieve_memories(query, max_tokens=max_memory_tokens)
+
+     return build_context(profile, recent, memories)
+
+ ## Dynamic k based on chunk size
+ def retrieve_with_budget(query, max_tokens=1000):
+     avg_chunk_tokens = 150  # From your data
+     max_k = max(1, max_tokens // avg_chunk_tokens)
+
+     results = index.query(vector=embed(query), top_k=max_k)
+
+     # Trim if still over budget
+     total_tokens = 0
+     filtered = []
+     for result in results:
+         tokens = count_tokens(result.text)
+         if total_tokens + tokens <= max_tokens:
+             filtered.append(result)
+             total_tokens += tokens
+         else:
+             break
+
+     return filtered
+
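It is worth doing the arithmetic on the budget table: the six entries sum to 4500 tokens, which is more than the 4000-token default limit, so the retrieved-memory share has to shrink when the other slots are full. A small sketch of that calculation follows; `memory_allowance` is a hypothetical helper, and the usage figures passed to it are illustrative.

```python
TOKEN_BUDGET = {
    "system_prompt": 500,
    "user_profile": 200,
    "recent_messages": 2000,
    "retrieved_memories": 1000,
    "current_query": 500,
    "buffer": 300,
}

def memory_allowance(budget, context_limit, recent_used, profile_used):
    # Fixed reservations that retrieval can never eat into
    reserved = budget["system_prompt"] + budget["current_query"] + budget["buffer"]
    remaining = context_limit - reserved - recent_used - profile_used
    # Never negative, never above the per-type cap
    return max(0, min(remaining, budget["retrieved_memories"]))

total = sum(TOKEN_BUDGET.values())
tight = memory_allowance(TOKEN_BUDGET, 4000, recent_used=2000, profile_used=200)
roomy = memory_allowance(TOKEN_BUDGET, 8000, recent_used=2000, profile_used=200)
```

With a 4000-token window and full recent-message and profile slots, only 500 tokens remain for memories; with 8000 tokens the 1000-token cap applies instead.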
+ ### Query and Document Embeddings From Different Models
+
+ Severity: MEDIUM
+
+ Situation: Upgrading embedding model or mixing providers
+
+ Symptoms:
+ Retrieval quality suddenly drops. Relevant documents are not found.
+ Results look random. Retrieval works for new documents but fails for old ones.
+
+ Why this breaks:
+ Each embedding model produces its own vector space. A query embedded
+ with text-embedding-3-small won't match documents embedded with
+ text-embedding-ada-002, so mixing models yields meaningless similarity scores.
+
+ Recommended fix:
+
+ ## Track embedding model in metadata
+ await index.upsert(
+     id=doc_id,
+     vector=embedding,
+     metadata={
+         "embedding_model": "text-embedding-3-small",
+         "embedding_version": "2024-01",
+         "content": content
+     }
+ )
+
+ ## Filter by model version on retrieval
+ results = index.query(
+     vector=query_embedding,
+     filter={"embedding_model": current_model},
+     top_k=10
+ )
+
+ ## Migration strategy for model upgrades
+ async def migrate_embeddings(old_model, new_model):
+     # Get all documents with old model
+     old_docs = await index.list(filter={"embedding_model": old_model})
+
+     for doc in old_docs:
+         # Re-embed with new model
+         new_embedding = await embed(doc.content, model=new_model)
+
+         # Update in place
+         await index.update(
+             id=doc.id,
+             vector=new_embedding,
+             metadata={"embedding_model": new_model}
+         )
+
+ ## Use separate collections during migration
+ # Old collection: production queries
+ # New collection: re-embedding in progress
+ # Switch over when complete
+
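The separate-collections approach can be sketched as a small router that refuses to switch until every document has been re-embedded, and refuses queries embedded with the wrong model. `Collection`, `Router`, and the model names here are hypothetical; a real vector store would expose its own alias or collection mechanism.

```python
from dataclasses import dataclass, field

@dataclass
class Collection:
    model: str
    docs: dict = field(default_factory=dict)  # doc_id -> vector

@dataclass
class Router:
    collections: dict  # name -> Collection
    active: str        # name of the collection serving production queries

    def query_collection(self, query_model):
        coll = self.collections[self.active]
        if coll.model != query_model:
            raise ValueError(
                f"query embedded with {query_model}, index uses {coll.model}"
            )
        return coll

    def switch_to(self, name):
        old = self.collections[self.active]
        new = self.collections[name]
        missing = set(old.docs) - set(new.docs)
        if missing:
            raise RuntimeError(f"{len(missing)} documents not yet re-embedded")
        self.active = name

router = Router(
    collections={
        "old": Collection("model-a", {"a": [0.1], "b": [0.2]}),
        "new": Collection("model-b", {"a": [0.3]}),
    },
    active="old",
)
```

Production traffic keeps hitting `old` until `switch_to("new")` succeeds, which only happens once `new` contains every document id present in `old`.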
+ ## Validation Checks
+
+ ### In-Memory Store in Production Code
+
+ Severity: ERROR
+
+ In-memory stores lose data on restart
+
+ Message: In-memory store detected. Use persistent storage (Postgres, Qdrant, Pinecone) for production.
+
+ ### Vector Upsert Without Metadata
+
+ Severity: WARNING
+
+ Vectors should have metadata for filtering
+
+ Message: Vector upsert without metadata. Add user_id, type, timestamp for proper filtering.
+
+ ### Query Without User Filtering
+
+ Severity: ERROR
+
+ Queries should filter by user to prevent data leakage
+
+ Message: Vector query without user filtering. Always filter by user_id to prevent data leakage.
+
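One way to make the user filter hard to forget is to build every query filter through a single helper. This is a minimal sketch; `scoped_filter` is a hypothetical name, and the dict-shaped filter mirrors the `filter={...}` arguments used in the snippets above.

```python
def scoped_filter(user_id, extra=None):
    """Build a query filter that is always scoped to one user."""
    if not user_id:
        raise ValueError("user_id is required for every memory query")
    merged = dict(extra or {})
    # Set last so caller-supplied filters cannot override the scope
    merged["user_id"] = user_id
    return merged

prefs_filter = scoped_filter("u1", {"type": "preference", "user_id": "someone-else"})
```

Passing the result as the `filter` argument keeps the scoping in one place instead of relying on each call site to remember it.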
+ ### Hardcoded Chunk Size Without Justification
+
+ Severity: INFO
+
+ Chunk size should be tested and justified
+
+ Message: Hardcoded chunk size. Test different sizes for your content type and measure retrieval accuracy.
+
+ ### Chunking Without Overlap
+
+ Severity: WARNING
+
+ Chunk overlap prevents boundary issues
+
+ Message: Text splitting without overlap. Add chunk_overlap (10-20%) to prevent boundary issues.
+
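To illustrate what overlap buys, here is a minimal character-based splitter. It is a sketch only: `chunk_with_overlap` is a hypothetical helper, real pipelines usually split on tokens or sentence boundaries, and the 200/30 sizes are arbitrary values that give a 15% overlap.

```python
def chunk_with_overlap(text, chunk_size=200, overlap=30):
    """Split text into fixed-size chunks where neighbours share `overlap` chars."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the remaining tail is already covered by the previous chunk
    last_start = max(len(text) - overlap, 1)
    return [text[i:i + chunk_size] for i in range(0, last_start, step)]

sample = "".join(chr(97 + i % 26) for i in range(500))
chunks = chunk_with_overlap(sample)
```

Each chunk begins with the last 30 characters of the previous one, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.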
+ ### Semantic Search Without Filters
+
+ Severity: WARNING
+
+ Pure semantic search often returns irrelevant results
+
+ Message: Pure semantic search. Add metadata filters (user, type, time) for better relevance.
+
+ ### Retrieval Without Result Limit
+
+ Severity: WARNING
+
+ Unbounded retrieval can overflow context
+
+ Message: Retrieval without limit. Set top_k to prevent context overflow.
+
+ ### Embeddings Without Model Version Tracking
+
+ Severity: WARNING
+
+ Track embedding model to handle migrations
+
+ Message: Store embedding model version in metadata to handle model migrations.
+
+ ### Different Models for Document and Query Embedding
+
+ Severity: ERROR
+
+ Documents and queries must use same embedding model
+
+ Message: Ensure same embedding model for indexing and querying.
+
+ ## Collaboration
+
+ ### Delegation Triggers
+
+ - user needs vector database at scale -> data-engineer (Production vector store operations)
+ - user needs embedding model optimization -> ml-engineer (Custom embeddings, fine-tuning)
+ - user needs knowledge graph -> knowledge-engineer (Graph-based memory structures)
+ - user needs RAG pipeline -> llm-architect (End-to-end retrieval augmented generation)
+ - user needs multi-agent shared memory -> multi-agent-orchestration (Memory sharing between agents)
 
  ## Related Skills
 
  Works well with: `autonomous-agents`, `multi-agent-orchestration`, `llm-architect`, `agent-tool-builder`
 
  ## When to Use
- This skill is applicable to execute the workflow or actions described in the overview.
+
+ - User mentions or implies: agent memory
+ - User mentions or implies: long-term memory
+ - User mentions or implies: memory systems
+ - User mentions or implies: remember across sessions
+ - User mentions or implies: memory retrieval
+ - User mentions or implies: episodic memory
+ - User mentions or implies: semantic memory
+ - User mentions or implies: vector store
+ - User mentions or implies: rag
+ - User mentions or implies: langmem
+ - User mentions or implies: memgpt
+ - User mentions or implies: conversation history