ifcraftcorpus 1.4.0__py3-none-any.whl → 1.5.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (58)
  1. ifcraftcorpus-1.5.0.data/data/share/ifcraftcorpus/corpus/agent-design/agent_memory_architecture.md +765 -0
  2. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/agent-design/agent_prompt_engineering.md +247 -0
  3. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/agent-design/multi_agent_patterns.md +1 -0
  4. {ifcraftcorpus-1.4.0.dist-info → ifcraftcorpus-1.5.0.dist-info}/METADATA +1 -1
  5. {ifcraftcorpus-1.4.0.dist-info → ifcraftcorpus-1.5.0.dist-info}/RECORD +58 -57
  6. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/audience-and-access/accessibility_guidelines.md +0 -0
  7. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/audience-and-access/audience_targeting.md +0 -0
  8. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/audience-and-access/localization_considerations.md +0 -0
  9. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/audio_visual_integration.md +0 -0
  10. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/collaborative_if_writing.md +0 -0
  11. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/creative_workflow_pipeline.md +0 -0
  12. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/diegetic_design.md +0 -0
  13. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/idea_capture_and_hooks.md +0 -0
  14. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/if_platform_tools.md +0 -0
  15. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/player_analytics_metrics.md +0 -0
  16. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/quality_standards_if.md +0 -0
  17. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/research_and_verification.md +0 -0
  18. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/craft-foundations/testing_interactive_fiction.md +0 -0
  19. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/emotional-design/conflict_patterns.md +0 -0
  20. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/emotional-design/emotional_beats.md +0 -0
  21. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/game-design/mechanics_design_patterns.md +0 -0
  22. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/genre-conventions/children_and_ya_conventions.md +0 -0
  23. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/genre-conventions/fantasy_conventions.md +0 -0
  24. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/genre-conventions/historical_fiction.md +0 -0
  25. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/genre-conventions/horror_conventions.md +0 -0
  26. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/genre-conventions/mystery_conventions.md +0 -0
  27. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/genre-conventions/sci_fi_conventions.md +0 -0
  28. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/branching_narrative_construction.md +0 -0
  29. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/branching_narrative_craft.md +0 -0
  30. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/endings_patterns.md +0 -0
  31. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/episodic_serialized_if.md +0 -0
  32. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/nonlinear_structure.md +0 -0
  33. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/pacing_and_tension.md +0 -0
  34. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/romance_and_relationships.md +0 -0
  35. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/scene_structure_and_beats.md +0 -0
  36. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/narrative-structure/scene_transitions.md +0 -0
  37. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/character_voice.md +0 -0
  38. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/dialogue_craft.md +0 -0
  39. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/exposition_techniques.md +0 -0
  40. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/narrative_point_of_view.md +0 -0
  41. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/prose_patterns.md +0 -0
  42. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/subtext_and_implication.md +0 -0
  43. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/prose-and-language/voice_register_consistency.md +0 -0
  44. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/scope-and-planning/scope_and_length.md +0 -0
  45. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/world-and-setting/canon_management.md +0 -0
  46. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/world-and-setting/setting_as_character.md +0 -0
  47. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/corpus/world-and-setting/worldbuilding_patterns.md +0 -0
  48. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/README.md +0 -0
  49. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/if_genre_consultant.md +0 -0
  50. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/if_platform_advisor.md +0 -0
  51. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/if_prose_writer.md +0 -0
  52. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/if_quality_reviewer.md +0 -0
  53. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/if_story_architect.md +0 -0
  54. {ifcraftcorpus-1.4.0.data → ifcraftcorpus-1.5.0.data}/data/share/ifcraftcorpus/subagents/if_world_curator.md +0 -0
  55. {ifcraftcorpus-1.4.0.dist-info → ifcraftcorpus-1.5.0.dist-info}/WHEEL +0 -0
  56. {ifcraftcorpus-1.4.0.dist-info → ifcraftcorpus-1.5.0.dist-info}/entry_points.txt +0 -0
  57. {ifcraftcorpus-1.4.0.dist-info → ifcraftcorpus-1.5.0.dist-info}/licenses/LICENSE +0 -0
  58. {ifcraftcorpus-1.4.0.dist-info → ifcraftcorpus-1.5.0.dist-info}/licenses/LICENSE-CONTENT +0 -0
@@ -0,0 +1,765 @@
---
title: Agent Memory Architecture
summary: Framework-independent patterns for managing agent conversation history and long-term memory—why prompt stuffing fails, state-managed alternatives, memory types, and multi-agent sharing.
topics:
- memory-architecture
- conversation-history
- state-management
- checkpointers
- context-engineering
- multi-agent
- langgraph
- openai-agents
cluster: agent-design
---

# Agent Memory Architecture

Patterns for managing agent conversation history and long-term memory. This guide explains why manual prompt concatenation fails, how to use state-managed memory correctly, and how to share context between agents.

This document is framework-independent in its principles but includes concrete examples for LangGraph and the OpenAI Agents SDK.

---

## The Anti-Pattern: Manual Prompt Concatenation

When building agents, developers (and AI coding assistants) often default to manually concatenating conversation history into prompts. This is the most common mistake in agent development.

### What It Looks Like

**Anti-pattern: Naive history concatenation**

```python
# DON'T DO THIS
class NaiveAgent:
    def __init__(self, model):
        self.model = model
        self.history = []  # Manual history list

    def chat(self, user_message: str) -> str:
        self.history.append({"role": "user", "content": user_message})

        # Stuffing full history into every call
        response = self.model.chat(
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                *self.history  # Growing unboundedly
            ]
        )

        self.history.append({"role": "assistant", "content": response})
        return response
```

**Problems:**

1. **No persistence**: History is lost on restart
2. **Unbounded growth**: Eventually exceeds the context window
3. **No thread isolation**: Can't run multiple conversations concurrently
4. **Attention degradation**: Middle content gets ignored
5. **Token waste**: Paying for stale context on every call

**Anti-pattern: String concatenation**

```python
# DON'T DO THIS
def build_prompt(history: list[dict], new_message: str) -> str:
    history_text = "\n".join([
        f"{msg['role']}: {msg['content']}"
        for msg in history
    ])

    return f"""Previous conversation:
{history_text}

User: {new_message}
Assistant:"""
```

**Problems:**

1. **Format fragility**: Role formatting can confuse the model
2. **No structure**: Loses message boundaries
3. **Injection risk**: History content can break the prompt structure
4. **No tool call preservation**: Loses function call context

### Why AI Coding Assistants Default to This

Training data contains many examples of this pattern because:

- It's the simplest implementation
- It works for demos and tutorials
- Framework-specific patterns require API knowledge
- Most code examples don't show production patterns

This is why you have to repeatedly explain that you want proper memory management.

### Why It Fails: The Evidence

**"Lost in the Middle" Research (Liu et al., 2023)**

LLMs exhibit a U-shaped attention curve—content at the start and end of the context receives strong attention, while middle content is systematically ignored. Stuffing history into the middle of a prompt means important context gets lost.

**The 75% Rule (Claude Code, Anthropic)**

When Claude Code operated above 90% context utilization, output quality degraded significantly. Implementing auto-compaction at 75% produced dramatic quality improvements. The lesson: **capacity ≠ capability**. Empty headroom enables reasoning, not just retrieval.

**Context Rot**

Old, irrelevant details don't just waste tokens—they actively confuse the model. A discussion about error handling from 50 turns ago can distract from the current task, even if it technically fits within the context window.

---

## The Correct Model: State-Managed Memory

Memory should be **first-class state**, not prompt injection. The framework handles storage, retrieval, trimming, and injection—your code focuses on logic.

### Core Principles

**1. Separation of Concerns**

| Concern | Responsibility | Your Code |
|---------|----------------|-----------|
| Storage | Persist messages to a durable store | Configure checkpointer |
| Retrieval | Load relevant history for thread | Provide thread_id |
| Trimming | Keep context within limits | Set thresholds |
| Injection | Add history to model calls | Automatic |

**2. Thread Isolation**

Each conversation gets a unique `thread_id`. The framework maintains separate history per thread, enabling concurrent conversations without interference.

**3. Resumability**

Conversations can be paused and resumed—even across process restarts. The checkpointer persists state to durable storage.

**4. Automatic Management**

You don't manually append messages or manage context length. The framework handles this based on configuration.

### LangGraph: Checkpointer Pattern

```python
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.checkpoint.sqlite import SqliteSaver
from langgraph.graph import StateGraph, MessagesState

# Development: in-memory
checkpointer = InMemorySaver()

# Production: persistent storage
# checkpointer = SqliteSaver.from_conn_string("conversations.db")

# Define your graph (call_model is your node function that invokes the LLM)
builder = StateGraph(MessagesState)
builder.add_node("agent", call_model)
builder.add_edge("__start__", "agent")

# Compile WITH checkpointer
graph = builder.compile(checkpointer=checkpointer)

# Each conversation gets a thread_id
config = {"configurable": {"thread_id": "user-123-session-1"}}

# Framework handles history automatically
response = graph.invoke(
    {"messages": [{"role": "user", "content": "Hello!"}]},
    config
)

# Same thread_id = conversation continues
response = graph.invoke(
    {"messages": [{"role": "user", "content": "What did I just say?"}]},
    config  # Same config = same thread
)
```

**What the framework does:**

1. Before invoke: Loads existing messages for the thread_id
2. Prepends history to the new messages
3. Calls the model with full context
4. After invoke: Persists new messages to the checkpointer
5. Handles context limits based on configuration

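The loop the framework runs can be sketched framework-independently. Everything below is a stand-in: `DictCheckpointer` for a real checkpointer and `call_model` for a real LLM call; the point is the load → call → persist cycle keyed by `thread_id`.

```python
class DictCheckpointer:
    """Toy checkpointer: persists a message list per thread_id."""
    def __init__(self):
        self._threads = {}

    def load(self, thread_id: str) -> list:
        return list(self._threads.get(thread_id, []))

    def save(self, thread_id: str, messages: list) -> None:
        self._threads[thread_id] = list(messages)


def call_model(messages: list) -> dict:
    # Stand-in for a real model call; reports how much context it saw.
    return {"role": "assistant", "content": f"(saw {len(messages)} messages)"}


def invoke(checkpointer: DictCheckpointer, thread_id: str, new_message: dict) -> dict:
    history = checkpointer.load(thread_id)           # 1. load existing messages
    context = history + [new_message]                # 2. prepend history
    reply = call_model(context)                      # 3. call model with full context
    checkpointer.save(thread_id, context + [reply])  # 4. persist new messages
    return reply
```

Two invokes with the same `thread_id` accumulate history; a different `thread_id` starts clean, which is the isolation property the real checkpointer provides.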
### OpenAI Agents SDK: Session Pattern

```python
from agents import Agent, Runner, SQLiteSession

# Create persistent session storage: first argument is the session id,
# second is the database file
session = SQLiteSession("user-123-session-1", "conversations.db")

agent = Agent(
    name="assistant",
    instructions="You are a helpful assistant.",
    model="gpt-4o"
)

# Session handles history automatically
response = await Runner.run(
    agent,
    "Hello!",
    session=session
)

# Same session = conversation continues
response = await Runner.run(
    agent,
    "What did I just say?",
    session=session
)
```

**What the session does:**

1. Before run: Retrieves conversation history for the session
2. Prepends history to the input items
3. Executes the agent with full context
4. After run: Stores new items (user input, responses, tool calls)
5. Handles continuity across runs

---

## Memory Types

Agent memory isn't monolithic. Different types serve different purposes and have different scopes.

### Short-Term Memory (Thread-Scoped)

**Scope**: Single conversation thread
**Purpose**: Maintain context within an ongoing session
**Lifetime**: Duration of the conversation (or until explicitly cleared)

| Framework | Implementation |
|-----------|----------------|
| LangGraph | Checkpointer with `thread_id` |
| OpenAI SDK | Session with a session id |
| General | Thread-isolated message store |

**What belongs in short-term memory:**

- User messages and assistant responses
- Tool calls and results
- Reasoning traces (if using chain-of-thought)
- Current task state

### Long-Term Memory (Cross-Session)

**Scope**: Across multiple conversations
**Purpose**: Persist facts, preferences, learned patterns
**Lifetime**: Indefinite (or until explicitly deleted)

#### Structured Long-Term Memory

Facts, relationships, and decisions stored in a queryable format.

```python
# LangGraph Store pattern
from langgraph.store.memory import InMemoryStore

store = InMemoryStore()

# Store user preference (persists across threads)
store.put(
    ("users", "user-123", "preferences"),
    "timezone",
    {"timezone": "America/New_York", "updated": "2025-01-17"}
)

# Retrieve in any thread
prefs = store.get(("users", "user-123", "preferences"), "timezone")
```

#### Semantic Long-Term Memory

Embedding-based retrieval for finding relevant past context.

```python
# Conceptual pattern (framework-independent; VectorStore and embed
# are placeholders for your vector database and embedding model)
from your_vector_store import VectorStore

memory_store = VectorStore()

# Store interaction summary with embedding
memory_store.add(
    text="User prefers concise responses without code comments",
    metadata={"user_id": "user-123", "type": "preference"},
    embedding=embed("User prefers concise responses...")
)

# Retrieve relevant memories for new context
relevant = memory_store.search(
    query="How should I format code for this user?",
    filter={"user_id": "user-123"}
)
```

### Episodic Memory

**Scope**: Cross-session, timestamped
**Purpose**: Record past interactions for learning and audit
**Lifetime**: Configurable retention

```python
# Record an interaction outcome (episodic_store is a conceptual event log)
episodic_store.add({
    "timestamp": "2025-01-17T10:30:00Z",
    "user_id": "user-123",
    "thread_id": "session-456",
    "task": "debug authentication error",
    "outcome": "resolved",
    "approach": "checked token expiration, found clock skew",
    "user_feedback": "positive"
})

# Query past approaches for similar tasks
past_successes = episodic_store.query(
    task_type="debug authentication",
    outcome="resolved",
    user_id="user-123"
)
```

### Memory Layers Summary

| Layer | Scope | Storage | Retrieval | Example Use |
|-------|-------|---------|-----------|-------------|
| Short-term | Thread | Checkpointer/Session | By thread_id | Conversation context |
| Long-term (Structured) | User/Global | Key-value store | By namespace + key | User preferences |
| Long-term (Semantic) | User/Global | Vector store | By similarity | Relevant past context |
| Episodic | User/Global | Event log | By query + time | Past task outcomes |

---

## State-Over-History Principle

A key insight for efficient memory management: **prefer passing current state over full history**.

### The Problem with Full History

```python
# Anti-pattern: Passing the full transcript to a sub-agent
sub_agent_prompt = f"""
Here's the full conversation so far:
{format_messages(all_300_messages)}

Now help with: {current_task}
"""
```

**Problems:**

- Token explosion
- Attention dilution
- Irrelevant context pollution
- Latency increase

### State-Over-History Pattern

```python
import json

# Better: Pass current state, not history
current_state = {
    "user_goal": "Build a REST API for user management",
    "completed_steps": ["schema design", "database setup"],
    "current_step": "implement CRUD endpoints",
    "decisions_made": {
        "database": "PostgreSQL",
        "framework": "FastAPI",
        "auth": "JWT tokens"
    },
    "open_questions": [],
    "artifacts": ["schema.sql", "models.py"]
}

sub_agent_prompt = f"""
Current project state:
{json.dumps(current_state, indent=2)}

Task: {current_task}
"""
```

**Benefits:**

- Minimal tokens
- Focused attention
- No stale context
- Faster inference

### What Belongs in State vs History

| State (Pass Forward) | History (Store, Don't Pass) |
|---------------------|------------------------------|
| Current goal | How the goal was established |
| Decisions made | Discussion leading to decisions |
| Artifacts created | Iterations and revisions |
| Open questions | Resolved questions |
| Error context (if debugging) | Successful operations |

### Implementing State Extraction

```python
# LangGraph: Custom state schema
from typing import TypedDict, Annotated
from langgraph.graph import add_messages

class ProjectState(TypedDict):
    messages: Annotated[list, add_messages]  # Short-term (auto-managed)

    # Extracted state (you manage)
    current_goal: str
    decisions: dict
    artifacts: list[str]
    phase: str

# Update state after significant events
def extract_state(messages: list, current_state: ProjectState) -> ProjectState:
    """Extract/update state from recent messages."""
    updated_state = dict(current_state)
    # Use an LLM or rules here to identify:
    # - New decisions made
    # - Artifacts created
    # - Phase transitions
    return updated_state
```

---

## Managing History Growth

Even with proper memory architecture, history grows. You need strategies to keep it bounded.

### Strategy 1: Trimming

Keep only the last N turns, drop the rest.

**LangGraph: trim_messages**

```python
from langgraph.prebuilt import create_react_agent
from langchain_core.messages import trim_messages

def trim_to_recent(messages: list) -> list:
    """Keep the system message plus the most recent messages
    that fit in the token budget."""
    return trim_messages(
        messages,
        max_tokens=4000,
        strategy="last",
        token_counter=len,  # Counts messages; use tiktoken for real token counts
        include_system=True,
        allow_partial=False
    )

# Apply before each model call
agent = create_react_agent(
    model,
    tools,
    state_modifier=trim_to_recent
)
```

**When to use trimming:**

- Short, transactional conversations
- Tasks where old context is truly irrelevant
- When latency is critical

**Anti-patterns with trimming:**

- Losing critical decisions from early in the conversation
- Trimming mid-tool-call (orphaned tool results)
- Using it for planning tasks that need long-range context

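The orphaned-tool-result pitfall can be guarded against explicitly: after cutting to the last N messages, drop any tool result whose originating assistant tool call fell outside the cut. A minimal sketch over plain dict messages, assuming tool results carry a `tool_call_id` matching the `id` of the call that produced them:

```python
def trim_without_orphans(messages: list[dict], keep_last: int) -> list[dict]:
    """Keep the last `keep_last` messages, then drop tool results
    whose originating assistant tool call was trimmed away."""
    kept = messages[-keep_last:]

    # tool_call_ids issued by assistant messages that survived the trim
    issued = {
        call["id"]
        for msg in kept if msg.get("role") == "assistant"
        for call in msg.get("tool_calls", [])
    }

    # A tool result is orphaned if its call was issued before the cut
    return [
        msg for msg in kept
        if msg.get("role") != "tool" or msg.get("tool_call_id") in issued
    ]
```

The same check works in reverse when summarizing: never split a tool call and its result across the summary boundary.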
### Strategy 2: Summarization

Compress older messages into a synthetic summary.

**LangGraph: SummarizationMiddleware**

```python
from langchain.agents import create_agent, SummarizationMiddleware

agent = create_agent(
    model="gpt-4o",
    tools=tools,
    middleware=[
        SummarizationMiddleware(
            model="gpt-4o-mini",       # Cheaper model for summarization
            trigger={"tokens": 4000},  # Trigger when context exceeds this
            keep={"messages": 10}      # Keep last 10 verbatim
        )
    ]
)
```

**What summarization produces:**

```
[Summary of turns 1-50]:
- User requested help building a REST API
- Decided on FastAPI + PostgreSQL
- Completed: schema design, database models
- Current focus: authentication implementation
- User prefers concise code without excessive comments

[Recent messages 51-60 kept verbatim]
```

**When to use summarization:**

- Long-running planning conversations
- Support threads spanning multiple issues
- Tasks requiring long-range continuity

**Anti-patterns with summarization:**

- **Summary drift**: Facts get reinterpreted incorrectly
- **Context poisoning**: Errors in the summary propagate indefinitely
- **Over-compression**: Losing critical details
- **Summarizing too frequently**: Latency overhead

### Strategy 3: Hybrid (Recommended)

Combine summarization for old context with trimming for recent context.

```python
class HybridMemoryConfig:
    # Summarize when the total exceeds this
    summarize_threshold_tokens: int = 8000

    # Keep this many recent messages verbatim
    keep_recent_messages: int = 20

    # Maximum summary length
    max_summary_tokens: int = 500

    # Model for summarization (use a cheaper model)
    summary_model: str = "gpt-4o-mini"
```

**Flow:**

1. Check the total token count
2. If under the threshold: no action
3. If over the threshold:
   - Keep the last N messages verbatim
   - Summarize the older messages
   - Replace the older messages with the summary
   - Continue with bounded context

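The flow above can be sketched as a single compaction function. This is a toy: `word_count` stands in for a real tokenizer (e.g. tiktoken) and `summarize` for a cheap-model call, so only the control flow is meant literally.

```python
def word_count(messages: list[dict]) -> int:
    # Crude token proxy; swap in a real tokenizer for production
    return sum(len(m["content"].split()) for m in messages)

def summarize(messages: list[dict]) -> dict:
    # Placeholder: a real implementation calls a cheap model here
    preview = ", ".join(m["content"][:20] for m in messages[:3])
    return {"role": "system",
            "content": f"[Summary of {len(messages)} messages: {preview}...]"}

def hybrid_compact(messages: list[dict],
                   threshold_tokens: int = 8000,
                   keep_recent: int = 20) -> list[dict]:
    # Steps 1-2: under the threshold, no action
    if word_count(messages) <= threshold_tokens or len(messages) <= keep_recent:
        return messages
    # Step 3: keep the last N verbatim, replace older messages with a summary
    older, recent = messages[:-keep_recent], messages[-keep_recent:]
    return [summarize(older)] + recent
```

Run once before each model call; because the summary replaces the older messages, repeated calls stay bounded at roughly `keep_recent + 1` messages.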
---

## Multi-Agent Memory Sharing

When multiple agents collaborate, memory sharing becomes critical.

### Pattern 1: Shared State Object

Agents read from and write to a common state.

```python
# LangGraph: Shared state across nodes
from typing import TypedDict, Annotated
from langgraph.graph import StateGraph, add_messages

class SharedState(TypedDict):
    messages: Annotated[list, add_messages]

    # Shared across all agents
    research_findings: list[str]
    draft_content: str
    review_feedback: list[str]
    final_output: str

def researcher(state: SharedState) -> SharedState:
    """Research agent adds findings to shared state."""
    findings = do_research(state["messages"][-1])
    return {"research_findings": state["research_findings"] + findings}

def writer(state: SharedState) -> SharedState:
    """Writer agent reads research, produces a draft."""
    draft = write_draft(state["research_findings"])
    return {"draft_content": draft}

def reviewer(state: SharedState) -> SharedState:
    """Reviewer reads the draft, adds feedback."""
    feedback = review(state["draft_content"])
    return {"review_feedback": feedback}

# Wire agents together
graph = StateGraph(SharedState)
graph.add_node("researcher", researcher)
graph.add_node("writer", writer)
graph.add_node("reviewer", reviewer)
```

### Pattern 2: Artifact Passing (Not Transcript Passing)

**Anti-pattern: Context telephone**

```python
# DON'T DO THIS
def orchestrator_delegates_to_specialist(conversation_history):
    # Passing the full history degrades information
    specialist_result = specialist.run(
        f"Here's the conversation:\n{conversation_history}\n\nDo task X"
    )
    return specialist_result
```

**Problems:**

- Information degrades through each handoff
- Irrelevant context pollutes specialist focus
- Token waste compounds at each level

**Better: Pass artifacts and state**

```python
# DO THIS
def orchestrator_delegates_to_specialist(task_state):
    # Pass only what the specialist needs
    specialist_result = specialist.run(
        task_description=task_state["current_task"],
        input_artifacts=task_state["relevant_artifacts"],
        constraints=task_state["constraints"],
        # NOT the full conversation history
    )
    return specialist_result
```

### Pattern 3: Memory Isolation vs Sharing

| Scenario | Memory Strategy |
|----------|-----------------|
| Agents working on the same task | Shared state object |
| Agents with different domains | Isolated memory, share artifacts |
| Parallel independent tasks | Fully isolated threads |
| Validator reviewing creator's work | Read-only access to creator's output |

**LangGraph: Isolated sub-agents**

```python
from uuid import uuid4

# Each specialist gets its own thread
def delegate_to_specialist(state, specialist_graph, task):
    # Create an isolated thread for the specialist
    specialist_thread_id = f"{state['thread_id']}-{specialist_graph.name}-{uuid4()}"

    result = specialist_graph.invoke(
        {"messages": [{"role": "user", "content": task}]},
        {"configurable": {"thread_id": specialist_thread_id}}
    )

    # Return only the result, not the specialist's internal history
    return result["final_output"]
```

### Pattern 4: Namespace-Based Sharing

For long-term memory that should be shared across agents:

```python
# Shared user preferences (all agents can read)
user_namespace = ("users", user_id, "preferences")

# Agent-specific learned patterns (isolated)
agent_namespace = ("agents", agent_id, "patterns")

# Project-specific context (shared within project)
project_namespace = ("projects", project_id, "context")
```

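How those namespaces behave can be shown with a dict-backed stand-in for a namespaced key-value store (e.g. LangGraph's `InMemoryStore`): every agent resolves the same user namespace, while each agent's learned patterns stay private under its own namespace.

```python
class NamespacedStore:
    """Dict-backed stand-in for a namespaced key-value store."""
    def __init__(self):
        self._data = {}

    def put(self, namespace: tuple, key: str, value: dict) -> None:
        self._data[(namespace, key)] = value

    def get(self, namespace: tuple, key: str):
        return self._data.get((namespace, key))


store = NamespacedStore()
user_id = "user-123"

# Any agent can write to and read from the shared user namespace
store.put(("users", user_id, "preferences"), "timezone", {"tz": "America/New_York"})

# Each agent keeps its own patterns under its own namespace
store.put(("agents", "writer", "patterns"), "style", {"tone": "concise"})
store.put(("agents", "reviewer", "patterns"), "style", {"tone": "critical"})
```

Access control falls out of namespace construction: an agent that only ever builds keys under `("agents", its_own_id, ...)` plus the shared user namespace cannot read a sibling's private patterns.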
677
+ ---
678
+
679
+ ## The 75% Rule
680
+
681
+ Never fill context to capacity. Reserve headroom for reasoning.
682
+
683
+ ### Why Headroom Matters
684
+
685
+ | Context Usage | Effect |
686
+ |---------------|--------|
687
+ | < 50% | Optimal reasoning space |
688
+ | 50-75% | Good balance |
689
+ | 75-90% | Degraded quality, trigger compaction |
690
+ | > 90% | Significant quality loss |
691
+
692
+ ### Implementation
693
+
694
+ ```python
695
+ def should_compact(messages: list, model_context_limit: int) -> bool:
696
+ """Check if context needs compaction."""
697
+ current_tokens = count_tokens(messages)
698
+ threshold = model_context_limit * 0.75
699
+ return current_tokens > threshold
700
+
701
+ def auto_compact_middleware(state: AgentState) -> AgentState:
702
+ """Middleware that triggers compaction at 75%."""
703
+ if should_compact(state["messages"], MODEL_CONTEXT_LIMIT):
704
+ state["messages"] = summarize_and_trim(state["messages"])
705
+ return state
706
+ ```
707
+
708
+ ---
709
+
710
+ ## Implementation Checklist
711
+
712
+ When building agents, verify:
713
+
714
+ - [ ] **No manual history concatenation** in prompt building
715
+ - [ ] **Checkpointer/Session configured** for conversation persistence
716
+ - [ ] **Thread IDs assigned** for conversation isolation
717
+ - [ ] **Trimming or summarization** configured for long conversations
718
+ - [ ] **State-over-history** for sub-agent delegation
719
+ - [ ] **Artifacts passed**, not transcripts, between agents
720
+ - [ ] **75% threshold** for context compaction
721
+ - [ ] **Long-term memory** separated from short-term (if needed)
722
+
723
+ ---
724
+
725
+ ## Quick Reference
726
+
727
+ ### Pattern Selection
728
+
729
+ | Situation | Pattern | Framework Feature |
730
+ |-----------|---------|-------------------|
731
+ | Basic conversation persistence | Checkpointer/Session | LangGraph: `InMemorySaver`, OpenAI: `SQLiteSession` |
732
+ | Long conversations | Summarization middleware | LangGraph: `SummarizationMiddleware` |
733
+ | Multi-agent shared context | Shared state schema | LangGraph: `StateGraph` with shared `TypedDict` |
734
+ | Cross-session user data | Long-term store | LangGraph: `InMemoryStore`, MongoDB Store |
735
+ | Semantic memory retrieval | Vector store integration | External: Pinecone, Chroma, pgvector |
736
+
737
+ ### Anti-Pattern Recognition
738
+
739
+ | If you see... | It's wrong because... | Replace with... |
740
+ |---------------|----------------------|-----------------|
741
+ | `history.append(msg)` | Manual management | Checkpointer |
742
+ | `prompt += history` | String concatenation | Session with auto-injection |
743
+ | Full transcript to sub-agent | Context telephone | Artifact/state passing |
744
+ | No thread_id | No isolation | Explicit thread management |
745
+ | No trimming/summarization | Unbounded growth | Memory middleware |
746
+
747
+ ---
748
+
749
+ ## Research Basis
750
+
751
+ | Source | Key Finding |
752
+ |--------|-------------|
753
+ | "Lost in the Middle" (Liu et al., 2023) | U-shaped attention; middle content ignored |
754
+ | Claude Code 75% Rule (Anthropic) | Quality degrades above 75% context usage |
755
+ | LangChain Short-Term Memory Guide | Checkpointer + summarization patterns |
756
+ | OpenAI Agents SDK Session Docs | Session-based auto-persistence |
757
+ | AWS Memory-Augmented Agents | Memory layer architecture patterns |
758
+ | A-Mem (2025) | Dynamic vs predefined memory access |
759
+
760
+ ---
761
+
762
+ ## See Also
763
+
764
+ - [Agent Prompt Engineering](agent_prompt_engineering.md) — Context architecture, active pruning, state-over-history principle
765
+ - [Multi-Agent Patterns](multi_agent_patterns.md) — Delegation, context passing, artifact handoffs