htm 0.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
- data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
- data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
- data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
- data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
- data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
- data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
- data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
- data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
- data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
- data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
- data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
- data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
- data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
- data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
- data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
- data/.architecture/members.yml +144 -0
- data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
- data/.architecture/reviews/initial-system-analysis.md +330 -0
- data/.envrc +32 -0
- data/.irbrc +145 -0
- data/CHANGELOG.md +150 -0
- data/COMMITS.md +196 -0
- data/LICENSE +21 -0
- data/README.md +1347 -0
- data/Rakefile +51 -0
- data/SETUP.md +268 -0
- data/config/database.yml +67 -0
- data/db/migrate/20250101000001_enable_extensions.rb +14 -0
- data/db/migrate/20250101000002_create_robots.rb +14 -0
- data/db/migrate/20250101000003_create_nodes.rb +42 -0
- data/db/migrate/20250101000005_create_tags.rb +38 -0
- data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
- data/db/schema.sql +473 -0
- data/db/seed_data/README.md +100 -0
- data/db/seed_data/presidents.md +136 -0
- data/db/seed_data/states.md +151 -0
- data/db/seeds.rb +208 -0
- data/dbdoc/README.md +173 -0
- data/dbdoc/public.node_stats.md +48 -0
- data/dbdoc/public.node_stats.svg +41 -0
- data/dbdoc/public.node_tags.md +40 -0
- data/dbdoc/public.node_tags.svg +112 -0
- data/dbdoc/public.nodes.md +54 -0
- data/dbdoc/public.nodes.svg +118 -0
- data/dbdoc/public.nodes_tags.md +39 -0
- data/dbdoc/public.nodes_tags.svg +112 -0
- data/dbdoc/public.ontology_structure.md +48 -0
- data/dbdoc/public.ontology_structure.svg +38 -0
- data/dbdoc/public.operations_log.md +42 -0
- data/dbdoc/public.operations_log.svg +130 -0
- data/dbdoc/public.relationships.md +39 -0
- data/dbdoc/public.relationships.svg +41 -0
- data/dbdoc/public.robot_activity.md +46 -0
- data/dbdoc/public.robot_activity.svg +35 -0
- data/dbdoc/public.robots.md +35 -0
- data/dbdoc/public.robots.svg +90 -0
- data/dbdoc/public.schema_migrations.md +29 -0
- data/dbdoc/public.schema_migrations.svg +26 -0
- data/dbdoc/public.tags.md +35 -0
- data/dbdoc/public.tags.svg +60 -0
- data/dbdoc/public.topic_relationships.md +45 -0
- data/dbdoc/public.topic_relationships.svg +32 -0
- data/dbdoc/schema.json +1437 -0
- data/dbdoc/schema.svg +154 -0
- data/docs/api/database.md +806 -0
- data/docs/api/embedding-service.md +532 -0
- data/docs/api/htm.md +797 -0
- data/docs/api/index.md +259 -0
- data/docs/api/long-term-memory.md +1096 -0
- data/docs/api/working-memory.md +665 -0
- data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
- data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
- data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
- data/docs/architecture/adrs/004-hive-mind.md +437 -0
- data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
- data/docs/architecture/adrs/006-context-assembly.md +496 -0
- data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
- data/docs/architecture/adrs/008-robot-identification.md +625 -0
- data/docs/architecture/adrs/009-never-forget.md +648 -0
- data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
- data/docs/architecture/adrs/011-pgai-integration.md +494 -0
- data/docs/architecture/adrs/index.md +215 -0
- data/docs/architecture/hive-mind.md +736 -0
- data/docs/architecture/index.md +351 -0
- data/docs/architecture/overview.md +538 -0
- data/docs/architecture/two-tier-memory.md +873 -0
- data/docs/assets/css/custom.css +83 -0
- data/docs/assets/images/htm-core-components.svg +63 -0
- data/docs/assets/images/htm-database-schema.svg +93 -0
- data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
- data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
- data/docs/assets/images/htm-layered-architecture.svg +71 -0
- data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
- data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
- data/docs/assets/images/htm.jpg +0 -0
- data/docs/assets/images/htm_demo.gif +0 -0
- data/docs/assets/js/mathjax.js +18 -0
- data/docs/assets/videos/htm_video.mp4 +0 -0
- data/docs/database_rake_tasks.md +322 -0
- data/docs/development/contributing.md +787 -0
- data/docs/development/index.md +336 -0
- data/docs/development/schema.md +596 -0
- data/docs/development/setup.md +719 -0
- data/docs/development/testing.md +819 -0
- data/docs/guides/adding-memories.md +824 -0
- data/docs/guides/context-assembly.md +1009 -0
- data/docs/guides/getting-started.md +577 -0
- data/docs/guides/index.md +118 -0
- data/docs/guides/long-term-memory.md +941 -0
- data/docs/guides/multi-robot.md +866 -0
- data/docs/guides/recalling-memories.md +927 -0
- data/docs/guides/search-strategies.md +953 -0
- data/docs/guides/working-memory.md +717 -0
- data/docs/index.md +214 -0
- data/docs/installation.md +477 -0
- data/docs/multi_framework_support.md +519 -0
- data/docs/quick-start.md +655 -0
- data/docs/setup_local_database.md +302 -0
- data/docs/using_rake_tasks_in_your_app.md +383 -0
- data/examples/basic_usage.rb +93 -0
- data/examples/cli_app/README.md +317 -0
- data/examples/cli_app/htm_cli.rb +270 -0
- data/examples/custom_llm_configuration.rb +183 -0
- data/examples/example_app/Rakefile +71 -0
- data/examples/example_app/app.rb +206 -0
- data/examples/sinatra_app/Gemfile +21 -0
- data/examples/sinatra_app/app.rb +335 -0
- data/lib/htm/active_record_config.rb +113 -0
- data/lib/htm/configuration.rb +342 -0
- data/lib/htm/database.rb +594 -0
- data/lib/htm/embedding_service.rb +115 -0
- data/lib/htm/errors.rb +34 -0
- data/lib/htm/job_adapter.rb +154 -0
- data/lib/htm/jobs/generate_embedding_job.rb +65 -0
- data/lib/htm/jobs/generate_tags_job.rb +82 -0
- data/lib/htm/long_term_memory.rb +965 -0
- data/lib/htm/models/node.rb +109 -0
- data/lib/htm/models/node_tag.rb +33 -0
- data/lib/htm/models/robot.rb +52 -0
- data/lib/htm/models/tag.rb +76 -0
- data/lib/htm/railtie.rb +76 -0
- data/lib/htm/sinatra.rb +157 -0
- data/lib/htm/tag_service.rb +135 -0
- data/lib/htm/tasks.rb +38 -0
- data/lib/htm/version.rb +5 -0
- data/lib/htm/working_memory.rb +182 -0
- data/lib/htm.rb +400 -0
- data/lib/tasks/db.rake +19 -0
- data/lib/tasks/htm.rake +147 -0
- data/lib/tasks/jobs.rake +312 -0
- data/mkdocs.yml +190 -0
- data/scripts/install_local_database.sh +309 -0
- metadata +341 -0

data/.architecture/decisions/adrs/006-context-assembly-strategies.md
@@ -0,0 +1,444 @@

# ADR-006: Context Assembly Strategies

**Status**: Accepted

**Date**: 2025-10-25

**Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)

---

## ⚠️ UPDATE (2025-10-28)

**No changes to this ADR's implementation.**

After initial struggles with database configuration, the decision was made to drop the TimescaleDB extension, as it was not providing sufficient value for the current proof-of-concept applications. Context assembly strategies are unaffected because they operate on working memory, not the database layer.

See [ADR-001](001-use-postgresql-timescaledb-storage.md) for details on the TimescaleDB removal.

---

## Context

When preparing context for an LLM, working memory may contain more information than can fit within token limits. HTM needs to intelligently select which memories to include in the assembled context string.

Challenges in context assembly:

- **Token limits**: LLMs have finite context windows (even with a 128K-token working memory)
- **Relevance**: Not all memories are equally important for the current task
- **Recency bias**: Recent context is often more relevant, but not always
- **Priority conflicts**: Important old memories vs. less important recent ones
- **Performance**: Context assembly should be fast (< 10ms)

Alternative approaches:

1. **FIFO (First-In-First-Out)**: Always include the oldest memories
2. **LIFO (Last-In-First-Out)**: Always include the newest memories
3. **Importance-only**: Sort by importance score
4. **Recency-only**: Sort by access time
5. **Balanced (hybrid)**: Combine importance and recency with a decay function

## Decision

We will implement **three context assembly strategies**: recent, important, and balanced, allowing users to choose based on their use case.

### Strategy Definitions

**1. Recent (`:recent`)**

- Sort by access order, most recently accessed first
- Best for: Conversational continuity, following the current discussion thread
- Use case: Chat interfaces, debugging sessions, iterative development

**2. Important (`:important`)**

- Sort by importance score, highest first
- Best for: Critical information, architectural decisions, key facts
- Use case: Decision-making, strategic planning, summarization

**3. Balanced (`:balanced`)** - **Recommended Default**

- Hybrid scoring: `importance * (1.0 / (1 + recency_hours))`
- Importance weighted by time decay (weight halves after one hour)
- Best for: General-purpose context assembly
- Use case: Most LLM interactions, mixed conversational + strategic context

### Default Strategy

**Balanced** is the recommended default as it provides the best general-purpose behavior, preserving both important long-term knowledge and recent conversational context.

## Rationale

### Why Multiple Strategies?

**Different tasks need different context**:

- Chat conversation: Recent context is critical for coherence
- Strategic planning: Important decisions matter more than recency
- General assistance: A balance of both

**User control over LLM context**:

- Explicit strategy selection gives predictable behavior
- No hidden heuristics or magic
- Easy to debug context issues

**Performance and simplicity**:

- All strategies are simple sorts (O(n log n))
- No ML models or complex algorithms
- Fast context assembly (< 10ms for typical working memory)

### Implementation Details

```ruby
def assemble_context(strategy:, max_tokens: nil)
  max = max_tokens || @max_tokens

  nodes = case strategy
  when :recent
    # Most recently accessed first
    @access_order.reverse.map { |k| @nodes[k] }
  when :important
    # Highest importance first
    @nodes.sort_by { |_k, v| -v[:importance] }.map(&:last)
  when :balanced
    # Importance weighted by reciprocal recency decay (weight halves at one hour)
    @nodes.sort_by { |_k, v|
      recency_hours = (Time.now - v[:added_at]) / 3600.0
      score = v[:importance] * (1.0 / (1 + recency_hours))
      -score # Negate for descending sort
    }.map(&:last)
  else
    raise ArgumentError, "Unknown strategy: #{strategy}"
  end

  # Build context up to the token limit
  context_parts = []
  current_tokens = 0

  nodes.each do |node|
    break if current_tokens + node[:token_count] > max
    context_parts << node[:value]
    current_tokens += node[:token_count]
  end

  context_parts.join("\n\n")
end
```
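
The listing above assumes each working-memory entry is a plain hash carrying `:value`, `:importance`, `:added_at`, and `:token_count`, plus an `@access_order` array of keys for the `:recent` strategy. A minimal sketch of that assumed shape and of how the `:balanced` branch would score one such node (field names are inferred from the code above, not taken from HTM's published API):

```ruby
# Hypothetical node shape, inferred from assemble_context above.
node = {
  value:       "Error: foreign key violation in nodes_tags insert.",
  importance:  7,                # user-assigned, higher = more important
  added_at:    Time.now - 120,   # stored two minutes ago
  token_count: 11                # pre-computed token estimate for :value
}

# How the :balanced branch would score it:
recency_hours = (Time.now - node[:added_at]) / 3600.0
score = node[:importance] * (1.0 / (1 + recency_hours))
puts format("balanced score: %.2f", score)   # => ~6.77 for a two-minute-old node
```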

### User API

```ruby
# Use the balanced strategy (recommended default)
context = htm.create_context(strategy: :balanced)

# Use recent for conversational continuity
context = htm.create_context(strategy: :recent, max_tokens: 4000)

# Use important for strategic decisions
context = htm.create_context(strategy: :important)
```

### Decay Function Analysis

The balanced strategy uses a **reciprocal decay whose weight halves at the one-hour mark**:

```
score = importance * (1.0 / (1 + hours))

Examples:
- Just added (0 hours): importance * 1.0  (full weight)
- 1 hour old:           importance * 0.5  (half weight)
- 3 hours old:          importance * 0.25 (quarter weight)
- 24 hours old:         importance * 0.04 (4% weight)
```

This means:

- Recent memories get full importance weight
- Importance decays quickly in the first few hours
- Very old memories need high importance to compete (see the crossover sketch below)
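
A small sketch makes the crossover concrete, using the same reciprocal formula. The importance values and ages below are illustrative examples, not HTM defaults:

```ruby
# score = importance * 1 / (1 + age_in_hours), as defined above
score = ->(importance, age_hours) { importance * (1.0 / (1 + age_hours)) }

puts score.call(10, 24)    # day-old architectural decision: 10 / 25   = 0.40
puts score.call(3, 0.25)   # 15-minute-old chat turn:         3 / 1.25 = 2.40
puts score.call(3, 12)     # half-day-old chat turn:          3 / 13   ≈ 0.23 -- the old decision outranks it again
```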

## Consequences

### Positive

✅ **Flexible**: Choose a strategy to match the use case
✅ **Predictable**: Clear sorting behavior, no hidden heuristics
✅ **Fast**: Simple sorting algorithms, < 10ms
✅ **Debuggable**: Easy to understand why the context contains certain memories
✅ **User control**: Explicit strategy selection
✅ **Sensible default**: The balanced strategy works well for most cases
✅ **Token-aware**: Respects the max_tokens limit in all strategies

### Negative

❌ **Strategy selection burden**: Users must understand the differences
❌ **No automatic optimization**: Doesn't learn the optimal strategy
❌ **Decay tuning**: The one-hour decay may not suit all use cases
❌ **No semantic clustering**: Doesn't group related memories together
❌ **Position bias**: Earlier context may have more influence on the LLM

### Neutral

➡️ **Importance scoring**: Requires the user to assign meaningful importance
➡️ **Access tracking**: The recent strategy depends on access order
➡️ **Token estimation**: Accuracy depends on token counting precision (a rough character-based heuristic is sketched below)
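
The ADR does not specify how `token_count` is produced. As a purely illustrative stand-in, not HTM's actual tokenizer, a common rule of thumb for English prose is roughly four characters per token:

```ruby
# Hypothetical token estimator -- a stand-in for illustration only.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

puts estimate_tokens("We decided to use PostgreSQL for long-term storage.")  # => 13
```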

## Design Decisions

### Decision: Three Strategies Instead of One
**Rationale**: Different use cases benefit from different strategies. Flexibility > simplicity.

**Alternative**: A single balanced strategy for all use cases
**Rejected**: Forces a one-size-fits-all approach and limits user control

### Decision: Balanced as Default
**Rationale**: Best general-purpose behavior; balances competing priorities

**Alternative**: Recent as default
**Rejected**: Important long-term knowledge gets lost in conversations

### Decision: 1-Hour Decay Timescale
**Rationale**:

- One hour matches a typical development session length
- Quick decay prevents stale context from dominating
- Long enough to preserve within-session continuity

**Alternative**: Configurable decay parameter
**Deferred**: Can be added if real-world usage shows a need for tuning

### Decision: Reciprocal Decay (1 / (1 + hours))
**Rationale**: Simple, predictable, computationally cheap

**Alternative**: Exponential decay `exp(-λ * hours)`
**Rejected**: More complex, harder to reason about, minimal practical difference

### Decision: Token Limit Enforced Strictly
**Rationale**: Never exceed the LLM context window; prevent truncation errors

**Alternative**: Soft limit with warnings
**Rejected**: Hard limits prevent surprising behavior

## Use Cases

### Use Case 1: Conversational Chat
```ruby
# User having a back-and-forth conversation with the LLM
# Recent context is critical for coherence

context = htm.create_context(strategy: :recent, max_tokens: 8000)

# Example memories in working memory:
# - "User prefers debug_me over puts" (importance: 9, 5 days old)
# - "What is the capital of France?" (importance: 1, 2 minutes ago)
# - "Paris is the capital" (importance: 1, 1 minute ago)

# Result: The recent conversation about Paris is included first,
# even though the user preference is more important
```

### Use Case 2: Strategic Planning
```ruby
# LLM helping with architectural decisions
# Important decisions matter more than recent chat

context = htm.create_context(strategy: :important)

# Example memories:
# - "We decided to use PostgreSQL" (importance: 10, 3 days ago)
# - "What time is it?" (importance: 1, 1 minute ago)
# - "TimescaleDB chosen for time-series" (importance: 9, 2 days ago)

# Result: Architectural decisions are included first; the
# time check is dropped if space is limited
```

### Use Case 3: General Assistance (Balanced)
```ruby
# LLM helping with mixed tasks
# Balance recent context + important knowledge

context = htm.create_context(strategy: :balanced)

# Example memories (score = importance / (1 + hours)):
# - "User prefers debug_me" (importance: 9, 5 days ago)           → score: 0.07
# - "PostgreSQL decision" (importance: 10, 3 days ago)            → score: 0.14
# - "Current debugging issue" (importance: 5, 10 minutes ago)     → score: 4.3
# - "Error: foreign key violation" (importance: 7, 2 minutes ago) → score: 6.8

# Result: Recent debugging context is ranked highest,
# but important decisions are still included if space permits
```

### Use Case 4: Custom Token Limit
```ruby
# Generate a short summary for preview
short_context = htm.create_context(strategy: :important, max_tokens: 500)

# Generate full context for the LLM
full_context = htm.create_context(strategy: :balanced, max_tokens: 128_000)
```

## Performance Characteristics

### Complexity

- **Recent**: O(n) - reverse the access-order array
- **Important**: O(n log n) - sort by importance
- **Balanced**: O(n log n) - sort by computed score

### Typical Performance

- **Working memory size**: 50-200 nodes
- **Sorting time**: < 5ms (all strategies)
- **String assembly**: < 5ms
- **Total**: < 10ms for context assembly (see the benchmark sketch below)

### Memory Usage

- **No duplication**: Nodes are stored once; strategies sort references
- **Temporary arrays**: O(n) for sorted node references
- **Output string**: O(total_tokens) for the assembled context

## Risks and Mitigations

### Risk: Suboptimal Decay Parameter

- **Risk**: The one-hour decay doesn't match real usage patterns
- **Likelihood**: Medium (usage patterns vary)
- **Impact**: Low (balanced strategy still works reasonably)
- **Mitigation**:
  - Monitor real-world usage patterns
  - Make decay configurable if needed
  - Document decay behavior clearly

### Risk: Strategy Confusion

- **Risk**: Users don't understand which strategy to use
- **Likelihood**: Medium (three choices require understanding)
- **Impact**: Low (balanced default works well)
- **Mitigation**:
  - Clear documentation with use cases
  - Examples in API docs
  - Sensible default (balanced)

### Risk: Position Bias in LLM

- **Risk**: The LLM pays more attention to early context
- **Likelihood**: High (known LLM behavior)
- **Impact**: Medium (affects which memories have the most influence)
- **Mitigation**:
  - Document the bias in the user guide
  - Consider reverse ordering for some LLMs (future)
  - Let users experiment with strategies

### Risk: Importance Scoring Inconsistency

- **Risk**: Users assign arbitrary importance scores
- **Likelihood**: High (subjective scoring)
- **Impact**: Medium (affects balanced and important strategies)
- **Mitigation**:
  - Document importance scoring guidelines
  - Provide a default importance (1.0) for most memories
  - Consider learned importance in the future

## Future Enhancements

### Automatic Strategy Selection
```ruby
# Detect conversation vs planning context
context = htm.create_context_smart()

# Uses recent for conversational turns
# Uses important for strategic questions
# Uses balanced for mixed contexts
```

### Configurable Decay
```ruby
# Adjust the decay timescale
context = htm.create_context(
  strategy: :balanced,
  decay_hours: 0.5 # Faster decay
)
```

### Semantic Clustering
```ruby
# Group related memories together
context = htm.create_context(
  strategy: :clustered,
  cluster_by: :embedding # Group semantically related nodes
)
```

### LLM-Optimized Ordering
```ruby
# Reverse ordering for LLMs with recency bias
context = htm.create_context(
  strategy: :balanced,
  order: :reverse # Most important last (closer to the query)
)
```

### Multi-Factor Scoring
```ruby
# Custom scoring function
context = htm.create_context(
  strategy: :custom,
  score_fn: ->(node) {
    recency = Time.now - node[:added_at]
    importance = node[:importance]
    access_count = node[:access_count]

    importance * (1.0 / (1 + recency / 3600)) * Math.log(1 + access_count)
  }
)
```

## Alternatives Considered

### Always Include Everything
**Pros**: No selection logic needed, comprehensive context
**Cons**: Exceeds token limits, includes irrelevant information
**Decision**: ❌ Rejected - token limits are real constraints

### LLM-Powered Selection
**Pros**: Most intelligent selection, context-aware
**Cons**: Slow (extra LLM call), expensive, circular dependency
**Decision**: ❌ Rejected - too slow for the context assembly path

### Learned Importance
**Pros**: Automatic optimization based on usage
**Cons**: Complex, requires training data, non-deterministic
**Decision**: 🔄 Deferred - consider for v2 after usage data

### Semantic Similarity to Query
**Pros**: Most relevant to the current query
**Cons**: Requires query embedding, slower, breaks generality
**Decision**: 🔄 Deferred - recall() already does this; context assembly is different

## References

- [LLM Context Window Management](https://arxiv.org/abs/2307.03172)
- [Attention Mechanisms in LLMs](https://arxiv.org/abs/1706.03762)
- [Position Bias in Language Models](https://arxiv.org/abs/2302.00093)
- [Working Memory in Cognitive Science](https://en.wikipedia.org/wiki/Working_memory)

## Review Notes

**AI Engineer**: ✅ Three strategies cover common use cases well. Balanced default is smart. Consider position bias documentation.

**Performance Specialist**: ✅ O(n log n) sorting is fast enough for typical working memory sizes. No concerns.

**Ruby Expert**: ✅ Clean API design. Consider a default parameter: `create_context(strategy: :balanced)` → `create_context(strategy = :balanced)`.

**Domain Expert**: ✅ The decay function is intuitive. The one-hour decay matches development session rhythm.

**Systems Architect**: ✅ The strategy pattern is appropriate. Document a decision matrix for users.