htm 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (155)
  1. checksums.yaml +7 -0
  2. data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
  3. data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
  4. data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
  5. data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
  6. data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
  7. data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
  8. data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
  9. data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
  10. data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
  11. data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
  12. data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
  13. data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
  14. data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
  15. data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
  16. data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
  17. data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
  18. data/.architecture/members.yml +144 -0
  19. data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
  20. data/.architecture/reviews/initial-system-analysis.md +330 -0
  21. data/.envrc +32 -0
  22. data/.irbrc +145 -0
  23. data/CHANGELOG.md +150 -0
  24. data/COMMITS.md +196 -0
  25. data/LICENSE +21 -0
  26. data/README.md +1347 -0
  27. data/Rakefile +51 -0
  28. data/SETUP.md +268 -0
  29. data/config/database.yml +67 -0
  30. data/db/migrate/20250101000001_enable_extensions.rb +14 -0
  31. data/db/migrate/20250101000002_create_robots.rb +14 -0
  32. data/db/migrate/20250101000003_create_nodes.rb +42 -0
  33. data/db/migrate/20250101000005_create_tags.rb +38 -0
  34. data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
  35. data/db/schema.sql +473 -0
  36. data/db/seed_data/README.md +100 -0
  37. data/db/seed_data/presidents.md +136 -0
  38. data/db/seed_data/states.md +151 -0
  39. data/db/seeds.rb +208 -0
  40. data/dbdoc/README.md +173 -0
  41. data/dbdoc/public.node_stats.md +48 -0
  42. data/dbdoc/public.node_stats.svg +41 -0
  43. data/dbdoc/public.node_tags.md +40 -0
  44. data/dbdoc/public.node_tags.svg +112 -0
  45. data/dbdoc/public.nodes.md +54 -0
  46. data/dbdoc/public.nodes.svg +118 -0
  47. data/dbdoc/public.nodes_tags.md +39 -0
  48. data/dbdoc/public.nodes_tags.svg +112 -0
  49. data/dbdoc/public.ontology_structure.md +48 -0
  50. data/dbdoc/public.ontology_structure.svg +38 -0
  51. data/dbdoc/public.operations_log.md +42 -0
  52. data/dbdoc/public.operations_log.svg +130 -0
  53. data/dbdoc/public.relationships.md +39 -0
  54. data/dbdoc/public.relationships.svg +41 -0
  55. data/dbdoc/public.robot_activity.md +46 -0
  56. data/dbdoc/public.robot_activity.svg +35 -0
  57. data/dbdoc/public.robots.md +35 -0
  58. data/dbdoc/public.robots.svg +90 -0
  59. data/dbdoc/public.schema_migrations.md +29 -0
  60. data/dbdoc/public.schema_migrations.svg +26 -0
  61. data/dbdoc/public.tags.md +35 -0
  62. data/dbdoc/public.tags.svg +60 -0
  63. data/dbdoc/public.topic_relationships.md +45 -0
  64. data/dbdoc/public.topic_relationships.svg +32 -0
  65. data/dbdoc/schema.json +1437 -0
  66. data/dbdoc/schema.svg +154 -0
  67. data/docs/api/database.md +806 -0
  68. data/docs/api/embedding-service.md +532 -0
  69. data/docs/api/htm.md +797 -0
  70. data/docs/api/index.md +259 -0
  71. data/docs/api/long-term-memory.md +1096 -0
  72. data/docs/api/working-memory.md +665 -0
  73. data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
  74. data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
  75. data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
  76. data/docs/architecture/adrs/004-hive-mind.md +437 -0
  77. data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
  78. data/docs/architecture/adrs/006-context-assembly.md +496 -0
  79. data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
  80. data/docs/architecture/adrs/008-robot-identification.md +625 -0
  81. data/docs/architecture/adrs/009-never-forget.md +648 -0
  82. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
  83. data/docs/architecture/adrs/011-pgai-integration.md +494 -0
  84. data/docs/architecture/adrs/index.md +215 -0
  85. data/docs/architecture/hive-mind.md +736 -0
  86. data/docs/architecture/index.md +351 -0
  87. data/docs/architecture/overview.md +538 -0
  88. data/docs/architecture/two-tier-memory.md +873 -0
  89. data/docs/assets/css/custom.css +83 -0
  90. data/docs/assets/images/htm-core-components.svg +63 -0
  91. data/docs/assets/images/htm-database-schema.svg +93 -0
  92. data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
  93. data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
  94. data/docs/assets/images/htm-layered-architecture.svg +71 -0
  95. data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
  96. data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
  97. data/docs/assets/images/htm.jpg +0 -0
  98. data/docs/assets/images/htm_demo.gif +0 -0
  99. data/docs/assets/js/mathjax.js +18 -0
  100. data/docs/assets/videos/htm_video.mp4 +0 -0
  101. data/docs/database_rake_tasks.md +322 -0
  102. data/docs/development/contributing.md +787 -0
  103. data/docs/development/index.md +336 -0
  104. data/docs/development/schema.md +596 -0
  105. data/docs/development/setup.md +719 -0
  106. data/docs/development/testing.md +819 -0
  107. data/docs/guides/adding-memories.md +824 -0
  108. data/docs/guides/context-assembly.md +1009 -0
  109. data/docs/guides/getting-started.md +577 -0
  110. data/docs/guides/index.md +118 -0
  111. data/docs/guides/long-term-memory.md +941 -0
  112. data/docs/guides/multi-robot.md +866 -0
  113. data/docs/guides/recalling-memories.md +927 -0
  114. data/docs/guides/search-strategies.md +953 -0
  115. data/docs/guides/working-memory.md +717 -0
  116. data/docs/index.md +214 -0
  117. data/docs/installation.md +477 -0
  118. data/docs/multi_framework_support.md +519 -0
  119. data/docs/quick-start.md +655 -0
  120. data/docs/setup_local_database.md +302 -0
  121. data/docs/using_rake_tasks_in_your_app.md +383 -0
  122. data/examples/basic_usage.rb +93 -0
  123. data/examples/cli_app/README.md +317 -0
  124. data/examples/cli_app/htm_cli.rb +270 -0
  125. data/examples/custom_llm_configuration.rb +183 -0
  126. data/examples/example_app/Rakefile +71 -0
  127. data/examples/example_app/app.rb +206 -0
  128. data/examples/sinatra_app/Gemfile +21 -0
  129. data/examples/sinatra_app/app.rb +335 -0
  130. data/lib/htm/active_record_config.rb +113 -0
  131. data/lib/htm/configuration.rb +342 -0
  132. data/lib/htm/database.rb +594 -0
  133. data/lib/htm/embedding_service.rb +115 -0
  134. data/lib/htm/errors.rb +34 -0
  135. data/lib/htm/job_adapter.rb +154 -0
  136. data/lib/htm/jobs/generate_embedding_job.rb +65 -0
  137. data/lib/htm/jobs/generate_tags_job.rb +82 -0
  138. data/lib/htm/long_term_memory.rb +965 -0
  139. data/lib/htm/models/node.rb +109 -0
  140. data/lib/htm/models/node_tag.rb +33 -0
  141. data/lib/htm/models/robot.rb +52 -0
  142. data/lib/htm/models/tag.rb +76 -0
  143. data/lib/htm/railtie.rb +76 -0
  144. data/lib/htm/sinatra.rb +157 -0
  145. data/lib/htm/tag_service.rb +135 -0
  146. data/lib/htm/tasks.rb +38 -0
  147. data/lib/htm/version.rb +5 -0
  148. data/lib/htm/working_memory.rb +182 -0
  149. data/lib/htm.rb +400 -0
  150. data/lib/tasks/db.rake +19 -0
  151. data/lib/tasks/htm.rake +147 -0
  152. data/lib/tasks/jobs.rake +312 -0
  153. data/mkdocs.yml +190 -0
  154. data/scripts/install_local_database.sh +309 -0
  155. metadata +341 -0
data/docs/architecture/adrs/001-postgresql-timescaledb.md
@@ -0,0 +1,314 @@
1
+ # ADR-001: PostgreSQL with TimescaleDB for Storage
2
+
3
+ **Status**: Accepted
4
+
5
+ **Date**: 2025-10-25
6
+
7
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
8
+
9
+ ---
10
+
11
+ ## Quick Summary
12
+
13
+ HTM uses **PostgreSQL with TimescaleDB** as its primary storage backend, providing time-series optimization, vector embeddings (pgvector), full-text search, and ACID compliance in a single, production-proven database system.
14
+
15
+ **Why**: Consolidates time-series data, vector search, and full-text capabilities in one system rather than maintaining multiple specialized databases.
16
+
17
+ **Impact**: Production-ready storage with excellent tooling, at the cost of some operational complexity compared to simpler alternatives.
18
+
19
+ ---
20
+
21
+ ## Context
22
+
23
+ HTM requires a persistent storage solution that can handle:
24
+
25
+ - Time-series data with efficient time-range queries
26
+ - Vector embeddings for semantic search
27
+ - Full-text search capabilities
28
+ - ACID compliance for data integrity
29
+ - Scalability for growing memory databases
30
+ - Production-grade reliability
31
+
32
+ ### Alternative Options Considered
33
+
34
+ 1. **Pure PostgreSQL**: Solid relational database, pgvector support
35
+ 2. **TimescaleDB**: PostgreSQL extension optimized for time-series
36
+ 3. **Elasticsearch**: Strong full-text search, vector support added
37
+ 4. **Pinecone/Weaviate**: Specialized vector databases
38
+ 5. **SQLite + extensions**: Simple, embedded option
39
+
40
+ ---
41
+
42
+ ## Decision
43
+
44
+ We will use **PostgreSQL with TimescaleDB** as the primary storage backend for HTM.
45
+
46
+ ---
47
+
48
+ ## Rationale
49
+
50
+ ### Why PostgreSQL?
51
+
52
+ **Production-proven**:
53
+ - Decades of reliability in demanding environments
54
+ - ACID compliance guarantees data integrity for memory operations
55
+ - Rich ecosystem with extensive tooling, monitoring, and support
56
+
57
+ **Search capabilities**:
58
+ - **pgvector extension**: Native vector similarity search with HNSW indexing
59
+ - **Full-text search**: Built-in tsvector with GIN indexing
60
+ - **pg_trgm extension**: Trigram-based fuzzy matching
61
+
62
+ **Developer experience**:
63
+ - Strong typing with schema enforcement prevents data corruption
64
+ - Wide adoption means it is well understood by developers
65
+ - Standard SQL with PostgreSQL-specific enhancements
66
+
67
+ ### Why TimescaleDB?
68
+
69
+ **Time-series optimization**:
70
+ - **Hypertable partitioning**: Automatic chunk-based time partitioning
71
+ - **Compression policies**: Automatic compression of old data (70-90% reduction)
72
+ - **Time-range optimization**: Fast queries on temporal data via chunk exclusion
73
+
74
+ **PostgreSQL compatibility**:
75
+ - Drop-in extension, not a fork
76
+ - All PostgreSQL features remain available
77
+ - Standard PostgreSQL tools work seamlessly
78
+
79
+ **Operational features**:
80
+ - **Continuous aggregates**: Pre-computed summaries for analytics
81
+ - **Retention policies**: Automatic data lifecycle management
82
+ - **Cloud offering**: Managed service available (TimescaleDB Cloud)
83
+
84
+ ### Why Not Alternatives?
85
+
86
+ !!! warning "Elasticsearch"
87
+ - High operational complexity (JVM tuning, cluster management)
88
+ - Higher resource usage
89
+ - Vector support more recent, less mature
90
+ - Superior full-text search not critical for our use case
91
+
92
+ !!! info "Specialized Vector DBs (Pinecone, Weaviate, Qdrant)"
93
+ - Additional service dependency increases complexity
94
+ - Limited relational capabilities
95
+ - Vendor lock-in concerns
96
+ - Cost considerations for managed services
97
+ - Strength: excellent vector search performance
98
+ - Strength: purpose-built for embeddings
99
+
100
+ !!! note "SQLite"
101
+ - Limited concurrency (write locks)
102
+ - No native vector search (extensions experimental)
103
+ - Not suitable for multi-robot scenarios
104
+ - Strength: simple deployment
104
+ - Strength: zero configuration
106
+
107
+ ---
108
+
109
+ ## Implementation Details
110
+
111
+ ### Schema Design
112
+
113
+ ```sql
114
+ -- Nodes table as TimescaleDB hypertable
115
+ CREATE TABLE nodes (
116
+ id SERIAL PRIMARY KEY,
117
+ key TEXT UNIQUE NOT NULL,
118
+ value TEXT NOT NULL,
119
+ embedding vector(1536),
120
+ robot_id TEXT NOT NULL,
121
+ created_at TIMESTAMP NOT NULL,
122
+ importance FLOAT DEFAULT 1.0,
123
+ type TEXT,
124
+ metadata JSONB
125
+ );
126
+
127
+ -- Convert to hypertable (TimescaleDB)
128
+ SELECT create_hypertable('nodes', 'created_at');
129
+
130
+ -- Vector indexing (HNSW for approximate nearest neighbor)
131
+ CREATE INDEX nodes_embedding_idx ON nodes
132
+ USING hnsw (embedding vector_cosine_ops);
133
+
134
+ -- Full-text indexing
135
+ CREATE INDEX nodes_fts_idx ON nodes
136
+ USING GIN (to_tsvector('english', value));
137
+
138
+ -- Additional indexes
139
+ CREATE INDEX nodes_robot_id_idx ON nodes (robot_id);
140
+ CREATE INDEX nodes_created_at_idx ON nodes (created_at DESC);
141
+ CREATE INDEX nodes_type_idx ON nodes (type);
142
+ ```
143
+
144
+ ### Connection Configuration
145
+
146
+ ```ruby
147
+ # Via environment variable (preferred)
148
+ ENV['HTM_DBURL'] = "postgresql://tsdbadmin:secret@host:5432/tsdb?sslmode=require"
149
+
150
+ # Parsed into connection hash
151
+ {
152
+ host: 'host',
153
+ port: 5432,
154
+ dbname: 'tsdb',
155
+ user: 'tsdbadmin',
156
+ password: 'secret',
157
+ sslmode: 'require'
158
+ }
159
+ ```
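A minimal sketch (not part of the package) of how a URL such as `HTM_DBURL` could be parsed into a connection hash of that shape, using only Ruby's standard `uri` library. The helper name `parse_db_url`, the example hostname, and the fallback values are assumptions for illustration; HTM's own parsing may differ.

```ruby
require "uri"

# Hypothetical helper (not HTM's API): turn a PostgreSQL URL into a
# connection hash with the same keys as the hash shown above.
def parse_db_url(url)
  uri = URI.parse(url)
  query = URI.decode_www_form(uri.query.to_s).to_h

  {
    host: uri.host,
    port: uri.port || 5432,                    # assumed fallback: PostgreSQL's default port
    dbname: uri.path.delete_prefix("/"),
    user: uri.user,
    password: uri.password,
    sslmode: query.fetch("sslmode", "prefer")  # assumed fallback when the URL omits sslmode
  }
end

config = parse_db_url("postgresql://tsdbadmin:secret@db.example.com:5432/tsdb?sslmode=require")
```

Note that `URI` does not percent-decode the user or password, so credentials containing reserved characters would need extra handling.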
160
+
161
+ ### Key Features Enabled
162
+
163
+ **Vector similarity search**:
164
+ ```sql
165
+ -- Find semantically similar nodes
166
+ SELECT *, 1 - (embedding <=> $1::vector) as similarity
167
+ FROM nodes
168
+ WHERE created_at > NOW() - INTERVAL '30 days'
169
+ ORDER BY embedding <=> $1::vector
170
+ LIMIT 10;
171
+ ```
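In pgvector, `<=>` is the cosine distance operator, which is why the query reports `1 - (embedding <=> $1::vector)` as the similarity. The following sketch (not part of the package) computes the same quantities in plain Ruby on small vectors; the method names are illustrative, and returning `0.0` for a zero-length vector is a choice made for this example only.

```ruby
# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?   # example-only convention for zero vectors

  dot / (norm_a * norm_b)
end

# Cosine distance is one minus the similarity, so smaller means closer.
def cosine_distance(a, b)
  1.0 - cosine_similarity(a, b)
end

same       = cosine_similarity([1, 0], [1, 0])
orthogonal = cosine_similarity([1, 0], [0, 1])
opposite   = cosine_similarity([1, 0], [-1, 0])
```

Ordering by distance ascending, as the SQL does, therefore returns the most similar nodes first.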
172
+
173
+ **Full-text search**:
174
+ ```sql
175
+ -- Find nodes containing keywords
176
+ SELECT *, ts_rank(to_tsvector('english', value),
177
+ plainto_tsquery('english', $1)) as rank
178
+ FROM nodes
179
+ WHERE to_tsvector('english', value) @@ plainto_tsquery('english', $1)
180
+ ORDER BY rank DESC
181
+ LIMIT 10;
182
+ ```
183
+
184
+ **Time-range queries** (optimized by chunk exclusion):
185
+ ```sql
186
+ -- Fast time-range query (TimescaleDB prunes chunks)
187
+ SELECT * FROM nodes
188
+ WHERE created_at BETWEEN '2025-10-01' AND '2025-10-25'
189
+ AND robot_id = 'robot-123'
190
+ ORDER BY created_at DESC;
191
+ ```
192
+
193
+ **Automatic compression**:
194
+ ```sql
195
+ -- Enable compression first, segmenting by robot_id and type for better compression
196
+ ALTER TABLE nodes SET (
197
+ timescaledb.compress,
198
+ timescaledb.compress_segmentby = 'robot_id, type'
199
+ );
200
+
201
+ -- Then add the policy: compress chunks older than 30 days
202
+ SELECT add_compression_policy('nodes', INTERVAL '30 days');
203
+ ```
204
+
205
+ ---
206
+
207
+ ## Consequences
208
+
209
+ ### Positive
210
+
211
+ - Production-ready with battle-tested reliability
212
+ - Multi-modal search: vector, full-text, and hybrid strategies
213
+ - Time-series optimization for efficient temporal queries
214
+ - Cost-effective storage with compression reducing cloud costs
215
+ - Familiar tooling: standard PostgreSQL tools and practices
216
+ - Flexible querying: full SQL power for complex operations
217
+ - ACID guarantees for critical memory operations
218
+
219
+ ### Negative
220
+
221
+ - Operational complexity: requires database management (mitigated by using a managed service)
222
+ - Vertical scaling limits (mitigated by partitioning)
223
+ - Connection overhead: PostgreSQL connections are relatively heavy
224
+ - Vector search performance slower than specialized vector DBs at massive scale
225
+
226
+ ### Neutral
227
+
228
+ - Learning curve: developers need PostgreSQL + TimescaleDB knowledge
229
+ - Cloud dependency: currently using TimescaleDB Cloud (could self-host)
230
+ - Extension management: requires the timescaledb, pgvector, and pg_trgm extensions
231
+
232
+ ---
233
+
234
+ ## Risks and Mitigations
235
+
236
+ ### Risk: Extension Availability
237
+
238
+ !!! danger "Risk"
239
+ Extensions not available in all PostgreSQL environments
240
+
241
+ **Likelihood**: Low (extensions widely available)
242
+ **Impact**: High (breaks core functionality)
243
+ **Mitigation**: Document requirements clearly, verify in setup process
244
+
245
+ ### Risk: Connection Exhaustion
246
+
247
+ !!! warning "Risk"
248
+ PostgreSQL limits concurrent connections (`max_connections` defaults to 100)
249
+
250
+ **Likelihood**: Medium (with many robots)
251
+ **Impact**: Medium (service degradation)
252
+ **Mitigation**: Implement connection pooling (ConnectionPool gem)
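To illustrate the pooling idea behind this mitigation, here is a minimal sketch (not part of the package) built on Ruby's core `Queue`. The class name `SimplePool` and the string "connections" are stand-ins for illustration; in practice the `connection_pool` gem provides a similar `with` interface plus timeouts.

```ruby
# A tiny fixed-size pool: at most `size` connections ever exist, and
# callers block until one is free. `SimplePool` is a hypothetical name.
class SimplePool
  def initialize(size:, &factory)
    @available = Queue.new
    size.times { @available << factory.call }
  end

  # Check a connection out, yield it, and always return it to the pool.
  def with
    conn = @available.pop   # blocks while every connection is in use
    yield conn
  ensure
    @available << conn if conn
  end
end

created = 0
pool = SimplePool.new(size: 2) do
  created += 1
  "conn-#{created}"         # stand-in for a real database connection
end

used = Queue.new
threads = 8.times.map do
  Thread.new { pool.with { |conn| used << conn } }
end
threads.each(&:join)
```

Eight threads share two connections here, so the number of robots can grow without the number of database connections growing with it.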
253
+
254
+ ### Risk: Storage Costs
255
+
256
+ !!! info "Risk"
257
+ Vector data storage can be expensive at scale
258
+
259
+ **Likelihood**: Medium (depends on usage)
260
+ **Impact**: Medium (operational cost)
261
+ **Mitigation**: Compression policies, retention policies, archival strategies
262
+
263
+ ### Risk: Query Performance at Scale
264
+
265
+ !!! warning "Risk"
266
+ Complex hybrid searches may slow with millions of nodes
267
+
268
+ **Likelihood**: Low (with proper indexing)
269
+ **Impact**: Medium (user experience)
270
+ **Mitigation**: Query optimization, read replicas, caching layer
271
+
272
+ ---
273
+
274
+ ## Alternatives Comparison
275
+
276
+ | Solution | Pros | Cons | Decision |
277
+ |----------|------|------|----------|
278
+ | Pure PostgreSQL | Simple, reliable, pgvector | No time-series optimization | Rejected |
279
+ | **PostgreSQL + TimescaleDB** | **Best of both worlds** | **Slight complexity** | **ACCEPTED** |
280
+ | Elasticsearch | Excellent full-text search | High resource usage, complexity | Rejected |
281
+ | Pinecone | Purpose-built vectors | Vendor lock-in, cost, limited relational | Rejected |
282
+ | SQLite | Simple, embedded | Limited concurrency, no vectors | Rejected |
283
+
284
+ ---
285
+
286
+ ## Future Considerations
287
+
288
+ - **Read replicas**: For query scaling when needed
289
+ - **Partitioning strategies**: By robot_id for tenant isolation
290
+ - **Caching layer**: Redis for hot nodes
291
+ - **Archive tier**: S3/Glacier for very old memories
292
+ - **Multi-region**: For global deployment
293
+
294
+ ---
295
+
296
+ ## References
297
+
298
+ - [TimescaleDB Documentation](https://docs.timescale.com/)
299
+ - [pgvector Documentation](https://github.com/pgvector/pgvector)
300
+ - [PostgreSQL Full-Text Search](https://www.postgresql.org/docs/current/textsearch.html)
301
+ - [HTM Database Schema Guide](../../development/schema.md)
302
+ - [HTM Configuration Guide](../../installation.md)
303
+
304
+ ---
305
+
306
+ ## Review Notes
307
+
308
+ **Systems Architect**: Solid choice for time-series + vector workload. Consider read replicas for scaling.
309
+
310
+ **Database Architect**: Excellent indexing strategy. Monitor query performance as data grows.
311
+
312
+ **Performance Specialist**: TimescaleDB compression will help with costs. Add connection pooling soon.
313
+
314
+ **Maintainability Expert**: PostgreSQL tooling is mature and well-documented. Good choice for long-term maintenance.
data/docs/architecture/adrs/002-two-tier-memory.md
@@ -0,0 +1,411 @@
1
+ # ADR-002: Two-Tier Memory Architecture
2
+
3
+ **Status**: Accepted
4
+
5
+ **Date**: 2025-10-25
6
+
7
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
8
+
9
+ ---
10
+
11
+ ## Quick Summary
12
+
13
+ HTM implements a **two-tier memory architecture** with token-limited working memory (hot tier) and unlimited long-term memory (cold tier), managing LLM context windows while preserving all historical data through RAG-based retrieval.
14
+
15
+ **Why**: LLMs have limited context windows but need awareness across long conversations. Two tiers provide fast access to recent context while maintaining complete history.
16
+
17
+ **Impact**: Efficient token budget management with never-forget guarantees, at the cost of coordination between two storage layers.
18
+
19
+ ---
20
+
21
+ ## Context
22
+
23
+ LLM-based applications face a fundamental challenge: LLMs have limited context windows (typically 128K-200K tokens) but need to maintain awareness across long conversations and sessions spanning days, weeks, or months.
24
+
25
+ ### Requirements
26
+
27
+ - Persist memories across sessions (durable storage)
28
+ - Provide fast access to recent/relevant context
29
+ - Manage token budgets efficiently
30
+ - Never lose data accidentally
31
+ - Support contextual recall from the past
32
+
33
+ ### Alternative Approaches
34
+
35
+ 1. **Database-only**: Store everything in PostgreSQL, load on demand
36
+ 2. **Memory-only**: Keep everything in RAM, serialize on shutdown
37
+ 3. **Two-tier**: Combine fast working memory with durable long-term storage
38
+ 4. **External service**: Use a managed memory service
39
+
40
+ ---
41
+
42
+ ## Decision
43
+
44
+ We will implement a **two-tier memory architecture** with:
45
+
46
+ - **Working Memory**: Token-limited, in-memory active context
47
+ - **Long-term Memory**: Durable PostgreSQL storage
48
+
49
+ ---
50
+
51
+ ## Rationale
52
+
53
+ ### Working Memory (Hot Tier)
54
+
55
+ **Characteristics**:
56
+ - **Purpose**: Immediate context for LLM
57
+ - **Storage**: In-memory Ruby data structures
58
+ - **Capacity**: Token-limited (default 128K tokens)
59
+ - **Eviction**: LRU-based eviction when full
60
+ - **Access pattern**: Frequent reads, moderate writes
61
+ - **Lifetime**: Process lifetime
62
+
63
+ **Benefits**:
64
+ - O(1) hash lookups for fast context access
65
+ - Token budget control prevents context overflow
66
+ - Explicit eviction policy with transparent behavior
67
+
68
+ ### Long-term Memory (Cold Tier)
69
+
70
+ **Characteristics**:
71
+ - **Purpose**: Permanent knowledge base
72
+ - **Storage**: PostgreSQL with TimescaleDB
73
+ - **Capacity**: Effectively unlimited
74
+ - **Retention**: Permanent (explicit deletion only)
75
+ - **Access pattern**: RAG-based retrieval
76
+ - **Lifetime**: Forever
77
+
78
+ **Benefits**:
79
+ - Never lose data, survives restarts
80
+ - Search historical context semantically
81
+ - Time-series queries for temporal context
82
+
83
+ ### Data Flow
84
+
85
+ ```
86
+ Add Memory:
87
+ User Input → Working Memory → Long-term Memory
88
+ (immediate) (persisted)
89
+
90
+ Recall Memory:
91
+ Query → Long-term Memory (RAG search) → Working Memory
92
+ (semantic + temporal) (evict if needed)
93
+
94
+ Eviction:
95
+ Working Memory (full) → Evict LRU → Long-term Memory (already there)
96
+ (mark as evicted, not deleted)
97
+ ```
98
+
99
+ ---
100
+
101
+ ## Implementation Details
102
+
103
+ ### Working Memory
104
+
105
+ ```ruby
106
+ class WorkingMemory
107
+ attr_reader :max_tokens, :token_count
108
+
109
+ def initialize(max_tokens: 128_000)
110
+ @nodes = {} # key => {value, token_count, importance, added_at, last_accessed}
111
+ @max_tokens = max_tokens
112
+ @token_count = 0
113
+ @access_order = [] # Track access for LRU
114
+ end
115
+
116
+ def add(key, value, token_count:, importance: 1.0)
117
+ evict_to_make_space(token_count) if needs_eviction?(token_count)
118
+ @nodes[key] = {
119
+ value: value,
120
+ token_count: token_count,
121
+ importance: importance,
122
+ added_at: Time.now,
123
+ last_accessed: Time.now
124
+ }
125
+ @token_count += token_count
126
+ @access_order << key
127
+ end
128
+
129
+ def evict_to_make_space(needed_tokens)
130
+ # LRU eviction based on last access + importance
131
+ # See ADR-007 for detailed eviction strategy
132
+ end
133
+
134
+ def assemble_context(strategy: :balanced, max_tokens: nil)
135
+ # Sort by strategy and assemble within budget
136
+ # See ADR-006 for context assembly strategies
137
+ end
138
+ end
139
+ ```
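The class above leaves `evict_to_make_space` elided. As a minimal sketch (not part of the package), the following stand-in shows one way a token budget with eviction could behave, assuming plain least-recently-used order and ignoring importance. The class name `TinyWorkingMemory` and its methods are hypothetical and do not reproduce the ADR-007 algorithm.

```ruby
# Hypothetical stand-in for WorkingMemory: plain LRU under a token budget.
class TinyWorkingMemory
  attr_reader :max_tokens, :token_count, :evicted

  def initialize(max_tokens:)
    @max_tokens = max_tokens
    @token_count = 0
    @nodes = {}     # Ruby hashes keep insertion order: first key = least recently used
    @evicted = []   # keys pushed out of working memory (they stay in long-term storage)
  end

  def add(key, value, token_count:)
    raise ArgumentError, "node exceeds the whole budget" if token_count > @max_tokens

    remove(key) if @nodes.key?(key)   # re-adding replaces the old entry
    evict_oldest while @token_count + token_count > @max_tokens
    @nodes[key] = { value: value, token_count: token_count }
    @token_count += token_count
  end

  # Reading a node moves it to the most-recently-used position.
  def get(key)
    return nil unless @nodes.key?(key)

    node = @nodes.delete(key)
    @nodes[key] = node
    node[:value]
  end

  def keys
    @nodes.keys
  end

  private

  def remove(key)
    node = @nodes.delete(key)
    @token_count -= node[:token_count]
  end

  def evict_oldest
    key = @nodes.keys.first
    remove(key)
    @evicted << key
  end
end

wm = TinyWorkingMemory.new(max_tokens: 100)
wm.add("a", "first", token_count: 40)
wm.add("b", "second", token_count: 40)
wm.get("a")                             # "a" becomes the most recently used node
wm.add("c", "third", token_count: 40)   # over budget, so "b" is evicted
```

After these calls the working set holds `"a"` and `"c"` at 80 tokens, and `"b"` appears only in the eviction list.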
140
+
141
+ ### Long-term Memory
142
+
143
+ ```ruby
144
+ class LongTermMemory
145
+ def add(key:, value:, embedding:, robot_id:, importance: 1.0, type: nil)
146
+ # Insert into PostgreSQL with vector embedding
147
+ @db.exec_params(<<~SQL, [key, value, embedding, robot_id, importance, type])
148
+ INSERT INTO nodes (key, value, embedding, robot_id, importance, type, created_at)
149
+ VALUES ($1, $2, $3, $4, $5, $6, CURRENT_TIMESTAMP)
150
+ RETURNING id
151
+ SQL
152
+ end
153
+
154
+ def search(timeframe:, query:, embedding_service:, limit:, strategy: :hybrid)
155
+ # RAG-based retrieval: temporal + semantic
156
+ # See ADR-005 for retrieval strategies
157
+ end
158
+
159
+ def mark_evicted(keys)
160
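+ # Update in_working_memory flag (not deleted); the pg gem needs the Ruby array encoded as a PostgreSQL array literal
161
+ @db.exec_params(<<~SQL, [PG::TextEncoder::Array.new.encode(keys)])
162
+ UPDATE nodes
163
+ SET in_working_memory = FALSE
164
+ WHERE key = ANY($1::text[])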
165
+ SQL
166
+ end
167
+ end
168
+ ```
169
+
170
+ ### Coordination (HTM Class)
171
+
172
+ ```ruby
173
+ class HTM
174
+ def initialize(robot_name:, robot_id: nil, max_tokens: 128_000, ...)
175
+ @working_memory = WorkingMemory.new(max_tokens: max_tokens)
176
+ @long_term_memory = LongTermMemory.new(db_config)
177
+ @embedding_service = EmbeddingService.new(...)
178
+ @robot_id = robot_id || SecureRandom.uuid
179
+ @robot_name = robot_name
180
+ end
181
+
182
+ def add_node(key, value, importance: 1.0, type: nil)
183
+ # 1. Generate embedding
184
+ embedding = @embedding_service.embed(value)
185
+
186
+ # 2. Store in long-term memory
187
+ @long_term_memory.add(
188
+ key: key,
189
+ value: value,
190
+ embedding: embedding,
191
+ robot_id: @robot_id,
192
+ importance: importance,
193
+ type: type
194
+ )
195
+
196
+ # 3. Add to working memory (evict if needed)
197
+ token_count = estimate_tokens(value)
198
+ @working_memory.add(key, value,
199
+ token_count: token_count,
200
+ importance: importance)
201
+ end
202
+
203
+ def recall(timeframe:, topic:, limit: 10, strategy: :hybrid)
204
+ # 1. Search long-term memory (RAG)
205
+ results = @long_term_memory.search(
206
+ timeframe: timeframe,
207
+ query: topic,
208
+ embedding_service: @embedding_service,
209
+ limit: limit,
210
+ strategy: strategy
211
+ )
212
+
213
+ # 2. Add results to working memory (evict if needed)
214
+ results.each do |node|
215
+ @working_memory.add(node[:key], node[:value],
216
+ token_count: node[:token_count],
217
+ importance: node[:importance])
218
+ end
219
+
220
+ # 3. Return nodes
221
+ results
222
+ end
223
+ end
224
+ ```
225
+
226
+ ---
227
+
228
+ ## Consequences
229
+
230
+ ### Positive
231
+
232
+ - Fast context access through O(1) working memory lookups
233
+ - Durable storage: data is never lost and survives restarts
234
+ - Token budget control with automatic management
235
+ - Explicit eviction policy provides transparent behavior
236
+ - RAG-enabled semantic search of historical context
237
+ - Never-delete philosophy: eviction moves data, never removes
238
+ - Process-isolated: each robot instance has independent working memory
239
+
240
+ ### Negative
241
+
242
+ - Complexity of coordinating two storage layers
243
+ - Memory overhead from working memory consuming RAM
244
+ - Synchronization challenges keeping both tiers consistent
245
+ - Eviction overhead when moving data between tiers
246
+
247
+ ### Neutral
248
+
249
+ - Token counting requires accurate estimation
250
+ - Strategy tuning for eviction and assembly needs calibration
251
+ - Per-process state means working memory is not shared across processes
252
+
253
+ ---
254
+
255
+ ## Eviction Strategies
256
+
257
+ ### LRU-based (Implemented)
258
+
259
+ ```ruby
260
+ def eviction_score(node)
261
+ recency = Time.now - node[:last_accessed]
262
+ importance = node[:importance]
263
+
264
+ # Lower score = evict first
265
+ importance / (recency + 1.0)
266
+ end
267
+ ```
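A short worked example (not part of the package) of how this score orders nodes for eviction. The only change from the formula above is that `now` is passed in as a parameter, an assumption made so the example is deterministic; the node names and values are invented.

```ruby
# Same formula as above, with `now` injected instead of calling Time.now.
def eviction_score(node, now:)
  recency = now - node[:last_accessed]   # seconds since the node was last accessed
  node[:importance] / (recency + 1.0)
end

now = Time.at(1_000_000)
nodes = {
  "stale_trivia" => { importance: 1.0, last_accessed: now - 3600 },
  "stale_fact"   => { importance: 9.0, last_accessed: now - 3600 },
  "fresh_chat"   => { importance: 1.0, last_accessed: now - 9 }
}

# Lowest score first: these are the first candidates for eviction.
eviction_order = nodes.sort_by { |_key, node| eviction_score(node, now: now) }.map(&:first)
# => ["stale_trivia", "stale_fact", "fresh_chat"]
```

Importance only delays eviction: an hour-old node with importance 9.0 still scores below a node of importance 1.0 touched nine seconds ago.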
268
+
269
+ See [ADR-007: Working Memory Eviction Strategy](007-eviction-strategy.md) for detailed eviction algorithm.
270
+
271
+ ### Future Strategies
272
+
273
+ - **Importance-only**: Keep most important nodes
274
+ - **Recency-only**: Pure LRU cache
275
+ - **Frequency-based**: Track access counts
276
+ - **Category-based**: Pin certain types (facts, preferences)
277
+ - **Smart eviction**: ML-based prediction of future access
278
+
279
+ ---
280
+
281
+ ## Context Assembly Strategies
282
+
283
+ ### Recent (`:recent`)
284
+ Sort by `created_at DESC`, newest first
285
+
286
+ ### Important (`:important`)
287
+ Sort by `importance DESC`, most important first
288
+
289
+ ### Balanced (`:balanced`)
290
+ ```ruby
291
+ score = importance * (1.0 / (1 + age_in_hours))
292
+ ```
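A minimal sketch (not part of the package) of how the balanced score could drive context assembly under a token budget. The helper name `assemble_balanced`, the node fields, and the choice to skip nodes that do not fit (rather than stop at the first one) are assumptions for illustration; ADR-006 describes the actual strategies.

```ruby
# Hypothetical helper: pick nodes by balanced score until the budget is spent.
def assemble_balanced(nodes, max_tokens:, now:)
  ranked = nodes.sort_by do |node|
    age_in_hours = (now - node[:created_at]) / 3600.0
    -(node[:importance] * (1.0 / (1 + age_in_hours)))   # negated for descending order
  end

  used = 0
  ranked.each_with_object([]) do |node, picked|
    next if used + node[:token_count] > max_tokens      # skip nodes that do not fit

    picked << node[:key]
    used += node[:token_count]
  end
end

now = Time.at(1_000_000)
nodes = [
  { key: "old_important", importance: 8.0, created_at: now - 3 * 3600, token_count: 50 },
  { key: "new_minor",     importance: 1.0, created_at: now,            token_count: 30 },
  { key: "new_important", importance: 5.0, created_at: now - 3600,     token_count: 60 }
]

context_keys = assemble_balanced(nodes, max_tokens: 100, now: now)
# => ["new_important", "new_minor"]
```

The scores are 2.5, 2.0, and 1.0 respectively, so `"new_important"` is taken first, `"old_important"` no longer fits in the remaining 40 tokens, and `"new_minor"` fills the gap.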
293
+
294
+ See [ADR-006: Context Assembly Strategies](006-context-assembly.md) for detailed assembly algorithms.
295
+
296
+ ---
297
+
298
+ ## Design Principles
299
+
300
+ ### Never Forget (Unless Told)
301
+
302
+ - Eviction moves data, never deletes
303
+ - Only `forget(confirm: :confirmed)` deletes
304
+ - Long-term memory is append-only (updates rare)
305
+
306
+ See [ADR-009: Never-Forget Philosophy](009-never-forget.md) for deletion policies.
307
+
308
+ ### Token Budget Management
309
+
310
+ - Token counting happens at add time
311
+ - Working memory enforces hard token limit
312
+ - Context assembly respects token budget
313
+ - Safety margin (10%) for token estimation errors
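The budget rules above can be sketched as follows (not part of the package). The four-characters-per-token estimate is a crude heuristic assumed for this example, not a real tokenizer, and the names `estimate_tokens`, `effective_budget`, and `fits?` are illustrative.

```ruby
# Hypothetical estimator: roughly four characters per token, rounded up.
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Reserve a percentage of the budget to absorb estimation error.
# Integer arithmetic avoids floating-point rounding at the boundary.
def effective_budget(max_tokens, safety_margin_percent: 10)
  max_tokens * (100 - safety_margin_percent) / 100
end

# True when the text still fits once the safety margin is held back.
def fits?(text, used_tokens:, max_tokens:)
  used_tokens + estimate_tokens(text) <= effective_budget(max_tokens)
end

budget = effective_budget(128_000)   # => 115200
```

With a 128K limit, the 10% margin leaves 115,200 usable tokens, so an estimate that runs a few percent low still stays inside the model's real limit.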
314
+
315
+ ### Transparent Behavior
316
+
317
+ - Log all evictions
318
+ - Track `in_working_memory` flag
319
+ - Operations log for audit trail
320
+
321
+ ---
322
+
323
+ ## Risks and Mitigations
324
+
325
+ ### Risk: Token Count Inaccuracy
326
+
327
+ !!! warning "Risk"
328
+ Tiktoken approximation differs from LLM's actual count
329
+
330
+ **Likelihood**: Medium (different tokenizers)
331
+ **Impact**: Medium (context overflow)
332
+ **Mitigation**: Add safety margin (10%), use LLM-specific counters
333
+
334
+ ### Risk: Eviction Thrashing
335
+
336
+ !!! info "Risk"
337
+ Constant eviction/recall cycles
338
+
339
+ **Likelihood**: Low (with proper sizing)
340
+ **Impact**: Medium (performance degradation)
341
+ **Mitigation**: Larger working memory, smarter eviction, caching
342
+
343
+ ### Risk: Working Memory Growth
344
+
345
+ !!! danger "Risk"
346
+ Memory leaks or unbounded growth
347
+
348
+ **Likelihood**: Low (token budget enforced)
349
+ **Impact**: High (OOM crashes)
350
+ **Mitigation**: Hard limits, monitoring, alerts
351
+
352
+ ### Risk: Stale Working Memory
353
+
354
+ !!! note "Risk"
355
+ Working memory doesn't reflect long-term updates
356
+
357
+ **Likelihood**: Low (single-writer pattern)
358
+ **Impact**: Low (eventual consistency OK)
359
+ **Mitigation**: Refresh on recall, invalidation on update
360
+
361
+ ---
362
+
363
+ ## Performance Characteristics
364
+
365
+ ### Working Memory
366
+
367
+ - **Add**: O(1) when no eviction is needed (an add that triggers eviction pays the O(n log n) eviction cost)
368
+ - **Retrieve**: O(1) hash lookup
369
+ - **Eviction**: O(n log n) for sorting, O(k) for removing k nodes
370
+ - **Context assembly**: O(n log n) for sorting, O(k) for selecting
371
+
372
+ ### Long-term Memory
373
+
374
+ - **Add**: O(log n) PostgreSQL insert with indexes
375
+ - **Vector search**: O(log n) with HNSW index (approximate)
376
+ - **Full-text search**: O(log n) with GIN index
377
+ - **Hybrid search**: O(log n) for both, then merge
378
+
379
+ ---
380
+
381
+ ## Future Enhancements
382
+
383
+ 1. **Shared working memory**: Redis-backed for multi-process
384
+ 2. **Lazy loading**: Load nodes on first access
385
+ 3. **Pre-fetching**: Anticipate needed context
386
+ 4. **Compression**: Compress old working memory nodes
387
+ 5. **Tiered eviction**: Multiple working memory levels
388
+ 6. **Smart assembly**: ML-driven context selection
389
+
390
+ ---
391
+
392
+ ## References
393
+
394
+ - [Working Memory (Psychology)](https://en.wikipedia.org/wiki/Working_memory)
395
+ - [Cache Eviction Policies](https://en.wikipedia.org/wiki/Cache_replacement_policies)
396
+ - [LLM Context Window Management](https://www.anthropic.com/research/context-windows)
397
+ - [ADR-001: PostgreSQL Storage](001-postgresql-timescaledb.md)
398
+ - [ADR-006: Context Assembly](006-context-assembly.md)
399
+ - [ADR-007: Eviction Strategy](007-eviction-strategy.md)
400
+
401
+ ---
402
+
403
+ ## Review Notes
404
+
405
+ **Systems Architect**: Clean separation of concerns. Consider shared cache for horizontal scaling.
406
+
407
+ **Performance Specialist**: Good balance of speed and durability. Monitor eviction frequency.
408
+
409
+ **AI Engineer**: Token budget management is critical. Add safety margins for token count variance.
410
+
411
+ **Ruby Expert**: Consider using Concurrent::Map for thread-safe working memory in future.