htm 0.0.1

Files changed (155)
  1. checksums.yaml +7 -0
  2. data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
  3. data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
  4. data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
  5. data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
  6. data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
  7. data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
  8. data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
  9. data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
  10. data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
  11. data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
  12. data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
  13. data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
  14. data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
  15. data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
  16. data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
  17. data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
  18. data/.architecture/members.yml +144 -0
  19. data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
  20. data/.architecture/reviews/initial-system-analysis.md +330 -0
  21. data/.envrc +32 -0
  22. data/.irbrc +145 -0
  23. data/CHANGELOG.md +150 -0
  24. data/COMMITS.md +196 -0
  25. data/LICENSE +21 -0
  26. data/README.md +1347 -0
  27. data/Rakefile +51 -0
  28. data/SETUP.md +268 -0
  29. data/config/database.yml +67 -0
  30. data/db/migrate/20250101000001_enable_extensions.rb +14 -0
  31. data/db/migrate/20250101000002_create_robots.rb +14 -0
  32. data/db/migrate/20250101000003_create_nodes.rb +42 -0
  33. data/db/migrate/20250101000005_create_tags.rb +38 -0
  34. data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
  35. data/db/schema.sql +473 -0
  36. data/db/seed_data/README.md +100 -0
  37. data/db/seed_data/presidents.md +136 -0
  38. data/db/seed_data/states.md +151 -0
  39. data/db/seeds.rb +208 -0
  40. data/dbdoc/README.md +173 -0
  41. data/dbdoc/public.node_stats.md +48 -0
  42. data/dbdoc/public.node_stats.svg +41 -0
  43. data/dbdoc/public.node_tags.md +40 -0
  44. data/dbdoc/public.node_tags.svg +112 -0
  45. data/dbdoc/public.nodes.md +54 -0
  46. data/dbdoc/public.nodes.svg +118 -0
  47. data/dbdoc/public.nodes_tags.md +39 -0
  48. data/dbdoc/public.nodes_tags.svg +112 -0
  49. data/dbdoc/public.ontology_structure.md +48 -0
  50. data/dbdoc/public.ontology_structure.svg +38 -0
  51. data/dbdoc/public.operations_log.md +42 -0
  52. data/dbdoc/public.operations_log.svg +130 -0
  53. data/dbdoc/public.relationships.md +39 -0
  54. data/dbdoc/public.relationships.svg +41 -0
  55. data/dbdoc/public.robot_activity.md +46 -0
  56. data/dbdoc/public.robot_activity.svg +35 -0
  57. data/dbdoc/public.robots.md +35 -0
  58. data/dbdoc/public.robots.svg +90 -0
  59. data/dbdoc/public.schema_migrations.md +29 -0
  60. data/dbdoc/public.schema_migrations.svg +26 -0
  61. data/dbdoc/public.tags.md +35 -0
  62. data/dbdoc/public.tags.svg +60 -0
  63. data/dbdoc/public.topic_relationships.md +45 -0
  64. data/dbdoc/public.topic_relationships.svg +32 -0
  65. data/dbdoc/schema.json +1437 -0
  66. data/dbdoc/schema.svg +154 -0
  67. data/docs/api/database.md +806 -0
  68. data/docs/api/embedding-service.md +532 -0
  69. data/docs/api/htm.md +797 -0
  70. data/docs/api/index.md +259 -0
  71. data/docs/api/long-term-memory.md +1096 -0
  72. data/docs/api/working-memory.md +665 -0
  73. data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
  74. data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
  75. data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
  76. data/docs/architecture/adrs/004-hive-mind.md +437 -0
  77. data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
  78. data/docs/architecture/adrs/006-context-assembly.md +496 -0
  79. data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
  80. data/docs/architecture/adrs/008-robot-identification.md +625 -0
  81. data/docs/architecture/adrs/009-never-forget.md +648 -0
  82. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
  83. data/docs/architecture/adrs/011-pgai-integration.md +494 -0
  84. data/docs/architecture/adrs/index.md +215 -0
  85. data/docs/architecture/hive-mind.md +736 -0
  86. data/docs/architecture/index.md +351 -0
  87. data/docs/architecture/overview.md +538 -0
  88. data/docs/architecture/two-tier-memory.md +873 -0
  89. data/docs/assets/css/custom.css +83 -0
  90. data/docs/assets/images/htm-core-components.svg +63 -0
  91. data/docs/assets/images/htm-database-schema.svg +93 -0
  92. data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
  93. data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
  94. data/docs/assets/images/htm-layered-architecture.svg +71 -0
  95. data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
  96. data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
  97. data/docs/assets/images/htm.jpg +0 -0
  98. data/docs/assets/images/htm_demo.gif +0 -0
  99. data/docs/assets/js/mathjax.js +18 -0
  100. data/docs/assets/videos/htm_video.mp4 +0 -0
  101. data/docs/database_rake_tasks.md +322 -0
  102. data/docs/development/contributing.md +787 -0
  103. data/docs/development/index.md +336 -0
  104. data/docs/development/schema.md +596 -0
  105. data/docs/development/setup.md +719 -0
  106. data/docs/development/testing.md +819 -0
  107. data/docs/guides/adding-memories.md +824 -0
  108. data/docs/guides/context-assembly.md +1009 -0
  109. data/docs/guides/getting-started.md +577 -0
  110. data/docs/guides/index.md +118 -0
  111. data/docs/guides/long-term-memory.md +941 -0
  112. data/docs/guides/multi-robot.md +866 -0
  113. data/docs/guides/recalling-memories.md +927 -0
  114. data/docs/guides/search-strategies.md +953 -0
  115. data/docs/guides/working-memory.md +717 -0
  116. data/docs/index.md +214 -0
  117. data/docs/installation.md +477 -0
  118. data/docs/multi_framework_support.md +519 -0
  119. data/docs/quick-start.md +655 -0
  120. data/docs/setup_local_database.md +302 -0
  121. data/docs/using_rake_tasks_in_your_app.md +383 -0
  122. data/examples/basic_usage.rb +93 -0
  123. data/examples/cli_app/README.md +317 -0
  124. data/examples/cli_app/htm_cli.rb +270 -0
  125. data/examples/custom_llm_configuration.rb +183 -0
  126. data/examples/example_app/Rakefile +71 -0
  127. data/examples/example_app/app.rb +206 -0
  128. data/examples/sinatra_app/Gemfile +21 -0
  129. data/examples/sinatra_app/app.rb +335 -0
  130. data/lib/htm/active_record_config.rb +113 -0
  131. data/lib/htm/configuration.rb +342 -0
  132. data/lib/htm/database.rb +594 -0
  133. data/lib/htm/embedding_service.rb +115 -0
  134. data/lib/htm/errors.rb +34 -0
  135. data/lib/htm/job_adapter.rb +154 -0
  136. data/lib/htm/jobs/generate_embedding_job.rb +65 -0
  137. data/lib/htm/jobs/generate_tags_job.rb +82 -0
  138. data/lib/htm/long_term_memory.rb +965 -0
  139. data/lib/htm/models/node.rb +109 -0
  140. data/lib/htm/models/node_tag.rb +33 -0
  141. data/lib/htm/models/robot.rb +52 -0
  142. data/lib/htm/models/tag.rb +76 -0
  143. data/lib/htm/railtie.rb +76 -0
  144. data/lib/htm/sinatra.rb +157 -0
  145. data/lib/htm/tag_service.rb +135 -0
  146. data/lib/htm/tasks.rb +38 -0
  147. data/lib/htm/version.rb +5 -0
  148. data/lib/htm/working_memory.rb +182 -0
  149. data/lib/htm.rb +400 -0
  150. data/lib/tasks/db.rake +19 -0
  151. data/lib/tasks/htm.rake +147 -0
  152. data/lib/tasks/jobs.rake +312 -0
  153. data/mkdocs.yml +190 -0
  154. data/scripts/install_local_database.sh +309 -0
  155. metadata +341 -0
@@ -0,0 +1,538 @@
1
+ # Detailed Architecture
2
+
3
+ This document describes HTM's system architecture in detail: component interactions, data flows, database schema, and performance characteristics.
4
+
5
+ ## Table of Contents
6
+
7
+ - [System Architecture](#system-architecture)
8
+ - [Component Diagrams](#component-diagrams)
9
+ - [Data Flow Diagrams](#data-flow-diagrams)
10
+ - [Memory Lifecycle](#memory-lifecycle)
11
+ - [Database Schema](#database-schema)
12
+ - [Technology Stack](#technology-stack)
13
+ - [Performance Characteristics](#performance-characteristics)
14
+ - [Scalability Considerations](#scalability-considerations)
15
+
16
+ ## System Architecture
17
+
18
+ HTM implements a layered architecture with clear separation of concerns between presentation (API), business logic (memory management), and data access (database).
19
+
20
+ ### Architecture Layers
21
+
22
+ ![HTM Layered Architecture](../assets/images/htm-layered-architecture.svg)
23
+
24
+ ### Component Responsibilities
25
+
26
+ #### API Layer (HTM class)
27
+
28
+ - Public interface for all memory operations
29
+ - Robot identification and initialization
30
+ - Request routing to appropriate subsystems
31
+ - Response aggregation and formatting
32
+ - Activity logging and statistics
33
+
34
+ #### Coordination Layer
35
+
36
+ - **Robot Management**: Registration, activity tracking, metadata
37
+ - **Embedding Coordination**: Generate embeddings for new memories and search queries
38
+ - **Memory Orchestration**: Coordinate between working and long-term memory
39
+ - **Context Assembly**: Build LLM context strings from working memory
40
+ - **Token Management**: Count tokens and enforce limits
41
+
42
+ #### Memory Management Layer
43
+
44
+ ##### Working Memory
45
+
46
+ - **In-Memory Store**: Fast Ruby Hash-based storage
47
+ - **Token Budget**: Enforce maximum token limit (default 128K)
48
+ - **Eviction Policy**: Hybrid importance + recency eviction
49
+ - **Access Tracking**: LRU-style access order for recency
50
+ - **Context Assembly**: Three strategies (recent, important, balanced)
51
+
52
+ ##### Long-Term Memory
53
+
54
+ - **Persistence**: Write all memories to PostgreSQL
55
+ - **RAG Search**: Vector + temporal + full-text search
56
+ - **Relationship Management**: Store and query node relationships
57
+ - **Robot Registry**: Track all robots using the system
58
+ - **Eviction Marking**: Mark which nodes are in working memory
59
+
60
+ #### Services Layer
61
+
62
+ ##### Embedding Service
63
+
64
+ - **Client-Side Generation**: Generate embeddings before database insertion
65
+ - **Token Counting**: Estimate token counts for strings
66
+ - **Model Management**: Handle different models per provider
67
+ - **Provider Support**: Ollama (default) and OpenAI
68
+
69
+ !!! info "Architecture Change (October 2025)"
70
+     Embeddings are generated client-side in Ruby before database insertion. This provides reliable, cross-platform operation without complex database extension dependencies.
71
+
72
+ ##### Database Service
73
+
74
+ - **Connection Pooling**: Manage PostgreSQL connections
75
+ - **Query Execution**: Execute parameterized queries safely
76
+ - **Transaction Management**: ACID guarantees for operations
77
+ - **Error Handling**: Retry logic and failure recovery
78
+
79
+ #### Data Layer
80
+
81
+ - **PostgreSQL**: Relational storage with ACID guarantees
82
+ - **TimescaleDB**: Time-series optimization and compression
83
+ - **pgvector**: Vector similarity search with HNSW
84
+ - **pg_trgm**: Fuzzy text matching for search
85
+
86
+ ## Component Diagrams
87
+
88
+ ### HTM Core Components
89
+
90
+ ![HTM Core Components](../assets/images/htm-core-components.svg)
91
+
92
+ ## Data Flow Diagrams
93
+
94
+ ### Memory Addition Flow
95
+
96
+ This diagram shows the complete flow of adding a new memory node to HTM with **client-side embedding generation**.
97
+
98
+ !!! info "Architecture Note"
99
+     With client-side generation (October 2025), embeddings are generated in Ruby before database insertion. This provides reliable, cross-platform operation.
100
+
101
+ ```mermaid
102
+ graph TD
103
+ A[User: add_message] -->|1. Request| B[HTM]
104
+ B -->|2. Count tokens| C[EmbeddingService]
105
+ C -->|3. Return count| B
106
+
107
+ B -->|4. Generate embedding| C
108
+ C -->|5. HTTP call| D[Ollama/OpenAI]
109
+ D -->|6. Return vector| C
110
+ C -->|7. Return embedding| B
111
+
112
+ B -->|8. Persist with embedding| E[LongTermMemory]
113
+ E -->|9. INSERT nodes with embedding| F[PostgreSQL]
114
+ F -->|10. Return node_id| E
115
+ E -->|11. Return node_id| B
116
+
117
+ B -->|12. Check space| G[WorkingMemory]
118
+ G -->|13. Space available?| H{Has Space?}
119
+ H -->|No| I[Evict nodes]
120
+ I -->|14. Mark evicted| E
121
+ H -->|Yes| J[Add to WM]
122
+ I --> J
123
+
124
+ J -->|15. Success| B
125
+ B -->|16. Log operation| E
126
+ B -->|17. Return node_id| A
127
+
128
+ style A fill:rgba(76,175,80,0.3)
129
+ style B fill:rgba(33,150,243,0.3)
130
+ style C fill:rgba(255,152,0,0.3)
131
+ style D fill:rgba(255,193,7,0.3)
132
+ style E fill:rgba(156,39,176,0.3)
133
+ style F fill:rgba(156,39,176,0.3)
134
+ style G fill:rgba(33,150,243,0.3)
135
+ ```
136
+
137
+ ### Memory Recall Flow
138
+
139
+ This diagram illustrates the RAG-based retrieval process with **client-side query embeddings**.
140
+
141
+ !!! info "Architecture Note"
142
+     With client-side generation, query embeddings are generated in Ruby before being passed to SQL for vector similarity search.
143
+
144
+ ```mermaid
145
+ graph TD
146
+ A[User: recall] -->|1. Request| B[HTM]
147
+ B -->|2. Parse timeframe| C[Parse Natural Language]
148
+ C -->|3. Return range| B
149
+
150
+ B -->|4. Generate query embedding| D[EmbeddingService]
151
+ D -->|5. HTTP call| E[Ollama/OpenAI]
152
+ E -->|6. Return vector| D
153
+ D -->|7. Return embedding| B
154
+
155
+ B -->|8. Search with embedding| F[LongTermMemory]
156
+ F -->|9. Select strategy| G{Search Strategy}
157
+ G -->|:vector| H[Vector Search]
158
+ G -->|:fulltext| I[Full-Text Search]
159
+ G -->|:hybrid| J[Hybrid Search]
160
+
161
+ H -->|10. pgvector HNSW| K[Return results]
162
+ I -->|10. ts_rank GIN| K
163
+ J -->|10. Hybrid + RRF| K
164
+
165
+ K -->|11. Results| F
166
+ F -->|12. Results| B
167
+
168
+ B -->|13. For each result| L[WorkingMemory]
169
+ L -->|14. Add to WM| M{Has Space?}
170
+ M -->|No| N[Evict old nodes]
171
+ M -->|Yes| R[Add node]
172
+ N --> R
173
+
174
+ R -->|15. Log operation| F
175
+ B -->|16. Return memories| A
176
+
177
+ style A fill:rgba(76,175,80,0.3)
178
+ style B fill:rgba(33,150,243,0.3)
179
+ style F fill:rgba(156,39,176,0.3)
180
+ style J fill:rgba(255,193,7,0.3)
181
+ style L fill:rgba(33,150,243,0.3)
182
+ ```
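The `:hybrid` branch above mentions RRF (reciprocal rank fusion). The following is a minimal sketch of that idea, assuming the vector and full-text searches each return a ranked list of node ids. The method name `rrf_merge`, the constant `k = 60`, and the sample ids are illustrative assumptions, not taken from HTM's code.

```ruby
# Reciprocal rank fusion: score(id) = sum over lists of 1 / (k + rank),
# where rank is 1-based. Names and k = 60 are illustrative assumptions.
def rrf_merge(rankings, k: 60, limit: 10)
  scores = Hash.new(0.0)
  rankings.each do |ids|
    ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  # Highest fused score first; ties broken by id for a stable result.
  scores.sort_by { |id, score| [-score, id] }.first(limit).map(&:first)
end

vector_hits   = [101, 102, 103] # hypothetical ids from vector search
fulltext_hits = [103, 101, 104] # hypothetical ids from full-text search
fused = rrf_merge([vector_hits, fulltext_hits])
puts fused.inspect
```

Ids that appear near the top of both lists (101 and 103 here) outrank ids that appear in only one list, which is the property that makes rank fusion useful when the two searches produce scores on different scales.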
183
+
184
+ ### Context Assembly Flow
185
+
186
+ This diagram shows how working memory assembles context for LLM consumption using different strategies.
187
+
188
+ ```mermaid
189
+ graph TD
190
+ A[User: create_context] -->|1. Request with strategy| B[HTM]
191
+ B -->|2. Assemble| C[WorkingMemory]
192
+
193
+ C -->|3. Strategy?| D{Strategy Type}
194
+ D -->|:recent| E[Sort by access order]
195
+ D -->|:important| F[Sort by importance]
196
+ D -->|:balanced| G[Hybrid score]
197
+
198
+ E --> H[Sorted nodes]
199
+ F --> H
200
+ G --> H
201
+
202
+ H -->|4. Build context| I[Token budget loop]
203
+ I -->|5. Check tokens| J{Tokens < max?}
204
+ J -->|Yes| K[Add node to context]
205
+ J -->|No| L[Stop, return context]
206
+ K --> I
207
+
208
+ L -->|6. Join nodes| M[Assembled context string]
209
+ M -->|7. Return| C
210
+ C -->|8. Return| B
211
+ B -->|9. Return| A
212
+
213
+ style A fill:rgba(76,175,80,0.3)
214
+ style C fill:rgba(33,150,243,0.3)
215
+ style G fill:rgba(255,193,7,0.3)
216
+ ```
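The sort-then-fill loop in the diagram can be sketched in plain Ruby. This is an illustration under stated assumptions: the `Node` struct, its fields, and in particular the `:balanced` scoring formula are hypothetical, and HTM's actual `WorkingMemory` may differ.

```ruby
# Sketch of the context assembly loop. The node fields and the :balanced
# scoring formula are assumptions for illustration, not HTM's actual code.
Node = Struct.new(:value, :importance, :token_count, :accessed_at, keyword_init: true)

def sort_nodes(nodes, strategy, now:)
  case strategy
  when :recent    then nodes.sort_by { |n| -n.accessed_at }
  when :important then nodes.sort_by { |n| -n.importance }
  when :balanced
    # Hypothetical hybrid score: importance discounted by hours since access.
    nodes.sort_by { |n| -(n.importance / (1.0 + (now - n.accessed_at) / 3600.0)) }
  else
    raise ArgumentError, "unknown strategy: #{strategy.inspect}"
  end
end

def assemble_context(nodes, strategy:, max_tokens:, now: 0)
  used = 0
  picked = []
  sort_nodes(nodes, strategy, now: now).each do |node|
    break if used + node.token_count > max_tokens # budget exhausted: stop

    picked << node.value
    used += node.token_count
  end
  picked.join("\n\n")
end

nodes = [
  Node.new(value: "old but vital", importance: 9.0, token_count: 40, accessed_at: 0),
  Node.new(value: "fresh note",    importance: 2.0, token_count: 30, accessed_at: 7200),
  Node.new(value: "mid",           importance: 5.0, token_count: 50, accessed_at: 3600)
]
puts assemble_context(nodes, strategy: :balanced, max_tokens: 100, now: 7200)
```

As in the diagram, the loop stops at the first node that does not fit rather than skipping it and trying smaller nodes, so the output preserves the strategy's ordering.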
217
+
218
+ ## Memory Lifecycle
219
+
220
+ ### Node States
221
+
222
+ A memory node transitions through several states during its lifetime in HTM:
223
+
224
+ ```mermaid
225
+ stateDiagram-v2
226
+ [*] --> Created: add_node()
227
+
228
+ Created --> InBothMemories: Initial state
229
+ InBothMemories --> LongTermMemoryOnly: Evicted from WM
230
+ InBothMemories --> LongTermMemoryOnly: WM cleared
231
+
233
+ LongTermMemoryOnly --> InBothMemories: Recalled
234
+
235
+ InBothMemories --> Forgotten: forget(confirm: :confirmed)
237
+ LongTermMemoryOnly --> Forgotten: forget(confirm: :confirmed)
238
+
239
+ Forgotten --> [*]
240
+
241
+ note right of InBothMemories
242
+ Node exists in:
243
+ - Working Memory (fast access)
244
+ - Long-Term Memory (persistent)
245
+ in_working_memory = TRUE
246
+ end note
247
+
248
+ note right of LongTermMemoryOnly
249
+ Node exists only in:
250
+ - Long-Term Memory
251
+ in_working_memory = FALSE
252
+ (Evicted due to token limit)
253
+ end note
254
+
255
+ note right of Forgotten
256
+ Node permanently deleted
257
+ from both memories
258
+ (Explicit user action)
259
+ end note
260
+ ```
261
+
262
+ ### Eviction Process
263
+
264
+ When working memory reaches its token limit, the eviction process runs to free up space:
265
+
266
+ ```mermaid
267
+ sequenceDiagram
268
+ participant User
269
+ participant HTM
270
+ participant WM as WorkingMemory
271
+ participant LTM as LongTermMemory
272
+ participant DB as Database
273
+
274
+ User->>HTM: add_node(large_memory)
275
+ HTM->>WM: add(key, value, token_count)
276
+ WM->>WM: Check: token_count + current > max?
277
+
278
+ alt Space Available
279
+ WM->>WM: Add node directly
280
+ WM-->>HTM: Success
281
+ else No Space
282
+ WM->>WM: Sort by [importance, -recency]
283
+ WM->>WM: Evict low-importance old nodes
284
+ Note over WM: Free enough tokens
285
+
286
+ WM->>HTM: Return evicted nodes
287
+ HTM->>LTM: mark_evicted(keys)
288
+ LTM->>DB: UPDATE in_working_memory = FALSE
289
+ DB-->>LTM: Updated
290
+
291
+ WM->>WM: Add new node
292
+ WM-->>HTM: Success
293
+ end
294
+
295
+ HTM-->>User: node_id
296
+ ```
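The eviction branch of the sequence above can be sketched as a small Hash-backed store. This is a simplified illustration, not HTM's `WorkingMemory`: the class name `TinyWorkingMemory`, its method signatures, and the choice to sort by ascending importance and then oldest access are assumptions based on the "Sort by [importance, -recency]" step.

```ruby
# Sketch of hybrid importance + recency eviction over a Hash-based store.
# Class and field names are hypothetical; HTM's WorkingMemory may differ.
class TinyWorkingMemory
  Entry = Struct.new(:value, :token_count, :importance, :accessed_at)

  attr_reader :max_tokens

  def initialize(max_tokens:)
    @max_tokens = max_tokens
    @entries = {}
  end

  def token_count
    @entries.each_value.sum(&:token_count)
  end

  def keys
    @entries.keys
  end

  # Adds an entry and returns the keys evicted to make room, so the
  # caller can mark them as in_working_memory = FALSE in long-term memory.
  def add(key, value, token_count:, importance: 1.0, accessed_at: Time.now.to_f)
    raise ArgumentError, "entry exceeds the whole token budget" if token_count > max_tokens

    @entries.delete(key) # replacing an entry frees its tokens first
    evicted = evict(token_count + self.token_count - max_tokens)
    @entries[key] = Entry.new(value, token_count, importance, accessed_at)
    evicted
  end

  private

  # Removes the least important, then least recently accessed, entries
  # until at least `needed` tokens are free.
  def evict(needed)
    evicted = []
    candidates = @entries.sort_by { |_key, e| [e.importance, e.accessed_at] }
    candidates.each do |key, entry|
      break if needed <= 0

      @entries.delete(key)
      needed -= entry.token_count
      evicted << key
    end
    evicted
  end
end

wm = TinyWorkingMemory.new(max_tokens: 100)
wm.add("a", "low importance, old", token_count: 40, importance: 1.0, accessed_at: 1)
wm.add("b", "high importance",     token_count: 40, importance: 5.0, accessed_at: 2)
evicted = wm.add("c", "new arrival", token_count: 40, importance: 1.0, accessed_at: 3)
puts evicted.inspect
```

Returning the evicted keys from `add` mirrors the "Return evicted nodes" step, leaving the database update to the caller so the in-memory store stays free of persistence concerns.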
297
+
298
+ ## Database Schema
299
+
300
+ ### Entity-Relationship Diagram
301
+
302
+ ![HTM Database Schema](../assets/images/htm-database-schema.svg)
303
+
304
+ ### Table Details
305
+
306
+ #### nodes
307
+
308
+ The main table storing all memory nodes with vector embeddings, metadata, and timestamps.
309
+
310
+ | Column | Type | Description |
311
+ |--------|------|-------------|
312
+ | `id` | BIGSERIAL | Primary key, auto-incrementing |
313
+ | `key` | TEXT | Unique identifier for node (user-defined) |
314
+ | `value` | TEXT | Content of the memory |
315
+ | `type` | TEXT | Memory type (fact, context, code, preference, decision, question) |
316
+ | `category` | TEXT | Optional category for organization |
317
+ | `importance` | REAL | Importance score (0.0-10.0, default 1.0) |
318
+ | `created_at` | TIMESTAMP | Creation timestamp |
319
+ | `updated_at` | TIMESTAMP | Last update timestamp |
320
+ | `last_accessed` | TIMESTAMP | Last access timestamp |
321
+ | `token_count` | INTEGER | Number of tokens in value |
322
+ | `in_working_memory` | BOOLEAN | Whether currently in working memory |
323
+ | `robot_id` | TEXT | Foreign key to robots table |
324
+ | `embedding` | vector(1536) | Vector embedding for semantic search |
325
+
326
+ **Indexes:**
327
+
328
+ - Primary key on `id`
329
+ - Unique index on `key`
330
+ - B-tree indexes on `created_at`, `updated_at`, `last_accessed`, `type`, `category`, `robot_id`
331
+ - HNSW index on `embedding` for vector similarity
332
+ - GIN index on `to_tsvector('english', value)` for full-text search
333
+ - GIN trigram index on `value` for fuzzy matching
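The HNSW index answers nearest-neighbour queries approximately. For comparison, the exact ranking it approximates can be sketched in plain Ruby, assuming cosine distance (which pgvector exposes as the `<=>` operator). Whether HTM ranks by cosine distance or another metric is not stated in this section, and `cosine_distance`, `nearest`, and the sample vectors are illustrative only.

```ruby
# Exact (brute-force) nearest-neighbour search by cosine distance.
# An HNSW index returns approximately the same ranking without scanning
# every row. Method names and sample vectors are illustrative only.
def cosine_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.size == b.size

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

def nearest(query, embeddings, limit: 2)
  embeddings
    .sort_by { |_key, vector| cosine_distance(query, vector) }
    .first(limit)
    .map(&:first)
end

embeddings = {
  "ruby_fact"   => [1.0, 0.0, 0.0],
  "rails_fact"  => [0.8, 0.6, 0.0],
  "cooking_tip" => [0.0, 0.0, 1.0]
}
puts nearest([1.0, 0.0, 0.0], embeddings).inspect
```

The brute-force version costs one distance computation per stored node, which is why an approximate index becomes worthwhile as the `nodes` table grows.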
334
+
335
+ #### robots
336
+
337
+ Registry of all robots using the HTM system.
338
+
339
+ | Column | Type | Description |
340
+ |--------|------|-------------|
341
+ | `id` | TEXT | Primary key, UUID v4 |
342
+ | `name` | TEXT | Human-readable robot name |
343
+ | `created_at` | TIMESTAMP | Registration timestamp |
344
+ | `last_active` | TIMESTAMP | Last activity timestamp |
345
+ | `metadata` | JSONB | Flexible robot configuration |
346
+
347
+ #### relationships
348
+
349
+ Graph edges connecting related nodes.
350
+
351
+ | Column | Type | Description |
352
+ |--------|------|-------------|
353
+ | `id` | BIGSERIAL | Primary key |
354
+ | `from_node_id` | BIGINT | Source node foreign key |
355
+ | `to_node_id` | BIGINT | Target node foreign key |
356
+ | `relationship_type` | TEXT | Type of relationship (e.g., "related_to", "follows") |
357
+ | `strength` | REAL | Relationship strength (0.0-1.0) |
358
+ | `created_at` | TIMESTAMP | Creation timestamp |
359
+
360
+ **Indexes:**
361
+
362
+ - B-tree indexes on `from_node_id` and `to_node_id`
363
+ - Unique constraint on `(from_node_id, to_node_id, relationship_type)`
364
+
365
+ #### tags
366
+
367
+ Flexible categorization system for nodes.
368
+
369
+ | Column | Type | Description |
370
+ |--------|------|-------------|
371
+ | `id` | BIGSERIAL | Primary key |
372
+ | `node_id` | BIGINT | Foreign key to nodes |
373
+ | `tag` | TEXT | Tag name |
374
+ | `created_at` | TIMESTAMP | Creation timestamp |
375
+
376
+ **Indexes:**
377
+
378
+ - B-tree index on `node_id`
379
+ - B-tree index on `tag`
380
+ - Unique constraint on `(node_id, tag)`
381
+
382
+ #### operations_log
383
+
384
+ Audit trail of all memory operations for debugging and replay.
385
+
386
+ | Column | Type | Description |
387
+ |--------|------|-------------|
388
+ | `id` | BIGSERIAL | Primary key |
389
+ | `timestamp` | TIMESTAMP | Operation timestamp |
390
+ | `operation` | TEXT | Operation type (add, retrieve, recall, forget, evict) |
391
+ | `node_id` | BIGINT | Foreign key to nodes (nullable) |
392
+ | `robot_id` | TEXT | Foreign key to robots |
393
+ | `details` | JSONB | Flexible operation metadata |
394
+
395
+ **Indexes:**
396
+
397
+ - B-tree indexes on `timestamp`, `robot_id`, `operation`
398
+
399
+ ## Technology Stack
400
+
401
+ ### Core Technologies
402
+
403
+ | Technology | Version | Purpose | Why Chosen |
404
+ |-----------|---------|---------|------------|
405
+ | **Ruby** | 3.2+ | Implementation language | Readable, expressive, mature ecosystem |
406
+ | **PostgreSQL** | 16+ | Relational database | ACID guarantees, rich extensions, production-proven |
407
+ | **TimescaleDB** | 2.13+ | Time-series extension | Hypertable partitioning, automatic compression |
408
+ | **pgvector** | 0.5+ | Vector similarity | HNSW indexing, PostgreSQL-native, fast approximate search |
409
+ | **pg_trgm** | - | Fuzzy text search | Built-in PostgreSQL extension for trigram matching |
410
+
411
+ ### Ruby Dependencies
412
+
413
+ ```ruby
414
+ # Core dependencies
415
+ gem 'pg', '~> 1.5' # PostgreSQL client
416
+ gem 'pgvector', '~> 0.2' # Vector operations
417
+ gem 'connection_pool', '~> 2.4' # Connection pooling
418
+ gem 'faraday', '~> 2.7' # HTTP client (for embedding APIs)
419
+
420
+ # Optional dependencies
421
+ gem 'tiktoken_ruby', '~> 0.0.6' # Token counting (OpenAI-compatible)
422
+ ```
423
+
424
+ ### Embedding Providers
425
+
426
+ !!! info "Client-Side Generation"
427
+     Embeddings are generated client-side in Ruby before database insertion. This provides reliable, cross-platform operation.
428
+
429
+ | Provider | Models | Dimensions | Speed | Cost |
430
+ |----------|--------|------------|-------|------|
431
+ | **Ollama** (default) | nomic-embed-text, mxbai-embed-large, all-minilm | 384-1024 | Fast (local HTTP) | Free |
432
+ | **OpenAI** | text-embedding-3-small, text-embedding-ada-002 | 1536 | Fast (API) | $0.0001/1K tokens |
433
+
434
+ ## Performance Characteristics
435
+
436
+ ### Latency Benchmarks
437
+
438
+ Based on typical production workloads with 10,000 nodes in long-term memory (client-side embeddings):
439
+
440
+ !!! info "Performance Characteristics"
441
+     Client-side embedding generation provides reliable, debuggable operation. Latency includes HTTP call to Ollama/OpenAI for embedding generation.
442
+
443
+ | Operation | Median | P95 | P99 | Notes |
444
+ |-----------|--------|-----|-----|-------|
445
+ | `add_message()` | 50ms | 110ms | 190ms | Client-side embedding generation + insert |
446
+ | `recall()` (vector) | 80ms | 140ms | 230ms | Client-side query embedding + vector search |
447
+ | `recall()` (fulltext) | 30ms | 60ms | 100ms | GIN index search (no embedding needed) |
448
+ | `recall()` (hybrid) | 110ms | 190ms | 330ms | Client-side embedding + hybrid search |
449
+ | `retrieve()` | 5ms | 10ms | 20ms | Simple primary key lookup |
450
+ | `create_context()` | 8ms | 15ms | 25ms | In-memory sort + join |
451
+ | `forget()` | 10ms | 20ms | 40ms | DELETE with cascades |
452
+
453
+ !!! tip "Performance Optimization"
454
+     - Use connection pooling (included by default)
455
+     - Add database indexes for common query patterns
456
+     - Consider read replicas for query-heavy workloads
457
+     - Monitor HNSW build time for large embedding tables
458
+
459
+ ### Throughput
460
+
461
+ | Workload | Throughput | Resource Usage |
462
+ |----------|-----------|----------------|
463
+ | Add nodes | 500-1000/sec | CPU-bound (embeddings) |
464
+ | Vector search | 2000-5000/sec | I/O-bound (database) |
465
+ | Full-text search | 5000-10000/sec | I/O-bound (database) |
466
+ | Context assembly | 10000+/sec | Memory-bound (working memory) |
467
+
468
+ ### Storage
469
+
470
+ | Component | Size Estimate | Compression |
471
+ |-----------|--------------|-------------|
472
+ | Node (text only) | ~1KB average | None |
473
+ | Node (with embedding) | ~7KB (~1KB text + 1536 dims × 4 bytes ≈ 6KB) | TimescaleDB compression (70-90%) |
474
+ | Indexes | ~2x data size | Minimal |
475
+ | Operations log | ~200 bytes/op | TimescaleDB compression |
476
+
477
+ **Example:** 100,000 nodes with embeddings:
478
+
479
+ - Raw data: ~700 MB
480
+ - With indexes: ~2.1 GB
481
+ - With compression (after 30 days): ~300 MB
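The first two figures in this example follow from the storage table. A short arithmetic sketch, assuming roughly 1 KB of text per node, 4 bytes per embedding dimension, and indexes at about twice the data size; it does not measure a real database, and the constant names are illustrative.

```ruby
# Back-of-the-envelope storage estimate using the figures from the tables.
# Purely arithmetic: it does not measure a real database.
NODES            = 100_000
TEXT_BYTES       = 1_000    # ~1 KB of text per node
EMBEDDING_BYTES  = 1536 * 4 # 1536 dimensions at 4 bytes each
INDEX_MULTIPLIER = 2        # indexes ~2x data size

def megabytes(bytes)
  (bytes / 1_000_000.0).round
end

node_bytes  = TEXT_BYTES + EMBEDDING_BYTES
raw_bytes   = NODES * node_bytes
total_bytes = raw_bytes + raw_bytes * INDEX_MULTIPLIER

puts "per node:     #{node_bytes} bytes"
puts "raw data:     #{megabytes(raw_bytes)} MB"
puts "with indexes: #{megabytes(total_bytes)} MB"
```

This gives about 714 MB of raw data and about 2.1 GB with indexes, in line with the rounded figures above. The compressed figure depends on the achieved compression ratio and on how much of the data is old enough to be compressed, so it is not derived here.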
482
+
483
+ ## Scalability Considerations
484
+
485
+ ### Vertical Scaling Limits
486
+
487
+ | Resource | Limit | Mitigation |
488
+ |----------|-------|------------|
489
+ | **Working Memory (RAM)** | ~2GB per robot process | Use smaller `working_memory_size`, evict more aggressively |
490
+ | **PostgreSQL Connections** | ~100-200 (default) | Connection pooling, adjust `max_connections` |
491
+ | **Embedding API Rate Limits** | Provider-dependent | Implement rate limiting, use local models |
492
+ | **HNSW Build Time** | O(n log n) on large tables | Partition tables by timeframe |
493
+
494
+ ### Horizontal Scaling Strategies
495
+
496
+ #### Multi-Process (Single Host)
497
+
498
+ - Each robot process has independent working memory
499
+ - All processes share single PostgreSQL instance
500
+ - Connection pooling prevents connection exhaustion
501
+
502
+ #### Multi-Host (Distributed)
503
+
504
+ - **Option 1: Shared Database**
505
+     - All hosts connect to central PostgreSQL
506
+     - Read replicas for query scaling
507
+     - Write operations to primary only
508
+
509
+ - **Option 2: Sharded Database**
510
+     - Partition by `robot_id` or timeframe
511
+     - Requires coordination for cross-shard queries
512
+     - More complex but scales writes
513
+
514
+ #### Read Scaling
515
+
516
+ - Add PostgreSQL read replicas
517
+ - Route `recall()` and `retrieve()` to replicas
518
+ - Primary handles writes only
519
+ - TimescaleDB native replication support
520
+
521
+ !!! warning "Consistency Considerations"
522
+     Read replicas may lag the primary by seconds. When strong consistency is required, query the primary database.
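One way to picture the read-scaling rules is a small router that sends `recall` and `retrieve` to replicas and everything else to the primary, with an opt-out for reads that cannot tolerate replica lag. `ReadWriteRouter`, its arguments, and the `consistent:` flag are hypothetical and not part of HTM; the connections here are plain symbols standing in for real pools.

```ruby
# Sketch of read/write routing. ReadWriteRouter and its arguments are
# hypothetical; connections can be any objects (pools, handles, URLs).
class ReadWriteRouter
  READ_OPERATIONS = %i[recall retrieve].freeze

  def initialize(primary:, replicas: [])
    @primary = primary
    @replicas = replicas
    @next_replica = 0
  end

  # Returns the connection to use for an operation. Pass
  # consistent: true to read from the primary and avoid replica lag.
  def connection_for(operation, consistent: false)
    return @primary if consistent || @replicas.empty?
    return @primary unless READ_OPERATIONS.include?(operation)

    replica = @replicas[@next_replica % @replicas.size]
    @next_replica += 1 # simple round-robin across replicas
    replica
  end
end

router = ReadWriteRouter.new(primary: :primary, replicas: %i[replica_a replica_b])
puts router.connection_for(:recall)
puts router.connection_for(:add_message)
```

The round-robin counter is not thread-safe as written; a multi-threaded process would need a mutex or a per-thread router.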
523
+
524
+ ### Future Scaling Enhancements
525
+
526
+ 1. **Redis-backed Working Memory**: Share working memory across processes
527
+ 2. **Horizontal Partitioning**: Shard `nodes` table by `robot_id` or time ranges
528
+ 3. **Caching Layer**: Add Redis cache for hot nodes
529
+ 4. **Async Embedding Generation**: Queue embedding jobs for batch processing
530
+ 5. **Vector Database Migration**: Consider specialized vector DB (Pinecone, Weaviate) at massive scale
531
+
532
+ ## Related Documentation
533
+
534
+ - [Architecture Index](index.md) - Architecture overview and component summary
535
+ - [Two-Tier Memory System](two-tier-memory.md) - Working memory and long-term memory deep dive
536
+ - [Hive Mind Architecture](hive-mind.md) - Multi-robot shared memory design
537
+ - [API Reference](../api/htm.md) - Complete API documentation
538
+ - [Architecture Decision Records](adrs/index.md) - Decision history