htm 0.0.1
- checksums.yaml +7 -0
- data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
- data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
- data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
- data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
- data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
- data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
- data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
- data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
- data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
- data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
- data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
- data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
- data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
- data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
- data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
- data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
- data/.architecture/members.yml +144 -0
- data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
- data/.architecture/reviews/initial-system-analysis.md +330 -0
- data/.envrc +32 -0
- data/.irbrc +145 -0
- data/CHANGELOG.md +150 -0
- data/COMMITS.md +196 -0
- data/LICENSE +21 -0
- data/README.md +1347 -0
- data/Rakefile +51 -0
- data/SETUP.md +268 -0
- data/config/database.yml +67 -0
- data/db/migrate/20250101000001_enable_extensions.rb +14 -0
- data/db/migrate/20250101000002_create_robots.rb +14 -0
- data/db/migrate/20250101000003_create_nodes.rb +42 -0
- data/db/migrate/20250101000005_create_tags.rb +38 -0
- data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
- data/db/schema.sql +473 -0
- data/db/seed_data/README.md +100 -0
- data/db/seed_data/presidents.md +136 -0
- data/db/seed_data/states.md +151 -0
- data/db/seeds.rb +208 -0
- data/dbdoc/README.md +173 -0
- data/dbdoc/public.node_stats.md +48 -0
- data/dbdoc/public.node_stats.svg +41 -0
- data/dbdoc/public.node_tags.md +40 -0
- data/dbdoc/public.node_tags.svg +112 -0
- data/dbdoc/public.nodes.md +54 -0
- data/dbdoc/public.nodes.svg +118 -0
- data/dbdoc/public.nodes_tags.md +39 -0
- data/dbdoc/public.nodes_tags.svg +112 -0
- data/dbdoc/public.ontology_structure.md +48 -0
- data/dbdoc/public.ontology_structure.svg +38 -0
- data/dbdoc/public.operations_log.md +42 -0
- data/dbdoc/public.operations_log.svg +130 -0
- data/dbdoc/public.relationships.md +39 -0
- data/dbdoc/public.relationships.svg +41 -0
- data/dbdoc/public.robot_activity.md +46 -0
- data/dbdoc/public.robot_activity.svg +35 -0
- data/dbdoc/public.robots.md +35 -0
- data/dbdoc/public.robots.svg +90 -0
- data/dbdoc/public.schema_migrations.md +29 -0
- data/dbdoc/public.schema_migrations.svg +26 -0
- data/dbdoc/public.tags.md +35 -0
- data/dbdoc/public.tags.svg +60 -0
- data/dbdoc/public.topic_relationships.md +45 -0
- data/dbdoc/public.topic_relationships.svg +32 -0
- data/dbdoc/schema.json +1437 -0
- data/dbdoc/schema.svg +154 -0
- data/docs/api/database.md +806 -0
- data/docs/api/embedding-service.md +532 -0
- data/docs/api/htm.md +797 -0
- data/docs/api/index.md +259 -0
- data/docs/api/long-term-memory.md +1096 -0
- data/docs/api/working-memory.md +665 -0
- data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
- data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
- data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
- data/docs/architecture/adrs/004-hive-mind.md +437 -0
- data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
- data/docs/architecture/adrs/006-context-assembly.md +496 -0
- data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
- data/docs/architecture/adrs/008-robot-identification.md +625 -0
- data/docs/architecture/adrs/009-never-forget.md +648 -0
- data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
- data/docs/architecture/adrs/011-pgai-integration.md +494 -0
- data/docs/architecture/adrs/index.md +215 -0
- data/docs/architecture/hive-mind.md +736 -0
- data/docs/architecture/index.md +351 -0
- data/docs/architecture/overview.md +538 -0
- data/docs/architecture/two-tier-memory.md +873 -0
- data/docs/assets/css/custom.css +83 -0
- data/docs/assets/images/htm-core-components.svg +63 -0
- data/docs/assets/images/htm-database-schema.svg +93 -0
- data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
- data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
- data/docs/assets/images/htm-layered-architecture.svg +71 -0
- data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
- data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
- data/docs/assets/images/htm.jpg +0 -0
- data/docs/assets/images/htm_demo.gif +0 -0
- data/docs/assets/js/mathjax.js +18 -0
- data/docs/assets/videos/htm_video.mp4 +0 -0
- data/docs/database_rake_tasks.md +322 -0
- data/docs/development/contributing.md +787 -0
- data/docs/development/index.md +336 -0
- data/docs/development/schema.md +596 -0
- data/docs/development/setup.md +719 -0
- data/docs/development/testing.md +819 -0
- data/docs/guides/adding-memories.md +824 -0
- data/docs/guides/context-assembly.md +1009 -0
- data/docs/guides/getting-started.md +577 -0
- data/docs/guides/index.md +118 -0
- data/docs/guides/long-term-memory.md +941 -0
- data/docs/guides/multi-robot.md +866 -0
- data/docs/guides/recalling-memories.md +927 -0
- data/docs/guides/search-strategies.md +953 -0
- data/docs/guides/working-memory.md +717 -0
- data/docs/index.md +214 -0
- data/docs/installation.md +477 -0
- data/docs/multi_framework_support.md +519 -0
- data/docs/quick-start.md +655 -0
- data/docs/setup_local_database.md +302 -0
- data/docs/using_rake_tasks_in_your_app.md +383 -0
- data/examples/basic_usage.rb +93 -0
- data/examples/cli_app/README.md +317 -0
- data/examples/cli_app/htm_cli.rb +270 -0
- data/examples/custom_llm_configuration.rb +183 -0
- data/examples/example_app/Rakefile +71 -0
- data/examples/example_app/app.rb +206 -0
- data/examples/sinatra_app/Gemfile +21 -0
- data/examples/sinatra_app/app.rb +335 -0
- data/lib/htm/active_record_config.rb +113 -0
- data/lib/htm/configuration.rb +342 -0
- data/lib/htm/database.rb +594 -0
- data/lib/htm/embedding_service.rb +115 -0
- data/lib/htm/errors.rb +34 -0
- data/lib/htm/job_adapter.rb +154 -0
- data/lib/htm/jobs/generate_embedding_job.rb +65 -0
- data/lib/htm/jobs/generate_tags_job.rb +82 -0
- data/lib/htm/long_term_memory.rb +965 -0
- data/lib/htm/models/node.rb +109 -0
- data/lib/htm/models/node_tag.rb +33 -0
- data/lib/htm/models/robot.rb +52 -0
- data/lib/htm/models/tag.rb +76 -0
- data/lib/htm/railtie.rb +76 -0
- data/lib/htm/sinatra.rb +157 -0
- data/lib/htm/tag_service.rb +135 -0
- data/lib/htm/tasks.rb +38 -0
- data/lib/htm/version.rb +5 -0
- data/lib/htm/working_memory.rb +182 -0
- data/lib/htm.rb +400 -0
- data/lib/tasks/db.rake +19 -0
- data/lib/tasks/htm.rake +147 -0
- data/lib/tasks/jobs.rake +312 -0
- data/mkdocs.yml +190 -0
- data/scripts/install_local_database.sh +309 -0
- metadata +341 -0
@@ -0,0 +1,351 @@
# Architecture Overview

HTM (Hierarchical Temporary Memory) implements a sophisticated two-tier memory system designed specifically for LLM-based applications ("robots"). This architecture enables robots to maintain long-term context across sessions while managing token budgets efficiently.

## System Overview

HTM provides intelligent memory management through five core components that work together to deliver persistent, searchable, and context-aware memory for AI agents.

<svg viewBox="0 0 800 600" xmlns="http://www.w3.org/2000/svg" style="background: transparent;">
  <!-- HTM Core -->
  <rect x="300" y="50" width="200" height="80" fill="rgba(76, 175, 80, 0.2)" stroke="#4CAF50" stroke-width="2" rx="5"/>
  <text x="400" y="85" text-anchor="middle" fill="#E0E0E0" font-size="16" font-weight="bold">HTM</text>
  <text x="400" y="105" text-anchor="middle" fill="#B0B0B0" font-size="12">Coordination Layer</text>

  <!-- Working Memory -->
  <rect x="50" y="200" width="200" height="120" fill="rgba(33, 150, 243, 0.2)" stroke="#2196F3" stroke-width="2" rx="5"/>
  <text x="150" y="235" text-anchor="middle" fill="#E0E0E0" font-size="14" font-weight="bold">Working Memory</text>
  <text x="150" y="255" text-anchor="middle" fill="#B0B0B0" font-size="11">Token-Limited</text>
  <text x="150" y="275" text-anchor="middle" fill="#B0B0B0" font-size="11">In-Memory</text>
  <text x="150" y="295" text-anchor="middle" fill="#B0B0B0" font-size="11">LRU Eviction</text>

  <!-- Long-Term Memory -->
  <rect x="300" y="200" width="200" height="120" fill="rgba(156, 39, 176, 0.2)" stroke="#9C27B0" stroke-width="2" rx="5"/>
  <text x="400" y="235" text-anchor="middle" fill="#E0E0E0" font-size="14" font-weight="bold">Long-Term Memory</text>
  <text x="400" y="255" text-anchor="middle" fill="#B0B0B0" font-size="11">PostgreSQL</text>
  <text x="400" y="275" text-anchor="middle" fill="#B0B0B0" font-size="11">Unlimited Capacity</text>
  <text x="400" y="295" text-anchor="middle" fill="#B0B0B0" font-size="11">Durable Storage</text>

  <!-- Embedding Service -->
  <rect x="550" y="200" width="200" height="120" fill="rgba(255, 152, 0, 0.2)" stroke="#FF9800" stroke-width="2" rx="5"/>
  <text x="650" y="235" text-anchor="middle" fill="#E0E0E0" font-size="14" font-weight="bold">Embedding Service</text>
  <text x="650" y="255" text-anchor="middle" fill="#B0B0B0" font-size="11">Ollama/OpenAI</text>
  <text x="650" y="275" text-anchor="middle" fill="#B0B0B0" font-size="11">Vector Embeddings</text>
  <text x="650" y="295" text-anchor="middle" fill="#B0B0B0" font-size="11">Semantic Search</text>

  <!-- Database -->
  <rect x="300" y="380" width="200" height="120" fill="rgba(244, 67, 54, 0.2)" stroke="#F44336" stroke-width="2" rx="5"/>
  <text x="400" y="415" text-anchor="middle" fill="#E0E0E0" font-size="14" font-weight="bold">Database</text>
  <text x="400" y="435" text-anchor="middle" fill="#B0B0B0" font-size="11">PostgreSQL 16+</text>
  <text x="400" y="455" text-anchor="middle" fill="#B0B0B0" font-size="11">TimescaleDB</text>
  <text x="400" y="475" text-anchor="middle" fill="#B0B0B0" font-size="11">pgvector + pg_trgm</text>

  <!-- Connections -->
  <line x1="400" y1="130" x2="150" y2="200" stroke="#2196F3" stroke-width="2"/>
  <line x1="400" y1="130" x2="400" y2="200" stroke="#9C27B0" stroke-width="2"/>
  <line x1="400" y1="130" x2="650" y2="200" stroke="#FF9800" stroke-width="2"/>
  <line x1="400" y1="320" x2="400" y2="380" stroke="#F44336" stroke-width="2"/>

  <!-- Labels -->
  <text x="275" y="170" fill="#B0B0B0" font-size="10">manages</text>
  <text x="410" y="170" fill="#B0B0B0" font-size="10">persists</text>
  <text x="520" y="170" fill="#B0B0B0" font-size="10">generates</text>
  <text x="420" y="360" fill="#B0B0B0" font-size="10">stores</text>

  <!-- Data Flow Arrows -->
  <defs>
    <marker id="arrowhead" markerWidth="10" markerHeight="10" refX="9" refY="3" orient="auto">
      <polygon points="0 0, 10 3, 0 6" fill="#4CAF50"/>
    </marker>
  </defs>

  <text x="20" y="540" fill="#4CAF50" font-size="12" font-weight="bold">Data Flow:</text>
  <text x="20" y="560" fill="#B0B0B0" font-size="11">Add Memory → Working Memory → Long-Term Memory (persistent)</text>
  <text x="20" y="580" fill="#B0B0B0" font-size="11">Recall → Long-Term (RAG search) → Working Memory (evict if needed)</text>
</svg>

## Core Components

### HTM (Main Interface)

The HTM class is the primary interface for memory operations. It coordinates between working memory, long-term memory, and embedding services to provide a unified API.

**Key Responsibilities:**

- Initialize and coordinate all memory subsystems
- Manage robot identification and registration
- Generate embeddings for new memories
- Orchestrate recall operations with RAG-based retrieval
- Assemble context for LLM consumption
- Track memory statistics and robot activity

**Related ADRs:** [ADR-002](adrs/002-two-tier-memory.md), [ADR-008](adrs/008-robot-identification.md)

### Working Memory

Token-limited, in-memory storage for active conversation context. Working memory acts as a fast cache for recently accessed or highly important memories that the LLM needs immediate access to.

**Characteristics:**

- **Capacity:** Token-limited (default: 128,000 tokens)
- **Storage:** Ruby Hash (in-memory)
- **Eviction:** Hybrid importance + recency (LRU-based)
- **Lifetime:** Process lifetime
- **Access Time:** O(1) hash lookups

**Related ADRs:** [ADR-002](adrs/002-two-tier-memory.md), [ADR-007](adrs/007-eviction-strategy.md)

### Long-Term Memory

Durable PostgreSQL storage for permanent knowledge retention. All memories are stored here permanently unless explicitly deleted.

**Characteristics:**

- **Capacity:** Effectively unlimited
- **Storage:** PostgreSQL with TimescaleDB extension
- **Retention:** Permanent (explicit deletion only)
- **Access Pattern:** RAG-based retrieval (semantic + temporal)
- **Lifetime:** Forever

**Related ADRs:** [ADR-001](adrs/001-postgresql-timescaledb.md), [ADR-005](adrs/005-rag-retrieval.md)

### Embedding Service

Generates vector embeddings for semantic search and manages token counting for memory management.

**Supported Providers:**

- **Ollama** (default): Local embedding models (gpt-oss, nomic-embed-text, mxbai-embed-large)
- **OpenAI**: text-embedding-3-small, text-embedding-3-large
- **Cohere**: embed-english-v3.0, embed-multilingual-v3.0
- **Local**: Transformers.js for browser/edge deployment

**Related ADRs:** [ADR-003](adrs/003-ollama-embeddings.md)
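
To make "vector embeddings for semantic search" concrete, the sketch below ranks two memories against a query by cosine similarity. It is a plain-Ruby illustration, not HTM's code: the `cosine_similarity` helper and the three-dimensional vectors are hypothetical, and in HTM the comparison is expected to run inside PostgreSQL through pgvector (whose `<=>` operator computes cosine distance) rather than in Ruby.

```ruby
# Cosine similarity between two embedding vectors of equal length.
# Returns a Float in [-1.0, 1.0]; higher means more similar.
# (Hypothetical helper for illustration only.)
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Toy 3-dimensional "embeddings" (made-up values; real models emit
# hundreds or thousands of dimensions).
query    = [1.0, 0.0, 1.0]
memories = {
  "postgres tuning" => [0.9, 0.1, 0.8],
  "lunch menu"      => [0.0, 1.0, 0.1]
}

# Most similar memory first.
ranked = memories.sort_by { |_key, vec| -cosine_similarity(query, vec) }.map(&:first)
puts ranked.inspect
```
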

### Database

PostgreSQL 16+ with extensions for time-series optimization, vector similarity search, and full-text search.

**Key Extensions:**

- **TimescaleDB**: Hypertable partitioning, compression policies, time-range optimization
- **pgvector**: Vector similarity search with HNSW indexing
- **pg_trgm**: Trigram-based fuzzy text matching

**Related ADRs:** [ADR-001](adrs/001-postgresql-timescaledb.md)

## Component Interaction Flow

### Adding a Memory

```mermaid
sequenceDiagram
    participant User
    participant HTM
    participant EmbeddingService
    participant LongTermMemory
    participant WorkingMemory
    participant Database

    User->>HTM: add_node(key, value, ...)
    HTM->>EmbeddingService: embed(value)
    EmbeddingService-->>HTM: embedding vector
    HTM->>EmbeddingService: count_tokens(value)
    EmbeddingService-->>HTM: token_count
    HTM->>LongTermMemory: add(key, value, embedding, ...)
    LongTermMemory->>Database: INSERT INTO nodes
    Database-->>LongTermMemory: node_id
    LongTermMemory-->>HTM: node_id
    HTM->>WorkingMemory: add(key, value, token_count, ...)
    Note over WorkingMemory: Evict if needed
    WorkingMemory-->>HTM: success
    HTM-->>User: node_id
```
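
The ordering in the diagram (embed, count tokens, persist, then cache) can be sketched in plain Ruby with in-memory stand-ins. Every class below (`FakeEmbeddingService`, `FakeLongTermMemory`, `FakeWorkingMemory`, `MemoryCoordinator`) is hypothetical and exists only to show the call order; the fake embedding and the four-characters-per-token estimate are assumptions, not HTM's behavior.

```ruby
# Stand-in for the embedding service (hypothetical).
class FakeEmbeddingService
  # Fake "embedding": the first four bytes scaled into 0.0..1.0.
  def embed(text)
    text.bytes.first(4).map { |b| b / 255.0 }
  end

  # Rough estimate of about four characters per token (an assumption).
  def count_tokens(text)
    (text.length / 4.0).ceil
  end
end

# Stand-in for long-term memory; the array plays the role of the nodes table.
class FakeLongTermMemory
  def initialize
    @rows = []
  end

  # Plays the role of INSERT INTO nodes and returns the new node id.
  def add(key:, value:, embedding:, token_count:)
    @rows << { key: key, value: value, embedding: embedding, token_count: token_count }
    @rows.length
  end
end

# Stand-in for working memory (eviction omitted here).
class FakeWorkingMemory
  attr_reader :nodes

  def initialize
    @nodes = {}
  end

  def add(key, value, token_count:)
    @nodes[key] = { value: value, token_count: token_count }
    true
  end
end

# Coordinates the collaborators in the same order as the diagram.
class MemoryCoordinator
  def initialize(embedder:, long_term:, working:)
    @embedder  = embedder
    @long_term = long_term
    @working   = working
  end

  def add_node(key, value)
    embedding   = @embedder.embed(value)
    token_count = @embedder.count_tokens(value)
    node_id     = @long_term.add(key: key, value: value,
                                 embedding: embedding, token_count: token_count)
    @working.add(key, value, token_count: token_count)
    node_id
  end
end

working = FakeWorkingMemory.new
htm = MemoryCoordinator.new(embedder: FakeEmbeddingService.new,
                            long_term: FakeLongTermMemory.new,
                            working: working)
first_id = htm.add_node("decision_001", "Use PostgreSQL for storage")
puts first_id
```

Persisting before caching means a failure in working memory cannot lose the durable copy, which is consistent with the order shown in the diagram.
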

### Recalling Memories

```mermaid
sequenceDiagram
    participant User
    participant HTM
    participant LongTermMemory
    participant EmbeddingService
    participant Database
    participant WorkingMemory

    User->>HTM: recall(timeframe, topic, ...)
    HTM->>EmbeddingService: embed(topic)
    EmbeddingService-->>HTM: query_embedding
    HTM->>LongTermMemory: search(timeframe, embedding, ...)
    LongTermMemory->>Database: SELECT with vector similarity
    Database-->>LongTermMemory: matching nodes
    LongTermMemory-->>HTM: recalled_memories
    loop For each recalled memory
        HTM->>WorkingMemory: add(memory)
        Note over WorkingMemory: Evict old memories if needed
    end
    HTM-->>User: recalled_memories
```
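
A minimal sketch of the recall path, assuming the topic has already been embedded: filter by timeframe, order by similarity, then load the hits into working memory. The `Node` struct, the `search` and `recall` helpers, and the dot-product scoring are hypothetical stand-ins for the SQL query that the database would run.

```ruby
Node = Struct.new(:key, :value, :embedding, :created_at, keyword_init: true)

# Dot product as a simple similarity score (illustrative only).
def dot(a, b)
  a.zip(b).sum { |x, y| x * y }
end

# Hypothetical stand-in for the long-term search step:
# keep nodes inside the timeframe, then order by similarity to the query.
def search(nodes, timeframe:, query_embedding:, limit:)
  nodes
    .select { |n| timeframe.cover?(n.created_at) }
    .sort_by { |n| -dot(n.embedding, query_embedding) }
    .first(limit)
end

# Search long-term memory, then load every hit into working memory.
def recall(nodes, working_memory, timeframe:, query_embedding:, limit: 2)
  recalled = search(nodes, timeframe: timeframe,
                           query_embedding: query_embedding, limit: limit)
  recalled.each { |n| working_memory[n.key] = n.value }
  recalled
end

now = Time.now
day = 24 * 60 * 60
nodes = [
  Node.new(key: "a", value: "PostgreSQL chosen", embedding: [1.0, 0.0], created_at: now - day),
  Node.new(key: "b", value: "Lunch order",       embedding: [0.0, 1.0], created_at: now - day),
  Node.new(key: "c", value: "Old DB note",       embedding: [1.0, 0.0], created_at: now - 30 * day)
]

working_memory = {}
results = recall(nodes, working_memory,
                 timeframe: (now - 7 * day)..now,
                 query_embedding: [1.0, 0.0],
                 limit: 1)
puts results.map(&:key).inspect
```

Node `c` matches the query just as well as `a`, but it falls outside the seven-day timeframe and is filtered out before ranking.
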

## Key Architectural Principles

### 1. Never Forget (Unless Told)

HTM implements a "never forget" philosophy. Eviction only removes a memory from working memory; the memory remains in long-term storage and is not deleted. Only explicit `forget(key, confirm: :confirmed)` operations delete data.

!!! info "Design Principle"
    Memory eviction is about managing working memory tokens, not data deletion. All evicted memories remain searchable and recallable from long-term storage.

**Related ADRs:** [ADR-009](adrs/009-never-forget.md)
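
The difference between eviction and deletion can be sketched with a toy store. `MemoryStore` and its methods are hypothetical; only the `forget(key, confirm: :confirmed)` call shape comes from this document, and raising `ArgumentError` when the confirmation is missing is an assumption about how such a guard might behave.

```ruby
# Toy two-tier store (hypothetical) showing evict vs. forget.
class MemoryStore
  def initialize
    @long_term = {}
    @working   = {}
  end

  def add(key, value)
    @long_term[key] = value
    @working[key]   = value
  end

  # Eviction only frees working memory; the long-term copy is untouched.
  def evict(key)
    @working.delete(key)
  end

  # Deletion requires an explicit confirmation symbol.
  # Returns true if a long-term record was removed.
  def forget(key, confirm: false)
    raise ArgumentError, "pass confirm: :confirmed to delete" unless confirm == :confirmed

    @working.delete(key)
    !@long_term.delete(key).nil?
  end

  def stored?(key)
    @long_term.key?(key)
  end

  def in_working_memory?(key)
    @working.key?(key)
  end
end

store = MemoryStore.new
store.add("pref_001", "User prefers dark mode")

store.evict("pref_001")
puts store.stored?("pref_001")            # still in long-term storage

store.forget("pref_001", confirm: :confirmed)
puts store.stored?("pref_001")            # now deleted
```
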

### 2. Two-Tier Memory Hierarchy

Working memory provides fast O(1) access to recent/important context, while long-term memory provides unlimited durable storage with RAG-based retrieval.

!!! success "Performance Benefit"
    This architecture balances the competing needs of fast access (working memory) and unlimited retention (long-term memory).

**Related ADRs:** [ADR-002](adrs/002-two-tier-memory.md)

### 3. Hive Mind Architecture

All robots share a global long-term memory database, enabling cross-robot learning and context continuity. Each robot maintains its own working memory for process isolation.

!!! tip "Multi-Robot Collaboration"
    Knowledge gained by one robot benefits all robots. Users never need to repeat information across sessions or robots.

**Related ADRs:** [ADR-004](adrs/004-hive-mind.md)
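
As a rough illustration of shared long-term memory with per-robot working memory, the sketch below uses a single in-process object in place of the shared PostgreSQL database. `SharedLongTermMemory` and `Robot` are hypothetical names, and the `Mutex` only stands in for the concurrency control a real database provides across separate robot processes.

```ruby
# In-process stand-in for the shared database (hypothetical).
class SharedLongTermMemory
  def initialize
    @nodes = []
    @lock  = Mutex.new
  end

  def add(robot_id, key, value)
    @lock.synchronize { @nodes << { robot_id: robot_id, key: key, value: value } }
  end

  def find(key)
    @lock.synchronize { @nodes.find { |n| n[:key] == key } }
  end
end

# Each robot has private working memory but writes to the shared store.
class Robot
  attr_reader :robot_id, :working_memory

  def initialize(robot_id, long_term)
    @robot_id       = robot_id
    @long_term      = long_term
    @working_memory = {}
  end

  def remember(key, value)
    @long_term.add(robot_id, key, value)
    @working_memory[key] = value
  end

  # Pulls a memory from the shared store into this robot's working memory.
  def recall(key)
    node = @long_term.find(key)
    return nil unless node

    @working_memory[key] = node[:value]
    node
  end
end

shared = SharedLongTermMemory.new
alice  = Robot.new("robot-a", shared)
bob    = Robot.new("robot-b", shared)

alice.remember("db_choice", "We use PostgreSQL")
found = bob.recall("db_choice")
puts found[:robot_id] # the robot that originally stored the memory
```
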

### 4. RAG-Based Retrieval

HTM uses Retrieval-Augmented Generation patterns with hybrid search strategies combining semantic similarity (vector search) and temporal relevance (time-range filtering).

!!! note "Search Strategies"
    - **Vector**: Pure semantic similarity
    - **Full-text**: Keyword-based search
    - **Hybrid**: Combines both with RRF scoring

**Related ADRs:** [ADR-005](adrs/005-rag-retrieval.md)
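
RRF (Reciprocal Rank Fusion) merges several ranked result lists by giving each item a score of 1 / (k + rank) per list and summing the scores. The sketch below shows the standard formula in plain Ruby; the `rrf_merge` helper, the node ids, and the constant `k = 60` (a commonly used default) are assumptions, since this page does not say which constant or weighting HTM applies.

```ruby
# Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per item,
# with rank starting at 1. Items found in several lists accumulate score.
# Ties are broken by id so the output order is deterministic.
def rrf_merge(*ranked_lists, k: 60)
  scores = Hash.new(0.0)
  ranked_lists.each do |list|
    list.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  scores.sort_by { |id, score| [-score, id] }.map(&:first)
end

vector_hits   = %w[n3 n1 n7] # best semantic matches first
fulltext_hits = %w[n1 n9 n3] # best keyword matches first

merged = rrf_merge(vector_hits, fulltext_hits)
puts merged.inspect
```

Here `n1` wins because it ranks near the top of both lists (1/62 + 1/61), ahead of `n3` (1/61 + 1/63); items that appear in only one list trail behind.
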

### 5. Importance-Weighted Eviction

Working memory eviction prioritizes low-importance older memories first, preserving critical context even if it's old.

!!! warning "Token Budget Management"
    Eviction is inevitable with finite token limits. The hybrid importance + recency strategy ensures the most valuable memories stay in working memory.

**Related ADRs:** [ADR-007](adrs/007-eviction-strategy.md)
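
One way to read "low-importance older memories first" is to sort eviction candidates by importance and then by last access time, and evict from the front until the new entry fits. The sketch below does exactly that; `Entry`, `WorkingMemorySketch`, and the tie-breaking rule are hypothetical, and the actual scoring in HTM may differ.

```ruby
Entry = Struct.new(:key, :token_count, :importance, :last_accessed, keyword_init: true)

# Token-budgeted store with importance-then-recency eviction (hypothetical).
class WorkingMemorySketch
  attr_reader :entries

  def initialize(max_tokens:)
    @max_tokens = max_tokens
    @entries    = {}
  end

  def token_count
    @entries.each_value.sum(&:token_count)
  end

  # Adds an entry, evicting others first if the budget would be exceeded.
  # Returns the keys that were evicted.
  def add(entry)
    raise ArgumentError, "entry larger than the whole budget" if entry.token_count > @max_tokens

    evicted = []
    # Lowest importance first; among equals, least recently accessed first.
    candidates = @entries.values.sort_by { |e| [e.importance, e.last_accessed] }
    while token_count + entry.token_count > @max_tokens
      victim = candidates.shift
      @entries.delete(victim.key)
      evicted << victim.key
    end
    @entries[entry.key] = entry
    evicted
  end
end

wm = WorkingMemorySketch.new(max_tokens: 100)
wm.add(Entry.new(key: "old_important",  token_count: 60, importance: 9.0, last_accessed: 1))
wm.add(Entry.new(key: "recent_trivial", token_count: 30, importance: 1.0, last_accessed: 5))

evicted = wm.add(Entry.new(key: "new", token_count: 40, importance: 5.0, last_accessed: 6))
puts evicted.inspect
```

The older but important entry survives while the recent, low-importance one is evicted. In HTM the evicted memory would still be available from long-term storage.
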

## Memory Lifecycle

```mermaid
stateDiagram-v2
    [*] --> Created: add_node()
    Created --> InWorkingMemory: Add to WM
    Created --> InLongTermMemory: Persist to LTM

    InWorkingMemory --> Evicted: Token limit reached
    Evicted --> InLongTermMemory: Mark as evicted

    InLongTermMemory --> Recalled: recall()
    Recalled --> InWorkingMemory: Add back to WM

    InWorkingMemory --> [*]: Process ends
    InLongTermMemory --> Forgotten: forget(confirm: :confirmed)
    Forgotten --> [*]: Permanently deleted

    note right of InWorkingMemory
        Fast O(1) access
        Token-limited
        Process-local
    end note

    note right of InLongTermMemory
        Durable PostgreSQL
        Unlimited capacity
        RAG retrieval
    end note
```

## Architecture Documents

Explore detailed architecture documentation:

- **[Detailed Architecture](overview.md)** - Deep dive into system architecture, data flows, and performance characteristics
- **[Two-Tier Memory System](two-tier-memory.md)** - Working memory and long-term memory design, eviction strategies, and context assembly
- **[Hive Mind Architecture](hive-mind.md)** - Multi-robot shared memory, robot identification, and cross-robot knowledge sharing

## Technology Stack

| Layer | Technology | Purpose |
|-------|-----------|---------|
| **Language** | Ruby 3.2+ | Core implementation |
| **Database** | PostgreSQL 16+ | Relational storage |
| **Time-Series** | TimescaleDB | Hypertable partitioning, compression |
| **Vector Search** | pgvector | Semantic similarity (HNSW) |
| **Full-Text** | pg_trgm | Fuzzy text matching |
| **Embeddings** | Ollama/OpenAI | Vector generation |
| **Connection Pool** | connection_pool gem | Database connection management |
| **Testing** | Minitest | Test framework |

## Performance Characteristics

### Working Memory

- **Add**: O(1) amortized (eviction is O(n log n) when needed)
- **Retrieve**: O(1) hash lookup
- **Context Assembly**: O(n log n) for sorting, O(k) for selecting
- **Typical Size**: 50-200 nodes (~128K tokens)
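
The context assembly cost above (one sort, then a linear selection) can be sketched as follows. `MemoryItem`, `assemble_context`, and the ordering by importance and then recency are hypothetical choices made for illustration; HTM's own assembly strategies are described in the two-tier memory documentation.

```ruby
MemoryItem = Struct.new(:key, :value, :token_count, :importance, :last_accessed,
                        keyword_init: true)

# Builds a context string that fits within max_tokens.
# Sorting is O(n log n); selecting the first k items that fit is O(k).
def assemble_context(items, max_tokens:)
  # Most important first; among equals, most recently accessed first.
  sorted = items.sort_by { |i| [-i.importance, -i.last_accessed] }

  used = 0
  selected = sorted.take_while do |item|
    used += item.token_count
    used <= max_tokens
  end

  selected.map(&:value).join("\n\n")
end

items = [
  MemoryItem.new(key: "c", value: "C", token_count: 30, importance: 1.0, last_accessed: 3),
  MemoryItem.new(key: "a", value: "A", token_count: 50, importance: 9.0, last_accessed: 1),
  MemoryItem.new(key: "b", value: "B", token_count: 40, importance: 5.0, last_accessed: 2)
]

context = assemble_context(items, max_tokens: 100)
puts context
```

Selection stops at the first item that would exceed the budget, so the result is a prefix of the sorted list rather than a best-fit packing.
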

### Long-Term Memory

- **Add**: O(log n) with PostgreSQL indexes
- **Vector Search**: O(log n) with HNSW (approximate)
- **Full-Text Search**: O(log n) with GIN indexes
- **Hybrid Search**: O(log n) + merge
- **Typical Size**: Thousands to millions of nodes

### Overall System

- **Memory Addition**: < 100ms (including embedding generation)
- **Recall Operation**: < 200ms (typical hybrid search)
- **Context Assembly**: < 10ms (working memory sort)
- **Eviction**: < 10ms (rare, only when working memory full)

## Scalability Considerations

### Vertical Scaling

- **Working Memory**: Limited by process RAM (~1-2GB for 128K tokens)
- **Database**: PostgreSQL scales to TBs with proper indexing
- **Embeddings**: Local models (Ollama) bounded by GPU/CPU

### Horizontal Scaling

- **Multiple Robots**: Each robot process has independent working memory
- **Database**: Single shared PostgreSQL instance (can add replicas)
- **Read Replicas**: For query scaling (future consideration)
- **Sharding**: By robot_id or timeframe (future consideration)

!!! tip "Scaling Strategy"
    Start with a single PostgreSQL instance. Add read replicas when query load increases. Consider partitioning by robot_id for multi-tenant scenarios.

## Related Documentation

- [Installation Guide](../installation.md) - Set up PostgreSQL, TimescaleDB, and dependencies
- [Quick Start](../quick-start.md) - Get started with HTM in 5 minutes
- [API Reference](../api/htm.md) - Complete API documentation
- [Architecture Decision Records](adrs/index.md) - Detailed decision history

## Architecture Reviews

All architecture decisions are documented in ADRs and reviewed by domain experts:

- **Systems Architect**: Overall system design and scalability
- **Database Architect**: PostgreSQL schema and query optimization
- **AI Engineer**: Embedding strategies and RAG implementation
- **Performance Specialist**: Latency and throughput analysis
- **Ruby Expert**: Idiomatic Ruby patterns and best practices
- **Security Specialist**: Data privacy and access control

See [Architecture Decision Records](adrs/index.md) for complete review notes.