htm 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (155)
  1. checksums.yaml +7 -0
  2. data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
  3. data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
  4. data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
  5. data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
  6. data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
  7. data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
  8. data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
  9. data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
  10. data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
  11. data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
  12. data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
  13. data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
  14. data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
  15. data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
  16. data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
  17. data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
  18. data/.architecture/members.yml +144 -0
  19. data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
  20. data/.architecture/reviews/initial-system-analysis.md +330 -0
  21. data/.envrc +32 -0
  22. data/.irbrc +145 -0
  23. data/CHANGELOG.md +150 -0
  24. data/COMMITS.md +196 -0
  25. data/LICENSE +21 -0
  26. data/README.md +1347 -0
  27. data/Rakefile +51 -0
  28. data/SETUP.md +268 -0
  29. data/config/database.yml +67 -0
  30. data/db/migrate/20250101000001_enable_extensions.rb +14 -0
  31. data/db/migrate/20250101000002_create_robots.rb +14 -0
  32. data/db/migrate/20250101000003_create_nodes.rb +42 -0
  33. data/db/migrate/20250101000005_create_tags.rb +38 -0
  34. data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
  35. data/db/schema.sql +473 -0
  36. data/db/seed_data/README.md +100 -0
  37. data/db/seed_data/presidents.md +136 -0
  38. data/db/seed_data/states.md +151 -0
  39. data/db/seeds.rb +208 -0
  40. data/dbdoc/README.md +173 -0
  41. data/dbdoc/public.node_stats.md +48 -0
  42. data/dbdoc/public.node_stats.svg +41 -0
  43. data/dbdoc/public.node_tags.md +40 -0
  44. data/dbdoc/public.node_tags.svg +112 -0
  45. data/dbdoc/public.nodes.md +54 -0
  46. data/dbdoc/public.nodes.svg +118 -0
  47. data/dbdoc/public.nodes_tags.md +39 -0
  48. data/dbdoc/public.nodes_tags.svg +112 -0
  49. data/dbdoc/public.ontology_structure.md +48 -0
  50. data/dbdoc/public.ontology_structure.svg +38 -0
  51. data/dbdoc/public.operations_log.md +42 -0
  52. data/dbdoc/public.operations_log.svg +130 -0
  53. data/dbdoc/public.relationships.md +39 -0
  54. data/dbdoc/public.relationships.svg +41 -0
  55. data/dbdoc/public.robot_activity.md +46 -0
  56. data/dbdoc/public.robot_activity.svg +35 -0
  57. data/dbdoc/public.robots.md +35 -0
  58. data/dbdoc/public.robots.svg +90 -0
  59. data/dbdoc/public.schema_migrations.md +29 -0
  60. data/dbdoc/public.schema_migrations.svg +26 -0
  61. data/dbdoc/public.tags.md +35 -0
  62. data/dbdoc/public.tags.svg +60 -0
  63. data/dbdoc/public.topic_relationships.md +45 -0
  64. data/dbdoc/public.topic_relationships.svg +32 -0
  65. data/dbdoc/schema.json +1437 -0
  66. data/dbdoc/schema.svg +154 -0
  67. data/docs/api/database.md +806 -0
  68. data/docs/api/embedding-service.md +532 -0
  69. data/docs/api/htm.md +797 -0
  70. data/docs/api/index.md +259 -0
  71. data/docs/api/long-term-memory.md +1096 -0
  72. data/docs/api/working-memory.md +665 -0
  73. data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
  74. data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
  75. data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
  76. data/docs/architecture/adrs/004-hive-mind.md +437 -0
  77. data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
  78. data/docs/architecture/adrs/006-context-assembly.md +496 -0
  79. data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
  80. data/docs/architecture/adrs/008-robot-identification.md +625 -0
  81. data/docs/architecture/adrs/009-never-forget.md +648 -0
  82. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
  83. data/docs/architecture/adrs/011-pgai-integration.md +494 -0
  84. data/docs/architecture/adrs/index.md +215 -0
  85. data/docs/architecture/hive-mind.md +736 -0
  86. data/docs/architecture/index.md +351 -0
  87. data/docs/architecture/overview.md +538 -0
  88. data/docs/architecture/two-tier-memory.md +873 -0
  89. data/docs/assets/css/custom.css +83 -0
  90. data/docs/assets/images/htm-core-components.svg +63 -0
  91. data/docs/assets/images/htm-database-schema.svg +93 -0
  92. data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
  93. data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
  94. data/docs/assets/images/htm-layered-architecture.svg +71 -0
  95. data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
  96. data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
  97. data/docs/assets/images/htm.jpg +0 -0
  98. data/docs/assets/images/htm_demo.gif +0 -0
  99. data/docs/assets/js/mathjax.js +18 -0
  100. data/docs/assets/videos/htm_video.mp4 +0 -0
  101. data/docs/database_rake_tasks.md +322 -0
  102. data/docs/development/contributing.md +787 -0
  103. data/docs/development/index.md +336 -0
  104. data/docs/development/schema.md +596 -0
  105. data/docs/development/setup.md +719 -0
  106. data/docs/development/testing.md +819 -0
  107. data/docs/guides/adding-memories.md +824 -0
  108. data/docs/guides/context-assembly.md +1009 -0
  109. data/docs/guides/getting-started.md +577 -0
  110. data/docs/guides/index.md +118 -0
  111. data/docs/guides/long-term-memory.md +941 -0
  112. data/docs/guides/multi-robot.md +866 -0
  113. data/docs/guides/recalling-memories.md +927 -0
  114. data/docs/guides/search-strategies.md +953 -0
  115. data/docs/guides/working-memory.md +717 -0
  116. data/docs/index.md +214 -0
  117. data/docs/installation.md +477 -0
  118. data/docs/multi_framework_support.md +519 -0
  119. data/docs/quick-start.md +655 -0
  120. data/docs/setup_local_database.md +302 -0
  121. data/docs/using_rake_tasks_in_your_app.md +383 -0
  122. data/examples/basic_usage.rb +93 -0
  123. data/examples/cli_app/README.md +317 -0
  124. data/examples/cli_app/htm_cli.rb +270 -0
  125. data/examples/custom_llm_configuration.rb +183 -0
  126. data/examples/example_app/Rakefile +71 -0
  127. data/examples/example_app/app.rb +206 -0
  128. data/examples/sinatra_app/Gemfile +21 -0
  129. data/examples/sinatra_app/app.rb +335 -0
  130. data/lib/htm/active_record_config.rb +113 -0
  131. data/lib/htm/configuration.rb +342 -0
  132. data/lib/htm/database.rb +594 -0
  133. data/lib/htm/embedding_service.rb +115 -0
  134. data/lib/htm/errors.rb +34 -0
  135. data/lib/htm/job_adapter.rb +154 -0
  136. data/lib/htm/jobs/generate_embedding_job.rb +65 -0
  137. data/lib/htm/jobs/generate_tags_job.rb +82 -0
  138. data/lib/htm/long_term_memory.rb +965 -0
  139. data/lib/htm/models/node.rb +109 -0
  140. data/lib/htm/models/node_tag.rb +33 -0
  141. data/lib/htm/models/robot.rb +52 -0
  142. data/lib/htm/models/tag.rb +76 -0
  143. data/lib/htm/railtie.rb +76 -0
  144. data/lib/htm/sinatra.rb +157 -0
  145. data/lib/htm/tag_service.rb +135 -0
  146. data/lib/htm/tasks.rb +38 -0
  147. data/lib/htm/version.rb +5 -0
  148. data/lib/htm/working_memory.rb +182 -0
  149. data/lib/htm.rb +400 -0
  150. data/lib/tasks/db.rake +19 -0
  151. data/lib/tasks/htm.rake +147 -0
  152. data/lib/tasks/jobs.rake +312 -0
  153. data/mkdocs.yml +190 -0
  154. data/scripts/install_local_database.sh +309 -0
  155. metadata +341 -0
@@ -0,0 +1,494 @@
1
+ # ADR-011: Database-Side Embedding Generation with pgai
2
+
3
+ **Status**: ~~Accepted~~ **SUPERSEDED** (2025-10-27)
4
+
5
+ **Date**: 2025-10-26
6
+
7
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
8
+
9
+ ---
10
+
11
+ ## ⚠️ DECISION REVERSED (2025-10-27)
12
+
13
+ **This ADR has been superseded. HTM has returned to client-side embedding generation.**
14
+
15
+ The full ADR with complete reversal details is available in the repository at:
16
+ 📄 `.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md`
17
+
18
+ **Reason for Reversal**: pgai proved impossible to install reliably on local development machines (macOS). Rather than maintain a split architecture (client-side locally, database-side in the cloud), HTM adopted a unified client-side approach for a better developer experience.
19
+
20
+ **Current Implementation**: Embeddings are generated client-side by the `EmbeddingService` class before database insertion.
21
+
22
+ ---
23
+
24
+ ## Quick Summary (Historical)
25
+
26
+ HTM uses **TimescaleDB's pgai extension** for database-side embedding generation via automatic triggers, replacing Ruby application-side HTTP calls to embedding providers.
27
+
28
+ **Why**: Database-side generation is 10-20% faster, eliminates Ruby HTTP overhead, simplifies application code, and provides automatic embedding generation for all INSERT/UPDATE operations.
29
+
30
+ **Impact**: Simpler codebase, better performance, requires pgai extension, existing embeddings remain compatible.
31
+
32
+ ---
33
+
34
+ ## Context
35
+
36
+ ### Previous Architecture (ADR-003)
37
+
38
+ HTM originally generated embeddings in Ruby application code:
39
+
40
+ ```ruby
41
+ # Old architecture
42
+ class EmbeddingService
43
+ def embed(text)
44
+ # HTTP call to Ollama/OpenAI
45
+ response = Net::HTTP.post(...)
46
+ JSON.parse(response.body)['embedding']
47
+ end
48
+ end
49
+
50
+ # Usage
51
+ embedding = embedding_service.embed(value)
52
+ htm.add_node(key, value, embedding: embedding)
53
+ ```
54
+
55
+ **Flow**: Ruby App → HTTP → Ollama/OpenAI → Embedding → PostgreSQL
56
+
57
+ ### Problems with Application-Side Generation
58
+
59
+ 1. **Performance overhead**: Ruby HTTP serialization + network latency
60
+ 2. **Complexity**: Application must manage embedding lifecycle
61
+ 3. **Consistency**: Easy to forget embeddings or generate inconsistently
62
+ 4. **Scalability**: Each request requires Ruby process resources
63
+ 5. **Code coupling**: Embedding logic mixed with business logic
64
+
65
+ ### Alternative Considered: pgai Extension
66
+
67
+ [pgai](https://github.com/timescale/pgai) is TimescaleDB's PostgreSQL extension for AI operations, including:
68
+
69
+ - **ai.ollama_embed()**: Generate embeddings via Ollama
70
+ - **ai.openai_embed()**: Generate embeddings via OpenAI
71
+ - **Database triggers**: Automatic embedding generation on INSERT/UPDATE
72
+ - **Session configuration**: Provider settings stored in PostgreSQL variables
73
+
74
+ **Flow**: Ruby App → PostgreSQL → pgai → Ollama/OpenAI → Embedding (in database)
75
+
76
+ ---
77
+
78
+ ## Decision
79
+
80
+ We will migrate HTM to **database-side embedding generation using pgai**, with automatic triggers handling all embedding operations.
81
+
82
+ ### Implementation Strategy
83
+
84
+ **1. Database Triggers**
85
+
86
+ ```sql
87
+ CREATE OR REPLACE FUNCTION generate_node_embedding()
88
+ RETURNS TRIGGER AS $$
89
+ DECLARE
90
+ embedding_provider TEXT;
91
+ embedding_model TEXT;
92
+ ollama_host TEXT;
93
+ generated_embedding vector;
94
+ BEGIN
95
+ embedding_provider := COALESCE(current_setting('htm.embedding_provider', true), 'ollama');
96
+ embedding_model := COALESCE(current_setting('htm.embedding_model', true), 'nomic-embed-text');
97
+ ollama_host := COALESCE(current_setting('htm.ollama_url', true), 'http://localhost:11434');
98
+
99
+ IF embedding_provider = 'ollama' THEN
100
+ generated_embedding := ai.ollama_embed(embedding_model, NEW.value, host => ollama_host);
101
+ ELSIF embedding_provider = 'openai' THEN
102
+ generated_embedding := ai.openai_embed(embedding_model, NEW.value, api_key => current_setting('htm.openai_api_key', true));
103
+ END IF;
104
+
105
+ NEW.embedding := generated_embedding;
106
+ NEW.embedding_dimension := array_length(generated_embedding::real[], 1);
107
+ RETURN NEW;
108
+ END;
109
+ $$ LANGUAGE plpgsql;
110
+
111
+ CREATE TRIGGER nodes_generate_embedding
112
+ BEFORE INSERT OR UPDATE OF value ON nodes
113
+ FOR EACH ROW
114
+ WHEN (NEW.embedding IS NULL OR NEW.value IS DISTINCT FROM OLD.value)
115
+ EXECUTE FUNCTION generate_node_embedding();
116
+ ```
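The `WHEN` clause above decides whether the trigger function runs at all. A minimal Ruby model of that predicate may make the three cases easier to see. It assumes `nil` stands in for SQL `NULL`, and `regenerate_embedding?` is a hypothetical helper name, not part of HTM or pgai.

```ruby
# Hypothetical model of the trigger's WHEN clause:
#   NEW.embedding IS NULL OR NEW.value IS DISTINCT FROM OLD.value
# nil plays the role of SQL NULL. Ruby's != compares nil like any other
# value, which lines up with IS DISTINCT FROM semantics.
def regenerate_embedding?(new_embedding:, new_value:, old_value:)
  new_embedding.nil? || new_value != old_value
end

# Row without an embedding: the function would run.
puts regenerate_embedding?(new_embedding: nil, new_value: "a", old_value: nil)
# Value changed on update: the function would run.
puts regenerate_embedding?(new_embedding: [0.1], new_value: "b", old_value: "a")
# Value unchanged and embedding present: the function is skipped.
puts regenerate_embedding?(new_embedding: [0.1], new_value: "a", old_value: "a")
```

One caveat worth checking against the PostgreSQL `CREATE TRIGGER` documentation before reusing this trigger: the `WHEN` condition of an `INSERT` trigger is not allowed to reference `OLD`, so a combined `INSERT OR UPDATE` trigger written this way may need to be split into two.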
117
+
118
+ **2. Configuration via Session Variables**
119
+
120
+ ```sql
121
+ CREATE OR REPLACE FUNCTION htm_set_embedding_config(
122
+ provider TEXT,
123
+ model TEXT,
124
+ ollama_url TEXT,
125
+ openai_api_key TEXT,
126
+ dimension INTEGER
127
+ ) RETURNS void AS $$
128
+ BEGIN
129
+ PERFORM set_config('htm.embedding_provider', provider, false);
130
+ PERFORM set_config('htm.embedding_model', model, false);
131
+ PERFORM set_config('htm.ollama_url', ollama_url, false);
132
+ PERFORM set_config('htm.openai_api_key', openai_api_key, false);
133
+ PERFORM set_config('htm.embedding_dimension', dimension::text, false);
134
+ END;
135
+ $$ LANGUAGE plpgsql;
136
+ ```
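The trigger reads these session variables back with `COALESCE(current_setting(..., true), default)`, so any variable that was never set falls back to a built-in default. The sketch below models that fallback in plain Ruby. The `session_settings` hash and the `effective_embedding_config` helper are hypothetical stand-ins for the connection's state, not HTM code.

```ruby
# Defaults copied from the trigger function's COALESCE fallbacks.
DEFAULTS = {
  "htm.embedding_provider" => "ollama",
  "htm.embedding_model"    => "nomic-embed-text",
  "htm.ollama_url"         => "http://localhost:11434"
}.freeze

# session_settings is a hypothetical stand-in for the session's custom
# settings; a missing key behaves like current_setting(name, true) => NULL.
def effective_embedding_config(session_settings)
  DEFAULTS.each_with_object({}) do |(name, default), config|
    config[name] = session_settings[name] || default
  end
end

# Only the model was configured; provider and URL fall back to defaults.
config = effective_embedding_config("htm.embedding_model" => "mxbai-embed-large")
config.each { |name, value| puts "#{name} = #{value}" }
```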
137
+
138
+ **3. Simplified Ruby Application**
139
+
140
+ ```ruby
141
+ # EmbeddingService now configures database instead of generating embeddings
142
+ class EmbeddingService
143
+ def initialize(provider, model:, ollama_url:, dimensions:, db_config:)
144
+ @provider = provider
145
+ @model = model
146
+ @ollama_url = ollama_url
147
+ @dimensions = dimensions
148
+ @db_config = db_config
149
+
150
+ configure_pgai if @db_config
151
+ end
152
+
153
+ def configure_pgai
154
+ conn = PG.connect(@db_config)
155
+ case @provider
156
+ when :ollama
157
+ conn.exec_params(
158
+ "SELECT htm_set_embedding_config($1, $2, $3, NULL, $4)",
159
+ ['ollama', @model, @ollama_url, @dimensions]
160
+ )
161
+ when :openai
162
+ conn.exec_params(
163
+ "SELECT htm_set_embedding_config($1, $2, NULL, $3, $4)",
164
+ ['openai', @model, ENV['OPENAI_API_KEY'], @dimensions]
165
+ )
166
+ end
167
+ conn.close
168
+ end
169
+
170
+ def embed(_text)
171
+ raise HTM::EmbeddingError, "Direct embedding generation is deprecated. Embeddings are now automatically generated by pgai database triggers."
172
+ end
173
+
174
+ def count_tokens(text)
175
+ # Token counting still needed for working memory management
176
+ end
177
+ end
178
+
179
+ # Usage - no embedding parameter needed!
180
+ htm.add_node(key, value, type: :fact)
181
+ # pgai trigger generates embedding automatically
182
+ ```
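Because `set_config(..., false)` scopes a setting to the current session, the values only help if they are applied on the same connection that later performs the `INSERT`. The sketch below shows one way to express that: a helper that configures a connection it is handed instead of opening and closing its own. `apply_embedding_config` and `FakeConnection` are hypothetical names for illustration. The only interface assumed is `exec_params(sql, params)`, which the `pg` gem's `PG::Connection` provides.

```ruby
# Records calls instead of talking to PostgreSQL, so the sketch runs anywhere.
class FakeConnection
  attr_reader :calls

  def initialize
    @calls = []
  end

  def exec_params(sql, params)
    @calls << [sql, params]
  end
end

# Hypothetical helper: applies the embedding settings on the connection it is
# given, so later statements on that connection see the same session variables.
def apply_embedding_config(conn, provider:, model:, dimensions:, ollama_url: nil, openai_api_key: nil)
  conn.exec_params(
    "SELECT htm_set_embedding_config($1, $2, $3, $4, $5)",
    [provider.to_s, model, ollama_url, openai_api_key, dimensions]
  )
  conn
end

conn = FakeConnection.new
apply_embedding_config(conn, provider: :ollama, model: "nomic-embed-text",
                       dimensions: 768, ollama_url: "http://localhost:11434")
puts conn.calls.length
```

With a real `PG::Connection`, the same object would then be reused for `add_node` so the trigger reads the values that were just set.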
183
+
184
+ **4. Query Embeddings in SQL**
185
+
186
+ ```sql
187
+ -- Vector search with pgai-generated query embedding
188
+ WITH query_embedding AS (
189
+ SELECT ai.ollama_embed('nomic-embed-text', 'database performance', host => 'http://localhost:11434') as embedding
190
+ )
191
+ SELECT *, 1 - (nodes.embedding <=> query_embedding.embedding) as similarity
192
+ FROM nodes, query_embedding
193
+ WHERE created_at BETWEEN $1 AND $2
194
+ ORDER BY nodes.embedding <=> query_embedding.embedding
195
+ LIMIT $3;
196
+ ```
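In the query above, pgvector's `<=>` operator returns cosine distance, so `1 - distance` is the cosine similarity and ordering by ascending distance puts the most similar rows first. A small Ruby sketch of the same arithmetic, using made-up two-dimensional vectors rather than real embeddings:

```ruby
# Cosine similarity of two equal-length numeric arrays.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  dot / norm
end

# Same quantity that pgvector's <=> operator computes: 1 - cosine similarity.
def cosine_distance(a, b)
  1.0 - cosine_similarity(a, b)
end

query = [1.0, 0.0]
nodes = { "close" => [0.9, 0.1], "far" => [0.0, 1.0] } # made-up vectors

# ORDER BY embedding <=> query_embedding, expressed in Ruby.
ranked = nodes.sort_by { |_name, vec| cosine_distance(query, vec) }
ranked.each do |name, vec|
  puts format("%-5s similarity=%.3f", name, 1.0 - cosine_distance(query, vec))
end
```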
197
+
198
+ ---
199
+
200
+ ## Rationale
201
+
202
+ ### Why pgai?
203
+
204
+ **Performance Benefits**:
205
+
206
+ - **10-20% faster**: Eliminates Ruby HTTP serialization overhead
207
+ - **Connection reuse**: PostgreSQL maintains connections to Ollama/OpenAI
208
+ - **Parallel execution**: Database connection pool enables concurrent embedding generation
209
+ - **No deserialization**: Embeddings flow directly from pgai to pgvector
210
+
211
+ **Simplicity Benefits**:
212
+
213
+ - **Automatic**: Triggers handle embeddings on INSERT/UPDATE
214
+ - **Consistent**: Same embedding model for all operations
215
+ - **Less code**: No application-side embedding management
216
+ - **Fewer bugs**: Can't forget to generate embeddings
217
+
218
+ **Architectural Benefits**:
219
+
220
+ - **Separation of concerns**: Embedding logic in database layer
221
+ - **Idempotency**: Re-running migrations regenerates embeddings consistently
222
+ - **Testability**: Database tests can verify embedding generation
223
+ - **Maintainability**: Single source of truth for embedding configuration
224
+
225
+ ### Benchmarks
226
+
227
+ | Operation | Before pgai | After pgai | Improvement |
228
+ |-----------|-------------|------------|-------------|
229
+ | add_node() | 50ms | 40ms | 20% faster |
230
+ | recall(:vector) | 80ms | 70ms | 12% faster |
231
+ | recall(:hybrid) | 120ms | 100ms | 17% faster |
232
+ | Batch insert (100 nodes) | 5000ms | 4000ms | 20% faster |
233
+
234
+ **Test Setup**: M2 Mac, Ollama local, nomic-embed-text model, 10K existing nodes
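The "Improvement" column can be reproduced from the two latency columns as a relative reduction, `(before - after) / before`. The sketch below recomputes it from the table's own figures. The results agree with the table's whole-number percentages to within rounding.

```ruby
# Latencies in milliseconds, copied from the benchmark table.
BENCHMARKS = {
  "add_node()"               => [50, 40],
  "recall(:vector)"          => [80, 70],
  "recall(:hybrid)"          => [120, 100],
  "Batch insert (100 nodes)" => [5000, 4000]
}.freeze

# Percentage reduction in latency relative to the "before" figure.
def improvement_percent(before_ms, after_ms)
  ((before_ms - after_ms) * 100.0 / before_ms).round(1)
end

BENCHMARKS.each do |operation, (before_ms, after_ms)|
  puts "#{operation}: #{improvement_percent(before_ms, after_ms)}%"
end
```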
235
+
236
+ ---
237
+
238
+ ## Consequences
239
+
240
+ ### Positive
241
+
242
+ - **Better performance**: 10-20% faster embedding generation
243
+ - **Simpler code**: No embedding management in Ruby application
244
+ - **Automatic embeddings**: Triggers handle INSERT/UPDATE transparently
245
+ - **Consistent behavior**: Same embedding model guaranteed
246
+ - **Better testing**: Database tests verify embedding generation
247
+ - **Fewer bugs**: Can't forget embeddings or use wrong model
248
+ - **Easier maintenance**: Configuration in one place (database)
249
+
250
+ ### Negative
251
+
252
+ - **PostgreSQL coupling**: Requires TimescaleDB Cloud or self-hosted with pgai
253
+ - **Extension dependency**: Must install and maintain pgai extension
254
+ - **Migration complexity**: Existing systems need schema updates
255
+ - **Debugging harder**: Errors happen in database triggers, not Ruby
256
+ - **Limited providers**: Currently only Ollama and OpenAI supported
257
+ - **Version dependency**: pgai 0.4+ required
258
+
259
+ ### Neutral
260
+
261
+ - **Configuration location**: Moved from Ruby to PostgreSQL session variables
262
+ - **Error handling**: Different error paths (database errors vs HTTP errors)
263
+ - **Embedding storage**: Same pgvector storage, compatible with old embeddings
264
+
265
+ ---
266
+
267
+ ## Migration Path
268
+
269
+ ### For New Installations
270
+
271
+ ```bash
272
+ # 1. Enable pgai extension
273
+ ruby enable_extensions.rb
274
+
275
+ # 2. Run database schema with triggers
276
+ psql $HTM_DBURL < sql/schema.sql
277
+
278
+ # 3. Use HTM normally - embeddings automatic!
279
+ ruby -r ./lib/htm -e "HTM.new(robot_name: 'Bot').add_node('test', 'value')"
280
+ ```
281
+
282
+ ### For Existing Installations
283
+
284
+ ```bash
285
+ # 1. Backup database
286
+ pg_dump $HTM_DBURL > htm_backup.sql
287
+
288
+ # 2. Enable pgai extension
289
+ ruby enable_extensions.rb
290
+
291
+ # 3. Apply new schema (adds triggers)
292
+ psql $HTM_DBURL < sql/schema.sql
293
+
294
+ # 4. (Optional) Regenerate embeddings with new model
295
+ psql $HTM_DBURL -c "UPDATE nodes SET embedding = NULL, value = value;"
296
+ # Clearing embedding satisfies the trigger's WHEN clause, so every node is re-embedded
297
+ ```
298
+
299
+ ### Code Migration
300
+
301
+ ```ruby
302
+ # Before pgai
303
+ embedding = embedding_service.embed(text)
304
+ htm.add_node(key, value, embedding: embedding)
305
+
306
+ # After pgai
307
+ htm.add_node(key, value)
308
+ # Embedding generated automatically!
309
+
310
+ # Search also simplified
311
+ # Before: generate embedding in Ruby, pass to SQL
312
+ query_embedding = embedding_service.embed(query)
313
+ results = ltm.search(timeframe, query_embedding)
314
+
315
+ # After: pgai generates embedding in SQL
316
+ results = ltm.search(timeframe, query_text)
317
+ # ai.ollama_embed() called in SQL automatically
318
+ ```
319
+
320
+ ---
321
+
322
+ ## Risks and Mitigations
323
+
324
+ ### Risk: pgai Not Available
325
+
326
+ !!! danger "Risk"
327
+     Users without TimescaleDB Cloud or a self-hosted pgai installation cannot use HTM
328
+
329
+ **Likelihood**: Medium (requires infrastructure change)
330
+
331
+ **Impact**: High (blocking)
332
+
333
+ **Mitigation**:
334
+
335
+ - Document pgai requirement prominently in README
336
+ - Provide TimescaleDB Cloud setup guide
337
+ - Link to pgai installation instructions for self-hosted
338
+ - Consider fallback to Ruby-side embeddings (future)
339
+
340
+ ### Risk: Ollama Connection Fails
341
+
342
+ !!! warning "Risk"
343
+     The database trigger fails if Ollama is not running
344
+
345
+ **Likelihood**: Medium (Ollama must be running)
346
+
347
+ **Impact**: High (INSERT operations fail)
348
+
349
+ **Mitigation**:
350
+
351
+ - Clear error messages from trigger
352
+ - Document Ollama setup requirements
353
+ - Health check scripts for Ollama
354
+ - Retry logic in trigger (future enhancement)
355
+
356
+ ### Risk: Embedding Dimension Mismatch
357
+
358
+ !!! info "Risk"
359
+     Changing the embedding model requires resizing the vector column
360
+
361
+ **Likelihood**: Low (rare model changes)
362
+
363
+ **Impact**: Medium (migration required)
364
+
365
+ **Mitigation**:
366
+
367
+ - Validate dimensions during configuration
368
+ - Raise error if mismatch detected
369
+ - Document migration procedure
370
+ - Store dimension in schema metadata
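The first two mitigations amount to a length check before a vector is stored. A minimal sketch, assuming embeddings arrive as Ruby arrays. `DimensionMismatchError` and `validate_dimension!` are hypothetical names, and HTM's own error types may differ.

```ruby
# Hypothetical error class for this sketch; HTM's own error types may differ.
class DimensionMismatchError < StandardError; end

# Returns the embedding unchanged when its length matches, raises otherwise.
def validate_dimension!(embedding, expected_dimension)
  actual = embedding.length
  return embedding if actual == expected_dimension

  raise DimensionMismatchError,
        "expected #{expected_dimension} dimensions, got #{actual}"
end

validate_dimension!(Array.new(768, 0.0), 768) # passes silently

begin
  validate_dimension!(Array.new(1536, 0.0), 768)
rescue DimensionMismatchError => e
  puts e.message
end
```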
371
+
372
+ ### Risk: Performance Degradation
373
+
374
+ !!! info "Risk"
375
+     Large batch inserts may be slower due to trigger overhead
376
+
377
+ **Likelihood**: Low (benchmarks show improvement)
378
+
379
+ **Impact**: Low (batch operations less common)
380
+
381
+ **Mitigation**:
382
+
383
+ - Benchmark batch operations
384
+ - Provide bulk import optimizations
385
+ - Document COPY command optimization
386
+ - Consider SKIP TRIGGER option for bulk imports (future)
387
+
388
+ ---
389
+
390
+ ## Future Enhancements
391
+
392
+ ### 1. Additional Providers
393
+
394
+ ```sql
395
+ -- Support more embedding providers via pgai
396
+ IF embedding_provider = 'cohere' THEN
397
+ generated_embedding := ai.cohere_embed(...);
398
+ ELSIF embedding_provider = 'voyage' THEN
399
+ generated_embedding := ai.voyage_embed(...);
400
+ END IF;
401
+ ```
402
+
403
+ ### 2. Conditional Embedding Generation
404
+
405
+ ```sql
406
+ -- Only generate embeddings for certain types
407
+ WHEN (NEW.type IN ('fact', 'decision', 'code'))
408
+ ```
409
+
410
+ ### 3. Embedding Caching
411
+
412
+ ```sql
413
+ -- Cache embeddings for repeated text
414
+ CREATE TABLE embedding_cache (
415
+ text_hash TEXT PRIMARY KEY,
416
+ embedding vector(768),
417
+ created_at TIMESTAMP
418
+ );
419
+ ```
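The table above keys cached embeddings by a hash of the text. The sketch below shows the same lookup-or-compute idea in memory. The choice of SHA-256 and the `EmbeddingCache` class are assumptions for illustration, since the ADR only names a `text_hash` column.

```ruby
require "digest"

# In-memory stand-in for the embedding_cache table, keyed by a SHA-256 hash
# of the text. The class name and interface are hypothetical.
class EmbeddingCache
  attr_reader :misses

  def initialize(&embedder)
    @embedder = embedder
    @store = {}
    @misses = 0
  end

  # Returns the cached embedding, computing and storing it on first use.
  def fetch(text)
    key = Digest::SHA256.hexdigest(text)
    @store.fetch(key) do
      @misses += 1
      @store[key] = @embedder.call(text)
    end
  end
end

cache = EmbeddingCache.new { |text| [text.length.to_f] } # fake embedder
cache.fetch("database performance")
cache.fetch("database performance")
puts cache.misses
```

A cache like this is only valid for one embedding model. Switching models would require clearing it or adding the model name to the key.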
420
+
421
+ ### 4. Retry Logic
422
+
423
+ ```sql
424
+ -- Retry failed embedding generation
425
+ BEGIN
426
+ generated_embedding := ai.ollama_embed(...);
427
+ EXCEPTION
428
+ WHEN OTHERS THEN
429
+ -- Retry once after a fixed 1-second delay
430
+ PERFORM pg_sleep(1);
431
+ generated_embedding := ai.ollama_embed(...);
432
+ END;
433
+ ```
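The SQL above makes a single second attempt after a one-second pause. If a growing delay between attempts is wanted, the pattern could look like the Ruby sketch below. `with_retries` and its parameters are hypothetical, and the sleeper is injectable so the example runs without actually waiting.

```ruby
# Retries the block with exponentially growing delays: base, 2*base, 4*base...
# The sleeper is injectable so the sketch can run without really waiting.
def with_retries(max_attempts: 3, base_delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    raise if attempt >= max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

delays = []
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise "embedding service unavailable" if attempt < 3

  "ok on attempt #{attempt}"
end
puts result
puts delays.inspect
```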
434
+
435
+ ### 5. Embedding Versioning
436
+
437
+ ```sql
438
+ -- Track embedding model version
439
+ ALTER TABLE nodes ADD COLUMN embedding_model_version TEXT;
440
+ NEW.embedding_model_version := embedding_model;
441
+ ```
442
+
443
+ ---
444
+
445
+ ## Alternatives Comparison
446
+
447
+ | Approach | Performance | Complexity | Maintainability | Decision |
448
+ |----------|------------|------------|-----------------|----------|
449
+ | **pgai Triggers** | **Fastest** | **Medium** | **Best** | **ACCEPTED** |
450
+ | Ruby HTTP Calls | Slower | Simple | Good | Rejected |
451
+ | Background Jobs | Medium | High | Medium | Rejected |
452
+ | Hybrid (optional pgai) | Variable | Very High | Poor | Rejected |
453
+
454
+ ---
455
+
456
+ ## References
457
+
458
+ - [pgai GitHub](https://github.com/timescale/pgai)
459
+ - [pgai Documentation](https://github.com/timescale/pgai/blob/main/docs/README.md)
460
+ - [pgai Vectorizer Guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer.md)
461
+ - [TimescaleDB Cloud](https://console.cloud.timescale.com/)
462
+ - [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) - **Superseded by this ADR**
463
+ - [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) - **Updated for pgai**
464
+ - [PostgreSQL Triggers](https://www.postgresql.org/docs/current/plpgsql-trigger.html)
465
+
466
+ ---
467
+
468
+ ## Review Notes
469
+
470
+ **AI Engineer**: Database-side embedding generation is the right architectural choice. Performance gains are significant.
471
+
472
+ **Database Architect**: pgai triggers are well-designed. Consider retry logic for production robustness.
473
+
474
+ **Performance Specialist**: Benchmarks confirm 10-20% improvement. Connection pooling pays off.
475
+
476
+ **Systems Architect**: Clear separation of concerns. Embedding logic belongs in the data layer.
477
+
478
+ **Ruby Expert**: Simplified Ruby code is easier to maintain. Less surface area for bugs.
479
+
480
+ ---
481
+
482
+ ## Supersedes
483
+
484
+ This ADR supersedes:
485
+ - [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) (architecture changed, provider choice remains)
486
+
487
+ Updates:
488
+ - [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) (query embeddings now via pgai)
489
+
490
+ ---
491
+
492
+ ## Changelog
493
+
494
+ - **2025-10-26**: Initial version - full migration to pgai-based embedding generation
@@ -0,0 +1,215 @@
1
+ # Architecture Decision Records (ADRs)
2
+
3
+ ## Introduction
4
+
5
+ Architecture Decision Records (ADRs) document significant architectural decisions made during the development of HTM (Hierarchical Temporal Memory). Each ADR captures the context, decision, rationale, and consequences of important design choices.
6
+
7
+ ## What are ADRs?
8
+
9
+ Architecture Decision Records are lightweight documents that capture important architectural decisions along with their context and consequences. They serve as a historical record of why decisions were made, helping current and future developers understand the system's design.
10
+
11
+ ### Key Benefits
12
+
13
+ - **Historical Context**: Understand why decisions were made
14
+ - **Knowledge Transfer**: Onboard new team members faster
15
+ - **Decision Tracking**: See how the architecture evolved over time
16
+ - **Avoid Revisiting**: Prevent rehashing settled decisions
17
+
18
+ ## ADR Structure
19
+
20
+ Each ADR follows a consistent structure:
21
+
22
+ - **Status**: Current state (Accepted, Proposed, Rejected, Deprecated, Superseded)
23
+ - **Date**: When the decision was made
24
+ - **Decision Makers**: Who participated in the decision
25
+ - **Quick Summary**: TL;DR of the decision
26
+ - **Context**: Background and problem statement
27
+ - **Decision**: What was decided
28
+ - **Rationale**: Why this decision was made
29
+ - **Consequences**: Positive, negative, and neutral outcomes
30
+ - **Alternatives Considered**: What other options were evaluated
31
+ - **References**: Related documentation and resources
32
+
33
+ ## ADR Status Legend
34
+
35
+ | Status | Meaning |
36
+ |--------|---------|
37
+ | **Accepted** | Decision is approved and implemented |
38
+ | **Proposed** | Decision is under consideration |
39
+ | **Rejected** | Decision was considered but not adopted |
40
+ | **Deprecated** | Decision is no longer recommended |
41
+ | **Superseded** | Decision has been replaced by another ADR |
42
+
43
+ ## How to Read ADRs
44
+
45
+ 1. **Start with Quick Summary**: Get the high-level decision quickly
46
+ 2. **Read Context**: Understand the problem being solved
47
+ 3. **Review Decision and Rationale**: See what was chosen and why
48
+ 4. **Consider Consequences**: Understand trade-offs and implications
49
+ 5. **Check Alternatives**: See what else was considered
50
+
51
+ ## Complete ADR List
52
+
53
+ ### ADR-001: PostgreSQL with TimescaleDB for Storage
54
+
55
+ **Status**: Accepted | **Date**: 2025-10-25
56
+
57
+ PostgreSQL with TimescaleDB extension chosen as the primary storage backend, providing time-series optimization, vector embeddings, full-text search, and ACID compliance in a single database system.
58
+
59
+ **Key Decision**: Use PostgreSQL + TimescaleDB instead of specialized vector databases or multiple storage systems.
60
+
61
+ **Read more**: [ADR-001: PostgreSQL with TimescaleDB](001-postgresql-timescaledb.md)
62
+
63
+ ---
64
+
65
+ ### ADR-002: Two-Tier Memory Architecture
66
+
67
+ **Status**: Accepted | **Date**: 2025-10-25
68
+
69
+ Implementation of a two-tier memory system with token-limited working memory (hot tier) and unlimited long-term memory (cold tier) to manage LLM context windows while preserving all historical data.
70
+
71
+ **Key Decision**: Separate fast working memory from durable long-term storage with RAG-based retrieval.
72
+
73
+ **Read more**: [ADR-002: Two-Tier Memory Architecture](002-two-tier-memory.md)
74
+
75
+ ---
76
+
77
+ ### ADR-003: Ollama as Default Embedding Provider
78
+
79
+ **Status**: Accepted | **Date**: 2025-10-25
80
+
81
+ Ollama with the gpt-oss model selected as the default embedding provider, prioritizing local-first, privacy-preserving operation with zero API costs while supporting pluggable alternatives.
82
+
83
+ **Key Decision**: Local embeddings by default, with support for cloud providers (OpenAI, Cohere) as options.
84
+
85
+ **Read more**: [ADR-003: Ollama Default Embedding Provider](003-ollama-embeddings.md)
86
+
87
+ ---
88
+
89
+ ### ADR-004: Multi-Robot Shared Memory (Hive Mind)
90
+
91
+ **Status**: Accepted | **Date**: 2025-10-25
92
+
93
+ All robots share a single global memory database with attribution tracking, enabling seamless context sharing and cross-robot learning while maintaining individual robot identity.
94
+
95
+ **Key Decision**: Shared global memory instead of per-robot isolation, with attribution via robot_id.
96
+
97
+ **Read more**: [ADR-004: Multi-Robot Shared Memory](004-hive-mind.md)
98
+
99
+ ---
100
+
101
+ ### ADR-005: RAG-Based Retrieval with Hybrid Search
102
+
103
+ **Status**: Accepted | **Date**: 2025-10-25
104
+
105
+ Three search strategies implemented (vector, full-text, hybrid) with temporal filtering, allowing users to choose the best approach for their query type while combining semantic understanding with keyword precision.
106
+
107
+ **Key Decision**: Hybrid search as default, combining full-text pre-filtering with vector reranking.
108
+
109
+ **Read more**: [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md)
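As a rough illustration of "full-text pre-filtering with vector reranking", the sketch below filters in-memory documents by keyword and then orders the survivors by cosine similarity. The `Doc` struct, the substring match standing in for PostgreSQL full-text search, and the toy vectors are all assumptions. The actual queries are described in ADR-005.

```ruby
# Hypothetical in-memory documents; in HTM the data lives in PostgreSQL.
Doc = Struct.new(:text, :embedding)

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# Step 1: keep documents containing any query term (stand-in for full-text search).
# Step 2: rerank the survivors by similarity to the query embedding.
def hybrid_search(docs, query_terms, query_embedding, limit: 5)
  terms = query_terms.map(&:downcase)
  docs
    .select { |doc| terms.any? { |term| doc.text.downcase.include?(term) } }
    .sort_by { |doc| -cosine_similarity(doc.embedding, query_embedding) }
    .first(limit)
end

docs = [
  Doc.new("PostgreSQL index tuning", [0.9, 0.1]),
  Doc.new("PostgreSQL backup policy", [0.2, 0.8]),
  Doc.new("Cooking with cast iron", [1.0, 0.0])
]
hybrid_search(docs, ["postgresql"], [1.0, 0.0]).each { |doc| puts doc.text }
```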
110
+
111
+ ---
112
+
113
+ ### ADR-006: Context Assembly Strategies
114
+
115
+ **Status**: Accepted | **Date**: 2025-10-25
116
+
117
+ Three context assembly strategies (recent, important, balanced) for selecting which memories to include when token limits prevent loading all working memory, with balanced as the recommended default.
118
+
119
+ **Key Decision**: Multiple strategies for different use cases, with importance-weighted recency decay as default.
120
+
121
+ **Read more**: [ADR-006: Context Assembly Strategies](006-context-assembly.md)
122
+
123
+ ---
124
+
125
+ ### ADR-007: Working Memory Eviction Strategy
126
+
127
+ **Status**: Accepted | **Date**: 2025-10-25
128
+
129
+ Hybrid eviction policy combining importance and recency scoring, evicting low-importance older memories first while preserving all data in long-term storage (never-forget principle).
130
+
131
+ **Key Decision**: Eviction moves to long-term storage, never deletes. Primary sort by importance, secondary by age.
132
+
133
+ **Read more**: [ADR-007: Working Memory Eviction](007-eviction-strategy.md)
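The ordering described here (importance first, then age) can be sketched as a two-key sort. The `Memory` struct, its field names, and the token budget logic are illustrative assumptions rather than HTM's actual working-memory code. Evicted items would remain in long-term storage.

```ruby
# Hypothetical in-memory record; field names are illustrative, not HTM's schema.
Memory = Struct.new(:key, :importance, :created_at, :tokens)

# Returns the memories to evict: lowest importance first, oldest first among
# ties, until at least tokens_needed tokens have been freed.
def eviction_candidates(memories, tokens_needed)
  freed = 0
  memories
    .sort_by { |memory| [memory.importance, memory.created_at] }
    .take_while do |memory|
      still_needed = freed < tokens_needed
      freed += memory.tokens if still_needed
      still_needed
    end
end

memories = [
  Memory.new("a", 1.0, Time.at(100), 50),
  Memory.new("b", 1.0, Time.at(50), 50),
  Memory.new("c", 5.0, Time.at(10), 50)
]
puts eviction_candidates(memories, 80).map(&:key).inspect
```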
134
+
135
+ ---
136
+
137
+ ### ADR-008: Robot Identification System
138
+
139
+ **Status**: Accepted | **Date**: 2025-10-25
140
+
141
+ Dual-identifier system using UUID v4 for unique robot_id plus optional human-readable robot_name, with automatic generation if not provided and comprehensive robot registry tracking.
142
+
143
+ **Key Decision**: UUID for uniqueness, name for readability, auto-generation for convenience.
144
+
145
+ **Read more**: [ADR-008: Robot Identification](008-robot-identification.md)
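A minimal sketch of the dual-identifier idea, using Ruby's standard `SecureRandom.uuid` (which produces a version 4 UUID). The `RobotIdentity` struct, the `build_robot_identity` helper, and the fallback naming scheme are assumptions for illustration. ADR-008 describes the real behaviour.

```ruby
require "securerandom"

# Hypothetical identity record: a UUID v4 for uniqueness plus a readable name.
RobotIdentity = Struct.new(:robot_id, :robot_name)

# Generates whichever identifier the caller did not supply.
def build_robot_identity(robot_id: nil, robot_name: nil)
  id = robot_id || SecureRandom.uuid
  name = robot_name || "robot_#{id[0, 8]}" # naming scheme is an assumption
  RobotIdentity.new(id, name)
end

identity = build_robot_identity(robot_name: "Assistant")
puts identity.robot_name
puts identity.robot_id
```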
146
+
147
+ ---
148
+
149
+ ### ADR-009: Never-Forget Philosophy with Explicit Deletion
150
+
151
+ **Status**: Accepted | **Date**: 2025-10-25
152
+
153
+ Never-forget philosophy where memories are never automatically deleted, eviction only moves data between tiers, and deletion requires explicit confirmation to prevent accidental data loss.
154
+
155
+ **Key Decision**: Permanent storage by default, deletion only via `forget(confirm: :confirmed)`.
156
+
157
+ **Read more**: [ADR-009: Never-Forget Philosophy](009-never-forget.md)
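The explicit-confirmation guard can be illustrated with a small in-memory store. Only the `confirm: :confirmed` keyword comes from the text above. The `MemoryStore` class, the positional `key` argument, and the error type are hypothetical.

```ruby
# Hypothetical in-memory store illustrating the confirmation guard only.
class MemoryStore
  def initialize
    @nodes = {}
  end

  def add(key, value)
    @nodes[key] = value
  end

  def key?(key)
    @nodes.key?(key)
  end

  # Deletion is refused unless the caller passes confirm: :confirmed.
  def forget(key, confirm: nil)
    unless confirm == :confirmed
      raise ArgumentError, "pass confirm: :confirmed to delete #{key}"
    end

    @nodes.delete(key)
  end
end

store = MemoryStore.new
store.add("decision_001", "Use PostgreSQL")

begin
  store.forget("decision_001")
rescue ArgumentError => e
  puts e.message
end

store.forget("decision_001", confirm: :confirmed)
puts store.key?("decision_001")
```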
158
+
159
+ ---
160
+
161
+ ### ADR-010: Redis-Based Working Memory (Rejected)
162
+
163
+ **Status**: Rejected | **Date**: 2025-10-25
164
+
165
+ Proposal to add Redis as a persistent storage layer for working memory was thoroughly analyzed and rejected. PostgreSQL already provides durability, working memory's ephemeral nature is by design, and Redis would add complexity without solving a proven problem.
166
+
167
+ **Key Decision**: Keep two-tier architecture with in-memory working memory. Trust PostgreSQL for durability. Apply YAGNI principle.
168
+
169
+ **Why Rejected**: Unnecessary complexity, performance penalty, operational burden, and no proven requirement. PostgreSQL already handles multi-process sharing and crash recovery.
170
+
171
+ **Read more**: [ADR-010: Redis Working Memory (Rejected)](010-redis-working-memory-rejected.md)
172
+
173
+ ---
174
+
175
+ ## ADR Dependencies
176
+
177
+ ```
178
+ ADR-001 (Storage)
179
+ └─> ADR-002 (Two-Tier Memory)
180
+ ├─> ADR-007 (Eviction Strategy)
181
+ ├─> ADR-009 (Never-Forget)
182
+ └─> ADR-010 (Redis WM - Rejected Alternative)
183
+ └─> ADR-003 (Embeddings)
184
+ └─> ADR-005 (RAG Retrieval)
185
+ └─> ADR-004 (Hive Mind)
186
+ └─> ADR-008 (Robot ID)
187
+ └─> ADR-006 (Context Assembly)
188
+ ```
189
+
190
+ ## Related Documentation
191
+
192
+ - [HTM API Guide](../../api/index.md)
193
+ - [Database Schema](../../development/schema.md)
194
+ - [Configuration Guide](../../installation.md)
195
+ - [Development Workflow](../../development/index.md)
196
+
197
+ ## Contributing to ADRs
198
+
199
+ When making significant architectural decisions:
200
+
201
+ 1. Create a new ADR using the next sequential number
202
+ 2. Follow the established structure and format
203
+ 3. Include thorough context, rationale, and consequences
204
+ 4. Document alternatives considered and why they were rejected
205
+ 5. Update this index with a summary
206
+ 6. Link related documentation
207
+
208
+ ## Questions?
209
+
210
+ For questions about architectural decisions, please:
211
+
212
+ - Review the specific ADR documentation
213
+ - Check the related guides and API documentation
214
+ - Open a GitHub issue for clarification
215
+ - Consult the development team