htm 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (155)
  1. checksums.yaml +7 -0
  2. data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
  3. data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
  4. data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
  5. data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
  6. data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
  7. data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
  8. data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
  9. data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
  10. data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
  11. data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
  12. data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
  13. data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
  14. data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
  15. data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
  16. data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
  17. data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
  18. data/.architecture/members.yml +144 -0
  19. data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
  20. data/.architecture/reviews/initial-system-analysis.md +330 -0
  21. data/.envrc +32 -0
  22. data/.irbrc +145 -0
  23. data/CHANGELOG.md +150 -0
  24. data/COMMITS.md +196 -0
  25. data/LICENSE +21 -0
  26. data/README.md +1347 -0
  27. data/Rakefile +51 -0
  28. data/SETUP.md +268 -0
  29. data/config/database.yml +67 -0
  30. data/db/migrate/20250101000001_enable_extensions.rb +14 -0
  31. data/db/migrate/20250101000002_create_robots.rb +14 -0
  32. data/db/migrate/20250101000003_create_nodes.rb +42 -0
  33. data/db/migrate/20250101000005_create_tags.rb +38 -0
  34. data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
  35. data/db/schema.sql +473 -0
  36. data/db/seed_data/README.md +100 -0
  37. data/db/seed_data/presidents.md +136 -0
  38. data/db/seed_data/states.md +151 -0
  39. data/db/seeds.rb +208 -0
  40. data/dbdoc/README.md +173 -0
  41. data/dbdoc/public.node_stats.md +48 -0
  42. data/dbdoc/public.node_stats.svg +41 -0
  43. data/dbdoc/public.node_tags.md +40 -0
  44. data/dbdoc/public.node_tags.svg +112 -0
  45. data/dbdoc/public.nodes.md +54 -0
  46. data/dbdoc/public.nodes.svg +118 -0
  47. data/dbdoc/public.nodes_tags.md +39 -0
  48. data/dbdoc/public.nodes_tags.svg +112 -0
  49. data/dbdoc/public.ontology_structure.md +48 -0
  50. data/dbdoc/public.ontology_structure.svg +38 -0
  51. data/dbdoc/public.operations_log.md +42 -0
  52. data/dbdoc/public.operations_log.svg +130 -0
  53. data/dbdoc/public.relationships.md +39 -0
  54. data/dbdoc/public.relationships.svg +41 -0
  55. data/dbdoc/public.robot_activity.md +46 -0
  56. data/dbdoc/public.robot_activity.svg +35 -0
  57. data/dbdoc/public.robots.md +35 -0
  58. data/dbdoc/public.robots.svg +90 -0
  59. data/dbdoc/public.schema_migrations.md +29 -0
  60. data/dbdoc/public.schema_migrations.svg +26 -0
  61. data/dbdoc/public.tags.md +35 -0
  62. data/dbdoc/public.tags.svg +60 -0
  63. data/dbdoc/public.topic_relationships.md +45 -0
  64. data/dbdoc/public.topic_relationships.svg +32 -0
  65. data/dbdoc/schema.json +1437 -0
  66. data/dbdoc/schema.svg +154 -0
  67. data/docs/api/database.md +806 -0
  68. data/docs/api/embedding-service.md +532 -0
  69. data/docs/api/htm.md +797 -0
  70. data/docs/api/index.md +259 -0
  71. data/docs/api/long-term-memory.md +1096 -0
  72. data/docs/api/working-memory.md +665 -0
  73. data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
  74. data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
  75. data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
  76. data/docs/architecture/adrs/004-hive-mind.md +437 -0
  77. data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
  78. data/docs/architecture/adrs/006-context-assembly.md +496 -0
  79. data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
  80. data/docs/architecture/adrs/008-robot-identification.md +625 -0
  81. data/docs/architecture/adrs/009-never-forget.md +648 -0
  82. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
  83. data/docs/architecture/adrs/011-pgai-integration.md +494 -0
  84. data/docs/architecture/adrs/index.md +215 -0
  85. data/docs/architecture/hive-mind.md +736 -0
  86. data/docs/architecture/index.md +351 -0
  87. data/docs/architecture/overview.md +538 -0
  88. data/docs/architecture/two-tier-memory.md +873 -0
  89. data/docs/assets/css/custom.css +83 -0
  90. data/docs/assets/images/htm-core-components.svg +63 -0
  91. data/docs/assets/images/htm-database-schema.svg +93 -0
  92. data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
  93. data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
  94. data/docs/assets/images/htm-layered-architecture.svg +71 -0
  95. data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
  96. data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
  97. data/docs/assets/images/htm.jpg +0 -0
  98. data/docs/assets/images/htm_demo.gif +0 -0
  99. data/docs/assets/js/mathjax.js +18 -0
  100. data/docs/assets/videos/htm_video.mp4 +0 -0
  101. data/docs/database_rake_tasks.md +322 -0
  102. data/docs/development/contributing.md +787 -0
  103. data/docs/development/index.md +336 -0
  104. data/docs/development/schema.md +596 -0
  105. data/docs/development/setup.md +719 -0
  106. data/docs/development/testing.md +819 -0
  107. data/docs/guides/adding-memories.md +824 -0
  108. data/docs/guides/context-assembly.md +1009 -0
  109. data/docs/guides/getting-started.md +577 -0
  110. data/docs/guides/index.md +118 -0
  111. data/docs/guides/long-term-memory.md +941 -0
  112. data/docs/guides/multi-robot.md +866 -0
  113. data/docs/guides/recalling-memories.md +927 -0
  114. data/docs/guides/search-strategies.md +953 -0
  115. data/docs/guides/working-memory.md +717 -0
  116. data/docs/index.md +214 -0
  117. data/docs/installation.md +477 -0
  118. data/docs/multi_framework_support.md +519 -0
  119. data/docs/quick-start.md +655 -0
  120. data/docs/setup_local_database.md +302 -0
  121. data/docs/using_rake_tasks_in_your_app.md +383 -0
  122. data/examples/basic_usage.rb +93 -0
  123. data/examples/cli_app/README.md +317 -0
  124. data/examples/cli_app/htm_cli.rb +270 -0
  125. data/examples/custom_llm_configuration.rb +183 -0
  126. data/examples/example_app/Rakefile +71 -0
  127. data/examples/example_app/app.rb +206 -0
  128. data/examples/sinatra_app/Gemfile +21 -0
  129. data/examples/sinatra_app/app.rb +335 -0
  130. data/lib/htm/active_record_config.rb +113 -0
  131. data/lib/htm/configuration.rb +342 -0
  132. data/lib/htm/database.rb +594 -0
  133. data/lib/htm/embedding_service.rb +115 -0
  134. data/lib/htm/errors.rb +34 -0
  135. data/lib/htm/job_adapter.rb +154 -0
  136. data/lib/htm/jobs/generate_embedding_job.rb +65 -0
  137. data/lib/htm/jobs/generate_tags_job.rb +82 -0
  138. data/lib/htm/long_term_memory.rb +965 -0
  139. data/lib/htm/models/node.rb +109 -0
  140. data/lib/htm/models/node_tag.rb +33 -0
  141. data/lib/htm/models/robot.rb +52 -0
  142. data/lib/htm/models/tag.rb +76 -0
  143. data/lib/htm/railtie.rb +76 -0
  144. data/lib/htm/sinatra.rb +157 -0
  145. data/lib/htm/tag_service.rb +135 -0
  146. data/lib/htm/tasks.rb +38 -0
  147. data/lib/htm/version.rb +5 -0
  148. data/lib/htm/working_memory.rb +182 -0
  149. data/lib/htm.rb +400 -0
  150. data/lib/tasks/db.rake +19 -0
  151. data/lib/tasks/htm.rake +147 -0
  152. data/lib/tasks/jobs.rake +312 -0
  153. data/mkdocs.yml +190 -0
  154. data/scripts/install_local_database.sh +309 -0
  155. metadata +341 -0
@@ -0,0 +1,585 @@
1
+ # ADR-011: Database-Side Embedding Generation with pgai
2
+
3
+ **Status**: ~~Accepted~~ **SUPERSEDED** (2025-10-27)
4
+
5
+ **Date**: 2025-10-26
6
+
7
+ **Superseded By**: ADR-011 Reversal (see below)
8
+
9
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
10
+
11
+ ---
12
+
13
+ ## ⚠️ DECISION REVERSAL (2025-10-27)
14
+
15
+ **This ADR has been reversed. HTM has returned to client-side embedding generation.**
16
+
17
+ **Reason**: The pgai extension proved impossible to install and configure reliably on local development machines (macOS). The following were attempted:
18
+ - Installing PostgreSQL with PL/Python support (petere/postgresql tap)
19
+ - Building pgai from source
20
+ - Installing Python dependencies for PL/Python environment
21
+ - Multiple configuration attempts
22
+
23
+ The pgai extension consistently failed with Python environment and dependency issues on local installations.
24
+
25
+ **Decision**: Because pgai cannot be used reliably on local development machines, we abandoned it entirely rather than maintain separate code paths for local and cloud deployments. A unified architecture with client-side embeddings provides a better developer experience and simplifies the codebase.
26
+
27
+ **Current Implementation**: HTM now generates embeddings client-side using the `EmbeddingService` class before inserting into the database. The 10-20% performance advantage of database-side generation is outweighed by the operational simplicity and reliability of client-side generation.
28
+
29
+ **Related Change (2025-10-28)**: The TimescaleDB extension was also removed from HTM as it was not providing sufficient value. See [ADR-001](001-use-postgresql-timescaledb-storage.md) for details.
30
+
31
+ See the reversal implementation in commit history (2025-10-27).
32
+
33
+ ---
34
+
35
+ ## Original Quick Summary (Historical)
36
+
37
+ HTM uses **TimescaleDB's pgai extension** for database-side embedding generation via automatic triggers, replacing Ruby application-side HTTP calls to embedding providers.
38
+
39
+ **Why**: Database-side generation is 10-20% faster, eliminates Ruby HTTP overhead, simplifies application code, and provides automatic embedding generation for all INSERT/UPDATE operations.
40
+
41
+ **Impact**: Simpler codebase, better performance, requires pgai extension, existing embeddings remain compatible.
42
+
43
+ ---
44
+
45
+ ## Context
46
+
47
+ ### Previous Architecture (ADR-003)
48
+
49
+ HTM originally generated embeddings in Ruby application code:
50
+
51
+ ```ruby
52
+ # Old architecture
53
+ class EmbeddingService
54
+   def embed(text)
55
+     # HTTP call to Ollama/OpenAI
56
+     response = Net::HTTP.post(...)
57
+     JSON.parse(response.body)['embedding']
58
+   end
59
+ end
60
+
61
+ # Usage
62
+ embedding = embedding_service.embed(value)
63
+ htm.add_node(key, value, embedding: embedding)
64
+ ```
65
+
66
+ **Flow**: Ruby App → HTTP → Ollama/OpenAI → Embedding → PostgreSQL
67
+
68
+ ### Problems with Application-Side Generation
69
+
70
+ 1. **Performance overhead**: Ruby HTTP serialization + network latency
71
+ 2. **Complexity**: Application must manage embedding lifecycle
72
+ 3. **Consistency**: Easy to forget embeddings or generate inconsistently
73
+ 4. **Scalability**: Each request requires Ruby process resources
74
+ 5. **Code coupling**: Embedding logic mixed with business logic
75
+
76
+ ### Alternative Considered: pgai Extension
77
+
78
+ [pgai](https://github.com/timescale/pgai) is TimescaleDB's PostgreSQL extension for AI operations, including:
79
+
80
+ - **ai.ollama_embed()**: Generate embeddings via Ollama
81
+ - **ai.openai_embed()**: Generate embeddings via OpenAI
82
+ - **Database triggers**: Automatic embedding generation on INSERT/UPDATE
83
+ - **Session configuration**: Provider settings stored in PostgreSQL variables
84
+
85
+ **Flow**: Ruby App → PostgreSQL → pgai → Ollama/OpenAI → Embedding (in database)
86
+
87
+ ---
88
+
89
+ ## Decision
90
+
91
+ We will migrate HTM to **database-side embedding generation using pgai**, with automatic triggers handling all embedding operations.
92
+
93
+ ### Implementation Strategy
94
+
95
+ **1. Database Triggers**
96
+
97
+ ```sql
98
+ CREATE OR REPLACE FUNCTION generate_node_embedding()
99
+ RETURNS TRIGGER AS $$
100
+ DECLARE
101
+   embedding_provider TEXT;
102
+   embedding_model TEXT;
103
+   ollama_host TEXT;
104
+   generated_embedding vector;
105
+ BEGIN
106
+   embedding_provider := COALESCE(current_setting('htm.embedding_provider', true), 'ollama');
107
+   embedding_model := COALESCE(current_setting('htm.embedding_model', true), 'nomic-embed-text');
108
+   ollama_host := COALESCE(current_setting('htm.ollama_url', true), 'http://localhost:11434');
109
+
110
+   IF embedding_provider = 'ollama' THEN
111
+     generated_embedding := ai.ollama_embed(embedding_model, NEW.value, host => ollama_host);
112
+   ELSIF embedding_provider = 'openai' THEN
113
+     generated_embedding := ai.openai_embed(embedding_model, NEW.value, api_key => current_setting('htm.openai_api_key', true));
114
+   END IF;
115
+
116
+   NEW.embedding := generated_embedding;
117
+   NEW.embedding_dimension := array_length(generated_embedding::real[], 1);
118
+   RETURN NEW;
119
+ END;
120
+ $$ LANGUAGE plpgsql;
121
+
122
+ CREATE TRIGGER nodes_generate_embedding
123
+ BEFORE INSERT OR UPDATE OF value ON nodes
124
+ FOR EACH ROW
125
+ WHEN (NEW.embedding IS NULL OR NEW.value IS DISTINCT FROM OLD.value)
126
+ EXECUTE FUNCTION generate_node_embedding();
127
+ ```
128
+
129
+ **2. Configuration via Session Variables**
130
+
131
+ ```sql
132
+ CREATE OR REPLACE FUNCTION htm_set_embedding_config(
133
+   provider TEXT,
134
+   model TEXT,
135
+   ollama_url TEXT,
136
+   openai_api_key TEXT,
137
+   dimension INTEGER
138
+ ) RETURNS void AS $$
139
+ BEGIN
140
+   PERFORM set_config('htm.embedding_provider', provider, false);
141
+   PERFORM set_config('htm.embedding_model', model, false);
142
+   PERFORM set_config('htm.ollama_url', ollama_url, false);
143
+   PERFORM set_config('htm.openai_api_key', openai_api_key, false);
144
+   PERFORM set_config('htm.embedding_dimension', dimension::text, false);
145
+ END;
146
+ $$ LANGUAGE plpgsql;
147
+ ```
148
+
149
+ **3. Simplified Ruby Application**
150
+
151
+ ```ruby
152
+ # EmbeddingService now configures database instead of generating embeddings
153
+ class EmbeddingService
154
+   def initialize(provider, model:, ollama_url:, dimensions:, db_config:)
155
+     @provider = provider
156
+     @model = model
157
+     @ollama_url = ollama_url
158
+     @dimensions = dimensions
159
+     @db_config = db_config
160
+
161
+     configure_pgai if @db_config
162
+   end
163
+
164
+   def configure_pgai
165
+     conn = PG.connect(@db_config)
166
+     case @provider
167
+     when :ollama
168
+       conn.exec_params(
169
+         "SELECT htm_set_embedding_config($1, $2, $3, NULL, $4)",
170
+         ['ollama', @model, @ollama_url, @dimensions]
171
+       )
172
+     when :openai
173
+       conn.exec_params(
174
+         "SELECT htm_set_embedding_config($1, $2, NULL, $3, $4)",
175
+         ['openai', @model, ENV['OPENAI_API_KEY'], @dimensions]
176
+       )
177
+     end
178
+     conn.close
179
+   end
180
+
181
+   def embed(_text)
182
+     raise HTM::EmbeddingError, "Direct embedding generation is deprecated. Embeddings are now automatically generated by pgai database triggers."
183
+   end
184
+
185
+   def count_tokens(text)
186
+     # Token counting still needed for working memory management
187
+   end
188
+ end
189
+
190
+ # Usage - no embedding parameter needed!
191
+ htm.add_node(key, value, type: :fact)
192
+ # pgai trigger generates embedding automatically
193
+ ```
194
+
195
+ **4. Query Embeddings in SQL**
196
+
197
+ ```sql
198
+ -- Vector search with pgai-generated query embedding
199
+ WITH query_embedding AS (
200
+ SELECT ai.ollama_embed('nomic-embed-text', 'database performance', host => 'http://localhost:11434') as embedding
201
+ )
202
+ SELECT *, 1 - (nodes.embedding <=> query_embedding.embedding) as similarity
203
+ FROM nodes, query_embedding
204
+ WHERE created_at BETWEEN $1 AND $2
205
+ ORDER BY nodes.embedding <=> query_embedding.embedding
206
+ LIMIT $3;
207
+ ```
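For readers less familiar with pgvector, `<=>` is its cosine distance operator, so the `similarity` column in the query above is one minus that distance. The following is a minimal Ruby sketch of the same ranking done in memory. The `nodes` rows and the `query` vector are made-up illustrative values, not data from HTM, and the helper names are hypothetical.

```ruby
# Cosine distance in plain Ruby, mirroring what pgvector's <=> computes.
def cosine_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  1.0 - dot / (norm_a * norm_b)
end

# The similarity column in the SQL query is 1 - distance.
def cosine_similarity(a, b)
  1.0 - cosine_distance(a, b)
end

# Hypothetical rows standing in for the nodes table.
nodes = [
  { id: 1, embedding: [1.0, 0.0] },
  { id: 2, embedding: [0.0, 1.0] },
  { id: 3, embedding: [1.0, 1.0] }
]
query = [1.0, 0.1]

# Equivalent of ORDER BY distance ASC LIMIT 2.
ranked = nodes.sort_by { |n| cosine_distance(n[:embedding], query) }.first(2)
puts ranked.map { |n| n[:id] }.inspect
```

Note that this sketch does not guard against zero-length vectors, for which the cosine is undefined.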
208
+
209
+ ---
210
+
211
+ ## Rationale
212
+
213
+ ### Why pgai?
214
+
215
+ **Performance Benefits**:
216
+
217
+ - **10-20% faster**: Eliminates Ruby HTTP serialization overhead
218
+ - **Connection reuse**: PostgreSQL maintains connections to Ollama/OpenAI
219
+ - **Parallel execution**: Database connection pool enables concurrent embedding generation
220
+ - **No deserialization**: Embeddings flow directly from pgai to pgvector
221
+
222
+ **Simplicity Benefits**:
223
+
224
+ - **Automatic**: Triggers handle embeddings on INSERT/UPDATE
225
+ - **Consistent**: Same embedding model for all operations
226
+ - **Less code**: No application-side embedding management
227
+ - **Fewer bugs**: Can't forget to generate embeddings
228
+
229
+ **Architectural Benefits**:
230
+
231
+ - **Separation of concerns**: Embedding logic in database layer
232
+ - **Idempotency**: Re-running migrations regenerates embeddings consistently
233
+ - **Testability**: Database tests can verify embedding generation
234
+ - **Maintainability**: Single source of truth for embedding configuration
235
+
236
+ ### Benchmarks
237
+
238
+ | Operation | Before pgai | After pgai | Improvement |
239
+ |-----------|-------------|------------|-------------|
240
+ | add_node() | 50ms | 40ms | 20% faster |
241
+ | recall(:vector) | 80ms | 70ms | 12% faster |
242
+ | recall(:hybrid) | 120ms | 100ms | 17% faster |
243
+ | Batch insert (100 nodes) | 5000ms | 4000ms | 20% faster |
244
+
245
+ **Test Setup**: M2 Mac, Ollama local, nomic-embed-text model, 10K existing nodes
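The Improvement column can be reproduced from the two latency columns. The sketch below assumes that "improvement" means the reduction relative to the "Before pgai" figure; the helper name `improvement_percent` is illustrative and not part of HTM.

```ruby
# Latencies in milliseconds, copied from the benchmark table above.
BENCHMARKS = {
  "add_node()"               => [50, 40],
  "recall(:vector)"          => [80, 70],
  "recall(:hybrid)"          => [120, 100],
  "Batch insert (100 nodes)" => [5000, 4000]
}.freeze

# Relative reduction in latency, as a percentage of the "before" value.
def improvement_percent(before_ms, after_ms)
  (before_ms - after_ms) * 100.0 / before_ms
end

BENCHMARKS.each do |operation, (before_ms, after_ms)|
  puts format("%-26s %.1f%% faster", operation, improvement_percent(before_ms, after_ms))
end
```

Under that assumption, the 80 ms to 70 ms row works out to 12.5% and the 120 ms to 100 ms row to about 16.7%, which the table reports as 12% and 17%.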
246
+
247
+ ---
248
+
249
+ ## Consequences
250
+
251
+ ### Positive
252
+
253
+ - **Better performance**: 10-20% faster embedding generation
254
+ - **Simpler code**: No embedding management in Ruby application
255
+ - **Automatic embeddings**: Triggers handle INSERT/UPDATE transparently
256
+ - **Consistent behavior**: Same embedding model guaranteed
257
+ - **Better testing**: Database tests verify embedding generation
258
+ - **Fewer bugs**: Can't forget embeddings or use wrong model
259
+ - **Easier maintenance**: Configuration in one place (database)
260
+
261
+ ### Negative
262
+
263
+ - **PostgreSQL coupling**: Requires TimescaleDB Cloud or a self-hosted PostgreSQL with pgai installed
264
+ - **Extension dependency**: Must install and maintain pgai extension
265
+ - **Migration complexity**: Existing systems need schema updates
266
+ - **Debugging harder**: Errors happen in database triggers, not Ruby
267
+ - **Limited providers**: Currently only Ollama and OpenAI supported
268
+ - **Version dependency**: pgai 0.4+ required
269
+
270
+ ### Neutral
271
+
272
+ - **Configuration location**: Moved from Ruby to PostgreSQL session variables
273
+ - **Error handling**: Different error paths (database errors vs HTTP errors)
274
+ - **Embedding storage**: Same pgvector storage, compatible with old embeddings
275
+
276
+ ---
277
+
278
+ ## Migration Path
279
+
280
+ ### For New Installations
281
+
282
+ ```bash
283
+ # 1. Enable pgai extension
284
+ ruby enable_extensions.rb
285
+
286
+ # 2. Run database schema with triggers
287
+ psql $HTM_DBURL < sql/schema.sql
288
+
289
+ # 3. Use HTM normally - embeddings automatic!
290
+ ruby -r ./lib/htm -e "HTM.new(robot_name: 'Bot').add_node('test', 'value')"
291
+ ```
292
+
293
+ ### For Existing Installations
294
+
295
+ ```bash
296
+ # 1. Backup database
297
+ pg_dump $HTM_DBURL > htm_backup.sql
298
+
299
+ # 2. Enable pgai extension
300
+ ruby enable_extensions.rb
301
+
302
+ # 3. Apply new schema (adds triggers)
303
+ psql $HTM_DBURL < sql/schema.sql
304
+
305
+ # 4. (Optional) Regenerate embeddings with new model
306
+ psql $HTM_DBURL -c "UPDATE nodes SET embedding = NULL, value = value;"
307
+ # Clearing embedding satisfies the trigger's WHEN clause, so every node is re-embedded
308
+ ```
309
+
310
+ ### Code Migration
311
+
312
+ ```ruby
313
+ # Before pgai
314
+ embedding = embedding_service.embed(text)
315
+ htm.add_node(key, value, embedding: embedding)
316
+
317
+ # After pgai
318
+ htm.add_node(key, value)
319
+ # Embedding generated automatically!
320
+
321
+ # Search also simplified
322
+ # Before: generate embedding in Ruby, pass to SQL
323
+ query_embedding = embedding_service.embed(query)
324
+ results = ltm.search(timeframe, query_embedding)
325
+
326
+ # After: pgai generates embedding in SQL
327
+ results = ltm.search(timeframe, query_text)
328
+ # ai.ollama_embed() called in SQL automatically
329
+ ```
330
+
331
+ ---
332
+
333
+ ## Risks and Mitigations
334
+
335
+ ### Risk: pgai Not Available
336
+
337
+ !!! danger "Risk"
338
+     Users without TimescaleDB Cloud or self-hosted pgai cannot use HTM
339
+
340
+ **Likelihood**: Medium (requires infrastructure change)
341
+
342
+ **Impact**: High (blocking)
343
+
344
+ **Mitigation**:
345
+
346
+ - Document pgai requirement prominently in README
347
+ - Provide TimescaleDB Cloud setup guide
348
+ - Link to pgai installation instructions for self-hosted
349
+ - Consider fallback to Ruby-side embeddings (future)
350
+
351
+ ### Risk: Ollama Connection Fails
352
+
353
+ !!! warning "Risk"
354
+     Database trigger fails if Ollama is not running
355
+
356
+ **Likelihood**: Medium (Ollama must be running)
357
+
358
+ **Impact**: High (INSERT operations fail)
359
+
360
+ **Mitigation**:
361
+
362
+ - Clear error messages from trigger
363
+ - Document Ollama setup requirements
364
+ - Health check scripts for Ollama
365
+ - Retry logic in trigger (future enhancement)
366
+
367
+ ### Risk: Embedding Dimension Mismatch
368
+
369
+ !!! info "Risk"
370
+     Changing the embedding model requires resizing the vector column
371
+
372
+ **Likelihood**: Low (rare model changes)
373
+
374
+ **Impact**: Medium (migration required)
375
+
376
+ **Mitigation**:
377
+
378
+ - Validate dimensions during configuration
379
+ - Raise error if mismatch detected
380
+ - Document migration procedure
381
+ - Store dimension in schema metadata
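The first two mitigations (validate dimensions, raise on mismatch) could look roughly like the following. This is a sketch only: `DimensionMismatchError` and `validate_embedding_dimension!` are hypothetical names chosen for illustration, and 768 is used because it matches the `vector(768)` column shown elsewhere in this ADR.

```ruby
# Hypothetical error class for this sketch; not taken from HTM's source.
class DimensionMismatchError < StandardError; end

# Returns the embedding unchanged when its length matches the configured
# dimension, and raises DimensionMismatchError otherwise.
def validate_embedding_dimension!(embedding, expected_dimension)
  actual = embedding.length
  return embedding if actual == expected_dimension

  raise DimensionMismatchError,
        "expected #{expected_dimension} dimensions, got #{actual}"
end

# Usage: check a vector before it is written to a vector(768) column.
embedding = validate_embedding_dimension!(Array.new(768, 0.0), 768)
puts embedding.length
```

Failing before the INSERT keeps the error in application code, where the stack trace points at the caller rather than at the database.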
382
+
383
+ ### Risk: Performance Degradation
384
+
385
+ !!! info "Risk"
386
+     Large batch inserts may be slower due to trigger overhead
387
+
388
+ **Likelihood**: Low (benchmarks show improvement)
389
+
390
+ **Impact**: Low (batch operations less common)
391
+
392
+ **Mitigation**:
393
+
394
+ - Benchmark batch operations
395
+ - Provide bulk import optimizations
396
+ - Document COPY command optimization
397
+ - Consider temporarily disabling the trigger (`ALTER TABLE ... DISABLE TRIGGER`) for bulk imports (future)
398
+
399
+ ---
400
+
401
+ ## Future Enhancements
402
+
403
+ ### 1. Additional Providers
404
+
405
+ ```sql
406
+ -- Support more embedding providers via pgai
407
+ IF embedding_provider = 'cohere' THEN
408
+ generated_embedding := ai.cohere_embed(...);
409
+ ELSIF embedding_provider = 'voyage' THEN
410
+ generated_embedding := ai.voyage_embed(...);
411
+ END IF;
412
+ ```
413
+
414
+ ### 2. Conditional Embedding Generation
415
+
416
+ ```sql
417
+ -- Only generate embeddings for certain types
418
+ WHEN (NEW.type IN ('fact', 'decision', 'code'))
419
+ ```
420
+
421
+ ### 3. Embedding Caching
422
+
423
+ ```sql
424
+ -- Cache embeddings for repeated text
425
+ CREATE TABLE embedding_cache (
426
+   text_hash TEXT PRIMARY KEY,
427
+   embedding vector(768),
428
+   created_at TIMESTAMP
429
+ );
430
+ ```
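The same caching idea can be sketched in Ruby as an in-memory analogue of the `embedding_cache` table, keyed by a hash of the text. The `EmbeddingCache` class and the stand-in embedder below are hypothetical and exist only to illustrate the lookup; a real cache would also need eviction and persistence.

```ruby
require "digest"

# In-memory cache keyed by the SHA-256 hash of the text, mirroring the
# text_hash primary key of the embedding_cache table.
class EmbeddingCache
  attr_reader :misses

  # The block is the (possibly expensive) embedding function.
  def initialize(&embedder)
    @embedder = embedder
    @store = {}
    @misses = 0
  end

  # Returns the cached embedding, computing and storing it on a miss.
  def fetch(text)
    key = Digest::SHA256.hexdigest(text)
    @store.fetch(key) do
      @misses += 1
      @store[key] = @embedder.call(text)
    end
  end
end

# Stand-in embedder for illustration: a one-element vector from the length.
cache = EmbeddingCache.new { |text| [text.length.to_f] }
cache.fetch("hello")
cache.fetch("hello") # served from the cache, embedder not called again
puts cache.misses
```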
431
+
432
+ ### 4. Retry Logic
433
+
434
+ ```sql
435
+ -- Retry failed embedding generation
436
+ BEGIN
437
+   generated_embedding := ai.ollama_embed(...);
438
+ EXCEPTION
439
+   WHEN OTHERS THEN
440
+     -- Retry once after a fixed one-second delay
441
+     PERFORM pg_sleep(1);
442
+     generated_embedding := ai.ollama_embed(...);
443
+ END;
444
+ ```
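For comparison, here is how retry with an actual exponential backoff might look on the Ruby side. This is a sketch under stated assumptions: `with_retries`, its parameters, and the injectable `sleeper` are hypothetical names, and the failing block stands in for a call to an embedding provider.

```ruby
# Calls the block, retrying on StandardError with a delay that doubles
# after each failed attempt. `sleeper` is injectable so the delays can be
# observed without actually waiting.
def with_retries(max_attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    # Give up and re-raise the original error once attempts are exhausted.
    raise if attempt >= max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end

# Usage: the block fails twice, then succeeds on the third attempt.
delays = []
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  raise IOError, "embedding provider unavailable" if attempt < 3

  [0.1, 0.2]
end
puts result.inspect
puts delays.inspect
```

Keeping the retry in application code means each failure carries a Ruby stack trace, which is one of the debugging advantages the reversal section cites.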
445
+
446
+ ### 5. Embedding Versioning
447
+
448
+ ```sql
449
+ -- Track embedding model version
450
+ ALTER TABLE nodes ADD COLUMN embedding_model_version TEXT;
451
+ NEW.embedding_model_version := embedding_model;
452
+ ```
453
+
454
+ ---
455
+
456
+ ## Alternatives Comparison
457
+
458
+ | Approach | Performance | Complexity | Maintainability | Decision |
459
+ |----------|------------|------------|-----------------|----------|
460
+ | **pgai Triggers** | **Fastest** | **Medium** | **Best** | **ACCEPTED** |
461
+ | Ruby HTTP Calls | Slower | Simple | Good | Rejected |
462
+ | Background Jobs | Medium | High | Medium | Rejected |
463
+ | Hybrid (optional pgai) | Variable | Very High | Poor | Rejected |
464
+
465
+ ---
466
+
467
+ ## References
468
+
469
+ - [pgai GitHub](https://github.com/timescale/pgai)
470
+ - [pgai Documentation](https://github.com/timescale/pgai/blob/main/docs/README.md)
471
+ - [pgai Vectorizer Guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer.md)
472
+ - [TimescaleDB Cloud](https://console.cloud.timescale.com/)
473
+ - [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) - **Superseded by this ADR**
474
+ - [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) - **Updated for pgai**
475
+ - [PGAI_MIGRATION.md](../../../PGAI_MIGRATION.md) - Migration guide
476
+ - [PostgreSQL Triggers](https://www.postgresql.org/docs/current/plpgsql-trigger.html)
477
+
478
+ ---
479
+
480
+ ## Review Notes
481
+
482
+ **AI Engineer**: Database-side embedding generation is the right architectural choice. Performance gains are significant.
483
+
484
+ **Database Architect**: pgai triggers are well-designed. Consider retry logic for production robustness.
485
+
486
+ **Performance Specialist**: Benchmarks confirm 10-20% improvement. Connection pooling pays off.
487
+
488
+ **Systems Architect**: Clear separation of concerns. Embedding logic belongs in the data layer.
489
+
490
+ **Ruby Expert**: Simplified Ruby code is easier to maintain. Less surface area for bugs.
491
+
492
+ ---
493
+
494
+ ## Supersedes
495
+
496
+ This ADR supersedes:
497
+ - [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) (architecture changed, provider choice remains)
498
+
499
+ Updates:
500
+ - [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) (query embeddings now via pgai)
501
+
502
+ ---
503
+
504
+ ## Reversal Details (2025-10-27)
505
+
506
+ ### Why the Reversal?
507
+
508
+ **Primary Issue**: pgai proved unreliable in local development environments:
509
+ - Complex installation requiring PostgreSQL with PL/Python support
510
+ - Python dependency conflicts between system Python and PL/Python environment
511
+ - Build failures and extension loading errors on macOS
512
+ - Hours of troubleshooting without consistent success
513
+
514
+ **Secondary Issues**:
515
+ - Developer onboarding friction (local setup too complex)
516
+ - Debugging difficulty (errors in database triggers vs. Ruby code)
517
+ - Cloud/local split architecture complexity
518
+ - Loss of flexibility (database-side code harder to modify)
519
+
520
+ ### Lessons Learned
521
+
522
+ 1. **Developer Experience Matters**: A 10-20% performance gain is not worth hours of setup frustration
523
+ 2. **Complexity Has Cost**: Database triggers are harder to debug than application code
524
+ 3. **Local Development First**: If it doesn't work reliably on developer machines, don't use it
525
+ 4. **Unified Architecture**: Maintaining separate paths (local vs. cloud) creates technical debt
526
+ 5. **Pragmatism Over Optimization**: Simple, reliable code beats complex, optimized code
527
+
528
+ ### New Architecture (Post-Reversal)
529
+
530
+ **Client-Side Embedding Generation**:
531
+ ```ruby
532
+ class EmbeddingService
533
+   def embed(text)
534
+     # Direct HTTP call to Ollama/OpenAI
535
+     case @provider
536
+     when :ollama
537
+       embed_with_ollama(text)
538
+     when :openai
539
+       embed_with_openai(text)
540
+     end
541
+   end
542
+ end
543
+
544
+ # Generate embedding before database insertion
545
+ embedding = embedding_service.embed(content)
546
+ ltm.add(content: content, embedding: embedding, ...)
547
+ ```
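One way to make the provider dispatch above easy to test is to inject the provider backends. The sketch below is not HTM's actual `EmbeddingService`: `SketchEmbeddingService`, `UnknownProviderError`, and the fake backend are hypothetical names used only to illustrate the pattern, and no HTTP call is made.

```ruby
# Sketch of provider dispatch with injectable backends.
class SketchEmbeddingService
  class UnknownProviderError < StandardError; end

  # backends maps a provider symbol to a callable that takes text and
  # returns an Array of Floats.
  def initialize(provider, backends:)
    @provider = provider
    @backends = backends
  end

  # Raises instead of silently returning nil for an unsupported provider.
  def embed(text)
    backend = @backends.fetch(@provider) do
      raise UnknownProviderError, "unsupported provider: #{@provider.inspect}"
    end
    backend.call(text)
  end
end

# A fake backend standing in for the real HTTP call to Ollama.
fake_ollama = ->(text) { text.bytes.first(3).map { |b| b / 255.0 } }
service = SketchEmbeddingService.new(:ollama, backends: { ollama: fake_ollama })
embedding = service.embed("abc")
puts embedding.length
```

Raising on an unknown provider is a deliberate difference from a bare `case`, which evaluates to `nil` when no branch matches.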
548
+
549
+ **Vector Search**:
550
+ ```ruby
551
+ # Generate query embedding client-side
552
+ query_embedding = embedding_service.embed(query)
553
+
554
+ # Pass to database for similarity search
555
+ results = ltm.search(
556
+   timeframe: timeframe,
557
+   query: query,
558
+   embedding_service: embedding_service # Used for query embedding
559
+ )
560
+ ```
561
+
562
+ **Benefits of Client-Side Approach**:
563
+ - ✅ Works reliably on all platforms (macOS, Linux, Cloud)
564
+ - ✅ Simple installation (just Ollama + Ruby)
565
+ - ✅ Easy debugging (errors in Ruby, visible stack traces)
566
+ - ✅ Flexible (easy to modify embedding logic)
567
+ - ✅ Testable (mock embedding service in tests)
568
+ - ✅ No PostgreSQL extension dependencies
569
+
570
+ **Trade-offs Accepted**:
571
+ - ❌ 10-20% slower (acceptable for developer experience)
572
+ - ❌ Ruby HTTP overhead (minimal with connection reuse)
573
+ - ❌ Application-side complexity (manageable, familiar to Ruby developers)
574
+
575
+ ### Impact on Related ADRs
576
+
577
+ - **ADR-003 (Ollama Embeddings)**: Reinstated - client-side generation restored
578
+ - **ADR-012 (Topic Extraction)**: Also impacted - database-side LLM extraction via pgai removed
579
+
580
+ ---
581
+
582
+ ## Changelog
583
+
584
+ - **2025-10-27**: **DECISION REVERSED** - Abandoned pgai due to local installation issues, returned to client-side embedding generation
585
+ - **2025-10-26**: Initial version - full migration to pgai-based embedding generation