htm 0.0.1
- checksums.yaml +7 -0
- data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
- data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
- data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
- data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
- data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
- data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
- data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
- data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
- data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
- data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
- data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
- data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
- data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
- data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
- data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
- data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
- data/.architecture/members.yml +144 -0
- data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
- data/.architecture/reviews/initial-system-analysis.md +330 -0
- data/.envrc +32 -0
- data/.irbrc +145 -0
- data/CHANGELOG.md +150 -0
- data/COMMITS.md +196 -0
- data/LICENSE +21 -0
- data/README.md +1347 -0
- data/Rakefile +51 -0
- data/SETUP.md +268 -0
- data/config/database.yml +67 -0
- data/db/migrate/20250101000001_enable_extensions.rb +14 -0
- data/db/migrate/20250101000002_create_robots.rb +14 -0
- data/db/migrate/20250101000003_create_nodes.rb +42 -0
- data/db/migrate/20250101000005_create_tags.rb +38 -0
- data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
- data/db/schema.sql +473 -0
- data/db/seed_data/README.md +100 -0
- data/db/seed_data/presidents.md +136 -0
- data/db/seed_data/states.md +151 -0
- data/db/seeds.rb +208 -0
- data/dbdoc/README.md +173 -0
- data/dbdoc/public.node_stats.md +48 -0
- data/dbdoc/public.node_stats.svg +41 -0
- data/dbdoc/public.node_tags.md +40 -0
- data/dbdoc/public.node_tags.svg +112 -0
- data/dbdoc/public.nodes.md +54 -0
- data/dbdoc/public.nodes.svg +118 -0
- data/dbdoc/public.nodes_tags.md +39 -0
- data/dbdoc/public.nodes_tags.svg +112 -0
- data/dbdoc/public.ontology_structure.md +48 -0
- data/dbdoc/public.ontology_structure.svg +38 -0
- data/dbdoc/public.operations_log.md +42 -0
- data/dbdoc/public.operations_log.svg +130 -0
- data/dbdoc/public.relationships.md +39 -0
- data/dbdoc/public.relationships.svg +41 -0
- data/dbdoc/public.robot_activity.md +46 -0
- data/dbdoc/public.robot_activity.svg +35 -0
- data/dbdoc/public.robots.md +35 -0
- data/dbdoc/public.robots.svg +90 -0
- data/dbdoc/public.schema_migrations.md +29 -0
- data/dbdoc/public.schema_migrations.svg +26 -0
- data/dbdoc/public.tags.md +35 -0
- data/dbdoc/public.tags.svg +60 -0
- data/dbdoc/public.topic_relationships.md +45 -0
- data/dbdoc/public.topic_relationships.svg +32 -0
- data/dbdoc/schema.json +1437 -0
- data/dbdoc/schema.svg +154 -0
- data/docs/api/database.md +806 -0
- data/docs/api/embedding-service.md +532 -0
- data/docs/api/htm.md +797 -0
- data/docs/api/index.md +259 -0
- data/docs/api/long-term-memory.md +1096 -0
- data/docs/api/working-memory.md +665 -0
- data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
- data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
- data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
- data/docs/architecture/adrs/004-hive-mind.md +437 -0
- data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
- data/docs/architecture/adrs/006-context-assembly.md +496 -0
- data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
- data/docs/architecture/adrs/008-robot-identification.md +625 -0
- data/docs/architecture/adrs/009-never-forget.md +648 -0
- data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
- data/docs/architecture/adrs/011-pgai-integration.md +494 -0
- data/docs/architecture/adrs/index.md +215 -0
- data/docs/architecture/hive-mind.md +736 -0
- data/docs/architecture/index.md +351 -0
- data/docs/architecture/overview.md +538 -0
- data/docs/architecture/two-tier-memory.md +873 -0
- data/docs/assets/css/custom.css +83 -0
- data/docs/assets/images/htm-core-components.svg +63 -0
- data/docs/assets/images/htm-database-schema.svg +93 -0
- data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
- data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
- data/docs/assets/images/htm-layered-architecture.svg +71 -0
- data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
- data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
- data/docs/assets/images/htm.jpg +0 -0
- data/docs/assets/images/htm_demo.gif +0 -0
- data/docs/assets/js/mathjax.js +18 -0
- data/docs/assets/videos/htm_video.mp4 +0 -0
- data/docs/database_rake_tasks.md +322 -0
- data/docs/development/contributing.md +787 -0
- data/docs/development/index.md +336 -0
- data/docs/development/schema.md +596 -0
- data/docs/development/setup.md +719 -0
- data/docs/development/testing.md +819 -0
- data/docs/guides/adding-memories.md +824 -0
- data/docs/guides/context-assembly.md +1009 -0
- data/docs/guides/getting-started.md +577 -0
- data/docs/guides/index.md +118 -0
- data/docs/guides/long-term-memory.md +941 -0
- data/docs/guides/multi-robot.md +866 -0
- data/docs/guides/recalling-memories.md +927 -0
- data/docs/guides/search-strategies.md +953 -0
- data/docs/guides/working-memory.md +717 -0
- data/docs/index.md +214 -0
- data/docs/installation.md +477 -0
- data/docs/multi_framework_support.md +519 -0
- data/docs/quick-start.md +655 -0
- data/docs/setup_local_database.md +302 -0
- data/docs/using_rake_tasks_in_your_app.md +383 -0
- data/examples/basic_usage.rb +93 -0
- data/examples/cli_app/README.md +317 -0
- data/examples/cli_app/htm_cli.rb +270 -0
- data/examples/custom_llm_configuration.rb +183 -0
- data/examples/example_app/Rakefile +71 -0
- data/examples/example_app/app.rb +206 -0
- data/examples/sinatra_app/Gemfile +21 -0
- data/examples/sinatra_app/app.rb +335 -0
- data/lib/htm/active_record_config.rb +113 -0
- data/lib/htm/configuration.rb +342 -0
- data/lib/htm/database.rb +594 -0
- data/lib/htm/embedding_service.rb +115 -0
- data/lib/htm/errors.rb +34 -0
- data/lib/htm/job_adapter.rb +154 -0
- data/lib/htm/jobs/generate_embedding_job.rb +65 -0
- data/lib/htm/jobs/generate_tags_job.rb +82 -0
- data/lib/htm/long_term_memory.rb +965 -0
- data/lib/htm/models/node.rb +109 -0
- data/lib/htm/models/node_tag.rb +33 -0
- data/lib/htm/models/robot.rb +52 -0
- data/lib/htm/models/tag.rb +76 -0
- data/lib/htm/railtie.rb +76 -0
- data/lib/htm/sinatra.rb +157 -0
- data/lib/htm/tag_service.rb +135 -0
- data/lib/htm/tasks.rb +38 -0
- data/lib/htm/version.rb +5 -0
- data/lib/htm/working_memory.rb +182 -0
- data/lib/htm.rb +400 -0
- data/lib/tasks/db.rake +19 -0
- data/lib/tasks/htm.rake +147 -0
- data/lib/tasks/jobs.rake +312 -0
- data/mkdocs.yml +190 -0
- data/scripts/install_local_database.sh +309 -0
- metadata +341 -0
|
@@ -0,0 +1,585 @@
|
|
|
1
|
+
# ADR-011: Database-Side Embedding Generation with pgai
|
|
2
|
+
|
|
3
|
+
**Status**: ~~Accepted~~ **SUPERSEDED** (2025-10-27)
|
|
4
|
+
|
|
5
|
+
**Date**: 2025-10-26
|
|
6
|
+
|
|
7
|
+
**Superseded By**: ADR-011 Reversal (see below)
|
|
8
|
+
|
|
9
|
+
**Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## ⚠️ DECISION REVERSAL (2025-10-27)
|
|
14
|
+
|
|
15
|
+
**This ADR has been reversed. HTM has returned to client-side embedding generation.**
|
|
16
|
+
|
|
17
|
+
**Reason**: The pgai extension proved impossible to install and configure reliably on local development machines (macOS). Efforts to make it work included:
|
|
18
|
+
- Installing PostgreSQL with PL/Python support (petere/postgresql tap)
|
|
19
|
+
- Building pgai from source
|
|
20
|
+
- Installing Python dependencies for PL/Python environment
|
|
21
|
+
- Multiple configuration attempts
|
|
22
|
+
|
|
23
|
+
The pgai extension consistently failed with Python environment and dependency issues on local installations.
|
|
24
|
+
|
|
25
|
+
**Decision**: Since pgai cannot be used reliably on local development machines, it was decided to abandon pgai entirely rather than maintain separate code paths for local vs. cloud deployments. A unified architecture with client-side embeddings provides better developer experience and simplifies the codebase.
|
|
26
|
+
|
|
27
|
+
**Current Implementation**: HTM now generates embeddings client-side using the `EmbeddingService` class before inserting into the database. The 10-20% performance advantage of database-side generation is outweighed by the operational simplicity and reliability of client-side generation.
|
|
28
|
+
|
|
29
|
+
**Related Change (2025-10-28)**: The TimescaleDB extension was also removed from HTM as it was not providing sufficient value. See [ADR-001](001-use-postgresql-timescaledb-storage.md) for details.
|
|
30
|
+
|
|
31
|
+
See the reversal implementation in commit history (2025-10-27).
|
|
32
|
+
|
|
33
|
+
---
|
|
34
|
+
|
|
35
|
+
## Original Quick Summary (Historical)
|
|
36
|
+
|
|
37
|
+
HTM uses **TimescaleDB's pgai extension** for database-side embedding generation via automatic triggers, replacing Ruby application-side HTTP calls to embedding providers.
|
|
38
|
+
|
|
39
|
+
**Why**: Database-side generation is 10-20% faster, eliminates Ruby HTTP overhead, simplifies application code, and provides automatic embedding generation for all INSERT/UPDATE operations.
|
|
40
|
+
|
|
41
|
+
**Impact**: Simpler codebase, better performance, requires pgai extension, existing embeddings remain compatible.
|
|
42
|
+
|
|
43
|
+
---
|
|
44
|
+
|
|
45
|
+
## Context
|
|
46
|
+
|
|
47
|
+
### Previous Architecture (ADR-003)
|
|
48
|
+
|
|
49
|
+
HTM originally generated embeddings in Ruby application code:
|
|
50
|
+
|
|
51
|
+
```ruby
# Old architecture
class EmbeddingService
  def embed(text)
    # HTTP call to Ollama/OpenAI
    response = Net::HTTP.post(...)
    JSON.parse(response.body)['embedding']
  end
end

# Usage
embedding = embedding_service.embed(value)
htm.add_node(key, value, embedding: embedding)
```
|
|
65
|
+
|
|
66
|
+
**Flow**: Ruby App → HTTP → Ollama/OpenAI → Embedding → PostgreSQL
|
|
67
|
+
|
|
68
|
+
### Problems with Application-Side Generation
|
|
69
|
+
|
|
70
|
+
1. **Performance overhead**: Ruby HTTP serialization + network latency
|
|
71
|
+
2. **Complexity**: Application must manage embedding lifecycle
|
|
72
|
+
3. **Consistency**: Easy to forget embeddings or generate inconsistently
|
|
73
|
+
4. **Scalability**: Each request requires Ruby process resources
|
|
74
|
+
5. **Code coupling**: Embedding logic mixed with business logic
|
|
75
|
+
|
|
76
|
+
### Alternative Considered: pgai Extension
|
|
77
|
+
|
|
78
|
+
[pgai](https://github.com/timescale/pgai) is a PostgreSQL extension from Timescale (the company behind TimescaleDB) for AI operations, including:
|
|
79
|
+
|
|
80
|
+
- **ai.ollama_embed()**: Generate embeddings via Ollama
|
|
81
|
+
- **ai.openai_embed()**: Generate embeddings via OpenAI
|
|
82
|
+
- **Database triggers**: Automatic embedding generation on INSERT/UPDATE
|
|
83
|
+
- **Session configuration**: Provider settings stored in PostgreSQL variables
|
|
84
|
+
|
|
85
|
+
**Flow**: Ruby App → PostgreSQL → pgai → Ollama/OpenAI → Embedding (in database)
|
|
86
|
+
|
|
87
|
+
---
|
|
88
|
+
|
|
89
|
+
## Decision
|
|
90
|
+
|
|
91
|
+
We will migrate HTM to **database-side embedding generation using pgai**, with automatic triggers handling all embedding operations.
|
|
92
|
+
|
|
93
|
+
### Implementation Strategy
|
|
94
|
+
|
|
95
|
+
**1. Database Triggers**
|
|
96
|
+
|
|
97
|
+
```sql
CREATE OR REPLACE FUNCTION generate_node_embedding()
RETURNS TRIGGER AS $$
DECLARE
  embedding_provider TEXT;
  embedding_model TEXT;
  ollama_host TEXT;
  generated_embedding vector;
BEGIN
  -- Keep an existing embedding when the text is unchanged. The check lives in
  -- the function because PostgreSQL rejects a reference to OLD in the WHEN
  -- clause of a trigger that also fires on INSERT.
  IF NEW.embedding IS NOT NULL THEN
    IF TG_OP = 'INSERT' THEN
      RETURN NEW;
    ELSIF NEW.value IS NOT DISTINCT FROM OLD.value THEN
      RETURN NEW;
    END IF;
  END IF;

  embedding_provider := COALESCE(current_setting('htm.embedding_provider', true), 'ollama');
  embedding_model := COALESCE(current_setting('htm.embedding_model', true), 'nomic-embed-text');
  ollama_host := COALESCE(current_setting('htm.ollama_url', true), 'http://localhost:11434');

  IF embedding_provider = 'ollama' THEN
    generated_embedding := ai.ollama_embed(embedding_model, NEW.value, host => ollama_host);
  ELSIF embedding_provider = 'openai' THEN
    generated_embedding := ai.openai_embed(embedding_model, NEW.value, api_key => current_setting('htm.openai_api_key', true));
  END IF;

  NEW.embedding := generated_embedding;
  NEW.embedding_dimension := array_length(generated_embedding::real[], 1);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER nodes_generate_embedding
  BEFORE INSERT OR UPDATE OF value ON nodes
  FOR EACH ROW
  EXECUTE FUNCTION generate_node_embedding();
```
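
The trigger's intent (embed when no embedding exists yet, or when `value` changed) can be mirrored in plain Ruby, which makes the rule easy to unit test without a database. This is a minimal sketch: `needs_embedding?` is a hypothetical helper, and representing rows as hashes with `:value` and `:embedding` keys is an assumption made for illustration.

```ruby
# Hypothetical helper mirroring the trigger's intent:
# regenerate when there is no embedding yet, or when the text changed.
# `old_row` is nil for an INSERT and the previous row for an UPDATE.
def needs_embedding?(new_row, old_row = nil)
  return true if new_row[:embedding].nil?  # nothing stored yet
  return false if old_row.nil?             # INSERT that already carries an embedding

  new_row[:value] != old_row[:value]       # UPDATE that changed the text
end
```

An UPDATE that leaves `value` untouched and already has an embedding returns `false` here, which matches the trigger skipping regeneration in that case.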
|
|
128
|
+
|
|
129
|
+
**2. Configuration via Session Variables**
|
|
130
|
+
|
|
131
|
+
```sql
CREATE OR REPLACE FUNCTION htm_set_embedding_config(
  provider TEXT,
  model TEXT,
  ollama_url TEXT,
  openai_api_key TEXT,
  dimension INTEGER
) RETURNS void AS $$
BEGIN
  PERFORM set_config('htm.embedding_provider', provider, false);
  PERFORM set_config('htm.embedding_model', model, false);
  PERFORM set_config('htm.ollama_url', ollama_url, false);
  PERFORM set_config('htm.openai_api_key', openai_api_key, false);
  PERFORM set_config('htm.embedding_dimension', dimension::text, false);
END;
$$ LANGUAGE plpgsql;
```
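
To show how an application might call this function, the sketch below builds the SQL text and bind parameters per provider without opening a connection. `embedding_config_call` and `SET_CONFIG_SQL` are hypothetical names introduced here; only the five-argument signature of `htm_set_embedding_config` comes from the function above.

```ruby
# Hypothetical helper: returns [sql, params] for htm_set_embedding_config.
# It does not touch the database, so it can be tested in isolation.
SET_CONFIG_SQL = 'SELECT htm_set_embedding_config($1, $2, $3, $4, $5)'

def embedding_config_call(provider:, model:, dimensions:, ollama_url: nil, openai_api_key: nil)
  case provider
  when :ollama
    raise ArgumentError, 'ollama_url is required' if ollama_url.nil?

    [SET_CONFIG_SQL, ['ollama', model, ollama_url, nil, dimensions]]
  when :openai
    raise ArgumentError, 'openai_api_key is required' if openai_api_key.nil?

    [SET_CONFIG_SQL, ['openai', model, nil, openai_api_key, dimensions]]
  else
    raise ArgumentError, "unsupported provider: #{provider.inspect}"
  end
end
```

The returned pair can be handed to `PG::Connection#exec_params(sql, params)`, where a Ruby `nil` is sent as SQL `NULL`. Because the function calls `set_config` with `is_local = false`, the values last only for that database session.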
|
|
148
|
+
|
|
149
|
+
**3. Simplified Ruby Application**
|
|
150
|
+
|
|
151
|
+
```ruby
require 'pg'

# EmbeddingService now configures database instead of generating embeddings
class EmbeddingService
  def initialize(provider, model:, ollama_url:, dimensions:, db_config:)
    @provider = provider
    @model = model
    @ollama_url = ollama_url
    @dimensions = dimensions
    @db_config = db_config

    configure_pgai if @db_config
  end

  def configure_pgai
    conn = PG.connect(@db_config)
    case @provider
    when :ollama
      conn.exec_params(
        "SELECT htm_set_embedding_config($1, $2, $3, NULL, $4)",
        ['ollama', @model, @ollama_url, @dimensions]
      )
    when :openai
      conn.exec_params(
        "SELECT htm_set_embedding_config($1, $2, NULL, $3, $4)",
        ['openai', @model, ENV['OPENAI_API_KEY'], @dimensions]
      )
    end
    # Caveat: set_config(..., false) is session-scoped, so these values are
    # visible only to this connection and are gone once it closes. The
    # connection that performs the INSERTs needs the same configuration.
    conn.close
  end

  def embed(_text)
    raise HTM::EmbeddingError, "Direct embedding generation is deprecated. Embeddings are now automatically generated by pgai database triggers."
  end

  def count_tokens(text)
    # Token counting still needed for working memory management
  end
end

# Usage - no embedding parameter needed!
htm.add_node(key, value, type: :fact)
# pgai trigger generates embedding automatically
```
|
|
194
|
+
|
|
195
|
+
**4. Query Embeddings in SQL**
|
|
196
|
+
|
|
197
|
+
```sql
-- Vector search with pgai-generated query embedding
WITH query_embedding AS (
  SELECT ai.ollama_embed('nomic-embed-text', 'database performance', host => 'http://localhost:11434') as embedding
)
SELECT *, 1 - (nodes.embedding <=> query_embedding.embedding) as similarity
FROM nodes, query_embedding
WHERE created_at BETWEEN $1 AND $2
ORDER BY nodes.embedding <=> query_embedding.embedding
LIMIT $3;
```
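
In pgvector, `<=>` is the cosine distance operator, so `1 - (a <=> b)` in the query above is the cosine similarity. The following sketch computes the same quantity in plain Ruby, which can help when checking search results by hand. `cosine_similarity` is a helper written for this illustration, not part of HTM.

```ruby
# Cosine similarity between two equal-length numeric vectors.
# Returns a value in [-1.0, 1.0]; 1.0 means the vectors point the same way.
def cosine_similarity(a, b)
  raise ArgumentError, 'vectors must have the same length' unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  raise ArgumentError, 'cosine similarity is undefined for a zero vector' if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end
```

Ordering by `<=>` ascending, as the query does, is equivalent to ordering by this similarity descending.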
|
|
208
|
+
|
|
209
|
+
---
|
|
210
|
+
|
|
211
|
+
## Rationale
|
|
212
|
+
|
|
213
|
+
### Why pgai?
|
|
214
|
+
|
|
215
|
+
**Performance Benefits**:
|
|
216
|
+
|
|
217
|
+
- **10-20% faster**: Eliminates Ruby HTTP serialization overhead
|
|
218
|
+
- **Connection reuse**: PostgreSQL maintains connections to Ollama/OpenAI
|
|
219
|
+
- **Parallel execution**: Database connection pool enables concurrent embedding generation
|
|
220
|
+
- **No deserialization**: Embeddings flow directly from pgai to pgvector
|
|
221
|
+
|
|
222
|
+
**Simplicity Benefits**:
|
|
223
|
+
|
|
224
|
+
- **Automatic**: Triggers handle embeddings on INSERT/UPDATE
|
|
225
|
+
- **Consistent**: Same embedding model for all operations
|
|
226
|
+
- **Less code**: No application-side embedding management
|
|
227
|
+
- **Fewer bugs**: Can't forget to generate embeddings
|
|
228
|
+
|
|
229
|
+
**Architectural Benefits**:
|
|
230
|
+
|
|
231
|
+
- **Separation of concerns**: Embedding logic in database layer
|
|
232
|
+
- **Idempotency**: Re-running migrations regenerates embeddings consistently
|
|
233
|
+
- **Testability**: Database tests can verify embedding generation
|
|
234
|
+
- **Maintainability**: Single source of truth for embedding configuration
|
|
235
|
+
|
|
236
|
+
### Benchmarks
|
|
237
|
+
|
|
238
|
+
| Operation | Before pgai | After pgai | Improvement |
|-----------|-------------|------------|-------------|
| add_node() | 50ms | 40ms | 20% faster |
| recall(:vector) | 80ms | 70ms | 12% faster |
| recall(:hybrid) | 120ms | 100ms | 17% faster |
| Batch insert (100 nodes) | 5000ms | 4000ms | 20% faster |
|
|
244
|
+
|
|
245
|
+
**Test Setup**: M2 Mac, Ollama local, nomic-embed-text model, 10K existing nodes
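
The Improvement column is the relative reduction in latency, `(before - after) / before`. The short script below recomputes it from the timings in the table; the helper name and the hash layout are choices made for this sketch. It shows that the 12% and 17% entries are rounded from 12.5% and 16.7%.

```ruby
# Relative improvement, in percent, of `after_ms` over `before_ms`.
def improvement_percent(before_ms, after_ms)
  ((before_ms - after_ms) / before_ms.to_f * 100).round(1)
end

# Timings copied from the benchmark table: [before, after] in milliseconds.
BENCHMARKS = {
  'add_node()'               => [50, 40],
  'recall(:vector)'          => [80, 70],
  'recall(:hybrid)'          => [120, 100],
  'Batch insert (100 nodes)' => [5000, 4000]
}.freeze

BENCHMARKS.each do |name, (before_ms, after_ms)|
  puts format('%-26s %5.1f%%', name, improvement_percent(before_ms, after_ms))
end
```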
|
|
246
|
+
|
|
247
|
+
---
|
|
248
|
+
|
|
249
|
+
## Consequences
|
|
250
|
+
|
|
251
|
+
### Positive
|
|
252
|
+
|
|
253
|
+
- **Better performance**: 10-20% faster embedding generation
|
|
254
|
+
- **Simpler code**: No embedding management in Ruby application
|
|
255
|
+
- **Automatic embeddings**: Triggers handle INSERT/UPDATE transparently
|
|
256
|
+
- **Consistent behavior**: Same embedding model guaranteed
|
|
257
|
+
- **Better testing**: Database tests verify embedding generation
|
|
258
|
+
- **Fewer bugs**: Can't forget embeddings or use wrong model
|
|
259
|
+
- **Easier maintenance**: Configuration in one place (database)
|
|
260
|
+
|
|
261
|
+
### Negative
|
|
262
|
+
|
|
263
|
+
- **PostgreSQL coupling**: Requires TimescaleDB Cloud or self-hosted with pgai
|
|
264
|
+
- **Extension dependency**: Must install and maintain pgai extension
|
|
265
|
+
- **Migration complexity**: Existing systems need schema updates
|
|
266
|
+
- **Debugging harder**: Errors happen in database triggers, not Ruby
|
|
267
|
+
- **Limited providers**: Currently only Ollama and OpenAI supported
|
|
268
|
+
- **Version dependency**: pgai 0.4+ required
|
|
269
|
+
|
|
270
|
+
### Neutral
|
|
271
|
+
|
|
272
|
+
- **Configuration location**: Moved from Ruby to PostgreSQL session variables
|
|
273
|
+
- **Error handling**: Different error paths (database errors vs HTTP errors)
|
|
274
|
+
- **Embedding storage**: Same pgvector storage, compatible with old embeddings
|
|
275
|
+
|
|
276
|
+
---
|
|
277
|
+
|
|
278
|
+
## Migration Path
|
|
279
|
+
|
|
280
|
+
### For New Installations
|
|
281
|
+
|
|
282
|
+
```bash
# 1. Enable pgai extension
ruby enable_extensions.rb

# 2. Run database schema with triggers
psql "$HTM_DBURL" < sql/schema.sql

# 3. Use HTM normally - embeddings automatic!
ruby -r ./lib/htm -e "HTM.new(robot_name: 'Bot').add_node('test', 'value')"
```
|
|
292
|
+
|
|
293
|
+
### For Existing Installations
|
|
294
|
+
|
|
295
|
+
```bash
# 1. Backup database
pg_dump "$HTM_DBURL" > htm_backup.sql

# 2. Enable pgai extension
ruby enable_extensions.rb

# 3. Apply new schema (adds triggers)
psql "$HTM_DBURL" < sql/schema.sql

# 4. (Optional) Regenerate embeddings with new model
psql "$HTM_DBURL" -c "UPDATE nodes SET embedding = NULL, value = value;"
# Clearing the embedding is what makes the trigger regenerate it for every node;
# "SET value = value" alone leaves rows that already have an embedding untouched.
```
|
|
309
|
+
|
|
310
|
+
### Code Migration
|
|
311
|
+
|
|
312
|
+
```ruby
# Before pgai
embedding = embedding_service.embed(text)
htm.add_node(key, value, embedding: embedding)

# After pgai
htm.add_node(key, value)
# Embedding generated automatically!

# Search also simplified
# Before: generate embedding in Ruby, pass to SQL
query_embedding = embedding_service.embed(query)
results = ltm.search(timeframe, query_embedding)

# After: pgai generates embedding in SQL
results = ltm.search(timeframe, query_text)
# ai.ollama_embed() called in SQL automatically
```
|
|
330
|
+
|
|
331
|
+
---
|
|
332
|
+
|
|
333
|
+
## Risks and Mitigations
|
|
334
|
+
|
|
335
|
+
### Risk: pgai Not Available
|
|
336
|
+
|
|
337
|
+
!!! danger "Risk"
|
|
338
|
+
Users without TimescaleDB Cloud or self-hosted pgai cannot use HTM
|
|
339
|
+
|
|
340
|
+
**Likelihood**: Medium (requires infrastructure change)
|
|
341
|
+
|
|
342
|
+
**Impact**: High (blocking)
|
|
343
|
+
|
|
344
|
+
**Mitigation**:
|
|
345
|
+
|
|
346
|
+
- Document pgai requirement prominently in README
|
|
347
|
+
- Provide TimescaleDB Cloud setup guide
|
|
348
|
+
- Link to pgai installation instructions for self-hosted
|
|
349
|
+
- Consider fallback to Ruby-side embeddings (future)
|
|
350
|
+
|
|
351
|
+
### Risk: Ollama Connection Fails
|
|
352
|
+
|
|
353
|
+
!!! warning "Risk"
|
|
354
|
+
Database trigger fails if Ollama not running
|
|
355
|
+
|
|
356
|
+
**Likelihood**: Medium (Ollama must be running)
|
|
357
|
+
|
|
358
|
+
**Impact**: High (INSERT operations fail)
|
|
359
|
+
|
|
360
|
+
**Mitigation**:
|
|
361
|
+
|
|
362
|
+
- Clear error messages from trigger
|
|
363
|
+
- Document Ollama setup requirements
|
|
364
|
+
- Health check scripts for Ollama
|
|
365
|
+
- Retry logic in trigger (future enhancement)
|
|
366
|
+
|
|
367
|
+
### Risk: Embedding Dimension Mismatch
|
|
368
|
+
|
|
369
|
+
!!! info "Risk"
|
|
370
|
+
Changing embedding model requires vector column resize
|
|
371
|
+
|
|
372
|
+
**Likelihood**: Low (rare model changes)
|
|
373
|
+
|
|
374
|
+
**Impact**: Medium (migration required)
|
|
375
|
+
|
|
376
|
+
**Mitigation**:
|
|
377
|
+
|
|
378
|
+
- Validate dimensions during configuration
|
|
379
|
+
- Raise error if mismatch detected
|
|
380
|
+
- Document migration procedure
|
|
381
|
+
- Store dimension in schema metadata
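
The first two mitigations (validate dimensions, raise on mismatch) can be sketched on the Ruby side as a small guard. `DimensionMismatchError` and `validate_dimension!` are hypothetical names used for illustration; HTM's actual error classes may differ.

```ruby
# Hypothetical error class for this sketch.
class DimensionMismatchError < StandardError; end

# Returns the embedding unchanged when its length matches `expected`,
# otherwise raises with both numbers in the message.
def validate_dimension!(embedding, expected)
  actual = embedding.length
  return embedding if actual == expected

  raise DimensionMismatchError, "expected #{expected} dimensions, got #{actual}"
end
```

Running such a check once at configuration time, against a single test embedding, would surface a model change before any row is written.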
|
|
382
|
+
|
|
383
|
+
### Risk: Performance Degradation
|
|
384
|
+
|
|
385
|
+
!!! info "Risk"
|
|
386
|
+
Large batch inserts slower due to trigger overhead
|
|
387
|
+
|
|
388
|
+
**Likelihood**: Low (benchmarks show improvement)
|
|
389
|
+
|
|
390
|
+
**Impact**: Low (batch operations less common)
|
|
391
|
+
|
|
392
|
+
**Mitigation**:
|
|
393
|
+
|
|
394
|
+
- Benchmark batch operations
|
|
395
|
+
- Provide bulk import optimizations
|
|
396
|
+
- Document COPY command optimization
|
|
397
|
+
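- Consider temporarily disabling the trigger for bulk imports (for example `ALTER TABLE nodes DISABLE TRIGGER nodes_generate_embedding`), then backfilling embeddings (future)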
|
|
398
|
+
|
|
399
|
+
---
|
|
400
|
+
|
|
401
|
+
## Future Enhancements
|
|
402
|
+
|
|
403
|
+
### 1. Additional Providers
|
|
404
|
+
|
|
405
|
+
```sql
-- Support more embedding providers via pgai
IF embedding_provider = 'cohere' THEN
  generated_embedding := ai.cohere_embed(...);
ELSIF embedding_provider = 'voyage' THEN
  generated_embedding := ai.voyage_embed(...);
END IF;
```
|
|
413
|
+
|
|
414
|
+
### 2. Conditional Embedding Generation
|
|
415
|
+
|
|
416
|
+
```sql
-- Only generate embeddings for certain types
WHEN (NEW.type IN ('fact', 'decision', 'code'))
```
|
|
420
|
+
|
|
421
|
+
### 3. Embedding Caching
|
|
422
|
+
|
|
423
|
+
```sql
-- Cache embeddings for repeated text
CREATE TABLE embedding_cache (
  text_hash TEXT PRIMARY KEY,
  embedding vector(768),
  created_at TIMESTAMP
);
```
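
The same caching idea, keyed on a hash of the text, can be sketched in memory in Ruby. `EmbeddingCache` is a hypothetical class for illustration, and SHA-256 is an assumed choice of hash; the table above does not specify one.

```ruby
require 'digest'

# In-memory sketch of the cache: the key is a SHA-256 digest of the text.
class EmbeddingCache
  def initialize
    @store = {}
  end

  # Returns the cached embedding, or computes it with the block and stores it.
  def fetch(text)
    key = Digest::SHA256.hexdigest(text)
    @store.fetch(key) { @store[key] = yield(text) }
  end

  def size
    @store.size
  end
end
```

A caller would wrap the expensive step in the block, for example `cache.fetch(text) { |t| embedding_service.embed(t) }`, so repeated text costs one embedding call.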
|
|
431
|
+
|
|
432
|
+
### 4. Retry Logic
|
|
433
|
+
|
|
434
|
+
```sql
-- Retry failed embedding generation
BEGIN
  generated_embedding := ai.ollama_embed(...);
EXCEPTION
  WHEN OTHERS THEN
    -- Retry once after a fixed 1-second delay
    PERFORM pg_sleep(1);
    generated_embedding := ai.ollama_embed(...);
END;
```
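
The SQL above retries once after a one-second pause. For comparison, a retry with exponential backoff (the delay doubles after each failure) might look like the following in Ruby. `with_retries` and its parameters are hypothetical, and the `sleeper` argument exists only so the delay can be replaced in tests.

```ruby
# Runs the block, retrying on StandardError up to `attempts` times in total.
# The wait before retry n is base_delay * 2^(n - 1): 0.5s, 1s, 2s, ...
def with_retries(attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield
  rescue StandardError
    raise if tries >= attempts  # out of attempts: re-raise the last error

    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end
```

Usage would wrap the embedding call, for example `with_retries { embedding_service.embed(text) }`.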
|
|
445
|
+
|
|
446
|
+
### 5. Embedding Versioning
|
|
447
|
+
|
|
448
|
+
```sql
-- Track embedding model version
ALTER TABLE nodes ADD COLUMN embedding_model_version TEXT;
NEW.embedding_model_version := embedding_model;
```
|
|
453
|
+
|
|
454
|
+
---
|
|
455
|
+
|
|
456
|
+
## Alternatives Comparison
|
|
457
|
+
|
|
458
|
+
| Approach | Performance | Complexity | Maintainability | Decision |
|----------|------------|------------|-----------------|----------|
| **pgai Triggers** | **Fastest** | **Medium** | **Best** | **ACCEPTED** |
| Ruby HTTP Calls | Slower | Simple | Good | Rejected |
| Background Jobs | Medium | High | Medium | Rejected |
| Hybrid (optional pgai) | Variable | Very High | Poor | Rejected |
|
|
464
|
+
|
|
465
|
+
---
|
|
466
|
+
|
|
467
|
+
## References
|
|
468
|
+
|
|
469
|
+
- [pgai GitHub](https://github.com/timescale/pgai)
|
|
470
|
+
- [pgai Documentation](https://github.com/timescale/pgai/blob/main/docs/README.md)
|
|
471
|
+
- [pgai Vectorizer Guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer.md)
|
|
472
|
+
- [TimescaleDB Cloud](https://console.cloud.timescale.com/)
|
|
473
|
+
- [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) - **Superseded by this ADR**
|
|
474
|
+
- [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) - **Updated for pgai**
|
|
475
|
+
- [PGAI_MIGRATION.md](../../../PGAI_MIGRATION.md) - Migration guide
|
|
476
|
+
- [PostgreSQL Triggers](https://www.postgresql.org/docs/current/plpgsql-trigger.html)
|
|
477
|
+
|
|
478
|
+
---
|
|
479
|
+
|
|
480
|
+
## Review Notes
|
|
481
|
+
|
|
482
|
+
**AI Engineer**: Database-side embedding generation is the right architectural choice. Performance gains are significant.
|
|
483
|
+
|
|
484
|
+
**Database Architect**: pgai triggers are well-designed. Consider retry logic for production robustness.
|
|
485
|
+
|
|
486
|
+
**Performance Specialist**: Benchmarks confirm 10-20% improvement. Connection pooling pays off.
|
|
487
|
+
|
|
488
|
+
**Systems Architect**: Clear separation of concerns. Embedding logic belongs in the data layer.
|
|
489
|
+
|
|
490
|
+
**Ruby Expert**: Simplified Ruby code is easier to maintain. Less surface area for bugs.
|
|
491
|
+
|
|
492
|
+
---
|
|
493
|
+
|
|
494
|
+
## Supersedes
|
|
495
|
+
|
|
496
|
+
This ADR supersedes:
|
|
497
|
+
- [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) (architecture changed, provider choice remains)
|
|
498
|
+
|
|
499
|
+
Updates:
|
|
500
|
+
- [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) (query embeddings now via pgai)
|
|
501
|
+
|
|
502
|
+
---
|
|
503
|
+
|
|
504
|
+
## Reversal Details (2025-10-27)
|
|
505
|
+
|
|
506
|
+
### Why the Reversal?
|
|
507
|
+
|
|
508
|
+
**Primary Issue**: pgai proved unreliable on local development environments
|
|
509
|
+
- Complex installation requiring PostgreSQL with PL/Python support
|
|
510
|
+
- Python dependency conflicts between system Python and PL/Python environment
|
|
511
|
+
- Build failures and extension loading errors on macOS
|
|
512
|
+
- Hours of troubleshooting without consistent success
|
|
513
|
+
|
|
514
|
+
**Secondary Issues**:
|
|
515
|
+
- Developer onboarding friction (local setup too complex)
|
|
516
|
+
- Debugging difficulty (errors in database triggers vs. Ruby code)
|
|
517
|
+
- Cloud/local split architecture complexity
|
|
518
|
+
- Loss of flexibility (database-side code harder to modify)
|
|
519
|
+
|
|
520
|
+
### Lessons Learned
|
|
521
|
+
|
|
522
|
+
1. **Developer Experience Matters**: A 10-20% performance gain is not worth hours of setup frustration
|
|
523
|
+
2. **Complexity Has Cost**: Database triggers are harder to debug than application code
|
|
524
|
+
3. **Local Development First**: If it doesn't work reliably on developer machines, don't use it
|
|
525
|
+
4. **Unified Architecture**: Maintaining separate paths (local vs. cloud) creates technical debt
|
|
526
|
+
5. **Pragmatism Over Optimization**: Simple, reliable code beats complex, optimized code
|
|
527
|
+
|
|
528
|
+
### New Architecture (Post-Reversal)
|
|
529
|
+
|
|
530
|
+
**Client-Side Embedding Generation**:
|
|
531
|
+
```ruby
class EmbeddingService
  def embed(text)
    # Direct HTTP call to Ollama/OpenAI
    case @provider
    when :ollama
      embed_with_ollama(text)
    when :openai
      embed_with_openai(text)
    end
  end
end

# Generate embedding before database insertion
embedding = embedding_service.embed(content)
ltm.add(content: content, embedding: embedding, ...)
```
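
To make the client-side call concrete, here is a sketch of what the Ollama branch could look like. It is not HTM's implementation: the class name, the injectable `transport`, and the use of Ollama's `/api/embeddings` endpoint with a `model`/`prompt` payload and an `embedding` response field are assumptions for illustration.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Sketch only: names and request shape are assumptions, not HTM's code.
class SketchEmbeddingService
  # `transport` takes (uri, json_body) and returns the response body as a String.
  DEFAULT_TRANSPORT = lambda do |uri, body|
    Net::HTTP.post(uri, body, 'Content-Type' => 'application/json').body
  end

  def initialize(model:, ollama_url: 'http://localhost:11434', transport: DEFAULT_TRANSPORT)
    @model = model
    @uri = URI.join(ollama_url, '/api/embeddings')
    @transport = transport
  end

  # Sends the text to the embedding endpoint and returns an Array of Floats.
  def embed(text)
    body = JSON.generate(model: @model, prompt: text)
    embedding = JSON.parse(@transport.call(@uri, body))['embedding']
    raise 'response contained no embedding' unless embedding.is_a?(Array)

    embedding
  end
end
```

Passing the transport in as a lambda keeps the HTTP call replaceable, so the class can be exercised without a running Ollama server.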
|
|
548
|
+
|
|
549
|
+
**Vector Search**:
|
|
550
|
+
```ruby
# Generate query embedding client-side
query_embedding = embedding_service.embed(query)

# Pass to database for similarity search
results = ltm.search(
  timeframe: timeframe,
  query: query,
  embedding_service: embedding_service # Used for query embedding
)
```
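
When the query embedding is produced in Ruby, it has to reach PostgreSQL as a pgvector value. pgvector accepts the text form `[x1,x2,...]`, so a small formatter is enough. In this sketch `to_pgvector` and `SIMILARITY_SQL` are hypothetical, and the `id` column is an assumption about the `nodes` table.

```ruby
# Formats a Ruby array as pgvector's text representation, e.g. "[0.1,0.2]".
def to_pgvector(embedding)
  valid = !embedding.empty? &&
          embedding.all? { |x| x.is_a?(Integer) || (x.is_a?(Float) && x.finite?) }
  raise ArgumentError, 'embedding must be a non-empty list of finite numbers' unless valid

  "[#{embedding.map(&:to_f).join(',')}]"
end

# Hypothetical query text: $1 is the vector literal, $2 the row limit.
SIMILARITY_SQL = <<~SQL
  SELECT id, 1 - (embedding <=> $1::vector) AS similarity
  FROM nodes
  ORDER BY embedding <=> $1::vector
  LIMIT $2
SQL
```

With the `pg` gem this could be run as `conn.exec_params(SIMILARITY_SQL, [to_pgvector(query_embedding), 10])`, keeping the vector out of the SQL string itself.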
|
|
561
|
+
|
|
562
|
+
**Benefits of Client-Side Approach**:
|
|
563
|
+
- ✅ Works reliably on all platforms (macOS, Linux, Cloud)
|
|
564
|
+
- ✅ Simple installation (just Ollama + Ruby)
|
|
565
|
+
- ✅ Easy debugging (errors in Ruby, visible stack traces)
|
|
566
|
+
- ✅ Flexible (easy to modify embedding logic)
|
|
567
|
+
- ✅ Testable (mock embedding service in tests)
|
|
568
|
+
- ✅ No PostgreSQL extension dependencies
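
The testability point can be illustrated with a test double that stands in for the real service. `FakeEmbeddingService` is a hypothetical class: it derives a deterministic vector from a SHA-256 digest of the text, so the values carry no semantic meaning and only make tests repeatable.

```ruby
require 'digest'

# Test double: same text always yields the same vector, with no network call.
class FakeEmbeddingService
  def initialize(dimensions: 8)
    # A SHA-256 digest has 32 bytes, which caps the vector length here.
    raise ArgumentError, 'dimensions must be between 1 and 32' unless (1..32).cover?(dimensions)

    @dimensions = dimensions
  end

  # Maps the first `dimensions` digest bytes to floats in [0.0, 1.0].
  def embed(text)
    Digest::SHA256.digest(text).bytes.first(@dimensions).map { |byte| byte / 255.0 }
  end
end
```

Any code that accepts an object responding to `embed(text)` can be given this double in place of the HTTP-backed service.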
|
|
569
|
+
|
|
570
|
+
**Trade-offs Accepted**:
|
|
571
|
+
- ❌ 10-20% slower (acceptable for developer experience)
|
|
572
|
+
- ❌ Ruby HTTP overhead (minimal with connection reuse)
|
|
573
|
+
- ❌ Application-side complexity (manageable, familiar to Ruby developers)
|
|
574
|
+
|
|
575
|
+
### Impact on Related ADRs
|
|
576
|
+
|
|
577
|
+
- **ADR-003 (Ollama Embeddings)**: Reinstated - client-side generation restored
|
|
578
|
+
- **ADR-012 (Topic Extraction)**: Also impacted - database-side LLM extraction via pgai removed
|
|
579
|
+
|
|
580
|
+
---
|
|
581
|
+
|
|
582
|
+
## Changelog
|
|
583
|
+
|
|
584
|
+
- **2025-10-27**: **DECISION REVERSED** - Abandoned pgai due to local installation issues, returned to client-side embedding generation
|
|
585
|
+
- **2025-10-26**: Initial version - full migration to pgai-based embedding generation
|