htm 0.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
- data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
- data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
- data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
- data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
- data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
- data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
- data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
- data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
- data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
- data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
- data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
- data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
- data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
- data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
- data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
- data/.architecture/members.yml +144 -0
- data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
- data/.architecture/reviews/initial-system-analysis.md +330 -0
- data/.envrc +32 -0
- data/.irbrc +145 -0
- data/CHANGELOG.md +150 -0
- data/COMMITS.md +196 -0
- data/LICENSE +21 -0
- data/README.md +1347 -0
- data/Rakefile +51 -0
- data/SETUP.md +268 -0
- data/config/database.yml +67 -0
- data/db/migrate/20250101000001_enable_extensions.rb +14 -0
- data/db/migrate/20250101000002_create_robots.rb +14 -0
- data/db/migrate/20250101000003_create_nodes.rb +42 -0
- data/db/migrate/20250101000005_create_tags.rb +38 -0
- data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
- data/db/schema.sql +473 -0
- data/db/seed_data/README.md +100 -0
- data/db/seed_data/presidents.md +136 -0
- data/db/seed_data/states.md +151 -0
- data/db/seeds.rb +208 -0
- data/dbdoc/README.md +173 -0
- data/dbdoc/public.node_stats.md +48 -0
- data/dbdoc/public.node_stats.svg +41 -0
- data/dbdoc/public.node_tags.md +40 -0
- data/dbdoc/public.node_tags.svg +112 -0
- data/dbdoc/public.nodes.md +54 -0
- data/dbdoc/public.nodes.svg +118 -0
- data/dbdoc/public.nodes_tags.md +39 -0
- data/dbdoc/public.nodes_tags.svg +112 -0
- data/dbdoc/public.ontology_structure.md +48 -0
- data/dbdoc/public.ontology_structure.svg +38 -0
- data/dbdoc/public.operations_log.md +42 -0
- data/dbdoc/public.operations_log.svg +130 -0
- data/dbdoc/public.relationships.md +39 -0
- data/dbdoc/public.relationships.svg +41 -0
- data/dbdoc/public.robot_activity.md +46 -0
- data/dbdoc/public.robot_activity.svg +35 -0
- data/dbdoc/public.robots.md +35 -0
- data/dbdoc/public.robots.svg +90 -0
- data/dbdoc/public.schema_migrations.md +29 -0
- data/dbdoc/public.schema_migrations.svg +26 -0
- data/dbdoc/public.tags.md +35 -0
- data/dbdoc/public.tags.svg +60 -0
- data/dbdoc/public.topic_relationships.md +45 -0
- data/dbdoc/public.topic_relationships.svg +32 -0
- data/dbdoc/schema.json +1437 -0
- data/dbdoc/schema.svg +154 -0
- data/docs/api/database.md +806 -0
- data/docs/api/embedding-service.md +532 -0
- data/docs/api/htm.md +797 -0
- data/docs/api/index.md +259 -0
- data/docs/api/long-term-memory.md +1096 -0
- data/docs/api/working-memory.md +665 -0
- data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
- data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
- data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
- data/docs/architecture/adrs/004-hive-mind.md +437 -0
- data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
- data/docs/architecture/adrs/006-context-assembly.md +496 -0
- data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
- data/docs/architecture/adrs/008-robot-identification.md +625 -0
- data/docs/architecture/adrs/009-never-forget.md +648 -0
- data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
- data/docs/architecture/adrs/011-pgai-integration.md +494 -0
- data/docs/architecture/adrs/index.md +215 -0
- data/docs/architecture/hive-mind.md +736 -0
- data/docs/architecture/index.md +351 -0
- data/docs/architecture/overview.md +538 -0
- data/docs/architecture/two-tier-memory.md +873 -0
- data/docs/assets/css/custom.css +83 -0
- data/docs/assets/images/htm-core-components.svg +63 -0
- data/docs/assets/images/htm-database-schema.svg +93 -0
- data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
- data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
- data/docs/assets/images/htm-layered-architecture.svg +71 -0
- data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
- data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
- data/docs/assets/images/htm.jpg +0 -0
- data/docs/assets/images/htm_demo.gif +0 -0
- data/docs/assets/js/mathjax.js +18 -0
- data/docs/assets/videos/htm_video.mp4 +0 -0
- data/docs/database_rake_tasks.md +322 -0
- data/docs/development/contributing.md +787 -0
- data/docs/development/index.md +336 -0
- data/docs/development/schema.md +596 -0
- data/docs/development/setup.md +719 -0
- data/docs/development/testing.md +819 -0
- data/docs/guides/adding-memories.md +824 -0
- data/docs/guides/context-assembly.md +1009 -0
- data/docs/guides/getting-started.md +577 -0
- data/docs/guides/index.md +118 -0
- data/docs/guides/long-term-memory.md +941 -0
- data/docs/guides/multi-robot.md +866 -0
- data/docs/guides/recalling-memories.md +927 -0
- data/docs/guides/search-strategies.md +953 -0
- data/docs/guides/working-memory.md +717 -0
- data/docs/index.md +214 -0
- data/docs/installation.md +477 -0
- data/docs/multi_framework_support.md +519 -0
- data/docs/quick-start.md +655 -0
- data/docs/setup_local_database.md +302 -0
- data/docs/using_rake_tasks_in_your_app.md +383 -0
- data/examples/basic_usage.rb +93 -0
- data/examples/cli_app/README.md +317 -0
- data/examples/cli_app/htm_cli.rb +270 -0
- data/examples/custom_llm_configuration.rb +183 -0
- data/examples/example_app/Rakefile +71 -0
- data/examples/example_app/app.rb +206 -0
- data/examples/sinatra_app/Gemfile +21 -0
- data/examples/sinatra_app/app.rb +335 -0
- data/lib/htm/active_record_config.rb +113 -0
- data/lib/htm/configuration.rb +342 -0
- data/lib/htm/database.rb +594 -0
- data/lib/htm/embedding_service.rb +115 -0
- data/lib/htm/errors.rb +34 -0
- data/lib/htm/job_adapter.rb +154 -0
- data/lib/htm/jobs/generate_embedding_job.rb +65 -0
- data/lib/htm/jobs/generate_tags_job.rb +82 -0
- data/lib/htm/long_term_memory.rb +965 -0
- data/lib/htm/models/node.rb +109 -0
- data/lib/htm/models/node_tag.rb +33 -0
- data/lib/htm/models/robot.rb +52 -0
- data/lib/htm/models/tag.rb +76 -0
- data/lib/htm/railtie.rb +76 -0
- data/lib/htm/sinatra.rb +157 -0
- data/lib/htm/tag_service.rb +135 -0
- data/lib/htm/tasks.rb +38 -0
- data/lib/htm/version.rb +5 -0
- data/lib/htm/working_memory.rb +182 -0
- data/lib/htm.rb +400 -0
- data/lib/tasks/db.rake +19 -0
- data/lib/tasks/htm.rake +147 -0
- data/lib/tasks/jobs.rake +312 -0
- data/mkdocs.yml +190 -0
- data/scripts/install_local_database.sh +309 -0
- metadata +341 -0
|
@@ -0,0 +1,494 @@
|
|
|
1
|
+
# ADR-011: Database-Side Embedding Generation with pgai
|
|
2
|
+
|
|
3
|
+
**Status**: ~~Accepted~~ **SUPERSEDED** (2025-10-27)
|
|
4
|
+
|
|
5
|
+
**Date**: 2025-10-26
|
|
6
|
+
|
|
7
|
+
**Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
## ⚠️ DECISION REVERSED (2025-10-27)
|
|
12
|
+
|
|
13
|
+
**This ADR has been superseded. HTM has returned to client-side embedding generation.**
|
|
14
|
+
|
|
15
|
+
The full ADR with complete reversal details is available in the repository at:
|
|
16
|
+
📄 `.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md`
|
|
17
|
+
|
|
18
|
+
**Reason for Reversal**: pgai proved impossible to install reliably on local development machines (macOS). Rather than maintain a split architecture (client-side locally, database-side in the cloud), the project adopted a unified client-side approach for a better developer experience.
|
|
19
|
+
|
|
20
|
+
**Current Implementation**: Embeddings are generated client-side by the `EmbeddingService` class before database insertion.
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## Quick Summary (Historical)
|
|
25
|
+
|
|
26
|
+
HTM uses **TimescaleDB's pgai extension** for database-side embedding generation via automatic triggers, replacing Ruby application-side HTTP calls to embedding providers.
|
|
27
|
+
|
|
28
|
+
**Why**: Database-side generation is 10-20% faster, eliminates Ruby HTTP overhead, simplifies application code, and provides automatic embedding generation for all INSERT/UPDATE operations.
|
|
29
|
+
|
|
30
|
+
**Impact**: Simpler codebase, better performance, requires pgai extension, existing embeddings remain compatible.
|
|
31
|
+
|
|
32
|
+
---
|
|
33
|
+
|
|
34
|
+
## Context
|
|
35
|
+
|
|
36
|
+
### Previous Architecture (ADR-003)
|
|
37
|
+
|
|
38
|
+
HTM originally generated embeddings in Ruby application code:
|
|
39
|
+
|
|
40
|
+
```ruby
|
|
41
|
+
# Old architecture
|
|
42
|
+
class EmbeddingService
|
|
43
|
+
def embed(text)
|
|
44
|
+
# HTTP call to Ollama/OpenAI
|
|
45
|
+
response = Net::HTTP.post(...)
|
|
46
|
+
JSON.parse(response.body)['embedding']
|
|
47
|
+
end
|
|
48
|
+
end
|
|
49
|
+
|
|
50
|
+
# Usage
|
|
51
|
+
embedding = embedding_service.embed(value)
|
|
52
|
+
htm.add_node(key, value, embedding: embedding)
|
|
53
|
+
```
|
|
54
|
+
|
|
55
|
+
**Flow**: Ruby App → HTTP → Ollama/OpenAI → Embedding → PostgreSQL
|
|
56
|
+
|
|
57
|
+
### Problems with Application-Side Generation
|
|
58
|
+
|
|
59
|
+
1. **Performance overhead**: Ruby HTTP serialization + network latency
|
|
60
|
+
2. **Complexity**: Application must manage embedding lifecycle
|
|
61
|
+
3. **Consistency**: Easy to forget embeddings or generate inconsistently
|
|
62
|
+
4. **Scalability**: Each request requires Ruby process resources
|
|
63
|
+
5. **Code coupling**: Embedding logic mixed with business logic
|
|
64
|
+
|
|
65
|
+
### Alternative Considered: pgai Extension
|
|
66
|
+
|
|
67
|
+
[pgai](https://github.com/timescale/pgai) is TimescaleDB's PostgreSQL extension for AI operations, including:
|
|
68
|
+
|
|
69
|
+
- **ai.ollama_embed()**: Generate embeddings via Ollama
|
|
70
|
+
- **ai.openai_embed()**: Generate embeddings via OpenAI
|
|
71
|
+
- **Database triggers**: Automatic embedding generation on INSERT/UPDATE
|
|
72
|
+
- **Session configuration**: Provider settings stored in PostgreSQL variables
|
|
73
|
+
|
|
74
|
+
**Flow**: Ruby App → PostgreSQL → pgai → Ollama/OpenAI → Embedding (in database)
|
|
75
|
+
|
|
76
|
+
---
|
|
77
|
+
|
|
78
|
+
## Decision
|
|
79
|
+
|
|
80
|
+
We will migrate HTM to **database-side embedding generation using pgai**, with automatic triggers handling all embedding operations.
|
|
81
|
+
|
|
82
|
+
### Implementation Strategy
|
|
83
|
+
|
|
84
|
+
**1. Database Triggers**
|
|
85
|
+
|
|
86
|
+
```sql
|
|
87
|
+
CREATE OR REPLACE FUNCTION generate_node_embedding()
|
|
88
|
+
RETURNS TRIGGER AS $$
|
|
89
|
+
DECLARE
|
|
90
|
+
embedding_provider TEXT;
|
|
91
|
+
embedding_model TEXT;
|
|
92
|
+
ollama_host TEXT;
|
|
93
|
+
generated_embedding vector;
|
|
94
|
+
BEGIN
|
|
95
|
+
embedding_provider := COALESCE(current_setting('htm.embedding_provider', true), 'ollama');
|
|
96
|
+
embedding_model := COALESCE(current_setting('htm.embedding_model', true), 'nomic-embed-text');
|
|
97
|
+
ollama_host := COALESCE(current_setting('htm.ollama_url', true), 'http://localhost:11434');
|
|
98
|
+
|
|
99
|
+
IF embedding_provider = 'ollama' THEN
|
|
100
|
+
generated_embedding := ai.ollama_embed(embedding_model, NEW.value, host => ollama_host);
|
|
101
|
+
ELSIF embedding_provider = 'openai' THEN
|
|
102
|
+
generated_embedding := ai.openai_embed(embedding_model, NEW.value, api_key => current_setting('htm.openai_api_key', true));
|
|
103
|
+
END IF;
|
|
104
|
+
|
|
105
|
+
NEW.embedding := generated_embedding;
|
|
106
|
+
NEW.embedding_dimension := array_length(generated_embedding::real[], 1);
|
|
107
|
+
RETURN NEW;
|
|
108
|
+
END;
|
|
109
|
+
$$ LANGUAGE plpgsql;
|
|
110
|
+
|
|
111
|
+
CREATE TRIGGER nodes_generate_embedding
|
|
112
|
+
BEFORE INSERT OR UPDATE OF value ON nodes
|
|
113
|
+
FOR EACH ROW
|
|
114
|
+
-- No WHEN clause: PostgreSQL rejects OLD references in the WHEN condition of a trigger that also fires on INSERT
|
|
115
|
+
EXECUTE FUNCTION generate_node_embedding();
|
|
116
|
+
```
|
|
117
|
+
|
|
118
|
+
**2. Configuration via Session Variables**
|
|
119
|
+
|
|
120
|
+
```sql
|
|
121
|
+
CREATE OR REPLACE FUNCTION htm_set_embedding_config(
|
|
122
|
+
provider TEXT,
|
|
123
|
+
model TEXT,
|
|
124
|
+
ollama_url TEXT,
|
|
125
|
+
openai_api_key TEXT,
|
|
126
|
+
dimension INTEGER
|
|
127
|
+
) RETURNS void AS $$
|
|
128
|
+
BEGIN
|
|
129
|
+
PERFORM set_config('htm.embedding_provider', provider, false);
|
|
130
|
+
PERFORM set_config('htm.embedding_model', model, false);
|
|
131
|
+
PERFORM set_config('htm.ollama_url', ollama_url, false);
|
|
132
|
+
PERFORM set_config('htm.openai_api_key', openai_api_key, false);
|
|
133
|
+
PERFORM set_config('htm.embedding_dimension', dimension::text, false);
|
|
134
|
+
END;
|
|
135
|
+
$$ LANGUAGE plpgsql;
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
**3. Simplified Ruby Application**
|
|
139
|
+
|
|
140
|
+
```ruby
|
|
141
|
+
# EmbeddingService now configures database instead of generating embeddings
|
|
142
|
+
class EmbeddingService
|
|
143
|
+
def initialize(provider, model:, ollama_url:, dimensions:, db_config:)
|
|
144
|
+
@provider = provider
|
|
145
|
+
@model = model
|
|
146
|
+
@ollama_url = ollama_url
|
|
147
|
+
@dimensions = dimensions
|
|
148
|
+
@db_config = db_config
|
|
149
|
+
|
|
150
|
+
configure_pgai if @db_config
|
|
151
|
+
end
|
|
152
|
+
|
|
153
|
+
def configure_pgai
|
|
154
|
+
conn = PG.connect(@db_config)
|
|
155
|
+
case @provider
|
|
156
|
+
when :ollama
|
|
157
|
+
conn.exec_params(
|
|
158
|
+
"SELECT htm_set_embedding_config($1, $2, $3, NULL, $4)",
|
|
159
|
+
['ollama', @model, @ollama_url, @dimensions]
|
|
160
|
+
)
|
|
161
|
+
when :openai
|
|
162
|
+
conn.exec_params(
|
|
163
|
+
"SELECT htm_set_embedding_config($1, $2, NULL, $3, $4)",
|
|
164
|
+
['openai', @model, ENV['OPENAI_API_KEY'], @dimensions]
|
|
165
|
+
)
|
|
166
|
+
end
|
|
167
|
+
conn.close
|
|
168
|
+
end
|
|
169
|
+
|
|
170
|
+
def embed(_text)
|
|
171
|
+
raise HTM::EmbeddingError, "Direct embedding generation is deprecated. Embeddings are now automatically generated by pgai database triggers."
|
|
172
|
+
end
|
|
173
|
+
|
|
174
|
+
def count_tokens(text)
|
|
175
|
+
# Token counting still needed for working memory management
|
|
176
|
+
end
|
|
177
|
+
end
|
|
178
|
+
|
|
179
|
+
# Usage - no embedding parameter needed!
|
|
180
|
+
htm.add_node(key, value, type: :fact)
|
|
181
|
+
# pgai trigger generates embedding automatically
|
|
182
|
+
```
|
|
183
|
+
|
|
184
|
+
**4. Query Embeddings in SQL**
|
|
185
|
+
|
|
186
|
+
```sql
|
|
187
|
+
-- Vector search with pgai-generated query embedding
|
|
188
|
+
WITH query_embedding AS (
|
|
189
|
+
SELECT ai.ollama_embed('nomic-embed-text', 'database performance', host => 'http://localhost:11434') as embedding
|
|
190
|
+
)
|
|
191
|
+
SELECT *, 1 - (nodes.embedding <=> query_embedding.embedding) as similarity
|
|
192
|
+
FROM nodes, query_embedding
|
|
193
|
+
WHERE created_at BETWEEN $1 AND $2
|
|
194
|
+
ORDER BY nodes.embedding <=> query_embedding.embedding
|
|
195
|
+
LIMIT $3;
|
|
196
|
+
```
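The `similarity` column in the query above is one minus pgvector's cosine distance (`<=>`). As a rough illustration of that arithmetic outside the database, the sketch below computes the same quantity in plain Ruby. The helper names `cosine_distance` and `similarity` are hypothetical and are not part of HTM or pgvector.

```ruby
# Illustrative only: mirrors `1 - (a <=> b)` from the SQL above, where
# pgvector's `<=>` operator is cosine distance. Helper names are hypothetical.
def cosine_distance(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  raise ArgumentError, "zero-length vector" if norm_a.zero? || norm_b.zero?

  1.0 - dot / (norm_a * norm_b)
end

# Cosine similarity: 1.0 for identical directions, 0.0 for orthogonal vectors.
def similarity(a, b)
  1.0 - cosine_distance(a, b)
end
```

Ordering by ascending distance, as the SQL does, is equivalent to ordering by descending similarity.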
|
|
197
|
+
|
|
198
|
+
---
|
|
199
|
+
|
|
200
|
+
## Rationale
|
|
201
|
+
|
|
202
|
+
### Why pgai?
|
|
203
|
+
|
|
204
|
+
**Performance Benefits**:
|
|
205
|
+
|
|
206
|
+
- **10-20% faster**: Eliminates Ruby HTTP serialization overhead
|
|
207
|
+
- **Connection reuse**: PostgreSQL maintains connections to Ollama/OpenAI
|
|
208
|
+
- **Parallel execution**: Database connection pool enables concurrent embedding generation
|
|
209
|
+
- **No deserialization**: Embeddings flow directly from pgai to pgvector
|
|
210
|
+
|
|
211
|
+
**Simplicity Benefits**:
|
|
212
|
+
|
|
213
|
+
- **Automatic**: Triggers handle embeddings on INSERT/UPDATE
|
|
214
|
+
- **Consistent**: Same embedding model for all operations
|
|
215
|
+
- **Less code**: No application-side embedding management
|
|
216
|
+
- **Fewer bugs**: Can't forget to generate embeddings
|
|
217
|
+
|
|
218
|
+
**Architectural Benefits**:
|
|
219
|
+
|
|
220
|
+
- **Separation of concerns**: Embedding logic in database layer
|
|
221
|
+
- **Idempotency**: Re-running migrations regenerates embeddings consistently
|
|
222
|
+
- **Testability**: Database tests can verify embedding generation
|
|
223
|
+
- **Maintainability**: Single source of truth for embedding configuration
|
|
224
|
+
|
|
225
|
+
### Benchmarks
|
|
226
|
+
|
|
227
|
+
| Operation | Before pgai | After pgai | Improvement |
|
|
228
|
+
|-----------|-------------|------------|-------------|
|
|
229
|
+
| add_node() | 50ms | 40ms | 20% faster |
|
|
230
|
+
| recall(:vector) | 80ms | 70ms | 12% faster |
|
|
231
|
+
| recall(:hybrid) | 120ms | 100ms | 17% faster |
|
|
232
|
+
| Batch insert (100 nodes) | 5000ms | 4000ms | 20% faster |
|
|
233
|
+
|
|
234
|
+
**Test Setup**: M2 Mac, Ollama local, nomic-embed-text model, 10K existing nodes
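The Improvement column follows from the two timing columns as (before - after) / before. A minimal Ruby sketch of that arithmetic, using only the figures from the table above (the constant and method names are illustrative):

```ruby
# Timings in milliseconds, copied from the benchmark table above.
BENCHMARKS = {
  "add_node()"               => [50, 40],
  "recall(:vector)"          => [80, 70],
  "recall(:hybrid)"          => [120, 100],
  "Batch insert (100 nodes)" => [5000, 4000]
}.freeze

# Percentage reduction in latency relative to the "before" timing.
def improvement_percent(before_ms, after_ms)
  (before_ms - after_ms) * 100.0 / before_ms
end

BENCHMARKS.each do |operation, (before_ms, after_ms)|
  puts format("%-26s %.1f%%", operation, improvement_percent(before_ms, after_ms))
end
```

The computed values are 20.0%, 12.5%, 16.7%, and 20.0%; the table reports them as whole percentages.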
|
|
235
|
+
|
|
236
|
+
---
|
|
237
|
+
|
|
238
|
+
## Consequences
|
|
239
|
+
|
|
240
|
+
### Positive
|
|
241
|
+
|
|
242
|
+
- **Better performance**: 10-20% faster embedding generation
|
|
243
|
+
- **Simpler code**: No embedding management in Ruby application
|
|
244
|
+
- **Automatic embeddings**: Triggers handle INSERT/UPDATE transparently
|
|
245
|
+
- **Consistent behavior**: Same embedding model guaranteed
|
|
246
|
+
- **Better testing**: Database tests verify embedding generation
|
|
247
|
+
- **Fewer bugs**: Can't forget embeddings or use wrong model
|
|
248
|
+
- **Easier maintenance**: Configuration in one place (database)
|
|
249
|
+
|
|
250
|
+
### Negative
|
|
251
|
+
|
|
252
|
+
- **PostgreSQL coupling**: Requires TimescaleDB Cloud or self-hosted with pgai
|
|
253
|
+
- **Extension dependency**: Must install and maintain pgai extension
|
|
254
|
+
- **Migration complexity**: Existing systems need schema updates
|
|
255
|
+
- **Debugging harder**: Errors happen in database triggers, not Ruby
|
|
256
|
+
- **Limited providers**: Currently only Ollama and OpenAI supported
|
|
257
|
+
- **Version dependency**: pgai 0.4+ required
|
|
258
|
+
|
|
259
|
+
### Neutral
|
|
260
|
+
|
|
261
|
+
- **Configuration location**: Moved from Ruby to PostgreSQL session variables
|
|
262
|
+
- **Error handling**: Different error paths (database errors vs HTTP errors)
|
|
263
|
+
- **Embedding storage**: Same pgvector storage, compatible with old embeddings
|
|
264
|
+
|
|
265
|
+
---
|
|
266
|
+
|
|
267
|
+
## Migration Path
|
|
268
|
+
|
|
269
|
+
### For New Installations
|
|
270
|
+
|
|
271
|
+
```bash
|
|
272
|
+
# 1. Enable pgai extension
|
|
273
|
+
ruby enable_extensions.rb
|
|
274
|
+
|
|
275
|
+
# 2. Run database schema with triggers
|
|
276
|
+
psql $HTM_DBURL < sql/schema.sql
|
|
277
|
+
|
|
278
|
+
# 3. Use HTM normally - embeddings automatic!
|
|
279
|
+
ruby -r ./lib/htm -e "HTM.new(robot_name: 'Bot').add_node('test', 'value')"
|
|
280
|
+
```
|
|
281
|
+
|
|
282
|
+
### For Existing Installations
|
|
283
|
+
|
|
284
|
+
```bash
|
|
285
|
+
# 1. Backup database
|
|
286
|
+
pg_dump $HTM_DBURL > htm_backup.sql
|
|
287
|
+
|
|
288
|
+
# 2. Enable pgai extension
|
|
289
|
+
ruby enable_extensions.rb
|
|
290
|
+
|
|
291
|
+
# 3. Apply new schema (adds triggers)
|
|
292
|
+
psql $HTM_DBURL < sql/schema.sql
|
|
293
|
+
|
|
294
|
+
# 4. (Optional) Regenerate embeddings with new model
|
|
295
|
+
psql $HTM_DBURL -c "UPDATE nodes SET value = value;"
|
|
296
|
+
# This triggers embedding regeneration for all nodes
|
|
297
|
+
```
|
|
298
|
+
|
|
299
|
+
### Code Migration
|
|
300
|
+
|
|
301
|
+
```ruby
|
|
302
|
+
# Before pgai
|
|
303
|
+
embedding = embedding_service.embed(text)
|
|
304
|
+
htm.add_node(key, value, embedding: embedding)
|
|
305
|
+
|
|
306
|
+
# After pgai
|
|
307
|
+
htm.add_node(key, value)
|
|
308
|
+
# Embedding generated automatically!
|
|
309
|
+
|
|
310
|
+
# Search also simplified
|
|
311
|
+
# Before: generate embedding in Ruby, pass to SQL
|
|
312
|
+
query_embedding = embedding_service.embed(query)
|
|
313
|
+
results = ltm.search(timeframe, query_embedding)
|
|
314
|
+
|
|
315
|
+
# After: pgai generates embedding in SQL
|
|
316
|
+
results = ltm.search(timeframe, query_text)
|
|
317
|
+
# ai.ollama_embed() called in SQL automatically
|
|
318
|
+
```
|
|
319
|
+
|
|
320
|
+
---
|
|
321
|
+
|
|
322
|
+
## Risks and Mitigations
|
|
323
|
+
|
|
324
|
+
### Risk: pgai Not Available
|
|
325
|
+
|
|
326
|
+
!!! danger "Risk"
|
|
327
|
+
Users without TimescaleDB Cloud or self-hosted pgai cannot use HTM
|
|
328
|
+
|
|
329
|
+
**Likelihood**: Medium (requires infrastructure change)
|
|
330
|
+
|
|
331
|
+
**Impact**: High (blocking)
|
|
332
|
+
|
|
333
|
+
**Mitigation**:
|
|
334
|
+
|
|
335
|
+
- Document pgai requirement prominently in README
|
|
336
|
+
- Provide TimescaleDB Cloud setup guide
|
|
337
|
+
- Link to pgai installation instructions for self-hosted
|
|
338
|
+
- Consider fallback to Ruby-side embeddings (future)
|
|
339
|
+
|
|
340
|
+
### Risk: Ollama Connection Fails
|
|
341
|
+
|
|
342
|
+
!!! warning "Risk"
|
|
343
|
+
Database trigger fails if Ollama is not running
|
|
344
|
+
|
|
345
|
+
**Likelihood**: Medium (Ollama must be running)
|
|
346
|
+
|
|
347
|
+
**Impact**: High (INSERT operations fail)
|
|
348
|
+
|
|
349
|
+
**Mitigation**:
|
|
350
|
+
|
|
351
|
+
- Clear error messages from trigger
|
|
352
|
+
- Document Ollama setup requirements
|
|
353
|
+
- Health check scripts for Ollama
|
|
354
|
+
- Retry logic in trigger (future enhancement)
|
|
355
|
+
|
|
356
|
+
### Risk: Embedding Dimension Mismatch
|
|
357
|
+
|
|
358
|
+
!!! info "Risk"
|
|
359
|
+
Changing the embedding model requires resizing the vector column
|
|
360
|
+
|
|
361
|
+
**Likelihood**: Low (rare model changes)
|
|
362
|
+
|
|
363
|
+
**Impact**: Medium (migration required)
|
|
364
|
+
|
|
365
|
+
**Mitigation**:
|
|
366
|
+
|
|
367
|
+
- Validate dimensions during configuration
|
|
368
|
+
- Raise error if mismatch detected
|
|
369
|
+
- Document migration procedure
|
|
370
|
+
- Store dimension in schema metadata
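The first two mitigations (validate, then raise on mismatch) could look roughly like the following Ruby sketch. It assumes the expected dimension is already known from configuration; `DimensionMismatchError` and `validate_embedding_dimension!` are hypothetical names, not HTM's API.

```ruby
# Hypothetical guard: compare an embedding's length with the configured
# dimension and fail loudly instead of storing a mismatched vector.
class DimensionMismatchError < StandardError; end

def validate_embedding_dimension!(embedding, expected_dimension)
  actual = embedding.length
  return embedding if actual == expected_dimension

  raise DimensionMismatchError,
        "expected #{expected_dimension} dimensions, got #{actual}"
end
```

For example, `validate_embedding_dimension!(vector, 768)` returns `vector` unchanged when it has 768 elements and raises otherwise.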
|
|
371
|
+
|
|
372
|
+
### Risk: Performance Degradation
|
|
373
|
+
|
|
374
|
+
!!! info "Risk"
|
|
375
|
+
Large batch inserts are slower due to trigger overhead
|
|
376
|
+
|
|
377
|
+
**Likelihood**: Low (benchmarks show improvement)
|
|
378
|
+
|
|
379
|
+
**Impact**: Low (batch operations less common)
|
|
380
|
+
|
|
381
|
+
**Mitigation**:
|
|
382
|
+
|
|
383
|
+
- Benchmark batch operations
|
|
384
|
+
- Provide bulk import optimizations
|
|
385
|
+
- Document COPY command optimization
|
|
386
|
+
- Consider disabling the trigger (`ALTER TABLE nodes DISABLE TRIGGER nodes_generate_embedding`) during bulk imports (future)
|
|
387
|
+
|
|
388
|
+
---
|
|
389
|
+
|
|
390
|
+
## Future Enhancements
|
|
391
|
+
|
|
392
|
+
### 1. Additional Providers
|
|
393
|
+
|
|
394
|
+
```sql
|
|
395
|
+
-- Support more embedding providers via pgai
|
|
396
|
+
IF embedding_provider = 'cohere' THEN
|
|
397
|
+
generated_embedding := ai.cohere_embed(...);
|
|
398
|
+
ELSIF embedding_provider = 'voyage' THEN
|
|
399
|
+
generated_embedding := ai.voyage_embed(...);
|
|
400
|
+
END IF;
|
|
401
|
+
```
|
|
402
|
+
|
|
403
|
+
### 2. Conditional Embedding Generation
|
|
404
|
+
|
|
405
|
+
```sql
|
|
406
|
+
-- Only generate embeddings for certain types
|
|
407
|
+
WHEN (NEW.type IN ('fact', 'decision', 'code'))
|
|
408
|
+
```
|
|
409
|
+
|
|
410
|
+
### 3. Embedding Caching
|
|
411
|
+
|
|
412
|
+
```sql
|
|
413
|
+
-- Cache embeddings for repeated text
|
|
414
|
+
CREATE TABLE embedding_cache (
|
|
415
|
+
text_hash TEXT PRIMARY KEY,
|
|
416
|
+
embedding vector(768),
|
|
417
|
+
created_at TIMESTAMP
|
|
418
|
+
);
|
|
419
|
+
```
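The same idea can be sketched in memory with Ruby: look up an embedding by a hash of its text and compute it only on a miss. This is an illustrative analogue of the `embedding_cache` table, not HTM code. The class name `EmbeddingCache` is hypothetical, and SHA-256 is an assumed choice for `text_hash`.

```ruby
require "digest"

# Hypothetical in-memory analogue of the embedding_cache table sketched above.
class EmbeddingCache
  def initialize
    @store = {}
  end

  # Returns the cached embedding for `text`. On a miss, calls the given block
  # to compute the embedding and stores it under the SHA-256 hash of the text.
  def fetch(text)
    key = Digest::SHA256.hexdigest(text)
    @store.fetch(key) { @store[key] = yield(text) }
  end

  # Number of distinct texts cached so far.
  def size
    @store.size
  end
end
```

A caller would wrap its embedding call in the block, for example `cache.fetch(text) { |t| embed(t) }`, where `embed` stands in for whatever generates the vector.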
|
|
420
|
+
|
|
421
|
+
### 4. Retry Logic
|
|
422
|
+
|
|
423
|
+
```sql
|
|
424
|
+
-- Retry failed embedding generation
|
|
425
|
+
BEGIN
|
|
426
|
+
generated_embedding := ai.ollama_embed(...);
|
|
427
|
+
EXCEPTION
|
|
428
|
+
WHEN OTHERS THEN
|
|
429
|
+
-- Retry once after a fixed 1-second delay
|
|
430
|
+
PERFORM pg_sleep(1);
|
|
431
|
+
generated_embedding := ai.ollama_embed(...);
|
|
432
|
+
END;
|
|
433
|
+
```
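The SQL sketch above retries once after a one-second pause. For comparison, here is a minimal Ruby sketch of retrying with exponentially growing delays, which would fit the client-side approach. The helper `with_retries` and its parameters are hypothetical, and the `sleeper` argument exists only so the delays can be observed or skipped in tests.

```ruby
# Hypothetical helper: run the block, retrying on StandardError with delays
# of base_delay, 2 * base_delay, 4 * base_delay, ... between attempts.
def with_retries(max_attempts: 3, base_delay: 1.0, sleeper: ->(seconds) { sleep(seconds) })
  attempt = 0
  begin
    attempt += 1
    yield attempt
  rescue StandardError
    # Give up and re-raise the original error once attempts are exhausted.
    raise if attempt >= max_attempts

    sleeper.call(base_delay * (2**(attempt - 1)))
    retry
  end
end
```

A call such as `with_retries { embed(text) }` would then re-run the embedding request up to three times; `embed` is a placeholder for the actual call.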
|
|
434
|
+
|
|
435
|
+
### 5. Embedding Versioning
|
|
436
|
+
|
|
437
|
+
```sql
|
|
438
|
+
-- Track embedding model version
|
|
439
|
+
ALTER TABLE nodes ADD COLUMN embedding_model_version TEXT;
|
|
440
|
+
NEW.embedding_model_version := embedding_model;
|
|
441
|
+
```
|
|
442
|
+
|
|
443
|
+
---
|
|
444
|
+
|
|
445
|
+
## Alternatives Comparison
|
|
446
|
+
|
|
447
|
+
| Approach | Performance | Complexity | Maintainability | Decision |
|
|
448
|
+
|----------|------------|------------|-----------------|----------|
|
|
449
|
+
| **pgai Triggers** | **Fastest** | **Medium** | **Best** | **ACCEPTED** |
|
|
450
|
+
| Ruby HTTP Calls | Slower | Simple | Good | Rejected |
|
|
451
|
+
| Background Jobs | Medium | High | Medium | Rejected |
|
|
452
|
+
| Hybrid (optional pgai) | Variable | Very High | Poor | Rejected |
|
|
453
|
+
|
|
454
|
+
---
|
|
455
|
+
|
|
456
|
+
## References
|
|
457
|
+
|
|
458
|
+
- [pgai GitHub](https://github.com/timescale/pgai)
|
|
459
|
+
- [pgai Documentation](https://github.com/timescale/pgai/blob/main/docs/README.md)
|
|
460
|
+
- [pgai Vectorizer Guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer.md)
|
|
461
|
+
- [TimescaleDB Cloud](https://console.cloud.timescale.com/)
|
|
462
|
+
- [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) - **Superseded by this ADR**
|
|
463
|
+
- [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) - **Updated for pgai**
|
|
464
|
+
- [PostgreSQL Triggers](https://www.postgresql.org/docs/current/plpgsql-trigger.html)
|
|
465
|
+
|
|
466
|
+
---
|
|
467
|
+
|
|
468
|
+
## Review Notes
|
|
469
|
+
|
|
470
|
+
**AI Engineer**: Database-side embedding generation is the right architectural choice. Performance gains are significant.
|
|
471
|
+
|
|
472
|
+
**Database Architect**: pgai triggers are well-designed. Consider retry logic for production robustness.
|
|
473
|
+
|
|
474
|
+
**Performance Specialist**: Benchmarks confirm 10-20% improvement. Connection pooling pays off.
|
|
475
|
+
|
|
476
|
+
**Systems Architect**: Clear separation of concerns. Embedding logic belongs in the data layer.
|
|
477
|
+
|
|
478
|
+
**Ruby Expert**: Simplified Ruby code is easier to maintain. Less surface area for bugs.
|
|
479
|
+
|
|
480
|
+
---
|
|
481
|
+
|
|
482
|
+
## Supersedes
|
|
483
|
+
|
|
484
|
+
This ADR supersedes:
|
|
485
|
+
- [ADR-003: Ollama as Default Embedding Provider](003-ollama-embeddings.md) (architecture changed, provider choice remains)
|
|
486
|
+
|
|
487
|
+
Updates:
|
|
488
|
+
- [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md) (query embeddings now via pgai)
|
|
489
|
+
|
|
490
|
+
---
|
|
491
|
+
|
|
492
|
+
## Changelog
|
|
493
|
+
|
|
494
|
+
- **2025-10-26**: Initial version - full migration to pgai-based embedding generation
|
|
@@ -0,0 +1,215 @@
|
|
|
1
|
+
# Architecture Decision Records (ADRs)
|
|
2
|
+
|
|
3
|
+
## Introduction
|
|
4
|
+
|
|
5
|
+
Architecture Decision Records (ADRs) document significant architectural decisions made during the development of HTM (Hierarchical Temporal Memory). Each ADR captures the context, decision, rationale, and consequences of important design choices.
|
|
6
|
+
|
|
7
|
+
## What are ADRs?
|
|
8
|
+
|
|
9
|
+
Architecture Decision Records are lightweight documents that capture important architectural decisions along with their context and consequences. They serve as a historical record of why decisions were made, helping current and future developers understand the system's design.
|
|
10
|
+
|
|
11
|
+
### Key Benefits
|
|
12
|
+
|
|
13
|
+
- **Historical Context**: Understand why decisions were made
|
|
14
|
+
- **Knowledge Transfer**: Onboard new team members faster
|
|
15
|
+
- **Decision Tracking**: See how the architecture evolved over time
|
|
16
|
+
- **Avoid Revisiting**: Prevent rehashing settled decisions
|
|
17
|
+
|
|
18
|
+
## ADR Structure
|
|
19
|
+
|
|
20
|
+
Each ADR follows a consistent structure:
|
|
21
|
+
|
|
22
|
+
- **Status**: Current state (Accepted, Proposed, Rejected, Deprecated, Superseded)
|
|
23
|
+
- **Date**: When the decision was made
|
|
24
|
+
- **Decision Makers**: Who participated in the decision
|
|
25
|
+
- **Quick Summary**: TL;DR of the decision
|
|
26
|
+
- **Context**: Background and problem statement
|
|
27
|
+
- **Decision**: What was decided
|
|
28
|
+
- **Rationale**: Why this decision was made
|
|
29
|
+
- **Consequences**: Positive, negative, and neutral outcomes
|
|
30
|
+
- **Alternatives Considered**: What other options were evaluated
|
|
31
|
+
- **References**: Related documentation and resources
|
|
32
|
+
|
|
33
|
+
## ADR Status Legend
|
|
34
|
+
|
|
35
|
+
| Status | Meaning |
|
|
36
|
+
|--------|---------|
|
|
37
|
+
| **Accepted** | Decision is approved and implemented |
|
|
38
|
+
| **Proposed** | Decision is under consideration |
|
|
39
|
+
| **Rejected** | Decision was considered but not adopted |
|
|
40
|
+
| **Deprecated** | Decision is no longer recommended |
|
|
41
|
+
| **Superseded** | Decision has been replaced by another ADR |
|
|
42
|
+
|
|
43
|
+
## How to Read ADRs
|
|
44
|
+
|
|
45
|
+
1. **Start with Quick Summary**: Get the high-level decision quickly
|
|
46
|
+
2. **Read Context**: Understand the problem being solved
|
|
47
|
+
3. **Review Decision and Rationale**: See what was chosen and why
|
|
48
|
+
4. **Consider Consequences**: Understand trade-offs and implications
|
|
49
|
+
5. **Check Alternatives**: See what else was considered
|
|
50
|
+
|
|
51
|
+
## Complete ADR List
|
|
52
|
+
|
|
53
|
+
### ADR-001: PostgreSQL with TimescaleDB for Storage
|
|
54
|
+
|
|
55
|
+
**Status**: Accepted | **Date**: 2025-10-25
|
|
56
|
+
|
|
57
|
+
PostgreSQL with TimescaleDB extension chosen as the primary storage backend, providing time-series optimization, vector embeddings, full-text search, and ACID compliance in a single database system.
|
|
58
|
+
|
|
59
|
+
**Key Decision**: Use PostgreSQL + TimescaleDB instead of specialized vector databases or multiple storage systems.
|
|
60
|
+
|
|
61
|
+
**Read more**: [ADR-001: PostgreSQL with TimescaleDB](001-postgresql-timescaledb.md)
|
|
62
|
+
|
|
63
|
+
---
|
|
64
|
+
|
|
65
|
+
### ADR-002: Two-Tier Memory Architecture
|
|
66
|
+
|
|
67
|
+
**Status**: Accepted | **Date**: 2025-10-25
|
|
68
|
+
|
|
69
|
+
Implementation of a two-tier memory system with token-limited working memory (hot tier) and unlimited long-term memory (cold tier) to manage LLM context windows while preserving all historical data.
|
|
70
|
+
|
|
71
|
+
**Key Decision**: Separate fast working memory from durable long-term storage with RAG-based retrieval.
|
|
72
|
+
|
|
73
|
+
**Read more**: [ADR-002: Two-Tier Memory Architecture](002-two-tier-memory.md)
|
|
74
|
+
|
|
75
|
+
---

### ADR-003: Ollama as Default Embedding Provider

**Status**: Accepted | **Date**: 2025-10-25

Ollama with the gpt-oss model selected as the default embedding provider, prioritizing local-first, privacy-preserving operation with zero API costs while supporting pluggable alternatives.

**Key Decision**: Local embeddings by default, with support for cloud providers (OpenAI, Cohere) as options.

**Read more**: [ADR-003: Ollama Default Embedding Provider](003-ollama-embeddings.md)

---

### ADR-004: Multi-Robot Shared Memory (Hive Mind)

**Status**: Accepted | **Date**: 2025-10-25

All robots share a single global memory database with attribution tracking, enabling seamless context sharing and cross-robot learning while maintaining individual robot identity.

**Key Decision**: Shared global memory instead of per-robot isolation, with attribution via robot_id.

**Read more**: [ADR-004: Multi-Robot Shared Memory](004-hive-mind.md)

---

### ADR-005: RAG-Based Retrieval with Hybrid Search

**Status**: Accepted | **Date**: 2025-10-25

Three search strategies implemented (vector, full-text, hybrid) with temporal filtering, allowing users to choose the best approach for their query type while combining semantic understanding with keyword precision.

**Key Decision**: Hybrid search as default, combining full-text pre-filtering with vector reranking.
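As a rough illustration of the two-stage idea (keyword pre-filter, then vector rerank), the sketch below works on an in-memory array rather than PostgreSQL. The `Memory` struct, `hybrid_search`, and the tiny two-dimensional embeddings are hypothetical stand-ins, not the gem's API; the real implementation is described in the full ADR.

```ruby
# Hypothetical in-memory stand-in for database-backed memories.
Memory = Struct.new(:content, :embedding)

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x })
  norm.zero? ? 0.0 : dot / norm
end

# Stage 1: keep only memories containing at least one keyword.
# Stage 2: order the survivors by similarity to the query embedding.
def hybrid_search(memories, keywords:, query_embedding:, limit: 5)
  terms = keywords.map(&:downcase)

  candidates = memories.select do |memory|
    text = memory.content.downcase
    terms.any? { |term| text.include?(term) }
  end

  candidates
    .sort_by { |memory| -cosine_similarity(memory.embedding, query_embedding) }
    .first(limit)
end

memories = [
  Memory.new("PostgreSQL stores all long-term memories", [1.0, 0.0]),
  Memory.new("PostgreSQL full-text search uses tsvector", [0.0, 1.0]),
  Memory.new("Ollama generates embeddings locally", [0.9, 0.1])
]

results = hybrid_search(memories, keywords: ["postgresql"], query_embedding: [0.1, 0.9])
results.each { |memory| puts memory.content }
```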

**Read more**: [ADR-005: RAG-Based Retrieval](005-rag-retrieval.md)

---

### ADR-006: Context Assembly Strategies

**Status**: Accepted | **Date**: 2025-10-25

Three context assembly strategies (recent, important, balanced) for selecting which memories to include when token limits prevent loading all working memory, with balanced as the recommended default.

**Key Decision**: Multiple strategies for different use cases, with importance-weighted recency decay as default.
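One way to picture the three strategies is as three orderings over the same candidate set, followed by a greedy fill up to the token budget. The sketch below assumes a decay of `importance / (1 + age_hours)` for the balanced case; that formula, the `Item` struct, and `assemble_context` are illustrative assumptions, and the scoring actually chosen is documented in the full ADR.

```ruby
# Hypothetical working-memory item; field names are assumptions.
Item = Struct.new(:key, :importance, :age_hours, :tokens)

# Assumed balanced score: importance discounted by age.
def balanced_score(item)
  item.importance / (1.0 + item.age_hours)
end

# Order items per strategy, then take them greedily while they fit the budget.
def assemble_context(items, strategy:, max_tokens:)
  ordered =
    case strategy
    when :recent    then items.sort_by(&:age_hours)
    when :important then items.sort_by { |item| -item.importance }
    when :balanced  then items.sort_by { |item| -balanced_score(item) }
    else raise ArgumentError, "unknown strategy: #{strategy.inspect}"
    end

  used = 0
  ordered.each_with_object([]) do |item, selected|
    next if used + item.tokens > max_tokens

    selected << item
    used += item.tokens
  end
end

items = [
  Item.new("old_critical", 10.0, 48.0, 50),
  Item.new("new_trivial", 1.0, 0.0, 50),
  Item.new("recent_useful", 6.0, 1.0, 50)
]

%i[recent important balanced].each do |strategy|
  keys = assemble_context(items, strategy: strategy, max_tokens: 100).map(&:key)
  puts "#{strategy}: #{keys.join(', ')}"
end
```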

**Read more**: [ADR-006: Context Assembly Strategies](006-context-assembly.md)

---

### ADR-007: Working Memory Eviction Strategy

**Status**: Accepted | **Date**: 2025-10-25

Hybrid eviction policy combining importance and recency scoring, evicting low-importance older memories first while preserving all data in long-term storage (never-forget principle).

**Key Decision**: Eviction moves to long-term storage, never deletes. Primary sort by importance, secondary by age.
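The ordering described here (lowest importance first, oldest first among equals) can be sketched with a compound sort key. `WorkingMemorySketch`, `Entry`, and the hash used as a long-term tier are hypothetical stand-ins for illustration only; the point is that an evicted entry leaves the hot tier but stays in the durable one.

```ruby
# Hypothetical entry; field names are assumptions.
Entry = Struct.new(:key, :importance, :created_at, :tokens)

class WorkingMemorySketch
  attr_reader :entries, :long_term

  def initialize(max_tokens:)
    @max_tokens = max_tokens
    @entries = []
    @long_term = {} # stand-in for the durable long-term tier
  end

  # Persist first, then make room in the token-limited tier.
  def add(entry)
    @long_term[entry.key] = entry
    evict_until_fits(entry.tokens)
    @entries << entry
  end

  def token_count
    @entries.sum(&:tokens)
  end

  private

  # Primary sort by importance (ascending), secondary by age (oldest first).
  def evict_until_fits(needed)
    candidates = @entries.sort_by { |e| [e.importance, e.created_at] }
    while token_count + needed > @max_tokens && !candidates.empty?
      victim = candidates.shift
      @entries.reject! { |e| e.equal?(victim) } # removed from working memory only
    end
  end
end

wm = WorkingMemorySketch.new(max_tokens: 100)
wm.add(Entry.new(:a, 1.0, Time.at(100), 40))
wm.add(Entry.new(:b, 1.0, Time.at(50), 40))
wm.add(Entry.new(:c, 5.0, Time.at(10), 20))
wm.add(Entry.new(:d, 3.0, Time.at(200), 30)) # forces one eviction

puts wm.entries.map(&:key).inspect
puts wm.long_term.keys.inspect
```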

**Read more**: [ADR-007: Working Memory Eviction](007-eviction-strategy.md)

---

### ADR-008: Robot Identification System

**Status**: Accepted | **Date**: 2025-10-25

Dual-identifier system using UUID v4 for unique robot_id plus optional human-readable robot_name, with automatic generation if not provided and comprehensive robot registry tracking.

**Key Decision**: UUID for uniqueness, name for readability, auto-generation for convenience.
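A minimal sketch of the dual-identifier idea, using Ruby's standard `SecureRandom.uuid` (which produces a version 4 UUID). The `RobotRegistrySketch` class and the `robot_<prefix>` fallback name are assumptions made for illustration; the ADR itself defines the actual generation rules and registry schema.

```ruby
require "securerandom"

# Hypothetical robot record pairing the two identifiers.
Robot = Struct.new(:robot_id, :robot_name)

class RobotRegistrySketch
  def initialize
    @robots = {}
  end

  # Both arguments are optional; missing values are generated.
  def register(robot_id: nil, robot_name: nil)
    id   = robot_id || SecureRandom.uuid      # UUID v4 for uniqueness
    name = robot_name || "robot_#{id[0, 8]}"  # assumed fallback naming scheme
    @robots[id] ||= Robot.new(id, name)
  end

  def find(robot_id)
    @robots[robot_id]
  end
end

registry = RobotRegistrySketch.new
named = registry.register(robot_name: "Assistant")
anonymous = registry.register

puts "#{named.robot_name}: #{named.robot_id}"
puts "#{anonymous.robot_name}: #{anonymous.robot_id}"
```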

**Read more**: [ADR-008: Robot Identification](008-robot-identification.md)

---

### ADR-009: Never-Forget Philosophy with Explicit Deletion

**Status**: Accepted | **Date**: 2025-10-25

Never-forget philosophy where memories are never automatically deleted, eviction only moves data between tiers, and deletion requires explicit confirmation to prevent accidental data loss.

**Key Decision**: Permanent storage by default, deletion only via `forget(confirm: :confirmed)`.
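The confirmation guard can be pictured as a keyword argument that must carry an exact sentinel value. In the sketch below only the `confirm: :confirmed` keyword comes from this summary; the `MemoryStoreSketch` class, the positional `key` argument, and the choice of `ArgumentError` are assumptions for illustration.

```ruby
# Hypothetical store showing a confirm-guarded delete.
class MemoryStoreSketch
  def initialize
    @nodes = {}
  end

  def remember(key, value)
    @nodes[key] = value
  end

  def recall(key)
    @nodes[key]
  end

  # Refuses to delete unless the caller passes confirm: :confirmed.
  # Returns true if something was deleted, false if the key was absent.
  def forget(key, confirm: nil)
    unless confirm == :confirmed
      raise ArgumentError, "deletion requires confirm: :confirmed"
    end

    !@nodes.delete(key).nil?
  end
end

store = MemoryStoreSketch.new
store.remember("decision_001", "Use PostgreSQL")

begin
  store.forget("decision_001") # no confirmation, so nothing is deleted
rescue ArgumentError => e
  puts "refused: #{e.message}"
end

puts store.recall("decision_001")
store.forget("decision_001", confirm: :confirmed)
puts store.recall("decision_001").inspect
```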

**Read more**: [ADR-009: Never-Forget Philosophy](009-never-forget.md)

---

### ADR-010: Redis-Based Working Memory (Rejected)

**Status**: Rejected | **Date**: 2025-10-25

A proposal to add Redis as a persistent storage layer for working memory was analyzed and rejected. PostgreSQL already provides durability, working memory is ephemeral by design, and Redis would add complexity without solving a proven problem.

**Key Decision**: Keep the two-tier architecture with in-memory working memory. Trust PostgreSQL for durability. Apply the YAGNI principle.

**Why Rejected**: Unnecessary complexity, performance penalty, operational burden, and no proven requirement. PostgreSQL already handles multi-process sharing and crash recovery.

**Read more**: [ADR-010: Redis Working Memory (Rejected)](010-redis-working-memory-rejected.md)

---

## ADR Dependencies

```
ADR-001 (Storage)
  └─> ADR-002 (Two-Tier Memory)
      ├─> ADR-007 (Eviction Strategy)
      ├─> ADR-009 (Never-Forget)
      └─> ADR-010 (Redis WM - Rejected Alternative)
  └─> ADR-003 (Embeddings)
      └─> ADR-005 (RAG Retrieval)
  └─> ADR-004 (Hive Mind)
      └─> ADR-008 (Robot ID)
  └─> ADR-006 (Context Assembly)
```

## Related Documentation

- [HTM API Guide](../../api/index.md)
- [Database Schema](../../development/schema.md)
- [Configuration Guide](../../installation.md)
- [Development Workflow](../../development/index.md)

## Contributing to ADRs

When making significant architectural decisions:

1. Create a new ADR using the next sequential number
2. Follow the established structure and format
3. Include thorough context, rationale, and consequences
4. Document alternatives considered and why they were rejected
5. Update this index with a summary
6. Link related documentation

## Questions?

For questions about architectural decisions, please:

- Review the specific ADR documentation
- Check the related guides and API documentation
- Open a GitHub issue for clarification
- Consult the development team