htm 0.0.1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +7 -0
- data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
- data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
- data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
- data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
- data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
- data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
- data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
- data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
- data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
- data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
- data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
- data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
- data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
- data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
- data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
- data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
- data/.architecture/members.yml +144 -0
- data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
- data/.architecture/reviews/initial-system-analysis.md +330 -0
- data/.envrc +32 -0
- data/.irbrc +145 -0
- data/CHANGELOG.md +150 -0
- data/COMMITS.md +196 -0
- data/LICENSE +21 -0
- data/README.md +1347 -0
- data/Rakefile +51 -0
- data/SETUP.md +268 -0
- data/config/database.yml +67 -0
- data/db/migrate/20250101000001_enable_extensions.rb +14 -0
- data/db/migrate/20250101000002_create_robots.rb +14 -0
- data/db/migrate/20250101000003_create_nodes.rb +42 -0
- data/db/migrate/20250101000005_create_tags.rb +38 -0
- data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
- data/db/schema.sql +473 -0
- data/db/seed_data/README.md +100 -0
- data/db/seed_data/presidents.md +136 -0
- data/db/seed_data/states.md +151 -0
- data/db/seeds.rb +208 -0
- data/dbdoc/README.md +173 -0
- data/dbdoc/public.node_stats.md +48 -0
- data/dbdoc/public.node_stats.svg +41 -0
- data/dbdoc/public.node_tags.md +40 -0
- data/dbdoc/public.node_tags.svg +112 -0
- data/dbdoc/public.nodes.md +54 -0
- data/dbdoc/public.nodes.svg +118 -0
- data/dbdoc/public.nodes_tags.md +39 -0
- data/dbdoc/public.nodes_tags.svg +112 -0
- data/dbdoc/public.ontology_structure.md +48 -0
- data/dbdoc/public.ontology_structure.svg +38 -0
- data/dbdoc/public.operations_log.md +42 -0
- data/dbdoc/public.operations_log.svg +130 -0
- data/dbdoc/public.relationships.md +39 -0
- data/dbdoc/public.relationships.svg +41 -0
- data/dbdoc/public.robot_activity.md +46 -0
- data/dbdoc/public.robot_activity.svg +35 -0
- data/dbdoc/public.robots.md +35 -0
- data/dbdoc/public.robots.svg +90 -0
- data/dbdoc/public.schema_migrations.md +29 -0
- data/dbdoc/public.schema_migrations.svg +26 -0
- data/dbdoc/public.tags.md +35 -0
- data/dbdoc/public.tags.svg +60 -0
- data/dbdoc/public.topic_relationships.md +45 -0
- data/dbdoc/public.topic_relationships.svg +32 -0
- data/dbdoc/schema.json +1437 -0
- data/dbdoc/schema.svg +154 -0
- data/docs/api/database.md +806 -0
- data/docs/api/embedding-service.md +532 -0
- data/docs/api/htm.md +797 -0
- data/docs/api/index.md +259 -0
- data/docs/api/long-term-memory.md +1096 -0
- data/docs/api/working-memory.md +665 -0
- data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
- data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
- data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
- data/docs/architecture/adrs/004-hive-mind.md +437 -0
- data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
- data/docs/architecture/adrs/006-context-assembly.md +496 -0
- data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
- data/docs/architecture/adrs/008-robot-identification.md +625 -0
- data/docs/architecture/adrs/009-never-forget.md +648 -0
- data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
- data/docs/architecture/adrs/011-pgai-integration.md +494 -0
- data/docs/architecture/adrs/index.md +215 -0
- data/docs/architecture/hive-mind.md +736 -0
- data/docs/architecture/index.md +351 -0
- data/docs/architecture/overview.md +538 -0
- data/docs/architecture/two-tier-memory.md +873 -0
- data/docs/assets/css/custom.css +83 -0
- data/docs/assets/images/htm-core-components.svg +63 -0
- data/docs/assets/images/htm-database-schema.svg +93 -0
- data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
- data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
- data/docs/assets/images/htm-layered-architecture.svg +71 -0
- data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
- data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
- data/docs/assets/images/htm.jpg +0 -0
- data/docs/assets/images/htm_demo.gif +0 -0
- data/docs/assets/js/mathjax.js +18 -0
- data/docs/assets/videos/htm_video.mp4 +0 -0
- data/docs/database_rake_tasks.md +322 -0
- data/docs/development/contributing.md +787 -0
- data/docs/development/index.md +336 -0
- data/docs/development/schema.md +596 -0
- data/docs/development/setup.md +719 -0
- data/docs/development/testing.md +819 -0
- data/docs/guides/adding-memories.md +824 -0
- data/docs/guides/context-assembly.md +1009 -0
- data/docs/guides/getting-started.md +577 -0
- data/docs/guides/index.md +118 -0
- data/docs/guides/long-term-memory.md +941 -0
- data/docs/guides/multi-robot.md +866 -0
- data/docs/guides/recalling-memories.md +927 -0
- data/docs/guides/search-strategies.md +953 -0
- data/docs/guides/working-memory.md +717 -0
- data/docs/index.md +214 -0
- data/docs/installation.md +477 -0
- data/docs/multi_framework_support.md +519 -0
- data/docs/quick-start.md +655 -0
- data/docs/setup_local_database.md +302 -0
- data/docs/using_rake_tasks_in_your_app.md +383 -0
- data/examples/basic_usage.rb +93 -0
- data/examples/cli_app/README.md +317 -0
- data/examples/cli_app/htm_cli.rb +270 -0
- data/examples/custom_llm_configuration.rb +183 -0
- data/examples/example_app/Rakefile +71 -0
- data/examples/example_app/app.rb +206 -0
- data/examples/sinatra_app/Gemfile +21 -0
- data/examples/sinatra_app/app.rb +335 -0
- data/lib/htm/active_record_config.rb +113 -0
- data/lib/htm/configuration.rb +342 -0
- data/lib/htm/database.rb +594 -0
- data/lib/htm/embedding_service.rb +115 -0
- data/lib/htm/errors.rb +34 -0
- data/lib/htm/job_adapter.rb +154 -0
- data/lib/htm/jobs/generate_embedding_job.rb +65 -0
- data/lib/htm/jobs/generate_tags_job.rb +82 -0
- data/lib/htm/long_term_memory.rb +965 -0
- data/lib/htm/models/node.rb +109 -0
- data/lib/htm/models/node_tag.rb +33 -0
- data/lib/htm/models/robot.rb +52 -0
- data/lib/htm/models/tag.rb +76 -0
- data/lib/htm/railtie.rb +76 -0
- data/lib/htm/sinatra.rb +157 -0
- data/lib/htm/tag_service.rb +135 -0
- data/lib/htm/tasks.rb +38 -0
- data/lib/htm/version.rb +5 -0
- data/lib/htm/working_memory.rb +182 -0
- data/lib/htm.rb +400 -0
- data/lib/tasks/db.rake +19 -0
- data/lib/tasks/htm.rake +147 -0
- data/lib/tasks/jobs.rake +312 -0
- data/mkdocs.yml +190 -0
- data/scripts/install_local_database.sh +309 -0
- metadata +341 -0
@@ -0,0 +1,694 @@

# ADR-016: Async Embedding and Tag Generation with Background Jobs

**Status**: Accepted

**Date**: 2025-10-29

**Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)

---

## Context

The initial architecture (ADR-014, ADR-015) proposed synchronous embedding generation before node save, which would add 50-500ms of latency to every node creation. For a responsive user experience, we need to:

1. Save nodes immediately (fast response)
2. Generate embeddings asynchronously
3. Generate tags asynchronously
4. Handle failures gracefully without blocking the user

### User Experience Requirements

**Fast Node Creation**:
- User creates a memory/message
- System responds immediately (< 50ms)
- Embedding and tagging happen in the background
- User doesn't wait for LLM operations

**Eventual Consistency**:
- Node available immediately for retrieval
- Embedding added when ready (enables vector search)
- Tags added when ready (enables hierarchical navigation)
- System remains usable while jobs are processing

---

## Decision

We will use **async-job** for background processing, with two parallel jobs triggered on node creation:

1. **Save node immediately** (no embedding, no tags)
2. **Enqueue `GenerateEmbeddingJob`** to add the embedding
3. **Enqueue `GenerateTagsJob`** to extract and add tags

Both jobs have equal priority and run in parallel. Errors are logged; failed jobs are not retried and never block node creation.

---

## Architecture

### Node Creation Flow

```ruby
# 1. User API call
node = htm.add_message("PostgreSQL supports vector search via pgvector")

# 2. Node saved immediately to database
# - content: "PostgreSQL supports vector search via pgvector"
# - speaker: "user"
# - embedding: nil (will be added by job)
# - tags: none (will be added by job)
# Response time: ~10-20ms

# 3. Two async jobs enqueued (non-blocking)
GenerateEmbeddingJob.perform_later(node.id)  # Job 1
GenerateTagsJob.perform_later(node.id)       # Job 2

# 4. Jobs run in background (parallel, same priority)
# - Job 1: Generate embedding via EmbeddingService → Update node.embedding
# - Job 2: Generate tags via TagService → Create Tag records → Create NodeTag associations

# 5. Node is eventually fully enriched
# - Has embedding (enables vector search)
# - Has tags (enables hierarchical navigation)
```

### Component Architecture

```
┌──────────────────────────────────────────────────┐
│                   User Request                   │
│            add_message(content, ...)             │
└─────────────────────────┬────────────────────────┘
                          │
                          ▼
┌──────────────────────────────────────────────────┐
│                  HTM Main Class                  │
│  - Create Node record (immediate save)           │
│  - Enqueue GenerateEmbeddingJob                   │
│  - Enqueue GenerateTagsJob                        │
│  - Return node to user (fast response)           │
└────────────┬────────────────────────┬────────────┘
             │                        │
             │ Async                  │ Async
             │ (parallel)             │ (parallel)
             ▼                        ▼
┌──────────────────────────┐  ┌────────────────────────────┐
│  GenerateEmbeddingJob    │  │  GenerateTagsJob           │
│                          │  │                            │
│  1. Load Node            │  │  1. Load Node              │
│  2. EmbeddingService     │  │  2. Load existing ontology │
│  3. Generate embedding   │  │  3. TagService             │
│  4. Update node.embedding│  │  4. Extract tags           │
│  5. Log errors           │  │  5. Create Tag records     │
│                          │  │  6. Create NodeTag records │
│                          │  │  7. Log errors             │
└──────────────────────────┘  └────────────────────────────┘
```

---

## Implementation

### 1. TagService (New)

Parallel to `EmbeddingService`, handles LLM-based tag extraction:

```ruby
# lib/htm/tag_service.rb
class HTM::TagService
  # Default models for tag extraction
  DEFAULT_MODELS = {
    ollama: 'llama3',
    openai: 'gpt-4o-mini'
  }.freeze

  attr_reader :provider, :model

  # Initialize tag extraction service
  #
  # @param provider [Symbol] LLM provider (:ollama, :openai)
  # @param model [String] Model name
  # @param base_url [String] Base URL for Ollama
  #
  def initialize(provider = :ollama, model: nil, base_url: nil)
    @provider = provider
    @model = model || DEFAULT_MODELS[provider]
    @base_url = base_url || ENV['OLLAMA_URL'] || 'http://localhost:11434'
  end

  # Extract hierarchical tags from content
  #
  # @param content [String] Text to analyze
  # @param existing_ontology [Array<String>] Sample of existing tags for context
  # @return [Array<String>] Extracted tag names in format root:level1:level2
  #
  def extract_tags(content, existing_ontology: [])
    prompt = build_extraction_prompt(content, existing_ontology)
    response = call_llm(prompt)
    parse_and_validate_tags(response)
  end

  private

  def build_extraction_prompt(content, ontology_sample)
    ontology_context = if ontology_sample.any?
      sample_tags = ontology_sample.sample([ontology_sample.size, 20].min)
      "Existing ontology includes: #{sample_tags.join(', ')}\n"
    else
      "This is a new ontology - create appropriate hierarchical tags.\n"
    end

    <<~PROMPT
      Extract hierarchical topic tags from the following text.

      #{ontology_context}
      Format: root:level1:level2:level3 (use colons to separate levels)

      Rules:
      - Use lowercase letters, numbers, and hyphens only
      - Maximum depth: 5 levels
      - Return 2-5 tags per text
      - Tags should be reusable and consistent
      - Prefer existing ontology tags when applicable
      - Use hyphens for multi-word terms (e.g., natural-language-processing)

      Text: #{content}

      Return ONLY the topic tags, one per line, no explanations.
    PROMPT
  end

  def call_llm(prompt)
    case @provider
    when :ollama
      call_ollama(prompt)
    when :openai
      call_openai(prompt)
    else
      raise HTM::TagError, "Unknown provider: #{@provider}"
    end
  end

  def call_ollama(prompt)
    require 'net/http'
    require 'json'

    uri = URI("#{@base_url}/api/generate")
    request = Net::HTTP::Post.new(uri)
    request['Content-Type'] = 'application/json'
    request.body = JSON.generate({
      model: @model,
      prompt: prompt,
      stream: false,
      system: 'You are a precise topic extraction system. Output only topic tags in hierarchical format: root:subtopic:detail',
      options: {
        temperature: 0 # Deterministic output
      }
    })

    response = Net::HTTP.start(uri.hostname, uri.port) do |http|
      http.request(request)
    end

    unless response.is_a?(Net::HTTPSuccess)
      raise HTM::TagError, "Ollama API error: #{response.code} #{response.message}"
    end

    result = JSON.parse(response.body)
    result['response']
  rescue JSON::ParserError => e
    raise HTM::TagError, "Failed to parse Ollama response: #{e.message}"
  rescue StandardError => e
    raise HTM::TagError, "Failed to call Ollama: #{e.message}"
  end

  def call_openai(prompt)
    require 'net/http'
    require 'json'

    api_key = ENV['OPENAI_API_KEY']
    raise HTM::TagError, "OPENAI_API_KEY not set" unless api_key

    uri = URI('https://api.openai.com/v1/chat/completions')
    request = Net::HTTP::Post.new(uri)
    request['Content-Type'] = 'application/json'
    request['Authorization'] = "Bearer #{api_key}"
    request.body = JSON.generate({
      model: @model,
      messages: [
        {
          role: 'system',
          content: 'You are a precise topic extraction system. Output only topic tags in hierarchical format: root:subtopic:detail'
        },
        {
          role: 'user',
          content: prompt
        }
      ],
      temperature: 0
    })

    response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
      http.request(request)
    end

    unless response.is_a?(Net::HTTPSuccess)
      raise HTM::TagError, "OpenAI API error: #{response.code} #{response.message}"
    end

    result = JSON.parse(response.body)
    result.dig('choices', 0, 'message', 'content')
  rescue JSON::ParserError => e
    raise HTM::TagError, "Failed to parse OpenAI response: #{e.message}"
  rescue StandardError => e
    raise HTM::TagError, "Failed to call OpenAI: #{e.message}"
  end

  def parse_and_validate_tags(response)
    return [] if response.nil? || response.strip.empty?

    # Parse response (one tag per line)
    tags = response.split("\n").map(&:strip).reject(&:empty?)

    # Validate format: lowercase alphanumeric + hyphens + colons
    valid_tags = tags.select do |tag|
      tag =~ /^[a-z0-9\-]+(:[a-z0-9\-]+)*$/
    end

    # Limit depth to 5 levels (4 colons maximum)
    valid_tags.select { |tag| tag.count(':') < 5 }
  end
end
```
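
A minimal usage sketch (not part of the implementation above): it assumes the gem is loaded and a local Ollama server has the `llama3` model pulled; the printed tags are illustrative, since output varies by model.

```ruby
require 'htm'

# Build the service with the defaults described above.
service = HTM::TagService.new(:ollama, model: 'llama3')

# Extract tags, giving the LLM a small sample of the existing ontology.
tags = service.extract_tags(
  "PostgreSQL supports vector search via pgvector",
  existing_ontology: ['database:postgresql', 'ai:embeddings']
)

# Only tags passing the format validation are returned, e.g.:
#   ["database:postgresql:pgvector", "ai:vector-search"]
puts tags
```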

### 2. Background Jobs

Using the `async-job` gem:

```ruby
# lib/htm/jobs/generate_embedding_job.rb
require 'async/job'

class HTM::GenerateEmbeddingJob < Async::Job
  # Generate embedding for node content and update database
  #
  # @param node_id [Integer] ID of node to process
  #
  def perform(node_id)
    node = HTM::Models::Node.find(node_id)

    # Skip if already has embedding
    return if node.embedding.present?

    # Initialize embedding service
    embedding_service = HTM::EmbeddingService.new(
      :ollama,
      model: ENV['EMBEDDING_MODEL'] || 'nomic-embed-text'
    )

    # Generate embedding
    embedding = embedding_service.embed(node.content)

    # Update node
    node.update!(
      embedding: embedding,
      embedding_dimension: embedding.length
    )

    logger.info("Generated embedding for node #{node_id} (#{embedding.length} dimensions)")
  rescue HTM::EmbeddingError => e
    logger.error("Embedding generation failed for node #{node_id}: #{e.message}")
    # Don't retry - node remains without embedding
  rescue StandardError => e
    logger.error("Unexpected error generating embedding for node #{node_id}: #{e.class} - #{e.message}")
    logger.error(e.backtrace.join("\n"))
  end
end
```

```ruby
# lib/htm/jobs/generate_tags_job.rb
require 'async/job'

class HTM::GenerateTagsJob < Async::Job
  # Extract tags from node content and update database
  #
  # @param node_id [Integer] ID of node to process
  #
  def perform(node_id)
    node = HTM::Models::Node.find(node_id)

    # Skip if already has tags
    return if node.tags.any?

    # Initialize tag service
    tag_service = HTM::TagService.new(
      :ollama,
      model: ENV['TAG_MODEL'] || 'llama3'
    )

    # Get sample of existing ontology for context
    existing_tags = HTM::Models::Tag
      .order('RANDOM()') # PostgreSQL random sampling
      .limit(50)
      .pluck(:name)

    # Extract tags
    tag_names = tag_service.extract_tags(
      node.content,
      existing_ontology: existing_tags
    )

    # Create tags and associations
    tag_names.each do |tag_name|
      # Find or create tag record
      tag = HTM::Models::Tag.find_or_create_by(name: tag_name)

      # Create association (skip if already exists)
      HTM::Models::NodeTag.create(
        node_id: node.id,
        tag_id: tag.id
      )
    rescue ActiveRecord::RecordNotUnique
      # Tag association already exists, skip
      next
    end

    logger.info("Generated #{tag_names.size} tags for node #{node_id}: #{tag_names.join(', ')}")
  rescue HTM::TagError => e
    logger.error("Tag generation failed for node #{node_id}: #{e.message}")
    # Don't retry - node remains without tags
  rescue StandardError => e
    logger.error("Unexpected error generating tags for node #{node_id}: #{e.class} - #{e.message}")
    logger.error(e.backtrace.join("\n"))
  end
end
```

### 3. HTM Main Class Integration

```ruby
# lib/htm.rb
class HTM
  def add_message(content, speaker: 'user', type: nil, category: nil, importance: 1.0)
    # 1. Save node immediately (no embedding, no tags)
    node = @ltm.add(
      content: content,
      speaker: speaker,
      robot_id: @robot.id,
      type: type,
      category: category,
      importance: importance,
      token_count: @embedding_service.count_tokens(content)
    )

    # 2. Add to working memory
    @working_memory.add(node)

    # 3. Enqueue async jobs (non-blocking)
    GenerateEmbeddingJob.perform_later(node.id)
    GenerateTagsJob.perform_later(node.id)

    # 4. Return immediately
    node
  end
end
```
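
From the caller's point of view, the result looks roughly like the sketch below (illustrative only; the `HTM.new` constructor arguments are elided, and actual enrichment timing depends on the job backend and model speed):

```ruby
htm = HTM.new # constructor arguments omitted for brevity

node = htm.add_message("PostgreSQL supports vector search via pgvector")

node.embedding  # => nil   (GenerateEmbeddingJob has not finished yet)
node.tags.any?  # => false (GenerateTagsJob has not finished yet)

# Moments later, once both background jobs have completed:
node.reload
node.embedding.present? # => true
node.tags.map(&:name)   # => e.g. ["database:postgresql:pgvector"]
```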

### 4. Error Handling Class

```ruby
# lib/htm/errors.rb
class HTM
  class Error < StandardError; end
  class EmbeddingError < Error; end
  class TagError < Error; end
  class DatabaseError < Error; end
end
```

---

## Query Behavior with Async Jobs

### Vector Search

Nodes without embeddings are excluded automatically:

```ruby
# lib/htm/long_term_memory.rb
def vector_search(query_embedding:, limit: 10, **filters)
  HTM::Models::Node
    .where.not(embedding: nil) # Exclude nodes without embeddings
    .where(filters)
    .order(Arel.sql("embedding <=> ?::vector", query_embedding.to_s))
    .limit(limit)
end
```
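
A hedged usage sketch of this query path: `ltm` stands in for the `LongTermMemory` instance held by `HTM` (an assumption for illustration), and the embedding call mirrors the one used in `GenerateEmbeddingJob`.

```ruby
# Generate a query embedding client-side, then search by cosine distance.
embedding_service = HTM::EmbeddingService.new(:ollama, model: 'nomic-embed-text')
query_embedding   = embedding_service.embed("vector search in PostgreSQL")

ltm.vector_search(query_embedding: query_embedding, limit: 5).each do |node|
  puts node.content
end
```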

**Behavior**:
- New node created → Not in vector search results yet
- Embedding job completes → Node appears in vector search results
- Eventual consistency: Node becomes searchable within seconds

### Tag Search

Nodes without tags are excluded implicitly:

```ruby
def nodes_with_tag(tag_name)
  HTM::Models::Node
    .joins(:tags)
    .where(tags: { name: tag_name })
end

def nodes_with_tag_prefix(prefix)
  HTM::Models::Node
    .joins(:tags)
    .where("tags.name LIKE ?", "#{prefix}%")
end
```
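
For example, a prefix query returns every node tagged anywhere under one branch of the ontology (the tag names below are illustrative):

```ruby
# Exact tag lookup
nodes_with_tag("database:postgresql:pgvector").count

# Prefix lookup walks the whole "database" branch of the ontology
nodes_with_tag_prefix("database:").each do |node|
  puts node.content
end
```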

**Behavior**:
- New node created → Not in tag-based queries yet
- Tag job completes → Node appears in tag queries
- Eventual consistency: Node becomes navigable within seconds

### Full-Text Search

Works immediately (doesn't depend on embeddings or tags):

```ruby
def fulltext_search(query:, limit: 20)
  HTM::Models::Node
    .where("to_tsvector('english', content) @@ plainto_tsquery('english', ?)", query)
    .order(Arel.sql("ts_rank(to_tsvector('english', content), plainto_tsquery('english', ?)) DESC", query))
    .limit(limit)
end
```

---

## Configuration

### Environment Variables

```bash
# Embedding configuration
export EMBEDDING_MODEL=nomic-embed-text   # Ollama model for embeddings
export OLLAMA_URL=http://localhost:11434

# Tag extraction configuration
export TAG_MODEL=llama3                   # Ollama model for tag extraction

# Alternative: OpenAI
export OPENAI_API_KEY=sk-...
```

### Async Job Configuration

```ruby
# config/async_job.rb (example)
Async::Job.configure do |config|
  config.backend = :sidekiq   # or :async (in-process), :delayed_job, etc.
  config.queue = :default
  config.retry_limit = 0      # Don't retry (errors are logged)
end
```

---

## Performance Characteristics

### Node Creation (User-Facing)

| Operation | Time | Notes |
|-----------|------|-------|
| Save node to database | ~10ms | Fast INSERT |
| Enqueue 2 jobs | ~5ms | Add to job queue |
| **Total user-facing latency** | **~15ms** | Excellent UX |

### Background Processing (Async)

| Job | Time | Notes |
|-----|------|-------|
| GenerateEmbeddingJob | ~50-100ms | Ollama local |
| GenerateTagsJob | ~500-1000ms | LLM generation + parsing |
| **Total background** | ~1 second | User doesn't wait |

### Eventual Consistency Windows

| Feature | Available After | Notes |
|---------|----------------|-------|
| Full-text search | Immediate | No dependencies |
| Basic retrieval | Immediate | Get by ID, speaker, etc. |
| Vector search | ~100ms | After embedding job |
| Tag navigation | ~1 second | After tag extraction job |
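
Code that needs the fully enriched node (for example a test or a batch import script) can poll for completion. A minimal sketch, using only the columns and associations defined above:

```ruby
# Waits until both background jobs have enriched the node, or gives up
# after `timeout` seconds and returns the node in whatever state it is in.
def wait_until_enriched(node_id, timeout: 10, interval: 0.25)
  deadline = Time.now + timeout
  loop do
    node = HTM::Models::Node.find(node_id)
    return node if node.embedding.present? && node.tags.any?
    return node if Time.now > deadline # possibly partially enriched
    sleep interval
  end
end
```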

---

## Consequences

### Positive

✅ **Fast response time**: User sees node created in ~15ms
✅ **Non-blocking**: LLM operations don't block the user
✅ **Parallel processing**: Embedding and tagging happen simultaneously
✅ **Graceful degradation**: Errors don't prevent node creation
✅ **Scalable**: Job queue can be scaled independently
✅ **Simple error handling**: Just log errors, no complex retry logic
✅ **Eventual consistency**: All features work, just slightly delayed

### Negative

❌ **Eventual consistency**: Small window where some features are unavailable
❌ **Job queue dependency**: Requires async-job infrastructure
❌ **Debugging complexity**: Errors happen in the background, not in the request
❌ **State tracking**: Node may be in various states of completion

### Neutral

➡️ **Job framework**: Using async-job (could swap for Sidekiq, etc.)
➡️ **Priority**: Both jobs equal priority (can adjust if needed)
➡️ **Retries**: No automatic retries (errors just logged)

---

## Monitoring and Observability

### Logging Strategy

```ruby
# Successful operations
logger.info("Generated embedding for node #{node_id} (768 dimensions)")
logger.info("Generated 3 tags for node #{node_id}: ai:llm, database:postgresql, performance")

# Errors (no retry)
logger.error("Embedding generation failed for node #{node_id}: Ollama connection refused")
logger.error("Tag generation failed for node #{node_id}: Invalid response format")
```

### Metrics to Track

```ruby
# Example metrics
{
  nodes_created: counter,
  embeddings_generated: counter,
  embeddings_failed: counter,
  tags_generated: counter,
  tags_failed: counter,
  embedding_duration_ms: histogram,
  tag_extraction_duration_ms: histogram,
  job_queue_depth: gauge
}
```
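
One low-dependency way to emit these is `ActiveSupport::Notifications`, which is already loaded alongside ActiveRecord. The sketch below is illustrative; the event name and the instrumented span inside `GenerateEmbeddingJob#perform` are assumptions, not part of the current implementation.

```ruby
require 'active_support/notifications'

# Inside GenerateEmbeddingJob#perform (sketch): wrap the work so duration
# and outcome can be collected centrally.
ActiveSupport::Notifications.instrument('htm.embedding_generated', node_id: node_id) do
  embedding = embedding_service.embed(node.content)
  node.update!(embedding: embedding, embedding_dimension: embedding.length)
end

# Subscriber (e.g. in an initializer): forward events to whatever metrics
# backend is in use; stdout shown for brevity.
ActiveSupport::Notifications.subscribe('htm.embedding_generated') do |event|
  puts "embedding_duration_ms=#{event.duration.round(1)} node_id=#{event.payload[:node_id]}"
end
```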

### Health Checks

```ruby
def system_health
  {
    ollama_available: check_ollama_connection,
    job_queue_healthy: check_job_queue_depth,
    recent_failures: count_recent_job_failures
  }
end
```
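
A possible implementation of the first check, as a sketch: it assumes Ollama's `GET /api/tags` endpoint (which lists installed models) and treats any 2xx response within a short timeout as healthy.

```ruby
require 'net/http'

# Returns true if the Ollama server answers quickly, false otherwise.
def check_ollama_connection(base_url = ENV['OLLAMA_URL'] || 'http://localhost:11434')
  uri = URI("#{base_url}/api/tags")
  response = Net::HTTP.start(uri.hostname, uri.port, open_timeout: 2, read_timeout: 2) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  false
end
```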

---

## Future Enhancements

### 1. Progress Tracking (Optional)

```ruby
# Add columns to nodes table
class AddJobStatusToNodes < ActiveRecord::Migration
  def change
    add_column :nodes, :embedding_status, :string, default: 'pending'
    add_column :nodes, :tagging_status, :string, default: 'pending'
    add_index :nodes, :embedding_status
    add_index :nodes, :tagging_status
  end
end

# Update in jobs
node.update!(embedding_status: 'completed')
node.update!(tagging_status: 'completed')
```

### 2. Retry with Exponential Backoff

```ruby
# If needed in future
class GenerateEmbeddingJob < Async::Job
  retry_on HTM::EmbeddingError, wait: :exponentially_longer, attempts: 3
end
```

### 3. Batch Processing

```ruby
# Process multiple nodes in one job
class GenerateEmbeddingsBatchJob < Async::Job
  def perform(node_ids)
    nodes = HTM::Models::Node.where(id: node_ids, embedding: nil)
    # Batch embed for efficiency
  end
end
```

### 4. Priority Queue

```ruby
# High-priority nodes processed first
GenerateEmbeddingJob.set(priority: :high).perform_later(important_node_id)
```

---

## Related ADRs

**Supersedes**:
- ADR-014 (Client-Side Embedding) - Replaced with async approach
- ADR-015 (Manual Tagging + Future LLM) - LLM extraction now implemented via TagService

**References**:
- ADR-001 (PostgreSQL Storage)
- ADR-013 (ActiveRecord + Many-to-Many Tags)

---

## Review Notes

**User (Dewayne)**: ✅ Async approach with two parallel jobs. Use async-job. TagService parallel to EmbeddingService.

**Systems Architect**: ✅ Async processing greatly improves UX. Eventual consistency is an acceptable trade-off.

**Performance Specialist**: ✅ 15ms user-facing latency vs. 500ms+ synchronous is a massive improvement.

**Ruby Expert**: ✅ TagService design mirrors EmbeddingService well. Consistent architecture.

**AI Engineer**: ✅ Parallel embedding and tagging is efficient. LLM operations don't block users.