htm 0.0.1

Files changed (155)
  1. checksums.yaml +7 -0
  2. data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
  3. data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
  4. data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
  5. data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
  6. data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
  7. data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
  8. data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
  9. data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
  10. data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
  11. data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
  12. data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
  13. data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
  14. data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
  15. data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
  16. data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
  17. data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
  18. data/.architecture/members.yml +144 -0
  19. data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
  20. data/.architecture/reviews/initial-system-analysis.md +330 -0
  21. data/.envrc +32 -0
  22. data/.irbrc +145 -0
  23. data/CHANGELOG.md +150 -0
  24. data/COMMITS.md +196 -0
  25. data/LICENSE +21 -0
  26. data/README.md +1347 -0
  27. data/Rakefile +51 -0
  28. data/SETUP.md +268 -0
  29. data/config/database.yml +67 -0
  30. data/db/migrate/20250101000001_enable_extensions.rb +14 -0
  31. data/db/migrate/20250101000002_create_robots.rb +14 -0
  32. data/db/migrate/20250101000003_create_nodes.rb +42 -0
  33. data/db/migrate/20250101000005_create_tags.rb +38 -0
  34. data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
  35. data/db/schema.sql +473 -0
  36. data/db/seed_data/README.md +100 -0
  37. data/db/seed_data/presidents.md +136 -0
  38. data/db/seed_data/states.md +151 -0
  39. data/db/seeds.rb +208 -0
  40. data/dbdoc/README.md +173 -0
  41. data/dbdoc/public.node_stats.md +48 -0
  42. data/dbdoc/public.node_stats.svg +41 -0
  43. data/dbdoc/public.node_tags.md +40 -0
  44. data/dbdoc/public.node_tags.svg +112 -0
  45. data/dbdoc/public.nodes.md +54 -0
  46. data/dbdoc/public.nodes.svg +118 -0
  47. data/dbdoc/public.nodes_tags.md +39 -0
  48. data/dbdoc/public.nodes_tags.svg +112 -0
  49. data/dbdoc/public.ontology_structure.md +48 -0
  50. data/dbdoc/public.ontology_structure.svg +38 -0
  51. data/dbdoc/public.operations_log.md +42 -0
  52. data/dbdoc/public.operations_log.svg +130 -0
  53. data/dbdoc/public.relationships.md +39 -0
  54. data/dbdoc/public.relationships.svg +41 -0
  55. data/dbdoc/public.robot_activity.md +46 -0
  56. data/dbdoc/public.robot_activity.svg +35 -0
  57. data/dbdoc/public.robots.md +35 -0
  58. data/dbdoc/public.robots.svg +90 -0
  59. data/dbdoc/public.schema_migrations.md +29 -0
  60. data/dbdoc/public.schema_migrations.svg +26 -0
  61. data/dbdoc/public.tags.md +35 -0
  62. data/dbdoc/public.tags.svg +60 -0
  63. data/dbdoc/public.topic_relationships.md +45 -0
  64. data/dbdoc/public.topic_relationships.svg +32 -0
  65. data/dbdoc/schema.json +1437 -0
  66. data/dbdoc/schema.svg +154 -0
  67. data/docs/api/database.md +806 -0
  68. data/docs/api/embedding-service.md +532 -0
  69. data/docs/api/htm.md +797 -0
  70. data/docs/api/index.md +259 -0
  71. data/docs/api/long-term-memory.md +1096 -0
  72. data/docs/api/working-memory.md +665 -0
  73. data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
  74. data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
  75. data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
  76. data/docs/architecture/adrs/004-hive-mind.md +437 -0
  77. data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
  78. data/docs/architecture/adrs/006-context-assembly.md +496 -0
  79. data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
  80. data/docs/architecture/adrs/008-robot-identification.md +625 -0
  81. data/docs/architecture/adrs/009-never-forget.md +648 -0
  82. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
  83. data/docs/architecture/adrs/011-pgai-integration.md +494 -0
  84. data/docs/architecture/adrs/index.md +215 -0
  85. data/docs/architecture/hive-mind.md +736 -0
  86. data/docs/architecture/index.md +351 -0
  87. data/docs/architecture/overview.md +538 -0
  88. data/docs/architecture/two-tier-memory.md +873 -0
  89. data/docs/assets/css/custom.css +83 -0
  90. data/docs/assets/images/htm-core-components.svg +63 -0
  91. data/docs/assets/images/htm-database-schema.svg +93 -0
  92. data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
  93. data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
  94. data/docs/assets/images/htm-layered-architecture.svg +71 -0
  95. data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
  96. data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
  97. data/docs/assets/images/htm.jpg +0 -0
  98. data/docs/assets/images/htm_demo.gif +0 -0
  99. data/docs/assets/js/mathjax.js +18 -0
  100. data/docs/assets/videos/htm_video.mp4 +0 -0
  101. data/docs/database_rake_tasks.md +322 -0
  102. data/docs/development/contributing.md +787 -0
  103. data/docs/development/index.md +336 -0
  104. data/docs/development/schema.md +596 -0
  105. data/docs/development/setup.md +719 -0
  106. data/docs/development/testing.md +819 -0
  107. data/docs/guides/adding-memories.md +824 -0
  108. data/docs/guides/context-assembly.md +1009 -0
  109. data/docs/guides/getting-started.md +577 -0
  110. data/docs/guides/index.md +118 -0
  111. data/docs/guides/long-term-memory.md +941 -0
  112. data/docs/guides/multi-robot.md +866 -0
  113. data/docs/guides/recalling-memories.md +927 -0
  114. data/docs/guides/search-strategies.md +953 -0
  115. data/docs/guides/working-memory.md +717 -0
  116. data/docs/index.md +214 -0
  117. data/docs/installation.md +477 -0
  118. data/docs/multi_framework_support.md +519 -0
  119. data/docs/quick-start.md +655 -0
  120. data/docs/setup_local_database.md +302 -0
  121. data/docs/using_rake_tasks_in_your_app.md +383 -0
  122. data/examples/basic_usage.rb +93 -0
  123. data/examples/cli_app/README.md +317 -0
  124. data/examples/cli_app/htm_cli.rb +270 -0
  125. data/examples/custom_llm_configuration.rb +183 -0
  126. data/examples/example_app/Rakefile +71 -0
  127. data/examples/example_app/app.rb +206 -0
  128. data/examples/sinatra_app/Gemfile +21 -0
  129. data/examples/sinatra_app/app.rb +335 -0
  130. data/lib/htm/active_record_config.rb +113 -0
  131. data/lib/htm/configuration.rb +342 -0
  132. data/lib/htm/database.rb +594 -0
  133. data/lib/htm/embedding_service.rb +115 -0
  134. data/lib/htm/errors.rb +34 -0
  135. data/lib/htm/job_adapter.rb +154 -0
  136. data/lib/htm/jobs/generate_embedding_job.rb +65 -0
  137. data/lib/htm/jobs/generate_tags_job.rb +82 -0
  138. data/lib/htm/long_term_memory.rb +965 -0
  139. data/lib/htm/models/node.rb +109 -0
  140. data/lib/htm/models/node_tag.rb +33 -0
  141. data/lib/htm/models/robot.rb +52 -0
  142. data/lib/htm/models/tag.rb +76 -0
  143. data/lib/htm/railtie.rb +76 -0
  144. data/lib/htm/sinatra.rb +157 -0
  145. data/lib/htm/tag_service.rb +135 -0
  146. data/lib/htm/tasks.rb +38 -0
  147. data/lib/htm/version.rb +5 -0
  148. data/lib/htm/working_memory.rb +182 -0
  149. data/lib/htm.rb +400 -0
  150. data/lib/tasks/db.rake +19 -0
  151. data/lib/tasks/htm.rake +147 -0
  152. data/lib/tasks/jobs.rake +312 -0
  153. data/mkdocs.yml +190 -0
  154. data/scripts/install_local_database.sh +309 -0
  155. metadata +341 -0
@@ -0,0 +1,694 @@
+ # ADR-016: Async Embedding and Tag Generation with Background Jobs
+
+ **Status**: Accepted
+
+ **Date**: 2025-10-29
+
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
+
+ ---
+
+ ## Context
+
+ The initial architecture (ADR-014, ADR-015) proposed synchronous embedding generation before node save, which would add 50-500ms latency to every node creation. For a responsive user experience, we need to:
+
+ 1. Save nodes immediately (fast response)
+ 2. Generate embeddings asynchronously
+ 3. Generate tags asynchronously
+ 4. Handle failures gracefully without blocking the user
+
+ ### User Experience Requirements
+
+ **Fast Node Creation**:
+ - User creates a memory/message
+ - System responds immediately (< 50ms)
+ - Embedding and tagging happen in background
+ - User doesn't wait for LLM operations
+
+ **Eventual Consistency**:
+ - Node available immediately for retrieval
+ - Embedding added when ready (enables vector search)
+ - Tags added when ready (enables hierarchical navigation)
+ - System remains usable while jobs are processing
+
+ ---
+
+ ## Decision
+
+ We will use **async-job** for background processing with two parallel jobs triggered on node creation:
+
+ 1. **Save node immediately** (no embedding, no tags)
+ 2. **Enqueue `GenerateEmbeddingJob`** to add embedding
+ 3. **Enqueue `GenerateTagsJob`** to extract and add tags
+
+ Both jobs have equal priority and run in parallel. Errors are logged and never block node creation; failed jobs are not retried automatically.
+
+ ---
+
+ ## Architecture
+
+ ### Node Creation Flow
+
+ ```ruby
+ # 1. User API call
+ node = htm.add_message("PostgreSQL supports vector search via pgvector")
+
+ # 2. Node saved immediately to database
+ #    - content: "PostgreSQL supports vector search via pgvector"
+ #    - speaker: "user"
+ #    - embedding: nil (will be added by job)
+ #    - tags: none (will be added by job)
+ #    Response time: ~10-20ms
+
+ # 3. Two async jobs enqueued (non-blocking)
+ GenerateEmbeddingJob.perform_later(node.id)  # Job 1
+ GenerateTagsJob.perform_later(node.id)       # Job 2
+
+ # 4. Jobs run in background (parallel, same priority)
+ #    - Job 1: Generate embedding via EmbeddingService → Update node.embedding
+ #    - Job 2: Generate tags via TagService → Create Tag records → Create NodeTag associations
+
+ # 5. Node is eventually fully enriched
+ #    - Has embedding (enables vector search)
+ #    - Has tags (enables hierarchical navigation)
+ ```
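
The flow above can be tried without a database or job queue. The following is a minimal in-process sketch, assuming plain Ruby threads in place of async-job; `FakeNode`, `slow_embed`, `slow_tags`, and this standalone `add_message` are hypothetical stand-ins for illustration, not HTM APIs.

```ruby
# Save first, enrich in parallel. Everything here is a stand-in for the real components.
FakeNode = Struct.new(:id, :content, :embedding, :tags)

def slow_embed(text)
  sleep 0.05 # stands in for an embedding request
  text.bytes.first(3).map { |b| b / 255.0 }
end

def slow_tags(text)
  sleep 0.05 # stands in for an LLM tag-extraction request
  text.downcase.scan(/[a-z]+/).first(2)
end

def add_message(content, store)
  node = FakeNode.new(store.size + 1, content, nil, [])
  store << node # 1. "save" immediately, with no embedding and no tags
  jobs = [
    Thread.new { node.embedding = slow_embed(node.content) }, # 2. embedding job
    Thread.new { node.tags = slow_tags(node.content) }        # 3. tagging job
  ]
  [node, jobs] # the caller gets the node back without waiting for either job
end

store = []
node, jobs = add_message("PostgreSQL supports vector search", store)
jobs.each(&:join) # joined only so the example can inspect the enriched node
```

The caller receives the node before either thread finishes, which is the property the ADR relies on; a real deployment would hand the two steps to the job backend instead of spawning threads.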
+
+ ### Component Architecture
+
+ ```
+ ┌─────────────────────────────────────────────────────────────┐
+ │                        User Request                         │
+ │                  add_message(content, ...)                  │
+ └─────────────────────┬───────────────────────────────────────┘
+
+
+ ┌─────────────────────────────────────────────────────────────┐
+ │                       HTM Main Class                        │
+ │  - Create Node record (immediate save)                      │
+ │  - Enqueue GenerateEmbeddingJob                             │
+ │  - Enqueue GenerateTagsJob                                  │
+ │  - Return node to user (fast response)                      │
+ └──────────────┬──────────────────────────┬───────────────────┘
+                │                          │
+                │ Async                    │ Async
+                │ (parallel)               │ (parallel)
+                ▼                          ▼
+ ┌──────────────────────────┐   ┌──────────────────────────────┐
+ │  GenerateEmbeddingJob    │   │  GenerateTagsJob             │
+ │                          │   │                              │
+ │  1. Load Node            │   │  1. Load Node                │
+ │  2. EmbeddingService     │   │  2. Load existing ontology   │
+ │  3. Generate embedding   │   │  3. TagService               │
+ │  4. Update node.embedding│   │  4. Extract tags             │
+ │  5. Log errors           │   │  5. Create Tag records       │
+ │                          │   │  6. Create NodeTag records   │
+ │                          │   │  7. Log errors               │
+ └──────────────────────────┘   └──────────────────────────────┘
+ ```
+
+ ---
+
+ ## Implementation
+
+ ### 1. TagService (New)
+
+ Parallel to `EmbeddingService`, handles LLM-based tag extraction:
+
+ ```ruby
+ # lib/htm/tag_service.rb
+ class HTM::TagService
+   # Default models for tag extraction
+   DEFAULT_MODELS = {
+     ollama: 'llama3',
+     openai: 'gpt-4o-mini'
+   }.freeze
+
+   attr_reader :provider, :model
+
+   # Initialize tag extraction service
+   #
+   # @param provider [Symbol] LLM provider (:ollama, :openai)
+   # @param model [String] Model name
+   # @param base_url [String] Base URL for Ollama
+   #
+   def initialize(provider = :ollama, model: nil, base_url: nil)
+     @provider = provider
+     @model = model || DEFAULT_MODELS[provider]
+     @base_url = base_url || ENV['OLLAMA_URL'] || 'http://localhost:11434'
+   end
+
+   # Extract hierarchical tags from content
+   #
+   # @param content [String] Text to analyze
+   # @param existing_ontology [Array<String>] Sample of existing tags for context
+   # @return [Array<String>] Extracted tag names in format root:level1:level2
+   #
+   def extract_tags(content, existing_ontology: [])
+     prompt = build_extraction_prompt(content, existing_ontology)
+     response = call_llm(prompt)
+     parse_and_validate_tags(response)
+   end
+
+   private
+
+   def build_extraction_prompt(content, ontology_sample)
+     ontology_context = if ontology_sample.any?
+       sample_tags = ontology_sample.sample([ontology_sample.size, 20].min)
+       "Existing ontology includes: #{sample_tags.join(', ')}\n"
+     else
+       "This is a new ontology - create appropriate hierarchical tags.\n"
+     end
+
+     <<~PROMPT
+       Extract hierarchical topic tags from the following text.
+
+       #{ontology_context}
+       Format: root:level1:level2:level3 (use colons to separate levels)
+
+       Rules:
+       - Use lowercase letters, numbers, and hyphens only
+       - Maximum depth: 5 levels
+       - Return 2-5 tags per text
+       - Tags should be reusable and consistent
+       - Prefer existing ontology tags when applicable
+       - Use hyphens for multi-word terms (e.g., natural-language-processing)
+
+       Text: #{content}
+
+       Return ONLY the topic tags, one per line, no explanations.
+     PROMPT
+   end
+
+   def call_llm(prompt)
+     case @provider
+     when :ollama
+       call_ollama(prompt)
+     when :openai
+       call_openai(prompt)
+     else
+       raise HTM::TagError, "Unknown provider: #{@provider}"
+     end
+   end
+
+   def call_ollama(prompt)
+     require 'net/http'
+     require 'json'
+
+     uri = URI("#{@base_url}/api/generate")
+     request = Net::HTTP::Post.new(uri)
+     request['Content-Type'] = 'application/json'
+     request.body = JSON.generate({
+       model: @model,
+       prompt: prompt,
+       stream: false,
+       system: 'You are a precise topic extraction system. Output only topic tags in hierarchical format: root:subtopic:detail',
+       options: {
+         temperature: 0 # Deterministic output
+       }
+     })
+
+     # Enable TLS when OLLAMA_URL points at an https endpoint
+     response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
+       http.request(request)
+     end
+
+     unless response.is_a?(Net::HTTPSuccess)
+       raise HTM::TagError, "Ollama API error: #{response.code} #{response.message}"
+     end
+
+     result = JSON.parse(response.body)
+     result['response']
+   rescue HTM::TagError
+     raise # Already descriptive; don't wrap it a second time below
+   rescue JSON::ParserError => e
+     raise HTM::TagError, "Failed to parse Ollama response: #{e.message}"
+   rescue StandardError => e
+     raise HTM::TagError, "Failed to call Ollama: #{e.message}"
+   end
+
+   def call_openai(prompt)
+     require 'net/http'
+     require 'json'
+
+     api_key = ENV['OPENAI_API_KEY']
+     raise HTM::TagError, "OPENAI_API_KEY not set" unless api_key
+
+     uri = URI('https://api.openai.com/v1/chat/completions')
+     request = Net::HTTP::Post.new(uri)
+     request['Content-Type'] = 'application/json'
+     request['Authorization'] = "Bearer #{api_key}"
+     request.body = JSON.generate({
+       model: @model,
+       messages: [
+         {
+           role: 'system',
+           content: 'You are a precise topic extraction system. Output only topic tags in hierarchical format: root:subtopic:detail'
+         },
+         {
+           role: 'user',
+           content: prompt
+         }
+       ],
+       temperature: 0
+     })
+
+     response = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
+       http.request(request)
+     end
+
+     unless response.is_a?(Net::HTTPSuccess)
+       raise HTM::TagError, "OpenAI API error: #{response.code} #{response.message}"
+     end
+
+     result = JSON.parse(response.body)
+     result.dig('choices', 0, 'message', 'content')
+   rescue HTM::TagError
+     raise # Already descriptive; don't wrap it a second time below
+   rescue JSON::ParserError => e
+     raise HTM::TagError, "Failed to parse OpenAI response: #{e.message}"
+   rescue StandardError => e
+     raise HTM::TagError, "Failed to call OpenAI: #{e.message}"
+   end
+
+   def parse_and_validate_tags(response)
+     return [] if response.nil? || response.strip.empty?
+
+     # Parse response (one tag per line)
+     tags = response.split("\n").map(&:strip).reject(&:empty?)
+
+     # Validate format: lowercase alphanumeric + hyphens + colons
+     # (\A and \z anchor the whole string; ^ and $ only anchor lines in Ruby)
+     valid_tags = tags.select do |tag|
+       tag.match?(/\A[a-z0-9\-]+(:[a-z0-9\-]+)*\z/)
+     end
+
+     # Limit depth to 5 levels (4 colons maximum)
+     valid_tags.select { |tag| tag.count(':') < 5 }
+   end
+ end
+ ```
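
To see what the validation step keeps and drops, here is a standalone sketch of the same rules applied to a made-up LLM reply. `TAG_FORMAT`, `MAX_DEPTH`, and `valid_tags` are illustrative names, and the sample reply is invented for the example.

```ruby
# Same rules as parse_and_validate_tags: lowercase/digits/hyphens, colon-separated, max 5 levels.
TAG_FORMAT = /\A[a-z0-9\-]+(:[a-z0-9\-]+)*\z/
MAX_DEPTH = 5

def valid_tags(raw)
  raw.to_s.lines.map(&:strip).reject(&:empty?).select do |tag|
    tag.match?(TAG_FORMAT) && tag.count(':') < MAX_DEPTH
  end
end

# Invented reply mixing good tags with typical LLM noise
raw = <<~LLM
  database:postgresql:pgvector
  AI:Embeddings
  search:vector-search
  a:b:c:d:e:f
  Here are your tags:
LLM

kept = valid_tags(raw)
# kept => ["database:postgresql:pgvector", "search:vector-search"]
```

Uppercase tags, chatty preamble lines, and the six-level tag are silently dropped, so a noisy reply degrades to fewer tags rather than an error.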
+
+ ### 2. Background Jobs
+
+ Using `async-job` gem:
+
+ ```ruby
+ # lib/htm/jobs/generate_embedding_job.rb
+ require 'async/job'
+
+ class HTM::GenerateEmbeddingJob < Async::Job
+   # Generate embedding for node content and update database
+   #
+   # @param node_id [Integer] ID of node to process
+   #
+   def perform(node_id)
+     node = HTM::Models::Node.find(node_id)
+
+     # Skip if already has embedding
+     return if node.embedding.present?
+
+     # Initialize embedding service
+     embedding_service = HTM::EmbeddingService.new(
+       :ollama,
+       model: ENV['EMBEDDING_MODEL'] || 'nomic-embed-text'
+     )
+
+     # Generate embedding
+     embedding = embedding_service.embed(node.content)
+
+     # Update node
+     node.update!(
+       embedding: embedding,
+       embedding_dimension: embedding.length
+     )
+
+     logger.info("Generated embedding for node #{node_id} (#{embedding.length} dimensions)")
+
+   rescue HTM::EmbeddingError => e
+     logger.error("Embedding generation failed for node #{node_id}: #{e.message}")
+     # Don't retry - node remains without embedding
+   rescue StandardError => e
+     logger.error("Unexpected error generating embedding for node #{node_id}: #{e.class} - #{e.message}")
+     logger.error(e.backtrace.join("\n"))
+   end
+ end
+ ```
+
+ ```ruby
+ # lib/htm/jobs/generate_tags_job.rb
+ require 'async/job'
+
+ class HTM::GenerateTagsJob < Async::Job
+   # Extract tags from node content and update database
+   #
+   # @param node_id [Integer] ID of node to process
+   #
+   def perform(node_id)
+     node = HTM::Models::Node.find(node_id)
+
+     # Skip if already has tags
+     return if node.tags.any?
+
+     # Initialize tag service
+     tag_service = HTM::TagService.new(
+       :ollama,
+       model: ENV['TAG_MODEL'] || 'llama3'
+     )
+
+     # Get sample of existing ontology for context
+     existing_tags = HTM::Models::Tag
+       .order('RANDOM()') # PostgreSQL random sampling
+       .limit(50)
+       .pluck(:name)
+
+     # Extract tags
+     tag_names = tag_service.extract_tags(
+       node.content,
+       existing_ontology: existing_tags
+     )
+
+     # Create tags and associations
+     tag_names.each do |tag_name|
+       # Find or create tag record
+       tag = HTM::Models::Tag.find_or_create_by(name: tag_name)
+
+       # Create association (skip if already exists)
+       HTM::Models::NodeTag.create(
+         node_id: node.id,
+         tag_id: tag.id
+       )
+     rescue ActiveRecord::RecordNotUnique
+       # Tag association already exists, skip
+       next
+     end
+
+     logger.info("Generated #{tag_names.size} tags for node #{node_id}: #{tag_names.join(', ')}")
+
+   rescue HTM::TagError => e
+     logger.error("Tag generation failed for node #{node_id}: #{e.message}")
+     # Don't retry - node remains without tags
+   rescue StandardError => e
+     logger.error("Unexpected error generating tags for node #{node_id}: #{e.class} - #{e.message}")
+     logger.error(e.backtrace.join("\n"))
+   end
+ end
+ ```
+
+ ### 3. HTM Main Class Integration
+
+ ```ruby
+ # lib/htm.rb
+ class HTM
+   def add_message(content, speaker: 'user', type: nil, category: nil, importance: 1.0)
+     # 1. Save node immediately (no embedding, no tags)
+     node = @ltm.add(
+       content: content,
+       speaker: speaker,
+       robot_id: @robot.id,
+       type: type,
+       category: category,
+       importance: importance,
+       token_count: @embedding_service.count_tokens(content)
+     )
+
+     # 2. Add to working memory
+     @working_memory.add(node)
+
+     # 3. Enqueue async jobs (non-blocking)
+     GenerateEmbeddingJob.perform_later(node.id)
+     GenerateTagsJob.perform_later(node.id)
+
+     # 4. Return immediately
+     node
+   end
+ end
+ ```
+
+ ### 4. Error Handling Class
+
+ ```ruby
+ # lib/htm/errors.rb
+ class HTM
+   class Error < StandardError; end
+   class EmbeddingError < Error; end
+   class TagError < Error; end
+   class DatabaseError < Error; end
+ end
+ ```
+
+ ---
+
+ ## Query Behavior with Async Jobs
+
+ ### Vector Search
+
+ Nodes without embeddings are excluded automatically:
+
+ ```ruby
+ # lib/htm/long_term_memory.rb
+ def vector_search(query_embedding:, limit: 10, **filters)
+   HTM::Models::Node
+     .where.not(embedding: nil) # Exclude nodes without embeddings
+     .where(filters)
+     .order(Arel.sql("embedding <=> ?::vector", query_embedding.to_s))
+     .limit(limit)
+ end
+ ```
+
+ **Behavior**:
+ - New node created → Not in vector search results yet
+ - Embedding job completes → Node appears in vector search results
+ - Eventual consistency: Node becomes searchable within seconds
+
+ ### Tag Search
+
+ Nodes without tags are excluded implicitly:
+
+ ```ruby
+ def nodes_with_tag(tag_name)
+   HTM::Models::Node
+     .joins(:tags)
+     .where(tags: { name: tag_name })
+ end
+
+ def nodes_with_tag_prefix(prefix)
+   HTM::Models::Node
+     .joins(:tags)
+     .where("tags.name LIKE ?", "#{prefix}%")
+ end
+ ```
+
+ **Behavior**:
+ - New node created → Not in tag-based queries yet
+ - Tag job completes → Node appears in tag queries
+ - Eventual consistency: Node becomes navigable within seconds
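
One thing to keep in mind with the raw-prefix query is that `LIKE 'database%'` also matches a sibling tag such as `database-admin`, and that `%` or `_` inside the prefix act as wildcards. The sketch below shows one possible way to build subtree-only patterns; `escape_like`, `tag_subtree_patterns`, and `in_subtree?` are hypothetical helpers, not part of HTM (ActiveRecord ships `sanitize_sql_like` for the escaping step).

```ruby
# Escape LIKE wildcards so a prefix is matched literally (backslash is PostgreSQL's default escape).
def escape_like(value, escape_char = "\\")
  value.gsub(/[\\%_]/) { |ch| escape_char + ch }
end

# Patterns for "this tag, or anything below it in the hierarchy".
def tag_subtree_patterns(prefix)
  safe = escape_like(prefix)
  [safe, "#{safe}:%"]
end

# Pure-Ruby equivalent of what the two patterns match, handy for tests.
def in_subtree?(tag, prefix)
  tag == prefix || tag.start_with?("#{prefix}:")
end

exact, descendants = tag_subtree_patterns("database")
# exact => "database", descendants => "database:%"
```

With ActiveRecord, the pair could be used as `.where("tags.name = ? OR tags.name LIKE ?", exact, descendants)`; whether subtree-only matching is the intended behavior is a design choice the ADR does not settle.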
+
+ ### Full-Text Search
+
+ Works immediately (doesn't depend on embeddings or tags):
+
+ ```ruby
+ def fulltext_search(query:, limit: 20)
+   # order() does not accept bind parameters, so sanitize the ranking expression first
+   rank_sql = HTM::Models::Node.sanitize_sql_array(
+     ["ts_rank(to_tsvector('english', content), plainto_tsquery('english', ?)) DESC", query]
+   )
+
+   HTM::Models::Node
+     .where("to_tsvector('english', content) @@ plainto_tsquery('english', ?)", query)
+     .order(Arel.sql(rank_sql))
+     .limit(limit)
+ end
+ ```
+
+ ---
+
+ ## Configuration
+
+ ### Environment Variables
+
+ ```bash
+ # Embedding configuration
+ export EMBEDDING_MODEL=nomic-embed-text  # Ollama model for embeddings
+ export OLLAMA_URL=http://localhost:11434
+
+ # Tag extraction configuration
+ export TAG_MODEL=llama3  # Ollama model for tag extraction
+
+ # Alternative: OpenAI
+ export OPENAI_API_KEY=sk-...
+ ```
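
As a small illustration of how these variables and their defaults fit together, the sketch below resolves them in one place. `llm_settings` is a hypothetical helper; the default values are the ones used elsewhere in this ADR.

```ruby
# Resolve LLM-related settings from the environment, falling back to the ADR's defaults.
# Accepts any Hash-like object so it can be exercised without touching the real ENV.
def llm_settings(env = ENV)
  {
    embedding_model: env.fetch('EMBEDDING_MODEL', 'nomic-embed-text'),
    tag_model:       env.fetch('TAG_MODEL', 'llama3'),
    ollama_url:      env.fetch('OLLAMA_URL', 'http://localhost:11434'),
    openai_enabled:  !env['OPENAI_API_KEY'].to_s.empty?
  }
end

settings = llm_settings({ 'TAG_MODEL' => 'mistral' })
# settings[:tag_model] => "mistral"; the other keys fall back to their defaults
```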
+
+ ### Async Job Configuration
+
+ ```ruby
+ # config/async_job.rb (example)
+ Async::Job.configure do |config|
+   config.backend = :sidekiq # or :async (in-process), :delayed_job, etc.
+   config.queue = :default
+   config.retry_limit = 0 # Don't retry (errors are logged)
+ end
+ ```
+
+ ---
+
+ ## Performance Characteristics
+
+ ### Node Creation (User-Facing)
+
+ | Operation | Time | Notes |
+ |-----------|------|-------|
+ | Save node to database | ~10ms | Fast INSERT |
+ | Enqueue 2 jobs | ~5ms | Add to job queue |
+ | **Total user-facing latency** | **~15ms** | Excellent UX |
+
+ ### Background Processing (Async)
+
+ | Job | Time | Notes |
+ |-----|------|-------|
+ | GenerateEmbeddingJob | ~50-100ms | Ollama local |
+ | GenerateTagsJob | ~500-1000ms | LLM generation + parsing |
+ | **Total background** | ~1 second | User doesn't wait |
+
+ ### Eventual Consistency Windows
+
+ | Feature | Available After | Notes |
+ |---------|----------------|-------|
+ | Full-text search | Immediate | No dependencies |
+ | Basic retrieval | Immediate | Get by ID, speaker, etc. |
+ | Vector search | ~100ms | After embedding job |
+ | Tag navigation | ~1 second | After tag extraction job |
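
The table above amounts to a small rule: a retrieval path is usable once the data it depends on exists. A minimal sketch of that rule follows, assuming a plain struct in place of the real node model; `NodeState` and `available_features` are illustrative names only.

```ruby
# Which retrieval paths can see a node, given how far its enrichment has progressed.
NodeState = Struct.new(:embedding, :tags, keyword_init: true)

def available_features(node)
  features = [:basic_retrieval, :fulltext_search]     # no dependencies
  features << :vector_search unless node.embedding.nil? # needs the embedding job
  features << :tag_navigation unless node.tags.empty?   # needs the tagging job
  features
end

fresh    = NodeState.new(embedding: nil, tags: [])
embedded = NodeState.new(embedding: [0.1, 0.2], tags: [])
complete = NodeState.new(embedding: [0.1, 0.2], tags: ['database:postgresql'])
```

A helper like this can also make the "various states of completion" concern easier to reason about when debugging why a node is missing from a particular search.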
+
+ ---
+
+ ## Consequences
+
+ ### Positive
+
+ ✅ **Fast response time**: User sees node created in ~15ms
+ ✅ **Non-blocking**: LLM operations don't block user
+ ✅ **Parallel processing**: Embedding and tagging happen simultaneously
+ ✅ **Graceful degradation**: Errors don't prevent node creation
+ ✅ **Scalable**: Job queue can be scaled independently
+ ✅ **Simple error handling**: Just log errors, no complex retry logic
+ ✅ **Eventual consistency**: All features work, just slightly delayed
+
+ ### Negative
+
+ ❌ **Eventual consistency**: Small window where features unavailable
+ ❌ **Job queue dependency**: Requires async-job infrastructure
+ ❌ **Debugging complexity**: Errors happen in background, not in request
+ ❌ **State tracking**: Node may be in various states of completion
+
+ ### Neutral
+
+ ➡️ **Job framework**: Using async-job (could swap for Sidekiq, etc.)
+ ➡️ **Priority**: Both jobs equal priority (can adjust if needed)
+ ➡️ **Retries**: No automatic retries (errors just logged)
+
+ ---
+
+ ## Monitoring and Observability
+
+ ### Logging Strategy
+
+ ```ruby
+ # Successful operations
+ logger.info("Generated embedding for node #{node_id} (768 dimensions)")
+ logger.info("Generated 3 tags for node #{node_id}: ai:llm, database:postgresql, performance")
+
+ # Errors (no retry)
+ logger.error("Embedding generation failed for node #{node_id}: Ollama connection refused")
+ logger.error("Tag generation failed for node #{node_id}: Invalid response format")
+ ```
+
+ ### Metrics to Track
+
+ ```ruby
+ # Example metrics
+ {
+   nodes_created: counter,
+   embeddings_generated: counter,
+   embeddings_failed: counter,
+   tags_generated: counter,
+   tags_failed: counter,
+   embedding_duration_ms: histogram,
+   tag_extraction_duration_ms: histogram,
+   job_queue_depth: gauge
+ }
+ ```
+
+ ### Health Checks
+
+ ```ruby
+ def system_health
+   {
+     ollama_available: check_ollama_connection,
+     job_queue_healthy: check_job_queue_depth,
+     recent_failures: count_recent_job_failures
+   }
+ end
+ ```
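
The ADR leaves the individual checks undefined. One way two of them might look is sketched below, under stated assumptions: the Ollama check probes the server's `/api/tags` endpoint with short timeouts and treats any failure as "unavailable", and the queue check takes the current depth as an argument because how depth is read depends on the job backend. The timeouts and the `max_depth` default are illustrative values.

```ruby
require 'net/http'
require 'uri'

# Hypothetical check: true when an Ollama server answers successfully at base_url.
def check_ollama_connection(base_url = ENV.fetch('OLLAMA_URL', 'http://localhost:11434'))
  uri = URI.join(base_url, '/api/tags')
  response = Net::HTTP.start(uri.hostname, uri.port,
                             use_ssl: uri.scheme == 'https',
                             open_timeout: 1, read_timeout: 2) do |http|
    http.get(uri.request_uri)
  end
  response.is_a?(Net::HTTPSuccess)
rescue StandardError
  false # refused connections, timeouts, and bad URLs all count as unavailable
end

# Hypothetical check: the queue is healthy while its depth stays under a threshold.
def check_job_queue_depth(depth, max_depth: 1_000)
  depth <= max_depth
end
```

Keeping the timeouts short matters here, since a health endpoint that hangs on a slow Ollama instance would itself become a symptom.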
+
+ ---
+
+ ## Future Enhancements
+
+ ### 1. Progress Tracking (Optional)
+
+ ```ruby
+ # Add columns to nodes table
+ # (the migration version below is an assumption; match it to your ActiveRecord release)
+ class AddJobStatusToNodes < ActiveRecord::Migration[7.1]
+   def change
+     add_column :nodes, :embedding_status, :string, default: 'pending'
+     add_column :nodes, :tagging_status, :string, default: 'pending'
+     add_index :nodes, :embedding_status
+     add_index :nodes, :tagging_status
+   end
+ end
+
+ # Update in jobs
+ node.update!(embedding_status: 'completed')
+ node.update!(tagging_status: 'completed')
+ ```
+
+ ### 2. Retry with Exponential Backoff
+
+ ```ruby
+ # If needed in future
+ class GenerateEmbeddingJob < Async::Job
+   retry_on HTM::EmbeddingError, wait: :exponentially_longer, attempts: 3
+ end
+ ```
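
For readers unfamiliar with the pattern, exponential backoff can be sketched in plain Ruby without any job framework. This is an illustration of the technique only, not how async-job or Active Job implement it; `TransientError`, `with_backoff`, and the injectable `sleeper` are hypothetical names.

```ruby
class TransientError < StandardError; end

# Retry the block on TransientError, doubling the delay after each failed attempt.
# The sleeper is injectable so the delays can be observed without actually sleeping.
def with_backoff(attempts: 3, base_delay: 0.5, sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue TransientError
    raise if tries >= attempts                   # give up after the last attempt
    sleeper.call(base_delay * (2**(tries - 1)))  # 0.5s, 1.0s, 2.0s, ...
    retry
  end
end

delays = []
calls = 0
result = with_backoff(sleeper: ->(seconds) { delays << seconds }) do |attempt|
  calls += 1
  raise TransientError, 'embedding service busy' if attempt < 3
  :ok
end
```

Only errors that are likely to be temporary should be retried this way; a malformed LLM reply will usually fail again on the next attempt.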
+
+ ### 3. Batch Processing
+
+ ```ruby
+ # Process multiple nodes in one job
+ class GenerateEmbeddingsBatchJob < Async::Job
+   def perform(node_ids)
+     nodes = HTM::Models::Node.where(id: node_ids, embedding: nil)
+     # Batch embed for efficiency
+   end
+ end
+ ```
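
The body of such a batch job is left open above. A minimal sketch of the batching step is shown here, assuming nodes are plain hashes and the embedder is any callable that maps an array of texts to an array of vectors; `embed_in_batches` and the toy embedder are hypothetical.

```ruby
# Embed only nodes that still lack an embedding, a few at a time.
# Returns the number of nodes that were embedded.
def embed_in_batches(nodes, embedder:, batch_size: 2)
  pending = nodes.reject { |node| node[:embedding] }
  pending.each_slice(batch_size).sum do |batch|
    vectors = embedder.call(batch.map { |node| node[:content] }) # one call per batch
    batch.zip(vectors).each { |node, vector| node[:embedding] = vector }
    batch.size
  end
end

nodes = [
  { content: 'a' },
  { content: 'bb', embedding: [9] }, # already embedded, skipped
  { content: 'ccc' },
  { content: 'dddd' }
]
calls = []
toy_embedder = ->(texts) { calls << texts; texts.map { |text| [text.length] } }
embedded_count = embed_in_batches(nodes, embedder: toy_embedder)
```

Batching trades a little latency per node for fewer round trips to the embedding provider, which is mainly useful when backfilling many nodes at once.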
+
+ ### 4. Priority Queue
+
+ ```ruby
+ # High-priority nodes processed first
+ GenerateEmbeddingJob.set(priority: :high).perform_later(important_node_id)
+ ```
+
+ ---
+
+ ## Related ADRs
+
+ **Supersedes**:
+ - ADR-014 (Client-Side Embedding) - Replaced with async approach
+ - ADR-015 (Manual Tagging + Future LLM) - LLM extraction now implemented via TagService
+
+ **References**:
+ - ADR-001 (PostgreSQL Storage)
+ - ADR-013 (ActiveRecord + Many-to-Many Tags)
+
+ ---
+
+ ## Review Notes
+
+ **User (Dewayne)**: ✅ Async approach with two parallel jobs. Use async-job. TagService parallel to EmbeddingService.
+
+ **Systems Architect**: ✅ Async processing greatly improves UX. Eventual consistency is acceptable trade-off.
+
+ **Performance Specialist**: ✅ 15ms user-facing latency vs. 500ms+ synchronous is massive improvement.
+
+ **Ruby Expert**: ✅ TagService design mirrors EmbeddingService well. Consistent architecture.
+
+ **AI Engineer**: ✅ Parallel embedding and tagging is efficient. LLM operations don't block users.