htm 0.0.20 → 0.0.30

Files changed (154)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +60 -0
  3. data/Rakefile +104 -18
  4. data/db/migrate/00001_enable_extensions.rb +9 -5
  5. data/db/migrate/00002_create_robots.rb +18 -6
  6. data/db/migrate/00003_create_file_sources.rb +30 -17
  7. data/db/migrate/00004_create_nodes.rb +60 -48
  8. data/db/migrate/00005_create_tags.rb +24 -12
  9. data/db/migrate/00006_create_node_tags.rb +28 -13
  10. data/db/migrate/00007_create_robot_nodes.rb +40 -26
  11. data/db/schema.sql +17 -1
  12. data/db/seeds.rb +33 -33
  13. data/docs/database/naming-convention.md +244 -0
  14. data/docs/database_rake_tasks.md +31 -0
  15. data/docs/development/rake-tasks.md +80 -35
  16. data/docs/guides/mcp-server.md +70 -1
  17. data/examples/.envrc +6 -0
  18. data/examples/.gitignore +2 -0
  19. data/examples/00_create_examples_db.rb +94 -0
  20. data/examples/{basic_usage.rb → 01_basic_usage.rb} +12 -16
  21. data/examples/{custom_llm_configuration.rb → 03_custom_llm_configuration.rb} +13 -3
  22. data/examples/{file_loader_usage.rb → 04_file_loader_usage.rb} +11 -14
  23. data/examples/{timeframe_demo.rb → 05_timeframe_demo.rb} +10 -3
  24. data/examples/{example_app → 06_example_app}/app.rb +15 -15
  25. data/examples/{cli_app → 07_cli_app}/htm_cli.rb +15 -22
  26. data/examples/08_sinatra_app/Gemfile.lock +241 -0
  27. data/examples/{sinatra_app → 08_sinatra_app}/app.rb +19 -18
  28. data/examples/{mcp_client.rb → 09_mcp_client.rb} +5 -8
  29. data/examples/{telemetry → 10_telemetry}/SETUP_README.md +1 -1
  30. data/examples/{telemetry → 10_telemetry}/demo.rb +14 -10
  31. data/examples/11_robot_groups/README.md +335 -0
  32. data/examples/{robot_groups → 11_robot_groups/lib}/robot_worker.rb +17 -3
  33. data/examples/{robot_groups → 11_robot_groups}/multi_process.rb +9 -9
  34. data/examples/{robot_groups → 11_robot_groups}/same_process.rb +9 -12
  35. data/examples/{rails_app → 12_rails_app}/Gemfile +3 -0
  36. data/examples/{rails_app → 12_rails_app}/Gemfile.lock +87 -58
  37. data/examples/{rails_app → 12_rails_app}/app/controllers/dashboard_controller.rb +10 -6
  38. data/examples/{rails_app → 12_rails_app}/app/controllers/files_controller.rb +5 -5
  39. data/examples/{rails_app → 12_rails_app}/app/controllers/memories_controller.rb +11 -7
  40. data/examples/{rails_app → 12_rails_app}/app/controllers/robots_controller.rb +8 -8
  41. data/examples/12_rails_app/app/controllers/tags_controller.rb +36 -0
  42. data/examples/{rails_app → 12_rails_app}/app/views/dashboard/index.html.erb +2 -2
  43. data/examples/{rails_app → 12_rails_app}/app/views/files/new.html.erb +5 -2
  44. data/examples/{rails_app → 12_rails_app}/app/views/memories/_memory_card.html.erb +3 -3
  45. data/examples/{rails_app → 12_rails_app}/app/views/memories/deleted.html.erb +3 -3
  46. data/examples/{rails_app → 12_rails_app}/app/views/memories/edit.html.erb +3 -3
  47. data/examples/{rails_app → 12_rails_app}/app/views/memories/show.html.erb +4 -4
  48. data/examples/{rails_app → 12_rails_app}/app/views/robots/index.html.erb +2 -2
  49. data/examples/{rails_app → 12_rails_app}/app/views/robots/show.html.erb +4 -4
  50. data/examples/{rails_app → 12_rails_app}/app/views/search/index.html.erb +1 -1
  51. data/examples/{rails_app → 12_rails_app}/app/views/tags/index.html.erb +2 -2
  52. data/examples/{rails_app → 12_rails_app}/app/views/tags/show.html.erb +1 -1
  53. data/examples/12_rails_app/config/initializers/htm.rb +7 -0
  54. data/examples/12_rails_app/config/initializers/rack.rb +5 -0
  55. data/examples/README.md +230 -211
  56. data/examples/examples_helper.rb +138 -0
  57. data/lib/htm/config/builder.rb +167 -0
  58. data/lib/htm/config/database.rb +317 -0
  59. data/lib/htm/config/defaults.yml +37 -9
  60. data/lib/htm/config/section.rb +74 -0
  61. data/lib/htm/config/validator.rb +83 -0
  62. data/lib/htm/config.rb +64 -360
  63. data/lib/htm/database.rb +85 -127
  64. data/lib/htm/errors.rb +14 -0
  65. data/lib/htm/integrations/sinatra.rb +13 -44
  66. data/lib/htm/jobs/generate_embedding_job.rb +3 -4
  67. data/lib/htm/jobs/generate_propositions_job.rb +4 -5
  68. data/lib/htm/jobs/generate_tags_job.rb +16 -15
  69. data/lib/htm/loaders/defaults_loader.rb +23 -0
  70. data/lib/htm/loaders/markdown_loader.rb +17 -15
  71. data/lib/htm/loaders/xdg_config_loader.rb +9 -9
  72. data/lib/htm/long_term_memory/fulltext_search.rb +14 -14
  73. data/lib/htm/long_term_memory/hybrid_search.rb +396 -229
  74. data/lib/htm/long_term_memory/node_operations.rb +24 -23
  75. data/lib/htm/long_term_memory/relevance_scorer.rb +23 -20
  76. data/lib/htm/long_term_memory/robot_operations.rb +4 -4
  77. data/lib/htm/long_term_memory/tag_operations.rb +91 -77
  78. data/lib/htm/long_term_memory/vector_search.rb +4 -5
  79. data/lib/htm/long_term_memory.rb +13 -13
  80. data/lib/htm/mcp/cli.rb +115 -8
  81. data/lib/htm/mcp/resources.rb +4 -3
  82. data/lib/htm/mcp/server.rb +5 -4
  83. data/lib/htm/mcp/tools.rb +37 -28
  84. data/lib/htm/migration.rb +72 -0
  85. data/lib/htm/models/file_source.rb +52 -31
  86. data/lib/htm/models/node.rb +224 -108
  87. data/lib/htm/models/node_tag.rb +49 -28
  88. data/lib/htm/models/robot.rb +38 -27
  89. data/lib/htm/models/robot_node.rb +63 -35
  90. data/lib/htm/models/tag.rb +126 -123
  91. data/lib/htm/observability.rb +45 -41
  92. data/lib/htm/proposition_service.rb +76 -7
  93. data/lib/htm/railtie.rb +2 -2
  94. data/lib/htm/robot_group.rb +30 -18
  95. data/lib/htm/sequel_config.rb +215 -0
  96. data/lib/htm/sql_builder.rb +14 -16
  97. data/lib/htm/tag_service.rb +78 -0
  98. data/lib/htm/tasks.rb +3 -0
  99. data/lib/htm/version.rb +1 -1
  100. data/lib/htm/workflows/remember_workflow.rb +6 -5
  101. data/lib/htm.rb +26 -22
  102. data/lib/tasks/db.rake +0 -2
  103. data/lib/tasks/doc.rake +2 -2
  104. data/lib/tasks/files.rake +11 -18
  105. data/lib/tasks/htm.rake +190 -62
  106. data/lib/tasks/jobs.rake +179 -54
  107. data/lib/tasks/tags.rake +8 -13
  108. data/scripts/backfill_parent_tags.rb +376 -0
  109. data/scripts/normalize_plural_tags.rb +335 -0
  110. metadata +109 -80
  111. data/examples/rails_app/app/controllers/tags_controller.rb +0 -30
  112. data/examples/sinatra_app/Gemfile.lock +0 -166
  113. data/lib/htm/active_record_config.rb +0 -104
  114. /data/examples/{config_file_example → 02_config_file_example}/README.md +0 -0
  115. /data/examples/{config_file_example → 02_config_file_example}/config/htm.local.yml +0 -0
  116. /data/examples/{config_file_example → 02_config_file_example}/custom_config.yml +0 -0
  117. /data/examples/{config_file_example → 02_config_file_example}/show_config.rb +0 -0
  118. /data/examples/{example_app → 06_example_app}/Rakefile +0 -0
  119. /data/examples/{cli_app → 07_cli_app}/README.md +0 -0
  120. /data/examples/{sinatra_app → 08_sinatra_app}/Gemfile +0 -0
  121. /data/examples/{telemetry → 10_telemetry}/README.md +0 -0
  122. /data/examples/{telemetry → 10_telemetry}/grafana/dashboards/htm-metrics.json +0 -0
  123. /data/examples/{rails_app → 12_rails_app}/.gitignore +0 -0
  124. /data/examples/{rails_app → 12_rails_app}/Procfile.dev +0 -0
  125. /data/examples/{rails_app → 12_rails_app}/README.md +0 -0
  126. /data/examples/{rails_app → 12_rails_app}/Rakefile +0 -0
  127. /data/examples/{rails_app → 12_rails_app}/app/assets/stylesheets/application.css +0 -0
  128. /data/examples/{rails_app → 12_rails_app}/app/assets/stylesheets/inter-font.css +0 -0
  129. /data/examples/{rails_app → 12_rails_app}/app/controllers/application_controller.rb +0 -0
  130. /data/examples/{rails_app → 12_rails_app}/app/controllers/search_controller.rb +0 -0
  131. /data/examples/{rails_app → 12_rails_app}/app/javascript/application.js +0 -0
  132. /data/examples/{rails_app → 12_rails_app}/app/javascript/controllers/application.js +0 -0
  133. /data/examples/{rails_app → 12_rails_app}/app/javascript/controllers/index.js +0 -0
  134. /data/examples/{rails_app → 12_rails_app}/app/views/files/index.html.erb +0 -0
  135. /data/examples/{rails_app → 12_rails_app}/app/views/files/show.html.erb +0 -0
  136. /data/examples/{rails_app → 12_rails_app}/app/views/layouts/application.html.erb +0 -0
  137. /data/examples/{rails_app → 12_rails_app}/app/views/memories/index.html.erb +0 -0
  138. /data/examples/{rails_app → 12_rails_app}/app/views/memories/new.html.erb +0 -0
  139. /data/examples/{rails_app → 12_rails_app}/app/views/robots/new.html.erb +0 -0
  140. /data/examples/{rails_app → 12_rails_app}/app/views/shared/_navbar.html.erb +0 -0
  141. /data/examples/{rails_app → 12_rails_app}/app/views/shared/_stat_card.html.erb +0 -0
  142. /data/examples/{rails_app → 12_rails_app}/bin/dev +0 -0
  143. /data/examples/{rails_app → 12_rails_app}/bin/rails +0 -0
  144. /data/examples/{rails_app → 12_rails_app}/bin/rake +0 -0
  145. /data/examples/{rails_app → 12_rails_app}/config/application.rb +0 -0
  146. /data/examples/{rails_app → 12_rails_app}/config/boot.rb +0 -0
  147. /data/examples/{rails_app → 12_rails_app}/config/database.yml +0 -0
  148. /data/examples/{rails_app → 12_rails_app}/config/environment.rb +0 -0
  149. /data/examples/{rails_app → 12_rails_app}/config/importmap.rb +0 -0
  150. /data/examples/{rails_app → 12_rails_app}/config/routes.rb +0 -0
  151. /data/examples/{rails_app → 12_rails_app}/config/tailwind.config.js +0 -0
  152. /data/examples/{rails_app → 12_rails_app}/config.ru +0 -0
  153. /data/examples/{rails_app → 12_rails_app}/log/.keep +0 -0
  154. /data/examples/{rails_app → 12_rails_app}/tmp/local_secret.txt +0 -0
data/db/migrate/00007_create_robot_nodes.rb CHANGED
@@ -1,33 +1,47 @@
1
1
  # frozen_string_literal: true
2
2
 
3
- class CreateRobotNodes < ActiveRecord::Migration[7.1]
4
- def change
5
- create_table :robot_nodes, comment: 'Join table connecting robots to nodes (many-to-many)' do |t|
6
- t.bigint :robot_id, null: false, comment: 'ID of the robot that remembered this node'
7
- t.bigint :node_id, null: false, comment: 'ID of the node being remembered'
8
- t.timestamptz :first_remembered_at, default: -> { 'CURRENT_TIMESTAMP' },
9
- comment: 'When this robot first remembered this content'
10
- t.timestamptz :last_remembered_at, default: -> { 'CURRENT_TIMESTAMP' },
11
- comment: 'When this robot last tried to remember this content'
12
- t.integer :remember_count, default: 1, null: false,
13
- comment: 'Number of times this robot has tried to remember this content'
14
- t.boolean :working_memory, default: false, null: false,
15
- comment: 'True if this node is currently in the robot working memory'
16
- t.timestamptz :created_at, default: -> { 'CURRENT_TIMESTAMP' }
17
- t.timestamptz :updated_at, default: -> { 'CURRENT_TIMESTAMP' }
18
- t.timestamptz :deleted_at, comment: 'Soft delete timestamp'
3
+ require_relative '../../lib/htm/migration'
4
+
5
+ class CreateRobotNodes < HTM::Migration
6
+ def up
7
+ create_table(:robot_nodes) do
8
+ primary_key :id
9
+ Bignum :robot_id, null: false
10
+ Bignum :node_id, null: false
11
+ DateTime :first_remembered_at, default: Sequel::CURRENT_TIMESTAMP
12
+ DateTime :last_remembered_at, default: Sequel::CURRENT_TIMESTAMP
13
+ Integer :remember_count, default: 1, null: false
14
+ TrueClass :working_memory, default: false, null: false
15
+ DateTime :created_at, default: Sequel::CURRENT_TIMESTAMP
16
+ DateTime :updated_at, default: Sequel::CURRENT_TIMESTAMP
17
+ DateTime :deleted_at
18
+ end
19
+
20
+ add_index :robot_nodes, [:robot_id, :node_id], unique: true, name: :idx_robot_nodes_unique
21
+ add_index :robot_nodes, :robot_id, name: :idx_robot_nodes_robot_id
22
+ add_index :robot_nodes, :node_id, name: :idx_robot_nodes_node_id
23
+ add_index :robot_nodes, :last_remembered_at, name: :idx_robot_nodes_last_remembered_at
24
+ add_index :robot_nodes, :deleted_at, name: :idx_robot_nodes_deleted_at
25
+
26
+ # Partial index for working memory queries
27
+ run "CREATE INDEX idx_robot_nodes_working_memory ON robot_nodes (robot_id, working_memory) WHERE working_memory = true"
28
+
29
+ alter_table(:robot_nodes) do
30
+ add_foreign_key [:robot_id], :robots, on_delete: :cascade
31
+ add_foreign_key [:node_id], :nodes, on_delete: :cascade
19
32
  end
20
33
 
21
- add_index :robot_nodes, [:robot_id, :node_id], unique: true, name: 'idx_robot_nodes_unique'
22
- add_index :robot_nodes, :robot_id, name: 'idx_robot_nodes_robot_id'
23
- add_index :robot_nodes, :node_id, name: 'idx_robot_nodes_node_id'
24
- add_index :robot_nodes, :last_remembered_at, name: 'idx_robot_nodes_last_remembered_at'
25
- add_index :robot_nodes, :deleted_at, name: 'idx_robot_nodes_deleted_at'
26
- add_index :robot_nodes, [:robot_id, :working_memory],
27
- where: 'working_memory = true',
28
- name: 'idx_robot_nodes_working_memory'
34
+ run "COMMENT ON TABLE robot_nodes IS 'Join table connecting robots to nodes (many-to-many)'"
35
+ run "COMMENT ON COLUMN robot_nodes.robot_id IS 'ID of the robot that remembered this node'"
36
+ run "COMMENT ON COLUMN robot_nodes.node_id IS 'ID of the node being remembered'"
37
+ run "COMMENT ON COLUMN robot_nodes.first_remembered_at IS 'When this robot first remembered this content'"
38
+ run "COMMENT ON COLUMN robot_nodes.last_remembered_at IS 'When this robot last tried to remember this content'"
39
+ run "COMMENT ON COLUMN robot_nodes.remember_count IS 'Number of times this robot has tried to remember this content'"
40
+ run "COMMENT ON COLUMN robot_nodes.working_memory IS 'True if this node is currently in the robot working memory'"
41
+ run "COMMENT ON COLUMN robot_nodes.deleted_at IS 'Soft delete timestamp'"
42
+ end
29
43
 
30
- add_foreign_key :robot_nodes, :robots, column: :robot_id, on_delete: :cascade
31
- add_foreign_key :robot_nodes, :nodes, column: :node_id, on_delete: :cascade
44
+ def down
45
+ drop_table(:robot_nodes)
32
46
  end
33
47
  end
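The rewritten migration moves the table and column comments out of the ActiveRecord `comment:` options and into eight separate `run "COMMENT ON ..."` calls. As a minimal sketch of how those statements could be built from one hash instead of being spelled out line by line, the helpers below generate the same SQL strings. `quote_sql_literal`, `comment_statements`, and `ROBOT_NODE_COMMENTS` are hypothetical names used for illustration; they are not part of HTM or Sequel, and the sketch assumes the table and column names are trusted identifiers.

```ruby
# frozen_string_literal: true

# Hypothetical helper (not part of HTM or Sequel): wraps text in single quotes
# and doubles any embedded single quote, as SQL string literals require.
def quote_sql_literal(text)
  "'#{text.gsub("'", "''")}'"
end

# Hypothetical helper: returns one COMMENT ON TABLE statement followed by one
# COMMENT ON COLUMN statement per entry in column_comments.
def comment_statements(table, table_comment, column_comments)
  statements = ["COMMENT ON TABLE #{table} IS #{quote_sql_literal(table_comment)}"]
  column_comments.each do |column, text|
    statements << "COMMENT ON COLUMN #{table}.#{column} IS #{quote_sql_literal(text)}"
  end
  statements
end

# A subset of the comments from the migration above, kept short for the example.
ROBOT_NODE_COMMENTS = {
  robot_id: 'ID of the robot that remembered this node',
  node_id: 'ID of the node being remembered',
  deleted_at: 'Soft delete timestamp'
}.freeze

statements = comment_statements(
  :robot_nodes,
  'Join table connecting robots to nodes (many-to-many)',
  ROBOT_NODE_COMMENTS
)
statements.each { |sql| puts sql }
```

Inside a migration's `up` method, each generated string could be passed to `run` in a loop. Whether that is preferable to the explicit lines the package ships is a style choice.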
data/db/schema.sql CHANGED
@@ -3,6 +3,22 @@
3
3
  -- DO NOT EDIT THIS FILE MANUALLY
4
4
  -- Run 'rake htm:db:schema:dump' to regenerate
5
5
 
6
+ --
7
+ -- Name: paradedb; Type: SCHEMA; Schema: -; Owner: -
8
+ --
9
+
10
+ CREATE SCHEMA paradedb;
11
+
12
+ --
13
+ -- Name: pg_search; Type: EXTENSION; Schema: -; Owner: -
14
+ --
15
+
16
+ CREATE EXTENSION IF NOT EXISTS pg_search WITH SCHEMA paradedb;
17
+
18
+ --
19
+ -- Name: EXTENSION pg_search; Type: COMMENT; Schema: -; Owner: -
20
+ --
21
+
6
22
  --
7
23
  -- Name: pg_trgm; Type: EXTENSION; Schema: -; Owner: -
8
24
  --
@@ -793,4 +809,4 @@ ALTER TABLE ONLY public.robot_nodes
793
809
  -- PostgreSQL database dump complete
794
810
  --
795
811
 
796
- \unrestrict 4WlUqnJzNHaNhcr67XLIIhAvRPidZODUPGkM34l27SvmC0zu6dIsQdJ8dtu589Z
812
+ \unrestrict 1ItB7RQU4jC5IvOL40FU9j9sS6bzk9jcKeDUYSOd78ym0sA7pq0FXYSOEoWsPh7
data/db/seeds.rb CHANGED
@@ -6,14 +6,14 @@
6
6
  # and creates memory nodes with embeddings and tags.
7
7
  #
8
8
  # Configuration is read from environment variables:
9
- # HTM_EMBEDDING_PROVIDER - Embedding provider (default: ollama)
10
- # HTM_EMBEDDING_MODEL - Embedding model (default: nomic-embed-text)
11
- # HTM_EMBEDDING_DIMENSIONS - Embedding dimensions (default: 768)
12
- # HTM_TAG_PROVIDER - Tag extraction provider (default: ollama)
13
- # HTM_TAG_MODEL - Tag extraction model (default: gemma3)
14
- # OLLAMA_URL - Ollama server URL (default: http://localhost:11434)
15
- # HTM_EMBEDDING_TIMEOUT - Embedding generation timeout in seconds (default: 120)
16
- # HTM_TAG_TIMEOUT - Tag generation timeout in seconds (default: 180)
9
+ # HTM_EMBEDDING__PROVIDER - Embedding provider (default: ollama)
10
+ # HTM_EMBEDDING__MODEL - Embedding model (default: nomic-embed-text)
11
+ # HTM_EMBEDDING__DIMENSIONS - Embedding dimensions (default: 768)
12
+ # HTM_TAG__PROVIDER - Tag extraction provider (default: ollama)
13
+ # HTM_TAG__MODEL - Tag extraction model (default: gemma3)
14
+ # HTM_PROVIDERS__OLLAMA__URL - Ollama server URL (default: http://localhost:11434)
15
+ # HTM_EMBEDDING__TIMEOUT - Embedding generation timeout in seconds (default: 120)
16
+ # HTM_TAG__TIMEOUT - Tag generation timeout in seconds (default: 180)
17
17
  # HTM_CONNECTION_TIMEOUT - LLM connection timeout in seconds (default: 30)
18
18
  # HTM_DATABASE__URL - Database connection URL
19
19
  #
@@ -30,13 +30,13 @@ puts "=" * 80
30
30
  puts
31
31
 
32
32
  # Configure HTM using environment variables or defaults
33
- embedding_provider = (ENV['HTM_EMBEDDING_PROVIDER'] || 'ollama').to_sym
34
- embedding_model = ENV['HTM_EMBEDDING_MODEL'] || 'nomic-embed-text'
35
- embedding_dimensions = (ENV['HTM_EMBEDDING_DIMENSIONS'] || '768').to_i
36
- tag_provider = (ENV['HTM_TAG_PROVIDER'] || 'ollama').to_sym
37
- tag_model = ENV['HTM_TAG_MODEL'] || 'gemma3'
38
- embedding_timeout = (ENV['HTM_EMBEDDING_TIMEOUT'] || '120').to_i
39
- tag_timeout = (ENV['HTM_TAG_TIMEOUT'] || '180').to_i
33
+ embedding_provider = (ENV['HTM_EMBEDDING__PROVIDER'] || 'ollama').to_sym
34
+ embedding_model = ENV['HTM_EMBEDDING__MODEL'] || 'nomic-embed-text'
35
+ embedding_dimensions = (ENV['HTM_EMBEDDING__DIMENSIONS'] || '768').to_i
36
+ tag_provider = (ENV['HTM_TAG__PROVIDER'] || 'ollama').to_sym
37
+ tag_model = ENV['HTM_TAG__MODEL'] || 'gemma3'
38
+ embedding_timeout = (ENV['HTM_EMBEDDING__TIMEOUT'] || '120').to_i
39
+ tag_timeout = (ENV['HTM_TAG__TIMEOUT'] || '180').to_i
40
40
  connection_timeout = (ENV['HTM_CONNECTION_TIMEOUT'] || '60').to_i
41
41
 
42
42
  puts "Configuration:"
@@ -49,15 +49,15 @@ puts " Timeouts: embedding=#{embedding_timeout}s, tag=#{tag_timeout}s, connecti
49
49
  puts
50
50
 
51
51
  HTM.configure do |c|
52
- c.embedding_provider = embedding_provider
53
- c.embedding_model = embedding_model
54
- c.embedding_dimensions = embedding_dimensions
55
- c.tag_provider = tag_provider
56
- c.tag_model = tag_model
57
- c.embedding_timeout = embedding_timeout
58
- c.tag_timeout = tag_timeout
52
+ c.embedding.provider = embedding_provider
53
+ c.embedding.model = embedding_model
54
+ c.embedding.dimensions = embedding_dimensions
55
+ c.tag.provider = tag_provider
56
+ c.tag.model = tag_model
57
+ c.embedding.timeout = embedding_timeout
58
+ c.tag.timeout = tag_timeout
59
59
  c.connection_timeout = connection_timeout
60
- c.ollama_url = ENV['OLLAMA_URL'] if ENV['OLLAMA_URL']
60
+ c.providers.ollama.url = ENV['HTM_PROVIDERS__OLLAMA__URL'] if ENV['HTM_PROVIDERS__OLLAMA__URL']
61
61
  c.reset_to_defaults # Apply default implementations with configured settings
62
62
  end
63
63
 
@@ -72,32 +72,32 @@ puts "Creating sample conversation..."
72
72
 
73
73
  htm.remember(
74
74
  "What is TimescaleDB good for?",
75
- source: "user"
75
+ metadata: { source: "user" }
76
76
  )
77
77
 
78
78
  htm.remember(
79
79
  "PostgreSQL with TimescaleDB provides efficient time-series data storage and querying capabilities.",
80
- source: "assistant"
80
+ metadata: { source: "assistant" }
81
81
  )
82
82
 
83
83
  htm.remember(
84
84
  "How much training data do ML models need?",
85
- source: "user"
85
+ metadata: { source: "user" }
86
86
  )
87
87
 
88
88
  htm.remember(
89
89
  "Machine learning models require large amounts of training data to achieve good performance.",
90
- source: "assistant"
90
+ metadata: { source: "assistant" }
91
91
  )
92
92
 
93
93
  htm.remember(
94
94
  "Tell me about Ruby on Rails",
95
- source: "user"
95
+ metadata: { source: "user" }
96
96
  )
97
97
 
98
98
  htm.remember(
99
99
  "Ruby on Rails is a web framework for building database-backed applications.",
100
- source: "assistant"
100
+ metadata: { source: "assistant" }
101
101
  )
102
102
 
103
103
  puts "✓ Created 6 conversation messages (3 exchanges)"
@@ -135,7 +135,7 @@ if Dir.exist?(seed_data_dir)
135
135
  # Save previous section if we have one
136
136
  if current_section && current_paragraph.any?
137
137
  paragraph_text = current_paragraph.join(' ')
138
- htm.remember(paragraph_text, source: filename)
138
+ htm.remember(paragraph_text, metadata: { source: filename })
139
139
  count += 1
140
140
  print "." if count % 10 == 0
141
141
  end
@@ -152,7 +152,7 @@ if Dir.exist?(seed_data_dir)
152
152
  # Don't forget the last section
153
153
  if current_section && current_paragraph.any?
154
154
  paragraph_text = current_paragraph.join(' ')
155
- htm.remember(paragraph_text, source: filename)
155
+ htm.remember(paragraph_text, metadata: { source: filename })
156
156
  count += 1
157
157
  end
158
158
 
@@ -187,7 +187,7 @@ puts
187
187
  puts "Checking completion status..."
188
188
 
189
189
  # Check completion status
190
- nodes_with_embeddings = HTM::Models::Node.where.not(embedding: nil).count
190
+ nodes_with_embeddings = HTM::Models::Node.exclude(embedding: nil).count
191
191
  puts " - Nodes with embeddings: #{nodes_with_embeddings}/#{total_records}"
192
192
 
193
193
  total_tags = HTM::Models::NodeTag.count
@@ -203,6 +203,6 @@ if nodes_with_embeddings == total_records && total_tags > 0
203
203
  else
204
204
  puts "⚠ Some background jobs may still be running."
205
205
  puts " Run this query to check progress:"
206
- puts " HTM::Models::Node.where.not(embedding: nil).count"
206
+ puts " HTM::Models::Node.exclude(embedding: nil).count"
207
207
  puts "=" * 80
208
208
  end
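The seeds.rb changes above rename flat variables such as `HTM_EMBEDDING_PROVIDER` to `HTM_EMBEDDING__PROVIDER`, and the matching setters move from `c.embedding_provider` to `c.embedding.provider`. This suggests that a double underscore separates nesting levels while a single underscore stays inside a key name, as in `HTM_CONNECTION_TIMEOUT`. The sketch below illustrates that mapping only. `nested_settings` is a hypothetical helper, and the loader HTM actually uses may behave differently, for example in type coercion or conflict handling.

```ruby
# frozen_string_literal: true

# Hypothetical helper (an assumption, not HTM's loader): turns prefixed
# environment variables into a nested hash, splitting key paths on "__".
def nested_settings(env, prefix: 'HTM_')
  env.each_with_object({}) do |(name, value), settings|
    next unless name.start_with?(prefix)

    *parents, leaf = name.delete_prefix(prefix).downcase.split('__')
    next if leaf.nil? # the variable name was just the prefix

    # Walk (and create) the intermediate sections, then store the value.
    section = parents.reduce(settings) { |node, key| node[key] ||= {} }
    section[leaf] = value
  end
end

sample_env = {
  'HTM_EMBEDDING__PROVIDER' => 'ollama',
  'HTM_EMBEDDING__TIMEOUT' => '120',
  'HTM_PROVIDERS__OLLAMA__URL' => 'http://localhost:11434',
  'HTM_CONNECTION_TIMEOUT' => '30',
  'PATH' => '/usr/bin'
}

settings = nested_settings(sample_env)
puts settings.dig('providers', 'ollama', 'url')
puts settings.dig('embedding', 'provider')
```

Values stay strings in this sketch, which is why seeds.rb still calls `.to_i` and `.to_sym` on what it reads from `ENV`. Passing `ENV` itself in place of `sample_env` should work the same way, since `ENV` enumerates name and value pairs.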
data/docs/database/naming-convention.md ADDED
@@ -0,0 +1,244 @@
1
+ # Database Naming Convention
2
+
3
+ HTM enforces a strict database naming convention to prevent accidental data corruption or loss from operating on the wrong database.
4
+
5
+ ## The Convention
6
+
7
+ Database names **must** follow this exact format:
8
+
9
+ ```
10
+ {service_name}_{environment}
11
+ ```
12
+
13
+ Where:
14
+ - `service_name` is the value of `config.service.name` (default: `htm`)
15
+ - `environment` is the value of `HTM_ENV` (or `RAILS_ENV` / `RACK_ENV` fallback)
16
+
17
+ ## Valid Examples
18
+
19
+ | Service Name | Environment | Expected Database Name |
20
+ |--------------|-------------|------------------------|
21
+ | `htm` | `development` | `htm_development` |
22
+ | `htm` | `test` | `htm_test` |
23
+ | `htm` | `production` | `htm_production` |
24
+ | `payroll` | `development` | `payroll_development` |
25
+ | `payroll` | `production` | `payroll_production` |
26
+
27
+ ## Why This Matters
28
+
29
+ Without strict enforcement, dangerous misconfigurations can go undetected:
30
+
31
+ ### Scenario 1: Environment Mismatch
32
+
33
+ ```bash
34
+ # Developer thinks they're in test, but connected to production
35
+ HTM_ENV=test
36
+ HTM_DATABASE__URL="postgresql://user@host/htm_production"
37
+
38
+ rake htm:db:drop # DISASTER: Drops production database!
39
+ ```
40
+
41
+ With the naming convention enforced, this command fails immediately:
42
+
43
+ ```
44
+ Error: Database name does not follow naming convention!
45
+
46
+ Database names must be: {service_name}_{environment}
47
+
48
+ Service name: htm
49
+ Environment: test
50
+ Expected: htm_test
51
+ Actual: htm_production
52
+ ```
53
+
54
+ ### Scenario 2: Service Mismatch
55
+
56
+ ```bash
57
+ # HTM configured to use another application's database
58
+ HTM_ENV=production
59
+ # service.name = "htm" (default)
60
+ HTM_DATABASE__URL="postgresql://user@host/payroll_production"
61
+
62
+ rake htm:db:setup # DISASTER: Corrupts payroll application's data!
63
+ ```
64
+
65
+ With enforcement, this fails:
66
+
67
+ ```
68
+ Error: Database name does not follow naming convention!
69
+
70
+ Service name: htm
71
+ Environment: production
72
+ Expected: htm_production
73
+ Actual: payroll_production
74
+ ```
75
+
76
+ ## How Enforcement Works
77
+
78
+ ### Validation Points
79
+
80
+ The naming convention is validated at these points:
81
+
82
+ 1. **All rake tasks** that depend on `htm:db:validate` (setup, migrate, drop, etc.)
83
+ 2. **Programmatic access** via `HTM.config.validate_database_name!`
84
+
85
+ ### No Bypass Option
86
+
87
+ There is no way to skip this validation. If your database name doesn't match the convention, you must either:
88
+
89
+ 1. Rename your database to match the convention
90
+ 2. Change `HTM_ENV` to match the database suffix
91
+ 3. Change `config.service.name` to match the database prefix
92
+
93
+ ## Configuration
94
+
95
+ ### Setting the Service Name
96
+
97
+ The service name defaults to `htm`. To use a different name:
98
+
99
+ **Via environment variable:**
100
+ ```bash
101
+ export HTM_SERVICE__NAME="myapp"
102
+ ```
103
+
104
+ **Via configuration file (`config/htm.yml`):**
105
+ ```yaml
106
+ service:
107
+ name: myapp
108
+ ```
109
+
110
+ **Via Ruby configuration:**
111
+ ```ruby
112
+ HTM.configure do |config|
113
+ config.service.name = "myapp"
114
+ end
115
+ ```
116
+
117
+ ### Setting the Environment
118
+
119
+ Environment is determined by (in priority order):
120
+
121
+ 1. `HTM_ENV`
122
+ 2. `RAILS_ENV`
123
+ 3. `RACK_ENV`
124
+ 4. Default: `development`
125
+
126
+ **Valid environments:**
127
+ - `development`
128
+ - `test`
129
+ - `production`
130
+
131
+ These correspond to the top-level keys in `config/defaults.yml`.
132
+
133
+ ## Validation Methods
134
+
135
+ ### Check if Database Name is Valid
136
+
137
+ ```ruby
138
+ config = HTM.config
139
+
140
+ # Boolean check
141
+ if config.valid_database_name?
142
+ puts "Database name is correct"
143
+ else
144
+ puts "Expected: #{config.expected_database_name}"
145
+ puts "Actual: #{config.actual_database_name}"
146
+ end
147
+ ```
148
+
149
+ ### Raise Error on Invalid Name
150
+
151
+ ```ruby
152
+ config = HTM.config
153
+
154
+ # Raises HTM::ConfigurationError if invalid
155
+ config.validate_database_name!
156
+ ```
157
+
158
+ ### Get Expected and Actual Names
159
+
160
+ ```ruby
161
+ config = HTM.config
162
+
163
+ config.expected_database_name # => "htm_test"
164
+ config.actual_database_name # => Extracted from URL or config
165
+ ```
166
+
167
+ ## Rake Task Validation
168
+
169
+ All database-related rake tasks run validation automatically:
170
+
171
+ ```bash
172
+ # These all validate the naming convention first:
173
+ rake htm:db:setup
174
+ rake htm:db:migrate
175
+ rake htm:db:drop
176
+ rake htm:db:reset
177
+ rake htm:db:create
178
+ ```
179
+
180
+ To validate manually without performing any operation:
181
+
182
+ ```bash
183
+ rake htm:db:validate
184
+ ```
185
+
186
+ ## Migration Guide
187
+
188
+ If you have existing databases that don't follow the convention:
189
+
190
+ ### Option 1: Rename the Database
191
+
192
+ ```bash
193
+ # PostgreSQL
194
+ psql -c "ALTER DATABASE old_name RENAME TO htm_development;"
195
+ ```
196
+
197
+ ### Option 2: Export and Import
198
+
199
+ ```bash
200
+ # Export from old database
201
+ pg_dump old_database > backup.sql
202
+
203
+ # Create new database with correct name
204
+ createdb htm_development
205
+
206
+ # Import to new database
207
+ psql htm_development < backup.sql
208
+ ```
209
+
210
+ ### Option 3: Change Your Service Name
211
+
212
+ If your database is named `myapp_production`, set your service name to match:
213
+
214
+ ```bash
215
+ export HTM_SERVICE__NAME="myapp"
216
+ ```
217
+
218
+ ## Error Messages
219
+
220
+ When validation fails, you'll see a clear error message:
221
+
222
+ ```
223
+ Error: Database name 'wrong_db' does not match expected 'htm_test'.
224
+ Database names must follow the convention: {service_name}_{environment}
225
+ Service name: htm
226
+ Environment: test
227
+ Expected: htm_test
228
+ Actual: wrong_db
229
+
230
+ Either:
231
+ - Set HTM_DATABASE__URL to point to 'htm_test'
232
+ - Set HTM_DATABASE__NAME=htm_test
233
+ - Change HTM_ENV to match the database suffix
234
+ ```
235
+
236
+ ## Summary
237
+
238
+ The strict database naming convention:
239
+
240
+ - **Prevents** accidental operations on wrong environments
241
+ - **Prevents** cross-application database corruption
242
+ - **Requires** exact match of `{service_name}_{environment}`
243
+ - **Has no bypass** - you must fix the configuration
244
+ - **Validates automatically** on all database operations
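The rule documented in naming-convention.md reduces to three steps: resolve the environment in priority order, build `{service_name}_{environment}`, and compare that with the database name taken from the connection URL. The sketch below walks through those steps with the standard library only. The method names are hypothetical and this is not HTM's implementation, which exposes `valid_database_name?`, `expected_database_name`, and `actual_database_name` on `HTM.config`. Treating an empty variable as unset is also an assumption.

```ruby
# frozen_string_literal: true

require 'uri'

# Resolves the environment in the documented priority order:
# HTM_ENV, then RAILS_ENV, then RACK_ENV, then the "development" default.
# Assumption: an empty value counts as unset.
def resolve_environment(env)
  %w[HTM_ENV RAILS_ENV RACK_ENV].each do |key|
    value = env[key]
    return value unless value.nil? || value.empty?
  end
  'development'
end

# Builds the name the convention requires.
def expected_database_name(service_name, environment)
  "#{service_name}_#{environment}"
end

# Extracts the database name from the path component of a connection URL.
def database_name_from_url(url)
  URI.parse(url).path.to_s.delete_prefix('/')
end

# Hypothetical check that reports both names and whether they match.
def check_database_name(service_name:, env:, url:)
  expected = expected_database_name(service_name, resolve_environment(env))
  actual = database_name_from_url(url)
  { expected: expected, actual: actual, valid: expected == actual }
end

# Scenario 1 from the document: HTM_ENV=test, URL pointing at production.
result = check_database_name(
  service_name: 'htm',
  env: { 'HTM_ENV' => 'test' },
  url: 'postgresql://user@host/htm_production'
)
puts "Expected: #{result[:expected]}"
puts "Actual:   #{result[:actual]}"
puts "Valid:    #{result[:valid]}"
```

Passing a plain hash instead of reading `ENV` directly keeps the check easy to exercise in tests. A caller would pass `ENV` when checking the real process environment.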
data/docs/database_rake_tasks.md CHANGED
@@ -219,6 +219,37 @@ $ rake htm:db:reset
219
219
  # Runs drop (with confirmation) then setup
220
220
  ```
221
221
 
222
+ #### `rake htm:db:purge_all`
223
+ Permanently removes all soft-deleted records from all tables.
224
+
225
+ **What it does:**
226
+ - Removes soft-deleted nodes, node_tags, and robot_nodes
227
+ - Removes orphaned join table entries (pointing to non-existent nodes)
228
+ - Removes orphaned propositions (where source_node_id no longer exists)
229
+ - Removes orphaned robots (with no associated memory nodes)
230
+ - Deletes in correct order for referential integrity
231
+
232
+ **Safety:** Prompts for confirmation before deletion
233
+
234
+ ```bash
235
+ $ rake htm:db:purge_all
236
+
237
+ HTM Purge All Soft-Deleted Records
238
+ ============================================================
239
+
240
+ Records to permanently delete:
241
+ --------------------------------------------------------------
242
+ Soft-deleted nodes: 23
243
+ Soft-deleted node_tags: 45
244
+ Orphaned propositions: 5
245
+ Orphaned robots (no nodes): 2
246
+ --------------------------------------------------------------
247
+ Total records to delete: 75
248
+
249
+ Proceed with permanent deletion? (yes/no): yes
250
+ ✓ Purge complete!
251
+ ```
252
+
222
253
  ---
223
254
 
224
255
  ## Environment Variables