htm 0.0.20 → 0.0.30
- checksums.yaml +4 -4
- data/CHANGELOG.md +60 -0
- data/Rakefile +104 -18
- data/db/migrate/00001_enable_extensions.rb +9 -5
- data/db/migrate/00002_create_robots.rb +18 -6
- data/db/migrate/00003_create_file_sources.rb +30 -17
- data/db/migrate/00004_create_nodes.rb +60 -48
- data/db/migrate/00005_create_tags.rb +24 -12
- data/db/migrate/00006_create_node_tags.rb +28 -13
- data/db/migrate/00007_create_robot_nodes.rb +40 -26
- data/db/schema.sql +17 -1
- data/db/seeds.rb +33 -33
- data/docs/database/naming-convention.md +244 -0
- data/docs/database_rake_tasks.md +31 -0
- data/docs/development/rake-tasks.md +80 -35
- data/docs/guides/mcp-server.md +70 -1
- data/examples/.envrc +6 -0
- data/examples/.gitignore +2 -0
- data/examples/00_create_examples_db.rb +94 -0
- data/examples/{basic_usage.rb → 01_basic_usage.rb} +12 -16
- data/examples/{custom_llm_configuration.rb → 03_custom_llm_configuration.rb} +13 -3
- data/examples/{file_loader_usage.rb → 04_file_loader_usage.rb} +11 -14
- data/examples/{timeframe_demo.rb → 05_timeframe_demo.rb} +10 -3
- data/examples/{example_app → 06_example_app}/app.rb +15 -15
- data/examples/{cli_app → 07_cli_app}/htm_cli.rb +15 -22
- data/examples/08_sinatra_app/Gemfile.lock +241 -0
- data/examples/{sinatra_app → 08_sinatra_app}/app.rb +19 -18
- data/examples/{mcp_client.rb → 09_mcp_client.rb} +5 -8
- data/examples/{telemetry → 10_telemetry}/SETUP_README.md +1 -1
- data/examples/{telemetry → 10_telemetry}/demo.rb +14 -10
- data/examples/11_robot_groups/README.md +335 -0
- data/examples/{robot_groups → 11_robot_groups/lib}/robot_worker.rb +17 -3
- data/examples/{robot_groups → 11_robot_groups}/multi_process.rb +9 -9
- data/examples/{robot_groups → 11_robot_groups}/same_process.rb +9 -12
- data/examples/{rails_app → 12_rails_app}/Gemfile +3 -0
- data/examples/{rails_app → 12_rails_app}/Gemfile.lock +87 -58
- data/examples/{rails_app → 12_rails_app}/app/controllers/dashboard_controller.rb +10 -6
- data/examples/{rails_app → 12_rails_app}/app/controllers/files_controller.rb +5 -5
- data/examples/{rails_app → 12_rails_app}/app/controllers/memories_controller.rb +11 -7
- data/examples/{rails_app → 12_rails_app}/app/controllers/robots_controller.rb +8 -8
- data/examples/12_rails_app/app/controllers/tags_controller.rb +36 -0
- data/examples/{rails_app → 12_rails_app}/app/views/dashboard/index.html.erb +2 -2
- data/examples/{rails_app → 12_rails_app}/app/views/files/new.html.erb +5 -2
- data/examples/{rails_app → 12_rails_app}/app/views/memories/_memory_card.html.erb +3 -3
- data/examples/{rails_app → 12_rails_app}/app/views/memories/deleted.html.erb +3 -3
- data/examples/{rails_app → 12_rails_app}/app/views/memories/edit.html.erb +3 -3
- data/examples/{rails_app → 12_rails_app}/app/views/memories/show.html.erb +4 -4
- data/examples/{rails_app → 12_rails_app}/app/views/robots/index.html.erb +2 -2
- data/examples/{rails_app → 12_rails_app}/app/views/robots/show.html.erb +4 -4
- data/examples/{rails_app → 12_rails_app}/app/views/search/index.html.erb +1 -1
- data/examples/{rails_app → 12_rails_app}/app/views/tags/index.html.erb +2 -2
- data/examples/{rails_app → 12_rails_app}/app/views/tags/show.html.erb +1 -1
- data/examples/12_rails_app/config/initializers/htm.rb +7 -0
- data/examples/12_rails_app/config/initializers/rack.rb +5 -0
- data/examples/README.md +230 -211
- data/examples/examples_helper.rb +138 -0
- data/lib/htm/config/builder.rb +167 -0
- data/lib/htm/config/database.rb +317 -0
- data/lib/htm/config/defaults.yml +37 -9
- data/lib/htm/config/section.rb +74 -0
- data/lib/htm/config/validator.rb +83 -0
- data/lib/htm/config.rb +64 -360
- data/lib/htm/database.rb +85 -127
- data/lib/htm/errors.rb +14 -0
- data/lib/htm/integrations/sinatra.rb +13 -44
- data/lib/htm/jobs/generate_embedding_job.rb +3 -4
- data/lib/htm/jobs/generate_propositions_job.rb +4 -5
- data/lib/htm/jobs/generate_tags_job.rb +16 -15
- data/lib/htm/loaders/defaults_loader.rb +23 -0
- data/lib/htm/loaders/markdown_loader.rb +17 -15
- data/lib/htm/loaders/xdg_config_loader.rb +9 -9
- data/lib/htm/long_term_memory/fulltext_search.rb +14 -14
- data/lib/htm/long_term_memory/hybrid_search.rb +396 -229
- data/lib/htm/long_term_memory/node_operations.rb +24 -23
- data/lib/htm/long_term_memory/relevance_scorer.rb +23 -20
- data/lib/htm/long_term_memory/robot_operations.rb +4 -4
- data/lib/htm/long_term_memory/tag_operations.rb +91 -77
- data/lib/htm/long_term_memory/vector_search.rb +4 -5
- data/lib/htm/long_term_memory.rb +13 -13
- data/lib/htm/mcp/cli.rb +115 -8
- data/lib/htm/mcp/resources.rb +4 -3
- data/lib/htm/mcp/server.rb +5 -4
- data/lib/htm/mcp/tools.rb +37 -28
- data/lib/htm/migration.rb +72 -0
- data/lib/htm/models/file_source.rb +52 -31
- data/lib/htm/models/node.rb +224 -108
- data/lib/htm/models/node_tag.rb +49 -28
- data/lib/htm/models/robot.rb +38 -27
- data/lib/htm/models/robot_node.rb +63 -35
- data/lib/htm/models/tag.rb +126 -123
- data/lib/htm/observability.rb +45 -41
- data/lib/htm/proposition_service.rb +76 -7
- data/lib/htm/railtie.rb +2 -2
- data/lib/htm/robot_group.rb +30 -18
- data/lib/htm/sequel_config.rb +215 -0
- data/lib/htm/sql_builder.rb +14 -16
- data/lib/htm/tag_service.rb +78 -0
- data/lib/htm/tasks.rb +3 -0
- data/lib/htm/version.rb +1 -1
- data/lib/htm/workflows/remember_workflow.rb +6 -5
- data/lib/htm.rb +26 -22
- data/lib/tasks/db.rake +0 -2
- data/lib/tasks/doc.rake +2 -2
- data/lib/tasks/files.rake +11 -18
- data/lib/tasks/htm.rake +190 -62
- data/lib/tasks/jobs.rake +179 -54
- data/lib/tasks/tags.rake +8 -13
- data/scripts/backfill_parent_tags.rb +376 -0
- data/scripts/normalize_plural_tags.rb +335 -0
- metadata +109 -80
- data/examples/rails_app/app/controllers/tags_controller.rb +0 -30
- data/examples/sinatra_app/Gemfile.lock +0 -166
- data/lib/htm/active_record_config.rb +0 -104
- /data/examples/{config_file_example → 02_config_file_example}/README.md +0 -0
- /data/examples/{config_file_example → 02_config_file_example}/config/htm.local.yml +0 -0
- /data/examples/{config_file_example → 02_config_file_example}/custom_config.yml +0 -0
- /data/examples/{config_file_example → 02_config_file_example}/show_config.rb +0 -0
- /data/examples/{example_app → 06_example_app}/Rakefile +0 -0
- /data/examples/{cli_app → 07_cli_app}/README.md +0 -0
- /data/examples/{sinatra_app → 08_sinatra_app}/Gemfile +0 -0
- /data/examples/{telemetry → 10_telemetry}/README.md +0 -0
- /data/examples/{telemetry → 10_telemetry}/grafana/dashboards/htm-metrics.json +0 -0
- /data/examples/{rails_app → 12_rails_app}/.gitignore +0 -0
- /data/examples/{rails_app → 12_rails_app}/Procfile.dev +0 -0
- /data/examples/{rails_app → 12_rails_app}/README.md +0 -0
- /data/examples/{rails_app → 12_rails_app}/Rakefile +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/assets/stylesheets/application.css +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/assets/stylesheets/inter-font.css +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/controllers/application_controller.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/controllers/search_controller.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/javascript/application.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/javascript/controllers/application.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/javascript/controllers/index.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/files/index.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/files/show.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/layouts/application.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/memories/index.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/memories/new.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/robots/new.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/shared/_navbar.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/shared/_stat_card.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/bin/dev +0 -0
- /data/examples/{rails_app → 12_rails_app}/bin/rails +0 -0
- /data/examples/{rails_app → 12_rails_app}/bin/rake +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/application.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/boot.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/database.yml +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/environment.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/importmap.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/routes.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/tailwind.config.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/config.ru +0 -0
- /data/examples/{rails_app → 12_rails_app}/log/.keep +0 -0
- /data/examples/{rails_app → 12_rails_app}/tmp/local_secret.txt +0 -0
data/db/migrate/00007_create_robot_nodes.rb
CHANGED
@@ -1,33 +1,47 @@
 # frozen_string_literal: true

-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+require_relative '../../lib/htm/migration'
+
+class CreateRobotNodes < HTM::Migration
+  def up
+    create_table(:robot_nodes) do
+      primary_key :id
+      Bignum :robot_id, null: false
+      Bignum :node_id, null: false
+      DateTime :first_remembered_at, default: Sequel::CURRENT_TIMESTAMP
+      DateTime :last_remembered_at, default: Sequel::CURRENT_TIMESTAMP
+      Integer :remember_count, default: 1, null: false
+      TrueClass :working_memory, default: false, null: false
+      DateTime :created_at, default: Sequel::CURRENT_TIMESTAMP
+      DateTime :updated_at, default: Sequel::CURRENT_TIMESTAMP
+      DateTime :deleted_at
+    end
+
+    add_index :robot_nodes, [:robot_id, :node_id], unique: true, name: :idx_robot_nodes_unique
+    add_index :robot_nodes, :robot_id, name: :idx_robot_nodes_robot_id
+    add_index :robot_nodes, :node_id, name: :idx_robot_nodes_node_id
+    add_index :robot_nodes, :last_remembered_at, name: :idx_robot_nodes_last_remembered_at
+    add_index :robot_nodes, :deleted_at, name: :idx_robot_nodes_deleted_at
+
+    # Partial index for working memory queries
+    run "CREATE INDEX idx_robot_nodes_working_memory ON robot_nodes (robot_id, working_memory) WHERE working_memory = true"
+
+    alter_table(:robot_nodes) do
+      add_foreign_key [:robot_id], :robots, on_delete: :cascade
+      add_foreign_key [:node_id], :nodes, on_delete: :cascade
     end

-
-
-
-
-
-
-
-
+    run "COMMENT ON TABLE robot_nodes IS 'Join table connecting robots to nodes (many-to-many)'"
+    run "COMMENT ON COLUMN robot_nodes.robot_id IS 'ID of the robot that remembered this node'"
+    run "COMMENT ON COLUMN robot_nodes.node_id IS 'ID of the node being remembered'"
+    run "COMMENT ON COLUMN robot_nodes.first_remembered_at IS 'When this robot first remembered this content'"
+    run "COMMENT ON COLUMN robot_nodes.last_remembered_at IS 'When this robot last tried to remember this content'"
+    run "COMMENT ON COLUMN robot_nodes.remember_count IS 'Number of times this robot has tried to remember this content'"
+    run "COMMENT ON COLUMN robot_nodes.working_memory IS 'True if this node is currently in the robot working memory'"
+    run "COMMENT ON COLUMN robot_nodes.deleted_at IS 'Soft delete timestamp'"
+  end

-
-
+  def down
+    drop_table(:robot_nodes)
   end
 end
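The column comments in the `robot_nodes` migration describe one bookkeeping row per robot/node pair: a unique `(robot_id, node_id)` index, first and last remembered timestamps, and a counter. The following is a minimal in-memory sketch of that bookkeeping, assuming that a repeated remember updates the existing row instead of inserting a second one. The migration only defines the schema, so `RobotNodeRow`, `RobotNodeLedger`, and `remember` are hypothetical names for illustration and are not part of HTM.

```ruby
# Hypothetical in-memory stand-in for the robot_nodes table.
# Not HTM code: it only mirrors the columns and the unique
# (robot_id, node_id) index declared in the migration.
RobotNodeRow = Struct.new(
  :robot_id, :node_id, :first_remembered_at, :last_remembered_at,
  :remember_count, :working_memory, :deleted_at,
  keyword_init: true
)

class RobotNodeLedger
  def initialize
    # Keyed by [robot_id, node_id], so a pair can exist only once,
    # which is what idx_robot_nodes_unique enforces in the database.
    @rows = {}
  end

  # Assumed behaviour: the first call creates the row with the column
  # defaults; later calls bump remember_count and last_remembered_at.
  def remember(robot_id:, node_id:, now: Time.now)
    key = [robot_id, node_id]
    row = @rows[key]
    if row
      row.remember_count += 1
      row.last_remembered_at = now
    else
      row = RobotNodeRow.new(
        robot_id: robot_id, node_id: node_id,
        first_remembered_at: now, last_remembered_at: now,
        remember_count: 1, working_memory: false, deleted_at: nil
      )
      @rows[key] = row
    end
    row
  end

  # Rows that the partial index idx_robot_nodes_working_memory would cover.
  def working_memory_for(robot_id)
    @rows.values.select { |r| r.robot_id == robot_id && r.working_memory }
  end

  def size
    @rows.size
  end
end

ledger = RobotNodeLedger.new
started = Time.at(0)
ledger.remember(robot_id: 1, node_id: 10, now: started)
row = ledger.remember(robot_id: 1, node_id: 10, now: started + 60)
puts "rows=#{ledger.size} remember_count=#{row.remember_count}"
```

Keying the hash on the pair plays the role of the unique index. In the real table the same guarantee comes from the database, so concurrent writers would need an upsert or a retry on a uniqueness violation.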
data/db/schema.sql
CHANGED
@@ -3,6 +3,22 @@
 -- DO NOT EDIT THIS FILE MANUALLY
 -- Run 'rake htm:db:schema:dump' to regenerate

+--
+-- Name: paradedb; Type: SCHEMA; Schema: -; Owner: -
+--
+
+CREATE SCHEMA paradedb;
+
+--
+-- Name: pg_search; Type: EXTENSION; Schema: -; Owner: -
+--
+
+CREATE EXTENSION IF NOT EXISTS pg_search WITH SCHEMA paradedb;
+
+--
+-- Name: EXTENSION pg_search; Type: COMMENT; Schema: -; Owner: -
+--
+
 --
 -- Name: pg_trgm; Type: EXTENSION; Schema: -; Owner: -
 --
@@ -793,4 +809,4 @@ ALTER TABLE ONLY public.robot_nodes
 -- PostgreSQL database dump complete
 --

-\unrestrict
+\unrestrict 1ItB7RQU4jC5IvOL40FU9j9sS6bzk9jcKeDUYSOd78ym0sA7pq0FXYSOEoWsPh7
data/db/seeds.rb
CHANGED
@@ -6,14 +6,14 @@
 # and creates memory nodes with embeddings and tags.
 #
 # Configuration is read from environment variables:
-#
-#
-#
-#
-#
-#
-#
-#
+# HTM_EMBEDDING__PROVIDER - Embedding provider (default: ollama)
+# HTM_EMBEDDING__MODEL - Embedding model (default: nomic-embed-text)
+# HTM_EMBEDDING__DIMENSIONS - Embedding dimensions (default: 768)
+# HTM_TAG__PROVIDER - Tag extraction provider (default: ollama)
+# HTM_TAG__MODEL - Tag extraction model (default: gemma3)
+# HTM_PROVIDERS__OLLAMA__URL - Ollama server URL (default: http://localhost:11434)
+# HTM_EMBEDDING__TIMEOUT - Embedding generation timeout in seconds (default: 120)
+# HTM_TAG__TIMEOUT - Tag generation timeout in seconds (default: 180)
 # HTM_CONNECTION_TIMEOUT - LLM connection timeout in seconds (default: 30)
 # HTM_DATABASE__URL - Database connection URL
 #
@@ -30,13 +30,13 @@ puts "=" * 80
 puts

 # Configure HTM using environment variables or defaults
-embedding_provider = (ENV['
-embedding_model = ENV['
-embedding_dimensions = (ENV['
-tag_provider = (ENV['
-tag_model = ENV['
-embedding_timeout = (ENV['
-tag_timeout = (ENV['
+embedding_provider = (ENV['HTM_EMBEDDING__PROVIDER'] || 'ollama').to_sym
+embedding_model = ENV['HTM_EMBEDDING__MODEL'] || 'nomic-embed-text'
+embedding_dimensions = (ENV['HTM_EMBEDDING__DIMENSIONS'] || '768').to_i
+tag_provider = (ENV['HTM_TAG__PROVIDER'] || 'ollama').to_sym
+tag_model = ENV['HTM_TAG__MODEL'] || 'gemma3'
+embedding_timeout = (ENV['HTM_EMBEDDING__TIMEOUT'] || '120').to_i
+tag_timeout = (ENV['HTM_TAG__TIMEOUT'] || '180').to_i
 connection_timeout = (ENV['HTM_CONNECTION_TIMEOUT'] || '60').to_i

 puts "Configuration:"
@@ -49,15 +49,15 @@ puts " Timeouts: embedding=#{embedding_timeout}s, tag=#{tag_timeout}s, connecti
 puts

 HTM.configure do |c|
-  c.
-  c.
-  c.
-  c.
-  c.
-  c.
-  c.
+  c.embedding.provider = embedding_provider
+  c.embedding.model = embedding_model
+  c.embedding.dimensions = embedding_dimensions
+  c.tag.provider = tag_provider
+  c.tag.model = tag_model
+  c.embedding.timeout = embedding_timeout
+  c.tag.timeout = tag_timeout
   c.connection_timeout = connection_timeout
-  c.
+  c.providers.ollama.url = ENV['HTM_PROVIDERS__OLLAMA__URL'] if ENV['HTM_PROVIDERS__OLLAMA__URL']
   c.reset_to_defaults # Apply default implementations with configured settings
 end

@@ -72,32 +72,32 @@ puts "Creating sample conversation..."

 htm.remember(
   "What is TimescaleDB good for?",
-  source: "user"
+  metadata: { source: "user" }
 )

 htm.remember(
   "PostgreSQL with TimescaleDB provides efficient time-series data storage and querying capabilities.",
-  source: "assistant"
+  metadata: { source: "assistant" }
 )

 htm.remember(
   "How much training data do ML models need?",
-  source: "user"
+  metadata: { source: "user" }
 )

 htm.remember(
   "Machine learning models require large amounts of training data to achieve good performance.",
-  source: "assistant"
+  metadata: { source: "assistant" }
 )

 htm.remember(
   "Tell me about Ruby on Rails",
-  source: "user"
+  metadata: { source: "user" }
 )

 htm.remember(
   "Ruby on Rails is a web framework for building database-backed applications.",
-  source: "assistant"
+  metadata: { source: "assistant" }
 )

 puts "✓ Created 6 conversation messages (3 exchanges)"
@@ -135,7 +135,7 @@ if Dir.exist?(seed_data_dir)
 # Save previous section if we have one
 if current_section && current_paragraph.any?
   paragraph_text = current_paragraph.join(' ')
-  htm.remember(paragraph_text, source: filename)
+  htm.remember(paragraph_text, metadata: { source: filename })
   count += 1
   print "." if count % 10 == 0
 end
@@ -152,7 +152,7 @@ if Dir.exist?(seed_data_dir)
 # Don't forget the last section
 if current_section && current_paragraph.any?
   paragraph_text = current_paragraph.join(' ')
-  htm.remember(paragraph_text, source: filename)
+  htm.remember(paragraph_text, metadata: { source: filename })
   count += 1
 end

@@ -187,7 +187,7 @@ puts
 puts "Checking completion status..."

 # Check completion status
-nodes_with_embeddings = HTM::Models::Node.
+nodes_with_embeddings = HTM::Models::Node.exclude(embedding: nil).count
 puts " - Nodes with embeddings: #{nodes_with_embeddings}/#{total_records}"

 total_tags = HTM::Models::NodeTag.count
@@ -203,6 +203,6 @@ if nodes_with_embeddings == total_records && total_tags > 0
 else
   puts "⚠ Some background jobs may still be running."
   puts " Run this query to check progress:"
-  puts " HTM::Models::Node.
+  puts " HTM::Models::Node.exclude(embedding: nil).count"
   puts "=" * 80
 end
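The variables introduced in `data/db/seeds.rb` use a double underscore (`HTM_EMBEDDING__PROVIDER`, `HTM_PROVIDERS__OLLAMA__URL`) where the configuration block uses nested accessors (`c.embedding.provider`, `c.providers.ollama.url`). Below is a minimal sketch of that naming pattern, assuming `__` separates nesting levels after the `HTM_` prefix. `nested_settings` is a hypothetical helper written for illustration; it does not show how HTM's own configuration loader is implemented.

```ruby
# Hypothetical helper: turn HTM_-prefixed variables into a nested hash,
# treating "__" as the separator between nesting levels.
def nested_settings(env, prefix: "HTM_")
  env.each_with_object({}) do |(name, value), settings|
    next unless name.start_with?(prefix)

    keys = name.delete_prefix(prefix).downcase.split("__").map(&:to_sym)
    leaf = keys.pop
    # Walk (and create) the intermediate hashes, then set the leaf value.
    node = keys.reduce(settings) { |hash, key| hash[key] ||= {} }
    node[leaf] = value
  end
end

# A plain hash stands in for ENV so the sketch has no side effects.
sample_env = {
  "HTM_EMBEDDING__PROVIDER"    => "ollama",
  "HTM_EMBEDDING__DIMENSIONS"  => "768",
  "HTM_PROVIDERS__OLLAMA__URL" => "http://localhost:11434",
  "HTM_CONNECTION_TIMEOUT"     => "30",
  "PATH"                       => "/usr/bin"
}

settings = nested_settings(sample_env)
puts settings[:embedding][:provider]
puts settings[:providers][:ollama][:url]
```

Values stay strings in this sketch, which is why the seed script converts them itself with `to_i` and `to_sym` before assigning them.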
data/docs/database/naming-convention.md
@@ -0,0 +1,244 @@
+# Database Naming Convention
+
+HTM enforces a strict database naming convention to prevent accidental data corruption or loss from operating on the wrong database.
+
+## The Convention
+
+Database names **must** follow this exact format:
+
+```
+{service_name}_{environment}
+```
+
+Where:
+- `service_name` is the value of `config.service.name` (default: `htm`)
+- `environment` is the value of `HTM_ENV` (or `RAILS_ENV` / `RACK_ENV` fallback)
+
+## Valid Examples
+
+| Service Name | Environment | Expected Database Name |
+|--------------|-------------|------------------------|
+| `htm` | `development` | `htm_development` |
+| `htm` | `test` | `htm_test` |
+| `htm` | `production` | `htm_production` |
+| `payroll` | `development` | `payroll_development` |
+| `payroll` | `production` | `payroll_production` |
+
+## Why This Matters
+
+Without strict enforcement, dangerous misconfigurations can go undetected:
+
+### Scenario 1: Environment Mismatch
+
+```bash
+# Developer thinks they're in test, but connected to production
+HTM_ENV=test
+HTM_DATABASE__URL="postgresql://user@host/htm_production"
+
+rake htm:db:drop # DISASTER: Drops production database!
+```
+
+With the naming convention enforced, this command fails immediately:
+
+```
+Error: Database name does not follow naming convention!
+
+Database names must be: {service_name}_{environment}
+
+Service name: htm
+Environment: test
+Expected: htm_test
+Actual: htm_production
+```
+
+### Scenario 2: Service Mismatch
+
+```bash
+# HTM configured to use another application's database
+HTM_ENV=production
+# service.name = "htm" (default)
+HTM_DATABASE__URL="postgresql://user@host/payroll_production"
+
+rake htm:db:setup # DISASTER: Corrupts payroll application's data!
+```
+
+With enforcement, this fails:
+
+```
+Error: Database name does not follow naming convention!
+
+Service name: htm
+Environment: production
+Expected: htm_production
+Actual: payroll_production
+```
+
+## How Enforcement Works
+
+### Validation Points
+
+The naming convention is validated at these points:
+
+1. **All rake tasks** that depend on `htm:db:validate` (setup, migrate, drop, etc.)
+2. **Programmatic access** via `HTM.config.validate_database_name!`
+
+### No Bypass Option
+
+There is no way to skip this validation. If your database name doesn't match the convention, you must either:
+
+1. Rename your database to match the convention
+2. Change `HTM_ENV` to match the database suffix
+3. Change `config.service.name` to match the database prefix
+
+## Configuration
+
+### Setting the Service Name
+
+The service name defaults to `htm`. To use a different name:
+
+**Via environment variable:**
+```bash
+export HTM_SERVICE__NAME="myapp"
+```
+
+**Via configuration file (`config/htm.yml`):**
+```yaml
+service:
+  name: myapp
+```
+
+**Via Ruby configuration:**
+```ruby
+HTM.configure do |config|
+  config.service.name = "myapp"
+end
+```
+
+### Setting the Environment
+
+Environment is determined by (in priority order):
+
+1. `HTM_ENV`
+2. `RAILS_ENV`
+3. `RACK_ENV`
+4. Default: `development`
+
+**Valid environments:**
+- `development`
+- `test`
+- `production`
+
+These correspond to the top-level keys in `config/defaults.yml`.
+
+## Validation Methods
+
+### Check if Database Name is Valid
+
+```ruby
+config = HTM.config
+
+# Boolean check
+if config.valid_database_name?
+  puts "Database name is correct"
+else
+  puts "Expected: #{config.expected_database_name}"
+  puts "Actual: #{config.actual_database_name}"
+end
+```
+
+### Raise Error on Invalid Name
+
+```ruby
+config = HTM.config
+
+# Raises HTM::ConfigurationError if invalid
+config.validate_database_name!
+```
+
+### Get Expected and Actual Names
+
+```ruby
+config = HTM.config
+
+config.expected_database_name # => "htm_test"
+config.actual_database_name # => Extracted from URL or config
+```
+
+## Rake Task Validation
+
+All database-related rake tasks run validation automatically:
+
+```bash
+# These all validate the naming convention first:
+rake htm:db:setup
+rake htm:db:migrate
+rake htm:db:drop
+rake htm:db:reset
+rake htm:db:create
+```
+
+To validate manually without performing any operation:
+
+```bash
+rake htm:db:validate
+```
+
+## Migration Guide
+
+If you have existing databases that don't follow the convention:
+
+### Option 1: Rename the Database
+
+```bash
+# PostgreSQL
+psql -c "ALTER DATABASE old_name RENAME TO htm_development;"
+```
+
+### Option 2: Export and Import
+
+```bash
+# Export from old database
+pg_dump old_database > backup.sql
+
+# Create new database with correct name
+createdb htm_development
+
+# Import to new database
+psql htm_development < backup.sql
+```
+
+### Option 3: Change Your Service Name
+
+If your database is named `myapp_production`, set your service name to match:
+
+```bash
+export HTM_SERVICE__NAME="myapp"
+```
+
+## Error Messages
+
+When validation fails, you'll see a clear error message:
+
+```
+Error: Database name 'wrong_db' does not match expected 'htm_test'.
+Database names must follow the convention: {service_name}_{environment}
+Service name: htm
+Environment: test
+Expected: htm_test
+Actual: wrong_db
+
+Either:
+- Set HTM_DATABASE__URL to point to 'htm_test'
+- Set HTM_DATABASE__NAME=htm_test
+- Change HTM_ENV to match the database suffix
+```
+
+## Summary
+
+The strict database naming convention:
+
+- **Prevents** accidental operations on wrong environments
+- **Prevents** cross-application database corruption
+- **Requires** exact match of `{service_name}_{environment}`
+- **Has no bypass** - you must fix the configuration
+- **Validates automatically** on all database operations
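The naming-convention document describes the check as a comparison between `{service_name}_{environment}` and the database name taken from the connection URL, with the environment resolved from `HTM_ENV`, then `RAILS_ENV`, then `RACK_ENV`, then `development`. The sketch below illustrates that logic with the standard library only. The method names echo the documented ones, but the bodies, the `DatabaseNameMismatch` class, and the assumption that the name is the URL's path component are illustrative guesses, not HTM's implementation.

```ruby
require "uri"

# Hypothetical error class for this sketch; the document names
# HTM::ConfigurationError, which is not reproduced here.
class DatabaseNameMismatch < StandardError; end

# Environment lookup in the documented priority order.
def resolve_environment(env)
  env["HTM_ENV"] || env["RAILS_ENV"] || env["RACK_ENV"] || "development"
end

def expected_database_name(service_name, environment)
  "#{service_name}_#{environment}"
end

# Assumption: the database name is the path component of the URL.
def actual_database_name(database_url)
  URI.parse(database_url).path.to_s.delete_prefix("/")
end

def validate_database_name!(service_name:, env:)
  expected = expected_database_name(service_name, resolve_environment(env))
  actual   = actual_database_name(env.fetch("HTM_DATABASE__URL"))
  return true if expected == actual

  raise DatabaseNameMismatch,
        "Database name '#{actual}' does not match expected '#{expected}'."
end

# Scenario 1 from the document: HTM_ENV=test pointed at the production database.
scenario = {
  "HTM_ENV"           => "test",
  "HTM_DATABASE__URL" => "postgresql://user@host/htm_production"
}

begin
  validate_database_name!(service_name: "htm", env: scenario)
rescue DatabaseNameMismatch => e
  puts e.message
end
```

Passing the environment in as a hash rather than reading `ENV` directly keeps the check easy to exercise against the two mismatch scenarios without touching the real process environment.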
data/docs/database_rake_tasks.md
CHANGED
@@ -219,6 +219,37 @@ $ rake htm:db:reset
 # Runs drop (with confirmation) then setup
 ```

+#### `rake htm:db:purge_all`
+Permanently removes all soft-deleted records from all tables.
+
+**What it does:**
+- Removes soft-deleted nodes, node_tags, and robot_nodes
+- Removes orphaned join table entries (pointing to non-existent nodes)
+- Removes orphaned propositions (where source_node_id no longer exists)
+- Removes orphaned robots (with no associated memory nodes)
+- Deletes in correct order for referential integrity
+
+**Safety:** Prompts for confirmation before deletion
+
+```bash
+$ rake htm:db:purge_all
+
+HTM Purge All Soft-Deleted Records
+============================================================
+
+Records to permanently delete:
+--------------------------------------------------------------
+Soft-deleted nodes: 23
+Soft-deleted node_tags: 45
+Orphaned propositions: 5
+Orphaned robots (no nodes): 2
+--------------------------------------------------------------
+Total records to delete: 75
+
+Proceed with permanent deletion? (yes/no): yes
+✓ Purge complete!
+```
+
 ---

 ## Environment Variables