htm 0.0.18 → 0.0.30
- checksums.yaml +4 -4
- data/CHANGELOG.md +119 -1
- data/README.md +12 -0
- data/Rakefile +104 -18
- data/db/migrate/00001_enable_extensions.rb +9 -5
- data/db/migrate/00002_create_robots.rb +18 -6
- data/db/migrate/00003_create_file_sources.rb +30 -17
- data/db/migrate/00004_create_nodes.rb +60 -48
- data/db/migrate/00005_create_tags.rb +24 -12
- data/db/migrate/00006_create_node_tags.rb +28 -13
- data/db/migrate/00007_create_robot_nodes.rb +40 -26
- data/db/schema.sql +17 -1
- data/db/seeds.rb +34 -34
- data/docs/api/embedding-service.md +140 -110
- data/docs/api/yard/HTM/ActiveRecordConfig.md +6 -0
- data/docs/api/yard/HTM/Config.md +173 -0
- data/docs/api/yard/HTM/ConfigSection.md +28 -0
- data/docs/api/yard/HTM/Database.md +1 -1
- data/docs/api/yard/HTM/Railtie.md +2 -2
- data/docs/api/yard/HTM.md +0 -57
- data/docs/api/yard/index.csv +76 -61
- data/docs/api/yard-reference.md +2 -1
- data/docs/architecture/adrs/003-ollama-embeddings.md +45 -36
- data/docs/architecture/adrs/004-hive-mind.md +1 -1
- data/docs/architecture/adrs/008-robot-identification.md +1 -1
- data/docs/architecture/index.md +11 -9
- data/docs/architecture/overview.md +11 -7
- data/docs/assets/images/balanced-strategy-decay.svg +41 -0
- data/docs/assets/images/class-hierarchy.svg +1 -1
- data/docs/assets/images/eviction-priority.svg +43 -0
- data/docs/assets/images/exception-hierarchy.svg +2 -2
- data/docs/assets/images/hive-mind-shared-memory.svg +52 -0
- data/docs/assets/images/htm-architecture-overview.svg +3 -3
- data/docs/assets/images/htm-core-components.svg +4 -4
- data/docs/assets/images/htm-layered-architecture.svg +1 -1
- data/docs/assets/images/htm-memory-addition-flow.svg +2 -2
- data/docs/assets/images/htm-memory-recall-flow.svg +2 -2
- data/docs/assets/images/memory-topology.svg +53 -0
- data/docs/assets/images/two-tier-memory-architecture.svg +55 -0
- data/docs/database/naming-convention.md +244 -0
- data/docs/database_rake_tasks.md +31 -0
- data/docs/development/rake-tasks.md +80 -35
- data/docs/development/setup.md +76 -44
- data/docs/examples/basic-usage.md +133 -0
- data/docs/examples/config-files.md +170 -0
- data/docs/examples/file-loading.md +208 -0
- data/docs/examples/index.md +116 -0
- data/docs/examples/llm-configuration.md +168 -0
- data/docs/examples/mcp-client.md +172 -0
- data/docs/examples/rails-integration.md +173 -0
- data/docs/examples/robot-groups.md +210 -0
- data/docs/examples/sinatra-integration.md +218 -0
- data/docs/examples/standalone-app.md +216 -0
- data/docs/examples/telemetry.md +224 -0
- data/docs/examples/timeframes.md +143 -0
- data/docs/getting-started/installation.md +97 -40
- data/docs/getting-started/quick-start.md +28 -11
- data/docs/guides/configuration.md +515 -0
- data/docs/guides/file-loading.md +322 -0
- data/docs/guides/getting-started.md +40 -9
- data/docs/guides/index.md +3 -3
- data/docs/guides/mcp-server.md +100 -13
- data/docs/guides/propositions.md +264 -0
- data/docs/guides/recalling-memories.md +4 -4
- data/docs/guides/search-strategies.md +3 -3
- data/docs/guides/tags.md +318 -0
- data/docs/guides/telemetry.md +229 -0
- data/docs/index.md +8 -16
- data/docs/{architecture → robots}/hive-mind.md +8 -111
- data/docs/robots/index.md +73 -0
- data/docs/{guides → robots}/multi-robot.md +3 -3
- data/docs/{guides → robots}/robot-groups.md +8 -7
- data/docs/{architecture → robots}/two-tier-memory.md +13 -149
- data/docs/robots/why-robots.md +85 -0
- data/examples/.envrc +6 -0
- data/examples/.gitignore +2 -0
- data/examples/00_create_examples_db.rb +94 -0
- data/examples/{basic_usage.rb → 01_basic_usage.rb} +12 -16
- data/examples/{custom_llm_configuration.rb → 03_custom_llm_configuration.rb} +13 -3
- data/examples/{file_loader_usage.rb → 04_file_loader_usage.rb} +11 -14
- data/examples/{timeframe_demo.rb → 05_timeframe_demo.rb} +10 -3
- data/examples/{example_app → 06_example_app}/app.rb +15 -15
- data/examples/{cli_app → 07_cli_app}/htm_cli.rb +15 -22
- data/examples/08_sinatra_app/Gemfile.lock +241 -0
- data/examples/{sinatra_app → 08_sinatra_app}/app.rb +19 -18
- data/examples/{mcp_client.rb → 09_mcp_client.rb} +5 -8
- data/examples/{telemetry → 10_telemetry}/SETUP_README.md +1 -1
- data/examples/{telemetry → 10_telemetry}/demo.rb +14 -10
- data/examples/11_robot_groups/README.md +335 -0
- data/examples/{robot_groups → 11_robot_groups/lib}/robot_worker.rb +17 -3
- data/examples/{robot_groups → 11_robot_groups}/multi_process.rb +9 -9
- data/examples/{robot_groups → 11_robot_groups}/same_process.rb +9 -12
- data/examples/{rails_app → 12_rails_app}/Gemfile +3 -0
- data/examples/{rails_app → 12_rails_app}/Gemfile.lock +87 -58
- data/examples/{rails_app → 12_rails_app}/app/controllers/dashboard_controller.rb +10 -6
- data/examples/{rails_app → 12_rails_app}/app/controllers/files_controller.rb +5 -5
- data/examples/{rails_app → 12_rails_app}/app/controllers/memories_controller.rb +11 -7
- data/examples/{rails_app → 12_rails_app}/app/controllers/robots_controller.rb +8 -8
- data/examples/12_rails_app/app/controllers/tags_controller.rb +36 -0
- data/examples/{rails_app → 12_rails_app}/app/views/dashboard/index.html.erb +2 -2
- data/examples/{rails_app → 12_rails_app}/app/views/files/new.html.erb +5 -2
- data/examples/{rails_app → 12_rails_app}/app/views/memories/_memory_card.html.erb +3 -3
- data/examples/{rails_app → 12_rails_app}/app/views/memories/deleted.html.erb +3 -3
- data/examples/{rails_app → 12_rails_app}/app/views/memories/edit.html.erb +3 -3
- data/examples/{rails_app → 12_rails_app}/app/views/memories/show.html.erb +4 -4
- data/examples/{rails_app → 12_rails_app}/app/views/robots/index.html.erb +2 -2
- data/examples/{rails_app → 12_rails_app}/app/views/robots/show.html.erb +4 -4
- data/examples/{rails_app → 12_rails_app}/app/views/search/index.html.erb +1 -1
- data/examples/{rails_app → 12_rails_app}/app/views/tags/index.html.erb +2 -2
- data/examples/{rails_app → 12_rails_app}/app/views/tags/show.html.erb +1 -1
- data/examples/12_rails_app/config/initializers/htm.rb +7 -0
- data/examples/12_rails_app/config/initializers/rack.rb +5 -0
- data/examples/README.md +230 -211
- data/examples/examples_helper.rb +138 -0
- data/lib/htm/config/builder.rb +167 -0
- data/lib/htm/config/database.rb +317 -0
- data/lib/htm/config/defaults.yml +41 -13
- data/lib/htm/config/section.rb +74 -0
- data/lib/htm/config/validator.rb +83 -0
- data/lib/htm/config.rb +65 -361
- data/lib/htm/database.rb +85 -127
- data/lib/htm/errors.rb +14 -0
- data/lib/htm/integrations/sinatra.rb +13 -44
- data/lib/htm/job_adapter.rb +75 -1
- data/lib/htm/jobs/generate_embedding_job.rb +3 -4
- data/lib/htm/jobs/generate_propositions_job.rb +4 -5
- data/lib/htm/jobs/generate_tags_job.rb +16 -15
- data/lib/htm/loaders/defaults_loader.rb +23 -0
- data/lib/htm/loaders/markdown_loader.rb +17 -15
- data/lib/htm/loaders/xdg_config_loader.rb +9 -9
- data/lib/htm/long_term_memory/fulltext_search.rb +14 -14
- data/lib/htm/long_term_memory/hybrid_search.rb +396 -229
- data/lib/htm/long_term_memory/node_operations.rb +24 -23
- data/lib/htm/long_term_memory/relevance_scorer.rb +23 -20
- data/lib/htm/long_term_memory/robot_operations.rb +4 -4
- data/lib/htm/long_term_memory/tag_operations.rb +91 -77
- data/lib/htm/long_term_memory/vector_search.rb +4 -5
- data/lib/htm/long_term_memory.rb +13 -13
- data/lib/htm/mcp/cli.rb +115 -8
- data/lib/htm/mcp/resources.rb +4 -3
- data/lib/htm/mcp/server.rb +5 -4
- data/lib/htm/mcp/tools.rb +37 -28
- data/lib/htm/migration.rb +72 -0
- data/lib/htm/models/file_source.rb +52 -31
- data/lib/htm/models/node.rb +224 -108
- data/lib/htm/models/node_tag.rb +49 -28
- data/lib/htm/models/robot.rb +38 -27
- data/lib/htm/models/robot_node.rb +63 -35
- data/lib/htm/models/tag.rb +126 -123
- data/lib/htm/observability.rb +45 -41
- data/lib/htm/proposition_service.rb +76 -7
- data/lib/htm/railtie.rb +2 -2
- data/lib/htm/robot_group.rb +30 -18
- data/lib/htm/sequel_config.rb +215 -0
- data/lib/htm/sql_builder.rb +14 -16
- data/lib/htm/tag_service.rb +78 -0
- data/lib/htm/tasks.rb +3 -0
- data/lib/htm/version.rb +1 -1
- data/lib/htm/workflows/remember_workflow.rb +213 -0
- data/lib/htm.rb +27 -22
- data/lib/tasks/db.rake +0 -2
- data/lib/tasks/doc.rake +2 -2
- data/lib/tasks/files.rake +11 -18
- data/lib/tasks/htm.rake +190 -62
- data/lib/tasks/jobs.rake +179 -54
- data/lib/tasks/tags.rake +8 -13
- data/mkdocs.yml +33 -8
- data/scripts/backfill_parent_tags.rb +376 -0
- data/scripts/normalize_plural_tags.rb +335 -0
- metadata +168 -86
- data/docs/api/yard/HTM/Configuration.md +0 -240
- data/docs/telemetry.md +0 -391
- data/examples/rails_app/app/controllers/tags_controller.rb +0 -30
- data/examples/sinatra_app/Gemfile.lock +0 -166
- data/lib/htm/active_record_config.rb +0 -104
- /data/examples/{config_file_example → 02_config_file_example}/README.md +0 -0
- /data/examples/{config_file_example → 02_config_file_example}/config/htm.local.yml +0 -0
- /data/examples/{config_file_example → 02_config_file_example}/custom_config.yml +0 -0
- /data/examples/{config_file_example → 02_config_file_example}/show_config.rb +0 -0
- /data/examples/{example_app → 06_example_app}/Rakefile +0 -0
- /data/examples/{cli_app → 07_cli_app}/README.md +0 -0
- /data/examples/{sinatra_app → 08_sinatra_app}/Gemfile +0 -0
- /data/examples/{telemetry → 10_telemetry}/README.md +0 -0
- /data/examples/{telemetry → 10_telemetry}/grafana/dashboards/htm-metrics.json +0 -0
- /data/examples/{rails_app → 12_rails_app}/.gitignore +0 -0
- /data/examples/{rails_app → 12_rails_app}/Procfile.dev +0 -0
- /data/examples/{rails_app → 12_rails_app}/README.md +0 -0
- /data/examples/{rails_app → 12_rails_app}/Rakefile +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/assets/stylesheets/application.css +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/assets/stylesheets/inter-font.css +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/controllers/application_controller.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/controllers/search_controller.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/javascript/application.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/javascript/controllers/application.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/javascript/controllers/index.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/files/index.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/files/show.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/layouts/application.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/memories/index.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/memories/new.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/robots/new.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/shared/_navbar.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/app/views/shared/_stat_card.html.erb +0 -0
- /data/examples/{rails_app → 12_rails_app}/bin/dev +0 -0
- /data/examples/{rails_app → 12_rails_app}/bin/rails +0 -0
- /data/examples/{rails_app → 12_rails_app}/bin/rake +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/application.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/boot.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/database.yml +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/environment.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/importmap.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/routes.rb +0 -0
- /data/examples/{rails_app → 12_rails_app}/config/tailwind.config.js +0 -0
- /data/examples/{rails_app → 12_rails_app}/config.ru +0 -0
- /data/examples/{rails_app → 12_rails_app}/log/.keep +0 -0
- /data/examples/{rails_app → 12_rails_app}/tmp/local_secret.txt +0 -0
@@ -0,0 +1,322 @@
+# File Loading
+
+HTM can load text-based files (currently markdown) into long-term memory with automatic chunking, source tracking, and re-sync support. This is ideal for building knowledge bases from documentation, notes, or any text content.
+
+## Overview
+
+The file loading system provides:
+
+- **Automatic chunking**: Large files are split into semantically-aware chunks
+- **YAML frontmatter extraction**: Metadata from file headers is preserved
+- **Source tracking**: Files are tracked for re-sync when content changes
+- **Duplicate detection**: Content hashing prevents duplicate chunks
+- **Soft delete**: Unloading files uses soft delete for recovery
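The duplicate-detection bullet above can be illustrated with a minimal sketch. The added documentation says only that content hashing prevents duplicate chunks; the choice of SHA-256, the whitespace stripping, and the `ChunkIndex` class are assumptions made for illustration and are not taken from the gem.

```ruby
require 'digest'
require 'set'

# Hypothetical in-memory index; the real gem persists chunks in PostgreSQL.
class ChunkIndex
  def initialize
    @seen = Set.new
  end

  # Returns true when the chunk is new, false when its hash was already seen.
  def add?(content)
    digest = Digest::SHA256.hexdigest(content.strip)
    !@seen.add?(digest).nil?
  end
end

index = ChunkIndex.new
index.add?("PostgreSQL stores rows in pages.")   # => true
index.add?("PostgreSQL stores rows in pages.\n") # => false (same text after strip)
```

Hashing the content rather than comparing strings keeps the lookup key a fixed size regardless of chunk length.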
+
+## Quick Start
+
+```ruby
+require 'htm'
+
+htm = HTM.new(robot_name: "Document Loader")
+
+# Load a single markdown file
+result = htm.load_file("docs/guide.md")
+# => { file_source_id: 1, chunks_created: 5, chunks_updated: 0, skipped: false }
+
+# Load all markdown files from a directory
+results = htm.load_directory("docs/", pattern: "**/*.md")
+# => [{ file_path: "docs/guide.md", ... }, { file_path: "docs/api.md", ... }]
+
+# Query nodes from a specific file
+nodes = htm.nodes_from_file("docs/guide.md")
+
+# Unload a file (soft deletes chunks)
+htm.unload_file("docs/guide.md")
+```
+
+## API Reference
+
+### load_file(path, force: false)
+
+Loads a single file into long-term memory.
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `path` | String | required | Path to the file |
+| `force` | Boolean | `false` | Force reload even if file unchanged |
+
+**Returns:** Hash with keys:
+- `file_source_id`: ID of the FileSource record
+- `chunks_created`: Number of new chunks created
+- `chunks_updated`: Number of existing chunks updated
+- `chunks_deleted`: Number of chunks removed
+- `skipped`: Whether file was skipped (unchanged)
+
+```ruby
+# Normal load - skips unchanged files
+result = htm.load_file("docs/guide.md")
+
+# Force reload even if file hasn't changed
+result = htm.load_file("docs/guide.md", force: true)
+```
+
+### load_directory(path, pattern: "**/*.md", force: false)
+
+Loads all matching files from a directory.
+
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `path` | String | required | Directory path |
+| `pattern` | String | `"**/*.md"` | Glob pattern for files |
+| `force` | Boolean | `false` | Force reload all files |
+
+**Returns:** Array of result hashes (one per file)
+
+```ruby
+# Load all markdown files
+results = htm.load_directory("docs/")
+
+# Load only top-level markdown files
+results = htm.load_directory("docs/", pattern: "*.md")
+
+# Load specific subdirectory
+results = htm.load_directory("docs/guides/", pattern: "**/*.md")
+```
+
+### nodes_from_file(path)
+
+Returns all nodes loaded from a specific file.
+
+```ruby
+nodes = htm.nodes_from_file("docs/guide.md")
+nodes.each do |node|
+  puts "#{node.id}: #{node.content[0..50]}..."
+end
+```
+
+### unload_file(path)
+
+Soft deletes all nodes from a file and removes the file source.
+
+```ruby
+count = htm.unload_file("docs/guide.md")
+puts "Removed #{count} chunks"
+```
+
+## YAML Frontmatter
+
+Files with YAML frontmatter have their metadata extracted and stored:
+
+```markdown
+---
+title: PostgreSQL Guide
+author: HTM Team
+tags:
+- database
+- postgresql
+version: 1.2
+---
+
+# PostgreSQL Guide
+
+Content starts here...
+```
+
+Access frontmatter via the FileSource model:
+
+```ruby
+source = HTM::Models::FileSource.find_by(file_path: "docs/guide.md")
+source.title # => "PostgreSQL Guide"
+source.author # => "HTM Team"
+source.frontmatter_tags # => ["database", "postgresql"]
+source.frontmatter # => { "title" => "...", "author" => "...", ... }
+```
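As a rough illustration of the frontmatter extraction described above, the sketch below splits a leading `---` block from a markdown string and parses it with Ruby's standard YAML library. The `split_frontmatter` helper and the `FRONTMATTER` pattern are hypothetical names; the diff does not show how the gem's loader actually parses headers.

```ruby
require 'yaml'

# Hypothetical pattern: a "---" line, the YAML body, then a closing "---" line.
FRONTMATTER = /\A---[ \t]*\n(.*?)\n---[ \t]*(?:\n|\z)/m

# Returns [metadata_hash, body]; metadata is {} when no frontmatter is found.
def split_frontmatter(text)
  match = FRONTMATTER.match(text)
  return [{}, text] unless match

  metadata = YAML.safe_load(match[1]) || {}
  [metadata, match.post_match]
end

doc = <<~MD
  ---
  title: PostgreSQL Guide
  author: HTM Team
  tags:
    - database
    - postgresql
  ---

  # PostgreSQL Guide
MD

meta, body = split_frontmatter(doc)
meta["title"] # => "PostgreSQL Guide"
meta["tags"]  # => ["database", "postgresql"]
```

`YAML.safe_load` is used here so that untrusted files cannot instantiate arbitrary Ruby objects; values such as dates would need to be permitted explicitly.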
+
+## Chunking Strategy
+
+HTM uses the [Baran gem](https://github.com/baran) with `MarkdownSplitter` for intelligent chunking that respects markdown structure:
+
+- **Headers**: Chunks break at header boundaries
+- **Code blocks**: Code blocks are kept intact
+- **Horizontal rules**: Natural section breaks
+- **Configurable size**: Control chunk size and overlap
+
+### Configuration
+
+```ruby
+# Global configuration
+HTM.configure do |config|
+  config.chunk_size = 1024 # Characters per chunk (default: 1024)
+  config.chunk_overlap = 64 # Overlap between chunks (default: 64)
+end
+
+# Or via environment variables
+# HTM_CHUNK_SIZE=512
+# HTM_CHUNK_OVERLAP=50
+```
+
+### Per-Loader Configuration
+
+```ruby
+loader = HTM::Loaders::MarkdownLoader.new(
+  htm,
+  chunk_size: 512,
+  chunk_overlap: 50
+)
+loader.load("docs/guide.md")
+```
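To make the interaction between `chunk_size` and `chunk_overlap` concrete, here is a deliberately simplified sketch that slices text into fixed-width character windows. It is not the Baran `MarkdownSplitter` algorithm and ignores headers and code blocks; the `chunk_text` helper is a hypothetical name used only to show how the two numbers relate.

```ruby
# Simplified illustration only: fixed-width character windows.
# Each window starts (chunk_size - chunk_overlap) characters after the previous one.
def chunk_text(text, chunk_size: 1024, chunk_overlap: 64)
  unless chunk_overlap < chunk_size
    raise ArgumentError, "chunk_overlap must be smaller than chunk_size"
  end

  step = chunk_size - chunk_overlap
  chunks = []
  start = 0
  while start < text.length
    chunks << text[start, chunk_size]
    break if start + chunk_size >= text.length # last window reached the end
    start += step
  end
  chunks
end

chunks = chunk_text("a" * 2500, chunk_size: 1024, chunk_overlap: 64)
chunks.length        # => 3
chunks.map(&:length) # => [1024, 1024, 580]
```

A larger overlap repeats more text across neighbouring chunks, which tends to help recall at chunk boundaries at the cost of storing and embedding more characters.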
+
+## Re-Sync Behavior
+
+The file loading system tracks file modification times for efficient re-syncing:
+
+1. **First load**: Creates FileSource record and chunks
+2. **Subsequent loads**: Compares mtime, skips unchanged files
+3. **Changed files**: Re-chunks and updates nodes
+4. **Force reload**: Bypasses mtime check
+
+```ruby
+# First load - creates chunks
+htm.load_file("docs/guide.md")
+# => { skipped: false, chunks_created: 5 }
+
+# Second load - skipped (unchanged)
+htm.load_file("docs/guide.md")
+# => { skipped: true }
+
+# After editing file - re-syncs
+htm.load_file("docs/guide.md")
+# => { skipped: false, chunks_updated: 2, chunks_created: 1 }
+
+# Force reload
+htm.load_file("docs/guide.md", force: true)
+# => { skipped: false, chunks_updated: 5 }
+```
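The skip-or-sync decision listed above can be sketched with plain Ruby file APIs. `TrackedFile` is a hypothetical stand-in for a FileSource record, and the body of `needs_sync?` is an assumption: the documentation names the method but the diff does not show its implementation.

```ruby
require 'tempfile'

# Hypothetical stand-in for a FileSource row: remembers the mtime seen at load time.
TrackedFile = Struct.new(:path, :mtime) do
  def needs_sync?
    File.mtime(path) > mtime
  end
end

# Returns :synced when the file would be re-chunked, :skipped otherwise.
def sync_action(tracked, force: false)
  return :synced if force

  tracked.needs_sync? ? :synced : :skipped
end

Tempfile.create(["guide", ".md"]) do |file|
  file.write("# Guide\n")
  file.flush
  tracked = TrackedFile.new(file.path, File.mtime(file.path))

  sync_action(tracked)              # => :skipped (mtime unchanged)
  sync_action(tracked, force: true) # => :synced  (mtime check bypassed)

  File.utime(Time.now, tracked.mtime + 60, file.path) # simulate a later edit
  sync_action(tracked)              # => :synced
end
```

Comparing modification times avoids reading and hashing every file on each sync, which is why a forced reload is offered for cases where mtime is unreliable.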
+
+## FileSource Model
+
+The `HTM::Models::FileSource` tracks loaded files:
+
+```ruby
+source = HTM::Models::FileSource.find_by(file_path: "docs/guide.md")
+
+source.file_path # Full path to file
+source.mtime # Last modification time
+source.needs_sync? # Check if file changed since load
+source.chunks # Associated nodes (ordered by position)
+source.frontmatter # Parsed YAML frontmatter
+source.title # Frontmatter title (convenience)
+source.author # Frontmatter author (convenience)
+source.frontmatter_tags # Tags from frontmatter
+```
+
+## Rake Tasks
+
+HTM provides rake tasks for file management:
+
+```bash
+# Load a single file
+rake 'htm:files:load[docs/guide.md]'
+
+# Load directory
+rake 'htm:files:load_dir[docs/]'
+rake 'htm:files:load_dir[docs/,**/*.md]'
+
+# List loaded files
+rake htm:files:list
+
+# Show file details
+rake 'htm:files:info[docs/guide.md]'
+
+# Unload a file
+rake 'htm:files:unload[docs/guide.md]'
+
+# Sync all files (reload changed)
+rake htm:files:sync
+
+# Show statistics
+rake htm:files:stats
+
+# Force reload with FORCE=true
+FORCE=true rake 'htm:files:load[docs/guide.md]'
+```
+
+## Best Practices
+
+### Organize Files Logically
+
+```ruby
+# Load by category
+htm.load_directory("docs/guides/", pattern: "**/*.md")
+htm.load_directory("docs/api/", pattern: "**/*.md")
+htm.load_directory("docs/tutorials/", pattern: "**/*.md")
+```
+
+### Use Frontmatter for Metadata
+
+```markdown
+---
+title: API Authentication
+category: api
+tags:
+- security
+- authentication
+priority: high
+---
+```
+
+### Tune Chunk Size for Your Content
+
+```ruby
+# Smaller chunks for dense technical content
+HTM.configure { |c| c.chunk_size = 512 }
+
+# Larger chunks for narrative content
+HTM.configure { |c| c.chunk_size = 2048 }
+```
+
+### Regular Sync for Updated Content
+
+```ruby
+# Sync all loaded files periodically
+htm.sync_files # Re-checks all FileSource records
+```
+
+## Example: Building a Knowledge Base
+
+```ruby
+require 'htm'
+
+# Initialize
+htm = HTM.new(robot_name: "Knowledge Base")
+
+# Configure chunking for technical docs
+HTM.configure do |config|
+  config.chunk_size = 768
+  config.chunk_overlap = 100
+end
+
+# Load documentation
+htm.load_directory("docs/", pattern: "**/*.md")
+htm.load_directory("README.md")
+htm.load_directory("CHANGELOG.md")
+
+# Query the knowledge base
+results = htm.recall(
+  "How do I configure authentication?",
+  strategy: :hybrid,
+  limit: 5
+)
+
+results.each do |result|
+  puts result['content']
+  puts "---"
+end
+```
+
+## Related Documentation
+
+- [Adding Memories](adding-memories.md) - Core memory operations
+- [Search Strategies](search-strategies.md) - Querying loaded content
+- [API Reference: HTM](../api/htm.md) - Complete API documentation
+- [Example: File Loading](../examples/file-loading.md) - Working example
@@ -8,26 +8,47 @@ Before starting, ensure you have:

 1. **Ruby 3.0+** installed
 2. **PostgreSQL with TimescaleDB** (or access to a TimescaleDB cloud instance)
-3. **
+3. **LLM Provider** configured - Ollama (default for local development), OpenAI, Anthropic, Gemini, or others via RubyLLM
 4. Basic understanding of Ruby and LLMs

-###
+### Configuring an LLM Provider

-HTM uses
+HTM uses RubyLLM which supports multiple providers for generating embeddings and extracting tags.
+
+**Option A: Ollama (Recommended for Local Development)**

 ```bash
 # Install Ollama
 curl https://ollama.ai/install.sh | sh

-# Pull
-ollama pull
+# Pull required models
+ollama pull nomic-embed-text
+ollama pull gemma3:latest

 # Verify Ollama is running
 curl http://localhost:11434/api/version
 ```

+**Option B: OpenAI (Recommended for Production)**
+
+```bash
+export OPENAI_API_KEY="sk-..."
+```
+
+Configure HTM:
+```ruby
+HTM.configure do |config|
+  config.embedding.provider = :openai
+  config.embedding.model = 'text-embedding-3-small'
+end
+```
+
+**Option C: Other Providers** (Anthropic, Gemini, Azure, Bedrock, DeepSeek)
+
+Set the appropriate API key and configure HTM with your preferred provider.
+
 !!! tip
-
+    HTM uses vector embeddings to understand the semantic meaning of your memories, not just keyword matches. Any provider will work—choose based on your privacy, cost, and quality requirements.

 ## Installation

@@ -463,9 +484,9 @@ htm.forget(node_id, soft: false, confirm: :confirmed)

 ## Troubleshooting

-###
+### LLM Provider Connection Issues

-If
+**If using Ollama:**

 ```bash
 # Check Ollama is running
@@ -474,10 +495,20 @@ curl http://localhost:11434/api/version
 # If not running, start it
 ollama serve

-# Verify the
+# Verify the models are available
 ollama list
 ```

+**If using cloud providers:**
+
+```bash
+# Verify API key is set
+echo $OPENAI_API_KEY # or ANTHROPIC_API_KEY, GEMINI_API_KEY, etc.
+
+# Test connectivity
+curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"
+```
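The "verify API key is set" step can also be done from Ruby before HTM is configured. The sketch below is a hypothetical helper, not part of HTM or RubyLLM: it only inspects the environment variable names mentioned in the troubleshooting snippet, and the provider symbols are illustrative.

```ruby
# Hypothetical mapping from provider to the environment variable named above.
PROVIDER_KEYS = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  gemini: "GEMINI_API_KEY"
}.freeze

# Returns the providers whose key is present and non-blank in the given environment.
def configured_providers(env = ENV)
  PROVIDER_KEYS.select { |_provider, var| !env[var].to_s.strip.empty? }.keys
end

configured_providers({ "OPENAI_API_KEY" => "sk-example", "GEMINI_API_KEY" => "" })
# => [:openai]
```

Passing the environment as an argument keeps the check testable and avoids printing secret values to the terminal, which `echo $OPENAI_API_KEY` does.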
+
 ### Database Connection Issues

 ```ruby
data/docs/guides/index.md CHANGED
@@ -32,7 +32,7 @@ Learn how to work with HTM's memory system effectively.

 Dive deeper into HTM's powerful capabilities.

-- [**Multi-Robot Usage**](multi-robot.md) - Building hive mind systems with multiple robots
+- [**Multi-Robot Usage**](../robots/multi-robot.md) - Building hive mind systems with multiple robots
 - [**Search Strategies**](search-strategies.md) - Vector, full-text, and hybrid search
 - [**Context Assembly**](context-assembly.md) - Creating optimized context for LLMs

@@ -62,7 +62,7 @@ We recommend the following progression:
 - [Search Strategies](search-strategies.md) - Optimize retrieval

 4. **Advanced Topics**: Multi-Robot Systems
-- [Multi-Robot Usage](multi-robot.md) - Build collaborative systems
+- [Multi-Robot Usage](../robots/multi-robot.md) - Build collaborative systems

 ## Quick Reference

@@ -73,7 +73,7 @@ We recommend the following progression:
 - **Search for memories**: See [Recalling Memories](recalling-memories.md#basic-recall)
 - **Create LLM context**: See [Context Assembly](context-assembly.md#basic-usage)
 - **Monitor memory usage**: See [Working Memory](working-memory.md#monitoring-utilization)
-- **Multi-robot setup**: See [Multi-Robot Usage](multi-robot.md#setting-up-multiple-robots)
+- **Multi-robot setup**: See [Multi-Robot Usage](../robots/multi-robot.md#setting-up-multiple-robots)
 - **Use with Claude/AIA**: See [MCP Server](mcp-server.md#client-configuration)

 ### Memory Types
data/docs/guides/mcp-server.md
CHANGED
|
@@ -30,11 +30,19 @@ Before using the MCP server, ensure you have:
|
|
|
30
30
|
htm_mcp setup
|
|
31
31
|
```
|
|
32
32
|
|
|
33
|
-
3. **
|
|
33
|
+
3. **LLM provider configured** (for embeddings and tag extraction)
|
|
34
|
+
|
|
35
|
+
**Option A: Ollama (default for local development)**
|
|
34
36
|
```bash
|
|
35
37
|
ollama serve
|
|
36
38
|
ollama pull nomic-embed-text
|
|
37
|
-
ollama pull
|
|
39
|
+
ollama pull gemma3:latest
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
**Option B: Cloud providers** (OpenAI, Anthropic, etc.)
|
|
43
|
+
```bash
|
|
44
|
+
export OPENAI_API_KEY="sk-..."
|
|
45
|
+
# Configure via HTM.configure or environment variables
|
|
38
46
|
```
|
|
39
47
|
|
|
40
48
|
## Starting the Server
|
|
@@ -49,7 +57,7 @@ The server logs to STDERR to avoid corrupting the JSON-RPC protocol on STDOUT.
|
|
|
49
57
|
|
|
50
58
|
## CLI Commands
|
|
51
59
|
|
|
52
|
-
The `htm_mcp` executable includes management commands for database setup and
|
|
60
|
+
The `htm_mcp` executable includes management commands for database setup, diagnostics, and rake task execution:
|
|
53
61
|
|
|
54
62
|
| Command | Description |
|
|
55
63
|
|---------|-------------|
|
|
@@ -61,6 +69,9 @@ The `htm_mcp` executable includes management commands for database setup and dia
|
|
|
61
69
|
| `htm_mcp stats` | Show memory statistics (nodes, tags, robots, database size) |
|
|
62
70
|
| `htm_mcp version` | Show HTM version |
|
|
63
71
|
| `htm_mcp help` | Show help with all environment variables |
|
|
72
|
+
| `htm_mcp rake <task>` | Run any HTM rake task |
|
|
73
|
+
| `htm_mcp rake -T` | List all available HTM rake tasks |
|
|
74
|
+
| `htm_mcp rake -T <pattern>` | List HTM rake tasks matching pattern |
|
|
64
75
|
|
|
65
76
|
### First-Time Setup
|
|
66
77
|
|
|
@@ -93,6 +104,72 @@ Migration Status
|
|
|
93
104
|
3 applied, 1 pending
|
|
94
105
|
```
|
|
95
106
|
|
|
107
|
+
### Rake Task Passthrough
|
|
108
|
+
|
|
109
|
+
The `htm_mcp rake` command allows you to run any HTM rake task directly through the MCP CLI. This is useful when working with HTM without a full Rails/Rake environment.
|
|
110
|
+
|
|
111
|
+
**List all available tasks:**
|
|
112
|
+
|
|
113
|
+
```bash
|
|
114
|
+
$ htm_mcp rake -T
|
|
115
|
+
# or
|
|
116
|
+
$ htm_mcp rake --tasks
|
|
117
|
+
|
|
118
|
+
HTM Rake Tasks
|
|
119
|
+
================================================================================
|
|
120
|
+
htm:db:console # Open psql console to database
|
|
121
|
+
htm:db:create # Create the database if it doesn't exist
|
|
122
|
+
htm:db:drop # Drop all HTM tables (WARNING: destructive!)
|
|
123
|
+
htm:db:info # Show database information
|
|
124
|
+
htm:db:migrate # Run pending database migrations
|
|
125
|
+
htm:db:purge_all # Permanently delete all soft-deleted records
|
|
126
|
+
...
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
**Filter tasks by pattern** (like standard `rake -T`):
|
|
130
|
+
|
|
131
|
+
```bash
|
|
132
|
+
$ htm_mcp rake -T htm:jobs
|
|
133
|
+
|
|
134
|
+
HTM Rake Tasks
|
|
135
|
+
================================================================================
|
|
136
|
+
htm:jobs:process_all # Process all pending jobs (embeddings, tags, propositions)
|
|
137
|
+
htm:jobs:process_embeddings # Process pending embedding jobs
|
|
138
|
+
htm:jobs:process_propositions # Process pending proposition extraction jobs
|
|
139
|
+
htm:jobs:process_tags # Process pending tag extraction jobs
|
|
140
|
+
htm:jobs:stats # Show job processing statistics
|
|
141
|
+
|
|
142
|
+
$ htm_mcp rake -T db:rebuild
|
|
143
|
+
|
|
144
|
+
HTM Rake Tasks
|
|
145
|
+
================================================================================
|
|
146
|
+
htm:db:rebuild:embeddings # Clear and regenerate all embeddings
|
|
147
|
+
htm:db:rebuild:propositions # Extract propositions from all non-proposition nodes
|
|
148
|
+
```
|
|
149
|
+
|
|
150
|
+
**Run specific tasks:**
|
|
151
|
+
|
|
152
|
+
```bash
|
|
153
|
+
# Database tasks
|
|
154
|
+
$ htm_mcp rake htm:db:stats
|
|
155
|
+
$ htm_mcp rake htm:db:verify
|
|
156
|
+
$ htm_mcp rake htm:db:purge_all
|
|
157
|
+
|
|
158
|
+
# Job processing tasks
|
|
159
|
+
$ htm_mcp rake htm:jobs:process_all
|
|
160
|
+
$ htm_mcp rake htm:jobs:process_embeddings
|
|
161
|
+
|
|
162
|
+
# Tag tasks
|
|
163
|
+
$ htm_mcp rake htm:tags:tree
|
|
164
|
+
$ htm_mcp rake 'htm:tags:tree[database]' # With argument
|
|
165
|
+
|
|
166
|
+
# File tasks
|
|
167
|
+
$ htm_mcp rake htm:files:list
|
|
168
|
+
$ htm_mcp rake htm:files:sync
|
|
169
|
+
```
|
|
170
|
+
|
|
171
|
+
**Note:** Tasks that take arguments use the standard rake bracket syntax. Quote the whole task name so the shell does not interpret the brackets: `htm_mcp rake 'htm:files:load[path/to/file.md]'`
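
The quoting matters because unquoted brackets are a glob pattern: zsh aborts with "no matches found" when nothing matches, and bash may replace the word with a matching filename. A minimal sketch of the safe form follows; `show_args` is a hypothetical stand-in for `htm_mcp` (not part of HTM) that only prints the arguments it receives, so the snippet runs without HTM installed.

```bash
# show_args is a hypothetical stand-in for htm_mcp: it prints each argument
# on its own line so you can see exactly what the command would receive.
show_args() {
  printf '%s\n' "$@"
}

# Single quotes keep the brackets literal, so the task name arrives intact.
show_args rake 'htm:files:load[path/to/file.md]'
```

The second line printed is the task name exactly as typed, brackets included, which is the form rake needs to parse the argument.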
|
|
172
|
+
|
|
96
173
|
## Tools Reference
|
|
97
174
|
|
|
98
175
|
### SetRobotTool
|
|
@@ -681,8 +758,8 @@ Memory statistics as JSON.
|
|
|
681
758
|
"current_robot": "my-assistant",
|
|
682
759
|
"robot_id": 5,
|
|
683
760
|
"robot_initialized": true,
|
|
684
|
-
"embedding_provider": "ollama",
|
|
685
|
-
"embedding_model": "nomic-embed-text"
|
|
761
|
+
"embedding_provider": "ollama", // or "openai", "gemini", etc.
|
|
762
|
+
"embedding_model": "nomic-embed-text" // provider-specific model
|
|
686
763
|
}
|
|
687
764
|
```
|
|
688
765
|
|
|
@@ -986,12 +1063,18 @@ psql htm_development -c "CREATE EXTENSION IF NOT EXISTS pg_trgm;"
|
|
|
986
1063
|
|
|
987
1064
|
### Embedding/Tag Errors
|
|
988
1065
|
|
|
989
|
-
**Error: `Connection refused` (Ollama)**
|
|
1066
|
+
**Error: `Connection refused` (when using Ollama)**
|
|
990
1067
|
1. Start Ollama: `ollama serve`
|
|
991
1068
|
2. Pull required models:
|
|
992
1069
|
```bash
|
|
993
1070
|
ollama pull nomic-embed-text
|
|
994
|
-
ollama pull
|
|
1071
|
+
ollama pull gemma3:latest
|
|
1072
|
+
```
|
|
1073
|
+
|
|
1074
|
+
**Error: `API key invalid` (when using cloud providers)**
|
|
1075
|
+
1. Verify the API key is set:
|
|
1076
|
+
```bash
|
|
1077
|
+
echo $OPENAI_API_KEY # or ANTHROPIC_API_KEY, GEMINI_API_KEY
|
|
995
1078
|
```
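
Echoing a key prints the secret to the terminal and into scrollback. As an alternative, the sketch below reports only whether each variable is exported and non-empty. The helper name `key_status` is an assumption made for illustration; it is not an HTM command.

```bash
# key_status is a hypothetical helper: for each variable name given, report
# whether it is exported and non-empty, without printing its value.
key_status() {
  for name in "$@"; do
    # printenv only sees exported variables, which is what a child
    # process such as htm_mcp would inherit.
    if [ -n "$(printenv "$name")" ]; then
      echo "$name: set"
    else
      echo "$name: missing"
    fi
  done
}

key_status OPENAI_API_KEY ANTHROPIC_API_KEY GEMINI_API_KEY
```

A variable that is assigned but not exported shows as missing here, which helps catch the case where the key exists in the current shell but never reaches the process that needs it.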
|
|
996
1079
|
|
|
997
1080
|
### Debugging
|
|
@@ -1026,19 +1109,23 @@ Run `htm_mcp help` for a complete list. Key variables:
|
|
|
1026
1109
|
|
|
1027
1110
|
### LLM Providers
|
|
1028
1111
|
|
|
1112
|
+
HTM uses RubyLLM, which supports multiple providers. It defaults to Ollama for local development.
|
|
1113
|
+
|
|
1029
1114
|
| Variable | Description | Default |
|
|
1030
1115
|
|----------|-------------|---------|
|
|
1031
|
-
| `HTM_EMBEDDING_PROVIDER` | Embedding provider | `ollama` |
|
|
1032
|
-
| `HTM_EMBEDDING_MODEL` | Embedding model | `nomic-embed-text
|
|
1116
|
+
| `HTM_EMBEDDING_PROVIDER` | Embedding provider (`ollama`, `openai`, `gemini`, etc.) | `ollama` |
|
|
1117
|
+
| `HTM_EMBEDDING_MODEL` | Embedding model (provider-specific) | `nomic-embed-text` |
|
|
1033
1118
|
| `HTM_TAG_PROVIDER` | Tag extraction provider | `ollama` |
|
|
1034
1119
|
| `HTM_TAG_MODEL` | Tag model | `gemma3:latest` |
|
|
1035
|
-
| `HTM_OLLAMA_URL` | Ollama server URL | `http://localhost:11434` |
|
|
1120
|
+
| `HTM_OLLAMA_URL` | Ollama server URL (if using Ollama) | `http://localhost:11434` |
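
As an illustration of the variables above, the following sketch switches only the embedding side to a cloud provider and leaves tag extraction on the Ollama defaults. The model name and the key value are assumptions chosen for the example; use the model your provider actually offers and your own key.

```bash
# Assumed example values; substitute your own provider, model, and key.
export HTM_EMBEDDING_PROVIDER=openai
export HTM_EMBEDDING_MODEL=text-embedding-3-small  # assumed model name
export OPENAI_API_KEY="your-key-here"              # placeholder, not a real key

# HTM_TAG_PROVIDER and HTM_TAG_MODEL are left unset here, so the defaults
# listed in the table (ollama, gemma3:latest) would still apply.
```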
|
|
1036
1121
|
|
|
1037
|
-
###
|
|
1122
|
+
### Cloud Provider API Keys
|
|
1038
1123
|
|
|
1039
1124
|
| Variable | Description |
|
|
1040
1125
|
|----------|-------------|
|
|
1041
|
-
| `
|
|
1126
|
+
| `OPENAI_API_KEY` | OpenAI API key |
|
|
1127
|
+
| `ANTHROPIC_API_KEY` | Anthropic API key |
|
|
1128
|
+
| `GEMINI_API_KEY` | Google Gemini API key |
|
|
1042
1129
|
| `HTM_ANTHROPIC_API_KEY` | Anthropic API key |
|
|
1043
1130
|
| `HTM_GEMINI_API_KEY` | Google Gemini API key |
|
|
1044
1131
|
| `HTM_AZURE_API_KEY` | Azure OpenAI API key |
|
|
@@ -1049,4 +1136,4 @@ Run `htm_mcp help` for a complete list. Key variables:
|
|
|
1049
1136
|
- [Getting Started](getting-started.md) - HTM basics
|
|
1050
1137
|
- [Adding Memories](adding-memories.md) - Learn about tags and metadata
|
|
1051
1138
|
- [Recalling Memories](recalling-memories.md) - Search strategies
|
|
1052
|
-
- [Multi-Robot Systems](multi-robot.md) - Working with multiple robots
|
|
1139
|
+
- [Multi-Robot Systems](../robots/multi-robot.md) - Working with multiple robots
|