htm 0.0.31 → 0.0.32
- checksums.yaml +4 -4
- data/.irbrc +2 -3
- data/.rubocop.yml +184 -0
- data/CHANGELOG.md +46 -0
- data/README.md +2 -0
- data/Rakefile +93 -12
- data/db/migrate/00008_create_node_relationships.rb +54 -0
- data/db/migrate/00009_fix_node_relationships_column_types.rb +17 -0
- data/db/schema.sql +124 -1
- data/docs/api/database.md +35 -57
- data/docs/api/embedding-service.md +1 -1
- data/docs/api/index.md +26 -15
- data/docs/api/working-memory.md +8 -8
- data/docs/architecture/index.md +5 -7
- data/docs/architecture/overview.md +5 -8
- data/docs/assets/images/htm-architecture-overview.svg +1 -1
- data/docs/assets/images/htm-context-assembly-flow.svg +2 -2
- data/docs/assets/images/htm-layered-architecture.svg +3 -3
- data/docs/assets/images/two-tier-memory-architecture.svg +1 -1
- data/docs/database/README.md +1 -0
- data/docs/database_rake_tasks.md +20 -28
- data/docs/development/contributing.md +5 -5
- data/docs/development/index.md +4 -7
- data/docs/development/schema.md +71 -1
- data/docs/development/setup.md +40 -82
- data/docs/development/testing.md +1 -1
- data/docs/examples/file-loading.md +4 -4
- data/docs/examples/mcp-client.md +1 -1
- data/docs/getting-started/quick-start.md +4 -4
- data/docs/guides/adding-memories.md +14 -1
- data/docs/guides/configuration.md +5 -5
- data/docs/guides/context-assembly.md +4 -4
- data/docs/guides/file-loading.md +12 -12
- data/docs/guides/getting-started.md +2 -2
- data/docs/guides/long-term-memory.md +7 -27
- data/docs/guides/propositions.md +20 -19
- data/docs/guides/recalling-memories.md +5 -5
- data/docs/guides/tags.md +18 -13
- data/docs/multi_framework_support.md +1 -1
- data/docs/robots/hive-mind.md +1 -1
- data/docs/robots/multi-robot.md +2 -2
- data/docs/robots/robot-groups.md +1 -1
- data/docs/robots/two-tier-memory.md +72 -94
- data/docs/setup_local_database.md +8 -54
- data/docs/using_rake_tasks_in_your_app.md +6 -6
- data/examples/01_basic_usage.rb +1 -0
- data/examples/03_custom_llm_configuration.rb +1 -0
- data/examples/04_file_loader_usage.rb +1 -0
- data/examples/05_timeframe_demo.rb +1 -0
- data/examples/06_example_app/app.rb +1 -0
- data/examples/07_cli_app/htm_cli.rb +1 -0
- data/examples/09_mcp_client.rb +1 -0
- data/examples/10_telemetry/demo.rb +1 -0
- data/examples/11_robot_groups/multi_process.rb +1 -0
- data/examples/11_robot_groups/same_process.rb +1 -0
- data/examples/12_rails_app/.envrc +12 -0
- data/examples/12_rails_app/Gemfile +8 -3
- data/examples/12_rails_app/Gemfile.lock +94 -89
- data/examples/12_rails_app/README.md +70 -19
- data/examples/12_rails_app/app/controllers/application_controller.rb +6 -0
- data/examples/12_rails_app/app/controllers/chats_controller.rb +305 -0
- data/examples/12_rails_app/app/controllers/dashboard_controller.rb +3 -0
- data/examples/12_rails_app/app/controllers/files_controller.rb +17 -2
- data/examples/12_rails_app/app/controllers/home_controller.rb +8 -0
- data/examples/12_rails_app/app/controllers/memories_controller.rb +9 -4
- data/examples/12_rails_app/app/controllers/messages_controller.rb +214 -0
- data/examples/12_rails_app/app/controllers/robots_controller.rb +11 -1
- data/examples/12_rails_app/app/controllers/tags_controller.rb +14 -1
- data/examples/12_rails_app/app/javascript/application.js +1 -1
- data/examples/12_rails_app/app/models/application_record.rb +5 -0
- data/examples/12_rails_app/app/models/chat.rb +36 -0
- data/examples/12_rails_app/app/models/message.rb +5 -0
- data/examples/12_rails_app/app/models/model.rb +5 -0
- data/examples/12_rails_app/app/models/tool_call.rb +5 -0
- data/examples/12_rails_app/app/views/chats/index.html.erb +61 -0
- data/examples/12_rails_app/app/views/chats/show.html.erb +213 -0
- data/examples/12_rails_app/app/views/dashboard/index.html.erb +3 -0
- data/examples/12_rails_app/app/views/files/index.html.erb +10 -5
- data/examples/12_rails_app/app/views/files/new.html.erb +4 -2
- data/examples/12_rails_app/app/views/files/show.html.erb +19 -3
- data/examples/12_rails_app/app/views/home/index.html.erb +45 -0
- data/examples/12_rails_app/app/views/layouts/application.html.erb +20 -18
- data/examples/12_rails_app/app/views/memories/_memory_card.html.erb +1 -1
- data/examples/12_rails_app/app/views/memories/deleted.html.erb +3 -1
- data/examples/12_rails_app/app/views/memories/edit.html.erb +2 -0
- data/examples/12_rails_app/app/views/memories/index.html.erb +2 -0
- data/examples/12_rails_app/app/views/memories/new.html.erb +2 -0
- data/examples/12_rails_app/app/views/memories/show.html.erb +4 -2
- data/examples/12_rails_app/app/views/messages/_message.html.erb +20 -0
- data/examples/12_rails_app/app/views/robots/index.html.erb +2 -0
- data/examples/12_rails_app/app/views/robots/new.html.erb +2 -0
- data/examples/12_rails_app/app/views/robots/show.html.erb +2 -0
- data/examples/12_rails_app/app/views/search/index.html.erb +59 -8
- data/examples/12_rails_app/app/views/shared/_navbar.html.erb +75 -29
- data/examples/12_rails_app/app/views/tags/index.html.erb +2 -0
- data/examples/12_rails_app/app/views/tags/show.html.erb +3 -1
- data/examples/12_rails_app/config/application.rb +1 -1
- data/examples/12_rails_app/config/database.yml +9 -5
- data/examples/12_rails_app/config/importmap.rb +1 -1
- data/examples/12_rails_app/config/initializers/htm.rb +9 -2
- data/examples/12_rails_app/config/initializers/ruby_llm.rb +33 -0
- data/examples/12_rails_app/config/routes.rb +39 -23
- data/examples/12_rails_app/db/migrate/20250124000001_create_ruby_llm_tables.rb +34 -0
- data/examples/12_rails_app/db/migrate/20250124000002_create_models_table.rb +28 -0
- data/examples/12_rails_app/db/schema.rb +67 -0
- data/examples/examples_helper.rb +25 -0
- data/lib/htm/circuit_breaker.rb +5 -6
- data/lib/htm/config/builder.rb +12 -12
- data/lib/htm/config/database.rb +21 -27
- data/lib/htm/config/validator.rb +12 -18
- data/lib/htm/config.rb +76 -65
- data/lib/htm/database.rb +193 -199
- data/lib/htm/embedding_service.rb +4 -9
- data/lib/htm/integrations/sinatra.rb +7 -7
- data/lib/htm/job_adapter.rb +14 -21
- data/lib/htm/jobs/generate_embedding_job.rb +28 -44
- data/lib/htm/jobs/generate_propositions_job.rb +29 -55
- data/lib/htm/jobs/generate_relationships_job.rb +137 -0
- data/lib/htm/jobs/generate_tags_job.rb +45 -67
- data/lib/htm/loaders/markdown_loader.rb +65 -112
- data/lib/htm/long_term_memory/fulltext_search.rb +1 -1
- data/lib/htm/long_term_memory/hybrid_search.rb +300 -128
- data/lib/htm/long_term_memory/node_operations.rb +2 -2
- data/lib/htm/long_term_memory/relevance_scorer.rb +100 -68
- data/lib/htm/long_term_memory/tag_operations.rb +87 -120
- data/lib/htm/long_term_memory/vector_search.rb +1 -1
- data/lib/htm/long_term_memory.rb +2 -1
- data/lib/htm/mcp/cli.rb +59 -58
- data/lib/htm/mcp/server.rb +5 -6
- data/lib/htm/mcp/tools.rb +30 -36
- data/lib/htm/migration.rb +10 -10
- data/lib/htm/models/node.rb +2 -3
- data/lib/htm/models/node_relationship.rb +72 -0
- data/lib/htm/models/node_tag.rb +2 -2
- data/lib/htm/models/robot_node.rb +2 -2
- data/lib/htm/models/tag.rb +41 -28
- data/lib/htm/observability.rb +45 -51
- data/lib/htm/proposition_service.rb +3 -7
- data/lib/htm/query_cache.rb +13 -15
- data/lib/htm/railtie.rb +1 -2
- data/lib/htm/robot_group.rb +9 -9
- data/lib/htm/sequel_config.rb +1 -0
- data/lib/htm/sql_builder.rb +1 -1
- data/lib/htm/tag_service.rb +2 -6
- data/lib/htm/timeframe.rb +4 -5
- data/lib/htm/timeframe_extractor.rb +42 -83
- data/lib/htm/version.rb +1 -1
- data/lib/htm/workflows/remember_workflow.rb +112 -115
- data/lib/htm/working_memory.rb +21 -26
- data/lib/htm.rb +103 -116
- data/lib/tasks/db.rake +0 -2
- data/lib/tasks/doc.rake +14 -13
- data/lib/tasks/files.rake +5 -12
- data/lib/tasks/htm.rake +70 -71
- data/lib/tasks/jobs.rake +41 -47
- data/lib/tasks/tags.rake +3 -8
- metadata +25 -100
data/docs/database_rake_tasks.md
CHANGED
|
@@ -19,24 +19,22 @@ rake htm:db:setup # Create schema and run migrations
|
|
|
19
19
|
Sets up the HTM database schema and runs all migrations.
|
|
20
20
|
|
|
21
21
|
**What it does:**
|
|
22
|
-
- Verifies required extensions (
|
|
23
|
-
- Creates all HTM tables (nodes, tags, robots,
|
|
22
|
+
- Verifies required extensions (pgvector, pg_trgm)
|
|
23
|
+
- Creates all HTM tables (nodes, tags, robots, file_sources)
|
|
24
24
|
- Runs all pending migrations
|
|
25
|
-
- Sets up hypertables for time-series optimization
|
|
26
25
|
|
|
27
26
|
**When to use:** First-time setup or after dropping the database
|
|
28
27
|
|
|
29
28
|
```bash
|
|
30
29
|
$ rake htm:db:setup
|
|
31
|
-
✓
|
|
32
|
-
✓
|
|
30
|
+
✓ pgvector version: 0.8.1+
|
|
31
|
+
✓ pg_trgm version: 1.6
|
|
33
32
|
Creating HTM schema...
|
|
34
33
|
✓ Schema created
|
|
35
34
|
Running migration: 001_support_variable_dimensions
|
|
36
35
|
✓ Migration 001_support_variable_dimensions applied
|
|
37
36
|
Running migration: 002_ontology_topic_extraction
|
|
38
37
|
✓ Migration 002_ontology_topic_extraction applied
|
|
39
|
-
✓ Created hypertable for operations_log
|
|
40
38
|
✓ HTM database schema created successfully
|
|
41
39
|
```
|
|
42
40
|
|
|
@@ -92,29 +90,24 @@ HTM Database Information
 ================================================================================

 Connection:
-Host:
-Port:
-Database:
-User:
+Host: localhost
+Port: 5432
+Database: htm_development
+User: postgres

 PostgreSQL Version:
-PostgreSQL 17.
+PostgreSQL 17.x

 Extensions:
-ai (0.11.2)
-pg_stat_statements (1.11)
 pg_trgm (1.6)
 plpgsql (1.0)
-
-timescaledb_toolkit (1.21.0)
-vector (0.8.1)
-vectorscale (0.8.0)
+vector (0.8.1+)

 HTM Tables:
 nodes: 42 rows
 tags: 156 rows
 robots: 3 rows
-
+file_sources: 5 rows
 schema_migrations: 2 rows

 Database Size: 14 MB
@@ -127,10 +120,9 @@ Tests database connection by running `test_connection.rb`.
|
|
|
127
120
|
**Example output:**
|
|
128
121
|
```bash
|
|
129
122
|
$ rake htm:db:test
|
|
130
|
-
Connecting to
|
|
123
|
+
Connecting to PostgreSQL...
|
|
131
124
|
✓ Connected successfully!
|
|
132
|
-
✓
|
|
133
|
-
✓ pgvector Extension: Version 0.8.1
|
|
125
|
+
✓ pgvector Extension: Version 0.8.1+
|
|
134
126
|
✓ pg_trgm Extension: Version 1.6
|
|
135
127
|
```
|
|
136
128
|
|
|
@@ -153,14 +145,14 @@ psql (17.6)
|
|
|
153
145
|
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off, ALPN: none)
|
|
154
146
|
Type "help" for help.
|
|
155
147
|
|
|
156
|
-
|
|
148
|
+
htm_development=> SELECT COUNT(*) FROM nodes;
|
|
157
149
|
count
|
|
158
150
|
-------
|
|
159
151
|
42
|
|
160
152
|
(1 row)
|
|
161
153
|
|
|
162
|
-
|
|
163
|
-
|
|
154
|
+
htm_development=> \d nodes
|
|
155
|
+
htm_development=> \q
|
|
164
156
|
```
|
|
165
157
|
|
|
166
158
|
#### `rake htm:db:seed`
|
|
@@ -272,9 +264,9 @@ export HTM_DATABASE__URL="postgresql://user:password@host:port/dbname?sslmode=re
|
|
|
272
264
|
rake htm:db:info
|
|
273
265
|
```
|
|
274
266
|
|
|
275
|
-
### Method 3: Source
|
|
267
|
+
### Method 3: Source Credentials File
|
|
276
268
|
```bash
|
|
277
|
-
source ~/.
|
|
269
|
+
source ~/.envrc # Load environment from credentials file
|
|
278
270
|
rake htm:db:info
|
|
279
271
|
```
|
|
280
272
|
|
|
@@ -333,9 +325,9 @@ rake htm:db:migrate # Run new migrations only
 - Test: `rake htm:db:test`

 ### "Extension not found"
--
+- Enable required extensions manually: `psql $HTM_DATABASE__URL -c "CREATE EXTENSION IF NOT EXISTS vector;"`
 - Check with: `rake htm:db:info`
-- Extensions needed:
+- Extensions needed: pgvector, pg_trgm

 ### Migrations not running
 - Check migration files exist: `ls -la sql/migrations/`
@@ -252,11 +252,11 @@ git commit -m "wip"
|
|
|
252
252
|
For more complex changes, include a body:
|
|
253
253
|
|
|
254
254
|
```bash
|
|
255
|
-
git commit -m "feat: add
|
|
255
|
+
git commit -m "feat: add partial index for active memories
|
|
256
|
+
|
|
257
|
+
Implements partial index for active (non-deleted) memories to improve
|
|
258
|
+
query performance. This reduces index size and speeds up reads.
|
|
256
259
|
|
|
257
|
-
Implements automatic compression for memories older than 30 days
|
|
258
|
-
using TimescaleDB compression policies. This reduces storage costs
|
|
259
|
-
and improves query performance for recent data.
|
|
260
260
|
|
|
261
261
|
- Adds compress_old_memories rake task
|
|
262
262
|
- Updates schema with compression settings
|
|
@@ -672,7 +672,7 @@ HTM's hybrid search combines vector similarity search with full-text search for
|
|
|
672
672
|
|
|
673
673
|
```ruby
|
|
674
674
|
memories = htm.recall(
|
|
675
|
-
|
|
675
|
+
"database decisions",
|
|
676
676
|
strategy: :hybrid,
|
|
677
677
|
timeframe: "last week"
|
|
678
678
|
)
|
data/docs/development/index.md
CHANGED
@@ -25,7 +25,7 @@ Learn how to set up your development environment, clone the repository, install

 - Cloning the repository
 - Installing Ruby and dependencies
-- Setting up
+- Setting up PostgreSQL for development
 - Configuring Ollama for embeddings
 - Running tests and examples
 - Development tools and rake tasks
@@ -73,7 +73,7 @@ Complete reference for all 44 HTM rake tasks covering database management, docum
 - Troubleshooting guide

 ### [Database Schema](schema.md)
-Deep dive into HTM's database architecture, tables, indexes, and
+Deep dive into HTM's database architecture, tables, indexes, and PostgreSQL optimization strategies.

 **Topics covered:**

@@ -82,8 +82,6 @@ Deep dive into HTM's database architecture, tables, indexes, and TimescaleDB opt
 - Table definitions and column details
 - Indexes and constraints
 - Views and functions
-- TimescaleDB hypertables
-- Compression policies
 - Migration strategies

 ## Quick Start for Contributors
@@ -202,7 +200,7 @@ A: Check out issues labeled `good-first-issue` on GitHub. These are specifically

 **Q: How do I run tests without a database?**

-A: Unit tests (like `test/htm_test.rb`) don't require a database. Integration tests require
+A: Unit tests (like `test/htm_test.rb`) don't require a database. Integration tests require PostgreSQL with pgvector. See the [Testing Guide](testing.md) for details.

 **Q: What's the preferred debugging approach?**

@@ -277,8 +275,7 @@ HTM is built with modern Ruby tools:
 ### Core Technologies

 - **Ruby 3.0+**: Modern Ruby with pattern matching and better performance
-- **PostgreSQL 17**: Robust relational database
-- **TimescaleDB**: Time-series optimization for PostgreSQL
+- **PostgreSQL 17**: Robust relational database with pgvector and pg_trgm
 - **pgvector**: Vector similarity search
 - **RubyLLM**: LLM client library for embeddings
 - **Ollama**: Local embedding generation
data/docs/development/schema.md
CHANGED
@@ -45,6 +45,7 @@ For detailed table definitions, columns, indexes, and constraints, see the auto-
 |-------|-------------|---------|
 | [robot_nodes](../database/public.robot_nodes.md) | Links robots to nodes (many-to-many) | Enables "hive mind" shared memory; includes `working_memory` boolean for per-robot working memory state |
 | [node_tags](../database/public.node_tags.md) | Links nodes to tags (many-to-many) | Flexible multi-tag categorization |
+| node_relationships | Weighted directed edges between related nodes | Stores Jaccard-similarity edges in both directions (A→B and B→A) for CTE graph traversal; populated by `GenerateRelationshipsJob` |

 ### System Tables

@@ -141,6 +142,40 @@ WHERE fs.file_path = '/path/to/file.md'
|
|
|
141
142
|
ORDER BY n.chunk_position;
|
|
142
143
|
```
|
|
143
144
|
|
|
145
|
+
### Node Relationships (Graph Edges)
|
|
146
|
+
|
|
147
|
+
The `node_relationships` table stores weighted directed edges between nodes. Edges are created automatically by `GenerateRelationshipsJob` after tags are assigned.
|
|
148
|
+
|
|
149
|
+
| Column | Type | Description |
|
|
150
|
+
|--------|------|-------------|
|
|
151
|
+
| `id` | bigint | Primary key |
|
|
152
|
+
| `source_id` | bigint | FK → nodes.id (ON DELETE CASCADE) |
|
|
153
|
+
| `target_id` | bigint | FK → nodes.id (ON DELETE CASCADE) |
|
|
154
|
+
| `rel_type` | varchar | Edge type: `related_to`, `supports`, `contradicts`, `derived_from` |
|
|
155
|
+
| `origin` | varchar | How the edge was created: `tag_cooccurrence`, `tag_hierarchy`, `explicit` |
|
|
156
|
+
| `weight` | float | Jaccard similarity score [0.0 – 1.0] |
|
|
157
|
+
| `created_at` | timestamptz | When edge was first created |
|
|
158
|
+
| `updated_at` | timestamptz | When edge weight was last refreshed |
|
|
159
|
+
|
|
160
|
+
**Design principles:**
|
|
161
|
+
- Both directions stored explicitly (A→B and B→A) so traversal only needs `WHERE source_id IN (seeds)` — no OR across columns
|
|
162
|
+
- Unique constraint on `(source_id, target_id, rel_type)` — re-running the job refreshes weights via upsert
|
|
163
|
+
- Self-loops rejected by a DB CHECK constraint
|
|
164
|
+
- Edges with Jaccard weight < 0.1 are skipped; at most 50 edges per source node
|
|
165
|
+
|
|
166
|
+
**Ruby model:** `HTM::Models::NodeRelationship`
|
|
167
|
+
|
|
168
|
+
```ruby
|
|
169
|
+
# Direct neighbors by weight
|
|
170
|
+
HTM::Models::NodeRelationship.neighbors_of(node_id).limit(10).all
|
|
171
|
+
|
|
172
|
+
# Edges above a threshold
|
|
173
|
+
HTM::Models::NodeRelationship.above_weight(0.5).where(source_id: node_id).all
|
|
174
|
+
|
|
175
|
+
# Edges of a specific type
|
|
176
|
+
HTM::Models::NodeRelationship.by_rel_type('related_to').where(source_id: node_id).all
|
|
177
|
+
```
|
|
178
|
+
|
|
144
179
|
### Remember Tracking
|
|
145
180
|
|
|
146
181
|
The `robot_nodes` table tracks per-robot remember metadata:
|
|
@@ -221,6 +256,38 @@ WHERE t.name LIKE 'ai:llm:%'
|
|
|
221
256
|
ORDER BY n.created_at DESC;
|
|
222
257
|
```
|
|
223
258
|
|
|
259
|
+
### Finding Related Nodes via Relationship Graph
|
|
260
|
+
|
|
261
|
+
```sql
|
|
262
|
+
-- Direct neighbors (1 hop), ordered by Jaccard weight
|
|
263
|
+
SELECT nr.target_id, nr.weight, n.content
|
|
264
|
+
FROM node_relationships nr
|
|
265
|
+
JOIN nodes n ON n.id = nr.target_id
|
|
266
|
+
WHERE nr.source_id = $1
|
|
267
|
+
AND nr.rel_type = 'related_to'
|
|
268
|
+
ORDER BY nr.weight DESC
|
|
269
|
+
LIMIT 10;
|
|
270
|
+
|
|
271
|
+
-- Two-hop traversal using a recursive CTE
|
|
272
|
+
WITH RECURSIVE graph AS (
|
|
273
|
+
SELECT target_id, weight, 1 AS depth
|
|
274
|
+
FROM node_relationships
|
|
275
|
+
WHERE source_id = $1
|
|
276
|
+
AND rel_type = 'related_to'
|
|
277
|
+
AND weight >= 0.1
|
|
278
|
+
UNION ALL
|
|
279
|
+
SELECT nr.target_id, nr.weight * g.weight, g.depth + 1
|
|
280
|
+
FROM node_relationships nr
|
|
281
|
+
JOIN graph g ON g.target_id = nr.source_id
|
|
282
|
+
WHERE g.depth < 2
|
|
283
|
+
AND nr.rel_type = 'related_to'
|
|
284
|
+
AND nr.weight >= 0.1
|
|
285
|
+
)
|
|
286
|
+
SELECT DISTINCT ON (target_id) target_id, weight
|
|
287
|
+
FROM graph
|
|
288
|
+
ORDER BY target_id, weight DESC;
|
|
289
|
+
```
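To make the behaviour of the recursive CTE above concrete, here is a small in-memory Ruby sketch of the same idea: start from a seed node, follow edges for up to two hops, multiply weights along each path, and keep the best weight per target. The `two_hop` method and the array-of-hashes edge list are illustrative assumptions and are not part of HTM.

```ruby
# Illustrative in-memory version of the two-hop traversal; not HTM code.
# edges: array of { source_id:, target_id:, weight: } rows, with both
# directions stored explicitly, as the schema notes describe.
def two_hop(edges, seed_id, min_weight: 0.1, max_depth: 2)
  by_source = edges.select { |e| e[:weight] >= min_weight }
                   .group_by { |e| e[:source_id] }
  best = {}                      # target_id => best path weight seen so far
  frontier = [[seed_id, 1.0]]    # pairs of [node_id, accumulated weight]

  max_depth.times do
    next_frontier = []
    frontier.each do |node_id, acc|
      by_source.fetch(node_id, []).each do |e|
        w = acc * e[:weight]     # path weight is the product of edge weights
        next_frontier << [e[:target_id], w]
        best[e[:target_id]] = w if w > best.fetch(e[:target_id], 0.0)
      end
    end
    frontier = next_frontier
  end
  best
end

edges = [
  { source_id: 1, target_id: 2, weight: 0.5 },
  { source_id: 2, target_id: 1, weight: 0.5 },
  { source_id: 2, target_id: 3, weight: 0.4 },
  { source_id: 3, target_id: 2, weight: 0.4 }
]
result = two_hop(edges, 1)
p result
```

Like the SQL query, this sketch does not filter out the seed node, so the seed comes back as a depth-2 target through the reverse edge; callers that do not want it would need to exclude it explicitly.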
|
|
290
|
+
|
|
224
291
|
### Finding Related Topics by Shared Nodes
|
|
225
292
|
|
|
226
293
|
```sql
|
|
@@ -336,13 +403,15 @@ REINDEX INDEX CONCURRENTLY idx_nodes_content_gin;
|
|
|
336
403
|
|
|
337
404
|
## Schema Migration
|
|
338
405
|
|
|
339
|
-
The schema is managed through
|
|
406
|
+
The schema is managed through Sequel migrations located in `db/migrate/`:
|
|
340
407
|
|
|
341
408
|
1. `20250101000001_create_robots.rb` - Creates robots table
|
|
342
409
|
2. `20250101000002_create_nodes.rb` - Creates nodes table with all indexes
|
|
343
410
|
3. `20250101000005_create_tags.rb` - Creates tags and nodes_tags tables
|
|
344
411
|
4. `20251128000002_create_file_sources.rb` - Creates file_sources table for document tracking
|
|
345
412
|
5. `20251128000003_add_source_to_nodes.rb` - Adds source_id and chunk_position to nodes
|
|
413
|
+
6. `00008_create_node_relationships.rb` - Creates node_relationships table with FK constraints and weight check
|
|
414
|
+
7. `00009_fix_node_relationships_column_types.rb` - Promotes id to bigint and timestamps to timestamptz
|
|
346
415
|
|
|
347
416
|
To apply migrations:
|
|
348
417
|
```bash
|
|
@@ -412,6 +481,7 @@ SELECT content ILIKE '%pattern%' FROM nodes; -- Pattern matching (uses
 2. **Full-text search**: Best for keyword matching ("documents containing Y")
 3. **Fuzzy search**: Best for typo tolerance and pattern matching
 4. **Hybrid search**: Combine vector + full-text with weighted scores
+5. **Graph traversal**: Follow `node_relationships` edges via CTE for association-based retrieval

 ### Performance Tuning

data/docs/development/setup.md
CHANGED
@@ -9,7 +9,7 @@ Setting up HTM for development involves:
 1. Cloning the repository
 2. Installing Ruby and system dependencies
 3. Installing Ruby gem dependencies
-4. Setting up
+4. Setting up PostgreSQL database
 5. Configuring an LLM provider (Ollama, OpenAI, etc.)
 6. Verifying your setup
 7. Running tests and examples
@@ -154,96 +154,57 @@ bundle exec ruby -e "require 'htm'; puts HTM::VERSION"
|
|
|
154
154
|
# Should output: 0.1.0 (or current version)
|
|
155
155
|
```
|
|
156
156
|
|
|
157
|
-
## Step 4: Set Up
|
|
157
|
+
## Step 4: Set Up PostgreSQL Database
|
|
158
158
|
|
|
159
|
-
HTM requires PostgreSQL with
|
|
159
|
+
HTM requires PostgreSQL 16+ with pgvector and pg_trgm extensions. You have two options:
|
|
160
160
|
|
|
161
|
-
### Option A:
|
|
162
|
-
|
|
163
|
-
This is the fastest way to get a working database:
|
|
164
|
-
|
|
165
|
-
#### 1. Create Account
|
|
166
|
-
|
|
167
|
-
Visit [https://www.timescale.com/](https://www.timescale.com/) and sign up for a free account.
|
|
168
|
-
|
|
169
|
-
#### 2. Create Service
|
|
170
|
-
|
|
171
|
-
- Click "Create Service"
|
|
172
|
-
- Select your region (choose closest to you)
|
|
173
|
-
- Choose the **Free Tier** (or your preferred plan)
|
|
174
|
-
- Click "Create Service"
|
|
175
|
-
- Wait 2-3 minutes for provisioning
|
|
176
|
-
|
|
177
|
-
#### 3. Get Connection Details
|
|
178
|
-
|
|
179
|
-
- Click on your new service
|
|
180
|
-
- Click "Connection Info"
|
|
181
|
-
- Copy the full connection string (looks like `postgres://username:password@host:port/database?sslmode=require`)
|
|
182
|
-
|
|
183
|
-
#### 4. Configure Environment Variables
|
|
184
|
-
|
|
185
|
-
Create or edit `~/.bashrc__tiger`:
|
|
161
|
+
### Option A: Local PostgreSQL (macOS/Homebrew)
|
|
186
162
|
|
|
187
163
|
```bash
|
|
188
|
-
#
|
|
189
|
-
|
|
190
|
-
|
|
191
|
-
export HTM_DATABASE__USER="tsdbadmin"
|
|
192
|
-
export HTM_DATABASE__PASSWORD="your_password_here"
|
|
193
|
-
export HTM_DATABASE__PORT="37807" # Your port number
|
|
194
|
-
export HTM_DATABASE__URL="postgres://tsdbadmin:your_password@host:port/tsdb?sslmode=require"
|
|
195
|
-
```
|
|
164
|
+
# Install PostgreSQL 17
|
|
165
|
+
brew install postgresql@17
|
|
166
|
+
brew services start postgresql@17
|
|
196
167
|
|
|
197
|
-
|
|
168
|
+
# Create development database
|
|
169
|
+
createdb htm_development
|
|
198
170
|
|
|
199
|
-
|
|
171
|
+
# Enable required extensions
|
|
172
|
+
psql htm_development -c "CREATE EXTENSION IF NOT EXISTS vector;"
|
|
173
|
+
psql htm_development -c "CREATE EXTENSION IF NOT EXISTS pg_trgm;"
|
|
200
174
|
|
|
201
|
-
|
|
202
|
-
|
|
203
|
-
source ~/.bashrc__tiger
|
|
204
|
-
|
|
205
|
-
# Optionally, add to your ~/.bashrc for automatic loading
|
|
206
|
-
echo 'source ~/.bashrc__tiger' >> ~/.bashrc
|
|
175
|
+
# Configure environment variables
|
|
176
|
+
export HTM_DATABASE__URL="postgresql://$(whoami)@localhost:5432/htm_development"
|
|
207
177
|
```
|
|
208
178
|
|
|
209
|
-
### Option B: Local PostgreSQL with Docker
|
|
179
|
+
### Option B: Local PostgreSQL with Docker
|
|
210
180
|
|
|
211
|
-
For
|
|
181
|
+
For Docker-based development:
|
|
212
182
|
|
|
213
183
|
```bash
|
|
214
184
|
# Create docker-compose.yml
|
|
215
185
|
cat > docker-compose.yml <<'EOF'
|
|
216
186
|
version: '3.8'
|
|
217
187
|
services:
|
|
218
|
-
|
|
219
|
-
image:
|
|
188
|
+
postgres:
|
|
189
|
+
image: pgvector/pgvector:pg17
|
|
220
190
|
environment:
|
|
221
|
-
POSTGRES_USER:
|
|
191
|
+
POSTGRES_USER: htm
|
|
222
192
|
POSTGRES_PASSWORD: devpassword
|
|
223
|
-
POSTGRES_DB:
|
|
193
|
+
POSTGRES_DB: htm_development
|
|
224
194
|
ports:
|
|
225
195
|
- "5432:5432"
|
|
226
196
|
volumes:
|
|
227
|
-
-
|
|
197
|
+
- pg_data:/var/lib/postgresql/data
|
|
228
198
|
|
|
229
199
|
volumes:
|
|
230
|
-
|
|
200
|
+
pg_data:
|
|
231
201
|
EOF
|
|
232
202
|
|
|
233
|
-
# Start
|
|
203
|
+
# Start PostgreSQL
|
|
234
204
|
docker-compose up -d
|
|
235
205
|
|
|
236
206
|
# Configure environment variables
|
|
237
|
-
|
|
238
|
-
export HTM_SERVICE_NAME="local-dev"
|
|
239
|
-
export HTM_DATABASE__NAME="tsdb"
|
|
240
|
-
export HTM_DATABASE__USER="tsdbadmin"
|
|
241
|
-
export HTM_DATABASE__PASSWORD="devpassword"
|
|
242
|
-
export HTM_DATABASE__PORT="5432"
|
|
243
|
-
export HTM_DATABASE__URL="postgres://tsdbadmin:devpassword@localhost:5432/tsdb?sslmode=disable"
|
|
244
|
-
EOF
|
|
245
|
-
|
|
246
|
-
source ~/.bashrc__tiger
|
|
207
|
+
export HTM_DATABASE__URL="postgresql://htm:devpassword@localhost:5432/htm_development"
|
|
247
208
|
```
|
|
248
209
|
|
|
249
210
|
### Verify Database Connection
|
|
@@ -252,27 +213,26 @@ Test your database connection:
|
|
|
252
213
|
|
|
253
214
|
```bash
|
|
254
215
|
# From the htm directory
|
|
255
|
-
|
|
216
|
+
rake htm:db:verify
|
|
256
217
|
```
|
|
257
218
|
|
|
258
219
|
Expected output:
|
|
259
220
|
|
|
260
221
|
```
|
|
261
222
|
Connected successfully!
|
|
262
|
-
|
|
263
|
-
pgvector Extension: Version 0.8.1
|
|
223
|
+
pgvector Extension: Version 0.8.0+
|
|
264
224
|
pg_trgm Extension: Version 1.6
|
|
265
225
|
```
|
|
266
226
|
|
|
267
227
|
### Enable Required Extensions
|
|
268
228
|
|
|
269
|
-
Run the extension setup script:
|
|
229
|
+
Run the extension setup script if needed:
|
|
270
230
|
|
|
271
231
|
```bash
|
|
272
232
|
ruby enable_extensions.rb
|
|
273
233
|
```
|
|
274
234
|
|
|
275
|
-
This ensures that
|
|
235
|
+
This ensures that pgvector and pg_trgm extensions are enabled.
|
|
276
236
|
|
|
277
237
|
## Step 5: Configure LLM Provider
|
|
278
238
|
|
|
@@ -495,12 +455,12 @@ HTM uses environment variables for configuration. Here's a complete reference:
|
|
|
495
455
|
|
|
496
456
|
| Variable | Description | Example |
|
|
497
457
|
|----------|-------------|---------|
|
|
498
|
-
| `HTM_DATABASE__URL` | Full PostgreSQL connection URL (preferred) | `
|
|
499
|
-
| `HTM_DATABASE__NAME` | Database name | `
|
|
500
|
-
| `HTM_DATABASE__USER` | Database username | `
|
|
458
|
+
| `HTM_DATABASE__URL` | Full PostgreSQL connection URL (preferred) | `postgresql://user:pass@localhost:5432/htm_development` |
|
|
459
|
+
| `HTM_DATABASE__NAME` | Database name | `htm_development` |
|
|
460
|
+
| `HTM_DATABASE__USER` | Database username | `postgres` |
|
|
501
461
|
| `HTM_DATABASE__PASSWORD` | Database password | `your_password` |
|
|
502
|
-
| `HTM_DATABASE__PORT` | Database port | `
|
|
503
|
-
| `
|
|
462
|
+
| `HTM_DATABASE__PORT` | Database port | `5432` |
|
|
463
|
+
| `HTM_SERVICE__NAME` | Service identifier (for DB naming) | `htm` |
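The URL form carries the same fields as the individual variables in the table above. The sketch below uses Ruby's standard `URI` library to split an example URL into those parts. The `database_parts` helper is hypothetical, and the sketch only illustrates the correspondence; it makes no claim about how HTM itself resolves the two forms.

```ruby
require 'uri'

# Split a PostgreSQL URL into the fields that the separate variables carry.
# Hypothetical helper for illustration; not part of HTM.
def database_parts(url)
  uri = URI.parse(url)
  {
    'HTM_DATABASE__NAME'     => uri.path.delete_prefix('/'),
    'HTM_DATABASE__USER'     => uri.user,
    'HTM_DATABASE__PASSWORD' => uri.password,
    'HTM_DATABASE__PORT'     => uri.port,
    'host'                   => uri.host
  }
end

parts = database_parts('postgresql://user:pass@localhost:5432/htm_development')
parts.each { |key, value| puts "#{key}=#{value}" }
```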
|
|
504
464
|
|
|
505
465
|
### LLM Provider Variables
|
|
506
466
|
|
|
@@ -540,11 +500,8 @@ echo $HTM_DATABASE__URL
|
|
|
540
500
|
# Test connection directly with psql
|
|
541
501
|
psql $HTM_DATABASE__URL
|
|
542
502
|
|
|
543
|
-
# Check if service is running (TimescaleDB Cloud)
|
|
544
|
-
# Visit your Timescale Cloud dashboard
|
|
545
|
-
|
|
546
503
|
# For Docker, check if container is running
|
|
547
|
-
docker ps | grep
|
|
504
|
+
docker ps | grep postgres
|
|
548
505
|
```
|
|
549
506
|
|
|
550
507
|
#### "LLM provider connection failed"
|
|
@@ -582,7 +539,7 @@ curl https://api.openai.com/v1/models -H "Authorization: Bearer $OPENAI_API_KEY"

 #### "Extension not available"

-**Symptoms**: Errors about missing
+**Symptoms**: Errors about missing pgvector or pg_trgm

 **Solutions**:

@@ -593,8 +550,9 @@ ruby enable_extensions.rb
|
|
|
593
550
|
# Check extension status
|
|
594
551
|
psql $HTM_DATABASE__URL -c "SELECT extname, extversion FROM pg_extension ORDER BY extname"
|
|
595
552
|
|
|
596
|
-
# For
|
|
597
|
-
|
|
553
|
+
# For local PostgreSQL, install extensions manually:
|
|
554
|
+
psql $HTM_DATABASE__URL -c "CREATE EXTENSION IF NOT EXISTS vector;"
|
|
555
|
+
psql $HTM_DATABASE__URL -c "CREATE EXTENSION IF NOT EXISTS pg_trgm;"
|
|
598
556
|
```
|
|
599
557
|
|
|
600
558
|
#### "Bundle install fails"
|
|
@@ -650,8 +608,8 @@ If you see SSL certificate errors:
|
|
|
650
608
|
echo $HTM_DATABASE__URL | grep sslmode
|
|
651
609
|
# Should show: sslmode=require
|
|
652
610
|
|
|
653
|
-
# For local development
|
|
654
|
-
export HTM_DATABASE__URL="
|
|
611
|
+
# For local development
|
|
612
|
+
export HTM_DATABASE__URL="postgresql://user:pass@localhost:5432/htm_development"
|
|
655
613
|
```
|
|
656
614
|
|
|
657
615
|
### Ruby Version Issues
|
data/docs/development/testing.md
CHANGED
|
@@ -125,16 +125,16 @@ puts "Removed #{count} chunks"
|
|
|
125
125
|
|
|
126
126
|
```ruby
|
|
127
127
|
HTM.configure do |config|
|
|
128
|
-
config.chunk_size = 1024 # Characters per chunk (default)
|
|
129
|
-
config.chunk_overlap = 64 # Overlap between chunks (default)
|
|
128
|
+
config.chunking.chunk_size = 1024 # Characters per chunk (default)
|
|
129
|
+
config.chunking.chunk_overlap = 64 # Overlap between chunks (default)
|
|
130
130
|
end
|
|
131
131
|
```
|
|
132
132
|
|
|
133
133
|
Or via environment variables:
|
|
134
134
|
|
|
135
135
|
```bash
|
|
136
|
-
export
|
|
137
|
-
export
|
|
136
|
+
export HTM_CHUNKING__CHUNK_SIZE=512
|
|
137
|
+
export HTM_CHUNKING__CHUNK_OVERLAP=50
|
|
138
138
|
```
|
|
139
139
|
|
|
140
140
|
## Expected Output
|
data/docs/examples/mcp-client.md
CHANGED
@@ -142,7 +142,7 @@ Assistant> I've stored that information about the PostgreSQL connection string.
 you> What do you know about databases?

 [Tool Call] RecallTool
-Arguments: {
+Arguments: {query: "databases", limit: 5}
 [Tool Result] RecallTool
 Result: {"memories": [...]}

@@ -216,13 +216,13 @@ Look up a memory by its node ID:
 puts "\n4. Looking up specific memory..."

 # Use the node_id returned from remember()
-node = HTM::Models::Node.
+node = HTM::Models::Node.first(id: node_id)

 if node
 puts "✓ Found memory:"
 puts " ID: #{node.id}"
 puts " Content: #{node.content[0..100]}..."
-puts " Tags: #{node.
+puts " Tags: #{node.tag_names.join(', ')}"
 puts " Created: #{node.created_at}"
 else
 puts "✗ Memory not found"
@@ -471,8 +471,8 @@ puts HTM::Models::Tag.tree_string
|
|
|
471
471
|
# └── postgresql
|
|
472
472
|
# └── features
|
|
473
473
|
|
|
474
|
-
# Find all memories
|
|
475
|
-
nodes = HTM::Models::Tag.
|
|
474
|
+
# Find all memories with a specific tag
|
|
475
|
+
nodes = HTM::Models::Tag.first(name: 'knowledge:databases')&.nodes
|
|
476
476
|
```
|
|
477
477
|
|
|
478
478
|
## Forget (Explicit Deletion)
|
|
@@ -266,7 +266,7 @@ Each robot-node relationship is tracked in `robot_nodes`:
|
|
|
266
266
|
|
|
267
267
|
```ruby
|
|
268
268
|
# Check how many times a robot has "remembered" content
|
|
269
|
-
rn = HTM::Models::RobotNode.
|
|
269
|
+
rn = HTM::Models::RobotNode.first(robot_id: htm.robot_id, node_id: node_id)
|
|
270
270
|
rn.remember_count # => 3 (remembered 3 times)
|
|
271
271
|
rn.first_remembered_at # => When first encountered
|
|
272
272
|
rn.last_remembered_at # => When last tried to remember
|
|
@@ -342,12 +342,24 @@ node_id = htm.remember("Important fact about databases")
|
|
|
342
342
|
# 2. Background jobs enqueue (async)
|
|
343
343
|
# - GenerateEmbeddingJob runs (~100ms)
|
|
344
344
|
# - GenerateTagsJob runs (~1 second)
|
|
345
|
+
# - GenerateRelationshipsJob runs after tags (~1-2 seconds total)
|
|
345
346
|
|
|
346
347
|
# 3. Node is eventually enriched
|
|
347
348
|
# - embedding field populated (enables vector search)
|
|
348
349
|
# - tags associated (enables tag navigation and boosting)
|
|
350
|
+
# - relationship edges computed (enables graph traversal)
|
|
349
351
|
```
|
|
350
352
|
|
|
353
|
+
### Job Chain
|
|
354
|
+
|
|
355
|
+
The three background jobs run in a chain so each job can depend on prior results:
|
|
356
|
+
|
|
357
|
+
1. **`GenerateEmbeddingJob`** — generates a vector embedding for the node
|
|
358
|
+
2. **`GenerateTagsJob`** — extracts hierarchical tags using an LLM; on success, enqueues step 3
|
|
359
|
+
3. **`GenerateRelationshipsJob`** — computes Jaccard-weighted edges to all tag-sharing nodes and upserts both directions (A→B and B→A) into `node_relationships`
|
|
360
|
+
|
|
361
|
+
Edges below `MIN_WEIGHT_THRESHOLD` (0.1) are skipped; at most `MAX_EDGES_PER_NODE` (50) edges are stored per node.
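As a rough illustration of step 3, the sketch below computes Jaccard weights between one node's tag set and the tag sets of other nodes, drops edges under 0.1, keeps at most 50 neighbors, and emits each edge in both directions. Only the two constant names and their values come from the text above. The `jaccard` and `jaccard_edges` methods, the in-memory hash of tag sets, and the row layout are assumptions for illustration, and the real `GenerateRelationshipsJob` may work quite differently (for example, in SQL).

```ruby
require 'set'

MIN_WEIGHT_THRESHOLD = 0.1  # value quoted in the docs above
MAX_EDGES_PER_NODE   = 50   # value quoted in the docs above

# Jaccard similarity: size of the intersection divided by size of the union
# (0.0 when both sets are empty).
def jaccard(a, b)
  union = (a | b).size
  union.zero? ? 0.0 : (a & b).size.to_f / union
end

# Hypothetical helper: tags_by_node maps node_id => Set of tag names.
# Returns edge rows for node_id, one row per direction.
def jaccard_edges(node_id, tags_by_node)
  own = tags_by_node.fetch(node_id)
  scored = tags_by_node
    .reject { |other_id, _| other_id == node_id }           # no self-loops
    .map    { |other_id, tags| [other_id, jaccard(own, tags)] }
    .select { |_, w| w >= MIN_WEIGHT_THRESHOLD }             # skip weak edges
    .max_by(MAX_EDGES_PER_NODE) { |_, w| w }                 # strongest first

  scored.flat_map do |other_id, w|
    [
      { source_id: node_id,  target_id: other_id, rel_type: 'related_to', weight: w },
      { source_id: other_id, target_id: node_id,  rel_type: 'related_to', weight: w }
    ]
  end
end

tags = {
  1 => Set['database:postgresql', 'database:pgvector', 'ai:embeddings'],
  2 => Set['database:postgresql', 'database:pgvector'],
  3 => Set['ruby:sequel']
}
edges = jaccard_edges(1, tags)
edges.each { |e| puts e }
```

Nodes 1 and 2 share two of three distinct tags, so they get a weight of 2/3 in each direction; node 3 shares nothing with node 1 and produces no edge.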
|
|
362
|
+
|
|
351
363
|
### Immediate vs Eventual Capabilities
|
|
352
364
|
|
|
353
365
|
| Capability | Available | Notes |
|
|
@@ -357,6 +369,7 @@ node_id = htm.remember("Important fact about databases")
 | Vector search | After ~100ms | Needs embedding |
 | Tag-enhanced search | After ~1s | Needs tags |
 | Hybrid search | After ~1s | Needs embedding + tags |
+| Graph traversal | After ~1-2s | Needs relationship edges |

 ## Working Memory Integration

@@ -207,18 +207,18 @@ HTM_PROPOSITION__MODEL=gpt-4o-mini
|
|
|
207
207
|
|
|
208
208
|
### Chunking Configuration
|
|
209
209
|
|
|
210
|
-
Access: `HTM.config.chunking.
|
|
210
|
+
Access: `HTM.config.chunking.chunk_size`, `HTM.config.chunking.chunk_overlap`
|
|
211
211
|
|
|
212
212
|
```yaml
|
|
213
213
|
chunking:
|
|
214
|
-
|
|
215
|
-
|
|
214
|
+
chunk_size: 1024 # Characters per chunk
|
|
215
|
+
chunk_overlap: 64 # Overlap between chunks
|
|
216
216
|
```
|
|
217
217
|
|
|
218
218
|
**Environment variables:**
|
|
219
219
|
```bash
|
|
220
|
-
|
|
221
|
-
|
|
220
|
+
HTM_CHUNKING__CHUNK_SIZE=512
|
|
221
|
+
HTM_CHUNKING__CHUNK_OVERLAP=50
|
|
222
222
|
```
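For intuition about how `chunk_size` and `chunk_overlap` interact, the sketch below splits a string into fixed-size character windows, where each window starts `chunk_size - chunk_overlap` characters after the previous one. This is a generic sliding-window illustration built around a hypothetical `chunk_text` helper; HTM's actual loader may split on paragraph or sentence boundaries instead.

```ruby
# Generic sliding-window chunker; hypothetical helper, not HTM's loader.
def chunk_text(text, chunk_size: 1024, chunk_overlap: 64)
  raise ArgumentError, 'chunk_overlap must be smaller than chunk_size' if chunk_overlap >= chunk_size

  step = chunk_size - chunk_overlap   # distance between window starts
  chunks = []
  start = 0
  while start < text.length
    chunks << text[start, chunk_size]
    break if start + chunk_size >= text.length  # this window reached the end
    start += step
  end
  chunks
end

sample = 'a' * 2000
chunks = chunk_text(sample)
puts chunks.map(&:length).inspect
```

With the defaults, consecutive chunks share their last and first 64 characters, so a sentence that straddles a boundary still appears whole in at least one chunk when it is shorter than the overlap.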
|
|
223
223
|
|
|
224
224
|
### Job Backend Configuration
|