RubyGems - htm - Versions diffs - 0.0.1 → 0.0.2 - Mend

htm 0.0.1 → 0.0.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (109)

checksums.yaml +4 -4
data/.envrc +1 -0
data/.tbls.yml +30 -0
data/CHANGELOG.md +30 -0
data/SETUP.md +132 -101
data/db/migrate/20250125000001_add_content_hash_to_nodes.rb +14 -0
data/db/migrate/20250125000002_create_robot_nodes.rb +35 -0
data/db/migrate/20250125000003_remove_source_and_robot_id_from_nodes.rb +28 -0
data/db/migrate/20250126000001_create_working_memories.rb +19 -0
data/db/migrate/20250126000002_remove_unused_columns.rb +12 -0
data/db/schema.sql +226 -43
data/docs/api/database.md +20 -232
data/docs/api/embedding-service.md +1 -7
data/docs/api/htm.md +195 -449
data/docs/api/index.md +1 -7
data/docs/api/long-term-memory.md +342 -590
data/docs/architecture/adrs/001-postgresql-timescaledb.md +1 -1
data/docs/architecture/adrs/003-ollama-embeddings.md +1 -1
data/docs/architecture/adrs/010-redis-working-memory-rejected.md +2 -27
data/docs/architecture/adrs/index.md +2 -13
data/docs/architecture/hive-mind.md +165 -166
data/docs/architecture/index.md +2 -2
data/docs/architecture/overview.md +5 -171
data/docs/architecture/two-tier-memory.md +1 -35
data/docs/assets/images/adr-010-current-architecture.svg +37 -0
data/docs/assets/images/adr-010-proposed-architecture.svg +48 -0
data/docs/assets/images/adr-dependency-tree.svg +93 -0
data/docs/assets/images/class-hierarchy.svg +55 -0
data/docs/assets/images/exception-hierarchy.svg +45 -0
data/docs/assets/images/htm-architecture-overview.svg +83 -0
data/docs/assets/images/htm-complete-memory-flow.svg +160 -0
data/docs/assets/images/htm-context-assembly-flow.svg +148 -0
data/docs/assets/images/htm-eviction-process.svg +141 -0
data/docs/assets/images/htm-memory-addition-flow.svg +138 -0
data/docs/assets/images/htm-memory-recall-flow.svg +152 -0
data/docs/assets/images/htm-node-states.svg +123 -0
data/docs/assets/images/project-structure.svg +78 -0
data/docs/assets/images/test-directory-structure.svg +38 -0
data/{dbdoc → docs/database}/README.md +5 -3
data/{dbdoc → docs/database}/public.node_tags.md +4 -5
data/docs/database/public.node_tags.svg +106 -0
data/{dbdoc → docs/database}/public.nodes.md +3 -8
data/docs/database/public.nodes.svg +152 -0
data/docs/database/public.robot_nodes.md +44 -0
data/docs/database/public.robot_nodes.svg +121 -0
data/{dbdoc → docs/database}/public.robots.md +1 -2
data/docs/database/public.robots.svg +106 -0
data/docs/database/public.working_memories.md +40 -0
data/docs/database/public.working_memories.svg +112 -0
data/{dbdoc → docs/database}/schema.json +342 -110
data/docs/database/schema.svg +223 -0
data/docs/development/index.md +1 -29
data/docs/development/schema.md +84 -324
data/docs/development/testing.md +1 -9
data/docs/getting-started/index.md +47 -0
data/docs/{installation.md → getting-started/installation.md} +2 -2
data/docs/{quick-start.md → getting-started/quick-start.md} +5 -5
data/docs/guides/adding-memories.md +221 -655
data/docs/guides/search-strategies.md +85 -51
data/docs/images/htm-er-diagram.svg +156 -0
data/docs/index.md +16 -31
data/docs/multi_framework_support.md +4 -4
data/examples/basic_usage.rb +18 -16
data/examples/cli_app/htm_cli.rb +86 -8
data/examples/custom_llm_configuration.rb +1 -2
data/examples/example_app/app.rb +11 -14
data/examples/sinatra_app/Gemfile +1 -0
data/examples/sinatra_app/Gemfile.lock +166 -0
data/examples/sinatra_app/app.rb +219 -24
data/lib/htm/active_record_config.rb +10 -3
data/lib/htm/configuration.rb +265 -78
data/lib/htm/{sinatra.rb → integrations/sinatra.rb} +87 -12
data/lib/htm/job_adapter.rb +10 -3
data/lib/htm/long_term_memory.rb +220 -57
data/lib/htm/models/node.rb +36 -7
data/lib/htm/models/robot.rb +30 -4
data/lib/htm/models/robot_node.rb +50 -0
data/lib/htm/models/tag.rb +52 -0
data/lib/htm/models/working_memory_entry.rb +88 -0
data/lib/htm/tasks.rb +4 -0
data/lib/htm/version.rb +1 -1
data/lib/htm.rb +34 -13
data/lib/tasks/htm.rake +32 -1
data/lib/tasks/jobs.rake +7 -3
data/lib/tasks/tags.rake +34 -0
data/mkdocs.yml +56 -9
metadata +61 -31
data/dbdoc/public.node_tags.svg +0 -112
data/dbdoc/public.nodes.svg +0 -118
data/dbdoc/public.robots.svg +0 -90
data/dbdoc/schema.svg +0 -154
data/{dbdoc → docs/database}/public.node_stats.md +0 -0
data/{dbdoc → docs/database}/public.node_stats.svg +0 -0
data/{dbdoc → docs/database}/public.nodes_tags.md +0 -0
data/{dbdoc → docs/database}/public.nodes_tags.svg +0 -0
data/{dbdoc → docs/database}/public.ontology_structure.md +0 -0
data/{dbdoc → docs/database}/public.ontology_structure.svg +0 -0
data/{dbdoc → docs/database}/public.operations_log.md +0 -0
data/{dbdoc → docs/database}/public.operations_log.svg +0 -0
data/{dbdoc → docs/database}/public.relationships.md +0 -0
data/{dbdoc → docs/database}/public.relationships.svg +0 -0
data/{dbdoc → docs/database}/public.robot_activity.md +0 -0
data/{dbdoc → docs/database}/public.robot_activity.svg +0 -0
data/{dbdoc → docs/database}/public.schema_migrations.md +0 -0
data/{dbdoc → docs/database}/public.schema_migrations.svg +0 -0
data/{dbdoc → docs/database}/public.tags.md +0 -0
data/{dbdoc → docs/database}/public.tags.svg +0 -0
data/{dbdoc → docs/database}/public.topic_relationships.md +0 -0
data/{dbdoc → docs/database}/public.topic_relationships.svg +0 -0

data/docs/api/long-term-memory.md CHANGED

@@ -8,35 +8,44 @@ PostgreSQL-backed permanent memory storage with RAG-based retrieval.
 - **Vector similarity search** - Semantic understanding via embeddings
 - **Full-text search** - Fast keyword and phrase matching
-- **Hybrid search** - Combines fulltext prefiltering with vector ranking
-- **Time-range queries** - TimescaleDB-optimized temporal search
-- **Relationship graphs** - Connect related knowledge
-- **Tag system** - Flexible categorization
-- **Multi-robot tracking** - Shared global memory
+- **Tag-enhanced hybrid search** - Combines fulltext + vector + tag matching
+- **Content deduplication** - SHA-256 based node deduplication
+- **Query result caching** - LRU cache for frequent queries
+- **Hierarchical tagging** - Colon-separated tag namespaces
 ## Class Definition
 ```ruby
 class HTM::LongTermMemory
-  # No public attributes
+  attr_reader :query_timeout
 end
 ```
 ## Initialization
-### `new(config)` {: #new }
+### `new(config, **options)` {: #new }
 Create a new long-term memory instance.
 ```ruby
-HTM::LongTermMemory.new(config)
+HTM::LongTermMemory.new(
+  config,
+  pool_size: nil,
+  query_timeout: 30_000,
+  cache_size: 1000,
+  cache_ttl: 300
+)
 ```
 #### Parameters
-| Parameter | Type | Description |
-|-----------|------|-------------|
-| `config` | Hash | PostgreSQL connection configuration |
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `config` | Hash | *required* | PostgreSQL connection configuration |
+| `pool_size` | Integer, nil | `nil` | Connection pool size (managed by ActiveRecord) |
+| `query_timeout` | Integer | `30_000` | Query timeout in milliseconds |
+| `cache_size` | Integer | `1000` | LRU cache size (0 to disable) |
+| `cache_ttl` | Integer | `300` | Cache TTL in seconds |
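
To make the `cache_size` and `cache_ttl` parameters concrete, here is a minimal sketch of an LRU cache with a time-to-live. `TinyLruCache` and its methods are hypothetical names invented for this illustration; the diff does not show HTM's internal cache, so treat this as one plausible shape rather than the gem's implementation. The percentage-style `hit_rate` is modelled on the `cache` block in the `stats` output (150 hits, 50 misses, 75.0).

```ruby
# Hypothetical illustration only; not HTM's internal cache class.
class TinyLruCache
  attr_reader :hits, :misses

  def initialize(max_size:, ttl:)
    @max_size = max_size # 0 disables caching, mirroring cache_size: 0
    @ttl = ttl           # seconds, mirroring cache_ttl
    @store = {}          # Ruby hashes preserve insertion order
    @hits = 0
    @misses = 0
  end

  # Returns the cached value for key, or computes it with the block.
  def fetch(key, now: Time.now)
    return yield if @max_size.zero?

    entry = @store.delete(key)
    if entry && (now - entry[:at]) < @ttl
      @hits += 1
      @store[key] = entry # re-insert so the key becomes most recently used
      return entry[:value]
    end

    @misses += 1
    value = yield
    @store[key] = { value: value, at: now }
    @store.shift while @store.size > @max_size # evict least recently used
    value
  end

  # Hit rate as a percentage of all lookups.
  def hit_rate
    total = @hits + @misses
    total.zero? ? 0.0 : (100.0 * @hits / total).round(2)
  end

  def size
    @store.size
  end
end
```

A larger `cache_size` trades memory for fewer repeated queries, while a shorter `cache_ttl` limits how long a stale result can be served.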
 #### Configuration Hash
@@ -47,18 +56,10 @@ HTM::LongTermMemory.new(config)
   dbname: "database_name",
   user: "username",
   password: "password",
-  sslmode: "require"  # or "prefer", "disable"
+  sslmode: "require"
 }
 ```
-#### Returns
-- `HTM::LongTermMemory` instance
-#### Raises
-- `RuntimeError` - If config is nil
 #### Examples
 ```ruby
@@ -66,25 +67,16 @@ HTM::LongTermMemory.new(config)
 config = HTM::Database.default_config
 ltm = HTM::LongTermMemory.new(config)
-# Custom configuration
+# With custom timeout and cache
 ltm = HTM::LongTermMemory.new(
-  host: 'localhost',
-  port: 5432,
-  dbname: 'htm_production',
-  user: 'htm_user',
-  password: ENV['DB_PASSWORD'],
-  sslmode: 'require'
+  config,
+  query_timeout: 60_000,  # 60 seconds
+  cache_size: 5000,
+  cache_ttl: 600
 )
-# TimescaleDB Cloud
-ltm = HTM::LongTermMemory.new(
-  host: 'xxx.tsdb.cloud.timescale.com',
-  port: 37807,
-  dbname: 'tsdb',
-  user: 'tsdbadmin',
-  password: ENV['HTM_DBPASS'],
-  sslmode: 'require'
-)
+# Disable caching
+ltm = HTM::LongTermMemory.new(config, cache_size: 0)
 ```
 ---
@@ -93,18 +85,14 @@ ltm = HTM::LongTermMemory.new(
 ### `add(**params)` {: #add }
-Add a node to long-term memory.
+Add a node to long-term memory with content deduplication.
 ```ruby
 add(
-  key:,
-  value:,
-  type: nil,
-  category: nil,
-  importance: 1.0,
+  content:,
   token_count: 0,
   robot_id:,
-  embedding:
+  embedding: nil
 )
 ```
@@ -112,100 +100,92 @@ add(
 | Parameter | Type | Default | Description |
 |-----------|------|---------|-------------|
-| `key` | String | *required* | Unique node identifier |
-| `value` | String | *required* | Node content |
-| `type` | String, nil | `nil` | Node type |
-| `category` | String, nil | `nil` | Node category |
-| `importance` | Float | `1.0` | Importance score (0.0-10.0) |
+| `content` | String | *required* | Node content |
 | `token_count` | Integer | `0` | Token count |
-| `robot_id` | String | *required* | Robot identifier |
-| `embedding` | Array\<Float\> | *required* | Vector embedding |
+| `robot_id` | Integer | *required* | Robot identifier |
+| `embedding` | Array\<Float\>, nil | `nil` | Pre-generated embedding vector |
 #### Returns
-- `Integer` - Database ID of the created node
+- `Hash` - `{ node_id:, is_new:, robot_node: }`
+#### Content Deduplication
+When `add()` is called:
+1. A SHA-256 hash of the content is computed
+2. If a node with the same hash exists:
+   - Links the robot to the existing node (or updates `remember_count`)
+   - Returns `is_new: false`
+3. If no match:
+   - Creates a new node
+   - Links the robot to it
+   - Returns `is_new: true`
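
The deduplication steps above can be sketched in plain Ruby. `DedupStore` is a hypothetical in-memory stand-in for the `nodes` and `robot_nodes` tables, and it returns a `remember_count` integer where the real method returns a `robot_node` record; only `Digest::SHA256` is a real library call here.

```ruby
require "digest"

# Hypothetical in-memory stand-in for the nodes and robot_nodes tables.
class DedupStore
  def initialize
    @nodes_by_hash = {}        # content_hash => node_id
    @robot_nodes = Hash.new(0) # [robot_id, node_id] => remember_count
    @next_id = 1
  end

  def add(content:, robot_id:)
    # Step 1: hash the content
    hash = Digest::SHA256.hexdigest(content)

    # Steps 2 and 3: reuse the existing node or create a new one
    node_id = @nodes_by_hash[hash]
    is_new = node_id.nil?
    if is_new
      node_id = @next_id
      @next_id += 1
      @nodes_by_hash[hash] = node_id
    end

    # Link the robot to the node, or bump its remember_count
    link = [robot_id, node_id]
    @robot_nodes[link] += 1
    { node_id: node_id, is_new: is_new, remember_count: @robot_nodes[link] }
  end
end
```

Because the hash covers only the content, two robots that remember the same text share one node while each keeps its own link and counter.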
 #### Examples
 ```ruby
-embedding = embedding_service.embed("content...")
-node_id = ltm.add(
-  key: "fact_001",
-  value: "PostgreSQL is our primary database",
-  type: "fact",
-  category: "architecture",
-  importance: 8.0,
-  token_count: 50,
-  robot_id: "robot-abc123",
-  embedding: embedding
+# Add new content
+result = ltm.add(
+  content: "PostgreSQL is our primary database",
+  token_count: 8,
+  robot_id: 1
 )
-# => 1234
-# Minimal add
-node_id = ltm.add(
-  key: "simple_note",
-  value: "Remember to check logs",
-  robot_id: robot_id,
-  embedding: embedding
+# => { node_id: 123, is_new: true, robot_node: <RobotNode> }
+# Add duplicate content (different robot)
+result = ltm.add(
+  content: "PostgreSQL is our primary database",
+  token_count: 8,
+  robot_id: 2
+)
+# => { node_id: 123, is_new: false, robot_node: <RobotNode> }
+# Same node_id, robot_node tracks this robot's remember_count
+# With pre-generated embedding
+result = ltm.add(
+  content: "Vector search is powerful",
+  token_count: 5,
+  robot_id: 1,
+  embedding: [0.1, 0.2, 0.3, ...]  # Will be padded to 2000 dims
 )
 ```
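
The comment in the last example says a short embedding is padded to 2000 dimensions, which matches the `vector(2000)` column in the schema. A minimal sketch of that step follows, assuming zero-padding on the right; the diff does not state how the gem pads, and `pad_embedding` and `TARGET_DIMENSIONS` are hypothetical names.

```ruby
TARGET_DIMENSIONS = 2000 # matches the vector(2000) column in the schema

# Assumption: pad with trailing zeros; the gem's actual strategy may differ.
def pad_embedding(vector, target: TARGET_DIMENSIONS)
  if vector.size > target
    raise ArgumentError, "embedding has #{vector.size} dimensions, max is #{target}"
  end

  vector + Array.new(target - vector.size, 0.0)
end
```

Trailing zeros leave the dot product and the norm of a vector unchanged, so cosine similarity between two vectors padded this way equals the similarity of the originals.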
-#### Notes
-- `key` must be unique (enforced by database)
-- `embedding` is stored as a pgvector type
-- Automatically sets `created_at` timestamp
 ---
-### `retrieve(key)` {: #retrieve }
+### `retrieve(node_id)` {: #retrieve }
-Retrieve a node by its key.
+Retrieve a node by its database ID.
 ```ruby
-retrieve(key)
+retrieve(node_id)
 ```
 #### Parameters
 | Parameter | Type | Description |
 |-----------|------|-------------|
-| `key` | String | Node identifier |
+| `node_id` | Integer | Node database ID |
 #### Returns
-- `Hash` - Node data if found
+- `Hash` - Node attributes if found
 - `nil` - If node doesn't exist
-#### Hash Structure
+#### Side Effects
-```ruby
-{
-  "id" => "123",
-  "key" => "fact_001",
-  "value" => "content...",
-  "type" => "fact",
-  "category" => "architecture",
-  "importance" => "8.0",
-  "token_count" => "50",
-  "robot_id" => "robot-abc123",
-  "created_at" => "2025-01-15 10:30:00",
-  "last_accessed" => "2025-01-15 14:20:00",
-  "in_working_memory" => "t",
-  "evicted_at" => nil
-}
-```
+- Increments `access_count`
+- Updates `last_accessed` timestamp
 #### Examples
 ```ruby
-node = ltm.retrieve("fact_001")
+node = ltm.retrieve(123)
 if node
-  puts node['value']
+  puts node['content']
+  puts "Accessed #{node['access_count']} times"
   puts "Created: #{node['created_at']}"
-  puts "Importance: #{node['importance']}"
 else
   puts "Node not found"
 end
@@ -213,108 +193,57 @@ end
 ---
-### `update_last_accessed(key)` {: #update_last_accessed }
+### `exists?(node_id)` {: #exists }
-Update the last accessed timestamp for a node.
+Check if a node exists.
 ```ruby
-update_last_accessed(key)
+exists?(node_id)
 ```
 #### Parameters
 | Parameter | Type | Description |
 |-----------|------|-------------|
-| `key` | String | Node identifier |
+| `node_id` | Integer | Node database ID |
 #### Returns
-- `void`
+- `Boolean` - True if node exists
 #### Examples
 ```ruby
-# After retrieving a node
-node = ltm.retrieve("important_fact")
-ltm.update_last_accessed("important_fact")
-# Track access patterns
-accessed_keys = ["key1", "key2", "key3"]
-accessed_keys.each { |k| ltm.update_last_accessed(k) }
+if ltm.exists?(123)
+  ltm.delete(123)
+end
 ```
 ---
-### `delete(key)` {: #delete }
+### `delete(node_id)` {: #delete }
 Delete a node permanently.
 ```ruby
-delete(key)
+delete(node_id)
 ```
 #### Parameters
 | Parameter | Type | Description |
 |-----------|------|-------------|
-| `key` | String | Node identifier |
-#### Returns
-- `void`
+| `node_id` | Integer | Node database ID |
 #### Side Effects
 - Deletes node from database
-- Cascades to related relationships and tags
-#### Examples
-```ruby
-# Delete a node
-ltm.delete("temp_note_123")
-# Safe deletion
-if ltm.retrieve("old_key")
-  ltm.delete("old_key")
-end
-```
+- Cascades to robot_nodes and node_tags
+- Invalidates query cache
 #### Warning
-Deletion is **permanent** and cannot be undone. Use `HTM#forget` instead for proper confirmation flow.
----
-### `get_node_id(key)` {: #get_node_id }
-Get the database ID for a node.
-```ruby
-get_node_id(key)
-```
-#### Parameters
-| Parameter | Type | Description |
-|-----------|------|-------------|
-| `key` | String | Node identifier |
-#### Returns
-- `Integer` - Database ID if found
-- `nil` - If node doesn't exist
-#### Examples
-```ruby
-node_id = ltm.get_node_id("fact_001")
-# => 123
-# Use in relationships
-from_id = ltm.get_node_id("decision_001")
-to_id = ltm.get_node_id("fact_001")
-```
+Deletion is **permanent** and cannot be undone. Use `HTM#forget` for proper confirmation flow.
 ---
@@ -334,7 +263,7 @@ search(
 #### Parameters
 | Parameter | Type | Description |
-|-----------|------|---------|
+|-----------|------|-------------|
 | `timeframe` | Range | Time range to search (Time..Time) |
 | `query` | String | Search query text |
 | `limit` | Integer | Maximum results |
@@ -348,55 +277,32 @@ search(
 ```ruby
 {
-  "id" => "123",
-  "key" => "fact_001",
-  "value" => "content...",
-  "type" => "fact",
-  "category" => "architecture",
-  "importance" => "8.0",
+  "id" => 123,
+  "content" => "content...",
+  "access_count" => 5,
   "created_at" => "2025-01-15 10:30:00",
-  "robot_id" => "robot-abc123",
-  "token_count" => "50",
-  "similarity" => "0.8745"  # 0.0-1.0, higher = more similar
+  "token_count" => 50,
+  "similarity" => 0.8745  # 0.0-1.0, higher = more similar
 }
 ```
 #### Examples
 ```ruby
-# Semantic search
 timeframe = (Time.now - 7*24*3600)..Time.now
 results = ltm.search(
   timeframe: timeframe,
   query: "database performance optimization",
   limit: 20,
-  embedding_service: embedding_service
+  embedding_service: HTM
 )
 results.each do |node|
-  puts "[#{node['similarity']}] #{node['value']}"
+  puts "[#{node['similarity']}] #{node['content']}"
 end
-# Find similar to a specific concept
-results = ltm.search(
-  timeframe: (Time.at(0)..Time.now),  # All time
-  query: "microservices architecture patterns",
-  limit: 10,
-  embedding_service: embedding_service
-)
-# Filter by similarity threshold
-high_similarity = results.select { |n| n['similarity'].to_f > 0.7 }
 ```
-#### Technical Details
-- Uses pgvector's `<=>` cosine distance operator
-- Returns `1 - distance` as similarity (0.0-1.0)
-- Indexed for fast approximate nearest neighbor search
-- Query embedding is generated on-the-fly
 ---
 ### `search_fulltext(**params)` {: #search_fulltext }
@@ -425,56 +331,28 @@ search_fulltext(
 #### Hash Structure
-Similar to `search`, but with `"rank"` instead of `"similarity"`:
 ```ruby
 {
   ...,
-  "rank" => "0.456"  # Higher = better match
+  "rank" => 0.456  # Higher = better match
 }
 ```
 #### Examples
 ```ruby
-# Exact phrase search
 results = ltm.search_fulltext(
   timeframe: (Time.now - 30*24*3600)..Time.now,
   query: "PostgreSQL connection pooling",
   limit: 10
 )
-# Multiple keywords
-results = ltm.search_fulltext(
-  timeframe: (Time.now - 7*24*3600)..Time.now,
-  query: "API authentication JWT token",
-  limit: 20
-)
-# Find mentions
-results = ltm.search_fulltext(
-  timeframe: (Time.at(0)..Time.now),
-  query: "security vulnerability",
-  limit: 50
-)
-results.each do |node|
-  puts "[#{node['rank']}] #{node['created_at']}: #{node['value']}"
-end
 ```
-#### Technical Details
-- Uses PostgreSQL `to_tsvector` and `plainto_tsquery`
-- English language stemming and stop words
-- GIN index for fast search
-- Ranks by `ts_rank` (term frequency)
 ---
 ### `search_hybrid(**params)` {: #search_hybrid }
-Hybrid search combining fulltext prefiltering with vector ranking.
+Tag-enhanced hybrid search combining fulltext, vector, and tag matching.
 ```ruby
 search_hybrid(
@@ -494,127 +372,88 @@ search_hybrid(
 | `query` | String | *required* | Search query |
 | `limit` | Integer | *required* | Maximum final results |
 | `embedding_service` | Object | *required* | Service for embeddings |
-| `prefilter_limit` | Integer | `100` | Fulltext candidates to consider |
+| `prefilter_limit` | Integer | `100` | Candidates to consider |
 #### Returns
-- `Array<Hash>` - Matching nodes sorted by vector similarity
+- `Array<Hash>` - Matching nodes with combined scores
-#### Strategy
+#### Hash Structure
+```ruby
+{
+  "id" => 123,
+  "content" => "...",
+  "similarity" => 0.87,       # Vector similarity (0-1)
+  "tag_boost" => 0.3,         # Tag match score (0-1)
+  "combined_score" => 0.79    # (similarity × 0.7) + (tag_boost × 0.3)
+}
+```
-1. **Prefilter**: Use fulltext search to find `prefilter_limit` candidates
-2. **Rank**: Compute vector similarity for candidates only
-3. **Return**: Top `limit` results by similarity
+#### Strategy
-This combines the **accuracy** of fulltext with the **semantic understanding** of vectors.
+1. **Find matching tags**: Searches tags for query term matches
+2. **Build candidate pool**: Fulltext matches + tag-matching nodes
+3. **Score candidates**: Vector similarity + tag boost
+4. **Return top results**: Sorted by combined_score
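
The scoring and ranking steps can be sketched as below, using the 0.7 and 0.3 weights from the `combined_score` comment in the hash structure. `rank_candidates` is a hypothetical helper that works on plain hashes; the real method also builds the candidate pool in SQL, which this sketch omits.

```ruby
SIMILARITY_WEIGHT = 0.7
TAG_WEIGHT = 0.3

# Hypothetical helper: scores pre-fetched candidates and keeps the top `limit`.
def rank_candidates(candidates, limit:)
  scored = candidates.map do |c|
    combined = (c["similarity"] * SIMILARITY_WEIGHT) + (c["tag_boost"] * TAG_WEIGHT)
    c.merge("combined_score" => combined.round(3))
  end
  scored.sort_by { |c| -c["combined_score"] }.first(limit)
end
```

With a similarity of 0.87 and a tag boost of 0.3, this weighting gives roughly 0.70. A candidate with weaker similarity but a full tag match (for example 0.6 and 1.0) scores 0.72 and ranks ahead of it.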
 #### Examples
 ```ruby
-# Best of both worlds
 results = ltm.search_hybrid(
   timeframe: (Time.now - 30*24*3600)..Time.now,
-  query: "API rate limiting implementation",
+  query: "PostgreSQL performance",
   limit: 15,
-  embedding_service: embedding_service,
-  prefilter_limit: 100
-)
-# Adjust prefilter for performance
-results = ltm.search_hybrid(
-  timeframe: timeframe,
-  query: "security best practices",
-  limit: 20,
-  embedding_service: embedding_service,
-  prefilter_limit: 50  # Smaller = faster
+  embedding_service: HTM
 )
-# Large candidate pool for better recall
-results = ltm.search_hybrid(
-  timeframe: timeframe,
-  query: "deployment strategies",
-  limit: 10,
-  embedding_service: embedding_service,
-  prefilter_limit: 200  # Larger = better recall
-)
+results.each do |node|
+  puts "#{node['content']}"
+  puts "  Similarity: #{node['similarity']}"
+  puts "  Tag boost: #{node['tag_boost']}"
+  puts "  Combined: #{node['combined_score']}"
+end
 ```
-#### Performance Tuning
-| `prefilter_limit` | Speed | Recall | Use Case |
-|-------------------|-------|--------|----------|
-| 50 | Fast | Low | Common queries |
-| 100 | Medium | Medium | Default (recommended) |
-| 200+ | Slow | High | Rare/complex queries |
 ---
-### `add_relationship(**params)` {: #add_relationship }
+### `find_query_matching_tags(query)` {: #find_query_matching_tags }
-Add a relationship between two nodes.
+Find tags that match terms in the query.
 ```ruby
-add_relationship(
-  from:,
-  to:,
-  type: nil,
-  strength: 1.0
-)
+find_query_matching_tags(query)
 ```
 #### Parameters
-| Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
-| `from` | String | *required* | From node key |
-| `to` | String | *required* | To node key |
-| `type` | String, nil | `nil` | Relationship type |
-| `strength` | Float | `1.0` | Relationship strength (0.0-1.0) |
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `query` | String | Search query |
 #### Returns
-- `void`
+- `Array<String>` - Matching tag names
-#### Side Effects
+#### How It Works
-- Inserts relationship into `relationships` table
-- Skips if relationship already exists (ON CONFLICT DO NOTHING)
-- Returns early if either node doesn't exist
+1. Extracts words from query (3+ chars, lowercase)
+2. Searches tags where any hierarchy level matches (ILIKE)
+3. Returns all matching tag names
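
The three steps can be approximated in memory as follows. `matching_tags` is a hypothetical helper, and it treats the ILIKE match as a case-insensitive substring test against each colon-separated level, which is an assumption about the pattern the gem uses.

```ruby
# Hypothetical in-memory approximation of the tag lookup.
def matching_tags(query, tag_names)
  # Step 1: lowercase words of three or more characters
  words = query.downcase.scan(/[a-z0-9]+/).select { |w| w.length >= 3 }.uniq

  # Steps 2 and 3: keep tags where any hierarchy level contains any word
  tag_names.select do |name|
    levels = name.downcase.split(":")
    words.any? { |word| levels.any? { |level| level.include?(word) } }
  end
end
```

Dropping words shorter than three characters keeps short connectives such as "an" or "of" from matching almost every tag.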
 #### Examples
 ```ruby
-# Simple relationship
-ltm.add_relationship(
-  from: "decision_001",
-  to: "fact_001"
-)
-# Typed relationship with strength
-ltm.add_relationship(
-  from: "api_v2",
-  to: "api_v1",
-  type: "replaces",
-  strength: 0.9
-)
+# Query: "PostgreSQL database optimization"
+# Might return: ["database:postgresql", "database:optimization", "database:sql"]
-# Build knowledge graph
-ltm.add_relationship(from: "microservices", to: "docker", type: "requires")
-ltm.add_relationship(from: "microservices", to: "api_gateway", type: "requires")
-ltm.add_relationship(from: "microservices", to: "service_mesh", type: "optional")
-# Related decisions
-ltm.add_relationship(
-  from: "database_choice",
-  to: "timescaledb_decision",
-  type: "influences",
-  strength: 0.8
-)
+matching_tags = ltm.find_query_matching_tags("PostgreSQL database")
+# => ["database:postgresql", "database:postgresql:extensions"]
 ```
 ---
-### `add_tag(**params)` {: #add_tag }
+### `add_tag(node_id:, tag:)` {: #add_tag }
 Add a tag to a node.
@@ -629,282 +468,267 @@ add_tag(node_id:, tag:)
 | `node_id` | Integer | Node database ID |
 | `tag` | String | Tag name |
-#### Returns
-- `void`
-#### Side Effects
-- Inserts tag into `tags` table
-- Skips if tag already exists (ON CONFLICT DO NOTHING)
 #### Examples
 ```ruby
-node_id = ltm.add(key: "fact_001", ...)
-# Add single tag
-ltm.add_tag(node_id: node_id, tag: "architecture")
-# Add multiple tags
-["architecture", "database", "postgresql"].each do |tag|
-  ltm.add_tag(node_id: node_id, tag: tag)
-end
-# Categorize decision
-decision_id = ltm.add(key: "decision_001", ...)
-ltm.add_tag(node_id: decision_id, tag: "critical")
-ltm.add_tag(node_id: decision_id, tag: "security")
-ltm.add_tag(node_id: decision_id, tag: "2025-q1")
+ltm.add_tag(node_id: 123, tag: "database:postgresql")
+ltm.add_tag(node_id: 123, tag: "architecture:decision")
 ```
 ---
-### `mark_evicted(keys)` {: #mark_evicted }
+### `get_node_tags(node_id)` {: #get_node_tags }
-Mark nodes as evicted from working memory.
+Get tags for a specific node.
 ```ruby
-mark_evicted(keys)
+get_node_tags(node_id)
 ```
 #### Parameters
 | Parameter | Type | Description |
 |-----------|------|-------------|
-| `keys` | Array\<String\> | Node keys to mark |
+| `node_id` | Integer | Node database ID |
 #### Returns
-- `void`
-#### Side Effects
-- Sets `in_working_memory = FALSE` for specified nodes
-- Sets `evicted_at` timestamp
+- `Array<String>` - Tag names
 #### Examples
 ```ruby
-# Mark single eviction
-ltm.mark_evicted(["temp_note_123"])
+tags = ltm.get_node_tags(123)
+# => ["database:postgresql", "architecture:decision"]
+```
+---
+### `node_topics(node_id)` {: #node_topics }
-# Mark batch eviction
-evicted_keys = ["key1", "key2", "key3"]
-ltm.mark_evicted(evicted_keys)
+Alias for `get_node_tags` - returns topics/tags for a node.
-# From working memory eviction
-evicted = working_memory.evict_to_make_space(10000)
-evicted_keys = evicted.map { |n| n[:key] }
-ltm.mark_evicted(evicted_keys) unless evicted_keys.empty?
+```ruby
+node_topics(node_id)
 ```
 ---
-### `register_robot(robot_id, robot_name)` {: #register_robot }
+### `nodes_by_topic(topic_path, exact:, limit:)` {: #nodes_by_topic }
-Register a robot in the system.
+Retrieve nodes by tag/topic.
 ```ruby
-register_robot(robot_id, robot_name)
+nodes_by_topic(topic_path, exact: false, limit: 50)
 ```
 #### Parameters
-| Parameter | Type | Description |
-|-----------|------|-------------|
-| `robot_id` | String | Robot identifier |
-| `robot_name` | String | Robot name |
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `topic_path` | String | *required* | Topic hierarchy path |
+| `exact` | Boolean | `false` | Exact match or prefix match |
+| `limit` | Integer | `50` | Maximum results |
 #### Returns
-- `void`
-#### Side Effects
-- Inserts robot into `robots` table
-- Updates name and `last_active` if robot exists
+- `Array<Hash>` - Matching node attributes
 #### Examples
 ```ruby
-ltm.register_robot("robot-abc123", "Code Assistant")
-ltm.register_robot("robot-def456", "Research Bot")
+# Prefix match (default) - finds all database-related nodes
+nodes = ltm.nodes_by_topic("database")
-# Register with UUID
-robot_id = SecureRandom.uuid
-ltm.register_robot(robot_id, "Analysis Bot")
+# Exact match - only nodes tagged with exactly "database:postgresql"
+nodes = ltm.nodes_by_topic("database:postgresql", exact: true)
 ```
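
The difference between the two modes can be sketched with a small predicate. `topic_match?` is a hypothetical helper, and it assumes that prefix matching respects hierarchy boundaries, so `database` matches `database:postgresql` but not `databases`. The gem may instead use a plain string prefix, which the diff does not show.

```ruby
# Hypothetical predicate; assumes prefix matches stop at a ":" boundary.
def topic_match?(tag_name, topic_path, exact: false)
  return tag_name == topic_path if exact

  tag_name == topic_path || tag_name.start_with?("#{topic_path}:")
end
```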
 ---
-### `update_robot_activity(robot_id)` {: #update_robot_activity }
+### `search_by_tags(**params)` {: #search_by_tags }
-Update robot's last activity timestamp.
+Search nodes by tags with relevance scoring.
 ```ruby
-update_robot_activity(robot_id)
+search_by_tags(
+  tags:,
+  match_all: false,
+  timeframe: nil,
+  limit: 20
+)
 ```
 #### Parameters
-| Parameter | Type | Description |
-|-----------|------|-------------|
-| `robot_id` | String | Robot identifier |
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `tags` | Array\<String\> | *required* | Tags to search for |
+| `match_all` | Boolean | `false` | Match ALL tags or ANY tag |
+| `timeframe` | Range, nil | `nil` | Optional time range filter |
+| `limit` | Integer | `20` | Maximum results |
 #### Returns
-- `void`
+- `Array<Hash>` - Nodes with relevance scores and tags
 #### Examples
 ```ruby
-# Update after operations
-ltm.update_robot_activity("robot-abc123")
+# Match ANY tag
+nodes = ltm.search_by_tags(tags: ["database", "api"])
-# Automatic heartbeat
-loop do
-  ltm.update_robot_activity(robot_id)
-  sleep 60  # Every minute
-end
+# Match ALL tags
+nodes = ltm.search_by_tags(
+  tags: ["database:postgresql", "architecture"],
+  match_all: true
+)
+# With timeframe
+nodes = ltm.search_by_tags(
+  tags: ["security"],
+  timeframe: (Time.now - 7*24*3600)..Time.now
+)
 ```
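
To illustrate how `match_all` changes the result set, here is a minimal in-memory sketch. `filter_by_tags` is a hypothetical helper, tags are compared by exact name, and the relevance score (the fraction of requested tags a node carries) is an assumption, since the diff does not say how relevance is computed.

```ruby
# Hypothetical helper. node_tags maps node_id => array of tag names.
def filter_by_tags(node_tags, tags:, match_all: false)
  wanted = tags.uniq
  hits = node_tags.each_with_object([]) do |(node_id, tag_list), out|
    matched = wanted & tag_list
    keep = match_all ? matched.size == wanted.size : !matched.empty?
    next unless keep

    out << {
      "id" => node_id,
      "matched_tags" => matched,
      "relevance" => matched.size.to_f / wanted.size # assumed scoring rule
    }
  end
  hits.sort_by { |node| -node["relevance"] }
end
```

With `match_all: true` every surviving node has relevance 1.0 under this rule, so the score only differentiates results in ANY mode.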
 ---
-### `log_operation(**params)` {: #log_operation }
+### `popular_tags(limit:, timeframe:)` {: #popular_tags }
-Log an operation to the operations log.
+Get most frequently used tags.
 ```ruby
-log_operation(
-  operation:,
-  node_id:,
-  robot_id:,
-  details:
-)
+popular_tags(limit: 20, timeframe: nil)
 ```
-#### Parameters
-| Parameter | Type | Description |
-|-----------|------|---------|
-| `operation` | String | Operation type |
-| `node_id` | Integer, nil | Node database ID (can be nil) |
-| `robot_id` | String | Robot identifier |
-| `details` | Hash | Operation details (stored as JSON) |
 #### Returns
-- `void`
+- `Array<Hash>` - `[{ name: "tag_name", usage_count: 42 }, ...]`
 #### Examples
 ```ruby
-# Log add operation
-ltm.log_operation(
-  operation: 'add',
-  node_id: 123,
-  robot_id: robot_id,
-  details: { key: "fact_001", type: "fact" }
-)
+top_tags = ltm.popular_tags(limit: 10)
+top_tags.each do |tag|
+  puts "#{tag[:name]}: #{tag[:usage_count]} nodes"
+end
+```
-# Log recall operation
-ltm.log_operation(
-  operation: 'recall',
-  node_id: nil,
-  robot_id: robot_id,
-  details: {
-    timeframe: "last week",
-    topic: "postgresql",
-    count: 15
-  }
-)
+---
-# Log forget operation
-ltm.log_operation(
-  operation: 'forget',
-  node_id: 456,
-  robot_id: robot_id,
-  details: { key: "temp_note", reason: "temporary" }
-)
+### `topic_relationships(min_shared_nodes:, limit:)` {: #topic_relationships }
+Get tag co-occurrence relationships.
+```ruby
+topic_relationships(min_shared_nodes: 2, limit: 50)
+```
+#### Returns
+- `Array<Hash>` - `[{ topic1:, topic2:, shared_nodes: }, ...]`
+#### Examples
+```ruby
+related = ltm.topic_relationships(min_shared_nodes: 3)
+related.each do |r|
+  puts "#{r['topic1']} <-> #{r['topic2']}: #{r['shared_nodes']} shared"
+end
 ```
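
Tag co-occurrence can be sketched as counting, for each pair of tags, how many nodes carry both. `co_occurrences` is a hypothetical in-memory helper; the gem presumably computes this in SQL, and the ordering used here (most shared nodes first) is an assumption.

```ruby
# Hypothetical helper. node_tags maps node_id => array of tag names.
def co_occurrences(node_tags, min_shared_nodes: 2, limit: 50)
  counts = Hash.new(0)
  node_tags.each_value do |tags|
    # Sorting first makes each pair appear in one canonical order
    tags.uniq.sort.combination(2) { |a, b| counts[[a, b]] += 1 }
  end

  counts
    .select { |_pair, shared| shared >= min_shared_nodes }
    .sort_by { |pair, shared| [-shared, pair] }
    .first(limit)
    .map { |(t1, t2), shared| { "topic1" => t1, "topic2" => t2, "shared_nodes" => shared } }
end
```

Raising `min_shared_nodes` filters out pairs that share only a node or two, which is useful when many nodes carry several unrelated tags.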
 ---
-### `stats()` {: #stats }
+### `register_robot(robot_name)` {: #register_robot }
-Get comprehensive memory statistics.
+Register a robot in the system.
 ```ruby
-stats()
+register_robot(robot_name)
 ```
 #### Returns
-- `Hash` - Statistics hash
+- `Integer` - Robot ID
-#### Hash Structure
+#### Examples
 ```ruby
-{
-  total_nodes: 1234,
+robot_id = ltm.register_robot("Code Assistant")
+```
-  nodes_by_robot: {
-    "robot-abc123" => 500,
-    "robot-def456" => 734
-  },
+---
-  nodes_by_type: [
-    { "type" => "fact", "count" => 400, "avg_importance" => 6.5 },
-    { "type" => "decision", "count" => 200, "avg_importance" => 8.2 },
-    ...
-  ],
+### `update_robot_activity(robot_id)` {: #update_robot_activity }
-  total_relationships: 567,
-  total_tags: 890,
+Update robot's last activity timestamp.
-  oldest_memory: "2025-01-01 12:00:00",
-  newest_memory: "2025-01-15 14:30:00",
+```ruby
+update_robot_activity(robot_id)
+```
-  active_robots: 3,
+---
-  robot_activity: [
-    { "id" => "robot-1", "name" => "Assistant", "last_active" => "2025-01-15 14:00:00" },
-    ...
-  ],
+### `mark_evicted(node_ids)` {: #mark_evicted }
-  database_size: 12345678  # bytes
-}
+Mark nodes as evicted from working memory.
+```ruby
+mark_evicted(node_ids)
 ```
-#### Examples
+#### Parameters
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `node_ids` | Array\<Integer\> | Node IDs to mark |
+---
+### `track_access(node_ids)` {: #track_access }
+Track access for multiple nodes (bulk update).
 ```ruby
-stats = ltm.stats
+track_access(node_ids)
+```
-puts "Total memories: #{stats[:total_nodes]}"
-puts "Robots: #{stats[:active_robots]}"
-puts "Relationships: #{stats[:total_relationships]}"
-puts "Tags: #{stats[:total_tags]}"
+Updates `access_count` and `last_accessed` for all specified nodes.
-# By type
-stats[:nodes_by_type].each do |type_info|
-  puts "#{type_info['type']}: #{type_info['count']} nodes, avg importance #{type_info['avg_importance']}"
-end
+---
-# Database size
-size_mb = stats[:database_size] / 1024.0 / 1024.0
-puts "Database size: #{size_mb.round(2)} MB"
+### `stats()` {: #stats }
-# Robot activity
-stats[:robot_activity].each do |robot|
-  puts "#{robot['name']}: last active #{robot['last_active']}"
-end
+Get comprehensive memory statistics.
+```ruby
+stats()
+```
+#### Returns
+```ruby
+{
+  total_nodes: 1234,
+  nodes_by_robot: { 1 => 500, 2 => 734 },
+  total_tags: 890,
+  oldest_memory: Time,
+  newest_memory: Time,
+  active_robots: 3,
+  robot_activity: [{ id:, name:, last_active: }, ...],
+  database_size: 12345678,  # bytes
+  cache: {                  # Only if cache enabled
+    hits: 150,
+    misses: 50,
+    hit_rate: 75.0,
+    size: 200
+  }
+}
 ```
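As a rough illustration of consuming this hash, the sketch below formats a few of the documented keys and guards the `:cache` lookup, since that key is only present when caching is enabled. `summarize_stats` and the sample values are hypothetical and not part of HTM; a real call would pass the result of `ltm.stats` instead.

```ruby
# Hypothetical helper (not part of HTM): builds printable summary lines
# from a hash shaped like the documented return value of `stats`.
def summarize_stats(stats)
  lines = []
  lines << "Total memories: #{stats[:total_nodes]}"
  stats.fetch(:nodes_by_robot, {}).each do |robot_id, count|
    lines << "Robot #{robot_id}: #{count} nodes"
  end
  size_mb = stats[:database_size] / 1024.0 / 1024.0
  lines << "Database size: #{size_mb.round(2)} MB"
  # :cache is only present when the cache is enabled, so guard the lookup.
  if (cache = stats[:cache])
    lines << "Cache hit rate: #{cache[:hit_rate]}%"
  else
    lines << "Cache: disabled"
  end
  lines
end

# Illustrative values copied from the example above.
sample_stats = {
  total_nodes: 1234,
  nodes_by_robot: { 1 => 500, 2 => 734 },
  total_tags: 890,
  database_size: 12_345_678, # bytes
  cache: { hits: 150, misses: 50, hit_rate: 75.0, size: 200 }
}

puts summarize_stats(sample_stats)
```

Guarding `stats[:cache]` avoids a `NoMethodError` on `nil` when the cache is disabled.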
 ---
-## Database Schema Reference
+## Database Schema
 ### Tables Used
@@ -912,95 +736,68 @@ end
 Primary memory storage:
-- `id` - Serial primary key
-- `key` - Unique text identifier
-- `value` - Text content
-- `type` - Optional type
-- `category` - Optional category
-- `importance` - Float (0.0-10.0)
-- `token_count` - Integer
-- `robot_id` - Foreign key to robots
-- `embedding` - Vector (pgvector)
-- `created_at` - Timestamp
-- `last_accessed` - Timestamp
-- `in_working_memory` - Boolean
-- `evicted_at` - Timestamp (nullable)
-#### `relationships`
-Node relationships:
-- `id` - Serial primary key
-- `from_node_id` - Foreign key to nodes
-- `to_node_id` - Foreign key to nodes
-- `relationship_type` - Optional type
-- `strength` - Float (0.0-1.0)
-- `created_at` - Timestamp
-#### `tags`
-Node tags:
+- `id` - BIGSERIAL primary key
+- `content` - TEXT (the memory content)
+- `content_hash` - VARCHAR(64) UNIQUE (SHA-256 for deduplication)
+- `access_count` - INTEGER (retrieval count)
+- `token_count` - INTEGER
+- `embedding` - vector(2000)
+- `embedding_dimension` - INTEGER
+- `created_at`, `updated_at`, `last_accessed` - TIMESTAMPTZ
+- `in_working_memory` - BOOLEAN
-- `id` - Serial primary key
-- `node_id` - Foreign key to nodes
-- `tag` - Text
-- `created_at` - Timestamp
+#### `robot_nodes`
-#### `robots`
+Robot-node associations (many-to-many):
-Robot registry:
+- `id` - BIGSERIAL primary key
+- `robot_id` - BIGINT FK
+- `node_id` - BIGINT FK
+- `first_remembered_at`, `last_remembered_at` - TIMESTAMPTZ
+- `remember_count` - INTEGER
-- `id` - Text primary key
-- `name` - Text
-- `created_at` - Timestamp
-- `last_active` - Timestamp
+#### `tags`
-#### `operations_log`
+Hierarchical tag registry:
-Operation audit log:
+- `id` - BIGSERIAL primary key
+- `name` - TEXT UNIQUE (colon-separated hierarchy)
+- `created_at` - TIMESTAMPTZ
-- `id` - Serial primary key
-- `operation` - Text
-- `node_id` - Foreign key to nodes (nullable)
-- `robot_id` - Foreign key to robots
-- `timestamp` - Timestamp
-- `details` - JSONB
+#### `node_tags`
-### Views
+Node-tag associations (many-to-many):
-#### `node_stats`
+- `node_id` - BIGINT FK
+- `tag_id` - BIGINT FK
-Aggregated statistics by type:
+---
-```sql
-SELECT type, COUNT(*) as count, AVG(importance) as avg_importance
-FROM nodes
-GROUP BY type
-```
+## Performance Considerations
-#### `robot_activity`
+### Query Caching
-Robot activity summary:
+Query results are cached in an LRU cache with a TTL:
-```sql
-SELECT id, name, last_active
-FROM robots
-ORDER BY last_active DESC
+```ruby
+# Check cache stats
+stats = ltm.stats
+puts "Cache hit rate: #{stats[:cache][:hit_rate]}%"
 ```
----
-## Performance Considerations
+The cache is automatically invalidated when:
+- Nodes are added
+- Nodes are deleted
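To make the caching behaviour concrete, here is a minimal sketch of an LRU cache with a per-entry TTL and hit/miss counters shaped like the `:cache` entry in `stats`. It is an assumption-laden illustration, not HTM's implementation: the class name `TinyLruCache`, its constructor arguments, and the injectable `clock` are all hypothetical.

```ruby
# Minimal LRU cache with per-entry TTL. Illustrative only; not HTM's code.
class TinyLruCache
  def initialize(max_size:, ttl:, clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @max_size = max_size
    @ttl = ttl          # seconds an entry stays valid
    @clock = clock      # injectable so expiry can be exercised without sleeping
    @store = {}         # Ruby hashes keep insertion order: first key = least recently used
    @hits = 0
    @misses = 0
  end

  # Returns the cached value, or computes it with the block on a miss.
  def fetch(key)
    entry = @store.delete(key)
    if entry && @clock.call - entry[:at] < @ttl
      @hits += 1
      @store[key] = entry # re-insert to mark as most recently used
      return entry[:value]
    end
    @misses += 1
    value = yield
    @store[key] = { value: value, at: @clock.call }
    @store.shift while @store.size > @max_size # evict least recently used
    value
  end

  # Drop every entry, e.g. after a node is added or deleted.
  def clear
    @store.clear
  end

  def stats
    total = @hits + @misses
    rate = total.zero? ? 0.0 : (@hits * 100.0 / total).round(2)
    { hits: @hits, misses: @misses, hit_rate: rate, size: @store.size }
  end
end

cache = TinyLruCache.new(max_size: 100, ttl: 300)
cache.fetch("query: postgres") { ["expensive search result"] } # miss, runs the block
cache.fetch("query: postgres") { ["expensive search result"] } # hit, block is skipped
p cache.stats
```

Clearing the whole cache on every write is the simplest invalidation policy that matches the two triggers listed above; a finer-grained scheme would need to know which cached queries a given node affects.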
 ### Indexing
 Automatic indexes:
-- `nodes.key` - Unique index for fast retrieval
-- `nodes.embedding` - IVFFlat index for vector search
-- `nodes.value` - GIN index for fulltext search
-- `nodes.created_at` - B-tree index for time-range queries
-- `relationships (from_node_id, to_node_id, relationship_type)` - Unique index
+- `content_hash` - UNIQUE index for deduplication
+- `embedding` - HNSW index for vector search
+- `content` - GIN indexes for fulltext and trigram search
+- `created_at` - B-tree for time-range queries
+- `robot_nodes` and `node_tags` - Indexes on foreign keys
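The UNIQUE index on `content_hash` is what makes deduplication cheap: identical content yields the same SHA-256 digest, so a second insert can be detected with a single index lookup. The sketch below shows the idea with an in-memory stand-in for the table. `content_hash`, `DedupStore`, and the choice to hash the raw, unnormalized string are assumptions for illustration, not HTM's actual code.

```ruby
require "digest"

# Hypothetical helper: hex SHA-256 of the content. The digest is always
# 64 hex characters, which fits the VARCHAR(64) column.
def content_hash(content)
  Digest::SHA256.hexdigest(content)
end

# In-memory stand-in for a table with a UNIQUE(content_hash) constraint.
class DedupStore
  def initialize
    @ids_by_hash = {}
    @next_id = 1
  end

  # Returns [node_id, created]. Identical content maps to the same id.
  def add(content)
    digest = content_hash(content)
    return [@ids_by_hash[digest], false] if @ids_by_hash.key?(digest)

    id = @next_id
    @next_id += 1
    @ids_by_hash[digest] = id
    [id, true]
  end
end

store = DedupStore.new
first_id, created = store.add("PostgreSQL supports vector search")
same_id, created_again = store.add("PostgreSQL supports vector search")
puts first_id == same_id # the duplicate resolves to the existing node
puts created_again       # no new row was created for the duplicate
```

Because the hash here covers the exact string, content that differs only in whitespace or case would be stored as separate nodes.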
 ### Query Optimization
@@ -1013,33 +810,8 @@ ltm.search(timeframe: (Time.at(0)..Time.now), ...)
 # Good: Reasonable limits
 ltm.search_fulltext(query: "...", limit: 20)
-# Bad: Unlimited results
-ltm.search_fulltext(query: "...", limit: 10000)
 ```
-### Connection Management
-Each method call:
-1. Opens a new PostgreSQL connection
-2. Executes the query
-3. Closes the connection
-For bulk operations, this can be slow. Consider:
-- Using connection pooling (future enhancement)
-- Batching operations when possible
-- Caching frequently accessed data
-### TimescaleDB Optimization
-The `nodes` table is a hypertable partitioned by `created_at`:
-- Automatic data partitioning by time
-- Compression for data older than 30 days
-- Optimized for time-series queries
 ---
 ## Error Handling
@@ -1051,39 +823,20 @@ The `nodes` table is a hypertable partitioned by `created_at`:
 ltm = HTM::LongTermMemory.new(invalid_config)
 # => PG::ConnectionBad
-# Unique constraint violations
-ltm.add(key: "existing_key", ...)
+# Unique constraint violations (rare with deduplication)
 # => PG::UniqueViolation
-# Foreign key violations
-ltm.add_relationship(from: "nonexistent", to: "key")
-# No error - returns early if nodes don't exist
 ```
 ### Best Practices
 ```ruby
-# Wrap in rescue blocks
-begin
-  node_id = ltm.add(key: key, ...)
-rescue PG::UniqueViolation
-  # Key already exists
-  node = ltm.retrieve(key)
-  node_id = node['id'].to_i
-end
 # Check existence before operations
-if ltm.retrieve(key)
-  ltm.delete(key)
+if ltm.exists?(node_id)
+  ltm.delete(node_id)
 end
-# Validate before adding relationships
-from_exists = ltm.get_node_id(from_key)
-to_exists = ltm.get_node_id(to_key)
-if from_exists && to_exists
-  ltm.add_relationship(from: from_key, to: to_key)
-end
+# Use HTM#forget for safe deletion with confirmation
+htm.forget(node_id, confirm: :confirmed)
 ```
 ---
@@ -1092,5 +845,4 @@ end
 - [HTM API](htm.md) - Main class that uses LongTermMemory
 - [WorkingMemory API](working-memory.md) - Token-limited active context
-- [EmbeddingService API](embedding-service.md) - Vector embedding generation
-- [Database API](database.md) - Schema setup and configuration
+- [Database Schema](../development/schema.md) - Full schema documentation