htm 0.0.1
- checksums.yaml +7 -0
- data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
- data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
- data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
- data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
- data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
- data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
- data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
- data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
- data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
- data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
- data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
- data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
- data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
- data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
- data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
- data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
- data/.architecture/members.yml +144 -0
- data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
- data/.architecture/reviews/initial-system-analysis.md +330 -0
- data/.envrc +32 -0
- data/.irbrc +145 -0
- data/CHANGELOG.md +150 -0
- data/COMMITS.md +196 -0
- data/LICENSE +21 -0
- data/README.md +1347 -0
- data/Rakefile +51 -0
- data/SETUP.md +268 -0
- data/config/database.yml +67 -0
- data/db/migrate/20250101000001_enable_extensions.rb +14 -0
- data/db/migrate/20250101000002_create_robots.rb +14 -0
- data/db/migrate/20250101000003_create_nodes.rb +42 -0
- data/db/migrate/20250101000005_create_tags.rb +38 -0
- data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
- data/db/schema.sql +473 -0
- data/db/seed_data/README.md +100 -0
- data/db/seed_data/presidents.md +136 -0
- data/db/seed_data/states.md +151 -0
- data/db/seeds.rb +208 -0
- data/dbdoc/README.md +173 -0
- data/dbdoc/public.node_stats.md +48 -0
- data/dbdoc/public.node_stats.svg +41 -0
- data/dbdoc/public.node_tags.md +40 -0
- data/dbdoc/public.node_tags.svg +112 -0
- data/dbdoc/public.nodes.md +54 -0
- data/dbdoc/public.nodes.svg +118 -0
- data/dbdoc/public.nodes_tags.md +39 -0
- data/dbdoc/public.nodes_tags.svg +112 -0
- data/dbdoc/public.ontology_structure.md +48 -0
- data/dbdoc/public.ontology_structure.svg +38 -0
- data/dbdoc/public.operations_log.md +42 -0
- data/dbdoc/public.operations_log.svg +130 -0
- data/dbdoc/public.relationships.md +39 -0
- data/dbdoc/public.relationships.svg +41 -0
- data/dbdoc/public.robot_activity.md +46 -0
- data/dbdoc/public.robot_activity.svg +35 -0
- data/dbdoc/public.robots.md +35 -0
- data/dbdoc/public.robots.svg +90 -0
- data/dbdoc/public.schema_migrations.md +29 -0
- data/dbdoc/public.schema_migrations.svg +26 -0
- data/dbdoc/public.tags.md +35 -0
- data/dbdoc/public.tags.svg +60 -0
- data/dbdoc/public.topic_relationships.md +45 -0
- data/dbdoc/public.topic_relationships.svg +32 -0
- data/dbdoc/schema.json +1437 -0
- data/dbdoc/schema.svg +154 -0
- data/docs/api/database.md +806 -0
- data/docs/api/embedding-service.md +532 -0
- data/docs/api/htm.md +797 -0
- data/docs/api/index.md +259 -0
- data/docs/api/long-term-memory.md +1096 -0
- data/docs/api/working-memory.md +665 -0
- data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
- data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
- data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
- data/docs/architecture/adrs/004-hive-mind.md +437 -0
- data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
- data/docs/architecture/adrs/006-context-assembly.md +496 -0
- data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
- data/docs/architecture/adrs/008-robot-identification.md +625 -0
- data/docs/architecture/adrs/009-never-forget.md +648 -0
- data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
- data/docs/architecture/adrs/011-pgai-integration.md +494 -0
- data/docs/architecture/adrs/index.md +215 -0
- data/docs/architecture/hive-mind.md +736 -0
- data/docs/architecture/index.md +351 -0
- data/docs/architecture/overview.md +538 -0
- data/docs/architecture/two-tier-memory.md +873 -0
- data/docs/assets/css/custom.css +83 -0
- data/docs/assets/images/htm-core-components.svg +63 -0
- data/docs/assets/images/htm-database-schema.svg +93 -0
- data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
- data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
- data/docs/assets/images/htm-layered-architecture.svg +71 -0
- data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
- data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
- data/docs/assets/images/htm.jpg +0 -0
- data/docs/assets/images/htm_demo.gif +0 -0
- data/docs/assets/js/mathjax.js +18 -0
- data/docs/assets/videos/htm_video.mp4 +0 -0
- data/docs/database_rake_tasks.md +322 -0
- data/docs/development/contributing.md +787 -0
- data/docs/development/index.md +336 -0
- data/docs/development/schema.md +596 -0
- data/docs/development/setup.md +719 -0
- data/docs/development/testing.md +819 -0
- data/docs/guides/adding-memories.md +824 -0
- data/docs/guides/context-assembly.md +1009 -0
- data/docs/guides/getting-started.md +577 -0
- data/docs/guides/index.md +118 -0
- data/docs/guides/long-term-memory.md +941 -0
- data/docs/guides/multi-robot.md +866 -0
- data/docs/guides/recalling-memories.md +927 -0
- data/docs/guides/search-strategies.md +953 -0
- data/docs/guides/working-memory.md +717 -0
- data/docs/index.md +214 -0
- data/docs/installation.md +477 -0
- data/docs/multi_framework_support.md +519 -0
- data/docs/quick-start.md +655 -0
- data/docs/setup_local_database.md +302 -0
- data/docs/using_rake_tasks_in_your_app.md +383 -0
- data/examples/basic_usage.rb +93 -0
- data/examples/cli_app/README.md +317 -0
- data/examples/cli_app/htm_cli.rb +270 -0
- data/examples/custom_llm_configuration.rb +183 -0
- data/examples/example_app/Rakefile +71 -0
- data/examples/example_app/app.rb +206 -0
- data/examples/sinatra_app/Gemfile +21 -0
- data/examples/sinatra_app/app.rb +335 -0
- data/lib/htm/active_record_config.rb +113 -0
- data/lib/htm/configuration.rb +342 -0
- data/lib/htm/database.rb +594 -0
- data/lib/htm/embedding_service.rb +115 -0
- data/lib/htm/errors.rb +34 -0
- data/lib/htm/job_adapter.rb +154 -0
- data/lib/htm/jobs/generate_embedding_job.rb +65 -0
- data/lib/htm/jobs/generate_tags_job.rb +82 -0
- data/lib/htm/long_term_memory.rb +965 -0
- data/lib/htm/models/node.rb +109 -0
- data/lib/htm/models/node_tag.rb +33 -0
- data/lib/htm/models/robot.rb +52 -0
- data/lib/htm/models/tag.rb +76 -0
- data/lib/htm/railtie.rb +76 -0
- data/lib/htm/sinatra.rb +157 -0
- data/lib/htm/tag_service.rb +135 -0
- data/lib/htm/tasks.rb +38 -0
- data/lib/htm/version.rb +5 -0
- data/lib/htm/working_memory.rb +182 -0
- data/lib/htm.rb +400 -0
- data/lib/tasks/db.rake +19 -0
- data/lib/tasks/htm.rake +147 -0
- data/lib/tasks/jobs.rake +312 -0
- data/mkdocs.yml +190 -0
- data/scripts/install_local_database.sh +309 -0
- metadata +341 -0
|
@@ -0,0 +1,330 @@
|
|
|
1
|
+
# HTM Initial System Analysis
|
|
2
|
+
|
|
3
|
+
**Generated**: 2025-10-25
|
|
4
|
+
**Version**: 0.1.0 (Initial Development)
|
|
5
|
+
**Status**: Active Development
|
|
6
|
+
|
|
7
|
+
## Executive Summary
|
|
8
|
+
|
|
9
|
+
HTM (Hierarchical Temporary Memory) is a Ruby gem providing intelligent memory management for LLM-based applications ("robots"). The system implements a two-tier memory architecture combining durable PostgreSQL/TimescaleDB storage with token-limited working memory, enabling contextual recall through RAG (Retrieval-Augmented Generation) techniques.
|
|
10
|
+
|
|
11
|
+
### Key Strengths
|
|
12
|
+
- **Never-forget architecture**: Explicit deletion model prevents accidental data loss
|
|
13
|
+
- **Multi-robot "hive mind"**: Shared memory enables cross-robot context awareness
|
|
14
|
+
- **Time-series optimization**: TimescaleDB hypertables with automatic compression
|
|
15
|
+
- **Flexible search**: Vector, full-text, and hybrid search strategies
|
|
16
|
+
- **Production-grade storage**: PostgreSQL with pgvector for semantic search
|
|
17
|
+
|
|
18
|
+
### Key Challenges
|
|
19
|
+
- **Embedding service dependency**: Requires Ollama or external API for vector generation
|
|
20
|
+
- **Early development stage**: Some features are stubs (OpenAI, Cohere providers)
|
|
21
|
+
- **Schema evolution**: No migration framework currently in place
|
|
22
|
+
- **Connection management**: Single connection model may not scale
|
|
23
|
+
|
|
24
|
+
## System Architecture
|
|
25
|
+
|
|
26
|
+
### Component Overview
|
|
27
|
+
|
|
28
|
+
```
┌─────────────────────────────────────────────────────────────┐
│                         HTM API                             │
│  (Main interface - lib/htm.rb)                              │
│  • add_node, recall, retrieve, forget                       │
│  • create_context, memory_stats                             │
│  • which_robot_said, conversation_timeline                  │
└────────────┬────────────────────────────┬───────────────────┘
             │                            │
             ▼                            ▼
┌──────────────────────┐      ┌──────────────────────┐
│   WorkingMemory      │      │   LongTermMemory     │
│   (In-memory)        │◄────►│   (PostgreSQL)       │
│                      │      │                      │
│ • Token tracking     │      │ • Persistent nodes   │
│ • LRU eviction       │      │ • Relationships      │
│ • Context assembly   │      │ • Tags               │
└──────────────────────┘      │ • Vector search      │
                              │ • Full-text search   │
                              └──────────┬───────────┘
                                         │
                                         ▼
                              ┌──────────────────────┐
                              │  EmbeddingService    │
                              │  (via RubyLLM)       │
                              │                      │
                              │ • Ollama (default)   │
                              │ • OpenAI (stub)      │
                              │ • Token counting     │
                              └──────────────────────┘
```
|
|
59
|
+
|
|
60
|
+
### Data Flow
|
|
61
|
+
|
|
62
|
+
**Adding a Memory:**
|
|
63
|
+
1. HTM.add_node() receives content and metadata
|
|
64
|
+
2. EmbeddingService generates vector embedding (via Ollama)
|
|
65
|
+
3. Token count calculated using Tiktoken
|
|
66
|
+
4. LongTermMemory persists to PostgreSQL with embedding
|
|
67
|
+
5. WorkingMemory adds to active context (with eviction if needed)
|
|
68
|
+
6. Relationships and tags created
|
|
69
|
+
7. Operation logged to audit trail
|
|
70
|
+
|
|
71
|
+
**Recalling Memories:**
|
|
72
|
+
1. HTM.recall() with timeframe and topic
|
|
73
|
+
2. Natural language timeframe parsed ("last week" → date range)
|
|
74
|
+
3. Search strategy selected (vector/fulltext/hybrid)
|
|
75
|
+
4. EmbeddingService generates query embedding (for vector search)
|
|
76
|
+
5. LongTermMemory executes search with time filter
|
|
77
|
+
6. Results added to WorkingMemory (evicting if needed)
|
|
78
|
+
7. Operation logged
|
|
79
|
+
8. Nodes returned to caller
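
Step 2 of the recall flow (natural-language timeframe to date range) can be illustrated with a minimal sketch. The helper name `parse_timeframe` and the handful of phrases it accepts are assumptions made for illustration; the grammar HTM actually supports is not shown in this document.

```ruby
require "date"

# Hypothetical helper: maps a few English phrases to a Range of Dates.
# The supported phrases are illustrative only, not HTM's real grammar.
def parse_timeframe(phrase, today: Date.today)
  case phrase.strip.downcase
  when "today"
    today..today
  when "yesterday"
    (today - 1)..(today - 1)
  when "last week"
    # Assumption: "last week" means the trailing seven days.
    (today - 7)..today
  when /\Alast (\d+) days\z/
    (today - Regexp.last_match(1).to_i)..today
  else
    raise ArgumentError, "unrecognized timeframe: #{phrase.inspect}"
  end
end

range = parse_timeframe("last week", today: Date.new(2025, 10, 25))
puts "#{range.begin} to #{range.end}"
```

The resulting range is what a time filter on `created_at` would consume in step 5. Raising on unknown phrases keeps a misspelled timeframe from silently widening the search.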
|
|
80
|
+
|
|
81
|
+
**Context Assembly:**
|
|
82
|
+
1. HTM.create_context() with strategy
|
|
83
|
+
2. WorkingMemory sorts nodes by strategy (recent/important/balanced)
|
|
84
|
+
3. Assembles text within token budget
|
|
85
|
+
4. Returns context string for LLM
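
The context assembly steps above can be sketched as follows. This is a minimal illustration, not the gem's implementation: the `MemoryNode` struct, its field names, and the "balanced" scoring formula (importance discounted by age in hours) are all assumptions.

```ruby
# Hypothetical in-memory node; field names are assumptions for this sketch.
MemoryNode = Struct.new(:content, :token_count, :importance, :accessed_at, keyword_init: true)

# Orders nodes by the chosen strategy, then greedily packs them into the
# token budget, skipping any node that would overflow it.
def assemble_context(nodes, strategy:, max_tokens:, now: Time.now)
  ranked =
    case strategy
    when :recent    then nodes.sort_by { |n| -n.accessed_at.to_f }
    when :important then nodes.sort_by { |n| -n.importance }
    when :balanced
      # Assumed scoring: importance divided by (1 + age in hours).
      nodes.sort_by { |n| -(n.importance / (1.0 + (now - n.accessed_at) / 3600.0)) }
    else
      raise ArgumentError, "unknown strategy: #{strategy.inspect}"
    end

  used = 0
  picked = ranked.select do |node|
    fits = used + node.token_count <= max_tokens
    used += node.token_count if fits
    fits
  end
  picked.map(&:content).join("\n\n")
end

now = Time.at(1_000_000)
nodes = [
  MemoryNode.new(content: "old but vital", token_count: 40, importance: 9.0, accessed_at: now - 7200),
  MemoryNode.new(content: "fresh note", token_count: 30, importance: 2.0, accessed_at: now - 60),
  MemoryNode.new(content: "large dump", token_count: 500, importance: 5.0, accessed_at: now - 600)
]
puts assemble_context(nodes, strategy: :recent, max_tokens: 100, now: now)
```

Skipping an oversized node rather than stopping at it lets smaller, lower-ranked nodes still use the remaining budget; whether HTM does the same is not stated here.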
|
|
86
|
+
|
|
87
|
+
## Technology Stack
|
|
88
|
+
|
|
89
|
+
### Core Dependencies
|
|
90
|
+
- **PostgreSQL 17+**: Primary data store
|
|
91
|
+
- **TimescaleDB**: Time-series optimization, hypertables, compression
|
|
92
|
+
- **pgvector**: Vector similarity search (cosine distance, HNSW indexing)
|
|
93
|
+
- **pg_trgm**: Trigram-based fuzzy text matching
|
|
94
|
+
- **Ruby 3.0+**: Implementation language
|
|
95
|
+
- **Ollama**: Local embedding generation (default via RubyLLM)
|
|
96
|
+
- **Tiktoken**: Token counting for context management
|
|
97
|
+
|
|
98
|
+
### Development Tools
|
|
99
|
+
- **Minitest**: Testing framework
|
|
100
|
+
- **Rake**: Task automation
|
|
101
|
+
- **debug_me**: Debugging utility (project standard)
|
|
102
|
+
|
|
103
|
+
## Database Schema
|
|
104
|
+
|
|
105
|
+
### Core Tables
|
|
106
|
+
|
|
107
|
+
**nodes** (TimescaleDB hypertable on `created_at`):
|
|
108
|
+
- Primary memory storage
|
|
109
|
+
- Vector embeddings (1536 dimensions)
|
|
110
|
+
- Token counts, importance scores
|
|
111
|
+
- Robot ownership tracking
|
|
112
|
+
- Compression after 30 days
|
|
113
|
+
|
|
114
|
+
**relationships**:
|
|
115
|
+
- Knowledge graph edges
|
|
116
|
+
- From/to node references
|
|
117
|
+
- Relationship types and strength
|
|
118
|
+
|
|
119
|
+
**tags**:
|
|
120
|
+
- Flexible categorization
|
|
121
|
+
- Many-to-many with nodes
|
|
122
|
+
|
|
123
|
+
**operations_log** (TimescaleDB hypertable):
|
|
124
|
+
- Audit trail for all operations
|
|
125
|
+
- Partitioned by timestamp
|
|
126
|
+
|
|
127
|
+
**robots**:
|
|
128
|
+
- Robot registry
|
|
129
|
+
- Activity tracking
|
|
130
|
+
|
|
131
|
+
### Indexing Strategy
|
|
132
|
+
- **HNSW** on vector embeddings (cosine distance)
|
|
133
|
+
- **GIN** on full-text search vectors
|
|
134
|
+
- **GIN** with trigram ops for fuzzy matching
|
|
135
|
+
- **B-tree** on temporal columns, robot_id, types
|
|
136
|
+
|
|
137
|
+
## Current Implementation Status
|
|
138
|
+
|
|
139
|
+
### Completed (Phase 1)
|
|
140
|
+
- ✅ Core two-tier memory architecture
|
|
141
|
+
- ✅ PostgreSQL/TimescaleDB schema
|
|
142
|
+
- ✅ Ollama embedding integration
|
|
143
|
+
- ✅ Token counting and budget management
|
|
144
|
+
- ✅ Database connection and setup
|
|
145
|
+
- ✅ Hypertable configuration
|
|
146
|
+
- ✅ Basic testing framework
|
|
147
|
+
|
|
148
|
+
### In Progress
|
|
149
|
+
- 🔄 RAG retrieval implementation
|
|
150
|
+
- 🔄 Working memory eviction strategies
|
|
151
|
+
- 🔄 Relationship graph queries
|
|
152
|
+
- 🔄 Tag-based filtering
|
|
153
|
+
|
|
154
|
+
### Planned
|
|
155
|
+
- 📋 Additional embedding providers (OpenAI, Cohere)
|
|
156
|
+
- 📋 Connection pooling
|
|
157
|
+
- 📋 Advanced context assembly
|
|
158
|
+
- 📋 Memory consolidation
|
|
159
|
+
- 📋 Observability and metrics
|
|
160
|
+
- 📋 Migration framework
|
|
161
|
+
- 📋 Production hardening
|
|
162
|
+
|
|
163
|
+
## Design Patterns & Principles
|
|
164
|
+
|
|
165
|
+
### Architecture Patterns
|
|
166
|
+
- **Two-tier memory**: Separates hot (working) from cold (long-term) storage
|
|
167
|
+
- **RAG (Retrieval-Augmented Generation)**: Semantic + temporal search
|
|
168
|
+
- **Repository pattern**: Database abstraction in LongTermMemory
|
|
169
|
+
- **Strategy pattern**: Multiple search and context assembly strategies
|
|
170
|
+
- **Adapter pattern**: EmbeddingService abstracts provider differences
|
|
171
|
+
|
|
172
|
+
### Design Principles Applied
|
|
173
|
+
- **Explicit deletion**: Never delete without confirmation
|
|
174
|
+
- **Fail-safe defaults**: Falls back to random embeddings if Ollama is unavailable
|
|
175
|
+
- **Separation of concerns**: Clear component boundaries
|
|
176
|
+
- **Testability**: Components designed for isolation testing
|
|
177
|
+
- **Documentation as code**: Inline documentation with examples
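
The explicit-deletion principle listed above can be sketched as a guard on the delete path. The class `SketchMemory` and the `confirm:` keyword are hypothetical stand-ins; only the `:confirmed` symbol and the audit logging of deletions come from the project's own description.

```ruby
# Hypothetical stand-in for a memory store; not HTM's actual class.
class SketchMemory
  attr_reader :audit_log

  def initialize
    @nodes = {}
    @audit_log = []
  end

  def add(key, content)
    @nodes[key] = content
  end

  def key?(key)
    @nodes.key?(key)
  end

  # Deletes only when the caller passes confirm: :confirmed (assumed keyword).
  # Returns true if something was deleted, false if the key was absent.
  def forget(key, confirm: nil)
    unless confirm == :confirmed
      raise ArgumentError, "pass confirm: :confirmed to delete #{key.inspect}"
    end

    removed = @nodes.delete(key)
    @audit_log << { operation: :forget, key: key, at: Time.now } if removed
    !removed.nil?
  end
end

memory = SketchMemory.new
memory.add("api_key_note", "rotate keys quarterly")
memory.forget("api_key_note", confirm: :confirmed)
puts memory.audit_log.size
```

Requiring a specific symbol rather than a boolean makes an accidental `forget(key, true)` fail loudly instead of deleting data.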
|
|
178
|
+
|
|
179
|
+
## Key Architectural Decisions
|
|
180
|
+
|
|
181
|
+
See ADRs for detailed decision records:
|
|
182
|
+
|
|
183
|
+
1. **PostgreSQL + TimescaleDB for storage** (ADR-001)
|
|
184
|
+
- Time-series optimization
|
|
185
|
+
- Native vector search with pgvector
|
|
186
|
+
- Production-grade reliability
|
|
187
|
+
|
|
188
|
+
2. **Two-tier memory architecture** (ADR-002)
|
|
189
|
+
- Token budget management
|
|
190
|
+
- LRU eviction to long-term storage
|
|
191
|
+
- Never-delete philosophy
|
|
192
|
+
|
|
193
|
+
3. **Ollama as default embedding provider** (ADR-003)
|
|
194
|
+
- Local-first approach
|
|
195
|
+
- No API costs
|
|
196
|
+
- Privacy-preserving
|
|
197
|
+
|
|
198
|
+
4. **Multi-robot shared memory (hive mind)** (ADR-004)
|
|
199
|
+
- Cross-robot context sharing
|
|
200
|
+
- Conversation attribution
|
|
201
|
+
- Timeline reconstruction
|
|
202
|
+
|
|
203
|
+
5. **Hybrid search strategy** (ADR-005)
|
|
204
|
+
- Vector similarity for semantics
|
|
205
|
+
- Full-text for keywords
|
|
206
|
+
- Temporal filtering
|
|
207
|
+
- Weighted combination
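
The "weighted combination" in decision 5 can be illustrated with a small sketch. It assumes both scores are already normalized to the range 0.0 to 1.0; the function name `hybrid_rank`, the hash keys, and the 0.7 default weight are illustrative choices, not values taken from ADR-005.

```ruby
# Combines a vector-similarity score and a full-text score into one rank.
# Both inputs are assumed to be normalized to 0.0..1.0.
def hybrid_rank(candidates, vector_weight: 0.7)
  text_weight = 1.0 - vector_weight
  candidates
    .map { |c| c.merge(score: vector_weight * c[:vector_score] + text_weight * c[:text_score]) }
    .sort_by { |c| -c[:score] }
end

results = hybrid_rank([
  { id: 1, vector_score: 0.9, text_score: 0.1 },
  { id: 2, vector_score: 0.5, text_score: 1.0 }
])
puts results.map { |r| r[:id] }.inspect
```

Lowering `vector_weight` favors exact keyword matches; raising it favors semantic similarity. Temporal filtering would be applied before this step, by restricting the candidate set to the requested date range.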
|
|
208
|
+
|
|
209
|
+
## Risk Assessment
|
|
210
|
+
|
|
211
|
+
### Technical Risks
|
|
212
|
+
|
|
213
|
+
**High Priority:**
|
|
214
|
+
- **Ollama dependency**: Embedding generation fails if Ollama is unavailable
|
|
215
|
+
- *Mitigation*: Fallback to stub embeddings, multi-provider support
|
|
216
|
+
|
|
217
|
+
- **Schema evolution**: No migration framework
|
|
218
|
+
- *Mitigation*: Implement Rails-like migration system
|
|
219
|
+
|
|
220
|
+
**Medium Priority:**
|
|
221
|
+
- **Connection management**: Single connection per instance
|
|
222
|
+
- *Mitigation*: Implement connection pooling (ConnectionPool gem already included)
|
|
223
|
+
|
|
224
|
+
- **Memory growth**: Working memory could grow unbounded
|
|
225
|
+
- *Mitigation*: Implement aggressive eviction strategies
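
One way to bound working memory is to choose eviction victims before each insert. The sketch below is an assumption-laden illustration: the `Entry` struct, the function name, and the ordering rule (lowest importance first, ties broken by least recent access) are hypothetical. Evicted entries would remain in long-term storage, so nothing is lost.

```ruby
# Hypothetical working-memory entry; names are assumptions for this sketch.
Entry = Struct.new(:key, :token_count, :importance, :accessed_at, keyword_init: true)

# Returns the entries to evict so that `incoming_tokens` fits under `max_tokens`.
# Lowest importance goes first; ties are broken by least recent access.
def entries_to_evict(entries, incoming_tokens:, max_tokens:)
  used = entries.sum(&:token_count)
  needed = used + incoming_tokens - max_tokens
  return [] if needed <= 0

  freed = 0
  entries
    .sort_by { |e| [e.importance, e.accessed_at.to_f] }
    .take_while do |e|
      keep_going = freed < needed
      freed += e.token_count if keep_going
      keep_going
    end
end

base = Time.at(1_000_000)
entries = [
  Entry.new(key: "a", token_count: 50, importance: 1.0, accessed_at: base),
  Entry.new(key: "b", token_count: 30, importance: 5.0, accessed_at: base - 900),
  Entry.new(key: "c", token_count: 40, importance: 1.0, accessed_at: base - 3600)
]
victims = entries_to_evict(entries, incoming_tokens: 20, max_tokens: 100)
puts victims.map(&:key).inspect
```

Computing the victims as a pure function keeps the policy easy to test in isolation, separate from whatever code actually removes entries.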
|
|
226
|
+
|
|
227
|
+
**Low Priority:**
|
|
228
|
+
- **Embedding dimension mismatch**: Hardcoded 1536 dimensions
|
|
229
|
+
- *Mitigation*: Make configurable per provider
|
|
230
|
+
|
|
231
|
+
### Operational Risks
|
|
232
|
+
|
|
233
|
+
**Medium Priority:**
|
|
234
|
+
- **Database costs**: TimescaleDB Cloud usage-based pricing
|
|
235
|
+
- *Mitigation*: Compression policies, retention policies
|
|
236
|
+
|
|
237
|
+
- **Token counting accuracy**: Tiktoken counts are approximate and may differ from the target LLM's tokenizer
|
|
238
|
+
- *Mitigation*: Add safety margins, LLM-specific counters
|
|
239
|
+
|
|
240
|
+
## Performance Considerations
|
|
241
|
+
|
|
242
|
+
### Strengths
|
|
243
|
+
- TimescaleDB chunk-based partitioning for time-range queries
|
|
244
|
+
- HNSW indexing for fast vector similarity search
|
|
245
|
+
- Compression for old data reduces storage costs
|
|
246
|
+
- Token pre-calculation avoids runtime overhead
|
|
247
|
+
|
|
248
|
+
### Optimization Opportunities
|
|
249
|
+
- Connection pooling for concurrent access
|
|
250
|
+
- Batch embedding generation
|
|
251
|
+
- Caching frequently accessed nodes
|
|
252
|
+
- Lazy loading of relationships
|
|
253
|
+
- Prepared statements for common queries
|
|
254
|
+
|
|
255
|
+
## Security Considerations
|
|
256
|
+
|
|
257
|
+
### Current State
|
|
258
|
+
- SSL required for TimescaleDB Cloud connection
|
|
259
|
+
- Database credentials via environment variables
|
|
260
|
+
- No encryption at rest (relies on database)
|
|
261
|
+
- No access control beyond robot_id tracking
|
|
262
|
+
|
|
263
|
+
### Recommendations
|
|
264
|
+
- Implement row-level security for multi-tenant scenarios
|
|
265
|
+
- Encrypt embeddings if sensitive
|
|
266
|
+
- Add audit logging for forget() operations
|
|
267
|
+
- Consider API key rotation for embedding providers
|
|
268
|
+
- Validate and sanitize all user inputs
|
|
269
|
+
|
|
270
|
+
## Scalability Analysis
|
|
271
|
+
|
|
272
|
+
### Current Limitations
|
|
273
|
+
- Single database connection per HTM instance
|
|
274
|
+
- In-memory working memory (per-process)
|
|
275
|
+
- No horizontal scaling strategy
|
|
276
|
+
- Limited to single TimescaleDB instance
|
|
277
|
+
|
|
278
|
+
### Growth Path
|
|
279
|
+
- Add connection pooling
|
|
280
|
+
- Consider Redis for shared working memory
|
|
281
|
+
- Implement read replicas for query scaling
|
|
282
|
+
- Partition by robot_id for tenant isolation
|
|
283
|
+
- Add caching layer (Redis/Memcached)
|
|
284
|
+
|
|
285
|
+
## Maintainability Assessment
|
|
286
|
+
|
|
287
|
+
### Strengths
|
|
288
|
+
- Clear component separation
|
|
289
|
+
- Comprehensive inline documentation
|
|
290
|
+
- Test framework in place
|
|
291
|
+
- Debugging with debug_me standard
|
|
292
|
+
- Frozen string literals enabled
|
|
293
|
+
|
|
294
|
+
### Areas for Improvement
|
|
295
|
+
- Increase test coverage (integration tests needed)
|
|
296
|
+
- Add API documentation (YARD/RDoc)
|
|
297
|
+
- Implement CI/CD pipeline
|
|
298
|
+
- Add code quality metrics (RuboCop, SimpleCov)
|
|
299
|
+
- Create migration framework
|
|
300
|
+
|
|
301
|
+
## Next Steps
|
|
302
|
+
|
|
303
|
+
### Immediate (Current Sprint)
|
|
304
|
+
1. Complete RAG retrieval implementation
|
|
305
|
+
2. Finalize working memory eviction
|
|
306
|
+
3. Add comprehensive integration tests
|
|
307
|
+
4. Document API with YARD
|
|
308
|
+
|
|
309
|
+
### Short-term (Next 2-4 weeks)
|
|
310
|
+
1. Implement connection pooling
|
|
311
|
+
2. Add OpenAI embedding provider
|
|
312
|
+
3. Create migration framework
|
|
313
|
+
4. Add observability (logging, metrics)
|
|
314
|
+
5. Performance profiling and optimization
|
|
315
|
+
|
|
316
|
+
### Long-term (Next Quarter)
|
|
317
|
+
1. Production hardening
|
|
318
|
+
2. Horizontal scaling strategy
|
|
319
|
+
3. Advanced RAG features (re-ranking, filtering)
|
|
320
|
+
4. Memory consolidation algorithms
|
|
321
|
+
5. Web UI for memory exploration
|
|
322
|
+
6. Publish gem to RubyGems
|
|
323
|
+
|
|
324
|
+
## Conclusion
|
|
325
|
+
|
|
326
|
+
HTM demonstrates a solid architectural foundation with clear separation of concerns and production-grade technology choices. The two-tier memory model with RAG-based retrieval is well-suited for LLM applications requiring contextual awareness across conversations.
|
|
327
|
+
|
|
328
|
+
Key strengths include the never-forget philosophy, multi-robot hive mind, and TimescaleDB time-series optimization. Primary areas for improvement are connection management, schema evolution, and comprehensive testing.
|
|
329
|
+
|
|
330
|
+
The project is positioned well for growth from prototype to production-ready gem with focused attention on connection pooling, additional embedding providers, and operational tooling.
|
data/.envrc
ADDED
|
@@ -0,0 +1,32 @@
|
|
|
1
|
+
# htm/.envrc
|
|
2
|
+
|
|
3
|
+
export RR=`pwd`
|
|
4
|
+
|
|
5
|
+
# Database connection - Localhost PostgreSQL
|
|
6
|
+
export HTM_DBHOST=localhost
|
|
7
|
+
export HTM_DBPORT=5432
|
|
8
|
+
export HTM_DBNAME=htm_development
|
|
9
|
+
export HTM_DBUSER=${USER}
|
|
10
|
+
export HTM_DBPASS=
|
|
11
|
+
export HTM_DBURL="postgresql://${HTM_DBUSER}@${HTM_DBHOST}:${HTM_DBPORT}/${HTM_DBNAME}?sslmode=disable"
|
|
12
|
+
|
|
13
|
+
# Uncomment if using TimescaleDB Cloud instead:
|
|
14
|
+
# export HTM_SERVICE_NAME=$TIGER_SERVICE_NAME
|
|
15
|
+
# export HTM_DBURL=$TIGER_DBURL
|
|
16
|
+
# export HTM_DBNAME=$TIGER_DBNAME
|
|
17
|
+
# export HTM_DBUSER=$TIGER_DBUSER
|
|
18
|
+
# export HTM_DBPASS=$TIGER_DBPASS
|
|
19
|
+
# export HTM_DBHOST=$TIGER_DBHOST
|
|
20
|
+
# export HTM_DBPORT=$TIGER_DBPORT
|
|
21
|
+
|
|
22
|
+
# Client-side embedding generation
|
|
23
|
+
# HTM generates embeddings before inserting into database
|
|
24
|
+
export HTM_EMBEDDINGS_PROVIDER=ollama
|
|
25
|
+
export HTM_EMBEDDINGS_MODEL=embeddinggemma
|
|
26
|
+
export HTM_EMBEDDINGS_BASE_URL=http://localhost:11434
|
|
27
|
+
export HTM_EMBEDDINGS_DIMENSION=768
|
|
28
|
+
|
|
29
|
+
# Topic extraction (client-side)
|
|
30
|
+
export HTM_TOPIC_PROVIDER=ollama
|
|
31
|
+
export HTM_TOPIC_MODEL=phi4
|
|
32
|
+
export HTM_TOPIC_BASE_URL=http://localhost:11434
|
data/.irbrc
ADDED
|
@@ -0,0 +1,145 @@
|
|
|
1
|
+
# frozen_string_literal: true
|
|
2
|
+
|
|
3
|
+
# HTM IRB Configuration
|
|
4
|
+
# Load this with: irb -r ./.irbrc
|
|
5
|
+
# Or just: irb (if in the htm directory)
|
|
6
|
+
|
|
7
|
+
puts "Loading HTM library..."
|
|
8
|
+
|
|
9
|
+
# Load the HTM library
|
|
10
|
+
require_relative 'lib/htm'
|
|
11
|
+
|
|
12
|
+
# Establish database connection
|
|
13
|
+
HTM::ActiveRecordConfig.establish_connection! unless HTM::ActiveRecordConfig.connected?
|
|
14
|
+
|
|
15
|
+
# Configure HTM with Ollama for embedding and tag generation
|
|
16
|
+
HTM.configure do |c|
|
|
17
|
+
c.embedding_provider = :ollama
|
|
18
|
+
c.embedding_model = 'nomic-embed-text'
|
|
19
|
+
c.embedding_dimensions = 768
|
|
20
|
+
c.tag_provider = :ollama
|
|
21
|
+
c.tag_model = 'gemma3'
|
|
22
|
+
c.reset_to_defaults
|
|
23
|
+
end
|
|
24
|
+
|
|
25
|
+
# Convenience aliases for models
|
|
26
|
+
Node = HTM::Models::Node
|
|
27
|
+
Tag = HTM::Models::Tag
|
|
28
|
+
NodeTag = HTM::Models::NodeTag
|
|
29
|
+
Robot = HTM::Models::Robot
|
|
30
|
+
|
|
31
|
+
# Helper methods
|
|
32
|
+
def reload!
|
|
33
|
+
puts "Reloading HTM library..."
|
|
34
|
+
load 'lib/htm.rb'
|
|
35
|
+
puts "✓ Reloaded"
|
|
36
|
+
end
|
|
37
|
+
|
|
38
|
+
def db_stats
|
|
39
|
+
puts <<~STATS
|
|
40
|
+
|
|
41
|
+
=== Database Statistics ===
|
|
42
|
+
Nodes: #{Node.count}
|
|
43
|
+
Tags: #{Tag.count}
|
|
44
|
+
NodeTags: #{NodeTag.count}
|
|
45
|
+
Robots: #{Robot.count}
|
|
46
|
+
|
|
47
|
+
STATS
|
|
48
|
+
end
|
|
49
|
+
|
|
50
|
+
def recent_nodes(limit = 5)
|
|
51
|
+
puts "\n=== Recent Nodes ==="
|
|
52
|
+
Node.order(created_at: :desc).limit(limit).each do |node|
|
|
53
|
+
tags = node.tags.pluck(:name).join(', ')
|
|
54
|
+
tags_str = tags.empty? ? "(no tags)" : tags
|
|
55
|
+
puts "Node #{node.id}: #{node.content[0..60]}..."
|
|
56
|
+
puts " Tags: #{tags_str}"
|
|
57
|
+
puts " Embedding: #{node.embedding ? '✓' : '✗'}"
|
|
58
|
+
puts ""
|
|
59
|
+
end
|
|
60
|
+
end
|
|
61
|
+
|
|
62
|
+
def recent_tags(limit = 10)
|
|
63
|
+
puts "\n=== Recent Tags ==="
|
|
64
|
+
Tag.order(created_at: :desc).limit(limit).each do |tag|
|
|
65
|
+
count = tag.nodes.count
|
|
66
|
+
puts "#{tag.name} (#{count} nodes)"
|
|
67
|
+
end
|
|
68
|
+
puts
|
|
69
|
+
end
|
|
70
|
+
|
|
71
|
+
def search_tags(pattern)
|
|
72
|
+
puts "\n=== Tags matching '#{pattern}' ==="
|
|
73
|
+
Tag.where("name LIKE ?", "%#{pattern}%").each do |tag|
|
|
74
|
+
count = tag.nodes.count
|
|
75
|
+
puts "#{tag.name} (#{count} nodes)"
|
|
76
|
+
end
|
|
77
|
+
puts
|
|
78
|
+
end
|
|
79
|
+
|
|
80
|
+
def node_with_tags(node_id)
|
|
81
|
+
node = Node.includes(:tags).find(node_id)
|
|
82
|
+
embedding_info = if node.embedding
|
|
83
|
+
"✓ (#{node.embedding.size} dimensions)"
|
|
84
|
+
else
|
|
85
|
+
'✗'
|
|
86
|
+
end
|
|
87
|
+
|
|
88
|
+
puts <<~NODE_INFO
|
|
89
|
+
|
|
90
|
+
=== Node #{node_id} ===
|
|
91
|
+
Content: #{node.content}
|
|
92
|
+
Source: #{node.source}
|
|
93
|
+
Created: #{node.created_at}
|
|
94
|
+
Embedding: #{embedding_info}
|
|
95
|
+
|
|
96
|
+
Tags:
|
|
97
|
+
NODE_INFO
|
|
98
|
+
|
|
99
|
+
if node.tags.any?
|
|
100
|
+
node.tags.each { |tag| puts " - #{tag.name}" }
|
|
101
|
+
else
|
|
102
|
+
puts " (no tags)"
|
|
103
|
+
end
|
|
104
|
+
puts
|
|
105
|
+
node
|
|
106
|
+
end
|
|
107
|
+
|
|
108
|
+
def create_test_node(content, source: "irb")
|
|
109
|
+
htm = HTM.new(robot_name: "IRB User")
|
|
110
|
+
node_id = htm.remember(content, source: source)
|
|
111
|
+
puts "✓ Created node #{node_id}"
|
|
112
|
+
node_id
|
|
113
|
+
end
|
|
114
|
+
|
|
115
|
+
def htm_help
|
|
116
|
+
puts <<~WELCOME
|
|
117
|
+
|
|
118
|
+
============================================================
|
|
119
|
+
HTM Interactive Console
|
|
120
|
+
============================================================
|
|
121
|
+
|
|
122
|
+
Available models:
|
|
123
|
+
- Node (HTM::Models::Node)
|
|
124
|
+
- Tag (HTM::Models::Tag)
|
|
125
|
+
- NodeTag (HTM::Models::NodeTag)
|
|
126
|
+
- Robot (HTM::Models::Robot)
|
|
127
|
+
|
|
128
|
+
Helper methods:
|
|
129
|
+
htm_help # Reprints this message
|
|
130
|
+
db_stats # Show database statistics
|
|
131
|
+
recent_nodes(n) # Show n recent nodes (default: 5)
|
|
132
|
+
recent_tags(n) # Show n recent tags (default: 10)
|
|
133
|
+
search_tags(pattern) # Search tags by pattern
|
|
134
|
+
node_with_tags(id) # Show node details with tags
|
|
135
|
+
create_test_node(str) # Create a test node
|
|
136
|
+
reload! # Reload HTM library
|
|
137
|
+
|
|
138
|
+
Database: #{HTM::Database.default_config[:dbname]}
|
|
139
|
+
WELCOME
|
|
140
|
+
end
|
|
141
|
+
|
|
142
|
+
htm_help
|
|
143
|
+
db_stats
|
|
144
|
+
|
|
145
|
+
print "HTM Ready!\n\n"
|
data/CHANGELOG.md
ADDED
|
@@ -0,0 +1,150 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project will be documented in this file.
|
|
4
|
+
|
|
5
|
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
6
|
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
|
+
|
|
8
|
+
## [Unreleased]
|
|
9
|
+
|
|
10
|
+
### Added
|
|
11
|
+
- Architecture documentation using ai-software-architect framework
|
|
12
|
+
- Comprehensive ADRs (Architecture Decision Records):
|
|
13
|
+
- ADR-001: PostgreSQL with TimescaleDB for storage
|
|
14
|
+
- ADR-002: Two-tier memory architecture (working + long-term)
|
|
15
|
+
- ADR-003: Ollama as default embedding provider
|
|
16
|
+
- ADR-004: Multi-robot shared memory (hive mind)
|
|
17
|
+
- ADR-005: RAG-based retrieval with hybrid search
|
|
18
|
+
- ADR-006: Context assembly strategies (recent, important, balanced)
|
|
19
|
+
- ADR-007: Working memory eviction strategy (hybrid importance + recency)
|
|
20
|
+
- ADR-008: Robot identification system (UUID + name)
|
|
21
|
+
- ADR-009: Never-forget philosophy with explicit deletion
|
|
22
|
+
- Architecture review team with 8 specialist perspectives
|
|
23
|
+
- Had the robot convert my notes and system analysis documentation into Architecture Decision Records (ADRs)
|
|
24
|
+
|
|
25
|
+
## [0.1.0] - 2025-10-25
|
|
26
|
+
|
|
27
|
+
### Added
|
|
28
|
+
- Initial release of HTM (Hierarchical Temporary Memory)
|
|
29
|
+
- Two-tier memory system:
|
|
30
|
+
- Working memory: Token-limited, in-memory active context
|
|
31
|
+
- Long-term memory: Durable PostgreSQL/TimescaleDB storage
|
|
32
|
+
- Core memory operations:
|
|
33
|
+
- `add_node`: Store memories with metadata, embeddings, and relationships
|
|
34
|
+
- `recall`: RAG-based retrieval with temporal and semantic search
|
|
35
|
+
- `retrieve`: Direct memory lookup by key
|
|
36
|
+
- `forget`: Explicit deletion with confirmation requirement
|
|
37
|
+
- `create_context`: Assemble LLM context from working memory
|
|
38
|
+
- Multi-robot "hive mind" architecture:
|
|
39
|
+
- Shared global memory database
|
|
40
|
+
- Robot attribution tracking
|
|
41
|
+
- Robot registry and activity monitoring
|
|
42
|
+
- Cross-robot knowledge sharing
|
|
43
|
+
- Search strategies:
|
|
44
|
+
- Vector search: Semantic similarity using pgvector
|
|
45
|
+
- Full-text search: Keyword matching with PostgreSQL full-text search
|
|
46
|
+
- Hybrid search: Combined pre-filter + vector reranking
|
|
47
|
+
- Temporal filtering: Time-range queries with TimescaleDB
|
|
48
|
+
- Context assembly strategies:
|
|
49
|
+
- Recent: Most recently accessed memories first
|
|
50
|
+
- Important: Highest importance score first
|
|
51
|
+
- Balanced: Hybrid with time-decay function (default)
|
|
52
|
+
- Working memory management:
|
|
53
|
+
- Configurable token limits (default: 128,000)
|
|
54
|
+
- Hybrid eviction strategy (importance + recency)
|
|
55
|
+
- LRU access tracking
|
|
56
|
+
- Automatic eviction to long-term storage
|
|
57
|
+
- Long-term memory features:
|
|
58
|
+
- PostgreSQL 17+ with TimescaleDB 2.22.1
|
|
59
|
+
- Vector embeddings with pgvector 0.8.1 (HNSW indexing)
|
|
60
|
+
- Full-text search with pg_trgm
|
|
61
|
+
- Relationship graphs between memories
|
|
62
|
+
- Tag system for categorization
|
|
63
|
+
- Operations logging and audit trail
|
|
64
|
+
- Memory statistics and analytics
|
|
65
|
+
- Embedding service:
|
|
66
|
+
- Default: Ollama with gpt-oss model (local-first)
|
|
67
|
+
- Support for multiple providers (OpenAI, Cohere, local)
|
|
68
|
+
- Configurable models and endpoints
|
|
69
|
+
- Accurate token counting with tiktoken_ruby
|
|
70
|
+
- Database schema:
|
|
71
|
+
- `nodes`: Core memory storage with embeddings
|
|
72
|
+
- `relationships`: Graph connections between memories
|
|
73
|
+
- `tags`: Flexible categorization system
|
|
74
|
+
- `robots`: Robot registry and activity tracking
|
|
75
|
+
- `operations_log`: Audit trail for all operations
|
|
76
|
+
- TimescaleDB hypertables for time-series optimization
|
|
77
|
+
- PostgreSQL views for statistics
|
|
78
|
+
- Memory types:
|
|
79
|
+
- `:fact`: Factual information
|
|
80
|
+
- `:context`: Contextual information
|
|
81
|
+
- `:code`: Code snippets and technical content
|
|
82
|
+
- `:preference`: User preferences
|
|
83
|
+
- `:decision`: Architectural and strategic decisions
|
|
84
|
+
- `:question`: Questions and queries
|
|
85
|
+
- Memory metadata:
|
|
86
|
+
- Importance scoring (0.0-10.0)
|
|
87
|
+
- Token counting
|
|
88
|
+
- Timestamps (created_at, last_accessed)
|
|
89
|
+
- Robot attribution
|
|
90
|
+
- Categories and tags
|
|
91
|
+
- Relationships to other memories
|
|
92
|
+
- Robot identification:
|
|
93
|
+
- UUID-based robot_id (auto-generated)
|
|
94
|
+
- Optional human-readable robot_name
|
|
95
|
+
- Robot registry with activity tracking
|
|
96
|
+
- Memory attribution by robot
|
|
97
|
+
- Never-forget philosophy:
|
|
98
|
+
- Memories never automatically deleted
|
|
99
|
+
- Eviction moves to long-term storage (no data loss)
|
|
100
|
+
- Explicit confirmation required for deletion (`:confirmed` symbol)
|
|
101
|
+
- All deletions logged for audit trail
|
|
102
|
+
- Database utilities:
|
|
103
|
+
- Schema creation and migration scripts
|
|
104
|
+
- Extension installation (TimescaleDB, pgvector, pg_trgm)
|
|
105
|
+
- Hypertable configuration
|
|
106
|
+
- Compression policies
|
|
107
|
+
- Index creation for performance
|
|
108
|
+
- Development tools:
|
|
109
|
+
- Comprehensive test suite (Minitest)
|
|
110
|
+
- Example scripts and usage patterns
|
|
111
|
+
- Rakefile with common tasks
|
|
112
|
+
- Environment configuration with direnv
|
|
113
|
+
- Documentation:
|
|
114
|
+
- README with quick start guide
|
|
115
|
+
- SETUP.md with detailed installation instructions
|
|
116
|
+
- CLAUDE.md for AI assistant context
|
|
117
|
+
- Architecture documentation in `.architecture/`
|
|
118
|
+
- Inline code documentation
|
|
119
|
+
|
|
120
|
+
### Dependencies
|
|
121
|
+
- Ruby 3.0+
|
|
122
|
+
- PostgreSQL 17+
|
|
123
|
+
- TimescaleDB 2.22.1
|
|
124
|
+
- pgvector 0.8.1
|
|
125
|
+
- pg gem (~> 1.5)
|
|
126
|
+
- pgvector gem (~> 0.8)
|
|
127
|
+
- connection_pool gem (~> 2.4)
|
|
128
|
+
- tiktoken_ruby gem (~> 0.0.9)
|
|
129
|
+
- ruby-llm gem (~> 0.7.1)
|
|
130
|
+
|
|
131
|
+
### Database Requirements
|
|
132
|
+
- PostgreSQL 17+ with extensions:
|
|
133
|
+
- timescaledb (2.22.1+)
|
|
134
|
+
- vector (0.8.1+)
|
|
135
|
+
- pg_trgm (1.6+)
|
|
136
|
+
- Recommended: TimescaleDB Cloud or local TimescaleDB installation
|
|
137
|
+
|
|
138
|
+
### Environment Variables
|
|
139
|
+
- `HTM_DBURL`: PostgreSQL connection string (required)
|
|
140
|
+
- `OLLAMA_URL`: Ollama API endpoint (default: http://localhost:11434)
|
|
141
|
+
|
|
142
|
+
### Notes
|
|
143
|
+
- This is an initial release focused on core functionality
|
|
144
|
+
- Database schema is stable but may evolve in future versions
|
|
145
|
+
- Embedding models and providers are configurable
|
|
146
|
+
- Working memory size is user-configurable
|
|
147
|
+
- See ADRs for detailed architectural decisions and rationale
|
|
148
|
+
|
|
149
|
+
[Unreleased]: https://github.com/madbomber/htm/compare/v0.1.0...HEAD
|
|
150
|
+
[0.1.0]: https://github.com/madbomber/htm/releases/tag/v0.1.0
|