htm 0.0.1 → 0.0.10

Files changed (184)
  1. checksums.yaml +4 -4
  2. data/.aigcm_msg +1 -0
  3. data/.architecture/reviews/comprehensive-codebase-review.md +577 -0
  4. data/.claude/settings.local.json +92 -0
  5. data/.envrc +1 -0
  6. data/.irbrc +283 -80
  7. data/.tbls.yml +31 -0
  8. data/CHANGELOG.md +314 -16
  9. data/CLAUDE.md +603 -0
  10. data/README.md +76 -5
  11. data/Rakefile +5 -0
  12. data/SETUP.md +132 -101
  13. data/db/migrate/{20250101000001_enable_extensions.rb → 00001_enable_extensions.rb} +0 -1
  14. data/db/migrate/00002_create_robots.rb +11 -0
  15. data/db/migrate/00003_create_file_sources.rb +20 -0
  16. data/db/migrate/00004_create_nodes.rb +65 -0
  17. data/db/migrate/00005_create_tags.rb +13 -0
  18. data/db/migrate/00006_create_node_tags.rb +18 -0
  19. data/db/migrate/00007_create_robot_nodes.rb +26 -0
  20. data/db/migrate/00009_add_working_memory_to_robot_nodes.rb +12 -0
  21. data/db/schema.sql +390 -36
  22. data/docs/api/database.md +19 -232
  23. data/docs/api/embedding-service.md +1 -7
  24. data/docs/api/htm.md +305 -364
  25. data/docs/api/index.md +1 -7
  26. data/docs/api/long-term-memory.md +342 -590
  27. data/docs/api/yard/HTM/ActiveRecordConfig.md +23 -0
  28. data/docs/api/yard/HTM/AuthorizationError.md +11 -0
  29. data/docs/api/yard/HTM/CircuitBreaker.md +92 -0
  30. data/docs/api/yard/HTM/CircuitBreakerOpenError.md +34 -0
  31. data/docs/api/yard/HTM/Configuration.md +175 -0
  32. data/docs/api/yard/HTM/Database.md +99 -0
  33. data/docs/api/yard/HTM/DatabaseError.md +14 -0
  34. data/docs/api/yard/HTM/EmbeddingError.md +18 -0
  35. data/docs/api/yard/HTM/EmbeddingService.md +58 -0
  36. data/docs/api/yard/HTM/Error.md +11 -0
  37. data/docs/api/yard/HTM/JobAdapter.md +39 -0
  38. data/docs/api/yard/HTM/LongTermMemory.md +342 -0
  39. data/docs/api/yard/HTM/NotFoundError.md +17 -0
  40. data/docs/api/yard/HTM/Observability.md +107 -0
  41. data/docs/api/yard/HTM/QueryTimeoutError.md +19 -0
  42. data/docs/api/yard/HTM/Railtie.md +27 -0
  43. data/docs/api/yard/HTM/ResourceExhaustedError.md +13 -0
  44. data/docs/api/yard/HTM/TagError.md +18 -0
  45. data/docs/api/yard/HTM/TagService.md +67 -0
  46. data/docs/api/yard/HTM/Timeframe/Result.md +24 -0
  47. data/docs/api/yard/HTM/Timeframe.md +40 -0
  48. data/docs/api/yard/HTM/TimeframeExtractor/Result.md +24 -0
  49. data/docs/api/yard/HTM/TimeframeExtractor.md +45 -0
  50. data/docs/api/yard/HTM/ValidationError.md +20 -0
  51. data/docs/api/yard/HTM/WorkingMemory.md +131 -0
  52. data/docs/api/yard/HTM.md +80 -0
  53. data/docs/api/yard/index.csv +179 -0
  54. data/docs/api/yard-reference.md +51 -0
  55. data/docs/architecture/adrs/001-postgresql-timescaledb.md +1 -1
  56. data/docs/architecture/adrs/003-ollama-embeddings.md +1 -1
  57. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +2 -27
  58. data/docs/architecture/adrs/index.md +2 -13
  59. data/docs/architecture/hive-mind.md +165 -166
  60. data/docs/architecture/index.md +2 -2
  61. data/docs/architecture/overview.md +5 -171
  62. data/docs/architecture/two-tier-memory.md +1 -35
  63. data/docs/assets/images/adr-010-current-architecture.svg +37 -0
  64. data/docs/assets/images/adr-010-proposed-architecture.svg +48 -0
  65. data/docs/assets/images/adr-dependency-tree.svg +93 -0
  66. data/docs/assets/images/class-hierarchy.svg +55 -0
  67. data/docs/assets/images/exception-hierarchy.svg +45 -0
  68. data/docs/assets/images/htm-architecture-overview.svg +83 -0
  69. data/docs/assets/images/htm-complete-memory-flow.svg +160 -0
  70. data/docs/assets/images/htm-context-assembly-flow.svg +148 -0
  71. data/docs/assets/images/htm-eviction-process.svg +141 -0
  72. data/docs/assets/images/htm-memory-addition-flow.svg +138 -0
  73. data/docs/assets/images/htm-memory-recall-flow.svg +152 -0
  74. data/docs/assets/images/htm-node-states.svg +123 -0
  75. data/docs/assets/images/project-structure.svg +78 -0
  76. data/docs/assets/images/test-directory-structure.svg +38 -0
  77. data/{dbdoc → docs/database}/README.md +127 -125
  78. data/docs/database/public.file_sources.md +42 -0
  79. data/docs/database/public.file_sources.svg +211 -0
  80. data/{dbdoc → docs/database}/public.node_tags.md +7 -8
  81. data/docs/database/public.node_tags.svg +239 -0
  82. data/{dbdoc → docs/database}/public.nodes.md +22 -17
  83. data/docs/database/public.nodes.svg +271 -0
  84. data/docs/database/public.robot_nodes.md +46 -0
  85. data/docs/database/public.robot_nodes.svg +243 -0
  86. data/{dbdoc → docs/database}/public.robots.md +2 -3
  87. data/docs/database/public.robots.svg +161 -0
  88. data/docs/database/public.tags.svg +139 -0
  89. data/{dbdoc → docs/database}/schema.json +941 -630
  90. data/docs/database/schema.svg +282 -0
  91. data/docs/development/index.md +1 -29
  92. data/docs/development/schema.md +134 -309
  93. data/docs/development/testing.md +1 -9
  94. data/docs/getting-started/index.md +47 -0
  95. data/docs/{installation.md → getting-started/installation.md} +2 -2
  96. data/docs/{quick-start.md → getting-started/quick-start.md} +5 -5
  97. data/docs/guides/adding-memories.md +295 -643
  98. data/docs/guides/recalling-memories.md +36 -1
  99. data/docs/guides/search-strategies.md +85 -51
  100. data/docs/images/htm-er-diagram.svg +156 -0
  101. data/docs/index.md +16 -31
  102. data/docs/multi_framework_support.md +4 -4
  103. data/examples/README.md +280 -0
  104. data/examples/basic_usage.rb +18 -16
  105. data/examples/cli_app/htm_cli.rb +146 -8
  106. data/examples/cli_app/temp.log +93 -0
  107. data/examples/custom_llm_configuration.rb +1 -2
  108. data/examples/example_app/app.rb +11 -14
  109. data/examples/file_loader_usage.rb +177 -0
  110. data/examples/robot_groups/lib/robot_group.rb +419 -0
  111. data/examples/robot_groups/lib/working_memory_channel.rb +140 -0
  112. data/examples/robot_groups/multi_process.rb +286 -0
  113. data/examples/robot_groups/robot_worker.rb +136 -0
  114. data/examples/robot_groups/same_process.rb +229 -0
  115. data/examples/sinatra_app/Gemfile +1 -0
  116. data/examples/sinatra_app/Gemfile.lock +166 -0
  117. data/examples/sinatra_app/app.rb +219 -24
  118. data/examples/timeframe_demo.rb +276 -0
  119. data/lib/htm/active_record_config.rb +10 -3
  120. data/lib/htm/circuit_breaker.rb +202 -0
  121. data/lib/htm/configuration.rb +313 -80
  122. data/lib/htm/database.rb +67 -36
  123. data/lib/htm/embedding_service.rb +39 -2
  124. data/lib/htm/errors.rb +131 -11
  125. data/lib/htm/{sinatra.rb → integrations/sinatra.rb} +87 -12
  126. data/lib/htm/job_adapter.rb +10 -3
  127. data/lib/htm/jobs/generate_embedding_job.rb +5 -4
  128. data/lib/htm/jobs/generate_tags_job.rb +4 -0
  129. data/lib/htm/loaders/markdown_loader.rb +263 -0
  130. data/lib/htm/loaders/paragraph_chunker.rb +112 -0
  131. data/lib/htm/long_term_memory.rb +601 -321
  132. data/lib/htm/models/file_source.rb +99 -0
  133. data/lib/htm/models/node.rb +116 -12
  134. data/lib/htm/models/robot.rb +53 -4
  135. data/lib/htm/models/robot_node.rb +51 -0
  136. data/lib/htm/models/tag.rb +302 -0
  137. data/lib/htm/observability.rb +395 -0
  138. data/lib/htm/tag_service.rb +60 -3
  139. data/lib/htm/tasks.rb +29 -0
  140. data/lib/htm/timeframe.rb +194 -0
  141. data/lib/htm/timeframe_extractor.rb +307 -0
  142. data/lib/htm/version.rb +1 -1
  143. data/lib/htm/working_memory.rb +165 -70
  144. data/lib/htm.rb +352 -133
  145. data/lib/tasks/doc.rake +300 -0
  146. data/lib/tasks/files.rake +299 -0
  147. data/lib/tasks/htm.rake +188 -2
  148. data/lib/tasks/jobs.rake +10 -12
  149. data/lib/tasks/tags.rake +194 -0
  150. data/mkdocs.yml +91 -9
  151. data/notes/ARCHITECTURE_REVIEW.md +1167 -0
  152. data/notes/IMPLEMENTATION_SUMMARY.md +606 -0
  153. data/notes/MULTI_FRAMEWORK_IMPLEMENTATION.md +451 -0
  154. data/notes/next_steps.md +100 -0
  155. data/notes/plan.md +627 -0
  156. data/notes/tag_ontology_enhancement_ideas.md +222 -0
  157. data/notes/timescaledb_removal_summary.md +200 -0
  158. metadata +177 -37
  159. data/db/migrate/20250101000002_create_robots.rb +0 -14
  160. data/db/migrate/20250101000003_create_nodes.rb +0 -42
  161. data/db/migrate/20250101000005_create_tags.rb +0 -38
  162. data/db/migrate/20250101000007_add_node_vector_indexes.rb +0 -30
  163. data/dbdoc/public.node_tags.svg +0 -112
  164. data/dbdoc/public.nodes.svg +0 -118
  165. data/dbdoc/public.robots.svg +0 -90
  166. data/dbdoc/public.tags.svg +0 -60
  167. data/dbdoc/schema.svg +0 -154
  168. data/{dbdoc → docs/database}/public.node_stats.md +0 -0
  169. data/{dbdoc → docs/database}/public.node_stats.svg +0 -0
  170. data/{dbdoc → docs/database}/public.nodes_tags.md +0 -0
  171. data/{dbdoc → docs/database}/public.nodes_tags.svg +0 -0
  172. data/{dbdoc → docs/database}/public.ontology_structure.md +0 -0
  173. data/{dbdoc → docs/database}/public.ontology_structure.svg +0 -0
  174. data/{dbdoc → docs/database}/public.operations_log.md +0 -0
  175. data/{dbdoc → docs/database}/public.operations_log.svg +0 -0
  176. data/{dbdoc → docs/database}/public.relationships.md +0 -0
  177. data/{dbdoc → docs/database}/public.relationships.svg +0 -0
  178. data/{dbdoc → docs/database}/public.robot_activity.md +0 -0
  179. data/{dbdoc → docs/database}/public.robot_activity.svg +0 -0
  180. data/{dbdoc → docs/database}/public.schema_migrations.md +0 -0
  181. data/{dbdoc → docs/database}/public.schema_migrations.svg +0 -0
  182. data/{dbdoc → docs/database}/public.tags.md +3 -3
  183. data/{dbdoc → docs/database}/public.topic_relationships.md +0 -0
  184. data/{dbdoc → docs/database}/public.topic_relationships.svg +0 -0
data/CHANGELOG.md CHANGED
@@ -7,22 +7,312 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
7
7
 
8
8
  ## [Unreleased]
9
9
 
10
+ ## [0.0.10] - 2025-12-02
11
+
12
+ ### Added
13
+ - **Robot groups example with multi-process synchronization** - New `examples/robot_groups/` directory
14
+ - `RobotGroup` class for coordinating multiple robots with shared working memory
15
+ - `WorkingMemoryChannel` for real-time sync via PostgreSQL LISTEN/NOTIFY
16
+ - `same_process.rb` - Single-process demo of robot groups with failover
17
+ - `multi_process.rb` - Cross-process coordination with separate Ruby processes
18
+ - `robot_worker.rb` - Worker process for multi-process demo
19
+ - Demonstrates high-availability patterns: active/passive roles, warm standby, instant failover
20
+ - **Examples directory README** - Comprehensive documentation for all example programs
21
+ - Describes 9 example programs across standalone scripts and application examples
22
+ - Usage instructions, features, and directory structure
23
+ - Quick reference table for choosing the right example
24
+
25
+ ### Changed
26
+ - **YARD documentation updates** - Added documentation for metadata parameter support
27
+
28
+
29
+ ### Changed
30
+ - **Refactored working memory persistence to robot_nodes join table** - Simpler, more efficient schema
31
+ - Added `working_memory` boolean column to `robot_nodes` table (default: false)
32
+ - Partial index `idx_robot_nodes_working_memory` for efficient queries on active working memory
33
+ - Working memory state now tracked per robot-node relationship
34
+ - `remember()` and `recall()` now set `working_memory = true` when adding to working memory
35
+ - Eviction sets `working_memory = false` on evicted nodes
36
+ - **Updated `mark_evicted` signature** - Now requires `robot_id:` and `node_ids:` keyword arguments
37
+ - **Added space check to `remember()`** - Now evicts old memories before adding if working memory is full (was missing, only `recall()` had this check)
38
+
39
+ ### Added
40
+ - **`RobotNode.in_working_memory` scope** - Query nodes currently in a robot's working memory
41
+ - **`Robot#memory_summary[:in_working_memory]`** - Now uses efficient scope instead of separate table
42
+
43
+ ### Removed
44
+ - **`working_memories` table** - Replaced by `working_memory` boolean on `robot_nodes`
45
+ - **`WorkingMemoryEntry` model** - No longer needed
46
+ - **Migration `00008_create_working_memories.rb`** - Replaced by simpler approach
47
+
48
+ ### Migration
49
+ | Migration | Table | Description |
50
+ |-----------|-------|-------------|
51
+ | `00009_add_working_memory_to_robot_nodes.rb` | `robot_nodes` | Adds working_memory boolean column |
52
+
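The 0.0.10 entries above describe tracking working-memory membership as a boolean on the robot-node relationship, and evicting before adding when working memory is full. A minimal in-memory sketch of that flow follows. It assumes a token budget and oldest-first eviction; `SketchWorkingMemory`, `Entry`, and the eviction order are hypothetical names and choices for illustration, not the gem's implementation.

```ruby
# Hypothetical stand-in for a row in robot_nodes; not HTM's actual model.
Entry = Struct.new(:node_id, :tokens, :working_memory)

class SketchWorkingMemory
  attr_reader :entries

  def initialize(max_tokens:)
    @max_tokens = max_tokens
    @entries = [] # every robot-node relationship seen so far
  end

  # Entries still flagged as being in working memory.
  def active
    @entries.select(&:working_memory)
  end

  # Tokens held by the active entries.
  def token_count
    active.sum(&:tokens)
  end

  # Flag the oldest active entries as evicted until the new node fits,
  # then add the new node with working_memory = true.
  def remember(node_id, tokens)
    raise ArgumentError, "node larger than working memory" if tokens > @max_tokens

    while token_count + tokens > @max_tokens
      active.first.working_memory = false # assumed policy: oldest first
    end
    @entries << Entry.new(node_id, tokens, true)
  end
end

wm = SketchWorkingMemory.new(max_tokens: 100)
wm.remember(1, 60)
wm.remember(2, 30)
wm.remember(3, 50) # node 1 is flagged as evicted first so that node 3 fits
```

Evicted entries stay in the list with the flag cleared, which mirrors the idea of keeping the relationship row and only flipping the boolean.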
53
+ ## [0.0.9] - 2025-11-29
54
+
55
+ ### Changed
56
+ - **Consolidated database migrations** - Reduced 14 migrations to 8 clean migrations
57
+ - Each migration now handles exactly one table
58
+ - Removed incremental add/remove column migrations
59
+ - All indexes, constraints, and foreign keys included in table creation
60
+ - Migrations ordered by dependencies (extensions, then tables with FKs)
61
+ - Migration files now use simple numeric prefixes (00001-00008)
62
+
63
+ ### Migration Files
64
+ | Migration | Table | Description |
65
+ |-----------|-------|-------------|
66
+ | `00001_enable_extensions.rb` | (extensions) | Enables vector and pg_trgm |
67
+ | `00002_create_robots.rb` | `robots` | Robot registry |
68
+ | `00003_create_file_sources.rb` | `file_sources` | Source file metadata |
69
+ | `00004_create_nodes.rb` | `nodes` | Core memory storage |
70
+ | `00005_create_tags.rb` | `tags` | Tag names |
71
+ | `00006_create_node_tags.rb` | `node_tags` | Node-tag join table |
72
+ | `00007_create_robot_nodes.rb` | `robot_nodes` | Robot-node join table |
73
+ | `00008_create_working_memories.rb` | `working_memories` | Per-robot working memory |
74
+
75
+ ## [0.0.8] - 2025-11-29
76
+
77
+ ### Added
78
+ - **Circuit breaker pattern for LLM services** - Prevents cascading failures from external APIs
79
+ - New `HTM::CircuitBreaker` class with configurable thresholds
80
+ - Three states: `:closed` (normal), `:open` (failing fast), `:half_open` (testing recovery)
81
+ - Configurable `failure_threshold` (default: 5), `reset_timeout` (default: 60s)
82
+ - Thread-safe implementation with Mutex protection
83
+ - Integrated into `EmbeddingService` and `TagService`
84
+ - Background jobs handle `CircuitBreakerOpenError` gracefully
85
+ - **Thread-safe WorkingMemory** - All public methods now protected by Mutex
86
+ - `add`, `remove`, `has_space?`, `evict_to_make_space` synchronized
87
+ - `assemble_context`, `token_count`, `utilization_percentage`, `node_count` synchronized
88
+ - Internal helpers renamed to `*_unlocked` variants for safe internal use
89
+ - **Observability module** (`HTM::Observability`) for system monitoring
90
+ - `connection_pool_stats` - Pool health with warning/critical/exhausted status
91
+ - `circuit_breaker_stats` - Service circuit breaker states
92
+ - `query_timing_stats` - Query performance metrics (avg, min, max, p50, p95, p99)
93
+ - `service_timing_stats` - Embedding/tag generation timing
94
+ - `memory_stats` - Process memory usage
95
+ - `health_check` - Comprehensive system health verification
96
+ - `healthy?` - Quick boolean health check
97
+ - Configurable thresholds: 75% warning, 90% critical for connection pool
98
+ - **Comprehensive test suites**:
99
+ - `test/circuit_breaker_test.rb` - 13 tests for circuit breaker states and transitions
100
+ - `test/embedding_service_test.rb` - 19 tests for validation, generation, and circuit breaker
101
+ - `test/observability_test.rb` - 10 tests for observability module
102
+ - Updated `test/tag_service_test.rb` - Added 4 circuit breaker tests (total 33 tests)
103
+ - Expanded `test/integration_test.rb` - Added 13 new integration tests
104
+ - **Architecture review document** - Comprehensive multi-perspective codebase review
105
+ - Reviews from 8 specialist perspectives (Systems, Domain, Security, etc.)
106
+ - Located at `.architecture/reviews/comprehensive-codebase-review.md`
107
+ - **Enhanced YARD documentation** for all error classes with examples
108
+
109
+ ### Changed
110
+ - **CircuitBreakerOpenError** now extends `HTM::Error` (was `EmbeddingError`)
111
+ - Allows both EmbeddingService and TagService to use it
112
+ - **EmbeddingService and TagService** re-raise `CircuitBreakerOpenError` without wrapping
113
+ - Enables proper circuit breaker behavior in calling code
114
+ - **GenerateEmbeddingJob and GenerateTagsJob** log warnings for circuit breaker open state
115
+ - Graceful degradation when LLM services are unavailable
116
+
117
+ ### Removed
118
+ - **`embedding_dimension` column from nodes table** - Unused since embeddings are always padded to 2000
119
+ - **`embedding_model` column from nodes table** - Not needed for current use cases
120
+
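The circuit breaker described in the 0.0.8 entries above (three states, a failure threshold, a reset timeout, Mutex protection) can be sketched as follows. This is an illustrative sketch only: `SketchCircuitBreaker`, its `OpenError`, and the injectable `clock` parameter are hypothetical, and the real `HTM::CircuitBreaker` API may differ.

```ruby
# Hypothetical sketch of the closed / open / half-open pattern.
class SketchCircuitBreaker
  class OpenError < StandardError; end

  attr_reader :state

  # clock is injectable so the reset timeout can be exercised without sleeping.
  def initialize(failure_threshold: 5, reset_timeout: 60, clock: -> { Time.now.to_f })
    @failure_threshold = failure_threshold
    @reset_timeout = reset_timeout
    @clock = clock
    @mutex = Mutex.new
    @state = :closed
    @failures = 0
    @opened_at = nil
  end

  def call
    @mutex.synchronize do
      if @state == :open
        # Fail fast until the reset timeout has elapsed, then allow a probe.
        raise OpenError, "circuit open" if @clock.call - @opened_at < @reset_timeout
        @state = :half_open
      end
    end

    begin
      result = yield # the protected call runs outside the lock
    rescue StandardError
      record_failure
      raise
    end
    record_success
    result
  end

  private

  def record_success
    @mutex.synchronize do
      @failures = 0
      @state = :closed
    end
  end

  def record_failure
    @mutex.synchronize do
      @failures += 1
      # A failed probe reopens immediately; otherwise open at the threshold.
      if @state == :half_open || @failures >= @failure_threshold
        @state = :open
        @opened_at = @clock.call
      end
    end
  end
end

now = 0.0
breaker = SketchCircuitBreaker.new(failure_threshold: 2, reset_timeout: 60, clock: -> { now })
2.times do
  begin
    breaker.call { raise IOError, "service down" }
  rescue IOError
    # the original error still reaches the caller
  end
end
# The breaker is now open and rejects calls until 60 seconds have passed.
now = 61.0
breaker.call { :ok } # the probe succeeds, so the breaker closes again
```

The protected block runs outside the mutex so that a slow external call does not block other threads from checking the breaker state.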
121
+ ## [0.0.7] - 2025-11-28
122
+
123
+ ### Security
124
+ - **Fixed SQL injection vulnerabilities** in multiple locations:
125
+ - `LongTermMemory#build_timeframe_condition` - Now uses `connection.quote`
126
+ - `LongTermMemory#topic_relationships` - Now uses parameterized queries ($1, $2)
127
+ - `Node#similarity_to` - Added embedding validation and proper quoting
128
+ - `Database#run_activerecord_migrations` - Uses `sanitize_sql_array`
129
+ - **Removed hardcoded database credentials** from default configuration
130
+
131
+ ### Added
132
+ - **Thread-safe cache statistics** - Added `Mutex` synchronization for `@cache_stats`
133
+ - **Input validation for `remember` method** - Validates content size and tag format
134
+ - **URL format validation** - `Database.parse_connection_url` now validates scheme, host, and database name
135
+ - **Encoding fallback in MarkdownLoader** - UTF-8 with binary fallback for non-UTF-8 files
136
+ - **File size validation** - MarkdownLoader enforces 10 MB maximum file size
137
+ - **New test suites**:
138
+ - `test/configuration_test.rb` - Configuration validation tests
139
+ - `test/working_memory_test.rb` - Working memory operations and eviction tests
140
+ - `test/tag_service_test.rb` - Tag validation and extraction tests
141
+
142
+ ### Changed
143
+ - **Wrapped `LongTermMemory#add` in transaction** - Ensures atomicity for node creation
144
+ - **Updated documentation** - Removed outdated TimescaleDB references, added pgvector and async processing info
145
+ - **Defensive copies in WorkingMemory** - Uses `.dup` in `assemble_context` to prevent mutation
146
+ - **Embedding validation** - `Node#similarity_to` validates embedding is array of finite numbers
147
+
148
+ ### Fixed
149
+ - **N+1 query in `search_with_relevance`** - Added `batch_load_node_tags` helper
150
+ - **Bare rescue in `get_node_tags`** - Now catches specific `ActiveRecord::RecordNotFound`
151
+
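The 0.0.7 entries above mention validating that an embedding is an array of finite numbers before it is used in a query. A minimal sketch of such a check follows. The helper names `valid_embedding?` and `vector_literal`, the non-empty requirement, and the bracketed literal format are assumptions for illustration, not the gem's code.

```ruby
# Hypothetical helper: true only for a non-empty Array of finite Integers/Floats.
def valid_embedding?(embedding)
  embedding.is_a?(Array) &&
    !embedding.empty? &&
    embedding.all? { |v| (v.is_a?(Integer) || v.is_a?(Float)) && v.to_f.finite? }
end

# Build a bracketed vector literal from validated numbers only, so that
# arbitrary strings never reach the SQL text.
def vector_literal(embedding)
  unless valid_embedding?(embedding)
    raise ArgumentError, "embedding must be a non-empty array of finite numbers"
  end

  "[#{embedding.map { |v| Float(v) }.join(',')}]"
end

vector_literal([1, 2.5]) # => "[1.0,2.5]"
```

Validation of this kind complements, rather than replaces, parameterized queries and `connection.quote`.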
152
+ ## [0.0.6] - 2025-11-28
153
+
154
+ ### Added
155
+ - **Automatic timeframe extraction from queries** - No LLM required
156
+ - `TimeframeExtractor` service parses natural language time expressions
157
+ - Uses `chronic` gem for robust date/time parsing
158
+ - Supports standard expressions: "yesterday", "last week", "this morning", etc.
159
+ - `FEW` constant (3) maps "few", "a few", "several" to numeric values
160
+ - "recently"/"recent" without units defaults to 3 days
161
+ - Custom weekend handling: "weekend before last", "N weekends ago"
162
+ - Returns cleaned query with temporal expression removed
163
+ - **Flexible `timeframe` parameter in `recall` method** - Multiple input types:
164
+ - `nil` - No time filter (searches all time)
165
+ - `Date` / `DateTime` / `Time` - Entire day (00:00:00 to 23:59:59)
166
+ - `Range` - Exact time window
167
+ - `String` - Natural language parsing via Chronic
168
+ - `:auto` - Extract timeframe from query text automatically
169
+ - `Array<Range>` - Multiple time windows OR'd together
170
+ - **`HTM::Timeframe` normalizer class** - Converts all input types to Range or Array<Range>
171
+ - `Timeframe.normalize(input, query:)` handles all conversions
172
+ - `Timeframe.valid?(input)` validates timeframe input
173
+ - Returns `Result` struct with `:timeframe`, `:query`, `:extracted` when using `:auto`
174
+ - **Configurable week start** - `HTM.configuration.week_start`
175
+ - Options: `:sunday` (default) or `:monday`
176
+ - Passed to Chronic for "last week" and similar expressions
177
+ - **Timeframe demo** - `examples/timeframe_demo.rb` showcasing all input types
178
+ - Run with `rake timeframe_demo`
179
+ - **New rake task**: `rake timeframe_demo` to run the demo
180
+
181
+ ### Changed
182
+ - **`recall` method** now accepts all new timeframe input types
183
+ - **`validate_timeframe!`** uses `HTM::Timeframe.valid?` for validation
184
+ - **`LongTermMemory` search methods** support `nil` and `Array<Range>` timeframes
185
+ - `apply_timeframe_scope` handles OR conditions for multiple ranges
186
+
187
+ ### Dependencies
188
+ - Added `chronic` gem for natural language date parsing
189
+
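The timeframe input types listed in the 0.0.6 entries above can be illustrated with a small normalizer. The sketch below covers only `nil`, `Date` / `DateTime` / `Time`, `Range`, and `Array<Range>`; `String` and `:auto` are left out because they need a natural-language parser. `SketchTimeframe` and its methods are hypothetical and do not reproduce `HTM::Timeframe`.

```ruby
require "date"

# Hypothetical normalizer for a subset of the documented input types.
module SketchTimeframe
  module_function

  # Returns nil, a Range of Time, or an Array of Ranges.
  def normalize(input)
    case input
    when nil then nil # no time filter
    when Range then input # exact time window
    when Array
      raise ArgumentError, "expected an Array of Ranges" unless input.all?(Range)
      input
    when Time, Date # DateTime is a subclass of Date
      whole_day(input)
    else
      raise ArgumentError, "unsupported timeframe: #{input.class}"
    end
  end

  # Entire day, 00:00:00 to 23:59:59, in the local time zone.
  def whole_day(value)
    start = Time.new(value.year, value.month, value.day, 0, 0, 0)
    finish = Time.new(value.year, value.month, value.day, 23, 59, 59)
    start..finish
  end

  # True when the time falls in any window; multiple ranges are OR'd together.
  def match?(timeframe, time)
    return true if timeframe.nil?

    ranges = timeframe.is_a?(Array) ? timeframe : [timeframe]
    ranges.any? { |range| range.cover?(time) }
  end
end

day = SketchTimeframe.normalize(Date.new(2025, 11, 28))
SketchTimeframe.match?(day, Time.new(2025, 11, 28, 9, 30, 0)) # => true
```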
190
+ ## [0.0.5] - 2025-11-28
191
+
192
+ ### Added
193
+ - **Semantic tag matching for queries** - Query tags are now extracted using LLM
194
+ - Uses same `TagService.extract()` process as node storage
195
+ - 3-step search strategy: exact match → prefix match → component match
196
+ - Component matching searches right-to-left (most specific first)
197
+ - Replaces naive keyword substring matching
198
+ - **New rake tasks for database maintenance**:
199
+ - `htm:db:stats` - Show record counts for all HTM tables with breakdowns
200
+ - `htm:db:rebuild:embeddings` - Clear and regenerate all embeddings with progress bar
201
+ - `htm:tags:rebuild` - Clear and regenerate all tags with progress bar
202
+ - **Progress bar support** - Added `ruby-progressbar` gem for long-running rake tasks
203
+ - **CLI demo enhancements** (`examples/cli_app/htm_cli.rb`):
204
+ - Shows extracted tags, searched tags, and matched tags during recall
205
+ - Generates context-aware responses using RubyLLM with Ollama
206
+ - Stores LLM responses in long-term memory for learning
207
+
208
+ ### Changed
209
+ - **Improved tag extraction prompt** with CRITICAL CONSTRAINTS to prevent:
210
+ - Circular references (concept at both root and leaf)
211
+ - Self-containment (parent containing itself as descendant)
212
+ - Duplicate segments in hierarchy path
213
+ - Redundant duplicates across branches
214
+ - **TagService validation** now programmatically enforces:
215
+ - Self-containment detection (root == leaf)
216
+ - Duplicate segment detection in hierarchy path
217
+ - Maximum depth reduced from 5 to 4 levels
218
+ - **`find_query_matching_tags` method** completely rewritten:
219
+ - Now uses LLM-based semantic extraction instead of keyword matching
220
+ - Returns both extracted and matched tags via `include_extracted: true` option
221
+
222
+ ### Fixed
223
+ - Tag search no longer matches unrelated tags via substring (e.g., "man" matching "management")
224
+
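The 3-step search strategy and the tag validation rules from the 0.0.5 entries above can be sketched over plain strings. The `:` hierarchy separator, the function names, and the in-memory tag list are assumptions for illustration; the gem works against the database and extracts query tags with an LLM first.

```ruby
# Hypothetical matcher: exact match, then prefix match, then component match.
def find_matching_tags(query_tag, known_tags, separator: ":")
  # Step 1: exact match.
  exact = known_tags.select { |tag| tag == query_tag }
  return exact unless exact.empty?

  # Step 2: prefix match (known tags nested under the query tag).
  prefix = known_tags.select { |tag| tag.start_with?(query_tag + separator) }
  return prefix unless prefix.empty?

  # Step 3: component match, right to left so the most specific part is tried first.
  # Whole components are compared, so "man" does not match "management".
  query_tag.split(separator).reverse_each do |component|
    hits = known_tags.select { |tag| tag.split(separator).include?(component) }
    return hits unless hits.empty?
  end
  []
end

# Hypothetical validator: at most max_depth levels and no repeated segment,
# which also rules out a multi-level path whose root equals its leaf.
def valid_tag_path?(tag, separator: ":", max_depth: 4)
  segments = tag.split(separator)
  return false if segments.empty? || segments.length > max_depth

  segments.uniq.length == segments.length
end

known = ["database:postgresql", "database:postgresql:pgvector", "management:teams"]
find_matching_tags("storage:postgresql", known)
# => ["database:postgresql", "database:postgresql:pgvector"]
```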
225
+ ## [0.0.4] - 2025-11-28
226
+
10
227
  ### Added
228
+ - **Markdown file loader** - Load markdown files into long-term memory
229
+ - `FileSource` model to track loaded files with metadata and sync status
230
+ - `MarkdownLoader` with YAML frontmatter extraction
231
+ - `ParagraphChunker` for splitting content into semantic chunks
232
+ - DELTA_TIME tolerance (5 seconds) for reliable file change detection
233
+ - **New HTM API methods** for file operations:
234
+ - `htm.load_file(path, force: false)` - Load single markdown file
235
+ - `htm.load_directory(path, pattern: '**/*.md', force: false)` - Load directory
236
+ - `htm.nodes_from_file(path)` - Query nodes from a loaded file
237
+ - `htm.unload_file(path)` - Unload file and soft-delete its chunks
238
+ - **File loading rake tasks**:
239
+ - `htm:files:load[path]` - Load a markdown file
240
+ - `htm:files:load_dir[path,pattern]` - Load directory with glob pattern
241
+ - `htm:files:list` - List all loaded file sources
242
+ - `htm:files:info[path]` - Show details for a loaded file
243
+ - `htm:files:unload[path]` - Unload a file from memory
244
+ - `htm:files:sync` - Re-sync all loaded files (reload changed files)
245
+ - `htm:files:stats` - Show file loading statistics
246
+ - **FileSource model features**:
247
+ - `needs_sync?(mtime)` with DELTA_TIME tolerance for mtime comparison
248
+ - `frontmatter_tags`, `title`, `author` accessors for frontmatter data
249
+ - `soft_delete_chunks!` for bulk soft-delete of associated nodes
250
+ - `by_path` scope for path-based lookups
251
+ - **New example**: `examples/file_loader_usage.rb` demonstrating all file operations
252
+ - **New tests**: FileSource model tests (19) and MarkdownLoader tests (18)
253
+
254
+ ### Changed
255
+ - Node model now has optional `source_id` foreign key to FileSource
256
+ - Node model has `chunk_position` column for ordering chunks within a file
257
+
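The DELTA_TIME tolerance mentioned in the 0.0.4 entries above guards against spurious reloads when a stored mtime and the filesystem mtime differ by a small amount. A minimal sketch of such a comparison follows. The free-standing method, its `delta:` parameter, and the strict `>` comparison are assumptions for illustration; the gem's `FileSource#needs_sync?` may be implemented differently.

```ruby
# Hypothetical comparison: a file needs syncing when it was never loaded, or
# when its current mtime differs from the stored one by more than delta seconds.
def needs_sync?(stored_mtime, current_mtime, delta: 5)
  return true if stored_mtime.nil?

  (current_mtime - stored_mtime).abs > delta
end

loaded_at = Time.at(1_000)
needs_sync?(loaded_at, loaded_at + 3) # => false, within tolerance
needs_sync?(loaded_at, loaded_at + 6) # => true, file changed
```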
258
+ ## [0.0.2] - 2025-11-28
259
+
260
+ ### Added
261
+ - **Soft delete for memory nodes** - `forget()` now soft deletes by default (recoverable)
262
+ - `restore(node_id)` to recover soft-deleted nodes
263
+ - `purge_deleted(older_than:, confirm:)` to permanently remove old deleted nodes
264
+ - Permanent delete requires `soft: false, confirm: :confirmed`
265
+ - `deleted_at` column and scopes: `Node.deleted`, `Node.with_deleted`
266
+ - **Tag hierarchy visualization** - Export tag trees in multiple formats
267
+ - `Tag.all.tree` returns nested hash structure
268
+ - `Tag.all.tree_string` returns directory-style text tree
269
+ - `Tag.all.tree_mermaid` generates Mermaid flowchart syntax
270
+ - `Tag.all.tree_svg` generates SVG with dark theme, transparent background
271
+ - Rake tasks: `htm:tags:tree`, `htm:tags:mermaid`, `htm:tags:svg`, `htm:tags:export`
272
+ - All rake tasks accept optional prefix filter parameter
273
+ - **Per-robot working memory persistence** - Optional database-backed working memory
274
+ - New `working_memories` table for state restoration after process restart
275
+ - `WorkingMemoryEntry` model with `sync`, `load`, `clear` methods
276
+ - Enables cross-robot observability in hive mind architecture
277
+ - **Temporal filtering in recall** - Parse timeframe strings (seconds/minutes/hours)
278
+ - **Integration tests** for embeddings, vector search, and recall options
279
+ - **Multi-provider LLM support via RubyLLM** - HTM now supports 9 LLM providers:
280
+ - OpenAI (`text-embedding-3-small`, `gpt-4o-mini`)
281
+ - Anthropic (`claude-3-haiku-20240307`)
282
+ - Google Gemini (`text-embedding-004`)
283
+ - Azure OpenAI
284
+ - Ollama (default, local-first)
285
+ - HuggingFace Inference API
286
+ - OpenRouter
287
+ - AWS Bedrock
288
+ - DeepSeek
289
+ - Provider-specific configuration attributes for all supported providers
290
+ - `HTM::Configuration#configure_ruby_llm` method for provider credential setup
291
+ - `SUPPORTED_PROVIDERS` constant listing all available providers
292
+ - `DEFAULT_DIMENSIONS` hash with typical embedding dimensions per provider
11
293
  - Architecture documentation using ai-software-architect framework
12
- - Comprehensive ADRs (Architecture Decision Records):
13
- - ADR-001: PostgreSQL with TimescaleDB for storage
14
- - ADR-002: Two-tier memory architecture (working + long-term)
15
- - ADR-003: Ollama as default embedding provider
16
- - ADR-004: Multi-robot shared memory (hive mind)
17
- - ADR-005: RAG-based retrieval with hybrid search
18
- - ADR-006: Context assembly strategies (recent, important, balanced)
19
- - ADR-007: Working memory eviction strategy (hybrid importance + recency)
20
- - ADR-008: Robot identification system (UUID + name)
21
- - ADR-009: Never-forget philosophy with explicit deletion
22
- - Architecture review team with 8 specialist perspectives
23
- - Had the robot convert my notss and system analysis documentation into Architectural Decision Records (ADR)
24
-
25
- ## [0.1.0] - 2025-10-25
294
+ - Comprehensive ADRs (Architecture Decision Records) for all major design decisions
295
+
296
+ ### Changed
297
+ - **Embedding generator now uses `RubyLLM.embed()`** instead of raw HTTP calls to Ollama
298
+ - **Tag extractor now uses `RubyLLM.chat()`** instead of raw HTTP calls to Ollama
299
+ - **Sinatra integration moved** to `lib/htm/integrations/sinatra.rb` (require path changed)
300
+ - **Hybrid search includes nodes without embeddings** using 0.5 default similarity
301
+ - Configuration validation now checks provider is in `SUPPORTED_PROVIDERS`
302
+ - MkDocs documentation reorganized with tbls schema docs integration
303
+ - Updated CLAUDE.md with multi-provider documentation and examples
304
+
305
+ ### Removed
306
+ - Unused `nodes.in_working_memory` column (was never set to true)
307
+ - Unused `robots.metadata` column (never referenced in codebase)
308
+ - One-off test scripts replaced with proper Minitest integration tests
309
+
310
+ ### Fixed
311
+ - Sinatra session secret error (Rack requires 64+ bytes)
312
+ - Thread-safe database connection in Sinatra integration
313
+ - tbls database documentation rake task configuration
314
+
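The soft-delete behavior described in the 0.0.2 entries above (recoverable by default, permanent removal only with `soft: false, confirm: :confirmed`) can be sketched with an in-memory store. `SketchMemoryStore` and its method bodies are hypothetical; only the keyword arguments and the `deleted_at` idea come from the changelog.

```ruby
# Hypothetical in-memory store illustrating soft delete, restore, and purge.
class SketchMemoryStore
  Node = Struct.new(:id, :content, :deleted_at)

  def initialize
    @nodes = {}
  end

  def add(id, content)
    @nodes[id] = Node.new(id, content, nil)
  end

  # Soft delete by default; permanent removal needs soft: false and confirm: :confirmed.
  def forget(id, soft: true, confirm: false)
    node = @nodes.fetch(id)
    if soft
      node.deleted_at = Time.now
    else
      raise ArgumentError, "permanent delete requires confirm: :confirmed" unless confirm == :confirmed
      @nodes.delete(id)
    end
    true
  end

  # Recover a soft-deleted node.
  def restore(id)
    @nodes.fetch(id).deleted_at = nil
    true
  end

  # Nodes that have not been soft-deleted.
  def visible
    @nodes.values.select { |node| node.deleted_at.nil? }
  end

  # All nodes, including soft-deleted ones.
  def with_deleted
    @nodes.values
  end
end

store = SketchMemoryStore.new
store.add(1, "PostgreSQL supports partial indexes")
store.forget(1)  # soft delete: hidden but recoverable
store.restore(1) # visible again
```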
315
+ ## [0.0.1] - 2025-10-25
26
316
 
27
317
  ### Added
28
318
  - Initial release of HTM (Hierarchical Temporary Memory)
@@ -146,5 +436,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
146
436
  - Working memory size is user-configurable
147
437
  - See ADRs for detailed architectural decisions and rationale
148
438
 
149
- [Unreleased]: https://github.com/madbomber/htm/compare/v0.1.0...HEAD
150
- [0.1.0]: https://github.com/madbomber/htm/releases/tag/v0.1.0
439
+ [Unreleased]: https://github.com/madbomber/htm/compare/v0.0.10...HEAD
440
+ [0.0.10]: https://github.com/madbomber/htm/compare/v0.0.9...v0.0.10
441
+ [0.0.9]: https://github.com/madbomber/htm/compare/v0.0.8...v0.0.9
442
+ [0.0.8]: https://github.com/madbomber/htm/compare/v0.0.7...v0.0.8
443
+ [0.0.7]: https://github.com/madbomber/htm/compare/v0.0.6...v0.0.7
444
+ [0.0.6]: https://github.com/madbomber/htm/compare/v0.0.5...v0.0.6
445
+ [0.0.5]: https://github.com/madbomber/htm/compare/v0.0.4...v0.0.5
446
+ [0.0.4]: https://github.com/madbomber/htm/compare/v0.0.2...v0.0.4
447
+ [0.0.2]: https://github.com/madbomber/htm/compare/v0.0.1...v0.0.2
448
+ [0.0.1]: https://github.com/madbomber/htm/releases/tag/v0.0.1