htm 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (155)
  1. checksums.yaml +7 -0
  2. data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md +227 -0
  3. data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md +322 -0
  4. data/.architecture/decisions/adrs/003-ollama-default-embedding-provider.md +339 -0
  5. data/.architecture/decisions/adrs/004-multi-robot-shared-memory-hive-mind.md +374 -0
  6. data/.architecture/decisions/adrs/005-rag-based-retrieval-with-hybrid-search.md +443 -0
  7. data/.architecture/decisions/adrs/006-context-assembly-strategies.md +444 -0
  8. data/.architecture/decisions/adrs/007-working-memory-eviction-strategy.md +461 -0
  9. data/.architecture/decisions/adrs/008-robot-identification-system.md +550 -0
  10. data/.architecture/decisions/adrs/009-never-forget-explicit-deletion-only.md +570 -0
  11. data/.architecture/decisions/adrs/010-redis-working-memory-rejected.md +323 -0
  12. data/.architecture/decisions/adrs/011-database-side-embedding-generation-with-pgai.md +585 -0
  13. data/.architecture/decisions/adrs/012-llm-driven-ontology-topic-extraction.md +583 -0
  14. data/.architecture/decisions/adrs/013-activerecord-orm-and-many-to-many-tagging.md +299 -0
  15. data/.architecture/decisions/adrs/014-client-side-embedding-generation-workflow.md +569 -0
  16. data/.architecture/decisions/adrs/015-hierarchical-tag-ontology-and-llm-extraction.md +701 -0
  17. data/.architecture/decisions/adrs/016-async-embedding-and-tag-generation.md +694 -0
  18. data/.architecture/members.yml +144 -0
  19. data/.architecture/reviews/2025-10-29-llm-configuration-and-async-processing-review.md +1137 -0
  20. data/.architecture/reviews/initial-system-analysis.md +330 -0
  21. data/.envrc +32 -0
  22. data/.irbrc +145 -0
  23. data/CHANGELOG.md +150 -0
  24. data/COMMITS.md +196 -0
  25. data/LICENSE +21 -0
  26. data/README.md +1347 -0
  27. data/Rakefile +51 -0
  28. data/SETUP.md +268 -0
  29. data/config/database.yml +67 -0
  30. data/db/migrate/20250101000001_enable_extensions.rb +14 -0
  31. data/db/migrate/20250101000002_create_robots.rb +14 -0
  32. data/db/migrate/20250101000003_create_nodes.rb +42 -0
  33. data/db/migrate/20250101000005_create_tags.rb +38 -0
  34. data/db/migrate/20250101000007_add_node_vector_indexes.rb +30 -0
  35. data/db/schema.sql +473 -0
  36. data/db/seed_data/README.md +100 -0
  37. data/db/seed_data/presidents.md +136 -0
  38. data/db/seed_data/states.md +151 -0
  39. data/db/seeds.rb +208 -0
  40. data/dbdoc/README.md +173 -0
  41. data/dbdoc/public.node_stats.md +48 -0
  42. data/dbdoc/public.node_stats.svg +41 -0
  43. data/dbdoc/public.node_tags.md +40 -0
  44. data/dbdoc/public.node_tags.svg +112 -0
  45. data/dbdoc/public.nodes.md +54 -0
  46. data/dbdoc/public.nodes.svg +118 -0
  47. data/dbdoc/public.nodes_tags.md +39 -0
  48. data/dbdoc/public.nodes_tags.svg +112 -0
  49. data/dbdoc/public.ontology_structure.md +48 -0
  50. data/dbdoc/public.ontology_structure.svg +38 -0
  51. data/dbdoc/public.operations_log.md +42 -0
  52. data/dbdoc/public.operations_log.svg +130 -0
  53. data/dbdoc/public.relationships.md +39 -0
  54. data/dbdoc/public.relationships.svg +41 -0
  55. data/dbdoc/public.robot_activity.md +46 -0
  56. data/dbdoc/public.robot_activity.svg +35 -0
  57. data/dbdoc/public.robots.md +35 -0
  58. data/dbdoc/public.robots.svg +90 -0
  59. data/dbdoc/public.schema_migrations.md +29 -0
  60. data/dbdoc/public.schema_migrations.svg +26 -0
  61. data/dbdoc/public.tags.md +35 -0
  62. data/dbdoc/public.tags.svg +60 -0
  63. data/dbdoc/public.topic_relationships.md +45 -0
  64. data/dbdoc/public.topic_relationships.svg +32 -0
  65. data/dbdoc/schema.json +1437 -0
  66. data/dbdoc/schema.svg +154 -0
  67. data/docs/api/database.md +806 -0
  68. data/docs/api/embedding-service.md +532 -0
  69. data/docs/api/htm.md +797 -0
  70. data/docs/api/index.md +259 -0
  71. data/docs/api/long-term-memory.md +1096 -0
  72. data/docs/api/working-memory.md +665 -0
  73. data/docs/architecture/adrs/001-postgresql-timescaledb.md +314 -0
  74. data/docs/architecture/adrs/002-two-tier-memory.md +411 -0
  75. data/docs/architecture/adrs/003-ollama-embeddings.md +421 -0
  76. data/docs/architecture/adrs/004-hive-mind.md +437 -0
  77. data/docs/architecture/adrs/005-rag-retrieval.md +531 -0
  78. data/docs/architecture/adrs/006-context-assembly.md +496 -0
  79. data/docs/architecture/adrs/007-eviction-strategy.md +645 -0
  80. data/docs/architecture/adrs/008-robot-identification.md +625 -0
  81. data/docs/architecture/adrs/009-never-forget.md +648 -0
  82. data/docs/architecture/adrs/010-redis-working-memory-rejected.md +323 -0
  83. data/docs/architecture/adrs/011-pgai-integration.md +494 -0
  84. data/docs/architecture/adrs/index.md +215 -0
  85. data/docs/architecture/hive-mind.md +736 -0
  86. data/docs/architecture/index.md +351 -0
  87. data/docs/architecture/overview.md +538 -0
  88. data/docs/architecture/two-tier-memory.md +873 -0
  89. data/docs/assets/css/custom.css +83 -0
  90. data/docs/assets/images/htm-core-components.svg +63 -0
  91. data/docs/assets/images/htm-database-schema.svg +93 -0
  92. data/docs/assets/images/htm-hive-mind-architecture.svg +125 -0
  93. data/docs/assets/images/htm-importance-scoring-framework.svg +83 -0
  94. data/docs/assets/images/htm-layered-architecture.svg +71 -0
  95. data/docs/assets/images/htm-long-term-memory-architecture.svg +115 -0
  96. data/docs/assets/images/htm-working-memory-architecture.svg +120 -0
  97. data/docs/assets/images/htm.jpg +0 -0
  98. data/docs/assets/images/htm_demo.gif +0 -0
  99. data/docs/assets/js/mathjax.js +18 -0
  100. data/docs/assets/videos/htm_video.mp4 +0 -0
  101. data/docs/database_rake_tasks.md +322 -0
  102. data/docs/development/contributing.md +787 -0
  103. data/docs/development/index.md +336 -0
  104. data/docs/development/schema.md +596 -0
  105. data/docs/development/setup.md +719 -0
  106. data/docs/development/testing.md +819 -0
  107. data/docs/guides/adding-memories.md +824 -0
  108. data/docs/guides/context-assembly.md +1009 -0
  109. data/docs/guides/getting-started.md +577 -0
  110. data/docs/guides/index.md +118 -0
  111. data/docs/guides/long-term-memory.md +941 -0
  112. data/docs/guides/multi-robot.md +866 -0
  113. data/docs/guides/recalling-memories.md +927 -0
  114. data/docs/guides/search-strategies.md +953 -0
  115. data/docs/guides/working-memory.md +717 -0
  116. data/docs/index.md +214 -0
  117. data/docs/installation.md +477 -0
  118. data/docs/multi_framework_support.md +519 -0
  119. data/docs/quick-start.md +655 -0
  120. data/docs/setup_local_database.md +302 -0
  121. data/docs/using_rake_tasks_in_your_app.md +383 -0
  122. data/examples/basic_usage.rb +93 -0
  123. data/examples/cli_app/README.md +317 -0
  124. data/examples/cli_app/htm_cli.rb +270 -0
  125. data/examples/custom_llm_configuration.rb +183 -0
  126. data/examples/example_app/Rakefile +71 -0
  127. data/examples/example_app/app.rb +206 -0
  128. data/examples/sinatra_app/Gemfile +21 -0
  129. data/examples/sinatra_app/app.rb +335 -0
  130. data/lib/htm/active_record_config.rb +113 -0
  131. data/lib/htm/configuration.rb +342 -0
  132. data/lib/htm/database.rb +594 -0
  133. data/lib/htm/embedding_service.rb +115 -0
  134. data/lib/htm/errors.rb +34 -0
  135. data/lib/htm/job_adapter.rb +154 -0
  136. data/lib/htm/jobs/generate_embedding_job.rb +65 -0
  137. data/lib/htm/jobs/generate_tags_job.rb +82 -0
  138. data/lib/htm/long_term_memory.rb +965 -0
  139. data/lib/htm/models/node.rb +109 -0
  140. data/lib/htm/models/node_tag.rb +33 -0
  141. data/lib/htm/models/robot.rb +52 -0
  142. data/lib/htm/models/tag.rb +76 -0
  143. data/lib/htm/railtie.rb +76 -0
  144. data/lib/htm/sinatra.rb +157 -0
  145. data/lib/htm/tag_service.rb +135 -0
  146. data/lib/htm/tasks.rb +38 -0
  147. data/lib/htm/version.rb +5 -0
  148. data/lib/htm/working_memory.rb +182 -0
  149. data/lib/htm.rb +400 -0
  150. data/lib/tasks/db.rake +19 -0
  151. data/lib/tasks/htm.rake +147 -0
  152. data/lib/tasks/jobs.rake +312 -0
  153. data/mkdocs.yml +190 -0
  154. data/scripts/install_local_database.sh +309 -0
  155. metadata +341 -0
checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 0a25e07ce28cc74ddb8b2fca0a9b316108b97362c2281f2e77b7bec5124da0f7
+   data.tar.gz: 9978ce3bd0c1b3c589436a5c8637bf8043bac3b0dfb542657851fa7cb96e1b49
+ SHA512:
+   metadata.gz: bbd241d5bda941b6d3df5b35bb13e8dbf630af1be02debf814b4396fba5e76e67b6a22756690d3f510d4a0ad441710d61b3d8d20018db8ba1347220818ecefcf
+   data.tar.gz: 57e49d9629c934dd875e3d65a9c4b476bd96bc04dbc168455d3652c8a8b0aededaa2cd0163e377feadf28507550f6adbd0d13b5a9870b9c54b95437e7fd4e8bb
data/.architecture/decisions/adrs/001-use-postgresql-timescaledb-storage.md ADDED
@@ -0,0 +1,227 @@
+ # ADR-001: Use PostgreSQL with TimescaleDB for Storage
+
+ **Status**: ~~Accepted~~ **SUPERSEDED** (2025-10-28)
+
+ **Date**: 2025-10-25
+
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
+
+ ---
+
+ ## ⚠️ DECISION UPDATE (2025-10-28)
+
+ **TimescaleDB extension has been removed from HTM.**
+
+ **Reason**: After initial struggles with database configuration and practical usage, the decision was made to drop the TimescaleDB extension as it was not providing sufficient value for the current proof-of-concept applications.
+
+ **Key findings**:
+ - No hypertables were actually created in the implementation (the `setup_hypertables` method was essentially a no-op)
+ - Time-range queries use standard PostgreSQL B-tree indexes on timestamp columns, not TimescaleDB-specific optimizations
+ - No compression policies were configured or used
+ - The extension added deployment complexity without delivering measurable benefits
+
+ **Current Implementation**: HTM now uses **standard PostgreSQL 17** with only the following extensions:
+ - `vector` (pgvector) - for embedding similarity search with up to 2000 dimensions
+ - `pg_trgm` - for fuzzy text matching
+
+ **Migration to ActiveRecord** (2025-10-29): The database layer has been migrated to use ActiveRecord ORM with proper models and migrations. See ADR-013 for details.
+
+ No functionality was lost in the removal, as TimescaleDB features were never actually utilized despite being documented.
+
+ ---
+
+ ## Context
+
+ HTM requires a persistent storage solution that can handle:
+
+ - Time-series data with efficient time-range queries
+ - Vector embeddings for semantic search
+ - Full-text search capabilities
+ - ACID compliance for data integrity
+ - Scalability for growing memory databases
+ - Production-grade reliability
+
+ Alternative options considered:
+
+ 1. **Pure PostgreSQL**: Solid relational database, pgvector support
+ 2. **TimescaleDB**: PostgreSQL extension optimized for time-series
+ 3. **Elasticsearch**: Strong full-text search, vector support added
+ 4. **Pinecone/Weaviate**: Specialized vector databases
+ 5. **SQLite + extensions**: Simple, embedded option
+
+ ## Decision
+
+ We will use **PostgreSQL with TimescaleDB** as the primary storage backend for HTM.
+
+ ## Rationale
+
+ ### Why PostgreSQL?
+
+ - **Production-proven**: Decades of reliability in demanding environments
+ - **ACID compliance**: Guarantees data integrity for memory operations
+ - **Rich ecosystem**: Extensive tooling, monitoring, and support
+ - **pgvector extension**: Native vector similarity search with HNSW indexing
+ - **Full-text search**: Built-in tsvector with GIN indexing
+ - **pg_trgm extension**: Trigram-based fuzzy matching
+ - **Strong typing**: Schema enforcement prevents data corruption
+ - **Wide adoption**: Well-understood by developers
+
+ ### Why TimescaleDB?
+
+ - **Hypertable partitioning**: Automatic chunk-based time partitioning
+ - **Compression policies**: Automatic compression of old data (70-90% reduction)
+ - **Time-range optimization**: Fast queries on temporal data
+ - **PostgreSQL compatibility**: Drop-in extension, not a fork
+ - **Continuous aggregates**: Pre-computed summaries for analytics
+ - **Retention policies**: Automatic data lifecycle management
+ - **Cloud offering**: Managed service available (TimescaleDB Cloud)
+
+ ### Why Not Alternatives?
+
+ **Elasticsearch**:
+
+ - ❌ Operational complexity (JVM tuning, cluster management)
+ - ❌ Higher resource usage
+ - ❌ Vector support more recent, less mature
+ - ✅ Superior full-text search (not critical for our use case)
+
+ **Specialized Vector DBs** (Pinecone, Weaviate, Qdrant):
+
+ - ❌ Additional service dependency
+ - ❌ Limited relational capabilities
+ - ❌ Vendor lock-in concerns
+ - ❌ Cost considerations for managed services
+ - ✅ Excellent vector search performance
+ - ✅ Purpose-built for embeddings
+
+ **SQLite**:
+
+ - ❌ Limited concurrency (write locks)
+ - ❌ No native vector search (extensions experimental)
+ - ❌ Not suitable for multi-robot scenarios
+ - ✅ Simple deployment
+ - ✅ Zero configuration
+
+ ## Implementation Details
+
+ ### Schema Design
+
+ - **nodes table**: TimescaleDB hypertable partitioned by `created_at`
+ - **operations_log**: TimescaleDB hypertable for audit trail
+ - **Vector indexing**: HNSW algorithm for approximate nearest neighbor
+ - **Full-text indexing**: GIN indexes on tsvector columns
+ - **Compression**: Automatic after 30 days, segmented by robot_id and type
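The Schema Design list above names HNSW and GIN indexes without showing how they are declared. As a rough illustration only (the gem's actual migrations under `data/db/migrate/` may differ, and the table and column names `nodes`, `embedding`, and `search_vector` are assumptions), such indexes could be created from an ActiveRecord migration like this:

```ruby
# Hypothetical migration sketch -- names are assumed, not copied from the gem.
class AddNodeIndexes < ActiveRecord::Migration[7.1]
  def up
    # Approximate nearest-neighbor search over pgvector embeddings (cosine distance)
    execute <<~SQL
      CREATE INDEX idx_nodes_embedding
        ON nodes USING hnsw (embedding vector_cosine_ops);
    SQL

    # Full-text search over a precomputed tsvector column
    execute <<~SQL
      CREATE INDEX idx_nodes_search_vector
        ON nodes USING gin (search_vector);
    SQL
  end

  def down
    execute "DROP INDEX IF EXISTS idx_nodes_embedding;"
    execute "DROP INDEX IF EXISTS idx_nodes_search_vector;"
  end
end
```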
+
+ ### Connection Configuration
+ ```ruby
+ # Via environment variable (preferred)
+ ENV['HTM_DBURL'] = "postgresql://user:pass@host:port/dbname?sslmode=require"
+
+ # Parsed into connection hash
+ {
+   host: 'host',
+   port: 5432,
+   dbname: 'tsdb',
+   user: 'tsdbadmin',
+   password: 'secret',
+   sslmode: 'require'
+ }
+ ```
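The block above shows the `HTM_DBURL` environment variable and the resulting connection hash, but not the parsing step in between. A minimal sketch of one way to derive the hash using Ruby's standard `URI` and `CGI` libraries; the helper name `parse_htm_dburl` is hypothetical and not part of the gem:

```ruby
require 'uri'
require 'cgi'

# Hypothetical helper -- only the HTM_DBURL variable name comes from the ADR above.
def parse_htm_dburl(url = ENV.fetch('HTM_DBURL'))
  uri    = URI.parse(url)
  params = CGI.parse(uri.query.to_s)

  {
    host:     uri.host,
    port:     uri.port || 5432,
    dbname:   uri.path.to_s.delete_prefix('/'),
    user:     uri.user,
    password: uri.password,
    sslmode:  params.fetch('sslmode', ['prefer']).first
  }
end

# parse_htm_dburl("postgresql://tsdbadmin:secret@host:5432/tsdb?sslmode=require")
# # => { host: "host", port: 5432, dbname: "tsdb", user: "tsdbadmin",
# #      password: "secret", sslmode: "require" }
```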
+
+ ### Key Features Enabled
+
+ - Vector similarity search with `<=>` operator (cosine distance)
+ - Full-text search with `to_tsvector()` and `@@` operator
+ - Trigram fuzzy matching with `pg_trgm`
+ - Time-range queries optimized by chunk exclusion
+ - Automatic compression of old chunks
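For reference, a single query exercising the operators listed above (`<=>`, `to_tsvector()`/`@@`, and `pg_trgm`'s `similarity`). The `nodes`, `content`, and `embedding` names are assumptions for illustration, not the gem's actual schema:

```ruby
# Illustrative hybrid-search SQL, held as a Ruby heredoc constant.
# $1 is a pgvector query embedding, $2 is the text query.
HYBRID_SEARCH_SQL = <<~SQL
  SELECT id, content,
         embedding <=> $1                         AS cosine_distance,    -- pgvector
         ts_rank(to_tsvector('english', content),
                 plainto_tsquery('english', $2))  AS text_rank,          -- full-text
         similarity(content, $2)                  AS trigram_similarity  -- pg_trgm
  FROM nodes
  WHERE to_tsvector('english', content) @@ plainto_tsquery('english', $2)
     OR (embedding <=> $1) < 0.5
  ORDER BY cosine_distance
  LIMIT 20
SQL
```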
+
+ ## Consequences
+
+ ### Positive
+
+ ✅ **Production-ready**: Battle-tested database with proven reliability
+ ✅ **Multi-modal search**: Vector, full-text, and hybrid strategies
+ ✅ **Time-series optimization**: Efficient temporal queries
+ ✅ **Cost-effective storage**: Compression reduces cloud storage costs
+ ✅ **Familiar tooling**: Standard PostgreSQL tools and practices apply
+ ✅ **Flexible querying**: Full SQL power for complex operations
+ ✅ **ACID guarantees**: Data integrity for critical memory operations
+
+ ### Negative
+
+ ❌ **Operational complexity**: Requires database management (mitigated by managed service)
+ ❌ **Scaling limits**: Vertical scaling ceiling (mitigated by partitioning)
+ ❌ **Connection overhead**: PostgreSQL connections are relatively heavy
+ ❌ **Vector search performance**: Slower than specialized vector DBs at massive scale
+
+ ### Neutral
+
+ ➡️ **Learning curve**: Developers need PostgreSQL + TimescaleDB knowledge
+ ➡️ **Cloud dependency**: Currently using TimescaleDB Cloud (could self-host)
+ ➡️ **Extension management**: Requires extensions (timescaledb, pgvector, pg_trgm)
+
+ ## Risks and Mitigations
+
+ ### Risk: Extension Availability
+
+ - **Risk**: Extensions not available in all PostgreSQL environments
+ - **Likelihood**: Low (extensions widely available)
+ - **Impact**: High (breaks core functionality)
+ - **Mitigation**: Document requirements clearly, verify in setup process
+
+ ### Risk: Connection Exhaustion
+
+ - **Risk**: PostgreSQL connections limited (default ~100)
+ - **Likelihood**: Medium (with many robots)
+ - **Impact**: Medium (service degradation)
+ - **Mitigation**: Implement connection pooling (ConnectionPool gem)
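A minimal sketch of the pooling mitigation mentioned above, using the `connection_pool` and `pg` gems; the pool size and timeout are illustrative values, not HTM defaults:

```ruby
require 'connection_pool'
require 'pg'

# Bounded pool of PostgreSQL connections shared across threads.
DB_POOL = ConnectionPool.new(size: 10, timeout: 5) do
  PG.connect(ENV.fetch('HTM_DBURL'))
end

# Check a connection out, run a query, and return it to the pool.
DB_POOL.with do |conn|
  conn.exec('SELECT count(*) FROM nodes') # table name assumed for illustration
end
```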
+
+ ### Risk: Storage Costs
+
+ - **Risk**: Vector data storage can be expensive at scale
+ - **Likelihood**: Medium (depends on usage)
+ - **Impact**: Medium (operational cost)
+ - **Mitigation**: Compression policies, retention policies, archival strategies
+
+ ### Risk: Query Performance at Scale
+
+ - **Risk**: Complex hybrid searches may slow down with millions of nodes
+ - **Likelihood**: Low (with proper indexing)
+ - **Impact**: Medium (user experience)
+ - **Mitigation**: Query optimization, read replicas, caching layer
+
+ ## Alternatives Considered
+
+ | Solution | Pros | Cons | Decision |
+ |----------|------|------|----------|
+ | Pure PostgreSQL | Simple, reliable, pgvector | No time-series optimization | ❌ Rejected |
+ | PostgreSQL + TimescaleDB | Best of both worlds | Slight complexity increase | ✅ **Accepted** |
+ | Elasticsearch | Excellent full-text search | High resource usage, complexity | ❌ Rejected |
+ | Pinecone | Purpose-built vectors | Vendor lock-in, cost, limited relational | ❌ Rejected |
+ | SQLite | Simple, embedded | Limited concurrency, no vectors | ❌ Rejected |
+
+ ## Future Considerations
+
+ - **Read replicas**: For query scaling when needed
+ - **Partitioning strategies**: By robot_id for tenant isolation
+ - **Caching layer**: Redis for hot nodes
+ - **Archive tier**: S3/Glacier for very old memories
+ - **Multi-region**: For global deployment
+
+ ## References
+
+ - [TimescaleDB Documentation](https://docs.timescale.com/)
+ - [pgvector Documentation](https://github.com/pgvector/pgvector)
+ - [PostgreSQL Full-Text Search](https://www.postgresql.org/docs/current/textsearch.html)
+ - [HTM Planning Document](../../htm_teamwork.md)
+
+ ## Review Notes
+
+ **Systems Architect**: ✅ Solid choice for time-series + vector workload. Consider read replicas for scaling.
+
+ **Database Architect**: ✅ Excellent indexing strategy. Monitor query performance as data grows.
+
+ **Performance Specialist**: ✅ TimescaleDB compression will help with costs. Add connection pooling soon.
+
+ **Maintainability Expert**: ✅ PostgreSQL tooling is mature and well-documented. Good choice for long-term maintenance.
data/.architecture/decisions/adrs/002-two-tier-memory-architecture.md ADDED
@@ -0,0 +1,322 @@
+ # ADR-002: Two-Tier Memory Architecture
+
+ **Status**: Accepted
+
+ **Date**: 2025-10-25
+
+ **Decision Makers**: Dewayne VanHoozer, Claude (Anthropic)
+
+ ---
+
+ ## ⚠️ UPDATE (2025-10-28)
+
+ **References to TimescaleDB in this ADR are now historical.**
+
+ After initial struggles with database configuration, the decision was made to drop the TimescaleDB extension as it was not providing sufficient value for the current proof-of-concept applications. The two-tier architecture remains unchanged, but long-term memory now uses **standard PostgreSQL** instead of PostgreSQL + TimescaleDB.
+
+ See [ADR-001](001-use-postgresql-timescaledb-storage.md) for details on the TimescaleDB removal.
+
+ ---
+
+ ## Context
+
+ LLM-based applications ("robots") face a fundamental challenge: LLMs have limited context windows (typically 128K-200K tokens) but need to maintain awareness across long conversations and sessions spanning days, weeks, or months.
+
+ Requirements:
+
+ - Persist memories across sessions (durable storage)
+ - Provide fast access to recent/relevant context
+ - Manage token budgets efficiently
+ - Never lose data accidentally
+ - Support contextual recall from the past
+
+ Alternative approaches:
+
+ 1. **Database-only**: Store everything in PostgreSQL, load on demand
+ 2. **Memory-only**: Keep everything in RAM, serialize on shutdown
+ 3. **Two-tier**: Combine fast working memory with durable long-term storage
+ 4. **External service**: Use a managed memory service
+
+ ## Decision
+
+ We will implement a **two-tier memory architecture** with:
+
+ - **Working Memory**: Token-limited, in-memory active context
+ - **Long-term Memory**: Durable PostgreSQL/TimescaleDB storage
+
+ ## Rationale
+
+ ### Working Memory (Hot Tier)
+
+ - **Purpose**: Immediate context for LLM
+ - **Storage**: In-memory Ruby data structures
+ - **Capacity**: Token-limited (default 128K tokens)
+ - **Eviction**: LRU-based eviction when full
+ - **Access pattern**: Frequent reads, moderate writes
+ - **Lifetime**: Process lifetime
+
+ ### Long-term Memory (Cold Tier)
+
+ - **Purpose**: Permanent knowledge base
+ - **Storage**: PostgreSQL with TimescaleDB
+ - **Capacity**: Effectively unlimited
+ - **Retention**: Permanent (explicit deletion only)
+ - **Access pattern**: RAG-based retrieval
+ - **Lifetime**: Forever
+
+ ### Data Flow
+ ```
+ Add Memory:
+   User Input → Working Memory → Long-term Memory
+                (immediate)      (persisted)
+
+ Recall Memory:
+   Query → Long-term Memory (RAG search) → Working Memory
+           (semantic + temporal)            (evict if needed)
+
+ Eviction:
+   Working Memory (full) → Evict LRU → Long-term Memory (already there)
+                           (mark as evicted, not deleted)
+ ```
+
+ ## Implementation Details
+
+ ### Working Memory
+ ```ruby
+ class WorkingMemory
+   attr_reader :max_tokens, :token_count
+
+   def initialize(max_tokens: 128_000)
+     @nodes = {} # key => {value, token_count, importance, timestamp}
+     @max_tokens = max_tokens
+     @token_count = 0
+   end
+
+   def add(key, value, token_count:, importance: 1.0)
+     evict_to_make_space(token_count) if needs_eviction?(token_count)
+     @nodes[key] = {value: value, token_count: token_count, ...}
+     @token_count += token_count
+   end
+
+   def evict_to_make_space(needed_tokens)
+     # LRU eviction based on last access + importance
+   end
+
+   def assemble_context(strategy: :balanced, max_tokens: nil)
+     # Sort by strategy and assemble within budget
+   end
+ end
+ ```
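The `needs_eviction?` and `evict_to_make_space` methods above are only stubbed with comments. One way they could be filled in, consistent with the LRU-plus-importance scoring described under "Eviction Strategies" later in this ADR; the `:last_accessed` field and the return value are assumptions, not the gem's implementation:

```ruby
# Sketch only: reopens the WorkingMemory class from the ADR snippet above.
class WorkingMemory
  def needs_eviction?(incoming_tokens)
    @token_count + incoming_tokens > @max_tokens
  end

  def evict_to_make_space(needed_tokens)
    # Lowest score first: least important and least recently used go first.
    candidates = @nodes.sort_by do |_key, node|
      node[:importance] / ((Time.now - node[:last_accessed]) + 1.0)
    end

    evicted_keys = []
    candidates.each do |key, node|
      break if @token_count + needed_tokens <= @max_tokens
      @token_count -= node[:token_count]
      @nodes.delete(key)
      evicted_keys << key
    end
    evicted_keys # caller can mark these as evicted in long-term memory
  end
end
```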
+
+ ### Long-term Memory
+ ```ruby
+ class LongTermMemory
+   def add(key:, value:, embedding:, ...)
+     # Insert into PostgreSQL with vector embedding
+   end
+
+   def search(timeframe:, query:, embedding_service:, limit:)
+     # RAG-based retrieval: temporal + semantic
+   end
+
+   def mark_evicted(keys)
+     # Update in_working_memory flag (not deleted)
+   end
+ end
+ ```
+
+ ### Coordination (HTM class)
+ ```ruby
+ class HTM
+   def add_node(key, value, ...)
+     # 1. Generate embedding
+     # 2. Store in long-term memory
+     # 3. Add to working memory (evict if needed)
+   end
+
+   def recall(timeframe:, topic:, ...)
+     # 1. Search long-term memory (RAG)
+     # 2. Add results to working memory (evict if needed)
+     # 3. Return nodes
+   end
+ end
+ ```
+
+ ## Consequences
+
+ ### Positive
+
+ ✅ **Fast context access**: Working memory provides O(1) lookups
+ ✅ **Durable storage**: Never lose data, survives restarts
+ ✅ **Token budget control**: Automatic management of context size
+ ✅ **Explicit eviction policy**: Transparent behavior
+ ✅ **RAG-enabled**: Search historical context semantically
+ ✅ **Never-delete philosophy**: Eviction moves data, never removes
+ ✅ **Process-isolated**: Each robot instance has independent working memory
+
+ ### Negative
+
+ ❌ **Complexity**: Two storage layers to coordinate
+ ❌ **Memory overhead**: Working memory consumes RAM
+ ❌ **Synchronization**: Keep both tiers consistent
+ ❌ **Eviction overhead**: Moving data between tiers
+
+ ### Neutral
+
+ ➡️ **Token counting**: Requires accurate token estimation
+ ➡️ **Strategy tuning**: Eviction and assembly strategies need calibration
+ ➡️ **Per-process state**: Working memory not shared across processes
+
+ ## Eviction Strategies
+
+ ### LRU-based (Implemented)
+ ```ruby
+ def eviction_score(node)
+   recency = Time.now - node[:last_accessed]
+   importance = node[:importance]
+
+   # Lower score = evict first
+   importance / (recency + 1.0)
+ end
+ ```
+
+ ### Future Strategies
+ - **Importance-only**: Keep most important nodes
+ - **Recency-only**: Pure LRU cache
+ - **Frequency-based**: Track access counts
+ - **Category-based**: Pin certain types (facts, preferences)
+ - **Smart eviction**: ML-based prediction of future access
+
+ ## Context Assembly Strategies
+
+ ### Recent (`:recent`)
+ Sort by `created_at DESC`, newest first
+
+ ### Important (`:important`)
+ Sort by `importance DESC`, most important first
+
+ ### Balanced (`:balanced`)
+ ```ruby
+ score = importance * (1.0 / age_in_days)
+ ```
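A sketch of how the balanced score above could drive context assembly within a token budget; the node hash layout follows the `WorkingMemory` sketch earlier in this ADR, and the method name and details are illustrative rather than the gem's implementation:

```ruby
# Assemble a context from working-memory nodes using the balanced score,
# stopping once the token budget is exhausted. Assumes node[:timestamp] is a Time.
def assemble_balanced(nodes, max_tokens:)
  scored = nodes.sort_by do |_key, node|
    age_in_days = [(Time.now - node[:timestamp]) / 86_400.0, 0.01].max
    -(node[:importance] * (1.0 / age_in_days)) # negate so highest score comes first
  end

  context = []
  used = 0
  scored.each do |_key, node|
    next if used + node[:token_count] > max_tokens
    context << node[:value]
    used += node[:token_count]
  end
  context
end
```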
+
+ ### Future Strategies
+ - **Semantic clustering**: Group related memories
+ - **Conversation threading**: Follow reply chains
+ - **Category grouping**: Facts first, then context, etc.
+ - **Hybrid scoring**: Multiple factors weighted
+
+ ## Design Principles
+
+ ### Never Forget (Unless Told)
+
+ - Eviction moves data, never deletes
+ - Only `forget(confirm: :confirmed)` deletes
+ - Long-term memory is append-only (updates rare)
+
+ ### Token Budget Management
+
+ - Token counting happens at add time
+ - Working memory enforces hard token limit
+ - Context assembly respects token budget
+ - Safety margin (10%) for token estimation errors
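A small worked example of the budget rules above, assuming the default 128K-token working memory and the 10% safety margin; the helper name is hypothetical:

```ruby
MAX_TOKENS    = 128_000   # default working-memory capacity
SAFETY_MARGIN = 0.10      # headroom for token-count estimation error

effective_budget = (MAX_TOKENS * (1 - SAFETY_MARGIN)).to_i
# => 115_200 tokens actually offered to the LLM context

# A node is only admitted if it fits within the remaining effective budget.
def fits_budget?(current_tokens, incoming_tokens, budget)
  current_tokens + incoming_tokens <= budget
end

fits_budget?(110_000, 4_000, effective_budget)  # => true
fits_budget?(114_000, 4_000, effective_budget)  # => false
```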
+
+ ### Transparent Behavior
+
+ - Log all evictions
+ - Track in_working_memory flag
+ - Operations log for audit trail
+
+ ## Risks and Mitigations
+
+ ### Risk: Token Count Inaccuracy
+
+ - **Risk**: Tiktoken approximation differs from LLM's actual count
+ - **Likelihood**: Medium (different tokenizers)
+ - **Impact**: Medium (context overflow)
+ - **Mitigation**: Add safety margin (10%), use LLM-specific counters
+
+ ### Risk: Eviction Thrashing
+
+ - **Risk**: Constant eviction/recall cycles
+ - **Likelihood**: Low (with proper sizing)
+ - **Impact**: Medium (performance degradation)
+ - **Mitigation**: Larger working memory, smarter eviction, caching
+
+ ### Risk: Working Memory Growth
+
+ - **Risk**: Memory leaks or unbounded growth
+ - **Likelihood**: Low (token budget enforced)
+ - **Impact**: High (OOM crashes)
+ - **Mitigation**: Hard limits, monitoring, alerts
+
+ ### Risk: Stale Working Memory
+
+ - **Risk**: Working memory doesn't reflect long-term updates
+ - **Likelihood**: Low (single-writer pattern)
+ - **Impact**: Low (eventual consistency OK)
+ - **Mitigation**: Refresh on recall, invalidation on update
+
+ ## Alternatives Considered
+
+ ### Database-Only
+ **Pros**: Simple, no synchronization
+ **Cons**: Slow access, no token budget management
+ **Decision**: ❌ Rejected - too slow for every LLM call
+
+ ### Memory-Only
+ **Pros**: Fast, simple
+ **Cons**: Not durable, lost on crash
+ **Decision**: ❌ Rejected - unacceptable data loss risk
+
+ ### External Service (Redis, Memcached)
+ **Pros**: Shared across processes, mature caching
+ **Cons**: Additional dependency, serialization overhead
+ **Decision**: ⏸️ Deferred - consider for multi-process scenarios
+
+ ### Three-Tier (L1/L2/L3)
+ **Pros**: More granular caching
+ **Cons**: Much higher complexity
+ **Decision**: ❌ Rejected - YAGNI for v1
+
+ ## Performance Characteristics
+
+ ### Working Memory
+
+ - **Add**: O(1) amortized (eviction is O(n) when needed)
+ - **Retrieve**: O(1) hash lookup
+ - **Eviction**: O(n log n) for sorting, O(k) for removing k nodes
+ - **Context assembly**: O(n log n) for sorting, O(k) for selecting
+
+ ### Long-term Memory
+
+ - **Add**: O(log n) PostgreSQL insert with indexes
+ - **Vector search**: O(log n) with HNSW index (approximate)
+ - **Full-text search**: O(log n) with GIN index
+ - **Hybrid search**: O(log n) for both, then merge
+
+ ## Future Enhancements
+
+ 1. **Shared working memory**: Redis-backed for multi-process
+ 2. **Lazy loading**: Load nodes on first access
+ 3. **Pre-fetching**: Anticipate needed context
+ 4. **Compression**: Compress old working memory nodes
+ 5. **Tiered eviction**: Multiple working memory levels
+ 6. **Smart assembly**: ML-driven context selection
+
+ ## References
+
+ - [Working Memory (Psychology)](https://en.wikipedia.org/wiki/Working_memory)
+ - [Cache Eviction Policies](https://en.wikipedia.org/wiki/Cache_replacement_policies)
+ - [LLM Context Window Management](https://www.anthropic.com/research/context-windows)
+ - [HTM Planning Document](../../htm_teamwork.md)
+
+ ## Review Notes
+
+ **Systems Architect**: ✅ Clean separation of concerns. Consider shared cache for horizontal scaling.
+
+ **Performance Specialist**: ✅ Good balance of speed and durability. Monitor eviction frequency.
+
+ **AI Engineer**: ✅ Token budget management is critical. Add safety margins for token count variance.
+
+ **Ruby Expert**: ✅ Consider using Concurrent::Map for thread-safe working memory in future.