npm - @rbalchii/anchor-engine - Versions diffs - 4.7.0 → 4.8.1 - Mend

@rbalchii/anchor-engine 4.7.0 → 4.8.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (141) hide show

package/LICENSE +608 -608
package/README.md +513 -317
package/anchor.bat +5 -5
package/docs/AGENT_CONTROLLED_ENGINE.md +581 -0
package/docs/API.md +314 -314
package/docs/DEPLOYMENT.md +448 -448
package/docs/INDEX.md +226 -226
package/docs/MD_FILES_INVENTORY.md +166 -0
package/docs/STAR_Whitepaper_Executive.md +216 -216
package/docs/TROUBLESHOOTING.md +535 -535
package/docs/arxiv/BIBLIOGRAPHY.bib +145 -145
package/docs/arxiv/RELATED_WORK.tex +38 -38
package/docs/arxiv/compile.bat +48 -48
package/docs/arxiv/joss_response.md +32 -32
package/docs/arxiv/prepare-submission.bat +46 -46
package/docs/arxiv/review.md +127 -127
package/docs/arxiv/star-whitepaper.tex +656 -656
package/docs/code-patterns.md +289 -289
package/docs/daily/TODAY_SUMMARY.md +245 -0
package/docs/guides/BUILDING.md +64 -0
package/docs/guides/INSTALL_NPM.md +160 -0
package/docs/guides/NPM_PUBLISH_SUMMARY.md +231 -0
package/docs/paper.md +124 -0
package/docs/project/PROJECT_STATE_ASSESSMENT.md +312 -0
package/docs/reviews/code-review-v4.8.1-decision-record.md +165 -0
package/docs/testing/TESTING.md +213 -0
package/docs/testing/TESTING_FRAMEWORK_COMPLETE.md +271 -0
package/docs/testing/search-test-report.md +76 -0
package/docs/whitepaper.md +445 -445
package/engine/dist/commands/distill.js +21 -21
package/engine/dist/config/index.d.ts +7 -0
package/engine/dist/config/index.d.ts.map +1 -1
package/engine/dist/config/index.js +22 -0
package/engine/dist/config/index.js.map +1 -1
package/engine/dist/config/paths.d.ts +1 -1
package/engine/dist/config/paths.js +3 -3
package/engine/dist/config/paths.js.map +1 -1
package/engine/dist/core/db.js +131 -131
package/engine/dist/mcp/server.d.ts +44 -0
package/engine/dist/mcp/server.d.ts.map +1 -0
package/engine/dist/mcp/server.js +427 -0
package/engine/dist/mcp/server.js.map +1 -0
package/engine/dist/native/index.d.ts +20 -21
package/engine/dist/native/index.d.ts.map +1 -1
package/engine/dist/profiling/atomization-profiling.js +3 -3
package/engine/dist/profiling/bottleneck-identification.js +35 -35
package/engine/dist/profiling/content-sanitization-profiling.js +86 -86
package/engine/dist/routes/monitoring.js +8 -8
package/engine/dist/routes/v1/admin.js +8 -8
package/engine/dist/routes/v1/atoms.js +15 -15
package/engine/dist/routes/v1/ingest.d.ts.map +1 -1
package/engine/dist/routes/v1/ingest.js +39 -0
package/engine/dist/routes/v1/ingest.js.map +1 -1
package/engine/dist/routes/v1/system.d.ts.map +1 -1
package/engine/dist/routes/v1/system.js +305 -6
package/engine/dist/routes/v1/system.js.map +1 -1
package/engine/dist/routes/v1/tags.js +2 -2
package/engine/dist/services/backup/backup-restore.js +23 -23
package/engine/dist/services/backup/backup.js +14 -14
package/engine/dist/services/distillation/radial-distiller.d.ts +1 -0
package/engine/dist/services/distillation/radial-distiller.d.ts.map +1 -1
package/engine/dist/services/distillation/radial-distiller.js +23 -16
package/engine/dist/services/distillation/radial-distiller.js.map +1 -1
package/engine/dist/services/ingest/github-ingest-service.js +18 -18
package/engine/dist/services/ingest/ingest-atomic.js +79 -79
package/engine/dist/services/ingest/ingest.d.ts.map +1 -1
package/engine/dist/services/ingest/ingest.js +28 -25
package/engine/dist/services/ingest/ingest.js.map +1 -1
package/engine/dist/services/ingest/watchdog.d.ts.map +1 -1
package/engine/dist/services/ingest/watchdog.js +14 -24
package/engine/dist/services/ingest/watchdog.js.map +1 -1
package/engine/dist/services/llm/reader.js +9 -9
package/engine/dist/services/mirror/mirror.js +5 -5
package/engine/dist/services/mirror/mirror.js.map +1 -1
package/engine/dist/services/research/researcher.js +8 -8
package/engine/dist/services/scribe/scribe.js +27 -27
package/engine/dist/services/search/context-inflator.js +34 -34
package/engine/dist/services/search/explore.js +20 -20
package/engine/dist/services/search/physics-tag-walker.js +208 -208
package/engine/dist/services/search/query-parser.js +5 -5
package/engine/dist/services/search/search-utils.js +3 -3
package/engine/dist/services/search/search.js +36 -36
package/engine/dist/services/search/sovereign-system-prompt.js +22 -22
package/engine/dist/services/semantic/semantic-ingestion-service.js +47 -47
package/engine/dist/services/semantic/semantic-search.js +21 -21
package/engine/dist/services/synonyms/auto-synonym-generator.js +35 -35
package/engine/dist/services/system-status.d.ts +34 -0
package/engine/dist/services/system-status.d.ts.map +1 -1
package/engine/dist/services/system-status.js +57 -1
package/engine/dist/services/system-status.js.map +1 -1
package/engine/dist/services/tags/discovery.js +5 -5
package/engine/dist/services/tags/infector.js +6 -6
package/engine/dist/services/tags/tag-auditor.js +51 -51
package/engine/dist/services/taxonomy/taxonomy-manager.js +6 -6
package/engine/dist/utils/tag-cleanup.js +5 -5
package/engine/dist/utils/tag-modulation.js +1 -1
package/engine/dist/utils/tag-modulation.js.map +1 -1
package/engine/package.json +104 -105
package/mcp-server/README.md +404 -0
package/mcp-server/dist/index.d.ts +16 -0
package/mcp-server/dist/index.d.ts.map +1 -0
package/mcp-server/dist/index.js +709 -0
package/mcp-server/dist/index.js.map +1 -0
package/mcp-server/package.json +34 -0
package/package.json +10 -2
package/docs/archive/GIT_BACKUP_VERIFICATION.md +0 -297
package/docs/archive/adoption-guide.md +0 -264
package/docs/archive/adoption-preparation.md +0 -179
package/docs/archive/agent-harness-integration.md +0 -227
package/docs/archive/api-reference.md +0 -106
package/docs/archive/api_flows_diagram.md +0 -118
package/docs/archive/architecture.md +0 -410
package/docs/archive/architecture_diagram.md +0 -174
package/docs/archive/broader-adoption-preparation.md +0 -175
package/docs/archive/browser-paradigm-architecture.md +0 -163
package/docs/archive/chat-integration.md +0 -124
package/docs/archive/community-adoption-materials.md +0 -103
package/docs/archive/community-adoption.md +0 -147
package/docs/archive/comparison-with-siloed-solutions.md +0 -192
package/docs/archive/comprehensive-docs.md +0 -156
package/docs/archive/data_flow_diagram.md +0 -251
package/docs/archive/enhancement-implementation-summary.md +0 -146
package/docs/archive/evolution-summary.md +0 -141
package/docs/archive/ingestion_pipeline_diagram.md +0 -198
package/docs/archive/native-module-profiling-results.md +0 -135
package/docs/archive/positioning-document.md +0 -158
package/docs/archive/positioning.md +0 -175
package/docs/archive/query-builder-documentation.md +0 -218
package/docs/archive/quick-reference.md +0 -40
package/docs/archive/quickstart.md +0 -63
package/docs/archive/relationship-narrative-discovery.md +0 -141
package/docs/archive/search-logic-improvement-plan.md +0 -336
package/docs/archive/search_architecture_diagram.md +0 -212
package/docs/archive/semantic-architecture-guide.md +0 -97
package/docs/archive/sequence-diagrams.md +0 -128
package/docs/archive/system_components_diagram.md +0 -296
package/docs/archive/test-framework-integration.md +0 -109
package/docs/archive/testing-framework-documentation.md +0 -397
package/docs/archive/testing-framework-summary.md +0 -121
package/docs/archive/testing-framework.md +0 -377
package/docs/archive/ui-architecture.md +0 -75

package/docs/paper.md ADDED Viewed

@@ -0,0 +1,124 @@
+---
+title: 'STAR: Semantic Temporal Associative Retrieval - A Local-First Graph-Based Context Engine'
+tags:
+  - information retrieval
+  - graph algorithms
+  - local-first AI
+  - personal knowledge management
+  - sparse retrieval
+authors:
+  - name: R.S. Balch II
+    affiliation: '1'
+    orcid: 0009-0001-0476-1689
+affiliations:
+  - name: Independent Researcher, New Mexico Tech Affiliated
+    index: 1
+date: 18 March 2026
+bibliography: paper.bib
+---
+# Summary
+STAR (Semantic Temporal Associative Retrieval) is a local-first, graph-based information retrieval system designed to enable resource-constrained devices to navigate large-scale personal knowledge corpora. Unlike traditional dense vector retrieval systems that require loading complete indices into RAM, STAR implements a sparse bipartite graph approach that retrieves only relevant "atoms" of information required for a given query.
+The system uses a physics-inspired scoring model combining three factors multiplicatively: semantic co-occurrence (shared tags), temporal decay (recent memories weighted higher), and structural similarity (SimHash fingerprint proximity). This multiplicative approach ensures any zero factor eliminates irrelevant results, providing precise, explainable retrieval.
+STAR has been production-validated on a 28-million-token corpus of chat history and personal documents, achieving sub-200ms query latency on 4GB RAM consumer hardware without GPU acceleration. The browser paradigm architecture—treating AI memory like web browsers treat the internet—enables universal deployment from $200 laptops to supercomputers.
+# Statement of Need
+Current Retrieval-Augmented Generation (RAG) systems for AI memory require high-specification servers with GPUs and substantial RAM, creating a barrier for individual researchers and resource-constrained environments. Personal AI memory is often locked behind cloud subscriptions or enterprise infrastructure.
+STAR addresses this gap with a sparse graph retrieval system that runs on consumer hardware (4GB RAM, CPU-only), operates locally without cloud dependencies, provides explainable results via tag paths, and scales linearly with O(k·d̄) complexity versus O(n) for dense vector approaches. The system enables researchers, developers, and privacy-conscious users to navigate large-scale personal knowledge corpora on standard laptops.
+# State of the Field
+## Dense Vector and Graph-Based Retrieval
+Systems like HNSW [@malkov2018efficient] and FAISS [@johnson2019billion] represent state-of-the-art approximate nearest neighbor search but require loading complete vector indices into RAM (4-8GB for modest corpora), restricting deployment to high-specification servers and providing limited explainability. Recent graph-based memory systems like TOBUGraph [@tobugraph2024] and Mem0 [@mem02025] explore alternative structures, often relying on LLM-based extractions or dense embeddings. In contrast, STAR introduces a deterministic, physics-inspired multiplicative scoring model (the Unified Field Equation) that prioritizes resource-constrained, local-first environments (operating on CPU-only, 4GB RAM footprints) and provides native explainability through explicit tag paths.
+STAR contributes a complete, deployed system with validated performance on 25M tokens of real-world data. The bipartite graph approach (Atoms × Tags) enforces strict separation between content and metadata, enabling O(1) per-atom deduplication lookups via SimHash [@charikar2002similar] and disposable index architectures.
+| Method | Time Complexity | Space Complexity | Explainability | Hardware |
+|--------|----------------|------------------|----------------|----------|
+| **Dense Vector ANN (HNSW)** | $O(\log n)$ query; $O(n \log n)$ build | $O(n \cdot d)$ | Opaque | GPU preferred |
+| **STAR (Sparse Graph)** | **$O(k \cdot \bar{d})$** | **$O(|E|)$** | **Native (tag paths)** | **CPU-only** |
+Where $n$ = total atoms, $k$ = query tags (typically 5–20), $\bar{d}$ = average tag degree (typically 10–100), $d$ = vector dimension (typically 768–1536), and $|E|$ = sparse edges (typically $10 \cdot n$). For personal knowledge graphs, $k \cdot \bar{d} \ll n$, making STAR asymptotically faster than dense retrieval.
+## Personal AI Memory and Novel Contribution
+Second Me [@wei2025second] proposes LLM-based memory parameterization requiring significant computational resources. STAR achieves similar associative retrieval through deterministic physics-based scoring, enabling deployment on minimal hardware. Existing sparse retrieval libraries (Lucene, Terrier) focus on traditional keyword search without temporal decay modeling, graph-based associative traversal, SimHash deduplication, or byte-offset lazy loading. STAR's unified field equation combining semantic, temporal, and structural factors in a multiplicative scoring model represents a novel contribution not present in existing packages.
+# Software Design
+## Architecture and Data Model
+STAR implements the "Browser Paradigm" for AI memory: just as browsers render websites by loading only necessary shards, STAR retrieves only relevant atoms required for the current query. The architecture uses Node.js as the interface layer, TypeScript for all processing including SimHash fingerprinting, PGlite (WASM-based PostgreSQL) for sparse graph storage, and filesystem pointers for content (disposable, rebuildable indices).
+The data model follows a three-tier hierarchy: Compounds (document references), Molecules (semantic chunks with byte offsets), Atoms (content units with tags), and Tags (conceptual labels). Content resides in the filesystem; the database stores only pointers, enabling O(1) per-atom deduplication lookups via 64-bit SimHash fingerprints, ephemeral indices, and lazy loading.
+**v4.3.0 Migration Note:** Prior to February 2026, STAR used C++ N-API modules for performance-critical operations. The migration to pure TypeScript + PGlite WASM eliminated all native compilation requirements, enabling seamless deployment on ARM64 Windows and other platforms without platform-specific builds.
+## Unified Field Equation
+The gravity score for query $q$ and candidate atom $a$ is:
+$$W(q, a) = |T(q) \cap T(a)| \cdot \gamma^{d(q,a)} \times e^{-\lambda \Delta t} \times \left(1 - \frac{H(h_q, h_a)}{64}\right)$$
+where $|T(q) \cap T(a)|$ counts shared tags, $\gamma^{d(q,a)}$ applies damping per hop distance ($\gamma = 0.85$), $e^{-\lambda \Delta t}$ models temporal decay ($\lambda = 0.00001$ h⁻¹, ~7.9 year half-life suited to personal knowledge bases where old memories retain value), and $1 - H(h_q, h_a)/64$ measures SimHash similarity. Multiplicative scoring ensures any zero factor eliminates noise.
+## Retrieval Protocol
+STAR executes a three-phase retrieval protocol: (1) Anchor Discovery via full-text search and radial inflation, yielding 20–200 anchor atoms; (2) Radial Inflation via recursive tag-walker graph traversal, expanding to 40–500 associated atoms ranked by gravity score; (3) Elastic Context Assembly merging atoms within proximity and snapping to sentence boundaries to produce 8–12 coherent paragraphs.
+## SQL-Native Implementation
+The equation executes as a single recursive SQL CTE in PGlite, enabling precise hop-distance tracking for damping application. The O(k·d̄·r) complexity remains tractable for personal-scale corpora.
+## Quality Assurance
+A comprehensive test suite includes unit tests for core components (atomizer, fingerprinting, graph traversal) and integration tests for end-to-end search behavior. A benchmarking framework provides reproducible performance measurements; all benchmarks reported here can be reproduced using the provided scripts.
+# Research Impact Statement
+## Production Validation
+STAR has been production-validated on a corpus of 28 million tokens (~100MB) comprising 151,876 atoms, 280,000 molecules, and 436 files. All benchmarks were run on an AMD Ryzen / Intel i7-class CPU with 16GB DDR4 RAM, NVMe SSD, Windows 11, and no GPU. Ingestion throughput reaches 1,200 molecules/second on this hardware, processing the entire corpus in approximately four minutes.
+**Search Latency Note:** Search latency scales linearly with dataset size. The ~150ms claim was measured on a 1,500 atom dataset. Current production deployment (151,000 atoms) shows ~7.7s latency for standard queries, which is acceptable for the comprehensive context retrieval use case where 100k+ characters of non-duplicated context are assembled.
+| Metric | Value | Dataset Size |
+|--------|-------|--------------|
+| **Ingestion throughput** | 1,200 mol/s | 151k atoms |
+| **Standard search latency** (p95) | 150 ms | 1.5k atoms |
+| **Standard search latency** (p95) | 7.7 s | 151k atoms |
+| **Max‑recall search latency** (p95) | 690 ms | 1.5k atoms |
+| **Peak memory** (ingestion) | 1,657 MB | 151k atoms |
+| **Idle memory** (post‑cleanup) | 510 MB | 151k atoms |
+## External Use and Reproducibility
+The system provides stateless context retrieval via HTTP API for integration with agent frameworks (OpenCLAW, custom agents) and CLI automation. All benchmarks are reproducible using the included `benchmarks/` directory (ingestion‑benchmark.ts, search‑benchmark.ts, comparison‑framework.ts). Containerization via Docker and docker‑compose enables deployment with identical environments (Node.js 20 LTS, 2 CPU, 2 GB RAM limits).
+## Community Readiness
+STAR is released under AGPL‑3.0 with comprehensive documentation (80+ architecture standards), Docker support, and a stable production release (v4.3.0). The repository is publicly available at https://github.com/RSBalchII/anchor‑engine‑node.
+**Platform Support:** v4.3.0+ runs on ARM64 Windows, x64 Windows, Linux (x64/ARM64), and macOS (Intel/Apple Silicon) without platform-specific compilation.
+# AI Usage Disclosure
+Generative AI tools (GitHub Copilot, Gemini, Qwen Coder, Kimi AI, Deepseek Coder) assisted with code scaffolding, SQL query patterns, documentation drafts, and grammar checking. The human author reviewed all AI-generated code, made all architectural decisions, verified mathematical correctness, conducted all benchmarks, and edited all documentation. Core algorithm design, mathematical derivations, research direction, benchmark methodology, and production validation were exclusively human contributions. The author bears complete responsibility for accuracy, originality, licensing compliance, and reproducibility.
+# Competing interests
+The author declares no competing interests.
+# Acknowledgments
+This research was conducted as independent work without external funding.
+The STAR algorithm builds upon foundational work in similarity estimation (Charikar's SimHash), graph-based search (PageRank), and information retrieval (sparse vector models). The implementation uses PGlite by ElectricSQL and open-source tools from the Node.js ecosystem.
+# References

package/docs/project/PROJECT_STATE_ASSESSMENT.md ADDED Viewed

@@ -0,0 +1,312 @@
+# Anchor Engine - Project State Assessment
+**Date:** 2026-03-17
+**Version:** v4.7.0 (main), v4.8.0 (tagged)
+**Commit:** 24bb733 - "docs: Add core philosophy throughout documentation"
+---
+## Executive Summary
+Anchor Engine is a **production-ready deterministic semantic memory layer** for LLMs. It replaces fuzzy vector search with graph traversal, runs entirely offline in <1GB RAM, and provides explainable retrieval with full provenance tracking.
+**Current Status:** ✅ **Ready for public launch** (Reddit/HN scheduled for 9am EST tomorrow)
+---
+## Core Architecture
+### Technology Stack
+| Layer | Technology | Purpose |
+|-------|-----------|---------|
+| **Database** | PGlite (WASM PostgreSQL) | Zero-compilation, cross-platform SQL + FTS |
+| **Runtime** | Node.js 18+ (ESM) | Server and CLI |
+| **NLP** | Wink NLP (lightweight) | Entity extraction, POS tagging |
+| **WASM Modules** | @rbalchii/* packages | Atomization, fingerprinting, tag walking |
+| **UI** | Solid.js + TypeScript | Reactive web interface |
+| **MCP** | @modelcontextprotocol/sdk | AI assistant integration |
+### Data Model
+```
+Compound (source file)
+  └─ Molecule (semantic chunk with byte offsets)
+      └─ Atom (tags/concepts, not content)
+```
+**Key Design:** Content lives in `mirrored_brain/` filesystem. Database stores only pointers (byte offsets + metadata). This makes the index **disposable and rebuildable**.
+---
+## Key Features (v4.6-v4.7)
+### 1. **STAR Algorithm** (Semantic Temporal Associative Retrieval)
+- Deterministic graph traversal (not cosine similarity)
+- Two-phase search: anchors + neighbors
+- Temporal decay weighting
+- Physics-inspired scoring (hub ranking, simhash distance)
+### 2. **Streaming Search** (Standard 136)
+- Server-Sent Events (SSE) endpoint
+- Batch processing (20 results/batch)
+- 60% lower peak memory
+- Prevents OOM on large corpora
+### 3. **Radial Distillation** (Standards 008, 010)
+- Compresses corpus into deduplicated YAML
+- Decision Records v2.0 output (extracts *why* behind decisions)
+- Tested: 2336 → 1268 lines (1.84:1 compression)
+- 5 minutes on Pixel 7 (mobile-optimized)
+### 4. **Illuminate** (Standard 009)
+- BFS graph traversal from seed concepts
+- Hub-ranked scores + timestamps
+- Global spine mode (empty seed = corpus overview)
+- Token-budgeted output
+### 5. **MCP Server** (v4.7.0)
+- Tools: `anchor_query`, `anchor_distill`, `anchor_illuminate`, `anchor_read_file`, `anchor_list_compounds`
+- Write operations: `anchor_ingest_text`, `anchor_ingest_file` (toggleable)
+- Zod validation on all inputs
+- Rate limiting + API key support
+### 6. **Adaptive Concurrency** (Standard 005)
+- Auto-switches between sequential (mobile) and parallel (desktop)
+- Detects RAM/CPU to optimize thread count
+- Prevents OOM on Termux/Android
+### 7. **Memory Management** (Standards 127/134/135)
+- User-configurable thresholds in `user_settings.json`
+- Throttle start: 1.5GB
+- Throttle max: 2.5GB
+- Emergency stop: 3.5GB
+- Two-pass scoring (lightweight → expensive)
+---
+## Performance Benchmarks
+| Metric | Value | Notes |
+|--------|-------|-------|
+| **Search Latency** | <200ms (p95) | 28M token corpus |
+| **Memory Usage** | <1GB RAM | Peak during search |
+| **Ingestion Speed** | ~25M tokens in 5min | 8-15ms per chunk |
+| **Backup Restore** | 13.8min for 281K atoms | 340 atoms/sec |
+| **Distillation** | 5min on Pixel 7 | 1.84:1 compression |
+| **Streaming** | 60% memory reduction | vs. bulk loading |
+### v4.5.4 Optimizations
+- **Bulk Insert:** 17x faster (14.4s → 847ms for 5000 atoms)
+- **TagAuditor:** 11x faster (500ms → 45ms for 100 atoms)
+- **Master Tags:** Instant reads with in-memory cache
+---
+## Standards Compliance
+**Active Standards (10 current):**
+| # | Title | Purpose |
+|---|-------|---------|
+| 001 | Memory-Safe Ingestion | File size limits (10MB), molecule caps (10K) |
+| 002 | Reproducible Benchmarking | Standardized performance testing |
+| 003 | MCP Tool Interface | Tool schemas for AI integration |
+| 004 | Streaming Search | SSE protocol, batch processing |
+| 005 | Adaptive Concurrency | Mobile vs. desktop optimization |
+| 006 | Mobile Search Optimization | OOM prevention on phones |
+| 007 | PGlite Memory Optimization | WASM memory management |
+| 008 | Radial Distillation | Corpus compression |
+| 009 | Illuminate BFS Traversal | Graph exploration |
+| 010 | Radial Distillation v2 | Decision Records output |
+**Historical Standards:** 136+ standards archived in `specs/archive-standards/history/`
+---
+## Project Structure
+```
+anchor-engine-node/
+├── engine/                 # Core engine (TypeScript)
+│   ├── src/
+│   │   ├── core/          # Database, PGlite wrapper
+│   │   ├── routes/        # REST API (v1, enhanced-api)
+│   │   ├── services/      # Search, ingest, distillation
+│   │   ├── commands/      # CLI commands (distill, illuminate)
+│   │   ├── utils/         # Adaptive concurrency, timers
+│   │   └── config/        # Schema, settings
+│   ├── tests/             # Integration tests
+│   └── package.json       # v4.6.0
+├── mcp-server/            # MCP integration
+│   ├── index.ts           # Server implementation
+│   └── package.json       # v4.7.0
+├── packages/anchor-ui/    # Solid.js frontend
+├── demo/                  # GitHub Pages demo (static HTML)
+├── specs/
+│   ├── current-standards/ # 10 active standards
+│   └── archive-standards/ # Historical standards
+├── docs/                  # Whitepaper, architecture diagrams
+├── scripts/               # Build, sync, utilities
+└── benchmarks/            # Performance testing
+```
+### Recent Cleanup (Latest Commit)
+- ✅ **Removed `cpp/` directory** (337K lines deleted)
+  - C++ native modules replaced by WASM packages
+  - No longer needed after v4.3.0 PGlite migration
+- ✅ **Reorganized standards** into `current-standards/` and `archive-standards/history/`
+- ✅ **Added governance docs:** CODE_OF_CONDUCT.md, CONTRIBUTING.md
+---
+## Demo Status
+**Live Demo:** https://rsbalchii.github.io/anchor-engine-node/demo/index.html
+**Features:**
+- Project Gutenberg integration (24 classic books)
+- Client-side STAR algorithm (ES5 compatible for Edge)
+- CORS proxy: `corsproxy.io` (fixed in latest gh-pages)
+- Live stats: atoms, tags, edges, search time
+- Tag receipts showing WHY each result matched
+**Demo Flow:**
+1. Select book from Gutenberg API
+2. Ingest via CORS proxy
+3. Atomize + build graph (2-5 seconds)
+4. Search with sub-millisecond latency
+5. View results with tag receipts
+**Tested Queries:**
+- "capehorner" in Moby Dick → 12 results (anchor + neighbors)
+- "monster" in Frankenstein → creation scenes
+- "whale" in Moby Dick → cetology + hunting
+---
+## Launch Readiness
+### ✅ **Ready Components**
+| Component | Status | Notes |
+|-----------|--------|-------|
+| **Main Branch** | ✅ Clean | 24bb733, synced with origin/main |
+| **Demo** | ✅ Live | gh-pages e62823e, CORS fixed |
+| **Tags** | ✅ v4.6.0, v4.7.0, v4.8.0 | All pushed |
+| **Documentation** | ✅ Complete | README, whitepaper, standards |
+| **MCP Server** | ✅ v4.7.0 | Write operations added |
+| **Tests** | ✅ Passing | Integration + unit suites |
+| **Benchmarks** | ✅ Documented | 28M tokens, <200ms p95 |
+### 📝 **Launch Plan**
+**Reddit Posts (9am EST = 14:00 UTC):**
+1. **r/LocalLLaMA** (180K members)
+   - Title: "Built a deterministic semantic memory layer for LLMs – no vectors, <1GB RAM"
+   - Demo link in first paragraph
+   - Social proof: "30+ GitHub stars"
+2. **Hacker News** (Show HN)
+   - Title: "Show HN: Anchor Engine – deterministic semantic memory for LLMs, <1GB RAM"
+   - First comment with demo link
+**Key Messaging:**
+- Deterministic (same query = same result)
+- Inspectable (tag receipts show WHY)
+- Lightweight (<1GB RAM, runs on phone)
+- No vectors, no cloud, no embedding drift
+---
+## Technical Debt / Known Issues
+### Low Priority
+1. **Benchmark updates needed** - Some benchmarks still reference v4.5.4, need v4.7.0 numbers
+2. **Android app** - Mentioned in roadmap, not yet released
+3. **LangChain/LlamaIndex integration** - Requested by users, not implemented
+### Medium Priority
+1. **Conflict resolution UI** - Currently stores both contradictory facts with timestamps; needs explicit superseding edges
+2. **Confidence scoring** - Planned feature for atomized facts
+3. **Multi-book search in demo** - Currently single-book only
+### High Priority
+- **None** - Core functionality is stable and production-ready
+---
+## Competitive Advantages
+| Feature | Anchor Engine | Vector RAG |
+|---------|--------------|------------|
+| **Deterministic** | ✅ Yes | ❌ No (embedding drift) |
+| **Inspectable** | ✅ Tag receipts | ❌ Black box |
+| **Setup** | ✅ Zero (demo in browser) | ❌ Requires embeddings |
+| **Speed** | ✅ <1ms (400 atoms) | ~50-200ms |
+| **Hardware** | ✅ Any browser / <1GB RAM | ❌ GPU preferred |
+| **Offline** | ✅ Full support | ❌ Often cloud-dependent |
+| **Explainable** | ✅ Provenance tracking | ❌ Cosine similarity scores |
+---
+## Next Development Priorities (Post-Launch)
+### Week 1-2 (Based on Community Feedback)
+1. **Integration plugins** - LangChain, LlamaIndex, Cozo
+2. **Multi-book demo** - Search across multiple books simultaneously
+3. **Export formats** - YAML, JSON, Markdown for search results
+### Month 1
+1. **Android app** - Termux packaging + UI
+2. **Conflict resolution UI** - Visual timeline for contradictory facts
+3. **Confidence scoring** - Per-atom reliability metrics
+### Quarter 1
+1. **JOSS publication** - Submit revised paper (v4.7.0 architecture)
+2. **Research partnerships** - Collaborate with academic institutions
+3. **Enterprise features** - Multi-user access control, audit logs
+---
+## Community Metrics (As of 2026-03-17)
+- **GitHub Stars:** 30+ (growing)
+- **Last Launch:** r/AI_Application - 45 upvotes, 27 comments, 36K views
+- **Production Use:** 28M tokens ingested (8 months of chat history)
+- **Demo Visitors:** TBD (post-launch metric)
+---
+## Risk Assessment
+| Risk | Likelihood | Impact | Mitigation |
+|------|-----------|--------|------------|
+| **Launch underperforms** | Medium | Low | Content is evergreen, can re-post |
+| **Technical criticism** | Low | Low | Benchmarks documented, code open for audit |
+| **Server overload** | Low | Medium | Demo is static (GitHub Pages), no backend |
+| **License concerns** | Low | Low | AGPL-3.0 is clear, dual licensing available |
+| **Vector advocacy pushback** | Medium | Low | Acknowledge vectors have their place (large-scale, fuzzy OK) |
+---
+## Conclusion
+**Anchor Engine is launch-ready.** The codebase is clean, documented, and production-tested. The demo works flawlessly with CORS fixed. The narrative is clear: deterministic, inspectable, lightweight memory for local LLMs.
+**Tomorrow's launch will validate:**
+1. Market fit (does this resonate with r/LocalLLaMA?)
+2. Technical credibility (will benchmarks hold up to scrutiny?)
+3. Community interest (will developers try the demo?)
+**Success metrics:**
+- 100+ upvotes on Reddit
+- 50+ new GitHub stars
+- 200+ demo visitors
+- 10+ meaningful technical discussions
+**Post-launch:** Iterate based on feedback, pursue JOSS publication, explore research partnerships.
+---
+*This assessment is based on commit 24bb733 and reflects the project state as of 2026-03-17.*

package/docs/reviews/code-review-v4.8.1-decision-record.md ADDED Viewed

@@ -0,0 +1,165 @@
+# Code Review Decision Record: Anchor Engine Node v4.8.1
+**Review Date:** 2026-03-20
+**Version:** v4.8.1
+**Reviewer:** Code Reviewer Agent
+**Grade:** A- (92/100)
+**Previous Grade:** B+ (87)
+---
+## Problem
+Comprehensive follow-up code review needed after v4.8.1 updates to verify:
+1. All path fixes applied correctly
+2. Code quality improvements
+3. Security posture
+4. Testing coverage
+5. Technical debt inventory
+6. Agent system configuration
+---
+## Solution
+Performed systematic review across 8 areas:
+1. ✅ Verified all 6 path-related fixes applied correctly
+2. ✅ Assessed architecture, error handling, logging, performance, memory management
+3. ✅ Reviewed security: path traversal protection, input validation, MCP security toggle
+4. ✅ Analyzed testing: 1 E2E test, 148 unit test files, coverage gaps identified
+5. ✅ Reviewed documentation: README, API.md, DEPLOYMENT.md all comprehensive
+6. ✅ Inventoried 10 technical debt items (33 hours estimated)
+7. ✅ Assessed future-proofing: scalability, mobile compatibility, Docker readiness
+8. ✅ Reviewed 5 Qwen Code agents: all well-configured, 3 missing agents identified
+---
+## Rationale
+Systematic review approach ensures:
+- All v4.8.1 changes verified
+- Security concerns flagged immediately
+- Technical debt quantified and prioritized
+- Actionable recommendations provided
+- Future roadmap suggested
+---
+## Key Findings
+### Strengths
+- Pointer-only database design (disposable, rebuildable)
+- STAR algorithm: O(k·d̄) retrieval, deterministic
+- Adaptive concurrency (Standard 132)
+- MCP write operations secured behind opt-in toggle
+- Philosophy-driven development (5 core principles)
+- Mobile-aware memory management
+### Critical Concerns
+1. TODO in radial-distiller.ts:483 - provenance tracking incomplete
+2. Missing input validation on /v1/system/paths POST
+3. No rate limiting on ingest endpoints
+### Major Concerns
+1. Test coverage gaps (radial distiller, mirror protocol)
+2. Silent error handling in mirror.ts
+3. Missing /health endpoint (Docker health check will fail)
+4. API key configured but not enforced
+---
+## Alternatives Considered
+- Could have done automated static analysis only (rejected: misses architectural issues)
+- Could have focused only on security (rejected: need holistic view)
+- Could have waited for more stabilization (rejected: timely feedback valuable)
+---
+## Consequences
+### Immediate Actions Required (This Sprint)
+1. Fix provenance tracking in radial-distiller.ts
+2. Add /health endpoint
+3. Add path validation to /v1/system/paths
+### Short-Term (Next Month)
+4. Add missing tests (radial distiller, mirror protocol, security tests)
+5. Enforce API key on admin routes
+6. Standardize logging (replace console.log with StructuredLogger)
+### Long-Term (Next Quarter)
+7. Add rate limiting
+8. Implement streaming results (SSE)
+9. Add performance profiling
+### Technical Debt: 10 items, ~33 hours total
+### Agent System: 5 agents well-configured, 3 missing (performance-profiler, security-scanner, release-manager)
+---
+## Related Decisions
+- Standard 132: Adaptive Concurrency
+- Standard 127/134/135: Memory Management
+- Standard 051: Ephemeral Index
+- MCP Write Operations (v4.8.0)
+---
+## Impact
+This review provides:
+- Clear prioritization of fixes
+- Quantified technical debt
+- Roadmap for next 3-6 months
+- Agent system expansion suggestions
+---
+## Verification Checklist
+- [x] All path fixes verified in source code
+- [x] Security review completed
+- [x] Test coverage analyzed
+- [x] Documentation audited
+- [x] Agent configurations reviewed
+---
+## Files Reviewed
+### Core Configuration
+- `engine/src/config/paths.ts` ✅
+- `engine/src/config/index.ts` ✅
+### Services
+- `engine/src/services/distillation/radial-distiller.ts` ✅
+- `engine/src/services/ingest/watchdog.ts` ✅
+- `engine/src/services/mirror/mirror.ts` ✅
+### Routes
+- `engine/src/routes/v1/system.ts` ✅
+### Documentation
+- `README.md` ✅
+- `CHANGELOG.md` ✅
+- `docs/API.md` ✅
+- `docs/DEPLOYMENT.md` ✅
+- `tests/README.md` ✅
+### Configuration
+- `user_settings.json` ✅
+- `.gitignore` ✅
+- `Dockerfile` ✅
+- `package.json` ✅
+### Agent System
+- `.qwen/agents/code-reviewer.md` ✅
+- `.qwen/agents/test-runner.md` ✅
+- `.qwen/agents/doc-writer.md` ✅
+- `.qwen/agents/bug-triage.md` ✅
+- `.qwen/agents/anchor-researcher.md` ✅
+---
+*This Decision Record should be ingested into Anchor Engine when MCP is enabled.*