brainlayer 1.0.0__tar.gz

Files changed (133)
  1. brainlayer-1.0.0/.github/workflows/ci.yml +55 -0
  2. brainlayer-1.0.0/.github/workflows/docs.yml +32 -0
  3. brainlayer-1.0.0/.github/workflows/publish.yml +20 -0
  4. brainlayer-1.0.0/.gitignore +55 -0
  5. brainlayer-1.0.0/CHANGELOG.md +19 -0
  6. brainlayer-1.0.0/CLAUDE.md +456 -0
  7. brainlayer-1.0.0/CONTRIBUTING.md +100 -0
  8. brainlayer-1.0.0/LICENSE +190 -0
  9. brainlayer-1.0.0/PKG-INFO +313 -0
  10. brainlayer-1.0.0/README.md +255 -0
  11. brainlayer-1.0.0/docs/architecture.md +141 -0
  12. brainlayer-1.0.0/docs/archive/chat-based-analysis-design.md +149 -0
  13. brainlayer-1.0.0/docs/archive/chat-tags-example.yaml +14 -0
  14. brainlayer-1.0.0/docs/archive/communication-analysis.md +413 -0
  15. brainlayer-1.0.0/docs/archive/hierarchical-clustering-deep-research.md +508 -0
  16. brainlayer-1.0.0/docs/archive/hierarchical-clustering-research.txt +181 -0
  17. brainlayer-1.0.0/docs/archive/research-chat-based-analysis.md +146 -0
  18. brainlayer-1.0.0/docs/archive/research-prompt-embeddings.md +44 -0
  19. brainlayer-1.0.0/docs/assets/logo.svg +75 -0
  20. brainlayer-1.0.0/docs/configuration.md +64 -0
  21. brainlayer-1.0.0/docs/contributing.md +10 -0
  22. brainlayer-1.0.0/docs/data-locations.md +135 -0
  23. brainlayer-1.0.0/docs/embedding-setup.md +22 -0
  24. brainlayer-1.0.0/docs/enrichment-runbook.md +179 -0
  25. brainlayer-1.0.0/docs/enrichment.md +102 -0
  26. brainlayer-1.0.0/docs/index.md +62 -0
  27. brainlayer-1.0.0/docs/local-models-guide.md +160 -0
  28. brainlayer-1.0.0/docs/mcp-config.md +42 -0
  29. brainlayer-1.0.0/docs/mcp-tools.md +206 -0
  30. brainlayer-1.0.0/docs/quickstart.md +128 -0
  31. brainlayer-1.0.0/docs/research/launch-readiness-audit.md +232 -0
  32. brainlayer-1.0.0/docs/research/session-enrichment/README.md +43 -0
  33. brainlayer-1.0.0/docs/research/session-enrichment/auto-extracting-learnings.md +98 -0
  34. brainlayer-1.0.0/docs/research/session-enrichment/brainstore-write-tools.md +33 -0
  35. brainlayer-1.0.0/docs/research/session-enrichment/cli-ux-patterns.md +261 -0
  36. brainlayer-1.0.0/docs/research/session-enrichment/conversation-reconstruction.md +80 -0
  37. brainlayer-1.0.0/docs/research/session-enrichment/prompts.md +197 -0
  38. brainlayer-1.0.0/docs/research/session-enrichment/session-enrichment-architecture.md +277 -0
  39. brainlayer-1.0.0/docs/showcase-claude-collab-discovery.md +102 -0
  40. brainlayer-1.0.0/docs/stylesheets/extra.css +37 -0
  41. brainlayer-1.0.0/mkdocs.yml +58 -0
  42. brainlayer-1.0.0/pyproject.toml +107 -0
  43. brainlayer-1.0.0/scripts/backfill-context.py +102 -0
  44. brainlayer-1.0.0/scripts/backfill-created-at.py +314 -0
  45. brainlayer-1.0.0/scripts/backfill-metadata.py +142 -0
  46. brainlayer-1.0.0/scripts/backfill_data/.gitignore +3 -0
  47. brainlayer-1.0.0/scripts/classify-all.py +141 -0
  48. brainlayer-1.0.0/scripts/cloud_backfill.py +757 -0
  49. brainlayer-1.0.0/scripts/cloud_stream.py +411 -0
  50. brainlayer-1.0.0/scripts/generate_style_card.py +134 -0
  51. brainlayer-1.0.0/scripts/index_youtube.py +696 -0
  52. brainlayer-1.0.0/scripts/label-chunks.py +272 -0
  53. brainlayer-1.0.0/scripts/launchd/com.brainlayer.enrich.plist +45 -0
  54. brainlayer-1.0.0/scripts/launchd/com.brainlayer.index.plist +36 -0
  55. brainlayer-1.0.0/scripts/launchd/install.sh +82 -0
  56. brainlayer-1.0.0/scripts/pre-label.py +216 -0
  57. brainlayer-1.0.0/scripts/reembed_bge_m3.py +158 -0
  58. brainlayer-1.0.0/scripts/run-second-analysis-after-first.sh +31 -0
  59. brainlayer-1.0.0/scripts/test_extraction.py +126 -0
  60. brainlayer-1.0.0/scripts/train-setfit.py +161 -0
  61. brainlayer-1.0.0/scripts/verify_hebrew_search.py +91 -0
  62. brainlayer-1.0.0/scripts/vertex_poll_import.py +279 -0
  63. brainlayer-1.0.0/server.json +32 -0
  64. brainlayer-1.0.0/src/brainlayer/__init__.py +3 -0
  65. brainlayer-1.0.0/src/brainlayer/cli/__init__.py +1545 -0
  66. brainlayer-1.0.0/src/brainlayer/cli/wizard.py +132 -0
  67. brainlayer-1.0.0/src/brainlayer/cli_new.py +151 -0
  68. brainlayer-1.0.0/src/brainlayer/client.py +164 -0
  69. brainlayer-1.0.0/src/brainlayer/clustering.py +736 -0
  70. brainlayer-1.0.0/src/brainlayer/daemon.py +1105 -0
  71. brainlayer-1.0.0/src/brainlayer/dashboard/README.md +129 -0
  72. brainlayer-1.0.0/src/brainlayer/dashboard/__init__.py +5 -0
  73. brainlayer-1.0.0/src/brainlayer/dashboard/app.py +151 -0
  74. brainlayer-1.0.0/src/brainlayer/dashboard/search.py +229 -0
  75. brainlayer-1.0.0/src/brainlayer/dashboard/views.py +230 -0
  76. brainlayer-1.0.0/src/brainlayer/embeddings.py +131 -0
  77. brainlayer-1.0.0/src/brainlayer/engine.py +550 -0
  78. brainlayer-1.0.0/src/brainlayer/index_new.py +87 -0
  79. brainlayer-1.0.0/src/brainlayer/mcp/__init__.py +1558 -0
  80. brainlayer-1.0.0/src/brainlayer/migrate.py +205 -0
  81. brainlayer-1.0.0/src/brainlayer/paths.py +43 -0
  82. brainlayer-1.0.0/src/brainlayer/pipeline/__init__.py +47 -0
  83. brainlayer-1.0.0/src/brainlayer/pipeline/analyze_communication.py +508 -0
  84. brainlayer-1.0.0/src/brainlayer/pipeline/brain_graph.py +567 -0
  85. brainlayer-1.0.0/src/brainlayer/pipeline/chat_tags.py +63 -0
  86. brainlayer-1.0.0/src/brainlayer/pipeline/chunk.py +422 -0
  87. brainlayer-1.0.0/src/brainlayer/pipeline/classify.py +472 -0
  88. brainlayer-1.0.0/src/brainlayer/pipeline/cluster_sampling.py +73 -0
  89. brainlayer-1.0.0/src/brainlayer/pipeline/enrichment.py +810 -0
  90. brainlayer-1.0.0/src/brainlayer/pipeline/extract.py +66 -0
  91. brainlayer-1.0.0/src/brainlayer/pipeline/extract_claude_desktop.py +149 -0
  92. brainlayer-1.0.0/src/brainlayer/pipeline/extract_corrections.py +231 -0
  93. brainlayer-1.0.0/src/brainlayer/pipeline/extract_markdown.py +195 -0
  94. brainlayer-1.0.0/src/brainlayer/pipeline/extract_whatsapp.py +227 -0
  95. brainlayer-1.0.0/src/brainlayer/pipeline/git_overlay.py +301 -0
  96. brainlayer-1.0.0/src/brainlayer/pipeline/longitudinal_analyzer.py +568 -0
  97. brainlayer-1.0.0/src/brainlayer/pipeline/obsidian_export.py +455 -0
  98. brainlayer-1.0.0/src/brainlayer/pipeline/operation_grouping.py +486 -0
  99. brainlayer-1.0.0/src/brainlayer/pipeline/plan_linking.py +313 -0
  100. brainlayer-1.0.0/src/brainlayer/pipeline/sanitize.py +549 -0
  101. brainlayer-1.0.0/src/brainlayer/pipeline/semantic_style.py +574 -0
  102. brainlayer-1.0.0/src/brainlayer/pipeline/session_enrichment.py +472 -0
  103. brainlayer-1.0.0/src/brainlayer/pipeline/style_embed.py +67 -0
  104. brainlayer-1.0.0/src/brainlayer/pipeline/style_index.py +139 -0
  105. brainlayer-1.0.0/src/brainlayer/pipeline/temporal_chains.py +203 -0
  106. brainlayer-1.0.0/src/brainlayer/pipeline/time_batcher.py +248 -0
  107. brainlayer-1.0.0/src/brainlayer/pipeline/unified_timeline.py +569 -0
  108. brainlayer-1.0.0/src/brainlayer/storage.py +66 -0
  109. brainlayer-1.0.0/src/brainlayer/store.py +155 -0
  110. brainlayer-1.0.0/src/brainlayer/taxonomy.json +80 -0
  111. brainlayer-1.0.0/src/brainlayer/vector_store.py +1891 -0
  112. brainlayer-1.0.0/tests/__init__.py +1 -0
  113. brainlayer-1.0.0/tests/conftest.py +15 -0
  114. brainlayer-1.0.0/tests/test_brainstore.py +321 -0
  115. brainlayer-1.0.0/tests/test_chat_list.py +27 -0
  116. brainlayer-1.0.0/tests/test_chunk.py +110 -0
  117. brainlayer-1.0.0/tests/test_chunker.py +91 -0
  118. brainlayer-1.0.0/tests/test_classify.py +102 -0
  119. brainlayer-1.0.0/tests/test_dashboard.py +61 -0
  120. brainlayer-1.0.0/tests/test_engine.py +416 -0
  121. brainlayer-1.0.0/tests/test_enrichment_threshold.py +26 -0
  122. brainlayer-1.0.0/tests/test_extract_markdown.py +235 -0
  123. brainlayer-1.0.0/tests/test_normalize_project.py +58 -0
  124. brainlayer-1.0.0/tests/test_paths.py +53 -0
  125. brainlayer-1.0.0/tests/test_phase2.py +289 -0
  126. brainlayer-1.0.0/tests/test_phase3_qa.py +398 -0
  127. brainlayer-1.0.0/tests/test_sanitize.py +382 -0
  128. brainlayer-1.0.0/tests/test_semantic_style.py +315 -0
  129. brainlayer-1.0.0/tests/test_session_enrichment.py +642 -0
  130. brainlayer-1.0.0/tests/test_storage.py +62 -0
  131. brainlayer-1.0.0/tests/test_think_recall_integration.py +284 -0
  132. brainlayer-1.0.0/tests/test_vector_store.py +148 -0
  133. brainlayer-1.0.0/tests/test_wizard.py +25 -0
@@ -0,0 +1,55 @@
+ name: CI
+
+ on:
+   push:
+     branches: [main]
+   pull_request:
+     branches: [main]
+
+ jobs:
+   test:
+     runs-on: ubuntu-latest
+     strategy:
+       matrix:
+         python-version: ["3.11", "3.12", "3.13"]
+     steps:
+       - uses: actions/checkout@v4
+
+       - uses: actions/setup-python@v5
+         with:
+           python-version: ${{ matrix.python-version }}
+
+       - name: Cache pip
+         uses: actions/cache@v4
+         with:
+           path: ~/.cache/pip
+           key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('pyproject.toml') }}
+           restore-keys: ${{ runner.os }}-pip-${{ matrix.python-version }}-
+
+       - name: Install
+         run: pip install -e ".[dev]"
+
+       - name: Unit tests
+         run: pytest tests/ -v --tb=short -m "not integration" -x
+
+       - name: MCP tool registration
+         run: pytest tests/test_think_recall_integration.py::TestMCPToolCount -v --tb=short
+
+   lint:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.13"
+
+       - name: Cache pip
+         uses: actions/cache@v4
+         with:
+           path: ~/.cache/pip
+           key: ${{ runner.os }}-pip-lint-${{ hashFiles('pyproject.toml') }}
+
+       - run: pip install ruff
+       - run: ruff check src/ tests/
+       - run: ruff format --check src/ tests/
@@ -0,0 +1,32 @@
+ name: Deploy Docs
+
+ on:
+   push:
+     branches: [main]
+     paths:
+       - "docs/**"
+       - "mkdocs.yml"
+   workflow_dispatch:
+
+ permissions:
+   contents: write
+
+ jobs:
+   deploy:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+
+       - name: Cache pip
+         uses: actions/cache@v4
+         with:
+           path: ~/.cache/pip
+           key: ${{ runner.os }}-pip-docs-${{ hashFiles('pyproject.toml') }}
+
+       - run: pip install mkdocs-material pymdownx-blocks
+
+       - run: mkdocs gh-deploy --force
@@ -0,0 +1,20 @@
+ name: Publish to PyPI
+
+ on:
+   push:
+     tags: ["v*"]
+
+ jobs:
+   publish:
+     runs-on: ubuntu-latest
+     environment: pypi
+     permissions:
+       id-token: write
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-python@v5
+         with:
+           python-version: "3.12"
+       - run: pip install build
+       - run: python -m build
+       - uses: pypa/gh-action-pypi-publish@release/v1
@@ -0,0 +1,55 @@
1
+ # Python
2
+ __pycache__/
3
+ *.py[cod]
4
+ *$py.class
5
+ *.so
6
+ .venv/
7
+ venv/
8
+ env/
9
+
10
+ # Build
11
+ dist/
12
+ build/
13
+ *.egg-info/
14
+ *.egg
15
+ site/
16
+
17
+ # Testing
18
+ .pytest_cache/
19
+ .ruff_cache/
20
+ htmlcov/
21
+ .coverage
22
+ .tox/
23
+
24
+ # Data
25
+ *.db
26
+ *.db-wal
27
+ *.db-shm
28
+
29
+ # Secrets
30
+ .env
31
+ .env.local
32
+ .env.*.local
33
+
34
+ # IDE
35
+ .idea/
36
+ .vscode/
37
+ *.swp
38
+ *.swo
39
+ *~
40
+
41
+ # Local working files (audit logs, prompts, scratch)
42
+ docs.local/
43
+
44
+ # Internal planning docs (contain cross-repo references)
45
+ docs/plan/
46
+
47
+ # OS
48
+ .DS_Store
49
+ Thumbs.db
50
+
51
+ # Lock files (reproducible installs handled via pyproject.toml)
52
+ uv.lock
53
+
54
+ # Claude scratchpad
55
+ claude.scratchpad.md
@@ -0,0 +1,19 @@
1
+ # Changelog
2
+
3
+ ## [1.0.0] - 2026-02-19
4
+
5
+ ### Added
6
+ - Initial open-source release as BrainLayer (formerly Zikaron)
7
+ - Semantic search across AI conversation history (sqlite-vec + bge-large-en-v1.5)
8
+ - 10-field LLM enrichment pipeline (Ollama / MLX backends)
9
+ - Brain graph visualization (HDBSCAN clustering + UMAP 3D layout)
10
+ - MCP server with 8 tools for Claude Code, Zed, Cursor
11
+ - Interactive setup wizard (`brainlayer init`)
12
+ - Centralized artifact storage (`~/.local/share/brainlayer/storage/`)
13
+ - Multi-source indexing: Claude Code, WhatsApp, YouTube, Markdown, Claude Desktop
14
+ - Communication style analysis pipeline
15
+ - Obsidian vault export
16
+ - FastAPI daemon with 25+ HTTP endpoints
17
+ - GitHub Actions CI/CD with PyPI publishing
18
+ - PII sanitization pipeline for safe cloud processing
19
+ - Source-aware enrichment thresholds
@@ -0,0 +1,456 @@
1
+ # BrainLayer (זיכרון) - Local Knowledge Pipeline
2
+
3
+ > **Memory** for Claude Code conversations. Index, search, retrieve, and visualize knowledge from past coding sessions.
4
+
5
+ ---
6
+
7
+ ## Quick Start
8
+
9
+ ```bash
10
+ cd ~/projects/brainlayer
11
+ python3 -m venv .venv && source .venv/bin/activate
12
+ pip install -e ".[dev]"
13
+
14
+ # Index conversations
15
+ brainlayer index
16
+
17
+ # Start daemon (for dashboard + fast searches)
18
+ brainlayer serve --http 8787
19
+
20
+ # Search
21
+ brainlayer search "how did I implement authentication"
22
+
23
+ # Enrich with local LLM
24
+ brainlayer enrich
25
+ ```
26
+
27
+ ---
28
+
29
+ ## Architecture (Feb 2026 - sqlite-vec)
30
+
31
+ ```
+ ~/.claude/projects/  # Source: Claude Code conversations (JSONL)
+
+ ┌─────────────────────────────────────────────────────────────┐
+ │                          PIPELINE                           │
+ │   ┌─────────┐  ┌──────────┐  ┌───────┐  ┌───────┐  ┌───────┐│
+ │   │ Extract │→ │ Classify │→ │ Chunk │→ │ Embed │→ │ Index ││
+ │   └─────────┘  └──────────┘  └───────┘  └───────┘  └───────┘│
+ │                                         bge-large sqlite-vec│
+ │                                         1024 dims fast DB   │
+ └─────────────────────────────────────────────────────────────┘
+
+ ~/.local/share/brainlayer/brainlayer.db  # Storage: sqlite-vec (~1.4GB, 260K+ chunks)
+
+ ┌─────────────────────────────────────────────────────────────┐
+ │                       POST-PROCESSING                       │
+ │   ┌───────────┐  ┌──────────────┐  ┌────────────┐           │
+ │   │ Enrichment│  │ Brain Graph  │  │ Obsidian   │           │
+ │   │ (GLM-4.7) │  │ (clustering) │  │ Export     │           │
+ │   └───────────┘  └──────────────┘  └────────────┘           │
+ └─────────────────────────────────────────────────────────────┘
+
+ ┌─────────────────────────────────────────────────────────────┐
+ │                         INTERFACES                          │
+ │  ┌───────┐  ┌──────────────┐  ┌─────────────┐  ┌───────────┐│
+ │  │ CLI   │  │ FastAPI      │  │ MCP Server  │  │ Dashboard ││
+ │  │       │  │ Daemon       │  │ brainlayer- │  │ (Next.js) ││
+ │  │       │  │ :8787/socket │  │ mcp         │  │ :3000     ││
+ │  └───────┘  └──────────────┘  └─────────────┘  └───────────┘│
+ └─────────────────────────────────────────────────────────────┘
+ ```
62
+
63
+ > **Storage:** sqlite-vec with bge-large-en-v1.5 embeddings (1024 dims). WAL mode + busy_timeout=5000ms for concurrent access.
64
+
65
+ ---
66
+
67
+ ## File Structure
68
+
69
+ ```
70
+ brainlayer/
71
+ ├── src/brainlayer/
72
+ │ ├── __init__.py
73
+ │ ├── cli/ # CLI interface (typer)
74
+ │ │ └── __init__.py # All CLI commands
75
+ │ ├── cli_new.py # New unified CLI (in progress)
76
+ │ ├── client.py # Python client for daemon API
77
+ │ ├── clustering.py # Topic clustering (HDBSCAN + UMAP)
78
+ │ ├── daemon.py # FastAPI HTTP daemon (25+ endpoints)
79
+ │ ├── embeddings.py # bge-large-en-v1.5 embedding model
80
+ │ ├── index_new.py # Unified indexer (batch + progress)
81
+ │ ├── migrate.py # DB schema migrations
82
+ │ ├── vector_store.py # sqlite-vec storage layer
83
+ │ ├── dashboard/ # Built-in TUI dashboard (textual)
84
+ │ │ ├── app.py
85
+ │ │ ├── search.py
86
+ │ │ └── views.py
87
+ │ ├── mcp/ # MCP server (8 tools)
88
+ │ │ └── __init__.py
89
+ │ └── pipeline/ # Processing stages
90
+ │ ├── extract.py # Stage 1: Parse JSONL conversations
91
+ │ ├── extract_whatsapp.py # WhatsApp chat import
92
+ │ ├── extract_markdown.py # Markdown file import
93
+ │ ├── extract_claude_desktop.py # Claude Desktop import
94
+ │ ├── extract_corrections.py # Correction detection
95
+ │ ├── classify.py # Stage 2: Content classification
96
+ │ ├── chunk.py # Stage 3: AST-aware chunking
97
+ │ ├── enrichment.py # LLM enrichment (summaries, tags, importance)
98
+ │ ├── session_enrichment.py # Session-level LLM analysis (Phase 7)
99
+ │ ├── brain_graph.py # Brain graph generation (nodes + edges)
100
+ │ ├── obsidian_export.py # Obsidian vault export
101
+ │ ├── operation_grouping.py # read→edit→test cycle detection
102
+ │ ├── plan_linking.py # Session → plan/phase linking
103
+ │ ├── temporal_chains.py # Topic chain detection
104
+ │ ├── git_overlay.py # Git diff enrichment
105
+ │ ├── semantic_style.py # Communication style analysis
106
+ │ ├── analyze_communication.py # Evolution analysis
107
+ │ ├── cluster_sampling.py # Cluster-based sampling
108
+ │ ├── style_embed.py # Style embedding
109
+ │ ├── style_index.py # Style indexing
110
+ │ ├── unified_timeline.py # Cross-source timeline
111
+ │ ├── time_batcher.py # Temporal batching
112
+ │ ├── longitudinal_analyzer.py # Long-term trend analysis
113
+ │ └── chat_tags.py # Chat tag extraction
114
+ ├── tests/
115
+ ├── pyproject.toml
116
+ ├── CLAUDE.md # This file
117
+ └── prd-json/ # Ralph PRD (if using Ralph)
118
+ ```
119
+
120
+ ---
121
+
122
+ ## CLI Commands
123
+
124
+ ### Core
125
+
126
+ ```bash
127
+ brainlayer index # Index all conversations
128
+ brainlayer index --project my-project # Index specific project
129
+ brainlayer index-fast # Fast incremental index
130
+
131
+ brainlayer search "authentication" # Semantic search
132
+ brainlayer search "config.py" --text # Exact text match
133
+
134
+ brainlayer stats # Knowledge base statistics
135
+ brainlayer clear --yes # Clear database
136
+ ```
137
+
138
+ ### Daemon & Server
139
+
140
+ ```bash
141
+ brainlayer serve # Start daemon (Unix socket)
142
+ brainlayer serve --http 8787 # Start daemon (HTTP mode for dashboard)
143
+ ```
144
+
145
+ ### Enrichment
146
+
147
+ ```bash
148
+ brainlayer enrich # Run LLM enrichment (GLM-4.7-Flash via Ollama)
149
+ brainlayer enrich --batch-size 50 # Custom batch size
150
+
151
+ brainlayer enrich-sessions # Session-level LLM analysis
152
+ brainlayer enrich-sessions --project golems --since 2026-01-01
153
+ brainlayer enrich-sessions --stats # Show session enrichment stats
154
+ ```
155
+
156
+ **Chunk enrichment** adds a summary, tags, an importance score (1-10), and an intent classification to each chunk. It uses local GLM-4.7-Flash with `"think": false` for speed (~1s per short chunk, ~13s per long chunk).
157
+
158
+ **Session enrichment** (Phase 7) analyzes full conversations to extract: session summary, decisions, corrections, learnings, mistakes, patterns, outcome, quality scores.
159
+
160
+ ### Analysis & Export
161
+
162
+ ```bash
163
+ brainlayer git-overlay # Enrich with git diff context
164
+ brainlayer group-operations # Detect read→edit→test cycles
165
+ brainlayer topic-chains # Find topic continuity across sessions
166
+ brainlayer plan-linking # Link sessions to plans/phases
167
+ brainlayer brain-export # Generate brain graph JSON
168
+ brainlayer export-obsidian # Export to Obsidian vault
169
+ ```
170
+
171
+ ### Style Analysis
172
+
173
+ ```bash
174
+ brainlayer analyze-style # Quick WhatsApp style analysis
175
+ brainlayer analyze-evolution --use-embeddings -c ~/export.json -o data/archives/style-$(date +%Y-%m-%d)
176
+ brainlayer analyze-semantic # Semantic style profiling
177
+ brainlayer list-chats # List indexed chat sources
178
+ ```
179
+
180
+ ### Utilities
181
+
182
+ ```bash
183
+ brainlayer context <chunk_id> # Get surrounding context
184
+ brainlayer review <session_id> # Review a session
185
+ brainlayer fix-projects # Normalize project names
186
+ brainlayer migrate # Run DB migrations
187
+ brainlayer dashboard # Interactive TUI dashboard
188
+ ```
189
+
190
+ ---
191
+
192
+ ## Daemon HTTP Endpoints
193
+
194
+ The daemon (`brainlayer serve --http 8787`) exposes a FastAPI server used by the Next.js dashboard:
195
+
196
+ ### Health & Stats
197
+
198
+ | Endpoint | Method | Description |
199
+ |----------|--------|-------------|
200
+ | `/health` | GET | Health check |
201
+ | `/health/services` | GET | Service status (daemon, enrichment, Ollama) |
202
+ | `/stats` | GET | Knowledge base stats (chunks, projects, types) |
203
+ | `/stats/tokens` | GET | LLM token usage + costs |
204
+ | `/stats/enrichment` | GET | Enrichment progress (enriched vs total) |
205
+ | `/stats/service-runs` | GET | Recent service run logs |
206
+
207
+ ### Search & Context
208
+
209
+ | Endpoint | Method | Description |
210
+ |----------|--------|-------------|
211
+ | `/search` | POST | Semantic + keyword search |
212
+ | `/context/{chunk_id}` | GET | Surrounding chunks for a result |
213
+ | `/dashboard/search` | GET | Dashboard search (GET-friendly) |
214
+ | `/session/{session_id}` | GET | Full session detail |
215
+
216
+ ### Brain Graph
217
+
218
+ | Endpoint | Method | Description |
219
+ |----------|--------|-------------|
220
+ | `/brain/graph` | GET | Full brain graph (nodes + edges) |
221
+ | `/brain/metadata` | GET | Graph metadata (node count, clusters) |
222
+ | `/brain/node/{node_id}` | GET | Single node details |
223
+
224
+ ### Content & Backlog
225
+
226
+ | Endpoint | Method | Description |
227
+ |----------|--------|-------------|
228
+ | `/content/pipeline-runs` | GET | Content pipeline execution logs |
229
+ | `/content/pipeline-stats` | GET | Pipeline routing stats |
230
+ | `/backlog/items` | GET | Backlog items (Kanban board) |
231
+ | `/backlog/items` | POST | Create backlog item |
232
+ | `/backlog/items/{id}` | PATCH | Update backlog item |
233
+ | `/backlog/items/{id}` | DELETE | Delete backlog item |
234
+
235
+ ### Events
236
+
237
+ | Endpoint | Method | Description |
238
+ |----------|--------|-------------|
239
+ | `/events/recent` | GET | Recent golem events |
240
+
241
+ ---
242
+
243
+ ## MCP Server (14 Tools)
244
+
245
+ Add to `~/.claude/settings.json`:
246
+
247
+ ```json
248
+ {
249
+ "mcpServers": {
250
+ "brainlayer": {
251
+ "command": "brainlayer-mcp",
252
+ "args": []
253
+ }
254
+ }
255
+ }
256
+ ```
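The settings entry above can also be added programmatically. The following is a minimal sketch, assuming the settings file is plain JSON and that other keys should be preserved; `add_mcp_server` is a hypothetical helper for illustration, not part of BrainLayer.

```python
import json
from pathlib import Path


def add_mcp_server(settings_path: Path, name: str = "brainlayer",
                   command: str = "brainlayer-mcp") -> dict:
    """Merge one MCP server entry into a JSON settings file (hypothetical helper)."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        # Treat an empty file as an empty settings object.
        settings = json.loads(text) if text.strip() else {}
    # Keep any servers that are already configured.
    servers = settings.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": []}
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

Calling `add_mcp_server(Path.home() / ".claude" / "settings.json")` would write the same structure as the JSON snippet while leaving unrelated keys untouched.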
257
+
258
+ ### Available Tools
259
+
260
+ | Tool | Required Params | Optional Params | Description |
261
+ |------|----------------|-----------------|-------------|
262
+ | `brainlayer_search` | `query` | `project`, `content_type`, `num_results`, `source`, `tag`, `intent`, `importance_min` | Search past conversations |
263
+ | `brainlayer_stats` | — | — | Knowledge base statistics |
264
+ | `brainlayer_list_projects` | — | — | List indexed projects |
265
+ | `brainlayer_context` | `chunk_id` | `before` (3), `after` (3) | Surrounding context for a result |
266
+ | `brainlayer_file_timeline` | `file_path` | `project`, `limit` (50) | File interaction history across sessions |
267
+ | `brainlayer_operations` | `session_id` | — | Logical operation groups (read→edit→test) |
268
+ | `brainlayer_regression` | `file_path` | `project` | Regression analysis (what changed since last success) |
269
+ | `brainlayer_plan_links` | — | `plan_name`, `session_id`, `project` | Session ↔ plan linkage |
270
+ | `brainlayer_session_summary` | `session_id` | — | Session enrichment: decisions, corrections, learnings |
271
+
272
+ ### Search Parameters
273
+
274
+ - **`source`**: `claude_code` (default), `whatsapp`, `youtube`, `all`
275
+ - **`content_type`**: `ai_code`, `stack_trace`, `user_message`, `assistant_text`, `file_read`, `git_diff`
276
+ - **`intent`**: `debugging`, `designing`, `configuring`, `discussing`, `deciding`, `implementing`, `reviewing`
277
+ - **`importance_min`**: 1-10 (from enrichment)
278
+ - **`tag`**: enrichment-generated tags (e.g., `bug-fix`, `authentication`, `typescript`)
279
+
280
+ ---
281
+
282
+ ## Enrichment Pipeline
283
+
284
+ Local LLM enrichment adds structured metadata to each chunk. Think of it as a librarian cataloging every conversation snippet — what it's about, how important it is, and how to find it later.
285
+
286
+ ### Fields (10 total)
287
+
288
+ | Field | What it captures | Example |
289
+ |-------|-----------------|---------|
290
+ | `summary` | 1-2 sentence gist | "Debugging why Telegram bot drops messages under load" |
291
+ | `tags` | Topic tags (comma-separated) | "telegram, debugging, performance" |
292
+ | `importance` | 1-10 relevance score | 8 (architectural decision) vs 2 (directory listing) |
293
+ | `intent` | What was happening | `debugging`, `designing`, `implementing`, `configuring`, `discussing`, `deciding`, `reviewing` |
294
+ | `primary_symbols` | Key code entities | "TelegramBot, handleMessage, grammy" |
295
+ | `resolved_query` | Question this answers (HyDE-style) | "How does the Telegram bot handle rate limiting?" |
296
+ | `epistemic_level` | How proven is this | `hypothesis`, `substantiated`, `validated` |
297
+ | `version_scope` | What version/system state | "grammy 1.32, Node 22, pre-Railway migration" |
298
+ | `debt_impact` | Technical debt signal | `introduction`, `resolution`, `none` |
299
+ | `external_deps` | Libraries/APIs mentioned | "grammy, Supabase, Railway" |
300
+
301
+ The first 4 fields have been populated for ~11.6K chunks via local Ollama. The remaining 6 fields await cloud backfill (Gemini Batch API, ~$16 for all 251K chunks).
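To make the ten fields concrete, here is a sketch of how one enrichment record could be modelled and validated. The `ChunkEnrichment` class and `split_csv` helper are hypothetical names for illustration only; the allowed values come from the table above, and the choice to make the last six fields optional mirrors the note that they are not yet populated.

```python
from dataclasses import dataclass, field
from typing import Optional

INTENTS = {"debugging", "designing", "implementing", "configuring",
           "discussing", "deciding", "reviewing"}
EPISTEMIC_LEVELS = {"hypothesis", "substantiated", "validated"}
DEBT_IMPACTS = {"introduction", "resolution", "none"}


def split_csv(value: str) -> list[str]:
    """Turn a comma-separated string such as 'a, b' into ['a', 'b']."""
    return [part.strip() for part in value.split(",") if part.strip()]


@dataclass
class ChunkEnrichment:
    # First four fields: populated by local enrichment.
    summary: str
    tags: list[str]
    importance: int
    intent: str
    # Remaining six fields: optional until backfilled.
    primary_symbols: list[str] = field(default_factory=list)
    resolved_query: Optional[str] = None
    epistemic_level: Optional[str] = None
    version_scope: Optional[str] = None
    debt_impact: Optional[str] = None
    external_deps: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 1 <= self.importance <= 10:
            raise ValueError("importance must be between 1 and 10")
        if self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent!r}")
        if self.epistemic_level is not None and self.epistemic_level not in EPISTEMIC_LEVELS:
            raise ValueError(f"unknown epistemic_level: {self.epistemic_level!r}")
        if self.debt_impact is not None and self.debt_impact not in DEBT_IMPACTS:
            raise ValueError(f"unknown debt_impact: {self.debt_impact!r}")
```

Validating at construction time keeps malformed LLM output from reaching the database; how BrainLayer itself validates responses is not shown in this file.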
302
+
303
+ ### Backends
304
+
305
+ Two local LLM backends are available; use whichever suits your setup:
306
+
307
+ | Backend | How to start | Speed | Env var |
308
+ |---------|-------------|-------|---------|
309
+ | **Ollama** (default) | `ollama serve` + `ollama pull glm4` | ~1s/chunk (short), ~13s (long) | `BRAINLAYER_ENRICH_BACKEND=ollama` |
310
+ | **MLX** (Apple Silicon) | `python3 -m mlx_lm.server --model mlx-community/Qwen2.5-Coder-14B-Instruct-4bit --port 8080` | 21-87% faster than Ollama | `BRAINLAYER_ENRICH_BACKEND=mlx` |
311
+
312
+ Both work with the same enrichment pipeline — just set the env var and go.
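Backend selection by environment variable might look like the sketch below. Only the variable name and its two values come from the table; `resolve_backend`, the case-insensitive matching, and the error on unknown values are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional

SUPPORTED_BACKENDS = ("ollama", "mlx")


def resolve_backend(environ: Optional[Mapping[str, str]] = None) -> str:
    """Pick the enrichment backend, defaulting to Ollama (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # Assumption: surrounding whitespace and letter case are ignored.
    raw = env.get("BRAINLAYER_ENRICH_BACKEND", "ollama").strip().lower()
    if raw not in SUPPORTED_BACKENDS:
        raise ValueError(
            f"unsupported backend {raw!r}; expected one of {SUPPORTED_BACKENDS}"
        )
    return raw
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to test.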
313
+
314
+ ### Running Enrichment
315
+
316
+ ```bash
317
+ # Basic (50 chunks at a time, Ollama)
318
+ brainlayer enrich
319
+
320
+ # Bigger batches, MLX, parallel workers
321
+ BRAINLAYER_ENRICH_BACKEND=mlx brainlayer enrich --batch-size=100 --parallel=3
322
+
323
+ # Process up to 5000 chunks in one run
324
+ brainlayer enrich --max=5000
325
+
326
+ # Automated scheduling (checks queue, runs if needed)
327
+ ./scripts/auto-enrich.sh --threshold 500 --max-hours 3
328
+ ```
329
+
330
+ ### Cloud Backfill (one-time)
331
+
332
+ For the initial 251K chunk backfill, there's a Gemini Batch API script. See `docs/enrichment-runbook.md` for the full runbook.
333
+
334
+ ### Concurrency Notes
335
+
336
+ - **`PRAGMA busy_timeout = 5000`** — waits up to 5s for DB locks (daemon + MCP + enrichment can all access DB)
337
+ - **Retry logic** — 3 attempts with backoff on `SQLITE_BUSY`
338
+ - **Parallel mode** — each thread gets its own DB connection (thread-local VectorStore)
339
+ - **Ollama tip:** Set `"think": false` in API calls — GLM-4.7 defaults to thinking mode, adding 350+ tokens and 20s delay for no benefit
340
+ - **Background running:** `PYTHONUNBUFFERED=1` required for log visibility in background processes
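The first two notes can be sketched with Python's standard-library `sqlite3` module. This is an illustration only: the project states that it uses APSW with sqlite-vec, and the helper names, the retry delays, and the message-based busy check below are assumptions.

```python
import sqlite3
import time


def connect(db_path: str) -> sqlite3.Connection:
    """Open a connection configured for concurrent access."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode = WAL")    # readers do not block the writer
    conn.execute("PRAGMA busy_timeout = 5000")   # wait up to 5s for a lock
    return conn


def execute_with_retry(conn: sqlite3.Connection, sql: str, params: tuple = (),
                       attempts: int = 3, base_delay: float = 0.1) -> sqlite3.Cursor:
    """Run one statement, retrying with exponential backoff when the DB is busy."""
    for attempt in range(attempts):
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            busy = "locked" in message or "busy" in message
            if not busy or attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

The busy timeout handles short lock waits inside SQLite itself; the retry loop covers the case where the timeout still expires because another process holds the lock for longer.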
341
+
342
+ ---
343
+
344
+ ## Brain Graph
345
+
346
+ `brainlayer brain-export` generates a JSON file with:
347
+ - **Nodes:** One per session, with label, project, branch, plan, chunk count
348
+ - **Edges:** Connections between related sessions (shared files, topics, plans)
349
+ - **Clusters:** HDBSCAN clustering by topic similarity
350
+
351
+ The graph is used by the BrainLayer Dashboard 3D visualization (`react-force-graph-3d`) and can be uploaded to Supabase Storage for multi-tenant access.
352
+
353
+ ---
354
+
355
+ ## Obsidian Export
356
+
357
+ `brainlayer export-obsidian` generates a Markdown vault:
358
+ - One note per session with frontmatter (project, date, plan)
359
+ - Backlinks between related sessions
360
+ - Tag-based navigation
361
+ - Compatible with Obsidian graph view
362
+
363
+ ---
364
+
365
+ ## Data Locations
366
+
367
+ | Path | Purpose |
368
+ |------|---------|
369
+ | `~/.claude/projects/` | Source conversations (read-only) |
370
+ | `~/.local/share/brainlayer/brainlayer.db` | sqlite-vec database (~1.4GB, 260K+ chunks) |
371
+ | `~/.local/share/brainlayer/prompts/` | Deduplicated system prompts (SHA-256) |
372
+ | `/tmp/brainlayer.sock` | Daemon Unix socket |
373
+ | `/tmp/brainlayer-enrichment.lock` | Enrichment process lock file |
374
+
375
+ ---
376
+
377
+ ## Development
378
+
379
+ ```bash
380
+ # Install dev dependencies
381
+ uv pip install -e ".[dev]"
382
+
383
+ # Run tests
384
+ pytest
385
+
386
+ # Lint + format
387
+ ruff check src/ && ruff format src/
388
+ ```
389
+
390
+ ---
391
+
392
+ ## Pipeline Stages
393
+
394
+ ### Stage 1: Extract (`pipeline/extract.py`)
395
+ - Parse JSONL conversation files
396
+ - **Content-addressable storage** for system prompts (SHA-256 hash → dedupe)
397
+ - Detect conversation continuations (session ID + temporal proximity)
398
+ - Also: `extract_whatsapp.py`, `extract_markdown.py`, `extract_claude_desktop.py`
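Content-addressable storage for system prompts can be sketched as follows. The one-file-per-digest layout and the `store_prompt` / `load_prompt` names are assumptions for illustration; the source only states that prompts are deduplicated by SHA-256 hash.

```python
import hashlib
from pathlib import Path


def store_prompt(prompt: str, root: Path) -> str:
    """Write the prompt once under its SHA-256 digest and return the digest."""
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    root.mkdir(parents=True, exist_ok=True)
    target = root / f"{digest}.txt"   # hypothetical file layout
    if not target.exists():           # identical prompts map to the same file
        target.write_text(prompt, encoding="utf-8")
    return digest


def load_prompt(digest: str, root: Path) -> str:
    """Read a stored prompt back by its digest."""
    return (root / f"{digest}.txt").read_text(encoding="utf-8")
```

Each conversation then only needs to keep the digest, so a long system prompt repeated across thousands of sessions is stored a single time.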
399
+
400
+ ### Stage 2: Classify (`pipeline/classify.py`)
401
+
402
+ | Type | Value | Action |
403
+ |------|-------|--------|
404
+ | `ai_code` | HIGH | Preserve verbatim |
405
+ | `stack_trace` | HIGH | Preserve exact (never split) |
406
+ | `user_message` | HIGH | Preserve |
407
+ | `assistant_text` | MEDIUM | Preserve |
408
+ | `file_read` | MEDIUM | Context-dependent |
409
+ | `git_diff` | MEDIUM | Extract changed entities |
410
+ | `build_log` | LOW | Summarize or mask |
411
+ | `dir_listing` | LOW | Structure only |
412
+ | `noise` | SKIP | Filter out (progress, queue-operation) |
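The table above can be read as a lookup from content type to value tier. A minimal sketch, assuming a plain dictionary and a hypothetical `keep_for_indexing` filter (the real classifier in `classify.py` is not shown in this file):

```python
from enum import Enum


class Value(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    SKIP = "skip"


# Tiers copied from the classification table.
CONTENT_VALUE = {
    "ai_code": Value.HIGH,
    "stack_trace": Value.HIGH,
    "user_message": Value.HIGH,
    "assistant_text": Value.MEDIUM,
    "file_read": Value.MEDIUM,
    "git_diff": Value.MEDIUM,
    "build_log": Value.LOW,
    "dir_listing": Value.LOW,
    "noise": Value.SKIP,
}


def keep_for_indexing(content_type: str) -> bool:
    """Return False only for content the table marks as SKIP."""
    return CONTENT_VALUE[content_type] is not Value.SKIP
```

An unknown content type raises `KeyError` here, which surfaces classifier gaps early; a production pipeline might prefer a default tier instead.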
413
+
414
+ ### Stage 3: Chunk (`pipeline/chunk.py`)
415
+ - **AST-aware chunking** with tree-sitter for code (~500 tokens)
416
+ - **Never split** stack traces
417
+ - **Observation masking** for large tool outputs (`[N lines elided]`)
418
+ - Turn-based chunking for conversation with 10-20% overlap
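Two of the chunking rules above, observation masking and overlapping turn windows, can be sketched in a few lines. The function names, the head/tail sizes, and the choice to measure overlap in turns rather than tokens are assumptions for illustration; the AST-aware code chunking with tree-sitter is not covered here.

```python
def mask_observation(text: str, head: int = 20, tail: int = 10) -> str:
    """Keep the first and last lines of a large tool output, eliding the middle."""
    lines = text.splitlines()
    hidden = len(lines) - head - tail
    if hidden <= 0:
        return text  # short outputs are left untouched
    kept_tail = lines[len(lines) - tail:]
    return "\n".join(lines[:head] + [f"[{hidden} lines elided]"] + kept_tail)


def chunk_turns(turns: list[str], size: int = 8, overlap: float = 0.15) -> list[list[str]]:
    """Group conversation turns into windows that share roughly `overlap` of their turns."""
    shared = max(1, round(size * overlap))
    step = max(1, size - shared)
    chunks = []
    for start in range(0, len(turns), step):
        chunks.append(turns[start:start + size])
        if start + size >= len(turns):
            break  # the last window already reaches the end
    return chunks
```

With the defaults, consecutive windows share one turn out of eight, which sits inside the 10-20% range mentioned above.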
419
+
420
+ ### Stage 4: Embed (`embeddings.py`)
421
+ - **bge-large-en-v1.5** via sentence-transformers (local, private)
422
+ - 1024 dimensions, 63.5 MTEB score
423
+ - ~8s model load (vs 30s with Ollama)
424
+ - MPS acceleration on Apple Silicon
425
+
426
+ ### Stage 5: Index (`vector_store.py`)
427
+ - **sqlite-vec** with APSW (macOS compatible)
428
+ - WAL mode for concurrent reads
429
+ - `PRAGMA busy_timeout = 5000` for multi-process safety
430
+ - Metadata: project, content_type, source_file, char_count
431
+
432
+ ---
433
+
434
+ ## Communication Style Analysis
435
+
436
+ BrainLayer includes **communication pattern analysis** from WhatsApp, Claude, YouTube, and Gemini chats.
437
+
438
+ ### Latest Analysis Location
439
+ ```
440
+ data/archives/style-2026-01-31-2121/
441
+ ├── master-style-guide.md # Main style rules
442
+ ├── per-period/ # Style evolution over time
443
+ ```
444
+
445
+ ### Usage
446
+ ```bash
447
+ brainlayer analyze-evolution --use-embeddings -c ~/claude-export.json -o data/archives/style-$(date +%Y-%m-%d-%H%M) -y
448
+ brainlayer analyze-style # Quick WhatsApp-only
449
+ brainlayer analyze-semantic # Semantic style profiling
450
+ ```
451
+
452
+ ---
453
+
454
+ ## Naming
455
+
456
+ **BrainLayer** was formerly named Zikaron (זיכרון), Hebrew for "memory".