brainlayer 1.0.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- brainlayer-1.0.0/.github/workflows/ci.yml +55 -0
- brainlayer-1.0.0/.github/workflows/docs.yml +32 -0
- brainlayer-1.0.0/.github/workflows/publish.yml +20 -0
- brainlayer-1.0.0/.gitignore +55 -0
- brainlayer-1.0.0/CHANGELOG.md +19 -0
- brainlayer-1.0.0/CLAUDE.md +456 -0
- brainlayer-1.0.0/CONTRIBUTING.md +100 -0
- brainlayer-1.0.0/LICENSE +190 -0
- brainlayer-1.0.0/PKG-INFO +313 -0
- brainlayer-1.0.0/README.md +255 -0
- brainlayer-1.0.0/docs/architecture.md +141 -0
- brainlayer-1.0.0/docs/archive/chat-based-analysis-design.md +149 -0
- brainlayer-1.0.0/docs/archive/chat-tags-example.yaml +14 -0
- brainlayer-1.0.0/docs/archive/communication-analysis.md +413 -0
- brainlayer-1.0.0/docs/archive/hierarchical-clustering-deep-research.md +508 -0
- brainlayer-1.0.0/docs/archive/hierarchical-clustering-research.txt +181 -0
- brainlayer-1.0.0/docs/archive/research-chat-based-analysis.md +146 -0
- brainlayer-1.0.0/docs/archive/research-prompt-embeddings.md +44 -0
- brainlayer-1.0.0/docs/assets/logo.svg +75 -0
- brainlayer-1.0.0/docs/configuration.md +64 -0
- brainlayer-1.0.0/docs/contributing.md +10 -0
- brainlayer-1.0.0/docs/data-locations.md +135 -0
- brainlayer-1.0.0/docs/embedding-setup.md +22 -0
- brainlayer-1.0.0/docs/enrichment-runbook.md +179 -0
- brainlayer-1.0.0/docs/enrichment.md +102 -0
- brainlayer-1.0.0/docs/index.md +62 -0
- brainlayer-1.0.0/docs/local-models-guide.md +160 -0
- brainlayer-1.0.0/docs/mcp-config.md +42 -0
- brainlayer-1.0.0/docs/mcp-tools.md +206 -0
- brainlayer-1.0.0/docs/quickstart.md +128 -0
- brainlayer-1.0.0/docs/research/launch-readiness-audit.md +232 -0
- brainlayer-1.0.0/docs/research/session-enrichment/README.md +43 -0
- brainlayer-1.0.0/docs/research/session-enrichment/auto-extracting-learnings.md +98 -0
- brainlayer-1.0.0/docs/research/session-enrichment/brainstore-write-tools.md +33 -0
- brainlayer-1.0.0/docs/research/session-enrichment/cli-ux-patterns.md +261 -0
- brainlayer-1.0.0/docs/research/session-enrichment/conversation-reconstruction.md +80 -0
- brainlayer-1.0.0/docs/research/session-enrichment/prompts.md +197 -0
- brainlayer-1.0.0/docs/research/session-enrichment/session-enrichment-architecture.md +277 -0
- brainlayer-1.0.0/docs/showcase-claude-collab-discovery.md +102 -0
- brainlayer-1.0.0/docs/stylesheets/extra.css +37 -0
- brainlayer-1.0.0/mkdocs.yml +58 -0
- brainlayer-1.0.0/pyproject.toml +107 -0
- brainlayer-1.0.0/scripts/backfill-context.py +102 -0
- brainlayer-1.0.0/scripts/backfill-created-at.py +314 -0
- brainlayer-1.0.0/scripts/backfill-metadata.py +142 -0
- brainlayer-1.0.0/scripts/backfill_data/.gitignore +3 -0
- brainlayer-1.0.0/scripts/classify-all.py +141 -0
- brainlayer-1.0.0/scripts/cloud_backfill.py +757 -0
- brainlayer-1.0.0/scripts/cloud_stream.py +411 -0
- brainlayer-1.0.0/scripts/generate_style_card.py +134 -0
- brainlayer-1.0.0/scripts/index_youtube.py +696 -0
- brainlayer-1.0.0/scripts/label-chunks.py +272 -0
- brainlayer-1.0.0/scripts/launchd/com.brainlayer.enrich.plist +45 -0
- brainlayer-1.0.0/scripts/launchd/com.brainlayer.index.plist +36 -0
- brainlayer-1.0.0/scripts/launchd/install.sh +82 -0
- brainlayer-1.0.0/scripts/pre-label.py +216 -0
- brainlayer-1.0.0/scripts/reembed_bge_m3.py +158 -0
- brainlayer-1.0.0/scripts/run-second-analysis-after-first.sh +31 -0
- brainlayer-1.0.0/scripts/test_extraction.py +126 -0
- brainlayer-1.0.0/scripts/train-setfit.py +161 -0
- brainlayer-1.0.0/scripts/verify_hebrew_search.py +91 -0
- brainlayer-1.0.0/scripts/vertex_poll_import.py +279 -0
- brainlayer-1.0.0/server.json +32 -0
- brainlayer-1.0.0/src/brainlayer/__init__.py +3 -0
- brainlayer-1.0.0/src/brainlayer/cli/__init__.py +1545 -0
- brainlayer-1.0.0/src/brainlayer/cli/wizard.py +132 -0
- brainlayer-1.0.0/src/brainlayer/cli_new.py +151 -0
- brainlayer-1.0.0/src/brainlayer/client.py +164 -0
- brainlayer-1.0.0/src/brainlayer/clustering.py +736 -0
- brainlayer-1.0.0/src/brainlayer/daemon.py +1105 -0
- brainlayer-1.0.0/src/brainlayer/dashboard/README.md +129 -0
- brainlayer-1.0.0/src/brainlayer/dashboard/__init__.py +5 -0
- brainlayer-1.0.0/src/brainlayer/dashboard/app.py +151 -0
- brainlayer-1.0.0/src/brainlayer/dashboard/search.py +229 -0
- brainlayer-1.0.0/src/brainlayer/dashboard/views.py +230 -0
- brainlayer-1.0.0/src/brainlayer/embeddings.py +131 -0
- brainlayer-1.0.0/src/brainlayer/engine.py +550 -0
- brainlayer-1.0.0/src/brainlayer/index_new.py +87 -0
- brainlayer-1.0.0/src/brainlayer/mcp/__init__.py +1558 -0
- brainlayer-1.0.0/src/brainlayer/migrate.py +205 -0
- brainlayer-1.0.0/src/brainlayer/paths.py +43 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/__init__.py +47 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/analyze_communication.py +508 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/brain_graph.py +567 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/chat_tags.py +63 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/chunk.py +422 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/classify.py +472 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/cluster_sampling.py +73 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/enrichment.py +810 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/extract.py +66 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/extract_claude_desktop.py +149 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/extract_corrections.py +231 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/extract_markdown.py +195 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/extract_whatsapp.py +227 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/git_overlay.py +301 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/longitudinal_analyzer.py +568 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/obsidian_export.py +455 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/operation_grouping.py +486 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/plan_linking.py +313 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/sanitize.py +549 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/semantic_style.py +574 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/session_enrichment.py +472 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/style_embed.py +67 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/style_index.py +139 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/temporal_chains.py +203 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/time_batcher.py +248 -0
- brainlayer-1.0.0/src/brainlayer/pipeline/unified_timeline.py +569 -0
- brainlayer-1.0.0/src/brainlayer/storage.py +66 -0
- brainlayer-1.0.0/src/brainlayer/store.py +155 -0
- brainlayer-1.0.0/src/brainlayer/taxonomy.json +80 -0
- brainlayer-1.0.0/src/brainlayer/vector_store.py +1891 -0
- brainlayer-1.0.0/tests/__init__.py +1 -0
- brainlayer-1.0.0/tests/conftest.py +15 -0
- brainlayer-1.0.0/tests/test_brainstore.py +321 -0
- brainlayer-1.0.0/tests/test_chat_list.py +27 -0
- brainlayer-1.0.0/tests/test_chunk.py +110 -0
- brainlayer-1.0.0/tests/test_chunker.py +91 -0
- brainlayer-1.0.0/tests/test_classify.py +102 -0
- brainlayer-1.0.0/tests/test_dashboard.py +61 -0
- brainlayer-1.0.0/tests/test_engine.py +416 -0
- brainlayer-1.0.0/tests/test_enrichment_threshold.py +26 -0
- brainlayer-1.0.0/tests/test_extract_markdown.py +235 -0
- brainlayer-1.0.0/tests/test_normalize_project.py +58 -0
- brainlayer-1.0.0/tests/test_paths.py +53 -0
- brainlayer-1.0.0/tests/test_phase2.py +289 -0
- brainlayer-1.0.0/tests/test_phase3_qa.py +398 -0
- brainlayer-1.0.0/tests/test_sanitize.py +382 -0
- brainlayer-1.0.0/tests/test_semantic_style.py +315 -0
- brainlayer-1.0.0/tests/test_session_enrichment.py +642 -0
- brainlayer-1.0.0/tests/test_storage.py +62 -0
- brainlayer-1.0.0/tests/test_think_recall_integration.py +284 -0
- brainlayer-1.0.0/tests/test_vector_store.py +148 -0
- brainlayer-1.0.0/tests/test_wizard.py +25 -0
@@ -0,0 +1,55 @@
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ["3.11", "3.12", "3.13"]
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}

      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-${{ matrix.python-version }}-${{ hashFiles('pyproject.toml') }}
          restore-keys: ${{ runner.os }}-pip-${{ matrix.python-version }}-

      - name: Install
        run: pip install -e ".[dev]"

      - name: Unit tests
        run: pytest tests/ -v --tb=short -m "not integration" -x

      - name: MCP tool registration
        run: pytest tests/test_think_recall_integration.py::TestMCPToolCount -v --tb=short

  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.13"

      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-lint-${{ hashFiles('pyproject.toml') }}

      - run: pip install ruff
      - run: ruff check src/ tests/
      - run: ruff format --check src/ tests/
@@ -0,0 +1,32 @@
name: Deploy Docs

on:
  push:
    branches: [main]
    paths:
      - "docs/**"
      - "mkdocs.yml"
  workflow_dispatch:

permissions:
  contents: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Cache pip
        uses: actions/cache@v4
        with:
          path: ~/.cache/pip
          key: ${{ runner.os }}-pip-docs-${{ hashFiles('pyproject.toml') }}

      - run: pip install mkdocs-material pymdownx-blocks

      - run: mkdocs gh-deploy --force
@@ -0,0 +1,20 @@
name: Publish to PyPI

on:
  push:
    tags: ["v*"]

jobs:
  publish:
    runs-on: ubuntu-latest
    environment: pypi
    permissions:
      id-token: write
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install build
      - run: python -m build
      - uses: pypa/gh-action-pypi-publish@release/v1
@@ -0,0 +1,55 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.venv/
venv/
env/

# Build
dist/
build/
*.egg-info/
*.egg
site/

# Testing
.pytest_cache/
.ruff_cache/
htmlcov/
.coverage
.tox/

# Data
*.db
*.db-wal
*.db-shm

# Secrets
.env
.env.local
.env.*.local

# IDE
.idea/
.vscode/
*.swp
*.swo
*~

# Local working files (audit logs, prompts, scratch)
docs.local/

# Internal planning docs (contain cross-repo references)
docs/plan/

# OS
.DS_Store
Thumbs.db

# Lock files (reproducible installs handled via pyproject.toml)
uv.lock

# Claude scratchpad
claude.scratchpad.md
@@ -0,0 +1,19 @@
# Changelog

## [1.0.0] - 2026-02-19

### Added
- Initial open-source release as BrainLayer (formerly Zikaron)
- Semantic search across AI conversation history (sqlite-vec + bge-large-en-v1.5)
- 10-field LLM enrichment pipeline (Ollama / MLX backends)
- Brain graph visualization (HDBSCAN clustering + UMAP 3D layout)
- MCP server with 8 tools for Claude Code, Zed, Cursor
- Interactive setup wizard (`brainlayer init`)
- Centralized artifact storage (`~/.local/share/brainlayer/storage/`)
- Multi-source indexing: Claude Code, WhatsApp, YouTube, Markdown, Claude Desktop
- Communication style analysis pipeline
- Obsidian vault export
- FastAPI daemon with 25+ HTTP endpoints
- GitHub Actions CI/CD with PyPI publishing
- PII sanitization pipeline for safe cloud processing
- Source-aware enrichment thresholds
@@ -0,0 +1,456 @@
# BrainLayer (זיכרון) - Local Knowledge Pipeline

> **Memory** for Claude Code conversations. Index, search, retrieve, and visualize knowledge from past coding sessions.

---

## Quick Start

```bash
cd ~/projects/brainlayer
python3 -m venv .venv && source .venv/bin/activate
pip install -e ".[dev]"

# Index conversations
brainlayer index

# Start daemon (for dashboard + fast searches)
brainlayer serve --http 8787

# Search
brainlayer search "how did I implement authentication"

# Enrich with local LLM
brainlayer enrich
```

---

## Architecture (Feb 2026 - sqlite-vec)

```
~/.claude/projects/  # Source: Claude Code conversations (JSONL)
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                          PIPELINE                           │
│  ┌─────────┐  ┌──────────┐  ┌───────┐  ┌───────┐  ┌───────┐ │
│  │ Extract │→ │ Classify │→ │ Chunk │→ │ Embed │→ │ Index │ │
│  └─────────┘  └──────────┘  └───────┘  └───────┘  └───────┘ │
│                                        bge-large  sqlite-vec│
│                                        1024 dims   fast DB  │
└─────────────────────────────────────────────────────────────┘
                              ↓
~/.local/share/brainlayer/brainlayer.db  # Storage: sqlite-vec (~1.4GB, 260K+ chunks)
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                       POST-PROCESSING                       │
│  ┌───────────┐  ┌──────────────┐  ┌────────────┐            │
│  │ Enrichment│  │ Brain Graph  │  │ Obsidian   │            │
│  │ (GLM-4.7) │  │ (clustering) │  │ Export     │            │
│  └───────────┘  └──────────────┘  └────────────┘            │
└─────────────────────────────────────────────────────────────┘
                              ↓
┌─────────────────────────────────────────────────────────────┐
│                         INTERFACES                          │
│  ┌───────┐  ┌──────────────┐  ┌───────────┐  ┌───────────┐  │
│  │ CLI   │  │ FastAPI      │  │ MCP Server│  │ Dashboard │  │
│  │       │  │ Daemon       │  │brainlayer-│  │ (Next.js) │  │
│  │       │  │ :8787/socket │  │ mcp       │  │ :3000     │  │
│  └───────┘  └──────────────┘  └───────────┘  └───────────┘  │
└─────────────────────────────────────────────────────────────┘
```

> **Storage:** sqlite-vec with bge-large-en-v1.5 embeddings (1024 dims). WAL mode + busy_timeout=5000ms for concurrent access.
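
The WAL and busy-timeout settings in the storage note can be illustrated with Python's standard `sqlite3` module. This is a minimal sketch, not BrainLayer's actual storage code (which this file says uses sqlite-vec with APSW); the `open_db` helper is hypothetical.

```python
import sqlite3
from pathlib import Path


def open_db(path: Path) -> sqlite3.Connection:
    """Open a SQLite database with WAL journaling and a 5 s busy timeout.

    Hypothetical helper for illustration only.
    """
    conn = sqlite3.connect(path)
    # WAL lets readers keep working while one writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    # Wait up to 5000 ms for a lock instead of failing immediately.
    conn.execute("PRAGMA busy_timeout=5000")
    return conn
```

WAL mode is stored in the database file, so it only has to be set once; `busy_timeout` is per connection and has to be set every time a connection is opened.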

---

## File Structure

```
brainlayer/
├── src/brainlayer/
│   ├── __init__.py
│   ├── cli/  # CLI interface (typer)
│   │   └── __init__.py  # All CLI commands
│   ├── cli_new.py  # New unified CLI (in progress)
│   ├── client.py  # Python client for daemon API
│   ├── clustering.py  # Topic clustering (HDBSCAN + UMAP)
│   ├── daemon.py  # FastAPI HTTP daemon (25+ endpoints)
│   ├── embeddings.py  # bge-large-en-v1.5 embedding model
│   ├── index_new.py  # Unified indexer (batch + progress)
│   ├── migrate.py  # DB schema migrations
│   ├── vector_store.py  # sqlite-vec storage layer
│   ├── dashboard/  # Built-in TUI dashboard (textual)
│   │   ├── app.py
│   │   ├── search.py
│   │   └── views.py
│   ├── mcp/  # MCP server (8 tools)
│   │   └── __init__.py
│   └── pipeline/  # Processing stages
│       ├── extract.py  # Stage 1: Parse JSONL conversations
│       ├── extract_whatsapp.py  # WhatsApp chat import
│       ├── extract_markdown.py  # Markdown file import
│       ├── extract_claude_desktop.py  # Claude Desktop import
│       ├── extract_corrections.py  # Correction detection
│       ├── classify.py  # Stage 2: Content classification
│       ├── chunk.py  # Stage 3: AST-aware chunking
│       ├── enrichment.py  # LLM enrichment (summaries, tags, importance)
│       ├── session_enrichment.py  # Session-level LLM analysis (Phase 7)
│       ├── brain_graph.py  # Brain graph generation (nodes + edges)
│       ├── obsidian_export.py  # Obsidian vault export
│       ├── operation_grouping.py  # read→edit→test cycle detection
│       ├── plan_linking.py  # Session → plan/phase linking
│       ├── temporal_chains.py  # Topic chain detection
│       ├── git_overlay.py  # Git diff enrichment
│       ├── semantic_style.py  # Communication style analysis
│       ├── analyze_communication.py  # Evolution analysis
│       ├── cluster_sampling.py  # Cluster-based sampling
│       ├── style_embed.py  # Style embedding
│       ├── style_index.py  # Style indexing
│       ├── unified_timeline.py  # Cross-source timeline
│       ├── time_batcher.py  # Temporal batching
│       ├── longitudinal_analyzer.py  # Long-term trend analysis
│       └── chat_tags.py  # Chat tag extraction
├── tests/
├── pyproject.toml
├── CLAUDE.md  # This file
└── prd-json/  # Ralph PRD (if using Ralph)
```

---

## CLI Commands

### Core

```bash
brainlayer index  # Index all conversations
brainlayer index --project my-project  # Index specific project
brainlayer index-fast  # Fast incremental index

brainlayer search "authentication"  # Semantic search
brainlayer search "config.py" --text  # Exact text match

brainlayer stats  # Knowledge base statistics
brainlayer clear --yes  # Clear database
```

### Daemon & Server

```bash
brainlayer serve  # Start daemon (Unix socket)
brainlayer serve --http 8787  # Start daemon (HTTP mode for dashboard)
```

### Enrichment

```bash
brainlayer enrich  # Run LLM enrichment (GLM-4.7-Flash via Ollama)
brainlayer enrich --batch-size 50  # Custom batch size

brainlayer enrich-sessions  # Session-level LLM analysis
brainlayer enrich-sessions --project golems --since 2026-01-01
brainlayer enrich-sessions --stats  # Show session enrichment stats
```

**Chunk enrichment** adds to each chunk: summary, tags, importance score (1-10), intent classification. Uses local GLM-4.7-Flash with `"think": false` for speed (~1s/chunk for short, ~13s for long).

**Session enrichment** (Phase 7) analyzes full conversations to extract: session summary, decisions, corrections, learnings, mistakes, patterns, outcome, quality scores.

### Analysis & Export

```bash
brainlayer git-overlay  # Enrich with git diff context
brainlayer group-operations  # Detect read→edit→test cycles
brainlayer topic-chains  # Find topic continuity across sessions
brainlayer plan-linking  # Link sessions to plans/phases
brainlayer brain-export  # Generate brain graph JSON
brainlayer export-obsidian  # Export to Obsidian vault
```

### Style Analysis

```bash
brainlayer analyze-style  # Quick WhatsApp style analysis
brainlayer analyze-evolution --use-embeddings -c ~/export.json -o data/archives/style-$(date +%Y-%m-%d)
brainlayer analyze-semantic  # Semantic style profiling
brainlayer list-chats  # List indexed chat sources
```

### Utilities

```bash
brainlayer context <chunk_id>  # Get surrounding context
brainlayer review <session_id>  # Review a session
brainlayer fix-projects  # Normalize project names
brainlayer migrate  # Run DB migrations
brainlayer dashboard  # Interactive TUI dashboard
```

---

## Daemon HTTP Endpoints

The daemon (`brainlayer serve --http 8787`) exposes a FastAPI server used by the Next.js dashboard:

### Health & Stats

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/health` | GET | Health check |
| `/health/services` | GET | Service status (daemon, enrichment, Ollama) |
| `/stats` | GET | Knowledge base stats (chunks, projects, types) |
| `/stats/tokens` | GET | LLM token usage + costs |
| `/stats/enrichment` | GET | Enrichment progress (enriched vs total) |
| `/stats/service-runs` | GET | Recent service run logs |

### Search & Context

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/search` | POST | Semantic + keyword search |
| `/context/{chunk_id}` | GET | Surrounding chunks for a result |
| `/dashboard/search` | GET | Dashboard search (GET-friendly) |
| `/session/{session_id}` | GET | Full session detail |

### Brain Graph

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/brain/graph` | GET | Full brain graph (nodes + edges) |
| `/brain/metadata` | GET | Graph metadata (node count, clusters) |
| `/brain/node/{node_id}` | GET | Single node details |

### Content & Backlog

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/content/pipeline-runs` | GET | Content pipeline execution logs |
| `/content/pipeline-stats` | GET | Pipeline routing stats |
| `/backlog/items` | GET | Backlog items (Kanban board) |
| `/backlog/items` | POST | Create backlog item |
| `/backlog/items/{id}` | PATCH | Update backlog item |
| `/backlog/items/{id}` | DELETE | Delete backlog item |

### Events

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/events/recent` | GET | Recent golem events |

---

## MCP Server (14 Tools)

Add to `~/.claude/settings.json`:

```json
{
  "mcpServers": {
    "brainlayer": {
      "command": "brainlayer-mcp",
      "args": []
    }
  }
}
```
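
If `settings.json` already holds other entries, the `mcpServers` block has to be merged into it rather than pasted over it. The sketch below shows one way to do that with the standard library. `add_mcp_server` is a hypothetical helper, not part of BrainLayer, and it assumes the settings file is plain JSON.

```python
import json
from pathlib import Path


def add_mcp_server(settings_path: Path, name: str, command: str) -> dict:
    """Add or replace one entry under "mcpServers", keeping all other keys.

    Hypothetical helper for illustration only.
    """
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    # Create the "mcpServers" object if the file does not have one yet.
    servers = settings.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": []}
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

A call such as `add_mcp_server(Path.home() / ".claude" / "settings.json", "brainlayer", "brainlayer-mcp")` would write the entry shown in the JSON above.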

### Available Tools

| Tool | Required Params | Optional Params | Description |
|------|----------------|-----------------|-------------|
| `brainlayer_search` | `query` | `project`, `content_type`, `num_results`, `source`, `tag`, `intent`, `importance_min` | Search past conversations |
| `brainlayer_stats` | — | — | Knowledge base statistics |
| `brainlayer_list_projects` | — | — | List indexed projects |
| `brainlayer_context` | `chunk_id` | `before` (3), `after` (3) | Surrounding context for a result |
| `brainlayer_file_timeline` | `file_path` | `project`, `limit` (50) | File interaction history across sessions |
| `brainlayer_operations` | `session_id` | — | Logical operation groups (read→edit→test) |
| `brainlayer_regression` | `file_path` | `project` | Regression analysis (what changed since last success) |
| `brainlayer_plan_links` | — | `plan_name`, `session_id`, `project` | Session ↔ plan linkage |
| `brainlayer_session_summary` | `session_id` | — | Session enrichment: decisions, corrections, learnings |

### Search Parameters

- **`source`**: `claude_code` (default), `whatsapp`, `youtube`, `all`
- **`content_type`**: `ai_code`, `stack_trace`, `user_message`, `assistant_text`, `file_read`, `git_diff`
- **`intent`**: `debugging`, `designing`, `configuring`, `discussing`, `deciding`, `implementing`, `reviewing`
- **`importance_min`**: 1-10 (from enrichment)
- **`tag`**: enrichment-generated tags (e.g., `bug-fix`, `authentication`, `typescript`)
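
The allowed values listed above can be checked on the client side before a search is sent. The following is a sketch under that assumption; the `SearchParams` class is hypothetical, and how the MCP server itself validates its arguments is not described here.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values copied from the parameter list above.
SOURCES = {"claude_code", "whatsapp", "youtube", "all"}
CONTENT_TYPES = {"ai_code", "stack_trace", "user_message",
                 "assistant_text", "file_read", "git_diff"}
INTENTS = {"debugging", "designing", "configuring", "discussing",
           "deciding", "implementing", "reviewing"}


@dataclass
class SearchParams:
    """Hypothetical container for brainlayer_search arguments."""
    query: str
    source: str = "claude_code"
    content_type: Optional[str] = None
    intent: Optional[str] = None
    importance_min: Optional[int] = None
    tag: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.query:
            raise ValueError("query is required")
        if self.source not in SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if self.content_type is not None and self.content_type not in CONTENT_TYPES:
            raise ValueError(f"unknown content_type: {self.content_type!r}")
        if self.intent is not None and self.intent not in INTENTS:
            raise ValueError(f"unknown intent: {self.intent!r}")
        if self.importance_min is not None and not 1 <= self.importance_min <= 10:
            raise ValueError("importance_min must be between 1 and 10")

    def to_arguments(self) -> dict:
        """Return only the parameters that were set, as a plain dict."""
        return {key: value for key, value in vars(self).items() if value is not None}
```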

---

## Enrichment Pipeline

Local LLM enrichment adds structured metadata to each chunk. Think of it as a librarian cataloging every conversation snippet — what it's about, how important it is, and how to find it later.

### Fields (10 total)

| Field | What it captures | Example |
|-------|-----------------|---------|
| `summary` | 1-2 sentence gist | "Debugging why Telegram bot drops messages under load" |
| `tags` | Topic tags (comma-separated) | "telegram, debugging, performance" |
| `importance` | 1-10 relevance score | 8 (architectural decision) vs 2 (directory listing) |
| `intent` | What was happening | `debugging`, `designing`, `implementing`, `configuring`, `discussing`, `deciding`, `reviewing` |
| `primary_symbols` | Key code entities | "TelegramBot, handleMessage, grammy" |
| `resolved_query` | Question this answers (HyDE-style) | "How does the Telegram bot handle rate limiting?" |
| `epistemic_level` | How proven is this | `hypothesis`, `substantiated`, `validated` |
| `version_scope` | What version/system state | "grammy 1.32, Node 22, pre-Railway migration" |
| `debt_impact` | Technical debt signal | `introduction`, `resolution`, `none` |
| `external_deps` | Libraries/APIs mentioned | "grammy, Supabase, Railway" |

The first 4 fields have been populated for ~11.6K chunks via local Ollama. The remaining 6 fields await cloud backfill (Gemini Batch API, ~$16 for all 251K chunks).

### Backends

Two local LLM backends available — use whichever suits your setup:

| Backend | How to start | Speed | Env var |
|---------|-------------|-------|---------|
| **Ollama** (default) | `ollama serve` + `ollama pull glm4` | ~1s/chunk (short), ~13s (long) | `BRAINLAYER_ENRICH_BACKEND=ollama` |
| **MLX** (Apple Silicon) | `python3 -m mlx_lm.server --model mlx-community/Qwen2.5-Coder-14B-Instruct-4bit --port 8080` | 21-87% faster | `BRAINLAYER_ENRICH_BACKEND=mlx` |

Both work with the same enrichment pipeline — just set the env var and go.
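
Switching backends through `BRAINLAYER_ENRICH_BACKEND` could be resolved along these lines. This is a sketch only: `resolve_backend` is a hypothetical function, the base URLs are assumptions (11434 is Ollama's default port, 8080 is the port passed to `mlx_lm.server` in the table above), and the case-insensitive matching is a choice made for this example.

```python
import os

# Assumed base URLs, see the note above; not taken from BrainLayer's code.
BACKEND_URLS = {
    "ollama": "http://127.0.0.1:11434",
    "mlx": "http://127.0.0.1:8080",
}


def resolve_backend(env=None) -> tuple[str, str]:
    """Return (backend name, base URL), defaulting to Ollama when unset."""
    env = os.environ if env is None else env
    name = env.get("BRAINLAYER_ENRICH_BACKEND", "ollama").strip().lower()
    if name not in BACKEND_URLS:
        raise ValueError(f"unsupported enrichment backend: {name!r}")
    return name, BACKEND_URLS[name]
```

Passing a plain dict as `env` keeps the function easy to test without touching the real environment.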

### Running Enrichment

```bash
# Basic (50 chunks at a time, Ollama)
brainlayer enrich

# Bigger batches, MLX, parallel workers
BRAINLAYER_ENRICH_BACKEND=mlx brainlayer enrich --batch-size=100 --parallel=3

# Process up to 5000 chunks in one run
brainlayer enrich --max=5000

# Automated scheduling (checks queue, runs if needed)
./scripts/auto-enrich.sh --threshold 500 --max-hours 3
```

### Cloud Backfill (one-time)

For the initial 251K chunk backfill, there's a Gemini Batch API script. See `docs/enrichment-runbook.md` for the full runbook.

### Concurrency Notes

- **`PRAGMA busy_timeout = 5000`** — waits up to 5s for DB locks (daemon + MCP + enrichment can all access DB)
- **Retry logic** — 3 attempts with backoff on `SQLITE_BUSY`
- **Parallel mode** — each thread gets its own DB connection (thread-local VectorStore)
- **Ollama tip:** Set `"think": false` in API calls — GLM-4.7 defaults to thinking mode, adding 350+ tokens and 20s delay for no benefit
- **Background running:** `PYTHONUNBUFFERED=1` required for log visibility in background processes
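
The retry and per-thread connection notes above can be sketched with the standard `sqlite3` module, where a busy database surfaces as `sqlite3.OperationalError`. The helper names, the delay values, and the exponential backoff schedule are assumptions for illustration; the project's own code uses APSW, where the exception type differs.

```python
import sqlite3
import threading
import time

_local = threading.local()


def get_connection(db_path: str) -> sqlite3.Connection:
    """Return this thread's own connection to db_path, creating it on first use."""
    conns = getattr(_local, "conns", None)
    if conns is None:
        conns = _local.conns = {}
    if db_path not in conns:
        conn = sqlite3.connect(db_path)
        conn.execute("PRAGMA busy_timeout = 5000")
        conns[db_path] = conn
    return conns[db_path]


def with_retry(operation, attempts: int = 3, base_delay: float = 0.1):
    """Call operation(); if the database is locked or busy, back off and retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except sqlite3.OperationalError as exc:
            message = str(exc).lower()
            is_busy = "locked" in message or "busy" in message
            if not is_busy or attempt == attempts - 1:
                raise
            # Exponential backoff: base_delay, 2x, 4x, ...
            time.sleep(base_delay * (2 ** attempt))
```

A write would then look like `with_retry(lambda: get_connection(path).execute(sql, params))`, with each worker thread transparently using its own connection.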

---

## Brain Graph

Generated by `brainlayer brain-export`, produces a JSON file with:
- **Nodes:** One per session, with label, project, branch, plan, chunk count
- **Edges:** Connections between related sessions (shared files, topics, plans)
- **Clusters:** HDBSCAN clustering by topic similarity

Used by the BrainLayer Dashboard 3D visualization (`react-force-graph-3d`). Can be uploaded to Supabase Storage for multi-tenant access.

---

## Obsidian Export

`brainlayer export-obsidian` generates a Markdown vault:
- One note per session with frontmatter (project, date, plan)
- Backlinks between related sessions
- Tag-based navigation
- Compatible with Obsidian graph view

---

## Data Locations

| Path | Purpose |
|------|---------|
| `~/.claude/projects/` | Source conversations (read-only) |
| `~/.local/share/brainlayer/brainlayer.db` | sqlite-vec database (~1.4GB, 260K+ chunks) |
| `~/.local/share/brainlayer/prompts/` | Deduplicated system prompts (SHA-256) |
| `/tmp/brainlayer.sock` | Daemon Unix socket |
| `/tmp/brainlayer-enrichment.lock` | Enrichment process lock file |

---

## Development

```bash
# Install dev dependencies
uv pip install -e ".[dev]"

# Run tests
pytest

# Lint + format
ruff check src/ && ruff format src/
```

---

## Pipeline Stages

### Stage 1: Extract (`pipeline/extract.py`)
- Parse JSONL conversation files
- **Content-addressable storage** for system prompts (SHA-256 hash → dedupe)
- Detect conversation continuations (session ID + temporal proximity)
- Also: `extract_whatsapp.py`, `extract_markdown.py`, `extract_claude_desktop.py`
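
Content-addressable storage for system prompts, as described in the list above, means the file name is derived from the prompt's SHA-256 digest, so identical prompts collapse into one file. A minimal sketch follows; `store_prompt` and the `<digest>.txt` layout are assumptions, not the project's actual on-disk format.

```python
import hashlib
from pathlib import Path


def store_prompt(prompt: str, store_dir: Path) -> str:
    """Write prompt under its SHA-256 hex digest and return the digest.

    Storing the same prompt twice leaves a single file on disk.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
    target = store_dir / f"{digest}.txt"
    if not target.exists():
        store_dir.mkdir(parents=True, exist_ok=True)
        target.write_text(prompt, encoding="utf-8")
    return digest
```

Each conversation record then only needs to keep the digest, which is far smaller than a repeated system prompt.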

### Stage 2: Classify (`pipeline/classify.py`)

| Type | Value | Action |
|------|-------|--------|
| `ai_code` | HIGH | Preserve verbatim |
| `stack_trace` | HIGH | Preserve exact (never split) |
| `user_message` | HIGH | Preserve |
| `assistant_text` | MEDIUM | Preserve |
| `file_read` | MEDIUM | Context-dependent |
| `git_diff` | MEDIUM | Extract changed entities |
| `build_log` | LOW | Summarize or mask |
| `dir_listing` | LOW | Structure only |
| `noise` | SKIP | Filter out (progress, queue-operation) |
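
The table above amounts to a lookup from content type to a value tier, with `SKIP` items dropped before chunking. A sketch of that lookup, assuming each item is a dict with a `"type"` key; the dict shape and the function name are hypothetical.

```python
# Value tier per content type, copied from the table above.
CONTENT_VALUE = {
    "ai_code": "HIGH",
    "stack_trace": "HIGH",
    "user_message": "HIGH",
    "assistant_text": "MEDIUM",
    "file_read": "MEDIUM",
    "git_diff": "MEDIUM",
    "build_log": "LOW",
    "dir_listing": "LOW",
    "noise": "SKIP",
}


def keep_for_indexing(items: list[dict]) -> list[dict]:
    """Drop items classified as SKIP; an unknown type raises KeyError."""
    return [item for item in items if CONTENT_VALUE[item["type"]] != "SKIP"]
```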

### Stage 3: Chunk (`pipeline/chunk.py`)
- **AST-aware chunking** with tree-sitter for code (~500 tokens)
- **Never split** stack traces
- **Observation masking** for large tool outputs (`[N lines elided]`)
- Turn-based chunking for conversation with 10-20% overlap
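
Two of the ideas above, observation masking and overlapping turn-based chunks, can be sketched in a few lines. The function names, the head and tail sizes, and the window of 10 turns with 2 shared (20% overlap) are assumptions for illustration; the tree-sitter code path is not covered here.

```python
def mask_observation(text: str, head: int = 5, tail: int = 5) -> str:
    """Keep the first and last lines of a large tool output, eliding the middle."""
    lines = text.splitlines()
    hidden = len(lines) - head - tail
    if hidden <= 0:
        return text
    kept_tail = lines[len(lines) - tail:]
    return "\n".join(lines[:head] + [f"[{hidden} lines elided]"] + kept_tail)


def chunk_turns(turns: list[str], size: int = 10, overlap: int = 2) -> list[list[str]]:
    """Group turns into windows of `size` that share `overlap` turns with the previous window."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(turns), step):
        chunks.append(turns[start:start + size])
        # Stop once a window reaches the end, so no redundant tail chunk is added.
        if start + size >= len(turns):
            break
    return chunks
```

The overlap means a question at the end of one chunk and its answer at the start of the next still appear together in at least one chunk.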

### Stage 4: Embed (`embeddings.py`)
- **bge-large-en-v1.5** via sentence-transformers (local, private)
- 1024 dimensions, 63.5 MTEB score
- ~8s model load (vs 30s with Ollama)
- MPS acceleration on Apple Silicon

### Stage 5: Index (`vector_store.py`)
- **sqlite-vec** with APSW (macOS compatible)
- WAL mode for concurrent reads
- `PRAGMA busy_timeout = 5000` for multi-process safety
- Metadata: project, content_type, source_file, char_count

---

## Communication Style Analysis

BrainLayer includes **communication pattern analysis** from WhatsApp, Claude, YouTube, and Gemini chats.

### Latest Analysis Location
```
data/archives/style-2026-01-31-2121/
├── master-style-guide.md  # Main style rules
├── per-period/  # Style evolution over time
```

### Usage
```bash
brainlayer analyze-evolution --use-embeddings -c ~/claude-export.json -o data/archives/style-$(date +%Y-%m-%d-%H%M) -y
brainlayer analyze-style  # Quick WhatsApp-only
brainlayer analyze-semantic  # Semantic style profiling
```

---

## Naming

**BrainLayer** was formerly named Zikaron (זיכרון), Hebrew for "memory".