mi-crow 0.1.1.post14.tar.gz → 0.1.1.post16.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.github/workflows/publish.yml +0 -1
- {mi_crow-0.1.1.post14/src/mi_crow.egg-info → mi_crow-0.1.1.post16}/PKG-INFO +1 -1
- mi_crow-0.1.1.post16/docs/experiments/index.md +127 -0
- mi_crow-0.1.1.post16/docs/experiments/slurm-pipeline.md +75 -0
- mi_crow-0.1.1.post16/docs/experiments/verify-sae-training.md +196 -0
- mi_crow-0.1.1.post16/docs/guide/best-practices.md +390 -0
- mi_crow-0.1.1.post16/docs/guide/core-concepts.md +222 -0
- mi_crow-0.1.1.post16/docs/guide/examples.md +324 -0
- mi_crow-0.1.1.post16/docs/guide/hooks/advanced.md +445 -0
- mi_crow-0.1.1.post16/docs/guide/hooks/controllers.md +380 -0
- mi_crow-0.1.1.post16/docs/guide/hooks/detectors.md +295 -0
- mi_crow-0.1.1.post16/docs/guide/hooks/fundamentals.md +337 -0
- mi_crow-0.1.1.post16/docs/guide/hooks/index.md +163 -0
- mi_crow-0.1.1.post16/docs/guide/hooks/registration.md +415 -0
- mi_crow-0.1.1.post16/docs/guide/index.md +91 -0
- mi_crow-0.1.1.post16/docs/guide/installation.md +172 -0
- mi_crow-0.1.1.post16/docs/guide/quickstart.md +163 -0
- mi_crow-0.1.1.post16/docs/guide/troubleshooting.md +392 -0
- mi_crow-0.1.1.post16/docs/guide/workflows/activation-control.md +335 -0
- mi_crow-0.1.1.post16/docs/guide/workflows/concept-discovery.md +318 -0
- mi_crow-0.1.1.post16/docs/guide/workflows/concept-manipulation.md +316 -0
- mi_crow-0.1.1.post16/docs/guide/workflows/index.md +106 -0
- mi_crow-0.1.1.post16/docs/guide/workflows/saving-activations.md +335 -0
- mi_crow-0.1.1.post16/docs/guide/workflows/training-sae.md +314 -0
- mi_crow-0.1.1.post16/docs/index.md +105 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/mkdocs.yml +26 -0
- mi_crow-0.1.1.post16/site/404.html +585 -0
- mi_crow-0.1.1.post16/site/api/datasets/index.html +9162 -0
- mi_crow-0.1.1.post16/site/api/hooks/index.html +5040 -0
- mi_crow-0.1.1.post16/site/api/index.html +775 -0
- mi_crow-0.1.1.post16/site/api/language_model/index.html +9785 -0
- mi_crow-0.1.1.post16/site/api/sae/index.html +5541 -0
- mi_crow-0.1.1.post16/site/api/store/index.html +2579 -0
- mi_crow-0.1.1.post16/site/assets/_mkdocstrings.css +237 -0
- mi_crow-0.1.1.post16/site/assets/images/favicon.png +0 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/bundle.e71a0d61.min.js +16 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/bundle.e71a0d61.min.js.map +7 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.ar.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.da.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.de.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.du.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.el.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.es.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.fi.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.fr.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.he.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.hi.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.hu.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.hy.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.it.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.ja.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.jp.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.kn.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.ko.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.multi.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.nl.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.no.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.pt.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.ro.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.ru.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.sa.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.stemmer.support.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.sv.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.ta.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.te.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.th.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.tr.min.js +18 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.vi.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/min/lunr.zh.min.js +1 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/tinyseg.js +206 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/lunr/wordcut.js +6708 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/workers/search.7a47a382.min.js +42 -0
- mi_crow-0.1.1.post16/site/assets/javascripts/workers/search.7a47a382.min.js.map +7 -0
- mi_crow-0.1.1.post16/site/assets/stylesheets/main.618322db.min.css +1 -0
- mi_crow-0.1.1.post16/site/assets/stylesheets/main.618322db.min.css.map +1 -0
- mi_crow-0.1.1.post16/site/assets/stylesheets/palette.ab4e12ef.min.css +1 -0
- mi_crow-0.1.1.post16/site/assets/stylesheets/palette.ab4e12ef.min.css.map +1 -0
- mi_crow-0.1.1.post16/site/index.html +636 -0
- mi_crow-0.1.1.post16/site/objects.inv +0 -0
- mi_crow-0.1.1.post16/site/search/search_index.json +1 -0
- mi_crow-0.1.1.post16/site/sitemap.xml +35 -0
- mi_crow-0.1.1.post16/site/sitemap.xml.gz +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/datasets/base_dataset.py +3 -3
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/datasets/classification_dataset.py +3 -3
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/datasets/text_dataset.py +3 -3
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/language_model.py +3 -3
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/layers.py +3 -3
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/modules/topk_sae.py +2 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16/src/mi_crow.egg-info}/PKG-INFO +1 -1
- mi_crow-0.1.1.post16/src/mi_crow.egg-info/SOURCES.txt +158 -0
- mi_crow-0.1.1.post14/docs/index.md +0 -5
- mi_crow-0.1.1.post14/src/mi_crow.egg-info/SOURCES.txt +0 -80
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.cursor/commands/fix-and-add-unit-tests.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.cursor/commands/refactor-given-code.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.cursor/plans/dokumentacja-implementacji-modu-w-i-automatyzacji-83a10087.plan.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.cursor/plans/server-sae-full-metadata_b869cacf.plan.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.cursor/rules/coding-rules.mdc +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.cursor/rules/comments.mdc +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.github/workflows/docs.yml +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.github/workflows/tests.yml +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.gitignore +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.pre-commit-config.yaml +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.python-version +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/.run/Unit tests.run.xml +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/MANIFEST.in +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/README.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api/datasets.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api/hooks.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api/index.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api/language_model.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api/sae.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api/store.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/docs/api.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/pyproject.toml +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/raport_implementacji.md +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/setup.cfg +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/datasets/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/datasets/loading_strategy.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/controller.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/detector.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/hook.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/implementations/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/implementations/function_controller.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/implementations/layer_activation_detector.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/implementations/model_input_detector.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/implementations/model_output_detector.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/hooks/utils.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/activations.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/context.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/contracts.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/hook_metadata.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/inference.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/initialization.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/persistence.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/tokenizer.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/language_model/utils.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/autoencoder_context.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/concepts/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/concepts/autoencoder_concepts.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/concepts/concept_dictionary.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/concepts/concept_models.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/concepts/input_tracker.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/modules/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/modules/l1_sae.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/sae.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/sae_trainer.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/training/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/mechanistic/sae/training/wandb_logger.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/store/__init__.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/store/local_store.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/store/store.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/store/store_dataloader.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow/utils.py +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow.egg-info/dependency_links.txt +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow.egg-info/requires.txt +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/src/mi_crow.egg-info/top_level.txt +0 -0
- {mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/uv.lock +0 -0
{mi_crow-0.1.1.post14 → mi_crow-0.1.1.post16}/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: mi-crow
-Version: 0.1.1.
+Version: 0.1.1.post16
 Summary: Engineer Thesis: Explaining and modifying LLM responses using SAE and concepts.
 Author-email: Hubert Kowalski <your.email@example.com>, Adam Kaniasty <adam.kaniasty@gmail.com>
 Requires-Python: >=3.10
mi_crow-0.1.1.post16/docs/experiments/index.md
@@ -0,0 +1,127 @@
# Experiments

This section provides detailed walkthroughs of sample experiments demonstrating real-world usage of mi-crow.

## Overview

Experiments showcase complete workflows from data collection through analysis, using real models and datasets. They demonstrate best practices and provide templates for your own research.

## Available Experiments

### [Verify SAE Training](verify-sae-training.md)

A complete workflow for training and validating SAE models on the Bielik model using the TinyStories dataset.

**What it covers**:
- Saving activations from a production model
- Training SAEs with proper hyperparameters
- Validating training success
- Concept discovery and naming
- Analysis and visualization

**Time required**: Several hours (depending on hardware)

**Prerequisites**:
- Access to the Bielik model or similar
- Sufficient GPU memory
- Understanding of basic SAE concepts

### [SLURM SAE Pipeline](slurm-pipeline.md)

A distributed training setup for large-scale SAE training in cluster environments.

**What it covers**:
- SLURM job configuration
- Distributed activation saving
- Large-scale SAE training
- Resource management

**Time required**: Days (cluster-dependent)

**Prerequisites**:
- Access to a SLURM cluster
- Understanding of cluster computing
- A large-scale dataset

## Experiment Structure

Each experiment typically includes:

1. **Setup**: Environment and dependencies
2. **Data collection**: Saving activations
3. **Training**: SAE model training
4. **Validation**: Verifying results
5. **Analysis**: Understanding outcomes
6. **Documentation**: Recording findings

## Running Experiments

### Prerequisites

```bash
# Install dependencies
pip install -e .

# Or with uv
uv sync
```

### Basic Workflow

```bash
# 1. Navigate to the experiment directory
cd experiments/verify_sae_training

# 2. Review the README
cat README.md

# 3. Run the scripts in order
python 01_save_activations.py
python 02_train_sae.py

# 4. Open the analysis notebooks
jupyter notebook 03_analyze_training.ipynb
```

### Customization

Experiments are designed to be customizable:

- Modify model names
- Adjust hyperparameters
- Change dataset sources
- Adapt to your hardware

## Experiment Outputs

Experiments produce:

- **Saved activations**: Organized in the store
- **Trained models**: SAE checkpoints
- **Analysis results**: Visualizations and metrics
- **Documentation**: Findings and observations

## Best Practices

1. **Start small**: Test with limited data first
2. **Monitor resources**: Watch memory and compute usage
3. **Document changes**: Record any modifications
4. **Save checkpoints**: Don't lose progress
5. **Validate results**: Verify that outputs make sense

## Contributing Experiments

If you create a new experiment:

1. Create a directory in `experiments/`
2. Include a README with a description
3. Provide runnable scripts/notebooks
4. Document setup and requirements
5. Share findings and observations

## Next Steps

- **[Verify SAE Training](verify-sae-training.md)** - Start with this experiment
- **[User Guide](../guide/index.md)** - Learn the fundamentals first
- **[Examples](../guide/examples.md)** - Try the examples before experiments
mi_crow-0.1.1.post16/docs/experiments/slurm-pipeline.md
@@ -0,0 +1,75 @@
# SLURM SAE Pipeline

This experiment demonstrates a distributed training setup for large-scale SAE training in cluster environments using SLURM.

## Overview

This pipeline shows how to:
- Configure SLURM jobs for activation saving
- Set up distributed SAE training on clusters
- Manage resources and job dependencies
- Handle large-scale datasets efficiently

## Prerequisites

- Access to a SLURM cluster
- Understanding of cluster computing
- A large-scale dataset
- Sufficient cluster resources

## Experiment Structure

```
slurm_sae_pipeline/
├── 01_save_activations.py       # Activation saving script
├── 02_train_sae.py              # SAE training script
├── submit_save_activations.sh   # SLURM submission script for activations
├── submit_train_sae.sh          # SLURM submission script for training
└── README.md                    # Pipeline documentation
```

## Configuration

### SLURM Job Configuration

The submission scripts configure:
- **Resources**: GPU allocation, memory, time limits
- **Dependencies**: Job ordering (train after save)
- **Environment**: Python environment setup
- **Output**: Logging and error handling

### mi-crow Configuration

The Python scripts use standard mi-crow APIs:
- `lm.activations.save()` for distributed activation saving
- `SaeTrainer.train()` for SAE training
- Store configuration for cluster filesystems

## Workflow

1. **Submit the activation-saving job**: Uses `submit_save_activations.sh`
2. **Wait for completion**: Activation saving must finish first
3. **Submit the training job**: Uses `submit_train_sae.sh` (depends on step 1)
4. **Monitor jobs**: Track progress through SLURM
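
The job chaining in steps 1-3 is typically enforced with sbatch's `--dependency=afterok` flag. As an illustrative sketch only (the `submit`/`parse_job_id` wrappers are hypothetical helpers, not part of mi-crow; the script names mirror the tree above):

```python
import re
import subprocess

def parse_job_id(sbatch_output):
    """Extract the job id from sbatch's stdout, e.g. 'Submitted batch job 12345'."""
    return re.search(r"Submitted batch job (\d+)", sbatch_output).group(1)

def submit(script, after_ok=None):
    """Submit a SLURM script, optionally gated on another job exiting with status 0."""
    cmd = ["sbatch"]
    if after_ok is not None:
        cmd.append(f"--dependency=afterok:{after_ok}")
    cmd.append(script)
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return parse_job_id(out)

# On the cluster, the two-stage pipeline becomes:
# save_job = submit("submit_save_activations.sh")
# submit("submit_train_sae.sh", after_ok=save_job)  # starts only if saving succeeded
```

`afterok` holds the training job until the saving job succeeds; `sbatch --dependency` also accepts `afterany` and `afternotok` for cleanup-style jobs.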

## Key Features

- **Distributed processing**: Handles large datasets across cluster nodes
- **Resource management**: Proper GPU and memory allocation
- **Job dependencies**: Ensures the correct execution order
- **Error handling**: Robust failure recovery

## Customization

Adapt the pipeline to your cluster:
- Modify resource requests in the submission scripts
- Adjust batch sizes for the available memory
- Configure store paths for the cluster filesystem
- Set appropriate time limits

## Related Documentation

- **[Training SAE Models](../guide/workflows/training-sae.md)** - SAE training guide
- **[Saving Activations](../guide/workflows/saving-activations.md)** - Activation saving guide
- **[Best Practices](../guide/best-practices.md)** - Performance optimization
mi_crow-0.1.1.post16/docs/experiments/verify-sae-training.md
@@ -0,0 +1,196 @@
# Verify SAE Training Experiment

This experiment demonstrates a complete workflow for training and validating SAE models on the Bielik model using the TinyStories dataset.

## Overview

This experiment walks through:
1. Saving activations from a production model
2. Training SAEs with proper hyperparameters
3. Validating training success
4. Concept discovery and naming
5. Analysis and visualization

## Prerequisites

- Python 3.10+
- PyTorch
- Required packages: `mi_crow`, `torch`, `transformers`, `datasets`, `overcomplete`, `matplotlib`, `seaborn`
- Access to the Bielik model or similar
- Sufficient GPU memory (or use the CPU for smaller experiments)

## Experiment Structure

```
verify_sae_training/
├── 01_save_activations.py       # Step 1: Save activations from the dataset
├── 02_train_sae.py              # Step 2: Train the SAE model
├── 03_analyze_training.ipynb    # Step 3: Analyze training metrics and verify learning
├── 04_name_sae_concepts.ipynb   # Step 4: Export top texts for each neuron
├── 05_show_concepts.ipynb       # Step 5: Display and explore concepts
├── observations.md              # Findings and observations
└── README.md                    # Experiment documentation
```

## Step-by-Step Instructions

### Step 1: Save Activations

**File**: `01_save_activations.py`

This script:
- Loads the Bielik model
- Uses the resid_mid layer (post_attention_layernorm) at layer 16
- Loads the TinyStories dataset
- Saves activations from the specified layer
- Stores the run ID in `store/run_id.txt`

**Configuration**:
- **Model**: `speakleash/Bielik-1.5B-v3.0-Instruct`
- **Dataset**: `roneneldan/TinyStories` (train split)
- **Layer**: `llamaforcausallm_model_layers_16_post_attention_layernorm`
- **Store location**: `experiments/verify_sae_training/store/`

**To change the layer**: Edit `LAYER_SIGNATURE` in the script (e.g., use `_0_` for the first layer, `_31_` for the last layer).

**Run**:
```bash
cd experiments/verify_sae_training
python 01_save_activations.py
```

### Step 2: Train SAE

**File**: `02_train_sae.py`

This script:
- Loads the saved activations
- Creates a TopKSAE model
- Trains the SAE on the activations
- Saves the trained model to `store/sae_model/topk_sae.pt`
- Saves the training history to `store/training_history.json`

**Configuration**:
- `N_LATENTS_MULTIPLIER`: Overcompleteness factor (default: 4x)
- `TOP_K`: Sparsity parameter (default: 8)
- `EPOCHS`: Number of training epochs (default: 100)
- `BATCH_SIZE_TRAIN`: Training batch size (default: 1024)

**Run**:
```bash
python 02_train_sae.py
```
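
For intuition, the sparsity constraint that `TOP_K` controls can be sketched with NumPy alone. This is a simplified illustration of the TopK idea, not the actual `TopKSAE` implementation used by the script; all names and shapes here are made up for the example:

```python
import numpy as np

def topk_sae_forward(x, W_enc, b_enc, W_dec, b_dec, k=8):
    """Sketch of a TopK-SAE forward pass: encode, keep only the k largest
    latents per sample (the L0 constraint), then reconstruct."""
    z = np.maximum(x @ W_enc + b_enc, 0.0)          # ReLU codes, shape (batch, n_latents)
    drop = np.argpartition(z, -k, axis=1)[:, :-k]   # indices of the n_latents - k smallest codes
    z_sparse = z.copy()
    np.put_along_axis(z_sparse, drop, 0.0, axis=1)  # zero everything outside the top k
    x_hat = z_sparse @ W_dec + b_dec                # reconstruction, shape (batch, d_model)
    return z_sparse, x_hat

# Toy usage: d_model=8, 4x overcomplete latents, k=2
rng = np.random.default_rng(0)
d, n = 8, 32
x = rng.normal(size=(5, d))
z, x_hat = topk_sae_forward(x, rng.normal(size=(d, n)), np.zeros(n),
                            rng.normal(size=(n, d)), np.zeros(d), k=2)
print((z != 0).sum(axis=1))  # at most 2 active latents per sample
```

With `N_LATENTS_MULTIPLIER = 4` and `TOP_K = 8`, the real model does the same thing at scale: 4x more latents than input dimensions, of which only 8 may fire per token.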

### Step 3: Analyze Training

**File**: `03_analyze_training.ipynb`

This notebook demonstrates:
- Accessing the training history from `store/training_history.json`
- Visualizing training metrics (loss, R², L0, dead features)
- Validating SAE training success using mi-crow APIs
- Analyzing reconstruction quality

**Key validation checks**:
- Loss should decrease over time
- R² should increase (better reconstruction)
- L0 should match the expected TopK sparsity
- Dead features should be minimal
- Weight variance should be significant
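
These checks are mechanical enough to script. A sketch of how they might be applied to a parsed history dict; the key names (`loss`, `r2`, `l0`, `dead_features`) are assumptions about the schema of `training_history.json`, so adjust them to the actual file:

```python
def validate_training(history, top_k=8, n_latents=None, max_dead_frac=0.10):
    """Apply the sanity checks above to per-epoch metric lists."""
    problems = []
    if history["loss"][-1] >= history["loss"][0]:
        problems.append("loss did not decrease")
    if history["r2"][-1] <= history["r2"][0]:
        problems.append("R^2 did not improve")
    if abs(history["l0"][-1] - top_k) > 1:
        problems.append("final L0 is far from the configured TopK")
    if n_latents and history["dead_features"][-1] / n_latents > max_dead_frac:
        problems.append(f"dead features exceed {max_dead_frac:.0%} of latents")
    return problems

# A healthy run produces no findings:
healthy = {"loss": [1.2, 0.5, 0.1], "r2": [0.3, 0.8, 0.94],
           "l0": [8.0, 8.0, 8.0], "dead_features": [40, 12, 5]}
print(validate_training(healthy, top_k=8, n_latents=256))  # → []
```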

### Step 4: Export Top Texts

**File**: `04_name_sae_concepts.ipynb`

This notebook demonstrates:
- Loading the trained SAE using mi-crow APIs
- Attaching the SAE to the language model with `lm.attach_sae()`
- Enabling text tracking with `sae.concepts.enable_text_tracking()`
- Collecting top texts using mi-crow's concept discovery features
- Exporting the results to JSON format

**Output**: A JSON file mapping neuron indices to their top activating text snippets.
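
Conceptually, that export is just a per-latent ranking of the input texts by activation strength. A minimal stand-in, independent of mi-crow's tracking machinery (the function name, shapes, and toy data are all illustrative):

```python
import numpy as np

def top_texts_per_neuron(acts, texts, m=3):
    """Map each latent index to the m input texts that activate it most.
    `acts` has shape (n_texts, n_latents)."""
    order = np.argsort(-acts, axis=0)[:m]  # top-m row (text) indices per column (neuron)
    return {j: [texts[i] for i in order[:, j]] for j in range(acts.shape[1])}

texts = ["the red fox", "a blue door", "red apples", "green grass"]
acts = np.array([[0.9, 0.0],
                 [0.1, 0.8],
                 [0.7, 0.1],
                 [0.0, 0.5]])
top = top_texts_per_neuron(acts, texts, m=2)
print(top[0])  # → ['the red fox', 'red apples']
```

Serializing this dict with `json.dump` yields the same shape of artifact as the exported file: neuron index (stringified by JSON) to ranked snippets.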

### Step 5: Show Concepts

**File**: `05_show_concepts.ipynb`

This notebook demonstrates:
- Loading the exported top texts
- Using mi-crow APIs to access concept data
- Analyzing neuron activation patterns
- Exploring concept relationships

## Expected Outputs

After running all steps, you'll have:

- `store/run_id.txt` - Run ID for the activation-saving run
- `store/runs/<run_id>/` - Saved activations
- `store/sae_model/topk_sae.pt` - Trained SAE model
- `store/training_history.json` - Training metrics
- `store/top_texts.json` - Exported top texts for each neuron

## Analysis and Validation

### Training Metrics

Check that:
- Loss decreases over epochs
- The R² score improves (target: > 0.9)
- L0 matches the expected TopK
- Dead features stay below 10% of the total

### Concept Quality

Verify that:
- Top texts show coherent patterns
- Neurons detect meaningful concepts
- Concepts are interpretable
- Multiple neurons may detect similar concepts (redundancy is normal)

## Troubleshooting

### Layer Signature Not Found

If you get an error about the layer signature:
1. Run `01_save_activations.py` first to see the available layers
2. Copy one of the layer names
3. Set `LAYER_SIGNATURE` in `01_save_activations.py`

### Out of Memory

If you run out of memory:
- Reduce `DATA_LIMIT` in `01_save_activations.py`
- Reduce `BATCH_SIZE_SAVE` in `01_save_activations.py`
- Reduce `BATCH_SIZE_TRAIN` in `02_train_sae.py`
- Use the CPU instead of the GPU (set `DEVICE = "cpu"`)

### Missing Dependencies

Install missing packages:
```bash
pip install torch transformers datasets overcomplete matplotlib seaborn
```

## Notes

- The experiment uses a relatively small dataset (1000 samples) for quick testing
- For production use, increase `DATA_LIMIT` and the number of training epochs
- See `observations.md` for findings and notes on missing functionality

## Related Documentation

- **[Training SAE Models](../guide/workflows/training-sae.md)** - Detailed training guide
- **[Concept Discovery](../guide/workflows/concept-discovery.md)** - Concept discovery workflow
- **[Best Practices](../guide/best-practices.md)** - General best practices
- **[Troubleshooting](../guide/troubleshooting.md)** - Common issues

## Next Steps

After completing this experiment:
- Try different layers
- Experiment with hyperparameters
- Scale up to larger datasets
- Create your own experiments