@synsci/cli-darwin-x64 1.1.49
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
@@ -0,0 +1,386 @@
---
name: sparse-autoencoder-training
description: Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Sparse Autoencoders, SAE, Mechanistic Interpretability, Feature Discovery, Superposition]
dependencies: [sae-lens>=6.0.0, transformer-lens>=2.0.0, torch>=2.0.0]
---

# SAELens: Sparse Autoencoders for Mechanistic Interpretability

SAELens is the primary library for training and analyzing Sparse Autoencoders (SAEs), a technique for decomposing polysemantic neural network activations into sparse, interpretable features. It builds on Anthropic's research on monosemanticity.

**GitHub**: [jbloomAus/SAELens](https://github.com/jbloomAus/SAELens) (1,100+ stars)

## The Problem: Polysemanticity & Superposition

Individual neurons in neural networks are **polysemantic**: they activate in multiple, semantically distinct contexts. This happens because models use **superposition** to represent more features than they have neurons, which makes interpretability difficult.

**SAEs solve this** by decomposing dense activations into sparse, monosemantic features: typically only a small number of features activate for any given input, and each feature corresponds to an interpretable concept.

## When to Use SAELens

**Use SAELens when you need to:**
- Discover interpretable features in model activations
- Understand what concepts a model has learned
- Study superposition and feature geometry
- Perform feature-based steering or ablation
- Analyze safety-relevant features (deception, bias, harmful content)

**Consider alternatives when:**
- You need basic activation analysis → use **TransformerLens** directly
- You want causal intervention experiments → use **pyvene** or **TransformerLens**
- You need production steering → consider direct activation engineering

## Installation

```bash
pip install sae-lens
```

Requirements: Python 3.10+, transformer-lens>=2.0.0

## Core Concepts

### What SAEs Learn

SAEs are trained to reconstruct model activations through a sparse bottleneck:

```
Input Activation → Encoder → Sparse Features → Decoder → Reconstructed Activation
   (d_model)          ↓      (d_sae >> d_model)   ↓           (d_model)
                  sparsity                  reconstruction
                   penalty                      loss
```

**Loss Function**: `MSE(original, reconstructed) + L1_coefficient × L1(features)`

### Key Validation (Anthropic Research)

In "Towards Monosemanticity", human evaluators found **70% of SAE features genuinely interpretable**. Features discovered include:
- DNA sequences, legal language, HTTP requests
- Hebrew text, nutrition statements, code syntax
- Sentiment, named entities, grammatical structures

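The loss function above can be sketched in plain PyTorch. This is an illustrative toy, not SAELens's actual implementation; the sizes, initialization, and random inputs are made up for the example:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Toy sizes: d_model matches GPT-2 Small, expansion factor 8 (illustrative only)
d_model, d_sae, l1_coefficient = 768, 768 * 8, 8e-5

W_enc = torch.randn(d_model, d_sae) * 0.01
b_enc = torch.zeros(d_sae)
W_dec = torch.randn(d_sae, d_model) * 0.01
b_dec = torch.zeros(d_model)

def sae_forward(acts):
    """Encode to sparse features (ReLU keeps them non-negative), then decode."""
    features = F.relu(acts @ W_enc + b_enc)   # [batch, d_sae]
    recon = features @ W_dec + b_dec          # [batch, d_model]
    return features, recon

acts = torch.randn(4, d_model)                # stand-in for real activations
features, recon = sae_forward(acts)

# Loss = reconstruction MSE + L1 sparsity penalty on the feature activations
loss = F.mse_loss(recon, acts) + l1_coefficient * features.abs().sum(-1).mean()
```

During training, minimizing the MSE term drives accurate reconstruction while the L1 term pushes most feature activations to exactly zero, which is what produces sparsity.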
## Workflow 1: Loading and Analyzing Pre-trained SAEs

### Step-by-Step

```python
from transformer_lens import HookedTransformer
from sae_lens import SAE

# 1. Load model and pre-trained SAE
model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gpt2-small-res-jb",
    sae_id="blocks.8.hook_resid_pre",
    device="cuda"
)

# 2. Get model activations
tokens = model.to_tokens("The capital of France is Paris")
_, cache = model.run_with_cache(tokens)
activations = cache["resid_pre", 8]  # [batch, pos, d_model]

# 3. Encode to SAE features
sae_features = sae.encode(activations)  # [batch, pos, d_sae]
print(f"Active features: {(sae_features > 0).sum()}")

# 4. Find top features for each position
for pos in range(tokens.shape[1]):
    top_features = sae_features[0, pos].topk(5)
    token = model.to_str_tokens(tokens[0, pos:pos+1])[0]
    print(f"Token '{token}': features {top_features.indices.tolist()}")

# 5. Reconstruct activations
reconstructed = sae.decode(sae_features)
reconstruction_error = (activations - reconstructed).norm()
```

### Available Pre-trained SAEs

| Release | Model | Layers |
|---------|-------|--------|
| `gpt2-small-res-jb` | GPT-2 Small | Multiple residual streams |
| `gemma-2b-res` | Gemma 2B | Residual streams |
| Various on HuggingFace | Search tag `saelens` | Various |

### Checklist
- [ ] Load model with TransformerLens
- [ ] Load matching SAE for target layer
- [ ] Encode activations to sparse features
- [ ] Identify top-activating features per token
- [ ] Validate reconstruction quality

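For the last checklist item, reconstruction quality can be checked numerically. A minimal sketch with stand-in tensors (in real use you would plug in the `activations` and `reconstructed` tensors from the Workflow 1 example; the shapes and noise level here are invented):

```python
import torch

torch.manual_seed(0)

# Stand-in tensors with the Workflow 1 shapes: [batch, pos, d_model]
activations = torch.randn(1, 8, 768)
reconstructed = activations + 0.1 * torch.randn_like(activations)

# Relative reconstruction error: ||x - x_hat|| / ||x||
rel_err = ((activations - reconstructed).norm() / activations.norm()).item()

# Explained variance: 1 - Var(residual) / Var(original)
residual = activations - reconstructed
explained_var = (1 - residual.var() / activations.var()).item()

print(f"relative error {rel_err:.3f}, explained variance {explained_var:.3f}")
```

A low relative error and high explained variance together indicate the SAE's features are capturing most of the information in the original activations.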
## Workflow 2: Training a Custom SAE

### Step-by-Step

```python
from sae_lens import SAE, LanguageModelSAERunnerConfig, SAETrainingRunner

# 1. Configure training
cfg = LanguageModelSAERunnerConfig(
    # Model
    model_name="gpt2-small",
    hook_name="blocks.8.hook_resid_pre",
    hook_layer=8,
    d_in=768,  # Model dimension

    # SAE architecture
    architecture="standard",  # or "gated", "topk"
    d_sae=768 * 8,  # Expansion factor of 8
    activation_fn="relu",

    # Training
    lr=4e-4,
    l1_coefficient=8e-5,  # Sparsity penalty
    l1_warm_up_steps=1000,
    train_batch_size_tokens=4096,
    training_tokens=100_000_000,

    # Data
    dataset_path="monology/pile-uncopyrighted",
    context_size=128,

    # Logging
    log_to_wandb=True,
    wandb_project="sae-training",

    # Checkpointing
    checkpoint_path="checkpoints",
    n_checkpoints=5,
)

# 2. Train
trainer = SAETrainingRunner(cfg)
sae = trainer.run()

# 3. Evaluate
print(f"L0 (avg active features): {trainer.metrics['l0']}")
print(f"CE Loss Recovered: {trainer.metrics['ce_loss_score']}")
```

### Key Hyperparameters

| Parameter | Typical Value | Effect |
|-----------|---------------|--------|
| `d_sae` | 4-16× d_model | More features, higher capacity |
| `l1_coefficient` | 5e-5 to 1e-4 | Higher = sparser, less accurate |
| `lr` | 1e-4 to 1e-3 | Standard optimizer learning rate |
| `l1_warm_up_steps` | 500-2000 | Prevents early feature death |

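The warm-up entry can be pictured as a linear ramp on the L1 penalty; a minimal sketch, assuming a simple linear schedule (an illustration of the idea, not SAELens' internal code):

```python
def l1_coefficient_at(step: int, base: float = 8e-5, warm_up_steps: int = 1000) -> float:
    """Linearly ramp the L1 penalty from 0 to its base value over warm_up_steps."""
    if warm_up_steps <= 0:
        return base
    return base * min(1.0, step / warm_up_steps)

print(l1_coefficient_at(0))      # 0.0
print(l1_coefficient_at(500))    # 4e-05 (halfway through the ramp)
print(l1_coefficient_at(2000))   # 8e-05 (fully ramped)
```

Ramping the penalty lets features establish themselves before sparsity pressure kicks in, which is why it reduces early feature death.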
### Evaluation Metrics

| Metric | Target | Meaning |
|--------|--------|---------|
| **L0** | 50-200 | Average active features per token |
| **CE Loss Score** | 80-95% | Cross-entropy recovered vs. original model |
| **Dead Features** | <5% | Features that never activate |
| **Explained Variance** | >90% | Reconstruction quality |

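These metrics are simple to compute by hand from a batch of feature activations and reconstructions; a NumPy sketch (hypothetical helper name, not the SAELens API):

```python
import numpy as np

def sae_metrics(features: np.ndarray, original: np.ndarray,
                reconstructed: np.ndarray) -> dict:
    """Compute L0, dead-feature ratio, and explained variance for one batch.

    features:      [n_tokens, d_sae] SAE feature activations
    original:      [n_tokens, d_model] activations fed into the SAE
    reconstructed: [n_tokens, d_model] SAE output
    """
    l0 = (features > 0).sum(axis=-1).mean()                # avg active features per token
    dead_ratio = ((features > 0).sum(axis=0) == 0).mean()  # features that never fire
    residual_var = ((original - reconstructed) ** 2).sum()
    total_var = ((original - original.mean(axis=0)) ** 2).sum()
    explained_variance = 1.0 - residual_var / total_var
    return {"l0": float(l0), "dead_ratio": float(dead_ratio),
            "explained_variance": float(explained_variance)}

# Toy batch: 2 tokens, 3 features, perfect reconstruction
feats = np.array([[1.0, 0.0, 2.0], [0.0, 0.0, 3.0]])
orig = np.array([[1.0, 2.0], [3.0, 4.0]])
m = sae_metrics(feats, orig, orig)
```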
### Checklist
- [ ] Choose target layer and hook point
- [ ] Set expansion factor (d_sae = 4-16× d_model)
- [ ] Tune L1 coefficient for desired sparsity
- [ ] Enable L1 warm-up to prevent dead features
- [ ] Monitor metrics during training (W&B)
- [ ] Validate L0 and CE loss recovery
- [ ] Check dead feature ratio

## Workflow 3: Feature Analysis and Steering

### Analyzing Individual Features

```python
from transformer_lens import HookedTransformer
from sae_lens import SAE
import torch

model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
sae, _, _ = SAE.from_pretrained(
    release="gpt2-small-res-jb",
    sae_id="blocks.8.hook_resid_pre",
    device="cuda"
)

# Find what activates a specific feature
feature_idx = 1234
test_texts = [
    "The scientist conducted an experiment",
    "I love chocolate cake",
    "The code compiles successfully",
    "Paris is beautiful in spring",
]

for text in test_texts:
    tokens = model.to_tokens(text)
    _, cache = model.run_with_cache(tokens)
    features = sae.encode(cache["resid_pre", 8])
    activation = features[0, :, feature_idx].max().item()
    print(f"{activation:.3f}: {text}")
```

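The loop above reports the max over positions; to see *which* token in a sequence fired the feature, take the argmax over the position axis. A NumPy sketch with toy activation values (hypothetical numbers, for illustration only):

```python
import numpy as np

# Toy per-position activations of one feature for one text
tokens = ["The", "scientist", "conducted", "an", "experiment"]
acts = np.array([0.1, 2.7, 0.3, 0.0, 1.9])

best = int(acts.argmax())
print(f"Feature fires most on token {tokens[best]!r} ({acts[best]:.1f})")
```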
### Feature Steering

```python
def steer_with_feature(model, sae, prompt, feature_idx, strength=5.0):
    """Add an SAE feature direction to the residual stream during generation."""
    tokens = model.to_tokens(prompt)

    # Get the feature direction from the decoder
    feature_direction = sae.W_dec[feature_idx]  # [d_model]

    def steering_hook(activation, hook):
        # Add the scaled feature direction at all positions
        activation += strength * feature_direction
        return activation

    # Generate with the hook active on the target hook point
    with model.hooks(fwd_hooks=[("blocks.8.hook_resid_pre", steering_hook)]):
        output = model.generate(tokens, max_new_tokens=50)
    return model.to_string(output[0])
```

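The core of steering is just vector addition in the residual stream; a NumPy sketch of what the hook does, with toy dimensions and a made-up direction:

```python
import numpy as np

d_model = 4
resid = np.zeros((3, d_model))               # [n_tokens, d_model] residual stream
direction = np.array([1.0, 0.0, -1.0, 0.0])  # stands in for sae.W_dec[feature_idx]
strength = 5.0

# Broadcasts the scaled direction onto every token position
steered = resid + strength * direction
```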
### Feature Attribution

```python
# Which features most affect a specific output?
tokens = model.to_tokens("The capital of France is")
_, cache = model.run_with_cache(tokens)

# Get features at the final position
features = sae.encode(cache["resid_pre", 8])[0, -1]  # [d_sae]

# Logit attribution per feature:
# contribution = feature_activation × decoder_weight × unembedding
W_dec = sae.W_dec  # [d_sae, d_model]
W_U = model.W_U    # [d_model, vocab]

# Contribution to the " Paris" logit
paris_token = model.to_single_token(" Paris")
feature_contributions = features * (W_dec @ W_U[:, paris_token])

top_features = feature_contributions.topk(10)
print("Top features for 'Paris' prediction:")
for idx, val in zip(top_features.indices, top_features.values):
    print(f"  Feature {idx.item()}: {val.item():.3f}")
```

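The per-feature contributions computed this way sum to the reconstruction's direct contribution to that logit (ignoring the decoder bias), since Σᵢ fᵢ (W_dec w)ᵢ = (f W_dec) · w. A NumPy check of this identity on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
d_sae, d_model = 8, 4
features = rng.random(d_sae)          # feature activations f
W_dec = rng.random((d_sae, d_model))  # decoder weights
w_u_token = rng.random(d_model)       # unembedding column for one token

per_feature = features * (W_dec @ w_u_token)  # attribution per feature
direct = (features @ W_dec) @ w_u_token       # reconstruction's logit contribution
assert np.isclose(per_feature.sum(), direct)
```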
## Common Issues & Solutions

### Issue: High dead feature ratio
```python
# WRONG: no warm-up, features die early
cfg = LanguageModelSAERunnerConfig(
    l1_coefficient=1e-4,
    l1_warm_up_steps=0,  # Bad!
)

# RIGHT: warm up the L1 penalty
cfg = LanguageModelSAERunnerConfig(
    l1_coefficient=8e-5,
    l1_warm_up_steps=1000,  # Gradually increase the penalty
    use_ghost_grads=True,   # Revive dead features
)
```

### Issue: Poor reconstruction (low CE recovery)
```python
# Reduce the sparsity penalty and add capacity
cfg = LanguageModelSAERunnerConfig(
    l1_coefficient=5e-5,  # Lower = better reconstruction
    d_sae=768 * 16,       # More capacity
)
```

### Issue: Features not interpretable
```python
# Increase sparsity (higher L1)
cfg = LanguageModelSAERunnerConfig(
    l1_coefficient=1e-4,  # Higher = sparser, more interpretable
)
# Or use the TopK architecture
cfg = LanguageModelSAERunnerConfig(
    architecture="topk",
    activation_fn_kwargs={"k": 50},  # Exactly 50 active features
)
```

### Issue: Memory errors during training
```python
cfg = LanguageModelSAERunnerConfig(
    train_batch_size_tokens=2048,  # Reduce batch size
    store_batch_size_prompts=4,    # Fewer prompts in the buffer
    n_batches_in_buffer=8,         # Smaller activation buffer
)
```

## Integration with Neuronpedia

Browse pre-trained SAE features at [neuronpedia.org](https://neuronpedia.org):

```python
# Features are indexed by SAE ID
# Example: gpt2-small layer 8 feature 1234
# → neuronpedia.org/gpt2-small/8-res-jb/1234
```

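The URL scheme above can be assembled with a small helper (hypothetical, not part of SAELens; it assumes the `model/layer-sae/feature` path pattern shown):

```python
def neuronpedia_url(model: str, sae: str, feature_idx: int) -> str:
    """Build a Neuronpedia feature URL from model name, SAE ID, and feature index."""
    return f"https://neuronpedia.org/{model}/{sae}/{feature_idx}"

print(neuronpedia_url("gpt2-small", "8-res-jb", 1234))
# https://neuronpedia.org/gpt2-small/8-res-jb/1234
```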
## Key Classes Reference

| Class | Purpose |
|-------|---------|
| `SAE` | Sparse autoencoder model |
| `LanguageModelSAERunnerConfig` | Training configuration |
| `SAETrainingRunner` | Training loop manager |
| `ActivationsStore` | Activation collection and batching |
| `HookedSAETransformer` | TransformerLens + SAE integration |

## Reference Documentation

For detailed API documentation, tutorials, and advanced usage, see the `references/` folder:

| File | Contents |
|------|----------|
| [references/README.md](references/README.md) | Overview and quick-start guide |
| [references/api.md](references/api.md) | Complete API reference for SAE, TrainingSAE, and configurations |
| [references/tutorials.md](references/tutorials.md) | Step-by-step tutorials for training, analysis, and steering |

## External Resources

### Tutorials
- [Basic Loading & Analysis](https://github.com/jbloomAus/SAELens/blob/main/tutorials/basic_loading_and_analysing.ipynb)
- [Training a Sparse Autoencoder](https://github.com/jbloomAus/SAELens/blob/main/tutorials/training_a_sparse_autoencoder.ipynb)
- [ARENA SAE Curriculum](https://www.lesswrong.com/posts/LnHowHgmrMbWtpkxx/intro-to-superposition-and-sparse-autoencoders-colab)

### Papers
- [Towards Monosemanticity](https://transformer-circuits.pub/2023/monosemantic-features) - Anthropic (2023)
- [Scaling Monosemanticity](https://transformer-circuits.pub/2024/scaling-monosemanticity/) - Anthropic (2024)
- [Sparse Autoencoders Find Highly Interpretable Features](https://arxiv.org/abs/2309.08600) - Cunningham et al. (ICLR 2024)

### Official Documentation
- [SAELens Docs](https://jbloomaus.github.io/SAELens/)
- [Neuronpedia](https://neuronpedia.org) - Feature browser

## SAE Architectures

| Architecture | Description | Use Case |
|--------------|-------------|----------|
| **Standard** | ReLU + L1 penalty | General purpose |
| **Gated** | Learned gating mechanism | Better sparsity control |
| **TopK** | Exactly K active features | Consistent sparsity |

```python
# TopK SAE (exactly 50 features active)
cfg = LanguageModelSAERunnerConfig(
    architecture="topk",
    activation_fn="topk",
    activation_fn_kwargs={"k": 50},
)
```
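The TopK activation itself is easy to sketch: keep the K largest pre-activations in each row and zero the rest (a NumPy illustration, independent of SAELens):

```python
import numpy as np

def topk_activation(pre_acts: np.ndarray, k: int) -> np.ndarray:
    """Zero everything except the k largest entries of each row."""
    out = np.zeros_like(pre_acts)
    idx = np.argsort(-pre_acts, axis=-1)[..., :k]
    np.put_along_axis(out, idx, np.take_along_axis(pre_acts, idx, axis=-1), axis=-1)
    return out

x = np.array([[0.2, 1.5, -0.3, 0.9]])
print(topk_activation(x, k=2))  # keeps 1.5 and 0.9, zeroes the rest
```

Unlike the L1 penalty, this enforces an exact per-token sparsity level, which is why the table lists "consistent sparsity" as its use case.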
# SAELens Reference Documentation

This directory contains comprehensive reference materials for SAELens.

## Contents

- [api.md](api.md) - Complete API reference for SAE, TrainingSAE, and configuration classes
- [tutorials.md](tutorials.md) - Step-by-step tutorials for training and analyzing SAEs
- [papers.md](papers.md) - Key research papers on sparse autoencoders

## Quick Links

- **GitHub Repository**: https://github.com/jbloomAus/SAELens
- **Neuronpedia**: https://neuronpedia.org (browse pre-trained SAE features)
- **HuggingFace SAEs**: search for the `saelens` tag

## Installation

```bash
pip install sae-lens
```

Requirements: Python 3.10+, transformer-lens>=2.0.0

## Basic Usage

```python
from transformer_lens import HookedTransformer
from sae_lens import SAE

# Load model and SAE
model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
sae, cfg_dict, sparsity = SAE.from_pretrained(
    release="gpt2-small-res-jb",
    sae_id="blocks.8.hook_resid_pre",
    device="cuda"
)

# Encode activations to sparse features
tokens = model.to_tokens("Hello world")
_, cache = model.run_with_cache(tokens)
activations = cache["resid_pre", 8]

features = sae.encode(activations)    # Sparse feature activations
reconstructed = sae.decode(features)  # Reconstructed activations
```

## Key Concepts

### Sparse Autoencoders
SAEs decompose dense neural activations into sparse, interpretable features:
- **Encoder**: maps d_model → d_sae (typically a 4-16× expansion)
- **ReLU/TopK**: enforces sparsity
- **Decoder**: reconstructs the original activations

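A minimal sketch of that encode/decode round trip in NumPy, with random toy weights (real SAEs also subtract and re-add a decoder bias, omitted here for brevity):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 4, 16                       # 4x expansion
W_enc = rng.normal(size=(d_model, d_sae))
b_enc = np.zeros(d_sae)
W_dec = rng.normal(size=(d_sae, d_model)) / np.sqrt(d_sae)

x = rng.normal(size=d_model)                 # one activation vector
features = np.maximum(x @ W_enc + b_enc, 0)  # encoder + ReLU → sparse features
x_hat = features @ W_dec                     # decoder reconstruction

print("active features:", int((features > 0).sum()), "of", d_sae)
```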
### Training Loss
`Loss = MSE(original, reconstructed) + l1_coefficient × L1(features)`

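Spelled out in NumPy (a sketch of the objective, not the SAELens training code):

```python
import numpy as np

def sae_loss(original: np.ndarray, reconstructed: np.ndarray,
             features: np.ndarray, l1_coefficient: float = 8e-5) -> float:
    """Reconstruction MSE plus an L1 sparsity penalty on the feature activations."""
    mse = ((original - reconstructed) ** 2).mean()
    l1 = np.abs(features).sum(axis=-1).mean()  # average L1 norm per token
    return float(mse + l1_coefficient * l1)

# Perfect reconstruction: the MSE term is 0, so loss = 8e-5 × 4
demo_loss = sae_loss(np.zeros((2, 3)), np.zeros((2, 3)), np.ones((2, 4)))
```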
### Key Metrics
- **L0**: average number of active features (target: 50-200)
- **CE Loss Score**: cross-entropy recovered vs. the original model (target: 80-95%)
- **Dead Features**: features that never activate (target: <5%)

## Available Pre-trained SAEs

| Release | Model | Description |
|---------|-------|-------------|
| `gpt2-small-res-jb` | GPT-2 Small | Residual stream SAEs |
| `gemma-2b-res` | Gemma 2B | Residual stream SAEs |
| Various | Search HuggingFace | Community-trained SAEs |