@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,386 @@
1
+ ---
2
+ name: sparse-autoencoder-training
3
+ description: Provides guidance for training and analyzing Sparse Autoencoders (SAEs) using SAELens to decompose neural network activations into interpretable features. Use when discovering interpretable features, analyzing superposition, or studying monosemantic representations in language models.
4
+ version: 1.0.0
5
+ author: Synthetic Sciences
6
+ license: MIT
7
+ tags: [Sparse Autoencoders, SAE, Mechanistic Interpretability, Feature Discovery, Superposition]
8
+ dependencies: [sae-lens>=6.0.0, transformer-lens>=2.0.0, torch>=2.0.0]
9
+ ---
10
+
11
+ # SAELens: Sparse Autoencoders for Mechanistic Interpretability
12
+
13
+ SAELens is the primary library for training and analyzing Sparse Autoencoders (SAEs) - a technique for decomposing polysemantic neural network activations into sparse, interpretable features. It builds on Anthropic's research on monosemanticity.
14
+
15
+ **GitHub**: [jbloomAus/SAELens](https://github.com/jbloomAus/SAELens) (1,100+ stars)
16
+
17
+ ## The Problem: Polysemanticity & Superposition
18
+
19
+ Individual neurons in neural networks are **polysemantic** - they activate in multiple, semantically distinct contexts. This happens because models use **superposition** to represent more features than they have neurons, making interpretability difficult.
20
+
21
+ **SAEs solve this** by decomposing dense activations into sparse, monosemantic features - typically only a small number of features activate for any given input, and each feature corresponds to an interpretable concept.
22
+
23
+ ## When to Use SAELens
24
+
25
+ **Use SAELens when you need to:**
26
+ - Discover interpretable features in model activations
27
+ - Understand what concepts a model has learned
28
+ - Study superposition and feature geometry
29
+ - Perform feature-based steering or ablation
30
+ - Analyze safety-relevant features (deception, bias, harmful content)
31
+
32
+ **Consider alternatives when:**
33
+ - You need basic activation analysis → Use **TransformerLens** directly
34
+ - You want causal intervention experiments → Use **pyvene** or **TransformerLens**
35
+ - You need production steering → Consider direct activation engineering
36
+
37
+ ## Installation
38
+
39
+ ```bash
40
+ pip install sae-lens
41
+ ```
42
+
43
+ Requirements: Python 3.10+, transformer-lens>=2.0.0
44
+
45
+ ## Core Concepts
46
+
47
+ ### What SAEs Learn
48
+
49
+ SAEs are trained to reconstruct model activations through a sparse bottleneck:
50
+
51
+ ```
52
+ Input Activation → Encoder → Sparse Features → Decoder → Reconstructed Activation
53
+    (d_model)          ↓    (d_sae >> d_model)     ↓             (d_model)
54
+                   sparsity                 reconstruction
55
+                    penalty                      loss
56
+ ```
57
+
58
+ **Loss Function**: `MSE(original, reconstructed) + L1_coefficient × L1(features)`
59
+
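The loss above can be made concrete with a small dependency-free sketch. This is not SAELens code: the helper names (`relu`, `matvec`, `sae_loss`) and the toy weights are assumptions chosen for illustration, and real SAEs operate on batched tensors rather than Python lists.

```python
def relu(v):
    """Element-wise ReLU on a list of floats."""
    return [max(0.0, x) for x in v]


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sae_loss(x, W_enc, b_enc, W_dec, b_dec, l1_coefficient):
    """Return (loss, features, reconstruction) for one activation vector x.

    loss = MSE(x, reconstruction) + l1_coefficient * L1(features)
    """
    pre = [p + b for p, b in zip(matvec(W_enc, x), b_enc)]
    features = relu(pre)                      # sparse code, length d_sae
    recon = [r + b for r, b in zip(matvec(W_dec, features), b_dec)]
    mse = sum((a - r) ** 2 for a, r in zip(x, recon)) / len(x)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coefficient * l1, features, recon


# Toy example (hypothetical weights): d_model = 2, d_sae = 4.
W_enc = [[1, 0], [0, 1], [-1, 0], [0, -1]]    # d_sae x d_model
b_enc = [0.0, 0.0, 0.0, 0.0]
W_dec = [[1, 0, -1, 0], [0, 1, 0, -1]]        # d_model x d_sae
b_dec = [0.0, 0.0]

loss, features, recon = sae_loss([0.5, -2.0], W_enc, b_enc, W_dec, b_dec, 8e-5)
print(features, recon, loss)
```

In this toy case only two of the four features are non-zero and the reconstruction is exact, so the loss consists entirely of the L1 term.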
60
+ ### Key Validation (Anthropic Research)
61
+
62
+ In "Towards Monosemanticity", human evaluators found **70% of SAE features genuinely interpretable**. Features discovered include:
63
+ - DNA sequences, legal language, HTTP requests
64
+ - Hebrew text, nutrition statements, code syntax
65
+ - Sentiment, named entities, grammatical structures
66
+
67
+ ## Workflow 1: Loading and Analyzing Pre-trained SAEs
68
+
69
+ ### Step-by-Step
70
+
71
+ ```python
72
+ from transformer_lens import HookedTransformer
73
+ from sae_lens import SAE
74
+
75
+ # 1. Load model and pre-trained SAE
76
+ model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
77
+ # With sae-lens>=6.0.0, from_pretrained returns only the SAE, not a (sae, cfg_dict, sparsity) tuple
+ sae = SAE.from_pretrained(
78
+     release="gpt2-small-res-jb",
79
+     sae_id="blocks.8.hook_resid_pre",
80
+     device="cuda"
81
+ )
82
+
83
+ # 2. Get model activations
84
+ tokens = model.to_tokens("The capital of France is Paris")
85
+ _, cache = model.run_with_cache(tokens)
86
+ activations = cache["resid_pre", 8] # [batch, pos, d_model]
87
+
88
+ # 3. Encode to SAE features
89
+ sae_features = sae.encode(activations) # [batch, pos, d_sae]
90
+ print(f"Active features: {(sae_features > 0).sum()}")
91
+
92
+ # 4. Find top features for each position
93
+ for pos in range(tokens.shape[1]):
94
+     top_features = sae_features[0, pos].topk(5)
95
+     token = model.to_str_tokens(tokens[0, pos:pos+1])[0]
96
+     print(f"Token '{token}': features {top_features.indices.tolist()}")
97
+
98
+ # 5. Reconstruct activations
99
+ reconstructed = sae.decode(sae_features)
100
+ reconstruction_error = (activations - reconstructed).norm()
101
+ ```
102
+
103
+ ### Available Pre-trained SAEs
104
+
105
+ | Release | Model | Layers |
106
+ |---------|-------|--------|
107
+ | `gpt2-small-res-jb` | GPT-2 Small | Multiple residual streams |
108
+ | `gemma-2b-res` | Gemma 2B | Residual streams |
109
+ | Various on HuggingFace | Search tag `saelens` | Various |
110
+
111
+ ### Checklist
112
+ - [ ] Load model with TransformerLens
113
+ - [ ] Load matching SAE for target layer
114
+ - [ ] Encode activations to sparse features
115
+ - [ ] Identify top-activating features per token
116
+ - [ ] Validate reconstruction quality
117
+
118
+ ## Workflow 2: Training a Custom SAE
119
+
120
+ ### Step-by-Step
121
+
122
+ ```python
123
+ from sae_lens import SAE, LanguageModelSAERunnerConfig, SAETrainingRunner
124
+
125
+ # 1. Configure training
126
+ cfg = LanguageModelSAERunnerConfig(
127
+     # Model
128
+     model_name="gpt2-small",
129
+     hook_name="blocks.8.hook_resid_pre",
130
+     hook_layer=8,
131
+     d_in=768,  # Model dimension
132
+
133
+     # SAE architecture
134
+     architecture="standard",  # or "gated", "topk"
135
+     d_sae=768 * 8,  # Expansion factor of 8
136
+     activation_fn="relu",
137
+
138
+     # Training
139
+     lr=4e-4,
140
+     l1_coefficient=8e-5,  # Sparsity penalty
141
+     l1_warm_up_steps=1000,
142
+     train_batch_size_tokens=4096,
143
+     training_tokens=100_000_000,
144
+
145
+     # Data
146
+     dataset_path="monology/pile-uncopyrighted",
147
+     context_size=128,
148
+
149
+     # Logging
150
+     log_to_wandb=True,
151
+     wandb_project="sae-training",
152
+
153
+     # Checkpointing
154
+     checkpoint_path="checkpoints",
155
+     n_checkpoints=5,
156
+ )
157
+
158
+ # 2. Train
159
+ trainer = SAETrainingRunner(cfg)
160
+ sae = trainer.run()
161
+
162
+ # 3. Evaluate (assumes the runner exposes a `metrics` dict; if it does not, read these values from the W&B run)
163
+ print(f"L0 (avg active features): {trainer.metrics['l0']}")
164
+ print(f"CE Loss Recovered: {trainer.metrics['ce_loss_score']}")
165
+ ```
166
+
167
+ ### Key Hyperparameters
168
+
169
+ | Parameter | Typical Value | Effect |
170
+ |-----------|---------------|--------|
171
+ | `d_sae` | 4-16× d_model | More features, higher capacity |
172
+ | `l1_coefficient` | 5e-5 to 1e-4 | Higher = sparser features, worse reconstruction |
173
+ | `lr` | 1e-4 to 1e-3 | Standard optimizer LR |
174
+ | `l1_warm_up_steps` | 500-2000 | Prevents early feature death |
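To make the effect of `l1_warm_up_steps` concrete, here is a minimal sketch of a warm-up schedule. It assumes a linear ramp from zero to the final coefficient; the function name `l1_coefficient_at` is hypothetical, and the exact schedule used by the library may differ.

```python
def l1_coefficient_at(step, final_coefficient, warm_up_steps):
    """Linearly ramp the L1 coefficient from 0 to its final value (assumed schedule)."""
    if warm_up_steps <= 0:
        # No warm-up: the full penalty applies from the first step
        return final_coefficient
    return final_coefficient * min(1.0, step / warm_up_steps)


# Values taken from the training config above: l1_coefficient=8e-5, 1000 warm-up steps
schedule = [l1_coefficient_at(s, 8e-5, 1000) for s in (0, 500, 1000, 5000)]
print(schedule)
```

Early in training the penalty is small, so features can settle on useful directions before sparsity pressure pushes weakly active ones to zero.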
175
+
176
+ ### Evaluation Metrics
177
+
178
+ | Metric | Target | Meaning |
179
+ |--------|--------|---------|
180
+ | **L0** | 50-200 | Average active features per token |
181
+ | **CE Loss Score** | 80-95% | Share of the original model's cross-entropy performance retained when activations are replaced by SAE reconstructions |
182
+ | **Dead Features** | <5% | Features that never activate |
183
+ | **Explained Variance** | >90% | Reconstruction quality |
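As an illustration of how L0 and the dead-feature ratio in the table above are defined, the following sketch computes both from a small matrix of feature activations. The helper `l0_and_dead_fraction` and the toy values are hypothetical; in practice these numbers would be computed over many batches of real SAE outputs.

```python
def l0_and_dead_fraction(feature_acts):
    """feature_acts: list of per-token feature activation vectors (length d_sae each)."""
    n_tokens = len(feature_acts)
    d_sae = len(feature_acts[0])
    # L0: average number of strictly positive features per token
    l0 = sum(sum(1 for a in row if a > 0) for row in feature_acts) / n_tokens
    # A feature counts as dead if it never fires on any token in the sample
    ever_active = [any(row[j] > 0 for row in feature_acts) for j in range(d_sae)]
    dead_fraction = ever_active.count(False) / d_sae
    return l0, dead_fraction


# Toy activations (hypothetical): 3 tokens, d_sae = 4
acts = [
    [0.0, 1.2, 0.0, 0.0],
    [0.5, 0.0, 0.0, 0.0],
    [0.0, 0.3, 0.0, 0.0],
]
l0, dead = l0_and_dead_fraction(acts)
print(l0, dead)
```

Note that "dead" is only meaningful relative to the sample: a feature that is silent on a few thousand tokens may still fire on rarer inputs.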
184
+
185
+ ### Checklist
186
+ - [ ] Choose target layer and hook point
187
+ - [ ] Set expansion factor (d_sae = 4-16× d_model)
188
+ - [ ] Tune L1 coefficient for desired sparsity
189
+ - [ ] Enable L1 warm-up to prevent dead features
190
+ - [ ] Monitor metrics during training (W&B)
191
+ - [ ] Validate L0 and CE loss recovery
192
+ - [ ] Check dead feature ratio
193
+
194
+ ## Workflow 3: Feature Analysis and Steering
195
+
196
+ ### Analyzing Individual Features
197
+
198
+ ```python
199
+ from transformer_lens import HookedTransformer
200
+ from sae_lens import SAE
201
+ import torch
202
+
203
+ model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
204
+ sae, _, _ = SAE.from_pretrained(
205
+ release="gpt2-small-res-jb",
206
+ sae_id="blocks.8.hook_resid_pre",
207
+ device="cuda"
208
+ )
209
+
210
+ # Find what activates a specific feature
211
+ feature_idx = 1234
212
+ test_texts = [
213
+ "The scientist conducted an experiment",
214
+ "I love chocolate cake",
215
+ "The code compiles successfully",
216
+ "Paris is beautiful in spring",
217
+ ]
218
+
219
+ for text in test_texts:
220
+     tokens = model.to_tokens(text)
221
+     _, cache = model.run_with_cache(tokens)
222
+     features = sae.encode(cache["resid_pre", 8])
223
+     activation = features[0, :, feature_idx].max().item()
224
+     print(f"{activation:.3f}: {text}")
225
+ ```
226
+
227
+ ### Feature Steering
228
+
229
+ ```python
230
+ def steer_with_feature(model, sae, prompt, feature_idx, strength=5.0):
231
+     """Add SAE feature direction to residual stream."""
232
+     tokens = model.to_tokens(prompt)
233
+
234
+     # Get feature direction from decoder
235
+     feature_direction = sae.W_dec[feature_idx] # [d_model]
236
+
237
+     def steering_hook(activation, hook):
238
+         # Add scaled feature direction at all positions
239
+         activation += strength * feature_direction
240
+         return activation
241
+
242
+     # Generate with the hook attached; generate() itself takes no fwd_hooks argument
243
+     with model.hooks(fwd_hooks=[("blocks.8.hook_resid_pre", steering_hook)]):
244
+         output = model.generate(
245
+             tokens,
246
+             max_new_tokens=50,
247
+         )
248
+     return model.to_string(output[0])
249
+ ```
250
+
251
+ ### Feature Attribution
252
+
253
+ ```python
254
+ # Which features most affect a specific output?
255
+ tokens = model.to_tokens("The capital of France is")
256
+ _, cache = model.run_with_cache(tokens)
257
+
258
+ # Get features at final position
259
+ features = sae.encode(cache["resid_pre", 8])[0, -1] # [d_sae]
260
+
261
+ # Get logit attribution per feature
262
+ # Feature contribution = feature_activation × decoder_weight × unembedding
263
+ W_dec = sae.W_dec # [d_sae, d_model]
264
+ W_U = model.W_U # [d_model, vocab]
265
+
266
+ # Contribution to "Paris" logit
267
+ paris_token = model.to_single_token(" Paris")
268
+ feature_contributions = features * (W_dec @ W_U[:, paris_token])
269
+
270
+ top_features = feature_contributions.topk(10)
271
+ print("Top features for 'Paris' prediction:")
272
+ for idx, val in zip(top_features.indices, top_features.values):
273
+     print(f" Feature {idx.item()}: {val.item():.3f}")
274
+ ```
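The attribution formula above (feature activation × decoder weight × unembedding) can be checked by hand on a tiny example. This sketch uses plain lists in place of tensors; `logit_attribution` and all values are hypothetical. It is a direct linear attribution and, like the snippet above, ignores the final layer norm and any layers between the hook point and the output.

```python
def logit_attribution(feature_acts, W_dec, unembed_column):
    """Per-feature contribution to one token's logit (direct, linear approximation)."""
    # Project each decoder direction onto the target token's unembedding column
    projections = [sum(w * u for w, u in zip(row, unembed_column)) for row in W_dec]
    # Scale each projection by how strongly the feature fired
    return [a * p for a, p in zip(feature_acts, projections)]


# Toy sizes (hypothetical): d_sae = 3, d_model = 2
feature_acts = [2.0, 0.0, 1.0]
W_dec = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
unembed_column = [3.0, -1.0]  # plays the role of W_U[:, paris_token]

contributions = logit_attribution(feature_acts, W_dec, unembed_column)
print(contributions)
```

An inactive feature contributes nothing regardless of its decoder direction, which is why the ranking depends on both the activation and the projection.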
275
+
276
+ ## Common Issues & Solutions
277
+
278
+ ### Issue: High dead feature ratio
279
+ ```python
280
+ # WRONG: No warm-up, features die early
281
+ cfg = LanguageModelSAERunnerConfig(
282
+ l1_coefficient=1e-4,
283
+ l1_warm_up_steps=0, # Bad!
284
+ )
285
+
286
+ # RIGHT: Warm-up L1 penalty
287
+ cfg = LanguageModelSAERunnerConfig(
288
+ l1_coefficient=8e-5,
289
+ l1_warm_up_steps=1000, # Gradually increase
290
+ use_ghost_grads=True, # Revive dead features
291
+ )
292
+ ```
293
+
294
+ ### Issue: Poor reconstruction (low CE recovery)
295
+ ```python
296
+ # Reduce sparsity penalty
297
+ cfg = LanguageModelSAERunnerConfig(
298
+ l1_coefficient=5e-5, # Lower = better reconstruction
299
+ d_sae=768 * 16, # More capacity
300
+ )
301
+ ```
302
+
303
+ ### Issue: Features not interpretable
304
+ ```python
305
+ # Increase sparsity (higher L1)
306
+ cfg = LanguageModelSAERunnerConfig(
307
+ l1_coefficient=1e-4, # Higher = sparser, more interpretable
308
+ )
309
+ # Or use TopK architecture
310
+ cfg = LanguageModelSAERunnerConfig(
311
+ architecture="topk",
312
+ activation_fn_kwargs={"k": 50}, # Exactly 50 active features
313
+ )
314
+ ```
315
+
316
+ ### Issue: Memory errors during training
317
+ ```python
318
+ cfg = LanguageModelSAERunnerConfig(
319
+ train_batch_size_tokens=2048, # Reduce batch size
320
+ store_batch_size_prompts=4, # Fewer prompts in buffer
321
+ n_batches_in_buffer=8, # Smaller activation buffer
322
+ )
323
+ ```
324
+
325
+ ## Integration with Neuronpedia
326
+
327
+ Browse pre-trained SAE features at [neuronpedia.org](https://neuronpedia.org):
328
+
329
+ ```python
330
+ # Features are indexed by SAE ID
331
+ # Example: gpt2-small layer 8 feature 1234
332
+ # → neuronpedia.org/gpt2-small/8-res-jb/1234
333
+ ```
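A small helper can build such links. The URL pattern below is inferred only from the single example above, so treat both the pattern and the function name `neuronpedia_url` as assumptions and check the result in a browser.

```python
def neuronpedia_url(model_id, layer, sae_set, feature_idx):
    """Build a Neuronpedia feature URL following the pattern in the example above."""
    return f"https://neuronpedia.org/{model_id}/{layer}-{sae_set}/{feature_idx}"


url = neuronpedia_url("gpt2-small", 8, "res-jb", 1234)
print(url)
```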
334
+
335
+ ## Key Classes Reference
336
+
337
+ | Class | Purpose |
338
+ |-------|---------|
339
+ | `SAE` | Sparse Autoencoder model |
340
+ | `LanguageModelSAERunnerConfig` | Training configuration |
341
+ | `SAETrainingRunner` | Training loop manager |
342
+ | `ActivationsStore` | Activation collection and batching |
343
+ | `HookedSAETransformer` | TransformerLens + SAE integration |
344
+
345
+ ## Reference Documentation
346
+
347
+ For detailed API documentation, tutorials, and advanced usage, see the `references/` folder:
348
+
349
+ | File | Contents |
350
+ |------|----------|
351
+ | [references/README.md](references/README.md) | Overview and quick start guide |
352
+ | [references/api.md](references/api.md) | Complete API reference for SAE, TrainingSAE, configurations |
353
+ | [references/tutorials.md](references/tutorials.md) | Step-by-step tutorials for training, analysis, steering |
354
+
355
+ ## External Resources
356
+
357
+ ### Tutorials
358
+ - [Basic Loading & Analysis](https://github.com/jbloomAus/SAELens/blob/main/tutorials/basic_loading_and_analysing.ipynb)
359
+ - [Training a Sparse Autoencoder](https://github.com/jbloomAus/SAELens/blob/main/tutorials/training_a_sparse_autoencoder.ipynb)
360
+ - [ARENA SAE Curriculum](https://www.lesswrong.com/posts/LnHowHgmrMbWtpkxx/intro-to-superposition-and-sparse-autoencoders-colab)
361
+
362
+ ### Papers
363
+ - [Towards Monosemanticity](https://transformer-circuits.pub/2023/monosemantic-features) - Anthropic (2023)
364
+ - [Scaling Monosemanticity](https://transformer-circuits.pub/2024/scaling-monosemanticity/) - Anthropic (2024)
365
+ - [Sparse Autoencoders Find Highly Interpretable Features](https://arxiv.org/abs/2309.08600) - Cunningham et al. (ICLR 2024)
366
+
367
+ ### Official Documentation
368
+ - [SAELens Docs](https://jbloomaus.github.io/SAELens/)
369
+ - [Neuronpedia](https://neuronpedia.org) - Feature browser
370
+
371
+ ## SAE Architectures
372
+
373
+ | Architecture | Description | Use Case |
374
+ |--------------|-------------|----------|
375
+ | **Standard** | ReLU + L1 penalty | General purpose |
376
+ | **Gated** | Learned gating mechanism | Better sparsity control |
377
+ | **TopK** | Exactly K active features | Consistent sparsity |
378
+
379
+ ```python
380
+ # TopK SAE (exactly 50 features active)
381
+ cfg = LanguageModelSAERunnerConfig(
382
+ architecture="topk",
383
+ activation_fn="topk",
384
+ activation_fn_kwargs={"k": 50},
385
+ )
386
+ ```
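To show what the TopK activation does, here is a minimal sketch on a single vector of pre-activations. It assumes the kept values are also clamped at zero, so the result has at most `k` nonzero entries; `topk_activation` is a hypothetical helper, not the library's implementation.

```python
def topk_activation(pre_acts, k):
    """Keep the k largest pre-activations (clamped at zero) and zero out the rest."""
    # Indices of the k largest values
    order = sorted(range(len(pre_acts)), key=lambda i: pre_acts[i], reverse=True)
    keep = set(order[:k])
    return [max(v, 0.0) if i in keep else 0.0 for i, v in enumerate(pre_acts)]


sparse = topk_activation([0.2, -1.0, 3.0, 0.7], k=2)
all_negative = topk_activation([-2.0, -1.0], k=1)
print(sparse, all_negative)
```

Because sparsity is fixed by `k` rather than by a penalty term, L0 stays roughly constant across inputs, which is the "consistent sparsity" noted in the table.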
@@ -0,0 +1,70 @@
1
+ # SAELens Reference Documentation
2
+
3
+ This directory contains comprehensive reference materials for SAELens.
4
+
5
+ ## Contents
6
+
7
+ - [api.md](api.md) - Complete API reference for SAE, TrainingSAE, and configuration classes
8
+ - [tutorials.md](tutorials.md) - Step-by-step tutorials for training and analyzing SAEs
9
+ - [papers.md](papers.md) - Key research papers on sparse autoencoders
10
+
11
+ ## Quick Links
12
+
13
+ - **GitHub Repository**: https://github.com/jbloomAus/SAELens
14
+ - **Neuronpedia**: https://neuronpedia.org (browse pre-trained SAE features)
15
+ - **HuggingFace SAEs**: Search for tag `saelens`
16
+
17
+ ## Installation
18
+
19
+ ```bash
20
+ pip install sae-lens
21
+ ```
22
+
23
+ Requirements: Python 3.10+, transformer-lens>=2.0.0
24
+
25
+ ## Basic Usage
26
+
27
+ ```python
28
+ from transformer_lens import HookedTransformer
29
+ from sae_lens import SAE
30
+
31
+ # Load model and SAE
32
+ model = HookedTransformer.from_pretrained("gpt2-small", device="cuda")
33
+ sae, cfg_dict, sparsity = SAE.from_pretrained(
34
+ release="gpt2-small-res-jb",
35
+ sae_id="blocks.8.hook_resid_pre",
36
+ device="cuda"
37
+ )
38
+
39
+ # Encode activations to sparse features
40
+ tokens = model.to_tokens("Hello world")
41
+ _, cache = model.run_with_cache(tokens)
42
+ activations = cache["resid_pre", 8]
43
+
44
+ features = sae.encode(activations) # Sparse feature activations
45
+ reconstructed = sae.decode(features) # Reconstructed activations
46
+ ```
47
+
48
+ ## Key Concepts
49
+
50
+ ### Sparse Autoencoders
51
+ SAEs decompose dense neural activations into sparse, interpretable features:
52
+ - **Encoder**: Maps d_model → d_sae (typically 4-16x expansion)
53
+ - **ReLU/TopK**: Enforces sparsity
54
+ - **Decoder**: Reconstructs original activations
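The three components above can be sketched as a forward pass on plain lists. This is a simplified illustration assuming a standard ReLU SAE with separate encoder and decoder biases; the names `vecmat` and `sae_forward` and the toy weights are hypothetical, and real implementations may preprocess the input differently.

```python
def vecmat(vec, mat):
    """Multiply a row vector [n] by a matrix [n][m]."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    # Encoder: d_model -> d_sae
    pre = [p + b for p, b in zip(vecmat(x, W_enc), b_enc)]
    # ReLU enforces non-negativity; combined with the L1 penalty it yields sparsity
    features = [max(p, 0.0) for p in pre]
    # Decoder: d_sae -> d_model
    recon = [r + b for r, b in zip(vecmat(features, W_dec), b_dec)]
    return features, recon


# Toy weights (hypothetical): d_model = 2, d_sae = 4
W_enc = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
W_dec = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
features, recon = sae_forward([2.0, -3.0], W_enc, [0.0] * 4, W_dec, [0.0] * 2)
print(features, recon)
```

Only two of the four features fire for this input, yet the decoder recovers the input exactly, which is the behavior an SAE is trained to approximate.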
55
+
56
+ ### Training Loss
57
+ `Loss = MSE(original, reconstructed) + L1_coefficient × L1(features)`
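A worked instance of this loss, for a single activation vector, might look like the following. The normalization (mean over dimensions for MSE, plain sum for L1) is an assumption chosen for illustration, and `sae_loss` is a hypothetical helper; implementations differ in these details.

```python
def sae_loss(original, reconstructed, features, l1_coefficient):
    """MSE reconstruction term plus an L1 sparsity penalty on the features."""
    mse = sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)
    l1 = sum(abs(f) for f in features)
    return mse + l1_coefficient * l1


# MSE = 0.5, L1 = 2.0, so the loss is 0.5 + 0.1 * 2.0
loss = sae_loss([1.0, 2.0], [1.0, 1.0], [0.5, 0.0, 1.5], l1_coefficient=0.1)
print(loss)
```

Raising `l1_coefficient` shifts the optimum toward fewer active features at the cost of a larger reconstruction term.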
58
+
59
+ ### Key Metrics
60
+ - **L0**: Average number of active features (target: 50-200)
61
+ - **CE Loss Score**: Share of the original model's cross-entropy performance retained with SAE reconstructions substituted in (target: 80-95%)
62
+ - **Dead Features**: Features that never activate (target: <5%)
63
+
64
+ ## Available Pre-trained SAEs
65
+
66
+ | Release | Model | Description |
67
+ |---------|-------|-------------|
68
+ | `gpt2-small-res-jb` | GPT-2 Small | Residual stream SAEs |
69
+ | `gemma-2b-res` | Gemma 2B | Residual stream SAEs |
70
+ | Various | Search HuggingFace | Community-trained SAEs |