@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,346 @@
+ ---
+ name: transformer-lens-interpretability
+ description: Provides guidance for mechanistic interpretability research using TransformerLens to inspect and manipulate transformer internals via HookPoints and activation caching. Use when reverse-engineering model algorithms, studying attention patterns, or performing activation patching experiments.
+ version: 1.0.0
+ author: Synthetic Sciences
+ license: MIT
+ tags: [Mechanistic Interpretability, TransformerLens, Activation Patching, Circuit Analysis]
+ dependencies: [transformer-lens>=2.0.0, torch>=2.0.0]
+ ---
+
+ # TransformerLens: Mechanistic Interpretability for Transformers
+
+ TransformerLens is the de facto standard library for mechanistic interpretability research on GPT-style language models. Created by Neel Nanda and maintained by Bryce Meyer, it provides clean interfaces to inspect and manipulate model internals via HookPoints on every activation.
+
+ **GitHub**: [TransformerLensOrg/TransformerLens](https://github.com/TransformerLensOrg/TransformerLens) (2,900+ stars)
+
+ ## When to Use TransformerLens
+
+ **Use TransformerLens when you need to:**
+ - Reverse-engineer algorithms learned during training
+ - Perform activation patching / causal tracing experiments
+ - Study attention patterns and information flow
+ - Analyze circuits (e.g., induction heads, IOI circuit)
+ - Cache and inspect intermediate activations
+ - Apply direct logit attribution
+
+ **Consider alternatives when:**
+ - You need to work with non-transformer architectures → Use **nnsight** or **pyvene**
+ - You want to train/analyze Sparse Autoencoders → Use **SAELens**
+ - You need remote execution on massive models → Use **nnsight** with NDIF
+ - You want higher-level causal intervention abstractions → Use **pyvene**
+
+ ## Installation
+
+ ```bash
+ pip install transformer-lens
+ ```
+
+ For the development version:
+ ```bash
+ pip install git+https://github.com/TransformerLensOrg/TransformerLens
+ ```
+
+ ## Core Concepts
+
+ ### HookedTransformer
+
+ The main class that wraps transformer models with HookPoints on every activation:
+
+ ```python
+ from transformer_lens import HookedTransformer
+
+ # Load a model
+ model = HookedTransformer.from_pretrained("gpt2-small")
+
+ # For gated models (LLaMA, Mistral)
+ import os
+ os.environ["HF_TOKEN"] = "your_token"
+ model = HookedTransformer.from_pretrained("meta-llama/Llama-2-7b-hf")
+ ```
+
+ ### Supported Models (50+)
+
+ | Family | Models |
+ |--------|--------|
+ | GPT-2 | gpt2, gpt2-medium, gpt2-large, gpt2-xl |
+ | LLaMA | llama-7b, llama-13b, llama-2-7b, llama-2-13b |
+ | EleutherAI | pythia-70m to pythia-12b, gpt-neo, gpt-j-6b |
+ | Mistral | mistral-7b, mixtral-8x7b |
+ | Others | phi, qwen, opt, gemma |
+
+ ### Activation Caching
+
+ Run the model and cache all intermediate activations:
+
+ ```python
+ # Get all activations
+ tokens = model.to_tokens("The Eiffel Tower is in")
+ logits, cache = model.run_with_cache(tokens)
+
+ # Access specific activations
+ residual = cache["resid_post", 5]  # Layer 5 residual stream
+ attn_pattern = cache["pattern", 3]  # Layer 3 attention pattern
+ mlp_out = cache["mlp_out", 7]  # Layer 7 MLP output
+
+ # Filter which activations to cache (saves memory)
+ logits, cache = model.run_with_cache(
+     tokens,
+     names_filter=lambda name: "resid_post" in name
+ )
+ ```
+
+ ### ActivationCache Keys
+
+ | Key Pattern | Shape | Description |
+ |-------------|-------|-------------|
+ | `resid_pre, layer` | [batch, pos, d_model] | Residual before attention |
+ | `resid_mid, layer` | [batch, pos, d_model] | Residual after attention |
+ | `resid_post, layer` | [batch, pos, d_model] | Residual after MLP |
+ | `attn_out, layer` | [batch, pos, d_model] | Attention output |
+ | `mlp_out, layer` | [batch, pos, d_model] | MLP output |
+ | `pattern, layer` | [batch, head, q_pos, k_pos] | Attention pattern (post-softmax) |
+ | `q, layer` | [batch, pos, head, d_head] | Query vectors |
+ | `k, layer` | [batch, pos, head, d_head] | Key vectors |
+ | `v, layer` | [batch, pos, head, d_head] | Value vectors |
+
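Tuple keys like `("resid_post", 5)` resolve to full hook-point names such as `blocks.5.hook_resid_post`; the library's `transformer_lens.utils.get_act_name` performs this mapping (and handles more abbreviations than shown here). As a rough sketch of the convention, not the library's implementation:

```python
# Sketch of how short cache keys map to full hook-point names in TransformerLens.
# Illustrative only: utils.get_act_name covers many more cases.

def act_name(name, layer=None):
    """Resolve a short activation name to a full hook-point name."""
    if layer is None:
        return f"hook_{name}"  # e.g. "hook_embed" for the embedding
    if name in ("q", "k", "v", "z", "pattern"):
        return f"blocks.{layer}.attn.hook_{name}"  # attention-internal hooks
    return f"blocks.{layer}.hook_{name}"  # residual stream, attn_out, mlp_out

print(act_name("resid_post", 5))  # blocks.5.hook_resid_post
print(act_name("pattern", 3))     # blocks.3.attn.hook_pattern
```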
+ ## Workflow 1: Activation Patching (Causal Tracing)
+
+ Identify which activations causally affect model output by patching clean activations into corrupted runs.
+
+ ### Step-by-Step
+
+ ```python
+ from transformer_lens import HookedTransformer, patching
+ import torch
+
+ model = HookedTransformer.from_pretrained("gpt2-small")
+
+ # 1. Define clean and corrupted prompts
+ # (they should tokenize to the same length; adjust wording if they don't)
+ clean_prompt = "The Eiffel Tower is in the city of"
+ corrupted_prompt = "The Colosseum is in the city of"
+
+ clean_tokens = model.to_tokens(clean_prompt)
+ corrupted_tokens = model.to_tokens(corrupted_prompt)
+
+ # 2. Get clean activations
+ _, clean_cache = model.run_with_cache(clean_tokens)
+
+ # 3. Define metric (e.g., logit difference)
+ paris_token = model.to_single_token(" Paris")
+ rome_token = model.to_single_token(" Rome")
+
+ def metric(logits):
+     return logits[0, -1, paris_token] - logits[0, -1, rome_token]
+
+ # 4. Patch each position and layer
+ results = torch.zeros(model.cfg.n_layers, clean_tokens.shape[1])
+
+ for layer in range(model.cfg.n_layers):
+     for pos in range(clean_tokens.shape[1]):
+         # Bind pos as a default argument so each hook patches its own position
+         def patch_hook(activation, hook, pos=pos):
+             activation[0, pos] = clean_cache[hook.name][0, pos]
+             return activation
+
+         patched_logits = model.run_with_hooks(
+             corrupted_tokens,
+             fwd_hooks=[(f"blocks.{layer}.hook_resid_post", patch_hook)]
+         )
+         results[layer, pos] = metric(patched_logits).item()
+
+ # 5. Visualize results (layer x position heatmap)
+ ```
+
154
+ ### Checklist
155
+ - [ ] Define clean and corrupted inputs that differ minimally
156
+ - [ ] Choose metric that captures behavior difference
157
+ - [ ] Cache clean activations
158
+ - [ ] Systematically patch each (layer, position) combination
159
+ - [ ] Visualize results as heatmap
160
+ - [ ] Identify causal hotspots
161
+
162
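Raw metric values can be hard to compare across prompts. A common convention (an assumption here, not something the snippet above does) is to normalize the patched metric between the corrupted and clean baselines:

```python
import torch

def normalized_patch_score(patched, clean, corrupted):
    # 0.0 -> patching had no effect (corrupted baseline)
    # 1.0 -> patching fully restored clean behavior
    return (patched - corrupted) / (clean - corrupted)

score = normalized_patch_score(torch.tensor(2.0), torch.tensor(4.0), torch.tensor(0.0))
print(score)  # tensor(0.5000)
```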
## Workflow 2: Circuit Analysis (Indirect Object Identification)

Replicate the IOI circuit discovery from "Interpretability in the Wild".

### Step-by-Step

```python
from transformer_lens import HookedTransformer
import torch

model = HookedTransformer.from_pretrained("gpt2-small")

# IOI task: "When John and Mary went to the store, Mary gave a bottle to"
# Model should predict "John" (the indirect object)

prompt = "When John and Mary went to the store, Mary gave a bottle to"
tokens = model.to_tokens(prompt)

# 1. Get baseline logits
logits, cache = model.run_with_cache(tokens)

john_token = model.to_single_token(" John")
mary_token = model.to_single_token(" Mary")

# 2. Compute logit difference (IO - S)
logit_diff = logits[0, -1, john_token] - logits[0, -1, mary_token]
print(f"Logit difference: {logit_diff.item():.3f}")

# 3. Direct logit attribution by head
def get_head_contribution(layer, head):
    # Project the head's output through W_O and the unembedding
    # (ignores the final LayerNorm for simplicity)
    head_out = cache["z", layer][0, :, head, :]  # [pos, d_head]
    W_O = model.W_O[layer, head]  # [d_head, d_model]
    W_U = model.W_U  # [d_model, vocab]

    # Head's contribution to the logits at the final position
    contribution = head_out[-1] @ W_O @ W_U
    return contribution[john_token] - contribution[mary_token]

# 4. Map all heads
head_contributions = torch.zeros(model.cfg.n_layers, model.cfg.n_heads)
for layer in range(model.cfg.n_layers):
    for head in range(model.cfg.n_heads):
        head_contributions[layer, head] = get_head_contribution(layer, head)

# 5. Identify top contributing heads (name movers, backup name movers)
```

### Checklist
- [ ] Set up the task with clear IO/S tokens
- [ ] Compute the baseline logit difference
- [ ] Decompose by attention-head contributions
- [ ] Identify key circuit components (name movers, S-inhibition, induction)
- [ ] Validate with ablation experiments

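The projection inside `get_head_contribution` is plain linear algebra and can be checked on toy tensors (all dimensions below are illustrative assumptions, not GPT-2's real shapes):

```python
import torch

torch.manual_seed(0)
d_head, d_model, vocab = 4, 8, 10
z = torch.randn(d_head)             # head output at the final position
W_O = torch.randn(d_head, d_model)  # head's output projection
W_U = torch.randn(d_model, vocab)   # unembedding matrix

logit_contrib = z @ W_O @ W_U       # this head's additive effect on the logits
io_tok, s_tok = 3, 7                # hypothetical IO and S token ids
score = logit_contrib[io_tok] - logit_contrib[s_tok]
print(logit_contrib.shape)  # torch.Size([10])
```

Because the residual stream is a sum of head outputs, these per-head contributions add up (before LayerNorm) to the model's actual logits.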
## Workflow 3: Induction Head Detection

Find induction heads, which implement the [A][B]...[A] → [B] pattern.

```python
from transformer_lens import HookedTransformer
import torch

model = HookedTransformer.from_pretrained("gpt2-small")

# Create a repeated sequence: [A][B][A] should predict [B]
repeated_tokens = torch.tensor([[1000, 2000, 1000]])  # Arbitrary token ids

_, cache = model.run_with_cache(repeated_tokens)

# Induction heads attend from the final [A] back to the first [B],
# i.e. from position 2 to position 1
induction_scores = torch.zeros(model.cfg.n_layers, model.cfg.n_heads)

for layer in range(model.cfg.n_layers):
    pattern = cache["pattern", layer][0]  # [head, q_pos, k_pos]
    induction_scores[layer] = pattern[:, 2, 1]

# Heads with high scores are candidate induction heads
top_heads = torch.topk(induction_scores.flatten(), k=5)
```

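For longer inputs, the same check generalizes: repeat a length-n sequence twice and average the attention at offset n-1. A pure-tensor sketch of that score (the offset convention follows the induction-head definition above; the helper name is ours):

```python
import torch

def induction_score(pattern, seq_len):
    # pattern: [head, q_pos, k_pos] for a doubled sequence [x_1..x_n, x_1..x_n]
    # An induction head attends from each token in the second copy back to
    # the token *after* its match in the first copy (offset seq_len - 1).
    q = torch.arange(seq_len, 2 * seq_len)
    return pattern[:, q, q - (seq_len - 1)].mean(dim=-1)

# A head with a perfect induction stripe scores 1.0
n = 4
perfect = torch.zeros(1, 2 * n, 2 * n)
q = torch.arange(n, 2 * n)
perfect[0, q, q - (n - 1)] = 1.0
print(induction_score(perfect, n))  # tensor([1.])
```

In practice, long repeated *random* token sequences give cleaner induction scores than a 3-token prompt, since they rule out heads that respond to meaningful content.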
## Common Issues & Solutions

### Issue: Hooks persist after debugging
```python
# WRONG: Hooks added with add_hook stay active across runs
model.add_hook("blocks.0.hook_resid_post", debug_hook)
model(tokens)  # Debug run
model(tokens)  # debug_hook still fires!

# RIGHT: Always reset hooks when done debugging
model.reset_hooks()
model.run_with_hooks(tokens, fwd_hooks=[...])  # Temporary hooks, removed after the run
```

### Issue: Tokenization gotchas
```python
# WRONG: Assuming every word is one token
model.to_tokens("Tim")   # "Tim" is a single token
model.to_tokens("Neel")  # "Neel" becomes "Ne" + "el" (two tokens!)

# RIGHT: Check tokenization explicitly
tokens = model.to_tokens("Neel", prepend_bos=False)
print(model.to_str_tokens(tokens))  # ['Ne', 'el']
```

### Issue: LayerNorm ignored in analysis
```python
# WRONG: Ignoring LayerNorm before the MLP
pre_activation = residual @ model.W_in[layer]

# RIGHT: Apply LayerNorm first
ln_out = model.blocks[layer].ln2(residual)
pre_activation = ln_out @ model.W_in[layer]
```

### Issue: Memory explosion with large models
```python
# Use selective caching and keep the cache on CPU
logits, cache = model.run_with_cache(
    tokens,
    names_filter=lambda n: "resid_post" in n or "pattern" in n,
    device="cpu"  # Store cached activations on CPU
)
```
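To see why caching everything explodes, here is a back-of-envelope estimate for just the residual stream in float32 (one of many cached tensors, so the true footprint is several times larger; the helper is illustrative):

```python
def resid_cache_bytes(n_layers, batch, seq, d_model, bytes_per_elem=4):
    # Bytes needed to cache one residual-stream tensor per layer
    return n_layers * batch * seq * d_model * bytes_per_elem

# gpt2-small: 12 layers, d_model=768, one 1024-token prompt
print(resid_cache_bytes(12, 1, 1024, 768) / 2**20)  # 36.0 (MiB)
```

Scale d_model and n_layers up to a 7B-parameter model and multiply by every hook point, and a full cache quickly exceeds GPU memory, hence `names_filter` and `device="cpu"`.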

## Key Classes Reference

| Class | Purpose |
|-------|---------|
| `HookedTransformer` | Main model wrapper with hooks |
| `ActivationCache` | Dictionary-like cache of activations |
| `HookedTransformerConfig` | Model configuration |
| `FactoredMatrix` | Efficient factored matrix operations |

## Integration with SAELens

TransformerLens integrates with SAELens for sparse autoencoder (SAE) analysis:

```python
from transformer_lens import HookedTransformer
from sae_lens import SAE

model = HookedTransformer.from_pretrained("gpt2-small")
sae = SAE.from_pretrained("gpt2-small-res-jb", "blocks.8.hook_resid_pre")

# Run the model, then encode cached activations with the SAE
tokens = model.to_tokens("Hello world")
_, cache = model.run_with_cache(tokens)
sae_acts = sae.encode(cache["resid_pre", 8])
```

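Conceptually, `sae.encode` is a linear map plus a nonlinearity that turns dense residual-stream vectors into sparse feature activations. A toy sketch of that encoder step (the ReLU encoder form and all shapes are assumptions; real SAE architectures vary):

```python
import torch

torch.manual_seed(0)
d_model, d_sae = 8, 32                # SAE widens the representation
x = torch.randn(3, d_model)           # residual-stream activations
W_enc = torch.randn(d_model, d_sae)   # encoder weights
b_enc = torch.zeros(d_sae)            # encoder bias

feature_acts = torch.relu(x @ W_enc + b_enc)  # sparse, non-negative features
print(feature_acts.shape)  # torch.Size([3, 32])
```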
## Reference Documentation

For detailed API documentation, tutorials, and advanced usage, see the `references/` folder:

| File | Contents |
|------|----------|
| [references/README.md](references/README.md) | Overview and quick-start guide |
| [references/api.md](references/api.md) | Complete API reference for HookedTransformer, ActivationCache, and HookPoints |
| [references/tutorials.md](references/tutorials.md) | Step-by-step tutorials for activation patching, circuit analysis, and the logit lens |

## External Resources

### Tutorials
- [Main Demo Notebook](https://transformerlensorg.github.io/TransformerLens/generated/demos/Main_Demo.html)
- [Activation Patching Demo](https://colab.research.google.com/github/TransformerLensOrg/TransformerLens/blob/main/demos/Activation_Patching_in_TL_Demo.ipynb)
- [ARENA Mech Interp Course](https://arena-foundation.github.io/ARENA/) - 200+ hours of tutorials

### Papers
- [A Mathematical Framework for Transformer Circuits](https://transformer-circuits.pub/2021/framework/index.html)
- [In-context Learning and Induction Heads](https://transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html)
- [Interpretability in the Wild (IOI)](https://arxiv.org/abs/2211.00593)

### Official Documentation
- [Official Docs](https://transformerlensorg.github.io/TransformerLens/)
- [Model Properties Table](https://transformerlensorg.github.io/TransformerLens/generated/model_properties_table.html)
- [Neel Nanda's Glossary](https://www.neelnanda.io/mechanistic-interpretability/glossary)

## Version Notes

- **v2.0**: Removed HookedSAE (moved to SAELens)
- **v3.0 (alpha)**: TransformerBridge for loading any nn.Module
@@ -0,0 +1,54 @@
# TransformerLens Reference Documentation

This directory contains comprehensive reference materials for TransformerLens.

## Contents

- [api.md](api.md) - Complete API reference for HookedTransformer, ActivationCache, and HookPoints
- [tutorials.md](tutorials.md) - Step-by-step tutorials for common interpretability workflows
- [papers.md](papers.md) - Key research papers and foundational concepts

## Quick Links

- **Official Documentation**: https://transformerlensorg.github.io/TransformerLens/
- **GitHub Repository**: https://github.com/TransformerLensOrg/TransformerLens
- **Model Properties Table**: https://transformerlensorg.github.io/TransformerLens/generated/model_properties_table.html

## Installation

```bash
pip install transformer-lens
```

## Basic Usage

```python
from transformer_lens import HookedTransformer

# Load a model
model = HookedTransformer.from_pretrained("gpt2-small")

# Run with activation caching
tokens = model.to_tokens("Hello world")
logits, cache = model.run_with_cache(tokens)

# Access activations
residual = cache["resid_post", 5]  # Layer 5 residual stream
attention = cache["pattern", 3]    # Layer 3 attention patterns
```

## Key Concepts

### HookPoints
Every activation in the transformer is wrapped in a HookPoint, enabling:
- Reading activations via `run_with_cache()`
- Modifying activations via `run_with_hooks()`

+
47
+ ### Activation Cache
48
+ The `ActivationCache` stores all intermediate activations with helper methods for:
49
+ - Residual stream decomposition
50
+ - Logit attribution
51
+ - Layer-wise analysis
52
+
53
+ ### Supported Models (50+)
54
+ GPT-2, LLaMA, Mistral, Pythia, GPT-Neo, OPT, Gemma, Phi, and more.