@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,436 @@
1
+ ---
2
+ name: nnsight-remote-interpretability
3
+ description: Provides guidance for interpreting and manipulating neural network internals using nnsight with optional NDIF remote execution. Use when needing to run interpretability experiments on massive models (70B+) without local GPU resources, or when working with any PyTorch architecture.
4
+ version: 1.0.0
5
+ author: Synthetic Sciences
6
+ license: MIT
7
+ tags: [nnsight, NDIF, Remote Execution, Mechanistic Interpretability, Model Internals]
8
+ dependencies: [nnsight>=0.5.0, torch>=2.0.0]
9
+ ---
10
+
11
+ # nnsight: Transparent Access to Neural Network Internals
12
+
13
+ nnsight (/ɛn.saɪt/) enables researchers to interpret and manipulate the internals of any PyTorch model, with the unique capability of running the same code locally on small models or remotely on massive models (70B+) via NDIF.
14
+
15
+ **GitHub**: [ndif-team/nnsight](https://github.com/ndif-team/nnsight) (730+ stars)
16
+ **Paper**: [NNsight and NDIF: Democratizing Access to Foundation Model Internals](https://arxiv.org/abs/2407.14561) (ICLR 2025)
17
+
18
+ ## Key Value Proposition
19
+
20
+ **Write once, run anywhere**: The same interpretability code works on GPT-2 locally or Llama-3.1-405B remotely. Just toggle `remote=True`.
21
+
22
+ ```python
23
+ # Local execution (small model)
24
+ with model.trace("Hello world"):
25
+     hidden = model.transformer.h[5].output[0].save()
26
+
27
+ # Remote execution (massive model): same API, using the Llama module paths
28
+ with model.trace("Hello world", remote=True):
29
+     hidden = model.model.layers[40].output[0].save()
30
+ ```
31
+
32
+ ## When to Use nnsight
33
+
34
+ **Use nnsight when you need to:**
35
+ - Run interpretability experiments on models too large for local GPUs (70B, 405B)
36
+ - Work with any PyTorch architecture (transformers, Mamba, custom models)
37
+ - Perform multi-token generation interventions
38
+ - Share activations between different prompts
39
+ - Access full model internals without reimplementation
40
+
41
+ **Consider alternatives when:**
42
+ - You want a consistent API across models → Use **TransformerLens**
43
+ - You need declarative, shareable interventions → Use **pyvene**
44
+ - You're training SAEs → Use **SAELens**
45
+ - You only work with small models locally → **TransformerLens** may be simpler
46
+
47
+ ## Installation
48
+
49
+ ```bash
50
+ # Basic installation
51
+ pip install nnsight
52
+
53
+ # For vLLM support
54
+ pip install "nnsight[vllm]"
55
+ ```
56
+
57
+ For remote NDIF execution, sign up at [login.ndif.us](https://login.ndif.us) for an API key.
58
+
59
+ ## Core Concepts
60
+
61
+ ### LanguageModel Wrapper
62
+
63
+ ```python
64
+ from nnsight import LanguageModel
65
+
66
+ # Load model (uses HuggingFace under the hood)
67
+ model = LanguageModel("openai-community/gpt2", device_map="auto")
68
+
69
+ # For larger models
70
+ model = LanguageModel("meta-llama/Llama-3.1-8B", device_map="auto")
71
+ ```
72
+
73
+ ### Tracing Context
74
+
75
+ The `trace` context manager enables deferred execution - operations are collected into a computation graph:
76
+
77
+ ```python
78
+ from nnsight import LanguageModel
79
+
80
+ model = LanguageModel("gpt2", device_map="auto")
81
+
82
+ with model.trace("The Eiffel Tower is in") as tracer:
83
+     # Access any module's output
84
+     hidden_states = model.transformer.h[5].output[0].save()
85
+
86
+     # Access attention patterns
87
+     attn = model.transformer.h[5].attn.attn_dropout.input[0][0].save()
88
+
89
+     # Modify activations
90
+     model.transformer.h[8].output[0][:] = 0 # Zero out layer 8
91
+
92
+     # Get final output
93
+     logits = model.output.save()
94
+
95
+ # After context exits, access saved values
96
+ print(hidden_states.shape) # [batch, seq, hidden]
97
+ ```
98
+
99
+ ### Proxy Objects
100
+
101
+ Inside `trace`, module accesses return Proxy objects that record operations:
102
+
103
+ ```python
104
+ with model.trace("Hello"):
105
+     # These are all Proxy objects - operations are deferred
106
+     h5_out = model.transformer.h[5].output[0] # Proxy
107
+     h5_mean = h5_out.mean(dim=-1) # Proxy
108
+     h5_saved = h5_mean.save() # Save for later access
109
+ ```
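
To make the idea of deferred execution concrete, here is a minimal pure-Python sketch of a proxy that records operations and replays them later on a real value. It is a toy illustration only: the `Proxy` class and its `replay` method below are hypothetical names invented for this example and are not nnsight's implementation or API.

```python
class Proxy:
    """Toy deferred value: records operations instead of running them."""

    def __init__(self, ops=()):
        # Recorded (method_name, args, kwargs) tuples, in call order.
        self.ops = tuple(ops)

    def __getattr__(self, name):
        # Only reached for attributes not found normally.
        if name.startswith("_"):
            raise AttributeError(name)

        def record(*args, **kwargs):
            return Proxy(self.ops + ((name, args, kwargs),))

        return record

    def __getitem__(self, index):
        return Proxy(self.ops + (("__getitem__", (index,), {}),))

    def replay(self, value):
        """Apply every recorded operation to a concrete value."""
        for name, args, kwargs in self.ops:
            value = getattr(value, name)(*args, **kwargs)
        return value


# Build the chain of operations first; nothing is computed yet.
deferred = Proxy()[1].upper().replace("L", "_")

# Later, a concrete value flows through the recorded operations.
result = deferred.replay(["hello", "world"])
print(result)  # WOR_D
```

The point of the sketch is the separation between describing a computation and running it, which is what allows the same description to be executed locally or shipped elsewhere.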
110
+
111
+ ## Workflow 1: Activation Analysis
112
+
113
+ ### Step-by-Step
114
+
115
+ ```python
116
+ from nnsight import LanguageModel
117
+ import torch
118
+
119
+ model = LanguageModel("gpt2", device_map="auto")
120
+
121
+ prompt = "The capital of France is"
122
+
123
+ with model.trace(prompt) as tracer:
124
+     # 1. Collect activations from multiple layers
125
+     layer_outputs = []
126
+     for i in range(12): # GPT-2 has 12 layers
127
+         layer_out = model.transformer.h[i].output[0].save()
128
+         layer_outputs.append(layer_out)
129
+
130
+     # 2. Get attention patterns
131
+     attn_patterns = []
132
+     for i in range(12):
133
+         # Access attention weights (after softmax)
134
+         attn = model.transformer.h[i].attn.attn_dropout.input[0][0].save()
135
+         attn_patterns.append(attn)
136
+
137
+     # 3. Get final logits
138
+     logits = model.output.save()
139
+
140
+ # 4. Analyze outside context
141
+ for i, layer_out in enumerate(layer_outputs):
142
+     print(f"Layer {i} output shape: {layer_out.shape}")
143
+     print(f"Layer {i} norm: {layer_out.norm().item():.3f}")
144
+
145
+ # 5. Find top predictions
146
+ probs = torch.softmax(logits[0, -1], dim=-1)
147
+ top_tokens = probs.topk(5)
148
+ for token, prob in zip(top_tokens.indices, top_tokens.values):
149
+     print(f"{model.tokenizer.decode(token)}: {prob.item():.3f}")
150
+ ```
151
+
152
+ ### Checklist
153
+ - [ ] Load model with LanguageModel wrapper
154
+ - [ ] Use trace context for operations
155
+ - [ ] Call `.save()` on values you need after context
156
+ - [ ] Access saved values outside context
157
+ - [ ] Use `.shape`, `.norm()`, etc. for analysis
158
+
159
+ ## Workflow 2: Activation Patching
160
+
161
+ ### Step-by-Step
162
+
163
+ ```python
164
+ from nnsight import LanguageModel
165
+ import torch
166
+
167
+ model = LanguageModel("gpt2", device_map="auto")
168
+
169
+ clean_prompt = "The Eiffel Tower is in"
170
+ corrupted_prompt = "The Colosseum is in"
171
+
172
+ # 1. Get clean activations
173
+ with model.trace(clean_prompt) as tracer:
174
+     clean_hidden = model.transformer.h[8].output[0].save()
+
+ # 2. Patch clean into corrupted run
+ with model.trace(corrupted_prompt) as tracer:
+     # Replace layer 8 output with clean activations
+     # (both prompts must tokenize to the same length for the shapes to match)
+     model.transformer.h[8].output[0][:] = clean_hidden
+
+     patched_logits = model.lm_head.output.save()
182
+
183
+ # 3. Compare predictions
184
+ paris_token = model.tokenizer.encode(" Paris")[0]
185
+ rome_token = model.tokenizer.encode(" Rome")[0]
186
+
187
+ patched_probs = torch.softmax(patched_logits[0, -1], dim=-1)
188
+ print(f"Paris prob: {patched_probs[paris_token].item():.3f}")
189
+ print(f"Rome prob: {patched_probs[rome_token].item():.3f}")
190
+ ```
191
+
192
+ ### Systematic Patching Sweep
193
+
194
+ ```python
195
+ def patch_layer_position(layer, position, clean_cache, corrupted_prompt):
+     """Patch single layer/position from clean to corrupted."""
+     with model.trace(corrupted_prompt) as tracer:
+         # Get current activation
+         current = model.transformer.h[layer].output[0]
+
+         # Patch only specific position
+         current[:, position, :] = clean_cache[layer][:, position, :]
+
+         logits = model.lm_head.output.save()
+
+     return logits
+
+ # Sweep over all layers and positions.
+ # Assumes clean_cache (clean activations indexed by layer), seq_len, and compute_metric are defined elsewhere.
+ results = torch.zeros(12, seq_len)
+ for layer in range(12):
+     for pos in range(seq_len):
+         logits = patch_layer_position(layer, pos, clean_cache, corrupted_prompt)
+         results[layer, pos] = compute_metric(logits)
214
+ ```
215
+
216
+ ## Workflow 3: Remote Execution with NDIF
217
+
218
+ Run the same experiments on massive models without local GPUs.
219
+
220
+ ### Step-by-Step
221
+
222
+ ```python
223
+ from nnsight import LanguageModel
224
+
225
+ # 1. Load large model (will run remotely)
226
+ model = LanguageModel("meta-llama/Llama-3.1-70B")
227
+
228
+ # 2. Same code, just add remote=True
229
+ with model.trace("The meaning of life is", remote=True) as tracer:
230
+     # Access internals of 70B model!
+     layer_40_out = model.model.layers[40].output[0].save()
+     logits = model.lm_head.output.save()
+
+ # 3. Results returned from NDIF
+ print(f"Layer 40 shape: {layer_40_out.shape}")
+
+ # 4. Generation with interventions
+ with model.generate(max_new_tokens=50, remote=True) as tracer:
+     with tracer.invoke("What is 2+2?"):
+         # Intervene during generation
+         model.model.layers[20].output[0][:, -1, :] *= 1.5
+
+         # Save the generated token ids
+         output = model.generator.output.save()
244
+ ```
245
+
246
+ ### NDIF Setup
247
+
248
+ 1. Sign up at [login.ndif.us](https://login.ndif.us)
249
+ 2. Get API key
250
+ 3. Set environment variable or pass to nnsight:
251
+
252
+ ```python
253
+ import os
254
+ os.environ["NDIF_API_KEY"] = "your_key"
255
+
256
+ # Or configure directly
257
+ from nnsight import CONFIG
258
+ CONFIG.API_KEY = "your_key"
259
+ ```
260
+
261
+ ### Available Models on NDIF
262
+
263
+ - Llama-3.1-8B, 70B, 405B
264
+ - DeepSeek-R1 models
265
+ - Various open-weight models (check [ndif.us](https://ndif.us) for current list)
266
+
267
+ ## Workflow 4: Cross-Prompt Activation Sharing
268
+
269
+ Share activations between different inputs in a single trace.
270
+
271
+ ```python
272
+ from nnsight import LanguageModel
273
+
274
+ model = LanguageModel("gpt2", device_map="auto")
275
+
276
+ with model.trace() as tracer:
277
+     # First prompt
+     with tracer.invoke("The cat sat on the"):
+         cat_hidden = model.transformer.h[6].output[0].save()
+
+     # Second prompt - inject cat's activations
+     with tracer.invoke("The dog ran through the"):
+         # Replace with cat's activations at layer 6
+         model.transformer.h[6].output[0][:] = cat_hidden
+         dog_with_cat = model.output.save()
286
+
287
+ # The dog prompt now has cat's internal representations
288
+ ```
289
+
290
+ ## Workflow 5: Gradient-Based Analysis
291
+
292
+ Access gradients during the backward pass.
293
+
294
+ ```python
295
+ from nnsight import LanguageModel
296
+ import torch
297
+
298
+ model = LanguageModel("gpt2", device_map="auto")
299
+
300
+ with model.trace("The quick brown fox") as tracer:
301
+     # Save activations and enable gradient
+     hidden = model.transformer.h[5].output[0].save()
+     hidden.retain_grad()
+
+     # Logits tensor from the language-model head
+     logits = model.lm_head.output
+
+     # Compute loss on specific token
+     target_token = model.tokenizer.encode(" jumps")[0]
+     loss = -logits[0, -1, target_token]
+
+     # Backward pass
+     loss.backward()
313
+
314
+ # Access gradients
315
+ grad = hidden.grad
316
+ print(f"Gradient shape: {grad.shape}")
317
+ print(f"Gradient norm: {grad.norm().item():.3f}")
318
+ ```
319
+
320
+ **Note**: Gradient access is not supported for vLLM or remote execution.
321
+
322
+ ## Common Issues & Solutions
323
+
324
+ ### Issue: Module path differs between models
325
+ ```python
326
+ # GPT-2 structure
327
+ model.transformer.h[5].output[0]
328
+
329
+ # LLaMA structure
330
+ model.model.layers[5].output[0]
331
+
332
+ # Solution: Check model structure
333
+ print(model._model) # See actual module names
334
+ ```
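
One way to avoid hard-coding the path is to try the known candidates in order. The sketch below is illustrative only: `resolve_layers` and `CANDIDATE_LAYER_PATHS` are hypothetical names, the stand-in objects are not real models, and whether the helper works unchanged on an nnsight wrapper is an assumption.

```python
from types import SimpleNamespace

# Candidate attribute paths to the list of transformer blocks.
# These two come from the examples above; extend the tuple for other models.
CANDIDATE_LAYER_PATHS = ("transformer.h", "model.layers")


def resolve_layers(model, candidates=CANDIDATE_LAYER_PATHS):
    """Return (path, layers) for the first candidate path that exists on `model`."""
    for path in candidates:
        obj = model
        try:
            for name in path.split("."):
                obj = getattr(obj, name)
        except AttributeError:
            continue  # this path does not exist on the model; try the next one
        return path, obj
    raise AttributeError(f"none of {candidates} found on {type(model).__name__}")


# Stand-ins for real models, used only to demonstrate the lookup.
gpt2_like = SimpleNamespace(transformer=SimpleNamespace(h=["block0", "block1"]))
llama_like = SimpleNamespace(model=SimpleNamespace(layers=["block0"]))

gpt2_path, gpt2_layers = resolve_layers(gpt2_like)
llama_path, llama_layers = resolve_layers(llama_like)
print(gpt2_path, len(gpt2_layers))
print(llama_path, len(llama_layers))
```

Printing the model remains the reliable way to find the right path; the helper only saves repetition once the candidate paths are known.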
335
+
336
+ ### Issue: Forgetting to save
337
+ ```python
338
+ # WRONG: Value not accessible outside trace
339
+ with model.trace("Hello"):
340
+     hidden = model.transformer.h[5].output[0]  # Not saved!
+
+ print(hidden)  # Error or wrong value
+
+ # RIGHT: Call .save()
+ with model.trace("Hello"):
+     hidden = model.transformer.h[5].output[0].save()
347
+
348
+ print(hidden) # Works!
349
+ ```
350
+
351
+ ### Issue: Remote timeout
352
+ ```python
353
+ # For long operations, increase timeout
354
+ with model.trace("prompt", remote=True, timeout=300) as tracer:
355
+     ...  # Long operation goes here
356
+ ```
357
+
358
+ ### Issue: Memory with many saved activations
359
+ ```python
360
+ # Only save what you need
361
+ with model.trace("prompt"):
362
+     # Don't save everything
+     for i in range(100):
+         model.transformer.h[i].output[0].save()  # Memory heavy!
+
+     # Better: save specific layers
+     key_layers = [0, 5, 11]
+     for i in key_layers:
+         model.transformer.h[i].output[0].save()
370
+ ```
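
A rough size estimate helps decide how many layers to save. The sketch below assumes float32 activations and GPT-2 small dimensions (12 layers, hidden size 768) at a 1024-token sequence; `saved_activation_bytes` is a hypothetical helper, and the figure covers only the saved hidden states, not the model's own working memory.

```python
def saved_activation_bytes(n_layers, batch, seq_len, hidden_size, bytes_per_value=4):
    """Bytes needed to keep one hidden-state tensor per saved layer."""
    return n_layers * batch * seq_len * hidden_size * bytes_per_value


MIB = 1024 * 1024

# Saving every layer versus only three key layers.
all_layers = saved_activation_bytes(n_layers=12, batch=1, seq_len=1024, hidden_size=768)
key_layers = saved_activation_bytes(n_layers=3, batch=1, seq_len=1024, hidden_size=768)

print(f"all 12 layers: {all_layers / MIB:.0f} MiB")
print(f"3 key layers:  {key_layers / MIB:.0f} MiB")
```

The cost grows linearly in each factor, so batch size and sequence length matter as much as the number of saved layers.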
371
+
372
+ ### Issue: vLLM gradient limitation
373
+ ```python
374
+ # vLLM doesn't support gradients
375
+ # Use standard execution for gradient analysis
376
+ model = LanguageModel("gpt2", device_map="auto") # Not vLLM
377
+ ```
378
+
379
+ ## Key API Reference
380
+
381
+ | Method/Property | Purpose |
382
+ |-----------------|---------|
383
+ | `model.trace(prompt, remote=False)` | Start tracing context |
384
+ | `proxy.save()` | Save value for access after trace |
385
+ | `proxy[:]` | Slice/index proxy (assignment patches) |
386
+ | `tracer.invoke(prompt)` | Add prompt within trace |
387
+ | `model.generate(...)` | Generate with interventions |
388
+ | `model.output` | Final model output object (for HuggingFace causal LMs, the logits are in `model.output.logits`) |
389
+ | `model._model` | Underlying HuggingFace model |
390
+
391
+ ## Comparison with Other Tools
392
+
393
+ | Feature | nnsight | TransformerLens | pyvene |
394
+ |---------|---------|-----------------|--------|
395
+ | Any architecture | Yes | Transformers only | Yes |
396
+ | Remote execution | Yes (NDIF) | No | No |
397
+ | Consistent API | No | Yes | Yes |
398
+ | Deferred execution | Yes | No | No |
399
+ | HuggingFace native | Yes | Reimplemented | Yes |
400
+ | Shareable configs | No | No | Yes |
401
+
402
+ ## Reference Documentation
403
+
404
+ For detailed API documentation, tutorials, and advanced usage, see the `references/` folder:
405
+
406
+ | File | Contents |
407
+ |------|----------|
408
+ | [references/README.md](references/README.md) | Overview and quick start guide |
409
+ | [references/api.md](references/api.md) | Complete API reference for LanguageModel, tracing, proxy objects |
410
+ | [references/tutorials.md](references/tutorials.md) | Step-by-step tutorials for local and remote interpretability |
411
+
412
+ ## External Resources
413
+
414
+ ### Tutorials
415
+ - [Getting Started](https://nnsight.net/start/)
416
+ - [Features Overview](https://nnsight.net/features/)
417
+ - [Remote Execution](https://nnsight.net/notebooks/features/remote_execution/)
418
+ - [Applied Tutorials](https://nnsight.net/applied_tutorials/)
419
+
420
+ ### Official Documentation
421
+ - [Official Docs](https://nnsight.net/documentation/)
422
+ - [NDIF Info](https://ndif.us/)
423
+ - [Community Forum](https://discuss.ndif.us/)
424
+
425
+ ### Papers
426
+ - [NNsight and NDIF Paper](https://arxiv.org/abs/2407.14561) - Fiotto-Kaufman et al. (ICLR 2025)
427
+
428
+ ## Architecture Support
429
+
430
+ nnsight works with any PyTorch model:
431
+ - **Transformers**: GPT-2, LLaMA, Mistral, etc.
432
+ - **State Space Models**: Mamba
433
+ - **Vision Models**: ViT, CLIP
434
+ - **Custom architectures**: Any nn.Module
435
+
436
+ The key is knowing the module structure to access the right components.
@@ -0,0 +1,78 @@
1
+ # nnsight Reference Documentation
2
+
3
+ This directory contains comprehensive reference materials for nnsight.
4
+
5
+ ## Contents
6
+
7
+ - [api.md](api.md) - Complete API reference for LanguageModel, tracing, and proxy objects
8
+ - [tutorials.md](tutorials.md) - Step-by-step tutorials for local and remote interpretability
9
+
10
+ ## Quick Links
11
+
12
+ - **Official Documentation**: https://nnsight.net/
13
+ - **GitHub Repository**: https://github.com/ndif-team/nnsight
14
+ - **NDIF (Remote Execution)**: https://ndif.us/
15
+ - **Community Forum**: https://discuss.ndif.us/
16
+ - **Paper**: https://arxiv.org/abs/2407.14561 (ICLR 2025)
17
+
18
+ ## Installation
19
+
20
+ ```bash
21
+ # Basic installation
22
+ pip install nnsight
23
+
24
+ # For vLLM support
25
+ pip install "nnsight[vllm]"
26
+ ```
27
+
28
+ ## Basic Usage
29
+
30
+ ```python
31
+ from nnsight import LanguageModel
32
+
33
+ # Load model
34
+ model = LanguageModel("openai-community/gpt2", device_map="auto")
35
+
36
+ # Trace and access internals
37
+ with model.trace("The Eiffel Tower is in") as tracer:
38
+     # Access layer output
+     hidden = model.transformer.h[5].output[0].save()
+
+     # Modify activations
+     model.transformer.h[8].output[0][:] *= 0.5
+
+     # Get final logits
+     logits = model.lm_head.output.save()
46
+
47
+ # Access saved values outside context
48
+ print(hidden.shape)
49
+ ```
50
+
51
+ ## Key Concepts
52
+
53
+ ### Tracing
54
+ The `trace()` context enables deferred execution - operations are recorded and executed together.
55
+
56
+ ### Proxy Objects
57
+ Inside trace, module accesses return Proxies. Call `.save()` to retrieve values after execution.
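
The idea of deferred execution can be illustrated with a toy in plain Python. This is a conceptual sketch only: `ToyProxy` and `ToyTracer` are hypothetical classes written for this example and do not reflect how nnsight implements tracing.

```python
class ToyProxy:
    """Records operations instead of running them (a toy, not nnsight's Proxy)."""

    def __init__(self, tracer, compute):
        self._tracer = tracer
        self._compute = compute  # zero-argument function that produces the value
        self.value = None        # filled in when the trace exits, if saved

    def __mul__(self, other):
        return ToyProxy(self._tracer, lambda: self._compute() * other)

    def __add__(self, other):
        return ToyProxy(self._tracer, lambda: self._compute() + other)

    def save(self):
        # Mark this proxy so its value is kept after the trace runs.
        self._tracer.saved.append(self)
        return self


class ToyTracer:
    """Context manager that runs all recorded operations on exit."""

    def __init__(self, x):
        self.x = x
        self.saved = []

    def __enter__(self):
        return self

    def input(self):
        return ToyProxy(self, lambda: self.x)

    def __exit__(self, exc_type, exc, tb):
        if exc_type is None:
            for proxy in self.saved:
                proxy.value = proxy._compute()
        return False


with ToyTracer(3) as tracer:
    doubled = tracer.input() * 2      # nothing is computed yet
    unsaved = doubled + 1             # recorded, but never saved
    result = (doubled + 10).save()    # will be available after the block
    inside = result.value             # still None inside the block

print(inside, result.value, unsaved.value)
```

Only the saved proxy holds a value once the block exits, which mirrors why `.save()` is required for anything read after a trace.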
58
+
59
+ ### Remote Execution (NDIF)
60
+ Run the same code on massive models (70B+) without local GPUs:
61
+
62
+ ```python
63
+ # Same code, just add remote=True
64
+ with model.trace("Hello", remote=True):
65
+     hidden = model.model.layers[40].output[0].save()
66
+ ```
67
+
68
+ ## NDIF Setup
69
+
70
+ 1. Sign up at https://login.ndif.us/
71
+ 2. Get API key
72
+ 3. Set environment variable: `export NDIF_API_KEY=your_key`
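
Before launching a remote trace, it can help to confirm that the key from step 3 is visible to the Python process. The sketch below uses only the standard library; `get_ndif_api_key` is a hypothetical helper, not part of nnsight.

```python
import os


def get_ndif_api_key(env=None):
    """Return the NDIF API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get("NDIF_API_KEY", "").strip()
    if not key:
        raise RuntimeError("NDIF_API_KEY is not set; export it before using remote=True")
    return key


# Demonstration with an explicit mapping instead of the real environment.
demo_key = get_ndif_api_key({"NDIF_API_KEY": " your_key "})
print(demo_key)
```

Failing early with a clear message is easier to diagnose than an authentication error returned partway through a remote request.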
73
+
74
+ ## Available Remote Models
75
+
76
+ - Llama-3.1-8B, 70B, 405B
77
+ - DeepSeek-R1 models
78
+ - More at https://ndif.us/