@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,711 @@
---
name: hugging-face-model-trainer
description: This skill should be used when users want to train or fine-tune language models using TRL (Transformer Reinforcement Learning) on Hugging Face Jobs infrastructure. Covers SFT, DPO, GRPO, and reward modeling training methods, plus GGUF conversion for local deployment. Includes guidance on the TRL Jobs package, UV scripts in PEP 723 format, dataset preparation and validation, hardware selection, cost estimation, Trackio monitoring, Hub authentication, and model persistence. Should be invoked for tasks involving cloud GPU training, GGUF conversion, or when users mention training on Hugging Face Jobs without a local GPU setup.
version: 1.0.0
author: Synthetic Sciences
license: MIT (complete terms in LICENSE.txt)
tags: [Hugging Face, Fine-Tuning, Transformers, Training]
dependencies: [huggingface-hub, transformers]
---

# TRL Training on Hugging Face Jobs

## Overview

Train language models using TRL (Transformer Reinforcement Learning) on fully managed Hugging Face infrastructure. No local GPU setup required—models train on cloud GPUs and results are automatically saved to the Hugging Face Hub.

**TRL provides multiple training methods:**
- **SFT** (Supervised Fine-Tuning) - Standard instruction tuning
- **DPO** (Direct Preference Optimization) - Alignment from preference data
- **GRPO** (Group Relative Policy Optimization) - Online RL training
- **Reward Modeling** - Train reward models for RLHF

**For detailed TRL method documentation:**
```python
hf_doc_search("your query", product="trl")
hf_doc_fetch("https://huggingface.co/docs/trl/sft_trainer")  # SFT
hf_doc_fetch("https://huggingface.co/docs/trl/dpo_trainer")  # DPO
# etc.
```

**See also:** `references/training_methods.md` for method overviews and selection guidance

## When to Use This Skill

Use this skill when users want to:
- Fine-tune language models on cloud GPUs without local infrastructure
- Train with TRL methods (SFT, DPO, GRPO, etc.)
- Run training jobs on Hugging Face Jobs infrastructure
- Convert trained models to GGUF for local deployment (Ollama, LM Studio, llama.cpp)
- Ensure trained models are permanently saved to the Hub
- Use modern workflows with optimized defaults

## Key Directives

When assisting with training jobs:

1. **ALWAYS use the `hf_jobs()` MCP tool** - Submit jobs using `hf_jobs("uv", {...})`, NOT bash `trl-jobs` commands. The `script` parameter accepts Python code directly: pass the script content inline as a string, and do NOT save it to a local file unless the user explicitly requests that. If the user asks to "train a model", "fine-tune", or similar, you MUST create the training script AND submit the job immediately using `hf_jobs()`.

2. **Always include Trackio** - Every training script should include Trackio for real-time monitoring. Use the example scripts in `scripts/` as templates.

3. **Provide job details after submission** - After submitting, provide the job ID, monitoring URL, and estimated time, and note that the user can request status checks later.

4. **Use example scripts as templates** - Reference `scripts/train_sft_example.py`, `scripts/train_dpo_example.py`, etc. as starting points.

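Putting directive 1 together, a submission might look like the sketch below. The payload keys shown (`script`, `flavor`, `timeout`) are assumptions pieced together from the settings this skill discusses, not a verified `hf_jobs()` signature; only the `secrets` form is quoted directly from the prerequisites checklist.

```python
# Sketch only: payload keys are assumed, not a documented hf_jobs() signature.
training_script = """\
# /// script
# dependencies = ["trl", "trackio"]
# ///
# ...training code, including Trackio logging, goes here...
"""

job_config = {
    "script": training_script,             # script content passed inline as a string
    "flavor": "a10g-large",                # hardware flavor
    "timeout": "2h",                       # must exceed expected training time
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},  # lets the job push results to the Hub
}

# Submitted via the MCP tool, not bash:
# hf_jobs("uv", job_config)
```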
## Local Script Dependencies

To run scripts locally (like `estimate_cost.py`), install dependencies:
```bash
pip install -r requirements.txt
```

## Prerequisites Checklist

Before starting any training job, verify:

### ✅ **Account & Authentication**
- Hugging Face Account with a [Pro](https://hf.co/pro), [Team](https://hf.co/enterprise), or [Enterprise](https://hf.co/enterprise) plan (Jobs require a paid plan)
- Authenticated login: check with `hf_whoami()`
- **HF_TOKEN for Hub push** ⚠️ CRITICAL - The training environment is ephemeral; you must push to the Hub or ALL training results are lost
  - Token must have write permissions
  - **MUST pass `secrets={"HF_TOKEN": "$HF_TOKEN"}` in the job config** to make the token available (the `$HF_TOKEN` syntax references your actual token value)

### ✅ **Dataset Requirements**
- Dataset must exist on the Hub or be loadable via `datasets.load_dataset()`
- Format must match the training method (SFT: "messages"/text/prompt-completion; DPO: chosen/rejected; GRPO: prompt-only)
- **ALWAYS validate unknown datasets** before GPU training to prevent format failures (see the Dataset Validation section below)
- Size appropriate for hardware (demo: 50-100 examples on t4-small; production: 1K-10K+ on a10g-large/a100-large)

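Since a format mismatch only surfaces after the job has started burning GPU time, a cheap pre-flight check on one or two rows is worthwhile. This sketch encodes the key layouts listed above; the function and mapping are illustrative helpers, not part of TRL or `datasets`:

```python
# Accepted key layouts per training method, following the format list above.
# Illustrative only; extend for your dataset's actual schema.
REQUIRED_KEYS = {
    "sft": [["messages"], ["text"], ["prompt", "completion"]],
    "dpo": [["chosen", "rejected"]],
    "grpo": [["prompt"]],
}

def matches_method(example: dict, method: str) -> bool:
    """True if the example contains all keys of at least one accepted layout."""
    return any(all(k in example for k in layout) for layout in REQUIRED_KEYS[method])

# Check a first row pulled from the dataset before submitting the job:
row = {"prompt": "Say hi in French", "chosen": "Bonjour", "rejected": "Hola"}
print(matches_method(row, "dpo"))  # True
print(matches_method(row, "sft"))  # False: no "completion", "messages", or "text"
```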
### ⚠️ **Critical Settings**
- **Timeout must exceed expected training time** - The default 30 min is TOO SHORT for most training; minimum recommended: 1-2 hours. The job fails and loses all progress if the timeout is exceeded.
- **Hub push must be enabled** - Config: `push_to_hub=True`, `hub_model_id="username/model-name"`; Job: `secrets={"HF_TOKEN": "$HF_TOKEN"}`

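The config side of the Hub-push setting translates into trainer kwargs along these lines (a sketch: `push_to_hub` and `hub_model_id` come from the bullet above, and the repo id is a placeholder):

```python
# Kwargs you would splat into trl.SFTConfig / DPOConfig.
# "username/model-name" is a placeholder; substitute your own Hub repo id.
hub_settings = dict(
    push_to_hub=True,                    # without this, the ephemeral job discards results
    hub_model_id="username/model-name",  # target repo on the Hub
)

# config = SFTConfig(output_dir="out", **hub_settings)
```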
## Asynchronous Job Guidelines

**⚠️ IMPORTANT: Training jobs run asynchronously and can take hours**

### Action Required

**When user requests training:**
1. **Create the training script** with Trackio included (use `scripts/train_sft_example.py` as a template)
2. **Submit immediately** using the `hf_jobs()` MCP tool with script content inline - don't save to a file unless the user requests it
3. **Report submission** with job ID, monitoring URL, and estimated time
4. **Wait for user** to request status checks - don't poll automatically

### Ground Rules
- **Jobs run in background** - Submission returns immediately; training continues independently
- **Initial logs delayed** - Can take 30-60 seconds for logs to appear
- **User checks status** - Wait for the user to request status updates
- **Avoid polling** - Check logs only on user request; provide monitoring links instead

### After Submission

**Provide to user:**
- ✅ Job ID and monitoring URL
- ✅ Expected completion time
- ✅ Trackio dashboard URL
- ✅ Note that the user can request status checks later

**Example Response:**
```
✅ Job submitted successfully!

Job ID: abc123xyz
Monitor: https://huggingface.co/jobs/username/abc123xyz

Expected time: ~2 hours
Estimated cost: ~$10

The job is running in the background. Ask me to check status/logs when ready!
```

## Quick Start: Three Approaches

**💡 Tip for Demos:** For quick demos on smaller GPUs (t4-small), omit `eval_dataset` and `eval_strategy` to save ~40% memory. You'll still see training loss and learning progress.

### Sequence Length Configuration

**TRL config classes use `max_length` (not `max_seq_length`)** to control tokenized sequence length:

```python
# ✅ CORRECT - If you need to set sequence length
SFTConfig(max_length=512)   # Truncate sequences to 512 tokens
DPOConfig(max_length=2048)  # Longer context (2048 tokens)

# ❌ WRONG - This parameter doesn't exist
SFTConfig(max_seq_length=512)  # TypeError!
```

**Default behavior:** `max_length=1024` (truncates from right). This works well for most training.

**When to override:**
- **Longer context**: Set higher (e.g., `max_length=2048`)
- **Memory constraints**: Set lower (e.g., `max_length=512`)
- **Vision models**: Set `max_length=None` (prevents cutting image tokens)

**Usually you don't need to set this parameter at all** - the examples below use the sensible default.

### Approach 1: UV Scripts (Recommended—Default Choice)

UV scripts use PEP 723 inline dependencies for clean, self-contained training. **This is the primary approach for Claude Code.**

```python
hf_jobs("uv", {
    "script": """
# /// script
# dependencies = ["trl>=0.12.0", "peft>=0.7.0", "trackio"]
# ///

from datasets import load_dataset
from peft import LoraConfig
from trl import SFTTrainer, SFTConfig
import trackio

dataset = load_dataset("trl-lib/Capybara", split="train")

# Create train/eval split for monitoring
dataset_split = dataset.train_test_split(test_size=0.1, seed=42)

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",
    train_dataset=dataset_split["train"],
    eval_dataset=dataset_split["test"],
    peft_config=LoraConfig(r=16, lora_alpha=32),
    args=SFTConfig(
        output_dir="my-model",
        push_to_hub=True,
        hub_model_id="username/my-model",
        num_train_epochs=3,
        eval_strategy="steps",
        eval_steps=50,
        report_to="trackio",
        project="meaningful_project_name",  # project name for grouping runs (trackio)
        run_name="meaningful_run_name",  # descriptive name for the specific training run (trackio)
    )
)

trainer.train()
trainer.push_to_hub()
""",
    "flavor": "a10g-large",
    "timeout": "2h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```

**Benefits:** Direct MCP tool usage, clean code, dependencies declared inline (PEP 723), no file saving required, full control
**When to use:** Default choice for all training tasks in Claude Code, custom training logic, any scenario requiring `hf_jobs()`

#### Working with Scripts

⚠️ **Important:** The `script` parameter accepts either inline code (as shown above) OR a URL. **Local file paths do NOT work.**

**Why local paths don't work:**
Jobs run in isolated Docker containers without access to your local filesystem. Scripts must be:
- Inline code (recommended for custom training)
- Publicly accessible URLs
- Private repo URLs (with HF_TOKEN)

**Common mistakes:**
```python
# ❌ These will all fail
hf_jobs("uv", {"script": "train.py"})
hf_jobs("uv", {"script": "./scripts/train.py"})
hf_jobs("uv", {"script": "/path/to/train.py"})
```

**Correct approaches:**
```python
# ✅ Inline code (recommended)
hf_jobs("uv", {"script": "# /// script\n# dependencies = [...]\n# ///\n\n<your code>"})

# ✅ From Hugging Face Hub
hf_jobs("uv", {"script": "https://huggingface.co/user/repo/resolve/main/train.py"})

# ✅ From GitHub
hf_jobs("uv", {"script": "https://raw.githubusercontent.com/user/repo/main/train.py"})

# ✅ From Gist
hf_jobs("uv", {"script": "https://gist.githubusercontent.com/user/id/raw/train.py"})
```

**To use local scripts:** Upload to HF Hub first:
```bash
huggingface-cli repo create my-training-scripts --type model
huggingface-cli upload my-training-scripts ./train.py train.py
# Use: https://huggingface.co/USERNAME/my-training-scripts/resolve/main/train.py
```

### Approach 2: TRL Maintained Scripts (Official Examples)

TRL provides battle-tested scripts for all methods. They can be run from URLs (use the raw file URL, not the GitHub `blob` page):

```python
hf_jobs("uv", {
    "script": "https://raw.githubusercontent.com/huggingface/trl/main/trl/scripts/sft.py",
    "script_args": [
        "--model_name_or_path", "Qwen/Qwen2.5-0.5B",
        "--dataset_name", "trl-lib/Capybara",
        "--output_dir", "my-model",
        "--push_to_hub",
        "--hub_model_id", "username/my-model"
    ],
    "flavor": "a10g-large",
    "timeout": "2h",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}
})
```

**Benefits:** No code to write, maintained by TRL team, production-tested
**When to use:** Standard TRL training, quick experiments, don't need custom code
**Available:** https://github.com/huggingface/trl/tree/main/examples/scripts

### Finding More UV Scripts on Hub

The `uv-scripts` organization provides ready-to-use UV scripts stored as datasets on Hugging Face Hub:

```python
# Discover available UV script collections
dataset_search({"author": "uv-scripts", "sort": "downloads", "limit": 20})

# Explore a specific collection
hub_repo_details(["uv-scripts/classification"], repo_type="dataset", include_readme=True)
```

**Popular collections:** ocr, classification, synthetic-data, vllm, dataset-creation

### Approach 3: HF Jobs CLI (Direct Terminal Commands)

When the `hf_jobs()` MCP tool is unavailable, use the `hf jobs` CLI directly.

**⚠️ CRITICAL: CLI Syntax Rules**

```bash
# ✅ CORRECT syntax - flags BEFORE script URL
hf jobs uv run --flavor a10g-large --timeout 2h --secrets HF_TOKEN "https://example.com/train.py"

# ❌ WRONG - "run uv" instead of "uv run"
hf jobs run uv "https://example.com/train.py" --flavor a10g-large

# ❌ WRONG - flags AFTER script URL (will be ignored!)
hf jobs uv run "https://example.com/train.py" --flavor a10g-large

# ❌ WRONG - "--secret" instead of "--secrets" (plural)
hf jobs uv run --secret HF_TOKEN "https://example.com/train.py"
```

**Key syntax rules:**
1. Command order is `hf jobs uv run` (NOT `hf jobs run uv`)
2. All flags (`--flavor`, `--timeout`, `--secrets`) must come BEFORE the script URL
3. Use `--secrets` (plural), not `--secret`
4. Script URL must be the last positional argument

**Complete CLI example:**
```bash
hf jobs uv run \
  --flavor a10g-large \
  --timeout 2h \
  --secrets HF_TOKEN \
  "https://huggingface.co/user/repo/resolve/main/train.py"
```

**Check job status via CLI:**
```bash
hf jobs ps                # List all jobs
hf jobs logs <job-id>     # View logs
hf jobs inspect <job-id>  # Job details
hf jobs cancel <job-id>   # Cancel a job
```

### Approach 4: TRL Jobs Package (Simplified Training)

The `trl-jobs` package provides optimized defaults and one-liner training.

```bash
# Install
pip install trl-jobs

# Train with SFT (simplest possible)
trl-jobs sft \
  --model_name Qwen/Qwen2.5-0.5B \
  --dataset_name trl-lib/Capybara
```

**Benefits:** Pre-configured settings, automatic Trackio integration, automatic Hub push, one-line commands
**When to use:** User working in terminal directly (not Claude Code context), quick local experimentation
**Repository:** https://github.com/huggingface/trl-jobs

⚠️ **In Claude Code context, prefer using `hf_jobs()` MCP tool (Approach 1) when available.**

## Hardware Selection

| Model Size | Recommended Hardware | Cost (approx/hr) | Use Case |
|------------|---------------------|------------------|----------|
| <1B params | `t4-small` | ~$0.75 | Demos and quick tests (skip eval steps) |
| 1-3B params | `t4-medium`, `l4x1` | ~$1.50-2.50 | Development |
| 3-7B params | `a10g-small`, `a10g-large` | ~$3.50-5.00 | Production training |
| 7-13B params | `a10g-large`, `a100-large` | ~$5-10 | Large models (use LoRA) |
| 13B+ params | `a100-large`, `a10g-largex2` | ~$10-20 | Very large (use LoRA) |

**GPU Flavors:** cpu-basic/upgrade/performance/xl, t4-small/medium, l4x1/x4, a10g-small/large/largex2/largex4, a100-large, h100/h100x8

**Guidelines:**
- Use **LoRA/PEFT** for models >7B to reduce memory
- Multi-GPU automatically handled by TRL/Accelerate
- Start with smaller hardware for testing

**See:** `references/hardware_guide.md` for detailed specifications
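The table above can be encoded as a rough starting-point picker. This is a sketch: the thresholds and flavor strings mirror this guide's recommendations and are heuristics, and `pick_flavor` is a hypothetical helper, not part of any library:

```python
def pick_flavor(params_b: float) -> str:
    """Suggest a starting GPU flavor for a model of `params_b` billion parameters.

    Thresholds mirror the hardware table in this guide; always validate
    against actual memory usage and budget.
    """
    if params_b < 1:
        return "t4-small"    # demos and quick tests
    if params_b <= 3:
        return "t4-medium"   # or "l4x1" for development
    if params_b <= 7:
        return "a10g-large"  # production training ("a10g-small" at the low end)
    # Above 7B, use LoRA/PEFT regardless of flavor
    return "a100-large"

print(pick_flavor(0.5))  # t4-small
print(pick_flavor(7))    # a10g-large
```

Treat the suggestion as a starting point only; step up a tier if you hit OOM even after the memory fixes described later in this guide.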

## Critical: Saving Results to Hub

**⚠️ EPHEMERAL ENVIRONMENT—MUST PUSH TO HUB**

The Jobs environment is temporary. All files are deleted when the job ends. If the model isn't pushed to Hub, **ALL TRAINING IS LOST**.

### Required Configuration

**In training script/config:**
```python
SFTConfig(
    push_to_hub=True,
    hub_model_id="username/model-name",  # MUST specify
    hub_strategy="every_save",  # Optional: push checkpoints
)
```

**In job submission:**
```python
{
    "secrets": {"HF_TOKEN": "$HF_TOKEN"}  # Enables authentication
}
```

### Verification Checklist

Before submitting:
- [ ] `push_to_hub=True` set in config
- [ ] `hub_model_id` includes username/repo-name
- [ ] `secrets` parameter includes HF_TOKEN
- [ ] User has write access to target repo

**See:** `references/hub_saving.md` for detailed troubleshooting
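The checklist above can be automated as a small pre-flight check before submission. A sketch: the dict shapes mirror the config and job examples in this guide, and `preflight` is a hypothetical helper, not a library function:

```python
def preflight(config: dict, job: dict) -> list[str]:
    """Return problems that would cause training results to be lost."""
    problems = []
    if not config.get("push_to_hub"):
        problems.append("set push_to_hub=True (Jobs environment is ephemeral)")
    if "/" not in config.get("hub_model_id", ""):
        problems.append("hub_model_id must be 'username/repo-name'")
    if "HF_TOKEN" not in job.get("secrets", {}):
        problems.append('pass secrets={"HF_TOKEN": "$HF_TOKEN"} in the job config')
    return problems

# A correctly configured job passes all checks
assert preflight(
    {"push_to_hub": True, "hub_model_id": "username/my-model"},
    {"secrets": {"HF_TOKEN": "$HF_TOKEN"}},
) == []

# A missing token is caught before any GPU time is spent
assert len(preflight({"push_to_hub": True, "hub_model_id": "u/m"}, {})) == 1
```

Running a check like this locally costs nothing and catches the single most expensive failure mode (a completed run whose weights were never pushed).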

## Timeout Management

**⚠️ DEFAULT: 30 MINUTES—TOO SHORT FOR TRAINING**

### Setting Timeouts

```python
{
    "timeout": "2h"  # 2 hours (formats: "90m", "2h", "1.5h", or seconds as integer)
}
```

### Timeout Guidelines

| Scenario | Recommended | Notes |
|----------|-------------|-------|
| Quick demo (50-100 examples) | 10-30 min | Verify setup |
| Development training | 1-2 hours | Small datasets |
| Production (3-7B model) | 4-6 hours | Full datasets |
| Large model with LoRA | 3-6 hours | Depends on dataset |

**Always add 20-30% buffer** for model/dataset loading, checkpoint saving, Hub push operations, and network delays.

**On timeout:** Job killed immediately, all unsaved progress lost, must restart from beginning
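The buffer rule is simple arithmetic; a sketch that pads an estimate by 30% and rounds up to 15-minute granularity (the rounding step is a convention of this example, not of the Jobs API):

```python
import math

def timeout_with_buffer(estimated_minutes: float, buffer: float = 0.30) -> str:
    """Pad an estimated runtime and round up to the next 15 minutes."""
    padded = estimated_minutes * (1 + buffer)
    return f"{math.ceil(padded / 15) * 15}m"

print(timeout_with_buffer(90))  # a 90-minute estimate becomes "120m"
```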

## Cost Estimation

**Offer to estimate cost when planning jobs with known parameters.** Use `scripts/estimate_cost.py`:

```bash
uv run scripts/estimate_cost.py \
  --model meta-llama/Llama-2-7b-hf \
  --dataset trl-lib/Capybara \
  --hardware a10g-large \
  --dataset-size 16000 \
  --epochs 3
```

Output includes estimated time, cost, recommended timeout (with buffer), and optimization suggestions.

**When to offer:** User planning a job, asks about cost/time, choosing hardware, job will run >1 hour or cost >$5

## Example Training Scripts

**Production-ready templates with all best practices:**

Use these scripts as starting points:

- **`scripts/train_sft_example.py`** - Complete SFT training with Trackio, LoRA, checkpoints
- **`scripts/train_dpo_example.py`** - DPO training for preference learning
- **`scripts/train_grpo_example.py`** - GRPO training for online RL

These scripts demonstrate proper Hub saving, Trackio integration, checkpoint management, and optimized parameters. Pass their content inline to `hf_jobs()` or use as templates for custom scripts.

## Monitoring and Tracking

**Trackio** provides real-time metrics visualization. See `references/trackio_guide.md` for complete setup guide.

**Key points:**
- Add `trackio` to dependencies
- Configure trainer with `report_to="trackio"` and `run_name="meaningful_name"`

### Trackio Configuration Defaults

**Use sensible defaults unless user specifies otherwise.** When generating training scripts with Trackio:

**Default Configuration:**
- **Space ID**: `{username}/trackio` (use "trackio" as default space name)
- **Run naming**: Unless otherwise specified, name the run in a way the user will recognize (e.g., descriptive of the task, model, or purpose)
- **Config**: Keep minimal - only include hyperparameters and model/dataset info
- **Project Name**: Use a project name to associate runs with a particular project

**User overrides:** If user requests specific trackio configuration (custom space, run naming, grouping, or additional config), apply their preferences instead of defaults.

These defaults are useful for managing multiple jobs with the same configuration and for keeping training scripts portable.

See `references/trackio_guide.md` for complete documentation including grouping runs for experiments.

### Check Job Status

```python
# List all jobs
hf_jobs("ps")

# Inspect specific job
hf_jobs("inspect", {"job_id": "your-job-id"})

# View logs
hf_jobs("logs", {"job_id": "your-job-id"})
```

**Remember:** Wait for user to request status checks. Avoid polling repeatedly.

## Dataset Validation

**Validate dataset format BEFORE launching GPU training to prevent the #1 cause of training failures: format mismatches.**

### Why Validate

- 50%+ of training failures are due to dataset format issues
- DPO especially strict: requires exact column names (`prompt`, `chosen`, `rejected`)
- Failed GPU jobs waste $1-10 and 30-60 minutes
- Validation on CPU costs ~$0.01 and takes <1 minute

### When to Validate

**ALWAYS validate for:**
- Unknown or custom datasets
- DPO training (CRITICAL - 90% of datasets need mapping)
- Any dataset not explicitly TRL-compatible

**Skip validation for known TRL datasets:**
- `trl-lib/ultrachat_200k`, `trl-lib/Capybara`, `HuggingFaceH4/ultrachat_200k`, etc.

### Usage

```python
hf_jobs("uv", {
    "script": "https://huggingface.co/datasets/mcp-tools/skills/raw/main/dataset_inspector.py",
    "script_args": ["--dataset", "username/dataset-name", "--split", "train"]
})
```

The script is fast and usually completes synchronously.

### Reading Results

The output shows compatibility for each training method:

- **`✓ READY`** - Dataset is compatible, use directly
- **`✗ NEEDS MAPPING`** - Compatible but needs preprocessing (mapping code provided)
- **`✗ INCOMPATIBLE`** - Cannot be used for this method

When mapping is needed, the output includes a **"MAPPING CODE"** section with copy-paste ready Python code.

### Example Workflow

```python
# 1. Inspect dataset (costs ~$0.01, <1 min on CPU)
hf_jobs("uv", {
    "script": "https://huggingface.co/datasets/mcp-tools/skills/raw/main/dataset_inspector.py",
    "script_args": ["--dataset", "argilla/distilabel-math-preference-dpo", "--split", "train"]
})

# 2. Check output markers:
#    ✓ READY → proceed with training
#    ✗ NEEDS MAPPING → apply mapping code below
#    ✗ INCOMPATIBLE → choose different method/dataset

# 3. If mapping needed, apply before training:
def format_for_dpo(example):
    return {
        'prompt': example['instruction'],
        'chosen': example['chosen_response'],
        'rejected': example['rejected_response'],
    }
dataset = dataset.map(format_for_dpo, remove_columns=dataset.column_names)

# 4. Launch training job with confidence
```

### Common Scenario: DPO Format Mismatch

Most DPO datasets use non-standard column names. Example:

```
Dataset has: instruction, chosen_response, rejected_response
DPO expects: prompt, chosen, rejected
```

The validator detects this and provides exact mapping code to fix it.

## Converting Models to GGUF

After training, convert models to **GGUF format** for use with llama.cpp, Ollama, LM Studio, and other local inference tools.

**What is GGUF:**
- Optimized for CPU/GPU inference with llama.cpp
- Supports quantization (4-bit, 5-bit, 8-bit) to reduce model size
- Compatible with Ollama, LM Studio, Jan, GPT4All, llama.cpp
- Typically 2-8GB for 7B models (vs 14GB unquantized)

**When to convert:**
- Running models locally with Ollama or LM Studio
- Reducing model size with quantization
- Deploying to edge devices
- Sharing models for local-first use

**See:** `references/gguf_conversion.md` for complete conversion guide, including production-ready conversion script, quantization options, hardware requirements, usage examples, and troubleshooting.

**Quick conversion:**
```python
hf_jobs("uv", {
    "script": "<see references/gguf_conversion.md for complete script>",
    "flavor": "a10g-large",
    "timeout": "45m",
    "secrets": {"HF_TOKEN": "$HF_TOKEN"},
    "env": {
        "ADAPTER_MODEL": "username/my-finetuned-model",
        "BASE_MODEL": "Qwen/Qwen2.5-0.5B",
        "OUTPUT_REPO": "username/my-model-gguf"
    }
})
```

## Common Training Patterns

See `references/training_patterns.md` for detailed examples including:
- Quick demo (5-10 minutes)
- Production with checkpoints
- Multi-GPU training
- DPO training (preference learning)
- GRPO training (online RL)

## Common Failure Modes

### Out of Memory (OOM)

**Fix (try in order):**
1. Reduce batch size: `per_device_train_batch_size=1`, increase `gradient_accumulation_steps=8`. Effective batch size is `per_device_train_batch_size` × `gradient_accumulation_steps`. For best performance, keep the effective batch size close to 128.
2. Enable: `gradient_checkpointing=True`
3. Upgrade hardware: t4-small → l4x1, a10g-small → a10g-large, etc.
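The effective-batch-size rule in step 1 is plain multiplication (times the GPU count on multi-GPU flavors), so per-device memory can be traded for accumulation steps without changing training dynamics. For example:

```python
def effective_batch(per_device: int, grad_accum: int, num_gpus: int = 1) -> int:
    """Effective batch size seen by the optimizer each update step."""
    return per_device * grad_accum * num_gpus

# OOM fix: shrink the per-device batch and grow accumulation to compensate
assert effective_batch(8, 16) == 128
assert effective_batch(1, 128) == 128  # same effective batch, far less GPU memory
```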

### Dataset Misformatted

**Fix:**
1. Validate first with dataset inspector:
   ```bash
   uv run https://huggingface.co/datasets/mcp-tools/skills/raw/main/dataset_inspector.py \
     --dataset name --split train
   ```
2. Check output for compatibility markers (✓ READY, ✗ NEEDS MAPPING, ✗ INCOMPATIBLE)
3. Apply mapping code from inspector output if needed

### Job Timeout

**Fix:**
1. Check logs for actual runtime: `hf_jobs("logs", {"job_id": "..."})`
2. Increase timeout with buffer: `"timeout": "3h"` (add 30% to estimated time)
3. Or reduce training: lower `num_train_epochs`, use smaller dataset, enable `max_steps`
4. Save checkpoints: `save_strategy="steps"`, `save_steps=500`, `hub_strategy="every_save"`

**Note:** Default 30min is insufficient for real training. Minimum 1-2 hours.

### Hub Push Failures

**Fix:**
1. Add to job: `secrets={"HF_TOKEN": "$HF_TOKEN"}`
2. Add to config: `push_to_hub=True`, `hub_model_id="username/model-name"`
3. Verify auth: `mcp__huggingface__hf_whoami()`
4. Check token has write permissions and repo exists (or set `hub_private_repo=True`)

### Missing Dependencies

**Fix:**
Add to PEP 723 header:
```python
# /// script
# dependencies = ["trl>=0.12.0", "peft>=0.7.0", "trackio", "missing-package"]
# ///
```

## Troubleshooting

**Common issues:**
- Job times out → Increase timeout, reduce epochs/dataset, use smaller model/LoRA
- Model not saved to Hub → Check `push_to_hub=True`, `hub_model_id`, `secrets=HF_TOKEN`
- Out of Memory (OOM) → Reduce batch size, increase gradient accumulation, enable LoRA, use larger GPU
- Dataset format error → Validate with dataset inspector (see Dataset Validation section)
- Import/module errors → Add PEP 723 header with dependencies, verify format
- Authentication errors → Check `mcp__huggingface__hf_whoami()`, token permissions, secrets parameter

**See:** `references/troubleshooting.md` for complete troubleshooting guide

## Resources

### References (In This Skill)
- `references/training_methods.md` - Overview of SFT, DPO, GRPO, KTO, PPO, Reward Modeling
- `references/training_patterns.md` - Common training patterns and examples
- `references/gguf_conversion.md` - Complete GGUF conversion guide
- `references/trackio_guide.md` - Trackio monitoring setup
- `references/hardware_guide.md` - Hardware specs and selection
- `references/hub_saving.md` - Hub authentication troubleshooting
- `references/troubleshooting.md` - Common issues and solutions

### Scripts (In This Skill)
- `scripts/train_sft_example.py` - Production SFT template
- `scripts/train_dpo_example.py` - Production DPO template
- `scripts/train_grpo_example.py` - Production GRPO template
- `scripts/estimate_cost.py` - Estimate time and cost (offer when appropriate)
- `scripts/convert_to_gguf.py` - Complete GGUF conversion script

### External Scripts
- [Dataset Inspector](https://huggingface.co/datasets/mcp-tools/skills/raw/main/dataset_inspector.py) - Validate dataset format before training (use via `uv run` or `hf_jobs`)

### External Links
- [TRL Documentation](https://huggingface.co/docs/trl)
- [TRL Jobs Training Guide](https://huggingface.co/docs/trl/en/jobs_training)
- [TRL Jobs Package](https://github.com/huggingface/trl-jobs)
- [HF Jobs Documentation](https://huggingface.co/docs/huggingface_hub/guides/jobs)
- [TRL Example Scripts](https://github.com/huggingface/trl/tree/main/examples/scripts)
- [UV Scripts Guide](https://docs.astral.sh/uv/guides/scripts/)
- [UV Scripts Organization](https://huggingface.co/uv-scripts)

## Key Takeaways

1. **Submit scripts inline** - The `script` parameter accepts Python code directly; no file saving required unless user requests
2. **Jobs are asynchronous** - Don't wait/poll; let user check when ready
3. **Always set timeout** - Default 30 min is insufficient; minimum 1-2 hours recommended
4. **Always enable Hub push** - Environment is ephemeral; without push, all results lost
5. **Include Trackio** - Use example scripts as templates for real-time monitoring
6. **Offer cost estimation** - When parameters are known, use `scripts/estimate_cost.py`
7. **Use UV scripts (Approach 1)** - Default to `hf_jobs("uv", {...})` with inline scripts; TRL maintained scripts for standard training; avoid bash `trl-jobs` commands in Claude Code
8. **Use hf_doc_fetch/hf_doc_search** for latest TRL documentation
9. **Validate dataset format** before training with dataset inspector (see Dataset Validation section)
10. **Choose appropriate hardware** for model size; use LoRA for models >7B