@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,315 @@
1
+ ---
2
+ name: miles-rl-training
3
+ description: Provides guidance for enterprise-grade RL training using miles, a production-ready fork of slime. Use when training large MoE models with FP8/INT4, needing train-inference alignment, or requiring speculative RL for maximum throughput.
4
+ version: 1.0.0
5
+ author: Synthetic Sciences
6
+ license: MIT
7
+ tags: [Reinforcement Learning, MoE, FP8, INT4, Enterprise, SGLang, Megatron-LM]
8
+ dependencies: [sglang-router>=0.2.3, ray, torch>=2.0.0, transformers>=4.40.0]
9
+ ---
10
+
11
+ # miles: Enterprise-Grade RL for Large-Scale Model Training
12
+
13
+ miles is a high-performance, enterprise-ready RL framework optimized for large-scale model post-training. Built as a production fork of slime, it addresses critical challenges in MoE training stability, low-precision training, and train-inference alignment.
14
+
15
+ ## When to Use miles
16
+
17
+ **Choose miles when you need:**
18
+ - Training 1TB+ MoE models (DeepSeek V3, Qwen3-MoE)
19
+ - FP8 or INT4 quantization-aware training
20
+ - Bit-wise identical train-inference alignment
21
+ - Speculative RL for maximum throughput
22
+ - Production stability with enterprise support
23
+
24
+ **Consider alternatives when:**
25
+ - You want the research-grade original → use **slime**
26
+ - You need flexible backend swapping → use **verl**
27
+ - You want PyTorch-native abstractions → use **torchforge**
28
+
29
+ ## Key Features
30
+
31
+ ### Low-Precision Training
32
+ - **Unified FP8**: End-to-end FP8 for both inference and training
33
+ - **INT4 QAT**: 1TB models on single-machine VRAM (H200)
34
+ - **Rollout Routing Replay (R3)**: Bit-wise expert alignment for MoE
35
+
36
+ ### Performance Optimizations
37
+ - **Speculative RL**: 25%+ rollout speedup with online SFT draft models
38
+ - **Zero-Copy Weight Sync**: CUDA IPC zero-copy mapping
39
+ - **Partial Rollout**: Recycle half-finished trajectories
40
+
41
+ ### Train-Inference Alignment
42
+ - **TIS/MIS**: Truncated/Masked Importance Sampling for off-policy correction
43
+ - **Kernel-level optimization**: FlashAttention-3, DeepGEMM integration
44
+
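As a rough illustration of the TIS/MIS bullet above, the sketch below computes per-token importance ratios between the training policy and the rollout policy, then either truncates or masks them. The function names, the `cap` value, and the `low`/`high` band are assumptions made for this example; they are not taken from the miles code base, and the exact masking rule miles uses may differ.

```python
import math
from typing import List, Sequence


def importance_ratios(train_logprobs: Sequence[float],
                      rollout_logprobs: Sequence[float]) -> List[float]:
    """Per-token ratio pi_train(token) / pi_rollout(token), computed from log-probs."""
    return [math.exp(t - r) for t, r in zip(train_logprobs, rollout_logprobs)]


def truncated_weights(ratios: Sequence[float], cap: float) -> List[float]:
    """Truncated importance sampling: clip each ratio at `cap` (assumed hyperparameter)."""
    return [min(r, cap) for r in ratios]


def masked_weights(ratios: Sequence[float], low: float, high: float) -> List[float]:
    """Masked importance sampling: zero out tokens whose ratio leaves [low, high].

    The band limits are illustrative assumptions, not miles defaults.
    """
    return [r if low <= r <= high else 0.0 for r in ratios]
```

In both variants the resulting weight would multiply the per-token policy-gradient loss, so tokens where the rollout engine and the trainer disagree strongly contribute a bounded amount (TIS) or nothing (MIS).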
45
+ ## Installation
46
+
47
+ ```bash
48
+ # Recommended: Docker
49
+ docker pull radixark/miles:latest
50
+ docker run --rm --gpus all --ipc=host --shm-size=16g \
51
+ -it radixark/miles:latest /bin/bash
52
+
53
+ # From source
54
+ git clone https://github.com/radixark/miles.git
55
+ cd miles
56
+ pip install -r requirements.txt
57
+ pip install -e .
58
+ ```
59
+
60
+ ## Quick Start
61
+
62
+ miles inherits slime's configuration system. Basic training:
63
+
64
+ ```bash
65
+ python train.py \
66
+ --advantage-estimator grpo \
67
+ --model-name qwen3-30b-a3b \
68
+ --hf-checkpoint /path/to/qwen3-30b-a3b-hf \
69
+ --rollout-batch-size 512 \
70
+ --n-samples-per-prompt 8
71
+ ```
72
+
73
+ ---
74
+
75
+ ## Workflow 1: Large MoE Training
76
+
77
+ Use this workflow for training large MoE models like DeepSeek V3 or Qwen3-MoE.
78
+
79
+ ### Prerequisites Checklist
80
+ - [ ] H100/H200 GPUs with FP8 support
81
+ - [ ] MoE model (DeepSeek V3, Qwen3-MoE)
82
+ - [ ] Docker environment with miles
83
+
84
+ ### Step 1: Environment Setup
85
+
86
+ ```bash
87
+ # FP8 block scaling (recommended for stability)
88
+ export NVTE_FP8_BLOCK_SCALING_FP32_SCALES=1
89
+ export CUDA_DEVICE_MAX_CONNECTIONS=1
90
+ ```
91
+
92
+ ### Step 2: Configure Training
93
+
94
+ ```bash
95
+ python train.py \
96
+ --actor-num-gpus-per-node 8 \
97
+ --rollout-num-gpus 8 \
98
+ --hf-checkpoint /path/to/deepseek-v3 \
99
+ --advantage-estimator grpo \
100
+ --tensor-model-parallel-size 8 \
101
+ --expert-model-parallel-size 4 \
102
+ --prompt-data /path/to/data.jsonl \
103
+ --num-rollout 3000
104
+ ```
105
+
106
+ ### Verification Checklist
107
+ - [ ] Model loads without errors
108
+ - [ ] Routing decisions are consistent
109
+ - [ ] No NaN/Inf in loss values
110
+
111
+ ---
112
+
113
+ ## Workflow 2: Speculative RL Training
114
+
115
+ Use this workflow for maximum rollout throughput with EAGLE speculative decoding.
116
+
117
+ ### How Speculative RL Works
118
+
119
+ 1. Small draft model generates candidate tokens
120
+ 2. Target model verifies in parallel
121
+ 3. Draft model updated via online SFT to track policy
122
+
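The first two steps above can be sketched as a greedy draft-and-verify loop. This is a simplified illustration, not the EAGLE algorithm as implemented in SGLang: models are stood in for by plain functions, verification is written as a loop rather than one batched forward pass, and the online SFT update of the draft model (step 3) is omitted. All names here (`speculative_step`, `target_next`, `draft_next`) are hypothetical.

```python
from typing import Callable, List

# A "model" here is just a function from a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Propose k draft tokens, keep the longest prefix the target agrees with,
    then append one token chosen by the target itself."""
    # 1. The draft model proposes k tokens autoregressively.
    proposed: List[int] = []
    for _ in range(k):
        proposed.append(draft(prefix + proposed))

    # 2. The target checks each proposed position in order.
    accepted: List[int] = []
    for tok in proposed:
        if target(prefix + accepted) != tok:
            break
        accepted.append(tok)

    # The target always contributes one token, so every step makes progress.
    accepted.append(target(prefix + accepted))
    return accepted


# Toy stand-ins for real models (illustration only).
def target_next(prefix: List[int]) -> int:
    return (sum(prefix) + 1) % 5


def draft_next(prefix: List[int]) -> int:
    # Agrees with the target unless the prefix length is 3 mod 4.
    guess = target_next(prefix)
    return guess if len(prefix) % 4 != 3 else (guess + 1) % 5
```

Because rejected draft tokens are replaced by the target's own choice, the output matches what greedy decoding with the target alone would produce; the draft only changes how many tokens are confirmed per step.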
123
+ ### Step 1: Enable Speculative Decoding
124
+
125
+ miles supports EAGLE speculative decoding via SGLang:
126
+
127
+ ```bash
128
+ python train.py \
129
+ --actor-num-gpus-per-node 8 \
130
+ --hf-checkpoint /path/to/target-model \
131
+ --sglang-speculative-algorithm EAGLE \
132
+ --sglang-speculative-num-steps 3 \
133
+ --sglang-speculative-eagle-topk 1 \
134
+ --sglang-speculative-num-draft-tokens 4 \
135
+ --sglang-speculative-draft-model-path /path/to/draft-model \
136
+ --advantage-estimator grpo \
137
+ --prompt-data /path/to/data.jsonl
138
+ ```
139
+
140
+ ### Step 2: Enable Online MTP Training (Optional)
141
+
142
+ For online SFT of the draft model during training:
143
+
144
+ ```bash
145
+ --mtp-num-layers 1 \
146
+ --enable-mtp-training \
147
+ --mtp-loss-scaling-factor 0.2
148
+ ```
149
+
150
+ **Note**: Online MTP training requires a torch dist checkpoint with MTP weights. Add `--mtp-num-layers 1` during checkpoint conversion from HuggingFace.
151
+
152
+ ### Expected Speedup
153
+
154
+ - **Standard rollout**: Baseline
155
+ - **Speculative RL**: 25-40% faster rollout
156
+ - **With partial rollout**: Additional 10-15% throughput
157
+
158
+ ---
159
+
160
+ ## Configuration Reference
161
+
162
+ miles inherits all slime arguments. See [slime API Reference](../slime/references/api-reference.md) for the complete list.
163
+
164
+ ### Cluster Resources (from slime)
165
+
166
+ ```bash
+ --actor-num-nodes 1
+ --actor-num-gpus-per-node 8
+ --rollout-num-gpus 8
+ --rollout-num-gpus-per-engine 2
+ --colocate
+ ```
173
+
174
+ ### Megatron Parallelism (from slime)
175
+
176
+ ```bash
+ --tensor-model-parallel-size 8
+ --pipeline-model-parallel-size 2
+ --expert-model-parallel-size 4  # MoE expert parallelism
+ ```
181
+
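Megatron-style layouts generally require the number of training GPUs to be divisible by the product of the tensor- and pipeline-parallel sizes, with the quotient becoming the data-parallel size. The sketch below checks that arithmetic; `data_parallel_size` is a hypothetical helper, and it deliberately ignores expert and context parallelism, whose constraints depend on the Megatron version.

```python
def data_parallel_size(num_nodes: int, gpus_per_node: int, tp: int, pp: int) -> int:
    """Return the data-parallel size implied by a TP x PP layout."""
    world_size = num_nodes * gpus_per_node
    model_parallel = tp * pp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"{world_size} GPUs cannot be split into TP={tp} x PP={pp} "
            f"groups of {model_parallel}"
        )
    return world_size // model_parallel


# TP=8 with PP=2 needs 16 GPUs per model replica, i.e. two 8-GPU nodes.
print(data_parallel_size(num_nodes=2, gpus_per_node=8, tp=8, pp=2))  # 1

# A single 8-GPU node cannot host that layout.
try:
    data_parallel_size(num_nodes=1, gpus_per_node=8, tp=8, pp=2)
except ValueError as err:
    print(err)
```

Note that the example values in the two reference blocks above are independent illustrations: one 8-GPU node is not enough for TP=8 combined with PP=2.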
182
+ ### Speculative Decoding (miles-specific)
183
+
184
+ ```bash
+ --sglang-speculative-algorithm EAGLE
+ --sglang-speculative-num-steps 3
+ --sglang-speculative-eagle-topk 1
+ --sglang-speculative-num-draft-tokens 4
+ --sglang-enable-draft-weights-cpu-backup
+ --sglang-speculative-draft-model-path /your/draft/model/path
+ ```
192
+
193
+ ### Online MTP Training (miles-specific)
194
+
195
+ ```bash
+ --mtp-num-layers 1
+ --enable-mtp-training
+ --mtp-loss-scaling-factor 0.2
+ ```
200
+
201
+ ---
202
+
203
+ ## Key Features (Conceptual)
204
+
205
+ The following features are described in the miles documentation, but their specific CLI flags may vary between releases. Consult the miles repository for the latest configuration.
206
+
207
+ ### Unified FP8 Pipeline
208
+
209
+ End-to-end FP8 sampling and training, which removes the quantization-induced train-inference discrepancy that can cause RL collapse in MoE models.
210
+
211
+ ### Rollout Routing Replay (R3)
212
+
213
+ Records expert routing decisions during SGLang inference and replays them during Megatron training for bit-wise expert alignment.
214
+
215
+ **How R3 Works**:
216
+ 1. During SGLang inference, expert routing decisions are recorded
217
+ 2. Routing decisions are stored in `sample.rollout_routed_experts`
218
+ 3. During Megatron training, routing is replayed instead of recomputed
219
+ 4. This ensures identical expert selection between training and inference
220
+
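The record-and-replay idea can be sketched with a toy top-k router. This is an illustration under stated assumptions, not the miles implementation: the logits are made-up values, and `top_k_experts` and `route` are hypothetical helpers. Only the field name `rollout_routed_experts` comes from the text above.

```python
def top_k_experts(router_logits, k):
    """Return the indices of the k largest logits, highest first."""
    order = sorted(range(len(router_logits)), key=lambda i: -router_logits[i])
    return order[:k]


def route(token_idx, router_logits, k, replay=None):
    """Use the recorded routing when available, otherwise recompute it."""
    if replay is not None:
        return replay[token_idx]
    return top_k_experts(router_logits, k)


# Inference side: record the experts chosen for each token.
inference_logits = [[0.60, 0.31, 0.30, 0.05], [0.90, 0.20, 0.30, 0.10]]
rollout_routed_experts = [top_k_experts(row, k=2) for row in inference_logits]

# Training side: a tiny numerical difference flips the near-tie in token 0.
training_logits = [[0.60, 0.30, 0.31, 0.05], [0.90, 0.20, 0.30, 0.10]]

recomputed = [route(i, row, k=2) for i, row in enumerate(training_logits)]
replayed = [route(i, row, k=2, replay=rollout_routed_experts)
            for i, row in enumerate(training_logits)]

print(rollout_routed_experts)  # [[0, 1], [0, 2]]
print(recomputed)              # [[0, 2], [0, 2]]
print(replayed)                # [[0, 1], [0, 2]]
```

Recomputing the routing sends token 0 to a different expert than the one that produced the rollout, while replaying reproduces the inference-time choice exactly.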
221
+ ### INT4 Quantization-Aware Training
222
+
223
+ Enables single-machine deployment of 1TB+ models (e.g., on H200).
224
+
225
+ **Memory Savings with INT4**:
226
+
227
+ | Model Size | BF16 VRAM | INT4 VRAM | Reduction |
+ |------------|-----------|-----------|-----------|
+ | 70B | 140GB | 45GB | 3.1x |
+ | 235B | 470GB | 150GB | 3.1x |
+ | 671B | 1.3TB | 420GB | 3.1x |
232
+
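The BF16 column follows directly from two bytes per parameter. The sketch below reproduces that arithmetic and compares the raw 4-bit figure with the table; the gap between the two is presumably overhead such as quantization scales or layers kept in higher precision, which is an assumption rather than something the miles documentation states. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params_billion: float, bits_per_param: int) -> float:
    """Weight-only memory in GB (1 GB = 1e9 bytes), ignoring activations and KV cache."""
    return num_params_billion * bits_per_param / 8


# INT4 figures copied from the table above, in GB.
table_int4_gb = {70: 45, 235: 150, 671: 420}

for params_b, listed_gb in table_int4_gb.items():
    bf16 = weight_memory_gb(params_b, 16)
    raw_int4 = weight_memory_gb(params_b, 4)
    overhead = listed_gb / raw_int4
    print(f"{params_b}B: BF16={bf16:.0f}GB raw INT4={raw_int4:.1f}GB "
          f"listed={listed_gb}GB ({overhead:.2f}x raw)")
```

A pure 4-bit encoding would give a 4x reduction; the listed 3.1x corresponds to roughly 25-30% more memory than the raw 4-bit weights.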
233
+ ### Train-Inference Alignment
234
+
235
+ miles reports "exactly 0 KL divergence" between training and inference by combining:
236
+ - Flash Attention 3
237
+ - DeepGEMM
238
+ - Batch-invariant kernels from Thinking Machines Lab
239
+ - `torch.compile` integration
240
+
241
+ ---
242
+
243
+ ## Sample Data Structure
244
+
245
+ miles uses the same `Sample` dataclass as slime, including the `rollout_routed_experts` field used for MoE routing replay:
246
+
247
+ ```python
+ from __future__ import annotations  # keep annotations lazy: `Status` is not defined in this excerpt
+
+ from dataclasses import dataclass
+
+
+ @dataclass
+ class Sample:
+     prompt: str | list[dict]
+     tokens: list[int]
+     response: str
+     reward: float | dict
+     loss_mask: list[int]
+     status: Status  # sample status type provided by slime
+     metadata: dict
+     rollout_log_probs: list[float]
+     rollout_routed_experts: list[list[int]]  # MoE routing for R3
+ ```
260
+
261
+ See [slime API Reference](../slime/references/api-reference.md) for the complete Sample definition.
262
+
263
+ ---
264
+
265
+ ## Common Issues and Solutions
266
+
267
+ ### Issue: FP8 Training Collapse
268
+
269
+ **Symptoms**: Loss explodes, NaN values
270
+
271
+ **Solutions**:
272
+ - Use block scaling: `export NVTE_FP8_BLOCK_SCALING_FP32_SCALES=1`
273
+ - Reduce learning rate: `--lr 5e-7`
274
+ - Ensure MoE routing is consistent between train/inference
275
+
276
+ ### Issue: Speculative Draft Drift
277
+
278
+ **Symptoms**: Low acceptance rate over time
279
+
280
+ **Solutions**:
281
+ - Enable online MTP training to keep draft model aligned
282
+ - Reduce speculative steps: `--sglang-speculative-num-steps 2`
283
+ - Use CPU backup: `--sglang-enable-draft-weights-cpu-backup`
284
+
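Draft drift is easiest to spot by tracking the acceptance rate over training. The sketch below computes it from counters and converts it into expected tokens per verification step using the standard simplified model in which each draft token is accepted independently with probability `alpha`. The counters, function names, and `draft_len` parameter are hypothetical and do not correspond to specific miles or SGLang metrics.

```python
def acceptance_rate(num_accepted: int, num_proposed: int) -> float:
    """Fraction of proposed draft tokens that the target model accepted."""
    return num_accepted / num_proposed if num_proposed else 0.0


def expected_tokens_per_step(alpha: float, draft_len: int) -> float:
    """Expected tokens emitted per verification step.

    Assumes each draft token is accepted independently with probability
    alpha, which gives the geometric sum 1 + alpha + ... + alpha**draft_len.
    """
    if alpha >= 1.0:
        return float(draft_len + 1)
    return (1 - alpha ** (draft_len + 1)) / (1 - alpha)


# Illustrative counters: acceptance falls as the policy moves away from the draft.
early = acceptance_rate(num_accepted=800, num_proposed=1000)
late = acceptance_rate(num_accepted=500, num_proposed=1000)

for label, alpha in (("early", early), ("late", late)):
    tokens = expected_tokens_per_step(alpha, draft_len=3)
    print(f"{label}: acceptance={alpha:.2f}, tokens/step={tokens:.2f}")
```

Under this model, a drop in acceptance from 0.8 to 0.5 cuts the expected tokens per step from about 2.95 to about 1.88, which is why a drifting draft model erodes the rollout speedup.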
285
+ ### Issue: Train-Inference Mismatch
286
+
287
+ **Symptoms**: Policy divergence, reward collapse
288
+
289
+ **Solutions**:
290
+ - Use TIS for off-policy correction: `--use-tis --tis-threshold 0.9`
291
+ - Verify log probs match between SGLang and Megatron
292
+ - Enable R3 for MoE models
293
+
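The first two solutions can be illustrated together: measure how far the training-side log probs are from the rollout log probs, then down-weight tokens with truncated importance sampling. This is a sketch of the general technique, assuming per-token log probs as plain floats; the `clip` parameter is illustrative, and how `--tis-threshold` maps onto it in slime or miles is not something this example claims to know.

```python
import math


def max_abs_log_prob_gap(train_log_probs, rollout_log_probs):
    """Largest per-token gap between training and rollout log probs."""
    return max(abs(t - r) for t, r in zip(train_log_probs, rollout_log_probs))


def tis_weights(train_log_probs, rollout_log_probs, clip):
    """Truncated importance weights: min(pi_train / pi_rollout, clip) per token."""
    weights = []
    for lp_train, lp_rollout in zip(train_log_probs, rollout_log_probs):
        ratio = math.exp(lp_train - lp_rollout)
        weights.append(min(ratio, clip))
    return weights


# Made-up log probs: the engines agree on token 0 and disagree on token 1.
train_lp = [-1.0, -0.5]
rollout_lp = [-1.0, -2.0]

print(max_abs_log_prob_gap(train_lp, rollout_lp))   # 1.5
print(tis_weights(train_lp, rollout_lp, clip=2.0))  # [1.0, 2.0]
```

Truncation bounds the variance that a few badly mismatched tokens would otherwise inject into the policy gradient, at the cost of a small bias.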
294
+ ---
295
+
296
+ ## Supported Models
297
+
298
+ | Family | Models | MoE Support |
+ |--------|--------|-------------|
+ | DeepSeek | R1, V3, V3.2 | Full |
+ | Qwen | 2, 2.5, 3 (including MoE) | Full |
+ | Llama | 3, 3.1, 3.3, 4 | Dense only |
+ | Gemma | 2, 3, 3N | Dense only |
+ | GLM | 4.5, 4.6, 4.7 | Dense only |
+ | MiniMax | M2, M2.1 | Full |
306
+
307
+ ---
308
+
309
+ ## Resources
310
+
311
+ - **GitHub**: https://github.com/radixark/miles
312
+ - **Introduction Blog**: https://lmsys.org/blog/2025-11-19-miles/
313
+ - **Slime (upstream)**: https://github.com/THUDM/slime
314
+ - **SGLang**: https://github.com/sgl-project/sglang
315
+
@@ -0,0 +1,141 @@
1
+ # miles API Reference
2
+
3
+ ## Overview
4
+
5
+ miles is an enterprise-grade RL framework built on slime, adding advanced features for large-scale MoE training:
6
+
7
+ - Unified FP8 training and inference
8
+ - INT4 Quantization-Aware Training
9
+ - Rollout Routing Replay (R3)
10
+ - Speculative RL training
11
+
12
+ **Note**: miles inherits slime's configuration system. See [slime API Reference](../../slime/references/api-reference.md) for base arguments.
13
+
14
+ ## Core Data Structures
15
+
16
+ miles uses the same `Sample` dataclass as slime, including the `rollout_routed_experts` field used for MoE routing replay.
17
+
18
+ ## Quick Start
19
+
20
+ ```bash
+ python train.py \
+   --advantage-estimator grpo \
+   --model-name qwen3-30b-a3b \
+   --hf-checkpoint /path/to/qwen3-30b-a3b-hf \
+   --rollout-batch-size 512 \
+   --n-samples-per-prompt 8
+ ```
28
+
29
+ ## Configuration Options
30
+
31
+ miles inherits slime's three argument categories: Megatron arguments, SGLang arguments (with the `--sglang-` prefix), and slime-specific arguments. The most relevant inherited options and the miles-specific additions are:
32
+
33
+ ### Cluster Resources (inherited from slime)
34
+
35
+ ```bash
+ --actor-num-nodes 1
+ --actor-num-gpus-per-node 8
+ --rollout-num-gpus 8
+ --rollout-num-gpus-per-engine 2
+ --colocate
+ ```
42
+
43
+ ### Megatron Parallelism (inherited from slime)
44
+
45
+ ```bash
+ --tensor-model-parallel-size 8
+ --pipeline-model-parallel-size 2
+ --expert-model-parallel-size 4  # MoE expert parallelism
+ ```
50
+
51
+ ### Speculative Decoding
52
+
53
+ Verified flags from miles documentation:
54
+
55
+ ```bash
+ # Basic speculative decoding
+ --sglang-speculative-algorithm EAGLE
+ --sglang-speculative-num-steps 3
+ --sglang-speculative-eagle-topk 1
+ --sglang-speculative-num-draft-tokens 4
+ --sglang-enable-draft-weights-cpu-backup
+
+ # Draft model path
+ --sglang-speculative-draft-model-path /your/draft/model/path
+
+ # Online SFT for draft model (MTP)
+ --mtp-num-layers 1
+ --enable-mtp-training
+ --mtp-loss-scaling-factor 0.2
+ ```
71
+
72
+ **Note**: Online MTP training requires a torch dist checkpoint with MTP weights. Add `--mtp-num-layers 1` during checkpoint conversion from HuggingFace to torch dist format.
73
+
74
+ ## Key Features (Conceptual)
75
+
76
+ The following features are described in the miles documentation, but their specific CLI flags are not publicly documented. Consult the miles repository for the latest configuration options.
77
+
78
+ ### Unified FP8 Pipeline
79
+
80
+ End-to-end FP8 sampling and training, which removes the quantization-induced train-inference discrepancy that can cause RL collapse in MoE models.
81
+
82
+ ### Rollout Routing Replay (R3)
83
+
84
+ Records expert routing decisions during SGLang inference and replays them during Megatron training for bit-wise expert alignment.
85
+
86
+ **How R3 Works**:
87
+ 1. During SGLang inference, expert routing decisions are recorded
88
+ 2. Routing decisions are stored in `sample.rollout_routed_experts`
89
+ 3. During Megatron training, routing is replayed instead of recomputed
90
+ 4. This ensures identical expert selection between training and inference
91
+
92
+ ### INT4 Quantization-Aware Training
93
+
94
+ Enables single-machine deployment of 1TB+ models (e.g., on H200).
95
+
96
+ **Memory Savings with INT4**:
97
+
98
+ | Model Size | BF16 VRAM | INT4 VRAM | Reduction |
+ |------------|-----------|-----------|-----------|
+ | 70B | 140GB | 45GB | 3.1x |
+ | 235B | 470GB | 150GB | 3.1x |
+ | 671B | 1.3TB | 420GB | 3.1x |
103
+
104
+ ### Train-Inference Alignment
105
+
106
+ miles reports "exactly 0 KL divergence" between training and inference through these infrastructure optimizations:
107
+ - Flash Attention 3
108
+ - DeepGEMM
109
+ - Batch-invariant kernels from Thinking Machines Lab
110
+ - `torch.compile` integration
111
+
112
+ ### Truncated/Masked Importance Sampling (TIS/MIS)
113
+
114
+ Algorithmic corrections for off-policy training. See the slime documentation for the `--use-tis` flag.
115
+
116
+ ## Custom Functions
117
+
118
+ Same interface as slime:
119
+
120
+ ```bash
+ --custom-generate-function-path generate.py
+ --custom-rm-path reward.py
+ ```
124
+
125
+ ## Supported Models
126
+
127
+ | Family | Models | MoE Support |
+ |--------|--------|-------------|
+ | DeepSeek | R1, V3, V3.2 | Full |
+ | Qwen | 2, 2.5, 3 (including MoE) | Full |
+ | Llama | 3, 3.1, 3.3, 4 | Dense only |
+ | Gemma | 2, 3, 3N | Dense only |
+ | GLM | 4.5, 4.6, 4.7 | Dense only |
+ | MiniMax | M2, M2.1 | Full |
135
+
136
+ ## Resources
137
+
138
+ - GitHub: https://github.com/radixark/miles
139
+ - Introduction Blog: https://lmsys.org/blog/2025-11-19-miles/
140
+ - Slime (upstream): https://github.com/THUDM/slime
141
+ - SGLang: https://github.com/sgl-project/sglang