@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373) hide show
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,490 @@
1
+ # Production Deployment Guide
2
+
3
+ Complete guide to deploying SGLang in production environments.
4
+
5
+ ## Server Deployment
6
+
7
+ ### Basic server
8
+
9
+ ```bash
10
+ python -m sglang.launch_server \
11
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
12
+ --host 0.0.0.0 \
13
+ --port 30000 \
14
+ --mem-fraction-static 0.9
15
+ ```
16
+
17
+ ### Multi-GPU (Tensor Parallelism)
18
+
19
+ ```bash
20
+ # Llama 3-70B on 4 GPUs
21
+ python -m sglang.launch_server \
22
+ --model-path meta-llama/Meta-Llama-3-70B-Instruct \
23
+ --tp 4 \
24
+ --port 30000
25
+ ```
26
+
27
+ ### Quantization
28
+
29
+ ```bash
30
+ # FP8 quantization (H100)
31
+ python -m sglang.launch_server \
32
+ --model-path meta-llama/Meta-Llama-3-70B-Instruct \
33
+ --quantization fp8 \
34
+ --tp 4
35
+
36
+ # INT4 AWQ quantization
37
+ python -m sglang.launch_server \
38
+ --model-path TheBloke/Llama-2-70B-AWQ \
39
+ --quantization awq \
40
+ --tp 2
41
+
42
+ # INT4 GPTQ quantization
43
+ python -m sglang.launch_server \
44
+ --model-path TheBloke/Llama-2-70B-GPTQ \
45
+ --quantization gptq \
46
+ --tp 2
47
+ ```
48
+
49
+ ## Docker Deployment
50
+
51
+ ### Dockerfile
52
+
53
+ ```dockerfile
54
+ FROM nvidia/cuda:12.1.0-devel-ubuntu22.04
55
+
56
+ # Install Python
57
+ RUN apt-get update && apt-get install -y python3.10 python3-pip git
58
+
59
+ # Install SGLang
60
+ RUN pip3 install "sglang[all]" flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
61
+
62
+ # Copy model (or download at runtime)
63
+ WORKDIR /app
64
+
65
+ # Expose port
66
+ EXPOSE 30000
67
+
68
+ # Start server
69
+ CMD ["python3", "-m", "sglang.launch_server", \
70
+ "--model-path", "meta-llama/Meta-Llama-3-8B-Instruct", \
71
+ "--host", "0.0.0.0", \
72
+ "--port", "30000"]
73
+ ```
74
+
75
+ ### Build and run
76
+
77
+ ```bash
78
+ # Build image
79
+ docker build -t sglang:latest .
80
+
81
+ # Run with GPU
82
+ docker run --gpus all -p 30000:30000 sglang:latest
83
+
84
+ # Run with specific GPUs
85
+ docker run --gpus '"device=0,1,2,3"' -p 30000:30000 sglang:latest
86
+
87
+ # Run with custom model
88
+ docker run --gpus all -p 30000:30000 \
89
+ -e MODEL_PATH="meta-llama/Meta-Llama-3-70B-Instruct" \
90
+ -e TP_SIZE="4" \
91
+ sglang:latest
92
+ ```
93
+
94
+ ## Kubernetes Deployment
95
+
96
+ ### Deployment YAML
97
+
98
+ ```yaml
99
+ apiVersion: apps/v1
100
+ kind: Deployment
101
+ metadata:
102
+ name: sglang-llama3-70b
103
+ spec:
104
+ replicas: 2
105
+ selector:
106
+ matchLabels:
107
+ app: sglang
108
+ template:
109
+ metadata:
110
+ labels:
111
+ app: sglang
112
+ spec:
113
+ containers:
114
+ - name: sglang
115
+ image: sglang:latest
116
+ command:
117
+ - python3
118
+ - -m
119
+ - sglang.launch_server
120
+ - --model-path=meta-llama/Meta-Llama-3-70B-Instruct
121
+ - --tp=4
122
+ - --host=0.0.0.0
123
+ - --port=30000
124
+ - --mem-fraction-static=0.9
125
+ ports:
126
+ - containerPort: 30000
127
+ name: http
128
+ resources:
129
+ limits:
130
+ nvidia.com/gpu: 4
131
+ livenessProbe:
132
+ httpGet:
133
+ path: /health
134
+ port: 30000
135
+ initialDelaySeconds: 60
136
+ periodSeconds: 10
137
+ readinessProbe:
138
+ httpGet:
139
+ path: /health
140
+ port: 30000
141
+ initialDelaySeconds: 30
142
+ periodSeconds: 5
143
+ ---
144
+ apiVersion: v1
145
+ kind: Service
146
+ metadata:
147
+ name: sglang-service
148
+ spec:
149
+ selector:
150
+ app: sglang
151
+ ports:
152
+ - port: 80
153
+ targetPort: 30000
154
+ type: LoadBalancer
155
+ ```
156
+
157
+ ## Monitoring
158
+
159
+ ### Health checks
160
+
161
+ ```bash
162
+ # Health endpoint
163
+ curl http://localhost:30000/health
164
+
165
+ # Model info
166
+ curl http://localhost:30000/v1/models
167
+
168
+ # Server stats
169
+ curl http://localhost:30000/stats
170
+ ```
171
+
172
+ ### Prometheus metrics
173
+
174
+ ```bash
175
+ # Start server with metrics
176
+ python -m sglang.launch_server \
177
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
178
+ --enable-metrics
179
+
180
+ # Metrics endpoint
181
+ curl http://localhost:30000/metrics
182
+
183
+ # Key metrics:
184
+ # - sglang_request_total
185
+ # - sglang_request_duration_seconds
186
+ # - sglang_tokens_generated_total
187
+ # - sglang_active_requests
188
+ # - sglang_queue_size
189
+ # - sglang_radix_cache_hit_rate
190
+ # - sglang_gpu_memory_used_bytes
191
+ ```
192
+
193
+ ### Logging
194
+
195
+ ```bash
196
+ # Enable debug logging
197
+ python -m sglang.launch_server \
198
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
199
+ --log-level debug
200
+
201
+ # Log to file
202
+ python -m sglang.launch_server \
203
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
204
+ --log-file /var/log/sglang.log
205
+ ```
206
+
207
+ ## Load Balancing
208
+
209
+ ### NGINX configuration
210
+
211
+ ```nginx
212
+ upstream sglang_backend {
213
+ least_conn; # Route to least busy instance
214
+ server sglang-1:30000 max_fails=3 fail_timeout=30s;
215
+ server sglang-2:30000 max_fails=3 fail_timeout=30s;
216
+ server sglang-3:30000 max_fails=3 fail_timeout=30s;
217
+ }
218
+
219
+ server {
220
+ listen 80;
221
+
222
+ location / {
223
+ proxy_pass http://sglang_backend;
224
+ proxy_http_version 1.1;
225
+ proxy_set_header Connection "";
226
+ proxy_read_timeout 300s;
227
+ proxy_connect_timeout 10s;
228
+
229
+ # For streaming
230
+ proxy_buffering off;
231
+ proxy_cache off;
232
+ }
233
+
234
+ location /metrics {
235
+ proxy_pass http://sglang_backend/metrics;
236
+ }
237
+ }
238
+ ```
239
+
240
+ ## Autoscaling
241
+
242
+ ### HPA based on GPU utilization
243
+
244
+ ```yaml
245
+ apiVersion: autoscaling/v2
246
+ kind: HorizontalPodAutoscaler
247
+ metadata:
248
+ name: sglang-hpa
249
+ spec:
250
+ scaleTargetRef:
251
+ apiVersion: apps/v1
252
+ kind: Deployment
253
+ name: sglang-llama3-70b
254
+ minReplicas: 2
255
+ maxReplicas: 10
256
+ metrics:
257
+ - type: Pods
258
+ pods:
259
+ metric:
260
+ name: nvidia_gpu_duty_cycle
261
+ target:
262
+ type: AverageValue
263
+ averageValue: "80" # Scale when GPU >80%
264
+ ```
265
+
266
+ ### HPA based on active requests
267
+
268
+ ```yaml
269
+ metrics:
270
+ - type: Pods
271
+ pods:
272
+ metric:
273
+ name: sglang_active_requests
274
+ target:
275
+ type: AverageValue
276
+ averageValue: "50" # Scale when >50 active requests per pod
277
+ ```
278
+
279
+ ## Performance Tuning
280
+
281
+ ### Memory optimization
282
+
283
+ ```bash
284
+ # Reduce memory usage
285
+ python -m sglang.launch_server \
286
+ --model-path meta-llama/Meta-Llama-3-70B-Instruct \
287
+ --tp 4 \
288
+ --mem-fraction-static 0.85 \ # Use 85% of GPU memory
289
+ --max-radix-cache-len 8192 # Limit cache to 8K tokens
290
+ ```
291
+
292
+ ### Throughput optimization
293
+
294
+ ```bash
295
+ # Maximize throughput
296
+ python -m sglang.launch_server \
297
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
298
+ --mem-fraction-static 0.95 \ # More memory for batching
299
+ --max-radix-cache-len 16384 \ # Larger cache
300
+ --max-running-requests 256 # More concurrent requests
301
+ ```
302
+
303
+ ### Latency optimization
304
+
305
+ ```bash
306
+ # Minimize latency
307
+ python -m sglang.launch_server \
308
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
309
+ --max-running-requests 32 \ # Fewer concurrent (less queueing)
310
+ --schedule-policy fcfs # First-come first-served
311
+ ```
312
+
313
+ ## Multi-Node Deployment
314
+
315
+ ### Ray cluster setup
316
+
317
+ ```bash
318
+ # Head node
319
+ ray start --head --port=6379
320
+
321
+ # Worker nodes
322
+ ray start --address='head-node:6379'
323
+
324
+ # Launch server across cluster
325
+ python -m sglang.launch_server \
326
+ --model-path meta-llama/Meta-Llama-3-405B-Instruct \
327
+ --tp 8 \
328
+ --num-nodes 2 # Use 2 nodes (8 GPUs each)
329
+ ```
330
+
331
+ ## Security
332
+
333
+ ### API authentication
334
+
335
+ ```bash
336
+ # Start with API key
337
+ python -m sglang.launch_server \
338
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
339
+ --api-key YOUR_SECRET_KEY
340
+
341
+ # Client request
342
+ curl http://localhost:30000/v1/chat/completions \
343
+ -H "Authorization: Bearer YOUR_SECRET_KEY" \
344
+ -H "Content-Type: application/json" \
345
+ -d '{"model": "default", "messages": [...]}'
346
+ ```
347
+
348
+ ### Network policies (Kubernetes)
349
+
350
+ ```yaml
351
+ apiVersion: networking.k8s.io/v1
352
+ kind: NetworkPolicy
353
+ metadata:
354
+ name: sglang-policy
355
+ spec:
356
+ podSelector:
357
+ matchLabels:
358
+ app: sglang
359
+ policyTypes:
360
+ - Ingress
361
+ ingress:
362
+ - from:
363
+ - podSelector:
364
+ matchLabels:
365
+ app: api-gateway # Only allow from gateway
366
+ ports:
367
+ - protocol: TCP
368
+ port: 30000
369
+ ```
370
+
371
+ ## Troubleshooting
372
+
373
+ ### High memory usage
374
+
375
+ **Check**:
376
+ ```bash
377
+ nvidia-smi
378
+ curl http://localhost:30000/stats | grep cache
379
+ ```
380
+
381
+ **Solutions**:
382
+ ```bash
383
+ # Reduce cache size
384
+ --max-radix-cache-len 4096
385
+
386
+ # Reduce memory fraction
387
+ --mem-fraction-static 0.75
388
+
389
+ # Enable quantization
390
+ --quantization fp8
391
+ ```
392
+
393
+ ### Low throughput
394
+
395
+ **Check**:
396
+ ```bash
397
+ curl http://localhost:30000/stats | grep queue_size
398
+ ```
399
+
400
+ **Solutions**:
401
+ ```bash
402
+ # Increase batch size
403
+ --max-running-requests 256
404
+
405
+ # Add more GPUs
406
+ --tp 4 # Increase tensor parallelism
407
+
408
+ # Check cache hit rate (should be >70%)
409
+ curl http://localhost:30000/stats | grep cache_hit_rate
410
+ ```
411
+
412
+ ### High latency
413
+
414
+ **Check**:
415
+ ```bash
416
+ curl http://localhost:30000/metrics | grep duration
417
+ ```
418
+
419
+ **Solutions**:
420
+ ```bash
421
+ # Reduce concurrent requests
422
+ --max-running-requests 32
423
+
424
+ # Use FCFS scheduling (no batching delay)
425
+ --schedule-policy fcfs
426
+
427
+ # Add more replicas (horizontal scaling)
428
+ ```
429
+
430
+ ### OOM errors
431
+
432
+ **Solutions**:
433
+ ```bash
434
+ # Reduce batch size
435
+ --max-running-requests 128
436
+
437
+ # Reduce cache
438
+ --max-radix-cache-len 2048
439
+
440
+ # Enable quantization
441
+ --quantization awq
442
+
443
+ # Increase tensor parallelism
444
+ --tp 8
445
+ ```
446
+
447
+ ## Best Practices
448
+
449
+ 1. **Use RadixAttention** - Enabled by default, 5-10× speedup for agents
450
+ 2. **Monitor cache hit rate** - Target >70% for agent/few-shot workloads
451
+ 3. **Set health checks** - Use `/health` endpoint for k8s probes
452
+ 4. **Enable metrics** - Monitor with Prometheus + Grafana
453
+ 5. **Use load balancing** - Distribute load across replicas
454
+ 6. **Tune memory** - Start with `--mem-fraction-static 0.9`, adjust based on OOM
455
+ 7. **Use quantization** - FP8 on H100, AWQ/GPTQ on A100
456
+ 8. **Set up autoscaling** - Scale based on GPU utilization or active requests
457
+ 9. **Log to persistent storage** - Use `--log-file` for debugging
458
+ 10. **Test before production** - Run load tests with expected traffic patterns
459
+
460
+ ## Cost Optimization
461
+
462
+ ### GPU selection
463
+
464
+ **A100 80GB** ($3-4/hour):
465
+ - Llama 3-70B with FP8 (TP=4)
466
+ - Throughput: 10,000-15,000 tok/s
467
+ - Cost per 1M tokens: $0.20-0.30
468
+
469
+ **H100 80GB** ($6-8/hour):
470
+ - Llama 3-70B with FP8 (TP=4)
471
+ - Throughput: 20,000-30,000 tok/s
472
+ - Cost per 1M tokens: $0.15-0.25 (2× faster)
473
+
474
+ **L4** ($0.50-1/hour):
475
+ - Llama 3-8B
476
+ - Throughput: 1,500-2,500 tok/s
477
+ - Cost per 1M tokens: $0.20-0.40
478
+
479
+ ### Batching for cost efficiency
480
+
481
+ **Low batch (batch=1)**:
482
+ - Throughput: 1,000 tok/s
483
+ - Cost: $3/hour ÷ 1M tok/hour = $3/M tokens
484
+
485
+ **High batch (batch=128)**:
486
+ - Throughput: 8,000 tok/s
487
+ - Cost: $3/hour ÷ 8M tok/hour = $0.375/M tokens
488
+ - **8× cost reduction**
489
+
490
+ **Recommendation**: Target batch size 64-256 for optimal cost/latency.