@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,413 @@
1
+ # RadixAttention Deep Dive
2
+
3
+ Complete guide to RadixAttention - SGLang's key innovation for automatic prefix caching.
4
+
5
+ ## What is RadixAttention?
6
+
7
+ **RadixAttention** is an algorithm that automatically stores the KV cache of processed prompts in a radix tree and reuses it for any later request that shares a prefix.
8
+
9
+ **Key insight**: In real-world LLM serving:
10
+ - System prompts are repeated across requests
11
+ - Few-shot examples are shared
12
+ - Multi-turn conversations build on previous context
13
+ - Agent tools/functions are defined once
14
+
15
+ **Problem with traditional serving**:
16
+ - Every request recomputes the entire prompt
17
+ - Wasteful for shared prefixes
18
+ - Up to 5-10× slower than necessary when shared prefixes are long
19
+
20
+ **RadixAttention solution**:
21
+ - Build radix tree of all processed tokens
22
+ - Automatically detect shared prefixes
23
+ - Reuse KV cache for matching tokens
24
+ - Only compute new/different tokens
25
+
26
+ ## How It Works
27
+
28
+ ### Radix Tree Structure
29
+
30
+ ```
31
+ Example requests:
32
+ 1. "System: You are helpful\nUser: What's AI?"
33
+ 2. "System: You are helpful\nUser: What's ML?"
34
+ 3. "System: You are helpful\nUser: What's DL?"
35
+
36
+ Radix tree:
37
+ Root
38
+ └── "System: You are helpful\nUser: What's "
39
+ ├── "AI?" → [KV cache for request 1]
40
+ ├── "ML?" → [KV cache for request 2]
41
+ └── "DL?" → [KV cache for request 3]
42
+
43
+ Shared prefix: "System: You are helpful\nUser: What's "
44
+ → Computed once for request 1, reused by requests 2 and 3
45
+ → Prefill for the shared prefix is paid once instead of three times
46
+ ```
47
+
48
+ ### Token-Level Matching
49
+
50
+ RadixAttention works at the token level:
51
+
52
+ ```python
53
+ # Request 1: "Hello world"
54
+ tokens_1 = [15496, 1917]  # Hello=15496, world=1917
55
+ # → KV cache computed and stored in tree
56
+
57
+ # Request 2: "Hello there"
58
+ tokens_2 = [15496, 612]  # Hello=15496, there=612
59
+ # → Reuses KV cache for token 15496
60
+ # → Only computes token 612
61
+ # → Half the prefill work of a cold request
62
+ ```
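The token-level matching described above can be sketched with a plain trie keyed by token ID. This is a minimal illustration, not SGLang's implementation: `PrefixCache`, `match_prefix`, and `insert` are hypothetical names, no KV tensors are stored, and a real radix tree would additionally compress chains of single-child nodes into one edge.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Children keyed by token ID. A real radix tree merges single-child
    # chains into one edge; this plain trie keeps one node per token.
    children: dict = field(default_factory=dict)


class PrefixCache:
    """Toy token-level prefix index (hypothetical, for illustration only)."""

    def __init__(self):
        self.root = Node()

    def match_prefix(self, tokens):
        """Return how many leading tokens are already present in the tree."""
        node, matched = self.root, 0
        for tok in tokens:
            if tok not in node.children:
                break
            node = node.children[tok]
            matched += 1
        return matched

    def insert(self, tokens):
        """Record a processed token sequence so later requests can reuse it."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, Node())


cache = PrefixCache()
request_1 = [15496, 1917]  # "Hello world"
request_2 = [15496, 612]   # "Hello there"

reused_1 = cache.match_prefix(request_1)  # nothing cached yet
cache.insert(request_1)
reused_2 = cache.match_prefix(request_2)  # shares the first token
cache.insert(request_2)

print(reused_1, reused_2)  # 0 1
```

Only the tokens after the matched prefix need a forward pass, so the second request computes one token instead of two.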
63
+
64
+ ### Automatic Eviction
65
+
66
+ When memory is full:
67
+ 1. **LRU policy**: Evict least recently used prefixes
68
+ 2. **Leaf-first**: Remove leaf nodes before internal nodes
69
+ 3. **Preserves common prefixes**: Frequently used prefixes stay cached
70
+
71
+ ```
72
+ Before eviction (memory full):
73
+ Root
74
+ ├── "System A" (used 5 min ago)
75
+ │ ├── "Task 1" (used 1 min ago) ← Keep (recent)
76
+ │ └── "Task 2" (used 30 min ago) ← Evict (old + leaf)
77
+ └── "System B" (used 60 min ago) ← Evict (very old)
78
+
79
+ After eviction:
80
+ Root
81
+ └── "System A"
82
+ └── "Task 1"
83
+ ```
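A minimal sketch of leaf-first LRU eviction, reproducing the example tree above. `CacheNode` and `evict` are hypothetical names chosen for illustration, timestamps are expressed as negative "minutes ago", and the sketch ignores details a production cache would need, such as pinning nodes that are in use by running requests.

```python
from dataclasses import dataclass, field


@dataclass
class CacheNode:
    name: str
    last_used: float  # larger value = used more recently
    parent: "CacheNode | None" = field(default=None, repr=False, compare=False)
    children: dict = field(default_factory=dict)

    def add(self, name, last_used):
        child = CacheNode(name, last_used, parent=self)
        self.children[name] = child
        return child


def leaves(node):
    """Yield every leaf below `node`; the root itself is never a candidate."""
    for child in node.children.values():
        if child.children:
            yield from leaves(child)
        else:
            yield child


def evict(root, count):
    """Remove up to `count` nodes, always taking the least recently used leaf."""
    evicted = []
    for _ in range(count):
        candidates = list(leaves(root))
        if not candidates:
            break
        victim = min(candidates, key=lambda n: n.last_used)
        del victim.parent.children[victim.name]
        evicted.append(victim.name)
    return evicted


# Rebuild the tree from the diagram (timestamps are minutes ago, negated).
root = CacheNode("Root", 0)
system_a = root.add("System A", -5)
system_a.add("Task 1", -1)
system_a.add("Task 2", -30)
root.add("System B", -60)

evicted = evict(root, 2)
print(evicted)  # ['System B', 'Task 2']
```

Because only leaves are candidates, an internal node such as "System A" survives for as long as any of its children are cached, which is what keeps common prefixes resident.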
84
+
85
+ ## Performance Analysis
86
+
87
+ ### Few-Shot Prompting
88
+
89
+ **Scenario**: 10 examples in prompt (2000 tokens), user query (50 tokens)
90
+
91
+ **Without RadixAttention** (e.g., vLLM with prefix caching disabled):
92
+ - Request 1: Compute 2050 tokens (2000 examples + 50 query)
93
+ - Request 2: Compute 2050 tokens (recompute all examples)
94
+ - Request 3: Compute 2050 tokens (recompute all examples)
95
+ - Total: 6150 tokens computed
96
+
97
+ **With RadixAttention** (SGLang):
98
+ - Request 1: Compute 2050 tokens (initial)
99
+ - Request 2: Reuse 2000 tokens, compute 50 (query only)
100
+ - Request 3: Reuse 2000 tokens, compute 50 (query only)
101
+ - Total: 2150 tokens computed
102
+ - **Speedup: 2.86×** (6150 / 2150)
103
+
104
+ ### Agent Workflows
105
+
106
+ **Scenario**: System prompt (1000 tokens) + tools (500 tokens) + query (100 tokens)
107
+
108
+ **Without RadixAttention**:
109
+ - Request 1: 1600 tokens
110
+ - Request 2: 1600 tokens
111
+ - Request 3: 1600 tokens
112
+ - Total: 4800 tokens
113
+
114
+ **With RadixAttention**:
115
+ - Request 1: 1600 tokens (initial)
116
+ - Request 2: Reuse 1500, compute 100
117
+ - Request 3: Reuse 1500, compute 100
118
+ - Total: 1800 tokens
119
+ - **Speedup: 2.67×**
120
+
121
+ ### Multi-Turn Conversations
122
+
123
+ **Scenario**: Conversation grows from 100 → 500 → 1000 tokens
124
+
125
+ | Turn | Tokens | vLLM | SGLang (RadixAttention) |
126
+ |------|--------|------|-------------------------|
127
+ | 1 | 100 | 100 | 100 (initial) |
128
+ | 2 | 500 | 500 | 400 (reuse 100) |
129
+ | 3 | 1000 | 1000 | 500 (reuse 500) |
130
+ | **Total** | | **1600** | **1000** |
131
+ | **Speedup** | | | **1.6×** |
132
+
133
+ As conversation grows, speedup increases!
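The token counts in the three scenarios above can be checked with a few lines of arithmetic. This sketch models each request as a pair of (prompt tokens, tokens reusable from cache); `prefill_totals` is a hypothetical helper, and the ratio it reports is a reduction in prefill tokens, which is an upper bound on end-to-end speedup because decoding time is not affected by prefix reuse.

```python
def prefill_totals(requests):
    """requests: list of (prompt_tokens, reusable_prefix_tokens) pairs."""
    without_cache = sum(total for total, _ in requests)
    with_cache = sum(total - reused for total, reused in requests)
    return without_cache, with_cache, without_cache / with_cache


scenarios = {
    "few-shot": [(2050, 0), (2050, 2000), (2050, 2000)],
    "agent": [(1600, 0), (1600, 1500), (1600, 1500)],
    "multi-turn": [(100, 0), (500, 100), (1000, 500)],
}

for name, reqs in scenarios.items():
    base, cached, ratio = prefill_totals(reqs)
    print(f"{name}: {base} -> {cached} prefill tokens ({ratio:.2f}x fewer)")
# few-shot: 6150 -> 2150 prefill tokens (2.86x fewer)
# agent: 4800 -> 1800 prefill tokens (2.67x fewer)
# multi-turn: 1600 -> 1000 prefill tokens (1.60x fewer)
```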
134
+
135
+ ## Benchmarks
136
+
137
+ ### Throughput Comparison (Llama 3-8B, A100)
138
+
139
+ | Workload | Prefix Length | vLLM | SGLang | Speedup |
140
+ |----------|---------------|------|--------|---------|
141
+ | Simple generation | 0 | 2500 tok/s | 2800 tok/s | 1.12× |
142
+ | Few-shot (5 ex) | 1000 | 800 tok/s | 3200 tok/s | 4× |
143
+ | Few-shot (10 ex) | 2000 | 500 tok/s | 5000 tok/s | **10×** |
144
+ | Agent (tools) | 1500 | 800 tok/s | 4000 tok/s | 5× |
145
+ | Chat (history) | 500-2000 | 1200 tok/s | 3600 tok/s | 3× |
146
+
147
+ **Key insight**: Longer shared prefixes = bigger speedups
148
+
149
+ ### Latency Reduction
150
+
151
+ **Agent workflow** (1000-token system prompt):
152
+
153
+ | Metric | vLLM | SGLang | Improvement |
154
+ |--------|------|--------|-------------|
155
+ | First request | 1.8s | 1.8s | Same (no cache) |
156
+ | Subsequent requests | 1.8s | **0.35s** | **5× faster** |
157
+ | P50 latency (100 req) | 1.8s | 0.42s | 4.3× faster |
158
+ | P99 latency | 2.1s | 0.58s | 3.6× faster |
159
+
160
+ ### Memory Efficiency
161
+
162
+ **Without RadixAttention**:
163
+ - Each request stores its own KV cache
164
+ - 100 requests with 2000-token prefix = 200K tokens cached
165
+ - Memory: ~26 GB (Llama 3-8B, FP16, ~128 KiB of KV cache per token)
166
+
167
+ **With RadixAttention**:
168
+ - Prefix stored once in radix tree
169
+ - 100 requests share 2000-token prefix
170
+ - Memory: ~260 MB for prefix + unique tokens
171
+ - **Savings: 99%** for shared portions
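A back-of-the-envelope estimate of KV cache size for this scenario. The result depends entirely on the model configuration; the values below (32 layers, 8 KV heads with grouped-query attention, head dimension 128, 2 bytes per FP16 value) are assumed for Llama 3-8B and should be replaced with the numbers from the model's own config for anything else.

```python
def kv_bytes_per_token(num_layers, num_kv_heads, head_dim, bytes_per_value):
    # Factor of 2: one key vector and one value vector per layer.
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value


# Assumed Llama 3-8B configuration, FP16 cache.
per_token = kv_bytes_per_token(
    num_layers=32, num_kv_heads=8, head_dim=128, bytes_per_value=2
)

prefix_tokens = 2000
num_requests = 100

duplicated = per_token * prefix_tokens * num_requests  # one copy per request
shared = per_token * prefix_tokens                     # one copy in the tree
savings = 1 - shared / duplicated

print(f"per token:  {per_token / 1024:.0f} KiB")
print(f"duplicated: {duplicated / 1e9:.1f} GB")
print(f"shared:     {shared / 1e6:.0f} MB")
print(f"savings:    {savings:.0%}")
```

The savings ratio depends only on how many requests share the prefix (1 - 1/100 here), not on the model, while the absolute sizes scale with the per-token cost.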
172
+
173
+ ## Configuration
174
+
175
+ ### Enable/Disable RadixAttention
176
+
177
+ ```bash
178
+ # Enabled by default
179
+ python -m sglang.launch_server \
180
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct
181
+
182
+ # Disable (for comparison)
183
+ python -m sglang.launch_server \
184
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
185
+ --disable-radix-cache
186
+ ```
187
+
188
+ ### Cache Size Tuning
189
+
190
+ ```bash
191
+ # Set max cache size (default: 90% of GPU memory)
192
+ python -m sglang.launch_server \
193
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
194
+ --max-radix-cache-len 16384 # Max 16K tokens cached
195
+
196
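+ # Set the static memory fraction (model weights + KV cache pool)
197
+ python -m sglang.launch_server \
+ --model-path meta-llama/Meta-Llama-3-8B-Instruct \
+ --mem-fraction-static 0.85 # 85% of GPU memory for weights + KV cache pool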
198
+ ```
199
+
200
+ ### Eviction Policy
201
+
202
+ ```bash
203
+ # LRU eviction (default)
204
+ --eviction-policy lru
205
+
206
+ # FIFO eviction
207
+ --eviction-policy fifo
208
+ ```
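
To see why the policy matters when one prefix is reused far more often than the others, here is a toy sketch. It is not SGLang's implementation: `ToyPrefixCache`, the capacity of two entries, and the access trace are all illustrative assumptions, and a real radix cache evicts tree nodes rather than whole prompts.

```python
from collections import OrderedDict


class ToyPrefixCache:
    """Fixed-capacity cache of prefixes with LRU or FIFO eviction."""

    def __init__(self, capacity, policy="lru"):
        self.capacity = capacity
        self.policy = policy
        self.entries = OrderedDict()  # oldest entry first

    def access(self, prefix):
        """Return True on a cache hit, inserting the prefix on a miss."""
        if prefix in self.entries:
            if self.policy == "lru":
                # LRU refreshes the entry; FIFO keeps insertion order.
                self.entries.move_to_end(prefix)
            return True
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict the oldest entry
        self.entries[prefix] = True
        return False


def count_hits(policy, trace, capacity=2):
    cache = ToyPrefixCache(capacity, policy)
    return sum(cache.access(prefix) for prefix in trace)


# A hot "system" prefix interleaved with one-off prefixes.
trace = ["system", "A", "system", "B", "system", "C", "system"]
lru_hits = count_hits("lru", trace)
fifo_hits = count_hits("fifo", trace)
print(lru_hits, fifo_hits)
```

On this trace LRU keeps the hot prefix resident, while FIFO evicts it once it becomes the oldest entry.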
209
+
210
+ ## Best Practices
211
+
212
+ ### Design prompts for prefix sharing
213
+
214
+ **Bad** (no prefix sharing):
215
+ ```python
216
+ # Each request has unique prefix
217
+ request_1 = "User Alice asks: What is AI?"
218
+ request_2 = "User Bob asks: What is ML?"
219
+ request_3 = "User Carol asks: What is DL?"
220
+
221
+ # No common prefix → No speedup
222
+ ```
223
+
224
+ **Good** (maximize prefix sharing):
225
+ ```python
226
+ # Shared system prompt
227
+ system = "You are a helpful AI assistant.\n\n"
228
+
229
+ request_1 = system + "User: What is AI?"
230
+ request_2 = system + "User: What is ML?"
231
+ request_3 = system + "User: What is DL?"
232
+
233
+ # Shared prefix → 5× speedup!
234
+ ```
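
A quick way to check a set of prompts before sending them is to measure how much text they share from the start. The sketch below uses character counts as a rough proxy for tokens (an assumption; the cache matches on token IDs), and `shared_prefix_len` is an illustrative helper:

```python
import os


def shared_prefix_len(prompts):
    """Length in characters of the longest prefix common to all prompts."""
    return len(os.path.commonprefix(prompts))


bad = [
    "User Alice asks: What is AI?",
    "User Bob asks: What is ML?",
    "User Carol asks: What is DL?",
]

system = "You are a helpful AI assistant.\n\n"
good = [system + f"User: What is {topic}?" for topic in ("AI", "ML", "DL")]

bad_len = shared_prefix_len(bad)
good_len = shared_prefix_len(good)
print(bad_len, good_len)
```

The first set diverges after the word "User", so almost nothing can be reused; the second shares the whole system prompt.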
235
+
236
+ ### Structure agent prompts
237
+
238
+ ```python
239
+ # Template for maximum caching
240
+ @sgl.function
241
+ def agent_template(s, user_query):
242
+     # Layer 1: System prompt (always cached)
243
+     s += "You are a helpful assistant.\n\n"
244
+
245
+     # Layer 2: Tools definition (always cached)
246
+     s += "Available tools:\n"
247
+     s += "- get_weather(location)\n"
248
+     s += "- send_email(to, subject, body)\n\n"
249
+
250
+     # Layer 3: Examples (always cached)
251
+     s += "Examples:\n"
252
+     s += "User: What's the weather?\n"
253
+     s += "Assistant: <tool>get_weather('NYC')</tool>\n\n"
254
+
255
+     # Layer 4: User query (unique per request)
256
+     s += f"User: {user_query}\n"
257
+     s += "Assistant: "
258
+     s += sgl.gen("response", max_tokens=200)
259
+
260
+ # Layers 1-3 cached, only Layer 4 computed
261
+ # 5× faster for typical agent queries
262
+ ```
263
+
264
+ ### Optimize few-shot prompting
265
+
266
+ ```python
267
+ # BAD: Examples mixed with query
+ @sgl.function
268
+ def bad_few_shot(s, query):
269
+     s += f"Query: {query}\n"   # Unique text comes first
270
+     s += "Example 1: ...\n"    # Can't be cached: it follows the unique query
271
+     s += "Example 2: ...\n"
272
+     s += sgl.gen("answer")
273
+
274
+ # GOOD: Examples first, then query
+ @sgl.function
275
+ def good_few_shot(s, query):
276
+     # Examples (shared prefix, always cached)
277
+     s += "Example 1: ...\n"
278
+     s += "Example 2: ...\n"
279
+     s += "Example 3: ...\n\n"
280
+
281
+     # Query (unique suffix, computed)
282
+     s += f"Query: {query}\n"
283
+     s += sgl.gen("answer")
284
+
285
+ # 10× faster with RadixAttention
286
+ ```
287
+
288
+ ## Monitoring
289
+
290
+ ### Cache hit rate
291
+
292
+ ```python
293
+ # Check cache statistics
294
+ import requests
295
+ response = requests.get("http://localhost:30000/stats")
296
+ stats = response.json()
297
+
298
+ print(f"Cache hit rate: {stats['radix_cache_hit_rate']:.2%}")
299
+ print(f"Tokens cached: {stats['radix_cache_tokens']}")
300
+ print(f"Cache size: {stats['radix_cache_size_mb']} MB")
301
+
302
+ # Target: >80% hit rate for agent/few-shot workloads
303
+ ```
304
+
305
+ ### Optimization metrics
306
+
307
+ ```bash
308
+ # Monitor cache usage (the Prometheus endpoint requires launching the server with --enable-metrics)
309
+ curl http://localhost:30000/metrics | grep radix
310
+
311
+ # Key metrics:
312
+ # - radix_cache_hit_tokens: Tokens reused from cache
313
+ # - radix_cache_miss_tokens: Tokens computed (not cached)
314
+ # - radix_cache_evictions: Number of evictions (should be low)
315
+ ```
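
The hit rate can be derived from the hit and miss token counters listed above. The sketch below parses Prometheus-style text and computes the ratio; the counter names are the ones from the list above, and the sample values are made up for illustration. Exact metric names may differ between SGLang versions.

```python
def parse_counters(metrics_text):
    """Parse 'name value' lines from Prometheus-style text into a dict."""
    counters = {}
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        name, _, value = line.rpartition(" ")
        counters[name] = float(value)
    return counters


def cache_hit_rate(hit_tokens, miss_tokens):
    """Fraction of prompt tokens served from the cache."""
    total = hit_tokens + miss_tokens
    return hit_tokens / total if total else 0.0


# Hypothetical sample output, for illustration only.
sample = """
# TYPE radix_cache_hit_tokens counter
radix_cache_hit_tokens 8000
radix_cache_miss_tokens 2000
radix_cache_evictions 3
"""

counters = parse_counters(sample)
rate = cache_hit_rate(
    counters["radix_cache_hit_tokens"], counters["radix_cache_miss_tokens"]
)
print(f"hit rate: {rate:.0%}")
```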
316
+
317
+ ## Advanced Patterns
318
+
319
+ ### Hierarchical caching
320
+
321
+ ```python
322
+ @sgl.function
323
+ def hierarchical_agent(s, domain, task, query):
324
+     # Level 1: Global system (cached across all requests)
325
+     s += "You are an AI assistant.\n\n"
326
+
327
+     # Level 2: Domain knowledge (cached per domain)
328
+     s += f"Domain: {domain}\n"
329
+     s += f"Knowledge: {get_domain_knowledge(domain)}\n\n"
330
+
331
+     # Level 3: Task context (cached per task)
332
+     s += f"Task: {task}\n"
333
+     s += f"Instructions: {get_task_instructions(task)}\n\n"
334
+
335
+     # Level 4: User query (unique)
336
+     s += f"Query: {query}\n"
337
+     s += sgl.gen("response")
338
+
339
+ # Example cache tree:
340
+ # Root
341
+ # └── "You are an AI assistant.\n\n" (L1)
342
+ #     ├── "Domain: Finance\n..." (L2)
343
+ #     │   ├── "Task: Analysis\n..." (L3)
344
+ #     │   │   └── "Query: ..." (L4)
345
+ #     │   └── "Task: Forecast\n..." (L3)
346
+ #     └── "Domain: Legal\n..." (L2)
347
+ ```
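
The cache tree above can be mimicked with a small prefix tree over tokens. This is a simplified sketch, not SGLang's data structure: it stores one token per node instead of compressed edges, splits on whitespace instead of using a real tokenizer, and the class and method names are illustrative.

```python
class PrefixNode:
    def __init__(self):
        self.children = {}  # token -> PrefixNode


class ToyPrefixTree:
    def __init__(self):
        self.root = PrefixNode()

    def match_and_insert(self, tokens):
        """Return how many leading tokens were already cached, then cache the rest."""
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                # First miss: everything from here on is new and gets inserted.
                child = PrefixNode()
                node.children[tok] = child
            else:
                matched += 1
            node = child
        return matched


system = "You are an AI assistant.".split()
finance = system + "Domain: Finance".split()

tree = ToyPrefixTree()
first = tree.match_and_insert(finance + "Task: Analysis Query: q1".split())
second = tree.match_and_insert(finance + "Task: Forecast Query: q2".split())
third = tree.match_and_insert(system + "Domain: Legal Query: q3".split())
print(first, second, third)
```

The first request finds nothing cached, the second reuses the system and Finance levels, and the third reuses only the system level before branching to a new domain.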
348
+
349
+ ### Batch requests with common prefix
350
+
351
+ ```python
352
+ # All requests share system prompt
353
+ system_prompt = "You are a helpful assistant.\n\n"
354
+
355
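+ queries = [
356
+     "What is AI?",
357
+     "What is ML?",
358
+     "What is DL?",
359
+ ]
360
+
+ # Minimal agent function for illustration
+ @sgl.function
+ def agent(s, prefix, query):
+     s += prefix + f"User: {query}\nAssistant: "
+     s += sgl.gen("response", max_tokens=200)
+
361
+ # Run in batch (RadixAttention automatically optimizes)
+ # Assumes a default backend was set with sgl.set_default_backend(...)
362
+ results = agent.run_batch([
363
+     {"prefix": system_prompt, "query": q}
364
+     for q in queries
365
+ ])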
366
+
367
+ # System prompt computed once, shared across all 3 requests
368
+ # 3× faster than sequential
369
+ ```
370
+
371
+ ## Troubleshooting
372
+
373
+ ### Low cache hit rate (<50%)
374
+
375
+ **Causes**:
376
+ 1. Prompts have no common structure
377
+ 2. Dynamic content in prefix (timestamps, IDs)
378
+ 3. Cache size too small (evictions)
379
+
380
+ **Solutions**:
381
+ 1. Restructure prompts (shared prefix first)
382
+ 2. Move dynamic content to suffix
383
+ 3. Increase `--max-radix-cache-len`
384
+
385
+ ### High memory usage
386
+
387
+ **Cause**: Too many unique prefixes cached
388
+
389
+ **Solutions**:
390
+ ```bash
391
+ # Reduce cache size
392
+ --max-radix-cache-len 8192
393
+
394
+ # Shrink the static memory pool (smaller KV cache, so entries are evicted sooner)
395
+ --mem-fraction-static 0.75
396
+ ```
397
+
398
+ ### Performance worse than vLLM
399
+
400
+ **Cause**: No prefix sharing in workload
401
+
402
+ **Solution**: RadixAttention adds a small overhead when requests share no prefix. For simple generation workloads without repeated prefixes, use vLLM or launch SGLang with `--disable-radix-cache`.
403
+
404
+ ## Comparison with Other Systems
405
+
406
+ | System | Prefix Caching | Automatic | Performance |
407
+ |--------|----------------|-----------|-------------|
408
+ | **SGLang** | ✅ RadixAttention | ✅ Automatic | 5-10× for agents |
409
+ | vLLM | ✅ Automatic prefix caching (hash-based KV blocks) | ✅ Automatic once enabled (`--enable-prefix-caching`) | Baseline |
410
+ | Text Generation Inference | ✅ Prefix caching | ❌ Manual | 2-3× (if configured) |
411
+ | TensorRT-LLM | ✅ Static prefix | ❌ Manual | 2× (if configured) |
412
+
413
+ **SGLang advantage**: Fully automatic - no configuration needed, works for any workload with prefix sharing.