@synsci/cli-darwin-x64 1.1.49
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
@@ -0,0 +1,464 @@
---
name: slime-rl-training
description: Provides guidance for LLM post-training with RL using slime, a Megatron+SGLang framework. Use when training GLM models, implementing custom data generation workflows, or needing tight Megatron-LM integration for RL scaling.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Reinforcement Learning, Megatron-LM, SGLang, GRPO, Post-Training, GLM]
dependencies: [sglang-router>=0.2.3, ray, torch>=2.0.0, transformers>=4.40.0]
---

# slime: LLM Post-Training Framework for RL Scaling

slime is an LLM post-training framework from Tsinghua's THUDM team, powering GLM-4.5, GLM-4.6, and GLM-4.7. It connects Megatron-LM for training with SGLang for high-throughput rollout generation.

## When to Use slime

**Choose slime when you need:**
- Megatron-LM native training with SGLang inference
- Custom data generation workflows with flexible data buffers
- Training GLM, Qwen3, DeepSeek V3, or Llama 3 models
- Research-grade framework with production backing (Z.ai)

**Consider alternatives when:**
- You need enterprise-grade stability features → use **miles**
- You want flexible backend swapping → use **verl**
- You need PyTorch-native abstractions → use **torchforge**

## Key Features

- **Training**: Megatron-LM with full parallelism support (TP, PP, DP, SP)
- **Rollout**: SGLang-based high-throughput generation with router
- **Data Buffer**: Flexible prompt management and sample storage
- **Models**: GLM-4.x, Qwen3, DeepSeek V3/R1, Llama 3

## Architecture Overview

```
┌─────────────────────────────────────────────────────────┐
│                       Data Buffer                       │
│  - Prompt initialization and management                 │
│  - Custom data generation and filtering                 │
│  - Rollout sample storage                               │
└─────────────┬───────────────────────────┬───────────────┘
              │                           │
┌─────────────▼───────────┐ ┌─────────────▼───────────────┐
│ Training (Megatron-LM)  │ │ Rollout (SGLang + Router)   │
│ - Actor model training  │ │ - Response generation       │
│ - Critic (optional)     │ │ - Reward/verifier output    │
│ - Weight sync to rollout│ │ - Multi-turn support        │
└─────────────────────────┘ └─────────────────────────────┘
```
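
The Data Buffer box above is the piece users customize most: it queues prompts, collects scored rollout samples, and can filter them before training. A conceptual stdlib sketch of that role (illustrative only — the `DataBuffer`/`Sample` names here are hypothetical, not slime's actual buffer API):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Sample:
    prompt: str
    response: str = ""
    reward: float = 0.0

class DataBuffer:
    """Conceptual sketch: prompts in, scored rollout samples out."""
    def __init__(self, prompts):
        self.prompts = deque(prompts)   # prompt initialization and management
        self.samples = []               # rollout sample storage

    def next_prompts(self, n):
        # Hand the rollout side its next batch of prompts
        return [self.prompts.popleft() for _ in range(min(n, len(self.prompts)))]

    def add_samples(self, samples, min_reward=None):
        # Optional filtering hook, e.g. drop low-reward samples
        for s in samples:
            if min_reward is None or s.reward >= min_reward:
                self.samples.append(s)

buf = DataBuffer(["What is 2 + 2?", "Solve: 3x = 12"])
batch = buf.next_prompts(2)
buf.add_samples([Sample(p, response="...", reward=1.0) for p in batch])
assert len(buf.samples) == 2
```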

## Installation

```bash
# Recommended: Docker
docker pull slimerl/slime:latest
docker run --rm --gpus all --ipc=host --shm-size=16g \
  -it slimerl/slime:latest /bin/bash

# Inside container
cd /root/slime && pip install -e . --no-deps
```

### From Source

```bash
git clone https://github.com/THUDM/slime.git
cd slime
pip install -r requirements.txt
pip install -e .
```

## Quick Start: GRPO Training

```bash
# Source model configuration
source scripts/models/qwen3-4B.sh

# Launch training
python train.py \
  --actor-num-nodes 1 \
  --actor-num-gpus-per-node 4 \
  --rollout-num-gpus 4 \
  --advantage-estimator grpo \
  --use-kl-loss --kl-loss-coef 0.001 \
  --rollout-batch-size 32 \
  --n-samples-per-prompt 8 \
  --global-batch-size 256 \
  --num-rollout 3000 \
  --prompt-data /path/to/data.jsonl \
  ${MODEL_ARGS[@]} ${CKPT_ARGS[@]}
```

---

## Workflow 1: Standard GRPO Training

Use this workflow for training reasoning models with group-relative advantages.
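
The group-relative advantage behind GRPO is simple to state: the n samples generated for one prompt form a group, and each sample's advantage is its reward standardized against the group's mean and standard deviation. A minimal stdlib sketch of that idea (illustrative only; slime's actual estimator is selected by `--advantage-estimator grpo` and may differ in details such as epsilon handling):

```python
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-6):
    """Standardize rewards within one prompt's group of samples."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# 8 completions for one prompt, scored 0/1 by a verifier
rewards = [1, 0, 0, 1, 1, 0, 0, 0]
advs = grpo_advantages(rewards)
# Correct answers get positive advantage, incorrect ones negative
assert all((a > 0) == (r == 1) for a, r in zip(advs, rewards))
```

Note how the quick-start flags relate: `--rollout-batch-size 32` prompts × `--n-samples-per-prompt 8` completions = 256 samples per rollout, matching `--global-batch-size 256`.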

### Prerequisites Checklist
- [ ] Docker environment or Megatron-LM + SGLang installed
- [ ] Model checkpoint (HuggingFace or Megatron format)
- [ ] Training data in JSONL format

### Step 1: Prepare Data

```jsonl
# data.jsonl format
{"prompt": "What is 2 + 2?", "label": "4"}
{"prompt": "Solve: 3x = 12", "label": "x = 4"}
```

Or with chat format:
```json
{
  "prompt": [
    {"role": "system", "content": "You are a math tutor."},
    {"role": "user", "content": "What is 15 + 27?"}
  ],
  "label": "42"
}
```
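
A stdlib-only sketch for producing and sanity-checking files in this shape (the temp path and example rows are illustrative; `prompt` and `label` match the keys slime reads via `--input-key`/`--label-key`):

```python
import json
import os
import tempfile

examples = [
    {"prompt": "What is 2 + 2?", "label": "4"},
    {"prompt": [
        {"role": "system", "content": "You are a math tutor."},
        {"role": "user", "content": "What is 15 + 27?"},
    ], "label": "42"},
]

path = os.path.join(tempfile.mkdtemp(), "train.jsonl")
with open(path, "w") as f:
    for ex in examples:
        assert "prompt" in ex and "label" in ex, "each row needs both keys"
        # One JSON object per line, no pretty-printing
        f.write(json.dumps(ex, ensure_ascii=False) + "\n")

# Round-trip check: every line must parse as standalone JSON
with open(path) as f:
    rows = [json.loads(line) for line in f]
assert len(rows) == 2 and rows[1]["label"] == "42"
```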

### Step 2: Configure Model

Choose a pre-configured model script:

```bash
# List available models
ls scripts/models/
# glm4-9B.sh, qwen3-4B.sh, qwen3-30B-A3B.sh, deepseek-v3.sh, llama3-8B.sh, ...

# Source your model
source scripts/models/qwen3-4B.sh
```

### Step 3: Launch Training

```bash
python train.py \
  --actor-num-nodes 1 \
  --actor-num-gpus-per-node 8 \
  --rollout-num-gpus 8 \
  --advantage-estimator grpo \
  --use-kl-loss \
  --kl-loss-coef 0.001 \
  --prompt-data /path/to/train.jsonl \
  --input-key prompt \
  --label-key label \
  --apply-chat-template \
  --rollout-batch-size 32 \
  --n-samples-per-prompt 8 \
  --global-batch-size 256 \
  --num-rollout 3000 \
  --save-interval 100 \
  --eval-interval 50 \
  ${MODEL_ARGS[@]}
```

### Step 4: Monitor Training
- [ ] Check TensorBoard: `tensorboard --logdir outputs/`
- [ ] Verify reward curves are increasing
- [ ] Monitor GPU utilization across nodes

---

## Workflow 2: Asynchronous Training

Use async mode for higher throughput by overlapping rollout and training.

### When to Use Async
- Large models with long generation times
- High GPU idle time in synchronous mode
- Sufficient memory for buffering

### Launch Async Training

```bash
python train_async.py \
  --actor-num-nodes 1 \
  --actor-num-gpus-per-node 8 \
  --rollout-num-gpus 8 \
  --advantage-estimator grpo \
  --async-buffer-size 4 \
  --prompt-data /path/to/train.jsonl \
  ${MODEL_ARGS[@]}
```

### Async-Specific Parameters

```bash
--async-buffer-size 4          # Number of rollouts to buffer
--update-weights-interval 2    # Sync weights every N rollouts
```
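
Conceptually, `--async-buffer-size` bounds a queue between the rollout loop and the training loop: generation runs ahead while training consumes, but blocks once the configured number of rollouts is pending. A stdlib producer/consumer sketch of that pattern (illustrative only; slime's real implementation coordinates distributed workers and weight sync, which this sketch omits):

```python
import queue
import threading

buffer = queue.Queue(maxsize=4)  # --async-buffer-size 4

def rollout_worker(num_rollouts):
    for i in range(num_rollouts):
        # A real worker would call SGLang to generate rollout i here
        buffer.put(f"rollout-{i}")  # blocks while 4 rollouts are pending
    buffer.put(None)  # sentinel: no more rollouts

def trainer():
    steps = 0
    while (batch := buffer.get()) is not None:
        # A real trainer would run a Megatron forward/backward here
        steps += 1
    return steps

t = threading.Thread(target=rollout_worker, args=(10,))
t.start()
trained = trainer()
t.join()
assert trained == 10
```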
|
|
196
|
+
|
|
197
|
+
---

## Workflow 3: Multi-Turn Agentic Training

Use this workflow for training agents with tool use or multi-step reasoning.

### Prerequisites
- [ ] Custom generate function for multi-turn logic
- [ ] Tool/environment interface

### Step 1: Define Custom Generate Function

```python
# custom_generate.py
async def custom_generate(args, samples, evaluation=False):
    """Multi-turn generation with tool calling."""
    for sample in samples:
        conversation = sample.prompt

        for turn in range(args.max_turns):
            # Generate response
            response = await generate_single(conversation)

            # Check for tool call
            tool_call = extract_tool_call(response)
            if tool_call:
                tool_result = execute_tool(tool_call)
                conversation.append({"role": "assistant", "content": response})
                conversation.append({"role": "tool", "content": tool_result})
            else:
                break

        sample.response = response
        sample.reward = compute_reward(sample)

    return samples
```
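
The helpers above (`generate_single`, `extract_tool_call`, `execute_tool`, `compute_reward`) are yours to supply. One plausible `extract_tool_call`, assuming the model emits calls as JSON inside `<tool_call>` tags (a common convention, not something slime mandates):

```python
import json
import re

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

def extract_tool_call(response):
    """Return the parsed tool-call dict, or None if the response has none."""
    match = TOOL_CALL_RE.search(response)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None  # malformed call: treat as a plain response

call = extract_tool_call('<tool_call>{"name": "search", "arguments": {"q": "slime"}}</tool_call>')
print(call["name"])  # search
```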

### Step 2: Launch with Custom Function

```bash
python train.py \
  --custom-generate-function-path custom_generate.py \
  --max-turns 5 \
  --prompt-data /path/to/agent_data.jsonl \
  ${MODEL_ARGS[@]}
```

See `examples/search-r1/` for a complete multi-turn search example.

---

## Configuration Reference

### Three Argument Categories

slime uses three types of arguments:

**1. Megatron Arguments** (passed through directly):
```bash
--tensor-model-parallel-size 2
--pipeline-model-parallel-size 1
--num-layers 32
--hidden-size 4096
```

**2. SGLang Arguments** (prefixed with `--sglang-`):
```bash
--sglang-mem-fraction-static 0.8
--sglang-context-length 8192
--sglang-log-level INFO
```

**3. slime Arguments**:
```bash
# Resource allocation
--actor-num-nodes 1
--actor-num-gpus-per-node 8
--rollout-num-gpus 8
--colocate                    # Share GPUs between training and inference

# Data
--prompt-data /path/to/data.jsonl
--input-key prompt
--label-key label

# Training loop
--num-rollout 3000
--rollout-batch-size 32
--n-samples-per-prompt 8
--global-batch-size 256

# Algorithm
--advantage-estimator grpo    # or: gspo, ppo, reinforce_plus_plus
--use-kl-loss
--kl-loss-coef 0.001
```
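
For intuition on `--advantage-estimator grpo`: GRPO computes each sample's advantage by normalizing its reward against the other `--n-samples-per-prompt` responses to the same prompt. A sketch of the standard formula (not slime's internal implementation):

```python
import statistics

def grpo_advantages(group_rewards, eps=1e-6):
    """Advantage = (reward - group mean) / (group std + eps), per prompt group."""
    mean = statistics.mean(group_rewards)
    std = statistics.pstdev(group_rewards)
    return [(r - mean) / (std + eps) for r in group_rewards]

# 4 sampled responses to one prompt, rewarded 1/0 by a verifier:
print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))  # ≈ [1.0, -1.0, 1.0, -1.0]
```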

### Key Constraints

```
rollout_batch_size × n_samples_per_prompt = global_batch_size × num_steps_per_rollout
```

Example: 32 × 8 = 256 × 1

---
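
A quick pre-launch sanity check for this identity (plain arithmetic, no slime imports; `num_steps_per_rollout` defaults to 1 here for illustration):

```python
def check_batch_config(rollout_batch_size, n_samples_per_prompt,
                       global_batch_size, num_steps_per_rollout=1):
    """Raise if samples generated per rollout can't be consumed evenly."""
    produced = rollout_batch_size * n_samples_per_prompt
    consumed = global_batch_size * num_steps_per_rollout
    if produced != consumed:
        raise ValueError(f"{produced} samples generated, {consumed} consumed")
    return produced

print(check_batch_config(32, 8, 256))  # 256
```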

## Data Buffer System

slime's data buffer enables flexible data management:

### Basic Data Source

```python
class RolloutDataSource:
    def get_samples(self, num_samples):
        """Fetch prompts from dataset."""
        return self.dataset.sample(num_samples)

    def add_samples(self, samples):
        """Called after generation (no-op by default)."""
        pass
```

### Buffered Data Source (Off-Policy)

```python
class RolloutDataSourceWithBuffer(RolloutDataSource):
    def __init__(self):
        super().__init__()  # keep the base data source initialized
        self.buffer = []

    def add_samples(self, samples):
        """Store generated samples for reuse."""
        self.buffer.extend(samples)

    def buffer_filter(self, args, buffer, num_samples):
        """Custom selection logic (prioritized, stratified, etc.)."""
        return select_best(buffer, num_samples)
```

---
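
One plausible `buffer_filter` (a reward-prioritized policy standing in for the `select_best` placeholder above; this is not slime's default behavior):

```python
import heapq

def reward_prioritized_filter(args, buffer, num_samples):
    """Keep the num_samples buffered samples with the highest reward."""
    return heapq.nlargest(num_samples, buffer, key=lambda s: s["reward"])

buffer = [{"id": i, "reward": r} for i, r in enumerate([0.1, 0.9, 0.5, 0.7])]
print([s["id"] for s in reward_prioritized_filter(None, buffer, 2)])  # [1, 3]
```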

## Common Issues and Solutions

### Issue: SGLang Engine Crash

**Symptoms**: Inference engine dies mid-training

**Solutions**:
```bash
# Enable fault tolerance
--use-fault-tolerance

# Increase memory allocation
--sglang-mem-fraction-static 0.85

# Reduce batch size
--rollout-batch-size 16
```

### Issue: Weight Sync Timeout

**Symptoms**: Training hangs after rollout

**Solutions**:
```bash
# Increase sync interval
--update-weights-interval 5

# Use colocated mode (no network transfer)
--colocate
```

### Issue: OOM During Training

**Symptoms**: CUDA OOM in backward pass

**Solutions**:
```bash
# Enable gradient checkpointing
--recompute-activations

# Reduce micro-batch size
--micro-batch-size 1

# Enable sequence parallelism
--sequence-parallel
```

### Issue: Slow Data Loading

**Symptoms**: GPUs idle during data fetch

**Solutions**:
```bash
# Increase data workers
--num-data-workers 4

# Use streaming dataset
--streaming-data
```

---

## Supported Models

| Model Family | Configurations |
|--------------|----------------|
| GLM | GLM-4.5, GLM-4.6, GLM-4.7, GLM-Z1-9B |
| Qwen | Qwen3 (4B, 8B, 30B-A3B), Qwen3-MoE, Qwen2.5 |
| DeepSeek | V3, V3.1, R1 |
| Llama | Llama 3 (8B, 70B) |
| Others | Kimi K2, Moonlight-16B |

Each model has pre-configured scripts in `scripts/models/`.

---

## Advanced Topics

### Co-location Mode

Share GPUs between training and inference to reduce memory pressure:

```bash
python train.py \
  --colocate \
  --actor-num-gpus-per-node 8 \
  --sglang-mem-fraction-static 0.4 \
  ${MODEL_ARGS[@]}
```

### Custom Reward Model

```python
# custom_rm.py
class CustomRewardModel:
    def __init__(self, model_path):
        self.model = load_model(model_path)

    def compute_reward(self, prompts, responses):
        """Score each (prompt, response) pair; return one float per sample."""
        inputs = self.tokenize(prompts, responses)
        scores = self.model(inputs)
        return scores.tolist()
```

```bash
--custom-rm-path custom_rm.py
```

### Multi-Task Evaluation

```bash
--eval-prompt-data aime /path/to/aime.jsonl \
--eval-prompt-data gsm8k /path/to/gsm8k.jsonl \
--n-samples-per-eval-prompt 16
```

---

## Resources

- **Documentation**: https://thudm.github.io/slime/
- **GitHub**: https://github.com/THUDM/slime
- **Blog**: https://lmsys.org/blog/2025-07-09-slime/
- **Examples**: See the `examples/` directory for 14+ worked examples