@synsci/cli-darwin-x64 1.1.49
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
package/bin/skills/trl-fine-tuning/references/online-rl.md
@@ -0,0 +1,82 @@
# Online RL Methods

Guide to online reinforcement learning with PPO, GRPO, RLOO, and OnlineDPO.

## Overview

Online RL generates completions during training and optimizes based on rewards.

## PPO (Proximal Policy Optimization)

Classic RL algorithm for LLM alignment.

### Basic Usage

```bash
python -m trl.scripts.ppo \
    --model_name_or_path Qwen/Qwen2.5-0.5B-Instruct \
    --reward_model_path reward-model \
    --dataset_name trl-internal-testing/descriptiveness-sentiment-trl-style \
    --output_dir model-ppo \
    --learning_rate 3e-6 \
    --per_device_train_batch_size 64 \
    --total_episodes 10000 \
    --num_ppo_epochs 4 \
    --kl_coef 0.05
```

### Key Parameters

- `kl_coef`: KL penalty coefficient (0.05-0.2)
- `num_ppo_epochs`: Optimization epochs per batch (2-4)
- `cliprange`: PPO clipping range (0.1-0.3)
- `vf_coef`: Value function coefficient (0.1)

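To make `cliprange` concrete: it bounds how far the updated policy may move from the policy that generated the rollouts. A minimal plain-Python sketch of the per-token clipped surrogate (an illustration of the idea, not TRL's batched implementation):

```python
def ppo_clipped_objective(ratio: float, advantage: float, cliprange: float = 0.2) -> float:
    """Per-token PPO surrogate, where ratio = pi_new(a|s) / pi_old(a|s)."""
    unclipped = ratio * advantage
    # Clamp the ratio to [1 - cliprange, 1 + cliprange] before weighting the advantage
    clipped = max(min(ratio, 1.0 + cliprange), 1.0 - cliprange) * advantage
    # Take the pessimistic (lower) of the two estimates
    return min(unclipped, clipped)

# With cliprange=0.2, a ratio of 1.5 and positive advantage is clipped to ~1.2,
# limiting how much a single batch can pull the policy.
print(ppo_clipped_objective(1.5, 1.0))
```

Raising `cliprange` allows larger policy updates per batch at the cost of stability, which is why the recommended range stays narrow.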
## GRPO (Group Relative Policy Optimization)

Memory-efficient online RL.

### Basic Usage

```python
from trl import GRPOTrainer, GRPOConfig
from datasets import load_dataset

# Define reward function (here: reward the number of unique words)
def reward_func(completions, **kwargs):
    return [len(set(c.split())) for c in completions]

config = GRPOConfig(
    output_dir="model-grpo",
    num_generations=4,        # Completions per prompt
    max_completion_length=128
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2-0.5B-Instruct",
    reward_funcs=reward_func,
    args=config,
    train_dataset=load_dataset("trl-lib/tldr", split="train")
)
trainer.train()
```

### Key Parameters

- `num_generations`: 2-8 completions per prompt
- `max_completion_length`: 64-256 tokens
- Learning rate: 1e-5 to 1e-4

## Memory Comparison

| Method | Memory (7B) | Speed | Use Case |
|--------|-------------|-------|----------|
| PPO | 40GB | Medium | Maximum control |
| GRPO | 24GB | Fast | **Memory-constrained** |
| OnlineDPO | 28GB | Fast | No reward model |

## References

- PPO paper: https://arxiv.org/abs/1707.06347
- GRPO paper: https://arxiv.org/abs/2402.03300
- TRL docs: https://huggingface.co/docs/trl/
package/bin/skills/trl-fine-tuning/references/reward-modeling.md
@@ -0,0 +1,122 @@
# Reward Modeling

Guide to training reward models with TRL for RLHF pipelines.

## Overview

Reward models score completions based on human preferences. They are used in:
- PPO training (RL feedback)
- GRPO online RL
- Completion ranking

## Basic Training

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import RewardTrainer, RewardConfig
from datasets import load_dataset

# Load model (num_labels=1 for a single scalar reward score)
model = AutoModelForSequenceClassification.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    num_labels=1
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B-Instruct")

# Load preference dataset (chosen/rejected pairs)
dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

# Configure
config = RewardConfig(
    output_dir="Qwen2.5-Reward",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    learning_rate=1e-5
)

# Train
trainer = RewardTrainer(
    model=model,
    args=config,
    processing_class=tokenizer,
    train_dataset=dataset
)
trainer.train()
```

## Dataset Format

Required fields:
```json
{
  "prompt": "Question or instruction",
  "chosen": "Better response",
  "rejected": "Worse response"
}
```

## Bradley-Terry Loss

The default loss function:
```
loss = -log(sigmoid(reward_chosen - reward_rejected))
```

The model learns to score chosen responses higher than rejected ones.

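The loss above can be checked with a few lines of plain Python (a standalone sketch; TRL computes the same quantity over batches of logits):

```python
import math

def bradley_terry_loss(reward_chosen: float, reward_rejected: float) -> float:
    # -log(sigmoid(r_chosen - r_rejected))
    diff = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# Equal rewards give loss log(2) ~ 0.693; a larger chosen-vs-rejected
# margin drives the loss toward 0, a reversed ranking drives it up.
print(bradley_terry_loss(0.0, 0.0))
print(bradley_terry_loss(3.0, 0.0))
```

Only the difference between the two rewards matters, so trained reward scores are meaningful relative to each other rather than on an absolute scale.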
## Using Reward Models

### Inference

```python
from transformers import pipeline

# Load the trained reward model
reward_pipe = pipeline("text-classification", model="Qwen2.5-Reward")

# Score completions (higher score = better)
texts = ["Good answer", "Bad answer"]
scores = reward_pipe(texts)
print(scores)
```

### In PPO

```python
from trl import PPOTrainer, PPOConfig

config = PPOConfig(
    reward_model_path="Qwen2.5-Reward"  # Use the trained reward model
)

trainer = PPOTrainer(
    model=policy_model,
    config=config,
    # Reward model loaded automatically
)
```

## Hyperparameters

| Model Size | Learning Rate | Batch Size | Epochs |
|------------|---------------|------------|--------|
| <1B | 2e-5 | 4-8 | 1-2 |
| 1-7B | 1e-5 | 2-4 | 1 |
| 7-13B | 5e-6 | 1-2 | 1 |

## Evaluation

Check reward separation:
```python
# Chosen should score higher than rejected
chosen_rewards = model(**chosen_inputs).logits
rejected_rewards = model(**rejected_inputs).logits

accuracy = (chosen_rewards > rejected_rewards).float().mean()
print(f"Accuracy: {accuracy:.2%}")  # Target: >80%
```

## References

- InstructGPT paper: https://arxiv.org/abs/2203.02155
- TRL docs: https://huggingface.co/docs/trl/reward_trainer
@@ -0,0 +1,168 @@
|
|
|
1
|
+
# SFT Training Guide
|
|
2
|
+
|
|
3
|
+
Complete guide to Supervised Fine-Tuning (SFT) with TRL for instruction tuning and task-specific fine-tuning.
|
|
4
|
+
|
|
5
|
+
## Overview
|
|
6
|
+
|
|
7
|
+
SFT trains models on input-output pairs to minimize cross-entropy loss. Use for:
|
|
8
|
+
- Instruction following
|
|
9
|
+
- Task-specific fine-tuning
|
|
10
|
+
- Chatbot training
|
|
11
|
+
- Domain adaptation

## Dataset Formats

### Format 1: Prompt-Completion

```json
[
  {
    "prompt": "What is the capital of France?",
    "completion": "The capital of France is Paris."
  }
]
```

### Format 2: Conversational (ChatML)

```json
[
  {
    "messages": [
      {"role": "user", "content": "What is Python?"},
      {"role": "assistant", "content": "Python is a programming language."}
    ]
  }
]
```

### Format 3: Text-only

```json
[
  {"text": "User: Hello\nAssistant: Hi! How can I help?"}
]
```
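`SFTTrainer` infers which of these formats a dataset uses from its column names. A tiny standalone sketch of that detection logic (our illustrative helper, not a TRL API):

```python
def detect_format(example):
    """Label which of the three SFT dataset formats a record uses."""
    if "messages" in example:
        return "conversational"
    if "prompt" in example and "completion" in example:
        return "prompt-completion"
    if "text" in example:
        return "text-only"
    raise ValueError(f"Unrecognized record keys: {sorted(example)}")

print(detect_format({"prompt": "Hi", "completion": "Hello"}))  # prompt-completion
```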

## Basic Training

```python
from trl import SFTTrainer, SFTConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from datasets import load_dataset

# Load model
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")

# Load dataset
dataset = load_dataset("trl-lib/Capybara", split="train")

# Configure
config = SFTConfig(
    output_dir="Qwen2.5-SFT",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=2e-5,
    save_strategy="epoch"
)

# Train
trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    tokenizer=tokenizer
)
trainer.train()
```

## Chat Templates

Apply chat templates automatically:

```python
trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=dataset,  # Messages format
    tokenizer=tokenizer
    # Chat template applied automatically
)
```

Or manually:

```python
def format_chat(example):
    messages = example["messages"]
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    return {"text": text}

dataset = dataset.map(format_chat)
```

## Packing for Efficiency

Pack multiple sequences into one to maximize GPU utilization:

```python
config = SFTConfig(
    packing=True,  # Enable packing
    max_seq_length=2048,
    dataset_text_field="text"
)
```

- **Benefit**: 2-3× faster training
- **Trade-off**: Slightly more complex batching
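Conceptually, packing concatenates the tokenized examples (separated by an EOS token) into one stream and slices it into fixed-length chunks, so no positions are wasted on padding. A simplified sketch of the idea, not TRL's internal implementation; `eos_token_id=0` is a stand-in value:

```python
def pack_sequences(tokenized_examples, max_seq_length, eos_token_id=0):
    """Concatenate token-id lists into fixed-length, padding-free chunks."""
    stream = []
    for ids in tokenized_examples:
        stream.extend(ids + [eos_token_id])  # EOS marks example boundaries
    # Drop the final partial chunk, as most packing implementations do.
    return [stream[i:i + max_seq_length]
            for i in range(0, len(stream) - max_seq_length + 1, max_seq_length)]

examples = [[5, 6, 7], [8, 9], [10, 11, 12, 13]]
print(pack_sequences(examples, max_seq_length=4))
# [[5, 6, 7, 0], [8, 9, 0, 10], [11, 12, 13, 0]]
```

Note that examples can straddle chunk boundaries; attention masking of cross-example tokens is handled by the trainer.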

## Multi-GPU Training

```bash
accelerate launch --num_processes 4 train_sft.py
```

Combine with gradient accumulation for larger effective batches:

```python
config = SFTConfig(
    output_dir="model-sft",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    num_train_epochs=1
)
```
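The effective batch size is the product of per-device batch size, accumulation steps, and process count; with the settings above on 4 GPUs:

```python
# Effective batch size under data parallelism (simple arithmetic,
# not an accelerate API call):
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
num_processes = 4  # value passed to `accelerate launch --num_processes`

effective = per_device_train_batch_size * gradient_accumulation_steps * num_processes
print(effective)  # 64
```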

## LoRA Fine-Tuning

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules="all-linear",
    lora_dropout=0.05,
    task_type="CAUSAL_LM"
)

trainer = SFTTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    peft_config=lora_config  # Add LoRA
)
```
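With `r=16`, each adapted weight matrix gains only `r * (d_in + d_out)` trainable parameters. A quick back-of-envelope for one hypothetical 4096x4096 projection (our arithmetic, not a PEFT function):

```python
def lora_trainable_params(d_in, d_out, r):
    """Parameters in one LoRA adapter pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r

full = 4096 * 4096                            # frozen base weight
lora = lora_trainable_params(4096, 4096, 16)  # added trainable weights
print(lora, f"{lora / full:.2%}")  # 131072 0.78%
```

This is why LoRA fits on hardware where full fine-tuning does not: optimizer state is kept only for the small adapter matrices.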

## Hyperparameters

| Model Size | Learning Rate | Batch Size | Epochs |
|------------|---------------|------------|--------|
| <1B        | 5e-5          | 8-16       | 1-3    |
| 1-7B       | 2e-5          | 4-8        | 1-2    |
| 7-13B      | 1e-5          | 2-4        | 1      |
| 13B+       | 5e-6          | 1-2        | 1      |

## References

- TRL docs: https://huggingface.co/docs/trl/sft_trainer
- Examples: https://github.com/huggingface/trl/tree/main/examples/scripts
@@ -0,0 +1,80 @@
---
name: unsloth
description: Expert guidance for fast fine-tuning with Unsloth - 2-5x faster training, 50-80% less memory, LoRA/QLoRA optimization
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Fine-Tuning, Unsloth, Fast Training, LoRA, QLoRA, Memory-Efficient, Optimization, Llama, Mistral, Gemma, Qwen]
dependencies: [unsloth, torch, transformers, trl, datasets, peft]
---

# Unsloth Skill

Comprehensive assistance with unsloth development, generated from official documentation.

## When to Use This Skill

This skill should be triggered when:
- Working with unsloth
- Asking about unsloth features or APIs
- Implementing unsloth solutions
- Debugging unsloth code
- Learning unsloth best practices

## Quick Reference

### Common Patterns

*Quick reference patterns will be added as you use the skill.*

## Reference Files

This skill includes comprehensive documentation in `references/`:

- **llms-txt.md** - Llms-Txt documentation

Use `view` to read specific reference files when detailed information is needed.

## Working with This Skill

### For Beginners
Start with the **llms-txt.md** reference file for foundational concepts.

### For Specific Features
Consult **llms-txt.md** for detailed API and guide information.

### For Code Examples
The quick reference section above contains common patterns extracted from the official docs.

## Resources

### references/
Organized documentation extracted from official sources. These files contain:
- Detailed explanations
- Code examples with language annotations
- Links to original documentation
- Table of contents for quick navigation

### scripts/
Add helper scripts here for common automation tasks.

### assets/
Add templates, boilerplate, or example projects here.

## Notes

- This skill was automatically generated from official documentation
- Reference files preserve the structure and examples from source docs
- Code examples include language detection for better syntax highlighting
- Quick reference patterns are extracted from common usage examples in the docs

## Updating

To refresh this skill with updated documentation:
1. Re-run the scraper with the same configuration
2. The skill will be rebuilt with the latest information