@synsci/cli-darwin-x64 1.1.49
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
@@ -0,0 +1,564 @@

---
name: audiocraft-audio-generation
description: PyTorch library for audio generation including text-to-music (MusicGen) and text-to-sound (AudioGen). Use when you need to generate music from text descriptions, create sound effects, or perform melody-conditioned music generation.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Multimodal, Audio Generation, Text-to-Music, Text-to-Audio, MusicGen]
dependencies: [audiocraft, torch>=2.0.0, transformers>=4.30.0]
---

# AudioCraft: Audio Generation

Comprehensive guide to using Meta's AudioCraft for text-to-music and text-to-audio generation with MusicGen, AudioGen, and EnCodec.

## When to use AudioCraft

**Use AudioCraft when:**
- You need to generate music from text descriptions
- You are creating sound effects and environmental audio
- You are building music generation applications
- You need melody-conditioned music generation
- You want stereo audio output
- You require controllable music generation with style transfer

**Key features:**
- **MusicGen**: Text-to-music generation with melody conditioning
- **AudioGen**: Text-to-sound effects generation
- **EnCodec**: High-fidelity neural audio codec
- **Multiple model sizes**: Small (300M) to Large (3.3B)
- **Stereo support**: Full stereo audio generation
- **Style conditioning**: MusicGen-Style for reference-based generation

**Use alternatives instead:**
- **Stable Audio**: For longer commercial music generation
- **Bark**: For text-to-speech with music/sound effects
- **Riffusion**: For spectrogram-based music generation
- **OpenAI Jukebox**: For raw audio generation with lyrics

## Quick start

### Installation

```bash
# From PyPI
pip install audiocraft

# From GitHub (latest)
pip install git+https://github.com/facebookresearch/audiocraft.git

# Or use HuggingFace Transformers (scipy is used to write WAV files in the example below)
pip install transformers torch torchaudio scipy
```
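
After installing, a quick import check can confirm which of the packages named above are importable in the current environment. This is a minimal sketch; the helper name `check_packages` is hypothetical and is not part of AudioCraft or Transformers.

```python
import importlib.util


def check_packages(names):
    """Return a dict mapping each top-level package name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Report the packages used in this guide
for name, ok in check_packages(["audiocraft", "torch", "torchaudio", "transformers"]).items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

The check only looks up each package without importing it, so it stays fast even when `torch` is present.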

### Basic text-to-music (AudioCraft)

```python
import torchaudio
from audiocraft.models import MusicGen

# Load model
model = MusicGen.get_pretrained('facebook/musicgen-small')

# Set generation parameters
model.set_generation_params(
    duration=8,  # seconds
    top_k=250,
    temperature=1.0
)

# Generate from text
descriptions = ["happy upbeat electronic dance music with synths"]
wav = model.generate(descriptions)

# Save audio
torchaudio.save("output.wav", wav[0].cpu(), sample_rate=32000)
```

### Using HuggingFace Transformers

```python
import scipy.io.wavfile
import torch
from transformers import AutoProcessor, MusicgenForConditionalGeneration

# Use the GPU when available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load model and processor
processor = AutoProcessor.from_pretrained("facebook/musicgen-small")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-small")
model.to(device)

# Generate music
inputs = processor(
    text=["80s pop track with bassy drums and synth"],
    padding=True,
    return_tensors="pt"
).to(device)

audio_values = model.generate(
    **inputs,
    do_sample=True,
    guidance_scale=3,
    max_new_tokens=256
)

# Save (audio_values has shape [batch, channels, samples])
sampling_rate = model.config.audio_encoder.sampling_rate
scipy.io.wavfile.write("output.wav", rate=sampling_rate, data=audio_values[0, 0].cpu().numpy())
```
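
In the Transformers API, clip length is set through `max_new_tokens` rather than in seconds. The sketch below converts between the two, assuming MusicGen's audio codec produces 50 token frames per second. The constant is an assumption; when a model is loaded, `model.config.audio_encoder.frame_rate` is the value to trust. The helper names are hypothetical.

```python
import math

FRAME_RATE_HZ = 50  # assumption: token frames per second for MusicGen's codec


def seconds_to_max_new_tokens(seconds, frame_rate=FRAME_RATE_HZ):
    """Number of decoding steps needed to cover `seconds` of audio."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return math.ceil(seconds * frame_rate)


def max_new_tokens_to_seconds(tokens, frame_rate=FRAME_RATE_HZ):
    """Approximate audio length produced by `tokens` decoding steps."""
    return tokens / frame_rate


print(max_new_tokens_to_seconds(256))  # 5.12
print(seconds_to_max_new_tokens(8))    # 400
```

Under this assumption, the `max_new_tokens=256` used above yields a clip of roughly five seconds.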

### Text-to-sound with AudioGen

```python
import torchaudio
from audiocraft.models import AudioGen

# Load AudioGen
model = AudioGen.get_pretrained('facebook/audiogen-medium')

model.set_generation_params(duration=5)

# Generate sound effects
descriptions = ["dog barking in a park with birds chirping"]
wav = model.generate(descriptions)

# AudioGen outputs 16 kHz audio
torchaudio.save("sound.wav", wav[0].cpu(), sample_rate=16000)
```

## Core concepts

### Architecture overview

```
AudioCraft Architecture:
┌──────────────────────────────────────────────────────────────┐
│                Text Encoder (T5)                             │
│                        │                                     │
│                 Text Embeddings                              │
└────────────────────────┬─────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────┐
│            Transformer Decoder (LM)                          │
│    Auto-regressively generates audio tokens                  │
│    Using efficient token interleaving patterns               │
└────────────────────────┬─────────────────────────────────────┘
                         │
┌────────────────────────▼─────────────────────────────────────┐
│              EnCodec Audio Decoder                           │
│     Converts tokens back to audio waveform                   │
└──────────────────────────────────────────────────────────────┘
```
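
The "token interleaving patterns" in the diagram refer to how the parallel EnCodec codebook streams are arranged for the decoder. The sketch below illustrates one such arrangement, a delay pattern in which codebook `k` is shifted right by `k` steps. It assumes 4 codebooks and a one-step delay per codebook; the function names and the `PAD` marker are illustrative and are not AudioCraft's API.

```python
PAD = -1  # hypothetical marker for positions that hold no token


def delay_interleave(codes):
    """Shift codebook k right by k steps.

    codes: list of K lists, each holding T token ids (one list per codebook).
    Returns K lists of length T + K - 1, padded with PAD.
    """
    num_codebooks = len(codes)
    num_steps = len(codes[0])
    delayed = []
    for k, stream in enumerate(codes):
        if len(stream) != num_steps:
            raise ValueError("all codebooks must have the same length")
        delayed.append([PAD] * k + list(stream) + [PAD] * (num_codebooks - 1 - k))
    return delayed


def delay_deinterleave(delayed):
    """Undo delay_interleave and recover the aligned K x T token grid."""
    num_codebooks = len(delayed)
    num_steps = len(delayed[0]) - (num_codebooks - 1)
    return [row[k:k + num_steps] for k, row in enumerate(delayed)]


codes = [[10, 11, 12], [20, 21, 22], [30, 31, 32], [40, 41, 42]]
for row in delay_interleave(codes):
    print(row)
```

With this layout, 3 frames across 4 codebooks take 6 decoding steps, compared with 12 if every token were flattened into a single sequence.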

### Model variants

| Model | Size | Description | Use Case |
|-------|------|-------------|----------|
| `musicgen-small` | 300M | Text-to-music | Quick generation |
| `musicgen-medium` | 1.5B | Text-to-music | Balanced |
| `musicgen-large` | 3.3B | Text-to-music | Best quality |
| `musicgen-melody` | 1.5B | Text + melody | Melody conditioning |
| `musicgen-melody-large` | 3.3B | Text + melody | Best melody |
| `musicgen-stereo-*` | Varies | Stereo output | Stereo generation |
| `musicgen-style` | 1.5B | Style transfer | Reference-based |
| `audiogen-medium` | 1.5B | Text-to-sound | Sound effects |
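
The table above can be encoded as a small lookup for choosing a checkpoint by task and parameter budget. This is a sketch only: `pick_variant` and the task labels are hypothetical, the sizes are copied from the table, and the `facebook/` prefix follows the examples in this guide. The stereo variants are left out because their sizes vary.

```python
# (model name, parameters in billions, task), copied from the table above
VARIANTS = [
    ("musicgen-small", 0.3, "text-to-music"),
    ("musicgen-medium", 1.5, "text-to-music"),
    ("musicgen-large", 3.3, "text-to-music"),
    ("musicgen-melody", 1.5, "melody"),
    ("musicgen-melody-large", 3.3, "melody"),
    ("musicgen-style", 1.5, "style"),
    ("audiogen-medium", 1.5, "text-to-sound"),
]


def pick_variant(task, max_params_b):
    """Return the largest variant for `task` with at most `max_params_b` billion parameters."""
    candidates = [(size, name) for name, size, t in VARIANTS
                  if t == task and size <= max_params_b]
    if not candidates:
        raise ValueError(f"no {task!r} variant fits within {max_params_b}B parameters")
    return "facebook/" + max(candidates)[1]


print(pick_variant("text-to-music", 2.0))  # facebook/musicgen-medium
```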

### Generation parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `duration` | 30.0 (MusicGen) | Length in seconds; clips longer than 30 s are produced by windowed extension |
| `top_k` | 250 | Top-k sampling |
| `top_p` | 0.0 | Nucleus sampling (0 = disabled) |
| `temperature` | 1.0 | Sampling temperature |
| `cfg_coef` | 3.0 | Classifier-free guidance |
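
To make the roles of `cfg_coef`, `temperature`, and `top_k` concrete, here is a pure-Python sketch of a single decoding step. It assumes the usual classifier-free guidance formulation, `uncond + cfg_coef * (cond - uncond)`, followed by temperature scaling and top-k sampling. AudioCraft's internals may differ in detail, and both function names are hypothetical.

```python
import math
import random


def guided_logits(cond, uncond, cfg_coef=3.0):
    """Push logits away from the unconditional prediction toward the text-conditioned one."""
    return [u + cfg_coef * (c - u) for c, u in zip(cond, uncond)]


def sample_top_k(logits, top_k=250, temperature=1.0, rng=random):
    """Sample a token index from the `top_k` highest logits after temperature scaling."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    k = min(top_k, len(logits))
    # Indices of the k largest logits
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    # Unnormalized softmax weights, shifted by the peak for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(top, weights=weights, k=1)[0]


logits = guided_logits(cond=[2.0, 0.0, 1.0], uncond=[1.0, 0.0, 1.0], cfg_coef=3.0)
print(logits)                         # [4.0, 0.0, 1.0]
print(sample_top_k(logits, top_k=1))  # 0
```

A higher `cfg_coef` widens the gap between tokens the text prompt favors and the rest, while a lower `temperature` or smaller `top_k` concentrates sampling on the most likely tokens.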

## MusicGen usage

### Text-to-music generation

```python
from audiocraft.models import MusicGen
import torchaudio

model = MusicGen.get_pretrained('facebook/musicgen-medium')

# Configure generation
model.set_generation_params(
    duration=30,      # Up to 30 seconds
    top_k=250,        # Sampling diversity
    top_p=0.0,        # 0 = use top_k only
    temperature=1.0,  # Creativity (higher = more varied)
    cfg_coef=3.0      # Text adherence (higher = stricter)
)

# Generate multiple samples
descriptions = [
    "epic orchestral soundtrack with strings and brass",
    "chill lo-fi hip hop beat with jazzy piano",
    "energetic rock song with electric guitar"
]

# Generate (returns [batch, channels, samples])
wav = model.generate(descriptions)

# Save each
for i, audio in enumerate(wav):
    torchaudio.save(f"music_{i}.wav", audio.cpu(), sample_rate=32000)
```
|
|
205
|
+
|
|
### Melody-conditioned generation

```python
from audiocraft.models import MusicGen
import torchaudio

# Load melody model
model = MusicGen.get_pretrained('facebook/musicgen-melody')
model.set_generation_params(duration=30)

# Load melody audio
melody, sr = torchaudio.load("melody.wav")

# Generate with melody conditioning
descriptions = ["acoustic guitar folk song"]
wav = model.generate_with_chroma(descriptions, melody, sr)

torchaudio.save("melody_conditioned.wav", wav[0].cpu(), sample_rate=32000)
```

### Stereo generation

```python
from audiocraft.models import MusicGen
import torchaudio

# Load stereo model
model = MusicGen.get_pretrained('facebook/musicgen-stereo-medium')
model.set_generation_params(duration=15)

descriptions = ["ambient electronic music with wide stereo panning"]
wav = model.generate(descriptions)

# wav shape: [batch, 2, samples] for stereo
print(f"Stereo shape: {wav.shape}")  # [1, 2, 480000]
torchaudio.save("stereo.wav", wav[0].cpu(), sample_rate=32000)
```

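The `[1, 2, 480000]` shape in the stereo example follows from duration times sample rate. The sketch below only does that arithmetic; `expected_shape` is a hypothetical helper, and the real tensor length may differ slightly if the model rounds to whole token frames.

```python
def expected_shape(batch, channels, duration_s, sample_rate=32000):
    """Return the nominal [batch, channels, samples] shape of a generated clip."""
    return (batch, channels, int(duration_s * sample_rate))


# One 15-second stereo clip at 32 kHz, as in the stereo example.
stereo = expected_shape(batch=1, channels=2, duration_s=15)

# Three 30-second mono clips at 32 kHz, as in the text-to-music example.
mono = expected_shape(batch=3, channels=1, duration_s=30)

print(stereo, mono)
```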
### Audio continuation

```python
import torchaudio
from transformers import AutoProcessor, MusicgenForConditionalGeneration

processor = AutoProcessor.from_pretrained("facebook/musicgen-medium")
model = MusicgenForConditionalGeneration.from_pretrained("facebook/musicgen-medium")

# Load audio to continue (squeeze() below assumes a mono file)
audio, sr = torchaudio.load("intro.wav")

# Process with text and audio
inputs = processor(
    audio=audio.squeeze().numpy(),
    sampling_rate=sr,
    text=["continue with an epic chorus"],
    padding=True,
    return_tensors="pt"
)

# Generate continuation
audio_values = model.generate(**inputs, do_sample=True, guidance_scale=3, max_new_tokens=512)
```

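The continuation example sets its length with `max_new_tokens` rather than seconds. As a rough sketch, assuming the model emits 50 token frames per second of audio (an assumption not stated in this document; check the frame rate in the model config), the conversion is simple arithmetic. The helper names are hypothetical.

```python
import math

FRAME_RATE_HZ = 50  # assumption: 50 token frames per second of audio


def tokens_for_seconds(seconds, frame_rate=FRAME_RATE_HZ):
    """Number of new tokens needed to cover the given duration."""
    return math.ceil(seconds * frame_rate)


def seconds_for_tokens(tokens, frame_rate=FRAME_RATE_HZ):
    """Approximate audio duration produced by the given number of tokens."""
    return tokens / frame_rate


print(seconds_for_tokens(512))   # duration implied by max_new_tokens=512
print(tokens_for_seconds(30))    # tokens needed for a 30-second clip
```

Under that assumption, `max_new_tokens=512` corresponds to roughly ten seconds of new audio.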
## MusicGen-Style usage

### Style-conditioned generation

```python
from audiocraft.models import MusicGen
import torchaudio

# Load style model
model = MusicGen.get_pretrained('facebook/musicgen-style')

# Configure generation with style
model.set_generation_params(
    duration=30,
    cfg_coef=3.0,
    cfg_coef_beta=5.0  # Style influence
)

# Configure style conditioner
model.set_style_conditioner_params(
    eval_q=3,           # RVQ quantizers (1-6)
    excerpt_length=3.0  # Style excerpt length
)

# Load style reference
style_audio, sr = torchaudio.load("reference_style.wav")

# Generate with text + style
# (the style model takes its reference audio through generate_with_chroma)
descriptions = ["upbeat dance track"]
wav = model.generate_with_chroma(descriptions, style_audio, sr)
```

### Style-only generation (no text)

```python
# Generate matching style without text prompt
# (reuses model, style_audio, and sr from the previous example)
model.set_generation_params(
    duration=30,
    cfg_coef=3.0,
    cfg_coef_beta=None  # Disable double CFG for style-only
)

wav = model.generate_with_chroma([None], style_audio, sr)
```

## AudioGen usage

### Sound effect generation

```python
from audiocraft.models import AudioGen
import torchaudio

model = AudioGen.get_pretrained('facebook/audiogen-medium')
model.set_generation_params(duration=10)

# Generate various sounds
descriptions = [
    "thunderstorm with heavy rain and lightning",
    "busy city traffic with car horns",
    "ocean waves crashing on rocks",
    "crackling campfire in forest"
]

wav = model.generate(descriptions)

for i, audio in enumerate(wav):
    torchaudio.save(f"sound_{i}.wav", audio.cpu(), sample_rate=16000)
```

## EnCodec usage

### Audio compression

```python
from audiocraft.models import CompressionModel
import torch
import torchaudio

# Load EnCodec
model = CompressionModel.get_pretrained('facebook/encodec_32khz')

# Load audio
wav, sr = torchaudio.load("audio.wav")

# Downmix to mono, since this EnCodec checkpoint is single-channel
wav = wav.mean(dim=0, keepdim=True)

# Ensure correct sample rate
if sr != 32000:
    resampler = torchaudio.transforms.Resample(sr, 32000)
    wav = resampler(wav)

# Encode to tokens
with torch.no_grad():
    encoded = model.encode(wav.unsqueeze(0))
    codes = encoded[0]  # Audio codes

# Decode back to audio
with torch.no_grad():
    decoded = model.decode(codes)

torchaudio.save("reconstructed.wav", decoded[0].cpu(), sample_rate=32000)
```

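To get a feel for how much the token representation compresses audio, the sketch below compares the bitrate of the discrete codes with 16-bit PCM. The codec settings used here (50 frames per second, 4 codebooks, 2048 entries per codebook) are assumptions for illustration and are not stated in this document; read the actual values from the loaded model before relying on them.

```python
import math


def token_bitrate_bps(frame_rate_hz, num_codebooks, codebook_size):
    """Bits per second needed to store the discrete codes."""
    return frame_rate_hz * num_codebooks * math.log2(codebook_size)


def pcm_bitrate_bps(sample_rate, bit_depth=16, channels=1):
    """Bits per second of uncompressed PCM audio."""
    return sample_rate * bit_depth * channels


# Assumed codec settings (hypothetical, see the note above).
tokens_bps = token_bitrate_bps(frame_rate_hz=50, num_codebooks=4, codebook_size=2048)
pcm_bps = pcm_bitrate_bps(sample_rate=32000)
ratio = pcm_bps / tokens_bps

print(f"codes: {tokens_bps:.0f} bit/s, PCM: {pcm_bps} bit/s, ratio: {ratio:.1f}x")
```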
## Common workflows

### Workflow 1: Music generation pipeline

```python
import torch
import torchaudio
from audiocraft.models import MusicGen

class MusicGenerator:
    def __init__(self, model_name="facebook/musicgen-medium"):
        self.model = MusicGen.get_pretrained(model_name)
        self.sample_rate = 32000

    def generate(self, prompt, duration=30, temperature=1.0, cfg=3.0):
        self.model.set_generation_params(
            duration=duration,
            top_k=250,
            temperature=temperature,
            cfg_coef=cfg
        )

        with torch.no_grad():
            wav = self.model.generate([prompt])

        return wav[0].cpu()

    def generate_batch(self, prompts, duration=30):
        self.model.set_generation_params(duration=duration)

        with torch.no_grad():
            wav = self.model.generate(prompts)

        return wav.cpu()

    def save(self, audio, path):
        torchaudio.save(path, audio, sample_rate=self.sample_rate)

# Usage
generator = MusicGenerator()
audio = generator.generate(
    "epic cinematic orchestral music",
    duration=30,
    temperature=1.0
)
generator.save(audio, "epic_music.wav")
```

### Workflow 2: Sound design batch processing

```python
from pathlib import Path
from audiocraft.models import AudioGen
import torchaudio

def batch_generate_sounds(sound_specs, output_dir):
    """
    Generate multiple sounds from specifications.

    Args:
        sound_specs: list of {"name": str, "description": str, "duration": float}
        output_dir: output directory path
    """
    model = AudioGen.get_pretrained('facebook/audiogen-medium')
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    results = []

    for spec in sound_specs:
        # Duration is optional in a spec and defaults to 5 seconds
        model.set_generation_params(duration=spec.get("duration", 5))

        wav = model.generate([spec["description"]])

        output_path = output_dir / f"{spec['name']}.wav"
        torchaudio.save(str(output_path), wav[0].cpu(), sample_rate=16000)

        results.append({
            "name": spec["name"],
            "path": str(output_path),
            "description": spec["description"]
        })

    return results

# Usage
sounds = [
    {"name": "explosion", "description": "massive explosion with debris", "duration": 3},
    {"name": "footsteps", "description": "footsteps on wooden floor", "duration": 5},
    {"name": "door", "description": "wooden door creaking and closing", "duration": 2}
]

results = batch_generate_sounds(sounds, "sound_effects/")
```

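Because each spec's `name` becomes a file name and its `duration` is passed straight to the model, it can help to validate specs before loading any weights. The following is a minimal sketch that assumes the specs arrive as a JSON array. `load_sound_specs`, its `max_duration` limit, and the name-sanitising rule are all hypothetical choices, not part of audiocraft.

```python
import json
import re


def load_sound_specs(json_text, default_duration=5.0, max_duration=30.0):
    """Parse and validate a JSON array of sound specs."""
    cleaned = []
    for spec in json.loads(json_text):
        missing = {"name", "description"} - set(spec)
        if missing:
            raise ValueError(f"spec {spec!r} is missing {sorted(missing)}")

        duration = float(spec.get("duration", default_duration))
        if not 0 < duration <= max_duration:
            raise ValueError(f"duration {duration} is outside (0, {max_duration}]")

        # Replace anything that is not a letter, digit, '_' or '-' so the
        # name is safe to use as a file name.
        safe_name = re.sub(r"[^A-Za-z0-9_-]+", "_", spec["name"]).strip("_")
        if not safe_name:
            raise ValueError(f"name {spec['name']!r} has no usable characters")

        cleaned.append({
            "name": safe_name,
            "description": spec["description"],
            "duration": duration,
        })
    return cleaned


specs = load_sound_specs(
    '[{"name": "door creak!", "description": "wooden door creaking"}]'
)
print(specs)
```

The returned list has the same shape as the `sound_specs` argument in the workflow above.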
### Workflow 3: Gradio demo

```python
import gradio as gr
import torch
import torchaudio
from audiocraft.models import MusicGen

model = MusicGen.get_pretrained('facebook/musicgen-small')

def generate_music(prompt, duration, temperature, cfg_coef):
    model.set_generation_params(
        duration=duration,
        temperature=temperature,
        cfg_coef=cfg_coef
    )

    with torch.no_grad():
        wav = model.generate([prompt])

    # Save to temp file
    path = "temp_output.wav"
    torchaudio.save(path, wav[0].cpu(), sample_rate=32000)
    return path

demo = gr.Interface(
    fn=generate_music,
    inputs=[
        gr.Textbox(label="Music Description", placeholder="upbeat electronic dance music"),
        gr.Slider(1, 30, value=8, label="Duration (seconds)"),
        gr.Slider(0.5, 2.0, value=1.0, label="Temperature"),
        gr.Slider(1.0, 10.0, value=3.0, label="CFG Coefficient")
    ],
    outputs=gr.Audio(label="Generated Music"),
    title="MusicGen Demo"
)

demo.launch()
```

## Performance optimization

### Memory optimization

```python
import torch
from audiocraft.models import MusicGen

# Use smaller model
model = MusicGen.get_pretrained('facebook/musicgen-small')

# Clear cache between generations
torch.cuda.empty_cache()

# Generate shorter durations
model.set_generation_params(duration=10)  # Instead of 30

# Use half precision: the MusicGen wrapper is not an nn.Module,
# so cast its language model submodule instead of calling model.half()
model.lm.half()
```

### Batch processing efficiency

```python
# Process multiple prompts at once (more efficient)
descriptions = ["prompt1", "prompt2", "prompt3", "prompt4"]
wav = model.generate(descriptions)  # Single batch

# Instead of
for desc in descriptions:
    wav = model.generate([desc])  # Multiple batches (slower)
```

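A single batch is faster, but a very large batch can run out of GPU memory. A common middle ground is to split the prompt list into fixed-size chunks and generate one chunk at a time. The sketch below shows only the chunking step; `chunked` is a hypothetical helper and the batch size of 3 is arbitrary.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


prompts = [f"prompt{i}" for i in range(1, 8)]
batches = list(chunked(prompts, batch_size=3))
print(batches)

# Each chunk would then go to the model in one call, for example:
# for batch in chunked(prompts, batch_size=3):
#     wav = model.generate(batch)
```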
### GPU memory requirements

| Model | FP32 VRAM | FP16 VRAM |
|-------|-----------|-----------|
| musicgen-small | ~4GB | ~2GB |
| musicgen-medium | ~8GB | ~4GB |
| musicgen-large | ~16GB | ~8GB |

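For a sanity check on these figures, the weights alone take roughly parameter count times bytes per parameter. The sketch below computes that lower bound from the sizes in the model variants table. It ignores activations, the audio codec, the text encoder, and framework overhead, which is why the VRAM figures above are larger. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory needed for the weights alone, in gigabytes (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts from the model variants table.
PARAMS = {
    "musicgen-small": 300e6,
    "musicgen-medium": 1.5e9,
    "musicgen-large": 3.3e9,
}

for name, n in PARAMS.items():
    fp32 = weight_memory_gb(n, 4)  # 4 bytes per FP32 parameter
    fp16 = weight_memory_gb(n, 2)  # 2 bytes per FP16 parameter
    print(f"{name}: {fp32:.1f} GB (FP32), {fp16:.1f} GB (FP16)")
```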
## Common issues

| Issue | Solution |
|-------|----------|
| CUDA OOM | Use a smaller model or reduce `duration` |
| Poor quality | Increase `cfg_coef`; write more specific prompts |
| Generation too short | Check the `duration` passed to `set_generation_params` |
| Audio artifacts | Try a different `temperature` |
| Stereo not working | Use a `musicgen-stereo-*` model variant |

## References

- **[Advanced Usage](references/advanced-usage.md)** - Training, fine-tuning, deployment
- **[Troubleshooting](references/troubleshooting.md)** - Common issues and solutions

## Resources

- **GitHub**: https://github.com/facebookresearch/audiocraft
- **Paper (MusicGen)**: https://arxiv.org/abs/2306.05284
- **Paper (AudioGen)**: https://arxiv.org/abs/2209.15352
- **HuggingFace**: https://huggingface.co/facebook/musicgen-small
- **Demo**: https://huggingface.co/spaces/facebook/MusicGen