@synsci/cli-darwin-x64 1.1.49
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
@@ -0,0 +1,566 @@

# DSPy Optimizers (Teleprompters)

Complete guide to DSPy's optimization algorithms for improving prompts and model weights.

## What are Optimizers?

DSPy optimizers (called "teleprompters") automatically improve your modules by:

- **Synthesizing few-shot examples** from training data
- **Proposing better instructions** through search
- **Fine-tuning model weights** (optional)

**Key idea**: Instead of manually tuning prompts, define a metric and let DSPy optimize.

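Every optimizer in this guide builds on the same contract: a plain Python metric with the signature `(example, pred, trace=None)` that returns a bool or a score. It can be exercised without any LM calls; the `SimpleNamespace` objects below are illustrative stand-ins for a `dspy.Example` and a module's prediction:

```python
from types import SimpleNamespace

def exact_match(example, pred, trace=None):
    """Metric: True when the predicted answer matches the gold answer."""
    return example.answer.strip().lower() == pred.answer.strip().lower()

# Stand-ins for a dspy.Example and a module prediction (illustrative only)
gold = SimpleNamespace(question="What is 2+2?", answer="4")
print(exact_match(gold, SimpleNamespace(answer="4")))     # True
print(exact_match(gold, SimpleNamespace(answer="five")))  # False
```
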
## Optimizer Selection Guide

| Optimizer | Best For | Speed | Quality | Data Needed |
|-----------|----------|-------|---------|-------------|
| BootstrapFewShot | General purpose | Fast | Good | 10-50 examples |
| MIPRO | Instruction tuning | Medium | Excellent | 50-200 examples |
| BootstrapFinetune | Fine-tuning | Slow | Excellent | 100+ examples |
| COPRO | Prompt optimization | Medium | Good | 20-100 examples |
| KNNFewShot | Quick baseline | Very fast | Fair | 10+ examples |

## Core Optimizers

### BootstrapFewShot

**Most popular optimizer** - Generates few-shot demonstrations from training data.

**How it works:**
1. Takes your training examples
2. Uses your module to generate predictions
3. Selects high-quality predictions (based on metric)
4. Uses these as few-shot examples in future prompts

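The four steps above can be sketched in plain Python. This is a simplified illustration, not the real implementation (which also records intermediate traces); `predict` and `metric` are stand-ins for your module and your metric:

```python
def bootstrap_demos(trainset, predict, metric, max_demos=4):
    """Keep module predictions that pass the metric as few-shot demos."""
    demos = []
    for example in trainset:
        pred = predict(example)        # step 2: run the module
        if metric(example, pred):      # step 3: keep only good predictions
            demos.append({**example, "answer": pred["answer"]})
        if len(demos) >= max_demos:
            break
    return demos                       # step 4: used as few-shot examples

# Illustrative stand-ins: an echoing "module" and an exact-match metric
trainset = [{"question": "What is 2+2?", "answer": "4"},
            {"question": "What is 3+5?", "answer": "8"}]
predict = lambda ex: {"answer": ex["answer"]}
metric = lambda ex, pred: ex["answer"] == pred["answer"]

print(bootstrap_demos(trainset, predict, metric))
```
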
**Parameters:**
- `metric`: Function that scores predictions (required)
- `max_bootstrapped_demos`: Max demonstrations to generate (default: 4)
- `max_labeled_demos`: Max labeled examples to use (default: 16)
- `max_rounds`: Optimization iterations (default: 1)
- `metric_threshold`: Minimum score to accept (optional)

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Define metric
def validate_answer(example, pred, trace=None):
    """Return True if prediction matches gold answer."""
    return example.answer.lower() == pred.answer.lower()

# Training data
trainset = [
    dspy.Example(question="What is 2+2?", answer="4").with_inputs("question"),
    dspy.Example(question="What is 3+5?", answer="8").with_inputs("question"),
    dspy.Example(question="What is 10-3?", answer="7").with_inputs("question"),
]

# Create module
qa = dspy.ChainOfThought("question -> answer")

# Optimize
optimizer = BootstrapFewShot(
    metric=validate_answer,
    max_bootstrapped_demos=3,
    max_rounds=2
)

optimized_qa = optimizer.compile(qa, trainset=trainset)

# Now optimized_qa has learned few-shot examples!
result = optimized_qa(question="What is 5+7?")
```

**Best practices:**
- Start with 10-50 training examples
- Use diverse examples covering edge cases
- Set `max_bootstrapped_demos=3-5` for most tasks
- Increase `max_rounds=2-3` for better quality

**When to use:**
- First optimizer to try
- You have 10+ labeled examples
- Want quick improvements
- General-purpose tasks

### MIPRO (Multi-prompt Instruction Proposal Optimizer)

**State-of-the-art optimizer** - Iteratively searches for better instructions.

**How it works:**
1. Generates candidate instructions
2. Tests each on validation set
3. Selects best-performing instructions
4. Iterates to refine further

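Steps 1-3 amount to scoring candidate instructions on a validation set and keeping the winner. A minimal sketch of that selection loop, with `run_with` standing in for executing your module under a given instruction (names here are illustrative, not DSPy API):

```python
def select_best_instruction(candidates, valset, run_with, metric):
    """Score each candidate instruction on the validation set; keep the best."""
    def avg_score(instruction):
        return sum(metric(ex, run_with(instruction, ex)) for ex in valset) / len(valset)
    return max(candidates, key=avg_score)

# Illustrative stand-ins: instruction "quality" decides whether the module is right
valset = [{"question": "2+2?", "answer": "4"}, {"question": "3+5?", "answer": "8"}]
quality = {"Answer.": 0, "Answer with arithmetic.": 2}
run_with = lambda instr, ex: {"answer": ex["answer"] if quality[instr] else "?"}
metric = lambda ex, pred: float(ex["answer"] == pred["answer"])

print(select_best_instruction(list(quality), valset, run_with, metric))
```
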
**Parameters:**
- `metric`: Evaluation metric (required)
- `num_candidates`: Instructions to try per iteration (default: 10)
- `init_temperature`: Sampling temperature (default: 1.0)
- `verbose`: Show progress (default: False)

```python
from dspy.teleprompt import MIPRO

# Define metric with more nuance
def answer_quality(example, pred, trace=None):
    """Score answer quality 0-1."""
    if example.answer.lower() in pred.answer.lower():
        return 1.0
    # Partial credit for similar answers
    return 0.5 if len(set(example.answer.split()) & set(pred.answer.split())) > 0 else 0.0

# Larger training set (MIPRO benefits from more data)
trainset = [...]  # 50-200 examples
valset = [...]  # 20-50 examples

# Create module
qa = dspy.ChainOfThought("question -> answer")

# Optimize with MIPRO
optimizer = MIPRO(
    metric=answer_quality,
    num_candidates=10,
    init_temperature=1.0,
    verbose=True
)

optimized_qa = optimizer.compile(
    student=qa,
    trainset=trainset,
    valset=valset,  # MIPRO uses a separate validation set
    num_trials=100  # More trials = better quality
)
```

**Best practices:**
- Use 50-200 training examples
- Keep a separate validation set (20-50 examples)
- Run 100-200 trials for best results
- Takes 10-30 minutes typically

**When to use:**
- You have 50+ labeled examples
- Want state-of-the-art performance
- Willing to wait for optimization
- Complex reasoning tasks

### BootstrapFinetune

**Fine-tune model weights** - Creates training dataset for fine-tuning.

**How it works:**
1. Generates synthetic training data
2. Exports data in fine-tuning format
3. You fine-tune model separately
4. Load fine-tuned model back

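Step 2's "fine-tuning format" is provider-specific; for OpenAI-style chat fine-tuning it is JSONL with a `messages` list per record. A hedged sketch of that conversion (the exact file DSPy writes may differ):

```python
import json

def to_chat_jsonl(demos, path):
    """Write (question, answer) demos as OpenAI-style chat fine-tuning JSONL."""
    with open(path, "w") as f:
        for demo in demos:
            record = {"messages": [
                {"role": "user", "content": demo["question"]},
                {"role": "assistant", "content": demo["answer"]},
            ]}
            f.write(json.dumps(record) + "\n")

to_chat_jsonl([{"question": "What is 2+2?", "answer": "4"}], "finetune.jsonl")
```
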
**Parameters:**
- `metric`: Evaluation metric (required)
- `max_bootstrapped_demos`: Demonstrations to generate (default: 4)
- `max_rounds`: Data generation rounds (default: 1)

```python
from dspy.teleprompt import BootstrapFinetune

# Training data
trainset = [...]  # 100+ examples recommended

# Define metric
def validate(example, pred, trace=None):
    return example.answer == pred.answer

# Create module
qa = dspy.ChainOfThought("question -> answer")

# Generate fine-tuning data
optimizer = BootstrapFinetune(metric=validate)
optimized_qa = optimizer.compile(qa, trainset=trainset)

# Exports training data to file
# You then fine-tune using your LM provider's API

# After fine-tuning, load your model:
finetuned_lm = dspy.OpenAI(model="ft:gpt-3.5-turbo:your-model-id")
dspy.settings.configure(lm=finetuned_lm)
```

**Best practices:**
- Use 100+ training examples
- Validate on a held-out test set
- Monitor for overfitting
- Compare with prompt-based methods first

**When to use:**
- You have 100+ examples
- Latency is critical (fine-tuned models are faster)
- Task is narrow and well-defined
- Prompt optimization isn't enough

### COPRO (Coordinate Prompt Optimization)

**Optimize prompts via gradient-free search.**

**How it works:**
1. Generates prompt variants
2. Evaluates each variant
3. Selects best prompts
4. Iterates to refine

```python
from dspy.teleprompt import COPRO

# Training data
trainset = [...]

# Define metric
def metric(example, pred, trace=None):
    return example.answer == pred.answer

# Create module
qa = dspy.ChainOfThought("question -> answer")

# Optimize with COPRO
optimizer = COPRO(
    metric=metric,
    breadth=10,  # Candidates per iteration
    depth=3      # Optimization rounds
)

optimized_qa = optimizer.compile(qa, trainset=trainset)
```

**When to use:**
- Want prompt optimization
- Have 20-100 examples
- MIPRO too slow

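The generate/evaluate/select loop above can be sketched in plain Python. This is only an illustration of the search pattern, not COPRO's actual implementation: `propose_variants` stands in for LM-generated instruction rewrites, and the toy `score` function stands in for your metric evaluated over a trainset.

```python
import random

def propose_variants(instruction, n, rng):
    """Stand-in for LM-proposed rewrites: append a random emphasis phrase."""
    suffixes = ["Think step by step.", "Be concise.", "Cite evidence.", "Answer directly."]
    return [f"{instruction} {rng.choice(suffixes)}" for _ in range(n)]

def score(instruction):
    """Toy metric: reward more specific (more distinct words) instructions."""
    return len(set(instruction.split()))

def coordinate_search(seed_instruction, breadth=10, depth=3, seed=0):
    rng = random.Random(seed)
    best, best_score = seed_instruction, score(seed_instruction)
    for _ in range(depth):                                  # optimization rounds
        candidates = propose_variants(best, breadth, rng)   # generate variants
        for cand in candidates:                             # evaluate each variant
            s = score(cand)
            if s > best_score:                              # keep the best prompt
                best, best_score = cand, s
    return best, best_score

best, s = coordinate_search("Answer the question.")
```

The real optimizer spends one metric evaluation per candidate per round, which is why `breadth * depth` directly controls cost.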
### KNNFewShot

**Simple k-nearest neighbors** - Selects similar examples for each query.

**How it works:**
1. Embeds all training examples
2. For each query, finds k most similar examples
3. Uses these as few-shot demonstrations

```python
from dspy.teleprompt import KNNFewShot

trainset = [...]

# No metric needed - just selects similar examples
optimizer = KNNFewShot(k=3)
optimized_qa = optimizer.compile(qa, trainset=trainset)

# For each query, uses 3 most similar examples from trainset
```

**When to use:**
- Quick baseline
- Have diverse training examples
- Similarity is good proxy for helpfulness

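The embed-and-retrieve step can be illustrated without a real embedding model. In this sketch, a bag-of-words cosine similarity stands in for sentence embeddings; it only demonstrates the selection logic, not KNNFewShot's internals.

```python
import math
from collections import Counter

def bow_vector(text):
    """Bag-of-words counts as a stand-in for an embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_demos(query, trainset, k=3):
    """Return the k training items most similar to the query."""
    qv = bow_vector(query)
    ranked = sorted(trainset, key=lambda ex: cosine(qv, bow_vector(ex)), reverse=True)
    return ranked[:k]

trainset = [
    "What is the capital of France?",
    "Who wrote Hamlet?",
    "What is the capital of Japan?",
    "How tall is Mount Everest?",
]
demos = knn_demos("What is the capital of Italy?", trainset, k=2)
# The two capital-city questions rank highest and become the demonstrations.
```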
## Writing Metrics

Metrics are functions that score predictions. They're critical for optimization.

### Binary Metrics

```python
def exact_match(example, pred, trace=None):
    """Return True if prediction exactly matches gold."""
    return example.answer == pred.answer

def contains_answer(example, pred, trace=None):
    """Return True if prediction contains gold answer."""
    return example.answer.lower() in pred.answer.lower()
```

### Continuous Metrics

```python
def f1_score(example, pred, trace=None):
    """F1 score between prediction and gold."""
    pred_tokens = set(pred.answer.lower().split())
    gold_tokens = set(example.answer.lower().split())

    if not pred_tokens:
        return 0.0

    precision = len(pred_tokens & gold_tokens) / len(pred_tokens)
    recall = len(pred_tokens & gold_tokens) / len(gold_tokens)

    if precision + recall == 0:
        return 0.0

    return 2 * (precision * recall) / (precision + recall)

def semantic_similarity(example, pred, trace=None):
    """Embedding similarity between prediction and gold."""
    from sentence_transformers import SentenceTransformer, util
    # In practice, load the model once at module level, not per call
    model = SentenceTransformer('all-MiniLM-L6-v2')

    emb1 = model.encode(example.answer)
    emb2 = model.encode(pred.answer)

    return util.cos_sim(emb1, emb2).item()
```

### Multi-Factor Metrics

```python
def comprehensive_metric(example, pred, trace=None):
    """Combine multiple factors."""
    score = 0.0

    # Correctness (50%)
    if example.answer.lower() in pred.answer.lower():
        score += 0.5

    # Conciseness (25%)
    if len(pred.answer.split()) <= 20:
        score += 0.25

    # Citation (25%)
    if "source:" in pred.answer.lower():
        score += 0.25

    return score
```

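Hard-coding the weights inline works, but a small factory keeps the weighting explicit and reusable across tasks. A sketch (the factor functions and weights here are illustrative, not part of DSPy):

```python
from types import SimpleNamespace

def make_weighted_metric(factors):
    """factors: list of (weight, check) pairs; check(example, pred) -> bool."""
    def metric(example, pred, trace=None):
        # Sum the weights of the checks that pass
        return sum(w for w, check in factors if check(example, pred))
    return metric

# Illustrative checks; adapt the conditions and weights to your task.
correct = lambda ex, pred: ex.answer.lower() in pred.answer.lower()
concise = lambda ex, pred: len(pred.answer.split()) <= 20
cited = lambda ex, pred: "source:" in pred.answer.lower()

metric = make_weighted_metric([(0.5, correct), (0.25, concise), (0.25, cited)])

ex = SimpleNamespace(answer="Paris")
good = SimpleNamespace(answer="Paris (source: atlas)")
bad = SimpleNamespace(answer="London")
score = metric(ex, good)  # all three checks pass
```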
### Using Trace for Debugging

```python
def metric_with_trace(example, pred, trace=None):
    """Metric that uses trace for debugging."""
    is_correct = example.answer == pred.answer

    if trace is not None and not is_correct:
        # Log failures for analysis
        print(f"Failed on: {example.question}")
        print(f"Expected: {example.answer}")
        print(f"Got: {pred.answer}")

    return is_correct
```

## Evaluation Best Practices

### Train/Val/Test Split

```python
# Split data
trainset = data[:100]   # 70%
valset = data[100:120]  # 15%
testset = data[120:]    # 15%

# Optimize on train
optimized = optimizer.compile(module, trainset=trainset)

# Validate during optimization (for MIPRO)
optimized = optimizer.compile(module, trainset=trainset, valset=valset)

# Evaluate on test
from dspy.evaluate import Evaluate
evaluator = Evaluate(devset=testset, metric=metric)
score = evaluator(optimized)
```

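The slice-based split above assumes the data is already in random order. If it might be sorted (by topic, source, or difficulty), shuffle with a fixed seed first so the splits are representative and reproducible. A small helper using the same 70/15/15 ratios:

```python
import random

def split_data(data, train_frac=0.70, val_frac=0.15, seed=42):
    """Shuffle deterministically, then cut into train/val/test."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    trainset = shuffled[:n_train]
    valset = shuffled[n_train:n_train + n_val]
    testset = shuffled[n_train + n_val:]   # remainder goes to test
    return trainset, valset, testset

trainset, valset, testset = split_data(range(140))
```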
### Cross-Validation

```python
from sklearn.model_selection import KFold

kfold = KFold(n_splits=5)
scores = []

for train_idx, val_idx in kfold.split(data):
    trainset = [data[i] for i in train_idx]
    valset = [data[i] for i in val_idx]

    optimized = optimizer.compile(module, trainset=trainset)
    score = evaluator(optimized, devset=valset)
    scores.append(score)

print(f"Average score: {sum(scores) / len(scores):.2f}")
```

### Comparing Optimizers

```python
results = {}

for opt_name, optimizer in [
    ("baseline", None),
    ("fewshot", BootstrapFewShot(metric=metric)),
    ("mipro", MIPRO(metric=metric)),
]:
    if optimizer is None:
        module_opt = module
    else:
        module_opt = optimizer.compile(module, trainset=trainset)

    score = evaluator(module_opt, devset=testset)
    results[opt_name] = score

print(results)
# {'baseline': 0.65, 'fewshot': 0.78, 'mipro': 0.85}
```

## Advanced Patterns

### Custom Optimizer

```python
from dspy.teleprompt import Teleprompter

class CustomOptimizer(Teleprompter):
    def __init__(self, metric):
        self.metric = metric

    def compile(self, student, trainset, **kwargs):
        # Your optimization logic here
        # Return optimized student module
        return student
```

### Multi-Stage Optimization

```python
# Stage 1: Bootstrap few-shot
stage1 = BootstrapFewShot(metric=metric, max_bootstrapped_demos=3)
optimized1 = stage1.compile(module, trainset=trainset)

# Stage 2: Instruction tuning
stage2 = MIPRO(metric=metric, num_candidates=10)
optimized2 = stage2.compile(optimized1, trainset=trainset, valset=valset)

# Final optimized module
final_module = optimized2
```

### Ensemble Optimization

```python
class EnsembleModule(dspy.Module):
    def __init__(self, modules):
        super().__init__()
        self.modules = modules

    def forward(self, question):
        predictions = [m(question=question).answer for m in self.modules]
        # Vote or average
        return dspy.Prediction(answer=max(set(predictions), key=predictions.count))

# Optimize multiple modules
opt1 = BootstrapFewShot(metric=metric).compile(module, trainset=trainset)
opt2 = MIPRO(metric=metric).compile(module, trainset=trainset)
opt3 = COPRO(metric=metric).compile(module, trainset=trainset)

# Ensemble
ensemble = EnsembleModule([opt1, opt2, opt3])
```

## Optimization Workflow

### 1. Start with Baseline

```python
# No optimization
baseline = dspy.ChainOfThought("question -> answer")
baseline_score = evaluator(baseline, devset=testset)
print(f"Baseline: {baseline_score}")
```

### 2. Try BootstrapFewShot

```python
# Quick optimization
fewshot = BootstrapFewShot(metric=metric, max_bootstrapped_demos=3)
optimized = fewshot.compile(baseline, trainset=trainset)
fewshot_score = evaluator(optimized, devset=testset)
print(f"Few-shot: {fewshot_score} (+{fewshot_score - baseline_score:.2f})")
```

### 3. If More Data Available, Try MIPRO

```python
# State-of-the-art optimization
mipro = MIPRO(metric=metric, num_candidates=10)
optimized_mipro = mipro.compile(baseline, trainset=trainset, valset=valset)
mipro_score = evaluator(optimized_mipro, devset=testset)
print(f"MIPRO: {mipro_score} (+{mipro_score - baseline_score:.2f})")
```

### 4. Save Best Model

```python
if mipro_score > fewshot_score:
    optimized_mipro.save("models/best_model.json")
else:
    optimized.save("models/best_model.json")
```

## Common Pitfalls

### 1. Overfitting to Training Data

```python
# ❌ Bad: Too many demos
optimizer = BootstrapFewShot(max_bootstrapped_demos=20)  # Overfits!

# ✅ Good: Moderate demos (3-5 works well)
optimizer = BootstrapFewShot(max_bootstrapped_demos=4)
```

### 2. Metric Doesn't Match Task

```python
# ❌ Bad: Binary metric for nuanced task
def bad_metric(example, pred, trace=None):
    return example.answer == pred.answer  # Too strict!

# ✅ Good: Graded metric
def good_metric(example, pred, trace=None):
    return f1_score(example, pred)  # Allows partial credit
```

### 3. Insufficient Training Data

```python
# ❌ Bad: Too little data
trainset = data[:5]  # Not enough!

# ✅ Good: Sufficient data
trainset = data[:50]  # Better
```

### 4. No Validation Set

```python
# ❌ Bad: Optimizing on test set
optimizer.compile(module, trainset=testset)  # Cheating!

# ✅ Good: Proper splits
optimizer.compile(module, trainset=trainset, valset=valset)
evaluator(optimized, devset=testset)
```

## Performance Tips

1. **Start simple**: BootstrapFewShot first
2. **Use representative data**: Cover edge cases
3. **Monitor overfitting**: Validate on held-out set
4. **Iterate metrics**: Refine based on failures
5. **Save checkpoints**: Don't lose progress
6. **Compare to baseline**: Measure improvement
7. **Test multiple optimizers**: Find best fit

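Tip 5 in practice: record the best score seen so far and persist every improvement, so a crashed run never loses its best result. A generic sketch using a JSON scoreboard; with DSPy modules you would additionally call `module.save(path)` (as in the workflow's step 4) whenever a new best is recorded:

```python
import json
import os
import tempfile

def checkpoint_best(name, score, state_path):
    """Persist (name, score) only when it beats the stored best."""
    best = {"name": None, "score": float("-inf")}
    if os.path.exists(state_path):
        with open(state_path) as f:
            best = json.load(f)
    if score > best["score"]:
        with open(state_path, "w") as f:
            json.dump({"name": name, "score": score}, f)
        return True   # new best saved
    return False      # worse than stored best; skipped

state = os.path.join(tempfile.mkdtemp(), "best.json")
checkpoint_best("fewshot", 0.78, state)            # first result: saved
improved = checkpoint_best("mipro", 0.85, state)   # better: saved
worse = checkpoint_best("copro", 0.80, state)      # worse: skipped
```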
## Resources

- **Paper**: "DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines"
- **GitHub**: https://github.com/stanfordnlp/dspy
- **Discord**: https://discord.gg/XCGy2WDCQB