@synsci/cli-darwin-x64 1.1.49
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
package/bin/skills/blip-2/references/troubleshooting.md
@@ -0,0 +1,526 @@
# BLIP-2 Troubleshooting Guide

## Installation Issues

### Import errors

**Error**: `ModuleNotFoundError: No module named 'transformers'`

**Solutions**:
```bash
# Install transformers with vision support (quoted so the brackets survive zsh)
pip install "transformers[vision]" accelerate

# Or install all optional dependencies
pip install transformers accelerate torch Pillow scipy

# Verify installation
python -c "from transformers import Blip2ForConditionalGeneration; print('OK')"
```

### LAVIS installation fails

**Error**: Build or dependency errors when installing `salesforce-lavis`

**Solutions**:
```bash
# Install from source
git clone https://github.com/salesforce/LAVIS.git
cd LAVIS
pip install -e .

# Or pin a specific version
pip install salesforce-lavis==1.0.2

# Install dependencies separately if issues persist
pip install omegaconf iopath timm webdataset
pip install salesforce-lavis --no-deps
```

### CUDA version mismatch

**Error**: `RuntimeError: CUDA error: no kernel image is available`

**Solutions**:
```bash
# Check CUDA version
nvcc --version
python -c "import torch; print(torch.version.cuda)"

# Install a PyTorch build matching your CUDA (example: CUDA 12.1)
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121

# For CUDA 11.8
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
```

## Model Loading Issues

### Out of memory during load

**Error**: `torch.cuda.OutOfMemoryError` during model loading

**Solutions**:
```python
import torch
from transformers import Blip2ForConditionalGeneration, BitsAndBytesConfig

# Use 8-bit quantization
quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b",
    quantization_config=quantization_config,
    device_map="auto"
)

# Or 4-bit quantization
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16
)

# Use a smaller model
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b",  # instead of 6.7b or flan-t5-xxl
    torch_dtype=torch.float16,
    device_map="auto"
)

# Offload weights that do not fit on the GPU
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-6.7b",
    device_map="auto",
    offload_folder="offload"
)
```
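Before trying a load, a rough weights-only estimate helps pick a model/precision combination that fits. A minimal sketch, pure arithmetic: the ~3.7B total parameter count for blip2-opt-2.7b (ViT + Q-Former + OPT) and the 20% overhead margin are approximations, and activations/KV cache add more at runtime:

```python
def estimate_weight_memory_gb(num_params: float, bits_per_param: int,
                              overhead: float = 0.2) -> float:
    """Rough GB needed to hold model weights at a given precision.

    bits_per_param: 32 (fp32), 16 (fp16/bf16), 8 (int8), 4 (nf4/int4).
    overhead: extra fraction for buffers, layers kept in higher precision, etc.
    """
    bytes_total = num_params * bits_per_param / 8
    return bytes_total * (1 + overhead) / 1e9

# Approximate totals for blip2-opt-2.7b (~3.7B parameters overall)
for bits in (16, 8, 4):
    print(f"{bits}-bit: ~{estimate_weight_memory_gb(3.7e9, bits):.1f} GB")
```

If the 16-bit estimate exceeds your VRAM, start directly with 8-bit or 4-bit loading rather than waiting for the OOM.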

### Model download fails

**Error**: Connection errors or incomplete downloads

**Solutions**:
```python
# Set cache directory (before importing transformers)
import os
os.environ["HF_HOME"] = "/path/to/cache"

# Re-run the download; recent huggingface_hub versions resume
# interrupted downloads automatically (resume_download is deprecated)
from huggingface_hub import snapshot_download
snapshot_download("Salesforce/blip2-opt-2.7b")

# Use local files only after the download completes
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b",
    local_files_only=True
)
```
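For flaky connections, a generic retry-with-backoff wrapper around whatever download call you use also helps. A minimal sketch; the function name and delay schedule are illustrative, not part of any library API:

```python
import time

def retry_with_backoff(fn, retries=4, base_delay=1.0, exceptions=(Exception,)):
    """Call fn(), retrying with exponentially growing delays on failure."""
    for attempt in range(retries):
        try:
            return fn()
        except exceptions:
            if attempt == retries - 1:
                raise  # out of attempts: propagate the last error
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...

# Usage sketch:
# retry_with_backoff(lambda: snapshot_download("Salesforce/blip2-opt-2.7b"))
```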

### Weight loading errors

**Error**: `RuntimeError: Error(s) in loading state_dict`

**Solutions**:
```python
# Ignore mismatched weights (use with care: mismatched layers are re-initialized)
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b",
    ignore_mismatched_sizes=True
)

# Check that the model architecture matches the checkpoint
from transformers import AutoConfig
config = AutoConfig.from_pretrained("Salesforce/blip2-opt-2.7b")
print(config.text_config.model_type)  # should be 'opt'
```

## Inference Issues

### Image format errors

**Error**: `ValueError: Unable to create tensor`

**Solutions**:
```python
from io import BytesIO

import requests
from PIL import Image

# Ensure RGB format
image = Image.open("image.jpg").convert("RGB")

# Handle different formats explicitly
def load_image(path):
    image = Image.open(path)

    # Composite RGBA onto a white background
    if image.mode == "RGBA":
        background = Image.new("RGB", image.size, (255, 255, 255))
        background.paste(image, mask=image.split()[3])
        image = background
    elif image.mode != "RGB":
        image = image.convert("RGB")

    return image

# Handle URL images
def load_image_from_url(url):
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    image = Image.open(BytesIO(response.content))
    return image.convert("RGB")
```

### Empty or nonsensical output

**Problem**: Model returns an empty string or gibberish

**Solutions**:
```python
# Check input preprocessing
inputs = processor(images=image, return_tensors="pt")
print(f"Pixel values shape: {inputs['pixel_values'].shape}")
# Should be [1, 3, 224, 224] for a single image

# Ensure the inputs match the model's device and dtype
inputs = inputs.to("cuda", torch.float16)

# Use stronger generation parameters
generated_ids = model.generate(
    **inputs,
    max_new_tokens=100,
    min_length=10,
    num_beams=5,
    do_sample=False  # deterministic for debugging
)

# Inspect the raw generated token IDs
print(f"Generated IDs: {generated_ids}")
```

### Slow generation

**Problem**: Generation takes too long

**Solutions**:
```python
# Reduce max_new_tokens
generated_ids = model.generate(**inputs, max_new_tokens=30)

# Use greedy decoding (faster than beam search)
generated_ids = model.generate(
    **inputs,
    max_new_tokens=50,
    num_beams=1,
    do_sample=False
)

# Compile the model (PyTorch 2.0+)
model = torch.compile(model)

# Use FlashAttention 2 (requires the flash-attn package)
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b",
    torch_dtype=torch.float16,
    attn_implementation="flash_attention_2",
    device_map="auto"
)
```
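When comparing these options, it helps to measure throughput rather than guess. A minimal tokens-per-second helper; pure Python, with the `generate_fn` callable standing in for your own `model.generate` call:

```python
import time

def tokens_per_second(generate_fn):
    """Run generate_fn() -> number of new tokens; return (tokens, tok/s)."""
    start = time.perf_counter()
    n_tokens = generate_fn()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard tiny timings
    return n_tokens, n_tokens / elapsed

# Usage sketch (assumes model/inputs from the snippets above; for
# image-only captioning there may be no "input_ids" key to subtract):
# n, tps = tokens_per_second(
#     lambda: model.generate(**inputs, max_new_tokens=50).shape[1]
# )
```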
|
|
230
|
+
|
|
231
|
+
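Before tuning any of these knobs it helps to measure. Below is a minimal stdlib timing harness; `time_generation` is an illustrative helper, not a transformers API — pass it a closure around your `model.generate(...)` call:

```python
import time

def time_generation(generate_fn, n_runs=3):
    """Average wall-clock seconds per call, with one warm-up run first."""
    generate_fn()  # warm-up: CUDA kernel launch, torch.compile, caches
    start = time.perf_counter()
    for _ in range(n_runs):
        generate_fn()
    return (time.perf_counter() - start) / n_runs

# Usage sketch:
# avg = time_generation(lambda: model.generate(**inputs, max_new_tokens=50))
```

Comparing the averaged number before and after each change (greedy vs. beam, compiled vs. eager) tells you which one actually paid off.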
### Batch processing errors

**Error**: Dimension mismatch in batch processing

**Solutions**:
```python
# Ensure consistent image sizes with padding
inputs = processor(
    images=images,
    return_tensors="pt",
    padding=True
)

# Handle variable-size images
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

# Ensure all images are the same size before processing
images = [transform(img) for img in images]

# For text inputs, use padding
inputs = processor(
    images=images,
    text=questions,
    return_tensors="pt",
    padding="max_length",
    max_length=32,
    truncation=True
)
```

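When a dataset is too large to pad into a single call, splitting it into fixed-size batches keeps shapes consistent within each batch. A minimal sketch — the `chunked` helper and the filenames are illustrative, not part of the processor API:

```python
from typing import Iterable, List

def chunked(items: List, size: int) -> Iterable[List]:
    """Yield successive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Each batch is padded independently by the processor,
# so shapes only need to agree within a batch.
batches = list(chunked(["a.jpg", "b.jpg", "c.jpg", "d.jpg", "e.jpg"], 2))
```

You would then run the processor and `model.generate` once per batch instead of once over the whole list.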
## Memory Issues

### CUDA out of memory

**Error**: `torch.cuda.OutOfMemoryError: CUDA out of memory`

**Solutions**:
```python
# Clear cache before inference
torch.cuda.empty_cache()

# Use a smaller batch size
batch_size = 1  # Start with 1

# Process sequentially
results = []
for image in images:
    inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
    generated_ids = model.generate(**inputs, max_new_tokens=50)
    results.append(processor.decode(generated_ids[0], skip_special_tokens=True))
    torch.cuda.empty_cache()

# Use gradient checkpointing (training only; trades compute for memory)
model.gradient_checkpointing_enable()

# Monitor memory
print(f"Allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
print(f"Reserved: {torch.cuda.memory_reserved() / 1e9:.2f} GB")
```

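When the largest batch that fits is unknown, a halving retry loop can find a workable size automatically. A sketch with a generic `process` callable — in real code you would catch `torch.cuda.OutOfMemoryError` (and call `torch.cuda.empty_cache()` before retrying) rather than the plain `MemoryError` used here:

```python
def run_with_backoff(process, items, batch_size=8):
    """Halve the batch size on MemoryError until processing succeeds."""
    while batch_size >= 1:
        try:
            results = []
            for start in range(0, len(items), batch_size):
                results.extend(process(items[start:start + batch_size]))
            return results, batch_size
        except MemoryError:
            batch_size //= 2  # retry everything with smaller batches
    raise MemoryError("even batch_size=1 does not fit in memory")
```

The returned batch size is a reasonable default for subsequent runs on the same hardware.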
### Memory leak during batch processing

**Problem**: Memory grows over time

**Solutions**:
```python
import gc

# Delete tensors explicitly
del inputs, generated_ids
gc.collect()
torch.cuda.empty_cache()

# Use a context manager so no autograd state is kept
with torch.inference_mode():
    inputs = processor(images=image, return_tensors="pt").to("cuda", torch.float16)
    generated_ids = model.generate(**inputs, max_new_tokens=50)
    caption = processor.decode(generated_ids[0], skip_special_tokens=True)

# Move to CPU after inference
caption = processor.decode(generated_ids.cpu()[0], skip_special_tokens=True)
```

## Quality Issues

### Poor caption quality

**Problem**: Captions are generic or inaccurate

**Solutions**:
```python
# Use a larger model
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-flan-t5-xl",  # Better quality than OPT
    torch_dtype=torch.float16,
    device_map="auto"
)

# Use prompts for better captions
inputs = processor(
    images=image,
    text="a detailed description of the image:",
    return_tensors="pt"
)

# Increase diversity with beam-search sampling
generated_ids = model.generate(
    **inputs,
    max_new_tokens=100,
    num_beams=5,
    num_return_sequences=3,  # Generate multiple candidates
    temperature=0.9,
    do_sample=True
)

# Then select the best of the returned candidates
```

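Selecting among several decoded candidates needs some scoring rule. One simple hypothetical heuristic (not from the BLIP-2 codebase) is to reward detail while penalizing word repetition:

```python
def pick_caption(candidates):
    """Pick the candidate with the most detail and least repetition."""
    def score(text):
        words = text.lower().split()
        if not words:
            return 0.0
        distinct_ratio = len(set(words)) / len(words)  # 1.0 = no repeats
        return distinct_ratio * len(words)             # reward longer captions
    return max(candidates, key=score)
```

In practice you might instead rank candidates with a scoring model (e.g., CLIP image-text similarity), but a cheap heuristic like this already filters out degenerate repetitive outputs.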
### VQA hallucinations

**Problem**: Model makes up information not in the image

**Solutions**:
```python
# Use more specific questions:
# instead of "What is happening?",
# ask "Is there a person in this image?"

# Lower the temperature
generated_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    temperature=0.3,  # More focused
    do_sample=True
)

# Use beam search (deterministic)
generated_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    num_beams=5,
    do_sample=False
)

# Add constraints
generated_ids = model.generate(
    **inputs,
    max_new_tokens=30,
    no_repeat_ngram_size=3,
)
```

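Another hedge against hallucination is self-consistency: sample the same question several times and only trust an answer that most samples agree on. A minimal sketch — the helper and the 50% agreement threshold are illustrative choices, not part of any library:

```python
from collections import Counter

def majority_answer(answers, min_agreement=0.5):
    """Most common answer, or None when agreement falls below the threshold."""
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer if votes / len(answers) >= min_agreement else None
```

Returning `None` lets the caller surface "unsure" instead of a confidently wrong answer.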
### Incorrect colors/objects

**Problem**: Model identifies wrong colors or objects

**Solutions**:
```python
# Ensure the image is RGB, not BGR
import cv2
image_cv = cv2.imread("image.jpg")
image_rgb = cv2.cvtColor(image_cv, cv2.COLOR_BGR2RGB)
image = Image.fromarray(image_rgb)

# Check image quality
print(f"Image size: {image.size}")
print(f"Image mode: {image.mode}")

# Use a higher-resolution source if possible
# (though the processor resizes to 224x224)

# Ask more specific questions:
# instead of "What color is it?",
# ask "Is the car red or blue?"
```

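For intuition, the `cv2.COLOR_BGR2RGB` conversion is nothing more than a per-pixel channel reversal. A pure-Python equivalent on a nested list of pixel tuples (illustrative only — use OpenCV or NumPy on real image arrays):

```python
def bgr_to_rgb(pixels):
    """Reverse each (B, G, R) pixel tuple into (R, G, B)."""
    return [[tuple(reversed(px)) for px in row] for row in pixels]
```

If blue and red objects are being swapped in captions, this channel-order bug is the first thing to rule out.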
## Processor Issues

### Tokenizer warnings

**Warning**: `Asking to pad but the tokenizer does not have a padding token`

**Solutions**:
```python
# Set the padding token
processor.tokenizer.pad_token = processor.tokenizer.eos_token

# Or specify padding during processing
inputs = processor(
    images=image,
    text=question,
    return_tensors="pt",
    padding="max_length",
    max_length=32
)
```

### Image normalization issues

**Problem**: Unexpected results due to normalization

**Solutions**:
```python
# Check the processor's image normalization statistics
print(processor.image_processor.image_mean)
print(processor.image_processor.image_std)

# Manual normalization if needed
from torchvision import transforms

normalize = transforms.Normalize(
    mean=processor.image_processor.image_mean,
    std=processor.image_processor.image_std
)

# Or use raw pixel values
inputs = processor(
    images=image,
    return_tensors="pt",
    do_normalize=False  # Skip normalization
)
```

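Under the hood, `transforms.Normalize` computes `(value - mean) / std` per channel on inputs already scaled to [0, 1]. A pure-Python sketch of that arithmetic — the mean/std values below are placeholders, not BLIP-2's real statistics, which you should read from the processor as shown above:

```python
def normalize_pixel(rgb, mean, std):
    """Per-channel (value - mean) / std, for values already scaled to [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, mean, std))

# Placeholder statistics; use processor.image_processor.image_mean / image_std
MEAN = (0.5, 0.5, 0.5)
STD = (0.5, 0.5, 0.5)
```

Normalizing twice (manually and via the processor) or with the wrong statistics shifts every channel, which is a common cause of "unexpected results".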
## LAVIS-Specific Issues

### Config not found

**Error**: `ConfigError: Config file not found`

**Solutions**:
```python
# Use the registry properly
from lavis.common.registry import registry
from lavis.models import load_model_and_preprocess

# Check available models
print(registry.list_models())

# Load with an explicit config
model, vis_processors, txt_processors = load_model_and_preprocess(
    name="blip2_opt",
    model_type="pretrain_opt2.7b",
    is_eval=True,
    device="cuda"
)
```

### Dataset loading errors

**Error**: `Dataset not found` or download issues

**Solutions**:
```python
from lavis.datasets.builders import load_dataset

# Set the download directory
import os
os.environ["LAVIS_DATASETS_ROOT"] = "/path/to/datasets"

# Download manually first, then load from local files
dataset = load_dataset("coco_caption", split="val")
```

## Common Error Messages

| Error | Cause | Solution |
|-------|-------|----------|
| `CUDA out of memory` | Model too large | Use quantization or a smaller model |
| `Unable to create tensor` | Invalid image format | Convert to an RGB PIL Image |
| `padding_side must be` | Tokenizer config | Set `pad_token` explicitly |
| `Expected 4D input` | Wrong tensor shape | Add a batch dimension with `unsqueeze(0)` |
| `device mismatch` | Tensors on different devices | Move all tensors to the same device |
| `half() not implemented` | CPU doesn't support FP16 | Use float32 on CPU |

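A table like this lends itself to a small triage helper that matches a caught exception's message against known substrings. A hypothetical sketch, covering a few of the rows:

```python
# Substring -> suggested fix, mirroring the error table
FIXES = {
    "CUDA out of memory": "Use quantization or a smaller model",
    "Unable to create tensor": "Convert the input to an RGB PIL Image",
    "Expected 4D input": "Add a batch dimension with unsqueeze(0)",
    "device mismatch": "Move all tensors to the same device",
}

def suggest_fix(error_message):
    """Return a suggested fix for a known error message, if any."""
    for pattern, fix in FIXES.items():
        if pattern in error_message:
            return fix
    return None
```

Wrapping inference in `try/except` and logging `suggest_fix(str(exc))` turns silent batch failures into actionable hints.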
## Getting Help

1. **HuggingFace Forums**: https://discuss.huggingface.co
2. **LAVIS GitHub Issues**: https://github.com/salesforce/LAVIS/issues
3. **Paper**: https://arxiv.org/abs/2301.12597
4. **Model Card**: https://huggingface.co/Salesforce/blip2-opt-2.7b

### Reporting Issues

Include:
- Python version
- transformers/lavis version
- PyTorch and CUDA versions
- GPU model and VRAM
- Full error traceback
- Minimal reproducible code
- Image resolution and format
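Most of the checklist above can be gathered automatically. A stdlib-only sketch (the package names passed in are examples; GPU details would come from `torch.cuda` when available):

```python
import platform
import sys
from importlib.metadata import PackageNotFoundError, version

def environment_report(packages=("torch", "transformers")):
    """Gather version info worth pasting into a bug report."""
    lines = [
        f"Python: {sys.version.split()[0]}",
        f"OS: {platform.system()} {platform.release()}",
    ]
    for name in packages:
        try:
            lines.append(f"{name}: {version(name)}")
        except PackageNotFoundError:
            lines.append(f"{name}: not installed")
    return "\n".join(lines)
```

Paste the output at the top of the issue along with the traceback and minimal repro.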