@synsci/cli-darwin-x64 1.1.49
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/bin/skills/accelerate/SKILL.md +332 -0
- package/bin/skills/accelerate/references/custom-plugins.md +453 -0
- package/bin/skills/accelerate/references/megatron-integration.md +489 -0
- package/bin/skills/accelerate/references/performance.md +525 -0
- package/bin/skills/audiocraft/SKILL.md +564 -0
- package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
- package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
- package/bin/skills/autogpt/SKILL.md +403 -0
- package/bin/skills/autogpt/references/advanced-usage.md +535 -0
- package/bin/skills/autogpt/references/troubleshooting.md +420 -0
- package/bin/skills/awq/SKILL.md +310 -0
- package/bin/skills/awq/references/advanced-usage.md +324 -0
- package/bin/skills/awq/references/troubleshooting.md +344 -0
- package/bin/skills/axolotl/SKILL.md +158 -0
- package/bin/skills/axolotl/references/api.md +5548 -0
- package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
- package/bin/skills/axolotl/references/index.md +15 -0
- package/bin/skills/axolotl/references/other.md +3563 -0
- package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
- package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
- package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
- package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
- package/bin/skills/bitsandbytes/SKILL.md +411 -0
- package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
- package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
- package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
- package/bin/skills/blip-2/SKILL.md +564 -0
- package/bin/skills/blip-2/references/advanced-usage.md +680 -0
- package/bin/skills/blip-2/references/troubleshooting.md +526 -0
- package/bin/skills/chroma/SKILL.md +406 -0
- package/bin/skills/chroma/references/integration.md +38 -0
- package/bin/skills/clip/SKILL.md +253 -0
- package/bin/skills/clip/references/applications.md +207 -0
- package/bin/skills/constitutional-ai/SKILL.md +290 -0
- package/bin/skills/crewai/SKILL.md +498 -0
- package/bin/skills/crewai/references/flows.md +438 -0
- package/bin/skills/crewai/references/tools.md +429 -0
- package/bin/skills/crewai/references/troubleshooting.md +480 -0
- package/bin/skills/deepspeed/SKILL.md +141 -0
- package/bin/skills/deepspeed/references/08.md +17 -0
- package/bin/skills/deepspeed/references/09.md +173 -0
- package/bin/skills/deepspeed/references/2020.md +378 -0
- package/bin/skills/deepspeed/references/2023.md +279 -0
- package/bin/skills/deepspeed/references/assets.md +179 -0
- package/bin/skills/deepspeed/references/index.md +35 -0
- package/bin/skills/deepspeed/references/mii.md +118 -0
- package/bin/skills/deepspeed/references/other.md +1191 -0
- package/bin/skills/deepspeed/references/tutorials.md +6554 -0
- package/bin/skills/dspy/SKILL.md +590 -0
- package/bin/skills/dspy/references/examples.md +663 -0
- package/bin/skills/dspy/references/modules.md +475 -0
- package/bin/skills/dspy/references/optimizers.md +566 -0
- package/bin/skills/faiss/SKILL.md +221 -0
- package/bin/skills/faiss/references/index_types.md +280 -0
- package/bin/skills/flash-attention/SKILL.md +367 -0
- package/bin/skills/flash-attention/references/benchmarks.md +215 -0
- package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
- package/bin/skills/gguf/SKILL.md +427 -0
- package/bin/skills/gguf/references/advanced-usage.md +504 -0
- package/bin/skills/gguf/references/troubleshooting.md +442 -0
- package/bin/skills/gptq/SKILL.md +450 -0
- package/bin/skills/gptq/references/calibration.md +337 -0
- package/bin/skills/gptq/references/integration.md +129 -0
- package/bin/skills/gptq/references/troubleshooting.md +95 -0
- package/bin/skills/grpo-rl-training/README.md +97 -0
- package/bin/skills/grpo-rl-training/SKILL.md +572 -0
- package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
- package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
- package/bin/skills/guidance/SKILL.md +572 -0
- package/bin/skills/guidance/references/backends.md +554 -0
- package/bin/skills/guidance/references/constraints.md +674 -0
- package/bin/skills/guidance/references/examples.md +767 -0
- package/bin/skills/hqq/SKILL.md +445 -0
- package/bin/skills/hqq/references/advanced-usage.md +528 -0
- package/bin/skills/hqq/references/troubleshooting.md +503 -0
- package/bin/skills/hugging-face-cli/SKILL.md +191 -0
- package/bin/skills/hugging-face-cli/references/commands.md +954 -0
- package/bin/skills/hugging-face-cli/references/examples.md +374 -0
- package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
- package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
- package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
- package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
- package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
- package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
- package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
- package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
- package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
- package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
- package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
- package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
- package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
- package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
- package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
- package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
- package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
- package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
- package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
- package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
- package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
- package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
- package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
- package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
- package/bin/skills/hugging-face-jobs/index.html +216 -0
- package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
- package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
- package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
- package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
- package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
- package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
- package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
- package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
- package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
- package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
- package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
- package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
- package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
- package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
- package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
- package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
- package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
- package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
- package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
- package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
- package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
- package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
- package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
- package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
- package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
- package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
- package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
- package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
- package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
- package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
- package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
- package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
- package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
- package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
- package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
- package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
- package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
- package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
- package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
- package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
- package/bin/skills/instructor/SKILL.md +740 -0
- package/bin/skills/instructor/references/examples.md +107 -0
- package/bin/skills/instructor/references/providers.md +70 -0
- package/bin/skills/instructor/references/validation.md +606 -0
- package/bin/skills/knowledge-distillation/SKILL.md +458 -0
- package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
- package/bin/skills/lambda-labs/SKILL.md +545 -0
- package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
- package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
- package/bin/skills/langchain/SKILL.md +480 -0
- package/bin/skills/langchain/references/agents.md +499 -0
- package/bin/skills/langchain/references/integration.md +562 -0
- package/bin/skills/langchain/references/rag.md +600 -0
- package/bin/skills/langsmith/SKILL.md +422 -0
- package/bin/skills/langsmith/references/advanced-usage.md +548 -0
- package/bin/skills/langsmith/references/troubleshooting.md +537 -0
- package/bin/skills/litgpt/SKILL.md +469 -0
- package/bin/skills/litgpt/references/custom-models.md +568 -0
- package/bin/skills/litgpt/references/distributed-training.md +451 -0
- package/bin/skills/litgpt/references/supported-models.md +336 -0
- package/bin/skills/litgpt/references/training-recipes.md +619 -0
- package/bin/skills/llama-cpp/SKILL.md +258 -0
- package/bin/skills/llama-cpp/references/optimization.md +89 -0
- package/bin/skills/llama-cpp/references/quantization.md +213 -0
- package/bin/skills/llama-cpp/references/server.md +125 -0
- package/bin/skills/llama-factory/SKILL.md +80 -0
- package/bin/skills/llama-factory/references/_images.md +23 -0
- package/bin/skills/llama-factory/references/advanced.md +1055 -0
- package/bin/skills/llama-factory/references/getting_started.md +349 -0
- package/bin/skills/llama-factory/references/index.md +19 -0
- package/bin/skills/llama-factory/references/other.md +31 -0
- package/bin/skills/llamaguard/SKILL.md +337 -0
- package/bin/skills/llamaindex/SKILL.md +569 -0
- package/bin/skills/llamaindex/references/agents.md +83 -0
- package/bin/skills/llamaindex/references/data_connectors.md +108 -0
- package/bin/skills/llamaindex/references/query_engines.md +406 -0
- package/bin/skills/llava/SKILL.md +304 -0
- package/bin/skills/llava/references/training.md +197 -0
- package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
- package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
- package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
- package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
- package/bin/skills/long-context/SKILL.md +536 -0
- package/bin/skills/long-context/references/extension_methods.md +468 -0
- package/bin/skills/long-context/references/fine_tuning.md +611 -0
- package/bin/skills/long-context/references/rope.md +402 -0
- package/bin/skills/mamba/SKILL.md +260 -0
- package/bin/skills/mamba/references/architecture-details.md +206 -0
- package/bin/skills/mamba/references/benchmarks.md +255 -0
- package/bin/skills/mamba/references/training-guide.md +388 -0
- package/bin/skills/megatron-core/SKILL.md +366 -0
- package/bin/skills/megatron-core/references/benchmarks.md +249 -0
- package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
- package/bin/skills/megatron-core/references/production-examples.md +473 -0
- package/bin/skills/megatron-core/references/training-recipes.md +547 -0
- package/bin/skills/miles/SKILL.md +315 -0
- package/bin/skills/miles/references/api-reference.md +141 -0
- package/bin/skills/miles/references/troubleshooting.md +352 -0
- package/bin/skills/mlflow/SKILL.md +704 -0
- package/bin/skills/mlflow/references/deployment.md +744 -0
- package/bin/skills/mlflow/references/model-registry.md +770 -0
- package/bin/skills/mlflow/references/tracking.md +680 -0
- package/bin/skills/modal/SKILL.md +341 -0
- package/bin/skills/modal/references/advanced-usage.md +503 -0
- package/bin/skills/modal/references/troubleshooting.md +494 -0
- package/bin/skills/model-merging/SKILL.md +539 -0
- package/bin/skills/model-merging/references/evaluation.md +462 -0
- package/bin/skills/model-merging/references/examples.md +428 -0
- package/bin/skills/model-merging/references/methods.md +352 -0
- package/bin/skills/model-pruning/SKILL.md +495 -0
- package/bin/skills/model-pruning/references/wanda.md +347 -0
- package/bin/skills/moe-training/SKILL.md +526 -0
- package/bin/skills/moe-training/references/architectures.md +432 -0
- package/bin/skills/moe-training/references/inference.md +348 -0
- package/bin/skills/moe-training/references/training.md +425 -0
- package/bin/skills/nanogpt/SKILL.md +290 -0
- package/bin/skills/nanogpt/references/architecture.md +382 -0
- package/bin/skills/nanogpt/references/data.md +476 -0
- package/bin/skills/nanogpt/references/training.md +564 -0
- package/bin/skills/nemo-curator/SKILL.md +383 -0
- package/bin/skills/nemo-curator/references/deduplication.md +87 -0
- package/bin/skills/nemo-curator/references/filtering.md +102 -0
- package/bin/skills/nemo-evaluator/SKILL.md +494 -0
- package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
- package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
- package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
- package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
- package/bin/skills/nemo-guardrails/SKILL.md +297 -0
- package/bin/skills/nnsight/SKILL.md +436 -0
- package/bin/skills/nnsight/references/README.md +78 -0
- package/bin/skills/nnsight/references/api.md +344 -0
- package/bin/skills/nnsight/references/tutorials.md +300 -0
- package/bin/skills/openrlhf/SKILL.md +249 -0
- package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
- package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
- package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
- package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
- package/bin/skills/outlines/SKILL.md +652 -0
- package/bin/skills/outlines/references/backends.md +615 -0
- package/bin/skills/outlines/references/examples.md +773 -0
- package/bin/skills/outlines/references/json_generation.md +652 -0
- package/bin/skills/peft/SKILL.md +431 -0
- package/bin/skills/peft/references/advanced-usage.md +514 -0
- package/bin/skills/peft/references/troubleshooting.md +480 -0
- package/bin/skills/phoenix/SKILL.md +475 -0
- package/bin/skills/phoenix/references/advanced-usage.md +619 -0
- package/bin/skills/phoenix/references/troubleshooting.md +538 -0
- package/bin/skills/pinecone/SKILL.md +358 -0
- package/bin/skills/pinecone/references/deployment.md +181 -0
- package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
- package/bin/skills/pytorch-fsdp/references/index.md +7 -0
- package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
- package/bin/skills/pytorch-lightning/SKILL.md +346 -0
- package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
- package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
- package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
- package/bin/skills/pyvene/SKILL.md +473 -0
- package/bin/skills/pyvene/references/README.md +73 -0
- package/bin/skills/pyvene/references/api.md +383 -0
- package/bin/skills/pyvene/references/tutorials.md +376 -0
- package/bin/skills/qdrant/SKILL.md +493 -0
- package/bin/skills/qdrant/references/advanced-usage.md +648 -0
- package/bin/skills/qdrant/references/troubleshooting.md +631 -0
- package/bin/skills/ray-data/SKILL.md +326 -0
- package/bin/skills/ray-data/references/integration.md +82 -0
- package/bin/skills/ray-data/references/transformations.md +83 -0
- package/bin/skills/ray-train/SKILL.md +406 -0
- package/bin/skills/ray-train/references/multi-node.md +628 -0
- package/bin/skills/rwkv/SKILL.md +260 -0
- package/bin/skills/rwkv/references/architecture-details.md +344 -0
- package/bin/skills/rwkv/references/rwkv7.md +386 -0
- package/bin/skills/rwkv/references/state-management.md +369 -0
- package/bin/skills/saelens/SKILL.md +386 -0
- package/bin/skills/saelens/references/README.md +70 -0
- package/bin/skills/saelens/references/api.md +333 -0
- package/bin/skills/saelens/references/tutorials.md +318 -0
- package/bin/skills/segment-anything/SKILL.md +500 -0
- package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
- package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
- package/bin/skills/sentence-transformers/SKILL.md +255 -0
- package/bin/skills/sentence-transformers/references/models.md +123 -0
- package/bin/skills/sentencepiece/SKILL.md +235 -0
- package/bin/skills/sentencepiece/references/algorithms.md +200 -0
- package/bin/skills/sentencepiece/references/training.md +304 -0
- package/bin/skills/sglang/SKILL.md +442 -0
- package/bin/skills/sglang/references/deployment.md +490 -0
- package/bin/skills/sglang/references/radix-attention.md +413 -0
- package/bin/skills/sglang/references/structured-generation.md +541 -0
- package/bin/skills/simpo/SKILL.md +219 -0
- package/bin/skills/simpo/references/datasets.md +478 -0
- package/bin/skills/simpo/references/hyperparameters.md +452 -0
- package/bin/skills/simpo/references/loss-functions.md +350 -0
- package/bin/skills/skypilot/SKILL.md +509 -0
- package/bin/skills/skypilot/references/advanced-usage.md +491 -0
- package/bin/skills/skypilot/references/troubleshooting.md +570 -0
- package/bin/skills/slime/SKILL.md +464 -0
- package/bin/skills/slime/references/api-reference.md +392 -0
- package/bin/skills/slime/references/troubleshooting.md +386 -0
- package/bin/skills/speculative-decoding/SKILL.md +467 -0
- package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
- package/bin/skills/speculative-decoding/references/medusa.md +350 -0
- package/bin/skills/stable-diffusion/SKILL.md +519 -0
- package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
- package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
- package/bin/skills/tensorboard/SKILL.md +629 -0
- package/bin/skills/tensorboard/references/integrations.md +638 -0
- package/bin/skills/tensorboard/references/profiling.md +545 -0
- package/bin/skills/tensorboard/references/visualization.md +620 -0
- package/bin/skills/tensorrt-llm/SKILL.md +187 -0
- package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
- package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
- package/bin/skills/tensorrt-llm/references/serving.md +470 -0
- package/bin/skills/tinker/SKILL.md +362 -0
- package/bin/skills/tinker/references/api-reference.md +168 -0
- package/bin/skills/tinker/references/getting-started.md +157 -0
- package/bin/skills/tinker/references/loss-functions.md +163 -0
- package/bin/skills/tinker/references/models-and-lora.md +139 -0
- package/bin/skills/tinker/references/recipes.md +280 -0
- package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
- package/bin/skills/tinker/references/rendering.md +243 -0
- package/bin/skills/tinker/references/supervised-learning.md +232 -0
- package/bin/skills/tinker-training-cost/SKILL.md +187 -0
- package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
- package/bin/skills/torchforge/SKILL.md +433 -0
- package/bin/skills/torchforge/references/api-reference.md +327 -0
- package/bin/skills/torchforge/references/troubleshooting.md +409 -0
- package/bin/skills/torchtitan/SKILL.md +358 -0
- package/bin/skills/torchtitan/references/checkpoint.md +181 -0
- package/bin/skills/torchtitan/references/custom-models.md +258 -0
- package/bin/skills/torchtitan/references/float8.md +133 -0
- package/bin/skills/torchtitan/references/fsdp.md +126 -0
- package/bin/skills/transformer-lens/SKILL.md +346 -0
- package/bin/skills/transformer-lens/references/README.md +54 -0
- package/bin/skills/transformer-lens/references/api.md +362 -0
- package/bin/skills/transformer-lens/references/tutorials.md +339 -0
- package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
- package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
- package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
- package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
- package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
- package/bin/skills/unsloth/SKILL.md +80 -0
- package/bin/skills/unsloth/references/index.md +7 -0
- package/bin/skills/unsloth/references/llms-full.md +16799 -0
- package/bin/skills/unsloth/references/llms-txt.md +12044 -0
- package/bin/skills/unsloth/references/llms.md +82 -0
- package/bin/skills/verl/SKILL.md +391 -0
- package/bin/skills/verl/references/api-reference.md +301 -0
- package/bin/skills/verl/references/troubleshooting.md +391 -0
- package/bin/skills/vllm/SKILL.md +364 -0
- package/bin/skills/vllm/references/optimization.md +226 -0
- package/bin/skills/vllm/references/quantization.md +284 -0
- package/bin/skills/vllm/references/server-deployment.md +255 -0
- package/bin/skills/vllm/references/troubleshooting.md +447 -0
- package/bin/skills/weights-and-biases/SKILL.md +590 -0
- package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
- package/bin/skills/weights-and-biases/references/integrations.md +700 -0
- package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
- package/bin/skills/whisper/SKILL.md +317 -0
- package/bin/skills/whisper/references/languages.md +189 -0
- package/bin/synsc +0 -0
- package/package.json +10 -0
@@ -0,0 +1,445 @@
---
name: hqq-quantization
description: Half-Quadratic Quantization for LLMs without calibration data. Use when quantizing models to 4/3/2-bit precision without needing calibration datasets, for fast quantization workflows, or when deploying with vLLM or HuggingFace Transformers.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Quantization, HQQ, Optimization, Memory Efficiency, Inference, Model Compression]
dependencies: [hqq>=0.2.0, torch>=2.0.0]
---
# HQQ - Half-Quadratic Quantization

Fast, calibration-free weight quantization supporting 8/4/3/2/1-bit precision with multiple optimized backends.

## When to use HQQ

**Use HQQ when:**
- Quantizing models without calibration data (no dataset needed)
- You need fast quantization (minutes rather than hours for GPTQ/AWQ)
- Deploying with vLLM or HuggingFace Transformers
- Fine-tuning quantized models with LoRA/PEFT
- Experimenting with extreme quantization (2-bit, 1-bit)

**Key advantages:**
- **No calibration**: Quantize any model instantly, without sample data
- **Multiple backends**: PyTorch, ATEN, TorchAO, Marlin, BitBlas for optimized inference
- **Flexible precision**: 8/4/3/2/1-bit with configurable group sizes
- **Framework integration**: Native HuggingFace and vLLM support
- **PEFT compatible**: Fine-tune quantized models with LoRA

**Use alternatives instead:**
- **AWQ**: calibration-based accuracy for production serving
- **GPTQ**: maximum accuracy when calibration data is available
- **bitsandbytes**: simple 8-bit/4-bit loading without custom backends
- **llama.cpp/GGUF**: CPU inference and Apple Silicon deployment
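The memory savings behind these tradeoffs are easy to sanity-check with back-of-envelope arithmetic. The sketch below assumes one fp16 scale and one fp16 zero point per group, which is an illustrative layout, not HQQ's exact storage format:

```python
# Rough size estimate for group-wise quantized weights.
# Assumption (illustrative): one fp16 scale and one fp16 zero point
# stored per group, alongside the packed low-bit weights.

def quantized_gb(n_params: int, nbits: int, group_size: int) -> float:
    packed = n_params * nbits / 8            # packed quantized weights, bytes
    meta = (n_params / group_size) * 2 * 2   # fp16 scale + zero per group
    return (packed + meta) / 1e9

n = 8_000_000_000  # roughly an 8B-parameter model
print(f"fp16 weights:          {n * 2 / 1e9:.1f} GB")
print(f"4-bit, group_size=64:  {quantized_gb(n, 4, 64):.1f} GB")
print(f"2-bit, group_size=16:  {quantized_gb(n, 2, 16):.1f} GB")
```

Note how small groups eat into the savings: at 2-bit with `group_size=16`, the per-group metadata is as large as the packed weights themselves, so extreme bit-widths pay off mainly with larger groups or compressed metadata.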

## Quick start

### Installation

```bash
pip install hqq

# With a specific backend
pip install hqq[torch]    # PyTorch backend
pip install hqq[torchao]  # TorchAO int4 backend
pip install hqq[bitblas]  # BitBlas backend
pip install hqq[marlin]   # Marlin backend
```

### Basic quantization

```python
import torch
import torch.nn as nn

from hqq.core.quantize import BaseQuantizeConfig, HQQLinear

# Configure quantization
config = BaseQuantizeConfig(
    nbits=4,        # 4-bit quantization
    group_size=64,  # weights per quantization group
    axis=1          # quantize along the output dimension
)

# Quantize a linear layer
linear = nn.Linear(4096, 4096)
hqq_linear = HQQLinear(linear, config)

# Use it like a regular linear layer
input_tensor = torch.randn(1, 4096)
output = hqq_linear(input_tensor)
```

### Quantize a full model with HuggingFace

```python
from transformers import AutoModelForCausalLM, HqqConfig

# Configure HQQ
quantization_config = HqqConfig(
    nbits=4,
    group_size=64,
    axis=1
)

# Load and quantize in one step
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=quantization_config,
    device_map="auto"
)

# The model is quantized and ready to use
```

## Core concepts

### Quantization configuration

HQQ uses `BaseQuantizeConfig` to define quantization parameters:

```python
from hqq.core.quantize import BaseQuantizeConfig

# Standard 4-bit config
config_4bit = BaseQuantizeConfig(
    nbits=4,        # bits per weight (1-8)
    group_size=64,  # weights per quantization group
    axis=1          # 0 = input dim, 1 = output dim
)

# Aggressive 2-bit config
config_2bit = BaseQuantizeConfig(
    nbits=2,
    group_size=16,  # smaller groups preserve accuracy at low bit-widths
    axis=1
)

# Mixed precision per layer type
layer_configs = {
    "self_attn.q_proj": BaseQuantizeConfig(nbits=4, group_size=64),
    "self_attn.k_proj": BaseQuantizeConfig(nbits=4, group_size=64),
    "self_attn.v_proj": BaseQuantizeConfig(nbits=4, group_size=64),
    "mlp.gate_proj": BaseQuantizeConfig(nbits=2, group_size=32),
    "mlp.up_proj": BaseQuantizeConfig(nbits=2, group_size=32),
    "mlp.down_proj": BaseQuantizeConfig(nbits=4, group_size=64),
}
```
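To build intuition for what `nbits` and `group_size` control, here is a framework-free sketch of plain min-max group-wise affine quantization. This is illustrative only: HQQ refines the scale and zero points with a half-quadratic solver rather than taking the raw min/max.

```python
import random

def quantize_group(group, nbits):
    """Min-max affine quantization: one (scale, zero) pair per group."""
    w_min, w_max = min(group), max(group)
    qmax = 2 ** nbits - 1
    scale = (w_max - w_min) / qmax
    q = [round((w - w_min) / scale) for w in group]  # integers in [0, qmax]
    return q, scale, w_min

def dequantize_group(q, scale, zero):
    return [v * scale + zero for v in q]

random.seed(0)
weights = [random.gauss(0, 1) for _ in range(64)]  # one group (group_size=64)

q, scale, zero = quantize_group(weights, nbits=4)
restored = dequantize_group(q, scale, zero)
err = sum(abs(a - b) for a, b in zip(weights, restored)) / len(weights)
print(f"4-bit mean |error| per weight: {err:.3f}")
```

Smaller groups mean each (scale, zero) pair covers a narrower weight range, which lowers reconstruction error; that is why the 2-bit config above drops `group_size` to 16.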

### HQQLinear layer

The core quantized layer that replaces `nn.Linear`:

```python
import torch
from hqq.core.quantize import BaseQuantizeConfig, HQQLinear

config = BaseQuantizeConfig(nbits=4, group_size=64)

# Create a quantized layer
linear = torch.nn.Linear(4096, 4096)
hqq_layer = HQQLinear(linear, config)

# Access the quantized representation
W_q = hqq_layer.W_q      # quantized weights
scale = hqq_layer.scale  # scale factors
zero = hqq_layer.zero    # zero points

# Dequantize for inspection
W_dequant = hqq_layer.dequantize()
```

### Backends

HQQ supports multiple inference backends for different hardware:

```python
from hqq.core.quantize import HQQLinear

# Available backends
backends = [
    "pytorch",          # pure PyTorch (default)
    "pytorch_compile",  # torch.compile optimized
    "aten",             # custom CUDA kernels
    "torchao_int4",     # TorchAO int4 matmul
    "gemlite",          # GemLite CUDA kernels
    "bitblas",          # BitBlas optimized
    "marlin",           # Marlin 4-bit kernels
]

# Set the backend globally
HQQLinear.set_backend("torchao_int4")

# Or per layer
hqq_layer.set_backend("marlin")
```

**Backend selection guide:**

| Backend | Best for | Requirements |
|---------|----------|--------------|
| pytorch | Compatibility | Any GPU |
| pytorch_compile | Moderate speedup | torch>=2.0 |
| aten | Good balance | CUDA GPU |
| torchao_int4 | 4-bit inference | torchao installed |
| marlin | Maximum 4-bit speed | Ampere+ GPU |
| bitblas | Flexible bit-widths | bitblas installed |
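The selection guide is effectively a decision procedure, and can be sketched as one. This helper is hypothetical: the backend strings match the `HQQLinear.set_backend` names, but the availability checks are illustrative assumptions, not HQQ's own dispatch logic.

```python
def pick_backend(nbits: int, has_cuda: bool,
                 compute_capability: tuple = (0, 0),
                 installed: tuple = ()) -> str:
    """Hypothetical backend chooser following the selection guide above."""
    if nbits == 4 and has_cuda and compute_capability >= (8, 0) and "marlin" in installed:
        return "marlin"          # maximum 4-bit speed on Ampere+ GPUs
    if nbits == 4 and "torchao" in installed:
        return "torchao_int4"    # optimized int4 matmul
    if "bitblas" in installed:
        return "bitblas"         # flexible bit-widths
    if has_cuda:
        return "aten"            # good balance on CUDA GPUs
    return "pytorch_compile"     # moderate speedup wherever torch>=2.0 runs

print(pick_backend(4, True, (8, 6), ("marlin", "torchao")))  # marlin
print(pick_backend(2, True, (7, 5), ("bitblas",)))           # bitblas
```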

## HuggingFace integration

### Load pre-quantized models

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load an HQQ-quantized model from the Hub
model = AutoModelForCausalLM.from_pretrained(
    "mobiuslabsgmbh/Llama-3.1-8B-HQQ-4bit",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

# Use normally
inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
```

### Quantize and save

```python
from transformers import AutoModelForCausalLM, HqqConfig

# Quantize
config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=config,
    device_map="auto"
)

# Save quantized model
model.save_pretrained("./llama-8b-hqq-4bit")

# Push to Hub
model.push_to_hub("my-org/Llama-3.1-8B-HQQ-4bit")
```

### Mixed precision quantization

```python
from transformers import AutoModelForCausalLM, HqqConfig

# dynamic_config maps linear-layer name tags to per-layer settings:
# keep attention at higher precision, compress MLP layers harder
attn_cfg = {"nbits": 4, "group_size": 64}
mlp_cfg = {"nbits": 2, "group_size": 32}

config = HqqConfig(
    dynamic_config={
        "self_attn.q_proj": attn_cfg,
        "self_attn.k_proj": attn_cfg,
        "self_attn.v_proj": attn_cfg,
        "self_attn.o_proj": attn_cfg,
        "mlp.gate_proj": mlp_cfg,
        "mlp.up_proj": mlp_cfg,
        "mlp.down_proj": mlp_cfg,
    }
)
```

## vLLM integration

### Serve HQQ models with vLLM

```python
from vllm import LLM, SamplingParams

# Load HQQ-quantized model
llm = LLM(
    model="mobiuslabsgmbh/Llama-3.1-8B-HQQ-4bit",
    quantization="hqq",
    dtype="float16"
)

# Generate
sampling_params = SamplingParams(temperature=0.7, max_tokens=100)
outputs = llm.generate(["What is machine learning?"], sampling_params)
```

### vLLM with custom HQQ config

```python
from vllm import LLM

llm = LLM(
    model="meta-llama/Llama-3.1-8B",
    quantization="hqq",
    quantization_config={
        "nbits": 4,
        "group_size": 64
    }
)
```

## PEFT/LoRA fine-tuning

### Fine-tune quantized models

```python
from transformers import AutoModelForCausalLM, HqqConfig
from peft import LoraConfig, get_peft_model

# Load quantized model
quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=quant_config,
    device_map="auto"
)

# Apply LoRA adapters on top of the frozen quantized weights
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Train normally with Trainer or a custom loop
```

### QLoRA-style training

```python
from transformers import TrainingArguments, Trainer

# train_dataset and data_collator come from your own data pipeline
training_args = TrainingArguments(
    output_dir="./hqq-lora-output",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    num_train_epochs=3,
    fp16=True,
    logging_steps=10,
    save_strategy="epoch"
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    data_collator=data_collator
)

trainer.train()
```

## Quantization workflows

### Workflow 1: Quick model compression

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, HqqConfig

# 1. Configure quantization
config = HqqConfig(nbits=4, group_size=64)

# 2. Load and quantize (no calibration needed!)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

# 3. Verify quality
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))

# 4. Save
model.save_pretrained("./llama-8b-hqq")
tokenizer.save_pretrained("./llama-8b-hqq")
```

### Workflow 2: Optimize for inference speed

```python
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, HqqConfig
from hqq.utils.patching import prepare_for_inference

# 1. Quantize
config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=config,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B")

# 2. Patch in a fast backend
prepare_for_inference(model, backend="marlin")  # or "torchao_int4"

# 3. Compile for additional speedup
model = torch.compile(model)

# 4. Benchmark
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
start = time.time()
for _ in range(10):
    model.generate(**inputs, max_new_tokens=100)
print(f"Avg time: {(time.time() - start) / 10:.2f}s")
```

## Best practices

1. **Start with 4-bit**: Best quality/size tradeoff for most models
2. **Use group_size=64**: Good balance; use smaller groups for extreme quantization
3. **Choose the backend wisely**: Marlin for 4-bit on Ampere+, TorchAO for flexibility
4. **Verify quality**: Always test generation quality after quantization
5. **Mixed precision**: Keep attention at higher precision, compress MLP layers more
6. **PEFT training**: Use LoRA r=16-32 for good fine-tuning results
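
The group size in practice sets the metadata overhead: each group stores its own scale and zero-point alongside the packed weights. A back-of-the-envelope estimator (illustrative only; exact layouts depend on the backend and on whether scales/zeros are themselves quantized):

```python
def hqq_bits_per_weight(nbits, group_size, meta_bits=16):
    """Approximate storage per weight: packed bits plus a per-group
    scale and zero-point (meta_bits each, amortized over the group)."""
    return nbits + 2 * meta_bits / group_size

def model_gb(n_params, nbits, group_size):
    """Approximate weight memory in GB for n_params parameters."""
    return n_params * hqq_bits_per_weight(nbits, group_size) / 8 / 1e9

# An 8B-parameter model at different settings
print(f"fp16:             {8e9 * 16 / 8 / 1e9:.1f} GB")
print(f"4-bit, group 64:  {model_gb(8e9, 4, 64):.1f} GB")
print(f"2-bit, group 32:  {model_gb(8e9, 2, 32):.1f} GB")
```

Halving the group size halves the compression benefit of dropping a bit, which is why group_size=64 is a sensible default and smaller groups are reserved for aggressive low-bit settings.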

## Common issues

**Out of memory during quantization:**
```python
# Load and quantize layers sequentially to reduce peak memory
from transformers import AutoModelForCausalLM, HqqConfig

config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=config,
    device_map="sequential"  # fill devices in order instead of balancing
)
```

**Slow inference:**
```python
# Patch in an optimized backend
from hqq.utils.patching import prepare_for_inference
prepare_for_inference(model, backend="marlin")  # requires Ampere+ GPU

# Or compile
import torch
model = torch.compile(model, mode="reduce-overhead")
```

**Poor quality at 2-bit:**
```python
# Use a smaller group size
from hqq.core.quantize import BaseQuantizeConfig

config = BaseQuantizeConfig(
    nbits=2,
    group_size=16,  # smaller groups help at low bit-widths
    axis=1
)
```
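
Why smaller groups help: each group gets its own scale and zero-point, so the quantization grid adapts to the local weight range. A self-contained simulation of per-group 2-bit affine round-to-nearest quantization (plain NumPy, not the hqq kernels) makes the effect visible:

```python
import numpy as np

def quantize_dequantize(w, nbits, group_size):
    """Per-group affine quantize/dequantize (round-to-nearest)."""
    levels = 2**nbits - 1
    g = w.reshape(-1, group_size)
    lo = g.min(axis=1, keepdims=True)
    hi = g.max(axis=1, keepdims=True)
    scale = (hi - lo) / levels
    q = np.round((g - lo) / scale)            # integer codes in [0, levels]
    return (q * scale + lo).reshape(w.shape)  # dequantized weights

rng = np.random.default_rng(0)
w = rng.normal(size=4096).astype(np.float32)

for group_size in (64, 16):
    err = np.mean((w - quantize_dequantize(w, 2, group_size)) ** 2)
    print(f"group_size={group_size}: MSE={err:.4f}")
```

With only 4 levels available at 2-bit, the reconstruction error drops noticeably when moving from group_size=64 to group_size=16, at the cost of more scale/zero metadata.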

## References

- **[Advanced Usage](references/advanced-usage.md)** - Custom backends, mixed precision, optimization
- **[Troubleshooting](references/troubleshooting.md)** - Common issues, debugging, benchmarks

## Resources

- **Repository**: https://github.com/mobiusml/hqq
- **Paper**: Half-Quadratic Quantization
- **HuggingFace Models**: https://huggingface.co/mobiuslabsgmbh
- **Version**: 0.2.0+
- **License**: Apache 2.0