@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,450 @@
---
name: gptq
description: Post-training 4-bit quantization for LLMs with minimal accuracy loss. Use for deploying large models (70B, 405B) on consumer GPUs, when you need 4× memory reduction with <2% perplexity degradation, or for faster inference (3-4× speedup) vs FP16. Integrates with transformers and PEFT for QLoRA fine-tuning.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Optimization, GPTQ, Quantization, 4-Bit, Post-Training, Memory Optimization, Consumer GPUs, Fast Inference, QLoRA, Group-Wise Quantization]
dependencies: [auto-gptq, transformers, optimum, peft]
---

# GPTQ (Generative Pre-trained Transformer Quantization)

Post-training quantization method that compresses LLMs to 4-bit with minimal accuracy loss using group-wise quantization.

## When to use GPTQ

**Use GPTQ when:**
- Need to fit large models (70B+) on limited GPU memory
- Want 4× memory reduction with <2% accuracy loss
- Deploying on consumer GPUs (RTX 4090, 3090)
- Need faster inference (3-4× speedup vs FP16)

**Use AWQ instead when:**
- Need slightly better accuracy (<1% loss)
- Have newer GPUs (Ampere, Ada)
- Want Marlin kernel support (2× faster on some GPUs)

**Use bitsandbytes instead when:**
- Need simple integration with transformers
- Want 8-bit quantization (less compression, better quality)
- Don't need pre-quantized model files

## Quick start

### Installation

```bash
# Install AutoGPTQ
pip install auto-gptq

# With Triton (Linux only, faster); quoted so zsh doesn't expand the brackets
pip install "auto-gptq[triton]"

# With CUDA extensions (faster)
pip install auto-gptq --no-build-isolation

# Full installation
pip install auto-gptq transformers accelerate
```

### Load pre-quantized model

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

# Load quantized model from HuggingFace
model_name = "TheBloke/Llama-2-7B-Chat-GPTQ"

model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    device="cuda:0",
    use_triton=False  # Set True on Linux for speed
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Generate
prompt = "Explain quantum computing"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda:0")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0]))
```

### Quantize your own model

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
from datasets import load_dataset

# Load model
model_name = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Quantization config
quantize_config = BaseQuantizeConfig(
    bits=4,            # 4-bit quantization
    group_size=128,    # Group size (recommended: 128)
    desc_act=False,    # Activation order (False for CUDA kernel)
    damp_percent=0.01  # Dampening factor
)

# Load model for quantization
model = AutoGPTQForCausalLM.from_pretrained(
    model_name,
    quantize_config=quantize_config
)

# Prepare calibration data: AutoGPTQ expects dicts with input_ids/attention_mask
dataset = load_dataset("allenai/c4", "en", split="train", streaming=True)
calibration_data = [
    tokenizer(example["text"], return_tensors="pt", truncation=True, max_length=512)
    for example in dataset.take(128)
]

# Quantize
model.quantize(calibration_data)

# Save quantized model
model.save_quantized("llama-2-7b-gptq")
tokenizer.save_pretrained("llama-2-7b-gptq")

# Push to HuggingFace
model.push_to_hub("username/llama-2-7b-gptq")
```

## Group-wise quantization

**How GPTQ works**:
1. **Group weights**: Divide each weight matrix into groups (typically 128 elements)
2. **Quantize per-group**: Each group has its own scale/zero-point
3. **Minimize error**: Uses Hessian information to minimize quantization error
4. **Result**: 4-bit weights with near-FP16 accuracy

**Group size trade-off**:

| Group Size | Model Size | Accuracy | Speed | Recommendation |
|------------|------------|----------|-------|----------------|
| -1 (per-column) | Smallest | Lowest | Fastest | Rarely useful |
| 32 | Largest | Best | Slowest | High accuracy needed |
| **128** | Medium | Good | **Fast** | **Recommended default** |
| 256 | Smaller | Lower | Faster | Speed critical |
| 1024 | Smallest | Lowest | Fastest | Not recommended |

Smaller groups store more scales/zero-points (a slightly larger quantized file) but track the weights more closely, giving better accuracy.

**Example**:
```
Weight matrix: [1024, 4096] = 4.2M elements

Group size = 128:
- Groups: 4.2M / 128 = 32,768 groups
- Each group: own scale + zero-point (stored at higher precision)
- Result: Better granularity → better accuracy
```
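
The grouping above can be sketched numerically. This is a minimal round-to-nearest sketch of per-group asymmetric quantization on a small random matrix (dimensions are illustrative); it deliberately omits GPTQ's Hessian-based error compensation, which is what separates GPTQ from plain rounding:

```python
import random

random.seed(0)
rows, cols, group_size, bits = 64, 512, 128, 4
qmax = 2 ** bits - 1  # 4-bit -> integer levels 0..15
W = [[random.gauss(0, 1) for _ in range(cols)] for _ in range(rows)]

def quantize_row(row):
    """Round-to-nearest per group: each group of 128 weights gets its own
    scale and zero-point; returns the dequantized (reconstructed) row."""
    out = []
    for g in range(0, len(row), group_size):
        group = row[g:g + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax or 1.0  # guard against constant groups
        zero = round(-lo / scale)        # integer zero-point
        q = [min(qmax, max(0, round(w / scale + zero))) for w in group]
        out.extend((v - zero) * scale for v in q)  # dequantize
    return out

W_hat = [quantize_row(row) for row in W]
n_groups = rows * cols // group_size
err = sum(abs(w - wh) for row, rh in zip(W, W_hat)
          for w, wh in zip(row, rh)) / (rows * cols)
print(f"{n_groups} groups, mean |W - W_hat| = {err:.3f}")
```

Shrinking `group_size` in the sketch lowers the reconstruction error, at the cost of storing proportionally more scales and zero-points.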

## Quantization configurations

### Standard 4-bit (recommended)

```python
from auto_gptq import BaseQuantizeConfig

config = BaseQuantizeConfig(
    bits=4,            # 4-bit quantization
    group_size=128,    # Standard group size
    desc_act=False,    # Faster CUDA kernel
    damp_percent=0.01  # Dampening factor
)
```

**Performance**:
- Memory: 4× reduction (70B model: 140GB → 35GB)
- Accuracy: ~1.5% perplexity increase
- Speed: 3-4× faster than FP16
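
The 140GB → 35GB figure is simple arithmetic. A sketch, assuming each 128-weight group stores an FP16 scale plus a 4-bit zero-point (the exact metadata layout varies by kernel):

```python
params = 70e9                              # 70B parameters
bits, group_size = 4, 128
meta_bits = 16 + 4                         # FP16 scale + 4-bit zero-point per group
eff_bits = bits + meta_bits / group_size   # ~4.16 bits per weight, metadata included
fp16_gb = params * 16 / 8 / 1e9
gptq_gb = params * eff_bits / 8 / 1e9
print(f"{fp16_gb:.0f} GB FP16 -> {gptq_gb:.1f} GB GPTQ ({fp16_gb / gptq_gb:.2f}x)")
```

This lands just under the nominal 4× because the per-group metadata is not free.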
165
+
166
### Maximum compression (3-bit)

```python
config = BaseQuantizeConfig(
    bits=3,            # 3-bit (more compression, lower accuracy)
    group_size=128,    # Keep standard group size
    desc_act=True,     # Recovers some accuracy (slower)
    damp_percent=0.01
)
```

**Trade-off**:
- Memory: ~5× reduction
- Accuracy: ~3% perplexity increase
- Speed: up to 5× faster (but less accurate)

### Maximum accuracy (4-bit with small groups)

```python
config = BaseQuantizeConfig(
    bits=4,
    group_size=32,      # Smaller groups (better accuracy)
    desc_act=True,      # Activation reordering
    damp_percent=0.005  # Lower dampening
)
```

**Trade-off**:
- Memory: 3.5× reduction (slightly larger)
- Accuracy: ~0.8% perplexity increase (best)
- Speed: 2-3× faster (kernel overhead)
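
For convenience, the three configurations above can be collected into a small preset table. The preset names and helper are ours, not part of auto_gptq; the kwargs mirror the configs shown.

```python
# Hypothetical preset table mirroring the three configs above (names are ours)
GPTQ_PRESETS = {
    "default":      dict(bits=4, group_size=128, desc_act=False, damp_percent=0.01),
    "max_compress": dict(bits=3, group_size=128, desc_act=True,  damp_percent=0.01),
    "max_accuracy": dict(bits=4, group_size=32,  desc_act=True,  damp_percent=0.005),
}

def make_config_kwargs(preset, **overrides):
    """Merge a named preset with overrides; pass the result to BaseQuantizeConfig(**kwargs)."""
    return {**GPTQ_PRESETS[preset], **overrides}

print(make_config_kwargs("max_accuracy", damp_percent=0.01))
```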

## Kernel backends

### ExLlamaV2 (default, fastest)

```python
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    device="cuda:0",
    use_exllama=True,  # Use ExLlamaV2
    exllama_config={"version": 2}
)
```

**Performance**: 1.5-2× faster than Triton

### Marlin (Ampere+ GPUs)

```python
# Quantize with Marlin format
config = BaseQuantizeConfig(
    bits=4,
    group_size=128,
    desc_act=False  # Required for Marlin
)

model.quantize(calibration_data, use_marlin=True)

# Load with Marlin
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    device="cuda:0",
    use_marlin=True  # 2× faster on A100/H100
)
```

**Requirements**:
- NVIDIA Ampere or newer (A100, H100, RTX 40xx)
- Compute capability ≥ 8.0

### Triton (Linux only)

```python
model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    device="cuda:0",
    use_triton=True  # Linux only
)
```

**Performance**: 1.2-1.5× faster than the plain CUDA backend
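
The backend guidance above can be summarized in a small selection helper. This is purely illustrative (the function and its return values are ours); in practice you pass the corresponding `use_marlin`/`use_exllama`/`use_triton` flags to `from_quantized`.

```python
def pick_backend(compute_capability, on_linux=True, exllama_available=True):
    """Illustrative kernel choice following the guidance above."""
    major, _minor = compute_capability
    if major >= 8:           # Ampere or newer: Marlin is ~2x faster
        return "marlin"
    if exllama_available:    # ExLlamaV2 is the fastest portable default
        return "exllamav2"
    if on_linux:             # Triton beats the plain CUDA kernel, but is Linux-only
        return "triton"
    return "cuda"

print(pick_backend((8, 0)))                           # A100
print(pick_backend((7, 5), exllama_available=False))  # T4, ExLlamaV2 unavailable
```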

## Integration with transformers

### Direct transformers usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load quantized model (transformers auto-detects GPTQ)
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-13B-Chat-GPTQ",
    device_map="auto",
    trust_remote_code=False
)

tokenizer = AutoTokenizer.from_pretrained("TheBloke/Llama-2-13B-Chat-GPTQ")

# Use like any transformers model
inputs = tokenizer("Hello", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=100)
```

### QLoRA fine-tuning (GPTQ + LoRA)

```python
from transformers import AutoModelForCausalLM
from peft import prepare_model_for_kbit_training, LoraConfig, get_peft_model

# Load GPTQ model
model = AutoModelForCausalLM.from_pretrained(
    "TheBloke/Llama-2-7B-GPTQ",
    device_map="auto"
)

# Prepare for LoRA training
model = prepare_model_for_kbit_training(model)

# LoRA config
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Add LoRA adapters
model = get_peft_model(model, lora_config)

# Fine-tune (memory efficient!)
# 70B model trainable on a single A100 80GB
```
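
A quick estimate shows why this is memory-efficient. With the config above (r=16, targeting `q_proj` and `v_proj`) and Llama-2-7B shapes (32 layers, hidden size 4096; these dimensions are our assumption for illustration), the trainable parameter count is tiny relative to the frozen base:

```python
def lora_trainable_params(n_layers, d_model, r, n_target_modules=2):
    """Each adapted linear layer adds A (d_model x r) and B (r x d_model)."""
    per_module = 2 * d_model * r
    return n_layers * n_target_modules * per_module

n = lora_trainable_params(n_layers=32, d_model=4096, r=16)
print(f"Trainable LoRA params: {n / 1e6:.1f}M")  # vs billions of frozen base weights
```

Only the ~8M adapter weights need optimizer state and gradients; the quantized base model stays frozen in 4-bit.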

## Performance benchmarks

### Memory reduction

| Model | FP16 | GPTQ 4-bit | Reduction |
|-------|------|------------|-----------|
| Llama 2-7B | 14 GB | 3.5 GB | 4× |
| Llama 2-13B | 26 GB | 6.5 GB | 4× |
| Llama 2-70B | 140 GB | 35 GB | 4× |
| Llama 3.1-405B | 810 GB | 203 GB | 4× |

**Enables**:
- 70B on a single A100 80GB (vs 2× A100 needed for FP16)
- 405B on 3× A100 80GB (vs 11× A100 needed for FP16)
- 13B on RTX 4090 24GB (vs OOM with FP16)
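
These bullets follow from dividing the weight footprint by per-GPU memory (this counts weights only; KV cache and activations need extra headroom in practice):

```python
import math

def gpus_needed(model_gb, gpu_gb=80):
    """Minimum A100-80GB count just to hold the weights (no KV-cache headroom)."""
    return math.ceil(model_gb / gpu_gb)

for name, fp16_gb, q4_gb in [("70B", 140, 35), ("405B", 810, 203)]:
    print(f"{name}: FP16 {gpus_needed(fp16_gb)} GPUs -> GPTQ 4-bit {gpus_needed(q4_gb)} GPUs")
```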

### Inference speed (Llama 2-7B, A100)

| Precision | Tokens/sec | vs FP16 |
|-----------|------------|---------|
| FP16 | 25 tok/s | 1× |
| GPTQ 4-bit (CUDA) | 85 tok/s | 3.4× |
| GPTQ 4-bit (ExLlama) | 105 tok/s | 4.2× |
| GPTQ 4-bit (Marlin) | 120 tok/s | 4.8× |

### Accuracy (perplexity on WikiText-2)

| Model | FP16 | GPTQ 4-bit (g=128) | Degradation |
|-------|------|---------------------|-------------|
| Llama 2-7B | 5.47 | 5.55 | +1.5% |
| Llama 2-13B | 4.88 | 4.95 | +1.4% |
| Llama 2-70B | 3.32 | 3.38 | +1.8% |

**Excellent quality preservation**: less than 2% degradation.
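
The degradation column is just the relative perplexity increase; a quick check reproduces it from the FP16 and GPTQ columns:

```python
rows = [("Llama 2-7B", 5.47, 5.55), ("Llama 2-13B", 4.88, 4.95), ("Llama 2-70B", 3.32, 3.38)]
for name, fp16_ppl, gptq_ppl in rows:
    pct = (gptq_ppl - fp16_ppl) / fp16_ppl * 100  # relative perplexity increase
    print(f"{name}: +{pct:.1f}%")
```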

## Common patterns

### Multi-GPU deployment

```python
# Automatic device mapping
model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-70B-GPTQ",
    device_map="auto",                 # Automatically split across GPUs
    max_memory={0: "40GB", 1: "40GB"}  # Limit per GPU
)

# Manual device mapping (each module must be listed explicitly;
# range shorthand like "model.layers.0-39" is not supported)
device_map = {
    "model.embed_tokens": 0,
    **{f"model.layers.{i}": 0 for i in range(40)},      # First 40 layers on GPU 0
    **{f"model.layers.{i}": 1 for i in range(40, 80)},  # Last 40 layers on GPU 1
    "model.norm": 1,
    "lm_head": 1,
}

model = AutoGPTQForCausalLM.from_quantized(
    model_name,
    device_map=device_map
)
```

### CPU offloading

```python
# Offload some layers to CPU (for very large models)
model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-405B-GPTQ",  # illustrative repo id
    device_map="auto",
    max_memory={
        0: "80GB",      # GPU 0
        1: "80GB",      # GPU 1
        2: "80GB",      # GPU 2
        "cpu": "200GB"  # Offload overflow to CPU
    }
)
```

### Batch inference

```python
# Process multiple prompts efficiently
prompts = [
    "Explain AI",
    "Explain ML",
    "Explain DL"
]

# Decoder-only models need a pad token (and left padding) for batched generation
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")

outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    pad_token_id=tokenizer.eos_token_id
)

for i, output in enumerate(outputs):
    print(f"Prompt {i}: {tokenizer.decode(output, skip_special_tokens=True)}")
```

## Finding pre-quantized models

**TheBloke on HuggingFace**:
- https://huggingface.co/TheBloke
- 1000+ models in GPTQ format
- Multiple group sizes (32, 128)
- Both CUDA and Marlin formats

**Search**: browse https://huggingface.co/models?library=gptq to find GPTQ models on HuggingFace.

**Download**:
```python
from auto_gptq import AutoGPTQForCausalLM

# Automatically downloads from HuggingFace
model = AutoGPTQForCausalLM.from_quantized(
    "TheBloke/Llama-2-70B-Chat-GPTQ",
    device="cuda:0"
)
```

## Supported models

- **LLaMA family**: Llama 2, Llama 3, Code Llama
- **Mistral**: Mistral 7B, Mixtral 8x7B, 8x22B
- **Qwen**: Qwen, Qwen2, QwQ
- **DeepSeek**: V2, V3
- **Phi**: Phi-2, Phi-3
- **Yi, Falcon, BLOOM, OPT**
- **100+ model architectures** in total

## References

- **[Calibration Guide](references/calibration.md)** - Dataset selection, quantization process, quality optimization
- **[Integration Guide](references/integration.md)** - Transformers, PEFT, vLLM, TensorRT-LLM
- **[Troubleshooting](references/troubleshooting.md)** - Common issues, performance optimization

## Resources

- **GitHub**: https://github.com/AutoGPTQ/AutoGPTQ
- **Paper**: GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers (arXiv:2210.17323)
- **Models**: https://huggingface.co/models?library=gptq
- **Discord**: https://discord.gg/autogptq