@synsci/cli-darwin-x64 1.1.49

Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,301 @@
+ # verl API Reference
+
+ ## Core Classes
+
+ ### RayPPOTrainer
+
+ The central controller for the training loop. Manages resource allocation and coordinates worker groups.
+
+ ```python
+ from verl import RayPPOTrainer
+
+ trainer = RayPPOTrainer(
+     config=config,
+     resource_pool_manager=resource_manager,
+     ray_worker_group_cls=RayWorkerGroup,
+ )
+ trainer.init_workers()
+ trainer.fit()
+ ```
+
+ ### ResourcePoolManager
+
+ Manages GPU allocation across different worker groups using Ray PlacementGroups.
+
+ ```python
+ from verl.trainer.ppo.resource_pool import ResourcePoolManager
+
+ manager = ResourcePoolManager(
+     resource_pool_spec={
+         "actor_rollout_ref": {"gpu": 4},
+         "critic": {"gpu": 2},
+     }
+ )
+ ```
+
+ ### RayWorkerGroup
+
+ Abstraction for distributed method execution. Spawns Ray actors and dispatches method calls.
+
+ ```python
+ from verl.trainer.ppo.ray_worker_group import RayWorkerGroup
+
+ worker_group = RayWorkerGroup(
+     num_workers=8,
+     worker_cls=ActorRolloutRefWorker,
+     resource_pool=pool,
+ )
+ ```
+
+ ### ActorRolloutRefWorker
+
+ Worker class implementing policy training, generation, and reference model computations. Manages hybrid engine mode switching.
+
+ ```python
+ # Typically configured via YAML, not instantiated directly
+ # See configuration section below
+ ```
+
+ ### RolloutReplica
+
+ Interface for inference backends with implementations for vLLM, SGLang, TensorRT-LLM, and HuggingFace.
+
+ ```python
+ from verl.workers.rollout import RolloutReplica
+
+ # Backend selection happens in the YAML config, shown here as comments:
+ # rollout:
+ #   name: vllm  # or: sglang, hf, tensorrt-llm
+ ```
+
+ ## Configuration Schema
+
+ ### PPO Configuration (`verl/trainer/config/ppo_trainer.yaml`)
+
+ ```yaml
+ # Data configuration
+ data:
+   train_files: /path/to/train.parquet
+   val_files: /path/to/val.parquet
+   train_batch_size: 256  # Global batch size of prompts
+   max_prompt_length: 512
+   max_response_length: 2048
+
+ # Algorithm configuration
+ algorithm:
+   adv_estimator: gae  # gae, grpo, rloo, reinforce_plus_plus
+   gamma: 0.99  # Discount factor
+   lam: 0.95  # GAE lambda
+   use_kl_in_reward: false  # Add KL term to reward
+
+ # Actor configuration
+ actor_rollout_ref:
+   model:
+     path: Qwen/Qwen2.5-7B-Instruct
+     backend: fsdp  # fsdp, fsdp2, megatron
+   actor:
+     ppo_mini_batch_size: 64  # Mini-batch for actor updates
+     ppo_epochs: 1  # Number of actor update epochs
+     clip_ratio: 0.2  # PPO clip range
+     use_kl_loss: true  # Use KL loss in actor
+     kl_loss_coef: 0.001  # KL loss coefficient
+     kl_loss_type: low_var  # KL divergence calculation method
+     loss_agg_mode: token-mean  # token-mean or sequence-mean
+     gradient_checkpointing: true
+     max_grad_norm: 1.0  # Gradient clipping
+     lr: 1e-6  # Learning rate
+   rollout:
+     name: vllm  # vllm, sglang, hf
+     n: 8  # Samples per prompt
+     temperature: 0.7
+     top_p: 0.95
+     log_prob_micro_batch_size: 8
+
+ # Critic configuration (PPO only)
+ critic:
+   model:
+     path: Qwen/Qwen2.5-7B-Instruct
+   ppo_mini_batch_size: 64
+   ppo_epochs: 1  # Defaults to actor epochs
+
+ # Trainer configuration
+ trainer:
+   total_epochs: 3
+   n_gpus_per_node: 8
+   nnodes: 1
+   save_freq: 100
+   experiment_name: my_experiment
+   async_weight_update: false
+ ```
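
The `gamma` and `lam` settings in the PPO configuration parameterize generalized advantage estimation (GAE). As a rough illustration only, and not verl's actual implementation, GAE for a single trajectory can be sketched in plain Python. The function name `gae_advantages` and the assumption that the value after the final step is zero are choices made for this example.

```python
def gae_advantages(rewards, values, gamma=0.99, lam=0.95):
    """Compute GAE advantages for one trajectory (illustrative sketch).

    rewards: per-step rewards r_0 .. r_{T-1}
    values:  value estimates V(s_0) .. V(s_{T-1}); the value after the
             final step is assumed to be 0 (episode ends there)
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the accumulated future advantage.
    for t in reversed(range(len(rewards))):
        next_value = values[t + 1] if t + 1 < len(values) else 0.0
        delta = rewards[t] + gamma * next_value - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


if __name__ == "__main__":
    # Sparse terminal reward with a zero critic, using the config defaults.
    print(gae_advantages([0.0, 0.0, 1.0], [0.0, 0.0, 0.0]))
```

With `gamma = lam = 1.0`, as in the GRPO fragment that follows, every step before a terminal reward receives the same undiscounted advantage.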
+
+ ### GRPO Configuration (`docs/algo/grpo.md`)
+
+ ```yaml
+ algorithm:
+   adv_estimator: grpo  # Enable GRPO
+   gamma: 1.0
+   lam: 1.0
+
+ actor_rollout_ref:
+   rollout:
+     n: 8  # Must be > 1 for GRPO
+   actor:
+     use_kl_loss: true  # Required for GRPO
+     kl_loss_coef: 0.001
+     kl_loss_type: low_var  # or: k1, k2, k3
+     loss_agg_mode: token-mean
+ ```
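
The requirement that `rollout.n` be greater than 1 follows from how GRPO scores responses relative to the other samples for the same prompt. A minimal sketch of that group-relative normalization, assuming a hypothetical helper `grpo_advantages` and using the population standard deviation for simplicity (real implementations may use the sample standard deviation), might look like this:

```python
from statistics import mean, pstdev


def grpo_advantages(group_rewards, eps=1e-6):
    """Normalize rewards within one prompt's group of sampled responses.

    Illustrative only: each response's advantage is its reward minus the
    group mean, divided by the group standard deviation (plus eps).
    """
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]


if __name__ == "__main__":
    # Four samples for one prompt, two of them correct.
    print(grpo_advantages([1.0, 0.0, 1.0, 0.0]))
    # A single sample carries no learning signal: its advantage is zero.
    print(grpo_advantages([1.0]))
```

With one sample per prompt the group mean equals the reward itself, so the advantage collapses to zero, which is why a group size of at least 2 is needed.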
+
+ ### Multi-Turn Configuration (`verl/trainer/config/rollout/rollout.yaml`)
+
+ ```yaml
+ actor_rollout_ref:
+   rollout:
+     name: sglang  # Required for multi-turn
+     multi_turn:
+       enable: true
+       tool_config_path: /path/to/tools.yaml
+       interaction_config_path: /path/to/interaction.yaml
+ ```
+
+ ## Reward Functions
+
+ ### Built-in Reward Types
+
+ ```yaml
+ # Model-based reward
+ reward_model:
+   path: OpenRLHF/Llama-3-8b-rm-700k
+
+ # Custom function-based reward
+ custom_reward_function:
+   path: /path/to/reward.py
+   name: compute_score  # Function name, default: compute_score
+ ```
+
+ ### Custom Reward Function Signature
+
+ ```python
+ # reward.py
+ def compute_score(responses: list[str], ground_truths: list[str], **kwargs) -> list[float]:
+     """
+     Compute rewards for a batch of responses.
+
+     Args:
+         responses: Generated completions
+         ground_truths: Expected answers from data
+         **kwargs: Additional metadata
+
+     Returns:
+         List of reward scores (floats)
+     """
+     rewards = []
+     for response, gt in zip(responses, ground_truths):
+         # Your reward logic; correct() is a user-supplied check, not defined here
+         score = 1.0 if correct(response, gt) else 0.0
+         rewards.append(score)
+     return rewards
+ ```
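
To make the signature above concrete, here is one possible self-contained `reward.py`, offered as a sketch rather than a recommended scorer. It follows the batch signature shown in this reference, which may differ from what an installed verl version expects. The helper `extract_final_answer` and the "last number in the response" rule are assumptions introduced for illustration.

```python
import re
from typing import Optional


def extract_final_answer(text: str) -> Optional[str]:
    """Hypothetical helper: return the last number in the text, or None."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    return matches[-1] if matches else None


def compute_score(responses: list[str], ground_truths: list[str], **kwargs) -> list[float]:
    """Score 1.0 when the last number in a response equals the ground truth."""
    rewards = []
    for response, gt in zip(responses, ground_truths):
        answer = extract_final_answer(response)
        matched = answer is not None and answer == gt.strip()
        rewards.append(1.0 if matched else 0.0)
    return rewards


if __name__ == "__main__":
    print(compute_score(["The answer is 42", "I think 7"], ["42", "8"]))
```

Exact string matching is deliberately strict here; a real scorer would likely normalize numeric formats before comparing.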
+
+ ## Backend-Specific Configuration
+
+ ### FSDP Configuration
+
+ ```yaml
+ actor_rollout_ref:
+   actor:
+     strategy: fsdp
+     fsdp_config:
+       mixed_precision: bf16
+       sharding_strategy: FULL_SHARD
+       offload_policy: false
+ ```
+
+ ### FSDP2 Configuration
+
+ ```yaml
+ actor_rollout_ref:
+   actor:
+     strategy: fsdp2
+     fsdp_config:
+       offload_policy: true  # CPU offloading
+       reshard_after_forward: true
+ ```
+
225
+ ### Megatron Configuration
226
+
227
+ ```yaml
228
+ actor_rollout_ref:
229
+ model:
230
+ backend: megatron
231
+ actor:
232
+ strategy: megatron
233
+ tensor_model_parallel_size: 8
234
+ pipeline_model_parallel_size: 2
235
+ megatron:
236
+ use_mbridge: true # Required for format conversion
237
+ ```
238
+
239
+ ### vLLM Rollout Configuration
240
+
241
+ ```yaml
242
+ actor_rollout_ref:
243
+ rollout:
244
+ name: vllm
245
+ tensor_parallel_size: 2
246
+ gpu_memory_utilization: 0.9
247
+ max_num_seqs: 256
248
+ enforce_eager: false
249
+ ```
250
+
251
+ ### SGLang Rollout Configuration
252
+
253
+ ```yaml
254
+ actor_rollout_ref:
255
+ rollout:
256
+ name: sglang
257
+ tp_size: 2
258
+ mem_fraction_static: 0.8
259
+ context_length: 8192
260
+ ```
261
+
262
+ ## Algorithm Reference
263
+
264
+ | Algorithm | `adv_estimator` | Requires Critic | Best For |
265
+ |-----------|-----------------|-----------------|----------|
266
+ | PPO | `gae` | Yes | Dense rewards, value estimation |
267
+ | GRPO | `grpo` | No | Sparse rewards, math/reasoning |
268
+ | RLOO | `rloo` | No | Leave-one-out baseline |
269
+ | REINFORCE++ | `reinforce_plus_plus` | No | Variance reduction |
270
+ | DAPO | `dapo` | No | Decoupled clipping and dynamic sampling |
271
+
272
+ ## Vision-Language Model Support
273
+
274
+ ```yaml
275
+ actor_rollout_ref:
276
+ model:
277
+ path: Qwen/Qwen2.5-VL-7B-Instruct
278
+ rollout:
279
+ name: vllm
280
+ enable_vision: true
281
+ max_model_len: 32768
282
+ ```
283
+
284
+ ## LoRA Configuration
285
+
286
+ ```yaml
287
+ actor_rollout_ref:
288
+ actor:
289
+ lora:
290
+ enabled: true
291
+ r: 16
292
+ alpha: 32
293
+ target_modules: ["q_proj", "v_proj", "k_proj", "o_proj"]
294
+ dropout: 0.05
295
+ ```
296
+
297
+ ## Resources
298
+
299
+ - Documentation: https://verl.readthedocs.io/
300
+ - GitHub: https://github.com/volcengine/verl
301
+ - Paper: https://arxiv.org/abs/2409.19256 (HybridFlow)
@@ -0,0 +1,391 @@
1
+ # verl Troubleshooting Guide
2
+
3
+ ## Common Issues and Solutions
4
+
5
+ ### OOM (Out of Memory) Issues
6
+
7
+ #### Issue: OOM During Rollout
8
+
9
+ **Symptoms**: CUDA out of memory during generation phase
10
+
11
+ **Solutions**:
12
+
13
+ 1. **Reduce log prob batch size**:
14
+ ```yaml
15
+ actor_rollout_ref:
16
+ rollout:
17
+ log_prob_micro_batch_size: 4 # Reduce from 8
18
+ ```
19
+
20
+ 2. **Enable gradient checkpointing**:
21
+ ```yaml
22
+ actor_rollout_ref:
23
+ actor:
24
+ gradient_checkpointing: true
25
+ ```
26
+
27
+ 3. **Use FSDP2 with CPU offloading**:
28
+ ```yaml
29
+ actor_rollout_ref:
30
+ actor:
31
+ strategy: fsdp2
32
+ fsdp_config:
33
+ offload_policy: true
34
+ ```
35
+
36
+ 4. **Reduce vLLM memory utilization**:
37
+ ```yaml
38
+ actor_rollout_ref:
39
+ rollout:
40
+ gpu_memory_utilization: 0.7 # Reduce from 0.9
41
+ ```
42
+
43
+ #### Issue: OOM During Training
44
+
45
+ **Symptoms**: CUDA OOM in backward pass
46
+
47
+ **Solutions**:
48
+
49
+ 1. **Reduce batch sizes**:
50
+ ```yaml
51
+ actor_rollout_ref:
52
+ actor:
53
+ ppo_mini_batch_size: 32 # Reduce from 64
54
+ ```
55
+
56
+ 2. **Use gradient accumulation**:
57
+ ```yaml
58
+ actor_rollout_ref:
59
+ actor:
60
+ gradient_accumulation_steps: 4
61
+ ```
62
+
63
+ 3. **Enable mixed precision**:
64
+ ```yaml
65
+ actor_rollout_ref:
66
+ actor:
67
+ fsdp_config:
68
+ mixed_precision: bf16
69
+ ```
70
+
71
+ ### Training Stability Issues
72
+
73
+ #### Issue: Training Instability / Loss Spikes
74
+
75
+ **Symptoms**: Loss spikes, reward collapse, divergence
76
+
77
+ **Solutions**:
78
+
79
+ 1. **Reduce learning rate**:
80
+ ```yaml
81
+ actor_rollout_ref:
82
+ actor:
83
+ lr: 5e-7 # Reduce from 1e-6
84
+ ```
85
+
86
+ 2. **Increase KL penalty**:
87
+ ```yaml
88
+ actor_rollout_ref:
89
+ actor:
90
+ kl_loss_coef: 0.01 # Increase from 0.001
91
+ ```
92
+
93
+ 3. **Enable gradient clipping**:
94
+ ```yaml
95
+ actor_rollout_ref:
96
+ actor:
97
+ max_grad_norm: 1.0
98
+ ```
99
+
100
+ 4. **Use smaller PPO clip range**:
101
+ ```yaml
102
+ actor_rollout_ref:
103
+ actor:
104
+ clip_ratio: 0.1 # Reduce from 0.2
105
+ ```
106
+
107
+ #### Issue: Policy Collapse (Entropy Drops to Zero)
108
+
109
+ **Symptoms**: Model outputs become deterministic, entropy approaches zero
110
+
111
+ **Solutions**:
112
+
113
+ 1. **Increase temperature during rollout**:
114
+ ```yaml
115
+ actor_rollout_ref:
116
+ rollout:
117
+ temperature: 0.9 # Increase from 0.7
118
+ ```
119
+
120
+ 2. **Add entropy bonus**:
121
+ ```yaml
122
+ algorithm:
123
+ entropy_coef: 0.01
124
+ ```
125
+
126
+ 3. **Reduce KL penalty**:
127
+ ```yaml
128
+ actor_rollout_ref:
129
+ actor:
130
+ kl_loss_coef: 0.0001 # Reduce
131
+ ```
132
+
133
+ ### Weight Synchronization Issues
134
+
135
+ #### Issue: Slow Weight Sync
136
+
137
+ **Symptoms**: Long pauses between rollout and training phases
138
+
139
+ **Solutions**:
140
+
141
+ 1. **Use FSDP2 for faster resharding**:
142
+ ```yaml
143
+ actor_rollout_ref:
144
+ actor:
145
+ strategy: fsdp2
146
+ ```
147
+
148
+ 2. **Enable async weight transfer**:
149
+ ```yaml
150
+ trainer:
151
+ async_weight_update: true
152
+ ```
153
+
154
+ 3. **Reduce sync frequency**:
155
+ ```yaml
156
+ trainer:
157
+ weight_sync_interval: 2 # Sync every 2 steps
158
+ ```
159
+
160
+ #### Issue: Weight Sync Timeout
161
+
162
+ **Symptoms**: Ray actor timeouts during weight synchronization
163
+
164
+ **Solutions**:
165
+
166
+ 1. **Increase Ray timeout**:
167
+ ```python
168
+ import ray
169
+ ray.init(num_gpus=8) # ray.init() does not accept a `timeout` argument; set timeouts per call, e.g. ray.get(ref, timeout=3600)
170
+ ```
171
+
172
+ 2. **Use colocated mode** (if memory allows):
173
+ ```yaml
174
+ trainer:
175
+ colocate_actor_ref: true
176
+ ```
177
+
178
+ ### vLLM Version Issues
179
+
180
+ #### Issue: vLLM Import Errors or Generation Failures
181
+
182
+ **Symptoms**: Import errors, generation hangs, incorrect outputs
183
+
184
+ **Solutions**:
185
+
186
+ 1. **Use compatible vLLM version**:
187
+ ```bash
188
+ pip install "vllm>=0.8.2,<=0.12.0" # Quote the specifier so the shell does not treat > and < as redirections
189
+ # Avoid vLLM 0.7.x (known bugs)
190
+ ```
191
+
192
+ 2. **For vLLM 0.8.x issues**:
193
+ ```yaml
194
+ actor_rollout_ref:
195
+ rollout:
196
+ enforce_eager: true # Disable CUDA graphs
197
+ ```
198
+
199
+ 3. **Check CUDA version compatibility**:
200
+ ```bash
201
+ # vLLM 0.11+ requires CUDA 12.1+
202
+ nvidia-smi # Shows the driver version and the highest CUDA version that driver supports
203
+ ```
204
+
205
+ ### Ray Issues
206
+
207
+ #### Issue: Ray Cluster Connection Failures
208
+
209
+ **Symptoms**: Cannot connect to Ray cluster
210
+
211
+ **Solutions**:
212
+
213
+ 1. **Check Ray head node**:
214
+ ```bash
215
+ ray status
216
+ ```
217
+
218
+ 2. **Restart Ray cluster**:
219
+ ```bash
220
+ ray stop
221
+ ray start --head --port=6379 --num-gpus=8
222
+ ```
223
+
224
+ 3. **Verify network connectivity**:
225
+ ```bash
226
+ ping head_node_ip
227
+ ```
228
+
229
+ #### Issue: Ray Actor OOM
230
+
231
+ **Symptoms**: Ray actors killed due to OOM
232
+
233
+ **Solutions**:
234
+
235
+ 1. **Increase Ray object store memory**:
236
+ ```bash
237
+ ray start --head --object-store-memory=10000000000 # 10GB
238
+ ```
239
+
240
+ 2. **Enable spilling to disk**:
241
+ ```bash
242
+ export RAY_object_spilling_config='{"type":"filesystem","params":{"directory_path":"/tmp/ray_spill"}}'
243
+ ```
244
+
245
+ ### Multi-Node Issues
246
+
247
+ #### Issue: NCCL Timeout
248
+
249
+ **Symptoms**: NCCL operations timeout on multi-node
250
+
251
+ **Solutions**:
252
+
253
+ 1. **Set NCCL environment variables**:
254
+ ```bash
255
+ export NCCL_DEBUG=INFO
256
+ export NCCL_SOCKET_IFNAME=eth0
257
+ export NCCL_IB_DISABLE=0 # Enable InfiniBand if available
258
+ ```
259
+
260
+ 2. **Increase NCCL timeout**:
261
+ ```bash
262
+ export NCCL_TIMEOUT=1800 # 30 minutes
263
+ ```
264
+
265
+ 3. **Check network interface**:
266
+ ```bash
267
+ ifconfig # Verify correct interface
268
+ ```
269
+
270
+ #### Issue: DeepSpeed GPU Index Out of Range
271
+
272
+ **Symptoms**: "GPU index out of range" error with DeepSpeed
273
+
274
+ **Solutions**:
275
+
276
+ ```bash
277
+ export RAY_EXPERIMENTAL_NOSET_CUDA_VISIBLE_DEVICES=1
278
+ ```
279
+
280
+ ### Data Issues
281
+
282
+ #### Issue: Empty Batches
283
+
284
+ **Symptoms**: Training receives empty batches
285
+
286
+ **Solutions**:
287
+
288
+ 1. **Verify data format**:
289
+ ```python
290
+ import pandas as pd
291
+ df = pd.read_parquet("train.parquet")
292
+ print(df.columns) # Should include 'prompt', 'reward_model'
293
+ ```
294
+
295
+ 2. **Check data loading**:
296
+ ```yaml
297
+ data:
298
+ train_files: /absolute/path/to/train.parquet # Use absolute path
299
+ ```
300
+
301
+ #### Issue: Tokenization Errors
302
+
303
+ **Symptoms**: Tokenizer errors, sequence length mismatches
304
+
305
+ **Solutions**:
306
+
307
+ 1. **Set padding token**:
308
+ ```python
309
+ tokenizer.pad_token = tokenizer.eos_token
310
+ ```
311
+
312
+ 2. **Verify max length configuration**:
313
+ ```yaml
314
+ data:
315
+ max_prompt_length: 512
316
+ max_response_length: 2048
317
+ # max_prompt_length + max_response_length must not exceed the model's maximum context length
318
+ ```
319
+
320
+ ### Megatron-Specific Issues
321
+
322
+ #### Issue: Megatron Checkpoint Loading Fails
323
+
324
+ **Symptoms**: Cannot load Megatron checkpoints
325
+
326
+ **Solutions**:
327
+
328
+ 1. **Enable mbridge conversion**:
329
+ ```yaml
330
+ actor_rollout_ref:
331
+ actor:
332
+ megatron:
333
+ use_mbridge: true
334
+ ```
335
+
336
+ 2. **Convert HuggingFace to Megatron format**:
337
+ ```bash
338
+ python tools/convert_hf_to_megatron.py \
339
+ --hf_model_path /path/to/hf/model \
340
+ --save_path /path/to/megatron/checkpoint
341
+ ```
342
+
343
+ #### Issue: Megatron on AMD GPUs
344
+
345
+ **Current Limitation**: The Megatron-LM backend is not supported on AMD GPUs. Use the FSDP backend instead:
346
+
347
+ ```yaml
348
+ actor_rollout_ref:
349
+ model:
350
+ backend: fsdp
351
+ ```
352
+
353
+ ### Debugging Tips
354
+
355
+ #### Enable Verbose Logging
356
+
357
+ ```yaml
358
+ trainer:
359
+ logging_level: DEBUG
360
+ ```
361
+
362
+ ```bash
363
+ export VERL_DEBUG=1
364
+ export RAY_DEDUP_LOGS=0
365
+ ```
366
+
367
+ #### Check GPU Utilization
368
+
369
+ ```bash
370
+ watch -n 1 nvidia-smi
371
+ ```
372
+
373
+ #### Profile Training
374
+
375
+ ```python
376
+ # Add profiling to training loop
377
+ import torch.profiler
378
+
379
+ with torch.profiler.profile(
380
+ activities=[torch.profiler.ProfilerActivity.CPU, torch.profiler.ProfilerActivity.CUDA],
381
+ record_shapes=True,
382
+ ) as prof:
383
+ trainer.fit()
384
+ prof.export_chrome_trace("trace.json")
385
+ ```
386
+
387
+ ## Resources
388
+
389
+ - GitHub Issues: https://github.com/volcengine/verl/issues
390
+ - Documentation: https://verl.readthedocs.io/
391
+ - Community Slack: verl-project