@synsci/cli-darwin-x64 1.1.49

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (373)
  1. package/bin/skills/accelerate/SKILL.md +332 -0
  2. package/bin/skills/accelerate/references/custom-plugins.md +453 -0
  3. package/bin/skills/accelerate/references/megatron-integration.md +489 -0
  4. package/bin/skills/accelerate/references/performance.md +525 -0
  5. package/bin/skills/audiocraft/SKILL.md +564 -0
  6. package/bin/skills/audiocraft/references/advanced-usage.md +666 -0
  7. package/bin/skills/audiocraft/references/troubleshooting.md +504 -0
  8. package/bin/skills/autogpt/SKILL.md +403 -0
  9. package/bin/skills/autogpt/references/advanced-usage.md +535 -0
  10. package/bin/skills/autogpt/references/troubleshooting.md +420 -0
  11. package/bin/skills/awq/SKILL.md +310 -0
  12. package/bin/skills/awq/references/advanced-usage.md +324 -0
  13. package/bin/skills/awq/references/troubleshooting.md +344 -0
  14. package/bin/skills/axolotl/SKILL.md +158 -0
  15. package/bin/skills/axolotl/references/api.md +5548 -0
  16. package/bin/skills/axolotl/references/dataset-formats.md +1029 -0
  17. package/bin/skills/axolotl/references/index.md +15 -0
  18. package/bin/skills/axolotl/references/other.md +3563 -0
  19. package/bin/skills/bigcode-evaluation-harness/SKILL.md +405 -0
  20. package/bin/skills/bigcode-evaluation-harness/references/benchmarks.md +393 -0
  21. package/bin/skills/bigcode-evaluation-harness/references/custom-tasks.md +424 -0
  22. package/bin/skills/bigcode-evaluation-harness/references/issues.md +394 -0
  23. package/bin/skills/bitsandbytes/SKILL.md +411 -0
  24. package/bin/skills/bitsandbytes/references/memory-optimization.md +521 -0
  25. package/bin/skills/bitsandbytes/references/qlora-training.md +521 -0
  26. package/bin/skills/bitsandbytes/references/quantization-formats.md +447 -0
  27. package/bin/skills/blip-2/SKILL.md +564 -0
  28. package/bin/skills/blip-2/references/advanced-usage.md +680 -0
  29. package/bin/skills/blip-2/references/troubleshooting.md +526 -0
  30. package/bin/skills/chroma/SKILL.md +406 -0
  31. package/bin/skills/chroma/references/integration.md +38 -0
  32. package/bin/skills/clip/SKILL.md +253 -0
  33. package/bin/skills/clip/references/applications.md +207 -0
  34. package/bin/skills/constitutional-ai/SKILL.md +290 -0
  35. package/bin/skills/crewai/SKILL.md +498 -0
  36. package/bin/skills/crewai/references/flows.md +438 -0
  37. package/bin/skills/crewai/references/tools.md +429 -0
  38. package/bin/skills/crewai/references/troubleshooting.md +480 -0
  39. package/bin/skills/deepspeed/SKILL.md +141 -0
  40. package/bin/skills/deepspeed/references/08.md +17 -0
  41. package/bin/skills/deepspeed/references/09.md +173 -0
  42. package/bin/skills/deepspeed/references/2020.md +378 -0
  43. package/bin/skills/deepspeed/references/2023.md +279 -0
  44. package/bin/skills/deepspeed/references/assets.md +179 -0
  45. package/bin/skills/deepspeed/references/index.md +35 -0
  46. package/bin/skills/deepspeed/references/mii.md +118 -0
  47. package/bin/skills/deepspeed/references/other.md +1191 -0
  48. package/bin/skills/deepspeed/references/tutorials.md +6554 -0
  49. package/bin/skills/dspy/SKILL.md +590 -0
  50. package/bin/skills/dspy/references/examples.md +663 -0
  51. package/bin/skills/dspy/references/modules.md +475 -0
  52. package/bin/skills/dspy/references/optimizers.md +566 -0
  53. package/bin/skills/faiss/SKILL.md +221 -0
  54. package/bin/skills/faiss/references/index_types.md +280 -0
  55. package/bin/skills/flash-attention/SKILL.md +367 -0
  56. package/bin/skills/flash-attention/references/benchmarks.md +215 -0
  57. package/bin/skills/flash-attention/references/transformers-integration.md +293 -0
  58. package/bin/skills/gguf/SKILL.md +427 -0
  59. package/bin/skills/gguf/references/advanced-usage.md +504 -0
  60. package/bin/skills/gguf/references/troubleshooting.md +442 -0
  61. package/bin/skills/gptq/SKILL.md +450 -0
  62. package/bin/skills/gptq/references/calibration.md +337 -0
  63. package/bin/skills/gptq/references/integration.md +129 -0
  64. package/bin/skills/gptq/references/troubleshooting.md +95 -0
  65. package/bin/skills/grpo-rl-training/README.md +97 -0
  66. package/bin/skills/grpo-rl-training/SKILL.md +572 -0
  67. package/bin/skills/grpo-rl-training/examples/reward_functions_library.py +393 -0
  68. package/bin/skills/grpo-rl-training/templates/basic_grpo_training.py +228 -0
  69. package/bin/skills/guidance/SKILL.md +572 -0
  70. package/bin/skills/guidance/references/backends.md +554 -0
  71. package/bin/skills/guidance/references/constraints.md +674 -0
  72. package/bin/skills/guidance/references/examples.md +767 -0
  73. package/bin/skills/hqq/SKILL.md +445 -0
  74. package/bin/skills/hqq/references/advanced-usage.md +528 -0
  75. package/bin/skills/hqq/references/troubleshooting.md +503 -0
  76. package/bin/skills/hugging-face-cli/SKILL.md +191 -0
  77. package/bin/skills/hugging-face-cli/references/commands.md +954 -0
  78. package/bin/skills/hugging-face-cli/references/examples.md +374 -0
  79. package/bin/skills/hugging-face-datasets/SKILL.md +547 -0
  80. package/bin/skills/hugging-face-datasets/examples/diverse_training_examples.json +239 -0
  81. package/bin/skills/hugging-face-datasets/examples/system_prompt_template.txt +196 -0
  82. package/bin/skills/hugging-face-datasets/examples/training_examples.json +176 -0
  83. package/bin/skills/hugging-face-datasets/scripts/dataset_manager.py +522 -0
  84. package/bin/skills/hugging-face-datasets/scripts/sql_manager.py +844 -0
  85. package/bin/skills/hugging-face-datasets/templates/chat.json +55 -0
  86. package/bin/skills/hugging-face-datasets/templates/classification.json +62 -0
  87. package/bin/skills/hugging-face-datasets/templates/completion.json +51 -0
  88. package/bin/skills/hugging-face-datasets/templates/custom.json +75 -0
  89. package/bin/skills/hugging-face-datasets/templates/qa.json +54 -0
  90. package/bin/skills/hugging-face-datasets/templates/tabular.json +81 -0
  91. package/bin/skills/hugging-face-evaluation/SKILL.md +656 -0
  92. package/bin/skills/hugging-face-evaluation/examples/USAGE_EXAMPLES.md +382 -0
  93. package/bin/skills/hugging-face-evaluation/examples/artificial_analysis_to_hub.py +141 -0
  94. package/bin/skills/hugging-face-evaluation/examples/example_readme_tables.md +135 -0
  95. package/bin/skills/hugging-face-evaluation/examples/metric_mapping.json +50 -0
  96. package/bin/skills/hugging-face-evaluation/requirements.txt +20 -0
  97. package/bin/skills/hugging-face-evaluation/scripts/evaluation_manager.py +1374 -0
  98. package/bin/skills/hugging-face-evaluation/scripts/inspect_eval_uv.py +104 -0
  99. package/bin/skills/hugging-face-evaluation/scripts/inspect_vllm_uv.py +317 -0
  100. package/bin/skills/hugging-face-evaluation/scripts/lighteval_vllm_uv.py +303 -0
  101. package/bin/skills/hugging-face-evaluation/scripts/run_eval_job.py +98 -0
  102. package/bin/skills/hugging-face-evaluation/scripts/run_vllm_eval_job.py +331 -0
  103. package/bin/skills/hugging-face-evaluation/scripts/test_extraction.py +206 -0
  104. package/bin/skills/hugging-face-jobs/SKILL.md +1041 -0
  105. package/bin/skills/hugging-face-jobs/index.html +216 -0
  106. package/bin/skills/hugging-face-jobs/references/hardware_guide.md +336 -0
  107. package/bin/skills/hugging-face-jobs/references/hub_saving.md +352 -0
  108. package/bin/skills/hugging-face-jobs/references/token_usage.md +546 -0
  109. package/bin/skills/hugging-face-jobs/references/troubleshooting.md +475 -0
  110. package/bin/skills/hugging-face-jobs/scripts/cot-self-instruct.py +718 -0
  111. package/bin/skills/hugging-face-jobs/scripts/finepdfs-stats.py +546 -0
  112. package/bin/skills/hugging-face-jobs/scripts/generate-responses.py +587 -0
  113. package/bin/skills/hugging-face-model-trainer/SKILL.md +711 -0
  114. package/bin/skills/hugging-face-model-trainer/references/gguf_conversion.md +296 -0
  115. package/bin/skills/hugging-face-model-trainer/references/hardware_guide.md +283 -0
  116. package/bin/skills/hugging-face-model-trainer/references/hub_saving.md +364 -0
  117. package/bin/skills/hugging-face-model-trainer/references/reliability_principles.md +371 -0
  118. package/bin/skills/hugging-face-model-trainer/references/trackio_guide.md +189 -0
  119. package/bin/skills/hugging-face-model-trainer/references/training_methods.md +150 -0
  120. package/bin/skills/hugging-face-model-trainer/references/training_patterns.md +203 -0
  121. package/bin/skills/hugging-face-model-trainer/references/troubleshooting.md +282 -0
  122. package/bin/skills/hugging-face-model-trainer/scripts/convert_to_gguf.py +424 -0
  123. package/bin/skills/hugging-face-model-trainer/scripts/dataset_inspector.py +417 -0
  124. package/bin/skills/hugging-face-model-trainer/scripts/estimate_cost.py +150 -0
  125. package/bin/skills/hugging-face-model-trainer/scripts/train_dpo_example.py +106 -0
  126. package/bin/skills/hugging-face-model-trainer/scripts/train_grpo_example.py +89 -0
  127. package/bin/skills/hugging-face-model-trainer/scripts/train_sft_example.py +122 -0
  128. package/bin/skills/hugging-face-paper-publisher/SKILL.md +627 -0
  129. package/bin/skills/hugging-face-paper-publisher/examples/example_usage.md +327 -0
  130. package/bin/skills/hugging-face-paper-publisher/references/quick_reference.md +216 -0
  131. package/bin/skills/hugging-face-paper-publisher/scripts/paper_manager.py +508 -0
  132. package/bin/skills/hugging-face-paper-publisher/templates/arxiv.md +299 -0
  133. package/bin/skills/hugging-face-paper-publisher/templates/ml-report.md +358 -0
  134. package/bin/skills/hugging-face-paper-publisher/templates/modern.md +319 -0
  135. package/bin/skills/hugging-face-paper-publisher/templates/standard.md +201 -0
  136. package/bin/skills/hugging-face-tool-builder/SKILL.md +115 -0
  137. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.py +57 -0
  138. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.sh +40 -0
  139. package/bin/skills/hugging-face-tool-builder/references/baseline_hf_api.tsx +57 -0
  140. package/bin/skills/hugging-face-tool-builder/references/find_models_by_paper.sh +230 -0
  141. package/bin/skills/hugging-face-tool-builder/references/hf_enrich_models.sh +96 -0
  142. package/bin/skills/hugging-face-tool-builder/references/hf_model_card_frontmatter.sh +188 -0
  143. package/bin/skills/hugging-face-tool-builder/references/hf_model_papers_auth.sh +171 -0
  144. package/bin/skills/hugging-face-trackio/SKILL.md +65 -0
  145. package/bin/skills/hugging-face-trackio/references/logging_metrics.md +206 -0
  146. package/bin/skills/hugging-face-trackio/references/retrieving_metrics.md +223 -0
  147. package/bin/skills/huggingface-tokenizers/SKILL.md +516 -0
  148. package/bin/skills/huggingface-tokenizers/references/algorithms.md +653 -0
  149. package/bin/skills/huggingface-tokenizers/references/integration.md +637 -0
  150. package/bin/skills/huggingface-tokenizers/references/pipeline.md +723 -0
  151. package/bin/skills/huggingface-tokenizers/references/training.md +565 -0
  152. package/bin/skills/instructor/SKILL.md +740 -0
  153. package/bin/skills/instructor/references/examples.md +107 -0
  154. package/bin/skills/instructor/references/providers.md +70 -0
  155. package/bin/skills/instructor/references/validation.md +606 -0
  156. package/bin/skills/knowledge-distillation/SKILL.md +458 -0
  157. package/bin/skills/knowledge-distillation/references/minillm.md +334 -0
  158. package/bin/skills/lambda-labs/SKILL.md +545 -0
  159. package/bin/skills/lambda-labs/references/advanced-usage.md +611 -0
  160. package/bin/skills/lambda-labs/references/troubleshooting.md +530 -0
  161. package/bin/skills/langchain/SKILL.md +480 -0
  162. package/bin/skills/langchain/references/agents.md +499 -0
  163. package/bin/skills/langchain/references/integration.md +562 -0
  164. package/bin/skills/langchain/references/rag.md +600 -0
  165. package/bin/skills/langsmith/SKILL.md +422 -0
  166. package/bin/skills/langsmith/references/advanced-usage.md +548 -0
  167. package/bin/skills/langsmith/references/troubleshooting.md +537 -0
  168. package/bin/skills/litgpt/SKILL.md +469 -0
  169. package/bin/skills/litgpt/references/custom-models.md +568 -0
  170. package/bin/skills/litgpt/references/distributed-training.md +451 -0
  171. package/bin/skills/litgpt/references/supported-models.md +336 -0
  172. package/bin/skills/litgpt/references/training-recipes.md +619 -0
  173. package/bin/skills/llama-cpp/SKILL.md +258 -0
  174. package/bin/skills/llama-cpp/references/optimization.md +89 -0
  175. package/bin/skills/llama-cpp/references/quantization.md +213 -0
  176. package/bin/skills/llama-cpp/references/server.md +125 -0
  177. package/bin/skills/llama-factory/SKILL.md +80 -0
  178. package/bin/skills/llama-factory/references/_images.md +23 -0
  179. package/bin/skills/llama-factory/references/advanced.md +1055 -0
  180. package/bin/skills/llama-factory/references/getting_started.md +349 -0
  181. package/bin/skills/llama-factory/references/index.md +19 -0
  182. package/bin/skills/llama-factory/references/other.md +31 -0
  183. package/bin/skills/llamaguard/SKILL.md +337 -0
  184. package/bin/skills/llamaindex/SKILL.md +569 -0
  185. package/bin/skills/llamaindex/references/agents.md +83 -0
  186. package/bin/skills/llamaindex/references/data_connectors.md +108 -0
  187. package/bin/skills/llamaindex/references/query_engines.md +406 -0
  188. package/bin/skills/llava/SKILL.md +304 -0
  189. package/bin/skills/llava/references/training.md +197 -0
  190. package/bin/skills/lm-evaluation-harness/SKILL.md +490 -0
  191. package/bin/skills/lm-evaluation-harness/references/api-evaluation.md +490 -0
  192. package/bin/skills/lm-evaluation-harness/references/benchmark-guide.md +488 -0
  193. package/bin/skills/lm-evaluation-harness/references/custom-tasks.md +602 -0
  194. package/bin/skills/lm-evaluation-harness/references/distributed-eval.md +519 -0
  195. package/bin/skills/long-context/SKILL.md +536 -0
  196. package/bin/skills/long-context/references/extension_methods.md +468 -0
  197. package/bin/skills/long-context/references/fine_tuning.md +611 -0
  198. package/bin/skills/long-context/references/rope.md +402 -0
  199. package/bin/skills/mamba/SKILL.md +260 -0
  200. package/bin/skills/mamba/references/architecture-details.md +206 -0
  201. package/bin/skills/mamba/references/benchmarks.md +255 -0
  202. package/bin/skills/mamba/references/training-guide.md +388 -0
  203. package/bin/skills/megatron-core/SKILL.md +366 -0
  204. package/bin/skills/megatron-core/references/benchmarks.md +249 -0
  205. package/bin/skills/megatron-core/references/parallelism-guide.md +404 -0
  206. package/bin/skills/megatron-core/references/production-examples.md +473 -0
  207. package/bin/skills/megatron-core/references/training-recipes.md +547 -0
  208. package/bin/skills/miles/SKILL.md +315 -0
  209. package/bin/skills/miles/references/api-reference.md +141 -0
  210. package/bin/skills/miles/references/troubleshooting.md +352 -0
  211. package/bin/skills/mlflow/SKILL.md +704 -0
  212. package/bin/skills/mlflow/references/deployment.md +744 -0
  213. package/bin/skills/mlflow/references/model-registry.md +770 -0
  214. package/bin/skills/mlflow/references/tracking.md +680 -0
  215. package/bin/skills/modal/SKILL.md +341 -0
  216. package/bin/skills/modal/references/advanced-usage.md +503 -0
  217. package/bin/skills/modal/references/troubleshooting.md +494 -0
  218. package/bin/skills/model-merging/SKILL.md +539 -0
  219. package/bin/skills/model-merging/references/evaluation.md +462 -0
  220. package/bin/skills/model-merging/references/examples.md +428 -0
  221. package/bin/skills/model-merging/references/methods.md +352 -0
  222. package/bin/skills/model-pruning/SKILL.md +495 -0
  223. package/bin/skills/model-pruning/references/wanda.md +347 -0
  224. package/bin/skills/moe-training/SKILL.md +526 -0
  225. package/bin/skills/moe-training/references/architectures.md +432 -0
  226. package/bin/skills/moe-training/references/inference.md +348 -0
  227. package/bin/skills/moe-training/references/training.md +425 -0
  228. package/bin/skills/nanogpt/SKILL.md +290 -0
  229. package/bin/skills/nanogpt/references/architecture.md +382 -0
  230. package/bin/skills/nanogpt/references/data.md +476 -0
  231. package/bin/skills/nanogpt/references/training.md +564 -0
  232. package/bin/skills/nemo-curator/SKILL.md +383 -0
  233. package/bin/skills/nemo-curator/references/deduplication.md +87 -0
  234. package/bin/skills/nemo-curator/references/filtering.md +102 -0
  235. package/bin/skills/nemo-evaluator/SKILL.md +494 -0
  236. package/bin/skills/nemo-evaluator/references/adapter-system.md +340 -0
  237. package/bin/skills/nemo-evaluator/references/configuration.md +447 -0
  238. package/bin/skills/nemo-evaluator/references/custom-benchmarks.md +315 -0
  239. package/bin/skills/nemo-evaluator/references/execution-backends.md +361 -0
  240. package/bin/skills/nemo-guardrails/SKILL.md +297 -0
  241. package/bin/skills/nnsight/SKILL.md +436 -0
  242. package/bin/skills/nnsight/references/README.md +78 -0
  243. package/bin/skills/nnsight/references/api.md +344 -0
  244. package/bin/skills/nnsight/references/tutorials.md +300 -0
  245. package/bin/skills/openrlhf/SKILL.md +249 -0
  246. package/bin/skills/openrlhf/references/algorithm-comparison.md +404 -0
  247. package/bin/skills/openrlhf/references/custom-rewards.md +530 -0
  248. package/bin/skills/openrlhf/references/hybrid-engine.md +287 -0
  249. package/bin/skills/openrlhf/references/multi-node-training.md +454 -0
  250. package/bin/skills/outlines/SKILL.md +652 -0
  251. package/bin/skills/outlines/references/backends.md +615 -0
  252. package/bin/skills/outlines/references/examples.md +773 -0
  253. package/bin/skills/outlines/references/json_generation.md +652 -0
  254. package/bin/skills/peft/SKILL.md +431 -0
  255. package/bin/skills/peft/references/advanced-usage.md +514 -0
  256. package/bin/skills/peft/references/troubleshooting.md +480 -0
  257. package/bin/skills/phoenix/SKILL.md +475 -0
  258. package/bin/skills/phoenix/references/advanced-usage.md +619 -0
  259. package/bin/skills/phoenix/references/troubleshooting.md +538 -0
  260. package/bin/skills/pinecone/SKILL.md +358 -0
  261. package/bin/skills/pinecone/references/deployment.md +181 -0
  262. package/bin/skills/pytorch-fsdp/SKILL.md +126 -0
  263. package/bin/skills/pytorch-fsdp/references/index.md +7 -0
  264. package/bin/skills/pytorch-fsdp/references/other.md +4249 -0
  265. package/bin/skills/pytorch-lightning/SKILL.md +346 -0
  266. package/bin/skills/pytorch-lightning/references/callbacks.md +436 -0
  267. package/bin/skills/pytorch-lightning/references/distributed.md +490 -0
  268. package/bin/skills/pytorch-lightning/references/hyperparameter-tuning.md +556 -0
  269. package/bin/skills/pyvene/SKILL.md +473 -0
  270. package/bin/skills/pyvene/references/README.md +73 -0
  271. package/bin/skills/pyvene/references/api.md +383 -0
  272. package/bin/skills/pyvene/references/tutorials.md +376 -0
  273. package/bin/skills/qdrant/SKILL.md +493 -0
  274. package/bin/skills/qdrant/references/advanced-usage.md +648 -0
  275. package/bin/skills/qdrant/references/troubleshooting.md +631 -0
  276. package/bin/skills/ray-data/SKILL.md +326 -0
  277. package/bin/skills/ray-data/references/integration.md +82 -0
  278. package/bin/skills/ray-data/references/transformations.md +83 -0
  279. package/bin/skills/ray-train/SKILL.md +406 -0
  280. package/bin/skills/ray-train/references/multi-node.md +628 -0
  281. package/bin/skills/rwkv/SKILL.md +260 -0
  282. package/bin/skills/rwkv/references/architecture-details.md +344 -0
  283. package/bin/skills/rwkv/references/rwkv7.md +386 -0
  284. package/bin/skills/rwkv/references/state-management.md +369 -0
  285. package/bin/skills/saelens/SKILL.md +386 -0
  286. package/bin/skills/saelens/references/README.md +70 -0
  287. package/bin/skills/saelens/references/api.md +333 -0
  288. package/bin/skills/saelens/references/tutorials.md +318 -0
  289. package/bin/skills/segment-anything/SKILL.md +500 -0
  290. package/bin/skills/segment-anything/references/advanced-usage.md +589 -0
  291. package/bin/skills/segment-anything/references/troubleshooting.md +484 -0
  292. package/bin/skills/sentence-transformers/SKILL.md +255 -0
  293. package/bin/skills/sentence-transformers/references/models.md +123 -0
  294. package/bin/skills/sentencepiece/SKILL.md +235 -0
  295. package/bin/skills/sentencepiece/references/algorithms.md +200 -0
  296. package/bin/skills/sentencepiece/references/training.md +304 -0
  297. package/bin/skills/sglang/SKILL.md +442 -0
  298. package/bin/skills/sglang/references/deployment.md +490 -0
  299. package/bin/skills/sglang/references/radix-attention.md +413 -0
  300. package/bin/skills/sglang/references/structured-generation.md +541 -0
  301. package/bin/skills/simpo/SKILL.md +219 -0
  302. package/bin/skills/simpo/references/datasets.md +478 -0
  303. package/bin/skills/simpo/references/hyperparameters.md +452 -0
  304. package/bin/skills/simpo/references/loss-functions.md +350 -0
  305. package/bin/skills/skypilot/SKILL.md +509 -0
  306. package/bin/skills/skypilot/references/advanced-usage.md +491 -0
  307. package/bin/skills/skypilot/references/troubleshooting.md +570 -0
  308. package/bin/skills/slime/SKILL.md +464 -0
  309. package/bin/skills/slime/references/api-reference.md +392 -0
  310. package/bin/skills/slime/references/troubleshooting.md +386 -0
  311. package/bin/skills/speculative-decoding/SKILL.md +467 -0
  312. package/bin/skills/speculative-decoding/references/lookahead.md +309 -0
  313. package/bin/skills/speculative-decoding/references/medusa.md +350 -0
  314. package/bin/skills/stable-diffusion/SKILL.md +519 -0
  315. package/bin/skills/stable-diffusion/references/advanced-usage.md +716 -0
  316. package/bin/skills/stable-diffusion/references/troubleshooting.md +555 -0
  317. package/bin/skills/tensorboard/SKILL.md +629 -0
  318. package/bin/skills/tensorboard/references/integrations.md +638 -0
  319. package/bin/skills/tensorboard/references/profiling.md +545 -0
  320. package/bin/skills/tensorboard/references/visualization.md +620 -0
  321. package/bin/skills/tensorrt-llm/SKILL.md +187 -0
  322. package/bin/skills/tensorrt-llm/references/multi-gpu.md +298 -0
  323. package/bin/skills/tensorrt-llm/references/optimization.md +242 -0
  324. package/bin/skills/tensorrt-llm/references/serving.md +470 -0
  325. package/bin/skills/tinker/SKILL.md +362 -0
  326. package/bin/skills/tinker/references/api-reference.md +168 -0
  327. package/bin/skills/tinker/references/getting-started.md +157 -0
  328. package/bin/skills/tinker/references/loss-functions.md +163 -0
  329. package/bin/skills/tinker/references/models-and-lora.md +139 -0
  330. package/bin/skills/tinker/references/recipes.md +280 -0
  331. package/bin/skills/tinker/references/reinforcement-learning.md +212 -0
  332. package/bin/skills/tinker/references/rendering.md +243 -0
  333. package/bin/skills/tinker/references/supervised-learning.md +232 -0
  334. package/bin/skills/tinker-training-cost/SKILL.md +187 -0
  335. package/bin/skills/tinker-training-cost/scripts/calculate_cost.py +123 -0
  336. package/bin/skills/torchforge/SKILL.md +433 -0
  337. package/bin/skills/torchforge/references/api-reference.md +327 -0
  338. package/bin/skills/torchforge/references/troubleshooting.md +409 -0
  339. package/bin/skills/torchtitan/SKILL.md +358 -0
  340. package/bin/skills/torchtitan/references/checkpoint.md +181 -0
  341. package/bin/skills/torchtitan/references/custom-models.md +258 -0
  342. package/bin/skills/torchtitan/references/float8.md +133 -0
  343. package/bin/skills/torchtitan/references/fsdp.md +126 -0
  344. package/bin/skills/transformer-lens/SKILL.md +346 -0
  345. package/bin/skills/transformer-lens/references/README.md +54 -0
  346. package/bin/skills/transformer-lens/references/api.md +362 -0
  347. package/bin/skills/transformer-lens/references/tutorials.md +339 -0
  348. package/bin/skills/trl-fine-tuning/SKILL.md +455 -0
  349. package/bin/skills/trl-fine-tuning/references/dpo-variants.md +227 -0
  350. package/bin/skills/trl-fine-tuning/references/online-rl.md +82 -0
  351. package/bin/skills/trl-fine-tuning/references/reward-modeling.md +122 -0
  352. package/bin/skills/trl-fine-tuning/references/sft-training.md +168 -0
  353. package/bin/skills/unsloth/SKILL.md +80 -0
  354. package/bin/skills/unsloth/references/index.md +7 -0
  355. package/bin/skills/unsloth/references/llms-full.md +16799 -0
  356. package/bin/skills/unsloth/references/llms-txt.md +12044 -0
  357. package/bin/skills/unsloth/references/llms.md +82 -0
  358. package/bin/skills/verl/SKILL.md +391 -0
  359. package/bin/skills/verl/references/api-reference.md +301 -0
  360. package/bin/skills/verl/references/troubleshooting.md +391 -0
  361. package/bin/skills/vllm/SKILL.md +364 -0
  362. package/bin/skills/vllm/references/optimization.md +226 -0
  363. package/bin/skills/vllm/references/quantization.md +284 -0
  364. package/bin/skills/vllm/references/server-deployment.md +255 -0
  365. package/bin/skills/vllm/references/troubleshooting.md +447 -0
  366. package/bin/skills/weights-and-biases/SKILL.md +590 -0
  367. package/bin/skills/weights-and-biases/references/artifacts.md +584 -0
  368. package/bin/skills/weights-and-biases/references/integrations.md +700 -0
  369. package/bin/skills/weights-and-biases/references/sweeps.md +847 -0
  370. package/bin/skills/whisper/SKILL.md +317 -0
  371. package/bin/skills/whisper/references/languages.md +189 -0
  372. package/bin/synsc +0 -0
  373. package/package.json +10 -0
@@ -0,0 +1,406 @@
---
name: chroma
description: Open-source embedding database for AI applications. Store embeddings and metadata, perform vector and full-text search, filter by metadata. Simple 4-function API. Scales from notebooks to production clusters. Use for semantic search, RAG applications, or document retrieval. Best for local development and open-source projects.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [RAG, Chroma, Vector Database, Embeddings, Semantic Search, Open Source, Self-Hosted, Document Retrieval, Metadata Filtering]
dependencies: [chromadb, sentence-transformers]
---

# Chroma - Open-Source Embedding Database

The AI-native database for building LLM applications with memory.

## When to use Chroma

**Use Chroma when:**
- Building RAG (retrieval-augmented generation) applications
- Running a local or self-hosted vector database
- Choosing an open-source solution (Apache 2.0)
- Prototyping in notebooks
- Searching documents semantically
- Storing embeddings alongside metadata

**Metrics**:
- **24,300+ GitHub stars**
- **1,900+ forks**
- **v1.3.3** (stable, weekly releases)
- **Apache 2.0 license**

**Consider alternatives instead:**
- **Pinecone**: Managed cloud service with auto-scaling
- **FAISS**: Pure similarity search, no metadata support
- **Weaviate**: Production-grade ML-native database
- **Qdrant**: High-performance, Rust-based engine

## Quick start

### Installation

```bash
# Python
pip install chromadb

# JavaScript/TypeScript
npm install chromadb @chroma-core/default-embed
```

### Basic usage (Python)

```python
import chromadb

# Create client
client = chromadb.Client()

# Create collection
collection = client.create_collection(name="my_collection")

# Add documents
collection.add(
    documents=["This is document 1", "This is document 2"],
    metadatas=[{"source": "doc1"}, {"source": "doc2"}],
    ids=["id1", "id2"]
)

# Query
results = collection.query(
    query_texts=["document about topic"],
    n_results=2
)

print(results)
```

## Core operations

### 1. Create collection

```python
# Simple collection
collection = client.create_collection("my_docs")

# With custom embedding function
from chromadb.utils import embedding_functions

openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-key",
    model_name="text-embedding-3-small"
)

collection = client.create_collection(
    name="my_docs",
    embedding_function=openai_ef
)

# Get existing collection
collection = client.get_collection("my_docs")

# Delete collection
client.delete_collection("my_docs")
```

### 2. Add documents

```python
# Add documents with metadata (IDs are required and must be unique)
collection.add(
    documents=["Doc 1", "Doc 2", "Doc 3"],
    metadatas=[
        {"source": "web", "category": "tutorial"},
        {"source": "pdf", "page": 5},
        {"source": "api", "timestamp": "2025-01-01"}
    ],
    ids=["id1", "id2", "id3"]
)

# Add with precomputed embeddings
collection.add(
    embeddings=[[0.1, 0.2, ...], [0.3, 0.4, ...]],
    documents=["Doc 1", "Doc 2"],
    ids=["id1", "id2"]
)
```

### 3. Query (similarity search)

```python
# Basic query
results = collection.query(
    query_texts=["machine learning tutorial"],
    n_results=5
)

# Query with a metadata filter
results = collection.query(
    query_texts=["Python programming"],
    n_results=3,
    where={"source": "web"}
)

# Query with combined metadata filters
results = collection.query(
    query_texts=["advanced topics"],
    where={
        "$and": [
            {"category": "tutorial"},
            {"difficulty": {"$gte": 3}}
        ]
    }
)

# Access results (each field is a list of lists, one inner list per query)
print(results["documents"])  # Matching documents
print(results["metadatas"])  # Metadata for each document
print(results["distances"])  # Distances (lower means more similar)
print(results["ids"])        # Document IDs
```

### 4. Get documents

```python
# Get by IDs
docs = collection.get(
    ids=["id1", "id2"]
)

# Get with filters
docs = collection.get(
    where={"category": "tutorial"},
    limit=10
)

# Get all documents
docs = collection.get()
```

### 5. Update documents

```python
# Update document content
collection.update(
    ids=["id1"],
    documents=["Updated content"],
    metadatas=[{"source": "updated"}]
)
```

### 6. Delete documents

```python
# Delete by IDs
collection.delete(ids=["id1", "id2"])

# Delete with filter
collection.delete(
    where={"source": "outdated"}
)
```

## Persistent storage

```python
# Persist to disk
client = chromadb.PersistentClient(path="./chroma_db")

collection = client.create_collection("my_docs")
collection.add(documents=["Doc 1"], ids=["id1"])

# Data is persisted automatically;
# reload later with the same path
client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_collection("my_docs")
```

## Embedding functions

### Default (Sentence Transformers)

```python
# Uses sentence-transformers by default
collection = client.create_collection("my_docs")
# Default model: all-MiniLM-L6-v2
```

### OpenAI

```python
from chromadb.utils import embedding_functions

openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="your-key",
    model_name="text-embedding-3-small"
)

collection = client.create_collection(
    name="openai_docs",
    embedding_function=openai_ef
)
```

242
+ ### HuggingFace
243
+
244
+ ```python
245
+ huggingface_ef = embedding_functions.HuggingFaceEmbeddingFunction(
246
+ api_key="your-key",
247
+ model_name="sentence-transformers/all-mpnet-base-v2"
248
+ )
249
+
250
+ collection = client.create_collection(
251
+ name="hf_docs",
252
+ embedding_function=huggingface_ef
253
+ )
254
+ ```
255
+
256
+ ### Custom embedding function
257
+
258
+ ```python
259
+ from chromadb import Documents, EmbeddingFunction, Embeddings
260
+
261
+ class MyEmbeddingFunction(EmbeddingFunction):
262
+ def __call__(self, input: Documents) -> Embeddings:
263
+ # Your embedding logic
264
+ return embeddings
265
+
266
+ my_ef = MyEmbeddingFunction()
267
+ collection = client.create_collection(
268
+ name="custom_docs",
269
+ embedding_function=my_ef
270
+ )
271
+ ```
272

## Metadata filtering

```python
# Exact match
results = collection.query(
    query_texts=["query"],
    where={"category": "tutorial"}
)

# Comparison operators
results = collection.query(
    query_texts=["query"],
    where={"page": {"$gt": 10}}  # $gt, $gte, $lt, $lte, $ne
)

# Logical operators
results = collection.query(
    query_texts=["query"],
    where={
        "$and": [
            {"category": "tutorial"},
            {"difficulty": {"$lte": 3}}
        ]
    }  # Also: $or
)

# Membership: metadata value is one of the listed values
results = collection.query(
    query_texts=["query"],
    where={"tags": {"$in": ["python", "ml"]}}  # Also: $nin
)
```

## LangChain integration

```python
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Split documents
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000)
docs = text_splitter.split_documents(documents)

# Create Chroma vector store
vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db"
)

# Query
results = vectorstore.similarity_search("machine learning", k=3)

# As retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": 5})
```

## LlamaIndex integration

```python
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import VectorStoreIndex, StorageContext
import chromadb

# Initialize Chroma
db = chromadb.PersistentClient(path="./chroma_db")
collection = db.get_or_create_collection("my_collection")

# Create vector store
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create index
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context
)

# Query
query_engine = index.as_query_engine()
response = query_engine.query("What is machine learning?")
```

## Server mode

```python
# Run Chroma server
# Terminal: chroma run --path ./chroma_db --port 8000

# Connect to server
import chromadb
from chromadb.config import Settings

client = chromadb.HttpClient(
    host="localhost",
    port=8000,
    settings=Settings(anonymized_telemetry=False)
)

# Use as normal
collection = client.get_or_create_collection("my_docs")
```

## Best practices

1. **Use persistent client** - Don't lose data on restart
2. **Add metadata** - Enables filtering and tracking
3. **Batch operations** - Add multiple docs at once
4. **Choose right embedding model** - Balance speed/quality
5. **Use filters** - Narrow search space
6. **Unique IDs** - Avoid collisions
7. **Regular backups** - Copy chroma_db directory
8. **Monitor collection size** - Scale up if needed
9. **Test embedding functions** - Ensure quality
10. **Use server mode for production** - Better for multi-user

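Two of the practices above, batch operations and unique IDs, can be combined in a short dependency-free sketch. The `make_batches` helper and the choice of `uuid4` are illustrative assumptions, not part of the Chroma API:

```python
import uuid

def make_batches(documents, batch_size):
    """Attach a collision-free uuid4 ID to each document, then yield batches."""
    ids = [str(uuid.uuid4()) for _ in documents]
    for start in range(0, len(documents), batch_size):
        yield ids[start:start + batch_size], documents[start:start + batch_size]

docs = [f"Document {i}" for i in range(250)]
batches = list(make_batches(docs, batch_size=100))
print([len(chunk) for _, chunk in batches])  # [100, 100, 50]

# With a real collection (assumes a chromadb client is already set up):
# for ids, chunk in make_batches(docs, batch_size=100):
#     collection.add(ids=ids, documents=chunk)
```

One `add` call per batch amortizes round-trips and embedding computation compared to adding documents one at a time.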
## Performance

| Operation | Latency | Notes |
|-----------|---------|-------|
| Add 100 docs | ~1-3s | With embedding |
| Query (top 10) | ~50-200ms | Depends on collection size |
| Metadata filter | ~10-50ms | Fast with proper indexing |

## Resources

- **GitHub**: https://github.com/chroma-core/chroma ⭐ 24,300+
- **Docs**: https://docs.trychroma.com
- **Discord**: https://discord.gg/MMeYNTmh3x
- **Version**: 1.3.3+
- **License**: Apache 2.0


@@ -0,0 +1,38 @@
# Chroma Integration Guide

Integration with LangChain, LlamaIndex, and frameworks.

## LangChain

```python
from langchain_chroma import Chroma
from langchain_openai import OpenAIEmbeddings

vectorstore = Chroma.from_documents(
    documents=docs,
    embedding=OpenAIEmbeddings(),
    persist_directory="./chroma_db"
)

# Query
results = vectorstore.similarity_search("query", k=3)

# As retriever
retriever = vectorstore.as_retriever()
```

## LlamaIndex

```python
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

db = chromadb.PersistentClient(path="./chroma_db")
collection = db.get_or_create_collection("docs")

vector_store = ChromaVectorStore(chroma_collection=collection)
```

## Resources

- **Docs**: https://docs.trychroma.com
@@ -0,0 +1,253 @@
---
name: clip
description: OpenAI's model connecting vision and language. Enables zero-shot image classification, image-text matching, and cross-modal retrieval. Trained on 400M image-text pairs. Use for image search, content moderation, or vision-language tasks without fine-tuning. Best for general-purpose image understanding.
version: 1.0.0
author: Synthetic Sciences
license: MIT
tags: [Multimodal, CLIP, Vision-Language, Zero-Shot, Image Classification, OpenAI, Image Search, Cross-Modal Retrieval, Content Moderation]
dependencies: [transformers, torch, pillow]
---

# CLIP - Contrastive Language-Image Pre-Training

OpenAI's model that understands images from natural language.

## When to use CLIP

**Use when:**
- Zero-shot image classification (no training data needed)
- Image-text similarity/matching
- Semantic image search
- Content moderation (detect NSFW, violence)
- Visual question answering
- Cross-modal retrieval (image→text, text→image)

**Metrics**:
- **25,300+ GitHub stars**
- Trained on 400M image-text pairs
- Matches ResNet-50 on ImageNet (zero-shot)
- MIT License

**Use alternatives instead**:
- **BLIP-2**: Better captioning
- **LLaVA**: Vision-language chat
- **Segment Anything**: Image segmentation

## Quick start

### Installation

```bash
pip install git+https://github.com/openai/CLIP.git
pip install torch torchvision ftfy regex tqdm
```

### Zero-shot classification

```python
import torch
import clip
from PIL import Image

# Load model
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Load image
image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to(device)

# Define possible labels
text = clip.tokenize(["a dog", "a cat", "a bird", "a car"]).to(device)

# Compute similarity
with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)

    # Cosine similarity
    logits_per_image, logits_per_text = model(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

# Print results
labels = ["a dog", "a cat", "a bird", "a car"]
for label, prob in zip(labels, probs[0]):
    print(f"{label}: {prob:.2%}")
```

## Available models

```python
# Models (sorted by size)
models = [
    "RN50",      # ResNet-50
    "RN101",     # ResNet-101
    "ViT-B/32",  # Vision Transformer (recommended)
    "ViT-B/16",  # Better quality, slower
    "ViT-L/14",  # Best quality, slowest
]

model, preprocess = clip.load("ViT-B/32")
```

| Model | Parameters | Speed | Quality |
|-------|------------|-------|---------|
| RN50 | 102M | Fast | Good |
| ViT-B/32 | 151M | Medium | Better |
| ViT-L/14 | 428M | Slow | Best |

## Image-text similarity

```python
# Compute embeddings
image_features = model.encode_image(image)
text_features = model.encode_text(text)

# Normalize
image_features /= image_features.norm(dim=-1, keepdim=True)
text_features /= text_features.norm(dim=-1, keepdim=True)

# Cosine similarity (.item() assumes a single image/text pair)
similarity = (image_features @ text_features.T).item()
print(f"Similarity: {similarity:.4f}")
```
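Normalizing to unit length and then taking a dot product is exactly cosine similarity; a minimal framework-free sketch of the same computation:

```python
import math

def cosine(u, v):
    # Dot product divided by both vector norms (equivalent to unit-normalize, then dot)
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    return dot / (norm_u * norm_v)

print(cosine([1.0, 2.0], [2.0, 4.0]))  # ≈ 1.0 (parallel vectors)
print(cosine([1.0, 0.0], [0.0, 1.0]))  # 0.0 (orthogonal vectors)
```

This is why the in-place `norm` division above matters: once features are unit-length, the matrix product `image_features @ text_features.T` directly yields cosine similarities.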

## Semantic image search

```python
# Index images
image_paths = ["img1.jpg", "img2.jpg", "img3.jpg"]
image_embeddings = []

for img_path in image_paths:
    image = preprocess(Image.open(img_path)).unsqueeze(0).to(device)
    with torch.no_grad():
        embedding = model.encode_image(image)
        embedding /= embedding.norm(dim=-1, keepdim=True)
    image_embeddings.append(embedding)

image_embeddings = torch.cat(image_embeddings)

# Search with text query
query = "a sunset over the ocean"
text_input = clip.tokenize([query]).to(device)
with torch.no_grad():
    text_embedding = model.encode_text(text_input)
    text_embedding /= text_embedding.norm(dim=-1, keepdim=True)

# Find most similar images
similarities = (text_embedding @ image_embeddings.T).squeeze(0)
top_k = similarities.topk(3)

for idx, score in zip(top_k.indices, top_k.values):
    print(f"{image_paths[idx]}: {score:.3f}")
```

## Content moderation

```python
# Define categories
categories = [
    "safe for work",
    "not safe for work",
    "violent content",
    "graphic content"
]

text = clip.tokenize(categories).to(device)

# Check image
with torch.no_grad():
    logits_per_image, _ = model(image, text)
    probs = logits_per_image.softmax(dim=-1)

# Get classification
max_idx = probs.argmax().item()
max_prob = probs[0, max_idx].item()

print(f"Category: {categories[max_idx]} ({max_prob:.2%})")
```

## Batch processing

```python
# Process multiple images
images = [preprocess(Image.open(f"img{i}.jpg")) for i in range(10)]
images = torch.stack(images).to(device)

with torch.no_grad():
    image_features = model.encode_image(images)
    image_features /= image_features.norm(dim=-1, keepdim=True)

# Batch text
texts = ["a dog", "a cat", "a bird"]
text_tokens = clip.tokenize(texts).to(device)

with torch.no_grad():
    text_features = model.encode_text(text_tokens)
    text_features /= text_features.norm(dim=-1, keepdim=True)

# Similarity matrix (10 images × 3 texts)
similarities = image_features @ text_features.T
print(similarities.shape)  # (10, 3)
```

## Integration with vector databases

```python
# Store CLIP embeddings in Chroma/FAISS
import chromadb

client = chromadb.Client()
collection = client.create_collection("image_embeddings")

# Add image embeddings
for img_path, embedding in zip(image_paths, image_embeddings):
    collection.add(
        embeddings=[embedding.cpu().numpy().tolist()],
        metadatas=[{"path": img_path}],
        ids=[img_path]
    )

# Query with text
query = "a sunset"
with torch.no_grad():
    text_embedding = model.encode_text(clip.tokenize([query]).to(device))
results = collection.query(
    query_embeddings=[text_embedding.squeeze(0).cpu().numpy().tolist()],
    n_results=5
)
```

## Best practices

1. **Use ViT-B/32 for most cases** - Good balance
2. **Normalize embeddings** - Required for cosine similarity
3. **Batch processing** - More efficient
4. **Cache embeddings** - Expensive to recompute
5. **Use descriptive labels** - Better zero-shot performance
6. **GPU recommended** - 10-50× faster
7. **Preprocess images** - Use provided preprocess function

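Practice 4 (cache embeddings) can be sketched as a small disk cache. This is stdlib-only and illustrative: `compute` stands in for the real CLIP encode call, and the cache layout (pickle files keyed by path + mtime) is an assumption, not part of CLIP:

```python
import hashlib
import os
import pickle

def cached_embedding(path, compute, cache_dir=".clip_cache"):
    """Return the embedding for `path`, recomputing only when the file changes."""
    os.makedirs(cache_dir, exist_ok=True)
    stamp = f"{path}:{os.path.getmtime(path)}"    # invalidates on modification
    key = hashlib.sha256(stamp.encode()).hexdigest()
    cache_file = os.path.join(cache_dir, key + ".pkl")
    if os.path.exists(cache_file):
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    emb = compute(path)  # e.g. model.encode_image(...) on the preprocessed image
    with open(cache_file, "wb") as f:
        pickle.dump(emb, f)
    return emb
```

Re-indexing a large image folder then only pays the encoding cost for new or modified files.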
## Performance

| Operation | CPU | GPU (V100) |
|-----------|-----|------------|
| Image encoding | ~200ms | ~20ms |
| Text encoding | ~50ms | ~5ms |
| Similarity compute | <1ms | <1ms |

## Limitations

1. **Not for fine-grained tasks** - Best for broad categories
2. **Requires descriptive text** - Vague labels perform poorly
3. **Biased on web data** - May have dataset biases
4. **No bounding boxes** - Whole image only
5. **Limited spatial understanding** - Position/counting weak

## Resources

- **GitHub**: https://github.com/openai/CLIP ⭐ 25,300+
- **Paper**: https://arxiv.org/abs/2103.00020
- **Colab**: https://colab.research.google.com/github/openai/clip/
- **License**: MIT

