aigroup-workflow 2.2.0 → 2.2.2
This diff shows the changes between publicly released versions of the package as they appear in their public registries. It is provided for informational purposes only.
- package/.claude/commands/fix-build.md +10 -5
- package/.claude/commands/init-project.md +13 -8
- package/.claude/commands/plan.md +15 -8
- package/.claude/commands/review.md +12 -6
- package/.claude/commands/tdd.md +11 -5
- package/.claude/commands/workflow-start.md +20 -11
- package/.claude/settings.json +28 -0
- package/.codex/agents/architect.toml +207 -0
- package/.codex/agents/build-error-resolver.toml +110 -0
- package/.codex/agents/code-reviewer.toml +233 -0
- package/.codex/agents/doc-updater.toml +103 -0
- package/.codex/agents/e2e-runner.toml +103 -0
- package/.codex/agents/get-current-datetime.toml +23 -0
- package/.codex/agents/init-architect.toml +181 -0
- package/.codex/agents/planner.toml +208 -0
- package/.codex/agents/refactor-cleaner.toml +81 -0
- package/.codex/agents/rust-reviewer.toml +90 -0
- package/.codex/agents/security-reviewer.toml +104 -0
- package/.codex/agents/tdd-guide.toml +87 -0
- package/AGENTS.md +2 -2
- package/CLAUDE.md +23 -1
- package/LICENSE +20 -20
- package/README.md +333 -333
- package/agents/a11y-architect.md +141 -141
- package/agents/architect.md +211 -211
- package/agents/build-error-resolver.md +114 -114
- package/agents/chief-of-staff.md +151 -151
- package/agents/code-architect.md +71 -71
- package/agents/code-explorer.md +69 -69
- package/agents/code-reviewer.md +237 -237
- package/agents/code-simplifier.md +47 -47
- package/agents/comment-analyzer.md +45 -45
- package/agents/conversation-analyzer.md +52 -52
- package/agents/cpp-build-resolver.md +90 -90
- package/agents/cpp-reviewer.md +72 -72
- package/agents/csharp-reviewer.md +101 -101
- package/agents/dart-build-resolver.md +201 -201
- package/agents/database-reviewer.md +91 -91
- package/agents/doc-updater.md +107 -107
- package/agents/docs-lookup.md +68 -68
- package/agents/e2e-runner.md +107 -107
- package/agents/flutter-reviewer.md +243 -243
- package/agents/gan-evaluator.md +209 -209
- package/agents/gan-generator.md +131 -131
- package/agents/gan-planner.md +99 -99
- package/agents/get-current-datetime.md +26 -26
- package/agents/go-build-resolver.md +94 -94
- package/agents/go-reviewer.md +76 -76
- package/agents/harness-optimizer.md +35 -35
- package/agents/healthcare-reviewer.md +83 -83
- package/agents/java-build-resolver.md +153 -153
- package/agents/java-reviewer.md +92 -92
- package/agents/kotlin-build-resolver.md +118 -118
- package/agents/kotlin-reviewer.md +159 -159
- package/agents/loop-operator.md +36 -36
- package/agents/opensource-forker.md +198 -198
- package/agents/opensource-packager.md +249 -249
- package/agents/opensource-sanitizer.md +188 -188
- package/agents/performance-optimizer.md +446 -446
- package/agents/planner.md +212 -212
- package/agents/pr-test-analyzer.md +45 -45
- package/agents/python-reviewer.md +98 -98
- package/agents/pytorch-build-resolver.md +120 -120
- package/agents/refactor-cleaner.md +85 -85
- package/agents/rust-build-resolver.md +148 -148
- package/agents/rust-reviewer.md +94 -94
- package/agents/security-reviewer.md +108 -108
- package/agents/seo-specialist.md +59 -59
- package/agents/silent-failure-hunter.md +50 -50
- package/agents/tdd-guide.md +91 -91
- package/agents/type-design-analyzer.md +41 -41
- package/agents/typescript-reviewer.md +112 -112
- package/cli/commands/update.mjs +1 -1
- package/cli/utils/scaffold.mjs +53 -0
- package/docs/rules/agents.md +166 -50
- package/docs/rules/cpp/coding-style.md +44 -44
- package/docs/rules/cpp/hooks.md +39 -39
- package/docs/rules/cpp/patterns.md +51 -51
- package/docs/rules/cpp/security.md +51 -51
- package/docs/rules/cpp/testing.md +44 -44
- package/docs/rules/csharp/coding-style.md +72 -72
- package/docs/rules/csharp/hooks.md +25 -25
- package/docs/rules/csharp/patterns.md +50 -50
- package/docs/rules/csharp/security.md +58 -58
- package/docs/rules/csharp/testing.md +46 -46
- package/docs/rules/dart/coding-style.md +159 -159
- package/docs/rules/dart/hooks.md +66 -66
- package/docs/rules/dart/patterns.md +261 -261
- package/docs/rules/dart/security.md +135 -135
- package/docs/rules/dart/testing.md +215 -215
- package/docs/rules/golang/coding-style.md +32 -32
- package/docs/rules/golang/hooks.md +17 -17
- package/docs/rules/golang/patterns.md +45 -45
- package/docs/rules/golang/security.md +34 -34
- package/docs/rules/golang/testing.md +31 -31
- package/docs/rules/java/coding-style.md +114 -114
- package/docs/rules/java/hooks.md +18 -18
- package/docs/rules/java/patterns.md +146 -146
- package/docs/rules/java/security.md +100 -100
- package/docs/rules/java/testing.md +131 -131
- package/docs/rules/kotlin/coding-style.md +86 -86
- package/docs/rules/kotlin/hooks.md +17 -17
- package/docs/rules/kotlin/patterns.md +146 -146
- package/docs/rules/kotlin/security.md +82 -82
- package/docs/rules/kotlin/testing.md +128 -128
- package/docs/rules/perl/coding-style.md +46 -46
- package/docs/rules/perl/hooks.md +22 -22
- package/docs/rules/perl/patterns.md +76 -76
- package/docs/rules/perl/security.md +69 -69
- package/docs/rules/perl/testing.md +54 -54
- package/docs/rules/php/coding-style.md +40 -40
- package/docs/rules/php/hooks.md +24 -24
- package/docs/rules/php/patterns.md +33 -33
- package/docs/rules/php/security.md +37 -37
- package/docs/rules/php/testing.md +39 -39
- package/docs/rules/python/coding-style.md +42 -42
- package/docs/rules/python/hooks.md +19 -19
- package/docs/rules/python/patterns.md +39 -39
- package/docs/rules/python/security.md +30 -30
- package/docs/rules/python/testing.md +38 -38
- package/docs/rules/rust/coding-style.md +151 -151
- package/docs/rules/rust/hooks.md +16 -16
- package/docs/rules/rust/patterns.md +168 -168
- package/docs/rules/rust/security.md +141 -141
- package/docs/rules/rust/testing.md +154 -154
- package/docs/rules/swift/coding-style.md +47 -47
- package/docs/rules/swift/hooks.md +20 -20
- package/docs/rules/swift/patterns.md +66 -66
- package/docs/rules/swift/security.md +33 -33
- package/docs/rules/swift/testing.md +45 -45
- package/docs/rules/typescript/coding-style.md +199 -199
- package/docs/rules/typescript/hooks.md +22 -22
- package/docs/rules/typescript/patterns.md +52 -52
- package/docs/rules/typescript/security.md +28 -28
- package/docs/rules/typescript/testing.md +18 -18
- package/docs/rules/web/coding-style.md +96 -96
- package/docs/rules/web/design-quality.md +62 -62
- package/docs/rules/web/hooks.md +120 -120
- package/docs/rules/web/patterns.md +79 -79
- package/docs/rules/web/performance.md +64 -64
- package/docs/rules/web/security.md +57 -57
- package/docs/rules/web/testing.md +55 -55
- package/docs/templates/README.md +36 -36
- package/docs/templates/ai-project-final.md +124 -124
- package/docs/templates/ai-project.md +105 -105
- package/docs/templates/api.md +157 -157
- package/docs/templates/bug.md +62 -62
- package/docs/templates/code-review.md +87 -87
- package/docs/templates/generic.md +116 -116
- package/docs/templates/implementation-plan.md +1 -1
- package/docs/templates/meeting.md +68 -68
- package/docs/templates/prd.md +98 -98
- package/docs/templates/ui.md +134 -134
- package/docs/workflow-pipeline.md +11 -10
- package/package.json +40 -39
- package/scripts/hooks/checks/orchestration-artifacts.cjs +28 -23
- package/scripts/hooks/checks/workflow-state.cjs +4 -5
- package/scripts/orchestration/lib/orchestrator.cjs +344 -117
- package/scripts/orchestration/lib/validate.cjs +145 -0
- package/scripts/orchestration/session.cjs +88 -44
- package/skills/SUPERPOWERS-LICENSE +21 -21
- package/skills/ai-ml/fine-tuning-expert/SKILL.md +162 -162
- package/skills/ai-ml/fine-tuning-expert/references/dataset-preparation.md +540 -540
- package/skills/ai-ml/fine-tuning-expert/references/deployment-optimization.md +673 -673
- package/skills/ai-ml/fine-tuning-expert/references/evaluation-metrics.md +597 -597
- package/skills/ai-ml/fine-tuning-expert/references/hyperparameter-tuning.md +565 -565
- package/skills/ai-ml/fine-tuning-expert/references/lora-peft.md +347 -347
- package/skills/ai-ml/ml-pipeline/SKILL.md +159 -159
- package/skills/ai-ml/ml-pipeline/references/experiment-tracking.md +833 -833
- package/skills/ai-ml/ml-pipeline/references/feature-engineering.md +631 -631
- package/skills/ai-ml/ml-pipeline/references/model-validation.md +978 -978
- package/skills/ai-ml/ml-pipeline/references/pipeline-orchestration.md +907 -907
- package/skills/ai-ml/ml-pipeline/references/training-pipelines.md +782 -782
- package/skills/ai-ml/rag-architect/SKILL.md +194 -194
- package/skills/ai-ml/rag-architect/references/chunking-strategies.md +878 -878
- package/skills/ai-ml/rag-architect/references/embedding-models.md +561 -561
- package/skills/ai-ml/rag-architect/references/rag-evaluation.md +833 -833
- package/skills/ai-ml/rag-architect/references/retrieval-optimization.md +795 -795
- package/skills/ai-ml/rag-architect/references/vector-databases.md +589 -589
- package/skills/ai-ml/spark-engineer/SKILL.md +148 -148
- package/skills/ai-ml/spark-engineer/references/partitioning-caching.md +543 -543
- package/skills/ai-ml/spark-engineer/references/performance-tuning.md +544 -544
- package/skills/ai-ml/spark-engineer/references/rdd-operations.md +599 -599
- package/skills/ai-ml/spark-engineer/references/spark-sql-dataframes.md +474 -474
- package/skills/ai-ml/spark-engineer/references/streaming-patterns.md +786 -786
- package/skills/backend/api-designer/SKILL.md +217 -217
- package/skills/backend/api-designer/references/error-handling.md +541 -541
- package/skills/backend/api-designer/references/openapi.md +824 -824
- package/skills/backend/api-designer/references/pagination.md +494 -494
- package/skills/backend/api-designer/references/rest-patterns.md +335 -335
- package/skills/backend/api-designer/references/versioning.md +391 -391
- package/skills/backend/architecture-designer/SKILL.md +117 -117
- package/skills/backend/architecture-designer/references/adr-template.md +116 -116
- package/skills/backend/architecture-designer/references/architecture-patterns.md +111 -111
- package/skills/backend/architecture-designer/references/database-selection.md +102 -102
- package/skills/backend/architecture-designer/references/nfr-checklist.md +112 -112
- package/skills/backend/architecture-designer/references/system-design.md +100 -100
- package/skills/backend/code-documenter/SKILL.md +147 -147
- package/skills/backend/code-documenter/references/api-docs-fastapi-django.md +166 -166
- package/skills/backend/code-documenter/references/api-docs-nestjs-express.md +220 -220
- package/skills/backend/code-documenter/references/coverage-reports.md +125 -125
- package/skills/backend/code-documenter/references/documentation-systems.md +333 -333
- package/skills/backend/code-documenter/references/interactive-api-docs.md +531 -531
- package/skills/backend/code-documenter/references/python-docstrings.md +121 -121
- package/skills/backend/code-documenter/references/typescript-jsdoc.md +145 -145
- package/skills/backend/code-documenter/references/user-guides-tutorials.md +530 -530
- package/skills/backend/debugging-wizard/SKILL.md +105 -105
- package/skills/backend/debugging-wizard/references/common-patterns.md +132 -132
- package/skills/backend/debugging-wizard/references/debugging-tools.md +140 -140
- package/skills/backend/debugging-wizard/references/quick-fixes.md +177 -177
- package/skills/backend/debugging-wizard/references/strategies.md +142 -142
- package/skills/backend/debugging-wizard/references/systematic-debugging.md +367 -367
- package/skills/backend/feature-forge/SKILL.md +98 -98
- package/skills/backend/feature-forge/references/acceptance-criteria.md +104 -104
- package/skills/backend/feature-forge/references/ears-syntax.md +99 -99
- package/skills/backend/feature-forge/references/interview-questions.md +150 -150
- package/skills/backend/feature-forge/references/pre-discovery-subagents.md +54 -54
- package/skills/backend/feature-forge/references/specification-template.md +103 -103
- package/skills/backend/fullstack-guardian/SKILL.md +105 -105
- package/skills/backend/fullstack-guardian/references/api-design-standards.md +307 -307
- package/skills/backend/fullstack-guardian/references/architecture-decisions.md +350 -350
- package/skills/backend/fullstack-guardian/references/backend-patterns.md +237 -237
- package/skills/backend/fullstack-guardian/references/common-patterns.md +134 -134
- package/skills/backend/fullstack-guardian/references/deliverables-checklist.md +354 -354
- package/skills/backend/fullstack-guardian/references/design-template.md +91 -91
- package/skills/backend/fullstack-guardian/references/error-handling.md +135 -135
- package/skills/backend/fullstack-guardian/references/frontend-patterns.md +340 -340
- package/skills/backend/fullstack-guardian/references/integration-patterns.md +333 -333
- package/skills/backend/fullstack-guardian/references/security-checklist.md +106 -106
- package/skills/backend/graphql-architect/SKILL.md +146 -146
- package/skills/backend/graphql-architect/references/federation.md +418 -418
- package/skills/backend/graphql-architect/references/migration-from-rest.md +1141 -1141
- package/skills/backend/graphql-architect/references/resolvers.md +425 -425
- package/skills/backend/graphql-architect/references/schema-design.md +393 -393
- package/skills/backend/graphql-architect/references/security.md +569 -569
- package/skills/backend/graphql-architect/references/subscriptions.md +510 -510
- package/skills/backend/legacy-modernizer/SKILL.md +137 -137
- package/skills/backend/legacy-modernizer/references/legacy-testing.md +381 -381
- package/skills/backend/legacy-modernizer/references/migration-strategies.md +423 -423
- package/skills/backend/legacy-modernizer/references/refactoring-patterns.md +395 -395
- package/skills/backend/legacy-modernizer/references/strangler-fig-pattern.md +281 -281
- package/skills/backend/legacy-modernizer/references/system-assessment.md +487 -487
- package/skills/backend/microservices-architect/SKILL.md +164 -164
- package/skills/backend/microservices-architect/references/communication.md +499 -499
- package/skills/backend/microservices-architect/references/data.md +721 -721
- package/skills/backend/microservices-architect/references/decomposition.md +344 -344
- package/skills/backend/microservices-architect/references/observability.md +805 -805
- package/skills/backend/microservices-architect/references/patterns.md +603 -603
- package/skills/database/database-optimizer/SKILL.md +147 -147
- package/skills/database/database-optimizer/references/index-strategies.md +331 -331
- package/skills/database/database-optimizer/references/monitoring-analysis.md +501 -501
- package/skills/database/database-optimizer/references/mysql-tuning.md +452 -452
- package/skills/database/database-optimizer/references/postgresql-tuning.md +413 -413
- package/skills/database/database-optimizer/references/query-optimization.md +251 -251
- package/skills/database/postgres-pro/SKILL.md +152 -152
- package/skills/database/postgres-pro/references/extensions.md +404 -404
- package/skills/database/postgres-pro/references/jsonb.md +321 -321
- package/skills/database/postgres-pro/references/maintenance.md +481 -481
- package/skills/database/postgres-pro/references/performance.md +265 -265
- package/skills/database/postgres-pro/references/replication.md +446 -446
- package/skills/database/sql-pro/SKILL.md +129 -129
- package/skills/database/sql-pro/references/database-design.md +402 -402
- package/skills/database/sql-pro/references/dialect-differences.md +419 -419
- package/skills/database/sql-pro/references/optimization.md +384 -384
- package/skills/database/sql-pro/references/query-patterns.md +285 -285
- package/skills/database/sql-pro/references/window-functions.md +328 -328
- package/skills/dotnet/csharp-developer/SKILL.md +125 -125
- package/skills/dotnet/csharp-developer/references/aspnet-core.md +394 -394
- package/skills/dotnet/csharp-developer/references/blazor.md +553 -553
- package/skills/dotnet/csharp-developer/references/entity-framework.md +409 -409
- package/skills/dotnet/csharp-developer/references/modern-csharp.md +248 -248
- package/skills/dotnet/csharp-developer/references/performance.md +498 -498
- package/skills/dotnet/dotnet-core-expert/SKILL.md +138 -138
- package/skills/dotnet/dotnet-core-expert/references/authentication.md +546 -546
- package/skills/dotnet/dotnet-core-expert/references/clean-architecture.md +455 -455
- package/skills/dotnet/dotnet-core-expert/references/cloud-native.md +548 -548
- package/skills/dotnet/dotnet-core-expert/references/entity-framework.md +440 -440
- package/skills/dotnet/dotnet-core-expert/references/minimal-apis.md +319 -319
- package/skills/frontend/angular-architect/SKILL.md +152 -152
- package/skills/frontend/angular-architect/references/components.md +297 -297
- package/skills/frontend/angular-architect/references/ngrx.md +401 -401
- package/skills/frontend/angular-architect/references/routing.md +361 -361
- package/skills/frontend/angular-architect/references/rxjs.md +319 -319
- package/skills/frontend/angular-architect/references/testing.md +405 -405
- package/skills/frontend/design-commands/design.md +91 -91
- package/skills/frontend/design-commands/handoff.md +97 -97
- package/skills/frontend/design-commands/prototype.md +120 -120
- package/skills/frontend/design-commands/spec.md +160 -160
- package/skills/frontend/design-commands/style.md +78 -78
- package/skills/frontend/flutter-expert/SKILL.md +138 -138
- package/skills/frontend/flutter-expert/references/bloc-state.md +259 -259
- package/skills/frontend/flutter-expert/references/gorouter-navigation.md +119 -119
- package/skills/frontend/flutter-expert/references/performance.md +99 -99
- package/skills/frontend/flutter-expert/references/project-structure.md +118 -118
- package/skills/frontend/flutter-expert/references/riverpod-state.md +130 -130
- package/skills/frontend/flutter-expert/references/widget-patterns.md +123 -123
- package/skills/frontend/nextjs-developer/SKILL.md +143 -143
- package/skills/frontend/nextjs-developer/references/app-router.md +311 -311
- package/skills/frontend/nextjs-developer/references/data-fetching.md +482 -482
- package/skills/frontend/nextjs-developer/references/deployment.md +545 -545
- package/skills/frontend/nextjs-developer/references/server-actions.md +462 -462
- package/skills/frontend/nextjs-developer/references/server-components.md +384 -384
- package/skills/frontend/react-expert/SKILL.md +149 -149
- package/skills/frontend/react-expert/references/hooks-patterns.md +162 -162
- package/skills/frontend/react-expert/references/migration-class-to-modern.md +1119 -1119
- package/skills/frontend/react-expert/references/performance.md +168 -168
- package/skills/frontend/react-expert/references/react-19-features.md +174 -174
- package/skills/frontend/react-expert/references/server-components.md +143 -143
- package/skills/frontend/react-expert/references/state-management.md +171 -171
- package/skills/frontend/react-expert/references/testing-react.md +174 -174
- package/skills/frontend/react-native-expert/SKILL.md +185 -185
- package/skills/frontend/react-native-expert/references/expo-router.md +187 -187
- package/skills/frontend/react-native-expert/references/list-optimization.md +204 -204
- package/skills/frontend/react-native-expert/references/platform-handling.md +188 -188
- package/skills/frontend/react-native-expert/references/project-structure.md +171 -171
- package/skills/frontend/react-native-expert/references/storage-hooks.md +173 -173
- package/skills/frontend/senior-frontend/SKILL.md +477 -477
- package/skills/frontend/senior-frontend/references/frontend_best_practices.md +806 -806
- package/skills/frontend/senior-frontend/references/nextjs_optimization_guide.md +724 -724
- package/skills/frontend/senior-frontend/references/react_patterns.md +746 -746
- package/skills/frontend/senior-frontend/scripts/bundle_analyzer.py +407 -407
- package/skills/frontend/senior-frontend/scripts/component_generator.py +329 -329
- package/skills/frontend/senior-frontend/scripts/frontend_scaffolder.py +1005 -1005
- package/skills/frontend/ui-ux-pro-max/SKILL.md +386 -386
- package/skills/frontend/ui-ux-pro-max/data/charts.csv +26 -26
- package/skills/frontend/ui-ux-pro-max/data/colors.csv +97 -97
- package/skills/frontend/ui-ux-pro-max/data/icons.csv +101 -101
- package/skills/frontend/ui-ux-pro-max/data/landing.csv +31 -31
- package/skills/frontend/ui-ux-pro-max/data/products.csv +96 -96
- package/skills/frontend/ui-ux-pro-max/data/react-performance.csv +45 -45
- package/skills/frontend/ui-ux-pro-max/data/stacks/astro.csv +54 -54
- package/skills/frontend/ui-ux-pro-max/data/stacks/flutter.csv +53 -53
- package/skills/frontend/ui-ux-pro-max/data/stacks/html-tailwind.csv +56 -56
- package/skills/frontend/ui-ux-pro-max/data/stacks/jetpack-compose.csv +53 -53
- package/skills/frontend/ui-ux-pro-max/data/stacks/nextjs.csv +53 -53
- package/skills/frontend/ui-ux-pro-max/data/stacks/nuxt-ui.csv +51 -51
- package/skills/frontend/ui-ux-pro-max/data/stacks/nuxtjs.csv +59 -59
- package/skills/frontend/ui-ux-pro-max/data/stacks/react-native.csv +52 -52
- package/skills/frontend/ui-ux-pro-max/data/stacks/react.csv +54 -54
- package/skills/frontend/ui-ux-pro-max/data/stacks/shadcn.csv +61 -61
- package/skills/frontend/ui-ux-pro-max/data/stacks/svelte.csv +54 -54
- package/skills/frontend/ui-ux-pro-max/data/stacks/swiftui.csv +51 -51
- package/skills/frontend/ui-ux-pro-max/data/stacks/vue.csv +50 -50
- package/skills/frontend/ui-ux-pro-max/data/styles.csv +68 -68
- package/skills/frontend/ui-ux-pro-max/data/typography.csv +57 -57
- package/skills/frontend/ui-ux-pro-max/data/ui-reasoning.csv +101 -101
- package/skills/frontend/ui-ux-pro-max/data/ux-guidelines.csv +99 -99
- package/skills/frontend/ui-ux-pro-max/data/web-interface.csv +31 -31
- package/skills/frontend/ui-ux-pro-max/scripts/core.py +253 -253
- package/skills/frontend/ui-ux-pro-max/scripts/design_system.py +1067 -1067
- package/skills/frontend/ui-ux-pro-max/scripts/search.py +114 -114
- package/skills/frontend/vue-expert/SKILL.md +98 -98
- package/skills/frontend/vue-expert/references/build-tooling.md +480 -480
- package/skills/frontend/vue-expert/references/components.md +448 -448
- package/skills/frontend/vue-expert/references/composition-api.md +299 -299
- package/skills/frontend/vue-expert/references/mobile-hybrid.md +636 -636
- package/skills/frontend/vue-expert/references/nuxt.md +669 -669
- package/skills/frontend/vue-expert/references/state-management.md +449 -449
- package/skills/frontend/vue-expert/references/typescript.md +584 -584
- package/skills/frontend/vue-expert-js/SKILL.md +167 -167
- package/skills/frontend/vue-expert-js/references/component-architecture.md +219 -219
- package/skills/frontend/vue-expert-js/references/composables-patterns.md +183 -183
- package/skills/frontend/vue-expert-js/references/jsdoc-typing.md +535 -535
- package/skills/frontend/vue-expert-js/references/state-management.md +249 -249
- package/skills/frontend/vue-expert-js/references/testing-patterns.md +237 -237
- package/skills/go-rust-cpp/cpp-pro/SKILL.md +115 -115
- package/skills/go-rust-cpp/cpp-pro/references/build-tooling.md +440 -440
- package/skills/go-rust-cpp/cpp-pro/references/concurrency.md +437 -437
- package/skills/go-rust-cpp/cpp-pro/references/memory-performance.md +397 -397
- package/skills/go-rust-cpp/cpp-pro/references/modern-cpp.md +304 -304
- package/skills/go-rust-cpp/cpp-pro/references/templates.md +357 -357
- package/skills/go-rust-cpp/golang-pro/SKILL.md +122 -122
- package/skills/go-rust-cpp/golang-pro/references/concurrency.md +329 -329
- package/skills/go-rust-cpp/golang-pro/references/generics.md +442 -442
- package/skills/go-rust-cpp/golang-pro/references/interfaces.md +432 -432
- package/skills/go-rust-cpp/golang-pro/references/project-structure.md +477 -477
- package/skills/go-rust-cpp/golang-pro/references/testing.md +451 -451
- package/skills/go-rust-cpp/rust-engineer/SKILL.md +167 -167
- package/skills/go-rust-cpp/rust-engineer/references/async.md +458 -458
- package/skills/go-rust-cpp/rust-engineer/references/error-handling.md +334 -334
- package/skills/go-rust-cpp/rust-engineer/references/ownership.md +278 -278
- package/skills/go-rust-cpp/rust-engineer/references/testing.md +470 -470
- package/skills/go-rust-cpp/rust-engineer/references/traits.md +413 -413
- package/skills/infra/cli-developer/SKILL.md +113 -113
- package/skills/infra/cli-developer/references/design-patterns.md +221 -221
- package/skills/infra/cli-developer/references/go-cli.md +540 -540
- package/skills/infra/cli-developer/references/node-cli.md +383 -383
- package/skills/infra/cli-developer/references/python-cli.md +422 -422
- package/skills/infra/cli-developer/references/ux-patterns.md +448 -448
- package/skills/infra/cloud-architect/SKILL.md +216 -216
- package/skills/infra/cloud-architect/references/aws.md +394 -394
- package/skills/infra/cloud-architect/references/azure.md +562 -562
- package/skills/infra/cloud-architect/references/cost.md +582 -582
- package/skills/infra/cloud-architect/references/gcp.md +633 -633
- package/skills/infra/cloud-architect/references/multi-cloud.md +483 -483
- package/skills/infra/devops-engineer/SKILL.md +144 -144
- package/skills/infra/devops-engineer/references/deployment-strategies.md +241 -241
- package/skills/infra/devops-engineer/references/docker-patterns.md +113 -113
- package/skills/infra/devops-engineer/references/github-actions.md +139 -139
- package/skills/infra/devops-engineer/references/incident-response.md +331 -331
- package/skills/infra/devops-engineer/references/kubernetes.md +154 -154
- package/skills/infra/devops-engineer/references/platform-engineering.md +417 -417
- package/skills/infra/devops-engineer/references/release-automation.md +527 -527
- package/skills/infra/devops-engineer/references/terraform-iac.md +141 -141
- package/skills/infra/kubernetes-specialist/SKILL.md +241 -241
- package/skills/infra/kubernetes-specialist/references/configuration.md +452 -452
- package/skills/infra/kubernetes-specialist/references/cost-optimization.md +458 -458
- package/skills/infra/kubernetes-specialist/references/custom-operators.md +563 -563
- package/skills/infra/kubernetes-specialist/references/gitops.md +530 -530
- package/skills/infra/kubernetes-specialist/references/helm-charts.md +912 -912
- package/skills/infra/kubernetes-specialist/references/multi-cluster.md +507 -507
- package/skills/infra/kubernetes-specialist/references/networking.md +447 -447
- package/skills/infra/kubernetes-specialist/references/service-mesh.md +459 -459
- package/skills/infra/kubernetes-specialist/references/storage.md +535 -535
- package/skills/infra/kubernetes-specialist/references/troubleshooting.md +414 -414
- package/skills/infra/kubernetes-specialist/references/workloads.md +377 -377
- package/skills/infra/mcp-developer/SKILL.md +143 -143
- package/skills/infra/mcp-developer/references/protocol.md +244 -244
- package/skills/infra/mcp-developer/references/python-sdk.md +367 -367
- package/skills/infra/mcp-developer/references/resources.md +554 -554
- package/skills/infra/mcp-developer/references/tools.md +480 -480
- package/skills/infra/mcp-developer/references/typescript-sdk.md +350 -350
- package/skills/infra/monitoring-expert/SKILL.md +176 -176
- package/skills/infra/monitoring-expert/references/alerting-rules.md +141 -141
- package/skills/infra/monitoring-expert/references/application-profiling.md +331 -331
- package/skills/infra/monitoring-expert/references/capacity-planning.md +344 -344
- package/skills/infra/monitoring-expert/references/dashboards.md +126 -126
- package/skills/infra/monitoring-expert/references/opentelemetry.md +123 -123
- package/skills/infra/monitoring-expert/references/performance-testing.md +269 -269
- package/skills/infra/monitoring-expert/references/prometheus-metrics.md +136 -136
- package/skills/infra/monitoring-expert/references/structured-logging.md +142 -142
- package/skills/infra/sre-engineer/SKILL.md +181 -181
- package/skills/infra/sre-engineer/references/automation-toil.md +492 -492
- package/skills/infra/sre-engineer/references/error-budget-policy.md +334 -334
- package/skills/infra/sre-engineer/references/incident-chaos.md +576 -576
- package/skills/infra/sre-engineer/references/monitoring-alerting.md +424 -424
- package/skills/infra/sre-engineer/references/slo-sli-management.md +238 -238
- package/skills/infra/terraform-engineer/SKILL.md +143 -143
- package/skills/infra/terraform-engineer/references/best-practices.md +583 -583
- package/skills/infra/terraform-engineer/references/module-patterns.md +297 -297
- package/skills/infra/terraform-engineer/references/providers.md +452 -452
- package/skills/infra/terraform-engineer/references/state-management.md +371 -371
- package/skills/infra/terraform-engineer/references/testing.md +486 -486
- package/skills/infra/websocket-engineer/SKILL.md +168 -168
- package/skills/infra/websocket-engineer/references/alternatives.md +391 -391
- package/skills/infra/websocket-engineer/references/patterns.md +400 -400
- package/skills/infra/websocket-engineer/references/protocol.md +195 -195
- package/skills/infra/websocket-engineer/references/scaling.md +333 -333
- package/skills/infra/websocket-engineer/references/security.md +474 -474
- package/skills/java/java-architect/SKILL.md +132 -132
- package/skills/java/java-architect/references/jpa-optimization.md +393 -393
- package/skills/java/java-architect/references/reactive-webflux.md +356 -356
- package/skills/java/java-architect/references/spring-boot-setup.md +269 -269
- package/skills/java/java-architect/references/spring-security.md +445 -445
- package/skills/java/java-architect/references/testing-patterns.md +500 -500
- package/skills/java/kotlin-specialist/SKILL.md +147 -147
- package/skills/java/kotlin-specialist/references/android-compose.md +419 -419
- package/skills/java/kotlin-specialist/references/coroutines-flow.md +276 -276
- package/skills/java/kotlin-specialist/references/dsl-idioms.md +421 -421
- package/skills/java/kotlin-specialist/references/ktor-server.md +426 -426
- package/skills/java/kotlin-specialist/references/multiplatform-kmp.md +380 -380
- package/skills/java/spring-boot-engineer/SKILL.md +195 -195
- package/skills/java/spring-boot-engineer/references/cloud.md +498 -498
- package/skills/java/spring-boot-engineer/references/data.md +381 -381
- package/skills/java/spring-boot-engineer/references/security.md +459 -459
- package/skills/java/spring-boot-engineer/references/testing.md +545 -545
- package/skills/java/spring-boot-engineer/references/web.md +295 -295
- package/skills/javascript/javascript-pro/SKILL.md +132 -132
- package/skills/javascript/javascript-pro/references/async-patterns.md +334 -334
- package/skills/javascript/javascript-pro/references/browser-apis.md +398 -398
- package/skills/javascript/javascript-pro/references/modern-syntax.md +272 -272
- package/skills/javascript/javascript-pro/references/modules.md +357 -357
- package/skills/javascript/javascript-pro/references/node-essentials.md +471 -471
- package/skills/javascript/nestjs-expert/SKILL.md +206 -206
- package/skills/javascript/nestjs-expert/references/authentication.md +166 -166
- package/skills/javascript/nestjs-expert/references/controllers-routing.md +111 -111
- package/skills/javascript/nestjs-expert/references/dtos-validation.md +153 -153
- package/skills/javascript/nestjs-expert/references/migration-from-express.md +1237 -1237
- package/skills/javascript/nestjs-expert/references/services-di.md +140 -140
- package/skills/javascript/nestjs-expert/references/testing-patterns.md +186 -186
- package/skills/javascript/typescript-pro/SKILL.md +145 -145
- package/skills/javascript/typescript-pro/references/advanced-types.md +259 -259
- package/skills/javascript/typescript-pro/references/configuration.md +445 -445
- package/skills/javascript/typescript-pro/references/patterns.md +484 -484
- package/skills/javascript/typescript-pro/references/type-guards.md +352 -352
- package/skills/javascript/typescript-pro/references/utility-types.md +329 -329
- package/skills/php/laravel-specialist/SKILL.md +262 -262
- package/skills/php/laravel-specialist/references/eloquent.md +351 -351
- package/skills/php/laravel-specialist/references/livewire.md +512 -512
- package/skills/php/laravel-specialist/references/queues.md +423 -423
- package/skills/php/laravel-specialist/references/routing.md +362 -362
- package/skills/php/laravel-specialist/references/testing.md +522 -522
- package/skills/php/php-pro/SKILL.md +206 -206
- package/skills/php/php-pro/references/async-patterns.md +412 -412
- package/skills/php/php-pro/references/laravel-patterns.md +377 -377
- package/skills/php/php-pro/references/modern-php-features.md +323 -323
- package/skills/php/php-pro/references/symfony-patterns.md +466 -466
- package/skills/php/php-pro/references/testing-quality.md +466 -466
- package/skills/product/competitive-analysis/SKILL.md +257 -257
- package/skills/product/meeting-notes/SKILL.md +266 -266
- package/skills/product/prd-template/SKILL.md +150 -150
- package/skills/product/stakeholder-update/SKILL.md +225 -225
- package/skills/product/user-research-synthesis/SKILL.md +235 -235
- package/skills/python/django-expert/SKILL.md +162 -162
- package/skills/python/django-expert/references/authentication.md +145 -145
- package/skills/python/django-expert/references/drf-serializers.md +148 -148
- package/skills/python/django-expert/references/models-orm.md +151 -151
- package/skills/python/django-expert/references/testing-django.md +204 -204
- package/skills/python/django-expert/references/viewsets-views.md +153 -153
- package/skills/python/fastapi-expert/SKILL.md +185 -185
- package/skills/python/fastapi-expert/references/async-sqlalchemy.md +146 -146
- package/skills/python/fastapi-expert/references/authentication.md +159 -159
- package/skills/python/fastapi-expert/references/endpoints-routing.md +142 -142
- package/skills/python/fastapi-expert/references/migration-from-django.md +996 -996
- package/skills/python/fastapi-expert/references/pydantic-v2.md +135 -135
- package/skills/python/fastapi-expert/references/testing-async.md +159 -159
- package/skills/python/pandas-pro/SKILL.md +178 -178
- package/skills/python/pandas-pro/references/aggregation-groupby.md +545 -545
- package/skills/python/pandas-pro/references/data-cleaning.md +500 -500
- package/skills/python/pandas-pro/references/dataframe-operations.md +420 -420
- package/skills/python/pandas-pro/references/merging-joining.md +596 -596
- package/skills/python/pandas-pro/references/performance-optimization.md +597 -597
- package/skills/python/python-pro/SKILL.md +177 -177
- package/skills/python/python-pro/references/async-patterns.md +356 -356
- package/skills/python/python-pro/references/packaging.md +460 -460
- package/skills/python/python-pro/references/standard-library.md +378 -378
- package/skills/python/python-pro/references/testing.md +404 -404
- package/skills/python/python-pro/references/type-system.md +290 -290
- package/skills/quality/chaos-engineer/SKILL.md +182 -182
- package/skills/quality/chaos-engineer/references/chaos-tools.md +511 -511
- package/skills/quality/chaos-engineer/references/experiment-design.md +229 -229
- package/skills/quality/chaos-engineer/references/game-days.md +434 -434
- package/skills/quality/chaos-engineer/references/infrastructure-chaos.md +348 -348
- package/skills/quality/chaos-engineer/references/kubernetes-chaos.md +432 -432
- package/skills/quality/code-reviewer/SKILL.md +119 -119
- package/skills/quality/code-reviewer/references/common-issues.md +142 -142
- package/skills/quality/code-reviewer/references/feedback-examples.md +144 -144
- package/skills/quality/code-reviewer/references/receiving-feedback.md +238 -238
- package/skills/quality/code-reviewer/references/report-template.md +109 -109
- package/skills/quality/code-reviewer/references/review-checklist.md +88 -88
- package/skills/quality/code-reviewer/references/spec-compliance-review.md +258 -258
- package/skills/quality/playwright-expert/SKILL.md +169 -169
- package/skills/quality/playwright-expert/references/api-mocking.md +140 -140
- package/skills/quality/playwright-expert/references/configuration.md +155 -155
- package/skills/quality/playwright-expert/references/debugging-flaky.md +150 -150
- package/skills/quality/playwright-expert/references/page-object-model.md +152 -152
- package/skills/quality/playwright-expert/references/selectors-locators.md +119 -119
- package/skills/quality/secure-code-guardian/SKILL.md +191 -191
- package/skills/quality/secure-code-guardian/references/authentication.md +136 -136
- package/skills/quality/secure-code-guardian/references/input-validation.md +146 -146
- package/skills/quality/secure-code-guardian/references/owasp-prevention.md +135 -135
- package/skills/quality/secure-code-guardian/references/security-headers.md +133 -133
- package/skills/quality/secure-code-guardian/references/xss-csrf.md +157 -157
- package/skills/quality/security-reviewer/SKILL.md +103 -103
- package/skills/quality/security-reviewer/references/infrastructure-security.md +268 -268
- package/skills/quality/security-reviewer/references/penetration-testing.md +268 -268
- package/skills/quality/security-reviewer/references/report-template.md +170 -170
- package/skills/quality/security-reviewer/references/sast-tools.md +117 -117
- package/skills/quality/security-reviewer/references/secret-scanning.md +125 -125
- package/skills/quality/security-reviewer/references/vulnerability-patterns.md +152 -152
- package/skills/quality/senior-qa/README.md +196 -196
- package/skills/quality/senior-qa/SKILL.md +399 -399
- package/skills/quality/senior-qa/references/qa_best_practices.md +964 -964
- package/skills/quality/senior-qa/references/test_automation_patterns.md +1009 -1009
- package/skills/quality/senior-qa/references/testing_strategies.md +649 -649
- package/skills/quality/senior-qa/scripts/coverage_analyzer.py +836 -836
- package/skills/quality/senior-qa/scripts/e2e_test_scaffolder.py +820 -820
- package/skills/quality/senior-qa/scripts/test_suite_generator.py +605 -605
- package/skills/quality/tdd-guide/HOW_TO_USE.md +313 -313
- package/skills/quality/tdd-guide/README.md +680 -680
- package/skills/quality/tdd-guide/SKILL.md +122 -122
- package/skills/quality/tdd-guide/assets/expected_output.json +77 -77
- package/skills/quality/tdd-guide/assets/sample_input_python.json +39 -39
- package/skills/quality/tdd-guide/assets/sample_input_typescript.json +36 -36
- package/skills/quality/tdd-guide/references/ci-integration.md +195 -195
- package/skills/quality/tdd-guide/references/framework-guide.md +206 -206
- package/skills/quality/tdd-guide/references/tdd-best-practices.md +128 -128
- package/skills/quality/tdd-guide/scripts/coverage_analyzer.py +434 -434
- package/skills/quality/tdd-guide/scripts/fixture_generator.py +440 -440
- package/skills/quality/tdd-guide/scripts/format_detector.py +384 -384
- package/skills/quality/tdd-guide/scripts/framework_adapter.py +428 -428
- package/skills/quality/tdd-guide/scripts/metrics_calculator.py +456 -456
- package/skills/quality/tdd-guide/scripts/output_formatter.py +354 -354
- package/skills/quality/tdd-guide/scripts/tdd_workflow.py +474 -474
- package/skills/quality/tdd-guide/scripts/test_generator.py +438 -438
- package/skills/quality/test-master/SKILL.md +94 -94
- package/skills/quality/test-master/references/automation-frameworks.md +294 -294
- package/skills/quality/test-master/references/e2e-testing.md +128 -128
- package/skills/quality/test-master/references/integration-testing.md +120 -120
- package/skills/quality/test-master/references/performance-testing.md +118 -118
- package/skills/quality/test-master/references/qa-methodology.md +247 -247
- package/skills/quality/test-master/references/security-testing.md +127 -127
- package/skills/quality/test-master/references/tdd-iron-laws.md +174 -174
- package/skills/quality/test-master/references/test-reports.md +104 -104
- package/skills/quality/test-master/references/testing-anti-patterns.md +231 -231
- package/skills/quality/test-master/references/unit-testing.md +113 -113
- package/skills/ruby/rails-expert/SKILL.md +154 -154
- package/skills/ruby/rails-expert/references/active-record.md +244 -244
- package/skills/ruby/rails-expert/references/api-development.md +401 -401
- package/skills/ruby/rails-expert/references/background-jobs.md +272 -272
- package/skills/ruby/rails-expert/references/hotwire-turbo.md +228 -228
- package/skills/ruby/rails-expert/references/rspec-testing.md +367 -367
- package/skills/swift/swift-expert/SKILL.md +163 -163
- package/skills/swift/swift-expert/references/async-concurrency.md +360 -360
- package/skills/swift/swift-expert/references/memory-performance.md +377 -377
- package/skills/swift/swift-expert/references/protocol-oriented.md +354 -354
- package/skills/swift/swift-expert/references/swiftui-patterns.md +291 -291
- package/skills/swift/swift-expert/references/testing-patterns.md +399 -399
- package/skills/workflow/brainstorming/SKILL.md +164 -164
- package/skills/workflow/brainstorming/scripts/frame-template.html +214 -214
- package/skills/workflow/brainstorming/scripts/helper.js +88 -88
- package/skills/workflow/brainstorming/scripts/server.cjs +354 -354
- package/skills/workflow/brainstorming/scripts/start-server.sh +148 -148
- package/skills/workflow/brainstorming/scripts/stop-server.sh +56 -56
- package/skills/workflow/brainstorming/spec-document-reviewer-prompt.md +49 -49
- package/skills/workflow/brainstorming/visual-companion.md +287 -287
- package/skills/workflow/documentation/SKILL.md +45 -45
- package/skills/workflow/entropy-management/SKILL.md +115 -115
- package/skills/workflow/executing-plans/SKILL.md +70 -70
- package/skills/workflow/finishing-a-development-branch/SKILL.md +200 -200
- package/skills/workflow/receiving-code-review/SKILL.md +213 -213
- package/skills/workflow/requesting-code-review/SKILL.md +105 -105
- package/skills/workflow/requesting-code-review/code-reviewer.md +146 -146
- package/skills/workflow/requirement-engineering/SKILL.md +111 -111
- package/skills/workflow/systematic-debugging/CREATION-LOG.md +119 -119
- package/skills/workflow/systematic-debugging/SKILL.md +296 -296
- package/skills/workflow/systematic-debugging/condition-based-waiting-example.ts +158 -158
- package/skills/workflow/systematic-debugging/condition-based-waiting.md +115 -115
- package/skills/workflow/systematic-debugging/defense-in-depth.md +122 -122
- package/skills/workflow/systematic-debugging/find-polluter.sh +63 -63
- package/skills/workflow/systematic-debugging/root-cause-tracing.md +169 -169
- package/skills/workflow/systematic-debugging/test-academic.md +14 -14
- package/skills/workflow/systematic-debugging/test-pressure-1.md +58 -58
- package/skills/workflow/systematic-debugging/test-pressure-2.md +68 -68
- package/skills/workflow/systematic-debugging/test-pressure-3.md +69 -69
- package/skills/workflow/using-git-worktrees/SKILL.md +218 -218
- package/skills/workflow/verification-before-completion/SKILL.md +139 -139
- package/skills/workflow/writing-plans/SKILL.md +151 -151
- package/skills/workflow/writing-plans/plan-document-reviewer-prompt.md +49 -49
- package/skills/workflow/writing-skills/SKILL.md +655 -655
- package/skills/workflow/writing-skills/anthropic-best-practices.md +1150 -1150
- package/skills/workflow/writing-skills/examples/CLAUDE_MD_TESTING.md +189 -189
- package/skills/workflow/writing-skills/persuasion-principles.md +187 -187
- package/skills/workflow/writing-skills/render-graphs.js +168 -168
- package/skills/workflow/writing-skills/testing-skills-with-subagents.md +384 -384
|
@@ -1,561 +1,561 @@
|
|
|
1
|
-
# Embedding Models
|
|
2
|
-
|
|
3
|
-
---
|
|
4
|
-
|
|
5
|
-
## Model Comparison Matrix
|
|
6
|
-
|
|
7
|
-
| Model | Dimensions | Max Tokens | Strengths | Provider |
|
|
8
|
-
|-------|------------|------------|-----------|----------|
|
|
9
|
-
| **text-embedding-3-large** | 3072 (or 256-3072) | 8191 | Best quality, flexible dims | OpenAI |
|
|
10
|
-
| **text-embedding-3-small** | 1536 (or 256-1536) | 8191 | Cost-effective, good quality | OpenAI |
|
|
11
|
-
| **embed-english-v3.0** | 1024 | 512 | Excellent compression, fast | Cohere |
|
|
12
|
-
| **embed-multilingual-v3.0** | 1024 | 512 | 100+ languages | Cohere |
|
|
13
|
-
| **voyage-large-2** | 1536 | 16000 | Long context, code-aware | Voyage AI |
|
|
14
|
-
| **voyage-code-2** | 1536 | 16000 | Code retrieval specialist | Voyage AI |
|
|
15
|
-
| **BGE-large-en-v1.5** | 1024 | 512 | Open source, high quality | BAAI |
|
|
16
|
-
| **BGE-M3** | 1024 | 8192 | Multi-lingual, multi-granularity | BAAI |
|
|
17
|
-
| **E5-large-v2** | 1024 | 512 | Strong benchmark performance | Microsoft |
|
|
18
|
-
| **GTE-large** | 1024 | 512 | Good general-purpose | Alibaba |
|
|
19
|
-
| **all-MiniLM-L6-v2** | 384 | 256 | Fast, lightweight | Sentence Transformers |
|
|
20
|
-
| **nomic-embed-text-v1.5** | 768 | 8192 | Long context, open weights | Nomic AI |
|
|
21
|
-
|
|
22
|
-
---
|
|
23
|
-
|
|
24
|
-
## When to Use Each Model
|
|
25
|
-
|
|
26
|
-
### OpenAI text-embedding-3-large
|
|
27
|
-
```
|
|
28
|
-
Best For:
|
|
29
|
-
- Production RAG requiring highest accuracy
|
|
30
|
-
- Enterprise applications with quality SLAs
|
|
31
|
-
- Flexible dimension requirements (can reduce to save cost)
|
|
32
|
-
- English and major languages
|
|
33
|
-
|
|
34
|
-
When to Avoid:
|
|
35
|
-
- Cost-sensitive high-volume applications
|
|
36
|
-
- Air-gapped or offline deployments
|
|
37
|
-
- Specialized domains without fine-tuning budget
|
|
38
|
-
```
|
|
39
|
-
|
|
40
|
-
### OpenAI text-embedding-3-small
|
|
41
|
-
```
|
|
42
|
-
Best For:
|
|
43
|
-
- Cost-effective production deployments
|
|
44
|
-
- Good quality-to-cost ratio
|
|
45
|
-
- General-purpose retrieval tasks
|
|
46
|
-
- Quick prototyping with API simplicity
|
|
47
|
-
|
|
48
|
-
When to Avoid:
|
|
49
|
-
- Maximum accuracy requirements
|
|
50
|
-
- Specialized technical domains
|
|
51
|
-
- When open-source is required
|
|
52
|
-
```
|
|
53
|
-
|
|
54
|
-
### Cohere embed-v3
|
|
55
|
-
```
|
|
56
|
-
Best For:
|
|
57
|
-
- Multi-lingual applications (100+ languages)
|
|
58
|
-
- Search-optimized retrieval (search_document/search_query types)
|
|
59
|
-
- Compression (int8/binary quantization built-in)
|
|
60
|
-
- Production with cost constraints
|
|
61
|
-
|
|
62
|
-
When to Avoid:
|
|
63
|
-
- Very long documents (512 token limit)
|
|
64
|
-
- Code-heavy retrieval tasks
|
|
65
|
-
```
|
|
66
|
-
|
|
67
|
-
### Voyage AI
|
|
68
|
-
```
|
|
69
|
-
Best For:
|
|
70
|
-
- Code retrieval and technical documentation
|
|
71
|
-
- Long-context documents (16K tokens)
|
|
72
|
-
- Domain-specific fine-tuning options
|
|
73
|
-
- Legal/financial specialized models
|
|
74
|
-
|
|
75
|
-
When to Avoid:
|
|
76
|
-
- Budget-constrained projects
|
|
77
|
-
- Simple general-purpose retrieval
|
|
78
|
-
```
|
|
79
|
-
|
|
80
|
-
### BGE / E5 (Open Source)
|
|
81
|
-
```
|
|
82
|
-
Best For:
|
|
83
|
-
- Self-hosted deployments
|
|
84
|
-
- Air-gapped environments
|
|
85
|
-
- Cost elimination (no API fees)
|
|
86
|
-
- Fine-tuning on custom domains
|
|
87
|
-
|
|
88
|
-
When to Avoid:
|
|
89
|
-
- Teams without GPU infrastructure
|
|
90
|
-
- Need for zero maintenance
|
|
91
|
-
- Maximum out-of-box quality
|
|
92
|
-
```
|
|
93
|
-
|
|
94
|
-
---
|
|
95
|
-
|
|
96
|
-
## OpenAI Embeddings
|
|
97
|
-
|
|
98
|
-
```python
|
|
99
|
-
from openai import OpenAI
|
|
100
|
-
|
|
101
|
-
client = OpenAI(api_key="your-api-key")
|
|
102
|
-
|
|
103
|
-
def get_embedding(
|
|
104
|
-
text: str,
|
|
105
|
-
model: str = "text-embedding-3-small",
|
|
106
|
-
dimensions: int | None = None
|
|
107
|
-
) -> list[float]:
|
|
108
|
-
"""Get embedding with optional dimension reduction."""
|
|
109
|
-
params = {"input": text, "model": model}
|
|
110
|
-
if dimensions:
|
|
111
|
-
params["dimensions"] = dimensions
|
|
112
|
-
|
|
113
|
-
response = client.embeddings.create(**params)
|
|
114
|
-
return response.data[0].embedding
|
|
115
|
-
|
|
116
|
-
# Single embedding
|
|
117
|
-
embedding = get_embedding("How do I install the software?")
|
|
118
|
-
|
|
119
|
-
# Batch embeddings (more efficient)
|
|
120
|
-
def get_embeddings_batch(
|
|
121
|
-
texts: list[str],
|
|
122
|
-
model: str = "text-embedding-3-small",
|
|
123
|
-
dimensions: int | None = None
|
|
124
|
-
) -> list[list[float]]:
|
|
125
|
-
"""Batch embed multiple texts."""
|
|
126
|
-
params = {"input": texts, "model": model}
|
|
127
|
-
if dimensions:
|
|
128
|
-
params["dimensions"] = dimensions
|
|
129
|
-
|
|
130
|
-
response = client.embeddings.create(**params)
|
|
131
|
-
# Sort by index to maintain order
|
|
132
|
-
return [item.embedding for item in sorted(response.data, key=lambda x: x.index)]
|
|
133
|
-
|
|
134
|
-
embeddings = get_embeddings_batch(["text1", "text2", "text3"])
|
|
135
|
-
|
|
136
|
-
# Dimension reduction (cost/storage savings)
|
|
137
|
-
# text-embedding-3-large: 3072 -> 1024 (~67% storage savings)
|
|
138
|
-
reduced_embedding = get_embedding(
|
|
139
|
-
"Installation guide...",
|
|
140
|
-
model="text-embedding-3-large",
|
|
141
|
-
dimensions=1024 # Reduce from 3072
|
|
142
|
-
)
|
|
143
|
-
```
|
|
144
|
-
|
|
145
|
-
### Dimension Trade-offs
|
|
146
|
-
|
|
147
|
-
| Original | Reduced | Quality Loss | Storage Savings |
|
|
148
|
-
|----------|---------|--------------|-----------------|
|
|
149
|
-
| 3072 | 1536 | ~1-2% | 50% |
|
|
150
|
-
| 3072 | 1024 | ~2-4% | 67% |
|
|
151
|
-
| 3072 | 512 | ~5-8% | 83% |
|
|
152
|
-
| 3072 | 256 | ~10-15% | 92% |
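The storage-savings column follows directly from the ratio of reduced to original dimensions, since every component of a float32 vector occupies the same number of bytes. A minimal sketch of that arithmetic is below; the helper name `storage_savings` is illustrative and not part of any SDK, and the quality-loss column cannot be derived this way.

```python
def storage_savings(original_dims: int, reduced_dims: int) -> float:
    """Fraction of per-vector storage saved by truncating to reduced_dims."""
    if not 0 < reduced_dims <= original_dims:
        raise ValueError("reduced_dims must be in (0, original_dims]")
    return 1 - reduced_dims / original_dims


# Reproduce the savings column for text-embedding-3-large (3072 dims).
for reduced in (1536, 1024, 512, 256):
    print(f"3072 -> {reduced}: {storage_savings(3072, reduced):.0%} saved")
```

The same function applies to text-embedding-3-small by passing 1536 as `original_dims`.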
|
|
153
|
-
|
|
154
|
-
---
|
|
155
|
-
|
|
156
|
-
## Cohere Embeddings
|
|
157
|
-
|
|
158
|
-
```python
|
|
159
|
-
import cohere
|
|
160
|
-
|
|
161
|
-
co = cohere.Client(api_key="your-api-key")
|
|
162
|
-
|
|
163
|
-
# Document embeddings (for indexing)
|
|
164
|
-
doc_embeddings = co.embed(
|
|
165
|
-
texts=["Installation guide content...", "Configuration steps..."],
|
|
166
|
-
model="embed-english-v3.0",
|
|
167
|
-
input_type="search_document", # Use for documents being indexed
|
|
168
|
-
truncate="END"
|
|
169
|
-
).embeddings
|
|
170
|
-
|
|
171
|
-
# Query embeddings (for search)
|
|
172
|
-
query_embedding = co.embed(
|
|
173
|
-
texts=["how to install"],
|
|
174
|
-
model="embed-english-v3.0",
|
|
175
|
-
input_type="search_query", # Use for search queries
|
|
176
|
-
).embeddings[0]
|
|
177
|
-
|
|
178
|
-
# Multilingual
|
|
179
|
-
multilingual_embedding = co.embed(
|
|
180
|
-
texts=["Comment installer le logiciel?"], # French
|
|
181
|
-
model="embed-multilingual-v3.0",
|
|
182
|
-
input_type="search_query"
|
|
183
|
-
).embeddings[0]
|
|
184
|
-
|
|
185
|
-
# Compressed embeddings (int8)
|
|
186
|
-
compressed = co.embed(
|
|
187
|
-
texts=["Document content..."],
|
|
188
|
-
model="embed-english-v3.0",
|
|
189
|
-
input_type="search_document",
|
|
190
|
-
embedding_types=["int8"] # 4x smaller than float32
|
|
191
|
-
).embeddings
|
|
192
|
-
```
|
|
193
|
-
|
|
194
|
-
### Cohere Input Types
|
|
195
|
-
|
|
196
|
-
| Type | Use Case |
|
|
197
|
-
|------|----------|
|
|
198
|
-
| `search_document` | Documents being indexed in vector DB |
|
|
199
|
-
| `search_query` | User search queries |
|
|
200
|
-
| `classification` | Text classification tasks |
|
|
201
|
-
| `clustering` | Document clustering |
|
|
202
|
-
|
|
203
|
-
---
|
|
204
|
-
|
|
205
|
-
## Voyage AI Embeddings
|
|
206
|
-
|
|
207
|
-
```python
|
|
208
|
-
import voyageai
|
|
209
|
-
|
|
210
|
-
vo = voyageai.Client(api_key="your-api-key")
|
|
211
|
-
|
|
212
|
-
# General embeddings
|
|
213
|
-
result = vo.embed(
|
|
214
|
-
texts=["Installation guide for the software..."],
|
|
215
|
-
model="voyage-large-2",
|
|
216
|
-
input_type="document"
|
|
217
|
-
)
|
|
218
|
-
embeddings = result.embeddings
|
|
219
|
-
|
|
220
|
-
# Code embeddings (specialized)
|
|
221
|
-
code_result = vo.embed(
|
|
222
|
-
texts=[
|
|
223
|
-
"def install_package(name):\n subprocess.run(['pip', 'install', name])",
|
|
224
|
-
"How do I install packages in Python?"
|
|
225
|
-
],
|
|
226
|
-
model="voyage-code-2",
|
|
227
|
-
input_type="document" # or "query" for search
|
|
228
|
-
)
|
|
229
|
-
|
|
230
|
-
# Long context (up to 16K tokens)
|
|
231
|
-
long_doc_embedding = vo.embed(
|
|
232
|
-
texts=[very_long_document], # Up to 16K tokens
|
|
233
|
-
model="voyage-large-2",
|
|
234
|
-
input_type="document"
|
|
235
|
-
).embeddings[0]
|
|
236
|
-
```
|
|
237
|
-
|
|
238
|
-
---
|
|
239
|
-
|
|
240
|
-
## Open Source Models (Sentence Transformers)
|
|
241
|
-
|
|
242
|
-
```python
|
|
243
|
-
from sentence_transformers import SentenceTransformer
|
|
244
|
-
|
|
245
|
-
# Load model (downloads on first use)
|
|
246
|
-
model = SentenceTransformer("BAAI/bge-large-en-v1.5")
|
|
247
|
-
|
|
248
|
-
# Single embedding
|
|
249
|
-
embedding = model.encode("How do I install the software?")
|
|
250
|
-
|
|
251
|
-
# Batch encoding (GPU accelerated)
|
|
252
|
-
embeddings = model.encode(
|
|
253
|
-
["doc1", "doc2", "doc3"],
|
|
254
|
-
batch_size=32,
|
|
255
|
-
show_progress_bar=True,
|
|
256
|
-
convert_to_numpy=True,
|
|
257
|
-
normalize_embeddings=True # For cosine similarity
|
|
258
|
-
)
|
|
259
|
-
|
|
260
|
-
# BGE v1.5 recommends an instruction prefix for short retrieval queries (passages need none)
|
|
261
|
-
query_embedding = model.encode(
|
|
262
|
-
"Represent this sentence for searching relevant passages: How do I install?"
|
|
263
|
-
)
|
|
264
|
-
|
|
265
|
-
# GPU acceleration
|
|
266
|
-
model = SentenceTransformer("BAAI/bge-large-en-v1.5", device="cuda")
|
|
267
|
-
|
|
268
|
-
# Multi-GPU encoding
|
|
269
|
-
pool = model.start_multi_process_pool()
|
|
270
|
-
embeddings = model.encode_multi_process(
|
|
271
|
-
sentences=large_corpus,
|
|
272
|
-
pool=pool,
|
|
273
|
-
batch_size=64
|
|
274
|
-
)
|
|
275
|
-
model.stop_multi_process_pool(pool)
|
|
276
|
-
```
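The `normalize_embeddings=True` flag matters because, once vectors have unit length, a plain dot product gives the same value as cosine similarity, which lets a vector store use the cheaper inner-product metric. The sketch below checks this with standard-library Python on two toy vectors; the helper names are illustrative and the vectors are not real model outputs.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed from unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


a = [3.0, 4.0]
b = [4.0, 3.0]

cosine_score = cosine(a, b)
dot_of_normalized = dot(l2_normalize(a), l2_normalize(b))

# The two scores agree up to floating-point rounding.
print(math.isclose(cosine_score, dot_of_normalized))
```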
|
|
277
|
-
|
|
278
|
-
### BGE-M3 (Multi-lingual, Multi-granularity)
|
|
279
|
-
|
|
280
|
-
```python
|
|
281
|
-
from FlagEmbedding import BGEM3FlagModel
|
|
282
|
-
|
|
283
|
-
model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)
|
|
284
|
-
|
|
285
|
-
# Dense, sparse, and colbert embeddings in one call
|
|
286
|
-
output = model.encode(
|
|
287
|
-
["Installation guide in English", "Guide d'installation en français"],
|
|
288
|
-
return_dense=True,
|
|
289
|
-
return_sparse=True,
|
|
290
|
-
return_colbert_vecs=True
|
|
291
|
-
)
|
|
292
|
-
|
|
293
|
-
dense_embeddings = output["dense_vecs"]
|
|
294
|
-
sparse_embeddings = output["lexical_weights"]
|
|
295
|
-
colbert_embeddings = output["colbert_vecs"]
|
|
296
|
-
```
|
|
297
|
-
|
|
298
|
-
---
|
|
299
|
-
|
|
300
|
-
## Embedding Fine-Tuning
|
|
301
|
-
|
|
302
|
-
### When to Fine-Tune
|
|
303
|
-
|
|
304
|
-
| Scenario | Recommendation |
|
|
305
|
-
|----------|----------------|
|
|
306
|
-
| Domain-specific jargon (legal, medical) | Fine-tune on domain corpus |
|
|
307
|
-
| Low retrieval precision (<80%) | Fine-tune with hard negatives |
|
|
308
|
-
| Out-of-distribution queries | Fine-tune with query-doc pairs |
|
|
309
|
-
| Cost optimization | Fine-tune smaller model to match larger |
|
|
310
|
-
|
|
311
|
-
### Fine-Tuning with Sentence Transformers
|
|
312
|
-
|
|
313
|
-
```python
|
|
314
|
-
from sentence_transformers import SentenceTransformer, InputExample, losses
|
|
315
|
-
from torch.utils.data import DataLoader
|
|
316
|
-
|
|
317
|
-
# Prepare training data
|
|
318
|
-
train_examples = [
|
|
319
|
-
InputExample(
|
|
320
|
-
texts=["query: how to install", "doc: Installation guide content..."],
|
|
321
|
-
label=1.0 # Relevance score
|
|
322
|
-
),
|
|
323
|
-
InputExample(
|
|
324
|
-
texts=["query: how to install", "doc: Unrelated content..."],
|
|
325
|
-
label=0.0 # Negative example
|
|
326
|
-
),
|
|
327
|
-
]
|
|
328
|
-
|
|
329
|
-
# Load base model
|
|
330
|
-
model = SentenceTransformer("BAAI/bge-base-en-v1.5")
|
|
331
|
-
|
|
332
|
-
# Create dataloader
|
|
333
|
-
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
|
|
334
|
-
|
|
335
|
-
# Cosine similarity loss: regresses the pair's cosine similarity onto the float label
|
|
336
|
-
train_loss = losses.CosineSimilarityLoss(model)
|
|
337
|
-
|
|
338
|
-
# Fine-tune
|
|
339
|
-
model.fit(
|
|
340
|
-
train_objectives=[(train_dataloader, train_loss)],
|
|
341
|
-
epochs=3,
|
|
342
|
-
warmup_steps=100,
|
|
343
|
-
output_path="./fine-tuned-model"
|
|
344
|
-
)
|
|
345
|
-
|
|
346
|
-
# Or use Multiple Negatives Ranking Loss (better for retrieval)
|
|
347
|
-
train_examples_mnrl = [
|
|
348
|
-
InputExample(texts=["query", "positive_doc", "negative_doc1", "negative_doc2"])
|
|
349
|
-
]
|
|
350
|
-
train_loss = losses.MultipleNegativesRankingLoss(model)
|
|
351
|
-
```
|
|
352
|
-
|
|
353
|
-
### Hard Negative Mining
|
|
354
|
-
|
|
355
|
-
```python
|
|
356
|
-
from sentence_transformers import InputExample, SentenceTransformer
|
|
357
|
-
from sentence_transformers.util import semantic_search
|
|
358
|
-
import torch
|
|
359
|
-
|
|
360
|
-
def mine_hard_negatives(
|
|
361
|
-
queries: list[str],
|
|
362
|
-
positives: list[str],
|
|
363
|
-
corpus: list[str],
|
|
364
|
-
model: SentenceTransformer,
|
|
365
|
-
top_k: int = 10
|
|
366
|
-
) -> list[InputExample]:
|
|
367
|
-
"""Mine hard negatives from corpus for each query-positive pair."""
|
|
368
|
-
|
|
369
|
-
query_embeddings = model.encode(queries, convert_to_tensor=True)
|
|
370
|
-
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
|
|
371
|
-
positive_set = set(positives)
|
|
372
|
-
|
|
373
|
-
examples = []
|
|
374
|
-
for i, query in enumerate(queries):
|
|
375
|
-
# Find similar documents that are NOT the positive
|
|
376
|
-
hits = semantic_search(
|
|
377
|
-
query_embeddings[i:i+1],
|
|
378
|
-
corpus_embeddings,
|
|
379
|
-
top_k=top_k + 1
|
|
380
|
-
)[0]
|
|
381
|
-
|
|
382
|
-
hard_negatives = [
|
|
383
|
-
corpus[hit["corpus_id"]]
|
|
384
|
-
for hit in hits
|
|
385
|
-
if corpus[hit["corpus_id"]] not in positive_set
|
|
386
|
-
][:3] # Top 3 hard negatives
|
|
387
|
-
|
|
388
|
-
examples.append(InputExample(
|
|
389
|
-
texts=[query, positives[i]] + hard_negatives
|
|
390
|
-
))
|
|
391
|
-
|
|
392
|
-
return examples
|
|
393
|
-
```
|
|
394
|
-
|
|
395
|
-
---
|
|
396
|
-
|
|
397
|
-
## Embedding Pipeline Best Practices
|
|
398
|
-
|
|
399
|
-
### Text Preprocessing
|
|
400
|
-
|
|
401
|
-
```python
|
|
402
|
-
import re
|
|
403
|
-
from typing import Callable
|
|
404
|
-
|
|
405
|
-
def clean_for_embedding(text: str) -> str:
|
|
406
|
-
"""Clean text before embedding."""
|
|
407
|
-
# Remove excessive whitespace
|
|
408
|
-
text = re.sub(r'\s+', ' ', text)
|
|
409
|
-
# Remove special characters that don't add meaning
|
|
410
|
-
text = re.sub(r'[^\w\s\.\,\!\?\-\:\;\(\)]', '', text)
|
|
411
|
-
# Truncate to reasonable length (model dependent)
|
|
412
|
-
text = text[:8000] # Character cap, not a token count; adjust to the model's token limit
|
|
413
|
-
return text.strip()
|
|
414
|
-
|
|
415
|
-
def preprocess_for_embedding(
|
|
416
|
-
text: str,
|
|
417
|
-
prefix: str = "",
|
|
418
|
-
max_length: int = 8000
|
|
419
|
-
) -> str:
|
|
420
|
-
"""Preprocess with optional prefix (for instruction-tuned models)."""
|
|
421
|
-
cleaned = clean_for_embedding(text)
|
|
422
|
-
prefixed = f"{prefix}{cleaned}" if prefix else cleaned
|
|
423
|
-
return prefixed[:max_length]
|
|
424
|
-
|
|
425
|
-
# BGE-style prefix for queries
|
|
426
|
-
query_text = preprocess_for_embedding(
|
|
427
|
-
"how to install",
|
|
428
|
-
prefix="Represent this sentence for searching relevant passages: "
|
|
429
|
-
)
|
|
430
|
-
```
|
|
431
|
-
|
|
432
|
-
### Caching Embeddings
|
|
433
|
-
|
|
434
|
-
```python
|
|
435
|
-
import hashlib
|
|
436
|
-
import json
|
|
437
|
-
from functools import lru_cache
|
|
438
|
-
from pathlib import Path
|
|
439
|
-
|
|
440
|
-
class EmbeddingCache:
|
|
441
|
-
"""Disk-based embedding cache."""
|
|
442
|
-
|
|
443
|
-
def __init__(self, cache_dir: str = ".embedding_cache"):
|
|
444
|
-
self.cache_dir = Path(cache_dir)
|
|
445
|
-
self.cache_dir.mkdir(exist_ok=True)
|
|
446
|
-
|
|
447
|
-
def _hash_key(self, text: str, model: str) -> str:
|
|
448
|
-
content = f"{model}:{text}"
|
|
449
|
-
return hashlib.sha256(content.encode()).hexdigest()
|
|
450
|
-
|
|
451
|
-
def get(self, text: str, model: str) -> list[float] | None:
|
|
452
|
-
key = self._hash_key(text, model)
|
|
453
|
-
cache_file = self.cache_dir / f"{key}.json"
|
|
454
|
-
if cache_file.exists():
|
|
455
|
-
return json.loads(cache_file.read_text())
|
|
456
|
-
return None
|
|
457
|
-
|
|
458
|
-
def set(self, text: str, model: str, embedding: list[float]) -> None:
|
|
459
|
-
key = self._hash_key(text, model)
|
|
460
|
-
cache_file = self.cache_dir / f"{key}.json"
|
|
461
|
-
cache_file.write_text(json.dumps(embedding))
|
|
462
|
-
|
|
463
|
-
# Usage
|
|
464
|
-
cache = EmbeddingCache()
|
|
465
|
-
|
|
466
|
-
def get_embedding_cached(text: str, model: str = "text-embedding-3-small") -> list[float]:
|
|
467
|
-
cached = cache.get(text, model)
|
|
468
|
-
if cached is not None:
|
|
469
|
-
return cached
|
|
470
|
-
|
|
471
|
-
embedding = get_embedding(text, model) # Call API
|
|
472
|
-
cache.set(text, model, embedding)
|
|
473
|
-
return embedding
|
|
474
|
-
```
|
|
475
|
-
|
|
476
|
-
### Batching Strategy
|
|
477
|
-
|
|
478
|
-
```python
|
|
479
|
-
from typing import Iterator
|
|
480
|
-
import asyncio
|
|
481
|
-
from openai import AsyncOpenAI
|
|
482
|
-
|
|
483
|
-
def batch_texts(texts: list[str], batch_size: int = 100) -> Iterator[list[str]]:
|
|
484
|
-
"""Yield batches of texts."""
|
|
485
|
-
for i in range(0, len(texts), batch_size):
|
|
486
|
-
yield texts[i:i + batch_size]
|
|
487
|
-
|
|
488
|
-
async def get_embeddings_async(
|
|
489
|
-
texts: list[str],
|
|
490
|
-
model: str = "text-embedding-3-small",
|
|
491
|
-
batch_size: int = 100,
|
|
492
|
-
max_concurrent: int = 5
|
|
493
|
-
) -> list[list[float]]:
|
|
494
|
-
"""Async batch embedding with concurrency control."""
|
|
495
|
-
client = AsyncOpenAI()
|
|
496
|
-
semaphore = asyncio.Semaphore(max_concurrent)
|
|
497
|
-
|
|
498
|
-
async def embed_batch(batch: list[str]) -> list[list[float]]:
|
|
499
|
-
async with semaphore:
|
|
500
|
-
response = await client.embeddings.create(
|
|
501
|
-
input=batch,
|
|
502
|
-
model=model
|
|
503
|
-
)
|
|
504
|
-
return [item.embedding for item in sorted(response.data, key=lambda x: x.index)]
|
|
505
|
-
|
|
506
|
-
batches = list(batch_texts(texts, batch_size))
|
|
507
|
-
results = await asyncio.gather(*[embed_batch(b) for b in batches])
|
|
508
|
-
|
|
509
|
-
# Flatten results
|
|
510
|
-
return [emb for batch_result in results for emb in batch_result]
|
|
511
|
-
```
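The flattening step above is only correct because `asyncio.gather` returns results in the order the awaitables were passed, not the order in which they finish. The sketch below illustrates this with a hypothetical stand-in coroutine, `fake_embed_batch`, in place of a real API call; the one-dimensional "embeddings" are simply text lengths so that the ordering is easy to check.

```python
import asyncio
from typing import Iterator


def batch_texts(texts: list[str], batch_size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]


async def fake_embed_batch(batch: list[str]) -> list[list[float]]:
    """Hypothetical stand-in for an embedding API call."""
    # Delay grows with batch size, so the short final batch completes first.
    await asyncio.sleep(0.01 * len(batch))
    return [[float(len(text))] for text in batch]


async def embed_all(
    texts: list[str], batch_size: int = 2, max_concurrent: int = 3
) -> list[list[float]]:
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(batch: list[str]) -> list[list[float]]:
        async with semaphore:  # Cap the number of in-flight requests
            return await fake_embed_batch(batch)

    # gather preserves argument order regardless of completion order.
    results = await asyncio.gather(
        *(guarded(batch) for batch in batch_texts(texts, batch_size))
    )
    return [emb for batch_result in results for emb in batch_result]


texts = ["a", "bb", "ccc", "dddd", "eeeee"]
embeddings = asyncio.run(embed_all(texts))
print(embeddings)
```

Each output row lines up with the input text at the same position, even though the last batch finishes before the first two.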
|
|
512
|
-
|
|
513
|
-
---
|
|
514
|
-
|
|
515
|
-
## Model Selection Flowchart
|
|
516
|
-
|
|
517
|
-
```
|
|
518
|
-
Start
|
|
519
|
-
│
|
|
520
|
-
├─ Need offline/self-hosted?
|
|
521
|
-
│ └─ Yes → BGE-large or E5-large (open source)
|
|
522
|
-
│
|
|
523
|
-
├─ Multi-lingual requirement?
|
|
524
|
-
│ └─ Yes → Cohere embed-multilingual-v3 or BGE-M3
|
|
525
|
-
│
|
|
526
|
-
├─ Code/technical documentation?
|
|
527
|
-
│ └─ Yes → Voyage-code-2
|
|
528
|
-
│
|
|
529
|
-
├─ Long documents (>8K tokens)?
|
|
530
|
-
│ └─ Yes → Voyage-large-2 or nomic-embed-text
|
|
531
|
-
│
|
|
532
|
-
├─ Cost is primary concern?
|
|
533
|
-
│ └─ Yes → text-embedding-3-small (reduced dims)
|
|
534
|
-
│
|
|
535
|
-
├─ Maximum quality needed?
|
|
536
|
-
│ └─ Yes → text-embedding-3-large
|
|
537
|
-
│
|
|
538
|
-
└─ Default → text-embedding-3-small (best balance)
|
|
539
|
-
```
|
|
540
|
-
|
|
541
|
-
---
|
|
542
|
-
|
|
543
|
-
## Quick Reference
|
|
544
|
-
|
|
545
|
-
| Task | Recommendation |
|
|
546
|
-
|------|----------------|
|
|
547
|
-
| Production RAG (English) | text-embedding-3-small/large |
|
|
548
|
-
| Multi-lingual | Cohere embed-multilingual-v3 |
|
|
549
|
-
| Code retrieval | Voyage-code-2 |
|
|
550
|
-
| Self-hosted | BGE-large-en-v1.5 |
|
|
551
|
-
| Long documents | Voyage-large-2, nomic-embed-text |
|
|
552
|
-
| Prototyping | all-MiniLM-L6-v2 (fast, free) |
|
|
553
|
-
| Maximum quality | text-embedding-3-large |
|
|
554
|
-
| Cost optimized | text-embedding-3-small @ 512 dims |
|
|
555
|
-
|
|
556
|
-
## Related Skills
|
|
557
|
-
|
|
558
|
-
- **RAG Architect** - Vector database integration
|
|
559
|
-
- **Python Pro** - Async embedding pipelines
|
|
560
|
-
- **ML Pipeline** - Embedding model deployment
|
|
561
|
-
- **Fine-Tuning Expert** - Custom embedding training
|
|
1
|
+
# Embedding Models
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
## Model Comparison Matrix
|
|
6
|
+
|
|
7
|
+
| Model | Dimensions | Max Tokens | Strengths | Provider |
|
|
8
|
+
|-------|------------|------------|-----------|----------|
|
|
9
|
+
| **text-embedding-3-large** | 3072 (or 256-3072) | 8191 | Best quality, flexible dims | OpenAI |
|
|
10
|
+
| **text-embedding-3-small** | 1536 (or 256-1536) | 8191 | Cost-effective, good quality | OpenAI |
|
|
11
|
+
| **embed-english-v3.0** | 1024 | 512 | Excellent compression, fast | Cohere |
|
|
12
|
+
| **embed-multilingual-v3.0** | 1024 | 512 | 100+ languages | Cohere |
|
|
13
|
+
| **voyage-large-2** | 1536 | 16000 | Long context, code-aware | Voyage AI |
|
|
14
|
+
| **voyage-code-2** | 1536 | 16000 | Code retrieval specialist | Voyage AI |
|
|
15
|
+
| **BGE-large-en-v1.5** | 1024 | 512 | Open source, high quality | BAAI |
|
|
16
|
+
| **BGE-M3** | 1024 | 8192 | Multi-lingual, multi-granularity | BAAI |
|
|
17
|
+
| **E5-large-v2** | 1024 | 512 | Strong benchmark performance | Microsoft |
|
|
18
|
+
| **GTE-large** | 1024 | 512 | Good general-purpose | Alibaba |
|
|
19
|
+
| **all-MiniLM-L6-v2** | 384 | 256 | Fast, lightweight | Sentence Transformers |
|
|
20
|
+
| **nomic-embed-text-v1.5** | 768 | 8192 | Long context, open weights | Nomic AI |
|
|
21
|
+
|
|
22
|
+
---
|
|
23
|
+
|
|
24
|
+
## When to Use Each Model
|
|
25
|
+
|
|
26
|
+
### OpenAI text-embedding-3-large
|
|
27
|
+
```
|
|
28
|
+
Best For:
|
|
29
|
+
- Production RAG requiring highest accuracy
|
|
30
|
+
- Enterprise applications with quality SLAs
|
|
31
|
+
- Flexible dimension requirements (can reduce to save cost)
|
|
32
|
+
- English and major languages
|
|
33
|
+
|
|
34
|
+
When to Avoid:
|
|
35
|
+
- Cost-sensitive high-volume applications
|
|
36
|
+
- Air-gapped or offline deployments
|
|
37
|
+
- Specialized domains without fine-tuning budget
|
|
38
|
+
```
|
|
39
|
+
|
|
40
|
+
### OpenAI text-embedding-3-small
|
|
41
|
+
```
|
|
42
|
+
Best For:
|
|
43
|
+
- Cost-effective production deployments
|
|
44
|
+
- Good quality-to-cost ratio
|
|
45
|
+
- General-purpose retrieval tasks
|
|
46
|
+
- Quick prototyping with API simplicity
|
|
47
|
+
|
|
48
|
+
When to Avoid:
|
|
49
|
+
- Maximum accuracy requirements
|
|
50
|
+
- Specialized technical domains
|
|
51
|
+
- When open-source is required
|
|
52
|
+
```
|
|
53
|
+
|
|
54
|
+
### Cohere embed-v3
|
|
55
|
+
```
|
|
56
|
+
Best For:
|
|
57
|
+
- Multi-lingual applications (100+ languages)
|
|
58
|
+
- Search-optimized retrieval (search_document/search_query types)
|
|
59
|
+
- Compression (int8/binary quantization built-in)
|
|
60
|
+
- Production with cost constraints
|
|
61
|
+
|
|
62
|
+
When to Avoid:
|
|
63
|
+
- Very long documents (512 token limit)
|
|
64
|
+
- Code-heavy retrieval tasks
|
|
65
|
+
```
|
|
66
|
+
|
|
67
|
+
### Voyage AI
|
|
68
|
+
```
|
|
69
|
+
Best For:
|
|
70
|
+
- Code retrieval and technical documentation
|
|
71
|
+
- Long-context documents (16K tokens)
|
|
72
|
+
- Domain-specific fine-tuning options
|
|
73
|
+
- Legal/financial specialized models
|
|
74
|
+
|
|
75
|
+
When to Avoid:
|
|
76
|
+
- Budget-constrained projects
|
|
77
|
+
- Simple general-purpose retrieval
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
### BGE / E5 (Open Source)
|
|
81
|
+
```
|
|
82
|
+
Best For:
|
|
83
|
+
- Self-hosted deployments
|
|
84
|
+
- Air-gapped environments
|
|
85
|
+
- Cost elimination (no API fees)
|
|
86
|
+
- Fine-tuning on custom domains
|
|
87
|
+
|
|
88
|
+
When to Avoid:
|
|
89
|
+
- Teams without GPU infrastructure
|
|
90
|
+
- Need for zero maintenance
|
|
91
|
+
- Maximum out-of-box quality
|
|
92
|
+
```
|
|
93
|
+
|
|
94
|
+
---
|
|
95
|
+
|
|
96
|
+
## OpenAI Embeddings
|
|
97
|
+
|
|
98
|
+
```python
|
|
99
|
+
from openai import OpenAI
|
|
100
|
+
|
|
101
|
+
client = OpenAI(api_key="your-api-key")
|
|
102
|
+
|
|
103
|
+
def get_embedding(
|
|
104
|
+
text: str,
|
|
105
|
+
model: str = "text-embedding-3-small",
|
|
106
|
+
dimensions: int | None = None
|
|
107
|
+
) -> list[float]:
|
|
108
|
+
"""Get embedding with optional dimension reduction."""
|
|
109
|
+
params = {"input": text, "model": model}
|
|
110
|
+
if dimensions:
|
|
111
|
+
params["dimensions"] = dimensions
|
|
112
|
+
|
|
113
|
+
response = client.embeddings.create(**params)
|
|
114
|
+
return response.data[0].embedding
|
|
115
|
+
|
|
116
|
+
# Single embedding
|
|
117
|
+
embedding = get_embedding("How do I install the software?")
|
|
118
|
+
|
|
119
|
+
# Batch embeddings (more efficient)
|
|
120
|
+
def get_embeddings_batch(
|
|
121
|
+
texts: list[str],
|
|
122
|
+
model: str = "text-embedding-3-small",
|
|
123
|
+
dimensions: int | None = None
|
|
124
|
+
) -> list[list[float]]:
|
|
125
|
+
"""Batch embed multiple texts."""
|
|
126
|
+
params = {"input": texts, "model": model}
|
|
127
|
+
if dimensions:
|
|
128
|
+
params["dimensions"] = dimensions
|
|
129
|
+
|
|
130
|
+
response = client.embeddings.create(**params)
|
|
131
|
+
# Sort by index to maintain order
|
|
132
|
+
return [item.embedding for item in sorted(response.data, key=lambda x: x.index)]
|
|
133
|
+
|
|
134
|
+
embeddings = get_embeddings_batch(["text1", "text2", "text3"])
|
|
135
|
+
|
|
136
|
+
# Dimension reduction (cost/storage savings)
|
|
137
|
+
# text-embedding-3-large: 3072 -> 1024 (~67% storage savings)
|
|
138
|
+
reduced_embedding = get_embedding(
|
|
139
|
+
"Installation guide...",
|
|
140
|
+
model="text-embedding-3-large",
|
|
141
|
+
dimensions=1024 # Reduce from 3072
|
|
142
|
+
)
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
### Dimension Trade-offs
|
|
146
|
+
|
|
147
|
+
| Original | Reduced | Quality Loss | Storage Savings |
|
|
148
|
+
|----------|---------|--------------|-----------------|
|
|
149
|
+
| 3072 | 1536 | ~1-2% | 50% |
|
|
150
|
+
| 3072 | 1024 | ~2-4% | 67% |
|
|
151
|
+
| 3072 | 512 | ~5-8% | 83% |
|
|
152
|
+
| 3072 | 256 | ~10-15% | 92% |
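
The storage-savings column follows from the dimension ratio alone. As a minimal sketch (the helper names `storage_savings` and `bytes_per_vector` are hypothetical, and float32 storage at 4 bytes per value is an assumption), the figures can be reproduced like this. The quality-loss column is empirical and is not modeled here.

```python
def storage_savings(original_dims: int, reduced_dims: int) -> float:
    """Fraction of vector storage saved by reducing dimensions (same dtype assumed)."""
    if not 0 < reduced_dims <= original_dims:
        raise ValueError("reduced_dims must be in (0, original_dims]")
    return 1.0 - reduced_dims / original_dims


def bytes_per_vector(dims: int, bytes_per_value: int = 4) -> int:
    """Raw size of one vector; 4 bytes per value assumes float32 storage."""
    return dims * bytes_per_value


# Reproduce the savings column for text-embedding-3-large (3072 dims).
for reduced in (1536, 1024, 512, 256):
    pct = round(storage_savings(3072, reduced) * 100)
    print(f"3072 -> {reduced}: {pct}% saved, {bytes_per_vector(reduced)} bytes/vector")
```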
|
|
153
|
+
|
|
154
|
+
---
|
|
155
|
+
|
|
156
|
+
## Cohere Embeddings
|
|
157
|
+
|
|
158
|
+
```python
|
|
159
|
+
import cohere
|
|
160
|
+
|
|
161
|
+
co = cohere.Client(api_key="your-api-key")
|
|
162
|
+
|
|
163
|
+
# Document embeddings (for indexing)
|
|
164
|
+
doc_embeddings = co.embed(
|
|
165
|
+
texts=["Installation guide content...", "Configuration steps..."],
|
|
166
|
+
model="embed-english-v3.0",
|
|
167
|
+
input_type="search_document", # Use for documents being indexed
|
|
168
|
+
truncate="END"
|
|
169
|
+
).embeddings
|
|
170
|
+
|
|
171
|
+
# Query embeddings (for search)
|
|
172
|
+
query_embedding = co.embed(
|
|
173
|
+
texts=["how to install"],
|
|
174
|
+
model="embed-english-v3.0",
|
|
175
|
+
input_type="search_query", # Use for search queries
|
|
176
|
+
).embeddings[0]
|
|
177
|
+
|
|
178
|
+
# Multilingual
|
|
179
|
+
multilingual_embedding = co.embed(
|
|
180
|
+
texts=["Comment installer le logiciel?"], # French
|
|
181
|
+
model="embed-multilingual-v3.0",
|
|
182
|
+
input_type="search_query"
|
|
183
|
+
).embeddings[0]
|
|
184
|
+
|
|
185
|
+
# Compressed embeddings (int8)
|
|
186
|
+
compressed = co.embed(
|
|
187
|
+
texts=["Document content..."],
|
|
188
|
+
model="embed-english-v3.0",
|
|
189
|
+
input_type="search_document",
|
|
190
|
+
embedding_types=["int8"] # 4x smaller than float32
|
|
191
|
+
).embeddings.int8  # with embedding_types set, vectors are grouped by type
|
|
192
|
+
```
|
|
193
|
+
|
|
194
|
+
### Cohere Input Types
|
|
195
|
+
|
|
196
|
+
| Type | Use Case |
|
|
197
|
+
|------|----------|
|
|
198
|
+
| `search_document` | Documents being indexed in vector DB |
|
|
199
|
+
| `search_query` | User search queries |
|
|
200
|
+
| `classification` | Text classification tasks |
|
|
201
|
+
| `clustering` | Document clustering |
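
Once documents are embedded with `search_document` and the query with `search_query`, retrieval reduces to comparing the query vector against the stored document vectors. The sketch below assumes cosine similarity as the ranking metric (the source does not name one) and uses small toy vectors in place of real API output. `cosine_similarity` and `rank_documents` are hypothetical helper names.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: list[float], doc_vecs: list[list[float]]) -> list[tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for search_document / search_query embeddings.
doc_vecs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
query_vec = [1.0, 0.0]
ranking = rank_documents(query_vec, doc_vecs)
print(ranking[0])  # best match is the document at index 1
```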
|
|
202
|
+
|
|
203
|
+
---
|
|
204
|
+
|
|
205
|
+
## Voyage AI Embeddings
|
|
206
|
+
|
|
207
|
+
```python
|
|
208
|
+
import voyageai
|
|
209
|
+
|
|
210
|
+
vo = voyageai.Client(api_key="your-api-key")
|
|
211
|
+
|
|
212
|
+
# General embeddings
|
|
213
|
+
result = vo.embed(
|
|
214
|
+
texts=["Installation guide for the software..."],
|
|
215
|
+
model="voyage-large-2",
|
|
216
|
+
input_type="document"
|
|
217
|
+
)
|
|
218
|
+
embeddings = result.embeddings
|
|
219
|
+
|
|
220
|
+
# Code embeddings (specialized)
|
|
221
|
+
code_result = vo.embed(
|
|
222
|
+
texts=[
|
|
223
|
+
"def install_package(name):\n subprocess.run(['pip', 'install', name])",
|
|
224
|
+
"How do I install packages in Python?"
|
|
225
|
+
],
|
|
226
|
+
model="voyage-code-2",
|
|
227
|
+
input_type="document" # or "query" for search
|
|
228
|
+
)
|
|
229
|
+
|
|
230
|
+
# Long context (up to 16K tokens)
|
|
231
|
+
long_doc_embedding = vo.embed(
|
|
232
|
+
texts=[very_long_document], # Up to 16K tokens
|
|
233
|
+
model="voyage-large-2",
|
|
234
|
+
input_type="document"
|
|
235
|
+
).embeddings[0]
|
|
236
|
+
```
|
|
237
|
+
|
|
238
|
+
---
|
|
239
|
+
|
|
240
|
+
## Open Source Models (Sentence Transformers)
|
|
241
|
+
|
|
242
|
+
```python
|
|
243
|
+
from sentence_transformers import SentenceTransformer
|
|
244
|
+
|
|
245
|
+
# Load model (downloads on first use)
|
|
246
|
+
model = SentenceTransformer("BAAI/bge-large-en-v1.5")
|
|
247
|
+
|
|
248
|
+
# Single embedding
|
|
249
|
+
embedding = model.encode("How do I install the software?")
|
|
250
|
+
|
|
251
|
+
# Batch encoding (GPU accelerated)
|
|
252
|
+
embeddings = model.encode(
|
|
253
|
+
["doc1", "doc2", "doc3"],
|
|
254
|
+
batch_size=32,
|
|
255
|
+
show_progress_bar=True,
|
|
256
|
+
convert_to_numpy=True,
|
|
257
|
+
normalize_embeddings=True # For cosine similarity
|
|
258
|
+
)
|
|
259
|
+
|
|
260
|
+
# BGE v1.5 recommends an instruction prefix for short retrieval queries (documents are encoded without it)
|
|
261
|
+
query_embedding = model.encode(
|
|
262
|
+
"Represent this sentence for searching relevant passages: How do I install?"
|
|
263
|
+
)
|
|
264
|
+
|
|
265
|
+
# GPU acceleration
|
|
266
|
+
model = SentenceTransformer("BAAI/bge-large-en-v1.5", device="cuda")
|
|
267
|
+
|
|
268
|
+
# Multi-GPU encoding
|
|
269
|
+
pool = model.start_multi_process_pool()
|
|
270
|
+
embeddings = model.encode_multi_process(
|
|
271
|
+
sentences=large_corpus,
|
|
272
|
+
pool=pool,
|
|
273
|
+
batch_size=64
|
|
274
|
+
)
|
|
275
|
+
model.stop_multi_process_pool(pool)
|
|
276
|
+
```
|
|
277
|
+
|
|
278
|
+
### BGE-M3 (Multi-lingual, Multi-granularity)
|
|
279
|
+
|
|
280
|
+
```python
|
|
281
|
+
from FlagEmbedding import BGEM3FlagModel
|
|
282
|
+
|
|
283
|
+
model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=True)
|
|
284
|
+
|
|
285
|
+
# Dense, sparse, and colbert embeddings in one call
|
|
286
|
+
output = model.encode(
|
|
287
|
+
["Installation guide in English", "Guide d'installation en francais"],
|
|
288
|
+
return_dense=True,
|
|
289
|
+
return_sparse=True,
|
|
290
|
+
return_colbert_vecs=True
|
|
291
|
+
)
|
|
292
|
+
|
|
293
|
+
dense_embeddings = output["dense_vecs"]
|
|
294
|
+
sparse_embeddings = output["lexical_weights"]
|
|
295
|
+
colbert_embeddings = output["colbert_vecs"]
|
|
296
|
+
```
|
|
297
|
+
|
|
298
|
+
---
|
|
299
|
+
|
|
300
|
+
## Embedding Fine-Tuning
|
|
301
|
+
|
|
302
|
+
### When to Fine-Tune
|
|
303
|
+
|
|
304
|
+
| Scenario | Recommendation |
|
|
305
|
+
|----------|----------------|
|
|
306
|
+
| Domain-specific jargon (legal, medical) | Fine-tune on domain corpus |
|
|
307
|
+
| Low retrieval precision (<80%) | Fine-tune with hard negatives |
|
|
308
|
+
| Out-of-distribution queries | Fine-tune with query-doc pairs |
|
|
309
|
+
| Cost optimization | Fine-tune smaller model to match larger |
|
|
310
|
+
|
|
311
|
+
### Fine-Tuning with Sentence Transformers
|
|
312
|
+
|
|
313
|
+
```python
|
|
314
|
+
from sentence_transformers import SentenceTransformer, InputExample, losses
|
|
315
|
+
from torch.utils.data import DataLoader
|
|
316
|
+
|
|
317
|
+
# Prepare training data
|
|
318
|
+
train_examples = [
|
|
319
|
+
InputExample(
|
|
320
|
+
texts=["query: how to install", "doc: Installation guide content..."],
|
|
321
|
+
label=1.0 # Relevance score
|
|
322
|
+
),
|
|
323
|
+
InputExample(
|
|
324
|
+
texts=["query: how to install", "doc: Unrelated content..."],
|
|
325
|
+
label=0.0 # Negative example
|
|
326
|
+
),
|
|
327
|
+
]
|
|
328
|
+
|
|
329
|
+
# Load base model
|
|
330
|
+
model = SentenceTransformer("BAAI/bge-base-en-v1.5")
|
|
331
|
+
|
|
332
|
+
# Create dataloader
|
|
333
|
+
train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
|
|
334
|
+
|
|
335
|
+
# Cosine similarity loss: fits the cosine similarity of each pair to its float label
|
|
336
|
+
train_loss = losses.CosineSimilarityLoss(model)
|
|
337
|
+
|
|
338
|
+
# Fine-tune
|
|
339
|
+
model.fit(
|
|
340
|
+
train_objectives=[(train_dataloader, train_loss)],
|
|
341
|
+
epochs=3,
|
|
342
|
+
warmup_steps=100,
|
|
343
|
+
output_path="./fine-tuned-model"
|
|
344
|
+
)
|
|
345
|
+
|
|
346
|
+
# Or use Multiple Negatives Ranking Loss (better for retrieval)
|
|
347
|
+
train_examples_mnrl = [
|
|
348
|
+
InputExample(texts=["query", "positive_doc", "negative_doc1", "negative_doc2"])
|
|
349
|
+
]
|
|
350
|
+
train_loss = losses.MultipleNegativesRankingLoss(model)
|
|
351
|
+
```
|
|
352
|
+
|
|
353
|
+
### Hard Negative Mining
|
|
354
|
+
|
|
355
|
+
```python
|
|
356
|
+
from sentence_transformers import SentenceTransformer, InputExample
|
|
357
|
+
from sentence_transformers.util import semantic_search
|
|
358
|
+
import torch
|
|
359
|
+
|
|
360
|
+
def mine_hard_negatives(
|
|
361
|
+
queries: list[str],
|
|
362
|
+
positives: list[str],
|
|
363
|
+
corpus: list[str],
|
|
364
|
+
model: SentenceTransformer,
|
|
365
|
+
top_k: int = 10
|
|
366
|
+
) -> list[InputExample]:
|
|
367
|
+
"""Mine hard negatives from corpus for each query-positive pair."""
|
|
368
|
+
|
|
369
|
+
query_embeddings = model.encode(queries, convert_to_tensor=True)
|
|
370
|
+
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
|
|
371
|
+
positive_set = set(positives)
|
|
372
|
+
|
|
373
|
+
examples = []
|
|
374
|
+
for i, query in enumerate(queries):
|
|
375
|
+
# Find similar documents that are NOT the positive
|
|
376
|
+
hits = semantic_search(
|
|
377
|
+
query_embeddings[i:i+1],
|
|
378
|
+
corpus_embeddings,
|
|
379
|
+
top_k=top_k + 1
|
|
380
|
+
)[0]
|
|
381
|
+
|
|
382
|
+
hard_negatives = [
|
|
383
|
+
corpus[hit["corpus_id"]]
|
|
384
|
+
for hit in hits
|
|
385
|
+
if corpus[hit["corpus_id"]] not in positive_set
|
|
386
|
+
][:3] # Top 3 hard negatives
|
|
387
|
+
|
|
388
|
+
examples.append(InputExample(
|
|
389
|
+
texts=[query, positives[i]] + hard_negatives
|
|
390
|
+
))
|
|
391
|
+
|
|
392
|
+
return examples
|
|
393
|
+
```
|
|
394
|
+
|
|
395
|
+
---
|
|
396
|
+
|
|
397
|
+
## Embedding Pipeline Best Practices
|
|
398
|
+
|
|
399
|
+
### Text Preprocessing
|
|
400
|
+
|
|
401
|
+
```python
|
|
402
|
+
import re
|
|
403
|
+
from typing import Callable
|
|
404
|
+
|
|
405
|
+
def clean_for_embedding(text: str) -> str:
|
|
406
|
+
"""Clean text before embedding."""
|
|
407
|
+
# Remove excessive whitespace
|
|
408
|
+
text = re.sub(r'\s+', ' ', text)
|
|
409
|
+
# Remove special characters that don't add meaning
|
|
410
|
+
text = re.sub(r'[^\w\s\.\,\!\?\-\:\;\(\)]', '', text)
|
|
411
|
+
# Truncate to reasonable length (model dependent)
|
|
412
|
+
text = text[:8000]  # Character cap, not a token count; tune per model
|
|
413
|
+
return text.strip()
|
|
414
|
+
|
|
415
|
+
def preprocess_for_embedding(
|
|
416
|
+
text: str,
|
|
417
|
+
prefix: str = "",
|
|
418
|
+
max_length: int = 8000
|
|
419
|
+
) -> str:
|
|
420
|
+
"""Preprocess with optional prefix (for instruction-tuned models)."""
|
|
421
|
+
cleaned = clean_for_embedding(text)
|
|
422
|
+
prefixed = f"{prefix}{cleaned}" if prefix else cleaned
|
|
423
|
+
return prefixed[:max_length]
|
|
424
|
+
|
|
425
|
+
# BGE-style prefix for queries
|
|
426
|
+
query_text = preprocess_for_embedding(
|
|
427
|
+
"how to install",
|
|
428
|
+
prefix="Represent this sentence for searching relevant passages: "
|
|
429
|
+
)
|
|
430
|
+
```
|
|
431
|
+
|
|
432
|
+
### Caching Embeddings
|
|
433
|
+
|
|
434
|
+
```python
|
|
435
|
+
import hashlib
|
|
436
|
+
import json
|
|
437
|
+
from functools import lru_cache
|
|
438
|
+
from pathlib import Path
|
|
439
|
+
|
|
440
|
+
class EmbeddingCache:
|
|
441
|
+
"""Disk-based embedding cache."""
|
|
442
|
+
|
|
443
|
+
def __init__(self, cache_dir: str = ".embedding_cache"):
|
|
444
|
+
self.cache_dir = Path(cache_dir)
|
|
445
|
+
self.cache_dir.mkdir(exist_ok=True)
|
|
446
|
+
|
|
447
|
+
def _hash_key(self, text: str, model: str) -> str:
|
|
448
|
+
content = f"{model}:{text}"
|
|
449
|
+
return hashlib.sha256(content.encode()).hexdigest()
|
|
450
|
+
|
|
451
|
+
def get(self, text: str, model: str) -> list[float] | None:
|
|
452
|
+
key = self._hash_key(text, model)
|
|
453
|
+
cache_file = self.cache_dir / f"{key}.json"
|
|
454
|
+
if cache_file.exists():
|
|
455
|
+
return json.loads(cache_file.read_text())
|
|
456
|
+
return None
|
|
457
|
+
|
|
458
|
+
def set(self, text: str, model: str, embedding: list[float]) -> None:
|
|
459
|
+
key = self._hash_key(text, model)
|
|
460
|
+
cache_file = self.cache_dir / f"{key}.json"
|
|
461
|
+
cache_file.write_text(json.dumps(embedding))
|
|
462
|
+
|
|
463
|
+
# Usage
|
|
464
|
+
cache = EmbeddingCache()
|
|
465
|
+
|
|
466
|
+
def get_embedding_cached(text: str, model: str = "text-embedding-3-small") -> list[float]:
|
|
467
|
+
cached = cache.get(text, model)
|
|
468
|
+
if cached is not None:
|
|
469
|
+
return cached
|
|
470
|
+
|
|
471
|
+
embedding = get_embedding(text, model) # Call API
|
|
472
|
+
cache.set(text, model, embedding)
|
|
473
|
+
return embedding
|
|
474
|
+
```
|
|
475
|
+
|
|
476
|
+
### Batching Strategy
|
|
477
|
+
|
|
478
|
+
```python
|
|
479
|
+
from typing import Iterator
|
|
480
|
+
import asyncio
|
|
481
|
+
from openai import AsyncOpenAI
|
|
482
|
+
|
|
483
|
+
def batch_texts(texts: list[str], batch_size: int = 100) -> Iterator[list[str]]:
|
|
484
|
+
"""Yield batches of texts."""
|
|
485
|
+
for i in range(0, len(texts), batch_size):
|
|
486
|
+
yield texts[i:i + batch_size]
|
|
487
|
+
|
|
488
|
+
async def get_embeddings_async(
|
|
489
|
+
texts: list[str],
|
|
490
|
+
model: str = "text-embedding-3-small",
|
|
491
|
+
batch_size: int = 100,
|
|
492
|
+
max_concurrent: int = 5
|
|
493
|
+
) -> list[list[float]]:
|
|
494
|
+
"""Async batch embedding with concurrency control."""
|
|
495
|
+
client = AsyncOpenAI()
|
|
496
|
+
semaphore = asyncio.Semaphore(max_concurrent)
|
|
497
|
+
|
|
498
|
+
async def embed_batch(batch: list[str]) -> list[list[float]]:
|
|
499
|
+
async with semaphore:
|
|
500
|
+
response = await client.embeddings.create(
|
|
501
|
+
input=batch,
|
|
502
|
+
model=model
|
|
503
|
+
)
|
|
504
|
+
return [item.embedding for item in sorted(response.data, key=lambda x: x.index)]
|
|
505
|
+
|
|
506
|
+
batches = list(batch_texts(texts, batch_size))
|
|
507
|
+
results = await asyncio.gather(*[embed_batch(b) for b in batches])
|
|
508
|
+
|
|
509
|
+
# Flatten results
|
|
510
|
+
return [emb for batch_result in results for emb in batch_result]
|
|
511
|
+
```
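
The flattening step in the batching code relies on `asyncio.gather` returning results in the order the awaitables were passed, not the order in which they finish. The self-contained sketch below illustrates this with a hypothetical `fake_embed_batch` coroutine standing in for the API call; earlier batches deliberately sleep longer so they complete last. The one-element "embeddings" are placeholders, not real model output.

```python
import asyncio


def batch_texts(texts: list[str], batch_size: int):
    """Yield consecutive slices of at most batch_size texts."""
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]


async def embed_all(texts: list[str], batch_size: int = 2, max_concurrent: int = 2) -> list[list[float]]:
    semaphore = asyncio.Semaphore(max_concurrent)

    async def fake_embed_batch(index: int, batch: list[str]) -> list[list[float]]:
        async with semaphore:
            # Hypothetical stand-in for a network call: earlier batches sleep
            # longer, so they finish after later ones.
            await asyncio.sleep(0.01 * (3 - min(index, 3)))
            return [[float(len(text))] for text in batch]

    batches = list(batch_texts(texts, batch_size))
    # gather() preserves the order of its arguments in the result list.
    results = await asyncio.gather(
        *(fake_embed_batch(i, b) for i, b in enumerate(batches))
    )
    return [vec for batch_result in results for vec in batch_result]


texts = ["a", "bb", "ccc", "dddd", "eeeee"]
vectors = asyncio.run(embed_all(texts))
print(vectors)
```

Because order is preserved, the i-th returned vector corresponds to the i-th input text even though the batches complete out of order.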
|
|
512
|
+
|
|
513
|
+
---
|
|
514
|
+
|
|
515
|
+
## Model Selection Flowchart
|
|
516
|
+
|
|
517
|
+
```
|
|
518
|
+
Start
|
|
519
|
+
│
|
|
520
|
+
├─ Need offline/self-hosted?
|
|
521
|
+
│ └─ Yes → BGE-large or E5-large (open source)
|
|
522
|
+
│
|
|
523
|
+
├─ Multi-lingual requirement?
|
|
524
|
+
│ └─ Yes → Cohere embed-multilingual-v3 or BGE-M3
|
|
525
|
+
│
|
|
526
|
+
├─ Code/technical documentation?
|
|
527
|
+
│ └─ Yes → Voyage-code-2
|
|
528
|
+
│
|
|
529
|
+
├─ Long documents (>8K tokens)?
|
|
530
|
+
│ └─ Yes → Voyage-large-2 or nomic-embed-text
|
|
531
|
+
│
|
|
532
|
+
├─ Cost is primary concern?
|
|
533
|
+
│ └─ Yes → text-embedding-3-small (reduced dims)
|
|
534
|
+
│
|
|
535
|
+
├─ Maximum quality needed?
|
|
536
|
+
│ └─ Yes → text-embedding-3-large
|
|
537
|
+
│
|
|
538
|
+
└─ Default → text-embedding-3-small (best balance)
|
|
539
|
+
```
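
The flowchart can also be read as an ordered chain of checks in which the first matching requirement wins. The following is a sketch only: the `Requirements` dataclass and `select_model` function are hypothetical names, and the returned strings simply repeat the flowchart's recommendations.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Hypothetical flags mirroring the questions in the flowchart."""
    self_hosted: bool = False
    multilingual: bool = False
    code_heavy: bool = False
    long_documents: bool = False
    cost_sensitive: bool = False
    max_quality: bool = False


def select_model(req: Requirements) -> str:
    """Walk the flowchart top to bottom; the first matching branch decides."""
    if req.self_hosted:
        return "BGE-large or E5-large"
    if req.multilingual:
        return "Cohere embed-multilingual-v3 or BGE-M3"
    if req.code_heavy:
        return "Voyage-code-2"
    if req.long_documents:
        return "Voyage-large-2 or nomic-embed-text"
    if req.cost_sensitive:
        return "text-embedding-3-small (reduced dims)"
    if req.max_quality:
        return "text-embedding-3-large"
    return "text-embedding-3-small"


print(select_model(Requirements(code_heavy=True)))
print(select_model(Requirements()))
```

Note that branch order matters: a self-hosted, multilingual project lands on the first branch under this reading, so combined requirements may still need a manual look at BGE-M3.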
|
|
540
|
+
|
|
541
|
+
---
|
|
542
|
+
|
|
543
|
+
## Quick Reference
|
|
544
|
+
|
|
545
|
+
| Task | Recommendation |
|
|
546
|
+
|------|----------------|
|
|
547
|
+
| Production RAG (English) | text-embedding-3-small/large |
|
|
548
|
+
| Multi-lingual | Cohere embed-multilingual-v3 |
|
|
549
|
+
| Code retrieval | Voyage-code-2 |
|
|
550
|
+
| Self-hosted | BGE-large-en-v1.5 |
|
|
551
|
+
| Long documents | Voyage-large-2, nomic-embed-text |
|
|
552
|
+
| Prototyping | all-MiniLM-L6-v2 (fast, free) |
|
|
553
|
+
| Maximum quality | text-embedding-3-large |
|
|
554
|
+
| Cost optimized | text-embedding-3-small @ 512 dims |
|
|
555
|
+
|
|
556
|
+
## Related Skills
|
|
557
|
+
|
|
558
|
+
- **RAG Architect** - Vector database integration
|
|
559
|
+
- **Python Pro** - Async embedding pipelines
|
|
560
|
+
- **ML Pipeline** - Embedding model deployment
|
|
561
|
+
- **Fine-Tuning Expert** - Custom embedding training
|