cocoindex 0.1.52__tar.gz → 0.1.54__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/ISSUE_TEMPLATE//360/237/222/241-feature-request.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/scripts/update_version.sh +1 -1
- cocoindex-0.1.54/.pre-commit-config.yaml +71 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.vscode/settings.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/CONTRIBUTING.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/Cargo.lock +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/Cargo.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/PKG-INFO +11 -10
- {cocoindex-0.1.52 → cocoindex-0.1.54}/README.md +9 -9
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/about/contributing.md +31 -17
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/ai/llm.mdx +86 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/basics.md +3 -3
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/cli.mdx +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/custom_function.mdx +3 -3
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/flow_def.mdx +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/flow_methods.mdx +4 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/settings.mdx +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/getting_started/installation.md +2 -3
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/getting_started/overview.md +3 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/getting_started/quickstart.md +3 -3
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/ops/functions.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/ops/sources.md +4 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/ops/targets.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/query.mdx +0 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/sidebars.ts +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/src/css/custom.css +7 -7
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/src/theme/Root.js +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/static/robots.txt +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/amazon_s3_embedding/.env.example +1 -1
- cocoindex-0.1.54/examples/amazon_s3_embedding/.gitignore +1 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/amazon_s3_embedding/README.md +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/amazon_s3_embedding/main.py +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/amazon_s3_embedding/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/code_embedding/README.md +6 -7
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/code_embedding/main.py +18 -6
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/code_embedding/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/docs_to_knowledge_graph/README.md +7 -9
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/docs_to_knowledge_graph/main.py +19 -19
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/docs_to_knowledge_graph/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/README.md +2 -2
- {cocoindex-0.1.52/examples/text_embedding/markdown_files → cocoindex-0.1.54/examples/fastapi_server_docker/files}/1810.04805v2.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/main.py +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/requirements.txt +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/gdrive_text_embedding/.env.example +2 -2
- cocoindex-0.1.54/examples/gdrive_text_embedding/.gitignore +1 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/gdrive_text_embedding/README.md +4 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/gdrive_text_embedding/main.py +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/gdrive_text_embedding/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/.env +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/README.md +0 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/main.py +29 -27
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/pyproject.toml +3 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/requirements.txt +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/README.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/main.py +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/README.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/main.py +2 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/README.md +7 -9
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/main.py +19 -19
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p1.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p2.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p3.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p4.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p6.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p7.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p8.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p9.json +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding/README.md +3 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding/Text_Embedding.ipynb +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding/main.py +13 -5
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding/markdown_files/1706.03762v7.md +1 -1
- {cocoindex-0.1.52/examples/fastapi_server_docker/files → cocoindex-0.1.54/examples/text_embedding/markdown_files}/1810.04805v2.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding/markdown_files/rfc8259.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding/pyproject.toml +1 -1
- cocoindex-0.1.54/examples/text_embedding_qdrant/.env +2 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding_qdrant/README.md +4 -6
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding_qdrant/main.py +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding_qdrant/markdown_files/rfc8259.md +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/text_embedding_qdrant/pyproject.toml +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/pyproject.toml +2 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/cli.py +6 -6
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/convert.py +93 -46
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/flow.py +3 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/functions.py +10 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/llm.py +3 -0
- cocoindex-0.1.54/python/cocoindex/tests/__init__.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/tests/test_convert.py +289 -58
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/tests/test_typing.py +115 -77
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/typing.py +76 -64
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/json_schema.rs +12 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/schema.rs +19 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/value.rs +60 -3
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/builder/flow_builder.rs +8 -11
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/db_tracking_setup.rs +3 -5
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/live_updater.rs +10 -10
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/memoization.rs +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/source_indexer.rs +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/lib_context.rs +4 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/llm/anthropic.rs +21 -10
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/llm/gemini.rs +81 -15
- cocoindex-0.1.54/src/llm/litellm.rs +16 -0
- cocoindex-0.1.54/src/llm/mod.rs +137 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/llm/ollama.rs +13 -10
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/llm/openai.rs +49 -9
- cocoindex-0.1.54/src/llm/openrouter.rs +16 -0
- cocoindex-0.1.54/src/llm/voyage.rs +109 -0
- cocoindex-0.1.54/src/ops/functions/embed_text.rs +97 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/functions/extract_by_llm.rs +6 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/functions/mod.rs +1 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/registration.rs +1 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/sdk.rs +1 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/kuzu.rs +2 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/neo4j.rs +11 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/postgres.rs +7 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/qdrant.rs +0 -2
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/py/convert.rs +31 -4
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/py/mod.rs +15 -3
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/settings.rs +5 -5
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/setup/db_metadata.rs +0 -2
- cocoindex-0.1.52/examples/amazon_s3_embedding/.gitignore +0 -1
- cocoindex-0.1.52/examples/gdrive_text_embedding/.gitignore +0 -1
- cocoindex-0.1.52/examples/product_recommendation/.env +0 -3
- cocoindex-0.1.52/python/cocoindex/tests/__init__.py +0 -1
- cocoindex-0.1.52/src/llm/mod.rs +0 -76
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.cargo/config.toml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.env.lib_debug +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/ISSUE_TEMPLATE//360/237/220/233-bug-report.md +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/workflows/CI.yml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/workflows/_doc_release.yml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/workflows/_test.yml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/workflows/docs.yml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.github/workflows/release.yml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/.gitignore +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/CODE_OF_CONDUCT.md +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/LICENSE +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/dev/neo4j.yaml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/dev/postgres.yaml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/.gitignore +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/README.md +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/about/community.md +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/data_example.svg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/data_types.mdx +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/core/flow_example.svg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docs/getting_started/markdown_files.zip +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/docusaurus.config.ts +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/package.json +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/src/components/HomepageFeatures/index.tsx +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/src/components/HomepageFeatures/styles.module.css +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/static/.nojekyll +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/static/img/docusaurus.png +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/static/img/favicon.ico +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/static/img/icon.svg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/static/img/incremental-etl.gif +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/tsconfig.json +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/docs/yarn.lock +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/code_embedding/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/docs_to_knowledge_graph/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/.dockerignore +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/compose.yaml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/fastapi_server_docker/dockerfile +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/.gitignore +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/index.html +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/package-lock.json +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/package.json +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/src/App.jsx +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/src/main.jsx +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/src/style.css +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/frontend/vite.config.js +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/img/cat1.jpeg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/img/dog1.jpeg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/img/elephant1.jpg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/image_search/img/giraffe.jpg +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/manuals/array.pdf +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/manuals/base64.pdf +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/manuals/copy.pdf +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/manuals_llm_extraction/manuals/glob.pdf +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/pdf_files/1706.03762v7.pdf +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/pdf_files/1810.04805v2.pdf +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/pdf_embedding/pdf_files/rfc8259.pdf +0 -0
- {cocoindex-0.1.52/examples/text_embedding → cocoindex-0.1.54/examples/product_recommendation}/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/img/cocoinsight.png +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/img/neo4j.png +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/examples/product_recommendation/products/p5.json +0 -0
- {cocoindex-0.1.52/examples/text_embedding_qdrant → cocoindex-0.1.54/examples/text_embedding}/.env +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/__init__.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/auth_registry.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/index.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/lib.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/op.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/py.typed +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/runtime.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/setting.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/setup.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/sources.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/targets.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/tests/test_optional_database.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/python/cocoindex/utils.py +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/ruff.toml +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/duration.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/field_attrs.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/base/spec.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/builder/analyzed_flow.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/builder/analyzer.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/builder/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/builder/plan.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/db_tracking.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/dumper.rs +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/evaluator.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/indexing_status.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/row_indexer.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/execution/stats.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/lib.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/factory_bases.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/functions/parse_json.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/functions/split_recursively.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/interface.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/py_factory.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/registry.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/sources/amazon_s3.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/sources/google_drive.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/sources/local_file.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/sources/mod.rs +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/shared/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/shared/property_graph.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/ops/targets/shared/table_columns.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/prelude.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/server.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/service/error.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/service/flows.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/service/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/setup/auth_registry.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/setup/components.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/setup/driver.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/setup/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/setup/states.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/utils/db.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/utils/fingerprint.rs +1 -1
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/utils/immutable.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/utils/mod.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/utils/retryable.rs +0 -0
- {cocoindex-0.1.52 → cocoindex-0.1.54}/src/utils/yaml_ser.rs +0 -0
{cocoindex-0.1.52 → cocoindex-0.1.54}/.github/ISSUE_TEMPLATE//360/237/222/241-feature-request.md
RENAMED
@@ -17,4 +17,4 @@ assignees: ''
|
|
17
17
|
|
18
18
|
---
|
19
19
|
❤️ Contributors, please refer to 📙[Contributing Guide](https://cocoindex.io/docs/about/contributing).
|
20
|
-
Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work. Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is always open and friendly.
|
20
|
+
Unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work. Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is always open and friendly.
|
@@ -0,0 +1,71 @@
|
|
1
|
+
ci:
|
2
|
+
autofix_prs: false
|
3
|
+
autoupdate_schedule: 'monthly'
|
4
|
+
|
5
|
+
repos:
|
6
|
+
- repo: https://github.com/pre-commit/pre-commit-hooks
|
7
|
+
rev: v5.0.0
|
8
|
+
hooks:
|
9
|
+
- id: check-case-conflict
|
10
|
+
# Check for files with names that would conflict on a case-insensitive
|
11
|
+
# filesystem like MacOS HFS+ or Windows FAT.
|
12
|
+
- id: check-merge-conflict
|
13
|
+
# Check for files that contain merge conflict strings.
|
14
|
+
- id: check-symlinks
|
15
|
+
# Checks for symlinks which do not point to anything.
|
16
|
+
exclude: ".*(.github.*)$"
|
17
|
+
- id: detect-private-key
|
18
|
+
# Checks for the existence of private keys.
|
19
|
+
- id: end-of-file-fixer
|
20
|
+
# Makes sure files end in a newline and only a newline.
|
21
|
+
exclude: ".*(data.*|licenses.*|_static.*|\\.ya?ml|\\.jpe?g|\\.png|\\.svg|\\.webp)$"
|
22
|
+
- id: trailing-whitespace
|
23
|
+
# Trims trailing whitespace.
|
24
|
+
exclude_types: [python] # Covered by Ruff W291.
|
25
|
+
exclude: ".*(data.*|licenses.*|_static.*|\\.ya?ml|\\.jpe?g|\\.png|\\.svg|\\.webp)$"
|
26
|
+
|
27
|
+
- repo: local
|
28
|
+
hooks:
|
29
|
+
- id: maturin-develop
|
30
|
+
name: maturin develop
|
31
|
+
entry: maturin develop
|
32
|
+
language: system
|
33
|
+
files: ^(python/|src/|Cargo\.toml|pyproject\.toml)
|
34
|
+
pass_filenames: false
|
35
|
+
|
36
|
+
- id: cargo-fmt
|
37
|
+
name: cargo fmt
|
38
|
+
entry: cargo fmt
|
39
|
+
language: system
|
40
|
+
types: [rust]
|
41
|
+
pass_filenames: false
|
42
|
+
|
43
|
+
- id: cargo-test
|
44
|
+
name: cargo test
|
45
|
+
entry: cargo test
|
46
|
+
language: system
|
47
|
+
types: [rust]
|
48
|
+
pass_filenames: false
|
49
|
+
|
50
|
+
- id: mypy-check
|
51
|
+
name: mypy type check
|
52
|
+
entry: mypy
|
53
|
+
language: system
|
54
|
+
types: [python]
|
55
|
+
pass_filenames: false
|
56
|
+
|
57
|
+
- repo: https://github.com/astral-sh/ruff-pre-commit
|
58
|
+
rev: v0.12.0
|
59
|
+
hooks:
|
60
|
+
- id: ruff-format
|
61
|
+
types: [python]
|
62
|
+
pass_filenames: true
|
63
|
+
|
64
|
+
- repo: https://github.com/christophmeissner/pytest-pre-commit
|
65
|
+
rev: 1.0.0
|
66
|
+
hooks:
|
67
|
+
- id: pytest
|
68
|
+
language: system
|
69
|
+
types: [python]
|
70
|
+
pass_filenames: false
|
71
|
+
always_run: false
|
@@ -1 +1 @@
|
|
1
|
-
We love contributions from our community ❤️. Please check out our [contributing guide](https://cocoindex.io/docs/about/contributing).
|
1
|
+
We love contributions from our community ❤️. Please check out our [contributing guide](https://cocoindex.io/docs/about/contributing).
|
@@ -1,12 +1,13 @@
|
|
1
1
|
Metadata-Version: 2.4
|
2
2
|
Name: cocoindex
|
3
|
-
Version: 0.1.
|
3
|
+
Version: 0.1.54
|
4
4
|
Requires-Dist: sentence-transformers>=3.3.1
|
5
5
|
Requires-Dist: click>=8.1.8
|
6
6
|
Requires-Dist: rich>=14.0.0
|
7
7
|
Requires-Dist: python-dotenv>=1.1.0
|
8
8
|
Requires-Dist: pytest ; extra == 'test'
|
9
9
|
Requires-Dist: ruff ; extra == 'dev'
|
10
|
+
Requires-Dist: pre-commit ; extra == 'dev'
|
10
11
|
Provides-Extra: test
|
11
12
|
Provides-Extra: dev
|
12
13
|
License-File: LICENSE
|
@@ -51,10 +52,10 @@ Unlike a workflow orchestration framework where data is usually opaque, in CocoI
|
|
51
52
|
|
52
53
|
```python
|
53
54
|
# import
|
54
|
-
data['content'] = flow_builder.add_source(...)
|
55
|
+
data['content'] = flow_builder.add_source(...)
|
55
56
|
|
56
57
|
# transform
|
57
|
-
data['out'] = data['content']
|
58
|
+
data['out'] = data['content']
|
58
59
|
.transform(...)
|
59
60
|
.transform(...)
|
60
61
|
|
@@ -75,17 +76,17 @@ As a data framework, CocoIndex takes it to the next level on data freshness. **I
|
|
75
76
|
The frameworks takes care of
|
76
77
|
- Change data capture.
|
77
78
|
- Figure out what exactly needs to be updated, and only updating that without having to recompute everything.
|
78
|
-
|
79
|
+
|
79
80
|
This makes it fast to reflect any source updates to the target store. If you have concerns with surfacing stale data to AI agents and are spending lots of efforts working on infra piece to optimize the latency, the framework actually handles it for you.
|
80
81
|
|
81
82
|
|
82
83
|
## Quick Start:
|
83
|
-
If you're new to CocoIndex, we recommend checking out
|
84
|
+
If you're new to CocoIndex, we recommend checking out
|
84
85
|
- 📖 [Documentation](https://cocoindex.io/docs)
|
85
86
|
- ⚡ [Quick Start Guide](https://cocoindex.io/docs/getting_started/quickstart)
|
86
|
-
- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)
|
87
|
+
- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)
|
87
88
|
|
88
|
-
### Setup
|
89
|
+
### Setup
|
89
90
|
|
90
91
|
1. Install CocoIndex Python library
|
91
92
|
|
@@ -155,8 +156,8 @@ It defines an index flow like this:
|
|
155
156
|
| [Google Drive Text Embedding](examples/gdrive_text_embedding) | Index text documents from Google Drive |
|
156
157
|
| [Docs to Knowledge Graph](examples/docs_to_knowledge_graph) | Extract relationships from Markdown documents and build a knowledge graph |
|
157
158
|
| [Embeddings to Qdrant](examples/text_embedding_qdrant) | Index documents in a Qdrant collection for semantic search |
|
158
|
-
| [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
|
159
|
-
| [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
|
159
|
+
| [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
|
160
|
+
| [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
|
160
161
|
| [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
|
161
162
|
|
162
163
|
More coming and stay tuned 👀!
|
@@ -178,7 +179,7 @@ Join our community here:
|
|
178
179
|
- 📜 [Read our blog posts](https://cocoindex.io/blogs/)
|
179
180
|
|
180
181
|
## Support us:
|
181
|
-
We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ at GitHub repo [](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.
|
182
|
+
We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ at GitHub repo [](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.
|
182
183
|
|
183
184
|
## License
|
184
185
|
CocoIndex is Apache 2.0 licensed.
|
@@ -32,10 +32,10 @@ Unlike a workflow orchestration framework where data is usually opaque, in CocoI
|
|
32
32
|
|
33
33
|
```python
|
34
34
|
# import
|
35
|
-
data['content'] = flow_builder.add_source(...)
|
35
|
+
data['content'] = flow_builder.add_source(...)
|
36
36
|
|
37
37
|
# transform
|
38
|
-
data['out'] = data['content']
|
38
|
+
data['out'] = data['content']
|
39
39
|
.transform(...)
|
40
40
|
.transform(...)
|
41
41
|
|
@@ -56,17 +56,17 @@ As a data framework, CocoIndex takes it to the next level on data freshness. **I
|
|
56
56
|
The frameworks takes care of
|
57
57
|
- Change data capture.
|
58
58
|
- Figure out what exactly needs to be updated, and only updating that without having to recompute everything.
|
59
|
-
|
59
|
+
|
60
60
|
This makes it fast to reflect any source updates to the target store. If you have concerns with surfacing stale data to AI agents and are spending lots of efforts working on infra piece to optimize the latency, the framework actually handles it for you.
|
61
61
|
|
62
62
|
|
63
63
|
## Quick Start:
|
64
|
-
If you're new to CocoIndex, we recommend checking out
|
64
|
+
If you're new to CocoIndex, we recommend checking out
|
65
65
|
- 📖 [Documentation](https://cocoindex.io/docs)
|
66
66
|
- ⚡ [Quick Start Guide](https://cocoindex.io/docs/getting_started/quickstart)
|
67
|
-
- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)
|
67
|
+
- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)
|
68
68
|
|
69
|
-
### Setup
|
69
|
+
### Setup
|
70
70
|
|
71
71
|
1. Install CocoIndex Python library
|
72
72
|
|
@@ -136,8 +136,8 @@ It defines an index flow like this:
|
|
136
136
|
| [Google Drive Text Embedding](examples/gdrive_text_embedding) | Index text documents from Google Drive |
|
137
137
|
| [Docs to Knowledge Graph](examples/docs_to_knowledge_graph) | Extract relationships from Markdown documents and build a knowledge graph |
|
138
138
|
| [Embeddings to Qdrant](examples/text_embedding_qdrant) | Index documents in a Qdrant collection for semantic search |
|
139
|
-
| [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
|
140
|
-
| [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
|
139
|
+
| [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
|
140
|
+
| [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
|
141
141
|
| [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
|
142
142
|
|
143
143
|
More coming and stay tuned 👀!
|
@@ -159,7 +159,7 @@ Join our community here:
|
|
159
159
|
- 📜 [Read our blog posts](https://cocoindex.io/blogs/)
|
160
160
|
|
161
161
|
## Support us:
|
162
|
-
We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ at GitHub repo [](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.
|
162
|
+
We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ at GitHub repo [](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.
|
163
163
|
|
164
164
|
## License
|
165
165
|
CocoIndex is Apache 2.0 licensed.
|
@@ -15,47 +15,52 @@ We use [GitHub Issues](https://github.com/cocoindex-io/cocoindex/issues) to trac
|
|
15
15
|
|
16
16
|
We tag issues with the ["good first issue"](https://github.com/cocoindex-io/cocoindex/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) label for beginner contributors.
|
17
17
|
|
18
|
-
## How to Contribute
|
18
|
+
## How to Contribute
|
19
19
|
- If you decide to work on an issue, unless the PR can be sent immediately (e.g. just a few lines of code), we recommend you to leave a comment on the issue like **`I'm working on it`** or **`Can I work on this issue?`** to avoid duplicating work.
|
20
20
|
- For larger features, we recommend you to discuss with us first in our [Discord server](https://discord.com/invite/zpA9S2DR7s) to coordinate the design and work.
|
21
21
|
- Our [Discord server](https://discord.com/invite/zpA9S2DR7s) are constantly open. If you are unsure about anything, it is a good place to discuss! We'd love to collaborate and will always be friendly.
|
22
22
|
|
23
|
-
## Start hacking! Setting Up Development Environment
|
23
|
+
## Start hacking! Setting Up Development Environment
|
24
24
|
Following the steps below to get cocoindex build on latest codebase locally - if you are making changes to cocoindex funcionality and want to test it out.
|
25
25
|
|
26
26
|
- 🦀 [Install Rust](https://rust-lang.org/tools/install)
|
27
|
-
|
27
|
+
|
28
28
|
If you don't have Rust installed, run
|
29
|
-
```
|
29
|
+
```sh
|
30
30
|
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
|
31
31
|
```
|
32
|
-
Already have Rust? Make sure it's up to date
|
33
|
-
```
|
32
|
+
Already have Rust? Make sure it's up to date
|
33
|
+
```sh
|
34
34
|
rustup update
|
35
35
|
```
|
36
36
|
|
37
|
-
-
|
38
|
-
```
|
37
|
+
- Setup Python virtual environment:
|
38
|
+
```sh
|
39
39
|
python3 -m venv .venv
|
40
40
|
```
|
41
|
-
Activate the virtual environment, before any
|
41
|
+
Activate the virtual environment, before any installing / building / running:
|
42
42
|
|
43
|
-
```
|
43
|
+
```sh
|
44
44
|
. .venv/bin/activate
|
45
45
|
```
|
46
46
|
|
47
|
-
- Install
|
48
|
-
```
|
49
|
-
pip install maturin
|
47
|
+
- Install required tools:
|
48
|
+
```sh
|
49
|
+
pip install maturin mypy pre-commit
|
50
50
|
```
|
51
51
|
|
52
52
|
- Build the library. Run at the root of cocoindex directory:
|
53
|
-
```
|
53
|
+
```sh
|
54
54
|
maturin develop
|
55
55
|
```
|
56
56
|
|
57
|
-
-
|
58
|
-
```
|
57
|
+
- Install and enable pre-commit hooks. This ensures all checks run automatically before each commit:
|
58
|
+
```sh
|
59
|
+
pre-commit install
|
60
|
+
```
|
61
|
+
|
62
|
+
- Before running a specific example, set extra environment variables, for exposing extra traces, allowing dev UI, etc.
|
63
|
+
```sh
|
59
64
|
. ./.env.lib_debug
|
60
65
|
```
|
61
66
|
|
@@ -67,7 +72,16 @@ To submit your code:
|
|
67
72
|
1. Fork the [CocoIndex repository](https://github.com/cocoindex-io/cocoindex)
|
68
73
|
2. [Create a new branch](https://docs.github.com/en/desktop/making-changes-in-a-branch/managing-branches-in-github-desktop) on your fork
|
69
74
|
3. Make your changes
|
70
|
-
4.
|
75
|
+
4. Run the pre-commit checks (automatically triggered on `git commit`)
|
76
|
+
|
77
|
+
:::tip
|
78
|
+
To run them manually (same as CI):
|
79
|
+
```sh
|
80
|
+
pre-commit run --all-files
|
81
|
+
```
|
82
|
+
:::
|
83
|
+
|
84
|
+
5. [Open a Pull Request (PR)](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request-from-a-fork) when your work is ready for review
|
71
85
|
|
72
86
|
In your PR description, please include:
|
73
87
|
- Description of the changes
|
@@ -121,3 +121,89 @@ cocoindex.LlmSpec(
|
|
121
121
|
|
122
122
|
You can find the full list of models supported by Anthropic [here](https://docs.anthropic.com/en/docs/about-claude/models/all-models).
|
123
123
|
|
124
|
+
### LiteLLM
|
125
|
+
|
126
|
+
To use the LiteLLM API, you need to set the environment variable `LITELLM_API_KEY`.
|
127
|
+
|
128
|
+
#### 1. Install LiteLLM Proxy
|
129
|
+
|
130
|
+
```bash
|
131
|
+
pip install 'litellm[proxy]'
|
132
|
+
```
|
133
|
+
|
134
|
+
#### 2. Create a `config.yml` for LiteLLM
|
135
|
+
|
136
|
+
**Example for OpenAI:**
|
137
|
+
```yaml
|
138
|
+
model_list:
|
139
|
+
- model_name: "*"
|
140
|
+
litellm_params:
|
141
|
+
model: openai/*
|
142
|
+
api_key: os.environ/LITELLM_API_KEY
|
143
|
+
```
|
144
|
+
|
145
|
+
**Example for DeepSeek:**
|
146
|
+
|
147
|
+
First, pull the DeepSeek model with Ollama:
|
148
|
+
```bash
|
149
|
+
ollama pull deepseek-r1
|
150
|
+
```
|
151
|
+
Then run it if it's not running:
|
152
|
+
```bash
|
153
|
+
ollama run deepseek-r1
|
154
|
+
```
|
155
|
+
|
156
|
+
Then, use this in your `config.yml`:
|
157
|
+
```yaml
|
158
|
+
model_list:
|
159
|
+
- model_name: "deepseek-r1"
|
160
|
+
litellm_params:
|
161
|
+
model: "ollama_chat/deepseek-r1"
|
162
|
+
api_base: "http://localhost:11434"
|
163
|
+
```
|
164
|
+
|
165
|
+
#### 3. Run LiteLLM Proxy
|
166
|
+
|
167
|
+
```bash
|
168
|
+
litellm --config config.yml
|
169
|
+
```
|
170
|
+
|
171
|
+
#### 4. A Spec for LiteLLM will look like this:
|
172
|
+
|
173
|
+
<Tabs>
|
174
|
+
<TabItem value="python" label="Python" default>
|
175
|
+
|
176
|
+
```python
|
177
|
+
cocoindex.LlmSpec(
|
178
|
+
api_type=cocoindex.LlmApiType.LITE_LLM,
|
179
|
+
model="deepseek-r1",
|
180
|
+
address="http://127.0.0.1:4000", # default url of LiteLLM
|
181
|
+
)
|
182
|
+
```
|
183
|
+
|
184
|
+
</TabItem>
|
185
|
+
</Tabs>
|
186
|
+
|
187
|
+
You can find the full list of models supported by LiteLLM [here](https://docs.litellm.ai/docs/providers).
|
188
|
+
|
189
|
+
### OpenRouter
|
190
|
+
|
191
|
+
To use the OpenRouter API, you need to set the environment variable `OPENROUTER_API_KEY`.
|
192
|
+
You can generate the API key from [here](https://openrouter.ai/settings/keys).
|
193
|
+
|
194
|
+
A spec for OpenRouter looks like this:
|
195
|
+
|
196
|
+
<Tabs>
|
197
|
+
<TabItem value="python" label="Python" default>
|
198
|
+
|
199
|
+
```python
|
200
|
+
cocoindex.LlmSpec(
|
201
|
+
api_type=cocoindex.LlmApiType.OPEN_ROUTER,
|
202
|
+
model="deepseek/deepseek-r1:free",
|
203
|
+
)
|
204
|
+
```
|
205
|
+
|
206
|
+
</TabItem>
|
207
|
+
</Tabs>
|
208
|
+
|
209
|
+
You can find the full list of models supported by OpenRouter [here](https://openrouter.ai/models).
|
@@ -50,7 +50,7 @@ For the example shown in the [Quickstart](../getting_started/quickstart) section
|
|
50
50
|
|
51
51
|
This creates the following data for the indexing flow:
|
52
52
|
|
53
|
-
* The `
|
53
|
+
* The `LocalFile` source creates a `documents` field at the top level, with `filename` (key) and `content` sub fields.
|
54
54
|
* A "for each" action works on each document, with the following transformations:
|
55
55
|
* The `SplitRecursively` function splits content into chunks, adds a `chunks` field into the current scope (each document), with `location` (key) and `text` sub fields.
|
56
56
|
* A "collect" action works on each chunk, with the following transformations:
|
@@ -71,7 +71,7 @@ An indexing flow, once set up, maintains a long-lived relationship between data
|
|
71
71
|
|
72
72
|
* **One time update**: Once triggered, CocoIndex updates the target data to reflect the version of source data up to the current moment.
|
73
73
|
* **Live update**: CocoIndex continuously reacts to changes of source data and updates the target data accordingly, based on various **change capture mechanisms** for the source.
|
74
|
-
|
74
|
+
|
75
75
|
See more details in the [build / update target data](flow_methods#build--update-target-data) section.
|
76
76
|
|
77
77
|
3. CocoIndex intelligently reprocesses to propagate source changes to target by:
|
@@ -101,4 +101,4 @@ As an indexing flow is long-lived, it needs to store intermediate data to keep t
|
|
101
101
|
CocoIndex uses internal storage for this purpose.
|
102
102
|
|
103
103
|
Currently, CocoIndex uses Postgres database as the internal storage.
|
104
|
-
See [Settings](settings#databaseconnectionspec) for configuring its location, and `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
|
104
|
+
See [Settings](settings#databaseconnectionspec) for configuring its location, and `cocoindex setup` CLI command (see [CocoIndex CLI](cli)) creates tables for the internal storage.
|
@@ -84,7 +84,7 @@ Notes:
|
|
84
84
|
|
85
85
|
### Function Executor
|
86
86
|
|
87
|
-
A function executor defines behavior of a function. It's
|
87
|
+
A function executor defines behavior of a function. It's instantiated for each operation that uses this function.
|
88
88
|
|
89
89
|
The function executor is responsible for:
|
90
90
|
|
@@ -117,7 +117,7 @@ Notes:
|
|
117
117
|
* The `cocoindex.op.executor_class()` class decorator also takes optional parameters.
|
118
118
|
See [Parameters for custom functions](#parameters-for-custom-functions) for details.
|
119
119
|
|
120
|
-
* A `spec` field must be present in the class, and must be
|
120
|
+
* A `spec` field must be present in the class, and must be annotated with the spec class name.
|
121
121
|
* The `prepare()` method is optional. It's executed once and only once before any `__call__` execution, to prepare the function execution.
|
122
122
|
* The `__call__()` method is required. It's executed for each specific rows of data.
|
123
123
|
Types of arugments and the return value must be decorated, so that CocoIndex will have information about data types of the operation's output fields.
|
@@ -133,7 +133,7 @@ The cocoindex repository contains the following examples of custom functions def
|
|
133
133
|
* In the [pdf_embedding](https://github.com/cocoindex-io/cocoindex/blob/main/examples/pdf_embedding/main.py) example, we define a custom function `PdfToMarkdown`
|
134
134
|
* The `SentenceTransformerEmbed` function shipped with the CocoIndex Python package is defined by Python SDK.
|
135
135
|
Search for [`SentenceTransformerEmbedExecutor`](https://github.com/search?q=repo%3Acocoindex-io%2Fcocoindex+lang%3Apython+SentenceTransformerEmbedExecutor&type=code) to see the code.
|
136
|
-
|
136
|
+
|
137
137
|
## Parameters for custom functions
|
138
138
|
|
139
139
|
Custom functions take the following additional parameters:
|
@@ -56,7 +56,7 @@ A data scope has a bunch of fields and collectors, and users can add new fields
|
|
56
56
|
|
57
57
|
### Get or Add a Field
|
58
58
|
|
59
|
-
You can get or add a field of a data scope (which is a data slice).
|
59
|
+
You can get or add a field of a data scope (which is a data slice).
|
60
60
|
|
61
61
|
:::note
|
62
62
|
|
@@ -81,7 +81,7 @@ The `update()` async method creates/updates data in the target.
|
|
81
81
|
Once the function returns, the target data is fresh up to the moment when the function is called.
|
82
82
|
|
83
83
|
```python
|
84
|
-
stats =
|
84
|
+
stats = demo_flow.update()
|
85
85
|
print(stats)
|
86
86
|
```
|
87
87
|
|
@@ -182,10 +182,10 @@ CocoIndex also provides asynchronous versions of APIs for blocking operations, i
|
|
182
182
|
my_updater = cocoindex.FlowLiveUpdater(demo_flow)
|
183
183
|
# Start the updater.
|
184
184
|
await my_updater.start_async()
|
185
|
-
|
185
|
+
|
186
186
|
# Perform your own logic (e.g. a query loop).
|
187
187
|
...
|
188
|
-
|
188
|
+
|
189
189
|
# Print the update stats.
|
190
190
|
print(my_updater.update_stats())
|
191
191
|
# Abort the updater.
|
@@ -245,4 +245,4 @@ demo_flow.evaluate_and_dump(EvaluateAndDumpOptions(output_dir="./eval_output"))
|
|
245
245
|
```
|
246
246
|
|
247
247
|
</TabItem>
|
248
|
-
</Tabs>
|
248
|
+
</Tabs>
|
@@ -113,4 +113,4 @@ This is the list of environment variables, each of which has a corresponding fie
|
|
113
113
|
| `COCOINDEX_DATABASE_URL` | `database.url` | Yes |
|
114
114
|
| `COCOINDEX_DATABASE_USER` | `database.user` | No |
|
115
115
|
| `COCOINDEX_DATABASE_PASSWORD` | `database.password` | No |
|
116
|
-
| `COCOINDEX_APP_NAMESPACE` | `app_namespace` | No |
|
116
|
+
| `COCOINDEX_APP_NAMESPACE` | `app_namespace` | No |
|
@@ -1,5 +1,5 @@
|
|
1
1
|
---
|
2
|
-
title: Installation
|
2
|
+
title: Installation
|
3
3
|
description: Setup the CocoIndex environment in 0-3 min
|
4
4
|
---
|
5
5
|
|
@@ -17,7 +17,7 @@ pip install -U cocoindex
|
|
17
17
|
|
18
18
|
## 📦 Install Postgres
|
19
19
|
|
20
|
-
You can skip this step if you already have a Postgres database with pgvector extension installed.
|
20
|
+
You can skip this step if you already have a Postgres database with pgvector extension installed.
|
21
21
|
|
22
22
|
If you don't have a Postgres database:
|
23
23
|
|
@@ -31,4 +31,3 @@ docker compose -f <(curl -L https://raw.githubusercontent.com/cocoindex-io/cocoi
|
|
31
31
|
## 🎉 All set!
|
32
32
|
|
33
33
|
You can now start using CocoIndex.
|
34
|
-
|
@@ -5,7 +5,7 @@ slug: /
|
|
5
5
|
|
6
6
|
# Welcome to CocoIndex
|
7
7
|
|
8
|
-
CocoIndex is an ultra-performant real-time data transformation framework for AI, with incremental processing.
|
8
|
+
CocoIndex is an ultra-performant real-time data transformation framework for AI, with incremental processing.
|
9
9
|
|
10
10
|
As a data framework, CocoIndex takes it to the next level on data freshness. **Incremental processing** is one of the core values provided by CocoIndex.
|
11
11
|
|
@@ -17,10 +17,10 @@ CocoIndex follows the idea of [Dataflow programming](https://en.wikipedia.org/wi
|
|
17
17
|
The gist of an example data transformation:
|
18
18
|
```python
|
19
19
|
# import
|
20
|
-
data['content'] = flow_builder.add_source(...)
|
20
|
+
data['content'] = flow_builder.add_source(...)
|
21
21
|
|
22
22
|
# transform
|
23
|
-
data['out'] = data['content']
|
23
|
+
data['out'] = data['content']
|
24
24
|
.transform(...)
|
25
25
|
.transform(...)
|
26
26
|
|
@@ -33,4 +33,3 @@ collector.export(...)
|
|
33
33
|
|
34
34
|
Get Started:
|
35
35
|
- [Quick Start](https://cocoindex.io/docs/getting_started/quickstart)
|
36
|
-
|
@@ -19,7 +19,7 @@ This guide will help you get up and running with CocoIndex in just a few minutes
|
|
19
19
|
We'll need to install a bunch of dependencies for this project.
|
20
20
|
|
21
21
|
1. Install CocoIndex:
|
22
|
-
|
22
|
+
|
23
23
|
```bash
|
24
24
|
pip install -U cocoindex
|
25
25
|
```
|
@@ -149,7 +149,7 @@ documents: 3 added, 0 removed, 0 updated
|
|
149
149
|
|
150
150
|
## Step 4 (optional): Run queries against the index
|
151
151
|
|
152
|
-
CocoIndex excels at transforming your data and storing it (a.k.a. indexing).
|
152
|
+
CocoIndex excels at transforming your data and storing it (a.k.a. indexing).
|
153
153
|
The goal of transforming your data is usually to query against it.
|
154
154
|
Once you already have your index built, you can directly access the transformed data in the target database.
|
155
155
|
CocoIndex also provides utilities for you to do this more seamlessly.
|
@@ -291,4 +291,4 @@ Next, you may want to:
|
|
291
291
|
* Learn about [CocoIndex Basics](../core/basics.md).
|
292
292
|
* Learn about other examples in the [examples](https://github.com/cocoindex-io/cocoindex/tree/main/examples) directory.
|
293
293
|
* The `text_embedding` example is this quickstart.
|
294
|
-
* Pick other examples to learn upon your interest.
|
294
|
+
* Pick other examples to learn upon your interest.
|
@@ -53,7 +53,7 @@ Input data:
|
|
53
53
|
:::note
|
54
54
|
|
55
55
|
We use the `language` field to determine how to split the input text, following these rules:
|
56
|
-
|
56
|
+
|
57
57
|
* We'll match the input `language` field against the `language_name` or `aliases` of each custom language specification, and use the matched one. If value of `language` is null, it'll be treated as empty string when matching `language_name` or `aliases`.
|
58
58
|
* If no match is found, we'll match the `language` field against the builtin language configurations.
|
59
59
|
For all supported builtin language names and aliases (extensions), see [the code](https://github.com/search?q=org%3Acocoindex-io+lang%3Arust++%22static+TREE_SITTER_LANGUAGE_BY_LANG%22&type=code).
|
@@ -22,9 +22,9 @@ The spec takes the following fields:
|
|
22
22
|
If not specified, no files will be excluded.
|
23
23
|
|
24
24
|
:::info
|
25
|
-
|
25
|
+
|
26
26
|
`included_patterns` and `excluded_patterns` are using Unix-style glob syntax. See [globset syntax](https://docs.rs/globset/latest/globset/index.html#syntax) for the details.
|
27
|
-
|
27
|
+
|
28
28
|
:::
|
29
29
|
|
30
30
|
### Schema
|
@@ -131,9 +131,9 @@ The spec takes the following fields:
|
|
131
131
|
If not specified, no files will be excluded.
|
132
132
|
|
133
133
|
:::info
|
134
|
-
|
134
|
+
|
135
135
|
`included_patterns` and `excluded_patterns` are using Unix-style glob syntax. See [globset syntax](https://docs.rs/globset/latest/globset/index.html#syntax) for the details.
|
136
|
-
|
136
|
+
|
137
137
|
:::
|
138
138
|
|
139
139
|
* `sqs_queue_url` (type: `str`, optional): if provided, the source will receive change event notifications from Amazon S3 via this SQS queue.
|
@@ -52,7 +52,7 @@ Here's how CocoIndex data elements map to Qdrant elements during export:
|
|
52
52
|
|
53
53
|
| CocoIndex Element | Qdrant Element |
|
54
54
|
|-------------------|------------------|
|
55
|
-
| an export target | a unique collection |
|
55
|
+
| an export target | a unique collection |
|
56
56
|
| a collected row | a point |
|
57
57
|
| a field | a named vector, if fits into Qdrant vector; or a field within payload otherwise |
|
58
58
|
|