cocoindex 0.1.70__tar.gz → 0.1.72__tar.gz
- {cocoindex-0.1.70 → cocoindex-0.1.72}/Cargo.lock +1 -1
- {cocoindex-0.1.70 → cocoindex-0.1.72}/Cargo.toml +1 -1
- {cocoindex-0.1.70 → cocoindex-0.1.72}/PKG-INFO +12 -11
- {cocoindex-0.1.70 → cocoindex-0.1.72}/README.md +11 -10
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/custom_function.mdx +11 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/data_types.mdx +18 -10
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ops/targets.md +14 -0
- cocoindex-0.1.72/docs/docs/tutorials/live_updates.md +156 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/sidebars.ts +8 -0
- cocoindex-0.1.72/examples/face_recognition/README.md +51 -0
- cocoindex-0.1.72/examples/face_recognition/images/Carter_welcomes_Reagan.jpg +0 -0
- cocoindex-0.1.72/examples/face_recognition/images/Solvay_conference_1927.jpg +0 -0
- cocoindex-0.1.72/examples/face_recognition/images/Steve_Jobs_and_Bill_Gates_(522695099).jpg +0 -0
- cocoindex-0.1.72/examples/face_recognition/images/einplanck3.jpg +0 -0
- cocoindex-0.1.72/examples/face_recognition/main.py +120 -0
- cocoindex-0.1.72/examples/face_recognition/pyproject.toml +14 -0
- cocoindex-0.1.72/examples/live_updates/.env +1 -0
- cocoindex-0.1.72/examples/live_updates/README.md +58 -0
- cocoindex-0.1.72/examples/live_updates/data/bizarre_animals.md +21 -0
- cocoindex-0.1.72/examples/live_updates/data/chunk_norris.md +19 -0
- cocoindex-0.1.72/examples/live_updates/main.py +55 -0
- cocoindex-0.1.72/examples/live_updates/pyproject.toml +12 -0
- cocoindex-0.1.72/examples/text_embedding_qdrant/.env +2 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/__init__.py +1 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/convert.py +79 -4
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/flow.py +16 -7
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/functions.py +8 -7
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/op.py +33 -4
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/setting.py +3 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/test_convert.py +127 -0
- cocoindex-0.1.72/python/cocoindex/tests/test_validation.py +134 -0
- cocoindex-0.1.72/python/cocoindex/validation.py +104 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/field_attrs.rs +1 -1
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/db_tracking_setup.rs +15 -10
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/kuzu.rs +1 -1
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/neo4j.rs +2 -2
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/postgres.rs +45 -6
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/qdrant.rs +29 -14
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/shared/table_columns.rs +8 -8
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/py/mod.rs +3 -3
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/components.rs +8 -5
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/db_metadata.rs +3 -3
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/driver.rs +6 -1
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/states.rs +27 -5
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.cargo/config.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.env.lib_debug +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/ISSUE_TEMPLATE/🐛-bug-report.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/ISSUE_TEMPLATE/💡-feature-request.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/scripts/update_version.sh +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/CI.yml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/_doc_release.yml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/_test.yml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/docs.yml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/format.yml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/release.yml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/.pre-commit-config.yaml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/CODE_OF_CONDUCT.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/CONTRIBUTING.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/LICENSE +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/dev/neo4j.yaml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/dev/postgres.yaml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/about/community.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/about/contributing.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ai/llm.mdx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/basics.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/cli.mdx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/data_example.svg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/flow_def.mdx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/flow_example.svg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/flow_methods.mdx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/settings.mdx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/installation.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/markdown_files.zip +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/overview.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/quickstart.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ops/functions.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ops/sources.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/query.mdx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docusaurus.config.ts +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/package.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/components/HomepageFeatures/index.tsx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/components/HomepageFeatures/styles.module.css +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/css/custom.css +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/theme/Root.js +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/.nojekyll +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/docusaurus.png +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/favicon.ico +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/icon.svg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/incremental-etl.gif +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/robots.txt +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/tsconfig.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/yarn.lock +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/.env.example +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/.env.example +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/pyproject.toml +0 -0
- {cocoindex-0.1.70/examples/manuals_llm_extraction → cocoindex-0.1.72/examples/face_recognition}/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/.dockerignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/compose.yaml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/dockerfile +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/files/1810.04805v2.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/requirements.txt +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/.env.example +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/index.html +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/package-lock.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/package.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/src/App.jsx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/src/main.jsx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/src/style.css +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/vite.config.js +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/cat1.jpeg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/dog1.jpeg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/elephant1.jpg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/giraffe.jpg +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/pyproject.toml +0 -0
- {cocoindex-0.1.70/examples/pdf_embedding → cocoindex-0.1.72/examples/manuals_llm_extraction}/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/array.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/base64.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/copy.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/glob.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/.env.example +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/.gitignore +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/1706.03762v7.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/1810.04805v2.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/2502.06786v3.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/2502.20346v1.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/.env.example +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_David_Artificial.docx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_Emily_Artificial.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_Joe_Artificial.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_From_Jane_Artificial.docx +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/pyproject.toml +0 -0
- {cocoindex-0.1.70/examples/product_recommendation → cocoindex-0.1.72/examples/pdf_embedding}/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pdf_files/1706.03762v7.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pdf_files/1810.04805v2.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pdf_files/rfc8259.pdf +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pyproject.toml +0 -0
- {cocoindex-0.1.70/examples/text_embedding → cocoindex-0.1.72/examples/product_recommendation}/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/img/cocoinsight.png +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/img/neo4j.png +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p1.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p2.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p3.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p4.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p5.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p6.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p7.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p8.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p9.json +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/pyproject.toml +0 -0
- {cocoindex-0.1.70/examples/text_embedding_qdrant → cocoindex-0.1.72/examples/text_embedding}/.env +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/Text_Embedding.ipynb +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/markdown_files/1706.03762v7.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/markdown_files/1810.04805v2.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/markdown_files/rfc8259.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/README.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/main.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/markdown_files/rfc8259.md +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/pyproject.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/auth_registry.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/cli.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/index.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/lib.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/llm.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/py.typed +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/runtime.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/setup.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/sources.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/targets.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/__init__.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/test_optional_database.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/test_typing.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/typing.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/utils.py +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/ruff.toml +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/duration.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/json_schema.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/schema.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/spec.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/value.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/analyzed_flow.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/analyzer.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/exec_ctx.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/flow_builder.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/plan.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/db_tracking.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/dumper.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/evaluator.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/indexing_status.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/live_updater.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/memoization.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/row_indexer.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/source_indexer.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/stats.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/lib.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/lib_context.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/anthropic.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/gemini.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/litellm.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/ollama.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/openai.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/openrouter.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/vertex_ai.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/vllm.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/voyage.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/factory_bases.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/embed_text.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/extract_by_llm.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/parse_json.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/split_recursively.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/test_utils.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/interface.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/py_factory.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/registration.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/registry.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sdk.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/amazon_s3.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/azure_blob.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/google_drive.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/local_file.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/shared/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/shared/property_graph.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/prelude.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/py/convert.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/server.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/service/error.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/service/flows.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/service/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/settings.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/auth_registry.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/concur_control.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/db.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/fingerprint.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/immutable.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/mod.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/retryable.rs +0 -0
- {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/yaml_ser.rs +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cocoindex
-Version: 0.1.
+Version: 0.1.72
 Requires-Dist: click>=8.1.8
 Requires-Dist: rich>=14.0.0
 Requires-Dist: python-dotenv>=1.1.0
@@ -52,18 +52,18 @@ Ultra performant data transformation framework for AI, with core engine written
 ⭐ Drop a star to help us grow!
 
 <div align="center">
-
+
 <!-- Keep these links. Translations will automatically update with the README. -->
-[Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
-[English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
-[Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
-[français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
-[日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
-[한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
-[Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
-[Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
+[Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
+[English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
+[Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
+[français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
+[日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
+[한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
+[Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
+[Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
 [中文](https://readme-i18n.com/cocoindex-io/cocoindex?lang=zh)
-
+
 </div>
 
 </br>
@@ -208,6 +208,7 @@ It defines an index flow like this:
 | [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
 | [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
 | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
+| [Face Recognition](examples/face_recognition) | Recognize faces in images and build embedding index |
 | [Paper Metadata](examples/paper_metadata) | Index papers in PDF files, and build metadata tables for each paper |
 
 More coming and stay tuned 👀!
@@ -27,18 +27,18 @@ Ultra performant data transformation framework for AI, with core engine written
 ⭐ Drop a star to help us grow!
 
 <div align="center">
-
+
 <!-- Keep these links. Translations will automatically update with the README. -->
-[Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
-[English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
-[Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
-[français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
-[日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
-[한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
-[Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
-[Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
+[Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
+[English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
+[Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
+[français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
+[日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
+[한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
+[Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
+[Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
 [中文](https://readme-i18n.com/cocoindex-io/cocoindex?lang=zh)
-
+
 </div>
 
 </br>
@@ -183,6 +183,7 @@ It defines an index flow like this:
 | [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
 | [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
 | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
+| [Face Recognition](examples/face_recognition) | Recognize faces in images and build embedding index |
 | [Paper Metadata](examples/paper_metadata) | Index papers in PDF files, and build metadata tables for each paper |
 
 More coming and stay tuned 👀!
@@ -148,6 +148,17 @@ Custom functions take the following additional parameters:
 When the version is changed, the function will be re-executed even if cache is enabled.
 It's required to be set if `cache` is `True`.
 
+* `arg_relationship: tuple[ArgRelationship, str]`: It specifies the relationship between an input argument and the output,
+e.g. `(ArgRelationship.CHUNKS_BASE_TEXT, "content")` means the output is chunks for the text represented by the
+input argument with name `content`.
+This provides metadata for tools, e.g. CocoInsight.
+Currently the following attributes are supported:
+
+* `ArgRelationship.CHUNKS_BASE_TEXT`:
+The output is chunks for the text represented by the input argument. In this case, the output is expected to be a *Table*, whose each row represents a text chunk, and the first column has type *Range*, representing the range of the text chunk.
+* `ArgRelationship.EMBEDDING_ORIGIN_TEXT`: The output is embedding vector for the text represented by the input argument. The output is expected to be a *Vector*.
+* `ArgRelationship.RECTS_BASE_IMAGE`: The output is rectangles for the image represented by the input argument. The output is expected to be a *Table*, whose each row represents a rectangle, and the first column has type *Struct*, with fields `min_x`, `min_y`, `max_x`, `max_y` to represent the coordinates of the rectangle.
+
 For example:
 
 <Tabs>
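Reviewer note on the hunk above: the `CHUNKS_BASE_TEXT` contract (a table whose first column is the range of each chunk within the base text) may be easier to picture with a small sketch. The code below is plain Python and does not import CocoIndex. `Chunk`, `split_fixed`, and the `(start, end)` tuple standing in for *Range* are hypothetical names chosen for illustration, and the decorator wiring that would pass `arg_relationship=(ArgRelationship.CHUNKS_BASE_TEXT, "content")` is omitted because this diff does not show it.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    # Assumption: a (start, end) character-offset pair stands in for a *Range* column.
    location: tuple[int, int]
    text: str


def split_fixed(content: str, size: int) -> list[Chunk]:
    """Split `content` into consecutive chunks of at most `size` characters."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [
        Chunk((start, min(start + size, len(content))), content[start:start + size])
        for start in range(0, len(content), size)
    ]


base_text = "abcdefghij"
chunks = split_fixed(base_text, 4)

# Each row's range points back into the base text named by the argument,
# which is the relationship that tooling could rely on.
for chunk in chunks:
    start, end = chunk.location
    assert base_text[start:end] == chunk.text
```

The point of the sketch is only the shape of the output: the range comes first in every row, and it always indexes into the argument that the relationship names.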
@@ -21,12 +21,13 @@ All you need to do is to make sure the data passed to functions and targets are
 Each type in CocoIndex type system is mapped to one or multiple types in Python.
 When you define a [custom function](/docs/core/custom_function), you need to annotate the data types of arguments and return values.
 
-* When you pass a Python value to the engine (e.g. return values of a custom function), type annotation is required
-as it provides the ground truth of the data type in the flow.
+* When you pass a Python value to the engine (e.g. return values of a custom function), a specific type annotation is required.
+The type annotation needs to be specific in describing the target data type, as it provides the ground truth of the data type in the flow.
 
 * When you use a Python variable to bind to an engine value (e.g. arguments of a custom function),
-
-
+the engine already knows the specific data type, so we don't require a specific type annotation, e.g. type annotations can be omitted, or you can use `Any` at any level.
+When a specific type annotation is provided, it's still used as a guidance to construct the Python value with compatible type.
+Otherwise, we will bind to a default Python type.
 
 ### Basic Types
 
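Reviewer note on the hunk above: the new wording separates return values, which need a specific annotation, from arguments, which may be left unannotated or typed as `Any`. The sketch below shows that distinction using only the standard library. `has_specific_return_annotation` is a hypothetical helper written for this note, not CocoIndex's own validation, and it only checks the top-level annotation.

```python
import inspect
from typing import Any


def has_specific_return_annotation(fn) -> bool:
    """Return True if `fn` declares a return annotation other than `Any`."""
    annotation = inspect.signature(fn).return_annotation
    return annotation is not inspect.Signature.empty and annotation is not Any


# Arguments may stay unannotated or use `Any`; the return type is spelled out.
def count_words(text, options: Any = None) -> dict[str, int]:
    counts: dict[str, int] = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# No return annotation: nothing states what data type flows onward.
def loosely_typed(text):
    return text.split()
```

Under these assumptions, `count_words` passes the check and `loosely_typed` does not, which mirrors the rule stated in the added lines.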
@@ -54,7 +55,7 @@ This is the list of all primitive types supported by CocoIndex:
 Notes:
 
 * For some CocoIndex types, we support multiple Python types. You can annotate with any of these Python types.
-The first one is the default type, i.e. CocoIndex will create a value with this type when
+The first one is the default type, i.e. CocoIndex will create a value with this type when a specific type annotation is not provided (e.g. for arguments of a custom function).
 
 * All Python types starting with `cocoindex.` are type aliases exported by CocoIndex. They're annotated types based on certain Python types:
 
@@ -86,7 +87,7 @@ Optionally, it can have a fixed dimension. Noted as *Vector[Type]* or *Vector[Ty
|
|
86
87
|
|
87
88
|
It supports the following Python types:
|
88
89
|
|
89
|
-
* `cocoindex.Vector[T]` or `cocoindex.Vector[T, typing.Literal[Dim]]`, e.g. `cocoindex.Vector[cocoindex.Float32]` or `cocoindex.Vector[cocoindex.Float32, 384]`
|
90
|
+
* `cocoindex.Vector[T]` or `cocoindex.Vector[T, typing.Literal[Dim]]`, e.g. `cocoindex.Vector[cocoindex.Float32]` or `cocoindex.Vector[cocoindex.Float32, typing.Literal[384]]`
|
90
91
|
* The underlying Python type is `numpy.typing.NDArray[T]` where `T` is a numpy numeric type (`numpy.int64`, `numpy.float32` or `numpy.float64`), or `list[T]` otherwise
|
91
92
|
* `numpy.typing.NDArray[T]` where `T` is a numpy numeric type
|
92
93
|
* `list[T]`
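The corrected example above wraps the dimension in `typing.Literal`, because a bare `384` is a value rather than a type and static type checkers generally reject it as a type argument. A minimal standard-library sketch of how a dimension spelled this way can be read back at runtime; the `literal_dim` helper is hypothetical and not part of CocoIndex:

```python
import typing
from typing import Optional


def literal_dim(annotation: object) -> Optional[int]:
    """Return the integer wrapped in `typing.Literal[...]`, or None otherwise.

    Hypothetical helper for illustration only; CocoIndex's own handling of
    `cocoindex.Vector[T, typing.Literal[Dim]]` may differ.
    """
    if typing.get_origin(annotation) is typing.Literal:
        args = typing.get_args(annotation)
        if len(args) == 1 and isinstance(args[0], int):
            return args[0]
    return None


print(literal_dim(typing.Literal[384]))  # a fixed dimension
print(literal_dim(int))                  # no dimension information
```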
|
@@ -136,7 +137,7 @@ Both `Person` and `PersonTuple` are valid Struct types in CocoIndex, with identi
|
|
136
137
|
Choose `dataclass` for mutable objects or when you need additional methods, and `NamedTuple` for immutable, lightweight structures.
|
137
138
|
|
138
139
|
Besides, for arguments of custom functions, CocoIndex also supports using dictionaries (`dict[str, Any]`) to represent a *Struct* type.
|
139
|
-
It's the default Python type if you don't annotate the function argument.
|
140
|
+
It's the default Python type if you don't annotate the function argument with a specific type.
|
140
141
|
|
141
142
|
### Table Types
|
142
143
|
|
@@ -152,11 +153,16 @@ The row order of a *KTable* is not preserved.
|
|
152
153
|
Type of the first column (key column) must be a [key type](#key-types).
|
153
154
|
|
154
155
|
In Python, a *KTable* type is represented by `dict[K, V]`.
|
155
|
-
The `
|
156
|
+
The `K` should be the type binding to a key type,
|
157
|
+
and the `V` should be the type binding to a *Struct* type representing the value fields of each row.
|
158
|
+
When the specific type annotation is not provided,
|
159
|
+
the key type is bound to a tuple of its key parts when it's a *Struct* type, and the value type is bound to `dict[str, Any]`.
|
160
|
+
|
161
|
+
|
156
162
|
For example, you can use `dict[str, Person]` or `dict[str, PersonTuple]` to represent a *KTable*, with 4 columns: key (*Str*), `first_name` (*Str*), `last_name` (*Str*), `dob` (*Date*).
|
163
|
+
It's bound to `dict[str, dict[str, Any]]` if you don't annotate the function argument with a specific type.
|
157
164
|
|
158
165
|
Note that if you want to use a *Struct* as the key, you need to ensure its value in Python is immutable. For `dataclass`, annotate it with `@dataclass(frozen=True)`. For `NamedTuple`, immutability is built-in. For example:
|
159
|
-
For example:
|
160
166
|
|
161
167
|
```python
|
162
168
|
@dataclass(frozen=True)
|
@@ -170,14 +176,16 @@ class PersonKeyTuple(NamedTuple):
|
|
170
176
|
```
|
171
177
|
|
172
178
|
Then you can use `dict[PersonKey, Person]` or `dict[PersonKeyTuple, PersonTuple]` to represent a KTable keyed by `PersonKey` or `PersonKeyTuple`.
|
179
|
+
It's bound to `dict[tuple[str, str], dict[str, Any]]` if you don't annotate the function argument with a specific type.
|
173
180
|
|
174
181
|
|
175
182
|
#### LTable
|
176
183
|
|
177
184
|
*LTable* is a *Table* type whose row order is preserved. *LTable* has no key column.
|
178
185
|
|
179
|
-
In Python, a *LTable* type is represented by `list[R]`, where `R` is
|
186
|
+
In Python, an *LTable* type is represented by `list[R]`, where `R` is the type binding to the *Struct* type representing the value fields of each row.
|
180
187
|
For example, you can use `list[Person]` to represent an *LTable* with 3 columns: `first_name` (*Str*), `last_name` (*Str*), `dob` (*Date*).
|
188
|
+
It's bound to `list[dict[str, Any]]` if you don't annotate the function argument with a specific type.
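To make the difference between specific annotations and the default bindings concrete, here is a plain-Python sketch that uses only the standard library. The field names of `PersonKey` and the `to_default_binding` helper are assumptions for illustration; the sketch only mirrors the shapes described above and does not call CocoIndex.

```python
import dataclasses
import datetime
from typing import Any


@dataclasses.dataclass
class Person:
    first_name: str
    last_name: str
    dob: datetime.date


# frozen=True makes instances immutable and hashable, so they can be dict keys.
# The two field names below are hypothetical.
@dataclasses.dataclass(frozen=True)
class PersonKey:
    id_kind: str
    id: str


# Shape with specific annotations: dict[PersonKey, Person]
typed_table: dict[PersonKey, Person] = {
    PersonKey("passport", "X123"): Person(
        "Ada", "Lovelace", datetime.date(1815, 12, 10)
    ),
}


def to_default_binding(
    table: dict[PersonKey, Person],
) -> dict[tuple[Any, ...], dict[str, Any]]:
    """Convert to the shape described for unannotated arguments:
    a tuple of key parts mapped to a dict of value fields."""
    return {
        dataclasses.astuple(key): dataclasses.asdict(value)
        for key, value in table.items()
    }


default_table = to_default_binding(typed_table)
print(default_table[("passport", "X123")]["first_name"])
```

The same value fields are present in both shapes; the annotation only decides whether they arrive as attribute-style structs or as plain tuples and dictionaries.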
|
181
189
|
|
182
190
|
## Key Types
|
183
191
|
|
@@ -32,6 +32,13 @@ Here's how CocoIndex data elements map to Postgres elements during export:
|
|
32
32
|
For example, if you have a data collector that collects rows with fields `id`, `title`, and `embedding`, it will be exported to a Postgres table with corresponding columns.
|
33
33
|
It should be a unique table, meaning that no other export target should export to the same table.
|
34
34
|
|
35
|
+
:::warning vector type mapping to Postgres
|
36
|
+
|
37
|
+
Since vectors in pgvector must have fixed dimension, we only map vectors of number types with fixed dimension (i.e. *Vector[cocoindex.Float32, N]*, *Vector[cocoindex.Float64, N]*, and *Vector[cocoindex.Int64, N]*) to `vector(N)` columns.
|
38
|
+
For all other vector types, we map them to `jsonb` columns.
|
39
|
+
|
40
|
+
:::
|
41
|
+
|
35
42
|
#### Spec
|
36
43
|
|
37
44
|
The spec takes the following fields:
|
@@ -58,6 +65,13 @@ Here's how CocoIndex data elements map to Qdrant elements during export:
|
|
58
65
|
|
59
66
|
*Vector[Float32, N]*, *Vector[Float64, N]* and *Vector[Int64, N]* types fit into Qdrant vector.
|
60
67
|
|
68
|
+
:::warning vector type mapping to Qdrant
|
69
|
+
|
70
|
+
Since vectors in Qdrant must have fixed dimension, we only map vectors of number types with fixed dimension (i.e. *Vector[cocoindex.Float32, N]*, *Vector[cocoindex.Float64, N]*, and *Vector[cocoindex.Int64, N]*) to Qdrant vectors.
|
71
|
+
For all other vector types, we map them to the Qdrant payload as JSON arrays.
|
72
|
+
|
73
|
+
:::
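The mapping rule stated in the Postgres and Qdrant warnings can be summarized as a small decision function. This is a sketch of the documented rule only: the function names and the string labels for element types are hypothetical, and the real logic lives inside the CocoIndex engine.

```python
from typing import Optional

# Element types that the docs list as eligible for native vector storage.
NUMERIC_ELEMENT_TYPES = {"Float32", "Float64", "Int64"}


def is_native_vector(element_type: str, dim: Optional[int]) -> bool:
    """True only for number-typed vectors with a fixed dimension."""
    return element_type in NUMERIC_ELEMENT_TYPES and dim is not None


def postgres_column_type(element_type: str, dim: Optional[int]) -> str:
    """Column type per the Postgres warning: vector(N), else jsonb."""
    if is_native_vector(element_type, dim):
        return f"vector({dim})"
    return "jsonb"


def qdrant_placement(element_type: str, dim: Optional[int]) -> str:
    """Placement per the Qdrant warning: a Qdrant vector, else a JSON array in the payload."""
    return "vector" if is_native_vector(element_type, dim) else "payload"


print(postgres_column_type("Float32", 384))   # fixed dimension
print(postgres_column_type("Float32", None))  # no fixed dimension
print(qdrant_placement("Str", 3))             # non-numeric elements
```

In practice this means a function returning `cocoindex.Vector[cocoindex.Float32]` without a dimension ends up in a `jsonb` column, so declaring the dimension is what enables vector storage.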
|
74
|
+
|
61
75
|
#### Spec
|
62
76
|
|
63
77
|
The spec takes the following fields:
|
@@ -0,0 +1,156 @@
|
|
1
|
+
---
|
2
|
+
title: Live Updates
|
3
|
+
description: "Keep your indexes up-to-date with live updates in CocoIndex."
|
4
|
+
---
|
5
|
+
|
6
|
+
# Live Updates
|
7
|
+
|
8
|
+
CocoIndex is designed to keep your indexes synchronized with your data sources. This is achieved through a feature called **live updates**, which automatically detects changes in your sources and updates your indexes accordingly. This ensures that your search results and data analysis are always based on the most current information.
|
9
|
+
|
10
|
+
## How Live Updates Work
|
11
|
+
|
12
|
+
Live updates in CocoIndex can be triggered in two main ways:
|
13
|
+
|
14
|
+
1. **Refresh Interval:** You can configure a `refresh_interval` for any data source. CocoIndex will then periodically check the source for any new, updated, or deleted data. This is a simple and effective way to keep your index fresh, especially for sources that don't have a built-in change notification system.
|
15
|
+
|
16
|
+
2. **Change Capture Mechanisms:** Some data sources offer more sophisticated ways to track changes. For example:
|
17
|
+
* **Amazon S3:** You can configure an SQS queue to receive notifications whenever a file is added, modified, or deleted in your S3 bucket. CocoIndex can listen to this queue and trigger an update instantly.
|
18
|
+
* **Google Drive:** The Google Drive source can be configured to poll for recent changes, which is more efficient than a full refresh.
|
19
|
+
|
20
|
+
When a change is detected, CocoIndex performs an **incremental update**. This means it only re-processes the data that has been affected by the change, without having to re-index your entire dataset. This makes the update process fast and efficient.
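As a rough illustration of refresh-interval change detection, the sketch below compares two snapshots of a directory and reports which files were added, modified, or deleted, so that only those need re-processing. It is a simplified stand-in written with the standard library, not CocoIndex's actual implementation; the `snapshot` and `diff_snapshots` names are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root: Path) -> dict[str, str]:
    """Map each file's relative path to a SHA-256 digest of its content."""
    return {
        str(path.relative_to(root)): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_snapshots(
    old: dict[str, str], new: dict[str, str]
) -> tuple[list[str], list[str], list[str]]:
    """Return (added, modified, deleted) file names between two snapshots."""
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return added, modified, deleted


# Two hand-written snapshots standing in for consecutive refreshes.
before = {"a.md": "digest-1", "b.md": "digest-2"}
after = {"b.md": "digest-3", "c.md": "digest-4"}
print(diff_snapshots(before, after))  # (['c.md'], ['b.md'], ['a.md'])
```

A periodic refresh would call `snapshot()` once per interval and pass the previous and current results to `diff_snapshots()`; unchanged files never appear in the result.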
|
21
|
+
|
22
|
+
Here's an example of how to set up a source with a `refresh_interval`:
|
23
|
+
|
24
|
+
```python
|
25
|
+
@cocoindex.flow_def(name="LiveUpdateExample")
|
26
|
+
def live_update_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
|
27
|
+
# Source: local files in the 'data' directory
|
28
|
+
data_scope["documents"] = flow_builder.add_source(
|
29
|
+
cocoindex.sources.LocalFile(path="data"),
|
30
|
+
refresh_interval=cocoindex.timedelta(seconds=5),
|
31
|
+
)
|
32
|
+
# ...
|
33
|
+
```
|
34
|
+
|
35
|
+
By setting `refresh_interval` to 5 seconds, we're telling CocoIndex to check for changes in the `data` directory every 5 seconds.
|
36
|
+
|
37
|
+
## Implementing Live Updates
|
38
|
+
|
39
|
+
You can enable live updates using either the CocoIndex CLI or the Python library.
|
40
|
+
|
41
|
+
### Using the CLI
|
42
|
+
|
43
|
+
To start a live update process from the command line, use the `update` command with the `-L` or `--live` flag:
|
44
|
+
|
45
|
+
```bash
|
46
|
+
cocoindex update -L your_flow_definition_file.py
|
47
|
+
```
|
48
|
+
|
49
|
+
This will start a long-running process that continuously monitors your data sources for changes and updates your indexes in real time. You can stop the process by pressing `Ctrl+C`.
|
50
|
+
|
51
|
+
### Using the Python Library
|
52
|
+
|
53
|
+
For more control over the live update process, you can use the `FlowLiveUpdater` class in your Python code. This is particularly useful when you want to integrate CocoIndex into a larger application.
|
54
|
+
|
55
|
+
The `FlowLiveUpdater` can be used as a context manager, which automatically starts the updater when you enter the `with` block and stops it when you exit. The `wait()` method will block until the updater is aborted (e.g., by pressing `Ctrl+C`).
|
56
|
+
|
57
|
+
Here's how you can use `FlowLiveUpdater` to start and manage a live update process:
|
58
|
+
|
59
|
+
```python
|
60
|
+
import cocoindex
|
61
|
+
|
62
|
+
# Create a FlowLiveUpdater instance
|
63
|
+
with cocoindex.FlowLiveUpdater(live_update_flow, cocoindex.FlowLiveUpdaterOptions(print_stats=True)) as updater:
|
64
|
+
print("Live updater started. Press Ctrl+C to stop.")
|
65
|
+
# The updater runs in the background.
|
66
|
+
# The wait() method blocks until the updater is stopped.
|
67
|
+
updater.wait()
|
68
|
+
|
69
|
+
print("Live updater stopped.")
|
70
|
+
```
|
71
|
+
|
72
|
+
#### Getting Status Updates
|
73
|
+
|
74
|
+
You can also get status updates from the `FlowLiveUpdater` to monitor the update process. The `next_status_updates()` method blocks until there is a new status update.
|
75
|
+
|
76
|
+
```python
|
77
|
+
import cocoindex
|
78
|
+
|
79
|
+
updater = cocoindex.FlowLiveUpdater(live_update_flow)
|
80
|
+
updater.start()
|
81
|
+
|
82
|
+
while True:
|
83
|
+
updates = updater.next_status_updates()
|
84
|
+
|
85
|
+
if not updates.active_sources:
|
86
|
+
print("All sources have finished processing.")
|
87
|
+
break
|
88
|
+
|
89
|
+
for source_name in updates.updated_sources:
|
90
|
+
print(f"Source '{source_name}' has been updated.")
|
91
|
+
|
92
|
+
updater.wait()
|
93
|
+
```
|
94
|
+
|
95
|
+
This allows you to react to updates in your application, for example, by notifying users or triggering downstream processes.
|
96
|
+
|
97
|
+
## Example
|
98
|
+
|
99
|
+
Let's walk through an example of how to set up a live update flow. For the complete, runnable code, see the [live updates example](https://github.com/cocoindex-io/cocoindex/tree/main/examples/live_updates) in the CocoIndex repository.
|
100
|
+
|
101
|
+
### 1. Setting up the Source
|
102
|
+
|
103
|
+
The first step is to define a source and configure a `refresh_interval`. In this example, we'll use a `LocalFile` source to monitor a directory named `data`.
|
104
|
+
|
105
|
+
```python
|
106
|
+
@cocoindex.flow_def(name="LiveUpdateExample")
|
107
|
+
def live_update_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
|
108
|
+
# Source: local files in the 'data' directory
|
109
|
+
data_scope["documents"] = flow_builder.add_source(
|
110
|
+
cocoindex.sources.LocalFile(path="data"),
|
111
|
+
refresh_interval=cocoindex.timedelta(seconds=5),
|
112
|
+
)
|
113
|
+
|
114
|
+
# Collector
|
115
|
+
collector = data_scope.add_collector()
|
116
|
+
with data_scope["documents"].row() as doc:
|
117
|
+
collector.collect(filename=doc["filename"], content=doc["content"])
|
118
|
+
|
119
|
+
# Target: Postgres database
|
120
|
+
collector.export(
|
121
|
+
"documents_index",
|
122
|
+
cocoindex.targets.Postgres(),
|
123
|
+
primary_key_fields=["filename"]
|
124
|
+
)
|
125
|
+
```
|
126
|
+
|
127
|
+
By setting `refresh_interval` to 5 seconds, we're telling CocoIndex to check for changes in the `data` directory every 5 seconds.
|
128
|
+
|
129
|
+
### 2. Running the Live Updater
|
130
|
+
|
131
|
+
Once the flow is defined, you can use the `FlowLiveUpdater` to start the live update process.
|
132
|
+
|
133
|
+
```python
|
134
|
+
def main():
|
135
|
+
# Initialize CocoIndex
|
136
|
+
cocoindex.init()
|
137
|
+
|
138
|
+
# Setup the flow
|
139
|
+
live_update_flow.setup(report_to_stdout=True)
|
140
|
+
|
141
|
+
# Start the live updater
|
142
|
+
with cocoindex.FlowLiveUpdater(live_update_flow, cocoindex.FlowLiveUpdaterOptions(print_stats=True)) as updater:
|
143
|
+
print("Live updater started. Watching for changes in the 'data' directory.")
|
144
|
+
updater.wait()
|
145
|
+
|
146
|
+
if __name__ == "__main__":
|
147
|
+
main()
|
148
|
+
```
|
149
|
+
|
150
|
+
The `FlowLiveUpdater` will run in the background, and the `updater.wait()` call will block until the process is stopped.
|
151
|
+
|
152
|
+
## Conclusion
|
153
|
+
|
154
|
+
Live updates is a powerful feature of CocoIndex that ensures your indexes are always fresh. By using a combination of refresh intervals and source-specific change capture mechanisms, you can build responsive, real-time applications that are always in sync with your data.
|
155
|
+
|
156
|
+
For more detailed information on the `FlowLiveUpdater` and other live update options, please refer to the [Run a Flow documentation](https://cocoindex.io/docs/core/flow_methods#live-update).
|
@@ -12,6 +12,14 @@ const sidebars: SidebarsConfig = {
|
|
12
12
|
'getting_started/installation',
|
13
13
|
],
|
14
14
|
},
|
15
|
+
{
|
16
|
+
type: 'category',
|
17
|
+
label: 'Tutorials',
|
18
|
+
collapsed: false,
|
19
|
+
items: [
|
20
|
+
'tutorials/live_updates',
|
21
|
+
],
|
22
|
+
},
|
15
23
|
{
|
16
24
|
type: 'category',
|
17
25
|
label: 'CocoIndex Core',
|
@@ -0,0 +1,51 @@
|
|
1
|
+
# Recognize faces in images and build an embedding index
|
2
|
+
|
3
|
+
|
4
|
+
|
5
|
+
In this example, we will recognize faces in images and build an embedding index.
|
6
|
+
|
7
|
+
We appreciate a star ⭐ at [CocoIndex GitHub](https://github.com/cocoindex-io/cocoindex) if this is helpful.
|
8
|
+
|
9
|
+
## Steps
|
10
|
+
### Indexing Flow
|
11
|
+
|
12
|
+
1. We will ingest a list of images.
|
13
|
+
2. For each image, we:
|
14
|
+
- Extract faces from the image.
|
15
|
+
- Compute embeddings for each face.
|
16
|
+
3. We will export to the following tables in Postgres with PGVector:
|
17
|
+
- Filename, rect, embedding for each face.
|
18
|
+
|
19
|
+
|
20
|
+
## Prerequisite
|
21
|
+
|
22
|
+
1. [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
|
23
|
+
|
24
|
+
2. Install dependencies:
|
25
|
+
|
26
|
+
```bash
|
27
|
+
pip install -e .
|
28
|
+
```
|
29
|
+
|
30
|
+
## Run
|
31
|
+
|
32
|
+
Update the index, which will also set up the tables the first time:
|
33
|
+
|
34
|
+
```bash
|
35
|
+
cocoindex update --setup main.py
|
36
|
+
```
|
37
|
+
|
38
|
+
You can also run the command with `-L`, which will watch for file changes and update the index automatically.
|
39
|
+
|
40
|
+
```bash
|
41
|
+
cocoindex update --setup -L main.py
|
42
|
+
```
|
43
|
+
|
44
|
+
## CocoInsight
|
45
|
+
I used CocoInsight (free beta now) to troubleshoot the index generation and understand the data lineage of the pipeline. It connects to your local CocoIndex server, with zero pipeline data retention. Run the following command to start CocoInsight:
|
46
|
+
|
47
|
+
```
|
48
|
+
cocoindex server -ci main.py
|
49
|
+
```
|
50
|
+
|
51
|
+
Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight).
|
Binary file
|
Binary file
|
@@ -0,0 +1,120 @@
|
|
1
|
+
import cocoindex
|
2
|
+
import io
|
3
|
+
import dataclasses
|
4
|
+
import datetime
|
5
|
+
import typing
|
6
|
+
|
7
|
+
import face_recognition
|
8
|
+
from PIL import Image
|
9
|
+
import numpy as np
|
10
|
+
|
11
|
+
|
12
|
+
@dataclasses.dataclass
|
13
|
+
class ImageRect:
|
14
|
+
min_x: int
|
15
|
+
min_y: int
|
16
|
+
max_x: int
|
17
|
+
max_y: int
|
18
|
+
|
19
|
+
|
20
|
+
@dataclasses.dataclass
|
21
|
+
class FaceBase:
|
22
|
+
"""A face in an image."""
|
23
|
+
|
24
|
+
rect: ImageRect
|
25
|
+
image: bytes
|
26
|
+
|
27
|
+
|
28
|
+
MAX_IMAGE_WIDTH = 1280
|
29
|
+
|
30
|
+
|
31
|
+
@cocoindex.op.function(
|
32
|
+
cache=True,
|
33
|
+
behavior_version=1,
|
34
|
+
gpu=True,
|
35
|
+
arg_relationship=(cocoindex.op.ArgRelationship.RECTS_BASE_IMAGE, "content"),
|
36
|
+
)
|
37
|
+
def extract_faces(content: bytes) -> list[FaceBase]:
|
38
|
+
"""Extract faces from an image."""
|
39
|
+
orig_img = Image.open(io.BytesIO(content)).convert("RGB")
|
40
|
+
|
41
|
+
# The model is too slow on large images, so we resize them if too large.
|
42
|
+
if orig_img.width > MAX_IMAGE_WIDTH:
|
43
|
+
ratio = orig_img.width * 1.0 / MAX_IMAGE_WIDTH
|
44
|
+
img = orig_img.resize(
|
45
|
+
(MAX_IMAGE_WIDTH, int(orig_img.height / ratio)),
|
46
|
+
resample=Image.Resampling.BICUBIC,
|
47
|
+
)
|
48
|
+
else:
|
49
|
+
ratio = 1.0
|
50
|
+
img = orig_img
|
51
|
+
|
52
|
+
# Extract face locations.
|
53
|
+
locs = face_recognition.face_locations(np.array(img), model="cnn")
|
54
|
+
|
55
|
+
faces: list[FaceBase] = []
|
56
|
+
for min_y, max_x, max_y, min_x in locs:
|
57
|
+
rect = ImageRect(
|
58
|
+
min_x=int(min_x * ratio),
|
59
|
+
min_y=int(min_y * ratio),
|
60
|
+
max_x=int(max_x * ratio),
|
61
|
+
max_y=int(max_y * ratio),
|
62
|
+
)
|
63
|
+
|
64
|
+
# Crop the face and save it as a PNG.
|
65
|
+
buf = io.BytesIO()
|
66
|
+
orig_img.crop((rect.min_x, rect.min_y, rect.max_x, rect.max_y)).save(
|
67
|
+
buf, format="PNG"
|
68
|
+
)
|
69
|
+
face = buf.getvalue()
|
70
|
+
faces.append(FaceBase(rect, face))
|
71
|
+
|
72
|
+
return faces
|
73
|
+
|
74
|
+
|
75
|
+
@cocoindex.op.function(cache=True, behavior_version=1, gpu=True)
|
76
|
+
def extract_face_embedding(
|
77
|
+
face: bytes,
|
78
|
+
) -> cocoindex.Vector[cocoindex.Float32]:
|
79
|
+
"""Extract the embedding of a face."""
|
80
|
+
img = Image.open(io.BytesIO(face)).convert("RGB")
|
81
|
+
embedding = face_recognition.face_encodings(
|
82
|
+
np.array(img),
|
83
|
+
known_face_locations=[(0, img.width - 1, img.height - 1, 0)],
|
84
|
+
)[0]
|
85
|
+
return embedding
|
86
|
+
|
87
|
+
|
88
|
+
@cocoindex.flow_def(name="FaceRecognition")
|
89
|
+
def face_recognition_flow(
|
90
|
+
flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
|
91
|
+
) -> None:
|
92
|
+
"""
|
93
|
+
Define an example flow that extracts faces from images and indexes their embeddings in a vector database.
|
94
|
+
"""
|
95
|
+
data_scope["images"] = flow_builder.add_source(
|
96
|
+
cocoindex.sources.LocalFile(path="images", binary=True),
|
97
|
+
refresh_interval=datetime.timedelta(seconds=10),
|
98
|
+
)
|
99
|
+
|
100
|
+
face_embeddings = data_scope.add_collector()
|
101
|
+
|
102
|
+
with data_scope["images"].row() as image:
|
103
|
+
# Extract faces
|
104
|
+
image["faces"] = image["content"].transform(extract_faces)
|
105
|
+
|
106
|
+
with image["faces"].row() as face:
|
107
|
+
face["embedding"] = face["image"].transform(extract_face_embedding)
|
108
|
+
|
109
|
+
# Collect embeddings
|
110
|
+
face_embeddings.collect(
|
111
|
+
filename=image["filename"],
|
112
|
+
rect=face["rect"],
|
113
|
+
embedding=face["embedding"],
|
114
|
+
)
|
115
|
+
|
116
|
+
face_embeddings.export(
|
117
|
+
"face_embeddings",
|
118
|
+
cocoindex.targets.Postgres(),
|
119
|
+
primary_key_fields=["filename", "rect"],
|
120
|
+
)
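The coordinate handling in `extract_faces` is easy to get wrong, so here is the arithmetic in isolation. The `face_recognition` library returns each location as `(top, right, bottom, left)` on the resized image, and the code multiplies by the resize ratio to get a rectangle in the original image. The sketch below reproduces only that step with the standard library; `resize_ratio` and `to_original_rect` are hypothetical helper names.

```python
import dataclasses

MAX_IMAGE_WIDTH = 1280


@dataclasses.dataclass
class ImageRect:
    min_x: int
    min_y: int
    max_x: int
    max_y: int


def resize_ratio(width: int) -> float:
    """Factor by which the original image is wider than the resized one."""
    return width / MAX_IMAGE_WIDTH if width > MAX_IMAGE_WIDTH else 1.0


def to_original_rect(loc: tuple[int, int, int, int], ratio: float) -> ImageRect:
    """Map a (top, right, bottom, left) box on the resized image back to
    pixel coordinates in the original image."""
    top, right, bottom, left = loc
    return ImageRect(
        min_x=int(left * ratio),
        min_y=int(top * ratio),
        max_x=int(right * ratio),
        max_y=int(bottom * ratio),
    )


ratio = resize_ratio(2560)  # a 2560-pixel-wide image is halved to 1280
print(to_original_rect((10, 100, 60, 20), ratio))
# ImageRect(min_x=40, min_y=20, max_x=200, max_y=120)
```

Cropping from the original image with the scaled rectangle, as the example does, keeps the face at full resolution even though detection ran on the smaller copy.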
|
@@ -0,0 +1,14 @@
|
|
1
|
+
[project]
|
2
|
+
name = "cocoindex-face-recognition-example"
|
3
|
+
version = "0.1.0"
|
4
|
+
description = "Recognize faces in images and build an embedding index"
|
5
|
+
requires-python = ">=3.11"
|
6
|
+
dependencies = [
|
7
|
+
"cocoindex>=0.1.71",
|
8
|
+
"face-recognition>=1.3.0",
|
9
|
+
"pillow>=10.0.0",
|
10
|
+
"numpy>=1.26.0",
|
11
|
+
]
|
12
|
+
|
13
|
+
[tool.setuptools]
|
14
|
+
packages = []
|
@@ -0,0 +1 @@
|
|
1
|
+
COCOINDEX_DATABASE_URL=postgres://cocoindex:cocoindex@localhost/cocoindex
|
@@ -0,0 +1,58 @@
|
|
1
|
+
# Applying Live Updates to CocoIndex Flow Example
|
2
|
+
|
3
|
+
|
4
|
+
We appreciate a star ⭐ at [CocoIndex GitHub](https://github.com/cocoindex-io/cocoindex) if this is helpful.
|
5
|
+
|
6
|
+
This example demonstrates how to use CocoIndex's live update feature to keep an index synchronized with a local directory.
|
7
|
+
|
8
|
+
## How it Works
|
9
|
+
|
10
|
+
The `main.py` script defines a CocoIndex flow that:
|
11
|
+
|
12
|
+
1. **Sources** data from a local directory named `data`. It uses a `refresh_interval` of 5 seconds to check for changes.
|
13
|
+
2. **Collects** the `filename` and `content` of each file.
|
14
|
+
3. **Exports** the collected data to a Postgres database table.
|
15
|
+
|
16
|
+
The script then starts a `FlowLiveUpdater`, which runs in the background and continuously monitors the `data` directory for changes.
|
17
|
+
|
18
|
+
## Running the Example
|
19
|
+
|
20
|
+
1. [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
|
21
|
+
|
22
|
+
2. **Install the dependencies:**
|
23
|
+
|
24
|
+
```bash
|
25
|
+
pip install -e .
|
26
|
+
```
|
27
|
+
|
28
|
+
3. **Run the example:**
|
29
|
+
|
30
|
+
You can run the live update example in two ways:
|
31
|
+
|
32
|
+
**Option 1: Using the Python script**
|
33
|
+
|
34
|
+
This method uses the CocoIndex [Library API](https://cocoindex.io/docs/core/flow_methods#library-api-2) to perform live updates.
|
35
|
+
|
36
|
+
```bash
|
37
|
+
python main.py
|
38
|
+
```
|
39
|
+
|
40
|
+
**Option 2: Using the CocoIndex CLI**
|
41
|
+
|
42
|
+
This method is useful for managing your indexes from the command line, through the CocoIndex [CLI](https://cocoindex.io/docs/core/flow_methods#cli-2).
|
43
|
+
|
44
|
+
```bash
|
45
|
+
cocoindex update main.py -L --setup
|
46
|
+
```
|
47
|
+
|
48
|
+
4. **Test the live updates:**
|
49
|
+
|
50
|
+
While the script is running, you can try adding, modifying, or deleting files in the `data` directory. You will see the changes reflected in the logs as CocoIndex updates the index.
|
51
|
+
|
52
|
+
## Cleaning Up
|
53
|
+
|
54
|
+
To remove the database table created by this example, you can run:
|
55
|
+
|
56
|
+
```bash
|
57
|
+
cocoindex drop main.py
|
58
|
+
```
|