cocoindex 0.2.11__tar.gz → 0.2.13__tar.gz
- cocoindex-0.2.13/.github/SECURITY.md +26 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.gitignore +3 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.pre-commit-config.yaml +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/Cargo.lock +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/Cargo.toml +2 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/PKG-INFO +6 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/README.md +1 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/THIRD_PARTY_NOTICES.html +27 -33
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/ai/llm.mdx +8 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/contributing/guide.md +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/contributing/setup_dev_environment.md +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/academic_papers_index.md +11 -11
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/codebase_index.md +8 -8
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/custom_targets.md +9 -11
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/docs_to_knowledge_graph.md +10 -11
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/document_ai.md +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/image_search.md +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/manual_extraction.md +1 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/multi_format_index.md +7 -9
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/patient_form_extraction.md +6 -6
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/photo_search.md +5 -5
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/postgres_source.md +16 -16
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/product_recommendation.md +16 -17
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/simple_vector_index.md +6 -6
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/index.md +2 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/getting_started/quickstart.md +4 -4
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/ops/targets.md +78 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docusaurus.config.ts +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/theme/DocCard/index.tsx +3 -3
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/theme/DocCard/styles.module.css +4 -4
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/theme/DocCardList/index.tsx +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/theme/DocCardList/styles.module.css +7 -7
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/code_embedding/main.py +41 -18
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/postgres_source/.env +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/main.py +32 -8
- cocoindex-0.2.13/examples/text_embedding_lancedb/.gitignore +1 -0
- cocoindex-0.2.13/examples/text_embedding_lancedb/README.md +58 -0
- cocoindex-0.2.13/examples/text_embedding_lancedb/main.py +109 -0
- cocoindex-0.2.13/examples/text_embedding_lancedb/markdown_files/rfc8259.md +362 -0
- cocoindex-0.2.13/examples/text_embedding_lancedb/pyproject.toml +13 -0
- cocoindex-0.2.13/examples/text_embedding_qdrant/.env +6 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding_qdrant/main.py +45 -18
- {cocoindex-0.2.11 → cocoindex-0.2.13}/pyproject.toml +3 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/auth_registry.py +6 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/convert.py +183 -27
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/flow.py +4 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/op.py +163 -41
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/query_handler.py +8 -2
- cocoindex-0.2.13/python/cocoindex/targets/__init__.py +5 -0
- cocoindex-0.2.11/python/cocoindex/targets.py → cocoindex-0.2.13/python/cocoindex/targets/_engine_builtin_specs.py +4 -4
- cocoindex-0.2.13/python/cocoindex/targets/lancedb.py +460 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/tests/test_convert.py +51 -26
- cocoindex-0.2.13/python/cocoindex/tests/test_load_convert.py +118 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/tests/test_typing.py +126 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/typing.py +207 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/json_schema.rs +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/spec.rs +9 -22
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/builder/analyzed_flow.rs +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/builder/analyzer.rs +12 -4
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/builder/exec_ctx.rs +16 -9
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/builder/plan.rs +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/memoization.rs +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/source_indexer.rs +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/anthropic.rs +6 -9
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/gemini.rs +17 -9
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/mod.rs +4 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/ollama.rs +10 -18
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/voyage.rs +5 -10
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/factory_bases.rs +1 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/functions/embed_text.rs +27 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/interface.rs +2 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/py_factory.rs +70 -32
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/kuzu.rs +3 -9
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/postgres.rs +2 -2
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/qdrant.rs +0 -14
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/shared/property_graph.rs +3 -3
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/py/mod.rs +9 -1
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/service/error.rs +72 -46
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/states.rs +2 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.cargo/config.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.env.lib_debug +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/ISSUE_TEMPLATE/🐛-bug-report.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/ISSUE_TEMPLATE/💡-feature-request.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/scripts/update_version.sh +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/CI.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/_docs_release.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/_test.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/docs_release.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/docs_test.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/format.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/.github/workflows/release.yml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/CODE_OF_CONDUCT.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/CONTRIBUTING.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/LICENSE +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/about.hbs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/about.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/dev/neo4j.yaml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/dev/postgres.yaml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/about/community.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/contributing/new_built_in_target.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/basics.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/cli.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/data_example.svg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/data_types.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/flow_def.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/flow_example.svg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/flow_methods.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/core/settings.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/custom_ops/custom_functions.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/custom_ops/custom_targets.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/getting_started/installation.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/getting_started/markdown_files.zip +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/getting_started/overview.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/ops/functions.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/ops/sources.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/query.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/tutorials/live_updates.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/tutorials/manage_flow_dynamically.mdx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/package.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/sidebars.ts +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/components/GitHubButton/index.tsx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/css/custom.css +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/src/theme/Root.js +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/.nojekyll +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/docusaurus.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/abstract_chunks.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/basic_info.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/chunk_embedding.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/first_page.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/academic_papers_index/metadata.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/codebase_index/chunk.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/codebase_index/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/codebase_index/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/custom_targets/convert.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/custom_targets/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/dedupe.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/export_document.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/export_relationship.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/extract_relationship.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/relationship.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/docs_to_knowledge_graph/summary.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/document_ai/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/document_ai/document_ai.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/document_ai/processor.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/image_search/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/image_search/embedding.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/image_search/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/image_search/multi_modal_architecture.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/image_search/result.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/manual_extraction/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/manual_extraction/extraction.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/manual_extraction/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/manual_extraction/summary.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/multi_format_index/colpali_architecture.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/multi_format_index/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/multi_format_index/embed.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/multi_format_index/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/multi_format_index/pages.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/patient_form_extraction/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/patient_form_extraction/extraction.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/patient_form_extraction/fields.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/patient_form_extraction/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/patient_form_extraction/tomarkdown.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/photo_search/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/photo_search/extraction.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/photo_search/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/collector.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/description.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/embed.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/lineage.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/price.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/postgres_source/source.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/dedupe.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/export_all.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/export_product.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/export_taxonomy.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/extract_product.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/extract_taxonomy.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/neo4j.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/parse_json.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/product_recommendation/taxonomy.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/simple_vector_index/chunk.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/simple_vector_index/cover.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/simple_vector_index/embed.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/examples/simple_vector_index/flow.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/favicon.ico +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/icon.svg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/img/incremental-etl.gif +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/static/robots.txt +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/tsconfig.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/docs/yarn.lock +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/amazon_s3_embedding/.env.example +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/amazon_s3_embedding/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/amazon_s3_embedding/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/amazon_s3_embedding/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/amazon_s3_embedding/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/azure_blob_embedding/.env.example +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/azure_blob_embedding/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/azure_blob_embedding/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/azure_blob_embedding/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/azure_blob_embedding/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/code_embedding/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/code_embedding/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/code_embedding/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/data/bizarre_animals.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/data/chunk_norris.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/custom_output_files/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/docs_to_knowledge_graph/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/docs_to_knowledge_graph/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/docs_to_knowledge_graph/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/docs_to_knowledge_graph/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/images/Carter_welcomes_Reagan.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/images/Solvay_conference_1927.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/images/Steve_Jobs_and_Bill_Gates_(522695099).jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/images/einplanck3.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/face_recognition/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/.dockerignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/compose.yaml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/dockerfile +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/files/1810.04805v2.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/fastapi_server_docker/requirements.txt +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/gdrive_text_embedding/.env.example +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/gdrive_text_embedding/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/gdrive_text_embedding/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/gdrive_text_embedding/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/gdrive_text_embedding/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/colpali_main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/index.html +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/package-lock.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/package.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/src/App.jsx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/src/main.jsx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/src/style.css +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/frontend/vite.config.js +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/img/cat1.jpeg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/img/dog1.jpeg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/img/elephant1.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/img/giraffe.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/image_search/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/live_updates/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/live_updates/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/live_updates/data/bizarre_animals.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/live_updates/data/chunk_norris.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/live_updates/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/live_updates/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/manuals/array.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/manuals/base64.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/manuals/copy.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/manuals/glob.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/manuals_llm_extraction/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/1706.03762v7.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/1810.04805v2.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/2502.06786v3.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/healthcare_industry_test_p101.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/healthcare_industry_test_p86.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/healthcare_industry_test_p9.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/restaurant_brands_international_2023.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/multi_format_indexing/source_files/sweetgreen_2023.jpg +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/.env.example +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/papers/1706.03762v7.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/papers/1810.04805v2.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/papers/2502.06786v3.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/papers/2502.20346v1.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/paper_metadata/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/.env.example +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/data/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_David_Artificial.docx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_Emily_Artificial.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_Joe_Artificial.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_From_Jane_Artificial.docx +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/patient_intake_extraction/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/pdf_files/1706.03762v7.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/pdf_files/1810.04805v2.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/pdf_files/rfc8259.pdf +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/pdf_embedding/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/postgres_source/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/postgres_source/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/postgres_source/prepare_source_data.sql +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/postgres_source/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/.env.example +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/.gitignore +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/img/cocoinsight.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/img/neo4j.png +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/main.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p1.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p2.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p3.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p4.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p5.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p6.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p7.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p8.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/products/p9.json +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/product_recommendation/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/Text_Embedding.ipynb +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/markdown_files/1706.03762v7.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/markdown_files/1810.04805v2.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/markdown_files/rfc8259.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding/pyproject.toml +0 -0
- {cocoindex-0.2.11/examples/text_embedding_qdrant → cocoindex-0.2.13/examples/text_embedding_lancedb}/.env +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding_qdrant/README.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding_qdrant/markdown_files/rfc8259.md +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/examples/text_embedding_qdrant/pyproject.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/__init__.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/cli.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/functions.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/index.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/lib.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/llm.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/py.typed +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/runtime.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/setting.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/setup.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/sources.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/subprocess_exec.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/tests/__init__.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/tests/test_optional_database.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/tests/test_transform_flow.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/tests/test_validation.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/user_app_loader.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/utils.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/python/cocoindex/validation.py +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/ruff.toml +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/duration.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/field_attrs.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/schema.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/base/value.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/builder/flow_builder.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/builder/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/db_tracking.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/db_tracking_setup.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/dumper.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/evaluator.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/indexing_status.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/live_updater.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/row_indexer.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/execution/stats.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/lib.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/lib_context.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/litellm.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/openai.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/openrouter.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/llm/vllm.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/functions/extract_by_llm.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/functions/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/functions/parse_json.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/functions/split_recursively.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/functions/test_utils.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/registration.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/registry.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sdk.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/shared/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/shared/postgres.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/amazon_s3.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/azure_blob.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/google_drive.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/local_file.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/postgres.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/shared/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/sources/shared/pattern_matcher.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/neo4j.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/shared/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/ops/targets/shared/table_columns.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/prelude.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/py/convert.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/server.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/service/flows.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/service/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/service/query_handler.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/settings.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/auth_registry.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/components.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/db_metadata.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/driver.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/flow_features.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/setup/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/concur_control.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/db.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/deser.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/fingerprint.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/immutable.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/mod.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/retryable.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/str_sanitize.rs +0 -0
- {cocoindex-0.2.11 → cocoindex-0.2.13}/src/utils/yaml_ser.rs +0 -0
@@ -0,0 +1,26 @@
+# Security Policy for CocoIndex
+
+## Reporting a Vulnerability
+If you discover a security vulnerability in CocoIndex, please report it responsibly to our security team:
+
+**Email:** [support@cocoindex.io](mailto:support@cocoindex.io)
+
+⚠️ Please do not file GitHub issues for security vulnerabilities as they are public! ⚠️
+
+Please provide:
+- A detailed description of the vulnerability
+- Steps to reproduce the issue
+- Any relevant logs, screenshots, or proof-of-concept code
+
+We will acknowledge your report promptly and work with you to resolve the issue.
+
+## Scope
+This policy covers security issues related to CocoIndex open-source software.
+
+## Response & Disclosure
+- We aim to respond as soon as we can.
+- Security fixes will be released as soon as practical after verification.
+
+---
+
+Thank you for helping us keep CocoIndex secure!
@@ -2,9 +2,9 @@
 name = "cocoindex"
 # Version used for local development is always higher than others to take precedence.
 # Will be overridden for specific release versions.
-version = "0.2.
+version = "0.2.13"
 edition = "2024"
-rust-version = "1.
+rust-version = "1.89"
 license = "Apache-2.0"
 readme = "README.md"
 
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: cocoindex
-Version: 0.2.
+Version: 0.2.13
 Classifier: Development Status :: 3 - Alpha
 Classifier: License :: OSI Approved :: Apache Software License
 Classifier: Operating System :: OS Independent
@@ -16,6 +16,7 @@ Classifier: Topic :: Text Processing :: Indexing
 Classifier: Intended Audience :: Developers
 Classifier: Natural Language :: English
 Classifier: Typing :: Typed
+Requires-Dist: typing-extensions>=4.12 ; python_full_version < '3.13'
 Requires-Dist: click>=8.1.8
 Requires-Dist: rich>=14.0.0
 Requires-Dist: python-dotenv>=1.1.0
@@ -29,11 +30,14 @@ Requires-Dist: mypy ; extra == 'dev'
 Requires-Dist: pre-commit ; extra == 'dev'
 Requires-Dist: sentence-transformers>=3.3.1 ; extra == 'embeddings'
 Requires-Dist: colpali-engine ; extra == 'colpali'
+Requires-Dist: lancedb>=0.25.0 ; extra == 'lancedb'
 Requires-Dist: sentence-transformers>=3.3.1 ; extra == 'all'
 Requires-Dist: colpali-engine ; extra == 'all'
+Requires-Dist: lancedb>=0.25.0 ; extra == 'all'
 Provides-Extra: dev
 Provides-Extra: embeddings
 Provides-Extra: colpali
+Provides-Extra: lancedb
 Provides-Extra: all
 License-File: THIRD_PARTY_NOTICES.html
 Summary: With CocoIndex, users declare the transformation, CocoIndex creates & maintains an index, and keeps the derived index up to date based on source update, with minimal computation and changes.
@@ -227,6 +231,7 @@ It defines an index flow like this:
 | [Google Drive Text Embedding](examples/gdrive_text_embedding) | Index text documents from Google Drive |
 | [Docs to Knowledge Graph](examples/docs_to_knowledge_graph) | Extract relationships from Markdown documents and build a knowledge graph |
 | [Embeddings to Qdrant](examples/text_embedding_qdrant) | Index documents in a Qdrant collection for semantic search |
+| [Embeddings to LanceDB](examples/text_embedding_lancedb) | Index documents in a LanceDB collection for semantic search |
 | [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
 | [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
 | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
@@ -181,6 +181,7 @@ It defines an index flow like this:
 | [Google Drive Text Embedding](examples/gdrive_text_embedding) | Index text documents from Google Drive |
 | [Docs to Knowledge Graph](examples/docs_to_knowledge_graph) | Extract relationships from Markdown documents and build a knowledge graph |
 | [Embeddings to Qdrant](examples/text_embedding_qdrant) | Index documents in a Qdrant collection for semantic search |
+| [Embeddings to LanceDB](examples/text_embedding_lancedb) | Index documents in a LanceDB collection for semantic search |
 | [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
 | [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
 | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
@@ -2428,7 +2428,7 @@ Software.
 <h3 id="Apache-2.0">Apache License 2.0</h3>
 <h4>Used by:</h4>
 <ul class="license-used-by">
-<li><a href=" https://crates.io/crates/cocoindex ">cocoindex 0.2.
+<li><a href=" https://crates.io/crates/cocoindex ">cocoindex 0.2.13</a></li>
 <li><a href=" https://github.com/awesomized/crc-fast-rust ">crc-fast 1.3.0</a></li>
 <li><a href=" https://github.com/qdrant/rust-client ">qdrant-client 1.15.0</a></li>
 </ul>
@@ -10673,38 +10673,6 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.</pre>
-</li>
-<li class="license">
-<h3 id="MIT">MIT License</h3>
-<h4>Used by:</h4>
-<ul class="license-used-by">
-<li><a href=" https://github.com/tree-sitter/tree-sitter-scala ">tree-sitter-scala 0.24.0</a></li>
-</ul>
-<pre class="license-text">(The MIT License)
-
-Copyright (c) 2014 Nathan Rajlich <nathan@tootallnate.net>
-
-Permission is hereby granted, free of charge, to any person
-obtaining a copy of this software and associated documentation
-files (the "Software"), to deal in the Software without
-restriction, including without limitation the rights to use,
-copy, modify, merge, publish, distribute, sublicense, and/or sell
-copies of the Software, and to permit persons to whom the
-Software is furnished to do so, subject to the following
-conditions:
-
-The above copyright notice and this permission notice shall be
-included in all copies or substantial portions of the Software.
-
-THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
-EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
-OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
-NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
-HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
-WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
-FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
-OTHER DEALINGS IN THE SOFTWARE.
-</pre>
 </li>
 <li class="license">
 <h3 id="MIT">MIT License</h3>
@@ -12299,6 +12267,32 @@ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.
+</pre>
+</li>
+<li class="license">
+<h3 id="MIT">MIT License</h3>
+<h4>Used by:</h4>
+<ul class="license-used-by">
+<li><a href=" https://github.com/tree-sitter/tree-sitter-scala ">tree-sitter-scala 0.24.0</a></li>
+</ul>
+<pre class="license-text">This software is released under the MIT license:
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of
+this software and associated documentation files (the "Software"), to deal in
+the Software without restriction, including without limitation the rights to
+use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
+the Software, and to permit persons to whom the Software is furnished to do so,
+subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
+FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
+COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
+IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 </pre>
 </li>
 <li class="license">
@@ -107,6 +107,8 @@ For text embedding, a spec for OpenAI looks like this:
 cocoindex.functions.EmbedText(
     api_type=cocoindex.LlmApiType.OPENAI,
     model="text-embedding-3-small",
+    # Optional, use the default output dimension if not specified
+    output_dimension=1536,
 )
 ```
 
@@ -199,7 +201,10 @@ For text embedding, a spec looks like this:
 cocoindex.functions.EmbedText(
     api_type=cocoindex.LlmApiType.GEMINI,
     model="text-embedding-004",
+    # Optional, use the default task type if not specified
     task_type="SEMANTICS_SIMILARITY",
+    # Optional, use the default output dimension if not specified
+    output_dimension=1536,
 )
 ```
 
@@ -260,7 +265,10 @@ For text embedding, a spec for Vertex AI looks like this:
 cocoindex.functions.EmbedText(
     api_type=cocoindex.LlmApiType.VERTEX_AI,
     model="text-embedding-005",
+    # Optional, use the default task type if not specified
     task_type="SEMANTICS_SIMILARITY",
+    # Optional, use the default output dimension if not specified
+    output_dimension=1536,
     api_config=cocoindex.llm.VertexAiConfig(project="your-project-id"),
 )
 ```
@@ -5,7 +5,7 @@ description: How to contribute to CocoIndex
 
 [CocoIndex](https://github.com/cocoindex-io/cocoindex) is an open source project. We are respectful, open and friendly. This guide explains how to get involved and contribute to [CocoIndex](https://github.com/cocoindex-io/cocoindex).
 
-Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is constantly open.
+Our [Discord server](https://discord.com/invite/zpA9S2DR7s) is constantly open.
 If you are unsure about anything, it is a good place to discuss! We'd love to collaborate and will always be friendly.
 
 ## Good First Issues
@@ -21,10 +21,10 @@ import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/c
 
 1. Extract the paper metadata, including file name, title, author information, abstract, and number of pages.
 
-2. Build vector embeddings for the metadata, such as the title and abstract, for semantic search.
+2. Build vector embeddings for the metadata, such as the title and abstract, for semantic search.
 This enables better metadata-driven semantic search results. For example, you can match text queries against titles and abstracts.
 
-3. Build an index of authors and all the file names associated with each author
+3. Build an index of authors and all the file names associated with each author
 to answer questions like "Give me all the papers by Jeff Dean."
 
 4. If you want to perform full PDF embedding for the paper, you can extend the flow.
@@ -108,7 +108,7 @@ After this step, we should have the basic info of each paper.
 
 We will convert the first page to Markdown using Marker. Alternatively, you can easily plug in any PDF parser, such as Docling using CocoIndex's [custom function](https://cocoindex.io/docs/custom_ops/custom_functions).
 
-Define a marker converter function and cache it, since its initialization is resource-intensive.
+Define a marker converter function and cache it, since its initialization is resource-intensive.
 This ensures that the same converter instance is reused for different input files.
 
 ```python
@@ -137,7 +137,7 @@ def pdf_to_markdown(content: bytes) -> str:
 Pass it to your transform
 
 ```python
-with data_scope["documents"].row() as doc:
+with data_scope["documents"].row() as doc:
     # ... process
     doc["first_page_md"] = doc["basic_info"]["first_page"].transform(
         pdf_to_markdown
@@ -200,7 +200,7 @@ paper_metadata.collect(
 Just collect anything you need :)
 
 ### Collect `author` to `filename` information
-We’ve already extracted author list. Here we want to collect Author → Papers in a separate table to build a look up functionality.
+We’ve already extracted author list. Here we want to collect Author → Papers in a separate table to build a look up functionality.
 Simply collect by author.
 
 ```python
@@ -229,8 +229,8 @@ doc["title_embedding"] = doc["metadata"]["title"].transform(
 
 ### Abstract
 
-Split abstract into chunks, embed each chunk and collect their embeddings.
-Sometimes the abstract could be very long.
+Split abstract into chunks, embed each chunk and collect their embeddings.
+Sometimes the abstract could be very long.
 
 ```python
 doc["abstract_chunks"] = doc["metadata"]["abstract"].transform(
@@ -308,7 +308,7 @@ author_papers.export(
     "author_papers",
     cocoindex.targets.Postgres(),
     primary_key_fields=["author_name", "filename"],
-)
+)
 metadata_embeddings.export(
     "metadata_embeddings",
     cocoindex.targets.Postgres(),
@@ -328,9 +328,9 @@ In this example we use PGVector as embedding store. With CocoIndex, you can do o
 
 ## Query the index
 
-You can refer to this section of [Text Embeddings](https://cocoindex.io/blogs/text-embeddings-101#3-query-the-index) about
-how to build query against embeddings.
-For now CocoIndex doesn't provide additional query interface. We can write SQL or rely on the query engine by the target storage.
+You can refer to this section of [Text Embeddings](https://cocoindex.io/blogs/text-embeddings-101#3-query-the-index) about
+how to build query against embeddings.
+For now CocoIndex doesn't provide additional query interface. We can write SQL or rely on the query engine by the target storage.
 
 - Many databases already have optimized query implementations with their own best practices
 - The query space has excellent solutions for querying, reranking, and other search-related functionality.
@@ -19,7 +19,7 @@ import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/c
 
 
 ## Overview
-In this tutorial, we will build codebase index. [CocoIndex](https://github.com/cocoindex-io/cocoindex) provides built-in support for codebase chunking, with native Tree-sitter support. It works with large codebases, and can be updated in near real-time with incremental processing - only reprocess what's changed.
+In this tutorial, we will build codebase index. [CocoIndex](https://github.com/cocoindex-io/cocoindex) provides built-in support for codebase chunking, with native Tree-sitter support. It works with large codebases, and can be updated in near real-time with incremental processing - only reprocess what's changed.
 
 ## Use Cases
 A wide range of applications can be built with an effective codebase index that is always up-to-date.
@@ -44,14 +44,14 @@ The flow is composed of the following steps:
 - Generate embeddings for each chunk
 - Store in a vector database for retrieval
 
-## Setup
+## Setup
 - Install Postgres, follow [installation guide](https://cocoindex.io/docs/getting_started/installation#-install-postgres).
 - Install CocoIndex
 ```bash
 pip install -U cocoindex
 ```
 
-## Add the codebase as a source.
+## Add the codebase as a source.
 We will index the CocoIndex codebase. Here we use the `LocalFile` source to ingest files from the CocoIndex codebase root directory.
 
 ```python
@@ -67,7 +67,7 @@ def code_embedding_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoind
 - Include files with the extensions of `.py`, `.rs`, `.toml`, `.md`, `.mdx`
 - Exclude files and directories starting `.`, `target` in the root and `node_modules` under any directory.
 
-`flow_builder.add_source` will create a table with sub fields (`filename`, `content`).
+`flow_builder.add_source` will create a table with sub fields (`filename`, `content`).
 <DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" />
 
 
@@ -96,14 +96,14 @@ with data_scope["files"].row() as file:
     file["extension"] = file["filename"].transform(extract_extension)
     file["chunks"] = file["content"].transform(
         cocoindex.functions.SplitRecursively(),
-        language=file["extension"], chunk_size=1000, chunk_overlap=300)
+        language=file["extension"], chunk_size=1000, chunk_overlap=300)
 ```
 <DocumentationButton url="https://cocoindex.io/docs/ops/functions#splitrecursively" text="SplitRecursively" margin="0 0 16px 0" />
 
 
 
 ### Embed the chunks
-We use `SentenceTransformerEmbed` to embed the chunks.
+We use `SentenceTransformerEmbed` to embed the chunks.
 
 ```python
 @cocoindex.transform_flow()
@@ -141,7 +141,7 @@ code_embeddings.export(
     vector_indexes=[cocoindex.VectorIndex("embedding", cocoindex.VectorSimilarityMetric.COSINE_SIMILARITY)])
 ```
 
-We use [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity) to measure the similarity between the query and the indexed data.
+We use [Cosine Similarity](https://en.wikipedia.org/wiki/Cosine_similarity) to measure the similarity between the query and the indexed data.
 
 ## Query the index
 We match against user-provided text by a SQL query, reusing the embedding operation in the indexing flow.
@@ -230,4 +230,4 @@ Follow the url from the terminal - `https://cocoindex.io/cocoinsight` to access
 
 SplitRecursively has native support for all major programming languages.
 
-<DocumentationButton url="https://cocoindex.io/docs/ops/functions#supported-languages" text="Supported Languages" margin="0 0 16px 0" />
+<DocumentationButton url="https://cocoindex.io/docs/ops/functions#supported-languages" text="Supported Languages" margin="0 0 16px 0" />
@@ -35,7 +35,7 @@ flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
     refresh_interval=timedelta(seconds=5),
 )
 ```
-This ingestion creates a table with `filename` and `content` fields.
+This ingestion creates a table with `filename` and `content` fields.
 <DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" />
 
 ## Process each file and collect
@@ -92,7 +92,7 @@ class LocalFileTargetConnector:
 
 ```
 
-The `describe()` method returns a human-readable string that describes the target, which is displayed in the CLI logs.
+The `describe()` method returns a human-readable string that describes the target, which is displayed in the CLI logs.
 For example, it prints:
 
 `Target: Local directory ./data/output`
@@ -104,10 +104,10 @@ def describe(key: str) -> str:
     return f"Local directory {key}"
 ```
 
-`apply_setup_change()` applies setup changes to the backend. The previous and current specs are passed as arguments,
+`apply_setup_change()` applies setup changes to the backend. The previous and current specs are passed as arguments,
 and the method is expected to update the backend setup to match the current state.
 
-A `None` spec indicates non-existence, so when `previous` is `None`, we need to create it,
+A `None` spec indicates non-existence, so when `previous` is `None`, we need to create it,
 and when `current` is `None`, we need to delete it.
 
 
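The `apply_setup_change()` contract described in the hunk above (a `None` previous spec means "create", a `None` current spec means "delete") can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the documented example: `DirSpec` is a hypothetical stand-in for the target spec, and the two-argument signature is a simplification of whatever the real connector receives.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DirSpec:
    """Hypothetical spec used only for this sketch: one output directory."""
    directory: str


def apply_setup_change(previous: Optional[DirSpec], current: Optional[DirSpec]) -> None:
    """Move the backend from `previous` to `current`; None means "does not exist"."""
    moved = (
        previous is not None
        and current is not None
        and previous.directory != current.directory
    )
    if previous is not None and (current is None or moved):
        # Drop the old directory, but only if it is still there and already empty,
        # so that a repeated call is a no-op instead of an error.
        if os.path.isdir(previous.directory) and not os.listdir(previous.directory):
            os.rmdir(previous.directory)
    if current is not None:
        # exist_ok=True makes creation safe to repeat.
        os.makedirs(current.directory, exist_ok=True)
```

Calling it twice with the same arguments leaves the file system in the same state, which is the kind of idempotency the documentation asks of a target connector.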
@@ -135,8 +135,8 @@ def apply_setup_change(
         os.rmdir(previous.directory)
 ```
 
-The `mutate()` method is called by CocoIndex to apply data changes to the target,
-batching mutations to potentially multiple targets of the same type.
+The `mutate()` method is called by CocoIndex to apply data changes to the target,
+batching mutations to potentially multiple targets of the same type.
 This allows the target connector flexibility in implementation (e.g., atomic commits, or processing items with dependencies in a specific order).
 
 Each element in the batch corresponds to a specific target and is represented by a tuple containing:
@@ -151,8 +151,8 @@ class LocalFileTargetValues:
     html: str
 ```
 
-The value type of the `dict` is `LocalFileTargetValues | None`,
-where a non-`None` value means an upsert and `None` value means a delete. Similar to `apply_setup_changes()`,
+The value type of the `dict` is `LocalFileTargetValues | None`,
+where a non-`None` value means an upsert and `None` value means a delete. Similar to `apply_setup_changes()`,
 idempotency is expected here.
 
 ```python
@@ -217,7 +217,5 @@ This keeps your knowledge graph continuously synchronized with your document sou
 Sometimes there may be an internal/homegrown tool or API (e.g. within a company) that's not publicly available.
 These can only be connected through custom targets.
 
-### Faster adoption of new export logic
+### Faster adoption of new export logic
 When a new tool, database, or API joins your stack, simply define a Target Spec and Target Connector — start exporting right away, with no pipeline refactoring required.
-
-
{cocoindex-0.2.11 → cocoindex-0.2.13}/docs/docs/examples/examples/docs_to_knowledge_graph.md
RENAMED
@@ -36,7 +36,7 @@ and then build a knowledge graph.
|
|
36
36
|
- CocoIndex can direct map the collected data to Neo4j nodes and relationships.
|
37
37
|
|
38
38
|
## Setup
|
39
|
-
* [Install PostgreSQL](https://cocoindex.io/docs/getting_started/installation#-install-postgres). CocoIndex uses PostgreSQL internally for incremental processing.
|
39
|
+
* [Install PostgreSQL](https://cocoindex.io/docs/getting_started/installation#-install-postgres). CocoIndex uses PostgreSQL internally for incremental processing.
|
40
40
|
* [Install Neo4j](https://cocoindex.io/docs/ops/targets#neo4j-dev-instance), a graph database.
|
41
41
|
* [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai). Alternatively, we have native support for Gemini, Ollama, LiteLLM. You can choose your favorite LLM provider and work completely on-premises.
|
42
42
|
|
@@ -51,7 +51,7 @@ and then build a knowledge graph.
|
|
51
51
|
|
52
52
|
### Add documents as source
|
53
53
|
|
54
|
-
We will process CocoIndex documentation markdown files (`.md`, `.mdx`) from the `docs/core` directory ([markdown files](https://github.com/cocoindex-io/cocoindex/tree/main/docs/docs/core), [deployed docs](https://cocoindex.io/docs/core/basics)).
|
54
|
+
We will process CocoIndex documentation markdown files (`.md`, `.mdx`) from the `docs/core` directory ([markdown files](https://github.com/cocoindex-io/cocoindex/tree/main/docs/docs/core), [deployed docs](https://cocoindex.io/docs/core/basics)).
|
55
55
|
|
56
56
|
```python
|
57
57
|
@cocoindex.flow_def(name="DocsToKG")
|
@@ -141,7 +141,7 @@ Next, we will use `cocoindex.functions.ExtractByLlm` to extract the relationship
|
|
141
141
|
doc["relationships"] = doc["content"].transform(
|
142
142
|
cocoindex.functions.ExtractByLlm(
|
143
143
|
llm_spec=cocoindex.LlmSpec(
|
144
|
-
api_type=cocoindex.LlmApiType.OPENAI,
|
144
|
+
api_type=cocoindex.LlmApiType.OPENAI,
|
145
145
|
model="gpt-4o"
|
146
146
|
),
|
147
147
|
output_type=list[Relationship],
|
@@ -187,7 +187,7 @@ with doc["relationships"].row() as relationship:
|
|
187
187
|
|
188
188
|
|
189
189
|
### Build knowledge graph
|
190
|
-
|
190
|
+
|
191
191
|
#### Basic concepts
|
192
192
|
All nodes for Neo4j need two things:
|
193
193
|
1. Label: The type of the node. E.g., `Document`, `Entity`.
|
@@ -236,10 +236,10 @@ This exports Neo4j nodes with label `Document` from the `document_node` collecto
|
|
236
236
|
|
237
237
|
#### Export `RELATIONSHIP` and `Entity` nodes to Neo4j
|
238
238
|
|
239
|
-
We don't have explicit collector for `Entity` nodes.
|
239
|
+
We don't have explicit collector for `Entity` nodes.
|
240
240
|
They are part of the `entity_relationship` collector and fields are collected during the relationship extraction.
|
241
241
|
|
242
|
-
To export them as Neo4j nodes, we need to first declare `Entity` nodes.
|
242
|
+
To export them as Neo4j nodes, we need to first declare `Entity` nodes.
|
243
243
|
|
244
244
|
```python
|
245
245
|
flow_builder.declare(
|
@@ -289,7 +289,7 @@ In a relationship, there's:
|
|
289
289
|
2. A relationship connecting the source and target.
|
290
290
|
Note that different relationships may share the same source and target nodes.
|
291
291
|
|
292
|
-
`NodeFromFields` takes the fields from the `entity_relationship` collector and creates `Entity` nodes.
|
292
|
+
`NodeFromFields` takes the fields from the `entity_relationship` collector and creates `Entity` nodes.
|
293
293
|
|
294
294
|
#### Export the `entity_mention` to Neo4j.
|
295
295
|
|
@@ -334,7 +334,7 @@ It creates relationships by:
|
|
334
334
|
```sh
|
335
335
|
cocoindex update --setup main.py
|
336
336
|
```
|
337
|
-
|
337
|
+
|
338
338
|
You'll see the index update state in the terminal. For example:
|
339
339
|
|
340
340
|
```
|
@@ -343,7 +343,7 @@ It creates relationships by:
|
|
343
343
|
|
344
344
|
## CocoInsight
|
345
345
|
|
346
|
-
I used CocoInsight to troubleshoot the index generation and understand the data lineage of the pipeline. It is in free beta now, so you can give it a try.
|
346
|
+
I used CocoInsight to troubleshoot the index generation and understand the data lineage of the pipeline. It is in free beta now, so you can give it a try.
|
347
347
|
|
348
348
|
```sh
|
349
349
|
cocoindex server -ci main
|
@@ -369,7 +369,7 @@ MATCH p=()-->() RETURN p
|
|
369
369
|
## Kuzu
|
370
370
|
CocoIndex natively supports Kuzu, a high-performance, embedded, open-source graph database.
|
371
371
|
|
372
|
-
<DocumentationButton url="https://cocoindex.io/docs/ops/targets#kuzu" text="Kuzu" margin="0 0 16px 0" />
|
372
|
+
<DocumentationButton url="https://cocoindex.io/docs/ops/targets#kuzu" text="Kuzu" margin="0 0 16px 0" />
|
373
373
|
|
374
374
|
The GraphDB interface in CocoIndex is standardized, so you just need to **switch the configuration** without any additional code changes. CocoIndex supports exporting to Kuzu through its API server. You can bring up a Kuzu API server locally by running:
|
375
375
|
|
@@ -391,4 +391,3 @@ kuzu_conn_spec = cocoindex.add_auth_entry(
|
|
391
391
|
```
|
392
392
|
|
393
393
|
<GitHubButton url="https://github.com/cocoindex-io/cocoindex/blob/30761f8ab674903d742c8ab2e18d4c588df6d46f/examples/docs_to_knowledge_graph/main.py#L33-L37" margin="0 0 16px 0" />
|
394
|
-
|
@@ -21,7 +21,7 @@ CocoIndex is a flexible ETL framework with incremental processing. We don’t b
|
|
21
21
|
|
22
22
|
## Set up
|
23
23
|
- [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
|
24
|
-
- Configure Project and Processor ID for Document AI API
|
24
|
+
- Configure Project and Processor ID for Document AI API
|
25
25
|
- [Official Google document AI API](https://cloud.google.com/document-ai/docs/try-docai) with free live demo.
|
26
26
|
- Sign in to [Google Cloud Console](https://console.cloud.google.com/), create or open a project, and enable Document AI API.
|
27
27
|
- 
|
@@ -21,7 +21,7 @@ import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/c
|
|
21
21
|
CocoIndex supports native integration with ColPali: with just a few lines of code, you can embed and index images with ColPali’s late-interaction architecture. We also build a lightweight image search application with FastAPI.
|
22
22
|
|
23
23
|
|
24
|
-
## ColPali
|
24
|
+
## ColPali
|
25
25
|
|
26
26
|
**ColPali (Contextual Late-interaction over Patches)** is a powerful model for multimodal retrieval.
|
27
27
|
|
@@ -188,7 +188,7 @@ def summarize_module(module_info: ModuleInfo) -> ModuleSummary:
|
|
188
188
|
num_classes=len(module_info.classes),
|
189
189
|
num_methods=len(module_info.methods),
|
190
190
|
)
|
191
|
-
```
|
191
|
+
```
|
192
192
|
|
193
193
|
### Plug in the function into the flow
|
194
194
|
```python
|
@@ -249,4 +249,3 @@ SELECT filename, module_info->'title' AS title, module_summary FROM modules_info
|
|
249
249
|
cocoindex server -ci main
|
250
250
|
```
|
251
251
|
The CocoInsight dashboard is at `https://cocoindex.io/cocoinsight`. It connects to your local CocoIndex server with zero data retention.
|
252
|
-
|
@@ -1,5 +1,5 @@
|
|
1
1
|
---
|
2
|
-
title: Index PDFs, Images, Slides without OCR
|
2
|
+
title: Index PDFs, Images, Slides without OCR
|
3
3
|
description: Build a visual document indexing pipeline using ColPali to index scanned documents, PDFs, academic papers, presentation slides, and standalone images — all mixed together with charts, tables, and figures - into the same vector space.
|
4
4
|
sidebar_class_name: hidden
|
5
5
|
slug: /examples/multi_format_index
|
@@ -20,7 +20,7 @@ import { GitHubButton, YouTubeButton, DocumentationButton } from '../../../src/c
|
|
20
20
|
## Overview
|
21
21
|
Do you have a messy collection of scanned documents, PDFs, academic papers, presentation slides, and standalone images — all mixed together with charts, tables, and figures — that you want to process into the same vector space for semantic search or to power an AI agent?
|
22
22
|
|
23
|
-
In this example, we’ll walk through how to build a visual document indexing pipeline using ColPali for embedding both PDFs and images — and then query the index using natural language.
|
23
|
+
In this example, we’ll walk through how to build a visual document indexing pipeline using ColPali for embedding both PDFs and images — and then query the index using natural language.
|
24
24
|
|
25
25
|
We’ll skip OCR entirely — ColPali can directly understand document layouts, tables, and figures from images, making it perfect for semantic search across visual-heavy content.
|
26
26
|
|
@@ -57,7 +57,7 @@ data_scope["documents"] = flow_builder.add_source(
|
|
57
57
|
|
58
58
|
## Convert Files to Pages
|
59
59
|
|
60
|
-
We classify files by MIME type and process accordingly.
|
60
|
+
We classify files by MIME type and process accordingly.
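As a rough illustration of classifying by MIME type, here is a minimal sketch using Python's standard `mimetypes` module. The function name `classify_file` and the three category labels are hypothetical and are not taken from the example's source; the actual flow may branch differently.

```python
import mimetypes


def classify_file(filename: str) -> str:
    """Return a coarse category ("pdf", "image", or "other") for a file name.

    Hypothetical helper for illustration only; the category names are assumptions.
    """
    # guess_type infers the MIME type from the file extension alone.
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type == "application/pdf":
        return "pdf"
    if mime_type is not None and mime_type.startswith("image/"):
        return "image"
    return "other"
```

A PDF would then be split into page images, while an image file can be treated as a single page.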
|
61
61
|
|
62
62
|
Define a dataclass:
|
63
63
|
|
@@ -112,7 +112,7 @@ In the flow we convert all the files to pages. this makes each pages and all ima
|
|
112
112
|
|
113
113
|
## Generate Visual Embeddings
|
114
114
|
|
115
|
-
We use ColPali to generate embeddings for images on each page.
|
115
|
+
We use ColPali to generate embeddings for images on each page.
|
116
116
|
|
117
117
|
```python
|
118
118
|
with doc["pages"].row() as page:
|
@@ -132,7 +132,7 @@ with doc["pages"].row() as page:
|
|
132
132
|
|
133
133
|

|
134
134
|
|
135
|
-
ColPali Architecture fundamentally rethinks how documents, especially visually complex or image-rich ones, are represented and searched. Instead of reducing each image or page to a single dense vector (as in traditional bi-encoders), ColPali breaks an image into many smaller patches, preserving local spatial and semantic structure.
|
135
|
+
ColPali Architecture fundamentally rethinks how documents, especially visually complex or image-rich ones, are represented and searched. Instead of reducing each image or page to a single dense vector (as in traditional bi-encoders), ColPali breaks an image into many smaller patches, preserving local spatial and semantic structure.
|
136
136
|
|
137
137
|
Each patch receives its own embedding, which together form a multi-vector representation of the complete document.
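To make the multi-vector idea concrete, here is a minimal sketch of late-interaction (MaxSim) scoring, the scheme commonly used with ColBERT-style and ColPali-style models. It uses plain Python lists and toy 2-dimensional vectors; the names `maxsim_score`, `query`, `page_a`, and `page_b` are hypothetical, and this is not the code CocoIndex or Qdrant runs internally.

```python
from typing import Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vectors: Sequence[Vector], patch_vectors: Sequence[Vector]) -> float:
    """Late-interaction score: for each query vector, take its best-matching
    patch vector (maximum dot product), then sum those maxima."""
    if not patch_vectors:
        return 0.0
    return sum(max(dot(q, p) for p in patch_vectors) for q in query_vectors)


# Toy data (assumed for illustration): two query-token vectors and two pages,
# each page represented by several patch vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[1.0, 0.0], [0.9, 0.1]]

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
```

Here `page_a` scores higher than `page_b` because it contains a well-matching patch for each query vector, which a single pooled vector per page could not capture as directly.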
|
138
138
|
|
@@ -143,7 +143,7 @@ Each patch receives its own embedding, which together form a multi-vector repres
|
|
143
143
|
|
144
144
|
## Export to Qdrant
|
145
145
|
|
146
|
-
Note that images and queries are embedded differently, as they’re two different types of data.
|
146
|
+
Note that images and queries are embedded differently, as they’re two different types of data.
|
147
147
|
|
148
148
|
Create a function to embed query:
|
149
149
|
|
@@ -200,9 +200,7 @@ cocoindex server -ci main
|
|
200
200
|
Open `https://cocoindex.io/cocoinsight`. It connects to your local CocoIndex server, with zero pipeline data retention. You can use it to view extracted pages and to see embedding vectors and metadata.
|
201
201
|
|
202
202
|
|
203
|
-
## Connect to other sources
|
203
|
+
## Connect to other sources
|
204
204
|
CocoIndex natively supports Google Drive, Amazon S3, Azure Blob Storage, and more.
|
205
205
|
|
206
206
|
<DocumentationButton url="https://cocoindex.io/docs/ops/sources" text="Sources" margin="0 0 16px 0" />
|
207
|
-
|
208
|
-
|