cocoindex 0.1.70__tar.gz → 0.1.72__tar.gz

Files changed (291)
  1. {cocoindex-0.1.70 → cocoindex-0.1.72}/Cargo.lock +1 -1
  2. {cocoindex-0.1.70 → cocoindex-0.1.72}/Cargo.toml +1 -1
  3. {cocoindex-0.1.70 → cocoindex-0.1.72}/PKG-INFO +12 -11
  4. {cocoindex-0.1.70 → cocoindex-0.1.72}/README.md +11 -10
  5. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/custom_function.mdx +11 -0
  6. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/data_types.mdx +18 -10
  7. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ops/targets.md +14 -0
  8. cocoindex-0.1.72/docs/docs/tutorials/live_updates.md +156 -0
  9. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/sidebars.ts +8 -0
  10. cocoindex-0.1.72/examples/face_recognition/README.md +51 -0
  11. cocoindex-0.1.72/examples/face_recognition/images/Carter_welcomes_Reagan.jpg +0 -0
  12. cocoindex-0.1.72/examples/face_recognition/images/Solvay_conference_1927.jpg +0 -0
  13. cocoindex-0.1.72/examples/face_recognition/images/Steve_Jobs_and_Bill_Gates_(522695099).jpg +0 -0
  14. cocoindex-0.1.72/examples/face_recognition/images/einplanck3.jpg +0 -0
  15. cocoindex-0.1.72/examples/face_recognition/main.py +120 -0
  16. cocoindex-0.1.72/examples/face_recognition/pyproject.toml +14 -0
  17. cocoindex-0.1.72/examples/live_updates/.env +1 -0
  18. cocoindex-0.1.72/examples/live_updates/README.md +58 -0
  19. cocoindex-0.1.72/examples/live_updates/data/bizarre_animals.md +21 -0
  20. cocoindex-0.1.72/examples/live_updates/data/chunk_norris.md +19 -0
  21. cocoindex-0.1.72/examples/live_updates/main.py +55 -0
  22. cocoindex-0.1.72/examples/live_updates/pyproject.toml +12 -0
  23. cocoindex-0.1.72/examples/text_embedding_qdrant/.env +2 -0
  24. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/__init__.py +1 -0
  25. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/convert.py +79 -4
  26. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/flow.py +16 -7
  27. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/functions.py +8 -7
  28. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/op.py +33 -4
  29. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/setting.py +3 -0
  30. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/test_convert.py +127 -0
  31. cocoindex-0.1.72/python/cocoindex/tests/test_validation.py +134 -0
  32. cocoindex-0.1.72/python/cocoindex/validation.py +104 -0
  33. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/field_attrs.rs +1 -1
  34. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/db_tracking_setup.rs +15 -10
  35. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/kuzu.rs +1 -1
  36. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/neo4j.rs +2 -2
  37. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/postgres.rs +45 -6
  38. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/qdrant.rs +29 -14
  39. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/shared/table_columns.rs +8 -8
  40. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/py/mod.rs +3 -3
  41. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/components.rs +8 -5
  42. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/db_metadata.rs +3 -3
  43. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/driver.rs +6 -1
  44. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/states.rs +27 -5
  45. {cocoindex-0.1.70 → cocoindex-0.1.72}/.cargo/config.toml +0 -0
  46. {cocoindex-0.1.70 → cocoindex-0.1.72}/.env.lib_debug +0 -0
  47. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/ISSUE_TEMPLATE/🐛-bug-report.md +0 -0
  48. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/ISSUE_TEMPLATE/💡-feature-request.md +0 -0
  49. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/scripts/update_version.sh +0 -0
  50. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/CI.yml +0 -0
  51. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/_doc_release.yml +0 -0
  52. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/_test.yml +0 -0
  53. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/docs.yml +0 -0
  54. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/format.yml +0 -0
  55. {cocoindex-0.1.70 → cocoindex-0.1.72}/.github/workflows/release.yml +0 -0
  56. {cocoindex-0.1.70 → cocoindex-0.1.72}/.gitignore +0 -0
  57. {cocoindex-0.1.70 → cocoindex-0.1.72}/.pre-commit-config.yaml +0 -0
  58. {cocoindex-0.1.70 → cocoindex-0.1.72}/CODE_OF_CONDUCT.md +0 -0
  59. {cocoindex-0.1.70 → cocoindex-0.1.72}/CONTRIBUTING.md +0 -0
  60. {cocoindex-0.1.70 → cocoindex-0.1.72}/LICENSE +0 -0
  61. {cocoindex-0.1.70 → cocoindex-0.1.72}/dev/neo4j.yaml +0 -0
  62. {cocoindex-0.1.70 → cocoindex-0.1.72}/dev/postgres.yaml +0 -0
  63. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/.gitignore +0 -0
  64. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/README.md +0 -0
  65. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/about/community.md +0 -0
  66. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/about/contributing.md +0 -0
  67. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ai/llm.mdx +0 -0
  68. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/basics.md +0 -0
  69. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/cli.mdx +0 -0
  70. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/data_example.svg +0 -0
  71. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/flow_def.mdx +0 -0
  72. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/flow_example.svg +0 -0
  73. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/flow_methods.mdx +0 -0
  74. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/core/settings.mdx +0 -0
  75. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/installation.md +0 -0
  76. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/markdown_files.zip +0 -0
  77. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/overview.md +0 -0
  78. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/getting_started/quickstart.md +0 -0
  79. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ops/functions.md +0 -0
  80. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/ops/sources.md +0 -0
  81. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docs/query.mdx +0 -0
  82. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/docusaurus.config.ts +0 -0
  83. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/package.json +0 -0
  84. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/components/HomepageFeatures/index.tsx +0 -0
  85. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/components/HomepageFeatures/styles.module.css +0 -0
  86. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/css/custom.css +0 -0
  87. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/src/theme/Root.js +0 -0
  88. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/.nojekyll +0 -0
  89. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/docusaurus.png +0 -0
  90. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/favicon.ico +0 -0
  91. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/icon.svg +0 -0
  92. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/img/incremental-etl.gif +0 -0
  93. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/static/robots.txt +0 -0
  94. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/tsconfig.json +0 -0
  95. {cocoindex-0.1.70 → cocoindex-0.1.72}/docs/yarn.lock +0 -0
  96. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/.env.example +0 -0
  97. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/.gitignore +0 -0
  98. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/README.md +0 -0
  99. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/main.py +0 -0
  100. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/amazon_s3_embedding/pyproject.toml +0 -0
  101. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/.env.example +0 -0
  102. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/.gitignore +0 -0
  103. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/README.md +0 -0
  104. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/main.py +0 -0
  105. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/azure_blob_embedding/pyproject.toml +0 -0
  106. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/.env +0 -0
  107. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/README.md +0 -0
  108. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/main.py +0 -0
  109. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/code_embedding/pyproject.toml +0 -0
  110. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/.env +0 -0
  111. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/README.md +0 -0
  112. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/main.py +0 -0
  113. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/docs_to_knowledge_graph/pyproject.toml +0 -0
  114. {cocoindex-0.1.70/examples/manuals_llm_extraction → cocoindex-0.1.72/examples/face_recognition}/.env +0 -0
  115. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/.dockerignore +0 -0
  116. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/.env +0 -0
  117. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/README.md +0 -0
  118. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/compose.yaml +0 -0
  119. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/dockerfile +0 -0
  120. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/files/1810.04805v2.md +0 -0
  121. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/main.py +0 -0
  122. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/fastapi_server_docker/requirements.txt +0 -0
  123. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/.env.example +0 -0
  124. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/.gitignore +0 -0
  125. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/README.md +0 -0
  126. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/main.py +0 -0
  127. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/gdrive_text_embedding/pyproject.toml +0 -0
  128. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/.env +0 -0
  129. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/README.md +0 -0
  130. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/.gitignore +0 -0
  131. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/index.html +0 -0
  132. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/package-lock.json +0 -0
  133. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/package.json +0 -0
  134. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/src/App.jsx +0 -0
  135. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/src/main.jsx +0 -0
  136. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/src/style.css +0 -0
  137. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/frontend/vite.config.js +0 -0
  138. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/cat1.jpeg +0 -0
  139. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/dog1.jpeg +0 -0
  140. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/elephant1.jpg +0 -0
  141. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/img/giraffe.jpg +0 -0
  142. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/main.py +0 -0
  143. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/image_search/pyproject.toml +0 -0
  144. {cocoindex-0.1.70/examples/pdf_embedding → cocoindex-0.1.72/examples/manuals_llm_extraction}/.env +0 -0
  145. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/README.md +0 -0
  146. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/main.py +0 -0
  147. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/array.pdf +0 -0
  148. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/base64.pdf +0 -0
  149. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/copy.pdf +0 -0
  150. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/manuals/glob.pdf +0 -0
  151. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/manuals_llm_extraction/pyproject.toml +0 -0
  152. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/.env.example +0 -0
  153. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/.gitignore +0 -0
  154. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/README.md +0 -0
  155. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/main.py +0 -0
  156. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/1706.03762v7.pdf +0 -0
  157. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/1810.04805v2.pdf +0 -0
  158. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/2502.06786v3.pdf +0 -0
  159. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/papers/2502.20346v1.pdf +0 -0
  160. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/paper_metadata/pyproject.toml +0 -0
  161. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/.env.example +0 -0
  162. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/README.md +0 -0
  163. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/README.md +0 -0
  164. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_David_Artificial.docx +0 -0
  165. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_Emily_Artificial.pdf +0 -0
  166. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_Form_Joe_Artificial.pdf +0 -0
  167. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/data/patient_forms/Patient_Intake_From_Jane_Artificial.docx +0 -0
  168. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/main.py +0 -0
  169. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/patient_intake_extraction/pyproject.toml +0 -0
  170. {cocoindex-0.1.70/examples/product_recommendation → cocoindex-0.1.72/examples/pdf_embedding}/.env +0 -0
  171. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/README.md +0 -0
  172. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/main.py +0 -0
  173. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pdf_files/1706.03762v7.pdf +0 -0
  174. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pdf_files/1810.04805v2.pdf +0 -0
  175. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pdf_files/rfc8259.pdf +0 -0
  176. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/pdf_embedding/pyproject.toml +0 -0
  177. {cocoindex-0.1.70/examples/text_embedding → cocoindex-0.1.72/examples/product_recommendation}/.env +0 -0
  178. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/README.md +0 -0
  179. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/img/cocoinsight.png +0 -0
  180. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/img/neo4j.png +0 -0
  181. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/main.py +0 -0
  182. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p1.json +0 -0
  183. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p2.json +0 -0
  184. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p3.json +0 -0
  185. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p4.json +0 -0
  186. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p5.json +0 -0
  187. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p6.json +0 -0
  188. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p7.json +0 -0
  189. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p8.json +0 -0
  190. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/products/p9.json +0 -0
  191. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/product_recommendation/pyproject.toml +0 -0
  192. {cocoindex-0.1.70/examples/text_embedding_qdrant → cocoindex-0.1.72/examples/text_embedding}/.env +0 -0
  193. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/README.md +0 -0
  194. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/Text_Embedding.ipynb +0 -0
  195. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/main.py +0 -0
  196. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/markdown_files/1706.03762v7.md +0 -0
  197. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/markdown_files/1810.04805v2.md +0 -0
  198. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/markdown_files/rfc8259.md +0 -0
  199. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding/pyproject.toml +0 -0
  200. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/README.md +0 -0
  201. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/main.py +0 -0
  202. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/markdown_files/rfc8259.md +0 -0
  203. {cocoindex-0.1.70 → cocoindex-0.1.72}/examples/text_embedding_qdrant/pyproject.toml +0 -0
  204. {cocoindex-0.1.70 → cocoindex-0.1.72}/pyproject.toml +0 -0
  205. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/auth_registry.py +0 -0
  206. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/cli.py +0 -0
  207. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/index.py +0 -0
  208. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/lib.py +0 -0
  209. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/llm.py +0 -0
  210. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/py.typed +0 -0
  211. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/runtime.py +0 -0
  212. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/setup.py +0 -0
  213. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/sources.py +0 -0
  214. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/targets.py +0 -0
  215. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/__init__.py +0 -0
  216. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/test_optional_database.py +0 -0
  217. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/tests/test_typing.py +0 -0
  218. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/typing.py +0 -0
  219. {cocoindex-0.1.70 → cocoindex-0.1.72}/python/cocoindex/utils.py +0 -0
  220. {cocoindex-0.1.70 → cocoindex-0.1.72}/ruff.toml +0 -0
  221. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/duration.rs +0 -0
  222. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/json_schema.rs +0 -0
  223. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/mod.rs +0 -0
  224. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/schema.rs +0 -0
  225. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/spec.rs +0 -0
  226. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/base/value.rs +0 -0
  227. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/analyzed_flow.rs +0 -0
  228. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/analyzer.rs +0 -0
  229. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/exec_ctx.rs +0 -0
  230. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/flow_builder.rs +0 -0
  231. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/mod.rs +0 -0
  232. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/builder/plan.rs +0 -0
  233. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/db_tracking.rs +0 -0
  234. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/dumper.rs +0 -0
  235. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/evaluator.rs +0 -0
  236. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/indexing_status.rs +0 -0
  237. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/live_updater.rs +0 -0
  238. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/memoization.rs +0 -0
  239. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/mod.rs +0 -0
  240. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/row_indexer.rs +0 -0
  241. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/source_indexer.rs +0 -0
  242. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/execution/stats.rs +0 -0
  243. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/lib.rs +0 -0
  244. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/lib_context.rs +0 -0
  245. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/anthropic.rs +0 -0
  246. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/gemini.rs +0 -0
  247. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/litellm.rs +0 -0
  248. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/mod.rs +0 -0
  249. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/ollama.rs +0 -0
  250. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/openai.rs +0 -0
  251. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/openrouter.rs +0 -0
  252. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/vertex_ai.rs +0 -0
  253. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/vllm.rs +0 -0
  254. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/llm/voyage.rs +0 -0
  255. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/factory_bases.rs +0 -0
  256. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/embed_text.rs +0 -0
  257. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/extract_by_llm.rs +0 -0
  258. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/mod.rs +0 -0
  259. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/parse_json.rs +0 -0
  260. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/split_recursively.rs +0 -0
  261. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/functions/test_utils.rs +0 -0
  262. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/interface.rs +0 -0
  263. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/mod.rs +0 -0
  264. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/py_factory.rs +0 -0
  265. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/registration.rs +0 -0
  266. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/registry.rs +0 -0
  267. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sdk.rs +0 -0
  268. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/amazon_s3.rs +0 -0
  269. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/azure_blob.rs +0 -0
  270. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/google_drive.rs +0 -0
  271. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/local_file.rs +0 -0
  272. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/sources/mod.rs +0 -0
  273. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/mod.rs +0 -0
  274. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/shared/mod.rs +0 -0
  275. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/ops/targets/shared/property_graph.rs +0 -0
  276. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/prelude.rs +0 -0
  277. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/py/convert.rs +0 -0
  278. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/server.rs +0 -0
  279. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/service/error.rs +0 -0
  280. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/service/flows.rs +0 -0
  281. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/service/mod.rs +0 -0
  282. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/settings.rs +0 -0
  283. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/auth_registry.rs +0 -0
  284. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/setup/mod.rs +0 -0
  285. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/concur_control.rs +0 -0
  286. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/db.rs +0 -0
  287. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/fingerprint.rs +0 -0
  288. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/immutable.rs +0 -0
  289. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/mod.rs +0 -0
  290. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/retryable.rs +0 -0
  291. {cocoindex-0.1.70 → cocoindex-0.1.72}/src/utils/yaml_ser.rs +0 -0
@@ -1297,7 +1297,7 @@ dependencies = [
 
  [[package]]
  name = "cocoindex"
- version = "0.1.70"
+ version = "0.1.72"
  dependencies = [
  "anyhow",
  "async-openai",
@@ -2,7 +2,7 @@
  name = "cocoindex"
  # Version used for local development is always higher than others to take precedence.
  # Will be overridden for specific release versions.
- version = "0.1.70"
+ version = "0.1.72"
  edition = "2024"
  rust-version = "1.88"
 
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: cocoindex
- Version: 0.1.70
+ Version: 0.1.72
  Requires-Dist: click>=8.1.8
  Requires-Dist: rich>=14.0.0
  Requires-Dist: python-dotenv>=1.1.0
@@ -52,18 +52,18 @@ Ultra performant data transformation framework for AI, with core engine written
  ⭐ Drop a star to help us grow!
 
  <div align="center">
-
+
  <!-- Keep these links. Translations will automatically update with the README. -->
- [Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
- [English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
- [Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
- [français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
- [日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
- [한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
- [Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
- [Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
+ [Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
+ [English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
+ [Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
+ [français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
+ [日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
+ [한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
+ [Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
+ [Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
  [中文](https://readme-i18n.com/cocoindex-io/cocoindex?lang=zh)
-
+
  </div>
 
  </br>
@@ -208,6 +208,7 @@ It defines an index flow like this:
  | [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
  | [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
  | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
+ | [Face Recognition](examples/face_recognition) | Recognize faces in images and build embedding index |
  | [Paper Metadata](examples/paper_metadata) | Index papers in PDF files, and build metadata tables for each paper |
 
  More coming and stay tuned 👀!
@@ -27,18 +27,18 @@ Ultra performant data transformation framework for AI, with core engine written
  ⭐ Drop a star to help us grow!
 
  <div align="center">
-
+
  <!-- Keep these links. Translations will automatically update with the README. -->
- [Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
- [English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
- [Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
- [français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
- [日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
- [한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
- [Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
- [Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
+ [Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
+ [English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
+ [Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
+ [français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
+ [日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
+ [한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
+ [Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
+ [Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
  [中文](https://readme-i18n.com/cocoindex-io/cocoindex?lang=zh)
-
+
  </div>
 
  </br>
@@ -183,6 +183,7 @@ It defines an index flow like this:
  | [FastAPI Server with Docker](examples/fastapi_server_docker) | Run the semantic search server in a Dockerized FastAPI setup |
  | [Product Recommendation](examples/product_recommendation) | Build real-time product recommendations with LLM and graph database|
  | [Image Search with Vision API](examples/image_search) | Generates detailed captions for images using a vision model, embeds them, enables live-updating semantic search via FastAPI and served on a React frontend|
+ | [Face Recognition](examples/face_recognition) | Recognize faces in images and build embedding index |
  | [Paper Metadata](examples/paper_metadata) | Index papers in PDF files, and build metadata tables for each paper |
 
  More coming and stay tuned 👀!
@@ -148,6 +148,17 @@ Custom functions take the following additional parameters:
  When the version is changed, the function will be re-executed even if cache is enabled.
  It's required to be set if `cache` is `True`.
 
+ * `arg_relationship: tuple[ArgRelationship, str]`: It specifies the relationship between an input argument and the output,
+ e.g. `(ArgRelationship.CHUNKS_BASE_TEXT, "content")` means the output is chunks for the text represented by the
+ input argument with name `content`.
+ This provides metadata for tools, e.g. CocoInsight.
+ Currently the following attributes are supported:
+
+ * `ArgRelationship.CHUNKS_BASE_TEXT`:
+ The output is chunks for the text represented by the input argument. In this case, the output is expected to be a *Table*, whose each row represents a text chunk, and the first column has type *Range*, representing the range of the text chunk.
+ * `ArgRelationship.EMBEDDING_ORIGIN_TEXT`: The output is embedding vector for the text represented by the input argument. The output is expected to be a *Vector*.
+ * `ArgRelationship.RECTS_BASE_IMAGE`: The output is rectangles for the image represented by the input argument. The output is expected to be a *Table*, whose each row represents a rectangle, and the first column has type *Struct*, with fields `min_x`, `min_y`, `max_x`, `max_y` to represent the coordinates of the rectangle.
+
  For example:
 
  <Tabs>
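Reviewer note: the `arg_relationship` tuple documented in the hunk above pairs a relationship kind with the name of an input argument. The sketch below is only an illustration of that shape and does not use CocoIndex itself. The `ArgRelationship` enum here is a local stand-in (only its three member names come from the documentation; the values are assumptions), and `check_arg_relationship` and `split_lines` are hypothetical helpers invented for this example.

```python
import enum
import inspect
from typing import Callable


class ArgRelationship(enum.Enum):
    """Local stand-in for the enum named in the docs; the values are assumptions."""

    CHUNKS_BASE_TEXT = "chunks_base_text"
    EMBEDDING_ORIGIN_TEXT = "embedding_origin_text"
    RECTS_BASE_IMAGE = "rects_base_image"


def check_arg_relationship(
    fn: Callable, arg_relationship: tuple[ArgRelationship, str]
) -> str:
    """Hypothetical helper: return the argument name if `fn` declares it."""
    relationship, arg_name = arg_relationship
    if not isinstance(relationship, ArgRelationship):
        raise TypeError(f"expected ArgRelationship, got {type(relationship).__name__}")
    if arg_name not in inspect.signature(fn).parameters:
        raise ValueError(f"{fn.__name__} has no argument named {arg_name!r}")
    return arg_name


def split_lines(content: str) -> list[tuple[tuple[int, int], str]]:
    """Toy chunker: one chunk per line, with the (start, end) range first."""
    chunks, offset = [], 0
    for line in content.splitlines(keepends=True):
        chunks.append(((offset, offset + len(line)), line))
        offset += len(line)
    return chunks
```

Under these assumptions, `(ArgRelationship.CHUNKS_BASE_TEXT, "content")` passes the check for `split_lines` because the function has an argument named `content`, and each returned row leads with a range, mirroring the *Table* layout the documentation describes.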
@@ -21,12 +21,13 @@ All you need to do is to make sure the data passed to functions and targets are
  Each type in CocoIndex type system is mapped to one or multiple types in Python.
  When you define a [custom function](/docs/core/custom_function), you need to annotate the data types of arguments and return values.
 
- * When you pass a Python value to the engine (e.g. return values of a custom function), type annotation is required,
- as it provides the ground truth of the data type in the flow.
+ * When you pass a Python value to the engine (e.g. return values of a custom function), a specific type annotation is required.
+ The type annotation needs to be specific in describing the target data type, as it provides the ground truth of the data type in the flow.
 
  * When you use a Python variable to bind to an engine value (e.g. arguments of a custom function),
- we use the type annotation as a guidance to construct the Python value.
- Type annotation is optional for basic types and struct types, and required for table types.
+ the engine already knows the specific data type, so we don't require a specific type annotation, e.g. type annotations can be omitted, or you can use `Any` at any level.
+ When a specific type annotation is provided, it's still used as a guidance to construct the Python value with compatible type.
+ Otherwise, we will bind to a default Python type.
 
  ### Basic Types
 
@@ -54,7 +55,7 @@ This is the list of all primitive types supported by CocoIndex:
  Notes:
 
  * For some CocoIndex types, we support multiple Python types. You can annotate with any of these Python types.
-   The first one is the default type, i.e. CocoIndex will create a value with this type when the type annotation is not provided (e.g. for arguments of a custom function).
+   The first one is the default type, i.e. CocoIndex will create a value with this type when a specific type annotation is not provided (e.g. for arguments of a custom function).
 
  * All Python types starting with `cocoindex.` are type aliases exported by CocoIndex. They're annotated types based on certain Python types:
 
@@ -86,7 +87,7 @@ Optionally, it can have a fixed dimension. Noted as *Vector[Type]* or *Vector[Ty
 
  It supports the following Python types:
 
- * `cocoindex.Vector[T]` or `cocoindex.Vector[T, typing.Literal[Dim]]`, e.g. `cocoindex.Vector[cocoindex.Float32]` or `cocoindex.Vector[cocoindex.Float32, 384]`
+ * `cocoindex.Vector[T]` or `cocoindex.Vector[T, typing.Literal[Dim]]`, e.g. `cocoindex.Vector[cocoindex.Float32]` or `cocoindex.Vector[cocoindex.Float32, typing.Literal[384]]`
    * The underlying Python type is `numpy.typing.NDArray[T]` where `T` is a numpy numeric type (`numpy.int64`, `numpy.float32` or `numpy.float64`), or `list[T]` otherwise
  * `numpy.typing.NDArray[T]` where `T` is a numpy numeric type
  * `list[T]`
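
The corrected example above wraps the dimension in `typing.Literal`, which is the standard way to express a constant value inside a Python type annotation. As a minimal standard-library sketch (it does not import CocoIndex, and `vector_dim` is a hypothetical helper, not part of the CocoIndex API), this shows how a dimension spelled that way can be read back at runtime:

```python
import typing
from typing import Any, Optional


def vector_dim(dim_annotation: Any) -> Optional[int]:
    """Return the integer wrapped in a typing.Literal[...] annotation, else None."""
    if typing.get_origin(dim_annotation) is typing.Literal:
        args = typing.get_args(dim_annotation)
        # Only a single integer literal is treated as a dimension in this sketch.
        if len(args) == 1 and isinstance(args[0], int):
            return args[0]
    return None


fixed = vector_dim(typing.Literal[384])  # dimension carried by the annotation
unknown = vector_dim(None)               # no dimension given
```

A bare `384` is a value rather than a type, so it carries no such structure for `typing.get_origin` and `typing.get_args` to inspect.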
@@ -136,7 +137,7 @@ Both `Person` and `PersonTuple` are valid Struct types in CocoIndex, with identi
  Choose `dataclass` for mutable objects or when you need additional methods, and `NamedTuple` for immutable, lightweight structures.
 
  Besides, for arguments of custom functions, CocoIndex also supports using dictionaries (`dict[str, Any]`) to represent a *Struct* type.
- It's the default Python type if you don't annotate the function argument.
+ It's the default Python type if you don't annotate the function argument with a specific type.
 
  ### Table Types
 
@@ -152,11 +153,16 @@ The row order of a *KTable* is not preserved.
  Type of the first column (key column) must be a [key type](#key-types).
 
  In Python, a *KTable* type is represented by `dict[K, V]`.
- The `V` should be a *Struct* type, either a `dataclass` or `NamedTuple`, representing the value fields of each row.
+ The `K` should be the type binding to a key type,
+ and the `V` should be the type binding to a *Struct* type representing the value fields of each row.
+ When a specific type annotation is not provided,
+ the key type is bound to a tuple of its key parts when it's a *Struct* type, and the value type is bound to `dict[str, Any]`.
+
+
  For example, you can use `dict[str, Person]` or `dict[str, PersonTuple]` to represent a *KTable*, with 4 columns: key (*Str*), `first_name` (*Str*), `last_name` (*Str*), `dob` (*Date*).
+ It's bound to `dict[str, dict[str, Any]]` if you don't annotate the function argument with a specific type.
 
  Note that if you want to use a *Struct* as the key, you need to ensure its value in Python is immutable. For `dataclass`, annotate it with `@dataclass(frozen=True)`. For `NamedTuple`, immutability is built-in. For example:
- For example:
 
  ```python
  @dataclass(frozen=True)
@@ -170,14 +176,16 @@ class PersonKeyTuple(NamedTuple):
  ```
 
  Then you can use `dict[PersonKey, Person]` or `dict[PersonKeyTuple, PersonTuple]` to represent a KTable keyed by `PersonKey` or `PersonKeyTuple`.
+ It's bound to `dict[tuple[str, str], dict[str, Any]]` if you don't annotate the function argument with a specific type.
 
 
  #### LTable
 
  *LTable* is a *Table* type whose row order is preserved. *LTable* has no key column.
 
- In Python, a *LTable* type is represented by `list[R]`, where `R` is a dataclass representing a row.
+ In Python, a *LTable* type is represented by `list[R]`, where `R` is the type binding to the *Struct* type representing the value fields of each row.
  For example, you can use `list[Person]` to represent a *LTable* with 3 columns: `first_name` (*Str*), `last_name` (*Str*), `dob` (*Date*).
+ It's bound to `list[dict[str, Any]]` if you don't annotate the function argument with a specific type.
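
To make the default bindings concrete, here is a small standard-library sketch of the shape an unannotated *KTable* argument would take (`dict[str, dict[str, Any]]`) and how it relates to the typed form (`dict[str, Person]`). The `Person` fields follow the column list in the text; `to_person_ktable` and the sample row are illustrative assumptions, not CocoIndex code.

```python
import dataclasses
import datetime
from typing import Any


@dataclasses.dataclass
class Person:
    first_name: str
    last_name: str
    dob: datetime.date


def to_person_ktable(rows: dict[str, dict[str, Any]]) -> dict[str, Person]:
    """Convert rows in the default (unannotated) shape into typed Person rows."""
    return {key: Person(**fields) for key, fields in rows.items()}


# Shape under the default binding: key -> dict of value fields.
default_bound = {
    "p1": {
        "first_name": "Ada",
        "last_name": "Lovelace",
        "dob": datetime.date(1815, 12, 10),
    },
}
typed = to_person_ktable(default_bound)
```

Annotating the argument as `dict[str, Person]` asks the engine to build the typed form directly, so a conversion like this is only needed when the annotation is omitted.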
 
  ## Key Types
 
@@ -32,6 +32,13 @@ Here's how CocoIndex data elements map to Postgres elements during export:
  For example, if you have a data collector that collects rows with fields `id`, `title`, and `embedding`, it will be exported to a Postgres table with corresponding columns.
  It should be a unique table, meaning that no other export target should export to the same table.
 
+ :::warning vector type mapping to Postgres
+
+ Since vectors in pgvector must have fixed dimension, we only map vectors of number types with fixed dimension (i.e. *Vector[cocoindex.Float32, N]*, *Vector[cocoindex.Float64, N]*, and *Vector[cocoindex.Int64, N]*) to `vector(N)` columns.
+ For all other vector types, we map them to `jsonb` columns.
+
+ :::
+
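
The mapping rule in the warning above can be written as a short decision function. This is only a sketch of the documented rule: `postgres_column_type` is a hypothetical helper, element types are modeled as plain strings, and none of it is CocoIndex's actual implementation.

```python
from typing import Optional

# Element types the warning lists as mappable to pgvector (modeled as strings).
NUMBER_TYPES = {"Float32", "Float64", "Int64"}


def postgres_column_type(elem_type: str, dim: Optional[int]) -> str:
    """Pick a column type for a vector field, following the documented rule."""
    if elem_type in NUMBER_TYPES and dim is not None:
        return f"vector({dim})"
    # No fixed dimension, or a non-number element type: fall back to jsonb.
    return "jsonb"


fixed_float = postgres_column_type("Float32", 384)
unsized_float = postgres_column_type("Float32", None)
fixed_str = postgres_column_type("Str", 3)
```

In practice this means a function returning `cocoindex.Vector[cocoindex.Float32]` without a dimension would land in a `jsonb` column rather than a `vector(N)` column.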
  #### Spec
 
  The spec takes the following fields:
@@ -58,6 +65,13 @@ Here's how CocoIndex data elements map to Qdrant elements during export:
 
  *Vector[Float32, N]*, *Vector[Float64, N]* and *Vector[Int64, N]* types fit into Qdrant vector.
 
+ :::warning vector type mapping to Qdrant
+
+ Since vectors in Qdrant must have fixed dimension, we only map vectors of number types with fixed dimension (i.e. *Vector[cocoindex.Float32, N]*, *Vector[cocoindex.Float64, N]*, and *Vector[cocoindex.Int64, N]*) to Qdrant vectors.
+ For all other vector types, we map to Qdrant payload as JSON arrays.
+
+ :::
+
  #### Spec
 
  The spec takes the following fields:
@@ -0,0 +1,156 @@
+ ---
+ title: Live Updates
+ description: "Keep your indexes up-to-date with live updates in CocoIndex."
+ ---
+
+ # Live Updates
+
+ CocoIndex is designed to keep your indexes synchronized with your data sources. This is achieved through a feature called **live updates**, which automatically detects changes in your sources and updates your indexes accordingly. This ensures that your search results and data analysis are always based on the most current information.
+
+ ## How Live Updates Work
+
+ Live updates in CocoIndex can be triggered in two main ways:
+
+ 1. **Refresh Interval:** You can configure a `refresh_interval` for any data source. CocoIndex will then periodically check the source for any new, updated, or deleted data. This is a simple and effective way to keep your index fresh, especially for sources that don't have a built-in change notification system.
+
+ 2. **Change Capture Mechanisms:** Some data sources offer more sophisticated ways to track changes. For example:
+     * **Amazon S3:** You can configure an SQS queue to receive notifications whenever a file is added, modified, or deleted in your S3 bucket. CocoIndex can listen to this queue and trigger an update instantly.
+     * **Google Drive:** The Google Drive source can be configured to poll for recent changes, which is more efficient than a full refresh.
+
+ When a change is detected, CocoIndex performs an **incremental update**. This means it only re-processes the data that has been affected by the change, without having to re-index your entire dataset. This makes the update process fast and efficient.
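
As a conceptual illustration of what an incremental update has to work out, the sketch below compares two snapshots of a source and reports which entries were added, modified, or deleted. It is not CocoIndex's actual change-detection mechanism; `diff_snapshots` and the fingerprint strings are assumptions made for the example.

```python
def diff_snapshots(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two {filename: content_fingerprint} snapshots of a source."""
    return {
        "added": sorted(name for name in new if name not in old),
        "modified": sorted(
            name for name in new if name in old and new[name] != old[name]
        ),
        "deleted": sorted(name for name in old if name not in new),
    }


before = {"a.md": "v1", "b.md": "v1", "c.md": "v1"}
after = {"a.md": "v1", "b.md": "v2", "d.md": "v1"}
changes = diff_snapshots(before, after)
```

Only the entries named in `changes` would need re-processing; `a.md` is untouched and can be skipped, which is the saving an incremental update provides over a full re-index.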
+
+ Here's an example of how to set up a source with a `refresh_interval`:
+
+ ```python
+ @cocoindex.flow_def(name="LiveUpdateExample")
+ def live_update_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
+     # Source: local files in the 'data' directory
+     data_scope["documents"] = flow_builder.add_source(
+         cocoindex.sources.LocalFile(path="data"),
+         refresh_interval=datetime.timedelta(seconds=5),
+     )
+     # ...
+ ```
+
+ By setting `refresh_interval` to 5 seconds, we're telling CocoIndex to check for changes in the `data` directory every 5 seconds.
+
+ ## Implementing Live Updates
+
+ You can enable live updates using either the CocoIndex CLI or the Python library.
+
+ ### Using the CLI
+
+ To start a live update process from the command line, use the `update` command with the `-L` or `--live` flag:
+
+ ```bash
+ cocoindex update -L your_flow_definition_file.py
+ ```
+
+ This will start a long-running process that continuously monitors your data sources for changes and updates your indexes in real-time. You can stop the process by pressing `Ctrl+C`.
+
+ ### Using the Python Library
+
+ For more control over the live update process, you can use the `FlowLiveUpdater` class in your Python code. This is particularly useful when you want to integrate CocoIndex into a larger application.
+
+ The `FlowLiveUpdater` can be used as a context manager, which automatically starts the updater when you enter the `with` block and stops it when you exit. The `wait()` method will block until the updater is aborted (e.g., by pressing `Ctrl+C`).
+
+ Here's how you can use `FlowLiveUpdater` to start and manage a live update process:
+
+ ```python
+ import cocoindex
+
+ # Create a FlowLiveUpdater instance
+ with cocoindex.FlowLiveUpdater(live_update_flow, cocoindex.FlowLiveUpdaterOptions(print_stats=True)) as updater:
+     print("Live updater started. Press Ctrl+C to stop.")
+     # The updater runs in the background.
+     # The wait() method blocks until the updater is stopped.
+     updater.wait()
+
+ print("Live updater stopped.")
+ ```
+
+ #### Getting Status Updates
+
+ You can also get status updates from the `FlowLiveUpdater` to monitor the update process. The `next_status_updates()` method blocks until there is a new status update.
+
+ ```python
+ import cocoindex
+
+ updater = cocoindex.FlowLiveUpdater(live_update_flow)
+ updater.start()
+
+ while True:
+     updates = updater.next_status_updates()
+
+     if not updates.active_sources:
+         print("All sources have finished processing.")
+         break
+
+     for source_name in updates.updated_sources:
+         print(f"Source '{source_name}' has been updated.")
+
+ updater.wait()
+ ```
+
+ This allows you to react to updates in your application, for example, by notifying users or triggering downstream processes.
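
The control flow of that status loop can be tried without a database by substituting a stand-in updater. The sketch below is not CocoIndex code: `StatusUpdates` and `FakeUpdater` are hypothetical test doubles that expose only the two fields and the one method the loop reads (`active_sources`, `updated_sources`, `next_status_updates()`).

```python
import dataclasses


@dataclasses.dataclass
class StatusUpdates:
    """Stand-in carrying the two fields the status loop reads."""

    active_sources: list[str]
    updated_sources: list[str]


class FakeUpdater:
    """Hypothetical test double that replays a fixed list of status updates."""

    def __init__(self, script: list[StatusUpdates]) -> None:
        self._script = iter(script)

    def next_status_updates(self) -> StatusUpdates:
        return next(self._script)


def drain(updater) -> list[str]:
    """Collect names of updated sources until no source is active."""
    seen: list[str] = []
    while True:
        updates = updater.next_status_updates()
        if not updates.active_sources:
            return seen
        seen.extend(updates.updated_sources)


log = drain(
    FakeUpdater(
        [
            StatusUpdates(["documents"], ["documents"]),
            StatusUpdates(["documents"], []),
            StatusUpdates([], []),
        ]
    )
)
```

Replacing the `seen.extend(...)` line with a callback is one way to hook in the notifications or downstream triggers mentioned above.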
+
+ ## Example
+
+ Let's walk through an example of how to set up a live update flow. For the complete, runnable code, see the [live updates example](https://github.com/cocoindex-io/cocoindex/tree/main/examples/live_updates) in the CocoIndex repository.
+
+ ### 1. Setting up the Source
+
+ The first step is to define a source and configure a `refresh_interval`. In this example, we'll use a `LocalFile` source to monitor a directory named `data`.
+
+ ```python
+ @cocoindex.flow_def(name="LiveUpdateExample")
+ def live_update_flow(flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope):
+     # Source: local files in the 'data' directory
+     data_scope["documents"] = flow_builder.add_source(
+         cocoindex.sources.LocalFile(path="data"),
+         refresh_interval=datetime.timedelta(seconds=5),
+     )
+
+     # Collector
+     collector = data_scope.add_collector()
+     with data_scope["documents"].row() as doc:
+         collector.collect(filename=doc["filename"], content=doc["content"])
+
+     # Target: Postgres database
+     collector.export(
+         "documents_index",
+         cocoindex.targets.Postgres(),
+         primary_key_fields=["filename"]
+     )
+ ```
+
+ By setting `refresh_interval` to 5 seconds, we're telling CocoIndex to check for changes in the `data` directory every 5 seconds.
+
+ ### 2. Running the Live Updater
+
+ Once the flow is defined, you can use the `FlowLiveUpdater` to start the live update process.
+
+ ```python
+ def main():
+     # Initialize CocoIndex
+     cocoindex.init()
+
+     # Setup the flow
+     live_update_flow.setup(report_to_stdout=True)
+
+     # Start the live updater
+     with cocoindex.FlowLiveUpdater(live_update_flow, cocoindex.FlowLiveUpdaterOptions(print_stats=True)) as updater:
+         print("Live updater started. Watching for changes in the 'data' directory.")
+         updater.wait()
+
+ if __name__ == "__main__":
+     main()
+ ```
+
+ The `FlowLiveUpdater` will run in the background, and the `updater.wait()` call will block until the process is stopped.
+
+ ## Conclusion
+
+ Live updates is a powerful feature of CocoIndex that ensures your indexes are always fresh. By using a combination of refresh intervals and source-specific change capture mechanisms, you can build responsive, real-time applications that are always in sync with your data.
+
+ For more detailed information on the `FlowLiveUpdater` and other live update options, please refer to the [Run a Flow documentation](https://cocoindex.io/docs/core/flow_methods#live-update).
@@ -12,6 +12,14 @@ const sidebars: SidebarsConfig = {
          'getting_started/installation',
        ],
      },
+     {
+       type: 'category',
+       label: 'Tutorials',
+       collapsed: false,
+       items: [
+         'tutorials/live_updates',
+       ],
+     },
      {
        type: 'category',
        label: 'CocoIndex Core',
@@ -0,0 +1,51 @@
+ # Recognize faces in images and build an embedding index
+ [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)
+
+
+ In this example, we will recognize faces in images and build an embedding index.
+
+ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/cocoindex) if this is helpful.
+
+ ## Steps
+ ### Indexing Flow
+
+ 1. We will ingest a list of images.
+ 2. For each image, we:
+    - Extract faces from the image.
+    - Compute embeddings for each face.
+ 3. We will export to the following tables in Postgres with PGVector:
+    - Filename, rect, embedding for each face.
+
+
+ ## Prerequisite
+
+ 1. [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
+
+ 2. Install dependencies:
+
+     ```bash
+     pip install -e .
+     ```
+
+ ## Run
+
+ Update the index, which also sets up the tables on the first run:
+
+ ```bash
+ cocoindex update --setup main.py
+ ```
+
+ You can also run the command with `-L`, which will watch for file changes and update the index automatically.
+
+ ```bash
+ cocoindex update --setup -L main.py
+ ```
+
+ ## CocoInsight
+ I used CocoInsight (free beta now) to troubleshoot the index generation and understand the data lineage of the pipeline. It just connects to your local CocoIndex server, with zero pipeline data retention. Run the following command to start CocoInsight:
+
+ ```
+ cocoindex server -ci main.py
+ ```
+
+ Then open the CocoInsight UI at [https://cocoindex.io/cocoinsight](https://cocoindex.io/cocoinsight).
@@ -0,0 +1,120 @@
+ import cocoindex
+ import io
+ import dataclasses
+ import datetime
+ import typing
+
+ import face_recognition
+ from PIL import Image
+ import numpy as np
+
+
+ @dataclasses.dataclass
+ class ImageRect:
+     min_x: int
+     min_y: int
+     max_x: int
+     max_y: int
+
+
+ @dataclasses.dataclass
+ class FaceBase:
+     """A face in an image."""
+
+     rect: ImageRect
+     image: bytes
+
+
+ MAX_IMAGE_WIDTH = 1280
+
+
+ @cocoindex.op.function(
+     cache=True,
+     behavior_version=1,
+     gpu=True,
+     arg_relationship=(cocoindex.op.ArgRelationship.RECTS_BASE_IMAGE, "content"),
+ )
+ def extract_faces(content: bytes) -> list[FaceBase]:
+     """Extract faces from an image."""
+     orig_img = Image.open(io.BytesIO(content)).convert("RGB")
+
+     # The model is too slow on large images, so we resize them if too large.
+     if orig_img.width > MAX_IMAGE_WIDTH:
+         ratio = orig_img.width * 1.0 / MAX_IMAGE_WIDTH
+         img = orig_img.resize(
+             (MAX_IMAGE_WIDTH, int(orig_img.height / ratio)),
+             resample=Image.Resampling.BICUBIC,
+         )
+     else:
+         ratio = 1.0
+         img = orig_img
+
+     # Extract face locations.
+     locs = face_recognition.face_locations(np.array(img), model="cnn")
+
+     faces: list[FaceBase] = []
+     for min_y, max_x, max_y, min_x in locs:
+         rect = ImageRect(
+             min_x=int(min_x * ratio),
+             min_y=int(min_y * ratio),
+             max_x=int(max_x * ratio),
+             max_y=int(max_y * ratio),
+         )
+
+         # Crop the face and save it as a PNG.
+         buf = io.BytesIO()
+         orig_img.crop((rect.min_x, rect.min_y, rect.max_x, rect.max_y)).save(
+             buf, format="PNG"
+         )
+         face = buf.getvalue()
+         faces.append(FaceBase(rect, face))
+
+     return faces
+
+
+ @cocoindex.op.function(cache=True, behavior_version=1, gpu=True)
+ def extract_face_embedding(
+     face: bytes,
+ ) -> cocoindex.Vector[cocoindex.Float32]:
+     """Extract the embedding of a face."""
+     img = Image.open(io.BytesIO(face)).convert("RGB")
+     embedding = face_recognition.face_encodings(
+         np.array(img),
+         known_face_locations=[(0, img.width - 1, img.height - 1, 0)],
+     )[0]
+     return embedding
+
+
+ @cocoindex.flow_def(name="FaceRecognition")
+ def face_recognition_flow(
+     flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
+ ) -> None:
+     """
+     Define an example flow that embeds files into a vector database.
+     """
+     data_scope["images"] = flow_builder.add_source(
+         cocoindex.sources.LocalFile(path="images", binary=True),
+         refresh_interval=datetime.timedelta(seconds=10),
+     )
+
+     face_embeddings = data_scope.add_collector()
+
+     with data_scope["images"].row() as image:
+         # Extract faces
+         image["faces"] = image["content"].transform(extract_faces)
+
+         with image["faces"].row() as face:
+             face["embedding"] = face["image"].transform(extract_face_embedding)
+
+             # Collect embeddings
+             face_embeddings.collect(
+                 filename=image["filename"],
+                 rect=face["rect"],
+                 embedding=face["embedding"],
+             )
+
+     face_embeddings.export(
+         "face_embeddings",
+         cocoindex.targets.Postgres(),
+         primary_key_fields=["filename", "rect"],
+     )
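
The coordinate handling in `extract_faces` is easy to get wrong, so it may help to see it in isolation. Faces are located on a downscaled copy of the image, boxes are unpacked in the `(min_y, max_x, max_y, min_x)` order used by the loop above, and each coordinate is multiplied by `ratio` to map it back onto the original image. The sketch below reproduces only that arithmetic with the standard library; `to_original_rect` and the sample numbers are illustrative and not part of the example's code.

```python
import dataclasses


@dataclasses.dataclass
class ImageRect:
    min_x: int
    min_y: int
    max_x: int
    max_y: int


def to_original_rect(loc: tuple[int, int, int, int], ratio: float) -> ImageRect:
    """Map a (top, right, bottom, left) box on the resized image to the original."""
    top, right, bottom, left = loc
    return ImageRect(
        min_x=int(left * ratio),
        min_y=int(top * ratio),
        max_x=int(right * ratio),
        max_y=int(bottom * ratio),
    )


# A 2560-pixel-wide original resized to 1280 pixels gives ratio = 2.0.
ratio = 2560 / 1280
rect = to_original_rect((10, 200, 110, 100), ratio)
```

Cropping from the original image with the rescaled rectangle, as the example does, keeps the stored face at full resolution even though detection ran on the smaller copy.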
@@ -0,0 +1,14 @@
+ [project]
+ name = "cocoindex-face-recognition-example"
+ version = "0.1.0"
+ description = "Recognize faces in images and build an embedding index"
+ requires-python = ">=3.11"
+ dependencies = [
+     "cocoindex>=0.1.71",
+     "face-recognition>=1.3.0",
+     "pillow>=10.0.0",
+     "numpy>=1.26.0",
+ ]
+
+ [tool.setuptools]
+ packages = []
@@ -0,0 +1 @@
+ COCOINDEX_DATABASE_URL=postgres://cocoindex:cocoindex@localhost/cocoindex
@@ -0,0 +1,58 @@
+ # Applying Live Updates to CocoIndex Flow Example
+ [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)
+
+ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/cocoindex) if this is helpful.
+
+ This example demonstrates how to use CocoIndex's live update feature to keep an index synchronized with a local directory.
+
+ ## How it Works
+
+ The `main.py` script defines a CocoIndex flow that:
+
+ 1. **Sources** data from a local directory named `data`. It uses a `refresh_interval` of 5 seconds to check for changes.
+ 2. **Collects** the `filename` and `content` of each file.
+ 3. **Exports** the collected data to a Postgres database table.
+
+ The script then starts a `FlowLiveUpdater`, which runs in the background and continuously monitors the `data` directory for changes.
+
+ ## Running the Example
+
+ 1. [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
+
+ 2. **Install the dependencies:**
+
+     ```bash
+     pip install -e .
+     ```
+
+ 3. **Run the example:**
+
+     You can run the live update example in two ways:
+
+     **Option 1: Using the Python script**
+
+     This method uses CocoIndex [Library API](https://cocoindex.io/docs/core/flow_methods#library-api-2) to perform live updates.
+
+     ```bash
+     python main.py
+     ```
+
+     **Option 2: Using the CocoIndex CLI**
+
+     This method is useful for managing your indexes from the command line, through CocoIndex [CLI](https://cocoindex.io/docs/core/flow_methods#cli-2).
+
+     ```bash
+     cocoindex update main.py -L --setup
+     ```
+
+ 4. **Test the live updates:**
+
+     While the script is running, you can try adding, modifying, or deleting files in the `data` directory. You will see the changes reflected in the logs as CocoIndex updates the index.
+
+ ## Cleaning Up
+
+ To remove the database table created by this example, you can run:
+
+ ```bash
+ cocoindex drop main.py
+ ```