cocoindex 0.1.48__tar.gz → 0.1.49__tar.gz

Files changed (241)
  1. {cocoindex-0.1.48 → cocoindex-0.1.49}/Cargo.lock +1 -1
  2. {cocoindex-0.1.48 → cocoindex-0.1.49}/Cargo.toml +1 -1
  3. {cocoindex-0.1.48 → cocoindex-0.1.49}/PKG-INFO +1 -1
  4. cocoindex-0.1.49/docs/docs/getting_started/overview.md +34 -0
  5. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/ops/functions.md +21 -3
  6. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/ops/storages.md +52 -40
  7. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/docs_to_knowledge_graph/README.md +7 -10
  8. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/docs_to_knowledge_graph/main.py +26 -23
  9. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/README.md +0 -15
  10. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/README.md +7 -7
  11. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/main.py +32 -28
  12. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding_qdrant/README.md +3 -19
  13. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/functions.py +12 -0
  14. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/functions/split_recursively.rs +292 -203
  15. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/registration.rs +1 -1
  16. cocoindex-0.1.48/docs/docs/getting_started/overview.md +0 -14
  17. {cocoindex-0.1.48 → cocoindex-0.1.49}/.cargo/config.toml +0 -0
  18. {cocoindex-0.1.48 → cocoindex-0.1.49}/.env.lib_debug +0 -0
  19. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/ISSUE_TEMPLATE/🐛-bug-report.md +0 -0
  20. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/ISSUE_TEMPLATE/💡-feature-request.md +0 -0
  21. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/scripts/update_version.sh +0 -0
  22. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/workflows/CI.yml +0 -0
  23. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/workflows/_test.yml +0 -0
  24. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/workflows/docs.yml +0 -0
  25. {cocoindex-0.1.48 → cocoindex-0.1.49}/.github/workflows/release.yml +0 -0
  26. {cocoindex-0.1.48 → cocoindex-0.1.49}/.gitignore +0 -0
  27. {cocoindex-0.1.48 → cocoindex-0.1.49}/.vscode/settings.json +0 -0
  28. {cocoindex-0.1.48 → cocoindex-0.1.49}/CODE_OF_CONDUCT.md +0 -0
  29. {cocoindex-0.1.48 → cocoindex-0.1.49}/CONTRIBUTING.md +0 -0
  30. {cocoindex-0.1.48 → cocoindex-0.1.49}/LICENSE +0 -0
  31. {cocoindex-0.1.48 → cocoindex-0.1.49}/README.md +0 -0
  32. {cocoindex-0.1.48 → cocoindex-0.1.49}/dev/neo4j.yaml +0 -0
  33. {cocoindex-0.1.48 → cocoindex-0.1.49}/dev/postgres.yaml +0 -0
  34. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/.gitignore +0 -0
  35. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/README.md +0 -0
  36. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/about/community.md +0 -0
  37. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/about/contributing.md +0 -0
  38. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/ai/llm.mdx +0 -0
  39. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/basics.md +0 -0
  40. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/cli.mdx +0 -0
  41. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/custom_function.mdx +0 -0
  42. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/data_example.svg +0 -0
  43. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/data_types.mdx +0 -0
  44. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/flow_def.mdx +0 -0
  45. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/flow_example.svg +0 -0
  46. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/flow_methods.mdx +0 -0
  47. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/core/settings.mdx +0 -0
  48. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/getting_started/installation.md +0 -0
  49. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/getting_started/markdown_files.zip +0 -0
  50. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/getting_started/quickstart.md +0 -0
  51. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/ops/sources.md +0 -0
  52. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docs/query.mdx +0 -0
  53. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/docusaurus.config.ts +0 -0
  54. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/package.json +0 -0
  55. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/sidebars.ts +0 -0
  56. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/src/components/HomepageFeatures/index.tsx +0 -0
  57. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/src/components/HomepageFeatures/styles.module.css +0 -0
  58. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/src/css/custom.css +0 -0
  59. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/src/theme/Root.js +0 -0
  60. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/static/.nojekyll +0 -0
  61. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/static/img/docusaurus.png +0 -0
  62. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/static/img/favicon.ico +0 -0
  63. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/static/img/icon.svg +0 -0
  64. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/static/robots.txt +0 -0
  65. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/tsconfig.json +0 -0
  66. {cocoindex-0.1.48 → cocoindex-0.1.49}/docs/yarn.lock +0 -0
  67. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/amazon_s3_embedding/.env.example +0 -0
  68. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/amazon_s3_embedding/.gitignore +0 -0
  69. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/amazon_s3_embedding/README.md +0 -0
  70. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/amazon_s3_embedding/main.py +0 -0
  71. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/amazon_s3_embedding/pyproject.toml +0 -0
  72. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/code_embedding/.env +0 -0
  73. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/code_embedding/README.md +0 -0
  74. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/code_embedding/main.py +0 -0
  75. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/code_embedding/pyproject.toml +0 -0
  76. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/docs_to_knowledge_graph/.env +0 -0
  77. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/docs_to_knowledge_graph/pyproject.toml +0 -0
  78. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/.dockerignore +0 -0
  79. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/.env +0 -0
  80. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/README.md +0 -0
  81. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/compose.yaml +0 -0
  82. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/dockerfile +0 -0
  83. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/files/1810.04805v2.md +0 -0
  84. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/main.py +0 -0
  85. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/fastapi_server_docker/requirements.txt +0 -0
  86. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/gdrive_text_embedding/.env.example +0 -0
  87. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/gdrive_text_embedding/.gitignore +0 -0
  88. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/gdrive_text_embedding/README.md +0 -0
  89. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/gdrive_text_embedding/main.py +0 -0
  90. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/gdrive_text_embedding/pyproject.toml +0 -0
  91. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/.env +0 -0
  92. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/.gitignore +0 -0
  93. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/index.html +0 -0
  94. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/package-lock.json +0 -0
  95. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/package.json +0 -0
  96. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/src/App.jsx +0 -0
  97. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/src/main.jsx +0 -0
  98. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/src/style.css +0 -0
  99. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/frontend/vite.config.js +0 -0
  100. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/img/cat1.jpeg +0 -0
  101. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/img/dog1.jpeg +0 -0
  102. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/img/elephant1.jpg +0 -0
  103. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/img/giraffe.jpg +0 -0
  104. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/main.py +0 -0
  105. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/pyproject.toml +0 -0
  106. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/image_search/requirements.txt +0 -0
  107. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/.env +0 -0
  108. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/README.md +0 -0
  109. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/main.py +0 -0
  110. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/manuals/array.pdf +0 -0
  111. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/manuals/base64.pdf +0 -0
  112. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/manuals/copy.pdf +0 -0
  113. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/manuals/glob.pdf +0 -0
  114. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/manuals_llm_extraction/pyproject.toml +0 -0
  115. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/.env +0 -0
  116. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/README.md +0 -0
  117. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/main.py +0 -0
  118. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/pdf_files/1706.03762v7.pdf +0 -0
  119. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/pdf_files/1810.04805v2.pdf +0 -0
  120. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/pdf_files/rfc8259.pdf +0 -0
  121. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/pdf_embedding/pyproject.toml +0 -0
  122. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/.env +0 -0
  123. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/img/cocoinsight.png +0 -0
  124. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/img/neo4j.png +0 -0
  125. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p1.json +0 -0
  126. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p2.json +0 -0
  127. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p3.json +0 -0
  128. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p4.json +0 -0
  129. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p5.json +0 -0
  130. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p6.json +0 -0
  131. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p7.json +0 -0
  132. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p8.json +0 -0
  133. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/products/p9.json +0 -0
  134. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/product_recommendation/pyproject.toml +0 -0
  135. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/.env +0 -0
  136. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/README.md +0 -0
  137. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/Text_Embedding.ipynb +0 -0
  138. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/main.py +0 -0
  139. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/markdown_files/1706.03762v7.md +0 -0
  140. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/markdown_files/1810.04805v2.md +0 -0
  141. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/markdown_files/rfc8259.md +0 -0
  142. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding/pyproject.toml +0 -0
  143. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding_qdrant/.env +0 -0
  144. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding_qdrant/main.py +0 -0
  145. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding_qdrant/markdown_files/rfc8259.md +0 -0
  146. {cocoindex-0.1.48 → cocoindex-0.1.49}/examples/text_embedding_qdrant/pyproject.toml +0 -0
  147. {cocoindex-0.1.48 → cocoindex-0.1.49}/pyproject.toml +0 -0
  148. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/__init__.py +0 -0
  149. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/auth_registry.py +0 -0
  150. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/cli.py +0 -0
  151. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/convert.py +0 -0
  152. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/flow.py +0 -0
  153. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/index.py +0 -0
  154. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/lib.py +0 -0
  155. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/llm.py +0 -0
  156. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/op.py +0 -0
  157. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/py.typed +0 -0
  158. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/query.py +0 -0
  159. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/runtime.py +0 -0
  160. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/setting.py +0 -0
  161. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/setup.py +0 -0
  162. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/sources.py +0 -0
  163. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/storages.py +0 -0
  164. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/tests/__init__.py +0 -0
  165. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/tests/test_convert.py +0 -0
  166. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/typing.py +0 -0
  167. {cocoindex-0.1.48 → cocoindex-0.1.49}/python/cocoindex/utils.py +0 -0
  168. {cocoindex-0.1.48 → cocoindex-0.1.49}/ruff.toml +0 -0
  169. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/duration.rs +0 -0
  170. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/field_attrs.rs +0 -0
  171. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/json_schema.rs +0 -0
  172. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/mod.rs +0 -0
  173. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/schema.rs +0 -0
  174. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/spec.rs +0 -0
  175. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/base/value.rs +0 -0
  176. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/builder/analyzed_flow.rs +0 -0
  177. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/builder/analyzer.rs +0 -0
  178. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/builder/flow_builder.rs +0 -0
  179. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/builder/mod.rs +0 -0
  180. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/builder/plan.rs +0 -0
  181. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/db_tracking.rs +0 -0
  182. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/db_tracking_setup.rs +0 -0
  183. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/dumper.rs +0 -0
  184. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/evaluator.rs +0 -0
  185. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/indexing_status.rs +0 -0
  186. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/live_updater.rs +0 -0
  187. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/memoization.rs +0 -0
  188. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/mod.rs +0 -0
  189. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/query.rs +0 -0
  190. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/row_indexer.rs +0 -0
  191. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/source_indexer.rs +0 -0
  192. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/execution/stats.rs +0 -0
  193. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/lib.rs +0 -0
  194. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/lib_context.rs +0 -0
  195. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/llm/anthropic.rs +0 -0
  196. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/llm/gemini.rs +0 -0
  197. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/llm/mod.rs +0 -0
  198. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/llm/ollama.rs +0 -0
  199. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/llm/openai.rs +0 -0
  200. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/factory_bases.rs +0 -0
  201. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/functions/extract_by_llm.rs +0 -0
  202. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/functions/mod.rs +0 -0
  203. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/functions/parse_json.rs +0 -0
  204. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/interface.rs +0 -0
  205. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/mod.rs +0 -0
  206. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/py_factory.rs +0 -0
  207. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/registry.rs +0 -0
  208. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/sdk.rs +0 -0
  209. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/sources/amazon_s3.rs +0 -0
  210. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/sources/google_drive.rs +0 -0
  211. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/sources/local_file.rs +0 -0
  212. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/sources/mod.rs +0 -0
  213. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/kuzu.rs +0 -0
  214. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/mod.rs +0 -0
  215. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/neo4j.rs +0 -0
  216. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/postgres.rs +0 -0
  217. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/qdrant.rs +0 -0
  218. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/shared/mod.rs +0 -0
  219. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/shared/property_graph.rs +0 -0
  220. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/ops/storages/shared/table_columns.rs +0 -0
  221. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/prelude.rs +0 -0
  222. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/py/convert.rs +0 -0
  223. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/py/mod.rs +0 -0
  224. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/server.rs +0 -0
  225. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/service/error.rs +0 -0
  226. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/service/flows.rs +0 -0
  227. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/service/mod.rs +0 -0
  228. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/service/search.rs +0 -0
  229. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/settings.rs +0 -0
  230. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/setup/auth_registry.rs +0 -0
  231. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/setup/components.rs +0 -0
  232. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/setup/db_metadata.rs +0 -0
  233. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/setup/driver.rs +0 -0
  234. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/setup/mod.rs +0 -0
  235. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/setup/states.rs +0 -0
  236. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/utils/db.rs +0 -0
  237. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/utils/fingerprint.rs +0 -0
  238. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/utils/immutable.rs +0 -0
  239. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/utils/mod.rs +0 -0
  240. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/utils/retryable.rs +0 -0
  241. {cocoindex-0.1.48 → cocoindex-0.1.49}/src/utils/yaml_ser.rs +0 -0
@@ -993,7 +993,7 @@ dependencies = [
 
  [[package]]
  name = "cocoindex"
- version = "0.1.48"
+ version = "0.1.49"
  dependencies = [
  "anyhow",
  "async-openai",
@@ -2,7 +2,7 @@
  name = "cocoindex"
  # Version used for local development is always higher than others to take precedence.
  # Will be overridden for specific release versions.
- version = "0.1.48"
+ version = "0.1.49"
  edition = "2024"
 
  [profile.release]
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: cocoindex
- Version: 0.1.48
+ Version: 0.1.49
  Requires-Dist: sentence-transformers>=3.3.1
  Requires-Dist: click>=8.1.8
  Requires-Dist: rich>=14.0.0
@@ -0,0 +1,34 @@
+ ---
+ title: Overview
+ slug: /
+ ---
+
+ # Welcome to CocoIndex
+
+ CocoIndex is an ultra-performant real-time data transformation framework for AI, with incremental processing.
+
+ As a data framework, CocoIndex takes it to the next level on data freshness. **Incremental processing** is one of the core values provided by CocoIndex.
+
+ ## Programming Model
+ CocoIndex follows the idea of [Dataflow programming](https://en.wikipedia.org/wiki/Dataflow_programming) model. Each transformation creates a new field solely based on input fields, without hidden states and value mutation. All data before/after each transformation is observable, with lineage out of the box.
+
+ The gist of an example data transformation:
+ ```python
+ # import
+ data['content'] = flow_builder.add_source(...)
+
+ # transform
+ data['out'] = data['content']
+ .transform(...)
+ .transform(...)
+
+ # collect data
+ collector.collect(...)
+
+ # export to db, vector db, graph db ...
+ collector.export(...)
+ ```
+
+ Get Started:
+ - [Quick Start](https://cocoindex.io/docs/getting_started/quickstart)
+
@@ -39,9 +39,27 @@ Input data:
 
  * `chunk_overlap` (type: `int`, optional): The maximum overlap size between adjacent chunks, in bytes.
  * `language` (type: `str`, optional): The language of the document.
- Can be a langauge name (e.g. `Python`, `Javascript`, `Markdown`) or a file extension (e.g. `.py`, `.js`, `.md`).
- To see all supported language names and extensions, see [the code](https://github.com/search?q=org%3Acocoindex-io+lang%3Arust++%22static+TREE_SITTER_LANGUAGE_BY_LANG%22&type=code).
- If it's unspecified or the specified language is not supported, it will be treated as plain text.
+ Can be a language name (e.g. `Python`, `Javascript`, `Markdown`) or a file extension (e.g. `.py`, `.js`, `.md`).
+
+ * `custom_languages` (type: `list[CustomLanguageSpec]`, optional): This allows you to customize the way to chunking specific languages using regular expressions. Each `CustomLanguageSpec` is a dict with the following fields:
+ * `language_name` (type: `str`, required): Name of the language.
+ * `aliases` (type: `list[str]`, optional): A list of aliases for the language.
+ It's an error if any language name or alias is duplicated.
+
+ * `separators_regex` (type: `list[str]`, required): A list of regex patterns to split the text.
+ Higher-level boundaries should come first, and lower-level should be listed later. e.g. `[r"\n# ", r"\n## ", r"\n\n", r"\. "]`.
+ See [regex Syntax](https://docs.rs/regex/latest/regex/#syntax) for supported regular expression syntax.
+
+ :::note
+
+ We use the `language` field to determine how to split the input text, following these rules:
+
+ * We'll match the input `language` field against the `language_name` or `aliases` of each custom language specification, and use the matched one. If value of `language` is null, it'll be treated as empty string when matching `language_name` or `aliases`.
+ * If no match is found, we'll match the `language` field against the builtin language configurations.
+ For all supported builtin language names and aliases (extensions), see [the code](https://github.com/search?q=org%3Acocoindex-io+lang%3Arust++%22static+TREE_SITTER_LANGUAGE_BY_LANG%22&type=code).
+ * If no match is found, the input will be treated as plain text.
+
+ :::
 
  Return type: [KTable](/docs/core/data_types#ktable), each row represents a chunk, with the following sub fields:
 
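The lookup order described in the note of this hunk, and the role of `separators_regex`, can be sketched in plain Python. This is only an illustration under stated assumptions: `resolve_language`, `split_by_separators` and the `BUILTIN_LANGUAGES` set are hypothetical names made up here, matching is assumed to be exact and case-sensitive, chunk size is measured in characters rather than bytes, and chunk overlap is left out. It is not the Rust implementation in `split_recursively.rs`, and Python's `re` syntax differs in places from the Rust `regex` crate the docs link to.

```python
import re

# Hypothetical stand-in for the builtin language table (assumption).
BUILTIN_LANGUAGES = {"Python", ".py", "Markdown", ".md"}


def resolve_language(language, custom_languages):
    """Return ("custom", spec), ("builtin", name) or ("plain", None)."""
    # A null language is treated as an empty string when matching custom specs.
    key = language or ""
    for spec in custom_languages:
        if key == spec["language_name"] or key in spec.get("aliases", []):
            return ("custom", spec)
    # No custom match: fall back to the builtin configurations.
    if key in BUILTIN_LANGUAGES:
        return ("builtin", key)
    # Still no match: treat the input as plain text.
    return ("plain", None)


def split_by_separators(text, separators_regex, chunk_size):
    """Recursively split text, trying higher-level separators first."""
    if not text:
        return []
    if len(text) <= chunk_size or not separators_regex:
        return [text]
    first, rest = separators_regex[0], separators_regex[1:]
    chunks = []
    for piece in re.split(first, text):
        # Pieces that are still too long are split by lower-level separators.
        chunks.extend(split_by_separators(piece, rest, chunk_size))
    return chunks


# Usage with a made-up custom language spec.
spec = {
    "language_name": "MyMarkdown",
    "aliases": [".mymd"],
    "separators_regex": [r"\n# ", r"\n\n"],
}
kind, matched = resolve_language(".mymd", [spec])
chunks = split_by_separators(
    "intro\n# A\n\npara one\n\npara two",
    matched["separators_regex"],
    chunk_size=10,
)
print(kind, chunks)
```

With the sample input, the heading separator is tried first and the paragraph separator is only applied to the piece that is still longer than `chunk_size`, giving `["intro", "A", "para one", "para two"]`.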
@@ -54,34 +54,21 @@ Here's how CocoIndex data elements map to Qdrant elements during export:
  |-------------------|------------------|
  | an export target | a unique collection |
  | a collected row | a point |
- | a field | a named vector (for fields with vector type); a field within payload (otherwise) |
+ | a field | a named vector, if fits into Qdrant vector; or a field within payload otherwise |
+
+ A vector with `Float32`, `Float64` or `Int64` type, and with fixed dimension, fits into Qdrant vector.
 
  #### Spec
 
  The spec takes the following fields:
 
- * `collection_name` (type: `str`, required): The name of the collection to export the data to.
-
- * `grpc_url` (type: `str`, optional): The [gRPC URL](https://qdrant.tech/documentation/interfaces/#grpc-interface) of the Qdrant instance. Defaults to `http://localhost:6334/`.
-
- * `api_key` (type: `str`, optional). API key to authenticate requests with.
+ * `connection` (type: [auth reference](../core/flow_def#auth-registry) to `QdrantConnection`, optional): The connection to the Qdrant instance. `QdrantConnection` has the following fields:
+ * `grpc_url` (type: `str`): The [gRPC URL](https://qdrant.tech/documentation/interfaces/#grpc-interface) of the Qdrant instance, e.g. `http://localhost:6334/`.
+ * `api_key` (type: `str`, optional). API key to authenticate requests with.
 
- Before exporting, you must create a collection with a [vector name](https://qdrant.tech/documentation/concepts/vectors/#named-vectors) that matches the vector field name in CocoIndex, and set `setup_by_user=True` during export.
+ If `connection` is not provided, will use local Qdrant instance at `http://localhost:6334/` by default.
 
- Example:
-
- ```python
- doc_embeddings.export(
- "doc_embeddings",
- cocoindex.storages.Qdrant(
- collection_name="cocoindex",
- grpc_url="https://xyz-example.cloud-region.cloud-provider.cloud.qdrant.io:6334/",
- api_key="<your-api-key-here>",
- ),
- primary_key_fields=["id_field"],
- setup_by_user=True,
- )
- ```
+ * `collection_name` (type: `str`, required): The name of the collection to export the data to.
 
  You can find an end-to-end example [here](https://github.com/cocoindex-io/cocoindex/tree/main/examples/text_embedding_qdrant).
 
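The revised field-mapping rule in this hunk (a field becomes a named vector only if it is a fixed-dimension vector of `Float32`, `Float64` or `Int64`, and a payload field otherwise) can be read as a small predicate. The sketch below is hypothetical: `FieldType`, `fits_qdrant_vector` and `qdrant_slot` are illustrative names invented here, and the way a field's type is described is an assumption, not the CocoIndex API.

```python
from dataclasses import dataclass
from typing import Optional

# Element types that the updated docs say fit into a Qdrant vector.
VECTOR_ELEMENT_TYPES = {"Float32", "Float64", "Int64"}


@dataclass(frozen=True)
class FieldType:
    """Hypothetical description of a collected field's type (assumption)."""

    kind: str                           # e.g. "Vector", "Str", "Int64"
    element_type: Optional[str] = None  # only meaningful for vectors
    dimension: Optional[int] = None     # None means variable length


def fits_qdrant_vector(t: FieldType) -> bool:
    """True for a fixed-dimension vector with a supported element type."""
    return (
        t.kind == "Vector"
        and t.element_type in VECTOR_ELEMENT_TYPES
        and t.dimension is not None
    )


def qdrant_slot(t: FieldType) -> str:
    """Where the field would land in a Qdrant point under the stated rule."""
    return "named vector" if fits_qdrant_vector(t) else "payload field"


embedding = FieldType("Vector", element_type="Float32", dimension=384)
ragged = FieldType("Vector", element_type="Float32")  # no fixed dimension
text = FieldType("Str")
print(qdrant_slot(embedding), qdrant_slot(ragged), qdrant_slot(text))
```

Under this reading, a vector without a fixed dimension, or with an element type outside the three listed, is stored in the payload like any other field.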
@@ -399,19 +386,7 @@ You can find end-to-end examples fitting into any of supported property graphs i
 
  ### Neo4j
 
- If you don't have a Neo4j database, you can start a Neo4j database using our docker compose config:
-
- ```bash
- docker compose -f <(curl -L https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/neo4j.yaml) up -d
- ```
-
- :::warning
-
- The docker compose config above will start a Neo4j Enterprise instance under the [Evaluation License](https://neo4j.com/terms/enterprise_us/),
- with 30 days trial period.
- Please read and agree the license before starting the instance.
-
- :::
+ #### Spec
 
  The `Neo4j` target spec takes the following fields:
 
@@ -430,17 +405,32 @@ Neo4j also provides a declaration spec `Neo4jDeclaration`, to configure indexing
  * `primary_key_fields` (required)
  * `vector_indexes` (optional)
 
- ### Kuzu
+ #### Neo4j dev instance
 
- CocoIndex supports talking to Kuzu through its [API server](https://github.com/kuzudb/api-server).
- You can bring up a Kuzu API server locally by running:
+ If you don't have a Neo4j database, you can start a Neo4j database using our docker compose config:
 
  ```bash
- KUZU_DB_DIR=$HOME/.kuzudb
- KUZU_PORT=8123
- docker run -d --name kuzu -p ${KUZU_PORT}:8000 -v ${KUZU_DB_DIR}:/database kuzudb/api-server:latest
+ docker compose -f <(curl -L https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/neo4j.yaml) up -d
  ```
 
+ If will bring up a Neo4j instance, which can be accessed by username `neo4j` and password `cocoindex`.
+ You can access the Neo4j browser at [http://localhost:7474](http://localhost:7474).
+
+ :::warning
+
+ The docker compose config above will start a Neo4j Enterprise instance under the [Evaluation License](https://neo4j.com/terms/enterprise_us/),
+ with 30 days trial period.
+ Please read and agree the license before starting the instance.
+
+ :::
+
+
+ ### Kuzu
+
+ #### Spec
+
+ CocoIndex supports talking to Kuzu through its [API server](https://github.com/kuzudb/api-server).
+
  The `Kuzu` target spec takes the following fields:
 
  * `connection` (type: [auth reference](../core/flow_def#auth-registry) to `KuzuConnectionSpec`): The connection to the Kuzu database. `KuzuConnectionSpec` has the following fields:
@@ -453,3 +443,25 @@ Kuzu also provides a declaration spec `KuzuDeclaration`, to configure indexing o
  * Fields for [nodes to declare](#declare-extra-node-labels), including
  * `nodes_label` (required)
  * `primary_key_fields` (required)
+
+ #### Kuzu dev instance
+
+ If you don't have a Kuzu instance yet, you can bring up a Kuzu API server locally by running:
+
+ ```bash
+ KUZU_DB_DIR=$HOME/.kuzudb
+ KUZU_PORT=8123
+ docker run -d --name kuzu -p ${KUZU_PORT}:8000 -v ${KUZU_DB_DIR}:/database kuzudb/api-server:latest
+ ```
+
+ To explore the graph you built with Kuzu, you can use the [Kuzu Explorer](https://github.com/kuzudb/explorer).
+ Currently Kuzu API server and the explorer cannot be up at the same time. So you need to stop the API server before running the explorer.
+
+ To start the instance of the explorer, run:
+
+ ```bash
+ KUZU_EXPLORER_PORT=8124
+ docker run -d --name kuzu-explorer -p ${KUZU_EXPLORER_PORT}:8000 -v ${KUZU_DB_DIR}:/database -e MODE=READ_ONLY kuzudb/explorer:latest
+ ```
+
+ You can then access the explorer at [http://localhost:8124](http://localhost:8124).
@@ -12,10 +12,10 @@ Please drop [Cocoindex on Github](https://github.com/cocoindex-io/cocoindex) a s
 
  ![example-explanation](https://github.com/user-attachments/assets/07ddbd60-106f-427f-b7cc-16b73b142d27)
 
-
  ## Prerequisite
  * [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
- * [Install Neo4j](https://cocoindex.io/docs/ops/storages#neo4j) if you don't have one.
+ * Install [Neo4j](https://cocoindex.io/docs/ops/storages#neo4j-dev-instance) or [Kuzu](https://cocoindex.io/docs/ops/storages#kuzu-dev-instance) if you don't have one.
+ * The example uses Neo4j by default for now. If you want to use Kuzu, find out the "SELECT ONE GRAPH DATABASE TO USE" section and switch the active branch.
  * [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai).
 
  ## Documentation
@@ -45,21 +45,18 @@ cocoindex update main.py
 
  ### Browse the knowledge graph
 
- After the knowledge graph is build, you can explore the knowledge graph you built in Neo4j Browser.
+ After the knowledge graph is built, you can explore the knowledge graph.
 
- For the dev enviroment, you can connect neo4j browser using credentials:
- - username: `neo4j`
- - password: `cocoindex`
- which is pre-configured in the our docker compose [config.yaml](https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/neo4j.yaml).
+ * If you're using Neo4j, you can open the explorer at [http://localhost:7474](http://localhost:7474), with username `neo4j` and password `cocoindex`.
+ * If you're using Kuzu, you can start a Kuzu explorer locally. See [Kuzu dev instance](https://cocoindex.io/docs/ops/storages#kuzu-dev-instance) for more details.
 
- You can open it at [http://localhost:7474](http://localhost:7474), and run the following Cypher query to get all relationships:
+ You can run the following Cypher query to get all relationships:
 
  ```cypher
  MATCH p=()-->() RETURN p
  ```
- <img width="1366" alt="neo4j-for-coco-docs" src="https://github.com/user-attachments/assets/3c8b6329-6fee-4533-9480-571399b57e57" />
-
 
+ <img width="1366" alt="neo4j-for-coco-docs" src="https://github.com/user-attachments/assets/3c8b6329-6fee-4533-9480-571399b57e57" />
 
  ## CocoInsight
  I used CocoInsight (Free beta now) to troubleshoot the index generation and understand the data lineage of the pipeline.
@@ -5,27 +5,6 @@ This example shows how to extract relationships from documents and build a knowl
5
5
  import dataclasses
6
6
  import cocoindex
7
7
 
8
-
9
- @dataclasses.dataclass
10
- class DocumentSummary:
11
- """Describe a summary of a document."""
12
-
13
- title: str
14
- summary: str
15
-
16
-
17
- @dataclasses.dataclass
18
- class Relationship:
19
- """
20
- Describe a relationship between two entities.
21
- Subject and object should be Core CocoIndex concepts only, should be nouns. For example, `CocoIndex`, `Incremental Processing`, `ETL`, `Data` etc.
22
- """
23
-
24
- subject: str
25
- predicate: str
26
- object: str
27
-
28
-
29
8
  neo4j_conn_spec = cocoindex.add_auth_entry(
30
9
  "Neo4jConnection",
31
10
  cocoindex.storages.Neo4jConnection(
@@ -41,19 +20,43 @@ kuzu_conn_spec = cocoindex.add_auth_entry(
41
20
  ),
42
21
  )
43
22
 
44
- # Use Neo4j as the graph database
23
+ # SELECT ONE GRAPH DATABASE TO USE
24
+ # This example can use either Neo4j or Kuzu as the graph database.
25
+ # Please make sure only one branch is live and others are commented out.
26
+
27
+ # Use Neo4j
45
28
  GraphDbSpec = cocoindex.storages.Neo4j
46
29
  GraphDbConnection = cocoindex.storages.Neo4jConnection
47
30
  GraphDbDeclaration = cocoindex.storages.Neo4jDeclaration
48
31
  conn_spec = neo4j_conn_spec
49
32
 
50
- # Use Kuzu as the graph database
33
+ # Use Kuzu
51
34
  # GraphDbSpec = cocoindex.storages.Kuzu
52
35
  # GraphDbConnection = cocoindex.storages.KuzuConnection
53
36
  # GraphDbDeclaration = cocoindex.storages.KuzuDeclaration
54
37
  # conn_spec = kuzu_conn_spec
55
38
 
56
39
 
40
+ @dataclasses.dataclass
41
+ class DocumentSummary:
42
+ """Describe a summary of a document."""
43
+
44
+ title: str
45
+ summary: str
46
+
47
+
48
+ @dataclasses.dataclass
49
+ class Relationship:
50
+ """
51
+ Describe a relationship between two entities.
52
+ Subject and object should be Core CocoIndex concepts only, should be nouns. For example, `CocoIndex`, `Incremental Processing`, `ETL`, `Data` etc.
53
+ """
54
+
55
+ subject: str
56
+ predicate: str
57
+ object: str
58
+
59
+
57
60
  @cocoindex.flow_def(name="DocsToKG")
58
61
  def docs_to_kg_flow(
59
62
  flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
@@ -14,7 +14,6 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c
14
14
  - Qdrant for Vector Storage
15
15
  - FastAPI for backend
16
16
 
17
-
18
17
  ## Setup
19
18
  - Make sure Postgres and Qdrant are running
20
19
  ```
@@ -22,20 +21,6 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c
22
21
  export COCOINDEX_DATABASE_URL="postgres://cocoindex:cocoindex@localhost/cocoindex"
23
22
  ```
24
23
 
25
- - Create Qdrant Collection
26
- ```
27
- curl -X PUT 'http://localhost:6333/collections/image_search' \
28
- -H 'Content-Type: application/json' \
29
- -d '{
30
- "vectors": {
31
- "embedding": {
32
- "size": 768,
33
- "distance": "Cosine"
34
- }
35
- }
36
- }'
37
- ```
38
-
39
24
  ## Run
40
25
  - Install dependencies:
41
26
  ```
@@ -9,7 +9,8 @@ Please drop [CocoIndex on Github](https://github.com/cocoindex-io/cocoindex) a s
9
9
 
10
10
  ## Prerequisite
11
11
  * [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one.
12
- * [Install Neo4j](https://cocoindex.io/docs/ops/storages#neo4j) if you don't have one.
12
+ * Install [Neo4j](https://cocoindex.io/docs/ops/storages#neo4j-dev-instance) or [Kuzu](https://cocoindex.io/docs/ops/storages#kuzu-dev-instance) if you don't have one.
13
+ * The example uses Neo4j by default for now. If you want to use Kuzu, find the "SELECT ONE GRAPH DATABASE TO USE" section in `main.py` and switch the active branch.
13
14
  * [Configure your OpenAI API key](https://cocoindex.io/docs/ai/llm#openai).
14
15
 
15
16
  ## Documentation
@@ -39,18 +40,17 @@ cocoindex update main.py
39
40
 
40
41
  ### Browse the knowledge graph
41
42
 
42
- After the knowledge graph is built, you can explore the knowledge graph you built in Neo4j Browser.
43
+ After the knowledge graph is built, you can explore the knowledge graph.
43
44
 
44
- For the dev enviroment, you can connect neo4j browser using credentials:
45
- - username: `neo4j`
46
- - password: `cocoindex`
47
- which is pre-configured in the our docker compose [config.yaml](https://raw.githubusercontent.com/cocoindex-io/cocoindex/refs/heads/main/dev/neo4j.yaml).
45
+ * If you're using Neo4j, you can open the explorer at [http://localhost:7474](http://localhost:7474), with username `neo4j` and password `cocoindex`.
46
+ * If you're using Kuzu, you can start a Kuzu explorer locally. See [Kuzu dev instance](https://cocoindex.io/docs/ops/storages#kuzu-dev-instance) for more details.
48
47
 
49
- You can open it at [http://localhost:7474](http://localhost:7474), and run the following Cypher query to get all relationships:
48
+ You can run the following Cypher query to get all relationships:
50
49
 
51
50
  ```cypher
52
51
  MATCH p=()-->() RETURN p
53
52
  ```
53
+
54
54
  ![Neo4j Browser Screenshot](img/neo4j.png)
55
55
 
56
56
  ## CocoInsight
@@ -7,6 +7,38 @@ import datetime
7
7
  import cocoindex
8
8
  from jinja2 import Template
9
9
 
10
+ neo4j_conn_spec = cocoindex.add_auth_entry(
11
+ "Neo4jConnection",
12
+ cocoindex.storages.Neo4jConnection(
13
+ uri="bolt://localhost:7687",
14
+ user="neo4j",
15
+ password="cocoindex",
16
+ ),
17
+ )
18
+ kuzu_conn_spec = cocoindex.add_auth_entry(
19
+ "KuzuConnection",
20
+ cocoindex.storages.KuzuConnection(
21
+ api_server_url="http://localhost:8123",
22
+ ),
23
+ )
24
+
25
+ # SELECT ONE GRAPH DATABASE TO USE
26
+ # This example can use either Neo4j or Kuzu as the graph database.
27
+ # Please make sure only one branch is live and others are commented out.
28
+
29
+ # Use Neo4j
30
+ GraphDbSpec = cocoindex.storages.Neo4j
31
+ GraphDbConnection = cocoindex.storages.Neo4jConnection
32
+ GraphDbDeclaration = cocoindex.storages.Neo4jDeclaration
33
+ conn_spec = neo4j_conn_spec
34
+
35
+ # Use Kuzu
36
+ # GraphDbSpec = cocoindex.storages.Kuzu
37
+ # GraphDbConnection = cocoindex.storages.KuzuConnection
38
+ # GraphDbDeclaration = cocoindex.storages.KuzuDeclaration
39
+ # conn_spec = kuzu_conn_spec
40
+
41
+
10
42
  # Template for rendering product information as markdown to provide information to LLMs
11
43
  PRODUCT_TEMPLATE = """
12
44
  # {{ title }}
@@ -77,34 +109,6 @@ def extract_product_info(product: cocoindex.Json, filename: str) -> ProductInfo:
77
109
  )
78
110
 
79
111
 
80
- neo4j_conn_spec = cocoindex.add_auth_entry(
81
- "Neo4jConnection",
82
- cocoindex.storages.Neo4jConnection(
83
- uri="bolt://localhost:7687",
84
- user="neo4j",
85
- password="cocoindex",
86
- ),
87
- )
88
- kuzu_conn_spec = cocoindex.add_auth_entry(
89
- "KuzuConnection",
90
- cocoindex.storages.KuzuConnection(
91
- api_server_url="http://localhost:8123",
92
- ),
93
- )
94
-
95
- # Use Neo4j as the graph database
96
- GraphDbSpec = cocoindex.storages.Neo4j
97
- GraphDbConnection = cocoindex.storages.Neo4jConnection
98
- GraphDbDeclaration = cocoindex.storages.Neo4jDeclaration
99
- conn_spec = neo4j_conn_spec
100
-
101
- # Use Kuzu as the graph database
102
- # GraphDbSpec = cocoindex.storages.Kuzu
103
- # GraphDbConnection = cocoindex.storages.KuzuConnection
104
- # GraphDbDeclaration = cocoindex.storages.KuzuDeclaration
105
- # conn_spec = kuzu_conn_spec
106
-
107
-
108
112
  @cocoindex.flow_def(name="StoreProduct")
109
113
  def store_product_flow(
110
114
  flow_builder: cocoindex.FlowBuilder, data_scope: cocoindex.DataScope
@@ -19,7 +19,6 @@ We appreciate a star ⭐ at [CocoIndex Github](https://github.com/cocoindex-io/c
19
19
  ### Query
20
20
  We use Qdrant client to query the index, and reuse the embedding operation in the indexing flow.
21
21
 
22
-
23
22
  ## Pre-requisites
24
23
 
25
24
  - [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one. Although the target store is Qdrant, CocoIndex uses Postgres to track the data lineage for incremental processing.
@@ -30,24 +29,6 @@ We use Qdrant client to query the index, and reuse the embedding operation in th
30
29
  docker run -d -p 6334:6334 -p 6333:6333 qdrant/qdrant
31
30
  ```
32
31
 
33
- - [Create a collection](https://qdrant.tech/documentation/concepts/vectors/#named-vectors) to export the embeddings to.
34
-
35
- ```bash
36
- curl -X PUT \
37
- 'http://localhost:6333/collections/cocoindex' \
38
- --header 'Content-Type: application/json' \
39
- --data-raw '{
40
- "vectors": {
41
- "text_embedding": {
42
- "size": 384,
43
- "distance": "Cosine"
44
- }
45
- }
46
- }'
47
- ```
48
-
49
- You can view the collections and data with the Qdrant dashboard at <http://localhost:6333/dashboard>.
50
-
51
32
  ## Run
52
33
 
53
34
  - Install dependencies:
@@ -62,6 +43,9 @@ We use Qdrant client to query the index, and reuse the embedding operation in th
62
43
  cocoindex setup main.py
63
44
  ```
64
45
 
46
+ This command automatically creates a collection in Qdrant.
47
+ You can view the collections and data with the Qdrant dashboard at <http://localhost:6333/dashboard>.
48
+
65
49
  - Update index:
66
50
 
67
51
  ```bash
@@ -1,6 +1,7 @@
1
1
  """All builtin functions."""
2
2
 
3
3
  from typing import Annotated, Any, TYPE_CHECKING
4
+ import dataclasses
4
5
 
5
6
  from .typing import Float32, Vector, TypeAttr
6
7
  from . import op, llm
@@ -14,9 +15,20 @@ class ParseJson(op.FunctionSpec):
14
15
  """Parse a text into a JSON object."""
15
16
 
16
17
 
18
+ @dataclasses.dataclass
19
+ class CustomLanguageSpec:
20
+ """Custom language specification."""
21
+
22
+ language_name: str
23
+ separators_regex: list[str]
24
+ aliases: list[str] = dataclasses.field(default_factory=list)
25
+
26
+
17
27
  class SplitRecursively(op.FunctionSpec):
18
28
  """Split a document (in string) recursively."""
19
29
 
30
+ custom_languages: list[CustomLanguageSpec] = dataclasses.field(default_factory=list)
31
+
20
32
 
21
33
  class ExtractByLlm(op.FunctionSpec):
22
34
  """Extract information from a text using a LLM."""
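The hunk above adds a `CustomLanguageSpec` dataclass and a `custom_languages` field on `SplitRecursively`. As a rough illustration of how such a spec might be filled in, here is a minimal sketch. It assumes a local stand-in copy of the dataclass (it does not import `cocoindex`), and the helper `split_by_first_matching_separator` is hypothetical: the real splitting logic lives in `src/ops/functions/split_recursively.rs` and may behave differently. The language name `ini_like` and its separators are made up for the example.

```python
import dataclasses
import re


@dataclasses.dataclass
class CustomLanguageSpec:
    """Local stand-in mirroring the dataclass added in this diff."""

    language_name: str
    separators_regex: list[str]
    aliases: list[str] = dataclasses.field(default_factory=list)


def split_by_first_matching_separator(
    text: str, spec: CustomLanguageSpec
) -> list[str]:
    """Hypothetical helper: split on the first separator regex that matches.

    Separators are tried in order, so earlier entries act as coarser
    boundaries. This is an assumption for illustration only, not the
    library's algorithm.
    """
    for pattern in spec.separators_regex:
        if re.search(pattern, text):
            # Drop empty pieces produced by leading or trailing separators.
            return [piece for piece in re.split(pattern, text) if piece]
    return [text]


# A made-up language: blank lines separate sections, newlines separate entries.
spec = CustomLanguageSpec(
    language_name="ini_like",
    separators_regex=[r"\n\n+", r"\n"],
    aliases=["ini"],
)

chunks = split_by_first_matching_separator("[a]\nx=1\n\n[b]\ny=2", spec)
print(chunks)
```

Because `aliases` uses `default_factory=list`, each spec gets its own empty list when the argument is omitted, which avoids the shared-mutable-default pitfall.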