kodit 0.3.2__tar.gz → 0.3.4__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



Files changed (278)
  1. kodit-0.3.4/.claude/commands/update-docs.md +79 -0
  2. {kodit-0.3.2 → kodit-0.3.4}/.dockerignore +1 -1
  3. {kodit-0.3.2 → kodit-0.3.4}/.github/workflows/docker.yaml +22 -0
  4. {kodit-0.3.2 → kodit-0.3.4}/CLAUDE.md +14 -8
  5. {kodit-0.3.2 → kodit-0.3.4}/Dockerfile +5 -1
  6. {kodit-0.3.2 → kodit-0.3.4}/PKG-INFO +9 -4
  7. {kodit-0.3.2 → kodit-0.3.4}/README.md +8 -3
  8. kodit-0.3.4/docs/MIGRATION_TO_INDEX_AGGREGATE.md +222 -0
  9. {kodit-0.3.2 → kodit-0.3.4}/docs/_index.md +8 -3
  10. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/indexing/index.md +64 -13
  11. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/mcp/index.md +17 -6
  12. {kodit-0.3.2 → kodit-0.3.4}/pyproject.toml +5 -0
  13. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/_version.py +2 -2
  14. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/application/factories/code_indexing_factory.py +56 -29
  15. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/application/services/code_indexing_application_service.py +152 -118
  16. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/cli.py +14 -41
  17. kodit-0.3.4/src/kodit/domain/entities.py +271 -0
  18. kodit-0.3.4/src/kodit/domain/protocols.py +61 -0
  19. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/services/embedding_service.py +1 -1
  20. kodit-0.3.4/src/kodit/domain/services/index_query_service.py +66 -0
  21. kodit-0.3.4/src/kodit/domain/services/index_service.py +282 -0
  22. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/value_objects.py +143 -65
  23. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/cloning/git/working_copy.py +17 -8
  24. kodit-0.3.4/src/kodit/infrastructure/cloning/metadata.py +98 -0
  25. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/embedding_factory.py +1 -1
  26. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/local_vector_search_repository.py +1 -1
  27. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/vectorchord_vector_search_repository.py +1 -1
  28. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/enrichment/null_enrichment_provider.py +4 -10
  29. kodit-0.3.4/src/kodit/infrastructure/git/git_utils.py +25 -0
  30. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/ignore/ignore_pattern_provider.py +1 -2
  31. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/indexing/auto_indexing_service.py +2 -12
  32. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/indexing/fusion_service.py +1 -1
  33. kodit-0.3.4/src/kodit/infrastructure/mappers/__init__.py +1 -0
  34. kodit-0.3.4/src/kodit/infrastructure/mappers/index_mapper.py +344 -0
  35. kodit-0.3.4/src/kodit/infrastructure/slicing/__init__.py +1 -0
  36. kodit-0.3.4/src/kodit/infrastructure/slicing/language_detection_service.py +18 -0
  37. kodit-0.3.4/src/kodit/infrastructure/slicing/slicer.py +894 -0
  38. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/sqlalchemy/embedding_repository.py +1 -1
  39. {kodit-0.3.2/src/kodit/domain → kodit-0.3.4/src/kodit/infrastructure/sqlalchemy}/entities.py +3 -0
  40. kodit-0.3.4/src/kodit/infrastructure/sqlalchemy/index_repository.py +579 -0
  41. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/mcp.py +0 -7
  42. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/env.py +1 -1
  43. kodit-0.3.4/src/kodit/migrations/versions/4073b33f9436_add_file_processing_flag.py +36 -0
  44. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/versions/4552eb3f23ce_add_summary.py +4 -4
  45. kodit-0.3.4/src/kodit/migrations/versions/7c3bbc2ab32b_add_embeddings_table.py +55 -0
  46. kodit-0.3.4/src/kodit/migrations/versions/85155663351e_initial.py +98 -0
  47. kodit-0.3.4/src/kodit/migrations/versions/c3f5137d30f5_index_all_the_things.py +50 -0
  48. kodit-0.3.4/src/kodit/utils/__init__.py +1 -0
  49. kodit-0.3.4/src/kodit/utils/path_utils.py +54 -0
  50. {kodit-0.3.2 → kodit-0.3.4}/tests/conftest.py +10 -2
  51. kodit-0.3.4/tests/kodit/application/test_code_indexing_application_service.py +258 -0
  52. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/domain/enrichment_domain_service_test.py +1 -1
  53. kodit-0.3.2/tests/kodit/infrastructure/git/test_git_utils.py → kodit-0.3.4/tests/kodit/domain/entities_test.py +25 -25
  54. kodit-0.3.4/tests/kodit/domain/services/__init__.py +1 -0
  55. kodit-0.3.4/tests/kodit/domain/services/index_service_test.py +162 -0
  56. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/domain/test_embedding_service.py +1 -1
  57. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/domain/test_multi_search_result.py +0 -18
  58. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/bm25/vectorchord_bm25_repository_test.py +10 -1
  59. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/cloning/git_cloning/working_copy_test.py +3 -3
  60. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/test_embedding_integration.py +15 -8
  61. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/test_local_vector_search_repository.py +10 -8
  62. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/test_vectorchord_vector_search_repository.py +10 -1
  63. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/enrichment/enrichment_provider/test_null_enrichment_provider.py +11 -7
  64. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/indexing/test_auto_indexing_service.py +21 -59
  65. kodit-0.3.4/tests/kodit/infrastructure/mappers/__init__.py +1 -0
  66. kodit-0.3.4/tests/kodit/infrastructure/mappers/test_index_mapper.py +184 -0
  67. kodit-0.3.4/tests/kodit/infrastructure/slicing/__init__.py +1 -0
  68. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/__init__.py +1 -0
  69. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/c/main.c +72 -0
  70. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/c/models.c +75 -0
  71. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/c/models.h +99 -0
  72. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/c/utils.c +17 -0
  73. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/c/utils.h +33 -0
  74. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/cpp/main.cpp +85 -0
  75. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/cpp/models.cpp +39 -0
  76. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/cpp/models.hpp +98 -0
  77. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/cpp/utils.cpp +40 -0
  78. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/cpp/utils.hpp +56 -0
  79. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/csharp/Main.cs +52 -0
  80. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/csharp/Models.cs +89 -0
  81. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/csharp/Utils.cs +85 -0
  82. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/css/components.css +428 -0
  83. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/css/main.css +259 -0
  84. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/css/utilities.css +456 -0
  85. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/go/main.go +79 -0
  86. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/go/models.go +75 -0
  87. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/go/utils.go +45 -0
  88. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/html/components.html +165 -0
  89. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/html/forms.html +344 -0
  90. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/html/main.html +72 -0
  91. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/java/Main.java +75 -0
  92. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/java/Models.java +108 -0
  93. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/java/Utils.java +74 -0
  94. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/javascript/main.js +66 -0
  95. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/javascript/models.js +87 -0
  96. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/javascript/utils.js +61 -0
  97. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/python/__init__.py +1 -0
  98. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/python/main.py +55 -0
  99. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/python/models.py +54 -0
  100. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/python/utils.py +27 -0
  101. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/rust/main.rs +58 -0
  102. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/rust/models.rs +84 -0
  103. kodit-0.3.4/tests/kodit/infrastructure/slicing/data/rust/utils.rs +50 -0
  104. kodit-0.3.4/tests/kodit/infrastructure/slicing/slicer_test.py +830 -0
  105. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/sqlalchemy/test_embedding_repository.py +1 -1
  106. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/mcp_test.py +9 -1
  107. {kodit-0.3.2 → kodit-0.3.4}/tests/performance/similarity.py +4 -4
  108. kodit-0.3.4/tests/utils/__init__.py +1 -0
  109. kodit-0.3.4/tests/utils/test_path_utils.py +42 -0
  110. kodit-0.3.2/src/kodit/domain/enums.py +0 -9
  111. kodit-0.3.2/src/kodit/domain/repositories.py +0 -128
  112. kodit-0.3.2/src/kodit/domain/services/ignore_service.py +0 -45
  113. kodit-0.3.2/src/kodit/domain/services/indexing_service.py +0 -204
  114. kodit-0.3.2/src/kodit/domain/services/snippet_extraction_service.py +0 -89
  115. kodit-0.3.2/src/kodit/domain/services/snippet_service.py +0 -215
  116. kodit-0.3.2/src/kodit/domain/services/source_service.py +0 -85
  117. kodit-0.3.2/src/kodit/infrastructure/cloning/folder/__init__.py +0 -1
  118. kodit-0.3.2/src/kodit/infrastructure/cloning/folder/factory.py +0 -128
  119. kodit-0.3.2/src/kodit/infrastructure/cloning/folder/working_copy.py +0 -38
  120. kodit-0.3.2/src/kodit/infrastructure/cloning/git/factory.py +0 -153
  121. kodit-0.3.2/src/kodit/infrastructure/cloning/metadata.py +0 -128
  122. kodit-0.3.2/src/kodit/infrastructure/git/git_utils.py +0 -87
  123. kodit-0.3.2/src/kodit/infrastructure/indexing/index_repository.py +0 -286
  124. kodit-0.3.2/src/kodit/infrastructure/indexing/snippet_domain_service_factory.py +0 -37
  125. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/__init__.py +0 -1
  126. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/language_detection_service.py +0 -39
  127. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/languages/csharp.scm +0 -12
  128. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/languages/go.scm +0 -26
  129. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/languages/java.scm +0 -12
  130. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/languages/javascript.scm +0 -24
  131. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/languages/python.scm +0 -22
  132. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/languages/typescript.scm +0 -25
  133. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/snippet_extraction_factory.py +0 -67
  134. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/snippet_query_provider.py +0 -45
  135. kodit-0.3.2/src/kodit/infrastructure/snippet_extraction/tree_sitter_snippet_extractor.py +0 -182
  136. kodit-0.3.2/src/kodit/infrastructure/sqlalchemy/file_repository.py +0 -78
  137. kodit-0.3.2/src/kodit/infrastructure/sqlalchemy/repository.py +0 -133
  138. kodit-0.3.2/src/kodit/infrastructure/sqlalchemy/snippet_repository.py +0 -259
  139. kodit-0.3.2/src/kodit/migrations/versions/7c3bbc2ab32b_add_embeddings_table.py +0 -47
  140. kodit-0.3.2/src/kodit/migrations/versions/85155663351e_initial.py +0 -82
  141. kodit-0.3.2/src/kodit/migrations/versions/c3f5137d30f5_index_all_the_things.py +0 -44
  142. kodit-0.3.2/tests/kodit/application/test_code_indexing_application_service.py +0 -496
  143. kodit-0.3.2/tests/kodit/domain/snippet_domain_service_test.py +0 -314
  144. kodit-0.3.2/tests/kodit/domain/snippet_extraction_domain_service_test.py +0 -185
  145. kodit-0.3.2/tests/kodit/infrastructure/cloning/git_cloning/factory_test.py +0 -221
  146. kodit-0.3.2/tests/kodit/infrastructure/git/__init__.py +0 -1
  147. kodit-0.3.2/tests/kodit/infrastructure/indexing/indexing_repository_test.py +0 -129
  148. kodit-0.3.2/tests/kodit/infrastructure/source/__init__.py +0 -1
  149. kodit-0.3.2/tests/kodit/infrastructure/source/source_service_test.py +0 -192
  150. kodit-0.3.2/tests/kodit/infrastructure/sqlalchemy/test_snippet_repository.py +0 -373
  151. {kodit-0.3.2 → kodit-0.3.4}/.claude/commands/debug.md +0 -0
  152. {kodit-0.3.2 → kodit-0.3.4}/.claude/commands/new-requirement.md +0 -0
  153. {kodit-0.3.2 → kodit-0.3.4}/.claude/commands/refactor.md +0 -0
  154. {kodit-0.3.2 → kodit-0.3.4}/.claude/settings.json +0 -0
  155. {kodit-0.3.2 → kodit-0.3.4}/.cursor/rules/kodit.mdc +0 -0
  156. {kodit-0.3.2 → kodit-0.3.4}/.cursor/rules/style.mdc +0 -0
  157. {kodit-0.3.2 → kodit-0.3.4}/.github/CODE_OF_CONDUCT.md +0 -0
  158. {kodit-0.3.2 → kodit-0.3.4}/.github/CONTRIBUTING.md +0 -0
  159. {kodit-0.3.2 → kodit-0.3.4}/.github/ISSUE_TEMPLATE/bug_report.md +0 -0
  160. {kodit-0.3.2 → kodit-0.3.4}/.github/ISSUE_TEMPLATE/feature_request.md +0 -0
  161. {kodit-0.3.2 → kodit-0.3.4}/.github/PULL_REQUEST_TEMPLATE.md +0 -0
  162. {kodit-0.3.2 → kodit-0.3.4}/.github/dependabot.yml +0 -0
  163. {kodit-0.3.2 → kodit-0.3.4}/.github/workflows/docs.yaml +0 -0
  164. {kodit-0.3.2 → kodit-0.3.4}/.github/workflows/pull_request.yaml +0 -0
  165. {kodit-0.3.2 → kodit-0.3.4}/.github/workflows/pypi-test.yaml +0 -0
  166. {kodit-0.3.2 → kodit-0.3.4}/.github/workflows/pypi.yaml +0 -0
  167. {kodit-0.3.2 → kodit-0.3.4}/.github/workflows/test.yaml +0 -0
  168. {kodit-0.3.2 → kodit-0.3.4}/.gitignore +0 -0
  169. {kodit-0.3.2 → kodit-0.3.4}/.python-version +0 -0
  170. {kodit-0.3.2 → kodit-0.3.4}/.vscode/launch.json +0 -0
  171. {kodit-0.3.2 → kodit-0.3.4}/.vscode/settings.json +0 -0
  172. {kodit-0.3.2 → kodit-0.3.4}/LICENSE +0 -0
  173. {kodit-0.3.2 → kodit-0.3.4}/alembic.ini +0 -0
  174. {kodit-0.3.2 → kodit-0.3.4}/docs/demos/_index.md +0 -0
  175. {kodit-0.3.2 → kodit-0.3.4}/docs/demos/go-simple-microservice/index.md +0 -0
  176. {kodit-0.3.2 → kodit-0.3.4}/docs/demos/knock-knock-auth/index.md +0 -0
  177. {kodit-0.3.2 → kodit-0.3.4}/docs/developer/index.md +0 -0
  178. {kodit-0.3.2 → kodit-0.3.4}/docs/getting-started/_index.md +0 -0
  179. {kodit-0.3.2 → kodit-0.3.4}/docs/getting-started/installation/index.md +0 -0
  180. {kodit-0.3.2 → kodit-0.3.4}/docs/getting-started/integration/index.md +0 -0
  181. {kodit-0.3.2 → kodit-0.3.4}/docs/getting-started/quick-start/index.md +0 -0
  182. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/_index.md +0 -0
  183. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/configuration/index.md +0 -0
  184. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/deployment/docker-compose.yaml +0 -0
  185. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/deployment/index.md +0 -0
  186. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/deployment/kubernetes.yaml +0 -0
  187. {kodit-0.3.2 → kodit-0.3.4}/docs/reference/telemetry/index.md +0 -0
  188. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/.gitignore +0 -0
  189. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/__init__.py +0 -0
  190. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/app.py +0 -0
  191. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/application/__init__.py +0 -0
  192. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/application/factories/__init__.py +0 -0
  193. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/application/services/__init__.py +0 -0
  194. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/config.py +0 -0
  195. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/database.py +0 -0
  196. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/__init__.py +0 -0
  197. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/errors.py +0 -0
  198. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/interfaces.py +0 -0
  199. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/services/__init__.py +0 -0
  200. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/services/bm25_service.py +0 -0
  201. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/domain/services/enrichment_service.py +0 -0
  202. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/__init__.py +0 -0
  203. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/bm25/__init__.py +0 -0
  204. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/bm25/bm25_factory.py +0 -0
  205. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/bm25/local_bm25_repository.py +0 -0
  206. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/bm25/vectorchord_bm25_repository.py +0 -0
  207. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/cloning/__init__.py +0 -0
  208. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/cloning/git/__init__.py +0 -0
  209. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/__init__.py +0 -0
  210. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/embedding_providers/__init__.py +0 -0
  211. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/embedding_providers/batching.py +0 -0
  212. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/embedding_providers/hash_embedding_provider.py +0 -0
  213. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/embedding_providers/local_embedding_provider.py +0 -0
  214. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/embedding/embedding_providers/openai_embedding_provider.py +0 -0
  215. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/enrichment/__init__.py +0 -0
  216. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/enrichment/enrichment_factory.py +0 -0
  217. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/enrichment/local_enrichment_provider.py +0 -0
  218. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/enrichment/openai_enrichment_provider.py +0 -0
  219. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/git/__init__.py +0 -0
  220. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/ignore/__init__.py +0 -0
  221. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/indexing/__init__.py +0 -0
  222. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/indexing/indexing_factory.py +0 -0
  223. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/sqlalchemy/__init__.py +0 -0
  224. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/ui/__init__.py +0 -0
  225. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/ui/progress.py +0 -0
  226. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/infrastructure/ui/spinner.py +0 -0
  227. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/log.py +0 -0
  228. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/middleware.py +0 -0
  229. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/README +0 -0
  230. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/__init__.py +0 -0
  231. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/script.py.mako +0 -0
  232. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/versions/9e53ea8bb3b0_add_authors.py +0 -0
  233. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/migrations/versions/__init__.py +0 -0
  234. {kodit-0.3.2 → kodit-0.3.4}/src/kodit/reporting.py +0 -0
  235. {kodit-0.3.2 → kodit-0.3.4}/tests/__init__.py +0 -0
  236. {kodit-0.3.2 → kodit-0.3.4}/tests/docker-smoke.sh +0 -0
  237. {kodit-0.3.2 → kodit-0.3.4}/tests/experiments/__init__.py +0 -0
  238. {kodit-0.3.2 → kodit-0.3.4}/tests/experiments/cline_prompt_tests/__init__.py +0 -0
  239. {kodit-0.3.2 → kodit-0.3.4}/tests/experiments/cline_prompt_tests/cline_prompt.txt +0 -0
  240. {kodit-0.3.2 → kodit-0.3.4}/tests/experiments/cline_prompt_tests/cline_prompt_test.py +0 -0
  241. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/__init__.py +0 -0
  242. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/application/__init__.py +0 -0
  243. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/cli_test.py +0 -0
  244. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/config_test.py +0 -0
  245. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/domain/__init__.py +0 -0
  246. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/domain/bm25_domain_service_test.py +0 -0
  247. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/domain/test_language_mapping.py +0 -0
  248. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/e2e.py +0 -0
  249. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/__init__.py +0 -0
  250. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/bm25/__init__.py +0 -0
  251. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/bm25/local_bm25_repository_test.py +0 -0
  252. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/cloning/git_cloning/__init__.py +0 -0
  253. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/__init__.py +0 -0
  254. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/embedding_factory_test.py +0 -0
  255. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/embedding_provider/__init__.py +0 -0
  256. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/embedding_provider/test_hash_embedding_provider.py +0 -0
  257. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/embedding_provider/test_local_embedding_provider.py +0 -0
  258. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/embedding_provider/test_openai_embedding_provider.py +0 -0
  259. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/embedding/test_batching.py +0 -0
  260. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/enrichment/__init__.py +0 -0
  261. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/enrichment/enrichment_provider/__init__.py +0 -0
  262. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/enrichment/enrichment_provider/test_local_enrichment_provider.py +0 -0
  263. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/enrichment/enrichment_provider/test_openai_enrichment_provider.py +0 -0
  264. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/enrichment/test_enrichment_factory.py +0 -0
  265. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/indexing/__init__.py +0 -0
  266. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/__init__.py +0 -0
  267. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/csharp.cs +0 -0
  268. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/golang.go +0 -0
  269. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/javascript.js +0 -0
  270. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/knock_knock_server.py +0 -0
  271. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/python.py +0 -0
  272. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/snippets/typescript.tsx +0 -0
  273. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/infrastructure/sqlalchemy/__init__.py +0 -0
  274. {kodit-0.3.2 → kodit-0.3.4}/tests/kodit/log_test.py +0 -0
  275. {kodit-0.3.2 → kodit-0.3.4}/tests/performance/__init__.py +0 -0
  276. {kodit-0.3.2 → kodit-0.3.4}/tests/smoke.sh +0 -0
  277. {kodit-0.3.2 → kodit-0.3.4}/tests/vectorchord-smoke.sh +0 -0
  278. {kodit-0.3.2 → kodit-0.3.4}/uv.lock +0 -0
kodit-0.3.4/.claude/commands/update-docs.md
@@ -0,0 +1,79 @@
+ # Update Documentation Based on Recent Code Changes
+
+ ## Objective
+
+ Analyze recent commits in the current Git branch and update relevant documentation to reflect code changes.
+
+ ## Steps to Complete
+
+ ### 1. Analyze Recent Commits
+
+ - Run `git log --oneline -n 20` to view the last 20 commits on the current branch, or
+ all commits if it is not the main branch.
+ - For each commit, run `git show --name-status <commit-hash>` to see which files were modified
+ - Focus on commits that modified source code files (`.js`, `.py`, `.ts`, `.java`, `.go`, `.rs`, etc.)
+
+ ### 2. Identify Code Changes
+
+ For each modified source file in recent commits:
+
+ - Examine the diff using `git diff <commit-hash>^ <commit-hash> -- <file-path>`
+ - Identify:
+ - New functions or methods added
+ - Functions or methods removed or renamed
+ - Changes to function signatures (parameters, return types)
+ - New classes or modules
+ - Changes to public APIs
+ - New configuration options or environment variables
+ - Breaking changes
+
+ ### 3. Update README.md
+
+ Check if the README.md needs updates for:
+
+ - **Installation instructions**: If dependencies or setup steps changed
+ - **Usage examples**: If APIs or interfaces changed
+ - **Configuration**: If new environment variables or config options were added
+ - **Features list**: If new features were implemented
+ - **Quick start guide**: If the basic usage pattern changed
+
+ ### 4. Update Documentation in /docs
+
+ For each markdown file in the `docs/` folder:
+
+ - Check if it references any of the changed code
+ - Update:
+ - API documentation with new/changed function signatures
+ - Code examples that may no longer work
+ - Configuration guides if settings changed
+ - Architecture diagrams if structural changes occurred
+ - Migration guides if there are breaking changes
+
+ ### 5. Create or Update Specific Docs
+
+ Based on the changes found:
+
+ - If new features were added without documentation, create new doc files
+ - If breaking changes exist, create or update a migration guide
+ - If new APIs were added, ensure they have proper documentation
+
+ ### 6. Verify Documentation Accuracy
+
+ - Ensure all code examples in documentation are up-to-date
+ - Check that any referenced file paths still exist
+ - Verify that installation and setup instructions still work
+
+ ## Output Required
+
+ 1. Summary of commits analyzed and significant changes found
+ 2. List of documentation files updated with brief description of changes
+ 3. Any new documentation files created
+ 4. Warnings about potentially outdated documentation that needs manual review
+
+ ## Important Notes
+
+ - Focus on user-facing changes that affect how people use Kodit
+ - Don't document internal implementation details unless they affect the public API
+ - Keep documentation concise and example-driven
+ - If unsure about a change's impact, flag it for manual review
+ - Ensure all documentation follows the existing style and format in the repository
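Step 1 of the command file above asks for the files touched by each commit and then narrows the list to source files. A minimal sketch of that filtering step, assuming the text being parsed is the output of `git show --name-status <commit-hash>`; the function name `changed_source_files` and the extension set are illustrative and not part of Kodit:

```python
from pathlib import PurePosixPath

# Assumed extension list, copied from the examples named in step 1 above.
SOURCE_EXTENSIONS = {".js", ".py", ".ts", ".java", ".go", ".rs"}


def changed_source_files(name_status_output: str) -> list[tuple[str, str]]:
    """Return (status, path) pairs for source files listed in name-status output."""
    results = []
    for line in name_status_output.splitlines():
        parts = line.split("\t")
        # Status lines are tab-separated: "M\tpath" or "R100\told\tnew".
        # Commit header and message lines have no leading status field.
        if len(parts) < 2 or not parts[0] or parts[0][0] not in "ACDMRT":
            continue
        status = parts[0][0]   # drop the similarity score from e.g. "R100"
        path = parts[-1]       # for renames, keep the new path
        if PurePosixPath(path).suffix in SOURCE_EXTENSIONS:
            results.append((status, path))
    return results
```

Documentation-only changes (for example a `.md` file) are dropped by the extension check, which matches the instruction to focus on commits that modified source code.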
{kodit-0.3.2 → kodit-0.3.4}/.dockerignore
@@ -188,6 +188,6 @@ docs/
  .vscode/
  .gitignore
  LICENSE
- .python-version
+ .python-version # Set in the Dockerfile
  .dockerignore
  tests/
{kodit-0.3.2 → kodit-0.3.4}/.github/workflows/docker.yaml
@@ -45,6 +45,9 @@ jobs:
  needs: test-build
  # Only run on main branch or when explicitly triggered
  if: github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository
+ strategy:
+ matrix:
+ python-version: ["3.12.8", "3.13.5"]
  steps:
  - name: Free up disk space
  run: sudo rm -rf /usr/local/lib/android /usr/share/dotnet || true
@@ -65,11 +68,28 @@ jobs:
  username: ${{ secrets.DOCKER_USERNAME }}
  password: ${{ secrets.DOCKER_PASSWORD }}

+ - name: Set short Python version
+ id: pyver
+ run: |
+ echo "SHORT_PY=$(echo '${{ matrix.python-version }}' | cut -d. -f1,2)" >> $GITHUB_OUTPUT
+ echo "ENABLE_DEFAULT=$([ \"$(echo '${{ matrix.python-version }}' | cut -d. -f1,2)\" = \"3.13\" ] && echo true || echo false)" >> $GITHUB_OUTPUT
+
  - name: Extract metadata (tags, labels) for Docker
  id: meta
  uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
  with:
  images: ${{ vars.REGISTRY }}/${{ vars.REGISTRY_ORG }}/${{ github.event.repository.name }}
+ tags: |
+ type=semver,pattern={{version}},enable=${{ steps.pyver.outputs.ENABLE_DEFAULT }}
+ type=semver,pattern={{version}},suffix=-py${{ steps.pyver.outputs.SHORT_PY }}
+ type=semver,pattern={{major}}.{{minor}},enable=${{ steps.pyver.outputs.ENABLE_DEFAULT }}
+ type=semver,pattern={{major}}.{{minor}},suffix=-py${{ steps.pyver.outputs.SHORT_PY }}
+ type=ref,event=branch,enable=${{ steps.pyver.outputs.ENABLE_DEFAULT }}
+ type=ref,event=branch,suffix=-py${{ steps.pyver.outputs.SHORT_PY }}
+ type=ref,event=pr,enable=${{ steps.pyver.outputs.ENABLE_DEFAULT }}
+ type=ref,event=pr,suffix=-py${{ steps.pyver.outputs.SHORT_PY }}
+ type=sha,enable=${{ steps.pyver.outputs.ENABLE_DEFAULT }}
+ type=sha,suffix=-py${{ steps.pyver.outputs.SHORT_PY }}

  - name: Build and push Docker image
  id: push
@@ -81,6 +101,8 @@ jobs:
  push: true
  tags: ${{ steps.meta.outputs.tags }}
  labels: ${{ steps.meta.outputs.labels }}
+ build-args: |
+ PYTHON_VERSION=${{ matrix.python-version }}

  - name: Generate artifact attestation
  uses: actions/attest-build-provenance@v2
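The workflow change above builds one image per Python version in the matrix and derives the tags from a shortened version string. The sketch below models that naming scheme as read from the hunk: every build gets a `-py<major.minor>` suffixed tag, and only the 3.13 build also gets the unsuffixed tag. The function names are hypothetical, and the real tag strings are produced by `docker/metadata-action`, not by this code:

```python
def short_python_version(full_version: str) -> str:
    """Mirror `cut -d. -f1,2`: keep the first two dot-separated fields."""
    return ".".join(full_version.split(".")[:2])


def image_tags(image_version: str, python_version: str,
               default_short: str = "3.13") -> list[str]:
    """Model the tag set for one matrix entry (an assumption, not the action's code)."""
    short = short_python_version(python_version)
    # Every build gets a suffixed tag such as "0.3.4-py3.12".
    tags = [f"{image_version}-py{short}"]
    # Only the default Python build also gets the plain tag (ENABLE_DEFAULT=true).
    if short == default_short:
        tags.insert(0, image_version)
    return tags
```

Under this reading, the 3.12.8 build never overwrites the plain version tag, so pulling the unsuffixed tag always yields the Python 3.13 image.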
{kodit-0.3.2 → kodit-0.3.4}/CLAUDE.md
@@ -50,18 +50,23 @@ The codebase follows Domain-Driven Design (DDD) with clean architecture:

  ### Key Components

- **Indexing Pipeline:**
+ **Advanced Indexing Pipeline:**

- 1. Clone/read source code
- 2. Extract snippets using Tree-sitter
- 3. Generate embeddings and BM25 indices
- 4. Store in database
+ 1. Clone/read source code with Git metadata extraction
+ 2. Language detection for 20+ programming languages
+ 3. Advanced snippet extraction using Tree-sitter with dependency analysis
+ 4. Build call graphs and import maps for context-aware extraction
+ 5. Generate embeddings and BM25 indices
+ 6. Store in database with selective reindexing for performance

- **Search System:**
+ **Advanced Search System:**

- - Hybrid search combining semantic (embeddings) and keyword (BM25)
+ - Hybrid search combining semantic (embeddings) and keyword (BM25) with Reciprocal Rank Fusion
+ - Multi-dimensional filtering: language, author, date range, source, file path
+ - Context-aware results with dependency tracking and usage examples
  - Multiple providers: local models, OpenAI, custom APIs
  - Configurable via environment variables
+ - Support for 20+ programming languages including HTML/CSS

  **MCP Server:**

@@ -96,4 +101,5 @@ Migrations managed with Alembic in `migrations/` directory. DO NOT EDIT THESE FI
  - Smoke tests for deployment validation
  - Performance tests for similarity search

- Test files mirror source structure under `tests/` directory.
+ Test file names should mirror the source structure under `tests/` directory and end in
+ the name `_test.py`.
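The updated CLAUDE.md text names Reciprocal Rank Fusion as the way semantic and BM25 results are combined. A minimal sketch of the general technique follows, assuming each retriever returns an ordered list of snippet IDs; the function name and the constant `k = 60` (a commonly used default) are illustrative and not taken from Kodit's `fusion_service.py`:

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists: each item scores sum(1 / (k + rank)) over the lists."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based; items absent from a list contribute nothing for it.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Because only ranks are used, the raw BM25 and cosine-similarity scores never need to be normalized against each other.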
{kodit-0.3.2 → kodit-0.3.4}/Dockerfile
@@ -30,9 +30,11 @@ COPY --from=ghcr.io/astral-sh/uv:0.7.2 /uv /usr/local/bin/uv
  ENV UV_LINK_MODE=copy \
  UV_COMPILE_BYTECODE=1 \
  UV_PYTHON_DOWNLOADS=never \
- UV_PYTHON=python3.13 \
  UV_PROJECT_ENVIRONMENT=/app

+ # Write the PYTHON_VERSION to a .python-version
+ RUN echo ${PYTHON_VERSION} > .python-version
+
  # Synchronize DEPENDENCIES without the application itself.
  # This layer is cached until uv.lock or pyproject.toml change, which are
  # only temporarily mounted into the build container since we don't need
@@ -43,6 +45,7 @@ ENV UV_LINK_MODE=copy \
  RUN --mount=type=cache,target=/root/.cache/uv \
  --mount=type=bind,source=uv.lock,target=uv.lock \
  --mount=type=bind,source=pyproject.toml,target=pyproject.toml \
+ UV_PYTHON="python$(echo ${PYTHON_VERSION} | cut -d. -f1-2)" \
  uv sync \
  --locked \
  --no-dev \
@@ -54,6 +57,7 @@ RUN --mount=type=cache,target=/root/.cache/uv \
  COPY . /src
  WORKDIR /src
  RUN --mount=type=cache,target=/root/.cache/uv \
+ UV_PYTHON="python$(echo ${PYTHON_VERSION} | cut -d. -f1-2)" \
  uv sync \
  --locked \
  --no-dev \
{kodit-0.3.2 → kodit-0.3.4}/PKG-INFO
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: kodit
- Version: 0.3.2
+ Version: 0.3.4
  Summary: Code indexing for better AI code generation
  Project-URL: Homepage, https://docs.helixml.tech/kodit/
  Project-URL: Documentation, https://docs.helixml.tech/kodit/
@@ -92,13 +92,16 @@ code. This index is used to build a snippet library, ready for ingestion into an
92
92
 
93
93
  - Index local directories and public Git repositories
94
94
  - Build comprehensive snippet libraries for LLM ingestion
95
- - Support for multiple codebase types and languages
96
- - Efficient indexing and search capabilities
95
+ - Support for 20+ programming languages including Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, C#, HTML/CSS, and more
96
+ - Advanced code analysis with dependency tracking and call graph generation
97
+ - Intelligent snippet extraction with context-aware dependencies
98
+ - Efficient indexing with selective reindexing (only processes modified files)
97
99
  - Privacy first: respects .gitignore and .noindex files
98
100
  - **NEW in 0.3**: Auto-indexing configuration for shared server deployments
99
101
  - **NEW in 0.3**: Enhanced Git provider support including Azure DevOps
100
102
  - **NEW in 0.3**: Index private repositories via a PAT
101
103
  - **NEW in 0.3**: Improved progress monitoring and reporting during indexing
104
+ - **NEW in 0.3**: Advanced code slicing infrastructure with Tree-sitter parsing
102
105
 
103
106
  ### MCP Server
104
107
 
@@ -111,7 +114,9 @@ intent. Kodit has been tested to work well with:
111
114
  - [Cursor](https://docs.helix.ml/kodit/getting-started/integration/#integration-with-cursor)
112
115
  - [Cline](https://docs.helix.ml/kodit/getting-started/integration/#integration-with-cline)
113
116
  - Please contribute more instructions! ... any other assistant is likely to work ...
114
- - **New in 0.3**: Filter snippets by source, language, author or timestamp.
117
+ - **New in 0.3**: Advanced search filters by source, language, author, date range, and file path
118
+ - **New in 0.3**: Hybrid search combining BM25 keyword search with semantic search
119
+ - **New in 0.4**: Enhanced MCP tools with rich context parameters and metadata
115
120
 
116
121
  ### Enterprise Ready
117
122
 
@@ -39,13 +39,16 @@ code. This index is used to build a snippet library, ready for ingestion into an
39
39
 
40
40
  - Index local directories and public Git repositories
41
41
  - Build comprehensive snippet libraries for LLM ingestion
42
- - Support for multiple codebase types and languages
43
- - Efficient indexing and search capabilities
42
+ - Support for 20+ programming languages including Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, C#, HTML/CSS, and more
43
+ - Advanced code analysis with dependency tracking and call graph generation
44
+ - Intelligent snippet extraction with context-aware dependencies
45
+ - Efficient indexing with selective reindexing (only processes modified files)
44
46
  - Privacy first: respects .gitignore and .noindex files
45
47
  - **NEW in 0.3**: Auto-indexing configuration for shared server deployments
46
48
  - **NEW in 0.3**: Enhanced Git provider support including Azure DevOps
47
49
  - **NEW in 0.3**: Index private repositories via a PAT
48
50
  - **NEW in 0.3**: Improved progress monitoring and reporting during indexing
51
+ - **NEW in 0.3**: Advanced code slicing infrastructure with Tree-sitter parsing
49
52
 
50
53
  ### MCP Server
51
54
 
@@ -58,7 +61,9 @@ intent. Kodit has been tested to work well with:
58
61
  - [Cursor](https://docs.helix.ml/kodit/getting-started/integration/#integration-with-cursor)
59
62
  - [Cline](https://docs.helix.ml/kodit/getting-started/integration/#integration-with-cline)
60
63
  - Please contribute more instructions! ... any other assistant is likely to work ...
61
- - **New in 0.3**: Filter snippets by source, language, author or timestamp.
64
+ - **New in 0.3**: Advanced search filters by source, language, author, date range, and file path
65
+ - **New in 0.3**: Hybrid search combining BM25 keyword search with semantic search
66
+ - **New in 0.4**: Enhanced MCP tools with rich context parameters and metadata
62
67
 
63
68
  ### Enterprise Ready
64
69
 
@@ -0,0 +1,222 @@
1
+ # Migration to Index Aggregate Architecture
2
+
3
+ This document outlines the strategy for migrating from the current multiple-service architecture to the new Index aggregate root design.
4
+
5
+ ## Current Architecture Issues
6
+
7
+ ### Code Analysis of `CodeIndexingApplicationService`
8
+
9
+ The current application service has several problems:
10
+
11
+ 1. **Service Proliferation**: 7+ domain services injected
12
+ 2. **Manual Orchestration**: Application layer contains complex business logic
13
+ 3. **Leaky Abstractions**: SQLAlchemy session management at application level
14
+ 4. **Scattered State**: Snippet state tracked across multiple services
15
+
16
+ ### Current Workflow Complexity
17
+
18
+ ```python
19
+ # Current approach - complex orchestration
20
+ async def run_index(self, index_id: int) -> None:
21
+ # 1. Get index from indexing service
22
+ index = await self.indexing_domain_service.get_index(index_id)
23
+
24
+ # 2. Delete old snippets via snippet service
25
+ await self.snippet_domain_service.delete_snippets_for_index(index.id)
26
+
27
+ # 3. Extract snippets via snippet service
28
+ snippets = await self.snippet_domain_service.extract_and_create_snippets(...)
29
+
30
+ # 4. Manual transaction management
31
+ await self.session.commit()
32
+
33
+ # 5. Create BM25 index via separate service
34
+ await self._create_bm25_index(snippets, progress_callback)
35
+
36
+ # 6. Create embeddings via separate service
37
+ await self._create_code_embeddings(snippets, progress_callback)
38
+
39
+ # 7. Enrich snippets via separate service
40
+ await self._enrich_snippets(snippets, progress_callback)
41
+
42
+ # 8. More embeddings via separate service
43
+ await self._create_text_embeddings(snippets, progress_callback)
44
+
45
+ # 9. Update timestamp via indexing service
46
+ await self.indexing_domain_service.update_index_timestamp(index.id)
47
+
48
+ # 10. Final commit
49
+ await self.session.commit()
50
+ ```
51
+
52
+ ## New Architecture Benefits
53
+
54
+ ### Simplified Application Service
55
+
56
+ ```python
57
+ # New approach - aggregate root handles complexity
58
+ async def run_complete_indexing_workflow(
59
+ self, uri: AnyUrl, local_path: Path
60
+ ) -> domain_entities.Index:
61
+ # 1. Create index (aggregate root)
62
+ index = await self._index_domain_service.create_index(uri)
63
+
64
+ # 2. Populate working copy (aggregate method)
65
+ index = await self._index_domain_service.clone_and_populate_working_copy(
66
+ index, local_path, SourceType.GIT
67
+ )
68
+
69
+ # 3. Extract snippets (aggregate method)
70
+ index = await self._index_domain_service.extract_snippets(index)
71
+
72
+ # 4. Simple transaction management
73
+ await self._session.commit()
74
+
75
+ return index
76
+ ```
77
+
78
+ ## Migration Strategy
79
+
80
+ ### Phase 1: Parallel Implementation ✅
81
+
82
+ - [x] Create new domain entities (`domain/entities.py`)
83
+ - [x] Create repository protocol (`domain/protocols.py`)
84
+ - [x] Create mapping layer (`infrastructure/mappers/index_mapper.py`)
85
+ - [x] Create repository implementation (`infrastructure/sqlalchemy/index_repository.py`)
86
+ - [x] Create domain service (`domain/services/index_service.py`)
87
+ - [x] Create simplified application service (`application/services/simplified_indexing_service.py`)
88
+
89
+ ### Phase 2: Feature Parity (Next Steps)
90
+
91
+ #### 2.1 Complete Index Domain Service
92
+ - [ ] Implement actual cloning logic in `clone_and_populate_working_copy`
93
+ - [ ] Complete snippet enrichment in `enrich_snippets_with_summaries`
94
+ - [ ] Add snippet search capabilities to Index aggregate
95
+ - [ ] Add BM25/embedding integration
96
+
97
+ #### 2.2 Application Service Integration
98
+ - [ ] Update application factories to create new services
99
+ - [ ] Add legacy compatibility methods
100
+ - [ ] Implement search functionality migration
101
+
102
+ #### 2.3 CLI Integration
103
+ - [ ] Update CLI commands to use new application service
104
+ - [ ] Maintain backward compatibility for existing commands
105
+
106
+ ### Phase 3: Gradual Migration
107
+
108
+ #### 3.1 New Endpoints First
109
+ - [ ] Create new CLI commands using Index aggregate
110
+ - [ ] Add new MCP tools using simplified service
111
+ - [ ] Implement new features with aggregate root
112
+
113
+ #### 3.2 Legacy Adaptation
114
+ - [ ] Wrap old API calls to use new domain service
115
+ - [ ] Provide compatibility layer for existing integrations
116
+ - [ ] Migrate tests gradually
117
+
118
+ #### 3.3 Search Migration
119
+ - [ ] Move search logic into Index aggregate
120
+ - [ ] Create search value objects in domain
121
+ - [ ] Simplify search application service
122
+
123
+ ### Phase 4: Complete Migration
124
+
125
+ #### 4.1 Remove Old Services
126
+ - [ ] Remove `IndexingDomainService`
127
+ - [ ] Remove `SnippetDomainService`
128
+ - [ ] Remove `SourceService`
129
+ - [ ] Clean up old value objects
130
+
131
+ #### 4.2 Final Cleanup
132
+ - [ ] Remove legacy compatibility methods
133
+ - [ ] Update all tests to use new architecture
134
+ - [ ] Remove old application service
135
+
136
+ ## Code Examples
137
+
138
+ ### Before: Current Complexity
139
+
140
+ ```python
141
+ class CodeIndexingApplicationService:
142
+ def __init__(self,
143
+ indexing_domain_service: IndexingDomainService,
144
+ snippet_domain_service: SnippetDomainService,
145
+ source_service: SourceService,
146
+ bm25_service: BM25DomainService,
147
+ code_search_service: EmbeddingDomainService,
148
+ text_search_service: EmbeddingDomainService,
149
+ enrichment_service: EnrichmentDomainService,
150
+ session: AsyncSession, # Leaky abstraction!
151
+ ):
152
+ # 7+ services to coordinate
153
+ ```
154
+
155
+ ### After: Aggregate Root Simplicity
156
+
157
+ ```python
158
+ class SimplifiedIndexingApplicationService:
159
+ def __init__(self,
160
+ index_domain_service: IndexDomainService,
161
+ session: AsyncSession,
162
+ ):
163
+ # Single domain service + session
164
+ # All business logic in domain
165
+ ```
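Reviewer note: the testability claim behind the "after" shape can be illustrated with stand-ins. The sketch below is not code from the package. `Index`, `FakeIndexDomainService`, `FakeSession`, and `run_workflow` are hypothetical names, and the cloning step is omitted; only `create_index`, `extract_snippets`, and the commit call follow the migration document.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Index:
    """Hypothetical stand-in for the Index aggregate root."""

    uri: str
    snippets: list[str] = field(default_factory=list)


class FakeIndexDomainService:
    """In-memory stand-in for the single domain service (assumption)."""

    async def create_index(self, uri: str) -> Index:
        return Index(uri=uri)

    async def extract_snippets(self, index: Index) -> Index:
        index.snippets.append(f"snippet from {index.uri}")
        return index


class FakeSession:
    """Counts commits instead of talking to a database."""

    def __init__(self) -> None:
        self.commits = 0

    async def commit(self) -> None:
        self.commits += 1


class SimplifiedIndexingApplicationService:
    """Coordinates one domain service and one session, nothing else."""

    def __init__(self, index_domain_service, session) -> None:
        self._index_domain_service = index_domain_service
        self._session = session

    async def run_workflow(self, uri: str) -> Index:
        index = await self._index_domain_service.create_index(uri)
        index = await self._index_domain_service.extract_snippets(index)
        await self._session.commit()
        return index


if __name__ == "__main__":
    service = SimplifiedIndexingApplicationService(
        FakeIndexDomainService(), FakeSession()
    )
    print(asyncio.run(service.run_workflow("file:///repo")))
```

Because the application service depends on only two collaborators, a test needs two small fakes rather than seven mocked services.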
166
+
167
+ ## Benefits of Migration
168
+
169
+ ### 1. **Reduced Complexity**
170
+ - Single domain service instead of 7+
171
+ - Business logic moves to domain layer
172
+ - Application layer focuses on coordination
173
+
174
+ ### 2. **Better Domain Modeling**
175
+ - Index as true aggregate root
176
+ - Rich domain objects with behavior
177
+ - Proper encapsulation of business rules
178
+
179
+ ### 3. **Improved Testability**
180
+ - Domain service can be tested in isolation
181
+ - No SQLAlchemy dependencies in domain tests
182
+ - Cleaner mocking for application tests
183
+
184
+ ### 4. **Enhanced Maintainability**
185
+ - Clear boundaries between layers
186
+ - Easier to add new features
187
+ - Reduced coupling between services
188
+
189
+ ### 5. **Better Performance**
190
+ - Fewer repository round trips
191
+ - Optimized aggregate loading
192
+ - Reduced object mapping overhead
193
+
194
+ ## Risks and Mitigation
195
+
196
+ ### Risk: Breaking Changes
197
+ **Mitigation**: Implement compatibility layer during transition
198
+
199
+ ### Risk: Feature Regression
200
+ **Mitigation**: Comprehensive test coverage for both old and new
201
+
202
+ ### Risk: Performance Impact
203
+ **Mitigation**: Benchmark and optimize aggregate loading
204
+
205
+ ### Risk: Complex Migration
206
+ **Mitigation**: Gradual, phase-by-phase approach
207
+
208
+ ## Success Metrics
209
+
210
+ - [ ] Reduced lines of code in application service (target: 50% reduction)
211
+ - [ ] Improved test coverage for domain logic
212
+ - [ ] Faster indexing workflow execution
213
+ - [ ] Fewer bugs related to state management
214
+ - [ ] Easier onboarding for new developers
215
+
216
+ ## Next Immediate Steps
217
+
218
+ 1. **Complete Domain Service**: Finish implementing cloning and enrichment
219
+ 2. **Factory Integration**: Update application factories
220
+ 3. **Simple CLI Command**: Create one new command using aggregate
221
+ 4. **Performance Test**: Benchmark against current implementation
222
+ 5. **Migration Plan**: Detail specific steps for first legacy endpoint
@@ -49,13 +49,16 @@ code. This index is used to build a snippet library, ready for ingestion into an
49
49
 
50
50
  - Index local directories and public Git repositories
51
51
  - Build comprehensive snippet libraries for LLM ingestion
52
- - Support for multiple codebase types and languages
53
- - Efficient indexing and search capabilities
52
+ - Support for 20+ programming languages including Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, C#, HTML/CSS, and more
53
+ - Advanced code analysis with dependency tracking and call graph generation
54
+ - Intelligent snippet extraction with context-aware dependencies
55
+ - Efficient indexing with selective reindexing (only processes modified files)
54
56
  - Privacy first: respects .gitignore and .noindex files
55
57
  - **NEW in 0.3**: Auto-indexing configuration for shared server deployments
56
58
  - **NEW in 0.3**: Enhanced Git provider support including Azure DevOps
57
59
  - **NEW in 0.3**: Index private repositories via a PAT
58
60
  - **NEW in 0.3**: Improved progress monitoring and reporting during indexing
61
+ - **NEW in 0.4**: Advanced code slicing infrastructure with Tree-sitter parsing
59
62
 
60
63
  ### MCP Server
61
64
 
@@ -68,7 +71,9 @@ intent. Kodit has been tested to work well with:
68
71
  - [Cursor](./getting-started/integration/index.md#integration-with-cursor)
69
72
  - [Cline](./getting-started/integration/index.md#integration-with-cline)
70
73
  - Please contribute more instructions! ... any other assistant is likely to work ...
71
- - **New in 0.3**: Filter snippets by source, language, author or timestamp.
74
+ - **New in 0.3**: Advanced search filters by source, language, author, date range, and file path
75
+ - **New in 0.3**: Hybrid search combining BM25 keyword search with semantic search
76
+ - **New in 0.3**: Enhanced MCP tools with rich context parameters and metadata
72
77
 
73
78
  ### Enterprise Ready
74
79
 
@@ -209,26 +209,51 @@ Kodit respects [standard ignore patterns](#ignore-patterns):
209
209
  - **`.gitignore`**: Standard Git ignore patterns
210
210
  - **`.noindex`**: Custom ignore patterns for Kodit (uses gitignore syntax)
211
211
 
212
- ### Supported File Types
212
+ ### Supported Programming Languages
213
213
 
214
- Kodit automatically detects and processes files based on their extensions:
214
+ Kodit automatically detects and processes files based on their extensions. The following languages are supported with advanced Tree-sitter parsing:
215
215
 
216
- | Language | Extensions |
217
- |----------|------------|
218
- | Python | `.py` |
219
- | JavaScript | `.js`, `.jsx` |
220
- | TypeScript | `.ts`, `.tsx` |
221
- | Go | `.go` |
222
- | C# | `.cs` |
216
+ | Language | Extensions | Features |
217
+ |----------|------------|----------|
218
+ | Python | `.py`, `.pyw`, `.pyx`, `.pxd` | Function/method extraction, import analysis, call graph |
219
+ | JavaScript | `.js`, `.jsx`, `.mjs` | Function extraction, ES6 modules, JSX support |
220
+ | TypeScript | `.ts`, `.tsx` | Type definitions, interfaces, decorators |
221
+ | Java | `.java` | Method declarations, constructors, class hierarchies |
222
+ | Go | `.go` | Function/method extraction, package imports |
223
+ | Rust | `.rs` | Function definitions, trait implementations |
224
+ | C/C++ | `.c`, `.h`, `.cpp`, `.cc`, `.cxx`, `.hpp`, `.hxx` | Function definitions, header includes |
225
+ | C# | `.cs` | Method declarations, using directives, constructors |
226
+ | HTML | `.html`, `.htm` | Element extraction with ID/class identification |
227
+ | CSS | `.css`, `.scss`, `.sass`, `.less` | Rule extraction, selector analysis, keyframes |
223
228
 
224
- ### Snippet Extraction
229
+ ### Advanced Snippet Extraction
225
230
 
226
- Kodit uses tree-sitter to intelligently extract code snippets:
231
+ Kodit uses a sophisticated Tree-sitter-based slicing system to intelligently extract code snippets with context:
232
+
233
+ #### Core Features
227
234
 
228
235
  - **Functions and Methods**: Complete function definitions with their bodies
229
236
  - **Classes**: Class definitions and their methods
230
237
  - **Imports**: Import statements for context
231
238
  - **Dependencies**: Ancestor classes and functions that the snippet depends on
239
+ - **Call Graph Analysis**: Builds relationships between functions to understand dependencies
240
+ - **Context-Aware Extraction**: Includes related functions and usage examples
241
+ - **Topological Sorting**: Orders dependencies for optimal LLM consumption
242
+
243
+ #### Smart Dependency Tracking
244
+
245
+ - **Import Maps**: Tracks import statements and their usage
246
+ - **Function Calls**: Identifies which functions call which others
247
+ - **Reverse Dependencies**: Finds all callers of a given function
248
+ - **Usage Examples**: Includes examples of how functions are used in the codebase
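Reviewer note: the call graph, reverse dependency, and topological ordering ideas listed above can be sketched with the standard library. This is an illustration under assumptions, not Kodit's slicer. The call graph is taken to be a plain mapping from a function name to the names it calls, and `reverse_dependencies` and `dependency_order` are hypothetical helpers.

```python
from graphlib import TopologicalSorter


def reverse_dependencies(call_graph: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each function to the set of functions that call it."""
    callers: dict[str, set[str]] = {name: set() for name in call_graph}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    return callers


def dependency_order(call_graph: dict[str, set[str]], target: str) -> list[str]:
    """Return target plus its transitive callees, dependencies first."""
    needed: set[str] = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in needed:
            continue
        needed.add(name)
        stack.extend(call_graph.get(name, set()))
    # TopologicalSorter expects node -> predecessors, so callees come first.
    subgraph = {name: call_graph.get(name, set()) & needed for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


if __name__ == "__main__":
    graph = {
        "handler": {"validate", "save"},
        "save": {"connect"},
        "validate": set(),
        "connect": set(),
        "unused": {"connect"},
    }
    print(dependency_order(graph, "handler"))
    print(reverse_dependencies(graph)["connect"])
```

`TopologicalSorter` raises `graphlib.CycleError` on recursive call chains, so a real slicer would need a policy for cycles. The relative order of independent functions is not fixed.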
249
+
250
+ #### Language-Specific Extraction
251
+
252
+ - **Python**: Decorators, async functions, class inheritance
253
+ - **JavaScript/TypeScript**: Arrow functions, async/await, ES6 modules
254
+ - **Java**: Annotations, generics, inheritance hierarchies
255
+ - **Go**: Interfaces, struct methods, package organization
256
+ - **HTML/CSS**: Elements with semantic context, CSS rules and selectors
232
257
 
233
258
  ## Configuration
234
259
 
@@ -269,9 +294,35 @@ DEFAULT_ENDPOINT_API_KEY=sk-your-api-key
269
294
 
270
295
  ## Advanced Features
271
296
 
272
- ### Re-indexing Sources
297
+ ### Selective Re-indexing
298
+
299
+ Kodit includes intelligent re-indexing that only processes files that have been modified:
300
+
301
+ #### How It Works
302
+
303
+ - **SHA256 Change Detection**: Compares file content hashes to detect changes
304
+ - **File Status Tracking**: Tracks files as CLEAN, MODIFIED, or DELETED
305
+ - **Incremental Updates**: Only re-processes changed files, improving performance for large codebases
306
+ - **Metadata Preservation**: Maintains file metadata and Git information
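Reviewer note: the hash comparison and status tracking described above can be sketched as follows. This is a minimal illustration, not the package's implementation. `FileStatus`, `sha256_of`, and `classify_files` are hypothetical names, and treating a previously unseen file as MODIFIED is an assumption, since the documentation names only CLEAN, MODIFIED, and DELETED.

```python
import hashlib
from enum import Enum
from pathlib import Path


class FileStatus(Enum):
    CLEAN = "clean"
    MODIFIED = "modified"
    DELETED = "deleted"


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of a file's content."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def classify_files(root: Path, known_hashes: dict[str, str]) -> dict[str, FileStatus]:
    """Compare files under root with hashes stored by an earlier run."""
    statuses: dict[str, FileStatus] = {}
    seen: set[str] = set()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        seen.add(rel)
        if known_hashes.get(rel) == sha256_of(path):
            statuses[rel] = FileStatus.CLEAN
        else:
            # Assumption: files with no stored hash count as MODIFIED.
            statuses[rel] = FileStatus.MODIFIED
    # Anything stored earlier but no longer on disk was deleted.
    for rel in known_hashes.keys() - seen:
        statuses[rel] = FileStatus.DELETED
    return statuses
```

Only files reported as MODIFIED would need snippet extraction again; CLEAN files keep their existing snippets, and DELETED files have theirs removed.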
273
307
 
274
- Future feature!
308
+ #### Benefits
309
+
310
+ - **Performance**: Dramatically faster re-indexing for large repositories
311
+ - **Resource Efficiency**: Reduces CPU and memory usage during updates
312
+ - **Consistency**: Ensures only actual changes trigger re-processing
313
+ - **Scalability**: Enables efficient handling of large, frequently-updated codebases
314
+
315
+ #### Usage
316
+
317
+ Re-indexing automatically uses selective processing when you re-index an existing source:
318
+
319
+ ```sh
320
+ # Re-index with selective processing
321
+ kodit index /path/to/existing/source
322
+
323
+ # Or for Git repositories
324
+ kodit index https://github.com/username/repo.git
325
+ ```
275
326
 
276
327
  ### Progress Monitoring
277
328
 
@@ -53,21 +53,30 @@ The search tool accepts the following parameters:
53
53
  | `related_file_paths` | list[Path] | Absolute paths to relevant files | `["/path/to/auth.py"]` |
54
54
  | `related_file_contents` | list[string] | Contents of relevant files | `["def authenticate(): ..."]` |
55
55
  | `keywords` | list[string] | Relevant keywords for the search | `["authentication", "jwt", "login"]` |
56
- | `language` | string \| None | Filter by programming language | `"python"`, `"go"`, `"javascript"` |
56
+ | `language` | string \| None | Filter by programming language (20+ supported) | `"python"`, `"go"`, `"javascript"`, `"html"`, `"css"` |
57
57
  | `author` | string \| None | Filter by author name | `"john.doe"` |
58
58
  | `created_after` | string \| None | Filter by creation date (YYYY-MM-DD) | `"2023-01-01"` |
59
59
  | `created_before` | string \| None | Filter by creation date (YYYY-MM-DD) | `"2023-12-31"` |
60
60
  | `source_repo` | string \| None | Filter by source repository | `"github.com/example/repo"` |
61
+ | `file_path` | string \| None | Filter by file path pattern | `"src/"`, `"*.test.py"` |
61
62
 
62
- ### Search Functionality
63
+ ### Advanced Search Functionality
63
64
 
64
- The search tool combines multiple search strategies:
65
+ The search tool combines multiple search strategies with sophisticated ranking:
65
66
 
66
- 1. **Keyword Search** - Uses BM25 algorithm for exact keyword matching
67
- 2. **Semantic Code Search** - Uses embeddings to find semantically similar code
67
+ 1. **BM25 Keyword Search** - Advanced keyword matching with relevance scoring
68
+ 2. **Semantic Code Search** - Uses embeddings to find semantically similar code patterns
68
69
  3. **Semantic Text Search** - Uses embeddings to find code matching natural language descriptions
70
+ 4. **Reciprocal Rank Fusion (RRF)** - Intelligently combines results from multiple search strategies
71
+ 5. **Context-Aware Filtering** - Advanced filtering by language, author, date, source, and file path
72
+ 6. **Dependency-Aware Results** - Returns code snippets with their dependencies and usage examples
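Reviewer note: Reciprocal Rank Fusion, named in the list above, scores each result as the sum of `1 / (k + rank)` over every ranking it appears in. The sketch below shows the general technique only. The function name, the snippet IDs, and the constant `k = 60` (a commonly used default) are assumptions, not values taken from Kodit.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse several ranked ID lists into one list, best first."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, snippet_id in enumerate(ranking, start=1):
            scores[snippet_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    bm25_hits = ["s1", "s2", "s3"]
    code_hits = ["s2", "s1", "s4"]
    text_hits = ["s2", "s3", "s1"]
    for snippet_id, score in reciprocal_rank_fusion([bm25_hits, code_hits, text_hits]):
        print(snippet_id, round(score, 5))
```

Because only ranks are used, the BM25 and embedding scores never have to be normalized onto a common scale.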
69
73
 
70
- Results are fused together to provide the most relevant snippets for the user's intent.
74
+ #### Enhanced Result Quality
75
+
76
+ - **Smart Snippet Selection**: Returns functions with their dependencies and context
77
+ - **Rich Metadata**: Each result includes file path, language, author, and creation date
78
+ - **Usage Examples**: Includes examples of how functions are used in the codebase
79
+ - **Topological Ordering**: Dependencies are ordered for optimal LLM consumption
71
80
 
72
81
  ## Filtering Capabilities
73
82
 
@@ -81,6 +90,8 @@ Filter results by programming language:
81
90
  > "I need to create a web server in Python. Please search for Flask or FastAPI examples and show me the best practices."
82
91
  > "I'm working on a Go microservice. Can you search for Go-specific patterns for handling HTTP requests and database connections?"
83
92
  > "I need JavaScript examples for form validation. Please search for modern JavaScript/TypeScript validation patterns."
93
+ > "I'm building a responsive layout. Please search for CSS Grid and Flexbox examples in our stylesheets."
94
+ > "I need HTML form examples. Please search for form elements with proper accessibility attributes."
84
95
 
85
96
  ### Author Filtering
86
97