chewy 0.8.4 → 7.3.4

Files changed (340)
  1. checksums.yaml +5 -5
  2. data/.github/CODEOWNERS +1 -0
  3. data/.github/ISSUE_TEMPLATE/bug_report.md +39 -0
  4. data/.github/ISSUE_TEMPLATE/feature_request.md +20 -0
  5. data/.github/PULL_REQUEST_TEMPLATE.md +16 -0
  6. data/.github/workflows/ruby.yml +74 -0
  7. data/.gitignore +1 -0
  8. data/.rubocop.yml +61 -0
  9. data/.rubocop_todo.yml +132 -0
  10. data/.yardopts +5 -0
  11. data/CHANGELOG.md +554 -245
  12. data/CODE_OF_CONDUCT.md +14 -0
  13. data/CONTRIBUTING.md +63 -0
  14. data/Gemfile +14 -11
  15. data/Guardfile +8 -6
  16. data/LICENSE.txt +1 -1
  17. data/README.md +748 -623
  18. data/Rakefile +11 -1
  19. data/chewy.gemspec +15 -19
  20. data/gemfiles/rails.5.2.activerecord.gemfile +11 -0
  21. data/gemfiles/rails.6.0.activerecord.gemfile +11 -0
  22. data/gemfiles/rails.6.1.activerecord.gemfile +13 -0
  23. data/gemfiles/rails.7.0.activerecord.gemfile +13 -0
  24. data/lib/chewy/config.rb +64 -50
  25. data/lib/chewy/errors.rb +10 -16
  26. data/lib/chewy/fields/base.rb +122 -32
  27. data/lib/chewy/fields/root.rb +48 -23
  28. data/lib/chewy/index/actions.rb +140 -54
  29. data/lib/chewy/index/adapter/active_record.rb +112 -0
  30. data/lib/chewy/{type → index}/adapter/base.rb +31 -12
  31. data/lib/chewy/index/adapter/object.rb +249 -0
  32. data/lib/chewy/index/adapter/orm.rb +194 -0
  33. data/lib/chewy/index/aliases.rb +14 -4
  34. data/lib/chewy/index/crutch.rb +40 -0
  35. data/lib/chewy/index/import/bulk_builder.rb +311 -0
  36. data/lib/chewy/index/import/bulk_request.rb +77 -0
  37. data/lib/chewy/index/import/journal_builder.rb +44 -0
  38. data/lib/chewy/index/import/routine.rb +139 -0
  39. data/lib/chewy/index/import.rb +243 -0
  40. data/lib/chewy/{type → index}/mapping.rb +79 -68
  41. data/lib/chewy/index/observe/active_record_methods.rb +87 -0
  42. data/lib/chewy/index/observe/callback.rb +34 -0
  43. data/lib/chewy/index/observe.rb +17 -0
  44. data/lib/chewy/index/settings.rb +10 -5
  45. data/lib/chewy/index/specification.rb +61 -0
  46. data/lib/chewy/index/syncer.rb +221 -0
  47. data/lib/chewy/{type → index}/witchcraft.rb +100 -39
  48. data/lib/chewy/index/wrapper.rb +95 -0
  49. data/lib/chewy/index.rb +216 -140
  50. data/lib/chewy/journal.rb +66 -0
  51. data/lib/chewy/log_subscriber.rb +8 -8
  52. data/lib/chewy/minitest/helpers.rb +150 -0
  53. data/lib/chewy/minitest/search_index_receiver.rb +76 -0
  54. data/lib/chewy/minitest.rb +1 -0
  55. data/lib/chewy/multi_search.rb +62 -0
  56. data/lib/chewy/railtie.rb +12 -25
  57. data/lib/chewy/rake_helper.rb +335 -37
  58. data/lib/chewy/repository.rb +2 -2
  59. data/lib/chewy/rspec/build_query.rb +12 -0
  60. data/lib/chewy/rspec/helpers.rb +55 -0
  61. data/lib/chewy/rspec/update_index.rb +106 -90
  62. data/lib/chewy/rspec.rb +3 -1
  63. data/lib/chewy/runtime/version.rb +4 -4
  64. data/lib/chewy/runtime.rb +1 -1
  65. data/lib/chewy/search/loader.rb +61 -0
  66. data/lib/chewy/{query → search}/pagination/kaminari.rb +13 -5
  67. data/lib/chewy/search/parameters/aggs.rb +16 -0
  68. data/lib/chewy/search/parameters/allow_partial_search_results.rb +27 -0
  69. data/lib/chewy/search/parameters/collapse.rb +16 -0
  70. data/lib/chewy/search/parameters/concerns/bool_storage.rb +24 -0
  71. data/lib/chewy/search/parameters/concerns/hash_storage.rb +23 -0
  72. data/lib/chewy/search/parameters/concerns/integer_storage.rb +14 -0
  73. data/lib/chewy/search/parameters/concerns/query_storage.rb +238 -0
  74. data/lib/chewy/search/parameters/concerns/string_array_storage.rb +23 -0
  75. data/lib/chewy/search/parameters/concerns/string_storage.rb +14 -0
  76. data/lib/chewy/search/parameters/docvalue_fields.rb +12 -0
  77. data/lib/chewy/search/parameters/explain.rb +16 -0
  78. data/lib/chewy/search/parameters/filter.rb +47 -0
  79. data/lib/chewy/search/parameters/highlight.rb +16 -0
  80. data/lib/chewy/search/parameters/ignore_unavailable.rb +27 -0
  81. data/lib/chewy/search/parameters/indices.rb +78 -0
  82. data/lib/chewy/search/parameters/indices_boost.rb +52 -0
  83. data/lib/chewy/search/parameters/limit.rb +17 -0
  84. data/lib/chewy/search/parameters/load.rb +32 -0
  85. data/lib/chewy/search/parameters/min_score.rb +16 -0
  86. data/lib/chewy/search/parameters/none.rb +25 -0
  87. data/lib/chewy/search/parameters/offset.rb +17 -0
  88. data/lib/chewy/search/parameters/order.rb +51 -0
  89. data/lib/chewy/search/parameters/post_filter.rb +19 -0
  90. data/lib/chewy/search/parameters/preference.rb +16 -0
  91. data/lib/chewy/search/parameters/profile.rb +16 -0
  92. data/lib/chewy/search/parameters/query.rb +19 -0
  93. data/lib/chewy/search/parameters/request_cache.rb +27 -0
  94. data/lib/chewy/search/parameters/rescore.rb +29 -0
  95. data/lib/chewy/search/parameters/script_fields.rb +16 -0
  96. data/lib/chewy/search/parameters/search_after.rb +20 -0
  97. data/lib/chewy/search/parameters/search_type.rb +16 -0
  98. data/lib/chewy/search/parameters/source.rb +77 -0
  99. data/lib/chewy/search/parameters/storage.rb +95 -0
  100. data/lib/chewy/search/parameters/stored_fields.rb +63 -0
  101. data/lib/chewy/search/parameters/suggest.rb +16 -0
  102. data/lib/chewy/search/parameters/terminate_after.rb +16 -0
  103. data/lib/chewy/search/parameters/timeout.rb +16 -0
  104. data/lib/chewy/search/parameters/track_scores.rb +16 -0
  105. data/lib/chewy/search/parameters/track_total_hits.rb +16 -0
  106. data/lib/chewy/search/parameters/version.rb +16 -0
  107. data/lib/chewy/search/parameters.rb +170 -0
  108. data/lib/chewy/search/query_proxy.rb +264 -0
  109. data/lib/chewy/search/request.rb +1071 -0
  110. data/lib/chewy/search/response.rb +119 -0
  111. data/lib/chewy/search/scoping.rb +49 -0
  112. data/lib/chewy/search/scrolling.rb +137 -0
  113. data/lib/chewy/search.rb +68 -28
  114. data/lib/chewy/stash.rb +68 -0
  115. data/lib/chewy/strategy/active_job.rb +3 -2
  116. data/lib/chewy/strategy/atomic.rb +2 -4
  117. data/lib/chewy/strategy/atomic_no_refresh.rb +18 -0
  118. data/lib/chewy/strategy/base.rb +13 -3
  119. data/lib/chewy/strategy/bypass.rb +1 -2
  120. data/lib/chewy/strategy/delayed_sidekiq/scheduler.rb +148 -0
  121. data/lib/chewy/strategy/delayed_sidekiq/worker.rb +52 -0
  122. data/lib/chewy/strategy/delayed_sidekiq.rb +17 -0
  123. data/lib/chewy/strategy/lazy_sidekiq.rb +64 -0
  124. data/lib/chewy/strategy/sidekiq.rb +15 -2
  125. data/lib/chewy/strategy/urgent.rb +1 -1
  126. data/lib/chewy/strategy.rb +16 -20
  127. data/lib/chewy/version.rb +1 -1
  128. data/lib/chewy.rb +81 -82
  129. data/lib/generators/chewy/install_generator.rb +3 -3
  130. data/lib/tasks/chewy.rake +99 -32
  131. data/migration_guide.md +56 -0
  132. data/spec/chewy/config_spec.rb +87 -15
  133. data/spec/chewy/fields/base_spec.rb +542 -233
  134. data/spec/chewy/fields/root_spec.rb +115 -17
  135. data/spec/chewy/fields/time_fields_spec.rb +13 -12
  136. data/spec/chewy/index/actions_spec.rb +595 -77
  137. data/spec/chewy/index/adapter/active_record_spec.rb +601 -0
  138. data/spec/chewy/index/adapter/object_spec.rb +243 -0
  139. data/spec/chewy/index/aliases_spec.rb +5 -5
  140. data/spec/chewy/index/import/bulk_builder_spec.rb +494 -0
  141. data/spec/chewy/index/import/bulk_request_spec.rb +95 -0
  142. data/spec/chewy/index/import/journal_builder_spec.rb +87 -0
  143. data/spec/chewy/index/import/routine_spec.rb +110 -0
  144. data/spec/chewy/index/import_spec.rb +615 -0
  145. data/spec/chewy/index/mapping_spec.rb +135 -0
  146. data/spec/chewy/index/observe/active_record_methods_spec.rb +68 -0
  147. data/spec/chewy/index/observe/callback_spec.rb +139 -0
  148. data/spec/chewy/index/observe_spec.rb +143 -0
  149. data/spec/chewy/index/settings_spec.rb +103 -50
  150. data/spec/chewy/index/specification_spec.rb +159 -0
  151. data/spec/chewy/index/syncer_spec.rb +118 -0
  152. data/spec/chewy/index/witchcraft_spec.rb +245 -0
  153. data/spec/chewy/index/wrapper_spec.rb +100 -0
  154. data/spec/chewy/index_spec.rb +149 -121
  155. data/spec/chewy/journal_spec.rb +223 -0
  156. data/spec/chewy/minitest/helpers_spec.rb +198 -0
  157. data/spec/chewy/minitest/search_index_receiver_spec.rb +118 -0
  158. data/spec/chewy/multi_search_spec.rb +84 -0
  159. data/spec/chewy/rake_helper_spec.rb +656 -0
  160. data/spec/chewy/repository_spec.rb +8 -8
  161. data/spec/chewy/rspec/build_query_spec.rb +34 -0
  162. data/spec/chewy/rspec/helpers_spec.rb +61 -0
  163. data/spec/chewy/rspec/update_index_spec.rb +220 -114
  164. data/spec/chewy/runtime_spec.rb +2 -2
  165. data/spec/chewy/search/loader_spec.rb +83 -0
  166. data/spec/chewy/search/pagination/kaminari_examples.rb +69 -0
  167. data/spec/chewy/search/pagination/kaminari_spec.rb +21 -0
  168. data/spec/chewy/search/parameters/aggs_spec.rb +5 -0
  169. data/spec/chewy/search/parameters/bool_storage_examples.rb +53 -0
  170. data/spec/chewy/search/parameters/collapse_spec.rb +5 -0
  171. data/spec/chewy/search/parameters/docvalue_fields_spec.rb +5 -0
  172. data/spec/chewy/search/parameters/explain_spec.rb +5 -0
  173. data/spec/chewy/search/parameters/filter_spec.rb +5 -0
  174. data/spec/chewy/search/parameters/hash_storage_examples.rb +59 -0
  175. data/spec/chewy/search/parameters/highlight_spec.rb +5 -0
  176. data/spec/chewy/search/parameters/ignore_unavailable_spec.rb +67 -0
  177. data/spec/chewy/search/parameters/indices_spec.rb +99 -0
  178. data/spec/chewy/search/parameters/integer_storage_examples.rb +32 -0
  179. data/spec/chewy/search/parameters/limit_spec.rb +5 -0
  180. data/spec/chewy/search/parameters/load_spec.rb +60 -0
  181. data/spec/chewy/search/parameters/min_score_spec.rb +32 -0
  182. data/spec/chewy/search/parameters/none_spec.rb +5 -0
  183. data/spec/chewy/search/parameters/offset_spec.rb +5 -0
  184. data/spec/chewy/search/parameters/order_spec.rb +72 -0
  185. data/spec/chewy/search/parameters/post_filter_spec.rb +5 -0
  186. data/spec/chewy/search/parameters/preference_spec.rb +5 -0
  187. data/spec/chewy/search/parameters/profile_spec.rb +5 -0
  188. data/spec/chewy/search/parameters/query_spec.rb +5 -0
  189. data/spec/chewy/search/parameters/query_storage_examples.rb +434 -0
  190. data/spec/chewy/search/parameters/request_cache_spec.rb +67 -0
  191. data/spec/chewy/search/parameters/rescore_spec.rb +62 -0
  192. data/spec/chewy/search/parameters/script_fields_spec.rb +5 -0
  193. data/spec/chewy/search/parameters/search_after_spec.rb +35 -0
  194. data/spec/chewy/search/parameters/search_type_spec.rb +5 -0
  195. data/spec/chewy/search/parameters/source_spec.rb +162 -0
  196. data/spec/chewy/search/parameters/storage_spec.rb +60 -0
  197. data/spec/chewy/search/parameters/stored_fields_spec.rb +126 -0
  198. data/spec/chewy/search/parameters/string_array_storage_examples.rb +63 -0
  199. data/spec/chewy/search/parameters/string_storage_examples.rb +32 -0
  200. data/spec/chewy/search/parameters/suggest_spec.rb +5 -0
  201. data/spec/chewy/search/parameters/terminate_after_spec.rb +5 -0
  202. data/spec/chewy/search/parameters/timeout_spec.rb +5 -0
  203. data/spec/chewy/search/parameters/track_scores_spec.rb +5 -0
  204. data/spec/chewy/search/parameters/track_total_hits_spec.rb +5 -0
  205. data/spec/chewy/search/parameters/version_spec.rb +5 -0
  206. data/spec/chewy/search/parameters_spec.rb +161 -0
  207. data/spec/chewy/search/query_proxy_spec.rb +119 -0
  208. data/spec/chewy/search/request_spec.rb +880 -0
  209. data/spec/chewy/search/response_spec.rb +202 -0
  210. data/spec/chewy/search/scrolling_spec.rb +171 -0
  211. data/spec/chewy/search_spec.rb +82 -55
  212. data/spec/chewy/stash_spec.rb +85 -0
  213. data/spec/chewy/strategy/active_job_spec.rb +27 -8
  214. data/spec/chewy/strategy/atomic_no_refresh_spec.rb +60 -0
  215. data/spec/chewy/strategy/atomic_spec.rb +13 -11
  216. data/spec/chewy/strategy/delayed_sidekiq_spec.rb +190 -0
  217. data/spec/chewy/strategy/lazy_sidekiq_spec.rb +214 -0
  218. data/spec/chewy/strategy/sidekiq_spec.rb +19 -7
  219. data/spec/chewy/strategy_spec.rb +19 -15
  220. data/spec/chewy_spec.rb +65 -88
  221. data/spec/spec_helper.rb +11 -20
  222. data/spec/support/active_record.rb +48 -6
  223. data/spec/support/class_helpers.rb +4 -19
  224. metadata +299 -183
  225. data/.travis.yml +0 -76
  226. data/Appraisals +0 -76
  227. data/gemfiles/rails.3.2.activerecord.gemfile +0 -15
  228. data/gemfiles/rails.3.2.activerecord.kaminari.gemfile +0 -14
  229. data/gemfiles/rails.3.2.activerecord.will_paginate.gemfile +0 -14
  230. data/gemfiles/rails.4.0.activerecord.gemfile +0 -15
  231. data/gemfiles/rails.4.0.activerecord.kaminari.gemfile +0 -14
  232. data/gemfiles/rails.4.0.activerecord.will_paginate.gemfile +0 -14
  233. data/gemfiles/rails.4.0.mongoid.4.0.0.gemfile +0 -15
  234. data/gemfiles/rails.4.0.mongoid.4.0.0.kaminari.gemfile +0 -14
  235. data/gemfiles/rails.4.0.mongoid.4.0.0.will_paginate.gemfile +0 -14
  236. data/gemfiles/rails.4.0.mongoid.5.1.0.gemfile +0 -15
  237. data/gemfiles/rails.4.0.mongoid.5.1.0.kaminari.gemfile +0 -14
  238. data/gemfiles/rails.4.0.mongoid.5.1.0.will_paginate.gemfile +0 -14
  239. data/gemfiles/rails.4.1.activerecord.gemfile +0 -15
  240. data/gemfiles/rails.4.1.activerecord.kaminari.gemfile +0 -14
  241. data/gemfiles/rails.4.1.activerecord.will_paginate.gemfile +0 -14
  242. data/gemfiles/rails.4.1.mongoid.4.0.0.gemfile +0 -15
  243. data/gemfiles/rails.4.1.mongoid.4.0.0.kaminari.gemfile +0 -14
  244. data/gemfiles/rails.4.1.mongoid.4.0.0.will_paginate.gemfile +0 -14
  245. data/gemfiles/rails.4.1.mongoid.5.1.0.gemfile +0 -15
  246. data/gemfiles/rails.4.1.mongoid.5.1.0.kaminari.gemfile +0 -14
  247. data/gemfiles/rails.4.1.mongoid.5.1.0.will_paginate.gemfile +0 -14
  248. data/gemfiles/rails.4.2.activerecord.gemfile +0 -16
  249. data/gemfiles/rails.4.2.activerecord.kaminari.gemfile +0 -15
  250. data/gemfiles/rails.4.2.activerecord.will_paginate.gemfile +0 -15
  251. data/gemfiles/rails.4.2.mongoid.4.0.0.gemfile +0 -15
  252. data/gemfiles/rails.4.2.mongoid.4.0.0.kaminari.gemfile +0 -14
  253. data/gemfiles/rails.4.2.mongoid.4.0.0.will_paginate.gemfile +0 -14
  254. data/gemfiles/rails.4.2.mongoid.5.1.0.gemfile +0 -15
  255. data/gemfiles/rails.4.2.mongoid.5.1.0.kaminari.gemfile +0 -14
  256. data/gemfiles/rails.4.2.mongoid.5.1.0.will_paginate.gemfile +0 -14
  257. data/gemfiles/rails.5.0.0.beta3.activerecord.gemfile +0 -16
  258. data/gemfiles/rails.5.0.0.beta3.activerecord.kaminari.gemfile +0 -16
  259. data/gemfiles/rails.5.0.0.beta3.activerecord.will_paginate.gemfile +0 -15
  260. data/gemfiles/sequel.4.31.gemfile +0 -13
  261. data/lib/chewy/backports/deep_dup.rb +0 -46
  262. data/lib/chewy/backports/duplicable.rb +0 -90
  263. data/lib/chewy/query/compose.rb +0 -69
  264. data/lib/chewy/query/criteria.rb +0 -181
  265. data/lib/chewy/query/filters.rb +0 -227
  266. data/lib/chewy/query/loading.rb +0 -111
  267. data/lib/chewy/query/nodes/and.rb +0 -25
  268. data/lib/chewy/query/nodes/base.rb +0 -17
  269. data/lib/chewy/query/nodes/bool.rb +0 -32
  270. data/lib/chewy/query/nodes/equal.rb +0 -34
  271. data/lib/chewy/query/nodes/exists.rb +0 -20
  272. data/lib/chewy/query/nodes/expr.rb +0 -28
  273. data/lib/chewy/query/nodes/field.rb +0 -106
  274. data/lib/chewy/query/nodes/has_child.rb +0 -14
  275. data/lib/chewy/query/nodes/has_parent.rb +0 -14
  276. data/lib/chewy/query/nodes/has_relation.rb +0 -61
  277. data/lib/chewy/query/nodes/match_all.rb +0 -11
  278. data/lib/chewy/query/nodes/missing.rb +0 -20
  279. data/lib/chewy/query/nodes/not.rb +0 -25
  280. data/lib/chewy/query/nodes/or.rb +0 -25
  281. data/lib/chewy/query/nodes/prefix.rb +0 -18
  282. data/lib/chewy/query/nodes/query.rb +0 -20
  283. data/lib/chewy/query/nodes/range.rb +0 -63
  284. data/lib/chewy/query/nodes/raw.rb +0 -15
  285. data/lib/chewy/query/nodes/regexp.rb +0 -31
  286. data/lib/chewy/query/nodes/script.rb +0 -20
  287. data/lib/chewy/query/pagination/will_paginate.rb +0 -27
  288. data/lib/chewy/query/pagination.rb +0 -16
  289. data/lib/chewy/query/scoping.rb +0 -20
  290. data/lib/chewy/query.rb +0 -1026
  291. data/lib/chewy/strategy/resque.rb +0 -26
  292. data/lib/chewy/type/actions.rb +0 -19
  293. data/lib/chewy/type/adapter/active_record.rb +0 -72
  294. data/lib/chewy/type/adapter/mongoid.rb +0 -58
  295. data/lib/chewy/type/adapter/object.rb +0 -89
  296. data/lib/chewy/type/adapter/orm.rb +0 -156
  297. data/lib/chewy/type/adapter/sequel.rb +0 -75
  298. data/lib/chewy/type/crutch.rb +0 -31
  299. data/lib/chewy/type/import.rb +0 -224
  300. data/lib/chewy/type/observe.rb +0 -76
  301. data/lib/chewy/type/wrapper.rb +0 -53
  302. data/lib/chewy/type.rb +0 -89
  303. data/lib/sequel/plugins/chewy_observe.rb +0 -78
  304. data/spec/chewy/query/criteria_spec.rb +0 -433
  305. data/spec/chewy/query/filters_spec.rb +0 -173
  306. data/spec/chewy/query/loading_spec.rb +0 -86
  307. data/spec/chewy/query/nodes/and_spec.rb +0 -16
  308. data/spec/chewy/query/nodes/bool_spec.rb +0 -22
  309. data/spec/chewy/query/nodes/equal_spec.rb +0 -32
  310. data/spec/chewy/query/nodes/exists_spec.rb +0 -18
  311. data/spec/chewy/query/nodes/has_child_spec.rb +0 -40
  312. data/spec/chewy/query/nodes/has_parent_spec.rb +0 -40
  313. data/spec/chewy/query/nodes/match_all_spec.rb +0 -11
  314. data/spec/chewy/query/nodes/missing_spec.rb +0 -15
  315. data/spec/chewy/query/nodes/not_spec.rb +0 -16
  316. data/spec/chewy/query/nodes/or_spec.rb +0 -16
  317. data/spec/chewy/query/nodes/prefix_spec.rb +0 -16
  318. data/spec/chewy/query/nodes/query_spec.rb +0 -12
  319. data/spec/chewy/query/nodes/range_spec.rb +0 -32
  320. data/spec/chewy/query/nodes/raw_spec.rb +0 -11
  321. data/spec/chewy/query/nodes/regexp_spec.rb +0 -31
  322. data/spec/chewy/query/nodes/script_spec.rb +0 -15
  323. data/spec/chewy/query/pagination/kaminari_spec.rb +0 -57
  324. data/spec/chewy/query/pagination/will_paginage_spec.rb +0 -60
  325. data/spec/chewy/query/pagination_spec.rb +0 -36
  326. data/spec/chewy/query_spec.rb +0 -632
  327. data/spec/chewy/strategy/resque_spec.rb +0 -40
  328. data/spec/chewy/type/actions_spec.rb +0 -31
  329. data/spec/chewy/type/adapter/active_record_spec.rb +0 -317
  330. data/spec/chewy/type/adapter/mongoid_spec.rb +0 -253
  331. data/spec/chewy/type/adapter/object_spec.rb +0 -139
  332. data/spec/chewy/type/adapter/sequel_spec.rb +0 -320
  333. data/spec/chewy/type/import_spec.rb +0 -433
  334. data/spec/chewy/type/mapping_spec.rb +0 -106
  335. data/spec/chewy/type/observe_spec.rb +0 -127
  336. data/spec/chewy/type/witchcraft_spec.rb +0 -154
  337. data/spec/chewy/type/wrapper_spec.rb +0 -58
  338. data/spec/chewy/type_spec.rb +0 -33
  339. data/spec/support/mongoid.rb +0 -81
  340. data/spec/support/sequel.rb +0 -75
data/README.md CHANGED
@@ -1,20 +1,15 @@
1
1
  [![Gem Version](https://badge.fury.io/rb/chewy.svg)](http://badge.fury.io/rb/chewy)
2
- [![Build Status](https://travis-ci.org/toptal/chewy.svg)](https://travis-ci.org/toptal/chewy)
2
+ [![GitHub Actions](https://github.com/toptal/chewy/actions/workflows/ruby.yml/badge.svg)](https://github.com/toptal/chewy/actions/workflows/ruby.yml)
3
3
  [![Code Climate](https://codeclimate.com/github/toptal/chewy.svg)](https://codeclimate.com/github/toptal/chewy)
4
4
  [![Inline docs](http://inch-ci.org/github/toptal/chewy.svg?branch=master)](http://inch-ci.org/github/toptal/chewy)
5
5
 
6
- <p align="right">Sponsored by</p>
7
- <p align="right"><a href="http://www.toptal.com/"><img src="http://www.toptal.com/assets/public/blocks/logo/big.png" alt="Toptal" width="105" height="34"></a></p>
8
-
9
6
  # Chewy
10
7
 
11
- Chewy is an ODM and wrapper for [the official Elasticsearch client](https://github.com/elasticsearch/elasticsearch-ruby).
8
+ Chewy is an ODM (Object Document Mapper), built on top of [the official Elasticsearch client](https://github.com/elastic/elasticsearch-ruby).
12
9
 
13
10
  ## Why Chewy?
14
11
 
15
- * Multi-model indices.
16
-
17
- Index classes are independent from ORM/ODM models. Now, implementing e.g. cross-model autocomplete is much easier. You can just define the index and work with it in an object-oriented style. You can define several types for index - one per indexed model.
12
+ In this section we'll cover why you might want to use Chewy instead of the official `elasticsearch-ruby` client gem.
18
13
 
19
14
  * Every index is observable by all the related models.
20
15
 
@@ -28,12 +23,11 @@ Chewy is an ODM and wrapper for [the official Elasticsearch client](https://gith
28
23
 
29
24
  Chewy has an ActiveRecord-style query DSL. It is chainable, mergeable and lazy, so you can produce queries in the most efficient way. It also has object-oriented query and filter builders.
30
25
 
31
- * Support for ActiveRecord, [Mongoid](https://github.com/mongoid/mongoid) and [Sequel](https://github.com/jeremyevans/sequel).
32
-
26
+ * Support for ActiveRecord.
33
27
 
34
28
  ## Installation
35
29
 
36
- Add this line to your application's Gemfile:
30
+ Add this line to your application's `Gemfile`:
37
31
 
38
32
  gem 'chewy'
39
33
 
@@ -45,19 +39,181 @@ Or install it yourself as:
45
39
 
46
40
  $ gem install chewy
47
41
 
48
- ## Usage
42
+ ## Compatibility
49
43
 
50
- ### Client settings
44
+ ### Ruby
45
+
46
+ Chewy is compatible with MRI 2.6-3.0¹.
47
+
48
+ > ¹ Ruby 3 is only supported with Rails 6.1
49
+
50
+ ### Elasticsearch compatibility matrix
51
+
52
+ | Chewy version | Elasticsearch version |
53
+ | ------------- | ---------------------------------- |
54
+ | 7.2.x | 7.x |
55
+ | 7.1.x | 7.x |
56
+ | 7.0.x | 6.8, 7.x |
57
+ | 6.0.0 | 5.x, 6.x |
58
+ | 5.x | 5.x, limited support for 1.x & 2.x |
59
+
60
+ **Important:** Chewy doesn't follow SemVer, so you should always
61
+ check the release notes before upgrading. The major version is linked to the
62
+ newest supported Elasticsearch and the minor version bumps may include breaking changes.
63
+
64
+ See our [migration guide](migration_guide.md) for detailed upgrade instructions between
65
+ various Chewy versions.
66
+
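Because minor releases may include breaking changes, a pessimistic version constraint is one way to keep an upgrade from crossing a minor boundary unnoticed. The sketch below uses RubyGems' `Gem::Requirement` to show which versions `'~> 7.3.0'` admits. The choice of 7.3.0 is an assumption taken from the target version of this diff, not a recommendation made by the README, and `chewy_version_allowed?` is a hypothetical helper.

```ruby
require 'rubygems'

# Pessimistic constraint: admits 7.3.x patch releases, rejects 7.4.0 and later.
# The same string could be used in a Gemfile: gem 'chewy', '~> 7.3.0'
CHEWY_REQUIREMENT = Gem::Requirement.new('~> 7.3.0')

# Hypothetical helper: true when the given version string satisfies the constraint.
def chewy_version_allowed?(version)
  CHEWY_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

puts chewy_version_allowed?('7.3.4') # patch release inside 7.3
puts chewy_version_allowed?('7.4.0') # next minor release
```

With a constraint like this, moving to the next minor release becomes an explicit Gemfile change, which is a natural point to read the release notes.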
67
+ ### Active Record
68
+
69
+ 5.2, 6.0, 6.1 Active Record versions are supported by all Chewy versions.
70
+
71
+ ## Getting Started
72
+
73
+ Chewy provides functionality for Elasticsearch index handling, documents import mappings, index update strategies and chainable query DSL.
74
+
75
+ ### Minimal client setting
76
+
77
+ Create `config/initializers/chewy.rb` with this line:
78
+
79
+ ```ruby
80
+ Chewy.settings = {host: 'localhost:9250'}
81
+ ```
82
+
83
+ And run `rails g chewy:install` to generate `chewy.yml`:
84
+
85
+ ```yaml
86
+ # config/chewy.yml
87
+ # separate environment configs
88
+ test:
89
+   host: 'localhost:9250'
90
+   prefix: 'test'
91
+ development:
92
+   host: 'localhost:9200'
93
+ ```
94
+
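The generated `chewy.yml` is keyed by environment name. As a rough illustration of what "separate environment configs" means, the plain-Ruby sketch below parses an inline copy of that file and selects one section. It runs without Rails or Chewy; `CHEWY_YML` and `chewy_settings_for` are hypothetical names, and this is not a description of how Chewy itself loads the file.

```ruby
require 'yaml'

# Inline copy of the generated config/chewy.yml from the README.
CHEWY_YML = <<~YAML
  test:
    host: 'localhost:9250'
    prefix: 'test'
  development:
    host: 'localhost:9200'
YAML

# Hypothetical helper: returns the settings for one environment with symbol keys,
# or an empty hash when the file has no section for that environment.
def chewy_settings_for(env, yaml = CHEWY_YML)
  all = YAML.safe_load(yaml) || {}
  (all[env.to_s] || {}).transform_keys(&:to_sym)
end

p chewy_settings_for('test')
p chewy_settings_for('production')
```

An environment without its own section yields an empty hash here, which makes a missing `production` entry easy to spot before deployment.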
95
+ ### Elasticsearch
96
+
97
+ Make sure you have Elasticsearch up and running. You can [install](https://www.elastic.co/guide/en/elasticsearch/reference/current/install-elasticsearch.html) it locally, but the easiest way is to use [Docker](https://www.docker.com/get-started):
98
+
99
+ ```shell
100
+ $ docker run --rm --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" elasticsearch:7.11.1
101
+ ```
102
+
103
+ ### Index
104
+
105
+ Create `app/chewy/users_index.rb` with User Index:
106
+
107
+ ```ruby
108
+ class UsersIndex < Chewy::Index
109
+ settings analysis: {
110
+ analyzer: {
111
+ email: {
112
+ tokenizer: 'keyword',
113
+ filter: ['lowercase']
114
+ }
115
+ }
116
+ }
51
117
 
52
- There are two ways to configure the Chewy client: the `Chewy.settings` hash and `chewy.yml`
118
+ index_scope User
119
+ field :first_name
120
+ field :last_name
121
+ field :email, analyzer: 'email'
122
+ end
123
+ ```
53
124
 
54
- You can create this file manually or run `rails g chewy:install`.
125
+ ### Model
126
+
127
+ Add User model, table and migrate it:
128
+
129
+ ```shell
130
+ $ bundle exec rails g model User first_name last_name email
131
+ $ bundle exec rails db:migrate
132
+ ```
133
+
134
+ Add `update_index` to app/models/user.rb:
135
+
136
+ ```ruby
137
+ class User < ApplicationRecord
138
+   update_index('users') { self }
139
+ end
140
+ ```
141
+
142
+ ### Example of data request
143
+
144
+ 1. Once a record is created (could be done via the Rails console), it creates User index too:
145
+
146
+ ```
147
+ User.create(
148
+ first_name: "test1",
149
+ last_name: "test1",
150
+ email: 'test1@example.com',
151
+ # other fields
152
+ )
153
+ # UsersIndex Import (355.3ms) {:index=>1}
154
+ # => #<User id: 1, first_name: "test1", last_name: "test1", email: "test1@example.com", # other fields>
155
+ ```
156
+
157
+ 2. A query could be exposed at a given `UsersController`:
158
+
159
+ ```ruby
160
+ def search
161
+   @users = UsersIndex.query(query_string: { fields: [:first_name, :last_name, :email, ...], query: search_params[:query], default_operator: 'and' })
162
+   render json: @users.to_json, status: :ok
163
+ end
164
+
165
+ private
166
+
167
+ def search_params
168
+   params.permit(:query, :page, :per)
169
+ end
169
+ end
170
+ ```
171
+
172
+ 3. So a request against `http://localhost:3000/users/search?query=test1@example.com` issuing a response like:
173
+
174
+ ```json
175
+ [
176
+ {
177
+ "attributes":{
178
+ "id":"1",
179
+ "first_name":"test1",
180
+ "last_name":"test1",
181
+ "email":"test1@example.com",
182
+ ...
183
+ "_score":0.9808291,
184
+ "_explanation":null
185
+ },
186
+ "_data":{
187
+ "_index":"users",
188
+ "_type":"_doc",
189
+ "_id":"1",
190
+ "_score":0.9808291,
191
+ "_source":{
192
+ "first_name":"test1",
193
+ "last_name":"test1",
194
+ "email":"test1@example.com",
195
+ ...
196
+ }
197
+ }
198
+ }
199
+ ]
200
+ ```
201
+
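The search action in step 2 passes a `query_string` hash to `UsersIndex.query`. As a minimal sketch, that hash can be built by a plain Ruby helper, which makes it checkable without a running cluster. The helper name `user_search_query` and the fixed three-field list are assumptions for illustration; the README elides any further fields.

```ruby
# Fields named explicitly in the README example; the elided ones are not guessed here.
SEARCH_FIELDS = %i[first_name last_name email].freeze

# Hypothetical helper: builds the query_string body used in the controller example.
def user_search_query(text, fields: SEARCH_FIELDS)
  {
    query_string: {
      fields: fields,
      query: text.to_s,       # nil becomes an empty string
      default_operator: 'and' # every term must match
    }
  }
end

p user_search_query('test1@example.com')
```

In the controller, the result would presumably be handed over as `UsersIndex.query(user_search_query(search_params[:query]))`, keeping the action itself to a single line.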
202
+ ## Usage and configuration
203
+
204
+ ### Client settings
205
+
206
+ To configure the Chewy client you need to add `chewy.rb` file with `Chewy.settings` hash:
55
207
 
56
208
  ```ruby
57
209
  # config/initializers/chewy.rb
58
210
  Chewy.settings = {host: 'localhost:9250'} # do not use environments
59
211
  ```
60
212
 
213
+ And add `chewy.yml` configuration file.
214
+
215
+ You can create `chewy.yml` manually or run `rails g chewy:install` to generate it:
216
+
61
217
  ```yaml
62
218
  # config/chewy.yml
63
219
  # separate environment configs
@@ -83,7 +239,31 @@ Chewy.logger = Logger.new(STDOUT)
83
239
 
84
240
  See [config.rb](lib/chewy/config.rb) for more details.
85
241
 
86
- ### Index definition
242
+ #### AWS Elasticsearch
243
+
244
+ If you would like to use AWS's Elasticsearch using an IAM user policy, you will need to sign your requests for the `es:*` action by injecting the appropriate headers passing a proc to `transport_options`.
245
+ You'll need an additional gem for Faraday middleware: add `gem 'faraday_middleware-aws-sigv4'` to your Gemfile.
246
+
247
+ ```ruby
248
+ require 'faraday_middleware/aws_sigv4'
249
+
250
+ Chewy.settings = {
251
+ host: 'http://my-es-instance-on-aws.us-east-1.es.amazonaws.com:80',
252
+ port: 80, # 443 for https host
253
+ transport_options: {
254
+ headers: { content_type: 'application/json' },
255
+ proc: -> (f) do
256
+ f.request :aws_sigv4,
257
+ service: 'es',
258
+ region: 'us-east-1',
259
+ access_key_id: ENV['AWS_ACCESS_KEY'],
260
+ secret_access_key: ENV['AWS_SECRET_ACCESS_KEY']
261
+ end
262
+ }
263
+ }
264
+ ```
265
+
266
+ #### Index definition
87
267
 
88
268
  1. Create `/app/chewy/users_index.rb`
89
269
 
@@ -93,41 +273,38 @@ See [config.rb](lib/chewy/config.rb) for more details.
93
273
  end
94
274
  ```
95
275
 
96
- 2. Add one or more types mapping
276
+ 2. Define index scope (you can omit this part if you don't need to specify a scope (i.e. use PORO objects for import) or options)
97
277
 
98
278
  ```ruby
99
279
  class UsersIndex < Chewy::Index
100
- define_type User.active # or just model instead_of scope: define_type User
280
+ index_scope User.active # or just model instead_of scope: index_scope User
101
281
  end
102
282
  ```
103
283
 
104
- Newly-defined index type class is accessible via `UsersIndex.user` or `UsersIndex::User`
105
-
106
- 3. Add some type mappings
284
+ 3. Add some mappings
107
285
 
108
286
  ```ruby
109
287
  class UsersIndex < Chewy::Index
110
- define_type User.active.includes(:country, :badges, :projects) do
111
- field :first_name, :last_name # multiple fields without additional options
112
- field :email, analyzer: 'email' # Elasticsearch-related options
113
- field :country, value: ->(user) { user.country.name } # custom value proc
114
- field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
115
- field :projects do # the same block syntax for multi_field, if `:type` is specified
116
- field :title
117
- field :description # default data type is `string`
118
- # additional top-level objects passed to value proc:
119
- field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
120
- end
121
- field :rating, type: 'integer' # custom data type
122
- field :created, type: 'date', include_in_all: false,
123
- value: ->{ created_at } # value proc for source object context
288
+ index_scope User.active.includes(:country, :badges, :projects)
289
+ field :first_name, :last_name # multiple fields without additional options
290
+ field :email, analyzer: 'email' # Elasticsearch-related options
291
+ field :country, value: ->(user) { user.country.name } # custom value proc
292
+ field :badges, value: ->(user) { user.badges.map(&:name) } # passing array values to index
293
+ field :projects do # the same block syntax for multi_field, if `:type` is specified
294
+ field :title
295
+ field :description # default data type is `text`
296
+ # additional top-level objects passed to value proc:
297
+ field :categories, value: ->(project, user) { project.categories.map(&:name) if user.active? }
124
298
  end
299
+ field :rating, type: 'integer' # custom data type
300
+ field :created, type: 'date', include_in_all: false,
301
+ value: ->{ created_at } # value proc for source object context
125
302
  end
126
303
  ```
127
304
 
128
- [See here for mapping definitions](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping.html).
305
+ [See here for mapping definitions](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html).
129
306
 
130
- 4. Add some index- and type-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
307
+ 4. Add some index-related settings. Analyzer repositories might be used as well. See `Chewy::Index.settings` docs for details:
131
308
 
132
309
  ```ruby
133
310
  class UsersIndex < Chewy::Index
@@ -140,69 +317,61 @@ See [config.rb](lib/chewy/config.rb) for more details.
140
317
  }
141
318
  }
142
319
 
143
- define_type User.active.includes(:country, :badges, :projects) do
144
- root date_detection: false do
145
- template 'about_translations.*', type: 'string', analyzer: 'standard'
146
-
147
- field :first_name, :last_name
148
- field :email, analyzer: 'email'
149
- field :country, value: ->(user) { user.country.name }
150
- field :badges, value: ->(user) { user.badges.map(&:name) }
151
- field :projects do
152
- field :title
153
- field :description
154
- end
155
- field :about_translations, type: 'object' # pass object type explicitly if necessary
156
- field :rating, type: 'integer'
157
- field :created, type: 'date', include_in_all: false,
158
- value: ->{ created_at }
320
+ index_scope User.active.includes(:country, :badges, :projects)
321
+ root date_detection: false do
322
+ template 'about_translations.*', type: 'text', analyzer: 'standard'
323
+
324
+ field :first_name, :last_name
325
+ field :email, analyzer: 'email'
326
+ field :country, value: ->(user) { user.country.name }
327
+ field :badges, value: ->(user) { user.badges.map(&:name) }
328
+ field :projects do
329
+ field :title
330
+ field :description
159
331
  end
332
+ field :about_translations, type: 'object' # pass object type explicitly if necessary
333
+ field :rating, type: 'integer'
334
+ field :created, type: 'date', include_in_all: false,
335
+ value: ->{ created_at }
160
336
  end
161
337
  end
162
338
  ```
163
339
 
164
- [See index settings here](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-update-settings.html).
165
- [See root object settings here](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html).
340
+ [See index settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html).
341
+ [See root object settings here](https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html).
166
342
 
167
- See [mapping.rb](lib/chewy/type/mapping.rb) for more details.
343
+ See [mapping.rb](lib/chewy/index/mapping.rb) for more details.
168
344
 
169
345
  5. Add model-observing code
170
346
 
171
347
  ```ruby
172
348
  class User < ActiveRecord::Base
173
- update_index('users#user') { self } # specifying index, type and back-reference
349
+ update_index('users') { self } # specifying index and back-reference
174
350
  # for updating after user save or destroy
175
351
  end
176
352
 
177
353
  class Country < ActiveRecord::Base
178
354
  has_many :users
179
355
 
180
- update_index('users#user') { users } # return single object or collection
356
+ update_index('users') { users } # return single object or collection
181
357
  end
182
358
 
183
359
  class Project < ActiveRecord::Base
184
- update_index('users#user') { user if user.active? } # you can return even `nil` from the back-reference
185
- end
186
-
187
- class Badge < ActiveRecord::Base
188
- has_and_belongs_to_many :users
189
-
190
- update_index('users') { users } # if index has only one type
191
- # there is no need to specify updated type
360
+ update_index('users') { user if user.active? } # you can return even `nil` from the back-reference
192
361
  end
193
362
 
194
363
  class Book < ActiveRecord::Base
195
- update_index(->(book) {"books#book_#{book.language}"}) { self } # dynamic index and type with proc.
196
- # For book with language == "en"
197
- # this code will generate `books#book_en`
364
+ update_index(->(book) {"books_#{book.language}"}) { self } # dynamic index name with proc.
365
+ # For book with language == "en"
366
+ # this code will generate `books_en`
198
367
  end
199
368
  ```
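The proc form of the first argument can be checked in isolation. The sketch below uses a hypothetical `Book` struct in place of the ActiveRecord model and only shows how the index name string is derived; it does not involve Chewy itself.

```ruby
# Hypothetical stand-in for the ActiveRecord model; only `language` matters here.
Book = Struct.new(:language)

# Same proc as the one passed to `update_index` above.
INDEX_NAME = ->(book) { "books_#{book.language}" }

puts INDEX_NAME.call(Book.new('en')) # prints "books_en"
puts INDEX_NAME.call(Book.new('de')) # prints "books_de"
```

Keeping the proc free of side effects makes it easy to test without a database or an Elasticsearch connection.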
200
369
 
201
370
  Also, you can use the second argument for method name passing:
202
371
 
203
372
  ```ruby
204
- update_index('users#user', :self)
205
- update_index('users#user', :users)
373
+ update_index('users', :self)
374
+ update_index('users', :users)
206
375
  ```
207
376
 
208
377
  In the case of a belongs_to association you may need to update both associated objects, previous and current:
@@ -211,47 +380,28 @@ See [config.rb](lib/chewy/config.rb) for more details.
211
380
  class City < ActiveRecord::Base
212
381
  belongs_to :country
213
382
 
214
- update_index('cities#city') { self }
215
- update_index 'countries#country' do
216
- # For the latest active_record changed values are
217
- # already in `previous_changes` hash,
218
- # but for mongoid you have to use `changes` hash
383
+ update_index('cities') { self }
384
+ update_index 'countries' do
219
385
  previous_changes['country_id'] || country
220
386
  end
221
387
  end
222
388
  ```
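To see why the back-reference falls back to the current association, here is a minimal sketch in plain Ruby. It assumes `previous_changes` maps attribute names to `[old_value, new_value]` pairs, as ActiveRecord does after a save; the helper name `countries_to_reindex` is hypothetical and works with ids rather than records.

```ruby
# Hypothetical helper mirroring `previous_changes['country_id'] || country`.
# When the foreign key changed, both the old and the new country need a reindex;
# otherwise only the current one does.
def countries_to_reindex(previous_changes, current_country_id)
  previous_changes['country_id'] || [current_country_id]
end

p countries_to_reindex({ 'country_id' => [1, 2] }, 2) # => [1, 2]
p countries_to_reindex({ 'name' => %w[Old New] }, 2)  # => [2]
```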
223
389
 
224
- You can observe Sequel models in the same way as ActiveRecord:
390
+ ### Default import options
225
391
 
226
- ```ruby
227
- class User < Sequel::Model
228
- update_index('users#user') { self }
229
- end
230
- ```
231
-
232
- However, to make it work, you must load the chewy plugin into Sequel model:
233
-
234
- ```ruby
235
- Sequel::Model.plugin :chewy_observe # for all models, or...
236
- User.plugin :chewy_observe # just for User
237
- ```
238
-
239
- ### Type default import options
240
-
241
- Every type has `default_import_options` configuration to specify, suddenly, default import options:
392
+ Every index has a `default_import_options` configuration for specifying default import options:
242
393
 
243
394
  ```ruby
244
395
  class ProductsIndex < Chewy::Index
245
- define_type Post.includes(:tags) do
246
- default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
396
+ index_scope Post.includes(:tags)
397
+ default_import_options batch_size: 100, bulk_size: 10.megabytes, refresh: false
247
398
 
248
- field :name
249
- field :tags, value: -> { tags.map(&:name) }
250
- end
399
+ field :name
400
+ field :tags, value: -> { tags.map(&:name) }
251
401
  end
252
402
  ```
253
403
 
254
- See [import.rb](lib/chewy/type/import.rb) for available options.
404
+ See [import.rb](lib/chewy/index/import.rb) for available options.
255
405
 
256
406
  ### Multi (nested) and object field types
257
407
 
@@ -269,17 +419,17 @@ This will automatically set the type or root field to `object`. You may also spe
269
419
  To define a multi field you have to specify any type except for `object` or `nested` in the root field:
270
420
 
271
421
  ```ruby
272
- field :full_name, type: 'string', value: ->{ full_name.strip } do
422
+ field :full_name, type: 'text', value: ->{ full_name.strip } do
273
423
  field :ordered, analyzer: 'ordered'
274
- field :untouched, index: 'not_analyzed'
424
+ field :untouched, type: 'keyword'
275
425
  end
276
426
  ```
277
427
 
278
- The `value:` option for internal fields would no longer be effective.
428
+ The `value:` option for internal fields will no longer be effective.
279
429
 
280
430
  ### Geo Point fields
281
431
 
282
- You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-geo-point-type.html) with the `geo_point` field type, allowing you to query, filter and order by latitude and longitude. You can use the following hash format:
432
+ You can use [Elasticsearch's geo mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/geo-point.html) with the `geo_point` field type, allowing you to query, filter and order by latitude and longitude. You can use the following hash format:
283
433
 
284
434
  ```ruby
285
435
  field :coordinates, type: 'geo_point', value: ->{ {lat: latitude, lon: longitude} }
@@ -296,20 +446,36 @@ end
296
446
 
297
447
  See the section on *Script fields* for details on calculating distance in a search.
298
448
 
449
+ ### Join fields
450
+
451
+ You can use a [join field](https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html)
452
+ to implement parent-child relationships between documents.
453
+ It [replaces the old `parent_id` based parent-child mapping](https://www.elastic.co/guide/en/elasticsearch/reference/current/removal-of-types.html#parent-child-mapping-types)
454
+
455
+ To use it, you need to pass `relations` and `join` (with `type` and `id`) options:
456
+ ```ruby
457
+ field :hierarchy_link, type: :join, relations: {question: %i[answer comment], answer: :vote, vote: :subvote}, join: {type: :comment_type, id: :commented_id}
458
+ ```
459
+ assuming you have `comment_type` and `commented_id` fields in your model.
460
+
461
+ Note that when you reindex a parent, its children and grandchildren will be reindexed as well.
462
+ This may require additional queries to the primary database and to elastisearch.
463
+
464
+ Also note that the join field doesn't support crutches (it should be a field directly defined on the model).
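To get a feeling for how far a parent reindex can cascade, the relations hash can be walked in plain Ruby. This is only an illustrative sketch: `descendant_relations` is a hypothetical helper, not part of Chewy, and it works on relation names rather than on documents.

```ruby
# Relations hash copied from the `field :hierarchy_link` example above.
RELATIONS = { question: %i[answer comment], answer: :vote, vote: :subvote }.freeze

# Returns every relation name below `parent`, breadth-first.
def descendant_relations(relations, parent)
  queue = [parent]
  found = []
  until queue.empty?
    # A relation may map to a single child, a list of children, or nothing.
    children = Array(relations[queue.shift])
    found.concat(children)
    queue.concat(children)
  end
  found
end

p descendant_relations(RELATIONS, :question) # => [:answer, :comment, :vote, :subvote]
p descendant_relations(RELATIONS, :vote)     # => [:subvote]
```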
465
+
299
466
  ### Crutches™ technology
300
467
 
301
468
  Assume you are defining your index like this (product has_many categories through product_categories):
302
469
 
303
470
  ```ruby
304
471
  class ProductsIndex < Chewy::Index
305
- define_type Product.includes(:categories) do
306
- field :name
307
- field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
308
- end
472
+ index_scope Product.includes(:categories)
473
+ field :name
474
+ field :category_names, value: ->(product) { product.categories.map(&:name) } # or shorter just -> { categories.map(&:name) }
309
475
  end
310
476
  ```
311
477
 
312
- Then the Chewy reindexing flow would look like the following pseudo-code (even in Mongoid):
478
+ Then the Chewy reindexing flow will look like the following pseudo-code:
313
479
 
314
480
  ```ruby
315
481
  Product.includes(:categories).find_in_batches(1000) do |batch|
@@ -321,30 +487,27 @@ Product.includes(:categories).find_in_batches(1000) do |batch|
321
487
  end
322
488
  ```
323
489
 
324
- But in Rails 4.1 and 4.2 you may face a problem with slow associations (take a look at https://github.com/rails/rails/pull/19423). Also, there might be really complicated cases when associations are not applicable.
325
-
326
- Then you can replace Rails associations with Chewy Crutches™ technology:
490
+ If you run into complicated cases where associations are not applicable, you can replace Rails associations with Chewy Crutches™ technology:
327
491
 
328
492
  ```ruby
329
493
  class ProductsIndex < Chewy::Index
330
- define_type Product do
331
- crutch :categories do |collection| # collection here is a current batch of products
332
- # data is fetched with a lightweight query without objects initialization
333
- data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
334
- # then we have to convert fetched data to appropriate format
335
- # this will return our data in structure like:
336
- # {123 => ['sweets', 'juices'], 456 => ['meat']}
337
- data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
338
- end
339
-
340
- field :name
341
- # simply use crutch-fetched data as a value:
342
- field :category_names, value: ->(product, crutches) { crutches.categories[product.id] }
494
+ index_scope Product
495
+ crutch :categories do |collection| # collection here is a current batch of products
496
+ # data is fetched with a lightweight query without objects initialization
497
+ data = ProductCategory.joins(:category).where(product_id: collection.map(&:id)).pluck(:product_id, 'categories.name')
498
+ # then we have to convert fetched data to appropriate format
499
+ # this will return our data in structure like:
500
+ # {123 => ['sweets', 'juices'], 456 => ['meat']}
501
+ data.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
343
502
  end
503
+
504
+ field :name
505
+ # simply use crutch-fetched data as a value:
506
+ field :category_names, value: ->(product, crutches) { crutches[:categories][product.id] }
344
507
  end
345
508
  ```
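The grouping step inside the crutch can be tried without a database. The sketch below feeds the same expression with hard-coded rows shaped like the result of `pluck(:product_id, 'categories.name')`; the constant and method names are made up for the illustration.

```ruby
# Rows shaped like the result of `pluck(:product_id, 'categories.name')`.
ROWS = [[123, 'sweets'], [123, 'juices'], [456, 'meat']].freeze

# Same grouping expression as in the crutch block above.
def group_names_by_product(rows)
  rows.each.with_object({}) { |(id, name), result| (result[id] ||= []).push(name) }
end

CATEGORY_NAMES = group_names_by_product(ROWS)
p CATEGORY_NAMES[123] # => ["sweets", "juices"]
p CATEGORY_NAMES[789] # => nil
```

Note that a product without categories is simply absent from the hash, so the lookup returns `nil` rather than an empty array.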
346
509
 
347
- An example flow would look like this:
510
+ An example flow will look like this:
348
511
 
349
512
  ```ruby
350
513
  Product.includes(:categories).find_in_batches(1000) do |batch|
@@ -362,22 +525,21 @@ So Chewy Crutches™ technology is able to increase your indexing performance in
362
525
 
363
526
  ### Witchcraft™ technology
364
527
 
365
- One more experimental technology to increase import performance. As far as you know, chewy defines value proc for every imported field in mapping, so at the import time each of this procs is executed on imported object to extract result document to import. It would be great for performance to use one huge whole-document-returning proc instead. So basically the idea or Witchcraft™ technology is to compile a single document-returning proc from the type definition.
528
+ One more experimental technology to increase import performance. As you know, Chewy defines a value proc for every imported field in the mapping, so at import time each of these procs is executed on the imported object to extract the resulting document. It would be great for performance to use one huge whole-document-returning proc instead. So basically the idea of Witchcraft™ technology is to compile a single document-returning proc from the index definition.
366
529
 
367
530
  ```ruby
368
- define_type Product do
369
- witchcraft!
370
-
371
- field :title
372
- field :tags, value: -> { tags.map(&:name) }
373
- field :categories do
374
- field :name, value: -> (product, category) { category.name }
375
- field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
376
- end
531
+ index_scope Product
532
+ witchcraft!
533
+
534
+ field :title
535
+ field :tags, value: -> { tags.map(&:name) }
536
+ field :categories do
537
+ field :name, value: -> (product, category) { category.name }
538
+ field :type, value: -> (product, category, crutch) { crutch.types[category.name] }
377
539
  end
378
540
  ```
379
541
 
380
- The type definition above will be compiled to something close to:
542
+ The index definition above will be compiled to something close to:
381
543
 
382
544
  ```ruby
383
545
  -> (object, crutches) do
@@ -395,19 +557,128 @@ end
395
557
  ```
396
558
 
397
559
  And don't even ask how is it possible, it is a witchcraft.
398
- Obviously not every type of definition might be compiled, so use reasonable formatting to make `method_source` be able to extract field value proc sources. Also value procs with splat arguments are not supported right now. However, it is quite possible that your type definition will be supported by Witchcraft™ technology out of the box in the most of the cases.
560
+ Obviously, not every definition can be compiled. There are some restrictions:
561
+
562
+ 1. Use reasonable formatting to make `method_source` be able to extract field value proc sources.
563
+ 2. Value procs with splat arguments are not supported right now.
564
+ 3. If you are generating fields dynamically, use value procs with arguments; argumentless value procs are not supported yet:
565
+
566
+ ```ruby
567
+ [:first_name, :last_name].each do |name|
568
+ field name, value: -> (o) { o.send(name) }
569
+ end
570
+ ```
571
+
572
+ However, it is quite possible that your index definition will be supported by Witchcraft™ technology out of the box in most of the cases.
573
+
574
+ ### Raw Import
575
+
576
+ Another way to speed up import time is Raw Imports. This technology is only available in ActiveRecord adapter. Very often, ActiveRecord model instantiation is what consumes most of the CPU and RAM resources. Precious time is wasted on converting, say, timestamps from strings and then serializing them back to strings. Chewy can operate on raw hashes of data directly obtained from the database. All you need is to provide a way to convert that hash to a lightweight object that mimics the behaviour of the normal ActiveRecord object.
577
+
578
+ ```ruby
579
+ class LightweightProduct
580
+ def initialize(attributes)
581
+ @attributes = attributes
582
+ end
583
+
584
+ # Depending on the database, `created_at` might
585
+ # be in different formats. In PostgreSQL, for example,
586
+ # you might see the following format:
587
+ # "2016-03-22 16:23:22"
588
+ #
589
+ # Taking into account that Elastic expects something different,
590
+ # one might do something like the following, just to avoid
591
+ # unnecessary String -> DateTime -> String conversion.
592
+ #
593
+ # "2016-03-22 16:23:22" -> "2016-03-22T16:23:22Z"
594
+ def created_at
595
+ @attributes['created_at'].tr(' ', 'T') << 'Z'
596
+ end
597
+ end
598
+
599
+ index_scope Product
600
+ default_import_options raw_import: ->(hash) {
601
+ LightweightProduct.new(hash)
602
+ }
399
603
 
400
- ### Types access
604
+ field :created_at, type: 'date'
605
+ ```
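The timestamp conversion is the part most likely to go wrong, so it can be worth checking on its own. A minimal sketch, assuming the database returns UTC timestamps as `YYYY-MM-DD HH:MM:SS` strings; `to_iso8601_utc` is a hypothetical helper doing the same job as `LightweightProduct#created_at`.

```ruby
# Hypothetical helper; assumes the raw value is a UTC timestamp string
# in the "YYYY-MM-DD HH:MM:SS" format.
def to_iso8601_utc(raw_timestamp)
  # Interpolation builds a new string, so a frozen input is not a problem.
  "#{raw_timestamp.tr(' ', 'T')}Z"
end

RAW_ROW = { 'created_at' => '2016-03-22 16:23:22' }.freeze
puts to_iso8601_utc(RAW_ROW['created_at']) # prints "2016-03-22T16:23:22Z"
```

If your database stores local times or fractional seconds, the conversion needs to be adjusted accordingly.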
606
+
607
+ Also, you can pass `:raw_import` option to the `import` method explicitly.
608
+
609
+ ### Index creation during import
610
+
611
+ By default, when you perform an import, Chewy checks whether the index exists and creates it if it's absent.
612
+ You can turn off this feature to decrease the number of Elasticsearch requests.
613
+ To do so you need to set `skip_index_creation_on_import` parameter to `false` in your `config/chewy.yml`
614
+
615
+ ### Skip record fields during import
616
+
617
+ You can use `ignore_blank: true` to skip fields that return `true` for the `.blank?` method:
618
+
619
+ ```ruby
620
+ index_scope Country
621
+ field :id
622
+ field :cities, ignore_blank: true do
623
+ field :id
624
+ field :name
625
+ field :surname, ignore_blank: true
626
+ field :description
627
+ end
628
+ ```
629
+
630
+ #### Default values for different types
631
+
632
+ By default `ignore_blank` is false on every type except `geo_point`.
633
+
634
+ ### Journaling
401
635
 
402
- You can access index-defined types with the following API:
636
+ You can record all performed actions in a separate journal index in Elasticsearch.
637
+ When you create, update, or destroy documents, the action is saved in this special index.
638
+ If you do something with a batch of documents (e.g. during index reset), it will be saved as one record, including the primary keys of each document that was affected.
639
+ A typical journal record looks like this:
640
+
641
+ ```json
642
+ {
643
+ "action": "index",
644
+ "object_id": [1, 2, 3],
645
+ "index_name": "...",
646
+ "created_at": "<timestamp>"
647
+ }
648
+ ```
649
+
650
+ This feature is turned off by default.
651
+ You can turn it on by setting the `journal` option to `true` in `config/chewy.yml`.
652
+ Also, you can specify journal index name. For example:
653
+
654
+ ```yaml
655
+ # config/chewy.yml
656
+ production:
657
+ journal: true
658
+ journal_name: my_super_journal
659
+ ```
660
+
661
+ Also, you can provide this option while you're importing some index:
403
662
 
404
663
  ```ruby
405
- UsersIndex::User # => UsersIndex::User
406
- UsersIndex.type_hash['user'] # => UsersIndex::User
407
- UsersIndex.types # => [UsersIndex::User]
408
- UsersIndex.type_names # => ['user']
664
+ CityIndex.import journal: true
409
665
  ```
410
666
 
667
+ Or as a default import option for an index:
668
+
669
+ ```ruby
670
+ class CityIndex < Chewy::Index
671
+ index_scope City
672
+ default_import_options journal: true
673
+ end
674
+ ```
675
+
676
+ You may be wondering why you need it. The answer is simple: not to lose the data.
677
+
678
+ Imagine that you reset your index in a zero-downtime manner (to separate index), and in the meantime somebody keeps updating the data frequently (to old index). So all these actions will be written to the journal index and you'll be able to apply them after index reset using the `Chewy::Journal` interface.
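The replay idea can be illustrated with plain hashes. The sketch below is not the `Chewy::Journal` implementation; it only assumes records in the shape shown earlier, with made-up values, and a hypothetical `ids_to_replay` helper that collects the ids touched after a given moment.

```ruby
# Journal-like records in the shape shown above; values are made up for illustration.
ENTRIES = [
  { 'action' => 'index',  'object_id' => [1, 2], 'index_name' => 'cities', 'created_at' => Time.utc(2024, 1, 1, 10) },
  { 'action' => 'delete', 'object_id' => [3],    'index_name' => 'cities', 'created_at' => Time.utc(2024, 1, 1, 12) },
  { 'action' => 'index',  'object_id' => [2, 4], 'index_name' => 'cities', 'created_at' => Time.utc(2024, 1, 1, 13) }
].freeze

# Collects the unique ids touched in `index_name` at or after `since`.
def ids_to_replay(entries, index_name, since)
  entries
    .select { |entry| entry['index_name'] == index_name && entry['created_at'] >= since }
    .flat_map { |entry| entry['object_id'] }
    .uniq
end

# Suppose the reset started at 11:00: everything journaled later must be re-imported.
p ids_to_replay(ENTRIES, 'cities', Time.utc(2024, 1, 1, 11)) # => [3, 2, 4]
```

Re-importing by id is enough for both actions, because importing an id that no longer exists in the database results in a delete from the index.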
679
+
680
+ When enabled, the journal can grow to an enormous size, so consider setting up a cron job that cleans it occasionally using the [`chewy:journal:clean` rake task](#chewyjournal).
681
+
411
682
  ### Index manipulation
412
683
 
413
684
  ```ruby
@@ -420,24 +691,22 @@ UsersIndex.create! # use bang or non-bang methods
420
691
  UsersIndex.purge
421
692
  UsersIndex.purge! # deletes then creates index
422
693
 
423
- UsersIndex::User.import # import with 0 arguments process all the data specified in type definition
424
- # literally, User.active.includes(:country, :badges, :projects).find_in_batches
425
- UsersIndex::User.import User.where('rating > 100') # or import specified users scope
426
- UsersIndex::User.import User.where('rating > 100').to_a # or import specified users array
427
- UsersIndex::User.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
694
+ UsersIndex.import # import with 0 arguments processes all the data specified in the index_scope definition
695
+ UsersIndex.import User.where('rating > 100') # or import specified users scope
696
+ UsersIndex.import User.where('rating > 100').to_a # or import specified users array
697
+ UsersIndex.import [1, 2, 42] # pass even ids for import, it will be handled in the most effective way
698
+ UsersIndex.import User.where('rating > 100'), update_fields: [:email] # if update fields are specified - it will update their values only with the `update` bulk action
699
+ UsersIndex.import! # raises an exception in case of any import errors
428
700
 
429
- UsersIndex.import # import every defined type
430
- UsersIndex.import user: User.where('rating > 100') # import only active users to `user` type.
431
- # Other index types, if exists, will be imported with default scope from the type definition.
432
701
  UsersIndex.reset! # purges index and imports default data for all types
433
702
  ```
434
703
 
435
- If the passed user is `#destroyed?`, or satisfies a `delete_if` type option, or the specified id does not exist in the database, import will perform delete from index action for this object.
704
+ If the passed user is `#destroyed?`, or satisfies a `delete_if` index_scope option, or the specified id does not exist in the database, import will perform delete from index action for this object.
436
705
 
437
706
  ```ruby
438
- define_type User, delete_if: :deleted_at
439
- define_type User, delete_if: -> { deleted_at }
440
- define_type User, delete_if: ->(user) { user.deleted_at }
707
+ index_scope User, delete_if: :deleted_at
708
+ index_scope User, delete_if: -> { deleted_at }
709
+ index_scope User, delete_if: ->(user) { user.deleted_at }
441
710
  ```
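The three `delete_if` forms differ only in how the condition reaches the object. The following sketch shows one way such an option could be evaluated; `delete_from_index?` here is a hypothetical helper written for this illustration and `User` is a plain struct, so this is not necessarily how Chewy does it internally.

```ruby
# Plain struct standing in for the model.
User = Struct.new(:deleted_at)

BY_SYMBOL   = :deleted_at
BY_BLOCK    = -> { deleted_at }
BY_ARGUMENT = ->(user) { user.deleted_at }

# Hypothetical evaluator for the three `delete_if` forms shown above.
def delete_from_index?(object, delete_if)
  result =
    case delete_if
    when Symbol then object.public_send(delete_if)
    when Proc
      # Argumentless procs run in the object's context, others receive the object.
      delete_if.arity.zero? ? object.instance_exec(&delete_if) : delete_if.call(object)
    end
  !!result
end

DELETED = User.new(Time.now)
ACTIVE  = User.new(nil)

p [BY_SYMBOL, BY_BLOCK, BY_ARGUMENT].map { |rule| delete_from_index?(DELETED, rule) } # => [true, true, true]
p [BY_SYMBOL, BY_BLOCK, BY_ARGUMENT].map { |rule| delete_from_index?(ACTIVE, rule) }  # => [false, false, false]
```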
442
711
 
443
712
  See [actions.rb](lib/chewy/index/actions.rb) for more details.
@@ -448,13 +717,12 @@ Assume you've got the following code:
448
717
 
449
718
  ```ruby
450
719
  class City < ActiveRecord::Base
451
- update_index 'cities#city', :self
720
+ update_index 'cities', :self
452
721
  end
453
722
 
454
723
  class CitiesIndex < Chewy::Index
455
- define_type City do
456
- field :name
457
- end
724
+ index_scope City
725
+ field :name
458
726
  end
459
727
  ```
460
728
 
@@ -474,26 +742,112 @@ end
474
742
 
475
743
  Using this strategy delays the index update request until the end of the block. Updated records are aggregated and the index update happens with the bulk API. So this strategy is highly optimized.
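
The aggregation idea can be sketched in a few lines of plain Ruby. `AtomicBuffer` below is a hypothetical class written for this illustration, not Chewy's strategy class: it collects ids while the block runs and then records a single batch, which stands in for one bulk API request.

```ruby
require 'set'

# Hypothetical buffer illustrating the idea behind `:atomic`.
class AtomicBuffer
  attr_reader :flushed_batches

  def initialize
    @pending = Set.new
    @flushed_batches = []
  end

  # Remembers an id; duplicates within one block are collapsed.
  def update(id)
    @pending << id
  end

  # Runs the block, then flushes everything collected in one batch.
  def wrap
    result = yield self
    flush
    result
  end

  private

  def flush
    return if @pending.empty?

    @flushed_batches << @pending.to_a # stand-in for one bulk API request
    @pending.clear
  end
end

BUFFER = AtomicBuffer.new
BUFFER.wrap { |buffer| [1, 2, 2, 3].each { |id| buffer.update(id) } }
p BUFFER.flushed_batches # => [[1, 2, 3]]
```

Four updates result in one batch of three unique ids, which is where the performance gain over per-record requests comes from.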
476
744
 
477
- #### `:resque`
745
+ #### `:sidekiq`
478
746
 
479
- This does the same thing as `:atomic`, but asynchronously using resque. The default queue name is `chewy`. Patch `Chewy::Strategy::Resque::Worker` for index updates improving.
747
+ This does the same thing as `:atomic`, but asynchronously using sidekiq. Patch `Chewy::Strategy::Sidekiq::Worker` to customize how index updates are performed.
480
748
 
481
749
  ```ruby
482
- Chewy.strategy(:resque) do
750
+ Chewy.strategy(:sidekiq) do
483
751
  City.popular.map(&:do_some_update_action!)
484
752
  end
485
753
  ```
486
754
 
487
- #### `:sidekiq`
755
+ The default queue name is `chewy`. You can customize it with the `sidekiq.queue` setting:
756
+ ```
757
+ Chewy.settings[:sidekiq] = {queue: :low}
758
+ ```
488
759
 
489
- This does the same thing as `:atomic`, but asynchronously using sidekiq. Patch `Chewy::Strategy::Sidekiq::Worker` for index updates improving.
760
+ #### `:lazy_sidekiq`
761
+
762
+ This does the same thing as `:sidekiq`, but with lazy evaluation. Beware that it does not allow you to use any non-persistent record state for indices and conditions, because the record will be re-fetched from the database asynchronously using sidekiq. However, for destroying records the strategy will fall back to `:sidekiq`, because it's not possible to re-fetch deleted records from the database.
763
+
764
+ The purpose of this strategy is to improve the response time of the code that should update indexes, as it defers to a background job not only the actual ES calls but also the evaluation of `update_index` callbacks (for created and updated objects). Similar to `:sidekiq`, the index update is asynchronous, so this strategy cannot be used when data and index synchronization is required.
490
765
 
491
766
  ```ruby
492
- Chewy.strategy(:sidekiq) do
767
+ Chewy.strategy(:lazy_sidekiq) do
493
768
  City.popular.map(&:do_some_update_action!)
494
769
  end
495
770
  ```
496
771
 
772
+ The default queue name is `chewy`. You can customize it with the `sidekiq.queue` setting:
773
+ ```
774
+ Chewy.settings[:sidekiq] = {queue: :low}
775
+ ```
776
+
777
+ #### `:delayed_sidekiq`
778
+
779
+ It accumulates the ids of records to be reindexed during the latency window in Redis and then reindexes all accumulated records at once.
780
+ The strategy is very useful in case of frequently mutated records.
781
+ It supports the `update_fields` option, so it will try to select just enough data from the DB.
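
The accumulation idea can be pictured with an in-memory stand-in for Redis. This is only a rough sketch under stated assumptions: `LatencyWindow` is a hypothetical class, time is passed in as plain seconds, and the rounding rule (each id goes to the end of its latency window) is an assumption made for the illustration rather than a description of Chewy's internals.

```ruby
require 'set'

# Hypothetical in-memory stand-in for the Redis-backed accumulation.
class LatencyWindow
  def initialize(latency)
    @latency = latency
    @chunks = Hash.new { |hash, key| hash[key] = Set.new }
  end

  # Stores the id in the chunk that closes at the end of the current window.
  def push(id, now)
    @chunks[window_end(now)] << id
  end

  # Returns and removes the ids of every window that has closed by `now`.
  def due(now)
    closed = @chunks.keys.select { |at| at <= now }
    closed.sort.flat_map { |at| @chunks.delete(at).to_a }.uniq
  end

  private

  # Rounds `now` up to the next multiple of the latency.
  def window_end(now)
    (now.to_f / @latency).ceil * @latency
  end
end

WINDOW = LatencyWindow.new(3)
WINDOW.push(1, 10)
WINDOW.push(2, 11)
WINDOW.push(1, 11) # same record mutated again within the window
WINDOW.push(5, 13)

p WINDOW.due(11) # => []
p WINDOW.due(12) # => [1, 2]
p WINDOW.due(15) # => [5]
```

Record `1` was touched twice within the window but is reindexed once, which is the benefit for frequently mutated records.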
782
+
783
+ There are four options that can be defined in the index:
784
+ ```ruby
785
+ class CitiesIndex...
786
+ strategy_config delayed_sidekiq: {
787
+ latency: 3,
788
+ margin: 2,
789
+ ttl: 60 * 60 * 24,
790
+ reindex_wrapper: ->(&reindex) {
791
+ ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
792
+ }
793
+ # latency - will prevent scheduling identical jobs
794
+ # margin - main purpose is to cover db replication lag by the margin
795
+ # ttl - a chunk expiration time (in seconds)
796
+ # reindex_wrapper - lambda that accepts a block and wraps the reindex process, e.g. in an AR connection block.
797
+ }
798
+
799
+ ...
800
+ end
801
+ ```
802
+
803
+ Also you can define defaults in the `initializers/chewy.rb`
804
+ ```ruby
805
+ Chewy.settings = {
806
+ strategy_config: {
807
+ delayed_sidekiq: {
808
+ latency: 3,
809
+ margin: 2,
810
+ ttl: 60 * 60 * 24,
811
+ reindex_wrapper: ->(&reindex) {
812
+ ActiveRecord::Base.connected_to(role: :reading) { reindex.call }
813
+ }
814
+ }
815
+ }
816
+ }
817
+
818
+ ```
819
+ or in `config/chewy.yml`
820
+ ```yaml
821
+ strategy_config:
822
+ delayed_sidekiq:
823
+ latency: 3
824
+ margin: 2
825
+ ttl: <%= 60 * 60 * 24 %>
826
+ # reindex_wrapper setting is not possible here!!! use the initializer instead
827
+ ```
828
+
829
+ You can use the strategy identically to other strategies
830
+ ```ruby
831
+ Chewy.strategy(:delayed_sidekiq) do
832
+ City.popular.map(&:do_some_update_action!)
833
+ end
834
+ ```
835
+
836
+ The default queue name is `chewy`. You can customize it with the `sidekiq.queue` setting:
837
+ ```
838
+ Chewy.settings[:sidekiq] = {queue: :low}
839
+ ```
840
+
841
+ Explicit call of the reindex using the `:delayed_sidekiq` strategy
842
+ ```ruby
843
+ CitiesIndex.import([1, 2, 3], strategy: :delayed_sidekiq)
844
+ ```
845
+
846
+ Explicit call of the reindex using `:delayed_sidekiq` strategy with `:update_fields` support
847
+ ```ruby
848
+ CitiesIndex.import([1, 2, 3], update_fields: [:name], strategy: :delayed_sidekiq)
849
+ ```
850
+
497
851
  #### `:active_job`
498
852
 
499
853
  This does the same thing as `:atomic`, but using ActiveJob. This will inherit the ActiveJob configuration settings including the `active_job.queue_adapter` setting for the environment. Patch `Chewy::Strategy::ActiveJob::Worker` for index updates improving.
@@ -504,6 +858,11 @@ Chewy.strategy(:active_job) do
504
858
  end
505
859
  ```
506
860
 
861
+ The default queue name is `chewy`. You can customize it with the `active_job.queue` setting:
862
+ ```
863
+ Chewy.settings[:active_job] = {queue: :low}
864
+ ```
865
+
507
866
  #### `:urgent`
508
867
 
509
868
  The following strategy is convenient if you are going to update documents in your index one by one.
@@ -514,7 +873,7 @@ Chewy.strategy(:urgent) do
514
873
  end
515
874
  ```
516
875
 
517
- This code would perform `City.popular.count` requests for ES documents update.
876
+ This code will perform `City.popular.count` requests for ES documents update.
518
877
 
519
878
  It is convenient for use in e.g. the Rails console with non-block notation:
520
879
 
@@ -525,7 +884,9 @@ It is convenient for use in e.g. the Rails console with non-block notation:
525
884
 
526
885
  #### `:bypass`
527
886
 
528
- The bypass strategy simply silences index updates.
887
+ When the bypass strategy is active the index will not be automatically updated on object save.
888
+
889
+ For example, on `City.first.save!` the cities index would not be updated.
529
890
 
530
891
  #### Nesting
531
892
 
@@ -579,582 +940,341 @@ RSpec.configure do |config|
579
940
  end
580
941
  ```
581
942
 
582
- ### Index querying
943
+ ### Elasticsearch client options
583
944
 
584
- ```ruby
585
- scope = UsersIndex.query(term: {name: 'foo'})
586
- .filter(range: {rating: {gte: 100}})
587
- .order(created: :desc)
588
- .limit(20).offset(100)
589
-
590
- scope.to_a # => will produce array of UserIndex::User or other types instances
591
- scope.map { |user| user.email }
592
- scope.total_count # => will return total objects count
945
+ All connection options, except `:prefix`, are passed to `Elasticsearch::Client.new` ([chewy/lib/chewy.rb](https://github.com/toptal/chewy/blob/f5bad9f83c21416ac10590f6f34009c645062e89/lib/chewy.rb#L153-L160)):
593
946
 
594
- scope.per(10).page(3) # supports kaminari pagination
595
- scope.explain.map { |user| user._explanation }
596
- scope.only(:id, :email) # returns ids and emails only
947
+ Here's the relevant Elasticsearch documentation on the subject: https://rubydoc.info/gems/elasticsearch-transport#setting-hosts
597
948
 
598
- scope.merge(other_scope) # queries could be merged
599
- ```
949
+ ### `ActiveSupport::Notifications` support
600
950
 
601
- Also, queries can be performed on a type individually:
951
+ Chewy publishes the following events:
602
952
 
603
- ```ruby
604
- UsersIndex::User.filter(term: {name: 'foo'}) # will return UserIndex::User collection only
605
- ```
953
+ #### `search_query.chewy` payload
606
954
 
607
- If you are performing more than one `filter` or `query` in the chain, all the filters and queries will be concatenated in the way specified by
608
- `filter_mode` and `query_mode` respectively.
955
+ * `payload[:index]`: requested index class
956
+ * `payload[:request]`: request hash
609
957
 
610
- The default `filter_mode` is `:and` and the default `query_mode` is `bool`.
958
+ #### `import_objects.chewy` payload
611
959
 
612
- Available filter modes are: `:and`, `:or`, `:must`, `:should` and any minimum_should_match-acceptable value
960
+ * `payload[:index]`: currently imported index name
961
+ * `payload[:import]`: import stats, total imported and deleted objects count:
613
962
 
614
- Available query modes are: `:must`, `:should`, `:dis_max`, any minimum_should_match-acceptable value or float value for dis_max query with tie_breaker specified.
963
+ ```ruby
964
+ {index: 30, delete: 5}
965
+ ```
615
966
 
616
- ```ruby
617
- UsersIndex::User.filter{ name == 'Fred' }.filter{ age < 42 } # will be wrapped with `and` filter
618
- UsersIndex::User.filter{ name == 'Fred' }.filter{ age < 42 }.filter_mode(:should) # will be wrapped with bool `should` filter
619
- UsersIndex::User.filter{ name == 'Fred' }.filter{ age < 42 }.filter_mode('75%') # will be wrapped with bool `should` filter with `minimum_should_match: '75%'`
620
- ```
967
+ * `payload[:errors]`: might not exist. Contains grouped errors with objects ids list:
621
968
 
622
- See [query.rb](lib/chewy/query.rb) for more details.
969
+ ```ruby
970
+ {index: {
971
+ 'error 1 text' => ['1', '2', '3'],
972
+ 'error 2 text' => ['4']
973
+ }, delete: {
974
+ 'delete error text' => ['10', '12']
975
+ }}
976
+ ```
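
A subscriber typically turns this payload into a log line or a metric. The sketch below shows one possible formatter in plain Ruby; `import_summary` and the sample payload are hypothetical and only follow the payload shape described above.

```ruby
# Hypothetical formatter for an `import_objects.chewy` payload.
def import_summary(payload)
  stats = payload.fetch(:import, {})
  line = "#{payload[:index]}: indexed=#{stats.fetch(:index, 0)} deleted=#{stats.fetch(:delete, 0)}"
  # `:errors` may be missing; when present it groups object ids by action and error text.
  failed = payload.fetch(:errors, {}).values.flat_map(&:values).flatten.size
  failed.zero? ? line : "#{line} failed=#{failed}"
end

PAYLOAD = {
  index: 'users',
  import: { index: 30, delete: 5 },
  errors: {
    index: { 'error 1 text' => %w[1 2 3], 'error 2 text' => ['4'] },
    delete: { 'delete error text' => %w[10 12] }
  }
}.freeze

puts import_summary(PAYLOAD)
# prints "users: indexed=30 deleted=5 failed=6"
```

In a Rails app such a helper could be called from an `ActiveSupport::Notifications.subscribe('import_objects.chewy')` block, which receives the payload as its last argument.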
623
977
 
624
- ### Additional query action.
978
+ ### NewRelic integration
625
979
 
626
- You may also perform additional actions on the query scope, such as deleting of all the scope documents:
980
+ To integrate with NewRelic you may use the following example source (config/initializers/chewy.rb):
627
981
 
628
982
  ```ruby
629
- UsersIndex.delete_all
630
- UsersIndex::User.delete_all
631
- UsersIndex.filter{ age < 42 }.delete_all
632
- UsersIndex::User.filter{ age < 42 }.delete_all
633
- ```
634
-
635
- ### Filters query DSL
983
+ require 'new_relic/agent/instrumentation/evented_subscriber'
636
984
 
637
- There is a test version of the filter-creating DSL:
985
+ class ChewySubscriber < NewRelic::Agent::Instrumentation::EventedSubscriber
986
+ def start(name, id, payload)
987
+ event = ChewyEvent.new(name, Time.current, nil, id, payload)
988
+ push_event(event)
989
+ end
638
990
 
639
- ```ruby
640
- UsersIndex.filter{ name == 'Fred' } # will produce `term` filter.
641
- UsersIndex.filter{ age <= 42 } # will produce `range` filter.
642
- ```
991
+ def finish(_name, id, _payload)
992
+ pop_event(id).finish
993
+ end
643
994
 
644
- The basis of the DSL is the expression. There are 2 types of expressions:
995
+ class ChewyEvent < NewRelic::Agent::Instrumentation::Event
996
+ OPERATIONS = {
997
+ 'import_objects.chewy' => 'import',
998
+ 'search_query.chewy' => 'search',
999
+ 'delete_query.chewy' => 'delete'
1000
+ }.freeze
645
1001
 
646
- * Simple function
1002
+ def initialize(*args)
1003
+ super
1004
+ @segment = start_segment
1005
+ end
647
1006
 
648
- ```ruby
649
- UsersIndex.filter{ s('doc["num"] > 1') } # script expression
650
- UsersIndex.filter{ q(query_string: {query: 'lazy fox'}) } # query expression
651
- ```
1007
+ def start_segment
1008
+ segment = NewRelic::Agent::Transaction::DatastoreSegment.new product, operation, collection, host, port
1009
+ if (txn = state.current_transaction)
1010
+ segment.transaction = txn
1011
+ end
1012
+ segment.notice_sql @payload[:request].to_s
1013
+ segment.start
1014
+ segment
1015
+ end
652
1016
 
653
- * Field-dependent composite expression
654
- Consists of the field name (with or without dot notation), a value, and an action operator between them. The field name might take additional options for passing to the resulting expression.
1017
+ def finish
1018
+ if (txn = state.current_transaction)
1019
+ txn.add_segment @segment
1020
+ end
1021
+ @segment.finish
1022
+ end
655
1023
 
656
- ```ruby
657
- UsersIndex.filter{ name == 'Name' } # simple field term filter
658
- UsersIndex.filter{ name(:bool) == ['Name1', 'Name2'] } # terms query with `execution: :bool` option passed
659
- UsersIndex.filter{ answers.title =~ /regexp/ } # regexp filter for `answers.title` field
660
- ```
1024
+ private
661
1025
 
662
- You can combine expressions as you wish with the help of combination operators.
1026
+ def state
1027
+ @state ||= NewRelic::Agent::TransactionState.tl_get
1028
+ end
663
1029
 
664
- ```ruby
665
- UsersIndex.filter{ (name == 'Name') & (email == 'Email') } # combination produces `and` filter
666
- UsersIndex.filter{
667
- must(
668
- should(name =~ 'Fr').should_not(name == 'Fred') & (age == 42), email =~ /gmail\.com/
669
- ) | ((roles.admin == true) & name?)
670
- } # many of the combination possibilities
671
- ```
1030
+ def product
1031
+ 'Elasticsearch'
1032
+ end
672
1033
 
673
- There is also a special syntax for cache enabling:
1034
+ def operation
1035
+ OPERATIONS[name]
1036
+ end
674
1037
 
675
- ```ruby
676
- UsersIndex.filter{ ~name == 'Name' } # you can apply tilde to the field name
677
- UsersIndex.filter{ ~(name == 'Name') } # or to the whole expression
1038
+ def collection
1039
+ payload.values_at(:type, :index)
1040
+ .reject { |value| value.try(:empty?) }
1041
+ .first
1042
+ .to_s
1043
+ end
678
1044
 
679
- # if you are applying cache to the one part of range filter
680
- # the whole filter will be cached:
681
- UsersIndex.filter{ ~(age > 42) & (age <= 50) }
1045
+ def host
1046
+ Chewy.client.transport.hosts.first[:host]
1047
+ end
682
1048
 
683
- # You can pass cache options as a field option also.
684
- UsersIndex.filter{ name(cache: true) == 'Name' }
685
- UsersIndex.filter{ name(cache: false) == 'Name' }
1049
+ def port
1050
+ Chewy.client.transport.hosts.first[:port]
1051
+ end
1052
+ end
1053
+ end
686
1054
 
687
- # With regexp filter you can pass _cache_key
688
- UsersIndex.filter{ name(cache: 'name_regexp') =~ /Name/ }
689
- # Or not
690
- UsersIndex.filter{ name(cache: true) =~ /Name/ }
1055
+ ActiveSupport::Notifications.subscribe(/.chewy$/, ChewySubscriber.new)
691
1056
  ```
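Note that the subscription pattern `/.chewy$/` contains an unescaped dot, so it matches any character in front of `chewy`. The sketch below, in plain Ruby without ActiveSupport, shows which names each pattern accepts. `STRICT_PATTERN` is a suggestion of this note rather than something the README uses, and `mychewy` is a made-up event name.

```ruby
README_PATTERN = /.chewy$/    # as written in the initializer above
STRICT_PATTERN = /\.chewy\z/  # assumption: an escaped, end-anchored variant

CHEWY_EVENTS = %w[import_objects.chewy search_query.chewy delete_query.chewy].freeze

# Returns the names from `names` that the given pattern would subscribe to.
def subscribed(names, pattern)
  names.select { |name| name.match?(pattern) }
end

candidates = CHEWY_EVENTS + %w[mychewy sql.active_record]
loose = subscribed(candidates, README_PATTERN)
strict = subscribed(candidates, STRICT_PATTERN)
puts "loose:  #{loose.inspect}"
puts "strict: #{strict.inspect}"
```

Both patterns catch the three Chewy events listed in `OPERATIONS`. Only the loose one would also catch an unrelated event whose name merely ends in `chewy`.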
692
1057
 
693
- Compliance cheatsheet for filters and DSL expressions:
694
-
695
- * Term filter
696
-
697
- ```json
698
- {"term": {"name": "Fred"}}
699
- {"not": {"term": {"name": "Johny"}}}
700
- ```
701
-
702
- ```ruby
703
- UsersIndex.filter{ name == 'Fred' }
704
- UsersIndex.filter{ name != 'Johny' }
705
- ```
706
-
707
- * Terms filter
1058
+ ### Search requests
708
1059
 
709
- ```json
710
- {"terms": {"name": ["Fred", "Johny"]}}
711
- {"not": {"terms": {"name": ["Fred", "Johny"]}}}
1060
+ Quick introduction.
712
1061
 
713
- {"terms": {"name": ["Fred", "Johny"], "execution": "or"}}
1062
+ #### Composing requests
714
1063
 
715
- {"terms": {"name": ["Fred", "Johny"], "execution": "and"}}
1064
+ The request DSL has the same chainable nature as AR. The main class is `Chewy::Search::Request`.
716
1065
 
717
- {"terms": {"name": ["Fred", "Johny"], "execution": "bool"}}
718
-
719
- {"terms": {"name": ["Fred", "Johny"], "execution": "fielddata"}}
720
- ```
721
-
722
- ```ruby
723
- UsersIndex.filter{ name == ['Fred', 'Johny'] }
724
- UsersIndex.filter{ name != ['Fred', 'Johny'] }
725
-
726
- UsersIndex.filter{ name(:|) == ['Fred', 'Johny'] }
727
- UsersIndex.filter{ name(:or) == ['Fred', 'Johny'] }
728
- UsersIndex.filter{ name(execution: :or) == ['Fred', 'Johny'] }
729
-
730
- UsersIndex.filter{ name(:&) == ['Fred', 'Johny'] }
731
- UsersIndex.filter{ name(:and) == ['Fred', 'Johny'] }
732
- UsersIndex.filter{ name(execution: :and) == ['Fred', 'Johny'] }
733
-
734
- UsersIndex.filter{ name(:b) == ['Fred', 'Johny'] }
735
- UsersIndex.filter{ name(:bool) == ['Fred', 'Johny'] }
736
- UsersIndex.filter{ name(execution: :bool) == ['Fred', 'Johny'] }
737
-
738
- UsersIndex.filter{ name(:f) == ['Fred', 'Johny'] }
739
- UsersIndex.filter{ name(:fielddata) == ['Fred', 'Johny'] }
740
- UsersIndex.filter{ name(execution: :fielddata) == ['Fred', 'Johny'] }
741
- ```
742
-
743
- * Regexp filter (== and =~ are equivalent)
744
-
745
- ```json
746
- {"regexp": {"name.first": "s.*y"}}
747
-
748
- {"not": {"regexp": {"name.first": "s.*y"}}}
749
-
750
- {"regexp": {"name.first": {"value": "s.*y", "flags": "ANYSTRING|INTERSECTION"}}}
751
- ```
752
-
753
- ```ruby
754
- UsersIndex.filter{ name.first == /s.*y/ }
755
- UsersIndex.filter{ name.first =~ /s.*y/ }
756
-
757
- UsersIndex.filter{ name.first != /s.*y/ }
758
- UsersIndex.filter{ name.first !~ /s.*y/ }
759
-
760
- UsersIndex.filter{ name.first(:anystring, :intersection) == /s.*y/ }
761
- UsersIndex.filter{ name.first(flags: [:anystring, :intersection]) == /s.*y/ }
762
- ```
763
-
764
- * Prefix filter
765
-
766
- ```json
767
- {"prefix": {"name": "Fre"}}
768
- {"not": {"prefix": {"name": "Joh"}}}
769
- ```
770
-
771
- ```ruby
772
- UsersIndex.filter{ name =~ re' }
773
- UsersIndex.filter{ name !~ 'Joh' }
774
- ```
775
-
776
- * Exists filter
777
-
778
- ```json
779
- {"exists": {"field": "name"}}
780
- ```
781
-
782
- ```ruby
783
- UsersIndex.filter{ name? }
784
- UsersIndex.filter{ !!name }
785
- UsersIndex.filter{ !!name? }
786
- UsersIndex.filter{ name != nil }
787
- UsersIndex.filter{ !(name == nil) }
788
- ```
789
-
790
- * Missing filter
791
-
792
- ```json
793
- {"missing": {"field": "name", "existence": true, "null_value": false}}
794
- {"missing": {"field": "name", "existence": true, "null_value": true}}
795
- {"missing": {"field": "name", "existence": false, "null_value": true}}
796
- ```
797
-
798
- ```ruby
799
- UsersIndex.filter{ !name }
800
- UsersIndex.filter{ !name? }
801
- UsersIndex.filter{ name == nil }
802
- ```
803
-
804
- * Range
805
-
806
- ```json
807
- {"range": {"age": {"gt": 42}}}
808
- {"range": {"age": {"gte": 42}}}
809
- {"range": {"age": {"lt": 42}}}
810
- {"range": {"age": {"lte": 42}}}
811
-
812
- {"range": {"age": {"gt": 40, "lt": 50}}}
813
- {"range": {"age": {"gte": 40, "lte": 50}}}
814
-
815
- {"range": {"age": {"gt": 40, "lte": 50}}}
816
- {"range": {"age": {"gte": 40, "lt": 50}}}
817
- ```
818
-
819
- ```ruby
820
- UsersIndex.filter{ age > 42 }
821
- UsersIndex.filter{ age >= 42 }
822
- UsersIndex.filter{ age < 42 }
823
- UsersIndex.filter{ age <= 42 }
824
-
825
- UsersIndex.filter{ age == (40..50) }
826
- UsersIndex.filter{ (age > 40) & (age < 50) }
827
- UsersIndex.filter{ age == [40..50] }
828
- UsersIndex.filter{ (age >= 40) & (age <= 50) }
829
-
830
- UsersIndex.filter{ (age > 40) & (age <= 50) }
831
- UsersIndex.filter{ (age >= 40) & (age < 50) }
832
- ```
833
-
834
- * Bool filter
835
-
836
- ```json
837
- {"bool": {
838
- "must": [{"term": {"name": "Name"}}],
839
- "should": [{"term": {"age": 42}}, {"term": {"age": 45}}]
840
- }}
841
- ```
842
-
843
- ```ruby
844
- UsersIndex.filter{ must(name == 'Name').should(age == 42, age == 45) }
845
- ```
846
-
847
- * And filter
848
-
849
- ```json
850
- {"and": [{"term": {"name": "Name"}}, {"range": {"age": {"lt": 42}}}]}
851
- ```
1066
+ ```ruby
1067
+ CitiesIndex.query(match: {name: 'London'})
1068
+ ```
852
1069
 
853
- ```ruby
854
- UsersIndex.filter{ (name == 'Name') & (age < 42) }
855
- ```
1070
+ The main methods of the request DSL are `query`, `filter` and `post_filter`. You can pass plain query hashes or use `elasticsearch-dsl`.
856
1071
 
857
- * Or filter
1072
+ ```ruby
1073
+ CitiesIndex
1074
+ .filter(term: {name: 'Bangkok'})
1075
+ .query(match: {name: 'London'})
1076
+ .query.not(range: {population: {gt: 1_000_000}})
1077
+ ```
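The chain above combines a filter, a query and a negated query. As a rough sketch, this is the kind of Elasticsearch `bool` query such a combination corresponds to. The exact body Chewy renders may differ, and `bool_query` is a hypothetical helper written for this note, not Chewy's code.

```ruby
# Hypothetical helper: assembles an Elasticsearch bool query from the three
# kinds of clauses used above. This is not how Chewy builds requests internally.
def bool_query(must: [], must_not: [], filter: [])
  clauses = {must: must, must_not: must_not, filter: filter}
  # Drop empty clause lists so the body stays minimal.
  {query: {bool: clauses.reject { |_, list| list.empty? }}}
end

body = bool_query(
  filter: [{term: {name: 'Bangkok'}}],
  must: [{match: {name: 'London'}}],
  must_not: [{range: {population: {gt: 1_000_000}}}]
)
puts body.inspect
```

In Elasticsearch, `filter` clauses restrict the result set without affecting scoring, which is why the term lookup and the match query land in different clauses.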
858
1078
 
859
- ```json
860
- {"or": [{"term": {"name": "Name"}}, {"range": {"age": {"lt": 42}}}]}
861
- ```
1079
+ You can query a set of indexes at once:
862
1080
 
863
- ```ruby
864
- UsersIndex.filter{ (name == 'Name') | (age < 42) }
865
- ```
1081
+ ```ruby
1082
+ CitiesIndex.indices(CountriesIndex).query(match: {name: 'Some'})
1083
+ ```
866
1084
 
867
- ```json
868
- {"not": {"term": {"name": "Name"}}}
869
- {"not": {"range": {"age": {"lt": 42}}}}
870
- ```
1085
+ See https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html and https://github.com/elastic/elasticsearch-dsl-ruby for more details.
871
1086
 
872
- ```ruby
873
- UsersIndex.filter{ !(name == 'Name') } # or UsersIndex.filter{ name != 'Name' }
874
- UsersIndex.filter{ !(age < 42) }
875
- ```
1087
+ An important part of request manipulation is merging. There are 4 methods to perform it: `merge`, `and`, `or`, `not`. See [Chewy::Search::QueryProxy](lib/chewy/search/query_proxy.rb) for details. Also, the `only` and `except` methods help to remove unneeded parts of the request.
876
1088
 
877
- * Match all filter
1089
+ Every other request part is covered by a bunch of additional methods, see [Chewy::Search::Request](lib/chewy/search/request.rb) for details:
878
1090
 
879
- ```json
880
- {"match_all": {}}
881
- ```
1091
+ ```ruby
1092
+ CitiesIndex.limit(10).offset(30).order(:name, {population: {order: :desc}})
1093
+ ```
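`limit` and `offset` correspond to the Elasticsearch `size` and `from` parameters. A minimal sketch of the page arithmetic behind a call such as `limit(10).offset(30)`, where `page_window` is a hypothetical helper and not part of Chewy:

```ruby
# Hypothetical helper: converts a 1-based page number into the limit/offset
# pair that would be passed to `limit` and `offset`.
def page_window(page:, per_page:)
  raise ArgumentError, 'page must be >= 1' if page < 1

  {limit: per_page, offset: (page - 1) * per_page}
end

window = page_window(page: 4, per_page: 10)
puts window.inspect
```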
882
1094
 
883
- ```ruby
884
- UsersIndex.filter{ match_all }
885
- ```
1095
+ Request DSL also provides additional scope actions, like `delete_all`, `exists?`, `count`, `pluck`, etc.
886
1096
 
887
- * Has child filter
1097
+ #### Pagination
888
1098
 
889
- ```json
890
- {"has_child": {"type": "blog_tag", "query": {"term": {"tag": "something"}}}
891
- {"has_child": {"type": "comment", "filter": {"term": {"user": "john"}}}
892
- ```
1099
+ The request DSL supports pagination with `Kaminari`. An extension is enabled on initialization if `Kaminari` is available. See [Chewy::Search](lib/chewy/search.rb) and [Chewy::Search::Pagination::Kaminari](lib/chewy/search/pagination/kaminari.rb) for details.
893
1100
 
894
- ```ruby
895
- UsersIndex.filter{ has_child(:blog_tag).query(term: {tag: 'something'}) }
896
- UsersIndex.filter{ has_child(:comment).filter{ user == 'john' } }
897
- ```
1101
+ #### Named scopes
898
1102
 
899
- * Has parent filter
1103
+ Chewy supports named scopes. There is no specialized DSL for defining them; a named scope is simply a class method.
900
1104
 
901
- ```json
902
- {"has_parent": {"type": "blog", "query": {"term": {"tag": "something"}}}}
903
- {"has_parent": {"type": "blog", "filter": {"term": {"text": "bonsai three"}}}}
904
- ```
1105
+ See [Chewy::Search::Scoping](lib/chewy/search/scoping.rb) for details.
905
1106
 
906
- ```ruby
907
- UsersIndex.filter{ has_parent(:blog).query(term: {tag: 'something'}) }
908
- UsersIndex.filter{ has_parent(:blog).filter{ text == 'bonsai three' } }
909
- ```
1107
+ #### Scroll API
910
1108
 
911
- See [filters.rb](lib/chewy/query/filters.rb) for more details.
1109
+ The Elasticsearch scroll API is exposed through several methods: `scroll_batches`, `scroll_hits`, `scroll_wrappers` and `scroll_objects`.
912
1110
 
913
- ### Faceting
1111
+ See [Chewy::Search::Scrolling](lib/chewy/search/scrolling.rb) for details.
914
1112
 
915
- Facets are an optional sidechannel you can request from Elasticsearch describing certain fields of the resulting collection. The most common use for facets is to allow the user to continue filtering specifically within the subset, as opposed to the global index.
1113
+ #### Loading objects
916
1114
 
917
- For instance, let's request the `country` field as a facet along with our users collection. We can do this with the #facets method like so:
1115
+ It is possible to load ORM/ODM source objects with the `objects` method. To provide additional loading options, use the `load` method:
918
1116
 
919
1117
  ```ruby
920
- UsersIndex.filter{ [...] }.facets({countries: {terms: {field: 'country'}}})
1118
+ CitiesIndex.load(scope: -> { active }).to_a # to_a returns `Chewy::Index` wrappers.
1119
+ CitiesIndex.load(scope: -> { active }).objects # An array of AR source objects.
921
1120
  ```
922
1121
 
923
- Let's look at what we asked from Elasticsearch. The facets setter method accepts a hash. You can choose custom/semantic key names for this hash for your own convenience (in this case I used the plural version of the actual field), in our case `countries`. The following nested hash tells ES to grab and aggregate values (terms) from the `country` field on our indexed records.
924
-
925
- The response will include the `:facets` sidechannel:
926
-
927
- ```
928
- < { ... ,"facets":{"countries":{"_type":"terms","missing":?,"total":?,"other":?,"terms":[{"term":"USA","count":?},{"term":"Brazil","count":?}, ...}}
929
- ```
1122
+ See [Chewy::Search::Loader](lib/chewy/search/loader.rb) for more details.
930
1123
 
931
- ### Aggregations
932
-
933
- Aggregations are part of the optional sidechannel that can be requested with a query.
934
-
935
- You interact with aggregations using the composable #aggregations method (or its alias #aggs)
936
-
937
- Let's look at an example.
1124
+ When you need to iterate over the wrappers and their source objects together, use the `object_hash` method:
938
1125
 
939
1126
  ```ruby
940
- class UsersIndex < Chewy::Index
941
- define_type User do
942
- field :name
943
- field :rating
944
- end
1127
+ scope = CitiesIndex.load(scope: -> { active })
1128
+ scope.each do |wrapper|
1129
+ scope.object_hash[wrapper]
945
1130
  end
946
-
947
- all_johns = UsersIndex::User.filter { name == 'john' }.aggs({ avg_rating: { avg: { field: 'rating' } } })
948
-
949
- avg_johns_rating = all_johns.aggs
950
- # => {"avg_rating"=>{"value"=>3.5}}
951
1131
  ```
952
1132
 
953
- It is convenient to name aggregations that you intend to reuse regularly. This is achieve with the .aggregation method,
954
- which is also available under the .agg alias method.
1133
+ ### Rake tasks
955
1134
 
956
- Here's the same example from before
1135
+ For a Rails application, some index-maintaining rake tasks are defined.
957
1136
 
958
- ```ruby
959
- class UsersIndex < Chewy::Index
960
- define_type User do
961
- field :name
962
- field :rating, type: "long"
963
- agg :avg_rating do
964
- { avg: { field: 'rating' } }
965
- end
966
- end
967
- end
1137
+ #### `chewy:reset`
968
1138
 
969
- all_johns = UsersIndex::User.filter { name == 'john' }.aggs(:avg_rating)
1139
+ Performs zero-downtime reindexing as described [here](https://www.elastic.co/blog/changing-mapping-with-zero-downtime). The rake task creates a new index with a unique suffix and then simply aliases it to the common index name. The previous index is deleted afterwards (see `Chewy::Index.reset!` for more details).
970
1140
 
971
- avg_johns_rating = all_johns.aggs
972
- # => {"avg_rating"=>{"value"=>3.5}}
1141
+ ```bash
1142
+ rake chewy:reset # resets all the existing indices
1143
+ rake chewy:reset[users] # resets UsersIndex only
1144
+ rake chewy:reset[users,cities] # resets UsersIndex and CitiesIndex
1145
+ rake chewy:reset[-users,cities] # resets every index in the application except UsersIndex and CitiesIndex
973
1146
  ```
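The argument convention shown in these examples can be sketched in plain Ruby. This is only an illustration of the convention, not Chewy's implementation. `select_indexes` and the index names are hypothetical.

```ruby
# Hypothetical sketch of the argument convention described above; it is not
# Chewy's implementation. A leading "-" on the first name turns the whole list
# into an exclusion list.
def select_indexes(all_names, args)
  return all_names if args.empty?

  if args.first.start_with?('-')
    excluded = [args.first.delete_prefix('-'), *args.drop(1)]
    all_names - excluded
  else
    all_names & args
  end
end

ALL_INDEXES = %w[users cities countries].freeze
puts select_indexes(ALL_INDEXES, []).inspect               # no arguments: every index
puts select_indexes(ALL_INDEXES, %w[users cities]).inspect  # only the named ones
puts select_indexes(ALL_INDEXES, %w[-users cities]).inspect # everything except the named ones
```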
974
1147
 
975
- It is possible to run into collisions between named aggregations. This occurs when there is more than one aggregation
976
- with the same name. To explicitly reference an aggregation you provide a string to the #aggs method of the form:
977
- `index_name#document_type.aggregation_name`
1148
+ #### `chewy:upgrade`
978
1149
 
979
- Consider this example where there are two separate aggregations named `avg_rating`
1150
+ Performs a reset exactly the same way as `chewy:reset` does, but only when the index specification (settings or mappings) has changed.
980
1151
 
981
- ```ruby
982
- class UsersIndex < Chewy::Index
983
- define_type User do
984
- field :name
985
- field :rating, type: "long"
986
- agg :avg_rating do
987
- { avg: { field: 'rating' } }
988
- end
989
- end
990
- define_type Post do
991
- field :title
992
- field :body
993
- field :comments do
994
- field :message
995
- field :rating, type: "long"
996
- end
997
- agg :avg_rating do
998
- { avg: { field: 'comments.rating' } }
999
- end
1000
- end
1001
- end
1002
-
1003
- all_docs = UsersIndex.filter {match_all}.aggs("users#user.avg_rating")
1004
- all_docs.aggs
1005
- # => {"users#user.avg_rating"=>{"value"=>3.5}}
1006
- ```
1152
+ It works only when the index specification is locked in the `Chewy::Stash::Specification` index. The first run will reset all indexes and lock their specifications.
1007
1153
 
1008
- ### Script fields
1154
+ See [Chewy::Stash::Specification](lib/chewy/stash.rb) and [Chewy::Index::Specification](lib/chewy/index/specification.rb) for more details.
1009
1155
 
1010
- Script fields allow you to execute Elasticsearch's scripting languages such as groovy and javascript. More about supported languages and what scripting is [here](https://www.elastic.co/guide/en/elasticsearch/reference/0.90/modules-scripting.html). This feature allows you to calculate the distance between geo points, for example. This is how to use the DSL:
1011
1156
 
1012
- ```ruby
1013
- UsersIndex.script_fields(
1014
- distance: {
1015
- params: {
1016
- lat: 37.569976,
1017
- lon: -122.351591
1018
- },
1019
- script: "doc['coordinates'].distanceInMiles(lat, lon)"
1020
- }
1021
- )
1157
+ ```bash
1158
+ rake chewy:upgrade # upgrades all the existing indices
1159
+ rake chewy:upgrade[users] # upgrades UsersIndex only
1160
+ rake chewy:upgrade[users,cities] # upgrades UsersIndex and CitiesIndex
1161
+ rake chewy:upgrade[-users,cities] # upgrades every index in the application except UsersIndex and CitiesIndex
1022
1162
  ```
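The "reset only when the specification changed" behaviour can be pictured with a small in-memory sketch. It assumes a plain hash standing in for the locked specifications. `upgrade!` is a hypothetical function and does not reflect how `Chewy::Stash::Specification` is actually stored or compared.

```ruby
# Hypothetical sketch: `locked_specs` stands in for the stored (locked)
# specifications, `current_specs` for what the index classes define now.
# Returns the names of the indexes that would be reset.
def upgrade!(locked_specs, current_specs)
  changed = current_specs.reject { |name, spec| locked_specs[name] == spec }.keys
  # "Reset" each changed index, then lock its new specification.
  changed.each { |name| locked_specs[name] = current_specs[name] }
  changed
end

locked = {}
specs = {
  'users' => {mappings: {properties: {name: {type: 'text'}}}},
  'cities' => {mappings: {properties: {name: {type: 'keyword'}}}}
}

first_run = upgrade!(locked, specs)   # nothing locked yet, so everything is reset
second_run = upgrade!(locked, specs)  # nothing changed since the first run
specs['cities'] = {mappings: {properties: {name: {type: 'text'}}}}
third_run = upgrade!(locked, specs)   # only the changed index is reset
puts [first_run, second_run, third_run].inspect
```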
1023
- Here, `coordinates` is a field with type `geo_point`. There will be a `distance` field for the index's model in the search result.
1024
1163
 
1025
- ### Script scoring
1164
+ #### `chewy:update`
1026
1165
 
1027
- Script scoring is used to score the search results. All scores are added to the search request and combined according to boost mode and score mode. This can be useful if, for example, a score function is computationally expensive and it is sufficient to compute the score on a filtered set of documents. For example, you might want to multiply the score by another numeric field in the doc:
1166
+ It doesn't create indexes. It simply imports everything into the existing ones and fails if an index was not created beforehand.
1028
1167
 
1029
- ```ruby
1030
- UsersIndex.script_score("_score * doc['my_numeric_field'].value")
1168
+ ```bash
1169
+ rake chewy:update # updates all the existing indices
1170
+ rake chewy:update[users] # updates UsersIndex only
1171
+ rake chewy:update[users,cities] # updates UsersIndex and CitiesIndex
1172
+ rake chewy:update[-users,cities] # updates every index in the application except UsersIndex and CitiesIndex
1031
1173
  ```
1032
1174
 
1033
- ### Boost Factor
1175
+ #### `chewy:sync`
1034
1176
 
1035
- Boost factors are a way to add a boost to a query where documents match the filter. If you have some users who are experts and some who are regular users, you might want to give the experts a higher score and boost to the top of the search results. You can accomplish this by using the #boost_factor method and adding a boost score of 5 for an expert user:
1177
+ Provides a way to synchronize outdated indexes with the source quickly and without doing a full reset. By default, the `updated_at` field is used to find outdated records, but this can be customized with `outdated_sync_field`, as described in [Chewy::Index::Syncer](lib/chewy/index/syncer.rb).
1036
1178
 
1037
- ```ruby
1038
- UsersIndex.boost_factor(5, filter: {term: {type: 'Expert'}})
1179
+ Arguments are similar to the ones taken by `chewy:update` task.
1180
+
1181
+ See [Chewy::Index::Syncer](lib/chewy/index/syncer.rb) for more details.
1182
+
1183
+ ```bash
1184
+ rake chewy:sync # synchronizes all the existing indices
1185
+ rake chewy:sync[users] # synchronizes UsersIndex only
1186
+ rake chewy:sync[users,cities] # synchronizes UsersIndex and CitiesIndex
1187
+ rake chewy:sync[-users,cities] # synchronizes every index in the application except UsersIndex and CitiesIndex
1039
1188
  ```
1040
1189
 
1041
- ### Objects loading
1190
+ #### `chewy:deploy`
1042
1191
 
1043
- It is possible to load source objects from the database for every search result:
1192
+ This rake task is especially useful during the production deploy. It is a combination of `chewy:upgrade` and `chewy:sync` and the latter is called only for the indexes that were not reset during the first stage.
1044
1193
 
1045
- ```ruby
1046
- scope = UsersIndex.filter(range: {rating: {gte: 100}})
1194
+ It is not possible to specify any particular indexes for this task as it doesn't make much sense.
1047
1195
 
1048
- scope.load # => scope is marked to return User instances array
1049
- scope.load.query(...) # => since objects are loaded lazily you can complete scope
1050
- scope.load(user: { scope: ->{ includes(:country) }}) # you can also pass loading scopes for each
1051
- # possibly returned type
1052
- scope.load(user: { scope: User.includes(:country) }) # the second scope passing way.
1053
- scope.load(scope: ->{ includes(:country) }) # and more common scope applied to every loaded object type.
1196
+ The current approach is this: if some data was updated but the index definition did not change (and the changes are not picked up by the synchronization algorithm), it is much faster to perform a partial index update manually, either inside data migrations or after the deploy.
1054
1197
 
1055
- scope.only(:id).load # it is optimal to request ids only if you are not planning to use type objects
1056
- ```
1198
+ Also, there is always the full-reset alternative with `rake chewy:reset`.
1057
1199
 
1058
- The `preload` method takes the same options as `load` and ORM/ODM objects will be loaded, but the scope will still return an array of Chewy wrappers. To access real objects use the `_object` wrapper method:
1200
+ #### `chewy:create_missing_indexes`
1059
1201
 
1060
- ```ruby
1061
- UsersIndex.filter(range: {rating: {gte: 100}}).preload(...).query(...).map(&:_object)
1062
- ```
1202
+ This rake task creates newly defined indexes in ElasticSearch and skips existing ones. Useful for production-like environments.
1063
1203
 
1064
- See [loading.rb](lib/chewy/query/loading.rb) for more details.
1204
+ #### Parallelizing rake tasks
1065
1205
 
1066
- ### `ActiveSupport::Notifications` support
1206
+ Every task described above has its own parallel version. Every parallel rake task takes the number of processes to use as the first argument, and the rest of the arguments are exactly the same as for the non-parallel version.
1067
1207
 
1068
- Chewy has notifying the following events:
1208
+ [https://github.com/grosser/parallel](https://github.com/grosser/parallel) gem is required to use these tasks.
1069
1209
 
1070
- #### `search_query.chewy` payload
1210
+ If the number of processes is not specified explicitly, the `parallel` gem tries to derive it automatically.
1071
1211
 
1072
- * `payload[:index]`: requested index class
1073
- * `payload[:request]`: request hash
1212
+ ```bash
1213
+ rake chewy:parallel:reset
1214
+ rake chewy:parallel:upgrade[4]
1215
+ rake chewy:parallel:update[4,cities]
1216
+ rake chewy:parallel:sync[4,-users]
1217
+ rake chewy:parallel:deploy[4] # performs parallel upgrade and parallel sync afterwards
1218
+ ```
1074
1219
 
1075
- #### `import_objects.chewy` payload
1220
+ #### `chewy:journal`
1076
1221
 
1077
- * `payload[:type]`: currently imported type
1078
- * `payload[:import]`: imports stats, total imported and deleted objects count:
1222
+ This namespace contains two tasks for journal manipulation: `chewy:journal:apply` and `chewy:journal:clean`. Both take a time as the first argument (optional for clean) and a list of indexes, exactly as the tasks above do. The time can be in any format parsable by ActiveSupport.
1079
1223
 
1080
- ```ruby
1081
- {index: 30, delete: 5}
1082
- ```
1224
+ ```bash
1225
+ rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)"] # apply journaled changes for the past hour
1226
+ rake chewy:journal:apply["$(date -v-1H -u +%FT%TZ)",users] # apply journaled changes for the past hour on UsersIndex only
1227
+ ```
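The `date -v-1H` form above is the BSD/macOS syntax. With GNU `date` the equivalent would be `date -u -d '1 hour ago' +%FT%TZ`. If the shell is a concern, the same timestamp can be produced in Ruby. This is a small sketch and `journal_time` is a hypothetical helper name.

```ruby
# Builds a UTC timestamp in the same %FT%TZ format as the shell command above.
# `now` is injectable so the result can be checked deterministically.
def journal_time(seconds_ago, now: Time.now)
  (now.getutc - seconds_ago).strftime('%FT%TZ')
end

puts journal_time(3600) # one hour ago, e.g. for chewy:journal:apply
```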
1083
1228
 
1084
- * `payload[:errors]`: might not exists. Contains grouped errors with objects ids list:
1229
+ When the size of the journal becomes very large, the classical way of deletion would be obstructive and resource consuming. Fortunately, Chewy internally uses [delete-by-query](https://www.elastic.co/guide/en/elasticsearch/reference/7.17/docs-delete-by-query.html#docs-delete-by-query-task-api) ES function which supports async execution with batching and [throttling](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-throttle).
1085
1230
 
1086
- ```ruby
1087
- {index: {
1088
- 'error 1 text' => ['1', '2', '3'],
1089
- 'error 2 text' => ['4']
1090
- }, delete: {
1091
- 'delete error text' => ['10', '12']
1092
- }}
1093
- ```
1231
+ The available options, which can be set by ENV variables, are listed below:
1232
+ * `WAIT_FOR_COMPLETION` - a boolean flag that controls async execution. The task waits by default. When set to `false` (`0`, `f`, `false` or `off`, in any letter case, are all accepted as `false`), Elasticsearch performs some preflight checks, launches the request, and returns a task reference you can use to cancel the task or get its status.
1233
+ * `REQUESTS_PER_SECOND` - float. The throttle for this request in sub-requests per second. No throttling is enforced by default.
1234
+ * `SCROLL_SIZE` - integer. The number of documents to be deleted in a single sub-request. The default batch size is 1000.
1094
1235
 
1095
- #### NewRelic integration
1236
+ ```bash
1237
+ rake chewy:journal:clean WAIT_FOR_COMPLETION=false REQUESTS_PER_SECOND=10 SCROLL_SIZE=5000
1238
+ ```
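The option handling described above can be sketched as follows. This is an illustration of the documented rules only, not the code Chewy uses. `journal_clean_options` and `FALSE_VALUES` are hypothetical names, and a plain hash stands in for `ENV`.

```ruby
# Spellings treated as false, compared case-insensitively (per the list above).
FALSE_VALUES = %w[0 f false off].freeze

# Hypothetical parser (not Chewy's code). Options that are not set are left
# out, so the defaults described above apply.
def journal_clean_options(env)
  options = {}
  if env.key?('WAIT_FOR_COMPLETION')
    options[:wait_for_completion] = !FALSE_VALUES.include?(env['WAIT_FOR_COMPLETION'].downcase)
  end
  options[:requests_per_second] = Float(env['REQUESTS_PER_SECOND']) if env.key?('REQUESTS_PER_SECOND')
  options[:scroll_size] = Integer(env['SCROLL_SIZE']) if env.key?('SCROLL_SIZE')
  options
end

options = journal_clean_options(
  {'WAIT_FOR_COMPLETION' => 'False', 'REQUESTS_PER_SECOND' => '10', 'SCROLL_SIZE' => '5000'}
)
puts options.inspect
```

Using `Float()` and `Integer()` rather than `to_f` and `to_i` makes a malformed value raise instead of silently becoming zero.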
1096
1239
 
1097
- To integrate with NewRelic you may use the following example source (config/initializers/chewy.rb):
1240
+ ### RSpec integration
1098
1241
 
1099
- ```ruby
1100
- ActiveSupport::Notifications.subscribe('import_objects.chewy') do |name, start, finish, id, payload|
1101
- metric_name = "Database/ElasticSearch/import"
1102
- duration = (finish - start).to_f
1103
- logged = "#{payload[:type]} #{payload[:import].to_a.map{ |i| i.join(':') }.join(', ')}"
1104
-
1105
- self.class.trace_execution_scoped([metric_name]) do
1106
- NewRelic::Agent.instance.transaction_sampler.notice_sql(logged, nil, duration)
1107
- NewRelic::Agent.instance.sql_sampler.notice_sql(logged, metric_name, nil, duration)
1108
- NewRelic::Agent.record_metric(metric_name, duration)
1109
- end
1110
- end
1242
+ Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features:
1111
1243
 
1112
- ActiveSupport::Notifications.subscribe('search_query.chewy') do |name, start, finish, id, payload|
1113
- metric_name = "Database/ElasticSearch/search"
1114
- duration = (finish - start).to_f
1115
- logged = "#{payload[:type].presence || payload[:index]} #{payload[:request]}"
1244
+ * [update_index](lib/chewy/rspec/update_index.rb) helper
1245
+ * `mock_elasticsearch_response` helper to mock elasticsearch response
1246
+ * `mock_elasticsearch_response_sources` helper to mock elasticsearch response sources
1247
+ * `build_query` matcher to compare request and expected query (returns `true`/`false`)
1116
1248
 
1117
- self.class.trace_execution_scoped([metric_name]) do
1118
- NewRelic::Agent.instance.transaction_sampler.notice_sql(logged, nil, duration)
1119
- NewRelic::Agent.instance.sql_sampler.notice_sql(logged, metric_name, nil, duration)
1120
- NewRelic::Agent.record_metric(metric_name, duration)
1121
- end
1122
- end
1123
- ```
1249
+ To use `mock_elasticsearch_response` and `mock_elasticsearch_response_sources` helpers add `include Chewy::Rspec::Helpers` to your tests.
1124
1250
 
1125
- ### Rake tasks
1251
+ See [chewy/rspec/](lib/chewy/rspec/) for more details.
1126
1252
 
1127
- Inside the Rails application, some index-maintaining rake tasks are defined.
1253
+ ### Minitest integration
1128
1254
 
1129
- ```bash
1130
- rake chewy:reset # resets all the existing indices, declared in app/chewy
1131
- rake chewy:reset[users] # resets UsersIndex only
1255
+ Add `require 'chewy/minitest'` to your test_helper.rb, and then `include Chewy::Minitest::Helpers` in the tests where you'd like indexing test hooks.
1132
1256
 
1133
- rake chewy:update # updates all the existing indices, declared in app/chewy
1134
- rake chewy:update[users] # updates UsersIndex only
1135
- ```
1257
+ You can set the `:bypass` strategy for test suites, handle imports for the index manually, and flush test indices manually using `Chewy.massacre`. This helps reduce unnecessary ES requests.
1258
+
1259
+ But if you need chewy to index/update models as usual in your test suite, you can specify the `:urgent` strategy for document indexing. Add `Chewy.strategy(:urgent)` to test_helper.rb.
1136
1260
 
1137
- `rake chewy:reset` performs zero-downtime reindexing as described [here](https://www.elastic.co/blog/changing-mapping-with-zero-downtime). So basically rake task creates a new index with uniq suffix and then simply aliases it to the common index name. The previous index is deleted afterwards (see `Chewy::Index.reset!` for more details).
1261
+ Also, you can use additional helpers:
1138
1262
 
1263
+ * `mock_elasticsearch_response` to mock elasticsearch response
1264
+ * `mock_elasticsearch_response_sources` to mock elasticsearch response sources
1265
+ * `assert_elasticsearch_query` to compare request and expected query (returns `true`/`false`)
1139
1266
 
1140
- ### Rspec integration
1267
+ See [chewy/minitest/](lib/chewy/minitest/) for more details.
1141
1268
 
1142
- Just add `require 'chewy/rspec'` to your spec_helper.rb and you will get additional features: See [update_index.rb](lib/chewy/rspec/update_index.rb) for more details.
1269
+ ### DatabaseCleaner
1143
1270
 
1144
- If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may run into the problem that `ActiveRecord`'s models are not indexed automatically on save despite the fact that you set the callbacks to do this with the `update_index` method. The issue arises because `chewy` indexes data on `after_commit` run as default, but all `after_commit` callbacks are not run with the `DatabaseCleaner`'s' `transaction` strategy. You can solve this issue by changing the `Chewy.use_after_commit_callbacks` option. Just add the following initializer in your Rails application:
+ If you use `DatabaseCleaner` in your tests with [the `transaction` strategy](https://github.com/DatabaseCleaner/database_cleaner#how-to-use), you may find that `ActiveRecord` models are not indexed automatically on save, even though you set up the callbacks for this with the `update_index` method. The issue arises because `chewy` indexes data in `after_commit` callbacks by default, and `after_commit` callbacks are not run with `DatabaseCleaner`'s `transaction` strategy. You can solve this by changing the `Chewy.use_after_commit_callbacks` option. Just add the following initializer to your Rails application:
 
  ```ruby
  # config/initializers/chewy.rb
  Chewy.use_after_commit_callbacks = !Rails.env.test?
  ```
 
- ## TODO a.k.a coming soon:
-
- * Typecasting support
- * Advanced (simplified) query DSL: `UsersIndex.query { email == 'my@gmail.com' }` will produce term query
- * update_all support
- * Maybe, closer ORM/ODM integration, creating index classes implicitly
-
  ## Contributing
 
  1. Fork it (http://github.com/toptal/chewy/fork)
@@ -1164,9 +1284,14 @@ Chewy.use_after_commit_callbacks = !Rails.env.test?
  5. Push to the branch (`git push origin my-new-feature`)
  6. Create new Pull Request
 
- Use the following Rake tasks to control the Elasticsearch cluster while developing.
+ Use the following Rake tasks to control the Elasticsearch cluster while developing, if you prefer a native Elasticsearch installation over the dockerized one:
 
  ```bash
  rake elasticsearch:start # start Elasticsearch cluster on 9250 port for tests
  rake elasticsearch:stop # stop Elasticsearch
  ```
+
+ ## Copyright
+
+ Copyright (c) 2013-2021 Toptal, LLC. See [LICENSE.txt](LICENSE.txt) for
+ further details.