rroonga 5.0.0-x86-mingw32 → 5.0.1-x86-mingw32
- checksums.yaml +8 -8
- data/.yardopts +1 -0
- data/Rakefile +1 -16
- data/example/bookmark.rb +1 -6
- data/example/index-html.rb +0 -1
- data/ext/groonga/extconf.rb +4 -7
- data/ext/groonga/rb-grn-array.c +1 -1
- data/ext/groonga/rb-grn-column.c +33 -67
- data/ext/groonga/rb-grn-context.c +5 -5
- data/ext/groonga/rb-grn-database.c +2 -2
- data/ext/groonga/rb-grn-double-array-trie.c +4 -2
- data/ext/groonga/rb-grn-encoding-support.c +7 -1
- data/ext/groonga/rb-grn-equal-operator.c +85 -0
- data/ext/groonga/rb-grn-exception.c +17 -0
- data/ext/groonga/rb-grn-expression.c +85 -43
- data/ext/groonga/rb-grn-greater-equal-operator.c +88 -0
- data/ext/groonga/rb-grn-greater-operator.c +85 -0
- data/ext/groonga/rb-grn-hash.c +1 -1
- data/ext/groonga/rb-grn-index-column.c +150 -11
- data/ext/groonga/rb-grn-less-equal-operator.c +88 -0
- data/ext/groonga/rb-grn-less-operator.c +85 -0
- data/ext/groonga/rb-grn-logger.c +5 -5
- data/ext/groonga/rb-grn-match-operator.c +86 -0
- data/ext/groonga/rb-grn-normalizer.c +8 -1
- data/ext/groonga/rb-grn-not-equal-operator.c +85 -0
- data/ext/groonga/rb-grn-object.c +170 -36
- data/ext/groonga/rb-grn-operator.c +395 -172
- data/ext/groonga/rb-grn-patricia-trie.c +10 -8
- data/ext/groonga/rb-grn-plugin.c +51 -3
- data/ext/groonga/rb-grn-prefix-operator.c +86 -0
- data/ext/groonga/rb-grn-procedure-type.c +4 -0
- data/ext/groonga/rb-grn-query-logger.c +4 -4
- data/ext/groonga/rb-grn-regexp-operator.c +85 -0
- data/ext/groonga/rb-grn-snippet.c +1 -1
- data/ext/groonga/rb-grn-table-key-support.c +9 -5
- data/ext/groonga/rb-grn-table.c +52 -66
- data/ext/groonga/rb-grn-type.c +1 -1
- data/ext/groonga/rb-grn-utils.c +22 -3
- data/ext/groonga/rb-grn.h +31 -4
- data/ext/groonga/rb-groonga.c +9 -9
- data/lib/1.9/groonga.so +0 -0
- data/lib/2.0/groonga.so +0 -0
- data/lib/2.1/groonga.so +0 -0
- data/lib/2.2/groonga.so +0 -0
- data/lib/groonga/context.rb +31 -0
- data/lib/groonga/expression-builder.rb +14 -1
- data/lib/groonga/record.rb +10 -8
- data/lib/groonga/schema.rb +3 -1
- data/rroonga-build.rb +2 -2
- data/rroonga.gemspec +3 -3
- data/test/groonga-test-utils.rb +4 -0
- data/test/test-column.rb +28 -26
- data/test/test-exception.rb +1 -0
- data/test/test-expression-builder.rb +83 -1
- data/test/test-expression.rb +80 -48
- data/test/test-index-column.rb +102 -29
- data/test/test-normalizer.rb +35 -29
- data/test/test-operator.rb +214 -0
- data/test/test-plugin.rb +24 -6
- data/test/test-procedure.rb +29 -0
- data/test/test-schema-type.rb +14 -0
- data/test/test-table-select-mecab.rb +1 -4
- data/test/test-table.rb +7 -0
- data/test/test-token-regexp.rb +30 -0
- data/test/test-type.rb +24 -0
- data/vendor/local/bin/grndb.exe +0 -0
- data/vendor/local/bin/groonga-benchmark.exe +0 -0
- data/vendor/local/bin/groonga.exe +0 -0
- data/vendor/local/bin/libgcc_s_sjlj-1.dll +0 -0
- data/vendor/local/bin/libgroonga-0.dll +0 -0
- data/vendor/local/bin/libmecab-1.dll +0 -0
- data/vendor/local/bin/libmsgpack-3.dll +0 -0
- data/vendor/local/bin/libmsgpackc-2.dll +0 -0
- data/vendor/local/bin/libonig-5.dll +0 -0
- data/vendor/local/bin/libstdc++-6.dll +0 -0
- data/vendor/local/bin/lz4.exe +0 -0
- data/vendor/local/bin/lz4c.exe +0 -0
- data/vendor/local/bin/lz4cat +0 -0
- data/vendor/local/bin/mecab-config +2 -2
- data/vendor/local/bin/mecab.exe +0 -0
- data/vendor/local/bin/onig-config +1 -1
- data/vendor/local/bin/zlib1.dll +0 -0
- data/vendor/local/etc/groonga/groonga.conf +1 -1
- data/vendor/local/etc/groonga/httpd/groonga-httpd.conf +2 -2
- data/vendor/local/include/groonga/groonga.h +1 -0
- data/vendor/local/include/groonga/groonga/expr.h +2 -0
- data/vendor/local/include/groonga/groonga/groonga.h +32 -5
- data/vendor/local/include/groonga/groonga/ii.h +7 -0
- data/vendor/local/include/groonga/groonga/obj.h +37 -0
- data/vendor/local/include/groonga/groonga/scorer.h +95 -0
- data/vendor/local/lib/groonga/plugins/query_expanders/tsv.a +0 -0
- data/vendor/local/lib/groonga/plugins/query_expanders/tsv.dll +0 -0
- data/vendor/local/lib/groonga/plugins/query_expanders/tsv.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/query_expanders/tsv.la +2 -2
- data/vendor/local/lib/groonga/plugins/ruby/eval.a +0 -0
- data/vendor/local/lib/groonga/plugins/ruby/eval.dll +0 -0
- data/vendor/local/lib/groonga/plugins/ruby/eval.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/ruby/eval.la +2 -2
- data/vendor/local/lib/groonga/plugins/ruby/load.a +0 -0
- data/vendor/local/lib/groonga/plugins/ruby/load.dll +0 -0
- data/vendor/local/lib/groonga/plugins/ruby/load.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/ruby/load.la +2 -2
- data/vendor/local/lib/groonga/plugins/sharding/logical_count.rb +6 -3
- data/vendor/local/lib/groonga/plugins/sharding/logical_enumerator.rb +6 -5
- data/vendor/local/lib/groonga/plugins/sharding/logical_range_filter.rb +421 -17
- data/vendor/local/lib/groonga/plugins/suggest/suggest.a +0 -0
- data/vendor/local/lib/groonga/plugins/suggest/suggest.dll +0 -0
- data/vendor/local/lib/groonga/plugins/suggest/suggest.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/suggest/suggest.la +2 -2
- data/vendor/local/lib/groonga/plugins/table/table.a +0 -0
- data/vendor/local/lib/groonga/plugins/table/table.dll +0 -0
- data/vendor/local/lib/groonga/plugins/table/table.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/table/table.la +2 -2
- data/vendor/local/lib/groonga/plugins/token_filters/stop_word.a +0 -0
- data/vendor/local/lib/groonga/plugins/token_filters/stop_word.dll +0 -0
- data/vendor/local/lib/groonga/plugins/token_filters/stop_word.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/token_filters/stop_word.la +2 -2
- data/vendor/local/lib/groonga/plugins/tokenizers/mecab.a +0 -0
- data/vendor/local/lib/groonga/plugins/tokenizers/mecab.dll +0 -0
- data/vendor/local/lib/groonga/plugins/tokenizers/mecab.dll.a +0 -0
- data/vendor/local/lib/groonga/plugins/tokenizers/mecab.la +2 -2
- data/vendor/local/lib/groonga/scripts/ruby/backtrace_entry.rb +12 -4
- data/vendor/local/lib/groonga/scripts/ruby/database.rb +11 -3
- data/vendor/local/lib/groonga/scripts/ruby/expression.rb +23 -0
- data/vendor/local/lib/groonga/scripts/ruby/expression_size_estimator.rb +158 -0
- data/vendor/local/lib/groonga/scripts/ruby/index_column.rb +39 -0
- data/vendor/local/lib/groonga/scripts/ruby/initialize/post.rb +4 -0
- data/vendor/local/lib/groonga/scripts/ruby/initialize/pre.rb +2 -0
- data/vendor/local/lib/groonga/scripts/ruby/logger.rb +11 -7
- data/vendor/local/lib/groonga/scripts/ruby/object.rb +11 -0
- data/vendor/local/lib/groonga/scripts/ruby/operator.rb +22 -0
- data/vendor/local/lib/groonga/scripts/ruby/scan_info.rb +7 -2
- data/vendor/local/lib/groonga/scripts/ruby/scan_info_builder.rb +7 -11
- data/vendor/local/lib/groonga/scripts/ruby/scan_info_data.rb +137 -34
- data/vendor/local/lib/groonga/scripts/ruby/scan_info_search_index.rb +9 -0
- data/vendor/local/lib/libgroonga.a +0 -0
- data/vendor/local/lib/libgroonga.dll.a +0 -0
- data/vendor/local/lib/libgroonga.la +2 -2
- data/vendor/local/lib/liblz4.a +0 -0
- data/vendor/local/lib/liblz4.dll +0 -0
- data/vendor/local/lib/liblz4.dll.1 +0 -0
- data/vendor/local/lib/liblz4.dll.1.5.0 +0 -0
- data/vendor/local/lib/libmecab.a +0 -0
- data/vendor/local/lib/libmecab.dll.a +0 -0
- data/vendor/local/lib/libmecab.la +2 -2
- data/vendor/local/lib/libmsgpack.a +0 -0
- data/vendor/local/lib/libmsgpack.dll.a +0 -0
- data/vendor/local/lib/libmsgpack.la +2 -2
- data/vendor/local/lib/libmsgpackc.a +0 -0
- data/vendor/local/lib/libmsgpackc.dll.a +0 -0
- data/vendor/local/lib/libmsgpackc.la +2 -2
- data/vendor/local/lib/libonig.a +0 -0
- data/vendor/local/lib/libonig.dll.a +0 -0
- data/vendor/local/lib/libonig.la +2 -2
- data/vendor/local/lib/libz.a +0 -0
- data/vendor/local/lib/libz.dll.a +0 -0
- data/vendor/local/lib/pkgconfig/groonga.pc +3 -3
- data/vendor/local/lib/pkgconfig/liblz4.pc +5 -5
- data/vendor/local/lib/pkgconfig/msgpack.pc +1 -1
- data/vendor/local/lib/pkgconfig/oniguruma.pc +6 -6
- data/vendor/local/lib/pkgconfig/zlib.pc +3 -3
- data/vendor/local/libexec/mecab/mecab-cost-train.exe +0 -0
- data/vendor/local/libexec/mecab/mecab-dict-gen.exe +0 -0
- data/vendor/local/libexec/mecab/mecab-dict-index.exe +0 -0
- data/vendor/local/libexec/mecab/mecab-system-eval.exe +0 -0
- data/vendor/local/libexec/mecab/mecab-test-gen.exe +0 -0
- data/vendor/local/sbin/groonga-httpd-restart +1 -1
- data/vendor/local/sbin/groonga-httpd.exe +0 -0
- data/vendor/local/share/doc/groonga/en/html/.buildinfo +1 -1
- data/vendor/local/share/doc/groonga/en/html/_images/used-when-indexing.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_images/used-when-searching.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/characteristic.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/contribution/development/release.txt +32 -17
- data/vendor/local/share/doc/groonga/en/html/_sources/install/centos.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/install/debian.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/install/fedora.txt +4 -4
- data/vendor/local/share/doc/groonga/en/html/_sources/install/mac_os_x.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/install/others.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/install/solaris.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/install/ubuntu.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/install/windows.txt +9 -9
- data/vendor/local/share/doc/groonga/en/html/_sources/news.txt +194 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/news/1.0.x.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/news/3.x.txt +2 -2
- data/vendor/local/share/doc/groonga/en/html/_sources/news/4.x.txt +2 -2
- data/vendor/local/share/doc/groonga/en/html/_sources/reference.txt +2 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/api.txt +3 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/api/grn_ctx.txt +42 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/api/overview.txt +54 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/cache_limit.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/column_create.txt +2 -2
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/column_list.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/column_rename.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/delete.txt +4 -4
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/load.txt +5 -5
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/lock_clear.txt +4 -4
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/logical_count.txt +173 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/logical_range_filter.txt +112 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/normalize.txt +7 -6
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/plugin_register.txt +64 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/plugin_unregister.txt +63 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/register.txt +11 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/request_cancel.txt +3 -2
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/ruby_eval.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/ruby_load.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/select.txt +17 -17
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/suggest.txt +12 -12
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/table_create.txt +7 -7
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/table_tokenize.txt +4 -4
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/tokenize.txt +6 -6
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/commands/truncate.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/executables/groonga.txt +47 -26
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/between.txt +5 -5
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/geo_distance.txt +3 -3
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/highlight_full.txt +6 -6
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/highlight_html.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/html_untag.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/in_values.txt +54 -2
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/query.txt +4 -4
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/functions/sub_filter.txt +4 -4
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/grn_expr/query_syntax.txt +44 -18
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/grn_expr/script_syntax.txt +41 -11
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/indexing.txt +2 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/normalizers.txt +4 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/operations.txt +2 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/regular_expression.txt +403 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/scorer.txt +217 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/scorers/scorer_tf_at_most.txt +22 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/scorers/scorer_tf_idf.txt +110 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/scoring_note.txt +13 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/tables.txt +8 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/reference/tokenizers.txt +530 -16
- data/vendor/local/share/doc/groonga/en/html/_sources/server.txt +2 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/server/memcached.txt +15 -0
- data/vendor/local/share/doc/groonga/en/html/_sources/spec/gqtp.txt +66 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/troubleshooting/different_results_with_the_same_keyword.txt +1 -1
- data/vendor/local/share/doc/groonga/en/html/_sources/tutorial/network.txt +0 -81
- data/vendor/local/share/doc/groonga/en/html/_static/basic.css +6 -68
- data/vendor/local/share/doc/groonga/en/html/_static/doctools.js +1 -26
- data/vendor/local/share/doc/groonga/en/html/_static/down-pressed.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/down.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/file.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/jquery.js +9404 -4
- data/vendor/local/share/doc/groonga/en/html/_static/minus.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/plus.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/searchtools.js +2 -2
- data/vendor/local/share/doc/groonga/en/html/_static/underscore.js +1415 -31
- data/vendor/local/share/doc/groonga/en/html/_static/up-pressed.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/up.png +0 -0
- data/vendor/local/share/doc/groonga/en/html/_static/websupport.js +15 -15
- data/vendor/local/share/doc/groonga/en/html/characteristic.html +18 -20
- data/vendor/local/share/doc/groonga/en/html/client.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/community.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution.html +18 -20
- data/vendor/local/share/doc/groonga/en/html/contribution/development.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/development/com.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/development/cooperation.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/development/query.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/development/release.html +51 -38
- data/vendor/local/share/doc/groonga/en/html/contribution/development/repository.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/development/test.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/documentation.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/documentation/c-api.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/documentation/i18n.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/contribution/documentation/introduction.html +24 -26
- data/vendor/local/share/doc/groonga/en/html/contribution/report.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/development.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/development/travis-ci.html +24 -26
- data/vendor/local/share/doc/groonga/en/html/genindex.html +26 -14
- data/vendor/local/share/doc/groonga/en/html/index.html +150 -130
- data/vendor/local/share/doc/groonga/en/html/install.html +32 -34
- data/vendor/local/share/doc/groonga/en/html/install/centos.html +28 -30
- data/vendor/local/share/doc/groonga/en/html/install/debian.html +24 -26
- data/vendor/local/share/doc/groonga/en/html/install/fedora.html +28 -30
- data/vendor/local/share/doc/groonga/en/html/install/mac_os_x.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/install/others.html +87 -89
- data/vendor/local/share/doc/groonga/en/html/install/solaris.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/install/ubuntu.html +25 -27
- data/vendor/local/share/doc/groonga/en/html/install/windows.html +30 -32
- data/vendor/local/share/doc/groonga/en/html/limitations.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/news.html +256 -27
- data/vendor/local/share/doc/groonga/en/html/news/0.x.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/news/1.0.x.html +19 -21
- data/vendor/local/share/doc/groonga/en/html/news/1.1.x.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/news/1.2.x.html +32 -34
- data/vendor/local/share/doc/groonga/en/html/news/1.3.x.html +27 -29
- data/vendor/local/share/doc/groonga/en/html/news/2.x.html +98 -100
- data/vendor/local/share/doc/groonga/en/html/news/3.x.html +68 -70
- data/vendor/local/share/doc/groonga/en/html/news/4.x.html +102 -104
- data/vendor/local/share/doc/groonga/en/html/news/senna.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/objects.inv +0 -0
- data/vendor/local/share/doc/groonga/en/html/reference.html +139 -118
- data/vendor/local/share/doc/groonga/en/html/reference/api.html +51 -52
- data/vendor/local/share/doc/groonga/en/html/reference/api/global_configurations.html +49 -51
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_cache.html +60 -62
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_column.html +80 -82
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_command_version.html +42 -44
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_content_type.html +37 -39
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_ctx.html +130 -80
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_db.html +48 -50
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_encoding.html +44 -46
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_expr.html +79 -81
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_geo.html +42 -44
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_hook.html +44 -46
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_ii.html +42 -44
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_index_cursor.html +41 -43
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_info.html +41 -43
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_match_escalation.html +40 -42
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_obj.html +89 -91
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_proc.html +44 -46
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_search.html +39 -41
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_table.html +75 -77
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_table_cursor.html +64 -66
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_type.html +40 -42
- data/vendor/local/share/doc/groonga/en/html/reference/api/grn_user_data.html +39 -41
- data/vendor/local/share/doc/groonga/en/html/reference/api/overview.html +202 -0
- data/vendor/local/share/doc/groonga/en/html/reference/api/plugin.html +58 -60
- data/vendor/local/share/doc/groonga/en/html/reference/cast.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/column.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/columns/index.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/columns/pseudo.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/reference/columns/scalar.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/columns/vector.html +58 -60
- data/vendor/local/share/doc/groonga/en/html/reference/command.html +56 -54
- data/vendor/local/share/doc/groonga/en/html/reference/command/command_version.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/command/output_format.html +53 -55
- data/vendor/local/share/doc/groonga/en/html/reference/command/request_id.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/reference/command/return_code.html +94 -96
- data/vendor/local/share/doc/groonga/en/html/reference/commands/cache_limit.html +39 -41
- data/vendor/local/share/doc/groonga/en/html/reference/commands/check.html +43 -45
- data/vendor/local/share/doc/groonga/en/html/reference/commands/clearlock.html +23 -25
- data/vendor/local/share/doc/groonga/en/html/reference/commands/column_create.html +39 -41
- data/vendor/local/share/doc/groonga/en/html/reference/commands/column_list.html +71 -73
- data/vendor/local/share/doc/groonga/en/html/reference/commands/column_remove.html +24 -26
- data/vendor/local/share/doc/groonga/en/html/reference/commands/column_rename.html +44 -46
- data/vendor/local/share/doc/groonga/en/html/reference/commands/define_selector.html +37 -39
- data/vendor/local/share/doc/groonga/en/html/reference/commands/defrag.html +24 -26
- data/vendor/local/share/doc/groonga/en/html/reference/commands/delete.html +35 -37
- data/vendor/local/share/doc/groonga/en/html/reference/commands/dump.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/reference/commands/load.html +43 -45
- data/vendor/local/share/doc/groonga/en/html/reference/commands/lock_clear.html +49 -47
- data/vendor/local/share/doc/groonga/en/html/reference/commands/log_level.html +23 -25
- data/vendor/local/share/doc/groonga/en/html/reference/commands/log_put.html +24 -26
- data/vendor/local/share/doc/groonga/en/html/reference/commands/log_reopen.html +26 -28
- data/vendor/local/share/doc/groonga/en/html/reference/commands/logical_count.html +314 -0
- data/vendor/local/share/doc/groonga/en/html/reference/commands/logical_range_filter.html +252 -0
- data/vendor/local/share/doc/groonga/en/html/reference/commands/normalize.html +87 -89
- data/vendor/local/share/doc/groonga/en/html/reference/commands/normalizer_list.html +46 -48
- data/vendor/local/share/doc/groonga/en/html/reference/commands/plugin_register.html +195 -0
- data/vendor/local/share/doc/groonga/en/html/reference/commands/plugin_unregister.html +193 -0
- data/vendor/local/share/doc/groonga/en/html/reference/commands/quit.html +38 -40
- data/vendor/local/share/doc/groonga/en/html/reference/commands/range_filter.html +37 -39
- data/vendor/local/share/doc/groonga/en/html/reference/commands/register.html +61 -51
- data/vendor/local/share/doc/groonga/en/html/reference/commands/request_cancel.html +72 -74
- data/vendor/local/share/doc/groonga/en/html/reference/commands/ruby_eval.html +54 -56
- data/vendor/local/share/doc/groonga/en/html/reference/commands/ruby_load.html +54 -56
- data/vendor/local/share/doc/groonga/en/html/reference/commands/select.html +590 -592
- data/vendor/local/share/doc/groonga/en/html/reference/commands/shutdown.html +37 -39
- data/vendor/local/share/doc/groonga/en/html/reference/commands/status.html +40 -42
- data/vendor/local/share/doc/groonga/en/html/reference/commands/suggest.html +92 -94
- data/vendor/local/share/doc/groonga/en/html/reference/commands/table_create.html +152 -154
- data/vendor/local/share/doc/groonga/en/html/reference/commands/table_list.html +49 -51
- data/vendor/local/share/doc/groonga/en/html/reference/commands/table_remove.html +39 -41
- data/vendor/local/share/doc/groonga/en/html/reference/commands/table_tokenize.html +68 -70
- data/vendor/local/share/doc/groonga/en/html/reference/commands/tokenize.html +103 -105
- data/vendor/local/share/doc/groonga/en/html/reference/commands/tokenizer_list.html +45 -47
- data/vendor/local/share/doc/groonga/en/html/reference/commands/truncate.html +52 -54
- data/vendor/local/share/doc/groonga/en/html/reference/executables.html +19 -21
- data/vendor/local/share/doc/groonga/en/html/reference/executables/grndb.html +35 -37
- data/vendor/local/share/doc/groonga/en/html/reference/executables/grnslap.html +21 -23
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga-benchmark.html +26 -28
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga-httpd.html +73 -75
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga-server-http.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga-suggest-create-dataset.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga-suggest-httpd.html +42 -44
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga-suggest-learner.html +34 -36
- data/vendor/local/share/doc/groonga/en/html/reference/executables/groonga.html +124 -90
- data/vendor/local/share/doc/groonga/en/html/reference/function.html +42 -44
- data/vendor/local/share/doc/groonga/en/html/reference/functions/between.html +66 -68
- data/vendor/local/share/doc/groonga/en/html/reference/functions/edit_distance.html +44 -46
- data/vendor/local/share/doc/groonga/en/html/reference/functions/geo_distance.html +113 -115
- data/vendor/local/share/doc/groonga/en/html/reference/functions/geo_in_circle.html +55 -57
- data/vendor/local/share/doc/groonga/en/html/reference/functions/geo_in_rectangle.html +45 -47
- data/vendor/local/share/doc/groonga/en/html/reference/functions/highlight_full.html +81 -83
- data/vendor/local/share/doc/groonga/en/html/reference/functions/highlight_html.html +65 -67
- data/vendor/local/share/doc/groonga/en/html/reference/functions/html_untag.html +54 -56
- data/vendor/local/share/doc/groonga/en/html/reference/functions/in_values.html +135 -44
- data/vendor/local/share/doc/groonga/en/html/reference/functions/now.html +40 -42
- data/vendor/local/share/doc/groonga/en/html/reference/functions/query.html +81 -83
- data/vendor/local/share/doc/groonga/en/html/reference/functions/rand.html +43 -45
- data/vendor/local/share/doc/groonga/en/html/reference/functions/snippet_html.html +67 -69
- data/vendor/local/share/doc/groonga/en/html/reference/functions/sub_filter.html +66 -70
- data/vendor/local/share/doc/groonga/en/html/reference/grn_expr.html +39 -41
- data/vendor/local/share/doc/groonga/en/html/reference/grn_expr/query_syntax.html +349 -286
- data/vendor/local/share/doc/groonga/en/html/reference/grn_expr/script_syntax.html +483 -417
- data/vendor/local/share/doc/groonga/en/html/reference/indexing.html +35 -37
- data/vendor/local/share/doc/groonga/en/html/reference/log.html +38 -40
- data/vendor/local/share/doc/groonga/en/html/reference/normalizers.html +44 -46
- data/vendor/local/share/doc/groonga/en/html/reference/operations.html +30 -31
- data/vendor/local/share/doc/groonga/en/html/reference/operations/geolocation_search.html +32 -34
- data/vendor/local/share/doc/groonga/en/html/reference/output.html +32 -34
- data/vendor/local/share/doc/groonga/en/html/reference/query_expanders.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/reference/query_expanders/tsv.html +60 -62
- data/vendor/local/share/doc/groonga/en/html/reference/regular_expression.html +931 -0
- data/vendor/local/share/doc/groonga/en/html/reference/scorer.html +442 -0
- data/vendor/local/share/doc/groonga/en/html/reference/scorers/scorer_tf_at_most.html +153 -0
- data/vendor/local/share/doc/groonga/en/html/reference/scorers/scorer_tf_idf.html +287 -0
- data/vendor/local/share/doc/groonga/en/html/reference/scoring_note.html +114 -0
- data/vendor/local/share/doc/groonga/en/html/reference/suggest.html +45 -47
- data/vendor/local/share/doc/groonga/en/html/reference/suggest/completion.html +51 -53
- data/vendor/local/share/doc/groonga/en/html/reference/suggest/correction.html +40 -42
- data/vendor/local/share/doc/groonga/en/html/reference/suggest/introduction.html +38 -40
- data/vendor/local/share/doc/groonga/en/html/reference/suggest/suggestion.html +40 -42
- data/vendor/local/share/doc/groonga/en/html/reference/tables.html +52 -54
- data/vendor/local/share/doc/groonga/en/html/reference/token_filters.html +36 -38
- data/vendor/local/share/doc/groonga/en/html/reference/tokenizers.html +1394 -34
- data/vendor/local/share/doc/groonga/en/html/reference/tuning.html +57 -59
- data/vendor/local/share/doc/groonga/en/html/reference/types.html +38 -40
- data/vendor/local/share/doc/groonga/en/html/search.html +11 -11
- data/vendor/local/share/doc/groonga/en/html/searchindex.js +1 -1
- data/vendor/local/share/doc/groonga/en/html/server.html +23 -24
- data/vendor/local/share/doc/groonga/en/html/server/gqtp.html +28 -30
- data/vendor/local/share/doc/groonga/en/html/server/http.html +42 -44
- data/vendor/local/share/doc/groonga/en/html/server/http/comparison.html +68 -70
- data/vendor/local/share/doc/groonga/en/html/server/http/groonga-httpd.html +30 -32
- data/vendor/local/share/doc/groonga/en/html/server/http/groonga.html +29 -31
- data/vendor/local/share/doc/groonga/en/html/server/memcached.html +137 -0
- data/vendor/local/share/doc/groonga/en/html/server/package.html +36 -38
- data/vendor/local/share/doc/groonga/en/html/spec.html +22 -24
- data/vendor/local/share/doc/groonga/en/html/spec/gqtp.html +208 -129
- data/vendor/local/share/doc/groonga/en/html/spec/search.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/troubleshooting.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/troubleshooting/different_results_with_the_same_keyword.html +18 -20
- data/vendor/local/share/doc/groonga/en/html/troubleshooting/mmap_cannot_allocate_memory.html +18 -20
- data/vendor/local/share/doc/groonga/en/html/tutorial.html +21 -25
- data/vendor/local/share/doc/groonga/en/html/tutorial/data.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/tutorial/drilldown.html +31 -33
- data/vendor/local/share/doc/groonga/en/html/tutorial/index.html +20 -22
- data/vendor/local/share/doc/groonga/en/html/tutorial/introduction.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/tutorial/lexicon.html +17 -19
- data/vendor/local/share/doc/groonga/en/html/tutorial/match_columns.html +21 -23
- data/vendor/local/share/doc/groonga/en/html/tutorial/micro_blog.html +50 -52
- data/vendor/local/share/doc/groonga/en/html/tutorial/network.html +27 -125
- data/vendor/local/share/doc/groonga/en/html/tutorial/patricia_trie.html +18 -20
- data/vendor/local/share/doc/groonga/en/html/tutorial/query_expansion.html +20 -22
- data/vendor/local/share/doc/groonga/en/html/tutorial/search.html +33 -35
- data/vendor/local/share/doc/groonga/ja/html/.buildinfo +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_images/used-when-indexing.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_images/used-when-searching.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/characteristic.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/contribution/development/release.txt +32 -17
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/centos.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/debian.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/fedora.txt +4 -4
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/mac_os_x.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/others.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/solaris.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/ubuntu.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/install/windows.txt +9 -9
- data/vendor/local/share/doc/groonga/ja/html/_sources/news.txt +194 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/news/1.0.x.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/news/3.x.txt +2 -2
- data/vendor/local/share/doc/groonga/ja/html/_sources/news/4.x.txt +2 -2
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference.txt +2 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/api.txt +3 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/api/grn_ctx.txt +42 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/api/overview.txt +54 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/cache_limit.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/column_create.txt +2 -2
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/column_list.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/column_rename.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/delete.txt +4 -4
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/load.txt +5 -5
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/lock_clear.txt +4 -4
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/logical_count.txt +173 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/logical_range_filter.txt +112 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/normalize.txt +7 -6
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/plugin_register.txt +64 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/plugin_unregister.txt +63 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/register.txt +11 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/request_cancel.txt +3 -2
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/ruby_eval.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/ruby_load.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/select.txt +17 -17
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/suggest.txt +12 -12
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/table_create.txt +7 -7
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/table_tokenize.txt +4 -4
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/tokenize.txt +6 -6
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/commands/truncate.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/executables/groonga.txt +47 -26
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/between.txt +5 -5
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/geo_distance.txt +3 -3
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/highlight_full.txt +6 -6
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/highlight_html.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/html_untag.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/in_values.txt +54 -2
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/query.txt +4 -4
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/functions/sub_filter.txt +4 -4
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/grn_expr/query_syntax.txt +44 -18
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/grn_expr/script_syntax.txt +41 -11
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/indexing.txt +2 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/normalizers.txt +4 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/operations.txt +2 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/regular_expression.txt +403 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/scorer.txt +217 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/scorers/scorer_tf_at_most.txt +22 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/scorers/scorer_tf_idf.txt +110 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/scoring_note.txt +13 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/tables.txt +8 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/reference/tokenizers.txt +530 -16
- data/vendor/local/share/doc/groonga/ja/html/_sources/server.txt +2 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/server/memcached.txt +15 -0
- data/vendor/local/share/doc/groonga/ja/html/_sources/spec/gqtp.txt +66 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/troubleshooting/different_results_with_the_same_keyword.txt +1 -1
- data/vendor/local/share/doc/groonga/ja/html/_sources/tutorial/network.txt +0 -81
- data/vendor/local/share/doc/groonga/ja/html/_static/basic.css +6 -68
- data/vendor/local/share/doc/groonga/ja/html/_static/doctools.js +1 -26
- data/vendor/local/share/doc/groonga/ja/html/_static/down-pressed.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/down.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/file.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/jquery.js +9404 -4
- data/vendor/local/share/doc/groonga/ja/html/_static/minus.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/plus.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/searchtools.js +2 -2
- data/vendor/local/share/doc/groonga/ja/html/_static/underscore.js +1415 -31
- data/vendor/local/share/doc/groonga/ja/html/_static/up-pressed.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/up.png +0 -0
- data/vendor/local/share/doc/groonga/ja/html/_static/websupport.js +15 -15
- data/vendor/local/share/doc/groonga/ja/html/characteristic.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/client.html +22 -24
- data/vendor/local/share/doc/groonga/ja/html/community.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution.html +18 -20
- data/vendor/local/share/doc/groonga/ja/html/contribution/development.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/development/com.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/development/cooperation.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/development/query.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/development/release.html +51 -38
- data/vendor/local/share/doc/groonga/ja/html/contribution/development/repository.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/development/test.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/documentation.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/documentation/c-api.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/documentation/i18n.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/contribution/documentation/introduction.html +24 -26
- data/vendor/local/share/doc/groonga/ja/html/contribution/report.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/development.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/development/travis-ci.html +20 -22
- data/vendor/local/share/doc/groonga/ja/html/genindex.html +26 -14
- data/vendor/local/share/doc/groonga/ja/html/index.html +150 -130
- data/vendor/local/share/doc/groonga/ja/html/install.html +32 -34
- data/vendor/local/share/doc/groonga/ja/html/install/centos.html +31 -33
- data/vendor/local/share/doc/groonga/ja/html/install/debian.html +25 -27
- data/vendor/local/share/doc/groonga/ja/html/install/fedora.html +29 -31
- data/vendor/local/share/doc/groonga/ja/html/install/mac_os_x.html +22 -24
- data/vendor/local/share/doc/groonga/ja/html/install/others.html +78 -80
- data/vendor/local/share/doc/groonga/ja/html/install/solaris.html +21 -23
- data/vendor/local/share/doc/groonga/ja/html/install/ubuntu.html +26 -28
- data/vendor/local/share/doc/groonga/ja/html/install/windows.html +29 -31
- data/vendor/local/share/doc/groonga/ja/html/limitations.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/news.html +210 -27
- data/vendor/local/share/doc/groonga/ja/html/news/0.x.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/news/1.0.x.html +19 -21
- data/vendor/local/share/doc/groonga/ja/html/news/1.1.x.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/news/1.2.x.html +32 -34
- data/vendor/local/share/doc/groonga/ja/html/news/1.3.x.html +27 -29
- data/vendor/local/share/doc/groonga/ja/html/news/2.x.html +91 -93
- data/vendor/local/share/doc/groonga/ja/html/news/3.x.html +59 -61
- data/vendor/local/share/doc/groonga/ja/html/news/4.x.html +89 -91
- data/vendor/local/share/doc/groonga/ja/html/news/senna.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/objects.inv +0 -0
- data/vendor/local/share/doc/groonga/ja/html/reference.html +139 -118
- data/vendor/local/share/doc/groonga/ja/html/reference/api.html +51 -52
- data/vendor/local/share/doc/groonga/ja/html/reference/api/global_configurations.html +49 -51
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_cache.html +55 -57
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_column.html +80 -82
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_command_version.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_content_type.html +37 -39
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_ctx.html +126 -76
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_db.html +48 -50
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_encoding.html +44 -46
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_expr.html +74 -76
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_geo.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_hook.html +44 -46
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_ii.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_index_cursor.html +41 -43
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_info.html +41 -43
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_match_escalation.html +40 -42
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_obj.html +89 -91
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_proc.html +44 -46
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_search.html +39 -41
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_table.html +75 -77
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_table_cursor.html +64 -66
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_type.html +40 -42
- data/vendor/local/share/doc/groonga/ja/html/reference/api/grn_user_data.html +39 -41
- data/vendor/local/share/doc/groonga/ja/html/reference/api/overview.html +197 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/api/plugin.html +58 -60
- data/vendor/local/share/doc/groonga/ja/html/reference/cast.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/column.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/columns/index.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/columns/pseudo.html +22 -24
- data/vendor/local/share/doc/groonga/ja/html/reference/columns/scalar.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/columns/vector.html +48 -50
- data/vendor/local/share/doc/groonga/ja/html/reference/command.html +56 -54
- data/vendor/local/share/doc/groonga/ja/html/reference/command/command_version.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/command/output_format.html +43 -45
- data/vendor/local/share/doc/groonga/ja/html/reference/command/request_id.html +22 -24
- data/vendor/local/share/doc/groonga/ja/html/reference/command/return_code.html +93 -95
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/cache_limit.html +35 -37
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/check.html +43 -45
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/clearlock.html +23 -25
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/column_create.html +39 -41
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/column_list.html +62 -64
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/column_remove.html +24 -26
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/column_rename.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/define_selector.html +37 -39
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/defrag.html +24 -26
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/delete.html +32 -34
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/dump.html +22 -24
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/load.html +33 -35
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/lock_clear.html +48 -46
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/log_level.html +23 -25
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/log_put.html +24 -26
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/log_reopen.html +26 -28
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/logical_count.html +314 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/logical_range_filter.html +250 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/normalize.html +80 -81
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/normalizer_list.html +46 -48
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/plugin_register.html +188 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/plugin_unregister.html +190 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/quit.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/range_filter.html +37 -39
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/register.html +57 -47
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/request_cancel.html +71 -73
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/ruby_eval.html +53 -55
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/ruby_load.html +53 -55
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/select.html +394 -396
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/shutdown.html +37 -39
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/status.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/suggest.html +78 -80
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/table_create.html +123 -125
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/table_list.html +49 -51
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/table_remove.html +39 -41
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/table_tokenize.html +61 -63
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/tokenize.html +89 -91
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/tokenizer_list.html +46 -48
- data/vendor/local/share/doc/groonga/ja/html/reference/commands/truncate.html +51 -53
- data/vendor/local/share/doc/groonga/ja/html/reference/executables.html +19 -21
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/grndb.html +35 -37
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/grnslap.html +21 -23
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga-benchmark.html +26 -28
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga-httpd.html +61 -63
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga-server-http.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga-suggest-create-dataset.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga-suggest-httpd.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga-suggest-learner.html +34 -36
- data/vendor/local/share/doc/groonga/ja/html/reference/executables/groonga.html +126 -90
- data/vendor/local/share/doc/groonga/ja/html/reference/function.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/between.html +63 -65
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/edit_distance.html +44 -46
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/geo_distance.html +94 -96
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/geo_in_circle.html +55 -57
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/geo_in_rectangle.html +45 -47
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/highlight_full.html +66 -68
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/highlight_html.html +55 -57
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/html_untag.html +53 -55
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/in_values.html +135 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/now.html +40 -42
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/query.html +70 -72
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/rand.html +43 -45
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/snippet_html.html +53 -55
- data/vendor/local/share/doc/groonga/ja/html/reference/functions/sub_filter.html +56 -62
- data/vendor/local/share/doc/groonga/ja/html/reference/grn_expr.html +36 -38
- data/vendor/local/share/doc/groonga/ja/html/reference/grn_expr/query_syntax.html +229 -171
- data/vendor/local/share/doc/groonga/ja/html/reference/grn_expr/script_syntax.html +381 -322
- data/vendor/local/share/doc/groonga/ja/html/reference/indexing.html +34 -36
- data/vendor/local/share/doc/groonga/ja/html/reference/log.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/reference/normalizers.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/reference/operations.html +28 -30
- data/vendor/local/share/doc/groonga/ja/html/reference/operations/geolocation_search.html +32 -34
- data/vendor/local/share/doc/groonga/ja/html/reference/output.html +28 -30
- data/vendor/local/share/doc/groonga/ja/html/reference/query_expanders.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/reference/query_expanders/tsv.html +39 -41
- data/vendor/local/share/doc/groonga/ja/html/reference/regular_expression.html +878 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/scorer.html +442 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/scorers/scorer_tf_at_most.html +154 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/scorers/scorer_tf_idf.html +287 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/scoring_note.html +115 -0
- data/vendor/local/share/doc/groonga/ja/html/reference/suggest.html +45 -47
- data/vendor/local/share/doc/groonga/ja/html/reference/suggest/completion.html +48 -50
- data/vendor/local/share/doc/groonga/ja/html/reference/suggest/correction.html +40 -42
- data/vendor/local/share/doc/groonga/ja/html/reference/suggest/introduction.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/reference/suggest/suggestion.html +40 -42
- data/vendor/local/share/doc/groonga/ja/html/reference/tables.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/reference/token_filters.html +37 -39
- data/vendor/local/share/doc/groonga/ja/html/reference/tokenizers.html +1300 -34
- data/vendor/local/share/doc/groonga/ja/html/reference/tuning.html +57 -59
- data/vendor/local/share/doc/groonga/ja/html/reference/types.html +38 -40
- data/vendor/local/share/doc/groonga/ja/html/search.html +11 -11
- data/vendor/local/share/doc/groonga/ja/html/searchindex.js +1 -1
- data/vendor/local/share/doc/groonga/ja/html/server.html +23 -24
- data/vendor/local/share/doc/groonga/ja/html/server/gqtp.html +28 -30
- data/vendor/local/share/doc/groonga/ja/html/server/http.html +42 -44
- data/vendor/local/share/doc/groonga/ja/html/server/http/comparison.html +62 -64
- data/vendor/local/share/doc/groonga/ja/html/server/http/groonga-httpd.html +30 -32
- data/vendor/local/share/doc/groonga/ja/html/server/http/groonga.html +29 -31
- data/vendor/local/share/doc/groonga/ja/html/server/memcached.html +138 -0
- data/vendor/local/share/doc/groonga/ja/html/server/package.html +35 -37
- data/vendor/local/share/doc/groonga/ja/html/spec.html +22 -24
- data/vendor/local/share/doc/groonga/ja/html/spec/gqtp.html +207 -128
- data/vendor/local/share/doc/groonga/ja/html/spec/search.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/troubleshooting.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/troubleshooting/different_results_with_the_same_keyword.html +18 -20
- data/vendor/local/share/doc/groonga/ja/html/troubleshooting/mmap_cannot_allocate_memory.html +18 -20
- data/vendor/local/share/doc/groonga/ja/html/tutorial.html +21 -25
- data/vendor/local/share/doc/groonga/ja/html/tutorial/data.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/tutorial/drilldown.html +30 -32
- data/vendor/local/share/doc/groonga/ja/html/tutorial/index.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/tutorial/introduction.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/tutorial/lexicon.html +17 -19
- data/vendor/local/share/doc/groonga/ja/html/tutorial/match_columns.html +23 -25
- data/vendor/local/share/doc/groonga/ja/html/tutorial/micro_blog.html +47 -49
- data/vendor/local/share/doc/groonga/ja/html/tutorial/network.html +27 -125
- data/vendor/local/share/doc/groonga/ja/html/tutorial/patricia_trie.html +18 -20
- data/vendor/local/share/doc/groonga/ja/html/tutorial/query_expansion.html +20 -22
- data/vendor/local/share/doc/groonga/ja/html/tutorial/search.html +31 -33
- data/vendor/local/share/license/mruby/README.md +2 -2
- data/vendor/local/share/man/ja/man1/groonga.1 +6205 -2251
- data/vendor/local/share/man/man1/groonga.1 +7210 -3029
- metadata +75 -11
- data/doc/text/news.textile +0 -1217
- data/vendor/local/share/doc/groonga/en/html/_static/jquery-1.11.1.js +0 -10308
- data/vendor/local/share/doc/groonga/en/html/_static/underscore-1.3.1.js +0 -999
- data/vendor/local/share/doc/groonga/ja/html/_static/jquery-1.11.1.js +0 -10308
- data/vendor/local/share/doc/groonga/ja/html/_static/underscore-1.3.1.js +0 -999
@@ -0,0 +1,217 @@
|
|
1
|
+
.. -*- rst -*-
|
2
|
+
|
3
|
+
.. highlightlang:: none
|
4
|
+
|
5
|
+
.. groonga-command
|
6
|
+
.. database: scorer
|
7
|
+
|
8
|
+
Scorer
|
9
|
+
======
|
10
|
+
|
11
|
+
Summary
|
12
|
+
-------
|
13
|
+
|
14
|
+
Groonga has a scorer module that customizes the score function. The score
|
15
|
+
function computes the score of each matched record. The default score function
|
16
|
+
uses the number of times the query terms appear. It is also known as TF (term
|
17
|
+
frequency).
|
18
|
+
|
19
|
+
TF is a fast score function but it's not suitable for the following
|
20
|
+
cases:
|
21
|
+
|
22
|
+
* Search query contains one or more frequently-appearing words such
|
23
|
+
as "the" and "a".
|
24
|
+
* A document repeats the same keyword many times, such as "They are keyword,
|
25
|
+
keyword, keyword ... and keyword". Search engine spammers may use
|
26
|
+
the technique.
|
27
|
+
|
28
|
+
A different score function can solve these cases. For example, `TF-IDF
|
29
|
+
<http://en.wikipedia.org/wiki/Tf%E2%80%93idf>`_ (term
|
30
|
+
frequency-inverse document frequency) can solve the first case.
|
31
|
+
`Okapi BM25 <http://en.wikipedia.org/wiki/Okapi_BM25>`_ can solve the
|
32
|
+
second case. But they are slower than TF.
|
33
|
+
|
34
|
+
Groonga provides a TF-IDF based scorer as
|
35
|
+
:doc:`/reference/scorers/scorer_tf_idf` but doesn't provide an Okapi BM25
|
36
|
+
based scorer yet.
|
37
|
+
|
38
|
+
.. include:: scoring_note.rst
|
39
|
+
|
40
|
+
Usage
|
41
|
+
-----
|
42
|
+
|
43
|
+
This section describes how to use a scorer.
|
44
|
+
|
45
|
+
Here are a schema definition and sample data to show usage.
|
46
|
+
|
47
|
+
Sample schema:
|
48
|
+
|
49
|
+
.. groonga-command
|
50
|
+
.. include:: ../example/reference/scorer/usage_setup_schema.log
|
51
|
+
.. table_create Memos TABLE_HASH_KEY ShortText
|
52
|
+
.. column_create Memos title COLUMN_SCALAR ShortText
|
53
|
+
.. column_create Memos content COLUMN_SCALAR Text
|
54
|
+
..
|
55
|
+
.. table_create Terms TABLE_PAT_KEY ShortText \
|
56
|
+
.. --default_tokenizer TokenBigram \
|
57
|
+
.. --normalizer NormalizerAuto
|
58
|
+
.. column_create Terms title_index COLUMN_INDEX|WITH_POSITION Memos title
|
59
|
+
.. column_create Terms content_index COLUMN_INDEX|WITH_POSITION Memos content
|
60
|
+
|
61
|
+
Sample data:
|
62
|
+
|
63
|
+
.. groonga-command
|
64
|
+
.. include:: ../example/reference/scorer/usage_setup_data.log
|
65
|
+
.. load --table Memos
|
66
|
+
.. [
|
67
|
+
.. {
|
68
|
+
.. "_key": "memo1",
|
69
|
+
.. "title": "Groonga is easy",
|
70
|
+
.. "content": "Groonga is very easy full text search engine!"
|
71
|
+
.. },
|
72
|
+
.. {
|
73
|
+
.. "_key": "memo2",
|
74
|
+
.. "title": "Mroonga is easy",
|
75
|
+
.. "content": "Mroonga is more easier full text search engine!"
|
76
|
+
.. },
|
77
|
+
.. {
|
78
|
+
.. "_key": "memo3",
|
79
|
+
.. "title": "Rroonga is easy",
|
80
|
+
.. "content": "Ruby is very helpful."
|
81
|
+
.. },
|
82
|
+
.. {
|
83
|
+
.. "_key": "memo4",
|
84
|
+
.. "title": "Groonga is fast",
|
85
|
+
.. "content": "Groonga! Groonga! Groonga! Groonga is very fast!"
|
86
|
+
.. },
|
87
|
+
.. {
|
88
|
+
.. "_key": "memo5",
|
89
|
+
.. "title": "PGroonga is fast",
|
90
|
+
.. "content": "PGroonga is very fast!"
|
91
|
+
.. },
|
92
|
+
.. {
|
93
|
+
.. "_key": "memo6",
|
94
|
+
.. "title": "PGroonga is useful",
|
95
|
+
.. "content": "SQL is easy because many client libraries exist."
|
96
|
+
.. },
|
97
|
+
.. {
|
98
|
+
.. "_key": "memo7",
|
99
|
+
.. "title": "Mroonga is also useful",
|
100
|
+
.. "content": "MySQL has replication feature. Mroonga can use it."
|
101
|
+
.. }
|
102
|
+
.. ]
|
103
|
+
|
104
|
+
You can specify a custom score function in :ref:`select-match-columns`.
|
105
|
+
Several syntaxes are available.
|
106
|
+
|
107
|
+
For a score function that doesn't require any parameter, such as
|
108
|
+
:doc:`/reference/scorers/scorer_tf_idf`::
|
109
|
+
|
110
|
+
SCORE_FUNCTION(COLUMN)
|
111
|
+
|
112
|
+
You can specify a weight::
|
113
|
+
|
114
|
+
SCORE_FUNCTION(COLUMN) * WEIGHT
|
115
|
+
|
116
|
+
For a score function that requires one or more parameters, such as
|
117
|
+
:doc:`/reference/scorers/scorer_tf_at_most`::
|
118
|
+
|
119
|
+
SCORE_FUNCTION(COLUMN, ARGUMENT1, ARGUMENT2, ...)
|
120
|
+
|
121
|
+
You can specify a weight::
|
122
|
+
|
123
|
+
SCORE_FUNCTION(COLUMN, ARGUMENT1, ARGUMENT2, ...) * WEIGHT
|
124
|
+
|
125
|
+
You can use a different score function for each match column::
|
126
|
+
|
127
|
+
SCORE_FUNCTION1(COLUMN1) ||
|
128
|
+
SCORE_FUNCTION2(COLUMN2) * WEIGHT ||
|
129
|
+
SCORE_FUNCTION3(COLUMN3, ARGUMENT1) ||
|
130
|
+
...
|
131
|
+
|
132
|
+
Here is the simplest example:
|
133
|
+
|
134
|
+
.. groonga-command
|
135
|
+
.. include:: ../example/reference/scorer/usage_one_no_argument_no_weight.log
|
136
|
+
.. select Memos \
|
137
|
+
.. --match_columns "scorer_tf_idf(content)" \
|
138
|
+
.. --query "Groonga" \
|
139
|
+
.. --output_columns "content, _score" \
|
140
|
+
.. --sortby "-_score"
|
141
|
+
|
142
|
+
``Groonga! Groonga! Groonga! Groonga is very fast!`` contains 4
|
143
|
+
occurrences of ``Groonga``. With the default TF based scorer,
|
144
|
+
``_score`` would be ``4``. But the actual ``_score`` is ``2`` because the
|
145
|
+
``select`` command uses the TF-IDF based scorer ``scorer_tf_idf()``.
|
146
|
+
|
147
|
+
Here is an example that uses weight:
|
148
|
+
|
149
|
+
.. groonga-command
|
150
|
+
.. include:: ../example/reference/scorer/usage_one_no_argument_weight.log
|
151
|
+
.. select Memos \
|
152
|
+
.. --match_columns "scorer_tf_idf(content) * 10" \
|
153
|
+
.. --query "Groonga" \
|
154
|
+
.. --output_columns "content, _score" \
|
155
|
+
.. --sortby "-_score"
|
156
|
+
|
157
|
+
``Groonga! Groonga! Groonga! Groonga is very fast!`` has ``22`` as
|
158
|
+
``_score``. It had ``2`` as ``_score`` in the previous example that
|
159
|
+
doesn't specify a weight.
|
160
|
+
|
161
|
+
Here is an example that uses a scorer that requires one
|
162
|
+
argument. :doc:`/reference/scorers/scorer_tf_at_most` scorer requires
|
163
|
+
one argument. You can limit the TF score with this scorer.
|
164
|
+
|
165
|
+
.. groonga-command
|
166
|
+
.. include:: ../example/reference/scorer/usage_one_one_argument_no_weight.log
|
167
|
+
.. select Memos \
|
168
|
+
.. --match_columns "scorer_tf_at_most(content, 2.0)" \
|
169
|
+
.. --query "Groonga" \
|
170
|
+
.. --output_columns "content, _score" \
|
171
|
+
.. --sortby "-_score"
|
172
|
+
|
173
|
+
``Groonga! Groonga! Groonga! Groonga is very fast!`` contains 4
|
174
|
+
occurrences of ``Groonga``. With the normal TF based scorer, which is the default
|
175
|
+
scorer, ``_score`` would be ``4``. But the actual ``_score`` is ``2``
|
176
|
+
because the scorer used in the ``select`` command limits the maximum
|
177
|
+
score value to ``2``.
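
The capping behavior described above can be sketched in a few lines of Python. This is only an illustration under assumptions: ``tf_at_most`` here is a hypothetical helper, not Groonga's implementation, and counting whole words stands in for Groonga's tokenized term counting.

```python
import re


def term_frequency(text, term):
    """Count case-insensitive whole-word occurrences of term in text."""
    words = re.findall(r"\w+", text.lower())
    return words.count(term.lower())


def tf_at_most(text, term, max_score):
    """Hypothetical helper: the TF score, capped at max_score."""
    return min(float(term_frequency(text, term)), max_score)


content = "Groonga! Groonga! Groonga! Groonga is very fast!"
print(term_frequency(content, "Groonga"))   # plain TF
print(tf_at_most(content, "Groonga", 2.0))  # TF capped at 2.0
```

A record that repeats a keyword many times therefore cannot score higher than the cap, which is the point of limiting TF.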
|
178
|
+
|
179
|
+
Here is an example that uses multiple scorers:
|
180
|
+
|
181
|
+
.. groonga-command
|
182
|
+
.. include:: ../example/reference/scorer/usage_multiple_scorers.log
|
183
|
+
.. select Memos \
|
184
|
+
.. --match_columns "scorer_tf_idf(title) || scorer_tf_at_most(content, 2.0)" \
|
185
|
+
.. --query "Groonga" \
|
186
|
+
.. --output_columns "title, content, _score" \
|
187
|
+
.. --sortby "-_score"
|
188
|
+
|
189
|
+
The ``--match_columns`` uses ``scorer_tf_idf(title)`` and
|
190
|
+
``scorer_tf_at_most(content, 2.0)``. The ``_score`` value is the sum of them.
|
191
|
+
|
192
|
+
You can use the default scorer and a custom scorer in the same
|
193
|
+
``--match_columns``. You can use the default scorer by just specifying
|
194
|
+
a match column:
|
195
|
+
|
196
|
+
.. groonga-command
|
197
|
+
.. include:: ../example/reference/scorer/usage_default_and_custom_scorers.log
|
198
|
+
.. select Memos \
|
199
|
+
.. --match_columns "title || scorer_tf_at_most(content, 2.0)" \
|
200
|
+
.. --query "Groonga" \
|
201
|
+
.. --output_columns "title, content, _score" \
|
202
|
+
.. --sortby "-_score"
|
203
|
+
|
204
|
+
The ``--match_columns`` uses the default scorer (TF) for ``title`` and
|
205
|
+
:doc:`/reference/scorers/scorer_tf_at_most` for
|
206
|
+
``content``. The ``_score`` value is the sum of them.
|
207
|
+
|
208
|
+
Built-in scorers
|
209
|
+
----------------
|
210
|
+
|
211
|
+
Here are the built-in scorers:
|
212
|
+
|
213
|
+
.. toctree::
|
214
|
+
:maxdepth: 1
|
215
|
+
:glob:
|
216
|
+
|
217
|
+
scorers/*
|
@@ -0,0 +1,22 @@
|
|
1
|
+
.. -*- rst -*-
|
2
|
+
|
3
|
+
.. highlightlang:: none
|
4
|
+
|
5
|
+
.. groonga-command
|
6
|
+
.. database: scorer_tf_at_most
|
7
|
+
|
8
|
+
``scorer_tf_at_most``
|
9
|
+
=====================
|
10
|
+
|
11
|
+
.. note::
|
12
|
+
|
13
|
+
This scorer is an experimental feature.
|
14
|
+
|
15
|
+
.. versionadded:: 5.0.1
|
16
|
+
|
17
|
+
Summary
|
18
|
+
-------
|
19
|
+
|
20
|
+
``scorer_tf_at_most`` is a scorer based on TF (term frequency).
|
21
|
+
|
22
|
+
TODO
|
@@ -0,0 +1,110 @@
|
|
1
|
+
.. -*- rst -*-
|
2
|
+
|
3
|
+
.. highlightlang:: none
|
4
|
+
|
5
|
+
.. groonga-command
|
6
|
+
.. database: scorer_tf_idf
|
7
|
+
|
8
|
+
``scorer_tf_idf``
|
9
|
+
=================
|
10
|
+
|
11
|
+
.. note::
|
12
|
+
|
13
|
+
This scorer is an experimental feature.
|
14
|
+
|
15
|
+
.. versionadded:: 5.0.1
|
16
|
+
|
17
|
+
Summary
|
18
|
+
-------
|
19
|
+
|
20
|
+
``scorer_tf_idf`` is a scorer based on the `TF-IDF
|
21
|
+
<http://en.wikipedia.org/wiki/Tf%E2%80%93idf>`_ (term
|
22
|
+
frequency-inverse document frequency) score function.
|
23
|
+
|
24
|
+
To put it simply, TF (term frequency) divided by DF (document
|
25
|
+
frequency) is TF-IDF. "TF" means that "the number of occurrences is
|
26
|
+
more important". "TF divided by DF" means that "the number of
|
27
|
+
occurrences of an important term is more important".
|
28
|
+
|
29
|
+
The default score function in Groonga is TF (term frequency). It
|
30
|
+
doesn't care about term importance but is fast.
|
31
|
+
|
32
|
+
TF-IDF cares about term importance but is slower than TF.
|
33
|
+
|
34
|
+
TF-IDF computes a more suitable score than TF in many cases.
|
35
|
+
But it's not perfect.
|
36
|
+
|
37
|
+
If a document repeats the same keyword many times, such as "They are keyword,
|
38
|
+
keyword, keyword ... and keyword", its score increases with both TF and
|
39
|
+
TF-IDF. Search engine spammers may use this technique. But TF-IDF
|
40
|
+
doesn't guard against the technique.
|
41
|
+
|
42
|
+
`Okapi BM25 <http://en.wikipedia.org/wiki/Okapi_BM25>`_ can solve the
|
43
|
+
case. But it's slower than TF-IDF and not implemented yet in
|
44
|
+
Groonga.
|
45
|
+
|
46
|
+
.. include:: ../scoring_note.rst
|
47
|
+
|
48
|
+
Usage
|
49
|
+
-----
|
50
|
+
|
51
|
+
This section describes how to use this scorer.
|
52
|
+
|
53
|
+
Here are a schema definition and sample data to show usage.
|
54
|
+
|
55
|
+
Sample schema:
|
56
|
+
|
57
|
+
.. groonga-command
|
58
|
+
.. include:: ../example/reference/scorers/scorer_tf_idf/usage_setup_schema.log
|
59
|
+
.. table_create Logs TABLE_NO_KEY
|
60
|
+
.. column_create Logs message COLUMN_SCALAR Text
|
61
|
+
..
|
62
|
+
.. table_create Terms TABLE_PAT_KEY ShortText \
|
63
|
+
.. --default_tokenizer TokenBigram \
|
64
|
+
.. --normalizer NormalizerAuto
|
65
|
+
.. column_create Terms message_index COLUMN_INDEX|WITH_POSITION Logs message
|
66
|
+
|
67
|
+
Sample data:
|
68
|
+
|
69
|
+
.. groonga-command
|
70
|
+
.. include:: ../example/reference/scorers/scorer_tf_idf/usage_setup_data.log
|
71
|
+
.. load --table Logs
|
72
|
+
.. [
|
73
|
+
.. {"message": "Error"},
|
74
|
+
.. {"message": "Warning"},
|
75
|
+
.. {"message": "Warning Warning"},
|
76
|
+
.. {"message": "Warning Warning Warning"},
|
77
|
+
.. {"message": "Info"},
|
78
|
+
.. {"message": "Info Info"},
|
79
|
+
.. {"message": "Info Info Info"},
|
80
|
+
.. {"message": "Info Info Info Info"},
|
81
|
+
.. {"message": "Notice"},
|
82
|
+
.. {"message": "Notice Notice"},
|
83
|
+
.. {"message": "Notice Notice Notice"},
|
84
|
+
.. {"message": "Notice Notice Notice Notice"},
|
85
|
+
.. {"message": "Notice Notice Notice Notice Notice"}
|
86
|
+
.. ]
|
87
|
+
|
88
|
+
You specify ``scorer_tf_idf`` in :ref:`select-match-columns` like the
|
89
|
+
following:
|
90
|
+
|
91
|
+
.. groonga-command
|
92
|
+
.. include:: ../example/reference/scorers/scorer_tf_idf/usage_no_weight.log
|
93
|
+
.. select Logs \
|
94
|
+
.. --match_columns "scorer_tf_idf(message)" \
|
95
|
+
.. --query "Error OR Info" \
|
96
|
+
.. --output_columns "message, _score" \
|
97
|
+
.. --sortby "-_score"
|
98
|
+
|
99
|
+
Both the score of ``Info Info Info`` and the score of ``Error`` are
|
100
|
+
``2`` even though ``Info Info Info`` includes three ``Info`` terms, because
|
101
|
+
``Error`` is a more important term than ``Info``. The number of
|
102
|
+
documents that include ``Info`` is ``4``. The number of documents that
|
103
|
+
include ``Error`` is ``1``. A term that is included in fewer documents
|
104
|
+
is a more characteristic term. A characteristic term
|
105
|
+
is an important term.
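
The document counts quoted above can be reproduced with a short Python sketch over the same sample data. The ``textbook_tf_idf`` function uses the common textbook formula (TF multiplied by the logarithm of the inverse document frequency); it is an assumption for illustration and is not claimed to be the formula Groonga uses, so its values need not match the ``_score`` values above. All function names are hypothetical.

```python
import math

# The same messages as in the sample data above.
messages = [
    "Error",
    "Warning", "Warning Warning", "Warning Warning Warning",
    "Info", "Info Info", "Info Info Info", "Info Info Info Info",
    "Notice", "Notice Notice", "Notice Notice Notice",
    "Notice Notice Notice Notice", "Notice Notice Notice Notice Notice",
]


def document_frequency(term):
    """Number of messages that contain the term at least once."""
    return sum(1 for message in messages if term in message.split())


def textbook_tf_idf(message, term):
    """Textbook TF-IDF: term frequency times log(N / document frequency)."""
    tf = message.split().count(term)
    idf = math.log(len(messages) / document_frequency(term))
    return tf * idf


print(document_frequency("Info"), document_frequency("Error"))
print(textbook_tf_idf("Info Info Info", "Info"), textbook_tf_idf("Error", "Error"))
```

The sketch shows why a single ``Error`` can compete with several ``Info`` occurrences: the rarer term gets a larger weight per occurrence.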
|
106
|
+
|
107
|
+
See also
|
108
|
+
--------
|
109
|
+
|
110
|
+
* :doc:`../scorer`
|
@@ -0,0 +1,13 @@
|
|
1
|
+
.. _note:
|
2
|
+
|
3
|
+
You don't need to resolve scoring only by the score function. The score
|
4
|
+
function highly depends on the search query. You may be able to use
|
5
|
+
metadata of the matched record.
|
6
|
+
|
7
|
+
For example, Google uses `PageRank
|
8
|
+
<http://en.wikipedia.org/wiki/PageRank>`_ for scoring. You may be
|
9
|
+
able to use data type ("title" data are more important than
|
10
|
+
"memo" data), tag, geolocation and so on.
|
11
|
+
|
12
|
+
Please stop thinking only about the score function for scoring.
|
13
|
+
|
@@ -69,6 +69,8 @@ prefix is omitted in the table.)
|
|
69
69
|
| search | | | | |
|
70
70
|
+--------------+------------+--------------+-------------+-------------+
|
71
71
|
|
72
|
+
.. _table-no-key:
|
73
|
+
|
72
74
|
``TABLE_NO_KEY``
|
73
75
|
^^^^^^^^^^^^^^^^
|
74
76
|
|
@@ -79,6 +81,8 @@ You cannot use ``TABLE_NO_KEY`` for lexicon for fulltext search
|
|
79
81
|
because lexicon stores tokens as key. ``TABLE_NO_KEY`` is useful for
|
80
82
|
no key records such as log.
|
81
83
|
|
84
|
+
.. _table-hash-key:
|
85
|
+
|
82
86
|
``TABLE_HASH_KEY``
|
83
87
|
^^^^^^^^^^^^^^^^^^
|
84
88
|
|
@@ -88,6 +92,8 @@ functions such as common prefix search and predictive search.
|
|
88
92
|
``TABLE_HASH_KEY`` is useful for index for exact search such as tag
|
89
93
|
search.
|
90
94
|
|
95
|
+
.. _table-pat-key:
|
96
|
+
|
91
97
|
``TABLE_PAT_KEY``
|
92
98
|
^^^^^^^^^^^^^^^^^
|
93
99
|
|
@@ -96,6 +102,8 @@ search.
|
|
96
102
|
``TABLE_PAT_KEY`` is useful for lexicon for fulltext search and
|
97
103
|
index for range search.
|
98
104
|
|
105
|
+
.. _table-dat-key:
|
106
|
+
|
99
107
|
``TABLE_DAT_KEY``
|
100
108
|
^^^^^^^^^^^^^^^^^
|
101
109
|
|
@@ -2,23 +2,537 @@
|
|
2
2
|
|
3
3
|
.. highlightlang:: none
|
4
4
|
|
5
|
+
.. groonga-command
|
6
|
+
.. database: tokenizers
|
7
|
+
|
5
8
|
Tokenizers
|
6
9
|
==========
|
7
10
|
|
8
|
-
|
9
|
-
|
10
|
-
|
11
|
-
|
12
|
-
|
13
|
-
|
14
|
-
*
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
|
20
|
-
|
21
|
-
|
22
|
-
*
|
23
|
-
|
11
|
+
Summary
|
12
|
+
-------
|
13
|
+
|
14
|
+
Groonga has a tokenizer module that tokenizes text. It is used in
|
15
|
+
the following cases:
|
16
|
+
|
17
|
+
* Indexing text
|
18
|
+
|
19
|
+
.. figure:: /images/reference/tokenizers/used-when-indexing.png
|
20
|
+
:align: center
|
21
|
+
:width: 80%
|
22
|
+
|
23
|
+
Tokenizer is used when indexing text.
|
24
|
+
|
25
|
+
* Searching by query
|
26
|
+
|
27
|
+
.. figure:: /images/reference/tokenizers/used-when-searching.png
|
28
|
+
:align: center
|
29
|
+
:width: 80%
|
30
|
+
|
31
|
+
Tokenizer is used when searching by query.
|
32
|
+
|
33
|
+
The tokenizer is an important module for full-text search. You can change the
|
34
|
+
trade-off between `precision and recall
|
35
|
+
<http://en.wikipedia.org/wiki/Precision_and_recall>`_ by changing
|
36
|
+
tokenizer.
|
37
|
+
|
38
|
+
Normally, :ref:`token-bigram` is a suitable tokenizer. If you don't
|
39
|
+
know much about tokenizers, it's recommended that you choose
|
40
|
+
:ref:`token-bigram`.
|
41
|
+
|
42
|
+
You can try a tokenizer with :doc:`/reference/commands/tokenize` and
|
43
|
+
:doc:`/reference/commands/table_tokenize`. Here is an example to
|
44
|
+
try the :ref:`token-bigram` tokenizer with
|
45
|
+
:doc:`/reference/commands/tokenize`:
|
46
|
+
|
47
|
+
.. groonga-command
|
48
|
+
.. include:: ../example/reference/tokenizers/tokenize-example.log
|
49
|
+
.. tokenize TokenBigram "Hello World"
|
50
|
+
|
51
|
+
What is "tokenize"?
|
52
|
+
-------------------
|
53
|
+
|
54
|
+
"tokenize" is the process that extracts zero or more tokens from a
|
55
|
+
text. There are some "tokenize" methods.
|
56
|
+
|
57
|
+
For example, ``Hello World`` is tokenized to the following tokens by
|
58
|
+
the bigram tokenize method:
|
59
|
+
|
60
|
+
* ``He``
|
61
|
+
* ``el``
|
62
|
+
* ``ll``
|
63
|
+
* ``lo``
|
64
|
+
* ``o_`` (``_`` means a white-space)
|
65
|
+
* ``_W`` (``_`` means a white-space)
|
66
|
+
* ``Wo``
|
67
|
+
* ``or``
|
68
|
+
* ``rl``
|
69
|
+
* ``ld``
|
70
|
+
|
71
|
+
In the above example, 10 tokens are extracted from one text ``Hello
|
72
|
+
World``.
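
The bigram list above can be reproduced with a few lines of Python. This is a minimal sketch of the plain bigram method as described in this section, not Groonga's tokenizer; ``bigram_tokens`` is a hypothetical helper name.

```python
def bigram_tokens(text):
    """Return every pair of adjacent characters in text, in order."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


tokens = bigram_tokens("Hello World")
print(tokens)
print(len(tokens))
```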
|
73
|
+
|
74
|
+
For example, ``Hello World`` is tokenized to the following tokens by
|
75
|
+
the white-space-separate tokenize method:
|
76
|
+
|
77
|
+
* ``Hello``
|
78
|
+
* ``World``
|
79
|
+
|
80
|
+
In the above example, 2 tokens are extracted from one text ``Hello
|
81
|
+
World``.
|
82
|
+
|
83
|
+
A token is used as a search key. You can find indexed documents only by
|
84
|
+
tokens that are extracted by the tokenize method in use. For example, you
|
85
|
+
can find ``Hello World`` by ``ll`` with bigram tokenize method but you
|
86
|
+
can't find ``Hello World`` by ``ll`` with white-space-separate tokenize
|
87
|
+
method, because the white-space-separate tokenize method doesn't extract
|
88
|
+
``ll`` token. It just extracts ``Hello`` and ``World`` tokens.
|
89
|
+
|
90
|
+
In general, a tokenize method that generates small tokens increases
|
91
|
+
recall but decreases precision. A tokenize method that generates large
|
92
|
+
tokens increases precision but decreases recall.
|
93
|
+
|
94
|
+
For example, we can find ``Hello World`` and ``A or B`` by ``or`` with
|
95
|
+
bigram tokenize method. ``Hello World`` is noise for people who
|
96
|
+
want to search for "logical or". It means that precision is
|
97
|
+
decreased. But recall is increased.
|
98
|
+
|
99
|
+
We can find only ``A or B`` by ``or`` with white-space-separate
|
100
|
+
tokenize method, because ``World`` is tokenized to one token ``World``
|
101
|
+
with white-space-separate tokenize method. It means that precision is
|
102
|
+
increased for people who want to search for "logical or". But recall is
|
103
|
+
decreased because ``Hello World`` that contains ``or`` isn't found.
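
The trade-off above can be checked with a small Python sketch. It assumes a toy "index" that simply tests whether the query is one of the tokens of a document; the helper names are hypothetical and this is not how Groonga stores or searches an index.

```python
def bigram_tokens(text):
    """Every pair of adjacent characters."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def whitespace_tokens(text):
    """Words separated by white-space."""
    return text.split()


documents = ["Hello World", "A or B"]


def search(query, tokenize):
    """Return documents whose token list contains the query as a token."""
    return [doc for doc in documents if query in tokenize(doc)]


print(search("or", bigram_tokens))      # higher recall, lower precision
print(search("or", whitespace_tokens))  # higher precision, lower recall
print(search("ll", bigram_tokens))
print(search("ll", whitespace_tokens))
```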
|
104
|
+
|
105
|
+
Built-in tokenizers
|
106
|
+
--------------------
|
107
|
+
|
108
|
+
Here is a list of built-in tokenizers:
|
109
|
+
|
110
|
+
* ``TokenBigram``
|
111
|
+
* ``TokenBigramSplitSymbol``
|
112
|
+
* ``TokenBigramSplitSymbolAlpha``
|
113
|
+
* ``TokenBigramSplitSymbolAlphaDigit``
|
114
|
+
* ``TokenBigramIgnoreBlank``
|
115
|
+
* ``TokenBigramIgnoreBlankSplitSymbol``
|
116
|
+
* ``TokenBigramIgnoreBlankSplitSymbolAlpha``
|
117
|
+
* ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``
|
118
|
+
* ``TokenUnigram``
|
119
|
+
* ``TokenTrigram``
|
120
|
+
* ``TokenDelimit``
|
121
|
+
* ``TokenDelimitNull``
|
122
|
+
* ``TokenMecab``
|
123
|
+
* ``TokenRegexp``
|
124
|
+
|
125
|
+
.. _token-bigram:
|
126
|
+
|
127
|
+
``TokenBigram``
|
128
|
+
^^^^^^^^^^^^^^^
|
129
|
+
|
130
|
+
``TokenBigram`` is a bigram-based tokenizer. It's recommended to use
|
131
|
+
this tokenizer for most cases.
|
132
|
+
|
133
|
+
The bigram tokenize method tokenizes a text into two-adjacent-character
|
134
|
+
tokens. For example, ``Hello`` is tokenized to the following tokens:
|
135
|
+
|
136
|
+
* ``He``
|
137
|
+
* ``el``
|
138
|
+
* ``ll``
|
139
|
+
* ``lo``
|
140
|
+
|
141
|
+
The bigram tokenize method is good for recall because you can find all
|
142
|
+
texts by a query that consists of two or more characters.
|
143
|
+
|
144
|
+
In general, you can't find all texts by a query that consists of one
|
145
|
+
character because one character token doesn't exist. But you can find
|
146
|
+
all texts by a query that consists of one character in Groonga, because
|
147
|
+
Groonga finds tokens that start with the query by predictive search. For
|
148
|
+
example, Groonga can find ``ll`` and ``lo`` tokens by ``l`` query.
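
The predictive search idea above can be sketched as a prefix match over the tokens of ``Hello``. This is only an illustration; ``predictive_search`` is a hypothetical helper and a linear scan stands in for whatever data structure the lexicon actually uses.

```python
def predictive_search(tokens, query):
    """Return the tokens that start with the query."""
    return [token for token in tokens if token.startswith(query)]


tokens = ["He", "el", "ll", "lo"]
print(predictive_search(tokens, "l"))
```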
|
149
|
+
|
150
|
+
The bigram tokenize method isn't good for precision because you can find
|
151
|
+
texts that include the query inside a word. For example, you can find ``world``
|
152
|
+
by ``or``. This is more of a problem for ASCII-only languages than for
|
153
|
+
non-ASCII languages. ``TokenBigram`` has a solution for this problem,
|
154
|
+
described below.
|
155
|
+
|
156
|
+
``TokenBigram`` behavior is different when it's used with any
|
157
|
+
:doc:`/reference/normalizers`.
|
158
|
+
|
159
|
+
If no normalizer is used, ``TokenBigram`` uses pure bigram (all tokens
|
160
|
+
except the last token have two characters) tokenize method:
|
161
|
+
|
162
|
+
.. groonga-command
|
163
|
+
.. include:: ../example/reference/tokenizers/token-bigram-no-normalizer.log
|
164
|
+
.. tokenize TokenBigram "Hello World"
|
165
|
+
|
166
|
+
If a normalizer is used, ``TokenBigram`` uses a white-space-separate-like
|
167
|
+
tokenize method for ASCII characters. ``TokenBigram`` uses bigram
|
168
|
+
tokenize method for non-ASCII characters.
|
169
|
+
|
170
|
+
You may be confused by this combined behavior. But it's reasonable
|
171
|
+
for most use cases such as English text (only ASCII characters) and
|
172
|
+
Japanese text (ASCII and non-ASCII characters are mixed).
|
173
|
+
|
174
|
+
Most languages that consist of only ASCII characters use white-space as the
|
175
|
+
word separator. White-space-separate tokenize method is suitable for
|
176
|
+
the case.
|
177
|
+
|
178
|
+
Languages that consist of non-ASCII characters don't use white-space as the
|
179
|
+
word separator. Bigram tokenize method is suitable for the case.
|
180
|
+
|
181
|
+
Mixed tokenize method is suitable for mixed language case.
|
182
|
+
|
183
|
+
If you want to use the bigram tokenize method for ASCII characters, see
|
184
|
+
``TokenBigramSplitXXX`` type tokenizers such as
|
185
|
+
:ref:`token-bigram-split-symbol-alpha`.
|
186
|
+
|
187
|
+
Let's confirm ``TokenBigram`` behavior by example.
|
188
|
+
|
189
|
+
``TokenBigram`` uses one or more white-spaces as token delimiter for
|
190
|
+
ASCII characters:
|
191
|
+
|
192
|
+
.. groonga-command
|
193
|
+
.. include:: ../example/reference/tokenizers/token-bigram-ascii-and-white-space-with-normalizer.log
|
194
|
+
.. tokenize TokenBigram "Hello World" NormalizerAuto
|
195
|
+
|
196
|
+
``TokenBigram`` uses character type change as token delimiter for
|
197
|
+
ASCII characters. The character type is one of the following:
|
198
|
+
|
199
|
+
* Alphabet
|
200
|
+
* Digit
|
201
|
+
* Symbol (such as ``(``, ``)`` and ``!``)
|
202
|
+
* Hiragana
|
203
|
+
* Katakana
|
204
|
+
* Kanji
|
205
|
+
* Others
|
206
|
+
|
207
|
+
The following example shows two token delimiters:
|
208
|
+
|
209
|
+
* between ``100`` (digits) and ``cents`` (alphabets)
|
210
|
+
* between ``cents`` (alphabets) and ``!!!`` (symbols)
|
211
|
+
|
212
|
+
.. groonga-command
|
213
|
+
.. include:: ../example/reference/tokenizers/token-bigram-ascii-and-character-type-change-with-normalizer.log
|
214
|
+
.. tokenize TokenBigram "100cents!!!" NormalizerAuto
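
Splitting at character type changes can be sketched in Python by grouping consecutive characters of the same type. This is a simplified illustration for ASCII input only: ``char_type`` is a hypothetical classifier that knows just three of the types listed above and treats everything else as a symbol.

```python
from itertools import groupby


def char_type(ch):
    """Classify one ASCII character; anything else is treated as a symbol."""
    if ch.isascii() and ch.isalpha():
        return "alphabet"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "symbol"


def split_by_char_type(text):
    """Split text wherever the character type changes."""
    return ["".join(group) for _, group in groupby(text, key=char_type)]


print(split_by_char_type("100cents!!!"))
```

The two boundaries in the result are the two token delimiters described above.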
|
215
|
+
|
216
|
+
Here is an example that ``TokenBigram`` uses bigram tokenize method
|
217
|
+
for non-ASCII characters.
|
218
|
+
|
219
|
+
.. groonga-command
|
220
|
+
.. include:: ../example/reference/tokenizers/token-bigram-non-ascii-with-normalizer.log
|
221
|
+
.. tokenize TokenBigram "日本語の勉強" NormalizerAuto
|
222
|
+
|
223
|
+
.. _token-bigram-split-symbol:
|
224
|
+
|
225
|
+
``TokenBigramSplitSymbol``
|
226
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
227
|
+
|
228
|
+
``TokenBigramSplitSymbol`` is similar to :ref:`token-bigram`. The
|
229
|
+
difference between them is symbol handling. ``TokenBigramSplitSymbol``
|
230
|
+
tokenizes symbols by bigram tokenize method:
|
231
|
+
|
232
|
+
.. groonga-command
|
233
|
+
.. include:: ../example/reference/tokenizers/token-bigram-split-symbol-with-normalizer.log
|
234
|
+
.. tokenize TokenBigramSplitSymbol "100cents!!!" NormalizerAuto
|
235
|
+
|
236
|
+
.. _token-bigram-split-symbol-alpha:
|
237
|
+
|
238
|
+
``TokenBigramSplitSymbolAlpha``
|
239
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
240
|
+
|
241
|
+
``TokenBigramSplitSymbolAlpha`` is similar to :ref:`token-bigram`. The
|
242
|
+
difference between them is symbol and alphabet
|
243
|
+
handling. ``TokenBigramSplitSymbolAlpha`` tokenizes symbols and
|
244
|
+
alphabets by bigram tokenize method:
|
245
|
+
|
246
|
+
.. groonga-command
|
247
|
+
.. include:: ../example/reference/tokenizers/token-bigram-split-symbol-alpha-with-normalizer.log
|
248
|
+
.. tokenize TokenBigramSplitSymbolAlpha "100cents!!!" NormalizerAuto
|
249
|
+
|
250
|
+
.. _token-bigram-split-symbol-alpha-digit:
|
251
|
+
|
252
|
+
``TokenBigramSplitSymbolAlphaDigit``
|
253
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
254
|
+
|
255
|
+
``TokenBigramSplitSymbolAlphaDigit`` is similar to
|
256
|
+
:ref:`token-bigram`. The difference between them is symbol, alphabet
|
257
|
+
and digit handling. ``TokenBigramSplitSymbolAlphaDigit`` tokenizes
|
258
|
+
symbols, alphabets and digits by bigram tokenize method. It means that
|
259
|
+
all characters are tokenized by bigram tokenize method:
|
260
|
+
|
261
|
+
.. groonga-command
|
262
|
+
.. include:: ../example/reference/tokenizers/token-bigram-split-symbol-alpha-digit-with-normalizer.log
|
263
|
+
.. tokenize TokenBigramSplitSymbolAlphaDigit "100cents!!!" NormalizerAuto
|
264
|
+
|
265
|
+
.. _token-bigram-ignore-blank:
|
266
|
+
|
267
|
+
``TokenBigramIgnoreBlank``
|
268
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^
|
269
|
+
|
270
|
+
``TokenBigramIgnoreBlank`` is similar to :ref:`token-bigram`. The
|
271
|
+
difference between them is blank handling. ``TokenBigramIgnoreBlank``
|
272
|
+
ignores white-spaces in continuous symbols and non-ASCII characters.
|
273
|
+
|
274
|
+
You can see the difference between them with the ``日 本 語 ! ! !`` text because it
|
275
|
+
has symbols and non-ASCII characters.
|
276
|
+
|
277
|
+
Here is a result by :ref:`token-bigram` :
|
278
|
+
|
279
|
+
.. groonga-command
|
280
|
+
.. include:: ../example/reference/tokenizers/token-bigram-with-white-spaces.log
|
281
|
+
.. tokenize TokenBigram "日 本 語 ! ! !" NormalizerAuto
|
282
|
+
|
283
|
+
Here is a result by ``TokenBigramIgnoreBlank``:
|
284
|
+
|
285
|
+
.. groonga-command
|
286
|
+
.. include:: ../example/reference/tokenizers/token-bigram-ignore-blank-with-white-spaces.log
|
287
|
+
.. tokenize TokenBigramIgnoreBlank "日 本 語 ! ! !" NormalizerAuto
|
288
|
+
|
289
|
+
.. _token-bigram-ignore-blank-split-symbol:
|
290
|
+
|
291
|
+
``TokenBigramIgnoreBlankSplitSymbol``
|
292
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
293
|
+
|
294
|
+
``TokenBigramIgnoreBlankSplitSymbol`` is similar to
|
295
|
+
:ref:`token-bigram`. The differences between them are as follows:
|
296
|
+
|
297
|
+
* Blank handling
|
298
|
+
* Symbol handling
|
299
|
+
|
300
|
+
``TokenBigramIgnoreBlankSplitSymbol`` ignores white-spaces in
|
301
|
+
continuous symbols and non-ASCII characters.
|
302
|
+
|
303
|
+
``TokenBigramIgnoreBlankSplitSymbol`` tokenizes symbols by bigram
|
304
|
+
tokenize method.
|
305
|
+
|
306
|
+
You can see the difference between them with the ``日 本 語 ! ! !`` text because it
|
307
|
+
has symbols and non-ASCII characters.
|
308
|
+
|
309
|
+
Here is a result by :ref:`token-bigram` :
|
310
|
+
|
311
|
+
.. groonga-command
|
312
|
+
.. include:: ../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol.log
|
313
|
+
.. tokenize TokenBigram "日 本 語 ! ! !" NormalizerAuto
|
314
|
+
|
315
|
+
Here is a result by ``TokenBigramIgnoreBlankSplitSymbol``:
|
316
|
+
|
317
|
+
.. groonga-command
|
318
|
+
.. include:: ../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol.log
|
319
|
+
.. tokenize TokenBigramIgnoreBlankSplitSymbol "日 本 語 ! ! !" NormalizerAuto
|
320
|
+
|
321
|
+
.. _token-bigram-ignore-blank-split-symbol-alpha:
|
322
|
+
|
323
|
+
``TokenBigramIgnoreBlankSplitSymbolAlpha``
|
324
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
325
|
+
|
326
|
+
``TokenBigramIgnoreBlankSplitSymbolAlpha`` is similar to
|
327
|
+
:ref:`token-bigram`. The differences between them are as follows:
|
328
|
+
|
329
|
+
* Blank handling
|
330
|
+
* Symbol and alphabet handling
|
331
|
+
|
332
|
+
``TokenBigramIgnoreBlankSplitSymbolAlpha`` ignores white-spaces in
|
333
|
+
continuous symbols and non-ASCII characters.
|
334
|
+
|
335
|
+
``TokenBigramIgnoreBlankSplitSymbolAlpha`` tokenizes symbols and
|
336
|
+
alphabets by bigram tokenize method.
|
337
|
+
|
338
|
+
You can see the difference between them with the ``Hello 日 本 語 ! ! !`` text because it
|
339
|
+
has symbols and non-ASCII characters with white spaces and alphabets.
|
340
|
+
|
341
|
+
Here is a result by :ref:`token-bigram` :
|
342
|
+
|
343
|
+
.. groonga-command
|
344
|
+
.. include:: ../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol-and-alphabet.log
|
345
|
+
.. tokenize TokenBigram "Hello 日 本 語 ! ! !" NormalizerAuto
|
346
|
+
|
347
|
+
Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlpha``:
|
348
|
+
|
349
|
+
.. groonga-command
|
350
|
+
.. include:: ../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol-and-alphabet.log
|
351
|
+
.. tokenize TokenBigramIgnoreBlankSplitSymbolAlpha "Hello 日 本 語 ! ! !" NormalizerAuto
|
352
|
+
|
353
|
+
.. _token-bigram-ignore-blank-split-symbol-alpha-digit:
|
354
|
+
|
355
|
+
``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``
|
356
|
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
357
|
+
|
358
|
+
``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` is similar to
|
359
|
+
:ref:`token-bigram`. The differences between them are as follows:
|
360
|
+
|
361
|
+
* Blank handling
|
362
|
+
* Symbol, alphabet and digit handling
|
363
|
+
|
364
|
+
``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` ignores white-spaces
|
365
|
+
in continuous symbols and non-ASCII characters.
|
366
|
+
|
367
|
+
``TokenBigramIgnoreBlankSplitSymbolAlphaDigit`` tokenizes symbols,
|
368
|
+
alphabets and digits by bigram tokenize method. It means that all
|
369
|
+
characters are tokenized by bigram tokenize method.
|
370
|
+
|
371
|
+
You can see the difference between them with the ``Hello 日 本 語 ! ! ! 777`` text
|
372
|
+
because it has symbols and non-ASCII characters with white spaces,
|
373
|
+
alphabets and digits.
|
374
|
+
|
375
|
+
Here is a result by :ref:`token-bigram` :
|
376
|
+
|
377
|
+
.. groonga-command
|
378
|
+
.. include:: ../example/reference/tokenizers/token-bigram-with-white-spaces-and-symbol-and-alphabet-and-digit.log
|
379
|
+
.. tokenize TokenBigram "Hello 日 本 語 ! ! ! 777" NormalizerAuto
|
380
|
+
|
381
|
+
Here is a result by ``TokenBigramIgnoreBlankSplitSymbolAlphaDigit``:
|
382
|
+
|
383
|
+
.. groonga-command
|
384
|
+
.. include:: ../example/reference/tokenizers/token-bigram-ignore-blank-split-symbol-with-white-spaces-and-symbol-and-alphabet-digit.log
|
385
|
+
.. tokenize TokenBigramIgnoreBlankSplitSymbolAlphaDigit "Hello 日 本 語 ! ! ! 777" NormalizerAuto
|
386
|
+
|
387
|
+
.. _token-unigram:
|
388
|
+
|
389
|
+
``TokenUnigram``
|
390
|
+
^^^^^^^^^^^^^^^^
|
391
|
+
|
392
|
+
``TokenUnigram`` is similar to :ref:`token-bigram`. The difference
|
393
|
+
between them is the token unit. :ref:`token-bigram` uses 2 characters per
|
394
|
+
token. ``TokenUnigram`` uses 1 character per token.
|
395
|
+
|
396
|
+
.. groonga-command
|
397
|
+
.. include:: ../example/reference/tokenizers/token-unigram.log
|
398
|
+
.. tokenize TokenUnigram "100cents!!!" NormalizerAuto
|
399
|
+
|
400
|
+
.. _token-trigram:
|
401
|
+
|
402
|
+
``TokenTrigram``
|
403
|
+
^^^^^^^^^^^^^^^^
|
404
|
+
|
405
|
+
``TokenTrigram`` is similar to :ref:`token-bigram`. The difference
|
406
|
+
between them is the token unit. :ref:`token-bigram` uses 2 characters per
|
407
|
+
token. ``TokenTrigram`` uses 3 characters per token.
|
408
|
+
|
409
|
+
.. groonga-command
|
410
|
+
.. include:: ../example/reference/tokenizers/token-trigram.log
|
411
|
+
.. tokenize TokenTrigram "10000cents!!!!!" NormalizerAuto
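
The only thing that changes between the unigram, bigram and trigram methods is the number of characters per token, which a generic n-gram helper makes explicit. This is a sketch of the plain n-gram idea with a hypothetical ``ngram_tokens`` function; it does not model the normalizer-dependent behavior of the Groonga tokenizers.

```python
def ngram_tokens(text, n):
    """Return every run of n adjacent characters in text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(ngram_tokens("Hello", 1))  # unigram
print(ngram_tokens("Hello", 2))  # bigram
print(ngram_tokens("Hello", 3))  # trigram
```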
|
412
|
+
|
413
|
+
.. _token-delimit:
|
414
|
+
|
415
|
+
``TokenDelimit``
|
416
|
+
^^^^^^^^^^^^^^^^
|
417
|
+
|
418
|
+
``TokenDelimit`` extracts tokens by splitting text at one or more space
|
419
|
+
characters (``U+0020``). For example, ``Hello World`` is tokenized to
|
420
|
+
``Hello`` and ``World``.
|
421
|
+
|
422
|
+
``TokenDelimit`` is suitable for tag text. You can extract ``groonga``
|
423
|
+
, ``full-text-search`` and ``http`` as tags from ``groonga
|
424
|
+
full-text-search http``.
|
425
|
+
|
426
|
+
Here is an example of ``TokenDelimit``:
|
427
|
+
|
428
|
+
.. groonga-command
|
429
|
+
.. include:: ../example/reference/tokenizers/token-delimit.log
|
430
|
+
.. tokenize TokenDelimit "Groonga full-text-search HTTP" NormalizerAuto
|
431
|
+
|
432
|
+
.. _token-delimit-null:
|
433
|
+
|
434
|
+
``TokenDelimitNull``
|
435
|
+
^^^^^^^^^^^^^^^^^^^^
|
436
|
+
|
437
|
+
``TokenDelimitNull`` is similar to :ref:`token-delimit`. The
|
438
|
+
difference between them is the separator character. :ref:`token-delimit`
|
439
|
+
uses the space character (``U+0020``) but ``TokenDelimitNull`` uses the NUL
|
440
|
+
character (``U+0000``).
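
Both delimiter-based methods can be sketched with one Python helper that takes the separator as a parameter. ``delimit_tokens`` is a hypothetical name. Dropping empty tokens models the "one or more space characters" rule of ``TokenDelimit``; applying the same rule to NUL separators is an assumption of this sketch.

```python
def delimit_tokens(text, separator=" "):
    """Split text at the separator and drop empty tokens."""
    return [token for token in text.split(separator) if token]


print(delimit_tokens("groonga full-text-search http"))
print(delimit_tokens("Groonga\u0000full-text-search\u0000HTTP", "\u0000"))
```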
|
441
|
+
|
442
|
+
``TokenDelimitNull`` is also suitable for tag text.
|
443
|
+
|
444
|
+
Here is an example of ``TokenDelimitNull``:
|
445
|
+
|
446
|
+
.. groonga-command
|
447
|
+
.. include:: ../example/reference/tokenizers/token-delimit-null.log
|
448
|
+
.. tokenize TokenDelimitNull "Groonga\u0000full-text-search\u0000HTTP" NormalizerAuto
|
449
|
+
|
450
|
+
.. _token-mecab:
|
451
|
+
|
452
|
+
``TokenMecab``
|
453
|
+
^^^^^^^^^^^^^^
|
454
|
+
|
455
|
+
``TokenMecab`` is a tokenizer based on `MeCab
|
456
|
+
<http://mecab.sourceforge.net/>`_ part-of-speech and
|
457
|
+
morphological analyzer.
|
458
|
+
|
459
|
+
MeCab doesn't depend on Japanese. You can use MeCab for other
|
460
|
+
languages by creating a dictionary for those languages. You can use `NAIST
|
461
|
+
Japanese Dictionary <http://sourceforge.jp/projects/naist-jdic/>`_
|
462
|
+
for Japanese.
|
463
|
+
|
464
|
+
``TokenMecab`` is good for precision rather than recall. You can find
|
465
|
+
``東京都`` and ``京都`` texts by ``京都`` query with
|
466
|
+
:ref:`token-bigram` but ``東京都`` isn't expected. You can find only
|
467
|
+
``京都`` text by ``京都`` query with ``TokenMecab``.
|
468
|
+
|
469
|
+
If you want to support neologisms, you need to keep updating your
|
470
|
+
MeCab dictionary. It incurs a maintenance cost. (:ref:`token-bigram` doesn't
|
471
|
+
require dictionary maintenance because :ref:`token-bigram` doesn't use a
|
472
|
+
dictionary.) `mecab-ipadic-NEologd : Neologism dictionary for MeCab
|
473
|
+
<https://github.com/neologd/mecab-ipadic-neologd>`_ may help you.
|
474
|
+
|
475
|
+
Here is an example of ``TokenMecab``. ``東京都`` is tokenized to ``東京``
|
476
|
+
and ``都``. They don't include ``京都``:
|
477
|
+
|
478
|
+
.. groonga-command
|
479
|
+
.. include:: ../example/reference/tokenizers/token-mecab.log
|
480
|
+
.. tokenize TokenMecab "東京都"
|
481
|
+
|
482
|
+
.. _token-regexp:
|
483
|
+
|
484
|
+
``TokenRegexp``
|
485
|
+
^^^^^^^^^^^^^^^
|
486
|
+
|
487
|
+
.. versionadded:: 5.0.1
|
488
|
+
|
489
|
+
.. caution::
|
490
|
+
|
491
|
+
This tokenizer is experimental. The specification may change.
|
492
|
+
|
493
|
+
.. caution::
|
494
|
+
|
495
|
+
This tokenizer can be used only with UTF-8. You can't use this
|
496
|
+
tokenizer with EUC-JP, Shift_JIS and so on.
|
497
|
+
|
498
|
+
``TokenRegexp`` is a tokenizer for supporting regular expression
|
499
|
+
search by index.
|
500
|
+
|
501
|
+
In general, regular expression search is evaluated as sequential
|
502
|
+
search. But the following cases can be evaluated as index search:
|
503
|
+
|
504
|
+
* Literal only case such as ``hello``
|
505
|
+
* The beginning of text and literal case such as ``\A/home/alice``
|
506
|
+
* The end of text and literal case such as ``\.txt\z``
|
507
|
+
|
508
|
+
In most cases, index search is faster than sequential search.
|
509
|
+
|
510
|
+
``TokenRegexp`` is based on bigram tokenize method. ``TokenRegexp``
|
511
|
+
adds the beginning of text mark (``U+FFEF``) at the beginning of text
|
512
|
+
and the end of text mark (``U+FFF0``) to the end of text when you
|
513
|
+
index text:
|
514
|
+
|
515
|
+
.. groonga-command
|
516
|
+
.. include:: ../example/reference/tokenizers/token-regexp-add.log
|
517
|
+
.. tokenize TokenRegexp "/home/alice/test.txt" NormalizerAuto --mode ADD
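
The idea of wrapping the bigram tokens with the two marks can be sketched as follows. This is a simplified illustration, not the tokenizer's real output: ``regexp_index_tokens`` is a hypothetical helper, normalization is skipped, and the exact tokens Groonga emits around the marks may differ.

```python
BEGIN_OF_TEXT_MARK = "\uFFEF"
END_OF_TEXT_MARK = "\uFFF0"


def regexp_index_tokens(text):
    """Bigram tokens of text, wrapped by the begin and end of text marks."""
    bigrams = [text[i:i + 2] for i in range(len(text) - 1)]
    return [BEGIN_OF_TEXT_MARK] + bigrams + [END_OF_TEXT_MARK]


tokens = regexp_index_tokens("/home/alice/test.txt")
print(tokens[:3])
print(tokens[-3:])
```

Because the marks are ordinary tokens in the index, a query anchored with ``\A`` or ``\z`` can require the corresponding mark next to its literal part.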
|
518
|
+
|
519
|
+
The beginning of text mark is used for the beginning of text search by
|
520
|
+
``\A``. If you use ``TokenRegexp`` for tokenizing query,
|
521
|
+
``TokenRegexp`` adds the beginning of text mark (``U+FFEF``) as the
|
522
|
+
first token. Because the beginning of text mark must appear first,
|
523
|
+
you can get results of the beginning of text search.
|
524
|
+
|
525
|
+
.. groonga-command
|
526
|
+
.. include:: ../example/reference/tokenizers/token-regexp-get-beginning-of-text.log
|
527
|
+
.. tokenize TokenRegexp "\\A/home/alice/" NormalizerAuto --mode GET
|
528
|
+
|
529
|
+
The end of text mark is used for the end of text search by ``\z``.
|
530
|
+
If you use ``TokenRegexp`` for tokenizing query, ``TokenRegexp`` adds
|
531
|
+
the end of text mark (``U+FFF0``) as the last token. Because the end of text
|
532
|
+
mark must appear at the end, you can get results of the end of
|
533
|
+
text search.
|
534
|
+
|
535
|
+
.. groonga-command
|
536
|
+
.. include:: ../example/reference/tokenizers/token-regexp-get-end-of-text.log
|
537
|
+
.. tokenize TokenRegexp "\\.txt\\z" NormalizerAuto --mode GET
|
24
538
|
|