sphinx 0.9.10.2043 → 0.9.10.2091

data/.gitignore CHANGED
@@ -1,2 +1,4 @@
1
1
  rdoc
2
+ doc
3
+ .yardoc
2
4
  pkg
data/README.rdoc CHANGED
@@ -1,10 +1,10 @@
1
- =Sphinx Client API 0.9.10
1
+ = Sphinx Client API 0.9.10
2
2
 
3
- This document gives an overview of what is Sphinx itself and how to use in
4
- within Ruby on Rails. For more information or documentation,
3
+ This document gives an overview of what Sphinx is and how to use it
4
+ from your Ruby on Rails application. For more information or documentation,
5
5
  please go to http://www.sphinxsearch.com
6
6
 
7
- ==Sphinx
7
+ == Sphinx
8
8
 
9
9
  Sphinx is a standalone full-text search engine, meant to provide fast,
10
10
  size-efficient and relevant fulltext search functions to other applications.
@@ -12,37 +12,191 @@ Sphinx was specially designed to integrate well with SQL databases and
12
12
  scripting languages. Currently built-in data sources support fetching data
13
13
  either via direct connection to MySQL, or from an XML pipe.
14
14
 
15
- Simplest way to communicate with Sphinx is to use <tt>searchd</tt> -
16
- a daemon to search through fulltext indices from external software.
15
+ The simplest way to communicate with Sphinx is to use <tt>searchd</tt>,
16
+ a daemon that lets external software search through full-text indexes.
17
17
 
18
- ==Compatibility
18
+ == Installation
19
19
 
20
- This version supports all API features of Sphinx 0.9.10-r2043. Full compatibility list:
20
+ There are two ways to install the Sphinx plugin:
21
21
 
22
- * <tt>0.9.10</tt> Sphinx 0.9.10-r2043
23
- * <tt>0.9.9</tt> Sphinx 0.9.9-r1299
22
+ * installing it as a gem (recommended)
23
+ * installing it as a Rails plugin
24
24
 
25
- ==Documentation
25
+ To install as a gem, add this to your environment.rb:
26
26
 
27
- You can create the documentation by running:
27
+ config.gem 'sphinx', :source => 'http://gemcutter.org'
28
28
 
29
- rake rdoc
29
+ And then run the command:
30
30
 
31
- ==Latest version
31
+ sudo rake gems:install
32
32
 
33
- You can always get latest version from
33
+ To install Sphinx as a Rails plugin use this:
34
+
35
+ script/plugin install git://github.com/kpumuk/sphinx.git
36
+
37
+ == Documentation
38
+
39
+ Complete Sphinx plugin documentation can be found here:
40
+ http://kpumuk.github.com/sphinx
41
+
42
+ You can also find the documentation on rdoc.info:
43
+ http://rdoc.info/projects/kpumuk/sphinx
44
+
45
+ You can build the documentation locally by running:
46
+
47
+ rake yard
48
+
49
+ Please note that the yard gem must be installed on your system:
50
+
51
+ sudo gem install yard --source http://gemcutter.org
52
+
53
+ Complete Sphinx API documentation can be found on the Sphinx Search Engine
54
+ site: http://www.sphinxsearch.com/docs/current.html
55
+ This plugin is fully compatible with the original PHP API implementation.
56
+
57
+ == Ruby naming conventions
58
+
59
+ Sphinx Client API supports Ruby naming conventions, so every API
60
+ method name is in underscored, lowercase form:
61
+
62
+ SetServer -> set_server
63
+ RunQueries -> run_queries
64
+ SetMatchMode -> set_match_mode
65
+
66
+ Every method is aliased to the corresponding one from the standard Sphinx
67
+ API, so you can use <tt>SetServer</tt> and <tt>set_server</tt>
68
+ interchangeably.
69
+
70
+ There are three exceptions to this naming rule:
71
+
72
+ GetLastError -> last_error
73
+ GetLastWarning -> last_warning
74
+ IsConnectError -> connect_error?
75
+
76
+ Of course, all of them are aliased to the original method names.
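
The aliasing described above can be sketched with Ruby's <tt>alias</tt> keyword. This is a minimal illustration, not the library's source: <tt>DemoClient</tt> is a hypothetical stand-in class, and only the method names are taken from the README.

```ruby
# Hypothetical stand-in for Sphinx::Client; it only demonstrates the aliasing.
class DemoClient
  def initialize
    @error = ''
    @connerror = false
  end

  # Ruby-style name.
  def last_error
    @error
  end
  # CamelCase name from the original API, pointing at the same method body.
  alias :GetLastError :last_error

  # Predicate methods end with "?" in Ruby.
  def connect_error?
    @connerror || false
  end
  alias :IsConnectError :connect_error?
end

client = DemoClient.new
client.last_error        # Ruby-style call
client.GetLastError()    # original API name, same behaviour
```

Parentheses after <tt>GetLastError</tt> are optional when there is an explicit receiver; they are kept here only to make it obvious that a method, not a constant, is being called.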
77
+
78
+ == Using multiple Sphinx servers
79
+
80
+ Since we actively use this plugin in our Scribd development workflow,
81
+ several methods have been added to accommodate our needs.
82
+ The Ruby-specific methods are described in the documentation:
83
+ http://rdoc.info/projects/kpumuk/sphinx
84
+
85
+ First of all, we added support for multiple Sphinx servers to balance
86
+ the load between them. It also means that if one of the servers has
87
+ problems, the library will try to fetch the results from another one.
88
+ Every subsequent request will be executed on the next server in the list
89
+ (round-robin technique).
90
+
91
+ sphinx.set_servers([
92
+ { :host => 'browse01.local', :port => 3312 },
93
+ { :host => 'browse02.local', :port => 3312 },
94
+ { :host => 'browse03.local', :port => 3312 }
95
+ ])
96
+
97
+ By default, the library will try to fetch results from a single server and
98
+ fail if it does not respond. To set the number of retries to perform,
99
+ use the second (optional) parameter of the <tt>set_connect_timeout</tt>
100
+ and <tt>set_request_timeout</tt> methods:
101
+
102
+ sphinx.set_connect_timeout(1, 3)
103
+ sphinx.set_request_timeout(1, 3)
104
+
105
+ There is a big difference between these two methods. The first affects
106
+ only requests that experience connection problems (socket error,
107
+ pipe error, etc.); the second is used when the request itself fails
108
+ (temporary searchd error, incomplete reply, etc.). The workflow looks like
109
+ this:
110
+
111
+ 1. Increment the connection retry counter. If it is less than or equal to the
112
+ configured value, try to connect to the next server. Otherwise, raise an error.
113
+ 2. In case of a connection problem, go to 1.
114
+ 3. Increment the request retry counter. If it is less than or equal to the
115
+ configured value, try to perform the request. Otherwise, raise an error.
116
+ 4. In case of a connection problem, go to 1.
117
+ 5. In case of a request problem, go to 3.
118
+ 6. Parse and return the response.
119
+
120
+ Conclusions:
121
+
122
+ 1. A request can be performed up to <tt>connect_retries</tt> * <tt>request_retries</tt>
123
+ times. That is, it can be tried <tt>request_retries</tt> times on each
124
+ of <tt>connect_retries</tt> servers (when you have 1 server configured,
125
+ but <tt>connect_retries</tt> is 5, the library will try to connect to this
126
+ server 5 times).
127
+ 2. A request can be executed on each server <tt>1..request_retries</tt>
128
+ times. In case of a connection problem, the request is moved to another
129
+ server immediately.
130
+
131
+ Usually you will set <tt>connect_retries</tt> equal to the number of servers,
132
+ so that each failing request is tried on all servers.
133
+ This means that if one of the servers is alive but the others are dead, your request
134
+ will eventually be executed successfully.
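
The retry workflow described in this section can be sketched as two nested loops. This is a simplified illustration under stated assumptions, not the library's implementation: <tt>with_retries</tt>, <tt>DemoConnectError</tt> and <tt>DemoRequestError</tt> are hypothetical names, and resetting the request counter for every new server is one reading of the README's conclusions.

```ruby
# Hypothetical error classes standing in for connection and request failures.
class DemoConnectError < StandardError; end
class DemoRequestError < StandardError; end

# Yields servers in round-robin order until the block succeeds or the
# configured numbers of retries are exhausted.
def with_retries(servers, connect_retries, request_retries)
  connect_attempts = 0
  loop do
    # Step 1: count the connection attempt and pick the next server.
    connect_attempts += 1
    raise DemoConnectError, 'connect retries exhausted' if connect_attempts > connect_retries
    server = servers[(connect_attempts - 1) % servers.size]
    request_attempts = 0 # assumption: the request counter is per server
    begin
      loop do
        # Step 3: count the request attempt on the current server.
        request_attempts += 1
        raise DemoRequestError, 'request retries exhausted' if request_attempts > request_retries
        begin
          return yield(server) # Step 6: success, return the response.
        rescue DemoRequestError
          next # Step 5: request problem, try the same server again.
        end
      end
    rescue DemoConnectError
      next # Steps 2 and 4: connection problem, move to the next server.
    end
  end
end

# Usage: the block performs the actual request against the given server.
response = with_retries(%w[browse01.local browse02.local], 2, 3) do |server|
  "reply from #{server}"
end
```

The block raises <tt>DemoConnectError</tt> to signal a socket-level failure and <tt>DemoRequestError</tt> to signal a broken reply; any other exception propagates to the caller unchanged.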
135
+
136
+ == Sphinx constants
137
+
138
+ Most Sphinx API methods expect special constants to be passed.
139
+ For example:
140
+
141
+ sphinx.set_match_mode(Sphinx::Client::SPH_MATCH_ANY)
142
+
143
+ Please note that these constants are defined in the <tt>Sphinx::Client</tt>
144
+ namespace. You can use symbols or strings instead of these awful
145
+ constants:
146
+
147
+ sphinx.set_match_mode(:any)
148
+ sphinx.set_match_mode('any')
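
One way such a symbol or string shortcut could be resolved to a constant is sketched below. This is an assumption about the technique, not the library's code: <tt>DemoModes</tt> and <tt>resolve_match_mode</tt> are hypothetical names, and only the three constant values are copied from the match modes listed in <tt>client.rb</tt>.

```ruby
# Hypothetical module; constant values mirror the match modes in client.rb.
module DemoModes
  SPH_MATCH_ALL    = 0
  SPH_MATCH_ANY    = 1
  SPH_MATCH_PHRASE = 2

  # Accepts an Integer constant, a Symbol or a String and returns the Integer.
  def self.resolve_match_mode(mode)
    return mode if mode.is_a?(Integer)

    name = "SPH_MATCH_#{mode.to_s.upcase}"
    # Look the name up in the list of defined constants, so that arbitrary
    # input produces an ArgumentError instead of a NameError.
    unless constants.include?(name.to_sym)
      raise ArgumentError, "unknown match mode: #{mode.inspect}"
    end
    const_get(name)
  end
end

DemoModes.resolve_match_mode(:any)
DemoModes.resolve_match_mode('phrase')
DemoModes.resolve_match_mode(DemoModes::SPH_MATCH_ALL)
```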
149
+
150
+ == Setting query filters
151
+
152
+ Every <tt>set_</tt> method returns the <tt>Sphinx::Client</tt> object itself,
153
+ which means that you can chain filtering methods:
154
+
155
+ results = Sphinx::Client.new.
156
+ set_match_mode(:any).
157
+ set_ranking_mode(:bm25).
158
+ set_id_range(10, 1000).
159
+ query('test')
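
Chaining works because each setter ends by returning <tt>self</tt>. The sketch below shows the pattern in isolation; <tt>DemoQuery</tt> and its <tt>settings</tt> reader are hypothetical and exist only for this illustration.

```ruby
# Hypothetical class demonstrating chainable setters.
class DemoQuery
  attr_reader :settings

  def initialize
    @settings = {}
  end

  def set_match_mode(mode)
    @settings[:match_mode] = mode
    self # returning self is what makes chaining possible
  end

  def set_id_range(min, max)
    raise ArgumentError, '"min" must not exceed "max"' unless min <= max
    @settings[:id_range] = [min, max]
    self
  end
end

# The trailing dot continues the expression on the next line.
query = DemoQuery.new.
  set_match_mode(:any).
  set_id_range(10, 1000)
```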
160
+
161
+ == Example
162
+
163
+ This simple example illustrates establishing a connection,
164
+ retrieving search results, and building excerpts. Please note
165
+ how it performs the database select using ActiveRecord to
166
+ preserve the order of records established by Sphinx.
167
+
168
+ sphinx = Sphinx::Client.new
169
+ result = sphinx.query('test')
170
+ ids = result['matches'].map { |match| match['id'] }
171
+ posts = Post.all :conditions => { :id => ids },
172
+ :order => "FIELD(id,#{ids.join(',')})"
173
+
174
+ docs = posts.map(&:body)
175
+ excerpts = sphinx.build_excerpts(docs, 'index', 'test')
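
The <tt>FIELD()</tt> function used in the <tt>:order</tt> clause above is specific to MySQL. Where it is not available, the same ordering can be restored in Ruby after the records are loaded. The sketch below assumes only that each record responds to <tt>id</tt>; <tt>DemoPost</tt> and <tt>sort_by_ids</tt> are hypothetical names.

```ruby
# Hypothetical record type standing in for an ActiveRecord model.
DemoPost = Struct.new(:id, :body)

# Returns the records reordered to follow the order of ids. Ids with no
# matching record are skipped.
def sort_by_ids(records, ids)
  by_id = records.each_with_object({}) { |record, index| index[record.id] = record }
  ids.map { |id| by_id[id] }.compact
end

# The database returns rows in its own order...
posts = [DemoPost.new(1, 'one'), DemoPost.new(2, 'two'), DemoPost.new(3, 'three')]
# ...while the search engine ranked them differently.
ranked = sort_by_ids(posts, [3, 1, 2])
```

Building the hash first keeps the reordering linear in the number of records, instead of searching the array once per id.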
176
+
177
+ == Support
178
+
179
+ Source code:
180
+ http://github.com/kpumuk/sphinx
181
+
182
+ To suggest a feature or report a bug:
183
+ http://github.com/kpumuk/sphinx/issues
184
+
185
+ Project home page:
34
186
  http://kpumuk.info/projects/ror-plugins/sphinx
35
187
 
36
- ==Credits
188
+ == Credits
37
189
 
38
190
  Dmytro Shteflyuk <kpumuk@kpumuk.info> http://kpumuk.info
39
191
 
40
- Andrew Aksyonoff http://sphinxsearch.com/
192
+ Andrew Aksyonoff http://sphinxsearch.com
41
193
 
42
194
  Special thanks to Alexey Kovyrin <alexey@kovyrin.net> http://blog.kovyrin.net
43
195
 
196
+ Special thanks to Mike Perham http://www.mikeperham.com for his awesome
197
+ memcache-client gem, from which the latest Sphinx gem borrowed its new socket handling.
198
+
44
199
  ==License
45
200
 
46
201
  This library is distributed under the terms of the Ruby license.
47
202
  You can freely distribute/modify this library.
48
-
data/Rakefile CHANGED
@@ -1,6 +1,4 @@
1
1
  require 'rake'
2
- require 'spec/rake/spectask'
3
- require 'rake/rdoctask'
4
2
 
5
3
  begin
6
4
  require 'jeweler'
@@ -17,20 +15,31 @@ rescue LoadError
17
15
  puts 'Jeweler not available. Install it with: sudo gem install jeweler'
18
16
  end
19
17
 
20
- desc 'Default: run specs'
21
- task :default => :spec
18
+ begin
19
+ require 'spec/rake/spectask'
20
+
21
+ desc 'Default: run specs'
22
+ task :default => :spec
22
23
 
23
- desc 'Test the sphinx plugin'
24
- Spec::Rake::SpecTask.new(:spec) do |t|
25
- t.libs << 'lib'
26
- t.pattern = 'spec/*_spec.rb'
24
+ desc 'Test the sphinx plugin'
25
+ Spec::Rake::SpecTask.new do |t|
26
+ t.libs << 'lib'
27
+ t.pattern = 'spec/*_spec.rb'
28
+ end
29
+ rescue LoadError
30
+ puts 'RSpec not available. Install it with: sudo gem install rspec'
27
31
  end
28
32
 
29
- desc 'Generate documentation for the sphinx plugin'
30
- Rake::RDocTask.new(:rdoc) do |rdoc|
31
- rdoc.rdoc_dir = 'rdoc'
32
- rdoc.title = 'Sphinx Client API'
33
- rdoc.options << '--line-numbers' << '--inline-source'
34
- rdoc.rdoc_files.include('README.rdoc')
35
- rdoc.rdoc_files.include('lib/**/*.rb')
33
+ begin
34
+ require 'yard'
35
+ YARD::Rake::YardocTask.new(:yard) do |t|
36
+ t.options = ['--title', 'Sphinx Client API Documentation']
37
+ if ENV['PRIVATE']
38
+ t.options.concat ['--protected', '--private']
39
+ else
40
+ t.options << '--no-private'
41
+ end
42
+ end
43
+ rescue LoadError
44
+ puts 'Yard not available. Install it with: sudo gem install yard'
36
45
  end
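
The Rakefile change above wraps each optional dependency in <tt>begin</tt> / <tt>rescue LoadError</tt>, so a missing gem disables only its own tasks. The same guard can be factored into a small helper, sketched here; <tt>optional_require</tt> is a hypothetical name and is not part of the Rakefile.

```ruby
# Tries to load an optional library. Returns true when it is available and
# prints the hint and returns false otherwise.
def optional_require(library, hint)
  require library
  true
rescue LoadError
  puts "#{library} is not available. #{hint}"
  false
end

# Usage: define tasks only when the library they depend on can be loaded.
if optional_require('yard', 'Install it with: sudo gem install yard')
  # YARD-based tasks would be defined here.
end
```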
data/VERSION.yml CHANGED
@@ -2,4 +2,4 @@
2
2
  :major: 0
3
3
  :minor: 9
4
4
  :patch: 10
5
- :build: 2043
5
+ :build: 2091
data/lib/sphinx.rb CHANGED
@@ -1,9 +1,21 @@
1
- require 'socket'
2
- require 'net/protocol'
3
-
1
+ # Sphinx Client API
2
+ #
3
+ # Author:: Dmytro Shteflyuk <mailto:kpumuk@kpumuk.info>.
4
+ # Copyright:: Copyright (c) 2006 — 2009 Dmytro Shteflyuk
5
+ # License:: Distributed under the same terms as Ruby
6
+ # Version:: 0.9.10-r2091
7
+ # Website:: http://kpumuk.info/projects/ror-plugins/sphinx
8
+ # Sources:: http://github.com/kpumuk/sphinx
9
+ #
10
+ # This library is distributed under the terms of the Ruby license.
11
+ # You can freely distribute/modify this library.
12
+ #
4
13
  module Sphinx
5
14
  end
6
15
 
16
+ require 'socket'
17
+ require 'net/protocol'
18
+
7
19
  require File.dirname(__FILE__) + '/sphinx/request'
8
20
  require File.dirname(__FILE__) + '/sphinx/response'
9
21
  require File.dirname(__FILE__) + '/sphinx/timeout'
@@ -1,3 +1,7 @@
1
+ # A simple wrapper around <tt>Net::BufferedIO</tt> performing
2
+ # non-blocking select.
3
+ #
4
+ # @private
1
5
  class Sphinx::BufferedIO < Net::BufferedIO # :nodoc:
2
6
  BUFSIZE = 1024 * 16
3
7
 
data/lib/sphinx/client.rb CHANGED
@@ -1,120 +1,149 @@
1
- # = client.rb - Sphinx Client API
2
- #
3
- # Author:: Dmytro Shteflyuk <mailto:kpumuk@kpumuk.info>.
4
- # Copyright:: Copyright (c) 2006 — 2009 Dmytro Shteflyuk
5
- # License:: Distributes under the same terms as Ruby
6
- # Version:: 0.9.10-r2043
7
- # Website:: http://kpumuk.info/projects/ror-plugins/sphinx
8
- #
9
- # This library is distributed under the terms of the Ruby license.
10
- # You can freely distribute/modify this library.
11
-
12
- # ==Sphinx Client API
13
- #
14
- # The Sphinx Client API is used to communicate with <tt>searchd</tt>
15
- # daemon and get search results from Sphinx.
16
- #
17
- # ===Usage
18
- #
19
- # sphinx = Sphinx::Client.new
20
- # result = sphinx.Query('test')
21
- # ids = result['matches'].map { |match| match['id'] }.join(',')
22
- # posts = Post.find :all, :conditions => "id IN (#{ids})"
23
- #
24
- # docs = posts.map(&:body)
25
- # excerpts = sphinx.BuildExcerpts(docs, 'index', 'test')
26
-
27
1
  module Sphinx
28
- # :stopdoc:
29
-
2
+ # Base class for all Sphinx errors
30
3
  class SphinxError < StandardError; end
4
+ # Connect error occurred on the API side.
31
5
  class SphinxConnectError < SphinxError; end
6
+ # Request error occurred on the API side.
32
7
  class SphinxResponseError < SphinxError; end
8
+ # Internal error occurred inside searchd.
33
9
  class SphinxInternalError < SphinxError; end
10
+ # Temporary error occurred inside searchd.
34
11
  class SphinxTemporaryError < SphinxError; end
12
+ # Unknown error occurred inside searchd.
35
13
  class SphinxUnknownError < SphinxError; end
36
14
 
37
- # :startdoc:
38
-
15
+ # The Sphinx Client API is used to communicate with <tt>searchd</tt>
16
+ # daemon and perform requests.
17
+ #
18
+ # @example
19
+ # sphinx = Sphinx::Client.new
20
+ # result = sphinx.query('test')
21
+ # ids = result['matches'].map { |match| match['id'] }
22
+ # posts = Post.all :conditions => { :id => ids },
23
+ # :order => "FIELD(id,#{ids.join(',')})"
24
+ #
25
+ # docs = posts.map(&:body)
26
+ # excerpts = sphinx.build_excerpts(docs, 'index', 'test')
27
+ #
39
28
  class Client
40
- # :stopdoc:
41
-
29
+ #=================================================================
42
30
  # Known searchd commands
43
-
31
+ #=================================================================
32
+
44
33
  # search command
34
+ # @private
45
35
  SEARCHD_COMMAND_SEARCH = 0
46
36
  # excerpt command
37
+ # @private
47
38
  SEARCHD_COMMAND_EXCERPT = 1
48
39
  # update command
40
+ # @private
49
41
  SEARCHD_COMMAND_UPDATE = 2
50
42
  # keywords command
43
+ # @private
51
44
  SEARCHD_COMMAND_KEYWORDS = 3
52
45
  # persist command
46
+ # @private
53
47
  SEARCHD_COMMAND_PERSIST = 4
54
48
  # status command
49
+ # @private
55
50
  SEARCHD_COMMAND_STATUS = 5
56
51
  # query command
52
+ # @private
57
53
  SEARCHD_COMMAND_QUERY = 6
58
54
  # flushattrs command
55
+ # @private
59
56
  SEARCHD_COMMAND_FLUSHATTRS = 7
60
-
57
+
58
+ #=================================================================
61
59
  # Current client-side command implementation versions
62
-
60
+ #=================================================================
61
+
63
62
  # search command version
63
+ # @private
64
64
  VER_COMMAND_SEARCH = 0x117
65
65
  # excerpt command version
66
+ # @private
66
67
  VER_COMMAND_EXCERPT = 0x100
67
68
  # update command version
69
+ # @private
68
70
  VER_COMMAND_UPDATE = 0x102
69
71
  # keywords command version
72
+ # @private
70
73
  VER_COMMAND_KEYWORDS = 0x100
71
74
  # persist command version
75
+ # @private
72
76
  VER_COMMAND_PERSIST = 0x000
73
77
  # status command version
78
+ # @private
74
79
  VER_COMMAND_STATUS = 0x100
75
80
  # query command version
81
+ # @private
76
82
  VER_COMMAND_QUERY = 0x100
77
83
  # flushattrs command version
84
+ # @private
78
85
  VER_COMMAND_FLUSHATTRS = 0x100
79
-
86
+
87
+ #=================================================================
80
88
  # Known searchd status codes
81
-
89
+ #=================================================================
90
+
82
91
  # general success, command-specific reply follows
92
+ # @private
83
93
  SEARCHD_OK = 0
84
94
  # general failure, command-specific reply may follow
95
+ # @private
85
96
  SEARCHD_ERROR = 1
86
97
  # temporary failure, client should retry later
98
+ # @private
87
99
  SEARCHD_RETRY = 2
88
- # general success, warning message and command-specific reply follow
89
- SEARCHD_WARNING = 3
100
+ # general success, warning message and command-specific reply follow
101
+ # @private
102
+ SEARCHD_WARNING = 3
90
103
 
104
+ #=================================================================
105
+ # Some internal attributes to use inside client API
106
+ #=================================================================
107
+
108
+ # List of searchd servers to connect to.
109
+ # @private
91
110
  attr_reader :servers
111
+ # Connection timeout in seconds.
112
+ # @private
92
113
  attr_reader :timeout
114
+ # Number of connection retries.
115
+ # @private
93
116
  attr_reader :retries
117
+ # Request timeout in seconds.
118
+ # @private
94
119
  attr_reader :reqtimeout
120
+ # Number of request retries.
121
+ # @private
95
122
  attr_reader :reqretries
96
-
97
- # :startdoc:
98
-
123
+
124
+ #=================================================================
99
125
  # Known match modes
100
-
126
+ #=================================================================
127
+
101
128
  # match all query words
102
- SPH_MATCH_ALL = 0
129
+ SPH_MATCH_ALL = 0
103
130
  # match any query word
104
- SPH_MATCH_ANY = 1
131
+ SPH_MATCH_ANY = 1
105
132
  # match this exact phrase
106
- SPH_MATCH_PHRASE = 2
133
+ SPH_MATCH_PHRASE = 2
107
134
  # match this boolean query
108
- SPH_MATCH_BOOLEAN = 3
135
+ SPH_MATCH_BOOLEAN = 3
109
136
  # match this extended query
110
- SPH_MATCH_EXTENDED = 4
137
+ SPH_MATCH_EXTENDED = 4
111
138
  # match all document IDs w/o fulltext query, apply filters
112
139
  SPH_MATCH_FULLSCAN = 5
113
140
  # extended engine V2 (TEMPORARY, WILL BE REMOVED IN 0.9.8-RELEASE)
114
141
  SPH_MATCH_EXTENDED2 = 6
115
-
142
+
143
+ #=================================================================
116
144
  # Known ranking modes (ext2 only)
117
-
145
+ #=================================================================
146
+
118
147
  # default mode, phrase proximity major factor and BM25 minor one
119
148
  SPH_RANK_PROXIMITY_BM25 = 0
120
149
  # statistical mode, BM25 ranking only (faster but worse quality)
@@ -131,9 +160,11 @@ module Sphinx
131
160
  SPH_RANK_FIELDMASK = 6
132
161
  # codename SPH04, phrase proximity + bm25 + head/exact boost
133
162
  SPH_RANK_SPH04 = 7
134
-
163
+
164
+ #=================================================================
135
165
  # Known sort modes
136
-
166
+ #=================================================================
167
+
137
168
  # sort by document relevance desc, then by date
138
169
  SPH_SORT_RELEVANCE = 0
139
170
  # sort by document date desc, then by relevance desc
@@ -146,23 +177,27 @@ module Sphinx
146
177
  SPH_SORT_EXTENDED = 4
147
178
  # sort by arithmetic expression in descending order (eg. "@id + max(@weight,1000)*boost + log(price)")
148
179
  SPH_SORT_EXPR = 5
149
-
180
+
181
+ #=================================================================
150
182
  # Known filter types
151
-
183
+ #=================================================================
184
+
152
185
  # filter by integer values set
153
186
  SPH_FILTER_VALUES = 0
154
187
  # filter by integer range
155
188
  SPH_FILTER_RANGE = 1
156
189
  # filter by float range
157
190
  SPH_FILTER_FLOATRANGE = 2
158
-
191
+
192
+ #=================================================================
159
193
  # Known attribute types
160
-
194
+ #=================================================================
195
+
161
196
  # this attr is just an integer
162
197
  SPH_ATTR_INTEGER = 1
163
198
  # this attr is a timestamp
164
199
  SPH_ATTR_TIMESTAMP = 2
165
- # this attr is an ordinal string number (integer at search time,
200
+ # this attr is an ordinal string number (integer at search time,
166
201
  # specially handled at indexing time)
167
202
  SPH_ATTR_ORDINAL = 3
168
203
  # this attr is a boolean bit field
@@ -175,23 +210,25 @@ module Sphinx
175
210
  SPH_ATTR_STRING = 7
176
211
  # this attr has multiple values (0 or more)
177
212
  SPH_ATTR_MULTI = 0x40000000
178
-
213
+
214
+ #=================================================================
179
215
  # Known grouping functions
180
-
216
+ #=================================================================
217
+
181
218
  # group by day
182
219
  SPH_GROUPBY_DAY = 0
183
220
  # group by week
184
- SPH_GROUPBY_WEEK = 1
221
+ SPH_GROUPBY_WEEK = 1
185
222
  # group by month
186
- SPH_GROUPBY_MONTH = 2
223
+ SPH_GROUPBY_MONTH = 2
187
224
  # group by year
188
225
  SPH_GROUPBY_YEAR = 3
189
226
  # group by attribute value
190
227
  SPH_GROUPBY_ATTR = 4
191
228
  # group by sequential attrs pair
192
229
  SPH_GROUPBY_ATTRPAIR = 5
193
-
194
- # Constructs the <tt>Sphinx::Client</tt> object and sets options to their default values.
230
+
231
+ # Constructs the <tt>Sphinx::Client</tt> object and sets options to their default values.
195
232
  def initialize
196
233
  # per-query settings
197
234
  @offset = 0 # how many records to seek from result-set start (default is 0)
@@ -214,16 +251,16 @@ module Sphinx
214
251
  @anchor = [] # geographical anchor point
215
252
  @indexweights = [] # per-index weights
216
253
  @ranker = SPH_RANK_PROXIMITY_BM25 # ranking mode (default is SPH_RANK_PROXIMITY_BM25)
217
- @maxquerytime = 0 # max query time, milliseconds (default is 0, do not limit)
254
+ @maxquerytime = 0 # max query time, milliseconds (default is 0, do not limit)
218
255
  @fieldweights = {} # per-field-name weights
219
256
  @overrides = [] # per-query attribute values overrides
220
257
  @select = '*' # select-list (attributes or expressions, with optional aliases)
221
-
258
+
222
259
  # per-reply fields (for single-query case)
223
260
  @error = '' # last error message
224
261
  @warning = '' # last warning message
225
262
  @connerror = false # connection error vs remote error flag
226
-
263
+
227
264
  @reqs = [] # requests storage (for multi-query case)
228
265
  @mbenc = '' # stored mbstring encoding
229
266
  @timeout = 0 # connect timeout
@@ -233,58 +270,104 @@ module Sphinx
233
270
 
234
271
  # per-client-object settings
235
272
  # searchd servers list
236
- @servers = [Sphinx::Server.new(self, 'localhost', 3312, false)].freeze
273
+ @servers = [Sphinx::Server.new(self, 'localhost', 9312, false)].freeze
237
274
  @lastserver = -1
238
275
  end
239
-
276
+
277
+ #=================================================================
278
+ # General API functions
279
+ #=================================================================
280
+
240
281
  # Returns last error message, as a string, in human readable format. If there
241
282
  # were no errors during the previous API call, empty string is returned.
242
283
  #
243
- # You should call it when any other function (such as +Query+) fails (typically,
284
+ # You should call it when any other function (such as {#query}) fails (typically,
244
285
  # the failing function returns false). The returned string will contain the
245
286
  # error description.
246
287
  #
247
288
  # The error message is not reset by this call; so you can safely call it
248
289
  # several times if needed.
249
290
  #
250
- def GetLastError
291
+ # @return [String] last error message.
292
+ #
293
+ # @example
294
+ # puts sphinx.last_error
295
+ #
296
+ # @see #last_warning
297
+ # @see #connect_error?
298
+ #
299
+ def last_error
251
300
  @error
252
301
  end
253
-
302
+ alias :GetLastError :last_error
303
+
254
304
  # Returns last warning message, as a string, in human readable format. If there
255
305
  # were no warnings during the previous API call, empty string is returned.
256
306
  #
257
- # You should call it to verify whether your request (such as +Query+) was
307
+ # You should call it to verify whether your request (such as {#query}) was
258
308
  # completed but with warnings. For instance, search query against a distributed
259
309
  # index might complete successfully even if several remote agents timed out.
260
310
  # In that case, a warning message would be produced.
261
- #
311
+ #
262
312
  # The warning message is not reset by this call; so you can safely call it
263
313
  # several times if needed.
264
314
  #
265
- def GetLastWarning
315
+ # @return [String] last warning message.
316
+ #
317
+ # @example
318
+ # puts sphinx.last_warning
319
+ #
320
+ # @see #last_error
321
+ # @see #connect_error?
322
+ #
323
+ def last_warning
266
324
  @warning
267
325
  end
268
-
326
+ alias :GetLastWarning :last_warning
327
+
269
328
  # Checks whether the last error was a network error on API side, or a
270
329
  # remote error reported by searchd. Returns true if the last connection
271
330
  # attempt to searchd failed on API side, false otherwise (if the error
272
331
  # was remote, or there were no connection attempts at all).
273
332
  #
274
- def IsConnectError
333
+ # @return [Boolean] the value indicating whether last error was a
334
+ # network error on API side.
335
+ #
336
+ # @example
337
+ # puts "Connection failed!" if sphinx.connect_error?
338
+ #
339
+ # @see #last_error
340
+ # @see #last_warning
341
+ #
342
+ def connect_error?
275
343
  @connerror || false
276
344
  end
277
-
345
+ alias :IsConnectError :connect_error?
346
+
278
347
  # Sets searchd host name and TCP port. All subsequent requests will
279
348
  # use the new host and port settings. Default +host+ and +port+ are
280
- # 'localhost' and 3312, respectively.
349
+ # 'localhost' and 9312, respectively.
281
350
  #
282
351
  # Also, you can specify an absolute path to Sphinx's UNIX socket as +host+,
283
352
  # in this case pass port as +0+ or +nil+.
284
353
  #
285
- def SetServer(host, port)
354
+ # @param [String] host the searchd host name or UNIX socket absolute path.
355
+ # @param [Integer] port the searchd port number (could be any if UNIX
356
+ # socket path specified).
357
+ # @return [Sphinx::Client] self.
358
+ #
359
+ # @example
360
+ # sphinx.set_server('localhost', 9312)
361
+ # sphinx.set_server('/opt/sphinx/var/run/sphinx.sock')
362
+ #
363
+ # @raise [ArgumentError] Occurred when parameters are invalid.
364
+ # @see #set_servers
365
+ # @see #set_connect_timeout
366
+ # @see #set_request_timeout
367
+ #
368
+ def set_server(host, port = 9312)
286
369
  raise ArgumentError, '"host" argument must be String' unless host.kind_of?(String)
287
-
370
+
288
371
  path = nil
289
372
  # Check if UNIX socket should be used
290
373
  if host[0] == ?/
@@ -298,25 +381,47 @@ module Sphinx
298
381
  host = port = nil unless path.nil?
299
382
 
300
383
  @servers = [Sphinx::Server.new(self, host, port, path)].freeze
384
+ self
301
385
  end
386
+ alias :SetServer :set_server
302
387
 
303
388
  # Sets the list of searchd servers. Each subsequent request will use next
304
389
  # server in list (round-robin). In case of one server failure, request could
305
- # be retried on another server (see +SetConnectTimeout+ and +SetRequestTimeout+).
306
- #
307
- # Method accepts an +Array+ of +Hash+es, each of them should have :host
308
- # and :port (to connect to searchd through network) or :path (an absolute path
309
- # to UNIX socket) specified.
310
- #
311
- def SetServers(servers)
390
+ # be retried on another server (see {#set_connect_timeout} and
391
+ # {#set_request_timeout}).
392
+ #
393
+ # Method accepts an +Array+ of +Hash+es, each of them should have <tt>:host</tt>
394
+ # and <tt>:port</tt> (to connect to searchd through network) or <tt>:path</tt>
395
+ # (an absolute path to UNIX socket) specified.
396
+ #
397
+ # @param [Array<Hash>] servers an +Array+ of +Hash+ objects with servers parameters.
398
+ # @option servers [String] :host the searchd host name or UNIX socket absolute path.
399
+ # @option servers [String] :path the searchd UNIX socket absolute path.
400
+ # @option servers [Integer] :port (9312) the searchd port number (skipped when UNIX
401
+ # socket path specified)
402
+ # @return [Sphinx::Client] self.
403
+ #
404
+ # @example
405
+ # sphinx.set_servers([
406
+ # { :host => 'browse01.local' }, # default port is 9312
407
+ # { :host => 'browse02.local', :port => 9312 },
408
+ # { :path => '/opt/sphinx/var/run/sphinx.sock' }
409
+ # ])
410
+ #
411
+ # @raise [ArgumentError] Occurred when parameters are invalid.
412
+ # @see #set_server
413
+ # @see #set_connect_timeout
414
+ # @see #set_request_timeout
415
+ #
416
+ def set_servers(servers)
312
417
  raise ArgumentError, '"servers" argument must be Array' unless servers.kind_of?(Array)
313
418
  raise ArgumentError, '"servers" argument must be not empty' if servers.empty?
314
-
419
+
315
420
  @servers = servers.map do |server|
316
421
  raise ArgumentError, '"servers" argument must be Array of Hashes' unless server.kind_of?(Hash)
317
422
 
318
423
  host = server[:path] || server['path'] || server[:host] || server['host']
319
- port = server[:port] || server['port']
424
+ port = server[:port] || server['port'] || 9312
320
425
  path = nil
321
426
  raise ArgumentError, '"host" argument must be String' unless host.kind_of?(String)
322
427
 
@@ -330,11 +435,13 @@ module Sphinx
330
435
  end
331
436
 
332
437
  host = port = nil unless path.nil?
333
-
438
+
334
439
  Sphinx::Server.new(self, host, port, path)
335
440
  end.freeze
441
+ self
336
442
  end
337
-
443
+ alias :SetServers :set_servers
444
+
338
445
  # Sets the time allowed to spend connecting to the server before giving up
339
446
  # and number of retries to perform.
340
447
  #
@@ -342,7 +449,7 @@ module Sphinx
342
449
  # be returned back to the application in order for application-level error
343
450
  # handling to advise the user.
344
451
  #
345
- # When multiple servers configured through +SetServers+ method, and +retries+
452
+ # When multiple servers configured through {#set_servers} method, and +retries+
346
453
  # number is greater than 1, library will try to connect to another server.
347
454
  # In case of single server configured, it will try to reconnect +retries+
348
455
  # times.
@@ -350,15 +457,29 @@ module Sphinx
350
457
  # Please note, this timeout will only be used for connection establishing, not
351
458
  # for regular API requests.
352
459
  #
353
- def SetConnectTimeout(timeout, retries = 1)
460
+ # @param [Integer] timeout a connection timeout in seconds.
461
+ # @param [Integer] retries number of connect retries.
462
+ # @return [Sphinx::Client] self.
463
+ #
464
+ # @example Set connection timeout to 1 second and number of retries to 5
465
+ # sphinx.set_connect_timeout(1, 5)
466
+ #
467
+ # @raise [ArgumentError] Occurred when parameters are invalid.
468
+ # @see #set_server
469
+ # @see #set_servers
470
+ # @see #set_request_timeout
471
+ #
472
+ def set_connect_timeout(timeout, retries = 1)
354
473
  raise ArgumentError, '"timeout" argument must be Integer' unless timeout.respond_to?(:integer?) and timeout.integer?
355
474
  raise ArgumentError, '"retries" argument must be Integer' unless retries.respond_to?(:integer?) and retries.integer?
356
475
  raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0
357
-
476
+
358
477
  @timeout = timeout
359
478
  @retries = retries
479
+ self
360
480
  end
361
-
481
+ alias :SetConnectTimeout :set_connect_timeout
482
+
362
483
  # Sets the time allowed to spend performing request to the server before giving up
363
484
  # and number of retries to perform.
364
485
  #
@@ -366,34 +487,82 @@ module Sphinx
366
487
  # be returned back to the application in order for application-level error
367
488
  # handling to advise the user.
368
489
  #
369
- # When multiple servers configured through +SetServers+ method, and +retries+
490
+ # When multiple servers configured through {#set_servers} method, and +retries+
370
491
  # number is greater than 1, library will try to do another try with this server
371
492
  # (with full reconnect). If connection would fail, behavior depends on
372
- # +SetConnectTimeout+ settings.
493
+ # {#set_connect_timeout} settings.
373
494
  #
374
495
  # Please note, this timeout will only be used for request performing, not
375
496
  # for connection establishing.
376
497
  #
377
- def SetRequestTimeout(timeout, retries = 1)
498
+ # @param [Integer] timeout a request timeout in seconds.
499
+ # @param [Integer] retries number of request retries.
500
+ # @return [Sphinx::Client] self.
501
+ #
502
+ # @example Set request timeout to 1 second and number of retries to 5
503
+ # sphinx.set_request_timeout(1, 5)
504
+ #
505
+ # @raise [ArgumentError] Occurred when parameters are invalid.
506
+ # @see #set_server
507
+ # @see #set_servers
508
+ # @see #set_connect_timeout
509
+ #
510
+ def set_request_timeout(timeout, retries = 1)
378
511
  raise ArgumentError, '"timeout" argument must be Integer' unless timeout.respond_to?(:integer?) and timeout.integer?
379
512
  raise ArgumentError, '"retries" argument must be Integer' unless retries.respond_to?(:integer?) and retries.integer?
380
513
  raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0
381
-
514
+
382
515
  @reqtimeout = timeout
383
516
  @reqretries = retries
517
+ self
518
+ end
519
+ alias :SetRequestTimeout :set_request_timeout
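Because both timeout setters now return `self`, calls can be chained. The sketch below is a minimal illustration of that pattern, not the gem's code: `TimeoutSettings` is a hypothetical stand-in class that copies only the validation and assignment logic visible in this diff, so it runs without a searchd instance.

```ruby
# Hypothetical stand-in for Sphinx::Client; only the two timeout setters
# shown in the diff are reproduced here.
class TimeoutSettings
  attr_reader :timeout, :retries, :reqtimeout, :reqretries

  def set_connect_timeout(timeout, retries = 1)
    validate!(timeout, retries)
    @timeout = timeout
    @retries = retries
    self # returning self is what makes chaining possible
  end

  def set_request_timeout(timeout, retries = 1)
    validate!(timeout, retries)
    @reqtimeout = timeout
    @reqretries = retries
    self
  end

  private

  # Same three checks as in the diff: both arguments must be integers,
  # and retries must be positive.
  def validate!(timeout, retries)
    raise ArgumentError, '"timeout" argument must be Integer' unless timeout.respond_to?(:integer?) && timeout.integer?
    raise ArgumentError, '"retries" argument must be Integer' unless retries.respond_to?(:integer?) && retries.integer?
    raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0
  end
end

settings = TimeoutSettings.new.set_connect_timeout(1, 5).set_request_timeout(2, 3)
```

The old CamelCase names remain available through `alias`, so existing callers that ignore the return value should be unaffected.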
520
+
521
+ # Sets distributed retry count and delay.
522
+ #
523
+ # On temporary failures searchd will attempt up to +count+ retries
524
+ # per agent. +delay+ is the delay between the retries, in milliseconds.
525
+ # Retries are disabled by default. Note that this call will not make
526
+ # the API itself retry on temporary failure; it only tells searchd
527
+ # to do so. Currently, the list of temporary failures includes all
528
+ # kinds of connection failures and maxed out (too busy) remote agents.
529
+ #
530
+ # @param [Integer] count a number of retries to perform.
531
+ # @param [Integer] delay a delay between the retries.
532
+ # @return [Sphinx::Client] self.
533
+ #
534
+ # @example Perform 5 retries with 200 ms between them
535
+ # sphinx.set_retries(5, 200)
536
+ #
537
+ # @raise [ArgumentError] Occurred when parameters are invalid.
538
+ # @see #set_connect_timeout
539
+ # @see #set_request_timeout
540
+ #
541
+ def set_retries(count, delay = 0)
542
+ raise ArgumentError, '"count" argument must be Integer' unless count.respond_to?(:integer?) and count.integer?
543
+ raise ArgumentError, '"delay" argument must be Integer' unless delay.respond_to?(:integer?) and delay.integer?
544
+
545
+ @retrycount = count
546
+ @retrydelay = delay
547
+ self
384
548
  end
385
-
549
+ alias :SetRetries :set_retries
550
+
551
+ #=================================================================
552
+ # General query settings
553
+ #=================================================================
554
+
386
555
  # Sets offset into server-side result set (+offset+) and amount of matches to
387
556
  # return to client starting from that offset (+limit+). Can additionally control
388
557
  # maximum server-side result set size for current query (+max_matches+) and the
389
558
  # threshold amount of matches to stop searching at (+cutoff+). All parameters
390
559
  # must be non-negative integers.
391
560
  #
392
- # First two parameters to +SetLimits+ are identical in behavior to MySQL LIMIT
561
+ # First two parameters to {#set_limits} are identical in behavior to MySQL LIMIT
393
562
  # clause. They instruct searchd to return at most +limit+ matches starting from
394
563
  # match number +offset+. The default offset and limit settings are +0+ and +20+,
395
564
  # that is, to return first +20+ matches.
396
- #
565
+ #
397
566
  # +max_matches+ setting controls how much matches searchd will keep in RAM
398
567
  # while searching. All matching documents will be normally processed, ranked,
399
568
  # filtered, and sorted even if max_matches is set to +1+. But only best +N+
@@ -415,12 +584,23 @@ module Sphinx
415
584
  # searchd to forcibly stop search query once +cutoff+ matches had been found
416
585
  # and processed.
417
586
  #
418
- def SetLimits(offset, limit, max = 0, cutoff = 0)
587
+ # @param [Integer] offset an offset into server-side result set.
588
+ # @param [Integer] limit an amount of matches to return.
589
+ # @param [Integer] max a maximum server-side result set size.
590
+ # @param [Integer] cutoff a threshold amount of matches to stop searching at.
591
+ # @return [Sphinx::Client] self.
592
+ #
593
+ # @example
594
+ # sphinx.set_limits(100, 50, 1000, 5000)
595
+ #
596
+ # @raise [ArgumentError] Occurred when parameters are invalid.
597
+ #
598
+ def set_limits(offset, limit, max = 0, cutoff = 0)
419
599
  raise ArgumentError, '"offset" argument must be Integer' unless offset.respond_to?(:integer?) and offset.integer?
420
600
  raise ArgumentError, '"limit" argument must be Integer' unless limit.respond_to?(:integer?) and limit.integer?
421
601
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
422
602
  raise ArgumentError, '"cutoff" argument must be Integer' unless cutoff.respond_to?(:integer?) and cutoff.integer?
423
-
603
+
424
604
  raise ArgumentError, '"offset" argument should be greater or equal to zero' unless offset >= 0
425
605
  raise ArgumentError, '"limit" argument should be greater to zero' unless limit > 0
426
606
  raise ArgumentError, '"max" argument should be greater or equal to zero' unless max >= 0
@@ -430,35 +610,176 @@ module Sphinx
430
610
  @limit = limit
431
611
  @maxmatches = max if max > 0
432
612
  @cutoff = cutoff if cutoff > 0
613
+ self
433
614
  end
434
-
615
+ alias :SetLimits :set_limits
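Since `offset` and `limit` behave like a MySQL LIMIT clause, page-based navigation reduces to simple arithmetic. The helper below is a hypothetical sketch (the name `limits_for_page` and its defaults are assumptions, not part of the gem) showing how a 1-based page number could be mapped to the pair that `set_limits` expects.

```ruby
# Hypothetical helper, not part of the gem: converts a 1-based page number
# into the [offset, limit] pair expected by set_limits.
def limits_for_page(page, per_page = 20)
  raise ArgumentError, 'page must be a positive Integer' unless page.is_a?(Integer) && page > 0
  raise ArgumentError, 'per_page must be a positive Integer' unless per_page.is_a?(Integer) && per_page > 0

  [(page - 1) * per_page, per_page]
end

offset, limit = limits_for_page(3, 50)
# With a real client this would be passed on as: sphinx.set_limits(offset, limit)
```

The default of 20 per page mirrors the default limit mentioned in the comment above.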
616
+
435
617
  # Sets maximum search query time, in milliseconds. Parameter must be a
436
618
  # non-negative integer. Default value is +0+ which means "do not limit".
437
619
  #
438
- # Similar to +cutoff+ setting from +SetLimits+, but limits elapsed query
620
+ # Similar to +cutoff+ setting from {#set_limits}, but limits elapsed query
439
621
  # time instead of processed matches count. Local search queries will be
440
622
  # stopped once that much time has elapsed. Note that if you're performing
441
623
  # a search which queries several local indexes, this limit applies to each
442
624
  # index separately.
443
625
  #
444
- def SetMaxQueryTime(max)
626
+ # @param [Integer] max maximum search query time in milliseconds.
627
+ # @return [Sphinx::Client] self.
628
+ #
629
+ # @example
630
+ # sphinx.set_max_query_time(200)
631
+ #
632
+ # @raise [ArgumentError] Occurred when parameters are invalid.
633
+ #
634
+ def set_max_query_time(max)
445
635
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
446
636
  raise ArgumentError, '"max" argument should be greater or equal to zero' unless max >= 0
447
637
 
448
638
  @maxquerytime = max
639
+ self
449
640
  end
450
-
641
+ alias :SetMaxQueryTime :set_max_query_time
642
+
643
+ # Sets temporary (per-query) per-document attribute value overrides. Only
644
+ # supports scalar attributes. +values+ must be a +Hash+ that maps document
645
+ # IDs to overridden attribute values.
646
+ #
647
+ # Override feature lets you temporarily update attribute values for some
648
+ # documents within a single query, leaving all other queries unaffected.
649
+ # This might be useful for personalized data. For example, assume you're
650
+ # implementing a personalized search function that wants to boost the posts
651
+ # that the user's friends recommend. Such data is not just dynamic, but
652
+ # also personal; so you can't simply put it in the index because you don't
653
+ # want everyone's searches affected. Overrides, on the other hand, are local
654
+ # to a single query and invisible to everyone else. So you can, say, set up
655
+ # a "friends_weight" value for every document, defaulting to 0, then
656
+ # temporarily override it with 1 for documents 123, 456 and 789 (recommended
657
+ # by exactly the friends of current user), and use that value when ranking.
658
+ #
659
+ # You can specify attribute type as String ("integer", "float", etc),
660
+ # Symbol (:integer, :float, etc), or
661
+ # Fixnum constant (SPH_ATTR_INTEGER, SPH_ATTR_FLOAT, etc).
662
+ #
663
+ # @param [String, Symbol] attribute an attribute name to override values of.
664
+ # @param [Integer, String, Symbol] attrtype attribute type.
665
+ # @param [Hash] values a +Hash+ that maps document IDs to overridden attribute values.
666
+ # @return [Sphinx::Client] self.
667
+ #
668
+ # @example
669
+ # sphinx.set_override(:friends_weight, :integer, {123 => 1, 456 => 1, 789 => 1})
670
+ #
671
+ # @raise [ArgumentError] Occurred when parameters are invalid.
672
+ #
673
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setoverride Section 6.2.3, "SetOverride"
674
+ #
675
+ def set_override(attribute, attrtype, values)
676
+ raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
677
+
678
+ case attrtype
679
+ when String, Symbol
680
+ begin
681
+ attrtype = self.class.const_get("SPH_ATTR_#{attrtype.to_s.upcase}")
682
+ rescue NameError
683
+ raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid"
684
+ end
685
+ when Fixnum
686
+ raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid" unless (SPH_ATTR_INTEGER..SPH_ATTR_BIGINT).include?(attrtype)
687
+ else
688
+ raise ArgumentError, '"attrtype" argument must be Fixnum, String, or Symbol'
689
+ end
690
+
691
+ raise ArgumentError, '"values" argument must be Hash' unless values.kind_of?(Hash)
692
+
693
+ values.each do |id, value|
694
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless id.respond_to?(:integer?) and id.integer?
695
+ case attrtype
696
+ when SPH_ATTR_TIMESTAMP
697
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless (value.respond_to?(:integer?) and value.integer?) or value.kind_of?(Time)
698
+ when SPH_ATTR_FLOAT
699
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Float or Integer' unless value.kind_of?(Float) or (value.respond_to?(:integer?) and value.integer?)
700
+ else
701
+ # SPH_ATTR_INTEGER, SPH_ATTR_ORDINAL, SPH_ATTR_BOOL, SPH_ATTR_BIGINT
702
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Integer' unless value.respond_to?(:integer?) and value.integer?
703
+ end
704
+ end
705
+
706
+ @overrides << { 'attr' => attribute.to_s, 'type' => attrtype, 'values' => values }
707
+ self
708
+ end
709
+ alias :SetOverride :set_override
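The `attrtype` handling above resolves a String or Symbol to a constant by name. The sketch below isolates that lookup so it can be run on its own. `AttrTypes` is a hypothetical module, the two constant values are placeholders for illustration, and `Integer` is used in place of `Fixnum` because `Fixnum` is not available in current Ruby versions.

```ruby
# Hypothetical stand-in module; the constant values are placeholders for
# this sketch and are not claimed to match the gem's.
module AttrTypes
  SPH_ATTR_INTEGER = 1
  SPH_ATTR_FLOAT   = 5

  # Accepts 'integer', :integer, or an Integer and returns the numeric type.
  def self.resolve(attrtype)
    case attrtype
    when String, Symbol
      begin
        const_get("SPH_ATTR_#{attrtype.to_s.upcase}")
      rescue NameError
        raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid"
      end
    when Integer
      attrtype
    else
      raise ArgumentError, '"attrtype" argument must be Integer, String, or Symbol'
    end
  end
end

from_symbol = AttrTypes.resolve(:integer)
from_string = AttrTypes.resolve('float')
```

Unknown names surface as `ArgumentError` rather than `NameError`, which keeps the error type consistent with the other setters.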
710
+
711
+ # Sets the select clause, listing specific attributes to fetch, and
712
+ # expressions to compute and fetch. Clause syntax mimics SQL.
713
+ #
714
+ # {#set_select} is very similar to the part of a typical SQL query between
715
+ # +SELECT+ and +FROM+. It lets you choose what attributes (columns) to
716
+ # fetch, and also what expressions over the columns to compute and fetch.
717
+ # A certain difference from SQL is that expressions must always be aliased
718
+ # to a correct identifier (consisting of letters and digits) using +AS+
719
+ # keyword. SQL also lets you do that but does not require it. Sphinx enforces
720
+ # aliases so that the computation results can always be returned under a
721
+ # "normal" name in the result set, used in other clauses, etc.
722
+ #
723
+ # Everything else is basically identical to SQL. Star ('*') is supported.
724
+ # Functions are supported. Arbitrary amount of expressions is supported.
725
+ # Computed expressions can be used for sorting, filtering, and grouping,
726
+ # just as the regular attributes.
727
+ #
728
+ # Starting with version 0.9.9-rc2, aggregate functions (<tt>AVG()</tt>,
729
+ # <tt>MIN()</tt>, <tt>MAX()</tt>, <tt>SUM()</tt>) are supported when using
730
+ # <tt>GROUP BY</tt>.
731
+ #
732
+ # Expression sorting (Section 4.5, “SPH_SORT_EXPR mode”) and geodistance
733
+ # functions ({#set_geo_anchor}) are now internally implemented
734
+ # using this computed expressions mechanism, using magic names '<tt>@expr</tt>'
735
+ # and '<tt>@geodist</tt>' respectively.
736
+ #
737
+ # @param [String] select a select clause, listing specific attributes to fetch.
738
+ # @return [Sphinx::Client] self.
739
+ #
740
+ # @example
741
+ # sphinx.set_select('*, @weight+(user_karma+ln(pageviews))*0.1 AS myweight')
742
+ # sphinx.set_select("exp_years, salary_gbp*#{gbp_usd_rate} AS salary_usd, IF(age>40,1,0) AS over40")
743
+ # sphinx.set_select('*, AVG(price) AS avgprice')
744
+ #
745
+ # @raise [ArgumentError] Occurred when parameters are invalid.
746
+ #
747
+ # @see http://www.sphinxsearch.com/docs/current.html#sort-expr Section 4.5, "SPH_SORT_EXPR mode"
748
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgeoanchor Section 6.4.5, "SetGeoAnchor"
749
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setselect Section 6.2.4, "SetSelect"
750
+ #
751
+ def set_select(select)
752
+ raise ArgumentError, '"select" argument must be String' unless select.kind_of?(String)
753
+
754
+ @select = select
755
+ self
756
+ end
757
+ alias :SetSelect :set_select
758
+
759
+ #=================================================================
760
+ # Full-text search query settings
761
+ #=================================================================
762
+
451
763
  # Sets full-text query matching mode.
452
764
  #
453
765
  # Parameter must be a +Fixnum+ constant specifying one of the known modes
454
766
  # (+SPH_MATCH_ALL+, +SPH_MATCH_ANY+, etc), +String+ with identifier (<tt>"all"</tt>,
455
767
  # <tt>"any"</tt>, etc), or a +Symbol+ (<tt>:all</tt>, <tt>:any</tt>, etc).
456
768
  #
457
- # Corresponding sections in Sphinx reference manual:
458
- # * {Section 4.1, "Matching modes"}[http://www.sphinxsearch.com/docs/current.html#matching-modes] for details.
459
- # * {Section 6.3.1, "SetMatchMode"}[http://www.sphinxsearch.com/docs/current.html#api-func-setmatchmode] for details.
769
+ # @param [Integer, String, Symbol] mode full-text query matching mode.
770
+ # @return [Sphinx::Client] self.
460
771
  #
461
- def SetMatchMode(mode)
772
+ # @example
773
+ # sphinx.set_match_mode(Sphinx::Client::SPH_MATCH_ALL)
774
+ # sphinx.set_match_mode(:all)
775
+ # sphinx.set_match_mode('all')
776
+ #
777
+ # @raise [ArgumentError] Occurred when parameters are invalid.
778
+ #
779
+ # @see http://www.sphinxsearch.com/docs/current.html#matching-modes Section 4.1, "Matching modes"
780
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setmatchmode Section 6.3.1, "SetMatchMode"
781
+ #
782
+ def set_match_mode(mode)
462
783
  case mode
463
784
  when String, Symbol
464
785
  begin
@@ -473,14 +794,33 @@ module Sphinx
473
794
  end
474
795
 
475
796
  @mode = mode
797
+ self
476
798
  end
477
-
478
- # Set ranking mode.
799
+ alias :SetMatchMode :set_match_mode
800
+
801
+ # Sets ranking mode. Only available in +SPH_MATCH_EXTENDED2+
802
+ # matching mode at the time of this writing. Parameter must be a
803
+ # constant specifying one of the known modes.
479
804
  #
480
805
  # You can specify ranking mode as String ("proximity_bm25", "bm25", etc),
481
806
  # Symbol (:proximity_bm25, :bm25, etc), or
482
807
  # Fixnum constant (SPH_RANK_PROXIMITY_BM25, SPH_RANK_BM25, etc).
483
- def SetRankingMode(ranker)
808
+ #
809
+ # @param [Integer, String, Symbol] ranker ranking mode.
810
+ # @return [Sphinx::Client] self.
811
+ #
812
+ # @example
813
+ # sphinx.set_ranking_mode(Sphinx::Client::SPH_RANK_BM25)
814
+ # sphinx.set_ranking_mode(:bm25)
815
+ # sphinx.set_ranking_mode('bm25')
816
+ #
817
+ # @raise [ArgumentError] Occurred when parameters are invalid.
818
+ #
819
+ # @see http://www.sphinxsearch.com/docs/current.html#matching-modes Section 4.1, "Matching modes"
820
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setmatchmode Section 6.3.1, "SetMatchMode"
821
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setrankingmode Section 6.3.2, "SetRankingMode"
822
+ #
823
+ def set_ranking_mode(ranker)
484
824
  case ranker
485
825
  when String, Symbol
486
826
  begin
@@ -495,14 +835,33 @@ module Sphinx
495
835
  end
496
836
 
497
837
  @ranker = ranker
838
+ self
498
839
  end
499
-
840
+ alias :SetRankingMode :set_ranking_mode
841
+
500
842
  # Set matches sorting mode.
501
843
  #
502
844
  # You can specify sorting mode as String ("relevance", "attr_desc", etc),
503
845
  # Symbol (:relevance, :attr_desc, etc), or
504
846
  # Fixnum constant (SPH_SORT_RELEVANCE, SPH_SORT_ATTR_DESC, etc).
505
- def SetSortMode(mode, sortby = '')
847
+ #
848
+ # @param [Integer, String, Symbol] mode matches sorting mode.
849
+ # @param [String] sortby sorting clause, with the syntax depending on
850
+ # specific mode. Should be specified unless sorting mode is
851
+ # +SPH_SORT_RELEVANCE+.
852
+ # @return [Sphinx::Client] self.
853
+ #
854
+ # @example
855
+ # sphinx.set_sort_mode(Sphinx::Client::SPH_SORT_ATTR_ASC, 'attr')
856
+ # sphinx.set_sort_mode(:attr_asc, 'attr')
857
+ # sphinx.set_sort_mode('attr_asc', 'attr')
858
+ #
859
+ # @raise [ArgumentError] Occurred when parameters are invalid.
860
+ #
861
+ # @see http://www.sphinxsearch.com/docs/current.html#sorting-modes Section 4.5, "Sorting modes"
862
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setsortmode Section 6.3.3, "SetSortMode"
863
+ #
864
+ def set_sort_mode(mode, sortby = '')
506
865
  case mode
507
866
  when String, Symbol
508
867
  begin
@@ -521,27 +880,69 @@ module Sphinx
521
880
 
522
881
  @sort = mode
523
882
  @sortby = sortby
883
+ self
524
884
  end
525
-
526
- # Bind per-field weights by order.
885
+ alias :SetSortMode :set_sort_mode
886
+
887
+ # Binds per-field weights in the order of appearance in the index.
888
+ #
889
+ # @param [Array<Integer>] weights an +Array+ of integer per-field weights.
890
+ # @return [Sphinx::Client] self.
527
891
  #
528
- # DEPRECATED; use SetFieldWeights() instead.
529
- def SetWeights(weights)
892
+ # @example
893
+ # sphinx.set_weights([1, 3, 5])
894
+ #
895
+ # @raise [ArgumentError] Occurred when parameters are invalid.
896
+ #
897
+ # @deprecated Use {#set_field_weights} instead.
898
+ # @see #set_field_weights
899
+ #
900
+ def set_weights(weights)
530
901
  raise ArgumentError, '"weights" argument must be Array' unless weights.kind_of?(Array)
531
902
  weights.each do |weight|
532
903
  raise ArgumentError, '"weights" argument must be Array of integers' unless weight.respond_to?(:integer?) and weight.integer?
533
904
  end
534
905
 
535
906
  @weights = weights
907
+ self
536
908
  end
909
+ alias :SetWeights :set_weights
537
910
 
538
- # Bind per-field weights by name.
911
+ # Binds per-field weights by name. Parameter must be a +Hash+
912
+ # mapping string field names to integer weights.
913
+ #
914
+ # Match ranking can be affected by per-field weights. For instance,
915
+ # see Section 4.4, "Weighting" for an explanation of how phrase
916
+ # proximity ranking is affected. This call lets you specify what
917
+ # non-default weights to assign to different full-text fields.
918
+ #
919
+ # The weights must be positive 32-bit integers. The final weight
920
+ # will be a 32-bit integer too. Default weight value is 1. Unknown
921
+ # field names will be silently ignored.
922
+ #
923
+ # There is no enforced limit on the maximum weight value at the
924
+ # moment. However, beware that if you set it too high you can
925
+ # start hitting 32-bit wraparound issues. For instance, if
926
+ # you set a weight of 10,000,000 and search in extended mode,
927
+ # then maximum possible weight will be equal to 10 million (your
928
+ # weight) by 1 thousand (internal BM25 scaling factor, see
929
+ # Section 4.4, “Weighting”) by 1 or more (phrase proximity rank).
930
+ # The result is at least 10 billion that does not fit in 32 bits
931
+ # and will be wrapped around, producing unexpected results.
932
+ #
933
+ # @param [Hash] weights a +Hash+ mapping string field names to
934
+ # integer weights.
935
+ # @return [Sphinx::Client] self.
936
+ #
937
+ # @example
938
+ # sphinx.set_field_weights(:title => 20, :text => 10)
539
939
  #
540
- # Takes string (field name) to integer (field weight) hash as an argument.
541
- # * Takes precedence over SetWeights().
542
- # * Unknown names will be silently ignored.
543
- # * Unbound fields will be silently given a weight of 1.
544
- def SetFieldWeights(weights)
940
+ # @raise [ArgumentError] Occurred when parameters are invalid.
941
+ #
942
+ # @see http://www.sphinxsearch.com/docs/current.html#weighting Section 4.4, "Weighting"
943
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfieldweights Section 6.3.5, "SetFieldWeights"
944
+ #
945
+ def set_field_weights(weights)
545
946
  raise ArgumentError, '"weights" argument must be Hash' unless weights.kind_of?(Hash)
546
947
  weights.each do |name, weight|
547
948
  unless (name.kind_of?(String) or name.kind_of?(Symbol)) and (weight.respond_to?(:integer?) and weight.integer?)
@@ -550,37 +951,119 @@ module Sphinx
550
951
  end
551
952
 
552
953
  @fieldweights = weights
954
+ self
553
955
  end
554
-
555
- # Bind per-index weights by name.
556
- def SetIndexWeights(weights)
956
+ alias :SetFieldWeights :set_field_weights
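The wraparound warning in the comment above can be checked with a few lines of arithmetic. This is only a worked example of the numbers quoted there. The modulo step assumes an unsigned 32-bit accumulator, which the source does not state.

```ruby
field_weight = 10_000_000 # the weight used in the comment's example
bm25_scale   = 1_000      # internal BM25 scaling factor, per the comment
proximity    = 1          # smallest possible phrase proximity rank

max_weight = field_weight * bm25_scale * proximity
uint32_max = 2**32 - 1

# Assumption: the value wraps as an unsigned 32-bit integer.
wrapped = max_weight % (2**32)
```

Ruby integers are arbitrary precision, so the overflow has to be simulated here. The point is that `max_weight` exceeds `uint32_max`, so the value searchd ends up with would differ from the intended one.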
957
+
958
+ # Sets per-index weights, and enables weighted summing of match
959
+ # weights across different indexes. Parameter must be a hash
960
+ # (associative array) mapping string index names to integer
961
+ # weights. The default is an empty hash, which disables weighted
962
+ # summing.
963
+ #
964
+ # When a match with the same document ID is found in several
965
+ # different local indexes, by default Sphinx simply chooses the
966
+ # match from the index specified last in the query. This is to
967
+ # support searching through partially overlapping index partitions.
968
+ #
969
+ # However in some cases the indexes are not just partitions,
970
+ # and you might want to sum the weights across the indexes
971
+ # instead of picking one. {#set_index_weights} lets you do that.
972
+ # With summing enabled, final match weight in result set will be
973
+ # computed as a sum of match weight coming from the given index
974
+ # multiplied by respective per-index weight specified in this
975
+ # call. I.e., if document 123 is found in index A with the
976
+ # weight of 2, and also in index B with the weight of 3, and
977
+ # you called {#set_index_weights} with <tt>{"A"=>100, "B"=>10}</tt>,
978
+ # the final weight returned to the client will be 2*100+3*10 = 230.
979
+ #
980
+ # @param [Hash] weights a +Hash+ mapping string index names to
981
+ # integer weights.
982
+ # @return [Sphinx::Client] self.
983
+ #
984
+ # @example
985
+ # sphinx.set_index_weights(:fresh => 20, :archived => 10)
986
+ #
987
+ # @raise [ArgumentError] Occurred when parameters are invalid.
988
+ #
989
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setindexweights Section 6.3.6, "SetIndexWeights"
990
+ #
991
+ def set_index_weights(weights)
557
992
  raise ArgumentError, '"weights" argument must be Hash' unless weights.kind_of?(Hash)
558
993
  weights.each do |index, weight|
559
994
  unless (index.kind_of?(String) or index.kind_of?(Symbol)) and (weight.respond_to?(:integer?) and weight.integer?)
560
995
  raise ArgumentError, '"weights" argument must be Hash map of strings to integers'
561
996
  end
562
997
  end
563
-
998
+
564
999
  @indexweights = weights
1000
+ self
565
1001
  end
1002
+ alias :SetIndexWeights :set_index_weights
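The weighted-sum rule described for `set_index_weights` can be reproduced directly. The sketch below only recomputes the example from the comment (document 123, indexes A and B). The summing itself happens inside searchd, and the fallback weight of 1 for an index without an explicit weight is an assumption made for this sketch.

```ruby
index_weights = { 'A' => 100, 'B' => 10 } # as passed to set_index_weights
match_weights = { 'A' => 2, 'B' => 3 }    # weight of document 123 in each index

# Sum of (match weight * per-index weight) over all indexes that matched.
final_weight = match_weights.sum do |index, weight|
  weight * index_weights.fetch(index, 1)
end
```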
566
1003
 
567
- # Set IDs range to match.
568
- #
569
- # Only match records if document ID is beetwen <tt>min_id</tt> and <tt>max_id</tt> (inclusive).
570
- def SetIDRange(min, max)
1004
+ #=================================================================
1005
+ # Result set filtering settings
1006
+ #=================================================================
1007
+
1008
+ # Sets an accepted range of document IDs. Parameters must be integers.
1009
+ # Defaults are 0 and 0; that combination means to not limit by range.
1010
+ #
1011
+ # After this call, only those records that have document ID between
1012
+ # +min+ and +max+ (including IDs exactly equal to +min+ or +max+)
1013
+ # will be matched.
1014
+ #
1015
+ # @param [Integer] min min document ID.
1016
+ # @param [Integer] max max document ID.
1017
+ # @return [Sphinx::Client] self.
1018
+ #
1019
+ # @example
1020
+ # sphinx.set_id_range(10, 1000)
1021
+ #
1022
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1023
+ #
1024
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setidrange Section 6.4.1, "SetIDRange"
1025
+ #
1026
+ def set_id_range(min, max)
571
1027
  raise ArgumentError, '"min" argument must be Integer' unless min.respond_to?(:integer?) and min.integer?
572
1028
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
573
1029
  raise ArgumentError, '"max" argument greater or equal to "min"' unless min <= max
574
1030
 
575
1031
  @min_id = min
576
1032
  @max_id = max
1033
+ self
577
1034
  end
578
-
579
- # Set values filter.
580
- #
581
- # Only match those records where <tt>attribute</tt> column values
582
- # are in specified set.
583
- def SetFilter(attribute, values, exclude = false)
1035
+ alias :SetIDRange :set_id_range
1036
+
1037
+ # Adds new integer values set filter.
1038
+ #
1039
+ # On this call, additional new filter is added to the existing
1040
+ # list of filters. +attribute+ must be a string with attribute
1041
+ # name. +values+ must be a plain array containing integer
1042
+ # values. +exclude+ must be a boolean value; it controls
1043
+ # whether to accept the matching documents (default mode, when
1044
+ # +exclude+ is +false+) or reject them.
1045
+ #
1046
+ # Only those documents where +attribute+ column value stored in
1047
+ # the index matches any of the values from +values+ array will
1048
+ # be matched (or rejected, if +exclude+ is +true+).
1049
+ #
1050
+ # @param [String, Symbol] attribute an attribute name to filter by.
1051
+ # @param [Array<Integer>] values an +Array+ of integers with given attribute values.
1052
+ # @param [Boolean] exclude indicating whether documents with given attribute
1053
+ # matching specified values should be excluded from search results.
1054
+ # @return [Sphinx::Client] self.
1055
+ #
1056
+ # @example
1057
+ # sphinx.set_filter(:group_id, [10, 15, 20])
1058
+ # sphinx.set_filter(:group_id, [10, 15, 20], true)
1059
+ #
1060
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1061
+ #
1062
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfilter Section 6.4.2, "SetFilter"
1063
+ # @see #set_filter_range
1064
+ # @see #set_filter_float_range
1065
+ #
1066
+ def set_filter(attribute, values, exclude = false)
584
1067
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
585
1068
  raise ArgumentError, '"values" argument must be Array' unless values.kind_of?(Array)
586
1069
  raise ArgumentError, '"values" argument must not be empty' if values.empty?
@@ -589,94 +1072,194 @@ module Sphinx
589
1072
  values.each do |value|
590
1073
  raise ArgumentError, '"values" argument must be Array of Integer' unless value.respond_to?(:integer?) and value.integer?
591
1074
  end
592
-
1075
+
593
1076
  @filters << { 'type' => SPH_FILTER_VALUES, 'attr' => attribute.to_s, 'exclude' => exclude, 'values' => values }
1077
+ self
594
1078
  end
595
-
596
- # Set range filter.
597
- #
598
- # Only match those records where <tt>attribute</tt> column value
599
- # is beetwen <tt>min</tt> and <tt>max</tt> (including <tt>min</tt> and <tt>max</tt>).
600
- def SetFilterRange(attribute, min, max, exclude = false)
1079
+ alias :SetFilter :set_filter
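The accept/reject behavior controlled by `exclude` can be pictured as a simple predicate. The function below is a hypothetical client-side model of the rule stated in the comment above. The real filtering is performed by searchd, and `passes_values_filter?` is not a method of the gem.

```ruby
# Hypothetical model of a values filter: returns true when a document with
# the given attribute value would remain in the result set.
def passes_values_filter?(attr_value, values, exclude = false)
  matched = values.include?(attr_value)
  exclude ? !matched : matched
end

kept_by_include = passes_values_filter?(15, [10, 15, 20])
kept_by_exclude = passes_values_filter?(15, [10, 15, 20], true)
```

Each `set_filter` call appends to the existing list of filters rather than replacing it, as the `@filters <<` line in the diff shows.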
1080
+
1081
+ # Adds new integer range filter.
1082
+ #
1083
+ # On this call, additional new filter is added to the existing
1084
+ # list of filters. +attribute+ must be a string with attribute
1085
+ # name. +min+ and +max+ must be integers that define the acceptable
1086
+ # attribute values range (including the boundaries). +exclude+
1087
+ # must be a boolean value; it controls whether to accept the
1088
+ # matching documents (default mode, when +exclude+ is false) or
1089
+ # reject them.
1090
+ #
1091
+ # Only those documents where +attribute+ column value stored
1092
+ # in the index is between +min+ and +max+ (including values
1093
+ # that are exactly equal to +min+ or +max+) will be matched
1094
+ # (or rejected, if +exclude+ is true).
1095
+ #
1096
+ # @param [String, Symbol] attribute an attribute name to filter by.
1097
+ # @param [Integer] min min value of the given attribute.
1098
+ # @param [Integer] max max value of the given attribute.
1099
+ # @param [Boolean] exclude indicating whether documents with given attribute
1100
+ # matching specified boundaries should be excluded from search results.
1101
+ # @return [Sphinx::Client] self.
1102
+ #
1103
+ # @example
1104
+ # sphinx.set_filter_range(:group_id, 10, 20)
1105
+ # sphinx.set_filter_range(:group_id, 10, 20, true)
1106
+ #
1107
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1108
+ #
1109
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfilterrange Section 6.4.3, "SetFilterRange"
1110
+ # @see #set_filter
1111
+ # @see #set_filter_float_range
1112
+ #
1113
+ def set_filter_range(attribute, min, max, exclude = false)
601
1114
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
602
1115
  raise ArgumentError, '"min" argument must be Integer' unless min.respond_to?(:integer?) and min.integer?
603
1116
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
604
1117
  raise ArgumentError, '"max" argument greater or equal to "min"' unless min <= max
605
1118
  raise ArgumentError, '"exclude" argument must be Boolean' unless exclude.kind_of?(TrueClass) or exclude.kind_of?(FalseClass)
606
-
1119
+
607
1120
  @filters << { 'type' => SPH_FILTER_RANGE, 'attr' => attribute.to_s, 'exclude' => exclude, 'min' => min, 'max' => max }
1121
+ self
608
1122
  end
609
-
610
- # Set float range filter.
611
- #
612
- # Only match those records where <tt>attribute</tt> column value
613
- # is beetwen <tt>min</tt> and <tt>max</tt> (including <tt>min</tt> and <tt>max</tt>).
614
- def SetFilterFloatRange(attribute, min, max, exclude = false)
1123
+ alias :SetFilterRange :set_filter_range
1124
+
1125
+ # Adds new float range filter.
1126
+ #
1127
+ # On this call, additional new filter is added to the existing
1128
+ # list of filters. +attribute+ must be a string with attribute name.
1129
+ # +min+ and +max+ must be floats that define the acceptable
1130
+ # attribute values range (including the boundaries). +exclude+ must
1131
+ # be a boolean value; it controls whether to accept the matching
1132
+ # documents (default mode, when +exclude+ is false) or reject them.
1133
+ #
1134
+ # Only those documents where +attribute+ column value stored in
1135
+ # the index is between +min+ and +max+ (including values that are
1136
+ # exactly equal to +min+ or +max+) will be matched (or rejected,
1137
+ # if +exclude+ is true).
1138
+ #
1139
+ # @param [String, Symbol] attribute an attribute name to filter by.
1140
+ # @param [Integer, Float] min min value of the given attribute.
1141
+ # @param [Integer, Float] max max value of the given attribute.
1142
+ # @param [Boolean] exclude indicating whether documents with given attribute
1143
+ # matching specified boundaries should be excluded from search results.
1144
+ # @return [Sphinx::Client] self.
1145
+ #
1146
+ # @example
1147
+ # sphinx.set_filter_float_range(:group_id, 10.5, 20)
1148
+ # sphinx.set_filter_float_range(:group_id, 10.5, 20, true)
1149
+ #
1150
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1151
+ #
1152
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfilterfloatrange Section 6.4.4, "SetFilterFloatRange"
1153
+ # @see #set_filter
1154
+ # @see #set_filter_range
1155
+ #
1156
+ def set_filter_float_range(attribute, min, max, exclude = false)
615
1157
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
616
1158
  raise ArgumentError, '"min" argument must be Float or Integer' unless min.kind_of?(Float) or (min.respond_to?(:integer?) and min.integer?)
617
1159
  raise ArgumentError, '"max" argument must be Float or Integer' unless max.kind_of?(Float) or (max.respond_to?(:integer?) and max.integer?)
618
1160
  raise ArgumentError, '"max" argument greater or equal to "min"' unless min <= max
619
1161
  raise ArgumentError, '"exclude" argument must be Boolean' unless exclude.kind_of?(TrueClass) or exclude.kind_of?(FalseClass)
620
-
1162
+
621
1163
  @filters << { 'type' => SPH_FILTER_FLOATRANGE, 'attr' => attribute.to_s, 'exclude' => exclude, 'min' => min.to_f, 'max' => max.to_f }
1164
+ self
622
1165
  end
623
-
624
- # Setup anchor point for geosphere distance calculations.
625
- #
626
- # Required to use <tt>@geodist</tt> in filters and sorting
627
- # distance will be computed to this point. Latitude and longitude
628
- # must be in radians.
629
- #
630
- # * <tt>attrlat</tt> -- is the name of latitude attribute
631
- # * <tt>attrlong</tt> -- is the name of longitude attribute
632
- # * <tt>lat</tt> -- is anchor point latitude, in radians
633
- # * <tt>long</tt> -- is anchor point longitude, in radians
634
- def SetGeoAnchor(attrlat, attrlong, lat, long)
1166
+ alias :SetFilterFloatRange :set_filter_float_range
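The inclusive-boundary and +exclude+ semantics documented for the float range filter can be sketched in plain Ruby. This is only an illustration of the described behavior for a single attribute value; `float_range_match?` is a hypothetical helper and not part of the client, and the actual filtering is done by searchd.

```ruby
# Hypothetical helper, not part of the Sphinx client: mirrors the documented
# semantics of a float range filter for one attribute value.
def float_range_match?(value, min, max, exclude = false)
  raise ArgumentError, '"min" must not exceed "max"' unless min <= max

  # Both boundaries are inclusive, as stated in the method documentation.
  in_range = value >= min.to_f && value <= max.to_f

  # With exclude set, documents inside the range are rejected instead.
  exclude ? !in_range : in_range
end

float_range_match?(10.5, 10.5, 20)        # value equal to the lower boundary
float_range_match?(15.0, 10.5, 20, true)  # same range, exclusion mode
```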
1167
+
1168
+ # Sets anchor point for geosphere distance (geodistance)
1169
+ # calculations, and enables them.
1170
+ #
1171
+ # +attrlat+ and +attrlong+ must be strings that contain the names
1172
+ # of latitude and longitude attributes, respectively. +lat+ and
1173
+ # +long+ are floats that specify anchor point latitude and
1174
+ # longitude, in radians.
1175
+ #
1176
+ # Once an anchor point is set, you can use magic <tt>"@geodist"</tt>
1177
+ # attribute name in your filters and/or sorting expressions.
1178
+ # Sphinx will compute geosphere distance between the given anchor
1179
+ # point and a point specified by latitude and longitude attributes
1180
+ # from each full-text match, and attach this value to the resulting
1181
+ # match. The latitude and longitude values both in {#set_geo_anchor}
1182
+ # and the index attribute data are expected to be in radians.
1183
+ # The result will be returned in meters, so geodistance value of
1184
+ # 1000.0 means 1 km. 1 mile is approximately 1609.344 meters.
1185
+ #
1186
+ # @param [String, Symbol] attrlat a name of latitude attribute.
1187
+ # @param [String, Symbol] attrlong a name of longitude attribute.
1188
+ # @param [Integer, Float] lat an anchor point latitude, in radians.
1189
+ # @param [Integer, Float] long an anchor point longitude, in radians.
1190
+ # @return [Sphinx::Client] self.
1191
+ #
1192
+ # @example
1193
+ # sphinx.set_geo_anchor(:latitude, :longitude, 192.5, 143.5)
1194
+ #
1195
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1196
+ #
1197
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgeoanchor Section 6.4.5, "SetGeoAnchor"
1198
+ #
1199
+ def set_geo_anchor(attrlat, attrlong, lat, long)
635
1200
  raise ArgumentError, '"attrlat" argument must be String or Symbol' unless attrlat.kind_of?(String) or attrlat.kind_of?(Symbol)
636
1201
  raise ArgumentError, '"attrlong" argument must be String or Symbol' unless attrlong.kind_of?(String) or attrlong.kind_of?(Symbol)
637
1202
  raise ArgumentError, '"lat" argument must be Float or Integer' unless lat.kind_of?(Float) or (lat.respond_to?(:integer?) and lat.integer?)
638
1203
  raise ArgumentError, '"long" argument must be Float or Integer' unless long.kind_of?(Float) or (long.respond_to?(:integer?) and long.integer?)
639
1204
 
640
1205
  @anchor = { 'attrlat' => attrlat.to_s, 'attrlong' => attrlong.to_s, 'lat' => lat.to_f, 'long' => long.to_f }
1206
+ self
641
1207
  end
1208
+ alias :SetGeoAnchor :set_geo_anchor
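Because the anchor point and the indexed attributes are expected in radians, and the computed distance comes back in meters, callers working with degrees and miles need two small conversions. A minimal sketch, assuming the coordinates start out in degrees; `deg_to_rad`, `meters_to_miles`, and the sample coordinates are hypothetical and not provided by the client.

```ruby
# Hypothetical helpers for illustration; only Math::PI comes from Ruby itself.
def deg_to_rad(degrees)
  degrees * Math::PI / 180.0
end

# Conversion factor quoted in the method documentation.
METERS_PER_MILE = 1609.344

def meters_to_miles(meters)
  meters / METERS_PER_MILE
end

# Made-up anchor point given in degrees, converted to radians.
lat  = deg_to_rad(40.7128)
long = deg_to_rad(-74.0060)
# These values would then be passed as the +lat+ and +long+ arguments.
```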
642
1209
 
643
- # Set grouping attribute and function.
1210
+ #=================================================================
1211
+ # GROUP BY settings
1212
+ #=================================================================
1213
+
1214
+ # Sets grouping attribute, function, and groups sorting mode; and
1215
+ # enables grouping (as described in Section 4.6, "Grouping (clustering) search results").
644
1216
  #
645
- # In grouping mode, all matches are assigned to different groups
646
- # based on grouping function value.
1217
+ # +attribute+ is a string that contains group-by attribute name.
1218
+ # +func+ is a constant that chooses a function applied to the
1219
+ # attribute value in order to compute group-by key. +groupsort+
1220
+ # is a clause that controls how the groups will be sorted. Its
1221
+ # syntax is similar to that described in Section 4.5,
1222
+ # "SPH_SORT_EXTENDED mode".
647
1223
  #
648
- # Each group keeps track of the total match count, and the best match
649
- # (in this group) according to current sorting function.
1224
+ # Grouping feature is very similar in nature to <tt>GROUP BY</tt> clause
1225
+ # from SQL. Results produced by this function call are going to
1226
+ # be the same as produced by the following pseudo code:
650
1227
  #
651
- # The final result set contains one best match per group, with
652
- # grouping function value and matches count attached.
1228
+ # SELECT ... GROUP BY func(attribute) ORDER BY groupsort
653
1229
  #
654
- # Groups in result set could be sorted by any sorting clause,
655
- # including both document attributes and the following special
656
- # internal Sphinx attributes:
1230
+ # Note that it's +groupsort+ that affects the order of matches in
1231
+ # the final result set. Sorting mode (see {#set_sort_mode}) affects
1232
+ # the ordering of matches within group, ie. what match will be
1233
+ # selected as the best one from the group. So you can for instance
1234
+ # order the groups by matches count and select the most relevant
1235
+ # match within each group at the same time.
657
1236
  #
658
- # * @id - match document ID;
659
- # * @weight, @rank, @relevance - match weight;
660
- # * @group - groupby function value;
661
- # * @count - amount of matches in group.
1237
+ # Starting with version 0.9.9-rc2, aggregate functions (<tt>AVG()</tt>,
1238
+ # <tt>MIN()</tt>, <tt>MAX()</tt>, <tt>SUM()</tt>) are supported
1239
+ # through {#set_select} API call when using <tt>GROUP BY</tt>.
1240
+ #
1241
+ # You can specify group function and attribute as String
1242
+ # ("attr", "day", etc), Symbol (:attr, :day, etc), or
1243
+ # Fixnum constant (SPH_GROUPBY_ATTR, SPH_GROUPBY_DAY, etc).
1244
+ #
1245
+ # @param [String, Symbol] attribute an attribute name to group by.
1246
+ # @param [Integer, String, Symbol] func a grouping function.
1247
+ # @param [String] groupsort a groups sorting mode.
1248
+ # @return [Sphinx::Client] self.
662
1249
  #
663
- # the default mode is to sort by groupby value in descending order,
664
- # ie. by '@group desc'.
1250
+ # @example
1251
+ # sphinx.set_group_by(:tag_id, :attr)
665
1252
  #
666
- # 'total_found' would contain total amount of matching groups over
667
- # the whole index.
1253
+ # @raise [ArgumentError] Occurred when parameters are invalid.
668
1254
  #
669
- # WARNING: grouping is done in fixed memory and thus its results
670
- # are only approximate; so there might be more groups reported
671
- # in total_found than actually present. @count might also
672
- # be underestimated.
1255
+ # @see http://www.sphinxsearch.com/docs/current.html#clustering Section 4.6, "Grouping (clustering) search results"
1256
+ # @see http://www.sphinxsearch.com/docs/current.html#sort-extended Section 4.5, "SPH_SORT_EXTENDED mode"
1257
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgroupby Section 6.5.1, "SetGroupBy"
1258
+ # @see #set_sort_mode
1259
+ # @see #set_select
1260
+ # @see #set_group_distinct
673
1261
  #
674
- # For example, if sorting by relevance and grouping by "published"
675
- # attribute with SPH_GROUPBY_DAY function, then the result set will
676
- # contain one most relevant match per each day when there were any
677
- # matches published, with day number and per-day match count attached,
678
- # and sorted by day number in descending order (ie. recent days first).
679
- def SetGroupBy(attribute, func, groupsort = '@group desc')
1262
+ def set_group_by(attribute, func, groupsort = '@group desc')
680
1263
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
681
1264
  raise ArgumentError, '"groupsort" argument must be String' unless groupsort.kind_of?(String)
682
1265
 
@@ -696,217 +1279,311 @@ module Sphinx
696
1279
  @groupby = attribute.to_s
697
1280
  @groupfunc = func
698
1281
  @groupsort = groupsort
1282
+ self
699
1283
  end
700
-
701
- # Set count-distinct attribute for group-by queries.
702
- def SetGroupDistinct(attribute)
703
- raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
1284
+ alias :SetGroupBy :set_group_by
704
1285
 
705
- @groupdistinct = attribute.to_s
706
- end
707
-
708
- # Sets distributed retry count and delay.
1286
+ # Sets attribute name for per-group distinct values count
1287
+ # calculations. Only available for grouping queries.
709
1288
  #
710
- # On temporary failures searchd will attempt up to +count+ retries per
711
- # agent. +delay+ is the delay between the retries, in milliseconds. Retries
712
- # are disabled by default. Note that this call will not make the API itself
713
- # retry on temporary failure; it only tells searchd to do so. Currently,
714
- # the list of temporary failures includes all kinds of +connect+
715
- # failures and maxed out (too busy) remote agents.
1289
+ # +attribute+ is a string that contains the attribute name. For
1290
+ # each group, all values of this attribute will be stored (as
1291
+ # RAM limits permit), then the amount of distinct values will
1292
+ # be calculated and returned to the client. This feature is
1293
+ # similar to <tt>COUNT(DISTINCT)</tt> clause in standard SQL;
1294
+ # so these Sphinx calls:
716
1295
  #
717
- def SetRetries(count, delay = 0)
718
- raise ArgumentError, '"count" argument must be Integer' unless count.respond_to?(:integer?) and count.integer?
719
- raise ArgumentError, '"delay" argument must be Integer' unless delay.respond_to?(:integer?) and delay.integer?
720
-
721
- @retrycount = count
722
- @retrydelay = delay
723
- end
724
-
725
- # Sets temporary (per-query) per-document attribute value overrides. Only
726
- # supports scalar attributes. +values+ must be a +Hash+ that maps document
727
- # IDs to overridden attribute values.
1296
+ # sphinx.set_group_by(:category, :attr, '@count desc')
1297
+ # sphinx.set_group_distinct(:vendor)
728
1298
  #
729
- # Override feature lets you "temporary" update attribute values for some
730
- # documents within a single query, leaving all other queries unaffected.
731
- # This might be useful for personalized data. For example, assume you're
732
- # implementing a personalized search function that wants to boost the posts
733
- # that the user's friends recommend. Such data is not just dynamic, but
734
- # also personal; so you can't simply put it in the index because you don't
735
- # want everyone's searches affected. Overrides, on the other hand, are local
736
- # to a single query and invisible to everyone else. So you can, say, setup
737
- # a "friends_weight" value for every document, defaulting to 0, then
738
- # temporary override it with 1 for documents 123, 456 and 789 (recommended
739
- # by exactly the friends of current user), and use that value when ranking.
1299
+ # can be expressed using the following SQL clauses:
740
1300
  #
741
- def SetOverride(attrname, attrtype, values)
742
- raise ArgumentError, '"attrname" argument must be String or Symbol' unless attrname.kind_of?(String) or attrname.kind_of?(Symbol)
743
-
744
- case attrtype
745
- when String, Symbol
746
- begin
747
- attrtype = self.class.const_get("SPH_ATTR_#{attrtype.to_s.upcase}")
748
- rescue NameError
749
- raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid"
750
- end
751
- when Fixnum
752
- raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid" unless (SPH_ATTR_INTEGER..SPH_ATTR_BIGINT).include?(attrtype)
753
- else
754
- raise ArgumentError, '"attrtype" argument must be Fixnum, String, or Symbol'
755
- end
756
-
757
- raise ArgumentError, '"values" argument must be Hash' unless values.kind_of?(Hash)
758
-
759
- values.each do |id, value|
760
- raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless id.respond_to?(:integer?) and id.integer?
761
- case attrtype
762
- when SPH_ATTR_TIMESTAMP
763
- raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless (value.respond_to?(:integer?) and value.integer?) or value.kind_of?(Time)
764
- when SPH_ATTR_FLOAT
765
- raise ArgumentError, '"values" argument must be Hash map of Integer to Float or Integer' unless value.kind_of?(Float) or (value.respond_to?(:integer?) and value.integer?)
766
- else
767
- # SPH_ATTR_INTEGER, SPH_ATTR_ORDINAL, SPH_ATTR_BOOL, SPH_ATTR_BIGINT
768
- raise ArgumentError, '"values" argument must be Hash map of Integer to Integer' unless value.respond_to?(:integer?) and value.integer?
769
- end
770
- end
771
-
772
- @overrides << { 'attr' => attrname.to_s, 'type' => attrtype, 'values' => values }
773
- end
774
-
775
- # Sets the select clause, listing specific attributes to fetch, and
776
- # expressions to compute and fetch. Clause syntax mimics SQL.
1301
+ # SELECT id, weight, all-attributes,
1302
+ # COUNT(DISTINCT vendor) AS @distinct,
1303
+ # COUNT(*) AS @count
1304
+ # FROM products
1305
+ # GROUP BY category
1306
+ # ORDER BY @count DESC
777
1307
  #
778
- # +SetSelect+ is very similar to the part of a typical SQL query between
779
- # +SELECT+ and +FROM+. It lets you choose what attributes (columns) to
780
- # fetch, and also what expressions over the columns to compute and fetch.
781
- # A certain difference from SQL is that expressions must always be aliased
782
- # to a correct identifier (consisting of letters and digits) using +AS+
783
- # keyword. SQL also lets you do that but does not require to. Sphinx enforces
784
- # aliases so that the computation results can always be returned under a
785
- #{ }"normal" name in the result set, used in other clauses, etc.
1308
+ # In the sample pseudo code shown just above, {#set_group_distinct}
1309
+ # call corresponds to <tt>COUNT(DISTINCT vendor)</tt> clause only.
1310
+ # <tt>GROUP BY</tt>, <tt>ORDER BY</tt>, and <tt>COUNT(*)</tt>
1311
+ # clauses are all an equivalent of {#set_group_by} settings. Both
1312
+ # queries will return one matching row for each category. In
1313
+ # addition to indexed attributes, matches will also contain
1314
+ # total per-category matches count, and the count of distinct
1315
+ # vendor IDs within each category.
786
1316
  #
787
- # Everything else is basically identical to SQL. Star ('*') is supported.
788
- # Functions are supported. Arbitrary amount of expressions is supported.
789
- # Computed expressions can be used for sorting, filtering, and grouping,
790
- # just as the regular attributes.
1317
+ # @param [String, Symbol] attribute an attribute name.
1318
+ # @return [Sphinx::Client] self.
791
1319
  #
792
- # Starting with version 0.9.9-rc2, aggregate functions (<tt>AVG()</tt>,
793
- # <tt>MIN()</tt>, <tt>MAX()</tt>, <tt>SUM()</tt>) are supported when using
794
- # <tt>GROUP BY</tt>.
1320
+ # @example
1321
+ # sphinx.set_group_distinct(:category_id)
795
1322
  #
796
- # Expression sorting (Section 4.5, “SPH_SORT_EXPR mode”) and geodistance
797
- # functions (+SetGeoAnchor+) are now internally implemented
798
- # using this computed expressions mechanism, using magic names '<tt>@expr</tt>'
799
- # and '<tt>@geodist</tt>' respectively.
1323
+ # @raise [ArgumentError] Occurred when parameters are invalid.
800
1324
  #
801
- # Usage example:
1325
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgroupdistinct Section 6.5.2, "SetGroupDistinct"
1326
+ # @see #set_group_by
802
1327
  #
803
- # sphinx.SetSelect('*, @weight+(user_karma+ln(pageviews))*0.1 AS myweight')
804
- # sphinx.SetSelect('exp_years, salary_gbp*{$gbp_usd_rate} AS salary_usd, IF(age>40,1,0) AS over40')
805
- # sphinx.SetSelect('*, AVG(price) AS avgprice')
806
- #
807
- def SetSelect(select)
808
- raise ArgumentError, '"select" argument must be String' unless select.kind_of?(String)
1328
+ def set_group_distinct(attribute)
1329
+ raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
809
1330
 
810
- @select = select
1331
+ @groupdistinct = attribute.to_s
1332
+ self
811
1333
  end
812
-
1334
+ alias :SetGroupDistinct :set_group_distinct
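The relationship between grouping, the per-group match count, and the per-group distinct count can be emulated on an in-memory array. This is a sketch of the semantics only, using made-up rows; searchd computes these values itself, and the `:count` and `:distinct` keys merely stand in for the <tt>@count</tt> and <tt>@distinct</tt> attributes.

```ruby
# Sample rows are made up for illustration.
products = [
  { id: 1, category: 10, vendor: 'a' },
  { id: 2, category: 10, vendor: 'b' },
  { id: 3, category: 10, vendor: 'a' },
  { id: 4, category: 20, vendor: 'c' }
]

# One result row per category, like GROUP BY category.
groups = products.group_by { |row| row[:category] }.map do |category, rows|
  {
    category: category,
    count:    rows.size,                                 # stands in for @count
    distinct: rows.map { |row| row[:vendor] }.uniq.size  # stands in for @distinct
  }
end

# Equivalent of the '@count desc' group sorting clause.
groups = groups.sort_by { |group| -group[:count] }
```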
1335
+
1336
+ #=================================================================
1337
+ # Querying
1338
+ #=================================================================
1339
+
813
1340
  # Clears all currently set filters.
814
1341
  #
815
1342
  # This call is only normally required when using multi-queries. You might want
816
1343
  # to set different filters for different queries in the batch. To do that,
817
- # you should call +ResetFilters+ and add new filters using the respective calls.
1344
+ # you should call {#reset_filters} and add new filters using the respective calls.
1345
+ #
1346
+ # @return [Sphinx::Client] self.
818
1347
  #
819
- # Usage example:
1348
+ # @example
1349
+ # sphinx.reset_filters
820
1350
  #
821
- # sphinx.ResetFilters
1351
+ # @see #set_filter
1352
+ # @see #set_filter_range
1353
+ # @see #set_filter_float_range
1354
+ # @see #set_geo_anchor
822
1355
  #
823
- def ResetFilters
1356
+ def reset_filters
824
1357
  @filters = []
825
1358
  @anchor = []
1359
+ self
826
1360
  end
827
-
1361
+ alias :ResetFilters :reset_filters
1362
+
828
1363
  # Clears all current group-by settings, and disables group-by.
829
1364
  #
830
1365
  # This call is only normally required when using multi-queries. You can
831
- # change individual group-by settings using +SetGroupBy+ and +SetGroupDistinct+
832
- # calls, but you can not disable group-by using those calls. +ResetGroupBy+
1366
+ # change individual group-by settings using {#set_group_by} and {#set_group_distinct}
1367
+ # calls, but you can not disable group-by using those calls. {#reset_group_by}
833
1368
  # fully resets previous group-by settings and disables group-by mode in the
834
- # current state, so that subsequent +AddQuery+ calls can perform non-grouping
1369
+ # current state, so that subsequent {#add_query} calls can perform non-grouping
835
1370
  # searches.
836
1371
  #
837
- # Usage example:
1372
+ # @return [Sphinx::Client] self.
838
1373
  #
839
- # sphinx.ResetGroupBy
1374
+ # @example
1375
+ # sphinx.reset_group_by
840
1376
  #
841
- def ResetGroupBy
1377
+ # @see #set_group_by
1378
+ # @see #set_group_distinct
1379
+ #
1380
+ def reset_group_by
842
1381
  @groupby = ''
843
1382
  @groupfunc = SPH_GROUPBY_DAY
844
1383
  @groupsort = '@group desc'
845
1384
  @groupdistinct = ''
1385
+ self
846
1386
  end
847
-
1387
+ alias :ResetGroupBy :reset_group_by
1388
+
848
1389
  # Clear all attribute value overrides (for multi-queries).
849
- def ResetOverrides
850
- @overrides = []
851
- end
852
-
853
- # Connect to searchd server and run given search query.
854
1390
  #
855
- # <tt>query</tt> is query string
856
-
857
- # <tt>index</tt> is index name (or names) to query. default value is "*" which means
858
- # to query all indexes. Accepted characters for index names are letters, numbers,
859
- # dash, and underscore; everything else is considered a separator. Therefore,
860
- # all the following calls are valid and will search two indexes:
1391
+ # This call is only normally required when using multi-queries. You might want
1392
+ # to set field overrides for different queries in the batch. To do that,
1393
+ # you should call {#reset_overrides} and add new overrides using the
1394
+ # respective calls.
861
1395
  #
862
- # sphinx.Query('test query', 'main delta')
863
- # sphinx.Query('test query', 'main;delta')
864
- # sphinx.Query('test query', 'main, delta')
1396
+ # @return [Sphinx::Client] self.
865
1397
  #
866
- # Index order matters. If identical IDs are found in two or more indexes,
867
- # weight and attribute values from the very last matching index will be used
868
- # for sorting and returning to client. Therefore, in the example above,
869
- # matches from "delta" index will always "win" over matches from "main".
1398
+ # @example
1399
+ # sphinx.reset_overrides
870
1400
  #
871
- # Returns false on failure.
872
- # Returns hash which has the following keys on success:
873
- #
874
- # * <tt>'matches'</tt> -- array of hashes {'weight', 'group', 'id'}, where 'id' is document_id.
875
- # * <tt>'total'</tt> -- total amount of matches retrieved (upto SPH_MAX_MATCHES, see sphinx.h)
876
- # * <tt>'total_found'</tt> -- total amount of matching documents in index
877
- # * <tt>'time'</tt> -- search time
878
- # * <tt>'words'</tt> -- hash which maps query terms (stemmed!) to ('docs', 'hits') hash
879
- def Query(query, index = '*', comment = '')
1401
+ # @see #set_override
1402
+ #
1403
+ def reset_overrides
1404
+ @overrides = []
1405
+ self
1406
+ end
1407
+ alias :ResetOverrides :reset_overrides
1408
+
1409
+ # Connects to searchd server, runs given search query with
1410
+ # current settings, obtains and returns the result set.
1411
+ #
1412
+ # +query+ is a query string. +index+ is an index name (or names)
1413
+ # string. Returns false and sets {#last_error} message on general
1414
+ # error. Returns search result set on success. Additionally,
1415
+ # the contents of +comment+ are sent to the query log, marked in
1416
+ # square brackets, just before the search terms, which can be very
1417
+ # useful for debugging. Currently, the comment is limited to 128
1418
+ # characters.
1419
+ #
1420
+ # Default value for +index+ is <tt>"*"</tt> that means to query
1421
+ # all local indexes. Characters allowed in index names include
1422
+ # Latin letters (a-z), numbers (0-9), minus sign (-), and
1423
+ # underscore (_); everything else is considered a separator.
1424
+ # Therefore, all of the following sample calls are valid and
1425
+ # will search the same two indexes:
1426
+ #
1427
+ # sphinx.query('test query', 'main delta')
1428
+ # sphinx.query('test query', 'main;delta')
1429
+ # sphinx.query('test query', 'main, delta')
1430
+ #
1431
+ # Index specification order matters. If documents with identical
1432
+ # IDs are found in two or more indexes, weight and attribute
1433
+ # values from the very last matching index will be used for
1434
+ # sorting and returning to client (unless explicitly overridden
1435
+ # with {#set_index_weights}). Therefore, in the example above,
1436
+ # matches from "delta" index will always win over matches
1437
+ # from "main".
1438
+ #
1439
+ # On success, {#query} returns a result set that contains some
1440
+ # of the found matches (as requested by {#set_limits}) and
1441
+ # additional general per-query statistics. The result set
1442
+ # is a +Hash+ with the following keys and values:
1443
+ #
1444
+ # <tt>"matches"</tt>::
1445
+ # Array with small +Hash+es containing document weight and
1446
+ # attribute values.
1447
+ # <tt>"total"</tt>::
1448
+ # Total amount of matches retrieved on server (ie. to the server
1449
+ # side result set) by this query. You can retrieve up to this
1450
+ # amount of matches from server for this query text with current
1451
+ # query settings.
1452
+ # <tt>"total_found"</tt>::
1453
+ # Total amount of matching documents in index (that were found
1454
+ # and processed on server).
1455
+ # <tt>"words"</tt>::
1456
+ # Hash which maps query keywords (case-folded, stemmed, and
1457
+ # otherwise processed) to a small Hash with per-keyword statistics
1458
+ # ("docs", "hits").
1459
+ # <tt>"error"</tt>::
1460
+ # Query error message reported by searchd (string, human readable).
1461
+ # Empty if there were no errors.
1462
+ # <tt>"warning"</tt>::
1463
+ # Query warning message reported by searchd (string, human readable).
1464
+ # Empty if there were no warnings.
1465
+ #
1466
+ # It should be noted that {#query} carries out the same actions as
1467
+ # {#add_query} and {#run_queries} without the intermediate steps; it
1468
+ # is analogous to a single {#add_query} call, followed by a
1469
+ # corresponding {#run_queries}, then returning the first array
1470
+ # element of matches (from the first, and only, query.)
1471
+ #
1472
+ # @param [String] query a query string.
1473
+ # @param [String] index an index name (or names).
1474
+ # @param [String] comment a comment to be sent to the query log.
1475
+ # @return [Hash, false] result set described above or +false+ on error.
1476
+ #
1477
+ # @example
1478
+ # sphinx.query('some search text', '*', 'search page')
1479
+ #
1480
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-query Section 6.6.1, "Query"
1481
+ # @see #add_query
1482
+ # @see #run_queries
1483
+ #
1484
+ def query(query, index = '*', comment = '')
880
1485
  @reqs = []
881
-
882
- self.AddQuery(query, index, comment)
883
- results = self.RunQueries
884
-
1486
+
1487
+ self.add_query(query, index, comment)
1488
+ results = self.run_queries
1489
+
885
1490
  # probably network error; error message should be already filled
886
1491
  return false unless results.instance_of?(Array)
887
-
1492
+
888
1493
  @error = results[0]['error']
889
1494
  @warning = results[0]['warning']
890
-
1495
+
891
1496
  return false if results[0]['status'] == SEARCHD_ERROR
892
1497
  return results[0]
893
1498
  end
894
-
895
- # Add query to batch.
896
- #
897
- # Batch queries enable searchd to perform internal optimizations,
898
- # if possible; and reduce network connection overheads in all cases.
899
- #
900
- # For instance, running exactly the same query with different
901
- # groupby settings will enable searched to perform expensive
902
- # full-text search and ranking operation only once, but compute
903
- # multiple groupby results from its output.
1499
+ alias :Query :query
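Since the call returns either a result Hash or +false+, callers have to branch on the return value before reading any keys. A minimal sketch of that handling, assuming the result keys listed in the documentation above; `summarize_result` and the sample Hash are hypothetical, and the contents of each match are made up.

```ruby
# Hypothetical helper: +result+ is whatever a query call returned, and
# +last_error+ is the client's error message for the failure case.
def summarize_result(result, last_error = nil)
  return "search failed: #{last_error}" if result == false

  summary = "#{result['matches'].length} of #{result['total_found']} matches returned"

  # The warning string is empty when searchd reported no warnings.
  warning = result['warning'].to_s
  summary += " (warning: #{warning})" unless warning.empty?
  summary
end

# Made-up result set shaped like the documented Hash.
sample = {
  'matches'     => [{ 'id' => 1, 'weight' => 2 }],
  'total'       => 1,
  'total_found' => 1,
  'words'       => { 'test' => { 'docs' => 1, 'hits' => 1 } },
  'error'       => '',
  'warning'     => ''
}

puts summarize_result(sample)
puts summarize_result(false, 'connection refused')
```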
1500
+
1501
+ # Adds additional query with current settings to multi-query batch.
1502
+ # +query+ is a query string. +index+ is an index name (or names)
1503
+ # string. Additionally if provided, the contents of +comment+ are
1504
+ # sent to the query log, marked in square brackets, just before
1505
+ # the search terms, which can be very useful for debugging.
1506
+ # Currently, this is limited to 128 characters. Returns index
1507
+ # to results array returned from {#run_queries}.
1508
+ #
1509
+ # Batch queries (or multi-queries) enable searchd to perform
1510
+ # internal optimizations if possible. They also reduce network
1511
+ # connection overheads and search process creation overheads in all
1512
+ # cases. They do not result in any additional overheads compared
1513
+ # to simple queries. Thus, if you run several different queries
1514
+ # from your web page, you should always consider using multi-queries.
1515
+ #
1516
+ # For instance, running the same full-text query but with different
1517
+ # sorting or group-by settings will enable searchd to perform
1518
+ # expensive full-text search and ranking operation only once, but
1519
+ # compute multiple group-by results from its output.
1520
+ #
1521
+ # This can be a big saver when you need to display not just plain
1522
+ # search results but also some per-category counts, such as the
1523
+ # amount of products grouped by vendor. Without multi-query, you
1524
+ # would have to run several queries which perform essentially the
1525
+ # same search and retrieve the same matches, but create result
1526
+ # sets differently. With multi-query, you simply pass all these
1527
+ # queries in a single batch and Sphinx optimizes the redundant
1528
+ # full-text search internally.
1529
+ #
1530
+ # {#add_query} internally saves full current settings state along
1531
+ # with the query, and you can safely change them afterwards for
1532
+ # subsequent {#add_query} calls. Already added queries will not
1533
+ # be affected; there's actually no way to change them at all.
1534
+ # Here's an example:
1535
+ #
1536
+ # sphinx.set_sort_mode(:relevance)
1537
+ # sphinx.add_query("hello world", "documents")
1538
+ #
1539
+ # sphinx.set_sort_mode(:attr_desc, :price)
1540
+ # sphinx.add_query("ipod", "products")
904
1541
  #
905
- # Parameters are exactly the same as in <tt>Query</tt> call.
906
- # Returns index to results array returned by <tt>RunQueries</tt> call.
907
- def AddQuery(query, index = '*', comment = '')
1542
+ # sphinx.add_query("harry potter", "books")
1543
+ #
1544
+ # results = sphinx.run_queries
1545
+ #
1546
+ # With the code above, 1st query will search for "hello world"
1547
+ # in "documents" index and sort results by relevance, 2nd query
1548
+ # will search for "ipod" in "products" index and sort results
1549
+ # by price, and 3rd query will search for "harry potter" in
1550
+ # "books" index while still sorting by price. Note that 2nd
1551
+ # {#set_sort_mode} call does not affect the first query (because
1552
+ # it's already added) but affects both other subsequent queries.
1553
+ #
1554
+ # Additionally, any filters set up before an {#add_query} will
1555
+ # fall through to subsequent queries. So, if {#set_filter} is
1556
+ # called before the first query, the same filter will be in
1557
+ # place for the second (and subsequent) queries batched through
1558
+ # {#add_query} unless you call {#reset_filters} first. Alternatively,
1559
+ # you can add additional filters as well.
1560
+ #
1561
+ # This would also be true for grouping options and sorting options;
1562
+ # no current sorting, filtering, and grouping settings are affected
1563
+ # by this call; so subsequent queries will reuse current query settings.
1564
+ #
1565
+ # {#add_query} returns an index into an array of results that will
1566
+ # be returned from {#run_queries} call. It is simply a sequentially
1567
+ # increasing 0-based integer, ie. first call will return 0, second
1568
+ # will return 1, and so on. Just a small helper so you won't have
1569
+ # to track the indexes manually if you need them.
1570
+ #
1571
+ # @param [String] query a query string.
1572
+ # @param [String] index an index name (or names).
1573
+ # @param [String] comment a comment to be sent to the query log.
1574
+ # @return [Integer] an index into an array of results that will
1575
+ # be returned from {#run_queries} call.
1576
+ #
1577
+ # @example
1578
+ # sphinx.add_query('some search text', '*', 'search page')
1579
+ #
1580
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-addquery Section 6.6.2, "AddQuery"
1581
+ # @see #query
1582
+ # @see #run_queries
1583
+ #
1584
+ def add_query(query, index = '*', comment = '')
908
1585
  # build request
909
-
1586
+
910
1587
  # mode and limits
911
1588
  request = Request.new
912
1589
  request.put_int @offset, @limit, @mode, @ranker, @sort
@@ -920,8 +1597,8 @@ module Sphinx
920
1597
  # id64 range marker
921
1598
  request.put_int 1
922
1599
  # id64 range
923
- request.put_int64 @min_id.to_i, @max_id.to_i
924
-
1600
+ request.put_int64 @min_id.to_i, @max_id.to_i
1601
+
925
1602
  # filters
926
1603
  request.put_int @filters.length
927
1604
  @filters.each do |filter|
@@ -940,7 +1617,7 @@ module Sphinx
940
1617
  end
941
1618
  request.put_int filter['exclude'] ? 1 : 0
942
1619
  end
943
-
1620
+
944
1621
  # group-by clause, max-matches count, group-sort clause, cutoff count
945
1622
  request.put_int @groupfunc
946
1623
  request.put_string @groupby
@@ -948,7 +1625,7 @@ module Sphinx
948
1625
  request.put_string @groupsort
949
1626
  request.put_int @cutoff, @retrycount, @retrydelay
950
1627
  request.put_string @groupdistinct
951
-
1628
+
952
1629
  # anchor point
953
1630
  if @anchor.empty?
954
1631
  request.put_int 0
@@ -957,27 +1634,27 @@ module Sphinx
957
1634
  request.put_string @anchor['attrlat'], @anchor['attrlong']
958
1635
  request.put_float @anchor['lat'], @anchor['long']
959
1636
  end
960
-
1637
+
961
1638
  # per-index weights
962
1639
  request.put_int @indexweights.length
963
1640
  @indexweights.each do |idx, weight|
964
1641
  request.put_string idx.to_s
965
1642
  request.put_int weight
966
1643
  end
967
-
1644
+
968
1645
  # max query time
969
1646
  request.put_int @maxquerytime
970
-
1647
+
971
1648
  # per-field weights
972
1649
  request.put_int @fieldweights.length
973
1650
  @fieldweights.each do |field, weight|
974
1651
  request.put_string field.to_s
975
1652
  request.put_int weight
976
1653
  end
977
-
1654
+
978
1655
  # comment
979
1656
  request.put_string comment
980
-
1657
+
981
1658
  # attribute overrides
982
1659
  request.put_int @overrides.length
983
1660
  for entry in @overrides do
@@ -995,173 +1672,196 @@ module Sphinx
995
1672
  end
996
1673
  end
997
1674
  end
998
-
1675
+
999
1676
  # select-list
1000
1677
  request.put_string @select
1001
-
1678
+
1002
1679
  # store request to requests array
1003
1680
  @reqs << request.to_s;
1004
1681
  return @reqs.length - 1
1005
1682
  end
1006
-
1007
- # Run queries batch.
1008
- #
1009
- # Returns an array of result sets on success.
1010
- # Returns false on network IO failure.
1011
- #
1012
- # Each result set in returned array is a hash which containts
1013
- # the same keys as the hash returned by <tt>Query</tt>, plus:
1014
- #
1015
- # * <tt>'error'</tt> -- search error for this query
1016
- # * <tt>'words'</tt> -- hash which maps query terms (stemmed!) to ( "docs", "hits" ) hash
1017
- #
1018
- def RunQueries
1683
+ alias :AddQuery :add_query
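The index bookkeeping described above can be sketched in isolation. This is a minimal illustration only: `BatchSketch` is a hypothetical stand-in, not part of the gem, and it mirrors just the queue handling visible in the diff (`add_query` appends and returns `@reqs.length - 1`; `run_queries` empties the queue).

```ruby
# Hypothetical stand-in for the client's request queue; not part of the gem.
class BatchSketch
  def initialize
    @reqs = []
  end

  # Mirrors the tail of add_query: store the request, return its 0-based slot.
  def add_query(query)
    @reqs << query
    @reqs.length - 1
  end

  # Mirrors run_queries emptying the queue; returns one fake result per query.
  def run_queries
    reqs, @reqs = @reqs, []
    reqs.map { |query| { 'query' => query } }
  end
end

batch = BatchSketch.new
first = batch.add_query('cats')
second = batch.add_query('dogs')
results = batch.run_queries

# The value returned by add_query selects the matching result set.
puts results[second]['query']
```

Because the queue is emptied on every run, numbering starts again from 0 for the next batch.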
1684
+
1685
+ # Connects to searchd, runs a batch of all queries added using
1686
+ # {#add_query}, obtains and returns the result sets. Returns
1687
+ # +false+ and sets {#last_error} message on general error
1688
+ # (such as network I/O failure). Returns a plain array of
1689
+ # result sets on success.
1690
+ #
1691
+ # Each result set in the returned array is exactly the same as
1692
+ # the result set returned from {#query}.
1693
+ #
1694
+ # Note that the batch query request itself almost always succeeds,
1695
+ # unless there's a network error, blocking index rotation in
1696
+ # progress, or another general failure which prevents the whole
1697
+ # request from being processed.
1698
+ #
1699
+ # However, individual queries within the batch might very well
1700
+ # fail. In this case their respective result sets will contain
1701
+ # a non-empty "error" message, but no matches or query statistics.
1702
+ # In the extreme case all queries within the batch could fail.
1703
+ # There still will be no general error reported, because the API
1704
+ # was able to successfully connect to searchd, submit the batch,
1705
+ # and receive the results — but every result set will have a
1706
+ # specific error message.
1707
+ #
1708
+ # @return [Array<Hash>] an +Array+ of +Hash+es which are exactly
1709
+ # the same as the result set returned from {#query}.
1710
+ #
1711
+ # @example
1712
+ # sphinx.add_query('some search text', '*', 'search page')
1713
+ # results = sphinx.run_queries
1714
+ #
1715
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-runqueries Section 6.6.3, "RunQueries"
1716
+ # @see #add_query
1717
+ #
1718
+ def run_queries
1019
1719
  if @reqs.empty?
1020
- @error = 'No queries defined, issue AddQuery() first'
1720
+ @error = 'No queries defined, issue add_query() first'
1021
1721
  return false
1022
1722
  end
1023
1723
 
1024
- req = @reqs.join('')
1025
- nreqs = @reqs.length
1724
+ reqs, nreqs = @reqs.join(''), @reqs.length
1026
1725
  @reqs = []
1027
- response = perform_request(:search, req, nreqs)
1028
-
1726
+ response = perform_request(:search, reqs, nreqs)
1727
+
1029
1728
  # parse response
1030
- begin
1031
- results = []
1032
- ires = 0
1033
- while ires < nreqs
1034
- ires += 1
1035
- result = {}
1036
-
1037
- result['error'] = ''
1038
- result['warning'] = ''
1039
-
1040
- # extract status
1041
- status = result['status'] = response.get_int
1042
- if status != SEARCHD_OK
1043
- message = response.get_string
1044
- if status == SEARCHD_WARNING
1045
- result['warning'] = message
1046
- else
1047
- result['error'] = message
1048
- results << result
1049
- next
1050
- end
1051
- end
1052
-
1053
- # read schema
1054
- fields = []
1055
- attrs = {}
1056
- attrs_names_in_order = []
1057
-
1058
- nfields = response.get_int
1059
- while nfields > 0
1060
- nfields -= 1
1061
- fields << response.get_string
1729
+ (1..nreqs).map do
1730
+ result = { 'error' => '', 'warning' => '' }
1731
+
1732
+ # extract status
1733
+ status = result['status'] = response.get_int
1734
+ if status != SEARCHD_OK
1735
+ message = response.get_string
1736
+ if status == SEARCHD_WARNING
1737
+ result['warning'] = message
1738
+ else
1739
+ result['error'] = message
1740
+ next result
1062
1741
  end
1063
- result['fields'] = fields
1064
-
1065
- nattrs = response.get_int
1066
- while nattrs > 0
1067
- nattrs -= 1
1068
- attr = response.get_string
1069
- type = response.get_int
1070
- attrs[attr] = type
1071
- attrs_names_in_order << attr
1742
+ end
1743
+
1744
+ # read schema
1745
+ nfields = response.get_int
1746
+ result['fields'] = (1..nfields).map { response.get_string }
1747
+
1748
+ attrs_names_in_order = []
1749
+ nattrs = response.get_int
1750
+ attrs = (1..nattrs).inject({}) do |hash, idx|
1751
+ name, type = response.get_string, response.get_int
1752
+ hash[name] = type
1753
+ attrs_names_in_order << name
1754
+ hash
1755
+ end
1756
+ result['attrs'] = attrs
1757
+
1758
+ # read match count
1759
+ count, id64 = response.get_ints(2)
1760
+
1761
+ # read matches
1762
+ result['matches'] = (1..count).map do
1763
+ doc, weight = if id64 == 0
1764
+ response.get_ints(2)
1765
+ else
1766
+ [response.get_int64, response.get_int]
1072
1767
  end
1073
- result['attrs'] = attrs
1074
-
1075
- # read match count
1076
- count = response.get_int
1077
- id64 = response.get_int
1078
-
1079
- # read matches
1080
- result['matches'] = []
1081
- while count > 0
1082
- count -= 1
1083
-
1084
- if id64 != 0
1085
- doc = response.get_int64
1086
- weight = response.get_int
1087
- else
1088
- doc, weight = response.get_ints(2)
1089
- end
1090
-
1091
- r = {} # This is a single result put in the result['matches'] array
1092
- r['id'] = doc
1093
- r['weight'] = weight
1094
- attrs_names_in_order.each do |a|
1095
- r['attrs'] ||= {}
1096
-
1097
- case attrs[a]
1098
- when SPH_ATTR_BIGINT
1099
- # handle 64-bit ints
1100
- r['attrs'][a] = response.get_int64
1101
- when SPH_ATTR_FLOAT
1102
- # handle floats
1103
- r['attrs'][a] = response.get_float
1104
- when SPH_ATTR_STRING
1105
- r['attrs'][a] = response.get_string
1768
+
1769
+ # This is a single result put in the result['matches'] array
1770
+ match = { 'id' => doc, 'weight' => weight }
1771
+ match['attrs'] = attrs_names_in_order.inject({}) do |hash, name|
1772
+ hash[name] = case attrs[name]
1773
+ when SPH_ATTR_BIGINT
1774
+ # handle 64-bit ints
1775
+ response.get_int64
1776
+ when SPH_ATTR_FLOAT
1777
+ # handle floats
1778
+ response.get_float
1779
+ when SPH_ATTR_STRING
1780
+ response.get_string
1781
+ else
1782
+ # handle everything else as unsigned ints
1783
+ val = response.get_int
1784
+ if (attrs[name] & SPH_ATTR_MULTI) != 0
1785
+ (1..val).map { response.get_int }
1106
1786
  else
1107
- # handle everything else as unsigned ints
1108
- val = response.get_int
1109
- if (attrs[a] & SPH_ATTR_MULTI) != 0
1110
- r['attrs'][a] = []
1111
- 1.upto(val) do
1112
- r['attrs'][a] << response.get_int
1113
- end
1114
- else
1115
- r['attrs'][a] = val
1116
- end
1117
- end
1787
+ val
1788
+ end
1118
1789
  end
1119
- result['matches'] << r
1790
+ hash
1120
1791
  end
1121
- result['total'], result['total_found'], msecs, words = response.get_ints(4)
1122
- result['time'] = '%.3f' % (msecs / 1000.0)
1123
-
1124
- result['words'] = {}
1125
- while words > 0
1126
- words -= 1
1127
- word = response.get_string
1128
- docs, hits = response.get_ints(2)
1129
- result['words'][word] = { 'docs' => docs, 'hits' => hits }
1130
- end
1131
-
1132
- results << result
1792
+ match
1133
1793
  end
1134
- #rescue EOFError
1135
- # @error = 'incomplete reply'
1136
- # raise SphinxResponseError, @error
1794
+ result['total'], result['total_found'], msecs = response.get_ints(3)
1795
+ result['time'] = '%.3f' % (msecs / 1000.0)
1796
+
1797
+ nwords = response.get_int
1798
+ result['words'] = (1..nwords).inject({}) do |hash, idx|
1799
+ word = response.get_string
1800
+ docs, hits = response.get_ints(2)
1801
+ hash[word] = { 'docs' => docs, 'hits' => hits }
1802
+ hash
1803
+ end
1804
+
1805
+ result
1137
1806
  end
1138
-
1139
- return results
1140
1807
  end
1141
-
1142
- # Connect to searchd server and generate exceprts from given documents.
1143
- #
1144
- # * <tt>docs</tt> -- an array of strings which represent the documents' contents
1145
- # * <tt>index</tt> -- a string specifiying the index which settings will be used
1146
- # for stemming, lexing and case folding
1147
- # * <tt>words</tt> -- a string which contains the words to highlight
1148
- # * <tt>opts</tt> is a hash which contains additional optional highlighting parameters.
1149
- #
1150
- # You can use following parameters:
1151
- # * <tt>'before_match'</tt> -- a string to insert before a set of matching words, default is "<b>"
1152
- # * <tt>'after_match'</tt> -- a string to insert after a set of matching words, default is "<b>"
1153
- # * <tt>'chunk_separator'</tt> -- a string to insert between excerpts chunks, default is " ... "
1154
- # * <tt>'limit'</tt> -- max excerpt size in symbols (codepoints), default is 256
1155
- # * <tt>'around'</tt> -- how much words to highlight around each match, default is 5
1156
- # * <tt>'exact_phrase'</tt> -- whether to highlight exact phrase matches only, default is <tt>false</tt>
1157
- # * <tt>'single_passage'</tt> -- whether to extract single best passage only, default is false
1158
- # * <tt>'use_boundaries'</tt> -- whether to extract passages by phrase boundaries setup in tokenizer
1159
- # * <tt>'weight_order'</tt> -- whether to order best passages in document (default) or weight order
1160
- #
1161
- # Returns false on failure.
1162
- # Returns an array of string excerpts on success.
1163
- #
1164
- def BuildExcerpts(docs, index, words, opts = {})
1808
+ alias :RunQueries :run_queries
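Since a batch can succeed as a whole while individual queries fail, callers may want to separate the two cases after confirming the return value is not `false`. The sketch below assumes only the result shape documented above (an Array of Hashes with an `'error'` string); `split_batch_results` and the sample data are hypothetical.

```ruby
# Split a batch result array into successful and failed result sets.
# A failed query carries a non-empty 'error' string.
def split_batch_results(results)
  results.partition { |result| result['error'].to_s.empty? }
end

# Hand-written sample data standing in for a real searchd reply.
sample = [
  { 'error' => '', 'warning' => '', 'matches' => [{ 'id' => 1, 'weight' => 2 }] },
  { 'error' => 'unknown local index', 'warning' => '' }
]

ok, failed = split_batch_results(sample)
puts "#{ok.length} succeeded, #{failed.length} failed"
failed.each { |result| puts "query failed: #{result['error']}" }
```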
1809
+
1810
+ #=================================================================
1811
+ # Additional functionality
1812
+ #=================================================================
1813
+
1814
+ # Excerpts (snippets) builder function. Connects to searchd, asks
1815
+ # it to generate excerpts (snippets) from given documents, and
1816
+ # returns the results.
1817
+ #
1818
+ # +docs+ is a plain array of strings that carry the documents'
1819
+ # contents. +index+ is an index name string. Different settings
1820
+ # (such as charset, morphology, wordforms) from given index will
1821
+ # be used. +words+ is a string that contains the keywords to
1822
+ # highlight. They will be processed with respect to index settings.
1823
+ # For instance, if English stemming is enabled in the index,
1824
+ # "shoes" will be highlighted even if keyword is "shoe". Starting
1825
+ # with version 0.9.9-rc1, keywords can contain wildcards, that
1826
+ # work similarly to star-syntax available in queries.
1827
+ #
1828
+ # @param [Array<String>] docs an array of strings which represent
1829
+ # the documents' contents.
1830
+ # @param [String] index an index whose settings will be used for
1831
+ # stemming, lexing and case folding.
1832
+ # @param [String] words a string which contains the words to highlight.
1833
+ # @param [Hash] opts a +Hash+ which contains additional optional
1834
+ # highlighting parameters.
1835
+ # @option opts [String] 'before_match' ("<b>") a string to insert before a
1836
+ # keyword match.
1837
+ # @option opts [String] 'after_match' ("</b>") a string to insert after a
1838
+ # keyword match.
1839
+ # @option opts [String] 'chunk_separator' (" ... ") a string to insert
1840
+ # between snippet chunks (passages).
1841
+ # @option opts [Integer] 'limit' (256) maximum snippet size, in symbols
1842
+ # (codepoints).
1843
+ # @option opts [Integer] 'around' (5) how many words to pick around
1844
+ # each matching keywords block.
1845
+ # @option opts [Boolean] 'exact_phrase' (false) whether to highlight exact
1846
+ # query phrase matches only instead of individual keywords.
1847
+ # @option opts [Boolean] 'single_passage' (false) whether to extract single
1848
+ # best passage only.
1849
+ # @option opts [Boolean] 'use_boundaries' (false) whether to extract
1850
+ # passages by phrase boundaries set up in the tokenizer.
1851
+ # @option opts [Boolean] 'weight_order' (false) whether to sort the
1852
+ # extracted passages in order of relevance (decreasing weight),
1853
+ # or in order of appearance in the document (increasing position).
1854
+ # @return [Array<String>, false] a plain array of strings with
1855
+ # excerpts (snippets) on success; otherwise, +false+.
1856
+ #
1857
+ # @raise [ArgumentError] if any parameter is invalid.
1858
+ #
1859
+ # @example
1860
+ # sphinx.build_excerpts(['hello world', 'hello me'], 'idx', 'hello')
1861
+ #
1862
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-buildexcerpts Section 6.7.1, "BuildExcerpts"
1863
+ #
1864
+ def build_excerpts(docs, index, words, opts = {})
1165
1865
  raise ArgumentError, '"docs" argument must be Array' unless docs.kind_of?(Array)
1166
1866
  raise ArgumentError, '"index" argument must be String' unless index.kind_of?(String) or index.kind_of?(Symbol)
1167
1867
  raise ArgumentError, '"words" argument must be String' unless words.kind_of?(String)
@@ -1182,9 +1882,9 @@ module Sphinx
1182
1882
  opts['use_boundaries'] ||= opts[:use_boundaries] || false
1183
1883
  opts['weight_order'] ||= opts[:weight_order] || false
1184
1884
  opts['query_mode'] ||= opts[:query_mode] || false
1185
-
1885
+
1186
1886
  # build request
1187
-
1887
+
1188
1888
  # v.1.0 req
1189
1889
  flags = 1
1190
1890
  flags |= 2 if opts['exact_phrase']
@@ -1192,47 +1892,71 @@ module Sphinx
1192
1892
  flags |= 8 if opts['use_boundaries']
1193
1893
  flags |= 16 if opts['weight_order']
1194
1894
  flags |= 32 if opts['query_mode']
1195
-
1895
+
1196
1896
  request = Request.new
1197
1897
  request.put_int 0, flags # mode=0, flags=1 (remove spaces)
1198
1898
  # req index
1199
1899
  request.put_string index.to_s
1200
1900
  # req words
1201
1901
  request.put_string words
1202
-
1902
+
1203
1903
  # options
1204
1904
  request.put_string opts['before_match']
1205
1905
  request.put_string opts['after_match']
1206
1906
  request.put_string opts['chunk_separator']
1207
1907
  request.put_int opts['limit'].to_i, opts['around'].to_i
1208
-
1908
+
1209
1909
  # documents
1210
1910
  request.put_int docs.size
1211
1911
  request.put_string(*docs)
1212
-
1912
+
1213
1913
  response = perform_request(:excerpt, request)
1214
-
1914
+
1215
1915
  # parse response
1216
- begin
1217
- res = []
1218
- docs.each do |doc|
1219
- res << response.get_string
1220
- end
1221
- rescue EOFError
1222
- @error = 'incomplete reply'
1223
- raise SphinxResponseError, @error
1224
- end
1225
- return res
1916
+ docs.map { response.get_string }
1226
1917
  end
1227
-
1228
- # Connect to searchd server, and generate keyword list for a given query.
1229
- #
1230
- # Returns an array of words on success.
1231
- def BuildKeywords(query, index, hits)
1918
+ alias :BuildExcerpts :build_excerpts
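The boolean options of `build_excerpts` are packed into a single integer before being sent. A minimal sketch of that packing follows, using only the bit values visible in this diff; `EXCERPT_FLAGS` and `excerpt_flags` are hypothetical names, and `'single_passage'` is left out because its bit falls in a part of the method the diff does not show.

```ruby
# Bit values copied from the visible lines of build_excerpts.
# 'single_passage' is omitted: its bit is not shown in the diff.
EXCERPT_FLAGS = {
  'exact_phrase'   => 2,
  'use_boundaries' => 8,
  'weight_order'   => 16,
  'query_mode'     => 32
}.freeze

# Combine the enabled options into one integer bitmask.
def excerpt_flags(opts)
  # Bit 1 ("remove spaces") is always set, as in the original method.
  EXCERPT_FLAGS.inject(1) do |flags, (name, bit)|
    opts[name] ? flags | bit : flags
  end
end

puts(excerpt_flags({ 'exact_phrase' => true, 'weight_order' => true }))
```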
1919
+
1920
+ # Extracts keywords from query using tokenizer settings for given
1921
+ # index, optionally with per-keyword occurrence statistics.
1922
+ # Returns an array of hashes with per-keyword information.
1923
+ #
1924
+ # +query+ is a query to extract keywords from. +index+ is a name of
1925
+ # the index to get tokenizing settings and keyword occurrence
1926
+ # statistics from. +hits+ is a boolean flag that indicates whether
1927
+ # keyword occurrence statistics are required.
1928
+ #
1929
+ # The result set consists of +Hash+es with the following keys and values:
1930
+ #
1931
+ # <tt>'tokenized'</tt>::
1932
+ # Tokenized keyword.
1933
+ # <tt>'normalized'</tt>::
1934
+ # Normalized keyword.
1935
+ # <tt>'docs'</tt>::
1936
+ # A number of documents where keyword is found (if +hits+ param is +true+).
1937
+ # <tt>'hits'</tt>::
1938
+ # A number of keyword occurrences among all documents (if +hits+ param is +true+).
1939
+ #
1940
+ # @param [String] query a query string.
1941
+ # @param [String] index an index to get tokenizing settings and
1942
+ # keyword occurrence statistics from.
1943
+ # @param [Boolean] hits indicates whether keyword occurrence
1944
+ # statistics are required.
1945
+ # @return [Array<Hash>] an +Array+ of +Hash+es in format specified
1946
+ # above.
1947
+ #
1948
+ # @raise [ArgumentError] if any parameter is invalid.
1949
+ #
1950
+ # @example
1951
+ # keywords = sphinx.build_keywords("this.is.my query", "test1", false)
1952
+ #
1953
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-buildkeywords Section 6.7.3, "BuildKeywords"
1954
+ #
1955
+ def build_keywords(query, index, hits)
1232
1956
  raise ArgumentError, '"query" argument must be String' unless query.kind_of?(String)
1233
1957
  raise ArgumentError, '"index" argument must be String' unless index.kind_of?(String) or index.kind_of?(Symbol)
1234
1958
  raise ArgumentError, '"hits" argument must be Boolean' unless hits.kind_of?(TrueClass) or hits.kind_of?(FalseClass)
1235
-
1959
+
1236
1960
  # build request
1237
1961
  request = Request.new
1238
1962
  # v.1.0 req
@@ -1241,53 +1965,79 @@ module Sphinx
1241
1965
  request.put_int hits ? 1 : 0
1242
1966
 
1243
1967
  response = perform_request(:keywords, request)
1244
-
1968
+
1245
1969
  # parse response
1246
- begin
1247
- res = []
1248
- nwords = response.get_int
1249
- 0.upto(nwords - 1) do |i|
1250
- tokenized = response.get_string
1251
- normalized = response.get_string
1252
-
1253
- entry = { 'tokenized' => tokenized, 'normalized' => normalized }
1254
- entry['docs'], entry['hits'] = response.get_ints(2) if hits
1255
-
1256
- res << entry
1257
- end
1258
- rescue EOFError
1259
- @error = 'incomplete reply'
1260
- raise SphinxResponseError, @error
1970
+ nwords = response.get_int
1971
+ (0...nwords).map do
1972
+ tokenized = response.get_string
1973
+ normalized = response.get_string
1974
+
1975
+ entry = { 'tokenized' => tokenized, 'normalized' => normalized }
1976
+ entry['docs'], entry['hits'] = response.get_ints(2) if hits
1977
+
1978
+ entry
1261
1979
  end
1262
-
1263
- return res
1264
1980
  end
1981
+ alias :BuildKeywords :build_keywords
1265
1982
 
1266
- # Batch update given attributes in given rows in given indexes.
1983
+ # Instantly updates given attribute values in given documents.
1984
+ # Returns number of actually updated documents (0 or more) on
1985
+ # success, or -1 on failure.
1986
+ #
1987
+ # +index+ is a name of the index (or indexes) to be updated.
1988
+ # +attrs+ is a plain array with string attribute names, listing
1989
+ # attributes that are updated. +values+ is a Hash where key is
1990
+ # document ID, and value is a plain array of new attribute values.
1991
+ #
1992
+ # +index+ can be either a single index name or a list, like in
1993
+ # {#query}. Unlike {#query}, wildcard is not allowed and all the
1994
+ # indexes to update must be specified explicitly. The list of
1995
+ # indexes can include distributed index names. Updates on
1996
+ # distributed indexes will be pushed to all agents.
1997
+ #
1998
+ # The updates only work with docinfo=extern storage strategy.
1999
+ # They are very fast because they're working fully in RAM, but
2000
+ # they can also be made persistent: updates are saved on disk
2001
+ # on clean searchd shutdown initiated by SIGTERM signal. With
2002
+ # additional restrictions, updates are also possible on MVA
2003
+ # attributes; refer to mva_updates_pool directive for details.
2004
+ #
2005
+ # The first sample statement will update document 1 in index
2006
+ # "test1", setting "group_id" to 456. The second one will update
2007
+ # documents 1001, 1002 and 1003 in index "products". For document
2008
+ # 1001, the new price will be set to 123 and the new amount in
2009
+ # stock to 5; for document 1002, the new price will be 37 and the
2010
+ # new amount will be 11; etc. The third one updates document 1
2011
+ # in index "test2", setting MVA attribute "group_id" to [456, 789].
2012
+ #
2013
+ # @example
2014
+ # sphinx.update_attributes("test1", ["group_id"], { 1 => [456] });
2015
+ # sphinx.update_attributes("products", ["price", "amount_in_stock"],
2016
+ # { 1001 => [123, 5], 1002 => [37, 11], 1003 => [25, 129] });
2017
+ # sphinx.update_attributes('test2', ['group_id'], { 1 => [[456, 789]] }, true)
1267
2018
  #
1268
- # * +index+ is a name of the index to be updated
1269
- # * +attrs+ is an array of attribute name strings.
1270
- # * +values+ is a hash where key is document id, and value is an array of
1271
- # * +mva+ identifies whether update MVA
1272
- # new attribute values
2019
+ # @param [String] index a name of the index to be updated.
2020
+ # @param [Array<String>] attrs an array of attribute name strings.
2021
+ # @param [Hash] values is a hash where key is document id, and
2022
+ # value is an array of new attribute values.
2023
+ # @param [Boolean] mva indicates whether to update MVA attributes.
2024
+ # @return [Integer] number of actually updated documents (0 or more) on success,
2025
+ # -1 on failure.
1273
2026
  #
1274
- # Returns number of actually updated documents (0 or more) on success.
1275
- # Returns -1 on failure.
2027
+ # @raise [ArgumentError] if any parameter is invalid.
1276
2028
  #
1277
- # Usage example:
1278
- # sphinx.UpdateAttributes('test1', ['group_id'], { 1 => [456] })
1279
- # sphinx.UpdateAttributes('test1', ['group_id'], { 1 => [[456, 789]] }, true)
2029
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-updateatttributes Section 6.7.2, "UpdateAttributes"
1280
2030
  #
1281
- def UpdateAttributes(index, attrs, values, mva = false)
2031
+ def update_attributes(index, attrs, values, mva = false)
1282
2032
  # verify everything
1283
2033
  raise ArgumentError, '"index" argument must be String' unless index.kind_of?(String) or index.kind_of?(Symbol)
1284
2034
  raise ArgumentError, '"mva" argument must be Boolean' unless mva.kind_of?(TrueClass) or mva.kind_of?(FalseClass)
1285
-
2035
+
1286
2036
  raise ArgumentError, '"attrs" argument must be Array' unless attrs.kind_of?(Array)
1287
2037
  attrs.each do |attr|
1288
2038
  raise ArgumentError, '"attrs" argument must be Array of Strings' unless attr.kind_of?(String) or attr.kind_of?(Symbol)
1289
2039
  end
1290
-
2040
+
1291
2041
  raise ArgumentError, '"values" argument must be Hash' unless values.kind_of?(Hash)
1292
2042
  values.each do |id, entry|
1293
2043
  raise ArgumentError, '"values" argument must be Hash map of Integer to Array' unless id.respond_to?(:integer?) and id.integer?
@@ -1304,17 +2054,17 @@ module Sphinx
1304
2054
  end
1305
2055
  end
1306
2056
  end
1307
-
2057
+
1308
2058
  # build request
1309
2059
  request = Request.new
1310
2060
  request.put_string index
1311
-
2061
+
1312
2062
  request.put_int attrs.length
1313
2063
  for attr in attrs
1314
2064
  request.put_string attr
1315
2065
  request.put_int mva ? 1 : 0
1316
2066
  end
1317
-
2067
+
1318
2068
  request.put_int values.length
1319
2069
  values.each do |id, entry|
1320
2070
  request.put_int64 id
@@ -1324,33 +2074,89 @@ module Sphinx
1324
2074
  request.put_int(*entry)
1325
2075
  end
1326
2076
  end
1327
-
2077
+
1328
2078
  response = perform_request(:update, request)
1329
-
2079
+
2080
+ # parse response
2081
+ response.get_int
2082
+ end
2083
+ alias :UpdateAttributes :update_attributes
2084
+
2085
+ # Queries searchd status, and returns an array of status variable name
2086
+ # and value pairs.
2087
+ #
2088
+ # @return [Array<Array>] a table containing searchd status information.
2089
+ #
2090
+ # @example
2091
+ # status = sphinx.status
2092
+ # puts status.map { |key, value| "#{key.rjust(20)}: #{value}" }
2093
+ #
2094
+ def status
2095
+ request = Request.new
2096
+ request.put_int(1)
2097
+ response = perform_request(:status, request)
2098
+
2099
+ # parse response
2100
+ rows, cols = response.get_ints(2)
2101
+ (0...rows).map do
2102
+ (0...cols).map { response.get_string }
2103
+ end
2104
+ end
2105
+ alias :Status :status
2106
+
2107
+ # Force attribute flush, and block until it completes.
2108
+ #
2109
+ # @return [Integer] current internal flush tag on success, -1 on failure.
2110
+ #
2111
+ # @example
2112
+ # sphinx.flush_attrs
2113
+ #
2114
+ def flush_attrs
2115
+ request = Request.new
2116
+ response = perform_request(:flushattrs, request)
2117
+
1330
2118
  # parse response
1331
2119
  begin
1332
- return response.get_int
2120
+ response.get_int
1333
2121
  rescue EOFError
1334
- @error = 'incomplete reply'
1335
- raise SphinxResponseError, @error
2122
+ -1
1336
2123
  end
1337
2124
  end
1338
-
1339
- # persistent connections
1340
-
2125
+ alias :FlushAttrs :flush_attrs
2126
+
2127
+ #=================================================================
2128
+ # Persistent connections
2129
+ #=================================================================
2130
+
1341
2131
  # Opens persistent connection to the server.
1342
2132
  #
1343
- def Open
2133
+ # This method can be used only when a single searchd server
2134
+ # is configured.
2135
+ #
2136
+ # @return [Boolean] +true+ when persistent connection has been
2137
+ # established; otherwise, +false+.
2138
+ #
2139
+ # @example
2140
+ # begin
2141
+ # sphinx.open
2142
+ # # perform several requests
2143
+ # ensure
2144
+ # sphinx.close
2145
+ # end
2146
+ #
2147
+ # @see #close
2148
+ #
2149
+ def open
1344
2150
  if @servers.size > 1
1345
2151
  @error = 'too many servers. persistent socket allowed only for a single server.'
1346
2152
  return false
1347
2153
  end
1348
-
2154
+
1349
2155
  if @servers.first.persistent?
1350
2156
  @error = 'already connected'
1351
2157
  return false;
1352
2158
  end
1353
-
2159
+
1354
2160
  request = Request.new
1355
2161
  request.put_int(1)
1356
2162
 
@@ -1360,85 +2166,64 @@ module Sphinx
1360
2166
 
1361
2167
  true
1362
2168
  end
1363
-
2169
+ alias :Open :open
2170
+
1364
2171
  # Closes previously opened persistent connection.
1365
2172
  #
1366
- def Close
2173
+ # This method can be used only when a single searchd server
2174
+ # is configured.
2175
+ #
2176
+ # @return [Boolean] +true+ when persistent connection has been
2177
+ # closed; otherwise, +false+.
2178
+ #
2179
+ # @example
2180
+ # begin
2181
+ # sphinx.open
2182
+ # # perform several requests
2183
+ # ensure
2184
+ # sphinx.close
2185
+ # end
2186
+ #
2187
+ # @see #open
2188
+ #
2189
+ def close
1367
2190
  if @servers.size > 1
1368
2191
  @error = 'too many servers. persistent socket allowed only for a single server.'
1369
2192
  return false
1370
2193
  end
1371
-
2194
+
1372
2195
  unless @servers.first.persistent?
1373
2196
  @error = 'not connected'
1374
2197
  return false;
1375
2198
  end
1376
-
1377
- @servers.first.close_persistent!
1378
- end
1379
-
1380
- # Queries searchd status, and returns an array of status variable name
1381
- # and value pairs.
1382
- #
1383
- # Usage example:
1384
- #
1385
- # status = sphinx.Status
1386
- # puts status.map { |key, value| "#{key.rjust(20)}: #{value}" }
1387
- #
1388
- def Status
1389
- request = Request.new
1390
- request.put_int(1)
1391
- response = perform_request(:status, request)
1392
2199
 
1393
- # parse response
1394
- begin
1395
- rows, cols = response.get_ints(2)
1396
-
1397
- res = []
1398
- 0.upto(rows - 1) do |i|
1399
- res[i] = []
1400
- 0.upto(cols - 1) do |j|
1401
- res[i] << response.get_string
1402
- end
1403
- end
1404
- rescue EOFError
1405
- @error = 'incomplete reply'
1406
- raise SphinxResponseError, @error
1407
- end
1408
-
1409
- res
2200
+ @servers.first.close_persistent!
1410
2201
  end
1411
-
1412
- def FlushAttrs
1413
- request = Request.new
1414
- response = perform_request(:flushattrs, request)
2202
+ alias :Close :close
1415
2203
 
1416
- # parse response
1417
- begin
1418
- response.get_int
1419
- rescue EOFError
1420
- -1
1421
- end
1422
- end
1423
-
1424
2204
  protected
1425
-
2205
+
1426
2206
  # Connect, send query, get response.
1427
2207
  #
1428
2208
  # Use this method to communicate with Sphinx server. It ensures connection
1429
2209
  # will be instantiated properly, all headers will be generated properly, etc.
1430
2210
  #
1431
- # Parameters:
1432
- # * +command+ -- searchd command to perform (<tt>:search</tt>, <tt>:excerpt</tt>,
2211
+ # @param [Symbol, String] command searchd command to perform (<tt>:search</tt>, <tt>:excerpt</tt>,
1433
2212
  # <tt>:update</tt>, <tt>:keywords</tt>, <tt>:persist</tt>, <tt>:status</tt>,
1434
2213
  # <tt>:query</tt>, <tt>:flushattrs</tt>. See <tt>SEARCHD_COMMAND_*</tt> for details).
1435
- # * +request+ -- an instance of <tt>Sphinx::Request</tt> class. Contains request body.
1436
- # * +additional+ -- additional integer data to be placed between header and body.
1437
- # * +block+ -- if given, response will not be parsed, plain socket will be
1438
- # passed instead. this is special mode used for persistent connections,
1439
- # do not use for other tasks.
2214
+ # @param [Sphinx::Request] request contains request body.
2215
+ # @param [nil, Integer] additional additional integer data to be placed between header and body.
2216
+ #
2217
+ # @yield if block given, response will not be parsed, plain socket
2218
+ # will be yielded instead. This is special mode used for
2219
+ # persistent connections, do not use for other tasks.
2220
+ # @yieldparam [Sphinx::Server] server the server on which the request was performed.
2221
+ # @yieldparam [Sphinx::BufferedIO] socket a socket used to perform the request.
2222
+ # @return [Sphinx::Response] contains response body.
1440
2223
  #
1441
- def perform_request(command, request, additional = nil, &block)
2224
+ # @see #parse_response
2225
+ #
2226
+ def perform_request(command, request, additional = nil)
1442
2227
  with_server do |server|
1443
2228
  cmd = command.to_s.upcase
1444
2229
  command_id = Sphinx::Client.const_get("SEARCHD_COMMAND_#{cmd}")
@@ -1465,26 +2250,31 @@ module Sphinx
1465
2250
  #
1466
2251
  # There are several exceptions which could be thrown in this method:
1467
2252
  #
1468
- # * various network errors -- should be handled by caller (see +with_socket+).
1469
- # * +SphinxResponseError+ -- incomplete reply from searchd.
1470
- # * +SphinxInternalError+ -- searchd error.
1471
- # * +SphinxTemporaryError+ -- temporary searchd error.
1472
- # * +SphinxUnknownError+ -- unknows searchd error.
2253
+ # @param [Sphinx::BufferedIO] socket an input stream object.
2254
+ # @param [Integer] client_version a command version which client supports.
2255
+ # @return [Sphinx::Response] could be used for context-based
2256
+ # parsing of reply from the server.
2257
+ #
2258
+ # @raise [SystemCallError, SocketError] should be handled by caller (see {#with_socket}).
2259
+ # @raise [SphinxResponseError] incomplete reply from searchd.
2260
+ # @raise [SphinxInternalError] searchd internal error.
2261
+ # @raise [SphinxTemporaryError] searchd temporary error.
2262
+ # @raise [SphinxUnknownError] searchd unknown error.
1473
2263
  #
1474
- # Method returns an instance of <tt>Sphinx::Response</tt> class, which
1475
- # could be used for context-based parsing of reply from the server.
2264
+ # @see #with_socket
2265
+ # @private
1476
2266
  #
1477
2267
  def parse_response(socket, client_version)
1478
2268
  response = ''
1479
2269
  status = ver = len = 0
1480
-
1481
- # Read server reply from server. All exceptions are handled by +with_socket+.
2270
+
2271
+ # Read reply from server. All exceptions are handled by {#with_socket}.
1482
2272
  header = socket.read(8)
1483
2273
  if header.length == 8
1484
2274
  status, ver, len = header.unpack('n2N')
1485
2275
  response = socket.read(len) if len > 0
1486
2276
  end
1487
-
2277
+
1488
2278
  # check response
1489
2279
  read = response.length
1490
2280
  if response.empty? or read != len.to_i
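The header handling in this hunk can be tried in isolation. The sketch below is a minimal illustration, not code from the gem: `build_reply` and `split_reply` are hypothetical helpers, and the value of `SEARCHD_OK` is assumed for the demo. It shows how `unpack('n2N')` splits the 8-byte header into two 16-bit big-endian words (status, version) and one 32-bit big-endian word (body length), and how the version word is rendered with `ver >> 8` and `ver & 0xff`, as in the version check later in `parse_response`.

```ruby
# Assumed status value for illustration; the real constant is defined in Sphinx::Client.
SEARCHD_OK = 0

# Hypothetical helper: build a reply with an 8-byte header followed by the body.
def build_reply(status, version, body)
  [status, version, body.bytesize].pack('n2N') + body
end

# Hypothetical helper: split a raw reply into header fields and body,
# using the same 'n2N' format string as parse_response.
def split_reply(raw)
  status, ver, len = raw.byteslice(0, 8).unpack('n2N')
  [status, ver, len, raw.byteslice(8, len)]
end

raw = build_reply(SEARCHD_OK, 0x0116, 'payload')
status, ver, len, body = split_reply(raw)

# The high byte is the major version, the low byte the minor version.
version_label = "#{ver >> 8}.#{ver & 0xff}"
puts "status=#{status} version=#{version_label} len=#{len} body=#{body}"
```

Comparing `body.bytesize` with `len` afterwards corresponds to the `read != len.to_i` check in the hunk.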
@@ -1493,7 +2283,7 @@ module Sphinx
1493
2283
  : 'received zero-sized searchd response'
1494
2284
  raise SphinxResponseError, error
1495
2285
  end
1496
-
2286
+
1497
2287
  # check status
1498
2288
  if (status == SEARCHD_WARNING)
1499
2289
  wlen = response[0, 4].unpack('N*').first
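The `SEARCHD_WARNING` branch reads a 4-byte length from the start of the response. The sketch below assumes, without confirmation from this hunk, that the warning text follows that length prefix and the regular payload comes after it. `split_warning` is a hypothetical helper, not a method of the gem.

```ruby
# Assumed layout: 4-byte big-endian warning length, warning text, then the payload.
# Hypothetical helper for illustration only.
def split_warning(response)
  wlen = response.byteslice(0, 4).unpack('N*').first
  warning = response.byteslice(4, wlen)
  payload = response.byteslice(4 + wlen, response.bytesize - 4 - wlen)
  [warning, payload]
end

message = 'index is stale'
response = [message.bytesize].pack('N') + message + 'rest'
warning, payload = split_warning(response)
puts "warning=#{warning.inspect} payload=#{payload.inspect}"
```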
@@ -1505,34 +2295,40 @@ module Sphinx
1505
2295
  error = 'searchd error: ' + response[4, response.length - 4]
1506
2296
  raise SphinxInternalError, error
1507
2297
  end
1508
-
2298
+
1509
2299
  if status == SEARCHD_RETRY
1510
2300
  error = 'temporary searchd error: ' + response[4, response.length - 4]
1511
2301
  raise SphinxTemporaryError, error
1512
2302
  end
1513
-
2303
+
1514
2304
  unless status == SEARCHD_OK
1515
2305
  error = "unknown status code: '#{status}'"
1516
2306
  raise SphinxUnknownError, error
1517
2307
  end
1518
-
2308
+
1519
2309
  # check version
1520
2310
  if ver < client_version
1521
2311
  @warning = "searchd command v.#{ver >> 8}.#{ver & 0xff} older than client's " +
1522
2312
  "v.#{client_version >> 8}.#{client_version & 0xff}, some options might not work"
1523
2313
  end
1524
-
2314
+
1525
2315
  Response.new(response)
1526
2316
  end
1527
-
2317
+
1528
2318
  # This is internal method which selects next server (round-robin)
1529
2319
  # and yields it to the block passed.
1530
2320
  #
1531
2321
  # In case of connection error, it will try next server several times
1532
- # (see +SetConnectionTimeout+ method details). If all servers are down,
1533
- # it will set +error+ attribute value with the last exception message,
1534
- # and <tt>connection_timeout?</tt> method will return true. Also,
1535
- # +SphinxConnectErorr+ exception will be raised.
2322
+ # (see {#set_connect_timeout} method details). If all servers are down,
2323
+ # it will set error attribute (could be retrieved with {#last_error}
2324
+ # method) with the last exception message, and {#connect_error?}
2325
+ # method will return true. Also, {SphinxConnectError} exception
2326
+ # will be raised.
2327
+ #
2328
+ # @yield a block which performs request on a given server.
2329
+ # @yieldparam [Sphinx::Server] server contains information
2330
+ # about the server to perform request on.
2331
+ # @raise [SphinxConnectError] on any connection error.
1536
2332
  #
1537
2333
  def with_server
1538
2334
  attempts = @retries
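The behaviour described in the `with_server` comment (round-robin selection, a bounded number of attempts, the last error message stored before re-raising) can be sketched as follows. This is a simplified illustration under stated assumptions, not the gem's implementation: `RoundRobin` and `ConnectError` are hypothetical stand-ins for the client and `SphinxConnectError`.

```ruby
# Hypothetical stand-in for SphinxConnectError.
class ConnectError < StandardError; end

# Hypothetical, simplified model of round-robin server selection with retries.
class RoundRobin
  attr_reader :last_error

  def initialize(servers, retries)
    @servers = servers
    @retries = retries
    @index = 0
    @last_error = nil
  end

  # Yields the next server; on a connection error moves on to the following one
  # until the attempts are used up, then records the message and re-raises.
  def with_server
    attempts = @retries
    begin
      server = @servers[@index % @servers.size]
      @index += 1
      yield server
    rescue ConnectError => e
      attempts -= 1
      retry if attempts > 0
      @last_error = e.message
      raise
    end
  end
end

pool = RoundRobin.new(%w[a b c], 3)
tried = []
result = pool.with_server do |server|
  tried << server
  raise ConnectError, "#{server} is down" if server != 'c'
  "served by #{server}"
end
puts "tried=#{tried.inspect} result=#{result}"
```

Because `retry` re-runs the whole `begin` body, each failed attempt picks a new server before the block is yielded to again.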
@@ -1552,29 +2348,39 @@ module Sphinx
1552
2348
  raise
1553
2349
  end
1554
2350
  end
1555
-
2351
+
1556
2352
  # This is internal method which retrieves socket for a given server,
1557
2353
  # initiates Sphinx session, and yields this socket to a block passed.
1558
2354
  #
1559
- # In case of any problems with session initiation, +SphinxConnectError+
2355
+ # In case of any problems with session initiation, {SphinxConnectError}
1560
2356
  # will be raised, because this is part of connection establishing. See
1561
- # +with_server+ method details to get more infromation about how this
2357
+ # {#with_server} method details to get more information about how this
1562
2358
  # exception is handled.
1563
2359
  #
1564
2360
  # Socket retrieving routine is wrapped in a block with its own
1565
- # timeout value (see +SetConnectTimeout+). This is done in
1566
- # <tt>Server#get_socket</tt> method, so check it for details.
2361
+ # timeout value (see {#set_connect_timeout}). This is done in
2362
+ # {Server#get_socket} method, so check it for details.
1567
2363
  #
1568
2364
  # Request execution is wrapped with block with another timeout
1569
- # (see +SetRequestTimeout+). This ensures no Sphinx request will
2365
+ # (see {#set_request_timeout}). This ensures no Sphinx request will
1570
2366
  # take unreasonable time.
1571
2367
  #
1572
2368
  # In case of any Sphinx error (incomplete reply, internal or temporary
1573
2369
  # error), connection to the server will be re-established, and request
1574
- # will be retried (see +SetRequestTimeout+). Of course, if connection
2370
+ # will be retried (see {#set_request_timeout}). Of course, if connection
1575
2371
  # could not be established, next server will be selected (see explanation
1576
2372
  # above).
1577
2373
  #
2374
+ # @param [Sphinx::Server] server contains information
2375
+ # about the server to perform request on.
2376
+ # @yield a block which will actually perform the request.
2377
+ # @yieldparam [Sphinx::BufferedIO] socket a socket used to
2378
+ # perform the request.
2379
+ #
2380
+ # @raise [SphinxResponseError, SphinxInternalError, SphinxTemporaryError, SphinxUnknownError]
2381
+ # on any response error.
2382
+ # @raise [SphinxConnectError] on any connection error.
2383
+ #
1578
2384
  def with_socket(server)
1579
2385
  attempts = @reqretries
1580
2386
  socket = nil
@@ -1612,14 +2418,14 @@ module Sphinx
1612
2418
  new_e.set_backtrace(e.backtrace)
1613
2419
  e = new_e
1614
2420
  end
1615
-
2421
+
1616
2422
  # Close previously opened socket (in case it was actually opened)
1617
2423
  server.free_socket(socket)
1618
2424
 
1619
2425
  # Request error! Do we need to try it again?
1620
2426
  attempts -= 1
1621
2427
  retry if attempts > 0
1622
-
2428
+
1623
2429
  # Re-raise original exception
1624
2430
  @error = e.message
1625
2431
  raise e
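The last hunk replaces the caught exception with a new one while keeping the original backtrace (`new_e.set_backtrace(e.backtrace)`). The sketch below shows that wrapping step on its own. `WrappedConnectError` and `wrap_error` are hypothetical names chosen for the demo; which exception classes the gem actually wraps is not visible in this hunk.

```ruby
# Hypothetical stand-in for the gem's connection error class.
class WrappedConnectError < StandardError; end

# Hypothetical helper: wrap a low-level error in a domain-specific one,
# carrying over the original backtrace so the failure site is not lost.
def wrap_error(e)
  new_e = WrappedConnectError.new("#{e.class}: #{e.message}")
  new_e.set_backtrace(e.backtrace)
  new_e
end

wrapped = begin
  raise IOError, 'socket closed'
rescue IOError => e
  wrap_error(e)
end

puts wrapped.message
puts wrapped.backtrace.first
```

Keeping the original backtrace means a caller that rescues only the wrapped class still sees where the underlying I/O failure occurred.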