sphinx 0.9.10.2043 → 0.9.10.2091

data/.gitignore CHANGED
@@ -1,2 +1,4 @@
1
1
  rdoc
2
+ doc
3
+ .yardoc
2
4
  pkg
data/README.rdoc CHANGED
@@ -1,10 +1,10 @@
1
- =Sphinx Client API 0.9.10
1
+ = Sphinx Client API 0.9.10
2
2
 
3
- This document gives an overview of what is Sphinx itself and how to use in
4
- within Ruby on Rails. For more information or documentation,
3
+ This document gives an overview of what is Sphinx itself and how to use it
4
+ from your Ruby on Rails application. For more information or documentation,
5
5
  please go to http://www.sphinxsearch.com
6
6
 
7
- ==Sphinx
7
+ == Sphinx
8
8
 
9
9
  Sphinx is a standalone full-text search engine, meant to provide fast,
10
10
  size-efficient and relevant fulltext search functions to other applications.
@@ -12,37 +12,191 @@ Sphinx was specially designed to integrate well with SQL databases and
12
12
  scripting languages. Currently built-in data sources support fetching data
13
13
  either via direct connection to MySQL, or from an XML pipe.
14
14
 
15
- Simplest way to communicate with Sphinx is to use <tt>searchd</tt> -
16
- a daemon to search through fulltext indices from external software.
15
+ The simplest way to communicate with Sphinx is to use <tt>searchd</tt>,
16
+ a daemon to search through full text indexes from external software.
17
17
 
18
- ==Compatibility
18
+ == Installation
19
19
 
20
- This version supports all API features of Sphinx 0.9.10-r2043. Full compatibility list:
20
+ There are two options when approaching sphinx plugin installation:
21
21
 
22
- * <tt>0.9.10</tt> Sphinx 0.9.10-r2043
23
- * <tt>0.9.9</tt> Sphinx 0.9.9-r1299
22
+ * using the gem (recommended)
23
+ * install as a Rails plugin
24
24
 
25
- ==Documentation
25
+ To install as a gem, add this to your environment.rb:
26
26
 
27
- You can create the documentation by running:
27
+ config.gem 'sphinx', :source => 'http://gemcutter.org'
28
28
 
29
- rake rdoc
29
+ And then run the command:
30
30
 
31
- ==Latest version
31
+ sudo rake gems:install
32
32
 
33
- You can always get latest version from
33
+ To install Sphinx as a Rails plugin use this:
34
+
35
+ script/plugin install git://github.com/kpumuk/sphinx.git
36
+
37
+ == Documentation
38
+
39
+ Complete Sphinx plugin documentation can be found here:
40
+ http://kpumuk.github.com/sphinx
41
+
42
+ Also you can find documentation on rdoc.info:
43
+ http://rdoc.info/projects/kpumuk/sphinx
44
+
45
+ You can build the documentation locally by running:
46
+
47
+ rake yard
48
+
49
+ Please note: you should have yard gem installed on your system:
50
+
51
+ sudo gem install yard --source http://gemcutter.org
52
+
53
+ Complete Sphinx API documentation can be found on the Sphinx Search Engine
54
+ site: http://www.sphinxsearch.com/docs/current.html
55
+ This plugin is fully compatible with original PHP API implementation.
56
+
57
+ == Ruby naming conventions
58
+
59
+ Sphinx Client API supports Ruby naming conventions, so every API
60
+ method name is in underscored, lowercase form:
61
+
62
+ SetServer -> set_server
63
+ RunQueries -> run_queries
64
+ SetMatchMode -> set_match_mode
65
+
66
+ Every method is aliased to a corresponding one from standard Sphinx
67
+ API, so you can use both <tt>SetServer</tt> and <tt>set_server</tt>
68
+ with no difference.
69
+
70
+ There are three exceptions to this naming rule:
71
+
72
+ GetLastError -> last_error
73
+ GetLastWarning -> last_warning
74
+ IsConnectError -> connect_error?
75
+
76
+ Of course, all of them are aliased to the original method names.
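The naming rule above can be illustrated with a short sketch. This is not the library's own code: the `underscore` helper and its local exceptions table are hypothetical names introduced only to show how the CamelCase names map to their Ruby-style forms.

```ruby
# Hypothetical helper (not part of the Sphinx gem): derives the Ruby-style
# method name from an original Sphinx API name.
def underscore(api_name)
  # The three documented exceptions to the general rule.
  exceptions = {
    'GetLastError'   => 'last_error',
    'GetLastWarning' => 'last_warning',
    'IsConnectError' => 'connect_error?'
  }
  exceptions.fetch(api_name) do
    # Put "_" between a lowercase letter (or digit) and the capital that follows it.
    api_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end
end

puts underscore('SetServer')
puts underscore('RunQueries')
puts underscore('SetMatchMode')
puts underscore('IsConnectError')
```

In the gem itself the mapping is declared explicitly with `alias`, so no conversion happens at run time.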
77
+
78
+ == Using multiple Sphinx servers
79
+
80
+ Since we actively use this plugin in our Scribd development workflow,
81
+ several methods have been added to accommodate our needs.
82
+ You can find documentation on Ruby-specific methods in documentation:
83
+ http://rdoc.info/projects/kpumuk/sphinx
84
+
85
+ First of all, we added support of multiple Sphinx servers to balance
86
+ load between them. Also it means that in case of any problems with one
87
+ of servers, library will try to fetch the results from another one.
88
+ Every subsequent request will be executed on the next server in the list
89
+ (round-robin technique).
90
+
91
+ sphinx.set_servers([
92
+ { :host => 'browse01.local', :port => 3312 },
93
+ { :host => 'browse02.local', :port => 3312 },
94
+ { :host => 'browse03.local', :port => 3312 }
95
+ ])
96
+
97
+ By default library will try to fetch results from a single server, and
98
+ fail if it does not respond. To set the number of retries to perform,
99
+ you can use second (additional) parameter of the <tt>set_connect_timeout</tt>
100
+ and <tt>set_request_timeout</tt> methods:
101
+
102
+ sphinx.set_connect_timeout(1, 3)
103
+ sphinx.set_request_timeout(1, 3)
104
+
105
+ There is a big difference between these two methods. The first affects
106
+ only requests experiencing connection problems (socket error,
107
+ pipe error, etc.); the second is used when a request is broken somehow
108
+ (temporary searchd error, incomplete reply, etc). The workflow looks like
109
+ this:
110
+
111
+ 1. Increase retries number. If it is less than or equal to the configured value,
112
+ try to connect to the next server. Otherwise, raise an error.
113
+ 2. In case of connection problem go to 1.
114
+ 3. Increase request retries number. If it is less than or equal to the configured
115
+ value, try to perform request. Otherwise, raise an error.
116
+ 4. In case of connection problem go to 1.
117
+ 5. In case of request problem, go to 3.
118
+ 6. Parse and return response.
119
+
120
+ Conclusions:
121
+
122
+ 1. Request could be performed <tt>connect_retries</tt> * <tt>request_retries</tt>
123
+ times. E.g., it could be tried <tt>request_retries</tt> times on each
124
+ of <tt>connect_retries</tt> servers (when you have 1 server configured,
125
+ but <tt>connect_retries</tt> is 5, library will try to connect to this
126
+ server 5 times).
127
+ 2. Request could be tried to execute on each server <tt>1..request_retries</tt>
128
+ times. In case of connection problem, request will be moved to another
129
+ server immediately.
130
+
131
+ Usually you will set <tt>connect_retries</tt> equal to servers number,
132
+ so you will be sure each failing request will be performed on all servers.
133
+ This means that if one of servers is live, but others are dead, your request
134
+ will be finally executed successfully.
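The retry workflow described in this section can be sketched as a small simulation. This is only an illustration under stated assumptions, not the gem's implementation: `perform_with_retries` is a hypothetical function, servers are modelled as lambdas, and a lambda returns `:connect_problem`, `:request_problem`, or a response value. Connecting and sending the request are collapsed into a single call for brevity.

```ruby
# Hypothetical simulation of the connect/request retry loop.
# servers         - Array of lambdas standing in for searchd servers
# connect_retries - maximum number of connection attempts (across servers)
# request_retries - maximum number of request attempts per connection
def perform_with_retries(servers, connect_retries, request_retries)
  connect_attempts = 0
  next_server = 0
  loop do
    # Step 1: count the connection attempt and pick the next server (round-robin).
    connect_attempts += 1
    raise 'connection retries exhausted' if connect_attempts > connect_retries
    server = servers[next_server % servers.size]
    next_server += 1

    request_attempts = 0
    loop do
      # Step 3: count the request attempt on the current server.
      request_attempts += 1
      raise 'request retries exhausted' if request_attempts > request_retries
      outcome = server.call
      case outcome
      when :connect_problem then break  # steps 2 and 4: back to step 1
      when :request_problem then next   # step 5: back to step 3
      else return outcome               # step 6: return the response
      end
    end
  end
end

dead = lambda { :connect_problem }
live = lambda { 'response' }
puts perform_with_retries([dead, dead, live], 3, 1)
```

With three servers and `connect_retries` equal to 3, the request reaches the only live server. With `connect_retries` set to 2 the same call would raise before trying it, which is why the text suggests matching `connect_retries` to the number of servers.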
135
+
136
+ == Sphinx constants
137
+
138
+ Most Sphinx API methods expect special constants to be passed.
139
+ For example:
140
+
141
+ sphinx.set_match_mode(Sphinx::Client::SPH_MATCH_ANY)
142
+
143
+ Please note that these constants are defined in the <tt>Sphinx::Client</tt>
144
+ namespace. You can use symbols or strings instead of these awful
145
+ constants:
146
+
147
+ sphinx.set_match_mode(:any)
148
+ sphinx.set_match_mode('any')
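One way a symbol or string could be resolved to its constant is shown below. This is a sketch under assumptions, not the gem's actual lookup: `MatchModes` and `resolve` are hypothetical names, and only two of the match mode constants are reproduced.

```ruby
# Hypothetical stand-in for Sphinx::Client holding two match mode constants.
module MatchModes
  SPH_MATCH_ALL = 0
  SPH_MATCH_ANY = 1

  # Accepts an Integer constant, a Symbol, or a String and returns the Integer.
  # Unknown names raise NameError from const_get.
  def self.resolve(mode)
    return mode if mode.is_a?(Integer)
    const_get("SPH_MATCH_#{mode.to_s.upcase}")
  end
end

puts MatchModes.resolve(:any)
puts MatchModes.resolve('all')
puts MatchModes.resolve(MatchModes::SPH_MATCH_ANY)
```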
149
+
150
+ == Setting query filters
151
+
152
+ Every <tt>set_</tt> method returns <tt>Sphinx::Client</tt> object itself.
153
+ It means that you can chain filtering methods:
154
+
155
+ results = Sphinx::Client.new.
156
+ set_match_mode(:any).
157
+ set_ranking_mode(:bm25).
158
+ set_id_range(10, 1000).
159
+ query('test')
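Chaining works because each setter returns the receiver. The sketch below shows the pattern on a hypothetical `TinyClient` class; it is not the real `Sphinx::Client` and only stores the values it is given.

```ruby
# Hypothetical miniature client: each set_ method stores a value and returns self.
class TinyClient
  attr_reader :settings

  def initialize
    @settings = {}
  end

  def set_match_mode(mode)
    @settings[:match_mode] = mode
    self # returning self is what makes chaining possible
  end

  def set_id_range(min, max)
    @settings[:id_range] = [min, max]
    self
  end
end

client = TinyClient.new.set_match_mode(:any).set_id_range(10, 1000)
p client.settings
```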
160
+
161
+ == Example
162
+
163
+ This simple example illustrates establishing a basic connection,
164
+ retrieving search results, and building excerpts. Please note
165
+ how it performs the database select using ActiveRecord to
166
+ preserve the order of records established by Sphinx.
167
+
168
+ sphinx = Sphinx::Client.new
169
+ result = sphinx.query('test')
170
+ ids = result['matches'].map { |match| match['id'] }
171
+ posts = Post.all :conditions => { :id => ids },
172
+ :order => "FIELD(id,#{ids.join(',')})"
173
+
174
+ docs = posts.map(&:body)
175
+ excerpts = sphinx.build_excerpts(docs, 'index', 'test')
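`FIELD()` is a MySQL function, so the ordering trick in the example depends on that database. On other databases the same order could be restored in Ruby after the select. The sketch below assumes a hypothetical `order_by_ids` helper and uses a `Struct` as a stand-in for an ActiveRecord model.

```ruby
# Hypothetical stand-in for an ActiveRecord model.
PostRow = Struct.new(:id, :body)

# Returns the records arranged to follow the order of `ids`.
# Ids without a matching record are skipped.
def order_by_ids(records, ids)
  by_id = {}
  records.each { |record| by_id[record.id] = record }
  ids.map { |id| by_id[id] }.compact
end

ids   = [7, 3, 5] # order established by Sphinx
posts = [PostRow.new(3, 'c'), PostRow.new(5, 'e'), PostRow.new(7, 'g')] # database order
p order_by_ids(posts, ids).map(&:id)
```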
176
+
177
+ == Support
178
+
179
+ Source code:
180
+ http://github.com/kpumuk/sphinx
181
+
182
+ To suggest a feature or report a bug:
183
+ http://github.com/kpumuk/sphinx/issues
184
+
185
+ Project home page:
34
186
  http://kpumuk.info/projects/ror-plugins/sphinx
35
187
 
36
- ==Credits
188
+ == Credits
37
189
 
38
190
  Dmytro Shteflyuk <kpumuk@kpumuk.info> http://kpumuk.info
39
191
 
40
- Andrew Aksyonoff http://sphinxsearch.com/
192
+ Andrew Aksyonoff http://sphinxsearch.com
41
193
 
42
194
  Special thanks to Alexey Kovyrin <alexey@kovyrin.net> http://blog.kovyrin.net
43
195
 
196
+ Special thanks to Mike Perham http://www.mikeperham.com for his awesome
197
+ memcache-client gem, from which the latest Sphinx gem got its new socket handling.
198
+
44
199
  ==License
45
200
 
46
201
  This library is distributed under the terms of the Ruby license.
47
202
  You can freely distribute/modify this library.
48
-
data/Rakefile CHANGED
@@ -1,6 +1,4 @@
1
1
  require 'rake'
2
- require 'spec/rake/spectask'
3
- require 'rake/rdoctask'
4
2
 
5
3
  begin
6
4
  require 'jeweler'
@@ -17,20 +15,31 @@ rescue LoadError
17
15
  puts 'Jeweler not available. Install it with: sudo gem install jeweler'
18
16
  end
19
17
 
20
- desc 'Default: run specs'
21
- task :default => :spec
18
+ begin
19
+ require 'spec/rake/spectask'
20
+
21
+ desc 'Default: run specs'
22
+ task :default => :spec
22
23
 
23
- desc 'Test the sphinx plugin'
24
- Spec::Rake::SpecTask.new(:spec) do |t|
25
- t.libs << 'lib'
26
- t.pattern = 'spec/*_spec.rb'
24
+ desc 'Test the sphinx plugin'
25
+ Spec::Rake::SpecTask.new do |t|
26
+ t.libs << 'lib'
27
+ t.pattern = 'spec/*_spec.rb'
28
+ end
29
+ rescue LoadError
30
+ puts 'RSpec not available. Install it with: sudo gem install rspec'
27
31
  end
28
32
 
29
- desc 'Generate documentation for the sphinx plugin'
30
- Rake::RDocTask.new(:rdoc) do |rdoc|
31
- rdoc.rdoc_dir = 'rdoc'
32
- rdoc.title = 'Sphinx Client API'
33
- rdoc.options << '--line-numbers' << '--inline-source'
34
- rdoc.rdoc_files.include('README.rdoc')
35
- rdoc.rdoc_files.include('lib/**/*.rb')
33
+ begin
34
+ require 'yard'
35
+ YARD::Rake::YardocTask.new(:yard) do |t|
36
+ t.options = ['--title', 'Sphinx Client API Documentation']
37
+ if ENV['PRIVATE']
38
+ t.options.concat ['--protected', '--private']
39
+ else
40
+ t.options << '--no-private'
41
+ end
42
+ end
43
+ rescue LoadError
44
+ puts 'Yard not available. Install it with: sudo gem install yard'
36
45
  end
data/VERSION.yml CHANGED
@@ -2,4 +2,4 @@
2
2
  :major: 0
3
3
  :minor: 9
4
4
  :patch: 10
5
- :build: 2043
5
+ :build: 2091
data/lib/sphinx.rb CHANGED
@@ -1,9 +1,21 @@
1
- require 'socket'
2
- require 'net/protocol'
3
-
1
+ # Sphinx Client API
2
+ #
3
+ # Author:: Dmytro Shteflyuk <mailto:kpumuk@kpumuk.info>.
4
+ # Copyright:: Copyright (c) 2006 — 2009 Dmytro Shteflyuk
5
+ # License:: Distributes under the same terms as Ruby
6
+ # Version:: 0.9.10-r2091
7
+ # Website:: http://kpumuk.info/projects/ror-plugins/sphinx
8
+ # Sources:: http://github.com/kpumuk/sphinx
9
+ #
10
+ # This library is distributed under the terms of the Ruby license.
11
+ # You can freely distribute/modify this library.
12
+ #
4
13
  module Sphinx
5
14
  end
6
15
 
16
+ require 'socket'
17
+ require 'net/protocol'
18
+
7
19
  require File.dirname(__FILE__) + '/sphinx/request'
8
20
  require File.dirname(__FILE__) + '/sphinx/response'
9
21
  require File.dirname(__FILE__) + '/sphinx/timeout'
@@ -1,3 +1,7 @@
1
+ # A simple wrapper around <tt>Net::BufferedIO</tt> performing
2
+ # non-blocking select.
3
+ #
4
+ # @private
1
5
  class Sphinx::BufferedIO < Net::BufferedIO # :nodoc:
2
6
  BUFSIZE = 1024 * 16
3
7
 
data/lib/sphinx/client.rb CHANGED
@@ -1,120 +1,149 @@
1
- # = client.rb - Sphinx Client API
2
- #
3
- # Author:: Dmytro Shteflyuk <mailto:kpumuk@kpumuk.info>.
4
- # Copyright:: Copyright (c) 2006 — 2009 Dmytro Shteflyuk
5
- # License:: Distributes under the same terms as Ruby
6
- # Version:: 0.9.10-r2043
7
- # Website:: http://kpumuk.info/projects/ror-plugins/sphinx
8
- #
9
- # This library is distributed under the terms of the Ruby license.
10
- # You can freely distribute/modify this library.
11
-
12
- # ==Sphinx Client API
13
- #
14
- # The Sphinx Client API is used to communicate with <tt>searchd</tt>
15
- # daemon and get search results from Sphinx.
16
- #
17
- # ===Usage
18
- #
19
- # sphinx = Sphinx::Client.new
20
- # result = sphinx.Query('test')
21
- # ids = result['matches'].map { |match| match['id'] }.join(',')
22
- # posts = Post.find :all, :conditions => "id IN (#{ids})"
23
- #
24
- # docs = posts.map(&:body)
25
- # excerpts = sphinx.BuildExcerpts(docs, 'index', 'test')
26
-
27
1
  module Sphinx
28
- # :stopdoc:
29
-
2
+ # Base class for all Sphinx errors
30
3
  class SphinxError < StandardError; end
4
+ # Connect error occurred on the API side.
31
5
  class SphinxConnectError < SphinxError; end
6
+ # Request error occurred on the API side.
32
7
  class SphinxResponseError < SphinxError; end
8
+ # Internal error occurred inside searchd.
33
9
  class SphinxInternalError < SphinxError; end
10
+ # Temporary error occurred inside searchd.
34
11
  class SphinxTemporaryError < SphinxError; end
12
+ # Unknown error occurred inside searchd.
35
13
  class SphinxUnknownError < SphinxError; end
36
14
 
37
- # :startdoc:
38
-
15
+ # The Sphinx Client API is used to communicate with <tt>searchd</tt>
16
+ # daemon and perform requests.
17
+ #
18
+ # @example
19
+ # sphinx = Sphinx::Client.new
20
+ # result = sphinx.query('test')
21
+ # ids = result['matches'].map { |match| match['id'] }
22
+ # posts = Post.all :conditions => { :id => ids },
23
+ # :order => "FIELD(id,#{ids.join(',')})"
24
+ #
25
+ # docs = posts.map(&:body)
26
+ # excerpts = sphinx.build_excerpts(docs, 'index', 'test')
27
+ #
39
28
  class Client
40
- # :stopdoc:
41
-
29
+ #=================================================================
42
30
  # Known searchd commands
43
-
31
+ #=================================================================
32
+
44
33
  # search command
34
+ # @private
45
35
  SEARCHD_COMMAND_SEARCH = 0
46
36
  # excerpt command
37
+ # @private
47
38
  SEARCHD_COMMAND_EXCERPT = 1
48
39
  # update command
40
+ # @private
49
41
  SEARCHD_COMMAND_UPDATE = 2
50
42
  # keywords command
43
+ # @private
51
44
  SEARCHD_COMMAND_KEYWORDS = 3
52
45
  # persist command
46
+ # @private
53
47
  SEARCHD_COMMAND_PERSIST = 4
54
48
  # status command
49
+ # @private
55
50
  SEARCHD_COMMAND_STATUS = 5
56
51
  # query command
52
+ # @private
57
53
  SEARCHD_COMMAND_QUERY = 6
58
54
  # flushattrs command
55
+ # @private
59
56
  SEARCHD_COMMAND_FLUSHATTRS = 7
60
-
57
+
58
+ #=================================================================
61
59
  # Current client-side command implementation versions
62
-
60
+ #=================================================================
61
+
63
62
  # search command version
63
+ # @private
64
64
  VER_COMMAND_SEARCH = 0x117
65
65
  # excerpt command version
66
+ # @private
66
67
  VER_COMMAND_EXCERPT = 0x100
67
68
  # update command version
69
+ # @private
68
70
  VER_COMMAND_UPDATE = 0x102
69
71
  # keywords command version
72
+ # @private
70
73
  VER_COMMAND_KEYWORDS = 0x100
71
74
  # persist command version
75
+ # @private
72
76
  VER_COMMAND_PERSIST = 0x000
73
77
  # status command version
78
+ # @private
74
79
  VER_COMMAND_STATUS = 0x100
75
80
  # query command version
81
+ # @private
76
82
  VER_COMMAND_QUERY = 0x100
77
83
  # flushattrs command version
84
+ # @private
78
85
  VER_COMMAND_FLUSHATTRS = 0x100
79
-
86
+
87
+ #=================================================================
80
88
  # Known searchd status codes
81
-
89
+ #=================================================================
90
+
82
91
  # general success, command-specific reply follows
92
+ # @private
83
93
  SEARCHD_OK = 0
84
94
  # general failure, command-specific reply may follow
95
+ # @private
85
96
  SEARCHD_ERROR = 1
86
97
  # temporary failure, client should retry later
98
+ # @private
87
99
  SEARCHD_RETRY = 2
88
- # general success, warning message and command-specific reply follow
89
- SEARCHD_WARNING = 3
100
+ # general success, warning message and command-specific reply follow
101
+ # @private
102
+ SEARCHD_WARNING = 3
90
103
 
104
+ #=================================================================
105
+ # Some internal attributes to use inside client API
106
+ #=================================================================
107
+
108
+ # List of searchd servers to connect to.
109
+ # @private
91
110
  attr_reader :servers
111
+ # Connection timeout in seconds.
112
+ # @private
92
113
  attr_reader :timeout
114
+ # Number of connection retries.
115
+ # @private
93
116
  attr_reader :retries
117
+ # Request timeout in seconds.
118
+ # @private
94
119
  attr_reader :reqtimeout
120
+ # Number of request retries.
121
+ # @private
95
122
  attr_reader :reqretries
96
-
97
- # :startdoc:
98
-
123
+
124
+ #=================================================================
99
125
  # Known match modes
100
-
126
+ #=================================================================
127
+
101
128
  # match all query words
102
- SPH_MATCH_ALL = 0
129
+ SPH_MATCH_ALL = 0
103
130
  # match any query word
104
- SPH_MATCH_ANY = 1
131
+ SPH_MATCH_ANY = 1
105
132
  # match this exact phrase
106
- SPH_MATCH_PHRASE = 2
133
+ SPH_MATCH_PHRASE = 2
107
134
  # match this boolean query
108
- SPH_MATCH_BOOLEAN = 3
135
+ SPH_MATCH_BOOLEAN = 3
109
136
  # match this extended query
110
- SPH_MATCH_EXTENDED = 4
137
+ SPH_MATCH_EXTENDED = 4
111
138
  # match all document IDs w/o fulltext query, apply filters
112
139
  SPH_MATCH_FULLSCAN = 5
113
140
  # extended engine V2 (TEMPORARY, WILL BE REMOVED IN 0.9.8-RELEASE)
114
141
  SPH_MATCH_EXTENDED2 = 6
115
-
142
+
143
+ #=================================================================
116
144
  # Known ranking modes (ext2 only)
117
-
145
+ #=================================================================
146
+
118
147
  # default mode, phrase proximity major factor and BM25 minor one
119
148
  SPH_RANK_PROXIMITY_BM25 = 0
120
149
  # statistical mode, BM25 ranking only (faster but worse quality)
@@ -131,9 +160,11 @@ module Sphinx
131
160
  SPH_RANK_FIELDMASK = 6
132
161
  # codename SPH04, phrase proximity + bm25 + head/exact boost
133
162
  SPH_RANK_SPH04 = 7
134
-
163
+
164
+ #=================================================================
135
165
  # Known sort modes
136
-
166
+ #=================================================================
167
+
137
168
  # sort by document relevance desc, then by date
138
169
  SPH_SORT_RELEVANCE = 0
139
170
  # sort by document date desc, then by relevance desc
@@ -146,23 +177,27 @@ module Sphinx
146
177
  SPH_SORT_EXTENDED = 4
147
178
  # sort by arithmetic expression in descending order (eg. "@id + max(@weight,1000)*boost + log(price)")
148
179
  SPH_SORT_EXPR = 5
149
-
180
+
181
+ #=================================================================
150
182
  # Known filter types
151
-
183
+ #=================================================================
184
+
152
185
  # filter by integer values set
153
186
  SPH_FILTER_VALUES = 0
154
187
  # filter by integer range
155
188
  SPH_FILTER_RANGE = 1
156
189
  # filter by float range
157
190
  SPH_FILTER_FLOATRANGE = 2
158
-
191
+
192
+ #=================================================================
159
193
  # Known attribute types
160
-
194
+ #=================================================================
195
+
161
196
  # this attr is just an integer
162
197
  SPH_ATTR_INTEGER = 1
163
198
  # this attr is a timestamp
164
199
  SPH_ATTR_TIMESTAMP = 2
165
- # this attr is an ordinal string number (integer at search time,
200
+ # this attr is an ordinal string number (integer at search time,
166
201
  # specially handled at indexing time)
167
202
  SPH_ATTR_ORDINAL = 3
168
203
  # this attr is a boolean bit field
@@ -175,23 +210,25 @@ module Sphinx
175
210
  SPH_ATTR_STRING = 7
176
211
  # this attr has multiple values (0 or more)
177
212
  SPH_ATTR_MULTI = 0x40000000
178
-
213
+
214
+ #=================================================================
179
215
  # Known grouping functions
180
-
216
+ #=================================================================
217
+
181
218
  # group by day
182
219
  SPH_GROUPBY_DAY = 0
183
220
  # group by week
184
- SPH_GROUPBY_WEEK = 1
221
+ SPH_GROUPBY_WEEK = 1
185
222
  # group by month
186
- SPH_GROUPBY_MONTH = 2
223
+ SPH_GROUPBY_MONTH = 2
187
224
  # group by year
188
225
  SPH_GROUPBY_YEAR = 3
189
226
  # group by attribute value
190
227
  SPH_GROUPBY_ATTR = 4
191
228
  # group by sequential attrs pair
192
229
  SPH_GROUPBY_ATTRPAIR = 5
193
-
194
- # Constructs the <tt>Sphinx::Client</tt> object and sets options to their default values.
230
+
231
+ # Constructs the <tt>Sphinx::Client</tt> object and sets options to their default values.
195
232
  def initialize
196
233
  # per-query settings
197
234
  @offset = 0 # how many records to seek from result-set start (default is 0)
@@ -214,16 +251,16 @@ module Sphinx
214
251
  @anchor = [] # geographical anchor point
215
252
  @indexweights = [] # per-index weights
216
253
  @ranker = SPH_RANK_PROXIMITY_BM25 # ranking mode (default is SPH_RANK_PROXIMITY_BM25)
217
- @maxquerytime = 0 # max query time, milliseconds (default is 0, do not limit)
254
+ @maxquerytime = 0 # max query time, milliseconds (default is 0, do not limit)
218
255
  @fieldweights = {} # per-field-name weights
219
256
  @overrides = [] # per-query attribute values overrides
220
257
  @select = '*' # select-list (attributes or expressions, with optional aliases)
221
-
258
+
222
259
  # per-reply fields (for single-query case)
223
260
  @error = '' # last error message
224
261
  @warning = '' # last warning message
225
262
  @connerror = false # connection error vs remote error flag
226
-
263
+
227
264
  @reqs = [] # requests storage (for multi-query case)
228
265
  @mbenc = '' # stored mbstring encoding
229
266
  @timeout = 0 # connect timeout
@@ -233,58 +270,104 @@ module Sphinx
233
270
 
234
271
  # per-client-object settings
235
272
  # searchd servers list
236
- @servers = [Sphinx::Server.new(self, 'localhost', 3312, false)].freeze
273
+ @servers = [Sphinx::Server.new(self, 'localhost', 9312, false)].freeze
237
274
  @lastserver = -1
238
275
  end
239
-
276
+
277
+ #=================================================================
278
+ # General API functions
279
+ #=================================================================
280
+
240
281
  # Returns last error message, as a string, in human readable format. If there
241
282
  # were no errors during the previous API call, empty string is returned.
242
283
  #
243
- # You should call it when any other function (such as +Query+) fails (typically,
284
+ # You should call it when any other function (such as {#query}) fails (typically,
244
285
  # the failing function returns false). The returned string will contain the
245
286
  # error description.
246
287
  #
247
288
  # The error message is not reset by this call; so you can safely call it
248
289
  # several times if needed.
249
290
  #
250
- def GetLastError
291
+ # @return [String] last error message.
292
+ #
293
+ # @example
294
+ # puts sphinx.last_error
295
+ #
296
+ # @see #last_warning
297
+ # @see #connect_error?
298
+ #
299
+ def last_error
251
300
  @error
252
301
  end
253
-
302
+ alias :GetLastError :last_error
303
+
254
304
  # Returns last warning message, as a string, in human readable format. If there
255
305
  # were no warnings during the previous API call, empty string is returned.
256
306
  #
257
- # You should call it to verify whether your request (such as +Query+) was
307
+ # You should call it to verify whether your request (such as {#query}) was
258
308
  # completed but with warnings. For instance, search query against a distributed
259
309
  # index might complete successfully even if several remote agents timed out.
260
310
  # In that case, a warning message would be produced.
261
- #
311
+ #
262
312
  # The warning message is not reset by this call; so you can safely call it
263
313
  # several times if needed.
264
314
  #
265
- def GetLastWarning
315
+ # @return [String] last warning message.
316
+ #
317
+ # @example
318
+ # puts sphinx.last_warning
319
+ #
320
+ # @see #last_error
321
+ # @see #connect_error?
322
+ #
323
+ def last_warning
266
324
  @warning
267
325
  end
268
-
326
+ alias :GetLastWarning :last_warning
327
+
269
328
  # Checks whether the last error was a network error on API side, or a
270
329
  # remote error reported by searchd. Returns true if the last connection
271
330
  # attempt to searchd failed on API side, false otherwise (if the error
272
331
  # was remote, or there were no connection attempts at all).
273
332
  #
274
- def IsConnectError
333
+ # @return [Boolean] the value indicating whether last error was a
334
+ # nework error on API side.
335
+ #
336
+ # @example
337
+ # puts "Connection failed!" if sphinx.connect_error?
338
+ #
339
+ # @see #last_error
340
+ # @see #last_warning
341
+ #
342
+ def connect_error?
275
343
  @connerror || false
276
344
  end
277
-
345
+ alias :IsConnectError :connect_error?
346
+
278
347
  # Sets searchd host name and TCP port. All subsequent requests will
279
348
  # use the new host and port settings. Default +host+ and +port+ are
280
- # 'localhost' and 3312, respectively.
349
+ # 'localhost' and 9312, respectively.
281
350
  #
282
351
  # Also, you can specify an absolute path to Sphinx's UNIX socket as +host+,
283
352
  # in this case pass port as +0+ or +nil+.
284
353
  #
285
- def SetServer(host, port)
354
+ # @param [String] host the searchd host name or UNIX socket absolute path.
355
+ # @param [Integer] port the searchd port name (could be any if UNIX
356
+ # socket path specified).
357
+ # @return [Sphinx::Client] self.
358
+ #
359
+ # @example
360
+ # sphinx.set_server('localhost', 9312)
361
+ # sphinx.set_server('/opt/sphinx/var/run/sphinx.sock')
362
+ #
363
+ # @raise [ArgumentError] Occurred when parameters are invalid.
364
+ # @see #set_servers
365
+ # @see #set_connect_timeout
366
+ # @see #set_request_timeout
367
+ #
368
+ def set_server(host, port = 9312)
286
369
  raise ArgumentError, '"host" argument must be String' unless host.kind_of?(String)
287
-
370
+
288
371
  path = nil
289
372
  # Check if UNIX socket should be used
290
373
  if host[0] == ?/
@@ -298,25 +381,47 @@ module Sphinx
298
381
  host = port = nil unless path.nil?
299
382
 
300
383
  @servers = [Sphinx::Server.new(self, host, port, path)].freeze
384
+ self
301
385
  end
386
+ alias :SetServer :set_server
302
387
 
303
388
  # Sets the list of searchd servers. Each subsequent request will use next
304
389
  # server in list (round-robin). In case of one server failure, request could
305
- # be retried on another server (see +SetConnectTimeout+ and +SetRequestTimeout+).
306
- #
307
- # Method accepts an +Array+ of +Hash+es, each of them should have :host
308
- # and :port (to connect to searchd through network) or :path (an absolute path
309
- # to UNIX socket) specified.
310
- #
311
- def SetServers(servers)
390
+ # be retried on another server (see {#set_connect_timeout} and
391
+ # {#set_request_timeout}).
392
+ #
393
+ # Method accepts an +Array+ of +Hash+es, each of them should have <tt>:host</tt>
394
+ # and <tt>:port</tt> (to connect to searchd through network) or <tt>:path</tt>
395
+ # (an absolute path to UNIX socket) specified.
396
+ #
397
+ # @param [Array<Hash>] servers an +Array+ of +Hash+ objects with servers parameters.
398
+ # @option servers [String] :host the searchd host name or UNIX socket absolute path.
399
+ # @option servers [String] :path the searchd UNIX socket absolute path.
400
+ # @option servers [Integer] :port (9312) the searchd port number (skipped when UNIX
401
+ # socket path specified)
402
+ # @return [Sphinx::Client] self.
403
+ #
404
+ # @example
405
+ # sphinx.set_servers([
406
+ # { :host => 'browse01.local' }, # default port is 9312
407
+ # { :host => 'browse02.local', :port => 9312 },
408
+ # { :path => '/opt/sphinx/var/run/sphinx.sock' }
409
+ # ])
410
+ #
411
+ # @raise [ArgumentError] Occurred when parameters are invalid.
412
+ # @see #set_server
413
+ # @see #set_connect_timeout
414
+ # @see #set_request_timeout
415
+ #
416
+ def set_servers(servers)
312
417
  raise ArgumentError, '"servers" argument must be Array' unless servers.kind_of?(Array)
313
418
  raise ArgumentError, '"servers" argument must be not empty' if servers.empty?
314
-
419
+
315
420
  @servers = servers.map do |server|
316
421
  raise ArgumentError, '"servers" argument must be Array of Hashes' unless server.kind_of?(Hash)
317
422
 
318
423
  host = server[:path] || server['path'] || server[:host] || server['host']
319
- port = server[:port] || server['port']
424
+ port = server[:port] || server['port'] || 9312
320
425
  path = nil
321
426
  raise ArgumentError, '"host" argument must be String' unless host.kind_of?(String)
322
427
 
@@ -330,11 +435,13 @@ module Sphinx
330
435
  end
331
436
 
332
437
  host = port = nil unless path.nil?
333
-
438
+
334
439
  Sphinx::Server.new(self, host, port, path)
335
440
  end.freeze
441
+ self
336
442
  end
337
-
443
+ alias :SetServers :set_servers
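The per-server handling in `set_servers` can be summarised with a simplified sketch. `normalize_server` is a hypothetical helper that returns a plain Hash instead of a `Sphinx::Server`; the lookup order and the default port of 9312 follow the diff above, while the socket detection is reduced to a leading-slash check as an assumption, since part of that logic is not shown in this hunk.

```ruby
# Hypothetical helper mirroring the lookup order used in set_servers.
def normalize_server(server)
  raise ArgumentError, '"servers" argument must be Array of Hashes' unless server.kind_of?(Hash)

  # Symbol and String keys are both accepted; :path wins over :host.
  host = server[:path] || server['path'] || server[:host] || server['host']
  port = server[:port] || server['port'] || 9312
  raise ArgumentError, '"host" argument must be String' unless host.kind_of?(String)

  # Assumption: an absolute path means a UNIX socket, so host and port are dropped.
  if host.start_with?('/')
    { :host => nil, :port => nil, :path => host }
  else
    { :host => host, :port => port, :path => nil }
  end
end

p normalize_server(:host => 'browse01.local')
p normalize_server('path' => '/opt/sphinx/var/run/sphinx.sock')
```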
444
+
338
445
  # Sets the time allowed to spend connecting to the server before giving up
339
446
  # and number of retries to perform.
340
447
  #
@@ -342,7 +449,7 @@ module Sphinx
342
449
  # be returned back to the application in order for application-level error
343
450
  # handling to advise the user.
344
451
  #
345
- # When multiple servers configured through +SetServers+ method, and +retries+
452
+ # When multiple servers configured through {#set_servers} method, and +retries+
346
453
  # number is greater than 1, library will try to connect to another server.
347
454
  # In case of single server configured, it will try to reconnect +retries+
348
455
  # times.
@@ -350,15 +457,29 @@ module Sphinx
350
457
  # Please note, this timeout will only be used for connection establishing, not
351
458
  # for regular API requests.
352
459
  #
353
- def SetConnectTimeout(timeout, retries = 1)
460
+ # @param [Integer] timeout a connection timeout in seconds.
461
+ # @param [Integer] retries number of connect retries.
462
+ # @return [Sphinx::Client] self.
463
+ #
464
+ # @example Set connection timeout to 1 second and number of retries to 5
465
+ # sphinx.set_connect_timeout(1, 5)
466
+ #
467
+ # @raise [ArgumentError] Occurred when parameters are invalid.
468
+ # @see #set_server
469
+ # @see #set_servers
470
+ # @see #set_request_timeout
471
+ #
472
+ def set_connect_timeout(timeout, retries = 1)
354
473
  raise ArgumentError, '"timeout" argument must be Integer' unless timeout.respond_to?(:integer?) and timeout.integer?
355
474
  raise ArgumentError, '"retries" argument must be Integer' unless retries.respond_to?(:integer?) and retries.integer?
356
475
  raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0
357
-
476
+
358
477
  @timeout = timeout
359
478
  @retries = retries
479
+ self
360
480
  end
361
-
481
+ alias :SetConnectTimeout :set_connect_timeout
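The argument check used in `set_connect_timeout` relies on `respond_to?(:integer?)` so that non-numeric values are rejected without raising `NoMethodError`. A minimal sketch of that check, wrapped in a hypothetical `integer_argument?` helper:

```ruby
# Hypothetical helper reproducing the validation expression from the diff.
def integer_argument?(value)
  # Only numeric objects respond to integer?; Strings and nil do not.
  value.respond_to?(:integer?) && value.integer?
end

p integer_argument?(1)    # Integer
p integer_argument?(1.5)  # Float
p integer_argument?('1')  # String
p integer_argument?(nil)  # nil
```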
482
+
362
483
  # Sets the time allowed to spend performing request to the server before giving up
363
484
  # and number of retries to perform.
364
485
  #
@@ -366,34 +487,82 @@ module Sphinx
366
487
  # be returned back to the application in order for application-level error
367
488
  # handling to advise the user.
368
489
  #
369
- # When multiple servers configured through +SetServers+ method, and +retries+
490
+ # When multiple servers configured through {#set_servers} method, and +retries+
370
491
  # number is greater than 1, library will try to do another try with this server
371
492
  # (with full reconnect). If connection would fail, behavior depends on
372
- # +SetConnectTimeout+ settings.
493
+ # {#set_connect_timeout} settings.
373
494
  #
374
495
  # Please note, this timeout will only be used for request performing, not
375
496
  # for connection establishing.
376
497
  #
377
- def SetRequestTimeout(timeout, retries = 1)
498
+ # @param [Integer] timeout a request timeout in seconds.
499
+ # @param [Integer] retries number of request retries.
500
+ # @return [Sphinx::Client] self.
501
+ #
502
+ # @example Set request timeout to 1 second and number of retries to 5
503
+ # sphinx.set_request_timeout(1, 5)
504
+ #
505
+ # @raise [ArgumentError] Occurred when parameters are invalid.
506
+ # @see #set_server
507
+ # @see #set_servers
508
+ # @see #set_connect_timeout
509
+ #
510
+ def set_request_timeout(timeout, retries = 1)
378
511
  raise ArgumentError, '"timeout" argument must be Integer' unless timeout.respond_to?(:integer?) and timeout.integer?
379
512
  raise ArgumentError, '"retries" argument must be Integer' unless retries.respond_to?(:integer?) and retries.integer?
380
513
  raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0
381
-
514
+
382
515
  @reqtimeout = timeout
383
516
  @reqretries = retries
517
+ self
518
+ end
519
+ alias :SetRequestTimeout :set_request_timeout
520
+
521
+ # Sets distributed retry count and delay.
522
+ #
523
+ # On temporary failures searchd will attempt up to +count+ retries
524
+ # per agent. +delay+ is the delay between the retries, in milliseconds.
525
+ # Retries are disabled by default. Note that this call will not make
526
+ # the API itself retry on temporary failure; it only tells searchd
527
+ # to do so. Currently, the list of temporary failures includes all
528
+ # kinds of connection failures and maxed out (too busy) remote agents.
529
+ #
530
+ # @param [Integer] count a number of retries to perform.
531
+ # @param [Integer] delay a delay between the retries.
532
+ # @return [Sphinx::Client] self.
533
+ #
534
+ # @example Perform 5 retries with 200 ms between them
535
+ # sphinx.set_retries(5, 200)
536
+ #
537
+ # @raise [ArgumentError] Occurred when parameters are invalid.
538
+ # @see #set_connect_timeout
539
+ # @see #set_request_timeout
540
+ #
541
+ def set_retries(count, delay = 0)
542
+ raise ArgumentError, '"count" argument must be Integer' unless count.respond_to?(:integer?) and count.integer?
543
+ raise ArgumentError, '"delay" argument must be Integer' unless delay.respond_to?(:integer?) and delay.integer?
544
+
545
+ @retrycount = count
546
+ @retrydelay = delay
547
+ self
384
548
  end
385
-
549
+ alias :SetRetries :set_retries
550
+
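Since the three setters above now end with +self+, they can be chained. The following is a minimal sketch of that pattern only: TimeoutSettings is a hypothetical stand-in class written for illustration, not the gem's Sphinx::Client, and it mirrors just the validation visible in this diff.

```ruby
# Hypothetical stand-in for illustration only; not the real Sphinx::Client.
class TimeoutSettings
  attr_reader :timeout, :retries, :reqtimeout, :reqretries, :retrycount, :retrydelay

  def set_connect_timeout(timeout, retries = 1)
    check_integer!('timeout', timeout)
    check_integer!('retries', retries)
    raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0

    @timeout = timeout
    @retries = retries
    self # returning self is what makes chaining possible
  end

  def set_request_timeout(timeout, retries = 1)
    check_integer!('timeout', timeout)
    check_integer!('retries', retries)
    raise ArgumentError, '"retries" argument must be greater than 0' unless retries > 0

    @reqtimeout = timeout
    @reqretries = retries
    self
  end

  def set_retries(count, delay = 0)
    check_integer!('count', count)
    check_integer!('delay', delay)

    @retrycount = count
    @retrydelay = delay
    self
  end

  private

  # Same duck-typed integer check as in the diff above.
  def check_integer!(name, value)
    return if value.respond_to?(:integer?) && value.integer?
    raise ArgumentError, "\"#{name}\" argument must be Integer"
  end
end

settings = TimeoutSettings.new
settings.set_connect_timeout(1, 5).set_request_timeout(1, 5).set_retries(5, 200)
```

Invalid input still raises before any state changes, so a failed call in the middle of a chain leaves the earlier settings in place.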
551
+ #=================================================================
552
+ # General query settings
553
+ #=================================================================
554
+
386
555
  # Sets offset into server-side result set (+offset+) and amount of matches to
387
556
  # return to client starting from that offset (+limit+). Can additionally control
388
557
  # maximum server-side result set size for current query (+max_matches+) and the
389
558
  # threshold amount of matches to stop searching at (+cutoff+). All parameters
390
559
  # must be non-negative integers.
391
560
  #
392
- # First two parameters to +SetLimits+ are identical in behavior to MySQL LIMIT
561
+ # First two parameters to {#set_limits} are identical in behavior to MySQL LIMIT
393
562
  # clause. They instruct searchd to return at most +limit+ matches starting from
394
563
  # match number +offset+. The default offset and limit settings are +0+ and +20+,
395
564
  # that is, to return first +20+ matches.
396
- #
565
+ #
397
566
  # +max_matches+ setting controls how many matches searchd will keep in RAM
398
567
  # while searching. All matching documents will be normally processed, ranked,
399
568
  # filtered, and sorted even if max_matches is set to +1+. But only best +N+
@@ -415,12 +584,23 @@ module Sphinx
415
584
  # searchd to forcibly stop search query once +cutoff+ matches had been found
416
585
  # and processed.
417
586
  #
418
- def SetLimits(offset, limit, max = 0, cutoff = 0)
587
+ # @param [Integer] offset an offset into server-side result set.
588
+ # @param [Integer] limit an amount of matches to return.
589
+ # @param [Integer] max a maximum server-side result set size.
590
+ # @param [Integer] cutoff a threshold amount of matches to stop searching at.
591
+ # @return [Sphinx::Client] self.
592
+ #
593
+ # @example
594
+ # sphinx.set_limits(100, 50, 1000, 5000)
595
+ #
596
+ # @raise [ArgumentError] Occurred when parameters are invalid.
597
+ #
598
+ def set_limits(offset, limit, max = 0, cutoff = 0)
419
599
  raise ArgumentError, '"offset" argument must be Integer' unless offset.respond_to?(:integer?) and offset.integer?
420
600
  raise ArgumentError, '"limit" argument must be Integer' unless limit.respond_to?(:integer?) and limit.integer?
421
601
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
422
602
  raise ArgumentError, '"cutoff" argument must be Integer' unless cutoff.respond_to?(:integer?) and cutoff.integer?
423
-
603
+
424
604
  raise ArgumentError, '"offset" argument should be greater or equal to zero' unless offset >= 0
425
605
  raise ArgumentError, '"limit" argument should be greater to zero' unless limit > 0
426
606
  raise ArgumentError, '"max" argument should be greater or equal to zero' unless max >= 0
@@ -430,35 +610,176 @@ module Sphinx
430
610
  @limit = limit
431
611
  @maxmatches = max if max > 0
432
612
  @cutoff = cutoff if cutoff > 0
613
+ self
433
614
  end
434
-
615
+ alias :SetLimits :set_limits
616
+
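The offset and limit arguments behave like MySQL LIMIT, so paging usually comes down to a small calculation. A minimal sketch, assuming 1-based page numbers; page_to_limits is a hypothetical helper and not part of the gem:

```ruby
# Hypothetical helper (not part of the gem): converts a 1-based page number
# into the offset/limit pair expected by set_limits.
def page_to_limits(page, per_page = 20)
  raise ArgumentError, '"page" must be a positive Integer' unless page.is_a?(Integer) && page > 0
  raise ArgumentError, '"per_page" must be a positive Integer' unless per_page.is_a?(Integer) && per_page > 0

  [(page - 1) * per_page, per_page]
end

# Page 3 with 50 matches per page skips the first 100 matches.
offset, limit = page_to_limits(3, 50)
```

The resulting pair could then be passed on as sphinx.set_limits(offset, limit).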
435
617
  # Sets maximum search query time, in milliseconds. Parameter must be a
436
618
  # non-negative integer. Default value is +0+ which means "do not limit".
437
619
  #
438
- # Similar to +cutoff+ setting from +SetLimits+, but limits elapsed query
620
+ # Similar to +cutoff+ setting from {#set_limits}, but limits elapsed query
439
621
  # time instead of processed matches count. Local search queries will be
440
622
  # stopped once that much time has elapsed. Note that if you're performing
441
623
  # a search which queries several local indexes, this limit applies to each
442
624
  # index separately.
443
625
  #
444
- def SetMaxQueryTime(max)
626
+ # @param [Integer] max maximum search query time in milliseconds.
627
+ # @return [Sphinx::Client] self.
628
+ #
629
+ # @example
630
+ # sphinx.set_max_query_time(200)
631
+ #
632
+ # @raise [ArgumentError] Occurred when parameters are invalid.
633
+ #
634
+ def set_max_query_time(max)
445
635
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
446
636
  raise ArgumentError, '"max" argument should be greater or equal to zero' unless max >= 0
447
637
 
448
638
  @maxquerytime = max
639
+ self
449
640
  end
450
-
641
+ alias :SetMaxQueryTime :set_max_query_time
642
+
643
+ # Sets temporary (per-query) per-document attribute value overrides. Only
644
+ # supports scalar attributes. +values+ must be a +Hash+ that maps document
645
+ # IDs to overridden attribute values.
646
+ #
647
+ # Override feature lets you "temporarily" update attribute values for some
648
+ # documents within a single query, leaving all other queries unaffected.
649
+ # This might be useful for personalized data. For example, assume you're
650
+ # implementing a personalized search function that wants to boost the posts
651
+ # that the user's friends recommend. Such data is not just dynamic, but
652
+ # also personal; so you can't simply put it in the index because you don't
653
+ # want everyone's searches affected. Overrides, on the other hand, are local
654
+ # to a single query and invisible to everyone else. So you can, say, set up
655
+ # a "friends_weight" value for every document, defaulting to 0, then
656
+ # temporarily override it with 1 for documents 123, 456 and 789 (recommended
657
+ # by exactly the friends of current user), and use that value when ranking.
658
+ #
659
+ # You can specify attribute type as String ("integer", "float", etc),
660
+ # Symbol (:integer, :float, etc), or
661
+ # Fixnum constant (SPH_ATTR_INTEGER, SPH_ATTR_FLOAT, etc).
662
+ #
663
+ # @param [String, Symbol] attribute an attribute name to override values of.
664
+ # @param [Integer, String, Symbol] attrtype attribute type.
665
+ # @param [Hash] values a +Hash+ that maps document IDs to overridden attribute values.
666
+ # @return [Sphinx::Client] self.
667
+ #
668
+ # @example
669
+ # sphinx.set_override(:friends_weight, :integer, {123 => 1, 456 => 1, 789 => 1})
670
+ #
671
+ # @raise [ArgumentError] Occurred when parameters are invalid.
672
+ #
673
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setoverride Section 6.2.3, "SetOverride"
674
+ #
675
+ def set_override(attribute, attrtype, values)
676
+ raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
677
+
678
+ case attrtype
679
+ when String, Symbol
680
+ begin
681
+ attrtype = self.class.const_get("SPH_ATTR_#{attrtype.to_s.upcase}")
682
+ rescue NameError
683
+ raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid"
684
+ end
685
+ when Fixnum
686
+ raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid" unless (SPH_ATTR_INTEGER..SPH_ATTR_BIGINT).include?(attrtype)
687
+ else
688
+ raise ArgumentError, '"attrtype" argument must be Fixnum, String, or Symbol'
689
+ end
690
+
691
+ raise ArgumentError, '"values" argument must be Hash' unless values.kind_of?(Hash)
692
+
693
+ values.each do |id, value|
694
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless id.respond_to?(:integer?) and id.integer?
695
+ case attrtype
696
+ when SPH_ATTR_TIMESTAMP
697
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless (value.respond_to?(:integer?) and value.integer?) or value.kind_of?(Time)
698
+ when SPH_ATTR_FLOAT
699
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Float or Integer' unless value.kind_of?(Float) or (value.respond_to?(:integer?) and value.integer?)
700
+ else
701
+ # SPH_ATTR_INTEGER, SPH_ATTR_ORDINAL, SPH_ATTR_BOOL, SPH_ATTR_BIGINT
702
+ raise ArgumentError, '"values" argument must be Hash map of Integer to Integer' unless value.respond_to?(:integer?) and value.integer?
703
+ end
704
+ end
705
+
706
+ @overrides << { 'attr' => attribute.to_s, 'type' => attrtype, 'values' => values }
707
+ self
708
+ end
709
+ alias :SetOverride :set_override
710
+
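The String/Symbol branch of set_override turns a name such as :float into a constant by building "SPH_ATTR_FLOAT" and looking it up. A minimal sketch of that lookup in isolation: AttrTypes is a hypothetical stand-in module, and the constant values in it are assumptions made for the example rather than values taken from this diff.

```ruby
# Stand-in module for illustration; the real constants live in Sphinx::Client
# and the numeric values below are assumptions.
module AttrTypes
  SPH_ATTR_INTEGER   = 1
  SPH_ATTR_TIMESTAMP = 2
  SPH_ATTR_FLOAT     = 5

  # Resolves an attribute type given as String, Symbol, or Integer.
  # Integer is used here in place of Fixnum, which current Rubies no longer define.
  def self.resolve(attrtype)
    case attrtype
    when String, Symbol
      begin
        const_get("SPH_ATTR_#{attrtype.to_s.upcase}")
      rescue NameError
        # Unknown names surface as ArgumentError, as in the diff above.
        raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid"
      end
    when Integer
      attrtype
    else
      raise ArgumentError, '"attrtype" argument must be Integer, String, or Symbol'
    end
  end
end

from_symbol = AttrTypes.resolve(:float)
from_string = AttrTypes.resolve('integer')
```

Translating NameError into ArgumentError keeps the public error contract uniform, whichever form the caller uses.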
711
+ # Sets the select clause, listing specific attributes to fetch, and
712
+ # expressions to compute and fetch. Clause syntax mimics SQL.
713
+ #
714
+ # {#set_select} is very similar to the part of a typical SQL query between
715
+ # +SELECT+ and +FROM+. It lets you choose what attributes (columns) to
716
+ # fetch, and also what expressions over the columns to compute and fetch.
717
+ # A certain difference from SQL is that expressions must always be aliased
718
+ # to a correct identifier (consisting of letters and digits) using +AS+
719
+ # keyword. SQL also lets you do that but does not require it. Sphinx enforces
720
+ # aliases so that the computation results can always be returned under a
721
+ # "normal" name in the result set, used in other clauses, etc.
722
+ #
723
+ # Everything else is basically identical to SQL. Star ('*') is supported.
724
+ # Functions are supported. Arbitrary amount of expressions is supported.
725
+ # Computed expressions can be used for sorting, filtering, and grouping,
726
+ # just as the regular attributes.
727
+ #
728
+ # Starting with version 0.9.9-rc2, aggregate functions (<tt>AVG()</tt>,
729
+ # <tt>MIN()</tt>, <tt>MAX()</tt>, <tt>SUM()</tt>) are supported when using
730
+ # <tt>GROUP BY</tt>.
731
+ #
732
+ # Expression sorting (Section 4.5, “SPH_SORT_EXPR mode”) and geodistance
733
+ # functions ({#set_geo_anchor}) are now internally implemented
734
+ # using this computed expressions mechanism, using magic names '<tt>@expr</tt>'
735
+ # and '<tt>@geodist</tt>' respectively.
736
+ #
737
+ # @param [String] select a select clause, listing specific attributes to fetch.
738
+ # @return [Sphinx::Client] self.
739
+ #
740
+ # @example
741
+ # sphinx.set_select('*, @weight+(user_karma+ln(pageviews))*0.1 AS myweight')
742
+ # sphinx.set_select("exp_years, salary_gbp*#{gbp_usd_rate} AS salary_usd, IF(age>40,1,0) AS over40")
743
+ # sphinx.set_select('*, AVG(price) AS avgprice')
744
+ #
745
+ # @raise [ArgumentError] Occurred when parameters are invalid.
746
+ #
747
+ # @see http://www.sphinxsearch.com/docs/current.html#sort-expr Section 4.5, "SPH_SORT_EXPR mode"
748
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgeoanchor Section 6.4.5, "SetGeoAnchor"
749
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setselect Section 6.2.4, "SetSelect"
750
+ #
751
+ def set_select(select)
752
+ raise ArgumentError, '"select" argument must be String' unless select.kind_of?(String)
753
+
754
+ @select = select
755
+ self
756
+ end
757
+ alias :SetSelect :set_select
758
+
759
+ #=================================================================
760
+ # Full-text search query settings
761
+ #=================================================================
762
+
451
763
  # Sets full-text query matching mode.
452
764
  #
453
765
  # Parameter must be a +Fixnum+ constant specifying one of the known modes
454
766
  # (+SPH_MATCH_ALL+, +SPH_MATCH_ANY+, etc), +String+ with identifier (<tt>"all"</tt>,
455
767
  # <tt>"any"</tt>, etc), or a +Symbol+ (<tt>:all</tt>, <tt>:any</tt>, etc).
456
768
  #
457
- # Corresponding sections in Sphinx reference manual:
458
- # * {Section 4.1, "Matching modes"}[http://www.sphinxsearch.com/docs/current.html#matching-modes] for details.
459
- # * {Section 6.3.1, "SetMatchMode"}[http://www.sphinxsearch.com/docs/current.html#api-func-setmatchmode] for details.
769
+ # @param [Integer, String, Symbol] mode full-text query matching mode.
770
+ # @return [Sphinx::Client] self.
460
771
  #
461
- def SetMatchMode(mode)
772
+ # @example
773
+ # sphinx.set_match_mode(Sphinx::Client::SPH_MATCH_ALL)
774
+ # sphinx.set_match_mode(:all)
775
+ # sphinx.set_match_mode('all')
776
+ #
777
+ # @raise [ArgumentError] Occurred when parameters are invalid.
778
+ #
779
+ # @see http://www.sphinxsearch.com/docs/current.html#matching-modes Section 4.1, "Matching modes"
780
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setmatchmode Section 6.3.1, "SetMatchMode"
781
+ #
782
+ def set_match_mode(mode)
462
783
  case mode
463
784
  when String, Symbol
464
785
  begin
@@ -473,14 +794,33 @@ module Sphinx
473
794
  end
474
795
 
475
796
  @mode = mode
797
+ self
476
798
  end
477
-
478
- # Set ranking mode.
799
+ alias :SetMatchMode :set_match_mode
800
+
801
+ # Sets ranking mode. Only available in +SPH_MATCH_EXTENDED2+
802
+ # matching mode at the time of this writing. Parameter must be a
803
+ # constant specifying one of the known modes.
479
804
  #
480
805
  # You can specify ranking mode as String ("proximity_bm25", "bm25", etc),
481
806
  # Symbol (:proximity_bm25, :bm25, etc), or
482
807
  # Fixnum constant (SPH_RANK_PROXIMITY_BM25, SPH_RANK_BM25, etc).
483
- def SetRankingMode(ranker)
808
+ #
809
+ # @param [Integer, String, Symbol] ranker ranking mode.
810
+ # @return [Sphinx::Client] self.
811
+ #
812
+ # @example
813
+ # sphinx.set_ranking_mode(Sphinx::Client::SPH_RANK_BM25)
814
+ # sphinx.set_ranking_mode(:bm25)
815
+ # sphinx.set_ranking_mode('bm25')
816
+ #
817
+ # @raise [ArgumentError] Occurred when parameters are invalid.
818
+ #
819
+ # @see http://www.sphinxsearch.com/docs/current.html#matching-modes Section 4.1, "Matching modes"
820
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setmatchmode Section 6.3.1, "SetMatchMode"
821
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setrankingmode Section 6.3.2, "SetRankingMode"
822
+ #
823
+ def set_ranking_mode(ranker)
484
824
  case ranker
485
825
  when String, Symbol
486
826
  begin
@@ -495,14 +835,33 @@ module Sphinx
495
835
  end
496
836
 
497
837
  @ranker = ranker
838
+ self
498
839
  end
499
-
840
+ alias :SetRankingMode :set_ranking_mode
841
+
500
842
  # Set matches sorting mode.
501
843
  #
502
844
  # You can specify sorting mode as String ("relevance", "attr_desc", etc),
503
845
  # Symbol (:relevance, :attr_desc, etc), or
504
846
  # Fixnum constant (SPH_SORT_RELEVANCE, SPH_SORT_ATTR_DESC, etc).
505
- def SetSortMode(mode, sortby = '')
847
+ #
848
+ # @param [Integer, String, Symbol] mode matches sorting mode.
849
+ # @param [String] sortby sorting clause, with the syntax depending on
850
+ # specific mode. Should be specified unless sorting mode is
851
+ # +SPH_SORT_RELEVANCE+.
852
+ # @return [Sphinx::Client] self.
853
+ #
854
+ # @example
855
+ # sphinx.set_sort_mode(Sphinx::Client::SPH_SORT_ATTR_ASC, 'attr')
856
+ # sphinx.set_sort_mode(:attr_asc, 'attr')
857
+ # sphinx.set_sort_mode('attr_asc', 'attr')
858
+ #
859
+ # @raise [ArgumentError] Occurred when parameters are invalid.
860
+ #
861
+ # @see http://www.sphinxsearch.com/docs/current.html#sorting-modes Section 4.5, "Sorting modes"
862
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setsortmode Section 6.3.3, "SetSortMode"
863
+ #
864
+ def set_sort_mode(mode, sortby = '')
506
865
  case mode
507
866
  when String, Symbol
508
867
  begin
@@ -521,27 +880,69 @@ module Sphinx
521
880
 
522
881
  @sort = mode
523
882
  @sortby = sortby
883
+ self
524
884
  end
525
-
526
- # Bind per-field weights by order.
885
+ alias :SetSortMode :set_sort_mode
886
+
887
+ # Binds per-field weights in the order of appearance in the index.
888
+ #
889
+ # @param [Array<Integer>] weights an +Array+ of integer per-field weights.
890
+ # @return [Sphinx::Client] self.
527
891
  #
528
- # DEPRECATED; use SetFieldWeights() instead.
529
- def SetWeights(weights)
892
+ # @example
893
+ # sphinx.set_weights([1, 3, 5])
894
+ #
895
+ # @raise [ArgumentError] Occurred when parameters are invalid.
896
+ #
897
+ # @deprecated Use {#set_field_weights} instead.
898
+ # @see #set_field_weights
899
+ #
900
+ def set_weights(weights)
530
901
  raise ArgumentError, '"weights" argument must be Array' unless weights.kind_of?(Array)
531
902
  weights.each do |weight|
532
903
  raise ArgumentError, '"weights" argument must be Array of integers' unless weight.respond_to?(:integer?) and weight.integer?
533
904
  end
534
905
 
535
906
  @weights = weights
907
+ self
536
908
  end
909
+ alias :SetWeights :set_weights
537
910
 
538
- # Bind per-field weights by name.
911
+ # Binds per-field weights by name. Parameter must be a +Hash+
912
+ # mapping string field names to integer weights.
913
+ #
914
+ # Match ranking can be affected by per-field weights. For instance,
915
+ # see Section 4.4, "Weighting" for an explanation of how phrase
916
+ # proximity ranking is affected. This call lets you specify what
917
+ # non-default weights to assign to different full-text fields.
918
+ #
919
+ # The weights must be positive 32-bit integers. The final weight
920
+ # will be a 32-bit integer too. Default weight value is 1. Unknown
921
+ # field names will be silently ignored.
922
+ #
923
+ # There is no enforced limit on the maximum weight value at the
924
+ # moment. However, beware that if you set it too high you can
925
+ # start hitting 32-bit wraparound issues. For instance, if
926
+ # you set a weight of 10,000,000 and search in extended mode,
927
+ # then maximum possible weight will be equal to 10 million (your
928
+ # weight) by 1 thousand (internal BM25 scaling factor, see
929
+ # Section 4.4, “Weighting”) by 1 or more (phrase proximity rank).
930
+ # The result is at least 10 billion that does not fit in 32 bits
931
+ # and will be wrapped around, producing unexpected results.
932
+ #
933
+ # @param [Hash] weights a +Hash+ mapping string field names to
934
+ # integer weights.
935
+ # @return [Sphinx::Client] self.
936
+ #
937
+ # @example
938
+ # sphinx.set_field_weights(:title => 20, :text => 10)
539
939
  #
540
- # Takes string (field name) to integer (field weight) hash as an argument.
541
- # * Takes precedence over SetWeights().
542
- # * Unknown names will be silently ignored.
543
- # * Unbound fields will be silently given a weight of 1.
544
- def SetFieldWeights(weights)
940
+ # @raise [ArgumentError] Occurred when parameters are invalid.
941
+ #
942
+ # @see http://www.sphinxsearch.com/docs/current.html#weighting Section 4.4, "Weighting"
943
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfieldweights Section 6.3.5, "SetFieldWeights"
944
+ #
945
+ def set_field_weights(weights)
545
946
  raise ArgumentError, '"weights" argument must be Hash' unless weights.kind_of?(Hash)
546
947
  weights.each do |name, weight|
547
948
  unless (name.kind_of?(String) or name.kind_of?(Symbol)) and (weight.respond_to?(:integer?) and weight.integer?)
@@ -550,37 +951,119 @@ module Sphinx
550
951
  end
551
952
 
552
953
  @fieldweights = weights
954
+ self
553
955
  end
554
-
555
- # Bind per-index weights by name.
556
- def SetIndexWeights(weights)
956
+ alias :SetFieldWeights :set_field_weights
957
+
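The wraparound warning in the set_field_weights comment can be checked with plain arithmetic. A small sketch using the numbers from that comment; treating the final weight as an unsigned 32-bit value is an assumption made here for illustration:

```ruby
field_weight = 10_000_000 # weight passed to set_field_weights
bm25_scaling = 1_000      # internal BM25 scaling factor named in the comment
proximity    = 1          # smallest possible phrase proximity rank

max_weight = field_weight * bm25_scaling * proximity
uint32_max = 2**32 - 1

# Ruby integers do not overflow, so the 32-bit limit is checked by hand.
overflows = max_weight > uint32_max
# Value that would remain after unsigned 32-bit wraparound (assumption).
wrapped = max_weight % 2**32
```

With these inputs the product is 10 billion, which exceeds the 32-bit maximum of 4,294,967,295.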
958
+ # Sets per-index weights, and enables weighted summing of match
959
+ # weights across different indexes. Parameter must be a hash
960
+ # (associative array) mapping string index names to integer
961
+ # weights. Default is an empty hash, which disables weighted
962
+ # summing.
963
+ #
964
+ # When a match with the same document ID is found in several
965
+ # different local indexes, by default Sphinx simply chooses the
966
+ # match from the index specified last in the query. This is to
967
+ # support searching through partially overlapping index partitions.
968
+ #
969
+ # However in some cases the indexes are not just partitions,
970
+ # and you might want to sum the weights across the indexes
971
+ # instead of picking one. {#set_index_weights} lets you do that.
972
+ # With summing enabled, final match weight in result set will be
973
+ # computed as a sum of match weight coming from the given index
974
+ # multiplied by respective per-index weight specified in this
975
+ # call. I.e., if the document 123 is found in index A with the
976
+ # weight of 2, and also in index B with the weight of 3, and
977
+ # you called {#set_index_weights} with <tt>{"A"=>100, "B"=>10}</tt>,
978
+ # the final weight returned to the client will be 2*100+3*10 = 230.
979
+ #
980
+ # @param [Hash] weights a +Hash+ mapping string index names to
981
+ # integer weights.
982
+ # @return [Sphinx::Client] self.
983
+ #
984
+ # @example
985
+ # sphinx.set_index_weights(:fresh => 20, :archived => 10)
986
+ #
987
+ # @raise [ArgumentError] Occurred when parameters are invalid.
988
+ #
989
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setindexweights Section 6.3.6, "SetIndexWeights"
990
+ #
991
+ def set_index_weights(weights)
557
992
  raise ArgumentError, '"weights" argument must be Hash' unless weights.kind_of?(Hash)
558
993
  weights.each do |index, weight|
559
994
  unless (index.kind_of?(String) or index.kind_of?(Symbol)) and (weight.respond_to?(:integer?) and weight.integer?)
560
995
  raise ArgumentError, '"weights" argument must be Hash map of strings to integers'
561
996
  end
562
997
  end
563
-
998
+
564
999
  @indexweights = weights
1000
+ self
565
1001
  end
1002
+ alias :SetIndexWeights :set_index_weights
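The weighted summing described in the set_index_weights comment can be reproduced in a few lines. This is a sketch of the arithmetic only, with the match weights taken from that comment; searchd performs the real computation server-side.

```ruby
# Match weights reported by each index for document 123.
match_weights = { 'A' => 2, 'B' => 3 }
# Per-index weights, in the shape passed to set_index_weights.
index_weights = { 'A' => 100, 'B' => 10 }

# Sum of (match weight * per-index weight) over all indexes.
final_weight = match_weights.inject(0) do |total, (index, weight)|
  total + weight * index_weights[index]
end
```

Here final_weight comes out as 2*100 + 3*10, matching the figure given in the comment.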
566
1003
 
567
- # Set IDs range to match.
568
- #
569
- # Only match records if document ID is beetwen <tt>min_id</tt> and <tt>max_id</tt> (inclusive).
570
- def SetIDRange(min, max)
1004
+ #=================================================================
1005
+ # Result set filtering settings
1006
+ #=================================================================
1007
+
1008
+ # Sets an accepted range of document IDs. Parameters must be integers.
1009
+ # Defaults are 0 and 0; that combination means to not limit by range.
1010
+ #
1011
+ # After this call, only those records that have document ID between
1012
+ # +min+ and +max+ (including IDs exactly equal to +min+ or +max+)
1013
+ # will be matched.
1014
+ #
1015
+ # @param [Integer] min min document ID.
1016
+ # @param [Integer] max max document ID.
1017
+ # @return [Sphinx::Client] self.
1018
+ #
1019
+ # @example
1020
+ # sphinx.set_id_range(10, 1000)
1021
+ #
1022
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1023
+ #
1024
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setidrange Section 6.4.1, "SetIDRange"
1025
+ #
1026
+ def set_id_range(min, max)
571
1027
  raise ArgumentError, '"min" argument must be Integer' unless min.respond_to?(:integer?) and min.integer?
572
1028
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
573
1029
  raise ArgumentError, '"max" argument greater or equal to "min"' unless min <= max
574
1030
 
575
1031
  @min_id = min
576
1032
  @max_id = max
1033
+ self
577
1034
  end
578
-
579
- # Set values filter.
580
- #
581
- # Only match those records where <tt>attribute</tt> column values
582
- # are in specified set.
583
- def SetFilter(attribute, values, exclude = false)
1035
+ alias :SetIDRange :set_id_range
1036
+
1037
+ # Adds new integer values set filter.
1038
+ #
1039
+ # On this call, additional new filter is added to the existing
1040
+ # list of filters. +attribute+ must be a string with attribute
1041
+ # name. +values+ must be a plain array containing integer
1042
+ # values. +exclude+ must be a boolean value; it controls
1043
+ # whether to accept the matching documents (default mode, when
1044
+ # +exclude+ is +false+) or reject them.
1045
+ #
1046
+ # Only those documents where +attribute+ column value stored in
1047
+ # the index matches any of the values from +values+ array will
1048
+ # be matched (or rejected, if +exclude+ is +true+).
1049
+ #
1050
+ # @param [String, Symbol] attribute an attribute name to filter by.
1051
+ # @param [Array<Integer>] values an +Array+ of integers with given attribute values.
1052
+ # @param [Boolean] exclude indicating whether documents with given attribute
1053
+ # matching specified values should be excluded from search results.
1054
+ # @return [Sphinx::Client] self.
1055
+ #
1056
+ # @example
1057
+ # sphinx.set_filter(:group_id, [10, 15, 20])
1058
+ # sphinx.set_filter(:group_id, [10, 15, 20], true)
1059
+ #
1060
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1061
+ #
1062
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfilter Section 6.4.2, "SetFilter"
1063
+ # @see #set_filter_range
1064
+ # @see #set_filter_float_range
1065
+ #
1066
+ def set_filter(attribute, values, exclude = false)
584
1067
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
585
1068
  raise ArgumentError, '"values" argument must be Array' unless values.kind_of?(Array)
586
1069
  raise ArgumentError, '"values" argument must not be empty' if values.empty?
@@ -589,94 +1072,194 @@ module Sphinx
589
1072
  values.each do |value|
590
1073
  raise ArgumentError, '"values" argument must be Array of Integer' unless value.respond_to?(:integer?) and value.integer?
591
1074
  end
592
-
1075
+
593
1076
  @filters << { 'type' => SPH_FILTER_VALUES, 'attr' => attribute.to_s, 'exclude' => exclude, 'values' => values }
1077
+ self
594
1078
  end
595
-
596
- # Set range filter.
597
- #
598
- # Only match those records where <tt>attribute</tt> column value
599
- # is beetwen <tt>min</tt> and <tt>max</tt> (including <tt>min</tt> and <tt>max</tt>).
600
- def SetFilterRange(attribute, min, max, exclude = false)
1079
+ alias :SetFilter :set_filter
1080
+
1081
+ # Adds new integer range filter.
1082
+ #
1083
+ # On this call, additional new filter is added to the existing
1084
+ # list of filters. +attribute+ must be a string with attribute
1085
+ # name. +min+ and +max+ must be integers that define the acceptable
1086
+ # attribute values range (including the boundaries). +exclude+
1087
+ # must be a boolean value; it controls whether to accept the
1088
+ # matching documents (default mode, when +exclude+ is false) or
1089
+ # reject them.
1090
+ #
1091
+ # Only those documents where +attribute+ column value stored
1092
+ # in the index is between +min+ and +max+ (including values
1093
+ # that are exactly equal to +min+ or +max+) will be matched
1094
+ # (or rejected, if +exclude+ is true).
1095
+ #
1096
+ # @param [String, Symbol] attribute an attribute name to filter by.
1097
+ # @param [Integer] min min value of the given attribute.
1098
+ # @param [Integer] max max value of the given attribute.
1099
+ # @param [Boolean] exclude indicating whether documents with given attribute
1100
+ # matching specified boundaries should be excluded from search results.
1101
+ # @return [Sphinx::Client] self.
1102
+ #
1103
+ # @example
1104
+ # sphinx.set_filter_range(:group_id, 10, 20)
1105
+ # sphinx.set_filter_range(:group_id, 10, 20, true)
1106
+ #
1107
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1108
+ #
1109
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfilterrange Section 6.4.3, "SetFilterRange"
1110
+ # @see #set_filter
1111
+ # @see #set_filter_float_range
1112
+ #
1113
+ def set_filter_range(attribute, min, max, exclude = false)
601
1114
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
602
1115
  raise ArgumentError, '"min" argument must be Integer' unless min.respond_to?(:integer?) and min.integer?
603
1116
  raise ArgumentError, '"max" argument must be Integer' unless max.respond_to?(:integer?) and max.integer?
604
1117
  raise ArgumentError, '"max" argument greater or equal to "min"' unless min <= max
605
1118
  raise ArgumentError, '"exclude" argument must be Boolean' unless exclude.kind_of?(TrueClass) or exclude.kind_of?(FalseClass)
606
-
1119
+
607
1120
  @filters << { 'type' => SPH_FILTER_RANGE, 'attr' => attribute.to_s, 'exclude' => exclude, 'min' => min, 'max' => max }
1121
+ self
608
1122
  end
609
-
610
- # Set float range filter.
611
- #
612
- # Only match those records where <tt>attribute</tt> column value
613
- # is beetwen <tt>min</tt> and <tt>max</tt> (including <tt>min</tt> and <tt>max</tt>).
614
- def SetFilterFloatRange(attribute, min, max, exclude = false)
1123
+ alias :SetFilterRange :set_filter_range
1124
+
1125
+ # Adds new float range filter.
1126
+ #
1127
+ # On this call, additional new filter is added to the existing
1128
+ # list of filters. +attribute+ must be a string with attribute name.
1129
+ # +min+ and +max+ must be floats that define the acceptable
1130
+ # attribute values range (including the boundaries). +exclude+ must
1131
+ # be a boolean value; it controls whether to accept the matching
1132
+ # documents (default mode, when +exclude+ is false) or reject them.
1133
+ #
1134
+ # Only those documents where +attribute+ column value stored in
1135
+ # the index is between +min+ and +max+ (including values that are
1136
+ # exactly equal to +min+ or +max+) will be matched (or rejected,
1137
+ # if +exclude+ is true).
1138
+ #
1139
+ # @param [String, Symbol] attribute an attribute name to filter by.
1140
+ # @param [Integer, Float] min min value of the given attribute.
1141
+ # @param [Integer, Float] max max value of the given attribute.
1142
+ # @param [Boolean] exclude indicating whether documents with given attribute
1143
+ # matching specified boundaries should be excluded from search results.
1144
+ # @return [Sphinx::Client] self.
1145
+ #
1146
+ # @example
1147
+ # sphinx.set_filter_float_range(:group_id, 10.5, 20)
1148
+ # sphinx.set_filter_float_range(:group_id, 10.5, 20, true)
1149
+ #
1150
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1151
+ #
1152
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setfilterfloatrange Section 6.4.4, "SetFilterFloatRange"
1153
+ # @see #set_filter
1154
+ # @see #set_filter_range
1155
+ #
1156
+ def set_filter_float_range(attribute, min, max, exclude = false)
615
1157
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
616
1158
  raise ArgumentError, '"min" argument must be Float or Integer' unless min.kind_of?(Float) or (min.respond_to?(:integer?) and min.integer?)
617
1159
  raise ArgumentError, '"max" argument must be Float or Integer' unless max.kind_of?(Float) or (max.respond_to?(:integer?) and max.integer?)
618
1160
  raise ArgumentError, '"max" argument greater or equal to "min"' unless min <= max
619
1161
  raise ArgumentError, '"exclude" argument must be Boolean' unless exclude.kind_of?(TrueClass) or exclude.kind_of?(FalseClass)
620
-
1162
+
621
1163
  @filters << { 'type' => SPH_FILTER_FLOATRANGE, 'attr' => attribute.to_s, 'exclude' => exclude, 'min' => min.to_f, 'max' => max.to_f }
1164
+ self
622
1165
  end
623
-
624
- # Setup anchor point for geosphere distance calculations.
625
- #
626
- # Required to use <tt>@geodist</tt> in filters and sorting
627
- # distance will be computed to this point. Latitude and longitude
628
- # must be in radians.
629
- #
630
- # * <tt>attrlat</tt> -- is the name of latitude attribute
631
- # * <tt>attrlong</tt> -- is the name of longitude attribute
632
- # * <tt>lat</tt> -- is anchor point latitude, in radians
633
- # * <tt>long</tt> -- is anchor point longitude, in radians
634
- def SetGeoAnchor(attrlat, attrlong, lat, long)
1166
+ alias :SetFilterFloatRange :set_filter_float_range
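The new `set_filter_float_range` returns `self`, so calls can be chained. The following is a minimal sketch of that behavior, not the gem's code: `FloatRangeSketch` is a hypothetical stand-in class, the constant value `2` is an assumption made for illustration, and the argument checks are simplified versions of the ones in the diff.

```ruby
# Hypothetical stand-in for Sphinx::Client. It only mirrors the argument
# checks and the filter hash that are visible in this diff.
class FloatRangeSketch
  SPH_FILTER_FLOATRANGE = 2 # assumed value, for illustration only

  attr_reader :filters

  def initialize
    @filters = []
  end

  def set_filter_float_range(attribute, min, max, exclude = false)
    unless attribute.kind_of?(String) || attribute.kind_of?(Symbol)
      raise ArgumentError, '"attribute" argument must be String or Symbol'
    end
    [min, max].each do |bound|
      unless bound.kind_of?(Float) || bound.kind_of?(Integer)
        raise ArgumentError, 'range bounds must be Float or Integer'
      end
    end
    raise ArgumentError, '"max" must be greater than or equal to "min"' unless min <= max
    raise ArgumentError, '"exclude" argument must be Boolean' unless exclude == true || exclude == false

    # Bounds are coerced to floats before they are stored.
    @filters << { 'type' => SPH_FILTER_FLOATRANGE, 'attr' => attribute.to_s,
                  'exclude' => exclude, 'min' => min.to_f, 'max' => max.to_f }
    self # returning self is what makes chaining possible
  end
end

client = FloatRangeSketch.new
client.set_filter_float_range(:price, 10.5, 20)
      .set_filter_float_range('rating', 1, 3, true)
```

Integer bounds are accepted but stored as floats, so the second argument `20` above ends up as `20.0` in the filter hash.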
1167
+
1168
+ # Sets anchor point for geosphere distance (geodistance)
1169
+ # calculations, and enables them.
1170
+ #
1171
+ # +attrlat+ and +attrlong+ must be strings that contain the names
1172
+ # of latitude and longitude attributes, respectively. +lat+ and
1173
+ # +long+ are floats that specify anchor point latitude and
1174
+ # longitude, in radians.
1175
+ #
1176
+ # Once an anchor point is set, you can use magic <tt>"@geodist"</tt>
1177
+ # attribute name in your filters and/or sorting expressions.
1178
+ # Sphinx will compute geosphere distance between the given anchor
1179
+ # point and a point specified by latitude and longitude attributes
1180
+ # from each full-text match, and attach this value to the resulting
1181
+ # match. The latitude and longitude values both in {#set_geo_anchor}
1182
+ # and the index attribute data are expected to be in radians.
1183
+ # The result will be returned in meters, so geodistance value of
1184
+ # 1000.0 means 1 km. 1 mile is approximately 1609.344 meters.
1185
+ #
1186
+ # @param [String, Symbol] attrlat a name of latitude attribute.
1187
+ # @param [String, Symbol] attrlong a name of longitude attribute.
1188
+ # @param [Integer, Float] lat an anchor point latitude, in radians.
1189
+ # @param [Integer, Float] long an anchor point longitude, in radians.
1190
+ # @return [Sphinx::Client] self.
1191
+ #
1192
+ # @example
1193
+ # sphinx.set_geo_anchor(:latitude, :longitude, 192.5, 143.5)
1194
+ #
1195
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1196
+ #
1197
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgeoanchor Section 6.4.5, "SetGeoAnchor"
1198
+ #
1199
+ def set_geo_anchor(attrlat, attrlong, lat, long)
635
1200
  raise ArgumentError, '"attrlat" argument must be String or Symbol' unless attrlat.kind_of?(String) or attrlat.kind_of?(Symbol)
636
1201
  raise ArgumentError, '"attrlong" argument must be String or Symbol' unless attrlong.kind_of?(String) or attrlong.kind_of?(Symbol)
637
1202
  raise ArgumentError, '"lat" argument must be Float or Integer' unless lat.kind_of?(Float) or (lat.respond_to?(:integer?) and lat.integer?)
638
1203
  raise ArgumentError, '"long" argument must be Float or Integer' unless long.kind_of?(Float) or (long.respond_to?(:integer?) and long.integer?)
639
1204
 
640
1205
  @anchor = { 'attrlat' => attrlat.to_s, 'attrlong' => attrlong.to_s, 'lat' => lat.to_f, 'long' => long.to_f }
1206
+ self
641
1207
  end
1208
+ alias :SetGeoAnchor :set_geo_anchor
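Because `set_geo_anchor` expects radians and `@geodist` comes back in meters, callers that work in degrees and miles need two small conversions. A minimal sketch follows; `degrees_to_radians` and `meters_to_miles` are hypothetical helper names, not part of the gem, and the coordinates are arbitrary sample values.

```ruby
# Hypothetical helpers, not part of the gem.

# Converts an angle in degrees to radians.
def degrees_to_radians(degrees)
  degrees * Math::PI / 180.0
end

# Converts a distance in meters to miles (1 mile = 1609.344 meters,
# the figure quoted in the method documentation).
def meters_to_miles(meters)
  meters / 1609.344
end

lat  = degrees_to_radians(55.75) # sample latitude in degrees
long = degrees_to_radians(37.62) # sample longitude in degrees

# With a real client the converted values would then be passed on:
#   sphinx.set_geo_anchor(:latitude, :longitude, lat, long)
```

A valid latitude in radians always lies between -PI/2 and PI/2, which is a cheap sanity check before calling the client.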
642
1209
 
643
- # Set grouping attribute and function.
1210
+ #=================================================================
1211
+ # GROUP BY settings
1212
+ #=================================================================
1213
+
1214
+ # Sets grouping attribute, function, and groups sorting mode; and
1215
+ # enables grouping (as described in Section 4.6, "Grouping (clustering) search results").
644
1216
  #
645
- # In grouping mode, all matches are assigned to different groups
646
- # based on grouping function value.
1217
+ # +attribute+ is a string that contains group-by attribute name.
1218
+ # +func+ is a constant that chooses a function applied to the
1219
+ # attribute value in order to compute group-by key. +groupsort+
1220
+ # is a clause that controls how the groups will be sorted. Its
1221
+ # syntax is similar to that described in Section 4.5,
1222
+ # "SPH_SORT_EXTENDED mode".
647
1223
  #
648
- # Each group keeps track of the total match count, and the best match
649
- # (in this group) according to current sorting function.
1224
+ # Grouping feature is very similar in nature to <tt>GROUP BY</tt> clause
1225
+ # from SQL. Results produced by this function call are going to
1226
+ # be the same as produced by the following pseudo code:
650
1227
  #
651
- # The final result set contains one best match per group, with
652
- # grouping function value and matches count attached.
1228
+ # SELECT ... GROUP BY func(attribute) ORDER BY groupsort
653
1229
  #
654
- # Groups in result set could be sorted by any sorting clause,
655
- # including both document attributes and the following special
656
- # internal Sphinx attributes:
1230
+ # Note that it's +groupsort+ that affects the order of matches in
1231
+ # the final result set. Sorting mode (see {#set_sort_mode}) affects
1232
+ # the ordering of matches within group, ie. what match will be
1233
+ # selected as the best one from the group. So you can for instance
1234
+ # order the groups by matches count and select the most relevant
1235
+ # match within each group at the same time.
657
1236
  #
658
- # * @id - match document ID;
659
- # * @weight, @rank, @relevance - match weight;
660
- # * @group - groupby function value;
661
- # * @count - amount of matches in group.
1237
+ # Starting with version 0.9.9-rc2, aggregate functions (<tt>AVG()</tt>,
1238
+ # <tt>MIN()</tt>, <tt>MAX()</tt>, <tt>SUM()</tt>) are supported
1239
+ # through {#set_select} API call when using <tt>GROUP BY</tt>.
1240
+ #
1241
+ # You can specify group function and attribute as String
1242
+ # ("attr", "day", etc), Symbol (:attr, :day, etc), or
1243
+ # Fixnum constant (SPH_GROUPBY_ATTR, SPH_GROUPBY_DAY, etc).
1244
+ #
1245
+ # @param [String, Symbol] attribute an attribute name to group by.
1246
+ # @param [Integer, String, Symbol] func a grouping function.
1247
+ # @param [String] groupsort a groups sorting mode.
1248
+ # @return [Sphinx::Client] self.
662
1249
  #
663
- # the default mode is to sort by groupby value in descending order,
664
- # ie. by '@group desc'.
1250
+ # @example
1251
+ # sphinx.set_group_by(:tag_id, :attr)
665
1252
  #
666
- # 'total_found' would contain total amount of matching groups over
667
- # the whole index.
1253
+ # @raise [ArgumentError] Occurred when parameters are invalid.
668
1254
  #
669
- # WARNING: grouping is done in fixed memory and thus its results
670
- # are only approximate; so there might be more groups reported
671
- # in total_found than actually present. @count might also
672
- # be underestimated.
1255
+ # @see http://www.sphinxsearch.com/docs/current.html#clustering Section 4.6, "Grouping (clustering) search results"
1256
+ # @see http://www.sphinxsearch.com/docs/current.html#sort-extended Section 4.5, "SPH_SORT_EXTENDED mode"
1257
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgroupby Section 6.5.1, "SetGroupBy"
1258
+ # @see #set_sort_mode
1259
+ # @see #set_select
1260
+ # @see #set_group_distinct
673
1261
  #
674
- # For example, if sorting by relevance and grouping by "published"
675
- # attribute with SPH_GROUPBY_DAY function, then the result set will
676
- # contain one most relevant match per each day when there were any
677
- # matches published, with day number and per-day match count attached,
678
- # and sorted by day number in descending order (ie. recent days first).
679
- def SetGroupBy(attribute, func, groupsort = '@group desc')
1262
+ def set_group_by(attribute, func, groupsort = '@group desc')
680
1263
  raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
681
1264
  raise ArgumentError, '"groupsort" argument must be String' unless groupsort.kind_of?(String)
682
1265
 
@@ -696,217 +1279,311 @@ module Sphinx
696
1279
  @groupby = attribute.to_s
697
1280
  @groupfunc = func
698
1281
  @groupsort = groupsort
1282
+ self
699
1283
  end
700
-
701
- # Set count-distinct attribute for group-by queries.
702
- def SetGroupDistinct(attribute)
703
- raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
1284
+ alias :SetGroupBy :set_group_by
704
1285
 
705
- @groupdistinct = attribute.to_s
706
- end
707
-
708
- # Sets distributed retry count and delay.
1286
+ # Sets attribute name for per-group distinct values count
1287
+ # calculations. Only available for grouping queries.
709
1288
  #
710
- # On temporary failures searchd will attempt up to +count+ retries per
711
- # agent. +delay+ is the delay between the retries, in milliseconds. Retries
712
- # are disabled by default. Note that this call will not make the API itself
713
- # retry on temporary failure; it only tells searchd to do so. Currently,
714
- # the list of temporary failures includes all kinds of +connect+
715
- # failures and maxed out (too busy) remote agents.
1289
+ # +attribute+ is a string that contains the attribute name. For
1290
+ # each group, all values of this attribute will be stored (as
1291
+ # RAM limits permit), then the amount of distinct values will
1292
+ # be calculated and returned to the client. This feature is
1293
+ # similar to <tt>COUNT(DISTINCT)</tt> clause in standard SQL;
1294
+ # so these Sphinx calls:
716
1295
  #
717
- def SetRetries(count, delay = 0)
718
- raise ArgumentError, '"count" argument must be Integer' unless count.respond_to?(:integer?) and count.integer?
719
- raise ArgumentError, '"delay" argument must be Integer' unless delay.respond_to?(:integer?) and delay.integer?
720
-
721
- @retrycount = count
722
- @retrydelay = delay
723
- end
724
-
725
- # Sets temporary (per-query) per-document attribute value overrides. Only
726
- # supports scalar attributes. +values+ must be a +Hash+ that maps document
727
- # IDs to overridden attribute values.
1296
+ # sphinx.set_group_by(:category, :attr, '@count desc')
1297
+ # sphinx.set_group_distinct(:vendor)
728
1298
  #
729
- # Override feature lets you "temporary" update attribute values for some
730
- # documents within a single query, leaving all other queries unaffected.
731
- # This might be useful for personalized data. For example, assume you're
732
- # implementing a personalized search function that wants to boost the posts
733
- # that the user's friends recommend. Such data is not just dynamic, but
734
- # also personal; so you can't simply put it in the index because you don't
735
- # want everyone's searches affected. Overrides, on the other hand, are local
736
- # to a single query and invisible to everyone else. So you can, say, setup
737
- # a "friends_weight" value for every document, defaulting to 0, then
738
- # temporary override it with 1 for documents 123, 456 and 789 (recommended
739
- # by exactly the friends of current user), and use that value when ranking.
1299
+ # can be expressed using the following SQL clauses:
740
1300
  #
741
- def SetOverride(attrname, attrtype, values)
742
- raise ArgumentError, '"attrname" argument must be String or Symbol' unless attrname.kind_of?(String) or attrname.kind_of?(Symbol)
743
-
744
- case attrtype
745
- when String, Symbol
746
- begin
747
- attrtype = self.class.const_get("SPH_ATTR_#{attrtype.to_s.upcase}")
748
- rescue NameError
749
- raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid"
750
- end
751
- when Fixnum
752
- raise ArgumentError, "\"attrtype\" argument value \"#{attrtype}\" is invalid" unless (SPH_ATTR_INTEGER..SPH_ATTR_BIGINT).include?(attrtype)
753
- else
754
- raise ArgumentError, '"attrtype" argument must be Fixnum, String, or Symbol'
755
- end
756
-
757
- raise ArgumentError, '"values" argument must be Hash' unless values.kind_of?(Hash)
758
-
759
- values.each do |id, value|
760
- raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless id.respond_to?(:integer?) and id.integer?
761
- case attrtype
762
- when SPH_ATTR_TIMESTAMP
763
- raise ArgumentError, '"values" argument must be Hash map of Integer to Integer or Time' unless (value.respond_to?(:integer?) and value.integer?) or value.kind_of?(Time)
764
- when SPH_ATTR_FLOAT
765
- raise ArgumentError, '"values" argument must be Hash map of Integer to Float or Integer' unless value.kind_of?(Float) or (value.respond_to?(:integer?) and value.integer?)
766
- else
767
- # SPH_ATTR_INTEGER, SPH_ATTR_ORDINAL, SPH_ATTR_BOOL, SPH_ATTR_BIGINT
768
- raise ArgumentError, '"values" argument must be Hash map of Integer to Integer' unless value.respond_to?(:integer?) and value.integer?
769
- end
770
- end
771
-
772
- @overrides << { 'attr' => attrname.to_s, 'type' => attrtype, 'values' => values }
773
- end
774
-
775
- # Sets the select clause, listing specific attributes to fetch, and
776
- # expressions to compute and fetch. Clause syntax mimics SQL.
1301
+ # SELECT id, weight, all-attributes,
1302
+ # COUNT(DISTINCT vendor) AS @distinct,
1303
+ # COUNT(*) AS @count
1304
+ # FROM products
1305
+ # GROUP BY category
1306
+ # ORDER BY @count DESC
777
1307
  #
778
- # +SetSelect+ is very similar to the part of a typical SQL query between
779
- # +SELECT+ and +FROM+. It lets you choose what attributes (columns) to
780
- # fetch, and also what expressions over the columns to compute and fetch.
781
- # A certain difference from SQL is that expressions must always be aliased
782
- # to a correct identifier (consisting of letters and digits) using +AS+
783
- # keyword. SQL also lets you do that but does not require to. Sphinx enforces
784
- # aliases so that the computation results can always be returned under a
785
- #{ }"normal" name in the result set, used in other clauses, etc.
1308
+ # In the sample pseudo code shown just above, {#set_group_distinct}
1309
+ # call corresponds to <tt>COUNT(DISTINCT vendor)</tt> clause only.
1310
+ # <tt>GROUP BY</tt>, <tt>ORDER BY</tt>, and <tt>COUNT(*)</tt>
1311
+ # clauses are all an equivalent of {#set_group_by} settings. Both
1312
+ # queries will return one matching row for each category. In
1313
+ # addition to indexed attributes, matches will also contain
1314
+ # total per-category matches count, and the count of distinct
1315
+ # vendor IDs within each category.
786
1316
  #
787
- # Everything else is basically identical to SQL. Star ('*') is supported.
788
- # Functions are supported. Arbitrary amount of expressions is supported.
789
- # Computed expressions can be used for sorting, filtering, and grouping,
790
- # just as the regular attributes.
1317
+ # @param [String, Symbol] attribute an attribute name.
1318
+ # @return [Sphinx::Client] self.
791
1319
  #
792
- # Starting with version 0.9.9-rc2, aggregate functions (<tt>AVG()</tt>,
793
- # <tt>MIN()</tt>, <tt>MAX()</tt>, <tt>SUM()</tt>) are supported when using
794
- # <tt>GROUP BY</tt>.
1320
+ # @example
1321
+ # sphinx.set_group_distinct(:category_id)
795
1322
  #
796
- # Expression sorting (Section 4.5, “SPH_SORT_EXPR mode”) and geodistance
797
- # functions (+SetGeoAnchor+) are now internally implemented
798
- # using this computed expressions mechanism, using magic names '<tt>@expr</tt>'
799
- # and '<tt>@geodist</tt>' respectively.
1323
+ # @raise [ArgumentError] Occurred when parameters are invalid.
800
1324
  #
801
- # Usage example:
1325
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-setgroupdistinct Section 6.5.2, "SetGroupDistinct"
1326
+ # @see #set_group_by
802
1327
  #
803
- # sphinx.SetSelect('*, @weight+(user_karma+ln(pageviews))*0.1 AS myweight')
804
- # sphinx.SetSelect('exp_years, salary_gbp*{$gbp_usd_rate} AS salary_usd, IF(age>40,1,0) AS over40')
805
- # sphinx.SetSelect('*, AVG(price) AS avgprice')
806
- #
807
- def SetSelect(select)
808
- raise ArgumentError, '"select" argument must be String' unless select.kind_of?(String)
1328
+ def set_group_distinct(attribute)
1329
+ raise ArgumentError, '"attribute" argument must be String or Symbol' unless attribute.kind_of?(String) or attribute.kind_of?(Symbol)
809
1330
 
810
- @select = select
1331
+ @groupdistinct = attribute.to_s
1332
+ self
811
1333
  end
812
-
1334
+ alias :SetGroupDistinct :set_group_distinct
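The relationship between `set_group_by`, `set_group_distinct` and the SQL pseudo code can be emulated in plain Ruby. This sketch runs on a hypothetical in-memory array rather than a Sphinx index, so it illustrates only the semantics; per the documentation above, searchd stores distinct values only as RAM limits permit.

```ruby
# Hypothetical sample rows; attribute names follow the SQL example.
rows = [
  { id: 1, category: 10, vendor: 1 },
  { id: 2, category: 10, vendor: 2 },
  { id: 3, category: 10, vendor: 2 },
  { id: 4, category: 20, vendor: 3 }
]

# GROUP BY category, with COUNT(*) and COUNT(DISTINCT vendor) per group.
groups = rows.group_by { |row| row[:category] }.map do |category, members|
  {
    category: category,
    count: members.length,                                # like @count
    distinct: members.map { |m| m[:vendor] }.uniq.length  # like @distinct
  }
end

# Equivalent of the '@count desc' group sorting clause.
groups = groups.sort_by { |group| -group[:count] }
```

Each element of `groups` corresponds to one row of the grouped result set: one entry per category, carrying the match count and the number of distinct vendors.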
1335
+
1336
+ #=================================================================
1337
+ # Querying
1338
+ #=================================================================
1339
+
813
1340
  # Clears all currently set filters.
814
1341
  #
815
1342
  # This call is only normally required when using multi-queries. You might want
816
1343
  # to set different filters for different queries in the batch. To do that,
817
- # you should call +ResetFilters+ and add new filters using the respective calls.
1344
+ # you should call {#reset_filters} and add new filters using the respective calls.
1345
+ #
1346
+ # @return [Sphinx::Client] self.
818
1347
  #
819
- # Usage example:
1348
+ # @example
1349
+ # sphinx.reset_filters
820
1350
  #
821
- # sphinx.ResetFilters
1351
+ # @see #set_filter
1352
+ # @see #set_filter_range
1353
+ # @see #set_filter_float_range
1354
+ # @see #set_geo_anchor
822
1355
  #
823
- def ResetFilters
1356
+ def reset_filters
824
1357
  @filters = []
825
1358
  @anchor = []
1359
+ self
826
1360
  end
827
-
1361
+ alias :ResetFilters :reset_filters
1362
+
828
1363
  # Clears all current group-by settings, and disables group-by.
829
1364
  #
830
1365
  # This call is only normally required when using multi-queries. You can
831
- # change individual group-by settings using +SetGroupBy+ and +SetGroupDistinct+
832
- # calls, but you can not disable group-by using those calls. +ResetGroupBy+
1366
+ # change individual group-by settings using {#set_group_by} and {#set_group_distinct}
1367
+ # calls, but you can not disable group-by using those calls. {#reset_group_by}
833
1368
  # fully resets previous group-by settings and disables group-by mode in the
834
- # current state, so that subsequent +AddQuery+ calls can perform non-grouping
1369
+ # current state, so that subsequent {#add_query} calls can perform non-grouping
835
1370
  # searches.
836
1371
  #
837
- # Usage example:
1372
+ # @return [Sphinx::Client] self.
838
1373
  #
839
- # sphinx.ResetGroupBy
1374
+ # @example
1375
+ # sphinx.reset_group_by
840
1376
  #
841
- def ResetGroupBy
1377
+ # @see #set_group_by
1378
+ # @see #set_group_distinct
1379
+ #
1380
+ def reset_group_by
842
1381
  @groupby = ''
843
1382
  @groupfunc = SPH_GROUPBY_DAY
844
1383
  @groupsort = '@group desc'
845
1384
  @groupdistinct = ''
1385
+ self
846
1386
  end
847
-
1387
+ alias :ResetGroupBy :reset_group_by
1388
+
848
1389
  # Clear all attribute value overrides (for multi-queries).
849
- def ResetOverrides
850
- @overrides = []
851
- end
852
-
853
- # Connect to searchd server and run given search query.
854
1390
  #
855
- # <tt>query</tt> is query string
856
-
857
- # <tt>index</tt> is index name (or names) to query. default value is "*" which means
858
- # to query all indexes. Accepted characters for index names are letters, numbers,
859
- # dash, and underscore; everything else is considered a separator. Therefore,
860
- # all the following calls are valid and will search two indexes:
1391
+ # This call is only normally required when using multi-queries. You might want
1392
+ # to set field overrides for different queries in the batch. To do that,
1393
+ # you should call {#reset_overrides} and add new overrides using the
1394
+ # respective calls.
861
1395
  #
862
- # sphinx.Query('test query', 'main delta')
863
- # sphinx.Query('test query', 'main;delta')
864
- # sphinx.Query('test query', 'main, delta')
1396
+ # @return [Sphinx::Client] self.
865
1397
  #
866
- # Index order matters. If identical IDs are found in two or more indexes,
867
- # weight and attribute values from the very last matching index will be used
868
- # for sorting and returning to client. Therefore, in the example above,
869
- # matches from "delta" index will always "win" over matches from "main".
1398
+ # @example
1399
+ # sphinx.reset_overrides
870
1400
  #
871
- # Returns false on failure.
872
- # Returns hash which has the following keys on success:
873
- #
874
- # * <tt>'matches'</tt> -- array of hashes {'weight', 'group', 'id'}, where 'id' is document_id.
875
- # * <tt>'total'</tt> -- total amount of matches retrieved (upto SPH_MAX_MATCHES, see sphinx.h)
876
- # * <tt>'total_found'</tt> -- total amount of matching documents in index
877
- # * <tt>'time'</tt> -- search time
878
- # * <tt>'words'</tt> -- hash which maps query terms (stemmed!) to ('docs', 'hits') hash
879
- def Query(query, index = '*', comment = '')
1401
+ # @see #set_override
1402
+ #
1403
+ def reset_overrides
1404
+ @overrides = []
1405
+ self
1406
+ end
1407
+ alias :ResetOverrides :reset_overrides
1408
+
1409
+ # Connects to searchd server, runs given search query with
1410
+ # current settings, obtains and returns the result set.
1411
+ #
1412
+ # +query+ is a query string. +index+ is an index name (or names)
1413
+ # string. Returns false and sets {#last_error} message on general
1414
+ # error. Returns search result set on success. Additionally,
1415
+ # the contents of +comment+ are sent to the query log, marked in
1416
+ # square brackets, just before the search terms, which can be very
1417
+ # useful for debugging. Currently, the comment is limited to 128
1418
+ # characters.
1419
+ #
1420
+ # Default value for +index+ is <tt>"*"</tt> that means to query
1421
+ # all local indexes. Characters allowed in index names include
1422
+ # Latin letters (a-z), numbers (0-9), minus sign (-), and
1423
+ # underscore (_); everything else is considered a separator.
1424
+ # Therefore, all of the following sample calls are valid and
1425
+ # will search the same two indexes:
1426
+ #
1427
+ # sphinx.query('test query', 'main delta')
1428
+ # sphinx.query('test query', 'main;delta')
1429
+ # sphinx.query('test query', 'main, delta')
1430
+ #
1431
+ # Index specification order matters. If documents with identical
1432
+ # IDs are found in two or more indexes, weight and attribute
1433
+ # values from the very last matching index will be used for
1434
+ # sorting and returning to client (unless explicitly overridden
1435
+ # with {#set_index_weights}). Therefore, in the example above,
1436
+ # matches from "delta" index will always win over matches
1437
+ # from "main".
1438
+ #
1439
+ # On success, {#query} returns a result set that contains some
1440
+ # of the found matches (as requested by {#set_limits}) and
1441
+ # additional general per-query statistics. The result set
1442
+ # is a +Hash+ with the following keys and values:
1443
+ #
1444
+ # <tt>"matches"</tt>::
1445
+ # Array with small +Hash+es containing document weight and
1446
+ # attribute values.
1447
+ # <tt>"total"</tt>::
1448
+ # Total amount of matches retrieved on server (ie. to the server
1449
+ # side result set) by this query. You can retrieve up to this
1450
+ # amount of matches from server for this query text with current
1451
+ # query settings.
1452
+ # <tt>"total_found"</tt>::
1453
+ # Total amount of matching documents in index (that were found
1454
+ # and processed on server).
1455
+ # <tt>"words"</tt>::
1456
+ # Hash which maps query keywords (case-folded, stemmed, and
1457
+ # otherwise processed) to a small Hash with per-keyword statistics
1458
+ # ("docs", "hits").
1459
+ # <tt>"error"</tt>::
1460
+ # Query error message reported by searchd (string, human readable).
1461
+ # Empty if there were no errors.
1462
+ # <tt>"warning"</tt>::
1463
+ # Query warning message reported by searchd (string, human readable).
1464
+ # Empty if there were no warnings.
1465
+ #
1466
+ # It should be noted that {#query} carries out the same actions as
1467
+ # {#add_query} and {#run_queries} without the intermediate steps; it
1468
+ # is analogous to a single {#add_query} call, followed by a
1469
+ # corresponding {#run_queries}, then returning the first array
1470
+ # element of matches (from the first, and only, query.)
1471
+ #
1472
+ # @param [String] query a query string.
1473
+ # @param [String] index an index name (or names).
1474
+ # @param [String] comment a comment to be sent to the query log.
1475
+ # @return [Hash, false] result set described above or +false+ on error.
1476
+ #
1477
+ # @example
1478
+ # sphinx.query('some search text', '*', 'search page')
1479
+ #
1480
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-query Section 6.6.1, "Query"
1481
+ # @see #add_query
1482
+ # @see #run_queries
1483
+ #
1484
+ def query(query, index = '*', comment = '')
880
1485
  @reqs = []
881
-
882
- self.AddQuery(query, index, comment)
883
- results = self.RunQueries
884
-
1486
+
1487
+ self.add_query(query, index, comment)
1488
+ results = self.run_queries
1489
+
885
1490
  # probably network error; error message should be already filled
886
1491
  return false unless results.instance_of?(Array)
887
-
1492
+
888
1493
  @error = results[0]['error']
889
1494
  @warning = results[0]['warning']
890
-
1495
+
891
1496
  return false if results[0]['status'] == SEARCHD_ERROR
892
1497
  return results[0]
893
1498
  end
894
-
895
- # Add query to batch.
896
- #
897
- # Batch queries enable searchd to perform internal optimizations,
898
- # if possible; and reduce network connection overheads in all cases.
899
- #
900
- # For instance, running exactly the same query with different
901
- # groupby settings will enable searched to perform expensive
902
- # full-text search and ranking operation only once, but compute
903
- # multiple groupby results from its output.
1499
+ alias :Query :query
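The index name rule in the documentation above (letters, digits, minus and underscore belong to a name, everything else separates names) can be approximated with a regular expression. This is only a sketch of the rule: the actual splitting is done by searchd, and `split_index_names` is a hypothetical helper.

```ruby
# Approximation of the separator rule described in the method
# documentation. The real splitting happens inside searchd.
def split_index_names(spec)
  spec.scan(/[A-Za-z0-9_-]+/)
end

# The three index specifications used in the documentation.
['main delta', 'main;delta', 'main, delta'].each do |spec|
  p split_index_names(spec)
end
```

All three specifications yield the same two names, `main` followed by `delta`, which is why the three `query` calls in the documentation search the same indexes in the same order.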
1500
+
1501
+ # Adds additional query with current settings to multi-query batch.
1502
+ # +query+ is a query string. +index+ is an index name (or names)
1503
+ # string. Additionally if provided, the contents of +comment+ are
1504
+ # sent to the query log, marked in square brackets, just before
1505
+ # the search terms, which can be very useful for debugging.
1506
+ # Currently, this is limited to 128 characters. Returns index
1507
+ # to results array returned from {#run_queries}.
1508
+ #
1509
+ # Batch queries (or multi-queries) enable searchd to perform
1510
+ # internal optimizations if possible. They also reduce network
1511
+ # connection overheads and search process creation overheads in all
1512
+ # cases. They do not result in any additional overheads compared
1513
+ # to simple queries. Thus, if you run several different queries
1514
+ # from your web page, you should always consider using multi-queries.
1515
+ #
1516
+ # For instance, running the same full-text query but with different
1517
+ # sorting or group-by settings will enable searchd to perform
1518
+ # expensive full-text search and ranking operation only once, but
1519
+ # compute multiple group-by results from its output.
1520
+ #
1521
+ # This can be a big saver when you need to display not just plain
1522
+ # search results but also some per-category counts, such as the
1523
+ # amount of products grouped by vendor. Without multi-query, you
1524
+ # would have to run several queries which perform essentially the
1525
+ # same search and retrieve the same matches, but create result
1526
+ # sets differently. With multi-query, you simply pass all these
1527
+ # queries in a single batch and Sphinx optimizes the redundant
1528
+ # full-text search internally.
1529
+ #
1530
+ # {#add_query} internally saves full current settings state along
1531
+ # with the query, and you can safely change them afterwards for
1532
+ # subsequent {#add_query} calls. Already added queries will not
1533
+ # be affected; there's actually no way to change them at all.
1534
+ # Here's an example:
1535
+ #
1536
+ # sphinx.set_sort_mode(:relevance)
1537
+ # sphinx.add_query("hello world", "documents")
1538
+ #
1539
+ # sphinx.set_sort_mode(:attr_desc, :price)
1540
+ # sphinx.add_query("ipod", "products")
904
1541
  #
905
- # Parameters are exactly the same as in <tt>Query</tt> call.
906
- # Returns index to results array returned by <tt>RunQueries</tt> call.
907
- def AddQuery(query, index = '*', comment = '')
1542
+ # sphinx.add_query("harry potter", "books")
1543
+ #
1544
+ # results = sphinx.run_queries
1545
+ #
1546
+ # With the code above, 1st query will search for "hello world"
1547
+ # in "documents" index and sort results by relevance, 2nd query
1548
+ # will search for "ipod" in "products" index and sort results
1549
+ # by price, and 3rd query will search for "harry potter" in
1550
+ # "books" index while still sorting by price. Note that 2nd
1551
+ # {#set_sort_mode} call does not affect the first query (because
1552
+ # it's already added) but affects both other subsequent queries.
1553
+ #
1554
+ # Additionally, any filters set up before an {#add_query} will
1555
+ # fall through to subsequent queries. So, if {#set_filter} is
1556
+ # called before the first query, the same filter will be in
1557
+ # place for the second (and subsequent) queries batched through
1558
+ # {#add_query} unless you call {#reset_filters} first. Alternatively,
1559
+ # you can add additional filters as well.
1560
+ #
1561
+ # This would also be true for grouping options and sorting options;
1562
+ # no current sorting, filtering, and grouping settings are affected
1563
+ # by this call; so subsequent queries will reuse current query settings.
1564
+ #
1565
+ # {#add_query} returns an index into an array of results that will
1566
+ # be returned from {#run_queries} call. It is simply a sequentially
1567
+ # increasing 0-based integer, ie. first call will return 0, second
1568
+ # will return 1, and so on. Just a small helper so you won't have
1569
+ # to track the indexes manually if you need them.
1570
+ #
1571
+ # @param [String] query a query string.
1572
+ # @param [String] index an index name (or names).
1573
+ # @param [String] comment a comment to be sent to the query log.
1574
+ # @return [Integer] an index into an array of results that will
1575
+ # be returned from {#run_queries} call.
1576
+ #
1577
+ # @example
1578
+ # sphinx.add_query('some search text', '*', 'search page')
1579
+ #
1580
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-addquery Section 6.6.2, "AddQuery"
1581
+ # @see #query
1582
+ # @see #run_queries
1583
+ #
1584
+ def add_query(query, index = '*', comment = '')
908
1585
  # build request
909
-
1586
+
910
1587
  # mode and limits
911
1588
  request = Request.new
912
1589
  request.put_int @offset, @limit, @mode, @ranker, @sort
@@ -920,8 +1597,8 @@ module Sphinx
920
1597
  # id64 range marker
921
1598
  request.put_int 1
922
1599
  # id64 range
923
- request.put_int64 @min_id.to_i, @max_id.to_i
924
-
1600
+ request.put_int64 @min_id.to_i, @max_id.to_i
1601
+
925
1602
  # filters
926
1603
  request.put_int @filters.length
927
1604
  @filters.each do |filter|
@@ -940,7 +1617,7 @@ module Sphinx
940
1617
  end
941
1618
  request.put_int filter['exclude'] ? 1 : 0
942
1619
  end
943
-
1620
+
944
1621
  # group-by clause, max-matches count, group-sort clause, cutoff count
945
1622
  request.put_int @groupfunc
946
1623
  request.put_string @groupby
@@ -948,7 +1625,7 @@ module Sphinx
948
1625
  request.put_string @groupsort
949
1626
  request.put_int @cutoff, @retrycount, @retrydelay
950
1627
  request.put_string @groupdistinct
951
-
1628
+
952
1629
  # anchor point
953
1630
  if @anchor.empty?
954
1631
  request.put_int 0
@@ -957,27 +1634,27 @@ module Sphinx
957
1634
  request.put_string @anchor['attrlat'], @anchor['attrlong']
958
1635
  request.put_float @anchor['lat'], @anchor['long']
959
1636
  end
960
-
1637
+
961
1638
  # per-index weights
962
1639
  request.put_int @indexweights.length
963
1640
  @indexweights.each do |idx, weight|
964
1641
  request.put_string idx.to_s
965
1642
  request.put_int weight
966
1643
  end
967
-
1644
+
968
1645
  # max query time
969
1646
  request.put_int @maxquerytime
970
-
1647
+
971
1648
  # per-field weights
972
1649
  request.put_int @fieldweights.length
973
1650
  @fieldweights.each do |field, weight|
974
1651
  request.put_string field.to_s
975
1652
  request.put_int weight
976
1653
  end
977
-
1654
+
978
1655
  # comment
979
1656
  request.put_string comment
980
-
1657
+
981
1658
  # attribute overrides
982
1659
  request.put_int @overrides.length
983
1660
  for entry in @overrides do
@@ -995,173 +1672,196 @@ module Sphinx
995
1672
  end
996
1673
  end
997
1674
  end
998
-
1675
+
999
1676
  # select-list
1000
1677
  request.put_string @select
1001
-
1678
+
1002
1679
  # store request to requests array
1003
1680
  @reqs << request.to_s;
1004
1681
  return @reqs.length - 1
1005
1682
  end
1006
-
1007
- # Run queries batch.
1008
- #
1009
- # Returns an array of result sets on success.
1010
- # Returns false on network IO failure.
1011
- #
1012
- # Each result set in returned array is a hash which containts
1013
- # the same keys as the hash returned by <tt>Query</tt>, plus:
1014
- #
1015
- # * <tt>'error'</tt> -- search error for this query
1016
- # * <tt>'words'</tt> -- hash which maps query terms (stemmed!) to ( "docs", "hits" ) hash
1017
- #
1018
- def RunQueries
1683
+ alias :AddQuery :add_query
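The index bookkeeping documented for `add_query` can be sketched without a running searchd. The `BatchSketch` class below is hypothetical: it only mimics the queueing visible in this diff (`@reqs << ...` followed by `@reqs.length - 1`) and the documented rule that filters stay in place until they are reset. It is not the gem's client, and its `set_filter` / `reset_filters` methods are simplified stand-ins.

```ruby
# Hypothetical stand-in for the client's request queue; not part of the gem.
class BatchSketch
  attr_reader :reqs

  def initialize
    @reqs = []
    @filters = []
  end

  # Filters persist across queued queries until they are reset.
  def set_filter(attr, values)
    @filters << { 'attr' => attr, 'values' => values }
    self
  end

  def reset_filters
    @filters = []
    self
  end

  # Snapshot the current settings with the query and return its 0-based slot.
  def add_query(query, index = '*', comment = '')
    @reqs << { :query => query, :index => index, :comment => comment,
               :filters => @filters.dup }
    @reqs.length - 1
  end
end

batch = BatchSketch.new
batch.set_filter('group_id', [1])
first  = batch.add_query('ruby')
second = batch.add_query('rails')   # still carries the group_id filter
batch.reset_filters
third  = batch.add_query('sphinx')  # queued without any filter
```

The returned slots line up with positions in the array that a later batch run would return, which is why callers do not need to track them separately.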
1684
+
1685
+ # Connects to searchd, runs a batch of all queries added using
1686
+ # {#add_query}, obtains and returns the result sets. Returns
1687
+ # +false+ and sets {#last_error} message on general error
1688
+ # (such as network I/O failure). Returns a plain array of
1689
+ # result sets on success.
1690
+ #
1691
+ # Each result set in the returned array is exactly the same as
1692
+ # the result set returned from {#query}.
1693
+ #
1694
+ # Note that the batch query request itself almost always succeeds,
1695
+ # unless there's a network error, blocking index rotation in
1696
+ # progress, or another general failure which prevents the whole
1697
+ # request from being processed.
1698
+ #
1699
+ # However individual queries within the batch might very well
1700
+ # fail. In this case their respective result sets will contain
1701
+ # non-empty "error" message, but no matches or query statistics.
1702
+ # In the extreme case all queries within the batch could fail.
1703
+ # There still will be no general error reported, because API
1704
+ # was able to successfully connect to searchd, submit the batch,
1705
+ # and receive the results — but every result set will have a
1706
+ # specific error message.
1707
+ #
1708
+ # @return [Array<Hash>] an +Array+ of +Hash+es which are exactly
1709
+ # the same as the result set returned from {#query}.
1710
+ #
1711
+ # @example
1712
+ # sphinx.add_query('some search text', '*', 'search page')
1713
+ # results = sphinx.run_queries
1714
+ #
1715
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-runqueries Section 6.6.3, "RunQueries"
1716
+ # @see #add_query
1717
+ #
1718
+ def run_queries
1019
1719
  if @reqs.empty?
1020
- @error = 'No queries defined, issue AddQuery() first'
1720
+ @error = 'No queries defined, issue add_query() first'
1021
1721
  return false
1022
1722
  end
1023
1723
 
1024
- req = @reqs.join('')
1025
- nreqs = @reqs.length
1724
+ reqs, nreqs = @reqs.join(''), @reqs.length
1026
1725
  @reqs = []
1027
- response = perform_request(:search, req, nreqs)
1028
-
1726
+ response = perform_request(:search, reqs, nreqs)
1727
+
1029
1728
  # parse response
1030
- begin
1031
- results = []
1032
- ires = 0
1033
- while ires < nreqs
1034
- ires += 1
1035
- result = {}
1036
-
1037
- result['error'] = ''
1038
- result['warning'] = ''
1039
-
1040
- # extract status
1041
- status = result['status'] = response.get_int
1042
- if status != SEARCHD_OK
1043
- message = response.get_string
1044
- if status == SEARCHD_WARNING
1045
- result['warning'] = message
1046
- else
1047
- result['error'] = message
1048
- results << result
1049
- next
1050
- end
1051
- end
1052
-
1053
- # read schema
1054
- fields = []
1055
- attrs = {}
1056
- attrs_names_in_order = []
1057
-
1058
- nfields = response.get_int
1059
- while nfields > 0
1060
- nfields -= 1
1061
- fields << response.get_string
1729
+ (1..nreqs).map do
1730
+ result = { 'error' => '', 'warning' => '' }
1731
+
1732
+ # extract status
1733
+ status = result['status'] = response.get_int
1734
+ if status != SEARCHD_OK
1735
+ message = response.get_string
1736
+ if status == SEARCHD_WARNING
1737
+ result['warning'] = message
1738
+ else
1739
+ result['error'] = message
1740
+ next result
1062
1741
  end
1063
- result['fields'] = fields
1064
-
1065
- nattrs = response.get_int
1066
- while nattrs > 0
1067
- nattrs -= 1
1068
- attr = response.get_string
1069
- type = response.get_int
1070
- attrs[attr] = type
1071
- attrs_names_in_order << attr
1742
+ end
1743
+
1744
+ # read schema
1745
+ nfields = response.get_int
1746
+ result['fields'] = (1..nfields).map { response.get_string }
1747
+
1748
+ attrs_names_in_order = []
1749
+ nattrs = response.get_int
1750
+ attrs = (1..nattrs).inject({}) do |hash, idx|
1751
+ name, type = response.get_string, response.get_int
1752
+ hash[name] = type
1753
+ attrs_names_in_order << name
1754
+ hash
1755
+ end
1756
+ result['attrs'] = attrs
1757
+
1758
+ # read match count
1759
+ count, id64 = response.get_ints(2)
1760
+
1761
+ # read matches
1762
+ result['matches'] = (1..count).map do
1763
+ doc, weight = if id64 == 0
1764
+ response.get_ints(2)
1765
+ else
1766
+ [response.get_int64, response.get_int]
1072
1767
  end
1073
- result['attrs'] = attrs
1074
-
1075
- # read match count
1076
- count = response.get_int
1077
- id64 = response.get_int
1078
-
1079
- # read matches
1080
- result['matches'] = []
1081
- while count > 0
1082
- count -= 1
1083
-
1084
- if id64 != 0
1085
- doc = response.get_int64
1086
- weight = response.get_int
1087
- else
1088
- doc, weight = response.get_ints(2)
1089
- end
1090
-
1091
- r = {} # This is a single result put in the result['matches'] array
1092
- r['id'] = doc
1093
- r['weight'] = weight
1094
- attrs_names_in_order.each do |a|
1095
- r['attrs'] ||= {}
1096
-
1097
- case attrs[a]
1098
- when SPH_ATTR_BIGINT
1099
- # handle 64-bit ints
1100
- r['attrs'][a] = response.get_int64
1101
- when SPH_ATTR_FLOAT
1102
- # handle floats
1103
- r['attrs'][a] = response.get_float
1104
- when SPH_ATTR_STRING
1105
- r['attrs'][a] = response.get_string
1768
+
1769
+ # This is a single result put in the result['matches'] array
1770
+ match = { 'id' => doc, 'weight' => weight }
1771
+ match['attrs'] = attrs_names_in_order.inject({}) do |hash, name|
1772
+ hash[name] = case attrs[name]
1773
+ when SPH_ATTR_BIGINT
1774
+ # handle 64-bit ints
1775
+ response.get_int64
1776
+ when SPH_ATTR_FLOAT
1777
+ # handle floats
1778
+ response.get_float
1779
+ when SPH_ATTR_STRING
1780
+ response.get_string
1781
+ else
1782
+ # handle everything else as unsigned ints
1783
+ val = response.get_int
1784
+ if (attrs[name] & SPH_ATTR_MULTI) != 0
1785
+ (1..val).map { response.get_int }
1106
1786
  else
1107
- # handle everything else as unsigned ints
1108
- val = response.get_int
1109
- if (attrs[a] & SPH_ATTR_MULTI) != 0
1110
- r['attrs'][a] = []
1111
- 1.upto(val) do
1112
- r['attrs'][a] << response.get_int
1113
- end
1114
- else
1115
- r['attrs'][a] = val
1116
- end
1117
- end
1787
+ val
1788
+ end
1118
1789
  end
1119
- result['matches'] << r
1790
+ hash
1120
1791
  end
1121
- result['total'], result['total_found'], msecs, words = response.get_ints(4)
1122
- result['time'] = '%.3f' % (msecs / 1000.0)
1123
-
1124
- result['words'] = {}
1125
- while words > 0
1126
- words -= 1
1127
- word = response.get_string
1128
- docs, hits = response.get_ints(2)
1129
- result['words'][word] = { 'docs' => docs, 'hits' => hits }
1130
- end
1131
-
1132
- results << result
1792
+ match
1133
1793
  end
1134
- #rescue EOFError
1135
- # @error = 'incomplete reply'
1136
- # raise SphinxResponseError, @error
1794
+ result['total'], result['total_found'], msecs = response.get_ints(3)
1795
+ result['time'] = '%.3f' % (msecs / 1000.0)
1796
+
1797
+ nwords = response.get_int
1798
+ result['words'] = (1..nwords).inject({}) do |hash, idx|
1799
+ word = response.get_string
1800
+ docs, hits = response.get_ints(2)
1801
+ hash[word] = { 'docs' => docs, 'hits' => hits }
1802
+ hash
1803
+ end
1804
+
1805
+ result
1137
1806
  end
1138
-
1139
- return results
1140
1807
  end
1141
-
1142
- # Connect to searchd server and generate exceprts from given documents.
1143
- #
1144
- # * <tt>docs</tt> -- an array of strings which represent the documents' contents
1145
- # * <tt>index</tt> -- a string specifiying the index which settings will be used
1146
- # for stemming, lexing and case folding
1147
- # * <tt>words</tt> -- a string which contains the words to highlight
1148
- # * <tt>opts</tt> is a hash which contains additional optional highlighting parameters.
1149
- #
1150
- # You can use following parameters:
1151
- # * <tt>'before_match'</tt> -- a string to insert before a set of matching words, default is "<b>"
1152
- # * <tt>'after_match'</tt> -- a string to insert after a set of matching words, default is "<b>"
1153
- # * <tt>'chunk_separator'</tt> -- a string to insert between excerpts chunks, default is " ... "
1154
- # * <tt>'limit'</tt> -- max excerpt size in symbols (codepoints), default is 256
1155
- # * <tt>'around'</tt> -- how much words to highlight around each match, default is 5
1156
- # * <tt>'exact_phrase'</tt> -- whether to highlight exact phrase matches only, default is <tt>false</tt>
1157
- # * <tt>'single_passage'</tt> -- whether to extract single best passage only, default is false
1158
- # * <tt>'use_boundaries'</tt> -- whether to extract passages by phrase boundaries setup in tokenizer
1159
- # * <tt>'weight_order'</tt> -- whether to order best passages in document (default) or weight order
1160
- #
1161
- # Returns false on failure.
1162
- # Returns an array of string excerpts on success.
1163
- #
1164
- def BuildExcerpts(docs, index, words, opts = {})
1808
+ alias :RunQueries :run_queries
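Because a batch can succeed as a whole while individual queries fail, callers usually need to inspect each result set. The following is a minimal sketch, assuming only what the comments above state: the return value is either `false` or an array of hashes with an `'error'` key. The helper name `partition_results` and the sample data are hypothetical.

```ruby
# Split a run_queries-style return value into successes and failures.
# `results` is either false (general failure) or an Array of Hashes
# carrying at least the 'error' key.
def partition_results(results)
  return [[], []] unless results # general failure: nothing to inspect

  results.partition { |result| result['error'].to_s.empty? }
end

# Sample data made up for illustration.
sample = [
  { 'error' => '', 'warning' => '', 'matches' => [{ 'id' => 1, 'weight' => 2 }] },
  { 'error' => 'unknown local index', 'warning' => '' }
]
ok, failed = partition_results(sample)
```

A general failure still has to be diagnosed through `last_error`; this helper only covers the per-query case.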
1809
+
1810
+ #=================================================================
1811
+ # Additional functionality
1812
+ #=================================================================
1813
+
1814
+ # Excerpts (snippets) builder function. Connects to searchd, asks
1815
+ # it to generate excerpts (snippets) from given documents, and
1816
+ # returns the results.
1817
+ #
1818
+ # +docs+ is a plain array of strings that carry the documents'
1819
+ # contents. +index+ is an index name string. Different settings
1820
+ # (such as charset, morphology, wordforms) from given index will
1821
+ # be used. +words+ is a string that contains the keywords to
1822
+ # highlight. They will be processed with respect to index settings.
1823
+ # For instance, if English stemming is enabled in the index,
1824
+ # "shoes" will be highlighted even if keyword is "shoe". Starting
1825
+ # with version 0.9.9-rc1, keywords can contain wildcards, that
1826
+ # work similarly to star-syntax available in queries.
1827
+ #
1828
+ # @param [Array<String>] docs an array of strings which represent
1829
+ # the documents' contents.
1830
+ # @param [String] index an index which settings will be used for
1831
+ # stemming, lexing and case folding.
1832
+ # @param [String] words a string which contains the words to highlight.
1833
+ # @param [Hash] opts a +Hash+ which contains additional optional
1834
+ # highlighting parameters.
1835
+ # @option opts [String] 'before_match' ("<b>") a string to insert before a
1836
+ # keyword match.
1837
+ # @option opts [String] 'after_match' ("</b>") a string to insert after a
1838
+ # keyword match.
1839
+ # @option opts [String] 'chunk_separator' (" ... ") a string to insert
1840
+ # between snippet chunks (passages).
1841
+ # @option opts [Integer] 'limit' (256) maximum snippet size, in symbols
1842
+ # (codepoints).
1843
+ # @option opts [Integer] 'around' (5) how many words to pick around
1844
+ # each matching keywords block.
1845
+ # @option opts [Boolean] 'exact_phrase' (false) whether to highlight exact
1846
+ # query phrase matches only instead of individual keywords.
1847
+ # @option opts [Boolean] 'single_passage' (false) whether to extract single
1848
+ # best passage only.
1849
+ # @option opts [Boolean] 'use_boundaries' (false) whether to extract
1850
+ # passages by phrase boundaries set up in the tokenizer.
1851
+ # @option opts [Boolean] 'weight_order' (false) whether to sort the
1852
+ # extracted passages in order of relevance (decreasing weight),
1853
+ # or in order of appearance in the document (increasing position).
1854
+ # @return [Array<String>, false] a plain array of strings with
1855
+ # excerpts (snippets) on success; otherwise, +false+.
1856
+ #
1857
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1858
+ #
1859
+ # @example
1860
+ # sphinx.build_excerpts(['hello world', 'hello me'], 'idx', 'hello')
1861
+ #
1862
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-buildexcerpts Section 6.7.1, "BuildExcerpts"
1863
+ #
1864
+ def build_excerpts(docs, index, words, opts = {})
1165
1865
  raise ArgumentError, '"docs" argument must be Array' unless docs.kind_of?(Array)
1166
1866
  raise ArgumentError, '"index" argument must be String' unless index.kind_of?(String) or index.kind_of?(Symbol)
1167
1867
  raise ArgumentError, '"words" argument must be String' unless words.kind_of?(String)
@@ -1182,9 +1882,9 @@ module Sphinx
1182
1882
  opts['use_boundaries'] ||= opts[:use_boundaries] || false
1183
1883
  opts['weight_order'] ||= opts[:weight_order] || false
1184
1884
  opts['query_mode'] ||= opts[:query_mode] || false
1185
-
1885
+
1186
1886
  # build request
1187
-
1887
+
1188
1888
  # v.1.0 req
1189
1889
  flags = 1
1190
1890
  flags |= 2 if opts['exact_phrase']
@@ -1192,47 +1892,71 @@ module Sphinx
1192
1892
  flags |= 8 if opts['use_boundaries']
1193
1893
  flags |= 16 if opts['weight_order']
1194
1894
  flags |= 32 if opts['query_mode']
1195
-
1895
+
1196
1896
  request = Request.new
1197
1897
  request.put_int 0, flags # mode=0, flags=1 (remove spaces)
1198
1898
  # req index
1199
1899
  request.put_string index.to_s
1200
1900
  # req words
1201
1901
  request.put_string words
1202
-
1902
+
1203
1903
  # options
1204
1904
  request.put_string opts['before_match']
1205
1905
  request.put_string opts['after_match']
1206
1906
  request.put_string opts['chunk_separator']
1207
1907
  request.put_int opts['limit'].to_i, opts['around'].to_i
1208
-
1908
+
1209
1909
  # documents
1210
1910
  request.put_int docs.size
1211
1911
  request.put_string(*docs)
1212
-
1912
+
1213
1913
  response = perform_request(:excerpt, request)
1214
-
1914
+
1215
1915
  # parse response
1216
- begin
1217
- res = []
1218
- docs.each do |doc|
1219
- res << response.get_string
1220
- end
1221
- rescue EOFError
1222
- @error = 'incomplete reply'
1223
- raise SphinxResponseError, @error
1224
- end
1225
- return res
1916
+ docs.map { response.get_string }
1226
1917
  end
1227
-
1228
- # Connect to searchd server, and generate keyword list for a given query.
1229
- #
1230
- # Returns an array of words on success.
1231
- def BuildKeywords(query, index, hits)
1918
+ alias :BuildExcerpts :build_excerpts
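The boolean options of `build_excerpts` are packed into a single integer before being sent. The sketch below reproduces that packing using only the bit values visible in this diff; the bit for `'single_passage'` falls inside an elided hunk and is deliberately left out. The constant and method names are hypothetical.

```ruby
# Bit values copied from the lines visible in this diff. The bit for
# 'single_passage' is not shown there, so it is omitted here.
EXCERPT_FLAGS = {
  'exact_phrase'   => 2,
  'use_boundaries' => 8,
  'weight_order'   => 16,
  'query_mode'     => 32
}.freeze

# Accept either String or Symbol keys, as build_excerpts does.
def excerpt_flags(opts)
  EXCERPT_FLAGS.inject(1) do |flags, (name, bit)| # 1 = "remove spaces"
    enabled = opts[name] || opts[name.to_sym]
    enabled ? (flags | bit) : flags
  end
end

flags = excerpt_flags({ :exact_phrase => true, 'weight_order' => true })
```

Here `flags` combines the base bit with the two enabled options (1 | 2 | 16).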
1919
+
1920
+ # Extracts keywords from query using tokenizer settings for given
1921
+ # index, optionally with per-keyword occurrence statistics.
1922
+ # Returns an array of hashes with per-keyword information.
1923
+ #
1924
+ # +query+ is a query to extract keywords from. +index+ is a name of
1925
+ # the index to get tokenizing settings and keyword occurrence
1926
+ # statistics from. +hits+ is a boolean flag that indicates whether
1927
+ # keyword occurrence statistics are required.
1928
+ #
1929
+ # The result set consists of +Hash+es with the following keys and values:
1930
+ #
1931
+ # <tt>'tokenized'</tt>::
1932
+ # Tokenized keyword.
1933
+ # <tt>'normalized'</tt>::
1934
+ # Normalized keyword.
1935
+ # <tt>'docs'</tt>::
1936
+ # The number of documents where the keyword is found (if +hits+ param is +true+).
1937
+ # <tt>'hits'</tt>::
1938
+ # The number of keyword occurrences across all documents (if +hits+ param is +true+).
1939
+ #
1940
+ # @param [String] query a query string.
1941
+ # @param [String] index an index to get tokenizing settings and
1942
+ # keyword occurrence statistics from.
1943
+ # @param [Boolean] hits indicates whether keyword occurrence
1944
+ # statistics are required.
1945
+ # @return [Array<Hash>] an +Array+ of +Hash+es in format specified
1946
+ # above.
1947
+ #
1948
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1949
+ #
1950
+ # @example
1951
+ # keywords = sphinx.build_keywords("this.is.my query", "test1", false)
1952
+ #
1953
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-buildkeywords Section 6.7.3, "BuildKeywords"
1954
+ #
1955
+ def build_keywords(query, index, hits)
1232
1956
  raise ArgumentError, '"query" argument must be String' unless query.kind_of?(String)
1233
1957
  raise ArgumentError, '"index" argument must be String' unless index.kind_of?(String) or index.kind_of?(Symbol)
1234
1958
  raise ArgumentError, '"hits" argument must be Boolean' unless hits.kind_of?(TrueClass) or hits.kind_of?(FalseClass)
1235
-
1959
+
1236
1960
  # build request
1237
1961
  request = Request.new
1238
1962
  # v.1.0 req
@@ -1241,53 +1965,79 @@ module Sphinx
1241
1965
  request.put_int hits ? 1 : 0
1242
1966
 
1243
1967
  response = perform_request(:keywords, request)
1244
-
1968
+
1245
1969
  # parse response
1246
- begin
1247
- res = []
1248
- nwords = response.get_int
1249
- 0.upto(nwords - 1) do |i|
1250
- tokenized = response.get_string
1251
- normalized = response.get_string
1252
-
1253
- entry = { 'tokenized' => tokenized, 'normalized' => normalized }
1254
- entry['docs'], entry['hits'] = response.get_ints(2) if hits
1255
-
1256
- res << entry
1257
- end
1258
- rescue EOFError
1259
- @error = 'incomplete reply'
1260
- raise SphinxResponseError, @error
1970
+ nwords = response.get_int
1971
+ (0...nwords).map do
1972
+ tokenized = response.get_string
1973
+ normalized = response.get_string
1974
+
1975
+ entry = { 'tokenized' => tokenized, 'normalized' => normalized }
1976
+ entry['docs'], entry['hits'] = response.get_ints(2) if hits
1977
+
1978
+ entry
1261
1979
  end
1262
-
1263
- return res
1264
1980
  end
1981
+ alias :BuildKeywords :build_keywords
1265
1982
 
1266
- # Batch update given attributes in given rows in given indexes.
1983
+ # Instantly updates given attribute values in given documents.
1984
+ # Returns number of actually updated documents (0 or more) on
1985
+ # success, or -1 on failure.
1986
+ #
1987
+ # +index+ is a name of the index (or indexes) to be updated.
1988
+ # +attrs+ is a plain array with string attribute names, listing
1989
+ # attributes that are updated. +values+ is a Hash where key is
1990
+ # document ID, and value is a plain array of new attribute values.
1991
+ #
1992
+ # +index+ can be either a single index name or a list, like in
1993
+ # {#query}. Unlike {#query}, wildcard is not allowed and all the
1994
+ # indexes to update must be specified explicitly. The list of
1995
+ # indexes can include distributed index names. Updates on
1996
+ # distributed indexes will be pushed to all agents.
1997
+ #
1998
+ # The updates only work with docinfo=extern storage strategy.
1999
+ # They are very fast because they're working fully in RAM, but
2000
+ # they can also be made persistent: updates are saved on disk
2001
+ # on clean searchd shutdown initiated by SIGTERM signal. With
2002
+ # additional restrictions, updates are also possible on MVA
2003
+ # attributes; refer to mva_updates_pool directive for details.
2004
+ #
2005
+ # The first sample statement will update document 1 in index
2006
+ # "test1", setting "group_id" to 456. The second one will update
2007
+ # documents 1001, 1002 and 1003 in index "products". For document
2008
+ # 1001, the new price will be set to 123 and the new amount in
2009
+ # stock to 5; for document 1002, the new price will be 37 and the
2010
+ # new amount will be 11; etc. The third one updates document 1
2011
+ # in index "test2", setting MVA attribute "group_id" to [456, 789].
2012
+ #
2013
+ # @example
2014
+ # sphinx.update_attributes("test1", ["group_id"], { 1 => [456] });
2015
+ # sphinx.update_attributes("products", ["price", "amount_in_stock"],
2016
+ # { 1001 => [123, 5], 1002 => [37, 11], 1003 => [25, 129] });
2017
+ # sphinx.update_attributes('test2', ['group_id'], { 1 => [[456, 789]] }, true)
1267
2018
  #
1268
- # * +index+ is a name of the index to be updated
1269
- # * +attrs+ is an array of attribute name strings.
1270
- # * +values+ is a hash where key is document id, and value is an array of
1271
- # * +mva+ identifies whether update MVA
1272
- # new attribute values
2019
+ # @param [String] index a name of the index to be updated.
2020
+ # @param [Array<String>] attrs an array of attribute name strings.
2021
+ # @param [Hash] values is a hash where key is document id, and
2022
+ # value is an array of new attribute values.
2023
+ # @param [Boolean] mva indicates whether to update MVA attributes.
2024
+ # @return [Integer] number of actually updated documents (0 or more) on success,
2025
+ # -1 on failure.
1273
2026
  #
1274
- # Returns number of actually updated documents (0 or more) on success.
1275
- # Returns -1 on failure.
2027
+ # @raise [ArgumentError] Occurred when parameters are invalid.
1276
2028
  #
1277
- # Usage example:
1278
- # sphinx.UpdateAttributes('test1', ['group_id'], { 1 => [456] })
1279
- # sphinx.UpdateAttributes('test1', ['group_id'], { 1 => [[456, 789]] }, true)
2029
+ # @see http://www.sphinxsearch.com/docs/current.html#api-func-updateatttributes Section 6.7.2, "UpdateAttributes"
1280
2030
  #
1281
- def UpdateAttributes(index, attrs, values, mva = false)
2031
+ def update_attributes(index, attrs, values, mva = false)
1282
2032
  # verify everything
1283
2033
  raise ArgumentError, '"index" argument must be String' unless index.kind_of?(String) or index.kind_of?(Symbol)
1284
2034
  raise ArgumentError, '"mva" argument must be Boolean' unless mva.kind_of?(TrueClass) or mva.kind_of?(FalseClass)
1285
-
2035
+
1286
2036
  raise ArgumentError, '"attrs" argument must be Array' unless attrs.kind_of?(Array)
1287
2037
  attrs.each do |attr|
1288
2038
  raise ArgumentError, '"attrs" argument must be Array of Strings' unless attr.kind_of?(String) or attr.kind_of?(Symbol)
1289
2039
  end
1290
-
2040
+
1291
2041
  raise ArgumentError, '"values" argument must be Hash' unless values.kind_of?(Hash)
1292
2042
  values.each do |id, entry|
1293
2043
  raise ArgumentError, '"values" argument must be Hash map of Integer to Array' unless id.respond_to?(:integer?) and id.integer?
@@ -1304,17 +2054,17 @@ module Sphinx
1304
2054
  end
1305
2055
  end
1306
2056
  end
1307
-
2057
+
1308
2058
  # build request
1309
2059
  request = Request.new
1310
2060
  request.put_string index
1311
-
2061
+
1312
2062
  request.put_int attrs.length
1313
2063
  for attr in attrs
1314
2064
  request.put_string attr
1315
2065
  request.put_int mva ? 1 : 0
1316
2066
  end
1317
-
2067
+
1318
2068
  request.put_int values.length
1319
2069
  values.each do |id, entry|
1320
2070
  request.put_int64 id
@@ -1324,33 +2074,89 @@ module Sphinx
1324
2074
  request.put_int(*entry)
1325
2075
  end
1326
2076
  end
1327
-
2077
+
1328
2078
  response = perform_request(:update, request)
1329
-
2079
+
2080
+ # parse response
2081
+ response.get_int
2082
+ end
2083
+ alias :UpdateAttributes :update_attributes
2084
+
2085
+ # Queries searchd status, and returns an array of status variable name
2086
+ # and value pairs.
2087
+ #
2088
+ # @return [Array<Array>] a table containing searchd status information.
2089
+ #
2090
+ # @example
2091
+ # status = sphinx.status
2092
+ # puts status.map { |key, value| "#{key.rjust(20)}: #{value}" }
2093
+ #
2094
+ def status
2095
+ request = Request.new
2096
+ request.put_int(1)
2097
+ response = perform_request(:status, request)
2098
+
2099
+ # parse response
2100
+ rows, cols = response.get_ints(2)
2101
+ (0...rows).map do
2102
+ (0...cols).map { response.get_string }
2103
+ end
2104
+ end
2105
+ alias :Status :status
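The table returned by `status` is an array of rows, each an array of strings. The sketch below formats such a table the same way as the `@example` above, but as a standalone helper that can be run without a server. The helper name and the sample rows are made up for illustration; real variable names come from searchd.

```ruby
# Format status rows (name/value pairs) as right-aligned lines.
def format_status(rows, width = 20)
  rows.map { |key, value| "#{key.rjust(width)}: #{value}" }
end

# Sample rows made up for illustration.
sample = [['uptime', '42'], ['connections', '7']]
lines = format_status(sample, 12)
puts lines
```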
2106
+
2107
+ # Force attribute flush, and block until it completes.
2108
+ #
2109
+ # @return [Integer] current internal flush tag on success, -1 on failure.
2110
+ #
2111
+ # @example
2112
+ # sphinx.flush_attrs
2113
+ #
2114
+ def flush_attrs
2115
+ request = Request.new
2116
+ response = perform_request(:flushattrs, request)
2117
+
1330
2118
  # parse response
1331
2119
  begin
1332
- return response.get_int
2120
+ response.get_int
1333
2121
  rescue EOFError
1334
- @error = 'incomplete reply'
1335
- raise SphinxResponseError, @error
2122
+ -1
1336
2123
  end
1337
2124
  end
1338
-
1339
- # persistent connections
1340
-
2125
+ alias :FlushAttrs :flush_attrs
2126
+
2127
+ #=================================================================
2128
+ # Persistent connections
2129
+ #=================================================================
2130
+
1341
2131
  # Opens persistent connection to the server.
1342
2132
  #
1343
- def Open
2133
+ # This method can be used only when a single searchd server
2134
+ # is configured.
2135
+ #
2136
+ # @return [Boolean] +true+ when persistent connection has been
2137
+ # established; otherwise, +false+.
2138
+ #
2139
+ # @example
2140
+ # begin
2141
+ # sphinx.open
2142
+ # # perform several requests
2143
+ # ensure
2144
+ # sphinx.close
2145
+ # end
2146
+ #
2147
+ # @see #close
2148
+ #
2149
+ def open
1344
2150
  if @servers.size > 1
1345
2151
  @error = 'too many servers. persistent socket allowed only for a single server.'
1346
2152
  return false
1347
2153
  end
1348
-
2154
+
1349
2155
  if @servers.first.persistent?
1350
2156
  @error = 'already connected'
1351
2157
  return false;
1352
2158
  end
1353
-
2159
+
1354
2160
  request = Request.new
1355
2161
  request.put_int(1)
1356
2162
 
@@ -1360,85 +2166,64 @@ module Sphinx
1360
2166
 
1361
2167
  true
1362
2168
  end
1363
-
2169
+ alias :Open :open
2170
+
1364
2171
  # Closes previously opened persistent connection.
1365
2172
  #
1366
- def Close
2173
+ # This method can be used only when a single searchd server
2174
+ # is configured.
2175
+ #
2176
+ # @return [Boolean] +true+ when persistent connection has been
2177
+ # closed; otherwise, +false+.
2178
+ #
2179
+ # @example
2180
+ # begin
2181
+ # sphinx.open
2182
+ # # perform several requests
2183
+ # ensure
2184
+ # sphinx.close
2185
+ # end
2186
+ #
2187
+ # @see #open
2188
+ #
2189
+ def close
1367
2190
  if @servers.size > 1
1368
2191
  @error = 'too many servers. persistent socket allowed only for a single server.'
1369
2192
  return false
1370
2193
  end
1371
-
2194
+
1372
2195
  unless @servers.first.persistent?
1373
2196
  @error = 'not connected'
1374
2197
  return false;
1375
2198
  end
1376
-
1377
- @servers.first.close_persistent!
1378
- end
1379
-
1380
- # Queries searchd status, and returns an array of status variable name
1381
- # and value pairs.
1382
- #
1383
- # Usage example:
1384
- #
1385
- # status = sphinx.Status
1386
- # puts status.map { |key, value| "#{key.rjust(20)}: #{value}" }
1387
- #
1388
- def Status
1389
- request = Request.new
1390
- request.put_int(1)
1391
- response = perform_request(:status, request)
1392
2199
 
1393
- # parse response
1394
- begin
1395
- rows, cols = response.get_ints(2)
1396
-
1397
- res = []
1398
- 0.upto(rows - 1) do |i|
1399
- res[i] = []
1400
- 0.upto(cols - 1) do |j|
1401
- res[i] << response.get_string
1402
- end
1403
- end
1404
- rescue EOFError
1405
- @error = 'incomplete reply'
1406
- raise SphinxResponseError, @error
1407
- end
1408
-
1409
- res
2200
+ @servers.first.close_persistent!
1410
2201
  end
1411
-
1412
- def FlushAttrs
1413
- request = Request.new
1414
- response = perform_request(:flushattrs, request)
2202
+ alias :Close :close
1415
2203
 
1416
- # parse response
1417
- begin
1418
- response.get_int
1419
- rescue EOFError
1420
- -1
1421
- end
1422
- end
1423
-
1424
2204
  protected
1425
-
2205
+
1426
2206
  # Connect, send query, get response.
1427
2207
  #
1428
2208
  # Use this method to communicate with Sphinx server. It ensures connection
1429
2209
  # will be instantiated properly, all headers will be generated properly, etc.
1430
2210
  #
1431
- # Parameters:
1432
- # * +command+ -- searchd command to perform (<tt>:search</tt>, <tt>:excerpt</tt>,
2211
+ # @param [Symbol, String] command searchd command to perform (<tt>:search</tt>, <tt>:excerpt</tt>,
1433
2212
  # <tt>:update</tt>, <tt>:keywords</tt>, <tt>:persist</tt>, <tt>:status</tt>,
1434
2213
  # <tt>:query</tt>, <tt>:flushattrs</tt>. See <tt>SEARCHD_COMMAND_*</tt> for details).
1435
- # * +request+ -- an instance of <tt>Sphinx::Request</tt> class. Contains request body.
1436
- # * +additional+ -- additional integer data to be placed between header and body.
1437
- # * +block+ -- if given, response will not be parsed, plain socket will be
1438
- # passed instead. this is special mode used for persistent connections,
1439
- # do not use for other tasks.
2214
+ # @param [Sphinx::Request] request contains request body.
2215
+ # @param [nil, Integer] additional additional integer data to be placed between header and body.
2216
+ #
2217
+ # @yield if block given, response will not be parsed, plain socket
2218
+ # will be yielded instead. This is special mode used for
2219
+ # persistent connections, do not use for other tasks.
2220
+ # @yieldparam [Sphinx::Server] server a server where request was performed on.
2221
+ # @yieldparam [Sphinx::BufferedIO] socket a socket used to perform the request.
2222
+ # @return [Sphinx::Response] contains response body.
1440
2223
  #
1441
- def perform_request(command, request, additional = nil, &block)
2224
+ # @see #parse_response
2225
+ #
2226
+ def perform_request(command, request, additional = nil)
1442
2227
  with_server do |server|
1443
2228
  cmd = command.to_s.upcase
1444
2229
  command_id = Sphinx::Client.const_get("SEARCHD_COMMAND_#{cmd}")
@@ -1465,26 +2250,31 @@ module Sphinx
1465
2250
  #
1466
2251
  # There are several exceptions which could be thrown in this method:
1467
2252
  #
1468
- # * various network errors -- should be handled by caller (see +with_socket+).
1469
- # * +SphinxResponseError+ -- incomplete reply from searchd.
1470
- # * +SphinxInternalError+ -- searchd error.
1471
- # * +SphinxTemporaryError+ -- temporary searchd error.
1472
- # * +SphinxUnknownError+ -- unknows searchd error.
2253
+ # @param [Sphinx::BufferedIO] socket an input stream object.
2254
+ # @param [Integer] client_version a command version which client supports.
2255
+ # @return [Sphinx::Response] could be used for context-based
2256
+ # parsing of reply from the server.
2257
+ #
2258
+ # @raise [SystemCallError, SocketError] should be handled by caller (see {#with_socket}).
2259
+ # @raise [SphinxResponseError] incomplete reply from searchd.
2260
+ # @raise [SphinxInternalError] searchd internal error.
2261
+ # @raise [SphinxTemporaryError] searchd temporary error.
2262
+ # @raise [SphinxUnknownError] searchd unknown error.
1473
2263
  #
1474
- # Method returns an instance of <tt>Sphinx::Response</tt> class, which
1475
- # could be used for context-based parsing of reply from the server.
2264
+ # @see #with_socket
2265
+ # @private
1476
2266
  #
1477
2267
  def parse_response(socket, client_version)
1478
2268
  response = ''
1479
2269
  status = ver = len = 0
1480
-
1481
- # Read server reply from server. All exceptions are handled by +with_socket+.
2270
+
2271
+ # Read reply from server. All exceptions are handled by {#with_socket}.
1482
2272
  header = socket.read(8)
1483
2273
  if header.length == 8
1484
2274
  status, ver, len = header.unpack('n2N')
1485
2275
  response = socket.read(len) if len > 0
1486
2276
  end
1487
-
2277
+
1488
2278
  # check response
1489
2279
  read = response.length
1490
2280
  if response.empty? or read != len.to_i
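Note on the header handling above: the reply starts with an 8-byte header that `unpack('n2N')` splits into two 16-bit big-endian integers (status, version) and one 32-bit big-endian integer (body length). Below is a small sketch of that round trip. It uses a `StringIO` in place of the real `Sphinx::BufferedIO` socket, and the status and version values are made up for illustration.

```ruby
require 'stringio'

# Build a fake reply. The status (0) and version (0x0116) values are
# illustrative assumptions; only the 'n2N' layout comes from the diff.
body   = 'hello'
header = [0, 0x0116, body.bytesize].pack('n2N')
socket = StringIO.new(header + body)

# Read and decode the fixed-size header, then read exactly len bytes.
raw = socket.read(8)
status, ver, len = raw.unpack('n2N')
response = len > 0 ? socket.read(len) : ''

# The version word keeps the major part in the high byte and the minor
# part in the low byte, as formatted in the warning message of this method.
version_string = "#{ver >> 8}.#{ver & 0xff}"
```

Comparing `response.length` with `len` afterwards, as the method does, is what detects a truncated reply.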
@@ -1493,7 +2283,7 @@ module Sphinx
1493
2283
  : 'received zero-sized searchd response'
1494
2284
  raise SphinxResponseError, error
1495
2285
  end
1496
-
2286
+
1497
2287
  # check status
1498
2288
  if (status == SEARCHD_WARNING)
1499
2289
  wlen = response[0, 4].unpack('N*').first
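Note on the warning branch above: `wlen` is read as a 4-byte big-endian length prefix. The sketch below shows how a length-prefixed warning could be separated from the rest of the payload. The layout (prefix, warning text, then the remaining body) is an assumption made for illustration, since the lines that follow are not part of this hunk.

```ruby
# Assumed layout: 4-byte big-endian warning length, warning text,
# then the remaining body. The sample strings are made up.
warning = 'index is stale'
body    = 'results'
response = [warning.bytesize].pack('N') + warning + body

# Decode the length prefix, then slice the payload around it.
wlen         = response[0, 4].unpack('N*').first
warning_text = response[4, wlen]
rest         = response[4 + wlen, response.length - 4 - wlen]
```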
@@ -1505,34 +2295,40 @@ module Sphinx
1505
2295
  error = 'searchd error: ' + response[4, response.length - 4]
1506
2296
  raise SphinxInternalError, error
1507
2297
  end
1508
-
2298
+
1509
2299
  if status == SEARCHD_RETRY
1510
2300
  error = 'temporary searchd error: ' + response[4, response.length - 4]
1511
2301
  raise SphinxTemporaryError, error
1512
2302
  end
1513
-
2303
+
1514
2304
  unless status == SEARCHD_OK
1515
2305
  error = "unknown status code: '#{status}'"
1516
2306
  raise SphinxUnknownError, error
1517
2307
  end
1518
-
2308
+
1519
2309
  # check version
1520
2310
  if ver < client_version
1521
2311
  @warning = "searchd command v.#{ver >> 8}.#{ver & 0xff} older than client's " +
1522
2312
  "v.#{client_version >> 8}.#{client_version & 0xff}, some options might not work"
1523
2313
  end
1524
-
2314
+
1525
2315
  Response.new(response)
1526
2316
  end
1527
-
2317
+
1528
2318
  # This is internal method which selects next server (round-robin)
1529
2319
  # and yields it to the block passed.
1530
2320
  #
1531
2321
  # In case of connection error, it will try next server several times
1532
- # (see +SetConnectionTimeout+ method details). If all servers are down,
1533
- # it will set +error+ attribute value with the last exception message,
1534
- # and <tt>connection_timeout?</tt> method will return true. Also,
1535
- # +SphinxConnectErorr+ exception will be raised.
2322
+ # (see {#set_connect_timeout} method details). If all servers are down,
2323
+ # it will set error attribute (could be retrieved with {#last_error}
2324
+ # method) with the last exception message, and {#connect_error?}
2325
+ # method will return true. Also, {SphinxConnectError} exception
2326
+ # will be raised.
2327
+ #
2328
+ # @yield a block which performs request on a given server.
2329
+ # @yieldparam [Sphinx::Server] server contains information
2330
+ # about the server to perform request on.
2331
+ # @raise [SphinxConnectError] on any connection error.
1536
2332
  #
1537
2333
  def with_server
1538
2334
  attempts = @retries
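Note on `with_server`: the documentation above describes round-robin selection with a bounded number of attempts. A minimal sketch of that idea follows. `DemoPool` and `DemoConnectError` are hypothetical names, and the real method body is only partially visible in this diff, so this is not the gem's implementation.

```ruby
# Hypothetical error class standing in for SphinxConnectError.
class DemoConnectError < StandardError; end

# Hypothetical pool: picks the next server in round-robin order and
# retries on connection errors until the attempts are used up.
class DemoPool
  attr_reader :last_error

  def initialize(servers, retries)
    @servers    = servers
    @retries    = retries
    @index      = -1
    @last_error = nil
  end

  def with_server
    attempts = @retries
    begin
      # Advance to the next server on every attempt, including retries.
      @index = (@index + 1) % @servers.size
      yield @servers[@index]
    rescue DemoConnectError => e
      attempts -= 1
      retry if attempts > 0
      # Out of attempts: remember the message and re-raise.
      @last_error = e.message
      raise
    end
  end
end

pool  = DemoPool.new(%w[a b c], 3)
tried = []
result = pool.with_server do |server|
  tried << server
  raise DemoConnectError, "#{server} is down" unless server == 'c'
  "ok from #{server}"
end
```

Because `retry` re-runs the whole `begin` block, the server index advances on each attempt, which is what moves the request to the next server.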
@@ -1552,29 +2348,39 @@ module Sphinx
1552
2348
  raise
1553
2349
  end
1554
2350
  end
1555
-
2351
+
1556
2352
  # This is internal method which retrieves socket for a given server,
1557
2353
  # initiates Sphinx session, and yields this socket to a block passed.
1558
2354
  #
1559
- # In case of any problems with session initiation, +SphinxConnectError+
2355
+ # In case of any problems with session initiation, {SphinxConnectError}
1560
2356
  # will be raised, because this is part of connection establishing. See
1561
- # +with_server+ method details to get more information about how this
2357
+ # {#with_server} method details to get more information about how this
1562
2358
  # exception is handled.
1563
2359
  #
1564
2360
  # Socket retrieving routine is wrapped in a block with its own
1565
- # timeout value (see +SetConnectTimeout+). This is done in
1566
- # <tt>Server#get_socket</tt> method, so check it for details.
2361
+ # timeout value (see {#set_connect_timeout}). This is done in
2362
+ # {Server#get_socket} method, so check it for details.
1567
2363
  #
1568
2364
  # Request execution is wrapped with block with another timeout
1569
- # (see +SetRequestTimeout+). This ensures no Sphinx request will
2365
+ # (see {#set_request_timeout}). This ensures no Sphinx request will
1570
2366
  # take unreasonable time.
1571
2367
  #
1572
2368
  # In case of any Sphinx error (incomplete reply, internal or temporary
1573
2369
  # error), connection to the server will be re-established, and request
1574
- # will be retried (see +SetRequestTimeout+). Of course, if connection
2370
+ # will be retried (see {#set_request_timeout}). Of course, if connection
1575
2371
  # could not be established, next server will be selected (see explanation
1576
2372
  # above).
1577
2373
  #
2374
+ # @param [Sphinx::Server] server contains information
2375
+ # about the server to perform request on.
2376
+ # @yield a block which will actually perform the request.
2377
+ # @yieldparam [Sphinx::BufferedIO] socket a socket used to
2378
+ # perform the request.
2379
+ #
2380
+ # @raise [SphinxResponseError, SphinxInternalError, SphinxTemporaryError, SphinxUnknownError]
2381
+ # on any response error.
2382
+ # @raise [SphinxConnectError] on any connection error.
2383
+ #
1578
2384
  def with_socket(server)
1579
2385
  attempts = @reqretries
1580
2386
  socket = nil
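Note on `with_socket`: the documentation above says that on a Sphinx response error the connection is re-established and the request retried. The sketch below shows that loop in isolation. `DemoServer`, `DemoResponseError`, and `with_demo_socket` are hypothetical, the `get_socket` signature here is simplified, and the timeout wrappers the comments mention are left out.

```ruby
# Hypothetical error class standing in for the Sphinx response errors.
class DemoResponseError < StandardError; end

# Hypothetical server that counts how many sockets it opened and freed.
class DemoServer
  attr_reader :opened, :freed

  def initialize
    @opened = 0
    @freed  = 0
  end

  def get_socket
    @opened += 1
    "socket-#{@opened}"
  end

  def free_socket(_socket)
    @freed += 1
  end
end

# Retry the block with a fresh socket after each response error.
def with_demo_socket(server, reqretries)
  attempts = reqretries
  socket = nil
  begin
    socket = server.get_socket
    yield socket
  rescue DemoResponseError => e
    # Drop the broken socket so the retry reconnects.
    server.free_socket(socket)
    attempts -= 1
    retry if attempts > 0
    raise e
  end
end

demo_server = DemoServer.new
calls = 0
outcome = with_demo_socket(demo_server, 3) do |sock|
  calls += 1
  raise DemoResponseError, 'incomplete reply' if calls < 3
  sock
end
```

Freeing the socket before `retry` matters: re-running the block on the same socket after a partial read would leave the stream out of sync with the protocol framing.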
@@ -1612,14 +2418,14 @@ module Sphinx
1612
2418
  new_e.set_backtrace(e.backtrace)
1613
2419
  e = new_e
1614
2420
  end
1615
-
2421
+
1616
2422
  # Close previously opened socket (in case it has really been opened)
1617
2423
  server.free_socket(socket)
1618
2424
 
1619
2425
  # Request error! Do we need to try it again?
1620
2426
  attempts -= 1
1621
2427
  retry if attempts > 0
1622
-
2428
+
1623
2429
  # Re-raise original exception
1624
2430
  @error = e.message
1625
2431
  raise e