riddle 0.9.8.1231.0 → 0.9.8.1533.10

data/README.textile ADDED
@@ -0,0 +1,89 @@
1
+ This client has been written to interface with "Sphinx":http://sphinxsearch.com/. It is written by
2
+ "Pat Allan":http://freelancing-gods.com, and has been influenced by both Dmytro Shteflyuk's Ruby
3
+ client and the original PHP client - credit where credit's due, after all.
4
+
5
+ It does not follow the same syntax as those two, though (not much point writing this otherwise) -
6
+ opting for a more Ruby-like structure.
7
+
8
+ The easiest way to install is to grab the gem from GitHub:
9
+
10
+ sudo gem install freelancing-god-riddle --source http://gems.github.com/
11
+
12
+ However, if you're so inclined, you can grab the source code:
13
+
14
+ git clone git://github.com/freelancing-god/thinking-sphinx.git
15
+
16
+ If you need an old version of Riddle to match an older version of Sphinx, peruse the tagged versions in the old Subversion repository:
17
+
18
+ svn co http://rails-oceania.googlecode.com/svn/patallan/riddle/tags riddle-tags
19
+
20
+ Please note that at the time of writing, the following versions are supported (if you get the appropriate tag):
21
+
22
+ * 0.9.8-r871
23
+ * 0.9.8-r909
24
+ * 0.9.8-r985
25
+ * 0.9.8-r1065
26
+ * 0.9.8-r1112
27
+ * 0.9.8-rc1 (gem version: 0.9.8.1198)
28
+ * 0.9.8-rc2 (gem version: 0.9.8.1231)
29
+ * 0.9.8 (gem version: 0.9.8.1371)
30
+
31
+ To get started, just instantiate a Client object:
32
+
33
+ client = Riddle::Client.new # defaults to localhost and port 3312
34
+ client = Riddle::Client.new "sphinxserver.domain.tld", 3333 # custom settings
35
+
36
+ And then set the parameters to what you want, before running a query:
37
+
38
+ client.match_mode = :extended
39
+ client.query "Pat Allan @state Victoria"
40
+
41
+ The results from a query are similar to the other clients - but here are the details. It's a hash with
42
+ the following keys:
43
+
44
+ * :matches
45
+ * :fields
46
+ * :attributes
47
+ * :attribute_names
48
+ * :words
49
+ * :total
50
+ * :total_found
51
+ * :time
52
+ * :status
53
+ * :warning (if appropriate)
54
+ * :error (if appropriate)
55
+
56
+ The key @:matches@ returns an array of hashes - the actual search results. Each hash has the
57
+ document id (@:doc@), the result weighting (@:weight@), and a hash of the attributes for
58
+ the document (@:attributes@).
59
+
60
+ The @:fields@ and @:attribute_names@ keys return lists of fields and attributes for the
61
+ documents. The key @:attributes@ will return a hash of attribute name and type pairs, and
62
+ @:words@ returns a hash of hashes representing the words from the search, with the number of
63
+ documents and hits for each, along the lines of:
64
+
65
+ results[:words]["Pat"] #=> {:docs => 12, :hits => 15}
66
+
67
+ @:total@, @:total_found@ and @:time@ return the number of matches available, the
68
+ total number of matches (which may be greater than the maximum available), and the time in milliseconds
69
+ that the query took to run.
70
+
71
+ @:status@ is the error code for the query - and if there was a related warning, it will be under
72
+ the @:warning@ key. Fatal errors will be described under @:error@.
73
+
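As a rough illustration of the structure described above, the sketch below builds a hash by hand in the same shape and reads a few keys from it. Every value in it (document ids, weights, field and attribute names, the attribute type) is invented for the example; a real hash would come from @client.query@.

```ruby
# Hand-built stand-in for the hash returned by client.query.
# All values here are invented for illustration.
results = {
  :matches => [
    {:doc => 42, :weight => 1, :attributes => {"state_id" => 7}},
    {:doc => 11, :weight => 2, :attributes => {"state_id" => 3}}
  ],
  :fields          => ["first_name", "last_name"],
  :attribute_names => ["state_id"],
  :attributes      => {"state_id" => 1},
  :words           => {"Pat" => {:docs => 12, :hits => 15}},
  :total           => 2,
  :total_found     => 2,
  :time            => 5,
  :status          => 0
}

# Collect the document ids, highest weighting first.
doc_ids = results[:matches].
  sort_by { |match| -match[:weight] }.
  collect { |match| match[:doc] }

# :warning and :error are only present when something went wrong.
problem = results[:error] || results[:warning]
```

Since @:warning@ and @:error@ are optional keys, checking them with @||@ as above avoids assuming either one exists.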
74
+ If you've installed the gem and are wondering why there are no tests - check out the svn version. I've kept the specs out of the gem as I have a decent amount of test data in there, which really isn't needed unless you want to submit patches.
75
+
76
+ h2. Contributors
77
+
78
+ Thanks to the following people who have contributed to Riddle in some shape or form:
79
+
80
+ * Andrew Aksyonoff
81
+ * Brad Greenlee
82
+ * Lachie Cox
83
+ * Jeremy Seitz
84
+ * Mark Lane
85
+ * Xavier Noria
86
+ * Henrik Nye
87
+ * Kristopher Chambers
88
+ * Rob Anderton
89
+ * Dylan Egan
data/lib/riddle.rb CHANGED
@@ -1,8 +1,9 @@
1
1
  require 'socket'
2
+ require 'timeout'
3
+
2
4
  require 'riddle/client'
3
- require 'riddle/client/filter'
4
- require 'riddle/client/message'
5
- require 'riddle/client/response'
5
+ require 'riddle/configuration'
6
+ require 'riddle/controller'
6
7
 
7
8
  module Riddle #:nodoc:
8
9
  class ConnectionError < StandardError #:nodoc:
@@ -14,12 +15,16 @@ module Riddle #:nodoc:
14
15
  Tiny = 8
15
16
  # Revision number for RubyForge's sake, taken from what Sphinx
16
17
  # outputs to the command line.
17
- Rev = 1231
18
+ Rev = 1533
18
19
  # Release number to mark my own fixes, beyond feature parity with
19
20
  # Sphinx itself.
20
- Release = 0
21
+ Release = 10
21
22
 
22
- String = [Major, Minor, Tiny].join('.') + "rc2"
23
+ String = [Major, Minor, Tiny].join('.')
23
24
  GemVersion = [Major, Minor, Tiny, Rev, Release].join('.')
24
25
  end
26
+
27
+ def self.escape(string)
28
+ string.gsub(/[\(\)\|\-!@~"&\/]/) { |char| "\\#{char}" }
29
+ end
25
30
  end
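The new `Riddle.escape` helper backslash-escapes the characters matched by its character class. A minimal sketch of how it might be used on user input follows; the method is copied into a standalone `escape` function so the example runs without the gem, and the sample input is made up.

```ruby
# Standalone copy of the escaping rule from the hunk above,
# so the example runs without the gem installed.
def escape(string)
  string.gsub(/[\(\)\|\-!@~"&\/]/) { |char| "\\#{char}" }
end

user_input = 'Pat (Allan) @state'
escaped    = escape(user_input)
puts escaped # prints: Pat \(Allan\) \@state
```

With the gem loaded, the equivalent call would presumably be `Riddle.escape(user_input)` before handing the string to `client.query`.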
data/lib/riddle/client.rb CHANGED
@@ -1,3 +1,7 @@
1
+ require 'riddle/client/filter'
2
+ require 'riddle/client/message'
3
+ require 'riddle/client/response'
4
+
1
5
  module Riddle
2
6
  class VersionError < StandardError; end
3
7
  class ResponseError < StandardError; end
@@ -100,7 +104,7 @@ module Riddle
100
104
  :match_mode, :sort_mode, :sort_by, :weights, :id_range, :filters,
101
105
  :group_by, :group_function, :group_clause, :group_distinct, :cut_off,
102
106
  :retry_count, :retry_delay, :anchor, :index_weights, :rank_mode,
103
- :max_query_time, :field_weights
107
+ :max_query_time, :field_weights, :timeout
104
108
  attr_reader :queue
105
109
 
106
110
  # Can instantiate with a specific server and port - otherwise it assumes
@@ -110,6 +114,13 @@ module Riddle
110
114
  @server = server || "localhost"
111
115
  @port = port || 3312
112
116
 
117
+ reset
118
+
119
+ @queue = []
120
+ end
121
+
122
+ # Reset attributes and settings to defaults.
123
+ def reset
113
124
  # defaults
114
125
  @offset = 0
115
126
  @limit = 20
@@ -134,8 +145,7 @@ module Riddle
134
145
  @max_query_time = 0
135
146
  # string keys are field names, integer values are weightings
136
147
  @field_weights = {}
137
-
138
- @queue = []
148
+ @timeout = 0
139
149
  end
140
150
 
141
151
  # Set the geo-anchor point - with the names of the attributes that contain
@@ -384,8 +394,28 @@ module Riddle
384
394
  # Connects to the Sphinx daemon, and yields a socket to use. The socket is
385
395
  # closed at the end of the block.
386
396
  def connect(&block)
387
- socket = TCPSocket.new @server, @port
388
-
397
+ socket = nil
398
+ if @timeout == 0
399
+ socket = initialise_connection
400
+ else
401
+ begin
402
+ Timeout.timeout(@timeout) { socket = initialise_connection }
403
+ rescue Timeout::Error
404
+ raise Riddle::ConnectionError,
405
+ "Connection to #{@server} on #{@port} timed out after #{@timeout} seconds"
406
+ end
407
+ end
408
+
409
+ begin
410
+ yield socket
411
+ ensure
412
+ socket.close
413
+ end
414
+ end
415
+
416
+ def initialise_connection
417
+ socket = initialise_socket
418
+
389
419
  # Checking version
390
420
  version = socket.recv(4).unpack('N*').first
391
421
  if version < 1
@@ -396,11 +426,20 @@ module Riddle
396
426
  # Send version
397
427
  socket.send [1].pack('N'), 0
398
428
 
429
+ socket
430
+ end
431
+
432
+ def initialise_socket
433
+ tries = 0
399
434
  begin
400
- yield socket
401
- ensure
402
- socket.close
435
+ socket = TCPSocket.new @server, @port
436
+ rescue Errno::ECONNREFUSED => e
437
+ retry if (tries += 1) < 5
438
+ raise Riddle::ConnectionError,
439
+ "Connection to #{@server} on #{@port} failed. #{e.message}"
403
440
  end
441
+
442
+ socket
404
443
  end
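The connection changes above combine two ideas: an optional time limit around the whole handshake, and up to five attempts when the connection is refused. The sketch below isolates that pattern without opening a socket. The helper names `with_retries` and `with_timeout`, the local `ConnectionError` class, and the simulated refusals are all assumptions made for illustration, not part of the gem.

```ruby
require 'timeout'

# Local stand-in for Riddle::ConnectionError.
class ConnectionError < StandardError; end

# Re-run the block while it raises Errno::ECONNREFUSED,
# giving up after max_tries attempts.
def with_retries(max_tries = 5)
  tries = 0
  begin
    yield
  rescue Errno::ECONNREFUSED => e
    retry if (tries += 1) < max_tries
    raise ConnectionError, "Connection failed. #{e.message}"
  end
end

# Run the block under a time limit; 0 means no limit at all.
def with_timeout(seconds)
  return yield if seconds == 0
  begin
    Timeout.timeout(seconds) { yield }
  rescue Timeout::Error
    raise ConnectionError, "Connection timed out after #{seconds} seconds"
  end
end

attempts = 0
result = with_timeout(1) do
  with_retries do
    attempts += 1
    raise Errno::ECONNREFUSED if attempts < 3 # simulate two refusals
    :connected
  end
end
```

Note that the time limit wraps the retries, as in the diff, so repeated refusals still count against the same timeout.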
405
444
 
406
445
  # Send a collection of messages, for a command type (eg, search, excerpts,
@@ -411,6 +450,9 @@ module Riddle
411
450
  version = 0
412
451
  length = 0
413
452
  message = Array(messages).join("")
453
+ if message.respond_to?(:force_encoding)
454
+ message = message.force_encoding('ASCII-8BIT')
455
+ end
414
456
 
415
457
  connect do |socket|
416
458
  case command
@@ -430,7 +472,7 @@ module Riddle
430
472
  header = socket.recv(8)
431
473
  status, version, length = header.unpack('n2N')
432
474
 
433
- while response.length < length
475
+ while response.length < (length || 0)
434
476
  part = socket.recv(length - response.length)
435
477
  response << part if part
436
478
  end
@@ -507,7 +549,7 @@ module Riddle
507
549
  # Per Index Weights
508
550
  message.append_int @index_weights.length
509
551
  @index_weights.each do |key,val|
510
- message.append_string key
552
+ message.append_string key.to_s
511
553
  message.append_int val
512
554
  end
513
555
 
@@ -517,7 +559,7 @@ module Riddle
517
559
  # Per Field Weights
518
560
  message.append_int @field_weights.length
519
561
  @field_weights.each do |key,val|
520
- message.append_string key
562
+ message.append_string key.to_s
521
563
  message.append_int val
522
564
  end
523
565
 
@@ -18,7 +18,7 @@ module Riddle
18
18
  def query_message
19
19
  message = Message.new
20
20
 
21
- message.append_string self.attribute
21
+ message.append_string self.attribute.to_s
22
22
  case self.values
23
23
  when Range
24
24
  if self.values.first.is_a?(Float) && self.values.last.is_a?(Float)
@@ -33,7 +33,16 @@ module Riddle
33
33
  message.append_int self.values.length
34
34
  # using to_f is a hack from the php client - to workaround 32bit
35
35
  # signed ints on x32 platforms
36
- message.append_ints *self.values.collect { |val| val.to_f }
36
+ message.append_ints *self.values.collect { |val|
37
+ case val
38
+ when TrueClass
39
+ 1.0
40
+ when FalseClass
41
+ 0.0
42
+ else
43
+ val.to_f
44
+ end
45
+ }
37
46
  end
38
47
  message.append_int self.exclude? ? 1 : 0
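The boolean handling added to the filter exists because `true` and `false` do not respond to `to_f`. A small standalone sketch of the same conversion follows; the function name `filter_values_as_floats` is hypothetical and only mirrors the `case` in the hunk above.

```ruby
# Map filter values to floats, treating booleans as 1.0 and 0.0.
# Mirrors the case statement in the hunk above; the name is hypothetical.
def filter_values_as_floats(values)
  values.collect do |val|
    case val
    when TrueClass  then 1.0
    when FalseClass then 0.0
    else val.to_f
    end
  end
end

floats = filter_values_as_floats([true, false, 3])
```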
39
48
 
@@ -10,14 +10,15 @@ module Riddle
10
10
 
11
11
  # Append raw data (only use if you know what you're doing)
12
12
  def append(*args)
13
- return if args.length == 0
14
-
15
13
  args.each { |arg| @message << arg }
16
14
  end
17
15
 
18
16
  # Append a string's length, then the string itself
19
17
  def append_string(str)
20
- @message << [str.send(@size_method)].pack('N') + str
18
+ string = str.respond_to?(:force_encoding) ?
19
+ str.dup.force_encoding('ASCII-8BIT') : str
20
+
21
+ @message << [string.send(@size_method)].pack('N') + string
21
22
  end
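The reason for forcing a binary encoding before packing is that the length prefix must count bytes, not characters. The sketch below shows the same length-prefix idea in isolation. It assumes Ruby 1.9 or later (`String#bytesize` and `String#force_encoding`), whereas the gem chooses its size method at runtime; `length_prefixed` is a hypothetical name.

```ruby
# Length-prefix a string: a 4-byte big-endian byte count, then the raw bytes.
# Assumes Ruby 1.9+; the name length_prefixed is hypothetical.
def length_prefixed(str)
  binary = str.dup.force_encoding('ASCII-8BIT')
  [binary.bytesize].pack('N') + binary
end

packed = length_prefixed("\u00e9") # one character, two bytes in UTF-8
length = packed[0, 4].unpack('N').first
```

Using `dup` before `force_encoding`, as the diff does, leaves the caller's string untouched.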
22
23
 
23
24
  # Append an integer
@@ -0,0 +1,33 @@
1
+ require 'riddle/configuration/section'
2
+
3
+ require 'riddle/configuration/distributed_index'
4
+ require 'riddle/configuration/index'
5
+ require 'riddle/configuration/indexer'
6
+ require 'riddle/configuration/remote_index'
7
+ require 'riddle/configuration/searchd'
8
+ require 'riddle/configuration/source'
9
+ require 'riddle/configuration/sql_source'
10
+ require 'riddle/configuration/xml_source'
11
+
12
+ module Riddle
13
+ class Configuration
14
+ class ConfigurationError < StandardError #:nodoc:
15
+ end
16
+
17
+ attr_reader :indexes, :searchd
18
+ attr_accessor :indexer
19
+
20
+ def initialize
21
+ @indexer = Riddle::Configuration::Indexer.new
22
+ @searchd = Riddle::Configuration::Searchd.new
23
+ @indexes = []
24
+ end
25
+
26
+ def render
27
+ (
28
+ [@indexer.render, @searchd.render] +
29
+ @indexes.collect { |index| index.render }
30
+ ).join("\n")
31
+ end
32
+ end
33
+ end
@@ -0,0 +1,48 @@
1
+ module Riddle
2
+ class Configuration
3
+ class DistributedIndex < Riddle::Configuration::Section
4
+ self.settings = [:type, :local, :agent, :agent_connect_timeout,
5
+ :agent_query_timeout]
6
+
7
+ attr_accessor :name, :local_indexes, :remote_indexes,
8
+ :agent_connect_timeout, :agent_query_timeout
9
+
10
+ def initialize(name)
11
+ @name = name
12
+ @local_indexes = []
13
+ @remote_indexes = []
14
+ end
15
+
16
+ def type
17
+ "distributed"
18
+ end
19
+
20
+ def local
21
+ self.local_indexes
22
+ end
23
+
24
+ def agent
25
+ agents = remote_indexes.collect { |index| index.remote }.uniq
26
+ agents.collect { |agent|
27
+ agent + ":" + remote_indexes.select { |index|
28
+ index.remote == agent
29
+ }.collect { |index| index.name }.join(",")
30
+ }
31
+ end
32
+
33
+ def render
34
+ raise ConfigurationError unless valid?
35
+
36
+ (
37
+ ["index #{name}", "{"] +
38
+ settings_body +
39
+ ["}", ""]
40
+ ).join("\n")
41
+ end
42
+
43
+ def valid?
44
+ @local_indexes.length > 0 || @remote_indexes.length > 0
45
+ end
46
+ end
47
+ end
48
+ end
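The `agent` method groups remote index names by the server they live on. A self-contained sketch of that grouping follows. `RemoteIndex` here is a hypothetical `Struct` stand-in, and it assumes `remote` returns a "host:port" string, which this diff does not show.

```ruby
# Hypothetical stand-in for a remote index; `remote` is assumed
# to be a "host:port" string.
RemoteIndex = Struct.new(:remote, :name)

# Build one agent string per distinct remote, listing its index names.
def agent_lines(remote_indexes)
  agents = remote_indexes.collect { |index| index.remote }.uniq
  agents.collect do |agent|
    names = remote_indexes.
      select { |index| index.remote == agent }.
      collect { |index| index.name }
    agent + ":" + names.join(",")
  end
end

lines = agent_lines([
  RemoteIndex.new("a.local:3312", "users"),
  RemoteIndex.new("b.local:3312", "posts"),
  RemoteIndex.new("a.local:3312", "comments")
])
```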
@@ -0,0 +1,142 @@
1
+ module Riddle
2
+ class Configuration
3
+ class Index < Riddle::Configuration::Section
4
+ self.settings = [:source, :path, :docinfo, :mlock, :morphology,
5
+ :stopwords, :wordforms, :exceptions, :min_word_len, :charset_type,
6
+ :charset_table, :ignore_chars, :min_prefix_len, :min_infix_len,
7
+ :prefix_fields, :infix_fields, :enable_star, :ngram_len, :ngram_chars,
8
+ :phrase_boundary, :phrase_boundary_step, :html_strip,
9
+ :html_index_attrs, :html_remove_elements, :preopen]
10
+
11
+ attr_accessor :name, :parent, :sources, :path, :docinfo, :mlock,
12
+ :morphologies, :stopword_files, :wordform_files, :exception_files,
13
+ :min_word_len, :charset_type, :charset_table, :ignore_characters,
14
+ :min_prefix_len, :min_infix_len, :prefix_field_names,
15
+ :infix_field_names, :enable_star, :ngram_len, :ngram_characters,
16
+ :phrase_boundaries, :phrase_boundary_step, :html_strip,
17
+ :html_index_attrs, :html_remove_element_tags, :preopen
18
+
19
+ def initialize(name, *sources)
20
+ @name = name
21
+ @sources = sources
22
+ @morphologies = []
23
+ @stopword_files = []
24
+ @wordform_files = []
25
+ @exception_files = []
26
+ @ignore_characters = []
27
+ @prefix_field_names = []
28
+ @infix_field_names = []
29
+ @ngram_characters = []
30
+ @phrase_boundaries = []
31
+ @html_remove_element_tags = []
32
+ end
33
+
34
+ def source
35
+ @sources.collect { |s| s.name }
36
+ end
37
+
38
+ def morphology
39
+ nil_join @morphologies, ", "
40
+ end
41
+
42
+ def morphology=(morphology)
43
+ @morphologies = nil_split morphology, /,\s?/
44
+ end
45
+
46
+ def stopwords
47
+ nil_join @stopword_files, " "
48
+ end
49
+
50
+ def stopwords=(stopwords)
51
+ @stopword_files = nil_split stopwords, ' '
52
+ end
53
+
54
+ def wordforms
55
+ nil_join @wordform_files, " "
56
+ end
57
+
58
+ def wordforms=(wordforms)
59
+ @wordform_files = nil_split wordforms, ' '
60
+ end
61
+
62
+ def exceptions
63
+ nil_join @exception_files, " "
64
+ end
65
+
66
+ def exceptions=(exceptions)
67
+ @exception_files = nil_split exceptions, ' '
68
+ end
69
+
70
+ def ignore_chars
71
+ nil_join @ignore_characters, ", "
72
+ end
73
+
74
+ def ignore_chars=(ignore_chars)
75
+ @ignore_characters = nil_split ignore_chars, /,\s?/
76
+ end
77
+
78
+ def prefix_fields
79
+ nil_join @prefix_field_names, ", "
80
+ end
81
+
82
+ def infix_fields
83
+ nil_join @infix_field_names, ", "
84
+ end
85
+
86
+ def ngram_chars
87
+ nil_join @ngram_characters, ", "
88
+ end
89
+
90
+ def ngram_chars=(ngram_chars)
91
+ @ngram_characters = nil_split ngram_chars, /,\s?/
92
+ end
93
+
94
+ def phrase_boundary
95
+ nil_join @phrase_boundaries, ", "
96
+ end
97
+
98
+ def phrase_boundary=(phrase_boundary)
99
+ @phrase_boundaries = nil_split phrase_boundary, /,\s?/
100
+ end
101
+
102
+ def html_remove_elements
103
+ nil_join @html_remove_element_tags, ", "
104
+ end
105
+
106
+ def html_remove_elements=(html_remove_elements)
107
+ @html_remove_element_tags = nil_split html_remove_elements, /,\s?/
108
+ end
109
+
110
+ def render
111
+ raise ConfigurationError, "#{@name} #{@sources.inspect} #{@path} #{@parent}" unless valid?
112
+
113
+ inherited_name = "#{name}"
114
+ inherited_name << " : #{parent}" if parent
115
+ (
116
+ @sources.collect { |s| s.render } +
117
+ ["index #{inherited_name}", "{"] +
118
+ settings_body +
119
+ ["}", ""]
120
+ ).join("\n")
121
+ end
122
+
123
+ def valid?
124
+ (!@name.nil?) && (!( @sources.length == 0 || @path.nil? ) || !@parent.nil?)
125
+ end
126
+
127
+ private
128
+
129
+ def nil_split(string, pattern)
130
+ (string || "").split(pattern)
131
+ end
132
+
133
+ def nil_join(array, delimiter)
134
+ if array.length == 0
135
+ nil
136
+ else
137
+ array.join(delimiter)
138
+ end
139
+ end
140
+ end
141
+ end
142
+ end
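Several `Index` settings are stored as arrays but read and written as delimited strings, with `nil` standing for "not set". The sketch below copies the two private helpers into standalone functions to show the round trip; the morphology values are only sample input.

```ruby
# Standalone copies of the private helpers from Index.
def nil_split(string, pattern)
  (string || "").split(pattern)
end

def nil_join(array, delimiter)
  array.empty? ? nil : array.join(delimiter)
end

morphologies = nil_split("stem_en, stem_ru", /,\s?/)
rendered     = nil_join(morphologies, ", ")
unset        = nil_join(nil_split(nil, /,\s?/), ", ")
```

Returning `nil` for an empty array presumably lets the `Section` base class, which is not shown in this part of the diff, skip settings that were never assigned.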