slingshot-rb 0.0.1

data/.gitignore ADDED
@@ -0,0 +1,7 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ rdoc/
6
+ coverage/
7
+ scratch/
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "http://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in slingshot.gemspec
4
+ gemspec
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright 2011 Karel Minarik
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.markdown ADDED
@@ -0,0 +1,164 @@
1
+ Slingshot
2
+ =========
3
+
4
+ _Slingshot_ aims to provide a rich Ruby API and DSL for the
5
+ [ElasticSearch](http://www.elasticsearch.org/) search engine/database.
6
+
7
+ _ElasticSearch_ is a scalable, distributed, highly-available,
8
+ RESTful database communicating by JSON over HTTP, based on [Lucene](http://lucene.apache.org/),
9
+ written in Java. It manages to be very simple and very powerful at the same time.
10
+ You should seriously consider it to power search in your Ruby applications:
11
+ it will deliver all the features you want, and many more you may not have
12
+ imagined yet (native geo search? histogram facets?)
13
+
14
+ _Slingshot_ currently allows basic operations on the index and basic searching. See the chapters below.
15
+
16
+
17
+ Installation
18
+ ------------
19
+
20
+ First, you need a running _ElasticSearch_ server. Thankfully, it's easy. Let's define easy:
21
+
22
+ $ curl -L -o elasticsearch-0.14.4.tar.gz http://github.com/downloads/elasticsearch/elasticsearch/elasticsearch-0.14.4.tar.gz
23
+ $ tar -zxvf elasticsearch-0.14.4.tar.gz
24
+ $ ./elasticsearch-0.14.4/bin/elasticsearch -f
25
+
26
+ OK, easy. Now, install the gem via Rubygems:
27
+
28
+ $ gem install slingshot-rb
29
+
30
+ or from source:
31
+
32
+ $ git clone git://github.com/karmi/slingshot.git
33
+ $ cd slingshot && rake install
34
+
35
+
36
+ Usage
37
+ -----
38
+
39
+ Currently, you can use _Slingshot_ via the DSL (e.g. by extending your class with it).
40
+ Plans for full ActiveModel integration (and other convenience layers) are in progress.
41
+
42
+ To kick the tires, require the gem in an IRB session or a Ruby script
43
+ (note that you can run the full example from [`examples/dsl.rb`](https://github.com/karmi/slingshot/blob/master/examples/dsl.rb)):
44
+
45
+ require 'rubygems'
46
+ require 'slingshot'
47
+
48
+ First, let's create an index named `articles` and store/index some documents:
49
+
50
+ Slingshot.index 'articles' do
51
+ delete
52
+ create
53
+
54
+ store :title => 'One', :tags => ['ruby']
55
+ store :title => 'Two', :tags => ['ruby', 'python']
56
+ store :title => 'Three', :tags => ['java']
57
+ store :title => 'Four', :tags => ['ruby', 'php']
58
+
59
+ refresh
60
+ end
61
+
62
+ Now, let's query the database:
63
+
64
+ We are searching for articles tagged _ruby_, sorted by `title` in `descending` order,
65
+ and also retrieving some [_facets_](http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr)
66
+ from the database:
67
+
68
+ s = Slingshot.search 'articles' do
69
+ query do
70
+ terms :tags, ['ruby']
71
+ end
72
+
73
+ sort do
74
+ title 'desc'
75
+ end
76
+
77
+ facet 'global-tags' do
78
+ terms :tags, :global => true
79
+ end
80
+
81
+ facet 'current-tags' do
82
+ terms :tags
83
+ end
84
+ end
85
+
86
+ Let's display the results:
87
+
88
+ s.results.each do |document|
89
+ puts "* #{ document['_source']['title'] }"
90
+ end
91
+
92
+ # * Two
93
+ # * One
94
+ # * Four
95
+
96
+ Let's display the facets (distribution of tags across the whole database):
97
+
98
+ s.results.facets['global-tags']['terms'].each do |f|
99
+ puts "#{f['term'].ljust(10)} #{f['count']}"
100
+ end
101
+
102
+ # ruby 3
103
+ # python 1
104
+ # php 1
105
+ # java 1
106
+
107
+ We can display the full query JSON:
108
+
109
+ puts s.to_json
110
+ # {"facets":{"current-tags":{"terms":{"field":"tags"}},"global-tags":{"global":true,"terms":{"field":"tags"}}},"sort":[{"title":"desc"}],"query":{"terms":{"tags":["ruby"]}}}
111
+
112
+ See, a Ruby DSL for this thing is kinda handy? We can query _ElasticSearch_ manually with `curl`, simply:
113
+
114
+ puts s.to_curl
115
+ # curl -X POST "http://localhost:9200/articles/_search?pretty=true" -d '{"facets":{"current-tags":{"terms":{"field":"tags"}},"global-tags":{"global":true,"terms":{"field":"tags"}}},"sort":[{"title":"desc"}],"query":{"terms":{"tags":["ruby"]}}}'
116
+
117
+
118
+ Features
119
+ --------
120
+
121
+ Currently, _Slingshot_ supports only a limited subset of the vast _ElasticSearch_ [Search API](http://www.elasticsearch.org/guide/reference/api/search/request-body.html) and its [Query DSL](http://www.elasticsearch.org/guide/reference/query-dsl/):
122
+
123
+ * Creating, deleting and refreshing the index
124
+ * Storing a document in the index
125
+ * [Querying](https://github.com/karmi/slingshot/blob/master/examples/dsl.rb) the index with the `query_string`, `term` and `terms` types of queries
126
+ * Sorting the results by `fields`
127
+ * Retrieving a _terms_ type of [facets](http://www.elasticsearch.org/guide/reference/api/search/facets/index.html) -- other types are high priority
128
+ * Returning just specific fields from documents
129
+ * Paging with `from` and `size` query options
130
+
131
+ See the [`examples/dsl.rb`](https://github.com/karmi/slingshot/blob/master/examples/dsl.rb) file for a complete example.
132
+
133
+ Todo & Plans
134
+ ------------
135
+
136
+ In order of importance:
137
+
138
+ * Basic wrapper class for _hits_ in results, so we could write `results.first.document.title` instead of using the raw Hash
139
+ * Getting document [by ID](http://www.elasticsearch.org/guide/reference/api/get.html)
140
+ * Seamless _ActiveModel_ compatibility for easy usage in _Rails_ applications (this also means nearly full _ActiveRecord_ compatibility)
141
+ * Allowing you to set a custom non-ActiveModel wrapper class (your own)
142
+ * Seamless [will_paginate](https://github.com/mislav/will_paginate) compatibility for easy pagination
143
+ * [Histogram](http://www.elasticsearch.org/guide/reference/api/search/facets/histogram-facet.html) facets
144
+ * Seamless support for [auto-updating _river_ index](http://www.elasticsearch.org/guide/reference/river/couchdb.html) for _CouchDB_ `_changes` feed
145
+ * [Mapping](http://www.elasticsearch.org/guide/reference/mapping/) management
146
+ * Infrastructure for query filters
147
+ * [Range](http://www.elasticsearch.org/guide/reference/query-dsl/range-filter.html) filters and queries
148
+ * [Geo Filters](http://www.elasticsearch.org/blog/2010/08/16/geo_location_and_search.html) for queries
149
+ * [Statistical](http://www.elasticsearch.org/guide/reference/api/search/facets/statistical-facet.html) facets
150
+ * [Geo Distance](http://www.elasticsearch.org/guide/reference/api/search/facets/geo-distance-facet.html) facets
151
+ * [Index aliases](http://www.elasticsearch.org/guide/reference/api/admin-indices-aliases.html) management
152
+ * [Analyze](http://www.elasticsearch.org/guide/reference/api/admin-indices-analyze.html) API support
153
+ * [Highlighting](http://www.elasticsearch.org/guide/reference/api/search/highlighting.html) support
154
+ * [Bulk](http://www.elasticsearch.org/guide/reference/api/bulk.html) API
155
+ * Embedded webserver to display cluster statistics and allow easy searches
156
+
157
+ Feedback
158
+ --------
159
+
160
+ You can send feedback via [e-mail](mailto:karmi@karmi.cz) or via [GitHub Issues](https://github.com/karmi/slingshot/issues).
161
+
162
+ -----
163
+
164
+ [Karel Minarik](http://karmi.cz)
data/Rakefile ADDED
@@ -0,0 +1,52 @@
1
+ require 'bundler'
2
+ Bundler::GemHelper.install_tasks
3
+
4
+ task :default => :test
5
+
6
+ require 'rake/testtask'
7
+ Rake::TestTask.new(:test) do |test|
8
+ test.libs << 'lib' << 'test'
9
+ test.pattern = 'test/**/*_test.rb'
10
+ test.verbose = true
11
+ end
12
+
13
+ namespace :test do
14
+ Rake::TestTask.new(:unit) do |test|
15
+ test.libs << 'lib' << 'test'
16
+ test.pattern = 'test/unit/*_test.rb'
17
+ test.verbose = true
18
+ end
19
+ Rake::TestTask.new(:integration) do |test|
20
+ test.libs << 'lib' << 'test'
21
+ test.pattern = 'test/integration/*_test.rb'
22
+ test.verbose = true
23
+ end
24
+ end
25
+
26
+ # Generate documentation
27
+ begin
28
+ require 'sdoc'
29
+ rescue LoadError
30
+ end
31
+ require 'rake/rdoctask'
32
+ Rake::RDocTask.new do |rdoc|
33
+ rdoc.rdoc_dir = 'rdoc'
34
+ rdoc.title = "Slingshot"
35
+ rdoc.rdoc_files.include('README.markdown')
36
+ rdoc.rdoc_files.include('lib/**/*.rb')
37
+ end
38
+
39
+ # Generate coverage reports
40
+ begin
41
+ require 'rcov/rcovtask'
42
+ Rcov::RcovTask.new do |test|
43
+ test.libs << 'test'
44
+ test.rcov_opts = ['--exclude', 'gems/*']
45
+ test.pattern = 'test/**/*_test.rb'
46
+ test.verbose = true
47
+ end
48
+ rescue LoadError
49
+ task :rcov do
50
+ abort "RCov is not available. In order to run rcov, you must: sudo gem install rcov"
51
+ end
52
+ end
data/examples/dsl.rb ADDED
@@ -0,0 +1,70 @@
1
+ $LOAD_PATH.unshift File.expand_path('../../lib', __FILE__)
2
+
3
+ require 'rubygems'
4
+ require 'slingshot'
5
+
6
+ extend Slingshot::DSL
7
+
8
+ configure do
9
+ url "http://localhost:9200"
10
+ end
11
+
12
+ index 'articles' do
13
+ delete
14
+ create
15
+
16
+ puts "Documents:", "-"*80
17
+ [
18
+ { :title => 'One', :tags => ['ruby'] },
19
+ { :title => 'Two', :tags => ['ruby', 'python'] },
20
+ { :title => 'Three', :tags => ['java'] },
21
+ { :title => 'Four', :tags => ['ruby', 'php'] }
22
+ ].each do |article|
23
+ puts "Indexing article: #{article.to_json}"
24
+ store article
25
+ end
26
+
27
+ refresh
28
+ end
29
+
30
+ s = search 'articles' do
31
+ query do
32
+ terms :tags, ['ruby']
33
+ end
34
+
35
+ sort do
36
+ title 'desc'
37
+ end
38
+
39
+ facet 'global-tags' do
40
+ terms :tags, :global => true
41
+ end
42
+
43
+ facet 'current-tags' do
44
+ terms :tags
45
+ end
46
+ end
47
+
48
+ puts "", "Query:", "-"*80
49
+ puts s.to_json
50
+
51
+ puts "", "Raw JSON result:", "-"*80
52
+ puts JSON.pretty_generate(s.response)
53
+
54
+ puts "", "Try the query in Curl:", "-"*80
55
+ puts s.to_curl
56
+
57
+ puts "", "Results:", "-"*80
58
+ s.results.each_with_index do |document, i|
59
+ puts "#{i+1}. #{ document['_source']['title'].ljust(20) } [id] #{document['_id']}"
60
+ end
61
+
62
+ puts "", "Facets: tags distribution across the whole database:", "-"*80
63
+ s.results.facets['global-tags']['terms'].each do |f|
64
+ puts "#{f['term'].ljust(10)} #{f['count']}"
65
+ end
66
+
67
+ puts "", "Facets: tags distribution for the current query", "-"*80
68
+ s.results.facets['current-tags']['terms'].each do |f|
69
+ puts "#{f['term'].ljust(10)} #{f['count']}"
70
+ end
@@ -0,0 +1,25 @@
1
+ module Slingshot
2
+
3
+ module Client
4
+
5
+ class Base
6
+ def self.post(url, data)
7
+ raise NoMethodError, "Implement this method in your client class"
8
+ end
9
+ def self.delete(url)
10
+ raise NoMethodError, "Implement this method in your client class"
11
+ end
12
+ end
13
+
14
+ class RestClient < Base
15
+ def self.post(url, data)
16
+ ::RestClient.post url, data
17
+ end
18
+ def self.delete(url)
19
+ ::RestClient.delete url rescue nil
20
+ end
21
+ end
22
+
23
+ end
24
+
25
+ end
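Since `Configuration.client` accepts any class that responds to `post` and `delete`, a test double can be swapped in for the HTTP client. The following is a minimal sketch under that assumption; `RecordingClient` is a hypothetical name and is not part of the gem.

```ruby
# Hypothetical in-memory client (not part of the gem). It assumes the
# class-level post/delete interface that Client::RestClient exposes.
class RecordingClient
  @requests = []

  class << self
    attr_reader :requests

    # Record the call instead of sending an HTTP request.
    def post(url, data)
      @requests << [:post, url, data]
      '{"ok":true}'
    end

    def delete(url)
      @requests << [:delete, url]
      '{"ok":true}'
    end
  end
end

RecordingClient.delete "http://localhost:9200/articles"
RecordingClient.post   "http://localhost:9200/articles", ''
puts RecordingClient.requests.inspect
```

With the gem loaded, such a class could presumably be registered with `Slingshot::Configuration.client RecordingClient`, so index operations can be exercised without a running server.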
@@ -0,0 +1,15 @@
1
+ module Slingshot
2
+
3
+ class Configuration
4
+
5
+ def self.url(value=nil)
6
+ @url = value || @url || "http://localhost:9200"
7
+ end
8
+
9
+ def self.client(klass=nil)
10
+ @client = klass || @client || Client::RestClient
11
+ end
12
+
13
+ end
14
+
15
+ end
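The `value || @url || default` idiom lets one method act as both setter and getter. A minimal standalone sketch of that behavior follows; `DemoConfiguration` and the `es.test` host are hypothetical and used only for illustration.

```ruby
# Hypothetical stand-in for the configuration class (not part of the gem).
class DemoConfiguration
  DEFAULT_URL = "http://localhost:9200"

  # With an argument: store and return it.
  # Without one: return the stored value, or fall back to the default.
  def self.url(value = nil)
    @url = value || @url || DEFAULT_URL
  end
end

first  = DemoConfiguration.url                         # nothing stored yet
second = DemoConfiguration.url("http://es.test:9200")  # stores a new value
third  = DemoConfiguration.url                         # reads the stored value
puts first, second, third
```

One consequence of this design is that passing `nil` cannot reset the setting, because `nil` is indistinguishable from "no argument given".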
@@ -0,0 +1,17 @@
1
+ module Slingshot
2
+ module DSL
3
+
4
+ def configure(&block)
5
+ Configuration.class_eval(&block)
6
+ end
7
+
8
+ def search(indices, &block)
9
+ Search::Search.new(indices, &block).perform
10
+ end
11
+
12
+ def index(name, &block)
13
+ Index.new(name, &block)
14
+ end
15
+
16
+ end
17
+ end
@@ -0,0 +1,42 @@
1
+ module Slingshot
2
+ class Index
3
+
4
+ def initialize(name, &block)
5
+ @name = name
6
+ instance_eval(&block) if block_given?
7
+ end
8
+
9
+ def delete
10
+ response = Configuration.client.delete "#{Configuration.url}/#{@name}"
11
+ return response =~ /error/ ? false : true
12
+ rescue
13
+ false
14
+ end
15
+
16
+ def create
17
+ Configuration.client.post "#{Configuration.url}/#{@name}", ''
18
+ rescue
19
+ false
20
+ end
21
+
22
+ def store(*args)
23
+ if args.size > 1
24
+ (type, document = args)
25
+ else
26
+ (document = args.pop; type = :document)
27
+ end
28
+ document = case true
29
+ when document.is_a?(String) then document
30
+ when document.respond_to?(:to_indexed_json) then document.to_indexed_json
31
+ else raise ArgumentError, "Please pass a JSON string or object with a 'to_indexed_json' method"
32
+ end
33
+ result = Configuration.client.post "#{Configuration.url}/#{@name}/#{type}/", document
34
+ JSON.parse(result)
35
+ end
36
+
37
+ def refresh
38
+ Configuration.client.post "#{Configuration.url}/#{@name}/_refresh", ''
39
+ end
40
+
41
+ end
42
+ end
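The argument handling in `store` can be read as: one argument means "document with the default type", two arguments mean "type, then document". A rough sketch of that splitting logic, using the standard `json` library in place of `to_indexed_json`; the helper name `split_store_args` is hypothetical.

```ruby
require 'json'

# Hypothetical helper (not part of the gem) mirroring how the arguments
# to `store` are split into a type and a JSON body.
def split_store_args(*args)
  if args.size > 1
    type, document = args
  else
    type, document = :document, args.first
  end
  # Strings are assumed to be JSON already; anything else is serialized.
  body = document.is_a?(String) ? document : JSON.generate(document)
  [type, body]
end

default_type, default_body = split_store_args({ :title => 'One' })
custom_type,  custom_body  = split_store_args('article', '{"title":"Two"}')

puts "/articles/#{default_type}/ <- #{default_body}"
puts "/articles/#{custom_type}/ <- #{custom_body}"
```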
@@ -0,0 +1,22 @@
1
+ module Slingshot
2
+ module Results
3
+
4
+ class Collection
5
+ include Enumerable
6
+ attr_reader :time, :total, :results, :facets
7
+
8
+ def initialize(response)
9
+ @time = response['took']
10
+ @total = response['hits']['total']
11
+ @results = response['hits']['hits']
12
+ @facets = response['facets']
13
+ end
14
+
15
+ def each(&block)
16
+ @results.each(&block)
17
+ end
18
+
19
+ end
20
+
21
+ end
22
+ end
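Because the collection includes `Enumerable` and defines `each`, methods such as `map` and `first` work on the hits directly. The sketch below illustrates this with a trimmed-down copy of the class and a canned response; `DemoCollection` and all values in the response are made up for the example.

```ruby
# Hypothetical, trimmed-down copy of the collection (not the gem's class).
class DemoCollection
  include Enumerable
  attr_reader :total, :results

  def initialize(response)
    @total   = response['hits']['total']
    @results = response['hits']['hits']
  end

  # Enumerable builds map, select, first, etc. on top of each.
  def each(&block)
    @results.each(&block)
  end
end

# Canned response with made-up values, shaped like the hash the class reads.
response = {
  'took' => 2,
  'hits' => {
    'total' => 2,
    'hits'  => [
      { '_id' => 'a', '_source' => { 'title' => 'Two' } },
      { '_id' => 'b', '_source' => { 'title' => 'One' } }
    ]
  }
}

collection = DemoCollection.new(response)
titles = collection.map { |hit| hit['_source']['title'] }
puts titles.inspect
```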
@@ -0,0 +1,3 @@
1
+ class Hash
2
+ alias_method :to_indexed_json, :to_json if method_defined?(:to_json)
3
+ end
@@ -0,0 +1,34 @@
1
+ module Slingshot
2
+ module Search
3
+
4
+ #--
5
+ # TODO: Implement all elastic search facets (geo, histogram, range, etc)
6
+ # https://github.com/elasticsearch/elasticsearch/wiki/Search-API-Facets
7
+ #++
8
+
9
+ class Facet
10
+
11
+ def initialize(name, options={}, &block)
12
+ @name = name
13
+ @options = options
14
+ self.instance_eval(&block) if block_given?
15
+ end
16
+
17
+ def terms(field, options={})
18
+ @value = { :terms => { :field => field } }.update(options)
19
+ self
20
+ end
21
+
22
+ def to_json
23
+ to_hash.to_json
24
+ end
25
+
26
+ def to_hash
27
+ h = { @name => @value }
28
+ h[@name].update @options
29
+ return h
30
+ end
31
+ end
32
+
33
+ end
34
+ end
@@ -0,0 +1,33 @@
1
+ module Slingshot
2
+ module Search
3
+
4
+ class Query
5
+ def initialize(&block)
6
+ self.instance_eval(&block) if block_given?
7
+ end
8
+
9
+ def term(field, value)
10
+ @value = { :term => { field => value } }
11
+ end
12
+
13
+ def terms(field, value, options={})
14
+ @value = { :terms => { field => value } }
15
+ @value[:terms].update( { :minimum_match => options[:minimum_match] } ) if options[:minimum_match]
16
+ @value
17
+ end
18
+
19
+ def string(value, options={})
20
+ @value = { :query_string => { :query => value } }
21
+ @value[:query_string].update( { :default_field => options[:default_field] } ) if options[:default_field]
22
+ # TODO: https://github.com/elasticsearch/elasticsearch/wiki/Query-String-Query
23
+ @value
24
+ end
25
+
26
+ def to_json
27
+ @value.to_json
28
+ end
29
+
30
+ end
31
+
32
+ end
33
+ end
@@ -0,0 +1,24 @@
1
+ module Slingshot
2
+ module Search
3
+
4
+ class Sort
5
+ def initialize(&block)
6
+ @value = []
7
+ self.instance_eval(&block) if block_given?
8
+ end
9
+
10
+ def method_missing(id, *args, &block)
11
+ case arg = args.shift
12
+ when String, Symbol, Hash then @value << { id => arg }
13
+ else @value << id
14
+ end
15
+ self
16
+ end
17
+
18
+ def to_json
19
+ @value.to_json
20
+ end
21
+ end
22
+
23
+ end
24
+ end
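The sort builder relies on `method_missing`: inside the block, any unknown method name is taken to be a field name. A self-contained sketch of the same technique, using the standard `json` library; `DemoSort` and the `published_on` field are hypothetical.

```ruby
require 'json'

# Hypothetical re-creation of the sort builder, for illustration only.
class DemoSort
  def initialize(&block)
    @value = []
    instance_eval(&block) if block_given?
  end

  # Unknown method names become sort fields; an optional argument
  # (such as 'desc') becomes the sort direction.
  def method_missing(id, *args, &block)
    case arg = args.shift
    when String, Symbol, Hash then @value << { id => arg }
    else @value << id
    end
    self
  end

  def to_json(*args)
    @value.to_json(*args)
  end
end

sort = DemoSort.new do
  title 'desc'
  published_on
end

sort_json = sort.to_json
puts sort_json
```

A caveat of this approach: field names that collide with existing methods (for example `format`, which resolves to `Kernel#format`) never reach `method_missing`, so they would need a different syntax.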
@@ -0,0 +1,70 @@
1
+ module Slingshot
2
+ module Search
3
+
4
+ class Search
5
+
6
+ attr_reader :indices, :url, :results, :response, :query, :facets
7
+
8
+ def initialize(*indices, &block)
9
+ raise ArgumentError, 'Please pass index or indices to search' if indices.empty?
10
+ @indices = indices
11
+ instance_eval(&block) if block_given?
12
+ end
13
+
14
+ def query(&block)
15
+ @query = Query.new(&block)
16
+ self
17
+ end
18
+
19
+ def sort(&block)
20
+ @sort = Sort.new(&block)
21
+ self
22
+ end
23
+
24
+ def facet(name, options={}, &block)
25
+ @facets ||= {}
26
+ @facets.update Facet.new(name, options, &block).to_hash
27
+ self
28
+ end
29
+
30
+ def from(value)
31
+ @from = value
32
+ self
33
+ end
34
+
35
+ def size(value)
36
+ @size = value
37
+ self
38
+ end
39
+
40
+ def fields(fields=[])
41
+ @fields = fields
42
+ self
43
+ end
44
+
45
+ def perform
46
+ @url = "#{Configuration.url}/#{indices.join(',')}/_search"
47
+ @response = JSON.parse( Configuration.client.post(@url, self.to_json) )
48
+ @results = Results::Collection.new(@response)
49
+ self
50
+ end
51
+
52
+ def to_curl
53
+ %Q|curl -X POST "#{Configuration.url}/#{indices.join(',')}/_search?pretty=true" -d '#{self.to_json}'|
54
+ end
55
+
56
+ def to_json
57
+ request = {}
58
+ request.update( { :query => @query } )
59
+ request.update( { :sort => @sort } ) if @sort
60
+ request.update( { :facets => @facets } ) if @facets
61
+ request.update( { :size => @size } ) if @size
62
+ request.update( { :from => @from } ) if @from
63
+ request.update( { :fields => @fields } ) if @fields
64
+ request.to_json
65
+ end
66
+
67
+ end
68
+
69
+ end
70
+ end
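The request body is assembled from the query plus whichever optional parts (`sort`, `facets`, `size`, `from`, `fields`) were set. The following sketch shows that assembly with plain hashes and the standard `json` library; `build_request` is a hypothetical helper, and the URL assumes the default `http://localhost:9200` server.

```ruby
require 'json'

# Hypothetical helper (not part of the gem): start with the query and add
# only the optional parts that were actually provided.
def build_request(query, options = {})
  request = { :query => query }
  [:sort, :facets, :size, :from, :fields].each do |key|
    request[key] = options[key] if options[key]
  end
  request
end

# Second page of ten results, sorted by title.
request = build_request(
  { :terms => { :tags => ['ruby'] } },
  :sort => [{ :title => 'desc' }],
  :from => 10,
  :size => 10
)

body = JSON.generate(request)
url  = "http://localhost:9200/#{['articles'].join(',')}/_search"
curl = %Q|curl -X POST "#{url}?pretty=true" -d '#{body}'|
puts curl
```

Joining the index names with a comma keeps the URL valid when several indices are searched at once.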
@@ -0,0 +1 @@
1
+ require 'slingshot'
@@ -0,0 +1,3 @@
1
+ module Slingshot
2
+ VERSION = "0.0.1"
3
+ end
data/lib/slingshot.rb ADDED
@@ -0,0 +1,18 @@
1
+ require 'rest_client'
2
+ require 'yajl/json_gem'
3
+
4
+ require 'slingshot/rubyext/hash'
5
+ require 'slingshot/configuration'
6
+ require 'slingshot/client'
7
+ require 'slingshot/client'
8
+ require 'slingshot/search'
9
+ require 'slingshot/search/query'
10
+ require 'slingshot/search/sort'
11
+ require 'slingshot/search/facet'
12
+ require 'slingshot/results/collection'
13
+ require 'slingshot/index'
14
+ require 'slingshot/dsl'
15
+
16
+ module Slingshot
17
+ extend DSL
18
+ end
data/slingshot.gemspec ADDED
@@ -0,0 +1,41 @@
1
+ # -*- encoding: utf-8 -*-
2
+ $:.push File.expand_path("../lib", __FILE__)
3
+ require "slingshot/version"
4
+
5
+ Gem::Specification.new do |s|
6
+ s.name = "slingshot-rb"
7
+ s.version = Slingshot::VERSION
8
+ s.platform = Gem::Platform::RUBY
9
+ s.summary = "Ruby API for ElasticSearch"
10
+ s.homepage = "http://github.com/karmi/slingshot"
11
+ s.authors = [ 'Karel Minarik' ]
12
+ s.email = 'karmi@karmi.cz'
13
+
14
+ s.rubyforge_project = "slingshot"
15
+
16
+ s.files = `git ls-files`.split("\n")
17
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
18
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
19
+
20
+ s.require_paths = ["lib"]
21
+
22
+ s.extra_rdoc_files = [ "README.markdown", "MIT-LICENSE" ]
23
+ s.rdoc_options = [ "--charset=UTF-8" ]
24
+
25
+ s.required_rubygems_version = ">= 1.3.6"
26
+
27
+ s.add_dependency "bundler", "~> 1.0.0"
28
+ s.add_dependency "rest-client", "~> 1.6.0"
29
+ s.add_dependency "yajl-ruby", "> 0.7.9"
30
+
31
+ s.add_development_dependency "turn"
32
+ s.add_development_dependency "shoulda"
33
+ s.add_development_dependency "mocha"
34
+ s.add_development_dependency "sdoc"
35
+ s.add_development_dependency "rcov"
36
+
37
+ s.description = <<-DESC
38
+ Ruby API for the ElasticSearch search engine/database.
39
+ A work in progress, currently.
40
+ DESC
41
+ end