webtagger 0.1.1 → 1.0.0

data/Gemfile ADDED
@@ -0,0 +1,15 @@
1
+ source "http://rubygems.org"
2
+ # Add dependencies required to use your gem here.
3
+ # Example:
4
+ # gem "activesupport", ">= 2.3.5"
5
+
6
+ # Add dependencies to develop your gem here.
7
+ # Include everything needed to run rake, tests, features, etc.
8
+ gem 'json'
9
+ group :development do
10
+ gem "rspec", "~> 2.3.0"
11
+ gem "bundler", "~> 1.0.0"
12
+ gem "jeweler", "~> 1.5.2"
13
+ gem "rcov", ">= 0"
14
+ gem "fakeweb", "~> 1.3.0"
15
+ end
data/Gemfile.lock ADDED
@@ -0,0 +1,32 @@
1
+ GEM
2
+ remote: http://rubygems.org/
3
+ specs:
4
+ diff-lcs (1.1.2)
5
+ fakeweb (1.3.0)
6
+ git (1.2.5)
7
+ jeweler (1.5.2)
8
+ bundler (~> 1.0.0)
9
+ git (>= 1.2.5)
10
+ rake
11
+ json (1.4.6)
12
+ rake (0.8.7)
13
+ rcov (0.9.9)
14
+ rspec (2.3.0)
15
+ rspec-core (~> 2.3.0)
16
+ rspec-expectations (~> 2.3.0)
17
+ rspec-mocks (~> 2.3.0)
18
+ rspec-core (2.3.1)
19
+ rspec-expectations (2.3.0)
20
+ diff-lcs (~> 1.1.2)
21
+ rspec-mocks (2.3.0)
22
+
23
+ PLATFORMS
24
+ ruby
25
+
26
+ DEPENDENCIES
27
+ bundler (~> 1.0.0)
28
+ fakeweb (~> 1.3.0)
29
+ jeweler (~> 1.5.2)
30
+ json
31
+ rcov
32
+ rspec (~> 2.3.0)
data/README.rdoc CHANGED
@@ -2,10 +2,9 @@
2
2
 
3
3
  Webtagger is a simple ruby gem that uses the web intelligence to extract important terms in texts, suitable for tagging them, finding the main subject or automatically building queries.
4
4
 
5
- It depends on {httparty}[http://github.com/jnunemaker/httparty] and uses the following external APIs:
6
5
  * {Yahoo term extraction}[http://developer.yahoo.com/search/content/V1/termExtraction.html]
7
- * {Tag-the-net}[http://tagthe.net]
8
- * {Alchemy API}[http://www.alchemyapi.com/api/keyword/textc.html]
6
+ * {Tag-the-net}[http://tagthe.net] (Needs an API key!)
7
+ * {Alchemy API}[http://www.alchemyapi.com/api/keyword/textc.html] (Needs an API key!)
9
8
 
10
9
  And it's written to support any API in the future.
11
10
 
@@ -14,29 +13,19 @@ And it's written to support any API in the future.
14
13
 
15
14
  ==Usage
16
15
 
17
- Ok, little caveat here, you might need an API-key for some of the services, so you might want to run
18
- webtagger --configure
19
-
20
- To set or update your API keys
21
- Or, you can pass them in the tagging method, like this
22
- tags = WebTagger.tag(text, "yahoo", "YOUR-API-KEY")
23
-
24
16
  Besides that pickle, the standard usage is really simple:
25
17
  require 'webtagger'
26
18
  text = "Hi, I'm text"
27
- #you can use the default service (tagthe)
28
- tags = WebTagger.tag(text)
29
- #or choose whichever you want, if it isn't supported, falls back to the default, so you don't have
30
- #to be on the look for exceptions
31
- tags = WebTagger.tag(text,"yahoo")
19
+ #you simply call the appropriate method:
20
+ tags = WebTagger.tag_with_tagthe(text)
21
+ #some APIs might need an api key, pass that as the second parameter
22
+ tags = WebTagger.tag_with_yahoo(text, "YOUR-API-KEY")
23
+
32
24
 
33
25
  WebTagger uses caching so rest assured you won't be throttled by the API providers.
34
26
 
35
- If something funny happens when calling an API, a +WebTaggerException+ will be raised, and the instance of it will count with a +response+ attribute to see what the original error response was.
36
-
37
- If a http error happens (404, 500, etc), +nil+ will be returned.
27
+ If something funny happens (a 4XX or 5XX response is returned), nil will be returned.
38
28
 
39
-
40
29
  == Note on Patches/Pull Requests
41
30
 
42
31
  * Fork the project.
data/Rakefile CHANGED
@@ -1,55 +1,46 @@
1
1
  require 'rubygems'
2
+ require 'bundler'
3
+ begin
4
+ Bundler.setup(:default, :development)
5
+ rescue Bundler::BundlerError => e
6
+ $stderr.puts e.message
7
+ $stderr.puts "Run `bundle install` to install missing gems"
8
+ exit e.status_code
9
+ end
10
+
2
11
  require 'rake'
3
12
 
4
- begin
5
- require 'jeweler'
6
- Jeweler::Tasks.new do |gem|
13
+ require 'jeweler'
14
+ Jeweler::Tasks.new do |gem|
15
+ # gem is a Gem::Specification... see http://docs.rubygems.org/read/chapter/20 for more options
7
16
  gem.name = "webtagger"
8
17
  gem.summary = %Q{Use some popular web services to extract keywords from text}
9
18
  gem.description = %Q{Use webtagger to use keyword extraction web services (yahoo, tagthe and alchemy) to extract from a text terms suitable for tagging, summarization, query building, etc.}
10
- gem.email = "me@lfborjas.com"
19
+ gem.email = "luisfelipe@lfborjas.com"
11
20
  gem.homepage = "http://github.com/lfborjas/webtagger"
12
21
  gem.authors = ["lfborjas"]
13
- gem.add_development_dependency "thoughtbot-shoulda", ">= 0"
14
- gem.add_dependency "httparty", "0.6.1"
15
- gem.executables << 'webtagger'
16
- # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
17
- end
18
- Jeweler::GemcutterTasks.new
19
- rescue LoadError
20
- puts "Jeweler (or a dependency) not available. Install it with: gem install jeweler"
21
22
  end
23
+ Jeweler::RubygemsDotOrgTasks.new
22
24
 
23
- require 'rake/testtask'
24
- Rake::TestTask.new(:test) do |test|
25
- test.libs << 'lib' << 'test'
26
- test.pattern = 'test/**/test_*.rb'
27
- test.verbose = true
25
+ require 'rspec/core'
26
+ require 'rspec/core/rake_task'
27
+ RSpec::Core::RakeTask.new(:spec) do |spec|
28
+ spec.pattern = FileList['spec/**/*_spec.rb']
28
29
  end
29
30
 
30
- begin
31
- require 'rcov/rcovtask'
32
- Rcov::RcovTask.new do |test|
33
- test.libs << 'test'
34
- test.pattern = 'test/**/test_*.rb'
35
- test.verbose = true
36
- end
37
- rescue LoadError
38
- task :rcov do
39
- abort "RCov is not available. In order to run rcov, you must: sudo gem install spicycode-rcov"
40
- end
31
+ RSpec::Core::RakeTask.new(:rcov) do |spec|
32
+ spec.pattern = 'spec/**/*_spec.rb'
33
+ spec.rcov = true
41
34
  end
42
35
 
43
- task :test => :check_dependencies
44
-
45
- task :default => :test
36
+ task :default => :spec
46
37
 
47
38
  require 'rake/rdoctask'
48
39
  Rake::RDocTask.new do |rdoc|
49
40
  version = File.exist?('VERSION') ? File.read('VERSION') : ""
50
41
 
51
42
  rdoc.rdoc_dir = 'rdoc'
52
- rdoc.title = "webtagger #{version}"
43
+ rdoc.title = "scriabin #{version}"
53
44
  rdoc.rdoc_files.include('README*')
54
45
  rdoc.rdoc_files.include('lib/**/*.rb')
55
46
  end
data/VERSION CHANGED
@@ -1 +1 @@
1
- 0.1.1
1
+ 1.0.0
data/lib/webtagger.rb CHANGED
@@ -1,133 +1,82 @@
1
- require 'fileutils'
2
- require 'httparty'
3
- require 'httparty_icebox'
1
+ %w{net/http json digest/md5}.each{|m| require m }
4
2
 
5
- #Module for extracting keywords from text. Uses the tagthe, yahoo and alchemyAPI web services.
6
- #Because the yahoo and alchemy services require an API key, a command line utility is provided
7
- #to add those tokens for subsequent uses of the modules, storing them in <tt>~/.webtagger</tt>
3
+ #Class for extracting keywords from text. Uses the tagthe, yahoo and alchemyAPI web services.
8
4
  #it uses caching to avoid being throttled by the apis, via the httparty_icebox gem
9
- module WebTagger
10
-
11
- #The services supported by this version
12
- SERVICES = ['yahoo', 'alchemy', 'tagthe']
13
-
14
- #A generic exception to handle api call errors
15
- class WebTaggerError < RuntimeError
16
- attr :response
17
- def initialize(resp)
18
- @response = resp
19
- end
20
- end
21
-
22
- #Get the persisted token for a service, if no service is provided, all tokens are returned in a hash
23
- #Params:
24
- #+service+:: the service for which the token should be retrieved, must be one of SERVICES
25
- def get_token(service="")
26
- service = service.strip.downcase
27
- conf = File.join(ENV['HOME'], '.webtagger')
28
- return nil unless File.exist? conf
29
- srvcs = {}
30
- File.open(conf).each do |service_conf|
31
- s, t = service_conf.split(/\s*=\s*/) rescue next
32
- srvcs[s.strip.downcase] = t.strip
33
- end
5
+ class WebTagger
34
6
 
35
- return case
36
- when service == "all"
37
- srvcs
38
- when (SERVICES.include?(service) and srvcs[service])
39
- srvcs[service]
40
- else
41
- nil
42
- end
43
- end
44
-
45
- #Class to access the
46
- #{yahoo term extraction web service}[http://developer.yahoo.com/search/content/V1/termExtraction.html]
47
- class Yahoo
48
- include HTTParty
49
- include HTTParty::Icebox
50
- format :json
51
- base_uri "http://search.yahooapis.com/ContentAnalysisService/V1"
52
- cache :store => 'memory', :timeout => 1
53
-
54
- def self.tag(text, token)
55
- raise "Token missing!" unless token
56
- resp = post("/termExtraction", :query => {:appid => token, :context => text, :output=>'json'} )
57
- if resp.has_key?('ResultSet')
58
- return resp['ResultSet']['Result'] || []
59
- else
60
- raise WebTaggerError.new(resp), "Error in API call"
7
+ #one of these days, gotta add filesystem cache
8
+ @@cache = {}
9
+ #Macro for creating a provider-specific tagger
10
+ def self.tags_with(service, options={}, &callback)
11
+ opts = {:uri => "",
12
+ :use_tokens=>true,
13
+ :cache=>true,
14
+ :json=>true,
15
+ :method=>:post,
16
+ :text_param=>"text",
17
+ :token_param=>"",
18
+ :extra_params=>{} }.merge(options)
19
+
20
+ #use the meta-class to inject a static method in this class
21
+ (class << self; self; end).instance_eval do
22
+
23
+ #hack the block: using the star operator we can get an empty second param without fuss
24
+ define_method("tag_with_#{service.to_s}") do | text, *tokens |
25
+
26
+ text_digest = Digest::MD5.hexdigest service.to_s+text
27
+ return callback.call(@@cache[text_digest]) unless @@cache[text_digest].nil?
28
+
29
+ query = {opts[:text_param] => text}.merge(opts[:extra_params])
30
+ query[opts[:token_param]] = *tokens if opts[:use_tokens]
31
+
32
+ r = Net::HTTP.post_form URI.parse(opts[:uri]), query
33
+
34
+ response = if opts[:json] then JSON.parse(r.body) else r.body end
35
+ if (100..399) === r.code.to_i
36
+ @@cache[text_digest] = response
37
+ callback.call(response)
38
+ else
39
+ callback.call(nil)
40
+ end
61
41
  end
62
42
  end
63
43
  end
64
-
65
- #Class for accessing the
66
- #{alchemy keyword extraction service}[http://www.alchemyapi.com/api/keyword/textc.html]
67
- class Alchemy
68
- include HTTParty
69
- include HTTParty::Icebox
70
- format :json
71
- base_uri "http://access.alchemyapi.com/calls/text"
72
- cache :store => 'memory', :timeout => 1
73
-
74
- def self.tag(text, token)
75
- raise "Token missing!" unless token
76
- resp = post("/TextGetRankedKeywords", :query => {:apikey => token, :text => text, :outputMode=>'json'} )
77
- if resp['status'] != 'ERROR'
78
- #it's a hash array of [{:text=>"", :relevance=>""}]
79
- kws = []
80
- resp['keywords'].each do |m|
81
- kws.push m["text"]
82
- end
83
- return kws
84
- else
85
- raise WebTaggerError.new(resp), "Error in API call"
86
- end
87
- end
44
+
45
+ Boilerplate = {:yahoo=>{:uri=>"http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction",
46
+ :token_param=>"appid",
47
+ :text_param=>"context",
48
+ :extra_params=>{:output=>"json"}
49
+ },
50
+ :alchemy=>{
51
+ :uri => "http://access.alchemyapi.com/calls/text/TextGetRankedKeywords",
52
+ :token_param => "apikey",
53
+ :extra_params=>{:outputMode => "json"}
54
+ },
55
+ :tagthe=>{:uri=>"http://tagthe.net/api",
56
+ :extra_params=>{:view=>"json"}
57
+ }
58
+ }
59
+
60
+ tags_with :yahoo, Boilerplate[:yahoo] do |r|
61
+ r['ResultSet']['Result'] if r and r['ResultSet']
88
62
  end
89
63
 
90
- #class for accesing the
91
- #{tagthe API}[http://tagthe.net/fordevelopers]
92
- class Tagthe
93
- include HTTParty
94
- include HTTParty::Icebox
95
- format :json
96
- base_uri "http://tagthe.net/api"
97
- cache :store => 'memory', :timeout => 1
98
-
99
- def self.tag(text)
100
- resp = post("/", :query => {:text => text, :view=>'json'} )
101
- if resp.has_key?('memes') and resp['memes'][0].has_key?('dimensions') \
102
- and resp['memes'][0]['dimensions'].has_key?('topic')
103
-
104
- return resp['memes'][0]['dimensions']['topic']
105
- else
106
- return []
64
+ tags_with :alchemy, Boilerplate[:alchemy] do |resp|
65
+ if resp and resp['status'] != 'ERROR'
66
+ #it's a hash array of [{:text=>"", :relevance=>""}]
67
+ kws = []
68
+ resp['keywords'].each do |m|
69
+ kws.push m["text"]
107
70
  end
108
- end
71
+ kws
72
+ end
109
73
  end
110
-
111
- #Method for obtaining keywords in a text
112
- #Params:
113
- #+text+:: a +String+, the text to tag
114
- #+service+(optional):: a +String+, the name of the service to use, defaults to tagthe and must be one of SERVICES
115
- #+token+(optional):: a token to use for calling the service (tagthe doesn't need one), keep in mind that this value,
116
- #superseeds the one stored in +~/.webtagger+ and that, due to caching, might not be used if the request is done
117
- #less than a minute after the last one with a different token
118
- def tag(text,service="tagthe",token=nil)
119
- service = service.strip.downcase
120
- token = get_token(service) unless token
121
- return case
122
- when service == "yahoo"
123
- Yahoo.tag(text, token)
124
- when service == "alchemy"
125
- Alchemy.tag(text, token)
126
- else
127
- Tagthe.tag(text)
74
+
75
+ tags_with :tagthe, Boilerplate[:tagthe] do |resp|
76
+ if resp and resp.has_key?('memes') and resp['memes'][0].has_key?('dimensions') \
77
+ and resp['memes'][0]['dimensions'].has_key?('topic')
78
+
79
+ resp['memes'][0]['dimensions']['topic']
128
80
  end
129
81
  end
130
-
131
- module_function :tag
132
- module_function :get_token
133
82
  end #of webtagger module
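Review note: the `tags_with` macro in this file defines one `tag_with_<service>` class method per provider and memoizes responses under an MD5 digest of the service name plus the text. The following is a minimal sketch of that pattern, not the gem's code. `SketchTagger`, the `fetcher` lambda (a stand-in for the HTTP POST) and the `calls` counter are hypothetical names introduced only for illustration.

```ruby
require 'digest/md5'

# Hypothetical stand-in for WebTagger, for illustration only.
class SketchTagger
  @cache = {}   # digest => parsed response
  @calls = 0    # how many times the (fake) service was contacted

  class << self
    attr_reader :calls

    # Defines SketchTagger.tag_with_<service>.
    # `fetcher` is an assumed callable that replaces the network request;
    # the block post-processes the response, as the callbacks in the diff do.
    def tags_with(service, fetcher, &callback)
      define_singleton_method("tag_with_#{service}") do |text, *tokens|
        key = Digest::MD5.hexdigest(service.to_s + text)
        # Return early on a cache hit so the service is not contacted again.
        return callback.call(@cache[key]) if @cache.key?(key)

        @calls += 1
        response = fetcher.call(text, tokens.first)
        @cache[key] = response unless response.nil?
        callback.call(response)
      end
    end
  end

  # A fake "echo" provider that simply splits the text into words.
  tags_with(:echo, lambda { |text, _token| { 'words' => text.split } }) do |resp|
    resp && resp['words']
  end
end
```

Calling `SketchTagger.tag_with_echo("general surgeon")` twice should contact the fake service only once. The explicit `return` on a cache hit is the part that makes the memoization effective; without it the request would still be issued.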
data/spec/fixtures/alchemy.json ADDED
@@ -0,0 +1,12 @@
1
+ {
2
+ "status": "OK",
3
+ "usage": "By accessing AlchemyAPI or using information generated by AlchemyAPI, you are agreeing to be bound by the AlchemyAPI Terms of Use: http://www.alchemyapi.com/company/terms.html",
4
+ "url": "",
5
+ "language": "english",
6
+ "keywords": [
7
+ {
8
+ "text": "general surgeon",
9
+ "relevance": "0.989011"
10
+ }
11
+ ]
12
+ }
data/spec/fixtures/tagthe.json ADDED
@@ -0,0 +1 @@
1
+ {"memes":[{"source":"urn:memanage:4F85801E2FE923FF6A0DBBB1A606F1A7","updated":"Sat Jan 22 11:33:19 CET 2011","dimensions":{"topic":["surgeon"],"language":["english"]}}]}
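Review note: the two fixtures above show the response shapes that the provider callbacks in lib/webtagger.rb walk. As a rough sketch of that extraction using only the standard `json` library, the helpers below pull the tags out of each payload and return nil for unexpected input. `tagthe_topics` and `alchemy_keywords` are hypothetical helper names, and the Alchemy body is trimmed to the fields that matter here.

```ruby
require 'json'

# Payload copied from the tagthe fixture in this diff.
TAGTHE_BODY = '{"memes":[{"source":"urn:memanage:4F85801E2FE923FF6A0DBBB1A606F1A7",' \
              '"updated":"Sat Jan 22 11:33:19 CET 2011",' \
              '"dimensions":{"topic":["surgeon"],"language":["english"]}}]}'

# Trimmed version of the alchemy fixture (status and keywords only).
ALCHEMY_BODY = '{"status":"OK","keywords":[{"text":"general surgeon","relevance":"0.989011"}]}'

# Hypothetical helper: topic list from a tagthe response, nil if the shape is unexpected.
def tagthe_topics(body)
  meme = JSON.parse(body).fetch('memes', []).first
  return nil unless meme.is_a?(Hash)
  dimensions = meme['dimensions']
  dimensions.is_a?(Hash) ? dimensions['topic'] : nil
end

# Hypothetical helper: keyword texts from an Alchemy response, nil on an ERROR status.
def alchemy_keywords(body)
  parsed = JSON.parse(body)
  return nil if parsed['status'] == 'ERROR'
  (parsed['keywords'] || []).map { |kw| kw['text'] }
end
```

Guarding each level of the hash before indexing keeps a malformed or error response from raising, which fits the README's statement that failures yield nil.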
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,18 @@
1
+ $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
2
+ $LOAD_PATH.unshift(File.dirname(__FILE__))
3
+ require 'rspec'
4
+ require 'webtagger'
5
+ require 'fakeweb'
6
+ file_opener = lambda {|service| File.open("#{File.dirname(__FILE__)}/fixtures/#{service}.json").read}
7
+
8
+ FakeWeb.register_uri(:post, "http://tagthe.net/api", :body=>file_opener.call("tagthe"))
9
+ FakeWeb.register_uri(:post, "http://access.alchemyapi.com/calls/text/TextGetRankedKeywords",
10
+ :body=>file_opener.call("alchemy"))
11
+
12
+ # Requires supporting files with custom matchers and macros, etc,
13
+ # in ./support/ and its subdirectories.
14
+ Dir["#{File.dirname(__FILE__)}/support/**/*.rb"].each {|f| require f}
15
+
16
+ RSpec.configure do |config|
17
+
18
+ end
data/spec/support_spec.rb ADDED
@@ -0,0 +1 @@
1
+ require File.expand_path(File.dirname(__FILE__) + '/spec_helper')
data/spec/webtagger_spec.rb ADDED
@@ -0,0 +1,18 @@
1
+ require File.expand_path(File.dirname(__FILE__) + '/spec_helper')
2
+
3
+ describe "WebTagger" do
4
+ before(:each) do
5
+ @query = "I'm a very general surgeon, surgeon"
6
+ end
7
+
8
+ it "should tag with tagthe" do
9
+ r = WebTagger.tag_with_tagthe @query
10
+ r.should == ["surgeon"]
11
+ end
12
+
13
+ it "should tag with alchemy" do
14
+ r = WebTagger.tag_with_alchemy @query
15
+ r.should == ["general surgeon"]
16
+ end
17
+
18
+ end
metadata CHANGED
@@ -1,13 +1,13 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: webtagger
3
3
  version: !ruby/object:Gem::Version
4
- hash: 25
4
+ hash: 23
5
5
  prerelease: false
6
6
  segments:
7
- - 0
8
- - 1
9
7
  - 1
10
- version: 0.1.1
8
+ - 0
9
+ - 0
10
+ version: 1.0.0
11
11
  platform: ruby
12
12
  authors:
13
13
  - lfborjas
@@ -15,13 +15,13 @@ autorequire:
15
15
  bindir: bin
16
16
  cert_chain: []
17
17
 
18
- date: 2010-08-28 00:00:00 -06:00
18
+ date: 2011-01-22 00:00:00 -06:00
19
19
  default_executable:
20
20
  dependencies:
21
21
  - !ruby/object:Gem::Dependency
22
- name: thoughtbot-shoulda
23
22
  prerelease: false
24
- requirement: &id001 !ruby/object:Gem::Requirement
23
+ name: json
24
+ version_requirements: &id001 !ruby/object:Gem::Requirement
25
25
  none: false
26
26
  requirements:
27
27
  - - ">="
@@ -30,29 +30,90 @@ dependencies:
30
30
  segments:
31
31
  - 0
32
32
  version: "0"
33
+ requirement: *id001
34
+ type: :runtime
35
+ - !ruby/object:Gem::Dependency
36
+ prerelease: false
37
+ name: rspec
38
+ version_requirements: &id002 !ruby/object:Gem::Requirement
39
+ none: false
40
+ requirements:
41
+ - - ~>
42
+ - !ruby/object:Gem::Version
43
+ hash: 3
44
+ segments:
45
+ - 2
46
+ - 3
47
+ - 0
48
+ version: 2.3.0
49
+ requirement: *id002
33
50
  type: :development
34
- version_requirements: *id001
35
51
  - !ruby/object:Gem::Dependency
36
- name: httparty
37
52
  prerelease: false
38
- requirement: &id002 !ruby/object:Gem::Requirement
53
+ name: bundler
54
+ version_requirements: &id003 !ruby/object:Gem::Requirement
39
55
  none: false
40
56
  requirements:
41
- - - "="
57
+ - - ~>
42
58
  - !ruby/object:Gem::Version
43
- hash: 5
59
+ hash: 23
44
60
  segments:
61
+ - 1
62
+ - 0
45
63
  - 0
46
- - 6
64
+ version: 1.0.0
65
+ requirement: *id003
66
+ type: :development
67
+ - !ruby/object:Gem::Dependency
68
+ prerelease: false
69
+ name: jeweler
70
+ version_requirements: &id004 !ruby/object:Gem::Requirement
71
+ none: false
72
+ requirements:
73
+ - - ~>
74
+ - !ruby/object:Gem::Version
75
+ hash: 7
76
+ segments:
47
77
  - 1
48
- version: 0.6.1
49
- type: :runtime
50
- version_requirements: *id002
78
+ - 5
79
+ - 2
80
+ version: 1.5.2
81
+ requirement: *id004
82
+ type: :development
83
+ - !ruby/object:Gem::Dependency
84
+ prerelease: false
85
+ name: rcov
86
+ version_requirements: &id005 !ruby/object:Gem::Requirement
87
+ none: false
88
+ requirements:
89
+ - - ">="
90
+ - !ruby/object:Gem::Version
91
+ hash: 3
92
+ segments:
93
+ - 0
94
+ version: "0"
95
+ requirement: *id005
96
+ type: :development
97
+ - !ruby/object:Gem::Dependency
98
+ prerelease: false
99
+ name: fakeweb
100
+ version_requirements: &id006 !ruby/object:Gem::Requirement
101
+ none: false
102
+ requirements:
103
+ - - ~>
104
+ - !ruby/object:Gem::Version
105
+ hash: 27
106
+ segments:
107
+ - 1
108
+ - 3
109
+ - 0
110
+ version: 1.3.0
111
+ requirement: *id006
112
+ type: :development
51
113
  description: Use webtagger to use keyword extraction web services (yahoo, tagthe and alchemy) to extract from a text terms suitable for tagging, summarization, query building, etc.
52
- email: me@lfborjas.com
53
- executables:
54
- - webtagger
55
- - webtagger
114
+ email: luisfelipe@lfborjas.com
115
+ executables: []
116
+
56
117
  extensions: []
57
118
 
58
119
  extra_rdoc_files:
@@ -60,24 +121,26 @@ extra_rdoc_files:
60
121
  - README.rdoc
61
122
  files:
62
123
  - .document
63
- - .gitignore
124
+ - Gemfile
125
+ - Gemfile.lock
64
126
  - LICENSE
65
127
  - README.rdoc
66
128
  - Rakefile
67
129
  - VERSION
68
- - bin/webtagger
69
- - lib/httparty_icebox.rb
70
130
  - lib/webtagger.rb
71
- - test/helper.rb
72
- - test/test_webtagger.rb
131
+ - spec/fixtures/alchemy.json
132
+ - spec/fixtures/tagthe.json
133
+ - spec/spec_helper.rb
134
+ - spec/support_spec.rb
135
+ - spec/webtagger_spec.rb
73
136
  - webtagger.gemspec
74
137
  has_rdoc: true
75
138
  homepage: http://github.com/lfborjas/webtagger
76
139
  licenses: []
77
140
 
78
141
  post_install_message:
79
- rdoc_options:
80
- - --charset=UTF-8
142
+ rdoc_options: []
143
+
81
144
  require_paths:
82
145
  - lib
83
146
  required_ruby_version: !ruby/object:Gem::Requirement
@@ -106,5 +169,6 @@ signing_key:
106
169
  specification_version: 3
107
170
  summary: Use some popular web services to extract keywords from text
108
171
  test_files:
109
- - test/helper.rb
110
- - test/test_webtagger.rb
172
+ - spec/spec_helper.rb
173
+ - spec/support_spec.rb
174
+ - spec/webtagger_spec.rb
data/.gitignore DELETED
@@ -1,21 +0,0 @@
1
- ## MAC OS
2
- .DS_Store
3
-
4
- ## TEXTMATE
5
- *.tmproj
6
- tmtags
7
-
8
- ## EMACS
9
- *~
10
- \#*
11
- .\#*
12
-
13
- ## VIM
14
- *.swp
15
-
16
- ## PROJECT::GENERAL
17
- coverage
18
- rdoc
19
- pkg
20
-
21
- ## PROJECT::SPECIFIC
data/bin/webtagger DELETED
@@ -1,60 +0,0 @@
1
- #!/usr/bin/env ruby
2
- require 'optparse'
3
- require 'fileutils'
4
- $:.unshift File.dirname(__FILE__) + "/../lib"
5
-
6
- require 'webtagger'
7
-
8
- service = ""
9
-
10
- def configure
11
- WebTagger::SERVICES.each do |service|
12
- next if service == "tagthe"
13
- conf = File.join(ENV['HOME'], '.webtagger')
14
- FileUtils.touch(conf) unless File.exist? conf
15
- srvcs = {}
16
- File.open(conf).each do |service_conf|
17
- s, t = service_conf.split(/\s*=\s*/) rescue next
18
- srvcs[s.strip.downcase] = t ? t.strip : ""
19
- end
20
- puts "Token for #{service.downcase} (leave blank if you don't want to set it now or you already did): "
21
- token = gets
22
- srvcs[service]= (token and not token.strip.empty?) ? token : srvcs[service] || ""
23
- File.open(conf,'w') do |new_conf|
24
- srvcs.each do |s, t|
25
- new_conf.write("#{s.upcase}=#{t.strip}\n")
26
- end
27
- end
28
- end
29
- end
30
-
31
- OptionParser.new do |opt|
32
- opt.banner = "usage: webtagger [OPTIONS] [text]"
33
- opt.on('-c', '--configure', String, "Add tokens for each service") do
34
- configure()
35
- exit
36
- end
37
-
38
- opt.on('-t', '--token=[service]', String, "Get the token of a specific service (or all if not specified)") do |s|
39
- s="all" if not s or s.empty?
40
- puts WebTagger.get_token(s)
41
- exit
42
- end
43
- opt.on('-s', '--service=[service]', String, "Tag the text with the specified service (defaults to tagthe)") do |s|
44
- s="" unless WebTagger::SERVICES.include?(s)
45
- service = s
46
- end
47
- opt.on('-h', '--help', "Display the help screen and exit") do
48
- puts opt
49
- exit
50
- end
51
-
52
- end.parse!
53
-
54
- #do the actual tagging:
55
- text = ARGV[0]
56
- if text and not text.empty?
57
- puts "tags: %s"%WebTagger.tag(text, service).inspect[1..-2] rescue puts "Couldn't extract tags"
58
- else
59
- puts "You must supply some text to tag!"
60
- end
data/lib/httparty_icebox.rb DELETED
@@ -1,263 +0,0 @@
1
- # = Icebox : Caching for HTTParty
2
- #
3
- # Cache responses in HTTParty models [http://github.com/jnunemaker/httparty]
4
- #
5
- # === Usage
6
- #
7
- # class Foo
8
- # include HTTParty
9
- # include HTTParty::Icebox
10
- # cache :store => 'file', :timeout => 600, :location => MY_APP_ROOT.join('tmp', 'cache')
11
- # end
12
- #
13
- # Modeled after Martyn Loughran's APICache [http://github.com/newbamboo/api_cache]
14
- # and Ruby On Rails's caching [http://api.rubyonrails.org/classes/ActiveSupport/Cache.html]
15
- #
16
- # Author: Karel Minarik [www.karmi.cz]
17
- #
18
- # === Notes
19
- #
20
- # Thanks to Amit Chakradeo for pointing out response objects have to be stored marhalled on FS
21
- # Thanks to Marlin Forbes for pointing out the query parameters have to be included in the cache key
22
- #
23
- #
24
-
25
- require 'logger'
26
- require 'ftools'
27
- require 'tmpdir'
28
- require 'pathname'
29
- require 'digest/md5'
30
-
31
- module HTTParty #:nodoc:
32
- # == Caching for HTTParty
33
- # See documentation in HTTParty::Icebox::ClassMethods.cache
34
- #
35
- module Icebox
36
-
37
- module ClassMethods
38
-
39
- # Enable caching and set cache options
40
- # Returns memoized cache object
41
- #
42
- # Following options are available, default values are in []:
43
- #
44
- # +store+:: Storage mechanism for cached data (memory, filesystem, your own) [memory]
45
- # +timeout+:: Cache expiration in seconds [60]
46
- # +logger+:: Path to logfile or logger instance [nil, silent]
47
- #
48
- # Any additional options are passed to the Cache constructor
49
- #
50
- # Usage:
51
- #
52
- # # Enable caching in HTTParty, in memory, for 1 minute
53
- # cache # Use default values
54
- #
55
- # # Enable caching in HTTParty, on filesystem (/tmp), for 10 minutes
56
- # cache :store => 'file', :timeout => 600, :location => '/tmp/'
57
- #
58
- # # Use your own cache store (see +AbstractStore+ class below)
59
- # cache :store => 'memcached', :timeout => 600, :server => '192.168.1.1:1001'
60
- #
61
- def cache(options={})
62
- options[:store] ||= 'memory'
63
- options[:timeout] ||= 60
64
- logger = options[:logger]
65
- @cache ||= Cache.new( options.delete(:store), options )
66
- end
67
-
68
- end
69
-
70
- # When included, extend class with +cache+ method
71
- # and redefine +get+ method to use cache
72
- #
73
- def self.included(receiver) #:nodoc:
74
- receiver.extend ClassMethods
75
- receiver.class_eval do
76
-
77
- # Get reponse from network
78
- #
79
- # TODO: Why alias :new :old is not working here? Returns NoMethodError
80
- #
81
- def self.get_without_caching(path, options={})
82
- perform_request Net::HTTP::Get, path, options
83
- end
84
-
85
- # Get response from cache, if available
86
- #
87
- def self.get_with_caching(path, options={})
88
- key = path
89
- key << options[:query].to_s if defined? options[:query]
90
- if cache.exists?(key) and not cache.stale?(key)
91
- Cache.logger.debug "CACHE -- GET #{path}#{options[:query]}"
92
- return cache.get(key)
93
- else
94
- Cache.logger.debug "/!\\ NETWORK -- GET #{path}#{options[:query]}"
95
- response = get_without_caching(path, options)
96
- cache.set(key, response) if response.code == 200
97
- return response
98
- end
99
- end
100
-
101
- # Redefine original HTTParty +get+ method to use cache
102
- #
103
- def self.get(path, options={})
104
- self.get_with_caching(path, options={})
105
- end
106
-
107
- end
108
- end
109
-
110
- # === Cache container
111
- #
112
- # Pass a store name ('memory', etc) to new
113
- #
114
- class Cache
115
- attr_accessor :store
116
-
117
- def initialize(store, options={})
118
- self.class.logger = options[:logger]
119
- @store = self.class.lookup_store(store).new(options)
120
- end
121
-
122
- def get(key); @store.get encode(key) unless stale?(key); end
123
- def set(key, value); @store.set encode(key), value; end
124
- def exists?(key); @store.exists? encode(key); end
125
- def stale?(key); @store.stale? encode(key); end
126
-
127
- def self.logger; @logger || default_logger; end
128
- def self.default_logger; logger = ::Logger.new(STDERR); end
129
-
130
- # Pass a filename (String), IO object, Logger instance or +nil+ to silence the logger
131
- def self.logger=(device); @logger = device.kind_of?(::Logger) ? device : ::Logger.new(device); end
132
-
133
- private
134
-
135
- # Return store class based on passed name
136
- def self.lookup_store(name)
137
- store_name = "#{name.capitalize}Store"
138
- return Store::const_get(store_name)
139
- rescue NameError => e
140
- raise Store::StoreNotFound, "The cache store '#{store_name}' was not found. Did you loaded any such class?"
141
- end
142
-
143
- def encode(key); Digest::MD5.hexdigest(key); end
144
- end
145
-
146
-
147
- # === Cache stores
148
- #
149
- module Store
150
-
151
- class StoreNotFound < StandardError; end #:nodoc:
152
-
153
- # ==== Abstract Store
154
- # Inherit your store from this class
155
- # *IMPORTANT*: Do not forget to call +super+ in your +initialize+ method!
156
- #
157
- class AbstractStore
158
- def initialize(options={})
159
- raise ArgumentError, "You need to set the :timeout parameter" unless options[:timeout]
160
- @timeout = options[:timeout]
161
- message = "Cache: Using #{self.class.to_s.split('::').last}"
162
- message << " in location: #{options[:location]}" if options[:location]
163
- message << " with timeout #{options[:timeout]} sec"
164
- Cache.logger.info message unless options[:logger].nil?
165
- return self
166
- end
167
- %w{set get exists? stale?}.each do |method_name|
168
- define_method(method_name) { raise NoMethodError, "Please implement method #{method_name} in your store class" }
169
- end
170
- end
171
-
172
- # ==== Store objects in memory
173
- # See HTTParty::Icebox::ClassMethods.cache
174
- #
175
- class MemoryStore < AbstractStore
176
- def initialize(options={})
177
- super; @store = {}; self
178
- end
179
- def set(key, value)
180
- Cache.logger.info("Cache: set (#{key})")
181
- @store[key] = [Time.now, value]; true
182
- end
183
- def get(key)
184
- data = @store[key][1]
185
- Cache.logger.info("Cache: #{data.nil? ? "miss" : "hit"} (#{key})")
186
- data
187
- end
188
- def exists?(key)
189
- !@store[key].nil?
190
- end
191
- def stale?(key)
192
- return true unless exists?(key)
193
- Time.now - created(key) > @timeout
194
- end
195
- private
196
- def created(key)
197
- @store[key][0]
198
- end
199
- end
200
-
201
- # ==== Store objects on the filesystem
202
- # See HTTParty::Icebox::ClassMethods.cache
203
- #
204
- class FileStore < AbstractStore
205
- def initialize(options={})
206
- super
207
- options[:location] ||= Dir::tmpdir
208
- @path = Pathname.new( options[:location] )
209
- FileUtils.mkdir_p( @path )
210
- self
211
- end
212
- def set(key, value)
213
- Cache.logger.info("Cache: set (#{key})")
214
- File.open( @path.join(key), 'w' ) { |file| file << Marshal.dump(value) }
215
- true
216
- end
217
- def get(key)
218
- data = Marshal.load(File.read( @path.join(key)))
219
- Cache.logger.info("Cache: #{data.nil? ? "miss" : "hit"} (#{key})")
220
- data
221
- end
222
- def exists?(key)
223
- File.exists?( @path.join(key) )
224
- end
225
- def stale?(key)
226
- return true unless exists?(key)
227
- Time.now - created(key) > @timeout
228
- end
229
- private
230
- def created(key)
231
- File.mtime( @path.join(key) )
232
- end
233
- end
234
- end
235
-
236
- end
237
- end
238
-
239
-
240
- # Major parts of this code are based on architecture of ApiCache.
241
- # Copyright (c) 2008 Martyn Loughran
242
- #
243
- # Other parts are inspired by the ActiveSupport::Cache in Ruby On Rails.
244
- # Copyright (c) 2005-2009 David Heinemeier Hansson
245
- #
246
- # Permission is hereby granted, free of charge, to any person obtaining
247
- # a copy of this software and associated documentation files (the
248
- # "Software"), to deal in the Software without restriction, including
249
- # without limitation the rights to use, copy, modify, merge, publish,
250
- # distribute, sublicense, and/or sell copies of the Software, and to
251
- # permit persons to whom the Software is furnished to do so, subject to
252
- # the following conditions:
253
- #
254
- # The above copyright notice and this permission notice shall be
255
- # included in all copies or substantial portions of the Software.
256
- #
257
- # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
258
- # EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
259
- # MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
260
- # NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
261
- # LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
262
- # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
263
- # WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
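Review note: the removed Icebox `MemoryStore` stored each value with its creation time and treated entries older than `:timeout` seconds as stale, whereas the `@@cache` hash that replaces it in 1.0.0 never expires entries. If expiry is wanted again, a small store along these lines might be enough. This is a sketch under assumptions: `TtlStore` and its injectable `clock` parameter are hypothetical and not part of the gem.

```ruby
# Hypothetical in-memory store with expiry, loosely following the removed MemoryStore.
class TtlStore
  # timeout: lifetime of an entry in seconds.
  # clock:   callable returning the current Time (injectable so expiry can be tested).
  def initialize(timeout, clock = -> { Time.now })
    @timeout = timeout
    @clock = clock
    @store = {}
  end

  # Store the value together with the time it was written.
  def set(key, value)
    @store[key] = [@clock.call, value]
    true
  end

  # An entry is stale if it is missing or older than the timeout.
  def stale?(key)
    entry = @store[key]
    entry.nil? || (@clock.call - entry[0]) > @timeout
  end

  # Return the value only while the entry is still fresh.
  def get(key)
    stale?(key) ? nil : @store[key][1]
  end
end
```

Passing the clock in as a lambda is a design choice made here so that staleness can be checked without sleeping; production code could rely on the default `Time.now`.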
data/test/helper.rb DELETED
@@ -1,10 +0,0 @@
1
- require 'rubygems'
2
- require 'test/unit'
3
- require 'shoulda'
4
-
5
- $LOAD_PATH.unshift(File.join(File.dirname(__FILE__), '..', 'lib'))
6
- $LOAD_PATH.unshift(File.dirname(__FILE__))
7
- require 'webtagger'
8
-
9
- class Test::Unit::TestCase
10
- end
data/test/test_webtagger.rb DELETED
@@ -1,7 +0,0 @@
1
- require 'helper'
2
-
3
- class TestWebtagger < Test::Unit::TestCase
4
- should "probably rename this file and start testing for real" do
5
- flunk "hey buddy, you should probably rename this file and start testing for real"
6
- end
7
- end