sem_extractor 0.0.2 → 0.0.3

data/README.rdoc CHANGED
@@ -6,36 +6,36 @@ SemExtractor is made to have in a single place, wrappers for most of the semanti
  - Yahoo Boss
  - OpenCalais

- = H2 Please tell me if there are more API's to include!
+ Please tell me if there are more API's to include!

  <em>After using Term Extraction gem, I happened to need the score of the different tags I got from the different APIS + I wanted to use Nokogiri for performance concerns.
  I thank alexrabarts, because his work gave me the idea to create my first gem</em>

  == Installation

- To install (I strongly recommend you use RVM):
- gem install sem_extractor
+ gem install sem_extractor


- == Usage
+ == Examples
+ For sure, you'll have to request your own API Key for each kind of API you want to use.
  Most of the methods below retrieve a hash with 'name' and 'score'

- Initialize:
- - yahoo = SemExtractor::Yahoo.new(:api_key => your_key, :context => your_text)
- - zemanta = SemExtractor::Zemanta.new(:api_key => your_key, :context => your_text)
- - sem = SemExtractor::Textwise.new(:api_key => your_key, :context => your_text_or_url)
- - calais = SemExtractor::Calais.new(:api_key => CALAIS, :context => REQUEST)
-
- Get info:
- - yahoo.terms
- - zemanta.terms
- - zemanta.categories
- - sem.terms
- - sem.categories
- - sem.filter #filters the useful content of a web page, retrieves text
- - calais.terms
- - calais.categories
- - calais.geos
+ === Initialize:
+ yahoo = SemExtractor::Yahoo.new(:api_key => your_key, :context => your_text)
+ zemanta = SemExtractor::Zemanta.new(:api_key => your_key, :context => your_text)
+ sem = SemExtractor::Textwise.new(:api_key => your_key, :context => your_text_or_url)
+ calais = SemExtractor::Calais.new(:api_key => your_key, :context => your_text)
+
+ ===Get info:
+ yahoo.terms
+ zemanta.terms
+ zemanta.categories
+ sem.terms
+ sem.categories
+ sem.filter #filters the useful content of a web page, retrieves text
+ calais.terms
+ calais.categories
+ calais.geos

  == Note on Patches/Pull Requests

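The README hunk above says most methods return hashes with 'name' and 'score'. As a minimal sketch of consuming such a result, assuming the return value is an Array of those hashes and that scores arrive as strings (or nil, or 'N/A') as the Calais code in this diff suggests, the terms could be ranked like this. The sample data and the `rank_terms` helper are hypothetical and not part of the gem.

```ruby
# Hypothetical sample shaped like the README describes: hashes with
# 'name' and 'score'. Real values would come from the remote APIs.
terms = [
  { "name" => "Ruby",     "score" => "0.42" },
  { "name" => "Nokogiri", "score" => "0.91" },
  { "name" => "Paris",    "score" => nil }
]

# Hypothetical helper: rank terms by score, highest first.
# String#to_f and NilClass#to_f turn a missing or non-numeric
# score (nil, 'N/A') into 0.0, so such terms sort last.
def rank_terms(terms)
  terms.sort_by { |t| -t["score"].to_f }
end

ranked = rank_terms(terms)
ranked.each { |t| puts "#{t['name']}: #{t['score'] || 'n/a'}" }
```

Converting with `to_f` rather than comparing the raw strings avoids lexicographic ordering surprises and tolerates the missing scores that some APIs may return.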
data/VERSION CHANGED
@@ -1 +1 @@
- 0.0.2
+ 0.0.3
data/lib/apis/calais.rb CHANGED
@@ -6,17 +6,16 @@ class SemExtractor
  Nokogiri::XML(remote_xml).xpath('//rdf:Description').map { |h|
  node_type = h.xpath('rdf:type').first['resource']
  if node_type.include?('/type/cat/')
- @categories << { "name" => h.xpath('c:categoryName').first.content, "score"=>h.xpath('c:score').first.content}
+ @categories << { "name" => sanitize(h.xpath('c:categoryName')), "score"=> sanitize(h.xpath('c:score'))}
  elsif node_type.include?('/type/em/')
- nationality = h.xpath('c:nationality').first.nil? ? 'N/A' : h.xpath('c:nationality').first.content
- @terms << { "name" => h.xpath('c:name').first.content, "score" => nil, "nationality" => nationality }
+ @terms << { "name" => sanitize(h.xpath('c:name')), "score" => nil, "nationality" => sanitize(h.xpath('c:nationality')) }
  elsif node_type.include?('/type/sys/InstanceInfo')
  #nothing to do, no info to take
  elsif node_type.include?('/type/sys/RelevanceInfo')
  # I assume here, Open Calais will keep on giving information in the proper order, seems fair :)
- @terms.last["score"] = h.xpath('c:relevance').first.content
+ @terms.last["score"] = sanitize(h.xpath('c:relevance'))
  elsif node_type.include?('/Geo/')
- @geos <<{ "name" => h.xpath('c:name').first.content }
+ @geos <<{ "name" => sanitize(h.xpath('c:name')) }
  end
  }
  end
@@ -33,6 +32,10 @@ class SemExtractor
  end

  private
+ def sanitize(item)
+ item.first.nil? ? 'N/A' : item.first.content
+ end
+
  def gateway
  'http://api.opencalais.com/enlighten/rest/'
  end
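The calais.rb change replaces direct `.first.content` calls, which raise NoMethodError when an XPath query matches nothing, with the `sanitize` helper. The following is a minimal sketch of that behaviour under one assumption: a plain Array stands in for a Nokogiri NodeSet, since `first` returns nil on both when they are empty. `FakeNode` is a hypothetical stand-in for a Nokogiri node and is not part of the gem.

```ruby
# Hypothetical stand-in for a Nokogiri node: any object that
# responds to #content works for this illustration.
FakeNode = Struct.new(:content)

# Same logic as the sanitize helper added in 0.0.3:
# fall back to 'N/A' when the query matched no node.
def sanitize(item)
  item.first.nil? ? 'N/A' : item.first.content
end

found   = [FakeNode.new("France")]  # query that matched one node
missing = []                        # query that matched nothing

puts sanitize(found)
puts sanitize(missing)

# The 0.0.2 style fails on the empty case, because nil has no #content.
begin
  missing.first.content
rescue NoMethodError => e
  puts "0.0.2 style would raise: #{e.class}"
end
```

One consequence worth noting: every field, including "score", can now hold the string 'N/A', so callers that treat scores as numbers may want to check for it.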
Binary file
@@ -5,7 +5,7 @@

  Gem::Specification.new do |s|
  s.name = %q{sem_extractor}
- s.version = "0.0.2"
+ s.version = "0.0.3"

  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["apneadiving"]
@@ -29,6 +29,7 @@ Gem::Specification.new do |s|
  "lib/sem_extractor.rb",
  "pkg/sem_extractor-0.0.0.gem",
  "pkg/sem_extractor-0.0.1.gem",
+ "pkg/sem_extractor-0.0.2.gem",
  "sem_extractor.gemspec",
  "test/helper.rb",
  "test/test_sem_extractor.rb"
metadata CHANGED
@@ -1,13 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: sem_extractor
  version: !ruby/object:Gem::Version
- hash: 27
+ hash: 25
  prerelease: false
  segments:
  - 0
  - 0
- - 2
- version: 0.0.2
+ - 3
+ version: 0.0.3
  platform: ruby
  authors:
  - apneadiving
@@ -54,6 +54,7 @@ files:
  - lib/sem_extractor.rb
  - pkg/sem_extractor-0.0.0.gem
  - pkg/sem_extractor-0.0.1.gem
+ - pkg/sem_extractor-0.0.2.gem
  - sem_extractor.gemspec
  - test/helper.rb
  - test/test_sem_extractor.rb