sem_extractor 0.0.2 → 0.0.3

data/README.rdoc CHANGED
@@ -6,36 +6,36 @@ SemExtractor is made to have in a single place, wrappers for most of the semanti
  - Yahoo Boss
  - OpenCalais

- = H2 Please tell me if there are more API's to include!
+ Please tell me if there are more API's to include!

  <em>After using Term Extraction gem, I happened to need the score of the different tags I got from the different APIS + I wanted to use Nokogiri for performance concerns.
  I thank alexrabarts, because his work gave me the idea to create my first gem</em>

  == Installation

- To install (I strongly recommend you use RVM):
- gem install sem_extractor
+ gem install sem_extractor


- == Usage
+ == Examples
+ For sure, you'll have to request your own API Key for each kind of API you want to use.
  Most of the methods below retrieve a hash with 'name' and 'score'

- Initialize:
- - yahoo = SemExtractor::Yahoo.new(:api_key => your_key, :context => your_text)
- - zemanta = SemExtractor::Zemanta.new(:api_key => your_key, :context => your_text)
- - sem = SemExtractor::Textwise.new(:api_key => your_key, :context => your_text_or_url)
- - calais = SemExtractor::Calais.new(:api_key => CALAIS, :context => REQUEST)
-
- Get info:
- - yahoo.terms
- - zemanta.terms
- - zemanta.categories
- - sem.terms
- - sem.categories
- - sem.filter #filters the useful content of a web page, retrieves text
- - calais.terms
- - calais.categories
- - calais.geos
+ === Initialize:
+ yahoo = SemExtractor::Yahoo.new(:api_key => your_key, :context => your_text)
+ zemanta = SemExtractor::Zemanta.new(:api_key => your_key, :context => your_text)
+ sem = SemExtractor::Textwise.new(:api_key => your_key, :context => your_text_or_url)
+ calais = SemExtractor::Calais.new(:api_key => your_key, :context => your_text)
+
+ ===Get info:
+ yahoo.terms
+ zemanta.terms
+ zemanta.categories
+ sem.terms
+ sem.categories
+ sem.filter #filters the useful content of a web page, retrieves text
+ calais.terms
+ calais.categories
+ calais.geos

  == Note on Patches/Pull Requests

data/VERSION CHANGED
@@ -1 +1 @@
- 0.0.2
+ 0.0.3
data/lib/apis/calais.rb CHANGED
@@ -6,17 +6,16 @@ class SemExtractor
    Nokogiri::XML(remote_xml).xpath('//rdf:Description').map { |h|
      node_type = h.xpath('rdf:type').first['resource']
      if node_type.include?('/type/cat/')
-       @categories << { "name" => h.xpath('c:categoryName').first.content, "score"=>h.xpath('c:score').first.content}
+       @categories << { "name" => sanitize(h.xpath('c:categoryName')), "score"=> sanitize(h.xpath('c:score'))}
      elsif node_type.include?('/type/em/')
-       nationality = h.xpath('c:nationality').first.nil? ? 'N/A' : h.xpath('c:nationality').first.content
-       @terms << { "name" => h.xpath('c:name').first.content, "score" => nil, "nationality" => nationality }
+       @terms << { "name" => sanitize(h.xpath('c:name')), "score" => nil, "nationality" => sanitize(h.xpath('c:nationality')) }
      elsif node_type.include?('/type/sys/InstanceInfo')
        #nothing to do, no info to take
      elsif node_type.include?('/type/sys/RelevanceInfo')
        # I assume here, Open Calais will keep on giving information in the proper order, seems fair :)
-       @terms.last["score"] = h.xpath('c:relevance').first.content
+       @terms.last["score"] = sanitize(h.xpath('c:relevance'))
      elsif node_type.include?('/Geo/')
-       @geos <<{ "name" => h.xpath('c:name').first.content }
+       @geos <<{ "name" => sanitize(h.xpath('c:name')) }
      end
    }
  end
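The RelevanceInfo branch above depends on document order: each relevance value is written onto whichever term was appended last. A minimal sketch of that pattern follows, assuming plain hashes as stand-ins for the parsed rdf:Description nodes. The `:kind`, `:name`, and `:relevance` keys and the `pair_scores` helper are hypothetical and not part of the gem, and the empty-list guard is an addition that the gem code does not have.

```ruby
# Hypothetical stand-ins for parsed nodes, in the order the service returned them.
NODES = [
  { kind: :entity,    name: "Paris" },
  { kind: :relevance, relevance: "0.71" },
  { kind: :entity,    name: "Ruby" },
  { kind: :relevance, relevance: "0.32" }
].freeze

# Attach each relevance value to the most recently seen entity,
# mirroring the `@terms.last["score"] = ...` step in the diff.
def pair_scores(nodes)
  terms = []
  nodes.each do |node|
    case node[:kind]
    when :entity
      terms << { "name" => node[:name], "score" => nil }
    when :relevance
      # Guard (not in the gem): ignore a relevance node that precedes any entity.
      terms.last["score"] = node[:relevance] unless terms.empty?
    end
  end
  terms
end

TERMS = pair_scores(NODES)
```

If the service ever interleaved nodes differently, a score would land on the wrong term, which is the risk the in-code comment acknowledges.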
@@ -33,6 +32,10 @@ class SemExtractor
  end

  private
+ def sanitize(item)
+   item.first.nil? ? 'N/A' : item.first.content
+ end
+
  def gateway
    'http://api.opencalais.com/enlighten/rest/'
  end
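The new `sanitize` helper puts the `first.nil?` check, previously applied only to nationality, in one place. A minimal sketch of its behaviour follows, runnable without Nokogiri or a network call. `FakeNode` is a hypothetical stand-in for a node that responds to `content`, and a plain Array stands in for the node set returned by `xpath` (both return `nil` from `first` when empty).

```ruby
# Hypothetical stand-in for an XML node; only `content` is needed here.
FakeNode = Struct.new(:content)

# Same body as the helper added in 0.0.3.
def sanitize(item)
  item.first.nil? ? 'N/A' : item.first.content
end

FOUND   = sanitize([FakeNode.new("France")]) # text of the first match
MISSING = sanitize([])                       # no match, falls back to 'N/A'
```

One consequence visible in the diff: a missing category name, score, or relevance element now yields the string 'N/A' instead of raising NoMethodError on `nil`, so callers that compare scores numerically may want to check for that value first.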
Binary file
@@ -5,7 +5,7 @@

  Gem::Specification.new do |s|
    s.name = %q{sem_extractor}
-   s.version = "0.0.2"
+   s.version = "0.0.3"

    s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
    s.authors = ["apneadiving"]
@@ -29,6 +29,7 @@ Gem::Specification.new do |s|
  "lib/sem_extractor.rb",
  "pkg/sem_extractor-0.0.0.gem",
  "pkg/sem_extractor-0.0.1.gem",
+ "pkg/sem_extractor-0.0.2.gem",
  "sem_extractor.gemspec",
  "test/helper.rb",
  "test/test_sem_extractor.rb"
metadata CHANGED
@@ -1,13 +1,13 @@
  --- !ruby/object:Gem::Specification
  name: sem_extractor
  version: !ruby/object:Gem::Version
-   hash: 27
+   hash: 25
    prerelease: false
    segments:
    - 0
    - 0
-   - 2
-   version: 0.0.2
+   - 3
+   version: 0.0.3
  platform: ruby
  authors:
  - apneadiving
@@ -54,6 +54,7 @@ files:
  - lib/sem_extractor.rb
  - pkg/sem_extractor-0.0.0.gem
  - pkg/sem_extractor-0.0.1.gem
+ - pkg/sem_extractor-0.0.2.gem
  - sem_extractor.gemspec
  - test/helper.rb
  - test/test_sem_extractor.rb