dover_to_calais 0.1.0 → 0.2.0

checksums.yaml ADDED
@@ -0,0 +1,15 @@
1
+ ---
2
+ !binary "U0hBMQ==":
3
+ metadata.gz: !binary |-
4
+ OWJmOWEyMGFjNDk2ZjZiODYyNjQ1NDM2YjM0YjMyNzQ1MmUzZjg3MA==
5
+ data.tar.gz: !binary |-
6
+ MTllMDRiOTNlNDg2Y2FiNmY1MmQyMjAyMzViNWJiZWFmN2ZjYTY3ZA==
7
+ SHA512:
8
+ metadata.gz: !binary |-
9
+ OGI3NDU4YWU1YzllMjBiNTVlMmU3NzNhMDUzYmNhNWYzODY0ZTE3MDQzZmMx
10
+ NmZiMDMxZjYzMTI3ZTdkYWU5MGNiNDc3ZTE2ZTRjYThjYjc5ZDQxNjFlZjU0
11
+ MGRmZWY5ZGM4NTAwYjAyZTEyZmY5M2I5MDdjNDA4NWQ1MDE4MDM=
12
+ data.tar.gz: !binary |-
13
+ ODUyZGFhN2JhYjdjZDAyNmMxMTNhZjY0MjJhNWQ5YjU2OTY0OTQyNmU4MDkz
14
+ NjljZTU1NDUzYTRhN2I2MTA3MmQ3MTM3MDYxODUyMjgzOGVkYTYzNzY4MjA1
15
+ YTgyZDNkNjE3YjI1NWJiYTJkMzNjN2RiYzEzN2M1MWNmYzFhMzU=
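
The values in `checksums.yaml` above are stored as `!binary` (Base64) strings. As a minimal sketch, assuming only core Ruby (`String#unpack1` with the `'m'` directive), the algorithm key and the first `metadata.gz` entry can be decoded like this to show that each value is a hex digest wrapped in Base64:

```ruby
# Decode the Base64-wrapped key and digest copied from checksums.yaml above.
algorithm = 'U0hBMQ=='.unpack1('m')
digest    = 'OWJmOWEyMGFjNDk2ZjZiODYyNjQ1NDM2YjM0YjMyNzQ1MmUzZjg3MA=='.unpack1('m')

puts algorithm      # prints "SHA1"
puts digest         # the hex digest string recorded for metadata.gz
puts digest.length  # a SHA-1 hex digest is 40 characters long
```

The same decoding applies to the `SHA512` entries, whose values are simply split across several Base64 lines.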
data/.gitignore CHANGED
@@ -1,17 +1,33 @@
1
1
  *.gem
2
2
  *.rbc
3
- .bundle
4
- .config
5
- .yardoc
3
+ /.config
4
+ /coverage/
5
+ /InstalledFiles
6
+ /pkg/
7
+ /spec/reports/
8
+ /test/tmp/
9
+ /test/version_tmp/
10
+ /tmp/
11
+
12
+ ## Documentation cache and generated files:
13
+ /.yardoc/
14
+ /_yardoc/
15
+ /doc/
16
+ /rdoc/
17
+
18
+ ## Environment normalisation:
19
+ /.bundle/
20
+ /lib/bundler/man/
21
+
22
+ # for a library or gem, you might want to ignore these files since the code is
23
+ # intended to run in multiple environments; otherwise, check them in:
6
24
  Gemfile.lock
7
- InstalledFiles
8
- _yardoc
9
- coverage
10
- doc/
11
- lib/bundler/man
12
- pkg
13
- rdoc
14
- spec/reports
15
- test/tmp
16
- test/version_tmp
17
- tmp
25
+ .ruby-version
26
+ .ruby-gemset
27
+
28
+ # unless supporting rvm < 1.11.0 or doing something fancy, ignore this:
29
+ .rvmrc
30
+
31
+ ## Sublime Text project files
32
+ *.sublime-project
33
+ *.sublime-workspace
data/README.md CHANGED
@@ -1,14 +1,19 @@
1
+
2
+
1
3
  # DoverToCalais
2
4
 
3
5
  DoverToCalais allows the user to send a wide range of data sources (files & URLs)
4
6
  to [OpenCalais](http://www.opencalais.com/about) and receive asynchronous responses when [OpenCalais](http://www.opencalais.com/about) has finished processing
5
- the inputs. In addition, DoverToCalais enables the filtering of the response in order to
6
- find relevant tags and/or tag values.
7
+ the inputs. In addition, DoverToCalais enables response filtering in order to find relevant tags and/or tag values.
7
8
 
8
9
  ## What is OpenCalais?
9
10
  In short -and quoting the [OpenCalais](http://www.opencalais.com/about) creators:
11
+ > "*The OpenCalais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing (NLP), machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.*"
12
+
13
+ In general, the OpenCalais Simple XML Format (the one used by DoverToCalais) returns three kinds of tags: [Entities, Events](http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/entity-index-and-definitions) and [Topics](http://www.opencalais.com/documentation/calais-web-service-api/api-metadata/document-categorization). ***Entities*** are static 'things', like Persons, Places, etc. that are involved in the textual context in some capacity. OpenCalais assigns a *relevance* score to each entity to indicate its relevance within the context of the data source's general topic. ***Events*** are facts or actions that pertain to one or more Entities. ***Topics*** are a characterisation or generic description of the data source's context.
14
+
15
+ We can use these tags and the information within them to extract relevant information from the data or to draw useful conclusions about it. For example, if the data source tags include an *&lt;Event&gt;* with the value of *'CompanyExpansion'*, I can then look for the &lt;City&gt; or &lt;Company&gt; tags to find out which company is expanding and if it's near my location (hint: they may be looking for more staff :)) Or, I could pick out all &lt;Company&gt;s involved in a &lt;JointVenture&gt;, or all &lt;Person&gt;s implicated in an &lt;Arrest&gt; in my &lt;City&gt;, etc.
10
16
 
11
- *The OpenCalais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing (NLP), machine learning and other methods, Calais analyzes your document and finds the entities within it. But, Calais goes well beyond classic entity identification and returns the facts and events hidden within your text as well.*
12
17
 
13
18
  ## Why use OpenCalais?
14
19
  There are many reasons, mainly to:
@@ -16,11 +21,11 @@ There are many reasons, mainly to:
16
21
  * incorporate tags into other applications, such as search, news aggregation, blogs, catalogs, etc.
17
22
  * enrich search by looking for deeper, contextual meaning instead of merely phrases or keywords.
18
23
  * help to discern relationships between semantic entities.
19
- * facilitate data processing and analysis by allowing easy filtering of relevant data sources and the discarding of irrelevant ones.
24
+ * facilitate data processing and analysis by allowing easy identification of relevant or important data sources and the discarding of irrelevant ones.
20
25
 
21
26
 
22
27
  ## DoverToCalais Features
23
- 1. **Supports many data sources**: Thanks to the power of [Yomu](https://github.com/Erol/yomu), DoverToCalais can process a vast range of files (and, of course, web pages), extract text from them and send
28
+ 1. **Multiple data source support**: Thanks to the power of [Yomu](https://github.com/Erol/yomu), DoverToCalais can process a vast range of files (and, of course, web pages), extract text from them and send
24
29
  them to OpenCalais for analysis and tag generation.
25
30
 
26
31
  2. **Asynchronous responses (callbacks)**:
@@ -28,7 +33,7 @@ Users can set callbacks to receive the processed meta-data, once the OpenCalais
28
33
  Furthermore, a user can set multiple callbacks for the same request (data source), thus enabling cleaner,
29
34
  more modular code.
30
35
 
31
- 3. **Result filtering**: DoverToCalais uses the OpenCalais [Simple XML Format](http://www.opencalais.com/documentation/calais-web-service-api/interpreting-api-response/simple-format) as its preferred response format. The user can work directly with the XML-formatted response, or -if feeling a bit lazy- can take advantage of the DoverToCalais filtering functionality and receive specific entities, optionally based on specified conditions.
36
+ 3. **Result filtering**: DoverToCalais uses the OpenCalais [Simple XML Format](http://www.opencalais.com/documentation/calais-web-service-api/interpreting-api-response/simple-format) as the preferred response format. The user can work directly with the XML-formatted response, or -if feeling a bit lazy- can take advantage of the DoverToCalais filtering functionality and receive specific entities, optionally based on specified conditions.
32
37
 
33
38
  For more details of the features and code samples, see [Usage](#usage).
34
39
 
@@ -53,20 +58,325 @@ Or install it yourself as:
53
58
 
54
59
  $ gem install dover_to_calais
55
60
 
61
+
62
+
56
63
  ## Dependencies
57
- DoverToCalais has been developed in Ruby 1.9.3 and requires the following gems (for development purposes only)
64
+ DoverToCalais has been developed in Ruby 1.9.3 and relies on the following gems to work (installation with the gem command will automatically install all dependencies)
58
65
 
59
66
  * 'nokogiri', 1.6.0
60
67
  * 'eventmachine', 1.0.3
61
68
  * 'em-http-request', 1.1.0
62
- * 'open-uri',
63
69
  * 'yomu', 0.1.9
64
70
 
65
71
  As [Yomu](https://github.com/Erol/yomu) depends on a working JRE in order to function, so does DoverToCalais.
66
72
 
67
73
  ## Usage
74
+ Using DoverToCalais is extremely simple.
75
+
76
+ ### The Basics
77
+ As DoverToCalais uses the awesome-ness of [EventMachine](http://rubyeventmachine.com/), code must be placed within an EM *run* block:
78
+
79
+ ```ruby
80
+ EM.run do
81
+
82
+ # use Control + C to stop the EM
83
+ Signal.trap('INT') { EventMachine.stop }
84
+ Signal.trap('TERM') { EventMachine.stop }
85
+
86
+ # we need an API key to use OpenCalais
87
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
88
+ # create a new dover
89
+ dover = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/world-africa-24412315')
90
+ # parse the text and send it to OpenCalais
91
+ dover.analyse_this
92
+ puts 'do some stuff....'
93
+ # set a callback for when we receive a response
94
+ dover.to_calais { |response| puts response.error ? response.error : response }
95
+
96
+ puts 'do some more stuff....'
97
+
98
+ end
99
+ ```
100
+ This will produce the following result:
101
+
102
+
103
+ > do some stuff.... <br>
104
+ > do some more stuff.... <br>
105
+ > <?xml version="1.0"?> <br>
106
+ > &lt;OpenCalaisSimple&gt; <br>
107
+ > .......... <br>
108
+ > (the rest of the XML response from OpenCalais) <br>
109
+
110
+
111
+ As can be observed, the callback (#to_calais) is triggered after the rest of the code has been executed and only when the OpenCalais request has been completed.
112
+
113
+ Of course, we can analyse more than one source at a time:
114
+
115
+ ```ruby
116
+ EM.run do
117
+
118
+ # use Control + C to stop the EM
119
+ Signal.trap('INT') { EventMachine.stop }
120
+ Signal.trap('TERM') { EventMachine.stop }
121
+
122
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
123
+
124
+ d1 = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/world-africa-24412315')
125
+ d2 = DoverToCalais::Dover.new('/home/fred/Documents/RailsRecipes.pdf')
126
+ d3 = DoverToCalais::Dover.new('//network-drive/annual_forecast.doc')
127
+
128
+ d1.analyse_this; d2.analyse_this; d3.analyse_this;
129
+
130
+ puts 'do some stuff....'
131
+
132
+ d1.to_calais { |response| puts response.error ? response.error : response }
133
+ d2.to_calais { |response| puts response.error ? response.error : response }
134
+ d3.to_calais { |response| puts response.error ? response.error : response }
135
+
136
+ puts 'do some more stuff....'
137
+
138
+ end
139
+ ```
140
+
141
+ This will output the two *puts* statements followed by the three callbacks (d1, d2, d3) in the order in which they are triggered, i.e. the first callback to receive a response from OpenCalais will fire first.
142
+
143
+
144
+ ### Filtering the response
145
+ Why parse the response XML ourselves when DoverToCalais can do it for us? We'll just use the *#filter* method on the response object, passing a filtering hash:
146
+
147
+ ```ruby
148
+ my_filter = {:entity => 'Entity1', :value => 'Value1', :given => {:entity => 'Entity2', :value => 'Value2'}}
149
+ response.filter(my_filter)
150
+ ```
151
+
152
+ The above tells DoverToCalais to look in the response for an entity called 'Entity1' with a value of 'Value1', **only** if the response contains an entity called 'Entity2' which has a value of 'Value2'.
153
+
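
To make the filtering-hash semantics concrete, here is a minimal sketch in plain Ruby. It is not the gem's implementation: `FilterItem` and `sketch_filter` are hypothetical names, the struct keeps only two of the `ResponseItem` fields, and exact string equality on `:value` is an assumption.

```ruby
# Hypothetical stand-in for DoverToCalais::ResponseItem (name and value only).
FilterItem = Struct.new(:name, :value)

# Sketch of the semantics described above; NOT the gem's actual code.
def sketch_filter(items, filter)
  # An item satisfies a condition if the entity name matches and, when a
  # :value is supplied, the value matches too (exact equality is assumed).
  matches = lambda do |item, cond|
    item.name == cond[:entity] &&
      (cond[:value].nil? || item.value == cond[:value])
  end

  # The optional :given clause is a precondition on the whole response.
  given = filter[:given]
  return [] if given && items.none? { |i| matches.call(i, given) }

  items.select { |i| matches.call(i, filter) }
end

items = [
  FilterItem.new('Company', 'Google'),
  FilterItem.new('Company', 'Microsoft'),
  FilterItem.new('Event', 'Business Partnership')
]

with_event = sketch_filter(items, {:entity => 'Company',
                                   :given => {:entity => 'Event', :value => 'Business Partnership'}})
without_event = sketch_filter(items, {:entity => 'Company',
                                      :given => {:entity => 'Event', :value => 'Arrest'}})

p with_event.map(&:value)  # both companies, because the precondition holds
p without_event            # empty, because no 'Arrest' event is present
```

The point of the sketch is that `:given` does not narrow the matched items themselves; it only decides whether the filter returns anything at all.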
154
+ The conditional clause (*:given*) is optional; the filtering hash can be used in pretty much any permutation. For instance:
155
+
156
+ ```ruby
157
+ EM.run do
158
+
159
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
160
+
161
+ dover = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/world-africa-24412315')
162
+ dover.analyse_this
163
+
164
+ dover.to_calais do |response|
165
+ if response.error
166
+ puts response.error
167
+ else
168
+ puts response.filter({:entity => 'Company'})
169
+ end
170
+ end
171
+
172
+ end
173
+ ```
174
+
175
+ This will pick out all entities tagged 'Company' from the data source. The output will be an Array of ResponseItem objects.
176
+
177
+
178
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="BBC News", relevance=0.654, count=13, normalized=nil, importance=nil, originalValue=nil&gt;<br>
179
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="TV Radio", relevance=0.565, count=2, normalized="HERALD & WEEKLY-TV,RADIO OPS", importance=nil, originalValue=nil&gt; <br>
180
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="Reuters", relevance=0.255, count=2, normalized="THOMSON REUTERS GROUP LIMITED", importance=nil, originalValue=nil&gt; <br>
181
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="Twitter", relevance=0.395, count=1, normalized="TWITTER, INC.", importance=nil, originalValue=nil&gt; <br>
182
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="Huffington Post UK", relevance=0.136, count=1, normalized=nil, importance=nil, originalValue=nil&gt; <br>
183
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="Ireland Kenya", relevance=0.144, count=1, normalized=nil, importance=nil, originalValue=nil&gt; <br>
184
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="Yahoo! UK", relevance=0.144, count=1, normalized="YAHOO! UK LIMITED", importance=nil, originalValue=nil&gt; <br>
185
+
186
+
187
+ If this output looks a bit cluttered, we can easily tidy it up:
188
+
189
+ ```ruby
190
+ EM.run do
191
+
192
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
193
+
194
+ dover = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/world-africa-24412315')
195
+ dover.analyse_this
196
+
197
+ dover.to_calais do |response|
198
+ if response.error
199
+ puts response.error
200
+ else
201
+ items = response.filter({:entity => 'Company'})
202
+ items.each do |item|
203
+ puts "#{item.name}: #{item.value}, relevance = #{item.relevance}"
204
+ end
205
+ end
206
+ end
207
+
208
+ end
209
+ ```
210
+
211
+ Which will give us:
212
+
213
+
214
+ > Company: BBC News, relevance = 0.656 <br>
215
+ > Company: TV Radio, relevance = 0.566 <br>
216
+ > Company: Reuters, relevance = 0.26 <br>
217
+ > Company: Guardian.co.uk, relevance = 0.143 <br>
218
+ > Company: Twitter, relevance = 0.399 <br>
219
+ > Company: Huffington Post UK, relevance = 0.132 <br>
220
+ > Company: Ireland Kenya, relevance = 0.139 <br>
221
+ > Company: Yahoo! UK, relevance = 0.139 <br>
222
+
223
+
68
224
 
69
- TODO: Write usage instructions here
225
+ Let's see if the data source refers to any business partnerships:
226
+
227
+ ```ruby
228
+ EM.run do
229
+
230
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
231
+
232
+ dover = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/technology-24380202')
233
+ dover.analyse_this
234
+
235
+ dover.to_calais do |response|
236
+ if response.error
237
+ puts response.error
238
+ else
239
+ items = response.filter({:entity => 'Event', :value => 'Business Partnership'})
240
+ puts "There are #{items.length} events like that in the source"
241
+ end
242
+ end
243
+
244
+ end
245
+ ```
246
+
247
+ which will produce:
248
+
249
+ > There are 1 events like that in the source
250
+
251
+
252
+ Now let's find all companies involved in any business partnerships:
253
+
254
+ ```ruby
255
+ EM.run do
256
+
257
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
258
+
259
+ dover = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/technology-24380202')
260
+ dover.analyse_this
261
+
262
+ dover.to_calais do |response|
263
+ if response.error
264
+ puts response.error
265
+ else
266
+ items = response.filter( {:entity => 'Company', :given => {:entity => 'Event', :value => 'Business Partnership'}} )
267
+ items.each do |item|
268
+ puts "#{item.name}: #{item.value} a.k.a #{item.normalized}, relevance = #{item.relevance}"
269
+ end
270
+ end
271
+ end
272
+
273
+ end
274
+ ```
275
+
276
+ which gives us:
277
+
278
+ > Company: BBC News a.k.a , relevance = 0.678 <br>
279
+ > Company: Google a.k.a GOOGLE INC., relevance = 0.508 <br>
280
+ > Company: Flutter a.k.a FLUTTER COM INC, relevance = 0.531 <br>
281
+ > Company: TV Radio a.k.a HERALD & WEEKLY-TV,RADIO OPS, relevance = 0.558 <br>
282
+ > Company: Microsoft a.k.a MICROSOFT CORPORATION, relevance = 0.303 <br>
283
+ > Company: Adobe a.k.a ADOBE SYSTEMS INCORPORATED, relevance = 0.193 <br>
284
+ > Company: Netflix a.k.a NETFLIX, INC., relevance = 0.301 <br>
285
+ > Company: Y Combinator a.k.a Y Combinator, relevance = 0.258 <br>
286
+ > Company: Nintendo a.k.a Nintendo Co., Ltd., relevance = 0.286 <br>
287
+ > Company: Samsung a.k.a Samsung C&T Corporation, relevance = 0.285 <br>
288
+ > Company: Glyndwr University a.k.a , relevance = 0.269 <br>
289
+
290
+
291
+
292
+ At this point, someone may ask: "But what if we want to get more than one entity for a given condition? The filter hash doesn't allow that!"
293
+
294
+ No, it doesn't. However, given that filtering is done on the *whole* response *after* it's been received, we can apply many filters on the same response:
295
+
296
+ ```ruby
297
+ EM.run do
298
+
299
+ DoverToCalais::API_KEY = 'my-opencalais-api-key'
300
+
301
+ dover = DoverToCalais::Dover.new('http://www.bbc.co.uk/news/technology-24380202')
302
+ dover.analyse_this
303
+
304
+ dover.to_calais do |response|
305
+ if response.error
306
+ puts response.error
307
+ else
308
+ result1 = response.filter( {:entity => 'Company', :value => 'Google', :given => {:entity => 'Technology', :value => 'gesture recognition'}} )
309
+ result2 = response.filter( {:entity => 'Product', :given => {:entity => 'Technology', :value => 'gesture recognition'}} )
310
+ puts result1 | result2
311
+ end
312
+ end
313
+
314
+ end
315
+ ```
316
+
317
+ Which will give us all the gesture-recognition products that Google is associated with according to our data source:
318
+
319
+ > &lt;struct DoverToCalais::ResponseItem name="Company", value="Google", relevance=0.506, count=7, normalized="GOOGLE INC.", importance=nil, originalValue=nil&gt; <br>
320
+ > &lt;struct DoverToCalais::ResponseItem name="Product", value="Xbox Kinect", relevance=0.286, count=1, normalized=nil, importance=nil, originalValue=nil&gt; <br>
321
+ > &lt;struct DoverToCalais::ResponseItem name="Product", value="Galaxy S4 smartphone", relevance=0.282, count=1, normalized=nil, importance=nil, originalValue=nil&gt; <br>
322
+ > &lt;struct DoverToCalais::ResponseItem name="Product", value="Wii", relevance=0.286, count=1, normalized=nil, importance=nil, originalValue=nil&gt; <br>
323
+ > &lt;struct DoverToCalais::ResponseItem name="Product", value="Galaxy S4", relevance=0.282, count=1, normalized=nil, importance=nil, originalValue=nil&gt; <br>
324
+
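
The `result1 | result2` expression in the example above is Ruby's array union, which concatenates the two arrays and drops duplicates. Since the output shows `ResponseItem` to be a Struct, and Structs compare by field values, an item returned by both filters should appear only once. A small sketch with a hypothetical two-field struct (`UnionItem` is an illustration, not the gem's class):

```ruby
# Hypothetical stand-in for DoverToCalais::ResponseItem.
UnionItem = Struct.new(:name, :value)

result1 = [UnionItem.new('Company', 'Google')]
result2 = [UnionItem.new('Product', 'Wii'), UnionItem.new('Company', 'Google')]

# Array#| keeps first-seen order and removes duplicates using #hash / #eql?,
# which Struct defines in terms of its field values.
combined = result1 | result2

combined.each { |item| puts "#{item.name}: #{item.value}" }
puts combined.length  # prints 2
```

Using `+` instead of `|` would keep the duplicate entry.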
325
+
326
+
327
+
328
+ ***PS***: If you're not sure about the names or values of the tags you want to filter, you can get a listing with the following Constants:
329
+
330
+ ```ruby
331
+ CalaisOntology::CALAIS_ENTITIES
332
+ CalaisOntology::CALAIS_EVENTS
333
+ CalaisOntology::CALAIS_TOPICS
334
+ ```
335
+
336
+ ### Code samples
337
+
338
+ More examples of using DoverToCalais can be found as GitHub Gists:
339
+
340
+ * [Using DoverToCalais to semantically tag all files in a directory](https://gist.github.com/RedFred7/6961349)
341
+ * [Use DoverToCalais to find all Persons or Organizations with a relevance score greater than 0.1, if the data source contains an environmental event](https://gist.github.com/RedFred7/6961853)
342
+
343
+
344
+ ### Using a Proxy
345
+
346
+ If you're behind a corporate firewall and the only way to reach outside is through a proxy then you need to set the *DoverToCalais::PROXY* constant:
347
+
348
+ ```ruby
349
+ DoverToCalais::PROXY = {
350
+ :proxy => {
351
+ :host => 'www.myproxy.com',
352
+ :port => 8080,
353
+ :authorization => ['username', 'password'] #optional
354
+ } }
355
+ ```
356
+
357
+
358
+ If you're connecting through a SOCKS5 Proxy just set the *:type* key to :socks5.
359
+
360
+ ```ruby
361
+ DoverToCalais::PROXY = {
362
+ :proxy => {
363
+ :host => 'www.myproxy.com',
364
+ :port => 8080,
365
+ :type => :socks5
366
+ } }
367
+ ```
368
+
369
+ ## Documentation
370
+
371
+ Comprehensive documentation can be found at http://rubydoc.info/gems/dover_to_calais.
372
+
373
+ ## Testing
374
+
375
+ A list of Cucumber features and scenarios can be found in the *features* directory. The list is far from exhaustive, so feel free to add your own scenarios and steps.
376
+
377
+ To run the tests, there is already a rake task set up. Just type:
378
+
379
+ rake features API_KEY='my_api_key'
70
380
 
71
381
  ## Contributing
72
382
 
@@ -75,3 +385,13 @@ TODO: Write usage instructions here
75
385
  3. Commit your changes (`git commit -am 'Add some feature'`)
76
386
  4. Push to the branch (`git push origin my-new-feature`)
77
387
  5. Create new Pull Request
388
+
389
+
390
+ ## Changelog
391
+
392
+ * **07-Oct-2013** Version: 0.1.0
393
+ Initial release
394
+ * **10-Feb-2014** Version: 0.1.1
395
+ Improved Response error message
396
+ * **10-Feb-2014** Version: 0.2.0
397
+ Added #analyse_this to public interface
data/Rakefile CHANGED
@@ -1 +1,7 @@
1
1
  require "bundler/gem_tasks"
2
+ require 'cucumber'
3
+ require 'cucumber/rake/task'
4
+
5
+ Cucumber::Rake::Task.new(:features) do |t|
6
+ t.cucumber_opts = "features --format pretty API_KEY=#{ENV['API_KEY']}"
7
+ end
@@ -7,20 +7,21 @@ Gem::Specification.new do |spec|
7
7
  spec.name = "dover_to_calais"
8
8
  spec.version = DoverToCalais::VERSION
9
9
  spec.authors = ["Fred Heath"]
10
- spec.email = ["fred@bootstrap.me.uk"]
10
+ spec.email = ["fred_h@bootstrap.me.uk"]
11
11
  spec.description = %q{DoverToCalais allows the user to send a wide range of data sources (files & URLs)
12
12
  to OpenCalais and receive asynchronous responses when OpenCalais has finished processing
13
13
  the inputs. In addition, DoverToCalais enables the filtering of the response in order to
14
14
  find relevant tags and/or tag values. }
15
15
  spec.summary = %q{An easy-to-use wrapper round the OpenCalais semantic analysis web service. }
16
- spec.homepage = ""
16
+ spec.homepage = "https://github.com/RedFred7/dover_to_calais"
17
17
  spec.license = "MIT"
18
18
 
19
19
 
20
- spec.add_runtime_dependency "nokogiri", "~>1.6.0"
21
- spec.add_runtime_dependency "eventmachine", "~>1.0.3"
22
- spec.add_runtime_dependency "em-http-request", "~>1.1.0"
23
- spec.add_runtime_dependency "yomu", "~>0.1.9"
20
+ spec.add_runtime_dependency "nokogiri", "~> 1.6"
21
+ spec.add_runtime_dependency "eventmachine", "~> 1.0", ">= 1.0.3"
22
+ spec.add_runtime_dependency "em-http-request", "~> 1.1"
23
+ spec.add_runtime_dependency "yomu", "~> 0.1", ">= 0.1.9"
24
+
24
25
 
25
26
  spec.files = `git ls-files`.split($/)
26
27
  spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
@@ -29,5 +30,7 @@ Gem::Specification.new do |spec|
29
30
 
30
31
 
31
32
  spec.add_development_dependency "bundler", "~> 1.3"
32
- spec.add_development_dependency "rake"
33
+ spec.add_development_dependency "rake", "~> 0"
34
+ spec.add_development_dependency "cucumber", "~> 1.3", ">= 1.3.8"
35
+ spec.add_development_dependency "rspec", "~> 2.14", ">= 2.14.1"
33
36
  end
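
The gemspec change above loosens the runtime dependency constraints. As a quick illustration using RubyGems' own `Gem::Requirement` and `Gem::Version` classes, the old and new `eventmachine` constraints accept different version ranges (the version numbers tested below are arbitrary examples, not releases the gem was verified against):

```ruby
require 'rubygems'

old_req = Gem::Requirement.new('~> 1.0.3')           # >= 1.0.3 and < 1.1
new_req = Gem::Requirement.new('~> 1.0', '>= 1.0.3') # >= 1.0.3 and < 2.0

%w[1.0.2 1.0.9 1.2.0 2.0.0].each do |v|
  version = Gem::Version.new(v)
  puts "#{v}: old=#{old_req.satisfied_by?(version)} new=#{new_req.satisfied_by?(version)}"
end
```

Only the `1.2.0` case differs: the old constraint rejects it, while the new one accepts any 1.x release from 1.0.3 upwards.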
@@ -0,0 +1,14 @@
1
+ Feature: Able to handle wide range of data formats as input
2
+ Scenario Outline: Processing various data-source formats
3
+ Given the file <input>
4
+ When DoverToCalais processes this file
5
+ Then the output should have no errors
6
+
7
+ Examples:
8
+ | input |
9
+ |test_file_1.doc |
10
+ |test_file_1.html|
11
+ |test_file_1.odt|
12
+ |test_file_1.pdf|
13
+ |test_file_1.rtf|
14
+ |test_file_1.txt|
@@ -0,0 +1,24 @@
1
+ Feature: Ability to select certain OpenCalais entities based on certain conditions
2
+
3
+ Background:
4
+ Given the file 'test_file_1.txt' is successfully processed
5
+
6
+
7
+ Scenario: Select all entities with a specific name
8
+ When I filter on {:entity => 'EmailAddress'}
9
+ Then the output should have 2 entries
10
+ And All entries should be named 'EmailAddress'
11
+
12
+ Scenario: Select an entity with a specific value
13
+ When I filter on {:entity => 'Event', :value => 'Meeting'}
14
+ Then the output should have 1 entries
15
+ And All entries should be named 'Event'
16
+ And All entries should have the value 'Meeting'
17
+
18
+
19
+ Scenario: Select an entity only if another entity with a specific value exists in the data source
20
+ When I filter on {:entity => 'Person', :given => {:entity => 'Event', :value => 'Meeting'}}
21
+ Then the output should have 2 entries
22
+ And All entries should be named 'Person'
23
+ And One entry should have the value 'Roger Kay'
24
+ And One entry should have the value 'David Bailey'
@@ -0,0 +1,40 @@
1
+
2
+ require 'nokogiri'
3
+ require 'eventmachine'
4
+ require 'em-http-request'
5
+ require 'yomu'
6
+ require 'rspec'
7
+ require File.expand_path('../../../lib/dover_to_calais', __FILE__)
8
+
9
+
10
+ # N.B Cucumber must be run with the Environment variable 'API_KEY' set
11
+ # to the OpenCalais API Key value.
12
+
13
+
14
+ Given(/^the file (\w+\.\w{3,4})$/) do |arg1|
15
+ puts arg1
16
+ @input = Dir.pwd + '/test/' + arg1
17
+ @output = nil
18
+ end
19
+
20
+
21
+
22
+ When(/^DoverToCalais processes this file$/) do
23
+ EM.run {
24
+
25
+ DoverToCalais::API_KEY = ENV['API_KEY']
26
+ d1 = DoverToCalais::Dover.new(@input)
27
+ d1.analyse_this
28
+ d1.to_calais do |response|
29
+ @output = response
30
+ EM.stop
31
+ end
32
+
33
+ }
34
+ end
35
+
36
+
37
+
38
+ Then(/^the output should have no errors$/) do
39
+ @output.error.should be_nil
40
+ end
@@ -0,0 +1,60 @@
1
+ require 'nokogiri'
2
+ require 'eventmachine'
3
+ require 'em-http-request'
4
+ require 'yomu'
5
+ require 'rspec'
6
+ require File.expand_path('../../../lib/dover_to_calais', __FILE__)
7
+
8
+
9
+ # N.B Cucumber must be run with the Environment variable 'API_KEY' set
10
+ # to the OpenCalais API Key value.
11
+
12
+
13
+
14
+ Given(/^the file '(\w+\.\w{3,4})' is successfully processed$/) do |file|
15
+
16
+ steps %{
17
+ Given the file #{file}
18
+ When DoverToCalais processes this file
19
+ Then the output should have no errors
20
+ }
21
+
22
+ end
23
+
24
+
25
+ When(/^I filter on ({.+})/) do |filter|
26
+ @filtered_output = @output.filter(eval(filter))
27
+
28
+ end
29
+
30
+ Then(/^the output should have (\d+) entries$/) do |item_num|
31
+ @filtered_output.size.should == item_num.to_i
32
+ end
33
+
34
+ Then(/^All entries should be named '(\w+)'$/) do |name|
35
+ @filtered_output.each do |item|
36
+ item.name.should == name
37
+ end
38
+
39
+ end
40
+
41
+
42
+ And(/^All entries should have the value '(\w+)'$/) do |value|
43
+ @filtered_output.each do |item|
44
+ item.value.match(value).should_not be_nil
45
+ end
46
+ end
47
+
48
+
49
+ And(/^One entry should have the value '(\w+\s*\w+)'$/) do |value|
50
+ found = false
51
+ @filtered_output.each do |item|
52
+ if item.value.match(value)
53
+ found =true
54
+ break
55
+ end
56
+ end
57
+
58
+ fail("couldn't match value '#{value}'") unless found
59
+ end
60
+
@@ -130,14 +130,15 @@ module DoverToCalais
130
130
  node_count = node.attribute('count').text.to_i if node.has_attribute?('count')
131
131
  node_normalized = node.attribute('normalized').text if node.has_attribute?('normalized')
132
132
  node_importance = node.attribute('importance').text.to_i if node.has_attribute?('importance')
133
+ node_orig_value = node.xpath('originalValue').text if node.name.eql?('SocialTag')
133
134
 
134
135
  ResponseItem.new(node.name,
135
- node.content,
136
+ node.text,
136
137
  node_relevance,
137
138
  node_count,
138
139
  node_normalized,
139
140
  node_importance,
140
- node.xpath('originalValue').text )
141
+ node_orig_value )
141
142
 
142
143
  end
143
144
 
@@ -172,7 +173,6 @@ module DoverToCalais
172
173
  def initialize(data_src)
173
174
  @data_src = data_src
174
175
  @callbacks = []
175
- analyse_this
176
176
  end
177
177
 
178
178
 
@@ -202,7 +202,7 @@ module DoverToCalais
202
202
  # @return N/A
203
203
  def to_calais(&block)
204
204
  #fred rules ok
205
- if @document
205
+ if !@error
206
206
  @callbacks << block
207
207
  else
208
208
  result = ResponseData.new nil, @error
@@ -211,12 +211,19 @@ module DoverToCalais
211
211
 
212
212
  end #method
213
213
 
214
+ # Gets the source text parsed. If the parsing is successful, the data source is POSTed to OpenCalais
215
+ # via an EventMachine request and a callback is set to manage the OpenCalais response.
216
+ # All Dover object callbacks are then called with the request result yielded to them.
214
217
  #
218
+ # @param N/A
219
+ # @return a {Class ResponseData} object
215
220
  def analyse_this
216
221
 
217
222
  @document = get_src_data(@data_src)
218
223
  begin
219
- if @document
224
+ if @document[0..2].eql?('ERR')
225
+ raise 'Invalid data source'
226
+ else
220
227
  response = nil
221
228
 
222
229
  connection_options = {:inactivity_timeout => 0}
@@ -242,36 +249,47 @@ module DoverToCalais
242
249
 
243
250
 
244
251
  http.callback do
245
- http.response.match(/<OpenCalaisSimple>/) do |m|
246
- response = Nokogiri::XML('<OpenCalaisSimple>' + m.post_match) do |config|
247
- #strict xml parsing, disallow network connections
248
- config.strict.nonet
252
+
253
+ if http.response_header.status == 200
254
+ http.response.match(/<OpenCalaisSimple>/) do |m|
255
+ response = Nokogiri::XML('<OpenCalaisSimple>' + m.post_match) do |config|
256
+ #strict xml parsing, disallow network connections
257
+ config.strict.nonet
258
+ end #block
249
259
  end #block
250
- end #block
251
260
 
252
- result = response ? ResponseData.new(response, nil) : ResponseData.new(nil,'ERR: unable to find <OpenCalaisSimple> tag in response data')
261
+ result = response ?
262
+ ResponseData.new(response, nil) :
263
+ ResponseData.new(nil,'ERR: cannot find <OpenCalaisSimple> tag in response data - source invalid?')
264
+ else #non-200 response header
265
+ result = ResponseData.new nil,
266
+ "ERR: OpenCalais service responded with #{http.response_header.status} - response body: '#{http.response}'"
267
+ end
268
+
253
269
  @callbacks.each { |c| c.call(result) }
270
+
254
271
  end #callback
255
272
 
256
273
 
257
274
  http.errback do
258
-
259
- result = ResponseData.new nil, "#{http.error}"
275
+ result = ResponseData.new nil, "ERR: #{http.error}"
260
276
  @callbacks.each { |c| c.call(result) }
261
277
  end #errback
262
278
 
263
279
 
264
280
  end #if
265
281
  rescue Exception=>e
266
- puts "ERR: #{e}"
267
- exit 0
282
+ #result = ResponseData.new nil, "ERR: #{e}"
283
+ #@callbacks.each { |c| c.call(result) }
284
+ @error = "ERR: #{e}"
268
285
  end
269
286
 
270
287
  end #method
271
288
 
272
289
 
273
- public :to_calais
274
- private :get_src_data, :analyse_this
290
+ alias_method :analyze_this, :analyse_this
291
+ public :to_calais, :analyse_this
292
+ private :get_src_data
275
293
 
276
294
 
277
295
  end #class
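
The `http.callback` hunk above relies on `String#match` with a block to cut everything before the `<OpenCalaisSimple>` tag out of the response body. A minimal sketch of that extraction step in isolation, without EventMachine or Nokogiri; the `body` string here is an invented example, not a real OpenCalais response:

```ruby
# Invented response body: some preamble followed by the Simple Format payload.
body = "<!-- preamble -->\n<OpenCalaisSimple><Company>Google</Company></OpenCalaisSimple>"

# With a block, String#match returns the block's value on a match, nil otherwise.
xml = body.match(/<OpenCalaisSimple>/) do |m|
  # post_match is the text after the matched tag, so the tag is prepended again.
  '<OpenCalaisSimple>' + m.post_match
end

missing = 'no payload here'.match(/<OpenCalaisSimple>/) { |m| m.post_match }

puts xml
p missing  # prints nil, which the library code turns into an 'ERR: ...' result
```

This is why `response` stays `nil` when the tag is absent, and why the new code can report a missing tag separately from a non-200 HTTP status.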
@@ -1,3 +1,3 @@
1
1
  module DoverToCalais
2
- VERSION = "0.1.0"
2
+ VERSION = "0.2.0"
3
3
  end
Binary file
@@ -0,0 +1,36 @@
+ <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
+ <HTML>
+ <HEAD>
+ <META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=utf-8">
+ <TITLE></TITLE>
+ <META NAME="GENERATOR" CONTENT="LibreOffice 3.5 (Linux)">
+ <META NAME="CREATED" CONTENT="0;0">
+ <META NAME="CHANGED" CONTENT="0;0">
+ <STYLE TYPE="text/css">
+ <!--
+ @page { margin: 2cm }
+ P { margin-bottom: 0.21cm }
+ P.western { so-language: en-GB }
+ PRE.western { so-language: en-GB }
+ PRE.cjk { font-family: "WenQuanYi Micro Hei", monospace }
+ PRE.ctl { font-family: "Lohit Hindi", monospace }
+ -->
+ </STYLE>
+ </HEAD>
+ <BODY LANG="en-GB" DIR="LTR">
+ <PRE CLASS="western">Tensleep Corporation (Other OTC:TENS.PK - News) (&quot;Tensleep&quot;) announced that with the acquisition of XSTV Media, Inc. (&quot;XSTV&quot;),
+ it will become an online independent sports company. The transaction is
+ to close on or before September 15, 2007. Tensleep will, by the end of
+ this week or early next week, call a special meeting of shareholders to
+ approve the change name to &quot;XSTV Corporation.&quot;
+
+ David Bailey, an analyst at Gerard Klauer Mattison who can be contacted at david at bailey dot com, said such cuts &quot;could include head count reductions.&quot;
+ Layoffs to some degree are inevitable, said IDC analyst Roger Kay. For years,
+ the company enjoyed a lower cost structure than other PC makers because
+ it sold computers directly.
+ International Star Inc. (info@ITSR.com - OTC BB: ILST) announced that the annual meeting of
+ shareholders of International Star Inc. will be held on May 19, 2008,
+ at 3:00 p.m. (local time) at The Hilton Hotel, 104 Market Street,
+ Shreveport, La., 71101. </PRE>
+ </BODY>
+ </HTML>
Binary file
Binary file
@@ -0,0 +1,54 @@
+ {\rtf1\ansi\deff3\adeflang1025
+ {\fonttbl{\f0\froman\fprq2\fcharset0 Times New Roman;}{\f1\froman\fprq2\fcharset2 Symbol;}{\f2\fswiss\fprq2\fcharset0 Arial;}{\f3\froman\fprq2\fcharset128 Times New Roman;}{\f4\fswiss\fprq2\fcharset128 Arial;}{\f5\fmodern\fprq1\fcharset128 Droid Sans Mono;}{\f6\fnil\fprq2\fcharset128 Droid Sans;}{\f7\fmodern\fprq1\fcharset128 WenQuanYi Micro Hei;}{\f8\fnil\fprq2\fcharset128 Lohit Hindi;}{\f9\fnil\fprq0\fcharset128 Lohit Hindi;}{\f10\fmodern\fprq1\fcharset128 Lohit Hindi;}}
+ {\colortbl;\red0\green0\blue0;\red0\green0\blue128;\red128\green0\blue0;\red128\green128\blue128;}
+ {\stylesheet{\s0\snext0\nowidctlpar{\*\hyphen2\hyphlead2\hyphtrail2\hyphmax0}\cf0\hich\af6\langfe2052\dbch\af8\afs24\alang1081\loch\f3\fs24\lang2057 Normal;}
+ {\*\cs15\snext15 Footnote Characters;}
+ {\*\cs16\snext16 Endnote Characters;}
+ {\*\cs17\snext17\cf2\ul\ulc0\langfe255\alang255\lang255 Internet Link;}
+ {\*\cs18\snext18\cf3\ul\ulc0\langfe255\alang255\lang255 Visited Internet Link;}
+ {\s19\sbasedon0\snext20\sb240\sa120\keepn\hich\af6\dbch\af8\afs28\loch\f4\fs28 Heading;}
+ {\s20\sbasedon0\snext20\sb0\sa120 Text body;}
+ {\s21\sbasedon20\snext21\sb0\sa120\dbch\af9 List;}
+ {\s22\sbasedon0\snext22\sb120\sa120\noline\i\dbch\af9\afs24\ai\fs24 Caption;}
+ {\s23\sbasedon0\snext23\noline\dbch\af9 Index;}
+ {\s24\sbasedon0\snext24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20 Preformatted Text;}
+ {\s25\sbasedon0\snext25\li567\ri0\lin567\rin0\fi0 List Contents;}
+ }{\info{\creatim\yr0\mo0\dy0\hr0\min0}{\revtim\yr0\mo0\dy0\hr0\min0}{\printim\yr0\mo0\dy0\hr0\min0}{\comment LibreOffice}{\vern3500}}\deftab709
+
+ {\*\pgdsctbl
+ {\pgdsc0\pgdscuse195\pgwsxn11906\pghsxn16838\marglsxn1134\margrsxn1134\margtsxn1134\margbsxn1134\pgdscnxt0 Default;}}
+ \formshade\paperh16838\paperw11906\margl1134\margr1134\margt1134\margb1134\sectd\sbknone\sectunlocked1\pgndec\pgwsxn11906\pghsxn16838\marglsxn1134\margrsxn1134\margtsxn1134\margbsxn1134\ftnbj\ftnstart1\ftnrstcont\ftnnar\aenddoc\aftnrstcont\aftnstart1\aftnnrlc
+ \pgndec\pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ Tensleep Corporation (Other OTC:TENS.PK - News) ("Tensleep") announced that with the acquisition of XSTV Media, Inc. ("XSTV"),}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch
+ }{\rtlch \ltrch\loch
+ it will become an online independent sports company. The transaction is}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ to close on or before September 15, 2007. Tensleep will, by the end of}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ this week or early next week, call a special meeting of shareholders to}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ approve the change name to "XSTV Corporation."}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20\rtlch \ltrch\loch
+
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ David Bailey, an analyst at Gerard Klauer Mattison who can be contacted at david at bailey dot com, said such cuts "could include head count reductions."}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ Layoffs to some degree are inevitable, said IDC analyst Roger Kay. For years,}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch
+ }{\rtlch \ltrch\loch
+ the company enjoyed a lower cost structure than other PC makers because}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ it sold computers directly.}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch
+ }{\rtlch \ltrch\loch
+ International Star Inc. (info@ITSR.com - OTC BB: ILST) announced that the annual meeting of}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch\loch
+ shareholders of International Star Inc. will be held on May 19, 2008,}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch
+ }{\rtlch \ltrch\loch
+ at 3:00 p.m. (local time) at The Hilton Hotel, 104 Market Street,}
+ \par \pard\plain \s24\sb0\sa0\hich\af7\dbch\af10\afs20\loch\f5\fs20{\rtlch \ltrch
+ }{\rtlch \ltrch\loch
+ Shreveport, La., 71101. }
+ \par }
@@ -0,0 +1,14 @@
+ Tensleep Corporation (Other OTC:TENS.PK - News) ("Tensleep") announced that with the acquisition of XSTV Media, Inc. ("XSTV"),
+ it will become an online independent sports company. The transaction is
+ to close on or before September 15, 2007. Tensleep will, by the end of
+ this week or early next week, call a special meeting of shareholders to
+ approve the change name to "XSTV Corporation."
+
+ David Bailey, an analyst at Gerard Klauer Mattison who can be contacted at david at bailey dot com, said such cuts "could include head count reductions."
+ Layoffs to some degree are inevitable, said IDC analyst Roger Kay. For years,
+ the company enjoyed a lower cost structure than other PC makers because
+ it sold computers directly.
+ International Star Inc. (info@ITSR.com - OTC BB: ILST) announced that the annual meeting of
+ shareholders of International Star Inc. will be held on May 19, 2008,
+ at 3:00 p.m. (local time) at The Hilton Hotel, 104 Market Street,
+ Shreveport, La., 71101.
metadata CHANGED
@@ -1,84 +1,86 @@
  --- !ruby/object:Gem::Specification
  name: dover_to_calais
  version: !ruby/object:Gem::Version
- version: 0.1.0
- prerelease:
+ version: 0.2.0
  platform: ruby
  authors:
  - Fred Heath
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-10-04 00:00:00.000000000 Z
+ date: 2014-02-10 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: nokogiri
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
- version: 1.6.0
+ version: '1.6'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
- version: 1.6.0
+ version: '1.6'
  - !ruby/object:Gem::Dependency
  name: eventmachine
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: 1.0.3
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
+ - !ruby/object:Gem::Version
+ version: '1.0'
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: 1.0.3
  - !ruby/object:Gem::Dependency
  name: em-http-request
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
- version: 1.1.0
+ version: '1.1'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
- version: 1.1.0
+ version: '1.1'
  - !ruby/object:Gem::Dependency
  name: yomu
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
+ - !ruby/object:Gem::Version
+ version: '0.1'
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: 0.1.9
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
+ - !ruby/object:Gem::Version
+ version: '0.1'
+ - - ! '>='
  - !ruby/object:Gem::Version
  version: 0.1.9
  - !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
@@ -86,7 +88,6 @@ dependencies:
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ~>
  - !ruby/object:Gem::Version
@@ -94,26 +95,64 @@ dependencies:
  - !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ~>
  - !ruby/object:Gem::Version
  version: '0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
- none: false
  requirements:
- - - ! '>='
+ - - ~>
  - !ruby/object:Gem::Version
  version: '0'
+ - !ruby/object:Gem::Dependency
+ name: cucumber
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ~>
+ - !ruby/object:Gem::Version
+ version: '1.3'
+ - - ! '>='
+ - !ruby/object:Gem::Version
+ version: 1.3.8
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ~>
+ - !ruby/object:Gem::Version
+ version: '1.3'
+ - - ! '>='
+ - !ruby/object:Gem::Version
+ version: 1.3.8
+ - !ruby/object:Gem::Dependency
+ name: rspec
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - ~>
+ - !ruby/object:Gem::Version
+ version: '2.14'
+ - - ! '>='
+ - !ruby/object:Gem::Version
+ version: 2.14.1
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ~>
+ - !ruby/object:Gem::Version
+ version: '2.14'
+ - - ! '>='
+ - !ruby/object:Gem::Version
+ version: 2.14.1
  description: ! "DoverToCalais allows the user to send a wide range of data sources
  (files & URLs)\n to OpenCalais and receive asynchronous
  responses when OpenCalais has finished processing\n the
  inputs. In addition, DoverToCalais enables the filtering of the response in order
  to\n find relevant tags and/or tag values. "
  email:
- - fred@bootstrap.me.uk
+ - fred_h@bootstrap.me.uk
  executables: []
  extensions: []
  extra_rdoc_files: []
@@ -125,32 +164,51 @@ files:
  - README.md
  - Rakefile
  - dover_to_calais.gemspec
+ - features/data_sources.feature
+ - features/filtering.feature
+ - features/step_definitions/data_sources_steps.rb
+ - features/step_definitions/filtering_steps.rb
  - lib/dover_to_calais.rb
  - lib/dover_to_calais/ontology.rb
  - lib/dover_to_calais/version.rb
- homepage: ''
+ - test/test_file_1.doc
+ - test/test_file_1.html
+ - test/test_file_1.odt
+ - test/test_file_1.pdf
+ - test/test_file_1.rtf
+ - test/test_file_1.txt
+ homepage: https://github.com/RedFred7/dover_to_calais
  licenses:
  - MIT
+ metadata: {}
  post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  required_rubygems_version: !ruby/object:Gem::Requirement
- none: false
  requirements:
  - - ! '>='
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
  rubyforge_project:
- rubygems_version: 1.8.24
+ rubygems_version: 2.2.0
  signing_key:
- specification_version: 3
+ specification_version: 4
  summary: An easy-to-use wrapper round the OpenCalais semantic analysis web service.
- test_files: []
+ test_files:
+ - features/data_sources.feature
+ - features/filtering.feature
+ - features/step_definitions/data_sources_steps.rb
+ - features/step_definitions/filtering_steps.rb
+ - test/test_file_1.doc
+ - test/test_file_1.html
+ - test/test_file_1.odt
+ - test/test_file_1.pdf
+ - test/test_file_1.rtf
+ - test/test_file_1.txt
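
The requirement changes in the metadata can be checked directly with RubyGems. The sketch below copies three requirement sets from the 0.2.0 metadata and tests sample version strings against them; the `allows?` helper and the sample versions are assumptions chosen for illustration, not part of the gem.

```ruby
require 'rubygems'

# Requirement sets copied from the 0.2.0 metadata.
EVENTMACHINE = Gem::Requirement.new('~> 1.0', '>= 1.0.3')
NOKOGIRI_NEW = Gem::Requirement.new('~> 1.6')
RAKE_NEW     = Gem::Requirement.new('~> 0')

# The 0.1.0 nokogiri requirement, for comparison.
NOKOGIRI_OLD = Gem::Requirement.new('~> 1.6.0')

# Hypothetical helper: true when the version string satisfies the requirement.
def allows?(requirement, version)
  requirement.satisfied_by?(Gem::Version.new(version))
end

# '~> 1.0' combined with '>= 1.0.3' accepts 1.0.3 up to, but not including, 2.0.
puts allows?(EVENTMACHINE, '1.0.2') # false
puts allows?(EVENTMACHINE, '1.2.0') # true
puts allows?(EVENTMACHINE, '2.0.0') # false

# Dropping the patch segment widens the range from 1.6.x to any 1.x from 1.6 on.
puts allows?(NOKOGIRI_OLD, '1.7.0') # false
puts allows?(NOKOGIRI_NEW, '1.7.0') # true

# '~> 0' only admits 0.x releases, unlike the previous '>= 0'.
puts allows?(RAKE_NEW, '0.9.6')     # true
puts allows?(RAKE_NEW, '10.1.0')    # false
```

The last pair is worth noting: replacing `>= 0` with `~> 0` for rake restricts the development dependency to versions below 1.0.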