tag_along 0.0.1 → 0.6.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: bb501613369741434e9f949ebeba6d1b6867db51
4
- data.tar.gz: 82b99ef0fb1c3c543ac5ffa2425abd61024b59d9
3
+ metadata.gz: 391d3d4d89b92c4d71b051eb6545a485e4210669
4
+ data.tar.gz: 7356ca9775ccbbce9fce5c2078669994dc5212f2
5
5
  SHA512:
6
- metadata.gz: 08ceb49d3503b05df762286a6f770d6124912b1d607eb95d7605e73d6aae61d162e5d94961a3a03588e4d8788f148335b0c17c2997b9d71735d603b3d3314be3
7
- data.tar.gz: 0380c0290c379b3d69da2c640c557a11b34122d634514c2e9c7e91f2791054a5ff275bbe87064f55523817c56be4a9e2f0bdbc4908a44d76c2e95eba4307fabf
6
+ metadata.gz: bff906e414af8b88c001e8f532421aa6850744982489f1fbaa59108fe453642963779ebb4f1902ae7ad050258201e11541469f229c2820c1eeb7a93c65c19918
7
+ data.tar.gz: 4edd2b7e8856823343b586dc826ca8c8850ea597daf89d27730a09a622386e6c59bb93a14c7319c8f268f0a042195c3c186514a051824def5d115f9d83d04378
data/.gitignore CHANGED
@@ -15,3 +15,4 @@ spec/reports
15
15
  test/tmp
16
16
  test/version_tmp
17
17
  tmp
18
+ bin
data/.ruby-version ADDED
@@ -0,0 +1 @@
1
+ 2.0.0-p247
data/.travis.yml ADDED
@@ -0,0 +1,7 @@
1
+ rvm:
2
+ - 1.9.3
3
+ - 2.0.0
4
+ bundler_args: --without development
5
+ branches:
6
+ only:
7
+ - master
data/CHANGELOG ADDED
@@ -0,0 +1,6 @@
1
+ 0.6.1 -- fixed a bug which would crash the gem if an object without
2
+ item_string is entered
3
+
4
+ 0.6.0 -- first functional release, not tested in production
5
+
6
+ 0.0.1 -- initial nonfunctional placeholder
data/Gemfile CHANGED
@@ -1,4 +1,17 @@
1
1
  source 'https://rubygems.org'
2
2
 
3
- # Specify your gem's dependencies in tag_along.gemspec
4
- gemspec
3
+ gem 'json', '~> 1.7'
4
+
5
+ group :development do
6
+ gem 'bundler', '~> 1.3'
7
+ gem 'debugger','~> 1.6'
8
+ end
9
+
10
+ group :test do
11
+ gem 'rake', '~> 10.1'
12
+ gem 'rspec', '~> 2.14'
13
+ gem 'rr', '~> 1.1'
14
+ gem 'coveralls', '~> 0.6'
15
+ end
16
+
17
+
data/README.md CHANGED
@@ -1,12 +1,21 @@
1
1
  TagAlong
2
2
  ========
3
3
 
4
+ [![Gem Version][1]][2]
5
+ [![Continuous Integration Status][3]][4]
6
+ [![Coverage Status][5]][6]
7
+ [![CodePolice][7]][8]
8
+ [![Dependency Status][9]][10]
9
+
4
10
  A user who runs a search tool against a text would find
5
11
  multiple text fragments corresponding to the search.
6
12
  These fragments can be found again by storing their
7
13
  start and end offsets. This gem places arbitrary
8
14
  markup tags surrounding the fragments.
9
15
 
16
+ The gem works with UTF-8 and 7-bit ASCII texts. It is quite fast and can tag
17
+ 4MB of text in one second on a 2.7GHz processor.
18
+
10
19
  Installation
11
20
  ------------
12
21
 
@@ -25,12 +34,54 @@ Or install it yourself as:
25
34
  Usage
26
35
  -----
27
36
 
28
- To add tags to a text:
37
+ For example, suppose you want to tag the days of the week in a text:
29
38
 
30
- tg = TagAlong.new(some_text)
31
- offsets = [[2, 5], [9, 22], [33, 35]]
32
- tg.tag('<my_tag>', '</my_tag>', offsets)
39
+ text = "There's Sunday and there's Monday"
40
+
41
+ To add tags to a text:
33
42
 
43
+ offsets = [[8,13], [27,32]]
44
+ tg = TagAlong.new(text, offsets)
45
+
46
+ tg.tag('<my_tag>', '</my_tag>')
47
+ puts tg.tagged_text
48
+ # There's <my_tag>Sunday</my_tag> and there's <my_tag>Monday</my_tag>
49
+
50
+ tg.tag('<em>', '</em>')
51
+ puts tg.tagged_text
52
+ # There's <em>Sunday</em> and there's <em>Monday</em>
53
+
54
+ Notice that you can retag the text as many times as you want.
55
+
56
+ ### Offsets
57
+
58
+ To prepare offsets from an arbitrary object:
59
+
60
+ # Array of arrays
61
+ my_ary = [[8,13], [27,32]]
62
+ offsets = TagAlong::Offsets.new(my_ary)
63
+
64
+ # Array of hashes
65
+ my_hash = [{ start: 8, end:13 }, { start:27, end:32 }]
66
+ offsets = TagAlong::Offsets.new(my_hash,
67
+ offset_start: 'start',
68
+ offset_end: 'end')
69
+ or
70
+ offsets = TagAlong::Offsets.new(my_hash,
71
+ offset_start: :start,
72
+ offset_end: :end)
73
+
74
+ # Array of objects
75
+ require 'ostruct'
76
+ my_obj = [OpenStruct.new(s: 8, e: 13), OpenStruct.new(s: 27, e: 32)]
77
+ offsets = TagAlong::Offsets.new(my_obj,
78
+ offset_start: :s,
79
+ offset_end: :e)
80
+
81
+ In all cases you can instantiate TagAlong with the resulting offsets:
82
+
83
+ tg = TagAlong.new(text, offsets)
84
+ tg.tag('|hi|', '|bye|')
34
85
 
35
86
  Contributing
36
87
  ------------
@@ -44,9 +95,19 @@ Contributing
44
95
  Copyright
45
96
  ---------
46
97
 
47
- Authors: [Dmitry Mozzherin][1],
98
+ Authors: [Dmitry Mozzherin][11]
48
99
 
49
100
  Copyright (c) 2013 Marine Biological Laboratory. See LICENSE for
50
101
  further details.
51
102
 
52
- [1]: https://github.com/dimus
103
+ [1]: https://badge.fury.io/rb/tag_along.png
104
+ [2]: http://badge.fury.io/rb/tag_along
105
+ [3]: https://secure.travis-ci.org/GlobalNamesArchitecture/tag_along.png
106
+ [4]: http://travis-ci.org/GlobalNamesArchitecture/tag_along
107
+ [5]: https://coveralls.io/repos/GlobalNamesArchitecture/tag_along/badge.png?branch=master
108
+ [6]: https://coveralls.io/r/GlobalNamesArchitecture/tag_along?branch=master
109
+ [7]: https://codeclimate.com/github/GlobalNamesArchitecture/tag_along.png
110
+ [8]: https://codeclimate.com/github/GlobalNamesArchitecture/tag_along
111
+ [9]: https://gemnasium.com/GlobalNamesArchitecture/tag_along.png
112
+ [10]: https://gemnasium.com/GlobalNamesArchitecture/tag_along
113
+ [11]: https://github.com/dimus
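The offsets in the README example are zero-based and inclusive on both ends, so `[8,13]` covers the six characters of "Sunday". One way to sanity-check a pair before tagging is plain Ruby range slicing. This is a standalone sketch that does not call the gem:

```ruby
text    = "There's Sunday and there's Monday"
offsets = [[8, 13], [27, 32]]

# An inclusive Ruby range (first..last) mirrors the inclusive start/end
# convention used by the README offsets.
fragments = offsets.map { |first, last| text[first..last] }

p fragments
```

If a slice comes back one character short or long, the offsets are probably end-exclusive or one-based and need adjusting before they are handed to `TagAlong.new`.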
data/Rakefile CHANGED
@@ -1 +1,20 @@
1
- require "bundler/gem_tasks"
1
+ require 'bundler'
2
+ require 'bundler/gem_tasks'
3
+ require 'rspec/core'
4
+ require 'rspec/core/rake_task'
5
+
6
+ Bundler::GemHelper.install_tasks
7
+
8
+ begin
9
+ Bundler.setup(:default, :development)
10
+ rescue Bundler::BundlerError => e
11
+ $stderr.puts e.message
12
+ $stderr.puts 'Run `bundle install` to install missing gems'
13
+ exit e.status_code
14
+ end
15
+
16
+ task :default => :spec
17
+
18
+ RSpec::Core::RakeTask.new do |t|
19
+ t.pattern = 'spec/**/*spec.rb'
20
+ end
data/lib/tag_along/offsets.rb ADDED
@@ -0,0 +1,78 @@
1
+ class TagAlong
2
+
3
+ class Offsets
4
+ include Enumerable
5
+
6
+ def initialize(offsets, opts = {})
7
+
8
+ @offsets = offsets
9
+ @offset_start = (opts[:offset_start] || 'offset_start').to_sym
10
+ @offset_end = (opts[:offset_end] || 'offset_end').to_sym
11
+ @item_string = (opts[:item_string] || 'item_string').to_sym
12
+
13
+ item = @offsets.first
14
+ if item.is_a?(Array)
15
+ process_array
16
+ elsif item.is_a?(Hash)
17
+ process_hash
18
+ else
19
+ process_obj
20
+ end
21
+ end
22
+
23
+ def each(&block)
24
+ @offsets.each do |o|
25
+ block.call(o)
26
+ end
27
+ end
28
+
29
+ private
30
+
31
+ def process_array
32
+ @offsets = @offsets.map do |o|
33
+ offset_start = o[0]
34
+ offset_end = o[1]
35
+ item_string = o[2]
36
+ instantiate(offset_start, offset_end, item_string)
37
+ end
38
+ end
39
+
40
+ def process_hash
41
+ @offsets.each { |h| symbolize_keys(h) }
42
+ @offsets = @offsets.map do |o|
43
+ instantiate(o[@offset_start], o[@offset_end], o[@item_string])
44
+ end
45
+ end
46
+
47
+ def process_obj
48
+ @offsets = @offsets.map do |o|
49
+ item_string = o.respond_to?(@item_string) ?
50
+ o.send(@item_string) :
51
+ nil
52
+ instantiate(o.send(@offset_start),
53
+ o.send(@offset_end),
54
+ item_string)
55
+ end
56
+ end
57
+
58
+ def instantiate(offset_start, offset_end, item_string)
59
+ OpenStruct.new(offset_start: to_int(offset_start),
60
+ offset_end: to_int(offset_end),
61
+ item_string: item_string)
62
+ end
63
+
64
+ def to_int(val)
65
+ int = val.to_i
66
+ raise TypeError.new('Offsets must be integers') if int.to_s != val.to_s
67
+ int
68
+ end
69
+
70
+ def symbolize_keys(a_hash)
71
+ a_hash.keys.each do |key|
72
+ a_hash[(key.to_sym rescue key) || key] = a_hash.delete(key)
73
+ end
74
+ end
75
+
76
+ end
77
+
78
+ end
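The three input shapes handled by `Offsets` (arrays, hashes, arbitrary objects) all reduce to a start/end pair of integers. The following is a minimal standalone sketch of that normalization. `normalize_offset` and its `start_key`/`end_key` parameters are hypothetical names chosen for illustration and are not part of the gem:

```ruby
require 'ostruct'

# Hypothetical helper (not the gem's API): reduce one item to [start, end].
def normalize_offset(item, start_key: :offset_start, end_key: :offset_end)
  pair =
    case item
    when Array
      item.first(2)
    when Hash
      # Accept both symbol and string keys.
      [item[start_key] || item[start_key.to_s],
       item[end_key]   || item[end_key.to_s]]
    else
      [item.public_send(start_key), item.public_send(end_key)]
    end

  pair.map do |value|
    int = value.to_i
    # Same strictness as the check in the diff: 8 and "8" pass,
    # while 8.5, "a" and nil raise.
    raise TypeError, 'Offsets must be integers' unless int.to_s == value.to_s
    int
  end
end

p normalize_offset([8, 13])
p normalize_offset({ start: 8, end: 13 }, start_key: :start, end_key: :end)
p normalize_offset(OpenStruct.new(s: 27, e: 32), start_key: :s, end_key: :e)
```

Comparing `to_i.to_s` with `to_s` is a simple way to reject values that would otherwise be silently truncated to an integer.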
data/lib/tag_along/version.rb CHANGED
@@ -1,3 +1,3 @@
1
- module TagAlong
2
- VERSION = "0.0.1"
1
+ class TagAlong
2
+ VERSION = '0.6.1'
3
3
  end
data/lib/tag_along.rb CHANGED
@@ -1,5 +1,61 @@
1
- require "tag_along/version"
1
+ require 'ostruct'
2
+ require 'tag_along/version'
3
+ require 'tag_along/offsets'
4
+
5
+ class TagAlong
6
+
7
+ attr :text, :tagged_text
8
+
9
+ def self.version
10
+ VERSION
11
+ end
12
+
13
+ def initialize(text, offsets)
14
+ @offsets = offsets.is_a?(Offsets) ? offsets : Offsets.new(offsets)
15
+ @text = text
16
+ @split_text = nil
17
+ @tagged_text = nil
18
+ split_text
19
+ end
20
+
21
+ def tag(open_tag, close_tag)
22
+ @tagged_text = @split_text.inject([]) do |res, t|
23
+ if t[:tagged]
24
+ [open_tag, t[:text], close_tag].each { |text| res << text }
25
+ else
26
+ res << t[:text]
27
+ end
28
+ res
29
+ end.join('')
30
+ end
31
+
32
+ private
33
+
34
+ def split_text
35
+ return if @split_text
36
+
37
+ text_ary = @text.unpack('U*')
38
+ cursor = 0
39
+ fragment = []
40
+ res = []
41
+
42
+ @offsets.each do |item|
43
+ chars_num = item.offset_start - cursor
44
+ chars_num.times { fragment << text_ary.shift }
45
+ res << { tagged: false, text: fragment }
46
+ fragment = []
47
+ cursor = item.offset_start
48
+ chars_num = item.offset_end + 1 - cursor
49
+ chars_num.times { fragment << text_ary.shift }
50
+ res << { tagged: true, text: fragment }
51
+ fragment = []
52
+ cursor = item.offset_end + 1
53
+ end
54
+
55
+ res.each do |r|
56
+ r[:text] = r[:text].pack('U*')
57
+ end
58
+ @split_text = res
59
+ end
2
60
 
3
- module TagAlong
4
- # Your code goes here...
5
61
  end
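The tagging approach above splits the text at each inclusive offset pair and wraps the tagged pieces. Below is a standalone sketch of the same idea, not the gem's implementation. `tag_fragments` is a hypothetical name, and the sketch assumes the offsets do not overlap. As far as the diff shows, `split_text` appends nothing after the final offset, whereas this sketch also keeps any text that follows the last fragment:

```ruby
# Hypothetical standalone helper, written for illustration only.
def tag_fragments(text, offsets, open_tag, close_tag)
  chars  = text.chars # work on characters so multibyte UTF-8 stays intact
  cursor = 0
  out    = []

  offsets.sort.each do |first, last|
    out << chars[cursor...first].join # untagged text before the fragment
    out << open_tag << chars[first..last].join << close_tag
    cursor = last + 1
  end

  # Keep whatever follows the last tagged fragment.
  out << chars[cursor..-1].join if cursor < chars.size
  out.join
end

puts tag_fragments("There's Sunday and there's Monday",
                   [[8, 13], [27, 32]], '<em>', '</em>')
```

Because the original text is never modified, the same text and offsets can be tagged again with a different pair of tags.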
data/spec/files/spec_data.json ADDED
@@ -0,0 +1,407 @@
1
+ {
2
+ "token_url": "http://gnrd.globalnames.org/name_finder.json?token=Row8ZZJSTsuwJxhQtgha9A",
3
+ "input_url": null,
4
+ "file": "Embryology 1939_15B.tiff",
5
+ "status": 200,
6
+ "engines": [
7
+ "TaxonFinder",
8
+ "NetiNeti"
9
+ ],
10
+ "unique": false,
11
+ "verbatim": true,
12
+ "english": true,
13
+ "execution_time": {
14
+ "find_names_duration": 0.237166328,
15
+ "resolve_names_duration": 1.956796471,
16
+ "total_duration": 23.716051434
17
+ },
18
+ "agent": "",
19
+ "created": "2013-06-19T15:22:01-04:00",
20
+ "total": 11,
21
+ "names": [
22
+ {
23
+ "verbatim": "Pundulus",
24
+ "scientificName": "Pundulus",
25
+ "offsetStart": 61,
26
+ "offsetEnd": 68,
27
+ "identifiedName": "Pundulus"
28
+ },
29
+ {
30
+ "verbatim": "Lebistes reticulatus",
31
+ "scientificName": "Lebistes reticulatus",
32
+ "offsetStart": 866,
33
+ "offsetEnd": 885,
34
+ "identifiedName": "Lebistes reticulatus"
35
+ },
36
+ {
37
+ "verbatim": "Cottus",
38
+ "scientificName": "Cottus",
39
+ "offsetStart": 955,
40
+ "offsetEnd": 960,
41
+ "identifiedName": "Cottus"
42
+ },
43
+ {
44
+ "verbatim": "Entosphenus wilderi",
45
+ "scientificName": "Entosphenus wilderi",
46
+ "offsetStart": 1085,
47
+ "offsetEnd": 1103,
48
+ "identifiedName": "Entosphenus wilderi"
49
+ },
50
+ {
51
+ "verbatim": "Fundulus",
52
+ "scientificName": "Fundulus",
53
+ "offsetStart": 1347,
54
+ "offsetEnd": 1354,
55
+ "identifiedName": "Fundulus"
56
+ },
57
+ {
58
+ "verbatim": "Fundulus heteroclitus",
59
+ "scientificName": "Fundulus heteroclitus",
60
+ "offsetStart": 1604,
61
+ "offsetEnd": 1624,
62
+ "identifiedName": "Fundulus heteroclitus"
63
+ },
64
+ {
65
+ "verbatim": "Nouveaux essais",
66
+ "scientificName": "Nouveaux essais",
67
+ "offsetStart": 1857,
68
+ "offsetEnd": 1871,
69
+ "identifiedName": "Nouveaux essais"
70
+ },
71
+ {
72
+ "verbatim": "Acipenser",
73
+ "scientificName": "Acipenser",
74
+ "offsetStart": 2137,
75
+ "offsetEnd": 2145,
76
+ "identifiedName": "Acipenser"
77
+ },
78
+ {
79
+ "verbatim": "Acipenser guldenstadtii",
80
+ "scientificName": "Acipenser guldenstadtii",
81
+ "offsetStart": 2183,
82
+ "offsetEnd": 2205,
83
+ "identifiedName": "Acipenser guldenstadtii"
84
+ },
85
+ {
86
+ "verbatim": "Acipenser stellatus",
87
+ "scientificName": "Acipenser stellatus",
88
+ "offsetStart": 2211,
89
+ "offsetEnd": 2229,
90
+ "identifiedName": "Acipenser stellatus"
91
+ },
92
+ {
93
+ "verbatim": "Fundulus heteroclitus",
94
+ "scientificName": "Fundulus heteroclitus",
95
+ "offsetStart": 2462,
96
+ "offsetEnd": 2482,
97
+ "identifiedName": "Fundulus heteroclitus"
98
+ }
99
+ ],
100
+ "content": "1959 \nNewman H. H. Spawmin behavior and sexual dimorphism of Pundulus\n9 5 QS\nheteroclitus and allied fish. Biol. Bull. 12, 1907.\nIV. CIRGULATORY SYSTEM\nP. B. Functional reactions in the embryonic heart accom~\npanying the ingrowth and development of the vagus\ninnervation. Jour. Exp. Zool. 58, 1951.\nBrinley, P. J. A Physiological study of the innervation of the\nheart of fish embryos. Physiol. Zool. 5, 1952.\nReagan, F, P. Experimental studies in the origin of the vascular\nendothelium and of Am. Jour, Anat.\n21, 1917. \nShearer, E. M. Studies on the embryology of the circulation in\nfishes, 1. and ll. Am. Jour. Anat. 46, 1950.\nStockard, C. R. An experimental analysis of the origin of blood\nand vascular endothelium Mem. Wist. Inst. Anat.\nBiol. No. 7, 1915. also Am. Jour, Anat. 18, 1915.\nV. GERM CELLS\nGoodrich, H. B., et al. Germ cells and sex differentiation in\nLebistes reticulatus. Biol. Bull. 67, 1954.\nMann, H. W. The history of the germ cells of Cottus Baerdii. Gerard\nJour. Morph. Physiol. 45, 1927.\nokkelberg, Peter The early history of the germ cells in the brook\nlamprey, Entosphenus wilderi (Gage) up to and in~\ncluding the period of sex differentiation. Jour.\nMorph. 55, 1921. (This paper has a complete bib-\nliography of work on germ cells in other groups.)\nTichards, A. and Thompson, James I., Migration with primary sex\ncells of Fundulus hetercclitus Biol. Bull. 40, 1921.\nWolf, L. E. The history of the germ cells in the viviparous\n1 teleost PPlatypoecilus maculatus. Jour. Morph. and\nPhysiol. 52, 1951.\n1 VI. EXPERIMENTAL WORK.\nAmberson, W. R, and F. B. The respiratory metabolism\nof Fundulus heteroclitus during embryonic develop-\nmental Jour. Cell. and Comp. Physiol. 2, 1955.\nP. B. The embryonic origin of function in the pronephros\nthrough differentiation and paren~chyma-vascular\nassociation. Amer. Jour. Anat. 51, 1952.\nBatallion, Nouveaux essais de Parthenogenere experimentale chez les\nvertibres infesienes Arch. 
Ent. Mech. 18, 1904.\nClapp, C. M. The relation of the axis of the embryo to the first\ncleavage plane. Biol Lect. M. B. L. 1898.\nFilatow, D. Entwicklungsmechanishhe Untersuchungen an Embryonen\nvon Acipenser guldenstadtti and Acipenbryonen von\nAcipenser guldenstadtii und Acipenser stellatus.\nArch. Ent. Mech. 122, 1950.\nHinriChS,.M. A. and Genther 1. T. Ultraviolet radiation and the\nproduction of twins and double monsters. Physiol.\niZool. 4, 1951.\nHeadley, L. 0n the localization of developmental potencies in\nembryo of Fundulus heteroclitus Jour. Exp. Zool.\n52, 1928.\n\n",
101
+ "data_sources": [
102
+
103
+ ],
104
+ "context": null,
105
+ "resolved_names": [
106
+ {
107
+ "supplied_name_string": "Pundulus",
108
+ "results": [
109
+ null
110
+ ],
111
+ "preferred_results": [
112
+
113
+ ],
114
+ "data_sources_number": 0,
115
+ "in_curated_sources": false
116
+ },
117
+ {
118
+ "supplied_name_string": "Lebistes reticulatus",
119
+ "results": [
120
+ {
121
+ "data_source_id": 12,
122
+ "data_source_title": "EOL",
123
+ "gni_uuid": "3564821e-720d-587c-96d7-7e29612a81ad",
124
+ "name_string": "Lebistes reticulatus",
125
+ "canonical_form": "Lebistes reticulatus",
126
+ "classification_path": "",
127
+ "classification_path_ranks": "",
128
+ "classification_path_ids": "",
129
+ "taxon_id": "37270453",
130
+ "local_id": "11081658",
131
+ "url": "http://eol.org/pages/11081658",
132
+ "match_type": 1,
133
+ "prescore": "3|0|0",
134
+ "score": 0.988
135
+ }
136
+ ],
137
+ "preferred_results": [
138
+ {
139
+ "data_source_id": 12,
140
+ "data_source_title": "EOL",
141
+ "gni_uuid": "3564821e-720d-587c-96d7-7e29612a81ad",
142
+ "name_string": "Lebistes reticulatus",
143
+ "canonical_form": "Lebistes reticulatus",
144
+ "classification_path": "",
145
+ "classification_path_ranks": "",
146
+ "classification_path_ids": "",
147
+ "taxon_id": "37270453",
148
+ "local_id": "11081658",
149
+ "url": "http://eol.org/pages/11081658",
150
+ "match_type": 1,
151
+ "prescore": "3|0|0",
152
+ "score": 0.988
153
+ }
154
+ ],
155
+ "data_sources_number": 11,
156
+ "in_curated_sources": true
157
+ },
158
+ {
159
+ "supplied_name_string": "Cottus",
160
+ "results": [
161
+ {
162
+ "data_source_id": 1,
163
+ "data_source_title": "Catalogue of Life",
164
+ "gni_uuid": "ee98e65c-3d7b-5317-91e8-eb543255e7ba",
165
+ "name_string": "Cottus",
166
+ "canonical_form": "Cottus",
167
+ "classification_path": "Animalia|Chordata|Actinopterygii|Scorpaeniformes|Cottidae|Cottus",
168
+ "classification_path_ranks": "kingdom|phylum|class|order|family|genus",
169
+ "classification_path_ids": "2362377|2362754|2365430|2365507|2365562|2396188",
170
+ "taxon_id": "2396188",
171
+ "match_type": 1,
172
+ "prescore": "1|0|0",
173
+ "score": 0.75
174
+ }
175
+ ],
176
+ "preferred_results": [
177
+ {
178
+ "data_source_id": 12,
179
+ "data_source_title": "EOL",
180
+ "gni_uuid": "ee98e65c-3d7b-5317-91e8-eb543255e7ba",
181
+ "name_string": "Cottus",
182
+ "canonical_form": "Cottus",
183
+ "classification_path": "",
184
+ "classification_path_ranks": "",
185
+ "classification_path_ids": "",
186
+ "taxon_id": "37305832",
187
+ "local_id": "204556",
188
+ "url": "http://eol.org/pages/204556",
189
+ "match_type": 1,
190
+ "prescore": "1|0|0",
191
+ "score": 0.75
192
+ }
193
+ ],
194
+ "data_sources_number": 19,
195
+ "in_curated_sources": true
196
+ },
197
+ {
198
+ "supplied_name_string": "Entosphenus wilderi",
199
+ "results": [
200
+ {
201
+ "data_source_id": 168,
202
+ "data_source_title": "Index to Organism Names",
203
+ "gni_uuid": "11c30211-b0b8-5927-903e-1f959cd6173b",
204
+ "name_string": "Entosphenus wilderi",
205
+ "canonical_form": "Entosphenus wilderi",
206
+ "classification_path": "",
207
+ "classification_path_ranks": "",
208
+ "classification_path_ids": "",
209
+ "taxon_id": "128746584",
210
+ "url": "http://www.organismnames.com/details.htm?lsid=3414383",
211
+ "match_type": 1,
212
+ "prescore": "3|0|0",
213
+ "score": 0.988
214
+ }
215
+ ],
216
+ "preferred_results": [
217
+
218
+ ],
219
+ "data_sources_number": 1,
220
+ "in_curated_sources": false
221
+ },
222
+ {
223
+ "supplied_name_string": "Fundulus",
224
+ "results": [
225
+ {
226
+ "data_source_id": 1,
227
+ "data_source_title": "Catalogue of Life",
228
+ "gni_uuid": "849efb3e-564d-5bed-b388-d2abd1e89de0",
229
+ "name_string": "Fundulus",
230
+ "canonical_form": "Fundulus",
231
+ "classification_path": "Animalia|Chordata|Actinopterygii|Cyprinodontiformes|Fundulidae|Fundulus",
232
+ "classification_path_ranks": "kingdom|phylum|class|order|family|genus",
233
+ "classification_path_ids": "2362377|2362754|2365430|2369356|2369363|2396458",
234
+ "taxon_id": "2396458",
235
+ "match_type": 1,
236
+ "prescore": "1|0|0",
237
+ "score": 0.75
238
+ }
239
+ ],
240
+ "preferred_results": [
241
+ {
242
+ "data_source_id": 12,
243
+ "data_source_title": "EOL",
244
+ "gni_uuid": "849efb3e-564d-5bed-b388-d2abd1e89de0",
245
+ "name_string": "Fundulus",
246
+ "canonical_form": "Fundulus",
247
+ "classification_path": "",
248
+ "classification_path_ranks": "",
249
+ "classification_path_ids": "",
250
+ "taxon_id": "37305680",
251
+ "local_id": "207614",
252
+ "url": "http://eol.org/pages/207614",
253
+ "match_type": 1,
254
+ "prescore": "1|0|0",
255
+ "score": 0.75
256
+ }
257
+ ],
258
+ "data_sources_number": 16,
259
+ "in_curated_sources": true
260
+ },
261
+ {
262
+ "supplied_name_string": "Fundulus heteroclitus",
263
+ "results": [
264
+ {
265
+ "data_source_id": 4,
266
+ "data_source_title": "NCBI",
267
+ "gni_uuid": "b4e4dfca-4136-5fc1-8688-53af51190898",
268
+ "name_string": "Fundulus heteroclitus",
269
+ "canonical_form": "Fundulus heteroclitus",
270
+ "classification_path": "|Eukaryota|Opisthokonta|Metazoa|Eumetazoa|Bilateria|Coelomata|Deuterostomia|Chordata|Craniata|Vertebrata|Gnathostomata|Teleostomi|Euteleostomi|Actinopterygii|Actinopteri|Neopterygii|Teleostei|Elopocephala|Clupeocephala|Euteleostei|Neognathi|Neoteleostei|Eurypterygii|Ctenosquamata|Acanthomorpha|Euacanthomorpha|Holacanthopterygii|Acanthopterygii|Euacanthopterygii|Percomorpha|Smegmamorpha|Atherinomorpha|Cyprinodontiformes|Cyprinodontoidei|Fundulidae|Fundulus|Fundulus heteroclitus",
271
+ "classification_path_ranks": "|superkingdom||kingdom|||||phylum|subphylum||superclass|||class||||||||||||||superorder|||||order|suborder|family|genus|species",
272
+ "classification_path_ids": "131567|2759|33154|33208|6072|33213|33316|33511|7711|89593|7742|7776|117570|117571|7898|186623|41665|32443|186624|186625|32447|186839|123365|123366|123367|123368|123369|123370|32455|129947|32485|129949|32456|28738|8087|28756|8077|8078",
273
+ "taxon_id": "8078",
274
+ "match_type": 1,
275
+ "prescore": "3|0|0",
276
+ "score": 0.988
277
+ }
278
+ ],
279
+ "preferred_results": [
280
+ {
281
+ "data_source_id": 12,
282
+ "data_source_title": "EOL",
283
+ "gni_uuid": "b4e4dfca-4136-5fc1-8688-53af51190898",
284
+ "name_string": "Fundulus heteroclitus",
285
+ "canonical_form": "Fundulus heteroclitus",
286
+ "classification_path": "",
287
+ "classification_path_ranks": "",
288
+ "classification_path_ids": "",
289
+ "taxon_id": "39163140",
290
+ "local_id": "1157172",
291
+ "url": "http://eol.org/pages/1157172",
292
+ "match_type": 1,
293
+ "prescore": "3|0|0",
294
+ "score": 0.988
295
+ }
296
+ ],
297
+ "data_sources_number": 15,
298
+ "in_curated_sources": true
299
+ },
300
+ {
301
+ "supplied_name_string": "Nouveaux essais"
302
+ },
303
+ {
304
+ "supplied_name_string": "Acipenser",
305
+ "results": [
306
+ {
307
+ "data_source_id": 1,
308
+ "data_source_title": "Catalogue of Life",
309
+ "gni_uuid": "0801dea4-b087-5845-b4f3-e27adeb5eef7",
310
+ "name_string": "Acipenser",
311
+ "canonical_form": "Acipenser",
312
+ "classification_path": "Animalia|Chordata|Actinopterygii|Acipenseriformes|Acipenseridae|Acipenser",
313
+ "classification_path_ranks": "kingdom|phylum|class|order|family|genus",
314
+ "classification_path_ids": "2362377|2362754|2365430|2371893|2371895|2395796",
315
+ "taxon_id": "2395796",
316
+ "match_type": 1,
317
+ "prescore": "1|0|0",
318
+ "score": 0.75
319
+ }
320
+ ],
321
+ "preferred_results": [
322
+ {
323
+ "data_source_id": 12,
324
+ "data_source_title": "EOL",
325
+ "gni_uuid": "0801dea4-b087-5845-b4f3-e27adeb5eef7",
326
+ "name_string": "Acipenser",
327
+ "canonical_form": "Acipenser",
328
+ "classification_path": "",
329
+ "classification_path_ranks": "",
330
+ "classification_path_ids": "",
331
+ "taxon_id": "20627458",
332
+ "local_id": "15048",
333
+ "url": "http://eol.org/pages/15048",
334
+ "match_type": 1,
335
+ "prescore": "1|0|0",
336
+ "score": 0.75
337
+ }
338
+ ],
339
+ "data_sources_number": 19,
340
+ "in_curated_sources": true
341
+ },
342
+ {
343
+ "supplied_name_string": "Acipenser guldenstadtii",
344
+ "results": [
345
+ {
346
+ "data_source_id": 168,
347
+ "data_source_title": "Index to Organism Names",
348
+ "gni_uuid": "4bd1443b-c862-5ac3-849b-18d481af6a4e",
349
+ "name_string": "Acipenser guldenstadtii",
350
+ "canonical_form": "Acipenser guldenstadtii",
351
+ "classification_path": "",
352
+ "classification_path_ranks": "",
353
+ "classification_path_ids": "",
354
+ "taxon_id": "126471337",
355
+ "url": "http://www.organismnames.com/details.htm?lsid=960481",
356
+ "match_type": 1,
357
+ "prescore": "3|0|0",
358
+ "score": 0.988
359
+ }
360
+ ],
361
+ "preferred_results": [
362
+
363
+ ],
364
+ "data_sources_number": 3,
365
+ "in_curated_sources": true
366
+ },
367
+ {
368
+ "supplied_name_string": "Acipenser stellatus",
369
+ "results": [
370
+ {
371
+ "data_source_id": 4,
372
+ "data_source_title": "NCBI",
373
+ "gni_uuid": "35353b10-cb9b-50dd-929c-934c4e6ff172",
374
+ "name_string": "Acipenser stellatus",
375
+ "canonical_form": "Acipenser stellatus",
376
+ "classification_path": "|Eukaryota|Opisthokonta|Metazoa|Eumetazoa|Bilateria|Coelomata|Deuterostomia|Chordata|Craniata|Vertebrata|Gnathostomata|Teleostomi|Euteleostomi|Actinopterygii|Actinopteri|Chondrostei|Acipenseriformes|Acipenseroidei|Acipenseridae|Acipenserinae|Acipenserini|Acipenser|Acipenser stellatus",
377
+ "classification_path_ranks": "|superkingdom||kingdom|||||phylum|subphylum||superclass|||class|||order|suborder|family|subfamily|tribe|genus|species",
378
+ "classification_path_ids": "131567|2759|33154|33208|6072|33213|33316|33511|7711|89593|7742|7776|117570|117571|7898|186623|32440|7899|186622|7900|124129|124130|7901|7903",
379
+ "taxon_id": "7903",
380
+ "match_type": 1,
381
+ "prescore": "3|0|0",
382
+ "score": 0.988
383
+ }
384
+ ],
385
+ "preferred_results": [
386
+ {
387
+ "data_source_id": 12,
388
+ "data_source_title": "EOL",
389
+ "gni_uuid": "35353b10-cb9b-50dd-929c-934c4e6ff172",
390
+ "name_string": "Acipenser stellatus",
391
+ "canonical_form": "Acipenser stellatus",
392
+ "classification_path": "",
393
+ "classification_path_ranks": "",
394
+ "classification_path_ids": "",
395
+ "taxon_id": "35677530",
396
+ "local_id": "206889",
397
+ "url": "http://eol.org/pages/206889",
398
+ "match_type": 1,
399
+ "prescore": "3|0|0",
400
+ "score": 0.988
401
+ }
402
+ ],
403
+ "data_sources_number": 23,
404
+ "in_curated_sources": true
405
+ }
406
+ ]
407
+ }
data/spec/spec_helper.rb ADDED
@@ -0,0 +1,33 @@
1
+ require 'coveralls'
2
+ Coveralls.wear!
3
+
4
+ require 'json'
5
+ require 'ostruct'
6
+ require_relative '../lib/tag_along'
7
+
8
+ RSpec.configure do |c|
9
+ c.mock_with :rr
10
+ end
11
+
12
+ module TagAlongSpec
13
+ def self.process_spec_data(dir)
14
+ data = open(File.join(dir, 'spec_data.json')).read
15
+ data = JSON.parse(data, symbolize_names: true)
16
+ text = data[:content]
17
+ offset_hash = data[:names]
18
+ offset_obj = offset_hash.map do |h|
19
+ OpenStruct.new(name: h[:verbatim],
20
+ start: h[:offsetStart],
21
+ end: h[:offsetEnd])
22
+ end
23
+ offset_ary = offset_obj.map { |h| [h.start, h.end] }
24
+ [text, offset_ary, offset_hash, offset_obj]
25
+ end
26
+ end
27
+
28
+ unless defined?(SPEC_VARS)
29
+ FILES_DIR = File.expand_path(File.join(File.dirname(__FILE__), 'files'))
30
+ TEXT, OFFSETS_ARY, OFFSETS_HASH, OFFSETS_OBJ =
31
+ TagAlongSpec.process_spec_data(FILES_DIR)
32
+ SPEC_VARS = true
33
+ end
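The helper above derives three equivalent offset shapes (hashes, objects, arrays) from one JSON fixture. Here is a self-contained sketch of the same idea, using a tiny made-up inline fixture rather than `spec_data.json`:

```ruby
require 'json'
require 'ostruct'

# Made-up fixture for illustration; the key names follow spec_data.json.
raw = '{"content":"xx Cottus yy",' \
      '"names":[{"verbatim":"Cottus","offsetStart":3,"offsetEnd":8}]}'

data       = JSON.parse(raw, symbolize_names: true)
as_hashes  = data[:names]
as_objects = as_hashes.map do |h|
  OpenStruct.new(name: h[:verbatim], start: h[:offsetStart], end: h[:offsetEnd])
end
as_arrays = as_objects.map { |o| [o.start, o.end] }

p as_arrays
```

Keeping all three shapes derived from a single source means the array, hash and object specs exercise the same offsets.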
data/spec/tag_along/offsets_spec.rb ADDED
@@ -0,0 +1,61 @@
1
+ require_relative '../spec_helper'
2
+
3
+ describe TagAlong::Offsets do
4
+
5
+ it 'should initialize' do
6
+ o = TagAlong::Offsets.new(OFFSETS_ARY)
7
+ o.is_a?(TagAlong::Offsets).should be_true
8
+ end
9
+
10
+ it 'should process arrays' do
11
+ o = TagAlong::Offsets.new(OFFSETS_ARY)
12
+ o.is_a?(TagAlong::Offsets).should be_true
13
+ o.first.offset_start.should == 61
14
+ o.first.offset_end.should == 68
15
+ o.first.item_string.should be_nil
16
+ ary_with_item = OFFSETS_HASH.map do |h|
17
+ [h[:offsetStart], h[:offsetEnd], h[:verbatim]]
18
+ end
19
+ o = TagAlong::Offsets.new(ary_with_item)
20
+ o.first.offset_start.should == 61
21
+ o.first.offset_end.should == 68
22
+ o.first.item_string.should == 'Pundulus'
23
+ -> { TagAlong::Offsets.new([['a','b']]) }.
24
+ should raise_error(TypeError, 'Offsets must be integers')
25
+ end
26
+
27
+ it 'should process hash' do
28
+ o = TagAlong::Offsets.new(OFFSETS_HASH,
29
+ offset_start: :offsetStart,
30
+ offset_end: :offsetEnd)
31
+ o.is_a?(TagAlong::Offsets).should be_true
32
+ o.first.offset_start.should == 61
33
+ o.first.offset_end.should == 68
34
+ o.first.item_string.should be_nil
35
+ o = TagAlong::Offsets.new(OFFSETS_HASH,
36
+ offset_start: 'offsetStart',
37
+ offset_end: 'offsetEnd',
38
+ item_string: 'verbatim')
39
+ o.first.offset_start.should == 61
40
+ o.first.offset_end.should == 68
41
+ o.first.item_string.should == 'Pundulus'
42
+ end
43
+
44
+ it 'should process object' do
45
+ o = TagAlong::Offsets.new(OFFSETS_OBJ,
46
+ offset_start: :start,
47
+ offset_end: :end)
48
+ o.is_a?(TagAlong::Offsets).should be_true
49
+ o.first.offset_start.should == 61
50
+ o.first.offset_end.should == 68
51
+ o.first.item_string.should be_nil
52
+ o = TagAlong::Offsets.new(OFFSETS_OBJ,
53
+ offset_start: :start,
54
+ offset_end: :end,
55
+ item_string: 'name')
56
+ o.is_a?(TagAlong::Offsets).should be_true
57
+ o.first.offset_start.should == 61
58
+ o.first.offset_end.should == 68
59
+ o.first.item_string.should == 'Pundulus'
60
+ end
61
+ end
data/spec/tag_along_spec.rb ADDED
@@ -0,0 +1,33 @@
1
+ require 'spec_helper'
2
+
3
+ describe TagAlong do
4
+ it 'should have version' do
5
+ TagAlong.version.should =~ /^[\d]+\.[\d]+\.[\d]+$/
6
+ end
7
+
8
+ it 'should initialize' do
9
+ tg = TagAlong.new(TEXT, OFFSETS_ARY)
10
+ tg.is_a?(TagAlong).should be_true
11
+ tg.text.should == TEXT
12
+ tg.tagged_text.should be_nil
13
+ end
14
+
15
+ it 'should tag' do
16
+ tg = TagAlong.new(TEXT, OFFSETS_ARY)
17
+ tagged_text = tg.tag('<my_tag>', '</my_tag>')
18
+ tg.tagged_text.should == tagged_text
19
+ tg.tagged_text.should include('<my_tag>Lebistes reticulatus</my_tag>')
20
+ tagged_text = tg.tag('<another_tag>', '</another_tag>')
21
+ tg.tagged_text.should == tagged_text
22
+ tg.tagged_text.
23
+ should include('<another_tag>Lebistes reticulatus</another_tag>')
24
+ end
25
+
26
+ it 'should tag the README example' do
27
+ text = 'There\'s Sunday and there\'s Monday'
28
+ offsets = [[8,13], [27,32]]
29
+ tg = TagAlong.new(text, offsets)
30
+ tg.tag('<em>', '</em>').should ==
31
+ %q{There's <em>Sunday</em> and there's <em>Monday</em>}
32
+ end
33
+ end
data/tag_along.gemspec CHANGED
@@ -3,26 +3,22 @@ lib = File.expand_path('../lib', __FILE__)
3
3
  $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
4
  require 'tag_along/version'
5
5
 
6
- Gem::Specification.new do |spec|
7
- spec.name = "tag_along"
8
- spec.version = TagAlong::VERSION
9
- spec.authors = ["Dmitry Mozzherin"]
10
- spec.email = ["dmozzherin@gmail.com"]
11
- spec.description = %q{Tags a text with arbitrary tags
6
+ Gem::Specification.new do |sp|
7
+ sp.name = 'tag_along'
8
+ sp.version = TagAlong::VERSION
9
+ sp.authors = ['Dmitry Mozzherin']
10
+ sp.email = ['dmozzherin@gmail.com']
11
+ sp.description = %q{Tags a text with arbitrary tags
12
12
  based on array of start/end offsets}
13
- spec.summary = %q{A user who runs a search tool on a text would find
13
+ sp.summary = %q{A user who runs a search tool on a text would find
14
14
  multiple text fragments corresponding to the search.
15
15
  These fragments can be found again by storing their
16
16
  start and end offsets. This gem places arbitrary
17
17
  markup tags surrounding the fragments.}
18
- spec.homepage = ""
19
- spec.license = "MIT"
18
+ sp.homepage = 'https://github.com/GlobalNamesArchitecture/tag_along'
19
+ sp.license = 'MIT'
20
20
 
21
- spec.files = `git ls-files`.split($/)
22
- spec.executables = spec.files.grep(%r{^bin/}) { |f| File.basename(f) }
23
- spec.test_files = spec.files.grep(%r{^(test|spec|features)/})
24
- spec.require_paths = ["lib"]
25
-
26
- spec.add_development_dependency "bundler", "~> 1.3"
27
- spec.add_development_dependency "rake"
21
+ sp.files = `git ls-files`.split($/)
22
+ sp.test_files = sp.files.grep(%r{^(spec|features)/})
23
+ sp.require_paths = ['lib']
28
24
  end
metadata CHANGED
@@ -1,43 +1,15 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: tag_along
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.6.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - Dmitry Mozzherin
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2013-08-31 00:00:00.000000000 Z
12
- dependencies:
13
- - !ruby/object:Gem::Dependency
14
- name: bundler
15
- requirement: !ruby/object:Gem::Requirement
16
- requirements:
17
- - - ~>
18
- - !ruby/object:Gem::Version
19
- version: '1.3'
20
- type: :development
21
- prerelease: false
22
- version_requirements: !ruby/object:Gem::Requirement
23
- requirements:
24
- - - ~>
25
- - !ruby/object:Gem::Version
26
- version: '1.3'
27
- - !ruby/object:Gem::Dependency
28
- name: rake
29
- requirement: !ruby/object:Gem::Requirement
30
- requirements:
31
- - - '>='
32
- - !ruby/object:Gem::Version
33
- version: '0'
34
- type: :development
35
- prerelease: false
36
- version_requirements: !ruby/object:Gem::Requirement
37
- requirements:
38
- - - '>='
39
- - !ruby/object:Gem::Version
40
- version: '0'
11
+ date: 2013-09-03 00:00:00.000000000 Z
12
+ dependencies: []
41
13
  description: |-
42
14
  Tags a text with arbitrary tags
43
15
  based on array of start/end offsets
@@ -48,14 +20,22 @@ extensions: []
48
20
  extra_rdoc_files: []
49
21
  files:
50
22
  - .gitignore
23
+ - .ruby-version
24
+ - .travis.yml
25
+ - CHANGELOG
51
26
  - Gemfile
52
27
  - LICENSE.txt
53
28
  - README.md
54
29
  - Rakefile
55
30
  - lib/tag_along.rb
31
+ - lib/tag_along/offsets.rb
56
32
  - lib/tag_along/version.rb
33
+ - spec/files/spec_data.json
34
+ - spec/spec_helper.rb
35
+ - spec/tag_along/offsets_spec.rb
36
+ - spec/tag_along_spec.rb
57
37
  - tag_along.gemspec
58
- homepage: ''
38
+ homepage: https://github.com/GlobalNamesArchitecture/tag_along
59
39
  licenses:
60
40
  - MIT
61
41
  metadata: {}
@@ -81,4 +61,8 @@ specification_version: 4
81
61
  summary: A user who runs a search tool on a text would find multiple text fragments
82
62
  corresponding to the search. These fragments can be found again by storing their start
83
63
  and end offsets. This gem places arbitrary markup tags surrounding the fragments.
84
- test_files: []
64
+ test_files:
65
+ - spec/files/spec_data.json
66
+ - spec/spec_helper.rb
67
+ - spec/tag_along/offsets_spec.rb
68
+ - spec/tag_along_spec.rb