taxonifi 0.5.0 → 0.5.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 15ea6036e683f7de25293b311a0c85082480f32013b7a8427dbb90a247937076
- data.tar.gz: 37f71618bd92e1af915361c0328856708097f76badbe80fbe6fda93edb0e4c02
+ metadata.gz: 1ba6e02176e1a58a103de5dfd244b3851fe72965f2421e97660096524bf70592
+ data.tar.gz: 26f00acd0c9d3457f6b0668c86035c721970e215d70c5ae78b6822ccce93dfb3
  SHA512:
- metadata.gz: fa9586dde79b0abd633bbea51a739a807a299621ba0bade46953dd477a7dd8e6710770527c8ad18035f7467c275952064d4548a0fd641a299a0663ce1b9851d5
- data.tar.gz: 171c01787b879fea5fcbb65e1a344ce0f9e95d34a013f6b4a917112edb1a18c48a78fa4f8adf59b86a154e8bed3fc5009a44455058ea2cb9fa8fc829b3478801
+ metadata.gz: 7fd1d847c8a383d2a98a11c251f6d384bff3ab6f2de042264704d43ebc76bd9f4db9cf07209a73983ef78fdca3c6ac5945f2a93ca3c1815aaabc23d6f143fcad
+ data.tar.gz: 06f77cc15c87c548d9807df95ce650ee6e14c3cf17ab57db9eace8d6c657862c6ebcf62f23106db4667db61390cab6ba8bf333e5f5861e1f553c5e7cab40ab2d
data/.travis.yml CHANGED
@@ -1,7 +1,6 @@
  language: ruby
  rvm:
- - 2.3.1
- bundler_args: --without development
+ - 2.6.5
  before_install: ./travis/before_install.sh
  branches:
  only:
data/README.md CHANGED
@@ -1,29 +1,23 @@

  [![Build Status](https://travis-ci.org/SpeciesFileGroup/taxonifi.svg?branch=master)](https://travis-ci.org/SpeciesFileGroup/taxonifi)
- [![Dependency Status][7]][8]


- taxonifi
- ========
+
+ # taxonifi
  There will always be "legacy" taxonomic data that needs shuffling around. The taxonifi gem is a suite of general purpose tools that act as a middle layer for data-conversion purposes (e.g. migrating legacy taxonomic databases). Its first application was to convert DwC-style data downloaded from EoL into a Species File. The code is well documented in unit tests, poke around to see if it might be useful. In particular, if you've considered building a collection of regular expressions particular to biodiversity data look at the Tokens code and related tests.

  Overall, the goal is to provide well documented (and unit-tested) coded that is broadly useful, and vanilla enough to encourage other to fork and hack on their own.

- Source
- ------
+ # Source
  Source is available at https://github.com/SpeciesFile/taxonifi . The rdoc API is also viewable at http://taxonifi.speciesfile.org , (though those docs may lag behind commits to github).

- What's next?
- ------------
-
+ # What's next?
  Before you jump on board you should also check out similar code from the Global Names team at https://github.com/GlobalNamesArchitecture. Future integration and merging of shared functionality is planned.

  Taxonifi is presently coded for convience, not speed (though it's not necessarily slow). It assumes that conversion processes are typically one-offs that can afford to run over a longer period of time (read minutes rather than seconds). Reading, and fully parsing into objects, around 25k rows of nomenclature (class to species, inc. author year, = ~45k names) in to memory as Taxonifi objects benchmarks at around 2 minutes.

- Getting started
- ---------------
- taxonifi is coded for Ruby 1.9.3, it has not been tested on earlier versions (though it will certainly not work with 1.8.7).
- Using Ruby Version Manager (RVM, https://rvm.io/ ) is highly recommend. You can test your version of Ruby by doinging "ruby -v" in your terminal.
+ # Getting started
+ taxonifi is coded for Ruby 2.6.5, 0.4.0 works on 1.9.4.

  To install:

@@ -110,8 +104,7 @@ Parent/child style nomenclature is also parseable.

  There are *lots* more examples of code use in the test suite.

- Export/conversion
- -----------------
+ # Export/conversion

  The following is an example that translates a DwC style input format as exported by EOL into tables importable to SpeciesFile. The input file is has id, parent, child, vernacular, synonym columns. Data are exported by default to a the users home folder in a taxonifi directory. The export creates 6 tables that can be imported into Species File directly.

@@ -144,8 +137,7 @@ csv = CSV.read('input/my_data.tab', {
  col_sep: "\t" } )
  ```

- Code organization
- -----------------
+ # Code organization

  ```
  test # unit tests, quite a few of them
@@ -158,8 +150,7 @@ lib/model # Taxonifi objects
  lib/splitter # a parser/lexer/token suite for breaking down data
  ```

- Contributing to taxonifi
- ------------------------
+ # Contributing to taxonifi

  (this is generic)

@@ -172,22 +163,17 @@ Contributing to taxonifi
  * All pull requests should test clean.
  * Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.

- About
- -----
+ # About

  taxonifi is coded by Matt Yoder in consultation with the Species File Group at University of Illinois.

- Copyright
- ---------
+ # Copyright

- Copyright (c) 2012 Illinois Natural History Survey. See LICENSE.txt for
+ Copyright (c) 2012-2020 Illinois Natural History Survey. See LICENSE.txt for
  further details.

-
-
  [1]: https://secure.travis-ci.org/SpeciesFileGroup/taxonifi.png?branch=master
  [2]: https://travis-ci.org/SpeciesFileGroup/taxonifi.svg?branch=master
- [7]: https://gemnasium.com/SpeciesFileGroup/taxonifi.png?branch=master
- [8]: https://gemnasium.com/SpeciesFileGroup/taxonifi?branch=master
+


@@ -25,7 +25,7 @@ class Taxonifi::Splitter::Parser
  @builder.people.push n
  end

- @builder.year = t.year.to_i
+ @builder.year = t.year
  @builder.parens = t.parens
  end

@@ -39,7 +39,7 @@ module Taxonifi::Splitter::Tokens
  attr_reader :authors, :year, :parens
  # This is going to hit just everything, should only be used
  # in one off when you know you have that string.
- @regexp = Regexp.new(/\A\s*(\(?[^\+\d)]+(\d\d\d\d)?\)?)\s*/i)
+ @regexp = Regexp.new(/\A\s*(\(?[^\+\d)]+(\d{4})?\)?)\s*/i)

  def initialize(str)
  str.strip!
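
Reviewer note on the `@regexp` change above: swapping `\d\d\d\d` for `\d{4}` looks purely cosmetic, since both forms match exactly four digits. The sketch below is one way to confirm that. The two patterns are copied from the hunk; the helper method names and the sample author strings are made up for illustration and are not part of the gem.

```ruby
# Pattern as it appeared in 0.5.0 (copied from the removed line).
def old_author_year_regexp
  /\A\s*(\(?[^\+\d)]+(\d\d\d\d)?\)?)\s*/i
end

# Pattern as it appears in 0.5.1 (copied from the added line).
def new_author_year_regexp
  /\A\s*(\(?[^\+\d)]+(\d{4})?\)?)\s*/i
end

# Illustrative inputs only; capture 2 is the optional four-digit year.
samples = ["Smith, 1912", "(Smith, 1912)", "(Jones)", "Jones"]

samples.each do |s|
  old_caps = old_author_year_regexp.match(s)&.captures
  new_caps = new_author_year_regexp.match(s)&.captures
  puts "#{s.inspect}: #{new_caps.inspect} (same as old: #{old_caps == new_caps})"
end
```

For inputs without a year, such as `"(Jones)"`, the second capture group is `nil` under both patterns, which is the case the rest of this release deals with.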
@@ -52,9 +52,9 @@ module Taxonifi::Splitter::Tokens
  @parens = false
  end
  # check for year
- if w =~ /(\d\d\d\d)\Z/
- @year = $1.to_i
- w.gsub!(/\d\d\d\d\Z/, "")
+ if w =~ /(\d{4})\Z/
+ @year = $1 ? $1.to_i : nil
+ w.gsub!(/\d{4}\Z/, "")
  w.strip!
  end
  w.gsub!(/,\s*\Z/, '')
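
Reviewer note: as far as this diff shows, the behavioural fix is the parser change from `t.year.to_i` to `t.year`. In Ruby `nil.to_i` is `0`, so an author string without a year would have been stored as year `0` instead of `nil`. The following is a minimal sketch of that difference. `extract_year` and the two `year_as_stored_*` methods are hypothetical stand-ins written for this note. They mirror the logic in the hunks above but are not the gem's classes.

```ruby
# Hypothetical stand-in for the year extraction in the AuthorYear token.
def extract_year(author_year)
  w = author_year.strip.delete("()")  # drop wrapping parentheses, if any
  m = w.match(/(\d{4})\Z/)            # four digits at the end of the string
  m ? m[1].to_i : nil                 # nil when no year is present
end

# 0.5.0 parser line: to_i is applied to the token's year, and nil.to_i == 0.
def year_as_stored_in_0_5_0(author_year)
  extract_year(author_year).to_i
end

# 0.5.1 parser line: the token's year is passed through unchanged.
def year_as_stored_in_0_5_1(author_year)
  extract_year(author_year)
end

# Author strings taken from the tests added in this release.
["(Jones)", "Jones, 1920", "(Jones, 1920)"].each do |s|
  before = year_as_stored_in_0_5_0(s)
  after  = year_as_stored_in_0_5_1(s)
  puts "#{s}: 0.5.0 => #{before.inspect}, 0.5.1 => #{after.inspect}"
end
```

The three new tests in `data/test/test_parser.rb` assert these same outcomes against the real parser: `nil` for `"Bus aus (Jones)"` and `1920` for the two strings that carry a year.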
@@ -1,3 +1,3 @@
  module Taxonifi
- VERSION = "0.5.0"
+ VERSION = "0.5.1"
  end
data/test/test_parser.rb CHANGED
@@ -12,7 +12,7 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  assert_equal "Smith", builder.names.last.author
  assert_equal 1912 , builder.names.last.year
  assert_equal false, builder.names.last.parens
- assert_equal "Foo stuff Smith, 1912", builder.display_name
+ assert_equal "Foo stuff Smith, 1912", builder.display_name
  end

  def test_that_parse_species_name_parses_subspecies
@@ -25,7 +25,7 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  assert_equal "Smith", builder.names.last.author
  assert_equal 1912 , builder.names.last.year
  assert_equal false, builder.names.last.parens
- assert_equal "Foo stuff things Smith, 1912", builder.display_name
+ assert_equal "Foo stuff things Smith, 1912", builder.display_name
  end

  def test_that_parse_species_name_parses_subgenera
@@ -34,15 +34,15 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  Taxonifi::Splitter::Parser.new(lexer, builder).parse_species_name
  assert_equal "Foo", builder.genus.name
  assert_equal "Bar", builder.subgenus.name
- assert_equal builder.genus, builder.subgenus.parent
+ assert_equal builder.genus, builder.subgenus.parent
  assert_equal "stuff", builder.species.name
- assert_equal builder.subgenus, builder.species.parent
+ assert_equal builder.subgenus, builder.species.parent
  assert_equal "things", builder.subspecies.name
- assert_equal builder.species, builder.subspecies.parent
+ assert_equal builder.species, builder.subspecies.parent
  assert_equal "Smith", builder.names.last.author
  assert_equal 1912, builder.names.last.year
  assert_equal true, builder.names.last.parens
- assert_equal "Foo (Bar) stuff things (Smith, 1912)", builder.display_name
+ assert_equal "Foo (Bar) stuff things (Smith, 1912)", builder.display_name
  end

  def test_that_parse_species_name_parses_variety_following_subspecies
@@ -56,7 +56,7 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  assert_equal "Smith", builder.names.last.author
  assert_equal 1912 , builder.names.last.year
  assert_equal false, builder.names.last.parens
- assert_equal "Foo stuff things var. blorf Smith, 1912", builder.display_name
+ assert_equal "Foo stuff things var. blorf Smith, 1912", builder.display_name
  end


@@ -71,10 +71,10 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  assert_equal "Smith", builder.names.last.author
  assert_equal 1912 , builder.names.last.year
  assert_equal false, builder.names.last.parens
- assert_equal "Foo stuff var. blorf Smith, 1912", builder.display_name
+ assert_equal "Foo stuff var. blorf Smith, 1912", builder.display_name
  end

-
+
  def test_that_parse_species_name_parses_variety_following_species_without_author_year
  lexer = Taxonifi::Splitter::Lexer.new("Foo stuff v. blorf", :species_name)
  builder = Taxonifi::Model::SpeciesName.new
@@ -84,10 +84,9 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  assert_equal nil, builder.subspecies
  assert_equal "blorf", builder.variety.name
  assert_equal nil, builder.names.last.parens # not set
- assert_equal "Foo stuff var. blorf", builder.display_name
+ assert_equal "Foo stuff var. blorf", builder.display_name
  end

-
  def test_that_parse_species_name_parses_variety_following_species_without_author_year_II
  lexer = Taxonifi::Splitter::Lexer.new("Calyptonotus rolandri var. opacus", :species_name)
  builder = Taxonifi::Model::SpeciesName.new
@@ -95,12 +94,32 @@ class Test_TaxonifiSplitterParser < Test::Unit::TestCase
  assert_equal "Calyptonotus", builder.genus.name
  assert_equal "rolandri", builder.species.name
  assert_equal nil, builder.subspecies
+ assert_equal nil, builder.names.last.year
  assert_equal "opacus", builder.variety.name
  assert_equal nil, builder.names.last.parens # not set
- assert_equal "Calyptonotus rolandri var. opacus", builder.display_name
+ assert_equal "Calyptonotus rolandri var. opacus", builder.display_name
+ end
+
+ def test_that_parse_family_name_parses_to_nil_year
+ lexer = Taxonifi::Splitter::Lexer.new("Bus aus (Jones)", :species_name)
+ builder = Taxonifi::Model::SpeciesName.new
+ Taxonifi::Splitter::Parser.new(lexer, builder).parse_species_name
+ assert_equal nil, builder.names.last.year
  end

+ def test_that_parse_family_name_parses_to_year_2
+ lexer = Taxonifi::Splitter::Lexer.new("Bus aus Jones, 1920", :species_name)
+ builder = Taxonifi::Model::SpeciesName.new
+ Taxonifi::Splitter::Parser.new(lexer, builder).parse_species_name
+ assert_equal 1920, builder.names.last.year
+ end

+ def test_that_parse_family_name_parses_to_year_3
+ lexer = Taxonifi::Splitter::Lexer.new("Bus aus (Jones, 1920)", :species_name)
+ builder = Taxonifi::Model::SpeciesName.new
+ Taxonifi::Splitter::Parser.new(lexer, builder).parse_species_name
+ assert_equal 1920, builder.names.last.year
+ end

- end
+ end

@@ -88,7 +88,7 @@ class Test_TaxonifiSplitterTokens < Test::Unit::TestCase
  lexer = Taxonifi::Splitter::Lexer.new(s)
  assert t = lexer.pop(Taxonifi::Splitter::Tokens::AuthorYear)
  assert_equal a.strip, t.authors
- assert_equal (y.size > 0 ? y.strip.to_i : nil), t.year
+ assert_equal (y.size > 0 ? y.strip.to_i : nil), t.year # bad test
  assert_equal p, t.parens
  s = nil
  end
@@ -425,8 +425,6 @@ class Test_TaxonifiSplitterTokens < Test::Unit::TestCase
  assert_equal "33", t.pg_end
  assert_equal "ix 14, 19", t.remainder

-
  end
-
  end

@@ -1,2 +1,2 @@
  #!/bin/sh
- gem install bundler -v=1.9.4
+ gem install bundler -v=1.17.3
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: taxonifi
  version: !ruby/object:Gem::Version
- version: 0.5.0
+ version: 0.5.1
  platform: ruby
  authors:
  - Matt Yoder