lcsort 0.9.0
- checksums.yaml +7 -0
- data/.gitignore +9 -0
- data/.travis.yml +4 -0
- data/Gemfile +4 -0
- data/LICENSE.txt +22 -0
- data/README.md +76 -0
- data/bin/console +14 -0
- data/bin/setup +7 -0
- data/lcsort.gemspec +25 -0
- data/lib/lcsort.rb +257 -0
- data/lib/lcsort/version.rb +3 -0
- data/lib/lcsort/volume_abbreviations.rb +216 -0
- data/rakefile +10 -0
- metadata +99 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
---
SHA1:
  metadata.gz: 830c39b4588cbf4efdc6ce8eb0f97ad0ae596d77
  data.tar.gz: 760bb1182bdf2a660d364db668e64f0c80f4c88a
SHA512:
  metadata.gz: 4bd2881d4c1c9f7ab8c7ead44801005e2c703acb4767d588633e271acad1f3f51233524ae3b793661c015861094513f02c483898f628e9874b02bce63c559678
  data.tar.gz: 955c2145684c114804e6e6a7b981ad48c240da3d49259eaeeda84004509845a1a71404b4b039d84783b7ff008653fdeea0db42af3df18af968fd15e4ec5c0ad9
data/.gitignore
ADDED
data/.travis.yml
ADDED
data/Gemfile
ADDED
data/LICENSE.txt
ADDED
@@ -0,0 +1,22 @@
Copyright (c) 2015

MIT License

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,76 @@
Normalized sort key for sorting Library of Congress call numbers.

## Usage

Sorting Library of Congress call numbers is tricky. This library generates
a sort key for an LC call number, such that for a list of call numbers, their
sort keys will sort (in natural byte order) in the same order the call
numbers should sort in.

    # It's often useful to store the sort_key in a db
    sort_key = Lcsort.normalize(callnum)

If the input can't be recognized as an LC call number, `nil` will be returned.

This code is intended for ASCII-only input. If you have non-ASCII characters in your
call numbers, we don't know what will happen.

    # Or if you have a list of call numbers in memory, it's easy
    # enough to just sort them in memory:
    call_num_array.sort_by {|callnum| Lcsort.normalize(callnum) }
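
To see why a normalized key is needed at all, here is a minimal sketch in plain Ruby. It does not call the gem: `toy_key` is a hypothetical helper that handles only "class letters plus whole number" call numbers, and the padding widths (3 letters, 4 digits) are assumptions chosen for illustration.

```ruby
# Toy illustration only: a simplified key builder, NOT the gem's algorithm.
# `toy_key` is a hypothetical helper; it accepts just "LETTERS NUMBER".
def toy_key(callnum)
  m = /\A\s*([A-Z]{1,3})\s*(\d{1,4})\s*\z/.match(callnum.upcase)
  return nil unless m

  letters, number = m.captures
  # Right-pad the class letters with '.', left-pad the number with zeros,
  # so that byte-wise string comparison agrees with shelf order.
  letters.ljust(3, '.') + number.rjust(4, '0')
end

callnums = ["QA 76", "Q 11", "QA 9", "Q 101"]

plain_sorted = callnums.sort                          # raw string comparison
key_sorted   = callnums.sort_by { |c| toy_key(c) }    # comparison on padded keys

puts plain_sorted.inspect
puts key_sorted.inspect
```

Raw string sorting puts `Q 101` before `Q 11` and `QA 76` before `QA 9`; sorting by the padded key restores numeric order within each class.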

## We can handle all sorts of call numbers

Call numbers are diverse, both in standard LC and local practice.
We wouldn't have the hubris to say we can properly recognize and sort
EVERY possible LC call number including local practices. But we sure
can handle a lot, including:

* Typical call numbers like: `R 169.1 .B59 1990`
* Can handle variations in spacing/punctuation, such as: `R 169.B59.C39`, `R169 B59C39 1990`
* Can handle properly sorting the dreaded 'date or other number': `KF 4558 15th .G6` sorts after `KF 4558 1st .G6`
* Will _generally_ sort volume/number info in call number suffix properly: `Q11 .P6 vol. 4 no. 4` sorts before `Q11 .P6 vol. 12 no. 1`.
* Can handle 1-2 letter suffixes on the end of cutters: `R 179 .C79ab`. Common local practice, and also used in NLM call numbers. (No guarantee that every NLM call number can be handled by this library for LC call numbers, but it seems to work okay for NLM.)
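
The 'date or other number' and volume/number cases above both come down to zero-padding runs of digits so that byte order matches numeric order. The following is a rough sketch of that idea only; `pad_numbers` is a hypothetical helper and the width of 4 is an assumption, so treat it as an illustration rather than the library's code.

```ruby
# Left-pad every run of digits to a fixed width (4 is an assumed width),
# so that comparing the strings byte by byte compares the numbers correctly.
def pad_numbers(text, width = 4)
  text.gsub(/\d+/) { |digits| digits.rjust(width, '0') }
end

suffixes = ["vol. 12 no. 1", "vol. 4 no. 4", "vol. 4 no. 10"]

plain  = suffixes.sort                           # "12" sorts before "4" here
padded = suffixes.sort_by { |s| pad_numbers(s) } # numeric order restored

ordinals        = ["15th", "1st"]
ordinals_plain  = ordinals.sort
ordinals_padded = ordinals.sort_by { |s| pad_numbers(s) }

puts padded.inspect
puts ordinals_padded.inspect
```

Numbers wider than the chosen width would not be helped by this padding, which is one reason such sorting is only described as working "generally".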


[OCLC's docs on MARC 050](http://www.oclc.org/bibformats/en/0xx/050.html) include some information on possible LC call number components.

## Range and truncation limiting

Once you have a bunch of Lcsort keys in your database, you may want to search
to find all call numbers beginning with, say, `EG 101`. That might include `EG 101.5`, `EG 101 .C23 1990`, etc.

The `truncated_range_end` method gives you a proper range end to get what you want, say:

    sort_key >= #{Lcsort.normalize("EG 101")} AND sort_key <= #{Lcsort.truncated_range_end('EG 101')}

This can also be used for finding a range of call numbers. Say you want all call numbers
from those beginning with `AB 101` to those beginning with `AB 500`:

    sort_key >= #{Lcsort.normalize("AB 101")} AND sort_key <= #{Lcsort.truncated_range_end('AB 500')}

`truncated_range_end` works with as many or as few call number components as you want. `Lcsort.truncated_range_end('AB 101.1')` will find `AB 101.123` or `AB 101.1 .A5` too. `Lcsort.truncated_range_end("AB 101 .C45")` will find `AB 101 .C456`, `AB 101 .C45 .B5`, etc.

At the moment, `truncated_range_end` pretty much just adds a `~` onto the end
of the normalized sort key. But it did more complicated things in past versions of
the normalization algorithm, and we do have tests ensuring it finds what is expected.
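
The `~` trick can be tried without a database. The sketch below uses hand-written keys in the shape this version's `normalize` appears to produce (class letters right-padded with `.`, class number zero-padded to 4 digits, `.`-prefixed cutter and date). The keys are assumptions written out by hand, not output captured from the gem.

```ruby
# '~' is one of the highest printable ASCII characters, so it sorts after
# digits, uppercase letters, '.' and '-'.
high_char = '~'

# Hand-written keys (assumed shapes, not generated by the gem).
keys = {
  "EG 99"            => "EG.0099",
  "EG 101"           => "EG.0101",
  "EG 101.5"         => "EG.01015",
  "EG 101 .C23 1990" => "EG.0101.C23.1990",
  "EG 102"           => "EG.0102",
}

range_start = "EG.0101"
range_end   = range_start + high_char   # upper bound for everything starting with range_start

# Same comparison the SQL fragment performs: start <= key <= end.
in_range = keys.select { |_callnum, key| key >= range_start && key <= range_end }.keys

puts in_range.inspect
```

Every key that merely extends `EG.0101` compares below `EG.0101~`, while `EG.0102` already differs at an earlier character and falls outside the range.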

## append_suffix

Sometimes you want to add something on to the end of a normalized call number,
as a payload, or to ensure normalized sort key uniqueness.

You can pass an `:append_suffix` option to have it appended in a way that won't
otherwise change the sort order of the original call number.

I use this to add the bib ID on to the end of the normalized sort key,
because if two bibs have identical call numbers, I want to avoid
normalized sort key collision, because my functions work better with
all unique sort keys.

    sortkey = Lcsort.normalize(callnumber, :append_suffix => bibID)
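
As a rough illustration of why the suffix does not disturb ordering, the sketch below uses the separator values set in this version's initializer (`--` before the 'extra' part, `---` before the appended suffix) on hand-written keys. The keys and bib IDs are made up for the example. Since the code shown in this diff concatenates the suffix with `+`, the suffix is presumably expected to be a String.

```ruby
# Separator values as set in this version's initializer.
extra_separator         = '--'
append_suffix_separator = '---'

# Hand-written keys (assumed shapes); bib IDs are invented for the example.
base_key = "R..0169.B59"   # roughly what `R 169 .B59` normalizes to
next_key = "R..0169.B6"    # a call number that shelves after it

keys = [
  next_key + append_suffix_separator + "bib050",
  base_key + extra_separator + "V0002" + append_suffix_separator + "bib300",
  base_key + append_suffix_separator + "bib200",
  base_key + append_suffix_separator + "bib100",
]

sorted = keys.sort
puts sorted
```

The two records sharing `R 169 .B59` stay adjacent and get distinct keys, the `v. 2` record still follows the bare call number because `---` sorts below `--V`, and the next call number still comes last.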

## Acknowledgement

Original regex and code by Bill Dueber. Original port to Ruby by Nikitas Tampakis.
LC handling advice from Naomi Dushay and her code.
data/bin/console
ADDED
@@ -0,0 +1,14 @@
#!/usr/bin/env ruby

require "bundler/setup"
require "lcsort"

# You can add fixtures and/or initialization code here to make experimenting
# with your gem easier. You can also use a different console, if you like.

# (If you use this, don't forget to add pry to your Gemfile!)
# require "pry"
# Pry.start

require "irb"
IRB.start
data/bin/setup
ADDED
data/lcsort.gemspec
ADDED
@@ -0,0 +1,25 @@
# coding: utf-8
lib = File.expand_path('../lib', __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require 'lcsort/version'

Gem::Specification.new do |spec|
  spec.name = "lcsort"
  spec.version = Lcsort::VERSION
  spec.authors = ["Nikitas Tampakis", "Jonathan Rochkind"]
  spec.email = ["tampakis@princeton.edu"]

  spec.summary = %q{Sort-normalized forms of LC Call Numbers}
  spec.description = %q{Sort-order-normalize Library of Congress call numbers and determine search ranges for left-anchor search}
  #spec.homepage = "TODO: Put your gem's website or public repo URL here."


  spec.files = `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
  spec.bindir = "exe"
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.add_development_dependency "bundler", "~> 1.9"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "minitest", "~> 5.0"
end
data/lib/lcsort.rb
ADDED
@@ -0,0 +1,257 @@
# encoding: utf-8

# The sorting code is organized as a class for code organization
# and possible future parameterization.
#
# Lcsort.new.normalize(call)
#
# But for convenience and efficiency, you can call as a class method too:
#
# Lcsort.normalize(call)
class Lcsort
  HIGH_CHAR = '~'

  LC= /^
    \s*
    ([A-Z]{1,3}) # alpha
    \s*
    (?: # optional numbers with optional decimal point
      (\d+) # num
      (?:\s*?\.\s*?(\d+))? # dec
    )?
    \s*
    (?: # optional doon1 -- date or other number eg 1991 , 103rd, 103d
      \.?
      (\d{1,4})
      (?:ST|ND|RD|TH|D)?
    )?
    \s*
    (?: # optional cutter
      \.? \s*
      ([A-Z]) # cutter letter c1alpha
      # cutter numeric portion is optional entirely IF at end of string, to
      # support bottomout on partial cutters
      # optional cutter letter suffixes are also supported
      # ie .A12ab -- which requires lookahead to make sure not absorbing subsequent
      # cutter, doh.
      \s*
      (\d+ # cutter numbers c1num
        (?: [a-zA-Z]{0,2}(?=[ \.]|\Z))? # ...with optional 1-2 letter suffix
        | \Z)
    )?
    \s*
    (?: # optional doon2 -- date or other number eg 1991 , 103rd, 103d
      \.?
      (\d{1,4})
      (?:ST|ND|RD|TH|D)?
    )?
    \s*
    (?: # optional cutter
      \.? \s*
      ([A-Z]) # cutter letter c2alpha
      \s*
      (\d+ # cutter numbers c2num
        (?: [a-zA-Z]{0,2}(?=[ \.]|\Z))? # ...with optional 1-2 letter suffix
        | \Z)
    )?
    \s*
    (?: # optional cutter
      \.? \s*
      ([A-Z]) # cutter letter c3alpha
      \s*
      (\d+ # cutter numbers c3num
        (?: [a-zA-Z]{0,2}(?=[ \.]|\Z))? # ...with optional 1-2 letter suffix
        | \Z)
    )?
    (\s+.+?)? # everything else extra
    \s*$/x


  attr_accessor :alpha_width, :class_whole_width, :doon_width, :extra_vol_num_width
  attr_accessor :low_prefix_separator, :cutter_extralow_separator, :class_letter_padding, :extra_separator, :append_suffix_separator
  attr_accessor :extra_num_regexp

  def initialize()
    self.alpha_width = 3
    self.class_whole_width = 4
    self.doon_width = 4
    self.extra_vol_num_width = 4

    # cutter prefix separator must be lower ascii value than digit 0,
    # but higher than cutter_extralow_separator. `.` gives us
    # something that makes debugging easy and doesn't need to be
    # URI-escaped, which is nice.
    self.low_prefix_separator = '.'
    # cutter extralow separator separates cutter letter suffixes
    # i.e. as in the 'ab' in A234ab. It must be LOWER ascii value than
    # low_prefix_separator to make sort work.
    # Could use space ` `, but `-` is
    # less confusing debugging and nice that it doesn't need to be URI-escaped.
    self.cutter_extralow_separator = '-'

    # Using anything less than ascii 0 should work, but `.` is nice for
    # debugging.
    self.class_letter_padding = '.'

    # Extra separator needs to be lower than our other separators,
    # especially cutter_extralow_separator.
    # Doubling the cutter_extralow_separator works.
    self.extra_separator = (self.cutter_extralow_separator * 2)

    # Needs to sort LOWER than extra separator, at least in cases
    # where there's no extra. We tried comma, but MySQL did weird
    # things under utf8 collation. Let's see if it works better with
    # three dashes.
    self.append_suffix_separator = (self.cutter_extralow_separator * 3)

    # Only state should be configuration, not about individual call numbers.
    # We re-use this for multiple call numbers, and don't want callnum-specific
    # state; we also want to ensure it's thread-safe for using between multiple
    # threads. So freeze it! Doesn't absolutely prevent state changes, but
    # helps and sends the message.
    self.freeze
  end


  # Our code is organized in a class, for code organization and
  # possibility of sub-class and constructor customization in the future.
  #
  # But most people will want to call as a simple class-method.
  # Store a singleton instance of Lcsort to let class method
  # be efficient and not need to instantiate a new one every time.
  #
  # Initialize singleton NOT lazily but here on class def, for
  # thread safety.
  @global = Lcsort.new
  def self.normalize(*args)
    @global.normalize(*args)
  end

  def self.truncated_range_end(*args)
    @global.truncated_range_end(*args)
  end

  def normalize(cn, options = {})
    callnum = cn.upcase

    match = LC.match(callnum)
    unless match
      return nil
    end

    alpha, num, dec, doon1, c1alpha, c1num, doon2, c2alpha, c2num, c3alpha, c3num, extra = match.captures


    # We can't handle a class number wider than the space we have
    if num && num.length > self.class_whole_width
      return nil
    end

    normal_str = ""

    # Right fill alpha class with separators, to ensure sort, we
    # always have alpha.
    normal_str << right_fill( alpha, alpha_width, self.class_letter_padding)

    # Left-fill whole number with preceding 0's to ensure sort,
    # Only needed if present, sort will work right regardless.
    if num
      normal_str << left_fill_number(num, class_whole_width)
    end

    # decimal class number needs no fill, add it if we have it.
    # relies on fixed width whole number to sort properly.
    normal_str << dec if dec

    # Add cutters and doons in order, if present
    normal_str << normalize_doon(doon1) if doon1

    normal_str << normalize_cutter(c1alpha, c1num) if c1alpha

    normal_str << normalize_doon(doon2) if doon2

    normal_str << normalize_cutter(c2alpha, c2num) if c2alpha
    normal_str << normalize_cutter(c3alpha, c3num) if c3alpha

    normal_str << normalize_extra(extra) if extra

    normal_str << normalize_append_suffix(options[:append_suffix]) if options[:append_suffix]

    # normally we REQUIRE an alpha and number for a good call number,
    # but for creating truncated_end_ranges, we relax that.
    unless options[:range_end_construction]
      unless alpha && num
        return nil
      end
    end

    return normal_str
  end

  def truncated_range_end(callnum)
    # Tell normalize to relax its restrictions for range_end
    # construction.
    normalized = normalize(callnum, :range_end_construction => true)

    return nil unless normalized

    # We just add a HIGH_CHAR on the end to make sure this sorts
    # after the original normalized with ANYTHING else on the end.
    return normalized + HIGH_CHAR
  end

  def right_fill(content, width, padding)
    content = content.to_s
    fill_spots = width - content.length
    fill_spots = 0 if fill_spots < 0

    content.to_s + (padding * fill_spots)
  end

  # Left-pad a whole number with zeroes to specified width
  def left_fill_number(content, width)
    content = content.to_s
    fill_spots = width - content.length
    fill_spots = 0 if fill_spots < 0

    return ('0' * fill_spots) + content
  end

  def normalize_cutter(c_alpha_prefix, c_rest)
    return nil if c_alpha_prefix.nil? || c_rest.nil?

    # Put a low separator before alpha suffix if present, to
    # ensure sort.
    c_rest = c_rest.sub(/(.*\d)([a-zA-Z]{1,2})\Z/, "\\1#{self.cutter_extralow_separator}\\2")

    self.low_prefix_separator + c_alpha_prefix + c_rest
  end

  def normalize_doon(doon)
    return nil if doon.nil?

    self.low_prefix_separator + left_fill_number(doon, self.doon_width)
  end

  # The 'extra' component is normalized by making it all alphanumeric,
  # and adding an ultra low prefix separator.
  def normalize_extra(extra)
    # Left-pad any volume/number type designations with zeros, so
    # they sort appropriately. We just find ALL numbers and
    # normalize them accordingly, it's good enough!
    extra_normalized = extra.gsub(/(\d+)/) do |match|
      left_fill_number($1, self.extra_vol_num_width)
    end

    # remove all non-alphanumeric
    extra_normalized = extra_normalized.gsub(/[^A-Z0-9]/, '')

    # Add very low prefix separator
    return (self.extra_separator + extra_normalized)
  end

  def normalize_append_suffix(suffix)
    self.append_suffix_separator + suffix
  end

end
data/lib/lcsort/version.rb
ADDED
data/lib/lcsort/volume_abbreviations.rb
ADDED
@@ -0,0 +1,216 @@
require 'lcsort'

class Lcsort
  # Volume-type abbreviations used in call numbers, taken from
  # https://www.libraries.psu.edu/psul/cataloging/catref/callnumbers/callterms.html
  #
  # We create an array of them, just the abbreviations without the periods,
  # normalized to upcase.
  #
  # We also add a few more.
  #
  # This is used for left-padding vol/num numbers with 0's in the 'extra',
  # So they sort properly.


  abbrevs = [
    'Abh', # Abhandlung
    'Abs', # Abschnitt
    'abstr', # abstracts
    'Abt', # Abteilung, Abtheilung
    'addendum', #addendum
    'addit', # additamenta (Latin)
    'afd', # afdeling
    'afl', # aflevering
    'anejo', # anejo
    'anexo', # annexo
    'annex', # annex
    'appx', #appendix
    'ar', #arithmos (Greek)
    'arg', # argang
    'atlas', # atlas
    'aux', # auxiliary
    'avd', # avdeling


    'Bdchn', # Bandchen
    'Bde', # Bande
    'Bd', #Band (German)
    'bd', #band (Swedish), b'and (Yiddish)
    'bk', #book
    'bklet', # booklet
    'Buch', #Buch


    'canto', # canto
    'cart', #cartridge
    'cs', #cassette
    'c', # cast
    'chap', #chapter [1]
    'charts', #charts
    'ch', #chast', chastyna
    'cis', # cislo
    'class', # class
    'comment', # commentarium, commentaries
    'cong', #congress
    'cz', #czesc


    'd', # disc
    'dala', #dala
    'dalis', # dalis
    'deel', #deel [2]
    'del', # del [2]
    'deo', # deo
    'dial', #dial
    'diel', #diel
    'dil', # dil
    'dzel', #dzel


    'ed', #edition
    'Ergbd', # Erganzungsband
    'Erghft', #Erganzungsheft


    'F', # Folge
    'fasc', #fascicle, fasciculus
    'Fasz', #Faszikel


    'g', # godina
    'Gesamtausg', #Gesamtausgabe
    'graphs', #graphs
    'guide', # guide


    'hft', # hafte (Swedish)
    'Halbbd', #Halbband
    'halvbd', #halvband (Swedish)
    'handbk', #handbook
    'Hft', # Heft
    'hov', # hoveret (Hebrew)


    'illus', # illustration, -s [3]
    'index', # index
    'intro', # introduction [4]


    'jaarg', # jaargang
    'Jahrg', # Jahrgang
    'Jahrhdt', # Jahrhundert


    'Kap', # Kapitel
    'kn', #kniga, kniha
    'knj', # knjiga
    'koide', # koide
    'kommentar', # kommentar
    'kot', # kotet


    'Lfg', # Lieferung
    'livr', #livraison
    'livre', # livre


    'maj', # major
    'manual', #manual
    'maps', #maps
    'med', # medium
    'min', # minor
    'module', #module
    'ms\'', # mispar (Yiddish)


    'n.F', # neue Folge
    'n.s', # new series, nuova serie,nova serie, nueva serie [5]
    'nom', # nomer
    'nouv', #nouveau, nouvelle
    'no', #number, -s, numero (French), numero (Spanish)
    'nr', #numer
    'n', # numero (Italian)
    'n:o', # numero (Finnish)
    'Nr', #Nummer
    'nr', #nummer


    'op', #opus
    'osa', # osa
    'osat', #osat (Finnish)
    'otd', # otdel, otdelenie


    'pars', #pars
    'pt',
    'pts', #part, -s
    'pt', #parte
    'ptie', #partie
    'p', # pik
    'plates', #plates
    'portfolio', # [6] portfolio
    'prelim', #preliminary


    'qtr', # quarter


    'reel', #reel
    'rept', #report
    'rev', # revised, revision
    'r', # rik
    'roc', # rocnik
    'rocz', #rocznik
    'r', # rok


    'Samml', # Sammlung
    'sect', #section
    'sejums', #sejums
    'ser', # serie, series
    'ses', # sesit
    'sess', #session
    'study', # study
    'sub', # subject
    'suppl', # supplement
    'sv', #svazek, svazok, sveska, svezak
    'sz', #szam


    'tables', #tables
    'T', # Teil, Theil
    'Tbd', # Teilband
    'theme', # theme
    'title', # title
    'Titel', # Titel (German)
    't', # tom, tome, tomo, tomos, tomus
    'tl', #tayl, tiyl (Yiddish)


    'Uabs', #Unterabschnitt


    'v', # volume, -s
    'vol', # volumul
    'Vorber', #Vorbericht
    'vyp', # vypusk


    'workbk', #workbook


    'yaarg', # yaargang (Hebrew)


    'zesz', #zeszyt
    'zosh', #zoshyt
    'zv', #zvazok, zvezek
  ]
  abbrevs = abbrevs.collect {|a| a.upcase}

  abbrevs << "K" # Mozart Kochel catalog number

  VolumeAbbreviations = abbrevs

end
data/rakefile
ADDED
metadata
ADDED
@@ -0,0 +1,99 @@
--- !ruby/object:Gem::Specification
name: lcsort
version: !ruby/object:Gem::Version
  version: 0.9.0
platform: ruby
authors:
- Nikitas Tampakis
- Jonathan Rochkind
autorequire:
bindir: exe
cert_chain: []
date: 2015-07-08 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: bundler
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.9'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '1.9'
- !ruby/object:Gem::Dependency
  name: rake
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '10.0'
- !ruby/object:Gem::Dependency
  name: minitest
  requirement: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '5.0'
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    requirements:
    - - "~>"
      - !ruby/object:Gem::Version
        version: '5.0'
description: Sort-order-normalize Library of Congress call numbers and determine search
  ranges for left-anchor search
email:
- tampakis@princeton.edu
executables: []
extensions: []
extra_rdoc_files: []
files:
- ".gitignore"
- ".travis.yml"
- Gemfile
- LICENSE.txt
- README.md
- bin/console
- bin/setup
- lcsort.gemspec
- lib/lcsort.rb
- lib/lcsort/version.rb
- lib/lcsort/volume_abbreviations.rb
- rakefile
homepage:
licenses: []
metadata: {}
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 2.4.5
signing_key:
specification_version: 4
summary: Sort-normalized forms of LC Call Numbers
test_files: []