marc4j4r 1.4.3-java

@@ -0,0 +1,3 @@
1
+ -
2
+ ChangeLog.md
3
+ LICENSE.txt
@@ -0,0 +1,4 @@
1
+ Gemfile.lock
2
+ doc/
3
+ pkg/
4
+ vendor/cache/*.gem
@@ -0,0 +1 @@
1
+ --markup markdown --title "marc4j4r Documentation" --protected
@@ -0,0 +1,40 @@
1
+ ###1.4.3 (2012-12-05)
2
+ * NOTE: Will not work with jruby 1.6.8 or 1.7.0 because of bug [JRUBY-6581](http://jira.codehaus.org/browse/JRUBY-6581)
3
+ Better to use 1.6.7 in --1.8 mode or (better) 1.7.1
4
+ * Switched build system to just use Bundler.
5
+ * Require writer by default
6
+ * switched testing framework to minitest
7
+ * updated to work with jruby 1.7.1
8
+
9
+ ###1.4.2
10
+ * Allow Aleph control field 'FMT'
11
+
12
+ ###1.4.1
13
+ * Set so documents without an id are skipped (not indexed)
14
+
15
+ ###1.4.0
16
+ * Update to use newest version of marc4j_extra_readers_writers.jar; before I wasn't
17
+ correctly turning '^' into ' ' in leaders and control fields.
18
+
19
+ ###1.3.0
20
+ * Updated to use jlogger; added logging #nextRecord (also used by #each) to
21
+ AlephSequentialReader and PermissiveStreamReader
22
+
23
+ ###1.2.0
24
+ * Fixed encoding problem with to_marc and from_string roundtrip
25
+ * Added to_hash/to_marc_in_json and from_hash/from_marc_in_json (see
26
+ http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/)
27
+
28
+ ###1.1
29
+ * Added native java method to turn a record into XML (20% speedup or so)
30
+ ###1.0
31
+ * Arbitrary decision that this is 1.0
32
+ * Using javamarc.jar (fork of marc4j) from http://github.com/billdueber/javamarc
33
+ * Including alephsequential reader (but not writer) and specs
34
+ * Added code to Reader#each to deal with #errors object if provided by the
35
+ specific reader (right now, :permissivemarc and :alephsequential) and specs
36
+ to test
37
+ * Updated to latest marc4j; changes involve character conversion
38
+
39
+ ### 0.9.0
40
+ * First real public release
data/Gemfile ADDED
@@ -0,0 +1,16 @@
1
+ source :rubygems
2
+
3
+ gem 'jlogger'
4
+
5
+ group :development do
6
+ gem 'kramdown'
7
+ gem 'bundler', '~> 1.0'
8
+ gem 'rake', '~> 10'
9
+ gem 'yard', '~> 0.8'
10
+ end
11
+
12
+ group :test do
13
+ gem 'minitest', '~> 4'
14
+ end
15
+
16
+ gemspec
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009-2012 Bill Dueber
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,164 @@
1
+ # marc4j4r
2
+
3
+ * [Homepage](https://github.com/billdueber/marc4j4r#readme)
4
+ * [Issues](https://github.com/billdueber/marc4j4r/issues)
5
+ * [Documentation](http://rubydoc.info/gems/marc4j4r/frames)
6
+ * [Email](mailto:bill at dueber.com)
7
+
8
+
9
+ A Ruby wrapper around the marc4j.jar Java library (as forked by javamarc) for dealing with library MARC data.
10
+
11
+ ## JRuby version alert
12
+
13
+ `MARC4J4R::Record#each` throws an error in JRuby versions before 1.7.1 when in --1.9 mode. Your best bet is to use JRuby 1.7.1 (or higher).
14
+ [The error in question is, I think [JRUBY-6581](https://jira.codehaus.org/browse/JRUBY-6581)]
15
+
16
+ ## Deprecation alert
17
+
18
+ I'm giving up on this standalone module and focusing my efforts on making
19
+ a marc4j add-on for the standard [ruby-marc](https://github.com/ruby-marc/ruby-marc) distribution.
20
+
21
+
22
+ ## Getting a MARC reader
23
+
24
+ marc4j4r provides four readers out of the box: :strictmarc (binary), :permissivemarc (binary), :marcxml (MARC-XML),
25
+ and :alephsequential (Ex Libris's AlephSequential format).
26
+
27
+ You can pass either a filename or an open IO object (either a Ruby IO or a java.io.InputStream).
28
+
29
+ require 'marc4j4r'
30
+
31
+ binreader = MARC4J4R::Reader.new('test.mrc') # defaults to :strictmarc
32
+ binreader = MARC4J4R::Reader.new('test.mrc', :strictmarc)
33
+
34
+ permissivereader = MARC4J4R::Reader.new('test.mrc', :permissivemarc)
35
+
36
+ xmlreader = MARC4J4R::Reader.new('test.xml', :marcxml)
37
+ asreader = MARC4J4R::Reader.new('test.seq', :alephsequential)
38
+
39
+ # Or use a file object
40
+
41
+ reader = MARC4J4R::Reader.new(File.open('test.mrc'))
42
+
43
+ # Or a java.io.InputStream
44
+
45
+ jurl = Java::java.net.URL.new('http://my.machine.com/test.mrc')
46
+ istream = jurl.openConnection.getInputStream
47
+ reader = MARC4J4R::Reader.new(istream)
48
+
49
+ ## Using the reader
50
+
51
+ A MARC4J4R::Reader is an Enumerable, so you can do:
52
+
53
+ reader.each do |record|
54
+ # do stuff with the record
55
+ end
56
+
57
+ Or, if you're using [jruby_threach](http://rdoc.info/projects/billdueber/jruby_threach):
58
+
59
+ reader.threach(2) do |record|
60
+ # do stuff with records in two threads
61
+ end
62
+
63
+ ## Using the writer
64
+
65
+
66
+ binaryWriter = MARC4J4R::Writer.new(filename, :strictmarc)
67
+ xmlWriter = MARC4J4R::Writer.new(filename, :marcxml)
68
+
69
+ binaryWriter.write(record) # xmlWriter is used the same way
70
+ # repeat
71
+ binaryWriter.close
72
+
73
+
74
+ ## Working with records and fields
75
+
76
+ In addition to all the normal marc4j methods, MARC4J4R::Record exposes some additional methods
77
+ and syntaxes.
78
+
79
+ See the classes themselves and/or the specs for more examples.
80
+
81
+ * {MARC4J4R::Reader}
82
+ * {MARC4J4R::Writer}
83
+ * {MARC4J4R::Record}
84
+ * {MARC4J4R::ControlField}
85
+ * {MARC4J4R::DataField}
86
+ * {MARC4J4R::SubField}
87
+
88
+
89
+ leader = record.leader
90
+
91
+ # All fields are available via #each or #fields
92
+
93
+ fields = record.fields
94
+
95
+ record.each do |field|
96
+ # do something with each controlfield/datafield; returned in the order they were added
97
+ end
98
+
99
+ # Controlfields have a tag and a value
100
+
101
+ idfield = record['001']
102
+ idfield.tag # => '001'
103
+ id = idfield.value # or idfield.data, same thing
104
+
105
+ # Get the first datafield with a given tag
106
+ first700 = record['700'] # Note: need to use strings, not integers
107
+
108
+ # Stringify a field to get all the subfields joined with spaces
109
+
110
+ fullTitle = record['245'].to_s
111
+
112
+ all700s = record.find_by_tag '700'
113
+ all700and856s = record.find_by_tag ['700', '856']
114
+
115
+
116
+ # Construct and add a controlfield
117
+ record << MARC4J4R::ControlField.new('001', '0000333234')
118
+
119
+ # Construct and add a datafield
120
+ df = MARC4J4R::DataField.new(tag, ind1, ind2)
121
+
122
+ ind1 = df.ind1
123
+ ind2 = df.ind2
124
+
125
+ df << MARC4J4R::SubField.new('a', 'the $a value')
126
+ df << MARC4J4R::SubField.new('b', 'the $b value')
127
+
128
+ # Add it to a record
129
+
130
+ record << df
131
+
132
+ # Get subfields or their values
133
+
134
+ firstSubfieldAValue = df['a']
135
+
136
+ allSubfields = df.subs
137
+ allSubfieldAs = df.subs('a')
138
+ allSubfieldAorBs = df.subs(['a', 'b'])
139
+
140
+ allSubfieldAorBValues = df.sub_values(['a', 'b'])
141
+
142
+
143
+
144
+ ## Install
145
+
146
+ $ gem install marc4j4r
147
+
148
+
149
+ ## Note on Patches/Pull Requests
150
+
151
+ * Fork the project.
152
+ * Make your feature addition or bug fix.
153
+ * Add tests for it. This is important so I don't break it in a
154
+ future version unintentionally.
155
+ * Commit, do not mess with rakefile, version, or history.
156
+ (If you want to have your own version, that is fine, but bump the version in a commit by itself that I can ignore when I pull.)
157
+ * Send me a pull request. Bonus points for topic branches.
158
+
159
+
160
+ ## Copyright
161
+
162
+ Copyright (c) 2012 Bill Dueber
163
+
164
+ See {file:LICENSE.txt} for details.
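
The README states that a MARC4J4R::Reader is an Enumerable. As a rough illustration of what that contract gives a caller, here is a minimal pure-Ruby sketch. `SketchReader` and the hash "records" are hypothetical stand-ins, not part of marc4j4r, so the snippet runs without JRuby or the jars.

```ruby
# Stand-in for a reader: NOT the real MARC4J4R::Reader, only a class that
# follows the same Enumerable contract (define #each, get the rest for free).
class SketchReader
  include Enumerable

  def initialize(records)
    @records = records
  end

  # Yield each record in turn; return an Enumerator when no block is given.
  def each
    return enum_for(:each) unless block_given?
    @records.each { |record| yield record }
  end
end

# Hypothetical records modelled as plain hashes.
reader = SketchReader.new([{ 'id' => '001' }, { 'id' => '002' }, {}])

ids     = reader.map { |rec| rec['id'] }            # collect a value per record
with_id = reader.select { |rec| rec.key?('id') }    # keep only records with an id
```

Because only `#each` is defined by hand, `map`, `select`, `first` and the other Enumerable methods come along automatically, which is the property the README relies on.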
@@ -0,0 +1,34 @@
1
+ # encoding: utf-8
2
+
3
+ require 'rubygems'
4
+
5
+ begin
6
+ require 'bundler'
7
+ rescue LoadError => e
8
+ warn e.message
9
+ warn "Run `gem install bundler` to install Bundler."
10
+ exit -1
11
+ end
12
+
13
+ begin
14
+ Bundler.setup(:development)
15
+ rescue Bundler::BundlerError => e
16
+ warn e.message
17
+ warn "Run `bundle install` to install missing gems."
18
+ exit e.status_code
19
+ end
20
+
21
+ require 'rake'
22
+
23
+ require "bundler/gem_tasks"
24
+
25
+ require 'yard'
26
+ YARD::Rake::YardocTask.new
27
+ task :doc => :yard
28
+
29
+ require 'rake/testtask'
30
+ Rake::TestTask.new do |test|
31
+ test.libs << 'spec'
32
+ test.pattern = 'spec/**/*_spec.rb'
33
+ test.verbose = true
34
+ end
Binary file
@@ -0,0 +1,73 @@
1
+ require 'marc4j4r/version'
2
+
3
+ unless defined? JRUBY_VERSION
4
+ raise "Only works under JRUBY"
5
+ end
6
+
7
+ jardir = File.join(File.dirname(__FILE__), '..', 'jars')
8
+
9
+ # For each jar, check for a representative class in each
10
+ # and include the jar if it's not defined
11
+
12
+ begin
13
+ java_import Java::org.marc4j.marc.impl.RecordImpl
14
+ rescue NameError => e
15
+ require "#{jardir}/javamarc.jar"
16
+ end
17
+
18
+ begin
19
+ java_import Java::org.marc4j.MarcAlephSequentialReader
20
+ rescue
21
+ require "#{jardir}/marc4j-extra-readers-writers.jar"
22
+ end
23
+
24
+ begin
25
+ java_import Java::org.codehaus.jackson.map.ObjectMapper
26
+ rescue
27
+ require "#{jardir}/jackson-all-1.6.0.jar"
28
+ end
29
+
30
+
31
+ # Define a method that will take a string (filename), IO object, or StringIO object,
32
+ # and return an inputstream/outputstream
33
+
34
+ module IOConvert
35
+
36
+ def byteinstream(fromwhere)
37
+ stream = nil
38
+ if fromwhere.is_a? Java::JavaIO::InputStream
39
+ stream = fromwhere
40
+ elsif fromwhere.is_a? String
41
+ stream = java.io.FileInputStream.new(fromwhere.to_java_string)
42
+ elsif fromwhere.respond_to? :to_inputstream
43
+ stream = fromwhere.to_inputstream
44
+ end
45
+ return stream
46
+ end
47
+
48
+ def byteoutstream towhere
49
+ stream = nil
50
+ if towhere.is_a? Java::JavaIO::OutputStream
51
+ stream = towhere
52
+ elsif towhere.is_a? String
53
+ stream = java.io.FileOutputStream.new(towhere.to_java_string)
54
+ elsif towhere.respond_to? :to_outputstream
55
+ stream = towhere.to_outputstream
56
+ end
57
+ return stream
58
+ end
59
+
60
+
61
+ module_function :byteinstream, :byteoutstream
62
+
63
+ end
64
+
65
+
66
+
67
+
68
+
69
+ require 'marc4j4r/record.rb'
70
+ require 'marc4j4r/controlfield.rb'
71
+ require 'marc4j4r/reader.rb'
72
+ require 'marc4j4r/datafield.rb'
73
+ require 'marc4j4r/writer.rb'
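
Note that `byteinstream` and `byteoutstream` above return nil when the argument is neither a Java stream, a String, nor something that responds to the conversion method. The sketch below shows the same type-based dispatch in plain Ruby, with one assumed change: it raises instead of returning nil. `SketchIOConvert` and `readable_for` are hypothetical names, and the snippet uses Ruby IO objects rather than java.io streams so it can run outside JRuby.

```ruby
require 'stringio'

# Pure-Ruby analogue of the byteinstream dispatch (hypothetical helper;
# the real method returns java.io streams and only runs under JRuby).
module SketchIOConvert
  module_function

  def readable_for(source)
    if source.is_a?(String)
      File.open(source, 'rb')   # strings are treated as filenames
    elsif source.respond_to?(:read)
      source                    # already an IO-like object
    else
      # The original falls through and returns nil here; raising is an
      # assumption about what a caller might prefer.
      raise ArgumentError, "Don't know how to read from #{source.class}"
    end
  end
end

io = SketchIOConvert.readable_for(StringIO.new('00000nam'))
first_bytes = io.read(5)
```

Failing early like this is a design choice, not something the gem does; it trades the original's permissiveness for an error at the point where the bad argument is passed.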
@@ -0,0 +1,34 @@
1
+ module MARC4J4R
2
+ ControlField = Java::org.marc4j.marc.impl::ControlFieldImpl
3
+ class ControlField
4
+ def value
5
+ return self.data
6
+ end
7
+
8
+ def value= str
9
+ self.data = str
10
+ end
11
+
12
+ def controlField?
13
+ return true
14
+ end
15
+
16
+ def self.control_tag? tag
17
+ return true if Java::org.marc4j.marc.impl.Verifier.isControlField(tag)
18
+ return true if tag == 'FMT'
19
+ return false
20
+ end
21
+
22
+ # Pretty-print
23
+ # (Takes no arguments: a control field has no subfields to join.)
24
+ # @return [String] The tag and value separated by a space
25
+ def to_s
26
+ return self.tag + " " + self.value
27
+ end
28
+
29
+ def == other
30
+ self.tag == other.tag && self.value == other.value
31
+ end
32
+
33
+ end
34
+ end
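
The `control_tag?` check above defers to the Java verifier and then special-cases Aleph's 'FMT' tag. A plain-Ruby approximation of that two-step check might look like the following. `SketchTags` is a hypothetical name, and the rule that control tags are 001 through 009 is an assumption about what the Java `Verifier.isControlField` accepts, not something confirmed by this diff.

```ruby
# Hypothetical stand-in for MARC4J4R::ControlField.control_tag?.
# ASSUMPTION: the Java verifier accepts the tags 001 through 009; that rule
# is approximated here with a regular expression.
module SketchTags
  # Non-numeric tags treated as control fields (per the 1.4.2 changelog entry).
  ALEPH_CONTROL_TAGS = ['FMT'].freeze

  def self.control_tag?(tag)
    return true if tag =~ /\A00[1-9]\z/
    ALEPH_CONTROL_TAGS.include?(tag)
  end
end

SketchTags.control_tag?('001')
SketchTags.control_tag?('FMT')
SketchTags.control_tag?('245')
```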
@@ -0,0 +1,195 @@
1
+ module MARC4J4R
2
+ DataField = Java::org.marc4j.marc.impl::DataFieldImpl
3
+ SubField = Java::org.marc4j.marc.impl::SubfieldImpl
4
+
5
+ class DataField
6
+ include Enumerable
7
+
8
+ alias_method :<<, :addSubfield
9
+ alias_method :add, :addSubfield
10
+
11
+ # Override the initialize to allow creation with just a tag (marc4j only allows either
12
+ # no args or the tag and both indicators)
13
+
14
+ alias_method :oldinit, :initialize
15
+ def initialize(tag = nil, ind1 = ' ', ind2 = ' ')
16
+ self.oldinit(tag, ind1[0].ord, ind2[0].ord)
17
+ end
18
+
19
+ def controlField?
20
+ return false
21
+ end
22
+
23
+ def == other
24
+
25
+ basics = ((self.tag == other.tag) and (self.indicator1 == other.indicator1) and (self.indicator2 == other.indicator2))
26
+ unless basics
27
+ # puts "Failed basics"
28
+ return false
29
+ end
30
+ selfsubs = self.to_a
31
+ othersubs = other.to_a
32
+
33
+ return false if selfsubs.size != othersubs.size
34
+
35
+ # puts "#{self} vs #{other}"
36
+ while (selfsubs.length > 0)
37
+ ssf = selfsubs.shift
38
+ osf = othersubs.shift
39
+ unless ssf == osf
40
+ # puts "#{ssf} <> #{osf}"
41
+ return false
42
+ end
43
+ end
44
+
45
+ if ((selfsubs.size > 0) or (othersubs.size > 0))
46
+ # puts "sizes unequal"
47
+ return false
48
+ end
49
+ return true
50
+ end
51
+
52
+ # Pretty-print
53
+ # @param [String] joiner What string to use to join the subfields
54
+ # @return [String] The pretty string
55
+ def to_s (joiner = ' ')
56
+ arr = [self.tag + ' ' + self.indicator1 + self.indicator2]
57
+ self.each do |s|
58
+ arr.push s.to_s
59
+ end
60
+ return arr.join(joiner)
61
+ end
62
+
63
+
64
+ # Get the value of the first subfield of this field with the given code
65
+ # @param [String] code 1-character string of the subfield code
66
+ # @return [String] The value of the first matched subfield
67
+ def [] code
68
+ raise ArgumentError, "Code must be a one-character string, not #{code}" unless code.is_a? String and code.size == 1
69
+ # need to send a char value that the underlying java can deal with
70
+ sub = self.getSubfield(code[0].ord)
71
+ if (sub)
72
+ return sub.getData
73
+ else
74
+ return nil
75
+ end
76
+ end
77
+
78
+ # Also call it "sub" for symmetry with "sub_values" and "subs"
79
+ # and "first" because it makes sense
80
+ alias_method :sub, :[]
81
+ alias_method :first, :[]
82
+
83
+ # Get all subfields, optionally restricting to those with a given code
84
+ # @param [String, Array<String>] code A (array of?) 1-character strings; the code(s) to collect. Default is all
85
+ # @return [Array<MARC4J4R::SubField>] The matching subfields, or an empty array
86
+
87
+ def subs code = false
88
+ unless code
89
+ return self.to_a
90
+ end
91
+
92
+ # Is it a singleton?
93
+ unless code.is_a? Array
94
+ code = [code]
95
+ end
96
+
97
+ return self.select {|s| code.include? s.code}
98
+ end
99
+
100
+ # Get all values from the subfields for the given code or array of codes
101
+ # @param [String, Array<String>] code (Array of?) 1-character string(s) of the subfield code
102
+ # @return [Array<String>] A possibly-empty array of Strings made up of the values in the subfields whose
103
+ # code is included in the given codes (or all subfields if code is empty)
104
+ #
105
+ #
106
+ # @example Quick examples:
107
+ # # 260 $a New York, $b Van Nostrand Reinhold Co. $c 1969
108
+ # rec['260'].sub_values('a') #=> ["New York,"]
109
+ # rec['260'].sub_values(['a', 'c']) #=> ["New York,", "1969"]
110
+ # rec['260'].sub_values(['c', 'a']) #=> ["New York,", "1969"]
111
+
112
+ def sub_values(code=nil)
113
+ return self.subs(code).collect {|s| s.value}
114
+ end
115
+
116
+
117
+ # Get first indicator as a one-character string
118
+ def indicator1
119
+ return self.getIndicator1.chr
120
+ end
121
+
122
+ # Get second indicator as a one-character string
123
+ def indicator2
124
+ return self.getIndicator2.chr
125
+ end
126
+
127
+ def indicator1= char
128
+ self.setIndicator1 char[0].ord
129
+ end
130
+
131
+ def indicator2= char
132
+ self.setIndicator2 char[0].ord
133
+ end
134
+
135
+ alias_method :ind1, :indicator1
136
+ alias_method :"ind1=", :"indicator1="
137
+ alias_method :ind2, :indicator2
138
+ alias_method :"ind2=", :"indicator2="
139
+
140
+ # Iterate over the subfields
141
+ def each
142
+ self.getSubfields.each do |s|
143
+ yield s
144
+ end
145
+ end
146
+
147
+ # Get the concatenated values of the subfields in the order they appear in the field
148
+ # @param [String] joiner The string used to join the subfield values
149
+ def value joiner=' '
150
+ data = self.getSubfields.map {|s| s.data}
151
+ return data.join(joiner)
152
+ end
153
+ end
154
+
155
+ class SubField
156
+
157
+ alias_method :oldinit, :initialize
158
+ def initialize code=nil, data=nil
159
+ if code
160
+ code = code[0].ord
161
+ if data
162
+ self.oldinit(code, data)
163
+ else
164
+ self.oldinit(code)
165
+ end
166
+ else
167
+ self.oldinit
168
+ end
169
+ end
170
+
171
+ def == other
172
+ return ((self.code == other.code) and (self.data == other.data))
173
+ end
174
+
175
+ def value
176
+ return self.data
177
+ end
178
+
179
+ def value= str
180
+ self.data = str
181
+ end
182
+
183
+ def code
184
+ return self.getCode.chr
185
+ end
186
+
187
+ def code= str
188
+ self.setCode str[0].ord
189
+ end
190
+
191
+ def to_s
192
+ return '$' + self.code + " " + self.data
193
+ end
194
+ end
195
+ end
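
The `@example` for `sub_values` above shows that asking for `['c', 'a']` still returns the values in field order. A small stand-alone sketch of why: `subs` filters with `select`, which walks the subfields in the order they were added. `SketchSubfield` and `SketchDataField` are hypothetical stand-ins for the marc4j-backed classes, so this runs in any Ruby.

```ruby
# Stand-ins for a data field and its subfields (NOT the marc4j-backed classes).
SketchSubfield = Struct.new(:code, :value)

class SketchDataField
  include Enumerable

  def initialize(subfields)
    @subfields = subfields
  end

  # Iterate over the subfields in the order they were added.
  def each(&block)
    @subfields.each(&block)
  end

  # All subfields, or only those whose code is in the given code(s).
  def subs(code = nil)
    return to_a unless code
    codes = Array(code)                      # accept a single code or an array
    select { |s| codes.include?(s.code) }    # filtering preserves field order
  end

  # The values of the matching subfields.
  def sub_values(code = nil)
    subs(code).map(&:value)
  end
end

field = SketchDataField.new([
  SketchSubfield.new('a', 'New York,'),
  SketchSubfield.new('b', 'Van Nostrand Reinhold Co.'),
  SketchSubfield.new('c', '1969')
])

field.sub_values(['c', 'a']) # => ["New York,", "1969"]
```

The order of the requested codes does not matter; only the order of the subfields within the field does.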