marc4j4r 1.4.3-java

@@ -0,0 +1,3 @@
1
+ -
2
+ ChangeLog.md
3
+ LICENSE.txt
@@ -0,0 +1,4 @@
1
+ Gemfile.lock
2
+ doc/
3
+ pkg/
4
+ vendor/cache/*.gem
@@ -0,0 +1 @@
1
+ --markup markdown --title "marc4j4r Documentation" --protected
@@ -0,0 +1,40 @@
1
+ ###1.4.3 (2012-12-05)
2
+ * NOTE: Will not work with jruby 1.6.8 or 1.7.0 because of bug [JRUBY-6581](http://jira.codehaus.org/browse/JRUBY-6581)
3
+ Better to use 1.6.7 in --1.8 mode or (better) 1.7.1
4
+ * Switched build system to just use Bundler.
5
+ * Require writer by default
6
+ * switched testing framework to minitest
7
+ * updated to work with jruby 1.7.1
8
+
9
+ ###1.4.2
10
+ * Allow Aleph control field 'FMT'
11
+
12
+ ###1.4.1
13
+ * Set so documents without an id are skipped (not indexed)
14
+
15
+ ###1.4.0
16
+ * Update to use newest version of marc4j_extra_readers_writers.jar; before I wasn't
17
+ correctly turning '^' into ' ' in leaders and control fields.
18
+
19
+ ###1.3.0
20
+ * Updated to use jlogger; added logging #nextRecord (also used by #each) to
21
+ AlephSequentialReader and PermissiveStreamReader
22
+
23
+ ###1.2.0
24
+ * Fixed encoding problem with to_marc and from_string roundtrip
25
+ * Added to_hash/to_marc_in_json and from_hash/from_marc_in_json (see
26
+ http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/)
27
+
28
+ ###1.1
29
+ * Added native java method to turn a record into XML (20% speedup or so)
30
+ ###1.0
31
+ * Arbitrary decision that this is 1.0
32
+ * Using javamarc.jar (fork of marc4j) from http://github.com/billdueber/javamarc
33
+ * Including alephsequential reader (but not writer) and specs
34
+ * Added code to Reader#each to deal with #errors object if provided by the
35
+ specific reader (right now, :permissivemarc and :alephsequential) and specs
36
+ to test
37
+ * Updated to latest marc4j; changes involve character conversion
38
+
39
+ ### 0.9.0
40
+ * First real public release
data/Gemfile ADDED
@@ -0,0 +1,16 @@
1
+ source :rubygems
2
+
3
+ gem 'jlogger'
4
+
5
+ group :development do
6
+ gem 'kramdown'
7
+ gem 'bundler', '~> 1.0'
8
+ gem 'rake', '~> 10'
9
+ gem 'yard', '~> 0.8'
10
+ end
11
+
12
+ group :test do
13
+ gem 'minitest', '~> 4'
14
+ end
15
+
16
+ gemspec
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009-2012 Bill Dueber
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,164 @@
1
+ # marc4j4r
2
+
3
+ * [Homepage](https://github.com/billdueber/marc4j4r#readme)
4
+ * [Issues](https://github.com/billdueber/marc4j4r/issues)
5
+ * [Documentation](http://rubydoc.info/gems/marc4j4r/frames)
6
+ * [Email](mailto:bill at dueber.com)
7
+
8
+
9
+ A ruby wrapper around the marc4j.jar (as forked by javamarc) java library for dealing with library MARC data.
10
+
11
+ ## JRuby version alert
12
+
13
+ `MARC4J4R::Record#each` throws an error in JRuby versions before 1.7.1 when in --1.9 mode. Your best bet is to use JRuby 1.7.1 (or higher).
14
+ [The error in question is, I think [JRUBY-6581](https://jira.codehaus.org/browse/JRUBY-6581)]
15
+
16
+ ## Deprecation alert
17
+
18
+ I'm giving up on this standalone module and focusing my efforts on making
19
+ a marc4j add-on for the standard [ruby-marc](https://github.com/ruby-marc/ruby-marc) distribution.
20
+
21
+
22
+ ## Getting a MARC reader
23
+
24
+ marc4j4r provides four readers out of the box: :strictmarc (binary), :permissivemarc (binary), :marcxml (MARC-XML),
25
+ and :alephsequential (Ex Libris's AlephSequential format).
26
+
27
+ You can pass either a filename or an open IO object (either a Ruby IO or a java.io.InputStream).
28
+
29
+ require 'marc4j4r'
30
+
31
+ binreader = MARC4J4R::Reader.new('test.mrc') # defaults to :strictmarc
32
+ binreader = MARC4J4R::Reader.new('test.mrc', :strictmarc)
33
+
34
+ permissivereader = MARC4J4R::Reader.new('test.mrc', :permissivemarc)
35
+
36
+ xmlreader = MARC4J4R::Reader.new('test.xml', :marcxml)
37
+ asreader = MARC4J4R::Reader.new('test.seq', :alephsequential)
38
+
39
+ # Or use a file object
40
+
41
+ reader = MARC4J4R::Reader.new(File.open('test.mrc'))
42
+
43
+ # Or a java.io.inputstream
44
+
45
+ jurl = Java::java.net.URL.new('http://my.machine.com/test.mrc')
46
+ istream = jurl.openConnection.getInputStream
47
+ reader = MARC4J4R::Reader.new(istream)
48
+
49
+ ## Using the reader
50
+
51
+ A MARC4J4R::Reader is an Enumerable, so you can do:
52
+
53
+ reader.each do |record|
54
+ # do stuff with the record
55
+ end
56
+
57
+ Or, if you're using [jruby_threach](http://rdoc.info/projects/billdueber/jruby_threach):
58
+
59
+ reader.threach(2) do |record|
60
+ # do stuff with records in two threads
61
+ end
62
+
63
+ ## Using the writer
64
+
65
+
66
+ binaryWriter = MARC4J4R::Writer.new(filename, :strictmarc)
67
+ xmlWriter = MARC4J4R::Writer.new(filename, :marcxml)
68
+
69
+ binaryWriter.write(record)
70
+ # repeat
71
+ binaryWriter.close
72
+
73
+
74
+ ## Working with records and fields
75
+
76
+ In addition to all the normal marc4j methods, MARC4J4R::Record exposes some additional methods
77
+ and syntaxes.
78
+
79
+ See the classes themselves and/or the specs for more examples.
80
+
81
+ * {MARC4J4R::Reader}
82
+ * {MARC4J4R::Writer}
83
+ * {MARC4J4R::Record}
84
+ * {MARC4J4R::ControlField}
85
+ * {MARC4J4R::DataField}
86
+ * {MARC4J4R::SubField}
87
+
88
+
89
+ leader = record.leader
90
+
91
+ # All fields are available via #each or #fields
92
+
93
+ fields = record.fields
94
+
95
+ record.each do |field|
96
+ # do something with each controlfield/datafield; returned in the order they were added
97
+ end
98
+
99
+ # Controlfields have a tag and a value
100
+
101
+ idfield = record['001']
102
+ idfield.tag # => '001'
103
+ id = idfield.value # or idfield.data, same thing
104
+
105
+ # Get the first datafield with a given tag
106
+ first700 = record['700'] # Note: need to use strings, not integers
107
+
108
+ # Stringify a field to get all the subfields joined with spaces
109
+
110
+ fullTitle = record['245'].to_s
111
+
112
+ all700s = record.find_by_tag '700'
113
+ all700and856s = record.find_by_tag ['700', '856']
114
+
115
+
116
+ # Construct and add a controlfield
117
+ record << MARC4J4R::ControlField.new('001', '0000333234')
118
+
119
+ # Construct and add a datafield
120
+ df = MARC4J4R::DataField.new(tag, ind1, ind2)
121
+
122
+ ind1 = df.ind1
123
+ ind2 = df.ind2
124
+
125
+ df << MARC4J4R::SubField.new('a', 'the $a value')
126
+ df << MARC4J4R::SubField.new('b', 'the $b value')
127
+
128
+ # Add it to a record
129
+
130
+ record << df
131
+
132
+ # Get subfields or their values
133
+
134
+ firstSubfieldAValue = df['a']
135
+
136
+ allSubfields = df.subs
137
+ allSubfieldAs = df.subs('a')
138
+ allSubfieldAorBs = df.subs(['a', 'b'])
139
+
140
+ allSubfieldAorBValues = df.sub_values(['a', 'b'])
141
+
142
+
143
+
144
+ ## Install
145
+
146
+ $ gem install marc4j4r
147
+
148
+
149
+ ## Note on Patches/Pull Requests
150
+
151
+ * Fork the project.
152
+ * Make your feature addition or bug fix.
153
+ * Add tests for it. This is important so I don't break it in a
154
+ future version unintentionally.
155
+ * Commit, do not mess with rakefile, version, or history.
156
+ (if you want to have your own version, that is fine but bump version in a commit by itself I can ignore when I pull)
157
+ * Send me a pull request. Bonus points for topic branches.
158
+
159
+
160
+ ## Copyright
161
+
162
+ Copyright (c) 2012 Bill Dueber
163
+
164
+ See {file:LICENSE.txt} for details.
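The "Using the reader" section of the README says a reader is an Enumerable. The following is a minimal plain-Ruby sketch of that idea, not the gem's implementation: `SketchReader`, `has_next?`, and `next_record` are hypothetical names standing in for a Java-style iterator, and the records are ordinary strings so the example runs without JRuby.

```ruby
# Hypothetical stand-in for a reader; this is NOT the marc4j4r implementation.
class SketchReader
  include Enumerable

  # records: any array of objects standing in for MARC records
  def initialize(records)
    @records = records
    @position = 0
  end

  # Mimics a Java-style iterator pair (hasNext / next).
  def has_next?
    @position < @records.size
  end

  def next_record
    record = @records[@position]
    @position += 1
    record
  end

  # Enumerable only needs #each; map, select, first, etc. come for free.
  def each
    return enum_for(:each) unless block_given?
    yield next_record while has_next?
  end
end

reader = SketchReader.new(%w[rec1 rec2 rec3])
upcased = reader.map(&:upcase)
```

Like a stream-backed reader, this sketch is consumed as it is iterated, so a second pass over the same `reader` yields nothing.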
@@ -0,0 +1,34 @@
1
+ # encoding: utf-8
2
+
3
+ require 'rubygems'
4
+
5
+ begin
6
+ require 'bundler'
7
+ rescue LoadError => e
8
+ warn e.message
9
+ warn "Run `gem install bundler` to install Bundler."
10
+ exit -1
11
+ end
12
+
13
+ begin
14
+ Bundler.setup(:development)
15
+ rescue Bundler::BundlerError => e
16
+ warn e.message
17
+ warn "Run `bundle install` to install missing gems."
18
+ exit e.status_code
19
+ end
20
+
21
+ require 'rake'
22
+
23
+ require "bundler/gem_tasks"
24
+
25
+ require 'yard'
26
+ YARD::Rake::YardocTask.new
27
+ task :doc => :yard
28
+
29
+ require 'rake/testtask'
30
+ Rake::TestTask.new do |test|
31
+ test.libs << 'spec'
32
+ test.pattern = 'spec/**/*_spec.rb'
33
+ test.verbose = true
34
+ end
Binary file
@@ -0,0 +1,73 @@
1
+ require 'marc4j4r/version'
2
+
3
+ unless defined? JRUBY_VERSION
4
+ raise "Only works under JRUBY"
5
+ end
6
+
7
+ jardir = File.join(File.dirname(__FILE__), '..', 'jars')
8
+
9
+ # For each jar, check for a representative class in each
10
+ # and include the jar if it's not defined
11
+
12
+ begin
13
+ java_import Java::org.marc4j.marc.impl.RecordImpl
14
+ rescue NameError => e
15
+ require "#{jardir}/javamarc.jar"
16
+ end
17
+
18
+ begin
19
+ java_import Java::org.marc4j.MarcAlephSequentialReader
20
+ rescue
21
+ require "#{jardir}/marc4j-extra-readers-writers.jar"
22
+ end
23
+
24
+ begin
25
+ java_import Java::org.codehaus.jackson.map.ObjectMapper
26
+ rescue
27
+ require "#{jardir}/jackson-all-1.6.0.jar"
28
+ end
29
+
30
+
31
+ # Define a method that will take a string (filename), IO object, or StringIO object,
32
+ # and return an inputstream/outputstream
33
+
34
+ module IOConvert
35
+
36
+ def byteinstream(fromwhere)
37
+ stream = nil
38
+ if fromwhere.is_a? Java::JavaIO::InputStream
39
+ stream = fromwhere
40
+ elsif fromwhere.is_a? String
41
+ stream = java.io.FileInputStream.new(fromwhere.to_java_string)
42
+ elsif fromwhere.respond_to? :to_inputstream
43
+ stream = fromwhere.to_inputstream
44
+ end
45
+ return stream
46
+ end
47
+
48
+ def byteoutstream towhere
49
+ stream = nil
50
+ if towhere.is_a? Java::JavaIO::OutputStream
51
+ stream = towhere
52
+ elsif towhere.is_a? String
53
+ stream = java.io.FileOutputStream.new(towhere.to_java_string)
54
+ elsif towhere.respond_to? :to_outputstream
55
+ stream = towhere.to_outputstream
56
+ end
57
+ return stream
58
+ end
59
+
60
+
61
+ module_function :byteinstream, :byteoutstream
62
+
63
+ end
64
+
65
+
66
+
67
+
68
+
69
+ require 'marc4j4r/record.rb'
70
+ require 'marc4j4r/controlfield.rb'
71
+ require 'marc4j4r/reader.rb'
72
+ require 'marc4j4r/datafield.rb'
73
+ require 'marc4j4r/writer.rb'
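The `IOConvert` module above accepts a filename, an IO object, or a Java stream and returns a stream. As a rough plain-Ruby analogue of that dispatch (an illustration only: `readable_stream` is a hypothetical helper, and it returns Ruby IO objects rather than Java streams), the same pattern might look like this:

```ruby
require 'stringio'

# Hypothetical helper, not part of marc4j4r: return something readable
# for a filename or an already-open IO-like object, or nil otherwise.
def readable_stream(source)
  if source.respond_to?(:read)
    source                      # File, StringIO, socket, ...
  elsif source.is_a?(String)
    File.open(source, 'rb')     # treat a String as a path; the caller closes it
  end
end

stream = readable_stream(StringIO.new('00024nam'))
first_bytes = stream.read(5)
unknown = readable_stream(42)   # unsupported input falls through to nil
```

As in `byteinstream`, an unsupported argument produces `nil` rather than an exception, so callers need to check the result before using it.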
@@ -0,0 +1,34 @@
1
+ module MARC4J4R
2
+ ControlField = Java::org.marc4j.marc.impl::ControlFieldImpl
3
+ class ControlField
4
+ def value
5
+ return self.data
6
+ end
7
+
8
+ def value= str
9
+ self.data = str
10
+ end
11
+
12
+ def controlField?
13
+ return true
14
+ end
15
+
16
+ def self.control_tag? tag
17
+ return true if Java::org.marc4j.marc.impl.Verifier.isControlField(tag)
18
+ return true if tag == 'FMT'
19
+ return false
20
+ end
21
+
22
+ # Pretty-print
23
+ # Control fields have no subfields, so there is no joiner argument
24
+ # @return [String] The pretty string
25
+ def to_s
26
+ return self.tag + " " + self.value
27
+ end
28
+
29
+ def == other
30
+ self.tag == other.tag && self.value == other.value
31
+ end
32
+
33
+ end
34
+ end
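`ControlField.control_tag?` above defers to marc4j's `Verifier.isControlField` and then adds Aleph's `FMT` tag. A pure-Ruby approximation is sketched below; the pattern for tags 001 through 009 is an assumption based on the MARC 21 convention for control fields, and marc4j's actual check may differ in its details.

```ruby
# Hypothetical pure-Ruby approximation; the gem itself delegates to
# marc4j's Verifier.isControlField, whose exact rules may differ.
CONTROL_TAG_PATTERN = /\A00[1-9]\z/   # assumed: MARC 21 tags 001 through 009
EXTRA_CONTROL_TAGS = %w[FMT].freeze   # Aleph's FMT, allowed since 1.4.2

# tag: a three-character String such as '001' or '245'
def control_tag?(tag)
  EXTRA_CONTROL_TAGS.include?(tag) || CONTROL_TAG_PATTERN.match?(tag)
end
```

Keeping the extra tags in a separate list makes it easy to see which tags are vendor extensions rather than part of the standard range.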
@@ -0,0 +1,195 @@
1
+ module MARC4J4R
2
+ DataField = Java::org.marc4j.marc.impl::DataFieldImpl
3
+ SubField = Java::org.marc4j.marc.impl::SubfieldImpl
4
+
5
+ class DataField
6
+ include Enumerable
7
+
8
+ alias_method :<<, :addSubfield
9
+ alias_method :add, :addSubfield
10
+
11
+ # Override the initialize to allow creation with just a tag (marc4j only allows either
12
+ # no args or the tag and both indicators)
13
+
14
+ alias_method :oldinit, :initialize
15
+ def initialize(tag = nil, ind1 = ' ', ind2 = ' ')
16
+ self.oldinit(tag, ind1[0].ord, ind2[0].ord)
17
+ end
18
+
19
+ def controlField?
20
+ return false
21
+ end
22
+
23
+ def == other
24
+
25
+ basics = ((self.tag == other.tag) and (self.indicator1 == other.indicator1) and (self.indicator2 == other.indicator2))
26
+ unless basics
27
+ # puts "Failed basics"
28
+ return false
29
+ end
30
+ selfsubs = self.to_a
31
+ othersubs = other.to_a
32
+
33
+ return false if selfsubs.size != othersubs.size
34
+
35
+ # puts "#{self} vs #{other}"
36
+ while (selfsubs.length > 0)
37
+ ssf = selfsubs.shift
38
+ osf = othersubs.shift
39
+ unless ssf == osf
40
+ # puts "#{ssf} <> #{osf}"
41
+ return false
42
+ end
43
+ end
44
+
45
+ if ((selfsubs.size > 0) or (othersubs.size > 0))
46
+ # puts "sizes unequal"
47
+ return false
48
+ end
49
+ return true
50
+ end
51
+
52
+ # Pretty-print
53
+ # @param [String] joiner What string to use to join the subfields
54
+ # @return [String] The pretty string
55
+ def to_s (joiner = ' ')
56
+ arr = [self.tag + ' ' + self.indicator1 + self.indicator2]
57
+ self.each do |s|
58
+ arr.push s.to_s
59
+ end
60
+ return arr.join(joiner)
61
+ end
62
+
63
+
64
+ # Get the value of the first subfield of this field with the given code
65
+ # @param [String] code 1-character string of the subfield code
66
+ # @return [String] The value of the first matched subfield
67
+ def [] code
68
+ raise ArgumentError, "Code must be a one-character string, not #{code}" unless code.is_a? String and code.size == 1
69
+ # need to send a char value that the underlying java can deal with
70
+ sub = self.getSubfield(code[0].ord)
71
+ if (sub)
72
+ return sub.getData
73
+ else
74
+ return nil
75
+ end
76
+ end
77
+
78
+ # Also call it "sub" for symmetry with "sub_values" and "subs"
79
+ # and "first" because it makes sense
80
+ alias_method :sub, :[]
81
+ alias_method :first, :[]
82
+
83
+ # Get all subfields, optionally restricting to those with a given code
84
+ # @param [String, Array<String>] code A (array of?) 1-character strings; the code(s) to collect. Default is all
85
+ # @return [Array<MARC4J4R::SubField>] The matching subfields, or an empty array
86
+
87
+ def subs code = false
88
+ unless code
89
+ return self.to_a
90
+ end
91
+
92
+ # Is it a singleton?
93
+ unless code.is_a? Array
94
+ code = [code]
95
+ end
96
+
97
+ return self.select {|s| code.include? s.code}
98
+ end
99
+
100
+ # Get all values from the subfields for the given code or array of codes
101
+ # @param [String, Array<String>] code (Array of?) 1-character string(s) of the subfield code
102
+ # @return [Array<String>] A possibly-empty array of Strings made up of the values in the subfields whose
103
+ # code is included in the given codes (or all subfields if code is empty)
104
+ #
105
+ #
106
+ # @example Quick examples:
107
+ # # 260 $a New York, $b Van Nostrand Reinhold Co. $c 1969
108
+ # rec['260'].sub_values('a') #=> ["New York,"]
109
+ # rec['260'].sub_values(['a', 'c']) #=> ["New York,", "1969"]
110
+ # rec['260'].sub_values(['c', 'a']) #=> ["New York,", "1969"] (field order, not argument order)
111
+
112
+ def sub_values(code=nil)
113
+ return self.subs(code).collect {|s| s.value}
114
+ end
115
+
116
+
117
+ # Get first indicator as a one-character string
118
+ def indicator1
119
+ return self.getIndicator1.chr
120
+ end
121
+
122
+ # Get second indicator as a one-character string
123
+ def indicator2
124
+ return self.getIndicator2.chr
125
+ end
126
+
127
+ def indicator1= char
128
+ self.setIndicator1 char[0].ord
129
+ end
130
+
131
+ def indicator2= char
132
+ self.setIndicator2 char[0].ord
133
+ end
134
+
135
+ alias_method :ind1, :indicator1
136
+ alias_method :"ind1=", :"indicator1="
137
+ alias_method :ind2, :indicator2
138
+ alias_method :"ind2=", :"indicator2="
139
+
140
+ # Iterate over the subfields
141
+ def each
142
+ self.getSubfields.each do |s|
143
+ yield s
144
+ end
145
+ end
146
+
147
+ # Get the concatenated values of the subfields in the order they appear in the field
148
+ # @param [String] joiner The string used to join the subfield values
149
+ def value joiner=' '
150
+ data = self.getSubfields.map {|s| s.data}
151
+ return data.join(joiner)
152
+ end
153
+ end
154
+
155
+ class SubField
156
+
157
+ alias_method :oldinit, :initialize
158
+ def initialize code=nil, data=nil
159
+ if code
160
+ code = code[0].ord
161
+ if data
162
+ self.oldinit(code, data)
163
+ else
164
+ self.oldinit(code)
165
+ end
166
+ else
167
+ self.oldinit
168
+ end
169
+ end
170
+
171
+ def == other
172
+ return ((self.code == other.code) and (self.data == other.data))
173
+ end
174
+
175
+ def value
176
+ return self.data
177
+ end
178
+
179
+ def value= str
180
+ self.data = str
181
+ end
182
+
183
+ def code
184
+ return self.getCode.chr
185
+ end
186
+
187
+ def code= str
188
+ self.setCode str[0].ord
189
+ end
190
+
191
+ def to_s
192
+ return '$' + self.code + " " + self.data
193
+ end
194
+ end
195
+ end
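The `subs` and `sub_values` methods above select subfields by iterating over the field, so results come back in field order regardless of the order of the requested codes. The sketch below reproduces that behavior in plain Ruby so it can run without JRuby; `SketchSubfield` and `SketchField` are hypothetical stand-ins, not the gem's classes, which wrap marc4j's Java objects.

```ruby
# Hypothetical stand-ins; the real classes wrap marc4j's Java objects.
SketchSubfield = Struct.new(:code, :value)

class SketchField
  include Enumerable

  def initialize(subfields)
    @subfields = subfields
  end

  # Iterate over the subfields in the order they were added.
  def each(&block)
    @subfields.each(&block)
  end

  # All subfields, or only those whose code is in the given code(s).
  def subs(code = nil)
    return to_a if code.nil?
    wanted = Array(code)          # accept a single code or an array of codes
    select { |s| wanted.include?(s.code) }
  end

  # Values of the matching subfields, still in field order.
  def sub_values(code = nil)
    subs(code).map(&:value)
  end
end

field = SketchField.new([
  SketchSubfield.new('a', 'New York,'),
  SketchSubfield.new('b', 'Van Nostrand Reinhold Co.'),
  SketchSubfield.new('c', '1969')
])
in_field_order = field.sub_values(%w[c a])
```

Because the selection filters the field rather than looking up each code in turn, asking for `c` before `a` still returns the `a` value first.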