milkfarm-onix 0.7.7 → 0.8.3
- data/CHANGELOG +31 -1
- data/README.markdown +24 -2
- data/lib/onix.rb +3 -16
- data/lib/onix/addressee_identifier.rb +0 -1
- data/lib/onix/apa_product.rb +19 -2
- data/lib/onix/audience_range.rb +6 -7
- data/lib/onix/contributor.rb +0 -1
- data/lib/onix/header.rb +0 -1
- data/lib/onix/imprint.rb +0 -1
- data/lib/onix/language.rb +0 -1
- data/lib/onix/lists/contributor_role.rb +99 -0
- data/lib/onix/market_representation.rb +0 -1
- data/lib/onix/measure.rb +0 -1
- data/lib/onix/media_file.rb +0 -1
- data/lib/onix/normaliser.rb +0 -77
- data/lib/onix/other_text.rb +0 -1
- data/lib/onix/price.rb +0 -1
- data/lib/onix/product.rb +3 -2
- data/lib/onix/product_identifier.rb +0 -1
- data/lib/onix/publisher.rb +0 -1
- data/lib/onix/reader.rb +39 -96
- data/lib/onix/sales_restriction.rb +0 -1
- data/lib/onix/sender_identifier.rb +0 -1
- data/lib/onix/series.rb +8 -8
- data/lib/onix/series_identifier.rb +0 -1
- data/lib/onix/set.rb +76 -0
- data/lib/onix/sl_product.rb +32 -45
- data/lib/onix/stock.rb +0 -1
- data/lib/onix/subject.rb +0 -1
- data/lib/onix/supply_detail.rb +0 -1
- data/lib/onix/title.rb +0 -1
- data/lib/onix/website.rb +0 -1
- data/lib/onix/writer.rb +1 -1
- data/spec/apa_product_spec.rb +44 -4
- data/spec/audience_range_spec.rb +5 -7
- data/spec/contributor_spec.rb +2 -4
- data/spec/header_spec.rb +2 -4
- data/spec/imprint_spec.rb +4 -6
- data/spec/language_spec.rb +5 -7
- data/spec/market_representation_spec.rb +2 -4
- data/spec/measure_spec.rb +2 -4
- data/spec/media_file_spec.rb +2 -4
- data/spec/normaliser_spec.rb +1 -78
- data/spec/other_text_spec.rb +2 -4
- data/spec/price_spec.rb +2 -4
- data/spec/product_identifier_spec.rb +2 -4
- data/spec/product_spec.rb +2 -5
- data/spec/publisher_spec.rb +2 -4
- data/spec/reader_spec.rb +15 -22
- data/spec/sales_restriction_spec.rb +2 -4
- data/spec/sender_identifier.rb +2 -4
- data/spec/series_identifier_spec.rb +5 -7
- data/spec/series_spec.rb +4 -6
- data/spec/set_spec.rb +32 -0
- data/spec/sl_product_spec.rb +76 -24
- data/spec/spec_helper.rb +8 -0
- data/spec/stock_spec.rb +2 -4
- data/spec/subject_spec.rb +2 -4
- data/spec/supply_detail_spec.rb +2 -4
- data/spec/title_spec.rb +2 -4
- data/spec/website_spec.rb +2 -4
- data/spec/writer_spec.rb +1 -4
- metadata +64 -28
- data/lib/onix/common.rb +0 -26
data/CHANGELOG
CHANGED
@@ -1,3 +1,33 @@
+v0.8.2 (6th May 2010)
+- fix APAProduct#series and APAProduct#series=
+
+v0.8.1 (5th January 2010)
+- Use nokogiri's support for transparent entity conversion when reading an ONIX file
+- Removed entity replacement from ONIX::Normaliser
+  - the external dependency on sed made me uncomfortable, and it wasn't really
+    necessary now that nokogiri can do it for us
+- Removed utf-8 normalisation from ONIX::Normaliser
+  - nokogiri also handles this really cleanly and transparently. Regardless of
+    the source file encoding, Nokogiri::Reader returns utf-8 encoded data
+- Add the release attribute to files we generate
+  - it's optional in 2.1, but mandatory in 3.0. As we start to see 3.0 files in the
+    wild it will help to have a rapid way to distinguish between them
+- Add ONIX::Reader#release - to detect the release version of files we read in
+
+v0.8.0 (31st October 2009)
+- Replace LibXML dependency with Nokogiri. Nokogiri is under active development, has
+  a responsive maintainer and is significantly more stable
+- Switch to ROXML 3.x
+  - roxml also switched from libxml to nokogiri
+  - roxml removed deprecated parts of it's API
+  - should now avoid various conflicts with mongrel
+- Ensure APAProduct#price returns the first product price and ignores
+  the price type
+
+v0.7.8 (19th October 2009)
+- add support for additional elements (mostly series and audience related)
+  - thanks tim
+
 v0.7.7 (1st October 2009)
 - optimise sed usage in ONIX::Normaliser. *huge* speed improvement on
   large files.
@@ -29,7 +59,7 @@ v0.7.1 (24th June 2009)
 
 v0.7.0 (17th June 2009)
 - try using LibXML for reader again
-- retrieving the ONIX version of the input file is currently disabled, as
+- retrieving the ONIX version of the input file is currently disabled, as
   that seems to be the source of our instability
 - Various Ruby 1.9 compatability tweaks
 - add source file coding declarations. All source files are UTF-8
data/README.markdown
CHANGED
@@ -10,6 +10,29 @@ and writing ONIX files in your ruby applications.
 This replaces the obsolete rbook-onix gem that was spectacular in its crapness.
 Let us never speak of it again.
 
+## Feature Support
+
+This library currently only handles ONIX 2.1 files (all revisions). At some
+point I'll need to work out what to do about supporting ONIX 3.0 files. I
+suspect a separate library will be the simplest solution.
+
+ONIX::Reader only handles the reference tag versions of ONIX 2.1. Use
+ONIX::Normaliser to convert any short tag files to reference tags.
+
+ONIX::Writer only generates reference tag ONIX files.
+
+## DTD Loading
+
+To correctly handle named entities when reading an ONIX file, this gem attempts
+to load the DTD describing the ONIX format into memory. By default, this means
+each file you read will require several hundred Kb of data to be downloaded
+over the net.
+
+This is obviously not desirable in most cases. To avoid it, you need to add copies
+of the ONIX DTDs into your system XML catalog. On Debian and Ubuntu systems,
+the quickest way to do that is to build and install the package available @
+http://github.com/yob/onix-dtd
+
 ## Installation
 
     gem install onix
@@ -36,5 +59,4 @@ To be honest, I'm not really expecting any, this is a niche library.
 ## Further Reading
 
 - The source: [http://github.com/yob/onix/tree/master](http://github.com/yob/onix/tree/master)
--
-- The official specs [http://www.editeur.org/onix.html](http://www.editeur.org/onix.html)
+- The official specs [http://www.editeur.org/8/ONIX/](http://www.editeur.org/8/ONIX/)
data/lib/onix.rb
CHANGED
@@ -1,22 +1,15 @@
 # coding: utf-8
 
-require 'rubygems'
 require 'bigdecimal'
 require 'cgi'
-
-# ensure we load the correct gem versions
-gem 'roxml', '2.5.3'
-gem 'andand'
-
-# and now load the actual gems
 require 'roxml'
 require 'andand'
 
 module ONIX
   module Version #:nodoc:
     Major = 0
-    Minor =
-    Tiny =
+    Minor = 8
+    Tiny = 3
 
     String = [Major, Minor, Tiny].join('.')
   end
@@ -60,13 +53,6 @@ module ONIX
   end
 end
 
-# silence some warnings from ROXML
-unless ROXML.const_defined?("SILENCE_XML_NAME_WARNING")
-  ROXML::SILENCE_XML_NAME_WARNING = true
-end
-
-require File.join(File.dirname(__FILE__), "onix", "common")
-
 # core files
 # - ordering is important, classes need to be defined before any
 #   other class can use them
@@ -76,6 +62,7 @@ require File.join(File.dirname(__FILE__), "onix", "header")
 require File.join(File.dirname(__FILE__), "onix", "product_identifier")
 require File.join(File.dirname(__FILE__), "onix", "series_identifier")
 require File.join(File.dirname(__FILE__), "onix", "series")
+require File.join(File.dirname(__FILE__), "onix", "set")
 require File.join(File.dirname(__FILE__), "onix", "title")
 require File.join(File.dirname(__FILE__), "onix", "website")
 require File.join(File.dirname(__FILE__), "onix", "contributor")
data/lib/onix/apa_product.rb
CHANGED
@@ -6,7 +6,6 @@ module ONIX
     delegate :record_reference, :record_reference=
     delegate :notification_type, :notification_type=
     delegate :product_form, :product_form=
-    delegate :series, :series=
     delegate :edition, :edition=
     delegate :number_of_pages, :number_of_pages=
     delegate :bic_main_subject, :bic_main_subject=
@@ -107,6 +106,20 @@ module ONIX
       composite.subtitle = str
     end
 
+    def series
+      composite = product.series.first
+      composite.andand.title_of_series
+    end
+
+    def series=(val)
+      composite = product.series.first
+      if composite.nil?
+        composite = ONIX::Series.new
+        product.series << composite
+      end
+      composite.title_of_series = val.to_s
+    end
+
     # retrieve the current publisher website for this particular product
     def publisher_website
       website(2).andand.website_link
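The new `series` and `series=` methods replace the old `delegate :series, :series=` line with a read/write wrapper around the first series composite. The sketch below shows the same find-or-create pattern in isolation. `Series`, `Product` and `ProductWrapper` are hypothetical stand-ins, not the gem's classes, and the safe navigation operator `&.` is swapped in for the `andand` gem used in the diff.

```ruby
# Hypothetical stand-ins for ONIX::Series and ONIX::Product (assumptions,
# not the gem's real classes).
Series  = Struct.new(:title_of_series)
Product = Struct.new(:series)

class ProductWrapper
  attr_reader :product

  def initialize(product)
    @product = product
  end

  # Title of the first series composite, or nil when there is none.
  # `&.` stands in for the `andand` call in the diff.
  def series
    product.series.first&.title_of_series
  end

  # Set the title on the first series composite, creating it when missing.
  def series=(val)
    composite = product.series.first
    if composite.nil?
      composite = Series.new
      product.series << composite
    end
    composite.title_of_series = val.to_s
  end
end

wrapper = ProductWrapper.new(Product.new([]))
wrapper.series = "Example Series"
puts wrapper.series
```

Assigning twice reuses the existing composite rather than appending a second one, which is why the setter looks up `product.series.first` before creating anything.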
@@ -651,7 +664,11 @@ module ONIX
     # retrieve the value of a particular price
     def price_get(type)
       supply = find_or_create_supply_detail
-
+      if type.nil?
+        supply.prices.first
+      else
+        supply.prices.find { |p| p.price_type_code == type }
+      end
     end
 
     # set the value of a particular price
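The `price_get` change matches the v0.8.0 CHANGELOG entry about returning the first product price when no price type is given. A minimal sketch of that branch follows, assuming a plain array of prices in place of `supply.prices`; the `Price` struct and the standalone `price_get(prices, type)` signature are illustrative only.

```ruby
# Hypothetical stand-in for ONIX::Price (an assumption for illustration).
Price = Struct.new(:price_type_code, :price_amount)

# With a nil type, return the first price regardless of its type code;
# otherwise return the first price whose type code matches, or nil.
def price_get(prices, type)
  if type.nil?
    prices.first
  else
    prices.find { |p| p.price_type_code == type }
  end
end

prices = [Price.new(2, 19.95), Price.new(1, 17.50)]
puts price_get(prices, nil).price_amount
puts price_get(prices, 1).price_amount
```

Both branches return `nil` on an empty list, so callers still need to guard against a missing price.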
data/lib/onix/audience_range.rb
CHANGED
@@ -3,7 +3,6 @@
 module ONIX
   class AudienceRange
     include ROXML
-    include ONIX::Common
 
     xml_name "AudienceRange"
 
@@ -11,12 +10,12 @@ module ONIX
     xml_accessor :audience_range_precisions, :from => "AudienceRangePrecision", :as => [Fixnum], :to_xml => [ONIX::Formatters.two_digit] # TODO: two_digit isn't working on the array items
     xml_accessor :audience_range_values, :from => "AudienceRangeValue", :as => [Integer]
 
-    # TODO: element AudienceRange: validity error :
-    # Element AudienceRange content does not follow the DTD, expecting
-    # (AudienceRangeQualifier , AudienceRangePrecision , AudienceRangeValue ,
-    # (AudienceRangePrecision , AudienceRangeValue)?),
-    # got
-    # (AudienceRangeQualifier AudienceRangePrecision AudienceRangePrecision
+    # TODO: element AudienceRange: validity error :
+    # Element AudienceRange content does not follow the DTD, expecting
+    # (AudienceRangeQualifier , AudienceRangePrecision , AudienceRangeValue ,
+    # (AudienceRangePrecision , AudienceRangeValue)?),
+    # got
+    # (AudienceRangeQualifier AudienceRangePrecision AudienceRangePrecision
     # AudienceRangeValue AudienceRangeValue )
     def initialize
       self.audience_range_precisions = []
data/lib/onix/contributor.rb
CHANGED
data/lib/onix/header.rb
CHANGED
data/lib/onix/imprint.rb
CHANGED
data/lib/onix/language.rb
CHANGED
data/lib/onix/lists/contributor_role.rb
ADDED
@@ -0,0 +1,99 @@
+# coding: utf-8
+
+module ONIX
+  module Lists
+    # Code list 17
+    CONTRIBUTOR_ROLE = {
+      "A01" => "By (author)",
+      "A02" => "With",
+      "A03" => "Screenplay by",
+      "A04" => "Libretto by",
+      "A05" => "Lyrics by",
+      "A06" => "By (composer)",
+      "A07" => "By (artist)",
+      "A08" => "By (photographer)",
+      "A09" => "Created by",
+      "A10" => "From an idea by",
+      "A11" => "Designed by",
+      "A12" => "Illustrated by",
+      "A13" => "Photographs by",
+      "A14" => "Text by",
+      "A15" => "Preface by",
+      "A16" => "Prologue by",
+      "A17" => "Summary by",
+      "A18" => "Supplement by",
+      "A19" => "Afterword by",
+      "A20" => "Notes by",
+      "A21" => "Commentaries by",
+      "A22" => "Epilogue by",
+      "A23" => "Foreword by",
+      "A24" => "Introduction by",
+      "A25" => "Footnotes by",
+      "A26" => "Memoir by",
+      "A27" => "Experiments by",
+      "A29" => "Introduction and notes by",
+      "A30" => "Software written by",
+      "A31" => "Book and lyrics by",
+      "A32" => "Contributions by",
+      "A33" => "Appendix by",
+      "A34" => "Index by",
+      "A35" => "Drawings by",
+      "A36" => "Cover design or artwork by",
+      "A37" => "Preliminary work by",
+      "A38" => "Original author",
+      "A39" => "Maps by",
+      "A40" => "Inked or colored by",
+      "A41" => "Pop-ups by",
+      "A42" => "Continued by",
+      "A43" => "Interviewer",
+      "A44" => "Interviewee",
+      "A99" => "Other primary creator",
+      "B01" => "Edited by",
+      "B02" => "Revised by",
+      "B03" => "Retold by",
+      "B04" => "Abridged by",
+      "B05" => "Adapted by",
+      "B06" => "Translated by",
+      "B07" => "As told by",
+      "B08" => "Translated with commentary by",
+      "B09" => "Series edited by",
+      "B10" => "Edited and translated by",
+      "B11" => "Editor-in-chief",
+      "B12" => "Guest editor",
+      "B13" => "Volume editor",
+      "B14" => "Editorial board member",
+      "B15" => "Editorial coordination by",
+      "B16" => "Managing editor",
+      "B17" => "Founded by",
+      "B18" => "Prepared for publication by",
+      "B19" => "Associate editor",
+      "B20" => "Consultant editor",
+      "B21" => "General editor",
+      "B22" => "Dramatized by",
+      "B23" => "General rapporteur",
+      "B24" => "Literary editor",
+      "B25" => "Arranged by (music)",
+      "B99" => "Other adaptation by",
+      "C01" => "Compiled by",
+      "C02" => "Selected by",
+      "C99" => "Other compilation by",
+      "D01" => "Producer",
+      "D02" => "Director",
+      "D03" => "Conductor",
+      "D99" => "Other direction by",
+      "E01" => "Actor",
+      "E02" => "Dancer",
+      "E03" => "Narrator",
+      "E04" => "Commentator",
+      "E05" => "Vocal soloist",
+      "E06" => "Instrumental soloist",
+      "E07" => 'Read by',
+      "E08" => "Performed by (orchestra, band, ensemble)",
+      "E99" => "Performed by",
+      "F01" => "Filmed/photographed by",
+      "F99" => "Other recording by",
+      "Z01" => "Assisted by",
+      "Z99" => "Other",
+    }
+  end
+end
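The new `CONTRIBUTOR_ROLE` hash maps ONIX code list 17 values to display labels. The sketch below shows one way such a table might be consulted. It copies only three entries from the diff, and the `role_label` helper with its fallback text is hypothetical; the gem itself may expose the list differently.

```ruby
module Lists
  # A three-entry excerpt of the table added in the diff, frozen so callers
  # cannot mutate the shared constant.
  CONTRIBUTOR_ROLE = {
    "A01" => "By (author)",
    "B06" => "Translated by",
    "E07" => "Read by",
  }.freeze
end

# Hypothetical helper: look up a role code, falling back to a placeholder
# for codes the table does not define.
def role_label(code)
  Lists::CONTRIBUTOR_ROLE.fetch(code) { "Unknown role (#{code})" }
end

puts role_label("B06")
puts role_label("A28")
```

A fallback is worth having because the table is not contiguous: the diff jumps from `A27` straight to `A29`, so a plain `[]` lookup on `A28` would return `nil`.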
data/lib/onix/measure.rb
CHANGED
data/lib/onix/media_file.rb
CHANGED
data/lib/onix/normaliser.rb
CHANGED
@@ -40,9 +40,6 @@ module ONIX
       raise ArgumentError, "#{oldfile} does not exist" unless File.file?(oldfile)
       raise ArgumentError, "#{newfile} already exists" if File.file?(newfile)
       raise "xsltproc app not found" unless app_available?("xsltproc")
-      raise "isutf8 app not found" unless app_available?("isutf8")
-      raise "iconv app not found" unless app_available?("iconv")
-      raise "sed app not found" unless app_available?("sed")
       raise "tr app not found" unless app_available?("tr")
 
       @oldfile = oldfile
@@ -60,21 +57,11 @@ module ONIX
         @curfile = dest
       end
 
-      # convert to utf8
-      dest = next_tempfile
-      to_utf8(@curfile, dest)
-      @curfile = dest
-
       # remove control chars
       dest = next_tempfile
       remove_control_chars(@curfile, dest)
       @curfile = dest
 
-      # remove entities
-      dest = next_tempfile
-      replace_named_entities(@curfile, dest)
-      @curfile = dest
-
       FileUtils.cp(@curfile, @newfile)
     end
 
@@ -110,41 +97,6 @@ module ONIX
       `xsltproc -o #{outpath} #{xsltpath} #{inpath}`
     end
 
-    # ensure the file is valid utf8, then make sure it's declared as such.
-    #
-    # The following behaviour is expected:
-    #
-    # file is valid utf8, is marked correctly
-    # - copied untouched
-    # file is valid utf8, is marked incorrectly or has no marked encoding
-    # - copied and encoding mark fixed or added
-    # file is no utf8, encoding is marked
-    # - file is converted to utf8 and enecoding mark is updated
-    # file is not utf8, encoding is not marked
-    # - file is copied untouched
-    #
-    def to_utf8(src, dest)
-      inpath = File.expand_path(src)
-      outpath = File.expand_path(dest)
-
-      m, src_enc = *@head.match(/encoding=.([a-zA-Z0-9\-]+)./i)
-
-      # ensure the file is actually utf8
-      if `isutf8 #{inpath}`.strip == ""
-        if src_enc.to_s.downcase == "utf-8"
-          FileUtils.cp(inpath, outpath)
-        else
-          FileUtils.cp(inpath, outpath)
-          `sed -i 's/<?xml.*?>/<?xml version=\"1.0\" encoding=\"UTF-8\"?>/g' #{outpath}`
-        end
-      elsif src_enc
-        `iconv --from-code=#{src_enc} --to-code=UTF-8 #{inpath} > #{outpath}`
-        `sed -i 's/#{src_enc}/UTF-8/' #{outpath}`
-      else
-        FileUtils.cp(inpath, outpath)
-      end
-    end
-
     # XML files shouldn't contain low ASCII control chars. Strip them.
     #
     def remove_control_chars(src, dest)
@@ -153,35 +105,6 @@ module ONIX
       `cat #{inpath} | tr -d "\\000-\\010\\013\\014\\016-\\037" > #{outpath}`
     end
 
-    # replace all named entities in the specified file with
-    # numeric entities.
-    #
-    def replace_named_entities(src, dest)
-      inpath = File.expand_path(src)
-      outpath = File.expand_path(dest)
-
-      cmd = "sed " + entity_map.map do |named, numeric|
-        "-e 's/\\&#{named};/\\&#{numeric};/g'"
-      end.join(" ") + " #{inpath} > #{outpath}"
-      #raise cmd
-      `#{cmd}`
-    end
-
-    # return a named entity to numeric entity mapping, build by extracting
-    # data from the ONIX DTD
-    #
-    def entity_map
-      return @map if @map
-
-      path = File.dirname(__FILE__) + "/../../support/entities.txt"
-      @map = {}
-      File.read(path).split.each do |line|
-        elements = line.split(":")
-        @map[elements.first] = elements.last
-      end
-      @map
-    end
-
   end
 
 end
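After this change the normaliser still shells out to `tr` to strip control characters. For comparison, the sketch below does the same filtering in pure Ruby, assuming the whole file fits in memory as a string. It is not part of the gem; `CONTROL_CHARS` and `strip_control_chars` are hypothetical names. The character class mirrors the octal ranges passed to `tr -d` (000-010, 013, 014, 016-037), so tab, line feed and carriage return are kept.

```ruby
# Same byte ranges as the `tr -d` call, written in hex:
# 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F.
CONTROL_CHARS = /[\x00-\x08\x0B\x0C\x0E-\x1F]/

# Return a copy of text with low ASCII control characters removed,
# leaving tab (0x09), line feed (0x0A) and carriage return (0x0D) intact.
def strip_control_chars(text)
  text.gsub(CONTROL_CHARS, "")
end

dirty = "Title\x00 with\x1F stray\tbytes\n"
puts strip_control_chars(dirty).inspect
```

Streaming through `tr` avoids loading large ONIX files into memory, which may be why the external call was kept when the `sed`, `iconv` and `isutf8` dependencies were dropped.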