milkfarm-onix 0.7.7 → 0.8.3

Files changed (64)
  1. data/CHANGELOG +31 -1
  2. data/README.markdown +24 -2
  3. data/lib/onix.rb +3 -16
  4. data/lib/onix/addressee_identifier.rb +0 -1
  5. data/lib/onix/apa_product.rb +19 -2
  6. data/lib/onix/audience_range.rb +6 -7
  7. data/lib/onix/contributor.rb +0 -1
  8. data/lib/onix/header.rb +0 -1
  9. data/lib/onix/imprint.rb +0 -1
  10. data/lib/onix/language.rb +0 -1
  11. data/lib/onix/lists/contributor_role.rb +99 -0
  12. data/lib/onix/market_representation.rb +0 -1
  13. data/lib/onix/measure.rb +0 -1
  14. data/lib/onix/media_file.rb +0 -1
  15. data/lib/onix/normaliser.rb +0 -77
  16. data/lib/onix/other_text.rb +0 -1
  17. data/lib/onix/price.rb +0 -1
  18. data/lib/onix/product.rb +3 -2
  19. data/lib/onix/product_identifier.rb +0 -1
  20. data/lib/onix/publisher.rb +0 -1
  21. data/lib/onix/reader.rb +39 -96
  22. data/lib/onix/sales_restriction.rb +0 -1
  23. data/lib/onix/sender_identifier.rb +0 -1
  24. data/lib/onix/series.rb +8 -8
  25. data/lib/onix/series_identifier.rb +0 -1
  26. data/lib/onix/set.rb +76 -0
  27. data/lib/onix/sl_product.rb +32 -45
  28. data/lib/onix/stock.rb +0 -1
  29. data/lib/onix/subject.rb +0 -1
  30. data/lib/onix/supply_detail.rb +0 -1
  31. data/lib/onix/title.rb +0 -1
  32. data/lib/onix/website.rb +0 -1
  33. data/lib/onix/writer.rb +1 -1
  34. data/spec/apa_product_spec.rb +44 -4
  35. data/spec/audience_range_spec.rb +5 -7
  36. data/spec/contributor_spec.rb +2 -4
  37. data/spec/header_spec.rb +2 -4
  38. data/spec/imprint_spec.rb +4 -6
  39. data/spec/language_spec.rb +5 -7
  40. data/spec/market_representation_spec.rb +2 -4
  41. data/spec/measure_spec.rb +2 -4
  42. data/spec/media_file_spec.rb +2 -4
  43. data/spec/normaliser_spec.rb +1 -78
  44. data/spec/other_text_spec.rb +2 -4
  45. data/spec/price_spec.rb +2 -4
  46. data/spec/product_identifier_spec.rb +2 -4
  47. data/spec/product_spec.rb +2 -5
  48. data/spec/publisher_spec.rb +2 -4
  49. data/spec/reader_spec.rb +15 -22
  50. data/spec/sales_restriction_spec.rb +2 -4
  51. data/spec/sender_identifier.rb +2 -4
  52. data/spec/series_identifier_spec.rb +5 -7
  53. data/spec/series_spec.rb +4 -6
  54. data/spec/set_spec.rb +32 -0
  55. data/spec/sl_product_spec.rb +76 -24
  56. data/spec/spec_helper.rb +8 -0
  57. data/spec/stock_spec.rb +2 -4
  58. data/spec/subject_spec.rb +2 -4
  59. data/spec/supply_detail_spec.rb +2 -4
  60. data/spec/title_spec.rb +2 -4
  61. data/spec/website_spec.rb +2 -4
  62. data/spec/writer_spec.rb +1 -4
  63. metadata +64 -28
  64. data/lib/onix/common.rb +0 -26
data/CHANGELOG CHANGED
@@ -1,3 +1,33 @@
1
+ v0.8.2 (6th May 2010)
2
+ - fix APAProduct#series and APAProduct#series=
3
+
4
+ v0.8.1 (5th January 2010)
5
+ - Use nokogiri's support for transparent entity conversion when reading an ONIX file
6
+ - Removed entity replacement from ONIX::Normaliser
7
+ - the external dependency on sed made me uncomfortable, and it wasn't really
8
+ necessary now that nokogiri can do it for us
9
+ - Removed utf-8 normalisation from ONIX::Normaliser
10
+ - nokogiri also handles this really cleanly and transparently. Regardless of
11
+ the source file encoding, Nokogiri::Reader returns utf-8 encoded data
12
+ - Add the release attribute to files we generate
13
+ - it's optional in 2.1, but mandatory in 3.0. As we start to see 3.0 files in the
14
+ wild it will help to have a rapid way to distinguish between them
15
+ - Add ONIX::Reader#release - to detect the release version of files we read in
16
+
17
+ v0.8.0 (31st October 2009)
18
+ - Replace LibXML dependency with Nokogiri. Nokogiri is under active development, has
19
+ a responsive maintainer and is significantly more stable
20
+ - Switch to ROXML 3.x
21
+ - roxml also switched from libxml to nokogiri
22
+ - roxml removed deprecated parts of it's API
23
+ - should now avoid various conflicts with mongrel
24
+ - Ensure APAProduct#price returns the first product price and ignores
25
+ the price type
26
+
27
+ v0.7.8 (19th October 2009)
28
+ - add support for additional elements (mostly series and audience related)
29
+ - thanks tim
30
+
1
31
  v0.7.7 (1st October 2009)
2
32
  - optimise sed usage in ONIX::Normaliser. *huge* speed improvement on
3
33
  large files.
@@ -29,7 +59,7 @@ v0.7.1 (24th June 2009)
29
59
 
30
60
  v0.7.0 (17th June 2009)
31
61
  - try using LibXML for reader again
32
- - retrieving the ONIX version of the input file is currently disabled, as
62
+ - retrieving the ONIX version of the input file is currently disabled, as
33
63
  that seems to be the source of our instability
34
64
  - Various Ruby 1.9 compatability tweaks
35
65
  - add source file coding declarations. All source files are UTF-8
data/README.markdown CHANGED
@@ -10,6 +10,29 @@ and writing ONIX files in your ruby applications.
10
10
  This replaces the obsolete rbook-onix gem that was spectacular in its crapness.
11
11
  Let us never speak of it again.
12
12
 
13
+ ## Feature Support
14
+
15
+ This library currently only handles ONIX 2.1 files (all revisions). At some
16
+ point I'll need to work out what to do about supporting ONIX 3.0 files. I
17
+ suspect a separate library will be the simplest solution.
18
+
19
+ ONIX::Reader only handles the reference tag versions of ONIX 2.1. Use
20
+ ONIX::Normaliser to convert any short tag files to reference tags.
21
+
22
+ ONIX::Writer only generates reference tag ONIX files.
23
+
24
+ ## DTD Loading
25
+
26
+ To correctly handle named entities when reading an ONIX file, this gem attempts
27
+ to load the DTD describing the ONIX format into memory. By default, this means
28
+ each file you read will require several hundred Kb of data to be downloaded
29
+ over the net.
30
+
31
+ This is obviously not desirable in most cases. To avoid it, you need to add copies
32
+ of the ONIX DTDs into your system XML catalog. On Debian and Ubuntu systems,
33
+ the quickest way to do that is to build and install the package available @
34
+ http://github.com/yob/onix-dtd
35
+
13
36
  ## Installation
14
37
 
15
38
  gem install onix
@@ -36,5 +59,4 @@ To be honest, I'm not really expecting any, this is a niche library.
36
59
  ## Further Reading
37
60
 
38
61
  - The source: [http://github.com/yob/onix/tree/master](http://github.com/yob/onix/tree/master)
39
- - Rubyforge project: [http://rubyforge.org/projects/rbook/](http://rubyforge.org/projects/rbook/)
40
- - The official specs [http://www.editeur.org/onix.html](http://www.editeur.org/onix.html)
62
+ - The official specs [http://www.editeur.org/8/ONIX/](http://www.editeur.org/8/ONIX/)
data/lib/onix.rb CHANGED
@@ -1,22 +1,15 @@
1
1
  # coding: utf-8
2
2
 
3
- require 'rubygems'
4
3
  require 'bigdecimal'
5
4
  require 'cgi'
6
-
7
- # ensure we load the correct gem versions
8
- gem 'roxml', '2.5.3'
9
- gem 'andand'
10
-
11
- # and now load the actual gems
12
5
  require 'roxml'
13
6
  require 'andand'
14
7
 
15
8
  module ONIX
16
9
  module Version #:nodoc:
17
10
  Major = 0
18
- Minor = 7
19
- Tiny = 7
11
+ Minor = 8
12
+ Tiny = 3
20
13
 
21
14
  String = [Major, Minor, Tiny].join('.')
22
15
  end
@@ -60,13 +53,6 @@ module ONIX
60
53
  end
61
54
  end
62
55
 
63
- # silence some warnings from ROXML
64
- unless ROXML.const_defined?("SILENCE_XML_NAME_WARNING")
65
- ROXML::SILENCE_XML_NAME_WARNING = true
66
- end
67
-
68
- require File.join(File.dirname(__FILE__), "onix", "common")
69
-
70
56
  # core files
71
57
  # - ordering is important, classes need to be defined before any
72
58
  # other class can use them
@@ -76,6 +62,7 @@ require File.join(File.dirname(__FILE__), "onix", "header")
76
62
  require File.join(File.dirname(__FILE__), "onix", "product_identifier")
77
63
  require File.join(File.dirname(__FILE__), "onix", "series_identifier")
78
64
  require File.join(File.dirname(__FILE__), "onix", "series")
65
+ require File.join(File.dirname(__FILE__), "onix", "set")
79
66
  require File.join(File.dirname(__FILE__), "onix", "title")
80
67
  require File.join(File.dirname(__FILE__), "onix", "website")
81
68
  require File.join(File.dirname(__FILE__), "onix", "contributor")
data/lib/onix/addressee_identifier.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class AddresseeIdentifier
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_accessor :addressee_id_type, :from => "AddresseeIDType", :as => Fixnum # should be a 2 digit num
9
8
  xml_accessor :id_type_name, :from => "IDTypeName"
data/lib/onix/apa_product.rb CHANGED
@@ -6,7 +6,6 @@ module ONIX
6
6
  delegate :record_reference, :record_reference=
7
7
  delegate :notification_type, :notification_type=
8
8
  delegate :product_form, :product_form=
9
- delegate :series, :series=
10
9
  delegate :edition, :edition=
11
10
  delegate :number_of_pages, :number_of_pages=
12
11
  delegate :bic_main_subject, :bic_main_subject=
@@ -107,6 +106,20 @@ module ONIX
107
106
  composite.subtitle = str
108
107
  end
109
108
 
109
+ def series
110
+ composite = product.series.first
111
+ composite.andand.title_of_series
112
+ end
113
+
114
+ def series=(val)
115
+ composite = product.series.first
116
+ if composite.nil?
117
+ composite = ONIX::Series.new
118
+ product.series << composite
119
+ end
120
+ composite.title_of_series = val.to_s
121
+ end
122
+
110
123
  # retrieve the current publisher website for this particular product
111
124
  def publisher_website
112
125
  website(2).andand.website_link
@@ -651,7 +664,11 @@ module ONIX
651
664
  # retrieve the value of a particular price
652
665
  def price_get(type)
653
666
  supply = find_or_create_supply_detail
654
- supply.prices.find { |p| p.price_type_code == type }
667
+ if type.nil?
668
+ supply.prices.first
669
+ else
670
+ supply.prices.find { |p| p.price_type_code == type }
671
+ end
655
672
  end
656
673
 
657
674
  # set the value of a particular price
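Note on the two hunks above: `series` and `series=` are no longer delegated but work on the first series composite, and `price_get` returns the first price when the type is `nil`. The following is only a standalone sketch of that behaviour. `SketchSeries`, `SketchPrice`, `SketchProduct` and `SketchAPAProduct` are hypothetical stand-ins, not the gem's ROXML-backed classes. A plain `&&` replaces `andand`, and prices are read straight from the product rather than through `find_or_create_supply_detail`.

```ruby
# Hypothetical stand-ins for ONIX::Series, ONIX::Price and the wrapped product.
# The real classes are ROXML-backed and are not reproduced here.
SketchSeries  = Struct.new(:title_of_series)
SketchPrice   = Struct.new(:price_type_code, :price_amount)
SketchProduct = Struct.new(:series, :prices)

class SketchAPAProduct
  attr_reader :product

  def initialize(product)
    @product = product
  end

  # Title of the first series composite, or nil when there is none.
  def series
    composite = product.series.first
    composite && composite.title_of_series
  end

  # Reuse the first series composite, creating it on demand.
  def series=(val)
    composite = product.series.first
    if composite.nil?
      composite = SketchSeries.new
      product.series << composite
    end
    composite.title_of_series = val.to_s
  end

  # With a nil type, return the first price; otherwise match on the type code.
  def price_get(type)
    if type.nil?
      product.prices.first
    else
      product.prices.find { |p| p.price_type_code == type }
    end
  end
end

prices = [SketchPrice.new(2, 19.95), SketchPrice.new(1, 17.50)]
apa = SketchAPAProduct.new(SketchProduct.new([], prices))
apa.series = "Example Series"
puts apa.series
puts apa.price_get(nil).price_amount
puts apa.price_get(1).price_amount
```

In this sketch, calling `series=` a second time updates the existing composite instead of appending another one, mirroring the find-or-create shape of the added method.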
data/lib/onix/audience_range.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class AudienceRange
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "AudienceRange"
9
8
 
@@ -11,12 +10,12 @@ module ONIX
11
10
  xml_accessor :audience_range_precisions, :from => "AudienceRangePrecision", :as => [Fixnum], :to_xml => [ONIX::Formatters.two_digit] # TODO: two_digit isn't working on the array items
12
11
  xml_accessor :audience_range_values, :from => "AudienceRangeValue", :as => [Integer]
13
12
 
14
- # TODO: element AudienceRange: validity error :
15
- # Element AudienceRange content does not follow the DTD, expecting
16
- # (AudienceRangeQualifier , AudienceRangePrecision , AudienceRangeValue ,
17
- # (AudienceRangePrecision , AudienceRangeValue)?),
18
- # got
19
- # (AudienceRangeQualifier AudienceRangePrecision AudienceRangePrecision
13
+ # TODO: element AudienceRange: validity error :
14
+ # Element AudienceRange content does not follow the DTD, expecting
15
+ # (AudienceRangeQualifier , AudienceRangePrecision , AudienceRangeValue ,
16
+ # (AudienceRangePrecision , AudienceRangeValue)?),
17
+ # got
18
+ # (AudienceRangeQualifier AudienceRangePrecision AudienceRangePrecision
20
19
  # AudienceRangeValue AudienceRangeValue )
21
20
  def initialize
22
21
  self.audience_range_precisions = []
data/lib/onix/contributor.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class Contributor
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "Contributor"
9
8
 
data/lib/onix/header.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class Header
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "Header"
9
8
 
data/lib/onix/imprint.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class Imprint
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "Imprint"
9
8
 
data/lib/onix/language.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class Language
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "Language"
9
8
 
data/lib/onix/lists/contributor_role.rb ADDED
@@ -0,0 +1,99 @@
1
+ # coding: utf-8
2
+
3
+ module ONIX
4
+ module Lists
5
+ # Code list 17
6
+ CONTRIBUTOR_ROLE = {
7
+ "A01" => "By (author)",
8
+ "A02" => "With",
9
+ "A03" => "Screenplay by",
10
+ "A04" => "Libretto by",
11
+ "A05" => "Lyrics by",
12
+ "A06" => "By (composer)",
13
+ "A07" => "By (artist)",
14
+ "A08" => "By (photographer)",
15
+ "A09" => "Created by",
16
+ "A10" => "From an idea by",
17
+ "A11" => "Designed by",
18
+ "A12" => "Illustrated by",
19
+ "A13" => "Photographs by",
20
+ "A14" => "Text by",
21
+ "A15" => "Preface by",
22
+ "A16" => "Prologue by",
23
+ "A17" => "Summary by",
24
+ "A18" => "Supplement by",
25
+ "A19" => "Afterword by",
26
+ "A20" => "Notes by",
27
+ "A21" => "Commentaries by",
28
+ "A22" => "Epilogue by",
29
+ "A23" => "Foreword by",
30
+ "A24" => "Introduction by",
31
+ "A25" => "Footnotes by",
32
+ "A26" => "Memoir by",
33
+ "A27" => "Experiments by",
34
+ "A29" => "Introduction and notes by",
35
+ "A30" => "Software written by",
36
+ "A31" => "Book and lyrics by",
37
+ "A32" => "Contributions by",
38
+ "A33" => "Appendix by",
39
+ "A34" => "Index by",
40
+ "A35" => "Drawings by",
41
+ "A36" => "Cover design or artwork by",
42
+ "A37" => "Preliminary work by",
43
+ "A38" => "Original author",
44
+ "A39" => "Maps by",
45
+ "A40" => "Inked or colored by",
46
+ "A41" => "Pop-ups by",
47
+ "A42" => "Continued by",
48
+ "A43" => "Interviewer",
49
+ "A44" => "Interviewee",
50
+ "A99" => "Other primary creator",
51
+ "B01" => "Edited by",
52
+ "B02" => "Revised by",
53
+ "B03" => "Retold by",
54
+ "B04" => "Abridged by",
55
+ "B05" => "Adapted by",
56
+ "B06" => "Translated by",
57
+ "B07" => "As told by",
58
+ "B08" => "Translated with commentary by",
59
+ "B09" => "Series edited by",
60
+ "B10" => "Edited and translated by",
61
+ "B11" => "Editor-in-chief",
62
+ "B12" => "Guest editor",
63
+ "B13" => "Volume editor",
64
+ "B14" => "Editorial board member",
65
+ "B15" => "Editorial coordination by",
66
+ "B16" => "Managing editor",
67
+ "B17" => "Founded by",
68
+ "B18" => "Prepared for publication by",
69
+ "B19" => "Associate editor",
70
+ "B20" => "Consultant editor",
71
+ "B21" => "General editor",
72
+ "B22" => "Dramatized by",
73
+ "B23" => "General rapporteur",
74
+ "B24" => "Literary editor",
75
+ "B25" => "Arranged by (music)",
76
+ "B99" => "Other adaptation by",
77
+ "C01" => "Compiled by",
78
+ "C02" => "Selected by",
79
+ "C99" => "Other compilation by",
80
+ "D01" => "Producer",
81
+ "D02" => "Director",
82
+ "D03" => "Conductor",
83
+ "D99" => "Other direction by",
84
+ "E01" => "Actor",
85
+ "E02" => "Dancer",
86
+ "E03" => "Narrator",
87
+ "E04" => "Commentator",
88
+ "E05" => "Vocal soloist",
89
+ "E06" => "Instrumental soloist",
90
+ "E07" => 'Read by',
91
+ "E08" => "Performed by (orchestra, band, ensemble)",
92
+ "E99" => "Performed by",
93
+ "F01" => "Filmed/photographed by",
94
+ "F99" => "Other recording by",
95
+ "Z01" => "Assisted by",
96
+ "Z99" => "Other",
97
+ }
98
+ end
99
+ end
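Note on the file above: the new constant is a plain Hash from ONIX code list 17 codes to descriptions, so a lookup is an ordinary hash access. The sketch below copies three entries under a hypothetical module name (`SketchLists`) so that it runs without the gem installed. `describe_role` is also a hypothetical helper, not something the gem provides, and the `.freeze` call is an addition of this sketch.

```ruby
# Excerpt of the table added above, copied under a hypothetical module name
# so the sketch runs without the gem installed.
module SketchLists
  CONTRIBUTOR_ROLE = {
    "A01" => "By (author)",
    "A12" => "Illustrated by",
    "B06" => "Translated by",
  }.freeze
end

# Hypothetical helper: describe a role code, falling back to the raw code
# when it is not in the table.
def describe_role(code)
  SketchLists::CONTRIBUTOR_ROLE.fetch(code.to_s.upcase, code.to_s)
end

puts describe_role("A01")
puts describe_role("b06")
puts describe_role("X00")
```

With the gem loaded, the equivalent direct access would presumably be `ONIX::Lists::CONTRIBUTOR_ROLE["A01"]`, going by the module nesting shown in the diff.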
data/lib/onix/market_representation.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class MarketRepresentation
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "MarketRepresentation"
9
8
 
data/lib/onix/measure.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class Measure
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "Measure"
9
8
 
data/lib/onix/media_file.rb CHANGED
@@ -3,7 +3,6 @@
3
3
  module ONIX
4
4
  class MediaFile
5
5
  include ROXML
6
- include ONIX::Common
7
6
 
8
7
  xml_name "MediaFile"
9
8
 
data/lib/onix/normaliser.rb CHANGED
@@ -40,9 +40,6 @@ module ONIX
40
40
  raise ArgumentError, "#{oldfile} does not exist" unless File.file?(oldfile)
41
41
  raise ArgumentError, "#{newfile} already exists" if File.file?(newfile)
42
42
  raise "xsltproc app not found" unless app_available?("xsltproc")
43
- raise "isutf8 app not found" unless app_available?("isutf8")
44
- raise "iconv app not found" unless app_available?("iconv")
45
- raise "sed app not found" unless app_available?("sed")
46
43
  raise "tr app not found" unless app_available?("tr")
47
44
 
48
45
  @oldfile = oldfile
@@ -60,21 +57,11 @@ module ONIX
60
57
  @curfile = dest
61
58
  end
62
59
 
63
- # convert to utf8
64
- dest = next_tempfile
65
- to_utf8(@curfile, dest)
66
- @curfile = dest
67
-
68
60
  # remove control chars
69
61
  dest = next_tempfile
70
62
  remove_control_chars(@curfile, dest)
71
63
  @curfile = dest
72
64
 
73
- # remove entities
74
- dest = next_tempfile
75
- replace_named_entities(@curfile, dest)
76
- @curfile = dest
77
-
78
65
  FileUtils.cp(@curfile, @newfile)
79
66
  end
80
67
 
@@ -110,41 +97,6 @@ module ONIX
110
97
  `xsltproc -o #{outpath} #{xsltpath} #{inpath}`
111
98
  end
112
99
 
113
- # ensure the file is valid utf8, then make sure it's declared as such.
114
- #
115
- # The following behaviour is expected:
116
- #
117
- # file is valid utf8, is marked correctly
118
- # - copied untouched
119
- # file is valid utf8, is marked incorrectly or has no marked encoding
120
- # - copied and encoding mark fixed or added
121
- # file is no utf8, encoding is marked
122
- # - file is converted to utf8 and enecoding mark is updated
123
- # file is not utf8, encoding is not marked
124
- # - file is copied untouched
125
- #
126
- def to_utf8(src, dest)
127
- inpath = File.expand_path(src)
128
- outpath = File.expand_path(dest)
129
-
130
- m, src_enc = *@head.match(/encoding=.([a-zA-Z0-9\-]+)./i)
131
-
132
- # ensure the file is actually utf8
133
- if `isutf8 #{inpath}`.strip == ""
134
- if src_enc.to_s.downcase == "utf-8"
135
- FileUtils.cp(inpath, outpath)
136
- else
137
- FileUtils.cp(inpath, outpath)
138
- `sed -i 's/<?xml.*?>/<?xml version=\"1.0\" encoding=\"UTF-8\"?>/g' #{outpath}`
139
- end
140
- elsif src_enc
141
- `iconv --from-code=#{src_enc} --to-code=UTF-8 #{inpath} > #{outpath}`
142
- `sed -i 's/#{src_enc}/UTF-8/' #{outpath}`
143
- else
144
- FileUtils.cp(inpath, outpath)
145
- end
146
- end
147
-
148
100
  # XML files shouldn't contain low ASCII control chars. Strip them.
149
101
  #
150
102
  def remove_control_chars(src, dest)
@@ -153,35 +105,6 @@ module ONIX
153
105
  `cat #{inpath} | tr -d "\\000-\\010\\013\\014\\016-\\037" > #{outpath}`
154
106
  end
155
107
 
156
- # replace all named entities in the specified file with
157
- # numeric entities.
158
- #
159
- def replace_named_entities(src, dest)
160
- inpath = File.expand_path(src)
161
- outpath = File.expand_path(dest)
162
-
163
- cmd = "sed " + entity_map.map do |named, numeric|
164
- "-e 's/\\&#{named};/\\&#{numeric};/g'"
165
- end.join(" ") + " #{inpath} > #{outpath}"
166
- #raise cmd
167
- `#{cmd}`
168
- end
169
-
170
- # return a named entity to numeric entity mapping, build by extracting
171
- # data from the ONIX DTD
172
- #
173
- def entity_map
174
- return @map if @map
175
-
176
- path = File.dirname(__FILE__) + "/../../support/entities.txt"
177
- @map = {}
178
- File.read(path).split.each do |line|
179
- elements = line.split(":")
180
- @map[elements.first] = elements.last
181
- end
182
- @map
183
- end
184
-
185
108
  end
186
109
 
187
110
  end
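Note on the normaliser diff above: after this change the only remaining cleanup step shells out to `tr` to delete low ASCII control characters. For comparison, the same character set can be removed in pure Ruby with `String#delete`. This is a sketch of an alternative technique, not what the gem does. `strip_control_chars` is a hypothetical helper that works on an in-memory string instead of files and assumes the input is validly encoded text.

```ruby
# Characters removed by the tr call above: 0x00-0x08, 0x0B, 0x0C, 0x0E-0x1F.
# Tab (0x09), line feed (0x0A) and carriage return (0x0D) are kept.
CONTROL_CHARS = "\u0000-\u0008\u000B\u000C\u000E-\u001F"

# Hypothetical helper, not part of the gem: return a copy of the text
# with the control characters listed above removed.
def strip_control_chars(text)
  text.delete(CONTROL_CHARS)
end

cleaned = strip_control_chars("<Title>Bad\u0001 byte</Title>\n")
puts cleaned
```

Working on a string keeps the example short. For large ONIX files, processing line by line would avoid loading the whole file into memory.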