pdf-reader 0.6.2 → 0.7

data/CHANGELOG CHANGED
@@ -1,3 +1,14 @@
1
+ v0.7 (6th May 2008)
2
+ - API INCOMPATIBLE CHANGE: any hashes that are passed to callbacks use symbols as keys instead of PDF::Reader::Name instances.
3
+ - Improved support for converting text in some PDF files to unicode
4
+ - Behave as expected if the Contents key in a Page Dict is a reference
5
+ - Include some basic metadata callbacks
6
+ - Don't interpret a comment token (%) inside a string as a comment
7
+ - Small fixes to improve 1.9 compatibility
8
+ - Improved our Zlib deflating to make it slightly more robust - still some more issues to work out though
9
+ - Throw an UnsupportedFeatureError if a pdf that uses XRef streams is opened
10
+ - Added an option to PDF::Reader#file and PDF::Reader#string to enable parsing of only parts of a PDF file(ie. only metadata, etc)
11
+
1
12
  v0.6.2 (22nd March 2008)
2
13
  - Catch low level errors when applying filters to a content stream and raise a MalformedPDFError instead.
3
14
  - Added support for processing inline images
data/README CHANGED
@@ -101,6 +101,29 @@ it through less or to a text file.
101
101
  puts cb
102
102
  end
103
103
 
104
+ == Extract metadata only
105
+
106
+ require 'rubygems'
107
+ require 'pdf/reader'
108
+
109
+ class MetaDataReceiver
110
+ attr_accessor :regular
111
+ attr_accessor :xml
112
+
113
+ def metadata(data)
114
+ @regular = data
115
+ end
116
+
117
+ def metadata_xml(data)
118
+ @xml = data
119
+ end
120
+ end
121
+
122
+ receiver = MetaDataReceiver.new
123
+ pdf = PDF::Reader.file(ARGV.shift, receiver, :pages => false, :metadata => true)
124
+ puts receiver.regular.inspect
125
+ puts receiver.xml.inspect
126
+
104
127
  == Basic RSpec of a generated PDF
105
128
 
106
129
  require 'rubygems'
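A receiver only sees the callbacks it responds to. The content.rb hunks in this diff dispatch with `@receiver.send(name, *params) if @receiver.respond_to?(name)` and use the symbols `:metadata` and `:xml_metadata`, so a receiver method with any other name is silently skipped. The following is a minimal sketch of that behaviour. `SketchDispatcher` and `SketchMetaReceiver` are hypothetical stand-ins written for illustration and are not part of pdf-reader.

```ruby
# Hypothetical stand-in for the library's dispatch step; not the real class.
class SketchDispatcher
  def initialize(receiver)
    @receiver = receiver
  end

  # Mirrors the dispatch line shown in this diff: the receiver is only
  # called when it defines a method with the callback's name.
  def callback(name, params = [])
    @receiver.send(name, *params) if @receiver.respond_to?(name)
  end
end

# Hypothetical receiver whose method names match the dispatched symbols.
class SketchMetaReceiver
  attr_reader :regular, :xml

  def metadata(data)
    @regular = data
  end

  def xml_metadata(data)
    @xml = data
  end
end

receiver = SketchMetaReceiver.new
dispatcher = SketchDispatcher.new(receiver)
dispatcher.callback(:metadata, [{ :Title => "Example" }])
dispatcher.callback(:xml_metadata, ["<x:xmpmeta/>"])
dispatcher.callback(:begin_page, [{}]) # ignored: no begin_page method defined
```

Because unknown callbacks are dropped rather than raising, a misspelled method name shows up only as a `nil` result on the receiver, which is worth checking first when a callback appears not to fire.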
data/Rakefile CHANGED
@@ -6,7 +6,7 @@ require 'rake/testtask'
6
6
  require "rake/gempackagetask"
7
7
  require 'spec/rake/spectask'
8
8
 
9
- PKG_VERSION = "0.6.2"
9
+ PKG_VERSION = "0.7"
10
10
  PKG_NAME = "pdf-reader"
11
11
  PKG_FILE_NAME = "#{PKG_NAME}-#{PKG_VERSION}"
12
12
 
data/TODO CHANGED
@@ -1,24 +1,27 @@
1
- v0.7
2
- - Allow the user to only process certain aspects of the PDF file. For example, if they're only
3
- interested in meta data or bookmarks, there's no point in walking the pages tree.
4
- - maybe a third option to Reader.parse?
5
- parse(io, receiver, {:pages => true, :fonts => false, :metadata => true, :bookmarks => false})
6
- - detect when a font's encoding is a CMap (generally used for pre-Unicode, multibyte asian encodings), and display a user friendly error
7
- - Provide a way to get raw access to a particular object. Good for testing purposes
8
-
9
1
  v0.8
2
+ - Allow more than just page content and metadata to be parsed (see spec section 3.6.1)
3
+ - bookmarks?
4
+ - outline?
5
+ - articles?
6
+ - viewer prefs?
7
+ - Don't remove comment when tokenising in the middle of a string
10
8
  - Tweak encoding mappings to differentiate between bytes that are invalid for an encoding, and bytes that are unchanged.
11
9
  poppler seems to do this in a quite reasonable way. Original Encoding -> Glyph Names -> Unicode. As of 0.6 we go straight
12
10
  from the Original encoding to Unicode.
11
+ - detect when a font's encoding is a CMap (generally used for pre-Unicode, multibyte asian encodings), and display a user friendly error
12
+ - Provide a way to get raw access to a particular object. Good for testing purposes
13
+ - Improve interpretation of non content stream data (ie metadata). Use PDFDocEncoding, recognise UTF16 strings, recognise dates, etc
14
+ - Support Cross Reference Streams (spec 3.4.7)
13
15
 
14
16
  v0.9
15
- - Support for CJK text (convert to UTF-8 like all other encodings. See Section 5.9 of the PDF spec)
16
- - Will require significantly improved handling of CMaps, including creating a bunch of predefined ones
17
17
  - Add a way to extract raster images
18
18
  - see XObjects section of spec (section 4.7)
19
19
  - Add a way to extract font data?
20
20
 
21
21
  Sometime
22
+ - Support for CJK text (convert to UTF-8 like all other encodings. See Section 5.9 of the PDF spec)
23
+ - Will require significantly improved handling of CMaps, including creating a bunch of predefined ones
24
+
22
25
  - Work out why specs/data/zlib*.pdf isn't parsed correctly when all the major PDF viewers can display it correctly
23
26
 
24
27
  - Ship some extra receivers in the standard package, particuarly ones that are useful for running
@@ -27,8 +30,6 @@ Sometime
27
30
  - When we encounter Identity-H encoded text with no ToUnicode CMap, render the glyphs and treat them as images, as there's no
28
31
  sensible way to convert them to unicode
29
32
 
30
- - Improve metadata support
31
-
32
33
  - Add support for additional filters: ASCIIHexDecode, ASCII85Decode, LZWDecode, RunLengthDecode, CCITTFaxDecode, JBIG2Decode, DCTDecode, JPXDecode, Crypt?
33
34
 
34
35
  - Add support for additional encodings:
data/lib/pdf/reader.rb CHANGED
@@ -51,19 +51,35 @@ module PDF
51
51
  #
52
52
  # pdf = PDF::Reader.new
53
53
  # pdf.parse(File.new("somefile.pdf"), receiver)
54
+ #
55
+ # = Parsing parts of a file
56
+ #
57
+ # Both PDF::Reader#file and PDF::Reader#string accept a third argument that specifies which
58
+ # parts of the file to process. By default, all options are enabled, so this can be useful
59
+ # to cut down processing time if you're only interested in say, metadata.
60
+ #
61
+ # As an example, the following call will disable parsing the contents of pages in the file,
62
+ # but explicitly enables processing metadata.
63
+ #
64
+ # PDF::Reader.file("somefile.pdf", receiver, {:metadata => true, :pages => false})
65
+ #
66
+ # Available options are currently:
67
+ #
68
+ # :metadata
69
+ # :pages
54
70
  class Reader
55
71
  ################################################################################
56
72
  # Parse the file with the given name, sending events to the given receiver.
57
- def self.file (name, receiver)
73
+ def self.file (name, receiver, opts = {})
58
74
  File.open(name,"rb") do |f|
59
- new.parse(f, receiver)
75
+ new.parse(f, receiver, opts)
60
76
  end
61
77
  end
62
78
  ################################################################################
63
79
  # Parse the given string, sending events to the given receiver.
64
- def self.string (str, receiver)
80
+ def self.string (str, receiver, opts = {})
65
81
  StringIO.open(str) do |s|
66
- new.parse(s, receiver)
82
+ new.parse(s, receiver, opts)
67
83
  end
68
84
  end
69
85
  ################################################################################
@@ -79,7 +95,6 @@ require 'pdf/reader/encoding'
79
95
  require 'pdf/reader/error'
80
96
  require 'pdf/reader/filter'
81
97
  require 'pdf/reader/font'
82
- require 'pdf/reader/name'
83
98
  require 'pdf/reader/parser'
84
99
  require 'pdf/reader/reference'
85
100
  require 'pdf/reader/register_receiver'
@@ -94,14 +109,19 @@ class PDF::Reader
94
109
  end
95
110
  ################################################################################
96
111
  # Given an IO object that contains PDF data, parse it.
97
- def parse (io, receiver)
112
+ def parse (io, receiver, opts = {})
98
113
  @buffer = Buffer.new(io)
99
114
  @xref = XRef.new(@buffer)
100
115
  @parser = Parser.new(@buffer, @xref)
101
116
  @content = (receiver == Explore ? Explore : Content).new(receiver, @xref)
102
117
 
118
+ options = {:pages => true, :metadata => true}
119
+ options.merge!(opts)
120
+
103
121
  trailer = @xref.load
104
- @content.document(@xref.object(trailer['Root'])) || self
122
+ @content.metadata(@xref.object(trailer[:Info]).first) if options[:metadata]
123
+ @content.document(@xref.object(trailer[:Root]).first) if options[:pages]
124
+ self
105
125
  end
106
126
  ################################################################################
107
127
  end
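The new `parse` starts from a hash of defaults and merges the caller's options over it, so any key the caller leaves out stays enabled. A minimal sketch of that step follows. `sketch_parse_options` is a hypothetical helper written for illustration; the real method performs the merge inline.

```ruby
# Hypothetical helper mirroring the defaults-then-merge step in parse.
def sketch_parse_options(opts = {})
  options = { :pages => true, :metadata => true } # everything on by default
  options.merge!(opts)                            # caller's values win
  options
end

metadata_only = sketch_parse_options(:pages => false)
everything    = sketch_parse_options
```

Under this scheme, passing `:metadata => true` alongside `:pages => false` is redundant but harmless; it only makes the caller's intent explicit.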
@@ -93,7 +93,8 @@ class PDF::Reader
93
93
  def ready_token (with_strip=true, skip_blanks=true)
94
94
  while @buffer.nil? or @buffer.empty?
95
95
  @buffer = @io.readline
96
- @buffer.sub!(/%.*$/, '')
96
+ @buffer.force_encoding("BINARY") if @buffer.respond_to?(:force_encoding)
97
+ #@buffer.sub!(/%.*$/, '') if strip_comments
97
98
  @buffer.chomp!
98
99
  break unless skip_blanks
99
100
  end
@@ -114,7 +115,14 @@ class PDF::Reader
114
115
  end
115
116
 
116
117
  strip_space = !(i == 0 and @buffer[0,1] == '(')
117
- head(token_chars, strip_space)
118
+ tok = head(token_chars, strip_space)
119
+
120
+ if tok[0,1] == "%"
121
+ @buffer = ""
122
+ token
123
+ else
124
+ tok
125
+ end
118
126
  end
119
127
  ################################################################################
120
128
  def head (chars, with_strip=true)
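The two buffer hunks above replace a per-line regular expression with a per-token check. The sketch below illustrates why the old approach was unsafe: stripping everything after `%` on a raw line also truncates a literal string that happens to contain a percent sign. `sketch_comment_token?` is a hypothetical helper that mirrors the `tok[0,1] == "%"` test in the diff.

```ruby
# Old behaviour: strip from the first % to end of line, regardless of context.
line = "(50% off today) Tj"
stripped = line.sub(/%.*$/, "")

# New behaviour (sketched): decide per token, after tokenising.
# Hypothetical helper; the real check lives inline in the token method.
def sketch_comment_token?(tok)
  tok[0, 1] == "%"
end

string_token  = "(50% off today)"
comment_token = "% generated by some tool"
```

Here `stripped` loses most of the string, while the token-level test only treats a token as a comment when the token itself begins with `%`.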
@@ -52,19 +52,19 @@ class PDF::Reader
52
52
 
53
53
  def decode(c)
54
54
  # TODO: implement the conversion
55
- Error.assert_equal(c.class, Fixnum)
55
+ return c unless c.class == Fixnum
56
56
  @map[c]
57
57
  end
58
58
 
59
59
  private
60
60
 
61
61
  def process_bfchar_line(l)
62
- m, find, replace = *l.match(/<([0-9a-fA-F]+)> <([0-9a-fA-F]+)>/)
62
+ m, find, replace = *l.match(/<([0-9a-fA-F]+)>\s*<([0-9a-fA-F]+)>/)
63
63
  @map["0x#{find}".hex] = "0x#{replace}".hex if find && replace
64
64
  end
65
65
 
66
66
  def process_bfrange_line(l)
67
- m, start_code, end_code, dst = *l.match(/<([0-9a-fA-F]+)> <([0-9a-fA-F]+)> <([0-9a-fA-F]+)>/)
67
+ m, start_code, end_code, dst = *l.match(/<([0-9a-fA-F]+)>\s*<([0-9a-fA-F]+)>\s*<([0-9a-fA-F]+)>/)
68
68
  if start_code && end_code && dst
69
69
  start_code = "0x#{start_code}".hex
70
70
  end_code = "0x#{end_code}".hex
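The bfchar and bfrange patterns now accept any amount of whitespace, including none, between the hex groups. The sketch below compares the old and new bfchar patterns on a few inputs. `sketch_bfchar` is a hypothetical helper, and the sample lines are made up for illustration.

```ruby
old_pattern = /<([0-9a-fA-F]+)> <([0-9a-fA-F]+)>/
new_pattern = /<([0-9a-fA-F]+)>\s*<([0-9a-fA-F]+)>/

# Hypothetical helper: returns [source_code, unicode_codepoint] or nil.
def sketch_bfchar(line, pattern)
  _, find, replace = *line.match(pattern) # splatting nil yields no values
  return nil unless find && replace
  [find.hex, replace.hex]
end

spaced   = "<0001> <0041>"
unspaced = "<0001><0041>"
tabbed   = "<0001>\t<0041>"

old_results = [spaced, unspaced, tabbed].map { |l| sketch_bfchar(l, old_pattern) }
new_results = [spaced, unspaced, tabbed].map { |l| sketch_bfchar(l, new_pattern) }
```

With the old pattern only the single-space form matches; with the new one all three lines produce the same mapping.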
@@ -145,6 +145,8 @@ class PDF::Reader
145
145
  # - end_page_container
146
146
  # - begin_page
147
147
  # - end_page
148
+ # - metadata
149
+ # - xml_metadata
148
150
  #
149
151
  # == Resource Callbacks
150
152
  #
@@ -250,10 +252,20 @@ class PDF::Reader
250
252
  @fonts ||= {}
251
253
  end
252
254
  ################################################################################
255
+ # Begin processing the document metadata
256
+ def metadata (info)
257
+ info = utf16_to_utf8(info)
258
+ callback(:metadata, [info]) if info
259
+ end
260
+ ################################################################################
253
261
  # Begin processing the document
254
262
  def document (root)
263
+ if root[:Metadata]
264
+ obj, stream = @xref.object(root[:Metadata])
265
+ callback(:xml_metadata,stream)
266
+ end
255
267
  callback(:begin_document, [root])
256
- walk_pages(@xref.object(root['Pages']))
268
+ walk_pages(@xref.object(root[:Pages]).first)
257
269
  callback(:end_document)
258
270
  end
259
271
  ################################################################################
@@ -261,27 +273,35 @@ class PDF::Reader
261
273
  # its content
262
274
  def walk_pages (page)
263
275
 
264
- if page['Resources']
265
- res = page['Resources']
266
- page.delete('Resources')
276
+ if page[:Resources]
277
+ res = page[:Resources]
278
+ page.delete(:Resources)
267
279
  end
268
280
 
269
281
  # extract page content
270
- if page['Type'] == "Pages"
282
+ if page[:Type] == :Pages
271
283
  callback(:begin_page_container, [page])
272
- walk_resources(@xref.object(res)) if res
273
- page['Kids'].each {|child| walk_pages(@xref.object(child))}
284
+ walk_resources(@xref.object(res).first) if res
285
+ page[:Kids].each {|child| walk_pages(@xref.object(child).first)}
274
286
  callback(:end_page_container)
275
- elsif page['Type'] == "Page"
287
+ elsif page[:Type] == :Page
276
288
  callback(:begin_page, [page])
277
- walk_resources(@xref.object(res)) if res
289
+ walk_resources(@xref.object(res).first) if res
278
290
  @page = page
279
291
  @params = []
280
292
 
281
- page['Contents'].to_a.each do |cstream|
282
- obj, stream = @xref.object(cstream)
293
+ if page[:Contents].kind_of?(Array)
294
+ contents = page[:Contents]
295
+ elsif @xref.obj_type(page[:Contents]) == :Array
296
+ contents, stream = @xref.object(page[:Contents])
297
+ else
298
+ contents = [page[:Contents]]
299
+ end
300
+
301
+ contents.each do |content|
302
+ obj, stream = @xref.object(content)
283
303
  content_stream(stream)
284
- end if page.has_key?('Contents') and page['Contents']
304
+ end if page.has_key?(:Contents) and page[:Contents]
285
305
 
286
306
  callback(:end_page)
287
307
  end
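The `walk_pages` change handles three shapes of the Contents entry: a direct array, a reference to an array, and a single reference to a stream. A minimal sketch of that normalisation is shown below. `SketchRef`, `sketch_contents_list`, and the `objects` table are hypothetical stand-ins for `PDF::Reader::Reference` and the XRef lookup, which are not reproduced here.

```ruby
# Hypothetical reference type; the real library uses PDF::Reader::Reference.
SketchRef = Struct.new(:id)

# Normalise a Contents entry into a list of items to look up one by one.
def sketch_contents_list(contents, objects)
  return [] if contents.nil?
  return contents if contents.is_a?(Array)          # direct array
  target = contents.is_a?(SketchRef) ? objects[contents.id] : nil
  target.is_a?(Array) ? target : [contents]         # ref to array, or single item
end

# Made-up object table: object 1 is an array of references to two streams.
objects = {
  1 => [SketchRef.new(2), SketchRef.new(3)],
  2 => "stream A",
  3 => "stream B"
}

from_array     = sketch_contents_list([SketchRef.new(2)], objects)
from_array_ref = sketch_contents_list(SketchRef.new(1), objects)
from_single    = sketch_contents_list(SketchRef.new(2), objects)
```

Every branch yields an array, so the loop that processes each content stream does not need to know which shape the page dictionary used.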
@@ -324,61 +344,60 @@ class PDF::Reader
324
344
  end
325
345
  end
326
346
  rescue EOFError => e
347
+ raise MalformedPDFError, "End Of File while processing a content stream"
327
348
  end
328
349
  ################################################################################
329
350
  def walk_resources(resources)
330
351
  resources = resolve_references(resources)
331
352
 
332
353
  # extract any procset information
333
- if resources['ProcSet']
334
- callback(:resource_procset, resources['ProcSet'])
354
+ if resources[:ProcSet]
355
+ callback(:resource_procset, resources[:ProcSet])
335
356
  end
336
357
 
337
358
  # extract any xobject information
338
- if resources['XObject']
339
- @xref.object(resources['XObject']).each do |name, val|
359
+ if resources[:XObject]
360
+ @xref.object(resources[:XObject]).first.each do |name, val|
340
361
  obj, stream = @xref.object(val)
341
362
  callback(:resource_xobject, [name, obj, stream])
342
363
  end
343
364
  end
344
365
 
345
366
  # extract any extgstate information
346
- if resources['ExtGState']
347
- @xref.object(resources['ExtGState']).each do |name, val|
348
- callback(:resource_extgstate, [name, @xref.object(val)])
367
+ if resources[:ExtGState]
368
+ @xref.object(resources[:ExtGState]).first.each do |name, val|
369
+ callback(:resource_extgstate, [name, @xref.object(val).first])
349
370
  end
350
371
  end
351
372
 
352
373
  # extract any colorspace information
353
- if resources['ColorSpace']
354
- @xref.object(resources['ColorSpace']).each do |name, val|
355
- callback(:resource_colorspace, [name, @xref.object(val)])
374
+ if resources[:ColorSpace]
375
+ @xref.object(resources[:ColorSpace]).first.each do |name, val|
376
+ callback(:resource_colorspace, [name, @xref.object(val).first])
356
377
  end
357
378
  end
358
379
 
359
380
  # extract any pattern information
360
- if resources['Pattern']
361
- @xref.object(resources['Pattern']).each do |name, val|
362
- callback(:resource_pattern, [name, @xref.object(val)])
381
+ if resources[:Pattern]
382
+ @xref.object(resources[:Pattern]).first.each do |name, val|
383
+ callback(:resource_pattern, [name, @xref.object(val).first])
363
384
  end
364
385
  end
365
386
 
366
387
  # extract any font information
367
- if resources['Font']
368
- @xref.object(resources['Font']).each do |label, desc|
369
- desc = @xref.object(desc)
388
+ if resources[:Font]
389
+ @xref.object(resources[:Font]).first.each do |label, desc|
390
+ desc = @xref.object(desc).first
370
391
  @fonts[label] = PDF::Reader::Font.new
371
392
  @fonts[label].label = label
372
- @fonts[label].subtype = desc['Subtype'] if desc['Subtype']
373
- @fonts[label].basefont = desc['BaseFont'] if desc['BaseFont']
374
- @fonts[label].encoding = PDF::Reader::Encoding.factory(@xref.object(desc['Encoding']))
375
- @fonts[label].descendantfonts = desc['DescendantFonts'] if desc['DescendantFonts']
376
- if desc['ToUnicode']
377
- obj, cmap = @xref.object(desc['ToUnicode'])
378
-
393
+ @fonts[label].subtype = desc[:Subtype] if desc[:Subtype]
394
+ @fonts[label].basefont = desc[:BaseFont] if desc[:BaseFont]
395
+ @fonts[label].encoding = PDF::Reader::Encoding.factory(@xref.object(desc[:Encoding]).first)
396
+ @fonts[label].descendantfonts = desc[:DescendantFonts] if desc[:DescendantFonts]
397
+ if desc[:ToUnicode]
379
398
  # this stream is a cmap
380
399
  begin
381
- @fonts[label].tounicode = PDF::Reader::CMap.new(cmap)
400
+ @fonts[label].tounicode = PDF::Reader::CMap.new(desc[:ToUnicode])
382
401
  rescue
383
402
  # if the CMap fails to parse, don't worry too much. Means we can't translate the text properly
384
403
  end
@@ -391,7 +410,13 @@ class PDF::Reader
391
410
  # Convert any PDF::Reader::Resource objects into a real object
392
411
  def resolve_references(obj)
393
412
  case obj
394
- when PDF::Reader::Reference then resolve_references(@xref.object(obj))
413
+ when PDF::Reader::Reference then
414
+ obj, stream = @xref.object(obj)
415
+ if stream
416
+ stream
417
+ else
418
+ resolve_references(obj)
419
+ end
395
420
  when Hash then obj.each { |key,val| obj[key] = resolve_references(val) }
396
421
  when Array then obj.collect { |item| resolve_references(item) }
397
422
  else
@@ -404,6 +429,21 @@ class PDF::Reader
404
429
  @receiver.send(name, *params) if @receiver.respond_to?(name)
405
430
  end
406
431
  ################################################################################
432
+ private
433
+ def utf16_to_utf8(obj)
434
+ case obj
435
+ when String then
436
+ if obj[0,2] == "\376\377"
437
+ obj[2, obj.size-2].unpack("n*").pack("U*")
438
+ else
439
+ obj
440
+ end
441
+ when Hash then obj.each { |key,val| obj[key] = utf16_to_utf8(val) }
442
+ when Array then obj.collect { |item| utf16_to_utf8(item) }
443
+ else
444
+ obj
445
+ end
446
+ end
407
447
  end
408
448
  ################################################################################
409
449
  end
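The `utf16_to_utf8` helper added above treats a string that begins with the bytes FE FF as big-endian UTF-16 and re-packs it as UTF-8. The following is a sketch of the same idea for current Ruby versions. `sketch_utf16_to_utf8` is a hypothetical name, it returns new containers instead of mutating the hash in place as the diff does, and, like the original, it assumes no surrogate pairs are present.

```ruby
# Hypothetical re-statement of the helper; not the library's own method.
def sketch_utf16_to_utf8(obj)
  case obj
  when String
    bytes = obj.b # compare raw bytes, independent of the string's encoding
    if bytes.byteslice(0, 2) == "\xFE\xFF".b
      # Drop the byte order mark, read 16-bit big-endian units,
      # and pack them as UTF-8 codepoints.
      bytes.byteslice(2, bytes.bytesize - 2).unpack("n*").pack("U*")
    else
      obj
    end
  when Hash
    obj.each_with_object({}) { |(key, val), out| out[key] = sketch_utf16_to_utf8(val) }
  when Array
    obj.map { |item| sketch_utf16_to_utf8(item) }
  else
    obj
  end
end

info = {
  :Title    => "\xFE\xFF\x00H\x00i".b, # UTF-16BE with BOM
  :Author   => "plain ascii",          # left untouched
  :Keywords => ["\xFE\xFF\x00\xE9".b]  # nested value
}
converted = sketch_utf16_to_utf8(info)
```

Strings without the marker pass through unchanged, which matches the diff's behaviour of leaving other string encodings alone.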
@@ -60,21 +60,21 @@ class PDF::Reader
60
60
  # Takes the "Encoding" value of a Font dictionary and builds a PDF::Reader::Encoding object
61
61
  def self.factory(enc)
62
62
  if enc.kind_of?(Hash)
63
- diff = enc['Differences']
64
- enc = enc['Encoding'] || enc['BaseEncoding']
63
+ diff = enc[:Differences]
64
+ enc = enc[:Encoding] || enc[:BaseEncoding]
65
65
  elsif enc != nil
66
- enc = enc.to_s
66
+ enc = enc.to_sym
67
67
  end
68
68
 
69
69
  case enc
70
- when nil then enc = PDF::Reader::Encoding::StandardEncoding.new
71
- when "Identity-H" then enc = PDF::Reader::Encoding::IdentityH.new
72
- when "MacRomanEncoding" then enc = PDF::Reader::Encoding::MacRomanEncoding.new
73
- when "MacExpertEncoding" then enc = PDF::Reader::Encoding::MacExpertEncoding.new
74
- when "StandardEncoding" then enc = PDF::Reader::Encoding::StandardEncoding.new
75
- when "SymbolEncoding" then enc = PDF::Reader::Encoding::SymbolEncoding.new
76
- when "WinAnsiEncoding" then enc = PDF::Reader::Encoding::WinAnsiEncoding.new
77
- when "ZapfDingbatsEncoding" then enc = PDF::Reader::Encoding::ZapfDingbatsEncoding.new
70
+ when nil then enc = PDF::Reader::Encoding::StandardEncoding.new
71
+ when "Identity-H".to_sym then enc = PDF::Reader::Encoding::IdentityH.new
72
+ when :MacRomanEncoding then enc = PDF::Reader::Encoding::MacRomanEncoding.new
73
+ when :MacExpertEncoding then enc = PDF::Reader::Encoding::MacExpertEncoding.new
74
+ when :StandardEncoding then enc = PDF::Reader::Encoding::StandardEncoding.new
75
+ when :SymbolEncoding then enc = PDF::Reader::Encoding::SymbolEncoding.new
76
+ when :WinAnsiEncoding then enc = PDF::Reader::Encoding::WinAnsiEncoding.new
77
+ when :ZapfDingbatsEncoding then enc = PDF::Reader::Encoding::ZapfDingbatsEncoding.new
78
78
  else raise UnsupportedFeatureError, "#{enc} is not currently a supported encoding"
79
79
  end
80
80
 
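The factory now matches on symbols, and `"Identity-H".to_sym` is used because a bare `:Identity-H` literal would parse as a symbol followed by a subtraction. A quoted symbol literal is an equivalent spelling. The sketch below shows both forms agreeing. `sketch_encoding_label` is a hypothetical helper that returns plain labels instead of the library's encoding objects.

```ruby
# Two spellings of the same symbol.
via_to_sym  = "Identity-H".to_sym
via_literal = :"Identity-H"

# Hypothetical helper that follows the shape of the factory's case statement.
def sketch_encoding_label(enc)
  enc = enc.to_sym unless enc.nil?
  case enc
  when nil              then :standard
  when :"Identity-H"    then :identity_h
  when :WinAnsiEncoding then :win_ansi
  else raise ArgumentError, "#{enc} is not a supported encoding in this sketch"
  end
end

labels = [nil, "Identity-H", :WinAnsiEncoding].map { |e| sketch_encoding_label(e) }
```

Calling `to_sym` on the input means the sketch accepts either strings or symbols, which is the same tolerance the `enc.to_sym` line in the diff provides.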
@@ -104,28 +104,28 @@ class PDF::Reader
104
104
  protected :process_glyphnames
105
105
 
106
106
  class IdentityH < Encoding
107
- def to_utf8(str, map = nil)
108
-
107
+ def to_utf8(str, tounicode = nil)
108
+
109
109
  array_enc = []
110
110
 
111
111
  # iterate over string, reading it in 2 byte chunks and interpreting those
112
112
  # chunks as ints
113
- str.unpack("n*").each do |c|
114
-
113
+ str.unpack("n*").each do |num|
114
+
115
115
  # convert the int to a unicode codepoint if possible.
116
116
  # without a ToUnicode CMap, it's impossible to reliably convert this text
117
117
  # to unicode, so just replace each character with a little box. Big smacks
118
118
  # the the PDF producing app.
119
- if map && (code = map.decode(c))
119
+ if tounicode && (code = tounicode.decode(num))
120
120
  array_enc << code
121
121
  else
122
122
  array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
123
123
  end
124
124
  end
125
-
125
+
126
126
  # replace charcters that didn't convert to unicode nicely with something valid
127
127
  array_enc.collect! { |c| c ? c : PDF::Reader::Encoding::UNKNOWN_CHAR }
128
-
128
+
129
129
  # pack all our Unicode codepoints into a UTF-8 string
130
130
  ret = array_enc.pack("U*")
131
131
 
@@ -143,169 +143,175 @@ class PDF::Reader
143
143
  array_expert = self.process_differences(array_expert)
144
144
  array_enc = []
145
145
  array_expert.each do |num|
146
- case num
147
- # change necesary characters to equivilant Unicode codepoints
148
- when 0x21; array_enc << 0xF721
149
- when 0x22; array_enc << 0xF6F8 # Hungarumlautsmall
150
- when 0x23; array_enc << 0xF7A2
151
- when 0x24; array_enc << 0xF724
152
- when 0x25; array_enc << 0xF6E4
153
- when 0x26; array_enc << 0xF726
154
- when 0x27; array_enc << 0xF7B4
155
- when 0x28; array_enc << 0x207D
156
- when 0x29; array_enc << 0xF07E
157
- when 0x2A; array_enc << 0x2025
158
- when 0x2B; array_enc << 0x2024
159
- when 0x2F; array_enc << 0x2044
160
- when 0x30; array_enc << 0xF730
161
- when 0x31; array_enc << 0xF731
162
- when 0x32; array_enc << 0xF732
163
- when 0x33; array_enc << 0xF733
164
- when 0x34; array_enc << 0xF734
165
- when 0x35; array_enc << 0xF735
166
- when 0x36; array_enc << 0xF736
167
- when 0x37; array_enc << 0xF737
168
- when 0x38; array_enc << 0xF738
169
- when 0x39; array_enc << 0xF739
170
- when 0x3D; array_enc << 0xF6DE
171
- when 0x3F; array_enc << 0xF73F
172
- when 0x44; array_enc << 0xF7F0
173
- when 0x47; array_enc << 0x00BC
174
- when 0x48; array_enc << 0x00BD
175
- when 0x49; array_enc << 0x00BE
176
- when 0x4A; array_enc << 0x215B
177
- when 0x4B; array_enc << 0x215C
178
- when 0x4C; array_enc << 0x215D
179
- when 0x4D; array_enc << 0x215E
180
- when 0x4E; array_enc << 0x2153
181
- when 0x4F; array_enc << 0x2154
182
- when 0x56; array_enc << 0xFB00
183
- when 0x57; array_enc << 0xFB01
184
- when 0x58; array_enc << 0xFB02
185
- when 0x59; array_enc << 0xFB03
186
- when 0x5A; array_enc << 0xFB04
187
- when 0x5B; array_enc << 0x208D
188
- when 0x5D; array_enc << 0x208E
189
- when 0x5E; array_enc << 0xF6F6
190
- when 0x5F; array_enc << 0xF6E5
191
- when 0x60; array_enc << 0xF760
192
- when 0x61; array_enc << 0xF761
193
- when 0x62; array_enc << 0xF762
194
- when 0x63; array_enc << 0xF763
195
- when 0x64; array_enc << 0xF764
196
- when 0x65; array_enc << 0xF765
197
- when 0x66; array_enc << 0xF766
198
- when 0x67; array_enc << 0xF767
199
- when 0x68; array_enc << 0xF768
200
- when 0x69; array_enc << 0xF769
201
- when 0x6A; array_enc << 0xF76A
202
- when 0x6B; array_enc << 0xF76B
203
- when 0x6C; array_enc << 0xF76C
204
- when 0x6D; array_enc << 0xF76D
205
- when 0x6E; array_enc << 0xF76E
206
- when 0x6F; array_enc << 0xF76F
207
- when 0x70; array_enc << 0xF770
208
- when 0x71; array_enc << 0xF771
209
- when 0x72; array_enc << 0xF772
210
- when 0x73; array_enc << 0xF773
211
- when 0x74; array_enc << 0xF774
212
- when 0x75; array_enc << 0xF775
213
- when 0x76; array_enc << 0xF776
214
- when 0x77; array_enc << 0xF777
215
- when 0x78; array_enc << 0xF778
216
- when 0x79; array_enc << 0xF779
217
- when 0x7A; array_enc << 0xF77A
218
- when 0x7B; array_enc << 0x20A1
219
- when 0x7C; array_enc << 0xF6DC
220
- when 0x7D; array_enc << 0xF6DD
221
- when 0x7E; array_enc << 0xF6FE
222
- when 0x81; array_enc << 0xF6E9
223
- when 0x82; array_enc << 0xF6E0
224
- when 0x87; array_enc << 0xF7E1 # Acircumflexsmall
225
- when 0x88; array_enc << 0xF7E0
226
- when 0x89; array_enc << 0xF7E2 # Acutesmall
227
- when 0x8A; array_enc << 0xF7E4
228
- when 0x8B; array_enc << 0xF7E3
229
- when 0x8C; array_enc << 0xF7E5
230
- when 0x8D; array_enc << 0xF7E7
231
- when 0x8E; array_enc << 0xF7E9
232
- when 0x8F; array_enc << 0xF7E8
233
- when 0x90; array_enc << 0xF7E4
234
- when 0x91; array_enc << 0xF7EB
235
- when 0x92; array_enc << 0xF7ED
236
- when 0x93; array_enc << 0xF7EC
237
- when 0x94; array_enc << 0xF7EE
238
- when 0x95; array_enc << 0xF7EF
239
- when 0x96; array_enc << 0xF7F1
240
- when 0x97; array_enc << 0xF7F3
241
- when 0x98; array_enc << 0xF7F2
242
- when 0x99; array_enc << 0xF7F4
243
- when 0x9A; array_enc << 0xF7F6
244
- when 0x9B; array_enc << 0xF7F5
245
- when 0x9C; array_enc << 0xF7FA
246
- when 0x9D; array_enc << 0xF7F9
247
- when 0x9E; array_enc << 0xF7FB
248
- when 0x9F; array_enc << 0xF7FC
249
- when 0xA1; array_enc << 0x2078
250
- when 0xA2; array_enc << 0x2084
251
- when 0xA3; array_enc << 0x2083
252
- when 0xA4; array_enc << 0x2086
253
- when 0xA5; array_enc << 0x2088
254
- when 0xA6; array_enc << 0x2087
255
- when 0xA7; array_enc << 0xF6FD
256
- when 0xA9; array_enc << 0xF6DF
257
- when 0xAA; array_enc << 0x2082
258
- when 0xAC; array_enc << 0xF7A8
259
- when 0xAE; array_enc << 0xF6F5
260
- when 0xAF; array_enc << 0xF6F0
261
- when 0xB0; array_enc << 0x2085
262
- when 0xB2; array_enc << 0xF6E1
263
- when 0xB3; array_enc << 0xF6E7
264
- when 0xB4; array_enc << 0xF7FD
265
- when 0xB6; array_enc << 0xF6E3
266
- when 0xB9; array_enc << 0xF7FE
267
- when 0xBB; array_enc << 0x2089
268
- when 0xBC; array_enc << 0x2080
269
- when 0xBD; array_enc << 0xF6FF
270
- when 0xBE; array_enc << 0xF7E6 # AEsmall
271
- when 0xBF; array_enc << 0xF7F8
272
- when 0xC0; array_enc << 0xF7BF
273
- when 0xC1; array_enc << 0x2081
274
- when 0xC2; array_enc << 0xF6F9
275
- when 0xC9; array_enc << 0xF7B8
276
- when 0xCF; array_enc << 0xF6FA
277
- when 0xD0; array_enc << 0x2012
278
- when 0xD1; array_enc << 0xF6E6
279
- when 0xD6; array_enc << 0xF7A1
280
- when 0xD8; array_enc << 0xF7FF
281
- when 0xDA; array_enc << 0x00B9
282
- when 0xDB; array_enc << 0x00B2
283
- when 0xDC; array_enc << 0x00B3
284
- when 0xDD; array_enc << 0x2074
285
- when 0xDE; array_enc << 0x2075
286
- when 0xDF; array_enc << 0x2076
287
- when 0xE0; array_enc << 0x2077
288
- when 0xE1; array_enc << 0x2079
289
- when 0xE2; array_enc << 0x2070
290
- when 0xE4; array_enc << 0xF6EC
291
- when 0xE5; array_enc << 0xF6F1
292
- when 0xE6; array_enc << 0xF6F3
293
- when 0xE9; array_enc << 0xF6ED
294
- when 0xEA; array_enc << 0xF6F2
295
- when 0xEB; array_enc << 0xF6EB
296
- when 0xF1; array_enc << 0xF6EE
297
- when 0xF2; array_enc << 0xF6FB
298
- when 0xF3; array_enc << 0xF6F4
299
- when 0xF4; array_enc << 0xF7AF
300
- when 0xF5; array_enc << 0xF6EF
301
- when 0xF6; array_enc << 0x207F
302
- when 0xF7; array_enc << 0xF6EF
303
- when 0xF8; array_enc << 0xF6E2
304
- when 0xF9; array_enc << 0xF6E8
305
- when 0xFA; array_enc << 0xF6F7
306
- when 0xFB; array_enc << 0xF6FC
146
+ if tounicode && (code = tounicode.decode(num))
147
+ array_enc << code
148
+ elsif tounicode
149
+ array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
307
150
  else
308
- array_enc << num
151
+ case num
152
+ # change necesary characters to equivilant Unicode codepoints
153
+ when 0x21; array_enc << 0xF721
154
+ when 0x22; array_enc << 0xF6F8 # Hungarumlautsmall
155
+ when 0x23; array_enc << 0xF7A2
156
+ when 0x24; array_enc << 0xF724
157
+ when 0x25; array_enc << 0xF6E4
158
+ when 0x26; array_enc << 0xF726
159
+ when 0x27; array_enc << 0xF7B4
160
+ when 0x28; array_enc << 0x207D
161
+ when 0x29; array_enc << 0xF07E
162
+ when 0x2A; array_enc << 0x2025
163
+ when 0x2B; array_enc << 0x2024
164
+ when 0x2F; array_enc << 0x2044
165
+ when 0x30; array_enc << 0xF730
166
+ when 0x31; array_enc << 0xF731
167
+ when 0x32; array_enc << 0xF732
168
+ when 0x33; array_enc << 0xF733
169
+ when 0x34; array_enc << 0xF734
170
+ when 0x35; array_enc << 0xF735
171
+ when 0x36; array_enc << 0xF736
172
+ when 0x37; array_enc << 0xF737
173
+ when 0x38; array_enc << 0xF738
174
+ when 0x39; array_enc << 0xF739
175
+ when 0x3D; array_enc << 0xF6DE
176
+ when 0x3F; array_enc << 0xF73F
177
+ when 0x44; array_enc << 0xF7F0
178
+ when 0x47; array_enc << 0x00BC
179
+ when 0x48; array_enc << 0x00BD
180
+ when 0x49; array_enc << 0x00BE
181
+ when 0x4A; array_enc << 0x215B
182
+ when 0x4B; array_enc << 0x215C
183
+ when 0x4C; array_enc << 0x215D
184
+ when 0x4D; array_enc << 0x215E
185
+ when 0x4E; array_enc << 0x2153
186
+ when 0x4F; array_enc << 0x2154
187
+ when 0x56; array_enc << 0xFB00
188
+ when 0x57; array_enc << 0xFB01
189
+ when 0x58; array_enc << 0xFB02
190
+ when 0x59; array_enc << 0xFB03
191
+ when 0x5A; array_enc << 0xFB04
192
+ when 0x5B; array_enc << 0x208D
193
+ when 0x5D; array_enc << 0x208E
194
+ when 0x5E; array_enc << 0xF6F6
195
+ when 0x5F; array_enc << 0xF6E5
196
+ when 0x60; array_enc << 0xF760
197
+ when 0x61; array_enc << 0xF761
198
+ when 0x62; array_enc << 0xF762
199
+ when 0x63; array_enc << 0xF763
200
+ when 0x64; array_enc << 0xF764
201
+ when 0x65; array_enc << 0xF765
202
+ when 0x66; array_enc << 0xF766
203
+ when 0x67; array_enc << 0xF767
204
+ when 0x68; array_enc << 0xF768
205
+ when 0x69; array_enc << 0xF769
206
+ when 0x6A; array_enc << 0xF76A
207
+ when 0x6B; array_enc << 0xF76B
208
+ when 0x6C; array_enc << 0xF76C
209
+ when 0x6D; array_enc << 0xF76D
210
+ when 0x6E; array_enc << 0xF76E
211
+ when 0x6F; array_enc << 0xF76F
212
+ when 0x70; array_enc << 0xF770
213
+ when 0x71; array_enc << 0xF771
214
+ when 0x72; array_enc << 0xF772
215
+ when 0x73; array_enc << 0xF773
216
+ when 0x74; array_enc << 0xF774
217
+ when 0x75; array_enc << 0xF775
218
+ when 0x76; array_enc << 0xF776
219
+ when 0x77; array_enc << 0xF777
220
+ when 0x78; array_enc << 0xF778
221
+ when 0x79; array_enc << 0xF779
222
+ when 0x7A; array_enc << 0xF77A
223
+ when 0x7B; array_enc << 0x20A1
224
+ when 0x7C; array_enc << 0xF6DC
225
+ when 0x7D; array_enc << 0xF6DD
226
+ when 0x7E; array_enc << 0xF6FE
227
+ when 0x81; array_enc << 0xF6E9
228
+ when 0x82; array_enc << 0xF6E0
229
+ when 0x87; array_enc << 0xF7E1 # Acircumflexsmall
230
+ when 0x88; array_enc << 0xF7E0
231
+ when 0x89; array_enc << 0xF7E2 # Acutesmall
232
+ when 0x8A; array_enc << 0xF7E4
233
+ when 0x8B; array_enc << 0xF7E3
234
+ when 0x8C; array_enc << 0xF7E5
235
+ when 0x8D; array_enc << 0xF7E7
236
+ when 0x8E; array_enc << 0xF7E9
237
+ when 0x8F; array_enc << 0xF7E8
238
+ when 0x90; array_enc << 0xF7E4
239
+ when 0x91; array_enc << 0xF7EB
240
+ when 0x92; array_enc << 0xF7ED
241
+ when 0x93; array_enc << 0xF7EC
242
+ when 0x94; array_enc << 0xF7EE
243
+ when 0x95; array_enc << 0xF7EF
244
+ when 0x96; array_enc << 0xF7F1
245
+ when 0x97; array_enc << 0xF7F3
246
+ when 0x98; array_enc << 0xF7F2
247
+ when 0x99; array_enc << 0xF7F4
248
+ when 0x9A; array_enc << 0xF7F6
249
+ when 0x9B; array_enc << 0xF7F5
250
+ when 0x9C; array_enc << 0xF7FA
251
+ when 0x9D; array_enc << 0xF7F9
252
+ when 0x9E; array_enc << 0xF7FB
253
+ when 0x9F; array_enc << 0xF7FC
254
+ when 0xA1; array_enc << 0x2078
255
+ when 0xA2; array_enc << 0x2084
256
+ when 0xA3; array_enc << 0x2083
257
+ when 0xA4; array_enc << 0x2086
258
+ when 0xA5; array_enc << 0x2088
259
+ when 0xA6; array_enc << 0x2087
260
+ when 0xA7; array_enc << 0xF6FD
261
+ when 0xA9; array_enc << 0xF6DF
262
+ when 0xAA; array_enc << 0x2082
263
+ when 0xAC; array_enc << 0xF7A8
264
+ when 0xAE; array_enc << 0xF6F5
265
+ when 0xAF; array_enc << 0xF6F0
266
+ when 0xB0; array_enc << 0x2085
267
+ when 0xB2; array_enc << 0xF6E1
268
+ when 0xB3; array_enc << 0xF6E7
269
+ when 0xB4; array_enc << 0xF7FD
270
+ when 0xB6; array_enc << 0xF6E3
271
+ when 0xB9; array_enc << 0xF7FE
272
+ when 0xBB; array_enc << 0x2089
273
+ when 0xBC; array_enc << 0x2080
274
+ when 0xBD; array_enc << 0xF6FF
275
+ when 0xBE; array_enc << 0xF7E6 # AEsmall
276
+ when 0xBF; array_enc << 0xF7F8
277
+ when 0xC0; array_enc << 0xF7BF
278
+ when 0xC1; array_enc << 0x2081
279
+ when 0xC2; array_enc << 0xF6F9
280
+ when 0xC9; array_enc << 0xF7B8
281
+ when 0xCF; array_enc << 0xF6FA
282
+ when 0xD0; array_enc << 0x2012
283
+ when 0xD1; array_enc << 0xF6E6
284
+ when 0xD6; array_enc << 0xF7A1
285
+ when 0xD8; array_enc << 0xF7FF
286
+ when 0xDA; array_enc << 0x00B9
287
+ when 0xDB; array_enc << 0x00B2
288
+ when 0xDC; array_enc << 0x00B3
289
+ when 0xDD; array_enc << 0x2074
290
+ when 0xDE; array_enc << 0x2075
291
+ when 0xDF; array_enc << 0x2076
292
+ when 0xE0; array_enc << 0x2077
293
+ when 0xE1; array_enc << 0x2079
294
+ when 0xE2; array_enc << 0x2070
295
+ when 0xE4; array_enc << 0xF6EC
296
+ when 0xE5; array_enc << 0xF6F1
297
+ when 0xE6; array_enc << 0xF6F3
298
+ when 0xE9; array_enc << 0xF6ED
299
+ when 0xEA; array_enc << 0xF6F2
300
+ when 0xEB; array_enc << 0xF6EB
301
+ when 0xF1; array_enc << 0xF6EE
302
+ when 0xF2; array_enc << 0xF6FB
303
+ when 0xF3; array_enc << 0xF6F4
304
+ when 0xF4; array_enc << 0xF7AF
305
+ when 0xF5; array_enc << 0xF6EF
306
+ when 0xF6; array_enc << 0x207F
307
+ when 0xF7; array_enc << 0xF6EF
308
+ when 0xF8; array_enc << 0xF6E2
309
+ when 0xF9; array_enc << 0xF6E8
310
+ when 0xFA; array_enc << 0xF6F7
311
+ when 0xFB; array_enc << 0xF6FC
312
+ else
313
+ array_enc << num
314
+ end
309
315
  end
310
316
  end
311
317
 
@@ -314,7 +320,7 @@ class PDF::Reader
314
320
 
315
321
  # replace characters that didn't convert to unicode nicely with something valid
316
322
  array_enc.collect! { |c| c ? c : PDF::Reader::Encoding::UNKNOWN_CHAR }
317
-
323
+
318
324
  # pack all our Unicode codepoints into a UTF-8 string
319
325
  ret = array_enc.pack("U*")
320
326
 
@@ -335,138 +341,144 @@ class PDF::Reader
335
341
  array_mac = self.process_differences(array_mac)
336
342
  array_enc = []
337
343
  array_mac.each do |num|
338
- case num
339
- # change necessary characters to equivalent Unicode codepoints
340
- when 0x80; array_enc << 0x00C4
341
- when 0x81; array_enc << 0x00C5
342
- when 0x82; array_enc << 0x00C7
343
- when 0x83; array_enc << 0x00C9
344
- when 0x84; array_enc << 0x00D1
345
- when 0x85; array_enc << 0x00D6
346
- when 0x86; array_enc << 0x00DC
347
- when 0x87; array_enc << 0x00E1
348
- when 0x88; array_enc << 0x00E0
349
- when 0x89; array_enc << 0x00E2
350
- when 0x8A; array_enc << 0x00E4
351
- when 0x8B; array_enc << 0x00E3
352
- when 0x8C; array_enc << 0x00E5
353
- when 0x8D; array_enc << 0x00E7
354
- when 0x8E; array_enc << 0x00E9
355
- when 0x8F; array_enc << 0x00E8
356
- when 0x90; array_enc << 0x00EA
357
- when 0x91; array_enc << 0x00EB
358
- when 0x92; array_enc << 0x00ED
359
- when 0x93; array_enc << 0x00EC
360
- when 0x94; array_enc << 0x00EE
361
- when 0x95; array_enc << 0x00EF
362
- when 0x96; array_enc << 0x00F1
363
- when 0x97; array_enc << 0x00F3
364
- when 0x98; array_enc << 0x00F2
365
- when 0x99; array_enc << 0x00F4
366
- when 0x9A; array_enc << 0x00F6
367
- when 0x9B; array_enc << 0x00F5
368
- when 0x9C; array_enc << 0x00FA
369
- when 0x9D; array_enc << 0x00F9
370
- when 0x9E; array_enc << 0x00FB
371
- when 0x9F; array_enc << 0x00FC
372
- when 0xA0; array_enc << 0x2020
373
- when 0xA1; array_enc << 0x00B0
374
- when 0xA2; array_enc << 0x00A2
375
- when 0xA3; array_enc << 0x00A3
376
- when 0xA4; array_enc << 0x00A7
377
- when 0xA5; array_enc << 0x2022
378
- when 0xA6; array_enc << 0x00B6
379
- when 0xA7; array_enc << 0x00DF
380
- when 0xA8; array_enc << 0x00AE
381
- when 0xA9; array_enc << 0x00A9
382
- when 0xAA; array_enc << 0x2122
383
- when 0xAB; array_enc << 0x00B4
384
- when 0xAC; array_enc << 0x00A8
385
- when 0xAD; array_enc << 0x2260
386
- when 0xAE; array_enc << 0x00C6
387
- when 0xAF; array_enc << 0x00D8
388
- when 0xB0; array_enc << 0x221E
389
- when 0xB1; array_enc << 0x00B1
390
- when 0xB2; array_enc << 0x2264
391
- when 0xB3; array_enc << 0x2265
392
- when 0xB4; array_enc << 0x00A5
393
- when 0xB5; array_enc << 0x00B5
394
- when 0xB6; array_enc << 0x2202
395
- when 0xB7; array_enc << 0x2211
396
- when 0xB8; array_enc << 0x220F
397
- when 0xB9; array_enc << 0x03C0
398
- when 0xBA; array_enc << 0x222B
399
- when 0xBB; array_enc << 0x00AA
400
- when 0xBC; array_enc << 0x00BA
401
- when 0xBD; array_enc << 0x03A9
402
- when 0xBE; array_enc << 0x00E6
403
- when 0xBF; array_enc << 0x00F8
404
- when 0xC0; array_enc << 0x00BF
405
- when 0xC1; array_enc << 0x00A1
406
- when 0xC2; array_enc << 0x00AC
407
- when 0xC3; array_enc << 0x221A
408
- when 0xC4; array_enc << 0x0192
409
- when 0xC5; array_enc << 0x2248
410
- when 0xC6; array_enc << 0x2206
411
- when 0xC7; array_enc << 0x00AB
412
- when 0xC8; array_enc << 0x00BB
413
- when 0xC9; array_enc << 0x2026
414
- when 0xCA; array_enc << 0x00A0
415
- when 0xCB; array_enc << 0x00C0
416
- when 0xCC; array_enc << 0x00C3
417
- when 0xCD; array_enc << 0x00D5
418
- when 0xCE; array_enc << 0x0152
419
- when 0xCF; array_enc << 0x0153
420
- when 0xD0; array_enc << 0x2013
421
- when 0xD1; array_enc << 0x2014
422
- when 0xD2; array_enc << 0x201C
423
- when 0xD3; array_enc << 0x201D
424
- when 0xD4; array_enc << 0x2018
425
- when 0xD5; array_enc << 0x2019
426
- when 0xD6; array_enc << 0x00F7
427
- when 0xD7; array_enc << 0x25CA
428
- when 0xD8; array_enc << 0x00FF
429
- when 0xD9; array_enc << 0x0178
430
- when 0xDA; array_enc << 0x2044
431
- when 0xDB; array_enc << 0x20AC
432
- when 0xDC; array_enc << 0x2039
433
- when 0xDD; array_enc << 0x203A
434
- when 0xDE; array_enc << 0xFB01
435
- when 0xDF; array_enc << 0xFB02
436
- when 0xE0; array_enc << 0x2021
437
- when 0xE1; array_enc << 0x00B7
438
- when 0xE2; array_enc << 0x201A
439
- when 0xE3; array_enc << 0x201E
440
- when 0xE4; array_enc << 0x2030
441
- when 0xE5; array_enc << 0x00C2
442
- when 0xE6; array_enc << 0x00CA
443
- when 0xE7; array_enc << 0x00C1
444
- when 0xE8; array_enc << 0x00CB
445
- when 0xE9; array_enc << 0x00C8
446
- when 0xEA; array_enc << 0x00CD
447
- when 0xEB; array_enc << 0x00CE
448
- when 0xEC; array_enc << 0x00CF
449
- when 0xED; array_enc << 0x00CC
450
- when 0xEE; array_enc << 0x00D3
451
- when 0xEF; array_enc << 0x00D4
452
- when 0xF0; array_enc << 0xF8FF
453
- when 0xF1; array_enc << 0x00D2
454
- when 0xF2; array_enc << 0x00DA
455
- when 0xF3; array_enc << 0x00DB
456
- when 0xF4; array_enc << 0x00D9
457
- when 0xF5; array_enc << 0x0131
458
- when 0xF6; array_enc << 0x02C6
459
- when 0xF7; array_enc << 0x02DC
460
- when 0xF8; array_enc << 0x00AF
461
- when 0xF9; array_enc << 0x02D8
462
- when 0xFA; array_enc << 0x02D9
463
- when 0xFB; array_enc << 0x02DA
464
- when 0xFC; array_enc << 0x00B8
465
- when 0xFD; array_enc << 0x02DD
466
- when 0xFE; array_enc << 0x02DB
467
- when 0xFF; array_enc << 0x02C7
344
+ if tounicode && (code = tounicode.decode(num))
345
+ array_enc << code
346
+ elsif tounicode
347
+ array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
468
348
  else
469
- array_enc << num
349
+ case num
350
+ # change necessary characters to equivalent Unicode codepoints
351
+ when 0x80; array_enc << 0x00C4
352
+ when 0x81; array_enc << 0x00C5
353
+ when 0x82; array_enc << 0x00C7
354
+ when 0x83; array_enc << 0x00C9
355
+ when 0x84; array_enc << 0x00D1
356
+ when 0x85; array_enc << 0x00D6
357
+ when 0x86; array_enc << 0x00DC
358
+ when 0x87; array_enc << 0x00E1
359
+ when 0x88; array_enc << 0x00E0
360
+ when 0x89; array_enc << 0x00E2
361
+ when 0x8A; array_enc << 0x00E4
362
+ when 0x8B; array_enc << 0x00E3
363
+ when 0x8C; array_enc << 0x00E5
364
+ when 0x8D; array_enc << 0x00E7
365
+ when 0x8E; array_enc << 0x00E9
366
+ when 0x8F; array_enc << 0x00E8
367
+ when 0x90; array_enc << 0x00EA
368
+ when 0x91; array_enc << 0x00EB
369
+ when 0x92; array_enc << 0x00ED
370
+ when 0x93; array_enc << 0x00EC
371
+ when 0x94; array_enc << 0x00EE
372
+ when 0x95; array_enc << 0x00EF
373
+ when 0x96; array_enc << 0x00F1
374
+ when 0x97; array_enc << 0x00F3
375
+ when 0x98; array_enc << 0x00F2
376
+ when 0x99; array_enc << 0x00F4
377
+ when 0x9A; array_enc << 0x00F6
378
+ when 0x9B; array_enc << 0x00F5
379
+ when 0x9C; array_enc << 0x00FA
380
+ when 0x9D; array_enc << 0x00F9
381
+ when 0x9E; array_enc << 0x00FB
382
+ when 0x9F; array_enc << 0x00FC
383
+ when 0xA0; array_enc << 0x2020
384
+ when 0xA1; array_enc << 0x00B0
385
+ when 0xA2; array_enc << 0x00A2
386
+ when 0xA3; array_enc << 0x00A3
387
+ when 0xA4; array_enc << 0x00A7
388
+ when 0xA5; array_enc << 0x2022
389
+ when 0xA6; array_enc << 0x00B6
390
+ when 0xA7; array_enc << 0x00DF
391
+ when 0xA8; array_enc << 0x00AE
392
+ when 0xA9; array_enc << 0x00A9
393
+ when 0xAA; array_enc << 0x2122
394
+ when 0xAB; array_enc << 0x00B4
395
+ when 0xAC; array_enc << 0x00A8
396
+ when 0xAD; array_enc << 0x2260
397
+ when 0xAE; array_enc << 0x00C6
398
+ when 0xAF; array_enc << 0x00D8
399
+ when 0xB0; array_enc << 0x221E
400
+ when 0xB1; array_enc << 0x00B1
401
+ when 0xB2; array_enc << 0x2264
402
+ when 0xB3; array_enc << 0x2265
403
+ when 0xB4; array_enc << 0x00A5
404
+ when 0xB5; array_enc << 0x00B5
405
+ when 0xB6; array_enc << 0x2202
406
+ when 0xB7; array_enc << 0x2211
407
+ when 0xB8; array_enc << 0x220F
408
+ when 0xB9; array_enc << 0x03C0
409
+ when 0xBA; array_enc << 0x222B
410
+ when 0xBB; array_enc << 0x00AA
411
+ when 0xBC; array_enc << 0x00BA
412
+ when 0xBD; array_enc << 0x03A9
413
+ when 0xBE; array_enc << 0x00E6
414
+ when 0xBF; array_enc << 0x00F8
415
+ when 0xC0; array_enc << 0x00BF
416
+ when 0xC1; array_enc << 0x00A1
417
+ when 0xC2; array_enc << 0x00AC
418
+ when 0xC3; array_enc << 0x221A
419
+ when 0xC4; array_enc << 0x0192
420
+ when 0xC5; array_enc << 0x2248
421
+ when 0xC6; array_enc << 0x2206
422
+ when 0xC7; array_enc << 0x00AB
423
+ when 0xC8; array_enc << 0x00BB
424
+ when 0xC9; array_enc << 0x2026
425
+ when 0xCA; array_enc << 0x00A0
426
+ when 0xCB; array_enc << 0x00C0
427
+ when 0xCC; array_enc << 0x00C3
428
+ when 0xCD; array_enc << 0x00D5
429
+ when 0xCE; array_enc << 0x0152
430
+ when 0xCF; array_enc << 0x0153
431
+ when 0xD0; array_enc << 0x2013
432
+ when 0xD1; array_enc << 0x2014
433
+ when 0xD2; array_enc << 0x201C
434
+ when 0xD3; array_enc << 0x201D
435
+ when 0xD4; array_enc << 0x2018
436
+ when 0xD5; array_enc << 0x2019
437
+ when 0xD6; array_enc << 0x00F7
438
+ when 0xD7; array_enc << 0x25CA
439
+ when 0xD8; array_enc << 0x00FF
440
+ when 0xD9; array_enc << 0x0178
441
+ when 0xDA; array_enc << 0x2044
442
+ when 0xDB; array_enc << 0x20AC
443
+ when 0xDC; array_enc << 0x2039
444
+ when 0xDD; array_enc << 0x203A
445
+ when 0xDE; array_enc << 0xFB01
446
+ when 0xDF; array_enc << 0xFB02
447
+ when 0xE0; array_enc << 0x2021
448
+ when 0xE1; array_enc << 0x00B7
449
+ when 0xE2; array_enc << 0x201A
450
+ when 0xE3; array_enc << 0x201E
451
+ when 0xE4; array_enc << 0x2030
452
+ when 0xE5; array_enc << 0x00C2
453
+ when 0xE6; array_enc << 0x00CA
454
+ when 0xE7; array_enc << 0x00C1
455
+ when 0xE8; array_enc << 0x00CB
456
+ when 0xE9; array_enc << 0x00C8
457
+ when 0xEA; array_enc << 0x00CD
458
+ when 0xEB; array_enc << 0x00CE
459
+ when 0xEC; array_enc << 0x00CF
460
+ when 0xED; array_enc << 0x00CC
461
+ when 0xEE; array_enc << 0x00D3
462
+ when 0xEF; array_enc << 0x00D4
463
+ when 0xF0; array_enc << 0xF8FF
464
+ when 0xF1; array_enc << 0x00D2
465
+ when 0xF2; array_enc << 0x00DA
466
+ when 0xF3; array_enc << 0x00DB
467
+ when 0xF4; array_enc << 0x00D9
468
+ when 0xF5; array_enc << 0x0131
469
+ when 0xF6; array_enc << 0x02C6
470
+ when 0xF7; array_enc << 0x02DC
471
+ when 0xF8; array_enc << 0x00AF
472
+ when 0xF9; array_enc << 0x02D8
473
+ when 0xFA; array_enc << 0x02D9
474
+ when 0xFB; array_enc << 0x02DA
475
+ when 0xFC; array_enc << 0x00B8
476
+ when 0xFD; array_enc << 0x02DD
477
+ when 0xFE; array_enc << 0x02DB
478
+ when 0xFF; array_enc << 0x02C7
479
+ else
480
+ array_enc << num
481
+ end
470
482
  end
471
483
  end
472
484
 
@@ -475,7 +487,7 @@ class PDF::Reader
475
487
 
476
488
  # replace characters that didn't convert to unicode nicely with something valid
477
489
  array_enc.collect! { |c| c ? c : PDF::Reader::Encoding::UNKNOWN_CHAR }
478
-
490
+
479
491
  # pack all our Unicode codepoints into a UTF-8 string
480
492
  ret = array_enc.pack("U*")
481
493
 
@@ -495,62 +507,68 @@ class PDF::Reader
495
507
  array_std = self.process_differences(array_std)
496
508
  array_enc = []
497
509
  array_std.each do |num|
498
- case num
499
- when 0x27; array_enc << 0x2019
500
- when 0x60; array_enc << 0x2018
501
- when 0xA4; array_enc << 0x2044
502
- when 0xA6; array_enc << 0x0192
503
- when 0xA8; array_enc << 0x00A4
504
- when 0xA9; array_enc << 0x0027
505
- when 0xAA; array_enc << 0x201C
506
- when 0xAC; array_enc << 0x2039
507
- when 0xAD; array_enc << 0x203A
508
- when 0xAE; array_enc << 0xFB01
509
- when 0xAF; array_enc << 0xFB02
510
- when 0xB1; array_enc << 0x2013
511
- when 0xB2; array_enc << 0x2020
512
- when 0xB3; array_enc << 0x2021
513
- when 0xB4; array_enc << 0x00B7
514
- when 0xB7; array_enc << 0x2022
515
- when 0xB8; array_enc << 0x201A
516
- when 0xB9; array_enc << 0x201E
517
- when 0xBA; array_enc << 0x201D
518
- when 0xBC; array_enc << 0x2026
519
- when 0xBD; array_enc << 0x2030
520
- when 0xC1; array_enc << 0x0060
521
- when 0xC2; array_enc << 0x00B4
522
- when 0xC3; array_enc << 0x02C6
523
- when 0xC4; array_enc << 0x02DC
524
- when 0xC5; array_enc << 0x00AF
525
- when 0xC6; array_enc << 0x02D8
526
- when 0xC7; array_enc << 0x02D9
527
- when 0xC8; array_enc << 0x00A8
528
- when 0xCA; array_enc << 0x02DA
529
- when 0xCB; array_enc << 0x00B8
530
- when 0xCD; array_enc << 0x02DD
531
- when 0xCE; array_enc << 0x02DB
532
- when 0xCF; array_enc << 0x02C7
533
- when 0xD0; array_enc << 0x2014
534
- when 0xE1; array_enc << 0x00C6
535
- when 0xE3; array_enc << 0x00AA
536
- when 0xE8; array_enc << 0x0141
537
- when 0xE9; array_enc << 0x00D8
538
- when 0xEA; array_enc << 0x0152
539
- when 0xEB; array_enc << 0x00BA
540
- when 0xF1; array_enc << 0x00E6
541
- when 0xF5; array_enc << 0x0131
542
- when 0xF8; array_enc << 0x0142
543
- when 0xF9; array_enc << 0x00F8
544
- when 0xFA; array_enc << 0x0153
545
- when 0xFB; array_enc << 0x00DF
510
+ if tounicode && (code = tounicode.decode(num))
511
+ array_enc << code
512
+ elsif tounicode
513
+ array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
546
514
  else
547
- array_enc << num
515
+ case num
516
+ when 0x27; array_enc << 0x2019
517
+ when 0x60; array_enc << 0x2018
518
+ when 0xA4; array_enc << 0x2044
519
+ when 0xA6; array_enc << 0x0192
520
+ when 0xA8; array_enc << 0x00A4
521
+ when 0xA9; array_enc << 0x0027
522
+ when 0xAA; array_enc << 0x201C
523
+ when 0xAC; array_enc << 0x2039
524
+ when 0xAD; array_enc << 0x203A
525
+ when 0xAE; array_enc << 0xFB01
526
+ when 0xAF; array_enc << 0xFB02
527
+ when 0xB1; array_enc << 0x2013
528
+ when 0xB2; array_enc << 0x2020
529
+ when 0xB3; array_enc << 0x2021
530
+ when 0xB4; array_enc << 0x00B7
531
+ when 0xB7; array_enc << 0x2022
532
+ when 0xB8; array_enc << 0x201A
533
+ when 0xB9; array_enc << 0x201E
534
+ when 0xBA; array_enc << 0x201D
535
+ when 0xBC; array_enc << 0x2026
536
+ when 0xBD; array_enc << 0x2030
537
+ when 0xC1; array_enc << 0x0060
538
+ when 0xC2; array_enc << 0x00B4
539
+ when 0xC3; array_enc << 0x02C6
540
+ when 0xC4; array_enc << 0x02DC
541
+ when 0xC5; array_enc << 0x00AF
542
+ when 0xC6; array_enc << 0x02D8
543
+ when 0xC7; array_enc << 0x02D9
544
+ when 0xC8; array_enc << 0x00A8
545
+ when 0xCA; array_enc << 0x02DA
546
+ when 0xCB; array_enc << 0x00B8
547
+ when 0xCD; array_enc << 0x02DD
548
+ when 0xCE; array_enc << 0x02DB
549
+ when 0xCF; array_enc << 0x02C7
550
+ when 0xD0; array_enc << 0x2014
551
+ when 0xE1; array_enc << 0x00C6
552
+ when 0xE3; array_enc << 0x00AA
553
+ when 0xE8; array_enc << 0x0141
554
+ when 0xE9; array_enc << 0x00D8
555
+ when 0xEA; array_enc << 0x0152
556
+ when 0xEB; array_enc << 0x00BA
557
+ when 0xF1; array_enc << 0x00E6
558
+ when 0xF5; array_enc << 0x0131
559
+ when 0xF8; array_enc << 0x0142
560
+ when 0xF9; array_enc << 0x00F8
561
+ when 0xFA; array_enc << 0x0153
562
+ when 0xFB; array_enc << 0x00DF
563
+ else
564
+ array_enc << num
565
+ end
548
566
  end
549
567
  end
550
-
568
+
551
569
  # convert any glyph names to unicode codepoints
552
570
  array_enc = self.process_glyphnames(array_enc)
553
-
571
+
554
572
  # replace characters that didn't convert to unicode nicely with something valid
555
573
  array_enc.collect! { |c| c ? c : PDF::Reader::Encoding::UNKNOWN_CHAR }
556
574
 
@@ -571,163 +589,169 @@ class PDF::Reader
571
589
  array_symbol = self.process_differences(array_symbol)
572
590
  array_enc = []
573
591
  array_symbol.each do |num|
574
- case num
575
- when 0x22; array_enc << 0x2200
576
- when 0x24; array_enc << 0x2203
577
- when 0x27; array_enc << 0x220B
578
- when 0x2A; array_enc << 0x2217
579
- when 0x2D; array_enc << 0x2212
580
- when 0x40; array_enc << 0x2245
581
- when 0x41; array_enc << 0x0391
582
- when 0x42; array_enc << 0x0392
583
- when 0x43; array_enc << 0x03A7
584
- when 0x44; array_enc << 0x0394
585
- when 0x45; array_enc << 0x0395
586
- when 0x46; array_enc << 0x03A6
587
- when 0x47; array_enc << 0x0393
588
- when 0x48; array_enc << 0x0397
589
- when 0x49; array_enc << 0x0399
590
- when 0x4A; array_enc << 0x03D1
591
- when 0x4B; array_enc << 0x039A
592
- when 0x4C; array_enc << 0x039B
593
- when 0x4D; array_enc << 0x039C
594
- when 0x4E; array_enc << 0x039D
595
- when 0x4F; array_enc << 0x039F
596
- when 0x50; array_enc << 0x03A0
597
- when 0x51; array_enc << 0x0398
598
- when 0x52; array_enc << 0x03A1
599
- when 0x53; array_enc << 0x03A3
600
- when 0x54; array_enc << 0x03A4
601
- when 0x55; array_enc << 0x03A5
602
- when 0x56; array_enc << 0x03C2
603
- when 0x57; array_enc << 0x03A9
604
- when 0x58; array_enc << 0x039E
605
- when 0x59; array_enc << 0x03A8
606
- when 0x5A; array_enc << 0x0396
607
- when 0x5C; array_enc << 0x2234
608
- when 0x5E; array_enc << 0x22A5
609
- when 0x60; array_enc << 0xF8E5
610
- when 0x61; array_enc << 0x03B1
611
- when 0x62; array_enc << 0x03B2
612
- when 0x63; array_enc << 0x03C7
613
- when 0x64; array_enc << 0x03B4
614
- when 0x65; array_enc << 0x03B5
615
- when 0x66; array_enc << 0x03C6
616
- when 0x67; array_enc << 0x03B3
617
- when 0x68; array_enc << 0x03B7
618
- when 0x69; array_enc << 0x03B9
619
- when 0x6A; array_enc << 0x03D5
620
- when 0x6B; array_enc << 0x03BA
621
- when 0x6C; array_enc << 0x03BB
622
- when 0x6D; array_enc << 0x03BC
623
- when 0x6E; array_enc << 0x03BD
624
- when 0x6F; array_enc << 0x03BF
625
- when 0x70; array_enc << 0x03C0
626
- when 0x71; array_enc << 0x03B8
627
- when 0x72; array_enc << 0x03C1
628
- when 0x73; array_enc << 0x03C3
629
- when 0x74; array_enc << 0x03C4
630
- when 0x75; array_enc << 0x03C5
631
- when 0x76; array_enc << 0x03D6
632
- when 0x77; array_enc << 0x03C9
633
- when 0x78; array_enc << 0x03BE
634
- when 0x79; array_enc << 0x03C8
635
- when 0x7A; array_enc << 0x03B6
636
- when 0x7E; array_enc << 0x223C
637
- when 0xA0; array_enc << 0x20AC
638
- when 0xA1; array_enc << 0x03D2
639
- when 0xA2; array_enc << 0x2032
640
- when 0xA3; array_enc << 0x2264
641
- when 0xA4; array_enc << 0x2215
642
- when 0xA5; array_enc << 0x221E
643
- when 0xA6; array_enc << 0x0192
644
- when 0xA7; array_enc << 0x2663
645
- when 0xA8; array_enc << 0x2666
646
- when 0xA9; array_enc << 0x2665
647
- when 0xAA; array_enc << 0x2660
648
- when 0xAB; array_enc << 0x2194
649
- when 0xAC; array_enc << 0x2190
650
- when 0xAD; array_enc << 0x2191
651
- when 0xAE; array_enc << 0x2192
652
- when 0xAF; array_enc << 0x2193
653
- when 0xB2; array_enc << 0x2033
654
- when 0xB3; array_enc << 0x2265
655
- when 0xB4; array_enc << 0x00D7
656
- when 0xB5; array_enc << 0x221D
657
- when 0xB6; array_enc << 0x2202
658
- when 0xB7; array_enc << 0x2022
659
- when 0xB8; array_enc << 0x00F7
660
- when 0xB9; array_enc << 0x2260
661
- when 0xBA; array_enc << 0x2261
662
- when 0xBB; array_enc << 0x2248
663
- when 0xBC; array_enc << 0x2026
664
- when 0xBD; array_enc << 0xF8E6
665
- when 0xBE; array_enc << 0xF8E7
666
- when 0xBF; array_enc << 0x21B5
667
- when 0xC0; array_enc << 0x2135
668
- when 0xC1; array_enc << 0x2111
669
- when 0xC2; array_enc << 0x211C
670
- when 0xC3; array_enc << 0x2118
671
- when 0xC4; array_enc << 0x2297
672
- when 0xC5; array_enc << 0x2295
673
- when 0xC6; array_enc << 0x2205
674
- when 0xC7; array_enc << 0x2229
675
- when 0xC8; array_enc << 0x222A
676
- when 0xC9; array_enc << 0x2283
677
- when 0xCA; array_enc << 0x2287
678
- when 0xCB; array_enc << 0x2284
679
- when 0xCC; array_enc << 0x2282
680
- when 0xCD; array_enc << 0x2286
681
- when 0xCE; array_enc << 0x2208
682
- when 0xCF; array_enc << 0x2209
683
- when 0xD0; array_enc << 0x2220
684
- when 0xD1; array_enc << 0x2207
685
- when 0xD2; array_enc << 0xF6DA
686
- when 0xD3; array_enc << 0xF6D9
687
- when 0xD4; array_enc << 0xF6DB
688
- when 0xD5; array_enc << 0x220F
689
- when 0xD6; array_enc << 0x221A
690
- when 0xD7; array_enc << 0x22C5
691
- when 0xD8; array_enc << 0x00AC
692
- when 0xD9; array_enc << 0x2227
693
- when 0xDA; array_enc << 0x2228
694
- when 0xDB; array_enc << 0x21D4
695
- when 0xDC; array_enc << 0x21D0
696
- when 0xDD; array_enc << 0x21D1
697
- when 0xDE; array_enc << 0x21D2
698
- when 0xDF; array_enc << 0x21D3
699
- when 0xE0; array_enc << 0x25CA
700
- when 0xE1; array_enc << 0x2329
701
- when 0xE2; array_enc << 0xF8E8
702
- when 0xE3; array_enc << 0xF8E9
703
- when 0xE4; array_enc << 0xF8EA
704
- when 0xE5; array_enc << 0x2211
705
- when 0xE6; array_enc << 0xF8EB
706
- when 0xE7; array_enc << 0xF8EC
707
- when 0xE8; array_enc << 0xF8ED
708
- when 0xE9; array_enc << 0xF8EE
709
- when 0xEA; array_enc << 0xF8EF
710
- when 0xEB; array_enc << 0xF8F0
711
- when 0xEC; array_enc << 0xF8F1
712
- when 0xED; array_enc << 0xF8F2
713
- when 0xEE; array_enc << 0xF8F3
714
- when 0xEF; array_enc << 0xF8F4
715
- when 0xF1; array_enc << 0x232A
716
- when 0xF2; array_enc << 0x222B
717
- when 0xF3; array_enc << 0x2320
718
- when 0xF4; array_enc << 0xF8F5
719
- when 0xF5; array_enc << 0x2321
720
- when 0xF6; array_enc << 0xF8F6
721
- when 0xF7; array_enc << 0xF8F7
722
- when 0xF8; array_enc << 0xF8F8
723
- when 0xF9; array_enc << 0xF8F9
724
- when 0xFA; array_enc << 0xF8FA
725
- when 0xFB; array_enc << 0xF8FB
726
- when 0xFC; array_enc << 0xF8FC
727
- when 0xFD; array_enc << 0xF8FD
728
- when 0xFE; array_enc << 0xF8FE
592
+ if tounicode && (code = tounicode.decode(num))
593
+ array_enc << code
594
+ elsif tounicode
595
+ array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
729
596
  else
730
- array_enc << num
597
+ case num
598
+ when 0x22; array_enc << 0x2200
599
+ when 0x24; array_enc << 0x2203
600
+ when 0x27; array_enc << 0x220B
601
+ when 0x2A; array_enc << 0x2217
602
+ when 0x2D; array_enc << 0x2212
603
+ when 0x40; array_enc << 0x2245
604
+ when 0x41; array_enc << 0x0391
605
+ when 0x42; array_enc << 0x0392
606
+ when 0x43; array_enc << 0x03A7
607
+ when 0x44; array_enc << 0x0394
608
+ when 0x45; array_enc << 0x0395
609
+ when 0x46; array_enc << 0x03A6
610
+ when 0x47; array_enc << 0x0393
611
+ when 0x48; array_enc << 0x0397
612
+ when 0x49; array_enc << 0x0399
613
+ when 0x4A; array_enc << 0x03D1
614
+ when 0x4B; array_enc << 0x039A
615
+ when 0x4C; array_enc << 0x039B
616
+ when 0x4D; array_enc << 0x039C
617
+ when 0x4E; array_enc << 0x039D
618
+ when 0x4F; array_enc << 0x039F
619
+ when 0x50; array_enc << 0x03A0
620
+ when 0x51; array_enc << 0x0398
621
+ when 0x52; array_enc << 0x03A1
622
+ when 0x53; array_enc << 0x03A3
623
+ when 0x54; array_enc << 0x03A4
624
+ when 0x55; array_enc << 0x03A5
625
+ when 0x56; array_enc << 0x03C2
626
+ when 0x57; array_enc << 0x03A9
627
+ when 0x58; array_enc << 0x039E
628
+ when 0x59; array_enc << 0x03A8
629
+ when 0x5A; array_enc << 0x0396
630
+ when 0x5C; array_enc << 0x2234
631
+ when 0x5E; array_enc << 0x22A5
632
+ when 0x60; array_enc << 0xF8E5
633
+ when 0x61; array_enc << 0x03B1
634
+ when 0x62; array_enc << 0x03B2
635
+ when 0x63; array_enc << 0x03C7
636
+ when 0x64; array_enc << 0x03B4
637
+ when 0x65; array_enc << 0x03B5
638
+ when 0x66; array_enc << 0x03C6
639
+ when 0x67; array_enc << 0x03B3
640
+ when 0x68; array_enc << 0x03B7
641
+ when 0x69; array_enc << 0x03B9
642
+ when 0x6A; array_enc << 0x03D5
643
+ when 0x6B; array_enc << 0x03BA
644
+ when 0x6C; array_enc << 0x03BB
645
+ when 0x6D; array_enc << 0x03BC
646
+ when 0x6E; array_enc << 0x03BD
647
+ when 0x6F; array_enc << 0x03BF
648
+ when 0x70; array_enc << 0x03C0
649
+ when 0x71; array_enc << 0x03B8
650
+ when 0x72; array_enc << 0x03C1
651
+ when 0x73; array_enc << 0x03C3
652
+ when 0x74; array_enc << 0x03C4
653
+ when 0x75; array_enc << 0x03C5
654
+ when 0x76; array_enc << 0x03D6
655
+ when 0x77; array_enc << 0x03C9
656
+ when 0x78; array_enc << 0x03BE
657
+ when 0x79; array_enc << 0x03C8
658
+ when 0x7A; array_enc << 0x03B6
659
+ when 0x7E; array_enc << 0x223C
660
+ when 0xA0; array_enc << 0x20AC
661
+ when 0xA1; array_enc << 0x03D2
662
+ when 0xA2; array_enc << 0x2032
663
+ when 0xA3; array_enc << 0x2264
664
+ when 0xA4; array_enc << 0x2215
665
+ when 0xA5; array_enc << 0x221E
666
+ when 0xA6; array_enc << 0x0192
667
+ when 0xA7; array_enc << 0x2663
668
+ when 0xA8; array_enc << 0x2666
669
+ when 0xA9; array_enc << 0x2665
670
+ when 0xAA; array_enc << 0x2660
671
+ when 0xAB; array_enc << 0x2194
672
+ when 0xAC; array_enc << 0x2190
673
+ when 0xAD; array_enc << 0x2191
674
+ when 0xAE; array_enc << 0x2192
675
+ when 0xAF; array_enc << 0x2193
676
+ when 0xB2; array_enc << 0x2033
677
+ when 0xB3; array_enc << 0x2265
678
+ when 0xB4; array_enc << 0x00D7
679
+ when 0xB5; array_enc << 0x221D
680
+ when 0xB6; array_enc << 0x2202
681
+ when 0xB7; array_enc << 0x2022
682
+ when 0xB8; array_enc << 0x00F7
683
+ when 0xB9; array_enc << 0x2260
684
+ when 0xBA; array_enc << 0x2261
685
+ when 0xBB; array_enc << 0x2248
686
+ when 0xBC; array_enc << 0x2026
687
+ when 0xBD; array_enc << 0xF8E6
688
+ when 0xBE; array_enc << 0xF8E7
689
+ when 0xBF; array_enc << 0x21B5
690
+ when 0xC0; array_enc << 0x2135
691
+ when 0xC1; array_enc << 0x2111
692
+ when 0xC2; array_enc << 0x211C
693
+ when 0xC3; array_enc << 0x2118
694
+ when 0xC4; array_enc << 0x2297
695
+ when 0xC5; array_enc << 0x2295
696
+ when 0xC6; array_enc << 0x2205
697
+ when 0xC7; array_enc << 0x2229
698
+ when 0xC8; array_enc << 0x222A
699
+ when 0xC9; array_enc << 0x2283
700
+ when 0xCA; array_enc << 0x2287
701
+ when 0xCB; array_enc << 0x2284
702
+ when 0xCC; array_enc << 0x2282
703
+ when 0xCD; array_enc << 0x2286
704
+ when 0xCE; array_enc << 0x2208
705
+ when 0xCF; array_enc << 0x2209
706
+ when 0xD0; array_enc << 0x2220
707
+ when 0xD1; array_enc << 0x2207
708
+ when 0xD2; array_enc << 0xF6DA
709
+ when 0xD3; array_enc << 0xF6D9
710
+ when 0xD4; array_enc << 0xF6DB
711
+ when 0xD5; array_enc << 0x220F
712
+ when 0xD6; array_enc << 0x221A
713
+ when 0xD7; array_enc << 0x22C5
714
+ when 0xD8; array_enc << 0x00AC
715
+ when 0xD9; array_enc << 0x2227
716
+ when 0xDA; array_enc << 0x2228
717
+ when 0xDB; array_enc << 0x21D4
718
+ when 0xDC; array_enc << 0x21D0
719
+ when 0xDD; array_enc << 0x21D1
720
+ when 0xDE; array_enc << 0x21D2
721
+ when 0xDF; array_enc << 0x21D3
722
+ when 0xE0; array_enc << 0x25CA
723
+ when 0xE1; array_enc << 0x2329
724
+ when 0xE2; array_enc << 0xF8E8
725
+ when 0xE3; array_enc << 0xF8E9
726
+ when 0xE4; array_enc << 0xF8EA
727
+ when 0xE5; array_enc << 0x2211
728
+ when 0xE6; array_enc << 0xF8EB
729
+ when 0xE7; array_enc << 0xF8EC
730
+ when 0xE8; array_enc << 0xF8ED
731
+ when 0xE9; array_enc << 0xF8EE
732
+ when 0xEA; array_enc << 0xF8EF
733
+ when 0xEB; array_enc << 0xF8F0
734
+ when 0xEC; array_enc << 0xF8F1
735
+ when 0xED; array_enc << 0xF8F2
736
+ when 0xEE; array_enc << 0xF8F3
737
+ when 0xEF; array_enc << 0xF8F4
738
+ when 0xF1; array_enc << 0x232A
739
+ when 0xF2; array_enc << 0x222B
740
+ when 0xF3; array_enc << 0x2320
741
+ when 0xF4; array_enc << 0xF8F5
742
+ when 0xF5; array_enc << 0x2321
743
+ when 0xF6; array_enc << 0xF8F6
744
+ when 0xF7; array_enc << 0xF8F7
745
+ when 0xF8; array_enc << 0xF8F8
746
+ when 0xF9; array_enc << 0xF8F9
747
+ when 0xFA; array_enc << 0xF8FA
748
+ when 0xFB; array_enc << 0xF8FB
749
+ when 0xFC; array_enc << 0xF8FC
750
+ when 0xFD; array_enc << 0xF8FD
751
+ when 0xFE; array_enc << 0xF8FE
752
+ else
753
+ array_enc << num
754
+ end
731
755
  end
732
756
  end
733
757
 
@@ -757,37 +781,43 @@ class PDF::Reader
757
781
  array_latin9 = self.process_differences(array_latin9)
758
782
  array_enc = []
759
783
  array_latin9.each do |num|
760
- case num
761
- # characters that were added compared to iso-8859-1
762
- when 0x80; array_enc << 0x20AC # 0xe2 0x82 0xac
763
- when 0x82; array_enc << 0x201A # 0xe2 0x80 0x9a
764
- when 0x83; array_enc << 0x0192 # 0xc6 0x92
765
- when 0x84; array_enc << 0x201E # 0xe2 0x80 0x9e
766
- when 0x85; array_enc << 0x2026 # 0xe2 0x80 0xa6
767
- when 0x86; array_enc << 0x2020 # 0xe2 0x80 0xa0
768
- when 0x87; array_enc << 0x2021 # 0xe2 0x80 0xa1
769
- when 0x88; array_enc << 0x02C6 # 0xcb 0x86
770
- when 0x89; array_enc << 0x2030 # 0xe2 0x80 0xb0
771
- when 0x8A; array_enc << 0x0160 # 0xc5 0xa0
772
- when 0x8B; array_enc << 0x2039 # 0xe2 0x80 0xb9
773
- when 0x8C; array_enc << 0x0152 # 0xc5 0x92
774
- when 0x8E; array_enc << 0x017D # 0xc5 0xbd
775
- when 0x91; array_enc << 0x2018 # 0xe2 0x80 0x98
776
- when 0x92; array_enc << 0x2019 # 0xe2 0x80 0x99
777
- when 0x93; array_enc << 0x201C
778
- when 0x94; array_enc << 0x201D
779
- when 0x95; array_enc << 0x2022
780
- when 0x96; array_enc << 0x2013
781
- when 0x97; array_enc << 0x2014
782
- when 0x98; array_enc << 0x02DC
783
- when 0x99; array_enc << 0x2122
784
- when 0x9A; array_enc << 0x0161
785
- when 0x9B; array_enc << 0x203A
786
- when 0x9C; array_enc << 0x0153 # 0xc5 0x93
787
- when 0x9E; array_enc << 0x017E # 0xc5 0xbe
788
- when 0x9F; array_enc << 0x0178
784
+ if tounicode && (code = tounicode.decode(num))
785
+ array_enc << code
786
+ elsif tounicode
787
+ array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
789
788
  else
790
- array_enc << num
789
+ case num
790
+ # characters that were added compared to iso-8859-1
791
+ when 0x80; array_enc << 0x20AC # 0xe2 0x82 0xac
792
+ when 0x82; array_enc << 0x201A # 0xe2 0x80 0x9a
793
+ when 0x83; array_enc << 0x0192 # 0xc6 0x92
794
+ when 0x84; array_enc << 0x201E # 0xe2 0x80 0x9e
795
+ when 0x85; array_enc << 0x2026 # 0xe2 0x80 0xa6
796
+ when 0x86; array_enc << 0x2020 # 0xe2 0x80 0xa0
797
+ when 0x87; array_enc << 0x2021 # 0xe2 0x80 0xa1
798
+ when 0x88; array_enc << 0x02C6 # 0xcb 0x86
799
+ when 0x89; array_enc << 0x2030 # 0xe2 0x80 0xb0
800
+ when 0x8A; array_enc << 0x0160 # 0xc5 0xa0
801
+ when 0x8B; array_enc << 0x2039 # 0xe2 0x80 0xb9
802
+ when 0x8C; array_enc << 0x0152 # 0xc5 0x92
803
+ when 0x8E; array_enc << 0x017D # 0xc5 0xbd
804
+ when 0x91; array_enc << 0x2018 # 0xe2 0x80 0x98
805
+ when 0x92; array_enc << 0x2019 # 0xe2 0x80 0x99
806
+ when 0x93; array_enc << 0x201C
807
+ when 0x94; array_enc << 0x201D
808
+ when 0x95; array_enc << 0x2022
809
+ when 0x96; array_enc << 0x2013
810
+ when 0x97; array_enc << 0x2014
811
+ when 0x98; array_enc << 0x02DC
812
+ when 0x99; array_enc << 0x2122
813
+ when 0x9A; array_enc << 0x0161
814
+ when 0x9B; array_enc << 0x203A
815
+ when 0x9C; array_enc << 0x0153 # 0xc5 0x93
816
+ when 0x9E; array_enc << 0x017E # 0xc5 0xbe
817
+ when 0x9F; array_enc << 0x0178
818
+ else
819
+ array_enc << num
820
+ end
791
821
  end
792
822
  end
793
823
 
@@ -816,210 +846,216 @@ class PDF::Reader
816
846
  array_symbol = self.process_differences(array_symbol)
817
847
  array_enc = []
818
848
  array_symbol.each do |num|
819
- case num
820
- when 0x21; array_enc << 0x2701
821
- when 0x22; array_enc << 0x2702
822
- when 0x23; array_enc << 0x2703
823
- when 0x24; array_enc << 0x2704
824
- when 0x25; array_enc << 0x260E
825
- when 0x26; array_enc << 0x2706
826
- when 0x27; array_enc << 0x2707
827
- when 0x28; array_enc << 0x2708
828
- when 0x29; array_enc << 0x2709
829
- when 0x2A; array_enc << 0x261B
830
- when 0x2B; array_enc << 0x261E
831
- when 0x2C; array_enc << 0x270C
832
- when 0x2D; array_enc << 0x270D
833
- when 0x2E; array_enc << 0x270E
834
- when 0x2F; array_enc << 0x270F
835
- when 0x30; array_enc << 0x2710
836
- when 0x31; array_enc << 0x2711
837
- when 0x32; array_enc << 0x2712
838
- when 0x33; array_enc << 0x2713
839
- when 0x34; array_enc << 0x2714
840
- when 0x35; array_enc << 0x2715
841
- when 0x36; array_enc << 0x2716
842
- when 0x37; array_enc << 0x2717
843
- when 0x38; array_enc << 0x2718
844
- when 0x39; array_enc << 0x2719
845
- when 0x3A; array_enc << 0x271A
846
- when 0x3B; array_enc << 0x271B
847
- when 0x3C; array_enc << 0x271C
848
- when 0x3D; array_enc << 0x271D
849
- when 0x3E; array_enc << 0x271E
850
- when 0x3F; array_enc << 0x271E
851
- when 0x40; array_enc << 0x2720
852
- when 0x41; array_enc << 0x2721
853
- when 0x42; array_enc << 0x2722
854
- when 0x43; array_enc << 0x2723
855
- when 0x44; array_enc << 0x2724
856
- when 0x45; array_enc << 0x2725
857
- when 0x46; array_enc << 0x2726
858
- when 0x47; array_enc << 0x2727
859
- when 0x48; array_enc << 0x2605
860
- when 0x49; array_enc << 0x2729
861
- when 0x4A; array_enc << 0x272A
862
- when 0x4B; array_enc << 0x272B
863
- when 0x4C; array_enc << 0x272C
864
- when 0x4D; array_enc << 0x272D
865
- when 0x4E; array_enc << 0x272E
866
- when 0x4F; array_enc << 0x272F
867
- when 0x50; array_enc << 0x2730
868
- when 0x51; array_enc << 0x2731
869
- when 0x52; array_enc << 0x2732
870
- when 0x53; array_enc << 0x2733
871
- when 0x54; array_enc << 0x2734
872
- when 0x55; array_enc << 0x2735
873
- when 0x56; array_enc << 0x2736
874
- when 0x57; array_enc << 0x2737
875
- when 0x58; array_enc << 0x2738
876
- when 0x59; array_enc << 0x2739
877
- when 0x5A; array_enc << 0x273A
878
- when 0x5B; array_enc << 0x273B
879
- when 0x5C; array_enc << 0x273C
880
- when 0x5D; array_enc << 0x273D
881
- when 0x5E; array_enc << 0x273E
882
- when 0x5F; array_enc << 0x273F
883
- when 0x60; array_enc << 0x2740
884
- when 0x61; array_enc << 0x2741
885
- when 0x62; array_enc << 0x2742
886
- when 0x63; array_enc << 0x2743
887
- when 0x64; array_enc << 0x2744
888
- when 0x65; array_enc << 0x2745
889
- when 0x66; array_enc << 0x2746
890
- when 0x67; array_enc << 0x2747
891
- when 0x68; array_enc << 0x2748
892
- when 0x69; array_enc << 0x2749
893
- when 0x6A; array_enc << 0x274A
894
- when 0x6B; array_enc << 0x274B
895
- when 0x6C; array_enc << 0x25CF
896
- when 0x6D; array_enc << 0x274D
897
- when 0x6E; array_enc << 0x25A0
898
- when 0x6F; array_enc << 0x274F
899
- when 0x70; array_enc << 0x2750
900
- when 0x71; array_enc << 0x2751
901
- when 0x72; array_enc << 0x2752
902
- when 0x73; array_enc << 0x2753
903
- when 0x74; array_enc << 0x2754
904
- when 0x75; array_enc << 0x2755
905
- when 0x76; array_enc << 0x2756
906
- when 0x77; array_enc << 0x2757
907
- when 0x78; array_enc << 0x2758
908
- when 0x79; array_enc << 0x2759
909
- when 0x7A; array_enc << 0x275A
910
- when 0x7B; array_enc << 0x275B
911
- when 0x7C; array_enc << 0x275C
912
- when 0x7D; array_enc << 0x275D
913
- when 0x7E; array_enc << 0x275E
914
- when 0x80; array_enc << 0xF8D7
915
- when 0x81; array_enc << 0xF8D8
916
- when 0x82; array_enc << 0xF8D9
917
- when 0x83; array_enc << 0xF8DA
918
- when 0x84; array_enc << 0xF8DB
919
- when 0x85; array_enc << 0xF8DC
920
- when 0x86; array_enc << 0xF8DD
921
- when 0x87; array_enc << 0xF8DE
922
- when 0x88; array_enc << 0xF8DF
923
- when 0x89; array_enc << 0xF8E0
924
- when 0x8A; array_enc << 0xF8E1
925
- when 0x8B; array_enc << 0xF8E2
926
- when 0x8C; array_enc << 0xF8E3
927
- when 0x8D; array_enc << 0xF8E4
928
- when 0xA1; array_enc << 0x2761
929
- when 0xA2; array_enc << 0x2762
930
- when 0xA3; array_enc << 0x2763
931
- when 0xA4; array_enc << 0x2764
932
- when 0xA5; array_enc << 0x2765
933
- when 0xA6; array_enc << 0x2766
934
- when 0xA7; array_enc << 0x2767
935
- when 0xA8; array_enc << 0x2663
936
- when 0xA9; array_enc << 0x2666
937
- when 0xAA; array_enc << 0x2665
938
- when 0xAB; array_enc << 0x2660
939
- when 0xAC; array_enc << 0x2460
940
- when 0xAD; array_enc << 0x2461
941
- when 0xAE; array_enc << 0x2462
942
- when 0xAF; array_enc << 0x2463
943
- when 0xB0; array_enc << 0x2464
944
- when 0xB1; array_enc << 0x2465
945
- when 0xB2; array_enc << 0x2466
946
- when 0xB3; array_enc << 0x2467
947
- when 0xB4; array_enc << 0x2468
948
- when 0xB5; array_enc << 0x2469
949
- when 0xB6; array_enc << 0x2776
950
- when 0xB7; array_enc << 0x2777
951
- when 0xB8; array_enc << 0x2778
952
- when 0xB9; array_enc << 0x2779
953
- when 0xBA; array_enc << 0x277A
954
- when 0xBB; array_enc << 0x277B
955
- when 0xBC; array_enc << 0x277C
956
- when 0xBD; array_enc << 0x277D
957
- when 0xBE; array_enc << 0x277E
958
- when 0xBF; array_enc << 0x277F
959
- when 0xC0; array_enc << 0x2780
960
- when 0xC1; array_enc << 0x2781
961
- when 0xC2; array_enc << 0x2782
962
- when 0xC3; array_enc << 0x2783
963
- when 0xC4; array_enc << 0x2784
964
- when 0xC5; array_enc << 0x2785
965
- when 0xC6; array_enc << 0x2786
966
- when 0xC7; array_enc << 0x2787
967
- when 0xC8; array_enc << 0x2788
968
- when 0xC9; array_enc << 0x2789
969
- when 0xCA; array_enc << 0x278A
970
- when 0xCB; array_enc << 0x278B
971
- when 0xCC; array_enc << 0x278C
972
- when 0xCD; array_enc << 0x278D
973
- when 0xCE; array_enc << 0x278E
974
- when 0xCF; array_enc << 0x278F
975
- when 0xD0; array_enc << 0x2790
976
- when 0xD1; array_enc << 0x2791
977
- when 0xD2; array_enc << 0x2792
978
- when 0xD3; array_enc << 0x2793
979
- when 0xD4; array_enc << 0x2794
980
- when 0xD5; array_enc << 0x2795
981
- when 0xD6; array_enc << 0x2796
982
- when 0xD7; array_enc << 0x2797
983
- when 0xD8; array_enc << 0x2798
984
- when 0xD9; array_enc << 0x2799
985
- when 0xDA; array_enc << 0x279A
986
- when 0xDB; array_enc << 0x279B
987
- when 0xDC; array_enc << 0x279C
988
- when 0xDD; array_enc << 0x279D
989
- when 0xDE; array_enc << 0x279E
990
- when 0xDF; array_enc << 0x279F
991
- when 0xE0; array_enc << 0x27A0
992
- when 0xE1; array_enc << 0x27A1
993
- when 0xE2; array_enc << 0x27A2
994
- when 0xE3; array_enc << 0x27A3
995
- when 0xE4; array_enc << 0x27A4
996
- when 0xE5; array_enc << 0x27A5
997
- when 0xE6; array_enc << 0x27A6
998
- when 0xE7; array_enc << 0x27A7
999
- when 0xE8; array_enc << 0x27A8
1000
- when 0xE9; array_enc << 0x27A9
1001
- when 0xEA; array_enc << 0x27AA
1002
- when 0xEB; array_enc << 0x27AB
1003
- when 0xEC; array_enc << 0x27AC
1004
- when 0xED; array_enc << 0x27AD
1005
- when 0xEE; array_enc << 0x27AE
1006
- when 0xEF; array_enc << 0x27AF
1007
- when 0xF1; array_enc << 0x27B1
1008
- when 0xF2; array_enc << 0x27B2
1009
- when 0xF3; array_enc << 0x27B3
1010
- when 0xF4; array_enc << 0x27B4
1011
- when 0xF5; array_enc << 0x27B5
1012
- when 0xF6; array_enc << 0x27B6
1013
- when 0xF7; array_enc << 0x27B7
1014
- when 0xF8; array_enc << 0x27B8
1015
- when 0xF9; array_enc << 0x27B9
1016
- when 0xFA; array_enc << 0x27BA
1017
- when 0xFB; array_enc << 0x27BB
1018
- when 0xFC; array_enc << 0x27BC
1019
- when 0xFD; array_enc << 0x27BD
1020
- when 0xFE; array_enc << 0x27BE
849
+ if tounicode && (code = tounicode.decode(num))
850
+ array_enc << code
851
+ elsif tounicode
852
+ array_enc << PDF::Reader::Encoding::UNKNOWN_CHAR
1021
853
  else
1022
- array_enc << num
854
+ case num
855
+ when 0x21; array_enc << 0x2701
856
+ when 0x22; array_enc << 0x2702
857
+ when 0x23; array_enc << 0x2703
858
+ when 0x24; array_enc << 0x2704
859
+ when 0x25; array_enc << 0x260E
860
+ when 0x26; array_enc << 0x2706
861
+ when 0x27; array_enc << 0x2707
862
+ when 0x28; array_enc << 0x2708
863
+ when 0x29; array_enc << 0x2709
864
+ when 0x2A; array_enc << 0x261B
865
+ when 0x2B; array_enc << 0x261E
866
+ when 0x2C; array_enc << 0x270C
867
+ when 0x2D; array_enc << 0x270D
868
+ when 0x2E; array_enc << 0x270E
869
+ when 0x2F; array_enc << 0x270F
870
+ when 0x30; array_enc << 0x2710
871
+ when 0x31; array_enc << 0x2711
872
+ when 0x32; array_enc << 0x2712
873
+ when 0x33; array_enc << 0x2713
874
+ when 0x34; array_enc << 0x2714
875
+ when 0x35; array_enc << 0x2715
876
+ when 0x36; array_enc << 0x2716
877
+ when 0x37; array_enc << 0x2717
878
+ when 0x38; array_enc << 0x2718
879
+ when 0x39; array_enc << 0x2719
880
+ when 0x3A; array_enc << 0x271A
881
+ when 0x3B; array_enc << 0x271B
882
+ when 0x3C; array_enc << 0x271C
883
+ when 0x3D; array_enc << 0x271D
884
+ when 0x3E; array_enc << 0x271E
885
+ when 0x3F; array_enc << 0x271F
886
+ when 0x40; array_enc << 0x2720
887
+ when 0x41; array_enc << 0x2721
888
+ when 0x42; array_enc << 0x2722
889
+ when 0x43; array_enc << 0x2723
890
+ when 0x44; array_enc << 0x2724
891
+ when 0x45; array_enc << 0x2725
892
+ when 0x46; array_enc << 0x2726
893
+ when 0x47; array_enc << 0x2727
894
+ when 0x48; array_enc << 0x2605
895
+ when 0x49; array_enc << 0x2729
896
+ when 0x4A; array_enc << 0x272A
897
+ when 0x4B; array_enc << 0x272B
898
+ when 0x4C; array_enc << 0x272C
899
+ when 0x4D; array_enc << 0x272D
900
+ when 0x4E; array_enc << 0x272E
901
+ when 0x4F; array_enc << 0x272F
902
+ when 0x50; array_enc << 0x2730
903
+ when 0x51; array_enc << 0x2731
904
+ when 0x52; array_enc << 0x2732
905
+ when 0x53; array_enc << 0x2733
906
+ when 0x54; array_enc << 0x2734
907
+ when 0x55; array_enc << 0x2735
908
+ when 0x56; array_enc << 0x2736
909
+ when 0x57; array_enc << 0x2737
910
+ when 0x58; array_enc << 0x2738
911
+ when 0x59; array_enc << 0x2739
912
+ when 0x5A; array_enc << 0x273A
913
+ when 0x5B; array_enc << 0x273B
914
+ when 0x5C; array_enc << 0x273C
915
+ when 0x5D; array_enc << 0x273D
916
+ when 0x5E; array_enc << 0x273E
917
+ when 0x5F; array_enc << 0x273F
918
+ when 0x60; array_enc << 0x2740
919
+ when 0x61; array_enc << 0x2741
920
+ when 0x62; array_enc << 0x2742
921
+ when 0x63; array_enc << 0x2743
922
+ when 0x64; array_enc << 0x2744
923
+ when 0x65; array_enc << 0x2745
924
+ when 0x66; array_enc << 0x2746
925
+ when 0x67; array_enc << 0x2747
926
+ when 0x68; array_enc << 0x2748
927
+ when 0x69; array_enc << 0x2749
928
+ when 0x6A; array_enc << 0x274A
929
+ when 0x6B; array_enc << 0x274B
930
+ when 0x6C; array_enc << 0x25CF
931
+ when 0x6D; array_enc << 0x274D
932
+ when 0x6E; array_enc << 0x25A0
933
+ when 0x6F; array_enc << 0x274F
934
+ when 0x70; array_enc << 0x2750
935
+ when 0x71; array_enc << 0x2751
936
+ when 0x72; array_enc << 0x2752
937
+ when 0x73; array_enc << 0x2753
938
+ when 0x74; array_enc << 0x2754
939
+ when 0x75; array_enc << 0x2755
940
+ when 0x76; array_enc << 0x2756
941
+ when 0x77; array_enc << 0x2757
942
+ when 0x78; array_enc << 0x2758
943
+ when 0x79; array_enc << 0x2759
944
+ when 0x7A; array_enc << 0x275A
945
+ when 0x7B; array_enc << 0x275B
946
+ when 0x7C; array_enc << 0x275C
947
+ when 0x7D; array_enc << 0x275D
948
+ when 0x7E; array_enc << 0x275E
949
+ when 0x80; array_enc << 0xF8D7
950
+ when 0x81; array_enc << 0xF8D8
951
+ when 0x82; array_enc << 0xF8D9
952
+ when 0x83; array_enc << 0xF8DA
953
+ when 0x84; array_enc << 0xF8DB
954
+ when 0x85; array_enc << 0xF8DC
955
+ when 0x86; array_enc << 0xF8DD
956
+ when 0x87; array_enc << 0xF8DE
957
+ when 0x88; array_enc << 0xF8DF
958
+ when 0x89; array_enc << 0xF8E0
959
+ when 0x8A; array_enc << 0xF8E1
960
+ when 0x8B; array_enc << 0xF8E2
961
+ when 0x8C; array_enc << 0xF8E3
962
+ when 0x8D; array_enc << 0xF8E4
963
+ when 0xA1; array_enc << 0x2761
964
+ when 0xA2; array_enc << 0x2762
965
+ when 0xA3; array_enc << 0x2763
966
+ when 0xA4; array_enc << 0x2764
967
+ when 0xA5; array_enc << 0x2765
968
+ when 0xA6; array_enc << 0x2766
969
+ when 0xA7; array_enc << 0x2767
970
+ when 0xA8; array_enc << 0x2663
971
+ when 0xA9; array_enc << 0x2666
972
+ when 0xAA; array_enc << 0x2665
973
+ when 0xAB; array_enc << 0x2660
974
+ when 0xAC; array_enc << 0x2460
975
+ when 0xAD; array_enc << 0x2461
976
+ when 0xAE; array_enc << 0x2462
977
+ when 0xAF; array_enc << 0x2463
978
+ when 0xB0; array_enc << 0x2464
979
+ when 0xB1; array_enc << 0x2465
980
+ when 0xB2; array_enc << 0x2466
981
+ when 0xB3; array_enc << 0x2467
982
+ when 0xB4; array_enc << 0x2468
983
+ when 0xB5; array_enc << 0x2469
984
+ when 0xB6; array_enc << 0x2776
985
+ when 0xB7; array_enc << 0x2777
986
+ when 0xB8; array_enc << 0x2778
987
+ when 0xB9; array_enc << 0x2779
988
+ when 0xBA; array_enc << 0x277A
989
+ when 0xBB; array_enc << 0x277B
990
+ when 0xBC; array_enc << 0x277C
991
+ when 0xBD; array_enc << 0x277D
992
+ when 0xBE; array_enc << 0x277E
993
+ when 0xBF; array_enc << 0x277F
994
+ when 0xC0; array_enc << 0x2780
995
+ when 0xC1; array_enc << 0x2781
996
+ when 0xC2; array_enc << 0x2782
997
+ when 0xC3; array_enc << 0x2783
998
+ when 0xC4; array_enc << 0x2784
999
+ when 0xC5; array_enc << 0x2785
1000
+ when 0xC6; array_enc << 0x2786
1001
+ when 0xC7; array_enc << 0x2787
1002
+ when 0xC8; array_enc << 0x2788
1003
+ when 0xC9; array_enc << 0x2789
1004
+ when 0xCA; array_enc << 0x278A
1005
+ when 0xCB; array_enc << 0x278B
1006
+ when 0xCC; array_enc << 0x278C
1007
+ when 0xCD; array_enc << 0x278D
1008
+ when 0xCE; array_enc << 0x278E
1009
+ when 0xCF; array_enc << 0x278F
1010
+ when 0xD0; array_enc << 0x2790
1011
+ when 0xD1; array_enc << 0x2791
1012
+ when 0xD2; array_enc << 0x2792
1013
+ when 0xD3; array_enc << 0x2793
1014
+ when 0xD4; array_enc << 0x2794
1015
+ when 0xD5; array_enc << 0x2795
1016
+ when 0xD6; array_enc << 0x2796
1017
+ when 0xD7; array_enc << 0x2797
1018
+ when 0xD8; array_enc << 0x2798
1019
+ when 0xD9; array_enc << 0x2799
1020
+ when 0xDA; array_enc << 0x279A
1021
+ when 0xDB; array_enc << 0x279B
1022
+ when 0xDC; array_enc << 0x279C
1023
+ when 0xDD; array_enc << 0x279D
1024
+ when 0xDE; array_enc << 0x279E
1025
+ when 0xDF; array_enc << 0x279F
1026
+ when 0xE0; array_enc << 0x27A0
1027
+ when 0xE1; array_enc << 0x27A1
1028
+ when 0xE2; array_enc << 0x27A2
1029
+ when 0xE3; array_enc << 0x27A3
1030
+ when 0xE4; array_enc << 0x27A4
1031
+ when 0xE5; array_enc << 0x27A5
1032
+ when 0xE6; array_enc << 0x27A6
1033
+ when 0xE7; array_enc << 0x27A7
1034
+ when 0xE8; array_enc << 0x27A8
1035
+ when 0xE9; array_enc << 0x27A9
1036
+ when 0xEA; array_enc << 0x27AA
1037
+ when 0xEB; array_enc << 0x27AB
1038
+ when 0xEC; array_enc << 0x27AC
1039
+ when 0xED; array_enc << 0x27AD
1040
+ when 0xEE; array_enc << 0x27AE
1041
+ when 0xEF; array_enc << 0x27AF
1042
+ when 0xF1; array_enc << 0x27B1
1043
+ when 0xF2; array_enc << 0x27B2
1044
+ when 0xF3; array_enc << 0x27B3
1045
+ when 0xF4; array_enc << 0x27B4
1046
+ when 0xF5; array_enc << 0x27B5
1047
+ when 0xF6; array_enc << 0x27B6
1048
+ when 0xF7; array_enc << 0x27B7
1049
+ when 0xF8; array_enc << 0x27B8
1050
+ when 0xF9; array_enc << 0x27B9
1051
+ when 0xFA; array_enc << 0x27BA
1052
+ when 0xFB; array_enc << 0x27BB
1053
+ when 0xFC; array_enc << 0x27BC
1054
+ when 0xFD; array_enc << 0x27BD
1055
+ when 0xFE; array_enc << 0x27BE
1056
+ else
1057
+ array_enc << num
1058
+ end
1023
1059
  end
1024
1060
  end
1025
1061